Different types of file downloads require different code strategies. This page outlines various strategies you may take.
Regular Download Links
Regular downloads occur when the file link is directly available within the HTML (typically in the `href` of an `<a>` tag). Clicking these links directly initiates a file download.
To handle these downloads:
- Save the URL directly from the page.
- Reworkd will then asynchronously visit and download the file. We use `curl-cffi` to mimic browser behavior when downloading the file.
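As a rough illustration of the first step, the sketch below collects file URLs from `<a>` tags using only the Python standard library. The `FILE_EXTENSIONS` tuple, the `DownloadLinkParser` class, and the `find_download_links` helper are hypothetical names chosen for this example and are not part of Reworkd; the extension filter is an assumption about what counts as a file link.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical list of extensions treated as downloadable files.
FILE_EXTENSIONS = (".pdf", ".csv", ".zip", ".xlsx")


class DownloadLinkParser(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags that look like files."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the extension.
        if href and href.lower().split("?")[0].endswith(FILE_EXTENSIONS):
            # Resolve relative links against the page URL before saving.
            self.links.append(urljoin(self.base_url, href))


def find_download_links(html: str, base_url: str) -> list[str]:
    """Return the absolute URLs of file links found in `html`."""
    parser = DownloadLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

Only the URL needs to be saved at this stage; the download itself happens later, as described above.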
Indirect Download Links
Indirect downloads happen when the direct link isn’t immediately visible but becomes available after clicking a button or link. To handle indirect downloads:
- Click the button/link to open the URL.
- Capture and save the newly loaded URL.
- Automatically navigate back.
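The three steps above can be sketched as a small helper. This is a minimal sketch that assumes `page` behaves like a Playwright async `Page` (it relies only on `click`, `wait_for_load_state`, `url`, and `go_back`) and that the click navigates the current tab rather than opening a new one. The function name `capture_indirect_url` is hypothetical.

```python
async def capture_indirect_url(page, selector: str) -> str:
    """Click `selector`, record the URL that loads, then navigate back."""
    # Step 1: click the button/link that leads to the file.
    await page.click(selector)
    # Wait for the navigation triggered by the click to finish loading.
    await page.wait_for_load_state()
    # Step 2: capture the newly loaded URL.
    file_url = page.url
    # Step 3: return to the original page so scraping can continue.
    await page.go_back()
    return file_url
```

Navigating back matters when several indirect links sit on the same listing page, since each click would otherwise leave the scraper on the file URL.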
JavaScript/Dynamic Downloads
Dynamic downloads occur when a file download is triggered by JavaScript events directly in the browser, without a direct URL. To handle dynamic downloads:
- Use the `capture_download` method to trigger and capture the download directly in the browser.
- Retrieve the file metadata (URL and title).
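To show the general mechanism behind capturing a download in the browser, here is a minimal sketch built on Playwright's `expect_download` pattern. It is not the implementation of `capture_download`, whose exact signature is not shown on this page; the helper name `capture_browser_download` is hypothetical, `page` is assumed to behave like a Playwright async `Page`, and mapping the suggested filename to a "title" field is an assumption made for illustration.

```python
async def capture_browser_download(page, selector: str) -> dict:
    """Trigger a download by clicking `selector` and return its metadata."""
    # Start listening before the click so the download event is not missed.
    async with page.expect_download() as download_info:
        await page.click(selector)
    # Resolves once the browser has started the download.
    download = await download_info.value
    return {
        "url": download.url,
        # Assumption: the browser's suggested filename stands in for the title.
        "title": download.suggested_filename,
    }
```

Because the click and the download happen in the same browser context, any cookies or session state set while browsing the page are still in effect when the file is fetched.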
Downloads Requiring Cookies/Session
Some sites require the download to occur within the same browser session that accessed the page, making AWS Lambda unsuitable. In these cases:
- Follow the same approach as dynamic downloads, handling the download directly in the browser context using `capture_download`.

