Get outputs for a scraping group
Fetches all outputs for a given scraping group. The results are a materialized view of the group's outputs, which means they are deduplicated. How often this view is updated depends on how often the group is scheduled to be re-scraped.
The results are paginated and sorted by the create_date of items in ascending order. To fetch the next page, use the next_url or next_cursor field in the response metadata.
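As a rough illustration of cursor-based paging, the Python sketch below keeps requesting pages until no next_cursor is returned. The response shape (a "data" list next to a "metadata" object) and the fetch_page callable that stands in for the actual HTTP request are assumptions made for this example; only the next_cursor field name comes from this page.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_outputs(fetch_page: Callable[[Optional[int]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield every output in the group by following next_cursor.

    fetch_page is a hypothetical callable: it takes a cursor (None for the
    first page) and returns the decoded JSON body of one response.
    """
    cursor: Optional[int] = None
    while True:
        page = fetch_page(cursor)
        # Assumption: the items of the page live under a "data" key.
        yield from page.get("data", [])
        # Assumption: the response metadata lives under a "metadata" key.
        cursor = page.get("metadata", {}).get("next_cursor")
        # Compare against None explicitly, because 0 is a valid cursor value.
        if cursor is None:
            break
```

Passing the request function in as an argument keeps the loop independent of any particular HTTP client, and makes it easy to try out against canned pages.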
Typically, you will also want to provide a created_after filter so that only outputs created after a certain date are returned. This is useful for fetching just the outputs that are new since your last request, which lets you maintain a “real-time” view of the outputs.
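One possible way to drive this kind of incremental sync is to remember the latest create_date seen so far and send it as created_after on the next run. The Python sketch below assumes each output carries create_date as an ISO 8601 string and that timestamps without an offset are in UTC; the helper names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ into an aware datetime."""
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onward,
    # so rewrite it as an explicit UTC offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: values without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def next_created_after(
    outputs: Iterable[Dict[str, Any]], previous: Optional[datetime]
) -> Optional[datetime]:
    """Return the newest create_date among outputs, or previous if none is newer."""
    latest = previous
    for item in outputs:
        raw = item.get("create_date")  # assumption: ISO 8601 string on each item
        if raw is None:
            continue
        created = parse_timestamp(raw)
        if latest is None or created > latest:
            latest = created
    return latest


def format_created_after(value: datetime) -> str:
    """Render an aware datetime in the YYYY-MM-DDTHH:MM:SSZ form."""
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Because the filter is described as matching outputs on or after the given timestamp, the item that set the watermark may come back again on the next run, so a consumer would likely want to deduplicate on its own key.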
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
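For illustration, the header can be built as follows in Python. This sketch assumes the standard Authorization header used for bearer tokens; the function name is hypothetical.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Build the request headers for bearer authentication."""
    cleaned = token.strip()
    if not cleaned:
        raise ValueError("an auth token is required")
    # Assumption: the token is sent in the standard Authorization header.
    return {"Authorization": f"Bearer {cleaned}"}
```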
Path Parameters
ID of the scraping group you want to fetch outputs for. This can be found on the groups page in the individual group card.
Query Parameters
ID of the root scraping job to filter outputs by. Useful when you need to fetch results from a specific domain or data source within a group. When omitted, outputs from all jobs in the group will be returned.
"03583f9c-6c90-4f3c-9afd-186258d6f4d6"
null
Filter outputs to only include those created or updated on or after this timestamp. Accepts ISO 8601 format (YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ). Essential for incremental data syncing to avoid fetching the entire dataset on each request.
"2025-01-15T10:00:00Z"
"2025-01-15"
null
Complete URL to fetch outputs for, including protocol and path. Must exactly match the URL that was processed by the scraper.
"https://www.example.com/product/product_id_123"
Number of results to return per page
1 <= x <= 1000
Cursor to paginate through results
x >= 0
Name of the country to filter by (e.g. United States)
"United States"
ISO 3166-2 code of the region to filter by (e.g. US-CA)
"US-CA"
"US-TX"
Response
Represents a single output from a scraping job. This is the data that was extracted from a website by a scraper. The data field contains the extracted data normalized to the schema of the scraping group.
The files field contains the files that were extracted by the scraper. The files can be downloaded from the s3_url field.
The change type indicates whether the output was created for the first time or whether it is an update or delete of an existing record. See ChangesetAction for more details.
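As a small sketch of consuming one such output in Python, the helper below collects the download links of its files. It assumes that files is a list of objects and that each one carries its own s3_url; the exact nesting is not spelled out on this page, so treat the shape as a guess.

```python
from typing import Any, Dict, List


def file_urls(output: Dict[str, Any]) -> List[str]:
    """Return the s3_url of every file attached to a single output."""
    urls: List[str] = []
    # Assumption: "files" is a list of objects, possibly missing or null.
    for entry in output.get("files") or []:
        url = entry.get("s3_url")  # assumption: one s3_url per file entry
        if url:
            urls.append(url)
    return urls
```

An output with no files, or with file entries that lack an s3_url, simply yields an empty list.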