Parameters
All request parameters for /v1/parse — from basic URL parsing to screenshots and extraction hints.
Request body
All parameters are sent as JSON in the request body.
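As a rough illustration, the request could be assembled like this with Python's standard library. The base URL is a placeholder and the POST method is an assumption, since this section names only the /v1/parse path. Authentication is left out because it is not described here.

```python
import json
import urllib.request

# Hypothetical base URL; this section names only the /v1/parse path.
BASE_URL = "https://api.example.com"


def build_parse_request(params: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON request for /v1/parse."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed; the HTTP method is not stated in this section
    )


req = build_parse_request({"url": "https://www.shop.com/products/blue-widget"})
```

Passing `req` to `urllib.request.urlopen` would send it once real credentials and the real host are filled in.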
url (required)
Type: string
The full URL of the product page to extract. Must include the scheme (https://). Redirects are followed automatically.
{ "url": "https://www.shop.com/products/blue-widget" }
URLs must resolve to a product page. Category pages, homepages, and search result pages will return a 422 error.
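A client-side pre-check can catch a missing scheme before a request is spent. The sketch below is only an approximation: the helper name is hypothetical, accepting plain http:// is an assumption, and it cannot tell whether the URL points at a product page, so a 422 is still possible.

```python
from urllib.parse import urlsplit


def has_scheme_and_host(url: str) -> bool:
    """Return True if the URL carries an http(s) scheme and a host."""
    parts = urlsplit(url)
    # Accepting "http" as well as "https" is an assumption of this sketch.
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```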
extractionHint
Type: string · Default: none
Natural-language instructions that guide the extraction AI. Use this when you need data the standard schema doesn't capture, or when a site structures its data unusually.
{
  "url": "https://example.com/product",
  "extractionHint": "Return each storage variant separately. Include estimated shipping time and country of origin."
}
Good hints:
- "Split colour and storage into separate variant entries"
- "Include the quantity break pricing table"
- "Return the warranty period as a separate field called warranty_years"
- "The page shows a sale price and a crossed-out original price. Capture both."
Hints are only used when the adaptive or visual extraction path is triggered. They have no effect when JSON-LD is found (fast-path).
includeScreenshot
Type: boolean | "full-height" · Default: false
When true, captures a viewport screenshot (1280×900) during extraction. Set to "full-height" for a full-page screenshot of the entire scrollable content.
{
  "url": "https://example.com/product",
  "includeScreenshot": true
}
Screenshots are stored for 7 days and linked in the response as artifacts.screenshotUrl. Useful for:
- Debugging extraction issues visually
- Compliance audit trails
- Verifying the page state at extraction time
Screenshots add ~200–400ms to the extraction time. They do not consume an extra request from your quota.
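Reading the link back out of a parsed response might look like the sketch below. Only the artifacts.screenshotUrl path comes from this page. The helper name is hypothetical, and treating a missing or null artifacts object as "no screenshot" is an assumption.

```python
from typing import Optional


def screenshot_url(response: dict) -> Optional[str]:
    """Return artifacts.screenshotUrl, or None when it is not present."""
    # Assumption: artifacts may be absent or null when no screenshot was taken.
    artifacts = response.get("artifacts") or {}
    return artifacts.get("screenshotUrl")
```

Because screenshots are kept for 7 days, a caller that needs them longer would have to download the file within that window.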
forceRefresh
Type: boolean · Default: false
When true, ignores the cached domain extraction script and regenerates it from scratch. Use this when a site has recently redesigned its product pages and the existing script is returning stale selectors.
{
  "url": "https://example.com/product",
  "forceRefresh": true
}
forceRefresh: true triggers full visual extraction and will consume an API credit even if the domain was previously on the free fast-path. Only use it when you know the site layout has changed.
strategies
Type: string[] · Default: ["jsonld", "cached-script", "adaptive"]
Override the extraction strategy order. Available strategies:
| Strategy | Description |
|---|---|
| jsonld | Extract from embedded JSON-LD / microdata schema |
| cached-script | Run domain-specific cached extraction selectors |
| adaptive | Generate new selectors using AI (triggers on first visit) |
| visual | Screenshot + vision model extraction |
{
  "url": "https://example.com/product",
  "strategies": ["visual", "jsonld"]
}
This example forces visual extraction first, then falls back to JSON-LD. Useful for sites where the JSON-LD is incomplete but the page renders all data visually.
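Since strategy names are plain strings, a typo such as "json-ld" is easy to make. A minimal client-side check against the four documented names could look like this. The helper is hypothetical, and how the API itself responds to an unknown name is not covered here.

```python
KNOWN_STRATEGIES = ("jsonld", "cached-script", "adaptive", "visual")


def check_strategies(strategies: list) -> list:
    """Return a copy of the list if every entry is a documented strategy name."""
    unknown = [s for s in strategies if s not in KNOWN_STRATEGIES]
    if unknown:
        raise ValueError(f"unknown strategies: {unknown}")
    return list(strategies)
```

The order of the list is preserved, because the order is what the parameter controls.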
minWait
Type: number (milliseconds) · Default: 0
Extra time to wait after the page loads before extracting. Use for pages with heavy JavaScript rendering or lazy-loaded product data.
{
  "url": "https://example.com/product",
  "minWait": 1500
}
Values above 5000 are clamped to 5000ms.
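The clamping rule can be mirrored on the client to predict the wait that will actually apply. This sketch covers only the documented upper bound. The function name is hypothetical, and negative values are not addressed because this page does not say how they are handled.

```python
MAX_MIN_WAIT_MS = 5000  # documented ceiling for minWait


def effective_min_wait(requested_ms: int) -> int:
    """Return the wait that applies after the 5000 ms clamp."""
    return min(requested_ms, MAX_MIN_WAIT_MS)
```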
requestId
Type: string · Default: auto-generated
A client-supplied identifier for this request. Appears in dashboard logs and error reports, making it easier to correlate extractions with your own application logs.
{
  "url": "https://example.com/product",
  "requestId": "order-import-batch-2024-05-12-item-42"
}
Must be 3–128 characters. Alphanumeric, hyphens, and underscores only.
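The length and character rules translate directly into a regular expression. The sketch below assumes "alphanumeric" means ASCII letters and digits, and the helper name is hypothetical.

```python
import re

# 3 to 128 characters: ASCII letters, digits, hyphens, underscores (assumed ASCII-only).
REQUEST_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,128}")


def is_valid_request_id(request_id: str) -> bool:
    """Return True if the identifier satisfies the documented constraints."""
    return REQUEST_ID_PATTERN.fullmatch(request_id) is not None
```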
Parameter matrix
| Parameter | Free | Pro | Scale |
|---|---|---|---|
| url | ✓ | ✓ | ✓ |
| extractionHint | ✓ | ✓ | ✓ |
| includeScreenshot | — | ✓ | ✓ |
| forceRefresh | — | ✓ | ✓ |
| strategies | — | ✓ | ✓ |
| minWait | — | ✓ | ✓ |
| requestId | ✓ | ✓ | ✓ |
| Batch mode | — | Up to 20 | Custom |
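The matrix can be encoded as a lookup to flag parameters a plan does not include before a request is sent. This is a sketch only: the lowercase plan keys and the helper name are hypothetical, and this page does not say whether the API rejects or ignores a gated parameter.

```python
ALL_PARAMS = {
    "url", "extractionHint", "includeScreenshot",
    "forceRefresh", "strategies", "minWait", "requestId",
}

# Transcribed from the parameter matrix; plan keys are illustrative.
PLAN_PARAMS = {
    "free": {"url", "extractionHint", "requestId"},
    "pro": ALL_PARAMS,
    "scale": ALL_PARAMS,
}


def unavailable_params(plan: str, body: dict) -> list:
    """Return the request parameters that the given plan does not include."""
    return sorted(set(body) - PLAN_PARAMS[plan])
```

For example, a Free-plan body containing `minWait` and `includeScreenshot` would have both names reported, while the same body on Pro would report none.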