{P}

Parameters

All request parameters for /v1/parse — from basic URL parsing to screenshots and extraction hints.

Request body

All parameters are sent as JSON in the request body.

url required

Type: string

The full URL of the product page to extract. Must include the scheme (https://). Redirects are followed automatically.

{ "url": "https://www.shop.com/products/blue-widget" }

URLs must resolve to a product page. Category pages, homepages, and search result pages will return a 422 error.


extractionHint

Type: string · Default: none

Natural-language instructions that guide the extraction AI. Use this when you need data the standard schema doesn't capture, or when a site structures its data unusually.

{
  "url": "https://example.com/product",
  "extractionHint": "Return each storage variant separately. Include estimated shipping time and country of origin."
}

Good hints:

  • "Split colour and storage into separate variant entries"
  • "Include the quantity break pricing table"
  • "Return the warranty period as a separate field called warranty_years"
  • "The page shows a sale price and a crossed-out original price — capture both"

Hints are only used when the adaptive or visual extraction path is triggered. They have no effect when JSON-LD is found (fast-path).


includeScreenshot

Type: boolean | "full-height" · Default: false

When true, captures a viewport screenshot (1280×900) during extraction. Set to "full-height" for a full-page screenshot of the entire scrollable content.

{
  "url": "https://example.com/product",
  "includeScreenshot": true
}

Screenshots are stored for 7 days and linked in the response as artifacts.screenshotUrl. Useful for:

  • Debugging extraction issues visually
  • Compliance audit trails
  • Verifying the page state at extraction time

Screenshots add ~200–400ms to the extraction time. They do not consume an extra request from your quota.


forceRefresh

Type: boolean · Default: false

When true, ignores the cached domain extraction script and regenerates it from scratch. Use this when a site has recently redesigned its product pages and the existing script is returning stale selectors.

{
  "url": "https://example.com/product",
  "forceRefresh": true
}

forceRefresh: true triggers full visual extraction and will consume an API credit even if the domain was previously on the free fast-path. Only use it when you know the site layout has changed.


strategies

Type: string[] · Default: ["jsonld", "cached-script", "adaptive"]

Override the extraction strategy order. Available strategies:

StrategyDescription
jsonldExtract from embedded JSON-LD / microdata schema
cached-scriptRun domain-specific cached extraction selectors
adaptiveGenerate new selectors using AI (triggers on first visit)
visualScreenshot + vision model extraction
{
  "url": "https://example.com/product",
  "strategies": ["visual", "jsonld"]
}

This example forces visual extraction first, then falls back to JSON-LD. Useful for sites where the JSON-LD is incomplete but the page renders all data visually.


minWait

Type: number (milliseconds) · Default: 0

Extra time to wait after the page loads before extracting. Use for pages with heavy JavaScript rendering or lazy-loaded product data.

{
  "url": "https://example.com/product",
  "minWait": 1500
}

Values above 5000 are clamped to 5000ms.


requestId

Type: string · Default: auto-generated

A client-supplied identifier for this request. Appears in dashboard logs and error reports, making it easier to correlate extractions with your own application logs.

{
  "url": "https://example.com/product",
  "requestId": "order-import-batch-2024-05-12-item-42"
}

Must be 3–128 characters. Alphanumeric, hyphens, and underscores only.


Parameter matrix

ParameterFreeProScale
url
extractionHint
includeScreenshot
forceRefresh
strategies
minWait
requestId
Batch modeUp to 20Custom