Appearance
POST /extract
Extract structured data from a single URL.
POST https://api.scrapewithruno.com/v1/extract
X-API-Key: sk_live_...
Content-Type: application/jsonRequest Body
json
{
"url": "https://example.com/profile/rachel-mcadams",
"schema": [
{ "field": "firstName", "type": "string", "example": "Rachel" },
{ "field": "age", "type": "integer", "example": 35 },
{ "field": "netWorth", "type": "float", "example": 12500000.00 },
{ "field": "isVerified", "type": "boolean", "example": true },
{ "field": "tags", "type": "array<string>", "example": ["actress"] }
],
"options": {
"render_js": "auto",
"locale": "en-US",
"timeout_ms": 15000,
"process_images": false
}
}| Field | Type | Required | Notes |
|---|---|---|---|
url | string | yes | Absolute URL to fetch |
schema | array | dynamic keys only | See Defining a Schema. Static keys must omit it |
options.render_js | "auto" | "always" | "never" | no | Default "auto" |
options.locale | string | no | BCP-47 tag passed to the browser context (e.g., "en-US", "de-DE") |
options.timeout_ms | integer | no | Per-page deadline; default 15000 |
options.process_images | boolean | no | Vision pass on null fields. Scale only |
Response
json
{
"url": "https://example.com/profile/rachel-mcadams",
"status": "success",
"render_mode": "headless",
"data": {
"firstName": "Rachel",
"age": 45,
"netWorth": 8000000.00,
"isVerified": true,
"tags": ["actress"]
},
"images_processed": null
}| Field | Notes |
|---|---|
status | "success" or "error" |
render_mode | "fetch" | "headless" | "structured" | "cache" |
data | Object keyed by your schema's field names. Unresolvable fields are null |
images_processed | Number of images consumed by the vision pass; null when the pass didn't run |
warnings | (optional) Coercion warnings; omitted when empty |
The job ID for the request is on the response headers as X-Job-ID. See Jobs.
Errors
Most errors come back with status: "error" and a top-level error object:
json
{
"status": "error",
"error": { "code": "FETCH_BLOCKED", "message": "...", "retryable": true }
}Common codes for /extract:
| Code | When |
|---|---|
SCHEMA_INVALID | Schema malformed |
SCHEMA_REQUIRED | Dynamic key without schema |
SCHEMA_NOT_ALLOWED | Static key with inline schema |
URL_UNREACHABLE | DNS / network failure |
TIMEOUT | Page exceeded timeout_ms |
FETCH_BLOCKED | Anti-bot defeated both fetch and headless |
TIER_REQUIRED | Site needs T4/T5 bypass (Pro/Scale only) |
RATE_LIMITED | Per-minute limit hit |
QUOTA_EXCEEDED | Monthly quota exhausted |
LLM_* | Various LLM failures, see Errors & Retries |
Examples
bash
curl -X POST https://api.scrapewithruno.com/v1/extract \
-H "X-API-Key: $RUNO_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/p/123",
"schema": [
{ "field": "title", "type": "string", "example": "Example product" },
{ "field": "price", "type": "float", "example": 19.99 }
]
}'js
const res = await fetch('https://api.scrapewithruno.com/v1/extract', {
method: 'POST',
headers: {
'X-API-Key': process.env.RUNO_API_KEY,
'Content-Type': 'application/json',
},
body: JSON.stringify({
url: 'https://example.com/p/123',
schema: [
{ field: 'title', type: 'string', example: 'Example product' },
{ field: 'price', type: 'float', example: 19.99 },
],
}),
})
const jobId = res.headers.get('X-Job-ID')
const json = await res.json()
console.log(jobId, json.data)py
import os, requests
res = requests.post(
"https://api.scrapewithruno.com/v1/extract",
headers={"X-API-Key": os.environ["RUNO_API_KEY"]},
json={
"url": "https://example.com/p/123",
"schema": [
{"field": "title", "type": "string", "example": "Example product"},
{"field": "price", "type": "float", "example": 19.99},
],
},
timeout=60,
)
print("job:", res.headers.get("X-Job-ID"))
print(res.json()["data"])