Errors and Warnings
Response semantics
success: truemeans the pipeline produced usable page content.success: falsemeans the request failed -- either an engine error or the target returned an error status (4xx/5xx) with minimal content.erroris present whensuccess: false. It describes what went wrong.warningflags degraded target outcomes (anti-bot pages, problematic status codes) whensuccess: true-- meaning content was produced but may be incomplete.metadata.statusCodeis the target site's HTTP status.datamay still be present whensuccess: falseif partial content was extracted.
The important distinction is that there are two layers of status:
- the HTTP status returned by CRW,
- and the target site's own status exposed through
metadata.statusCode.
You need both to debug real scraping failures.
HTTP status codes returned by the API
| Status | Meaning |
|---|---|
| 200 | Success |
| 400 | Invalid request parameters (bad URL, invalid JSON body, invalid selector) |
| 401 | Invalid or missing API key |
| 404 | Endpoint not found |
| 422 | Validation failed (unknown format, invalid schema, extraction error) |
| 429 | Rate limit or credit quota exceeded |
| 502 | Engine internal error |
| 503 | Server at capacity |
| 504 | Request timed out |
How To Read Common Cases
| Situation | What it usually means | What to do next |
|---|---|---|
HTTP 200 with warning |
The request succeeded, but the target result is degraded | Inspect warning and metadata.statusCode |
HTTP 400 |
Your request body is invalid | Fix fields, selectors, or schema |
HTTP 422 |
The request shape is valid JSON but semantically invalid | Check format names, schema, or extraction config |
HTTP 429 |
Rate limit or credit ceiling hit | Back off and honor Retry-After |
HTTP 502 / 504 |
Upstream or timeout issue | Retry with backoff |
Common engine errors
| Error | When it happens |
|---|---|
| Invalid URL | URL is malformed or targets a blocked address |
| Invalid selector | CSS selector or XPath expression cannot be parsed |
| Renderer timeout | JS rendering exceeded the page timeout |
| Navigation failed | CDP browser could not load the page |
| Response too large | Page exceeded the maximum allowed size |
| Invalid JSON schema | Schema provided for extraction is malformed |
| Extraction failure | LLM extraction failed (no LLM configured, or LLM returned an error) |
| No JS renderer available | renderJs: true but no CDP browser is configured |
Common warnings
| Warning | When it appears |
|---|---|
Target returned 403 Forbidden |
Target site blocked the request |
Target returned 429 Too Many Requests |
Target site rate limited the request |
Blocked by anti-bot protection |
Page contains Cloudflare/captcha markers |
JS rendering was requested but no renderer is available |
Fallback to HTTP-only fetch |
Retry Guidance
Retrying helps only for some classes of failure:
- retry
429,502, and504with backoff, - do not blindly retry
400or422, - and treat repeated warnings from the same domain as a target compatibility problem, not a random transient issue.
If a page is repeatedly blocked by anti-bot protection, longer retry loops usually make the situation worse rather than better.