Concepts

Choose Your Endpoint

CRW has six capabilities. Pick the one that matches your input and output, and you will have less code to write and fewer round-trips to debug.

Six verbs: scrape, map, crawl, search, extract, parse

One base URL: https://api.fastcrw.com

Extract note: not a separate route — it is scrape + JSON format

Jump to decision tree View comparison table

Comparison table

Verb	Route	Input	Output	Use when	LLM required?
scrape	`POST /v1/scrape`	A single URL	Markdown, HTML, plain text, links, or raw HTML	You know the exact page URL and want its content	No (yes for `summary` format)
map	`POST /v1/map`	A domain or start URL	List of URLs discovered under that origin	You need to enumerate pages before scraping or crawling	No
crawl	`POST /v1/crawl`	A start URL	Async job — poll `GET /v1/crawl/{id}` for all pages	You want every page under a URL scraped in one background job	No (yes if you add `summary` to `scrapeOptions`)
search	`POST /v1/search`	A query string	Ranked web search results, optionally with scraped content	You do not have a URL — you want the web to find relevant pages	No (yes for `answer`/`summarize_results` options)
extract	`POST /v1/scrape`	A URL + JSON schema	`data.json` — a filled-in object matching your schema	You need structured fields (price, title, date…) not prose	Yes
parse	`POST /v2/parse`	A PDF file upload	Markdown (or JSON/summary with schema) from the document	You have a local file, not a URL	No (yes for `summary`/`json` formats)

Extract is not a separate route. It is the same POST /v1/scrape endpoint with formats: ["json"] and a jsonSchema field. The engine scrapes the page, then passes the content to an LLM alongside your schema to fill it in.

Decision tree

Do you have a file (PDF) to parse?
  └─ Yes ──► Parse   POST /v2/parse

Do you know the exact URL of the page you want?
  ├─ Yes ──► Do you need structured fields (price, date, …)?
  │            ├─ Yes ──► Extract  POST /v1/scrape  (formats:["json"] + jsonSchema)
  │            └─ No  ──► Scrape   POST /v1/scrape
  └─ No  ──► Are you looking across an entire site?
               ├─ Yes ──► Do you want every page's content in one job?
               │            ├─ Yes ──► Crawl  POST /v1/crawl
               │            └─ No  ──► Map    POST /v1/map
               └─ No  ──► Search  POST /v1/search

About extract

Extract reuses the scrape route. The only difference from a plain scrape is that you add two fields to the request body:

{
  "url": "https://example.com/product/42",
  "formats": ["json"],
  "jsonSchema": {
    "type": "object",
    "properties": {
      "title":  { "type": "string" },
      "price":  { "type": "string" }
    },
    "required": ["title"]
  }
}

CRW scrapes the page and then calls an LLM with your schema to produce the data.json field in the response. Because an LLM call is involved, extraction requires either a server-side [extraction.llm] configuration (self-hosted) or a per-request llmApiKey. On the hosted service at api.fastcrw.com this is handled automatically.

See Extract for the full parameter reference and provider options.

Quick examples

Scrape one page as markdown:

curl -X POST https://api.fastcrw.com/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","formats":["markdown"]}'

Discover all URLs on a site:

curl -X POST https://api.fastcrw.com/v1/map \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com"}'

Crawl an entire site (start + poll):

# Start
curl -X POST https://api.fastcrw.com/v1/crawl \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com"}'

# Poll with the returned job ID
curl https://api.fastcrw.com/v1/crawl/{id} \
  -H "Authorization: Bearer YOUR_API_KEY"

Search the web:

curl -X POST https://api.fastcrw.com/v1/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"open source web scraping 2025","limit":5}'

Extract structured data:

curl -X POST https://api.fastcrw.com/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/42",
    "formats": ["json"],
    "jsonSchema": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "price": {"type": "string"}
      },
      "required": ["title"]
    }
  }'

Parse a PDF:

curl -X POST https://api.fastcrw.com/v2/parse \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/document.pdf"

CRW — open-source web scraper & crawler (Firecrawl-compatible)

Quick: copy this and run

Python (stdlib only — defaults to hosted, fails loudly if key is missing)

cURL

CLI (one-shot LLM-ready output)

Response shape

Other endpoints

Install (local setup only)

Resources

Choose Your Endpoint

Comparison table

Decision tree

About extract

Quick examples

What to read next