Self-Hosting Hardening
Minimum hardening baseline
- Terminate TLS in front of the API.
- Run the service as a non-root user.
- Restrict inbound access to required ports only.
- Isolate renderer sidecars from unnecessary network paths.
That baseline is the starting point, not the finish line. A self-hosted scraper talks to untrusted public pages and can sit close to valuable internal systems, so it deserves the same discipline as any other internet-facing API.
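As one way to enforce the non-root item from the baseline, a wrapper or entrypoint script could refuse to start when it detects root. This is a minimal sketch, not part of CRW itself; the function names are hypothetical, and `os.geteuid()` is only available on Unix-like systems.

```python
import os
import sys


def running_as_root(euid: int) -> bool:
    """Return True when the effective user ID is root (0)."""
    return euid == 0


def preflight() -> None:
    """Exit early if the process was started as root (Unix-like systems only)."""
    if running_as_root(os.geteuid()):
        sys.exit("refusing to start: run the service as a non-root user")
```

Calling `preflight()` at the top of an entrypoint script makes an accidental root deployment fail loudly instead of running silently with excess privilege.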
Network and Access Control
- Put a reverse proxy or gateway in front of the service.
- Restrict who can reach the API by network, identity, or both.
- Avoid exposing internal health or admin surfaces to the public internet.
- If browser rendering is enabled, isolate the renderer from internal systems it does not need to reach.
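Restricting access by network is normally done in the proxy, gateway, or firewall, but the underlying check is simple. The sketch below shows the idea with Python's standard `ipaddress` module; the CIDR ranges and the `is_allowed` helper are illustrative assumptions, not CRW settings.

```python
import ipaddress

# Hypothetical allowlist: replace with the networks that should reach the API.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.10.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside one of the allowed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable addresses are rejected rather than trusted.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

Rejecting unparseable input by default keeps a malformed forwarded-address header from bypassing the check.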
Runtime Isolation
Treat page fetching and browser rendering as higher-risk components than your application logic:
- run them with the least privilege possible,
- keep filesystem access narrow,
- and isolate sidecars so a renderer problem does not automatically become a broader platform problem.
Secrets and Keys
- Keep API keys, proxy credentials, and LLM keys out of image builds.
- Inject secrets at runtime through your platform's secret store.
- Rotate keys during environment changes or incident response, not only on a fixed calendar.
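A small startup check can confirm that secrets really arrive through runtime injection rather than being baked into the image. This is a sketch under assumptions: the environment variable names in the usage note are hypothetical placeholders, and your platform's secret store may expose values differently (for example as mounted files).

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected at runtime."""


def load_secrets(names, env=None):
    """Read the named secrets from the environment, failing if any is absent or empty."""
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Report names only; never log secret values.
        raise MissingSecretError("missing required secrets: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

For example, `load_secrets(["SCRAPER_API_KEY", "PROXY_PASSWORD"])` would fail fast at startup if either hypothetical variable were missing, which is easier to diagnose than an authentication error later.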
LLM features and trust boundary
CRW's summary format and /v1/search answer/summarize features need an LLM key. There are two deployment shapes; pick one and lock it down with the runtime guards listed below.
Solo / self-hosted (key in server config):
[extraction.llm]
provider = "openai"
api_key = "sk-..."
model = "gpt-4o-mini"
Anyone who can reach your opencore can spend on that key. Front it with auth, network policy, or a private network.
SaaS / multi-tenant (BYOK, per-request keys):
- Set CRW_DISABLE_SERVER_LLM_KEY=1 in opencore's environment. With this env var set, opencore refuses to boot if [extraction.llm].api_key is also configured, which is the most common operator mistake.
- Set [extraction.llm].require_byok_header = "X-CRW-Tenant" (or similar). CRW rejects LLM-touching requests that lack that header and do not pass a per-request llmApiKey. Your SaaS layer adds the header on every forwarded request; direct public callers cannot.
- Don't expose opencore on a public address; keep it behind your SaaS proxy.
- Use GET /v1/capabilities on boot from the SaaS layer to verify the opencore version's feature set before showing LLM toggles in your UI.
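The header guard only helps if the SaaS layer is the sole party that can set the header. One way to sketch that in the forwarding code is to drop any client-supplied copy and then set the header from the authenticated tenant. The `build_forward_headers` helper and the use of a tenant ID as the header value are assumptions for illustration; the header name follows the X-CRW-Tenant example above.

```python
TENANT_HEADER = "X-CRW-Tenant"


def build_forward_headers(client_headers: dict, tenant_id: str) -> dict:
    """Return headers for the upstream request with a trusted tenant header."""
    # Drop any copy of the header sent by the client (header names are
    # case-insensitive), so callers cannot spoof it through the proxy.
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != TENANT_HEADER.lower()
    }
    # Set the header from the tenant the SaaS layer has already authenticated.
    forwarded[TENANT_HEADER] = tenant_id
    return forwarded
```

Stripping before setting matters: merging the header on top of the client's headers without removing differently cased duplicates could leave two conflicting values in the forwarded request.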
Per-request budget:
- [extraction.llm].max_html_bytes (default 100000) caps content sent to the LLM.
- Per-request maxContentChars and maxCharsPerSource are clamped server-side (200 KB and 32 KB respectively) regardless of value.
- summaryPrompt and answerPrompt are truncated at 500 chars and cannot override the safety wrapper.
- The citation list is capped at 20 entries; fabricated source_ids are dropped.
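To make the prompt and citation limits concrete, the sketch below mirrors them in plain Python. It illustrates the documented behavior and is not CRW's implementation: the helper names are hypothetical, and filtering unknown IDs before applying the cap is an assumption about ordering.

```python
MAX_PROMPT_CHARS = 500   # documented truncation limit for summaryPrompt / answerPrompt
MAX_CITATIONS = 20       # documented cap on the citation list


def trim_prompt(prompt: str) -> str:
    """Truncate a custom prompt to the documented character limit."""
    return prompt[:MAX_PROMPT_CHARS]


def filter_citations(cited_ids, known_ids):
    """Drop citations whose source ID was never supplied, then apply the cap."""
    known = set(known_ids)
    kept = [source_id for source_id in cited_ids if source_id in known]
    return kept[:MAX_CITATIONS]
```

A SaaS layer could apply the same trimming before forwarding so users see the effective prompt instead of being surprised by server-side truncation.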
Operational guidance
- Rotate API keys during deployment cutovers.
- Keep browser-rendering dependencies on the smallest possible surface area.
- Expose /health only where your load balancer or monitoring needs it.
- Review warning-heavy targets separately; they often indicate anti-bot defenses rather than renderer bugs.
Monitoring and Auditability
At minimum, watch:
- API error rate,
- warning frequency,
- crawl job duration,
- renderer availability,
- and resource spikes on the browser sidecar.
Keep enough logs to answer three questions after an incident:
- what URL or workload triggered the issue,
- whether it was an engine problem or a target-site problem,
- and what data, if any, was still returned.
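One structured log line per request can answer all three questions. The record below is a hypothetical shape, not a CRW log format; the field names and the "engine" / "target-site" labels are assumptions chosen to match the questions above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class IncidentRecord:
    url: str             # what URL or workload triggered the issue
    origin: str          # "engine" or "target-site"
    returned_bytes: int  # how much data, if any, was still returned
    warnings: list       # warning strings observed for this request


def to_log_line(record: IncidentRecord) -> str:
    """Serialize the record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

Emitting one JSON object per line keeps the records easy to filter by `origin` when separating anti-bot failures from engine regressions.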
Example Hardening Sequence
If you are moving from a dev VM to a real environment, the order should usually be:
- put a reverse proxy and TLS in front,
- add auth and external rate limiting,
- move secrets into runtime injection,
- restrict network access around the API and any renderer sidecar,
- then enable monitoring and alerting on warnings, failures, and resource spikes.
That order keeps the riskiest exposure points under control early instead of treating hardening as a final cleanup step.
When To Isolate the Renderer More Aggressively
Stronger isolation is worth it when:
- your targets are highly dynamic and require frequent JS rendering,
- the service runs close to internal systems with sensitive access,
- or many tenants or workloads share the same cluster.
In those cases, a renderer problem should not become an easy pivot into the rest of your infrastructure.
Common Mistakes
- Leaving /health broadly exposed when only an internal load balancer needs it.
- Running the service with broader filesystem or network access than the scraping workload requires.
- Keeping incident logs too thin to separate target-site anti-bot issues from engine regressions.
Pair this page with rate limits and error codes so operational hardening and runtime diagnostics are documented together.