Self-hosted backends
pi-web-agent can use existing self-hosted services for search and page reading:
- SearXNG for search
- Firecrawl for page fetch/extraction
This keeps the public Pi tool the same: the model still calls web_explore. The backend config only changes what web_explore uses internally.
What this page does not cover
This project does not manage SearXNG or Firecrawl deployments.
Use the upstream docs for:
- installing either service
- Docker Compose files
- reverse proxies
- TLS
- auth setup
- service upgrades
The assumption here is that you already have working services and just want pi-web-agent to connect to them.
Default backend config
Without any backend config, pi-web-agent uses:
{
"backends": {
"search": { "provider": "duckduckgo" },
"fetch": { "provider": "http" },
"headless": { "provider": "local-browser" }
}
}That path does not require SearXNG or Firecrawl.
Config file locations
Global config:
~/.pi/agent/extensions/pi-web-agent/config.jsonProject config:
.pi/extensions/pi-web-agent/config.jsonProject config overrides global config.
SearXNG search
To use SearXNG for search, set backends.search.provider to searxng and provide baseUrl:
{
"backends": {
"search": {
"provider": "searxng",
"baseUrl": "http://localhost:8080"
}
}
}pi-web-agent expects SearXNG JSON search to work at:
/search?q=example&format=jsonRun this after editing config:
/web-agent doctorDoctor checks that the configured SearXNG endpoint responds with JSON that looks like search results.
Firecrawl fetch
To use Firecrawl for page reading, set backends.fetch.provider to firecrawl and provide baseUrl:
{
"backends": {
"fetch": {
"provider": "firecrawl",
"baseUrl": "http://localhost:3002"
}
}
}pi-web-agent calls Firecrawl's scrape endpoint:
/v1/scrapeIf your Firecrawl instance requires an API key, prefer an environment variable:
PI_WEB_AGENT_FIRECRAWL_API_KEY=...You can also set an API key in config for local-only setups:
{
"backends": {
"fetch": {
"provider": "firecrawl",
"baseUrl": "http://localhost:3002",
"apiKey": "..."
}
}
}Avoid committing project config files that contain secrets.
Full self-hosted example
{
"backends": {
"search": {
"provider": "searxng",
"baseUrl": "http://localhost:8080"
},
"fetch": {
"provider": "firecrawl",
"baseUrl": "http://localhost:3002"
},
"headless": {
"provider": "local-browser"
}
}
}You can combine this with presentation settings in the same file:
{
"presentation": {
"defaultMode": "preview"
},
"backends": {
"search": {
"provider": "searxng",
"baseUrl": "http://localhost:8080"
},
"fetch": {
"provider": "firecrawl",
"baseUrl": "http://localhost:3002"
},
"headless": {
"provider": "local-browser"
}
}
}Verify the setup
Show the effective config:
/web-agent showRun diagnostics:
/web-agent doctorExpected healthy output includes lines like:
search: searxng (http://localhost:8080)
fetch: firecrawl (http://localhost:3002)
backend config: ok
search backend: searxng ok
fetch backend: firecrawl okThen try a normal research prompt:
Find current docs for configuring Vitest coverage with the v8 provider.The model should still use web_explore; it should not need separate SearXNG or Firecrawl tool calls.
Troubleshooting
search provider searxng requires backends.search.baseUrl
You set provider to searxng but did not include baseUrl.
fetch provider firecrawl requires backends.fetch.baseUrl
You set provider to firecrawl but did not include baseUrl.
search backend: searxng warning
Check that:
- SearXNG is running
- the configured URL is reachable from the Pi process
- JSON output works with
format=json
fetch backend: firecrawl warning
Check that:
- Firecrawl is running
/v1/scrapeis available- the API key is set if your instance requires auth
- the Pi process can reach the configured URL
Self-hosted privacy expectations
pi-web-agent does not silently fall back from SearXNG to DuckDuckGo or from Firecrawl to plain HTTP when you choose self-hosted providers. Fallback policy needs to be explicit because some users choose self-hosted backends specifically to avoid external services.