The playground on these pages works without an API key — the anonymous tier is live. Try it now:
cURL
curl -s https://search-api-staging-779189860552.europe-west1.run.app/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "linux kernel scheduler", "max_results": 3}'

Base URL and spec

Base URL | https://search-api-staging-779189860552.europe-west1.run.app
OpenAPI spec | GET /openapi/public.json (stable, machine-readable, served unauthenticated)
Agents and code generators should fetch the spec directly; the generated endpoint pages alongside this one are rendered from it.

Authentication

Bearer tokens are optional: send Authorization: Bearer $CAESAR_API_KEY for keyed access (100 requests/second, account-level attribution), or send nothing and run as the anonymous tier (30 requests/second per IP). If a token is present it is always validated — a bad key returns 401 invalid_api_key rather than falling back to anonymous. See Authentication.
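
As a minimal sketch of the rule above, the helper below attaches the Authorization header only when a key is actually configured, so an unset or empty key runs as the anonymous tier instead of triggering 401 invalid_api_key. The function name build_headers is hypothetical and not part of any SDK; reading the key from the CAESAR_API_KEY environment variable mirrors the placeholder used in the text.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for the anonymous or keyed tier (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    # Attach Authorization only when a non-blank key is configured: a token
    # that is present but invalid yields 401 invalid_api_key rather than
    # falling back to the anonymous tier.
    if api_key and api_key.strip():
        headers["Authorization"] = f"Bearer {api_key.strip()}"
    return headers


# Unset or empty CAESAR_API_KEY means the request runs as the anonymous tier.
headers = build_headers(os.environ.get("CAESAR_API_KEY"))
```

Leaving the header out entirely, rather than sending an empty token, is what keeps a misconfigured environment on the anonymous tier.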

Conventions

Convention | Detail
JSON only | Requests and responses are application/json. Bodies must be a single JSON object; unknown fields are rejected with 400 validation_error; max body size is 1 MiB.
snake_case | All field names are snake_case (doc_id, max_results, canonical_url), exactly as the API returns them.
UUIDs everywhere | Every identifier (request_id, search_id, session_id, doc_id, passage_id, capture_id, feedback_id) is a plain UUID string.
request_id | Every response, success or error, carries one. Quote it when reporting an issue.
session_id | Supply one through the session_id body field (on /v1/search and /v1/feedback) or the X-Session-ID header (on any endpoint; /v1/document accepts only the header). If you supply neither, the server issues a fresh UUID per request and echoes it. Pass it back on later calls for continuity. See Sessions.
access block | Success responses include your tier and live rate-limit state.
usage block | Success responses include requests, bytes_returned, and approx_tokens. approx_tokens is a bytes_returned / 4 estimate, not a tokenizer count.
The access and usage blocks look like this:
{
  "access": {
    "tier": "anonymous",
    "rate_limit": { "limit_rps": 30, "remaining": 29, "reset_at": "2026-06-12T17:06:47Z" }
  },
  "usage": { "requests": 1, "bytes_returned": 550, "approx_tokens": 138 }
}
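
The sketch below shows how a client might apply these conventions: a pre-flight check for the single-object and 1 MiB body rules, and a reader for the access and usage blocks. All function names are hypothetical. The round-up in estimate_tokens is an assumption: it matches the sample above (550 / 4 = 137.5, reported as 138), but the page does not state the rounding rule.

```python
import json
import math
from datetime import datetime

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB, the documented request-body limit


def encode_body(payload: dict) -> bytes:
    """Serialize a request body after checking the documented shape and size."""
    if not isinstance(payload, dict):
        raise TypeError("request body must be a single JSON object")
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes; limit is {MAX_BODY_BYTES}")
    return body


def estimate_tokens(bytes_returned: int) -> int:
    """bytes_returned / 4, rounded up (the rounding mode is an assumption)."""
    return math.ceil(bytes_returned / 4)


def read_quota(response: dict) -> dict:
    """Extract tier and rate-limit state from a success response."""
    access = response["access"]
    limit = access["rate_limit"]
    return {
        "tier": access["tier"],
        "remaining": limit["remaining"],
        # Swap the trailing "Z" so fromisoformat also works before Python 3.11.
        "reset_at": datetime.fromisoformat(limit["reset_at"].replace("Z", "+00:00")),
    }
```

Checking the body locally only saves a round trip; the server remains the authority on what it accepts, including the unknown-field rule that this sketch does not attempt to reproduce.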

Endpoints

Endpoint | Description
POST /v1/search | Run ranked retrieval over canonical documents and passages.
POST /v1/document | Inspect one canonical document and retrieve selected content.
POST /v1/feedback | Persist an agent or evaluation feedback event.
The SDKs, CLI, and remote MCP server all wrap these exact three endpoints — same fields, same semantics.

Rate limits

Every response carries four headers: X-RateLimit-Tier, X-RateLimit-Limit-RPS, X-RateLimit-Remaining, and X-RateLimit-Reset (RFC3339). Exceeding your tier’s per-second limit returns 429 rate_limited. There is no Retry-After header — wait until X-RateLimit-Reset or back off exponentially. Details in Rate limits.
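
Because there is no Retry-After header, a client has to derive its own wait time after a 429. The sketch below is one possible approach, not a prescribed one: it prefers X-RateLimit-Reset when the header parses to a future time and otherwise falls back to capped exponential backoff. The function names and the base and cap defaults are illustrative assumptions.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Seconds until X-RateLimit-Reset, or None if the header is missing or unparseable."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-ratelimit-reset")
    if not raw:
        return None
    try:
        # Swap the trailing "Z" so fromisoformat also works before Python 3.11.
        reset_at = datetime.fromisoformat(raw.strip().replace("Z", "+00:00"))
    except ValueError:
        return None
    if reset_at.tzinfo is None:
        reset_at = reset_at.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())


def retry_delay(headers: Mapping[str, str], attempt: int,
                now: Optional[datetime] = None,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based) after a 429 rate_limited."""
    wait = seconds_until_reset(headers, now)
    if wait:
        # Cap the wait so a skewed clock cannot stall the client indefinitely.
        return min(wait, cap)
    # Header absent, unparseable, or already in the past by the local clock:
    # fall back to capped exponential backoff.
    return min(cap, base * (2 ** attempt))
```

A caller would sleep for retry_delay(response_headers, attempt) before resending. Adding random jitter to the backoff branch is a common refinement when many clients share one IP.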

Errors

All errors share one envelope: type is the literal string "error", plus request_id and an error object with a stable snake_case code, a human-readable message, and optional details.
{
  "type": "error",
  "request_id": "ee314ccd-dbcd-418f-9c6a-f2917c599c67",
  "error": { "code": "invalid_api_key", "message": "missing or invalid API key" }
}
Branch on error.code, not message. The full code table is in Errors.
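
To illustrate branching on error.code, the sketch below turns the envelope into an exception and maps codes to a next step. ApiError, raise_for_error, next_step, and the returned action labels are hypothetical names. Only the three codes mentioned on this page are handled; the full table is in Errors.

```python
from typing import Any, Optional


class ApiError(Exception):
    """Carries the fields of the shared error envelope."""

    def __init__(self, code: str, message: str,
                 request_id: Optional[str], details: Any = None) -> None:
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id
        self.details = details


def raise_for_error(body: dict) -> dict:
    """Return the body unchanged unless it is an error envelope."""
    if body.get("type") != "error":
        return body
    err = body.get("error") or {}
    raise ApiError(
        code=err.get("code", "unknown"),  # "unknown" is a local placeholder, not an API code
        message=err.get("message", ""),
        request_id=body.get("request_id"),
        details=err.get("details"),
    )


def next_step(exc: ApiError) -> str:
    """Decide what to do from the stable code, never from the message text."""
    if exc.code == "rate_limited":
        return "retry_later"
    if exc.code == "invalid_api_key":
        return "fix_credentials"
    if exc.code == "validation_error":
        return "fix_request"
    return "report_with_request_id"
```

Keeping request_id on the exception makes it easy to quote when reporting an issue.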

Idempotency and caching

All three endpoints are POST and there is no idempotency key. Repeating an identical search is not guaranteed to return identical results: ordering, scores, and search_id values can differ between replays. score.value is response-local — comparable only within the response that returned it, never across responses, modes, or ranker versions. Treat each response as a snapshot, and use doc_id plus provenance when you need stable references.
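
One way to compare two replays without relying on scores is to match items by doc_id alone, as in the sketch below. It assumes each result item is a mapping with a doc_id key; this page does not describe where result items sit in the response, so the functions take the item lists directly. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping


def ranked_doc_ids(items: Iterable[Mapping]) -> List[str]:
    """doc_ids in the order returned, with repeats dropped."""
    seen = set()
    ordered = []
    for item in items:
        doc_id = item["doc_id"]
        if doc_id not in seen:
            seen.add(doc_id)
            ordered.append(doc_id)
    return ordered


def compare_replays(first: Iterable[Mapping], second: Iterable[Mapping]) -> Dict[str, object]:
    """Compare two result lists by doc_id only; scores are deliberately ignored."""
    a = ranked_doc_ids(first)
    b = ranked_doc_ids(second)
    in_a, in_b = set(a), set(b)
    return {
        "shared": [d for d in a if d in in_b],
        "only_first": [d for d in a if d not in in_b],
        "only_second": [d for d in b if d not in in_a],
        "same_order": a == b,
    }
```

Scores never enter the comparison because score.value is meaningful only inside the response that returned it.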