Errors are JSON with a stable machine-readable code. Trigger one against staging right now:
```shell
curl -s https://search-api-staging-779189860552.europe-west1.run.app/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query":"x","mode":"deep"}'
```

```json
{
  "type": "error",
  "request_id": "7e9a1f0c-2f43-4f5a-9d3e-6b1c2a4d5e6f",
  "error": {
    "code": "unsupported_mode",
    "message": "mode must be fast, standard, or research",
    "details": { "field": "mode" }
  }
}
```
The error envelope
Every error response has the same shape:
| Field | Type | Notes |
|---|---|---|
| type | string | Always the literal "error" |
| request_id | string | UUID generated per request; quote it when reporting issues |
| error.code | string | Stable snake_case code; branch on this |
| error.message | string | Human-readable; wording may change, the code will not |
| error.details | object | Optional structured context (often {"field": "..."}); omitted when empty |
| $schema | string | Optional JSON Schema reference; present on errors returned by the API handlers, absent on middleware-emitted ones (401/403/429). Ignore it. |
There is no hint or retryable field; retryability is conveyed by HTTP status alone. Error envelopes never include the access block. The X-RateLimit-* headers are set on every response that reaches the rate limiter, including 400, 404, 429, and 5xx errors. Authentication errors (401/403) are rejected before the rate limiter runs, so they carry no X-RateLimit-* headers.
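As an illustration of consuming the envelope, here is a minimal Python sketch that parses a response body and exposes error.code for branching. The `ApiError` class and `parse_error` function are hypothetical names introduced for this example, not part of any SDK; the sample body is the staging response shown at the top of this page.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApiError:
    """Hypothetical container for the fields of the error envelope."""
    request_id: str
    code: str
    message: str
    details: dict = field(default_factory=dict)


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if body is an error envelope, otherwise None."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(doc, dict) or doc.get("type") != "error":
        return None
    err = doc.get("error")
    if not isinstance(err, dict):
        return None
    return ApiError(
        request_id=doc.get("request_id", ""),
        code=err.get("code", ""),
        message=err.get("message", ""),
        # details is omitted when empty, so default to an empty dict.
        details=err.get("details") or {},
    )


sample = """
{
  "type": "error",
  "request_id": "7e9a1f0c-2f43-4f5a-9d3e-6b1c2a4d5e6f",
  "error": {
    "code": "unsupported_mode",
    "message": "mode must be fast, standard, or research",
    "details": { "field": "mode" }
  }
}
"""

err = parse_error(sample)
if err is not None and err.code == "unsupported_mode":
    # Branch on the stable code, never on the message wording.
    print(f"fix field {err.details.get('field')} (request {err.request_id})")
```

The optional $schema key is simply never read, which matches the advice to ignore it.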
Error codes
These are the codes the API emits today, exhaustively:
| HTTP | error.code | When |
|---|---|---|
| 400 | validation_error | Malformed JSON, an unknown body field, a body over 1 MiB, or an invalid field value. details.field names the offender for invalid field values; malformed JSON, unknown fields, and oversized bodies return details.error with the parser message instead. Non-UUID session_id, doc_id, search_id, and passage_id values land here. |
| 400 | unsupported_mode | mode is not fast, standard, or research. |
| 400 | response_too_large | Only when you opt in with response.budget.on_exceed: "error" and the budget cannot be met. The default (shed) degrades with warnings instead — see response shaping. |
| 401 | missing_api_key | No Bearer token on a deployment that requires keys. (Staging allows keyless access, so you will not see this there.) |
| 401 | invalid_api_key | A Bearer token was sent but is malformed, unknown, expired, or revoked. A bad key never falls back to the anonymous tier. |
| 403 | insufficient_scope | Valid key without the endpoint’s scope (search:read, document:read, feedback:write). |
| 404 | document_not_found | /v1/document target not found; also /v1/feedback when the doc_id is not granted to your account. |
| 404 | search_not_found | /v1/feedback with a search_id that does not belong to your account. |
| 429 | rate_limited | Over your tier’s requests-per-second in the current one-second window. See rate limits. |
| 500 | internal_error | Server fault. Safe to retry. |
| 502 | provider_unavailable | The search infrastructure is unreachable. Transient; retry. |
| 503 | provider_unavailable | The deployment has no retrieval configured. |
The two 404 codes are ownership-scoped: on /v1/feedback they mean the target does not belong to your account, not necessarily that it never existed.
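One way a client might act on the table above is to map each code to a coarse handling category. The sketch below is only an illustration: the category names (`fix_request`, `fix_credentials`, `not_found`, `retry`) and the `action_for` function are assumptions made for this example, and the fallback for unrecognized codes is a design choice, not documented API behavior.

```python
# Hypothetical handling categories keyed by the error codes listed above.
ACTION_BY_CODE = {
    "validation_error": "fix_request",
    "unsupported_mode": "fix_request",
    "response_too_large": "fix_request",
    "missing_api_key": "fix_credentials",
    "invalid_api_key": "fix_credentials",
    "insufficient_scope": "fix_credentials",
    "document_not_found": "not_found",
    "search_not_found": "not_found",
    "rate_limited": "retry",
    "internal_error": "retry",
    "provider_unavailable": "retry",
}


def action_for(status: int, code: str) -> str:
    """Pick a handling category from error.code, falling back to HTTP status."""
    if code in ACTION_BY_CODE:
        return ACTION_BY_CODE[code]
    # Assumption: a code added later still follows the status-based retry rule.
    if status == 429 or status >= 500:
        return "retry"
    return "fix_request"


print(action_for(400, "unsupported_mode"))
print(action_for(502, "provider_unavailable"))
```

Falling back to the HTTP status keeps the client working if the list of codes grows.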
Warnings
Warnings are the API degrading gracefully instead of failing. They share the error shape — code, message, optional details — and arrive in a warnings array on search and document responses (/v1/feedback has no warnings field):
```json
{
  "code": "content_truncated",
  "message": "Returned content was truncated to content.max_chars.",
  "details": { "field": "content.text", "max_chars": 12000 }
}
```
Warnings never invalidate the response. A 200 with warnings is a complete answer: the status stays 200, the payload is valid and usable, and warnings survives every verbosity preset and budget shed.
| code | Endpoint | Fires when |
|---|---|---|
| response_truncated | search | response.budget.max_chars_total forced shedding; the response’s truncated flag is set, details.shed_levels lists what was dropped. |
| budget_unsatisfiable | search | Even a single minimal result exceeds the budget; it is returned anyway. |
| unknown_field | search | response.verbosity is unrecognized; the request proceeds with standard. |
| source_policy_filtered | search | The requested source_policy filtered out one or more results (details.filtered_results). |
| rerank_unavailable | search | Reranking is temporarily unavailable; results use first-stage order and ranking.ranker_version is first_stage_order_v1. |
| content_unavailable | document | Document content could not be fetched for this request. |
| content_truncated | document | Content exceeded content.max_chars; pairs with content.truncated: true and content.start_char for continuation reads. |
| stale_range | document | content.range.capture_id pins a capture that is no longer the latest; offsets may not line up. |
| stale_passage_id | document | One or more requested passage IDs are unavailable in the latest capture (details.passage_ids lists them). |
Missing passage IDs and over-restrictive source policies are deliberately warnings, not errors — your agent gets whatever is available plus a signal about what it did not get.
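A client can treat warnings as signals to inspect rather than failures to raise. The following sketch assumes the response has already been decoded into a Python dict; the helper names, the sample payload, its passage IDs, and the stale_passage_id message text are made up for illustration.

```python
from typing import List


def warning_codes(response: dict) -> List[str]:
    """Return the code of every warning on a search or document response."""
    return [w.get("code", "") for w in response.get("warnings") or []]


def unavailable_passage_ids(response: dict) -> List[str]:
    """Collect passage IDs reported by stale_passage_id warnings."""
    ids: List[str] = []
    for w in response.get("warnings") or []:
        if w.get("code") == "stale_passage_id":
            ids.extend((w.get("details") or {}).get("passage_ids", []))
    return ids


# Hypothetical document response, reduced to its warnings array.
response = {
    "warnings": [
        {
            "code": "content_truncated",
            "message": "Returned content was truncated to content.max_chars.",
            "details": {"field": "content.text", "max_chars": 12000},
        },
        {
            "code": "stale_passage_id",
            "message": "illustrative message text",
            "details": {"passage_ids": ["11111111-1111-4111-8111-111111111111"]},
        },
    ]
}

print(warning_codes(response))
print(unavailable_passage_ids(response))
```

Both helpers return empty lists when the warnings array is absent, so they are safe to call on any 200 response.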
Retrying
Retry 429 and 5xx with backoff; treat 4xx (other than 429) as bugs in the request, not transient failures.
- There is no Retry-After header. On 429, read X-RateLimit-Reset (an RFC 3339 timestamp); limits use fixed one-second windows, so the reset is at most about a second away. Exponential backoff also works fine.
- Rate-limit tokens are spent before validation, so retry loops sending invalid bodies still burn budget.
- The SDKs and CLI retry automatically: the CLI retries 429 and 500+ up to 3 times with exponential backoff capped at 8 seconds (--no-retry disables this).
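For clients that do not use the SDKs or CLI, the retry rules above could be sketched as follows. This is not the SDK's implementation: `retry_delay` is a hypothetical helper, the 8-second cap is borrowed from the CLI behavior described above, and the base delay of one second is an assumption.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                now: Optional[datetime] = None, cap: float = 8.0) -> Optional[float]:
    """Seconds to wait before retrying, or None if the request should not be retried.

    attempt is zero-based; cap mirrors the CLI's 8-second ceiling (an assumption here).
    """
    if status != 429 and status < 500:
        return None  # Other 4xx responses are bugs in the request, not transient.
    backoff = min(cap, 2.0 ** attempt)  # 1s, 2s, 4s, 8s, ...
    if status != 429:
        return backoff
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    reset = lowered.get("x-ratelimit-reset")
    if not reset:
        return backoff
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
    except ValueError:
        return backoff  # Unparseable timestamp: fall back to plain backoff.
    if reset_at.tzinfo is None:
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, min(cap, (reset_at - now).total_seconds()))


fixed_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(retry_delay(429, {"X-RateLimit-Reset": "2024-01-01T00:00:01Z"}, 0, now=fixed_now))
print(retry_delay(400, {}, 0))
```

Because rate-limit tokens are spent before validation, returning None for non-429 4xx responses also avoids burning budget on requests that can never succeed.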
CLI exit codes
caesar-search maps outcomes to exit codes so scripts branch on status, not output parsing. See CLI automation.
| Exit code | Meaning |
|---|---|
| 0 | Success — including a 200 with warnings |
| 2 | Bad input (usage or flag errors, invalid local input) |
| 3 | Auth error (HTTP 401 or 403) |
| 4 | API error (any other non-2xx, after retries; also network failures) |
| 5 | Timeout (default 30s; raise with --timeout) |
With --json, CLI errors go to stderr as {"error":{"code","message","hint"}} — the code is taken from the API envelope’s error.code when one exists. Exit code 1 is not part of the contract.
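A script that wraps the CLI might interpret the outcome along these lines. The sketch takes a return code and captured stderr text (for example from `subprocess.run`) rather than invoking caesar-search itself. The `interpret_exit` function and its labels are hypothetical, and it assumes that with --json stderr holds exactly one JSON object.

```python
import json
from typing import Optional, Tuple

# Labels are illustrative; the numeric codes come from the table above.
EXIT_MEANINGS = {
    0: "success",
    2: "bad_input",
    3: "auth_error",
    4: "api_error",
    5: "timeout",
}


def interpret_exit(returncode: int, stderr: str = "") -> Tuple[str, Optional[str]]:
    """Map a CLI exit code to a label and extract error.code from --json stderr."""
    meaning = EXIT_MEANINGS.get(returncode, "outside_contract")
    api_code = None
    if returncode != 0 and stderr.strip():
        try:
            payload = json.loads(stderr)
        except json.JSONDecodeError:
            payload = None  # stderr was not JSON, e.g. --json was not passed.
        if isinstance(payload, dict) and isinstance(payload.get("error"), dict):
            api_code = payload["error"].get("code")
    return meaning, api_code


print(interpret_exit(0))
print(interpret_exit(4, '{"error":{"code":"rate_limited","message":"m","hint":"h"}}'))
```

Exit code 1 and any other unlisted value map to "outside_contract", reflecting that only the codes in the table are guaranteed.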