Agents guide - Caesar

Caesar is one API with several client surfaces. Pick the surface that matches your harness; everything else — verbs, field names, presets, limits — is identical across them.

For agents reading the docs: use /llms.txt to discover pages, append .md to page URLs for markdown, and start with CLI vs MCP when choosing an agent surface.

Pick your surface

Your harness	Use	Why
Inner-loop shell work (Claude Code, terminal agents, scripts)	The CLI, plus the forked skills in Claude Code	`--json` machine output, `-o <file>` keeps large payloads out of your context window, stable exit codes to branch on
Chat or IDE assistant (Cursor, Windsurf, any MCP client)	The remote MCP server at `/mcp?profile=web-search`	Zero install, streamable HTTP, generic `web_search` / `web_fetch` names so ordinary web searches route to Caesar
Building an application	Python SDK or TypeScript SDK; `caesar-search/ai` for Vercel AI SDK agents	Typed clients with retries built in; `caesarTools()` is a drop-in tool set
Everything else	Raw HTTP — spec served at `/openapi/public.json` on the API host	See the API reference

Same three verbs everywhere

Verb	REST	CLI	MCP	SDKs
Search	`POST /v1/search`	`caesar-search search`	`web_search`	`client.search()`
Read	`POST /v1/document`	`caesar-search read`	`web_fetch`	`client.read()`
Feedback	`POST /v1/feedback`	`caesar-search feedback`	not exposed	`client.feedback()`

The same search on each surface:

curl -s https://alpha.api.trycaesar.com/v1/search \
  -H "Authorization: Bearer $CAESAR_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"query":"zig comptime explained","max_results":5,"response":{"verbosity":"compact"}}'

from caesar_search import Caesar

client = Caesar()  # reads CAESAR_API_KEY
results = client.search("zig comptime explained", max_results=5, verbosity="compact")

import { Caesar } from "caesar-search";

const caesar = new Caesar(); // reads CAESAR_API_KEY
const results = await caesar.search("zig comptime explained", {
  maxResults: 5,
  verbosity: "compact",
});

caesar-search search "zig comptime explained" --max-results 5 --format compact --json

Conventions that hold on every surface

Wire fields are snake_case

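doc_id, search_id, max_results, start_char, canonical_url: field names appear exactly as the API returns them. No surface converts response fields to camelCase, so do not expect docId in any response. SDK call options such as maxResults in the TypeScript example are client-side argument names, not wire fields.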

One set of verbosity presets

verbosity (REST/SDKs), --format (CLI search), and response_format (MCP, AI SDK tools) all select from the same set of presets. Use compact in agent loops: it is the token-efficient choice. standard adds quotable passages; full adds provenance. See response shaping.

Surface	Parameter	Values	Default
REST / SDKs	`response.verbosity` / `verbosity`	`ids_only`, `compact`, `standard`, `full`	`standard`
CLI	`--format`	`ids_only`, `compact`, `standard`, `full`	`standard`
MCP	`response_format`	`compact`, `standard`, `full` (`ids_only` falls back to `compact`)	`compact`
AI SDK tools	`response_format`	`compact`, `standard`, `full`	`compact`
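The table can be restated as a small lookup for picking a preset per surface. This is a minimal sketch, not part of any Caesar client: the surface keys (`rest`, `cli`, `mcp`, `ai_sdk`) and the function name are hypothetical, and raising on an unlisted value is an assumption, since the table only documents the `ids_only` fallback for MCP.

```python
from typing import Optional

ALL_PRESETS = ("ids_only", "compact", "standard", "full")
NO_IDS_ONLY = ("compact", "standard", "full")

# Accepted values and defaults per surface, copied from the table above.
# The dictionary keys are hypothetical labels chosen for this sketch.
SURFACES = {
    "rest": {"values": ALL_PRESETS, "default": "standard"},
    "cli": {"values": ALL_PRESETS, "default": "standard"},
    "mcp": {"values": NO_IDS_ONLY, "default": "compact"},
    "ai_sdk": {"values": NO_IDS_ONLY, "default": "compact"},
}


def resolve_preset(surface: str, requested: Optional[str] = None) -> str:
    """Return the preset a surface would use for a requested value."""
    spec = SURFACES[surface]
    if requested is None:
        return spec["default"]
    if requested in spec["values"]:
        return requested
    # Documented behavior: over MCP, ids_only falls back to compact.
    if surface == "mcp" and requested == "ids_only":
        return "compact"
    # Assumption: anything else is treated as invalid in this sketch.
    raise ValueError(f"{requested!r} is not a preset on surface {surface!r}")
```

An agent that switches between the CLI and MCP can call `resolve_preset` once per request so that it knows which shape to expect back.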

max_results is 1-50

REST and SDKs default to 10 and reject out-of-range values with 400 validation_error. The CLI also defaults to 10 but rejects out-of-range values locally before any request is sent (bad_input, exit 2). MCP defaults to 8 and silently clamps out-of-range values into 1-50. The AI SDK tools (caesarTools()) also default to 8, like MCP.
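To get the same behavior regardless of surface, a caller can normalize max_results before sending it. The sketch below shows both policies described above: strict rejection (as REST, the SDKs, and the CLI do) and silent clamping (as MCP does). The function names are hypothetical helpers, not part of any Caesar client.

```python
MIN_RESULTS = 1
MAX_RESULTS = 50


def validate_max_results(requested: int) -> int:
    """Strict policy: reject out-of-range values before any request is made."""
    if not MIN_RESULTS <= requested <= MAX_RESULTS:
        raise ValueError(
            f"max_results must be {MIN_RESULTS}-{MAX_RESULTS}, got {requested}"
        )
    return requested


def clamp_max_results(requested: int) -> int:
    """Lenient policy: pull out-of-range values into 1-50, as MCP is described to do."""
    return max(MIN_RESULTS, min(MAX_RESULTS, requested))
```

Validating locally avoids spending a round trip on a request that would come back as 400 validation_error.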

Continue truncated reads with start_char

A truncated read reports truncated: true with start_char and char_count (nested under content on REST/CLI/SDKs, top-level over MCP). Continue from start_char + char_count:

caesar-search read "$DOC_ID" --start-char 12000 --json

Do not retry with a bigger max_chars. A non-zero start_char forces full_document selection so offsets stay contiguous — a query passed alongside it will not excerpt.
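The continuation offset can be computed the same way for every surface. The following is a minimal sketch that works on a parsed response dictionary; it assumes truncated, start_char, and char_count sit together, either under a content object or at the top level, and the function name is hypothetical.

```python
from typing import Optional


def next_start_char(response: dict) -> Optional[int]:
    """Return the offset for the next read, or None if nothing was truncated."""
    # REST/CLI/SDKs nest the fields under "content"; MCP reports them top-level.
    body = response.get("content")
    if not isinstance(body, dict):
        body = response
    if not body.get("truncated"):
        return None
    # Continue from start_char + char_count so offsets stay contiguous.
    return body["start_char"] + body["char_count"]
```

A read loop would keep requesting with the returned offset until the function returns None.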

Preserve identifiers, cite only returned URLs

doc_id and search_id are opaque IDs: thread them verbatim between search, read, and feedback. Cite only URLs the API returned (canonical_url, url, source_url) — never construct or guess a URL.
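One way to enforce the citation rule is to check an answer's URLs against the set the API actually returned. This sketch assumes results are available as a flat list of dictionaries carrying some of the three URL fields named above; the real response shape may differ, and both function names are hypothetical.

```python
from typing import Iterable, List, Set

# URL-bearing fields named in the text above.
URL_FIELDS = ("canonical_url", "url", "source_url")


def citable_urls(results: Iterable[dict]) -> Set[str]:
    """Collect every URL the API returned across the given results."""
    return {r[field] for r in results for field in URL_FIELDS if r.get(field)}


def unreturned_citations(cited: Iterable[str], results: Iterable[dict]) -> List[str]:
    """List cited URLs that were never returned and so must not be cited."""
    allowed = citable_urls(results)
    return [url for url in cited if url not in allowed]
```

An empty list from `unreturned_citations` means every citation traces back to a returned URL.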

Errors and retries

All API errors use one envelope:

{
  "type": "error",
  "request_id": "ee314ccd-dbcd-418f-9c6a-f2917c599c67",
  "error": { "code": "rate_limited", "message": "rate limit exceeded" }
}

Retry 429 and 5xx with exponential backoff; the CLI and SDKs already do (3 retries, 500 ms doubling, 8 s cap). Rate-limit windows are one second — the X-RateLimit-Reset header says when the window resets. Do not retry other 4xx. The CLI mirrors errors as a JSON envelope on stderr and exits 2 (bad input), 3 (auth), 4 (API error), or 5 (timeout) — branch on exit codes, not output. See errors and rate limits.
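For callers on raw HTTP, the retry policy above can be sketched in a few lines. This is an illustration of the stated parameters (3 retries, 500 ms doubling, 8 s cap), not the CLI's or SDKs' actual implementation; whether those clients add jitter or honor X-RateLimit-Reset is not covered here, and the function names are hypothetical.

```python
from typing import List


def should_retry(status: int) -> bool:
    """Retry only 429 and 5xx; every other 4xx is a caller error."""
    return status == 429 or 500 <= status <= 599


def backoff_delays(retries: int = 3, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Delays in seconds before each retry: base doubling per attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]
```

With the defaults, `backoff_delays()` yields waits of 0.5, 1.0, and 2.0 seconds; the 8-second cap only takes effect if the retry count is raised.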

API keys are required

Before any search, read, or feedback call, supply a credential in one of three ways: set CAESAR_API_KEY (for CI); run caesar-search auth login (interactive: it opens a browser and stores a named, revocable key; use --device over SSH); or connect the Caesar MCP server in an OAuth-capable host. A missing credential returns 401 missing_api_key; a present-but-invalid key returns 401 invalid_api_key. See authentication.

Optional feedback

When your app can tell which result helped, send feedback with the search_id and doc_id you got back. Use result_helpful when a result answered the task and stale_result when content was outdated:

caesar-search feedback --event-type result_helpful --search-id "$SID" --doc-id "$DID"

Feedback is deliberately not exposed over MCP; use any other surface.

Install

CLI vs MCP

Decide which surface to give an agent first, and when to use both.

Install for agents

Copy-paste install blocks per environment, each with a verification step.
