Phase machine
Every task moves through four phases in order: PLAN, EXECUTE, VERIFY, and COMMIT. Each transition is gate-checked. The harness, not the model, decides when a phase is complete.
- PLAN. The implementer drafts a tracked task list; scope-fit and budget gates run.
- EXECUTE. Tools run against the plan; per-tool gates check secrets, syntax, and policy.
- VERIFY. A different model family reviews the artifact without seeing the implementer's reasoning; tests, lint, and review gates run.
- COMMIT. Only here does anything merge, publish, pay, or reply. The receipt is signed at this transition.
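The four-phase loop above can be sketched as a small state machine. This is an illustrative Python sketch, not the harness's code: the `GateCheck` signature, `next_phase`, and `advance` are hypothetical names, and a real gate returns a richer verdict than a boolean.

```python
from enum import Enum
from typing import Callable, Optional


class Phase(Enum):
    PLAN = "PLAN"
    EXECUTE = "EXECUTE"
    VERIFY = "VERIFY"
    COMMIT = "COMMIT"


# Fixed order taken from the phase list.
ORDER = [Phase.PLAN, Phase.EXECUTE, Phase.VERIFY, Phase.COMMIT]

# Hypothetical gate signature: (from_phase, to_phase) -> True if the
# transition may proceed.
GateCheck = Callable[[Phase, Phase], bool]


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase after `current`, or None once COMMIT is reached."""
    i = ORDER.index(current)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None


def advance(current: Phase, gates: list[GateCheck]) -> Phase:
    """Move to the next phase only if every gate passes; otherwise stay put."""
    target = next_phase(current)
    if target is None:
        return current  # COMMIT is treated as terminal in this sketch
    if all(gate(current, target) for gate in gates):
        return target
    return current
```

The caller (standing in for the harness) owns `advance`; nothing in the sketch lets the code being gated move the phase forward on its own.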
Memory tiers
Memory is tiered, not flat. Each tier has its own access pattern and its own retention. Two naming conventions are in play and we are converging them; until that ships, this page is the authoritative mapping.
| UI label | storage tier (code) | description |
| --- | --- | --- |
| L0 | working | stable agent identity; loaded at session start; immutable in-run |
| L1 | episodic | task-level decisions and constraints; persists across phases |
| L2 | semantic | working set for the current phase; high churn |
| L3 | procedural | long-running content-addressed cache; deduplicates across tasks |
The L0..L3 labels are what you see in the harness UI, the receipt JSON, and the CLI flags. The working / episodic / semantic / procedural names in the second column are the storage tier identifiers in internal/memory/tiers.go; those are what show up in source-level traces and database column values.
The MEMORY_READ and MEMORY_WRITE events in the receipt name the tier, so you can audit which decisions came from L1 and which were rederived from L3.
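Until the two naming conventions converge, an audit script may have to accept both. The following is a minimal Python sketch of that normalisation and of a per-tier read count. The label mapping is copied from this page; the function names are hypothetical, and the event fields `type` and `tier` are assumed to match the STOKE event example on this page.

```python
from collections import Counter

# Mapping copied from this page: UI label -> storage tier identifier.
UI_TO_STORAGE = {
    "L0": "working",
    "L1": "episodic",
    "L2": "semantic",
    "L3": "procedural",
}
STORAGE_TO_UI = {name: label for label, name in UI_TO_STORAGE.items()}


def to_ui_label(tier: str) -> str:
    """Normalise either naming convention to the L0..L3 UI label."""
    if tier in UI_TO_STORAGE:
        return tier
    return STORAGE_TO_UI[tier]  # raises KeyError for an unknown tier


def reads_by_tier(events: list[dict]) -> Counter:
    """Count MEMORY_READ events per UI label (assumed fields: type, tier)."""
    return Counter(
        to_ui_label(e["tier"]) for e in events if e["type"] == "MEMORY_READ"
    )
```

Comparing the counts for L1 and L3 gives a rough first answer to the audit question of how much of a task was remembered and how much was rederived.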
STOKE events
STOKE is the protocol for the trace. Every meaningful action is an event; events form a content-addressed graph; the graph is the receipt.
{
  "id": "e3f4a5...",
  "t": 0.314,
  "tier": "L2",
  "type": "TOOL_CALL",
  "phase": "EXECUTE",
  "label": "edit_file",
  "detail": "patch /src/auth.ts",
  "caused_by": ["b1c2d3..."]
}
The full event-type list and canonicalisation rules live in the STOKE spec.
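To make "content-addressed graph" concrete, here is a minimal Python sketch of deriving an event id from its content and walking `caused_by` edges. The canonical form used here (sorted keys, compact JSON, SHA-256 over every field except `id`) is an assumption for illustration only; the actual canonicalisation rules are the ones in the STOKE spec, and `event_id` and `ancestors` are hypothetical names.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Hash every field except "id" (assumed canonical form, see lead-in)."""
    body = {k: v for k, v in event.items() if k != "id"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ancestors(events_by_id: dict[str, dict], start: str) -> set[str]:
    """Collect every event id reachable from `start` via caused_by edges."""
    seen: set[str] = set()
    stack = list(events_by_id[start].get("caused_by", []))
    while stack:
        eid = stack.pop()
        if eid in seen:
            continue
        seen.add(eid)
        # Ids that are referenced but not present are recorded and not expanded.
        stack.extend(events_by_id.get(eid, {}).get("caused_by", []))
    return seen
```

Because a child's hash covers its `caused_by` list, changing any ancestor changes every descendant id, which is what lets a graph of this kind serve as a tamper-evident record.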
Skill wizard
R1 skills are a typed, replayable intermediate representation (IR) rather than markdown prompt snippets. The wizard walks an operator through intent, inputs, outputs, capabilities, side effects, failure, tests, and determinism; every answer becomes a SKILL_AUTHORING_DECISION in the receipt.
The important property is structural: replay is deterministic by construction because the IR declares capability use, cache keys, and test vectors up front. The walkthrough and the operator-facing authoring flow live at /skill-wizard/.
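One way to picture "declared up front" is a frozen record whose cache key is derived only from declared fields and whose test vectors can be replayed. The Python sketch below is an assumption-laden illustration: `SkillIR`, its field names, and `vectors_pass` are hypothetical and are not the R1 schema.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class SkillIR:
    """Hypothetical skill record; field names are illustrative only."""
    intent: str
    capabilities: tuple[str, ...]        # capabilities the skill may use
    cache_key_fields: tuple[str, ...]    # inputs that determine the result
    test_vectors: tuple[tuple[dict, Any], ...]  # (args, expected output)

    def cache_key(self, args: dict) -> str:
        """Derive the key from declared fields only, ignoring other inputs."""
        picked = {name: args[name] for name in self.cache_key_fields}
        blob = json.dumps(picked, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def vectors_pass(skill: SkillIR, run: Callable[[dict], Any]) -> bool:
    """Replay each declared test vector and compare with the recorded output."""
    return all(run(args) == expected for args, expected in skill.test_vectors)
```

In this sketch an undeclared input cannot change the cache key, which is the structural property the paragraph above describes.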
Beacon sibling protocol
Beacon is a sibling protocol to STOKE. Beacon establishes cryptographic identity, remote pairing, capability-token scope, and advisory delivery. STOKE records the resulting action graph. A remote command and a local command therefore produce the same receipt shape; the difference is that the remote path includes a provenance chain back to operator, device, and token.
Read the control-plane overview at /beacon-protocol/ and the operator-safety layer at /security-layer/.
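The "same receipt shape, plus provenance" idea can be sketched as follows. This is a hypothetical Python illustration: `build_receipt` and the keys `events` and `provenance` are made-up names, and only the three provenance links (operator, device, token) come from the description above.

```python
from typing import Optional

# The three links named in the description of the remote path.
REQUIRED_PROVENANCE = {"operator", "device", "token"}


def build_receipt(events: list[dict], provenance: Optional[dict] = None) -> dict:
    """Build one receipt shape; only the remote path carries provenance."""
    receipt: dict = {"events": events}
    if provenance is not None:
        missing = REQUIRED_PROVENANCE - provenance.keys()
        if missing:
            raise ValueError(f"provenance missing: {sorted(missing)}")
        receipt["provenance"] = dict(provenance)
    return receipt
```

A consumer that only reads `events` handles local and remote receipts identically; one that cares about who issued a command checks for `provenance`.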
Daemon mode
Daemon mode turns R1 from a one-shot CLI into a long-running process with a persistent task queue, an append-only WAL, a runtime-resizable worker pool, and an HTTP control plane. The harness loop stays the same; only the execution envelope changes. Crash recovery resumes from disk rather than from operator memory.
The operational model and endpoints are summarized at /daemon-mode/. Daemon-mode completions still flow through the same gates, receipts, and delivery-ratio checks as interactive sessions.
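Recovery from an append-only WAL generally means replaying the log and rebuilding the queue. The Python sketch below shows that general technique under stated assumptions: one JSON record per line, with hypothetical `op` values `enqueue` and `done`. It does not describe R1's on-disk format.

```python
import json
import os


def append(wal_path: str, record: dict) -> None:
    """Append one JSON record per line and sync it to disk before returning."""
    with open(wal_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def recover_pending(wal_path: str) -> list[str]:
    """Replay the log; return task ids that were enqueued but never finished."""
    if not os.path.exists(wal_path):
        return []
    pending: dict[str, None] = {}  # a dict preserves enqueue order
    with open(wal_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                break  # torn final write from a crash; ignore the tail
            if rec["op"] == "enqueue":
                pending[rec["task"]] = None
            elif rec["op"] == "done":
                pending.pop(rec["task"], None)
    return list(pending)
```

Stopping at the first unparsable line is a deliberate simplification: in an append-only log, a partial record can only be the last write before a crash.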
Gates
Gates are first-class. A gate observes a phase transition and returns one of three verdicts.
- allow. The transition proceeds.
- block. The transition halts; the harness reports the reason and the task either retries or fails.
- escalate. The transition halts pending operator approval; the verdict and the operator's response both land in the receipt.
Built-in gates ship for syntax, tests, lint, secrets, and policy. Custom gates register as small WASM modules; the gate-decision contract is the same.
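The three-verdict contract can be sketched in a few lines of Python. The verdict names come from the list above; everything else is an assumption for illustration, including the precedence used when several gates observe one transition (block over escalate over allow) and the toy `secrets_gate`, which is not the built-in secrets gate.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"


def combine(verdicts: list[Verdict]) -> Verdict:
    """Assumed precedence: any block wins, then any escalate, else allow."""
    if Verdict.BLOCK in verdicts:
        return Verdict.BLOCK
    if Verdict.ESCALATE in verdicts:
        return Verdict.ESCALATE
    return Verdict.ALLOW


def secrets_gate(patch: str) -> Verdict:
    """Toy gate: block any patch that contains a PEM private-key header."""
    return Verdict.BLOCK if "PRIVATE KEY-----" in patch else Verdict.ALLOW
```

A custom gate would return one of the same three values, which is why built-in and custom gates can share a single decision contract.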
Cross-family review
By default, the implementer and the reviewer cannot share a model family. A reviewer that watched the implementer think will rationalise the implementer's mistakes; a fresh reviewer will not. Overriding this default is allowed, and the override is recorded.
Family registry today: Anthropic (Claude family), OpenAI (GPT family), Google (Gemini family), Meta (Llama family), Mistral, and any model behind the OpenAI-compatible adapter (declared family at registration time).
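The default rule and its override can be expressed as a small check. This Python sketch is hypothetical: `ReviewDecision` and `check_reviewer` are illustrative names, family strings are compared case-insensitively as a simplification, and how the harness actually records an override is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewDecision:
    allowed: bool
    overridden: bool  # True when a same-family pairing ran under an override


def check_reviewer(
    implementer_family: str, reviewer_family: str, override: bool = False
) -> ReviewDecision:
    """Reject same-family review unless the operator explicitly overrides."""
    same = implementer_family.strip().lower() == reviewer_family.strip().lower()
    if not same:
        return ReviewDecision(allowed=True, overridden=False)
    # Same family: allowed only under override, and the flag is kept so the
    # caller can record it.
    return ReviewDecision(allowed=override, overridden=override)
```

Returning the `overridden` flag instead of a bare boolean keeps the exception visible to whatever writes the audit trail.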