Concepts · R1 docs

Phase machine

Every task moves through PLAN, EXECUTE, VERIFY, COMMIT. Each transition is gate-checked. The harness, not the model, decides when a phase is complete.

PLAN. The implementer drafts a tracked task list; scope-fit and budget gates run.
EXECUTE. Tools run against the plan; per-tool gates check secrets, syntax, and policy.
VERIFY. A different model family reviews the artifact without seeing the implementer's reasoning; tests, lint, and review gates run.
COMMIT. Only here does anything merge, publish, pay, or reply. The receipt is signed at this transition.
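The four phases above can be sketched as a small state machine in which the harness, not the caller, owns the transition. This is a minimal illustration, not R1's implementation: the `Harness` class, the boolean `Gate` signature, and the `within_budget` task field are all hypothetical names chosen for the example.

```python
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    PLAN = "PLAN"
    EXECUTE = "EXECUTE"
    VERIFY = "VERIFY"
    COMMIT = "COMMIT"


# Fixed phase order taken from the text above.
ORDER = [Phase.PLAN, Phase.EXECUTE, Phase.VERIFY, Phase.COMMIT]

# Hypothetical gate shape: inspect the task state, return True to let
# the current phase complete.
Gate = Callable[[dict], bool]


class Harness:
    """Owns the current phase; callers can only ask it to advance."""

    def __init__(self, gates: Dict[Phase, List[Gate]]):
        self.gates = gates
        self.phase = Phase.PLAN
        self.done = False

    def advance(self, task: dict) -> bool:
        """Leave the current phase only if every gate for it passes."""
        if self.done:
            return False
        if not all(gate(task) for gate in self.gates.get(self.phase, [])):
            return False  # the phase is not complete; stay where we are
        index = ORDER.index(self.phase)
        if index == len(ORDER) - 1:
            self.done = True  # COMMIT passed its gates; the task is finished
        else:
            self.phase = ORDER[index + 1]
        return True


# Usage: a budget gate on PLAN keeps the task in PLAN until it passes.
harness = Harness({Phase.PLAN: [lambda task: task.get("within_budget", False)]})
harness.advance({"within_budget": False})  # stays in PLAN
harness.advance({"within_budget": True})   # moves to EXECUTE
```

The point of the shape is that nothing outside `advance` can set the phase, which mirrors the rule that the model cannot declare its own phase complete.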

Memory tiers

Memory is tiered, not flat. Each tier has its own access pattern and its own retention. Two naming conventions are in play and we are converging them; until that ships, this page is the authoritative mapping.

UI label  storage tier (code)   description
L0        working                stable agent identity; loaded at session start; immutable in-run
L1        episodic               task-level decisions and constraints; persists across phases
L2        semantic               working set for the current phase; high churn
L3        procedural             long-running content-addressed cache; deduplicates across tasks

The L0..L3 labels are what you see in the harness UI, the receipt JSON, and the CLI flags. The working / episodic / semantic / procedural names in the middle column are the storage tier identifiers in internal/memory/tiers.go; those are what show up in source-level traces and database column values.

The MEMORY_READ and MEMORY_WRITE events in the receipt name the tier, so you can audit which decisions came from L1 and which were re-derived from L3.
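The label mapping and the audit described above can be sketched as follows. The mapping copies the table on this page; the receipt is assumed, for illustration only, to be a flat list of event dicts carrying the `type` and `tier` fields shown in the STOKE sample event. `reads_by_tier` is a hypothetical helper, not an R1 API.

```python
from collections import Counter

# UI label -> storage tier identifier, copied from the table on this page.
UI_TO_STORAGE = {
    "L0": "working",
    "L1": "episodic",
    "L2": "semantic",
    "L3": "procedural",
}
# Reverse lookup for translating source-level traces back to UI labels.
STORAGE_TO_UI = {name: label for label, name in UI_TO_STORAGE.items()}


def reads_by_tier(events):
    """Count MEMORY_READ events per tier label (hypothetical helper)."""
    return Counter(
        event["tier"] for event in events if event.get("type") == "MEMORY_READ"
    )


# Usage with a made-up receipt fragment.
receipt = [
    {"type": "MEMORY_READ", "tier": "L1"},
    {"type": "MEMORY_READ", "tier": "L3"},
    {"type": "MEMORY_WRITE", "tier": "L2"},
    {"type": "MEMORY_READ", "tier": "L1"},
]
counts = reads_by_tier(receipt)
```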

STOKE events

STOKE is the protocol for the trace. Every meaningful action is an event; events form a content-addressed graph; the graph is the receipt.

{
  "id": "e3f4a5...",
  "t": 0.314,
  "tier": "L2",
  "type": "TOOL_CALL",
  "phase": "EXECUTE",
  "label": "edit_file",
  "detail": "patch /src/auth.ts",
  "caused_by": ["b1c2d3..."]
}

The full event-type list and canonicalisation rules live in the STOKE spec.
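To make "content-addressed graph" concrete, here is a minimal sketch of deriving an event id from the event's content. It assumes sorted-key compact JSON as the canonical form and SHA-256 as the hash; both are assumptions made for illustration, and the actual canonicalisation rules are the ones in the STOKE spec. `event_id` and `seal` are hypothetical names.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Hash the event body, excluding any existing id field.

    ASSUMPTION: sorted-key compact JSON plus SHA-256 stands in for the
    real canonicalisation rules, which this page does not specify.
    """
    body = {key: value for key, value in event.items() if key != "id"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(event: dict) -> dict:
    """Return a copy of the event with its content-derived id attached."""
    sealed = {key: value for key, value in event.items() if key != "id"}
    sealed["id"] = event_id(sealed)
    return sealed


# Usage: the child names its parent by id in caused_by, so the parent's
# content is part of what the child's id commits to.
parent = seal({"t": 0.100, "tier": "L1", "type": "MEMORY_READ",
               "phase": "EXECUTE", "caused_by": []})
child = seal({"t": 0.314, "tier": "L2", "type": "TOOL_CALL",
              "phase": "EXECUTE", "label": "edit_file",
              "detail": "patch /src/auth.ts",
              "caused_by": [parent["id"]]})
```

Because each id covers `caused_by`, editing any ancestor changes every descendant id, which is the property that lets the graph serve as a tamper-evident receipt.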

Skill wizard

R1 skills are a typed, replayable intermediate representation (IR) rather than markdown prompt snippets. The wizard walks an operator through intent, inputs, outputs, capabilities, side effects, failure, tests, and determinism; every answer becomes a SKILL_AUTHORING_DECISION in the receipt.

The important property is structural: replay is deterministic by construction because the IR declares capability use, cache keys, and test vectors up front. The walkthrough and the operator-facing authoring flow live at /skill-wizard/.

Beacon sibling protocol

Beacon is a sibling protocol to STOKE. Beacon establishes cryptographic identity, remote pairing, capability-token scope, and advisory delivery. STOKE records the resulting action graph. A remote command and a local command therefore produce the same receipt shape; the difference is that the remote path includes a provenance chain back to operator, device, and token.

Read the control-plane overview at /beacon-protocol/ and the operator-safety layer at /security-layer/.

Daemon mode

Daemon mode turns R1 from a one-shot CLI into a long-running process with a persistent task queue, an append-only write-ahead log (WAL), a runtime-resizable worker pool, and an HTTP control plane. The harness loop stays the same; only the execution envelope changes. Crash recovery resumes from disk rather than from operator memory.
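The recovery idea can be illustrated with a toy append-only log: every state change is appended and flushed to disk, and after a crash the queue is rebuilt by replaying the file. This is a sketch under stated assumptions, not R1's WAL. The JSON-lines format, the `TaskWAL` class, and the `task_id` / `status` fields are hypothetical.

```python
import json
import os


class TaskWAL:
    """Toy append-only log of task state changes (hypothetical format)."""

    def __init__(self, path: str):
        self.path = path

    def append(self, record: dict) -> None:
        """Append one record and force it to disk before returning."""
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())

    def pending(self) -> list:
        """Replay the log and return ids of tasks not yet marked done."""
        if not os.path.exists(self.path):
            return []
        latest = {}
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                line = line.strip()
                if not line:
                    continue
                record = json.loads(line)
                # Later records win; earlier ones are never rewritten.
                latest[record["task_id"]] = record["status"]
        return [task for task, status in latest.items() if status != "done"]
```

A restarted process would call `pending()` on the same path and re-enqueue whatever it returns, which is what "resumes from disk" amounts to in this simplified model.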

The operational model and endpoints are summarized at /daemon-mode/. Daemon-mode completions still flow through the same gates, receipts, and delivery-ratio checks as interactive sessions.

Gates

Gates are first-class. A gate observes a phase transition and returns one of three verdicts.

allow. The transition proceeds.
block. The transition halts; the harness reports the reason and the task either retries or fails.
escalate. The transition halts pending operator approval; the verdict and the operator's response both land in the receipt.

Built-in gates ship for syntax, tests, lint, secrets, and policy. Custom gates register as small WASM modules; the gate-decision contract is the same.
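The three-verdict contract can be sketched as a small type plus a rule for combining several gates on one transition. Everything here is illustrative: the `Decision` shape, the toy `secrets_gate` check, and in particular the precedence in `resolve` (block over escalate over allow) are assumptions, since this page does not say how multiple verdicts are combined.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Decision:
    gate: str
    verdict: Verdict
    reason: str = ""


def secrets_gate(transition: dict) -> Decision:
    """Toy stand-in for a secrets gate: flag an obvious private-key header."""
    if "BEGIN PRIVATE KEY" in transition.get("diff", ""):
        return Decision("secrets", Verdict.BLOCK, "private key material in diff")
    return Decision("secrets", Verdict.ALLOW)


def resolve(decisions: Iterable[Decision]) -> Verdict:
    """Combine gate decisions into one verdict.

    ASSUMPTION: any block wins, then any escalate, otherwise allow.
    """
    verdicts = {decision.verdict for decision in decisions}
    if Verdict.BLOCK in verdicts:
        return Verdict.BLOCK
    if Verdict.ESCALATE in verdicts:
        return Verdict.ESCALATE
    return Verdict.ALLOW
```

Keeping the reason on the decision rather than in a log line is what lets a block or an escalation be reported and recorded alongside the verdict.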

Cross-family review

By default, the implementer and the reviewer cannot share a model family. A reviewer that watched the implementer think will rationalise the implementer's mistakes; a fresh reviewer will not. The default can be overridden, and any override is recorded.

Family registry today: Anthropic (Claude family), OpenAI (GPT family), Google (Gemini family), Meta (Llama family), Mistral, and any model behind the OpenAI-compatible adapter (with the family declared at registration time).
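The default rule and its override can be sketched as a single check. The `Model` type, the lowercase family strings, the model names, and the returned record fields are all hypothetical; the sketch only shows the shape of "refuse same-family pairs unless overridden, and record the override".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    family: str  # for adapter-backed models, the family declared at registration


def check_review_pair(implementer: Model, reviewer: Model,
                      override: bool = False) -> dict:
    """Reject same-family pairs by default; return a record of the pairing."""
    same_family = implementer.family == reviewer.family
    if same_family and not override:
        raise ValueError(
            f"reviewer {reviewer.name!r} shares family "
            f"{reviewer.family!r} with the implementer"
        )
    return {
        "implementer_family": implementer.family,
        "reviewer_family": reviewer.family,
        # True only when the default rule was actually bypassed.
        "override_used": same_family and override,
    }


# Usage with made-up model names.
implementer = Model("impl-model", "anthropic")
reviewer = Model("review-model", "openai")
record = check_review_pair(implementer, reviewer)
```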
