Agents that ship. Then prove they shipped.
R1 is an open-source agent framework that plans, runs, checks, and records software work before it merges. It can conduct Claude Code and Codex as sub-agents, but the harness keeps the proof trail and the gate checks.
For solo builders, platform teams, and regulated industries that need proof, not just output.
Three things you actually get.
Not a chat window. Not a black box. Three concrete capabilities shipped in one open-source runtime.
Three kinds of teams pick R1 for three different reasons.
You are the whole engineering team. You want an agent that can pick up a feature, ship it end to end, and not require a 30-minute review of every line. R1 gives you the harness so you only have to review what the harness flagged.
For solo builders →
You run agents for a company of engineers. You need policy, audit, cost caps, and the ability to roll back when something shipped that should not have. R1 gives you the harness boundary and composes with governance and verification systems around it.
For platform teams →
You cannot run agents without proof of where the data went, who approved which action, and how to keep the runtime inside your control boundary. R1 self-hosts cleanly and keeps the approval trail attached to the work.
For regulated industries →
PLAN. EXECUTE. VERIFY. COMMIT.
Four steps, one harness. Skip any of them and the work does not merge.
Here is what shipping with R1 actually looks like.
Not a glossy promise. The cadence of one engineer’s day with an agent that does not ship work without a paper trail.
The difference between agentic and driven.
On the left, the model decides when it is done. On the right, R1 checks. Watch the same task ship two different outcomes.
Every thought. Every memory. Every skill use. Tracked.
R1's substrate is a content-addressed graph. This is one session rendered as a 3D grid. Phases on one axis, memory tiers on another, event types on the third. Scrub through the session. Click a node. See exactly what the agent was thinking, what it read, what skills it invoked, what tools it called.
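A content-addressed graph can be pictured with a small sketch. This is not R1's actual schema: the `node_id` function, the `SessionGraph` class, and the payload fields are hypothetical names chosen for illustration. The only idea shown is that a node's identifier is a hash of its content and its parents, so identical events get identical ids and any edit changes the id.

```python
import hashlib
import json


def node_id(payload: dict, parents: list) -> str:
    """Derive an id from the node's content (hypothetical scheme, SHA-256 over canonical JSON)."""
    canonical = json.dumps(
        {"payload": payload, "parents": sorted(parents)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SessionGraph:
    """Illustrative in-memory store keyed by content hash."""

    def __init__(self):
        self.nodes = {}

    def add(self, payload: dict, parents=()) -> str:
        # A node may only point at nodes that already exist in the graph.
        for parent in parents:
            if parent not in self.nodes:
                raise KeyError(f"unknown parent {parent}")
        nid = node_id(payload, list(parents))
        # Adding the same content twice is a no-op: the id is the content.
        self.nodes.setdefault(nid, {"payload": payload, "parents": list(parents)})
        return nid


graph = SessionGraph()
plan = graph.add({"phase": "plan", "event": "thought", "text": "split the task"})
read = graph.add({"phase": "execute", "event": "tool_call", "tool": "read_file"}, [plan])
```

Because the id depends on the parents as well as the payload, changing an early node would change the id of everything that descends from it.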
The state machine refuses to skip.
Seven gates from intake to commit. Plan, execute, verify, review, remember, skill, commit. Click a state to see what R1 requires, what's traced, and what's written to memory. Try to find a way to skip. You can't.
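A state machine that refuses to skip can be sketched in a few lines. The gate names come from the list above; the `GateMachine` class, the `GateSkipped` exception, and the idea of passing an evidence string are assumptions made for this example, not R1's real interface.

```python
GATES = ["plan", "execute", "verify", "review", "remember", "skill", "commit"]


class GateSkipped(Exception):
    """Raised when a caller tries to pass a gate out of order or without evidence."""


class GateMachine:
    """Hypothetical ordered gate checker: each gate must be passed in sequence."""

    def __init__(self):
        self._index = 0
        self.trace = []  # (gate, evidence) pairs, in the order they were passed

    @property
    def next_gate(self):
        return GATES[self._index] if self._index < len(GATES) else None

    def advance(self, gate: str, evidence: str) -> None:
        expected = self.next_gate
        if expected is None:
            raise GateSkipped("run is already committed")
        if gate != expected:
            raise GateSkipped(f"expected {expected!r}, got {gate!r}")
        if not evidence:
            raise GateSkipped(f"gate {gate!r} needs recorded evidence")
        self.trace.append((gate, evidence))
        self._index += 1


machine = GateMachine()
machine.advance("plan", "plan document written")
try:
    machine.advance("commit", "trust me")  # jumping ahead is refused
except GateSkipped as err:
    print(f"refused: {err}")
```

The design choice worth noting is that the machine holds the only pointer to the current gate, so callers cannot declare themselves done.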
Your subscriptions, your agents. R1 conducts.
Got Claude Code. Got Codex. Got both. R1 uses each as a sub-agent, picks the right one for the step, and cross-checks the result before anything merges.
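The conductor idea, one sub-agent authors and a different one checks, can be sketched as follows. Everything here is a stand-in: `conduct`, the `workers` and `reviewers` dictionaries, and the stub lambdas are hypothetical, and no real Claude Code or Codex API is called.

```python
from typing import Callable, Dict, Tuple

Worker = Callable[[str], str]          # task -> proposed result
Reviewer = Callable[[str, str], bool]  # (task, result) -> approve?


def conduct(
    task: str,
    author: str,
    workers: Dict[str, Worker],
    reviewers: Dict[str, Reviewer],
) -> Tuple[str, bool]:
    """Run the task with one agent, then require approval from every other agent."""
    result = workers[author](task)
    others = [name for name in reviewers if name != author]
    if not others:
        raise RuntimeError("cross-checking needs at least one agent besides the author")
    approved = all(reviewers[name](task, result) for name in others)
    return result, approved


# Stub agents standing in for real sub-agents (assumption: they are plain callables).
workers = {"agent_a": lambda task: f"patch for {task}"}
reviewers = {
    "agent_a": lambda task, result: True,
    "agent_b": lambda task, result: task in result,
}
result, approved = conduct("fix login bug", "agent_a", workers, reviewers)
```

The author's own reviewer is deliberately excluded, so approval always comes from a different agent.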
Run eight tasks at once. Merge safely.
R1's scheduler picks which tasks run in parallel. File-scope conflicts prevented. One mutex serializes all merges to main. No corruption.
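One way to picture the two rules above, no overlapping file scopes in a parallel batch and a single lock around merges, is the sketch below. The function names, the greedy selection strategy, and the in-memory `merged` list are assumptions for illustration; R1's real scheduler may work differently.

```python
import threading


def pick_parallel_batch(tasks: dict, max_workers: int = 8) -> list:
    """Greedily pick tasks whose declared file scopes do not overlap."""
    claimed = set()
    batch = []
    for name, files in tasks.items():
        if len(batch) == max_workers:
            break
        if claimed.isdisjoint(files):
            batch.append(name)
            claimed |= set(files)
    return batch


MERGE_LOCK = threading.Lock()  # one mutex for every merge to main
merged = []


def merge_to_main(task: str) -> None:
    """Only one thread at a time may append to the (simulated) main branch."""
    with MERGE_LOCK:
        merged.append(task)


tasks = {
    "add-login": {"auth.py", "routes.py"},
    "fix-routes": {"routes.py"},       # conflicts with add-login, so it waits
    "update-docs": {"README.md"},
}
batch = pick_parallel_batch(tasks)
threads = [threading.Thread(target=merge_to_main, args=(name,)) for name in batch]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Here `fix-routes` is held back because it touches `routes.py`, which `add-login` already claimed; it would run in a later batch.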
Set your bar. R1 holds it.
Drag the thresholds. See what passes. Your team's bar is configurable; R1 enforces whatever you configure, exactly.
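A configurable bar could look something like this. The `Bar` fields, their default values, and the `gate` function are invented for this sketch; the page does not say which metrics R1 exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    """Hypothetical team thresholds; the defaults are illustrative only."""
    min_test_pass_rate: float = 1.0
    min_coverage: float = 0.80
    max_open_findings: int = 0


def gate(bar: Bar, pass_rate: float, coverage: float, open_findings: int) -> list:
    """Return the reasons a run fails the bar; an empty list means it passes."""
    failures = []
    if pass_rate < bar.min_test_pass_rate:
        failures.append(f"test pass rate {pass_rate:.2f} < {bar.min_test_pass_rate:.2f}")
    if coverage < bar.min_coverage:
        failures.append(f"coverage {coverage:.2f} < {bar.min_coverage:.2f}")
    if open_findings > bar.max_open_findings:
        failures.append(f"{open_findings} open findings > {bar.max_open_findings}")
    return failures


strict = Bar()
relaxed = Bar(min_coverage=0.60, max_open_findings=2)
run = dict(pass_rate=1.0, coverage=0.72, open_findings=1)
strict_failures = gate(strict, **run)    # fails on coverage and findings
relaxed_failures = gate(relaxed, **run)  # passes
```

The same run passes or fails depending only on the configured bar, which is the behaviour the paragraph above describes.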
Make R1 the harness and the loop.
Daemon Mode turns R1 into a long-running process with a persistent task queue, append-only ndjson WAL, runtime-resizable worker pool, and HTTP control plane. The crash boundary moves from shell glue into a resumable loop with state on disk.
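The append-only ndjson WAL idea can be illustrated with a short sketch: every queue change is written as one JSON line, and after a crash the queue is rebuilt by replaying the file. The event shape (`type`, `id`, `task`) and the function names are assumptions, not R1's on-disk format.

```python
import json
import os


def append_event(path: str, event: dict) -> None:
    """Append one JSON object per line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())


def replay(path: str) -> dict:
    """Rebuild the pending-task map from the log after a restart."""
    pending = {}
    if not os.path.exists(path):
        return pending
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # A half-written final line from a crash: stop at the last good record.
                break
            if event["type"] == "enqueued":
                pending[event["id"]] = event["task"]
            elif event["type"] == "done":
                pending.pop(event["id"], None)
    return pending
```

A daemon built this way would call `replay` once at startup and `append_event` before acknowledging any queue change, so the state on disk is never behind what callers were told.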
Refuse weak claims at the boundary.
Truth Engine adds five anti-deception guards for the places humans miss: unsupported completion claims, path-marker corruption, destructive post-merge tree drops, shallow audit posture, and suspiciously low delivery ratios without a written explanation.
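Two of the five guards, unsupported completion claims and low delivery ratios without a written explanation, lend themselves to a small sketch. The `Claim` fields, the `refusals` function, and the 0.5 ratio threshold are all assumptions made for this example; the page does not state how the Truth Engine measures either one.

```python
from dataclasses import dataclass, field

MIN_DELIVERY_RATIO = 0.5  # assumed threshold, not a documented R1 value


@dataclass
class Claim:
    """Hypothetical completion claim submitted at the boundary."""
    tasks_planned: int
    tasks_delivered: int
    evidence: list = field(default_factory=list)  # e.g. test logs, diff hashes
    explanation: str = ""


def refusals(claim: Claim) -> list:
    """Return the reasons this claim would be refused; empty means accepted."""
    problems = []
    if not claim.evidence:
        problems.append("unsupported completion claim")
    # Treat an empty plan as fully delivered to avoid dividing by zero.
    ratio = claim.tasks_delivered / claim.tasks_planned if claim.tasks_planned else 1.0
    if ratio < MIN_DELIVERY_RATIO and not claim.explanation.strip():
        problems.append("low delivery ratio without a written explanation")
    return problems


weak = Claim(tasks_planned=10, tasks_delivered=3)
explained = Claim(10, 3, evidence=["test-log"], explanation="7 tasks blocked on API access")
```

In this sketch `weak` is refused on both counts, while `explained` is accepted because it carries evidence and a written reason for the shortfall.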
Turn intent into a verified skill. In eight stages.
Skills are code, not prompts. R1's wizard walks an operator through Intent, Inputs, Outputs, Capabilities, Side effects, Failure, Tests, and Determinism, then writes a typed, content-addressed IR that replays deterministically and gets smarter with every authored skill.
Control any agent. From any device. Verifiably yours.
Beacon Protocol is end-to-end encrypted remote control with cryptographic identity. Beacons are claimed through SAS-verified pairing. Every action traces back to a specific operator, device, token, and permission row. The Hub relays bytes. It cannot decrypt the session.
Your agent checks with you when the risk changes.
Named situations pause the run and ask for consent, from destructive file moves to configuration changes your team owns. You decide. The agent does not continue without a recorded answer.
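The consent rule, no recorded answer means no progress, can be sketched like this. The situation names, the `ConsentLog` class, and `may_proceed` are hypothetical; they only illustrate the pause-until-answered behaviour described above.

```python
from dataclasses import dataclass, field

# Assumed examples of "named situations"; a real list would be team-defined.
NAMED_SITUATIONS = {"destructive_file_move", "team_owned_config_change"}


@dataclass
class ConsentLog:
    """Illustrative record of operator answers, keyed by action id."""
    answers: dict = field(default_factory=dict)

    def record(self, action_id: str, operator: str, approved: bool) -> None:
        self.answers[action_id] = {"operator": operator, "approved": approved}


def may_proceed(action_id: str, situation: str, log: ConsentLog) -> bool:
    """Ordinary actions continue; named situations need a recorded approval."""
    if situation not in NAMED_SITUATIONS:
        return True
    answer = log.answers.get(action_id)
    return answer is not None and answer["approved"]


log = ConsentLog()
before = may_proceed("act-1", "destructive_file_move", log)  # paused: no answer yet
log.record("act-1", operator="dana", approved=True)
after = may_proceed("act-1", "destructive_file_move", log)   # allowed after consent
```

A recorded refusal blocks the action just as a missing answer does, and the log keeps who answered either way.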
How R1 compares to running an agent without a harness.
| | R1 | Bare Claude Code | Bare Codex | Hand-built scripts |
|---|---|---|---|---|
| Plans before executing | Yes · explicit plan | Sometimes · implicit | Sometimes · implicit | No |
| Verifies its own work | Yes · cross-family review | Some self-review | Some self-review | No |
| Replayable runs | Yes · receipt trail | No | No | Only if you build it |
| Cross-model fallback | Yes · conductor pattern | No | No | Only if you build it |
| Self-host path | Yes · Apache 2.0 core | No | No | Yes |
| Audit-grade receipts | Yes · signed proof chain | No | No | Only if you build it |
| Cost model | Free self-host · hosted optional | Subscription | Subscription | Engineer time |
| Net effect | Ships with proof | Ships with trust | Ships with trust | You rebuild the harness yourself |
R1 is the runtime. Adjacent systems compose around it.
R1 stands alone as a native agent framework. It also plugs into eight specialized systems that govern traffic, remote control, ground facts, verification, payments, and long runs. Tap any satellite to see how it hooks in.
Launch R1 on CloudSwarm now.
Spin up a durable R1 agent in your browser. Your subscriptions, your repo, their sandbox. Multi-day builds that survive laptop closes.
Free for self-host. Hosted is optional. Enterprise is there when you need it.
You can run R1 yourself at no license cost, pay only when you want a managed runtime, and send procurement-heavy deployments to the enterprise path. See pricing.
Apache 2.0 core. Bring your own compute and model accounts. Full harness, full receipts, no license meter.
See self-host →
Managed runtime, upgrades, backups, team controls, and receipt retention when you want operations handled for you.
See hosted →
SSO, export controls, long retention, and sovereign deployment options for larger teams and regulated buyers.
See enterprise →
Stop watching your agents. Watch what they ship.
Start free, run the demos, or go straight to the docs if you already know the shape of the work you want R1 to take over.