Phase 24 · v0.24.0-rc1 · Release Candidate
Closure Hardening + Release-Candidate Conditioning

CodeWorld Observatory

A simulation-first control plane for agentic software engineering. Software agents should imagine code futures before acting.

Treating software repositories as dynamic causal worlds — not static text — enables counterfactual planning, branch evaluation, and prediction-before-write workflows that autoregressive approaches cannot achieve.

Observatory Thesis

Software is a world. Agents must learn to navigate it.

Autoregressive write-first is insufficient

Systems that generate code by predicting the next token cannot reason about causal consequences. They write first and discover problems later.

Repositories are observable worlds

A codebase is a deterministic, stateful world with defined laws (types, tests, contracts). It can be modeled, simulated, and reasoned over before any action.

Agentic IDEs are the substrate

Modern agentic environments provide state capture, tool access, artifact production, and execution verification - perfect conditions for a simulation control plane.

MCP bridges prediction to action

Model Context Protocol will serve as the bridge between the observatory's predictive intelligence and the tools that execute real interventions.

Live Observatory

Research control surface

Real-time visibility into world state, planned interventions, comparative timelines, experiment-centered lineage, replay-aware research consumers, evaluation persistence, advisory-only research prioritization, governed priority drift traceability, historically durable priority history, automated snapshot comparison surfaces, unified comparative governance synthesis, and closure-hardened release-candidate conditioning. Phase 24 active.

Current World State (WS): Pending
Capturing repo snapshot: scanning workspace, reading file metadata, analyzing imports.

Candidate Interventions (INT): Pending
Generating branch plans: reading repo signals, analyzing scope, building candidate branches.

Counterfactual Futures (CF): Pending
Generating consequence projections: predicting branch outcomes.

Uncertainty Surface (USF): Pending
Checking structural limits: analyzing prediction confidence.

Prediction vs Reality (PVR): Pending
Retrieving execution records: loading calibration data.

Artifact Ledger (ART): Pending
Retrieving ledger entries: reading immutable records.

Benchmark Harness [Sim-Sessions]
Initializing simulation session.

Phase 24 Active

The Observatory has completed closure hardening and release-candidate conditioning: the canonical governance vocabulary is unified across all services, inline styles are eliminated from all panels in favor of Tailwind semantic tokens, the institutional color scheme is unified from fragmented info/warning colors to a consistent accent, and shared UI primitives are established for cross-panel visual coherence.

Operating Principles

Four laws of the Observatory

These are not guidelines. They are invariants. Any system claiming to implement CodeWorld Observatory must satisfy all four.

01

Simulation Before Write

No agentic write operation executes without a prior simulation pass. Imagining consequences is not optional — it is the prerequisite to acting.
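
One way to read this law is as a gate in front of every write: the write call refuses to run unless a simulation of the same change is already on record. The sketch below is only an illustration under that assumption. `WriteGate`, `simulate`, `apply`, and `SimulationRequired` are hypothetical names, not part of any Observatory API, and the "simulation" is a stand-in that merely reports what would change.

```python
from dataclasses import dataclass, field


class SimulationRequired(Exception):
    """Raised when a write is attempted without a prior simulation pass."""


@dataclass
class WriteGate:
    # Hypothetical gate: remembers which (path, content) pairs were simulated.
    simulated: set = field(default_factory=set)
    files: dict = field(default_factory=dict)

    def simulate(self, path: str, content: str) -> dict:
        # Stand-in simulation: describe the effect without touching `files`.
        before = self.files.get(path)
        self.simulated.add((path, content))
        return {
            "path": path,
            "creates_file": before is None,
            "changes_content": before != content,
        }

    def apply(self, path: str, content: str) -> None:
        # Enforce the law: no write without a matching simulation on record.
        if (path, content) not in self.simulated:
            raise SimulationRequired(f"no simulation recorded for {path}")
        self.files[path] = content


gate = WriteGate()

# A write attempted first is rejected.
try:
    gate.apply("app.py", "print('hi')")
    blocked = False
except SimulationRequired:
    blocked = True

# After a simulation pass for the same change, the write goes through.
report = gate.simulate("app.py", "print('hi')")
gate.apply("app.py", "print('hi')")
```

Keying the record on both path and content means a simulated change cannot be swapped for a different one at write time; a real system would presumably key on something richer, such as a hash of the full intervention.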

02

Branch Before Intervention

Every proposed change is evaluated as a counterfactual branch in the world model. The main timeline is never the first casualty of exploration.
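
A minimal sketch of branch evaluation, assuming a toy world model in which a repository is just a mapping from module names to the modules they import. Each candidate intervention runs against a private deep copy, so the main timeline is never mutated. The names `World`, `broken_imports`, and `evaluate_branches`, and the scoring rule itself, are illustrative assumptions rather than the Observatory's actual model.

```python
import copy
from typing import Callable, Dict, List

World = Dict[str, List[str]]  # module name -> names of modules it imports


def broken_imports(world: World) -> int:
    # Toy "law" of the world: every import must resolve to an existing module.
    return sum(1 for deps in world.values() for dep in deps if dep not in world)


def evaluate_branches(
    main: World, candidates: Dict[str, Callable[[World], None]]
) -> Dict[str, int]:
    results = {}
    for name, intervention in candidates.items():
        branch = copy.deepcopy(main)  # counterfactual branch; main is untouched
        intervention(branch)
        results[name] = broken_imports(branch)
    return results


def delete_utils(world: World) -> None:
    del world["utils"]


def rename_utils(world: World) -> None:
    world["helpers"] = world.pop("utils")
    world["app"] = ["helpers"]


main = {"app": ["utils"], "utils": []}
results = evaluate_branches(
    main, {"delete_utils": delete_utils, "rename_utils": rename_utils}
)
# results == {"delete_utils": 1, "rename_utils": 0}; main is unchanged
```

Here the deletion branch leaves one unresolved import while the rename branch leaves none, and both verdicts are available before anything is applied to `main`.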

03

Visible Uncertainty

Uncertainty is first-class data. Where the model cannot predict with confidence, the system makes that ambiguity explicit and visible to the operator.
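
As a rough illustration, a prediction can carry its confidence as data and be rendered with an explicit marker whenever that confidence falls below a cutoff. The `Prediction` type, the `render` function, and the 0.7 threshold are assumptions made for this sketch; how confidence is actually estimated is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    claim: str
    confidence: float  # expected to lie in [0, 1]


def render(pred: Prediction, threshold: float = 0.7) -> str:
    # The threshold is an illustrative default, not a documented value.
    if not 0.0 <= pred.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    tag = "CONFIDENT" if pred.confidence >= threshold else "UNCERTAIN"
    # The confidence is always shown, never dropped from the output.
    return f"[{tag} {pred.confidence:.2f}] {pred.claim}"


line = render(Prediction("tests pass after rename", 0.55))
# line == "[UNCERTAIN 0.55] tests pass after rename"
```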

04

Auditable Artifacts

Every plan, simulation, and outcome is recorded as an immutable, hashed artifact. The system can always trace the chain from intention to consequence.
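
One common way to get hashed, traceable records is a hash chain: each artifact stores the digest of its predecessor, so editing any earlier record invalidates everything after it. The sketch below assumes SHA-256 over canonical JSON. The `Artifact` layout and the `append` and `verify` helpers are hypothetical and do not describe the Observatory's actual ledger format.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Artifact:
    kind: str               # e.g. "plan", "simulation", "outcome"
    payload: dict
    parent: Optional[str]   # digest of the previous artifact, None for the first
    digest: str


def _digest(kind: str, payload: dict, parent: Optional[str]) -> str:
    # Canonical JSON (sorted keys, fixed separators) gives a stable byte string.
    body = json.dumps(
        {"kind": kind, "payload": payload, "parent": parent},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(ledger: List[Artifact], kind: str, payload: dict) -> Artifact:
    parent = ledger[-1].digest if ledger else None
    artifact = Artifact(kind, payload, parent, _digest(kind, payload, parent))
    ledger.append(artifact)
    return artifact


def verify(ledger: List[Artifact]) -> bool:
    # Recompute every digest and check that each record points at its predecessor.
    parent = None
    for art in ledger:
        if art.parent != parent:
            return False
        if art.digest != _digest(art.kind, art.payload, art.parent):
            return False
        parent = art.digest
    return True


ledger: List[Artifact] = []
append(ledger, "plan", {"change": "rename utils"})
append(ledger, "simulation", {"broken_imports": 0})
append(ledger, "outcome", {"tests_passed": True})

ok_before = verify(ledger)                # chain is intact
ledger[1].payload["broken_imports"] = 3   # tamper with a recorded simulation
ok_after = verify(ledger)                 # the stored digest no longer matches
```

Note that `frozen=True` only blocks reassigning fields; the payload dictionary itself stays mutable in Python, which is exactly the kind of edit the digest check is there to catch.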

Roadmap

24-phase build plan

Each phase is gated on the completion and verification of the preceding phase. No phase skipping.

Foundation Scaffold

State Capture Engine

Intervention Planner

Simulation Engine

Evaluation Framework

MCP Simulation Bridge

SE-JEPA Prototype Layer

Benchmark Harness

Experiment Memory

Latent State Approximation

Research Export / Briefing Surface

Comparative Research Timeline + Narrative Playback

Experiment Registry + Historical State Capture

Scenario Library + Dataset Infrastructure

Reproducibility + Statistical Evaluation

Experiment-to-Simulation / Benchmark Lineage Binding

Replay-Aware Research Consumers + Experiment Detail Surface

Research Evaluation Semantics + Evidence-Weighted Comparative Analysis

Evaluation Persistence + Comparative Research Surfaces

Research Prioritization Engine

Priority Drift + Recommendation Governance

Historical Priority Ledger + Comparative Governance History

Priority Snapshot Automation + Historical Comparison Surfaces

Comparative Governance Synthesis Layer

24

Closure Hardening + Release-Candidate Conditioning

Current
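
The gating rule stated for the roadmap, where each phase waits on the completion and verification of the one before it, can be sketched as a simple check over an ordered list. The `Phase` record and `can_start` helper are hypothetical, and the status flags below are invented for illustration, not the project's real phase states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    name: str
    completed: bool = False
    verified: bool = False


def can_start(phases: List[Phase], index: int) -> bool:
    # A phase may start only if every earlier phase is completed and verified.
    return all(p.completed and p.verified for p in phases[:index])


# Illustrative statuses only.
roadmap = [
    Phase("Foundation Scaffold", completed=True, verified=True),
    Phase("State Capture Engine", completed=True, verified=False),
    Phase("Intervention Planner"),
]

second_ok = can_start(roadmap, 1)  # first phase is done and verified
third_ok = can_start(roadmap, 2)   # blocked: second phase is not yet verified
```

Checking all earlier phases, rather than only the immediate predecessor, also rules out skipping a phase further back in the list.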