Microcosm

Evidence & authority

The 1-5 rank measures verdict independence: how much the check can fail on its own, from a 1 where the fixture hands over the answer to a 5 where the harness derives the verdict across the whole declared contract and can fail with no fixture echo. It does not measure how heavy the machinery is, how much real tooling runs, or how mature the component is.

Every one of the 78 components declares how it is backed: an evidence class and a strength rank from 1 to 5. The rank measures one thing only, verdict independence: how much the check can fail on its own rather than echoing an answer the fixture already supplied. A 1 means the fixture hands over the verdict and the component checks only its shape. A 5 means the harness derives the verdict across the component's whole declared contract and can fail without any answer supplied by the fixture. The rank is not a measure of how much machinery runs or how mature a component is, so read the ordering carefully: a component that runs a real external tool over a deliberately small scope can sit at 4 while a contract validator with no external tool sits at 5. The classes below are ordered by how many components carry them, and each says plainly what it checks and what it does not.

A class and a rank describe how a component's own public contract and fixtures are checked, not whole-system correctness, live freshness, or anything past the component's stated scope. Each component card's “Scope limit” line holds that boundary.

Why these modes

Microcosm is the public release of a larger working system, and the evidence classes are the release lanes components took to get here. A verified source import carries real code across under content-digest checks. Computed projections and bounded replays exercise that code over public fixtures, where private data and live services cannot follow. Contract validators publish the checks themselves, so they can fail in public. External tool runs close the loop with real machinery, such as Lean and finance statistics code, where a real tool fits a bounded public form. A release where every component ran live external tools would need the live system; this one shows the mechanism, the code, and the check in their inspectable forms instead. The class records the lane a component took, not a quality tier.

rank 5 · 39 | rank 4 · 12 | rank 3 · 27

If you are wondering why a Contract validator (5) outranks a component that compiles Lean or runs real statistics (4): the rank scores verdict independence, not engineering weight. An External tool run does invoke the real tool, but on this public slice it claims only a bounded witness, the tool's return code plus a few output checks over a small scope, so it is capped at 4. A Contract validator earns 5 because the harness derives the verdict over the component's whole declared contract, with no fixture-supplied answer to lean on. The same ordering holds down the scale: a validator with no external tool can outrank one that runs code when it checks more of its own contract unaided. Components that genuinely run external tools are flagged separately below.
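
To make verdict independence concrete, here is a minimal Python sketch of the two ends of the scale. Everything in it is hypothetical: the fixture fields, the toy contract (a strictly ascending list), and the function names are illustration only, not the harness this page describes.

```python
def echo_check(fixture: dict) -> bool:
    """Rank-1 style: the fixture supplies the verdict; only its shape is checked."""
    return fixture.get("expected_verdict") in {"pass", "fail"}


def derived_check(output: list) -> bool:
    """Rank-5 style: the harness derives the verdict from the output itself.

    Hypothetical contract: the output is strictly ascending (sorted, no duplicates).
    """
    return all(a < b for a, b in zip(output, output[1:]))


# A fixture whose output violates the contract but whose verdict field says "pass".
fixture = {"output": [3, 1, 2], "expected_verdict": "pass"}

# The echo check cannot fail on its own: a well-formed verdict field is enough.
echo_result = echo_check(fixture)

# The derived check ignores the supplied verdict and fails on the bad output.
derived_result = derived_check(fixture["output"])
```

In this sketch the first check accepts the fixture and the second rejects it, which is the gap the rank is meant to score.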

Components that actually run things [Runs real tools]

A 4 here often means more machinery, not less. These components execute a real tool or runtime: they compile Lean through Lake, run forecast-evaluation statistics over market-shaped fixtures, or step a small NumPy model forward. They are capped at 4 because each claims only a bounded witness over a small scope, never a general proof. Look for the “Runs real tools” marker on a component to spot them.

Computed projection (27)

A deterministic projection verified by recomputing it from source rather than by a live run; negative cases are policy checks, not real-world validation. Rank 3: the code computes the result, but failure coverage is partial.
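
As a rough illustration of "verified by recomputing it from source", the sketch below recomputes a toy projection and compares it with a stored copy. The row shape, the `kind` field, and both function names are assumptions made for the example, not the project's actual data model.

```python
import json


def project(source_rows):
    """Deterministic toy projection: count rows per 'kind', with sorted keys."""
    counts = {}
    for row in source_rows:
        counts[row["kind"]] = counts.get(row["kind"], 0) + 1
    return dict(sorted(counts.items()))


def verify_projection(source_rows, stored):
    """Recompute from source and compare canonical JSON forms."""
    recomputed = json.dumps(project(source_rows), sort_keys=True)
    return recomputed == json.dumps(stored, sort_keys=True)


rows = [{"kind": "a"}, {"kind": "b"}, {"kind": "a"}]
matches = verify_projection(rows, {"a": 2, "b": 1})
# Negative case: a stored projection that drifted from its source must be rejected.
drifted = verify_projection(rows, {"a": 1, "b": 1})
```

A check like this catches drift between source and projection, but it says nothing about whether the source itself reflects the real world, which is consistent with the partial failure coverage noted above.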

Verified source import (21)

A public source body is copied and validated against its origin byte for byte; the check fails on a missing target, a placeholder digest, an unverified body, or a launch or private-equivalence overclaim. Rank 5: a fully independent provenance verdict.
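
The first three failure modes can be sketched with a content digest. This is a minimal example under stated assumptions: SHA-256 as the digest, the set of placeholder values, and the `verify_import` name are all hypothetical, and the launch and private-equivalence overclaim checks are left out because the page does not say how they work.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical placeholder values a real digest should never equal.
PLACEHOLDER_DIGESTS = {"", "TODO", "0" * 64}


def verify_import(target: Path, declared_sha256: str) -> Optional[str]:
    """Return None when the import verifies, otherwise the failure reason."""
    if not target.is_file():
        return "missing target"
    if declared_sha256 in PLACEHOLDER_DIGESTS:
        return "placeholder digest"
    actual = hashlib.sha256(target.read_bytes()).hexdigest()
    if actual != declared_sha256:
        return "unverified body"
    return None


with tempfile.TemporaryDirectory() as tmp:
    body = Path(tmp) / "imported.txt"
    body.write_bytes(b"public source body\n")
    good = hashlib.sha256(b"public source body\n").hexdigest()

    results = {
        "match": verify_import(body, good),
        "placeholder": verify_import(body, "0" * 64),
        "tampered": verify_import(body, hashlib.sha256(b"other").hexdigest()),
        "missing": verify_import(Path(tmp) / "absent.txt", good),
    }
```

A matching digest stands in for a byte-for-byte comparison here; a check that holds both bodies could compare the bytes directly instead.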

Contract validator (20)

The harness derives the verdict over the component's whole declared public contract and can fail with no answer supplied by the fixture. Rank 5: the most independent check on this slice.

External tool run (7) [Runs real tools]

A real external tool, such as Lean or Lake, is run over a deliberately small scope, and its return code plus a few output checks are recorded as the witness. Rank 4: genuine execution, capped because the witness is bounded, not a general proof.
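
A bounded witness of this kind might look like the following sketch. The `witness_tool_run` helper and its result fields are hypothetical, and the Python interpreter stands in for the real tool so the example runs anywhere; a real component would invoke something like `lake build` instead.

```python
import subprocess
import sys


def witness_tool_run(cmd, expected_substrings, timeout=60):
    """Run a tool once and record its return code plus simple output checks."""
    proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    checks = {s: s in proc.stdout for s in expected_substrings}
    return {
        "returncode": proc.returncode,
        "output_checks": checks,
        "ok": proc.returncode == 0 and all(checks.values()),
    }


# Stand-in tool: the current Python interpreter printing a success line.
result = witness_tool_run([sys.executable, "-c", "print('build ok')"], ["build ok"])

# Negative case: a non-zero exit code must fail the witness.
failing = witness_tool_run([sys.executable, "-c", "import sys; sys.exit(2)"], [])
```

The witness covers only this one invocation and these output checks, which is why a run like this supports a bounded claim and not a general one.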

Bounded runtime computation (3) [Runs real tools]

Real in-process computation runs over public inputs with predicted-versus-actual checks and negative cases, scoped to a declared toy runtime. Rank 4: genuine computation, capped at the bounds of that toy scope.
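
A predicted-versus-actual check with a negative case can be sketched as below. The decay model, the `step` and `check_prediction` names, and the tolerance are assumptions chosen for illustration, and plain Python lists are used in place of NumPy to keep the example dependency-free.

```python
import math


def step(state, rate=0.5):
    """Toy runtime: one step of exponential decay toward zero."""
    return [x * (1.0 - rate) for x in state]


def check_prediction(initial, predicted, steps, tol=1e-9):
    """Run the toy model forward and compare the actual state with the prediction."""
    state = list(initial)
    for _ in range(steps):
        state = step(state)
    return len(state) == len(predicted) and all(
        math.isclose(a, p, abs_tol=tol) for a, p in zip(state, predicted)
    )


# Positive case: two halving steps take [8, 4] to [2, 1].
positive = check_prediction([8.0, 4.0], [2.0, 1.0], steps=2)

# Negative case: a deliberately wrong prediction must be rejected.
negative = check_prediction([8.0, 4.0], [4.0, 2.0], steps=2)
```

The negative case matters as much as the positive one: a check that accepts every prediction would pass the first call and tell the reader nothing.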