Public Evidence Field / Ouroboros Project

AI should not act without evidence it can carry.

We do not ask people to believe the claim first. We provide a place to inspect it, reproduce it, attack it, and see exactly what it does not prove.

claim / evidence / boundary / repair

Serious readers can move from public evidence to a bounded private briefing without exposing protected execution systems.

[Figure: evidence shape for the public evidence field]

Claim: Named and bounded
Evidence: Artifacts and logs
Attack: Public counterexamples
Repair: Versioned boundaries

Entry Architecture

Three public doors. One protected pilot route.

The public layer has one job: let people understand, verify, challenge, or privately scope the work without exposing protected execution systems. The first three doors are the public route; private briefing begins only after the claim boundary and public evidence are clear.

01

Start Here

The first page for public readers: what the project is, what it is not, what is public, and what stays private.

Open start page

02

Evidence Map

Three core evidence cards plus the larger evidence ladder: claim, protocol, evidence, limit, DOI, attack.

Open evidence map

03

Counterexample Challenge

A bounded route for formula counterexamples, leakage reports, stronger baselines, reproduction gaps, and claim-boundary attacks.

Open challenge

Private

Bounded Pilot Route

A protected path for serious readers to scope a private evidence review after the public evidence boundary is understood.

Open pilot route

Support Tools

Everything else keeps the three public doors inspectable.

These pages are not separate slogans. They are tools for boundary control, reproducibility, and public critique.

Manual

Field Manual

The operating manual for researchers, engineers, partners, and critics to read, reproduce, challenge, or pilot the work.

Open field manual

Boundary

Claim Boundaries

Plain statements of what is claimed, what is not claimed, what is public, and what remains private.

Open boundaries

Registry

Public Registries

Machine-readable claim, evidence, counterexample, and action registries for public inspection.

Open registries

Demo

Runnable Demos

Small public slices for warrant refusal, receipt closure, negative-space memory, and clean learning boundaries.

Open demos

Open

Open Source Artifacts

GitHub, Hugging Face, Zenodo, runnable demos, registries, and the public/private boundary in one route.

Open artifact map

Table

Claim-to-Evidence Table

A compact table that connects each public claim to evidence, limits, attack routes, and artifacts.

Open claim table

Tech

Technology Notes

Implementation-facing notes for public routes, registries, demos, reproducibility, and evidence gates.

Open technology notes

Status

Review Status Ledger

A public ledger that separates active routes, desk decisions, claim boundaries, and next-route decisions.

Open status ledger

Pilot

Pilot Packet

A bounded non-secret packet for qualified readers to scope a private evidence review.

Open pilot packet

CN

Chinese Reliable Action

A Chinese-language route explaining evidence, boundaries, receipts, regret, and clean learning.

Open Chinese route

Core Card 1

WisdomBench

Learning from failure should be measured longitudinally, not inferred from first-attempt success.

Inspect card

Core Card 2

Proof-Carrying Action

High-risk AI action needs a warrant, falsifier, receipt, counterfactual, regret route, and no-credit repair discipline.

Inspect card

Core Card 3

Relational Observability

Adaptive intelligence needs observable relations, constraints, control debt, and evidence half-life, not only scalar scores.

Inspect card

Boundary

What stays private

Financial execution details, customer data, private agent orchestration, and commercial schedulers are not part of the public challenge.

Read boundary

Proof-Carrying Action

The central interface: no action authority without a stated evidence threshold.

The public program is converging on one operational rule: an AI system should not merely produce an answer. It should know whether it has earned the right to act, what evidence supports that action, what would falsify it, what receipt proves what happened, and how to learn without contaminating future decisions.

goal -> observation -> relation field -> thesis -> falsifier -> warrant -> receipt -> regret -> clean learning

This is the bridge between the papers and the product. In language agents it becomes refusal, critique, and longitudinal learning. In robotics it becomes perception-to-action restraint and recovery. In finance it becomes proof-carrying action: a system can say "not yet" and turn missing evidence into repair work orders.
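The rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's implementation: the names `Warrant`, `Receipt`, and `gate`, the 0-to-1 evidence score, and the wording of the work orders are all assumptions introduced here. It shows the shape of a gate that either grants authority or says "not yet" and lists the evidence that is missing.

```python
from dataclasses import dataclass, field


@dataclass
class Warrant:
    """Hypothetical evidence record for one proposed action (assumed fields)."""
    action: str
    threshold: float        # evidence threshold, stated before acting
    evidence_score: float   # assumed scale: 0.0 (none) to 1.0 (strong)
    falsifier: str          # observation that would invalidate the thesis
    missing_evidence: list[str] = field(default_factory=list)


@dataclass
class Receipt:
    """Hypothetical record of what the gate decided and why."""
    action: str
    decision: str           # "act" or "not yet"
    repair_work_orders: list[str]


def gate(warrant: Warrant) -> Receipt:
    """Grant action authority only if evidence meets the stated threshold."""
    if warrant.evidence_score >= warrant.threshold and not warrant.missing_evidence:
        return Receipt(warrant.action, "act", [])
    # Otherwise refuse, and turn each evidence gap into a repair work order.
    orders = [f"collect evidence: {gap}" for gap in warrant.missing_evidence]
    if warrant.evidence_score < warrant.threshold:
        orders.append(
            f"raise evidence from {warrant.evidence_score:.2f} "
            f"to at least {warrant.threshold:.2f}"
        )
    return Receipt(warrant.action, "not yet", orders)


# Made-up example values, for illustration only.
proposed = Warrant(
    action="rebalance portfolio",
    threshold=0.80,
    evidence_score=0.55,
    falsifier="out-of-sample loss exceeds the backtest bound",
    missing_evidence=["independent reproduction of the backtest"],
)
receipt = gate(proposed)
print(receipt.decision)  # not yet
for order in receipt.repair_work_orders:
    print("-", order)
```

In this sketch the refusal is itself a structured output: the receipt records the decision, and the work orders name what would have to change before the same action could be warranted.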

Research Program

From first-attempt intelligence to post-failure wisdom.

Most evaluations ask whether an agent succeeds on the first attempt. My work asks a different question: after failure, feedback, or environmental shift, does the system become more reliable, more bounded, and more capable of choosing the right next action?

The portfolio formalizes this question across cognitive agents, embodied benchmarks, world-model evidence, social calibration, cognitive immunity, and robust perception under adverse conditions.
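The contrast between first-attempt and post-failure evaluation can be made concrete with a toy calculation. The sketch below is an assumption for illustration, not a metric defined by WisdomBench or the papers: each task is a hypothetical list of attempt outcomes in order, and `recovery_rate` is an invented name for the share of first-attempt failures that are solved on a later attempt.

```python
def first_attempt_rate(tasks: list[list[bool]]) -> float:
    """Share of tasks solved on the first attempt."""
    return sum(attempts[0] for attempts in tasks) / len(tasks)


def recovery_rate(tasks: list[list[bool]]) -> float:
    """Share of first-attempt failures that are solved on a later attempt."""
    failed_first = [attempts for attempts in tasks if not attempts[0]]
    if not failed_first:
        return 0.0  # nothing to recover from; convention assumed here
    recovered = sum(any(attempts[1:]) for attempts in failed_first)
    return recovered / len(failed_first)


# Hypothetical outcomes: one inner list per task, one bool per attempt.
tasks = [
    [True],
    [False, True],
    [False, False, True],
    [False, False],
]
print(first_attempt_rate(tasks))  # 0.25
print(recovery_rate(tasks))       # 2 of the 3 first-attempt failures recover
```

Two agents with the same first-attempt rate can differ sharply on the second number, which is the distinction the question above is aimed at.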

01

Wisdom Science

Metrics and protocols for learning from repeated exposure, failure modes, perturbations, and recovery.

02

Embodied Intelligence

WB-E evaluates physical agents beyond first-attempt success, emphasizing recovery, provenance, and bounded action.

03

Cognitive Immunity

Failure is treated as an antigen: a reusable signal for improved reasoning, safety, and decision hygiene.

04

Reflexive Systems

Agents must reason about how their actions change the environment that will later evaluate them.

Art Direction

Scientific rigor with visual gravity.

The public layer is designed as an editorial research space: quiet, precise, warm enough to enter, and strong enough to be remembered. The goal is not spectacle. The goal is trust with taste.

[Figures: evidence gate in black and gold; supra-body architecture; closed-loop learning]

Public Archive

Selected papers and records.

Public preprints may contain author identity. Double-blind conference submissions use separate anonymous packages.

Open the full papers page
Open the full Zenodo portfolio archive

[Figure: black-gold evidence gate command visual]

Systems Layer

SOVEREIGN is a local-first decision intelligence system.

The engineering layer organizes evidence trails, failure logs, workflow memory, social calibration, cognitive immunity, and closed-loop teaching into a practical system for research, operations, and reliable agent design.

  • Evidence-gated outputs and claim boundaries.
  • Failure logs treated as reusable learning assets.
  • Local-first memory and private decision ledgers.
  • Embodied and cognitive loops described through one control framework.

Open the systems product page

Position

The next route is not only larger models. It is architecture, feedback, evidence, recovery, body-like subsystems, and the discipline to know when not to act.

This is a research archive, a systems map, and a public instrument panel for work that must remain falsifiable.

Contact

Mian Zhang

Independent Researcher, Ouroboros Project