Claim-to-Evidence Table

Claims should have a shape.

A public claim is useful only when it points to evidence, states what it does not prove, names the condition that would weaken it, and gives critics a bounded route to attack it.
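One way to make that shape concrete is to treat each claim as a record whose parts must all be filled in. The sketch below is a minimal illustration in Python, not the registry's actual schema: the class name `ClaimRow`, its field names, and the helper `missing_parts` are all assumptions introduced here. The example values are taken from the P02-C1 row of the public table.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ClaimRow:
    """One claim with its evidence shape (hypothetical field names)."""
    claim_id: str
    claim: str
    evidence_status: str
    does_not_prove: str
    downgrade_trigger: str
    attack_route: str


def missing_parts(row: ClaimRow) -> List[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


# Example values copied from the P02-C1 row of the public table.
row = ClaimRow(
    claim_id="P02-C1",
    claim=("WisdomBench measures longitudinal learning from failure "
           "rather than single-shot task capability."),
    evidence_status=("Public supporting evidence: GitHub, Hugging Face "
                     "dataset, Zenodo record."),
    does_not_prove=("Human-like wisdom, general deployment reliability, "
                    "or that all agents learn from failure."),
    downgrade_trigger=("Task leakage, scoring bugs, reproduction failure, or "
                       "stronger baselines removing the longitudinal effect."),
    attack_route="WisdomBench issue template",
)

# A complete row has no missing parts.
print(missing_parts(row))
```

A row that lacks, say, a downgrade trigger would show up in the returned list, which makes an incomplete claim visible before it is published.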

Evidence shapes for claims, artifacts, receipts, and attack routes.

Public Table v0

Bounded claims, public evidence, and downgrade triggers.

P02-C1
Claim: WisdomBench measures longitudinal learning from failure rather than single-shot task capability.
Evidence status: Public supporting evidence: GitHub, Hugging Face dataset, Zenodo record.
What it does not prove: Human-like wisdom, general deployment reliability, or that all agents learn from failure.
Downgrade trigger: Task leakage, scoring bugs, reproduction failure, or stronger baselines removing the longitudinal effect.
Attack route: WisdomBench issue template

PCA-C1
Claim: High-risk AI action should not earn action credit until warrant and receipt closure exist.
Evidence status: Public protocol and interface demo.
What it does not prove: Live trading profit, private product performance, or universal safety.
Downgrade trigger: The public gate allows unsafe action, gives credit without receipts, or cannot reproduce its no-go boundary.
Attack route: Proof-carrying action issue template

P24-C1
Claim: Adaptive systems need relational observability: relations, constraints, control debt, and evidence half-life.
Evidence status: Public protocol stage.
What it does not prove: A theorem covering all adaptive systems or a finished private product.
Downgrade trigger: Relation variables, control debt, or evidence half-life do not change decisions beyond scalar baselines.
Attack route: Public counterexample route

P20-C1
Claim: Physical AI should route degraded evidence to recovery or abstention rather than direct action.
Evidence status: Public bounded support; rebuild needed before stronger deployment claims.
What it does not prove: Detector SOTA, offensive autonomy, or real-world robot deployment performance.
Downgrade trigger: Stronger conformal, shield, or fusion baselines handle the same degraded evidence without this boundary.
Attack route: Public counterexample route

F1-C1
Claim: Trading is used as a high-risk testbed for proof-carrying action discipline, not as a public claim of live profitability.
Evidence status: Public boundary and private-briefing route.
What it does not prove: Live trading edge, customer readiness, private execution quality, or alpha dominance.
Downgrade trigger: Public language implies live profitability, private execution readiness, or authority beyond no-go evidence.
Attack route: Boundary issue template

Rule

No evidence row is allowed to silently become a larger claim.

The public layer is deliberately narrow: it exposes protocols, manifests, bounded evidence, negative results, public demos, and repair routes. It does not expose private financial execution details, customer data, internal agent orchestration, commercial schedulers, API keys, or anonymous conference packages.

If a critic can show a stronger baseline, a leakage path, a failed reproduction command, or an over-broad boundary, the claim should be narrowed or repaired. That is the point of the registry.
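The narrowing rule can be sketched as a small check over critic findings. This is a hedged illustration only: the `Finding` enum, the `review_claim` function, and the status labels "narrow-or-repair" and "stands" are hypothetical names introduced here. The text above does not say which finding leads to narrowing and which to repair, so the sketch does not distinguish them.

```python
from enum import Enum
from typing import Iterable, Tuple


class Finding(Enum):
    """The four kinds of critic finding named in the text (hypothetical enum)."""
    STRONGER_BASELINE = "stronger baseline"
    LEAKAGE_PATH = "leakage path"
    FAILED_REPRODUCTION = "failed reproduction command"
    OVER_BROAD_BOUNDARY = "over-broad boundary"


def review_claim(claim_id: str, findings: Iterable[object]) -> Tuple[str, str]:
    """Flag a claim for narrowing or repair if any recognized finding exists.

    Items that are not Finding members are ignored, so an unrecognized
    complaint does not change the claim's status on its own.
    """
    recognized = [f for f in findings if isinstance(f, Finding)]
    status = "narrow-or-repair" if recognized else "stands"
    return claim_id, status


# A stronger baseline against P20-C1 flags the claim; no findings leaves it as is.
print(review_claim("P20-C1", [Finding.STRONGER_BASELINE]))
print(review_claim("P20-C1", []))
```

Keeping the outcome to two coarse states mirrors the rule as written: a demonstrated weakness always changes the claim's status, and the choice between narrowing and repair is left to whoever maintains the row.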