Runnable Entry Points

Small demos for one rule: no action without carried proof.

These demos are intentionally small. They do not expose private systems, customer data, or financial execution logic. They show the public interface: how a system refuses, explains the missing evidence, creates repair work, and keeps dirty learning out.

warrant, receipt, regret, repair

[Figure: Evidence shape diagram for proof-carrying action demos.]

Public: Protocol shape

Private: Execution systems

Safe: No customer data

Attackable: Clear failure routes

Demo 00

Proof-Carrying Action Mini Gate

A runnable toy gate for one question: does a proposed action carry enough evidence (thesis, falsifier, null arms, receipts, and counterfactual closure) to act, or should it stop and create no-credit repair work?

Run the public mini gate
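The gate's core check can be sketched in a few lines of Python. This is a minimal illustration, not the public mini gate's actual code: the field names in REQUIRED_EVIDENCE are taken from the evidence kinds listed above, while the mini_gate function, the GateDecision type, and the dictionary input format are assumptions made for this example.

```python
from dataclasses import dataclass

# Evidence kinds named in the demo description; the exact keys are assumed.
REQUIRED_EVIDENCE = (
    "thesis",
    "falsifier",
    "null_arms",
    "receipts",
    "counterfactual_closure",
)


@dataclass
class GateDecision:
    action_allowed: bool
    missing: tuple = ()


def mini_gate(proposal: dict) -> GateDecision:
    """Allow the action only if every required evidence item is present and non-empty."""
    missing = tuple(name for name in REQUIRED_EVIDENCE if not proposal.get(name))
    return GateDecision(action_allowed=not missing, missing=missing)


# A proposal with a thesis and a falsifier but nothing else is refused.
decision = mini_gate({"thesis": "signal X predicts Y", "falsifier": "Y fails to follow X"})
print(decision.action_allowed, decision.missing)
```

The gate returns the list of missing items instead of a bare boolean so that a refusal can explain itself.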

Demo 01

Action Warrant Refusal

Submit a proposed action with missing null arms, a missing terminal receipt, or weak evidence. The expected result is not a confident answer but a clean refusal with repair work orders.

{
  "action_allowed": false,
  "authority_leak": 0,
  "credit_leak": 0,
  "action_requires_warrant": true
}
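A refusal in this shape could be assembled as follows. This is a sketch only: the four top-level fields mirror the sample output above, while the refuse function, the repair_work_orders field, and the defect names are hypothetical and are not confirmed by the demo.

```python
import json


def refuse(defects: list) -> dict:
    """Build a refusal record in the shape of the sample output above."""
    return {
        "action_allowed": False,
        "authority_leak": 0,  # no weak signal was promoted to action authority
        "credit_leak": 0,  # repair intent earned no learning credit
        "action_requires_warrant": True,
        # Hypothetical extension: one no-credit work order per defect.
        "repair_work_orders": [{"fix": d, "credit": 0} for d in defects],
    }


record = refuse(["missing_null_arms", "missing_terminal_receipt"])
print(json.dumps(record, indent=2))
```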

Demo 02

Receipt Closure Checker

A receipt is not a feeling of progress. The checker separates price-only evidence, wait-policy receipts, terminal join contracts, null baselines, and regret computability.

missing_birth_terminal_join_contract
required_nulls_missing
wait_policy_receipts_missing
regret_computable = false
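One way a checker might arrive at findings like these is sketched below. The finding codes are the ones shown above. The input keys, the check_receipt_closure function, and the rule that regret is computable only when nothing is missing are assumptions made for illustration.

```python
def check_receipt_closure(receipts: dict) -> dict:
    """Report which closure requirements are unmet for a bundle of receipts."""
    findings = []
    if not receipts.get("birth_terminal_join_contract"):
        findings.append("missing_birth_terminal_join_contract")
    if not receipts.get("null_baselines"):
        findings.append("required_nulls_missing")
    if not receipts.get("wait_policy_receipts"):
        findings.append("wait_policy_receipts_missing")
    # Assumed rule: regret can be computed only when no finding remains.
    return {"findings": findings, "regret_computable": not findings}


# Price-only evidence closes nothing: all three findings are raised.
report = check_receipt_closure({"price_evidence": [101.2, 101.9]})
print(report)
```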

Demo 03

Negative-Space Memory

Non-action is treated as a decision. The demo records why the system did not act, what happened afterward, and whether restraint was justified.

wait -> observe -> counterfactual
     -> regret route
     -> clean / dirty learning label
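The flow above can be sketched as a small record type. Everything here is an assumption for illustration: the NonActionRecord name, its fields, the definition of regret as the counterfactual outcome minus the observed outcome, and the rule that a record is labeled clean only once both outcomes are known.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NonActionRecord:
    reason: str  # why the system did not act
    observed_outcome: Optional[float] = None  # what happened while waiting
    counterfactual_outcome: Optional[float] = None  # estimate of what acting would have yielded

    def regret(self) -> Optional[float]:
        """Positive regret means acting would have done better than waiting."""
        if self.observed_outcome is None or self.counterfactual_outcome is None:
            return None  # no regret route until both sides are recorded
        return self.counterfactual_outcome - self.observed_outcome

    def learning_label(self) -> str:
        """Assumed rule: only records with computable regret are clean."""
        return "clean" if self.regret() is not None else "dirty"


pending = NonActionRecord(reason="required_nulls_missing")
resolved = NonActionRecord("required_nulls_missing", observed_outcome=0.0, counterfactual_outcome=-1.5)
print(pending.learning_label(), resolved.learning_label(), resolved.regret())
```

In this sketch a negative regret value means restraint was justified: acting would have done worse than waiting.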

Demo 04

WisdomBench Mini Route

A compact route into the public benchmark: tasks, scoring code, repeated rounds, confidence intervals, negative results, and repeat-failure traces.

Open WisdomBench code
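For readers unfamiliar with reporting repeated rounds alongside confidence intervals, a generic sketch follows. It is not WisdomBench's scoring code: the mean_with_ci function is hypothetical, the scores are made-up placeholder values, and the normal approximation used here is crude for a small number of rounds.

```python
import math
import statistics


def mean_with_ci(scores, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval.

    Needs at least two scores, because the sample standard deviation
    is undefined for a single value.
    """
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - half_width, mean + half_width


# Placeholder per-round scores, for illustration only.
round_scores = [0.62, 0.58, 0.65, 0.60, 0.59]
mean, lower, upper = mean_with_ci(round_scores)
print(f"mean={mean:.3f} interval=({lower:.3f}, {upper:.3f})")
```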

Pilot

From Demo to Bounded Pilot

If the mini gate is useful, the next step is not to send private data. Use the mini gate to prepare a redacted workflow, synthetic or fully redacted public-safe traces, stop rules, and a clear no-go boundary.

Open mini gate

Why This Demo Exists

The first useful behavior is sometimes restraint.

A high-risk AI system can look capable while still lacking the right to act. The public demo layer exists to make that distinction visible. It shows how the system should turn uncertainty into structured repair rather than into false confidence.

For researchers, the key challenge routes are formula counterexamples, stronger baselines, leakage reports, reproduction gaps, and claim-boundary errors. For engineers, the key interface is simpler: action authority is granted only after closed evidence.

Proof-Carrying Action card | GitHub repository | Counterexample Challenge

Trust Metrics

What the demo should make inspectable.

These are not marketing scores. They are interface-level checks that a critic can ask the public artifacts to expose.

| Metric | Meaning | Expected demo behavior | Attack route |
| --- | --- | --- | --- |
| authority_leak | Whether a weak signal becomes action authority | Must remain 0 under missing receipts | Find a path where weak evidence acts |
| credit_leak | Whether repair intent becomes learning credit | Must remain 0 until evidence closes | Find reward given to incomplete repair |
| regret_route | Whether action or non-action can later be evaluated | False until counterfactual receipts exist | Show regret without comparable nulls |
| boundary_fit | Whether the claim says only what the evidence supports | Explicit limitations on every demo | Find overclaiming or hidden assumptions |
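The first two metrics lend themselves to a mechanical check over a decision log. The sketch below assumes a hypothetical log format in which each entry records whether the action was allowed, whether its evidence was closed, and how much learning credit was granted. Neither the format nor the function bodies come from the public artifacts.

```python
def authority_leak(decisions) -> int:
    """Count decisions that were allowed to act without closed evidence."""
    return sum(1 for d in decisions if d["action_allowed"] and not d["evidence_closed"])


def credit_leak(decisions) -> int:
    """Count decisions that received learning credit without closed evidence."""
    return sum(1 for d in decisions if d["credit"] > 0 and not d["evidence_closed"])


# Hypothetical log: one refusal, one warranted action, one leaking entry.
log = [
    {"action_allowed": False, "evidence_closed": False, "credit": 0},
    {"action_allowed": True, "evidence_closed": True, "credit": 1},
    {"action_allowed": True, "evidence_closed": False, "credit": 1},
]
print(authority_leak(log[:2]), credit_leak(log[:2]))  # the first two entries are leak-free
print(authority_leak(log), credit_leak(log))  # the third entry leaks on both metrics
```

A critic following the attack routes in the table would be looking for an input that produces an entry like the third one.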