Runnable Entry Points

Small demos for one rule: no action without carried proof.

These demos are intentionally small. They do not expose private systems, customer data, or financial execution logic. They show the public interface: how a system refuses, explains the missing evidence, creates repair work, and keeps dirty learning out.

[Figure: Evidence shape diagram for proof-carrying action demos. Labels: warrant, receipt, regret, repair.]

Public: Protocol shape
Private: Execution systems
Safe: No customer data
Attackable: Clear failure routes

Demo 00

Proof-Carrying Action Mini Gate

A runnable toy gate for one question: does a proposed action carry a thesis, a falsifier, null arms, receipts, and counterfactual closure strong enough to act on, or should it stop and create no-credit repair work?
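A minimal sketch of such a gate, assuming a proposed action arrives as a plain dictionary. The field names, the `gate` function, and the work-order shape are illustrative assumptions, not the public mini gate's actual interface.

```python
# Hypothetical field names for the five things a proposed action must carry.
REQUIRED_FIELDS = (
    "thesis",
    "falsifier",
    "null_arms",
    "receipts",
    "counterfactual_closure",
)


def gate(proposed_action: dict) -> dict:
    """Allow the action only if every required field is present and non-empty."""
    missing = [name for name in REQUIRED_FIELDS if not proposed_action.get(name)]
    if not missing:
        return {"action_allowed": True, "repair_work": []}
    # Stop and create repair work; repair work earns no credit by itself.
    repair_work = [{"repair": name, "credit": 0} for name in missing]
    return {"action_allowed": False, "repair_work": repair_work}


# Made-up proposal that carries a thesis and a falsifier but nothing else.
proposal = {"thesis": "X improves Y", "falsifier": "Y does not move in 7 days"}
result = gate(proposal)
print(result["action_allowed"])                      # False
print([w["repair"] for w in result["repair_work"]])  # ['null_arms', 'receipts', 'counterfactual_closure']
```

The sketch checks only presence, not strength. A fuller gate would also need a rule for what counts as enough evidence in each field.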

Run the public mini gate

Demo 01

Action Warrant Refusal

Input a proposed action with missing null arms, a missing terminal receipt, or weak evidence. The expected result is not a confident answer; it is a clean refusal with repair work orders.

{
  "action_allowed": false,
  "authority_leak": 0,
  "credit_leak": 0,
  "action_requires_warrant": true
}
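One way a refusal like the one above could be assembled, as a sketch. The four top-level fields come from the payload shown; the `refuse` function, the `repair_work_orders` field, and the order naming are assumptions added for illustration.

```python
import json


def refuse(missing_evidence: list[str]) -> dict:
    """Build a refusal that opens one repair work order per evidence gap."""
    return {
        "action_allowed": False,
        "authority_leak": 0,  # no weak signal was promoted to authority
        "credit_leak": 0,     # repair intent earns no learning credit
        "action_requires_warrant": True,
        # Assumed extension: not part of the payload shown above.
        "repair_work_orders": [
            {"order": f"supply_{gap}", "status": "open"}
            for gap in missing_evidence
        ],
    }


refusal = refuse(["null_arms", "terminal_receipt"])
print(json.dumps(refusal, indent=2))
```

Keeping the leak counters in the refusal itself means a critic can check them on every response rather than trusting a separate report.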
Demo 02

Receipt Closure Checker

A receipt is not a feeling of progress. The checker separates price-only evidence, wait-policy receipts, terminal join contracts, null baselines, and regret computability.

missing_birth_terminal_join_contract
required_nulls_missing
wait_policy_receipts_missing
regret_computable = false
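The reason codes above could come from a checker shaped roughly like this. The codes are taken from the demo output; the input keys, the `check_closure` function, and the rule that regret is computable only when nothing is missing are assumptions.

```python
def check_closure(bundle: dict) -> dict:
    """Return blocking reason codes and whether regret can be computed."""
    reasons = []
    if not bundle.get("birth_terminal_join_contract"):
        reasons.append("missing_birth_terminal_join_contract")
    if not bundle.get("null_baselines"):
        reasons.append("required_nulls_missing")
    if not bundle.get("wait_policy_receipts"):
        reasons.append("wait_policy_receipts_missing")
    # Assumption: regret is computable only when no receipt is missing.
    # Price-only evidence is never consulted, so it cannot close a receipt.
    return {"reasons": reasons, "regret_computable": not reasons}


# Made-up bundle that holds price evidence and nothing else.
price_only = {"price_evidence": [101.2, 101.9]}
print(check_closure(price_only))
```

A bundle with only price evidence fails all three checks, which matches the point that a receipt is not a feeling of progress.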
Demo 03

Negative-Space Memory

Non-action is treated as a decision. The demo records why the system did not act, what happened afterward, and whether restraint was justified.

wait -> observe -> counterfactual
     -> regret route
     -> clean / dirty learning label
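The flow above can be sketched as a small record type. Everything here is an assumption for illustration: the `NonAction` class, the definition of regret as counterfactual outcome minus observed outcome, and the rule that only records with computable regret are labeled clean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NonAction:
    reason: str                             # why the system did not act
    observed: Optional[float] = None        # what happened afterward
    counterfactual: Optional[float] = None  # estimated outcome had it acted

    def regret(self) -> Optional[float]:
        """Regret of waiting; None while either side is still unknown."""
        if self.observed is None or self.counterfactual is None:
            return None
        return self.counterfactual - self.observed

    def learning_label(self) -> str:
        """Assumed rule: only records with computable regret are clean."""
        return "clean" if self.regret() is not None else "dirty"


record = NonAction(reason="required_nulls_missing")
print(record.learning_label())                   # dirty

# Made-up outcomes: waiting yielded 0.0, acting would have yielded -1.5.
record.observed, record.counterfactual = 0.0, -1.5
print(record.regret(), record.learning_label())  # -1.5 clean
```

Under this sign convention, a negative regret means restraint was justified: acting would have done worse than waiting.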
Demo 04

WisdomBench Mini Route

A compact route into the public benchmark: tasks, scoring code, repeated rounds, confidence intervals, negative results, and repeat-failure traces.
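As a rough illustration of how repeated rounds could yield a confidence interval, the sketch below uses a normal approximation over per-round scores. It is not the WisdomBench scoring code; the function name, the choice of interval, and the scores are assumptions.

```python
import statistics


def mean_with_ci(round_scores: list[float], z: float = 1.96) -> tuple[float, float, float]:
    """Mean and normal-approximation confidence interval over repeated rounds."""
    if len(round_scores) < 2:
        raise ValueError("need at least two rounds to estimate spread")
    mean = statistics.mean(round_scores)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(round_scores) / len(round_scores) ** 0.5
    return mean, mean - z * sem, mean + z * sem


scores = [0.62, 0.58, 0.65, 0.60, 0.61]  # made-up scores, one per round
mean, low, high = mean_with_ci(scores)
print(f"mean={mean:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

With only a handful of rounds, a t-based interval would be wider than this one, so the normal approximation should be read as optimistic.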

Open WisdomBench code

Pilot

From Demo to Bounded Pilot

If the mini gate is useful, the next step is not to send private data. Use the pilot packet to prepare a redacted workflow, synthetic or fully redacted public-safe traces, stop rules, and a clear no-go boundary.

Open pilot packet

Why This Demo Exists

The first useful behavior is sometimes restraint.

A high-risk AI system can look capable while still lacking the right to act. The public demo layer exists to make that distinction visible. It shows how the system should turn uncertainty into structured repair, rather than converting uncertainty into false confidence.

For researchers, the key attack routes are formula counterexamples, stronger baselines, leakage reports, reproduction gaps, and claim-boundary errors. For engineers, the key interface is simpler: action authority is granted only after closed evidence.

Trust Metrics

What the demo should make inspectable.

These are not marketing scores. They are interface-level checks that a critic can ask the public artifacts to expose.

| Metric | Meaning | Expected Demo Behavior | Attack Route |
| --- | --- | --- | --- |
| authority_leak | Whether a weak signal becomes action authority | Must remain 0 under missing receipts | Find a path where weak evidence acts |
| credit_leak | Whether repair intent becomes learning credit | Must remain 0 until evidence closes | Find reward given to incomplete repair |
| regret_route | Whether action or non-action can later be evaluated | False until counterfactual receipts exist | Show regret without comparable nulls |
| boundary_fit | Whether the claim says only what the evidence supports | Explicit limitations on every demo | Find overclaiming or hidden assumptions |
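A critic could turn the expected behaviors into mechanical checks along these lines. This is a sketch under simplifying assumptions: the `audit` function, the `limitations` field, and the single `receipts_closed` flag (standing in for missing receipts, unclosed evidence, and absent counterfactual receipts alike) are hypothetical.

```python
def audit(result: dict, receipts_closed: bool) -> list[str]:
    """Return the metrics whose expected behavior is violated (empty if none)."""
    violations = []
    if not receipts_closed:
        # Weak evidence must never act or carry authority.
        if result.get("authority_leak", 0) != 0 or result.get("action_allowed"):
            violations.append("authority_leak")
        # Repair intent must not be rewarded before evidence closes.
        if result.get("credit_leak", 0) != 0:
            violations.append("credit_leak")
        # Regret must not be claimed without counterfactual receipts.
        if result.get("regret_route", False):
            violations.append("regret_route")
    # Every demo result is expected to state its limitations explicitly.
    if not result.get("limitations"):
        violations.append("boundary_fit")
    return violations


refusal = {
    "action_allowed": False,
    "authority_leak": 0,
    "credit_leak": 0,
    "regret_route": False,
    "limitations": ["toy data only"],
}
print(audit(refusal, receipts_closed=False))  # []
```

An empty list only means these four checks found nothing; it does not show the absence of other leak paths.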