Public Release - 2026-06-06

Before per-paper attack, expose the evidence gaps.

Artifact: 16-paper Preflight Evidence Digest. The 16-paper upload matrix should not move from page count to narrative confidence. It first needs a preflight digest: evidence cards, counterexample queues, boundary ledgers, public index gaps, freshness checks, and next attack targets.

evidence_card queue boundary index_gap attack_target


[Image: Preflight evidence digest visual showing a 16-item audit matrix before per-paper evidence review.]

Weekly Digest

This is a preflight object, not a victory lap.

From 2026-06-02 to 2026-06-05, the daily record added four public control surfaces for the upload matrix: Evidence Card Inspector, Per-paper Counterexample Queue, Per-paper Boundary Update Ledger, and Public Index JSONL. Today's digest asks whether those surfaces are complete enough to start per-paper attacks.

The matrix remains W0, P23, P28, P29, P30, P31, P32-P40, and F1/P8. The point is not to promote paper count. The point is to make each claim, evidence object, boundary, freshness state, and challenge route inspectable.
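As a quick consistency check, the matrix entries can be expanded into individual paper IDs. This is a minimal sketch: the name `expand_matrix` is hypothetical, and treating `F1/P8` as a single matrix slot is an assumption, chosen because it is the reading under which the entries add up to 16.

```python
import re

# Matrix entries as listed in the digest. Treating "F1/P8" as one slot is an
# assumption; it is the reading under which the total comes to 16.
MATRIX_ENTRIES = ["W0", "P23", "P28", "P29", "P30", "P31", "P32-P40", "F1/P8"]


def expand_matrix(entries):
    """Expand ranges such as 'P32-P40' into individual paper IDs."""
    paper_ids = []
    for entry in entries:
        match = re.fullmatch(r"([A-Z]+)(\d+)-([A-Z]+)(\d+)", entry)
        if match and match.group(1) == match.group(3):
            prefix = match.group(1)
            start, end = int(match.group(2)), int(match.group(4))
            paper_ids.extend(f"{prefix}{n}" for n in range(start, end + 1))
        else:
            # Single IDs and slash-joined slots are kept as written.
            paper_ids.append(entry)
    return paper_ids


PAPER_IDS = expand_matrix(MATRIX_ENTRIES)
print(len(PAPER_IDS))  # 16
```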

6/2

Evidence cards

Each paper needs public fields for PDF, manifest, boundary, hash/DOI status, supported claim, and does_not_prove.

6/3

Attack queues

Each paper needs a queue row: paper_id, target claim, expected evidence, observed gap, status, and boundary effect.

6/4

Boundary ledgers

Each paper needs does_not_claim, downgrade_trigger, withdraw_condition, old boundary, new boundary, and public note.

6/5

Public index

The matrix needs JSONL rows for papers, claims, evidence, counterexamples, boundaries, and freshness.

6/6

Gap digest

The week ends by naming what is still missing before 6/10 starts the per-paper attack cycle.

Per-paper review

P23, P28, P29, P30/P31, P32/P33, P34-P38, F1/P8, and P39/P40 each receive bounded attack targets.
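The field lists from the daily entries above can be written down as a small completeness check. The sketch below is illustrative only: the snake_case spellings (for example `hash_doi_status`), the `kind` key on each JSONL row, and the sample values are assumptions, not the published schema.

```python
import json

# Required public fields per control surface, taken from the daily entries.
# The snake_case spellings are assumptions made for this sketch.
REQUIRED_FIELDS = {
    "evidence_card": ["pdf", "manifest", "boundary", "hash_doi_status",
                      "supported_claim", "does_not_prove"],
    "queue_row": ["paper_id", "target_claim", "expected_evidence",
                  "observed_gap", "status", "boundary_effect"],
    "boundary_ledger": ["does_not_claim", "downgrade_trigger",
                        "withdraw_condition", "old_boundary",
                        "new_boundary", "public_note"],
}


def missing_fields(kind, record):
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS[kind]
            if record.get(name) in (None, "", [])]


def to_jsonl(rows):
    """Serialize (kind, record) pairs as one JSON object per line."""
    return "\n".join(json.dumps({"kind": kind, **record}, sort_keys=True)
                     for kind, record in rows)


# Hypothetical card with placeholder values, for illustration only.
card = {"pdf": "P23.pdf", "manifest": "", "boundary": "bounded replay only",
        "supported_claim": "dry-run replay"}
print(missing_fields("evidence_card", card))
# ['manifest', 'hash_doi_status', 'does_not_prove']
```

Reporting the missing field names, rather than a pass/fail flag, keeps the gap itself visible in the digest.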

Preflight Ledger

The digest should expose unresolved gaps.

| Gap class | Question | Public check | Boundary |
| Missing evidence | Which claim still lacks a public artifact or manifest row? | Evidence card and registry route. | No protected operational systems demand. |
| Needs reproduction | Which claim needs replay, verifier output, or public fixture before it can be trusted? | Demo, artifact, or reproducible route. | No customer log request. |
| Needs third-party attack | Which paper has a claim that only becomes credible after outside critique? | Counterexample route and issue packet. | Public artifact only. |
| Missing DOI/hash | Which public file has pending, stale, or unclear archive/checksum status? | Public URL, DOI status, hash status. | No private build hash. |
| Boundary risk | Which wording makes a protocol, spec, or negative result look like deployment capability? | does_not_claim and downgrade trigger. | No alpha, no trading advice, no action authority. |
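The five gap classes in the ledger can be sketched as a simple classifier over a claim record. Everything about the record shape here is an assumption: the key names, the rule that a claim lacking either an artifact or a manifest row counts as missing evidence, and the sample values are hypothetical. Only the class names and the "pending, stale, or unclear" states come from the ledger.

```python
from enum import Enum


class GapClass(Enum):
    MISSING_EVIDENCE = "Missing evidence"
    NEEDS_REPRODUCTION = "Needs reproduction"
    NEEDS_THIRD_PARTY_ATTACK = "Needs third-party attack"
    MISSING_DOI_HASH = "Missing DOI/hash"
    BOUNDARY_RISK = "Boundary risk"


# Archive/checksum states the ledger treats as unresolved.
UNRESOLVED_STATES = {"pending", "stale", "unclear"}


def classify_gaps(claim):
    """Map a hypothetical claim record onto the ledger's gap classes."""
    gaps = []
    # Assumption: lacking either the artifact or the manifest row is a gap.
    if not claim.get("public_artifact") or not claim.get("manifest_row"):
        gaps.append(GapClass.MISSING_EVIDENCE)
    if not claim.get("reproduction_route"):
        gaps.append(GapClass.NEEDS_REPRODUCTION)
    if claim.get("needs_outside_critique") and not claim.get("outside_critique"):
        gaps.append(GapClass.NEEDS_THIRD_PARTY_ATTACK)
    # A status that was never recorded is treated as "unclear".
    if (claim.get("doi_status", "unclear") in UNRESOLVED_STATES
            or claim.get("hash_status", "unclear") in UNRESOLVED_STATES):
        gaps.append(GapClass.MISSING_DOI_HASH)
    if not claim.get("does_not_claim") or not claim.get("downgrade_trigger"):
        gaps.append(GapClass.BOUNDARY_RISK)
    return gaps


# Hypothetical record; "verified" is an illustrative status value.
claim = {
    "public_artifact": True, "manifest_row": True,
    "reproduction_route": None,
    "doi_status": "pending", "hash_status": "verified",
    "does_not_claim": "deployment capability",
    "downgrade_trigger": "replay fails",
}
print([gap.value for gap in classify_gaps(claim)])
# ['Needs reproduction', 'Missing DOI/hash']
```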

[Image: Boundary filter visual blocking misleading progress claims and allowing only validated public evidence.]

Digest Boundary

Do not package gaps as progress.

A weekly digest is useful only if it refuses inflation. It should not turn a missing artifact into momentum, a spec into deployed capability, a pending upload into completed verification, or a negative-result boundary into a hidden performance claim.

This public artifact excludes protected operational systems, customer data, real execution logs, sensitive instructions, protected orchestration, commercial schedulers, account state, financial execution details, alpha hints, and trading advice.

Attack Priority

Five gaps should be challenged first.

| Priority | Target | Attack question | Expected repair |
| 1 | P23 | Does dry-run replay support any stronger self-modification claim? | Downgrade to bounded replay unless stronger public evidence exists. |
| 2 | P28 / P29 | Do drift and relabel bridges rely on private or unsupported context? | Mark public-source, synthetic, heldout, or private-excluded scope explicitly. |
| 3 | P30 / P31 | Do proof and honesty artifacts support protocol-stage claims only? | Separate proof transcript, verifier, and honesty-bound evidence tiers. |
| 4 | P32-P40 | Do protocol gates imply deployment, legal, ethics, diagnostic, or governance authority? | Add authority-leak and diagnostic-overclaim boundaries. |
| 5 | F1/P8 | Does the public note accidentally imply alpha, strategy, account state, or trading advice? | Keep no-alpha, no-trade, no-advice framing visible. |
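The priority order above can be expressed as a lookup that sorts a queue of paper IDs. This is a sketch under assumptions: the names `ATTACK_PRIORITY`, `priority_of`, and `order_queue` are hypothetical, and placing unlisted papers last is a design choice made here, not something the digest specifies. Note that W0 is part of the matrix but does not appear in the priority table.

```python
# Priority groups as given in the Attack Priority table.
ATTACK_PRIORITY = {
    1: ["P23"],
    2: ["P28", "P29"],
    3: ["P30", "P31"],
    4: [f"P{n}" for n in range(32, 41)],  # P32 through P40
    5: ["F1/P8"],
}


def priority_of(paper_id):
    """Return the attack priority for a paper, or None if it is unlisted."""
    for priority, papers in sorted(ATTACK_PRIORITY.items()):
        if paper_id in papers:
            return priority
    return None


def order_queue(paper_ids):
    """Sort paper IDs by attack priority; unlisted papers go last."""
    return sorted(
        paper_ids,
        key=lambda p: (priority_of(p) is None, priority_of(p) or 0),
    )


print(order_queue(["P35", "W0", "F1/P8", "P23"]))
# ['P23', 'P35', 'F1/P8', 'W0']
```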

Next Build Queue

6/10 starts the per-paper attack cycle.

The next public work is not a broader story. It is a narrower queue: P23 upload evidence package, P28 drift evidence package, P29 pluralistic evidence package, P30/P31 proof and honesty package, P32/P33 protocol gates, P34-P38 scaffold family, F1/P8 no-trade boundary, and P39/P40 authority and spectral leak gates.

Please attack the unresolved public gaps first. A useful critique names the paper_id, target claim, missing public evidence, reproduction route, boundary effect, and whether the digest has overstated progress.
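One way to make a critique checkable is to give it a fixed shape. The sketch below assumes a hypothetical `Critique` record whose attributes mirror the six items named above; the attribute names and the sample values are illustrative and not a published submission format.

```python
from dataclasses import dataclass, fields


@dataclass
class Critique:
    paper_id: str
    target_claim: str
    missing_public_evidence: str
    reproduction_route: str
    boundary_effect: str
    overstates_progress: bool  # has the digest overstated progress?


def incomplete_fields(critique):
    """List the text fields of a critique that were left blank."""
    blank = []
    for field in fields(critique):
        value = getattr(critique, field.name)
        if isinstance(value, str) and not value.strip():
            blank.append(field.name)
    return blank


# Hypothetical critique with one blank field, for illustration only.
example = Critique(
    paper_id="P23",
    target_claim="dry-run replay supports a stronger self-modification claim",
    missing_public_evidence="",
    reproduction_route="replay the public fixture",
    boundary_effect="downgrade to bounded replay",
    overstates_progress=True,
)
print(incomplete_fields(example))
# ['missing_public_evidence']
```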

[Image: Priority flow visual filtering a large paper queue into five unresolved per-paper attack targets.]

Challenge Packet

Do not ask for private systems. Challenge public claims.

The challenge route remains public-only: claim wording, public evidence link, reproduction route, stronger baseline, metric, freshness, and boundary. Requests for protected operational systems, customer data, execution logs, sensitive instructions, account state, or commercial scheduling do not count as public evidence critique.
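The public-only rule can be illustrated with a naive scope check on an incoming request. This is a deliberately simple sketch: plain substring matching is an assumption made for brevity and would miss paraphrases, and the names `EXCLUDED_TOPICS` and `out_of_scope_topics` are hypothetical.

```python
# Topics the challenge route excludes, as listed in the text.
EXCLUDED_TOPICS = (
    "protected operational systems",
    "customer data",
    "execution logs",
    "sensitive instructions",
    "account state",
    "commercial scheduling",
)


def out_of_scope_topics(request_text):
    """Return the excluded topics that a request mentions verbatim."""
    text = request_text.lower()
    return [topic for topic in EXCLUDED_TOPICS if topic in text]


request = "Please share customer data and execution logs for P28."
print(out_of_scope_topics(request))
# ['customer data', 'execution logs']
```

A request that returns an empty list is not automatically a valid critique; it has only avoided naming the excluded topics.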