Allowed Scope
Five useful ways to attack the work.
Submit only public, non-sensitive examples. Do not include credentials, customer data, private prompts,
live trading logs, unpublished review material, or instructions for causing operational harm.
| Class | Submit | Invalid if | Minimal verification |
| --- | --- | --- | --- |
| Formula counterexample | Formula/protocol id, variable assignment, expected result, observed contradiction. | It only says the formula feels wrong or needs private assumptions. | Replay the assignment; check whether the stated rule, inequality, gate, or invariant fails. |
| Data leakage | Dataset row ids, split ids, hashes, duplicate evidence, future timestamp, or provenance mismatch. | It requires private data, protected logs, or guesses about hidden pipelines. | Recompute split/provenance checks on public artifacts; confirm whether label, future, or duplicate leakage exists. |
| Stronger baseline | Baseline description, public code/command, same task set, same metric, seeds, and result table. | The baseline changes the task, metric, data boundary, or allowed information. | Run the baseline under the same scoring contract; compare effect size and confidence interval. |
| Repro failure | Command, environment, artifact version, expected output, observed output, and minimal failing case. | It omits the command, uses private dependencies, or reports only a screenshot without replay details. | Run the stated command from a clean checkout or artifact package; confirm the mismatch. |
| Claim-boundary overreach | Exact sentence, claimed scope, supporting evidence, missing evidence, and proposed narrower wording. | It attacks a claim the project does not make or asks for private disclosure. | Trace claim -> artifact -> metric -> limitation; decide whether wording, evidence, or boundary must change. |
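As an illustration of the minimal verification for the data leakage class, the sketch below recomputes two of the checks named in the table on public rows: duplicate content across splits (by hash) and training rows dated after an evaluation cutoff. The row layout (`id`, `text`, `timestamp`), the normalization rule, the cutoff date, and the function names are all assumptions made for this example; they are not the project's actual schema or tooling.

```python
import hashlib
from datetime import datetime


def row_hash(text: str) -> str:
    """Hash lowercased, whitespace-normalized content so trivial edits still collide."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate_leakage(train_rows, test_rows):
    """Return (train_id, test_id) pairs whose normalized content hashes match."""
    train_by_hash = {}
    for row in train_rows:
        train_by_hash.setdefault(row_hash(row["text"]), []).append(row["id"])
    pairs = []
    for row in test_rows:
        for train_id in train_by_hash.get(row_hash(row["text"]), []):
            pairs.append((train_id, row["id"]))
    return pairs


def find_future_leakage(train_rows, cutoff):
    """Return ids of training rows timestamped after the evaluation cutoff."""
    return [
        row["id"]
        for row in train_rows
        if datetime.fromisoformat(row["timestamp"]) > cutoff
    ]


# Hypothetical public rows, invented purely for illustration.
train = [
    {"id": "tr-1", "text": "The cat sat.", "timestamp": "2023-01-05T00:00:00"},
    {"id": "tr-2", "text": "Rates rose in March.", "timestamp": "2023-06-01T00:00:00"},
]
test = [
    {"id": "te-1", "text": "the  cat sat.", "timestamp": "2023-04-01T00:00:00"},
    {"id": "te-2", "text": "A new sentence.", "timestamp": "2023-04-02T00:00:00"},
]
cutoff = datetime(2023, 3, 31)  # assumed evaluation cutoff

duplicates = find_duplicate_leakage(train, test)
future_ids = find_future_leakage(train, cutoff)
print(duplicates)  # [('tr-1', 'te-1')]
print(future_ids)  # ['tr-2']
```

A submission in this class would pair output like this with the row ids, split ids, and hashes it was computed from, so a reviewer can rerun the same check on the public artifacts.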