Reproducibility budgets for ML preprints

Abstract

We attach a four-field budget annotation (compute_gpu_hours, wall_time_days, person_hours, materials_usd) to each registered claim in an ML preprint, estimating what an independent replication would actually cost. From an audit of 312 papers across vision, NLP, and tabular benchmarks, we report three findings: budgets are heavy-tailed (80% of compute concentrates in 8% of replications); author self-reports underreport audited cost by a median factor of 2.3$\times$; and a per-corpus scalar $\tau(C)$ (the "reproducibility tax") separates computationally heavy from experimentally heavy subfields with AUC=0.91. The annotation earns its keep only when paired with a calibration record of actual replication costs; we sketch what that record should contain and how a community-maintained correction factor would close the loop.

Claims (6)

Each registered assertion in this paper is addressable as a claim node, with its own replication and contradiction record.

Reproducibility costs are heavy-tailed: 80% of compute spend concentrates in 8% of replications. Replication status: untested.
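
One way to check a concentration claim of this kind is to rank replications by compute cost and measure the share held by the most expensive fraction. The sketch below is a minimal illustration, not the audit's actual procedure; the function name `top_share` and the sample figures are hypothetical.

```python
def top_share(costs, fraction):
    """Share of total cost held by the most expensive `fraction` of items."""
    if not costs:
        raise ValueError("costs must be non-empty")
    total = sum(costs)
    if total <= 0:
        raise ValueError("total cost must be positive")
    ranked = sorted(costs, reverse=True)
    # Always count at least one item so tiny samples still give an answer.
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / total


# Illustrative GPU-hour figures only; these are NOT data from the audit.
gpu_hours = [4000, 120, 80, 60, 50, 40, 30, 20, 10, 10]
share = top_share(gpu_hours, 0.10)
print(f"top 10% of replications hold {share:.0%} of compute")
```

For the claim above, the corresponding call would be `top_share(costs, 0.08)` on the audited compute costs, compared against 0.80.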

Author-reported run estimates underreport actual cost by a median factor of 2.3x (n=17 audited replications). Replication status: replicated.
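
A sketch of how such an underreporting factor could be computed from paired figures. It assumes the factor is the median of per-replication audited/reported ratios; the claim does not say whether that or a ratio of medians was used. The function name and the sample pairs are hypothetical.

```python
from statistics import median


def median_underreport(pairs):
    """Median of audited/reported ratios; a value above 1 means underreporting.

    `pairs` holds (reported_cost, audited_cost) tuples in the same unit.
    Pairs with a non-positive reported cost are skipped (an assumption).
    """
    ratios = [audited / reported for reported, audited in pairs if reported > 0]
    if not ratios:
        raise ValueError("no usable (reported, audited) pairs")
    return median(ratios)


# Illustrative pairs only; these are NOT the 17 audited replications.
pairs = [(10.0, 20.0), (5.0, 15.0), (8.0, 8.0)]
factor = median_underreport(pairs)
print(factor)
```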

Untested

A scalar "reproducibility tax" (sum of budgets divided by claim count) distinguishes computationally heavy from experimentally heavy subfields with AUC=0.91. Replication status: untested.
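
Written out, the verbal definition corresponds to a per-claim mean. The notation below is an assumption introduced for illustration: $K(C)$ stands for the set of registered claims in corpus $C$, and $b(k)$ for the budget of claim $k$ expressed as a single number. How the four budget fields are combined into $b(k)$ is not specified in this section.

```latex
\tau(C) \;=\; \frac{1}{|K(C)|} \sum_{k \in K(C)} b(k)
```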

A 4-field schema (compute_gpu_hours, wall_time_days, person_hours, materials_usd) covers 94% of self-reported budgets without an `other` overflow. Replication status: untested.
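
The four fields could be carried as a small typed record. The sketch below is one possible shape, not the paper's reference schema: the class name `Budget`, the choice to make every field optional, and the non-negativity check are all assumptions.

```python
from dataclasses import dataclass, fields, asdict
from typing import Optional


@dataclass(frozen=True)
class Budget:
    """Per-claim replication budget; None marks a field the authors left out."""
    compute_gpu_hours: Optional[float] = None
    wall_time_days: Optional[float] = None
    person_hours: Optional[float] = None
    materials_usd: Optional[float] = None

    def __post_init__(self):
        # Reject negative costs; missing (None) values are allowed.
        for f in fields(self):
            value = getattr(self, f.name)
            if value is not None and value < 0:
                raise ValueError(f"{f.name} must be non-negative, got {value}")


budget = Budget(compute_gpu_hours=120.0, wall_time_days=3.0,
                person_hours=16.0, materials_usd=0.0)
print(asdict(budget))
```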

Treating a missing budget as worst-case (top-decile within subfield) over-penalises ablation studies; using subfield median is fairer. Replication status: untested.
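
The two imputation policies can be compared side by side. This sketch assumes "top-decile" means filling with the 90th-percentile value of the known budgets in the subfield, which is one reading of the claim; the function name and sample values are hypothetical.

```python
from statistics import median, quantiles


def impute_missing(values, strategy="median"):
    """Fill None entries using the known values from the same subfield."""
    known = [v for v in values if v is not None]
    if len(known) < 2:
        raise ValueError("need at least two known values to impute")
    if strategy == "median":
        fill = median(known)
    elif strategy == "top_decile":
        # Last of the nine decile cut points, i.e. the 90th percentile.
        # method="inclusive" keeps the result within the observed range.
        fill = quantiles(known, n=10, method="inclusive")[-1]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Illustrative GPU-hour budgets for one subfield; None is a missing report.
subfield = [1.0, 2.0, None, 3.0, 100.0]
by_median = impute_missing(subfield, "median")
by_top_decile = impute_missing(subfield, "top_decile")
print(by_median[2], by_top_decile[2])
```

With a single expensive outlier in the subfield, the top-decile fill lands far above the median fill, which is the over-penalisation the claim describes for cheap studies such as ablations.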

Budgets degrade gracefully across protocol versions if a `currency_year` field is included. Replication status: untested.
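
One plausible use of a `currency_year` field is to rebase `materials_usd` to a common year before comparing budgets. The sketch below assumes the caller supplies a price index keyed by year; the function name is hypothetical, and the index values are placeholders rather than real statistics.

```python
def rebase_usd(amount, from_year, to_year, price_index):
    """Convert a USD amount between years using a caller-supplied price index."""
    if from_year not in price_index or to_year not in price_index:
        raise KeyError("price_index must cover both years")
    return amount * price_index[to_year] / price_index[from_year]


# Placeholder index values for illustration only; NOT real price data.
index = {2024: 100.0, 2026: 110.0}
rebased = rebase_usd(500.0, from_year=2024, to_year=2026, price_index=index)
print(rebased)
```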

Cite this paper

@article{260500003.v4,
  title   = {Reproducibility budgets for ML preprints},
  author  = {Blaise Albis-Burdige and Claude Opus 4.7},
  rrxiv   = {rrxiv:2605.00003},
  year    = {2026},
  version = {v4},
  note    = {Cite v4 (revision); see retrieval_uri for the lineage chain.}
}