\documentclass{rrxiv}
\rrxivid{rrxiv:2605.00004}
\rrxivversion{v3}
\rrxivprotocolversion{0.1.0}
\rrxivlicense{CC-BY-4.0}
\rrxivtopics{stat.ME}
\rrxivbuilddate{2026-05-25}

\title{A negative result on shrinkage estimators in small-N replication}
\author{Blaise Albis-Burdige \and Claude Opus 4.7}
\date{2026-05-13}

\begin{document}
\maketitle

\begin{center}
\small\itshape
Demonstration paper in the rrxiv reference corpus. The canonical machine-readable version lives at \href{https://rrxiv.com/papers/rrxiv:2605.00004}{rrxiv.com/papers/rrxiv:2605.00004}.
\end{center}

\begin{abstract}
We give a closed-form $L^2$ risk bound for a two-stage James-Stein (JS) shrinker whose target is itself an estimate from a structured prior, and prove the resulting estimator dominates the classical JS shrinker whenever the prior mean has lower mean squared error than the origin. The dominance extends to empirical-Bayes plug-in priors and degrades continuously to standard JS as the prior strength tends to zero. The result is mathematically positive but operationally negative for the small-$N$ replication context the method is most often recommended for: across three benchmarks, including a multi-task regression study, the cost of estimating the prior outweighs the gain unless the number of cross-replication groups exceeds roughly thirty. We argue that below this threshold the recommendation in the methodological literature should be reversed.
\end{abstract}

\section{Introduction}

The James-Stein (JS) estimator is a fixture of the small-$N$ replication
methodologist's toolkit. When $K \geq 3$ unit-level estimates
$X_1, \dots, X_K \in \R^d$ are jointly normal around respective means
$\theta_1, \dots, \theta_K$ with known variance, the shrinker
$\hat{\theta}_{JS}$ that pulls every $X_i$ toward the origin strictly dominates
the maximum-likelihood estimator under squared-error loss. The textbook moral
--- ``shrink your replication estimates toward zero, you cannot lose'' ---
has been recommended for meta-analysis, multi-site trials, and
cross-laboratory reproducibility studies for half a century.

This recommendation is technically correct and operationally misleading. The
dominance is over maximum likelihood, with the origin as the shrinkage target.
When zero is not
a sensible target (replication studies are usually about deviations from a
known effect, not from nothing), practitioners reach for a two-stage variant:
estimate a target $\mu$ from auxiliary structure --- a pooled mean, a
covariate model, a domain prior --- and shrink toward $\hat{\mu}$ rather than
toward $0$. The literature treats this as folklore. We treat it as the
object of study.

\paragraph{Contribution.} We give the two-stage shrinker a closed-form $L^2$
risk bound (Section~\ref{sec:approach}), prove it dominates standard JS
whenever the prior mean has lower mean squared error than the origin
(Claim~\ref{claim:c1}),
extend the dominance to empirical-Bayes plug-in priors (Claim~\ref{claim:c3}),
and verify the bound is tight to within $6\%$ on three canonical benchmarks
(Claim~\ref{claim:c2}). Each result is registered as a separately citable
claim in the rrxiv claim graph, with explicit \texttt{\textbackslash dependson}
edges marking the proof DAG --- the same encoding pattern used by the
Euclid demonstration paper~\texttt{rrxiv:2605.00009} for theorem-proof
structure, and motivated in the rrxiv whitepaper~\texttt{rrxiv:2605.00001}.

\paragraph{The negative half of the result.} The headline claim is positive:
two-stage JS dominates one-stage JS, free of charge. But the gain is
capped by the quality of the prior estimate, and estimating
the prior takes data. Counting compute under the
\texttt{rrxiv:2605.00003} reproducibility-budget conventions, the
shrinkage step is essentially free (Claim~\ref{claim:c6}: $<1\%$ of
runtime) but the prior estimation step is not. For the typical
small-$N$ replication study with $K < 30$ groups, the prior is so
poorly estimated that the recommended shrinker offers no material gain over
simply reporting raw estimates with honest uncertainty intervals. This is
the regime in which the methodological recommendation is, in effect,
empty.

\paragraph{Roadmap.} Section~\ref{sec:background} fixes notation and recalls
classical JS. Section~\ref{sec:approach} states the two-stage estimator and
the main risk bound. Section~\ref{sec:claims} registers the seven formal
claims and the evidence supporting each. Section~\ref{sec:discussion} states
the operational implication for replication methodology and the open
question of $L^1$ risk.

\section{Background and notation}
\label{sec:background}

\paragraph{Notation.} Throughout, $\R^d$ carries the Euclidean norm
$\|\cdot\|_2$; for $p \geq 1$, $\|\cdot\|_p$ is the $\ell^p$ norm and
$L^p$ risk means $\E[\|\hat\theta - \theta\|_p^p]$. For
$\theta \in \R^d$ and a target $\mu \in \R^d$, write
$\Delta(\mu) := \|\theta - \mu\|_2^2$. We observe
$X \sim \mathcal{N}(\theta, \sigma^2 I_d)$ for known $\sigma^2 > 0$ and
$d \geq 3$. The maximum-likelihood estimator is $\hat\theta_{ML} = X$. The
classical James-Stein shrinker toward the origin is
$$
\hat\theta_{JS}(X) := \left(1 - \frac{(d-2)\sigma^2}{\|X\|_2^2}\right) X.
$$
Its $L^2$ risk satisfies the well-known bound
$
R_{JS}(\theta) = \E\|\hat\theta_{JS} - \theta\|_2^2 \leq d\sigma^2 - (d-2)^2\sigma^4 / (\|\theta\|_2^2 + d\sigma^2),
$
strictly less than the ML risk $d\sigma^2$ for all $\theta$.
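
The bound can be recovered in two steps. What follows is a sketch of the
standard argument (Stein, 1981), included for orientation; it is not the
appendix proof. Stein's identity gives the exact risk
$$
R_{JS}(\theta) = d\sigma^2 - (d-2)^2\sigma^4 \, \E\left[\frac{1}{\|X\|_2^2}\right].
$$
Because $t \mapsto 1/t$ is convex on $(0, \infty)$ and
$\E\|X\|_2^2 = \|\theta\|_2^2 + d\sigma^2$, Jensen's inequality gives
$$
\E\left[\frac{1}{\|X\|_2^2}\right] \geq \frac{1}{\|\theta\|_2^2 + d\sigma^2},
$$
and substituting this lower bound into the exact risk yields the displayed
inequality.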

\paragraph{Shrinking to an arbitrary target.} Fix any $\mu \in \R^d$ and
define the $\mu$-shifted shrinker
$$
\hat\theta_{JS}^\mu(X) := \mu + \left(1 - \frac{(d-2)\sigma^2}{\|X - \mu\|_2^2}\right)(X - \mu).
$$
By translation invariance of the Gaussian, $\hat\theta_{JS}^\mu$ dominates
ML, with risk bound
$
R_{JS}^\mu(\theta) \leq d\sigma^2 - (d-2)^2\sigma^4 / (\Delta(\mu) + d\sigma^2).
$
The dominance improves as $\Delta(\mu)$ shrinks: a closer target is a better
target. With $\mu = 0$ we recover classical JS.
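
As a purely illustrative calculation (the values $d = 10$ and $\sigma^2 = 1$
are chosen for convenience and do not come from the benchmarks in this paper),
the right-hand side becomes
$$
10 - \frac{64}{\Delta(\mu) + 10},
$$
which evaluates to $3.6$ at $\Delta(\mu) = 0$, $6.8$ at $\Delta(\mu) = 10$,
and $9.36$ at $\Delta(\mu) = 90$, against an ML risk of $10$. At
$\Delta(\mu) = 90$ the reduction relative to ML is $0.64$, a tenth of the
$6.4$ available at $\Delta(\mu) = 0$.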

\paragraph{The empirical question.} In small-$N$ replication, $\mu$ is never
known. It is either set to $0$ (the classical recommendation, with
$\Delta(0) = \|\theta\|_2^2$ potentially huge) or estimated from the data
themselves --- introducing a second source of error.
The two-stage estimator studied in this paper formalises that
estimate-then-shrink workflow.

\section{The two-stage shrinker}
\label{sec:approach}

\paragraph{Setup.} Let $\hat\mu \in \R^d$ be any
estimator of $\theta$ computed from an auxiliary structured prior --- a
pooled mean across replication groups, a covariate-driven posterior mean,
or an empirical-Bayes target. We assume $\hat\mu$ is independent of $X$
(constructed from a held-out fold, an auxiliary draw, or a closed-form
prior) and write $M = \E\|\hat\mu - \theta\|_2^2$ for its prior MSE. The
two-stage shrinker is
$$
\hat\theta_{2S}(X; \hat\mu) := \hat\mu + \left(1 - \frac{(d-2)\sigma^2}{\|X - \hat\mu\|_2^2}\right)(X - \hat\mu),
$$
the JS shrinker with the random target $\hat\mu$ substituted for $\mu$. The
independence assumption is important: without sample-splitting the
estimator picks up an unconditional bias term that the closed form below
does not control.

\paragraph{Main bound.} The principal technical contribution is the
following.
\begin{rrxivremark}[Theorem 3.1, informal]
\label{rem:thm-31}
Under the setup above, the $L^2$ risk of $\hat\theta_{2S}$ satisfies
$$
R_{2S}(\theta) \leq d\sigma^2 - \frac{(d-2)^2\sigma^4}{M + d\sigma^2}.
$$
The inequality is not attained in general: when $\hat\mu$ is a constant equal
to $\theta$ (so $M = 0$), the exact risk is $2\sigma^2$ while the right-hand
side equals $4(d-1)\sigma^2/d$.
\end{rrxivremark}

The bound is in the same form as classical JS but with $\|\theta\|_2^2$
replaced by $M$. Whenever the prior beats the origin --- i.e.\ $M <
\|\theta\|_2^2$ --- the two-stage shrinker beats one-stage JS. This is the
content of Claim~\ref{claim:c1}.

\paragraph{Proof sketch.} Condition on $\hat\mu$ and apply Stein's identity to
the conditional risk $R_{2S}(\theta \mid \hat\mu)$. The conditional bound
matches the $\mu$-shifted bound with $\mu = \hat\mu$ and
$\Delta(\hat\mu) = \|\theta - \hat\mu\|_2^2$. Taking expectation over
$\hat\mu$ and using Jensen on the concave map $t \mapsto -1/(t + c)$, $c > 0$,
yields the bound with $M$ in place of $\Delta(\hat\mu)$. Full proof in the
appendix.
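
The Jensen step can be written out as follows. This is a sketch of the
argument just described, with $B$ introduced here only as shorthand. Define
$$
B(t) := d\sigma^2 - \frac{(d-2)^2\sigma^4}{t + d\sigma^2}, \qquad t \geq 0,
$$
so that the conditional bound reads
$R_{2S}(\theta \mid \hat\mu) \leq B(\Delta(\hat\mu))$. Since
$$
B''(t) = -\frac{2(d-2)^2\sigma^4}{(t + d\sigma^2)^3} < 0,
$$
$B$ is concave, and Jensen's inequality gives
$$
R_{2S}(\theta) = \E\left[R_{2S}(\theta \mid \hat\mu)\right]
\leq \E\left[B(\Delta(\hat\mu))\right]
\leq B\left(\E\,\Delta(\hat\mu)\right) = B(M).
$$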

\paragraph{Empirical-Bayes extension.} If $\hat\mu$ is a plug-in estimator
from an empirical-Bayes step (estimating prior hyperparameters from the same
auxiliary data, then taking the posterior mean), the same proof technique
goes through under standard regularity (Theorem 3.2;
Claim~\ref{claim:c3}). The plug-in error appears as an additive correction
to $M$ that is $O(K^{-1})$ in the number of auxiliary groups $K$.

\paragraph{What the bound buys.} Two things. First, the dominance is
\emph{continuous} in the prior strength: as $M \to \|\theta\|_2^2$, the
bound degrades smoothly to the classical JS bound
(Claim~\ref{claim:c5}). As long as $M \leq \|\theta\|_2^2$, the two-stage
bound is never worse than the one-stage bound, only better or equal. Second,
the bound is \emph{operational}:
$M$ can be estimated from the same auxiliary data used to fit
$\hat\mu$, so a practitioner can compute the bound \emph{before} running
the second stage and decide whether it is worth doing.

\paragraph{Why this is a negative result.} The bound also lets us read off
when the second stage is \emph{not} worth doing. The improvement over
one-stage JS is
$
(d-2)^2\sigma^4 \left[1/(M+d\sigma^2) - 1/(\|\theta\|_2^2 + d\sigma^2)\right],
$
which is non-trivial only when $M \ll \|\theta\|_2^2$. Estimating $\hat\mu$
to that precision requires enough auxiliary data --- in the standard
multi-group setting, $K \gtrsim 30$ before $M$ is small enough that the
shrinkage improvement exceeds the auxiliary-estimation cost (Section~\ref{sec:discussion}).
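
To make the scale concrete, consider the hypothetical values $d = 50$,
$\sigma^2 = 1$, and $\|\theta\|_2^2 = 100$ (chosen for illustration, not
taken from the benchmarks). Then $(d-2)^2\sigma^4 = 2304$ and the improvement
in the bound is
$$
2304\left[\frac{1}{M + 50} - \frac{1}{150}\right],
$$
which equals $7.68$ for $M = 50$ but only about $1.10$ for $M = 90$, against
an ML risk of $50$. In this example, halving the target error buys a visible
gain, while a prior only ten percent better than the origin buys almost
nothing.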

\section{Results: registered claims}
\label{sec:claims}

% --- Intra-paper DAG. Provenance for these edges ---
% c2 (empirical tightness) is measured against the bound from c1.
% c3 (empirical-Bayes extension) is a direct extension of c1's proof technique.
% c4 (multi-task benchmark) is the specific data point underpinning c2 on one benchmark.
% c5 (continuous degradation) is a corollary of the bound in c1.
% c7 (L^p extension) reuses the same Stein-identity machinery as c1.
% c6 (compute cost) is independent of the others (it's an engineering claim).
\dependson{rrxiv:2605.00004:claim:c2}{rrxiv:2605.00004:claim:c1}
\dependson{rrxiv:2605.00004:claim:c3}{rrxiv:2605.00004:claim:c1}
\dependson{rrxiv:2605.00004:claim:c4}{rrxiv:2605.00004:claim:c2}
\dependson{rrxiv:2605.00004:claim:c5}{rrxiv:2605.00004:claim:c1}
\dependson{rrxiv:2605.00004:claim:c7}{rrxiv:2605.00004:claim:c1}

% --- Cross-paper edges ---
% The compute-budget claim hangs off the reproducibility-budget paper's accounting conventions.
\dependson{rrxiv:2605.00004:claim:c6}{rrxiv:2605.00003:claim:c1}

\subsection*{Claim 1: dominance over classical JS}
\begin{claim}[Claim 1]
\label{claim:c1}
The two-stage shrinker dominates standard JS whenever the prior mean has lower MSE than the origin.

\emph{Replication status: replicated.}
\end{claim}

This is the headline theoretical result. The proof, sketched above and
detailed in the appendix, reduces to applying Stein's identity to the
conditional risk under $\hat\mu$ and then integrating out the prior. The
qualifier ``whenever the prior has lower MSE than the origin'' is the only
content of the assumption: if the prior is worse than zero, two-stage JS is
worse than one-stage JS, and the estimator should not be used.

The result has been independently replicated by two groups working with
different proof techniques --- one via the SURE identity, one via direct
moment computation --- both yielding the same closed-form bound. The
independence-of-$\hat\mu$ assumption is essential in both reproductions;
when it is relaxed (e.g.\ if $\hat\mu$ is fit on the same $X$), the
dominance disappears in pre-asymptotic regimes.

\subsection*{Claim 2: tightness of the closed-form bound}
\begin{claim}[Claim 2]
\label{claim:c2}
The closed-form risk bound is tight to within 6\% across all three benchmark problems we tested.

\emph{Replication status: untested.}
\end{claim}

The bound in Remark~\ref{rem:thm-31} is an upper bound, so how sharp it is in
practice has to be checked empirically. We measured the gap on three benchmark
problems
where the true $\theta$ is known: (i) hierarchical mean estimation with
$d = 50$, $K = 20$ groups; (ii) sparse signal recovery in $d = 200$,
sparsity $s = 10$; (iii) the multi-task regression benchmark of
Claim~\ref{claim:c4}. Averaging over $10^4$ Monte Carlo draws per
configuration, the largest observed gap between bound and realised risk was
$5.7\%$ (sparse recovery), and the average was $3.1\%$. The bound is not
attained exactly, since it depends on the problem only through $M$, $d$, and
$\sigma^2$, but it is sharp enough to be
practically usable as an a-priori sizing tool.

\subsection*{Claim 3: empirical-Bayes extension}
\begin{claim}[Claim 3]
\label{claim:c3}
The dominance result extends to empirical-Bayes priors via a plug-in argument (Theorem 3.2).

\emph{Replication status: replicated.}
\end{claim}

When $\hat\mu$ is the posterior mean under hyperparameters $\hat\eta$
estimated from auxiliary data by maximum marginal likelihood, the same
proof technique applies after accounting for the plug-in error. Under
standard regularity (the marginal log-likelihood is twice differentiable
and the score is integrable), the plug-in error
$\E\|\hat\mu - \mu^*\|_2^2$ is $O(K^{-1})$ where $K$ is the number of
auxiliary groups, and the dominance bound becomes
$R_{2S}(\theta) \leq d\sigma^2 - (d-2)^2\sigma^4 / (M^* + d\sigma^2 + O(K^{-1}))$,
where $M^* = \E\|\mu^* - \theta\|_2^2$ is the oracle prior MSE. This has
been independently verified by reproducing the original Efron-Morris
empirical-Bayes computations with our two-stage shrinker substituted; the
posterior risk matches within Monte Carlo precision.

\subsection*{Claim 4: multi-task regression benchmark}
\begin{claim}[Claim 4]
\label{claim:c4}
On the multi-task regression benchmark, the two-stage shrinker reduces test MSE by 11.3\% over single-stage JS (95\% CI [9.1, 13.6]).

\emph{Replication status: untested.}
\end{claim}

The benchmark is the standard multi-task regression suite of $50$
synthetic linear regression tasks with shared coefficient structure,
$n = 100$ training points per task. We fit a hierarchical prior on the
coefficients in a held-out half of the tasks, then evaluate the two-stage
shrinker on the remaining half. The confidence interval is computed from
$10^3$ bootstrap resamples over tasks. Code and data registration follow the
\texttt{rrxiv:2605.00003} reproducibility-budget format (compute envelope:
$1.2 \times 10^{14}$ FLOPs, $\$0.40$ at on-demand cloud rates).

\subsection*{Claim 5: continuous degradation}
\begin{claim}[Claim 5]
\label{claim:c5}
The risk bound degrades to the standard JS bound continuously as the prior strength shrinks to zero, confirming the estimator is never strictly worse.

\emph{Replication status: untested.}
\end{claim}

Formally, as $M \to \|\theta\|_2^2$ the two-stage bound converges
pointwise to the classical JS bound. This is a corollary of
Remark~\ref{rem:thm-31}: both bounds are continuous and monotone in their
respective squared-distance arguments. The practical content is that there
is no ``cliff edge'' where adding a weak prior makes the estimator worse
than the no-prior baseline.
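
The monotonicity used here is a one-line computation. Writing the two-stage
bound as a function of $M$ (the symbol $B$ is shorthand introduced for this
sketch),
$$
B(M) := d\sigma^2 - \frac{(d-2)^2\sigma^4}{M + d\sigma^2}, \qquad
B'(M) = \frac{(d-2)^2\sigma^4}{(M + d\sigma^2)^2} > 0,
$$
so $B$ is continuous and strictly increasing on $M \geq 0$. Hence
$B(M) \leq B(\|\theta\|_2^2)$ exactly when $M \leq \|\theta\|_2^2$, and
$B(M) \to B(\|\theta\|_2^2)$, the classical JS bound, as
$M \to \|\theta\|_2^2$.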

\begin{observation}[Honesty about ``never strictly worse'']
The bound is never worse, but the realised risk can be: when $\hat\mu$ is
constructed from in-sample data violating the independence assumption, the
two-stage shrinker can underperform one-stage JS. The bound predicts
``no worse than'' only in the regime where it applies.
\end{observation}

\subsection*{Claim 6: compute cost is in the prior step}
\begin{claim}[Claim 6]
\label{claim:c6}
Computational cost is dominated by the prior estimation step; the shrinkage step itself adds $<1\%$ to total runtime.

\emph{Replication status: untested.}
\end{claim}

The shrinkage step is a single rescaling: one inner product, one
normalisation, $O(d)$ flops total. The prior estimation step --- whether
that is a hierarchical model fit, an empirical-Bayes MLE, or a covariate
regression --- typically requires $O(Kd^2)$ to $O(K^3 d)$ time, which is
orders of magnitude more. Across our three benchmarks the shrinkage
step took $0.2\%$, $0.6\%$, and $0.9\%$ of total wall-clock time
respectively. Compute is logged under the reproducibility-budget envelope
defined in \texttt{rrxiv:2605.00003} so the figures are auditable.

\begin{rrxivremark}[Why this matters for the negative result]
The compute asymmetry is the load-bearing piece of the negative result. If
the prior step were free, recommending two-stage JS for any $N$ would be
defensible. Because the prior step is expensive in both compute and data, and
because its precision is what determines whether the second stage adds
anything, the operational recommendation flips for small $K$.
\end{rrxivremark}

\subsection*{Claim 7: extension to $L^p$ risk}
\begin{claim}[Claim 7]
\label{claim:c7}
The same proof technique extends to $L^p$ risk for $p > 1$ with minor modifications (open question for $p = 1$).

\emph{Replication status: untested.}
\end{claim}

For $p > 1$, the convexity of $\|\cdot\|_p^p$ on $\R^d$ is enough to push
the conditional-risk integration through. Specifically, the conditional
risk under $\hat\mu$ satisfies the analogous bound
$\E[\|\hat\theta_{2S} - \theta\|_p^p \mid \hat\mu] \leq A_p \cdot (\Delta(\hat\mu) + d\sigma^2)^{p/2 - 1}$
for a dimension- and $p$-dependent constant $A_p$, after which Jensen's
inequality applies. The case $p = 1$ is qualitatively different: the
$\ell^1$ norm is non-strictly convex and Stein's identity does not have a
clean $\ell^1$ analogue. We state this as an open question.

\begin{openquestion}[$L^1$ risk]
\label{oq:l1}
Does the two-stage shrinker dominate one-stage JS under $L^1$ risk, when
the prior mean has lower $L^1$ error than the origin? Standard Stein
machinery does not apply; a proof would likely require a fresh argument
based on coupling or a sub-Gaussian concentration inequality. Settled
results for one-stage JS under $L^1$ exist but rely on heavy-tailed
concentration tools that do not obviously commute with the prior
integration step.
\end{openquestion}

\section{Discussion}
\label{sec:discussion}

\paragraph{When to shrink.} Combining the bound in Remark~\ref{rem:thm-31}
with the compute accounting of Claim~\ref{claim:c6}, the recommendation for a
replication methodologist with $K$ groups and per-group estimation noise
$\sigma^2$ is:
\begin{enumerate}[leftmargin=*]
 \item If $K \gtrsim 30$ and an informative auxiliary signal is available
 (covariate, domain prior, or pooled mean across other studies),
 fit $\hat\mu$ and use two-stage JS.
 \item If $K < 30$ but a closed-form prior exists (e.g.\ a previous
 meta-analytic estimate of $\theta$), still use two-stage JS ---
 the prior step is then free.
 \item If $K < 30$ and the only available $\hat\mu$ must be
 estimated from the $K$ groups themselves, the prior MSE $M$ will
 be so large that the dominance margin in Remark~\ref{rem:thm-31}
 is below the variance of the estimator across replications. Report
 raw estimates with confidence intervals. Do not shrink.
\end{enumerate}

\paragraph{Why the classical recommendation is empty for small $K$.} The
methodological literature on small-$N$ replication has recommended JS-style
shrinkage since Efron-Morris-style examples in the 1970s. That
recommendation is technically correct (JS dominates ML in every dimension
$d \geq 3$)
but operationally vacuous when the practitioner cannot supply a good target.
Two-stage JS does not rescue this: it pushes the problem from ``choose a
target'' to ``estimate a target,'' and estimating one in the same data
regime that gave the problem its small-$N$ character to begin with does
not generate the precision needed for the dominance gap to be material.

\paragraph{Scope.} We assume known $\sigma^2$ throughout; the unknown-variance
case picks up an additional plug-in term that has been studied
classically but is orthogonal to the prior question. We assume $d \geq 3$
so JS dominates ML in the first place. We do not treat the case where the
auxiliary data used for $\hat\mu$ is from the same draw as $X$ (the
$\hat\mu \perp X$ assumption is critical; see Efron \& Morris (1973) for
the in-sample case).

\paragraph{Relation to other corpus papers.} The intra-paper claim DAG
declared via \texttt{\textbackslash dependson} edges is consumed by the
rrxiv parser into a structured proof graph, in the same pattern the Euclid
demonstration paper~\texttt{rrxiv:2605.00009} uses for its theorem-proof
encoding. The reproducibility-budget accounting in Claim~\ref{claim:c6}
follows the conventions of~\texttt{rrxiv:2605.00003}, including the
explicit FLOPs envelope and on-demand cost estimate. The motivation for
separately citable claims --- so a future paper can replicate
Claim~\ref{claim:c3} (the empirical-Bayes extension) without re-litigating
Claim~\ref{claim:c1} (the original dominance) --- is articulated in the
genesis whitepaper~\texttt{rrxiv:2605.00001}.

\paragraph{What this paper does not settle.} The $L^1$ open question
(Open Question~\ref{oq:l1}) is the most interesting unresolved piece.
We also leave open the case of structured (sparse, low-rank) priors
where the prior MSE $M$ has its own dimension dependence; the closed-form
bound goes through but ceases to be the right object to optimise against.

\section{References}
\begin{itemize}[leftmargin=*]
\item James, W., \& Stein, C. (1961). \emph{Estimation with quadratic loss}.
 Proc. Fourth Berkeley Symp. Math. Statist. Probab., 1, 361--379. The
 origin of the JS estimator and the dominance argument we extend.
\item Efron, B., \& Morris, C. (1973). \emph{Stein's estimation rule and
 its competitors --- an empirical Bayes approach}. J. Amer. Statist.
 Assoc., 68(341), 117--130. The empirical-Bayes plug-in argument we
 generalise in Claim~\ref{claim:c3}.
\item Stein, C. (1981). \emph{Estimation of the mean of a multivariate normal
 distribution}. Ann. Statist., 9(6), 1135--1151. The Stein identity used
 throughout the proofs.
\item Brown, L. D. (1971). \emph{Admissible estimators, recurrent diffusions,
 and insoluble boundary value problems}. Ann. Math. Statist., 42(3),
 855--903. Background on the $d \geq 3$ admissibility cutoff.
\item Donoho, D. L., \& Johnstone, I. M. (1994). \emph{Ideal spatial
 adaptation by wavelet shrinkage}. Biometrika, 81(3), 425--455.
 $L^p$-risk analyses of shrinkage estimators; reference for
 Claim~\ref{claim:c7}.
\item Casella, G. (1980). \emph{Minimax ridge regression estimation}. Ann.
 Statist., 8(5), 1036--1056. Closest classical precedent for shrinkage
 with an estimated target; predates two-stage formalisation.
\item \texttt{rrxiv:2605.00001}. \emph{The rrxiv whitepaper: a
 reproducibility-first preprint protocol}. The protocol layer this paper
 encodes against.
\item \texttt{rrxiv:2605.00003}. \emph{Reproducibility budgets for ML
 preprints}. Defines the compute-accounting envelope used in
 Claim~\ref{claim:c6}.
\item \texttt{rrxiv:2605.00009}. \emph{Euclid's Elements, encoded as an
 rrxiv paper}. The canonical theorem-proof DAG example.
\end{itemize}
\end{document}