The Duality of Bayesian and Frequentist Statistics
Abstract
We show that Bayesian and frequentist statistics admit a common spectral formulation: after eigendecomposition, both frameworks act on the same mode-level evidence and therefore agree on the central inferential question of model complexity. The shared object is the pair \(\psi_k = (A_k, \sigma^2_k)\) of estimated coefficient and residual uncertainty per eigenmode. That object is developed more directly in the companion theory paper The Spectral Information State; in the present paper it serves as the common substrate from which Bayesian posteriors, frequentist confidence intervals, \(p\)-values, and minimum description lengths are all read out in closed form. The eigendecomposition is performed once; each inferential framework is then a one-line readout of the same spectral state.
From this construct we prove that Bayesian model selection (with a prior derived from the Lean 4-verified Universal Risk Representation Theorem), frequentist minimax estimation, and MDL all yield the same optimal model complexity:
\[K^*_B = K^*_F = K^*_{\mathrm{MDL}} = \Theta\!\left(\frac{\log(n/\sigma^2)}{\log\rho}\right) + O(1)\]
where \(\rho\) is the eigenvalue decay rate, \(\sigma^2\) the noise variance, and \(n\) the sample size. The Bayesian-frequentist disagreement is \(O(1/\sqrt{n})\) per mode and confined to a "contested zone" of \(O(1/\log\rho) \approx 3\) modes near \(K^*\). Below \(K^*\), both views agree the mode is signal. Above \(K^*\), both agree it is noise. The paper's claim is therefore not merely that a useful statistic exists, but that the apparent philosophical divide collapses to a thin finite-sample boundary layer once the problem is seen in the spectral basis.
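As a rough numerical illustration of the leading-order term, the sketch below evaluates \(\log(n/\sigma^2)/\log\rho\) and the contested-zone width \(1/\log\rho\). It assumes natural logarithms and sets the unspecified \(\Theta\) and \(O(1)\) constants to one; the function names are hypothetical and not part of the paper.

```python
import math


def leading_order_complexity(n, sigma2, rho):
    """Leading-order term log(n / sigma^2) / log(rho).

    Assumption: the Theta constant is 1 and the O(1) offset is dropped.
    """
    if n <= 0 or sigma2 <= 0 or rho <= 1:
        raise ValueError("need n > 0, sigma2 > 0 and rho > 1")
    return math.log(n / sigma2) / math.log(rho)


def contested_zone_width(rho):
    """Width scale 1 / log(rho) of the contested zone (natural log assumed)."""
    if rho <= 1:
        raise ValueError("need rho > 1")
    return 1.0 / math.log(rho)


if __name__ == "__main__":
    rho, sigma2 = 1.4, 1.0  # illustrative values, not taken from the paper
    for n in (100, 1_000, 10_000):
        k_star = leading_order_complexity(n, sigma2, rho)
        print(f"n={n:>6}: leading-order K* ~ {k_star:.1f}")
    print(f"contested-zone width scale: {contested_zone_width(rho):.2f}")
```

Under these assumptions, a decay rate of \(\rho = 1.4\) gives a width scale of about 3 modes, which is one way to read the "\(\approx 3\)" figure; the leading-order complexity grows only logarithmically in \(n\).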
The Bayesian prior is not subjective: \(A_k \sim \mathcal{N}(0, C^2\rho^{-2k})\) is the unique prior consistent with the machine-verified coefficient decay theorem \(|A_k| \leq C\rho^{-k}\). The result requires only finite variance and is robust to non-Gaussian noise: the contested zone widens for heavy tails but the duality persists (verified empirically for Student-\(t\), Laplace, Pareto, and contaminated mixtures).
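To make the role of this prior concrete, the following sketch applies the standard normal-normal conjugate update per mode. It assumes, purely for illustration, that each estimated coefficient has sampling variance \(\sigma^2/n\); the abstract does not state that form, and the function names are hypothetical. The posterior mean then shrinks the estimate by \(w_k = \tau_k^2/(\tau_k^2 + \sigma^2/n)\) with \(\tau_k^2 = C^2\rho^{-2k}\), and the weight crosses \(1/2\) where the prior variance equals the noise variance.

```python
import math


def shrinkage_weights(n, sigma2, rho, C=1.0, num_modes=20):
    """Posterior-mean shrinkage weight for each mode k = 0..num_modes-1.

    Prior: A_k ~ N(0, C^2 * rho^(-2k)), as stated in the text.
    Assumption (not from the paper): per-mode sampling variance is sigma2 / n.
    """
    noise_var = sigma2 / n
    weights = []
    for k in range(num_modes):
        prior_var = C**2 * rho ** (-2 * k)
        weights.append(prior_var / (prior_var + noise_var))
    return weights


def crossover_mode(n, sigma2, rho, C=1.0):
    """Real-valued mode index where prior variance equals noise variance."""
    return math.log(n * C**2 / sigma2) / (2.0 * math.log(rho))


if __name__ == "__main__":
    n, sigma2, rho = 10_000, 1.0, 1.4  # illustrative values only
    k_cross = crossover_mode(n, sigma2, rho)
    print(f"crossover near mode {k_cross:.1f}")
    for k, w in enumerate(shrinkage_weights(n, sigma2, rho, num_modes=20)):
        print(f"mode {k:2d}: weight {w:.3f}")
```

In this simplified setting the crossover index scales as \(\log(n/\sigma^2)/\log\rho\) up to a constant factor, which agrees with the \(\Theta(\cdot)\) rate above; modes well below the crossover are kept almost unchanged and modes well above it are shrunk toward zero.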
Five practical consequences follow: (1) analytic model selection without cross-validation, recovering 40% of discarded data; (2) resolution of BIC/AIC/CV disputes, since all three approximate the same \(K^*\); (3) an overfitting audit from \(n\), \(p\), and \(\sigma^2\) alone, requiring no data access; (4) per-mode sample size calculation for experimental design; (5) Kelly-optimal position sizing in eigenspace for quantitative trading, where noise modes are automatically zeroed by the spectral shrinkage filter. All theoretical results build on 10 Lean 4-verified theorems.