\(K^*\) Modes Are All You Need: Spectral Uncertainty Quantification for Deep Learning
Abstract
We show that the uncertainty of any smooth learning problem is captured by \(K^*\) spectral modes of the data covariance, where \(K^* = \Theta(\log(n/\sigma^2)/\log(\rho^2))\) depends only on the dataset size \(n\), the noise level \(\sigma\), and the eigenvalue decay rate \(\rho\), \emph{not} on the number of model parameters \(p\). This follows from applying the Universal Spectral Representation Theorem (USRT) to the posterior distribution over models. The \(K^*\) formula identifies a sharp Baik--Ben Arous--Péché (BBP) phase transition: modes \(k \leq K^*\) carry signal (learnable structure), while modes \(k > K^*\) carry only noise (overfitting if used). The spectral posterior, which uses only the top \(K^*\) eigenmodes of the empirical covariance, provides calibrated uncertainty bounds, an exact separation of signal from noise, and a proof that Bayesian and frequentist model selection \emph{agree} on which modes carry information. We validate on regression and classification tasks: the spectral posterior requires \(K^* \approx 10\)--\(50\) modes regardless of model size (\(p = 10^2\) to \(10^6\)), yields well-calibrated confidence intervals, and the USRT formula correctly predicts the overfitting boundary. The theoretical core is formally verified in Lean 4 (USRT convergence and Eckart--Young optimality).
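To make the mode-count formula concrete, the following is a minimal Python sketch of how \(K^*\) and the corresponding eigenmode truncation could be computed. The helper names (`k_star`, `top_k_modes`) are hypothetical, the abstract does not fix the constants hidden by \(\Theta(\cdot)\), and we assume \(\rho < 1\), reading \(\log(\rho^2)\) as \(\log(1/\rho^2)\) up to sign so the count is positive; the full spectral-posterior construction is not reproduced here.

```python
import numpy as np

def k_star(n: int, sigma: float, rho: float) -> int:
    """Mode count from the abstract's formula (hypothetical helper).

    Assumes rho < 1 and reads log(rho^2) as log(1/rho^2) up to sign,
    so the count comes out positive; the constants hidden by the
    Theta(.) notation are taken to be 1 here.
    """
    return int(np.ceil(np.log(n / sigma**2) / np.log(1.0 / rho**2)))

def top_k_modes(X: np.ndarray, k: int):
    """Top-k eigenpairs of the empirical covariance of X (n x p, centered)."""
    cov = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    return eigvals[idx], eigvecs[:, idx]

# Example: n = 10_000, sigma = 0.1, rho = 0.9 gives
# K* = ceil(log(1e6) / log(1/0.81)) = 66 modes.
```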