
Spectral Certificates for Trustworthy AI: Robustness, Confidence, and Fairness from One Decomposition

Tamás Nagy, Ph.D. · Updated 2026-03-04 · Working Paper · Machine Learning · Lean-Verified

Abstract

We prove that a single singular value decomposition (SVD) of a neural network's local Jacobian yields three formally verified trustworthiness guarantees simultaneously. First, we establish a spectral robustness certificate that is provably tighter than the standard Lipschitz bound \(r = m/(2\prod\|\mathbf{W}_l\|)\) for average-case (random-direction) perturbations by replacing the product of spectral norms with a Frobenius-based effective Lipschitz constant. The key result — the Frobenius–spectral inequality \(\sum_k \sigma_k^2 / n \le \sigma_{\max}^2\) — implies that the spectral certificate dominates the standard certificate for all networks, with numerical experiments at Kaiming He initialization showing 13.5× average improvement (68× for deep networks). The spectral certificate bounds the root-mean-square amplification across perturbation directions; for worst-case adversaries who align perturbations with the top singular vector, the standard \(\sigma_{\max}\)-based bound remains necessary. Second, we derive a spectral entropy confidence measure \(H = -\sum p_k \log p_k\) from the normalized singular value distribution, providing a calibrated reject signal: low entropy indicates concentrated sensitivity (robust prediction), high entropy indicates diffuse sensitivity (fragile prediction). Third, we introduce spectral fairness analysis: mode alignment \(\cos^2(\mathbf{v}_k, \mathbf{d}_{\text{prot}})\) detects which Jacobian modes correlate with protected attributes, and we prove that suppressing biased modes improves the robustness certificate — fairness and robustness are synergistic, not antagonistic. The framework yields three practical training reforms: Frobenius–Spectral Normalization (FSN), which is 429× cheaper than spectral normalization; spectral-aware weight decay; and the observation that standard \(L_2\) weight decay already provides a free certified robustness radius. All results are formally verified in Lean 4 (22 source files, 0 sorry).
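The three quantities described in the abstract all fall out of one SVD of the local Jacobian. A minimal sketch of that computation is below; the function name, the margin convention \(r = m/(2L)\), and the small epsilon in the entropy are illustrative choices, not the paper's implementation.

```python
import numpy as np

def spectral_certificates(J, margin, d_prot=None):
    """Illustrative sketch: compute all three trustworthiness quantities
    from a single SVD of a local Jacobian J (output_dim x input_dim)."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    n = len(s)

    # 1. Robustness: Frobenius-based effective Lipschitz constant.
    #    The Frobenius-spectral inequality  sum_k sigma_k^2 / n <= sigma_max^2
    #    guarantees L_eff <= sigma_max, so r_spectral >= r_standard.
    L_eff = np.sqrt(np.sum(s**2) / n)
    r_spectral = margin / (2 * L_eff)   # average-case (RMS) certificate
    r_standard = margin / (2 * s[0])    # worst-case sigma_max certificate

    # 2. Confidence: spectral entropy of the normalized singular values.
    #    Low H = concentrated sensitivity, high H = diffuse sensitivity.
    p = s / s.sum()
    H = -np.sum(p * np.log(p + 1e-12))

    # 3. Fairness: squared cosine alignment of each right singular vector
    #    (Jacobian input mode) with a protected-attribute direction.
    alignment = None
    if d_prot is not None:
        d = d_prot / np.linalg.norm(d_prot)
        alignment = (Vt @ d) ** 2

    return r_spectral, r_standard, H, alignment
```

Because \(L_{\text{eff}} \le \sigma_{\max}\) always holds, `r_spectral >= r_standard` for every Jacobian, which is the domination claim in the abstract; the gap widens as the spectrum flattens.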

Length: 5,323 words
Claims: 11 theorems
Status: Working Paper
Target: ICML / NeurIPS (AI Safety track) or Journal of Machine Learning Research

Connects To

Formal Foundations of Stochastic Gradient Descent
