
AI: Safety, Scaling, Interpretability

Machine learning results with formal guarantees

Results about large language models, neural scaling, adversarial robustness, and AI safety — all framed as mathematical theorems with Lean-verified components. Some of these rederive known empirical scaling laws from first principles.

8 papers

Machine Learning Working Paper Lean Flagship
Neural Scaling Laws Formalized: Why Chinchilla Works (A Machine-Verified Derivation)
Neural scaling laws — the empirical observation that test loss decreases as a power law in compute budget — are the foundation of modern AI training strategy. Every major laboratory trains billion-dollar models by extrapolating scaling curves, yet *why* these power laws hold remains unexplained.
8,417 words 20 claims
Machine Learning Draft Lean
Why Neural Networks Scale: A Complete Latent-Theoretic Foundation
We present a unified mathematical theory of neural scaling laws derived from the spectral structure of data distributions. The central object is the Latent Number $\rho \in (0, \infty)$, which measures the rate at which a distribution's spectral coefficients decay.
6,042 words 1 claim
Formal Verification Working Paper Lean
Adam's Convergence Proof Was Wrong: A Machine-Checked Verification of the Bug and the Fix
Adam (Kingma & Ba, 2015) is deep learning's most-cited optimizer, with over 100,000 citations and native implementation in every major framework: TensorFlow, PyTorch, JAX, Keras. Its original convergence proof — Theorem 4.1, published at ICLR 2015 — claimed $R_T = O(\sqrt{T})$ regret on convex problems.
8,361 words 35 claims
Machine Learning Draft Lean
Verified Adversarial Robustness: Lipschitz Certificates for Neural Networks in Lean 4
Can we *prove* a neural network is safe? Adversarial examples — imperceptible input perturbations that cause misclassification — remain the most persistent failure mode of deep learning. Existing defenses rely on empirical testing, which cannot guarantee safety.
8,773 words 45 claims
Machine Learning Draft Lean
The AI Safety Certificate: A Machine-Verified Framework for Quantitative AI Safety
Can we *prove* an AI system is safe — not with testing, not with benchmarks, but with mathematical certainty? We present a Lean 4 verification of a unified AI safety certificate — to our knowledge, the first formal verification of a multi-dimensional…
8,572 words 56 claims
Machine Learning / Model Analysis Draft Lean
Residual Stream Denoising in Large Language Models: Gradient-Free Quality Improvement via Functional Sensitivity Analysis
9,856 words
Machine Learning Working Paper Lean
Spectral Certificates for Trustworthy AI: Robustness, Confidence, and Fairness from One Decomposition
We prove that a single singular value decomposition (SVD) of a neural network's local Jacobian yields three formally verified trustworthiness guarantees simultaneously.
5,323 words 11 claims
Machine Learning Working Paper Lean
The Spectral Theory of LLM Understanding: When Can We Trust a Language Model?
Large language models are deployed in medicine, law, education, and critical infrastructure — yet we cannot mathematically characterize when their outputs are trustworthy.
7,629 words 42 claims