AI: Safety, Scaling, Interpretability
Machine learning results with formal guarantees
Results about large language models, neural scaling, adversarial robustness, and AI safety — all framed as mathematical theorems with Lean-verified components. Some of these rederive known empirical scaling laws from first principles.
8 papers
Machine Learning
Working Paper
Lean
Flagship
Neural Scaling Laws Formalized: Why Chinchilla Works (A Machine-Verified Derivation)
Neural scaling laws — the empirical observation that test loss decreases as a power law in compute budget — are the foundation of modern AI training strategy. Every major laboratory trains billion-dollar models by extrapolating scaling curves, yet *why* these power laws hold remains unexplained.
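
To make "extrapolating scaling curves" concrete, here is a minimal sketch (the functional form $L(C) = aC^{-b} + c$ and all constants are illustrative assumptions, not the paper's data): fit the power-law exponent in log space on small runs, then extrapolate to a larger compute budget.

```python
import numpy as np

# Illustrative constants for an assumed power law L(C) = a*C^(-b) + c,
# where c is the irreducible loss. Not taken from the paper.
a, b, c = 40.0, 0.3, 1.7
compute = np.logspace(18, 22, 6)      # FLOPs of hypothetical small runs
loss = a * compute**(-b) + c

# Fit the exponent by linear regression on log(L - c) vs log(C).
slope, intercept = np.polyfit(np.log(compute), np.log(loss - c), 1)
b_hat = -slope

# Extrapolate the fitted curve to a much larger compute budget.
big_run = 1e25
predicted = np.exp(intercept) * big_run**slope + c
```

In practice the irreducible loss $c$ is itself fitted rather than known, and noise makes the extrapolation uncertain; the point here is only the log-log-linear structure that laboratories rely on.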
Machine Learning
Draft
Lean
Why Neural Networks Scale: A Complete Latent-Theoretic Foundation
We present a unified mathematical theory of neural scaling laws derived from the spectral structure of data distributions. The central object is the Latent Number $\rho \in (0, \infty)$, which measures the rate at which a distribution's spectral coefficients decay.
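
As a rough sketch of how a spectral decay rate yields a power law (an assumed toy calculation, not the paper's definition of $\rho$): if a model with $N$ parameters captures the top $N$ spectral components of the data, and the coefficients decay like $\lambda_k \propto k^{-(1+\rho)}$, the residual loss is the tail sum

$$L(N) \;\approx\; \sum_{k > N} \lambda_k \;\propto\; \int_N^\infty x^{-(1+\rho)}\,dx \;=\; \frac{N^{-\rho}}{\rho},$$

so faster spectral decay (larger $\rho$) produces a steeper scaling exponent.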
Formal Verification
Working Paper
Lean
Adam's Convergence Proof Was Wrong: A Machine-Checked Verification of the Bug and the Fix
Adam (Kingma & Ba, 2015) is deep learning's most-cited optimizer, with over 100,000 citations and native implementation in every major framework: TensorFlow, PyTorch, JAX, Keras. Its original convergence proof — Theorem 4.1, published at ICLR 2015 — claimed $R_T = O(\sqrt{T})$ regret on convex problems.
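
The widely cited analysis of this flaw (Reddi et al., 2018) locates it in the implicit assumption that Adam's effective step size $\alpha / \sqrt{\hat v_t}$ never increases. A minimal numeric illustration of that failure (our own sketch, not code from the paper): one large gradient followed by small ones makes $v_t$ shrink, so the effective step size grows.

```python
# Sketch (not from the paper): Adam's second-moment estimate v_t after
# a single large gradient decays under small subsequent gradients,
# so alpha / sqrt(v_hat) increases -- contradicting the monotonicity
# that the original Theorem 4.1 proof relied on.
beta2 = 0.999
alpha = 0.001
v = 0.0
eff_steps = []
grads = [10.0] + [0.1] * 500   # one large gradient, then small ones
for t, g in enumerate(grads, start=1):
    v = beta2 * v + (1 - beta2) * g * g
    v_hat = v / (1 - beta2**t)              # bias correction
    eff_steps.append(alpha / (v_hat**0.5 + 1e-8))

# True whenever the effective learning rate increases at some step.
increased = any(later > earlier
                for earlier, later in zip(eff_steps, eff_steps[1:]))
```

AMSGrad repairs exactly this by replacing $\hat v_t$ with a running maximum, which forces the effective step size to be nonincreasing.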
Machine Learning
Draft
Lean
Verified Adversarial Robustness: Lipschitz Certificates for Neural Networks in Lean 4
Can we *prove* a neural network is safe? Adversarial examples — imperceptible input perturbations that cause misclassification — remain the most persistent failure mode of deep learning. Existing defenses rely on empirical testing, which cannot guarantee safety.
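
A standard certificate of this kind, sketched here under the assumption that the paper's construction resembles it: bound the network's $\ell_2$ Lipschitz constant by the product of layer spectral norms, then certify that any perturbation smaller than $\text{margin}/(2L)$ cannot change the predicted class. The toy network and weights below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network with random weights (illustrative only).
W1 = rng.normal(size=(16, 8)) * 0.3
W2 = rng.normal(size=(3, 16)) * 0.3

def forward(x):
    return W2 @ np.maximum(W1 @ x, 0.0)

# Global l2 Lipschitz upper bound: product of spectral norms.
# ReLU is 1-Lipschitz, so it contributes a factor of 1.
L = np.linalg.norm(W1, 2) * np.linalg.norm(W2, 2)

x = rng.normal(size=8)
logits = forward(x)
top, second = np.sort(logits)[-1], np.sort(logits)[-2]
margin = top - second

# Certified radius: each logit moves at most L*||delta||, so the
# margin shrinks at most 2*L*||delta||; any ||delta|| < margin/(2L)
# provably cannot flip the argmax.
radius = margin / (2 * L)

# Sanity check: a perturbation well inside the radius keeps the class.
delta = rng.normal(size=8)
delta *= 0.5 * radius / np.linalg.norm(delta)
same_pred = np.argmax(forward(x + delta)) == np.argmax(logits)
```

The spectral-norm product is often a loose bound for deep networks, which is exactly why tighter certified-robustness methods are an active research area.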
Machine Learning
Draft
Lean
The AI Safety Certificate: A Machine-Verified Framework for Quantitative AI Safety
Can we *prove* an AI system is safe — not with testing, not with benchmarks, but with mathematical certainty? We present a Lean 4 verification of a unified AI safety certificate — to our knowledge, the first formal verification of a multi-dimensional safety certificate for AI systems.
Machine Learning / Model Analysis
Draft
Lean
Residual Stream Denoising in Large Language Models: Gradient-Free Quality Improvement via Functional Sensitivity Analysis
Machine Learning
Working Paper
Lean
Spectral Certificates for Trustworthy AI: Robustness, Confidence, and Fairness from One Decomposition
We prove that a single singular value decomposition (SVD) of a neural network's local Jacobian yields three formally verified trustworthiness guarantees simultaneously.
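
To make the core object concrete (a generic sketch, not a reproduction of the paper's certificates): the SVD of the input-output Jacobian at a point yields, from one decomposition, a local sensitivity bound (the top singular value) together with the most- and least-sensitive input directions. The linear toy model below is an assumption chosen so the Jacobian is exact.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear toy "network" so the Jacobian is exact (illustrative only).
J = rng.normal(size=(4, 6))

# One SVD of the local Jacobian.
U, s, Vt = np.linalg.svd(J)

local_lipschitz = s[0]          # robustness: worst-case local gain
most_sensitive_dir = Vt[0]      # input direction amplified by s[0]
least_sensitive_dir = Vt[-1]    # input direction amplified least

# The bound is tight on the top right-singular vector:
# ||J v|| <= s[0] for every unit v, with equality at v = Vt[0].
gain_top = np.linalg.norm(J @ most_sensitive_dir)
```

Because all three quantities come from the same decomposition, computing the SVD once is enough to evaluate every certificate built on it.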
Machine Learning
Working Paper
Lean
The Spectral Theory of LLM Understanding: When Can We Trust a Language Model?
Large language models are deployed in medicine, law, education, and critical infrastructure — yet we cannot mathematically characterize when their outputs are trustworthy.