
Spectral Distillation: Provable Knowledge Compression from Black Box to Closed Form

Tamás Nagy, Ph.D. Updated 2026-03-04 Working Paper Machine Learning Lean-Verified

Abstract

We introduce spectral distillation: a method that compresses any black-box model into an explicit Fourier cosine formula with a provable per-feature error bound. Given a trained model \(f(x)\), we compute the partial dependence (PDP) function \(g_j\) for each feature, decompose it into \(K\) cosine modes, and obtain the closed-form additive model \(\hat{f}(x) = \bar{y} + \sum_j [g_j^{(K)}(x_j) - A_0^{(j)}/2]\), where each \(g_j^{(K)}\) is a \(K\)-term cosine series. The key result: the per-feature PDP approximation error \(|g_j(x_j) - g_j^{(K)}(x_j)|\) is bounded by the sum of the omitted Fourier coefficient magnitudes — a quantity computable from the model and the data, without requiring access to the true function. The total distillation error decomposes into the sum of these per-feature truncation errors plus an additive-approximation residual arising from feature interactions in the teacher model. Both components are quantifiable: the truncation bounds are machine-verified in Lean 4 (8 theorems across 3 files, zero sorry placeholders), while the additive residual is measurable empirically.

Compared to Hinton et al. (2015), where the student model is still a neural network (black box, no error guarantee), our student model is a white box (explicit formula) with a provable certificate on the spectral truncation component. On synthetic additive data, spectral distillation captures 1.8x more variance than linear distillation (R² = 0.45 vs 0.24) while providing named pattern interpretations (momentum, mean reversion, barrier) and computable error bounds for every feature. On real-world datasets (California Housing, Credit Default), the method achieves competitive fidelity with interpretable additive baselines while additionally producing certified per-feature error envelopes. The Fourier coefficients reveal the spectral complexity of each feature's learned effect — a measure of how much the model's knowledge exceeds simple parametric forms.
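The pipeline described in the abstract (PDP extraction, cosine decomposition, truncation, and the computable error certificate) can be sketched in a few lines of NumPy. This is an illustrative reconstruction, not the paper's released code: the toy teacher, grid resolution, and mode count are all assumptions, and the teacher is deliberately additive with a finite cosine spectrum over features rescaled to \([0, 1]\), so the additive residual vanishes and only the spectral truncation error (and its bound) is exercised.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def pdp(model, X, j, grid):
    """Partial dependence of feature j: average prediction with x_j clamped."""
    Xc = X.copy()
    g = np.empty(len(grid))
    for i, t in enumerate(grid):
        Xc[:, j] = t
        g[i] = model(Xc).mean()
    return g

def cosine_coeffs(g, grid, n_modes):
    """A_k = 2 * integral_0^1 g(t) cos(k*pi*t) dt, for k = 0..n_modes."""
    return np.array([2.0 * trapezoid(g * np.cos(k * np.pi * grid), grid)
                     for k in range(n_modes + 1)])

def truncated_series(A, K, grid):
    """K-term reconstruction g^(K)(t) = A_0/2 + sum_{k=1}^K A_k cos(k*pi*t)."""
    out = np.full_like(grid, A[0] / 2.0)
    for k in range(1, K + 1):
        out += A[k] * np.cos(k * np.pi * grid)
    return out

# Toy teacher (an assumption, not from the paper): additive, with a finite
# cosine spectrum on feature 0, so truncation at K = 8 drops exactly the
# k = 9 mode and nothing else.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 2))    # features rescaled to [0, 1]
model = lambda Z: (0.8 * np.cos(np.pi * Z[:, 0])
                   + 0.3 * np.cos(4 * np.pi * Z[:, 0])
                   + 0.1 * np.cos(9 * np.pi * Z[:, 0])
                   + Z[:, 1] ** 2)

grid = np.linspace(0.0, 1.0, 401)
g0 = pdp(model, X, j=0, grid=grid)          # PDP of feature 0
A = cosine_coeffs(g0, grid, n_modes=24)

K = 8                                        # keep modes 0..8, omit the rest
g0_K = truncated_series(A, K, grid)
bound = np.abs(A[K + 1:]).sum()              # sum of omitted coefficient magnitudes
err = np.abs(g0 - g0_K).max()                # actual truncation error on the grid
assert err <= bound + 1e-6                   # the per-feature certificate holds
```

On a real teacher, `model` would be a fitted predictor's `predict` function, and the additive residual (zero here because the toy teacher has no interactions) would be measured empirically as the gap between \(f\) and the sum of the per-feature series.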

Length: 3,417 words
Claims: 2 theorems
Status: Working Paper
Target: NeurIPS / ICML

Referenced By

The Latent Algebra: A Universal Representational Language fo...
