What Your XGBoost Learned: Spectral Knowledge Extraction from Black-Box Models

Tamás Nagy, Ph.D. · Updated 2026-03-07 · Working Paper · Quantitative Finance · Lean-Verified

Abstract

We introduce spectral knowledge extraction: a method that decomposes the function learned by a black-box model (for example XGBoost, a random forest, or a neural network) into explicit Fourier cosine modes, each with a direct interpretation. Given a trained model \(f(x)\), we compute the partial dependence function \(g_j(x_j)\) for each feature \(j\) and decompose it as \(g_j(x_j) \approx A_0/2 + \sum_{k=1}^{K} A_k \cos(k\pi x_j / x_{\max})\). The coefficients \(\{A_k\}\) encode the shape of the learned relationship: a dominant \(A_1\) indicates a monotone, approximately linear effect (momentum/contrarian); a dominant \(A_2\) indicates a U-shaped, approximately quadratic effect (mean reversion/barrier); higher modes indicate more complex nonlinearity. We define the spectral complexity \(\text{SC} = \sum_{k \geq 3} A_k^2 / \sum_{k \geq 1} A_k^2 \in [0, 1]\) as a measure of how far the model's learned effect departs from these simple parametric forms. For 2D feature interactions, the Fourier coefficient matrix \(B_{kl}\) reveals the interaction type (spread tracking, joint mean reversion, conditional momentum). On synthetic data with known ground truth, spectral extraction correctly identifies the generating patterns (mean reversion, momentum, interactions) from a gradient boosting model. On real-world housing data, the method recovers economically intuitive nonlinear effects and separates genuine complexity from simple parametric relationships. The reconstruction captures the model's predictive power with explicit, interpretable formulas. Two key structural results, the reconstruction error bound (via the triangle inequality on the omitted coefficients) and the coefficient-shape correspondence, are machine-verified in Lean 4. The method goes beyond SHAP, which gives per-prediction feature attributions, by providing the global functional shape of each learned effect.
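The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`partial_dependence`, `midpoint_grid`, `cosine_coefficients`, `reconstruct`, `spectral_complexity`), the midpoint sampling grid, and the toy `predict` function standing in for a trained model are all assumptions made for illustration. The sketch computes \(g_j\) by averaging predictions with feature \(j\) forced to each grid value, estimates \(A_0, \dots, A_K\) by a discrete cosine projection, and evaluates \(\text{SC}\) as defined above.

```python
import math


def partial_dependence(predict, rows, j, grid):
    """Average prediction over `rows` with feature j forced to each grid value."""
    values = []
    for v in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[j] = v
            total += predict(modified)
        values.append(total / len(rows))
    return values


def midpoint_grid(x_max, n):
    """n midpoints of equal-width cells on [0, x_max] (assumed sampling scheme)."""
    return [(i + 0.5) * x_max / n for i in range(n)]


def cosine_coefficients(g_values, K):
    """Estimate A_0..A_K from g sampled on midpoint_grid.

    On this grid the cosine modes with k < n are exactly orthogonal, so
    A_k = (2/n) * sum_i g_i * cos(k*pi*(i + 0.5)/n).
    """
    n = len(g_values)
    if K >= n:
        raise ValueError("K must be smaller than the number of grid points")
    return [
        (2.0 / n) * sum(g * math.cos(k * math.pi * (i + 0.5) / n)
                        for i, g in enumerate(g_values))
        for k in range(K + 1)
    ]


def reconstruct(A, x, x_max):
    """Evaluate A_0/2 + sum_{k>=1} A_k cos(k*pi*x/x_max)."""
    return A[0] / 2 + sum(A[k] * math.cos(k * math.pi * x / x_max)
                          for k in range(1, len(A)))


def spectral_complexity(A):
    """SC = sum_{k>=3} A_k^2 / sum_{k>=1} A_k^2, taken as 0.0 for a flat effect."""
    total = sum(a * a for a in A[1:])
    if total == 0.0:
        return 0.0
    return sum(a * a for a in A[3:]) / total


# Toy stand-in for a trained model (hypothetical): feature 0 carries a k=1
# and a k=3 cosine mode, feature 1 enters linearly.
X_MAX = 10.0


def predict(row):
    t = math.pi * row[0] / X_MAX
    return 1.0 + 0.5 * math.cos(t) + 0.25 * math.cos(3 * t) + 0.1 * row[1]


rows = [[2.0, 0.0], [5.0, 1.0], [8.0, 2.0]]   # hypothetical background data
grid = midpoint_grid(X_MAX, 64)
g0 = partial_dependence(predict, rows, 0, grid)
A = cosine_coefficients(g0, K=5)
sc = spectral_complexity(A)
max_err = max(abs(reconstruct(A, x, X_MAX) - g) for x, g in zip(grid, g0))
```

For this toy model the recovered coefficients are \(A_1 \approx 0.5\) and \(A_3 \approx 0.25\) with the remaining modes near zero, so \(\text{SC} \approx 0.2\). As a separate illustration, and not necessarily the form of the Lean-verified theorem: if the full cosine series converges absolutely to \(g_j\), then \(|\cos| \leq 1\) and the triangle inequality give \(|g_j(x_j) - A_0/2 - \sum_{k=1}^{K} A_k \cos(k\pi x_j / x_{\max})| \leq \sum_{k > K} |A_k|\).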

Length: 3,400 words
Claims: 3 theorems
Status: Working Paper
Target: Journal of Financial Data Science
