Emergence Is a Spectral Phase Transition: Predicting When Language Models Acquire New Abilities
Abstract
We propose that emergent abilities in large language models are spectral phase transitions: a capability appears when the model's spectral resolution crosses the task's intrinsic complexity. We formalize this as \(N^* = (C/(\rho_{\text{task}} - 1))^{1/\alpha}\), where \(\rho_{\text{task}}\) is the task's spectral decay rate and \(C, \alpha\) are architecture constants. Testing on 20 BIG-Bench tasks claimed as emergent by Wei et al. (2022), we find that only 7/20 (35\%) show true sigmoid emergence (\(R^2 > 0.7\)); 4 are gradual, 6 are noisy, and 3 are flat. The 7 genuinely emergent tasks have emergence thresholds spanning 5 orders of magnitude (\(N^*\) from 142M to 1.75T parameters), with sigmoid sharpness correlating with the number of spectral modes required (\(K^*\)). The spectral framework explains \emph{why} some tasks emerge sharply (few modes, high \(\rho\)) and others gradually (many modes, low \(\rho\)), resolving the debate between Wei et al. (2022, ``emergence is real'') and Schaeffer et al. (2023, ``emergence is a mirage''): both are right, for different tasks.
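The threshold formula can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the constants \(C\) and \(\alpha\) are architecture-dependent and their values below are placeholders, not fitted estimates from the paper.

```python
def emergence_threshold(C: float, rho_task: float, alpha: float) -> float:
    """Predicted parameter count N* at which a capability emerges,
    per the spectral phase-transition model:
        N* = (C / (rho_task - 1)) ** (1 / alpha)
    A finite threshold requires rho_task > 1."""
    if rho_task <= 1:
        raise ValueError("rho_task must exceed 1 for a finite threshold")
    return (C / (rho_task - 1)) ** (1.0 / alpha)

# Placeholder constants for illustration only; C and alpha are
# architecture-specific and not given in the abstract.
n_star = emergence_threshold(C=50.0, rho_task=1.5, alpha=0.3)
```

Note that the model predicts a sharp dependence on \(\rho_{\text{task}}\): tasks with faster spectral decay (higher \(\rho\)) yield lower thresholds, consistent with the claim that sharply emergent tasks need few modes.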