Life: What, Why and How
Abstract
We formalize a definition of life as the intersection of six necessary constraints: (L1) Shannon information capacity for self-description, (L2) the Eigen error threshold for pattern preservation, (L3) thermal stability of covalent bonds, (L4) reversible molecular recognition via hydrogen bonding, (L5) metabolic energy supply, and (L6) evolvability through bounded mutation. We show that this intersection is non-empty given Earth's physical constants, with the viable region spanning 200–2500 atoms. A systematic parameter sweep across 14 chemistries, from binary polymers (\(k = 2\)) to codon-level coding (\(k = 64\)), shows that the atom-count function \(N(k, m) = I_{\min} \cdot m / \ln k\) (alphabet size \(k\), monomer size \(m\), and \(I_{\min}\) in nats) has a broad optimum basin at \(k \in [8, 20]\), \(N \in [320, 364]\), where reduced and full amino acid alphabets achieve near-identical efficiency. We prove the Binary Paradox: simpler monomers (\(k = 2\), \(m = 5\)) require more total atoms than complex ones (\(k = 20\), \(m = 10\)), because the information compression gained from a larger alphabet outweighs the cost of the heavier monomer. We frame self-replication as the closure of three maps (from folding to recognition to copying) and show that, among the chemistries analyzed, amino acids are the lightest monomer class for which all three maps are simultaneously well-defined. We identify four candidate points on the landscape whose coarse parameters satisfy all six constraints and are accessible to current synthetic chemistry: a reduced-alphabet peptide (\(\sim 340\)–\(400\) atoms), a thioester polymer (\(\sim 575\) atoms), a PNA self-replicator (\(\sim 1075\) atoms), and a minimal Ghadiri variant (\(\sim 480\) atoms). All mathematical claims are machine-verified in a formal proof kernel (Lean 4) and independently type-checked in Lean 4 with Mathlib, with zero uses of sorry and zero errors.
Keywords: origin of life, self-replication, information theory, Eigen threshold, amino acids, fixed-point theory, formal verification
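As a quick numerical illustration of the Binary Paradox stated in the abstract, the following minimal sketch evaluates \(N(k, m) = I_{\min} \cdot m / \ln k\) for the two parameter pairs quoted there. The function names are hypothetical, and reading \(m\) as a per-monomer size is an assumption of this sketch. Because \(I_{\min}\) is a common factor, the ratio of the two atom counts does not depend on its value.

```python
import math


def atoms_per_nat(k: int, m: float) -> float:
    """Return m / ln(k), the factor multiplying I_min in N(k, m).

    k is the alphabet size; m is the monomer size (assumed here to be
    the per-monomer term in the abstract's formula).
    """
    if k < 2:
        raise ValueError("alphabet size k must be at least 2")
    return m / math.log(k)


def atom_count(i_min_nats: float, k: int, m: float) -> float:
    """Evaluate N(k, m) = I_min * m / ln(k), with I_min given in nats."""
    return i_min_nats * atoms_per_nat(k, m)


# Parameter pairs quoted in the abstract.
binary = atoms_per_nat(2, 5)    # simple monomer, binary alphabet
amino = atoms_per_nat(20, 10)   # heavier monomer, 20-letter alphabet

# I_min cancels in the ratio, so no value for it needs to be assumed.
ratio = binary / amino
print(f"N(2, 5) / N(20, 10) = {ratio:.3f}")
```

The ratio reduces to \(\ln 20 / (2 \ln 2) \approx 2.16\): under the formula, the binary polymer needs roughly twice as many atoms as the 20-letter alphabet for the same information content, even though each of its monomers is half the size.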