Applied Knowledge Algebra: A Collection of Demonstrations and Use Cases
Abstract
The companion methods paper (Nagy 2026a) defines the Knowledge Artifact — a portable spectral representation of trained model knowledge — and the Knowledge Algebra — exact arithmetic on compatible artifacts. This paper is the empirical companion: a collection of detailed use cases, each designed to be self-contained and independently publishable.
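To make "exact arithmetic on compatible artifacts" concrete, here is a minimal sketch of what such a structure could look like. The class name, the `basis_id` compatibility field, and the check itself are illustrative assumptions for this sketch, not the data structure defined in the methods paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class KnowledgeArtifact:
    """Hypothetical minimal artifact: coefficients over a shared spectral basis."""
    basis_id: str       # assumed compatibility tag (illustrative, not from the paper)
    coeffs: np.ndarray  # spectral coefficient vector

    def _check_compatible(self, other: "KnowledgeArtifact") -> None:
        # Arithmetic is only exact between artifacts over the same basis.
        assert self.basis_id == other.basis_id, "artifacts must share a basis"

    def __add__(self, other: "KnowledgeArtifact") -> "KnowledgeArtifact":
        self._check_compatible(other)
        return KnowledgeArtifact(self.basis_id, self.coeffs + other.coeffs)

    def __sub__(self, other: "KnowledgeArtifact") -> "KnowledgeArtifact":
        self._check_compatible(other)
        return KnowledgeArtifact(self.basis_id, self.coeffs - other.coeffs)

a = KnowledgeArtifact("basis-v1", np.array([1.0, 2.0, -0.5]))
b = KnowledgeArtifact("basis-v1", np.array([0.5, -1.0, 0.5]))
c = a + b  # element-wise, exact up to floating point
```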
Part A: Federated Knowledge Aggregation. We show that averaging Knowledge Artifact coefficients across isolated data sites produces a federated model without sharing any raw data, gradients, or model weights. On the Diabetes dataset (5 hospital sites, \(n \approx 70\) each), the federated artifact (\(R^2 = 0.484\)) outperforms the pooled gold standard (\(R^2 = 0.450\)). On Breast Cancer (5 clinics), the federated classifier (accuracy = 0.974) outperforms pooled training (accuracy = 0.956). The protocol requires one round, no central server, and no iterative optimization.
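The one-round aggregation step can be sketched as follows, assuming each site's artifact is a fixed-length coefficient vector over a shared basis (the site count and vector length below are illustrative, not the experimental values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each isolated site extracts its own Knowledge
# Artifact as a coefficient vector over the same spectral basis.
n_sites, n_coeffs = 5, 32
site_artifacts = [rng.normal(size=n_coeffs) for _ in range(n_sites)]

# One-round federation: element-wise mean of the coefficient vectors.
# No raw data, gradients, or model weights leave any site; only the
# artifact coefficients are exchanged.
federated = np.mean(site_artifacts, axis=0)
```

Because the aggregate is a plain mean of vectors, the protocol needs a single exchange and no iterative optimization, matching the one-round claim above.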
Part B: Spectral Debiasing. We demonstrate post-training bias removal via spectral subtraction. On the Wine dataset with injected bias, one subtraction operation recovers \(R^2 = 0.954\) on the fair target from a model that was anti-correlated (\(R^2 = -1.29\)), with a 7x reduction in RMSE. The debiased artifact contains zero residual bias signal.
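A minimal sketch of the subtraction step, under the simplifying assumption that the injected bias enters additively in the same spectral basis (the vectors here are synthetic stand-ins, not the Wine-experiment artifacts):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical additive model: the biased artifact equals the clean
# artifact plus an injected bias artifact in the shared basis.
clean = rng.normal(size=16)
bias = rng.normal(size=16)
biased = clean + bias

# One subtraction operation removes the bias signal; in this idealized
# additive setting the residual is zero up to floating-point error.
debiased = biased - bias
residual = np.linalg.norm(debiased - clean)
```

In this idealized setting the residual bias energy vanishes exactly, which is the spectral analogue of the "zero residual bias signal" claim above.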
Part C: Structural Model Fingerprinting. We show that Knowledge Artifacts provide architecture-independent structural comparison of trained models. A 200-tree GBM and a 200-tree Random Forest trained on the same data converge to spectral cosine similarity = 0.984, despite being fundamentally different algorithms. Energy, entropy, and spectral distance provide quantitative fingerprints for model auditing, versioning, and drift detection.
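The three fingerprint quantities named above can be sketched directly on coefficient vectors. The function names and the entropy convention (Shannon entropy of the normalized power spectrum) are assumptions of this sketch, not necessarily the paper's exact definitions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Spectral cosine similarity between two coefficient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def spectral_fingerprint(coeffs: np.ndarray) -> tuple[float, float]:
    """Energy and Shannon entropy of the normalized power spectrum."""
    power = coeffs ** 2
    energy = float(power.sum())
    p = power / energy                      # normalize to a distribution
    entropy = float(-np.sum(p * np.log(p + 1e-12)))  # small offset avoids log(0)
    return energy, entropy

# Illustrative vectors standing in for two models' artifacts:
gbm_like = np.array([1.0, 2.0, 3.0])
rf_like = np.array([1.1, 2.0, 2.9])
sim = cosine_similarity(gbm_like, rf_like)  # close to 1 for similar spectra
```

Spectral distance (e.g. the norm of the coefficient difference) and drift detection follow the same pattern: compute the fingerprint per model version and compare over time.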
Additionally, we validate extraction across 18 regression model types and 8 classifier types (Section 2), catalog 10 algebraic operations on nonlinear models (Section 3), document honest failure modes (Section 8), and compare against weight-space alternatives (Section 9).
All experiments use fixed seeds and are reproducible with a single script.