Reflective Self-Amendment
Abstract
We identify a structural property unique to large language model (LLM) agents: when an agent's behavioral policy is specified in natural language (a "skill file") and its reasoning engine operates in that same modality, the agent can read, evaluate, and propose modifications to its own behavioral specification. We call this reflective self-amendment (RSA). Unlike reinforcement learning (where the policy is encoded in opaque weights), constitutional AI (where the constitution is fixed), or symbolic self-modification (where rules and reasoning use different formalisms), RSA exploits the same-modality property: policy and reasoning share a representational medium. We formalize RSA as a fixed point of the tool-building cascade: the point at which the tool that improves tools is directed at itself. We give conditions under which the RSA loop converges (bounded improvement per cycle) versus diverges (unbounded self-modification), connecting to known results on AI self-improvement bounds.
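One way to make the convergence dichotomy concrete is the following sketch; the notation ($s_t$, $A$, $V$, $c$, $\rho$) is illustrative and not fixed by the abstract. Let $s_t$ denote the skill file after $t$ amendment cycles, $A$ the amendment operator applied by the reasoning engine, and $V$ a scalar measure of policy quality. A sufficient condition for convergence is geometrically bounded improvement per cycle:

% illustrative notation: s_t = skill file after t cycles, A = amendment operator,
% V = quality measure; none of these symbols are defined in the abstract itself
\[
s_{t+1} = A(s_t), \qquad 0 \le V(s_{t+1}) - V(s_t) \le c\,\rho^{t}, \qquad 0 < \rho < 1,
\]
\[
\text{so that } V(s_t) \le V(s_0) + \sum_{k=0}^{t-1} c\,\rho^{k} \le V(s_0) + \frac{c}{1-\rho} \quad \text{for all } t.
\]

Since $V(s_t)$ is then monotone and bounded, it converges, and a fixed point $s^{*} = A(s^{*})$ is the self-amendment analogue of the cascade terminating. Conversely, if per-cycle improvement is bounded below by a positive constant while the space of modifications is unconstrained, the loop diverges: this is the unbounded self-modification regime the abstract contrasts against.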