Can a system model itself?

Stage 1 Epimenides · ~6th c. BCE

The sentence that bites its own tail

The provocation

This sentence is false. — the liar paradox, in its barest form

If it's true, then what it says holds — so it's false. If it's false, then what it says fails — so it's true. The sentence has no stable truth value; it oscillates forever. For most of history this was filed under amusement: a verbal Möbius strip, a thing to say to undergraduates. The lesson everyone drew was that self-reference is a kind of disease of language, to be quarantined and ignored.
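The oscillation can be made concrete with a small sketch. It assumes we treat a sentence as a function from a guessed truth value to the value the sentence then demands, so a stable truth value is a guess the function maps back to itself. The names `liar`, `truth_teller`, `stable_values`, and `iterate` are illustrative only, and the truth-teller ("This sentence is true") is brought in purely as a contrast.

```python
from typing import Callable, List

Sentence = Callable[[bool], bool]


def liar(guess: bool) -> bool:
    # "This sentence is false": it is true exactly when the guess says it is false.
    return not guess


def truth_teller(guess: bool) -> bool:
    # "This sentence is true": it simply agrees with whatever was guessed.
    return guess


def stable_values(sentence: Sentence) -> List[bool]:
    """Return every guess that the sentence maps back to itself."""
    return [v for v in (True, False) if sentence(v) == v]


def iterate(sentence: Sentence, start: bool, steps: int) -> List[bool]:
    """Follow the sentence's verdict on its own verdict, step by step."""
    values = [start]
    for _ in range(steps):
        values.append(sentence(values[-1]))
    return values


print(stable_values(liar))          # []
print(stable_values(truth_teller))  # [True, False]
print(iterate(liar, True, 4))       # [True, False, True, False, True]
```

The liar has no stable value at all, while the truth-teller has two. Self-reference alone is harmless; it is self-reference plus negation that never settles.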

The model under it

What makes the liar bite is one specific move: a statement that refers to itself, and in doing so makes a claim about its own status. Pull those apart and nothing strange happens. "Snow is white" is about snow. "The previous sentence is about snow" is about another sentence. The trouble starts only when the subject of the sentence is the sentence — when the system points its own machinery back at itself.

The reflex is to ban the move. Bertrand Russell, having found a self-referential paradox at the foundations of mathematics — the set of all sets that don't contain themselves — spent years building a theory of types whose whole purpose was to outlaw self-reference, to forbid any statement from talking about its own level. Tidy the language, ban the loop, and the paradoxes vanish. So everyone hoped.

Self-reference is a bug

"The liar is a glitch in natural language — sloppy grammar letting a sentence grab itself. Build a clean enough system, with strict levels, and the loop becomes ungrammatical. Real mathematics need never refer to itself."

Self-reference is a feature

"You can't ban it — not from any system rich enough to do arithmetic. Numbers can count symbols, and symbols can describe numbers, so the system can always be made to talk about itself. The loop isn't a bug you removed; it's a door you can't lock."

Where this leaves us

For two and a half millennia the liar paradox sat in the joke drawer. Then in 1931 a 25-year-old logician took the exact structure of "this sentence is false" — a system referring to itself — and aimed it not at truth but at provability. The joke became the most consequential theorem of the century.

What happens when a system, instead of saying "I am false," manages to say "I cannot be proved"?

Stage 2 Kurt Gödel · 1931

A theorem that talks about itself

The provocation

David Hilbert had set mathematics a goal: find one formal system, with airtight rules, that could in principle prove every mathematical truth and never prove a falsehood (complete and consistent). Gödel proved that no such system can exist. Any consistent system whose axioms can be mechanically listed, and which is powerful enough to describe ordinary arithmetic, contains true statements it cannot prove. And he did it by teaching the system to talk about itself.

The model under it

The decisive idea is Gödel numbering. Assign every symbol a number; then any string of symbols — a formula, a whole proof — becomes a single, specific number, the way a word becomes a number when you encode each letter. Now a statement about arithmetic ("such-and-such formula is provable") is itself a statement about numbers — which arithmetic can express. The system gains the power to describe its own sentences from the inside. It can talk about itself without ever leaving the language of numbers.
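The encoding step can be sketched in a few lines. This is a toy version in the spirit of a prime-power encoding, not Gödel's actual symbol table: the eight-symbol alphabet, the code assignments, and the function names `godel_number` and `decode` are all assumptions made for illustration. The i-th symbol of a formula becomes the exponent of the i-th prime, so unique factorization guarantees the formula can be read back out of the number.

```python
from typing import Iterator

# Hypothetical toy alphabet; each symbol gets a code from 1 upward.
SYMBOLS = ["0", "S", "+", "*", "=", "(", ")", "x"]
CODE = {s: i + 1 for i, s in enumerate(SYMBOLS)}
DECODE = {v: k for k, v in CODE.items()}


def primes() -> Iterator[int]:
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def godel_number(formula: str) -> int:
    """Encode a formula as 2^c1 * 3^c2 * 5^c3 * ..., one prime per symbol."""
    g = 1
    for p, ch in zip(primes(), formula):
        g *= p ** CODE[ch]
    return g


def decode(g: int) -> str:
    """Recover the formula by reading off the exponent of each prime in turn."""
    out = []
    for p in primes():
        if g == 1:
            break
        exponent = 0
        while g % p == 0:
            g //= p
            exponent += 1
        if exponent not in DECODE:
            raise ValueError("not the code of a formula in this alphabet")
        out.append(DECODE[exponent])
    return "".join(out)


print(godel_number("0=0"))               # 2430, i.e. 2^1 * 3^5 * 5^1
print(decode(godel_number("S0+S0=SS0")))  # S0+S0=SS0
```

Once formulas are numbers, a property of formulas such as "is provable" becomes a property of numbers, which is the kind of thing arithmetic can state.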

With that mirror in place, Gödel built a sentence, call it G, that, decoded, says: "G is not provable in this system." It is the liar paradox with one word swapped, "false" → "unprovable," and that swap changes everything. Now follow it through. Suppose the system could prove G. A proof is a finite object the system can check, so it could also verify that proof and thereby prove "G is provable," which is the negation of G. It would then prove both G and its negation, so it's inconsistent. So if the system is consistent, it cannot prove G. But that is exactly what G asserts. G is true, and unprovable. (Under a slightly stronger consistency assumption, Gödel also showed the system cannot prove the negation of G, so G is undecidable within it.)
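How a sentence can refer to itself without any word like "this" can be sketched with plain strings. What follows is an analogy to the diagonal construction, not Gödel's arithmetic version: the `diag` function, the `{X}` placeholder, and the use of Python's `eval` as the "decoder" are all assumptions made for illustration. The trick is a template that talks about the result of filling a template with its own quotation, which is then filled with its own quotation.

```python
SUFFIX = " is not provable"


def diag(template: str) -> str:
    """Fill the template's {X} hole with a quoted copy of the template itself."""
    return template.replace("{X}", repr(template))


# A template that talks about the diagonalization of whatever fills its hole.
T = "diag({X})" + SUFFIX

# G is the diagonalization of T.
G = diag(T)
print(G)
# diag('diag({X}) is not provable') is not provable

# The subject of G is the expression before the suffix. Evaluating that
# expression rebuilds G, so the sentence G is talking about is G itself.
subject = G[: -len(SUFFIX)]
referent = eval(subject, {"diag": diag})
print(referent == G)  # True
```

Nothing in G points at itself by fiat. The self-reference falls out of an ordinary substitution operation, which is why banning a word like "this" does not remove it.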

The swap that breaks Hilbert's dream

Liar: "This sentence is false." → no stable truth value. A paradox, but truth is slippery anyway, so it stays a curiosity.

Gödel: "This sentence is not provable." → a stable truth value of true, plus a proof that it can't be proved. Provability, unlike truth, is a concrete mechanical process inside the system — so the self-reference now cuts.

The payoff: truth outruns proof. No fixed set of axioms will ever reach every fact of arithmetic. Add G as a new axiom to plug the hole, and the now-larger system simply spawns a fresh G′ about itself. The gap can't be closed; it travels.

Notice what powered this: not a flaw in the system, but its strength. A language too weak to do arithmetic can't encode its own sentences and can stay safely complete; Presburger arithmetic, which has addition but no multiplication, is a standard example. Only once a system is rich enough to model itself does the loop appear, and with it the permanent gap between what is true and what is provable. Self-reference is the price of expressive power.

Where this leaves us

Gödel turned the liar from a joke into a limit. The very capacity that makes a formal system powerful — its ability to encode and reason about its own structure — is what guarantees it can never fully capture itself. A sufficiently rich system must contain a model of itself, and that model is exactly where its own blind spot lives.

If a logical system can fold back on itself like this — can a picture, a piece of music, a mind?

Stage 3 Douglas Hofstadter · 1979

Strange loops and tangled hierarchies

The provocation

A strange loop arises when, by moving upward (or downward) through the levels of some hierarchical system, you unexpectedly find yourself right back where you started. — after Douglas Hofstadter, Gödel, Escher, Bach, 1979

Hofstadter's claim in Gödel, Escher, Bach is that Gödel didn't find a quirk of logic. He found a shape, and that shape recurs across wildly different domains. In Escher's Drawing Hands, a left hand draws a right hand that is drawing the left hand: which is the artist, which the drawing? In Bach's endlessly rising canon, the music modulates up through the keys and arrives, impossibly, back in its starting key, an octave higher. In Gödel, a formula climbs out of arithmetic to make a statement about arithmetic, and lands on itself. Three crafts, one move: a level-crossing feedback loop where the top rung connects back to the bottom.

The model under it

The key word is tangled hierarchy. Normally levels stack cleanly: letters make words, words make sentences, sentences make paragraphs, and meaning lives "above" the symbols. A strange loop tangles that stack — the high level reaches down and touches the low level it was supposed to sit on top of. The drawing produces its own draughtsman. The theorem is about the system that contains it. The order of "what depends on what" eats its own tail.

Below is the cleanest visual form of the move: a screen that contains a picture of itself — and so contains itself containing itself, without end. Leave the twist at zero and you get pure regress: a tunnel of frames marching toward a vanishing point, deeper forever but always pointing the same way. Add a twist and each level rotates as it nests; turn it far enough and after a fixed number of levels you have rotated a full circle — the deepest frame lines up with the outermost, and "down" has quietly become "back to the top." That is the difference between an infinite regress and a strange loop.

[Interactive figure: The self-containing screen · regress vs. loop. A screen showing a screen showing a screen… With no twist this is an endless regress: deeper forever, but it never comes back. Add a twist to tangle the levels. Controls: levels of nesting (shown at 18) and twist per level (shown at 0°). Readouts at these settings: total rotation 0°; loop closes at: never (no twist); structure: infinite regress.]

A regress goes down forever and never returns. A strange loop also goes down — but the levels rotate, so following them far enough brings you back to where you began. Same descent; tangled instead of straight.
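The arithmetic behind the twist can be sketched as follows. How the original figure computes its readouts is not stated, so this is an assumption: the twist is taken to be a whole number of degrees, and the loop "closes" at the first level whose accumulated rotation is a whole number of full turns. The function names `total_rotation` and `loop_closes_at` are illustrative.

```python
from math import gcd
from typing import Optional


def total_rotation(levels: int, twist_deg: int) -> int:
    """Accumulated rotation, in degrees, after nesting the given number of levels."""
    return levels * twist_deg


def loop_closes_at(twist_deg: int) -> Optional[int]:
    """Smallest number of levels whose combined twist is a whole number of turns.

    Returns None when there is no visible twist (0 degrees, or a multiple of
    360), which is the pure regress: every level points the same way.
    """
    if twist_deg % 360 == 0:
        return None
    return 360 // gcd(twist_deg, 360)


print(total_rotation(18, 0))   # 0
print(loop_closes_at(0))       # None
print(loop_closes_at(20))      # 18
print(total_rotation(18, 20))  # 360
print(loop_closes_at(50))      # 36
```

When the twist divides 360 evenly, the loop closes after exactly 360 / twist levels and one full turn. A twist such as 50° needs several turns before a frame lines up with the outermost one again, which is why the sketch uses the greatest common divisor rather than a plain division.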

The regress is the cheap version of self-reference: a thing pointing at a smaller copy of itself, on and on, going nowhere. The strange loop is the expensive version — the one Gödel found and Escher drew — where the pointing eventually curves around and the system, having described itself completely, is the thing it described. Hofstadter's wager is that this second shape is not a curiosity but the engine of something we care about a great deal.

One shape, many crafts

"Gödel's G, Escher's hands, Bach's rising canon — these aren't loose analogies. They are literally the same formal structure: a hierarchy whose top level loops back to its bottom. Recognize the shape once and you see it everywhere."

A beautiful overreach?

"A drawing of hands and a theorem in number theory share a vibe, not a proof. The 'strange loop' is a gorgeous metaphor, but stretching one diagram across logic, art, music, and mind risks explaining everything and therefore nothing."

Where this leaves us

A strange loop is what you get when a system models itself completely enough that the model and the thing become the same object. It is more than regress — regress never arrives. The loop arrives, and lands on its own starting point. The question Hofstadter spent his life on is whether you, the reader, are an instance of it.

If a system rich enough to model itself grows a strange loop — and a brain is exactly such a system — what is the loop, in us?

Stage 4 Hofstadter, I Am a Strange Loop · 2007

Is the "I" a strange loop?

The provocation

Here is the live edge. A brain is a system rich enough to model the world — and rich enough to include, among the things it models, itself doing the modeling. Hofstadter's proposal is bold and bare: the self — the "I" you feel as the centre of your experience — is not a thing in the brain but a pattern: the strange loop a sufficiently reflective system spins when its self-model becomes detailed enough to model its own modeling. The "I" is what it feels like, from the inside, to be a loop that has caught its own tail.

The model under it

On this view there is no homunculus, no inner pilot. The brain builds a symbol for the most important object in its world — the organism it runs — and that symbol, by representing the very system that contains it, becomes a Gödelian self-reference made of neurons instead of numbers. Round and round the perception-of-self feeds back into the self-that-perceives, and the standing wave of that feedback is what we name "me." Hofstadter calls the soul, only half-jokingly, "a mirage that perceives itself" — real as a pattern, illusory as a thing.

It is a genuinely deflationary and genuinely uplifting idea at once: deflationary because the self is "just" a self-referential pattern, uplifting because a self-referential pattern can be copied, can live partly in other brains that model you, can be more or less richly looped. But it runs straight into the wall that every theory of consciousness hits.

The loop is the self

"Selfhood isn't an extra ingredient added to a brain — it's what a brain's self-model does once it's rich enough to fold back on itself. Strange-loopiness isn't a symptom of consciousness; it's the substance of it. The 'I' is the loop, felt from inside."

A loop is still just symbols

"Granting the brain a self-referential model explains the report of a self, not the experience of one. Why is there something it's like to be the loop at all? Self-reference is syntax; feeling is the thing the syntax doesn't deliver. The hard problem is untouched."

The dispute that's still live

Hofstadter holds that the self is a strange loop, and that the "hard problem" of consciousness (why there is subjective experience at all) is partly a confusion that dissolves once you see selfhood as a self-referential pattern. His critics deny exactly that. David Chalmers' hard problem says you can grant every functional, self-modeling, loop-laden fact about a brain and still face an open question: why is any of it experienced? John Searle's Chinese Room presses the same nerve from the side of meaning: running the right symbol-shuffling, however tangled, looks like syntax without semantics, a system that behaves as if it understands with no one home.

Defenders answer that "syntax without semantics" begs the question: a loop rich enough to model itself modeling the world may be all that meaning ever was, and the demand for an extra ingredient is a hunch dressed as an argument. The debate has only sharpened with AI — a large language model is a system trained to predict text including text about itself, and the old question returns wearing new clothes: is a self-model that talks fluently about its own states the seat of a self, or an extraordinarily good imitation of one? A walkthrough can hand you the strongest form of each side. It cannot hand you the verdict, because there isn't one yet.

The resolution

Yes — a system can model itself, and once it is rich enough, it must. That self-modeling is not a side effect; in logic it produces the gap between truth and proof, and it may, in a brain, produce the gap-dweller we call "I." Whether the strange loop merely describes the self or actually is the self — whether folding a system back on itself conjures an experiencer or only the convincing report of one — is the part still genuinely unsettled.

Sources & where to go deeper

K. Gödel, "On Formally Undecidable Propositions…" (1931) — the incompleteness theorems. Gödel numbering lets a formal system encode statements about itself; the self-referential sentence G is true but unprovable. The founding result.

D. Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid (1979) — the strange loop and the tangled hierarchy, braided across logic, art, and music. Pulitzer Prize, 1980. Where the shape gets its name.

D. Hofstadter, I Am a Strange Loop (2007) — the self as a self-referential pattern: the "I" as the loop a brain spins when its self-model models its own modeling.

M.C. Escher, Drawing Hands & Print Gallery (1948, 1956) — the visual strange loop and the Droste self-containing image; the pictures Hofstadter builds the argument around.

J. Searle, "Minds, Brains, and Programs" · D. Chalmers, "Facing Up to the Problem of Consciousness" (1980, 1995) — the Chinese Room and the hard problem: the two standing objections to any "self is a pattern" theory of mind.
