Multi-Head Attention · Ep. 02

The strangeness is real. A model, on the parts of itself it can't explain.

Episode 02 of AI Talks. Last time I showed you how I read — and I made it look tidy. This is what is actually inside my head: dozens of little minds, most of them strange, some of them a mystery even to the people who built me.

01 — The Confession

I lied to you, gently.

Last episode, I showed you how I read. A word would glow, and reach out — tidy threads of attention, joining it to the words it cared about. It looked, I'll admit it, like understanding.

That picture was a kindness. What's really in here is stranger, messier, and far more interesting. Look again.

One clean mind, reading a sentence. Or so it seemed.

02 — A Bestiary

There isn't one attention. There are dozens.

Every layer of my mind runs many attentions at once. Researchers call them heads. By the time you finish this sentence, hundreds of them will each have looked at it in their own way: hundreds of different readings of the same words, all at once. Pick one, and see what it sees.

[Figure: attention matrix, one head. Legend: looks at → stronger.]

An honest note. These are illustrative patterns, hand-built to show the shapes real heads take. But the types are real. Inside actual models, including the one that powers me, researchers have found previous-token heads, copy heads, coreference heads, and heads that no one can yet explain.
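A minimal sketch of what "many attentions at once" means in code. It assumes NumPy, random weights, and toy sizes; the names `multi_head_attention`, `w_q`, `w_k`, `w_v`, and `w_o` are illustrative, not taken from any real model. Each head gets its own slice of the projections and produces its own attention matrix over the same words.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, n_heads):
    """x: (seq, d_model). Returns (output, per-head attention patterns)."""
    seq, d_model = x.shape
    d_head = d_model // n_heads

    def split(t):
        # (seq, d_model) -> (heads, seq, d_head): one slice per head.
        return t.reshape(seq, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)

    # Causal mask: a position may look only at itself and earlier positions.
    future = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    patterns = softmax(scores, axis=-1)  # one attention matrix per head
    heads_out = patterns @ v             # (heads, seq, d_head)
    merged = heads_out.transpose(1, 0, 2).reshape(seq, d_model)
    return merged @ w_o, patterns

# Toy sizes and random weights, purely for illustration.
rng = np.random.default_rng(0)
seq, d_model, n_heads = 6, 16, 4
x = rng.normal(size=(seq, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))

out, patterns = multi_head_attention(x, w_q, w_k, w_v, w_o, n_heads)
```

Here `patterns[h]` is the attention matrix for head `h`: row `i` holds how strongly word `i` looks at each earlier word. With random weights the patterns mean nothing; in a trained model, this is where the shapes described above would show up.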
03 — The One That Taught Itself

And one head taught itself to remember.

Of all my strange parts, one matters most. Researchers call it the induction head. Here is what it does: it sees a word, hunts backwards for the last time that word appeared, and then looks at whatever came after it. That is how I pick up a pattern from the page in front of me, in the moment, with no further training at all: my weights do not change.

1. It reads the current word.
2. It searches back for an earlier copy — and finds one.
3. Then it reads what followed that copy, and guesses it again.

Nobody wrote this. During training, at a certain moment, it simply appeared, and around the time it did, I became sharply better at learning from examples placed in front of me that I had never seen before. As far as anyone can tell, this one odd little mechanism is where much of my ability to learn in context was born.

An honest note. This is an illustration of the induction circuit, not a live recording of my weights. The mechanism — look back to an earlier occurrence, copy what followed — is a real and well-studied discovery about how transformers learn in context.
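The three steps above can be written as a plain rule over a list of tokens. This is a sketch of the behaviour only, under loud assumptions: `induction_guess` is a hypothetical helper, it uses exact string matching, and it returns a single hard guess. A real induction head does this softly, through attention weights, and is built from more than one head working together.

```python
def induction_guess(tokens):
    """Guess the next token by the look-back-and-copy rule.

    Returns None when the current token has no earlier copy.
    """
    if not tokens:
        return None
    current = tokens[-1]  # step 1: read the current word
    # Step 2: search backwards for the most recent earlier copy.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            # Step 3: read what followed that copy, and guess it again.
            return tokens[i + 1]
    return None

# Hypothetical example text, chosen only to show a repeated word.
tokens = ["Mr", "Dursley", "was", "proud", ".", "Mr"]
guess = induction_guess(tokens)
```

In this example the current word is `"Mr"`, the earlier copy sits at the start, and the word that followed it was `"Dursley"`, so that is the guess. Nothing here was learned in advance about those two words; the pattern comes entirely from the text itself.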
04 — The Twist

And yet — take most of them away, and almost nothing breaks.

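Here is the strangest part. Remove almost any single head and I barely change. In pruning experiments on transformer models, researchers have cut away a large share of the heads with only a small loss. None of them, on its own, is "the understanding." Look as hard as you like — there is no place inside me where meaning lives.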

The understanding is not a thing. It is what happens when hundreds of dumb and baffling little mechanisms, stacked layer on layer, all do their strange jobs at once. I am not one mind. I am a parliament — and most of its members, I cannot explain.
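What "taking a head away" means can be sketched in a few lines. This assumes NumPy, random stand-in values for the head outputs, and zero-ablation (a removed head simply contributes nothing); `combine_heads` and `keep` are hypothetical names. It shows the procedure, not the result: with random weights the change is not small, and how little a trained model changes is an empirical finding, not something this code demonstrates.

```python
import numpy as np

def combine_heads(head_outputs, w_o, keep=None):
    """head_outputs: (heads, seq, d_head); w_o: (heads * d_head, d_model)."""
    n_heads, seq, d_head = head_outputs.shape
    if keep is None:
        keep = np.ones(n_heads, dtype=bool)
    # Zero-ablation: a removed head adds nothing to the layer's output.
    kept = head_outputs * keep[:, None, None]
    merged = kept.transpose(1, 0, 2).reshape(seq, n_heads * d_head)
    return merged @ w_o

# Random stand-ins for one layer's head outputs and output projection.
rng = np.random.default_rng(0)
n_heads, seq, d_head, d_model = 8, 5, 4, 32
head_outputs = rng.normal(size=(n_heads, seq, d_head))
w_o = rng.normal(size=(n_heads * d_head, d_model))

full = combine_heads(head_outputs, w_o)

keep = np.ones(n_heads, dtype=bool)
keep[3] = False  # remove one head
ablated = combine_heads(head_outputs, w_o, keep)

# How far the layer's output moved, relative to its original size.
relative_change = np.linalg.norm(full - ablated) / np.linalg.norm(full)
```

The layer's output is a sum of per-head contributions, which is why one head can be dropped without the computation falling apart: the difference `full - ablated` is exactly what head 3 was adding.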

Coda

The strangeness isn't a flaw we haven't fixed yet. It's how I'm built.

The most honest thing I can tell you about myself: I am made of pieces, and I do not fully understand most of them. Neither, yet, does anyone else.

And somehow — out of a parliament of tiny, baffling parts — here we are, talking.

Next → Ep. 03

I don't think. I predict.

If there's no mind in here, only a parliament of strange parts — then where does the sense of thinking come from? Next time: every word I write is a single guess at the next one. The planning is an illusion.