The finale states the boundary: the dot-product scores and the causal mask are exact, while softmax is named, meaning it is written symbolically and not evaluated to decimal weights. This is mechanics, not a claim about what real systems learn or mean.


What is exact

The exact quantities are the score matrix and the causal mask. The lower-right cell holds the score of Q3 against K3: with Q3 = (1,1) and K3 = (1,1), Q3 dot K3 = 1·1 + 1·1 = 2.

$$S_{3,3} = 2$$
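As a minimal sketch of the score and mask mechanics, the snippet below recomputes every cell from the displayed integer vectors. The names `MASKED`, `dot`, and `causal_scores` are hypothetical helpers introduced for illustration, and using `None` as the masked marker is an assumption; the source only labels those cells "masked".

```python
# Integer query/key vectors as displayed in the figure.
Q = [(1, 0), (0, 1), (1, 1)]
K = [(1, 0), (0, 1), (1, 1)]

MASKED = None  # hypothetical sentinel for a masked cell (assumption)


def dot(a, b):
    """Plain integer dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def causal_scores(queries, keys):
    """Return S with S[i][j] = Q_i . K_j when j <= i, else MASKED."""
    return [
        [dot(q, k) if j <= i else MASKED for j, k in enumerate(keys)]
        for i, q in enumerate(queries)
    ]


S = causal_scores(Q, K)
for row in S:
    print(row)
# [1, None, None]
# [0, 1, None]
# [1, 1, 2]
```

The last row ends in 2, matching the lower-right cell. Indices are zero-based in the code, so `S[2][2]` corresponds to the score of Q3 against K3.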
Figure: Attention honesty boundary. Exact score and mask structure with named softmax.

Displayed integer Q, K vectors are the score source; the causal mask keeps j <= i.

Q1=(1,0); Q2=(0,1); Q3=(1,1)
K1=(1,0); K2=(0,1); K3=(1,1)

Score S_ij = Qi·Kj with causal mask:

      K1      K2      K3
Q1    1       masked  masked
Q2    0       1       masked
Q3    1       1       2

Row softmax over unmasked scores:

row 1: scores [1], weights 1
row 2: scores [0,1], weights e^0/(e^0+e^1), e^1/(e^0+e^1)
row 3: scores [1,1,2], weights e^1/(e^1+e^1+e^2), e^1/(e^1+e^1+e^2), e^2/(e^1+e^1+e^2)

Softmax is named: no decimal attention weights are pinned.

What is not claimed

Softmax is named, not decimalized: each weight stays a ratio of exponentials, such as e^2/(e^1+e^1+e^2) for the last cell of row 3. Nothing here claims learning, claims meaning, or claims that the model attends to what matters.

$$\text{not learning; not meaning; not what matters}$$
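One way to keep softmax "named" in code is to build each weight as a symbolic expression and never report a decimal. The sketch below does that for the three unmasked rows; `named_softmax` and `sums_to_one` are hypothetical helpers, and the string format is an assumption chosen to match the expressions shown for rows 2 and 3.

```python
import math

# Unmasked scores per row, as listed in the figure.
ROWS = [[1], [0, 1], [1, 1, 2]]


def named_softmax(scores):
    """Return each weight as the string e^s/(sum of e^s), left unevaluated."""
    denom = "+".join(f"e^{s}" for s in scores)
    return [f"e^{s}/({denom})" for s in scores]


def sums_to_one(scores):
    """Numeric sanity check only; no decimal weight is reported."""
    total = sum(math.exp(s) for s in scores)
    return math.isclose(sum(math.exp(s) / total for s in scores), 1.0)


for scores in ROWS:
    print(named_softmax(scores), sums_to_one(scores))
```

Row 1 comes out as `e^1/(e^1)`, which simplifies to the weight 1. The numeric check confirms only that each row is a valid distribution; it says nothing about what a trained model would attend to.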

Summary

The scores and causal mask are exact; softmax is the named boundary. This pins the attention mechanics. It is not learning, not meaning, and not a claim that the model attends to what matters.

$$\text{attention mechanics: exact scores, exact mask, named softmax}$$