Attention begins with dot-product scores between query and key vectors. This lesson takes three integer Q vectors and three integer K vectors and works out the exact score grid they generate.


Attention scores

Each query vector is compared with each key vector by a dot product. The displayed Q and K vectors are integer pairs, so every score is an exact integer.

$$S_{ij} = Q_i \cdot K_j$$
Figure: Attention scores. Q and K vectors produce exact dot-product scores.
Q1=(1,0); Q2=(0,1); Q3=(1,1)
K1=(1,0); K2=(0,1); K3=(1,1)
Score grid S_ij = Qi·Kj (rows Q1 to Q3, columns K1 to K3): [1,0,1], [0,1,1], [1,1,2]
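As a worked illustration, two entries of the grid can be expanded by hand. This assumes the usual two-component dot product (the sum of componentwise products) and uses the displayed vectors Q1=(1,0), Q3=(1,1), and K3=(1,1):

$$
\begin{aligned}
S_{13} &= Q_1 \cdot K_3 = (1)(1) + (0)(1) = 1,\\
S_{33} &= Q_3 \cdot K_3 = (1)(1) + (1)(1) = 2.
\end{aligned}
$$

Every term is a product of integers, which is why no rounding enters at this stage.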

Exact score rows

Computing every dot product row by row, one row per query, gives the score rows [1,0,1], [0,1,1], and [1,1,2].

$$S=\begin{bmatrix}1&0&1\\0&1&1\\1&1&2\end{bmatrix}$$
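The score rows can be checked with a few lines of plain Python. This is a minimal sketch rather than the lesson's own implementation: the helper names `dot` and `score_grid` are hypothetical, and the vectors are copied from the displayed Q and K values.

```python
# Q and K vectors as displayed in the lesson (integer pairs).
Q = [(1, 0), (0, 1), (1, 1)]
K = [(1, 0), (0, 1), (1, 1)]


def dot(u, v):
    """Dot product of two equal-length vectors (hypothetical helper)."""
    return sum(a * b for a, b in zip(u, v))


def score_grid(queries, keys):
    """Return S with S[i][j] = queries[i] . keys[j] (hypothetical helper)."""
    return [[dot(q, k) for k in keys] for q in queries]


S = score_grid(Q, K)
for row in S:
    print(row)
# prints:
# [1, 0, 1]
# [0, 1, 1]
# [1, 1, 2]
```

Because the inputs are Python integers, the arithmetic stays exact and each printed row matches the matrix entry for entry.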

Summary

At this point the score grid contains only integer dot products. The next step changes which scores are allowed to contribute.

$$\text{scores exact; no softmax yet}$$