Top-k keeps a short list of candidate tokens (branches), chosen by exact ordering of their logits. This lesson uses k = 2 on the first displayed step.

Top-k

Top-k keeps the k tokens with the largest logits, instead of only the single top token that greedy decoding keeps. The ordering is still exact integer comparison, with ties broken by lowest vocab index.

$$\text{top-}k=\text{largest logits by exact order}$$
Figure: Top-k selection. The top-k list is selected by exact integer ordering.
Exact greedy/top-k decode step. Displayed integer logits are the source; ties break by lowest vocab index.

step 1 logits: a=3, b=1, c=2

token  logit  rank
a      3      1
b      1      3
c      2      2

ranked order: a > c > b
top-2: a, c
greedy argmax: a
Selection uses ordering only; no probability row is displayed.
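
The selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own implementation: the function name `top_k` and the vocab indices a=0, b=1, c=2 are assumptions made for the example. Sorting on the pair (negative logit, vocab index) gives descending logits with ties broken by lowest vocab index.

```python
from typing import List, Tuple


def top_k(logits: List[int], k: int) -> List[Tuple[int, int]]:
    """Return the k (vocab_index, logit) pairs with the largest logits."""
    # Descending logit first; equal logits fall back to the lowest vocab index.
    order = sorted(range(len(logits)), key=lambda i: (-logits[i], i))
    return [(i, logits[i]) for i in order[:k]]


# Step 1 logits from the lesson; the index assignment a=0, b=1, c=2 is assumed.
vocab = ["a", "b", "c"]
step1 = [3, 1, 2]

top2 = [(vocab[i], value) for i, value in top_k(step1, 2)]
print(top2)  # [('a', 3), ('c', 2)]
```

Only integer comparisons are involved, so no softmax or probability computation is needed to build the list.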

Top two at the first step

For k=2, the first step keeps a (logit 3) and c (logit 2); b (logit 1) is dropped.

$$\text{top-}2=[a(3),c(2)]$$
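
As a worked check of this step, the short script below recomputes the ranked order, the rank of each token, the top-2 list, and the greedy choice from the step 1 logits. It is a sketch under one assumption: the dictionary's insertion order stands in for vocab index, which the lesson does not spell out.

```python
# Step 1 logits from the lesson. Insertion order (a, b, c) is assumed
# to match vocab index order.
logits = {"a": 3, "b": 1, "c": 2}

# Sort by descending logit; the enumerate position breaks ties by vocab order.
ranked = [
    tok
    for _, tok in sorted(enumerate(logits), key=lambda p: (-logits[p[1]], p[0]))
]
rank = {tok: r for r, tok in enumerate(ranked, start=1)}

k = 2
top_2 = [(tok, logits[tok]) for tok in ranked[:k]]
greedy = ranked[0]

print(" > ".join(ranked))  # a > c > b
print(rank)                # {'a': 1, 'c': 2, 'b': 3}
print(top_2)               # [('a', 3), ('c', 2)]
print(greedy)              # a
```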

Summary

Top-k broadens the displayed branch list, but it still uses exact ordering of the same integer logits.

$$\text{top-}k\text{ is deterministic selection}$$
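
The determinism claim can be illustrated with a case that contains ties. The logits below are hypothetical, chosen only to exercise the tie-break rule, and `top_k_indices` is an illustrative helper rather than part of the lesson. Because ties resolve to the lowest vocab index, repeated calls return the same list, and the greedy choice is simply the first entry of any top-k list.

```python
def top_k_indices(logits, k):
    # Descending logit, then ascending vocab index for equal logits.
    return sorted(range(len(logits)), key=lambda i: (-logits[i], i))[:k]


tied = [2, 3, 3, 2]  # hypothetical logits, not taken from the lesson

first = top_k_indices(tied, 3)
print(first)  # [1, 2, 0]

# The same input always yields the same list; nothing is sampled.
repeats_match = all(top_k_indices(tied, 3) == first for _ in range(100))
print(repeats_match)  # True

# Greedy argmax is top-1, which is the head of the top-3 list.
print(top_k_indices(tied, 1) == first[:1])  # True
```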