Captured logits are floating-point values shown as rounded approximations, while the greedy argmax is an exact, deterministic selection of a token ID.
Real logits
The top-k rows are logits captured from the final generation step. Their magnitudes are fp32 values shown to 6 significant digits, so the diagram marks them with ≈ labels.
$$\text{captured logits are }\approx\text{-labeled}$$
Figure: Captured logits to greedy token. Captured final-step top logits and exact greedy pick.

tensor: step02.logits_last, shape [1, 50257]

| rank | token id | logit | precision |
|------|----------|-------|-----------|
| 1 | 373 | ≈19.3264 | captured fp32, shown to 6 sig-digits |
| 2 | 547 | ≈15.4883 | captured fp32, shown to 6 sig-digits |
| 3 | 5615 | ≈14.7227 | captured fp32, shown to 6 sig-digits |
| 4 | 11 | ≈11.5574 | captured fp32, shown to 6 sig-digits |
| 5 | 318 | ≈11.2361 | captured fp32, shown to 6 sig-digits |

exact greedy argmax: token id = 373, text ' was'
final output text = 'Once upon a time, there was'; output IDs = [7454, 2402, 257, 640, 11, 612, 373]
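As a rough illustration of why the magnitudes carry a ≈ label, the sketch below rounds a value to fp32 and then formats it to 6 significant digits. The helper names `to_fp32` and `approx_label` are hypothetical and not part of the captured run; the input decimals are the displayed values from the figure, not the full stored fp32 bits.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def approx_label(x: float, sig: int = 6) -> str:
    """Format a value to `sig` significant digits with a leading ≈ marker."""
    return "≈" + format(x, f".{sig}g")

# Displayed logits from the figure (assumed inputs for this sketch).
displayed = [19.3264, 15.4883, 14.7227, 11.5574, 11.2361]

# The stored fp32 value generally has more digits than the label shows,
# which is why the label is marked approximate.
labels = [approx_label(to_fp32(v)) for v in displayed]
print(labels)
```

The label is a display choice only: rounding for presentation does not change which row holds the largest stored value.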
Exact greedy decision
The greedy argmax is discrete and deterministic. The selected token ID is 373, whose token text is ' was' (with a leading space).
$$\operatorname{argmax}\rightarrow 373$$
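The greedy step can be sketched in a few lines, under some assumptions: the real argmax runs over all 50257 vocabulary entries, but because the rank-1 row is the global maximum, restricting to the five captured rows gives the same token ID. The names `top_k`, `greedy_argmax`, and `prefix_ids` are hypothetical, and the tie-break rule (lower token ID wins) is an assumption that does not matter for these values.

```python
# (token id, displayed logit) pairs from the captured top-5 rows.
top_k = [
    (373, 19.3264),
    (547, 15.4883),
    (5615, 14.7227),
    (11, 11.5574),
    (318, 11.2361),
]

def greedy_argmax(rows):
    """Return the token id with the largest logit.

    Ties are broken toward the lower token id (an assumption for this sketch).
    """
    best_id, _ = max(rows, key=lambda r: (r[1], -r[0]))
    return best_id

# Token ids already in the sequence before the final step
# (the output IDs from the figure, minus the last one).
prefix_ids = [7454, 2402, 257, 640, 11, 612]

next_id = greedy_argmax(top_k)
output_ids = prefix_ids + [next_id]
print(next_id, output_ids)
```

The result is an integer token ID, so it is exact even though the logits that produced it are only shown approximately.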
Summary
This is the Book 16 split in a real run: magnitude rows are REAL CAPTURED, while ordering and greedy selection are exact integer decisions.
$$\text{captured magnitudes, exact selection}$$