This finale marks the boundary of the book's flagship exact step. The arithmetic of that single step is exact; any broader claim is explicitly out of scope here.

What is exact

This book computes one exact backprop step on the same small rational network. The key register includes dL/dw11=-2, dL/dw12=-4, and dL/db1=-2.

$$\frac{dL}{dw_{11}}=-2,\quad \frac{dL}{dw_{12}}=-4,\quad \frac{dL}{db_1}=-2$$
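
As a worked check, the three values above follow from the chain rule along the path through the first hidden unit. This sketch assumes the loss is the squared error $L=(\hat{y}-t)^2$ with target $t=3$; the text does not state it, and it is inferred from the figure values ($L=1$ and $dL/d\hat{y}=-2$ at $\hat{y}=2$). The convention $z_1 = w_{11}x_1 + w_{12}x_2 + b_1$ is likewise read off the figure.

$$
\begin{aligned}
\frac{dL}{d\hat{y}} &= 2(\hat{y}-t) = 2(2-3) = -2,\\
\frac{dL}{dh_1} &= \frac{dL}{d\hat{y}}\,v_1 = (-2)(1) = -2,\\
\frac{dL}{dz_1} &= \frac{dL}{dh_1}\,\mathrm{ReLU}'(z_1) = (-2)(1) = -2,\\
\frac{dL}{dw_{11}} &= \frac{dL}{dz_1}\,x_1 = (-2)(1) = -2,\\
\frac{dL}{dw_{12}} &= \frac{dL}{dz_1}\,x_2 = (-2)(2) = -4,\\
\frac{dL}{db_1} &= \frac{dL}{dz_1} = -2.
\end{aligned}
$$
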
[Figure: One exact backprop step. The reverse-chain arithmetic is shown without broader claims.]
Inputs and target: x1=1, x2=2, target=3.
Parameters: w11=1, w12=1, w21=1, w22=-1, b1=-1, b2=0, v1=1, v2=1, c=0.
Forward values: z1=2, z2=-1, h1=2, h2=0, yhat=2, L=1.
Node gradients: L=1, yhat=-2, h1=-2, h2=-2, z1=-2, z2=0, x1=-2, x2=-2.
ReLU derivatives: ReLU'(z1)=1, ReLU'(z2)=0 (convention at z=0: ReLU'(0)=0).
Parameter gradients: dv1=-4, dv2=0, dc=-2, dw11=-2, dw12=-4, db1=-2, dw21=0, dw22=0, db2=0.
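
The figure's values can be reproduced with exact rational arithmetic. The following is a minimal sketch, not the book's own code: it uses Python's standard `fractions.Fraction`, takes the inputs, weights, and the ReLU'(0)=0 convention from the figure, and assumes a squared-error loss `(yhat - target)**2`, which is inferred from L=1 and the gradient -2 at yhat rather than stated in the text. The function and variable names are illustrative.

```python
from fractions import Fraction as F


def relu(z):
    # ReLU on an exact rational value.
    return z if z > 0 else F(0)


def relu_grad(z):
    # Convention taken from the figure: ReLU'(0) = 0.
    return F(1) if z > 0 else F(0)


def backprop_step():
    # Inputs, target, and parameters as listed in the figure.
    x1, x2, target = F(1), F(2), F(3)
    w11, w12, w21, w22 = F(1), F(1), F(1), F(-1)
    b1, b2 = F(-1), F(0)
    v1, v2, c = F(1), F(1), F(0)

    # Forward pass.
    z1 = w11 * x1 + w12 * x2 + b1
    z2 = w21 * x1 + w22 * x2 + b2
    h1, h2 = relu(z1), relu(z2)
    yhat = v1 * h1 + v2 * h2 + c
    loss = (yhat - target) ** 2  # ASSUMPTION: squared error, inferred from the figure

    # Reverse pass: one chain-rule sweep from the loss back to the parameters.
    dyhat = 2 * (yhat - target)
    dh1, dh2 = dyhat * v1, dyhat * v2
    dz1, dz2 = dh1 * relu_grad(z1), dh2 * relu_grad(z2)
    grads = {
        "dv1": dyhat * h1, "dv2": dyhat * h2, "dc": dyhat,
        "dw11": dz1 * x1, "dw12": dz1 * x2, "db1": dz1,
        "dw21": dz2 * x1, "dw22": dz2 * x2, "db2": dz2,
    }
    return loss, grads


loss, grads = backprop_step()
for name, value in grads.items():
    print(name, value)
```

Every quantity stays a `Fraction` throughout, so the results can be compared to the figure with exact equality instead of a floating-point tolerance. The code computes gradients only; no parameter is updated.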

What is not claimed

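This network has nine parameters (four hidden weights, two hidden biases, two output weights, and one output bias), not 100B. The step shown is a single gradient computation: it is not training, not convergence, and not learning, and it makes no generalization claim.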

One exact step; not training or convergence.

Summary

This is one exact backprop step on a small rational toy network. Because ReLU's derivative is either zero or one, every gradient stays an exact rational number. It is not training, not convergence, and not learning, and it makes no generalization claim.

Backprop mechanics on toy rational data.