Assembly starts by slicing reads into overlapping k-mers.


Assembly starts by slicing reads into k-mers

For this tiny read, ATGCG, choose k = 3. A read of length L contains L - k + 1 overlapping k-mers, so this read has 5 - 3 + 1 = 3.

$$\text{k-mers} = \text{all length-}3\text{ substrings}$$

Overlapping k-mers
Each row is sliced directly from the source DNA string.

Start  k-mer  Prefix  Suffix
0      ATG    AT      TG
1      TGC    TG      GC
2      GCG    GC      CG
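The slicing step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `kmers` and its return shape are assumptions made for this example.

```python
def kmers(read: str, k: int) -> list[tuple[int, str]]:
    """Return (start, k-mer) pairs for every length-k substring of `read`.

    `kmers` is a hypothetical helper name used only for this sketch.
    """
    if k <= 0 or k > len(read):
        return []  # no k-mers fit in the read
    # A read of length L has L - k + 1 start positions.
    return [(start, read[start:start + k]) for start in range(len(read) - k + 1)]


for start, kmer in kmers("ATGCG", 3):
    print(start, kmer)
# prints:
# 0 ATG
# 1 TGC
# 2 GCG
```

Each window starts one position after the previous one, so consecutive k-mers share k - 1 characters.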

Each k-mer has a prefix and suffix

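With k = 3, every node will be a string of length k - 1 = 2. Each k-mer's prefix is its first k - 1 characters and its suffix is its last k - 1 characters. The prefix and suffix columns are exactly what the graph will connect.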

$$\text{node length} = k - 1$$
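As a rough sketch of this step, the prefix and suffix columns can be computed by dropping the last or the first character of each k-mer. The names `prefix_suffix`, `table`, and `nodes` are assumptions introduced for illustration only.

```python
def prefix_suffix(kmer: str) -> tuple[str, str]:
    """Split a k-mer into its (k-1)-length prefix and suffix.

    `prefix_suffix` is a hypothetical helper name used only for this sketch.
    """
    return kmer[:-1], kmer[1:]


read, k = "ATGCG", 3

# Rebuild the table rows: (start, k-mer, prefix, suffix).
table = []
for start in range(len(read) - k + 1):
    kmer = read[start:start + k]
    prefix, suffix = prefix_suffix(kmer)
    table.append((start, kmer, prefix, suffix))

# Every distinct prefix or suffix is a candidate node of length k - 1.
nodes = sorted({node for _, _, p, s in table for node in (p, s)})

for row in table:
    print(*row)
print(nodes)
```

For the read ATGCG this reproduces the three table rows and yields four distinct length-2 strings.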