Reshaping by Hand
Zip Columns to Rows
Three parallel column lists (names, cats, and vals) are assembled into a
list of row tuples using an index loop. The replay steps through each index,
building one tuple per iteration: (names[i], cats[i], vals[i]).
By hand
The Pythonic way
zip(names, cats, vals) pairs the elements by position and returns an
iterator of tuples. Wrap it in list() to materialise all rows at once.
naive.py
names = ['ann', 'bob', 'cal', 'dan']
cats = ['a', 'b', 'a', 'b']
vals = [10, 20, 30, 40]
rows = []
for i in range(len(names)):
    # Build one row tuple from the i-th element of each column.
    rows.append((names[i], cats[i], vals[i]))
print('RESULT:', rows)
library.py
names = ['ann', 'bob', 'cal', 'dan']
cats = ['a', 'b', 'a', 'b']
vals = [10, 20, 30, 40]
rows = list(zip(names, cats, vals))
print('RESULT:', rows)
RESULT: [('ann', 'a', 10), ('bob', 'b', 20), ('cal', 'a', 30), ('dan', 'b', 40)]
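The reason library.py wraps zip in list() can be seen in a short sketch: zip returns a lazy iterator that is exhausted after one pass. The lists are the same ones used above; the names pairs, first_pass, and second_pass are illustrative only.

```python
names = ['ann', 'bob', 'cal', 'dan']
cats = ['a', 'b', 'a', 'b']
vals = [10, 20, 30, 40]

pairs = zip(names, cats, vals)  # lazy: no row tuples have been built yet
first_pass = list(pairs)        # consumes the iterator and builds all rows
second_pass = list(pairs)       # the iterator is already exhausted

print(first_pass[0])  # ('ann', 'a', 10)
print(second_pass)    # []
```

If the rows are needed more than once, keeping the list (as library.py does) avoids this surprise; if they are consumed in a single loop, the bare iterator is enough.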
Implementation notes
- The index loop and zip are equivalent only when all three lists have the same length. With mismatched lengths, zip silently stops at the shortest list, while the index loop raises IndexError whenever cats or vals is shorter than names. Column data is usually generated together and is therefore length-aligned, so zip is a reasonable default; on Python 3.10 and later, zip(names, cats, vals, strict=True) raises ValueError on a mismatch instead of truncating.
- This operation is invertible: names, cats, vals = zip(*rows) unpacks a list of row tuples back into three column tuples.
- In pandas, parallel column lists are passed as a dict to the constructor, pd.DataFrame({'name': names, 'cat': cats, 'val': vals}), so no explicit zipping is needed.
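The first two notes can be checked with a small sketch. It assumes one column has been deliberately shortened, and it uses a hypothetical helper, zip_columns, that is not part of the original listings: it adds an explicit length check that should work on any Python 3 version.

```python
def zip_columns(*columns):
    """Hypothetical helper: zip columns into rows, refusing ragged input."""
    lengths = {len(col) for col in columns}
    if len(lengths) > 1:
        raise ValueError(f"column lengths differ: {sorted(lengths)}")
    return list(zip(*columns))

names = ['ann', 'bob', 'cal', 'dan']
cats = ['a', 'b', 'a', 'b']
short_vals = [10, 20, 30]  # one value short, to show the difference

# Plain zip stops at the shortest column, so 'dan' is dropped without warning.
truncated = list(zip(names, cats, short_vals))
print(len(truncated))  # 3

# The checked helper rejects the same input instead of truncating.
try:
    zip_columns(names, cats, short_vals)
except ValueError as exc:
    print('rejected:', exc)

# Round trip on aligned data: zip(*rows) hands back tuples, not lists.
rows = zip_columns(names, cats, [10, 20, 30, 40])
names2, cats2, vals2 = zip(*rows)
print(names2)  # ('ann', 'bob', 'cal', 'dan')
```

Because the round trip yields tuples, a caller that needs lists again would wrap each column in list().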