Left Join with Missing - Data Handling in Pure Python

Join two tables by a shared key, keeping every left-side row regardless of whether the key exists in the right table. Unmatched rows receive None for the right-side value. The replay shows result growing one pair per order, with (3, None) appearing for the unmatched key 'x'.

By hand

The Pythonic way

A list comprehension over zip(order_ids, order_ckeys) with index.get(k) handles matched and unmatched rows in one expression, with no filter clause.

naive.py

order_ids = [1, 2, 3, 4, 5]
order_ckeys = ['a', 'b', 'x', 'c', 'a']
cust_keys = ['a', 'b', 'c', 'd']
cust_names = ['alice', 'bob', 'carol', 'dave']
index = {}
for k, name in zip(cust_keys, cust_names):
    index[k] = name
result = []
for oid, k in zip(order_ids, order_ckeys):
    result.append((oid, index.get(k)))
print('RESULT:', result)

library.py

order_ids = [1, 2, 3, 4, 5]
order_ckeys = ['a', 'b', 'x', 'c', 'a']
cust_keys = ['a', 'b', 'c', 'd']
cust_names = ['alice', 'bob', 'carol', 'dave']
index = {k: name for k, name in zip(cust_keys, cust_names)}
result = [(oid, index.get(k)) for oid, k in zip(order_ids, order_ckeys)]
print('RESULT:', result)

RESULT: [(1, 'alice'), (2, 'bob'), (3, None), (4, 'carol'), (5, 'alice')]

Implementation notes

The key difference from inner-join-by-key: the inner join drops row 3 entirely; the left join keeps it with None. Row count is always preserved for the left table in a left join.
index.get(k) returns None by default; pass a second argument (index.get(k, 'unknown')) to substitute a different sentinel.
This is the mechanism behind pd.merge(orders, customers, on='key', how='left') in the python-pandas track — mechanism here, API there.
The simpler loop (no if branch) also produces fewer trace events than the inner join version for the same dataset.