GroupBy Mean
Split a table of rows into groups by a category column, then compute the mean of a value column within each group. Groupby-mean is one of the most common one-liner operations in data analysis.
By hand
With Pandas
pd.DataFrame loads the dict of column lists into a typed table. groupby('cat')
partitions the rows by category, ['val'] selects the value column, and .mean()
reduces each partition to its mean, all in one chain. The result is a Series
indexed by category, with the categories in sorted order.
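To make each link of the chain visible, here is a minimal sketch that splits it into separate steps on the same toy data used in this section. The intermediate names `grouped` and `selected` are introduced here for illustration only and do not appear in the original listing.

```python
import pandas as pd

# Same toy data as the listing in this section.
data = pd.DataFrame({
    'cat': ['a', 'b', 'a', 'b', 'a', 'b'],
    'val': [10, 20, 30, 40, 50, 60],
})

grouped = data.groupby('cat')   # rows partitioned by the 'cat' column
selected = grouped['val']       # keep only the 'val' column of each group
means = selected.mean()         # reduce each group to its mean

print(type(means).__name__)     # Series
print(list(means.index))        # ['a', 'b']
print(float(means['a']), float(means['b']))
```

Nothing is computed until `.mean()` is called; the earlier steps only describe how the rows are grouped and which column to reduce.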
naive.py
cats = ['a', 'b', 'a', 'b', 'a', 'b']
vals = [10, 20, 30, 40, 50, 60]
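# Accumulate a running sum and a row count for each category.
totals = {}
counts = {}
for cat, val in zip(cats, vals):
    totals[cat] = totals.get(cat, 0) + val
    counts[cat] = counts.get(cat, 0) + 1
# Mean = sum / count, with categories in sorted order.
means = {cat: totals[cat] / counts[cat] for cat in sorted(totals)}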
print('RESULT:', means)
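The loop above keeps running totals rather than materialising the groups. As an alternative sketch that follows the split-then-reduce description more literally, the version below first collects each category's values into a list and then averages each list. It uses only the standard library (`collections.defaultdict` and `statistics.fmean`, the latter available from Python 3.8); the name `groups` is an illustrative choice, not part of the original listing.

```python
from collections import defaultdict
from statistics import fmean

cats = ['a', 'b', 'a', 'b', 'a', 'b']
vals = [10, 20, 30, 40, 50, 60]

# Split: collect every value under its category.
groups = defaultdict(list)
for cat, val in zip(cats, vals):
    groups[cat].append(val)

# Reduce: average each group's values, categories in sorted order.
means = {cat: fmean(groups[cat]) for cat in sorted(groups)}
print('RESULT:', means)
```

This variant holds every value in memory until the reduce step, so the running-totals loop is the better fit for large inputs; the explicit split is mainly useful when other per-group statistics, such as a median, are needed as well.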
library.py
import pandas as pd
from dalib.display import set_display
set_display()
data = pd.DataFrame({
    'cat': ['a', 'b', 'a', 'b', 'a', 'b'],
    'val': [10, 20, 30, 40, 50, 60],
})
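# Group rows by 'cat', select 'val', and take the mean of each group.
means = data.groupby('cat')['val'].mean()
# Convert the Series to a plain dict of Python floats for printing.
result = {k: float(means[k]) for k in sorted(means.index)}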
print('RESULT:', result)
RESULT: {'a': 30.0, 'b': 40.0}