GroupBy Mean - Pandas Step by Step

Split a table of rows into groups by a category column, then compute the mean of a value column within each group. Groupby-mean is one of the most common one-liner operations in data analysis.

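By hand

Keep a running total and a row count per category in two dicts, then divide each total by its count (see naive.py).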

With Pandas

pd.DataFrame loads the dict of column lists into a typed table. groupby('cat') partitions the rows, ['val'] selects the column, and .mean() reduces each partition to its mean, all in one chain. The result is a float64 Series indexed by category.
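The chain can also be unpacked one step at a time to see what each stage produces. This is a minimal sketch that reuses the sample data from library.py; the intermediate names (grouped, groups, selected) are illustrative only and do not appear in the original scripts.

```python
import pandas as pd

data = pd.DataFrame({
    'cat': ['a', 'b', 'a', 'b', 'a', 'b'],
    'val': [10, 20, 30, 40, 50, 60],
})

# Step 1: split. groupby is lazy; it only records which row labels
# belong to each category key.
grouped = data.groupby('cat')
groups = {key: labels.tolist() for key, labels in grouped.groups.items()}
print('groups:', groups)

# Step 2: select the value column within each group.
selected = grouped['val']

# Step 3: reduce each group to its mean.
means = selected.mean()
print('means:', means.to_dict())
```

Splitting the chain this way is mostly useful for inspection; the one-line form in library.py computes the same Series.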

naive.py

cats = ['a', 'b', 'a', 'b', 'a', 'b']
vals = [10, 20, 30, 40, 50, 60]
# Running sum and row count per category.
totals = {}
counts = {}
for cat, val in zip(cats, vals):
    totals[cat] = totals.get(cat, 0) + val
    counts[cat] = counts.get(cat, 0) + 1
# Mean = total / count, with categories in sorted order.
means = {cat: totals[cat] / counts[cat] for cat in sorted(totals)}
print('RESULT:', means)

library.py

import pandas as pd
from dalib.display import set_display
set_display()

data = pd.DataFrame({
    'cat': ['a', 'b', 'a', 'b', 'a', 'b'],
    'val': [10, 20, 30, 40, 50, 60],
})
means = data.groupby('cat')['val'].mean()
result = {k: float(means[k]) for k in sorted(means.index)}
print('index:', sorted(means.index.tolist()))
print('values:', [float(means[k]) for k in sorted(means.index)])
print('dtype:', means.dtype)
print('RESULT:', result)

index: ['a', 'b']
values: [30.0, 40.0]
dtype: float64
RESULT: {'a': 30.0, 'b': 40.0}
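As a quick cross-check, the two approaches can be compared directly in one script. The sketch below wraps the naive.py loop in a function named groupby_mean_by_hand (a hypothetical name chosen for illustration) and compares its result with the pandas Series using math.isclose, since exact float equality is not guaranteed for arbitrary data.

```python
import math

import pandas as pd


def groupby_mean_by_hand(cats, vals):
    """Hypothetical helper: the naive.py loop wrapped in a function."""
    totals, counts = {}, {}
    for cat, val in zip(cats, vals):
        totals[cat] = totals.get(cat, 0) + val
        counts[cat] = counts.get(cat, 0) + 1
    return {cat: totals[cat] / counts[cat] for cat in sorted(totals)}


cats = ['a', 'b', 'a', 'b', 'a', 'b']
vals = [10, 20, 30, 40, 50, 60]

by_hand = groupby_mean_by_hand(cats, vals)
with_pandas = pd.DataFrame({'cat': cats, 'val': vals}).groupby('cat')['val'].mean()

# Same set of categories, and each mean matches within float tolerance.
agree = (
    set(by_hand) == set(with_pandas.index)
    and all(math.isclose(by_hand[k], with_pandas[k]) for k in by_hand)
)
print('agree:', agree)
```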