Decision Trees
Decision Stump Predict
Apply a given depth-1 decision tree (stump): if x < threshold → left_class,
else → right_class. A single loop classifies each new point. Library:
DecisionTreeClassifier(max_depth=1).fit(X_train, y_train).predict(X_new).
RESULT: predicted class list.
By hand
With scikit-learn
DecisionTreeClassifier(max_depth=1) trains on the same data; .predict
applies the identical threshold rule to new points.
naive.py
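# Stump parameters, taken as given (no fitting in this lesson)
threshold = 4.0
left_class = 0
right_class = 1
X_new = [0, 3, 5, 9]
preds = []
for x in X_new:
    # Strict '<' sends points below the threshold to the left leaf
    c = left_class if x < threshold else right_class
    preds.append(c)
print('RESULT:', preds)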
library.py
from sklearn.tree import DecisionTreeClassifier
from dalib.display import set_display
set_display()
X_train = [[1], [2], [3], [5], [7], [8]]
y_train = [0, 0, 0, 1, 1, 1]
clf = DecisionTreeClassifier(max_depth=1, random_state=0)
clf.fit(X_train, y_train)
X_new = [[0], [3], [5], [9]]
preds = clf.predict(X_new).tolist()
print('threshold:', round(float(clf.tree_.threshold[0]), 4))
print('RESULT:', preds)
threshold: 4.0
RESULT: [0, 0, 1, 1]
Implementation notes
- The stump rule is hardcoded from best-split-find (this chapter); this lesson isolates the predict step from the fit step.
- sklearn sends a sample left when feature ≤ threshold; the naive version uses strict <. Both agree here because no test point equals 4.0.
- Majority class per leaf: left (training X=1,2,3) → all class 0; right (training X=5,7,8) → all class 1. Impure leaves would use majority vote.
- Cross-reference: knn-classify-majority (ch02) for another simple classifier; a stump is O(1) per query at predict time vs O(n) for brute-force kNN over n training points.
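The boundary convention and the majority-vote rule from the notes above can be sketched in plain Python. This is an illustrative sketch, not part of the lesson's code: the helper names leaf_majority and stump_predict, the inclusive flag, and the extra training point (3.5, class 1) are all assumptions introduced here to make the left leaf impure and to put a test point exactly on the threshold.

```python
from collections import Counter

def leaf_majority(labels):
    """Return the most common label in a leaf (hypothetical helper)."""
    return Counter(labels).most_common(1)[0][0]

def stump_predict(x, threshold, left_class, right_class, inclusive=True):
    """Route x left on x <= threshold (inclusive) or x < threshold (strict)."""
    goes_left = x <= threshold if inclusive else x < threshold
    return left_class if goes_left else right_class

# Assumed impure data: one class-1 point (3.5) falls left of the split.
X_train = [1, 2, 3, 3.5, 5, 7, 8]
y_train = [0, 0, 0, 1, 1, 1, 1]
threshold = 4.0

# Each leaf predicts the majority class of the training points it holds.
left_labels = [y for x, y in zip(X_train, y_train) if x <= threshold]
right_labels = [y for x, y in zip(X_train, y_train) if x > threshold]
left_class = leaf_majority(left_labels)    # class 0 wins 3 votes to 1
right_class = leaf_majority(right_labels)  # pure leaf, all class 1

# 4.0 sits exactly on the threshold, so the two conventions disagree on it.
X_new = [0, 3, 4.0, 5, 9]
preds_strict = [stump_predict(x, threshold, left_class, right_class, inclusive=False)
                for x in X_new]
preds_inclusive = [stump_predict(x, threshold, left_class, right_class, inclusive=True)
                   for x in X_new]
print('strict    <  :', preds_strict)     # [0, 0, 1, 1, 1]
print('inclusive <= :', preds_inclusive)  # [0, 0, 0, 1, 1]
```

Only the point at 4.0 changes class between the two runs, which is why the hand-written stump and sklearn agree on the lesson's test points: none of them equals the threshold.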