Activity 1

"Who Are Your Neighbors?"

Age   Annual Spending ($k)   Label
25    2                      Low
35    5                      Low
45    8                      High
22    1.5                    Low
55    12                     High
30    4                      Low
40    7                      High
28    3                      Low
50    10                     High
38    6                      High
60    15                     High
33    4.5                    Low

New customer: Age=32, Spending=$5k, Label=?

Your Task

  1. Find the 3 closest customers to the new one (by eyeballing age and spending).
  2. Based on those 3 neighbors, what label would you assign?
  3. What if you used 7 neighbors instead -- would the answer change?
Solution

This is K-Nearest Neighbors (KNN) -- classify by majority vote of the $k$ closest training points. With $k=3$, the three nearest customers -- (33, 4.5), (30, 4), and (35, 5) -- are all Low, so you get "Low." With $k=7$, the vote here is still "Low" (5 of the 7 nearest are Low), but larger $k$ pulls in more distant points and can flip the decision near a class boundary. The choice of $k$ is critical: small $k$ is sensitive to noise; large $k$ gives smoother boundaries.
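The vote can be sketched in a few lines of Python, using the table above as the training set (`knn_predict` is an illustrative helper, not a library function):

```python
import math
from collections import Counter

# Training data from the table: (age, spending in $k, label)
customers = [
    (25, 2, "Low"), (35, 5, "Low"), (45, 8, "High"), (22, 1.5, "Low"),
    (55, 12, "High"), (30, 4, "Low"), (40, 7, "High"), (28, 3, "Low"),
    (50, 10, "High"), (38, 6, "High"), (60, 15, "High"), (33, 4.5, "Low"),
]

def knn_predict(query, data, k):
    """Classify `query` by majority vote among its k nearest neighbors."""
    # Sort training points by Euclidean distance to the query point
    by_distance = sorted(data, key=lambda c: math.dist(query, c[:2]))
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

new_customer = (32, 5)
print(knn_predict(new_customer, customers, k=3))  # Low
print(knn_predict(new_customer, customers, k=7))  # Low (5 of 7 neighbors are Low)
```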

Activity 2

"How Far Apart?"

Customer   Income ($k)   Age   Debt Ratio
A          50            30    0.4
B          80            45    0.2
[Figure: Comparison of Euclidean and Manhattan distance metrics]

Your Task

  1. Calculate |A-B| for each feature.
  2. If income is in dollars and age in years, does the raw distance make sense?
  3. What if you multiply income by 1000 (to get actual dollars)?
Solution

Raw differences: |50-80| = 30, |30-45| = 15, |0.4-0.2| = 0.2. Income dominates the distance purely because of its scale -- and multiplying it by 1000 (actual dollars) makes the distance essentially income alone. Feature scaling (standardizing each feature to mean 0, standard deviation 1) ensures all features contribute comparably. Without scaling, KNN effectively ignores low-magnitude features like the debt ratio.
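A quick sketch of the scale problem, using customers A and B from the table; the three extra rows are made up so the column mean and standard deviation are meaningful, and `standardize` is an illustrative helper:

```python
import math
from statistics import mean, pstdev

# Rows: (income $k, age, debt ratio). A and B come from the table;
# the last three rows are hypothetical filler for the column statistics.
data = [[50, 30, 0.4], [80, 45, 0.2], [60, 25, 0.5], [40, 50, 0.1], [70, 35, 0.3]]
A, B = data[0], data[1]

def standardize(data):
    """Rescale each feature column to mean 0 and standard deviation 1."""
    cols = list(zip(*data))
    stats = [(mean(c), pstdev(c)) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in data]

print(round(math.dist(A, B), 2))  # 33.54 -- almost entirely the income gap of 30
scaled = standardize(data)
print(round(math.dist(scaled[0], scaled[1]), 2))  # 3.02 -- all features now contribute
```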

Activity 3

"Draw the Boundaries"

[Figure: KNN decision boundaries showing how classification regions change with k]

Your Task

  1. Where do the decision boundaries fall?
  2. Are they smooth or jagged?
  3. What would happen if you increased $k$ to a very large number?
Solution

KNN creates piecewise boundaries that trace around training points. Small $k$ = jagged, complex boundaries (high variance). Large $k$ = smoother boundaries (high bias). As $k \to n$, the model just predicts the overall majority class.
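The effect of $k$ near a boundary can be seen directly. A sketch using the Activity 1 training data, with a query point (37, 6) chosen for illustration because it sits between the two classes:

```python
import math
from collections import Counter

# Activity 1 training data: (age, spending in $k, label)
data = [(25, 2, "Low"), (35, 5, "Low"), (45, 8, "High"), (22, 1.5, "Low"),
        (55, 12, "High"), (30, 4, "Low"), (40, 7, "High"), (28, 3, "Low"),
        (50, 10, "High"), (38, 6, "High"), (60, 15, "High"), (33, 4.5, "Low")]

def knn_predict(query, data, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(data, key=lambda p: math.dist(query, p[:2]))[:k]
    return Counter(label for _, _, label in nearest).most_common(1)[0][0]

# Near the boundary, the prediction flips as k grows and the jagged
# small-k regions get smoothed away.
for k in (1, 3, 5):
    print(k, knn_predict((37, 6), data, k))  # 1 High, 3 High, 5 Low
```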

Activity 4

"Group These Customers"

Customer   Age   Spending ($k)
1          22    1.5
2          25    2
3          45    8
4          48    9
5          50    10
6          23    1
7          47    7.5
8          30    3.5
9          52    11
10         26    2.5

Your Task

  1. Divide these 10 customers into 3 groups that "make sense" to you.
  2. Describe each group in plain English (e.g., "young, low spenders").
  3. How did you decide which group each customer belongs to?
Solution

You just did clustering -- grouping similar items without labels. K-Means does this automatically: (1) pick $k$ random centers, (2) assign each point to the nearest center, (3) recompute centers, (4) repeat until stable. Your groups likely match: Young/Low, Middle/Medium, Older/High.
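The four steps can be sketched as a minimal Lloyd's-algorithm implementation on the table's data (the starting centroids are an arbitrary choice for illustration, standing in for the random picks of step 1):

```python
import math
from statistics import mean

# Activity 4 customers: (age, spending in $k)
points = [(22, 1.5), (25, 2), (45, 8), (48, 9), (50, 10),
          (23, 1), (47, 7.5), (30, 3.5), (52, 11), (26, 2.5)]

def kmeans(points, centroids):
    """K-Means: assign points, recompute centers, repeat until stable."""
    while True:
        # Step 2: assign each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each centroid as the mean of its cluster
        new = [(mean(x for x, _ in c), mean(y for _, y in c)) if c else centroids[i]
               for i, c in enumerate(clusters)]
        # Step 4: repeat until the centroids stop moving
        if new == centroids:
            return clusters, centroids
        centroids = new

# Step 1: pick k starting centers (hard-coded here instead of random)
clusters, centers = kmeans(points, centroids=[(20, 1), (35, 5), (50, 10)])
for center, cluster in zip(centers, clusters):
    print(center, cluster)
```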

Activity 5

"Watch the Algorithm"

[Figure: K-Means clustering iterations showing centroids moving and cluster assignments changing]

Your Task

  1. What changed between iterations?
  2. When would the algorithm stop?
  3. What if the starting points were in different positions -- would you get the same clusters?
Solution

Each iteration: reassign points to the nearest centroid, then recompute centroids. The algorithm stops when assignments no longer change (convergence). Different starting centroids can yield different final clusters -- this is initialization sensitivity. Common remedies: run the algorithm several times with different random starts and keep the best result, or use a smarter initialization scheme such as k-means++.
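The random-restart remedy is easy to sketch: run several times from random starts and keep the run with the lowest within-cluster sum of squares (WCSS). The `kmeans` and `wcss` helpers below are illustrative, using the Activity 4 customers:

```python
import math
import random
from statistics import mean

# Activity 4 customers: (age, spending in $k)
points = [(22, 1.5), (25, 2), (45, 8), (48, 9), (50, 10),
          (23, 1), (47, 7.5), (30, 3.5), (52, 11), (26, 2.5)]

def kmeans(points, centroids):
    """Minimal K-Means: assign, recompute, repeat until stable."""
    while True:
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new = [(mean(x for x, _ in c), mean(y for _, y in c)) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:
            return clusters, centroids
        centroids = new

def wcss(clusters, centroids):
    """Within-cluster sum of squared distances: lower = tighter clusters."""
    return sum(math.dist(p, ctr) ** 2
               for cluster, ctr in zip(clusters, centroids) for p in cluster)

# Ten restarts from random starting centroids; individual runs may land
# in different local optima, so keep the best-scoring one.
random.seed(0)
runs = [kmeans(points, random.sample(points, 3)) for _ in range(10)]
best = min(runs, key=lambda run: wcss(*run))
print(round(wcss(*best), 2))
```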

Activity 6

"How Many Groups?"

[Figure: Elbow method plot showing within-cluster sum of squares versus number of clusters]

Your Task

  1. What happens to the error as you add more clusters?
  2. Where is the "elbow" -- the point of diminishing returns?
  3. Why not just use $k = 10$?
Solution

More clusters always reduce within-cluster error, but at the cost of added complexity. The elbow method looks for the point where adding another cluster stops helping much. Using $k = n$ gives zero error but zero insight -- every point is its own cluster. The elbow balances fit vs. parsimony.
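The elbow curve can be computed rather than eyeballed. A sketch on the Activity 4 customers, with restarts per $k$ to avoid bad local optima (`kmeans` and `wcss` are the same illustrative helpers as before):

```python
import math
import random
from statistics import mean

# Activity 4 customers: (age, spending in $k)
points = [(22, 1.5), (25, 2), (45, 8), (48, 9), (50, 10),
          (23, 1), (47, 7.5), (30, 3.5), (52, 11), (26, 2.5)]

def kmeans(points, centroids):
    """Minimal K-Means: assign, recompute, repeat until stable."""
    while True:
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        new = [(mean(x for x, _ in c), mean(y for _, y in c)) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:
            return clusters, centroids
        centroids = new

def wcss(clusters, centroids):
    """Within-cluster sum of squared distances."""
    return sum(math.dist(p, ctr) ** 2
               for cluster, ctr in zip(clusters, centroids) for p in cluster)

# Trace the elbow curve: WCSS always falls as k grows, but the drop
# flattens once k exceeds the number of "real" groups in the data.
random.seed(0)
scores = {}
for k in range(1, 7):
    best = min((kmeans(points, random.sample(points, k)) for _ in range(10)),
               key=lambda run: wcss(*run))
    scores[k] = wcss(*best)
    print(k, round(scores[k], 1))
```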

Activity 7

"K-Means By Hand"

[Figure: Worked example of K-Means clustering on a small 2D dataset]
Point   X     Y
A       1     2
B       2     1
C       1.5   1.5
D       6     5
E       7     6
F       6.5   5.5

Initial centroids: C1=(1,1), C2=(7,7)

Your Task

  1. Assign each point to its nearest centroid.
  2. Compute the new centroid of each cluster.
  3. Would any points switch clusters in the next iteration?
Solution

Round 1: Cluster 1 = {A, B, C} with new centroid (1.5, 1.5); Cluster 2 = {D, E, F} with new centroid (6.5, 5.5). Round 2: all assignments stay the same, so the algorithm converges after two iterations! This is K-Means in action.
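The hand computation can be checked in a few lines of Python (a direct transcription of the table and the two rounds):

```python
import math
from statistics import mean

# Points and starting centroids from the activity
points = {"A": (1, 2), "B": (2, 1), "C": (1.5, 1.5),
          "D": (6, 5), "E": (7, 6), "F": (6.5, 5.5)}
c1, c2 = (1, 1), (7, 7)

# Round 1: assign each point to the nearer centroid
cluster1 = [n for n, p in points.items() if math.dist(p, c1) <= math.dist(p, c2)]
cluster2 = [n for n in points if n not in cluster1]
print(cluster1, cluster2)   # ['A', 'B', 'C'] ['D', 'E', 'F']

# Recompute each centroid as the mean of its cluster
c1 = tuple(mean(points[n][i] for n in cluster1) for i in range(2))
c2 = tuple(mean(points[n][i] for n in cluster2) for i in range(2))
print(c1, c2)               # (1.5, 1.5) (6.5, 5.5)

# Round 2: assignments are unchanged, so the algorithm has converged
round2 = [n for n, p in points.items() if math.dist(p, c1) <= math.dist(p, c2)]
print(round2 == cluster1)   # True
```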