Activity 1

"Draw Your Best Line"

| Sqft | Price ($k) |
|------|------------|
| 800  | 150 |
| 1200 | 210 |
| 1500 | 265 |
| 1800 | 310 |
| 2200 | 380 |
| 2800 | 470 |

Your Task

  1. If you plot these on a graph, what pattern do you see?
  2. Draw a straight line through the data -- where would you place it?
  3. Measure the vertical distance from each point to your line. What do these distances represent?
Reveal Solution

These vertical distances are called residuals. Linear regression finds the line that minimizes the sum of squared residuals -- this is called Ordinary Least Squares (OLS). The line is $y = \beta_0 + \beta_1 x$, where $\beta_0$ is the intercept and $\beta_1$ is the slope.
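As a quick sketch (assuming NumPy is available), we can fit the OLS line to the Activity 1 table and compute the residuals directly:

```python
import numpy as np

# Activity 1 data: square footage vs. price ($k)
sqft = np.array([800, 1200, 1500, 1800, 2200, 2800])
price = np.array([150, 210, 265, 310, 380, 470])

# Closed-form OLS: slope and intercept minimizing the sum of squared residuals
slope, intercept = np.polyfit(sqft, price, deg=1)

# Residuals: the vertical distances from each point to the fitted line
residuals = price - (intercept + slope * sqft)

print(f"price ~= {intercept:.1f} + {slope:.3f} * sqft")
print("residuals:", np.round(residuals, 2))
```

A useful sanity check: with an intercept in the model, OLS residuals always sum to zero (up to floating-point error).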

Activity 2

"What Does This Chart Tell You?"

Scatter plot showing house price versus square footage with fitted OLS regression line

Your Task

  1. What relationship do you see between size and price?
  2. Using the line, predict the price of a 2000 sqft house.
  3. How confident are you in that prediction? Why?
Reveal Solution

The line is the OLS fit: $y = \beta_0 + \beta_1 x$. Points close to the line suggest a strong linear relationship. Predictions near the center of the data are more reliable than extrapolations. The $R^2$ value measures how much variance the line explains.

Activity 3

"Good Fit or Bad Fit?"

Residual plots comparing good and bad model fits

Your Task

  1. What patterns do you see in the residuals?
  2. Which plot suggests the model is doing a good job?
  3. What would a "perfect" residual plot look like?
Reveal Solution

A good residual plot shows random scatter around zero -- no patterns. If you see curves or funnels, the model is missing something (non-linearity or heteroscedasticity). A "perfect" plot would be a flat band of random points.
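A minimal sketch of this diagnostic, using synthetic data (the data, the `line_residuals` helper, and the crude curvature check are all illustrative assumptions, not part of the activity):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)

# Case A: truly linear data + noise -> residuals are patternless
y_lin = 3 * x + 5 + rng.normal(0, 1, x.size)
# Case B: quadratic data fit with a line -> residuals curve (model misspecified)
y_quad = 0.5 * x**2 + rng.normal(0, 1, x.size)

def line_residuals(x, y):
    """Residuals from a straight-line (degree-1) OLS fit."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (intercept + slope * x)

def curvature_corr(x, r):
    """Crude pattern check: correlate residuals with centered x^2.
    A curved residual plot shows up as a strong correlation."""
    return np.corrcoef((x - x.mean()) ** 2, r)[0, 1]

print("linear data:   ", round(curvature_corr(x, line_residuals(x, y_lin)), 2))
print("quadratic data:", round(curvature_corr(x, line_residuals(x, y_quad)), 2))
```

For the well-specified model the correlation is near zero (random scatter); for the misspecified one it is strongly positive, mirroring the visible curve in the residual plot.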

Activity 4

"The Downhill Walk"

Gradient descent optimization showing path to minimum of cost function

Your Task

  1. Imagine you're blindfolded on a hilly landscape. How would you find the lowest point?
  2. What happens if you take very large steps? Very small steps?
  3. How do you know when to stop walking?
Reveal Solution

This is gradient descent -- at each step, move downhill in the steepest direction. Large steps (high learning rate) may overshoot the minimum. Small steps (low learning rate) converge slowly. You stop when steps become negligibly small (convergence).
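The "downhill walk" can be sketched as gradient descent on the mean squared error of the Activity 1 line fit (the learning rate and stopping threshold below are illustrative choices):

```python
import numpy as np

# Activity 1 data; sqft scaled by 1000 to keep the gradient well-conditioned
sqft = np.array([800, 1200, 1500, 1800, 2200, 2800]) / 1000.0
price = np.array([150, 210, 265, 310, 380, 470])

b0, b1 = 0.0, 0.0  # start anywhere on the "landscape"
lr = 0.05          # learning rate: the step size
for step in range(20000):
    err = (b0 + b1 * sqft) - price
    # Gradient of mean squared error w.r.t. each parameter
    g0 = 2 * err.mean()
    g1 = 2 * (err * sqft).mean()
    b0 -= lr * g0  # move downhill in the steepest direction
    b1 -= lr * g1
    # Stop when the steps become negligibly small (convergence)
    if lr * max(abs(g0), abs(g1)) < 1e-9:
        break

print(f"price ~= {b0:.1f} + {b1:.1f} * (sqft / 1000)")
```

With a much larger `lr` the updates overshoot and can diverge; with a much smaller one the loop needs far more iterations to reach the same minimum. The converged values match the closed-form OLS fit.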

Activity 5

"Too Simple vs. Too Complex"

Bias-variance tradeoff showing underfitting, good fit, and overfitting

Your Task

  1. Look at the three models -- which one would you trust for a brand-new data point?
  2. What goes wrong with the wiggly line?
  3. What goes wrong with the flat line?
Reveal Solution

The wiggly model overfits -- it memorizes noise (high variance). The flat model underfits -- it misses the real pattern (high bias). The best model balances both. This is the bias-variance tradeoff, a fundamental concept in ML.
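A sketch of the tradeoff using synthetic data (the sine "truth", sample sizes, and polynomial degrees are illustrative assumptions): polynomials of degree 0, 3, and 12 play the roles of the flat, balanced, and wiggly models.

```python
import numpy as np

rng = np.random.default_rng(1)

def truth(x):
    return np.sin(x)  # the "real pattern" the models try to recover

x_train = np.linspace(0, 3, 15)
y_train = truth(x_train) + rng.normal(0, 0.1, x_train.size)
x_test = np.linspace(0.1, 2.9, 50)
y_test = truth(x_test) + rng.normal(0, 0.1, x_test.size)

results = {}
for deg in (0, 3, 12):  # flat line, balanced, wiggly
    coeffs = np.polyfit(x_train, y_train, deg)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    results[deg] = (train_mse, test_mse)
    print(f"degree {deg:2d}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```

The degree-0 model underfits: both errors are high (high bias). The degree-12 model drives training error toward zero by chasing noise, yet its test error does not improve to match (high variance). The middle model generalizes best.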

Activity 6

"Predicting House Prices"

| Sqft | Bedrooms | Age (yrs) | Price ($k) |
|------|----------|-----------|------------|
| 1100 | 2 | 30 | 195 |
| 1400 | 2 | 15 | 260 |
| 1800 | 3 | 5  | 350 |
| 2100 | 3 | 25 | 290 |
| 2500 | 4 | 10 | 410 |

Your Task

  1. Which feature seems to matter most for price?
  2. Can you estimate a simple formula: Price ≈ __ × sqft + __?
  3. Notice houses 3 and 4 -- same bedrooms but different prices. What other features explain this?
Reveal Solution

With multiple features, we use multiple linear regression: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots$ Each coefficient captures one feature's effect while holding others constant. When features are correlated (e.g., sqft and bedrooms), we face multicollinearity, which makes individual coefficients unreliable.
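A sketch of the multiple regression on the Activity 6 table, assuming NumPy (with only five houses and four parameters, the coefficients are fragile; this is for illustration, not inference):

```python
import numpy as np

# Activity 6 data: sqft, bedrooms, age (yrs) -> price ($k)
X_raw = np.array([
    [1100, 2, 30],
    [1400, 2, 15],
    [1800, 3, 5],
    [2100, 3, 25],
    [2500, 4, 10],
], dtype=float)
y = np.array([195, 260, 350, 290, 410], dtype=float)

# Prepend a column of ones so beta_0 acts as the intercept
X = np.column_stack([np.ones(len(y)), X_raw])

# Least-squares solution for beta = (beta_0, beta_sqft, beta_bed, beta_age)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("coefficients:", np.round(beta, 3))
print("fitted prices:", np.round(X @ beta, 1))
```

Note the sign pattern: size and bedrooms push price up while age pulls it down (which is what separates houses 3 and 4). Because sqft and bedrooms rise together in this tiny sample, the credit for "bigness" is split between their coefficients -- exactly the multicollinearity problem described above.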