Correlation matrix:
| | Spending | Income | Education | Age |
|---|---|---|---|---|
| Spending | 1.00 | 0.85 | 0.72 | 0.15 |
| Income | 0.85 | 1.00 | 0.68 | 0.20 |
| Education | 0.72 | 0.68 | 1.00 | 0.10 |
| Age | 0.15 | 0.20 | 0.10 | 1.00 |
Eigenvalues: 2.62, 0.98, 0.28, 0.12
PC1 loadings: Spending = 0.58, Income = 0.57, Education = 0.54, Age = 0.18
a) Which two variables are most redundant? What is the correlation between them?
b) How many components would you keep according to Kaiser’s rule (eigenvalue > 1)?
c) What real-world concept does PC1 represent? Give it a name.
d) What percentage of total variance does PC1 capture? Show the calculation.
a) Spending and Income have the highest correlation: $r = 0.85$. Knowing one gives you most of the information about the other.
b) Only one eigenvalue exceeds 1: the first, at 2.62. Kaiser's rule says to keep 1 component. (The second eigenvalue, 0.98, falls just short of the cutoff.)
c) "Economic Status" or "Socioeconomic Standing." Spending (0.58), Income (0.57), and Education (0.54) all load heavily on PC1; together these three variables reflect how affluent a consumer is. Age (0.18) is almost unrelated to this component.
d) $2.62 / 4 = 0.655$, i.e., 65.5% of total variance. (Each of the 4 standardized variables contributes a variance of 1, so the total variance is 4.)
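The checks in a), b), and d) can be sketched in a few lines of NumPy. This is only an illustrative sketch: the variable names (`R`, `names`, `eigenvalues`, `n_keep`, `explained`) are assumptions introduced here, and the eigenvalues are taken as listed in the worksheet rather than recomputed from the correlation matrix, so a numerical eigendecomposition of `R` may give slightly different values.

```python
import numpy as np

# Correlation matrix from the worksheet (order: Spending, Income, Education, Age).
names = ["Spending", "Income", "Education", "Age"]
R = np.array([
    [1.00, 0.85, 0.72, 0.15],
    [0.85, 1.00, 0.68, 0.20],
    [0.72, 0.68, 1.00, 0.10],
    [0.15, 0.20, 0.10, 1.00],
])

# a) Most redundant pair: the largest off-diagonal correlation in absolute value.
off_diag = np.abs(R - np.eye(len(names)))  # zero out the diagonal of 1s
i, j = np.unravel_index(np.argmax(off_diag), off_diag.shape)
most_redundant = (names[i], names[j], R[i, j])

# Eigenvalues exactly as listed in the worksheet (assumed, not recomputed from R).
eigenvalues = np.array([2.62, 0.98, 0.28, 0.12])

# b) Kaiser's rule: keep components whose eigenvalue exceeds 1.
n_keep = int(np.sum(eigenvalues > 1.0))

# d) Share of total variance per component: eigenvalue / sum of eigenvalues.
explained = eigenvalues / eigenvalues.sum()

print("Most redundant pair:", most_redundant)
print("Components kept by Kaiser's rule:", n_keep)
print("Proportion of variance:", np.round(explained, 3))
```

Dividing by `eigenvalues.sum()` rather than by a hard-coded 4 keeps the calculation valid if the number of variables changes; for a correlation matrix the two are the same up to rounding.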
Unrotated factor loading matrix:
| | Factor 1 | Factor 2 |
|---|---|---|
| Spending | 0.82 | 0.12 |
| Income | 0.80 | 0.15 |
| Education | 0.74 | −0.08 |
| Age | 0.14 | 0.95 |
a) Assign each variable to the factor it loads most strongly on (loading > 0.4). Which factor is "Wealth"? Which is "Life Stage"?
b) Compute the communality of Spending. What does it mean?
c) Compare with your Part A answer. What can EFA tell you that PCA cannot?
d) Vote with your partner: would you recommend PCA or EFA for this marketing study? Why?
a) Spending (0.82), Income (0.80), and Education (0.74) load on Factor 1 → Wealth.
Age (0.95) loads on Factor 2 → Life Stage.
(Education's small negative loading on Factor 2, −0.08, is well below the 0.4 threshold in absolute value and can be ignored.)
b) $h^2 = 0.82^2 + 0.12^2 = 0.6724 + 0.0144 = 0.6868 \approx 0.69$. This means about 69% of Spending's variance is explained by the two factors combined. The remaining 31% is unique variance (specific variance plus measurement error) that is not shared with the other variables.
c) PCA collapsed everything into one component and made Age nearly invisible (loading 0.18). EFA separates the age dimension (Life Stage) cleanly from the wealth dimension. EFA reveals that consumers differ on two independent axes, something PCA missed because only one component was retained under Kaiser's rule (the second eigenvalue, 0.98, fell just below the cutoff).
d) For this marketing study, EFA is more appropriate. The researcher has a theoretical model (two hidden factors) and wants to check whether the data support it. PCA is better when the goal is purely to reduce dimensions without an interpretive theory about latent causes.
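The factor assignment and communality steps can be checked the same way. The sketch below assumes the unrotated loadings from the table above; the names `loadings`, `THRESHOLD`, `assignment`, `communality`, and `uniqueness` are illustrative choices made here, not part of the worksheet.

```python
import numpy as np

# Unrotated loadings from the worksheet (rows: variables, columns: factors).
variables = ["Spending", "Income", "Education", "Age"]
factors = ["Factor 1", "Factor 2"]
loadings = np.array([
    [0.82, 0.12],
    [0.80, 0.15],
    [0.74, -0.08],
    [0.14, 0.95],
])

THRESHOLD = 0.4  # salience cutoff used in the worksheet

# Assign each variable to the factor with its largest absolute loading,
# but only if that loading exceeds the cutoff; otherwise leave it unassigned.
assignment = {}
for name, row in zip(variables, loadings):
    k = int(np.argmax(np.abs(row)))
    assignment[name] = factors[k] if abs(row[k]) > THRESHOLD else None

# Communality h^2: sum of squared loadings across factors for each variable.
communality = dict(zip(variables, (loadings ** 2).sum(axis=1)))

# Uniqueness: the share of each variable's variance left unexplained, 1 - h^2.
uniqueness = {name: 1.0 - h2 for name, h2 in communality.items()}

for name in variables:
    print(name, assignment[name],
          round(float(communality[name]), 4),
          round(float(uniqueness[name]), 4))
```

Using the absolute value in the assignment step matters for loadings such as Education's −0.08 on Factor 2: the sign indicates direction, while the magnitude indicates strength.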