Given noisy observations \((x_i, y_i)\), \(i = 1, \dots, n\), where

\[ y_i = f(x_i) + \varepsilon_i \]

and \(\varepsilon_i\) is a noise term.
We want to estimate the unknown function \(f(x)\) at any point \(x_0\).
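Kernel smoothers estimate \(f(x_0)\) using a weighted average of nearby points:

\[ \hat{f}(x_0) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x_i - x_0}{h}\right) y_i}{\sum_{i=1}^{n} K\!\left(\frac{x_i - x_0}{h}\right)} \]
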
where \(K(\cdot)\) is the kernel function and \(h\) is the bandwidth.
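The weighted average described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `epanechnikov` and `kernel_weighted_average` are hypothetical, the kernel is assumed to be \(K(u) = \tfrac{3}{4}(1 - u^2)\) on \(|u| \le 1\), and returning NaN when no point falls inside the window is an assumption of this sketch.

```python
import numpy as np

def epanechnikov(u):
    """Assumed kernel: 0.75 * (1 - u^2) on |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_weighted_average(x0, x, y, h, kernel=epanechnikov):
    """Kernel-weighted average of y around x0 with bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = kernel((x - x0) / h)      # weight of each observation
    total = w.sum()
    if total == 0.0:              # assumption: no data in the window -> NaN
        return float("nan")
    return float(np.dot(w, y) / total)

# Toy data on a straight line, so the symmetric window recovers f(2) exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
estimate = kernel_weighted_average(2.0, x, y, h=1.5)
print(estimate)
```

With \(h = 1.5\) only the three points within distance \(h\) of \(x_0 = 2\) receive non-zero weight, which is what makes the estimate local.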
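The Epanechnikov kernel is defined as:

\[ K(u) = \begin{cases} \tfrac{3}{4}\,(1 - u^2), & |u| \le 1, \\ 0, & \text{otherwise.} \end{cases} \]
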
Why Epanechnikov? It minimizes the asymptotic mean integrated squared error (AMISE) among all non-negative second-order kernels.
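Properties:

- Compact support: \(K(u) = 0\) for \(|u| > 1\), so only points within distance \(h\) of \(x_0\) receive weight.
- Symmetric and non-negative: \(K(u) = K(-u) \ge 0\).
- Integrates to one: \(\int K(u)\,du = 1\).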
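The basic properties of the Epanechnikov kernel (unit integral, symmetry, non-negativity, compact support on \([-1, 1]\)) can be checked numerically. The sketch below assumes NumPy and uses a plain Riemann sum on a fine grid; the helper name `epanechnikov` and the grid size are illustrative choices.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

# Symmetric grid that extends past the support [-1, 1].
u = np.linspace(-1.5, 1.5, 300001)
k = epanechnikov(u)
du = u[1] - u[0]

integral = float(np.sum(k) * du)                   # Riemann sum of K over the grid
is_symmetric = bool(np.allclose(k, k[::-1]))       # K(u) == K(-u)
is_nonnegative = bool(np.all(k >= 0.0))
zero_outside = bool(np.all(k[np.abs(u) > 1.0] == 0.0))

print(integral, is_symmetric, is_nonnegative, zero_outside)
```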
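The standard Nadaraya-Watson estimator uses the weighted mean:

\[ \hat{f}_{\text{NW}}(x_0) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x_i - x_0}{h}\right) y_i}{\sum_{i=1}^{n} K\!\left(\frac{x_i - x_0}{h}\right)} \]
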
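Our approach uses the weighted median:

\[ \hat{f}_{\text{med}}(x_0) = \arg\min_{\theta} \sum_{i=1}^{n} K\!\left(\frac{x_i - x_0}{h}\right) \lvert y_i - \theta \rvert \]
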
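One way to compute a kernel-weighted median is to sort the responses and find the first value at which the cumulative weight reaches half the total. The sketch below assumes NumPy; the names `weighted_median` and `kernel_weighted_median` are hypothetical, and taking the lower weighted median when the half-weight point falls exactly between two values is a convention chosen for this example, not necessarily the one the original implementation uses.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on |u| <= 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def weighted_median(values, weights):
    """Lower weighted median: smallest value whose cumulative weight
    reaches half of the total weight (convention assumed here)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    keep = weights > 0.0                    # drop points with zero weight
    values, weights = values[keep], weights[keep]
    if values.size == 0:                    # assumption: empty window -> NaN
        return float("nan")
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum = np.cumsum(weights)
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(values[idx])

def kernel_weighted_median(x0, x, y, h, kernel=epanechnikov):
    """Local estimate at x0: weighted median of y with kernel weights."""
    x = np.asarray(x, dtype=float)
    return weighted_median(y, kernel((x - x0) / h))

# Same toy layout as a line y = 2x + 1, but with one gross outlier at x = 3.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 5.0, 700.0, 9.0])
estimate = kernel_weighted_median(2.0, x, y, h=2.5)
print(estimate)
```

Sorting makes each evaluation O(n log n) in the number of points inside the window; that is usually acceptable because the compact kernel keeps the window small.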
Robustness Comparison:
| Property | Weighted Mean | Weighted Median |
|---|---|---|
| Breakdown Point | 0% | 50% |
| Influence Function | Unbounded | Bounded |
| Effect of Single Outlier | Can shift estimate arbitrarily | Limited effect |
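
The last row of the table can be illustrated with a small experiment: hold the weights fixed, push one observation further and further out, and compare how the two summaries respond. This is a sketch under stated assumptions: the weights are made-up kernel-like values, the helper names are hypothetical, and the weighted median uses the lower-median convention.

```python
import numpy as np

def weighted_mean(values, weights):
    """Weighted mean, as used by the Nadaraya-Watson estimator."""
    return float(np.average(values, weights=weights))

def weighted_median(values, weights):
    """Lower weighted median (convention assumed for this sketch)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return float(values[order][np.searchsorted(cum, 0.5 * cum[-1])])

# Illustrative, strictly positive weights centred on the middle point.
weights = np.array([0.27, 0.63, 0.75, 0.63, 0.27])
clean = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

results = []
for outlier in (7.0, 1e2, 1e4, 1e6):
    y = clean.copy()
    y[3] = outlier                          # corrupt a single observation
    results.append((outlier, weighted_mean(y, weights), weighted_median(y, weights)))

for outlier, mean_est, median_est in results:
    print(f"outlier={outlier:>9.0f}  mean={mean_est:>12.2f}  median={median_est:.1f}")
```

The weighted mean grows without bound as the corrupted value grows, while the weighted median stays put as long as the corrupted point carries less than half of the total weight.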
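Silverman's rule of thumb provides automatic bandwidth selection:

\[ h = 0.9 \, \min\!\left(\hat{\sigma}, \frac{\text{IQR}}{1.34}\right) n^{-1/5} \]

where \(\hat{\sigma}\) is the sample standard deviation, IQR is the interquartile range, and \(n\) is the number of observations.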
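Silverman's rule can be computed directly from the \(x\) values. The following is a minimal sketch assuming NumPy; the function name `silverman_bandwidth` is hypothetical, the quartiles use NumPy's default linear interpolation, and falling back to \(\hat{\sigma}\) alone when the IQR is zero is an assumption added here to avoid a zero bandwidth.

```python
import numpy as np

def silverman_bandwidth(x):
    """Rule-of-thumb bandwidth: 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = x.std(ddof=1)                   # sample standard deviation
    q75, q25 = np.percentile(x, [75, 25])
    iqr = q75 - q25
    # Assumption: if the IQR collapses to zero, use sigma alone.
    spread = min(sigma, iqr / 1.34) if iqr > 0 else sigma
    return float(0.9 * spread * n ** (-1.0 / 5.0))

x = np.arange(1, 101, dtype=float)          # 100 evenly spaced design points
h = silverman_bandwidth(x)
print(h)
```

The resulting `h` has the same units as `x`, so it can be passed straight to a kernel evaluated at \((x_i - x_0)/h\).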