Empirical Mode Decomposition with Local Median Smoothing, using a bandwidth halving schedule
The test signal is an amplitude-modulated, frequency-modulated (AM-FM) signal with additive noise:
Amplitude modulation:
\[ A(x) = 1 + 0.5 \sin(4\pi x) \]
Phase function:
\[ \phi(x) = 2\pi (6x + 12x^2) \]
Instantaneous frequency:
\[ f(x) = \frac{1}{2\pi}\frac{d\phi}{dx} = 6 + 24x \]
Clean signal:
\[ y_{\text{clean}}(x) = A(x) \sin(\phi(x)) = [1 + 0.5\sin(4\pi x)] \sin(2\pi(6x + 12x^2)) \]
Noisy signal:
\[ y(x) = y_{\text{clean}}(x) + 0.2 \varepsilon(x), \quad \varepsilon \sim \mathcal{N}(0,1) \]
The sifting process uses a bandwidth halving schedule, where each successive iteration uses a smaller bandwidth:
| Iteration | Bandwidth | Formula | Value |
|---|---|---|---|
| 1 | \(h_1\) | \(h_1\) | 0.1 |
| 2 | \(h_2\) | \(h_1 / \sqrt{2}\) | 0.0707 |
| 3 | \(h_3\) | \(h_1 / 2\) | 0.05 |
| 4 | \(h_4\) | \(h_1 / (2\sqrt{2})\) | 0.0354 |
| 5 | \(h_5\) | \(h_1 / 4\) | 0.025 |
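The test signal and the schedule above can be sketched in a few lines of plain Python. The grid size and random seed below are arbitrary choices, not taken from the source; the schedule follows \(h_k = h_1 / 2^{(k-1)/2}\), which reproduces the table exactly.

```python
import math
import random

# Hypothetical sketch: sample the AM-FM test signal on a uniform grid.
random.seed(0)                                    # arbitrary seed
n = 1000                                          # arbitrary grid size
xs = [i / (n - 1) for i in range(n)]

def clean(x):
    A = 1 + 0.5 * math.sin(4 * math.pi * x)       # amplitude modulation A(x)
    phi = 2 * math.pi * (6 * x + 12 * x * x)      # phase phi(x); f(x) = 6 + 24x
    return A * math.sin(phi)

y = [clean(x) + 0.2 * random.gauss(0.0, 1.0) for x in xs]

# Bandwidth halving schedule: h_k = h1 / 2**((k-1)/2)
h1 = 0.1
bandwidths = [h1 / 2 ** ((k - 1) / 2) for k in range(1, 6)]
print([round(h, 4) for h in bandwidths])
# -> [0.1, 0.0707, 0.05, 0.0354, 0.025]
```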
Each iteration extracts an Intrinsic Mode Function (IMF) using local median smoothing; the residual from each iteration becomes the input for the next:
\[ c^{(k)} = r^{(k)} - \hat{m}^{(k)}, \qquad r^{(k+1)} = \hat{m}^{(k)}, \qquad r^{(1)} = y, \]
where \(\hat{m}^{(k)}\) is the local-median-smoothed estimate of the residual \(r^{(k)}\) at iteration \(k\), and \(c^{(k)}\) is the \(k\)-th IMF.
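One sifting step can be sketched as follows. This is a minimal illustration: the smoother here is a plain windowed median over \(|x_j - x_i| \le h\) standing in for the kernel-weighted and heap-based variants discussed later, and the toy grid and values are invented for the example.

```python
import statistics

def local_median_smooth(xs, r, h):
    """m_hat(x_i): median of r over the window |x_j - x_i| <= h.
    (Sketch only; a plain windowed median, not the kernel-weighted form.)"""
    m = []
    for xi in xs:
        window = [rj for xj, rj in zip(xs, r) if abs(xj - xi) <= h]
        m.append(statistics.median(window))
    return m

def sift_step(xs, r, h):
    """One iteration: IMF c = r - m_hat; the smooth part m_hat carries on."""
    m_hat = local_median_smooth(xs, r, h)
    imf = [ri - mi for ri, mi in zip(r, m_hat)]
    return imf, m_hat

# Toy usage on a coarse grid (values are illustrative only):
xs = [i / 9 for i in range(10)]
r = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
imf, residual = sift_step(xs, r, h=0.15)
# By construction, IMF + residual reconstructs the input exactly:
assert all(abs(c + m - ri) < 1e-12 for c, m, ri in zip(imf, residual, r))
```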
Initial signal and first IMF extraction with bandwidth \(h_1 = 0.1\)
Second iteration with bandwidth \(h_2 = h_1/\sqrt{2}\)
Third iteration with bandwidth \(h_3 = h_1/2\)
Fourth iteration with bandwidth \(h_4 = h_1/(2\sqrt{2})\)
Fifth iteration with bandwidth \(h_5 = h_1/4\)
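The five iterations above can be driven by a single loop over the halving schedule. The sketch below is self-contained and again substitutes a plain windowed median for \(\hat{m}^{(k)}\); the test signal is a simple stand-in, not the AM-FM signal from the source.

```python
import math
import statistics

def local_median(xs, r, h):
    # m_hat: windowed median of r over |x_j - x_i| <= h (illustrative smoother)
    return [statistics.median([rj for xj, rj in zip(xs, r) if abs(xj - xi) <= h])
            for xi in xs]

def emd_median(xs, y, h1=0.1, n_imfs=5):
    """Extract n_imfs IMFs with the halving schedule h_k = h1 / 2**((k-1)/2).
    Returns (list of IMFs, final residual)."""
    r, imfs = list(y), []
    for k in range(1, n_imfs + 1):
        h = h1 / 2 ** ((k - 1) / 2)
        m = local_median(xs, r, h)
        imfs.append([ri - mi for ri, mi in zip(r, m)])
        r = m                        # residual becomes next iteration's input
    return imfs, r

# Stand-in two-tone signal on a coarse grid:
xs = [i / 99 for i in range(100)]
y = [math.sin(2 * math.pi * 6 * x) + 0.3 * math.sin(2 * math.pi * x) for x in xs]
imfs, res = emd_median(xs, y)

# The IMFs plus the final residual reconstruct the input exactly,
# since each step only splits r into (r - m_hat) and m_hat:
recon = [sum(c[i] for c in imfs) + res[i] for i in range(len(xs))]
assert max(abs(a - b) for a, b in zip(recon, y)) < 1e-9
```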
Three approaches for computing the local trend in EMD sifting:

- Kernel-weighted median using the Epanechnikov kernel. Robust to outliers, with a 50% breakdown point.
- Running median with efficient heap-based computation (Härdle & Steiger 1995).
- Standard Nadaraya-Watson kernel regression. Not robust to outliers.
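The robustness contrast between the first and third approaches is easy to demonstrate. Below is a hedged sketch (function names and the toy data are invented for illustration): a kernel-weighted median using the Epanechnikov kernel \(K(u) = \tfrac{3}{4}(1-u^2)\) on \(|u| \le 1\), against a Nadaraya-Watson kernel mean. A single outlier leaves the median estimate untouched but drags the mean.

```python
def epanechnikov(u):
    # K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def weighted_median(values, weights):
    # smallest value whose cumulative weight reaches half the total weight
    half, acc = sum(weights) / 2, 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if acc >= half:
            return v

def kernel_median(xs, y, x0, h):
    # robust local trend: Epanechnikov-weighted median (50% breakdown point)
    w = [epanechnikov((xj - x0) / h) for xj in xs]
    pts = [(yj, wj) for yj, wj in zip(y, w) if wj > 0]
    return weighted_median([p[0] for p in pts], [p[1] for p in pts])

def nadaraya_watson(xs, y, x0, h):
    # non-robust alternative: kernel-weighted mean
    w = [epanechnikov((xj - x0) / h) for xj in xs]
    return sum(wj * yj for wj, yj in zip(w, y)) / sum(w)

# One outlier in otherwise flat data:
xs = [i / 20 for i in range(21)]
y = [0.0] * 21
y[10] = 100.0                                  # outlier at x = 0.5
print(kernel_median(xs, y, 0.5, 0.2))          # -> 0.0 (unaffected)
print(nadaraya_watson(xs, y, 0.5, 0.2))        # pulled well away from zero
```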
For efficient running median computation, the algorithm of Härdle & Steiger (1995) maintains the window in a double-heap structure (a max-heap below the median and a min-heap above it):
Window partitions:
\[ L_i = \{X_t \in W_i \mid X_t \leq \text{median}\} \]
\[ U_i = \{X_t \in W_i \mid X_t \geq \text{median}\} \]
Median computation:
\[ \text{median} = \begin{cases} \max(L_i) & \text{if } |L_i| > |U_i| \\ \min(U_i) & \text{if } |U_i| > |L_i| \\ \frac{\max(L_i) + \min(U_i)}{2} & \text{if } |L_i| = |U_i| \end{cases} \]
Complexity: \(O(n \log k)\), where \(n\) is the number of data points and \(k\) is the window size, compared with \(O(nk)\) for the naive implementation.
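A two-heap sliding median with lazy deletion achieves the same \(O(n \log k)\) bound; note this is a common variant rather than Härdle & Steiger's exact indexable double-heap, and the function name is mine. The max-heap `lo` holds the lower half \(L_i\) (negated, since `heapq` is a min-heap) and `hi` holds the upper half \(U_i\); removed values are only discarded when they surface at a heap top.

```python
import heapq
from collections import defaultdict

def running_median(x, k):
    """Sliding-window median of x with window size k, O(n log k) total."""
    lo, hi = [], []                 # max-heap of L_i (negated) / min-heap of U_i
    delayed = defaultdict(int)      # value -> pending lazy deletions
    lo_n = hi_n = 0                 # counts of *live* elements per heap

    def prune(heap):
        # discard lazily deleted elements sitting on top of a heap
        while heap:
            v = -heap[0] if heap is lo else heap[0]
            if not delayed[v]:
                break
            delayed[v] -= 1
            heapq.heappop(heap)

    def rebalance():
        nonlocal lo_n, hi_n
        if lo_n > hi_n + 1:         # keep |L| == |U| or |L| == |U| + 1
            heapq.heappush(hi, -heapq.heappop(lo))
            lo_n, hi_n = lo_n - 1, hi_n + 1
            prune(lo)
        elif lo_n < hi_n:
            heapq.heappush(lo, -heapq.heappop(hi))
            lo_n, hi_n = lo_n + 1, hi_n - 1
            prune(hi)

    def add(v):
        nonlocal lo_n, hi_n
        if not lo or v <= -lo[0]:
            heapq.heappush(lo, -v); lo_n += 1
        else:
            heapq.heappush(hi, v); hi_n += 1
        rebalance()

    def remove(v):
        nonlocal lo_n, hi_n
        delayed[v] += 1
        if v <= -lo[0]:
            lo_n -= 1
            if v == -lo[0]:
                prune(lo)
        else:
            hi_n -= 1
            if hi and v == hi[0]:
                prune(hi)
        rebalance()

    def median():
        # odd k: max(L_i); even k: (max(L_i) + min(U_i)) / 2
        return float(-lo[0]) if k % 2 else (-lo[0] + hi[0]) / 2

    out = []
    for i, v in enumerate(x):
        add(v)
        if i >= k:
            remove(x[i - k])        # element leaving the window
        if i >= k - 1:
            out.append(median())
    return out

print(running_median([1, 3, -1, -3, 5, 3, 6, 7], 3))
# -> [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
```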
Related projects and implementations: