Empirical Mode Decomposition with Local Median Smoothing, using a bandwidth halving schedule
The test signal is an amplitude-modulated, frequency-modulated (AM-FM) signal with additive noise:
Amplitude modulation:
\[ A(x) = 1 + 0.5 \sin(4\pi x) \]
Phase function:
\[ \phi(x) = 2\pi (6x + 12x^2) \]
Instantaneous frequency:
\[ f(x) = \frac{1}{2\pi}\frac{d\phi}{dx} = 6 + 24x \]
Clean signal:
\[ y_{\text{clean}}(x) = A(x) \sin(\phi(x)) = [1 + 0.5\sin(4\pi x)] \sin(2\pi(6x + 12x^2)) \]
Noisy signal:
\[ y(x) = y_{\text{clean}}(x) + 0.2 \varepsilon(x), \quad \varepsilon \sim \mathcal{N}(0,1) \]
The sifting process uses a bandwidth halving schedule, where each successive iteration uses a smaller bandwidth:
| Iteration | Bandwidth | Formula | Value |
|---|---|---|---|
| 1 | \(h_1\) | \(h_1\) | 0.1 |
| 2 | \(h_2\) | \(h_1 / \sqrt{2}\) | 0.0707 |
| 3 | \(h_3\) | \(h_1 / 2\) | 0.05 |
| 4 | \(h_4\) | \(h_1 / (2\sqrt{2})\) | 0.0354 |
| 5 | \(h_5\) | \(h_1 / 4\) | 0.025 |
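The test signal and the schedule above can be sketched in a few lines of plain Python. The grid size and random seed below are arbitrary choices, not taken from the source; the schedule follows \(h_k = h_1 / 2^{(k-1)/2}\), which reproduces the table exactly.

```python
import math
import random

# Hypothetical sketch: sample the AM-FM test signal on a uniform grid.
random.seed(0)                                    # arbitrary seed
n = 1000                                          # arbitrary grid size
xs = [i / (n - 1) for i in range(n)]

def clean(x):
    A = 1 + 0.5 * math.sin(4 * math.pi * x)       # amplitude modulation A(x)
    phi = 2 * math.pi * (6 * x + 12 * x * x)      # phase phi(x); f(x) = 6 + 24x
    return A * math.sin(phi)

y = [clean(x) + 0.2 * random.gauss(0.0, 1.0) for x in xs]

# Bandwidth halving schedule: h_k = h1 / 2**((k-1)/2)
h1 = 0.1
bandwidths = [h1 / 2 ** ((k - 1) / 2) for k in range(1, 6)]
print([round(h, 4) for h in bandwidths])
# -> [0.1, 0.0707, 0.05, 0.0354, 0.025]
```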
Each iteration extracts an Intrinsic Mode Function (IMF) using local median smoothing; the residual from each iteration becomes the input for the next:
\[ c^{(k)} = r^{(k)} - \hat{m}^{(k)}, \qquad r^{(k+1)} = \hat{m}^{(k)}, \qquad r^{(1)} = y, \]
where \(\hat{m}^{(k)}\) is the local-median-smoothed estimate of the residual \(r^{(k)}\) at iteration \(k\), and \(c^{(k)}\) is the \(k\)-th IMF.
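One sifting step can be sketched as follows. This is a minimal illustration: the smoother here is a plain windowed median over \(|x_j - x_i| \le h\) standing in for the kernel-weighted and heap-based variants discussed later, and the toy grid and values are invented for the example.

```python
import statistics

def local_median_smooth(xs, r, h):
    """m_hat(x_i): median of r over the window |x_j - x_i| <= h.
    (Sketch only; a plain windowed median, not the kernel-weighted form.)"""
    m = []
    for xi in xs:
        window = [rj for xj, rj in zip(xs, r) if abs(xj - xi) <= h]
        m.append(statistics.median(window))
    return m

def sift_step(xs, r, h):
    """One iteration: IMF c = r - m_hat; the smooth part m_hat carries on."""
    m_hat = local_median_smooth(xs, r, h)
    imf = [ri - mi for ri, mi in zip(r, m_hat)]
    return imf, m_hat

# Toy usage on a coarse grid (values are illustrative only):
xs = [i / 9 for i in range(10)]
r = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
imf, residual = sift_step(xs, r, h=0.15)
# By construction, IMF + residual reconstructs the input exactly:
assert all(abs(c + m - ri) < 1e-12 for c, m, ri in zip(imf, residual, r))
```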
Initial signal and first IMF extraction with bandwidth \(h_1 = 0.1\)
Second iteration with bandwidth \(h_2 = h_1/\sqrt{2}\)
Third iteration with bandwidth \(h_3 = h_1/2\)
Fourth iteration with bandwidth \(h_4 = h_1/(2\sqrt{2})\)
Fifth iteration with bandwidth \(h_5 = h_1/4\)
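The five iterations above can be driven by a single loop over the halving schedule. The sketch below is self-contained and again substitutes a plain windowed median for \(\hat{m}^{(k)}\); the test signal is a simple stand-in, not the AM-FM signal from the source.

```python
import math
import statistics

def local_median(xs, r, h):
    # m_hat: windowed median of r over |x_j - x_i| <= h (illustrative smoother)
    return [statistics.median([rj for xj, rj in zip(xs, r) if abs(xj - xi) <= h])
            for xi in xs]

def emd_median(xs, y, h1=0.1, n_imfs=5):
    """Extract n_imfs IMFs with the halving schedule h_k = h1 / 2**((k-1)/2).
    Returns (list of IMFs, final residual)."""
    r, imfs = list(y), []
    for k in range(1, n_imfs + 1):
        h = h1 / 2 ** ((k - 1) / 2)
        m = local_median(xs, r, h)
        imfs.append([ri - mi for ri, mi in zip(r, m)])
        r = m                        # residual becomes next iteration's input
    return imfs, r

# Stand-in two-tone signal on a coarse grid:
xs = [i / 99 for i in range(100)]
y = [math.sin(2 * math.pi * 6 * x) + 0.3 * math.sin(2 * math.pi * x) for x in xs]
imfs, res = emd_median(xs, y)

# The IMFs plus the final residual reconstruct the input exactly,
# since each step only splits r into (r - m_hat) and m_hat:
recon = [sum(c[i] for c in imfs) + res[i] for i in range(len(xs))]
assert max(abs(a - b) for a, b in zip(recon, y)) < 1e-9
```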
Three approaches for computing the local trend in EMD sifting:

- Kernel-weighted median using the Epanechnikov kernel. Robust to outliers, with a 50% breakdown point.
- Running median with efficient heap-based computation (Härdle & Steiger 1995).
- Standard Nadaraya-Watson kernel regression. Not robust to outliers.
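The robustness contrast between the first and third approaches is easy to demonstrate. Below is a hedged sketch (function names and the toy data are invented for illustration): a kernel-weighted median using the Epanechnikov kernel \(K(u) = \tfrac{3}{4}(1-u^2)\) on \(|u| \le 1\), against a Nadaraya-Watson kernel mean. A single outlier leaves the median estimate untouched but drags the mean.

```python
def epanechnikov(u):
    # K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0
    return 0.75 * (1 - u * u) if abs(u) <= 1 else 0.0

def weighted_median(values, weights):
    # smallest value whose cumulative weight reaches half the total weight
    half, acc = sum(weights) / 2, 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if acc >= half:
            return v

def kernel_median(xs, y, x0, h):
    # robust local trend: Epanechnikov-weighted median (50% breakdown point)
    w = [epanechnikov((xj - x0) / h) for xj in xs]
    pts = [(yj, wj) for yj, wj in zip(y, w) if wj > 0]
    return weighted_median([p[0] for p in pts], [p[1] for p in pts])

def nadaraya_watson(xs, y, x0, h):
    # non-robust alternative: kernel-weighted mean
    w = [epanechnikov((xj - x0) / h) for xj in xs]
    return sum(wj * yj for wj, yj in zip(w, y)) / sum(w)

# One outlier in otherwise flat data:
xs = [i / 20 for i in range(21)]
y = [0.0] * 21
y[10] = 100.0                                  # outlier at x = 0.5
print(kernel_median(xs, y, 0.5, 0.2))          # -> 0.0 (unaffected)
print(nadaraya_watson(xs, y, 0.5, 0.2))        # pulled well away from zero
```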
For efficient running median computation, the algorithm of Härdle & Steiger (1995) maintains the window in a double-heap structure (a max-heap below the median and a min-heap above it):
Window partitions:
\[ L_i = \{X_t \in W_i \mid X_t \leq \text{median}\} \]
\[ U_i = \{X_t \in W_i \mid X_t \geq \text{median}\} \]
Median computation:
\[ \text{median} = \begin{cases} \max(L_i) & \text{if } |L_i| > |U_i| \\ \min(U_i) & \text{if } |U_i| > |L_i| \\ \frac{\max(L_i) + \min(U_i)}{2} & \text{if } |L_i| = |U_i| \end{cases} \]
Complexity: \(O(n \log k)\), where \(n\) is the number of data points and \(k\) is the window size, compared with \(O(nk)\) for the naive implementation.
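A two-heap sliding median with lazy deletion achieves the same \(O(n \log k)\) bound; note this is a common variant rather than Härdle & Steiger's exact indexable double-heap, and the function name is mine. The max-heap `lo` holds the lower half \(L_i\) (negated, since `heapq` is a min-heap) and `hi` holds the upper half \(U_i\); removed values are only discarded when they surface at a heap top.

```python
import heapq
from collections import defaultdict

def running_median(x, k):
    """Sliding-window median of x with window size k, O(n log k) total."""
    lo, hi = [], []                 # max-heap of L_i (negated) / min-heap of U_i
    delayed = defaultdict(int)      # value -> pending lazy deletions
    lo_n = hi_n = 0                 # counts of *live* elements per heap

    def prune(heap):
        # discard lazily deleted elements sitting on top of a heap
        while heap:
            v = -heap[0] if heap is lo else heap[0]
            if not delayed[v]:
                break
            delayed[v] -= 1
            heapq.heappop(heap)

    def rebalance():
        nonlocal lo_n, hi_n
        if lo_n > hi_n + 1:         # keep |L| == |U| or |L| == |U| + 1
            heapq.heappush(hi, -heapq.heappop(lo))
            lo_n, hi_n = lo_n - 1, hi_n + 1
            prune(lo)
        elif lo_n < hi_n:
            heapq.heappush(lo, -heapq.heappop(hi))
            lo_n, hi_n = lo_n + 1, hi_n - 1
            prune(hi)

    def add(v):
        nonlocal lo_n, hi_n
        if not lo or v <= -lo[0]:
            heapq.heappush(lo, -v); lo_n += 1
        else:
            heapq.heappush(hi, v); hi_n += 1
        rebalance()

    def remove(v):
        nonlocal lo_n, hi_n
        delayed[v] += 1
        if v <= -lo[0]:
            lo_n -= 1
            if v == -lo[0]:
                prune(lo)
        else:
            hi_n -= 1
            if hi and v == hi[0]:
                prune(hi)
        rebalance()

    def median():
        # odd k: max(L_i); even k: (max(L_i) + min(U_i)) / 2
        return float(-lo[0]) if k % 2 else (-lo[0] + hi[0]) / 2

    out = []
    for i, v in enumerate(x):
        add(v)
        if i >= k:
            remove(x[i - k])        # element leaving the window
        if i >= k - 1:
            out.append(median())
    return out

print(running_median([1, 3, -1, -3, 5, 3, 6, 7], 3))
# -> [1.0, -1.0, -1.0, 3.0, 5.0, 6.0]
```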
Related projects and implementations: