Empirical Mode Decomposition for nonlinear and non-stationary signal analysis
Empirical Mode Decomposition (EMD) is a data-driven method for decomposing complex signals into simpler oscillatory components called Intrinsic Mode Functions (IMFs). Unlike Fourier analysis, EMD makes no assumptions about linearity or stationarity.
Application Domain: Cryptocurrency Returns Analysis
EMD extracts oscillatory components adaptively from the data itself, so it can be applied to signals produced by nonlinear or non-stationary processes. The result is a collection of Intrinsic Mode Functions (IMFs) plus a residual trend.
Signal Decomposition:
\[ X(t) = \sum_{k=1}^{K} \text{IMF}_k(t) + r(t) \]
where \(\text{IMF}_k(t)\), \(k = 1, \dots, K\), are the intrinsic mode functions and \(r(t)\) is the residual trend.
The decomposition is entirely data-dependent, not relying on predetermined basis functions.
Applies to nonlinear systems, where Fourier methods need many spurious harmonics to represent a single distorted oscillation.
Handles time-varying frequency and amplitude naturally.
Each IMF captures oscillations at a characteristic time scale.
Traditional EMD uses envelope detection to identify the local mean:
The local mean is computed as the average of the upper and lower envelopes, which interpolate the local maxima and the local minima respectively (cubic splines in the standard algorithm).
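The envelope-based local mean described above can be sketched as follows. This is a minimal illustration, not a full EMD implementation: it assumes cubic-spline envelopes, relies on SciPy's default spline extrapolation at the boundaries (real implementations usually treat the ends more carefully), and the function names `envelope_mean` and `sift_once` are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def envelope_mean(t, x):
    """Average of cubic-spline envelopes through the local maxima and minima."""
    max_idx = argrelextrema(x, np.greater)[0]  # indices of strict local maxima
    min_idx = argrelextrema(x, np.less)[0]     # indices of strict local minima
    if len(max_idx) < 2 or len(min_idx) < 2:
        raise ValueError("too few extrema to build envelopes")
    # Assumption: splines are extrapolated to the interval ends (SciPy default).
    upper = CubicSpline(t[max_idx], x[max_idx])(t)
    lower = CubicSpline(t[min_idx], x[min_idx])(t)
    return 0.5 * (upper + lower)


def sift_once(t, x):
    """One sifting step: subtract the envelope mean from the signal."""
    return x - envelope_mean(t, x)
```

In a complete decomposition, `sift_once` would be applied repeatedly until the result satisfies an IMF stopping criterion, and the extracted IMF would then be subtracted from the signal before the next IMF is sought.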
For testing and validation, we use an amplitude-modulated, frequency-modulated (AM-FM) signal that exhibits both time-varying amplitude and instantaneous frequency.
AM-FM signals are ideal test cases because they are non-stationary by construction, have known ground truth for validation, and appear naturally in many applications (speech, radar, biomedical signals, financial data).
Amplitude Modulation:
\[ A(x) = 1 + 0.5 \sin(4\pi x) \]
The amplitude varies between 0.5 and 1.5 with period 0.5.
Phase Function:
\[ \phi(x) = 2\pi (6x + 12x^2) \]
The quadratic phase creates a chirp (increasing frequency).
Instantaneous Frequency:
\[ f(x) = \frac{1}{2\pi}\frac{d\phi}{dx} = 6 + 24x \]
The frequency increases linearly from 6 Hz at \(x = 0\) to 30 Hz at \(x = 1\).
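As a quick sanity check on the instantaneous-frequency formula, the phase can be differentiated numerically and compared against \(6 + 24x\). The grid size of 1001 points is an arbitrary choice made for this sketch.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1001)                # assumed grid, not from the text
phase = 2.0 * np.pi * (6.0 * x + 12.0 * x**2)  # phi(x)

# Second-order finite differences are exact for a quadratic phase,
# up to floating-point rounding.
f_numeric = np.gradient(phase, x, edge_order=2) / (2.0 * np.pi)
f_exact = 6.0 + 24.0 * x

max_error = np.max(np.abs(f_numeric - f_exact))
```

Here `max_error` should be at the level of floating-point rounding, and `f_exact` runs from 6 at the left end of the interval to 30 at the right end.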
Clean Signal:
\[ y_{\text{clean}}(x) = A(x) \sin(\phi(x)) = [1 + 0.5\sin(4\pi x)] \sin(2\pi(6x + 12x^2)) \]
Noisy Signal:
\[ y(x) = y_{\text{clean}}(x) + \sigma \varepsilon(x), \quad \varepsilon \sim \mathcal{N}(0,1) \]
We typically use \(\sigma = 0.2\) for examples.
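The test signal defined by these formulas can be generated with a few lines of NumPy. The sample count `n`, the `seed`, and the function name `make_am_fm` are assumptions made for this sketch; only the formulas and \(\sigma = 0.2\) come from the text.

```python
import numpy as np


def make_am_fm(n=1000, sigma=0.2, seed=0):
    """Return (x, y_clean, y) for the AM-FM test signal on [0, 1]."""
    x = np.linspace(0.0, 1.0, n)
    amplitude = 1.0 + 0.5 * np.sin(4.0 * np.pi * x)  # A(x)
    phase = 2.0 * np.pi * (6.0 * x + 12.0 * x**2)    # phi(x)
    y_clean = amplitude * np.sin(phase)
    rng = np.random.default_rng(seed)                # seeded for reproducibility
    y = y_clean + sigma * rng.standard_normal(n)     # additive Gaussian noise
    return x, y_clean, y
```

Calling `make_am_fm(sigma=0.0)` returns a noisy signal identical to the clean one, which is convenient when checking a decomposition against the known ground truth.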
| Property | Value | Description |
|---|---|---|
| Amplitude Range | [0.5, 1.5] | Modulated by low-frequency sinusoid |
| Frequency Range | [6, 30] Hz | Linear chirp (increasing frequency) |
| Noise Level | \(\sigma = 0.2\) | Moderate Gaussian noise |
| Domain | \(x \in [0, 1]\) | Normalized time interval |
We compare three methods for computing the local trend in the sifting process:
Local Median: kernel-weighted median using the Epanechnikov kernel. Robust to outliers, with a 50% breakdown point.
Running median with efficient heap-based computation (Härdle & Steiger, 1995).
Standard Nadaraya-Watson kernel regression. Not robust to outliers.
The Local Median approach combines the adaptivity of kernel smoothing with the robustness of the median, which makes it well suited to financial time series that often contain extreme values.
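The three local-trend estimators compared above can be sketched as follows. These are illustrative versions only: the function names and the bandwidth parameter `h` are assumptions, the kernel-weighted median uses a simple O(n²) loop, and the running median uses `scipy.ndimage.median_filter` as a stand-in rather than the heap-based algorithm cited in the text.

```python
import numpy as np
from scipy.ndimage import median_filter


def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on |u| <= 1, zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]


def local_median(x, y, h):
    """Kernel-weighted median with bandwidth h (robust to outliers)."""
    out = np.empty(len(y), dtype=float)
    for i, x0 in enumerate(x):
        w = epanechnikov((x - x0) / h)
        keep = w > 0  # the point x0 itself always has positive weight
        out[i] = weighted_median(y[keep], w[keep])
    return out


def local_mean(x, y, h):
    """Nadaraya-Watson kernel regression with bandwidth h (not robust)."""
    w = epanechnikov((x[:, None] - x[None, :]) / h)
    return (w @ y) / w.sum(axis=1)


def running_median(y, window):
    """Unweighted running median over a fixed number of samples."""
    return median_filter(y, size=window, mode="nearest")
```

On a straight line with a single large spike, the two median-based smoothers should stay close to the line at the spike location, while the kernel mean is pulled toward the spike. That difference is what the robustness claims above refer to.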