Dynamic Hedge Ratios via Kalman Filter

The standard pairs trading setup is deceptively simple: find two cointegrated assets, compute a hedge ratio, trade the spread when it deviates from its mean.

The problem is that the hedge ratio isn't constant. The relationship between two cointegrated assets drifts, sometimes slowly, sometimes sharply. A static OLS estimate from the past 252 days is a point estimate of a moving target.
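
For reference, the static estimate described here might be computed along the following lines. This is a minimal sketch: the function name `static_hedge_ratio` and the synthetic price series are illustrative assumptions, and only the 252-day window comes from the text.

```python
import numpy as np

def static_hedge_ratio(prices_a, prices_b, window=252):
    """OLS fit of A on B over the trailing `window` observations.

    Returns (beta, alpha) for the model P_A = beta * P_B + alpha + noise.
    """
    a = np.asarray(prices_a, dtype=float)[-window:]
    b = np.asarray(prices_b, dtype=float)[-window:]
    # Design matrix with a column of ones for the intercept
    X = np.column_stack([b, np.ones_like(b)])
    (beta, alpha), *_ = np.linalg.lstsq(X, a, rcond=None)
    return beta, alpha

# Synthetic example (assumed data): A = 1.5 * B + 10 plus small noise
rng = np.random.default_rng(0)
prices_b = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=500))
prices_a = 1.5 * prices_b + 10.0 + rng.normal(0.0, 0.5, size=500)

beta, alpha = static_hedge_ratio(prices_a, prices_b)
# Latest spread under the static fit
spread = prices_a[-1] - (beta * prices_b[-1] + alpha)
```

Whatever the window length, `beta` and `alpha` stay frozen between refits, which is the staleness the rest of this piece is about.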

The Kalman filter solves this cleanly. Instead of periodically re-running OLS, it continuously updates the hedge ratio as new observations arrive.

The State Space Formulation

We model the relationship between price series $P_t^A$ and $P_t^B$ as:

$P_t^A = \beta_t \cdot P_t^B + \alpha_t + \varepsilon_t$

Where $\beta_t$ (the hedge ratio) and $\alpha_t$ (the intercept) are time-varying state variables, not fixed parameters. The state vector is:

$\theta_t = [\beta_t, \alpha_t]^T$

The state transition model assumes both state variables, the hedge ratio and the intercept, follow a random walk:

$\theta_t = \theta_{t-1} + w_t, \quad w_t \sim \mathcal{N}(0, Q)$

The observation model is:

$P_t^A = H_t \cdot \theta_t + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, R)$

Where $H_t = [P_t^B, 1]$ is the observation matrix.
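
To make the formulation concrete, the sketch below simulates data from this model: a random-walk state and a noisy linear observation. The function name `simulate_pair`, the initial state, and the noise levels are illustrative assumptions. The path of $P_t^B$ is an arbitrary random walk, since the model itself says nothing about how $P_t^B$ evolves.

```python
import numpy as np

def simulate_pair(n=500, q=1e-5, r=0.25, seed=0):
    """Simulate (prices_a, prices_b, thetas) from the state space model.

    q : variance of each component of w_t (Q = q * I)
    r : variance of eps_t (R = r)
    """
    rng = np.random.default_rng(seed)
    # Regressor P_t^B: an arbitrary random walk starting near 100 (assumption)
    prices_b = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=n))

    thetas = np.empty((n, 2))
    prices_a = np.empty(n)
    theta = np.array([1.5, 10.0])  # assumed initial [beta, alpha]
    for t in range(n):
        # State transition: theta_t = theta_{t-1} + w_t
        theta = theta + rng.normal(0.0, np.sqrt(q), size=2)
        # Observation: P_t^A = H_t theta_t + eps_t, with H_t = [P_t^B, 1]
        H = np.array([prices_b[t], 1.0])
        prices_a[t] = H @ theta + rng.normal(0.0, np.sqrt(r))
        thetas[t] = theta
    return prices_a, prices_b, thetas

prices_a, prices_b, thetas = simulate_pair()
```

Because the true `thetas` are returned alongside the prices, data like this is convenient for checking how closely a filter tracks the hidden hedge ratio.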

The Filter

import numpy as np

class KalmanHedgeFilter:
    def __init__(self, delta: float = 1e-4):
        # State: [beta, alpha], initialized at zero
        self.theta = np.zeros(2)
        self.P = np.eye(2)                    # State covariance
        self.Q = delta / (1 - delta) * np.eye(2)  # Process noise covariance
        self.R = 1.0                          # Observation noise variance (fixed here, not estimated)

    def update(self, price_a: float, price_b: float) -> dict:
        H = np.array([price_b, 1.0])

        # Predict
        P_pred = self.P + self.Q

        # Innovation
        y_hat = H @ self.theta
        innovation = price_a - y_hat
        S = H @ P_pred @ H.T + self.R        # Innovation covariance

        # Kalman gain
        K = P_pred @ H.T / S

        # Update
        self.theta = self.theta + K * innovation
        self.P = (np.eye(2) - np.outer(K, H)) @ P_pred

        return {
            "beta": self.theta[0],
            "alpha": self.theta[1],
            "spread": innovation,
            "spread_std": np.sqrt(S),
            "z_score": innovation / np.sqrt(S),
        }
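
Written in the notation of the previous section, one call to `update` corresponds to the recursion below. It is transcribed from the code above and is the standard Kalman recursion for a random-walk state with a scalar observation. Here $e_t$, $S_t$, and $K_t$ are the innovation, its variance, and the gain (`innovation`, `S`, and `K` in the code), and $P_{t|t-1}$ is the predicted state covariance (`P_pred`). The covariance $P$ without a superscript should not be confused with the prices $P_t^A$ and $P_t^B$.

$P_{t|t-1} = P_{t-1} + Q$

$e_t = P_t^A - H_t \theta_{t-1}$

$S_t = H_t P_{t|t-1} H_t^T + R$

$K_t = P_{t|t-1} H_t^T / S_t$

$\theta_t = \theta_{t-1} + K_t e_t$

$P_t = (I - K_t H_t) P_{t|t-1}$

The z-score returned by `update` is $e_t / \sqrt{S_t}$, so it is computed from the state before the current observation is absorbed.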

The Delta Parameter

The single most important tuning decision is delta, the process noise scaling factor, which sets the process noise covariance $Q$ through `delta / (1 - delta)`.

High delta → filter trusts new observations heavily, beta tracks price changes quickly, generates more false signals
Low delta → filter is slow to update, beta is stable, misses genuine regime changes

In practice, delta = 1e-4 to 1e-5 works for daily equity pairs. For higher-frequency data or more volatile relationships, you want higher delta.

You can estimate delta empirically by maximizing the log-likelihood of the innovations $e_t$ given their variances $S_t$ (`innovation` and `S` in the code), up to an additive constant:

$\mathcal{L} = -\frac{1}{2} \sum_t \left[ \log S_t + \frac{e_t^2}{S_t} \right]$
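
One simple way to use this is a coarse grid search over candidate deltas, sketched below. The function name `innovation_loglik`, the burn-in length, the grid, and the synthetic data are all assumptions made for illustration. $R$ is held fixed at 1.0 as in the class above, although in practice it could be estimated jointly with delta.

```python
import numpy as np

def innovation_loglik(prices_a, prices_b, delta, r=1.0, burn_in=50):
    """Log-likelihood (up to a constant) of the innovations for a given delta.

    Runs the same recursion as KalmanHedgeFilter.update. The first `burn_in`
    steps are skipped (an assumption) so that the arbitrary initial state
    does not dominate the sum.
    """
    theta = np.zeros(2)
    P = np.eye(2)
    Q = delta / (1.0 - delta) * np.eye(2)
    loglik = 0.0
    for t, (pa, pb) in enumerate(zip(prices_a, prices_b)):
        H = np.array([pb, 1.0])
        P_pred = P + Q
        e = pa - H @ theta          # innovation e_t
        S = H @ P_pred @ H + r      # innovation variance S_t
        K = P_pred @ H / S          # Kalman gain
        theta = theta + K * e
        P = (np.eye(2) - np.outer(K, H)) @ P_pred
        if t >= burn_in:
            loglik += -0.5 * (np.log(S) + e**2 / S)
    return loglik

# Synthetic data (assumed): beta drifts slowly as a random walk
rng = np.random.default_rng(1)
n = 1000
prices_b = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=n))
beta = 1.5 + np.cumsum(rng.normal(0.0, 0.002, size=n))
prices_a = beta * prices_b + 10.0 + rng.normal(0.0, 1.0, size=n)

# Coarse grid search over candidate deltas
deltas = np.logspace(-7, -2, 11)
logliks = np.array([innovation_loglik(prices_a, prices_b, d) for d in deltas])
best_delta = deltas[np.argmax(logliks)]
```

A grid on a log scale is usually enough to find the right order of magnitude. If the maximum lands on an edge of the grid, widen the range before trusting the result.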

Trading the Dynamic Spread

Once you have a continuously updated z-score, entry and exit logic is the same as in static pairs trading, but now the spread is normalized against the filter's current estimate of the innovation standard deviation:

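ENTRY_Z = 2.0
EXIT_Z = 0.5

kf = KalmanHedgeFilter()
in_position = False
direction = 0  # +1 = long A / short B, -1 = short A / long B, 0 = flat

# prices_a and prices_b are aligned sequences of prices for the two assets
for price_a, price_b in zip(prices_a, prices_b):
    state = kf.update(price_a, price_b)
    z = state["z_score"]

    if abs(z) > ENTRY_Z and not in_position:
        # Enter: long the underpriced, short the overpriced.
        # z > 0 means A is rich relative to beta * B + alpha, so short A.
        direction = -int(np.sign(z))
        in_position = True

    elif abs(z) < EXIT_Z and in_position:
        # Exit at mean reversion
        direction = 0
        in_position = False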

What Changes vs. Static OLS

The practical difference shows up most in trending markets and regime transitions. A static hedge ratio estimated during a low-volatility period becomes dangerously stale when correlation structure shifts. The Kalman filter doesn't fix this completely; it just degrades gracefully instead of catastrophically.

The other difference is signal quality. Because the z-score is normalized against the Kalman filter's own estimate of innovation variance, you get fewer false entries during periods when the spread is simply more volatile. The static approach treats all spread deviations equally regardless of the current noise regime.

Neither approach survives a genuine cointegration breakdown. That's a separate problem.