Pairs trading and correlations

Written by Forex Automaton   
Thursday, 29 October 2009 10:34

I approach pairs trading with the correlation toolbox and basic algebra. Consider two price time series, a(t) and b(t), sampled on a fixed time scale (second, minute, hour, and the like). Most explanations of pairs trading fail to communicate the importance of non-zero correlations at non-zero time lags, let alone the importance of their constructive interference (explained below). Yet it is these subtleties that make the difference between just another roulette-like source of random outcomes and a reliable, low-risk source of arbitrage profits.

A typical hedging transaction consists of simultaneously buying one and selling the other of two positively correlated instruments. The transaction is closed by selling and buying, respectively. If the operator strictly limits themselves to pairs trading in these two securities, one effectively trades (buys and sells) a single synthetic instrument whose price is the difference p(t) = a(t)-b(t).

Since we want to work with returns, the subtraction restricts how the returns can be constructed to a linear operation. The returns A(t) and B(t) are defined as

 A(t) = a(t+1)-a(t) 

 B(t) = b(t+1)-b(t), 

so that the return P(t) of the pair instrument is

 P(t) = p(t+1)-p(t) = a(t+1)-b(t+1)-(a(t)-b(t))=A(t)-B(t). 
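The definitions above can be checked on a handful of numbers. The sketch below is only an illustration: the price samples are made up, and the helper name `returns` is hypothetical.

```python
def returns(prices):
    """Linear (difference) returns: r[t] = prices[t+1] - prices[t]."""
    return [nxt - cur for cur, nxt in zip(prices, prices[1:])]

# Hypothetical price samples for two instruments (illustrative numbers only).
a = [1.4810, 1.4822, 1.4815, 1.4830, 1.4827]
b = [1.4650, 1.4659, 1.4655, 1.4664, 1.4668]

# Synthetic pair price p(t) = a(t) - b(t).
p = [x - y for x, y in zip(a, b)]

A, B, P = returns(a), returns(b), returns(p)

# P(t) should equal A(t) - B(t) up to floating-point rounding.
max_gap = max(abs(pt - (at - bt)) for pt, at, bt in zip(P, A, B))
print(max_gap)
```

The identity holds only because the returns are plain differences. With ratio or logarithmic returns, the return of a(t)-b(t) would not be the difference of the individual returns.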

I aim to clarify the role of autocorrelations (especially those at non-zero lags) and cross-correlations in the choice of pairs to hedge. The whole purpose of the game is to reduce the risk while maximizing the profit. The usual intuitive, anecdotal explanation is that when b(t) exceeds a(t) by "enough" at some moment t, one can hope to "lock in" that difference by buying a and selling b. This is only part of the story. The whole story is how you trade A-B over time, and whether, by how much, and why this is less risky than trading A and B separately.

For a homogeneous (stationary) time series, the autocorrelation

Et[A(t)A(t+td)] = C[A](td)

is a function of only one argument, the time lag td. I will assume that this is the case for the time series of returns. Here Et denotes the expectation or averaging operator, and the subscript t indicates that the averaging is over time.

For two markets A and B, the inter-market or cross-market correlation is, likewise,

Et[A(t)B(t+td)] = C[A,B](td).
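One way to estimate these quantities from finite samples is to replace Et with a plain average over the available t. The sketch below follows the formulas literally, so it neither subtracts the mean nor normalizes. The function names `cross_corr` and `auto_corr` are hypothetical, and averaging only over the overlapping samples is an assumption of this sketch.

```python
def cross_corr(x, y, td):
    """Time-average estimate of Et[x(t) * y(t+td)]; the integer lag td may be negative."""
    n = min(len(x), len(y))
    # Keep only those t for which both t and t+td are valid indices.
    start, stop = max(0, -td), min(n, n - td)
    if stop <= start:
        raise ValueError("lag too large for the series length")
    return sum(x[t] * y[t + td] for t in range(start, stop)) / (stop - start)

def auto_corr(x, td):
    """C[x](td): the cross-correlation of a series with itself."""
    return cross_corr(x, x, td)

# A strictly alternating series reverses itself at every step.
x = [1.0, -1.0, 1.0, -1.0]
print(auto_corr(x, 0), auto_corr(x, 1))
```

For the alternating series, the zero-lag value is the mean square of the samples and the lag-1 value is its negative, the extreme case of a series that always reverses its previous move.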

The measure of risk inherent in market A is C[A](0), the variance (strictly, the mean square of the returns, which equals the variance when the mean return is zero). Likewise for B, it is C[B](0). The remaining values of C[A] (those at non-zero td) measure how predictable A is from its own past; I will denote them symbolically C[A](non-zero). Same for B. Low predictability cannot be helped, whereas low volatility can be compensated for by using leverage. A typical forex time series has C[A] dominated by a peak at td=0, with C[A](non-zero) amounting at best (depending on the time scale) to several percent of the peak. The goal of the game is to have high predictability while minimizing risk, risk being understood as unpredictable and therefore unhelpful volatility: all the volatility content that gets recorded in C[A](0) and C[B](0). Predictable volatility, on the contrary, is what a speculator wants. Let's see what the simple operation of subtraction does to C[](0) and C[](non-zero).

As it turns out, predicting what the autocorrelation of P(t) will look like on the basis of C[A], C[B] and C[A,B] is not difficult, and no other information is needed.

C[P](td) = Et[P(t)P(t+td)] = Et[(A(t)-B(t))(A(t+td)-B(t+td))]=

Et[A(t)A(t+td)]-Et[B(t)A(t+td)]-Et[A(t)B(t+td)]+Et[B(t)B(t+td)]=

C[A](td) - C[A,B](td) - C[B,A](td)+C[B](td).
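The decomposition can be verified numerically. In the sketch below the two return series are synthetic (Gaussian noise with a shared component, an assumption made only for illustration), and `cross_corr` is a hypothetical helper that averages x(t)y(t+td) over the overlapping samples.

```python
import random

def cross_corr(x, y, td):
    """Time-average estimate of Et[x(t) * y(t+td)]; td may be negative."""
    n = min(len(x), len(y))
    start, stop = max(0, -td), min(n, n - td)
    return sum(x[t] * y[t + td] for t in range(start, stop)) / (stop - start)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical returns: B shares most of its movement with A.
A = [rng.gauss(0.0, 1.0) for _ in range(500)]
B = [0.8 * a + rng.gauss(0.0, 0.5) for a in A]
P = [a - b for a, b in zip(A, B)]

# Compare C[P](td) with C[A] - C[A,B] - C[B,A] + C[B] at several lags.
worst = 0.0
for td in range(-5, 6):
    lhs = cross_corr(P, P, td)
    rhs = (cross_corr(A, A, td) - cross_corr(A, B, td)
           - cross_corr(B, A, td) + cross_corr(B, B, td))
    worst = max(worst, abs(lhs - rhs))

print(worst)  # largest mismatch over the tested lags
```

Because all four terms are averaged over the same set of t, the identity holds sample by sample, and any mismatch is floating-point rounding only.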

The cross-terms are just C[A,B] and C[B,A], mirror reflections of one another: C[B,A](td) = C[A,B](-td). Each of them can be asymmetric in td (see the examples of actual forex cross-market correlations, in addition to the examples of correlations). Therefore, in general,

C[A,B](td) does not equal C[B,A](td)

except when td=0.
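A toy leader-follower series makes the asymmetry visible. In this sketch B simply repeats A's previous move plus noise. That construction, the seed, and the helper name `cross_corr` are assumptions made for illustration, not a model of any real market.

```python
import random

def cross_corr(x, y, td):
    """Time-average estimate of Et[x(t) * y(t+td)]; td may be negative."""
    n = min(len(x), len(y))
    start, stop = max(0, -td), min(n, n - td)
    return sum(x[t] * y[t + td] for t in range(start, stop)) / (stop - start)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000
A = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical follower: B repeats A's previous move plus its own noise.
B = [0.0] + [A[t - 1] + rng.gauss(0.0, 0.5) for t in range(1, n)]

for td in (-1, 0, 1):
    print(td, round(cross_corr(A, B, td), 3), round(cross_corr(B, A, td), 3))
```

The peak of C[A,B] sits at td=+1 (A leads), while the same peak shows up in C[B,A] at td=-1. The two columns agree only in the td=0 row.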

[Fig.1 panels. 1.1: autocorrelation of returns, C[A] or C[B]. 1.2: intermarket correlation of returns, AB, C[A,B]. 1.3: intermarket correlation of returns, BA, C[B,A]. 1.4: autocorrelation of the A-B pair, C[P].]

Fig.1. A mock-up example illustrating the equations for what might be a good pair to pair-trade. td is the time-lag (difference) variable; the correlation magnitude is plotted along the vertical axis. Such correlation patterns are typical in forex. "Bipolar disorder" is the feature in panel 1.1. Panels 1.2 and 1.3 illustrate the leading indicator or "leader-follower" effect, which is also frequent. Panel 1.4 is the autocorrelation of the resulting series P(t)=A(t)-B(t).

The unhelpful risk content in C[P](0) can potentially be reduced by the subtraction of C[A,B](0) and C[B,A](0). For that, C[A,B](0) needs to be large and positive. However, that is not enough for a speculative pair. We want the predictable content to survive in C[P](td). For that, it must first exist in at least some of the correlations. Second, it must exist in a form that the subtraction detailed above does not destroy. A good pair is one where both A and B suffer from bipolar disorder (an empirically discovered concept mentioned many times throughout this site, including the summary report linked above), while A and B are positively correlated, preferably with a positive leader-follower feature. That way, in C[P](td) the bipolar disorder enters as negative spikes, the unpredictable and unhelpful peak at td=0 is reduced, and the leader-follower feature, subtracted once through C[A,B] and once through its mirror image C[B,A], enters twice, symmetrically around zero, thus interfering constructively with the bipolar disorder. The process is illustrated in Fig.1.

The result, Fig.1.4, is remarkable in that the correlation magnitude at the non-zero lags is enhanced considerably with respect to the central peak at td=0. To put it differently, the central peak magnitude (a reflection of the unpredictable and therefore unhelpful volatility, the propensity of the pair to "drift") is beaten down considerably with respect to the "good", algorithmically predictable volatility associated with non-zero time lags. In theory, the central peak can be reduced to 0 (unfortunately, along with the rest of the structure) if A and B are perfectly correlated with equal volatility, that is, if A(t)=B(t).
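The mechanism can be reproduced with a toy simulation. Everything in the sketch below is an assumption made for illustration: "bipolar disorder" is modeled as a small noise term that reverts on the next step (negative lag-1 autocorrelation), the leader-follower effect as a component that hits A one step before B, and the noise amplitudes are arbitrary. It is not the data behind Fig.1.

```python
import random

def cross_corr(x, y, td):
    """Time-average estimate of Et[x(t) * y(t+td)]; td may be negative."""
    n = min(len(x), len(y))
    start, stop = max(0, -td), min(n, n - td)
    return sum(x[t] * y[t + td] for t in range(start, stop)) / (stop - start)

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20000

def noise(sigma):
    """n+1 independent Gaussian samples with standard deviation sigma."""
    return [rng.gauss(0.0, sigma) for _ in range(n + 1)]

e = noise(1.0)  # move shared by both markets at the same time step
w = noise(0.3)  # move that reaches A one step before B (leader-follower)
u = noise(0.2)  # A's own overshoot, reverted on the next step
v = noise(0.2)  # B's own overshoot, reverted on the next step

# The index starts at 1 so that t-1 is always valid.
A = [e[t] + w[t] + (u[t] - u[t - 1]) for t in range(1, n + 1)]
B = [e[t] + w[t - 1] + (v[t] - v[t - 1]) for t in range(1, n + 1)]
P = [a - b for a, b in zip(A, B)]

def rel(x, td):
    """Autocorrelation at lag td relative to the central peak at td=0."""
    return cross_corr(x, x, td) / cross_corr(x, x, 0)

print("central peaks:", round(cross_corr(A, A, 0), 3), round(cross_corr(P, P, 0), 3))
print("lag-1 relative to peak:", round(rel(A, 1), 3), round(rel(P, 1), 3))
```

Under these assumptions the lag-1 autocorrelation of A is only a few percent of its central peak (about -0.03 in the infinite-sample limit), while for P it approaches -0.5: the shared component cancels out of the peak, and the reverting and leader-follower terms add up at lag 1.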

Executive summary: to find a pair for pair-trading, you need both

  • high correlation at zero time lag between the ingredients of the pair
  • significant correlations at non-zero time lags in either one ingredient, or in both ingredients, or in the cross-correlation between the ingredients -- ideally, in all of these places, provided they don't cancel each other. The equations above and Fig.1 explain how they might cancel.

These conclusions are based entirely on the analysis presented in this post. The forex correlations section of the site is a good starting place to research correlated pairs.
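The two criteria from the executive summary can be turned into a rough screening helper. The sketch below is one possible reading, not a tested trading tool. The function name `pair_diagnostics`, the normalization of the zero-lag correlation, and the choice to inspect C[P] directly (which already combines the auto- and cross-correlations, including any cancellation) are all assumptions.

```python
import math

def cross_corr(x, y, td):
    """Time-average estimate of Et[x(t) * y(t+td)]; td may be negative."""
    n = min(len(x), len(y))
    start, stop = max(0, -td), min(n, n - td)
    return sum(x[t] * y[t + td] for t in range(start, stop)) / (stop - start)

def pair_diagnostics(A, B, max_lag=5):
    """Return (rho0, best_lag, strength) for return series A and B.

    rho0     -- zero-lag correlation of A and B, scaled to [-1, 1]
    best_lag -- positive lag where |C[P](td)| / C[P](0) is largest
    strength -- signed value of C[P](best_lag) / C[P](0)
    """
    rho0 = cross_corr(A, B, 0) / math.sqrt(cross_corr(A, A, 0) * cross_corr(B, B, 0))
    P = [a - b for a, b in zip(A, B)]
    peak = cross_corr(P, P, 0)
    if peak == 0:
        return rho0, None, 0.0  # A equals B: nothing is left to trade
    # An autocorrelation is symmetric in td, so positive lags suffice.
    rel = {td: cross_corr(P, P, td) / peak for td in range(1, max_lag + 1)}
    best_lag = max(rel, key=lambda td: abs(rel[td]))
    return rho0, best_lag, rel[best_lag]

# Deterministic toy input: B is a scaled copy of a strictly alternating A.
A = [1.0, -1.0] * 10
B = [0.5 * a for a in A]
rho0, lag, strength = pair_diagnostics(A, B, max_lag=3)
print(rho0, lag, strength)
```

A candidate pair would want rho0 close to 1 together with a strength well away from 0. What counts as "close" and "well away" is left open here, since the post gives no thresholds.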

Last Updated ( Friday, 15 April 2011 14:36 )