TRY (Turkish Lira) Intermarket Correlation: USD/TRY Follows EUR/JPY

A time-integrated study of hourly-scale lagged correlation between USD/TRY and EUR/JPY reveals hints of a relationship whereby EUR/JPY leads and USD/TRY follows with a lag of one hour.

Compared to the series of intermarket correlation reports generated in 2008-2009 (not covering Turkish Lira), the new reports adopt a somewhat different approach to statistical uncertainty estimation. Instead of synthesizing mock time series with the volatility distribution of the real ones, producing their autocorrelations (and intermarket correlations), and inferring the uncertainty from a comparison of many such analyses, as was done in 2008-2009, I now infer the uncertainty of the correlation coefficients directly as a standard deviation of the sum, knowing the terms entering the sum. As before, hourly data are used and the quantity correlated is hourly logarithmic return.

The data used cover the period from November 14, 2010 till December 09, 2012.


Fig.1. Correlation in hourly logarithmic returns between USD/TRY and EUR/JPY, 2010-2012. 1.1: Lags from 0 to 200 hours. 1.2: Zooming on the signal. The correlation is normalized so that total correlation corresponds to correlation strength of 1, total anti-correlation — to correlation strength of -1 (see Pearson correlation coefficient).

The time lag between the USD/TRY and EUR/JPY time series is defined as

td = t(USD/TRY) – t(EUR/JPY).

The negative correlation content in the +1 hour time lag bin is seen with about 2.5 standard deviations. This means that a move (the present analysis is insensitive to the direction of that move) in USD/TRY at time t and a move in EUR/JPY at time t-1 are negatively correlated over the time of observation. This is the same as saying that USD/TRY follows (trails) a movement in EUR/JPY with a lag of one hour. Due to the discrete nature of binning, the actual ticks that create the effect could be separated by a time interval anywhere from nearly zero to just under two hours and still land in separate adjacent hourly periods, creating a 1-hour lag effect. The average such time interval is one hour.
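For concreteness, the lagged Pearson correlation of hourly logarithmic returns can be sketched in a few lines. This is a minimal illustration, not the code behind the figures; the synthetic series and the 0.3 coupling strength are invented so that a lead-lag effect of the kind described (a positive lag meaning the second series leads) is visible:

```python
import numpy as np

def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t - lag].

    x, y: aligned 1-D arrays of hourly logarithmic returns.
    A positive lag means y leads x by `lag` hours.
    """
    if lag > 0:
        x, y = x[lag:], y[:-lag]
    elif lag < 0:
        x, y = x[:lag], y[-lag:]
    return np.corrcoef(x, y)[0, 1]

# Illustration on synthetic data: y leads x by one step with a negative sign.
rng = np.random.default_rng(0)
y = rng.normal(size=10_000)
x = -0.3 * np.roll(y, 1) + rng.normal(size=10_000)
x[0] = rng.normal()  # replace the wrapped-around first element
```

With x standing in for USD/TRY and y for EUR/JPY, `lagged_correlation(x, y, 1)` comes out noticeably negative while the lag-0 value is consistent with zero, mimicking the signal in Fig.1.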


Fig.2. Autocorrelation in hourly logarithmic returns in USD/TRY and EUR/JPY, 2010-2012. The correlation is normalized as in Fig.1.

Speaking of pair trading, when forming a pair of EUR/JPY and USD/TRY, due to the negative correlation, both positions must be held short or long at the same time. The relative weight with which EUR/JPY and USD/TRY enter the pair must be optimized to increase the one-lag correlation with respect to the one at lag zero. Qualitatively speaking, the features around zero in Fig.2 (zero autocorrelation at 1-hour lag in EUR/JPY and negative autocorrelation at 1-hour lag in USD/TRY) will not cancel the feature in Fig.1 (also negative). The resulting pair will be a mean-reverting autocorrelated time series.

Central European Time

Central European time is chosen for the following reason. Forex week begins, roughly speaking (since the volume increase is gradual) on Sunday 5pm and ends Friday 6pm Eastern time. It is convenient to define this week to consist of 5 full days, from 6pm Sunday to 6pm Friday New York time. When it’s 6pm in New York, it’s midnight in Berlin, Paris, Madrid, Rome, Geneva and Frankfurt. These cities use Central European Time or CET. Therefore, the convenience of using CET is that one gets 5 non-interrupted, full 24-hour long trading days per week. Table 1 compares four time zones including major trading centers of the world.

Tokyo:           9 10 11 12 13 14 15 16 17 18 19 20 21 22 23  0  1  2  3  4  5  6  7  8
Central Europe:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23  0
Eastern US:     19 20 21 22 23  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18

Table 1. Time zone conversion table. Seasonal time shifts, such as daylight saving time, may complicate the picture if the nations choose to enact them on different days, and are ignored.

Profile Histogram

A profile histogram provides an economic representation of two-dimensional (X vs Y) data. Unlike a two-dimensional histogram, the profile histogram is unbounded in the Y (vertical) dimension. Like any histogram, it has discrete bins along the X axis to aggregate the data. The data are represented graphically as a series of (X,Y) points on a plot, with the point position in Y corresponding to the mean and the vertical bars typically indicating the measure of the spread such as the RMS, or the measure of the precision of the mean, whereby larger bars correspond to less precision.
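The construction just described can be sketched in a few lines. This is an illustration under arbitrary choices (equal-width bins, and error bars showing the precision of the mean, RMS/sqrt(N), rather than the RMS itself):

```python
import numpy as np

def profile_histogram(x, y, bins):
    """Bin y by x; return bin centers, per-bin means of y, and errors on the mean."""
    edges = np.linspace(np.min(x), np.max(x), bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    means = np.full(bins, np.nan)
    errors = np.full(bins, np.nan)
    # Assign each point to a bin; clip so the right edge falls in the last bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    for b in range(bins):
        sel = y[idx == b]
        if sel.size:
            means[b] = sel.mean()
            errors[b] = sel.std() / np.sqrt(sel.size)
    return centers, means, errors

# Toy data: y = 2x plus noise; the profile should recover the trend.
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 20_000)
y = 2 * x + rng.normal(0, 0.1, 20_000)
centers, means, errors = profile_histogram(x, y, bins=10)
```

The point of the exercise is the economy: 20,000 (X,Y) pairs are reduced to ten points with error bars, while the trend survives the reduction.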

None of the projection techniques is perfect, since the reduction of information involved in all projections is not guaranteed to be “intelligent”. But they do solve the problem, even though it is necessary to look from more than one point of view and try more than one path to optimization.

High Order Cumulants

Cumulants are statistical measures of correlation designed to go to zero whenever any one or more quantities under study become statistically independent of the rest. Cumulants generalize the concept of a correlation measure; in particular, a correlation of two bodies, quantities and so on, the most intuitive one, can be represented and measured by the second-order cumulant. Higher orders can be conceived.

Financial time series come as sequences of bars or “candles”, one bar per time step of the series. Each bar has open, close, low and high price levels. In a liquid market like forex, close, low and high should be sufficient, while open is typically not too different from the previous close and is believed to be redundant.

To judge the quality of market predictions, we are interested in multivariate cumulants, since for each of the three essential components of a candle there is a prediction. Because we make predictions for each of these components, the number of variables we would like to correlate is even, and therefore we are interested in even-order cumulants.

The simplest of these is the second-order cumulant, also known as the covariance:

c2 = E[x1x2] – E[x1]E[x2] (1)

Here E[] stands for the averaging (a.k.a. expectation) operator.

In general, n-th order cumulant is constructed by making a correlation term E[x1x2…xn] and subtracting all terms which are not “genuinely” n-th order correlations, but are composed of lower order ingredients.

Take for example the daily candle. We can generate predictions for daily changes in low and high, such that the correlation between the real and predicted change for high will be positive. Same for low. If we take a trading position having yesterday’s low as a stop-loss and yesterday’s high as a profit target, we want to make sure that not only there is a tendency for low not to be hit when the prediction says so (attested to by the positive correlation between the daily change in low and its forecast), and not only there is a tendency for the high to be hit when the prediction says so (attested to by the positive correlation between the daily change in high and its forecast), but that these two things tend to happen “simultaneously” within the same trade. This is the essence of the difference between the “genuine” fourth order correlation and a mere superposition of two second order ones.

The fourth-order cumulant is defined as:

c4 = E[1,2,3,4]
– E[1,2,3]E[4] – E[1]E[2,3,4] – E[1,3,4]E[2] – E[1,2,4]E[3]
– E[1,2]E[3,4] – E[1,3]E[2,4] – E[1,4]E[2,3]
+ 2(E[1,2]E[3]E[4] + E[1,3]E[2]E[4] + E[1,4]E[2]E[3] + E[2,3]E[1]E[4] + E[2,4]E[1]E[3] + E[3,4]E[1]E[2])
– 6E[1]E[2]E[3]E[4]. (2)

Here we use the notation: E[x1x2…] gets replaced by E[1,2…] for the sake of brevity.

What is being subtracted is in fact products of lower order cumulants, which in turn subtract their lower order cumulants, which is why there are terms with both plus and minus sign alternating in a certain order. A recurrence relation exists allowing one to express higher order cumulants in terms of lower order ones.

A cumulant of order higher than 2 will go to zero if any two quantities are proportional to each other:

x1=ax2. (3)

The fact that it will also go to zero whenever any one quantity is statistically independent of the rest, combined with the additivity of cumulants, implies that a higher order cumulant will go to zero whenever any pair of quantities has even a less deterministic, randomized form of that equation:

x1=ax2 + r (4)

where r is a random number independent of x2.

A non-zero higher order cumulant indicates that a relationship between the data is not merely Eq. (4), with its familiar visualization as a diagonally elongated cloud in the x1, x2 space — even though one may see such a cloud and other signatures of two-point correlations when subjecting higher-order correlated data to a lower order analysis.

To keep the cumulant independent of the units in which the underlying quantities are expressed, we sometimes normalize it:

C4 = c4/(Var[1]Var[2]Var[3]Var[4])^(1/2), (5)

where Var is variance.
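The expansion of Eq. (2) and the normalization of Eq. (5) can be implemented term by term. This is a sketch, not the code behind the reported measurements; the sanity checks at the bottom (a uniform variable, whose excess kurtosis is -1.2, and independent Gaussians, for which all cumulants above second order vanish) are my own illustration:

```python
import numpy as np
from itertools import combinations

def fourth_order_cumulant(data):
    """Joint fourth-order cumulant c4 of four series (rows of data),
    following the expansion of Eq. (2) term by term."""
    E = lambda idx: np.mean(np.prod(data[list(idx)], axis=0))
    quad = list(range(4))
    c4 = E(quad)
    for trip in combinations(quad, 3):          # 3+1 partitions, 4 terms
        rest = [i for i in quad if i not in trip]
        c4 -= E(trip) * E(rest)
    for pair in [(0, 1), (0, 2), (0, 3)]:       # 2+2 partitions, 3 terms
        rest = [i for i in quad if i not in pair]
        c4 -= E(pair) * E(rest)
    for pair in combinations(quad, 2):          # 2+1+1 partitions, 6 terms
        rest = [i for i in quad if i not in pair]
        c4 += 2 * E(pair) * E([rest[0]]) * E([rest[1]])
    c4 -= 6 * np.prod([E([i]) for i in quad])   # 1+1+1+1 partition
    return c4

def normalized_c4(data):
    """C4 of Eq. (5): c4 over the square root of the product of variances."""
    return fourth_order_cumulant(data) / np.sqrt(np.prod(np.var(data, axis=1)))

# Sanity checks: four copies of a uniform variable reproduce its excess
# kurtosis (-1.2); four independent Gaussians give (nearly) zero.
rng = np.random.default_rng(3)
u = rng.uniform(-1, 1, 200_000)
C4_uniform = normalized_c4(np.tile(u, (4, 1)))
C4_gauss = normalized_c4(rng.normal(size=(4, 200_000)))
```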

Forex Automaton as a Shannon Communication Channel. Introducing the Kelly Criterion.

The intention of this post is to tie together several topics which appeared on my radar screen in the course of the trading system optimization. First, it has been understandably hard to fully rid oneself of vestiges of the mainstream financial theory based on the postulate of market efficiency, while building a wealth-generating tool relying explicitly on demonstrable market inefficiencies. The realization that Sharpe ratio does not let one make an objective choice of a portfolio was there from the beginning, and I recall perceiving this fact as a “necessary evil”. Then came the understanding of the fact that an arithmetic average of returns gives one a biased picture of long-term return, and consequently, Sharpe ratio is built around biased quantities.

In what followed, the key concept was that of Kelly Criterion and the rich intellectual context it is part of — quite remote from the mainstream financial engineering. Initially I came across a mention of Kelly Criterion in some pieces by Edward Thorp, found on his website, but did not fully appreciate the depth of the context. Later, prompted by some readers of this site, I learned about Ed Thorp’s more extensive exposition of Kelly Criterion, “The Kelly Criterion in blackjack, sports betting, and the stock market”.

A good place to begin is probably Claude Shannon’s work The Mathematical Theory of Communication (originally, The Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July, October, 1948). The work introduces the concept of information in the strict theoretical sense. It deals with measures of information and redundancy. These are probably the two most important concepts in algorithmic trading — suffice it to say that if the markets were not redundant, algorithmic trading in the sense discussed and developed here would be impossible and trading would degenerate into gambling. Much of this site’s research content directly deals with measurements of informational redundancy in forex. Shannon’s information theory is, among other important things, the venue for cryptography, linguistics, statistics and mathematics to meet in the context of financial speculation. Shannon introduced the concept of mutual information to characterize the transmission capacity of communication channels.

In 1956, Kelly introduced what is known as Kelly Criterion in an application of Shannon’s theory to gambling. He titled the work “A New Interpretation of Information Rate”. The communication channel considered is a very specific one: it is a noisy channel allowing a gambler to know in advance the outcomes of chance events and bet accordingly. “Noisy” means that the information gambler receives is in general not perfect. Shannon’s mutual information (channel capacity) is the measure of quality of betting tips the gambler receives. Because the chance of losing is non-negligible, the gambler can not bet all of her capital on every game, but because the value of insider data she receives is non-negligible either, her optimal betting ratio is non-zero. The optimal betting ratio in the simple case of two symmetric outcomes equals the difference between the win and loss probabilities if the tip is followed. The maximum expected logarithmic rate of growth of capital (per game) turns out to be Shannon’s mutual information of the abstract communication channel, with true information as the input and prediction as the output (or vice versa, since the formula is symmetric).
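In the symmetric even-money setting just described, the arithmetic is simple enough to sketch; the 55% win probability below is an arbitrary illustration, not a measured figure:

```python
import math

def kelly_binary(p):
    """Even-money bet won with probability p > 1/2.

    The optimal betting fraction is f* = p - q, and the expected growth
    rate of log2-capital per game, p*log2(2p) + q*log2(2q), equals
    1 - H(p): the mutual information of the binary symmetric channel.
    """
    q = 1.0 - p
    f = p - q
    growth = p * math.log2(2 * p) + q * math.log2(2 * q)  # bits per game
    return f, growth

f_star, growth = kelly_binary(0.55)
# f_star ≈ 0.10; growth ≈ 0.0072 bits per game, i.e. 1 - H(0.55)
```

Note how modest the numbers are: a 55/45 edge justifies risking only a tenth of the capital per game, for a growth rate of under a hundredth of a bit.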

This gets me back to the measurements of Forex Automaton prediction quality — the state of the art at the moment is the measurement of Pearson correlation coefficient between predicted and real value of logarithmic returns. I am eager to see what the plot will look like if Shannon’s mutual information is used instead of the correlation coefficient — this is particularly interesting, given Kelly’s interpretation of the quantity.
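A naive plug-in estimator of mutual information, sketched below, would be one way to produce such a plot. The 16-bin discretization is an arbitrary choice, and this estimator is biased upward for finite samples (subtracting a baseline obtained by shuffling one of the series would be a common correction):

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in mutual information estimate (in bits) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0) on empty cells
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(4)
a = rng.normal(size=50_000)
b = rng.normal(size=50_000)
mi_indep = mutual_information(a, b)   # near zero (small positive bias)
mi_self = mutual_information(a, a)    # large: a fully determines itself
```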

Where does this all leave the common theory with its Sharpe ratios and efficient portfolios? A recent (June 2009) article by Javier Estrada, Geometric Mean Maximization: an Overlooked Portfolio Approach? proved informative to me. Estrada contrasts Sharpe ratio maximization which leaves one with not one, but an entire efficient frontier of portfolios with geometric mean maximization (Kelly by other name) leading to the largest expected terminal wealth. Estrada notes that SRM (Sharpe ratio maximization) is a one-period framework, while GMM (geometric mean maximization) is a multi-period framework — I take this to mean that the biased nature of Sharpe when cumulative returns are concerned (as already noted, one of my early disappointments with this statistic) is thus acknowledged in the academic community.

Why do the practitioners overlook a useful criterion? — wonders Estrada. My guess is that the idea of Kelly’s “private wire” and the like is simply too alien to the version of the financial theory based on the postulates of market efficiency, thus making the corollary of Shannon’s information theory as applied to markets, the Kelly Criterion, genetically alien as well. One should never underestimate the role of the overall surrounding cultural and ideological context in the development of science. The symptomatic overstatement of symmetries by the theorists of “market efficiency” may be a case in point. Certainly, Kelly makes more sense to someone dealing with wealth generation on the basis of quantifiable and specific advantages, rather than just submitting oneself to the egalitarian random walk of the hypothetical “efficient market”. This is because Kelly addresses the problem of the value of information directly; it is at the heart of his approach.

Possible Figures Of Merit Related To Return On Investment: The Arithmetic And Geometric Mean

When evaluating the performance of a trading system, I calculate the first moment (an arithmetic mean of the series of returns) as well as the second one (a variance of the series). Originally my “Sharpe-like” ratio, used to adjust the return for the risk, was a ratio of the first moment to the square root of the second. The series of returns would be composed of annualized returns calculated every month.

However there is a subtlety. If {r1,r2,r3,… rn} is a series of monthly returns for some period (ratios of capital at the end of the period to that at the start of the period), then the total return for the same period will be the product r1r2r3… rn.

When each monthly return is already annualized (exponentiated to the 12th power), the opposite operation (applying power 1/12 to the product) is required to get the actual return for the year. This operation is known as taking the geometric mean.

The arithmetic mean of a series of non-negative numbers is always greater than or equal to the geometric mean. This fact is well known from university math courses.

For example, if the series of returns is 1.0, 1.081, 0.9, 1.02, then the geometric average is 0.998084 and the arithmetic average is 1.00025. In this particular example, the arithmetic average of returns will tell you that you are making money whereas in fact you are losing.

Arithmetic mean equals the geometric mean in the particular case when all elements of the series are equal. The more volatility there is in the series, the more difference between the two means can be expected.
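The numerical example above takes two lines of standard-library Python to verify:

```python
import math

returns = [1.0, 1.081, 0.9, 1.02]  # per-period returns from the example above

arithmetic = sum(returns) / len(returns)              # 1.00025
geometric = math.prod(returns) ** (1 / len(returns))  # ≈ 0.998084

# The arithmetic mean suggests a gain, while compounding the series
# (which is what actually happens to capital) yields a loss.
assert geometric < 1 < arithmetic
```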

The bottom line is that the results of the research so far should be revisited using the simple return and the Sharpe ratio based on such return instead of the first moment of the series of annualized returns. The first moment gives a biased estimate of the actual return.

Nevertheless, since the main driving logic in the choice of parameters was minimization of the drawdown, the basic conclusions regarding the parameter choices are likely to stay the same.



“An investment operation is one which, upon thorough analysis, promises safety of principal and an adequate return. Operations not meeting these requirements are speculative.” Thus Graham and Dodd in Security Analysis. Apparently, for Graham and Dodd the term “speculation” has no positive connotations.

What Do You Mean By “Predictable” Or “Predictability” When You Talk About Forex?

Imagine tossing a coin which is slightly bent in a way which is known to you. The bend is almost unnoticeable but it does exist and this fact will become obvious after a long enough series of coin tosses, if you do the book-keeping accurately. Trading the markets with a good trading system is similar. The amount of predictability is small, making it justifiable for people who do not have access to an analysis technique of sufficient sensitivity to talk about “efficient markets” — the markets indeed look “efficient” to their methods of observation. The fact of predictability only becomes undeniable after hundreds or thousands of “coin tosses” (trades).
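The bent-coin analogy is easy to make quantitative; the 51% head probability and the sample sizes below are invented for illustration:

```python
import random

random.seed(1)
p = 0.51  # a barely bent coin

def z_score(n):
    """How many standard errors the observed head fraction of n tosses
    lies from the fair-coin expectation of 1/2."""
    heads = sum(random.random() < p for _ in range(n))
    return (heads / n - 0.5) / (0.5 / n ** 0.5)

z_small = z_score(100)      # typically within noise: the bend is invisible
z_large = z_score(100_000)  # several standard errors: the bend is undeniable
```

With a 1% bend, the expected deviation after n tosses grows like 0.02*sqrt(n) standard errors: about 0.2 after a hundred tosses, but over six after a hundred thousand, which is the sense in which predictability "only becomes undeniable" after enough trades.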

The world economy is never in equilibrium. “I assume that markets are always wrong” said George Soros. A market actor (investor, trader, central banker) requires finite time to analyze the changing environment and to make a decision. Yet more time and resources are needed to turn it into action visible by the market. No analysis is perfect, and some are less perfect than others. In addition, people who share common educational and cultural background tend to make similar, rather than the most efficient, decisions. Groupthink exists inside communities and large organizations. For these and other reasons, the so-called market inefficiency exists. Once the market inefficiency is quantified, predictive power (and a winning strategy) is only a step away, since such inefficiency — always specific and quantifiable — is in some sense but a deviation from the perfect symmetry which random noise (aka “efficient market”) would possess.


Anticorrelation, or anti-correlation, means the same as negative correlation. To say that A and B are anticorrelated means that A tends to move up when B moves down and vice versa. This can be said when discussing both autocorrelation and cross-correlation. The term is not as widely used in financial contexts as it is in physics.

How many people trade forex every day? A Pareto estimate.

To estimate how many people trade forex every day, we use the Pareto distribution. This is a power-law, highly asymmetric distribution, encountered in many contexts, including the distribution of wealth. Pareto’s probability density is

f(x|k, xm) = k xm^k x^(-k-1)

for x greater than or equal to xm. Here xm is the minimum value of x, in our case the minimum value contributed by a trader to the total turnaround. This number defines whom we call a trader for the purpose of this estimate, a threshold so to speak.

According to the definition of a mathematical expectation E[x], it is an integral of f(x|k,xm) times x from xm to infinity and

L E[x] = T

where T is the total turnaround and L is the number of trades that create it. The integral is easy to evaluate for k > 1, giving E[x] = k xm/(k-1), and we obtain

L = T(k-1)/(k xm)

Now to the less definitive stuff. To the best of my knowledge, as of now the total daily forex turnaround is about $3 trillion. A reasonable definition of a trader is someone who trades at least one standard size lot a day, that is, $100,000. Thus T = 3×10^12 and xm = 10^5. There is a fair amount of uncertainty as to what the Pareto index k is for this market. There is a famous 80-20 rule applicable in many contexts; in this case it would mean that 20% of traders contribute 80% of the turnaround. The 80-20 rule is just a particular instance of a Pareto distribution, corresponding to a Pareto index of 1.161. With this input, we estimate about four million forex transactions a day. That’s what we’ve called L.

The number of traders creating L transactions a day depends on the structure of the relationships between them, that is, who trades with whom. Imagine a trader as a point on a plane, the L transactions being the links connecting the points, and you get the picture — there are various ways of connecting the points. Our problem is that we’ve estimated (with Pareto’s help) the number of links, but we do not know the number of points they connect.

One situation, extreme in a sense and only applicable for the sake of academic argument, would be to assume that each trader has a unique partner (a kind of monogamous marriage). Then the number of traders n is simply L times 2. Of course such a system is not capable of moving money. But as you will see, it requires the highest number of traders (eight million traders?!) to create a given turnaround, and is interesting as a limiting case. Another situation is the egalitarian one where every trader is equally likely to trade with every other trader. This is also not realizable in practice, but it corresponds to the equation:

L = n(n-1)/2

Because n is much larger than one, n is simply

n = (2L)^(1/2)

If such topology of trading links were the case, only about 3 thousand traders (with Pareto wealth distribution) could create our present trading volume. Such topology may be considered an idealized limit, the Holy Grail of the online trading business. The reality is probably in between these two extremes, with traders being not completely isolated, but forming relatively isolated clusters around brokers, banks, hedge funds and similar institutions. The relationships between those higher level entities are then much closer to the egalitarian model — a tightly knit community where every member knows every other member.
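The arithmetic of the whole estimate is easy to reproduce; all the inputs are the rough figures quoted above, with all the caveats already stated:

```python
import math

T = 3e12     # total daily turnaround, USD
x_m = 1e5    # threshold defining a "trader": one standard lot a day
k = 1.161    # Pareto index implied by the 80-20 rule

# Number of daily transactions, L = T(k-1)/(k*xm): roughly four million.
L = T * (k - 1) / (k * x_m)

# Two limiting topologies of trading links:
n_monogamous = 2 * L              # unique-partner limit: about eight million traders
n_egalitarian = math.sqrt(2 * L)  # everyone-with-everyone limit: about three thousand
```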

If the estimates of the number of traders, given the trading volume, differ so much depending on the organization of the market, then the really interesting conclusion is the converse: the real reason for the spectacular growth in forex trading volume seen in the past few years probably has at least as much to do with changes in the organization of the market as with purely economic reasons.