

Progress towards algorithmic implementation of Kelly Criterion 
Written by Forex Automaton  
Friday, 26 February 2010 13:49  
The "Kelly Criterion" of quant folklore is based on Edward Thorp's exposition and development of Kelly's work. While acknowledging Thorp's contribution, I find Kelly's original article conceptually more relevant to the realities of algorithmic trading as developed by ForexAutomaton. Our forecasting algorithm, like any forecasting tool, can very naturally be considered a case of the hypothetical communication channel discussed by Kelly. The related mathematical objects, such as the joint and conditional probability distributions for the communicated "symbols" (forecasts and real quotes), are already being monitored here, even though the Kelly Criterion for allocating capital to trades remains to be coded.

In preparation for a literal implementation of Kelly's prescription in code, I took Kelly's paper "A New Interpretation of Information Rate", published in the Bell System Technical Journal, 35: 917-926, July 1956, and worked through his reasoning. In this post I elaborate on some elements of the derivation that are abbreviated in the journal publication, make an effort to put the work in the context of financial trading (as opposed to the parimutuel betting context used by Kelly), and lay the ground for implementing a capital allocation algorithm. I use Kelly's notation with minor changes:
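Here s labels the actual outcome and r the forecast (the transmitted and received symbols of Kelly's channel); p(s) is the probability of outcome s; p(s,r) is the joint probability of outcome s and forecast r; q(r) is the probability of forecast r; q(s|r) is the conditional probability of outcome s given forecast r; $\alpha_s$ is the odds paid on outcome s, that is, the amount returned per unit bet (stake included); and a(s|r) is the fraction of capital bet on outcome s after receiving forecast r.

The allocation a(s|r) is somewhat counterintuitive since most people do not picture trading, especially in a single market, as betting on a set of outcomes, despite the fact that a set of outcomes is what they actually face. In spot forex, the full variety of exposures a trader can have to a single market can be expressed by combining a very simple piecewise linear $\alpha_s$ with an a(s|r) that is independent of s (but, in general, dependent on r). This extends naturally to a portfolio of trades in various markets; $\alpha_s$ would take care of that, but s would then enumerate the possible multimarket outcomes. Next, Kelly states that

$$\alpha_s = \frac{1}{p(s)}, \qquad (1)$$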
the so-called "fair odds" in parimutuel terminology. Going over from parimutuel betting to forex (or any other financial trading, for that matter), if s is linear in price, this condition does not hold in most cases. However, there is a certain freedom in determining how price is related to s: the relationship does not need to be linear. Whatever relationship is chosen, it must be independent of the trader's decision on the direction of the trade (long or short), because that decision depends on the choice of how to relate the signal to price. Fig.1 below indicates that simply using the expected price change (u in our notation) for s does not satisfy, or at least does not always satisfy, Equation 1.

Denoting the initial capital as $V_0$, the capital after N trades as $V_N$, and the number of times that the forecast is r and the actual outcome is s as $W_{sr}$,

$$V_N = V_0 \prod_{r,s} \left[ a(s|r)\, \alpha_s \right]^{W_{sr}}.$$

The exponential capital growth rate, G, is

$$G = \lim_{N \to \infty} \frac{1}{N} \log \frac{V_N}{V_0} = \sum_{r,s} p(s,r) \log \left[ \alpha_s\, a(s|r) \right],$$

or, using Eq.1 (fair odds, $\alpha_s = 1/p(s)$),

$$G = \sum_{r,s} p(s,r) \log a(s|r) - \sum_s p(s) \log p(s).$$

Of these two terms, the second one has nothing to do with trading system optimization:

$$-\sum_s p(s) \log p(s),$$
being Shannon's entropy of the market. It's interesting that the entropy of the market seems to help wealth generation: the sum of p(s) log p(s) over s enters the expression for the speed of wealth accumulation with a negative sign while being in itself negative, because p(s) is less than 1 for every s and therefore every logarithm is negative. Now we can see that the path of least resistance, the one leading to staying with the market (as opposed to beating it), is to "buy and hold":

$$a(s|r) = p(s),$$
or to allocate capital with no regard to the forecast. Or rather, we know that to buy and hold is the way to be "with the market"; what's interesting is the formal representation of this within the Kelly framework, given by the equation above. Of course, here we completely ignore macroeconomic (or ideological?) considerations which may be relevant to, e.g., the stock market, namely that one's nominal wealth may increase as a result of stock market exposure, reflecting the supposed material progress of mankind (or the long-term inflationary dilution of the value of fiat money).

The goal is to maximize the rate of growth of capital G by choosing the best way of responding to forecasts, represented in the equations by a(s|r). Note that we do not assume anything about how good the forecasts are. Since there is not much we can do about the market entropy represented by the second term, we maximize

$$\sum_{r,s} p(s,r) \log a(s|r). \qquad (7)$$
Both p(s,r) and a(s|r) are influenced by the decisions made by the trading system developer. This expression allows one to become mathematically precise about the relationship between the two interrelated components necessary to any operational trading system: forecasting (market insight, trading strategy) and money management. These are represented by p(s,r) and a(s|r) respectively, and we see that
Speaking of ForexAutomaton.com, p(s,r), or rather its nontrivial feature of central interest, the correlation between s and r, has been monitored for the Danica trading system via Pearson correlation coefficients. Improving p(s,r) is outside the scope of Kelly's interest: he focuses on finding the best a(s|r). In doing that, we are bound by the normalizations of probabilities:

$$\sum_s a(s|r) = 1, \qquad \sum_s p(s,r) = q(r).$$
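These are the constraints affecting the maximization of G. Next I am going to elaborate on the maximization (Kelly limits his exposition of the subject to a remark on the "convexity of the logarithm"). We are maximizing a function of the form

$$\sum_i x_i \log y_i \qquad (10)$$

subject to the constraints

$$\sum_i y_i = Y, \qquad \sum_i x_i = X.$$

I am going to use the method of Lagrange multipliers. In this method, a function is composed,

$$F = \sum_i x_i \log y_i + \lambda \Big( Y - \sum_i y_i \Big) + \mu \Big( X - \sum_i x_i \Big),$$

and one requires that the partial derivatives of this function be zero:

$$\frac{\partial F}{\partial y_i} = \frac{x_i}{y_i} - \lambda = 0, \qquad (15)$$

$$\frac{\partial F}{\partial \lambda} = Y - \sum_i y_i = 0, \qquad (16)$$

$$\frac{\partial F}{\partial \mu} = X - \sum_i x_i = 0. \qquad (17)$$

The last two equations, 16 and 17, restate the constraints. From Eq.15, $x_i = \lambda y_i$, which together with Eq.16 can be used to eliminate $x_i$ and $y_i$ from Eq.17, giving $\lambda = X/Y$. Then,

$$y_i = \frac{Y}{X}\, x_i$$

maximizes expression 10, subject to the constraints. That this is a maximum and not a minimum can be seen from the negativity of the second derivatives (concavity) of the function: indeed, while $x_i$ and $y_i^2$ are nonnegative, the second derivatives $\partial^2 F / \partial y_i^2 = -x_i / y_i^2$ have no other choice but to be negative. With this input, we return to Eq.7,8,9 and recognize the pieces: the term written as Eq.7 is maximized by putting

$$a(s|r) = \frac{p(s,r)}{q(r)} = q(s|r). \qquad (19)$$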
The last transition relies on

$$p(s,r) = q(r)\, q(s|r),$$
which has its roots in the foundations of the probability calculus and does not limit generality. Note that the only summation in Eq.7 identified with the summation over i in Eq.10, when using Lagrange multipliers, was the summation over s; that is, the maximization is carried out for each forecast r separately.

To restate the main result, Eq.19, in words: the rate of capital growth is maximized by having the allocation of capital to outcomes, given the forecast, match the probability distribution of those outcomes, given the forecast. In the particular case when the Efficient Market Hypothesis is true, the markets are unpredictable (they instantly price in all available information), thus

$$q(s|r) = p(s),$$
since s and r, reality and forecast, are completely statistically independent.

All of the preceding was a theoretical preparation to answer the following practical question: given the history of forecasts and real outcomes (which can be used to construct, and keep updating throughout the trading operation, an estimator of the required probability distribution, whose accuracy will improve with time), how do we allocate capital to trades on each step, after the forecast r is received? The hypothetical Kelly trader would bet all of his capital, since all of the outcomes are mutually exclusive; some of the bets cancel each other, so that his net position is zero unless r biases q(s) in one direction. (Note: provided a sufficient amount of past data exists, that won't happen unless the system has a history of being successful under this particular condition!) Then that is the direction of the resulting trade, and the capital allocated to the trade is the remainder position. The fact that the space of signals is not the space of price movements is immaterial, since whatever Jacobian is needed to transform the differential probability density from one space into the other will be the same on both sides of Eq.19. Therefore,

$$a(u|r) = q(u|r)$$
on the basis of Eq.19. Having received the forecast r, the hypothetical Kelly-optimal trader would bet the fraction q(u|r) of his capital on the market moving u units, and so on for every possible outcome u. What would be his net position, the capital allocation to communicate to the broker? It will be

$$\int_{-\infty}^{\infty} \operatorname{sgn}(u)\, q(u|r)\, du,$$
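where $\operatorname{sgn}(u)$ is a symmetrized Heaviside step function: +1 for positive u, -1 for negative u. This expression returns a signed fraction of capital to be allocated, where the plus sign corresponds to a long position and the minus sign to a short position. Spreads and capital costs are ignored. A hypothetical case of Gaussian returns with standard deviation $\sigma$ and mean $\mu$ is easy to consider analytically:

$$\int_{-\infty}^{\infty} \operatorname{sgn}(u)\, \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(u-\mu)^2}{2\sigma^2} \right] du = 2\,\Phi\!\left( \frac{\mu}{\sigma} \right) - 1 = \operatorname{erf}\!\left( \frac{\mu}{\sigma \sqrt{2}} \right),$$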
where $\Phi$ is the cumulative form of the Gaussian distribution and erf is the error function. It's noteworthy that in this special case the Kelly allocation depends only on the Sharpe ratio $\mu/\sigma$. In the good years of the S&P500, with a Sharpe ratio of 0.3 or so, a Kelly investor would risk erf(0.2), which is about 0.2 (20%) of his trading capital, as a stop-loss set on the S&P500, and keep the rest of it in "riskless" assets. Note that in this framework, by risk we mean the amount committed to the potential stop-loss in the quasi-parimutuel way. Then what would be the actual position and leverage of such an investor? The only answer is: such that Eq.1 holds, or as close to that as possible. Fig.1 illustrates that Eq.1 is unlikely to hold exactly. I mark this as an open issue with the Kelly framework.

Extension to a portfolio of multiple markets is somewhat complicated by the fact that in financial markets you cannot bet on a complex outcome (such as EUR/USD down, USD/JPY up, ...) with the same dollar. The solution I see for N markets is to split the capital into N equal pieces. (The point being, there is no reason to say that, e.g., USD/JPY is a priori a worse market than EUR/USD, although in the context of a particular forecast r such a statement can make perfect sense!) This consideration makes the pieces equal a priori. Now, within each piece, the same process as just described for the single market takes place. The money allocation system is effectively a "multi-channel" one, each "channel" dedicated to a particular market and sharing the same fraction of the trading capital. Similar to the forecasting system, which may consider markets jointly or in isolation, the Kelly capital allocation component can in principle, given enough data, consider the implications of a forecast for market i on the capital allocation for market j and vice versa.
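The single-market rule (sum the bets on all outcomes with their signs) and the equal split of capital across N markets could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the ForexAutomaton implementation: the function names `net_position`, `gaussian_net_position` and `split_across_markets` are hypothetical, the conditional distribution of price moves given the forecast is assumed to be available as a discrete histogram, and spreads and capital costs are ignored as in the text.

```python
import math


def sign(u):
    """Return +1 for positive u, -1 for negative u, and 0 for u == 0."""
    return (u > 0) - (u < 0)


def net_position(moves, cond_probs):
    """Signed fraction of capital: sum over u of sign(u) * q(u|r).

    moves      -- possible price moves u (hypothetical discrete grid)
    cond_probs -- estimated q(u|r) for the forecast r just received
    """
    total = sum(cond_probs)
    if total <= 0:
        raise ValueError("q(u|r) must have positive total weight")
    # Normalizing here is an assumption of this sketch, so that raw
    # histogram counts can be passed in as well as probabilities.
    return sum(sign(u) * q for u, q in zip(moves, cond_probs)) / total


def gaussian_net_position(mu, sigma):
    """Closed form for Gaussian returns: erf(mu / (sigma * sqrt(2)))."""
    return math.erf(mu / (sigma * math.sqrt(2.0)))


def split_across_markets(net_positions):
    """Scale each market's signed fraction by an equal 1/N capital share."""
    n = len(net_positions)
    return [x / n for x in net_positions]


if __name__ == "__main__":
    # A forecast that tilts the outcome distribution upward: net long.
    print(net_position([-2, -1, 1, 2], [0.1, 0.2, 0.3, 0.4]))
    # Sharpe ratio of 0.3, as in the S&P500 illustration.
    print(gaussian_net_position(0.3, 1.0))
    # Two markets, each trading its own half of the capital.
    print(split_across_markets([0.4, -0.2]))
```

A positive return value means a long position and a negative one a short position. If the histogram is symmetric around zero, the bets cancel and the net position is zero, which is the no-information case discussed above.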

Last Updated ( Saturday, 25 February 2012 12:19 ) 