

Explaining Danica v0.5 optimization choices 
Written by Forex Automaton  
Thursday, 14 January 2010 13:45  
This report discusses the optimization tradeoffs made for Danica in more detail than the initial announcement did. At the same time, it establishes a reference point for future studies of past performance. As a brief introduction for those new to this, Danica is an algorithmic forecasting system launched in December 2009. The system is trained and optimized using historical forex data beginning in August 2002. The system's output is a forecast of the next day's high-low-close "candle". Instead of running a model portfolio, the system updates a set of moving average correlation coefficients between its forecast logarithmic returns and the real ones.
The most interesting question is whether the optimization decisions made on the basis of past data are of any value for the future. To be able to answer that, the backtesting performance is being documented here. The backtest performance discussed here is that of version 0.5 of the system.
On December 28, 2009, when I launched the system in production mode, I made her work her way from the nominal launch date of April 22, 2006 through historical data up to the present moment, with an incremental step-by-step update of all the objects. In the process, a couple of minor bugs (one related to the way leap years were treated) were discovered and fixed, and the system version was upgraded from 0.1 to 0.3. Later I realized that the version launched was not the best one researched during the optimization phase, and thus version 0.5 is being introduced. Compared to 0.3, version 0.5 shows what now looks like a considerable improvement in backtest performance, in particular with regard to prediction quality for the daily low and high. Therefore, the backtesting to be discussed is, for a very good reason, different from what one can see in Danica's backtesting run covering nominally the same data range.
[Fig. 1: panels 1.1, 1.2, 1.3]
The figure of merit used, the Pearson correlation coefficient between the forecast and real logarithmic returns on a day scale, is a measure of asymmetry in the distribution of the product of such returns. Fig. 1 shows these "raw" distributions, in which the asymmetry can be appreciated visually (especially on the tails of the distribution) and by looking at the deviation of the mean from zero. By construction, the Pearson correlation coefficient is a quantity bounded between -1 (the forecast move and reality are total opposites) and 1 (the forecast is perfect). Success or failure on every trading day makes a contribution to this quantity. To make a large positive contribution, one needs a large move in a currency pair to coincide with a large forecast move in the same direction. In other words, one needs a successful forecast that has an above-average chance of resulting in a successful trade, since a hypothetical rational operator of the system, understanding it to be noisy, will not pursue small forecast moves. By the same property of a product, the impact of even a large forecast move in the wrong direction on the Pearson correlation coefficient is moderated if the actual price move turns out to be negligible. This is again what we want, since in this case the effect of the system's imperfection on the operator's portfolio is likely to be similarly insignificant. Likewise, the impact of a wrong forecast move of negligible magnitude will be negligible both for the Pearson correlation and for the operator's portfolio, since in this case the operator is unlikely to enter the trade. Finally, the worst case is the one where the system predicts a large movement in the currency pair and a large movement does materialize, but in the opposite direction. In this case, a large negative contribution to the Pearson coefficient is recorded. These considerations argue in favor of the Pearson correlation coefficient as an intuitively good figure of merit.
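The day-by-day argument above can be made concrete with a minimal sketch. It is not Danica's code: the function names and the sample numbers are hypothetical, and it uses the textbook (mean-centered) Pearson definition, which for daily returns with a mean close to zero is nearly the same as working with the raw products. The coefficient is written as a sum of per-day terms so that each day's contribution can be inspected.

```python
import math

def log_returns(prices):
    """Daily logarithmic returns ln(p[t] / p[t-1]) for a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def pearson_contributions(forecast, actual):
    """Per-day terms whose sum is the Pearson correlation coefficient.

    Each term is the product of the centered forecast return and the
    centered actual return, divided by a common normalization factor.
    """
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    norm_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast))
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return [(f - mean_f) * (a - mean_a) / (norm_f * norm_a)
            for f, a in zip(forecast, actual)]

# Hypothetical forecast and actual log returns for five trading days.
forecast = [0.010, -0.008, 0.012, 0.0005, -0.011]
actual = [0.009, -0.007, 0.0002, -0.0004, 0.010]

terms = pearson_contributions(forecast, actual)
for day, term in enumerate(terms, start=1):
    print(f"day {day}: contribution {term:+.3f}")
print(f"Pearson r = {sum(terms):+.3f}")
```

In this made-up sample, the last day (a large forecast move met by a large actual move in the opposite direction) contributes the most negative term, while the day with a negligible forecast and a negligible actual move contributes almost nothing.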
These qualitative considerations also hold for the way the shape and mean of the product distribution histograms (Fig. 1) are influenced by the successes or failures of the forecasting on a day-by-day basis. In terms of Fig. 1, tails in the positive direction that are large compared to the negative ones, as well as an asymmetry of the peak in the vicinity of 0, are the desired signatures of a successful system.
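As a rough numerical counterpart to reading the histograms by eye, the sketch below summarizes the distribution of daily products of forecast and actual returns by its mean and by the number of entries in each tail. The function name, the `tail_quantile` parameter, and the way the tail cut is chosen are all assumptions made for illustration; the article does not specify how the tails of Fig. 1 were quantified.

```python
def product_asymmetry(forecast, actual, tail_quantile=0.9):
    """Summarize the asymmetry of the forecast-times-actual distribution.

    Returns (mean of products, count in positive tail, count in negative tail).
    The tail cut is the product magnitude at the given quantile, which is
    an illustrative choice rather than a prescribed one.
    """
    products = [f * a for f, a in zip(forecast, actual)]
    mean = sum(products) / len(products)
    magnitudes = sorted(abs(p) for p in products)
    index = min(len(magnitudes) - 1, int(tail_quantile * len(magnitudes)))
    cut = magnitudes[index]
    positive_tail = sum(1 for p in products if p > 0 and p >= cut)
    negative_tail = sum(1 for p in products if p < 0 and p <= -cut)
    return mean, positive_tail, negative_tail
```

A positive mean together with a positive tail that is more populated than the negative one would correspond to the desired signature described above; on a short sample such counts are noisy and should be read with caution.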

Last Updated ( Wednesday, 14 April 2010 12:45 ) 