Explaining Danica v0.5 optimization choices

Written by Forex Automaton   
Thursday, 14 January 2010 13:45

This report discusses the optimization trade-offs made for Danica in more detail than the initial announcement did. At the same time, it establishes a reference point for future studies of past performance.

As a brief introduction for those new to this, Danica is an algorithmic forecasting system launched in December 2009. The system is trained and optimized using historical forex data beginning in August 2002. The system's output is a forecast of the next day's high-low-close "candle". Instead of running a model portfolio, the system updates a set of moving average correlation coefficients between its forecast logarithmic returns and the real ones.
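As a rough illustration of the kind of quantity being tracked, the sketch below computes day-scale logarithmic returns and a trailing-window Pearson correlation between forecast and actual returns. The function names, the window length and the idea of a simple trailing window are assumptions made for illustration; how Danica actually builds its moving averages is not specified here.

```python
import math

def log_returns(prices):
    """Day-scale logarithmic returns: ln(p[t] / p[t-1])."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen for this sketch: undefined case maps to 0
    return cov / math.sqrt(var_x * var_y)

def rolling_correlation(forecast, actual, window):
    """Correlation over each trailing window of `window` days (hypothetical helper)."""
    return [
        pearson(forecast[i - window:i], actual[i - window:i])
        for i in range(window, len(forecast) + 1)
    ]
```

Each value returned by `rolling_correlation` summarizes how well the forecast returns agreed with the actual ones over the most recent `window` days.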

The most interesting question is whether the optimization decisions made on the basis of past data are of any value for the future. To make that question answerable, the back-testing performance is documented here.

The back-test performance discussed here is that of version 0.5 of the system. On December 28, 2009, when I launched the system in production mode, I made her work her way from the nominal launch date of April 22, 2006 through historical data up to the present moment, updating all the objects incrementally, step by step. In the process, a couple of minor bugs (one related to the way leap years were treated) were discovered and fixed, and the system version was upgraded from 0.1 to 0.3. Later I realized that the version launched was not the best one researched during the optimization phase, and thus version 0.5 is being introduced. Compared to 0.3, version 0.5 shows what now looks like a considerable improvement in back-test performance, in particular with regard to prediction quality for the daily low and high. Therefore, the back-testing discussed here is, for a very good reason, different from what one can see in Danica's back-testing run covering nominally the same data range.

[Figure 1, three panels: 1.1 day HIGH, 1.2 day CLOSE, 1.3 day LOW; forecasting parameter = 28]

Fig.1. Distribution of a product of predicted and actual day-scale forex logarithmic returns (1.1 -- high, 1.2 -- close, 1.3 -- low). Data for the 14 forex rates tracked by Danica (AUD/USD, EUR/USD, GBP/USD, USD/CAD, USD/CHF, USD/JPY, AUD/JPY, CHF/JPY, EUR/GBP, EUR/JPY, GBP/CHF) are combined. The forecasting parameter is fixed at 28.

The figure of merit used, the Pearson correlation coefficient between the forecast and real logarithmic returns on a day scale, is a measure of asymmetry in the distribution of the product of such returns. Fig.1 shows these "raw" distributions, in which the asymmetry can be appreciated visually (especially in the tails of the distribution) and by looking at the deviation of the mean from zero.

By construction, the Pearson correlation coefficient is a quantity bounded between -1 (the forecast move and reality are total opposites) and 1 (the forecast is perfect). A success or lack thereof on every trading day makes a contribution to this quantity.
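The link between the correlation coefficient and the mean of the product distribution can be made explicit. The sketch below, with a hypothetical function name, writes the coefficient as the mean of the products, corrected for the individual means and divided by the two standard deviations. When both mean returns are close to zero, the sign of the coefficient is simply the sign of the mean of the products.

```python
import math

def correlation_via_products(forecast, actual):
    """Pearson coefficient written as (E[f*a] - E[f]*E[a]) / (sigma_f * sigma_a)."""
    n = len(forecast)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    # Mean of the day-by-day products of forecast and actual returns.
    mean_product = sum(f * a for f, a in zip(forecast, actual)) / n
    std_f = math.sqrt(sum(f * f for f in forecast) / n - mean_f ** 2)
    std_a = math.sqrt(sum(a * a for a in actual) / n - mean_a ** 2)
    if std_f == 0 or std_a == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (mean_product - mean_f * mean_a) / (std_f * std_a)
```

A forecast identical to reality gives a value of 1 and a forecast that is its exact opposite gives -1, up to floating-point rounding.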

In order to make a large positive contribution, one needs a large move in a currency pair to coincide with a large forecast move in the same direction. In other words, one needs a successful forecast that has an above-average chance of resulting in a successful trade, since a hypothetical rational operator of the system, understanding it to be noisy, will not pursue small forecast moves.

By the same property of a product, the impact on the Pearson correlation coefficient of even a large forecast move in the wrong direction is moderated if the actual price move turns out to be negligible. This is again what we want, since in this case the effect of the system's imperfection on the operator's portfolio is likely to be similarly insignificant.

Likewise, the impact of a wrong forecast move of negligible magnitude will be negligible both for the Pearson correlation and for the operator's portfolio, since in this case the operator is unlikely to enter the trade.

Finally, the worst case is the one when the system predicts a large movement in the currency pair and a large movement does materialize -- but in the opposite direction. In this case, a large negative contribution to Pearson is recorded.
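The four situations just described can be compared numerically. In the sketch below the return values are made up for illustration, and the product of forecast and actual return stands in for a single day's contribution, ignoring the subtraction of means and the normalization that enter the full coefficient.

```python
# (forecast log return, actual log return); all numbers are illustrative only.
cases = {
    "large forecast, large move, same direction": (0.010, 0.012),
    "large forecast, wrong direction, negligible move": (0.010, -0.0002),
    "negligible forecast, wrong direction": (0.0002, -0.012),
    "large forecast, large move, opposite direction": (0.010, -0.012),
}

# One day's raw contribution is the product of the two returns.
contributions = {label: f * a for label, (f, a) in cases.items()}

for label, value in contributions.items():
    print(f"{label}: {value:+.2e}")
```

Only the first and the last case produce products of appreciable size, with opposite signs. The two middle cases are smaller by roughly a factor of fifty.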

These considerations argue in favor of the Pearson correlation coefficient as an intuitively good figure of merit. The same qualitative considerations hold for the way the shape and mean of the product distribution histograms (Fig.1) are influenced by the day-by-day successes and failures of the forecasting. In terms of Fig.1, positive tails that are large compared to the negative ones, as well as an asymmetry of the peak in the vicinity of 0, are the desired signatures of a successful system.
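These signatures can also be read off a list of products without plotting it. The following is a minimal sketch in which the function name, the tail threshold and the sample values are all assumptions for illustration, not numbers taken from Fig.1.

```python
def asymmetry_summary(products, tail_threshold):
    """Mean of the products and the number of entries in each tail."""
    mean = sum(products) / len(products)
    # Entries beyond +/- tail_threshold are treated as the tails.
    positive_tail = sum(1 for p in products if p > tail_threshold)
    negative_tail = sum(1 for p in products if p < -tail_threshold)
    return {
        "mean": mean,
        "positive_tail": positive_tail,
        "negative_tail": negative_tail,
    }

# Made-up products of forecast and actual log returns.
sample_products = [1.2e-4, -2.0e-6, 3.0e-6, -1.1e-4, 0.9e-4, 1.0e-6, 1.5e-4]
summary = asymmetry_summary(sample_products, tail_threshold=5.0e-5)
```

A positive mean together with more entries in the positive tail than in the negative one is the pattern a successful system would be expected to show.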

Last Updated ( Wednesday, 14 April 2010 12:45 )