Incorporating seasonality into Heidi. A concept of a better forecasting component for an intraday trading system.

Written by Forex Automaton   
Monday, 07 March 2011 17:21
Article Index
Incorporating seasonality into Heidi. A concept of a better forecasting component for an intraday trading system.
Intraday Season 0 Optimization: 5-7pm ET, 23-1 CET
Intraday Season 1 Optimization: 8-10pm ET, 2-4 CET
Intraday Season 2 Optimization: 11pm-1am ET, 5-7 CET
Intraday Season 3 Optimization: 2-4am ET, 8-10 CET
Intraday Season 4 Optimization: 5-7am ET, 11-13 CET
Intraday Season 5 Optimization: 8-10am ET, 14-16 CET
Intraday Season 6 Optimization: 11am-1pm ET, 17-19 CET
Intraday Season 7 Optimization: 2-4pm ET, 20-22 CET

The hourly time scale in forex provides roughly 24 times more statistics than the daily scale. Therefore, if a single adjustable parameter for all currency pairs is justifiable in the daily system optimization (Danica), 24 parameters can be adjusted for Heidi without jeopardizing the significance of the results. The natural way of increasing the number of parameters is to split the 24-hour day into time window "seasons" and treat the seasons as independent pattern recognition problems in the code and as independent optimization problems at the optimization stage.

The benefits of such an approach, and why it is natural, are clear from the intraday seasonality reports, with a single report per currency pair. A summary of the intraday seasonality studies has been recently posted (also see the links in that post).

The data obtained so far and presented in the above-mentioned documents allow me to discuss three types of differences among the intra-day time windows or "seasons": 1) the mean of the logarithmic return is not necessarily the same for all seasons, 2) the mean zero-lag autocorrelation (variance) differs from season to season, 3) the mean one-hour-lag autocorrelation differs from season to season. Out of these three points, point 2 (heteroscedasticity of the markets) is widely popularized, while the other two are not usually talked about. All of these justify a specialized approach whereby every season is treated and optimized independently.
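
The three per-season quantities listed above can be sketched as follows. This is a minimal illustration, not the code behind the seasonality reports: the function name and the toy series are hypothetical, and the input is assumed to be a run of consecutive hourly logarithmic returns belonging to one season.

```python
import numpy as np

def season_statistics(log_returns):
    """Return (mean, variance, lag-1 autocorrelation) of a 1-D series of returns.

    Assumes the values are consecutive hourly returns with non-zero variance.
    """
    r = np.asarray(log_returns, dtype=float)
    mean = r.mean()
    centered = r - mean
    variance = np.mean(centered ** 2)                  # zero-lag autocovariance
    lag1_cov = np.mean(centered[1:] * centered[:-1])   # one-hour-lag autocovariance
    return mean, variance, lag1_cov / variance

# Hypothetical toy series that reverses direction every hour (pure mean reversion).
toy = [0.001, -0.001] * 4
mean, variance, rho1 = season_statistics(toy)
```

A negative lag-1 value, as in the toy series, indicates mean reversion; a positive one indicates trend following.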

For intra-day seasonality studies, I choose Central European time because it allows me to split the forex week into five full non-interrupted trading days. Those trading days are further subdivided into intra-day seasons, as explained in Table 1. The data aggregation is now done separately within the seasons. The hours listed in the table are the end readings of the respective hour bins. For example, a period which is said to include hours from 17:00 to 19:00 Eastern time (5pm to 7pm) begins at 4pm and ends at 7pm. The seasons are defined including both edges.

Central Europe        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23  0
Eastern US           19 20 21 22 23  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
intraday season ID    0  1  1  1  2  2  2  3  3  3  4  4  4  5  5  5  6  6  6  7  7  7  0  0

Table 1. Time zone conversion table. Seasonal time shifts, such as daylight saving time, may complicate the picture if the nations choose to enact them on different days; they are ignored here.
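
The mapping in Table 1 can be written compactly. The sketch below is only an illustration with hypothetical function names; it assumes, as the table does, a fixed six-hour offset between Central European and Eastern US time and takes the end reading of the hour bin as input.

```python
def season_id(cet_hour_end):
    """Map the end reading of an hour bin (0-23, Central European time) to a season ID 0-7."""
    if not 0 <= cet_hour_end <= 23:
        raise ValueError("hour must be in 0..23")
    # Hours 23, 0, 1 form Season 0; every following group of three hours is the next season.
    return ((cet_hour_end + 1) // 3) % 8

def cet_to_eastern(cet_hour_end):
    """Convert a CET hour reading to Eastern US time, ignoring daylight saving shifts."""
    return (cet_hour_end - 6) % 24

# Reproduce the bottom row of Table 1, in the table's column order (1, 2, ..., 23, 0).
table_row = [season_id(h) for h in list(range(1, 24)) + [0]]
```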

For each period, I analyze the Pearson correlation coefficient between the actual logarithmic difference (return) in an hourly quantity and the forecast of the same, for three quantities: low, high and close.
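
As a minimal sketch of this figure of merit, the snippet below computes logarithmic returns from a price series and the Pearson coefficient between a forecast and the realized returns. The function names and the toy prices are hypothetical; the forecast here is simply the negated actual return, chosen so that the result is easy to check by eye.

```python
import numpy as np

def log_returns(prices):
    """Logarithmic differences between consecutive values of a price series."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:] / p[:-1])

def pearson(predicted, actual):
    """Pearson correlation coefficient between two equally long, non-constant series."""
    x = np.asarray(predicted, dtype=float)
    y = np.asarray(actual, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Hypothetical hourly closes; a forecast that is always exactly wrong in sign.
actual = log_returns([1.30, 1.31, 1.29, 1.30, 1.32])
predicted = -actual
rho = pearson(predicted, actual)   # perfectly anti-correlated forecast
```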

I also analyze the fourth-order cumulant formed out of the logarithmic return for the hourly high, its predicted value, the logarithmic return for the hourly low and its predicted value. This cumulant measures "genuine" correlation among the four quantities, left after subtracting the pair and triplet correlation contributions. A positive genuine correlation (found in real forex but not in random walk data) signifies the propensity of the market to create coincidences between not hitting the stop-loss target and hitting the profit target within the same trade, if the previous day's low and high are chosen as such levels (which of the two plays which role depends, of course, on the direction of the trade). A positive cumulant of this kind is synonymous with profitability for Demi, our day-scale trading system, who is currently going through her paper trading trial period in the members-only section of this site. Daily forecasts for high and low (ignoring close) are at the core of Demi's approach. Naturally, it is desirable to be able to do the same on the hour scale. Fourth-order cumulants on the hour scale are presented here for the first time, while the daily-scale fourth-order cumulants have been reported before.
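
For readers who want to experiment, here is a hedged sketch of a joint fourth-order cumulant of four series. After each series is centered on its mean, the terms involving triplets vanish and only the pair products remain to be subtracted from the four-point average. The normalization by the product of the four standard deviations is an assumption made for illustration; the article does not spell out which normalization is used in the figures.

```python
import numpy as np

def fourth_order_cumulant(a, b, c, d, normalize=True):
    """Joint 4th-order cumulant of four equally long series with non-zero variance."""
    a, b, c, d = [np.asarray(v, dtype=float) for v in (a, b, c, d)]
    a, b, c, d = [v - v.mean() for v in (a, b, c, d)]   # centering removes triplet terms

    def avg(*series):
        # Average of the element-wise product of the given series.
        return np.mean(np.prod(series, axis=0))

    kappa = (avg(a, b, c, d)
             - avg(a, b) * avg(c, d)
             - avg(a, c) * avg(b, d)
             - avg(a, d) * avg(b, c))
    if normalize:
        # Assumed normalization: product of the four standard deviations.
        kappa /= np.prod([np.sqrt(np.mean(v ** 2)) for v in (a, b, c, d)])
    return float(kappa)

# Toy check: four copies of a two-valued series give <x^4> - 3<x^2>^2 = 1 - 3.
x = [-1.0, 1.0, -1.0, 1.0]
kappa = fourth_order_cumulant(x, x, x, x)
```

In the setting of this article, the four arguments would be the actual and predicted returns of the hourly high and the actual and predicted returns of the hourly low.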

The studies are performed by varying the main quantity responsible for the forecasting, nicknamed Fred. The Fred number determines how the recent past interacts with accumulated long-term knowledge to determine the projection into the future.

The temporal history is averaged out of sight: the data points are temporal averages over the years from 2003 through 2010 (8 years of data). Importantly, the variation of the results with time is still incorporated into the presentation. It enters through the precision of the mean, which is determined by analyzing the variation of the 8 independent measurements, each corresponding to a year of data, for each quantity under study.
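
One common way to turn a set of yearly measurements into a mean with an error bar is the standard error of the mean. The sketch below assumes that this is what "precision of the mean" amounts to; the function name and the eight yearly values are hypothetical and serve only as an illustration.

```python
import math

def mean_and_standard_error(yearly_values):
    """Mean of independent yearly measurements and the standard error of that mean."""
    n = len(yearly_values)
    if n < 2:
        raise ValueError("need at least two measurements to estimate the spread")
    mean = sum(yearly_values) / n
    # Unbiased sample variance of the individual yearly measurements.
    sample_var = sum((v - mean) ** 2 for v in yearly_values) / (n - 1)
    return mean, math.sqrt(sample_var / n)

# Hypothetical yearly correlation coefficients for 2003 through 2010.
yearly = [-0.04, -0.02, -0.03, -0.05, -0.01, -0.03, -0.04, -0.02]
mean, error = mean_and_standard_error(yearly)
```

A data point is then plotted at `mean` with an error bar of `error`, so that a quantity that fluctuates strongly from year to year is shown with a correspondingly wide bar.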

The data are 8 years of hour-by-hour candles for the six major exchange rates involving USD: AUD/USD, GBP/USD, EUR/USD, USD/CAD, USD/CHF and USD/JPY.

Summary of the results

The differences between individual currency pairs often exceed the long-term temporal variations; therefore, there is a possibility that custom optimization solutions for individual pairs, valuable in the long term, could be obtained and maintained.

Differences between intraday seasonality bins exceed differences between individual currency pairs. Overall, there are fewer classes of Fred dependencies to optimize than there are seasons. Optimization of hourly close presents the largest problem and that's what I focus on. Three patterns are seen: the first is a general indifference towards Fred; the second is a decrease of the correlation with rising Fred, saturating at large Fred; and the third is a rise with Fred, also saturating at large Fred.

In all intraday seasons except Season 4, a solution could be found, possibly involving a sign flip, which would work on average for the entire time history and for all 6 pairs considered, yielding positive correlations of the forecast with reality.

The most common trick for getting a positive correlation is, however, to flip the sign of the forecast, meaning that the "raw" forecast has a propensity to go systematically against reality. This was noted already in the very first announcement of Heidi, where I said that I expected a -3% correlation on average across the forex pairs. At that time, I resisted the temptation to make the correlation positive by hook or by crook, applying a universal, time-zone-independent sign flip to the forecasts for close. The situation would have been much harder to deal with, had that -3% correlation been distributed featurelessly across the seasons within the day. To my relief, the already mentioned seasonality studies conducted in the past months indicated that there is considerable intraday structure. A time-independent solution ignoring that structure would therefore throw out the baby with the bath water part of the time. The structure of the time dependence, and thus of the solution to it, is presented, mostly in graphics, in the following sections.

Moreover, I hypothesize that the necessity to apply the sign flip part of the time is related to the mean-reversion of the FX markets. This is supported by the fact that the only time window (or "season") where such a solution is not needed is the one characterized by visible trend following, as detected by one hour lag autocorrelation. This time window is Season 6.

Such is the summary of the results presented graphically in the following subsections, with one subsection per intraday season.
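
The season-dependent sign flip summarized above could be wired into a forecasting component roughly as follows. This is only a sketch: the function name is hypothetical, and the two sets are one reading of the per-season conclusions for hourly close given in this article (flip in Seasons 0, 1, 2, 3, 5 and 7, no common solution in Season 4, no flip in Season 6), not the actual configuration of Heidi.

```python
# Assumed reading of the article's per-season results for hourly close.
FLIP_SEASONS = frozenset({0, 1, 2, 3, 5, 7})   # raw forecast tends to oppose reality
SKIP_SEASONS = frozenset({4})                  # no single solution works for all pairs

def adjusted_close_forecast(raw_forecast, season):
    """Apply the season-dependent sign convention to a raw forecast of the hourly close return.

    Returns None for seasons in which no forecast is issued.
    """
    if season not in range(8):
        raise ValueError("season must be an integer from 0 to 7")
    if season in SKIP_SEASONS:
        return None
    return -raw_forecast if season in FLIP_SEASONS else raw_forecast

# Hypothetical raw forecast of +0.2% used in three different seasons.
flipped = adjusted_close_forecast(0.002, 0)
untouched = adjusted_close_forecast(0.002, 6)
skipped = adjusted_close_forecast(0.002, 4)
```

Keeping the sign convention in a per-season lookup, rather than in one global switch, is what allows the trend-following window to be left alone while the mean-reverting windows are inverted.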


Intraday Season 0 Optimization: 5-7pm ET, 23-1 CET

For Season 0, the Pearson prediction quality statistic for hourly extremes is somewhat lower than it is for the rest of the seasons. The value Fred=32 (or 33, as determined in more precise studies before) is probably the best to maximize Pearson for hourly extremes. For hourly close, we see with good confidence that one needs to flip the sign of the predicted return to obtain a positively correlated forecast, valuable for most pairs except EUR/USD, where the prediction has zero correlation with reality.

For cumulants, as Fig. 0.4 shows, one needs to avoid the very low values of Fred. The value which works best for the second-order correlation of daily extremes with their forecasts will likely be fine for their fourth-order cumulant.

[Figures 0.1, 0.2, 0.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 0.]

Fig. 0.1-0.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 0.1: hourly high, 0.2: hourly low, 0.3: hourly close. Data are for the Intraday Season 0.

[Figure 0.4: normalized 4-point cumulant versus Fred, Season 0.]

Fig. 0.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 0.


Intraday Season 1 Optimization: 8-10pm ET, 2-4 CET

Season 1 presents somewhat of a dilemma when it comes to optimizing the prediction for hourly close. If the system were to trade only EUR/USD and USD/CHF, I would say that one would need Fred below 10. However, for the rest of the time series, such a choice would yield negative correlations, which is to say that playing against the forecast would be indicated. The compromise solution is to take Fred between 30 and 50 and flip the sign.

For the hourly high and low, and the corresponding cumulant, the situation is straightforward, and the compromise solution outlined above could also be the best solution from the point of view of these figures of merit, if Fred=33 is chosen again.

[Figures 1.1, 1.2, 1.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 1.]

Fig. 1.1-1.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 1.1: hourly high, 1.2: hourly low, 1.3: hourly close. Data are for the Intraday Season 1.

[Figure 1.4: normalized 4-point cumulant versus Fred, Season 1.]

Fig. 1.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 1.


Intraday Season 2 Optimization: 11pm-1am ET, 5-7 CET

For Season 2, the upward bend of the data trends at low values of Fred becomes more pronounced and involves more currency pairs than it did in Season 1. However, it does not reach above zero. The solution again is to play against the forecast by taking Fred=33 and flipping the sign, "artificially" creating a positive correlation. Again, this choice also suits the high, the low and the cumulant.

[Figures 2.1, 2.2, 2.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 2.]

Fig. 2.1-2.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 2.1: hourly high, 2.2: hourly low, 2.3: hourly close. Data are for the Intraday Season 2.

[Figure 2.4: normalized 4-point cumulant versus Fred, Season 2.]

Fig. 2.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 2.


Intraday Season 3 Optimization: 2-4am ET, 8-10 CET

Season 3 resembles Season 1, except that this time it is AUD/USD and GBP/USD that create the upward trend at low values of Fred for the hourly close. The solution is the same as before: take Fred=33 and flip the sign.

[Figures 3.1, 3.2, 3.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 3.]

Fig. 3.1-3.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 3.1: hourly high, 3.2: hourly low, 3.3: hourly close. Data are for the Intraday Season 3.

[Figure 3.4: normalized 4-point cumulant versus Fred, Season 3.]

Fig. 3.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 3.


Intraday Season 4 Optimization: 5-7am ET, 11-13 CET

Season 4 is the most difficult one to optimize. Focusing on hourly close, some currency pairs are systematically above zero (positive correlation) and some are systematically below zero (negative correlation). No single Fred would work. A solution may be either not to trade during this season or to pick an independent strategy for each currency pair, which would increase the number of free parameters beyond what seems optimal for the dataset at large.

[Figures 4.1, 4.2, 4.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 4.]

Fig. 4.1-4.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 4.1: hourly high, 4.2: hourly low, 4.3: hourly close. Data are for the Intraday Season 4.

[Figure 4.4: normalized 4-point cumulant versus Fred, Season 4.]

Fig. 4.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 4.


Intraday Season 5 Optimization: 8-10am ET, 14-16 CET

For Season 5, we see another change of the predominant pattern in the evolution of the Pearson correlation coefficient for hourly close with Fred. USD/JPY shows a rising trend with Fred, entering the positive correlation zone. For the rest of the currency pairs, the behaviour is mixed, but the area of Fred below 10 offers the most negative correlation values. The solution therefore is to pick Fred below 10 and reverse the sign of the forecast.

For the high, the low and the cumulant, Fred=33 will work.

[Figures 5.1, 5.2, 5.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 5.]

Fig. 5.1-5.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 5.1: hourly high, 5.2: hourly low, 5.3: hourly close. Data are for the Intraday Season 5.

[Figure 5.4: normalized 4-point cumulant versus Fred, Season 5.]

Fig. 5.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 5.


Intraday Season 6 Optimization: 11am-1pm ET, 17-19 CET

Season 6 is the season of strong trend following in forex, according to the trend following/mean reversion study recently released. This happens to be the time window when the forecasting works best in all of its aspects (hourly high, low and close) with minimum intervention. For close, a Pearson correlation coefficient as high as 6% can be achieved in a sustainable way, judging by the size of the statistical error bars, the magnitude of the points and the overall consistency of the results across the different forex exchange rates.

[Figures 6.1, 6.2, 6.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 6.]

Fig. 6.1-6.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 6.1: hourly high, 6.2: hourly low, 6.3: hourly close. Data are for the Intraday Season 6.

[Figure 6.4: normalized 4-point cumulant versus Fred, Season 6.]

Fig. 6.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 6.


Intraday Season 7 Optimization: 2-4pm ET, 20-22 CET

In Season 7, we finish the daily cycle and, not surprisingly, the picture for Season 7 looks much like that for Season 0, the next season. The optimization solution is also the same.

[Figures 7.1, 7.2, 7.3: Pearson correlation coefficient versus Fred for hourly high, low and close, Season 7.]

Fig. 7.1-7.3 Dependence of Pearson correlation coefficients between predicted and actual logarithmic differences (returns) in the three components of the hour candle (high, low and close, in that order) on the optimization parameter nicknamed Fred. 7.1: hourly high, 7.2: hourly low, 7.3: hourly close. Data are for the Intraday Season 7.

[Figure 7.4: normalized 4-point cumulant versus Fred, Season 7.]

Fig. 7.4 Dependence of the normalized 4th order cumulant among predicted and actual logarithmic differences (returns) in hourly high and low on the optimization parameter nicknamed Fred. Data are for the Intraday Season 7.

Last Updated (Saturday, 07 July 2012 10:44)