Backtesting Pitfalls - Part 2

Outliers

Outliers must be properly accounted for. If outliers are representative of the current time period, they must be included in the backtest; if they are not, they may be excluded. The distinction matters because including a non-representative data point, or excluding a representative one, can skew results. Outliers are typically nonrecurring events that have a significant impact on residual plots. Backtesters may be able to test how representative the sample of data points is by removing the outliers, rerunning the backtest, and observing how the results change.
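A minimal sketch of that check follows. It is an illustration under stated assumptions rather than a prescribed method: the returns are made up, and the flagging rule (a modified z-score based on the median absolute deviation, with a cutoff of 3.5) is one common choice among several. The function names are hypothetical.

```python
from statistics import mean, median


def flag_outliers(returns, threshold=3.5):
    """Flag points whose modified z-score exceeds `threshold`.

    Assumption: the modified z-score (median and median absolute
    deviation) is used because it is not itself distorted by the outliers.
    """
    med = median(returns)
    mad = median(abs(r - med) for r in returns)
    if mad == 0:
        return [False] * len(returns)
    return [abs(0.6745 * (r - med) / mad) > threshold for r in returns]


def outlier_sensitivity(returns, threshold=3.5):
    """Compare the mean return with and without the flagged points."""
    flags = flag_outliers(returns, threshold)
    kept = [r for r, flagged in zip(returns, flags) if not flagged]
    return {
        "n_outliers": sum(flags),
        "mean_all": mean(returns),
        "mean_without_outliers": mean(kept),
    }


# Hypothetical weekly returns containing one crash week.
weekly_returns = [0.004, -0.002, 0.006, 0.001, -0.003,
                  0.005, -0.150, 0.002, 0.003, -0.001]
report = outlier_sensitivity(weekly_returns)
print(report)
```

A large gap between the two means suggests the backtest result hinges on a handful of events, which is the cue to decide whether those events are representative of the current period.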

Overfitting and Data Mining vs Data Dredging

Overfitting occurs when a strategy that backtests well for one specific pair of parameters is assumed to work for other, similar pairs. For example, a strategy that buys the S&P 500 index when the one-week moving average is greater than the two-week moving average is particularly effective (Figure 13). However, different moving-average pairs do not produce the same results: a strategy that buys when the three-week moving average is greater than the four-week moving average greatly underperforms. Backtesters must statistically test each trading strategy on its own rather than extrapolate the results of one strategy to others run on the same set of data.
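The comparison of moving-average pairs could be sketched as follows. This is not the code behind Figure 13: the price series is a synthetic random walk, the windows are counted in generic periods rather than weeks of S&P 500 data, and the function names are hypothetical.

```python
import random


def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def crossover_total_return(prices, fast, slow):
    """Hold the asset for the next period whenever SMA(fast) > SMA(slow)."""
    fast_ma, slow_ma = sma(prices, fast), sma(prices, slow)
    growth = 1.0
    for t in range(len(prices) - 1):
        if fast_ma[t] is None or slow_ma[t] is None:
            continue
        if fast_ma[t] > slow_ma[t]:
            # The signal is known at the close of period t, so the
            # position earns the return from t to t + 1.
            growth *= prices[t + 1] / prices[t]
    return growth - 1.0


# Synthetic prices (an assumption, not real index data).
rng = random.Random(0)
prices = [100.0]
for _ in range(260):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.001, 0.02)))

# Evaluate every pair separately instead of assuming one result carries over.
for fast, slow in [(1, 2), (2, 3), (3, 4)]:
    result = crossover_total_return(prices, fast, slow)
    print(f"SMA({fast}) > SMA({slow}): total return {result:.2%}")
```

Each pair is run as its own backtest, so a strong result for one pair is never taken as evidence for its neighbors.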

Data mining is the practice that underlies all backtesting: reviewing data to find meaningful patterns and relationships. Data dredging, by contrast, is the inappropriate practice of deriving every possible relationship without regard to statistical significance or an underlying hypothesis. A single strategy cannot be selected to represent all possible trading strategies. An example of data dredging would be to construct a graph with only the colored lines shown in Figure 14. To avoid the bias associated with data dredging, backtesters must consider all relationships and then determine the statistical significance of each. Furthermore, traders are advised to form an independent hypothesis before conducting a backtest.
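To illustrate why significance must be judged across all tested relationships, the sketch below generates many strategies whose returns are pure noise and counts how many look significant. The Bonferroni correction used here is one common and conservative adjustment, chosen for illustration; the p-value uses a normal approximation to the t-test, and all names and parameters are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sided_p_value(returns):
    """p-value for the hypothesis that the mean return is zero.

    Assumption: a normal approximation to the t-test, which is
    reasonable only for fairly large samples.
    """
    t_stat = mean(returns) / (stdev(returns) / sqrt(len(returns)))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def count_significant(strategy_returns, alpha=0.05):
    """Count 'significant' strategies with and without Bonferroni correction."""
    p_values = [two_sided_p_value(r) for r in strategy_returns]
    naive = sum(p < alpha for p in p_values)
    bonferroni = sum(p < alpha / len(p_values) for p in p_values)
    return naive, bonferroni


# 200 hypothetical strategies with no real edge: returns are random noise.
rng = random.Random(42)
noise_strategies = [[rng.gauss(0.0, 0.01) for _ in range(250)]
                    for _ in range(200)]

naive, corrected = count_significant(noise_strategies)
print(f"significant at 5% without correction: {naive}")
print(f"significant after Bonferroni correction: {corrected}")
```

With enough candidate strategies, some will pass an uncorrected 5% test by chance alone, which is why reporting only the winners is misleading.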

Unexpected Risk

Effective backtesters account for potential future risks by modeling hypothetical crisis scenarios. They stress-test strategies to determine how they perform during economic downturns, giving traders essential information about risk. As discussed earlier, stop-loss strategies can limit potential losses. Backtesters must also take into account more frequently occurring risks such as spreads, fees and commissions, market impact, and slippage.
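As a rough illustration, the sketch below deducts per-side costs from a single long trade and then reprices the exit under hypothetical price shocks. The cost figures and shock sizes are placeholders rather than market estimates, the `CostModel` class is a hypothetical helper, and market impact is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    # All rates are hypothetical fractions of trade value, charged per side.
    commission_rate: float = 0.0005
    half_spread: float = 0.0002
    slippage: float = 0.0003

    def per_side(self):
        return self.commission_rate + self.half_spread + self.slippage


def net_trade_return(entry_price, exit_price, costs):
    """Round-trip return of a long trade after costs on entry and exit."""
    paid = entry_price * (1.0 + costs.per_side())
    received = exit_price * (1.0 - costs.per_side())
    return received / paid - 1.0


costs = CostModel()
gross = 101.0 / 100.0 - 1.0
net = net_trade_return(100.0, 101.0, costs)
print(f"gross {gross:.4%}, net of costs {net:.4%}")

# Crude stress test: reprice the exit under hypothetical crisis shocks.
shocks = (-0.10, -0.20)
stressed = [net_trade_return(100.0, 101.0 * (1.0 + s), costs) for s in shocks]
for shock, value in zip(shocks, stressed):
    print(f"shock {shock:.0%}: net return {value:.2%}")
```

Even small per-side costs compound over many trades, so a strategy that trades frequently should be evaluated on net rather than gross returns.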

In-Sample Testing

In-sample testing is one of the most common and most damaging backtesting pitfalls. It occurs when a trader or backtesting platform fits a model to a set of data, the training set, and then evaluates the strategy on that same training set. To avoid this pitfall, a backtester must be programmed to construct strategies from a training set and to test them on a separate test set.
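The separation of training and test data could look like the sketch below. It assumes a chronological split (earlier data for fitting, later data for evaluation), a simple price-above-moving-average rule, and synthetic prices; the function names and the 70/30 split are illustrative choices.

```python
import random


def chronological_split(series, train_fraction=0.7):
    """Split a time series without shuffling: early part trains, late part tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def sma_rule_return(prices, window):
    """Hold for the next period whenever the price is above its moving average."""
    growth = 1.0
    for t in range(window - 1, len(prices) - 1):
        average = sum(prices[t + 1 - window:t + 1]) / window
        if prices[t] > average:
            growth *= prices[t + 1] / prices[t]
    return growth - 1.0


def fit_then_evaluate(prices, windows, train_fraction=0.7):
    """Pick the best window on the training set only, then score it on the test set."""
    train, test = chronological_split(prices, train_fraction)
    best = max(windows, key=lambda w: sma_rule_return(train, w))
    return {
        "best_window": best,
        "in_sample": sma_rule_return(train, best),
        "out_of_sample": sma_rule_return(test, best),
    }


# Synthetic prices (an assumption, not real market data).
rng = random.Random(7)
prices = [100.0]
for _ in range(500):
    prices.append(prices[-1] * (1.0 + rng.gauss(0.0005, 0.015)))

print(fit_then_evaluate(prices, windows=[5, 10, 20, 50]))
```

The in-sample figure is reported only for comparison; the out-of-sample figure is the one that says anything about how the strategy might behave on unseen data.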
