When does optimization become curve fitting? And how do you prevent the latter? Obviously we all want to optimize our system/method/strategy, whatever you want to call it, but how do we know if we are just curve fitting it? Curve fitting, in my experience on here, always has negative connotations... Could curve fitting be good so long as it is done on a regular basis? How often should one re-examine and adjust their criteria?
Any opinions on this?
Sam.
Here's a quick and practical answer: use the out-of-sample and do all the curve-fitting you want.
You can optimize 2000 variables on a system if you wish - and by optimizing them you will come up with a profitable system - but you must do so using only half of the historical data you have. Let's say you have 10 years of historical data: you use only the first 5 years for your optimizations. Then, once you think you have come up with a perfect system, you see how well it does on the out-of-sample, the last 5 years, which you have prevented yourself and the software from seeing during optimization. If your system makes money in the out-of-sample, it is a good system. Otherwise you throw it away and move on to the next system. Mind you, once you know how a system behaves in the out-of-sample, you cannot keep tweaking it: it's too late to fix it, whether it turned out to be profitable or not.
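To make the mechanics concrete, here is a minimal Python sketch of that workflow. The price series, the 5-year halves, and the moving-average rule are all hypothetical stand-ins for whatever system you are actually testing; the point is only the discipline of the split: optimize freely on the first half, then score the winner exactly once on the second.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: ~10 years of daily closes. Use your own series.
rng = np.random.default_rng(0)
prices = pd.Series(
    100.0 + np.cumsum(rng.normal(0.05, 1.0, 2520)),
    index=pd.bdate_range("2004-01-02", periods=2520),
)

# The split: first 5 years are in-sample, last 5 years are out-of-sample.
half = len(prices) // 2
in_sample, out_of_sample = prices.iloc[:half], prices.iloc[half:]

def total_return(series, lookback):
    """Toy rule standing in for a real system: long when price > moving average."""
    signal = (series > series.rolling(lookback).mean()).astype(float).shift(1)
    return (signal * series.pct_change()).sum()  # shift(1) avoids look-ahead

# Curve-fit as much as you like, but only against the in-sample.
best = max(range(5, 205, 5), key=lambda lb: total_return(in_sample, lb))
print("best lookback found in-sample:", best)

# The one and only look at the hidden half. No tweaking after this line.
print("out-of-sample return:", total_return(out_of_sample, best))
```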
If you don't do this, you will easily come up with a system that works in the past, but that's like doing the crossword by looking at the answers: you're cheating yourself. If you make that mistake, then, to do things properly, you will have to spend another 2 years forward-testing (paper trading your system, live, in the real market) to see whether your system, which obviously worked in the past, also works in the future.
In other words, by optimizing, or simply by repeatedly trying new and different rules/conditions/parameters, you will end up finding systems that work on a given sample. The only way to know whether those systems are actually good is to hide a part of your data and pretend that part is the future, then use it to verify your finished system.
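A toy simulation makes the danger obvious. Below, the "market" is a pure random walk, so no rule can have a real edge; yet after 2000 random tries, the best rule still looks like a money machine in-sample and, predictably, does nothing out-of-sample. The numbers and the random rule generator are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Pure noise: by construction, there is nothing here to predict.
returns = rng.normal(0.0, 1.0, 1000)
in_sample, out_of_sample = returns[:500], returns[500:]

# "Repeatedly trying new rules": each candidate is a random long/flat signal.
best_rule, best_pnl = None, -np.inf
for _ in range(2000):
    rule = rng.integers(0, 2, 500)        # 0 = flat, 1 = long, chosen at random
    pnl = (rule * in_sample).sum()
    if pnl > best_pnl:
        best_rule, best_pnl = rule, pnl

print("best of 2000 rules, in-sample P&L:", round(best_pnl, 1))   # looks great
print("same rule, out-of-sample P&L:", round((best_rule * out_of_sample).sum(), 1))
```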
Yes, you could still get lucky and find an out-of-sample that happens, by chance, to work with your in-sample-optimized code, but it's very unlikely. What's more likely is that a good system will stop working, or that your past drawdown will be exceeded, but at least the out-of-sample will prevent you from deceiving yourself.
Since I started using the out-of-sample (about a year ago), roughly two thirds of my systems have kept working in forward-testing, vs. only half of them when I wasn't using it. Also, typically, when a system I created is not healthy, even though it looks just as good in the in-sample, it will show a sharp, continuous drop in the equity line starting as soon as the out-of-sample begins and running all the way to its end. By using the out-of-sample, you detect and discard such illusions. Without it, you have to keep believing in them for two years longer.
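If you want to spot that failure mode at a glance, plot the equity line with the out-of-sample boundary marked. A hypothetical helper (matplotlib and pandas assumed), which expects you already have a series of daily P&L:

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_equity(daily_pnl: pd.Series, oos_start) -> None:
    """Cumulative equity line with the in-sample/out-of-sample boundary marked.
    A healthy system looks similar on both sides of the line; a curve-fit one
    typically climbs up to the boundary and slides from there to the end."""
    daily_pnl.cumsum().plot(figsize=(9, 4), title="Equity line")
    plt.axvline(oos_start, color="red", linestyle="--", label="out-of-sample starts")
    plt.legend()
    plt.show()
```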
It's funny. Using the out-of-sample means preventing yourself from seeing a slice of the past, but it lets you take a peek at the future, so it's an even more helpful way to use your data. It's counter-intuitive in that it seems like you're not using some of the data, like you're wasting it... when in fact you're making the data you hide even more useful.
It's like going to the eye doctor: he shows you the letters and covers your eyes one at a time. Until you do that, you will never realize that one of your eyes sees better than the other, because the stronger eye has been supplying all the needed information. The out-of-sample forces your system to face reality in terms of its ability to read the future, just as the doctor forces each eye to face reality in terms of its ability to read the letters. Can your system read the future? Then prove it right now, by uncovering the out-of-sample.
Here's a good link on this:
http://backtestingblog.com/glossary/out-of-sample-testing/
...To be completely effective, the out-of-sample data should only be used once. Each backtest should have its own out-of-sample data because if it is used frequently, the out-of-sample data too easily becomes in-sample data.
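One common way to honor that "use it only once" rule is walk-forward testing: slice the history so that every new round of research gets a fresh, never-before-seen validation window. A minimal sketch (the fold count is arbitrary and my own choice, not from the link above):

```python
def walk_forward_splits(n_bars, n_folds=4):
    """Yield (train_end, test_end) index pairs over a series of n_bars bars.
    Each fold optimizes on everything before train_end and validates on the
    slice [train_end:test_end], which no earlier fold has ever touched."""
    fold = n_bars // (n_folds + 1)
    for k in range(1, n_folds + 1):
        yield fold * k, fold * (k + 1)

# Example: 2520 bars, 4 folds -> (504, 1008), (1008, 1512), (1512, 2016), (2016, 2520)
for train_end, test_end in walk_forward_splits(2520, 4):
    print(train_end, test_end)
```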
More good links on this:
http://en.wikipedia.org/wiki/List_of_cognitive_biases#Biases_in_probability_and_belief
http://en.wikipedia.org/wiki/Hindsight_bias
http://en.wikipedia.org/wiki/Experimenter's_bias
http://en.wikipedia.org/wiki/Wishful_thinking