When does optimization become curve fitting? And how do you prevent the latter? Obviously we all want to optimize our system/method/strategy, whatever you want to call it, but how do we know if we are just curve fitting it? Curve fitting, in my experience on here, always has negative connotations... Could curve fitting be good so long as it is done on a regular basis? How often should one re-examine and adjust their criteria?
Any opinions on this?
Sam.
Here's a quick and practical answer: use the out-of-sample and do all the curve-fitting you want.
You can optimize 2000 variables on a system if you wish - and by optimizing them you will come up with a profitable system - but you must do so using only half of the historical data you have. Let's say you have 10 years of historical data: you use only the first 5 years for your optimizations. Then, once you think you have come up with a perfect system, you see how well it does on the out-of-sample, the last 5 years, which you have prevented yourself and the software from seeing during optimization. If your system makes money in the out-of-sample, it is a good system. Otherwise you throw it away and move on to the next system. Mind you, once you know how a system behaves in the out-of-sample, you cannot keep tweaking it: it's too late to fix it, whether it turned out to be profitable or not.
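To make the mechanics concrete, here is a minimal Python sketch of that workflow. The price series, the 5-year halves, and the moving-average rule are all hypothetical stand-ins for whatever system you are actually testing; the point is only the discipline of the split: optimize freely on the first half, then score the winner exactly once on the second.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: ~10 years of daily closes. Use your own series.
rng = np.random.default_rng(0)
prices = pd.Series(
    100.0 + np.cumsum(rng.normal(0.05, 1.0, 2520)),
    index=pd.bdate_range("2004-01-02", periods=2520),
)

# The split: first 5 years are in-sample, last 5 years are out-of-sample.
half = len(prices) // 2
in_sample, out_of_sample = prices.iloc[:half], prices.iloc[half:]

def total_return(series, lookback):
    """Toy rule standing in for a real system: long when price > moving average."""
    signal = (series > series.rolling(lookback).mean()).astype(float).shift(1)
    return (signal * series.pct_change()).sum()  # shift(1) avoids look-ahead

# Curve-fit as much as you like, but only against the in-sample.
best = max(range(5, 205, 5), key=lambda lb: total_return(in_sample, lb))
print("best lookback found in-sample:", best)

# The one and only look at the hidden half. No tweaking after this line.
print("out-of-sample return:", total_return(out_of_sample, best))
```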
If you don't do this, you will easily come up with a system that works in the past, but that's like doing the crossword by looking at the answers: you're cheating yourself. If you make that mistake, then, to do things properly, you will have to spend another 2 years forward-testing (paper trading your system, live, in the real market) to see whether your system, which obviously worked in the past, also works in the future.
In other words, by optimizing, or simply by repeatedly trying new and different rules/conditions/parameters, you will end up finding systems that work on a given sample. The only way to know whether those systems are actually good is to hide a part of your data and pretend that part is the future, then use it to verify your finished system.
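A toy simulation makes the danger obvious. Below, the "market" is a pure random walk, so no rule can have a real edge; yet after 2000 random tries, the best rule still looks like a money machine in-sample and, predictably, does nothing out-of-sample. The numbers and the random rule generator are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Pure noise: by construction, there is nothing here to predict.
returns = rng.normal(0.0, 1.0, 1000)
in_sample, out_of_sample = returns[:500], returns[500:]

# "Repeatedly trying new rules": each candidate is a random long/flat signal.
best_rule, best_pnl = None, -np.inf
for _ in range(2000):
    rule = rng.integers(0, 2, 500)        # 0 = flat, 1 = long, chosen at random
    pnl = (rule * in_sample).sum()
    if pnl > best_pnl:
        best_rule, best_pnl = rule, pnl

print("best of 2000 rules, in-sample P&L:", round(best_pnl, 1))   # looks great
print("same rule, out-of-sample P&L:", round((best_rule * out_of_sample).sum(), 1))
```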
Yes, you could still get lucky and find an out-of-sample that happens, by chance, to work with your in-sample-optimized code, but it's very unlikely. What's more likely is that a good system will stop working, or that your past drawdown will be exceeded, but at least the out-of-sample will prevent you from deceiving yourself.
Since I started using the out-of-sample (about a year ago), roughly two thirds of my systems have kept working in forward-testing, vs. only half of them when I wasn't using it. Also, typically, when a system I created is not healthy, even though it looks just as good in the in-sample, it will show a sharp, continuous drop in the equity line starting as soon as the out-of-sample begins and running all the way to its end. By using the out-of-sample, you detect and discard such illusions. Without it, you have to keep believing in them for two years longer.
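If you want to spot that failure mode at a glance, plot the equity line with the out-of-sample boundary marked. A hypothetical helper (matplotlib and pandas assumed), which expects you already have a series of daily P&L:

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_equity(daily_pnl: pd.Series, oos_start) -> None:
    """Cumulative equity line with the in-sample/out-of-sample boundary marked.
    A healthy system looks similar on both sides of the line; a curve-fit one
    typically climbs up to the boundary and slides from there to the end."""
    daily_pnl.cumsum().plot(figsize=(9, 4), title="Equity line")
    plt.axvline(oos_start, color="red", linestyle="--", label="out-of-sample starts")
    plt.legend()
    plt.show()
```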
It's funny. Using the out-of-sample means preventing yourself from seeing a slice of the past, but it lets you take a peek at the future, so it's an even more helpful way to use your data. It's counter-intuitive in that it seems like you're not using some of the data, like you're wasting it... when in fact you're making the data you hide even more useful.
It's like going to the eye doctor: he shows you the letters and covers your eyes one at a time. Until you do that, you will never realize that one of your eyes sees better than the other, because the stronger eye has been supplying all the needed information. The out-of-sample forces your system to face reality in terms of its ability to read the future, just as the doctor forces each eye to face reality in terms of its ability to read the letters. Can your system read the future? Then prove it right now, by uncovering the out-of-sample.
Here's a good link on this:
http://backtestingblog.com/glossary/out-of-sample-testing/
...To be completely effective, the out-of-sample data should only be used once. Each backtest should have its own out-of-sample data because if it is used frequently, the out-of-sample data too easily becomes in-sample data.
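One common way to honor that "use it only once" rule is walk-forward testing: slice the history so that every new round of research gets a fresh, never-before-seen validation window. A minimal sketch (the fold count is arbitrary and my own choice, not from the link above):

```python
def walk_forward_splits(n_bars, n_folds=4):
    """Yield (train_end, test_end) index pairs over a series of n_bars bars.
    Each fold optimizes on everything before train_end and validates on the
    slice [train_end:test_end], which no earlier fold has ever touched."""
    fold = n_bars // (n_folds + 1)
    for k in range(1, n_folds + 1):
        yield fold * k, fold * (k + 1)

# Example: 2520 bars, 4 folds -> (504, 1008), (1008, 1512), (1512, 2016), (2016, 2520)
for train_end, test_end in walk_forward_splits(2520, 4):
    print(train_end, test_end)
```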
More good links on this:
http://en.wikipedia.org/wiki/List_of_cognitive_biases#Biases_in_probability_and_belief
http://en.wikipedia.org/wiki/Hindsight_bias
http://en.wikipedia.org/wiki/Experimenter's_bias
http://en.wikipedia.org/wiki/Wishful_thinking