Trading Using Statistical Arbitrage vs. Traditional Technical Analysis

IFeelFree

The problem with traditional technical analysis is that most financial time series (such as stock prices) are pretty close to being a "random walk". The "random walk hypothesis" states that stock market prices evolve according to a random walk and thus cannot be predicted. (This is consistent with the "efficient market hypothesis.") However, stock prices are not a pure random walk. For one thing, there is usually a small deterministic (predictable) component. The implication here is that much of technical analysis is an effort to detect a small deterministic trend within a larger random process. This large random component means that conventional technical analysis often doesn't work all that well.
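To put a rough number on how small that deterministic component is relative to the noise, here is a minimal sketch. The drift (0.03% per day) and volatility (1% per day) are illustrative assumptions, not figures from this post, and returns are assumed to be independent from day to day.

```python
import math

def drift_t_stat(daily_drift, daily_vol, n_days):
    """t-statistic of the sample mean return, assuming i.i.d. daily returns."""
    return daily_drift / (daily_vol / math.sqrt(n_days))

def days_needed(daily_drift, daily_vol, t_target=2.0):
    """Days of data needed before the drift reaches the target t-statistic."""
    return math.ceil((t_target * daily_vol / daily_drift) ** 2)

# Assumed values for illustration only: 0.03% daily drift, 1% daily volatility.
one_year = drift_t_stat(0.0003, 0.01, 252)
needed = days_needed(0.0003, 0.01)

print(f"t-stat after one year: {one_year:.2f}")
print(f"days needed for t = 2: {needed} (about {needed / 252:.1f} years)")
```

With these assumed numbers the drift is statistically invisible after one year of daily data, which is the sense in which the random component dominates.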

There is some evidence that certain asset classes, such as currencies and commodities, tend to be mean-reverting, in which case I would expect traditional technical analysis to work somewhat better for these assets. Even so, there is a large random component to these asset prices that makes them difficult to forecast. To address the randomness problem, some quantitative analysts use statistical arbitrage, in particular, cointegration. Cointegration involves assembling a portfolio of securities that has favorable statistical properties for forecasting. The idea is that you try to find a combination of 2 or more securities in which the random walk components largely cancel out, and you’re left with a portfolio that is more deterministic.
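The cancellation idea can be illustrated with synthetic data. In this sketch two made-up series share one random-walk component, and the weights (2, -1) are chosen by construction so that the shared component drops out. In real data the weights are unknown and have to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
trend = np.cumsum(rng.normal(0, 1, n))        # shared random-walk component
a = trend + rng.normal(0, 0.5, n)             # synthetic security A
b = 2.0 * trend + rng.normal(0, 0.5, n)       # synthetic security B

portfolio = 2.0 * a - b                       # the random-walk parts cancel

def lag1_autocorr(x):
    """Lag-1 autocorrelation; close to 1 for a random walk, near 0 for noise."""
    x = x - x.mean()
    return float(x[:-1] @ x[1:] / (x @ x))

print("A alone:  ", round(lag1_autocorr(a), 3), "std", round(a.std(), 2))
print("portfolio:", round(lag1_autocorr(portfolio), 3), "std", round(portfolio.std(), 2))
```

Each series on its own wanders like a random walk, while the combination stays in a narrow range around its mean.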

The most common example of this is pair trading. Pair trading is a market-neutral trading strategy that monitors the prices of two correlated securities, such as, say, Coca-Cola (KO) and Pepsi (PEP). Because these companies sell similar products, their stocks tend to move together. In pair trading you buy one and short the other (in the correct ratio) when their prices diverge significantly (based on historical norms). However, the problem with pair trading is that in recent years it has become too popular. As we know, when too many people are using the same or similar strategy, its effectiveness diminishes. To put it bluntly, it appears that pair trading is dead.
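A minimal sketch of the pair-trading rule described above, on synthetic prices rather than real KO/PEP data. The 60-bar lookback, the 2-standard-deviation entry threshold, and the use of an OLS slope as the "correct ratio" are all assumptions made for illustration.

```python
import numpy as np

def pair_signal(price_a, price_b, lookback=60, entry_z=2.0):
    """Return (signal, hedge_ratio, z) for the most recent bar.

    signal = -1: short A / long B, +1: long A / short B, 0: no trade.
    """
    a = np.asarray(price_a, dtype=float)[-lookback:]
    b = np.asarray(price_b, dtype=float)[-lookback:]
    hedge_ratio = np.polyfit(b, a, 1)[0]      # OLS slope of A on B
    spread = a - hedge_ratio * b
    z = (spread[-1] - spread.mean()) / spread.std()
    if z > entry_z:
        return -1, hedge_ratio, z             # A is rich relative to B
    if z < -entry_z:
        return 1, hedge_ratio, z              # A is cheap relative to B
    return 0, hedge_ratio, z

# Synthetic example: A tracks 1.5 x B, then diverges on the last bar.
idx = np.arange(120)
rng = np.random.default_rng(3)
stock_b = 50 + 5 * np.sin(idx / 5)
stock_a = 1.5 * stock_b + rng.normal(0, 0.1, idx.size)
stock_a[-1] += 2.0                            # artificial divergence

signal, hedge_ratio, z = pair_signal(stock_a, stock_b)
print(signal, round(hedge_ratio, 2), round(z, 1))
```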

However, there’s no reason to limit cointegration to two securities. Statistical methods such as the Johansen procedure allow one to find cointegrated portfolios of 3 or more securities. The rapid growth in the number of ETFs in recent years adds a large number of potential securities with which to assemble a cointegrated portfolio. The possible number of cointegrated portfolios of 3 or more ETFs is so large that it is unlikely to be fully exhausted for the foreseeable future. (For example, I calculate 1.2 billion possible combinations of 3 ETFs. Even if only a small fraction of these exhibit cointegration, that is still likely to be a large number. The number of possible combinations of 4 or 5 ETFs is astronomical.)
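As a rough check on the combinatorics: the post does not say how many ETFs were counted, so the universe size below is an assumption, picked because it gives a figure close to 1.2 billion.

```python
import math

n_etfs = 1930   # assumed universe size; not stated in the post

triples = math.comb(n_etfs, 3)
quads = math.comb(n_etfs, 4)
quints = math.comb(n_etfs, 5)

print(f"3 ETFs: {triples:.3e}")   # roughly 1.2 billion
print(f"4 ETFs: {quads:.3e}")
print(f"5 ETFs: {quints:.3e}")
```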

There are commercial software packages (such as FX AlgoTrader) which enable users to find and trade cointegrated pairs, but I’m not aware of any commercial products that enable users to assemble cointegrated portfolios of 3 or more securities. For anyone who has software programming skills, there are resources available to do this. (For example, Ernie Chan’s “Algorithmic Trading: Winning Strategies and Their Rationale” provides a good overview that is practical and not overly technical.) As a software engineer, I wrote my own software to find and trade cointegrated portfolios of multiple (3 or more) securities. Currently, I’m trading a cointegrated portfolio of 3 ETFs, and so far, I’m having good results. I did paper trading for nearly a year while I developed and refined the algorithm, but I recently “went live”. Backtesting suggests I should be able to achieve greater than 50% average yearly return (AYR), with a maximum drawdown of 13%. Since I started live trading in a $100,000 account 2 1/2 months ago, I’ve made an 11.3% return. (If the algorithm continues to perform this well, this suggests an AYR of 67%.) Time will tell how well the algorithm continues to perform.
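The 67% figure is consistent with compounding the 11.3% return over a full year, as this short sketch shows. Treating the period as exactly 2.5 months is an assumption.

```python
def annualize(period_return, months):
    """Compound a return earned over `months` months up to a 12-month rate."""
    return (1 + period_return) ** (12 / months) - 1

compounded = annualize(0.113, 2.5)
simple = 0.113 * 12 / 2.5          # no compounding, for comparison

print(f"compounded: {compounded:.1%}")
print(f"simple:     {simple:.1%}")
```

Without compounding, the same 11.3% scales to about 54% a year, so the 67% estimate assumes gains are reinvested.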

To implement a mean reversion strategy, I apply a Kalman filter to the cointegrated portfolio. This provides optimal dynamic updating of the portfolio weights, along with error and variances of the “spread”. (This is similar to a Bollinger Band approach, but without the lag inherent in moving averages and moving standard deviations.) The following chart shows 5 years of weekly data of the spread “error” (blue line), along with +/- the square-root of the spread variance (green/red lines). When the spread error is above the green line, I short the portfolio. When it’s below the red line, I buy:
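The entry rule above can be written as a small function. This is only a sketch: the post does not say when positions are closed, so staying flat whenever the error is inside the band is an assumption, and `band_positions` is a hypothetical helper name.

```python
import numpy as np

def band_positions(error, variance):
    """-1 (short) above +sqrt(variance), +1 (long) below -sqrt(variance), else 0."""
    error = np.asarray(error, dtype=float)
    band = np.sqrt(np.asarray(variance, dtype=float))
    positions = np.zeros(error.shape, dtype=int)
    positions[error > band] = -1      # spread error above the upper line: short
    positions[error < -band] = 1      # spread error below the lower line: buy
    return positions

example = band_positions([0.5, 1.5, -0.2, -2.0], [1.0, 1.0, 1.0, 1.0])
print(example)   # → [ 0 -1  0  1]
```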

Kalman Filter Statistics.jpg

The cumulative returns during 4 years of backtesting produced an in-sample AYR of 67%:

Cumulative Returns.jpg

If this thread generates any interest, I’ll try to post regularly to report on the algorithm performance, answer questions, and discuss other trading strategies I’m looking at, such as using multivariate auto-regressive models.

P.S. If anyone can tell me how to make the chart images larger, I'd appreciate it!
 
This is not my thing as I get much higher returns actively day trading, but it is very interesting and I, for one, appreciate the time and effort you've put into posting the above. Please continue 🙂
 
Thanks, Mr. Charts. I suspect that my approach is a bit too mathematical for most traders, but the math gives me confidence that I have a sound basis for making my trades. There's at least one other trader on this forum who's also using a cointegration/mean-reversion strategy. I'm looking at other strategies as well, such as cointegrated vector autoregressive models and momentum strategies. As long as I'm making good returns, I'm going to continue with this.
 
IFeelFree,

Your stuff sounds very good. I have some questions:

- Do you use ETFs from the same sector for your portfolio?
- Do you run an OLS regression for the ETF weights?
- Is your backtest for the 3 ETFs or for the whole portfolio?
- What timeframe do you use?

Regards.
 
If you stuck with "what's good about this" rather than creating fake straw men you'd be on sounder ground. Disparaging something you seem to know little about makes you look foolish. Instead, if you expound on your method it could be an interesting thread.
 
IFeelFree,

Your stuff sounds very good. I have some questions:

- Do you use ETFs from the same sector for your portfolio?
- Do you run an OLS regression for the ETF weights?
- Is your backtest for the 3 ETFs or for the whole portfolio?
- What timeframe do you use?

Regards.

1. The ETFs I'm currently using are for stocks, bonds, and commodities. (I don't want to list the ETFs because I want to keep that proprietary.) I also use the inverse ETFs rather than shorting the ETFs, which I can't do in my account.

2. The ETF weights are obtained from the Johansen procedure, which optimizes cointegration, and so is different from OLS. (With only two time series, one can use OLS to obtain the coefficients. With 3 or more, an OLS regression depends on which series is chosen as the dependent variable and yields only a single relationship, so I use Johansen instead.) In particular, the Johansen procedure uses a vector autoregressive model to look for cointegrating relationships among multiple time series. I found the code for the Johansen procedure online, and I verified by inspection that the code agrees with the formulas.
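For readers who want to see the mechanics, here is a simplified NumPy sketch of the eigen-decomposition at the heart of the Johansen procedure. It is not the code I use: it assumes a VAR(1) in levels (no lagged differences), demeaned data, and synthetic prices, and it does not compare the trace statistics against critical values. `johansen_eig` is a hypothetical name. A fuller implementation is available as `coint_johansen` in `statsmodels.tsa.vector_ar.vecm`.

```python
import numpy as np

def johansen_eig(levels):
    """Eigenvalues (descending), cointegrating vectors (columns), trace stats."""
    y = np.asarray(levels, dtype=float)
    r0 = np.diff(y, axis=0)                    # delta Y_t
    r1 = y[:-1]                                # Y_{t-1}
    r0 = r0 - r0.mean(axis=0)
    r1 = r1 - r1.mean(axis=0)
    t = r0.shape[0]
    s00 = r0.T @ r0 / t
    s11 = r1.T @ r1 / t
    s01 = r0.T @ r1 / t
    # Solve S10 S00^-1 S01 v = lambda S11 v via a symmetric eigenproblem.
    chol = np.linalg.cholesky(s11)             # s11 = chol @ chol.T
    a = np.linalg.solve(chol, s01.T)
    m = a @ np.linalg.solve(s00, a.T)
    eigvals, u = np.linalg.eigh(m)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    vectors = np.linalg.solve(chol.T, u[:, order])
    # Trace statistic for "rank <= r": -T * sum of log(1 - lambda_i) for i > r.
    trace = -t * np.cumsum(np.log(1.0 - eigvals[::-1]))[::-1]
    return eigvals, vectors, trace

# Synthetic data: the third series is the average of the first two plus noise.
rng = np.random.default_rng(1)
n = 1000
w1 = np.cumsum(rng.normal(size=n))
w2 = np.cumsum(rng.normal(size=n))
prices = np.column_stack([w1, w2, 0.5 * w1 + 0.5 * w2])
prices = prices + rng.normal(scale=0.5, size=(n, 3))

eigvals, vectors, trace = johansen_eig(prices)
weights = -vectors[:, 0] / vectors[2, 0]       # scale so the third weight is -1
spread = prices @ weights
print(np.round(weights, 2), round(spread.std(), 2))
```

The eigenvector with the largest eigenvalue recovers weights close to (0.5, 0.5, -1), the combination that was built into the synthetic data.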

3. The 3 ETFs (and their inverses) are used to assemble the portfolio, which is updated every time I make a trade (weekly). I did backtests using 5 years of data which, because of the lookback period, gives 4 years of returns. I update the parameters every week, so the whole thing is very dynamic.

4. I trade weekly. I ran the backtest for daily trades but the results were only a modest improvement. Given the much higher trading costs of daily vs. weekly, I decided to stick with a 1 week time-frame.

I'm happy to report that the algorithm had me short stocks and long bonds this week, which made it very profitable during Friday's bloodbath.
 
If you stuck with "what's good about this" rather than creating fake straw men you'd be on sounder ground. Disparaging something you seem to know little about makes you look foolish. Instead, if you expound on your method it could be an interesting thread.

It may be the case that technical analysis can key in on certain patterns that come up when traders act in certain ways. I don't know. However, based on standard statistics, price movements in markets exhibit a high degree of "randomness", that is, poor predictability of those price movements is expected using conventional technical analysis. Nevertheless, it appears that some experienced traders are able to use such analysis profitably. I suspect that years of trading have given them a highly developed intuition regarding the markets, and that is their true edge.

Traders use whatever edge they have in trading. My edge, if I have any, is my mathematical knowledge and programming skills. The advantage of my approach is that I only spend about 3 hours a week trading and updating my software, and I'm up 15% since August. That's good enough for me.
 
Thanks IFeelFree for your feedback.

How many bars do you use to find your cointegration? 2500? 5000? More?

So if I understand correctly, you don't use the weights found with the Johansen procedure, but use a Kalman filter instead?

If yes, are there differences? Are the Johansen weights not good enough?
 
I'm using 5 years of weekly data, so 5 x 52 = 260 data points. I also have a daily version of the software which uses 5 x 252 = 1260 data points. However, the returns of the daily version are only slightly higher than those of the weekly version, and it has higher transaction fees and would make a lot more demands on my personal time, so I'm using the weekly version.

I take the weights from the Johansen procedure and I use them as the initial values (the "seed" values) for the Kalman filter. (It turns out the Kalman filter is sensitive to initial values, so this is important.) The advantage of the Kalman filter is that it gives the statistically optimal weights for each period, rather than just a constant weighting. It is more dynamic, and seems to better accommodate changing market conditions. The bottom line is that it dramatically improves the performance of the algorithm. That's why I'm getting >50% AYR.
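Here is a minimal sketch of that seeding idea, using a common regression-style Kalman filter in which the weights follow a random walk. It is not my production code. The two synthetic series, the `delta` and `obs_var` values, and the function name `kalman_hedge` are all assumptions for illustration, and `beta0` stands in for the seed weights.

```python
import numpy as np

def kalman_hedge(x, y, beta0, delta=1e-4, obs_var=1e-3):
    """Track time-varying weights beta_t in y_t = x_t . beta_t + noise.

    Returns the weight path, the one-step forecast errors, and their variances.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = x.shape
    beta = np.asarray(beta0, dtype=float).copy()   # seed weights
    p = np.zeros((k, k))                           # state covariance
    vw = delta / (1.0 - delta) * np.eye(k)         # assumed state noise
    betas = np.empty((n, k))
    errors = np.empty(n)
    variances = np.empty(n)
    for t in range(n):
        r = p + vw                                 # predicted state covariance
        errors[t] = y[t] - x[t] @ beta             # forecast ("spread") error
        variances[t] = x[t] @ r @ x[t] + obs_var   # forecast error variance
        gain = r @ x[t] / variances[t]
        beta = beta + gain * errors[t]             # updated weights
        p = r - np.outer(gain, x[t]) @ r
        betas[t] = beta
    return betas, errors, variances

# Synthetic example: true relation is etf_b = 2.0 * etf_a + 1.0 plus noise.
rng = np.random.default_rng(2)
n = 300
etf_a = 50 + 5 * np.sin(np.arange(n) / 20) + rng.normal(0, 0.2, n)
etf_b = 2.0 * etf_a + 1.0 + rng.normal(0, 0.1, n)
x = np.column_stack([etf_a, np.ones(n)])           # price plus intercept

betas, errors, variances = kalman_hedge(x, etf_b, beta0=[1.5, 0.5])
print(np.round(betas[-1], 2), round(float(np.sqrt(variances[-1])), 3))
```

The `errors` and the square root of `variances` are the quantities plotted as the spread error and its bands. A seed far from the true weights shows up as a large initial error, which is one way to see why the starting values matter.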

You can read examples in Chan's book. For example, using a triplet of cointegrated ETFs, EWA-EWC-IGE, he obtains only a 12.6% average yearly return (AYR) using the Johansen weights with a simple mean-reversion algorithm. However, adding a Kalman filter to the simpler EWA-EWC cointegrated pair, he obtains the much higher 26.2% AYR. I know most people prefer to keep things simple, but sometimes a more sophisticated algorithm really does perform better.
 
Prices are not random under specific conditions.....if you know the conditions to look for and can execute the trades at that time you will always make money trading
 
Thanks, Mr. Charts. I suspect that my approach is a bit too mathematical for most traders, but the math gives me confidence that I have a sound basis for making my trades. There's at least one other trader on this forum who's also using a cointegration/mean-reversion strategy. I'm looking at other strategies as well, such as cointegrated vector autoregressive models and momentum strategies. As long as I'm making good returns, I'm going to continue with this.

That's the secret to being a trader .....be confident in your edge .....

Respect
N
 
Prices are not random under specific conditions.....if you know the conditions to look for and can execute the trades at that time you will always make money trading

It's not that prices are random. It's that our knowledge of the causes of price movements is limited. (I can't predict price movements with perfect accuracy.) Therefore, any price movements I can't predict are random to me.

Presumably, all price movements have causal mechanisms, but we can't know all of them. What we can't explain is "random".
 
Hi IFeelFree,
Did you perform any cointegration testing on FX spot pairs? The main issue with ETFs is data; however, FX data are quite easily within reach.

Thanks!

Tomas

 
Autoregressive models

Hello Feel Free

I am glad you mentioned autoregressive models. I have been wondering whether AR or ARMA models have any predictive capability in the stock market.

Have you used any of these?


I have only applied these models for academic purposes, therefore I was wondering if you could provide any feedback on their practicality or feasibility in trading.



regards
 
So would you all say that mean reversion applies mainly if you trade FX? I do know one guy who does this with FX, but he has a very large capital base.

Not sure if this would work on individual stocks or indices?
 