3rd generation NN, deep learning, deep belief nets and Restricted Boltzmann Machines

Here are the features with absolute values

cond(11) = upBol_1(t);
cond(12) = lwBol_1(t);
cond(13) = upBol_15(t);
cond(14) = lwBol_15(t);
cond(15) = upBol_20(t);
cond(16) = lwBol_20(t);
cond(17) = upBol_25(t);
cond(18) = lwBol_25(t);
cond(19) = upBol_30(t);
cond(20) = lwBol_30(t);

cond(134) = MA5(t);
cond(135) = MA15(t);
cond(136) = MA30(t);
cond(137) = MA70(t);
cond(138) = MA150(t);

cond(139) = EMA10(t);
cond(140) = EMA20(t);
cond(141) = EMA50(t);
cond(142) = EMA100(t);
cond(143) = EMA200(t);

cond(156) = m(t);
cond(157) = d(t);
cond(158) = h(t); % hour
cond(159) = mn(t);% minute
cond(160) = day(t);

cond(161) = open(t);
cond(162) = high(t);
cond(163) = low(t);
cond(164) = close(t);

Features 156-160 are time information, i.e. month, day of the week etc.
All features are normalized to 0-1 later. Do you think I should remove them, or replace them with their log returns or lags?

Anyway, I tried some feature selection, and none of the methods from WEKA improved the results, except maybe PCA in some cases (but I didn't try all of them!).
I know that maybe 100 of the features are not necessary and maybe just lags are enough;
results are similar even if half of the features are removed, which supports the theory that classifiers are able to learn this information anyway.

But what would you suggest as preprocessing for this data?



Of course it's not enough; even if it is a lot of trades, they are strongly correlated,
so it does not prove anything. So I'm now trying to get results for more days.
That's the only way to draw any conclusions, in my opinion.

Krzysztof

Yes, I saw the time information (I missed it at first glance).
Normalizing does not help in this case: it serves learning / classification purposes and does not remove the bias from the features. Therefore you should replace all absolute values with log returns or other relative change measures.
Since you are interested in the relation between e.g. EMA and price, you have to quantify it in the same relative way, to get real values instead of just saying price is above / below, e.g. price is at 50% or 120% of EMA.
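
A minimal MATLAB sketch of the two relative measures mentioned above, a log return and price expressed as a percentage of its EMA. The price series, the EMA period and the variable names are made up for illustration; they are not taken from the TradeFX scripts.

```matlab
% Toy close prices (made-up numbers, for illustration only).
price = [1.3000 1.3010 1.3005 1.3020 1.3030 1.3025];

% Log return between consecutive bars: r(t) = log(price(t) / price(t-1)).
% The first bar has no predecessor, so it is marked NaN.
logret = [NaN, diff(log(price))];

% Simple exponential moving average with an assumed period of 3 bars.
alpha = 2 / (3 + 1);
ema = zeros(size(price));
ema(1) = price(1);
for t = 2:numel(price)
    ema(t) = alpha * price(t) + (1 - alpha) * ema(t-1);
end

% Price as a percentage of its EMA (100 means price equals the EMA).
pctOfEma = 100 * price ./ ema;

disp(logret);
disp(pctOfEma);
```

Both series no longer depend on the absolute price level, which is the point of the suggestion.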

Feature selection usually does not improve classification; it rather makes learning cheaper and less complicated.

PS:
tp = take profit
TP = True Positive
Can you elaborate on your label creation?

Is it like this:
starting from bar 1, you look ahead and see whether the tp is hit (e.g. +15 pips), so bar 1 is a TP learning example for the BUY class? And then you keep adding the following bars as TP until tn+x - tn < 15 pips?
The same with tn (-x pips) for the TN case? If so, I see a "risk" of you producing pure noise: e.g. a very slow market ranges within 10 pips overnight, then the market opens and you get your +15 pips. Will you then add all the preceding bars as TP?

Hope i made myself clear and you can explain :)

greetings
 
OK, I will change the feature definitions and rerun them against the 17 algos which are connected to the system, so hopefully the average performance measures will detect some improvement.

Regarding labels: starting from bar 1, I look ahead and check whether stop loss or take profit is hit first. If take profit, then bar 1 is given label 1 (so True Positive); if stop loss, then bar 1 is given label 0 (so True Negative). This is repeated for all bars.
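
The labelling rule just described can be sketched roughly as follows. This is not the instantpipexit script; the toy bars, the tp/sl distances and the choice to check the stop loss first when both levels fall inside one bar are assumptions made for illustration.

```matlab
% Toy bars (made-up numbers), prices expressed in pips for readability.
high  = [12 14 13 30 18  9];
low   = [ 8 10  5 12 10  2];
close = [10 12 11 28 12  4];
tp = 15;   % take-profit distance in pips (assumed)
sl = 5;    % stop-loss distance in pips (assumed)

n = numel(close);
label = NaN(1, n);   % NaN = neither level reached before the data ends
for t = 1:n-1
    entry = close(t);
    for k = t+1:n
        if low(k) <= entry - sl          % stop loss touched (checked first: pessimistic)
            label(t) = 0;
            break;
        elseif high(k) >= entry + tp     % take profit touched
            label(t) = 1;
            break;
        end
    end
end

disp(label);
```

Note that the look-ahead is unbounded here: a bar keeps its label no matter how many bars pass before one of the two levels is reached.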

So you can say

TP - correctly taken trade
TN - correctly not taken trades (as stop loss is predicted)
FP - losing trades
FN - missed trades

Hope it clarifies.

Krzysztof
 
OK,
yet you did not react to my point that you are adding a lot of nonsense as TP. In a very slow market you keep adding bars whose features have an impact on the final take profit that tends towards zero.

This makes it very difficult to find appropriate decision boundaries, as those bars could equally likely be TN.

Hence you could use a tighter definition of your TP, e.g.:
a bar is a TP sample if tp is hit within the next x candles (the smaller x, the tighter your class).
All other bars then represent the negative class. This will result in a very unbalanced set, which is OK if you know how to deal with it. I see a higher chance of success going this way; think about it.
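
A rough MATLAB sketch of this tighter definition, assuming toy bars, a take-profit distance of 15 pips and a window of x = 2 bars (all made up). For brevity it ignores the stop loss.

```matlab
% Toy bars (made-up numbers), prices expressed in pips.
high  = [12 14 13 30 18  9];
close = [10 12 11 28 12  4];
tp = 15;   % take-profit distance in pips (assumed)
x  = 2;    % look-ahead window in bars (assumed)

n = numel(close);
label = zeros(1, n);              % every bar starts in the negative class
for t = 1:n-1
    last = min(t + x, n);         % do not look past the end of the data
    if any(high(t+1:last) >= close(t) + tp)
        label(t) = 1;             % tp reached within the next x bars
    end
end

posShare = mean(label);           % share of positive samples (class balance)
disp(label);
disp(posShare);
```

In this toy series bar 1 would count as positive with an unbounded look-ahead (bar 4 eventually reaches its target), but with x = 2 it is negative. The positive share also shrinks as x gets smaller, which is the imbalance mentioned above.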
 

Yes, I know that the label creation is a problem. See this post; most likely I will change it soon.

http://www.trade2win.com/boards/tra...ricted-boltzmann-machines-14.html#post1828870

My idea was to use something like ATR-based labels or MA-cross-based labels.

Meanwhile I tried two-time-frame trading, i.e. using rebinning I created a training set containing 1 min data and the same number of bars of rebinned data (factor 5, so effectively 5 min). No improvement in results: profit factors are similar for the 1 min and the merged 1/5 min data tests. Maybe the time frames were too close, or it is just another myth that trading using multiple time frames helps...

Will try 1min/15min now.
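
For readers unfamiliar with rebinning, here is a minimal sketch of aggregating 1 min bars into bars of factor f. The data and variable names are invented, and this is only one plausible way to do it, not necessarily how it was done in the tests above.

```matlab
% Ten made-up 1-minute bars.
open1  = [1 2 3 4 5 6 7 8 9 10];
high1  = open1 + 2;
low1   = open1 - 1;
close1 = open1 + 1;

f  = 5;                          % rebinning factor (5 -> 5-minute bars)
nb = floor(numel(open1) / f);    % number of complete rebinned bars
idx = 1:nb*f;                    % drop any incomplete trailing bar

open5  = open1(1:f:nb*f);                           % first open of each group
close5 = close1(f:f:nb*f);                          % last close of each group
high5  = max(reshape(high1(idx), f, nb), [], 1);    % highest high per group
low5   = min(reshape(low1(idx),  f, nb), [], 1);    % lowest low per group

disp([open5; high5; low5; close5]);
```

Changing f to 15 would give the 1min/15min variant.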

Krzysztof
 
new features performance

Here are the new features. All absolute features are replaced with relative values.
Additionally I added a VWAP price, VWAP=(open+close+(high+low)/2)/3; as, according
to some quantexchange gurus, it's the best representation of price.
See below.

cond(1) = within(upBol_1(t),price(t)); % within 5 pips of upper Bollinger band
cond(2) = within(upBol_15(t),price(t)); % within 5 pips of upper Bollinger band
cond(3) = within(upBol_20(t),price(t)); % within 5 pips of upper Bollinger band
cond(4) = within(upBol_25(t),price(t)); % within 5 pips of upper Bollinger band
cond(5) = within(upBol_30(t),price(t)); % within 5 pips of upper Bollinger band

cond(6) = within(lwBol_1(t),price(t)); % within 5 pips of lower Bollinger band
cond(7) = within(lwBol_15(t),price(t)); % within 5 pips of lower Bollinger band
cond(8) = within(lwBol_20(t),price(t)); % within 5 pips of lower Bollinger band
cond(9) = within(lwBol_25(t),price(t)); % within 5 pips of lower Bollinger band
cond(10) = within(lwBol_30(t),price(t)); % within 5 pips of lower Bollinger band

cond(11) = upBol_1(t)/price(t);
cond(12) = lwBol_1(t)/price(t);
cond(13) = upBol_15(t)/price(t);
cond(14) = lwBol_15(t)/price(t);
cond(15) = upBol_20(t)/price(t);
cond(16) = lwBol_20(t)/price(t);
cond(17) = upBol_25(t)/price(t);
cond(18) = lwBol_25(t)/price(t);
cond(19) = upBol_30(t)/price(t);
cond(20) = lwBol_30(t)/price(t);

% Price trend

cond(21) = trend(price,t,'DOWN',2); % duration 2 bars
cond(22) = trend(price,t,'DOWN',3); % duration 3 bars
cond(23) = trend(price,t,'DOWN',4); % duration 4 bars
cond(24) = trend(price,t,'DOWN',5); % duration 5 bars
cond(25) = trend(price,t,'DOWN',6); % duration 6 bars

cond(26) = trend(price,t,'UP',2); % duration 2 bars
cond(27) = trend(price,t,'UP',3); % duration 3 bars
cond(28) = trend(price,t,'UP',4); % duration 4 bars
cond(29) = trend(price,t,'UP',5); % duration 5 bars
cond(30) = trend(price,t,'UP',6); % duration 6 bars

if t > 2
cond(31) = high(t) < high(t-1) && low(t) < low(t-1);
cond(32) = high(t) < high(t-1) && low(t) > low(t-1);
cond(33) = high(t) > high(t-1) && low(t) < low(t-1);
cond(34) = high(t) > high(t-1) && low(t) > low(t-1);
end

cond(35) = price(t) / open(t);

% RSI

cond(36) = trend(RSI8,t,'DOWN',2);
cond(37) = trend(RSI8,t,'DOWN',3);
cond(38) = trend(RSI8,t,'DOWN',4);
cond(39) = trend(RSI8,t,'DOWN',5);
cond(40) = trend(RSI8,t,'DOWN',6);

cond(41) = trend(RSI8,t,'UP',2);
cond(42) = trend(RSI8,t,'UP',3);
cond(43) = trend(RSI8,t,'UP',4);
cond(44) = trend(RSI8,t,'UP',5);
cond(45) = trend(RSI8,t,'UP',6);

cond(46) = trend(RSI14,t,'DOWN',2);
cond(47) = trend(RSI14,t,'DOWN',3);
cond(48) = trend(RSI14,t,'DOWN',4);
cond(49) = trend(RSI14,t,'DOWN',5);
cond(50) = trend(RSI14,t,'DOWN',6);

cond(51) = trend(RSI14,t,'UP',2);
cond(52) = trend(RSI14,t,'UP',3);
cond(53) = trend(RSI14,t,'UP',4);
cond(54) = trend(RSI14,t,'UP',5);
cond(55) = trend(RSI14,t,'UP',6);

cond(56) = trend(RSI50,t,'DOWN',2);
cond(57) = trend(RSI50,t,'DOWN',3);
cond(58) = trend(RSI50,t,'DOWN',4);
cond(59) = trend(RSI50,t,'DOWN',5);
cond(60) = trend(RSI50,t,'DOWN',6);

cond(61) = trend(RSI50,t,'UP',2);
cond(62) = trend(RSI50,t,'UP',3);
cond(63) = trend(RSI50,t,'UP',4);
cond(64) = trend(RSI50,t,'UP',5);
cond(65) = trend(RSI50,t,'UP',6);

cond(66) = trend(RSI200,t,'DOWN',2);
cond(67) = trend(RSI200,t,'DOWN',3);
cond(68) = trend(RSI200,t,'DOWN',4);
cond(69) = trend(RSI200,t,'DOWN',5);
cond(70) = trend(RSI200,t,'DOWN',6);

cond(71) = trend(RSI200,t,'UP',2);
cond(72) = trend(RSI200,t,'UP',3);
cond(73) = trend(RSI200,t,'UP',4);
cond(74) = trend(RSI200,t,'UP',5);
cond(75) = trend(RSI200,t,'UP',6);

cond(76) = RSI8(t);
cond(77) = RSI14(t);
cond(78) = RSI50(t);
cond(79) = RSI200(t);

% CCI

cond(80) = trend(CCI5,t,'DOWN',2);
cond(81) = trend(CCI5,t,'DOWN',3);
cond(82) = trend(CCI5,t,'DOWN',4);
cond(83) = trend(CCI5,t,'DOWN',5);
cond(84) = trend(CCI5,t,'DOWN',6);

cond(85) = trend(CCI5,t,'UP',2);
cond(86) = trend(CCI5,t,'UP',3);
cond(87) = trend(CCI5,t,'UP',4);
cond(88) = trend(CCI5,t,'UP',5);
cond(89) = trend(CCI5,t,'UP',6);

cond(90) = trend(CCI10,t,'DOWN',2);
cond(91) = trend(CCI10,t,'DOWN',3);
cond(92) = trend(CCI10,t,'DOWN',4);
cond(93) = trend(CCI10,t,'DOWN',5);
cond(94) = trend(CCI10,t,'DOWN',6);

cond(95) = trend(CCI10,t,'UP',2);
cond(96) = trend(CCI10,t,'UP',3);
cond(97) = trend(CCI10,t,'UP',4);
cond(98) = trend(CCI10,t,'UP',5);
cond(99) = trend(CCI10,t,'UP',6);

cond(100) = trend(CCI21,t,'DOWN',2);
cond(101) = trend(CCI21,t,'DOWN',3);
cond(102) = trend(CCI21,t,'DOWN',4);
cond(103) = trend(CCI21,t,'DOWN',5);
cond(104) = trend(CCI21,t,'DOWN',6);

cond(105) = trend(CCI21,t,'UP',2);
cond(106) = trend(CCI21,t,'UP',3);
cond(107) = trend(CCI21,t,'UP',4);
cond(108) = trend(CCI21,t,'UP',5);
cond(109) = trend(CCI21,t,'UP',6);

cond(110) = trend(CCI35,t,'DOWN',2);
cond(111) = trend(CCI35,t,'DOWN',3);
cond(112) = trend(CCI35,t,'DOWN',4);
cond(113) = trend(CCI35,t,'DOWN',5);
cond(114) = trend(CCI35,t,'DOWN',6);

cond(115) = trend(CCI35,t,'UP',2);
cond(116) = trend(CCI35,t,'UP',3);
cond(117) = trend(CCI35,t,'UP',4);
cond(118) = trend(CCI35,t,'UP',5);
cond(119) = trend(CCI35,t,'UP',6);

cond(120) = CCI5(t);
cond(121) = CCI10(t);
cond(122) = CCI21(t);
cond(123) = CCI35(t);

% MAs

cond(124) = price(t) / MA5(t);
cond(125) = price(t) / MA15(t);
cond(126) = price(t) / MA30(t);
cond(127) = price(t) / MA70(t);
cond(128) = price(t) / MA150(t);

cond(129) = price(t) / EMA10(t);
cond(130) = price(t) / EMA20(t);
cond(131) = price(t) / EMA50(t);
cond(132) = price(t) / EMA100(t);
cond(133) = price(t) / EMA200(t);

if t > 1 cond(134) = VWAP(t) - VWAP(t-1); end
if t > 2 cond(135) = VWAP(t) - VWAP(t-2); end
if t > 3 cond(136) = VWAP(t) - VWAP(t-3); end
if t > 4 cond(137) = VWAP(t) - VWAP(t-4); end
if t > 5 cond(138) = VWAP(t) - VWAP(t-5); end

if t > 1 cond(139) = VWAP(t) / VWAP(t-1); end
if t > 2 cond(140) = VWAP(t) / VWAP(t-2); end
if t > 3 cond(141) = VWAP(t) / VWAP(t-3); end
if t > 4 cond(142) = VWAP(t) / VWAP(t-4); end
if t > 5 cond(143) = VWAP(t) / VWAP(t-5); end

% stochastics

cond(144) = stochK143(t);
cond(145) = stochK215(t);
cond(146) = stochK3610(t);
cond(147) = stochK5021(t);

cond(148) = stochD143(t);
cond(149) = stochD215(t);
cond(150) = stochD3610(t);
cond(151) = stochD5021(t);

cond(152) = DPO10(t);
cond(153) = DPO20(t);
cond(154) = DPO50(t);
cond(155) = DPO200(t);

cond(156) = m(t);
cond(157) = d(t);
cond(158) = h(t); % hour
cond(159) = mn(t);% minute
cond(160) = day(t);

cond(161) = mean(price(1:t))/price(t);
cond(162) = var(price(1:t));
cond(163) = mean(VWAP(1:t))/VWAP(t);
cond(164) = var(VWAP(1:t));

cond(165) = lag1(t);
cond(166) = lag2(t);
cond(167) = lag3(t);
cond(168) = lag4(t);
cond(169) = lag5(t);
cond(170) = lag6(t);
 
and results

Sadly, no performance improvement. The 17 algos, trained on 5000 1 min bars over 6 days
on the old and new features, have almost the same profit factor (0.22/0.23).

So the change of features doesn't improve results, which is what I was expecting...
See the Excel sheet.

Krzysztof
 

Attachment: new features.zip (37.8 KB)


Can you provide me with your code, so I can apply some changes?
 
scripts

Here are 4 scripts.

instantpip170_1 - creates the features
instantpipexit - creates the labels (var status=label)
trend
within


Krzysztof
 

Attachment: TradeFX.zip (4.3 KB)
they perform worse


If you look carefully at this sheet, you can see that the new features actually perform worse than the old ones. Profit factor seems to be the same for all algos, but other measures are falling, e.g. average precision from 32.6% to 28.8%. Also, not counted in the sheet, the ratio of profitable to non-profitable algos falls from 0.57 for the old features (64 profitable algos) to 0.4 for the new features (45 profitable algos).

Hmmm, I thought the VWAP price would help with something... next myth.

Krzysztof
 

That performance decreased does not mean that the initial feature set made more sense :) To my understanding, your approach does not make any sense whatsoever as long as you keep adding bars as TP/TN which are very unlikely to have any impact on the development of the price.

Try to extract bars where e.g. price hits +- ATR/X within a fixed window (e.g. within the next 5 bars); then you can try cluster analysis to make sure you have actual concepts you can watch out for. Finally, create a strong learner on these positive samples, stressing the decision bounds until you have positive performance in cross-validation. Furthermore, ensemble methods are worth a shot. It is theoretically proven (Condorcet's jury theorem) that a majority vote over more and more independent classifiers, each with a success rate > 50%, approaches perfect accuracy. :smart:
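
The ensemble claim can be checked numerically. The sketch below computes the accuracy of a majority vote over n classifiers under the strong assumption that their errors are independent and each is correct with probability p; the value p = 0.55 and the ensemble sizes are arbitrary choices. Classifiers trained on the same bars are usually correlated, so in practice the gain is smaller.

```matlab
% Majority-vote accuracy for n independent classifiers, each correct with
% probability p (binomial tail; gammaln keeps large n numerically stable).
p  = 0.55;                 % individual success rate (assumed)
ns = [1 11 101 1001];      % odd ensemble sizes, so there are no ties
acc = zeros(size(ns));
for i = 1:numel(ns)
    n = ns(i);
    k = (n + 1) / 2 : n;   % numbers of correct votes that form a majority
    logTerm = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1) ...
              + k * log(p) + (n - k) * log(1 - p);
    acc(i) = sum(exp(logTerm));
end

disp([ns; acc]);
```

Under these assumptions the majority-vote accuracy grows with n and approaches 1, but only because the votes are independent.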

Anyway, be patient and don't give up yet :rolleyes:
 
That performance decreased does not say that the initial featureset made more sense :)

So if the performance decreases, what does it mean?

Meanwhile I reran all algos with a different training length (10000), and for sure performance decreases compared to my original feature set. str6 = old features, str7 = new ones; measures are in the same order as in the Excel sheet.

5000 str6 32.63902857 0.58372549 0.042 0.429656863 0.021617647 -18254.52941 PF=0.23 W/L=0.57
10000 str6 31.89832402 0.608317308 0.038870056 0.499903846 0.023317308 -10141.17308 PF=0.47 W/L=0.6

5000 str7 28.84949367 0.598137255 0.016410256 0.579215686 0.005882353 -15243.47549 PF=0.22 W/L=0.4
10000 str7 31.83 0.601813725 0.02716129 0.579313725 0.015833333 -12132.11765 PF=0.36 W/L=0.58

As far as I know, features are just filters which filter out the signal. Either they filter better or worse...

To my understanding, your approach does not make any sense whatsoever as long as you keep adding bars as TP/TN which are very unlikely to have any impact on the development of the price.

If it did not make any sense, the results should be random, but they are not:
kappa and MC index are always > 0! It's just a way of creating labels, and I thought
classifiers should learn 'quiet market conditions'. It's not my idea; it was in the original
TradeFX system. That the profit is low is another story: the TP rate is low, the TN rate is high and accuracy is always around 0.6. So I believe it predicts something...
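
To make the "better than random" argument concrete, here is a small sketch that computes accuracy, Cohen's kappa and the Matthews correlation coefficient from a confusion matrix. It assumes that "MC index" means the Matthews correlation coefficient, and the counts are invented to mimic a low TP rate, a high TN rate and accuracy around 0.6; they are not taken from the Excel sheet.

```matlab
% Hypothetical confusion-matrix counts (not from the attached sheet).
TP = 10; FN = 30; FP = 10; TN = 50;
n  = TP + FN + FP + TN;

acc = (TP + TN) / n;                                   % observed agreement

% Agreement expected by chance, from the row and column totals.
pe = ((TP + FN) * (TP + FP) + (FP + TN) * (FN + TN)) / n^2;
kappa = (acc - pe) / (1 - pe);                         % Cohen's kappa

% Matthews correlation coefficient.
mcc = (TP * TN - FP * FN) / ...
      sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN));

fprintf('accuracy=%.3f kappa=%.3f MCC=%.3f\n', acc, kappa, mcc);
```

Both measures are 0 for a classifier that only matches the class frequencies, so small positive values indicate a weak edge over chance rather than a profitable one.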

FYI, I made another feature set, just six lags of prices, and for those features
performance went down a lot and kappa/MC are < 0.

Try to extract bars where e.g. price hits +- ATR/X within a fixed window (e.g. within the next 5 bars); then you can try cluster analysis to make sure you have actual concepts you can watch out for. Finally, create a strong learner on these positive samples, stressing the decision bounds until you have positive performance in cross-validation. Furthermore, ensemble methods are worth a shot. It is theoretically proven (Condorcet's jury theorem) that a majority vote over more and more independent classifiers, each with a success rate > 50%, approaches perfect accuracy. :smart:

I agree, it will be a more selective method; however, finding X for the ATR and the window
length can be a problem.

create a strong learner on these positive samples stressing decision bounds until you have positive performance in crossvalidation.

Yes, I was going to apply cost-sensitive over/undersampling to train on more positive samples, but cross-validation? Are you sure it's the correct method for time
series? As you will train on later samples, it will introduce a future leak and inflate
the results.
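
One common way around the future-leak concern is a walk-forward split, where every test block lies strictly after its training block. A minimal sketch, with made-up sample counts and window lengths:

```matlab
% Walk-forward (rolling-origin) splits over n time-ordered samples.
n        = 20;   % number of bars (assumed)
trainLen = 8;    % training window length (assumed)
testLen  = 4;    % test window length (assumed)

starts = 1:testLen:(n - trainLen - testLen + 1);
folds  = zeros(numel(starts), 4);   % [trainStart trainEnd testStart testEnd]
for i = 1:numel(starts)
    s = starts(i);
    trainIdx = s : s + trainLen - 1;
    testIdx  = s + trainLen : s + trainLen + testLen - 1;
    folds(i, :) = [trainIdx(1), trainIdx(end), testIdx(1), testIdx(end)];
end

disp(folds);
```

Unlike shuffled k-fold cross-validation, no fold here trains on bars that come after the bars it is tested on.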

Anyways - keep patience and dont give up yet

Not going to give up easily, have plenty of time to play with it at the moment

Krzysztof
 
Hi Fabwa,
You said "create a strong learner on these positive samples". Can you name the strong learner algos?
 
Thanks but according to wiki:

''In 2008 Phillip Long (at Google) and Rocco A. Servedio (Columbia University) published a paper at the 25th International Conference for Machine Learning suggesting that many of these algorithms are probably flawed. They conclude that "convex potential boosters cannot withstand random classification noise," thus making the applicability of such algorithms for real world, noisy data sets questionable''
 
By the way, I am working with a freelance programmer on a customized random forest algorithm with R package - MT4. I will share my results.
 

RF is one of these ensemble methods :) interested in your results for sure
 
random forest

I tried random forests already, and they underperformed compared to the other algos,
which is the reason they are not included in my 17-algo group. Tree algos are known to describe the training data very well and perform poorly out of sample. In my group I have J48 and LMT, and they underperform as well.

A good book for deciding which algo might work well is 'The Elements of Statistical Learning';
you can download it for free.

So if you make an R-MT4 system, how will you run a backtest? In MT4 it is rather impossible, as it is too slow. Perhaps MC could be better, but the best would be to do everything inside R, I think. I was also considering R-MT4 because of the available DLL, but gave up on this idea and stuck with MATLAB-MT4, doing the backtest in MATLAB.

Krzysztof
 

I am thinking of doing the backtesting in Chaos Hunter. The problem with Chaos Hunter is that I can only create buy/sell signals, and the CH fitness function is "maximize the equity". One can maximize the equity with large drawdowns, and I don't like large drawdowns. I prefer signals with the smallest MAE percent.
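
To illustrate why "maximize the equity" alone can be misleading, the sketch below compares two invented equity curves by final equity and by maximum drawdown. It works on the equity curve only; MAE is a per-trade measure and is not computed here. Nothing in it reflects how Chaos Hunter works internally.

```matlab
% Two made-up equity curves: A ends higher but suffers a deep drawdown.
equityA = [100 150  60 200];
equityB = [100 120 150 190];

% Maximum drawdown = largest relative drop from the running peak.
peakA = cummax(equityA);
peakB = cummax(equityB);
maxDdA = max((peakA - equityA) ./ peakA);
maxDdB = max((peakB - equityB) ./ peakB);

fprintf('A: final=%d maxDD=%.2f\n', equityA(end), maxDdA);
fprintf('B: final=%d maxDD=%.2f\n', equityB(end), maxDdB);
```

A fitness function that looks only at final equity prefers curve A, while one that also penalises drawdown would prefer curve B.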
 