I wrote a note on how to combine multiple predictors, each with accuracy only marginally above baseline, into an ensemble predictor that is much more accurate, via unanimous voting. It has a few simple formulas which don't work in HTML, so here it is in PDF format.
I'm a bit worried because it looks much more mathematically complicated than it is (it's only simple fraction arithmetic and the most basic probability) and because it has no pictures. There is a table at least. It's written to be as intuitive as possible though, like everything else.
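For anyone who would rather poke at numbers than read formulas, here is a minimal sketch of the kind of fraction arithmetic involved. It is not copied from the PDF. It assumes binary predictions and fully independent predictors that are each right with the same probability p. The function name and the 55% example accuracy are made up for illustration.

```python
from fractions import Fraction


def unanimous_vote_stats(p, n):
    """Accuracy and coverage of a unanimous vote among n predictors.

    Assumptions (illustrative, not taken from the PDF): the prediction is
    binary, the predictors are independent, and each one is right with the
    same probability p.
    Returns (accuracy when all n agree, fraction of cases where all n agree).
    """
    p = Fraction(p)
    all_right = p ** n        # every predictor is correct
    all_wrong = (1 - p) ** n  # every predictor is wrong (they still agree)
    coverage = all_right + all_wrong
    return all_right / coverage, coverage


if __name__ == "__main__":
    # Hypothetical predictors that are right 55% of the time.
    for n in (1, 3, 5, 10):
        acc, cov = unanimous_vote_stats(Fraction(11, 20), n)
        print(f"n={n:2d}  accuracy when unanimous={float(acc):.3f}  "
              f"share of cases with a unanimous vote={float(cov):.3f}")
```

The trade-off is visible straight away: under these assumptions the accuracy of a unanimous vote climbs with n, but the share of cases where everyone agrees shrinks, so the ensemble abstains more often. If the predictors are correlated, the gain is smaller than this sketch suggests.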
I'd like to hear if anyone's using techniques like this.
4 comments:
Hello,
The problem with ensemble predictors is that the members must be independent of and uncorrelated with each other. I have played with that (the treebagger, following your earlier post) and found that the ensemble didn't improve the classifier much. I found that some instances of the training data are inherently hard to predict, and the majority of trees fail on them. So I tried two predictors: the first predicting the target and the second predicting whether the first one fails or not. That didn't have much success either. I also tried continuously updating the ensemble by adding trees over time and deleting the poorly performing ones. To make a long story short, the best-performing classifiers were not trees, treebaggers, naive Bayes, etc. but, surprisingly, dynamic neural networks and incremental learning (try the newdtdnn MATLAB command and play with various parameters). I did find the treebagger useful for selecting the best-performing variables.
Thanks for the response Anon. That's been my experience too and you're right on about it being hard to make them independent.
Regards,
Max
Great blog post, thanks for putting the PDF up with it. I'm not so sure it's even possible to find independent inputs if they're reliable. Isn't it sufficient, though, merely to look for inputs that use disparate data in predicting the same thing?
Thanks Ralph. Yes, the claim should perhaps not have been stated so strongly.
Regards,
Max