I'm going to explain how I go about reading an academic paper, because a lot of people I talk to don't read them, even though that's where the best machine learning ideas are.
Michael Jordan is a Berkeley professor whose work I follow and read as much of as I can. On his website is the paper "A Flexible and Efficient Algorithm for Regularized Fisher Discriminant Analysis".
Fisher/linear discriminant analysis (LDA) is an algorithm for doing multi-class classification. A three-class problem, for example, would be predicting whether to buy, short, or do nothing.
At least skim the paper before reading the rest of this note. I like this one because they don't leave out any steps and the matrix algebra makes it easy to write in Matlab.
So first I read over the paper about 1.5 times taking notes. Then I walked over to one of my friends at work to ask what he thought of the paper. He didn't have any opinion on it at first.
Then I opened up Matlab to implement Algorithm 1 on page 8. Here's the code. It's as straightforward as possible: first I initialize all variables, then calculate the algorithm step by step. Where a line is based on an equation, the equation number is given to the right in a comment, e.g. %(15). All variables have exactly the same names as in the paper, so you can easily match them back and forth.
Next I wrote Algorithm 2 on page 9. Here's the code. As with the previous one, the comments in the code should make it extremely clear how it relates to the paper.
The code seemed to work, so I emailed the main author, Zhihua Zhang, for his version to check mine against. Here's his code. It turns out both of his algorithms were the same, with one variable renamed. His is harder to connect to the paper itself because he uses different variable names.
My friend at work wrote the best version. It's the fastest, and it's clearer than mine. Here's his code. He's a Matlab hacker and math Ph.D., so he recognized that centering X and calculating pseudo-inverses could be sped up a lot, among other things.
In the end, all of our versions gave the same results. The third version is the most efficient. Multiclass kernel LDA is a pretty good algorithm, and implementing the paper directly was a fun exercise. Hopefully you can look at my version to see how to go directly from the paper to code, and then at my friend's to see how to make it much more efficient.
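To give a feel for the kind of speed-up centering allows, here is a minimal sketch in plain Python rather than Matlab. It is not my friend's code, and it assumes that centering is written as multiplication by the centering matrix H = I - (1/n)11', which is a common way to state it. The function names and the toy matrix X are made up for illustration. Multiplying by H gives the same result as subtracting each column's mean, but the second form never builds the n-by-n matrix.

```python
def center_with_matrix(X):
    """Center the columns of X the literal way: build H = I - (1/n) * ones(n, n), then compute H * X."""
    n = len(X)
    d = len(X[0])
    # H has 1 - 1/n on the diagonal and -1/n everywhere else.
    H = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
    # Plain matrix product H * X, which costs on the order of n * n * d operations.
    return [[sum(H[i][k] * X[k][j] for k in range(n)) for j in range(d)]
            for i in range(n)]


def center_with_means(X):
    """Center the columns of X by subtracting column means, on the order of n * d operations."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


# Toy data (hypothetical): 3 samples in rows, 2 features in columns.
X = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
slow = center_with_matrix(X)
fast = center_with_means(X)
```

Both functions return the same centered matrix up to floating-point rounding. The gap in cost grows with the number of samples n, since only the first version ever forms an n-by-n matrix.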
13 comments:
Chintan,
Maybe in an ideal world it would be the same if you assume a model like a random walk but that's unrealistic. Multiclass classification has other uses too.
Regards,
Max
Not a random walk. But I guess only volatility behaves differently during upward/downward movements. Which variables do you think behave differently for positive/negative directions?
Chintan,
I don't know what you mean by "which variables". What are the choices?
It's just a really big assumption to say that the properties are exactly the same when something's going up or down, especially since time has to pass between going from up to down or from down to up.
Regards,
Max
One can classify directional/sideways situations. Classifying directional (buy)/directional (sell)/sideways won't make any sense, as basically there isn't any additional statistical property/variable which can identify upward/downward movement.
***In a sudden trend reversal from up->down / down->up, one has to use another tool to determine direction.
--Anything which is taken as training data is a variable.
I look at the long/short/do-nothing 3-class problem as two 2-class problems: go long/do-nothing and go short/do-nothing. Do-nothing is not only the default state; it also takes "extra oomph" from the data to go into either of the "exposed states". Unless I'm missing something, I don't see the advantage of attacking this kind of problem as a 3-state problem.
I realize that in the process of solving other types of problems there may be an advantage to this method, but on the typical long/short/do-nothing situation, I just don't see it. You're welcome to correct me if I've missed something.
Bill S
Bill,
That's how two-class classifiers are often made multiclass. I guess the advantage of true multiclass classification is that you only have to incur the computational cost of running the learner once rather than twice. As you go to even more classes (in other applications maybe) the benefit is greater because the number of pairwise comparisons grows combinatorially.
In this case, you're exactly right and your approach will work fine.
Regards,
Max
Bill & Max,
Classifying either buy/sell/sideways or buy/sideways, sell/sideways is like classifying small white ball/big white ball/black ball, or small white ball/black ball, big white ball/black ball.
One can either classify different colors or different sizes, but one cannot classify with a combination of such parameters.
What do you say?
Chintan,
In my experience, the magnitude of a move is basically impossible to predict.
When I classify a long/do-nothing or short/do-nothing, I'm only dealing with direction.
On those occasions where the distributions of win amounts and loss amounts are significantly different, I have to justify why that happened before I'll take on that position. Since that is man-hour intensive with no guarantee of an answer, I typically move on to another trade.
Bill S
Bill,
That might be true, I guess.
But rather than direction, my interest lies in classifying non-random and random/noisy movements. Is it possible?
Chintan,
I've tried that a few times, with no luck. I ended up with probability distributions of probability distributions. And, if I changed my fitting parameters, I would generate a different set of distributions of distributions.
The dangerous part was that I could potentially convince myself that I "knew" more about the data than what really existed.
The result? I threw that into my ever-increasing pile of stuff in my filing cabinets.
Bill S
Bill
Haa Haa Haa..
That's the most interesting aspect of trading. It's an unbreakable combination of art and science. So one first has to become artistic and feel the markets, and then put those feelings on a computer.
If you experiment with a good mixture of art and science, then I am sure you will end up making a profitable trading system.
Regards,
Chintan
Chintan,
I'm walking proof of Bayes' Theorem. I learned to start off with the assumption that I don't know what I'm doing. I'm usually right.
When it comes to the signal/noise issue, many many years ago, I configured various neural networks to discriminate between noise and a KNOWN REPEATABLE signal. The thing worked great.
I assumed (bad idea) that the same concept would work on financial data where my so-called KNOWN signal was a centered moving average of the raw data. The training data worked great, but the unseen test data was worse than a coin flip. It's fair to say that I was disappointed in my newly acquired education.
Good Luck in whatever you do,
Bill S