Here's the beginning of the note; click on it for the full PDF write-up (it has some math which doesn't display well here):

There are a couple of different approaches to determining the best leverage to use for an overall portfolio. In finance 101 they teach Markowitz's mean-variance optimization, where the efficient portfolios lie along an optimal frontier and the best one among those is at the point where a line drawn from the risk-free portfolio/asset is tangent to the frontier. In the 1950s Kelly derived a new optimal leverage criterion inspired by information theory (which had been established by Shannon a few years earlier). The quantity being optimized in each of these two cases is typically called an "objective function": a function of possible asset weights/allocations that outputs a number giving the estimated relative quality of the portfolio weights. However, the two objective functions look quite different (peek ahead if you're unfamiliar). In this note I show the two are approximately equivalent, with the approximation being very close in realistic risk-return scenarios.

This is an equivalence that I haven't seen mentioned very often so I thought a formal version would be appropriate.
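
To make the claimed equivalence concrete, here is a minimal numerical sketch (not the derivation from the write-up). It assumes a single risky asset whose per-period excess return is `mu + sigma` or `mu - sigma` with equal probability, a mean-variance objective with risk aversion fixed at 1, and made-up values for `mu` and `sigma`; none of these choices come from the note itself.

```python
import numpy as np

# Hypothetical per-period excess return: mean 0.05%, standard deviation 1%.
mu, sigma = 0.0005, 0.01

# Candidate leverage levels (multiples of capital held in the risky asset).
f = np.linspace(0.0, 10.0, 10001)

# Kelly objective: expected log growth under the assumed two-point distribution.
kelly = 0.5 * np.log1p(f * (mu + sigma)) + 0.5 * np.log1p(f * (mu - sigma))

# Mean-variance objective with risk aversion 1 (an assumption of this sketch).
mean_var = f * mu - 0.5 * (f * sigma) ** 2

f_kelly = f[np.argmax(kelly)]
f_mv = f[np.argmax(mean_var)]
print(f"Kelly-optimal leverage:         {f_kelly:.3f}")
print(f"mean-variance-optimal leverage: {f_mv:.3f}")
print(f"mu / sigma^2:                   {mu / sigma**2:.3f}")
```

The two maximizers land close together because the second-order Taylor expansion of log(1 + x) is x - x^2/2, so for small per-period returns the expected log growth is roughly the mean minus half the variance.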

Bayesian statistics is an alternative to classical statistics. Classical stats is the one you're probably familiar with: confidence intervals, significance levels, p-values, estimators, and overfitting. Bayesian learning is more theoretically unified and optimal, and automatically builds in a preference for model simplicity, i.e. it resists overfitting. Before computers and sampling methods for marginalizing probability distributions (evaluating integrals), Bayesian learning was usually intractable except in some special cases. Bayesian learning models have still been adopted slowly because the math can look hard: in Bayesian models you usually add more variables (i.e. Greek letters) than the ones you start with.


The ideas of Bayesian learning can be mixed with existing systems too, so even when a full Bayesian model is hard to define, some ideas can be transferred.

Finally, there's a pretty long line of reasoning (known as Dutch Book arguments) which shows that if two people are playing a fair gambling game, and one uses a Bayesian model to bet while the other uses some other strategy, the Bayesian will pull ahead as they play more and more games. This is an extremely insightful argument that organisms such as humans must internally model uncertainty and randomness in a Bayesian framework. Otherwise, an equivalent but more nearly Bayesian animal will drive them into extinction.

Here's a really excellent tutorial on the basics. It's big, 30 MB; if you can't get through all of it, the first 10 pages will give you a good background.

For those who are familiar, or became familiar after reading MacKay's tutorial above, here's Bayesian learning applied to forecasting a noisy series with many redundant, correlated features, like stock market prices. This is a case where ordinary linear regression tends to fail, and other approaches struggle and require cross-validation loops.
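
As a rough illustration of why a Bayesian treatment helps with redundant, correlated features, here is a minimal sketch of Bayesian linear regression with a zero-mean Gaussian prior on the weights. It is not the model from the paper: the synthetic data, the prior precision `alpha`, and the noise precision `beta` are all assumptions chosen for the example, and a fuller treatment would infer the latter two instead of fixing them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed): 10 features that are near-copies of one hidden signal.
n_samples, n_features = 50, 10
signal = rng.standard_normal(n_samples)
X = signal[:, None] + 0.01 * rng.standard_normal((n_samples, n_features))
y = signal + 0.5 * rng.standard_normal(n_samples)

# Ordinary least squares: the individual weights are poorly determined
# because the features are nearly collinear.
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Bayesian linear regression with prior w ~ N(0, I / alpha) and
# Gaussian noise of precision beta (both values are assumptions).
alpha = 1.0
beta = 4.0  # corresponds to the noise std of 0.5 used above
posterior_precision = alpha * np.eye(n_features) + beta * X.T @ X
w_bayes = beta * np.linalg.solve(posterior_precision, X.T @ y)  # posterior mean

print("norm of OLS weights:      ", np.linalg.norm(w_ols))
print("norm of posterior mean:   ", np.linalg.norm(w_bayes))
print("sum of posterior weights: ", w_bayes.sum())
```

The prior pulls the poorly determined directions toward zero, so the posterior mean spreads a modest total weight across the redundant features instead of fitting large offsetting coefficients to noise.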

And one more on a very accurate Bayesian neural network (not to be confused with a "Bayesian network", which usually refers to a graphical model) which won a prediction competition. Making it Bayesian allowed the model to do automatic feature selection.

The last paper and the tutorial were by David MacKay, whose work on information theory I mentioned earlier. He's a good, clear author.



If you know of any good papers on Bayesian regression with noisy input/output and redundant features please share. Or on Bayesian feature selection.

Before programming a system, I find it helpful to diagram the flow and dimensionality of data. By breaking it down you can avoid complexity. Here's an example of a diagram (usually mine are hand drawn):


Notice that we actually start out with one dimension here so the maximum dimension is 4.

The system diagrammed above is more complicated than those implemented in the code I've posted previously. Specifically it uses multiple models instead of one and performs portfolio optimization at the end.

The following are some of the papers I read last week (the good ones).


Meucci (entropy pooling) is really interesting, accessible, and right on the cutting edge (2009) of portfolio theory. Unfortunately I haven't been able to work out how one might make the implementation tractable/efficient and the paper only gives a general picture. Looking at Meucci's recent presentations, it looks like this is still imperfect. The use of confidence seems rather heuristic for a guy like Meucci who's published on Bayesian portfolio theory before.

This Bayesian Vector Autoregression paper is just a straightforward example from the economic prediction literature. It reports very positive results, so it's another algorithm to keep in mind even though it's from 1986.

These last two, on stacking, are really interesting if you've thought about the problem of combining multiple signals/systems, especially ones that are likely somewhat correlated. Stacking is sort of like cross-validation, but for optimizing ensembles of models instead of a single model. The literature on ensembles/combining multiple learners has some really interesting unexplained results, especially the obvious one: why does combining even improve the overall accuracy over that of the individual models? These papers on stacking shed some light on it. Tibshirani & LeBlanc 1993; Breiman 1996.
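
For readers who haven't seen stacking before, here is a minimal sketch of the general idea, not a reproduction of either paper. The toy data, the two polynomial base learners, the 5-fold split, and the use of unconstrained least squares for the combining weights are all assumptions made for illustration; the papers discuss more careful choices for the combining step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem (assumed): noisy sine curve.
n = 200
x = rng.uniform(-1.0, 1.0, n)
y = np.sin(3.0 * x) + 0.3 * rng.standard_normal(n)

def poly_learner(degree):
    """Return a fit function that produces a polynomial predictor."""
    def fit(x_train, y_train):
        coef = np.polyfit(x_train, y_train, degree)
        return lambda x_new: np.polyval(coef, x_new)
    return fit

base_learners = [poly_learner(1), poly_learner(5)]

# Level 0: collect out-of-fold predictions, so every prediction comes from
# a model that never saw that point during fitting.
folds = np.array_split(rng.permutation(n), 5)
oof = np.zeros((n, len(base_learners)))
for test_idx in folds:
    train_mask = np.ones(n, dtype=bool)
    train_mask[test_idx] = False
    for j, fit in enumerate(base_learners):
        model = fit(x[train_mask], y[train_mask])
        oof[test_idx, j] = model(x[test_idx])

# Level 1: choose combining weights by least squares on those predictions.
weights, *_ = np.linalg.lstsq(oof, y, rcond=None)

mse_base = ((oof - y[:, None]) ** 2).mean(axis=0)
mse_stack = ((oof @ weights - y) ** 2).mean()
print("combining weights:     ", weights)
print("base out-of-fold MSEs: ", mse_base)
print("stacked MSE:           ", mse_stack)
```

Note that the stacked error here is measured on the same out-of-fold predictions used to fit the weights, so it is optimistic; an honest estimate would need a further held-out set.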


Finally, here's an excellent source for more research directly analyzing arbitrage opportunities. This is really a surprisingly good source.

Please share if you've read any interesting papers recently on machine learning or trading or anything else you think might be of interest.

I read Ralph Vince's new book, The Leverage Space Trading Model, this evening. It was released very recently, on May 26th '09. Previously I read one of his older books, The Handbook of Portfolio Mathematics. Vince writes about money management, i.e. position sizing, which tries to answer the question, "How much of my capital should I bet on any given trade in order to maximize my wealth over time?"


This one is much shorter at under 200 pages, which is definitely an advantage over the previous one. Overall, it's an interesting read, but with big issues:

Vince seems to be living in an insulated world. He apparently hasn't followed recent advances in portfolio optimization, and he calls Monte Carlo extremely difficult. He is extremely critical of things he shows only a basic understanding of.

For example, one of the major justifications he claims for his "new" theory is that mean-variance portfolio optimization ("modern portfolio theory") doesn't consider leverage. But in fact MVAR optimization is approximately equivalent to Kelly criterion betting. In all his examples of MVAR he forgets the risk-free asset, which is what allows leverage to come into the optimization. Furthermore, Monte Carlo simulation is trivial. Humorously, he doesn't seem to realize that his proposal is essentially equivalent to an approximate Monte Carlo.
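
To back up the point that Monte Carlo is trivial, here is a minimal sketch of estimating the growth-optimal leverage by simulation. It is not Vince's procedure: the Gaussian per-period excess returns, the values of `mu` and `sigma`, the zero risk-free rate, and the leverage grid are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-period excess returns of the risky asset (assumed Gaussian).
mu, sigma = 0.0005, 0.01
returns = rng.normal(mu, sigma, size=1_000_000)

# Leverage 0 means all cash, 1 means fully invested, above 1 means borrowing
# (at an assumed risk-free rate of zero).
leverages = np.arange(0.0, 10.5, 0.5)

# Average log growth per period for each leverage, reusing the same draws
# so the comparison across leverage levels is not blurred by sampling noise.
growth = np.array([np.mean(np.log1p(f * returns)) for f in leverages])

best_f = leverages[np.argmax(growth)]
print(f"simulated growth-optimal leverage: {best_f:.1f}")
print(f"mean-variance answer mu / sigma^2: {mu / sigma**2:.1f}")
```

Swapping the Gaussian draws for resampled historical returns or any other scenario generator changes one line, which is the sense in which the simulation is easy.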

He seems to have a tendency to become obsessed with one or two little problems of the mainstream/popular approaches to money management and then lash out against them, overdoing the nonconformity. He creates his own notation, metaphors, and names for theories (he may simply not be aware of similar work), and as I mentioned above, he doesn't really understand all the things he lashes out against. Overall it comes across sounding a little bit immature. Academic publishing is a long discourse, not a contest. Also, he sounds like he has a thesaurus on hand while he writes.

The last chapter is simply ridiculous. Basically he recommends that everyone should bet according to the scheme outlined in the St. Petersburg Paradox (Wikipedia). As I was reading it I kept thinking he would say it was a joke or just an idea to think about. But he's really saying that banks, individuals, and funds should go out and use this strategy, because supposedly humans only care about being profitable with the highest probability (he gives a couple of loose justifications from cherry-picked psychology/econ utility theory works). Essentially he's saying everyone should follow LTCM's strategy.

At the same time, if you can look past these issues, Vince has new ideas. The first few chapters will definitely expand your understanding of position sizing. I'm disappointed that Vince's creativity couldn't have illuminated more fruitful paths.

One thing I thought he might go into when I read the title of Chapter 6, "A Framework to Satisfy Both Economic Theory and Portfolio Managers", is applying optimal 'betting' to everyday non-financial choices. Any choice with an unknown outcome can be considered a bet, but some result in non-monetary gains, and maybe he could have analyzed these similarly.

Overall, I was disappointed that he didn't do more, but happy with the ideas I was able to selectively extract.

It's a great demo:

Plus he gives away a trading strategy that looks like it works.


Hopefully more will be added later; I'm also considering contributing.