I watched a video lecture on data analysis, as I often do. Here's the video: the Hilbert Spectrum. Here are the notes I took while watching it:


The idea is appealing: decompose a time series into underlying trends of different periodicities. In the trading world this might correspond to a long term macroeconomic trend, a monthly pattern occurring around announcements of the federal funds rate, and a short term pattern caused by supply and demand and liquidity constraints. The researcher in the video was trying to study ocean waves with satellite data. Obviously the two processes may behave quite differently.

I implemented the Hilbert spectrum algorithm because I was excited about it. Here's the R script. For example, here's what the spectrum looks like for GOOG & TYP share prices:

At the top is the actual price series and below that are the series with the high frequency patterns removed one by one. They look nice.

Here's the code, hspect.r, in the language R. R is basically an advanced calculator that's also programmable.

The problem is that this is a type of smoother, useful for summarizing and exploring data, but useless for extrapolation or prediction. This family also includes cubic spline interpolation and LOESS. If you extend these curves past the edges to make predictions, the estimates will have extremely high variance. Making predictions with one of these smoothers is equivalent to throwing away almost all your data except the bit at the very end, and then fitting either a 3rd degree polynomial (in cubic spline interpolation) or a straight line (in LOESS) to it.
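To put a rough number on that edge variance, here is a minimal sketch in Python (not the R script from this post). It assumes equally spaced observations and stands in for the smoothers with the simplest possible versions: a cubic through the last four points and a straight line through the last two. A real spline's end segment depends on its boundary conditions and real LOESS uses a weighted fit over a window, so treat the numbers as illustrative. The function names are made up for this example.

```python
from fractions import Fraction


def extrapolation_weights(n_points, steps_ahead):
    """Weights that the polynomial through the last n_points equally
    spaced observations puts on each of them when it is evaluated
    steps_ahead past the final observation (Lagrange form)."""
    xs = list(range(n_points))
    target = n_points - 1 + steps_ahead
    weights = []
    for i, xi in enumerate(xs):
        w = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                w *= Fraction(target - xj, xi - xj)
        weights.append(w)
    return weights


def variance_gain(weights):
    """Forecast variance per unit of independent noise variance
    in each observation: the sum of squared weights."""
    return sum(w * w for w in weights)


# Simplified stand-ins (assumptions): 2 points ~ local line, 4 points ~ local cubic.
for name, n_points in [("line through last 2", 2), ("cubic through last 4", 4)]:
    for steps in (1, 2, 5):
        w = extrapolation_weights(n_points, steps)
        print(name, "| steps ahead:", steps,
              "| weights:", [str(x) for x in w],
              "| variance gain:", variance_gain(w))
```

One step past the end, the cubic's forecast is -1, 4, -6, 4 times the last four observations, so independent noise of variance 1 in each price turns into variance 69 in the forecast. The straight line's weights are -1 and 2, for a variance of 5, and both get worse the further out you go.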

Cubic spline interpolation is especially insidious because most people don't understand it, and the confusing name doesn't help. Everyone knows how to interpret the first two derivatives: velocity and acceleration. The third derivative is interpretable in two different contexts: physically as burst (more commonly called jerk), and geometrically as how quickly the bend of a curve changes. Burst is how much you feel it when you're standing in an elevator and it starts going up. If the elevator is designed well, burst will be small and steady and you will barely feel it. It's also important in roller coaster design to ensure a smooth ride. In terms of curve shape, if the third derivative is constant, the curve will be pleasing to the eye, as if it were drawn by sweeping hand motions. That's the qualitative explanation. This latter, geometric interpretation is what cubic spline interpolation is based on: it fits a nice-looking piecewise cubic polynomial (one piece between each pair of adjacent points) whose 1st and 2nd derivatives match at each knot.
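To make the "matches 1st and 2nd derivatives at each knot" part concrete, here is a minimal Python sketch (again, not the R script from this post). It assumes the "natural" variant of the cubic spline, where the second derivative is set to zero at both ends; other boundary conditions exist. The prices are invented and the function names are just for this example.

```python
def natural_cubic_spline(xs, ys):
    """Return one (a, b, c, d) tuple per interval so that on [xs[j], xs[j+1]]
    S(x) = a + b*t + c*t**2 + d*t**3 with t = x - xs[j].
    Natural boundary conditions: second derivative is zero at both ends."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the c coefficients.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution, then recover b and d from c.
    c = [0.0] * (n + 1)
    b = [0.0] * n
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])
    return [(ys[j], b[j], c[j], d[j]) for j in range(n)]


def knot_gaps(xs, pieces):
    """At each interior knot, the jump (right piece minus left piece) in the
    value and in the 1st, 2nd and 3rd derivatives."""
    gaps = []
    for j in range(len(pieces) - 1):
        a, b, c, d = pieces[j]
        h = xs[j + 1] - xs[j]
        left = (a + b * h + c * h**2 + d * h**3,
                b + 2 * c * h + 3 * d * h**2,
                2 * c + 6 * d * h,
                6 * d)
        a2, b2, c2, d2 = pieces[j + 1]
        right = (a2, b2, 2 * c2, 6 * d2)
        gaps.append(tuple(r - l for l, r in zip(left, right)))
    return gaps


# Invented daily prices, equally spaced in time.
xs = [0, 1, 2, 3, 4, 5]
ys = [10.0, 10.4, 9.9, 10.8, 11.5, 11.1]
pieces = natural_cubic_spline(xs, ys)
gaps = knot_gaps(xs, pieces)
for knot, gap in zip(xs[1:-1], gaps):
    print("knot", knot, "jumps (value, 1st, 2nd, 3rd):", gap)
```

The first three jumps at every knot come out as zero up to rounding error, while the third derivative is constant within each piece and jumps from one piece to the next. Note that the last piece is pinned down only by the final few points and the boundary condition, which is why extending it past the data is so fragile.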

Unfortunately you have to understand these methods to know not to use them and not to trust systems based on them. I've had people contact me about using cubic spline interpolation for prediction but it's just not applicable.

Feel free to add your own thoughts.

8 comments:

Anonymous said...

Hi Max,

Nice article... And I like that you have a Free and Open example we can all follow along with...

However, your comment "R is basically an advanced calculator that's also programmable." really sells R short...

Here is a link to an article from The New York Times http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

I'll let the article speak for R ;-)

Keep up the good work!!!

Cordially,

-Digital Dude-

"Friends don't let friends drink and derive." -unknown-

Max Dama said...

Haha Digital, I'm trying to trick the non-programmers who use indicator-type strategies into making the jump to a real language. I agree with you of course.

Don't drink and derive: http://worrydream.com/media/absolute_value.jpg

Regards,
Max

Anonymous said...

Ever thought about Dynamic Linear Models / State Space time series?

It pretty much does what you want (decomposes a time series into trend, level, seasonal, and other unobservable components), plus it's pretty good for forecasting.

- Henning

Max Dama said...

I haven't looked into those. I'll try to soon.

Max

Anonymous said...

Max,

Good article. I really like the fact that you posted the R code. I'm amazed at how much it speeds up the learning process.

As far as uselessness in trades, you're right. To be useful, an indicator has to have predictive qualities. Interpolating has nothing to do with predicting.


Bill S

Max Dama said...

You're welcome Bill. I'll try to post the code in an open source language more often.

I'm probably switching to Python, using numpy, scipy, matplotlib, and ipython / pylab. It's a very nice language; we'll see. It also has the easiest IB API I've found, IbPy. Basically the Java API is re-written in Python in a more intuitive fashion than the EWrapper approach.

Regards,
Max

Anonymous said...

Max,

I'm in the process of moving over to Interactive Brokers (I assume that's what IB stands for in your previous post).

The frequency of my trades is currently low enough that their Excel ActiveX interface is good enough for now. However, I'll continue to evaluate their APIs and any language that might provide an advantage.

Do you think Python is a better path than Perl?


Bill S

Max Dama said...

Bill,

I don't know Perl, but as far as I can tell most people prefer Python.

Max