Wednesday, 12 December 2012

The Signal and the Noise

Ever since I was dragged along to a baseball game in 1996 and watched Steve Finley hit two home runs, Ken Caminiti hit a home run, and then Trevor Hoffman close out a tense ninth inning to help the San Diego Padres beat the evil Los Angeles Dodgers, I've been an avid baseball (and Padres) fan. I'm also a political junkie, dating back to my undergraduate days when I was treasurer of the student union. I once spent a day sitting in a hotel room in Palo Alto watching the World Series of Poker on ESPN rather than working on the scientific talk I was supposed to give the following day. And in 1992, when I was a PhD student interested in computational predictions of membrane protein structure, I brought my friend Jon in to help analyze the data. He used some black magic called Bayesian statistics and produced some interesting findings that, at the time, didn't match my preconceptions, so I didn't take it any further. Years later another group published experimental data showing that Jon's findings were correct (sorry Jon!).

What do all these things have in common?

I've just finished reading Nate Silver's excellent book "The Signal and the Noise: Why so many predictions fail - but some don't". Nate is a Bayesian statistician who has had a very unusual career. Amongst his claims to fame: he developed PECOTA, a system for predicting the future performance of baseball players; he spent a couple of years making a lucrative living playing online poker; and he now runs a highly successful blog, FiveThirtyEight, which provides statistical analysis of political polling data. I first started following FiveThirtyEight during the 2008 US presidential campaign and have been a regular reader ever since. Nate has now become a celebrity thanks to his uncannily accurate predictions of both the 2008 and 2012 US presidential elections.

Since I work in scientific fields that rely on empirical analysis of very large data sets, and we are also moving towards iterative modeling of both bacterial communities and the metabolism of individual bacterial cells, I found Nate's book fascinating. And I think the take-home message is very important for us: anyone doing modeling or forecasting should avoid overconfidence and recognise the degree of uncertainty in their models or predictions.


1 comment:

  1. My uncle follows baseball and really likes to play too. I feel I should share this blog with him, as he would like to read about his favorite sport.