“The Signal And The Noise” Book Review: Building Better Predictions

So I just finished reading Nate Silver’s The Signal and the Noise, and I have to say it’s one of the more thought-provoking non-fiction reads I’ve enjoyed in quite a while, offering insights which I intend to integrate into my hockey writing more often.

Anyone whose writing involves making predictions or forecasts could benefit from these suggestions, not just us hockey bloggers!

Who is Nate Silver?

You may have heard of Nate Silver and his FiveThirtyEight blog during the 2012 election cycle, when most media outlets talked about an incredibly close Presidential race, with some even predicting a likely Mitt Romney win. Silver’s forecast, driven by state-by-state analysis built upon aggregated polling data, not only put forth a 90%+ probability that President Obama would be reelected, but also nailed the state-by-state electoral results.

It was a resounding public triumph for stats geeks (Silver has a background blogging on baseball analytics) at the expense of political pundits who are often portrayed as experts, but are really nothing more than blowhards for their particular interest group. In retrospect, some said that Silver’s projections were obvious given the data that was out there. If so, then why weren’t others presenting it that way?

Building Better Predictions

In The Signal and the Noise, Silver outlines the nature of forecasting and offers advice on how people in all sorts of fields can make more useful predictions. It’s important to understand that “more useful” does not always mean “more accurate”. Perhaps the most essential element of forecasting is understanding just how confident you can really be, and avoiding reliance on a model which attempts to be overly precise.

There are two ideas in this book which I could see building into my hockey writing, which often involves making predictions for individual and team performance:

1) Expressing Predictions as a Probabilistic Range

Our tendency in making predictions is to put a specific number out there, such as “Patric Hornqvist will score 30 goals for the Nashville Predators this season”, which is a bit of a fool’s errand. There is only a small chance that the exact number will be hit, whereas a more useful forecast might put this in terms such as “Hornqvist is 50% likely to score between 26 and 34 goals for the Preds this season”.
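To make the idea of a probabilistic range concrete, here is a minimal sketch. It assumes, purely for illustration, that a player's season goal total follows a Poisson distribution around an expected value of 30; the function names and the Poisson assumption are mine, not anything from the book. It finds the lower and upper quartiles, which bracket a range holding at least half of the probability.

```python
import math

def poisson_cdf_table(mean, max_k):
    """Return [P(X <= 0), ..., P(X <= max_k)] for a Poisson variable."""
    pmf = math.exp(-mean)  # P(X = 0)
    total = 0.0
    table = []
    for k in range(max_k + 1):
        if k > 0:
            pmf *= mean / k  # P(X = k) derived from P(X = k - 1)
        total += pmf
        table.append(total)
    return table

def central_range(mean, coverage=0.5, max_k=200):
    """Quantile bounds (low, high) that hold at least `coverage` of the probability."""
    cdf = poisson_cdf_table(mean, max_k)
    tail = (1.0 - coverage) / 2.0
    low = next(k for k, c in enumerate(cdf) if c >= tail)
    high = next(k for k, c in enumerate(cdf) if c >= 1.0 - tail)
    return low, high

# 30 is a hypothetical expected goal total, not a real projection.
low, high = central_range(30)
print(f"Roughly a 50% chance of scoring between {low} and {high} goals")
```

A real projection would likely need a wider distribution than Poisson, since injuries, ice time, and linemates add uncertainty that this toy model ignores.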

2) Refining Predictions as New Information Arises

There are really two aspects of this recommendation. First, once you build a model (in this case projecting a hockey player’s performance), you can adjust projections as new information becomes available, i.e. games are actually played. There’s nothing wrong with feeding that information into your model and updating the prediction, saying for example “based on his hot scoring pace in the first half, Patric Hornqvist has a 40% chance of topping 40 goals this season” at the halfway point.

Silver talks about this as a Bayesian approach, which is all about setting new information into proper perspective when using it to change our view of the future.
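One simple way to sketch this kind of update is a Gamma-Poisson model, which is my choice for illustration and not something the book prescribes. The preseason projection acts as a prior worth a certain number of games of evidence, and the goals actually scored get blended in as games are played. All numbers and function names below are hypothetical.

```python
def update_rate(prior_goals, prior_games, observed_goals, observed_games):
    """Posterior mean goals per game under a Gamma-Poisson model.

    The prior behaves like `prior_goals` scored over `prior_games`
    imaginary games; observed results are simply added on top.
    """
    return (prior_goals + observed_goals) / (prior_games + observed_games)

def project_season(prior_goals, prior_games, observed_goals, observed_games,
                   season_games=82):
    """Goals so far plus expected goals over the remaining schedule."""
    rate = update_rate(prior_goals, prior_games, observed_goals, observed_games)
    remaining = season_games - observed_games
    return observed_goals + rate * remaining

# Hypothetical: projected for 30 goals in 82 games, then 22 goals in the first 41.
projection = project_season(30, 82, 22, 41)
print(f"Updated projection: {projection:.1f} goals")
```

The updated figure lands between the original projection of 30 and the 44 goals implied by simply doubling the first-half pace, which is the point: the hot start counts, but it does not erase everything believed beforehand.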

In addition, once enough data comes in one can review whether the model itself has structural issues which need to be addressed before the next full-scale round of predictions is made. Do I consistently give too much credit to players on top teams? Do I build too much optimism into the growth potential of young players? Models only get better when you consider their weak points in the harsh light of their results.
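A question like "do I give too much credit to players on top teams?" can be checked with a quick look at the average miss per group. This is a minimal sketch with made-up numbers and hypothetical group labels; a positive value means the projections ran too high for that group.

```python
from collections import defaultdict

def mean_error_by_group(records):
    """Average (predicted - actual) per group; positive means over-prediction."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += predicted - actual
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

# Made-up past projections: (group, predicted goals, actual goals)
past = [
    ("top team", 30, 24),
    ("top team", 25, 22),
    ("other", 20, 21),
    ("other", 15, 14),
]

bias = mean_error_by_group(past)
for group, error in bias.items():
    print(f"{group}: {error:+.1f} goals per player")
```

With only a handful of players per group the averages will bounce around, so a gap like this is a prompt to dig deeper and not proof of a flaw.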

Putting These Ideas To Work

You might be wondering how this type of thinking can be practically applied. For a good working example of probabilistic predictions at work, constantly updated as new information becomes available, just check out SportsClubStats.com. That site is dedicated to the question “will this team make the playoffs?”, covering the NFL, NHL, NBA, Major League Baseball, and more leagues, including European soccer.

If we consider the Nashville Predators, for example, as of this writing SportsClubStats says they have a 33.1% chance of making the playoffs. There are also any number of variations on that question, such as how many points in the standings they might finish up with, or which opponents they might meet in the playoffs.

All of those predictions are expressed as probabilities, which gives a more nuanced sense of how the various factors come together. For example, right now the Predators sit in 8th place in the Western Conference, but their chances of making the playoffs are actually 10th-best (based on factors like games played so far, and the strength of all teams’ remaining schedules).

In addition, SportsClubStats has a built-in mechanism for updating its projections as games are played, and in fact we can see how various results will impact those projections ahead of time. Sticking with our example, the Predators have a road game coming up on Monday against the defending Stanley Cup champions, the Los Angeles Kings. If they win that game without going to overtime, the Preds’ playoff chances rise by 8.9%. If they lose, those chances drop by 6.9%. Yup, it’s a big game!

SportsClubStats updates all of this information on a daily basis, allowing fans to follow up on a night’s scoreboard results to see how the chase for the playoffs is shaping up.
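To see why a single game can swing the odds so much, here is a toy calculation. It is not how SportsClubStats actually works; it assumes a fixed points cutoff for the final playoff spot, a coin-flip chance in every remaining game, and no overtime points, and every number in it is hypothetical.

```python
from math import comb

def playoff_chance(points, games_left, cutoff, p_win=0.5):
    """P(points + 2 * wins >= cutoff) if each remaining game is won with p_win."""
    total = 0.0
    for wins in range(games_left + 1):
        if points + 2 * wins >= cutoff:
            total += (comb(games_left, wins)
                      * p_win ** wins
                      * (1 - p_win) ** (games_left - wins))
    return total

# Hypothetical standings: 50 points, 30 games left, 84 points assumed to be enough.
points, games_left, cutoff = 50, 30, 84

now = playoff_chance(points, games_left, cutoff)
if_win = playoff_chance(points + 2, games_left - 1, cutoff)   # next game won
if_loss = playoff_chance(points, games_left - 1, cutoff)      # next game lost

print(f"Now: {now:.1%}  after a win: {if_win:.1%}  after a loss: {if_loss:.1%}")
```

The current chance is just the weighted average of the win and loss scenarios, which is why the two possible swings can be worked out before the puck drops.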

Can I implement a similar methodology for individual and team performance as part of the hockey analysis I do over at On The Forecheck? I think the idea has tremendous merit, and it wouldn’t have occurred to me were it not for The Signal and the Noise.