Stephen Henderson reviews Nate Silver's book "The Signal and the Noise", looking at the lessons it holds.
I bought Nate Silver’s new book The Signal and the Noise a few weeks back, although I had never heard of him – for once Amazon successfully suggested a book I might be interested in.
I finished it just as his accurate polling predictions of the US election were making worldwide news. Thanks to him I had an early night last Tuesday and slept peacefully safe in the knowledge I would (most probably) wake up to a Barack Obama presidential win.
It’s worth noting that, though undoubtedly a smart guy, his statistical polling analysis is not hugely complex or even novel.
Several other bloggers, such as Sam Wang (Princeton Election Consortium) and Simon Jackman (Pollster), and even an interested amateur, Gianluca Baio, who popped up last week, made similarly accurate predictions.
They all use the same publicly available polling data and are all quite open about their Bayesian statistical methods. There is no secret sauce.
But then that’s partly why their success is so satisfying. Only days ago even the BBC and the FT were saying the election was too close to call, whilst others such as Janet Daley in The Daily Telegraph just knew Romney would win (she then informed us that Romney would at least win the popular vote). Yet the truth was always there in plain sight for an informed amateur to see. Of course, even a moron can be right sometimes, so it is important to demonstrate a sensible method.
In his book’s chapter on the L’Aquila earthquake and the subsequent trial of several Italian seismologists, Silver points out that one Giampaolo Giuliani had predicted an earthquake. Except it was about a fortnight earlier, 60 kilometres away in Sulmona, and was apparently based on radon gas levels, lunar perihelion and the alignment of Venus. Epistemology matters.
So what is the method?
In the US there are many state-level polls carried out throughout the campaign. First you can consider how each poll has historically leaned left or right and adjust for that (bias).
Then, instead of relying on a single poll, you can aggregate results – weighting the best ones more heavily – to reduce the sampling error (precision). Newer polls are used to update a running estimate of the vote (Bayesian inference), with the likelihood diminishing away from the central estimate. With each state’s electoral votes being a fixed number, you can then compute lots of different scenarios probabilistically (simulation).
This is the basic method, though Silver adds some further demographic and neighbouring-state adjustments.
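Stripped to its essentials, that pipeline – bias-correct, weight, then simulate – might be sketched as follows. All poll figures, house biases and electoral-vote counts here are made up for illustration; Silver’s actual model is far richer than this.

```python
import random

# Hypothetical polls per state: (reported_share, sample_size, house_bias)
# where house_bias is how many points the pollster has historically
# leaned towards the candidate (subtracted to correct for it).
polls = {
    "Ohio":    [(0.51, 800, +0.01), (0.49, 1200, -0.02)],
    "Florida": [(0.48, 1000, 0.00), (0.50, 600, +0.01)],
}
electoral_votes = {"Ohio": 18, "Florida": 29}

def state_estimate(state_polls):
    """Bias-correct each poll, then average weighted by sample size."""
    total_n = sum(n for _, n, _ in state_polls)
    return sum((share - bias) * n for share, n, bias in state_polls) / total_n

def simulate(n_sims=10_000, error_sd=0.03):
    """Monte Carlo: perturb each state's estimate with random error,
    tally electoral votes, and return the fraction of simulated wins."""
    wins = 0
    for _ in range(n_sims):
        evs = sum(votes for state, votes in electoral_votes.items()
                  if state_estimate(polls[state]) + random.gauss(0, error_sd) > 0.5)
        if evs > sum(electoral_votes.values()) / 2:
            wins += 1
    return wins / n_sims
```

One important simplification: this draws each state’s error independently, whereas a serious model would correlate errors across states – a national polling miss moves every state at once.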
Criticisms of this method – in so far as they are intelligible – fall roughly into three categories:
1). We just don’t like and/or trust your Liberal predictions.
2). If Mitt Romney has 49.5% of the vote then he has a 49.5% chance to win!
3). The polls are biased, and we either don’t realise the model adjusts for this or think the adjustment doesn’t take something new into account.
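The second criticism confuses vote share with win probability. As a quick illustration with hypothetical numbers, assuming normally distributed polling error on the margin between the candidates:

```python
from math import erf, sqrt

def win_probability(mean_margin, margin_sd):
    # P(true margin > 0) under a normal error model:
    # the standard normal CDF evaluated at mean_margin / margin_sd.
    return 0.5 * (1 + erf(mean_margin / (margin_sd * sqrt(2))))

# A candidate polling 51-49 has a +2 point margin; with a
# 2-point standard deviation on that margin:
print(round(win_probability(2.0, 2.0), 2))  # -> 0.84, not 0.51
```

A 51–49 polling split is a one-standard-deviation lead here, which translates into roughly an 84% chance of winning – far from a coin flip.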
The third of these is actually a valid criticism. In the US the Bradley effect was named after Tom Bradley who lost the Los Angeles mayoral election in 1982 despite a consistent poll lead.
Seemingly many people told pollsters they would vote for him but didn’t – perhaps because he was black?
With any prediction we should always be careful that a well-working model may fail when some new variable we have not encountered comes along. Financial modellers have lost billions by trusting methods that worked brilliantly right up to the moment they didn’t and blew up the company. And yet Obama had been elected as a black man four years previously, and any Bradley effect or other known biases would already be adjusted into Silver’s polling model.
It would have taken something new and uncertain to throw the polls off. There were no “unknown unknowns”.
For example, it appears the Romney camp itself was totally convinced the polls were biased and had told Romney and Ryan they would win. They even planned a huge victory fireworks display.
They deeply believed that whilst the Obama camp had an exceptional turnout of black and Latino voters in 2008, those voters would be less enthused this time and wouldn’t turn out in the same numbers. They were wrong. Though the demographics and opinions of the US may be steadily changing, voters’ behaviour is not – this was just wishful thinking.
And what of the coming UK elections? Could similar methods be used here? It’s perhaps a little more challenging.
Our constituencies are smaller and most polling is national rather than local. There are more parties including locally strong nationalists and potential tactical voting. Poll aggregators such as Electoral Calculus use national poll averages with some success but they lack the detail and depth of information available in the US.
This need for local polling, particularly of marginal seats, has been recognised by some, such as the Conservative peer Lord Ashcroft, who is usually sensible enough to admit uncomfortable polling facts. Such data will be useful to the Conservatives as the election approaches, but he may become less and less publicly forthcoming or frank.
So I think it unlikely an amateur will be able to repeat Silver’s (and others’) remarkable prescience with the publicly available UK polling data. I have been wrong before, though.
There is a lot more incredibly timely and fascinating material in The Signal and the Noise beyond the polling discussion.
It’s a book about all sorts of forecasting. There is a superb discussion of all the myriad failures in foresight that led to the credit crunch and in later chapters of the efficient market hypothesis.
There is a remarkably contemporary discussion of earthquakes and the L’Aquila case, plus a chapter on hurricane forecasting that I read days after Hurricane Sandy. The earthquake chapter also touches on why Fukushima was perhaps not prepared for a magnitude 9 earthquake. Then there are chapters on the uncertainty of epidemiological predictions surrounding flu outbreaks – and the politics of getting it wrong – that made me think of ash trees and badger culls.
Plus there is a chapter on how bad economic predictions – of GDP or inflation, say – really are, which made me roll my eyes on behalf of our own OBR. As I was reading it, I realised it wasn’t just a coincidence that all these issues felt contemporary – the news really was dominated by political fallout over faulty predictions.
In the past few days his book has become a bestseller because of the US election, but its wider themes of handling uncertainty are timeless.