Tuesday, July 17, 2012

Bad Statistics

Paraphrasing an article originally published by Ben Goldacre on his Bad Science blog in 2008:

"In 1954 a man called Darrell Huff published a book called How to Lie with Statistics. Chapter one is called 'the sample with built in bias'.

Huff sets up his headline: 'The average Yaleman, Class of 1924, makes $25,111 a year!' said Time magazine. That figure sounded pretty high: Huff chases it, and points out the flaws. How did they find all these people they asked? Who did they miss? Losers tend to drop off the alma mater radar, whereas successful people are in Who’s Who and the College Record. Did this introduce selection bias into the sample? And how did they pose the question? Can that really be salary rather than investment income? Can you trust people when they self-declare their income? Is the figure spuriously precise? And so on.

In the intervening fifty years this book has sold one and a half million copies, it’s the greatest selling stats book of all time (a very tough market) and it remains in print."

Perhaps one and a half million copies were never going to change public attitudes towards statistics single-handedly, but you might expect statistical reporting to have improved somewhat. It certainly has not in the mainstream media, where spurious surveys with headline-grabbing conclusions are quoted on a daily basis (I will have to try to dig out some examples to support this accusation), making the same mistakes that Huff was fighting against in the '50s. Perhaps the worst thing is that so few journalists are prepared to question the data and consider built-in bias. I suppose newspapers have effectively passed this responsibility on to the reader.
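
For readers who would rather see the problem than take it on trust, here is a rough sketch of the "sample with built-in bias" in Python. Everything in it is made up for illustration: the income distribution, the response rule (richer people are assumed more likely to be traced and to reply) and the function name `simulate_survey` are my assumptions, not Huff's or Time's figures.

```python
import math
import random


def simulate_survey(n=50_000, seed=0):
    """Return (true_mean, surveyed_mean) for a hypothetical population."""
    rng = random.Random(seed)

    # Hypothetical incomes: log-normal, so most are modest and a few are very large.
    incomes = [rng.lognormvariate(math.log(5000), 0.8) for _ in range(n)]
    median = sorted(incomes)[n // 2]

    # Assumed response model: the chance of being found and answering rises
    # with income (exactly 50% at the median income, higher above it).
    respondents = [x for x in incomes if rng.random() < x / (x + median)]

    true_mean = sum(incomes) / n
    surveyed_mean = sum(respondents) / len(respondents)
    return true_mean, surveyed_mean


if __name__ == "__main__":
    true_mean, surveyed_mean = simulate_survey()
    print(f"Mean income of everyone:         {true_mean:,.0f}")
    print(f"Mean income of those who reply:  {surveyed_mean:,.0f}")
```

Nobody lies in this simulation and every reply is accurate, yet the surveyed average comes out higher than the true one simply because of who ends up in the sample. Self-reported exaggeration and spurious precision would come on top of that.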
