
Come write me down

There’s a particular form of serendipity that comes from learning something in one area which resolves a puzzle, or fills a gap in your thinking, in another area entirely. It’s all the more serendipitous – and pleasant – if you didn’t realise the gap was there.

This line of thought was prompted by this piece on the excellent FactCheck blog, which made me realise that I’d always been a bit dubious about the notion of “policy-based evidence”. OK, it’s a neat reversal – and all too often people who say they’re making evidence-based policy are doing nothing of the sort – but is the alternative really policy-based evidence? Doesn’t that amount to accusing them of just making it up?

Thanks to Cathy Newman at FactCheck, I realise now that I was looking at this question the wrong way. Actually “policy-based evidence” means something quite specific, and it hasn’t (necessarily) got anything to do with outright fraud. Watch closely:

Iain Duncan Smith has been celebrating the government’s benefits cap. Under the welfare reform bill, state handouts will be capped at £26,000 a year so that “no family on benefits will earn more than the average salary of a working family,” i.e. £35,000 a year before tax.

Today, the work and pensions secretary was delighted to cite figures released by his department which he said were evidence that the policy is already driving people back into work. Of 58,000 claimants sent a letter saying their benefits were to be capped, 1,700 subsequently moved into work. Another 5,000 said they wanted support to get back into work, according to the figures.

OK, this is fairly simplistic thinking – We did a new thing! Something happened! Our thing worked! – but it’s something like a legitimate way to analyse what’s going on, surely. It may need more sophisticated handling, but the evidence is there, isn’t it?

Well, no, it isn’t.

In order to know how effective the policy had been, we would need to know the rate at which people on benefits worth more than £26,000 went into work before the letter announcing the changes was sent, and compare it to after the letter was received. But those figures aren’t available.

“[These figures do] not reveal the effect of the policy,” Robert Joyce, senior researcher at the Institute for Fiscal Studies told us. Mr Joyce went on: “Indeed, this number is consistent with the policy having had no effect at all. Over any period, some fraction of an unemployed group will probably move into work, regardless of whether a benefits cap is about to be implemented. The number of people who moved into work as a result of the policy is 1,700 minus the number of people who would have moved into work anyway. We do not know the latter number, so we do not know the effect of the policy.”

The number of people, in a given group of claimants, who signed off over a given period is data. Collecting data is the easy part: take five minutes and you can do it now if you like. (Number of objects on your desk: data. Number of stationary cars visible from your window: data. Number of heartbeats in five minutes: data.) It’s only when the data’s been analysed – it’s only when we’ve compared the data with other conditions, compared variations in the data with variations in those conditions and eliminated chance fluctuations – that data turns into evidence. The number of people who moved into work as a result of the policy is 1,700 minus the number of people who would have moved into work anyway: that number would be evidence, if we had it (or had reliable means of estimating it). The figure of 1,700 is data.
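
To see how little the 1,700 figure tells us on its own, here’s a minimal sketch in Python; the baseline rate is invented (it’s precisely the number the DWP didn’t publish), so this shows the shape of the calculation, not an estimate of the answer.

```python
# Data vs evidence, in miniature. The two figures we have are data;
# the baseline rate is hypothetical (it's the number we don't have).
claimants = 58_000        # claimants sent a letter about the cap (data)
moved_into_work = 1_700   # of those, the number who then found work (data)

baseline_rate = 0.03      # ASSUMPTION: 3% of comparable claimants move
                          # into work over the same period anyway

expected_anyway = baseline_rate * claimants       # 1,740
policy_effect = moved_into_work - expected_anyway

print(f"expected to move into work anyway: {expected_anyway:.0f}")
print(f"apparent effect of the policy:     {policy_effect:.0f}")
# With a 3% baseline the 'effect' is slightly negative: the observed
# figure is consistent with the policy having done nothing at all.
```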

One final quote:

A spokesman for the Department for Work and Pensions said: “The Secretary of State believes that the benefits cap is having an effect.”

Et voilà: policy-based evidence.

A man he may grow

Michael Rosen’s written a long and thoughtful piece about his experience of the grammar school system in the 1950s. I don’t know if it’s going to appear in print or on a higher-profile blog, but at the moment it’s just a post on his own blog – and he’s such a prolific poster that it’s going to roll off the bottom of the front page at any moment.

So catch it while you can – it’s a must-read for anyone who’s interested in the debate around grammar schools, or interested in debates about selective education, or secondary education in general. And anyone who’s got kids at school, has had kids at school or is ever likely to. And anyone who went to a grammar school, or a selective school, or a comprehensive, or a secondary modern… Basically, you should read this.

It rings so many bells, both positively and negatively (really? we didn’t do that) that I’m tempted to live-blog my reactions to it, but that would be rather self-indulgent. I’ll just mention one small detail of Rosen’s story. He mentions that he was born in 1946, his mother’s second son, and that she died in 1976, aged 55. My own mother had her 55th birthday in 1976; I had my 16th. The coincidence of one date, and the differences of the others, raise all sorts of questions. I can’t begin to imagine my life if my mother had died in her 50s; it was hard enough when it did happen, thirty years later. Then: is it easier for an adult to lose a parent who dies relatively young? Then: easier than what?

But back to school, and a detail of Rosen’s story that sparked off a problem-solving train of thought. He writes:

the pass rate for the 11-plus wasn’t the same for boys and girls and it wasn’t the same from area to area. That’s to say, it panned out at the time that girls were generally better than boys at passing this exam. However, the places for boys and girls was split evenly between us. Somehow or another they engineered what was in reality something like a 55-45% split into a 50-50% cent split. Clearly, some five per cent of girls were serious losers in this and some five per cent of boys some kind of gainers – at least as far as the system thought of us.

But that last sentence can’t be right.

Say for the sake of simplicity that the children taking the test were evenly divided between boys and girls, rather than being 49:51 or 48:52. Then we want to know how many kids passed, and then how many were pushed up or down to even up the figures. Another thing I learned from Rosen’s post is that the pass rate varied from region to region(!), depending on the availability of grammar school places(!!), but let’s forget that for the moment and assume that about one in five passed the 11-plus (in fact the proportion ranged from 30% down to 10%).

So we’ve got, oh, let’s say 10,000 kids, made up of 5,000 boys and 5,000 girls, and 2,000 of them are going to Grammar School, the lucky so-and-so’s. Now, 55% of those 2,000 – 1,100 – are girls, and only 900 are boys. So we need to balance things up, and we skim off the dimmest 100 girls who passed and promote the brightest 100 boys who didn’t (each and every one of whom is officially less bright, and hence less able to benefit from grammar school, than the 100 girls we’ve just sent to the secondary mod, but we avert our eyes at this point).

So that’s 5% of girls demoted, 5% of boys promoted? No – it’s 100/5000, or 2%. When you massage that 55% down to 50%, the 5% that’s lost is 5% of the cohort that passed the exam (male and female), not of the girls (passed and failed). You could also say that the really serious losers – the ones who have been unfairly discriminated against even by the system’s own standards – are 100 out of the 1,100 girls who passed: roughly 9.1%. The serious gainers, on the other hand, are 100 out of the 4,100 boys who failed, roughly (reaches for calculator) 2.4%.
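
If you want to check the working, here it is as a few lines of Python; the round numbers are the simplifying assumptions above, not real cohort sizes.

```python
# Rosen's 11-plus numbers, reworked with the simplifying assumptions
# from the text (an even split of candidates, a 20% pass rate).
boys, girls = 5_000, 5_000
places = 2_000

girls_passed = int(places * 0.55)    # 1,100 girls above the pass mark
boys_passed = places - girls_passed  #   900 boys

moved = girls_passed - places // 2   # 100 girls demoted, 100 boys promoted

print(moved / girls)                 # 0.02  -> 2% of all girls
print(moved / boys)                  # 0.02  -> 2% of all boys
print(moved / girls_passed)          # 0.091 -> 9.1% of the girls who passed
print(moved / (boys - boys_passed))  # 0.024 -> 2.4% of the boys who failed
```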

So there you go: applied maths for real-world problem-solving.

Clearly, some two per cent of girls (or nine per cent of the girls who passed the exam) were serious losers in this and some two per cent of boys some kind of gainers – at least as far as the system thought of us.

At which point I feel a bit like Babbage correcting Tennyson, but it’s right, dammit. And besides, without the maths I wouldn’t have arrived at the figure of nine per cent – for the girls who passed the eleven-plus but were artificially failed to even up the numbers – which is pretty shocking.

Cold water in the face

A remarkable variety of people have poured scorn on Clegg Minor’s contribution to the Sun, and rightly so. The point I want to make, following on from that fourth link, is that we need to watch the Liberal Democrats – now more than ever. (‘Watch’ here includes ‘exacerbate the contradictions within’; there are some good people in Clegg’s party, even now.) The problem is not just that the party’s support is going down the drain, or that the party’s reputation as a byword for unscrupulous vote-whoring has escaped the politically active minority and gone viral: trust can always be regained, to a greater or lesser extent. (And at the end of the day they don’t have to outrun the bear: it doesn’t matter if they don’t look whiter-than-white any more, just as long as they look cleaner than the other two parties.) What’s more to the point is that the reputational capital the party built by coherently positioning itself to the Left of New Labour was thrown to the winds last May; a sizeable chunk of the party’s 2010 vote went with it, and it’s not coming back. On top of that, the experience of coalition – the extraordinarily passive and timorous experience of coalition – is surely chipping away at the party’s bedrock support: from David Steel back to Jo Grimond, the party always stood for something, whatever that might actually be in any given period. The ‘standing for’ part seems to elude the party at the moment – quite possibly because they’ve been stitched up like a kipper by their coalition partner – and their former supporters have noticed.

The problem for the Lib Dem leadership is that they need to stem the flow of disaffected supporters. (The party took 23% of the vote last May; UK Polling Report currently has them averaging 9%, and doesn’t record a single poll when they’ve exceeded 15% since the beginning of November.) Or if they can’t do that – and they haven’t had much luck so far – they need to get support from somewhere else. And cue “Alarm Clock Britain”:

There are millions of people in Alarm Clock Britain. People, like Sun readers, who have to get up every morning and work hard to get on in life. People who want their kids to get ahead. People who don’t want to rely on state handouts. People who don’t need politicians to tell them what to think or how to live their lives. People who are not poor but struggle to stay out of the red.

They are the backbone of Britain. These are the people who will get this country moving again. It is their hard graft, day in, day out, that will get us out of the hole Labour left us in.

This Government is formed by a coalition of two parties and we want to join the people of Alarm Clock Britain in another coalition. A coalition of people prepared to roll up their sleeves and get the nation back on its feet. Ed Miliband may be prepared to hide under his duvet from the problems Labour left us with. But we will get up every morning and face up to them. In Alarm Clock Britain, people don’t want a handout but they appreciate a helping hand. And that is exactly what the Coalition Government is offering them.

I know that times are difficult right now. We are having to make cuts to pay off Labour’s debts and some bills are going up. Now more than ever, politicians have to be clear who they are standing up for. Be in no doubt, I am clear about who that is.

That is why the Liberal Democrats made a promise to voters on the front of our manifesto. That no basic rate taxpayer will pay any tax on the first £10,000 they earn. We’ve already taken the first steps which will take nearly 900,000 out of paying tax altogether. From April, every single taxpayer earning less than £42,500 a year will see their income tax bill cut by £200. By the time of the next election, 23 million people will be paying £700 less.

The Government is lending a hand in other ways, too.

(That’s enough Lib Dem promises – Ed.)

“Now more than ever, politicians have to be clear who they are standing up for. Be in no doubt, I am clear about who that is.” And who is he standing up for? Why, it’s you, you lucky Sun-reader! “People, like Sun readers, who have to get up every morning and work hard to get on in life.” People in work, in other words. Follow it through: these are also people who “want their kids to get ahead”, “don’t want to rely on state handouts” and (bizarrely) “don’t need politicians to tell them what to think or how to live their lives”. And they’re “the backbone of Britain”: Nick Clegg thinks they’re great, he really does.

Obviously life isn’t always quite that neat, but that’s OK too. Maybe you are receiving benefits of some sort or other – lots of working people do – but that’s all right: you’re just one of those people who “don’t want a handout but … appreciate a helping hand”. Maybe you’ve found that you just can’t “get on in life”, no matter how early you start work, but not to worry – you’re not poor, it’s just that you “struggle to stay out of the red”.

Which is just as well, because if you were poor, or – God forbid – if you didn’t have a job to get up for in the morning, then this offer would no longer apply. You would no longer be putting in the “hard graft, day in, day out, that will get us out of the hole Labour left us in”; on the contrary, you would be digging that hole deeper with every day you lived on benefits, and making life harder for “the backbone of Britain” with every morning that you didn’t stir from your lazy idle bed.

Who Nick Clegg is standing up against turns out to be just as important as who he’s standing up for. The message seems to go something like this: Tired after a long day? Taking on extra shifts? Working unpaid overtime? Blame them – blame the workshy, blame the bone-idle, blame all those people living on benefits. They don’t know the meaning of a hard day’s work, not like you do… This would be nasty, vindictive stuff at the best of times. At a time when the unemployment rate stands at 7.9%, or 2.5 million people – and when (as Clegg well knows) the government is poised to throw many more people out of work – it’s outrageous.

Having abandoned any pretence of a position to the Left of Labour, Clegg seems to have decided that fishing for support to the left of the Tories isn’t working either, and he’s trying out the populist far Right. I’ve got a nasty feeling this isn’t going to be a one-off: Clegg may be staring into the abyss, but he’s not going down without a fight. In 2011, watch out for our Deputy Prime Minister celebrating Crimestoppers Britain (“people who don’t want to see lynch law, but can’t let petty criminals make their lives a misery”), Easter Egg Britain (“people who are not racist, but simply know how to value their own traditions”), Beside The Seaside Britain (“people who don’t hate other nations, but know the truth of that old adage – east, west, home’s best!”) and (of course) Poppy Day Britain (“people who don’t glory in war for its own sake, but know that sometimes it is the only honourable choice”).

On the plus side, by the end of the year they’ll probably still be stuck on 9%.

Update Oldham East and Saddleworth: Labour 42.1% (up 10.3%), Liberal Democrat 31.9% (up 0.3%), Conservative 12.8% (down 13.6%); turnout 48.1% (down 14.1%). An interesting result, not least because the shares of the vote aren’t that different from earlier results:

Votes for the main parties in Oldham East and Saddleworth, 1997-2011 (rounded to nearest %)

Year   Labour   Lib Dem   Tory   Tory + LD
1997     42        35       20       55
2001     39        33       16       49
2005     41        33       18       51
2010     32        32       26       58
2011     42        32       13       45

At every election from 1997 to 2005, Labour has been at least 6% ahead of the Liberal Democrats, with the Tories taking less than 20% in third place. You could see 2010’s result as a local example of last year’s swing against Labour, and last night’s result as the return of business as usual. But if 42% and 32% are around what you’d expect Labour and the Lib Dems to be getting in OE&S, 13% is very low indeed for the Tories; there will have been some defection to the extreme right, but not a lot (the combined BNP and UKIP vote share went up by a little over 1% against last May). The best explanation is surely that the consistency of the Lib Dem vote is deceptive, and that some – perhaps quite a lot – of last night’s 32% were tactical Tory votes. It’s also worth noting that the combined Tory and Lib Dem vote was lower last night than it’s been at any time since 1997; it’s only the second time it’s been below 50% (and 2001 was an unusual election; this was the year of the BNP’s big push in Oldham, when they took 11% of the vote).

However, unlike Tom Clark, I don’t believe that this result supports Clegg’s apparent new direction:

YouGov this week reported that by 51% to 16%, the small band of remaining Liberal Democrats would prefer a Tory government led by Cameron to an Ed Miliband Labour administration.

The shrinking Lib Dem electorate, then, is now much more inclined to the centre-right than it has been historically, and Oldham suggests that as it retreats from the left it can hope to make good some of the losses by advancing on the right.

Dear oh dear. The Lib Dems have lost 14 points of the 23% support they had in May 2010 – more than half of it; 51% of 9% equates to 20% of 23%. Lib Dem voters are more right wing than they used to be because there are fewer of them, and the left-leaning voters are the ones that have given up on the party. (As UK Polling Report puts it, “the remaining rump support for the Liberal Democrats is made up of those more positively inclined towards the Tories”.) This doesn’t mean that there are votes to be gained by “advancing on the right”; in fact it specifically and precisely means that that’s a good way to lose votes.

Nor does OE&S suggest that there are votes to be won on the Right; actually what it suggests is that the party’s vote is only holding up thanks to the generosity of Tory voters. This kind of grace and favour arrangement may keep the lights on for a while, but it doesn’t bode well for the party’s future; it suggests that a party with Liberal in the name is, once again, locked into a decaying orbit around the Conservative Party. Into which, precedent suggests, they would disappear without a trace.

Update 19/1/11 Polling data bears out my speculation that the unchanged Lib Dem percentage vote masked a partial collapse in the vote, propped up by borrowed Tory votes. UK Polling Report:

of 2010 Lib Dem voters, only 55% of those who voted in the by-election stuck with the party, with 29% instead defecting to Labour … This drop in Lib Dem support was, however, cancelled out by Conservative tactical voting: of 2010 Conservative voters, 33% who voted in the by-election ended up backing the Liberal Democrats.

Only 49% of the 2010 Conservative voters in the sample voted Tory in the by-election; 91% of the 2010 Labour voters stayed loyal, but then there were fewer of them. Shift all the Tory-LD defectors back to the Conservatives and you get a notional Tory vote share of 22%, vying for second place with the Lib Dems on 23%. Of course, this is working back from answers to a phone poll to the actual result, which isn’t really legitimate, but what’s interesting about these figures is how much of the shift in voting patterns they do in fact seem to account for. You can do it yourself if you’ve got a spreadsheet handy:

2011 Labour = 91% 2010 Labour + 29% 2010 LD + 5% 2010 Tory (!)
2011 LD = 5% 2010 Labour (!!) + 55% 2010 LD + 33% 2010 Tory
2011 Tory = 0% 2010 Labour + 3% 2010 LD + 49% 2010 Tory

Let 2010 Labour = 32%, 2010 LD = 32% and 2010 Tory = 26%, and the 2011 figures come out at 40%, 28% and 14%; you only need to massage the figures a bit to cover variable turnout and you’ve got the real results of 42%, 32% and 13%.
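
If you haven’t got a spreadsheet handy, a few lines of Python will do the same job; this is just the arithmetic sketched above, nothing more.

```python
# 2010 vote shares pushed through the UK Polling Report transfer
# fractions quoted above.
shares_2010 = {"Labour": 32.0, "LD": 32.0, "Tory": 26.0}

# transfers[destination][source]: fraction of each party's 2010 vote
# going to 'destination' in the 2011 by-election
transfers = {
    "Labour": {"Labour": 0.91, "LD": 0.29, "Tory": 0.05},
    "LD":     {"Labour": 0.05, "LD": 0.55, "Tory": 0.33},
    "Tory":   {"Labour": 0.00, "LD": 0.03, "Tory": 0.49},
}

for dest, row in transfers.items():
    share_2011 = sum(frac * shares_2010[src] for src, frac in row.items())
    print(f"{dest}: {share_2011:.0f}%")
# Labour: 40%, LD: 28%, Tory: 14% -- as against the real 42/32/13;
# the residue is (presumably) differential turnout.
```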

These figures bear out the big difference between the Tory base and its Lib Dem counterpart. Tory support is flexible, and will go under other colours if it’s for the good of the party. Lib Dem support is just soft – and, given what they’re currently being asked to support, it’s no wonder.

A treasure hunt, but the treasure’s gone

Recent discussion on CT has made me aware of some startling disparities:

            UK (2001)   Oxford admissions (2009)
White         71.1%              84.9%
Mixed          3.2%               4.6%
Asian         12.1%               4.6%
Black         10.9%               1.0%
Chinese        1.1%               1.8%
Other          1.6%               0.3%

A massive over-representation of the White majority, together with a really glaring under-representation of British Asian and especially Black students, who are being rejected literally nine times out of ten, whereas…

Hang on, wrong figures. That first column is the ethnic breakdown of the population of London (which is where David Lammy MP was born and has lived most of his life, not to mention the obvious point that it’s where he works). Here’s the UK:

            UK (2001)   Oxford admissions (2009)
White         92.1%              84.9%
Mixed          1.2%               4.6%
Asian          4.0%               4.6%
Black          2.0%               1.0%
Chinese        0.4%               1.8%
Other          0.4%               0.3%

White majority: slightly under-represented. Chinese and mixed-race groups: over-represented. British Asians: very slightly over-represented. Black British…

Well, OK, Lammy has got something here, but it’s not quite as big an issue as it might look if you’re coming at it from an ethnically-mixed background (also known as a ‘city’). The UK population in 2001 was still 92% White – there are whole areas of the country where you just won’t see a brown face, or if you do you’ll go home and tell somebody. I won’t be surprised if the figure that comes out of the 2011 Census is a bit lower, but I’ll be amazed if it’s below 90%. So the fact that the Oxford student intake is 85% White is not, in itself, a problem, except insofar as it suggests that recruitment from Scotland, Wales and the North-East might need a bit of work.

All the same, it’s true that Black students are seriously under-represented; a factor of 2 isn’t as bad as a factor of 10, but it’s not good. But this seems to be a point specifically about Black students and not about non-Whites more generally. If racism on the part of Oxford admissions tutors is at the root of what’s going on here, either it’s specifically anti-Black racism or there are other factors outweighing racist attitudes towards other groups.

Or is the problem at the application stage? Here’s how applications look in comparison to UK population figures (bearing in mind that these are 2001 figures and hence almost certainly out of date). In 2009, there were approximately 185 Oxford applications for every 1,000,000 UK citizens. If the same figure is calculated for each ethnic group, you get the following:

            Applications per million   Over/under
White                155                  83.5%
Mixed                703                 379.4%
Asian                353                 190.7%
Black                192                 103.8%
Chinese              918                 495.2%
Other                364                 196.6%

Relative to the size of their ethnic group within the population as a whole, White students are under-represented. Asians and the ‘Other’ group – which consists mainly of people who declined to state their ethnic group – are over-represented; Chinese and the ‘Mixed’ group are massively over-represented. Black students are right in the middle of the distribution, a fairly small population represented – relative to the total of applications – proportionately to its size.

Here are the admission figures again, this time side by side with the application figures:

            Applications   Admissions   Success   Over/under
White           76.9%         84.9%      27.6%      110.0%
Mixed            4.4%          4.6%      26.5%      105.6%
Asian            7.6%          4.6%      15.3%       61.0%
Black            2.0%          1.0%      12.2%       48.6%
Chinese          2.1%          1.8%      21.6%       86.1%
N/K              6.3%          2.8%      11.1%       44.2%

The “over/under” figure gives the relative success of each group as compared with the overall success rate of 25.1%. And it’s an interesting figure. Relative to applications, White students are quite substantially over-represented, while every other group is under-represented, with the exception of the ‘Mixed’ group (the cynical explanation that they’re seen as ‘white enough’ suggests itself).

Here, finally, is what it looks like if you put it all together. (These are the same numbers I’ve been crunching so far. The ‘Over/under’ figure for applications is the ratio between the number of applicants per million in each group and the number of applicants per million UK residents. The ‘Over/under’ figure for admissions is the ratio between the success rate of applicants in each group and the overall success rate of applicants.)

            % of population   % of applications   Over/under   % of admissions   Over/under
White            92.1%              76.9%            0.835          84.9%           1.103
Mixed             1.2%               4.4%            3.794           4.6%           1.057
Asian             4.0%               7.6%            1.907           4.6%           0.610
Black             2.0%               2.0%            1.038           1.0%           0.488
Chinese           0.4%               2.1%            4.952           1.8%           0.862
Other             0.4%               0.8%            1.966           0.3%           0.428
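
If you want to reproduce the two ‘Over/under’ columns, here’s the calculation in Python, fed with the rounded percentages from the tables above (which is why some ratios come out fractionally different from the published ones, computed from unrounded figures). The ‘Other’ success rate is taken from the N/K row of the previous table.

```python
# Re-deriving the 'Over/under' columns: each group's share of
# applications relative to its share of the population, and each
# group's success rate relative to the overall success rate.
population   = {"White": 92.1, "Mixed": 1.2, "Asian": 4.0,
                "Black": 2.0, "Chinese": 0.4, "Other": 0.4}
applications = {"White": 76.9, "Mixed": 4.4, "Asian": 7.6,
                "Black": 2.0, "Chinese": 2.1, "Other": 0.8}
success_rate = {"White": 27.6, "Mixed": 26.5, "Asian": 15.3,
                "Black": 12.2, "Chinese": 21.6, "Other": 11.1}
overall_success = 25.1

for group in population:
    app_ratio = applications[group] / population[group]  # vs population
    adm_ratio = success_rate[group] / overall_success    # vs all applicants
    print(f"{group:8s} applications {app_ratio:5.3f}   admissions {adm_ratio:5.3f}")
```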

Every line tells a slightly different story. The Mixed ethnic group comes off best, with a massive over-representation in applications which is entrenched at the admissions stage; Chinese students are also over-represented, with a larger over-representation among applicants only slightly scaled back at the admission stage. A smaller over-representation of Asian students is almost entirely reversed by the rejection of 85% of applicants. The White group is significantly under-represented among applicants, although the admissions process partially compensates for this with a slight over-representation, relative to applications. Alone among all the major ethnic groups, Black students apply to Oxford at roughly the same rate as the population as a whole, neither over-represented among applicants (like most others) nor under-represented (like White students). However, the Black group suffers enormously at the admission stage, with a rejection rate of nearly 88%; this compares with 74.9% for all applicants and 72.4% for White students.

So what is going on? A large part of what’s going on seems to be that White schoolchildren aren’t getting the top grades in the numbers we’d expect – although this is still being compensated for during admissions. Where Black Oxford applicants are concerned, it seems undeniable that something is going wrong somewhere in the admission process. The numbers of Asian – and to a lesser extent Chinese – applicants are cut down fairly significantly in the admissions process, but this is compensated for by a massive over-representation of those groups among applicants. Black students get hit both ways: they’re not over-represented (although I would find it hard to label this as a fault, particularly given the performance of my own ethnic group), and they’re turned away at an even higher rate than Asian applicants. Oxford’s own investigation concludes that subject choice must bear some (most? all?) of the blame:

BME students apply disproportionately for the most oversubscribed courses. Oxford’s three most oversubscribed large (over 70 places) courses (Economics & Management, Medicine and Mathematics) account for 43% of all BME applicants and 44% of all Black applicants – compared to just 17% of all white applicants.

Well, maybe, but I can’t help feeling that this explanation stops where it ought to start. It’s hard to believe that subject choice is the only reason why Black students’ faces so consistently fail to fit; more to the point, the ‘good’ and ‘bad’ subject choices themselves are not entirely weightless and without a history.  I passed this snippet on to my wife (we met at Cambridge). Apparently Black students aren’t being advised to choose the right subjects, I said, and that’s why not many of them get into Oxford. What, she said, they’re not applying to do Land Economy?

A parting on the right

The police forces of England and Wales implemented a new set of rules for recording crimes in 2002-3, following earlier piecemeal adjustments in 1998-9. The National Crime Recording Standard (NCRS) was designed to be more victim-friendly than the counting rules which had preceded it: rather than the police insisting on corroborating evidence before a crime was recorded as having happened, a crime was to be recorded whenever one was reported, unless there was evidence to the contrary. There was a certain amount of resistance to these changes, which had the direct effect of apparently increasing the crime rate and the indirect effect of lowering the police’s clear-up rate. Nevertheless, the Home Office felt very strongly that police figures were far too low – the British Crime Survey, based on reports from a representative sample of individual victims of crime, suggested that only about 25% of predatory crimes were getting into the police figures – and the changes duly went through. Comparability was also an issue, although less of an issue with each passing year of data being produced under the new rules. The Home Office has in any case made it very clear that there is no comparability of police crime figures between 2002 and 2003, making available figures like the ones from which the graph below was compiled.

As you can see, there’s a strong correlation between the impact of the NCRS and the amount of evidence typically left by the offence. Recorded burglaries weren’t greatly affected, but recorded crimes of personal violence – where supporting evidence is particularly thin on the ground – went up by almost a quarter from one year to the next, on the basis of nothing other than a change in counting rules.

Now, there is no particular reason why the average member of the public should know about all this. It’s inconceivable that anyone with a professional or academic interest in crime or policing wouldn’t know about it, though; it would be like claiming expertise in English history and getting the date of the Battle of Hastings wrong. So this was an interesting story about the Shadow Home Secretary, Chris Grayling.

Sir Michael Scholar, chairman of the UK Statistics Authority, has warned [Grayling] that the way he used figures for violent crime were “likely to mislead the public”. … Mr Grayling’s office arranged for a press release to go out in every constituency in England and Wales, purporting to show that violent crime had risen sharply under Labour, as part of a campaign spearheaded by Mr Cameron about “broken Britain”. But Mr Grayling had failed to take into account a more rigorous system for recording crime figures introduced by the Home Office in 2002. … Mr Grayling has used comparison between the figures before and after the rule change to suggest that the Labour government has presided over a runaway rise in violent crime.

“I do not wish to become involved in political controversy but I must take issue with what you said about violent crime statistics, which seems to me likely to damage public trust in official statistics,” Sir Michael wrote in a letter to Mr Grayling yesterday.

Mr Grayling replied by promising to “take account of the request by the Statistics Authority, particularly with regard to the changes to recording practices made in 2002-03”. But he insisted that he would “continue to use recorded crime statistics, because they reflect an important reality; that the number of violent crimes reported to police stations, and particularly serious violent crimes, has increased substantially over the past decade, even taking into account any changes to data collection”.

But we don’t know the number of violent crimes reported to police stations, because we don’t know the number which are reported but not recorded; that number is not recorded, surprisingly enough. (There was a proposal a few years back to keep separate tabs on ‘incidents’ (i.e. everything that comes over the front desk or over the phone) and ‘calls for service’ (the subset of incidents that the police do anything about), but as far as I’m aware it didn’t come to anything.) In other words, Grayling has not only managed to ignore a really basic piece of statistical general knowledge; he’s gone on to ignore a correction by an expert in the field, responding in a way which demonstrates a complete lack of understanding of what he’d just been told.

The question this leaves is, is David Cameron’s first choice for Home Secretary very, very dishonest, or just very, very stupid?

Not one of us

Nick Cohen in Standpoint (via):

a significant part of British Islam has been caught up in a theocratic version of the faith that is anti-feminist, anti-homosexual, anti-democratic and has difficulties with Jews, to put the case for the prosecution mildly. Needless to add, the first and foremost victims of the lure of conspiracy theory and the dismissal of Enlightenment values are British Muslims seeking assimilation and a better life, particularly Muslim women.

It’s the word ‘significant’ that leaps out at me – that, and Cohen’s evident enthusiasm to extend the War on Terror into a full-blown Kulturkampf. I think what’s wrong with Cohen’s writing here is a question of perspective, or more specifically of scale. You’ve got 1.6 million British Muslims, as of 2001. Then you’ve got the fraction who take their faith seriously & probably have a fairly socially conservative starting-point with regard to politics (call it fraction A). We don’t really know what this fraction is, but anecdotal evidence suggests that it’s biggish (60%? 70%?) – certainly bigger than the corresponding fraction of Catholics, let alone Anglicans. Then there’s fraction B, the fraction of the A group who sign up for the full anti-semitic theocratic blah; it’s pretty clear that fraction B is tiny, probably below 1% (i.e. a few thousand people). Finally, you’ve got fraction C, the proportion of the B group who are actually prepared to blow people up or help other people to do so – almost certainly 10% or less, i.e. a few hundred people, and most of them almost certainly known to Special Branch.
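
Purely to fix the orders of magnitude, here’s the same back-of-envelope sum in Python; every fraction in it is a guess from the paragraph above, not a measurement.

```python
# The fractions argument as arithmetic. All three fractions are the
# post's own rough guesses.
muslims = 1_600_000   # British Muslims, 2001 Census

fraction_a = 0.65     # guess: seriously observant, socially conservative
fraction_b = 0.005    # guess: of A, the full theocratic package ("below 1%")
fraction_c = 0.10     # guess: of B, prepared to support violence ("10% or less")

group_a = muslims * fraction_a   # ~1,040,000
group_b = group_a * fraction_b   # ~5,200  ("a few thousand")
group_c = group_b * fraction_c   # ~520    ("a few hundred")

print(f"A: {group_a:,.0f}  B: {group_b:,.0f}  C: {group_c:,.0f}")
# Only the orders of magnitude matter; the precision is spurious.
```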

I think we can and should be fairly relaxed about fraction A; we should argue with the blighters when they come out with stuff that needs arguing with, but we shouldn’t be afraid to stand with them when they’re raising just demands. (Same as any other group, really.) Fraction B is not a good thing, and if it grows to the point of getting on the mainstream political agenda then it will need to be exposed and challenged. But it hasn’t reached that level yet, and I see no sign that it’s anywhere near doing so. (Nigel Farage gets on Question Time, for goodness’ sake. Compare and contrast.) The real counter-terrorist action, it seems to me, is or should be around fraction C. Let’s say there are 5,000 believers in armed jihad out there – 500 serious would-be jihadis and 4,500 armchair jihadis, who buy the whole caliphate programme but whose own political activism doesn’t go beyond watching the Martyrdom Channel. What’s more important – eroding the 5,000 or altering the balance of the 500/4,500 split? In terms of actually stopping people getting killed, the answer seems pretty obvious to me.

Nick Cohen and his co-thinkers, such as the Policy Exchange crowd, focus on fraction B rather than fraction A. In itself this is fair enough – I think it’s mistaken, but it’s a mistake a reasonable person can make. What isn’t so understandable is the urgency – and frequency – with which they raise the alarm against this tiny, insignificant group of people, despite the lack of evidence that they’re any sort of threat. “A small minority of British Muslims believe in the Caliphate” is on a par with “A small minority of British Conservatives would bring back the birch tomorrow” or “A small minority of British Greens believe in Social Credit”. It’s an advance warning of possible weird nastiness just over the horizon; it’s scary, but it’s not that scary.

What explains the tone of these articles, I think, is an additional and unacknowledged slippage, from fraction B back out to fraction A. What’s really worrying Cohen, in other words, isn’t the lure of conspiracy theory and the dismissal of Enlightenment values so much as the lure of Islam (in any form) and the dismissal of secularism. (What are these Enlightenment values, anyway? Nobody ever seems to specify which values they’re referring to. Somebody should make a list). Hence this sense of a rising tide of theocratic bigotry, and of the need for a proper battle of values to combat it. This seems alarmingly wrongheaded. Let’s say that there’s a correlation between religious devotion and socially conservative views (which isn’t always the case) – then what? A British Muslim who advocates banning homosexuality needs to be dealt with in exactly the same way as a British Catholic who advocates banning abortion – by arguing with their ideas. (Their ideas are rooted in their identities – but then, so are mine and yours.) And hence, too, that odd reference to British Muslims seeking assimilation and a better life, as if stepping out of the dark ages must mean abandoning your faith – or, at least, holding it lightly, in a proper spirit of worldly Anglican irony. Here, in fact, Cohen is a hop and a skip from forgetting about all the fractions and identifying the problem as Muslims tout court. Have a care, Nick – that way madness lies.

The high and the low

(Updated Christmas Eve, after spotting a flaw in my statistical analysis. I am deeply sad.)

Now that it’s well and truly over, two things really stick in my mind about the Manchester Congestion Charge vote. (Strictly speaking, the Manchester Transport Innovation Fund vote – but I don’t think it’s a fund that we voted to reject.)

One is the sheer strangeness of the Yes campaign. As you’ll already know if you live anywhere in Greater Manchester, this was a huge campaign. The public transport companies were in favour anyway, so you couldn’t get on a bus or a tram without being invited to vote Yes. But you couldn’t wait for a bus – or look out of the window once it started moving – without your eyes being met by the dull-eyed, faintly reproachful gaze of the Vote Yes People. (Click around the site for more. Perhaps not late at night.) They were everywhere. According to that Web site, the campaign was sponsored by TCS (a property company) and Practicus (an ‘interim management’ company, which seems to be something like middle-management recruitment only not quite; perhaps you don’t get an actual job at the end of it). Those two companies must be doing remarkably well, to have all that money to spend on someone else’s publicity; clearly names to watch. From the Vote Yes campaign’s point of view, though, I do wonder that nobody seems to have considered the potential downside of this level of saturation publicity. People don’t generally like being told what to do, least of all by spud-faced pod-people who purport to represent them.

Perhaps it wouldn’t have been so bad if the content of the campaign had been different. There were three waves of pod-people posterage, each a variation on the basic theme of What An Ordinary Manchester Person Is Thinking. (And ‘thinking’ is the word: nobody was actually speaking in those pictures. Look into my eyes! Hear my thoughts!) The first wave was the deeply annoying “I won’t be paying” theme. This wasn’t encouraging civil disobedience (which would probably be fairly futile with the level of surveillance required by the scheme). Rather, it was based on the idea that most people wouldn’t be making car journeys which would be hit by the charge – supposedly ‘eight out of ten people wouldn’t pay’ – and therefore most people ought to vote Yes.

This was a bad approach on so many levels. On the face of it, it was a straightforward appeal to self-interest: you want better public transport? you don’t want to pay more? lucky you, you won’t have to! But anyone who was already concerned about the charge, or suspected that they might be affected, had already had ample opportunities to do the sums for their own situations. (Full disclosure: I worked out that I’d be charged once a week. I really resented that.) Even if only 20% of the population was likely to be charged – and I’m sure people like me, incurring weekly charges, weren’t included in those calculations – the appeal to self-interest, for those people, would immediately backfire: saying that four out of five people wouldn’t pay isn’t much of a selling-point if you’re number 5.

For anyone who hadn’t given the charge much thought, on the other hand, the campaign could almost have been calculated to raise suspicions – precisely because of that weird and phony “we are ordinary people like you” framing. I won’t pay, says an actor representing a typical Manchester resident, because I only go into town at the weekend / I get to college by bus / I never go out of the house (I may have made up the last one). I suppose our reaction to these was supposed to be “good for us – tough luck on those people who insist on commuting by car”. Actually my instinctive reaction was “good for fictional you, but what about me?” If you’re going to appeal to self-interest, you need to get the story straight – once you start thinking in terms of “can I get something for nothing?”, you’re also thinking “am I going to get ripped off?”

The second wave was all about fairness. This time the pod people had talking points that they were mulling over (although where they got them was a mystery to me – the publicity about the actual details of the scheme was woefully limited). The emphasis was on the commitment to get the improvements to public transport into place before the charge came in; a typical poster read “Bus fares are frozen, and then the charge comes in? Sounds fair to me.” This wasn’t as actively repellent as the first phase, but it was extraordinarily weak – what do you mean, it sounds fair to you? What is this imitation of reasoning – are you saying it is fair or not – and if not, why not? Come to think of it, what’s fairness got to do with the timing of the introduction of the charge? There’s no sense in which the benefits gained in the first couple of years offset the costs imposed from that point on. Once again, this “we are ordinary people” approach provokes the very suspicions it’s apparently meant to allay – maybe it sounds ‘fair’ to you, mate, but to me it just sounds like a sweetener… And, once again, the underlying appeal is not to collective benefits or to fairness (despite the language), but to self-interest. Two years benefits upfront, free of charge? I’ll have some of that. What would genuinely sound fair would be “We’ll pay more when we drive at peak times, but we’ll get the benefit when we use public transport” – but that message never appeared.

The idea of actually paying the charge did surface in the third and final stage of the campaign, but yet again the appeal was to individual self-interest. The message here was “I want to [get from A to B quickly]. That’s why I’m voting Yes.”, with examples ranging from getting to the building site on time to putting the kids to bed. I don’t mind paying, the logic runs, because I know that other people won’t want to pay, and so the roads I drive down will be much clearer. Essentially this was the “get the plebs off the road” phase of the campaign. It seems to tap into the same vein of narcissistic fantasy that brought us the remake of Survivors: What if everyone stopped using their cars to get to work except me? Wouldn’t that be brilliant?

This isn’t a full picture of the Yes campaign; there was some publicity which focused on improvements to public transport. More to the point, a lot of the actual campaigning went on by word of mouth, and here the idea that the charge might be paid for in collective benefits did get an airing. Overall, though, the Yes campaign was woeful as well as creepy. What it was trying to get us to do was assent to an additional tax, for the sake of benefits which (by government decree) couldn’t be funded any other way. The question, in other words, was “do you agree to start making a payment you’ve never had to make before and carry on paying it indefinitely, with no guarantee that the scheme won’t be extended or the toll increased, for no reason except that that’s the only offer on the table?” (The TIF was to consist of a £1500 million grant plus a £1200 million loan, a quarter of which would need to be spent on setting up the machinery to administer the scheme. And no, we couldn’t just have the £1500 million.) It appeals to a certain combination of public-spiritedness and submissive ‘realism’: you can say “yes, because I believe the investment in public transport will be worth it, and besides it’s the only offer on the table” or “yes, because I believe we should be encouraged to use our cars less (and besides…)”, but those are arguments for agreeing to a collective tax, arbitrarily imposed, in return for collective benefits. There’s just no way to sell a Yes vote in terms of individual self-interest, and it was pretty shabby of the Yes campaign to make the attempt.

The other thing that struck me about the campaign was the consistency of the voting figures, with one interesting exception. There are ten boroughs within the old Greater Manchester region; the plan was to implement two charging zones, one following the M60 and an inner ring further in towards the centre (not far enough in for my liking, but that’s by the way). Out of the ten boroughs, Bolton and Wigan are entirely outside the M60, and Rochdale almost entirely; these three boroughs presumably have the largest proportion of people who would be completely unaffected by the charge. Bury, Oldham, Tameside, Stockport and Trafford are all crossed by the M60. Manchester and Salford, finally, are divided both by the M60 and by the inner ring.

Here are the voting figures. I’ve given the percentage turnout and the No vote (as a percentage of those who voted). The dotted lines represent percentages across all ten boroughs. (Region-wide turnout: 53.2%; region-wide No vote: 78.8%.) I’ve graphed the No vote because it turns out that there was very little variation in the Yes vote, calculated as a percentage of eligible voters: 4% in total (from a low of 8.9% to a high of 12.8%), with six boroughs within 0.5% of the overall figure of 11.3%.

[Graph 1: turnout and No vote by borough]

Here are the same figures, normalised around those region-wide percentages: 90% means ’90% of the regional percentage turnout/No vote’.

[Graph 2: turnout and No vote, normalised to the region-wide percentages]

And here are the percentages again, sorted by No vote rather than by turnout.

[Graph 3: the same percentages, sorted by No vote]

What do we see? The first thing is that turnout was respectable everywhere (the Wigan low of 45% would be very good for a local election) and better than that in a few places (over 60% in Tameside and Trafford). The second is that the No vote was overwhelming (and the Yes vote miserable) pretty much everywhere: the No vote ranged from 84.5% in Salford all the way down to 72.2% in Manchester. This wasn’t a multiple-choice question or a choice between several candidates: 27.8% of people who voted in Manchester voted Yes, and 72.2% voted No. For the proposal to pass, the vote had to be over 50% in seven out of ten boroughs; in the event it didn’t even reach 30% in a single one.

Then there’s the correlation of turnout and No vote, which is particularly striking in the third graph: three boroughs had a below-average No vote and a below-average turnout; six had an above-average turnout and an above-average No vote. (Bolton was in between.) Look at the first graph and compare Trafford, Tameside and Stockport (crossed by the M60) with Rochdale, Bolton and Wigan (outside the M60). Outer boroughs: low turnout, relatively low No vote. Inner: high turnout, relatively high No vote. As I noted above, the Yes turnout varied between 8.9% and 12.8%, for an overall average of 11.3%. There was much more variation in the No turnout, which was 41.9% across the area, but ranged from over 50% in Trafford and Tameside to just over 33% in Wigan and Manchester. (Trafford also had an above-average Yes turnout, at 12.5%. I guess they just take voting seriously in Trafford.) There seems to be a definite correlation with geography; it looks as if, where geography made a difference, the difference was both that the congestion charge interested fewer people (lower turnout in outer boroughs) and that those who bothered to vote were more motivated by self-interest (lower No vote in outer boroughs). In short, the geographical patterning of the Yes vote is highly suggestive of an appeal to self-interest, while the overall level of the Yes vote suggests that this appeal has very little power to mobilise.
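
Those derived figures come straight from multiplying turnout by vote share; here’s the sum in Python for the region and the two extreme boroughs, using the rounded percentages from the text (so expect small rounding discrepancies).

```python
# Yes and No 'turnout' (shares of the whole electorate) derived from
# overall turnout and the No share of votes cast.
figures = {
    # name: (turnout %, No % of votes cast)
    "Region-wide": (53.2, 78.8),
    "Manchester":  (46.0, 72.2),
    "Salford":     (57.0, 84.5),
}

for name, (turnout, no_share) in figures.items():
    no_turnout = turnout * no_share / 100
    yes_turnout = turnout - no_turnout
    print(f"{name:12s} No turnout: {no_turnout:4.1f}%  Yes turnout: {yes_turnout:4.1f}%")
# Region-wide: No 41.9%, Yes 11.3% -- matching the figures in the text.
```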

Lastly, there’s a glaring exception to this correlation: Manchester, the borough covering most of the city centre and hence the only borough, apart from Salford, which is crossed by both inner and outer charging rings. Salford has the record No vote, at 84.5%; turnout was a respectable 57%. Manchester, by contrast, is out there with Wigan: a turnout of only 46%, of whom 27.8% voted Yes. Clearly, the model which explains the differences between inner and outer boroughs in terms of individual self-interest can’t deal with these figures.

I haven’t got an explanation, either for the high Yes vote or for the equally puzzling low turnout. Anecdotal evidence suggests that Manchester (or at least South Manchester) may have an unusually high concentration of people sympathetic to the aims of the Congestion Charge, or of non-drivers, or both. As for the low turnout, Manchester City Council hasn’t changed hands since 1974; the council’s motto is Concilio Et Labore, and it is. Perhaps conditions like that – compounded by the fug of neo-Blairite ex-municipal-socialist hortatory corporate righteousness which has enveloped the Town Hall for the last decade – tend to promote cynicism and disengagement: they’ll do it anyway, so why encourage them? The day the vote came through the Manchester Evening News results page included a poll: “Is the Congestion Charge dead and buried?” When I looked at the page, votes were running 4:1 in favour of “It’ll be back in some form”. White Van Man won’t resist the Future forever. (And a Merry Christmas to you too, Mr Leese sir!)

All those numbers

I like a good fallacy; I managed to get the Base Rate Fallacy, the Hawthorne Effect and Goodhart’s Law into one lecture I gave recently. So I was intrigued to run across this passage in Jock Young’s 2004 essay “Voodoo Criminology and the numbers game” (you can find a draft in pdf form here):

Legions of theorists from Robert K Merton through to James Q Wilson have committed Giffen’s paradox: expressing their doubts about the accuracy of the data and then proceeding to use the crime figures with seeming abandon, particularly in recent years when the advent of sophisticated statistical analysis is, somehow, seen to grant permission to skate over the thin ice of insubstantiality.

I like a good fallacy, but paradoxes are even better. So, tell me more about Giffen’s paradox:

Just as with Giffen’s paradox, where the weakness of the statistics is plain to the researchers yet they continue to force-feed inadequate data into their personal computers

Try as I might, I wasn’t seeing the paradox there. A footnote referenced

Giffen, P. (1965), ‘Rates of Crime and Delinquency’ in W. McGrath (ed.), Crime Treatment in Canada

I didn’t have W. McGrath (ed.), Crime Treatment in Canada by me at the time, so I did the next best thing and Googled. I rapidly discovered that Giffen’s paradox is also known as the Giffen paradox, that it’s associated with Giffen goods, and that it’s got nothing to do with Giffen, P. (1965):

Proposed by Scottish economist Sir Robert Giffen (1837-1910) from his observations of the purchasing habits of the Victorian poor, the Giffen paradox states that demand for a commodity increases as its price rises.

Raise the price of bread when there are people on the poverty line – ignoring for the moment the fact that this makes you the rough moral equivalent of Mengele – and those people will buy more bread, to substitute for the meat they’re no longer able to afford. It’s slightly reassuring to note that, notwithstanding Sir Robert’s observations of the Victorian poor, economists have subsequently questioned whether the Giffen paradox has ever actually been observed.

But none of this cast much light on those researchers force-feeding their personal computers with inadequate data. Eventually I tracked down W. McGrath (ed.), Crime Treatment in Canada. It turns out that the less famous Giffen did in fact describe the willingness of researchers to rely on statistics, after having registered caveats about their quality, as a paradox (albeit “one of the less important paradoxes of modern times”). I still can’t see that this rises to the level of paradox: surely being upfront about the quality of the data you’re processing is what a statistical analyst should do. If initial reservations don’t carry through into the conclusion that’s another matter – but that’s not a paradox, that’s just misrepresentation.

Paradoxical or not, Giffen’s observation accords with Young’s argument in the paper, which is that criminologists, among other social scientists, place far too much trust in statistical analysis: statistics are only as good as the methods used to produce them, methods which in many cases predictably generate gaps and errors.

It’s a good argument but not a very new or surprising one (perhaps it was newer in 1965). Moreover, Young pushes it in some odd directions. The paper reminded me of Robert Martinson’s 1974 study of rehabilitation programmes, “What Works?” – or rather, of how that paper was received. Martinson demonstrated that no study had conclusively shown any form of rehabilitation to work consistently, and that very few studies of rehabilitation showed any clear result; his paper was seized on by advocates of imprisonment and invoked as proof that nothing worked. This was unjustified on two levels. Firstly, while Martinson’s negatives would justify scepticism about a one-size-fits-all rehabilitation panacea, the detail of his research did suggest that some things worked for some people in some settings. Subsequent research – some of it by Martinson himself – bore out this suggestion, showing reasonably clear evidence that tailored, flexible and multiple interventions can actually do some good. Secondly, if Martinson was sceptical about rehabilitation, he wasn’t any less sceptical about imprisonment: his conclusion was that ex-offenders could be left alone, not that they should be kept locked up (“if we can’t do more for (and to) offenders, at least we can safely do less”). For Martinson, rehabilitation couldn’t cut crime by reforming bad people, because crime wasn’t caused by bad people in the first place. Sadly, the first part of this message was heard much more clearly than the second.

Like Martinson, Young is able to present a whole series of statistical analyses which seem obviously, intuitively wrong. However, what his examples suggest is that statistics from different sources require different types and levels of wariness: some are dependably more trustworthy than others, and some of the less trustworthy are untrustworthy in knowably different ways. But rather than deal individually with the different types of scepticism, levels of scepticism and reasons for scepticism which different analyses provoke, Young effectively concludes that nothing works, or very little:

Am I suggesting an open season on numbers? Not quite: there are … numbers which are indispensable to sociological analysis. Figures of infant mortality, age, marriage and common economic indicators are cases in point, as are, for example, numbers of police, imprisonment rates and homicide incidences in criminology. Others such as income or ethnicity are of great utility but must be used with caution. There are things in the social landscape which are distinct, definite and measurable; there are many others that are blurred because we do not know them – some because we are unlikely ever to know them, others, more importantly, because it is their nature to be blurred. … There are very many cases where statistical testing is inappropriate because the data is technically weak – it will simply not bear the weight of such analysis. There are many other instances where the data is blurred and contested and where such testing is simply wrong.

(In passing, that’s a curious set of solid, trustworthy numbers to save from the wreckage – it’s hard to think of an indicator more bureaucratically produced, socially constructed and culture-bound than “infant mortality”, unless perhaps it’s “marriage”.)

I’ve spent some time designing a system for cataloguing drug, alcohol and tobacco statistics – an area where practically all the data we have is constructed using “blurred and contested” concepts – so I sympathise with Young’s stance here, up to a point. Police drug seizure records, British Crime Survey drug use figures and National Treatment Agency drug treatment statistics are produced in different ways and tell us about different things, even when they appear to be talking about the same thing. (In my experience, people who run archives know about this already and find it interesting, people who use the statistics take it for granted, and IT people don’t know about it and want to fix it.) But: such testing is simply wrong? (Beware the persuasive adverb – try re-reading those last two sentences with the word ‘simply’ taken out.) We know how many people answered ‘yes’ to a question with a certain form of words; we know how many of the same people answered ‘yes’ to a different question; and we know the age distribution of these people. I can’t see that it would be wrong to cross-tabulate question one against question two, or to calculate the mean age of one sub-sample or the other. Granted, it would be wrong to present findings about the group which answered Yes to a question concerning activity X as if they were findings about the group who take part in activity X – but that’s just to say that it’s wrong to misrepresent your findings. Young’s broader sceptical claim – that figures constructed using contested concepts should not or cannot be analysed mathematically – seems… well, wrong.
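
For what it’s worth, here’s the kind of analysis I’m defending, in Python with pandas; the survey data is invented for illustration.

```python
# Cross-tabulating two survey questions and taking the mean age of a
# sub-sample: legitimate operations even on 'blurred' data.
import pandas as pd

survey = pd.DataFrame({
    "q1_used_x":   ["yes", "yes", "no", "no", "yes", "no"],
    "q2_got_help": ["no",  "yes", "no", "no", "no",  "yes"],
    "age":         [19,    24,    31,   45,   22,    38],
})

# Question one against question two:
print(pd.crosstab(survey["q1_used_x"], survey["q2_got_help"]))

# Mean age of the sub-sample answering 'yes' to question one:
print(survey.loc[survey["q1_used_x"] == "yes", "age"].mean())

# What would be wrong is reporting these as findings about 'people who
# take part in activity X', rather than about people who answered 'yes'
# to this particular form of words.
```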

Young then repeats the second of the errors of Martinson’s audience: if none of that works, then we can stick with what we know. In this case that means criminology reconceived as cultural ethnography: “a theoretical position which can enter in to the real world of existential joy, fear, false certainty and doubt, which can seek to understand the subcultural projects of people in a world riven with inequalities of wealth and uncertainties of identity”. Fair enough – who’d want a theoretical position which couldn’t enter in to the real world? But the question to ask about creeds is not what’s in them but what they leave out. Here, the invocation of culture seems to presage the abandonment not only of statistical analysis but of materialism.

The usual procedure … is to take the demographics and other factors which correlate with crime in the past and attempt to explain the present or predict the future levels of crime in terms of changes in these variables. The problem here is that people (and young people in particular) might well change independently of these variables. For in the last analysis the factors do not add up and the social scientists begin to have to admit the ghost in the machine.

People … might well change independently of these variables – how? In ways which don’t find any expression in phenomena that might be measured (apart from a drop in crime)? It seems more plausible to say that, while people do freely choose ways to live their lives, they do not do so in circumstances of their own choosing – and that those choices in turn have material effects which create constraints as well as opportunities, for themselves and for others. To put it another way, if the people you’re studying change independently of your variables, perhaps you haven’t got the right variables. Young’s known as a realist, which is one way of being a materialist these days; but the version of criminology he’s proposing here seems, when push comes to shove, to be non- or even anti-materialist (“the ghost in the machine”). That’s an awfully big leap to make, and I don’t think it can be justified by pointing out that some statisticians lie.

What arguments based on statistics need – and crime statistics are certainly no exception – is scepticism, but patient and attentive scepticism: it’s not a question of declaring that statistics don’t tell us anything, but of working out precisely what particular uses of statistics don’t tell us. A case in point is this story in last Friday’s Guardian:

An 8% rise in robberies and an 11% increase in vandalism yesterday marred the latest quarterly crime figures, which showed an overall fall of 2% across all offences in England and Wales.

The rise in street crime was accompanied by British Crime Survey indicators showing that public anxiety about teenagers on the streets, noisy neighbours, drug dealing, drunkenness and rowdiness has continued to increase despite the government’s repeated campaigns against antisocial behaviour. … But police recorded crime figures for the final three months of 2006 compared with 12 months earlier showed that violent crime generally was down by 1%, including a 16% fall in gun crime and an 11% fall in sex offences.

The more authoritative British Crime Survey, which asks 40,000 people about their experience of crime each year, reported a broadly stable crime rate, including violent crime, during 2006. … The 11% increase in vandalism recorded by the BCS and a 2% rise in criminal damage cases on the police figures underlined the increase in public anxiety on five out of seven indicators of antisocial behaviour.

Confused? You should be. Here it is again:

                 Police     BCS
All crime        down 2%    stable (up 1%*)
Violent crime    down 1%    stable
Robbery          up 8%      stable (down 1%*)
Vandalism        up 2%      up 11%

* Asterisked figures are from the BCS but weren’t in the Guardian story.

Earlier on in this post I made a passing reference to statistical data being bureaucratically produced, socially constructed and culture-bound. Here’s an example of what that means in practice. Police crime figures are a by-product of the activities of the police in dealing with crime, and as such are responsive to changes in the pattern of those activities: put a lot more police resources into dealing with offence X, or change police procedure so that offences of type X are less likely to go by unrecorded, and the crime rate for offence X will appear to go up (see also cottaging). Survey data, on the other hand, is produced by asking people questions; as such, it’s responsive to variations in the type of people who answer questions and to variations in those people’s memory and mood, not to mention variations in the wording of the questions, the structure of the questionnaire, the ways in which answers are coded up and so on. The two sets of indicators are associated with different sets of extraneous influences; if they both show an increase, the chances are that they’ve both been affected by the same influence. The influence in question may be a single big extraneous factor which affects both sets of figures – for example, a massively-publicised crackdown on particular criminal offences will give them higher priority both in police activities and in the public consciousness. But it may be a genuine increase in the thing being measured – and, more to the point, the chances of it being a genuine increase are much higher than if only one indicator shows an increase.

In this case, the police have robberies increasing by 8%; the BCS has theft from the person dropping by 1%. That’s an odd discrepancy, and suggests that something extraneous is involved in the police figure; it’s not clear what that might be, though. Vandalism, on the other hand, goes up by 2% if you use police figures but by all of 11% if you use the BCS. Again, this discrepancy suggests that something other than an 11% rise in the actual incidence of vandalism might be involved, and in this case the story suggests what this might be:

British Crime Survey indicators showing that public anxiety about teenagers on the streets, noisy neighbours, drug dealing, drunkenness and rowdiness has continued to increase despite the government’s repeated campaigns against antisocial behaviour

Presumably the government’s repeated campaigns against antisocial behaviour have raised the profile of anti-social behaviour as an issue. Perhaps this has made it more likely that people will feel that behaviour of this type is something to be anxious about, and that incidents of vandalism will be talked about and remembered for weeks or months afterwards (the BCS asks about incidents in the past twelve months).

That’s just one possible explanation: the meaning of figures like these is all in the interpretation, and the interpretation is up to the interpreter. The more important point is that there are things that these figures will and won’t allow you to do. You can say that police figures, unlike the BCS, are a conservative but reliable record of things that have actually happened, and that robbery has gone up by 8% and criminal damage by 2%. You can say that victim surveys, unlike police figures, are an inclusive and valid record of things that people have actually experienced, and that vandalism has gone up by 11% while robbery has gone down by 1%. What you can’t do is refer to “an 8% rise in robberies and an 11% increase in vandalism” – there is no way that the data can give you those two figures.

But that’s not a paradox or even a fallacy – it’s just misuse of statistics.

None of you stand so tall

In the previous post, I showed that the canonical ‘power law’ chart which underlies the Long Tail image does not, in fact, represent a power law. What it represents is a ranked list, which happens to have a similar shape to a power law series: as it stands, the ‘power law’ is an artifact of the way the list has been sorted. In particular, the contrast which is often drawn, in this context, between a power law distribution and a normal distribution is inappropriate and misleading. If you sort a list high to low, it can only ever have the shape of a descending curve.

There are counter-arguments, which I’ll go through in strength order (weakest first).

Counter-argument 1: the Argument from Inconsequentiality.

In the post which started it all, Clay wrote:
the shape of Figure #1, several hundred blogs ranked by number of inbound links, is roughly a power law distribution.

Note weasel wordage: it would be possible to argue that what Clay (and Jason Kottke) identified wasn’t really a power law distribution, it was just some data which could be plotted in a way which looked oddly like a power law curve. Thankfully, Clay cut off this line of retreat, referring explicitly to power law distributions:

power law distributions are ubiquitous. Yahoo Groups mailing lists ranked by subscribers is a power law distribution. LiveJournal users ranked by friends is a power law … we know that power law distributions tend to arise in social systems where many people express their preferences among many options.

And so on. When we say ‘power law’, we mean ‘power law distribution’: we’re all agreed on that.

Except, of course, that what we’re talking about isn’t a power law distribution. Which brings us to…

Counter-argument 2: the Argument from Intuition.

The pages I excerpted in the previous post specifically contrast the power law distribution with the ‘normal’ bell curve.

many web statistics don’t follow a normal distribution (the infamous bell curve), but a power law distribution. A few items have a significant percentage of the total resource (e.g., inbound links, unique visitors, etc.), and many items with a modest percentage of the resources form a long “tail” in a plot of the distribution.

we find a very few highly connected sites, and very many nearly unconnected sites, a power law distribution whose curve is very high to the left of the graph with the highly connected sites, with a long “tail” to the right of the unconnected sites. This is completely different than the bell curve that folks normally assume

The Web, like most networks, has a peculiar behavior: it doesn’t follow standard bell curve distributions … [it] follows a power law distribution where you get one or two sites with a ton of traffic (like MSN or Yahoo!), and then 10 or 20 sites each with one tenth the traffic of those two, and 100 or 200 sites each with 100th of the traffic, etc.

One of my Latin teachers at school had an infuriating habit, for which (in the best school-story tradition) I’m now very grateful. If you read him a translation which didn’t make sense (grammatically, syntactically or literally) he’d give you an anguished look and say, “But how can that be?” It was a rhetorical question, but it was also – infuriatingly – an open question: he genuinely wanted you to look again at what you’d written and realise that, no, actually that noun in the ablative couldn’t be the object of the verb… Good training, and not only for reading Latin.

If you’ve got this far, do me a favour and re-read the excerpts above. Then ask yourself: how can that be?

As long as we’re talking about interval/ratio variables – the only type for which a normal distribution can be plotted – it’s hard to make sense of this stuff. What, to put it bluntly, is being plotted on the X axis? The best I can do is to suppose that the X axis plots number of sites: A few items have a significant percentage of the total resource; a very few highly connected sites; one or two sites with a ton of traffic. There’s your spike on the left: a low X value (a few items) and a high Y (a significant percentage of the total resource).

But this doesn’t really work either. Or rather, it could work, but only if every group of sites with the same number of links had a uniquely different number of members – and if the number of members in each group were in inverse proportion to the number of links (1 site with n links, 2 sites with n/2 links, 3 sites with n/3 links, 4 sites with n/4 links…). This isn’t impossible, in very much the same way that the spontaneous development of a vacuum in this room isn’t impossible; a pattern like that wouldn’t be a power law so much as evidence of Intelligent Design.
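Spelling the contrivance out – a sketch, with n and the number of groups picked arbitrarily:

from collections import Counter

# The contrived model: group k contains exactly k sites,
# each with n/k inbound links.
n = 1200
links = []
for k in range(1, 5):
    links.extend([n // k] * k)

# One site with 1200 links, two with 600, three with 400, four with 300:
print(Counter(links))  # Counter({300: 4, 400: 3, 600: 2, 1200: 1})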

This is an elaborate and implausible model; it’s also something of a red herring, as we’ll see in a minute. It’s worth going into in detail, though; as far as I can see, it’s the only way of getting these data into a power law distribution, with high numbers of links on the left, without using ranking. And cue…

Counter-argument 3: the Argument from Ranking.

Over to Clay:

The basic shape is simple – in any system sorted by rank, the value for the Nth position will be 1/N. For whatever is being ranked — income, links, traffic — the value of second place will be half that of first place, and tenth place will be one-tenth of first place. (There are other, more complex formulae that make the slope more or less extreme, but they all relate to this curve.)

“The value for the Nth position will be 1/N” (or proportionate to 1/N, to be more precise); alternatively, you could say that N items have a value of 1/N or greater. (Have a think about this one – we’ll be coming back to it later.) Either way, it’s a power law, right? Well, yes – and no. It’s certainly true to say that a ranked list with these properties conforms to a version of the power law – specifically, Zipf’s law. It’s also true to say that Zipfian rankings are associated with Pareto-like power law distributions: we may yet be able to find a power law in this data. But we’re not there yet – and Clay’s presentation of the data doesn’t help us to get there. (Jason’s has some of the same problems, but Clay’s piece is a worse offender; it’s also much more widely known.)
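Both readings of the 1/N rule can be checked in a few lines – a sketch, with the value of first place picked arbitrarily:

top = 1000            # value of first place (arbitrary)
ranks = range(1, 21)
values = [top / n for n in ranks]   # Zipf: Nth place gets 1/N of first place

# The alternative reading: exactly N items have a value of top/N or greater.
for n in ranks:
    assert sum(v >= top / n for v in values) == n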

The first problem is with the recurrent comparison of ranked graphs with bell curves. Adam: “a ranked graph … by definition is *always* decreasing, and can *never* be a bell curve”. If anyone tells you that such and such a phenomenon follows a power law rather than a normal distribution, take a good look at their X axis. If they’ve got ranks there, the statement is meaningless.

Secondly, the graph Clay presented – a classic of the ‘big head, long tail’ genre – isn’t actually a Zipfian series, for the simple reason that it includes tied ranks: it’s not a list of ranks but a list of nominals sorted into rank order.

I’ll clarify. Suppose that we’ve got a series which only loosely conforms to Zipf’s Law, perhaps owing to errors in the real world:

Rank    Value
1       1000
2       490
3       340
4       220
5       220
6       180
7       140

Now, what happens on the graph around values 4 and 5? If the X axis represents ranking, it makes no sense to say that the value of 220 corresponds to a rank of 4 and a rank of 5: it’s a rank of 4, followed by no ranking for 5 and a rank of 6 for the value of 180. We can see the point even more clearly if we take the alternative interpretation of a Zipfian list and say that the X axis tracks ‘number of items with value greater than or equal to Y’. Clearly there are 6 items greater than or equal to 180 and 5 greater than or equal to 220 – but it would be nonsensical to say that there are also 4 items greater than or equal to 220. Either way, if you have a ranked list with tied rankings this should be represented by gaps in the graph.
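pandas makes the convention concrete: with rank(method='min'), tied values share the lowest rank in the run and the next rank simply never occurs. A quick sketch, using the series above:

import pandas as pd

values = pd.Series([1000, 490, 340, 220, 220, 180, 140])
print(values.rank(method="min", ascending=False).tolist())
# [1.0, 2.0, 3.0, 4.0, 4.0, 6.0, 7.0] - the two 220s share rank 4,
# and rank 5 is simply absent: a gap in any rank-axis plot.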

This may seem like a minor nitpick, but it’s actually very important. Back to Adam:

One nice thing about a ranked graph is that the “area” under the curve is equal to the total value associated with the items spanned on the ranked axis

Or, in the words of one of the pieces I quoted in the previous post:

In such a curve the distribution tapers off slowly into the sunset, and is called a tail. What is most intriguing about this long tail is that if you add up all the traffic at the end of it, you get a lot of traffic

What we’re talking about, clearly, is the Long Tail. Looking at some actual figures for inbound linkage (collected from NZ Bear earlier this year), there are few tied ranks in the higher rankings and more as we go further out: 95 unique values in the first 100 ranks and 79 in the next 100. Further down, the curve grows flatter, as we’d expect. The first ten rankings (ranging from 5,389 down to 2,142 links) correspond to ten sites; the last ten (ranging, predictably, from 9 down to zero) correspond to a total of 14,445. As Adam says, if you were to graph these data as a list of nominals ranked in descending order, the ‘area’ covered by the curve would give you a good visual impression of the total number of links accounted for by low-linked sites: the Long Tail, none other. But this graphic does not conform to a power law – not even Zipf’s Law. A list conforming to Zipf’s Law would drop tied ranks – it would exclude duplicates, if that’s any clearer. Instead of a long tail, it would trail off to the right with a series of widely-spaced fenceposts. (“In equal 9126th place, blogs with 9 links; in equal 9593rd place, 8-linkers…”)

Long Tail, power law: choose one.

You can have a Long Tail, but only by graphing a list of nominals ranked in descending order.

You can have a power law series with rankings, but only by replacing the long tail with scattered fenceposts.

Even more importantly, neither of these is a power law distribution. Given the appropriate data values, you can derive a power law distribution from a ranked list – but it doesn’t look like the ‘long tail’ graphic we know so well. I’ll talk about what it does look like in the next post.

Put your head back in the clouds

OK, let’s talk about the Long Tail.

I’ve been promising a series of posts on the Long Tail myth for, um, quite a while. (What’s a month in blog time? A few of those.) The Long Tail posts begin here.

Here’s what we’re talking about, courtesy of our man Shirky:

We are all so used to bell curve distributions that power law distributions can seem odd. The shape of Figure #1, several hundred blogs ranked by number of inbound links, is roughly a power law distribution. Of the 433 listed blogs, the top two sites accounted for fully 5% of the inbound links between them. (They were InstaPundit and Andrew Sullivan, unsurprisingly.) The top dozen (less than 3% of the total) accounted for 20% of the inbound links, and the top 50 blogs (not quite 12%) accounted for 50% of such links.


Figure #1: 433 weblogs arranged in rank order by number of inbound links.

It’s a popular meme, or it would be if there were any such thing as a meme (maybe I’ll tackle that one another time). Here’s one echo:

many web statistics don’t follow a normal distribution (the infamous bell curve), but a power law distribution. A few items have a significant percentage of the total resource (e.g., inbound links, unique visitors, etc.), and many items with a modest percentage of the resources form a long “tail” in a plot of the distribution. For example, a few websites have millions of links, more have hundreds of thousands, even more have hundreds or thousands, and a huge number of sites have just one, two, or a few.

Another:

if we measure the connectivity of a sample of 1000 web sites, (i.e. the number of other web sites that point to them), we might find a bell curve distribution, with an “average” of X and a standard deviation of Y. If, however, that sample happened to contain google.com, then things would be off the chart for the “outlier” and normal for every other one. If we back off to see the whole web’s connectivity, we find a very few highly connected sites, and very many nearly unconnected sites, a power law distribution whose curve is very high to the left of the graph with the highly connected sites, with a long “tail” to the right of the unconnected sites. This is completely different than the bell curve that folks normally assume

And another:

The Web, like most networks, has a peculiar behavior: it doesn’t follow standard bell curve distributions where most people’s activities are very similar (for example if you plot out people’s heights you get a bell curve with lots of five- and six-foot people and no 20-foot giants). The Web, on the other hand, follows a power law distribution where you get one or two sites with a ton of traffic (like MSN or Yahoo!), and then 10 or 20 sites each with one tenth the traffic of those two, and 100 or 200 sites each with 100th of the traffic, etc. In such a curve the distribution tapers off slowly into the sunset, and is called a tail. What is most intriguing about this long tail is that if you add up all the traffic at the end of it, you get a lot of traffic

All familiar, intuitive stuff. It’s entered the language, after all – we all know what the ‘long tail’ is. And when, for example, Ross writes about somebody who started blogging about cooking at the end of the tail and is now part of the fat head and has become a pro, we all know what the ‘fat head’ is, too – and we know what (and who) is and isn’t part of it.

Unfortunately, the Long Tail doesn’t exist.

To back up that assertion, I’m going to have to go into basic statistics – and trust me, I do mean ‘basic’. In statistics there are three levels of measurement, which is to say that there are three types of variable. You can measure by dividing the field of measurement into discrete partitions, none of which is inherently ranked higher than any other. This car is blue (could have been red or green); this conference speaker is male (could have been female); this browser is running under OS X (could have been Win XP). These are nominal variables. You can code up nominals like this as numbers – 01=blue, 02=red; 1=male, 2=female – but it won’t help you with the analysis. The numbers can’t be used as numbers: there’s no sense in which red is greater than blue, female is greater than male or OS X is – OK, bad example. Since nominals don’t have numerical value, you can’t calculate a mean or a median with them; the most you can derive is a mode (the most frequent value).

Then there are ordinal variables. You derive ordinal variables by dividing the field of measurement into discrete and ordered partitions: 1st, 2nd, 3rd; very probable, quite probable, not very probable, improbable; large, extra-large, XXL, SuperSize. As this last example suggests, the range covered by values of an ordinal variable doesn’t have to exhaust all the possibilities; all that matters is that the different values are distinct and can be ranked in order. Numeric coding starts to come into its own with ordinals. Give ‘large’ (etc) codes 1, 2, 3 and 4, and a statement that (say) ‘50% of size observations are less than 3’ actually makes sense, in a way that it wouldn’t have made sense if we were talking about car colour observations. In slightly more technical language, you can calculate a mode with ordinal variables, but you can also calculate a median: the value which is at the numerical mid-point of the sample, when the entire sample is ordered low to high.

Finally, we have interval/ratio or I/R variables. You derive an I/R variable by measuring against a standard scale, with a zero point and equal units. As the name implies, an I/R variable can be an interval (ten hours, five metres) or a ratio (30 decibels, 30% probability). All that matters is that different values are arithmetically consistent: 3 units minus 2 units is the same as 5 minus 4; there’s a 6:5 ratio between 6 units and 5 units. Statistics starts to take off when you introduce I/R variables. We can still calculate a mode (the most common value) and a median (the midpoint of the distribution), but now we can also calculate a mean: the arithmetic average of all values. (You could calculate a mean for ordinals or even nominals, but the resulting number wouldn’t tell you anything: you can’t take an average of ‘first’, ‘second’ and ‘third’.)
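In code, the three levels and the summaries they’ll bear look something like this (invented values throughout):

from statistics import mean, median, mode

colours = ["blue", "red", "blue", "green"]   # nominal: mode is the only summary
sizes = [1, 2, 2, 3, 4]                      # ordinal codes (large=1 ... SuperSize=4)
heights = [168.2, 172.0, 180.5, 190.0]       # interval/ratio: measured in cm

print(mode(colours))   # 'blue' - legitimate for nominals
print(median(sizes))   # 2 - legitimate because the codes are ordered
print(mean(heights))   # 177.675 - only meaningful for I/R variables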

You can visualise the difference between nominals, ordinals and I/R variables by imagining you’re laying out a simple bar chart. It’s very simple: you’ve got two columns, a long one and a short one. We’ll also assume that you’re doing this by hand, with two rectangular pieces of paper that you’ve cut out – perhaps you’re designing a poster, or decorating a float for the Statistical Parade. Now: where are you going to place those two columns? If they’re nominals (‘red cars’ vs ‘blue cars’), it’s entirely up to you: you can put the short one on the left or the right, you can space them out or push them together, you can do what you like. If they’re ordinals (‘second class degree awards’ vs ‘third class’) you don’t have such a free rein: spacing is still up to you, but you will be expected to put the ‘third’ column to the right of the ‘second’. If they’re I/R variables, finally – ‘180 cm’, ‘190 cm’ – you’ll have no discretion at all: the 180 column needs to go at the 180 point on the X axis, and similarly for the 190.

Almost finished. Now let’s talk curves. The ‘normal distribution’ – the ‘bell curve’ – is a very common distribution of I/R variables: not very many low values on the left, lots of values in the middle, not very many high values on the right. The breadth and steepness of the ‘hump’ varies, but all bell curves are characterised by relatively steep rising and falling curves, contrasting with the relative flatness of the two tails and the central plateau. The ‘power law distribution’ is a less common family of distributions, in which the number of values is inversely proportionate to the value itself or a power of the value. For example, deriving Y values from the inverse of the cube of X:

X value    Y formula        Y value
1          1000 / (1^3)     1000
2          1000 / (2^3)     125
3          1000 / (3^3)     37.037
4          1000 / (4^3)     15.625
5          1000 / (5^3)     8
6          1000 / (6^3)     4.63

As you can see, a power law curve begins high, declines steeply then ‘levels out’ and declines ever more shallowly (it tends towards zero without ever reaching it, in fact).
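The table above takes only a couple of lines to generate:

# Y = 1000 / X^3, rounded to three decimal places.
for x in range(1, 7):
    print(x, round(1000 / x ** 3, 3))
# 1 1000.0 / 2 125.0 / 3 37.037 / 4 15.625 / 5 8.0 / 6 4.63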

Got all that? Right. Quick question: how do you tell a normal distribution from a power-law distribution? It’s simple, really. In one case both low and high values have low numbers of occurrences, while most occurrences are in the central plateau of values around the mean. In the other, the lowest values have the highest numbers of occurrences; most values have low occurrence counts, and high values have the lowest counts of all. In both cases, though, what you’re looking at is the distribution of interval/ratio variables. The peaks and tails of those distribution curves can be located precisely, because they’re determined by the relative counts (Y axis) of different values (X axis) – just as in the case of our imaginary bar chart.

Back to a real bar chart.

Figure #1: 433 weblogs arranged in rank order by number of inbound links.

The shape of Figure #1, several hundred blogs ranked by number of inbound links, is roughly a power law distribution.

As you can see, this actually isn’t a power law distribution – roughly or otherwise. It’s just a list. These aren’t I/R variables; they aren’t even ordinals. What we’ve got here is a graphical representation of a list of nominal variables (look along the X axis), ranked in descending order of occurrences. We can do a lot better than that – but it will mean forgetting all about the idea that low-link-count sites are in a ‘long tail’, while the sites with heavy traffic are in the ‘head’.

[Next post: how we could save the Long Tail, and why we shouldn't try.]

A trick of the eye

A long time ago on a Web site far, far away, Clay Shirky wrote:

“We are all so used to bell curve distributions that power law distributions can seem odd.”

He then traced Pareto-like ‘power law’ curves operating in a number of domains where large numbers of people make unconstrained choices – most memorably, inbound link counts for blogs. The inverse ‘power law’ curve dives steeply, then levels out, glides downwards almost to zero and peters out slowly. And thus was born the ‘Long Tail’.

As I wrote here, there’s a problem with this article, and hence with the ‘Long Tail’ image itself. Despite repeated references to ‘power law distributions’, none of the curves Clay presented were distributions. They were histograms representing ranked lists: in other words, series of numbers ordered from high to low.

What’s the difference? A short answer is that the data Clay presents makes his own comparison with ‘bell curve’ (normal) distributions unsustainable: order from high to low and you will only ever get a downward curve.

For a longer answer, you’ll have to look at some numbers. Here are some x,y values which would give you a normal distribution. (For anyone in danger of glazing over, that’s ‘x’ as in horizontal axis, low to high values running left to right; ‘y’ values are on the vertical axis, low to high running bottom to top).

1 1
2 30
3 100
4 240
5 400
6 600
7 750
8 900
9 960
10 1000
11 1000
12 960
13 900
14 750
15 600
16 400
17 240
18 100
19 30
20 1

OK? And here are some co-ordinates which would give you an inverse power-law distribution:

1 1000
2 444
3 250
4 160
5 111
6 82
7 63
8 49
9 40
10 33
11 28
12 24
13 20
14 18
15 16
16 14
17 12
18 11
19 10
20 9
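(If you want to reproduce that series: the values are consistent with y = 4000/(x+1)², rounded to the nearest integer. That formula is my own reverse-engineering from the numbers, not anything authoritative – but it does match all twenty values.)

# Reverse-engineered from the values above (my inference):
# y = 4000 / (x + 1)^2, rounded half up to the nearest integer.
for x in range(1, 21):
    print(x, int(4000 / (x + 1) ** 2 + 0.5))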

Just for the hell of it, here are some numbers that would give you a direct (ascending) power law distribution:

1 9
2 10
3 11
4 12
5 14
6 16
7 18
8 20
9 24
10 28
11 33
12 40
13 49
14 63
15 82
16 111
17 160
18 250
19 444
20 1000

Finally, by way of contrast, here’s a series of numbers.

1000
444
250
160
111
82
63
49
40
33
28
24
20
18
16
14
12
11
10
9

I’ve sorted these numbers high to low, but – unlike the other three examples – there’s nothing in the data that told me to do that. You could arrange them that way; you could sort them low to high instead; you could even hack them about manually to produce a rather lumpy and uneven bell curve. It’s up to you.

I’m not saying that a ranked listing – arranging numbers like these high to low – is meaningless. The ranked histogram is quite a good graphic – it’s informative (within limits) and easy to grasp. What I am saying is that it’s an arbitrary ordering rather than a distribution. Which is to say, it’s not the best way of representing this data – let alone the only way. It’s a relatively information-poor representation, and one which tends to promote perverse and unproductive ways of thinking about the data.
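To see the arbitrariness in action – a sketch, using the series above:

import random

series = [1000, 444, 250, 160, 111, 82, 63, 49, 40, 33,
          28, 24, 20, 18, 16, 14, 12, 11, 10, 9]

print(sorted(series, reverse=True))   # the familiar 'long tail' shape
print(sorted(series))                 # an 'ascending power law' instead
random.shuffle(series)
print(series)                         # or any order at all - the data doesn't mind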

More about this – and a couple of constructive suggestions – next time I post.

When is a spike not a spike?

When it’s a long tail. Maybe.

David Weinberger writes:

In a conversation with Erica George at the Berkman she pointed out that the demographics of Live Journal don’t always represent one’s experience of Live Journal — the demographics say that teenage girls are the largest users, but if you’re a 25 year old, your social group there may not look that way at all.

Which raises an issue about the way the “long tail” is pictured. Clay’s charts are accurate depictions of his data, but they have a mythic power that’s misleading: The long tail looks like, well, a long tail when in fact it’s a fractal curlicue of relationships.

This is an interesting point in itself – perhaps the blogosphere would be better viewed as a series (archipelago? galaxy?) of more or less closed, more or less interlinked ‘spheres’. I’m not sure how you’d visualise that, though – perhaps something like the Jefferson High School network diagram?

But there’s a broader point about the accuracy of those ‘long tail’ graphics. Adam Marsh made an interesting point here about a recently-discovered ‘long tail’:

Clay refers to “the characteristic long tail of people who use many fewer tags than the power taggers.” While this chart does exhibit a “long tail,” this is simply a result of the fact that the users were ordered by decreasing tag usage (also true of the following three charts) — the X axis here doesn’t represent a value, it is just a sequence of users.

The phrase “long tail” usually refers to the observation that for many distributions, the number of elements with outlying values (the “tail”) may be cumulatively significant compared to the number of elements clustered near the average.

On inspection, it turns out that this is also true of the celebrated ‘Power law and Weblogs’ graphic: there are no values on the X axis, just a list of blogs arranged in descending order of number of links. This matters, because in a graphical representation of a statistical distribution both axes carry information. Typically, values of the variable being measured run low to high on the X axis, left to right, while the count of occurrences of each value runs high to low on the Y axis, top to bottom. Clay wrote, “We are all so used to bell curve distributions that power law distributions can seem odd.” But Clay’s own graphics aren’t so much odd as misleading, and not only because he’s put high values on the left of the graph rather than the right. In effect, he’s got two axes conveying one piece of information. Andrew Sullivan’s blog and Instapundit get a high Y value (lots of links) and a high X value (because all the sites with lots of links have been sorted to the left).

If you took the same numbers and plotted them on an X axis with values – if you produced a graph showing how many blogs had how many links, with zero at the origin on both scales… Well, I don’t know what would happen – but five minutes’ experimentation reminds me that, if you wanted to produce a nice clear series of vertical bars rather than a line that wanders all over the place, you’d need to put ‘number of blogs’ on the Y axis and ‘number of inbound links’ on the X axis, rather than vice versa. (There’s a simple reason for this: some values are unique by definition, others aren’t.) Which in turn means that any vertical spike would represent large numbers of blogs (say, for example, blogs with small numbers of inbound links) while any long tail would represent small numbers (say, for example, the few blogs with lots of links).
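Here’s a minimal sketch of that re-plotting, with invented link counts standing in for the real thing:

from collections import Counter

# Invented inbound-link counts, one entry per blog.
links_per_blog = [0, 0, 1, 0, 2, 1, 300, 0, 1, 3, 2, 0, 1, 45, 0, 1, 2, 7]

# X axis: number of inbound links; Y axis: number of blogs with that count.
distribution = Counter(links_per_blog)
for links in sorted(distribution):
    print(f"{links:>4} links: {'#' * distribution[links]} ({distribution[links]} blogs)")
# The spike sits at the low-link end - lots of barely-linked blogs -
# while the handful of heavily-linked sites trail off to the right.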

Caveat: I haven’t crunched any actual numbers, or even mumbled them gently. But maybe we’ve been looking at this the wrong way round, statistically speaking. Perhaps the long tail is the spike; perhaps the spike is really the long tail.

For Tomorrow (I) – 126 as a limit

Who’s Backing Blair? Probably not Chris Applegate, who says tactical voting is rubbish. Not Ken MacLeod, who fears we’re sleepwalking towards a Tory government. Certainly not Tom Watson MP, who says that making a protest vote is “one hell of a risk”.

This is the first in a series of posts inspired by Backing Blair and its critics: it began as an attempt to identify exactly what was wrong with Tom Watson’s arguments against protest voting. It grew from there; I’m going to be writing about electoral blackmail, Howard Dean’s presidential campaign, the state of the Left and Paul Anderson’s recent revival of Neville’s Inch, among other things. But to begin with, here’s some arithmetic. (Thanks to Electoral Calculus, UK Polling Report and ukpolitical.info, and in particular this site at Keele University, for the figures.)

At present, the Labour Party has 409 MPs out of 658 – a theoretical majority of 160. The number of Scottish constituencies will be reduced by 13 at the next election. In effect, Labour will go into the election with 400 MPs out of 645 – a majority of 155. The figures for the Conservatives and the Liberal Democrats are 164 and 54. (Boring but relevant information: in what follows I’ll use the by-election figures for the two seats which have changed hands at by-elections since 2001 (Leicester South and Brent East), but use the 2001 figures for the four by-election holds (Hartlepool, Birmingham Hodge Hill, Ogmore, Ipswich). I’ll also use the 2001 figures for two seats which have changed hands without an election (Wantage, Shrewsbury & Atcham) and for the 59 redefined Scottish seats; this includes one seat, the Scottish Conservative marginal of Galloway & Upper Nithsdale
