I like a good fallacy; I managed to get the Base Rate Fallacy, the Hawthorne Effect and Goodhart’s Law into one lecture I gave recently. So I was intrigued to run across this passage in Jock Young’s 2004 essay “Voodoo Criminology and the numbers game” (you can find a draft in pdf form here):
Legions of theorists from Robert K Merton through to James Q Wilson have committed Giffen’s paradox: expressing their doubts about the accuracy of the data and then proceeding to use the crime figures with seeming abandon, particularly in recent years when the advent of sophisticated statistical analysis is, somehow, seen to grant permission to skate over the thin ice of insubstantiality.
I like a good fallacy, but paradoxes are even better. So, tell me more about Giffen’s paradox:
Just as with Giffen’s paradox, where the weakness of the statistics is plain to the researchers yet they continue to force-feed inadequate data into their personal computers
Try as I might, I wasn’t seeing the paradox there. A footnote referenced
Giffen, P. (1965), ‘Rates of Crime and Delinquency’ in W. McGrath (ed.), Crime Treatment in Canada
I didn’t have W. McGrath (ed.), Crime Treatment in Canada by me at the time, so I did the next best thing and Googled. I rapidly discovered that Giffen’s paradox is also known as the Giffen paradox, that it’s associated with Giffen goods, and that it’s got nothing to do with Giffen, P. (1965):
Proposed by Scottish economist Sir Robert Giffen (1837-1910) from his observations of the purchasing habits of the Victorian poor, the Giffen paradox states that demand for a commodity increases as its price rises.
Raise the price of bread when there are people on the poverty line – ignoring for the moment the fact that this makes you the rough moral equivalent of Mengele – and those people will buy more bread, to substitute for the meat they’re no longer able to afford. It’s slightly reassuring to note that, notwithstanding Sir Robert’s observations of the Victorian poor, economists have subsequently questioned whether the Giffen paradox has ever actually been observed.
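The bread-and-meat story can be put into toy numbers. A minimal sketch, with all figures invented for illustration: a stylised household must eat a fixed number of meals a week from bread and meat, on a fixed budget, so its purchases are pinned down by two linear constraints.

```python
# Illustrative numbers only: a household that must get `meals` meals
# a week from bread and meat, spending exactly `budget`.
def food_basket(budget, meals, bread_price, meat_price):
    # Solve: bread + meat = meals, and
    #        bread_price*bread + meat_price*meat = budget
    bread = (meat_price * meals - budget) / (meat_price - bread_price)
    meat = meals - bread
    return bread, meat

before = food_basket(budget=20, meals=12, bread_price=1.0, meat_price=3.0)
after = food_basket(budget=20, meals=12, bread_price=1.5, meat_price=3.0)
# before: 8 loaves, 4 meat meals; after the price rise: ~10.7 loaves,
# ~1.3 meat meals. Bread consumption rises as bread gets dearer --
# the Giffen pattern.
```

The mechanism is just the substitution Young's source describes: when bread gets dearer, the budget stretches to even less meat, and the shortfall is made up with yet more bread.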
But none of this cast much light on those researchers force-feeding their personal computers with inadequate data. Eventually I tracked down W. McGrath (ed.), Crime Treatment in Canada. It turns out that the less famous Giffen did in fact describe the willingness of researchers to rely on statistics, after having registered caveats about their quality, as a paradox (albeit “one of the less important paradoxes of modern times”). I still can’t see that this rises to the level of paradox: surely being upfront about the quality of the data you’re processing is exactly what a statistical analyst should do. If initial reservations don’t carry through into the conclusion, that’s another matter – but that’s not a paradox, that’s just misrepresentation.
Paradoxical or not, Giffen’s observation accords with Young’s argument in the paper, which is that criminologists, among other social scientists, place far too much trust in statistical analysis: statistics are only as good as the methods used to produce them, methods which in many cases predictably generate gaps and errors.
It’s a good argument but not a very new or surprising one (perhaps it was newer in 1965). Moreover, Young pushes it in some odd directions. The paper reminded me of Robert Martinson’s 1974 study of rehabilitation programmes, “What Works?” – or rather, of how that paper was received. Martinson demonstrated that no study had conclusively shown any form of rehabilitation to work consistently, and that very few studies of rehabilitation showed any clear result; his paper was seized on by advocates of imprisonment and invoked as proof that nothing worked. This was unjustified on two levels. Firstly, while Martinson’s negatives would justify scepticism about a one-size-fits-all rehabilitation panacea, the detail of his research did suggest that some things worked for some people in some settings. Subsequent research – some of it by Martinson himself – bore out this suggestion, showing reasonably clear evidence that tailored, flexible and multiple interventions can actually do some good. Secondly, if Martinson was sceptical about rehabilitation, he wasn’t any less sceptical about imprisonment: his conclusion was that ex-offenders could be left alone, not that they should be kept locked up (“if we can’t do more for (and to) offenders, at least we can safely do less”). For Martinson, rehabilitation couldn’t cut crime by reforming bad people, because crime wasn’t caused by bad people in the first place. Sadly, the first part of this message was heard much more clearly than the second.
Like Martinson, Young is able to present a whole series of statistical analyses which seem obviously, intuitively wrong. However, what his examples suggest is that statistics from different sources require different types and levels of wariness: some are dependably more trustworthy than others, and some of the less trustworthy are untrustworthy in knowably different ways. But rather than deal individually with the different types of scepticism, levels of scepticism and reasons for scepticism which different analyses provoke, Young effectively concludes that nothing works, or very little:
Am I suggesting an open season on numbers? Not quite: there are … numbers which are indispensable to sociological analysis. Figures of infant mortality, age, marriage and common economic indicators are cases in point, as are, for example, numbers of police, imprisonment rates and homicide incidences in criminology. Others such as income or ethnicity are of great utility but must be used with caution. There are things in the social landscape which are distinct, definite and measurable; there are many others that are blurred because we do not know them – some because we are unlikely ever to know them, others, more importantly, because it is their nature to be blurred. … There are very many cases where statistical testing is inappropriate because the data is technically weak – it will simply not bear the weight of such analysis. There are many other instances where the data is blurred and contested and where such testing is simply wrong.
(In passing, that’s a curious set of solid, trustworthy numbers to save from the wreckage – it’s hard to think of an indicator more bureaucratically produced, socially constructed and culture-bound than “infant mortality”, unless perhaps it’s “marriage”.)
I’ve spent some time designing a system for cataloguing drug, alcohol and tobacco statistics – an area where practically all the data we have is constructed using “blurred and contested” concepts – so I sympathise with Young’s stance here, up to a point. Police drug seizure records, British Crime Survey drug use figures and National Treatment Agency drug treatment statistics are produced in different ways and tell us about different things, even when they appear to be talking about the same thing. (In my experience, people who run archives know about this already and find it interesting, people who use the statistics take it for granted, and IT people don’t know about it and want to fix it.) But: such testing is simply wrong? (Beware the persuasive adverb – try re-reading those last two sentences with the word ‘simply’ taken out.) We know how many people answered ‘yes’ to a question with a certain form of words; we know how many of the same people answered ‘yes’ to a different question; and we know the age distribution of these people. I can’t see that it would be wrong to cross-tabulate question one against question two, or to calculate the mean age of one sub-sample or the other. Granted, it would be wrong to present findings about the group which answered Yes to a question concerning activity X as if they were findings about the group who take part in activity X – but that’s just to say that it’s wrong to misrepresent your findings. Young’s broader sceptical claim – that figures constructed using contested concepts should not or cannot be analysed mathematically – seems… well, wrong.
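The operations defended in that paragraph are mundane enough to spell out. A minimal sketch, with hypothetical respondents and question names invented for illustration: cross-tabulating two yes/no questions and taking the mean age of one sub-sample involves nothing beyond counting and averaging, whatever conceptual blur surrounds the questions' wording.

```python
from collections import Counter

# Hypothetical survey records: each respondent's age plus yes/no
# answers to two differently-worded questions, q1 and q2.
respondents = [
    {"age": 19, "q1": "yes", "q2": "no"},
    {"age": 24, "q1": "yes", "q2": "yes"},
    {"age": 31, "q1": "no", "q2": "yes"},
    {"age": 45, "q1": "no", "q2": "no"},
    {"age": 22, "q1": "yes", "q2": "yes"},
]

# Cross-tabulate question one against question two.
crosstab = Counter((r["q1"], r["q2"]) for r in respondents)

# Mean age of the sub-sample that answered yes to question one.
yes_q1_ages = [r["age"] for r in respondents if r["q1"] == "yes"]
mean_age_yes_q1 = sum(yes_q1_ages) / len(yes_q1_ages)
```

Nothing in those two calculations claims anything about "the group who take part in activity X"; the misrepresentation, if any, comes later, in how the numbers are described.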
Young then repeats the second of the errors of Martinson’s audience: if none of that works, then we can stick with what we know. In this case that means criminology reconceived as cultural ethnography: “a theoretical position which can enter in to the real world of existential joy, fear, false certainty and doubt, which can seek to understand the subcultural projects of people in a world riven with inequalities of wealth and uncertainties of identity”. Fair enough – who’d want a theoretical position which couldn’t enter in to the real world? But the question to ask about creeds is not what’s in them but what they leave out. Here, the invocation of culture seems to presage the abandonment not only of statistical analysis but of materialism.
The usual procedure … is to take the demographics and other factors which correlate with crime in the past and attempt to explain the present or predict the future levels of crime in terms of changes in these variables. The problem here is that people (and young people in particular) might well change independently of these variables. For in the last analysis the factors do not add up and the social scientists begin to have to admit the ghost in the machine.
People … might well change independently of these variables – how? In ways which don’t find any expression in phenomena that might be measured (apart from a drop in crime)? It seems more plausible to say that, while people do freely choose ways to live their lives, they do not do so in circumstances of their own choosing – and that those choices in turn have material effects which create constraints as well as opportunities, for themselves and for others. To put it another way, if the people you’re studying change independently of your variables, perhaps you haven’t got the right variables. Young’s known as a realist, which is one way of being a materialist these days; but the version of criminology he’s proposing here seems, when push comes to shove, to be non- or even anti-materialist (“the ghost in the machine”). That’s an awfully big leap to make, and I don’t think it can be justified by pointing out that some statisticians lie.
What arguments based on statistics need – and crime statistics are certainly no exception – is scepticism, but patient and attentive scepticism: it’s not a question of declaring that statistics don’t tell us anything, but of working out precisely what particular uses of statistics don’t tell us. A case in point is this story in last Friday’s Guardian:
An 8% rise in robberies and an 11% increase in vandalism yesterday marred the latest quarterly crime figures, which showed an overall fall of 2% across all offences in England and Wales.
The rise in street crime was accompanied by British Crime Survey indicators showing that public anxiety about teenagers on the streets, noisy neighbours, drug dealing, drunkenness and rowdiness has continued to increase despite the government’s repeated campaigns against antisocial behaviour. … But police recorded crime figures for the final three months of 2006 compared with 12 months earlier showed that violent crime generally was down by 1%, including a 16% fall in gun crime and an 11% fall in sex offences.
The more authoritative British Crime Survey, which asks 40,000 people about their experience of crime each year, reported a broadly stable crime rate, including violent crime, during 2006. … The 11% increase in vandalism recorded by the BCS and a 2% rise in criminal damage cases on the police figures underlined the increase in public anxiety on five out of seven indicators of antisocial behaviour.
Confused? You should be. Here it is again:
| | Police recorded figures | British Crime Survey |
|---|---|---|
| All crime | down 2% | stable (up 1%*) |
| Violent crime | down 1% | stable |
| Robbery | up 8% | stable (down 1%*) |
| Vandalism | up 2% | up 11% |

* Asterisked figures are from the BCS but weren’t in the Guardian story.
Earlier on in this post I made a passing reference to statistical data being bureaucratically produced, socially constructed and culture-bound. Here’s an example of what that means in practice. Police crime figures are a by-product of the activities of the police in dealing with crime, and as such are responsive to changes in the pattern of those activities: put a lot more police resources into dealing with offence X, or change police procedure so that offences of type X are less likely to go by unrecorded, and the crime rate for offence X will appear to go up (see also cottaging). Survey data, on the other hand, is produced by asking people questions; as such, it’s responsive to variations in the type of people who answer questions and to variations in those people’s memory and mood, not to mention variations in the wording of the questions, the structure of the questionnaire, the ways in which answers are coded up and so on. The two sets of indicators are associated with different sets of extraneous influences; if they both show an increase, the chances are that they’ve both been affected by the same influence. The influence in question may be a single big extraneous factor which affects both sets of figures – for example, a massively-publicised crackdown on particular criminal offences will give them higher priority both in police activities and in the public consciousness. But it may be a genuine increase in the thing being measured – and, more to the point, the chances of it being a genuine increase are much higher than if only one indicator shows an increase.
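That last claim can be checked with a toy Monte Carlo sketch, with all numbers invented for illustration: suppose a genuine increase is a coin-flip, and each of the two indicators registers the real change plus its own independent extraneous noise. Then compare how often the increase is genuine when both indicators rise versus when only one does.

```python
import random

random.seed(1)

def share_genuine(trials=200_000):
    """Among trials where both indicators rise, versus only one,
    what share reflect a genuine increase in the thing measured?"""
    both = [0, 0]  # [times both indicators rose, of those how many genuine]
    one = [0, 0]   # [times exactly one rose, of those how many genuine]
    for _ in range(trials):
        genuine = random.random() < 0.5  # did crime really go up?
        signal = 1.0 if genuine else 0.0
        # each indicator = the real change plus its own extraneous noise
        police = signal + random.gauss(0, 1)
        survey = signal + random.gauss(0, 1)
        p_up, s_up = police > 0.5, survey > 0.5
        if p_up and s_up:
            both[0] += 1
            both[1] += genuine
        elif p_up or s_up:
            one[0] += 1
            one[1] += genuine
    return both[1] / both[0], one[1] / one[0]

p_both, p_one = share_genuine()
# p_both comes out around 0.83, p_one around 0.5: agreement between
# two independently-noisy indicators is much stronger evidence.
```

Under these assumptions a rise in one indicator alone tells you almost nothing, while agreement between the two shifts the odds substantially towards a genuine increase, which is the point being made about the police figures and the BCS.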
In this case, the police have robberies increasing by 8%; the BCS has theft from the person dropping by 1%. That’s an odd discrepancy, and suggests that something extraneous is involved in the police figure; it’s not clear what that might be, though. Vandalism, on the other hand, goes up by 2% if you use police figures but by all of 11% if you use the BCS. Again, this discrepancy suggests that something other than an 11% rise in the actual incidence of vandalism might be involved, and in this case the story suggests what this might be:
British Crime Survey indicators showing that public anxiety about teenagers on the streets, noisy neighbours, drug dealing, drunkenness and rowdiness has continued to increase despite the government’s repeated campaigns against antisocial behaviour
Presumably the government’s repeated campaigns against antisocial behaviour have raised the profile of anti-social behaviour as an issue. Perhaps this has made it more likely that people will feel that behaviour of this type is something to be anxious about, and that incidents of vandalism will be talked about and remembered for weeks or months afterwards (the BCS asks about incidents in the past twelve months).
That’s just one possible explanation: the meaning of figures like these is all in the interpretation, and the interpretation is up to the interpreter. The more important point is that there are things that these figures will and won’t allow you to do. You can say that police figures, unlike the BCS, are a conservative but reliable record of things that have actually happened, and that robbery has gone up by 8% and criminal damage by 2%. You can say that victim surveys, unlike police figures, are an inclusive and valid record of things that people have actually experienced, and that vandalism has gone up by 11% while robbery has gone down by 1%. What you can’t do is refer to “an 8% rise in robberies and an 11% increase in vandalism”: there is no way that the data can give you those two figures.
But that’s not a paradox or even a fallacy – it’s just misuse of statistics.