Category Archives: flummery

On science alone

Like Splinty, I am not inconsiderably annoyed at Private Eye. Oh yes.

In the recent ruckus between Newsnight and the Decent Right thinktank Policy Exchange, the Eye (or at least the enigmatic ‘Ratbiter’) has unaccountably chosen to side with the latter.

Newsnight alleged that Policy Exchange or its researchers had forged the receipts which showed you could buy books spewing out hatred of women, Jews, Christians and moderate Muslims in mosques. The researchers utterly deny any forgery; but the implications of the alleged exposé are explosive: David Cameron’s favourite think-tank was apparently stirring up racial hatred with fraudulent evidence.

Newsnight’s killer claim was that its hacks had organised forensic tests which proved that receipts Policy Exchange said it had collected from the Muslim Education Centre in High Wycombe were dubious. When Policy Exchange said that the centre was selling such titles as Women Who Deserve to go to Hell – for complaining about their husbands and going along with feminist ideas promoted by Jews and Christians – it couldn’t be believed. The BBC stuck by the accusation even though the Muslim Education Centre cheerily told reporters that the books were indeed on sale.

Similarly Newsnight said receipts from the Al-Muntada Al-Islami Trust in west London were suspicious … If Newsnight’s allegations were correct, the al-Muntada centre should be the innocent victim of a disgraceful smear. But the most basic checks show that it wasn’t. At the time the Eye was going to press, the al-Muntada online bookshop was offering [two works cited by Policy Exchange]

There’s a very basic logical fallacy in the argument put forward by Policy Exchange and endorsed by the Eye, which hinges on the unstated proposition that for Muslim bookshops to sell the works of (say) Sayyid Qutb really matters. It’s about working backwards up the chain of causation and treating an intermediate (and perhaps optional) link as if it were the starting point. All sorts of misinterpretations can follow from this error: some gang members grew up listening to gangsta rap, for example, but many people who grew up listening to gangsta rap didn’t go on to join gangs and were never at any risk of doing so. In the case of Qutb, as Splinty says:

What Qutb does do, if you’re a young Muslim alienated from the surrounding society, is provide an intellectual framework for you to understand your alienation. Note that this only works if you’re already an alienated Muslim, and that a Qutbist intellectual framework is not remotely necessary for the alienated Muslim to adopt jihadi ideas.

You can get from A to C via B, but you can also go straight from A to C, or go to B without going on to C. What’s most important is starting at A – and you don’t get there from B.

So there’s a strong argument that Policy Exchange and ‘Ratbiter’ don’t have a case even if we take everything they say at face value. But there’s a more fundamental problem. ‘Ratbiter’ doesn’t go into any detail about the alleged faking of the receipts, resorting to the weaselly adjectives ‘dubious’ and ‘suspicious’ and a reference to sciencey-sounding “forensic tests”. Those scientists, they can prove anything, can’t they? Newsnight will have given those receipts to a bunch of boffins in white coats, they’ll have taken a sample and whizzed it round in a centrifuge or something, and just because some liquid ends up turning red instead of blue…

Actually the tests were a bit more basic – and a bit more conclusive. Here’s Richard Watson of Newsnight (and this has been up since the 14th of December, which presumably was some time before the Eye went to press):

Al-Manaar Muslim Cultural Heritage Centre
the hand-writing on this receipt is very similar – to my eye it looks identical – to the hand-writing on another receipt, said to have been obtained from a mosque in Leyton, 10 miles away [Masjid as-Tawhid]. A registered forensic document examiner concluded that there was “strong evidence” that the two receipts were written by the same person.

Masjid as-Tawhid
The first receipt provided by the researcher was obtained from the bookshop, at 78 Leyton High Road. I did see the carbon copy of this receipt so we know the books were acquired from the bookshop. But both the bookshop manager and the mosque management categorically say they are two separate organisations.

Curiously, we were told that researchers were sent back at a later date to obtain a second receipt on headed paper and that document, printed on an ink-jet printer, introduced the word “mosque” into the receipt for the first time. The address is still given as that of the bookshop. But none of this addresses the worrying fact that the hand-writing on the printed receipt matches that on the receipt from the Muslim Cultural Heritage Centre, 10 miles away.

Al-Muntada
[The receipt was] printed on an ink-jet printer. The forensic ESDA tests carried out by the registered document examiner concluded that this receipt was underneath the receipt from the Muslim Education Centre in High Wycombe when this latter one was written out. Once again the mosque management categorically told us that the receipt provided by the researchers was not a genuine document. Even if the books are available online, there are serious questions about the authenticity of this receipt.

You get the idea.

I read quite a lot of research for the purposes of my day job, and I’ve seen results called into question on much weaker grounds than Newsnight had. If you’ve got good reason to believe that the evidence in front of you isn’t genuine – let alone reason to believe that it’s been faked – then you just don’t trust that research, even if it’s telling you that the sky is sometimes dark at night and Monday tends to come after Sunday. If someone else can get similar results by other means, bully for them – let them publish what they’ve got. But that doesn’t somehow retrospectively validate the faked research, as the Eye seems to imagine.

Ultimately it’s a point about the reliability of the researcher as well as the research. If you’ve got evidence that they’re willing to put their thumb on the scales to get the right answer, from that point on you can’t really trust anything they tell you – unless it begins with “I’m sorry I faked those results”, and even then you’ll want to watch them like a hawk. Unfortunately Policy Exchange’s response to Newsnight can be summed up as “we didn’t fake those results, and what does it matter if we did, and besides you’re no better”.

To push the evidence is bad, but it doesn’t make the research completely invalid. To fake the evidence does invalidate the research, but for the researcher it’s survivable. But to fake the evidence and then refuse to admit it, deny that it matters, change the subject and generally try to bluster your way out of it – you’re off the list, I’m afraid.

The fundamental point ‘Ratbiter’ seems to miss is that this applies just as strongly if the results are plausible – and twice as strongly if the results are in line with the audience’s expectations. Picture the scene: they’re telling you what you want to hear, and it seems believable, but you’ve got evidence that they’re willing to lie about it. It’s a setup that rings some very loud alarm bells for me, but apparently it doesn’t at the Eye. Perhaps ‘Ratbiter’ had better stay well away from time-share presentations.

Just a parasol

The following comment didn’t appear on whatever post it was meant for, as WordPress’s spamcatcher automatically sent it to the bitbucket.

I like your blog and I feel we share sufficient common ground for a link to each others blogs to be mutually beneficial.If you agree to link then please contact me at ‘An Unrepentant Communist’

http://unrepentantcommunist.blogspot.com/

on the commments page of the current post,and I will immediately link your blog to mine.Looking forward to hearing from you.
Gabriel in County Kerry Ireland

Gabriel, for the love of Marx, give it a rest.

Incidentally, can anyone tell me what’s at http://urban75.net/vbulletin/showthread.php?t=222492? There’s a link to this blog there, apparently, but not having an Urban 75 account I can’t tell what it is.

But you don’t know me

I don’t know Tilda Swinton. At all.

There are, of course, many people I don’t know; the list could be extended more or less indefinitely, potentially forming the basis for a rather unchallenging game (“Yeah? Well, I don’t know Charles Kennedy, Jason Orange or Hufty from the Word…”) The point about Tilda Swinton in particular is that, if you stopped me in the street and asked me if I knew her, I’ve got a horrible feeling I’d say Yes. (At least, I used to… Well, when I say ‘know’, I met… actually no, I never actually met… sorry, what was the question?)

Obviously, the image of anyone you’ve seen a lot on the screen can get painted on the back of your mind, to the point where they seem as familiar as a friend or neighbour (“In the street people come up to Rita/It’s Barbara Knox really but they’re still glad to meet her” – Kevin Seisay). I suppose something similar’s going on here, assisted in this case by the fact that I was at the same university as Tilda Swinton for at least one year; I even saw her in a college theatre production once, playing opposite a friend of a friend of mine. (I think. It may have been someone else.)

I’ve never even had any contact with Tilda Swinton, if it comes to that. I did once try to get in touch with her, for a series of brief interviews we were running in Red Pepper at the time. A friend gave me the number of a friend, who she thought had known her and might be able to put me in touch. I duly phoned the friend’s friend, who was a bit taken aback and suggested that if I wanted to speak to Tilda Swinton I should probably go through Tilda Swinton’s management. Nothing ever came of it.

In short, whatever fantasies I may half-consciously harbour, the real world is unanimous on this one: I don’t know Tilda Swinton, at all. I’ve got a friend who’s got a friend who may once have known her, and I had a friend at college who had a friend who may once have acted with her, but none of that adds up to anything.

Or it didn’t, until LinkedIn.

LinkedIn is a social networking site for people who want to make their social network work; it’s designed to enable members to exploit “the professional relationships you already have”. You join LinkedIn by writing a ‘profile’ (a c.v., more or less). You then ‘build your network’ by exchanging emails with existing members of LinkedIn who you already know; the software helpfully provides lists of LinkedIn members who are, or were, at your workplace, former workplace or university. When your emailed invitation has been accepted, the user you invited becomes one of your ‘connections’, while you become one of theirs. Ultimately you end up with a network “consist[ing] of your connections, your connections’ connections, and the people they know, linking you to thousands of qualified professionals”. ‘Thousands’ is no exaggeration: after a month’s membership I’ve got 41 ‘trusted friends and colleagues’, and many LinkedIn users have five or ten times as many. It adds up, or rather multiplies out: if you count “[my] connections’ connections, and the people they know”, I’m connected to over 200,000 people. Woohoo.

There are two main ways to make money out of social software – adding advertising or charging a fee for a premium service – and I’m generally in favour of the latter. This is the route LinkedIn have chosen. Annoyingly, the result in this case is not simply that fee-paying users benefit but that free riders are penalised. The profiles of users outside your network are only shown in full if you’ve got a paid-for account, which can be frustrating. Worse, the highest echelons of power-networking users can opt out of receiving common-or-garden email invitations, so that they can only be contacted using the network’s ‘InMail’ facility – which is, of course, only available on paid-for accounts. There’s being linked in, and then there’s being linked in. I suppose this says something about the nature of the service they’re providing: a professional social network is one with lots of people excluded from it.

The bigger question is what LinkedIn actually provides (apart from the warm glow of knowing that somebody else has been excluded). I wrote last year that tagging, for me, is more an elaborate way of building a mind-map than anything to do with bookmarking pages and finding them again; I’m interested to see that Philipp has reached a similar conclusion (“Let’s put it straight: Using tags to find my bookmarks later just doesn’t work. I give up.”) Similarly, I suspect that one of the main benefits of LinkedIn – at least for us non-power-networkers – is the capacity it gives you to contemplate the scale and plenitude of your own network: all those people I know, sort of! I mean, I know someone who knows them, or else there’s a friend of a friend who knows them… So I sort of know them, really, don’t I, just a bit?

But Tilda Swinton’s not on LinkedIn. So I don’t know her at all.

Wrapped in paper (8)

After all those columns from 1999, here’s one from last month. (And then I’ll get back to proper blogging, probably.) They say you should write about what you know; what I knew, that particular weekend, was beer. ‘Dave Bitzer’ doesn’t represent anyone in particular. Years ago I invented a consultancy called Gargle Bitzer Helipad, and I’ve used various Gargles and Bitzers ever since then to stand in for different talking heads and company spokespeople. Usually I make them talk rubbish, for obvious reasons, but in this one I think Dave talks a lot of sense.

I RAN INTO my old friend and commenting partner Dave Bitzer the other day at local beer tasting event Pale And Bitter (And Slightly Sour). I’d worked my way through the milds by this point and started on the fruit beers. In retrospect I think the second blackcurrant-flavoured porter may have been a mistake.

“Dave!” I put it to him. “How’s it going! How is it going? How’s life in the… well, you know.”

I could see that my incisive style of questioning had caught Dave unprepared. For a moment, in fact, he got so confused that he said “Hello” and then turned his back on me – very much as if he had said “Goodbye”! Pausing only to sample the ginger-flavoured pale ale, I hastened to set his mind at rest.

“Dave,” I put it to him, placing a friendly arm around his shoulders. “David, David, Davey Davey Dave. It’s like this. I mean, is it like this? That’s the thing, you see – is it like this or not? I mean, if you spend your time reading about Web 2.0 on blogs and podcasts… and, and blogcasts…”

Dave said that people who did that should probably get out more, although in my case he’d make an exception. I thought that was a very good point.

“That’s a very good point,” I put it to him. “Thing is, if you read the Webby, Web things, lots of stuff. Lots of stuff happening. Reminds me of the dot boom. Is this another dot boom boom, Dave? Wait a minute, that’s not right. Is this… another… dot dot boom, de-boom boom boom. That’s what I say.”

Dave gave a heavy sigh, clearly impressed with the cogency of my argument. OK, he said, look at it this way. He tore the top layer of paper from a beermat and drew a cross on the exposed surface. So here’s your basic quadrant, he said. You can call this one –

“I’ll call it Henry,” I put it to him.

Dave sighed again, obviously deeply impressed. OK, he said, here’s Henry the Quadrant. Left to right we’ve got usefulness – is an application idea actually useful or not? Top to bottom, marketability, or whether or not you can get people worked up about it. We can rule out a couple of combinations straight away. ‘Dull but useful’ is an uphill battle for any company (apart from companies that have a large installed base they can sell to), and ‘dull and useless’ is best avoided. Clearly, ‘useful and exciting’ is what most developers are aiming for. But here’s the problem. How can developers actually come up with something that’s both exciting and useful? Look at the way we live already – we wear clothes, we drive cars, we synchronise calendars, we download MP3s, we shop around to get the best price for DVDs and don’t worry too much about the Hong Kong customs stamp when they arrive. It all works, basically. So people end up going for “exciting but useless”, and you get applications like Twitter – sounds great, if you like the idea of reading other people’s diaries in real time, but it’s not much use if you’ve actually got a life. Speaking of which, he added cryptically, then turned to look around the room, avoiding my gaze completely. I was touched by this mark of respect and put my arm round his shoulders again.

“So… So, so, so, Dave,” I put it to him. “Tell me, Dave. Is this another dot boom boom?”

Dave made a strange respectful growling noise. Not really, he said, because… oh, never mind. Look, it’s as if a brewery had to keep coming up with something new, and after a while they found they’d done every kind of beer that was actually drinkable, but they just kept going anyway and turned out, I don’t know, ginger-flavoured pale ale or blackcurrant-flavoured porter. There’s just not a lot going on, and the stuff that gets the hype isn’t really worth it. That’s why I’m here, actually.

“What, to check out the brewing… trendy… trendy trends?” I put it to him.

No, Dave said – to get drunk. What do you recommend?

Great big bodies

I think the thing that really irritates me about the Long Tail is just how basic the statistical techniques underlying it are. If you’ve got all that data, why on earth wouldn’t you do something more interesting and more informative with it? It’s really not hard. (In fact it’s so easy that I can’t help feeling the Long Tail image must have some other appeal – but more on that later.)

As you may have noticed, this weblog hasn’t been updated for a while. In fact, when I compared it with the rest of my RSS feed I found it was a bit of an outlier:

[Chart blogs2: number of blogs by days since last update]

The Y axis is ‘number of blogs’: two updated today (zero days ago), 11 in the previous 10 days, 1 in the 10-day period before that, and so on until you get to the 71-80 column. Note that each column is a range of values, and that the columns are touching; technically this is a histogram rather than a bar chart.
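For anyone who wants to try the binning at home, here is a minimal sketch in Python. It assumes the columns are ‘today’, then 1-10 days, 11-20 days and so on, which is how the chart is described above; the function names and the sample figures are made up for illustration and are not the data behind the chart.

```python
from collections import Counter

def bin_index(days_since_update):
    """Bin 0 means 'updated today'; bin k (k >= 1) covers days 10k-9 to 10k."""
    if days_since_update < 0:
        raise ValueError("days_since_update must be non-negative")
    return 0 if days_since_update == 0 else (days_since_update - 1) // 10 + 1

def bin_label(index):
    """Human-readable label for a bin, e.g. '0', '1-10', '11-20'."""
    return "0" if index == 0 else f"{10 * index - 9}-{10 * index}"

def histogram(days):
    """Count blogs per bin, keeping empty bins so the columns stay contiguous."""
    counts = Counter(bin_index(d) for d in days)
    top = max(counts) if counts else -1
    return {bin_label(i): counts.get(i, 0) for i in range(top + 1)}

# Hypothetical sample: days since each blog was last updated.
sample_days = [0, 0, 1, 3, 4, 10, 15, 42, 78]

for label, count in histogram(sample_days).items():
    print(f"{label:>6} | {'#' * count}")
```

Keeping the empty bins is the point: the gaps between the regular posters and the outlier are part of what the histogram tells you.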

You can do something similar with ‘posts in last 100 days’:

[Chart blogs1: number of blogs by posts in the last 100 days]

This shows that the really heavy posters are in the minority in this sample; twelve out of the eighteen have 30 or fewer posts in the last 100 days.

So it looks as if I’m reading a lot of reasonably regular but fairly light bloggers, and a few frequent fliers. If you put the two series together you can see the two groups reflected in the way the sample smears out along the X and Y axes without much in the middle:

[Chart blogs3: days since last update against posts in the last 100 days]

My question is this. If you can produce readable and informative charts like this quickly and easily (and I assure you that you can – we’re talking an hour from start to finish, and most of that went on counting the posts), what on earth would make you prefer this:

[Chart blogs5]

or this:

[Chart blogs4]

I can only think of two reasons. One is that it looks kind of like a power law distribution, and that’s a cool idea. Except that it isn’t a power law distribution, or any kind of distribution – it’s a list ranked in descending order, and, er, that’s it. The same criticism applies, obviously, to the classic ‘power law’ graphic ranking weblogs in descending order of inbound links.

DIGRESSION
You can compute a distribution of inbound links across weblogs using very much the techniques I’ve used here – so many weblogs with one link, so many with two and so forth. Oddly enough, what you end up with then is a curve which falls sharply then tapers off – there are far fewer weblogs with two links than with only one, but not so much of a difference between the ‘20 links’ and ‘21 links’ categories. However, even that isn’t a power law distribution, for reasons explained here and here (reasons which, for the non-mathematician, can be summed up as ‘a power law distribution means something specific, and this isn’t it’).
END DIGRESSION
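To make the difference between a ranked list and a distribution concrete, here is a minimal sketch in Python. The weblog names and link counts are invented for illustration; nothing here is real data, and the two functions are only one way of doing it.

```python
from collections import Counter

# Hypothetical inbound-link counts, one entry per weblog.
inbound_links = {
    "blog_a": 21, "blog_b": 20, "blog_c": 2, "blog_d": 2,
    "blog_e": 1, "blog_f": 1, "blog_g": 1, "blog_h": 1, "blog_i": 1,
}

def ranked(links):
    """The 'Long Tail' view: weblogs sorted by link count, biggest first."""
    return sorted(links.items(), key=lambda item: item[1], reverse=True)

def distribution(links):
    """The frequency view: how many weblogs have exactly n inbound links."""
    return dict(sorted(Counter(links.values()).items()))

print(ranked(inbound_links)[:3])    # who is first
print(distribution(inbound_links))  # how many of which
```

The first function answers ‘who’s first?’; the second answers ‘how many of which?’. Only the second is a distribution in any statistical sense, and even then it takes further work to show that it follows a power law.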

The other reason – and, I suspect, the main reason – is that the Long Tail privileges ranking: the question it suggests isn’t how many of which are doing what? but who’s first?. A histogram might give more information, but it wouldn’t tell me who’s up there in the big head, or how far down the tail I am.

People want to be on top; failing that, they want to fantasise about being on top and identify with whoever’s up there now. Not everyone, but a lot of people. The popularity of the Long Tail image has a lot in common with the popularity of celebrity gossip magazines.

Got a web between his toes

Now that Nick has read the last rites for Web 2.0, perhaps it’s safe to return to a question that’s never quite been resolved.

To wit: what is Web 2.0? (We’ve established that it’s not a snail.) Over at What I wrote, I’ve just put up a March 2003 article called “In Godzilla’s footprint”. In it, I asked similar questions about e-business, taking issue with the standard rhetoric of ‘efficiency’ and ‘empowerment’. I suggested that e-business wasn’t – or rather isn’t – a phenomenon in its own right, but the product of three much larger trends: standardisation, automation and externalisation of costs. (Read the whole thing.)

Assuming for the moment that I called this one correctly – and I find my arguments pretty persuasive – what of Web 2.0? More of the same, only featuring the automation of income generation (AdSense) and the externalisation of payroll costs (‘citizen journalism’)? Or is there more going on – and if so, what?

Update 16/11

It would be remiss of me not to give any pointers to my own thinking on Web 2.0. So I’m republishing another column at What I wrote, this time from February of this year. Most of you will probably have seen it the first time round, when it appeared in iSeries NEWS UK, but I think it’s worth giving it another airing. Have a gander.

We’re all together now, dancing in time

Ryan Carson:

I’d love to add friends to my Flickr account, add my links to del.icio.us, browse digg for the latest big stories, customise the content of my Netvibes home page and build a MySpace page. But you know what? I don’t have time and you don’t either…

Read the whole thing. What’s particularly interesting is a small straw poll at the end of the article, where Ryan asks people who actually work on this stuff what social software apps they use on a day-to-day basis. Six people made 30 nominations in all; Ryan had five of his own for a total of 35.

Here are the apps which got more than one vote:

Flickr (four votes)
Upcoming (two)
Wikipedia (two)

And, er, that’s it.

Social software looks like very big news indeed from some perspectives, but when it’s held to the standard of actually helping people get stuff done, it fades into insignificance. I think there are three reasons for this apparent contradiction. First, there’s the crowd effect – and, since you need a certain number of users before network effects start taking off, any halfway-successful social software application has a crowd behind it. It can easily look as if everyone’s doing it, even if the relevant definition of ‘everyone’ looks like a pretty small group to you and me.

Then there’s the domain effect: tagging and user-rating are genuinely useful and constructive, in some not very surprising ways, within pre-defined domains. (Think of a corporate intranet app, where there is no need for anyone to specify that ‘Dunstable’ means one of the company’s offices, ‘Barrett’ means the company’s main competitor and ‘Monkey’ means the payroll system.) For anyone who is getting work done with tagging, in other words, tagging is going to look pretty good – and, thanks to the crowd effect, it’s going to look like a good thing that everyone’s using.

Thirdly, social software is new, different, interesting and fun, as something to play with. It’s a natural for geeks with time to play with stuff and for commentators who like writing about new and interesting stuff – let alone geek commentators. The hype generates itself; it’s the kind of development that’s guaranteed to look bigger than it is.

Put it all together – and introduce feedback effects, as the community of geek commentators starts to find social software apps genuinely useful within its specialised domain – and social software begins to look like a Tardis in reverse: much, much bigger on the outside than it is on the inside.

That’s not to say that social software isn’t interesting, or that it isn’t useful. But I think that in the longer term those two facets will move apart: useful and productive applications of tagging will be happening under the commentator radar, often behind organisational firewalls, while the stuff that’s interesting and fun to play with will remain… interesting and fun to play with.

Save our kids from this culture

My frustration with the bearpit that is Comment is Free was brought to a head by this bizarre post by David Hirsh. Once again, I’m going to reproduce my CiF comment here, because frankly I think more people will pay attention to it here than there.

First, a word about Hirsh’s argument. He opens thus:

Since before it even existed, Israel has been engaged in two wars with its neighbours. One is a just war, waged by Palestinian Arabs for freedom – which became a demand for Palestinian national independence; the other is a genocidal war that aims to end Jewish life in the Middle East.

The job of the left is to insist on the reality of this distinction and to stand against those who recognise the reality of only one or other of these two separate wars.

The job of the left – ugh. Something very Euston about that formulation – the call to duty, with the implication that this might not be a duty we all like…. But let’s press on.

The problem with social reality is that if enough people believe something to be true, and act as though it is indeed true, then it may become the truth. So if Israelis believe they are only ever fighting a war of survival, then they will use tactics and strategies that are proportionate to the war they believe themselves to be fighting. If Palestinians, meanwhile, come to believe that they can win their freedom only by destroying Israel, then they will think of the Jew-haters of Hamas, Hizbullah, al-Qaeda and the Syrian and Iranian regimes as their allies in the task.

The only way out is for cosmopolitan voices and political movements to insist on the reality of both wars – to separate them conceptually and to stand clearly for a Palestinian victory in the fight for freedom and equally clearly for an Israeli victory in the fight against annihilation.

There’s a certain narrowness to Hirsh’s focus here. I’m quite prepared to nail my colours to the mast and say that I’m not in favour of annihilation, by and large. On the contrary, I’m very much in favour of people who are alive being enabled and permitted to remain alive. But I don’t think this commits me to supporting ‘an Israeli victory’ of any sort, in any set of geopolitical circumstances which I can begin to imagine developing out of the current situation.

But maybe my imagination just isn’t up to the job. A few more words from David, this time in the comment thread:

its not far-fetched to imagine a very serious threat. Imagine if the regime in Syria and Iran were joined, perhaps by a Jihadi-revolutionary regime in Saudi and perhaps a Muslim Brotherhood regime in Egypt. Add these to a Hamas led Palestine and a Hezbullah led Lebanon. This is hypothetical, yes, but entirely possible.

Imagine also, perhaps that the neo-cons in Washington are replaced by the neo-realists – Mearsheimer and Walt advising the White House that it is in the national interest of the US to ditch Israel.

Imagine also a global liberal intelligensia and labour movement that believes the Israelis are so evil that they deserve what’s coming to them.

But its OK, because Israel is heavily armed.

The logic of your position, then, is that it is a good thing that Israel has the 4th largest army in the world (or whatever it is) because it guarantees their survival.

So how do you feel about the proposal of an arms embargo against Israel? How do you feel about the proposal to stop US aid and to stop the US selling arms to Israel?

What then is there to guarantee Israel’s survival?

I’ll stop beating about the bush: I think this argument is silly, offensive and dangerously dishonest. If Israel’s apologists genuinely believe the country is engaged in a fight for survival at this moment, they’re self-deceived to the point of insanity. If they don’t believe that but think that what’s going on now should be understood by reference to a completely hypothetical worst-case scenario, they’re grossly dishonest. Perhaps even more important, the ‘fight for survival’ argument is being used to divert attention from what the Israeli government and army are actually doing; in other words, it’s being made to do work that it couldn’t do even if it was valid.

Here’s a comment I prepared earlier:

David,

I think your argument is interesting & instructive, but not quite in the way that you think it is.

There are (at least) three questions which can legitimately be asked of the state of Israel without arousing suspicions of anti-semitism. Firstly, can the state itself be described as constitutionally unjust, either from its founding or since 1967 (and two-thirds of its history is post-67)? I assume you’d answer No, but many people would answer Yes – including many diaspora Jews and a good few Israelis. But a constitutionally unjust state is one which needs to be replaced, not reformed: replaced through the actions and with the consent of its citizens, certainly, but still replaced. In normal circumstances (I’ll return to this point), asking whether – as a matter of principle – a constitutionally unjust state has the right to perpetuate itself is asking whether injustice has the right to continue.

Secondly, is the state’s posture of perpetual war, and its repeated use of force rather than diplomacy, an appropriate response to the situation Israel finds itself in? Answer No (as many of us do) and any incursion into Gaza, any house demolition, any IDF sniper bullet carries a burden of justification: is this specific action justifiable, or is it just another example of an established, unjust pattern? This is where the allegations of prejudice start flying – those who answer Yes to the second question don’t believe there is any such pattern, and consequently judge each specific action as ‘innocent until proven guilty’.

Lastly, when the state does resort to military force, is its use of force appropriate and proportionate? It’s important to note that this is a completely separate question from the previous one (and does have to be judged on a case by case basis). If I’m fighting for my life and I kill a defenceless passer-by who wasn’t threatening me, I’m still a murderer. (Cf. suicide bombers.)

I found your ‘Imagine’ comment particularly enlightening. Because circumstances alter cases – a position that would be appropriate in normal circumstances isn’t necessarily appropriate in the middle of a war. If Israel were an isolated underdog, entirely surrounded by states which seriously wanted to invade and destroy it, and unable to count on any outside assistance – if this were the case, my answer to question 1 would change (from ‘Yes’ to ‘Maybe, but that’s not important right now’). And if Israel were not only surrounded, outnumbered and outgunned, but on the brink of an exterminationist final conflict – in that case my answer to question 2 would probably change (from ‘No’ to ‘Maybe not, but it’s not for us to say’).

So what’s instructive about your article is the insight it gives into a certain Israeli mindset – a mindset which I can’t regard as being grounded in reality, and one which I’m happy to say isn’t universal among Israelis. I also think it illuminates a further, basically irrational slippage over the third question: are the IDF’s tactics in Gaza and Lebanon (and elsewhere) disproportionate and inhumane? The answer which comes from Israel’s apologists seems to be, essentially, “They had to do something, these people were going to kill them all!” Even in the nightmare scenario where this was actually true, it wouldn’t be an adequate answer: if someone’s trying to kill you, it’s not self-defence to burn out the family who live next door.

Not that anyone appears to be listening to arguments like these. (They certainly aren’t listening on Comment is Free…) In a way that’s the worst thing about the current situation – the sense that the killers of the IDF are doing exactly what the killers of Hezbollah want them to (and vice versa), so that things are likely to get a lot worse before they get better.

It will have blood, they say – blood will have blood.

Don’t have nightmares.

The users geeks don’t see

Nick writes, provocatively as ever, about the recent ‘community-oriented’ redesign of the netscape.com portal:

A few days ago, Netscape turned its traditional portal home page into a knockoff of the popular geek news site Digg. Like Digg, Netscape is now a “news aggregator” that allows users to vote on which stories they think are interesting or important. The votes determine the stories’ placement on the home page. Netscape’s hope, it seems, is to bring Digg’s hip Web 2.0 model of social media into the mainstream. There’s just one problem. Normal people seem to think the entire concept is ludicrous.

Nick cites a post titled Netscape Community Backlash, from which this line leapt out at me:

while a lot of us geeks and 2.0 types are addicted to our own technology (and our own voices, to be honest), it’s pretty darn obvious that A LOT of people want to stick with the status quo

This reminded me of a minor revelation I had the other day, when I was looking for the Java-based OWL reasoner ‘pellet’. I googled for
pellet owl
– just like that, no quotes – expecting to find a ‘pellet’ link at the bottom of forty or fifty hits related to, well, owls and their pellets. In fact, the top hit was “Pellet OWL Reasoner”. (To be fair, if you google
owl pellet
you do get the fifty pages of owl pellets first.)

I think it’s fair to say that the pellet OWL reasoner isn’t big news even in the Web-using software development community; I’d be surprised if everyone reading this post even knows what an OWL reasoner is (or has any reason to care). But there’s enough activity on the Web around pellet to push it, in certain circumstances, to the top of the Google rankings (see for yourself).

Hence the revelation: it’s still a geek Web. Or rather, there’s still a geek Web, and it’s still making a lot of the running. When I first started using the Internet, about ten years ago, there was a geek Web, a hobbyist Web, an academic Web (small), a corporate Web (very small) and a commercial Web (minute) – and the geek Web was by far the most active. Since then the first four sectors have grown incrementally, but the commercial Web has exploded, along with a new sixth sector – the Web-for-everyone of AOL and MSN and MySpace and LiveJournal (and blogs), whose users vastly outnumber those of the other five. But the geek Web is still where a lot of the new interesting stuff is being created, posted, discussed and judged to be interesting and new.

Add social software to the mix – starting, naturally, within the geek Web, as that’s where it came from – and what do you get? You get a myth which diverges radically from the reality. The myth is that this is where the Web-for-everyone comes into its own, where millions of users of what was built as a broadcast Web with walled-garden interactive features start talking back to the broadcasters and breaking out of their walled gardens. The reality is that the voices of the geeks are heard even more loudly – and even more disproportionately – than before. Have a look at the ‘popular’ tags on del.icio.us: as I write, six of the top ten (including all of the top five) relate directly to programmers, and only to programmers. (Number eight reads: “LinuxBIOS – aims to replace the normal BIOS found on PCs, Alphas, and other machines with a Linux kernel”. The unglossed reference to Alphas says it all.) Of the other four, one’s a political video, two are photosets and one is a full-screen animation of a cartoon cat dancing, rendered entirely in ASCII art. (Make that seven of the top ten.)

I’m not a sceptic about social software: ranking, tagging, search-term-aggregation and the other tools of what I persist in calling ethnoclassification are both new and powerful. But they’re most powerful within a delimited domain: a user coming to del.icio.us for the first time should be looking for the ‘faceted search’ option straight away (“OK, so that’s the geek cloud, how do I get it to show me the cloud for European history/ceramics/Big Brother?”). The fact that there is no ‘faceted search’ option is closely related, I’d argue, to the fact that there is no discernible tag cloud for European history or ceramics or Big Brother: we’re all in the geek Web. (Even Nick Carr.) (Photography is an interesting exception – although even there the only tags popular enough to make the del.icio.us tag cloud are ‘photography’, ‘photo’ and ‘photos’. There are 40 programming-related tags, from ajax to xml.)

Social software wasn’t built for the users of the Web-for-everyone. Reaction to the Netscape redesign tells us (or reminds us) that there’s no reason to assume they’ll embrace it.

Update Have a look at Eszter Hargittai’s survey of Web usage among 1,300 American college students, conducted in February and March 2006. MySpace is huge, and Facebook’s even huger, but Web 2.0 as we know it? It’s not there. 1.9% use Flickr; 1.6% use Digg; 0.7% use del.icio.us. Answering a slightly different question, 1.5% have ever visited Boingboing, and 1% Technorati. By contrast, 62% have visited CNN.com and 21% bbc.co.uk. It’s still, very largely, a broadcast Web with walled-garden interactivity. Comparing results like these with the prophecies of tagging replacing hierarchy, Long Tail production and mashups all round, I feel like invoking the story of the blind men and the elephant – except that I’m not even sure we’ve all got the same elephant.

I couldn’t make it any simpler

I hate to say this – I’ve always loathed VR boosters and been highly sceptical about the people they boost – but Jaron Lanier’s a bright bloke. His essay Digital Maoism doesn’t quite live up to the title, but it’s well worth reading (thanks, Thomas).

I don’t think he quite gets to the heart of the current ‘wisdom of the crowds’ myth, though. It’s not Maoism so much as Revivalism: there’s a tight feedback loop between membership of the collective, collective activity and (crucially) celebration of the activity of the collective. Or: celebration of process rather than end-result – because the process incarnates the collective.

Put it this way. Say that (for example) the Wikipedia page on the Red Brigades is wildly wrong or wildly inadequate (which is just as bad); say that the tag cloud for an authoritative Red Brigades resource is dominated by misleading tags (‘kgb’, ‘ussr’, ‘mitrokhin’…). Would a wikipedian or a ‘folksonomy’ advocate see this situation as a major problem? Not being either I can’t give an authoritative answer, but I strongly suspect the answer would be No: it’s all part of the process, it’s all part of the collective self-expression of wikipedians and the growth of the folksonomy, and if the subject experts don’t like it they should just get their feet wet and start tagging and editing themselves. And if, in practice, the experts don’t join in – perhaps, in the case of Wikipedia, because they don’t have the stomach for the kind of ‘editing’ process which saw Jaron Lanier’s own corrections get reverted? Again, I don’t know for sure, but I suspect the answer would be another shrug: the wiki’s open to all – and tagspace couldn’t be more open – so who’s to blame, if you can’t make your voice heard, but you? There’s nothing inherently wrong with the process, except that you’re not helping to improve it. There’s nothing inherently wrong with the collective, except that you haven’t joined it yet.

Two quotes to clarify (hopefully) the connection between collective and process. Michael Wexler:

our understanding of things changes and so do the terms we use to describe them. How do I solve that in this open system? Do I have to go back and change all my tags? What about other people’s tags? Do I have to keep in mind all the variations on tags that reflect people’s different understanding of the topics?

The social connected model implies that the connections are the important part, so that all you need is one tag, one key, to flow from place to place and discover all you need to know. But the only people who appear to have time to do that are folks like Clay Shirky. The rest of us need to have information sorted and organized since we actually have better things to do than re-digest it.

What tagging does is attempt to recreate the flow of discovery. That’s fine… but what taxonomy does is recreate the structure of knowledge that you’ve already discovered. Sometimes, I like flowing around and stumbling on things. And sometimes, that’s a real pita. More often than not, the tag approach involves lots of stumbling around and sidetracks.

It’s like Family Feud [a.k.a. Family Fortunes - PJE]. You have to think not of what you might say to a question, you have to guess what the survey of US citizens might say in answer to a question. And that’s really a distraction if you are trying to just answer the damn question.

And our man Lanier:

there’s a demonstrative ritual often presented to incoming students at business schools. In one version of the ritual, a large jar of jellybeans is placed in the front of a classroom. Each student guesses how many beans there are. While the guesses vary widely, the average is usually accurate to an uncanny degree.

This is an example of the special kind of intelligence offered by a collective. It is that peculiar trait that has been celebrated as the “Wisdom of Crowds,”

The phenomenon is real, and immensely useful. But it is not infinitely useful. The collective can be stupid, too. Witness tulip crazes and stock bubbles. Hysteria over fictitious satanic cult child abductions. Y2K mania. The reason the collective can be valuable is precisely that its peaks of intelligence and stupidity are not the same as the ones usually displayed by individuals. Both kinds of intelligence are essential.

What makes a market work, for instance, is the marriage of collective and individual intelligence. A marketplace can’t exist only on the basis of having prices determined by competition. It also needs entrepreneurs to come up with the products that are competing in the first place. In other words, clever individuals, the heroes of the marketplace, ask the questions which are answered by collective behavior. They put the jellybeans in the jar.

To illustrate this, once more (just the once) with the Italian terrorists. There are tens of thousands of people, at a conservative estimate, who have read enough about the Red Brigades to write that Wikipedia entry: there are a lot of ill-informed or partially-informed or tendentious books about terrorism out there, and some of them sell by the bucketload. There are probably only a few hundred people who have read Gian Carlo Caselli and Donatella della Porta’s long article “The History of the Red Brigades: Organizational structures and Strategies of Action (1970-82)” – and I doubt there are twenty who know the source materials as well as the authors do. (I’m one of the first group, obviously, but certainly not the second.) Once the work’s been done anyone can discover it, but discovery isn’t knowledge: the knowledge is in the words on the pages, and ultimately in the individuals who wrote them. They put the jellybeans in the jar.

This is why (an academic writes) the academy matters, and why academic elitism is – or at least can be – both valid and useful. Jaron:

The balancing of influence between people and collectives is the heart of the design of democracies, scientific communities, and many other long-standing projects. There’s a lot of experience out there to work with. A few of these old ideas provide interesting new ways to approach the question of how to best use the hive mind.

Scientific communities … achieve quality through a cooperative process that includes checks and balances, and ultimately rests on a foundation of goodwill and “blind” elitism — blind in the sense that ideally anyone can gain entry, but only on the basis of a meritocracy. The tenure system and many other aspects of the academy are designed to support the idea that individual scholars matter, not just the process or the collective.

I’d go further, if anything. Academic conversations may present the appearance of a collective, but it’s a collective where individual contributions are preserved and celebrated (“Building on Smith’s celebrated critique of Jones, I would suggest that Smith’s own analysis is vulnerable to the criticisms advanced by Evans in another context…”). That is, academic discourse looks like a conversation – which wikis certainly can do, although Wikipedia emphatically doesn’t.

The problem isn’t the technology, in other words: both wikis and tagging could be ways of making conversation visible, which inevitably means visualising debate and disagreement. The problem is the drive to efface any possibility of conflict, effectively repressing the appearance of debate in the interest of presenting an evolving consensus. (Or, I could say, the problem is the tendency of people to bow and pray to the neon god they’ve made, but that would be a bit over the top – and besides, Simon and Garfunkel quotes are far too obvious.)

Update 13th June

I wrote (above): It’s not Maoism so much as Revivalism: there’s a tight feedback loop between membership of the collective, collective activity and (crucially) celebration of the activity of the collective. Or: celebration of process rather than end-result – because the process incarnates the collective.

Here’s Cory Doctorow, responding to Lanier:

Wikipedia isn’t great because it’s like the Britannica. The Britannica is great at being authoritative, edited, expensive, and monolithic. Wikipedia is great at being free, brawling, universal, and instantaneous.

If you suffice yourself with the actual Wikipedia entries, they can be a little papery, sure. But that’s like reading a mailing-list by examining nothing but the headers. Wikipedia entries are nothing but the emergent effect of all the angry thrashing going on below the surface. No, if you want to really navigate the truth via Wikipedia, you have to dig into those “history” and “discuss” pages hanging off of every entry. That’s where the real action is, the tidily organized palimpsest of the flamewar that lurks beneath any definition of “truth.” The Britannica tells you what dead white men agreed upon, Wikipedia tells you what live Internet users are fighting over.

The Britannica truth is an illusion, anyway. There’s more than one approach to any issue, and being able to see multiple versions of them, organized with argument and counter-argument, will do a better job of equipping you to figure out which truth suits you best.

Quoting myself again, There’s nothing inherently wrong with the process, except that you’re not helping to improve it. There’s nothing inherently wrong with the collective, except that you haven’t joined it yet.

When there is no outside

Nick Carr’s hyperbolically-titled The Death of Wikipedia has received a couple of endorsements and some fairly vigorous disagreement, unsurprisingly. I think it’s as much a question of tone as anything else. When Nick reads the line

certain pages with a history of vandalism and other problems may be semi-protected on a pre-emptive, continuous basis.

it clearly sets alarm bells ringing for him, as indeed it does for me (“Ideals always expire in clotted, bureaucratic prose”, Nick comments). Several of his commenters, on the other hand, sincerely fail to see what the big deal might be: it’s only a handful of pages, it’s only semi-protection, it’s not that onerous, it’s part of the continuing development of Wikipedia editing policies, Wikipedia never claimed to be a totally open wiki, there’s no such thing as a totally open wiki anyway…

I think the reactions are as instructive as the original post. No, what Nick’s pointing to isn’t really a qualitative change, let alone the death of anything. But yes, it’s a genuine problem, and a genuine embarrassment to anyone who takes the Wikipedian rhetoric seriously. Wikipedia (“the free encyclopedia that anyone can edit”) routinely gets hailed for its openness and its authority, only not both at the same time – indeed, maximising one can always be used to justify limits on the other. As here. But there’s another level to this discussion, which is to do with Wikipedia’s resolution of the openness/authority balancing-act. What happens in practice is that the contributions of active Wikipedians take precedence over both random vandals and passing experts. In effect, both openness and authority are vested in the group.

In some areas this works well enough, but in others it’s a huge problem. I use Wikipedia myself, and occasionally drop in an edit if I see something that’s crying out for correction. Sometimes, though, I see a Wikipedia article that’s just wrong from top to bottom – or rather, an article where verifiable facts and sustainable assertions alternate with errors and misconceptions, or are set in an overall argument which is based on bad assumptions. In short, sometimes I see a Wikipedia article which doesn’t need the odd correction, it needs to be pulled and rewritten. I’m not alone in having this experience: here’s Tom Coates on ‘penis envy’ and Thomas Vander Wal (!) on ‘folksonomy’, as well as me on ‘anomie’.

It’s not just a problem with philosophical concepts, either – I had a similar reaction more recently to the Wikipedia page on the Red Brigades. On the basis of the reading I did for my doctorate, I could rewrite that page from start to finish, leaving in place only a few proper names and one or two of the dates. But writing this kind of thing is hard and time-consuming work – and I’ve got quite enough of that to do already. So it doesn’t get done.

I don’t think this is an insurmountable problem. A while ago I floated a cunning plan for fixing pages like this, using PledgeBank to mobilise external reserves of peer-pressure; it might work, and if only somebody else would actually get it rolling I might even sign up. But I do think it’s a problem, and one that’s inherent to the Wikipedia model.

To reiterate, both openness and authority are vested in the group. Openness: sure, Wikipedia is as open to me as any other registered editor d00d, but in practice the openness of Wikipedia is graduated according to the amount of time you can afford to spend on it. As for authority, I’m not one, but (like Debord) I have read several good books – better books, to be blunt, than those relied on by the author[s] of the current Red Brigades article. But what would that matter unless I was prepared to defend what I wrote against bulk edits by people who disagreed – such as, for example, the author[s] of the current article? On the other hand, if I was prepared to stick it out through the edit wars, what would it matter whether I knew my stuff or not? This isn’t just random bleating. When I first saw that Red Brigades article I couldn’t resist one edit, deleting the completely spurious assertion that the group Prima Linea was a Red Brigades offshoot. When I looked at the page again the next day, my edit had been reverted.

Ultimately Wikipedia isn’t about either openness or authority: it’s about the collective activity of editing Wikipedia and being a Wikipedian. From that, all else follows.

Update 2/6/06 (in response to David, in comments)

There are two obvious problems with the Wikipedia page on the Brigate Rosse, and one that’s larger but more diffuse. The first problem is that it’s written in the present tense; it’s extremely dubious that there’s any continuity between the historic Brigate Rosse and the gang who shot Biagi, let alone that they’re simply, unproblematically the same group. This alone calls for a major rewrite. Secondly, the article is written very much from a police/security-service/conspiracist stance, with a focus on questions like whether the BR was assisted by the Czech security services or penetrated by NATO. But this tends to reinforce an image of the BR as a weird alien force which popped up out of nowhere, rather than an extreme but consistent expression of broader social movements (all of which has been documented).

The broader problem – which relates to both of the specific points – goes back to a problem with the amateur-encyclopedia format itself: Wikipedia implicitly asks what a given topic is, which prompts contributors to think of their topic as having a core, essential meaning (I wrote about this last year). The same problem can arise in a ‘proper’ encyclopedia, but there it’s generally mitigated by expertise: somebody who’s spent several years studying the broad Italian armed struggle scene is going to be motivated to relate the BR back to that scene, rather than presenting it as an utterly separate thing. The motivation will be still greater if the expert on the BR has also been asked to contribute articles on Prima Linea, the NAP, etc. This, again, is something that happens (and works, for all concerned) in the kind of restricted conversations that characterise academia, but isn’t incentivised by the Wikipedia conversation – because the Wikipedia conversation doesn’t go anywhere else. Doing Wikipedia is all about doing Wikipedia.

Some day this will all be yours

Scott Karp:

What if dollars have no place in the new economics of content?

In media 1.0, brands paid for the attention that media companies gathered by offering people news and entertainment (e.g. TV) in exchange for their attention. In media 2.0, people are more likely to give their attention in exchange for OTHER PEOPLE’S ATTENTION. This is why MySpace can’t effectively monetize its 70 million users through advertising — people use MySpace not to GIVE their attention to something that is entertaining or informative (which could thus be sold to advertisers) but rather to GET attention from other users.

MySpace can’t sell attention to advertisers because the site itself HAS NONE. Nobody pays attention to MySpace — users pay attention to each other, and compete for each other’s attention — it’s as if the site itself doesn’t exist.

You see the same phenomenon in blogging — blogging is not a business in the traditional sense because most people do it for the attention, not because they believe there’s any financial reward. What if the economics of media in the 21st century begin to look like the economics of poetry in the 20th century? — Lots of people do it for their own personal gratification, but nobody makes any money from it.

Pedantry first: it’s inconceivable that we’ll reach a point where nobody makes any money from the media, at least this side of the classless society. Even the hard case of blogging doesn’t really stand up – I could name half a dozen bloggers who have made money or are making money from their blogs, without pausing to think.

It’s a small point, but it’s symptomatic of the enthusiastic looseness of Karp’s argument. So I welcomed Nicholas Carr’s counterblast, which puts Karp together with some recent comments by Esther Dyson:

“Most users are not trying to turn attention into anything else. They are seeking it for itself. For sure, the attention economy will not replace the financial economy. But it is more than just a subset of the financial economy we know and love.”

Here’s Carr:

I fear that to view the attention economy as “more than just a subset of the financial economy” is to misread it, to project on it a yearning for an escape (if only a temporary one) from the consumer culture. There’s no such escape online. When we communicate to promote ourselves, to gain attention, all we are doing is turning ourselves into goods and our communications into advertising. We become salesmen of ourselves, hucksters of the “I.” In peddling our interests, moreover, we also peddle the commodities that give those interests form: songs, videos, and other saleable products. And in tying our interests to our identities, we give marketers the information they need to control those interests and, in the end, those identities. Karp’s wrong to say that MySpace is resistant to advertising. MySpace is nothing but advertising.

Now, this is good, bracing stuff, but I think Carr bends the stick a bit too far the other way. I know from my own experience that there’s a part of my life labelled Online Stuff, and that most of my reward for doing Online Stuff is attention from other people doing Online Stuff. Real-world payoffs – money, work or just making new real-world friends – are nice to get, but they’re not what it’s all about.

The real trouble is that Karp has it backwards. Usenet – where I started doing Online Stuff, ten years ago – is a model of open-ended mutual whuffie exchange. (A very imperfect model, given the tendency of social groups to develop boundaries and hierarchies, but at least an unmonetised one.) Systematised whuffie trading came along later. The model case here is eBay, where there’s a weird disconnect between meaning and value. Positive feedback doesn’t really mean that you think the other person is a “great ebayer” – it doesn’t really mean anything, any more than “A+++++” means something distinct from “A++++” or “A++++++”. What it does convey is value: it makes it that much easier for the other person to make money. It also has attention-value, making the other person feel good for no particular real-world reason, but even this is quantifiable (“48! I’m up to 48!”).

Ultimately Dyson and Carr are both right. The ‘attention economy’ of Online Stuff is new, absorbing and unlike anything that went before – not least because of the way in which it gratifies fantasies of being truly appreciated, understood, attended to. But, to the extent that the operative model is eBay rather than Usenet, it is nothing other than a subset of the financial economy. Karp may be right about the specific case of MySpace, but I can’t help distrusting his exuberance – not least because, in my experience, the suffix ‘2.0’ is strongly associated with a search for new ways to cash in.

We are bored in the city

Et la piscine de la rue des Fillettes. Et le commissariat de police de la rue du Rendez-Vous. La clinique médico-chirurgicale et le bureau de placement gratuit du quai des Orfèvres. Les fleurs artificielles de la rue du Soleil. L’hôtel des Caves du Château, le bar de l’Océan et le café du Va et Vient. L’hôtel de l’Epoque.

Et l’étrange statue du Docteur Philippe Pinel, bienfaiteur des aliénés, dans les derniers soirs de l’été. Explorer Paris.

The early situationists, following Chtcheglov’s lead, turned urban wandering into a form of political/psychological exploration, a group encounter with the city mediated only by alcohol. At a less exalted level, I’ve long been fascinated by the kind of odd urban poetry evoked here, in Manchester as much as Paris, and by the changing articulation of city space: established cities are a slow-motion example of Marx’s dictum about how we make our lives within conditions we have inherited. So it’s easy to see how well this could work:

Socialight lets you put virtual “sticky” notes called StickyShadows anywhere in the real world. Share pictures, notes and more using your cell phone.

But – for all that the site says about restricting access to Groups and Contacts – it’s also easy to see how very badly it could work.

* I leave a note for all my friends at the mall to let them know where I’m hanging out. All my friends in the area see it.
* A woman shows all her close friends the tree under which she had her first kiss.
* An entire neighborhood gets together and documents all the unwanted litter they find in an effort to share ownership of a community problem.
* A food-lover uses Socialight to share her thoughts on the amazing vanilla milkshakes at a new shop.
* The neighborhood historian creates her own walking tour for others to follow.
* A group of friends create their own scavenger hunt.
* A tourist takes place-based notes about stores in a shopping district, only for himself, for a time when he returns to the same city.
* A small business places StickyShadows that its customers would be interested in finding.
* A band promotes an upcoming show by leaving a StickyShadow outside the venue.

It was all going so well (although I did wonder why that entire neighbourhood couldn’t just pick up the litter) right up to the last two. Advertising – yep, that’s just what we all want more of in our urban lives. Lots of nice intrusive advertising.

Anne:

The worst thing about taking-for-granted that our experiences with the city and each other will be “enriched” by more data, by more information, by making the invisible visible, etc., is that we never have to account for or be accountable to how.

More specifically, there’s a huge difference between enabling conversation and enabling people to be informed – in other words, between talking-with and being-talked-at. Social software is all about conversation – about enabling people to talk together. Moreover, any conversation is defined as much by what it shuts out as what it includes; it’s hard to listen to the people you want to talk with when you’re being talked at. Even setting aside the information-overload potential of all those overlapping groups (do I need to know where so-and-so had her first kiss? do I need to know now?), it’s clear that Socialight is trying to serve two ends which are not only incompatible but opposed – and only one of which pays money. Which is probably why, even though the technology is still in beta, I already feel that using it constructively would be going against the grain.

Which side of the table?

[Pardon the long silence - life called.]

Shelley:

I think Google Base is a fun experiment, and I’m willing to play a little. It will be interesting to see the directory, especially if the company provides web services that aren’t limited to so many queries a day. But I never forget that Google is in the business to make a profit. If we give it the power, it will become the Wal-Mart of the waves–by default if not by design. Is that what you all want? If it is, just continue getting all misty eyed, because you’ll need blurred vision not to see what should be right in front of you.

I think this is precisely right. I find a lot of the comment on Google Base strange and slightly depressing, in the same way I find a lot of Web 2.0 talk strange and depressing. In the context of social software, when I use a word like ‘enclose’ – or a word like ‘monetise’ – it means something quite specific and entirely negative: it’s a red-flag word. So it’s weird, to say the least, to see the same words used positively. It’s only a little less strange to see these concerns acknowledged, then batted away as trivial or meaningless (Tom: “Making data available for everyone to use is keeping it in the public sphere.” (I’m Phil #2 in those comments, by the way)).

It seems to me that there’s a fundamental tension between the demands of commerce and the nature of social software, as defined by Tom some time ago:

We believe that for a piece of Social Software to be useful:

  • Every individual should derive value from their contributions
  • Every contribution should provide value to their peers as well
  • The site or organisation that hosts the service should be able to derive value from the aggregate of the data and should be able to expose that value back to individuals

You add content, which has value for you and to other users; the host derives further value from the aggregate of content; the host exposes that added value to all users. Beautiful.

What this suggests is that social software – unlike, say, the e-business of the late 1990s – is all about the content. Specifically, it’s all about freely and collectively contributed content, either held in common or held in trust for the commons. So monetising social software is qualitatively different from making money out of a new piece of stand-alone software, because it’s the contributed content which has the original value – and makes the added value possible. Tangentially, I’m not sure whether Doc Searls gets this or not, and not only because his article’s had a well-deserved slashdotting. On a first reading I got the impression he was saying that both the Net and the Web are valuable common resources which should not be fenced off for the sake of making money, but that part of what makes them valuable is that they’re great environments for fencing things off and making money. (Oh, and we should stop saying ‘common’ because if you put ‘ist’ on the end it sounds kind of like ‘communist’, and when Eric Raymond hears the word ‘Communist’ he reaches for, well, you know.) It’s a great article, though, and I look forward to taking a more considered look at it when the tide subsides.

One of Doc’s points is that the Web is, figuratively, all about publishing – a profession which, like many bloggers, I’ve seen close up. Just under ten years ago, I went from Unix sysadmin to magazine editor, and rapidly discovered that commercial publishing looks very different from the inside. Perhaps the biggest single shock was the realisation that content doesn’t matter. Obviously I tried to make it the best magazine I could (and it got better still under my successor), but at a fundamental level editorial content wasn’t what it was about. If the advertising department sold enough space, we made a profit; if they didn’t, we didn’t. (Show me a magazine that relies on the cover price and I’ll show you a magazine with money worries. Show me a publication that gets by on the cover price and I’ll show you an academic journal.) The purpose of the magazine was to put advertisements in front of readers – and the purpose of the editorial was to make readers turn all the pages.

So there’s nothing very new about Google’s business model: Google Base is to the Web what a commercial magazine is to a fanzine – or rather, a whole mass of different fanzines. The only novelty is that we, the fanzine writers, are providing the content: the content whose sole function, from the point of view of Google as a commercial entity, is to attract an audience which will look at ads.

But that’s quite a big novelty.

Bill Burnham:

In my next post I will talk about Google Base’s impact on the “walled garden” listings sites. I’ll give you a hint: it won’t be pretty.

Unless, of course, you like really big gardens with really high walls.

Update: I wrote, I find a lot of the comment on Google Base strange and slightly depressing, in the same way I find a lot of Web 2.0 talk strange and depressing. Cue Cringely:

Google has the reach and the resources … And you know whose strategy this is? Wal-Mart’s. And unless Google comes up with an ecosystem to allow their survival, that means all the other web services companies will be marginalized. … the final result is that Web 2.0 IS Google. Microsoft can’t compete. Yahoo probably can’t compete. Sun and IBM are like remora, along for the ride. And what does it all cost, maybe $1 billion? That’s less than Microsoft spends on legal settlements each year. Game over.

As an aside, I love the idea of International Business Machines as a parasite on the behemoth that is Google; I don’t think we’re quite there yet. But the accuracy or not of Cringely’s prediction concerns me less than his tone, which I think can reasonably be called lip-smacking: “Google’s going to 0wn the Web! Wow!” (Quick test: try reading that sentence out loud with a straight face. Now try substituting ‘Microsoft’ – or, for older readers, ‘IBM’.) This enthusiasm for big business – as long as it’s a cool big business – strikes me as both dangerous and weird, not to mention being the antithesis of what’s made the Net fun to work with all these years. But it is a logical development of one branch of the ‘Web 2.0’ hype – an increasingly dominant branch, unfortunately.

Put your head back in the clouds

OK, let’s talk about the Long Tail.

I’ve been promising a series of posts on the Long Tail myth for, um, quite a while. (What’s a month in blog time? A few of those.) The Long Tail posts begin here.

Here’s what we’re talking about, courtesy of our man Shirky:

We are all so used to bell curve distributions that power law distributions can seem odd. The shape of Figure #1, several hundred blogs ranked by number of inbound links, is roughly a power law distribution. Of the 433 listed blogs, the top two sites accounted for fully 5% of the inbound links between them. (They were InstaPundit and Andrew Sullivan, unsurprisingly.) The top dozen (less than 3% of the total) accounted for 20% of the inbound links, and the top 50 blogs (not quite 12%) accounted for 50% of such links.


Figure #1: 433 weblogs arranged in rank order by number of inbound links.

It’s a popular meme, or it would be if there were any such thing as a meme (maybe I’ll tackle that one another time). Here’s one echo:

many web statistics don’t follow a normal distribution (the infamous bell curve), but a power law distribution. A few items have a significant percentage of the total resource (e.g., inbound links, unique visitors, etc.), and many items with a modest percentage of the resources form a long “tail” in a plot of the distribution. For example, a few websites have millions of links, more have hundreds of thousands, even more have hundreds or thousands, and a huge number of sites have just one, two, or a few.

Another:

if we measure the connectivity of a sample of 1000 web sites, (i.e. the number of other web sites that point to them), we might find a bell curve distribution, with an “average” of X and a standard deviation of Y. If, however, that sample happened to contain google.com, then things would be off the chart for the “outlier” and normal for every other one. If we back off to see the whole web’s connectivity, we find a very few highly connected sites, and very many nearly unconnected sites, a power law distribution whose curve is very high to the left of the graph with the highly connected sites, with a long “tail” to the right of the unconnected sites. This is completely different than the bell curve that folks normally assume

And another:

The Web, like most networks, has a peculiar behavior: it doesn’t follow standard bell curve distributions where most people’s activities are very similar (for example if you plot out people’s heights you get a bell curve with lots of five- and six-foot people and no 20-foot giants). The Web, on the other hand, follows a power law distribution where you get one or two sites with a ton of traffic (like MSN or Yahoo!), and then 10 or 20 sites each with one tenth the traffic of those two, and 100 or 200 sites each with 100th of the traffic, etc. In such a curve the distribution tapers off slowly into the sunset, and is called a tail. What is most intriguing about this long tail is that if you add up all the traffic at the end of it, you get a lot of traffic

All familiar, intuitive stuff. It’s entered the language, after all – we all know what the ‘long tail’ is. And when, for example, Ross writes about somebody who started blogging about cooking at the end of the tail and is now part of the fat head and has become a pro, we all know what the ‘fat head’ is, too – and we know what (and who) is and isn’t part of it.

Unfortunately, the Long Tail doesn’t exist.

To back up that assertion, I’m going to have to go into basic statistics – and trust me, I do mean ‘basic’. In statistics there are three levels of measurement, which is to say that there are three types of variable. You can measure by dividing the field of measurement into discrete partitions, none of which is inherently ranked higher than any other. This car is blue (could have been red or green); this conference speaker is male (could have been female); this browser is running under OS X (could have been Win XP). These are nominal variables. You can code up nominals like this as numbers – 01=blue, 02=red; 1=male, 2=female – but it won’t help you with the analysis. The numbers can’t be used as numbers: there’s no sense in which red is greater than blue, female is greater than male or OS X is – OK, bad example. Since nominals don’t have numerical value, you can’t calculate a mean or a median with them; the most you can derive is a mode (the most frequent value).

Then there are ordinal variables. You derive ordinal variables by dividing the field of measurement into discrete and ordered partitions: 1st, 2nd, 3rd; very probable, quite probable, not very probable, improbable; large, extra-large, XXL, SuperSize. As this last example suggests, the range covered by values of an ordinal variable doesn’t have to exhaust all the possibilities; all that matters is that the different values are distinct and can be ranked in order. Numeric coding starts to come into its own with ordinals. Give ‘large’ (etc) codes 1, 2, 3 and 4, and a statement that (say) ‘50% of size observations are less than 3’ actually makes sense, in a way that it wouldn’t have made sense if we were talking about car colour observations. In slightly more technical language, you can calculate a mode with ordinal variables, but you can also calculate a median: the value which is at the numerical mid-point of the sample, when the entire sample is ordered low to high.

Finally, we have interval/ratio or I/R variables. You derive an I/R variable by measuring against a standard scale, with a zero point and equal units. As the name implies, an I/R variable can be an interval (ten hours, five metres) or a ratio (30 decibels, 30% probability). All that matters is that different values are arithmetically consistent: 3 units minus 2 units is the same as 5 minus 4; there’s a 6:5 ratio between 6 units and 5 units. Statistics starts to take off when you introduce I/R variables. We can still calculate a mode (the most common value) and a median (the midpoint of the distribution), but now we can also calculate a mean: the arithmetic average of all values. (You could calculate a mean for ordinals or even nominals, but the resulting number wouldn’t tell you anything: you can’t take an average of ‘first’, ‘second’ and ‘third’.)
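To make the three levels concrete, here is a minimal Python sketch. The observations are invented purely for illustration (they are not data from anywhere), and the size codes follow the 1 to 4 coding suggested above. It computes only the summaries that make sense at each level: mode for nominals, mode and median for ordinals, and mode, median and mean for I/R variables.

```python
from statistics import mean, median, mode

# Hypothetical observations, invented for illustration only.
car_colours = ["blue", "red", "blue", "green", "blue"]   # nominal
size_codes = [1, 2, 2, 3, 4, 4, 4]    # ordinal: 1=large ... 4=SuperSize
heights_cm = [170, 180, 180, 190, 200]                   # interval/ratio

# Nominal: the mode is the only meaningful summary.
colour_mode = mode(car_colours)

# Ordinal: the mode and the median are both meaningful.
size_mode = mode(size_codes)
size_median = median(size_codes)

# Interval/ratio: mode, median and mean are all meaningful.
height_mode = mode(heights_cm)
height_median = median(heights_cm)
height_mean = mean(heights_cm)

print("colour:", colour_mode)
print("size:", size_mode, size_median)
print("height:", height_mode, height_median, height_mean)
```

Nothing stops you calling `mean(size_codes)` as well; the library will happily return a number, but as noted above that number wouldn't tell you anything.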

You can visualise the difference between nominals, ordinals and I/R variables by imagining you’re laying out a simple bar chart. It’s very simple: you’ve got two columns, a long one and a short one. We’ll also assume that you’re doing this by hand, with two rectangular pieces of paper that you’ve cut out – perhaps you’re designing a poster, or decorating a float for the Statistical Parade. Now: where are you going to place those two columns? If they’re nominals (‘red cars’ vs ‘blue cars’), it’s entirely up to you: you can put the short one on the left or the right, you can space them out or push them together, you can do what you like. If they’re ordinals (‘second class degree awards’ vs ‘third class’) you don’t have such a free rein: spacing is still up to you, but you will be expected to put the ‘third’ column to the right of the ‘second’. If they’re I/R variables, finally – ‘180 cm’, ‘190 cm’ – you’ll have no discretion at all: the 180 column needs to go at the 180 point on the X axis, and similarly for the 190.

Almost finished. Now let’s talk curves. The ‘normal distribution’ – the ‘bell curve’ – is a very common distribution of I/R variables: not very many low values on the left, lots of values in the middle, not very many high values on the right. The breadth and steepness of the ‘hump’ varies, but all bell curves are characterised by relatively steep rising and falling curves, contrasting with the relative flatness of the two tails and the central plateau. The ‘power law distribution’ is a less common family of distributions, in which the number of values is inversely proportionate to the value itself or a power of the value. For example, deriving Y values from the inverse of the cube of X:

X value   Y formula       Y value
1         1000 / (1^3)    1000
2         1000 / (2^3)    125
3         1000 / (3^3)    37.037
4         1000 / (4^3)    15.625
5         1000 / (5^3)    8
6         1000 / (6^3)    4.63

As you can see, a power law curve begins high, declines steeply then ‘levels out’ and declines ever more shallowly (it tends towards zero without ever reaching it, in fact).
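The same table can be regenerated in a few lines of Python. This is only a sketch: the function name `inverse_power` and its default parameters are labels chosen here for illustration, with the scale of 1000 and the exponent of 3 taken from the worked example.

```python
def inverse_power(x, scale=1000, exponent=3):
    """Return scale / x**exponent, the inverse power-law value for x."""
    return scale / x ** exponent

# Rebuild the six rows of the table, rounding to three decimal places.
rows = [(x, round(inverse_power(x), 3)) for x in range(1, 7)]

for x, y in rows:
    print(x, y)

# Each value is smaller than the one before, but never reaches zero.
always_falling = all(a > b > 0 for (_, a), (_, b) in zip(rows, rows[1:]))
print(always_falling)
```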

Got all that? Right. Quick question: how do you tell a normal distribution from a power-law distribution? It’s simple, really. In one case both low and high values have low numbers of occurrences, while most occurrences are in the central plateau of values around the mean. In the other, the lowest values have the highest numbers of occurrences; most values have low occurrence counts, and high values have the lowest counts of all. In both cases, though, what you’re looking at is the distribution of interval/ratio variables. The peaks and tails of those distribution curves can be located precisely, because they’re determined by the relative counts (Y axis) of different values (X axis) – just as in the case of our imaginary bar chart.

Back to a real bar chart.

Figure #1: 433 weblogs arranged in rank order by number of inbound links.

The shape of Figure #1, several hundred blogs ranked by number of inbound links, is roughly a power law distribution.

As you can see, this actually isn’t a power law distribution – roughly or otherwise. It’s just a list. These aren’t I/R variables; they aren’t even ordinals. What we’ve got here is a graphical representation of a list of nominal variables (look along the X axis), ranked in descending order of occurrences. We can do a lot better than that – but it will mean forgetting all about the idea that low-link-count sites are in a ‘long tail’, while the sites with heavy traffic are in the ‘head’.

[Next post: how we could save the Long Tail, and why we shouldn't try.]

A trick of the eye

A long time ago on a Web site far, far away, Clay Shirky wrote:

“We are all so used to bell curve distributions that power law distributions can seem odd.”

He then traced Pareto-like ‘power law’ curves operating in a number of domains where large numbers of people make unconstrained choices – most memorably, inbound link counts for blogs. The inverse ‘power law’ curve dives steeply, then levels out, glides downwards almost to zero and peters out slowly. And thus was born the ‘Long Tail’.

As I wrote here, there’s a problem with this article, and hence with the ‘Long Tail’ image itself. Despite repeated references to ‘power law distributions’, none of the curves Clay presented were distributions. They were histograms representing ranked lists: in other words series of numbers ordered from high to low.

What’s the difference? A short answer is that the data Clay presents makes his own comparison with ‘bell curve’ (normal) distributions unsustainable: order from high to low and you will only ever get a downward curve.

For a longer answer, you’ll have to look at some numbers. Here are some x,y values which would give you a normal distribution. (For anyone in danger of glazing over, that’s ‘x’ as in horizontal axis, low to high values running left to right; ‘y’ values are on the vertical axis, low to high running bottom to top).

1 1
2 30
3 100
4 240
5 400
6 600
7 750
8 900
9 960
10 1000
11 1000
12 960
13 900
14 750
15 600
16 400
17 240
18 100
19 30
20 1

OK? And here are some co-ordinates which would give you an inverse power-law distribution:

1 1000
2 444
3 250
4 160
5 111
6 82
7 63
8 49
9 40
10 33
11 28
12 24
13 20
14 18
15 16
16 14
17 12
18 11
19 10
20 9
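One formula that happens to reproduce the twenty values above is y = 4000 / (x + 1)^2, rounded to the nearest whole number with halves rounded up. That formula is an assumption inferred from the numbers, not something stated anywhere in this post, and the name `guessed_y` is hypothetical. The sketch below simply checks that the guess matches the list, which is enough to show that each y here really is determined by its x.

```python
import math

# The y values listed above, for x = 1 to 20.
listed = [1000, 444, 250, 160, 111, 82, 63, 49, 40, 33,
          28, 24, 20, 18, 16, 14, 12, 11, 10, 9]

def guessed_y(x):
    """Assumed generating formula: 4000 / (x + 1)^2, halves rounded up."""
    return math.floor(4000 / (x + 1) ** 2 + 0.5)

reconstructed = [guessed_y(x) for x in range(1, 21)]
print(reconstructed == listed)
```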

Just for the hell of it, here are some numbers that would give you a direct (ascending) power law distribution:

1 9
2 10
3 11
4 12
5 14
6 16
7 18
8 20
9 24
10 28
11 33
12 40
13 49
14 63
15 82
16 111
17 160
18 250
19 444
20 1000

Finally, by way of contrast, here’s a series of numbers.

1000
444
250
160
111
82
63
49
40
33
28
24
20
18
16
14
12
11
10
9

I’ve sorted these numbers high to low, but – unlike the other three examples – there’s nothing in the data that told me to do that. You could arrange them that way; you could sort them low to high instead; you could even hack them about manually to produce a rather lumpy and uneven bell curve. It’s up to you.
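A short Python sketch makes the same point. It scrambles the bare series, then arranges it three ways: high to low, low to high, and as a lumpy bell. It also shows that sorting the y values of the bell-shaped example high to low produces a downward curve too, which is why a ranked list can't tell you what kind of distribution you started with. The variable and function names are chosen here for illustration.

```python
import random

# The bare series from the text: twenty numbers with no x values attached.
series = [1000, 444, 250, 160, 111, 82, 63, 49, 40, 33,
          28, 24, 20, 18, 16, 14, 12, 11, 10, 9]

# Scramble them first (fixed seed so the run is repeatable).
scrambled = series[:]
random.Random(0).shuffle(scrambled)

descending = sorted(scrambled, reverse=True)    # the "long tail" look
ascending = sorted(scrambled)                   # its mirror image
# A lumpy "bell": alternate values go to the left and right flanks.
bell = ascending[0::2] + ascending[1::2][::-1]

def is_non_increasing(values):
    """True if no value is larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))

# The y values of the bell-shaped example given earlier, in x order.
normal_counts = [1, 30, 100, 240, 400, 600, 750, 900, 960, 1000,
                 1000, 960, 900, 750, 600, 400, 240, 100, 30, 1]

print(descending == series)
print(bell)
print(is_non_increasing(normal_counts))
print(is_non_increasing(sorted(normal_counts, reverse=True)))
```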

I’m not saying that a ranked listing – arranging numbers like these high to low – is meaningless. The ranked histogram is quite a good graphic – it’s informative (within limits) and easy to grasp. What I am saying is that it’s an arbitrary ordering rather than a distribution. Which is to say, it’s not the best way of representing this data – let alone the only way. It’s a relatively information-poor representation, and one which tends to promote perverse and unproductive ways of thinking about the data.

More about this – and a couple of constructive suggestions – next time I post.

When is a spike not a spike?

When it’s a long tail. Maybe.

David Weinberger writes:

In a conversation with Erica George at the Berkman she pointed out that the demographics of Live Journal don’t always represent one’s experience of Live Journal — the demographics say that teenage girls are the largest users, but if you’re a 25 year old, your social group there may not look that way at all.

Which raises an issue about the way the “long tail” is pictured. Clay’s charts are accurate depictions of his data, but they have a mythic power that’s misleading: The long tail looks like, well, a long tail when in fact it’s a fractal curlicue of relationships.

This is an interesting point in itself – perhaps the blogosphere would be better viewed as a series (archipelago? galaxy?) of more or less closed, more or less interlinked ‘spheres’. I’m not sure how you’d visualise that, though – perhaps something like the Jefferson High School network diagram?

But there’s a broader point about the accuracy of those ‘long tail’ graphics. Adam Marsh made an interesting point here about a recently-discovered ‘long tail’:

Clay refers to “the characteristic long tail of people who use many fewer tags than the power taggers.” While this chart does exhibit a “long tail,” this is simply a result of the fact that the users were ordered by decreasing tag usage (also true of the following three charts) — the X axis here doesn’t represent a value, it is just a sequence of users.

The phrase “long tail” usually refers to the observation that for many distributions, the number of elements with outlying values (the “tail”) may be cumulatively significant compared to the number of elements clustered near the average.

On inspection, it turns out that this is also true of the celebrated ‘Power law and Weblogs’ graphic: there are no values on the X axis, just a list of blogs arranged in descending order of number of links. This matters, because in a graphical representation of a statistical distribution both axes carry information. Typically, values of the variable being measured run low to high on the X axis, left to right, while the count of occurrences of each value runs high to low on the Y axis, top to bottom. Clay wrote, “We are all so used to bell curve distributions that power law distributions can seem odd.” But Clay’s own graphics aren’t so much odd as misleading, and not only because he’s put high values on the left of the graph rather than the right. In effect, he’s got two axes conveying one piece of information. Andrew Sullivan’s blog and Instapundit get a high Y value (lots of links) and a high X value (because all the sites with lots of links have been sorted to the left).

If you took the same numbers and plotted them on an X axis with values – if you produced a graph showing how many blogs had how many links, with zero at the origin on both scales… Well, I don’t know what would happen – but five minutes’ experimentation reminds me that, if you wanted to produce a nice clear series of vertical bars rather than a line that wanders all over the place, you’d need to put ‘number of blogs’ on the Y axis and ‘number of inbound links’ on the X axis, rather than vice versa. (There’s a simple reason for this: some values are unique by definition, others aren’t.) Which in turn means that any vertical spike would represent large numbers of blogs (say, for example, blogs with small numbers of inbound links) while any long tail would represent small numbers (say, for example, the few blogs with lots of links).
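Here is a rough sketch of that re-plotting in Python. The link counts are made up for illustration; they are not Clay's figures or anyone else's. The ranked list is what the familiar graphic shows, while the second structure counts how many blogs share each inbound-link value, which is what would go on a graph with ‘number of inbound links’ on the X axis and ‘number of blogs’ on the Y axis.

```python
from collections import Counter

# Hypothetical inbound-link counts, one entry per blog (invented data).
inbound_links = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 5, 8, 40, 250]

# The ranked list: one bar per blog, sorted high to low.
ranked = sorted(inbound_links, reverse=True)

# The distribution: x = number of inbound links, y = number of blogs.
blogs_per_link_count = Counter(inbound_links)
distribution = sorted(blogs_per_link_count.items())

for links, blogs in distribution:
    print(f"{links:>4} links: {'#' * blogs}")
```

With these invented numbers the tall bar sits at one inbound link (six blogs), and the thin scattering of single blogs out at 40 and 250 links is the tail.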

Caveat: I haven’t crunched any actual numbers, or even mumbled them gently. But maybe we’ve been looking at this the wrong way round, statistically speaking. Perhaps the long tail is the spike; perhaps the spike is really the long tail.

Oh, good grief

Via Nick, another Blairian fantasy:

The PM also this morning urged Labour supporters to turn out to vote on May 5, saying: “It only takes one in 10 of our voters to drift off to the Liberal Democrats and you end up with a Tory government.”

That is a figure hotly disputed by the Lib Dems, who said the swing voters would have to be concentrated entirely in Labour/Tory marginals – and even then the figure would be much closer to one in four.

What they don’t say is that it’s a figure hotly disputed by anyone with a brain. I’ve been following the polls, and the running average of all of the national polls I’ve seen is something like this:

Labour: 38%
Conservatives: 33%
Liberal Democrats: 20%
Others: 9%

Wave our magic wand of tactical-voting apathy – what is this about drifting off to the Liberal Democrats? “Sod it, I just can’t be bothered voting Labour this time – I’m going to take the easy way out and cast a protest vote for a party I’ve never voted for in my life. I’d better take something to read in case there’s a queue – pass us that Trotskyist Anarchist, could you?”

But anyway. Transfer 10% of the Labour vote to the Lib Dems and you get a split of 34.2%/33%/23.8%/9%. Plug that into the BBC’s very wonderful seat calculator and you get… a Labour majority of 62.

The polls could be wrong, of course. More to the point, all the polls except YouGov could be wrong. YouGov has consistently reported a much narrower gap between the levels of Labour and Tory support, more like:

Labour: 36%
Conservatives: 34%
Liberal Democrats: 21%
Others: 9%

Transfer 10%, and you get 32.4%/34%/24.6%/9%. And… a hung parliament, with Labour taking 318 seats out of 646; the Tories and the Liberal Democrats put together would only have 299.
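For anyone who wants to check the arithmetic, here is a small Python sketch of the transfer. The function name `transfer` and the dictionary keys are labels chosen here for illustration. It only reproduces the vote shares; the seat totals come from the BBC's calculator and are not computed.

```python
def transfer(shares, from_party, to_party, fraction):
    """Move a fraction of one party's vote share to another party."""
    moved = shares[from_party] * fraction
    result = dict(shares)
    result[from_party] -= moved
    result[to_party] += moved
    # Round to one decimal place, matching the figures in the text.
    return {party: round(share, 1) for party, share in result.items()}

poll_average = {"Lab": 38, "Con": 33, "LD": 20, "Other": 9}
yougov = {"Lab": 36, "Con": 34, "LD": 21, "Other": 9}

after_average = transfer(poll_average, "Lab", "LD", 0.10)
after_yougov = transfer(yougov, "Lab", "LD", 0.10)

print(after_average)
print(after_yougov)
```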

The facts are what they’ve been all along:

The Conservatives have a mountain to climb (hey, pictures!).

They’re showing no sign of being able to climb it.

Liberal Democrat votes won’t enable them to do it.

And:

Tony Blair is constitutionally unable to make any appeal to natural Labour voters which will actually get the vote out. From that same Guardian article:

the prime minister and chancellor joined forces to unveil a new slogan of “Forward with Blair & Brown”.

Good old Gordon – at least he’s an ex-pinko. We’ll be seeing a lot more of him this coming week.
