Like Boiling a Frog
- The Wikipedia Revolution by Andrew Lih
Aurum, 252 pp, £14.99, March 2009, ISBN 978 1 84513 473 0
The best one-volume encyclopedia in the world used to be the Columbia Encyclopedia, first published by Columbia University Press in 1935. In our house we have the fifth edition, from 1993, and we still get it out occasionally to look up kings and queens and old-fashioned stuff like that. It’s a lovely book, fat but portable and full of nuggety little entries on most things you can think of. It also has quite a poignant preface, in which the editors talk about the difficulties of updating an encyclopedia in such a fast-changing world: they note how much history, politics, even geography they have had to revise since the collapse of the Soviet Empire just a couple of years earlier. They are clearly proud of their efforts to keep up to speed, but some things inevitably slip through the net. There are for example no entries for ‘email’, the ‘World Wide Web’ or the ‘internet’, all of which were just beginning to attract attention in 1993. The editors think the pace of change at the end of the 20th century means that traditional works of reference are going to have a hard time keeping up. Really they have no idea.
1993 wasn’t so long ago; Bill Clinton was president, a fact that the Columbia editors boast about having been able to include at the last moment (the last moment here meaning the weeks or months between the book’s being set and its arriving in the shops or in the hands of door-to-door salesmen). Yet in encyclopedia publishing, 1993 is now prehistory. Even 2000, when a sixth – one has to presume final – edition of the Columbia appeared, belongs to another age. One year later, a one-time market analyst called Jimmy Wales started up an experimental online project called Wikipedia, which allowed volunteers to create their own encyclopedia entries that could then be revised or even entirely rewritten by anyone else who happened to be logged on. Wales, like everyone else involved in the project, didn’t know if it would work, but since the technology was available it seemed worth a try. In its first year, Wikipedia generated 20,000 articles, and had acquired 200 regular volunteers working to add more (this compares with the 55,000 articles in the Columbia, all subject to rigorous standards of editing and fact-checking, though this in itself was a small-scale enterprise compared to the behemoths of the industry like the Encyclopaedia Britannica, whose 1989 edition covered 400,000 different topics). By the end of 2002, the number of entries on Wikipedia had more than doubled. But it was only in 2003, once it became apparent that there was nothing to stop it continuing to double in size (which is what it did), that Wikipedia started to attract attention outside the small tech-community that had noticed its launch. In early 2004, there were 188,000 articles; by 2006, 895,000. In 2007 there were signs that the pace of growth might start to level off, and only in 2008 did it begin to look like the numbers might be stabilising. The English-language version of Wikipedia currently has more than 2,870,000 entries, a number that has increased by 500,000 over the last 12 months.
However, the English-language version is only one of more than 250 different versions in other languages. German, French, Italian, Polish, Dutch and Japanese Wikipedia all have more than half a million entries each, with plenty of room to add. Xhosa Wikipedia currently has 110. Meanwhile, the Encyclopaedia Britannica had managed to increase the number of its entries from 400,000 in 1989 to 700,000 by 2007.
Part of the reason the astonishing growth of Wikipedia took even its founders by surprise was that this wasn’t their first attempt to set up an online encyclopedia. Wikipedia was an offshoot of something called Nupedia, which Wales had established in 2000 with the aim of using online volunteers to produce a new work of reference that would be free to use. The mistake Wales and his Nupedia collaborators made was to assume that any encyclopedia has to go through a formal editing process if it’s going to be reliable. Editors were appointed whose job was to decide on appropriate topics, open them up to online editing and then approve final versions once an agreed standard had been met. The editing process had seven stages from ‘assignment’ to ‘mark-up’, and was a slow, frustrating and ultimately fruitless business. By the end of the first year about two dozen articles had been completed, while the drafts of a few hundred more were still being fretted over. It looked like the vast additional resources and manpower that the internet had made available for checking reference books were going to overwhelm the capacities of anyone trying to process the information.
Hence the Wikipedia solution, stumbled on more by chance than by design: don’t try to process the information. It is generally assumed that what is distinctive about Wikipedia is that it is open to anyone to contribute, but that was true of Nupedia too. Wikipedia is different in that it doesn’t try to frame the creation of new entries with commissioned beginnings and fixed endpoints. It is open to anyone to initiate an entry on Wikipedia, and no entry is ever formally closed, since it is also open to anyone to keep editing and altering whatever is already there. Wikipedia still uses a large volunteer army of editors and ‘janitors’ to oversee the whole process, looking out for flagrant abuses and sounding the alarm when disputes get out of hand. But it is not the job of any editor to decide what counts as an entry. If there is any doubt about whether something is too trivial to take up space even in so limitless a space as Wikipedia it is put to the vote of other users (and any vote can always be overturned by another vote further down the line); otherwise, if you don’t like an entry it is up to you to change it. The editors are there to try to ensure this is done in as non-abusive a way as possible. But it is not up to anyone to call time on anything.
That’s how it works. The puzzle is why it works, given that this way of compiling an encyclopedia seems to have a flaw so obvious it is hardly worth stating: if no entry is ever nailed down, how do you know, when you are reading an entry, that someone hasn’t just interfered with it, making it thoroughly unreliable? The early years of Wikipedia were dogged by this suspicion, and many people – including a lot of schoolteachers and university lecturers who could remember the distant days before 2001 when books were books and editors actually edited – were openly derisive of a work of reference that appeared to make no effort to discriminate between good information and bad. It is easy to assume that some version of Gresham’s Law, which states that bad money will always drive out good, must apply to the circulation of facts as well. Why would anyone with good information want to put it in a place where bad information could contaminate it at the touch of a button? Wouldn’t they choose to keep it to themselves, or at the very least give it to someone who could recognise its true value, leaving open-access encyclopedias to the mercies of all the flakes and grudge-bearers who want to use its veneer of objectivity to force their craziness down other people’s throats? Well, the answer is apparently not. One of the remarkable achievements of Wikipedia is to show that on the internet Gresham’s Law can work in reverse: Wikipedia has turned into a relatively reliable source of information on the widest possible range of subjects because, on the whole, the good drives out the bad. When someone sabotages or messes with an otherwise sound entry, there are plenty of people out there who see it as their job to undo the damage, often within seconds of its happening. It turns out that the people who believe in truth and objectivity are at least as numerous as all the crazies, pranksters and time-wasters, and they are often considerably more tenacious, ruthless and monomaniacal.
On Wikipedia, it’s the good guys who will hunt you down.
Wales thinks this tells us something surprising and reassuring about human nature. ‘Generally we find most people out there on the internet are good,’ he says. ‘It’s one of the wonderful humanitarian discoveries in Wikipedia that most people only want to help us and build this free, non-profit, charitable resource.’ But in truth it’s a bit more complicated than that. Wikipedia works because it is highly distinctive in the way it pulls knowledge together from many different sources. Most internet-based techniques for gathering information are aggregative, in that they try to pool as much information as possible, allowing all the prejudices and random bits of disinformation that attach to individual opinions to cancel each other out. This is true of the many different kinds of polling that take place on the internet, which use the wisdom of crowds to produce answers far more accurate than any individual can give. It’s also pretty much what happens at Google, where everybody else’s searches are monitored to help filter the information that you might find useful. Aggregative methods minimise personal responsibility for what is produced and place all the emphasis on collective outcomes – after all, who knows, or cares, what their own Google searches are adding to the sum of knowledge (or subtracting from it)? However, Wikipedia’s approach to knowledge gathering is not aggregative but cumulative. It builds up information bit by bit, edit by edit, and it never stops. It also leaves a virtual paper trail for every entry, so that it is possible to trace the various steps by which an article has reached its current form.
When knowledge is generated by crowds, no single individual has much personal responsibility for what is produced, but nor does any one person have a realistic prospect of shaping the outcome. With Wikipedia, the opposite is true. The fact that there is no final version means that anyone can change anything, but it also means that every given change can be attributed to a particular individual. Though it is possible, and common, to make edits on Wikipedia anonymously (by hiding behind a nickname), it is still true that someone is always responsible for everything that happens, and that someone always knows who they are. So the fact that there are no authoritative versions on Wikipedia is what makes it possible to generate a sense of personal accountability for particular entries, since any entry at any given time is the responsibility of the last person to edit it. This seems to be enough to make most people want to get it right. But it also means that those who don’t want to get it right can have their mistakes corrected. The secret to Wikipedia’s success lies in the fact that personal responsibility for particular mistakes can’t be erased, but the mistakes themselves can be.
Still, it takes a lot of policing. Wikipedia has a ‘Recent Changes Patrol’ whose job is to surf the site picking up on all the endless obscenities and absurdities that are inserted by people who can’t believe a website would allow anyone to change any page on it (when they discover that they can, but that changes quickly get corrected, the fun wears off). More serious tinkering requires more concerted oversight. From its outset Wikipedia has aimed to operate according to a code of conduct (of which the centrepiece is the proposition that ‘Wikipedia has a neutral point of view’), but to dispense with firm rules. However, in 2004, the three-revert rule (‘3RR’) was introduced in order to prevent tit-for-tat battles, whereby corrections are corrected back to their original form (known as ‘reverts’), then corrected back again, and so on, because two contributors cannot agree on a single point of view. The classic case concerned the entry for Gdansk. The name of the town was changed by a German contributor to Danzig, then by a Polish contributor back to Gdansk, then back to Danzig, with no sign of this stopping until the administrators intervened. The 3RR states: ‘An editor must not perform more than three reverts, in whole or in part, on a single page within a 24-hour period.’ Just three changes per 24 hours in a work of reference might seem absurdly fluid by traditional standards, but for Wikipedia this was a draconian measure, adopted with deep reluctance by some. Even so, the Gdansk/Danzig wars were only finally settled when the matter was put to a vote of the wider Wikipedia community, and it was agreed that the town could be referred to as Danzig in relation to the period between 1308 and 1945, and in the biographies of ‘clearly German persons’; otherwise it was to be Gdansk. It took two years of back and forth to reach this point: a traditional encyclopedia editor could have settled it in ten minutes.
Nevertheless, the consensus position on the name appears to have stuck, which given the history of Gdansk/Danzig is no small achievement.
That Wikipedia represents a finely calibrated balance between licence and surveillance, and between anonymity and responsibility, is something often missed by those who want to translate its achievements elsewhere. It is not an easy model to replicate. One notorious failure came in 2005, when the editorial page of the Los Angeles Times decided to experiment with a ‘wikitorial’, which would allow anyone to contribute to the writing of an editorial column using the same techniques as a Wikipedia entry. The aim was to let readers shape the views expressed by the newspaper; the result was a complete mess, as the entire process was hijacked by vandals determined either to skew the political slant of the piece, or to overwhelm the Times editorial page with the sort of shock images in which the internet abounds, and the project was quickly abandoned. The newspaper had made two mistakes. First, its editors seemed to imagine that a wikitorial would edit itself, so they left it alone while they devoted themselves to other things (like editing ‘real’ columns). But as Wikipedia shows, freedom requires constant vigilance, and a column will write itself only if someone is on hand to fight off all the people who will try to wreck it. Second, a newspaper editorial is actually a much less open-ended form of writing than an encyclopedia entry. Newspaper writing has a shelf-life: it appears and is read at a particular time, often on a particular day. As a result, contributors have an incentive to try to skew the whole process at the moment of maximum impact. The Wikipedia principle that all mistakes can be corrected (so that it is hardly worth trying to introduce them) has much less force in the case of newspapers, because by the time any corrections have been made most readers will have moved on.
This is why encyclopedias have been made better by the advent of the internet, but newspapers have been made worse: the cumulative impact of the readers’ comments that can now be appended online to almost any article tends to diminish most forms of human understanding. Bias is not cancelled out on the readers’ pages of newspaper websites, as might happen if opinion were being aggregated, but nor is it eliminated over time, as in the case of Wikipedia. Instead, each contribution just sits there, glowering back at you, demanding your attention. I recently read through the hundreds of comments that Guardian readers had attached to an article about Julie Myerson, the novelist who wrote about her drug-addicted son and sparked a wave of middle-class outrage and voyeuristic delight. What was striking was not just the anger of all those who wanted to see the Myersons suffer horribly for their crimes, but the equivalent anger of all those who were disgusted by such vindictiveness, and the anger of the people who were appalled by the prissiness of that response, and the anger of the people who couldn’t believe anyone would waste their time caring about this rubbish, and on, and on. Everyone was furious with everyone else, and determined not to be shouted down. No one with a reasonable point of view would bother wasting it on a site like this. When tempers are frayed, and time horizons are short, the bad drives out the good.
One of the ironic consequences of the open-endedness of the Wikipedia editorial process is that many of its articles are preoccupied with the immediate past. The desire to update the facts about any given subject often means that the facts that remain are the most up-to-date ones. Biographical entries on living individuals tend to concentrate on the most recent things they have done, particularly if these have generated a lot of newsprint that can be used as source material. For an encyclopedia, Wikipedia devotes far too much space to the latest scandals and controversies, whose significance, if any, is impossible to gauge. But this is not a reflection of some desire on the part of the founders of Wikipedia to stir up interest by courting topicality and trivia. Far from it: it reflects an almost touching reverence for properly grounded evidence that underlies the entire Wikipedia project. Although anyone can edit anything in Wikipedia, everything that appears there is supposed to carry a reference to some published source so that it can be checked by other readers. The Wikipedia policy on this is as follows:
The threshold for inclusion in Wikipedia is verifiability, not truth – that is, whether readers are able to check that material added to Wikipedia has already been published by a reliable source, not whether we think it is true. Editors should provide a reliable source for quotations and for any material that is challenged or likely to be challenged, or the material may be removed.
The proliferation of newspaper sources on the internet means that this is often the best place to look for new, verifiable source material (particularly if you are not too bothered about truth). Most of the information out there is recent information, and so therefore is most of what winds up on Wikipedia.
The insistence that everything in Wikipedia can be referred to something outside itself stems from an anxiety that the encyclopedia might otherwise become its own source material, and start to generate free-floating facts out of nothing. One of the many fascinating details to emerge from Andrew Lih’s The Wikipedia Revolution is that both Jimmy Wales and one of his first collaborators, Larry Sanger, are self-confessed and totally earnest ‘objectivists’, meaning followers of the philosophy of Ayn Rand. Sanger wrote his doctoral thesis at Ohio State University under the title ‘Epistemic Circularity: An Essay on the Problem of Meta-Justification’. He and Wales first encountered each other on an internet forum Wales had established in 1992, which offered a ‘Moderated Discussion of Objectivist Philosophy’ and described itself as ‘the most scholarly of all Objectivist discussions available on the networks’. Other early contributors to Wikipedia learned about its existence through the community of online objectivists, and it was this bond as much as anything that drove the project forward in its initial stages.
What is objectivism? Frankly, I have no idea. I have never read a word by Ayn Rand, and though I know she is an object of veneration in some surprising places (Alan Greenspan, for instance, is a fan), the little bits I have picked up always sounded a bit bonkers to me. So this seemed a good test of Wikipedia’s much vaunted NPOV (neutral point of view): I would look her up on Wales and Sanger’s encyclopedia to find out what she’s all about. Well, it’s hard to express in mere words just how dispiriting an experience it is trying to find out about objectivism on Wikipedia. This isn’t because the entries seem biased or uncritical. It is just that they are so introverted, boring and just long. The entry on Ayn Rand herself is more than 8000 words long and covers her views on everything from economics to homosexuality in technical and mind-numbing detail. There are separate lengthy entries on objectivist metaphysics, objectivist epistemology, objectivist politics, objectivist ethics, plus entries on all Rand’s various books, including the novels The Fountainhead and Atlas Shrugged, and entries on all the characters in these novels, and entries that offer plot summaries of these novels, and even entries on individual chapters. All of it reads as though it has been worked over far too much, and like any form of writing that is overcooked it alienates the reader by appearing to be closed off in its own private world of obsession and anxiety. Compare this with the entry on Rand in the 1993 Columbia Encyclopedia:
1905-82, American writer, b. St Petersburg, Russia. She came to the United States in 1926 and worked for many years as a screenwriter. Her novels are romantic and dramatic, and they espouse a philosophy of rational self-interest that opposes the collective of the modern welfare state. Her best-known novels include The Fountainhead (1943) and Atlas Shrugged (1957). In The New Intellectual (1961) she summarised her philosophy, which she called ‘objectivism’.
That’s it (with a couple of references appended), and seems admirably clear in 70 words. Also, by allocating her 70 words, the Columbia editors give some indication of what they think she’s worth: on the same page she gets more space than the French architect Joseph Jacques Ramée (1764-1842) and the Swiss novelist Charles Ferdinand Ramuz (1878-1947), but fewer words than the French historian and politician Alfred Nicolas Rambaud (1842-1905), the Spanish histologist Santiago Ramón y Cajal (1852-1934) and the Scottish chemist Sir William Ramsay (1852-1916). That also seems pretty clear.
Wikipedia still has its advantages, however. Despairing of discovering anything about Rand that I could make sense of, I looked up the article on Jimmy Wales, to see if that shed any light on his personal philosophy. This article is also long, but more reasonably so, given that Wales is responsible for one of the most significant inventions of the 21st century. It is also admirably even-handed, managing to convey that Wales is both something of a visionary and also something of a creep. The section on his personal life includes this detail, which neither he nor anyone else has seen fit to edit: ‘His first wife, Pam, was quoted in a September 2008 W magazine article as saying that Wales, because he believed altruism was evil, discouraged her from pursuing a nursing degree when they were married.’ The entry also details the break-up of Wales’s second marriage and the claims of a subsequent girlfriend, the Canadian conservative columnist Rachel Marsden, that she only discovered he was ending his relationship with her by reading about it on Wikipedia. I guess that’s ‘objectivism’ for you.
Perhaps unsurprisingly, Wales has long since fallen out with Sanger, re-editing his Wikipedia entry to remove any reference to him as a co-founder of the project, even though both men were there from the beginning. But it may be Sanger’s PhD title that gives the clearest indication of some of the difficulties that lie ahead. ‘Epistemic circularity’ is a fancy way of saying that Wikipedia could prove too successful for its own good. This is not because entries on the site are likely to start cannibalising each other and end up reducing the whole thing to a relativistic soup: Wikipedia is still very good at distinguishing cross-references within the site from source material outside it. Instead, the problem may come as the source material itself starts to ape the wiki-model. Already, academic publishers are grappling with the problem of open access, which makes increasing numbers of academic articles freely available on the web (‘free’ here meaning not only free to use but also free to dice, slice and reproduce in another format). Some of the pressure for this move is coming from the people who fund academic research and who want to see it disseminated as widely as possible. But a number of funding bodies (particularly in the sciences) are also questioning whether it makes sense to wait until research is ‘completed’ before publishing it. Why not put earlier draft versions out there, or even just the initial raw data, and let others see what they can make of it? This opens up the possibility of collaborative editing online: authors might ‘publish’ draft versions of their books and readers could tinker with them to produce something they are happy with. 
Of course, the idea of the permanently updatable book raises the prospect of nightmarish copyright issues (or more likely the end of copyright altogether), and it is hardly attractive for academic publishers, since it cuts off their most obvious revenue stream, which has always been to charge for the finished product, properly edited in-house. It also raises difficulties for the idea of verifiability. Wikipedia needs its source material to be relatively stable, so that its entries can have fixed reference points. But if the reference points are themselves subject to endless change, then it becomes much harder to know what counts as verification.
Meanwhile, as conventional publishing starts to open up to the Wikipedia way of doing things, the encyclopedia is toying with a revert to more conventional methods. German Wikipedia has started experimenting with ‘flagged’ articles, which means articles that have been certified as reliable and free from vandalism, to meet a demand for certainty from German users. (Incidentally, this is not the only international variation in Wikipedia practice that seems to conform to national stereotypes: on Japanese Wikipedia, editors are much more reluctant than their Western counterparts to alter existing pages and prefer to conduct their exchanges on adjoining discussion sites rather than blithely interfering with what someone else has written.) The German experiment has now led to a demand for approved articles to be published separately on a static website protected from editing, in order to give readers the option of something that has been pre-verified.
The question of ‘flagging’ is one of the issues discussed in the afterword of Lih’s book, which addresses the most pressing challenges Wikipedia is likely to face in the future. Other concerns include the creation of a fully-paid executive staff, something that may cause serious divisions in an organisation that relies so heavily on voluntary labour; the risk of a major lawsuit by someone who has been libelled in a Wikipedia entry (the fact that anyone can remove the offending information doesn’t prevent them from trying to sue, though it isn’t clear who would be liable – the person who introduced the libel or the last person to edit the page on which it appears?); and the increasing complexity of the editing software, which is putting off many new contributors. More interesting than any of this, though, is the fact that the afterword was written as a wiki: that is, as a collaborative exercise using software similar to that of the encyclopedia itself, and made available to be freely copied and distributed. It is good of Lih to include it, since it is somewhat better written than the rest of the book, having a tighter style and a sharper focus. The single-authored chapters are full of interest but rather indulgent, containing too much incidental detail about people Lih wants to please. The afterword has none of that – it just gets to the point, and doesn’t worry about offending anybody. It helps that this is a book, so space is limited, and this particular wiki can’t indulge in the commonest vice of entries on Wikipedia, which is not knowing when to stop.
Yet even a piece of writing that has been edited by so many people can’t resist the occasional cliché. The multiple authors of the afterword write: ‘The Wikipedia community might be like the frog slowly boiling to death – unaware of the building crisis, because it is not aware how much its environment has slowly changed.’ When I read this, I thought: is it really true that frogs can be slowly boiled to death without realising what’s happening to them? So I looked it up on Wikipedia, confident that there would be an entry. There is: type in ‘boiling frog’ and you go straight to a page that tells you everything you need to know. It gives you examples of the use of the term, its history and a discussion of the veracity of the central idea, including a description of the late 19th-century experiment in which it was first demonstrated and the more recent experiments that have cast doubt on it. Links at the bottom of the page take you to accounts of these later experiments in scientific journals, which suggest that the whole thing is a myth. So there it is: you won’t find any of this in the Columbia, or Encyclopaedia Britannica, or anywhere else for that matter. There is no other way I could have found out about boiling frogs – truly, for all its flaws, Wikipedia is a wonderful thing.