In Italo Calvino’s If on a Winter’s Night a Traveller the protagonist encounters two sisters who have different styles of reading. Ludmilla reads for pleasure, unencumbered by academic-literary-critical goggles, delighting in writers who write as ‘a pumpkin plant produces pumpkins’. Lotaria, on the other hand, is an academic who reads books ‘only to find in them what she was already convinced of before reading them’. (‘You know the type,’ you can hear Calvino sigh.) She possesses a machine that analyses novels by counting instances of certain words in them; this allows her to arrive at a reading free of the ‘taboos’ and pretensions of her culture. ‘Look here,’ she says: ‘Words that appear 19 times: blood, cartridge belt, commander, do, have, immediately, it, life … There’s no question: it’s a war novel, all action, brisk writing, with a certain underlying violence.’

The authors of The Bestseller Code (Allen Lane, £20), Jodie Archer and Matthew Jockers, are both literary computer people: Archer worked for Penguin, did an English PhD at Stanford and then went to work for Apple; Jockers is the cofounder of Stanford’s ‘Literary Lab’ in Silicon Valley. They have built Lotaria’s machine. It works by counting instances of words in novels and identifying the features they share; but it aims at understanding Ludmilla’s reading. Archer and Jockers are interested in the pumpkin plants – writers like Stephen King, John Grisham and Danielle Steel, perennial presences on the New York Times bestseller list – and what makes them sell so well. By looking only at textual features their machine has isolated the essence of the bestseller, Archer and Jockers believe, and can now make predictions, based exclusively on a book’s manuscript, as to whether it will bestsell or not. It is remarkably accurate, putting the probability of Dan Brown’s Inferno being a bestseller, for example, at 95.7 per cent. It can even predict whether or not a novel will be a bestseller based on the frequency with which its author uses the word ‘the’.

Whether or not a book does well seems to have to do with its style, its subject and its structure, which isn’t exactly surprising. What a good subject, or at least successful subject, consists in, according to the machine, is more so. The computer looks for subjects by looking for clusters of nouns – if kitten, furball, whiskers and paws appear then some of the book is about cats – and calculating how much space the clusters take up. In a sample of 5000 novels 0.001 per cent of the material was about sex; 0.003 per cent was drugs, 0.001 per cent rock ’n’ roll. Sex doesn’t sell after all, and takes up twice as much space in non-bestsellers as it does in bestsellers. In fact what matters isn’t so much the subject itself as the proportions in which a novel’s subjects appear. A third of John Grisham’s paragraphs are about ‘the legal system’; a third of Danielle Steel’s are about ‘domestic life’. Their books have a number of other, lesser subjects, but not too many: bestselling books are about fewer things than non-bestsellers, taking up 40 per cent of their pages, on average, with just four subjects. In On Writing, Stephen King recommended that writers take a subject they know and blend in ‘personal knowledge of life, friends, relationships, sex and work. Especially work. People love to read about work.’ They really do, to a shocking extent; it’s one of the subjects most common to the bestseller. The subject they share most, though, is ‘human closeness’, which Archer and Jockers gloss as ‘shared intimacy, shared chemistry and shared bonds’.
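For the curious, the noun-cluster idea can be roughed out in a few lines of Python. This is only a sketch under loud assumptions: the cluster names and word lists below are invented for illustration (the cat words come from the example above, the legal ones are made up), the function name `topic_shares` is hypothetical, and measuring 'space' as a share of all words is a guess at what the machine does, not a description of Archer and Jockers's actual method.

```python
import re
from collections import Counter

# Hypothetical clusters for illustration only; the real word lists are not given here.
TOPIC_NOUNS = {
    "cats": {"kitten", "furball", "whiskers", "paws"},
    "law": {"court", "judge", "lawyer", "jury"},
}

def topic_shares(text, topics=TOPIC_NOUNS):
    """Return, for each topic, the fraction of the text's words that belong to its cluster."""
    # Crude tokenisation: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in topics}
    counts = Counter(words)
    return {
        name: sum(counts[noun] for noun in nouns) / len(words)
        for name, nouns in topics.items()
    }
```

Run over a whole manuscript, a table like this would show at a glance whether a few subjects dominate or many subjects jostle for room, which is the proportion the book treats as decisive.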

They reckon it was the ‘human closeness’ that sold Fifty Shades of Grey: all those scenes in which Anastasia Steele and Christian Grey talk to each other, in his ‘beast of a car’ and elsewhere, about such things as their families and what music they like (‘My taste is eclectic, Anastasia, everything from Thomas Tallis to the Kings of Leon. It depends on my mood’). There’s a lot more of that than there is sex; Anastasia actually spends a large chunk of the first hundred pages wishing Christian would stop asking her questions and get on with it. But that’s not all: Fifty Shades also has a plot structure that the computer has ascertained is the plot structure most likely to shift serious units. There are emotional ups and emotional downs, five of each, and the story moves from one to the other with metronomic regularity. Steele meets Grey and there is chemistry; then she gets drunk, calls him and throws up; then they have sex and it is fun; then she learns about Grey’s ‘dark side’ and his ‘rules’; and so on. The plots of Stephen King, Jackie Collins, Dan Brown, Sylvia Day, Danielle Steel, Lee Child and James Patterson all, apparently, have a similar shape, and the curve of The Da Vinci Code is identical in its measuring out of highs and lows until the very end of the novel: Dan Brown finishes his book on an upbeat where E.L. James, in anticipation of a sequel, ends on a downer.

It’s fun reading bestsellers after The Bestseller Code because you can see them ticking the computer’s boxes. Danielle Steel’s Power Play is driven by a logic of anti-escapism in which board meetings – work! – are as thrilling to its protagonist as kinky sex is to Anastasia (first sentence: ‘Fiona Carson left her office with the perfect amount of time to get to the boardroom for an important meeting’). The lead characters in David Nicholls’s One Day spend the first twenty pages in bed together categorically refusing to have sex, settling for ‘human closeness’ instead. ‘Let’s just cuddle,’ Emma Morley says. ‘Of course,’ Dexter Mayhew replies, ‘though in truth he had never really seen the point of cuddling. Cuddling was for great aunts and teddy bears.’ According to Archer and Jockers, the ideal opening sentence has an active decision, two characters implied, a reference to family bonds and the suggestion of conflict. Here’s the opening sentence of Donna Tartt’s The Little Friend: ‘For the rest of her life, Charlotte Cleve would blame herself for her son’s death because she had decided to have the Mother’s Day dinner at six.’ All boxes ticked. As for the regular pulse, you could set your watch by James Patterson’s cliffhangers. In 8th Confession there’s a chapter break every three to four pages, and every chapter opens in stillness and ends with a shock. The morning commute (work!) ends with an exploding school bus; the cops calmly analysing the scene (work!) figure out that the school bus was actually a mobile meth lab; the reporter on her way home (from work!) stumbles across the body of a tramp who has been shot in the face.

Calvino gets everything wrong. The most bestselling title for a book, says the computer, is a noun with a definite article – The Goldfinch, The Firm, The Circle – qualified, on occasion, for spice: The Boleyn Inheritance, The Da Vinci Code. It most certainly isn’t the first half of a conditional sentence featuring, for extra vagueness, not one but two indefinite articles; If on a Winter’s Night a Traveller is the only one of those. Whether a lead character is likely to make a novel sell well or not, the computer says, depends on their level of agency, which it measures by counting the number of active verbs attached to their name. The hero of Calvino’s novel, ‘you’ – second person, another disaster – spends the first chapter as inactive as could be, in ‘your’ armchair, planning to read a book, not even reading one. Calvino wrote the opposite of a pumpkin, to defy the machines of Lotaria, Archer and Jockers. There’s a bestselling crime writer in If on a Winter’s Night called Silas Flannery whose books are so formulaic they can be written by a machine. Archer and Jockers warn that The Bestseller Code doesn’t allow you to do this; the computer tells us what bestselling novels have in common, but you could write a novel that ticked all the boxes and still stank. It’s easy to imagine tweaks to Fifty Shades that wouldn’t have affected the proportions in which its subjects are arranged, or its distribution of highs and lows, or the level of agency of its lead character, but that would probably have jeopardised its ability to sell as many copies as it has. It must matter that Christian Grey is a dashing billionaire, not a pizza delivery boy, and that what he does with Anastasia is not fishing, and that The Da Vinci Code contains secrets about Jesus’s sex life, not his thoughts on vol-au-vents. Mustn’t it?
