The four most ‘informative’ words in Moby-Dick, statistically speaking, are ‘I’, ‘whale’, ‘you’ and ‘Ahab’. Marcelo Montemurro and Damián Zanette worked this out by comparing the text of Moby-Dick with all the possible alternatives obtainable by shuffling Melville’s words into random sequences. These are not the four words that are used most often, or that carry the most ‘information’ in the everyday sense of the term, but the words whose positioning in the original, meaningful text differs most from the way they would be scattered in all other permutations. The ‘information’ here is of the mathematical, measurable kind: ‘most informative’ means ‘least randomly distributed’. It may seem a slightly odd way to try to quantify semantic content, as though, when Melville wrote Moby-Dick, it wasn’t so much a matter of finding the right words as of putting them down in the right order.
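For readers who want to see the shape of such a comparison, here is a rough Python sketch. It is an illustration under stated assumptions, not Montemurro and Zanette's published procedure: it cuts the text into a few equal parts, measures how evenly each word is spread across them (Shannon entropy), and compares that with the average spread over a sample of random shuffles rather than all possible permutations. The function names `spread_entropies` and `informativeness`, the number of parts and the number of shuffles are all hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def spread_entropies(tokens, n_parts):
    """Map each word to the entropy (in bits) of its spread over n_parts chunks.

    A word confined to one chunk scores 0; a word spread perfectly evenly
    scores log2(n_parts).
    """
    n = len(tokens)
    counts = {}
    for i, word in enumerate(tokens):
        part = i * n_parts // n  # index of the chunk this position falls in
        counts.setdefault(word, [0] * n_parts)[part] += 1
    result = {}
    for word, per_part in counts.items():
        total = sum(per_part)
        result[word] = -sum(
            k / total * math.log2(k / total) for k in per_part if k
        )
    return result


def informativeness(tokens, n_parts=4, n_shuffles=200, seed=0):
    """Score each word by how much less evenly it is spread than chance predicts.

    Assumption: the score is the word's relative frequency times the gap
    between its mean entropy in shuffled texts and its entropy in the original.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same scores
    actual = spread_entropies(tokens, n_parts)
    mean_shuffled = dict.fromkeys(actual, 0.0)
    pool = list(tokens)
    for _ in range(n_shuffles):
        rng.shuffle(pool)
        for word, h in spread_entropies(pool, n_parts).items():
            mean_shuffled[word] += h / n_shuffles
    freq = Counter(tokens)
    n = len(tokens)
    return {w: freq[w] / n * (mean_shuffled[w] - actual[w]) for w in actual}


if __name__ == "__main__":
    # Toy text: 'whale' is clustered at the start, 'the' is spread evenly.
    toy = ["whale", "the"] * 5 + ["sea", "the"] * 15
    for word, score in sorted(informativeness(toy).items(), key=lambda kv: -kv[1]):
        print(f"{word}: {score:.3f}")
```

In the toy text, a word that appears only in one stretch ends up with a positive score, while a word sprinkled evenly throughout scores at or below zero, however often it occurs. That is the sense in which ‘most informative’ means ‘least randomly distributed’.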