Late last autumn this book received a prodigious amount of attention in the United States. No one who has been exposed to any of the American media can have escaped it. Among the reactions was a chorus of élite liberal denunciations. The New Republic of 31 October ran a piece by Murray followed by 18 criticisms. Stephen Jay Gould spoke out in the New Yorker of 28 November. I especially recommend Alan Ryan’s analysis in the New York Review of Books of 17 November, followed in the 1 December issue by Charles Lane’s examination of some of the sources of statistical information in this book, sources closely connected with an Edinburgh publication, the Mankind Quarterly. Lane is particularly useful on Richard Lynn, a professor at the University of Ulster, who is cited 24 times in the book, but whose research will strike many readers as questionable.
The authors maintain that there is an accurate unitary measure of general intelligence, named g, first isolated in 1904 by the British psychologist Charles Spearman, who became Professor of Mind and Logic at University College London. (Our authors, and some of their critics, persist in referring to Spearman as a ‘former British Army officer’, which he was, but for goodness sake, he was professor at what was, in its day, the world centre for research on statistical inference, and took his doctorate at Leipzig, in its day the world centre for experimental psychology. This does not prove his work is valuable. Indeed I mistrust factor analysis, the technique he invented, and I am sceptical about g, but what game is being played by referring to Spearman simply as a military man?) Spearman’s g, claim Herrnstein and Murray, is a feature underlying what is measured in any plausible test for intelligence. Wary of the word ‘intelligence’, they write of ‘cognitive ability’, which, thanks to the prowess of modern cognitive sciences, sounds good, but they say the phrase means exactly, no more and no less, what intelligence testers call ‘intelligence’. Seeking vernacular but polite usage, they call people on the lower end of the IQ scale ‘dull’ and ‘very dull’. For people at the other end they often use the excellent American word ‘smart’, which gets the British ‘clever’, and ‘bright’ and ‘quick’ all at once.
Herrnstein and Murray hold that IQ, as measured, is distributed as what mathematicians call a Gaussian or normal distribution. Intuitively, this looks like a bell; hence the title of the book. The authors hold that the average IQ of East Asian populations is higher than that of white Americans (here they rely heavily on the dubious work of Professor Lynn). They correctly note that on average American blacks score substantially worse than American whites. They suggest that IQ is partially inherited, which they manifestly believe, but they do insert words of caution and rightly note that many of their theses do not depend on genetically transmitted IQ.
Herrnstein and Murray compile and tabulate an immense amount of information from many sources. Their original contribution is based on the National Longitudinal Survey of Labour Market Experience of Youth. Starting in 1979, this survey has been tracking about 12,500 young Americans, following not only their family background, education, employment, marriage and child-bearing, but measuring their IQ. This unique resource enables our authors to show that IQ is a fairly good predictor of many social variables – educational attainment, unemployment, salary when employed, pregnancy in and out of wedlock. In the earliest part of the book they provide results for white Americans only, and show that low IQ predicts, with some probability, numerous bad events or qualities, while high IQ predicts good ones. Men are studied chiefly as solid workers or the opposite, unemployed or criminals. Women are studied chiefly as solid homemakers or the opposite, single mothers on welfare. When it is careful, the book confines itself to the language of prediction and correlation, but often it lapses, and speaks of cause or effect; low IQ as a cause of poverty, unemployment or illegitimate births.
I’ve just spoken of good and bad outcomes. The value system throughout The Bell Curve is based strictly on what are now called ‘family values’. The authors would restore the word ‘illegitimate’ in place of ‘out-of-wedlock’ or whatever. They do so on the basis of a single statement by Malinowski (1884-1942), the anthropologist of Trobriand Island fame, who wrote that in every known society ‘there runs the rule that the father is indispensable for the full sociological status of the child as well as of the mother, that the group consisting of a woman and her offspring is sociologically incomplete and illegitimate.’ They omit that even Malinowski allowed that the man need not be the biological father, indeed need never even set eyes on the child once a ceremony is performed.
To follow the logic of The Bell Curve it is useful to read carefully the chapter called ‘Family Matters’. Low IQ does at present predict unmarried motherhood. But at best it predicts it only within current social arrangements, since births out of wedlock have sky-rocketed without any significant change in IQ distribution. Hence talk of ‘cause’ invites a discussion of society, not IQ. Herrnstein and Murray nevertheless ‘have this causal model in mind ... The less intelligent the woman is, the more likely that she does not think ahead from sex to procreation ... How intelligent a woman is may interact with her impulsiveness ... The result is a direct and strong relationship between ... low intelligence and the likelihood that the child will be born out of wedlock.’ (My dots of omission indicate I’ve left out the converse statements about ‘smart’ women.)
This ‘causal model’ has no foundation in IQ testing. IQ as measured quite ignores prudence and impulse. Some people score very well on time-limited tests by recklessly guessing the last 20 answers in the last 20 seconds available. And why does low IQ now ‘interact with impulsiveness’ when it didn’t, much, a quarter century ago? This is not a ‘causal model’ but a causal fantasy.
The regular turning of prediction into cause is studied, in statistics, under the label ‘spurious correlation’. In this book it is pernicious. Consider the thesis that one is born with an IQ, and that in any stable population the IQ distribution will be fairly stable from generation to generation. If IQ were causal, rather than correlated in a particular social arrangement, a dreadful fatalism would seem to ensue. And that of course is what the ‘white’ part of the book is setting us up for. With fierce cunning we are taught the predictive power of pencil and paper IQ tests when taken by white Americans, and are constantly nudged towards thinking of IQ as being a cause. Move on. Black people on average have lower IQ than white ones. Therefore there are a lot of causal factors in their make-up which, collectively, they cannot escape. That is a bald way in which to describe the logic of the book. Yes, the authors constantly insert disclaimers and often write with elegant caution. The disclaimer I would want to see in bold letters at the top of most chapters is this: statistical predictability does not of itself imply causation.
The authors seem wilfully to conflate cause and correlation at every stage of their writing. Thus there are helpful explanations of statistical ideas. One, on page 67, is in a box headed, ‘A Primer on the Correlation Coefficient’. In their statistical appendix, page 553, they say that this primer ‘should be satisfactory for people who are at home with math but never took a statistics course’. They end the box with the words: ‘A correlation of .2 can nevertheless be “big” for many social science topics. In terms of social phenomena, modest correlations can produce large aggregate effects. Witness the prosperity of casinos despite the statistically modest edge they hold over their customers.’ As if the edge at casinos had anything to do with correlation coefficients! A roulette wheel has 18 reds, 18 blacks and two zero slots. The house collects all stakes when the wheel stops at zero. Hence once every 19 spins, on average, it collects all stakes. We well understand what causes the wheel to be profitable: physical and geometrical symmetries, within the rules of the game. But this has nothing to do with correlation. Thus our authors invite us to compare correlation to causal processes we think we understand, but which are not examples of correlation coefficients.
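For readers who want the roulette arithmetic spelled out, here is a minimal sketch in Python. The slot counts are those given above; the even-money bet on red and all the variable and function names are my own illustration, not anything taken from the book. Note that nothing in it is a correlation coefficient: the edge falls straight out of the known layout of the wheel.

```python
from fractions import Fraction

# Wheel as described in the text: 18 red, 18 black and 2 zero slots.
RED, BLACK, ZERO = 18, 18, 2
SLOTS = RED + BLACK + ZERO

# Probability that the ball stops on a zero slot on any one spin.
p_zero = Fraction(ZERO, SLOTS)

def expected_player_gain(stake=1):
    """Expected net gain on an even-money bet on red (illustrative bet).

    The player wins `stake` on red and loses it on black or zero.
    """
    p_win = Fraction(RED, SLOTS)
    p_lose = 1 - p_win
    return stake * p_win - stake * p_lose

# The house edge is the player's expected loss per unit staked.
house_edge = -expected_player_gain()

print("P(zero) =", p_zero)
print("house edge per unit staked =", house_edge, "=", float(house_edge))
```

The zero probability and the house edge come out as the same fraction, which is the 'once every 19 spins' of the text. The number is fixed by the symmetries of the wheel and the rules of the game, which is why it is a poor analogy for a correlation of .2 between two social variables.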
I will not discuss the racial part of the book that has attracted so much debate. I repeat my suggestion that the fundamental trouble does not occur in the black part of the book but in the white part. There it is truly claimed that at present low scores on IQ tests predict bad outcomes. That is taken, intermittently, to be part of a causal claim. Scoring poorly is a trait that causes the bad outcomes. That idea is reinforced by the proposition that this trait indicates a low Spearman coefficient of general intelligence g. (See the end of Gould’s article, mentioned in the first paragraph, for problems with this.) Then we move on to the black part of the book, where it is truly asserted that African-Americans tend to score poorly on IQ tests. At most one can infer that this predicts bad outcomes for them, in the present state of things. One cannot infer that they have some permanent trait that causes them to have bad outcomes.
Herrnstein and Murray are part of the fin-de-siècle gloom movement whose adherents range from Edward Luttwak, writing powerfully in these pages on 7 April last year, to Jeremy Rifkin in The End of Work. From an abstract point of view – the vantage point of those who have high IQs and some idle time – Herrnstein and Murray have quite the neatest version of doom. People of high IQ have increasingly come to dominate the American educational scene, so that the best universities admit only students who are virtually off the top end of the IQ scale. Those people grow up to take the best jobs and create more jobs demanding their types of talent. They mix only with each other, they intermarry, they foster their abilities in their offspring. They form a new caste living in isolated comfort. Meanwhile, those on the other side of the bell curve socialise and breed among their own kind. No matter whether culture or genes most influence IQ, the underclass will retain its low IQ and be increasingly unfit for the jobs provided by the élite. Moreover, the underclass, according to this theory, will be increasingly white, for though blacks on average score much worse than whites, there are many more whites, and hence more dull whites than dull blacks. There are more steps in this chain of reasoning but now you can fill them in yourself. The end result is the custodial state. The minimum needs of the underclass are to be provided within a fierce programme of law and order; the top and the bottom sectors of society will be physically separated from each other everywhere, less by walls of bricks and mortar than by electronic surveillance and control.
Notice how our authors have done a remarkable job of recasting Plato. Plato forecast that as his ideal Republic degenerated it would pass through oligarchy to democracy, which in turn would succumb to tyranny. Herrnstein and Murray also begin with their near ideal, the Republic after the Founding Fathers. It becomes democratic and then is replaced by the oligarchy of IQ. This may be a richer idea than they know. Technological democracies have evolved a remarkable device that helps stop the slide into tyranny – a device suggested by the title of Theodore Porter’s new book, Trust in Numbers. Instead of trying to kill each other, and thus turning to a tyrant in order to survive, we have concocted an endless array of objective measures and tests of almost everything. When we disagree we do studies, we count and measure so that the very numbers themselves serve as buffers and arbiters. IQ is an early example. Herrnstein and Murray begin their book noting how incredibly stratified America’s best colleges have become, populated only by top IQs. Hardly a surprise, because students are selected primarily by test scores and high school grades, plus an ability to score well at lots of different things. This is the spirit of numerical democracy at work, barely tempered by the admission of a few children of old boys or benefactors. Trust the numbers! That generates a college élite which becomes a managerial and professional élite whose members want to be around people like themselves and want their type of work to be most highly rewarded, and want everyone but drudges to work as they work.
In this Platonistic vision, IQ tests and their ilk began as attempts to make objective discriminations independent of social class, but turn out to have an incredible feedback effect. They create a new social class and turn democracy into oligarchy. The side of myself that once enabled me to score well on IQ tests just loves this idea and I could expatiate on it for pages. But a sense of reality intrudes; just as it intruded when Aristotle thought about Plato.
I have to confess that I much enjoyed parts of this book – the reams of information, the simplistic data analysis, the glorious caricature of the End of America as We Have Known It. I found the last chapter especially endearing, for it is a picture of a gentler, more caring America of small neighbourhoods and clear codes of conduct. One is not troubled that this utopia can hardly arise from the custodial state that is so grimly and fatalistically forecast in the preceding chapter. The reader has become used to the incoherence of successive structures here described. The Bell Curve would be the perfect textbook for a course on Post-Modern statistics. Unfortunately the book must be taken seriously, for it is presented as a reasoned attack on the entire welfare structure and ambitions of the United States. As Alan Ryan remarks in the piece I mentioned at the start, none of The Bell Curve’s anti-welfare theses follows from any of the phenomena about IQ contained in the book. But at this point reasoning falters. This is because ‘it does not follow’ is a very imperfect reply to demagogues.
Forgive me for concluding a review of The Bell Curve with a few pedantic remarks about the bell curve. In their first appendix, ‘Statistics for People Who Are Sure They Can’t Learn Statistics’, Herrnstein and Murray give a good simplistic account of standard deviations, regression, correlation and the like. They also assert that ‘the phrases normal distribution or bell-shaped curve, or, as in our title, bell curve... refer to a common way that natural phenomena arrange themselves approximately ... It makes sense that most things will be arranged in bell-shaped curves. Extremes tend to be rarer than the average.’ They do not note that there are three distinct kinds of bell.
1. The curve of errors. Around 1800, some of the greatest mathematicians of the day, such as Gauss and Laplace, proposed a highly plausible mathematical model of the distribution of errors made by an observer using an instrument to determine the position of an object such as a heavenly body. There was a real, true unknown value. If the measurements were unbiased, their average, or mean, would be that value, and the curve of error modelled the deviation around that real true value.
2. Biometric distributions. About fifty years later a Belgian astronomer, Quetelet, noted that measurements of many biological variables are distributed like the curve of errors. His first example was the chest circumference of soldiers in Highland regiments; Murray and Herrnstein use the heights of boys in your high-school gym class. (It’s amazing how often in books like this we end up, after almost all is said and done, in the locker room.) Notice that the mean is no longer aiming at a measure of a real quantity existing in nature, but is just an average, and that the deviation around the mean is not produced by physical or geometrical symmetries in the measuring device plus observer.
Karl Pearson, often called the founder of biometrics, was so convinced that these distributions were widespread that he called them normal. In most other languages they are still called Gaussian. Pearson’s mentor and patron, Francis Galton, inventor of regression and correlation, long warned against trying to fit all biometrical distributions into the ‘Procrustean Bed’ of the error curve.
3. Normalised test results. Because many real biometric variables such as height are distributed like the curve of errors, it was supposed that postulated quantities should follow the same curve. IQ is the classic example. Questions were chosen so that (in the simplest case) half the population being tested would answer correctly, giving a mean score of 100. The skill of designing a test was, in part, to choose questions so that the results in the population formed, roughly, a Gaussian curve. If they did not, change the questions. The greatest designer of tests was Lewis Terman, who invented the name IQ, and who did the first massive testing, of US Army recruits in 1917. When attention was turned to women, it emerged that female scores were higher than male ones. Solution: find the questions that women answer better than men and replace them. Thus it is a fact of biometrics that males of the same population are on average taller than females. But it was not an empirically discovered fact of nature, revealed by Terman’s final tests, that females have the same average IQ as men. It was a fact of test design. I do not mean this observation to impugn the testing industry. I am saying only that the bell curve of IQ is a logically different species of beast from the bell curve of biometrics, in turn logically different from the original bell curve of error. A technicality? No, because one ought to conceptualise causality very differently in the three cases. And this is a book which claims, au fond, to be about causality.