A Short History of IQ Testing: From Binet to the Modern Era

In 1904, the French government handed Alfred Binet a practical problem. France had made primary education compulsory in the 1880s, and schools needed a way to identify children who would struggle in ordinary classrooms so they could be given extra help. Crucially, the authorities wanted an objective method rather than a teacher's subjective impression, which could be biased by a child's background or behavior. Binet, working with his collaborator Théodore Simon, set out to build one.

What they produced in 1905 — the Binet–Simon scale — was the first practical intelligence test, and it established a template still recognizable today. Rather than measuring one narrow skill, it sampled a wide range of tasks of increasing difficulty: following commands, naming objects, defining words, completing sentences, reasoning about everyday situations. The logic was that a general capacity would reveal itself across many different kinds of problems.

The mental age idea

Binet's key innovation was the concept of mental age. By testing many children, he established what an average child could do at each chronological age. A child's "mental age" was the age level whose tasks they could successfully complete. A seven-year-old performing like an average nine-year-old had a mental age of nine; one performing like an average five-year-old had a mental age of five. The gap between mental and chronological age flagged children who needed support.

Binet was careful — almost anxious — about how his test should be used. He insisted it measured current performance, not innate or fixed capacity. He warned explicitly against treating the score as a permanent label, and against the "brutal pessimism" of assuming a low scorer could never improve. He saw the test as a tool for identifying who needed help, so they could be helped.

1905: The Binet–Simon Scale. Alfred Binet and Théodore Simon create the first practical intelligence test in Paris, to identify schoolchildren needing extra help.

1912: The "intelligence quotient" is born. German psychologist William Stern proposes dividing mental age by chronological age, the ratio that gives IQ its name.

1916: The Stanford–Binet. Lewis Terman at Stanford adapts the test for American use, popularizing the IQ score and multiplying the ratio by 100.

1917: Mass testing in WWI. The US Army tests 1.75 million recruits with the Army Alpha and Beta tests, the first large-scale group intelligence testing.

1939: The Wechsler scales. David Wechsler introduces a test with separate verbal and performance scales, and the deviation IQ, the basis of modern tests.

1984: The Flynn effect. James Flynn documents that raw IQ scores rose steadily throughout the 20th century, forcing periodic re-norming.

From ratio to quotient

The term "intelligence quotient" itself came from the German psychologist William Stern in 1912. He suggested expressing the result as a ratio: mental age divided by chronological age. When Lewis Terman at Stanford University adapted and standardized the test for American use in 1916 — creating the Stanford–Binet — he multiplied this ratio by 100 to remove the decimal. A child whose mental age matched their chronological age scored exactly 100. The number we still center every IQ scale on traces directly to this arithmetic convenience.
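The arithmetic behind the ratio IQ is small enough to sketch directly. The function name ratio_iq below is illustrative rather than historical, and the inputs are the mental-age examples from earlier in this article.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern's ratio (mental age / chronological age), scaled by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return 100.0 * mental_age / chronological_age


# A seven-year-old performing like an average nine-year-old.
print(round(ratio_iq(9, 7)))  # 129
# A seven-year-old performing like an average five-year-old.
print(round(ratio_iq(5, 7)))  # 71
# Mental age equal to chronological age gives exactly 100.
print(ratio_iq(7, 7))         # 100.0
```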

But the ratio method had a fatal flaw: it doesn't work for adults. Mental development plateaus in adulthood, so dividing by an ever-increasing chronological age would make everyone's IQ appear to decline with age — an absurd result. The solution came from David Wechsler.
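A rough sketch makes the problem concrete. It assumes, purely for illustration, that measurable mental age stops rising at 16; the exact plateau age is an assumption here, not a figure from this article, and the function name is hypothetical.

```python
PLATEAU_AGE = 16.0  # illustrative assumption: mental age stops rising here


def ratio_iq_with_plateau(chronological_age: float,
                          plateau_age: float = PLATEAU_AGE) -> float:
    """Ratio IQ for a perfectly average person whose mental age plateaus."""
    mental_age = min(chronological_age, plateau_age)
    return 100.0 * mental_age / chronological_age


# The same average person, measured at different ages.
for age in (10, 16, 32, 64):
    print(age, round(ratio_iq_with_plateau(age)))
# 10 100
# 16 100
# 32 50
# 64 25
```

Under this assumption, an entirely average adult would appear to lose half their IQ between 16 and 32, which is the absurdity the ratio method could not avoid.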

Wechsler and the deviation IQ

In 1939, David Wechsler, chief psychologist at New York's Bellevue Hospital, introduced a fundamentally better approach. Instead of a ratio of ages, he defined IQ by where a person fell in the distribution of their same-age peers. This is the deviation IQ, and it's what virtually every modern test uses.

The method is statistical. Scores are scaled so that the average for any age group is set to 100, with a standard deviation of 15. Assuming a normal distribution, about 68% of people score between 85 and 115, about 95% between 70 and 130, and a little over 2% fall in each tail, above 130 or below 70. Your IQ is no longer a ratio of ages; it's a statement about your position relative to others of your age.

The deviation IQ. Scores follow a normal distribution centered at 100. Each band of 15 points is one standard deviation. This is why "100" is average and why scores above 130 are rare: roughly 1 in 44 people.
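The percentages attached to the deviation IQ follow from the normal distribution and can be checked with Python's standard library. This is a minimal sketch, not how any published test is actually normed: the helper names deviation_iq and share_between, and the peer-group mean and standard deviation in the example, are hypothetical.

```python
from statistics import NormalDist

IQ_MEAN = 100.0
IQ_SD = 15.0
iq_scale = NormalDist(mu=IQ_MEAN, sigma=IQ_SD)


def deviation_iq(raw_score: float, peer_mean: float, peer_sd: float) -> float:
    """Convert a raw score to the IQ scale via its z-score among same-age peers."""
    z = (raw_score - peer_mean) / peer_sd
    return IQ_MEAN + IQ_SD * z


def share_between(low: float, high: float) -> float:
    """Fraction of a normally distributed population scoring between low and high."""
    return iq_scale.cdf(high) - iq_scale.cdf(low)


# One standard deviation above a hypothetical peer-group mean maps to 115.
print(deviation_iq(raw_score=60, peer_mean=50, peer_sd=10))  # 115.0

print(f"{share_between(85, 115):.1%}")   # about 68%
print(f"{share_between(70, 130):.1%}")   # about 95%
print(f"{1 - iq_scale.cdf(130):.1%}")    # a little over 2% above 130
```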

The darker chapters

No honest history of IQ testing can skip what happened when the tool left Binet's careful hands. In the United States, a number of early testing advocates were also proponents of eugenics — the belief that humanity could be "improved" by controlling who reproduced. IQ tests were used to justify the involuntary sterilization of thousands of people deemed "feeble-minded," and to support the discriminatory national-origin quotas of the 1924 Immigration Act. Tests administered in English to non-English-speaking immigrants, unsurprisingly, produced low scores that were then misread as evidence of innate inferiority.

These abuses were not failures of the underlying statistics so much as failures of interpretation and ethics — exactly the "brutal pessimism" Binet had warned against. They are a permanent reminder that a measurement is only as good as the judgment of those who wield it, and that treating a single number as a verdict on human worth has caused real and serious harm.

Some recent thinkers affirm that an individual's intelligence is a fixed quantity, a quantity which cannot be increased. We must protest and react against this brutal pessimism. — Alfred Binet, 1909

The Flynn effect and modern testing

One of the most striking discoveries came from political scientist James Flynn, who documented in the 1980s that raw IQ scores had risen substantially across the developed world throughout the 20th century, by roughly three points per decade. Because tests are re-normed to keep the average at 100, this rise was invisible unless you compared old and new norms directly. The Flynn effect remains partly mysterious, with explanations ranging from better nutrition and education to increasing familiarity with abstract reasoning. Whatever its cause, it showed that test performance is not a fixed biological constant.
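To see why re-norming matters, the drift can be approximated with simple arithmetic. The sketch below assumes a constant, linear gain of three points per decade, which is a simplification of the rate quoted above; the function name is hypothetical.

```python
POINTS_PER_DECADE = 3.0  # approximate rate quoted above, treated as constant


def score_on_current_norms(score_on_old_norms: float,
                           norm_age_years: float,
                           points_per_decade: float = POINTS_PER_DECADE) -> float:
    """Estimate the score against up-to-date norms, assuming linear drift."""
    drift = points_per_decade * norm_age_years / 10.0
    return score_on_old_norms - drift


# An "average" 100 on norms that are 20 years out of date
# corresponds to roughly 94 against current norms.
print(score_on_current_norms(100, 20))  # 94.0
```

The same performance looks stronger against older norms, which is why outdated norms inflate scores.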

Today's gold-standard instruments (the Wechsler Adult Intelligence Scale, or WAIS, now in its fifth edition, and the modern Stanford–Binet) are sophisticated, carefully normed batteries administered one-on-one by trained psychologists over a couple of hours. They produce not just a single number but a profile across multiple domains, such as verbal comprehension, visual-spatial ability, fluid reasoning, working memory, and processing speed. The single "IQ score" survives mostly as a convenient summary of this richer picture.

The practical takeaway

The IQ test was born as a humane, practical tool and has been both genuinely useful and genuinely misused. Modern tests are far more refined than Binet's, and the deviation-IQ method gives scores real statistical meaning. But the history is a standing warning: a number that measures a slice of cognitive performance is not a measure of human value, potential, or destiny — and was never meant to be.

Binet would likely recognize little of what his invention became — the mass testing, the controversies, the cultural weight a three-digit number came to carry. But he would recognize the original purpose, still valid today: a carefully built test, honestly interpreted, can tell us something useful about how a mind is working right now. The key word, as he insisted from the start, is honestly.

This article is for educational purposes. Cortextest assessments are not clinical instruments and are not equivalent to the WAIS, Stanford–Binet, or other professionally administered tests. For a formal assessment, consult a licensed psychologist.
