Statistical Errors
ways true facts can lead to false conclusions
Not Allowing for Trends
“When states said that wearing seatbelts was no longer a choice, traffic fatalities dropped by more than 50% nationwide between 1980 and 2009.” (Matt Mahan, mayor of San Jose)
Taken literally, the statement is false: fatalities were 51,091 in 1980 and 33,883 in 2009, a decline of 34%. It is true, indeed an understatement, for fatalities per passenger mile, which is the more relevant measure and might be what Mahan meant but did not say. [1]
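The first state to make wearing seat belts mandatory was New York, in 1984. During the preceding 29 years, from 1955 to 1984, fatalities per hundred million passenger miles fell from 6.06 to 2.57, a decline of more than 50%. The rate was already falling that fast before any state required seat belts, so a decline after the laws passed is not, by itself, evidence that they worked.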
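The two percentages above are easy to check. A minimal sketch, using only the figures quoted in the text; the helper name `percent_decline` is mine, not anything from the source data:

```python
def percent_decline(start, end):
    """Return the drop from start to end as a percentage of start."""
    return (start - end) / start * 100

# Total traffic fatalities, the literal reading of Mahan's claim.
total_1980_2009 = percent_decline(51_091, 33_883)

# Fatalities per hundred million passenger miles, before any seat belt law.
rate_1955_1984 = percent_decline(6.06, 2.57)

print(f"Total fatalities, 1980 to 2009: {total_1980_2009:.1f}% decline")
print(f"Fatality rate, 1955 to 1984: {rate_1955_1984:.1f}% decline")
```

The first number comes out well short of 50%, the second above it.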
Cherry Picking
Why did Mahan start his figure in 1980, four years before the first state made seat belt wearing mandatory? In 1980 the figure was 3.35; by 1984 it was down to 2.57. Why did he end the calculation in 2009? The figure that year was 1.15; the figure for 2023, the last year on the web page, was 1.27. The end points appear to have been chosen, by Mahan or someone he was quoting, to make the decline look larger. The decline from 1984 to 2023 was actually smaller than the decline from 1955 to 1984, and it took place over a longer period. If we eliminate the cherry picking and compare the rate of decline after the legal change to the rate before, the data provides no evidence of a benefit from the change.
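One way to compare periods of different length is to convert each into a compound annual rate of change. The sketch below does that with the figures quoted above; treating the decline as a constant compound rate is my simplifying assumption, and the function names are illustrative.

```python
def annual_change(start, end, years):
    """Compound annual rate of change, as a fraction per year."""
    return (end / start) ** (1 / years) - 1

# Fatalities per hundred million passenger miles, as quoted in the text.
rate = {1955: 6.06, 1980: 3.35, 1984: 2.57, 2009: 1.15, 2023: 1.27}

def window(first, last):
    """Annual rate of change between two of the quoted years."""
    return annual_change(rate[first], rate[last], last - first)

before_law = window(1955, 1984)    # before any mandatory seat belt law
after_law = window(1984, 2023)     # from the first law to the latest figure
mahan_window = window(1980, 2009)  # the end points Mahan used

for label, value in [("1955 to 1984", before_law),
                     ("1984 to 2023", after_law),
                     ("1980 to 2009", mahan_window)]:
    print(f"{label}: {value * 100:.2f}% per year")
```

On these figures the rate fell by roughly 2.9% a year before 1984 and roughly 1.8% a year after, while the 1980 to 2009 window gives a steeper annual decline than either.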
Looking At Only Part of the Maximand
From the Centers for Disease Control and Prevention we have:
Eating too much sodium can increase your blood pressure and your risk for heart disease and stroke. Together, heart disease and stroke kill more Americans each year than any other cause.
… Most people eat too much sodium (“About Sodium and Health”)
But from the National Institutes of Health:
Conclusion: Our observation of sodium intake correlating positively with life expectancy and inversely with all-cause mortality worldwide and in high-income countries argues against dietary sodium intake being a culprit of curtailing life span or a risk factor for premature death. (“Sodium intake, life expectancy, and all-cause mortality,” NIH)
If both factual claims are true, as I think they probably are, the implication is that the increased risk from heart disease and stroke is more than balanced by the decreased risk from other causes of death. If so, the CDC fact is true, its conclusion false.
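The logic can be made concrete with a toy calculation. Every number below is hypothetical, invented only to show how one cause of death can rise while total mortality falls; none of it comes from the CDC or the NIH paper.

```python
# HYPOTHETICAL deaths per 100,000 people per year, split by cause.
low_sodium = {"heart disease and stroke": 200, "all other causes": 650}
high_sodium = {"heart disease and stroke": 230, "all other causes": 600}

def all_cause(rates):
    """All-cause mortality is the sum over the individual causes."""
    return sum(rates.values())

cardio_change = (high_sodium["heart disease and stroke"]
                 - low_sodium["heart disease and stroke"])
total_change = all_cause(high_sodium) - all_cause(low_sodium)

print(f"Change in cardiovascular deaths: {cardio_change:+d}")
print(f"Change in all-cause deaths: {total_change:+d}")
```

Someone who looked only at the first line would conclude that more sodium is harmful; the second line is the part of the maximand he left out.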
The pattern of a negative effect outweighed by a positive effect is not merely possible but, in the biological context, something we should be looking for. We are, after all, very sophisticated biological machines designed by evolution. If more salt is bad for us there ought to have been selective pressure for a tendency to absorb up to the optimal level, excrete the surplus.
Ignoring Correlated Causes
· According to research from the University College London, children have a 16% higher likelihood of experiencing behavioral issues if their parents divorce when they are between the ages of 7 and 14.
· A 2019 study published in PNAS estimates that divorce is associated with an 8% lower probability of a child completing high school, a 12% lower probability of college attendance, and an 11% lower probability of college completion.
· Another 2019 study published in World Psychiatry reports children of divorced or separated parents are 1.5 to 2 times more likely to live in poverty. (Understanding the Impact of Divorce on Children)
The claims may all be true, but they do not tell us what the impact of divorce on children is, since parents who get divorced are not a random sample of all parents. Most obviously, lower-income couples are more likely to divorce than their higher-earning counterparts. Hence the children of divorced parents would probably be more likely to live in poverty even if their parents had not gotten divorced. Children of poorer parents are also less likely to complete high school, go to college, graduate.
It isn’t just income. Parents divorce for reasons probably related to their personality. Their personality also affects how they rear their children. To the extent that personality is heritable it affects their children’s personality. All of that affects the chance of behavioral issues in the children. All three studies as described confuse the effect of divorce with the effects of characteristics of couples that lead to divorce.
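A small simulation shows how a correlated cause can produce such a gap by itself. In the sketch below divorce has, by construction, no effect on graduation; income drives both. All of the probabilities are hypothetical and the function names are mine.

```python
import random

random.seed(0)

def simulate_family():
    """One hypothetical family. Divorce has NO causal effect on graduation here."""
    low_income = random.random() < 0.5
    # Assumption: low-income couples divorce more often.
    divorced = random.random() < (0.5 if low_income else 0.2)
    # Assumption: graduation depends on income only, not on divorce.
    graduated = random.random() < (0.7 if low_income else 0.9)
    return low_income, divorced, graduated

families = [simulate_family() for _ in range(200_000)]

def graduation_rate(rows):
    return sum(graduated for _, _, graduated in rows) / len(rows)

def gap(rows):
    """Graduation rate with intact parents minus the rate with divorced parents."""
    intact = [r for r in rows if not r[1]]
    divorced = [r for r in rows if r[1]]
    return graduation_rate(intact) - graduation_rate(divorced)

raw_gap = gap(families)
gap_low_income = gap([r for r in families if r[0]])
gap_high_income = gap([r for r in families if not r[0]])

print(f"Gap, all families: {raw_gap:.3f}")
print(f"Gap, low income only: {gap_low_income:.3f}")
print(f"Gap, high income only: {gap_high_income:.3f}")
```

The gap for all families is clearly positive, while the gap within each income group is close to zero. Comparing like with like is what removes the illusion; a study that reports only the first number cannot tell us the effect of divorce.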
Biased Sources — In Two Senses
A widely reported news story, headlined “Kids of lesbians have fewer behavioral problems, study suggests,” reported that
“A nearly 25-year study concluded that children raised in lesbian households were psychologically well-adjusted and had fewer behavioral problems than their peers.” (news story)
I located the study online and discovered that the conclusion about how well adjusted the children were was based entirely on the reports of their mothers. A more accurate headline would have read: “Lesbian Mothers Think Better of Their Kids than Heterosexual Mothers Do.” [2]
A famous example of the problem of biased samples, in the statistical sense of “biased,” was the 1948 presidential election. Polls predicted that Dewey would beat Truman; Truman won by a comfortable majority. At least part of the explanation was that polling was done mostly by telephone, and in 1948 households with phones were a biased sample, biased toward higher income households.
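The mechanism is easy to reproduce. The sketch below uses two hypothetical candidates, A and B, and invented probabilities for income, phone ownership and voting; it is not a model of the actual 1948 electorate.

```python
import random

random.seed(1)

def simulate_voter():
    """One hypothetical voter: returns (has_phone, votes_for_a)."""
    high_income = random.random() < 0.4
    # Assumption: higher-income households are more likely to own a phone.
    has_phone = random.random() < (0.8 if high_income else 0.3)
    # Assumption: higher-income voters lean toward candidate A.
    votes_for_a = random.random() < (0.6 if high_income else 0.4)
    return has_phone, votes_for_a

voters = [simulate_voter() for _ in range(200_000)]

# What the election will show: every voter counts.
true_share = sum(vote for _, vote in voters) / len(voters)

# What a telephone poll sees: only voters in households with phones.
phone_votes = [vote for has_phone, vote in voters if has_phone]
poll_share = sum(phone_votes) / len(phone_votes)

print(f"Candidate A, whole electorate: {true_share:.3f}")
print(f"Candidate A, telephone poll: {poll_share:.3f}")
```

With these assumptions the poll puts candidate A ahead while the electorate as a whole prefers B. A larger sample does not help, since the error comes from who is sampled, not how many.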
Selective Reporting
Many parents greatly overestimate the risk of child abduction, because a child being abducted is much more likely to be reported in the news than a child not being abducted. Similarly for shark attacks.
Direction Of Causation
During a lengthy House debate regarding the bill, Representative Jack Minor (D‑Flint) told his colleagues that studies show crime rates are lower in states without the death penalty. He noted, “The death penalty’s not a deterrent. In fact, the figures would suggest it’s just the opposite.” (Death Penalty Information Center reporting on a debate in the Michigan house)
That assumes that causation runs from law to crime. The obvious alternative is that causation runs in the other direction, that a high homicide rate is a reason for a state to have capital punishment.
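To see that reverse causation alone can produce Minor's pattern, here is a sketch in which capital punishment has, by assumption, no effect at all on homicide, and the only link runs from homicide rate to law. The jurisdictions and all the numbers are hypothetical.

```python
import random
from statistics import mean

random.seed(2)

states = []
for _ in range(5_000):  # many hypothetical jurisdictions, to reduce noise
    homicide_rate = random.uniform(1, 10)  # homicides per 100,000
    # Assumption: the higher the rate, the more likely the legislature
    # is to adopt capital punishment.
    has_death_penalty = random.random() < homicide_rate / 10
    # Assumption: the law has zero deterrent effect, so the rate is unchanged.
    states.append((has_death_penalty, homicide_rate))

with_penalty = mean(rate for has_dp, rate in states if has_dp)
without_penalty = mean(rate for has_dp, rate in states if not has_dp)

print(f"Average homicide rate with the death penalty: {with_penalty:.2f}")
print(f"Average homicide rate without it: {without_penalty:.2f}")
```

The states with the death penalty end up with the higher average rate even though the penalty does nothing here. That does not show that the penalty fails to deter, only that the comparison Minor cites cannot tell us either way.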
Small Sample Size
I can think of no good examples of a false conclusion based on a sample size too small to eliminate the influence of random variation. Perhaps a reader can offer one.
A reader does:
Law of large/small numbers: the schools with the highest average test scores are very small, and the schools with the lowest average test scores are very small. These facts, when presented independently, can cause (and have caused) people to assume that school size has a much larger impact on student education than it does.
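The reader's point can be checked with a simulation in which school size has no effect whatever: every student's score is drawn from the same distribution. The school sizes, the score distribution and the function name are hypothetical choices of mine.

```python
import random

random.seed(3)

def school_average(n_students):
    """Average score of a school; every student is drawn from the same distribution."""
    return sum(random.gauss(500, 100) for _ in range(n_students)) / n_students

# Assumption: 500 small schools (20 students) and 500 large ones (500 students).
sizes = [20] * 500 + [500] * 500
schools = [(size, school_average(size)) for size in sizes]

ranked = sorted(schools, key=lambda school: school[1])
bottom_sizes = [size for size, _ in ranked[:25]]
top_sizes = [size for size, _ in ranked[-25:]]

print(f"Small schools among the 25 lowest averages: {bottom_sizes.count(20)}")
print(f"Small schools among the 25 highest averages: {top_sizes.count(20)}")
```

Small schools crowd both ends of the ranking because an average over 20 students varies far more than an average over 500. Reporting only the top of the list would suggest that small schools are better, reporting only the bottom that they are worse.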
[2] For other problems with that study, including evidence of bias in a third sense, a biased researcher rather than biased sources, see my old blog post on the study.

Climate change claims seem to me to be frequently based on very short periods of accurate data collection, or on smallish areas which are then extrapolated, thus not accounting for natural variability over longer times and larger areas. Where I live we tend to get heavy downpours in August, so when we had a heavy rain on Thursday July 31st the headlines were all about the record breaking rainfall and climate change, even though we often got the same rainfall on other days of the week and frequently one calendar day later in August. This is the media that tell me it's "twice as hot" as some other period... they don't realize that 30 degrees C is not twice as hot as 15C. Just try expressing that in degrees Fahrenheit! One year in the 1970s had abnormally few forest fires, so guess where "record setting" accounts usually start counting. Media and others who should know better seem to confuse number of forest fires (now labeled the more dramatic wild fires) with the area burned individually or collectively. Anything to make a headline.
Here's one (small sample problem + Bayesian angle): A majority of rural communities have rates of kidney cancer that are lower than urban centers. Many are much lower. Is there something about rural communities that makes them less prone to the disease?