We do ten experiments. A scientist observes the results, constructs a theory consistent with them and uses it to predict the results of the next ten. We do them and the results fit his predictions. A second scientist observes the results of all twenty experiments and constructs a theory consistent with them.
The two theories give different predictions for the next experiment. Which do we believe? Why?
In case the puzzle is not obvious, let me offer the argument for what I believe is the wrong answer:
Imagine that each possible theory is written on a piece of paper and that the pieces of paper are sorted into barrels according to the results they predict for the first ten experiments. Within each barrel, theories are sorted into boxes according to the results they expect for the second ten experiments.
The first theorist restricted himself to the barrels containing theories consistent with the first ten experiments, drew a theory from a box in one of them, and it happened to be from a box containing theories also consistent with the next ten experiments. The second experimenter chose a box containing theories consistent with the second ten experiments from a barrel containing theories consistent with the first ten. All we know, in either case, is that the box the theory was drawn from contains theories consistent with all twenty experiments.
What is wrong with this model is the assumption that experimenters are drawing theories at random. Assume instead that some people are better than others at coming up with correct theories, at least on this subject. Since only a small fraction of theories are consistent with the second ten experiments it would be very unlikely for an experimenter choosing a box at random to come up with one of them; the fact that the first experimenter did so is evidence that he is good at coming up with correct theories. We have no similar evidence for the second person. Since we have more reason to believe that the first theorist is good at creating correct theories than that the second one is, we have more reason to believe his theory and so more reason to trust his prediction for the next experiment.
Statisticians may recognize the argument as a version of spurious contagion. Picking the right barrel does not make the theorist or his theory any better — but the fact that he picked the right barrel increases the probability that he is a good theorist, hence that he produced a good theory, hence that his prediction of the twenty-first experiment is correct.
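The argument can be put in numbers with Bayes' rule. What follows is only a minimal sketch under assumptions that are not in the text above: theorists come in two types, a skilled theorist produces a correct theory with some probability, an unskilled one with a much smaller probability, and an incorrect theory fits ten new results only by luck. All four numbers are hypothetical, chosen for illustration.

```python
def pass_probability(p_correct, luck):
    """Chance that a theory fits ten results it was not built to fit.

    A correct theory always passes; an incorrect one passes only by luck.
    """
    return p_correct + (1.0 - p_correct) * luck


def posterior_skilled(p_skilled, c_skilled, c_unskilled, luck):
    """P(theorist is skilled | his theory predicted the next ten results)."""
    pass_s = pass_probability(c_skilled, luck)
    pass_u = pass_probability(c_unskilled, luck)
    evidence = p_skilled * pass_s + (1.0 - p_skilled) * pass_u
    return p_skilled * pass_s / evidence


def prob_theory_correct(p_skilled, c_skilled, c_unskilled):
    """P(theory is correct) when all we know is that it fits known results."""
    return p_skilled * c_skilled + (1.0 - p_skilled) * c_unskilled


def prob_correct_given_pass(p_skilled, c_skilled, c_unskilled, luck):
    """P(theory is correct | it predicted ten results not yet known)."""
    prior = prob_theory_correct(p_skilled, c_skilled, c_unskilled)
    return prior / pass_probability(prior, luck)


# Hypothetical numbers, assumed purely for illustration.
P_SKILLED = 0.1      # share of theorists who are good on this subject
C_SKILLED = 0.5      # chance a skilled theorist's theory is correct
C_UNSKILLED = 0.01   # chance an unskilled theorist's theory is correct
LUCK = 0.001         # chance a wrong theory fits ten new results anyway

first_theorist = prob_correct_given_pass(P_SKILLED, C_SKILLED, C_UNSKILLED, LUCK)
second_theorist = prob_theory_correct(P_SKILLED, C_SKILLED, C_UNSKILLED)

print("P(first theorist is skilled):", posterior_skilled(P_SKILLED, C_SKILLED, C_UNSKILLED, LUCK))
print("P(first theory correct):     ", first_theorist)
print("P(second theory correct):    ", second_theorist)
```

The second theory fits all twenty results by construction, so its fit tells us nothing about its author and its probability stays at the prior. The first theory had to survive a test it could have failed, which is what moves the probability.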
So far I have implicitly defined a good theory as one that is always right. The same argument works for theories that are good but imperfect, that predict the results of experiments correctly much more often than alternative theories or make predictions that are closer to correct.
A related puzzle with the same solution starts by observing that there is a large, possibly infinite, number of different theories consistent with any set of facts, that any collection of data points can be fitted with an infinite number of different curves. Any body of evidence for evolution and the age of the earth is consistent with the theory that the world was created in 4004 B.C., complete with all evidence of its previous existence, with multiple theories differing in why it was done and whether by God, a god,1 or the Devil.
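The claim about curves is easy to check directly. The sketch below, with made-up data points and hypothetical function names, builds the one polynomial of lowest degree through the points and then adds any multiple of a term that is zero at every data point. Every resulting curve fits the data exactly, yet the curves disagree about the next observation.

```python
from math import prod


def lagrange(points):
    """Return the lowest-degree polynomial passing through all the points."""
    def curve(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            # Weight is 1 at xi and 0 at every other data point.
            weight = prod(
                (x - xj) / (xi - xj)
                for j, (xj, _) in enumerate(points)
                if j != i
            )
            total += yi * weight
        return total
    return curve


def rival_curve(points, k):
    """Another curve through the same points: add k times a term that
    vanishes at every observed x, so the fit to the data is unchanged."""
    base = lagrange(points)

    def curve(x):
        return base(x) + k * prod(x - xi for xi, _ in points)
    return curve


# Made-up observations, assumed only for illustration.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
curves = [rival_curve(points, k) for k in (0.0, 1.0, -2.5)]

for curve in curves:
    fitted = [curve(x) for x, _ in points]
    print("fits data:", fitted, " predicts at x=3:", curve(3.0))
```

Since k can be any real number, there are infinitely many such curves, and the data alone cannot choose among them.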
Since the evidence our beliefs are based on is consistent with multiple explanations, how can we know, how can we even have reason to believe, anything?
The answer is the same as to my first puzzle: we are equipped with pattern recognition software that can, to some degree, identify what pictures of reality are more or less likely. To deal with reality one must have some idea what it is, so if reality has detectable patterns evolution should have selected for the ability to spot them. That we manage to figure out quite a lot about reality, rarely eating poison or walking off cliffs, is evidence that it did.
The software is not perfect; we sometimes see patterns that are not there or fail to see ones that are. People vary in how good they are at correctly identifying patterns and in what domains. That is a reason to put more trust in a theory that correctly predicts the results of experiments not known when it was constructed than in one constructed to fit the known results.
I wrote about some of this on my blog many years ago. What started me thinking about it again was a recent Liberty Fund conference on the relation between the approaches to economics of the Chicago and Austrian schools. One characteristic of some versions of the Austrian approach is to rely on theory instead of data, in the extreme to claim that economic propositions can and should be derived with certainty from known axioms. If the evidence cannot tell you which explanation of the evidence is true, use logic instead.
The problem is that theory alone, pure a priori argument from axioms of human behavior, cannot predict anything of interest. If one is completely agnostic about the facts, including both utility functions and production technology, any physically possible pattern of human behavior, past or future, is consistent with economic theory. As I put it long ago in my Price Theory:
Why did I stand on my head on the table while holding a burning $1,000 bill between my toes? I wanted to stand on my head on the table while holding a burning $1,000 bill between my toes.
The Chicago school solution follows from my argument above. Use a priori argument, mostly the same economics that Austrians use, to form a conjecture. To find out if the conjecture, the tentative theory, is right, check its predictions against facts you did not know when you constructed it. If the predictions fit the facts, that is evidence that you have some ability to construct correct theories.
There is a second advantage to the Chicago approach that I discovered when I submitted my first economics article,2 “A Theory of the Size and Shape of Nations,” to the Journal of Political Economy. George Stigler, the editor, rejected it on the grounds that in order for an article to be published in his journal it had to contain evidence as well as theory. He offered suggestions for how to do it, none of which, in my view, were workable. I came up with other tests; their results supported the predictions of the theory. I revised, resubmitted and the article was accepted.
Having to generate predictions that could be tested against real-world data forced me to specify the theory and its implications more carefully than in the initial version. The result was a better theory. My tests of predictions of the theory provided evidence that the theory was at least in part correct but that was not the main benefit of making predictions and testing them. Asking what real world facts would support or refute your beliefs is a way of forcing yourself to think more clearly. It is a useful practice for economic theorists — and other people with other sorts of beliefs — even if they are not submitting to the JPE.
An Unrelated Puzzle
Tomorrow I am flying to Lisbon with my wife and daughter for the beginning of a speaking cum touristing trip. The flight is a red eye. It would be nice to spend the night in a bed; mostly out of curiosity, I checked prices for business and first class tickets a few weeks later. Business class cost about eight times as much as economy, first class something like twice that.
There are multiple routes long enough to make a bed for the night worth having, but probably not for many people at those prices. Why hasn’t any airline figured out a way to offer a bare bones sleeper, a problem railroads solved many decades ago?
I am told that stacking beds as rail cars do would encounter problems associated with safety regulations but even if that is not an option one should be able to get at least a third as many passengers, probably half as many, into a flying sleeper as into an ordinary economy flight. Does anyone do it? If not, why not?
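The arithmetic behind the puzzle is short. As a rough sketch, assume (this is not stated above) that a sleeper flight costs the airline about the same to operate as an ordinary economy flight and that either version sells out. Then the fare needed to bring in the same revenue is just the economy fare divided by the fraction of passengers carried. The function name is hypothetical.

```python
from fractions import Fraction


def break_even_fare_multiple(capacity_fraction):
    """Sleeper fare, as a multiple of the economy fare, that yields the
    same revenue per flight when the sleeper carries only
    capacity_fraction as many passengers as an economy cabin."""
    if not 0 < capacity_fraction <= 1:
        raise ValueError("capacity_fraction must be in (0, 1]")
    return 1 / capacity_fraction


BUSINESS_MULTIPLE = 8  # "about eight times as much as economy", from the text

for fraction in (Fraction(1, 3), Fraction(1, 2)):
    multiple = break_even_fare_multiple(fraction)
    print(f"{fraction} as many passengers: break-even fare is "
          f"{multiple}x economy, versus about {BUSINESS_MULTIPLE}x for business")
```

Under those assumptions a bare bones sleeper would need to charge only two to three times the economy fare, well under the business class price.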
1. Coyote and Anansi are plausible candidates.
2. Not counting the piece on population I wrote for the Population Council, an occasional paper, not a journal article.
Some long-haul airlines do offer beds in economy class. Air New Zealand for instance:
https://www.businessinsider.com/flew-on-worlds-4th-longest-flight-in-revolutionary-skycouch-review-2022-10
This is somewhat related to an article I wrote about epistemology. Economics is in that category with one foot in the hard sciences and one foot in the social sciences. The soft social sciences have a huge problem with this, going off theory and doing very few well-done experiments. We need to use the real world experiments of different culture groups to see how things work out in real life. We are getting further and further in the wrong direction by being all theory and no real world consequences. https://open.substack.com/pub/moralgovernment/p/10-reasons-to-or-not-to-believe-things?r=12fk1m&utm_medium=ios&utm_campaign=post