Unofficial SJG Archive

The Unofficial Stephen Jay Gould Archive




Hypotheses, Facts, and the Nature of Science

by Douglas Futuyma


How, for example, can you be sure that DNA is the genetic material? What if the scientists who "proved" it made a mistake? Has anything really been proved absolutely true? Is science merely one way—the dominant Western way—of perceiving the world, no more or less valid than other perceptions of reality? Is evolution a fact or a theory? Or is it just an opinion I'm entitled to hold, just as creationists are entitled to their opposite opinion?

Consider a hypothetical example. You are assigned to determine why sheep are dying of an unknown disease. You take tissue samples from 50 healthy and 50 sick sheep, and discover a certain protozoan in the liver of 20 of the sick animals, but only 10 of the healthy ones. Is this difference enough to reject the NULL HYPOTHESIS: that the two groups of sheep do not really differ in the incidence of protozoans? To answer this question, you do a statistical test to see whether the difference between these numbers is too great to have arisen merely by chance. You calculate the chi-square (χ²) statistic (it is 4.76), look it up in a statistical table of chi-square values, and find that "0.025 < p < 0.05." What does this expression, which you will find the like of in almost all analyses of scientific data, mean? It means that (assuming you had a random sample of sick sheep and healthy sheep) the probability is less than 0.05, but more than 0.025, that the difference you found could have been due to chance alone and that there is no real difference in protozoan infection rates of sick and healthy sheep, at large.
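The arithmetic behind these figures can be checked directly. What follows is a minimal sketch in Python, assuming the ordinary Pearson chi-square formula for a 2 x 2 table (no continuity correction) and one degree of freedom. The function names are illustrative and do not come from the text.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # With 1 degree of freedom the statistic is the square of a standard normal
    # variable, so the tail probability reduces to the complementary error function.
    return math.erfc(math.sqrt(chi2 / 2.0))

# Rows: sick sheep, healthy sheep. Columns: protozoan found, protozoan not found.
chi2 = chi_square_2x2(20, 30, 10, 40)
p = p_value_df1(chi2)
print(round(chi2, 2), round(p, 3))  # prints 4.76 0.029
```

The computed probability, about 0.029, falls between the tabulated bounds of 0.025 and 0.05 quoted above.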

Every experiment or observation in science is based on samples from the larger universe of possible observations (all sheep, in this case), and in every case, there is some chance that the data misrepresent the reality of this larger universe. That is, it is always possible to mistakenly reject the null hypothesis—the hypothesis that there is no difference between groups of sheep, that there is no effect of an experimental manipulation, or that there is no correlation between certain variables. In some cases, happily, the probability of rejecting a true null hypothesis, and of accepting as true a false alternative hypothesis, may be 0.00001 or less—in which case you would feel confident that you can reject the null hypothesis, but not absolutely certain.
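A small simulation can make the risk of mistakenly rejecting a true null hypothesis concrete. The sketch below assumes, purely for illustration, that sick and healthy sheep share the same true infection rate of 0.3, draws many pairs of 50-sheep samples, and counts how often the chi-square statistic exceeds 3.841, the tabulated critical value for p = 0.05 with one degree of freedom. The infection rate, number of trials, seed, and function names are assumptions, not values from the text.

```python
import random

CRITICAL_05 = 3.841  # tabulated chi-square critical value, p = 0.05, 1 degree of freedom

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # an empty margin gives no evidence of a difference
    return n * (a * d - b * c) ** 2 / denominator

def false_rejection_rate(true_rate=0.3, group_size=50, trials=20000, seed=1):
    """Fraction of samples that reject the null hypothesis even though it is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups are drawn from the same population, so the null hypothesis holds.
        sick = sum(rng.random() < true_rate for _ in range(group_size))
        healthy = sum(rng.random() < true_rate for _ in range(group_size))
        chi2 = chi_square_2x2(sick, group_size - sick, healthy, group_size - healthy)
        if chi2 > CRITICAL_05:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(rate)
```

Under these assumptions the printed fraction should land near 0.05: roughly one sample in twenty shows a "significant" difference that does not exist in the population at large.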

So the study of 100 sheep supports the hypothesis that sick sheep are more likely to have protozoans—but only weakly. You suspect that the protozoans might be the cause of death, but you are worried by the imperfect correlation. So you expand your sample to 1000 sheep, take liver biopsies and examine them more carefully for protozoans (revealing cases that you might have missed in your first study in which the protozoans are present, but at low density), and record which sheep die within the following year. To your great satisfaction, only 5 percent of the sheep in which you did not find protozoans die; 95 percent of the infected sheep die, and when all the survivors are slaughtered at the end of the year, you find that the apparently healthy sheep still show no sign of infection. You triumphantly report to your advisor that the protozoan is the cause of the disease. Right?

Wrong, says she. You haven't eliminated other hypotheses. Maybe the disease is caused by a virus that incidentally also lowers the animals' resistance to a relatively harmless protozoan. Maybe some sheep have a gene that shortens their life and also lowers their resistance to infection. What you must do, she says, is an experiment: inject some sheep, at random, with the protozoan and others with a liquid that is the same except that it lacks the organism. You do so, and after several failed experiments—it turns out that the infection doesn't take unless the sheep consume the protozoan orally—you are delighted to report that 90 of the 100 experimentally infected sheep died within 3 months, and 95 of the 100 "control" sheep lived through the 1-year duration of the experiment. The chi-square test shows that p < 0.0001: there is an exceedingly low probability that your results are due to chance.
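The two steps of this experiment, random assignment and the final test, can be sketched as follows. This is only an illustration: the flock of 200 numbered sheep, the seed, and the function names are assumptions, and the chi-square formula is the ordinary Pearson statistic for a 2 x 2 table with one degree of freedom, applied to the counts reported above.

```python
import math
import random

def assign_at_random(sheep_ids, seed=None):
    """Shuffle the flock and split it in half: (treated, control)."""
    rng = random.Random(seed)
    shuffled = list(sheep_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Letting a shuffle decide which animals are infected removes the experimenter's choice.
treated, control = assign_at_random(range(200), seed=42)

# Rows: infected sheep, control sheep. Columns: died, survived.
chi2 = chi_square_2x2(90, 10, 5, 95)
p = p_value_df1(chi2)
print(len(treated), len(control), round(chi2, 2), p < 0.0001)  # prints 100 100 144.86 True
```

The statistic is about thirty times larger than in the first study of 100 sheep, which is why the probability of a chance result is so much smaller here.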

At this point, you may have considerable confidence that the protozoan causes disease and death. But you still haven't absolutely proved it. Is it possible that you isolated and fed to the sheep not only protozoans, but an unseen virus? Are you sure you infected sheep at random, or might you subconsciously have chosen weaker-looking animals to infect? What do you suppose explains the 15 animals that didn't fit the hypothesis? And even if p < 0.0001, there's still a chance, isn't there, that you had a bad "luck of the draw"? We need not belabor the example longer, but it provides several lessons.

First, data in themselves tell us nothing: they have to be interpreted in the light of theory and prior knowledge. In this example, we need (among other things) probability theory (which underlies statistics such as the chi-square test), the theory of experimental design, and the knowledge that viruses exist and might confound our conclusions. The history of science is full of examples of conclusions that had to be modified or rejected in the light of new theory and information. Until the late 1950s, for instance, almost all geologists believed in the fixed position of the continents; now all believe in plate tectonics and continental drift, and many geological phenomena have had to be reinterpreted in this light.

Second, our hypothetical research experience shows us that arriving at a confident conclusion takes a lot of work. It is easy to overlook that every sentence in a textbook purporting to state a fact is based on research that required immense effort, usually at least a few years of at least one person's lifetime. For this reason, scientists usually defend their conclusions with considerable vigor—a point to which we will soon return.

Third, and most important, research, no matter how carefully and painstakingly conceived and executed, approaches proof, but never fully attains it. There is always some chance, although it may seem almost nonexistent, that the hypothesis you have come to accept will someday be modified or rejected in the light of utterly new theories or data that we cannot now imagine. Consequently, almost every scientific paper couches its conclusions in terms that leave some room for doubt. In a paper on Drosophila genetics that happened just now to be within reach, I read the conclusion: the experiment "suggests that different mechanisms mediate the two components of sperm displacement" (Clark et al. 1995). The data are, in fact, exquisite, the experiment carefully designed, the statistical analyses exemplary—but the authors do not claim to have proved their point. Scientists often have immense confidence in their conclusions, but not certainty. Accepting uncertainty as a fact of life is essential to a good scientist's world view.

Any statement in science, then, should be understood as a HYPOTHESIS—a statement of what might be true. Some hypotheses are poorly supported. Others, such as the hypothesis that the earth revolves around the sun, or that DNA is the genetic material, are so well supported that we consider them to be facts. It is a mistake to think of a fact as something that we absolutely know, with complete certainty, to be true, for we do not know this of anything. (According to some philosophers, we cannot even be certain that anything exists, including ourselves; how could we prove that the world is not a self-consistent dream in the mind of God?) Rather, a fact is a hypothesis that is so firmly supported by evidence that we assume it is true, and act as if it were true.

Why should we share scientists' confidence in the statements they propound as well-supported hypotheses or as facts? Because of the social dynamics of science. A single scientist may well be mistaken (and, very rarely, a scientist may deliberately falsify data). But if the issue is important, if the progress of the field depends on it (as, for example, all of molecular biology depends on the structure and function of DNA), then other scientists will skeptically question the report. Some may deliberately try to replicate the experiment; others will pursue research based on the assumption that the hypothesis is true, and will find discrepancies if in fact it is false. In other words, researchers in the field will test for error, because their own work and their own careers are at stake. Moreover, scientists are motivated not only by intellectual curiosity, but also by a desire for recognition or fame (although they seldom can hope for fortune), and disproving a widely accepted hypothesis is a ticket to professional recognition. Anyone who could show that heredity is not based on DNA, or that AIDS is not caused by the human immunodeficiency virus, would be a scientific celebrity. Of course, those who originally propounded the hypothesis have a lot at stake—a great investment of effort, and even their reputations—so they typically defend their view passionately, even sometimes in the face of damning evidence. The result of this process is that every scientific discipline is full of controversies and intellectual battles between proponents of opposing hypotheses. There is competition—a kind of natural selection—among ideas, with the outcome decided by more evidence and ever-more rigorous analysis, until even the most intransigent skeptics are won over to a consensus view (or until they die off).

Evolution as Fact and Theory

Is evolution a fact, a theory, or a hypothesis? In science, words are often used with precise meanings and connotations that differ from those in everyday life. This is an exceedingly important point, and we will encounter many examples in this book (e.g., fitness, random, correlation). Among such words are hypothesis and theory. People often speak of a "mere" hypothesis (as in "it is merely a hypothesis that smoking causes cancer") as if it were an opinion unsupported by evidence. In science, however, a hypothesis is an informed statement of what might be true. It may be poorly supported, especially at first, but as we have seen, it can gain support to the point at which it is effectively a fact. For Copernicus, the revolution of the earth around the sun was a hypothesis with modest support; for us, it is a hypothesis with strong support.

Likewise, a theory in science is not an unsupported speculation. Rather, it is a mature, coherent body of interconnected statements, based on reasoning and evidence, that explains a variety of observations. Or, to quote the Oxford English Dictionary, a theory is "a scheme or system of ideas and statements held as an explanation or account of a group of facts or phenomena; a hypothesis that has been confirmed or established by observation or experiment, and is propounded or accepted as accounting for the known facts; a statement of what are known to be the general laws, principles, or causes of something known or observed." Thus atomic theory, quantum theory, and the theory of plate tectonics are not mere speculations or opinions, nor are they even well-supported hypotheses (such as the hypothesis that smoking causes cancer). Each is an elaborate scheme of interconnected ideas, strongly supported by evidence, that accounts for a great variety of phenomena.

Because a theory is a complex of statements, it usually does not stand or fall on the basis of a single critical test (as simple hypotheses often do). Rather, theories evolve as they are confronted with new phenomena or observations; parts of the theory are discarded, modified, added. The theory of heredity, for instance, consisted at first of Mendel's laws of particulate inheritance, dominance, and independent segregation of the "factors" (genes) that affect different characteristics. Exceptions to dominance and independent segregation were soon found, but the core principle of particulate inheritance remained. Building on and adding to this core throughout the twentieth century, geneticists have developed a theory of heredity far more complex and detailed than Mendel could have conceived. Parts of the theory are exceedingly well established, other parts are still tentative, and we may expect many additions and changes as the mechanisms of heredity and development are plumbed further.

In light of the preceding discussion, evolution is a scientific fact. But it is explained by evolutionary theory. In The Origin of Species, Darwin propounded two large hypotheses. One was descent, with modification, from common ancestors, or, for simplicity, the hypothesis of descent with modification. I will also refer to this as the "historical reality of evolution." The other large hypothesis was Darwin's proposed cause for descent with modification: that natural selection sorts among hereditary variations.

Darwin provided abundant evidence for the historical reality of evolution—for descent, with modification, from common ancestors. Even in 1859, this idea had considerable support. Within about 15 years, all biological scientists except for a few diehards had accepted this hypothesis. Since then, hundreds of thousands of observations, from paleontology, biogeography, comparative anatomy, embryology, genetics, biochemistry, and molecular biology, have confirmed it. Like the heliocentric hypothesis of Copernicus, the hypothesis of descent with modification from common ancestors has long held the status of a scientific fact. No biologist today would think of publishing a paper on "new evidence for evolution," any more than a chemist would try to publish a demonstration that water is composed of hydrogen and oxygen. It simply hasn't been an issue in scientific circles for more than a century.

Darwin hypothesized that the cause of evolution is natural selection acting on hereditary variation. His argument was based on logic and on interpretation of many kinds of circumstantial evidence, but he had no direct evidence. More than 70 years would pass before an understanding of heredity and the evidence for natural selection would fully vindicate his hypothesis. Moreover, we now know that there are more causes of evolution than Darwin realized, and that natural selection and hereditary variation themselves are more complex than he imagined. Much of this book will be concerned with the complex body of ideas—about mutation, recombination, gene flow, isolation, random genetic drift, the many forms of natural selection, and other factors—that together constitute our current understanding of the causes of evolution.

This complex of interrelated ideas about the causes of evolution is the theory of evolution, or "evolutionary theory." It is not a "mere speculation," for all the ideas are supported by evidence. It is not a hypothesis, but a body of hypotheses, most of which are well supported. It is a theory in the sense defined in the preceding section. Like all theories in science, it is incomplete, for we do not yet know the causes of all of evolution, and some details may turn out to be wrong. But the main tenets of evolutionary theory are so well supported that most biologists accept them with confidence.


[ Douglas Futuyma, Evolutionary Biology, 3rd ed., Sinauer Associates, 1998, pp. 9-12. ]

