Viewing all comparative data from a three-item perspective, rather than a component perspective, may allow
greater precision in extracting the information that relates organisms. In any event, the three-item approach
has already exposed some vagaries concerning optimization (Nelson, 1996; Platnick et al., 1996; Williams, 2002).
The two options are open to empirical investigation.
NELSON, G. J., AND P. Y. LADIGES. 1996. Paralogy in cladistic biogeography and analysis of paralogy-free subtrees. Am. Mus. Novit. 3167:1–
NELSON, G. J., AND N. I. PLATNICK. 1980. Multiple branching in cladograms: Two interpretations. Syst. Zool. 29:86–91.
NELSON, G. J., AND N. I. PLATNICK. 1981. Systematics and biogeography: Cladistics and vicariance. Columbia Univ. Press, New York.
NELSON, G. J., AND N. I. PLATNICK. 1991. Three-taxon statements: A more precise use of parsimony? Cladistics 7:351–366.
NELSON, G. J., D. M. WILLIAMS, AND M. C. BEACH. 2003. A question of conflict: Three item and standard parsimony compared. Syst. Biodiv.
PAGE, R. D. M. 1990. Component analysis: A valiant failure? Cladistics
PAGE, R. D. M. 1993. Component for Windows. Software and manual.
Natural History Museum, London.
PISANI, D., AND M. WILKINSON. 2002. Matrix representation with parsimony, taxonomic congruence, and total evidence. Syst. Biol. 51:151–
PLATNICK, N. I., C. J. HUMPHRIES, G. J. NELSON, AND D. M. WILLIAMS. 1996. Is Farris optimization perfect? Cladistics 12:243–252.
PURVIS, A. 1995. A modification to Baum and Ragan’s method for combining phylogenetic trees. Syst. Biol. 44:251–255.
RAGAN, M. A. 1992a. Matrix representation in reconstructing phylogenetic relationships among the eukaryotes. BioSystems 28:47–55.
RAGAN, M. A. 1992b. Phylogenetic inference based on matrix representation of trees. Mol. Phylogenet. Evol. 1:53–58.
RONQUIST, F. 1996. Matrix representation of trees, redundancy, and weighting. Syst. Biol. 45:247–253.
SANDERSON, M. J., A. PURVIS, AND C. HENZE. 1998. Phylogenetic supertrees: Assembling the trees of life. Trends Ecol. Evol. 13:105–109.
SLOWINSKI, J. B., AND R. D. M. PAGE. 1999. How should species phylogenies be inferred from sequence data? Syst. Biol. 48:814–825.
SNEATH, P. H. A., AND R. R. SOKAL. 1973. Numerical taxonomy: The
principles and practice of numerical classification. W. H. Freeman,
San Francisco.
SOKAL, R. R., AND P. A. S. SNEATH. 1963. Principles of numerical taxonomy. W. H. Freeman, San Francisco.
WILEY, E. O. 1988. Vicariance biogeography. Annu. Rev. Ecol. Syst.
WILKINSON, M. 1994. Common cladistic information and its consensus representation: Reduced Adams and reduced cladistic consensus
trees and profiles. Syst. Biol. 43:343–368.
WILLIAMS, D. M. 1996. Fossil species of the diatom genus Tetracyclus (Bacillariophyta, ‘ellipticus’ species group): Morphology, interrelationships and the relevance of ontogeny. Philos. Trans. R. Soc. Lond. Ser. B 351:1759–1782.
WILLIAMS, D. M. 2002. Parsimony and precision. Taxon 51:143–149.
WILLIAMS, D. M., AND D. J. SIEBERT. 2000. Characters, homology and three-item analysis. Pages 183–208 in Homology and systematics (R. W. Scotland and T. Pennington, eds.). Systematics Association Special Volume 58. Taylor and Francis, Philadelphia.
First submitted 11 June 2002; reviews returned 22 September 2002;
final acceptance 3 December 2002
Associate Editor: Peter Linder
Syst. Biol. 52(2):259–271, 2003
DOI: 10.1080/10635150390192762
Popper and Systematics
Department of Geology, Field Museum, 1400 S. Lake Shore Drive, Chicago, Illinois 60605-2496, USA; E-mail: [email protected]
The philosophy of Karl Popper is frequently appealed to either in support or in defense of theories
and methods of systematic biology. Perhaps the best illustration of Popper’s prominence in this field is the
collection of papers published in Systematic Biology in June 2001 (de Queiroz and Poe, 2001;
Faith and Trueman, 2001; Farris et al., 2001; Kluge,
Downloaded from by Unal user on 07 March 2020
BAUM, B. R. 1992. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon 41:3–10.
BAUM, B. R., AND M. A. RAGAN. 1993. Reply to A. G. Rodrigo’s ‘A comment on Baum’s method for combining trees.’ Taxon 42:637–640.
BININDA-EMONDS, O. R. P., AND H. N. BRYANT. 1998. Properties of matrix representation with parsimony analysis. Syst. Biol. 47:497–
BROOKS, D. R. 1990. Parsimony analysis in historical biogeography and coevolution: Methodological and theoretical update. Syst. Zool.
BROOKS, D. R. 1996. Explanations of homoplasy at different levels of
biological organization. Pages 3–36 in Homoplasy. The recurrence
of similarity in evolution (M. J. Sanderson and L. Hufford, eds.).
Academic Press, San Diego.
DOYLE, J. J. 1992. Gene trees and species trees: Molecular systematics
as one-character taxonomy. Syst. Bot. 17:144–163.
EBACH, M. C., AND K. J. McNAMARA. 2002. A systematic revision of the
family Harpetidae (Trilobita). Rec. West. Aust. Mus. 21:235–267.
FARRIS, J. S. 1973. On comparing the shapes of taxonomic trees. Syst. Zool. 22:50–54.
FARRIS, J. S., A. G. KLUGE, AND M. J. ECKHARDT. 1970. A numerical approach to phylogenetic systematics. Syst. Zool. 19:172–191.
GUIGÓ, R., I. MUCHNIK, AND T. F. SMITH. 1996. Reconstruction of ancient molecular phylogeny. Mol. Phylogenet. Evol. 6:189–213.
HUGOT, J.-P., AND J.-F. COSSON. 2000. Constructing general area cladograms by matrix representation with parsimony: A western Palearctic example. Belg. J. Entomol. 2:77–86.
KLUGE, A. G. 1993. Three-taxon transformation in phylogenetic inference: Ambiguity and distortion as regards explanatory power.
Cladistics 9:246–259.
NELSON, G. J. 1979. Cladistic analysis and synthesis: Principles and definitions with a historical note on Adanson’s “Familles des Plantes”
(1763–1764). Syst. Zool. 28:1–21.
NELSON, G. J. 1993. Reply. Cladistics 9:261–265.
NELSON, G. J. 1996. Nullius in Verba. J. Comp. Biol. 1:141–152.
NELSON, G. J., AND P. Y. LADIGES. 1991a. Standard assumptions for biogeographic analysis. Austr. Syst. Bot. 4:41–58.
NELSON, G. J., AND P. Y. LADIGES. 1991b. Three-area statements: Standard assumptions for biogeographic analysis. Syst. Zool. 40:470–485.
NELSON, G. J., AND P. Y. LADIGES. 1992a. Information content and fractional weight of three-taxon statements. Syst. Biol. 41:490–494.
NELSON, G. J., AND P. Y. LADIGES. 1992b. TAX: MSDOS program for cladistics. American Museum of Natural History, New York.
NELSON, G. J., AND P. Y. LADIGES. 1994. Three-item consensus: Empirical test of fractional weighting. Pages 193–209 in Models in phylogeny reconstruction (R. W. Scotland, D. J. Siebert, and D. M. Williams, eds.). Systematics Association Special Volume 52. Clarendon Press, Oxford, U.K.
ity, and severity of test relate to degree of corroboration?
What is the meaning of Popper’s propensity interpretation of probability? What, if anything, has all this to do
with phylogenetic analysis?
To better understand how Popper’s philosophy can or
cannot be brought into a discussion of systematics, it is
helpful to review Popper’s place in the history of philosophy of science. Popper was a philosopher, not a scientist. He wrote on the logic of scientific discovery, not
on applied physics or on phylogenetic systematics. He
wrote about problem solving in science, but he did not
himself solve scientific problems (e.g., Popper [1983:230]
wrote of degree of corroboration that it “has little bearing
on research practice”). A philosopher, whether approvingly appealed to or criticized, “must be understood immanently, that is, from his own point of view” (Carnap,
1997a:41). Any characterization of Popper’s philosophy
must be a coherent one, or the departure from his original ideas must be made explicit. A good way to achieve
a better understanding of Popper’s work is to contrast it
with that of his peers, understanding that Popper formulated his ideas in contradistinction to those propagated
by the Vienna Circle and in contradiction to his shadow,
Ludwig Wittgenstein. It also helps to understand that The
Logic of Scientific Discovery (Popper, 1992) was abstracted
from a larger work entitled The Two Basic Problems of the
Theory of Knowledge (Popper, 1979). These two (interconnected) problems were the problem of induction and the
demarcation of science from metaphysics.
Logical positivism is a label that is commonly attached
to members of the Vienna Circle and to those of the Unity
of Science movement that grew out of it, among which
were Carnap and Hempel (Hung, 1992; Carnap, 1997a).
Another prominent logical empiricist, A. J. Ayer, counted
Karl Popper in that group as well (Ayer, 1959). The young
Popper indeed courted the Vienna Circle but was never
admitted (Edmonds and Eidinow, 2001). Although there
are profound differences between his philosophy and
that of, say, Rudolph Carnap, there also exist profound
similarities: “His basic philosophical attitude was quite
similar to that of the Circle” (Carnap, 1997a:31). All the
authors involved in logical empiricism had one goal in
common: the demarcation of science from metaphysics.
To achieve this demarcation both camps used the same
tactics: they focused on Wittgensteinian truth conditions
(truth anchored in logic) more than on the evidence condition (knowledge anchored in observation); science was
demarcated from metaphysics on the basis of the syntactic form of statements (the analysis of the grammatical
[and logical] structure of sentences) and semantic analysis (the analysis of the meaning of terms) rather than on
epistemological grounds (observation and justification;
Laudan and Leplin, 2002). Beyond this common ground,
Popper’s thinking diverged from the doctrines of the
Vienna Circle. Carnap declared theories scientific when
they (logically) implied observation statements; for Popper, theories became scientific when they (logically)
In a very general sense, Popper’s philosophy can be
seen as a world view, attractive to fellow philosophers,
sociologists, historians, politicians, and scientists. Consequently, Popper can be read at different levels. In Objective Knowledge, Popper (1974b:80) appealed to “the critical discussion of competing theories,” which portrays
the criterion of testability in its weakest sense (“the criterion of demarcation cannot be an absolutely sharp one”
[Popper, 1997:187]); not to assert positive evidence in favor of a cherished hypothesis but to be critical and to
seek flaws or mistakes rather than confirmation. Things
get more serious when we “look upon the critical attitude as characteristic of the rational attitude” (Popper,
1974b:31), where rationality is the capacity of logical
thinking and where Popper recognized “deductive logic
as the organon of criticism” (Popper, 1974b:31). For here
we cross the bridge from common sense philosophy to a
full-blown theory of knowledge. In current discussions
of implications of Popper’s philosophy of science for
systematics, terms such as explanatory power, degree
of corroboration, severity of test, and logical (or absolute) probability have been used in connection with formalisms, which means that today systematics is linked
to the full-fledged apparatus of Popper’s hypotheticodeductivism (where testable observation statements are
deduced from a hypothesis).
De Queiroz and Poe (2001:306) stated that “Popper’s
corroboration is based on the general principle of
likelihood,” or that likelihood is the foundation of
Popper’s concept of corroboration (de Queiroz and
Poe, 2001:308), which seems counterintuitive given that
likelihood methods have been identified as inductive
(Edwards, 1992) or abductive (Sober, 1988). De Queiroz
and Poe (2001) were countered by Kluge (2001b:323),
who found likelihood analysis to be verificationist/
inductive, as opposed to parsimony analysis, which is
“practiced in terms of Popperian testability” and hence
is falsificationist/deductive (deductive logic does not
necessarily exclude a verificationist attitude, as seen
in the discussion of the Hempel–Oppenheim model
of scientific explanation). Recognizing that there is
normally no “demonstrable” Popperian falsification
in cladistics, Faith and Trueman (2001:331) decoupled
their interpretation of Popperian corroboration from
the “popular falsificationist interpretation of Popperian
philosophy.” They moved on to recognize supposed
corroboration cladistics as merely goodness-of-fit of
evidence. Farris et al. (2001) in turn took issue with Faith
and Trueman’s “Popper*,” debating goodness-of-fit
versus data as Popperian evidence.
Here, I do not address issues of methodology, i.e.,
whether parsimony or likelihood analysis should be preferred, which algorithms should or should not be used in
cladistic inference, or whether permutation tail probabilities are meaningful in phylogeny reconstruction. What
I am concerned with is whether Popper’s philosophy of
science is at all relevant to these methodological discussions. In that context, the following questions must be
asked and answered. What is deduction? What is falsification? How do explanatory power, degree of falsifiabil-
The opening phrase in Popper’s (1974b:1) book Objective Knowledge reads “I think I have solved a major
philosophical problem, the problem of induction.” The
problem of induction was formulated by David Hume
and can be presented as follows. As far as we know,
the sun has risen every morning for as long as there
have been human beings on planet Earth. Are we justified to conclude from these innumerable observations
that the sun will rise again tomorrow? (Whether or not
the sun will rise again tomorrow is a contingent fact of
nature, not a logical necessity. This can be shown by
the fact that the statement “it is not the case that the
sun will rise again tomorrow” is not self-contradictory,
which it would be if tomorrow’s sunrise were a logical necessity.) Popper’s perception of this problem was
that our knowledge of past events, based merely on inductive generalization and on nothing else, constitutes a
historical account of those events, and such a historical
account has no predictive power that is based in logic
(i.e., that is deductive; compare “degree of corroboration
as a historical account” below). In other words, the inductive generalization that the sun will rise every day in
the future just as it has in the past is, without additional
knowledge, a merely historically contingent generalization that expresses no necessity in the natural course of events.
This interpretation changes if Newton’s laws of interplanetary motions are taken into account, for it is the appeal to universal lawfulness that imparts necessity on the
natural course of events. The appeal to Newton’s laws establishes what Sober (1988) called a three-place relation:
The past experience of the sun rising every morning is
explained by Newton’s laws, which in turn allow the
prediction that the sun will rise again tomorrow. This
prediction is justified to the degree that Newton’s laws
are “true,” i.e., they will hold tomorrow and ever after.
Induction has been characterized as a process of reasoning that moves from the particular to the general,
whereas deduction is a process of reasoning that moves
from the general to the particular. Ever since Peirce (de
Waal, 2001), it has been said that abduction is about what
could be the case (given the evidence at hand, the butler
could be the murderer), induction is about what is the
case (emeralds are green, not grue or blue; Goodman,
2001), and deduction is about what must be the case. In
other words, induction is (more or less) supported by
evidence, but deduction is conclusive because it is based
in logic.
Both Carnap and Popper agreed that no finite number of singular observations would suffice to fully verify a generalization. Carnap therefore switched from
the concept of verification to that of degree of confirmation (and its complement, degree of disconfirmation;
Mayhall, 2002), whereas Popper turned his back on induction and used deduction as the hallmark of scientific reasoning. There is a catch involved with this move,
however, because deductive reasoning happens in logical space. Thus, deductive (i.e., logical) relations obtain
only between sentences or complexes thereof, not between words and things (nor between observation and
justified belief).
A simple form of deductive reasoning is the Aristotelian categorical syllogism: (1) all mammals are vertebrates, (2) all nudibranchs are mammals, therefore (3) all
nudibranchs are vertebrates (Weiner, 1999). As long as
the premises (1 and 2) are agreed upon, the conclusion
(3) must follow necessarily. The deduction is logically
valid but factually wrong (unsound). Thus, deduction
can deliver truth independent of any epistemic status
(independent of observation or justification).
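The distinction between validity and soundness can be made concrete in code. The sketch below (sets and taxa invented for illustration) reads "All A are B" extensionally as set inclusion: the syllogism's form is just the transitivity of inclusion, which can be checked exhaustively, while the false minor premise makes this particular argument unsound.

```python
from itertools import combinations

def all_are(a, b):
    """'All A are B' read extensionally as the set inclusion A <= B."""
    return a <= b

mammals     = {"whale", "bat"}
vertebrates = {"whale", "bat", "trout"}
nudibranchs = {"sea slug"}

# Premise 2 ("all nudibranchs are mammals") is factually false, so the
# argument is unsound: the premises do not jointly hold.
premises_hold = all_are(mammals, vertebrates) and all_are(nudibranchs, mammals)
print(premises_hold)   # False

# Validity, by contrast, is a matter of form: whenever both inclusions hold,
# the third must hold, because <= is transitive. Check every triple of subsets.
universe = {"whale", "bat", "trout", "sea slug"}
subsets = [set(c) for r in range(len(universe) + 1)
           for c in combinations(sorted(universe), r)]
valid = all(all_are(a, c)
            for a in subsets for b in subsets for c in subsets
            if all_are(a, b) and all_are(b, c))
print(valid)   # True
```

The exhaustive check finds no counterexample to the form, which is the sense in which the conclusion "must follow necessarily" from agreed premises.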
Propositional logic (Weiner, 1999) is based on universal conditional statements such as “(for every A) if A is
a dog, then it barks.” If the conditional and the premise
that “A is a dog” are true, then it must also be true that
“A barks.” This valid deductive argument says nothing
about a particular dog, only something about the class
of all dogs (which means that barking is a universal, essential, or simply necessary property of the class of all
dogs). It is logically not impossible to imagine duckogs
that quack (Kripke, 2002). I can say, “(for every A) if A
is a duckog, then it quacks,” “A is a duckog,” therefore
“A quacks.” Everything concluded so far is based on
deduction; no observation is yet involved. But now we
can go beyond deduction and appeal to our senses to
see whether the conditionals and the premises are true
or false (whether there is reason to believe that they are
true or false): I have never seen a duckog, but I have seen
many dogs, and all those that I have seen barked, never
quacked. This A here (to which I point) definitively resembles a dog, and it does not bark right now because it is
fast asleep. Although it is highly unlikely that we should
encounter a dog in Central Park tomorrow that quacks,
it is logically not impossible. However, if you and I can
reach an agreement and accept that to the best of our observational powers we have indeed sighted what looked
like a dog that quacked, we are logically compelled to
accept one of two possible conclusions: either the conditional “(for every A) if A is a dog, then it barks” is not
universal, or if “dog” is a name that rigidly refers to As
that never quack but only bark, which is what names are
supposed to do, then we should not call the animal we
sighted a dog; we might call it a duckog.
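The two ways out of the accepted anomalous sighting can be miniaturized as follows (the animals, names, and the "duckog" resolution are the invented example from the text): either the universal conditional fails against the record, or the name "dog" is rigidly reserved for barkers and the anomaly is renamed.

```python
# Toy record of accepted observations: (name, kind, sound).
observations = [
    ("Fido", "dog", "bark"),
    ("Rex",  "dog", "bark"),
    ("A",    "dog", "quack"),   # the sighting we have agreed to accept
]

def universal_holds(obs):
    """Check '(for every A) if A is a dog, then it barks' against the record."""
    return all(sound == "bark" for _, kind, sound in obs if kind == "dog")

def rename_anomalies(obs):
    """The second way out: keep 'dog' rigid for barkers; rename quackers."""
    return [(name, "duckog" if kind == "dog" and sound == "quack" else kind, sound)
            for name, kind, sound in obs]

print(universal_holds(observations))                    # False: universal fails
print(universal_holds(rename_anomalies(observations)))  # True: name use revised
```

Either move restores consistency; which one a linguistic or scientific community takes is a matter of convention, not of logic.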
Cladistic analysis has been labeled deductive (Crother,
2002), but valid deductive inference, as much as a
Popperian test, is a complex problem that requires consideration of both the structure of the language used and
entailed the negation of observation statements. Carnap
appealed to the principle of verifiability as the criterion
by which to establish the meaning of a sentence while
declaring metaphysical sentences as meaningless. Popper objected and used his falsifiability criterion to distinguish science from metaphysics. The members of the
Vienna Circle dismissed the problem of induction as a
pseudoproblem on the grounds that there is no solution
for it in the way it is ordinarily conceived (Ayer, 1952).
Popper sought a solution to the logical problem of induction instead.
about what must be the case. The symmetry that characterizes inductive systems (where a theory is more or less
confirmed or disconfirmed) has given way to an asymmetry of falsification based on deductive reasoning (a
theory can never be shown to be true but can be shown
to be false). Practically, falsification requires that everybody involved accepts that there indeed is a white raven
here now (and of course this could be wrong). It must
be agreed that this bird is a raven and not some other
bird precisely because it is white, and it must also be
agreed that this bird is white and not a shade of gray
that some may still consider as black. The meanings of
“black,” “white,” and “raven” must be fixed, at least to
such a degree that a contemporary linguistic and/or scientific community can engage in a meaningful discourse.
This, of course, is falsificationism in its strong sense, something cladists have agreed cannot apply to systematics
(Farris, 1983), hence the recourse to parsimony analysis.
The question to be discussed below is whether parsimony analysis satisfies Popperian falsificationism even
in a weak sense.
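That the asymmetry of falsification is a matter of logical form can be verified by brute force. In the sketch below (a toy encoding, invented here, in which each individual is a pair of truth values for "is a raven" and "is black"), "all ravens are black" is checked to be equivalent to "there is nothing that is a raven and not black", and a single accepted white raven refutes both, while black ravens never verify the universal for further cases.

```python
from itertools import product

# A domain is a tuple of individuals; each individual is (is_raven, is_black).

def all_ravens_black(domain):
    return all(black for raven, black in domain if raven)

def no_nonblack_raven(domain):
    return not any(raven and not black for raven, black in domain)

# The two formulations agree on every possible 3-individual domain:
equivalent = all(all_ravens_black(d) == no_nonblack_raven(d)
                 for d in product(product([True, False], repeat=2), repeat=3))
print(equivalent)   # True: the transformation holds as a matter of form

# Asymmetry: any number of black ravens leaves the universal unproven for
# unobserved cases, but one accepted white raven suffices to refute it.
print(all_ravens_black([(True, True), (True, True), (True, False)]))  # False
```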
According to Popper, theories are scientific if they are
falsifiable, but falsifiable on the basis of deduction. For
this to be possible, scientific theories must have the syntactical form of universal propositions (or must be linked
to statements of such a syntactical form). To prevent the
immunization of theories, the potential and acceptable
falsifying instance must be specified in advance (Popper,
1989:38), such that if it occurs, there is no backdoor open
that would allow us to save the theory from falsification
by a shift of meaning of the words that report on the
observable state of affairs.
Inductive support works symmetrically, confirming
or disconfirming theories or hypotheses to a greater or
lesser degree. An empirically confirmed hypothesis A
disconfirms a rival hypothesis B to the degree to which
B is inconsistent with A. So if x confirms hypothesis A, y
confirms hypothesis B, and if x carries a greater evidentiary weight than y, then A is confirmed and B is symmetrically disconfirmed. In contrast, Popperian falsification works asymmetrically: If it occurs (if it is accepted
that it has occurred), it is conclusive. The reason for this
asymmetry is that this falsification is based on a contradiction that occurs in logical space. In parsimony analysis, we have alternative hypotheses of relationships, the
number of which is determined by the number of terminal taxa. Among those, we choose the one that is supported or confirmed by the largest number of congruent
characters. This most-parsimonious hypothesis symmetrically disconfirms alternative hypotheses to the degree
that these are inconsistent with the most-parsimonious
hypothesis. “Parsimony operates by finding a pattern of
relationships that is most consistent with the data” (Kluge
and Farris, 1969:7, emphasis added). The criterion of
consistency was later replaced by explanatory power,
meaning that parsimony maximizes the number of character descriptions that are explained as putative homologies, thus minimizing inconsistency, i.e., minimizing the
number of character descriptions that are explained ad
hoc as homoplasies (Farris, 1983). This point is perfectly
the meanings of the terms that are used in that language.
Propositions used as a basis for deduction must conform
to a certain syntactical structure. The categorical syllogism presented above requires that the subject of the first
(major) premise (all mammals [subject] are vertebrates
[predicate]) appears as predicate in the second (minor)
premise (all nudibranchs [subject] are mammals [predicate]). Similarly, to qualify as a conditional, the statement
must be an “if . . . , then . . . ” statement (or a syntactically
equivalent statement such as “ . . . not . . . , unless . . . ”). Beyond that, semantic analysis of meaning is also required.
The meaning of the name “dog” obtains from the attachment of that name to a physical object (the meaning of
the proper name Fido obtains from the attachment of that
name to my dog to which I point). But how names are attached to things, or more generally how names are used,
is governed by rules and conventions. I live in a linguistic
community in which the name “dog” is attached only to
animals that bark, never to animals that quack. If in that
same linguistic community somebody would attach the
name dog to an animal that quacks, she would violate
the rules and conventions governing the use of the name
dog in that community. However, linguistic as well as scientific communities change over time. The name “atom”
was attached to the physical world in a different way for
Democritus than for Niels Bohr.
Popper believed he had solved the problem of induction by reformulating it. His solution had two parts.
First he subdivided the problem into two components,
the psychological versus the logical problem of induction. Popper considered the fact that we expect future regularity on the basis of experienced past regularity a psychological component, the roots of which
he eventually traced back to evolutionary epistemology
(Popper, 1974b). The other component, i.e., the question
of whether there could be a rational (i.e., logical) reason
to prefer one scientific theory over another, he dubbed
the logical problem of induction, and he claimed to have
solved that problem by reformulating it in terms of universal propositions. He did this because the syntactical
form of universal propositions allows the deduction of
the negation of observation statements, i.e., universal
theories forbid certain things to happen. The statement
“all unicorns have a horn on their forehead” cannot be
used as the basis for the deduction “a unicorn is here
now, or anywhere,” because the existence of a unicorn
in our world of experience can only be established by
observation. By contrast, the statement “all ravens are
black” can be logically transformed to “there is no one
thing that is not black and is a raven,” and that statement in turn can be logically transformed to “no white
raven is here now” (Popper, 1989:385). That there should
not be a white raven here now is a predictive statement
that is obtained on purely logical grounds; no steps have
been taken yet to attach that statement to the physical
world. But if we accept that we do in fact observe a white
raven here and now, then that negation of the negated
observation statement contradicts the universal proposition in logical space, and falsification occurs. Logically,
this falsification is absolute and conclusive; deduction is
Both Hempel and Popper championed a model of scientific explanation based on deduction. If cladistics is deductive, it must satisfy the nomological deductive model
of scientific explanation originally proposed by Hempel
and Oppenheim (as was claimed to be the case by Kluge,
According to Popper, a scientific explanation obtains
if the state of affairs to be explained is logically entailed
by the explanation “the use of a theory for predicting
some specific event is just another aspect of its use for
explaining such an event” (Popper, 1976b:124). In practice, a theory must be complemented by initial and/or
boundary conditions (such as an experimental setup) to
yield testable predictions. This kind of scientific explanation has become known as the Hempel–Oppenheim
nomological deductive model of scientific explanation
(Hempel, 1965; Hempel used this model in a verificationist context, whereas Popper used it in a falsificationist
context). This model also addresses the logical relations
between sentences (or complexes thereof), and it divides
the scientific explanation into two components, the explanans (the sentences that constitute the explanation)
and the explanandum (the sentences that describe the
state of affairs to be explained). The explanans contains
statements of antecedent conditions and general laws.
Hempel’s (1965) use of “antecedent conditions” shows
that his model of scientific explanation can be rendered
in the form of a universal conditional (this is the elliptical
form of the N-D model sensu Hempel, 2001). For example, “if this is a magnetized piece of iron (antecedent),
then it will attract iron filings (consequent).” The antecedent thus contains the initial or boundary condition,
i.e., the magnetized piece of iron, and it makes reference to a universal law, the law of magnetism. Prediction
sensu Popper requires deduction, and deduction is a logical operation; so the prediction that iron filings will be
attracted by a magnetized piece of iron is possible only
because the antecedent logically entails the consequent
(by virtue of referring to a universal law). As stated by
Hempel (1965:247), “the explanandum must be a logical
consequence of the explanans.”
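The deductive backbone of this model, with P as "this is a magnetized piece of iron" and Q as "it attracts iron filings", can be checked mechanically. The truth-table sketch below (encoding invented for illustration) verifies that modus ponens and modus tollens are valid forms, while affirming the consequent (inferring P from Q) is not.

```python
from itertools import product

def implies(p, q):
    """The universal conditional 'if P, then Q' as a truth function."""
    return (not p) or q

def entails(premises, conclusion):
    """Premises entail the conclusion iff no valuation makes all premises
    true and the conclusion false."""
    return all(conclusion(p, q)
               for p, q in product([True, False], repeat=2)
               if all(pr(p, q) for pr in premises))

ponens  = entails([implies, lambda p, q: p],     lambda p, q: q)      # {P->Q, P}  |- Q
tollens = entails([implies, lambda p, q: not q], lambda p, q: not p)  # {P->Q, ~Q} |- ~P
affirm  = entails([implies, lambda p, q: q],     lambda p, q: p)      # {P->Q, Q}  |- P ?

print(ponens, tollens, affirm)   # True True False
```

The failing third form is exactly why confirming instances (attracted filings) never deductively establish the antecedent, whereas an accepted failed prediction deductively rejects it.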
Popper’s falsificationism can now be seen to be linked
to the modus tollens form of argument, which is based
on the logical entailment of the consequent by the antecedent in a universal conditional statement (this is also
the way in which Carnap [1995] rendered universal theoretical laws). If this is a piece of magnetized iron (if
P), then it attracts iron filings (then Q); if iron filings
are not attracted (if not-Q), then this is not a piece of
magnetized iron (then not-P). From this example we obtain “if P, then Q,” “P,” “therefore Q” (modus ponens),
and “if P, then Q,” “not-Q,” “therefore not-P” (modus
tollens). This example also shows that the failure to attract iron filings does not quite falsify the law of magnetism in an empirical sense. Maybe the piece of iron
we thought was magnetized is in fact not made of iron,
maybe the filings that fail to be attracted are not made
of iron, or maybe the process of magnetization itself was
unsuccessful not because the metal used was not iron
but because a faulty procedure was used to magnetize
it. These potential problems all relate to the issue of
how names or statements can successfully be attached
to physical objects or observable states of affairs, and
because the attachment of words to things will never be
absolutely perfect, there will never be an unequivocal falsification of any universal proposition empirically (i.e.,
practically). This is not a problem of logic; it is a problem
of bridging the (logical; Körner, 1970) gap between words
and things. Popper (e.g., 1974c, 1992) always acknowledged that the unequivocal empirical falsification of any
theory is impossible and that we live in a layered world
of competing theories of varying degrees of universality,
valid, but it cannot be linked, nor does it need to be
linked, to Popperian falsificationism. The meaning of
“explanatory power” as used by Farris (1983) is not coextensive with the meaning as Popper used this term.
Support, or lack thereof, is also symmetrical rather than
asymmetrical in likelihood analysis.
More recent debate has turned on the issue of whether
goodness of fit, the nature of the data, or the improbability of evidence is relevant to Popper’s writings (Faith and Trueman, 2001; Farris et al., 2001). Parsimony is defended in terms of Popper’s explanatory
power (minimizing ad hoc auxiliary hypotheses; Farris,
1983), which leads back to Popper’s (1992:145) “principle of parsimony in the use of hypotheses.” Traditionally, parsimony has been defined as “a methodological principle dictating a bias towards simplicity in
theory construction” (The Oxford Companion to Philosophy, 1995), which is not the same as Popper’s principle that requests restraint from indulgence in ad hoc
auxiliary hypotheses. In the German edition of The
Logic of Scientific Discovery, Popper (1976a) used Einfachheit (simplicity) for parsimony in its traditional sense,
but Grundsatz des sparsamsten Hypothesengebrauchs for
“principle of parsimony in the use of hypotheses.”
Popper was dissatisfied with this traditional sense of
parsimony (simplicity) and replaced the conventional/
methodological rule that the simplest (most parsimonious) hypothesis should be selected among competing ones with his normative rule that the most severely
testable hypothesis should be preferred, i.e., the theory
with the highest degree of falsifiability. Auxiliary ad hoc
hypotheses can serve to immunize a theory from falsification, which is why they must be avoided unless
they increase the degree of falsifiability. This argument is
tied to Popper’s notion of falsifiability and the requirement not to immunize a theory from falsification; none of
this argument applies to parsimony analysis. What parsimony analysis delivers is maximal consistency, which
was interpreted as one measure of goodness of fit by
Faith and Trueman (2001). In this context, parsimony
must be viewed as a model, which can only be justified by reference to the background assumptions that
parsimony analysis implies about character evolution
(Sober, 1994).
“homeostatic in that there are mechanisms that cause their
systematic coinstantiation or clustering,” yet “this clustering is the result of incompletely understood mechanisms that govern an embryo’s development” (Wilson,
1999:197f). Wittgensteinian family resemblance (Wilson,
1999:196; see also Glock, 2000:120ff) is just not enough to
warrant a deductive link, let alone the incomplete knowledge of initial and boundary conditions of inheritance
and ontogenetic development.
Even if descent with modification (Farris et al., 2001;
Kluge, 2001a, 2001b) is taken as a universal law, it forbids little more than spontaneous generation. (I believe
that spontaneous generation has taken place at least
once, such that it is logically not impossible that it could
have occurred more than once, hence the talk of life on
Mars. “What we call the evolutionary hypothesis . . . is
not a universal law” (Popper, 1976b:106f).) It could be
argued that in conjunction with the appropriate initial
conditions, it might be possible to explain (by way of prediction or retrodiction) a specific speciation event. However, these initial conditions are so complex that “the
properties of new species which arise in the course of
evolution” remain “unpredictable” (Popper, in Popper
and Eccles, 1977:29).
Popper tied his falsificationism to universal propositions (laws) yet stated that evolutionary theory is no
such law. Systematists who appeal to Popper emphasize
that Popper acknowledged testability and falsifiability
for singular statements as well and therefore claim testability for singular taxonomic statements. Is this claim
Popper frequently referred to evolutionary theory and
in that context considered the applicability of his falsificationism. “All ravens are black” is a universal proposition and hence of a syntactical form that characterizes
universal laws. By contrast, “all vertebrates have one
common pair of ancestors” is not a universal proposition because it does not hold in all possible worlds, even
if these obey the same rules of logic. Pegged to time and
space in its meaning by virtue of reference to common
ancestry, the statement refers “to the vertebrates existing on earth” only, not “to all organisms at any place
and time which have that constitution which we consider characteristic of vertebrates” (Popper, 1976b:107;
in the latter quote “vertebrates” are presented as a class
or a natural kind). Systematists dealing with spatiotemporally restricted entities must be concerned about the
testability of singular statements.
Popper was very consistent in his admission that singular statements are in fact testable, but for this to
be true, they again need to satisfy certain conditions,
i.e., they need to be tokens of the same type. Following C. S. Peirce (de Waal, 2001), we can think of
words as signs, and such a sign can denote a quality
(red), a specific particular (individual) object (“Fido”
is a sign standing for my dog), or a general type (the
class [or natural kind] of all dogs). An office furniture
but he also emphasized that his interest was centered
on the logic of scientific discovery, not on its empirical
fuzziness. However, this is precisely what cladists have
to deal with—the fuzziness of the evidence manifest in
The use by Popper of the modus tollens form of argument in his demarcation of science from metaphysics
has important consequences for systematics. The modus
tollens is a logical argument, and it is decisive because it
states necessary conditions: “if h, then not-e,” “e,” “therefore not-h” (this is the form in which Popper used the
modus tollens, where h = a hypothesis, e = an observation sentence). It holds in all possible worlds that obey
the same rules of logic. Proponents of likelihood methods must consider that there is no probabilification of
the modus tollens (Sober, 1999). Their method is not an
appeal to falsification or deductive inference but rather
a statement of probabilities for the evidence with and
without the hypothesis. Proponents of parsimony analysis must consider that the modus tollens is based on deductive entailment, which is a logical relation, whereas
statements that concern historical states of affairs, such as
those of systematics and phylogeny, are historically contingent. There is therefore no room for the purely logical
entailment of the explanandum by the explanans. There
is no antecedent in systematics or phylogeny reconstruction that can logically entail its consequent, i.e., there
is no deductive link between a hypothesis of relationships and the character distribution across the terminal
taxa it concerns (Sober, 1988). Premises that would allow
the prediction (or retrodiction) of how certain species
or token fossils are interrelated would require a connection to some sort of generality: “In so far as we are concerned with the historical explanation of typical events
they must necessarily be treated as typical, as belonging
to kinds or classes of events. For only then is the deductive method of causal explanation applicable” (Popper,
1976b:146; emphasis added). If species or taxa in general
are historically unique (i.e., are historical particulars),
then deductive reasoning cannot be applied to systematics or phylogeny reconstruction (Sober, 1988). To even
consider the applicability of deductive logic to systematics, species would have to be instantiations of some
kind or elements of some class, which some have argued might be achieved by considering species as natural kinds.
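The modus tollens schema quoted earlier (“if h, then not-e,” “e,” “therefore not-h”) can be checked mechanically against every truth assignment; a minimal sketch, with function names of my own:

```python
from itertools import product

def implies(a, b):
    # Material conditional: "if a, then b" is false only when a holds and b fails.
    return (not a) or b

def modus_tollens_valid():
    # The argument is valid iff no assignment makes both premises true
    # while the conclusion is false.
    for h, e in product([True, False], repeat=2):
        premises = implies(h, not e) and e  # "if h, then not-e" and "e"
        conclusion = not h                  # "therefore not-h"
        if premises and not conclusion:
            return False
    return True

print(modus_tollens_valid())  # True: the schema holds under every assignment
```

This is just the sense in which the argument "holds in all possible worlds that obey the same rules of logic": validity is a matter of the truth table, not of any empirical state of affairs.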
It is one thing to consider species as natural kinds but
another to claim a deductive link between a hypothesis of relationships and character distribution across
the species it entails. Kripke (2002) essentially characterized a natural kind as calibrated on standard members
that share at least some of a cluster of characteristics:
“these properties need not hold a priori of the kind; later
empirical investigation may establish that some of the
properties did not belong to the original sample, or that
they were peculiarities of the original sample, not to be
generalized to the kind as a whole” (Kripke, 2002:137).
In biology, this explanation translates into the homeostatic property cluster kinds that are species (Boyd, 1999).
The properties that characterize these natural kinds are
VOL. 52
relationships do not allow such connections; the known
laws of inheritance or natural selection do not allow us
to predict or retrodict character distribution across taxa
that would falsify hypotheses of relationships. If they
did, there would be no need for parsimony or likelihood analysis.
Given that hard falsificationism (falsification in logical space, an asymmetrical all-or-nothing affair) cannot
obtain in cladistics (Farris, 1983), cladists have turned
to Popper’s notion of degree of corroboration as the
method for choosing between competing hypotheses of
relationships. To understand Popper’s notion of corroboration, it is necessary to introduce his notion of background knowledge (b). Not everything can be investigated or tested at the same time. Science must partition
its problems. Popper referred to the body of scientific
knowledge that was not questioned in any particular
test situation as background knowledge. Some parts of
the background knowledge may be more or less relevant to a specific test situation (b also contains initial
conditions [Popper, 1989:288], and when Popper talked
about varying the background knowledge [e.g., Faith
and Trueman, 2001:430], he meant the latter), but there
is no room in Popper’s philosophy to arbitrarily minimize background knowledge (to maximize the relative
empirical content of a hypothesis) as is sometimes done
in the cladistic literature (e.g., Kluge, 2001b; see comments of Faith and Trueman, 2001). Not everything can
be investigated or tested simultaneously, which is another reason why the ultimate and finite verification or
falsification of a natural law, theory, or hypothesis cannot be accomplished in practice. Any testing procedure
will make use of background knowledge in some way,
and fault might have to be attributed to the background
knowledge rather than to the theory under test. For example, I might use a ruler to measure the expansion of a
piece of metal as a function of increasing heat. In so doing, I rely on the background assumption that the ruler
will be inert under varying thermal conditions, but that
assumption may be wrong. So I might find that a certain
metal does not expand with increasing temperatures, but
that result might reflect the fact that the ruler I used expanded to the same degree as the piece of metal I was measuring.
Carnap (1995) invoked the notion of degree of confirmation to characterize the status of a theory or hypothesis. The degree of confirmation increases with the
number (and kinds; Nagel, 1971) of its successful instantiations. Carnap thought that over time the degree of
confirmation of a theory could increase more and more
(Mayhall, 2002), and as the degree of confirmation increased, the degree of disconfirmation would decrease
symmetrically. Because inductive inference is tied to a
probabilistic approach (Ayer, 1952), an increasing degree
of confirmation translates into an increasing probability
of a theory. Degree of confirmation therefore makes a
retailer may have all kinds of chairs, all of which collectively represent a general type, which is “chair.” All
the different office chairs are tokens of the type “chair,”
which does not stand only for office chairs but also for chairs of any other kind.
Popper allowed for falsifiability of singular statements
when they refer to tokens of a same type. “Here stands a
glass of water” is a singular statement, and it is testable
but only because the glass this particular glass is made of, and the water it contains, here and now behave according
to a certain lawful regularity, i.e., are tokens of the same
type (Popper, 1974c). Singular statements that cannot be
connected to any kind of lawful regularity cannot be
tested in Popper’s sense of testing. For example, “there is
at least one Loch Ness Monster” cannot be tested in Popper’s sense, but not because it is impossible to mount an
expedition to find out whether the monster exists or not.
For Popper, isolated existential statements such as “there
is at least one Loch Ness Monster” are syntactically metaphysical because they cannot serve as a basis for the deduction of anything forbidden. It is, of course, easy to render such a statement testable simply by removing it from
its isolation and rendering it a token of some type: “There
is at least one (specimen of the) Loch Ness Monster in the
Royal Scottish Museum.” That statement is easily tested
and refuted. (Whether we would consider the species
Nessiteras rhombopteryx a natural kind or not is one thing.
It is another thing to assert the possession of remains of a
once-living animal, for that ties this assertion to all kinds
of what Popper would have accepted as universal laws
[of physics, of chemistry, etc.]. Hence the testability of the
statement, because the alleged specimen could also be a
hoax!) The simple point is that without some kind of regularity, without some kind of repeatability, there simply
cannot be any testability and potential falsifiability sensu Popper.
Much is being made in the systematics community
about Popper’s letter published in The New Scientist in
which he stated, “the description of unique events can
very often be tested by deriving [sic., not by deducing!] from them testable predictions and retrodictions”
(Popper, 1980:611; emphasis added). Popper did not say
that the descriptions of unique events can be tested but
instead said that statements derived from them can be
tested. He also did not state that evolutionary theory is
a universal law or that it could be tested, but he rejected
the accusation of having denied scientific character to
“historical sciences such as paleontology, or the history
of the evolution of life on earth” (Popper, 1980:611). The
key to an understanding of Popper’s position on these issues is to be found in The Poverty of Historicism (Popper,
1976b:108). Whereas the evolution of life on earth is a
unique process, and its description therefore a singular
historical statement, this process nevertheless proceeds
“in accordance with all kinds of causal laws, for example, the laws of mechanics, of chemistry, of heredity and
segregation, of natural selection, etc.” Thus, it is possible
to test statements that bear on the process of evolution if
these can be linked to any such lawfulness. However, singular taxonomic statements pronouncing phylogenetic
ered very improbable has become everyday knowledge.
This does not mean, however, that the theory could not
be subjected to further tests in the future.
Degree of corroboration is just a historical account of
the past performance of a theory; it is not a probability, it
says nothing about the future performance of that same
theory (Popper, 1983, 1992), and therefore it does not mirror the increasing degree of confirmation sensu Carnap.
Popper (Popper, 1997:220; see also Popper, 1983:243) held
against Carnap the idea that if maximal probability of its
theories were the goal of science, scientists should restrict themselves to the utterance of tautologies, which
are necessarily true and hence cannot be tested and cannot acquire any degree of corroboration. The degree of
corroboration is also not symmetrical to a degree of falsification, a notion that does not exist in Popper’s vocabulary. Corroboration does not come in degrees because falsification comes in degrees. Instead, the severity
of test (which depends on the degree of falsifiability, or
corroborability, testability, or empirical content, the latter increasing with increasing improbability of evidence)
determines degree of corroboration. For Popper, the falsification of a bold new hypothesis that vastly transcends
current background knowledge is of minor importance;
its corroboration, however, is of major importance for
progress in science. In the same way, the further corroboration of an already highly corroborated theory does not contribute significantly to scientific progress, but its falsification does (Chalmers, 1986).
De Queiroz and Poe (2001:306) claimed that Popper’s
degree of corroboration is based on the general principle
of likelihood. De Queiroz and Poe followed earlier workers (see references cited by Faith and Trueman, 2001) in
correctly identifying likelihood terms in Popper’s formalism for degree of corroboration, but their more basic claim can be investigated in an intuitively appealing
way by the application of Sober’s (2001) surprise principle. If the data match the expectation to a high degree,
there is no surprise, and the likelihood is high. If the data
match the expectation not at all, there is a big surprise,
but the likelihood is small. For Popper’s degree of corroboration, the more severe the test that a theory survives
(the more improbable the evidence is that the theory predicts), the greater the surprise if the theory passes the test
(“the success of the prediction could hardly be due to coincidence or chance,” Popper, 1983:247) and the higher
the degree of corroboration. Conversely, the more probable the evidence is that is predicted by the theory, the
smaller the surprise if it passes the test and the smaller
the degree of corroboration (see also Faith and Trueman,
2001; Kluge, 2001b). For Bayes’s theorem we obtain low
prior probability, high posterior probability, big surprise;
this is similar to Popper’s degree of corroboration, but
note that the “posterior” is always a probability making
promises as to the future performance of a hypothesis,
in contrast to Popper’s degree of corroboration. At the
end of the day, however, Popper (1983:254) concluded,
“To repeat, I do not believe that my definition of degree of corroboration is a contribution to science except,
probabilistic prediction as to the future performance of
a theory.
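The Bayesian comparison sketched earlier (low prior probability, improbable evidence, high posterior probability, big surprise) can be put into numbers. An illustrative calculation, with figures of my own choosing:

```python
# Illustrative numbers (mine, not from the text): a hypothesis with a low
# prior that strongly predicts evidence which is improbable without it.
prior_h = 0.05          # p(h): low prior probability
p_e_given_h = 0.90      # p(e | h): the hypothesis predicts e strongly
p_e_given_not_h = 0.01  # p(e | not-h): e is improbable otherwise

# Total probability of the evidence; the "surprise" is large when this is small.
p_e = prior_h * p_e_given_h + (1 - prior_h) * p_e_given_not_h

# Bayes's theorem: p(h | e) = p(e | h) * p(h) / p(e)
posterior_h = p_e_given_h * prior_h / p_e

print(round(p_e, 4), round(posterior_h, 3))  # improbable evidence, high posterior
```

The improbable evidence (p(e) ≈ 0.05) lifts the hypothesis from a prior of 0.05 to a posterior above 0.8, which parallels the surprise-principle reading of severe tests; the disanalogy noted in the text remains, since the posterior is a probability bearing on future performance, whereas degree of corroboration is not.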
In Popper’s system, a theory is said to increase its degree of corroboration as it survives the severest tests.
The severity of test will depend on the investigator’s intention (hence the impossibility to fully analyze degree
of corroboration in terms of logic; Popper, 1983:236) but
also on the testability of a theory that increases with the
degree of improbability of a particular test statement.
Some theories are more severely testable than others, and
those that are most severely testable are the ones that
make the most improbable predictions. Testability and
with it the corroborability of a theory increase with the
improbability of evidence described by statements that
can be deduced from the theory. However, the improbability of the evidence that would corroborate or falsify
a theory must be established against a standard, which
is the background knowledge. For Popper, the class of
(negated) observation statements a theory entails constitutes its explanatory power (Laudan and Leplin, 2002;
note the difference from Farris’s [1983] use of the term).
The empirical content of a theory increases with the increasing number of (negated) observation statements it
entails. Falsification of a theory can be seen as a search
for counterinstances, i.e., the positive instantiations (i.e.,
negations) of negated observation statements (Popper,
1983:234, 1989:240). Some of these counterinstances will
be embedded in background knowledge, and others will
transcend this knowledge. If the empirical content of a
theory equals the background knowledge, its relative
empirical content is zero. The relative empirical content of a theory increases the more the theory transcends
background knowledge. However, the more the theory
transcends background knowledge, the more its predictions will be improbable. The more improbable the evidence that is appealed to in the test of a theory, the greater
the falsifiability as well as the corroborability of a theory (see Faith and Trueman, 2001) and hence the greater
the degree of corroboration if the theory passes the test.
Yesterday’s test results are part of today’s background
knowledge, so that if we aim for a high degree of falsifiability today, we must test a prediction deduced from
the theory that transcends today’s background knowledge and hence yesterday’s test. Thus, we must test a
new prediction of a theory, for example by forcing it to
make an even more precise prediction than yesterday. In
this way we push the theory toward an experimentum
crucis (Popper, 1992).
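The idea that empirical content is the class of forbidden states of affairs, and that a bolder theory predicts more improbable evidence, can be pictured with a toy model (my construction, not Popper's formalism):

```python
from itertools import product

# Toy model: possible observations are truth assignments over three atomic
# reports; a theory's empirical content is the share of observations it
# forbids. A bolder theory forbids more, so the evidence it predicts is
# more improbable a priori.
observations = list(product([True, False], repeat=3))

def forbidden_share(permits):
    return sum(1 for o in observations if not permits(o)) / len(observations)

cautious = lambda o: True                  # forbids nothing: content zero
bold = lambda o: o[0] and o[1] and o[2]    # permits one observation of eight

print(forbidden_share(cautious), forbidden_share(bold))  # 0.0 0.875
```

On this picture the cautious theory has no empirical content and nothing to corroborate it, while the bold theory stakes itself on a single permitted observation; passing that test is correspondingly improbable, hence corroborating.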
However, a theory that has been tested repeatedly and in various ways and that has achieved a high degree of corroboration may become accepted (perhaps because no new
ways of further testing it are apparent at that time) and
hence relegated to background knowledge. If so, its relative empirical content drops to zero, and its total empirical content is diminished. The degree of falsifiability and
the severity of test are then also diminished (in particular
because there is no intention anymore to subject the theory to further severe tests). The degree of corroboration
drops as a consequence. Hence, what once was consid-
perhaps, that it may be useful as an appraisal of statistical
in the same object language itself. The truth-value of object language statements of science must be measured
from the outside, which requires that the measure will
have to be expressed in a metalanguage of science. Figuratively speaking, the metalanguage of science provides
the ruler that is applied to the object language of science
in the attempt to measure its truth-value. If the scientist pronounces in her object language the statement “if
p and if q, then z follows (with a statistical probability
of n),” the philosopher of science may look at that statement and ask the question, in a metalanguage of science,
“To which degree is it true (are we rationally entitled to believe: Carnap, 1997b:967, emphasis added) that ‘If p and
if q, then z follows (with a statistical probability of n)’?”
This is figurative speech, however. Technically speaking, logical probability is an analytic statement on the
degree to which stated evidence is logically implied by
a stated hypothesis (Carnap, 1995:35, 1997a:73). Carnap
believed that it would eventually be possible to express
the logical probability of a theory in numerical terms,
and it is evident that the logical probability of a theory directly corresponds to its degree of confirmation.
The two, indeed, are synonymous (Carnap, 1997a:72).
(Logical probability was also called inductive probability [Carnap, 1997a, 1997b; Hempel, 2001] because it was
viewed as the backbone of inductive logic. It is interesting
to compare Carnap’s concept of logical probability to the
prior probability in Bayes’s theorem and to the concept
of entrenchment and the corresponding projectability of
predicates as argued by Goodman [2001].)
Popper did not believe that logical probability (which
he also called absolute, initial, prior, or a priori probability; Popper, 1983:284) could be expressed numerically without stepping outside the realm of pure logic
(Popper, 1974b, 1989, 1992; exceptions are the theoretical boundary conditions near complete verification or
falsification [Popper, 1989:397]). For him, logical probability was tied to the explanatory power of a theory in a
deductive framework, hence to its logical range. Popper
used deduction in the establishment of the empirical content class of a theory, which he considered to be the set
of empirical statements whose negations are (logically)
entailed by that theory (Laudan and Leplin, 2002; see
also Popper, 1983:231). This empirical content of a theory can be pictured as the range or the space of all the
forbidden states of affairs that can be deduced from a
theory (Carnap [1997a:71; see also Glock, 2000:117, 221]
traced the notion of the logical range of propositions to
Wittgenstein’s Tractatus [see also Popper, 1992:124]). A
theory that forbids almost nothing (small empirical content) has a large logical range, a small logical improbability, and hence a large logical probability. The more
“facts” a theory forbids (the greater its empirical content
or explanatory power), the greater its risk of being found
false, hence the smaller its logical range, the larger its
logical improbability, and the smaller its logical probability. Corroborability (or falsifiability) is hence the inverse
of logical probability (Popper, 1983:245). Where Carnap
found logical probability to provide support for a degree of credence in the future successful performance
The current debate on Popper and cladistics turns to a
large degree on issues related to the probabilification of
systematics. Whereas systematists mostly think of statistical probability, Popper distinguished this from logical
probability, a concept that is not easy to understand but is
essential for an understanding of Popper’s philosophy of
science and its bearing on systematics. As noted by Faith
and Trueman (2001:333), “p denotes a logical probability”
in Popper’s formalism for degree of corroboration.
Carnap (1995) claimed that the increasing degree of
confirmation of a theory translates into a greater probability for it to be correct, but he invoked two different kinds of probabilities in this context. The first is
statistical and is expressed in the object language of science. An observational statement that reports on a state
of affairs is expressed in the object language of science
and in statistical probability. For example, a physicist
may emerge from her lab with a discovery that she announces in the object language of science: “If p and if
q, then z follows (with a statistical probability of n).” So
p and q may specify certain initial or boundary conditions that, under a covering natural law, would result
in z, but that result may obtain with a certain statistical
probability of n only if the covering law is of a probabilistic nature. In other words, if I perform a series of
experiments, I would predict that z should obtain with a
frequency of n (in n percent of the cases), and this prediction should be more closely approximated the longer the
series of experiments is performed. But whether it obtains in this particular experiment I am conducting here
and now is not determined. Statistical probability cannot
relate to a particular event but only to a series of events
(Popper, 1979), which in turn is the reason why Popper
introduced his propensity interpretation of probability
(Popper, 1974a:122).
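The point that statistical probability attaches to a series of events rather than to any particular trial can be simulated; a small sketch with illustrative parameters of my own:

```python
import random

random.seed(0)  # reproducible run

# A probabilistic covering law predicts outcome z with probability n = 0.3.
# Any single trial remains undetermined; only the relative frequency over
# a long series approximates n.
n = 0.3
trials = 100_000
hits = sum(random.random() < n for _ in range(trials))
frequency = hits / trials
print(frequency)  # close to 0.3, and closer still as the series lengthens
```

Each individual draw is either a hit or not, and nothing in the law settles which; the statistical prediction is vindicated only at the level of the series, which is the gap Popper's propensity interpretation was meant to close.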
If the covering law in question is of a probabilistic
nature, the individual instantiation is not determined.
The law still governs the natural course of events, yet
the question remains, to which degree? Statistical probability is a quantitative relation that governs repeatable
kinds of events (tokens of the same type), which is not
the same as saying that a certain hypothesis (or hypothesized law) is made probable to a certain degree by a
certain kind of evidence (Hempel, 2001). Carnap (1995)
invoked a second kind of probability, the logical probability, to measure this second type of probability or, in
figurative speech, the truth-value (used here colloquially, not technically) of a statement rendered in the object
language of science. Borrowing from Wittgenstein, statements rendered in the object language of science measure the world (Pears, 1988), and for Carnap they do so
in terms of statistical probability. Logical probability is
supposed to measure the truth-value of those object language statements of science, but the truth-value of object
language statements cannot be measured or expressed
As the debate about the probabilification of systematics in the light of Popper’s philosophy of science continues, it is important to keep in mind that statistical and
logical probabilities are only two of several possible interpretations of probability (Nagel, 1971) to which Popper
added his own, the propensity interpretation. This interpretation supersedes logical probability (“[propensities]
are not merely logical possibilities, but physical possibilities”; Popper, 1988:105; see also Popper, 1983:286) in that
it can deal with genuinely indeterminate singular events
that seem to evade lawfulness (Popper, 1983:285), such
as those addressed by quantum mechanics.
The rise of quantum mechanics dramatically changed
the landscape for philosophy of science based on deduction. Deduction is about what must be the case, but
given Heisenberg’s uncertainty principle, what must be
the case remains indeterminate in its singular (particular,
individual) instantiation. Hence, it is Popper’s writing
on probability that has attracted the attention of phylogenetic systematists, especially as presented in appendix
IX of his Logic of Scientific Discovery (Popper, 1992; cf. de
Queiroz and Poe, 2001; Kluge, 2001b).
Given Popper’s adherence to deductive logic, such
as exemplified in the modus tollens form of argument,
he was not at ease with the probabilification of science
(Lakatos, 1974). Some support can always be obtained
for every theory or hypothesis that is of a probabilistic nature. Statistical theories or hypotheses must hence
be made refutable by the conventional determination of
criteria of rejection (Lakatos, 1974). These criteria were
formulated by Popper (1992) in appendix IX of Logic of
Scientific Discovery, and they state that the probability of
the evidence sought to support a hypothesis must be very
small (the evidence must be very improbable; see Faith
and Trueman, 2001), but that same evidence must confer
a very high probability on the hypothesis under test, and
the margin of error (δ) must be very small, which means
that the hypothesis must make very precise predictions
(see Nagel, 1971, for further comments on δ). In sum, the
evidence must consist of a “statistical report” that asserts
“a good fit in a large sample” (Popper, 1992:411).
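These rejection criteria can be sketched as a conventional decision rule; the numbers and the margin δ below are illustrative choices of mine, not Popper's:

```python
# A statistical hypothesis predicts an outcome frequency; by convention we
# count the evidence as supporting it only if the observed frequency in a
# large sample falls within a small margin of error (delta) of the
# prediction -- "a good fit in a large sample."
predicted = 0.30   # frequency the statistical hypothesis predicts
delta = 0.01       # margin of error: the prediction must be precise

def good_fit(successes, sample_size):
    return abs(successes / sample_size - predicted) < delta

print(good_fit(3010, 10_000), good_fit(3500, 10_000))  # True False
```

The smaller δ is made, the more precise the prediction and the easier the hypothesis is to refute by convention; a generous δ would immunize it against almost any sample.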
However, that criterion alone was not enough for
Popper. He introduced the propensity interpretation of
probability (Popper, 1983, 1989) to be able to deal with
singular events (statistical probability can only deal with
a series of events) and to remove “certain disturbing elements of an irrational and subjective character” in quantum mechanics (Popper, 1983:351), viz. “the strange role
played by ‘the observer’ in some interpretations of quantum mechanics” (Popper, in Popper and Eccles, 1977:26).
Propensity means that an object has a disposition to behave in a certain way. Dogs, for example, have a disposition to bark, and radium atoms have a disposition to
radioactively decay with an associated half-life of 1,620
years. The example of radioactive decay “is perhaps the
strongest argument in favor of what I have called ‘the
propensity interpretation of probability in physics’” (Popper,
in Popper and Eccles, 1977:26, n. 4). In the example of
propositional logic, its conditional can now be interpreted as a disposition statement (Körner, 1959): “(For
every A) if A is a dog, then it barks” (then it has a disposition to bark), “A is a dog,” “therefore A barks” (has a
disposition to bark). This argument is deductive, specifying a necessary or universal property of dogs, and this
property is the disposition to bark. But the deduction
does not specify that this particular dog I am pointing
at must bark right now. It may not bark because it is
fast asleep. The same holds for the radium atom: “ (For
every A) if A is a radium atom, then it will decay” (it
has a disposition to decay with an associated half-life
of 1,620 years), “A is a radium atom,” “therefore A will
decay,” but maybe not right now. Again, the statement
identifies the disposition to radioactively decay as a universal (i.e., necessary) property of the radium atom, but
it does not specify which particular radium atom will
decay at which particular point in space and at which
particular time. These events can only be a matter of (objective) probability. But although the theory of radioactive decay is by its very nature a probabilistic theory, its
meaning is not under the propensity interpretation of
probability because the disposition to behave in a certain way is a universal property of the atoms that fall
of a hypothesis (Carnap, 1997b), Popper found logical
probability to provide a measure for the corroborability (or falsifiability) of a hypothesis. Carnap sought high
logical probability for hypotheses, and Popper sought
small logical probabilities for hypotheses. Using Sober’s
(2001) surprise principle again, we can state that the surprise factor will be greater and hence the degree of corroboration will be higher if a theory with a very small
logical (prior; Popper, 1993:244) probability passes a severe test. However, logical probability remains a metalinguistic expression relative to the object language of science.
Logical probability has long since vanished from the
philosophy of science (e.g., Balashov and Rosenberg,
2002) and for good reasons (e.g., Putnam, 1997; Quine,
2000). Logical probability is built upon the entailment of
stated evidence by a stated hypothesis, and if such entailment is not obtained for phylogenetic hypotheses, it
certainly does not apply to phylogenetic analysis. The
recent systematics literature (e.g., de Queiroz and Poe,
2001; Faith and Trueman, 2001; Farris et al., 2001; Kluge,
2001b) documents extensive use of Popper’s formalisms
for degree of corroboration, severity of test, etc. The concept of probability used in these formalisms is that of
logical probability, the formalisms therefore are metalinguistic in nature: h is a stated hypothesis, e is stated evidence, b is the corpus of stated current scientific knowledge that is not questioned when h is being put to test,
and p(e, h) is the degree by which e is (logically) entailed
by h. In sum, the formalisms of Popper are statements
rendered in the language of philosophy, not in the language of science; e stands to h not in the relation of an
observation to justified belief (hypothesis) but in the relation of logical entailment.
evolution, which Cuvier in a calculated way misrepresented as désires (desires, wishes) (Appel, 1987:169).
Faith and Trueman (2001:331) escaped from large parts
of what I have explained because they explicitly decoupled corroboration from “the popular falsificationist interpretation of Popperian philosophy” (rendering it Popper*: Faith, 1999). Thus, the meaning with which Popper
used the concept of degree of corroboration cannot be
coextensive with the meaning bestowed on that concept
by Faith and Trueman (2001; see remarks above on semantic analysis). Recognizing that Popper’s asymmetry
of falsification is difficult to realize in empirical space,
Faith and Trueman (2001) put emphasis on degree of corroboration instead. To link their argument to Popper’s
hypotheticodeductivism, however, they need to show
how their notion of corroboration differs from the
inductivist notion of confirmation. Popper linked corroborability to testability (Popper, 1983:245) and stated that
“scientific tests are always attempted refutations” (Popper, 1983:243). For him, corroboration was thus embedded in a falsificationist context. In a context decoupled
from falsificationism, Faith and Trueman (2001) emphasized the “genuineness” of a test that is measured by the
improbability of the evidence (i.e., the low p(e, b) value;
D. Faith, pers. comm.). Faith and Trueman (2001) capitalized on Popper’s notion that severity of test increases
to the degree that evidence e is improbable given background knowledge b in the absence of the hypothesis h
(i.e., p[e, b]).
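The dependence of severity on p(e, b) can be shown numerically. One normalized rendering discussed by Popper (1983) is S(e, h, b) = [p(e, hb) − p(e, b)]/[p(e, hb) + p(e, b)]; the sketch below (with illustrative probabilities of my own choosing, not a calculation from either paper) shows S rising as the evidence becomes more improbable on background knowledge alone:

```python
def severity(p_e_hb, p_e_b):
    """One normalized measure of severity of test:
    S = (p(e,hb) - p(e,b)) / (p(e,hb) + p(e,b)),
    where p_e_hb is the probability of evidence e given the hypothesis
    and background together, and p_e_b that given background alone."""
    return (p_e_hb - p_e_b) / (p_e_hb + p_e_b)

# Holding p(e,hb) fixed, lowering p(e,b) makes the test more severe:
print(severity(0.9, 0.5))   # modest: evidence fairly probable anyway
print(severity(0.9, 0.01))  # near-maximal: evidence surprising without h
```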
With their argument, Faith and Trueman (2001) recognized the fact that Popperian falsification based on hypotheticodeductivism cannot exist in systematics. Turning
the tables, they therefore emphasized the synonymy that
Popper allowed for degree of corroboration qua support
(Popper, 1983:238ff), and from there follows their emphasis on the necessity for the evidence e to be improbable
in the light of background knowledge (b). There is more
reason to consider improbable evidence (in light of b) as
possibly false than to consider highly probable evidence
false, such that the hypothesis h, which possibly accommodates the improbable e, is itself more improbable or
has a higher content (as required by Popper, 1983). But
once evidence is accepted in support of a given h, they
argued, its improbability decreases and with it decreases
the improbability and therefore the content of h in support of which e is accepted. This argument is analogous to
that of degree of corroboration decreasing once a theory
is accepted into background knowledge. (The analogy
concerns different levels of argumentation. On the one
hand there is the relationship of evidence to a hypothesis, and on the other hand there is the relationship of a
hypothesis to background knowledge.)
Given that Faith and Trueman (2001) explicitly decoupled their interpretation of Popper’s philosophy from
falsificationism, it is interesting to compare their notion
of degree of corroboration to Popper’s formalization of
degree of confirmation. Popper (1997:222) considered it
under that theory. So with his propensity interpretation
of probability, Popper was able to maintain universality
even for laws that are probabilistic in nature. Popper’s
propensity interpretation of probability is thus revealed
to ultimately lead back to propositional logic (Popper,
1974c:1132), such that deductive explanations (causal hypotheses) that must be the case assert a propensity of
1 (Popper, 1983:288). As Hempel (2001:212) put it, it is
the universal character of the disposition of objects to
behave in a certain way “that gives probabilistic laws
their predictive and their explanatory force” for objects
that fall under these laws. The empirical content of a scientific theory was considered by Popper to correspond
to the class of negated observation statements it entails.
But this statement asserts nothing else than that “natural
laws . . . can never do more than exclude certain possibilities” (Popper, 1976b:139). By the same token, “Propensities may rule out certain possibilities: in this consists
their lawlike character” (Popper, in Popper and Eccles,
1977:31, n. 13).
The appeal by systematists to Popper’s writings on
probability must take into account the propensity interpretation of probability. Popper (1983:284f) went to considerable length to analyze the meaning of logical probability in a frequentist context, eventually abandoning
his early preference for absolute (i.e., logical) probability as he “realized that absolute probability is a special
case of relative probability, but not the other way around”
(Popper, 1974c:1119). He found his axiomatization of the
theory of relative probability stronger and more interesting than the theory of absolute probability, a point that
“was decisive for my choice” (Popper, 1974c: 1119). From
this choice then followed the propensity interpretation
as a “measure of possibilities” (Popper, 1983:286). However, that interpretation requires repeatable phenomena,
which again are tokens of the same type.
This interpretation of propensity might seem at odds
with Popper’s (1974b:276–284) propensity interpretation
of evolution, because there would seem to be no room for
universality in evolutionary theory. Popper (1976b:108)
considered evolution a unique process, but nevertheless one that occurs in accordance with “all kinds of
causal laws.” To invoke propensities for the evolutionary process comes at a cost, however, that “the evolution
of the executive organs will become . . . ‘goal-directed’”
(Popper, 1974b:278). What Popper was aiming at is a
Darwinian process of variation and natural selection,
where behavioral changes that are subject to organismic propensities precede somatic changes and determine
the direction of somatic change in a process of directional selection that “‘simulates’ not only Lamarckism,
but Bergsonian vitalism also” (Popper, 1974b:284). So
the giraffe stretched its neck after all, for without that
propensity, “a longer neck would not have been of any
survival value” (Popper, 1974b:279). Propensities or dispositions will inevitably be realized given certain circumstances (Glock, 2000:183; e.g., if the burglar startles the
sleeping dog, then it will bark), and this inevitability reflects Lamarck’s use of the term besoins (necessities, requirements, needs) in his explanation of adaptive
Many of the issues in systematics that are currently
debated with an appeal to Popper are just good common sense. There can hardly be any disagreement with a
methodological principle that seeks to minimize ad hoc
auxiliary hypotheses to save the phenomena, a principle that is endorsed by every scientist and by Popper
(and other philosophers of science). Why disagree with
a methodological principle that requires going beyond
the sort of supporting evidence that is cheapest
to be had? However, there is the semantic ambiguity of
the notion of a test. Some will consider the testing of
a null hypothesis as opposed to alternative hypotheses
a legitimate test, others will call the disconfirmation of
a theory the result of a test, and all of it is testing, just not
in Popper’s sense. Carnap (1997a:59), for example, used
testability as an “empiricist criterion of significance” in
an explicitly verificationist context: a sentence is confirmable by possible observable events and testable if
a method can be specified for producing such events at
will. More generally, Popper’s appeal for a critical attitude is commonsensical and endorsed by most if not
all scientists. This appeal was Popper’s way of carefully
steering his theory of knowledge past the Scylla of dogmatism and the Charybdis of skepticism. But Popper’s
theory of knowledge was more. Hume’s predicament
was insurmountable: there simply is no way to deduce
theory from observation (Quine, 2000). So Popper turned
the tables: Start with a theory, and if it is to be scientific it
has to allow the deduction of negated observation statements. This is the step in which systematics cannot follow
Popper (Sober, 1988). One can use the word “test” in parsimony as well as in likelihood analysis, but it will not
correspond to a Popperian test. For Popper, a test is tied
to the asymmetry of falsification; by contrast, symmetry has long been noted in whether congruent characters
support a certain hypothesis of relationships or falsify
alternative potential hypotheses for the same terminal
taxa (Bonde, 1974; Gaffney, 1979). There is one easy way
to elucidate how far the theory of phylogenetic inference
can follow Popper. Phylogenetics must logically demarcate its concepts from those of inductive or abductive inference; it must logically demarcate corroboration from
confirmation. However, times move on and meanings
change, and it may be absolutely permissible that corroboration in modern systematics no longer needs to be
used in the same way Popper used it. But at that point,
systematics can no longer be claimed to be Popperian,
but rather it is under the influence of the philosophy of
its own exponents, as it should be.
ACKNOWLEDGMENTS
The June 2001 issue of Systematic Biology prompted me to read
Popper again, an adventure that led to the insights presented in this
paper. David Hull unsuccessfully tried to explain some of these issues
to me 10 or so years ago. I have also greatly profited from discussions
with Dan Faith, Arnold Kluge, Lukas Rieppel, Michael Rieppel, Maureen Kearney, and Peter Wagner. I do not suggest that they agree with
my analysis, but often it is through discussions with opponents that
one learns most.
REFERENCES
APPEL, T. B. 1987. The Cuvier–Geoffroy debate. French biology in the
decades before Darwin. Oxford Univ. Press, Oxford, U.K.
AYER, A. J. 1952. Language, truth, and logic. Dover, New York.
AYER, A. J. 1959. Bibliography of logical positivism. Pages 381–446 in Logical positivism (A. J. Ayer, ed.). Free Press, New York.
BALASHOV, Y., AND A. ROSENBERG (eds.). 2002. Philosophy of science. Contemporary readings. Routledge, London.
BONDE, N. 1974. Review of Interrelationships of Fishes. Syst. Zool. 23:562–
BOYD, R. 1999. Homeostasis, species, and higher taxa. Pages 141–185 in Species; new interdisciplinary essays (R. A. Wilson, ed.). MIT Press, Cambridge, Massachusetts.
CARNAP, R. 1995. An introduction to the philosophy of science. Dover,
New York.
CARNAP, R. 1997a. Intellectual autobiography. Pages 1–84 in The philosophy of Rudolf Carnap (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
CARNAP, R. 1997b. Replies and systematic expositions. Pages 859–1013 in The philosophy of Rudolf Carnap (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
CHALMERS, A. F. 1986. Wege der Wissenschaft. Springer, Berlin.
CROTHER, B. I. 2002. Is Karl Popper’s philosophy of science all things to all people? Cladistics 18:445.
DE QUEIROZ, K., AND S. POE. 2001. Philosophy and phylogenetic inference: A comparison of likelihood and parsimony methods in the
context of Karl Popper’s writings on corroboration. Syst. Biol. 50:305–
DE WAAL, C. 2001. On Peirce. Wadsworth, Belmont, California.
his “critical task” to show by way of his falsificationism that “inductive logic [sic!] is impossible.” But Popper
commented on his own “positive theory” and provided
a definition of degree of confirmation, where C(x, y) is
the degree by which y (observation statements) confirms
x (theory, hypothesis) (Popper, 1997:221f). Popper’s degree of confirmation is, again, defined in a way that is
relativized to the current background knowledge z (as
he did for degree of corroboration): C(x, y, z; degree of confirmation) = [p(y, x · z) − p(y, z)]/[p(y, x · z) − p(x · y, z) + p(y, z)]. For Popper, z comprises the old evidence, the old
and new initial conditions, and accepted theories. Confirmation is bestowed upon a new explanatory hypothesis
x by the new observational results y, which are not part of
the background knowledge z (as is also not the improbable evidence requested by Faith and Trueman, 2001).
“That is to say, the total evidence e is to be partitioned into
y and z; and y and z should be so chosen as to give C(x,
y, z) the highest value possible for x, on the available total evidence” (Popper, 1997:222, n.82; emphasis added).
The total evidence is what Carnap (1997b) used to gauge
the logical probability of a hypothesis. Popper (1997:222)
was satisfied that his definition would meet the “condition that the confirmability of a statement—its highest
degree of confirmation—equals its content (i.e., the degree of falsifiability).” Because the degree of falsifiability
increases with the degree by which a theory transcends
background knowledge, i.e., with the most improbable
predictions the theory generates, the degree of confirmation likewise increases with the most improbable of
evidence, as was also postulated by Faith and Trueman (2001).
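Popper’s definition can be checked numerically. Taking C(x, y, z) = [p(y, x·z) − p(y, z)]/[p(y, x·z) − p(x·y, z) + p(y, z)], the sketch below (with illustrative probabilities of my own choosing) shows confirmation rising as the new evidence y becomes less probable on the background knowledge z alone:

```python
def degree_of_confirmation(p_y_xz, p_y_z, p_xy_z):
    """Popper's relativized degree of confirmation:
    C(x,y,z) = (p(y,xz) - p(y,z)) / (p(y,xz) - p(xy,z) + p(y,z)).
    p_y_xz: probability of evidence y given hypothesis x and background z;
    p_y_z:  probability of y given z alone;
    p_xy_z: joint probability of x and y given z (at most p_y_z)."""
    return (p_y_xz - p_y_z) / (p_y_xz - p_xy_z + p_y_z)

# With p(y, x·z) held fixed, C grows as y becomes more improbable on z:
print(degree_of_confirmation(0.9, 0.5, 0.3))   # evidence expected anyway
print(degree_of_confirmation(0.9, 0.1, 0.06))  # evidence surprising on z
```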
PEARS, D. 1988. The false prison, a study of the development of Wittgenstein’s philosophy. Clarendon Press, Oxford, U.K.
POPPER, K. R. 1974a. Autobiography. Pages 1–181 in The philosophy of Karl Popper, Volume 2 (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
POPPER, K. R. 1974b. Objective knowledge, an evolutionary approach. Clarendon Press, Oxford, U.K.
POPPER, K. R. 1974c. Replies to my critics. Pages 961–1197 in The philosophy of Karl Popper, Volume 2 (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
POPPER, K. R. 1976a. Logik der Forschung. J. C. B. Mohr (Paul Siebeck), Tübingen, Germany.
POPPER, K. R. 1976b. The poverty of historicism. Routledge & Kegan Paul, London.
POPPER, K. R. 1979. Die beiden Grundprobleme der Erkenntnistheorie. J. C. B. Mohr (Paul Siebeck), Tübingen, Germany.
POPPER, K. R. 1980. Evolution. New Sci. 190:611.
POPPER, K. R. 1983. Realism and the aim of science. Routledge, London.
POPPER, K. R. 1988. The open universe. An argument for indeterminism. Routledge, London.
POPPER, K. R. 1989. Conjectures and refutations. Routledge & Kegan Paul, London.
POPPER, K. R. 1992. The logic of scientific discovery. Routledge & Kegan Paul, London.
POPPER, K. R. 1997. The demarcation between science and metaphysics. Pages 183–226 in The philosophy of Rudolf Carnap (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
POPPER, K. R., AND J. C. ECCLES. 1977. The self and its brain. An argument for interactionism. Springer International, Berlin.
PUTNAM, H. 1997. ‘Degree of confirmation’ and inductive logic. Pages 761–783 in The philosophy of Rudolf Carnap (P. A. Schilpp, ed.). Open Court, La Salle, Illinois.
QUINE, W. V. 2000. Epistemology naturalized. Pages 292–300 in Epistemology, an anthology (E. Sosa and J. Kim, eds.). Blackwell, Oxford, U.K.
SOBER, E. 1988. Reconstructing the past. Parsimony, evolution, and inference. MIT Press, Cambridge, Massachusetts.
SOBER, E. 1994. Let’s razor Ockham’s razor. Pages 136–157 in From a biological point of view (E. Sober, ed.). Cambridge Univ. Press, Cambridge, U.K.
SOBER, E. 1999. Testability. Proc. Addr. Am. Philos. Assoc. 73:47–76.
SOBER, E. 2001. Core questions in philosophy. A text with readings. Prentice-Hall, Upper Saddle River, New Jersey.
WEINER , J. 1999. Frege. Oxford Univ. Press, Oxford, U.K.
WILSON, R. A. 1999. Realism, essence and kind: Resuscitating species
essentialism? Pages 187–207 in Species; new interdisciplinary essays
(R. A. Wilson, ed.). MIT Press, Cambridge, Massachusetts.
First submitted 2 July 2002; reviews returned 30 September 2002;
final acceptance 14 December 2002
Associate Editor: Dan Faith
Syst. Biol. 52(2):271–280, 2003
DOI: 10.1080/10635150390192799
Using a Null Model to Recognize Significant Co-Occurrence Prior to Identifying
Candidate Areas of Endemism
Institute of Systematic Botany, University of Zürich, Zollikerstrasse 107, 8008 Zürich, Switzerland;
E-mail: [email protected] (A.R.M.)
Recent commentaries on areas of endemism have suggested that these be recognized by the “nonrandom”
(e.g., Nelson and Platnick, 1981:56; Morrone, 1994:438)
distributional congruence of two or more taxa at some
scale of mapping. However, no one has addressed how
one might objectively determine the threshold between
random and nonrandom co-occurrence for this purpose
(Hausdorf, 2002).
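One objective way to draw such a threshold is a randomization (null-model) test: hold each taxon’s range size fixed, shuffle its occurrences among localities, and ask how often the randomized overlap of two taxa matches or exceeds the observed overlap. The sketch below is a minimal illustration of that general idea only, not the method developed in this paper; the data structure and function name are my own:

```python
import random

def cooccurrence_p_value(matrix, i, j, n_iter=2000, seed=1):
    """Monte Carlo null model for the co-occurrence of two taxa.

    matrix: list of presence/absence (1/0) rows, one row per taxon,
    one column per locality. Each taxon's occurrences are shuffled
    independently across localities (preserving its range size), and
    the returned p-value is the fraction of randomizations whose
    overlap between taxa i and j matches or exceeds the observed one."""
    rng = random.Random(seed)
    observed = sum(a & b for a, b in zip(matrix[i], matrix[j]))
    hits = 0
    for _ in range(n_iter):
        ri, rj = matrix[i][:], matrix[j][:]
        rng.shuffle(ri)
        rng.shuffle(rj)
        if sum(a & b for a, b in zip(ri, rj)) >= observed:
            hits += 1
    return hits / n_iter

# Two taxa found together at all 4 of their 10 localities:
taxa = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
]
print(cooccurrence_p_value(taxa, 0, 1))  # small p: unlikely to be random
```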
EDMONDS, D., AND J. EIDINOW. 2001. Wittgenstein’s poker. HarperCollins, New York.
EDWARDS, A. W. F. 1992. Likelihood. Expanded edition. Johns Hopkins Univ. Press, Baltimore, Maryland.
FAITH, D. P. 1999. Review of Error and the growth of experimental knowledge. Syst. Biol. 48:675–679.
FAITH, D. P., AND W. H. TRUEMAN. 2001. Towards an inclusive philosophy for phylogenetic inference. Syst. Biol. 50:331–350.
FARRIS, J. S. 1983. The logical basis of phylogenetic analysis. Pages 7–36 in Advances in cladistics, Volume 2 (N. I. Platnick and V. A. Funk, eds.). Columbia Univ. Press, New York.
FARRIS, J. S., A. G. KLUGE, AND J. M. CARPENTER. 2001. Popper and likelihood versus “Popper*.” Syst. Biol. 50:438–444.
GAFFNEY, E. S. 1979. An introduction to the logic of phylogeny reconstruction. Pages 79–111 in Phylogenetic analysis and paleontology (J. Cracraft and N. Eldredge, eds.). Columbia Univ. Press, New York.
GLOCK, H.-J. 2000. A Wittgenstein dictionary. Blackwell, Oxford, U.K.
GOODMAN, N. 2001. The new riddle of induction. Pages 215–224 in
Analytic philosophy, an anthology (A. P. Martinich and D. Sosa, eds.).
Blackwell, Oxford, U.K.
HEMPEL, C. G. 1965. Studies in the logic of explanation. Pages 245–295 in Aspects of scientific explanation and other essays in the philosophy of science (C. G. Hempel, ed.). Free Press, New York.
HEMPEL, C. G. 2001. Laws and their role in scientific explanation. Pages
201–214 in Analytic philosophy, an anthology (A. P. Martinich and
D. Sosa, eds.). Blackwell, Oxford, U.K.
HUNG, T. 1992. Ayer and the Vienna Circle. Pages 279–300 in The philosophy of A. J. Ayer (L. E. Hahn, ed.). Open Court, La Salle, Illinois.
KLUGE, A. 2001a. Parsimony with and without scientific justification.
Cladistics 17:199–210.
KLUGE, A. G. 2001b. Philosophical conjectures and their refutation. Syst.
Biol. 50:322–330.
KLUGE, A. G., AND J. S. FARRIS . 1969. Qualitative phyletics and the
evolution of anurans. Syst. Zool. 18:1–32.
KÖRNER, S. 1959. Conceptual thinking. A logical enquiry. Dover, New York.
KÖRNER, S. 1970. Erfahrung und Theorie. Suhrkamp, Frankfurt am Main, Germany.
KRIPKE, S. 2002. Naming and necessity. Blackwell, Oxford, U.K.
LAKATOS, I. 1974. Falsification and the methodology of scientific research programs. Pages 91–196 in Criticism and the growth of knowledge (I. Lakatos and A. Musgrave, eds.). Cambridge Univ. Press,
Cambridge, U.K.
LAUDAN, L., AND J. LEPLIN. 2002. Empirical equivalence and underdetermination. Pages 362–384 in Philosophy of science. Contemporary readings (Y. Balashov and A. Rosenberg, eds.). Routledge, London.
MAYHALL, C. W. 2002. On Carnap. Wadsworth, Belmont, California.
NAGEL, E. 1971. Principles of the theory of probability. Pages 341–422
in Foundations of the Unity of Science, Volume I (O. Neurath, R.
Carnap, and C. Morris, eds.). Univ. Chicago Press, Chicago.