Categorical logic is the mathematics of combining statements about objects that can belong to one or more classes or categories of things. For instance:
All cars require an energy source. Not all cars use gasoline as an energy source. Therefore, some cars use something other than gasoline as an energy source.
and
All cows eat grass. All grasses are plants. Therefore, all cows eat plants.
Categorical logic is intimately related to set theory, because categories are equivalent to sets. Categorical statements can be expressed using three notions from logic and set theory: the subset relation, the complement of a set, and logical negation.
Categorical syllogisms are special three-line arguments about the relationships among three categories that have been studied since antiquity. The two arguments given above are categorical syllogisms. The chapter examines categorical syllogisms and methods for testing whether they are valid.
We assume there is some domain of discourse, a universal set or universe \( \mathbf{S} \). (This corresponds to outcome space in a random experiment.) The universe contains all the "things" that can belong to categories. A category is any subset of the universe.
Suppose \(A\) and \(B\) are sets. We will be concerned with statements like "every element of \(A\) is an element of \(B\)," which we might write as "every \(A\) is a \(B\)" or "all \(A\) are \(B\)." Of course, that is just what it means for \(A\) to be a subset of \(B\): \(A \subset B\). An example: If \(A\) comprises all ravens and \(B\) comprises all birds, then \(A \subset B\) could be pronounced everything that is a raven is a bird, every raven is a bird, or all ravens are birds.
Another categorical statement is the denial of "all \(A\) are \(B\)," namely, "some \(A\) are not \(B\)." That is just what it means for \(A\) not to be a subset of \(B\): \(A \not\subset B\). An example is some birds are not ravens. Note that some birds are not ravens is not equivalent to some ravens are not birds: \(A \not\subset B\) is not the same as \(B \not\subset A\).
Interestingly, the two statements, \(A \subset B\) and \(A \not\subset B\), have a fundamental difference: it can be true that \(A \subset B\) even if \(A\) has no elements and \(B\) has no elements, because the empty set is a subset of every set. For example, it is true that all immortal ravens are pink birds, because there are no immortal ravens.
In contrast, if \(A \not\subset B\), \(A\) must have elements, at least one of which is in \(B^c\). For instance, it is not true that some immortal ravens are not pink birds, because that would require there to be some immortal ravens. Hence \(A \not\subset B\) lets us deduce that neither \(A\) nor \(B^c\) is empty: the relationship \( \not\subset \) has existential import while the relationship \( \subset \) does not.
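The asymmetry between \( \subset \) and \( \not\subset \) can be checked with small finite sets. The following is a minimal Python sketch rather than anything from the text itself: the universe and the names `universe`, `immortal_ravens`, and `pink_birds` are invented for illustration, and Python's `<=` on sets is used as the subset test.

```python
# Hypothetical toy universe; the element names are illustrative only.
universe = {"raven1", "raven2", "flamingo", "sparrow"}
immortal_ravens = set()          # assumed empty: no immortal ravens exist
pink_birds = {"flamingo"}

# "All immortal ravens are pink birds": A is a subset of B.
all_are = immortal_ravens <= pink_birds

# "Some immortal ravens are not pink birds": A is not a subset of B,
# which needs a witness that is in A and in the complement of B.
complement_of_pink = universe - pink_birds
witnesses = immortal_ravens & complement_of_pink
some_are_not = len(witnesses) > 0

print(all_are)        # True: the empty set is a subset of every set
print(some_are_not)   # False: there is no immortal raven to act as a witness
```

The second statement fails only because no witness exists, which is what existential import means here.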
Be careful not to assume that any set has a member unless the diagram represents "some are" or "some are not." To assume without justification that a set has at least one element (i.e., that an element of the set exists) is called the existential fallacy.
There are two other categorical statements: "some \(A\) are \(B\)" and "no \(A\) are \(B\)." The first can be rephrased as "some \(A\) are not non-\(B\)," that is, \(A \not\subset B^c\). An example would be some birds are ravens; equivalently, not every bird is a non-raven, or (even more geeky) birds are not a subset of non-ravens. The second can be rephrased as "all \(A\) are non-\(B\)," that is, \(A \subset B^c\). An example might be no invisible birds are ravens or every invisible bird is a non-raven.
Later in this chapter we will look at other ways of expressing these four categorical relationships: (i) all A are B, (ii) some A are not B, (iii) some A are B, (iv) no A are B. As just illustrated, the statements with the word some imply that something (a member of \(A\), at least) exists: they have existential import. The statements with all or no do not imply that \(A\) has any elements: they have no existential import.
Recall that one way to specify a set \(A \subset \mathbf{S} \) is to list its members: \(A = \{ a, b, c \}\). Another way is to find something that is true for members of the set \(A\) but false for all other elements of \(\mathbf{S}\). Then we can identify \(A\) as those members of \(\mathbf{S}\) for which that thing, called a predicate function or membership function, is true.
For example, in the definition \(A = \{\text{all animals that are birds}\}\), the thing that is true for the members of the set is that they are birds. That is, the name of the category of things, "birds," is essentially the membership function: the function that is true for birds and false for everything else.
For instance, if \(\mathbf{S}\) is the set of all numbers, we could write the set of even numbers as \( \{x \in \mathbf{S} : \frac{x}{2} \text{ is an integer}\}\). The colon is pronounced "such that." This expression means "the set of all numbers \(x\) such that when you divide \(x\) by \(2\), the result is an integer." The membership function here is \( \frac{x}{2} \text{ is an integer}\), which is true if \(x\) is an even number and false otherwise. In general, to specify a set by its membership function, we use an expression like \(\{x \in \mathbf{S} : \text{(the membership function is true for } x \text{)}\}\).
The membership function of a set \(A\) is like an oracle: you bring it an element \(x\) of \(\mathbf{S}\) and the membership function says, "yes, this is an element of \(A\)," or "no, this is not an element of \(A\)." It assigns the value "true" or the value "false" to elements of \(\mathbf{S}\), according to whether they are elements of \(A\). If \(P()\) is the membership function for the set \(A\), then \(A = \{ x \in \mathbf{S} : P(x)\}\). When \(\mathbf{S}\) is clear from context, we can omit it from the notation and just write \(A = \{x: P(x)\}\).
A set is equivalent to its membership function: if you know one, you know the other. Because of that equivalence, we will often use the same symbol to denote a set and its membership function. For instance, unadorned, \(A\) might denote a set of elements of \(\mathbf{S}\), while with parentheses, \(A()\) might denote a function that assigns the value "true" to elements of \(A\) and the value "false" to all other elements of \(\mathbf{S}\): \(A = \{x: A(x)\}\).
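The equivalence between a set and its membership function can be sketched in Python. This is only an illustration under stated assumptions: the universe is taken to be the integers 0 through 9 (the text speaks of all numbers), and the names `is_even` and `membership_function` are hypothetical.

```python
# Illustrative universe: the integers 0..9 (an assumption made for this sketch).
S = set(range(10))

def is_even(x):
    """Membership function P(x): True exactly for the elements of the set."""
    return x % 2 == 0

# From membership function to set: {x in S : P(x)}.
evens = {x for x in S if is_even(x)}

def membership_function(A):
    """From set back to membership function: A(x) is True iff x is in A."""
    return lambda x: x in A

in_evens = membership_function(evens)

print(sorted(evens))                               # [0, 2, 4, 6, 8]
print(all(in_evens(x) == is_even(x) for x in S))   # True: the two agree on S
```

Going from `is_even` to `evens` and back to `in_evens` recovers a function that agrees with the original on every element of the universe, which is the sense in which knowing one is knowing the other.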
To bring the discussion back to Earth, suppose the domain of discourse \(\mathbf{S}\) is birds. Black birds comprise a subset of \(\mathbf{S}\). Let \(B\) be the class of black birds:
\(B = \{ x \in \mathbf{S} : \text{x is black}\}\).
Let \(R\) denote the set of ravens, a different class of birds.
\(R = \{ x \in \mathbf{S} : \text{x is a raven}\}\).
Then we might ask whether the following argument is valid:
Some ravens are black. All ravens are birds. Therefore, some birds are black.
Before answering the question, we will introduce some standard mathematical notation.
We shall be concerned with statements like "every element of \(A\) is an element of \(B\)" (informally, "all A are B," "every A is a B"), "some element of \(B\) is not an element of \(A\)" (informally, "some B are not A," "not all B are A," "not every B is an A"), and the like. As discussed more below, every and all do not presuppose that there are any. In contrast, some means at least one.
The symbol \(\forall\) means every or all. It is called the universal quantifier. For instance, the statement all ravens are black (birds) could be symbolized as:
\(\forall \; x \in \mathbf{S} \text{, if } x \in R \text{ then } x \in B\).
We could also write simply \(R \subset B\). Here is the corresponding Venn diagram:
Note that all ravens are black birds is quite different from all black birds are ravens: \(R \subset B\) is not the same as \(B \subset R\).
The statement no ravens are black birds could be symbolized as:
\( \forall \; x \in \mathbf{S} \text{, if } x \in R \text{ then } x \notin B.\)
We could also write simply \(R \cap B=\{\}\); that is, \(R\) and \(B\) are disjoint. Here is the corresponding Venn diagram:
Here, in contrast, no ravens are black birds describes the same state of the world as no black birds are ravens: \(R \cap B=\{\}\) if and only if \(B \cap R=\{\}\).
The symbol \( \exists \) means there exists, some or at least one. It is called the existential quantifier. Some means at least one, but not necessarily all. The statement some raven is black could be symbolized as:
\( \exists \; x \in \mathbf{S} : x \in R \text{ and } x \in B.\)
We could write \(R \cap B \neq \{\}\) (something is both in \(R\) and in \(B\): there is a black raven), \(R \not\subset B^c\) (not every element of \(R\) is not in \(B\): not every raven is non-black), or \(B \not\subset R^c\) (not every element of \(B\) is not in \(R\): not every black thing is a non-raven). Here is a Venn diagram illustrating the situation.
\( \bullet x \in R \cap B\)
The statement some ravens are black birds describes the same situation as some black birds are ravens: \(R \not\subset B^c\) if and only if \(B \not\subset R^c\). Both conditions are true exactly when the intersection of \(R\) and \(B\) has at least one element.
Let \(Y = \{ x \in \mathbf{S} : \text{x is yellow}\}\) be the set of yellow birds. The assertion that no ravens are yellow could be written \(R \cap Y = \{\}\) (no bird is both a raven and yellow), \(R \subset Y^c\) (every raven is non-yellow), or \(Y \subset R^c\) (every yellow bird is a non-raven).
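The equivalent formulations of "some ravens are black" and "no ravens are yellow" can be compared side by side on a small example. The sketch below assumes a made-up domain of four birds; the dictionary `birds` and the helper `complement` are illustrative names, not part of the text.

```python
# Hypothetical domain of discourse: four birds, each with a species and a colour.
birds = {
    "b1": ("raven", "black"),
    "b2": ("raven", "black"),
    "b3": ("crow", "black"),
    "b4": ("canary", "yellow"),
}
S = set(birds)
R = {x for x in S if birds[x][0] == "raven"}    # ravens
B = {x for x in S if birds[x][1] == "black"}    # black birds
Y = {x for x in S if birds[x][1] == "yellow"}   # yellow birds

def complement(A):
    """Complement relative to the domain of discourse S."""
    return S - A

# "Some ravens are black": three equivalent formulations.
some_ravens_black = [
    len(R & B) > 0,             # R intersect B is not empty
    not R <= complement(B),     # R is not a subset of B^c
    not B <= complement(R),     # B is not a subset of R^c
]

# "No ravens are yellow": three equivalent formulations.
no_ravens_yellow = [
    len(R & Y) == 0,            # R intersect Y is empty
    R <= complement(Y),         # R is a subset of Y^c
    Y <= complement(R),         # Y is a subset of R^c
]

print(some_ravens_black)   # [True, True, True]
print(no_ravens_yellow)    # [True, True, True]
```

Within each list the three entries always agree, whatever birds are put in the domain; only the shared truth value depends on the data.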
Naively, we might assume that if something is true for every element of some set, it is true for at least one element of the set. But for the fact that the set could be empty, that is true. However, if the set in question is empty, something can be true for all of its elements and yet not true for at least one of its elements. This is an important difference between \(\exists\) and \(\forall\): the former asserts that something exists, while the latter can be satisfied even if nothing exists.
Here is an example. Let the universe be the set \(\mathbf{S}\) of animals. Let us suppose that unicorns do not exist. Let \(U\) be the set of animals that are unicorns. Let \(E\) be the set of animals that lay eggs. Consider the assertion All unicorns lay eggs. In symbols, we could write that as
\( \forall \; x \in \mathbf{S} \text{, if } x \in U \text{ then } x \in E.\)
Equivalently, it is the assertion \(U \subset E\). Since there are no unicorns, \(U\) is the empty set, and the assertion that all unicorns lay eggs is true.
Now consider the assertion Some unicorns lay eggs. In symbols, we could write that as
\( \exists \; x \in \mathbf{S}: x \in U \text{ and } x \in E.\)
Equivalently, it is the assertion \(U \cap E \neq \{\}\) or \(U \not\subset E^c\). Since there are no unicorns, \(U\) is the empty set, which is a subset of every set, including \(E^c\); similarly \(U \cap E\) is also the empty set, and the assertion that some unicorns lay eggs is false.
The assertion no unicorns lay eggs can be written \( \forall \; x \in \mathbf{S} \text{, if } x \in U \text{ then } x \notin E\), or \(U \subset E^c\). Since there are no unicorns, this assertion is true. The assertion some unicorns do not lay eggs can be written \( \exists \; x \in \mathbf{S}: x \in U \text{ and } x \notin E\), \(U \cap E^c \neq \{\}\), or \(U \not\subset E\). This assertion is false, as it must be, because \(U\) is empty.
Here is a more mathematical explanation. Suppose that the set \(A\) is empty. Consider the assertion \( \forall \; x \in \mathbf{S} \text{, if } x \in A \text{ then } x \in B\). That is equivalent to the assertion that \(A \subset B\), which is true because the empty set \(\{\}\) is a subset of every set, including, in particular, \(B\).
Now consider the assertion \( \exists \; x \in \mathbf{S} : x \in A \text{ and } x \in B\). That is equivalent to the assertion that \(A \cap B \neq \{\}\) and to the assertion \(A \not\subset B^c\). The assertion is false if the set \(A\) is empty. It is true that every element of the set \(A\) is an element of the set \(B\), and yet there is nothing that is an element of the set \(A\) and also an element of the set \(B\).
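Python's built-in `all` and `any` behave like \(\forall\) and \(\exists\) restricted to a finite universe, so the unicorn example can be replayed directly. The sketch assumes a made-up universe of three animals, none of which is a unicorn; the dictionary `animals` and the helper names are hypothetical.

```python
# Hypothetical universe of animals; by assumption it contains no unicorns.
animals = {
    "hen": {"unicorn": False, "lays_eggs": True},
    "cow": {"unicorn": False, "lays_eggs": False},
    "platypus": {"unicorn": False, "lays_eggs": True},
}

def is_unicorn(x):
    return animals[x]["unicorn"]

def lays_eggs(x):
    return animals[x]["lays_eggs"]

# For all x in S: if x is in U then x is in E.
all_unicorns_lay_eggs = all(lays_eggs(x) for x in animals if is_unicorn(x))
# There exists x in S with x in U and x in E.
some_unicorns_lay_eggs = any(is_unicorn(x) and lays_eggs(x) for x in animals)
# For all x in S: if x is in U then x is not in E.
no_unicorns_lay_eggs = all(not lays_eggs(x) for x in animals if is_unicorn(x))
# There exists x in S with x in U and x not in E.
some_unicorns_do_not = any(is_unicorn(x) and not lays_eggs(x) for x in animals)

print(all_unicorns_lay_eggs, no_unicorns_lay_eggs)      # True True
print(some_unicorns_lay_eggs, some_unicorns_do_not)     # False False
```

Both universal statements are true and both existential statements are false, because `all` over an empty collection is `True` while `any` over an empty collection is `False`.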
In categorical logic, there are six basic assertions we can make regarding the relation between two classes, listed in the box below.
Relationships between Two Categories
Suppose we have two categories, \(A\) and \(B\), subsets of a domain of discourse \(\mathbf{S}\). We can express six categorical relations between \(A\) and \(B\). The relations can be written using universal and existential quantifiers, or using set notation:
\(\bullet x \in A \cap B\)
\(\bullet x \in A \cap B^c\)
\(\bullet x \in A^c \cap B\)
As you can see, set notation is more compact.
There is more than one way to express any quantified assertion. For example, some \(A\) is not a \(B\) can also be written not all \(A\) are \(B\), \(A \cap B^c \neq \{\}\) and \(A \not\subset B\). As you read this chapter, draw Venn diagrams to represent each assertion. Use the Venn diagram to help you find other, equivalent assertions. If the diagram involves \(A\) and \(B\), think about the relationship between \(A^c\) and \(B\), between \(A\) and \(B^c\), and between \(A^c\) and \(B^c\).
For instance, here is the Venn diagram for every A is a B:
In the diagram, the area that represents \(A\) is entirely contained in the area that represents \(B\). The area that represents the complement of \(B\) is entirely contained in the area that represents the complement of \(A\). No part of \(A\) is in the complement of \(B\). The relationship every A is a B is mathematically equivalent to the relationship every non-B is a non-A and to the relationship no A is a non-B.
Similarly, here is the Venn diagram for some A are B:
You can see from the diagram that the situation could also be described as some B are A, not every A is a non-B, and not every B is a non-A.
There is an art to translating English sentences into mathematical expressions. The next few exercises check your ability to go back and forth.
A categorical syllogism is a three line argument with two premises and a conclusion, of the form:
The quantifiers can be all (every one is), no (none is), not all (some are not), or some (at least one is). In a valid syllogism, each of the premises and the conclusion involves two categories, and there are three categories in all. The category that occurs at the end of the conclusion is called the major term. The category that occurs at the beginning of the conclusion is called the minor term. The third category is called the middle term.
One premise involves the major term and the middle term; the other premise involves the minor term and the middle term. In each premise, the middle term can be first or second.
Here is an example of a valid categorical syllogism:
All ravens are birds. All birds are mortal creatures. Therefore, all ravens are mortal creatures.
In this example, mortal creatures is the major term. The minor term is ravens. The middle term is birds. The major premise is all birds are mortal creatures. The minor premise is all ravens are birds.
The syllogism is valid because it is of the form:
All \(A\) are \(B\). All \(B\) are \(C\). Therefore, all \(A\) are \(C\).
where \(A\) is ravens, \(B\) is birds and \(C\) is things that are mortal. The logic is valid because the subset relationship is transitive: if \(A \subset B\) and \(B \subset C\) it is necessarily true that \(A \subset C\), as illustrated by the following Venn diagram:
\(A\)
To put the previous syllogism in standard form, we write the major premise first, then the minor premise, then the conclusion:
All birds are mortal creatures. All ravens are birds. Therefore, all ravens are mortal creatures.
Not every syllogism is valid. For instance, consider:
Some ravens are black birds. Some swans are black birds. Therefore, some ravens are swans.
In set notation, if we take \(A\) to represent ravens, \(B\) to represent black birds, and \(C\) to represent swans, the argument is:
\(A \not\subset B^c.\; C \not\subset B^c. \text{ Therefore, } A \not\subset C^c.\)
This syllogism is invalid; the following Venn diagram shows a counterexample:
\(\bullet a \in B \cap A\)
\(\bullet c \in B \cap C\)
In this Venn diagram, \(a\) is a black raven and \(c\) is a black swan. The diagram shows that both can exist (that is, \(A \not\subset B^c\) and \(C \not\subset B^c\)) and still \(A \cap C = \{\}\): there might not be any "raven-swans." The truth of the assumptions does not guarantee that the conclusion is true, so the argument is invalid.
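The counterexample just described can be written out concretely. This is a small sketch with invented one-element sets standing in for the regions of the Venn diagram: `"a"` plays the black raven and `"c"` the black swan.

```python
ravens = {"a"}                # A: contains the black raven a
swans = {"c"}                 # C: contains the black swan c
black_birds = {"a", "c"}      # B: both a and c are black

premise_1 = len(ravens & black_birds) > 0    # some ravens are black birds
premise_2 = len(swans & black_birds) > 0     # some swans are black birds
conclusion = len(ravens & swans) > 0         # some ravens are swans

print(premise_1, premise_2, conclusion)      # True True False
```

One assignment of sets that makes both premises true and the conclusion false is enough to show that the argument form is invalid.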
In fact, any syllogism that has the quantifier some in both the major and minor premise is invalid. And any syllogism that has the quantifier some or not all in the conclusion must have either some or not all in one of the premises, or it is invalid.
Let's count the number of possible syllogisms: in the major premise, there are 4 possible quantifiers. Either the major term or the middle term can come first in the major premise, giving 2 orders. In the minor premise, there are 4 possible quantifiers, and either the minor term or the middle term can come first. And in the conclusion, there are 4 possible quantifiers—but the minor term must come first and the major term must come last. By the fundamental rule of counting, there are therefore
\(4 \times 2 \times 4 \times 2 \times 4 = 256\)
possible syllogisms. But only 15 of the 256 are valid. We have seen one valid syllogism already. All 15 are given in the box below.
Valid Categorical Syllogisms
There are 256 possible categorical syllogisms. Only fifteen of them are valid. The following are all written so that the major premise is first, the minor premise is second and the conclusion is third. The major term is \(P\) (predicate of the conclusion), the middle term is \(M\) (middle) and the minor term is \(S\) (subject of the conclusion).
Many of the syllogisms above differ in their language but not in the underlying mathematics. For instance, there is no mathematical difference between \(\text{some S are M}\) and \(\text{some M are S}\), between \(\text{no M are P}\) and \(\text{no P are M}\), or between \(\text{no M are S}\) and \(\text{no S are M}\). So, of the 15 syllogisms above, 2 and 3 are the same; 5 and 6 are the same; 7 and 8 are the same; 9, 10, 11 and 12 are the same; and 13 and 14 are the same.
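Both counts, 256 possible forms and 15 valid ones, can be reproduced by brute force. The sketch below is one possible check, not the method used in this chapter. It relies on the observation that the truth of a categorical statement about \(S\), \(M\) and \(P\) depends only on which of the 8 regions of the three-set Venn diagram are non-empty, so there are \(2^8\) situations to examine. All function and variable names are illustrative.

```python
from itertools import product

QUANTIFIERS = ("all", "no", "some", "not all")
S, M, P = 0, 1, 2   # positions of the minor, middle and major terms

# A Venn region is a triple of booleans: (inside S?, inside M?, inside P?).
REGIONS = list(product((False, True), repeat=3))

# A "model" records which of the 8 regions contain at least one element.
MODELS = [dict(zip(REGIONS, bits))
          for bits in product((False, True), repeat=len(REGIONS))]

def holds(quantifier, x, y, model):
    """Truth value of '[quantifier] x are y' in the given model."""
    x_and_y = any(model[r] for r in REGIONS if r[x] and r[y])
    x_not_y = any(model[r] for r in REGIONS if r[x] and not r[y])
    if quantifier == "all":
        return not x_not_y
    if quantifier == "no":
        return not x_and_y
    if quantifier == "some":
        return x_and_y
    return x_not_y          # "not all"

def is_valid(q_major, major_order, q_minor, minor_order, q_conclusion):
    """Valid: no model makes both premises true and the conclusion false."""
    return not any(
        holds(q_major, *major_order, model)
        and holds(q_minor, *minor_order, model)
        and not holds(q_conclusion, S, P, model)
        for model in MODELS
    )

forms = list(product(
    QUANTIFIERS, [(M, P), (P, M)],    # major premise: quantifier, term order
    QUANTIFIERS, [(S, M), (M, S)],    # minor premise: quantifier, term order
    QUANTIFIERS,                      # conclusion: "[quantifier] S are P"
))
valid_forms = [form for form in forms if is_valid(*form)]

print(len(forms))         # 256
print(len(valid_forms))   # 15
```

The enumeration treats forms that differ only in wording, such as "no M are P" and "no P are M", as distinct, which is why it finds 15 rather than a smaller number of mathematically different arguments.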
The following exercises check your ability to recognize categorical syllogisms, their premises and terms; to determine whether a syllogism is in standard form; and to translate syllogisms from words into set notation.
Many—but by no means all—valid three-line arguments about categories are categorical syllogisms. For instance, the following three arguments are valid but are not categorical syllogisms:
All \(S\) are \(P\). Some \(S\) are \(M\). Therefore, some \(S\) are \(P\).
Some \(A\) are \(B\). All \(B\) are \(C\). Therefore, there are \(A\)s, \(B\)s and \(C\)s.
All \(A\) are \(B\). \(x\) is not \(B\). Therefore, \(x\) is not \(A\).
Nonetheless, because categorical syllogisms are so common, for thousands of years philosophers have worked on rules to test whether a quantified argument is in fact one of the 15 valid syllogisms. We will look at three methods for doing this.
The first method is the most mechanical: try to put the argument in standard form. If that is impossible, the argument is not a syllogism. Game over. If the argument can be put in standard form, check whether it is one of the 15 valid syllogisms. If it is not, the syllogism is not valid. We will also cover some rules of thumb that help us spot arguments that cannot be valid syllogisms.
The second method requires a bit more finesse but is much more rewarding. We will see that there are really only two mathematically different syllogisms, one with existential import, and one without. If the argument we are testing can be transformed into one of the two canonical valid syllogisms by following the rules of set theory, the argument is a valid syllogism. Equivalently, if the Venn diagram for one of the two canonical valid syllogisms can be re-labelled so that it is the Venn diagram for the argument we are testing, the argument is a valid syllogism.
The third method is to translate the syllogism into set notation and to prove directly that the argument is valid, or to construct a counterexample (Venn diagrams help very much). This approach is the most general. It is not limited to syllogisms—it can be used to test any argument about sets.
A syllogism is in standard form if it has the major premise first, the minor premise second, and then the conclusion. The major premise either looks like
[quantifier] [major term] are [middle term]
or
[quantifier] [middle term] are [major term].
The minor premise either looks like
[quantifier] [minor term] are [middle term]
or
[quantifier] [middle term] are [minor term].
The conclusion looks like
[quantifier] [minor term] are [major term].
If a three-line argument cannot be put in this form, it is not a categorical syllogism. The tricky part is to translate plain language arguments into standard form. We will work some examples presently. First we will look at some rules of thumb.
The Conservation of Rabbits Principle says that to pull a rabbit out of a hat, a rabbit must first be put into the hat. The application of the conservation of rabbits principle to categorical syllogisms says that if the conclusion is that something exists (some \(S\) are \(P\) or some \(S\) are not \(P\)), one of the premises must guarantee that something exists (some \(P\) are \(M\), some \(M\) are \(P\), some \(S\) are \(M\), or some \(M\) are \(S\)). Moreover, in a valid syllogism at most one premise says that something exists.
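The principle can be turned into a quick screening test on the three quantifiers alone. The sketch below is an illustration under an assumption: it treats both some and not all as quantifiers that assert existence. The function name `passes_rabbit_check` is hypothetical. Passing the check is necessary for validity but not sufficient.

```python
EXISTENTIAL = {"some", "not all"}   # quantifiers assumed to assert existence

def passes_rabbit_check(q_major, q_minor, q_conclusion):
    """Necessary (not sufficient) condition for a syllogism to be valid."""
    existential_premises = sum(q in EXISTENTIAL for q in (q_major, q_minor))
    # At most one premise may say that something exists.
    if existential_premises > 1:
        return False
    # An existential conclusion needs an existential premise.
    if q_conclusion in EXISTENTIAL and existential_premises == 0:
        return False
    return True

print(passes_rabbit_check("all", "all", "some"))     # False: rabbit from an empty hat
print(passes_rabbit_check("some", "some", "some"))   # False: two existential premises
print(passes_rabbit_check("all", "some", "some"))    # True: may or may not be valid
```

A result of `True` only means the argument survives this filter; the arrangement of the terms still has to be checked.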
Another rule of thumb is that in a valid syllogism, one premise contains the middle term and the major term, the other premise contains the middle term and the minor term, and the conclusion contains only the minor term and the major term. So an argument that looks like:
[Quantifier] [minor term] are [major term]. [Quantifier] [minor term] are [middle term]. Therefore, [quantifier] [minor term] are [major term]
is not a valid syllogism—although it could be a logically valid argument. Some systems of testing the validity of syllogisms are based on such rules of thumb. Here are examples of testing syllogisms by translating them into standard form.
Consider the syllogism:
\(\text{All A are B. No B are C. Therefore, no A are C.}\)
Since the conclusion is \(\text{no A are C}\), the major term is \(C\) and the minor term is \(A\). To put this in standard form we would put the major premise (the one involving the major term, \(C\)) first. Thus, in standard form, this is the eighth valid syllogism above:
\(\text{No M are P. All S are M. Therefore, no S are P.}\)
Now consider the categorical syllogism:
\(\text{All A are B. Some A are C. Therefore, some B are C.}\)
This syllogism asserts that the set \(B \cap C\) has at least one element, so the syllogism has existential import. Since the conclusion is \(\text{some B are C}\), the major term is \(C\) and the minor term is \(B\). To put this in standard form we would put the major premise (the one involving the major term, \(C\)) first. In standard form, this is thus the thirteenth valid syllogism above:
\(\text{Some M are P. All M are S. Therefore, some S are P.}\)
Finally, consider the argument:
\(\text{All A are B. All A are C. Therefore, all B are C.}\)
Since the conclusion is \(\text{all B are C}\), the major term is \(C\) and the minor term is \(B\). To put this in standard form we would put the major premise (the one involving the major term, \(C\)) first. In standard form, this is thus:
\(\text{All M are P. All M are S. Therefore, all S are P.}\)
This is not one of the 15 valid syllogisms; it is not valid. It is easy to construct a counterexample. For instance, let \(A\) stand for human beings, \(B\) for animals and \(C\) for vertebrates. It is true that all humans are animals and that all humans are vertebrates. It does not follow that all animals are vertebrates—and they are not.
Although the 15 valid categorical syllogisms look different when written in words, in fact, there are only two mathematically distinguishable syllogisms if no special significance is given to which term is the major term and which is the minor term. One of those two syllogisms has existential import, the other does not. That is, all valid categorical syllogisms with existential import (that have some or not all as the quantifier in the conclusion) are really the same, and all valid categorical syllogisms without existential import (that have all or no in the conclusion) are really the same.
The Two Canonical Valid Categorical Syllogisms
There are only two mathematically distinct valid categorical syllogisms. The other valid categorical syllogisms can be converted to these—or derived from these—by applying identities from set theory or changing the names of variables. The first canonical syllogism does not have existential import; the second does.
\(S\)
·\(x\)
Any valid syllogism can be transformed into one of these two by renaming the categories and using two equivalences:
\(A \subset B\) if and only if \(B^c \subset A^c\) (every \(A\) is a \(B\) if and only if every non-\(B\) is a non-\(A\))
\(A \not\subset B\) if and only if \(B^c \not\subset A^c\) (some \(A\) are non-\(B\) if and only if some non-\(B\) are \(A\)).
The first of these is the only option for transforming premises that involve the quantifiers all or no. The second is the only option for transforming premises that involve the quantifiers some or not all. Thus, there are a limited number of possibilities to consider.
If the syllogism is valid and has no existential import, it can be transformed into the first canonical syllogism. If it is valid and has existential import, it can be transformed into the second canonical syllogism.
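The two equivalences used for these transformations hold for all sets, and they can be spot-checked exhaustively on a small universe. The sketch below assumes a four-element universe chosen only to keep the enumeration short; `subsets` and `complement` are illustrative helper names. A finite check of this kind is a sanity test, not a proof.

```python
from itertools import combinations

UNIVERSE = frozenset(range(4))   # assumed small universe for an exhaustive check

def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(A):
    return UNIVERSE - A

ALL_SUBSETS = subsets(UNIVERSE)

# A is a subset of B  iff  B^c is a subset of A^c.
first_identity_holds = all(
    (A <= B) == (complement(B) <= complement(A))
    for A in ALL_SUBSETS for B in ALL_SUBSETS
)

# A is not a subset of B  iff  B^c is not a subset of A^c.
second_identity_holds = all(
    (not A <= B) == (not complement(B) <= complement(A))
    for A in ALL_SUBSETS for B in ALL_SUBSETS
)

print(len(ALL_SUBSETS), first_identity_holds, second_identity_holds)   # 16 True True
```

The second identity is just the negation of both sides of the first, which is why the two checks succeed or fail together.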
These transformations amount to relabeling the Venn diagrams, simply changing the names of categories. The argument remains valid, or invalid, as long as the labels are changed consistently. For instance, suppose category \(A\) is in the original argument. We can change \(A\) to \(Q\) provided \(Q\) does not already appear in the argument and we make the change everywhere \(A\) appears.
We could instead change \(A\) to \(Q^c\) as long as \(Q\) does not appear in the argument and we make the change everywhere, including changing \(A^c\) to \(Q\) throughout.
Let's do examples of transforming an argument with no existential import and an argument with existential import.
Consider the argument:
All \(A\) are \(B\). No \(B\) are \(C\). Therefore, no \(A\) are \(C\).
This has no existential import, so—if it is valid—we should be able to show that it is equivalent to the first of the two canonical syllogisms.
The first problem we face is that this argument has a premise and a conclusion of the form no \(Q\) are \(R\), while every statement in the canonical syllogism is of the form all \(Q\) are \(R\). How can we get the argument into that form?
The simplest way to separate the abstract content from the words used to express the content is to use mathematical notation instead of words. Both no \(Q\) are \(R\) and all \(Q\) are \(R^c\) describe the same situation: \(Q \subset R^c\). (Moreover, that is the same condition as \(R \subset Q^c\).) In set notation we have:
\(A \subset B\). \(B \subset C^c\). Therefore, \(A \subset C^c\).
This is in exactly the same form as the first canonical syllogism if we reverse the order of the premises:
\(B \subset C^c\). \(A \subset B\). Therefore, \(A \subset C^c\).
We just need to match the categories in this argument to the major, minor and middle terms in the canonical syllogism: let \(M = B\), \(P = C^c\) and \(S = A\). With these changes of variables, the syllogism becomes
\(M \subset P\). \(S \subset M\). Therefore, \(S \subset P\).
This is indeed the first canonical syllogism, so the syllogism we are testing is valid.
Here is a Venn diagram for this syllogism:
This is exactly the Venn diagram for the first canonical syllogism, relabeled as described (\(S = A\), \(M = B\) and \(P = C^c\)).
Consider the categorical argument:
All \(A\) are \(B\). Some \(A\) are \(C\). Therefore, some \(B\) are \(C\).
This argument asserts that the set \(B \cap C\) has at least one element, so it has existential import. If it is valid, we should be able to transform it into the second canonical syllogism.
Some are means the same thing as not all are not. For instance, the second premise, some \(A\) are \(C\) means the same thing as not all \(A\) are \(C^c\); i.e., \(A \not\subset C^c\).
As in the previous example, we can avoid some of the distraction of the variety of ways of expressing the same underlying situation in English by using set notation instead of words. The syllogism we are testing is
\(A \subset B\). \(A \not\subset C^c\). Therefore, \(B \not\subset C^c\).
Where are we trying to go? If this syllogism is valid, we should be able to write it in the form of the second canonical syllogism:
\(P \subset M\). \(S \not\subset M\). Therefore, \(S \not\subset P\).
We are almost there. In the syllogism we are testing, the set that appears in both premises (and which must therefore be the middle term), \(A\), is on the left hand side of both premises. In the second canonical syllogism, the set that appears in both premises, the middle term \(M\), is on the right hand side of both premises. And in the canonical syllogism, the major term is on the right of the conclusion, while in the syllogism we are testing, the term that is not the middle term and occurs in the first premise is on the left of the conclusion. We need to reverse all three relationships if we are going to make them match the pattern of the second canonical syllogism.
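We can "flop" all three of the relationships in the syllogism we are testing by taking complements. Remember the identities:
\(A \subset B\) if and only if \(B^c \subset A^c\)
\(A \not\subset B\) if and only if \(B^c \not\subset A^c\)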
The first identity can be applied to the first premise of the argument we are testing (which has \(\subset\)). The second identity can be applied to the second premise and the conclusion (both of which have \(\not\subset\)). That changes the argument to:
\(B^c \subset A^c\). \(C \not\subset A^c\). Therefore, \(C \not\subset B^c\).
Now the correspondence between this and the second canonical syllogism is plain: let \(P = B^c\), \(M = A^c\), and \(S = C\). With these changes of variables, the syllogism becomes
\(P \subset M\). \(S \not\subset M\). Therefore, \(S \not\subset P\).
And that is the second canonical categorical syllogism. Therefore, the argument is a valid categorical syllogism.
As you can see, this is just a relabeling of the Venn diagram for the second canonical syllogism, as just described.
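The relabeling in this example can be checked mechanically. The sketch below assumes a three-element universe and enumerates every choice of \(A\), \(B\) and \(C\). For each choice it compares the three statements of the original argument with the three statements obtained from \(P = B^c\), \(M = A^c\), \(S = C\), and it also confirms that the canonical form never has true premises with a false conclusion. The helper names are illustrative.

```python
from itertools import combinations, product

UNIVERSE = frozenset(range(3))   # assumed small universe

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(X):
    return UNIVERSE - X

same_statements = True   # does the relabeling preserve every truth value?
argument_holds = True    # do true premises always give a true conclusion?

for A, B, C in product(subsets(UNIVERSE), repeat=3):
    # Original argument: A subset B; A not subset C^c; therefore B not subset C^c.
    original = (A <= B, not A <= comp(C), not B <= comp(C))
    # Relabel: P = B^c, M = A^c, S = C.
    P, M, S = comp(B), comp(A), C
    # Canonical form: P subset M; S not subset M; therefore S not subset P.
    canonical = (P <= M, not S <= M, not S <= P)
    same_statements = same_statements and (original == canonical)
    if canonical[0] and canonical[1] and not canonical[2]:
        argument_holds = False

print(same_statements, argument_holds)   # True True
```

The first flag confirms that taking complements changes only the labels, and the second that no choice of sets in this universe is a counterexample.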
We can always test whether a syllogism is valid by putting it in standard form and checking whether it is one of the 15 valid syllogisms. But that is a fairly mindless undertaking that requires memorizing 15 special cases. And we can test whether an argument is a valid syllogism by seeing whether we can transform it into one of the two canonical valid syllogisms by changing variable names and using set theory identities, or show that its Venn diagram is just a consistent relabeling of the Venn diagram of one of the two valid syllogisms. This is more interesting, but still does not tell us why the syllogism is valid or invalid. It is far more rewarding—and a better use of our brains—to learn the rules of set theory and then to use logic to determine directly whether a syllogism is valid—to prove that it is valid or to find a counterexample.
The first step is to express the syllogism using set notation. For instance, if a premise or conclusion is of the form every \(A\) is a \(B\), all \(A\) are \(B\), or no \(A\) are not \(B\), we would write that as \(A \subset B\), or as \(B^c \subset A^c\).
If a premise or conclusion is of the form no \(A\) is a \(B\), no \(A\) are \(B\), or every \(A\) is not a \(B\), we would write that as \(A \subset B^c\) or as \(B \subset A^c\).
If a premise or conclusion is of the form some \(A\) is a \(B\) or some \(A\) are \(B\), we would write that as \(A \not\subset B^c\) or as \(B \not\subset A^c\).
To test whether the syllogism is valid, we see whether we can get from the two premises to the conclusion using only rules of set theory, such as:
Perhaps it is not surprising that the third rule is the first canonical syllogism, and the fourth is equivalent to the second canonical syllogism! The two canonical syllogisms are simply facts about the subset relationship.
Here are a few examples of testing the validity of natural language syllogisms by translating them into set theory.
Consider the following three-line quantified categorical argument:
All ravens are black. No animals that live in Antarctica are black. Therefore, no ravens live in Antarctica.
The three categories in this argument are ravens, black animals, and animals that live in Antarctica. Let's call them \(R\), \(B\) and \(A\), respectively. Then the argument is:
All \(R\) are \(B\). No \(A\) are \(B\). Therefore, no \(R\) are \(A\).
In set notation, the premises are:
\(R \subset B \text{ and } B \subset A^c.\)
The conclusion is \(R \subset A^c\).
But the subset relationship is transitive, so if \(R \subset B\) and \(B \subset A^c\) then \(R \subset A^c\); therefore, this argument is valid.
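For readers who like machine-checked arguments, the same reasoning can be written as a short Lean 4 proof. This is a sketch under an assumed encoding, not the chapter's own method: each category is modeled as a predicate on an arbitrary type `α`, "all \(R\) are \(B\)" becomes `∀ x, R x → B x`, and "no \(A\) are \(B\)" becomes `∀ x, A x → ¬ B x`. The theorem name is made up for illustration.

```lean
-- Categories are modeled as predicates on an arbitrary type α (an assumption
-- of this sketch). h1: all R are B.  h2: no A are B.  Goal: no R are A.
theorem no_ravens_in_antarctica {α : Type} (R B A : α → Prop)
    (h1 : ∀ x, R x → B x) (h2 : ∀ x, A x → ¬ B x) :
    ∀ x, R x → ¬ A x :=
  -- If x were both a raven and in Antarctica, it would be black (h1)
  -- and not black (h2), a contradiction.
  fun x hR hA => h2 x hA (h1 x hR)
```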
Some people run 100-mile mountain races through snow and nasty weather. Anybody who would run a 100-mile mountain race through snow and nasty weather is crazy. Therefore, some people are crazy.
The three categories in this argument are people, those who run 100-mile races in the mountains in bad weather, and crazy people. Let's call them \(P\), \(R\) and \(C\), respectively. Then the argument is:
Some \(P\) are \(R\). All \(R\) are \(C\). Therefore, some \(P\) are \(C\).
In set notation, the premises are:
\(P \not\subset R^c \text{ and } R \subset C.\)
The conclusion is \(P \not\subset C^c\).
Complements reverse the subset relationship, so \(R \subset C\) if and only if \(C^c \subset R^c\). Thus, the second premise is equivalent to \(C^c \subset R^c\). The first premise says there is some \(x \in P\) that is not in \(R^c\). But subsets are not bigger than the sets that contain them, so that \(x\) cannot be in any subset of \(R^c\). The second premise tells us that \(C^c \subset R^c\). Hence \(x\) is not in \(C^c\). So, \(P \not\subset C^c\). The syllogism is valid.
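The element-chasing in this argument can also be written as a short Lean 4 proof. This is a sketch under an assumed encoding: categories are predicates on an arbitrary type `α`, and "some \(P\) are \(R\)" is stated directly as the existence of an \(x\) in both \(P\) and \(R\), which is what \(P \not\subset R^c\) asserts. The theorem name is hypothetical.

```lean
-- h1: some P are R (there is a witness x in both).  h2: all R are C.
-- Goal: some P are C.
theorem some_people_are_crazy {α : Type} (P R C : α → Prop)
    (h1 : ∃ x, P x ∧ R x) (h2 : ∀ x, R x → C x) :
    ∃ x, P x ∧ C x :=
  match h1 with
  -- The same witness x works: it is in P, and being in R puts it in C.
  | ⟨x, hP, hR⟩ => ⟨x, hP, h2 x hR⟩
```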
Consider the following syllogism:
All balls are approximately spherical. All planets are approximately spherical. Therefore, all balls are planets.
The three categories in this argument are balls, approximately spherical objects, and planets. Let's call them \(B\), \(S\) and \(P\), respectively. Then the argument is:
All \(B\) are \(S\). All \(P\) are \(S\). Therefore, all \(B\) are \(P\).
In set notation, the premises are:
\(B \subset S \text{ and } P \subset S.\)
The conclusion is \(B \subset P\).
There is no logical way to get from the premises to the conclusion. How can we see that mathematically? Here is a Venn diagram that is consistent with the premises but for which the conclusion is false:
This is a counterexample to the argument. The syllogism is not valid.
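A counterexample of this kind can also be written down explicitly with small sets. The sketch below is illustrative only: the element names are made up, and any universe containing a ball that is not a planet would do.

```python
# Illustrative sets (the element names are made up for this sketch).
S = {"tennis ball", "Mars"}   # approximately spherical objects
B = {"tennis ball"}           # balls
P = {"Mars"}                  # planets

premise_1 = B <= S            # all B are S
premise_2 = P <= S            # all P are S
conclusion = B <= P           # all B are P

# Both premises hold, but the conclusion fails.
print(premise_1, premise_2, conclusion)
```

Because both premises are true and the conclusion is false for these sets, no derivation from the premises to the conclusion can exist.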
Some statisticians are funny. No comedians are statisticians. Therefore, some comedians are not funny.
The three categories in this argument are statisticians, funny people, and comedians. Let's call them \(S\), \(F\) and \(C\), respectively. Then the argument is:
Some \(S\) are \(F\). No \(C\) are \(S\). Therefore, some \(C\) are \(F^c\).
In set notation, the premises are:
\(S \not\subset F^c \text{ and } C \subset S^c.\)
The conclusion is \(C \not\subset F\).
This Venn diagram shows a situation that is consistent with the premises but for which the conclusion is false:
[Venn diagram of \(S\), \(F\) and \(C\), with a marked point \(x\).]
This is a counterexample to the argument. So, the syllogism is not valid.
By saying that there are funny statisticians, the premises guarantee that there are statisticians and that there are funny people. But the premises can be true even if there are no comedians; therefore, the premises cannot possibly guarantee that there are un-funny comedians.
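The situation just described, with a funny statistician and no comedians at all, can be checked directly. This is a minimal sketch with a one-element universe chosen for illustration; the variable names are assumptions.

```python
universe = {"x"}    # a one-element universe, for illustration only
S = {"x"}           # statisticians
F = {"x"}           # funny people
C = set()           # comedians: none at all

premise_1 = not (S <= universe - F)   # some S are F:      S ⊄ F^c
premise_2 = C <= universe - S         # no C are S:        C ⊂ S^c
conclusion = not (C <= F)             # some C are not F:  C ⊄ F

# Both premises hold, but the conclusion fails because C is empty.
print(premise_1, premise_2, conclusion)
```

The second premise holds vacuously, and the conclusion cannot hold because it would require at least one comedian.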
To test the validity of a categorical argument (whether it is a syllogism or not), first try to think of a counterexample. Drawing pictures (Venn diagrams) can help enormously. If you convince yourself that there can be no counterexample, ask yourself why there can be no counterexample. If your intuition is clear, it will lead you to a proof—a derivation of the conclusion from the premises, using the rules of set theory.
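The search for a counterexample can be mechanized for arguments about three categories. The sketch below rests on one observation: each of the four categorical statements depends only on which of the eight regions of a three-set Venn diagram are occupied, so it suffices to try all \(2^8 = 256\) patterns of occupied regions. All function names here are hypothetical, and the categories are referred to by position (0, 1, 2) in a region triple.

```python
from itertools import product

# The 8 regions of a three-set Venn diagram: (in set 0, in set 1, in set 2).
REGIONS = list(product([False, True], repeat=3))


def all_are(i, j):        # set i ⊂ set j
    return lambda occ: not any(r[i] and not r[j] for r in occ)


def no_are(i, j):         # set i ⊂ complement of set j
    return lambda occ: not any(r[i] and r[j] for r in occ)


def some_are(i, j):       # set i ⊄ complement of set j
    return lambda occ: any(r[i] and r[j] for r in occ)


def some_are_not(i, j):   # set i ⊄ set j
    return lambda occ: any(r[i] and not r[j] for r in occ)


def find_counterexample(premises, conclusion):
    """Return a list of occupied regions that satisfies every premise but not
    the conclusion, or None if no such pattern exists (the argument is valid)."""
    for mask in product([False, True], repeat=len(REGIONS)):
        occupied = [r for r, keep in zip(REGIONS, mask) if keep]
        if all(p(occupied) for p in premises) and not conclusion(occupied):
            return occupied
    return None


# The four arguments of this section, with categories numbered in the order
# they were named: (R, B, A), (P, R, C), (B, S, P) and (S, F, C).
ravens = find_counterexample([all_are(0, 1), no_are(2, 1)], no_are(0, 2))
runners = find_counterexample([some_are(0, 1), all_are(1, 2)], some_are(0, 2))
balls = find_counterexample([all_are(0, 1), all_are(2, 1)], all_are(0, 2))
comedians = find_counterexample([some_are(0, 1), no_are(2, 0)], some_are_not(2, 1))

print(ravens is None, runners is None)          # valid: no counterexample found
print(balls is not None, comedians is not None)  # invalid: a counterexample exists
```

A returned pattern says which regions of the Venn diagram contain at least one element, which is exactly the information needed to draw the counterexample. A result of `None` shows that the argument is valid but does not explain why; a derivation from the rules of set theory does.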
The following exercise checks your ability to test whether a categorical argument is valid.
Categories can be viewed as sets. To say that \(x\) is in category \(A\) is equivalent to saying \(x\) is in the set \(A\), i.e., \(x \in A\). If \(x\) is in the category \(A\), we say \(x\) is in \(A\), \(x\) is an \(A\), or, least formally, \(x\) is \(A\).
The basic quantified relationships between two categories are all are (every one is, none are not), none is (all are not), some are (at least one is, not all are not) and not all are (some are not). For instance, if \(A\) consists of people who are over 100 years old and \(B\) consists of women, not all \(A\) are \(B\) means not all people who are over 100 years old are women.
These four relationships also can be expressed as set relations. All \(A\) are \(B\) is equivalent to \(A \subset B\). No \(A\) is \(B\) is equivalent to \(A \subset B^c\). Some \(A\) are \(B\) is equivalent to \(A \not\subset B^c\). Not all \(A\) are \(B\) is equivalent to \(A \not\subset B\).
The empty set is a subset of every set, so \(A\) can be a subset of \(B\) or of \(B^c\) even if \(A\) has no elements. That is, \(A \subset B\) does not imply that there exists anything in \(A\). In contrast, for \(A\) not to be a subset of \(B\) or of \(B^c\), \(A\) must have at least one element. Thus the relationships some are and not all are have existential import (they imply that the first set has at least one element), while the relationships all are and none is do not have existential import. The existential fallacy consists of assuming, in the course of an argument, that a set has elements when the premises do not guarantee that it has any elements.
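Existential import is easy to see with an empty set. In this minimal sketch the sets `U`, `A` and `B` are arbitrary choices made for illustration.

```python
U = {1, 2, 3}   # universe (arbitrary, for illustration)
A = set()       # an empty category
B = {1, 2}

all_A_are_B = A <= B                 # A ⊂ B: true vacuously
no_A_are_B = A <= U - B              # A ⊂ B^c: also true vacuously
some_A_are_B = not (A <= U - B)      # A ⊄ B^c: false, since it needs an element of A
not_all_A_are_B = not (A <= B)       # A ⊄ B: false, for the same reason

print(all_A_are_B, no_A_are_B, some_A_are_B, not_all_A_are_B)
```

With \(A\) empty, both universal statements are true and both existential statements are false, so "all \(A\) are \(B\)" cannot by itself justify "some \(A\) are \(B\)".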
Categorical syllogisms are special three-line arguments about categories, consisting of two premises and a conclusion. They involve three categories and three quantified relationships among the three categories. The premises and conclusion are of the form [quantifier] [category 1] are [category 2]. The category that plays the role [category 2] in the conclusion is called the major term. The category that plays the role [category 1] in the conclusion is called the minor term. One of the premises contains the major term and the third category, called the middle term. That premise is called the major premise. The other premise contains the minor term and the middle term; that premise is called the minor premise.
When a categorical syllogism is in standard form, the first premise is the major premise and the second premise is the minor premise. In standard form there are 256 possible categorical syllogisms, of which only 15 are valid. The rest are fallacious: one can construct counterexamples to the arguments. There is less to syllogisms than meets the eye: all valid syllogisms are mathematically equivalent to one of two canonical syllogisms. One of the canonical syllogisms has existential import; the other does not.
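The counts of 256 and 15 can be reproduced by brute force. The sketch below makes some assumptions that are not spelled out in the text: it uses the traditional letters A, E, I, O for the four quantified forms (all are, none is, some are, some are not), it enumerates the four possible orderings of terms within the two premises (traditionally called figures), and it treats universal statements as having no existential import. Validity is decided by trying every pattern of occupied regions in a three-set Venn diagram. All names are hypothetical.

```python
from itertools import product

# A region of the Venn diagram is a triple (in S, in P, in M), where S is the
# minor term, P the major term, and M the middle term.
REGIONS = list(product([False, True], repeat=3))
S, P, M = 0, 1, 2

# Every pattern of occupied regions: 2**8 = 256 models.
MODELS = [[r for r, keep in zip(REGIONS, mask) if keep]
          for mask in product([False, True], repeat=len(REGIONS))]


def holds(form, x, y, occupied):
    """Truth of a categorical statement about terms x and y in a model."""
    if form == "A":   # all x are y
        return not any(r[x] and not r[y] for r in occupied)
    if form == "E":   # no x are y
        return not any(r[x] and r[y] for r in occupied)
    if form == "I":   # some x are y
        return any(r[x] and r[y] for r in occupied)
    return any(r[x] and not r[y] for r in occupied)   # "O": some x are not y


# Term order in (major premise, minor premise) for each of the four figures.
FIGURES = [((M, P), (S, M)), ((P, M), (S, M)), ((M, P), (M, S)), ((P, M), (M, S))]


def is_valid(major_form, minor_form, conclusion_form, figure):
    """Valid if the conclusion holds in every model satisfying both premises."""
    (a, b), (c, d) = figure
    return all(holds(conclusion_form, S, P, occ)
               for occ in MODELS
               if holds(major_form, a, b, occ) and holds(minor_form, c, d, occ))


forms = list(product("AEIO", repeat=3))      # 4**3 = 64 choices of forms
total = len(forms) * len(FIGURES)            # 64 * 4 figures
valid = [(f, i + 1) for f in forms
         for i, fig in enumerate(FIGURES) if is_valid(*f, fig)]

print(total, len(valid))   # 256 15
```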
The validity of categorical arguments—including categorical syllogisms—can be tested by converting the argument into set notation and trying to derive the conclusion from the premises using the rules of set theory. If the argument is not valid, it is possible to draw a Venn diagram that satisfies the premises of the argument but for which the conclusion of the argument is false.