
PROPOSITIONAL CALCULUS AND BOOLEAN ALGEBRA

(18.510, FALL 2011)

HENRY COHN

1. Syntax
The propositional calculus is the mathematical system that deals with logical
relationships between propositions, i.e., between assertions. The word calculus
simply means a method of calculating. It is by no means limited to the differential
or integral calculus, although those are certainly the most famous examples.
The propositional calculus involves manipulating strings of symbols. Some of the
symbols will be variables, which stand for propositions; then we can combine them
with the logical symbols ¬ (not), ∧ (and), ∨ (or), → (implies), and ↔ (is equivalent
to). For example, (p ∨ ¬p) means that p is true or p is not true.
This framework is quite limited, because it does not allow us to create interesting
statements from scratch. Using the propositional calculus, we can make assertions
like (p ∧ q), but we cannot explain what p and q actually mean, so we are limited
to talking about pre-existing statements. Thus, to do serious mathematics we will
eventually need a more powerful framework, but the propositional calculus is an
important part of more general systems.
We begin by formalizing the intuitive idea of a string of symbols in terms of the
von Neumann definition of the natural numbers.
Definition 1.1. Given a set Σ, a string with alphabet Σ (or string over Σ) is a
function from a natural number to Σ. We write Σ* for the set of all strings over Σ,
and say that a function from n to Σ has length n.
Given a function σ : n → Σ, we typically write the string as σ(0)σ(1) . . . σ(n − 1).
For example, if Σ = {0, 1}, then 01001 denotes the function σ : 5 → Σ defined
by σ(0) = 0, σ(1) = 1, σ(2) = 0, σ(3) = 0, and σ(4) = 1. However, this is just
shorthand for the formal definition. We also typically denote the symbols in a
string with subscripts rather than parentheses: given a string σ, we write it as
σ₀σ₁ . . . σₙ₋₁ rather than σ(0)σ(1) . . . σ(n − 1). We will not use a special notation
for the empty string (the unique string with zero length), because we will not talk
about it often enough to justify reserving a letter for it.
Definition 1.2. If σ and τ are strings of lengths m and n, respectively, then their
concatenation στ is the string
σ₀σ₁ . . . σₘ₋₁τ₀τ₁ . . . τₙ₋₁
of length m + n.
We identify the symbols in Σ with the strings of length one. For example, if
σ is a string and τ is a symbol, then τσ means the concatenation of the string τ of
length one with the string σ.
The alphabet of the propositional calculus consists of a countably infinite set
of variables together with nine other symbols (distinct from any of the variables),
namely ¬, ∧, ∨, →, ↔, (, ), ⊤, and ⊥. We will use lowercase letters to denote
variables, but because the English language is equipped with only a finite alphabet,
we must allow some sort of decoration to distinguish extra variables. For example,
if we run out of letters, we can generate more variables by appending primes: p,
p′, p′′, etc. will all be distinct variables. We treat these as representing indivisible
symbols (i.e., the prime is meaningless by itself and cannot be used in isolation).
As a general convention, we will use Roman letters to denote variables and Greek
letters to denote strings.
For the rest of this section, let Σ denote the alphabet of the propositional calculus.
We begin by giving an inductive definition of which strings are syntactically correct.

Definition 1.3. A subset S of Σ* is admissible if
(1) S contains all the variables in Σ, as well as the symbols ⊤ and ⊥,
(2) if S contains φ, then it also contains ¬φ, and
(3) if S contains φ and ψ, then it also contains (φ ∧ ψ), (φ ∨ ψ), (φ → ψ), and
(φ ↔ ψ).
A well-formed formula (or wff ) is a string that is an element of all admissible subsets
of Σ*.

The word wff is pronounced like woof, or sometimes wiff.


We can prove that a string is a wff by showing how to derive it from (1) through
(3) above. For example, since p and q are variables, they are both wffs by (1)
(because they must be in every admissible set), and ¬q is then a wff by (2), so
(p ∧ ¬q) is a wff by (3). It is easy to build up much more elaborate wffs, such as

((p ∧ ¬q) ∨ (r′ → s)).

To prove that something is not a wff, we must take the opposite approach of
finding an admissible set that does not contain it. For example, (p ∧ q is not a wff,
because the parentheses are unbalanced. Specifically, let S be the set of strings in
Σ* with equal numbers of left and right parentheses. This is an admissible set, since
it satisfies (1) through (3), but it does not contain (p ∧ q, so that string is not a wff.
In fact, it takes no real ingenuity to determine whether a given string is a wff:
either the string is obviously a wff or there is something obvious that is wrong with
it. We will not formalize or prove this statement here, but we will prove one closely
related lemma, called unique readability. It says that every wff can be built up
from (1) through (3) in a unique way, so we needn't worry about ambiguity (for
example, that a single wff might simultaneously be the conjunction of two wffs and
the disjunction of two different wffs):

Lemma 1.4. Every wff φ satisfies exactly one of the following:
(1) φ is a variable, ⊤, or ⊥,
(2) there exists a unique wff ψ such that φ = ¬ψ,
(3) there exist unique wffs ψ₁ and ψ₂ such that φ = (ψ₁ ∧ ψ₂),
(4) there exist unique wffs ψ₁ and ψ₂ such that φ = (ψ₁ ∨ ψ₂),
(5) there exist unique wffs ψ₁ and ψ₂ such that φ = (ψ₁ → ψ₂), or
(6) there exist unique wffs ψ₁ and ψ₂ such that φ = (ψ₁ ↔ ψ₂).

To prove this lemma, we will use the idea of a prefix of a string: σ is a prefix of
τ if τ = σρ for some string ρ. This allows σ = τ, with ρ the empty string; we say
σ is a proper prefix of τ if it is a prefix and σ ≠ τ.

Proof. We begin by showing that no wff can be a proper prefix of another wff.
Every proper prefix of a wff is either a string of ¬ symbols or has strictly more left
parentheses than right parentheses, because the set of strings whose proper prefixes
are of these types is admissible. However, no wff can consist entirely of ¬ symbols
or have unbalanced parentheses.
Now we turn to the proof of the lemma. The only interesting part is cases (3)
through (6). We must prove that there is no overlap among these cases, and that
ψ₁ and ψ₂ are uniquely determined in each case. To do so, let ∗ and ⋆ be symbols
chosen from among ∧, ∨, →, and ↔, and suppose we have a wff that is of the form (ψ₁ ∗ ψ₂)
and (ψ₁′ ⋆ ψ₂′), where ψ₁, ψ₂, ψ₁′, and ψ₂′ are wffs.
Because (ψ₁ ∗ ψ₂) = (ψ₁′ ⋆ ψ₂′), one of ψ₁ and ψ₁′ must be a prefix of the other,
and therefore ψ₁ = ψ₁′. Then (ψ₁ ∗ ψ₂) = (ψ₁′ ⋆ ψ₂′) implies ∗ = ⋆ and ψ₂ = ψ₂′ as
well. □

The only reason why unique readability holds is that we have fully parenthesized
everything. If we allowed strings like p ∧ q ∨ r, it would be ambiguous whether the
string was built from p and q ∨ r or from p ∧ q and r. In practice, writing many
parentheses can be tedious, and it is often convenient to omit some of them, but the
resulting formula should be viewed as shorthand for the real, fully parenthesized
wff.
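In practice, wffs are easy to recognize mechanically. The following recognizer is a minimal sketch in Python, not part of the notes: it assumes wffs are written as Unicode strings with the symbols above, the function names are mine, and the parse tree it returns is unique precisely because of Lemma 1.4.

```python
# A minimal recursive-descent recognizer for wffs, following Definition 1.3.
# Variables are a lowercase letter optionally followed by primes.
BINOPS = "∧∨→↔"

def parse_wff(s):
    tree, rest = _parse(s)
    if rest:
        raise ValueError("trailing symbols: " + rest)
    return tree

def _parse(s):
    if not s:
        raise ValueError("unexpected end of string")
    c = s[0]
    if c in "⊤⊥":
        return c, s[1:]
    if c == "¬":
        sub, rest = _parse(s[1:])
        return ("¬", sub), rest
    if c.islower():                      # a variable, possibly with primes
        i = 1
        while i < len(s) and s[i] == "′":
            i += 1
        return s[:i], s[i:]
    if c == "(":
        left, rest = _parse(s[1:])
        if not rest or rest[0] not in BINOPS:
            raise ValueError("expected a binary connective")
        op, rest = rest[0], rest[1:]
        right, rest = _parse(rest)
        if not rest or rest[0] != ")":
            raise ValueError("expected ')'")
        return (op, left, right), rest[1:]
    raise ValueError("unexpected symbol: " + c)

print(parse_wff("(p∧¬q)"))            # ('∧', 'p', ('¬', 'q'))
print(parse_wff("((p∧¬q)∨(r′→s))"))   # the elaborate wff from the text
```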

2. Semantics
We have specified the syntax of the propositional calculus, i.e., the grammar
that specifies which formulas are well formed. Now we turn to the semantics, the
meaning of the formulas.
The symbol ⊤ stands for truth, and ⊥ for falsehood. If we substitute ⊤ and ⊥
for the variables in a wff, then the following table shows how to assign a truth value
of ⊤ or ⊥ inductively to the entire wff:

 φ   ψ   ¬φ   (φ ∧ ψ)   (φ ∨ ψ)   (φ → ψ)   (φ ↔ ψ)
 ⊤   ⊤   ⊥      ⊤         ⊤         ⊤         ⊤
 ⊤   ⊥   ⊥      ⊥         ⊤         ⊥         ⊥
 ⊥   ⊤   ⊤      ⊥         ⊤         ⊤         ⊥
 ⊥   ⊥   ⊤      ⊥         ⊥         ⊤         ⊤
Under these rules, each wff involving variables x₁, . . . , xₙ determines a function
from the variable assignments {⊤, ⊥}ⁿ to {⊤, ⊥}. Note that this process depends
crucially on Lemma 1.4, since unique readability guarantees that we cannot evaluate
a wff in two different ways.
In the table shown above, each symbol's behavior corresponds to its name (not,
and, or, implication, and equivalence). Two cases are worthy of note. One is that
∨ always denotes inclusive or, so (φ ∨ ψ) is true if φ is true, ψ is true, or both are
true. The other is that → denotes material implication. In other words, (φ → ψ)
depends only on the truth values of φ and ψ, and does not require any conceptual
relationship between the statements. It simply means that if φ is true, then ψ is
true; equivalently, ψ is true or φ is not true.
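This inductive evaluation is easy to sketch in code. In the snippet below (the nested-tuple representation of wffs and the name eval_wff are my own conventions, not notation from the notes), True and False stand in for ⊤ and ⊥.

```python
# Inductive evaluation of a wff: a variable is a string, ⊤/⊥ are the strings
# "⊤"/"⊥", ¬φ is ("¬", φ), and a binary connective is ("∧", φ, ψ), etc.
def eval_wff(wff, assignment):
    """Return the truth value of wff under the given assignment."""
    if wff == "⊤":
        return True
    if wff == "⊥":
        return False
    if isinstance(wff, str):             # a variable
        return assignment[wff]
    op = wff[0]
    if op == "¬":
        return not eval_wff(wff[1], assignment)
    a, b = eval_wff(wff[1], assignment), eval_wff(wff[2], assignment)
    if op == "∧": return a and b
    if op == "∨": return a or b
    if op == "→": return (not a) or b
    if op == "↔": return a == b
    raise ValueError("not a wff")

# (p ∨ ¬p) evaluates to ⊤ under both assignments to p:
for value in (True, False):
    print(eval_wff(("∨", "p", ("¬", "p")), {"p": value}))   # True, True
```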

For example, if p stands for 1 + 1 = 2 and q stands for Fermat's last theorem, then
(p → q) is true, because p and q are both true, even though there is no obvious way
to deduce the truth of Fermat's last theorem from 1 + 1 = 2. Material implication
does not require such a deduction. Part of its beauty is that we can deal with
implications without ever having to formalize the tricky intuitive notion of what it
means for q to be a consequence of p; instead, all we need is the property that if p
and (p → q) are both true, then q is true. Material implication has this property.
Similarly, a false statement vacuously implies everything: (⊥ → p) is true
regardless of whether p is true or false. This can be slightly counterintuitive, since
it disagrees with the usual English interpretation of the word if. Suppose it isn't
raining, and someone says "if it were raining, I would take my umbrella." This
person is presumably imagining a counterfactual scenario and reasoning about how
that imaginary world differs from the real world. It's much less plausible that
someone would say "if it were raining, then it wouldn't be raining," even though
material implication allows you to deduce anything from a falsehood. Counterfactual
reasoning is much richer and deeper than material implication, too much so to be
captured by the propositional calculus. Instead, material implication depends only
on truth values. We know a false statement can sometimes imply a true statement
(if −1 = 1, then (−1)² = 1²) and can sometimes imply another false statement (if
−1 = 1, then −1 + 1 = 1 + 1). If we have just truth values to work with, then we can
capture this behavior only by deciding that (⊥ → p) is always true, regardless of
whether p is true. One might fear that this is an oversimplification compared with
counterfactual reasoning, and indeed it is, but it is not a harmful oversimplification,
and material implication provides a perfectly good foundation for mathematics.
The logical symbols used in the propositional calculus are redundant, because we
can express some of them in terms of the others. We could replace (φ ↔ ψ) with
((φ → ψ) ∧ (ψ → φ)), and we could replace (φ → ψ) with (¬φ ∨ ψ). We could even
eliminate one of ∧ and ∨ by expressing it in terms of the other, i.e., replacing (φ ∧ ψ)
with ¬(¬φ ∨ ¬ψ) or replacing (φ ∨ ψ) with ¬(¬φ ∧ ¬ψ). These sorts of reductions
are sometimes convenient, but they make long formulas much harder to read, so we
will cheerfully use a redundant system.
Definition 2.1. A wff φ in the propositional calculus is satisfiable if there exist
truth values for its variables such that φ evaluates to ⊤. It is a tautology if it always
evaluates to ⊤.
For example, (p ∨ ¬p) is a tautology. The wff (p ∧ q) is satisfiable but not a
tautology, and (p ∧ ¬p) is not even satisfiable. Note that φ is a tautology if and
only if ¬φ is not satisfiable.
It is generally easy to tell whether a short wff is a tautology. However, it can be
a little counterintuitive. For example,
(((p ∧ q) → r) → ((p → r) ∨ (q → r)))
is a tautology, because both ((p ∧ q) → r) and ((p → r) ∨ (q → r)) are equivalent to
((r ∨ ¬p) ∨ ¬q). At first, this seems plainly absurd: it seems to be saying that if two
hypotheses together imply a conclusion, then only one of the hypotheses is actually
needed to reach the conclusion. However, that reformulation is misleading, because
it is subtly changing the problem. When people think about implication, they often
allow an implicit universal quantifier to slip in: they imagine that propositions p
and q depend on some circumstances x, and they think of (p → q) as meaning
∀x (p(x) → q(x)) (which is of course not a wff in the propositional calculus). This is
not what the propositional calculus studies; it deals with single, isolated propositions,
not families of them depending on other variables. That distinction clears up the
intuitive problem:
∀x ((p(x) ∧ q(x)) → r(x))
is not equivalent to
(∀x (p(x) → r(x))) ∨ (∀x (q(x) → r(x))),
but it is equivalent to
∀x ((p(x) → r(x)) ∨ (q(x) → r(x))).
The problem isn't that
(((p ∧ q) → r) → ((p → r) ∨ (q → r)))
is not a tautology. Instead, it is that you can't distribute ∀ over ∨.
As this example shows, it is not always obvious at first glance whether a wff
is a tautology. It is easy to test it by brute force, by substituting all possible
combinations of truth values for the variables, but that is extremely time-consuming,
because there are 2ⁿ possibilities for n variables. For example, in practice it is
impossible to check all 2¹⁰⁰ cases for 100 variables.
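As an illustration of the brute-force method (a sketch with helper names of my own, not from the notes), here is a direct check of the surprising tautology discussed above; with three variables there are only 2³ = 8 assignments to try.

```python
# Brute-force tautology testing: try every assignment of truth values.
from itertools import product

def implies(a, b):
    return (not a) or b

def example_is_tautology():
    # (((p ∧ q) → r) → ((p → r) ∨ (q → r)))
    for p, q, r in product([True, False], repeat=3):
        if not implies(implies(p and q, r),
                       implies(p, r) or implies(q, r)):
            return False
    return True

print(example_is_tautology())   # True: all 8 assignments evaluate to ⊤
```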
In fact, one of the deepest open problems in theoretical computer science, namely
whether P = NP, asks whether one can do better: P = NP if and only if there is
a polynomial-time algorithm for testing whether a wff is a tautology. The time
bound means the number of steps required to test a wff of length n is bounded by
a polynomial in n. Of course, such an algorithm could not check all the variable
assignments, so it would have to be based on a more efficient reformulation of what
it means to be a tautology. It is widely believed that P ≠ NP, and in fact that no
tautology-testing algorithm runs in subexponential time, but no proof is known.

3. Boolean algebra
Classical logic is based on the law of the excluded middle: a proposition that is
not true must be false, with no third possibility. However, it is sometimes convenient
to extend the notion of logic to other truth values. For example, we could use a
three-valued logic, with a truth value ? (maybe) in addition to ⊤ and ⊥. There's a
natural way to extend ¬, ∧, ∨, →, and ↔ to this setting, based on interpreting ? as
a state of ignorance about whether the truth value is ⊤ or ⊥:

 φ   ψ   ¬φ   (φ ∧ ψ)   (φ ∨ ψ)   (φ → ψ)   (φ ↔ ψ)
 ⊤   ⊤   ⊥      ⊤         ⊤         ⊤         ⊤
 ⊤   ?   ⊥      ?         ⊤         ?         ?
 ⊤   ⊥   ⊥      ⊥         ⊤         ⊥         ⊥
 ?   ⊤   ?      ?         ⊤         ⊤         ?
 ?   ?   ?      ?         ?         ?         ?
 ?   ⊥   ?      ⊥         ?         ?         ?
 ⊥   ⊤   ⊤      ⊥         ⊤         ⊤         ⊥
 ⊥   ?   ⊤      ⊥         ?         ⊤         ?
 ⊥   ⊥   ⊤      ⊥         ⊥         ⊤         ⊤
However, this system has some unfortunate properties. For example, we would like
(p ↔ p) to be a tautology, because every proposition should be equivalent to itself,
but in the three-valued logic described above, it is not a tautology. The problem is
that (? ↔ ?) evaluates to ?, since it has no way of knowing whether the two question
marks describe the same uncertain proposition or different ones.
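Here is a sketch of these three-valued connectives in Python, encoding ⊥, ?, ⊤ as 0, 1, 2 (an encoding and set of helper names chosen by me for illustration); it confirms that (p ↔ p) can evaluate to ?.

```python
# Three-valued connectives from the table above: ∧ is min, ∨ is max, and
# ¬ reverses the order of the encoded values 0 (⊥), 1 (?), 2 (⊤).
BOT, MAYBE, TOP = 0, 1, 2

def neg(a):      return 2 - a
def conj(a, b):  return min(a, b)
def disj(a, b):  return max(a, b)
def impl(a, b):  return max(2 - a, b)        # (a → b) behaves like (¬a ∨ b)
def iff(a, b):   return min(impl(a, b), impl(b, a))

# (p ↔ p) is not a tautology in this logic: it evaluates to ? when p is ?.
for p in (BOT, MAYBE, TOP):
    print(p, iff(p, p))          # prints 0 2, then 1 1, then 2 2
```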
In this section, we will develop the concept of a Boolean algebra, which is arguably
the best-behaved way to extend two-valued logic. The three-valued logic described
above is not a Boolean algebra.
We will build up to Boolean algebras in several steps. To begin, we will have
a partially ordered set B of truth values. The ordering relation p ≤ q means q is
at least as true as p; for example, this may hold if q is true on weekends and p is
true on Saturdays. The reason it is only a partial ordering is that there may be
incomparable truth values, for example things that are true on Mondays and things
that are true on Tuesdays.
In principle, we could use any poset of truth values, but it is far from clear how
to define the operations ¬, ∧, ∨, →, and ↔ in an arbitrary poset. Thus, we will
impose additional structure on B. We will define → and ↔ in terms of the other
three operations, just as in the two-valued case, but that still leaves three operations
to go.
Definition 3.1. A lattice is a poset in which every pair of elements p and q has a
least upper bound p ∨ q and a greatest lower bound p ∧ q.
In a lattice, p ∨ q is read "p join q," and p ∧ q is read "p meet q." Recall that the
least upper bound property means
(1) p ∨ q ≥ p and p ∨ q ≥ q, and
(2) for all r such that r ≥ p and r ≥ q,
r ≥ p ∨ q.
I.e., it is an upper bound, and it is the smallest of the upper bounds. The greatest
lower bound has exactly the same property, but with all the inequalities reversed.
Two elements p and q in a poset have at most one least upper bound: given two
of them, each would have to be less than or equal to the other. Similarly, they have
at most one greatest lower bound. However, they needn't have either a least upper
bound or a greatest lower bound. For example, in a poset consisting of just two
incomparable elements, those elements have no upper or lower bounds at all. Thus,
not every poset is a lattice. For an example of a poset that is a lattice, let S be a
set, and consider P(S) ordered by ⊆. The least upper bound is then the union, and
the greatest lower bound is the intersection.
It is not difficult to check that if p₁, . . . , pₖ are elements of a lattice, then
p₁ ∨ · · · ∨ pₖ is their least upper bound, and p₁ ∧ · · · ∧ pₖ is their greatest lower
bound. (No parentheses are needed, because these operations are associative.)
The lattice perspective on ∨ and ∧ fits naturally with the idea of truth values:
p or q should be at least as true as p and at least as true as q, but no truer than
this forces it to be, so we take it to be the least upper bound p ∨ q of p and q, and
we deal similarly with and. Thus, we will assume that our poset B of truth values
is a lattice. However, we have not yet seen how to deal with ¬.
Definition 3.2. A complemented lattice is a lattice L such that
(1) there exist elements 0, 1 ∈ L such that 0 ≤ p ≤ 1 for all p ∈ L, and
(2) for every p ∈ L, there exists some q ∈ L such that p ∨ q = 1 and p ∧ q = 0.
In a complemented lattice, 0 and 1 behave like ⊥ and ⊤, and q behaves like ¬p.
The lattice P(S) is complemented: 0 = ∅, 1 = S, and the complement of p is S \ p.

However, not every lattice is complemented. For example, every totally ordered
set is a lattice, with p ∨ q = max(p, q) and p ∧ q = min(p, q). Even if there are
greatest and least elements 1 and 0, no other element p can have a complement,
since max(p, q) = 1 and min(p, q) = 0 imply that {p, q} = {0, 1}. Thus, totally
ordered sets with more than two elements are not complemented lattices.
Complemented lattices almost allow us to define ¬, but there is one major issue:
an element may have several complements. (By contrast, 0 and 1 are uniquely
determined by property (1) in the definition.) For example, in the following poset,
each of the three incomparable elements in the middle is a complement of each of
the others.

        •
      / | \
     •  •  •
      \ | /
        •
Definition 3.3. A distributive lattice is a lattice in which p ∨ (q ∧ r) = (p ∨ q) ∧ (p ∨ r)
and p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r) for all p, q, and r.
In other words, each of ∧ and ∨ distributes over the other, just as in classical
two-valued logic. The lattice P(S) is always distributive, but the lattice drawn
above is not. If we let p, q, and r be the elements in the middle row (and let 0 and
1 be the least and greatest elements), then
p ∧ (q ∨ r) = p ∧ 1 = p,
but
(p ∧ q) ∨ (p ∧ r) = 0 ∨ 0 = 0.
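This failure can also be checked mechanically. In the sketch below (my own encoding of the pictured five-element lattice, not from the notes), the meet of two distinct middle elements is 0 and their join is 1.

```python
# The five-element lattice from the figure: bottom "0", three incomparable
# middle elements "a", "b", "c", and top "1".
def meet(x, y):
    if x == y: return x
    if x == "1": return y
    if y == "1": return x
    return "0"                  # distinct middle elements (or 0) meet at 0

def join(x, y):
    if x == y: return x
    if x == "0": return y
    if y == "0": return x
    return "1"                  # distinct middle elements (or 1) join at 1

p, q, r = "a", "b", "c"
print(meet(p, join(q, r)))                    # a
print(join(meet(p, q), meet(p, r)))           # 0 -- distributivity fails
```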
Lemma 3.4. In a complemented, distributive lattice, the complement of each
element is unique.
Proof. Suppose q and q′ are complements of p in a complemented, distributive
lattice. Then
q = q ∧ 1
  = q ∧ (p ∨ q′)
  = (q ∧ p) ∨ (q ∧ q′)
  = 0 ∨ (q ∧ q′)
  = q ∧ q′.
Similarly, q′ = q ∧ q′, and thus q = q′. □

In a complemented, distributive lattice, we can define ¬p to be the unique
complement of p. This gives a suitable setting for doing propositional logic.
Definition 3.5. A Boolean algebra is a complemented, distributive lattice.
We have defined Boolean algebras to be a special type of partially ordered set,
but one can also characterize them algebraically, in terms of identities satisfied by
∧, ∨, and ¬, as follows.

Proposition 3.6. Suppose B is a set with distinguished elements 0 and 1, a unary
operation ¬, and binary operations ∧ and ∨. Define p ≤ q to mean p = p ∧ q. Then
B is a lattice under ≤ with meet ∧ and join ∨ if and only if it satisfies (1) through
(3) below, and it is a Boolean algebra with complement ¬ if and only if it satisfies
(1) through (5).
(1) Associativity: p ∧ (q ∧ r) = (p ∧ q) ∧ r and p ∨ (q ∨ r) = (p ∨ q) ∨ r for all
p, q, r ∈ B.
(2) Commutativity: p ∧ q = q ∧ p and p ∨ q = q ∨ p for all p, q ∈ B.
(3) Absorption: p ∧ (p ∨ q) = p and p ∨ (p ∧ q) = p for all p, q ∈ B.
(4) Complements: p ∧ ¬p = 0 and p ∨ ¬p = 1 for all p ∈ B.
(5) Distributivity: p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r) and p ∨ (q ∧ r) = (p ∨ q) ∧ (p ∨ r)
for all p, q, r ∈ B.
Proof. One direction is easy: it is straightforward to show that every lattice satisfies
(1) through (3), and (4) and (5) are part of the definition of a Boolean algebra.
For the other direction, we begin by showing that if (1) through (3) hold, then B
is partially ordered by ≤. To show that ≤ is reflexive, we must prove that p = p ∧ p
for all p. That follows from the two absorption laws: p = p ∨ (p ∧ p) implies
p ∧ p = p ∧ (p ∨ (p ∧ p)),
and the right side simplifies to p by absorption. For antisymmetry, p ≤ q means
p = p ∧ q, and q ≤ p means q = q ∧ p, so p = q follows from commutativity. Finally,
for transitivity, if p = p ∧ q and q = q ∧ r, then
p ∧ r = (p ∧ q) ∧ r
      = p ∧ (q ∧ r)
      = p ∧ q
      = p,
so p ≤ q and q ≤ r imply p ≤ r. Thus, ≤ is a partial ordering of B.
We can characterize p ≤ q not just in terms of ∧, but also ∨. Specifically, p = p ∧ q
if and only if q = p ∨ q, by the absorption and commutative laws: if p = p ∧ q, then
p ∨ q = (p ∧ q) ∨ q = q,
and vice versa.
Now we will verify that p ∨ q is the least upper bound of p and q in this poset, and
that p ∧ q is their greatest lower bound. We have p ≤ p ∨ q because p = p ∧ (p ∨ q)
by absorption, and q ≤ p ∨ q now follows by commutativity. Thus, p ∨ q is an upper
bound for p and q; to show that it is the least upper bound, we must prove that if
p = p ∧ r and q = q ∧ r, then p ∨ q = (p ∨ q) ∧ r. That would follow immediately
from the distributive law, but we would like to prove it using only (1) through (3).
To do so, we will use r = p ∨ r and r = q ∨ r, as justified by the previous paragraph.
Then
(p ∨ q) ∧ r = (p ∨ q) ∧ (p ∨ r)
            = (p ∨ q) ∧ (p ∨ (q ∨ r))
            = (p ∨ q) ∧ ((p ∨ q) ∨ r)
            = p ∨ q,
where the last equality follows from absorption. Thus, p ∨ q is the least upper
bound of p and q. The proof that p ∧ q is the greatest lower bound is identical, with
all inequalities reversed and with ∧ and ∨ interchanged. (The fact that ≤ can be
characterized in terms of either ∧ or ∨ is crucial for this symmetry.)
So far, we have shown that B is a lattice if (1) through (3) hold. All that remains
is to check that 0 ≤ p ≤ 1 for all p ∈ B, since (4) and (5) then amount to the rest of
the definition of a complemented, distributive lattice. We have p ∧ 1 = p ∧ (p ∨ ¬p) = p
by the complement and absorption laws, so p ≤ 1, and p ∨ 0 = p ∨ (p ∧ ¬p) = p, so
p ≥ 0. This completes the proof. □
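As a sanity check on these identities (a sketch of my own, not part of the notes), the following snippet verifies (1) through (5) by brute force for the power set of a three-element set, with ∧, ∨, and ¬ interpreted as intersection, union, and complement.

```python
# Brute-force verification of Proposition 3.6 (1)-(5) for P({0, 1, 2}).
from itertools import combinations, product

S = frozenset({0, 1, 2})
B = [frozenset(c) for k in range(len(S) + 1) for c in combinations(S, k)]

meet = lambda p, q: p & q            # intersection
join = lambda p, q: p | q            # union
comp = lambda p: S - p               # complement within S

for p, q, r in product(B, repeat=3):
    assert meet(p, meet(q, r)) == meet(meet(p, q), r)                 # (1)
    assert join(p, join(q, r)) == join(join(p, q), r)
    assert meet(p, q) == meet(q, p) and join(p, q) == join(q, p)      # (2)
    assert meet(p, join(p, q)) == p and join(p, meet(p, q)) == p      # (3)
    assert meet(p, comp(p)) == frozenset() and join(p, comp(p)) == S  # (4)
    assert meet(p, join(q, r)) == join(meet(p, q), meet(p, r))        # (5)
    assert join(p, meet(q, r)) == meet(join(p, q), join(p, r))
print("all", len(B) ** 3, "triples satisfy (1)-(5)")
```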

These axioms highlight the duality symmetry of Boolean algebras: switching
∧ with ∨ and 0 with 1 still yields a Boolean algebra. Thus, all Boolean algebra
identities come in dual pairs. However, this symmetry can be understood equally
well from the poset perspective, in which it amounts to switching ≤ and ≥.

4. Classification
What are all the Boolean algebras? The obvious examples are the lattices P(S)
of all subsets of a given set S. In this section, we will show that there are no other
finite examples, up to isomorphism. An isomorphism of Boolean algebras is the
same as an isomorphism of the underlying poset. Equivalently, it is a bijective map
that preserves ∧, ∨, and ¬ (equivalence holds because the poset structure can be
defined in terms of these operations, and vice versa).
There exist infinite Boolean algebras that are not of this form. For example, take
any countably infinite subset S of an infinite Boolean algebra, and let B be the
closure of S under ∧, ∨, and ¬. Then B is a Boolean algebra (by Proposition 3.6).
However, B is countably infinite, and it is therefore not even in bijection with a
power set, let alone isomorphic to one as a poset.
The Stone representation theorem characterizes all Boolean algebras (finite or
infinite) in terms of topology: for every Boolean algebra B, there is a compact,
totally disconnected, Hausdorff topological space T such that B is isomorphic to
the poset of subsets of T that are both closed and open. In fact, one can recover T
from B as the space of homomorphisms from B to the two-element Boolean algebra.
However, that theorem is beyond the scope of these notes. Instead, we will focus on
the finite version.
Definition 4.1. An atom in a Boolean algebra is a nonzero element p such that
there exists no q satisfying 0 < q < p.
In other words, an atom is an element as close to the bottom as possible, without
actually being at the bottom. The dual notion is a coatom, but we will focus on
atoms.
For our purposes, the importance of atoms is that they recover the elements of
the underlying set S from P(S): an atom in P(S) is a nonempty subset of S that
has no nonempty proper subsets, and that means an atom must be a single-element
subset. Thus, if we want to show that a Boolean algebra is of the form P(S), then
the elements of S must correspond to the atoms in the Boolean algebra.
Lemma 4.2. If p is an atom in a Boolean algebra and p ≤ q ∨ r, then p ≤ q or
p ≤ r.

This is obvious for P(S) (if a single element is in the union of two sets, then it
must be in one or the other), but we must prove it for a general Boolean algebra.
Proof. Suppose p ≤ q ∨ r. Then
p = p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r).
Each of p ∧ q and p ∧ r is less than or equal to p, and thus equals either 0 or p. They
cannot both be 0, since then p = 0 ∨ 0 = 0. Thus, p ∧ q = p or p ∧ r = p, so p ≤ q
or p ≤ r, as desired. □
Lemma 4.3. If p and q are elements of a Boolean algebra such that p < q and
every r < q satisfies r ≤ p, then p = 0 and q is an atom.
Proof. To make use of the hypothesis, we must choose an r. We would like to choose
one that is at most q and unlikely to be at most p, and taking r = q ∧ ¬p is a natural
choice. Then r ≤ q, so either r = q or r ≤ p.
In the first case, q ∧ ¬p = q, and then combining this with p = p ∧ q (from p < q)
yields
0 = q ∧ 0 = q ∧ (p ∧ ¬p) = (q ∧ p) ∧ (q ∧ ¬p) = q ∧ p = p.
Thus, p = 0, and now the hypotheses of the lemma also tell us that q is an atom.
In the second case, we have q ∧ ¬p ≤ p. This means p = p ∨ (¬p ∧ q), and applying
the distributive law yields
p = (p ∨ ¬p) ∧ (p ∨ q) = 1 ∧ (p ∨ q) = p ∨ q.
Thus, q ≤ p, which contradicts p < q, so this case cannot occur. □
Lemma 4.4. If p is a nonzero element of a finite Boolean algebra, and q₁, . . . , qₖ
are the atoms that are less than or equal to p, then
p = q₁ ∨ · · · ∨ qₖ.
Note that every nonzero element of a finite Boolean algebra has some atom beneath
it (otherwise, one could produce an infinite descending sequence of elements), and
of course only finitely many because the Boolean algebra is finite. The lemma
statement still works even if p = 0, as long as we interpret an empty join to equal 0
(as it should, because 0 is the identity element for ∨).
Proof. The join q₁ ∨ · · · ∨ qₖ is the least upper bound of q₁, . . . , qₖ, and thus
q₁ ∨ · · · ∨ qₖ ≤ p. Suppose p is a minimal counterexample to the lemma; in other
words, suppose q₁ ∨ · · · ∨ qₖ < p but equality holds in the corresponding inequality
for each element below p. Note that there must be a minimal counterexample,
since there cannot be an infinite descending sequence of counterexamples in a finite
Boolean algebra.
For every r < p, the minimality of p tells us that r is the join of the atoms below it,
and those atoms form a subset of q₁, . . . , qₖ. Thus, r ≤ q₁ ∨ · · · ∨ qₖ. By Lemma 4.3,
q₁ ∨ · · · ∨ qₖ = 0 and p is an atom. However, no atom can be a counterexample,
since every atom equals the join of those less than or equal to it (namely, itself).
Thus, there cannot be a minimal counterexample, or any counterexample at all. □
Theorem 4.5. Let B be a finite Boolean algebra, and let A be the set of atoms in
B. Then the map f : B → P(A) defined by
f(p) = {a ∈ A : a ≤ p}
is an isomorphism.

Proof. Injectivity follows from Lemma 4.4, because one can recover p from f(p) as
the join of its elements. For surjectivity, given any set a₁, . . . , aₖ of atoms, each of
them is at most a₁ ∨ · · · ∨ aₖ, and Lemma 4.2 implies that no other atom is at most
a₁ ∨ · · · ∨ aₖ. Thus, f(a₁ ∨ · · · ∨ aₖ) = {a₁, . . . , aₖ}.
If p ≤ q, then every atom below p is also below q, and hence f(p) ⊆ f(q).
Conversely, if f(p) ⊆ f(q), then the join of the elements of f(p) is less than or
equal to the join of the elements of f(q), and thus p ≤ q. It follows that f is an
isomorphism. □
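The following sketch illustrates Theorem 4.5 on a small example of my own choosing: the Boolean algebra of subsets of {0, . . . , 5} that are unions of the blocks {0, 1}, {2, 3}, {4, 5}, ordered by inclusion. Its atoms are the blocks rather than singletons, and the map f from the theorem is an order isomorphism onto the power set of the atoms.

```python
# Atoms of a small finite Boolean algebra, and the isomorphism of Theorem 4.5.
from itertools import combinations

blocks = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
B = [frozenset().union(*c) for k in range(len(blocks) + 1)
     for c in combinations(blocks, k)]

leq = lambda p, q: p <= q                          # the order is inclusion
atoms = [a for a in B if a and not any(b and b < a for b in B)]
f = {p: frozenset(a for a in atoms if leq(a, p)) for p in B}

print(sorted(map(sorted, atoms)))                  # [[0, 1], [2, 3], [4, 5]]
assert len(set(f.values())) == len(B) == 2 ** len(atoms)   # f is a bijection
assert all(leq(p, q) == (f[p] <= f[q]) for p in B for q in B)
print("f is an order isomorphism onto the power set of the atoms")
```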

The intuitive picture behind the Boolean algebra P(S) is that S is the set of
possible states of the world. The truth value of a proposition is the set of states in
which that proposition is true. Theorem 4.5 tells us that if we want a finite set of
truth values, and if we want our logical operations to obey the laws enumerated in
Proposition 3.6, then this construction is the only possibility. Even in this more
general setting, classical two-valued logic plays a fundamental role: it is hiding in
the fact that each element of S is either completely in or completely out of a subset.
Aside from a digression in the next section, from this point on we will restrict
our attention to classical logic. It is an appropriate foundation for mathematics,
and we would gain little by casting everything in more general terms. However,
other Boolean algebras do play a fundamental role in certain areas within logic. For
example, Cohen's forcing technique for proving independence results in set theory
can be understood in terms of constructing Boolean-valued models of ZFC, i.e.,
models of set theory in which the truth values are drawn from a Boolean algebra.
Of course the details are subtle and important, but this is a very plausible way to
arrive at independence proofs. At the risk of oversimplification, if one can construct
a model in which all the axioms of ZFC have truth value 1, but the continuum
hypothesis has truth value strictly between 0 and 1, then it can be neither proved
nor disproved using the axioms.

5. Quantum logic
Is it possible that the real world is governed by non-classical logic at the level
of fundamental physics? Surprisingly, the answer is yes: quantum mechanics can
naturally be described in terms of a strange logic called quantum logic, which is not
a Boolean algebra.
The fundamental construction amounts to replacing set theory with linear algebra.
The set of states of our quantum system will be described by a complex vector space
V of possible wave functions. Typically, V will be an infinite-dimensional Hilbert
space, but we can imagine a finite-dimensional vector space, and that does indeed
occur for very simple quantum systems. The state space V will play the same role
as the set S in the Boolean algebra P(S).
Truth values of propositions in quantum logic are vector subspaces of V . The key
difference from classical logic is the restriction to subspaces: a proposition cannot be
true for an arbitrary subset of the state space, but rather just for a vector subspace.
Let Q be the set of all subspaces of V . Then Q is a lattice under ⊆. Specifically,
p ∧ q is the intersection of p and q, and p ∨ q is their span (i.e., the smallest subspace
containing both). The identity element 0 for ∨ is the zero-dimensional subspace,
and the identity 1 for ∧ is the full space V . Furthermore, we can define ¬p to be the
orthogonal complement of p (provided we have a Hermitian inner product on V ).
Thus, Q is a complemented lattice, so we can carry out all of our logical operations
in Q.
However, Q is not a Boolean algebra, because it is not distributive, assuming
dim V ≥ 2. Let p, q, and r be distinct lines in the same plane. Then q ∨ r equals
that plane, so p ∧ (q ∨ r) = p. However, p ∧ q = 0 and p ∧ r = 0, so (p ∧ q) ∨ (p ∧ r) = 0.
Thus, p ∧ (q ∨ r) ≠ (p ∧ q) ∨ (p ∧ r).
This is extremely disconcerting if one takes it literally, because the real world is
governed by quantum mechanics and hence does not have a distributive lattice of
truth values. In fact, this perspective helps explain (or at least formalize) some of
the weird aspects of quantum mechanics, such as the uncertainty principle.
For a toy model, imagine a particle that has two possible positions q0 and q1 , and
two possible momenta p₀ and p₁. We know it is definitely at position q₀ or q₁; in
other words, q₀ ∨ q₁ = 1. Similarly, p₀ ∨ p₁ = 1. It follows that (q₀ ∨ q₁) ∧ (p₀ ∨ p₁) = 1.
Thus, we know it has a definite position, and independently a definite momentum.
However, we cannot apply the distributive law to conclude that
(q₀ ∧ p₀) ∨ (q₀ ∧ p₁) ∨ (q₁ ∧ p₀) ∨ (q₁ ∧ p₁) = 1.
In fact, the uncertainty principle for position and momentum tells us that this is
false! In other words, even though the particle has a definite position and a definite
momentum, we cannot conclude that we can specify both at once.
For a concrete realization of this possibility, let p and q be distinct lines in a
two-dimensional space V . Then
(p ∨ ¬p) ∧ (q ∨ ¬q) = 1,
so every state satisfies p or ¬p and satisfies q or ¬q, but
(p ∧ q) ∨ (p ∧ ¬q) ∨ (¬p ∧ q) ∨ (¬p ∧ ¬q) = 0,
so there is no state in which we can pin down the status of both p and q.
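The following numerical sketch (my own, assuming a real two-dimensional space and numpy) realizes this computation, with subspaces stored as matrices whose columns span them; the meet is computed through orthogonal complements, using the fact that A ∩ B = (A⊥ + B⊥)⊥ for subspaces.

```python
# Joins, meets, and complements in the lattice of subspaces of R^2.
import numpy as np

def dim(A):
    return 0 if A.shape[1] == 0 else np.linalg.matrix_rank(A)

def join(A, B):
    return np.hstack([A, B])            # columns spanning the sum A + B

def comp(A):
    """Orthogonal complement of the column space of A."""
    if dim(A) == 0:
        return np.eye(A.shape[0])
    U, _, _ = np.linalg.svd(A)
    return U[:, dim(A):]                # remaining left singular vectors

def meet(A, B):
    return comp(join(comp(A), comp(B)))

p = np.array([[1.0], [0.0]])            # the x-axis
q = np.array([[1.0], [1.0]])            # the line y = x
not_p, not_q = comp(p), comp(q)         # ¬p and ¬q: the orthogonal lines

# (p ∨ ¬p) ∧ (q ∨ ¬q) is the whole space (dimension 2):
print(dim(meet(join(p, not_p), join(q, not_q))))        # 2

# but p ∧ q, p ∧ ¬q, ¬p ∧ q, ¬p ∧ ¬q are all the zero subspace,
# so their join is the zero subspace (dimension 0):
pieces = [meet(p, q), meet(p, not_q), meet(not_p, q), meet(not_p, not_q)]
print(dim(np.hstack(pieces)))                           # 0
```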
It is difficult to understand what this means, because it is natural to apply the
distributive law without thinking, and this difficulty is responsible for much of
the confusing nature of quantum mechanics. Fortunately, we do not have to use
quantum logic to understand quantum mechanics. There are two ways to approach
quantum mechanics: we can either think inside the quantum system and apply
quantum logic, or think outside the system and apply classical logic to reason about
the state space V . The second option implicitly deals with quantum logic in the
form of linear algebra, but it is much easier to take this approach. In principle, one
might hope to achieve deeper insight by training ones brain to use quantum logic
directly, but in practice that does not seem to be fruitful.
Quantum logic is far from an arbitrary complemented lattice. Although the
distributive law fails, it is not hard to prove a partial substitute, in which the
distributive law
p ∧ (q ∨ r) = (p ∧ q) ∨ (p ∧ r)
is replaced with the weaker version
p ≥ r  ⟹  p ∧ (q ∨ r) = (p ∧ q) ∨ r,
which is called the modular law. It says that a meet can be distributed over a join
whenever doing so would leave one of the terms in the join unchanged. The modular
law is particularly symmetrical, since its dual form
p ≤ r  ⟹  p ∨ (q ∧ r) = (p ∨ q) ∧ r
is exactly the same law. As we can see from the uncertainty principle, modularity is
a poor substitute for distributivity, but it is better than nothing.

6. Formal proofs
In this section, we will develop a notion of formal proof for tautologies. Of course,
it is straightforward to test whether a wff is a tautology using a truth table, so there
is no real need to use formal proofs. However, they are interesting in their own right
as well as good preparation for the much deeper topic of formal proofs in first-order
logic.
For simplicity, we will use just a subset of the propositional calculus, with
variables, parentheses, →, and ⊥. Everything else can be defined in terms of them,
according to the rows of the following table:

    ¬φ         (φ → ⊥)
    (φ ∧ ψ)    ¬(φ → ¬ψ)
    (φ ∨ ψ)    (¬φ → ψ)
    (φ ↔ ψ)    ((φ → ψ) ∧ (ψ → φ))
We will view ¬, ∧, ∨, and ↔ simply as abbreviations. Alternately, one could keep
them as part of the language, and have special rules for going back and forth between
the two forms listed above.
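Expanding these abbreviations is purely mechanical, and can be sketched in a few lines of code. In the snippet below (the nested-tuple representation of wffs and the name expand are my own conventions, not from the notes), every wff is rewritten so that it uses only variables, ⊥, and →.

```python
# Expand ¬, ∧, ∨, ↔ into the restricted language: a variable is a string,
# ⊥ is the string "⊥", and a connective is a tuple such as ("→", φ, ψ).
def expand(wff):
    """Rewrite a wff so that it uses only variables, ⊥, and →."""
    if isinstance(wff, str):                       # a variable or ⊥
        return wff
    op, args = wff[0], [expand(a) for a in wff[1:]]
    if op == "→":
        return ("→", args[0], args[1])
    if op == "¬":                                  # ¬φ = (φ → ⊥)
        return ("→", args[0], "⊥")
    if op == "∨":                                  # (φ ∨ ψ) = (¬φ → ψ)
        return ("→", ("→", args[0], "⊥"), args[1])
    if op == "∧":                                  # (φ ∧ ψ) = ¬(φ → ¬ψ)
        return ("→", ("→", args[0], ("→", args[1], "⊥")), "⊥")
    if op == "↔":                                  # ((φ → ψ) ∧ (ψ → φ))
        return expand(("∧", ("→", wff[1], wff[2]), ("→", wff[2], wff[1])))
    raise ValueError("unknown connective")

print(expand(("∨", "p", ("¬", "p"))))
# ('→', ('→', 'p', '⊥'), ('→', 'p', '⊥'))
```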
Our proof system is based on five axioms. Specifically, for all wffs φ, ψ, and χ,
each of the following wffs is an axiom:
(1) (φ → φ)
(2) (φ → (ψ → φ))
(3) ((φ → ψ) → ((φ → (ψ → χ)) → (φ → χ)))
(4) ((φ → ⊥) → (φ → ψ))
(5) (((φ → ⊥) → ⊥) → φ)
Note that these axioms are tautologies (they are true regardless of what φ, ψ, and
χ are). Axioms 1 and 3 are obvious properties of implication, as is 2 if one keeps
in mind that → is material implication. Axioms 4 and 5 are best understood by
reading (φ → ⊥) as ¬φ.
Axiom 5 is the law of the excluded middle: a proposition that is not false must
be true. Axioms 1 through 4 define what is known as intuitionistic logic.
It is natural to wonder where these axioms came from. What would lead someone
to choose them, as opposed to all the other true axioms one could choose instead?
The answer is that they come from reverse engineering proofs. If you try to prove
the results from this section, and add a new axiom every time you get stuck, you'll
end up with a perfectly serviceable list of axioms.
In addition to the axioms, our system has one rule of inference, called modus
ponens (Latin for the method of affirming or establishing). Modus ponens says that
if we have proved wffs φ and (φ → ψ), then we can deduce ψ.
Definition 6.1. Let φ be a wff and Γ a set of wffs. A formal proof of φ given Γ
is a finite sequence ψ₁, . . . , ψₙ of wffs with ψₙ = φ, such that each of them either
follows from two previous wffs in the sequence via modus ponens or is an axiom or
element of Γ. We write Γ ⊢ φ if there is a formal proof of φ given Γ, and we write
⊢ φ if there is a formal proof of φ given ∅.
Note that Γ ⊢ φ if and only if there is a finite subset Γ₀ of Γ such that Γ₀ ⊢ φ,
because every formal proof involves only finitely many wffs.

As an example of a formal proof, we will show that Axiom 1 is redundant, given
the other axioms. Let φ be any wff. Then the following five wffs are a formal proof
of (φ → φ) without using Axiom 1:
(1) (φ → (φ → φ))
(2) ((φ → (φ → φ)) → ((φ → ((φ → φ) → φ)) → (φ → φ)))
(3) ((φ → ((φ → φ) → φ)) → (φ → φ))
(4) (φ → ((φ → φ) → φ))
(5) (φ → φ)
Specifically, line 1 of the proof takes φ = φ and ψ = φ in Axiom 2. Line 2 takes
φ = φ, ψ = (φ → φ), and χ = φ in Axiom 3. Line 3 follows from modus ponens
applied to lines 1 and 2. Line 4 takes ψ = (φ → φ) and φ = φ in Axiom 2. Finally,
line 5 follows from modus ponens applied to lines 4 and 3.
As one can see from this example, formal proofs are typically cumbersome and
unenlightening. The purpose isn't to give intuitive insight, but rather to capture
the idea of provability in a formal system, and we will see that they accomplish this
goal.
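Because Definition 6.1 is purely mechanical, formal proofs can be checked by a short program. The following sketch (the tuple representation of wffs and all of the names are my own, not from the notes) recognizes instances of the five axiom schemas, checks modus ponens steps, and validates the five-line proof of (φ → φ) given above.

```python
# A mechanical checker for the proof system above: a variable is a string,
# ⊥ is the string "⊥", and (φ → ψ) is the tuple ("→", φ, ψ).
def imp(a, b):
    return ("→", a, b)

def parts(w):
    """Return (antecedent, consequent) if w is an implication, else None."""
    if isinstance(w, tuple) and len(w) == 3 and w[0] == "→":
        return w[1], w[2]
    return None

def is_axiom(w):
    top = parts(w)
    if top is None:
        return False
    a, b = top
    if a == b:                                       # Axiom 1: (φ → φ)
        return True
    pa, pb = parts(a), parts(b)
    if pb and pb[1] == a:                            # Axiom 2: (φ → (ψ → φ))
        return True
    if pa and pb:                                    # Axiom 3
        left, right = parts(pb[0]), parts(pb[1])
        inner = parts(left[1]) if left else None
        if (left and right and inner
                and pa[0] == left[0] == right[0]
                and pa[1] == inner[0] and inner[1] == right[1]):
            return True
    if pa and pb and pa[1] == "⊥" and pb[0] == pa[0]:    # Axiom 4
        return True
    if a == imp(imp(b, "⊥"), "⊥"):                       # Axiom 5
        return True
    return False

def check_proof(lines, hypotheses=()):
    for i, w in enumerate(lines):
        if is_axiom(w) or w in hypotheses:
            continue
        if not any(lines[k] == imp(lines[j], w)
                   for j in range(i) for k in range(i)):
            return False                     # line i is not justified
    return True

# The five-line proof of (φ → φ) from the text, with φ taken to be a variable p.
p = "p"
proof = [
    imp(p, imp(p, p)),                                        # Axiom 2
    imp(imp(p, imp(p, p)),
        imp(imp(p, imp(imp(p, p), p)), imp(p, p))),           # Axiom 3
    imp(imp(p, imp(imp(p, p), p)), imp(p, p)),                # modus ponens
    imp(p, imp(imp(p, p), p)),                                # Axiom 2
    imp(p, p),                                                # modus ponens
]
print(check_proof(proof))    # True
```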
Definition 6.2. A set Γ of wffs is inconsistent if Γ ⊢ ⊥, and consistent otherwise.
It is satisfiable if there is an assignment of truth values to the variables such that
every wff in Γ evaluates to ⊤ (such an assignment is called a satisfying assignment).
If we think of ⊥ as denoting a contradiction, then a set of wffs is consistent if and
only if it cannot prove a contradiction. For example, {φ, (φ → ⊥)} is inconsistent.
Lemma 6.3 (Soundness). If Γ ⊢ φ, then φ evaluates to ⊤ under every satisfying
assignment for Γ.
Proof. Consider any formal proof of φ given Γ. The axioms are tautologies, and the
elements of Γ are by assumption true under every satisfying assignment. Furthermore,
modus ponens preserves truth: if ψ and (ψ → χ) both evaluate to ⊤, then so does
χ. Thus, the conclusion φ of the proof must evaluate to ⊤. □
Lemma 6.4 (Deduction). If Γ ∪ {φ} ⊢ ψ, then Γ ⊢ (φ → ψ).
The intuition is obvious: a proof of ψ assuming φ as a hypothesis amounts to a
proof of (φ → ψ). However, it is far from obvious that our formal proof system has
this property. We prove below that it does. Note that in the proof, we are working
outside of our formal system to establish properties of the system.
Proof. Let ψ₁, . . . , ψₙ be a formal proof of ψ given Γ ∪ {φ}. We will turn the
sequence (φ → ψ₁), . . . , (φ → ψₙ) into a formal proof of (φ → ψ) given Γ, by
inserting some additional steps as justification.
Each step ψᵢ in the original proof is either φ, an element of Γ, an axiom, or the
result of modus ponens. If ψᵢ is φ, then (φ → ψᵢ) follows from Axiom 1. If ψᵢ is
an element of Γ or an axiom, then we justify (φ → ψᵢ) by inserting two additional
steps before it: first we assert ψᵢ itself (as an axiom or element of Γ), and then
(ψᵢ → (φ → ψᵢ)) via Axiom 2, before deducing (φ → ψᵢ) by modus ponens.
Finally, suppose j, k < i and ψₖ = (ψⱼ → ψᵢ), so ψᵢ follows from ψⱼ and
ψₖ by modus ponens. We must justify deducing (φ → ψᵢ) from (φ → ψⱼ) and
(φ → (ψⱼ → ψᵢ)). By Axiom 3, we can add
((φ → ψⱼ) → ((φ → (ψⱼ → ψᵢ)) → (φ → ψᵢ)))
to the proof, after which applying modus ponens twice lets us deduce ((φ → (ψⱼ →
ψᵢ)) → (φ → ψᵢ)) and then (φ → ψᵢ), as desired. □
The primary purpose of Axioms 1 through 3 is to justify the deduction lemma.
The lemma itself is an important tool for constructing formal proofs. For example,
suppose φ and ψ are wffs, and we want to show that
⊢ (φ → ((φ → ψ) → ψ)).
One can give a direct proof, but the deduction lemma makes it completely straight-
forward. Specifically, {φ, (φ → ψ)} ⊢ ψ has the three-step proof φ, (φ → ψ), ψ, and
now we can apply the deduction lemma twice, to conclude that {φ} ⊢ ((φ → ψ) → ψ)
and then ⊢ (φ → ((φ → ψ) → ψ)).
Lemma 6.5. Let φ be a wff and Γ a set of wffs.
(1) If Γ is consistent and Γ ⊢ φ, then Γ ∪ {φ} is consistent.
(2) If Γ ∪ {φ} is inconsistent, then Γ ⊢ (φ → ⊥); thus, if Γ is consistent and
Γ ∪ {φ} is inconsistent, then Γ ∪ {(φ → ⊥)} is consistent.
(3) If Γ ∪ {(φ → ⊥)} is inconsistent, then Γ ⊢ φ; thus, if Γ is consistent and
Γ ∪ {(φ → ⊥)} is inconsistent, then Γ ∪ {φ} is consistent.
Proof. For (1), if Γ ⊢ φ and Γ ∪ {φ} ⊢ ⊥, then consider any formal proof of ⊥ from
Γ ∪ {φ}. If φ appears in the proof, then one can insert a proof of it from Γ, and
this shows that Γ ⊢ ⊥.
For (2), since Γ ∪ {φ} ⊢ ⊥, the deduction lemma implies that Γ ⊢ (φ → ⊥), and
then (1) tells us that Γ ∪ {(φ → ⊥)} is consistent.
Finally, for (3), the deduction lemma implies that Γ ⊢ ((φ → ⊥) → ⊥). Combin-
ing this with Axiom 5 leads to Γ ⊢ φ, and thus Γ ∪ {φ} is consistent by (1). □
We can now prove the converse of soundness. It says that our proof system is
powerful enough to prove everything that could possibly be proved.
Theorem 6.6 (Completeness). If φ evaluates to ⊤ under every satisfying assign-
ment for Γ, then Γ ⊢ φ.
For example, φ is a tautology if and only if ⊢ φ (every assignment is a satisfying
assignment for the empty set).
Corollary 6.7. If a set of wffs is consistent, then it is satisfiable.
To see why the corollary follows from the completeness theorem, note that if Γ is
not satisfiable, then every satisfying assignment for Γ makes ⊥ true (because Γ has
no satisfying assignments), and thus Γ ⊢ ⊥. In fact, the theorem also follows from
the corollary: if φ evaluates to ⊤ under every satisfying assignment for Γ, then
Γ ∪ {(φ → ⊥)} is not satisfiable. By the corollary, it is inconsistent, so part (3) of
Lemma 6.5 implies that Γ ⊢ φ.
Thus, it suffices to prove the corollary. To do so, we must somehow use consistency
to produce a satisfying assignment. There are usually many satisfying assignments
for a given set of wffs (if it is consistent), which makes it difficult to single one out.
We will get around this difficulty by making the consistent set as large as possible,
so that it will have a unique satisfying assignment, which can easily be described.
Proof. Suppose Γ is a consistent set of wffs. We begin by finding a maximal
consistent set of wffs containing Γ. Specifically, there are only countably many
wffs, so we can number them φ₁, φ₂, . . . . Let Γ₀ = Γ, and for each i ≥ 1, let
Γᵢ = Γᵢ₋₁ ∪ {φᵢ} if Γᵢ₋₁ ∪ {φᵢ} is consistent, and let Γᵢ = Γᵢ₋₁ ∪ {(φᵢ → ⊥)}
otherwise. By part (2) of Lemma 6.5, Γᵢ is consistent for all i.
Now let Γ′ = Γ₀ ∪ Γ₁ ∪ Γ₂ ∪ · · · . This set is also consistent, since if Γ′ ⊢ ⊥, then some
finite subset of Γ′ also proves ⊥. Each element of the finite subset is in Γᵢ for some
i, and because the sets Γᵢ are nested, the entire finite subset is in one of them.
However, that contradicts the consistency of Γᵢ.
The set Γ′ is a maximal consistent set of wffs, because for each wff φ, either
φ ∈ Γ′ or (φ → ⊥) ∈ Γ′ (while they cannot both be in any consistent set). If φ is
consistent with Γ′, then φ ∈ Γ′.
Now we can define an assignment of truth values as follows. For each variable x,
if x ∈ Γ′ then we set x to be ⊤, and if (x → ⊥) ∈ Γ′ then we set x to be ⊥. Let
ev(φ) denote the evaluation of φ under this assignment.
We will prove by induction that ev(φ) = ⊤ if and only if φ ∈ Γ′. Thus, because
Γ ⊆ Γ′, we have found a satisfying assignment for Γ.
We begin with the cases when φ is a variable or ⊥. By definition, each variable
x satisfies ev(x) = ⊤ iff x ∈ Γ′. Furthermore, ev(⊥) = ⊥, which corresponds with
⊥ ∉ Γ′ (since Γ′ is consistent). Thus, all we need to verify is that the equivalence
between ev(φ) = ⊤ and φ ∈ Γ′ holds when φ = (ψ → χ), assuming it holds for ψ
and χ.
Specifically, we must prove that φ ∈ Γ′ when ev(χ) = ⊤ or ev(ψ) = ⊥, and that
φ ∉ Γ′ when ev(χ) = ⊥ and ev(ψ) = ⊤. In terms of membership in Γ′, we must
prove that if χ ∈ Γ′ or (ψ → ⊥) ∈ Γ′, then (ψ → χ) ∈ Γ′, while if (χ → ⊥) ∈ Γ′
and ψ ∈ Γ′, then (ψ → χ) ∉ Γ′.
If χ ∈ Γ′, then Γ′ ⊢ (ψ → χ) by Axiom 2 and modus ponens. Thus, (ψ → χ) is
consistent with Γ′ by part (1) of Lemma 6.5, so (ψ → χ) ∈ Γ′.
If (ψ → ⊥) ∈ Γ′, then we can use the same argument with Axiom 4 instead of
Axiom 2.
Finally, if (χ → ⊥) ∈ Γ′ and ψ ∈ Γ′, then (ψ → χ) cannot be in Γ′, since if it
were then Γ′ would be inconsistent. □
