Вы находитесь на странице: 1из 265

Discrete Structures: Spring 2016

Ganesh Gopalakrishnan
April 9, 2016

Contents
0 Course Introduction

1 Propositional Logic, Boolean Gates


1.1 Introduction to Logic
1.2 Basic Truth Values and Truth Tables
1.2.1 Truth Values
1.2.2 Formal Propositions
1.2.3 Truth Tables
1.3 Exercises
1.3.1 Basics
1.3.2 Evaluation of Boolean Functions
1.3.3 Swapping
1.3.4 Clearing memory
1.3.5 Gate Realization
1.3.6 Mux-based Circuit Realization
1.4 A Glossary of Symbols and Terminology
1.5 Lecture Outline


2 Propositional (Boolean) Identities
2.1 Boolean Identities
2.1.1 Example: Logical Equivalence via a Truth-table
2.2 Personality, Tautology, Contradiction
2.2.1 Properties of Truth Tables and Personalities
2.2.2 The Number of Boolean Functions over N Inputs
2.2.3 The Number of Non-Equivalent Assertions
2.2.4 Significance of Universal Gates
2.2.5 Tautologies, Contradictions
2.3 DeMorgan's Laws, Propositional Identities

2.3.1 Illustrations
2.4 Proofs via Equivalences
2.4.1 Equivalence Proofs as If-and-only-if Proofs
2.5 Exercises
2.5.1 Propositional Identities
2.5.2 Simplifying the Staircase Light Example
2.5.3 Simplifying Assertions
2.5.4 Tautology or Contradiction or Neither?
2.5.5 Number of Boolean Concepts
2.5.6 Negating Implication
2.5.7 DeMorgan's Law
2.5.8 Mux-based Realization
2.6 Lecture Outline

3 Propositional (Boolean) Proofs
3.1 Inference Rules
3.1.1 A Collection of Rules of Inference
3.2 Examples of Direct Proofs
3.3 Examples of Proofs by Contradiction
3.4 Exercises
3.5 Lecture Outline


4 Binary Decision Diagrams
4.1 BDD Basics
4.1.1 BDD Guarantees
4.1.2 BDD-based Comparator for Different Variable Orderings
4.1.3 BDDs for Common Circuits
4.1.4 A Little Bit of History
4.2 Checking Proofs using BDDs
4.2.1 Checking a Correct Direct Proof
4.2.2 Checking an Incorrect Direct Proof
4.2.3 Checking a Correct Proof by Contradiction
4.2.4 Checking an Incorrect Proof by Contradiction
4.3 Exercises
4.4 Lecture Outline

5 Addendum to Chapters
5.1 Books to Purchase
5.2 Operator Precedences
5.2.1 Example
5.2.2 Another Example
5.3 Gate Realizations
5.4 Insights Into Logical Equivalences
5.4.1 Jumping Around Implications (NEW)
5.4.2 Telescoping Antenna Rule (NEW)
5.5 Muxes
5.6 Glossary of Formal Definitions


6 Notes on BDDs as Mux21 Circuits
6.1 A Magnitude Comparator

7 Intuitive Description of Topics

8 Sets
8.1 All of Mathematics Stems from Sets
8.2 Characteristic Vector, Powerset
8.3 Special Sets in Mathematics
8.4 Approaches to Define Sets
8.4.1 PYTHON EXECUTION
8.5 Operations on Sets
8.5.1 Cardinality or Size
8.6 Operations on Sets
8.7 Venn Diagrams
8.7.1 Details of Venn Diagrams
8.8 Set Identities
8.8.1 Connection between Operators in Logic and Sets
8.8.2 Python Illustration of Set/Logic Connection
8.8.3 Formal Proofs of Set Identities
8.8.4 Checking the Proofs Using Python
8.9 Cartesian Product and Powerset
8.9.1 Cartesian Product
8.9.2 Cardinality of a Cartesian Product
8.9.3 Powerset
8.9.4 Application: Electoral Maps

9 Predicate Logic
9.1 Predicates and Predicate Expressions
9.2 Examples
9.3 Illustrating Nested Quantifiers
9.4 Primes Fixed


10 Combinatorics
10.1 Permutations versus Combinations
10.1.1 Delta vs. Southwest Airlines: Ticket Sales
10.1.2 Properties of Permutations
10.1.3 Combinations as Ways to Set Lucky Bits
10.2 Recursive Formulation of Combinations
10.3 Examples: Permutations and Combinations
10.3.1 Birthday Problem
10.3.2 A Variant of the Birthday Problem
10.3.3 Hanging Colored Socks
10.4 Binomial Theorem
10.5 Combinatorics Concepts via Python Code
10.5.1 Permutations
10.5.2 Factorial
10.5.3 Combinations
10.5.4 Combinations
10.5.5 Birthday Conjecture


11 Probability
11.1 Probability
11.1.1 Unconditional and Conditional Probability
11.1.2 Unconditional Probability
11.1.3 A Collection of Examples
11.2 Conditional Probability
11.2.1 Conditional Probability Basics
11.2.2 Derivation of Bayes' Theorem
11.2.3 Law of Total Probability
11.2.4 Patient Testing: Bayes' Theorem
11.2.5 More Examples on Independence and Dependence
11.3 Advanced Examples
11.3.1 New England Patriots
11.3.2 Independence, and how it allows the Product Rule


11.3.3 Independence is Symmetric
11.3.4 New England Patriots Game
12 Functions, Relations, Infinite Sets
12.1 Overview of Functions and Relations
12.2 Overview of Functions
12.2.1 Example Function: Mapping (0, 1] to [1, ∞)
12.2.2 Example Function: Map Q to N
12.2.3 Example Function: Map N to N × N
12.2.4 Inverse of a function
12.2.5 Composition of Functions
12.2.6 Example Functional Relation: Map Faculty to Ranks
12.3 Overview of Relations
12.3.1 Example Relation: Map Faculty to Committees
12.3.2 Example Relation: The inverse of a non 1-1 function
12.3.3 Inverse of a relation
12.3.4 Composition of Binary Relations
12.4 Functions in Depth
12.4.1 Examples of Functions
12.4.2 Correspondences, Invertibility, and Tarzan Proofs
12.4.3 Gödel Hashes
12.5 Infinite Sets, Cardinalities
12.5.1 Matching up the sizes of infinite sets
12.5.2 Cantor-Schröder-Bernstein Theorem
12.6 Cantor's Diagonalization Proof
13 Classifying Relations
13.1 Why Classify Relations?
13.1.1 Andrew Hodges' Definitions for Types of Relations
13.1.2 Preorder (reflexive plus transitive)
13.1.3 Partial order (preorder plus antisymmetric)
13.1.4 Total order, and related notions
13.1.5 Relational Inverse
13.1.6 Equivalence (Preorder plus Symmetry)
13.1.7 Equivalence class
13.1.8 Reflexive and transitive closure

14 Review of Functions and Relations
14.1 Gödel Hashing
14.2 Relations and Functions
14.3 Invertibility of Functions
14.4 Pigeon-hole Theorem, Finite Domains
14.5 Correspondences Between Infinite Sets


15 Induction
15.1 Basic Idea Behind Induction
15.1.1 First Incorrect Pattern for Induction
15.1.2 Correct Pattern for Induction
15.1.3 Induction: Basis Case and Step Case
15.2 A Template for Writing Induction Proofs
15.3 Examples
15.3.1 Series Summation Problems-1
15.3.2 Series Summation Problems-2
15.3.3 Series Summation Problems-3
15.3.4 Series Summation Problems-4
15.3.5 Proving an Inequality-1
15.3.6 Proving an Inequality-2
15.3.7 Proving an Inequality-3
15.3.8 Sequence Summation Needing TWO Basis Cases
15.3.9 Riffle Shuffles
15.4 Proof by induction of the Fundamental Theorem of Arithmetic
15.5 Failing to Prove by Induction: Strengthening


Chapter 0
Course Introduction


Module 1

Chapter 1
Propositional Logic, Boolean Gates

1.1 Introduction to Logic

The purpose of this chapter is to give you the vocabulary for stating facts and non-facts (truths and falsehoods) and for manipulating them. This idea originated with George Boole who, in 1854, published his book Laws of Thought, where he introduced some of the fundamental ideas behind calculating using truths. Independently, logicians had been exploring these ideas since well before Christ. The culmination of their work can be distilled into two closely inter-related topics: propositional logic and Boolean algebra.

Today, propositional logic underlies all of the mathematical proofs and derivations we do. Boolean algebra is central to the design of the hardware that powers all kinds of cool devices, beginning with smartphones. Circuits are also used to model computational problems and study their complexity. The study of how biological brains work, and of how to model human thought using neural networks, also relies on propositional logic and Boolean algebra. In short, the material in this chapter is central to everything we do in computing! We will now introduce the subject matter step by step, going through basic definitions, examples, and problems.
Declarative Statements and Truth Values: We often make declarative statements that may be true (often written as 1) or false (often written as 0). Examples (from Huth and Ryan) are below, along with comments on the truth status of each declarative statement:

- The sum of the numbers 3 and 5 equals 8 (true).
- Jane reacted violently to Jack's accusations (true/false).
- Every even natural number above 2 is the sum of two prime numbers (appears true to the extent checked; this is known as Goldbach's conjecture, and is an open question).
- All Martians like pepperoni on their pizza (true/false, but it is highly unlikely that this is a fully defined statement; are there Martians? Is "like" a concept that applies to them? etc.).
- Every number above 1 can be written as a unique product of primes (true; this is known as the fundamental theorem of arithmetic; note that we avoid 1 because it is not a prime).

There are also statements that are not declarative; examples (from Huth and Ryan):

- Could you please pass me the salt?
- Ready, steady, go!
- May fortune come your way.

These statements do not have a truth status, and we avoid considering such statements any further.
Combining Truths: Given two truths, one can derive new truths. The familiar operators involved in this process are and (written ∧), or (written ∨), and not (written ¬). For example,

true ∧ false = false,
true ∧ true = true,
true ∨ false = true,
false ∨ false = false,
¬false = true,
¬true = false.

Using 1 and 0, we can re-express the above identities:

1 ∧ 0 = 0,
1 ∧ 1 = 1,
1 ∨ 0 = 1,
0 ∨ 0 = 0,
¬0 = 1,
¬1 = 0.
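These identities can be checked directly in Python, where the built-in operators and, or, and not act on the truth values True and False (which behave as 1 and 0). A small sketch:

```python
# Check the "and" identities: true AND false = false, true AND true = true.
assert (True and False) == False
assert (True and True) == True

# Check the "or" identities: true OR false = true, false OR false = false.
assert (True or False) == True
assert (False or False) == False

# Check the "not" identities: NOT false = true, NOT true = false.
assert (not False) == True
assert (not True) == False

print("all identities hold")
```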


Practical usage: conditionals in programs: Both propositional logic and Boolean algebra underlie almost all aspects of computer science. When you write conditional statements in your code

if ((x == 0) and (y < 0)) or (z > w):
    ...do something...
else:
    ...do something else...

you are using ideas based on propositional logic (and Boolean algebra). The operators and and or are Boolean functions (or propositional operators/connectives), and the relations (<, >, and ==) are built up using Boolean functions acting on bits in computer words.

It must be intuitively clear that the else part will be executed when the following condition is true:

((x != 0) or (y >= 0)) and (z <= w)

Notice how the given condition changes when we negate it:

The condition ((x == 0) and (y < 0)) or (z > w), when negated, becomes this:

((x != 0) or (y >= 0)) and (z <= w)

It is very important to be sure that such conclusions, when drawn through painstaking manual calculations, are correct. Otherwise, one will end up debugging a program incorrectly, not covering all of its feasible branches correctly. In this example, the conversion was achieved using the so-called DeMorgan's laws, which we shall study in Chapter 2. We shall be studying many more such laws in this and subsequent chapters.
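One way to gain confidence in such a manual negation is to compare the two conditions over many sample values. A brute-force sketch (the value ranges are ours, chosen only for illustration):

```python
from itertools import product

def original(x, y, z, w):
    return ((x == 0) and (y < 0)) or (z > w)

def negated(x, y, z, w):
    return ((x != 0) or (y >= 0)) and (z <= w)

# The negated form must disagree with the original on every input,
# since it is supposed to be its logical complement.
for x, y, z, w in product(range(-2, 3), repeat=4):
    assert negated(x, y, z, w) == (not original(x, y, z, w))

print("negation verified on all sampled inputs")
```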
Practical usage: Writing proofs: Suppose we have these propositions (examples from Huth and Ryan):

p: Gold is a metal
q: Silver is a metal

It must be possible to infer p ∧ q, but not p ∧ ¬q.

Proofs are chains of reasoning steps going from existing (or given) truths to new truths. That is, proofs are valid implication chains. It must not be possible to prove something that's false, given only true assertions. We will ensure that this cannot happen by employing only good (sound) proof rules. Here are additional examples:

- It must be possible to prove p ∨ q from p, even though we know that these formulae are not equivalent. However, p ∨ q is a weaker assertion, and it should be possible to infer it from p (a strong assertion).
- It must be impossible to prove p from p ∨ q. These formulae are also not equivalent; but we know that p ∨ q is a weaker assertion, and from it, we should not be able to draw a strong conclusion, such as p.

Practical Usage: Designing Circuits: Consider a dark staircase with two switches, one at either end, called a and b. Most staircases follow the following logic:

- Initially, let us say that both a = 0 and b = 0 (both switches are off). Before one walks onto the staircase, one turns one switch on (say, a = 1), thus illuminating the stair. One then turns the light off at the other end by flicking the other switch (b = 1).
- The protocol repeats when another person wants to enter from the b side, setting b = 0, and then switching the light off by a = 0.

Thus, if a Boolean function F controls the staircase light, then it is easy to see that F(a, b) = (a ≠ b). That is, when a and b are unequally set, the light is on. Later in this chapter, we shall learn that the ≠ is realized through the XOR function, which is really the ≠ operator for Booleans.

Now consider a master override switch m being brought into the picture. The idea is that if m = 1, the light is turned on, and nothing else matters. Now, the whole function becomes

F(a, b, m) = (m ∨ (¬m ∧ (a ≠ b)))

This is the same as

F(a, b, m) = (m ∨ (¬m ∧ (a ⊕ b)))

In Chapter 2, we shall learn that the above function can be simplified to the following:

F(a, b, m) = (m ∨ (a ⊕ b))

The laws of Boolean algebra that allow this simplification are also introduced there. We now embark on a systematic study of Boolean reasoning aided by our examples.
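The claimed simplification can be checked exhaustively, since there are only eight input combinations. A quick sketch, with ≠ standing in for XOR on Booleans:

```python
from itertools import product

def full(a, b, m):
    # m OR (NOT m AND (a XOR b)) -- the unsimplified staircase function
    return m or ((not m) and (a != b))

def simplified(a, b, m):
    # m OR (a XOR b) -- the simplified form
    return m or (a != b)

# Enumerate all 2^3 input combinations and compare.
for a, b, m in product([False, True], repeat=3):
    assert full(a, b, m) == simplified(a, b, m)

print("F(a, b, m) simplification verified on all 8 inputs")
```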

1.2 Basic Truth Values and Truth Tables

1.2.1 Truth Values

Figure 1.1: A switch and LED represented in TkGate

We now explain our ideas in the context of circuits, using a circuit simulator called TkGate (see Figure 1.1). Here, the Boolean values or truth values are generally represented by 0 (off, or False as in Python) and 1 (on, or True as in Python).

It is indeed remarkable that the Boolean reasoning or proofs that we can carry out using paper and pencil can also be mechanized using circuits such as the one shown here. Computers can be thought of as engines that process billions of propositional logical deductions per second. By doing so, they are able to extract a nice song from your Flash Drive and play it. Remarkable, isn't it?

Note: In a course devoted to Logic, one might introduce the propositional concepts first, and only then show you the gates. In a much more tightly scheduled course such as this, we plan to freely mix these ideas. In fact, we find that many students benefit from these alternative views being inter-mixed. Gates are also much more visual, further helping you ground your knowledge in a timely way.
Propositional Operators or Functions? We can view operators such as ∧ (and) either as builders of longer (more complex) propositions or as functions. What are functions? Functions are simply black boxes into which values walk in and new values (results) walk out. An amplifier is a function into which a small signal walks in and a piece of loud music walks out. A mouse walking into an amplifier emerges as an elephant.

Functions have other nice properties also. Thus, if you feed 1 and 0 into an and gate (and function), it must not walk out as a 0 result sometimes, and a 1 result at other times. In other words, for one input, there can't be more than one output. However, many different inputs can result in a single output. An ∧ (and) function sends all of these input pairs to a 0 output: (0, 0), (0, 1), and (1, 0).

In the rest of this chapter, we shall view our Boolean operators both as propositional formula builders as well as Boolean functions.

1.2.2 Formal Propositions

Formal propositions, otherwise also known as propositional formulae, are expressions defined as follows:

- Propositional variables (usually single letters such as a or b or x or y) are formal propositions. A letter s may stand for "I am smart."
- Formulae such as a ∧ b, a ∨ b, and ¬a are also formal propositions. In general, if p, p1, and p2 are propositions, so are p1 ∧ p2, p1 ∨ p2, and ¬p.

1.2.3 Truth Tables

The truth value of formal propositions is calculated based on the truth of the propositional variables. We display this truth using a truth table. We now provide truth tables for some common functions. This is then followed by an example.
Common Functions, Universal elements

There is a set of fundamental Boolean functions that are well-known and that get used frequently. In this section we will introduce these functions and their truth tables. Familiarity with these functions, and understanding why their truth tables are as they are, will help tremendously in developing strong intuitions in Boolean logic and Boolean algebra. The functions we will cover in this section are not, and, or, if-then, if-and-only-if, xor, nor, and nand. One may ask why we need this many Boolean operators. One may also ask what is the absolute minimum set of primitives that one can get away with. Such minimal sets are termed universal. A universal set could have a single function (or gate) type. A universal set could also have more than one function (or gate) type. Here are our answers, with examples:


- We provide multiple operators for convenience.
- Some of the operators (e.g., nor) often have more efficient and direct circuit realizations than others such as and. The fact that and is more popular does not mean that it also has a more efficient circuit realization.
- Some of these operators are also universal by themselves. For instance, nor is universal: having just nor, we can build all known gate types.
- Function (gate) nand is also universal.
- Function and, by itself, is not universal. However, the set {and, not} is universal.
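The claim that nor is universal can be illustrated concretely. A sketch building not, or, and and out of nor alone (the helper names are ours):

```python
def nor(x, y):
    # nor: true exactly when neither operand is true
    return not (x or y)

def not_(x):
    # NOT from nor: feed the same input to both nor inputs
    return nor(x, x)

def or_(x, y):
    # OR from nor: nor followed by a nor-based inverter
    return nor(nor(x, y), nor(x, y))

def and_(x, y):
    # AND from nor: invert both inputs, then nor them (DeMorgan)
    return nor(nor(x, x), nor(y, y))

# Verify against Python's built-in connectives on all inputs.
for x in (False, True):
    for y in (False, True):
        assert not_(x) == (not x)
        assert or_(x, y) == (x or y)
        assert and_(x, y) == (x and y)

print("not, or, and and all realized using nor alone")
```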
We now introduce various gate types. More detailed discussions on universality will be presented in subsequent sections.
not

not is the only unary operator we will study in this section. This simply means that it operates on one Boolean statement (or one propositional input) instead of two. The definition of not is straightforward and as one would expect: applying not to any operand will invert its truth value.

Note that not may be represented with any of the following symbols: ¬, !, ~. In addition, not x can also be written as x̄ (x with an overbar). The truth table for not is:

x   ¬x
0   1
1   0

Points to note:
- Please refer to Figure 1.5, which summarizes the behavior of not or ¬. It also shows a gate rendering of the not-gate. Gates are circuit realizations of Boolean functions.
- I hope you can believe that not is not a universal gate (think of how to build an and gate using not gates, for instance!).
- If you feed x as input, and if x = 0, the output will be ¬x, or 1.

- It should be clear that ¬¬x = x, because double-negations cancel each other.

Personality: We introduce the notion of personality as a way to summarize the entire output of a truth-table. For the not function, assuming that we first list x = 0 and then list x = 1, the personality is the sequence 10. We shall give additional examples of personality in this and subsequent chapters.
A note on the Personality of Boolean Functions

Assuming that we enumerate the inputs to a Boolean function in a standard way, the personality of the function completely determines its behavior. We will employ this idea (of a personality) for many purposes; for instance:

- Establishing the logical equivalence of two functions is done by ensuring that their personalities agree (for the same input listing order, the outputs agree).
- When we simplify a Boolean function, the simplified function must also have the same personality.
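The notion of a personality is easy to mechanize. A sketch (the helper name personality is ours, not from the text):

```python
from itertools import product

def personality(f, n):
    """Personality of an n-input Boolean function: the outputs read off
    in the standard input listing order (00...0 up through 11...1)."""
    bits = []
    for inputs in product([0, 1], repeat=n):
        bits.append(str(int(f(*inputs))))
    return "".join(bits)

# Two functions are logically equivalent exactly when their
# personalities agree under the same input listing order.
print(personality(lambda x: not x, 1))        # 10
print(personality(lambda x, y: x and y, 2))   # 0001
```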
and

and statements are true only when both operands are true. If either of the operands is false, then the whole statement is false. Like not, the formal meaning of and is intuitive.

Note that and may be represented with either of ∧, ·. One often omits the symbol, writing ab instead of a ∧ b.

The truth table for and is:

x   y   x ∧ y
0   0   0
0   1   0
1   0   0
1   1   1

Points to note:
- and is not a universal gate (think of how you might realize a not gate using and, and see if you succeed).
- If you feed x and x as the two inputs, the output will be x ∧ x = x.
- If you feed x and ¬x as the two inputs (or vice versa), the output will be x ∧ ¬x = 0.
- If you feed x and 0 as the two inputs (or vice versa), the output will be x ∧ 0 = 0.
- If you feed x and 1 as the two inputs (or vice versa), the output will be x ∧ 1 = x.
- The personality of and is 0001; that is, going by the standard listing order of the inputs x, y through 00, 01, 10, 11, the outputs generated are 0, 0, 0, 1 respectively (or, in other words, we read the whole personality out as 0001).
or

or statements are true when at least one of the operands is true. An or statement is only false when both of its operands are false. Note that this definition of or is different from the notion of or wherein only one of the two options can be true. For example, if somebody tells you that you can have soup or salad, typically they mean that you may have one or the other but not both. This second meaning of or will be defined later in this section via an operator called xor.

The or operation may be represented with either of ∨, +. The truth table for or is as follows (and its definition does allow you to have both soup and salad):

x   y   x ∨ y
0   0   0
0   1   1
1   0   1
1   1   1

Points to note:
- or is not a universal gate.
- If you feed x and x as the two inputs, the output will be x ∨ x = x.
- If you feed x and ¬x as the two inputs (or vice versa), the output will be x ∨ ¬x = 1.
- If you feed x and 0 as the two inputs (or vice versa), the output will be x ∨ 0 = x.
- If you feed x and 1 as the two inputs (or vice versa), the output will be x ∨ 1 = 1.
- The personality of or is 0111.
if-then (or implication)

if-then statements are true when the first operand is false or the second operand is true. An if-then statement is only false when the first operand is true and the second operand is false. if-then statements may also be referred to as implications: "if x then y" is equivalent to "x implies y."

An if-then statement is made up of two parts, the antecedent and the consequent. The antecedent is the first statement of the implication, the piece that does the implying. The consequent is the second statement and is what is implied by the antecedent. In the statement "if x then y," x is the antecedent and y is the consequent. Note that sometimes the antecedent is also called the premise, and the consequent the conclusion.
There is some subtlety to the definition of if-then that should be addressed. It can be puzzling to try and work out why an implication is always true when the antecedent is false. We will attempt to make this clear via a simple example. Take the statement, "If it is sunny then I will ride my bicycle to class." Clearly, if it is sunny and I ride my bicycle to class, then the statement is true. Conversely, if it is sunny and I don't ride my bicycle, then the statement is false. Consider the case when it is not sunny and I ride to class anyhow. I have not violated any terms of the original statement; therefore it is still true. Likewise if it is not sunny and I do not ride to class: I made no promise under such circumstances, and so my original claim remains true. This is how we arrive at the truth values for implication.

Central Role in Proofs: Implication is the central concept underlying mathematical proofs. All proofs consist of implying new facts from existing facts. It is therefore important to keep examining your understanding of the concept of implication till you are sure about it.

While it is somewhat uncommon to view implication as a gate, there is no issue with this at all; in fact, Figure 1.5 shows you how to notate implication as a gate. Figure 1.2 further illustrates how this gate works.

Figure 1.2: An Implication Gate: s → i1, producing a 0 output when s = 1 is applied at the inverting input (bubble) and i1 = 0 at the other input

if-then may be represented with either of →, ⇒.

The truth table for if-then is:

x   y   x → y
0   0   1
0   1   1
1   0   0
1   1   1

Points to note:
- The inputs of and (∧) and or (∨) are interchangeable. For implication, this is not the case. That is, x → y is not the same as y → x.
- You will notice that the implication gate x → y can be replaced by ¬x ∨ y, a circuit realized using an or gate and a not gate.
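That last point can be checked exhaustively. A quick sketch comparing the truth-table definition of implication with its or/not realization (the helper name implies_ is ours):

```python
def implies_(x, y):
    # Truth-table definition: x -> y is false only when x is true and y is false.
    return not (x and not y)

# The or-gate plus not-gate realization gives the same function:
for x in (False, True):
    for y in (False, True):
        assert implies_(x, y) == ((not x) or y)

print("x -> y  ==  (not x) or y, for all inputs")
```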

- Later on, we will show you that by just having an implication gate, we can build any desired gate! That is, implication is universal. Think how to build various gates using implication:
  - How does one build an inverter, given an implication gate?
  - How does one build an OR gate, given an implication gate?
  - Now, how does one build a NOR gate?
- If you feed x and x as the two inputs, the output will be x → x = 1.
- If you feed ¬x and x as the two inputs (in that order), the output will be ¬x → x = x.
- If you feed x and 0 as the two inputs (in that order), the output will be x → 0 = ¬x.
- If you feed x and 1 as the two inputs (in that order), the output will be x → 1 = 1.
- The personality of implication (→) is 1101.

if-and-only-if (or bi-implication)

if-and-only-if statements are true when the first operand has the same
truth value as the second operand.
- if-and-only-if is frequently abbreviated "iff."
- It may also be referred to as a bi-implication. This alternate name is telling and hints at the true nature of iff statements. Namely, the statement "x iff y" is true exactly when (x → y) ∧ (y → x), i.e. when x implies y and y implies x.
- if-and-only-if may be represented with either of (↔, ≡).
The truth table for if-and-only-if is:

x  y  x ↔ y
0  0    1
0  1    0
1  0    0
1  1    1

Points to note:
- The inputs of bi-implication (↔) are interchangeable.
- You will notice that bi-implication behaves like = (equality).
- Bi-implication is not universal. In a later chapter, we will learn how to prove this, but for now, think of ways to realize not and and using bi-implication (i.e., ↔), and see if/when you succeed.
- If you feed x and x as the two inputs, the output will be 1.
- If you feed x and ¬x as the two inputs (or vice-versa), the output will be 0.
- If you feed x and 0 as the two inputs (or vice-versa), the output will be ¬x.
- If you feed x and 1 as the two inputs (or vice-versa), the output will be x.
- The personality of bi-implication (↔) is 1001.
xor

xor (exclusive or) statements are true when exactly one of the operands is
true. Recall the soup-or-salad example given above. If you are asked whether
you want soup or salad, the usual implication is that you may have one or
the other but not both. The definition of xor is similar: the statement is true
if one of the operands or the other is true, but not both.¹

xor is represented with ⊕.


The truth table for xor is:

x  y  x ⊕ y
0  0    0
0  1    1
1  0    1
1  1    0

Points to note:
- The inputs of xor are interchangeable.
- You will notice that xor behaves like ≠ (inequality).
- xor is not universal. In a later chapter, we will learn how to prove this, but for now, think of ways to realize not and and using xor, and see if/when you succeed.
- If you feed x and x as the two inputs, the output will be 0.
- If you feed x and ¬x as the two inputs (or vice-versa), the output will be 1.
- If you feed x and 0 as the two inputs (or vice-versa), the output will be x.
- If you feed x and 1 as the two inputs (or vice-versa), the output will be ¬x.
- The personality of xor (⊕) is 0110.

¹ Despite the apparently less generous nature of xor in terms of not allowing soup and salad, it plays a fundamental role in Computer Science.

nor

nor statements are true only when both the left and right operands are false.
nor is true exactly when or is false, and vice versa.
Symbolically, x nor y is the same as !(x + y). nor is usually just represented
as "nor."
The truth table for nor is:

x  y  x nor y
0  0     1
0  1     0
1  0     0
1  1     0

Points to note:
- The inputs of nor are interchangeable.
- nor is universal.
- If you feed x and x as the two inputs, the output will be ¬x.
- If you feed x and ¬x as the two inputs (or vice-versa), the output will be 0.
- If you feed x and 0 as the two inputs (or vice-versa), the output will be ¬x.
- If you feed x and 1 as the two inputs (or vice-versa), the output will be 0.
- The personality of nor is 1000.


nand

nand statements are true when the left operand and the right operand are
not both true. Similarly to nor, nand is true exactly when and is false. Symbolically, x nand y is equivalent to !(x.y). nand is typically represented simply
as "nand."

The truth table for nand is:

x  y  x nand y
0  0     1
0  1     1
1  0     1
1  1     0

Points to note:
- The inputs of nand are interchangeable.
- nand is universal.
- If you feed x and x as the two inputs, the output will be ¬x.
- If you feed x and ¬x as the two inputs (or vice-versa), the output will be 1.
- If you feed x and 0 as the two inputs (or vice-versa), the output will be 1.
- If you feed x and 1 as the two inputs (or vice-versa), the output will be ¬x.
- The personality of nand is 1110.
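The personalities quoted in this section can all be computed mechanically. The sketch below (the helper name personality and the lambda encodings are our own) prints each connective's personality in the standard 00, 01, 10, 11 input order:

```python
# Compute the personality (output column in the standard 00, 01, 10, 11
# input order) of each two-input connective discussed in this section.
def personality(f):
    return ''.join(str(f(x, y)) for x in (0, 1) for y in (0, 1))

gates = {
    'implies': lambda x, y: int((not x) or y),
    'iff':     lambda x, y: int(x == y),
    'xor':     lambda x, y: x ^ y,
    'nor':     lambda x, y: int(not (x or y)),
    'nand':    lambda x, y: int(not (x and y)),
}

for name, f in gates.items():
    print(name, personality(f))
# implies 1101, iff 1001, xor 0110, nor 1000, nand 1110
```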

1.3  Exercises

1.3.1  Basics

- Negate ((x != 0) or (y <= 0)) and (z >= w)
- Negate ((x != 0) and (y <= 0) and (z >= w))


1.3.2  Evaluation of Boolean Functions

- Simplify the Boolean function

  (a (a b)) (a (a b))

  for all four settings of the a, b pair. That is, set a = 0, b = 0 and simplify the whole formula; then set a = 0, b = 1, and then a = 1, b = 0 and finally a = 1, b = 1.

- Evaluate (a ∧ b ∧ c ∧ p) → (q ∨ r) for all possible values of the variables a, b, c, p, q, r. Simplify your answer by grouping cases; example: when any one of a, b, c in the antecedent of the formula is false, the whole formula evaluates to true. Otherwise, ...

- A new gate is to be introduced. Its personality is 0010. Is it one of the gates seen so far? If not, give it a convenient name (say "Foo" for now). Is Foo a universal gate?

1.3.3  Swapping

In the program given below, ^ is the XOR operator in Python. We find that
no matter which two numbers we start with for a and b, the program ends
up swapping the values of these variables. Explain why.
Hint: Show that swapping works when a and b are just one-bit Boolean
variables. Then extend your reasoning to more general examples.
Python 3.4.3 (default, Mar 10 2015, 14:53:35)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.56)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 234
>>> b = 442
>>> a = a ^ b
>>> b = a ^ b
>>> a = a ^ b
>>> a
442
>>> b
234
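Packaging the transcript's three assignments as a function (a helper of our own, not part of the original session) makes the hint easy to follow: verify the one-bit cases, then observe that ^ acts independently on every bit position of an integer:

```python
# The three XOR assignments from the transcript, as a function.
def xor_swap(a, b):
    a = a ^ b
    b = a ^ b  # now b == original a, since (a0 ^ b0) ^ b0 == a0
    a = a ^ b  # now a == original b, since (a0 ^ b0) ^ a0 == b0
    return a, b

# The one-bit cases are enough: ^ works bit position by bit position.
assert all(xor_swap(a, b) == (b, a) for a in (0, 1) for b in (0, 1))
assert xor_swap(234, 442) == (442, 234)  # the transcript's example
```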

1.3.4  Clearing memory

In many programs, programmers clear a word of computer memory by XORing that word with itself. Describe in one sentence why this approach works.
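A quick sanity check of the claim on a few sample word values (not the one-sentence answer itself, just evidence for it):

```python
# x XOR x is 0 in every bit position, so XOR-ing a word with itself clears it.
for x in (0, 1, 234, 442, 2**31 - 1):
    assert x ^ x == 0
print("x ^ x == 0 for all sampled words")
```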

1.3.5  Gate Realization

In the table below, you are given certain implementation challenges. Either
write "realizable" and then show how to realize the said gate using the given
gates, or write "unrealizable" and then briefly justify why not. You may
employ more than one instance of a given gate type to realize the challenge
gate type.

Using these gate(s)      Realize
And                      Or
Or                       And
And                      Not
Not                      And
Nor                      Not
Nor                      Or
Nor                      And
Nor                      Bi-implication
Nand                     Not
Nand                     And
Nand                     Or
Nand                     XOR
Nand                     Bi-implication
XOR                      Not
XOR                      Bi-implication
XOR                      And
XOR                      Or
XOR, And                 Or
Implication              Not
Implication              And
Implication              Or
Implication              Bi-implication
Bi-implication           Not
Bi-implication           And
Bi-implication           Or
Bi-implication, And      Or
Bi-implication           Implication
Solution: We will solve selected examples below.


XOR using Nand: We are not seeking the best solution (often measured in
terms of the fewest gates; however, that is not the only measure of goodness).
We are interested only in realizing the function correctly.
First of all, you will be able to build an or gate (the expression x ∨ y)
using nand gates by
1. inverting x,
2. inverting y,
3. feeding the results into a nand gate to obtain ¬(¬x ∧ ¬y). You can check
that this amounts to x ∨ y. This last step uses DeMorgan's law,
explained in Chapter 2.
And using XOR: This will be shown to be impossible. Reason:
1. XOR can realize inversion.
2. If it could also realize And, then we could thereafter build anything (that is,
{And, Not} is a universal set).
3. But this contradicts the fact that XOR is not universal (if we could build inversion and And, "we can build anything" would follow, which is impossible since XOR is not universal).
4. Thus, XOR cannot build an And gate.
Show how to realize an OR gate using Bi-implication and And.
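The impossibility argument for "And using XOR" can also be confirmed by brute force. The sketch below uses an encoding of our own: a signal is represented by its 4-bit personality over (x, y) = 00, 01, 10, 11, and wiring two signals into an XOR gate is a pointwise XOR of personalities. We close the set {x, y, 0, 1} under XOR and check whether AND ever appears:

```python
# Brute-force check that no circuit of XOR gates over inputs x, y and the
# constants 0 and 1 computes AND.
X    = (0, 0, 1, 1)   # personality of the input x
Y    = (0, 1, 0, 1)   # personality of the input y
ZERO = (0, 0, 0, 0)
ONE  = (1, 1, 1, 1)
AND  = (0, 0, 0, 1)

reachable = {X, Y, ZERO, ONE}
while True:
    new = {tuple(a ^ b for a, b in zip(f, g))
           for f in reachable for g in reachable} - reachable
    if not new:
        break
    reachable |= new

print(len(reachable))    # only 8 of the 16 two-input functions are reachable
print(AND in reachable)  # False: XOR circuits cannot realize AND
```

The 8 reachable functions are exactly the "affine" ones; AND, OR, NAND and NOR are all missing, which is why XOR is not universal.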

1.3.6  Mux-based Circuit Realization

A gate called multiplexor (or "mux") is available. It is a three-input gate with
the selector being labeled s and the inputs being labeled i0 and i1. These
are standard input names: i0 denotes "here is the input that gets
copied to the output when the selector s is a 0." Likewise, i1 denotes "here is
the input that gets copied to the output when the selector s is a 1." Its truth
table is as given below. Its circuit behavior is shown in Figure 1.3, where the
select input is at the side of the trapezium and the inputs i0 and i1 are at the
longer parallel side of the trapezium, clearly labeled. Notice that when the
select input's switch is at the "off" position, the i0 input is faithfully copied
to the output.


s  i0  i1  mux(s, i0, i1)
0  0   0         0
0  0   1         0
0  1   0         1
0  1   1         1
1  0   0         0
1  0   1         1
1  1   0         0
1  1   1         1

Figure 1.3: A 2-to-1 multiplexor


Exercise: Realize an Implication gate using a multiplexor.
Solution: Think of a multiplexor as something that steers inputs along a
tree. Figure 1.4(a) and (b) show how, based on the select input, a mux can
be viewed as something that steers its inputs up the tree. For instance, in
Figure 1.4(b), if s = 0, the output will be obtained by picking the left-hand
side input, which is i0, and if s = 1, the output will be obtained by picking
the right-hand side input, which is i1. This idea can be extended to any tree
depth, as shown in Figure 1.4(c). This tree depicts a 3-mux circuit.
Key Insight: Now who is steering the inputs? It is the inputs x and y.
What are the tree inputs that are being steered? Well, it is the personality
of the implication gate! In other words,
- Place any personality at the leaves.
- The bits in the personality appear at the tree root when x, y are varied in the standard order 00, 01, 10, 11.
- We now see that this is a generalized method for realizing any 2-input gate.
- By growing the tree even deeper, we can realize 3-input functions, 4-input functions, etc.
- This is how field-programmable gate arrays work! They are "malleable" gates: by programming bit-patterns at the leaves (stored in suitable flip-flops), they can be programmed to be any gate at all!

Figure 1.4: Mux21-Based Realization of the Implication Gate: (a) A Mux21 (b)
An abstract depiction of a Mux21 as a steering circuit (c) Three Mux21s composed into
a Steering Tree. Note that in the steering tree, all the muxes at a given level
receive the same steering input. Thus for x = 1, y = 0, the first level of the
steering tree selects the right branch of the tree. Both second-level muxes select the left branch of the steering tree. In the second level, only the second
mux from the left matters: it couples with the selection at the first level,
producing a final output of 0. That is, the 0 walks up the second level and
the first level.


The realization of the gate in Figure 1.4 can be written in text as

mux21(x, mux21(y,1,1), mux21(y,0,1))
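This realization is easy to check in Python (a sketch; mux21 below is modeled directly from the truth table above):

```python
# A 2-to-1 mux, and the implication realization written above.
def mux21(s, i0, i1):
    return i1 if s else i0

def imp_via_mux(x, y):
    return mux21(x, mux21(y, 1, 1), mux21(y, 0, 1))

# Compare against the implication gate on all four input rows.
for x in (0, 1):
    for y in (0, 1):
        assert imp_via_mux(x, y) == int((not x) or y)
print("mux realization matches the implication gate")
```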


- Realize a 2-input NAND gate using Mux21s, and write its design in the format
  mux21(x, mux21(y,?,?), mux21(y,?,?)).
- Think of another 2-input gate (besides the Foo gate) that is universal. Realize it using Mux21s. Describe its design in the format
  mux21(x, mux21(y,?,?), mux21(y,?,?)).

1.4  A Glossary of Symbols and Terminology

Given the compressed nature of our lectures, it is, unfortunately, necessary
to talk about concepts from proofs (propositional logic) and concepts from
Boolean functions (Boolean algebra) in one setting. Thus, we end up introducing many notations that mean the same thing. Examples:
- Here are three ways in which we have captured negation: !x, ¬x, x̄
- Here are two ways in which we have captured conjunction: x ∧ y, x.y
- Here are two ways in which we have captured disjunction: x + y, x ∨ y
Figure 1.5 helps summarize all these variants for easy reference.


Quantity  Name         Variant  Other Variant(s)  English      Examples
Value     Zero         0        False, false      Off          0 or False
Value     One          1        True, true        On           1 or True
Function  And          ∧        .                 Conjunction  x ∧ y, x.y
Function  Or           ∨        +                 Disjunction  x ∨ y, x + y
Function  Not          ¬        !                 Negation     !x, ¬y, ȳ
Function  Implication  →        If-Then           Implication  x → y, if x then y
Function  XOR          ⊕        ≠                 Inequality   x ⊕ y, x ≠ y, x ≢ y
Function  XNOR         ↔        =                 Equality     x ↔ y, x = y, x ≡ y, x ⇔ y

Function     Gate Icon
And          inputs on the left
Nand         inputs on the left
Or           inputs on the left
Not          input on the left
Implication  i on left, s is beneath
XOR          inputs on the left
XNOR         inputs on the left

Figure 1.5: Different Syntaxes as well as Gate Icons for Boolean Functions

1.5  Lecture Outline

A typical lecture covering this chapter may go through the following topics:
- A brief history of Boole, Shannon
- Uses of Propositional Logic and Boolean Algebra (and what is the difference between these terms)
- Declarative and non-declarative statements
- How to invert a conditional such as ((x == 0) and (y < 0)) or (z > w)
- Staircase switch: governing logic expressed in terms of m, a and b
- Formal Propositions using ∧, ∨, ¬ and → (which one can we leave out?)
- Gates, personalities, which operators/gates are universal (simple argument by trying to create the {and, not} set or the {or, not} set)
- Swapping using XOR
- Clearing a word by XOR-ing with itself
- Realize one gate type using a collection of other gate types (e.g., try building XOR using Nand, then AND using Implication)
- Realize any 2-input gate type using Mux21s (try a few). Write the answer as
  mux21(x, mux21(y,?,?), mux21(y,?,?)).
- The key role played by Muxes as the fundamental element behind programmable logic, which finds growing usage in computing


Chapter 2

Propositional (Boolean) Identities

2.1  Boolean Identities

Chapter 1 introduced the basics of Boolean propositions. We also discussed
how to view Boolean operators as circuit elements (gates).
In this chapter, we shall learn techniques to manipulate Boolean expressions (statements in propositional logic). Here are some of the specific techniques to be studied:
- Often we will have the need to show that a Boolean expression B1 and another expression B2 are equivalent. This is really akin to claiming that x + x = 2x in arithmetic: both these expressions will evaluate to the same numeric answer. We will define an idea similar to "evaluating to the same numeric answer" for Boolean expressions. This idea is called logical equivalence in the parlance of Boolean expressions. The idea is quite simple: two Boolean expressions are logically equivalent if their personalities are the same.
- Another way in which two Boolean expressions can be shown to be equivalent is through a standard collection of Boolean identities. These are akin to using identities of the form (a^m)^n = a^(mn) in the context of natural numbers. Our work will be based on a table of Boolean identities. Some of these identities occur so frequently that we will give them specific names and will practice their usage. These include:
  - DeMorgan's law

a  b  c  ab  bc  c.!a  LHS = ab + bc + c.!a  RHS = ab + c.!a
0  0  0   0   0    0            0                    0
0  0  1   0   0    1            1                    1
0  1  0   0   0    0            0                    0
0  1  1   0   1    1            1                    1
1  0  0   0   0    0            0                    0
1  0  1   0   0    0            0                    0
1  1  0   1   0    0            1                    1
1  1  1   1   1    0            1                    1

Figure 2.1: Conjecture ab + bc + c.!a ≡ ab + c.!a shown through a Truth Table


A given statement and its contrapositive form

2.1.1  Example: Logical Equivalence via a Truth-table

We prove the logical equivalence of two formulae LHS and RHS by employing the truth-table in Figure 2.1. This is achieved by showing that the
personalities of LHS and RHS are the same.
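The same truth-table proof can be replayed in a few lines of Python (a sketch; the products are encoded with & and |, and a negated variable as 1 - a, following our reading of the table's third product term as c.!a):

```python
# Check the conjecture of Figure 2.1 by brute force over all 8 rows:
# the consensus term b.c is absorbed by a.b + c.!a.
from itertools import product

for a, b, c in product((0, 1), repeat=3):
    lhs = (a & b) | (b & c) | (c & (1 - a))
    rhs = (a & b) | (c & (1 - a))
    assert lhs == rhs
print("a.b + b.c + c.!a  ==  a.b + c.!a  on all 8 rows")
```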

2.2  Personality, Tautology, Contradiction

As already stated several times, we refer to the entire column of a Boolean
function (formal proposition) as its personality. I did not find a convenient
term for this concept; thus, I borrowed a term that is used in the synthesis
of programmable logic arrays (PLAs), namely "personality." It is a very
appropriate term, because the essence of a Boolean function is captured by
its personality. For example, the personality of

(a ∧ b) ∨ (b ∧ c) ∨ (c ∧ ¬a)

is

01010011

and this matches the personality of

(a ∧ b) ∨ (c ∧ ¬a)

thus allowing us to show, in one fell swoop, that these propositional forms (or
Boolean functions) are logically equivalent. When two personalities match,


the functions (or propositions) in question are found to generate identical


truth values for every input.
The personality of course depends on the order in which we enumerate
the truth values of the variables, but we will always enumerate in a fixed way. Three
variables a, b, c will be listed as follows:

0,0,0  0,0,1  0,1,0  0,1,1  1,0,0  1,0,1  1,1,0  1,1,1.

This is the same order generated by a car's odometer if someone left the
0 alone, and painted over 1 through 9 as one big "1" sector. This standard
enumeration order will be assumed throughout this book (unless otherwise
stated).
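This standard enumeration order is exactly what Python's itertools.product produces, which is convenient for the brute-force checks used later in this chapter:

```python
# The standard ("odometer") enumeration order for three variables.
from itertools import product

rows = list(product((0, 1), repeat=3))
print(rows)  # (0,0,0), (0,0,1), (0,1,0), ..., (1,1,1) in odometer order
```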

2.2.1  Properties of Truth Tables and Personalities

In a truth-table of N Boolean variables, there will be 2^N rows. This is obvious because there are two settings per variable and the settings for one
variable do not depend on those for another. Thus, we have 2 × 2 × ... × 2 = 2^N
possible combinations (rows) for an N-variable truth-table. We will refer
to this number by R in what follows.
Now, for each of the rows of a truth-table, a personality has to produce a
0 or a 1. Then, it is clear that there are 2^R possible personalities, given an
R-row truth-table. Plugging in the value of R, we now surmise that

There are 2^(2^N) possible personalities that one can encounter, given
any N-variable Boolean function (N-variable propositional formula).

2.2.2  The number of Boolean Functions over N inputs

Any Boolean function F over N inputs is written F(x1, x2, ..., xN). For example, one-input Boolean functions are written F(x1), two-input functions
are written F(x1, x2), and so on (the variable names are of course arbitrary).
These are called functions because, given an input combination, they spell
out a unique output. For example, nand is a function where nand(0, 0) = 1
whereas or is a function where or(0, 0) = 0. This difference shows up in the
0,0 position of the personalities of nand and or.
Given all this, it is clear that there are this many possible functions over
a particular number of inputs:


x   z = 0 (Constant 0)   z = x (Identity)   z = !x (Inverter)   z = 1 (Constant 1)
0           0                    0                  1                    1
1           0                    1                  0                    1

Figure 2.2: All possible 1-input Boolean Functions

(x, y):                      00  01  10  11
Constant 0                    0   0   0   0
z = x.y (AND)                 0   0   0   1
z = x.!y                      0   0   1   0
z = x                         0   0   1   1
z = !x.y                      0   1   0   0
z = y                         0   1   0   1
z = !x.y + x.!y (XOR)         0   1   1   0
z = x + y (OR)                0   1   1   1
z = !(x + y) (NOR)            1   0   0   0
z = x.y + !x.!y (XNOR, =)     1   0   0   1
z = !y                        1   0   1   0
z = x + !y                    1   0   1   1
z = !x                        1   1   0   0
z = !x + y (IMPLICATION)      1   1   0   1
z = !(x.y) (NAND)             1   1   1   0
Constant 1                    1   1   1   1

Figure 2.3: All possible 2-input Boolean Functions (each row lists a function's
outputs z for the inputs (x, y) = 00, 01, 10, 11, i.e., its personality)


There are 2^(2^1) = 4 possible functions of one input. The inversion function is just one of these, with personality 10. The other three personalities are 00, 01 and 11. Figure 2.2 lists all these functions and personalities.
There are 2^(2^2) = 16 2-input gate types (of the kind shown in Figure 2.3).
Continuing this way, there are:
- 256 3-input functions,
- 65,536 4-input functions,
- 4,294,967,296 5-input functions (or, over 4 billion).
These numbers get pretty large: about 1.8 × 10^19 6-input gate types (or 6-input functions), 3 × 10^38 7-input gate types, 10^77 8-input gate types, 10^154 9-input gate
types, and 10^308 10-input gate types.
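These counts follow directly from the 2^(2^N) formula, which a one-line helper (our own) can tabulate:

```python
# The number of N-input Boolean functions is 2**(2**N).
def num_functions(n):
    return 2 ** (2 ** n)

print([num_functions(n) for n in range(1, 6)])
# [4, 16, 256, 65536, 4294967296]
```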

2.2.3  The Number of Non-Equivalent Assertions

In this section, we will describe an approach to calculate the number of non-equivalent assertions expressible over N inputs. This result will also re-use
our derivation of the number of Boolean functions over N inputs presented
in Section 2.2.2.
Let us begin our discussion with N = 3 Boolean variables. If we are
given propositional variables a, b, c, how many non-equivalent propositional
assertions can be expressed over them? a could model "I am smart" while b
could model "I studied CS 2100" and c could model "I did well in all exams."
In this case, we can have all these combinations:
- Assertion 1: ¬a ∧ ¬b ∧ ¬c (Not Smart, Didn't Study 2100, Didn't Ace Exams)
- Assertion 2: ¬a ∧ ¬b ∧ c (Not Smart, Didn't Study 2100, Aced Exams)
- ...
- Assertion 8: a ∧ b ∧ c (Smart, Studied 2100, Aced Exams)
Now, you may think that you have exhausted all propositional assertions
over 3 variables. Let us look at the personalities we have generated in the
above listing (Figure 2.4):
It is clear that we did express eight distinct propositional assertions over
three Boolean variables. But did we express all assertions? What about this
assertion:
- Assertion 9: ¬(a ∧ b ∧ c) (NOT THE CASE THAT (Smart and Studied 2100 and Aced Exams))


a  b  c  Assertion 1  Assertion 2  ...  Assertion 8
0  0  0       1            0       ...      0
0  0  1       0            1       ...      0
0  1  0       0            0       ...      0
0  1  1       0            0       ...      0
1  0  0       0            0       ...      0
1  0  1       0            0       ...      0
1  1  0       0            0       ...      0
1  1  1       0            0       ...      1

Figure 2.4: Eight of the 256 possible Propositional Assertions Expressed
over Three Variables
a  b  c  Assertion 1  Assertion 2  ...  Assertion 8  Assertion 9
0  0  0       1            0       ...      0            1
0  0  1       0            1       ...      0            1
0  1  0       0            0       ...      0            1
0  1  1       0            0       ...      0            1
1  0  0       0            0       ...      0            1
1  0  1       0            0       ...      0            1
1  1  0       0            0       ...      0            1
1  1  1       0            0       ...      1            0

Figure 2.5: Aha, a ninth assertion was missed!


Clearly, it was not expressed in Figure 2.4, as evidenced by Figure 2.5, which
includes this new assertion (Assertion 9) as a new column. This column
has a 1 whenever one of the variables a, b, c is false. That is, we set a 1
whenever you are Not Smart OR you Haven't Taken CS 2100 OR you Did
Not Ace Exams.
Proceeding this way, you can see that there are 256 distinct assertions
that can be expressed over 3 propositional variables! Each new assertion
(non-equivalent assertion) is obtained by setting the column with a different
personality. The column has 8 entries, and hence we can set the column in
256 different ways (256 personalities). Some additional assertions that can
be formed are the following (we give the personalities):
- 11000000: This assertion amounts to (¬a ∧ ¬b ∧ ¬c) ∨ (¬a ∧ ¬b ∧ c). In
English, it reads
(Not Smart AND Not Taken CS 2100 AND Not Aced Exams)
OR (Not Smart AND Not Taken CS 2100 AND Aced Exams).
You will realize that this can be simplified to "Not Smart AND


Not Taken CS 2100." In Section 2.3, we will present the law
of Boolean Algebra (Propositional Logic) that allows you to
make this simplification.
- 00000000, i.e., false. This is an extreme assertion which asserts false,
ignoring all of a, b, c. In a sense, this resembles the following situation:
Instructor: "Give me a function that maps a natural number to another natural number."
You: "Take a natural number x, return x + 1."
A Smart Aleck: "Take a natural number x, return 0."
The assertion false is equivalent to the Smart Aleck's answer: ignore
all given variables and return a constant. This too is a perfectly acceptable answer (albeit a trivial example that wasn't explicitly ruled out).
- 11111111, i.e., true. This is also an extreme assertion which asserts
true, ignoring all of a, b, c.
There are many, many more assertions in the mix of 256 assertions.
But the point is that you cannot make any more than these 256 assertions over 3 variables.

2.2.4  Significance of Universal Gates

Clearly, a manufacturer can ill afford to build separate gate types (function
types) for each of these Boolean functions! By merely manufacturing universal gate types, the manufacturer can, instead, let the user realize any one of
these desired Boolean functions.
The same goes for Propositional Logic: we can't provide one operator for
each Boolean assertion. Thus logicians give you a complete set (such as
(¬, ∧) or (¬, ∨)), or sometimes something more than a complete set just for
some useful redundancy (such as (¬, ∧, ∨)), and then let you express all
possible propositional assertions using them!

2.2.5  Tautologies, Contradictions

In this section, we will consider 3-variable (or 3-input) Boolean functions


for the sake of simplicity. However, our discussions apply equally well to
functions with any number of inputs. What conclusion can be drawn if the
personality for some 3-variable function is all zeros (i.e., 00000000)? It


a  b  c  ab  bc  c.!a  LHS = ab + bc + c.!a  RHS = ab + c.!a  LHS ↔ RHS
0  0  0   0   0    0            0                    0            1
0  0  1   0   0    1            1                    1            1
0  1  0   0   0    0            0                    0            1
0  1  1   0   1    1            1                    1            1
1  0  0   0   0    0            0                    0            1
1  0  1   0   0    0            0                    0            1
1  1  0   1   0    0            1                    1            1
1  1  1   1   1    0            1                    1            1

Figure 2.6: Conjecture ab + bc + c.!a ≡ ab + c.!a shown through a Truth Table.
This table is identical to that given in Figure 2.1 except for adding the last
column LHS ↔ RHS, which is a tautology
is then clear that such a function is never true (it cannot be made true for
any input-variable setting). Such Boolean functions (or Boolean expressions,
or propositional formulae) are known as contradictions. Now, how about a
personality that is all 1s (i.e., 11111111)? Such functions are always
true, and are known as tautologies.
Here are some examples of tautologies, contradictions, and formulae that
are neither:
- a → (a ∨ b) is a tautology.
- a ∧ ¬a is a contradiction.
- (a ∨ b) ∧ (a ∨ ¬b) ∧ (¬a ∨ b) ∧ (¬a ∨ ¬b) is a contradiction. Suppose we pick
a = 1, b = 0. It is then clear that (¬a ∨ b) will be false (0), thus making
the whole formula false. Try the other three value assignments to
convince yourselves that this formula is a contradiction.
- (a ∨ b) ∧ a is neither a tautology nor a contradiction: for b = 1, it can
be made either true or false depending on whether a = 1 or a = 0 (respectively).
Let us modify our earlier example and obtain a new formula LHS ↔ RHS. This formula can be shown to be a tautology, as shown by its personality being all 1s, as illustrated in Figure 2.6.
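The personality-based definitions translate directly into a small classifier (a sketch with helper names of our own; formulas are passed as 0/1-valued functions):

```python
# Classify a formula as a tautology, contradiction, or neither by
# inspecting its personality over all input rows.
from itertools import product

def classify(f, nvars=2):
    outs = [f(*row) for row in product((0, 1), repeat=nvars)]
    if all(outs):
        return 'tautology'
    if not any(outs):
        return 'contradiction'
    return 'neither'

print(classify(lambda a, b: int((not a) or (a or b))))  # a -> (a or b)
print(classify(lambda a, b: a & (1 - a)))               # a and not a
print(classify(lambda a, b: int((a or b) and a)))       # (a or b) and a
```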

2.3  DeMorgan's Laws, Propositional Identities

Boolean identities help us simplify propositional forms (or Boolean expressions) as well as circuits built out of gates. We list a collection of identities
that prove useful in practice. We express these identities as equalities (=).
We will first list a whole set of identities below, but will later present a small
useful set in a neat tabular format:
- (x → y) = !x + y: Note that we denote negation by ! or ¬, and or by +.
- (x ⊕ y) = !x.y + x.!y: This expansion helps explain why ⊕ behaves like the Boolean ≠ operator.
- (x ↔ y) = x.y + !x.!y: This explains why ↔ behaves like the Boolean equality operator.
- x + (y + z) = (x + y) + z, Associativity of +
- x.(y.z) = (x.y).z, Associativity of .
- x + y = y + x, Commutativity of +
- x.y = y.x, Commutativity of .
- x.(y + z) = (x.y) + (x.z), Distributivity of . over +
- x + (y.z) = (x + y).(x + z), Distributivity of + over .
- x + 0 = x, Identity for +
- x.1 = x, Identity for .
- x + x = x, Idempotence of +
- x.x = x, Idempotence of .
- x.(x + y) = x, Absorption 1
- x + (x.y) = x, Absorption 2
- x + 1 = 1, Annihilator for +
- x.0 = 0, Annihilator for .
- x.!x = 0, Complementation 1
- x + !x = 1, Complementation 2



- !!x = x, Double negation
- !(x + y) = !x.!y, De Morgan 1
- !(x.y) = !x + !y, De Morgan 2
- (x → y) = (!y → !x), Contrapositive
- x + !x.y = x + y, Implied Negation in Disjunct

Commonly Used Identities: Here is a summary of commonly used Boolean
identities, using a syntax that may be preferred in your exams. Notice that
¬ binds more tightly than ∧, and also that these both bind more tightly than
∨. In fact, the precedence of the operators follows this order (tightest first):

¬, ∧, ∨, →

We shall remind you of other aspects of precedence, as well as use parentheses when in doubt.

Or-distribution:            (p ∨ (q ∧ r)) ≡ ((p ∨ q) ∧ (p ∨ r))
And-distribution:           (p ∨ q) ∧ r ≡ (p ∧ r) ∨ (q ∧ r)
And-commutation:            p ∧ q ≡ q ∧ p
Or-commutation:             p ∨ q ≡ q ∨ p
Negation:                   p ∧ ¬p ≡ False
Contrapositive:             p → q ≡ ¬q → ¬p
Negating Implication:       ¬(p → q) ≡ (p ∧ ¬q)
Implied Negation in Disjunction:  p ∨ (¬p ∧ q) ≡ p ∨ q
DeMorgan:                   ¬(p ∧ q) ≡ (¬p ∨ ¬q)
Complementation 1:          (p ∧ ¬p) ≡ 0
Complementation 2:          (p ∨ ¬p) ≡ 1

Using Commutation along with distribution: You may be surprised
that we gave only one Or-distribution rule, namely

(p ∨ (q ∧ r)) ≡ ((p ∨ q) ∧ (p ∨ r)).

You may have expected another rule

((q ∧ r) ∨ p) ≡ ((q ∨ p) ∧ (r ∨ p)).

We avoid introducing these additional distribution rules, because we can
always apply the given commutation rules and turn things around. Hopefully this detail will be apparent from context.
Propositional Equivalences (alternate syntax): The same equivalences
in our (more circuit-oriented) alternate syntax are as follows (keeping in
mind that . binds more tightly than +; also, we often omit the .):

Or-distribution:            p + q.r ≡ (p + q).(p + r)
And-distribution:           (p + q).r ≡ p.r + q.r
And-commutation:            p.q ≡ q.p
Or-commutation:             p + q ≡ q + p
Negation:                   p.!p ≡ False
Contrapositive:             (p → q) ≡ (!q → !p)
Negating Implication:       !(p → q) ≡ (p.!q)
Implied Negation in Disjunction:  p + !p.q ≡ p + q
DeMorgan:                   !(p.q) ≡ (!p + !q)
Complementation 1:          x.!x = 0
Complementation 2:          x + !x = 1
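Since every variable ranges only over {0, 1}, any of these identities can be verified exhaustively. A sketch (our own 0/1 encoding, with !p written as 1 - p and p → q as !p + q):

```python
# Exhaustively verify a few of the listed identities over all 0/1 values.
from itertools import product

IMP = lambda p, q: (1 - p) | q   # p -> q, encoded as !p + q

for p, q, r in product((0, 1), repeat=3):
    # DeMorgan: !(p.q) == !p + !q
    assert (1 - (p & q)) == ((1 - p) | (1 - q))
    # Implied Negation in Disjunction: p + !p.q == p + q
    assert (p | ((1 - p) & q)) == (p | q)
    # Or-distribution: p + q.r == (p + q).(p + r)
    assert (p | (q & r)) == ((p | q) & (p | r))
    # Contrapositive: p -> q == !q -> !p
    assert IMP(p, q) == IMP(1 - q, 1 - p)
print("all sampled identities hold on every row")
```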


2.3.1  Illustrations

Simplification Rules for Nand, Nor, XOR, →: Now, let us derive some
rules specific to Nand, Nor, XOR, and →. In all these proof rules, we can
read = the same as ↔ or ≡.
- nand(0, x) = nand(y, 0) = 1, for any x and y: for a Nand, a 0 forces a 1.
- nor(1, x) = nor(y, 1) = 0, for any x and y: for a Nor, a 1 forces a 0.
- 1 ⊕ x = x ⊕ 1 = !x
- 0 ⊕ x = x ⊕ 0 = x
- x ⊕ x = 0
- x → x = 1
- (0 → x) = 1
- (1 → x) = x
Simplification of Assertions: Let us simplify the assertion
(¬a ∧ ¬b ∧ ¬c) ∨ (¬a ∧ ¬b ∧ c).
We will work with the more readable syntax of propositions using ., + and
! for negation:

!a.!b.!c + !a.!b.c

We present our simplifications along with a comment:

!a.!b.!c + !a.!b.c
(!a.!b).(!c + c)        Using And-distribution
(!a.!b).1               Using Complementation 2
(!a.!b)                 Using Identity.

This explains the simplification presented in Section 2.2.3, namely

(Not Smart AND Not Taken CS 2100 AND Not Aced Exams) OR
(Not Smart AND Not Taken CS 2100 AND Aced Exams).


being simplified to
Not Smart AND Not Taken CS 2100.
Simplification of Assertions: A Second Example. Let us consider the
following assertion:

a.b.c + a.b.!c + a.!b.c

We present our simplifications along with a comment:

a.b.c + a.b.!c + a.!b.c
a.b.c + a.b.!c + a.b.c + a.!b.c     Using Idempotence (to repeat a summand)
a.b.(c + !c) + a.c.(b + !b)         Using And-distribution twice
(a.b).1 + (a.c).1                   Using two applications of Complementation 2
a.b + a.c                           Using Identity twice

There is a method based on Karnaugh maps that makes such simplifications
much more intuitive. These techniques are taught in advanced classes on
digital design.
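The final result can be double-checked by brute force over all eight rows (a sketch; negation is encoded as 1 - v):

```python
# Verify: a.b.c + a.b.!c + a.!b.c  ==  a.b + a.c  on every row.
from itertools import product

for a, b, c in product((0, 1), repeat=3):
    lhs = (a & b & c) | (a & b & (1 - c)) | (a & (1 - b) & c)
    rhs = (a & b) | (a & c)
    assert lhs == rhs
print("a.b.c + a.b.!c + a.!b.c  ==  a.b + a.c")
```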

2.4  Proofs via Equivalences

Suppose we are asked to prove that

(a + b).(c + d) ≡ ac + bc + ad + bd

We can achieve it through the following steps. We first assert the left-hand
side, namely (a + b).(c + d), as a premise. If there is more than one premise,
we number them P1, P2, etc. We tag the proof goal as G; in our case, it is
ac + bc + ad + bd. Then we string together a chain of equivalences, listing the consequences
(or conclusions) of the original premise (or premises) as C1, C2, etc. Here is
our proof for the above equivalence:
P: (a + b).(c + d)
C1: (a + b).c + (a + b).d, using And-distribution with respect to P.
C2: (ac + bc) + (ad + bd), using And-distribution twice with respect to C1.
= G.
We see that the goal has been achieved.
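The proved equivalence can also be confirmed by exhaustive evaluation over all 16 assignments (a sketch using 0/1 bit operations):

```python
# Confirm (a + b).(c + d) == a.c + b.c + a.d + b.d on every assignment.
from itertools import product

for a, b, c, d in product((0, 1), repeat=4):
    assert ((a | b) & (c | d)) == ((a & c) | (b & c) | (a & d) | (b & d))
print("(a + b).(c + d)  ==  a.c + b.c + a.d + b.d")
```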


2.4.1  Equivalence Proofs as If-and-only-if Proofs

The equivalence proof (a + b).(c + d) ≡ ac + bc + ad + bd in fact ended up establishing the equivalence chain

P ≡ C1 ≡ C2 ≡ G

thus showing that G follows from P, as well as that P follows from G. Equivalence proofs are "if and only if" proofs. Thus, what we have shown is that
(a + b).(c + d) if and only if ac + bc + ad + bd.

2.5  Exercises

2.5.1  Propositional Identities

All propositional identities express the equality of two Boolean assertions.
For example,
- In DeMorgan's law, ¬(p ∧ q) and (¬p ∨ ¬q) are logically equivalent.
- In the Contrapositive law, p → q and ¬q → ¬p are logically equivalent.
Using truth tables, show that these identities are tautologies (i.e., are of the
form F1 ≡ F2, where F1 ↔ F2 evaluates to 1 for all values of the variables).

2.5.2  Simplifying the Staircase Light Example

We obtained the formula for the staircase light function as:

F(a, b, m) = (¬m ∨ (m ∧ (a ⊕ b)))

Using the rule Implied Negation in Disjunction, we can simplify it to

F(a, b, m) = (¬m ∨ (a ⊕ b))

where we eliminate the negation that is implied. We don't need to say "either p, or not p and q"; we can simply say "either p or q."
Show that this simplification holds true (i.e., the original and the new formula are logically equivalent).

2.5.3 Simplifying Assertions

Suppose a models "Smart", b models "Studied 2100", and c models "Did Exams Well". Simplify these assertions, showing the rules of Boolean algebra used in each simplification step. If a formula cannot be simplified, state why.
1. (Smart and Studied 2100 and Did Exams Well)
OR
(Smart and NOT(Studied 2100) and NOT(Did Exams Well))

2. (Smart and Studied 2100 and NOT(Did Exams Well))
OR
(Smart and NOT(Studied 2100) and Did Exams Well)

2.5.4 Tautology or Contradiction or Neither?

Classify these formulae into tautologies, contradictions, or neither:

1. a.b + a.!b + !a.b + !a.!b
2. a.b + a.!b + !a.b
3. (a + b).(a + !b).(!a + b).(!a + !b)
4. (a + b).(a + !b).(!a + b)
5. (a → (b → c)) ≡ ((a ∧ b) → c)
6. (a ∧ b ∧ c ∧ d) → (a ∨ b ∨ c ∨ d)
7. (a ∨ b ∨ c ∨ d) → (a ∧ b ∧ c ∧ d)

2.5.5 Number of Boolean Concepts

Determine the number of distinct truths (Boolean concepts or facts) that can
be expressed over 3, 4 and 5 variables.


2.5.6 Negating Implication

Negate these statements, expressing your results using ∧, ∨, and ¬:

1. (a → b)
2. (a → (b → c))
3. (a → (b → (c → d)))

2.5.7 DeMorgan's Law

Negate the following formulae using DeMorgan's Law. Check your answers by using truth-tables.
1. a.b + a.!b + !a.b + !a.!b
2. a.b + a.!b + !a.b
3. (a + b).(a + !b).(!a + b).(!a + !b)
4. (a + b).(a + !b).(!a + b)

2.5.8 Mux-based Realization

Demonstrate how to realize the staircase switch function

(¬m ∨ (a ⊕ b))

using Mux21s. Hint: Obtain the personality for this function, and then use a Mux21 tree of the appropriate height.

2.6 Lecture Outline

A typical lecture covering this chapter may go through the following topics:
What truth-tables capture, and how to develop them for any given
proposition
How the personality of a Boolean function describes the function fully
(all possible outputs, assuming that the inputs are enumerated in a
certain way)


Given a collection of Boolean variables, how many distinct truth-tables can be obtained? How this relates to the total number of distinct truths that can be expressed over these variables. This is the astronomical number 2^(2^N) for N Boolean variables!
Universal gates matter because a single such gate can help realize any of this large number of functions.
How the personalities tell us which Boolean functions (propositions) are tautologies (always true), which are contradictions (always false), and which are neither (sometimes true, sometimes false).
Boolean identities: point out DeMorgan's, distribution of Or over And, and Contrapositive.
Simplification rules for XOR, Implication, Nand, Nor.
Simplification of Assertions: work out the examples in Sec 2.3.1.
Proofs via Equivalences.
Mux-based realization generalized. Extends the Mux21 tree idea. Each level of the Mux-tree serves to steer a bit from the personality to the output.


Chapter 3
Propositional (Boolean) Proofs
In this chapter, we will go through the basics of proving Boolean propositions. Recall what we said in Chapter 2: that proofs in general attempt to prove something of the form Z from something of the form A via steps of the following kind:

A → B → C → D → E → F → G → ... → Z.

Then we would, in effect, have shown A → Z, or "Z if A".
Notice the difference with the previous chapter: there, we attempted proofs using identities, and all such proofs look like

A ≡ B ≡ C ≡ D ≡ ... ≡ Z.
There are many details that we elided in the above discussion. Basically, there are two approaches to proving a goal proposition G:
Direct proof: In this approach, we start from a collection of premises P1, P2, ... and then obtain many consequences (or conclusions) C1, C2, .... We stop the proof when we obtain the goal proposition G as one of the consequences (or conclusions). Let P represent the conjunction of all given premises. In the light of our earlier discussions, this proof does end up showing

P → G

i.e., that P → G is a tautology.
Proof by contradiction: In this approach, we take the premises P1, P2, ... and then add to them a new made-up premise ¬G. This may appear totally crazy: why add the negated goal as a premise? The reason why


this works will become apparent in a moment. But then, the proof goes by obtaining conclusions C1, C2, ... till one of the conclusions obtained is 0 (False). When we obtain False as a conclusion, we stop, and then assert that G has been established! In the light of our earlier discussions, this proof does end up showing

(P ∧ ¬G) → False

which is logically equivalent to

P → G.

Why does proof by contradiction work? The reason why (P ∧ ¬G) → False is logically equivalent to P → G is quite simple to show:

(P ∧ ¬G) → False
≡ ¬(P ∧ ¬G) ∨ False    (using the definition of →)
≡ ¬(P ∧ ¬G)            (using the fact that X ∨ False ≡ X, for any X)
≡ ¬P ∨ ¬¬G             (using DeMorgan's law)
≡ ¬P ∨ G               (using the rule of double negation)
≡ P → G                (using the definition of →).
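The equivalence underlying proof by contradiction can be confirmed by enumerating the four truth assignments to P and G; for instance, in Python:

```python
from itertools import product

def implies(x, y):
    # Definition of ->: !x + y
    return (not x) or y

for p, g in product([False, True], repeat=2):
    # (P and !G) -> False is the shape of a completed proof by contradiction;
    # it must match P -> G on every assignment.
    assert implies(p and not g, False) == implies(p, g)
```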

3.1 Inference Rules

Having introduced propositional identities in Chapter 2, we just need a collection of bridge implications, otherwise known as rules of inference, before we can start writing proofs. The reason why we can't just use identities to write proofs must be clear; but to reiterate:
Sometimes we will be proving weaker assertions from given assertions. For instance, we may want to prove A ∨ B from A.
It is clear that A ≢ (A ∨ B), but in fact it is the case that A → (A ∨ B). Thus, it must be possible to infer weaker facts from a collection of premises, thus requiring rules of inference that are not identities.
Writing style for rules of inference: We now present to you the writing style for rules of inference. Specifically, rules of inference are written as follows:

    Premise1    Premise2    ...
    ----------------------------  RuleName
            Conclusion


That is, we write a bunch of premises as a pattern above the line, and the conclusion we can draw below the line.
Illustration using Socrates: You have all perhaps heard this:
From the premises:
All men are mortal
Socrates was a man
Show that
Socrates was mortal
Solution:
Model "Men are Mortal" using m → r, where m stands for the assertion "is a man" pertaining to all possible men there are, and r stands for "is mortal" pertaining to that man.
Model "Socrates is a man" using m, which stands for the "is a man" assertion specialized to Socrates.
We now have to infer r.
We apply the rule modus ponens, which says
From A and A → B, infer B
Using this rule as a pattern, we can bind A to m and B to r, thus allowing us to infer B, which happens to be r.

3.1.1 A Collection of Rules of Inference

Most of the action (and the error-prone aspects) of a proof lies in the modeling phase. When dealing with English assertions, we will help you by modeling the situation at hand using variables. All the proofs you do in this course will, thus, involve only symbol-pushing moves.
Modus Ponens: The first rule of inference, which we just introduced, is called Modus Ponens. Once again, it is written as follows, using our writing style:

    A    (A → B)
    -------------  ModusPonens
         B

This is how, from an assertion A and an implication A → B, you make progress by deducing B.


SIGNIFICANCE OF RULES OF INFERENCE: Let us pause for a minute and understand what modus ponens is saying; it is this:
Take a formula that looks like A. Thus, A could represent "There is Smoke."
Take another formula that looks like A → B. Thus, A → B could represent "There is Smoke IMPLIES There is Fire."
Infer formula B as being true. Thus, we infer "There is Fire."

Or, A could be (p ∧ q ∧ r), and B could be (p ∧ q ∧ r) → (s ∨ t). Then we can match A with the antecedent of B and infer (s ∨ t).

In other words, we are really asserting that this is a tautology:

    [(p ∧ q ∧ r)
     ∧ ((p ∧ q ∧ r) → (s ∨ t))]
    → (s ∨ t)

Chaining: The second rule of inference we shall use is called chaining.

    A → B
    B → C
    ------  Chaining
    A → C

Chaining allows you to transitively collapse implications, obtaining long-reach inference steps.


Rules of Inference are Valid Implications: Notice that modus ponens really asserts that

    (A ∧ (A → B)) → B

is a tautology. It does not assert that

    (A ∧ (A → B)) ≡ B

which of course is not true.
In a sense, rules of inference are implication bridges, asserting useful implications that are tautologies.
Notice that there should not be a rule of inference of the following kind:

    A → B
    ------  StinkyRule
      A

If we were to allow Stinky Rule, then we would be happily (?) asserting that (A → B) → A is a tautology, and building implication bridges with it. Such implication bridges do not preserve truths; they can suddenly introduce lies! Thus, B may be true, but A may be false; yet, Stinky Rule will allow you to claim A is true by the mere fact that B is true (which makes A → B true), and then happily prove just about anything!
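One can mechanically confirm that modus ponens is a tautology while the Stinky Rule is not; a brief check:

```python
from itertools import product

def implies(x, y):
    return (not x) or y

for a, b in product([False, True], repeat=2):
    # Modus ponens: (A and (A -> B)) -> B is a tautology.
    assert implies(a and implies(a, b), b)

# Stinky Rule: (A -> B) -> A is falsified whenever A is False.
bad = [(a, b) for a, b in product([False, True], repeat=2)
       if not implies(implies(a, b), a)]
assert bad == [(False, False), (False, True)]
```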

Other Rules of Inference: We have introduced two of the key rules we shall use to create implication chains. The remaining rules are in fact identities. But we shall pretend that they are also rules, helping us extend implication chains. Clearly, many of these rules are more than valid implications; they are valid equivalences, and hence even safer to use. We introduce these rules also, so that we have many handy rules together in one place.

    A → B
    --------  Contrapositive
    ¬B → ¬A

Contrapositive allows you to swing an implication the other way, making it amenable to more chaining steps. Don't forget to negate when you swing implications around!

    A ∧ B ∧ C → D
    ------------------  Contrapositive Detail 1
    ¬D → ¬A ∨ ¬B ∨ ¬C


Contrapositive, in case you have a stack to the left.

    (A ∧ B ∧ C) → (D ∨ E ∨ F)
    ----------------------------  Contrapositive Detail 2
    ¬(D ∨ E ∨ F) → ¬(A ∧ B ∧ C)

The above rules can be thought of as generalized contrapositive.

    A ∧ B
    ------  And Commutativity
    B ∧ A

This commutativity rule avoids having to state the two And rules below; but it is good to have the separate rules anyhow.

    A ∧ B
    ------  And Rule 1
      A

You can't have proven A ∧ B unless you have proven A.

    A ∧ B
    ------  And Rule 2
      B

You can't have proven A ∧ B unless you have proven B.

    A ↔ B
    ------  If and Only If
    B ↔ A

This commutativity rule avoids having to state the two rules below; but it is good to have the separate rules anyhow.

    A ↔ B
    ------  If and Only If 1
    A → B

"A if and only if B" means "If A then B" or "B if A". Try applying contrapositive to A → B to know what else you can infer from A ↔ B.

    A ↔ B
    ------  If and Only If 2
    B → A

"A if and only if B" means "If B then A" or "A if B". Try applying contrapositive to B → A to know what else you can infer from A ↔ B.

    A
    A ∧ B → C
    ----------  Simplification of Implication
    B → C


When a formula has too many things stacked up before the →, you can get rid of some of them (those already known to be true).

    A ∧ B → C ∨ D
    ----------------  Moving Around Implication
    A ∧ B ∧ ¬C → D

You can move things around the implication by negating in the process. Imagine the → to have an ∧-stack on the left and a ∨-stack on the right.

    A ∧ B → C ∨ D
    ----------------  Moving Around Implication
    A → ¬B ∨ C ∨ D

You can move things around the implication by negating in the process. Imagine the → to have an ∧-stack on the left and a ∨-stack on the right.

3.2 Examples of Direct Proofs

Please take a look at Puzzles by Lewis Carroll compiled by Prof. Gerald Hiles
at http://tinyurl.com/Gerald-Hiles-Lewis-Carroll. Here are the premises:
1. Every idea of mine, that cannot be expressed as a Syllogism, is really
ridiculous;
2. None of my ideas about Bath-buns are worth writing down;
3. No idea of mine, that fails to come true, can be expressed as a Syllogism;
4. I never have any really ridiculous idea, that I do not at once refer to
my solicitor;
5. My dreams are all about Bath-buns;
6. I never refer any idea of mine to my solicitor, unless it is worth writing
down.
Here is the desired conclusion:
All my dreams come true.
Modeling hints: we introduce propositional variables for each concept below:
Universe: "my idea";


PREMISES
P1. !a → e
P2. b → !k
P3. !c → !a
P4. e → h
P5. d → b
P6. h → k
GOAL
G. d → c
PROOF: Derive these Conclusions, the last of which is the goal
C1. d → !k      P5, P2, Chaining
C2. !k → !h     P6, Contrapositive
C3. d → !h      C1, C2, Chaining
C4. !h → !e     P4, Contrapositive
C5. d → !e      C3, C4, Chaining
C6. !e → a      P1, Contrapositive
C7. d → a       C5, C6, Chaining
C8. a → c       P3, Contrapositive
C9. d → c       C7, C8, Chaining
= G
Figure 3.1: Proof of All My Dreams Come True
a = able to be expressed as a Syllogism;
b = about Bath-buns;
c = coming true;
d = dreams;
e = really ridiculous;
h = referred to my solicitor;
k = worth writing down.
Figure 3.1 presents the direct proof of d → c from the given premises.
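The figure's chain of inferences can be double-checked by brute force over the seven variables; a sketch:

```python
from itertools import product

def implies(x, y):
    return (not x) or y

# P1..P6 from Figure 3.1, and the goal d -> c ("all my dreams come true").
for a, b, c, d, e, h, k in product([False, True], repeat=7):
    premises = (implies(not a, e) and implies(b, not k) and
                implies(not c, not a) and implies(e, h) and
                implies(d, b) and implies(h, k))
    # Whenever all premises hold, the goal must hold too.
    assert implies(premises, implies(d, c))
```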

3.3 Examples of Proofs by Contradiction

Figure 3.2 presents the proof by contradiction of All my dreams come true.
PREMISES
P1. !a → e
P2. b → !k
P3. !c → !a
P4. e → h
P5. d → b
P6. h → k
P7. d ∧ !c      Negated goal added as premise
PROOF: Derive these Conclusions, the last of which is FALSE
C1. d           P7
C2. !c          P7
C3. b           C1, P5, MP
C4. !k          C3, P2, MP
C5. !k → !h     P6, Contrapositive
C6. !h          C4, C5, MP
C7. !h → !e     P4, Contrapositive
C8. !e          C6, C7, MP
C9. !e → a      P1, Contrapositive
C10. a          C8, C9, MP
C11. a → c      P3, Contrapositive
C12. c          C10, C11, MP
C13. False      C2 and C12

Figure 3.2: Proof by contradiction of All My Dreams Come True
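The proof by contradiction in Figure 3.2 can likewise be checked mechanically: the premises together with the negated goal must have no satisfying assignment. A sketch:

```python
from itertools import product

def implies(x, y):
    return (not x) or y

# P1..P6 plus P7 (d and !c, the negated goal) must be unsatisfiable.
models = []
for a, b, c, d, e, h, k in product([False, True], repeat=7):
    if (implies(not a, e) and implies(b, not k) and
            implies(not c, not a) and implies(e, h) and
            implies(d, b) and implies(h, k) and
            d and not c):
        models.append((a, b, c, d, e, h, k))
assert models == []   # we can always derive False, as the figure does
```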

3.4 Exercises

1. Provide a proof of y → x from the premise x. You may use the definition of → in terms of ∨, and you may introduce a rule "from A, infer A ∨ B".



2. Provide a proof by contradiction that x → (y → x) is a theorem. Hint: Treat x → (y → x) as the goal, negate it, and derive falsehood.
3. From the premises
P1. a.b → c
P2. c → d
P3. e → b
P4. e.f → !d
P5. f → a
Infer the goal G given by e → !f.
You are free to choose whichever approach (a direct proof or a proof by contradiction) makes this proof easier.
4. Show that the generalized contrapositive rules are safe to use as rules of inference. Hint: Take one of those rules:

    A ∧ B → C ∨ D
    ----------------  Moving Around Implication
    A ∧ B ∧ ¬C → D

We can view this rule as the implication
(A ∧ B → C ∨ D) → (A ∧ B ∧ ¬C → D)
Show that this implication is valid.
5. Show that an even stronger result holds:
(A ∧ B → C ∨ D) ≡ (A ∧ B ∧ ¬C → D)

3.5 Lecture Outline

A typical lecture covering this chapter may go through the following topics:
What does a proof mean? I.e., a proof of a goal G from a set of premises P? It is to show that P → G is a tautology! For any setting of the variables, if P is true, so is G.
What does proof by contradiction mean? It is to show that P ∧ ¬G is a contradiction (false) for any setting of the variables. This is exactly equivalent to P → G being a tautology; show how.


What do rules of inference do? They help form implication chains, i.e., one long → from many little →s. You may use the ≡ identities from the previous chapter anywhere to form bridges. A proof now looks like

A → B → C → D → E → F → G → ... → Z.

Discuss two sound rules (contrapositive, modus ponens) and the stinky rule. See what's wrong with the stinky rule.
Writing a direct proof: Example from Sec 3.2
Writing a proof by contradiction: Example from Sec 3.3


Chapter 4
Binary Decision Diagrams
In this chapter, we introduce Binary Decision Diagrams, a simple yet elegant idea to compactly represent Boolean functions. Notice that a Boolean function represented by a truth-table can have 2^N rows for an N-variable function. For many such functions, BDDs offer a linear or polynomial-sized representation. This really helps when N becomes large (e.g., for N = 16, there is a huge difference between 2^16 and 16, as you will agree).
Given the need to represent large Boolean functions (say, those involving dozens of Boolean variables), it is important to have practical (scalable) representations. Unfortunately, truth tables and Karnaugh maps (which we did not study so far, but are standard fare in many courses) are not scalable or practical at these sizes! While one may represent a Boolean function of a few inputs (e.g., And) using a truth table, even something conceptually as simple as a magnitude comparator, comparing whether two bytes (8-bit words) are equal, requires us to employ a 16-input truth table. This truth-table will have 65,536 rows, somewhat like this:
Row Number   b7 b6 b5 b4 b3 b2 b1 b0   a7 a6 a5 a4 a3 a2 a1 a0   A=B
1:            0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0    1
2:            0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  1    0
3:            0  0  0  0  0  0  0  0    0  0  0  0  0  0  1  0    0
4:            0  0  0  0  0  0  0  0    0  0  0  0  0  0  1  1    0
...
65536:        1  1  1  1  1  1  1  1    1  1  1  1  1  1  1  1    1

Clearly, working with a truth-table of 65,536 rows (or a K-map with 65,536 cells) is not practical. Fortunately, there is an alternative representation of Boolean functions called a Binary Decision Diagram (BDD) that can, for many commonly occurring Boolean functions, be quite a bit more compact. It is BDDs that we shall now study systematically, beginning with some examples of Boolean functions.
Consider another example to motivate our discussions: the design of a 64-bit adder that adds two 64-bit integers, producing a 65-bit result. As pointed out in the example of a comparator, truth-tables are poor representations for almost all functions, including for an adder. For instance, a truth-table for the adder with respect to each of the 65 bits of output will have size (number of rows) equaling 2^128. It is clearly impossible to build such truth tables or verify such adders by going through every Boolean combination. We obviously need more efficient methods, such as will be presented in this chapter. Specifically, we will introduce BDDs as a data structure conducive to representing Boolean functions compactly, provided a good variable ordering can be selected. While this method is not foolproof (i.e., there are Boolean functions for which the BDDs are large), it often works surprisingly well in practice.

4.1 BDD Basics

BDDs are directed graphs. They have two types of nodes: ovals and rectangles. Ovals are interior nodes, representing variables and their decodings. One can in fact view the ovals as 2-to-1 muxes. The variable written inside the oval is connected to the selector of the mux. There are two leaf nodes, namely 0 and 1, written within rectangles. BDDs also have edges emanating from the ovals:
Red (dotted) edges are the 0 edges. They are like the 0 input of the 2-to-1 muxes.
Blue (solid) edges are the 1 edges. They are like the 1 input of the 2-to-1 muxes.
The output of each interior node (oval) represents a Boolean function realized using 2-to-1 muxes.
Figure 4.1 presents the BDDs for And, Or, and Xor. Notice that by walking paths to the 1 node, we can determine which truth-table rows must emit a 1. You can notice a heavy degree of compression: for And, only one path goes to the BDD's 1 node, and all others jump to 0. This example, by itself,


Figure 4.1: Some Common BDDs: And, Or, and Xor (from left to right). Blue is 1 and Red is 0. Memory aid: 0 is the most fundamental invention in math, and that goes with red (i.e., the U's color :)

Figure 4.2: Situations to avoid in order to make BDDs Canonical Representations of Boolean functions


shows the magical compression ability of BDDs.

4.1.1 BDD Guarantees

BDDs that meet three conditions become canonical representations of Boolean functions:
Variable Ordering: There is one fixed sequence v1, v2, ..., vN ordering the variables. In other words, in any path from the root of the BDD to a leaf (one of the squares), there is no vk followed by a vj for j < k. Note that it is okay for some variable vj NOT to be on a path.
No Redundant Decoding: There is no oval whose outgoing red and blue edges go to the same child. They must go to different children.
No Duplicated Boolean Function: There are no separately drawn ovals representing the same Boolean function.
Figure 4.2 illustrates the situations to avoid so that BDDs may be canonical. Having a canonical representation allows us to compare equivalent BDDs through graph isomorphism. As implemented by most BDD packages, one does not even have to carry out graph isomorphism, but can instead check that the root nodes of the BDDs hash into the same bucket (thus making function-equality comparison a constant-time operation).
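Most BDD packages enforce the last two conditions with a "unique table" of nodes. The following is a minimal illustrative sketch of that idea (our own toy code, not the PBDD tool used later in this chapter):

```python
# Hash-consing sketch: build each (variable, low-child, high-child) node once.
unique_table = {}

def make_node(var, low, high):
    if low is high:
        # No Redundant Decoding: both edges to the same child collapse away.
        return low
    key = (var, id(low), id(high))
    if key not in unique_table:
        # No Duplicated Boolean Function: reuse the node if it already exists.
        unique_table[key] = (var, low, high)
    return unique_table[key]

ZERO, ONE = "leaf0", "leaf1"
n1 = make_node("x", ZERO, ONE)
n2 = make_node("x", ZERO, ONE)
assert n1 is n2                       # equal functions share one node: O(1) test
assert make_node("y", n1, n1) is n1   # a redundant test on y is dropped
```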

4.1.2 BDD-based Comparator for Different Variable Orderings

A comparator can have size linear in the number of bits being compared
(for a favorable ordering of BDD variables). On the other hand, the BDD
can also be exponentially large (for an unfavorable BDD variable ordering).
These are illustrated in Figure 4.3.

4.1.3 BDDs for Common Circuits

Let us illustrate BDDs constructed through a simple Python script acting on a data file as shown:

#---Mux41Good.txt begins here and ends where shown below---
Var_Order : s0 s1 i0 i1 i2 i3
Main_Exp : ~s0 & ~s1 & i0 | s0 & ~s1 & i1 | ~s0 & s1 & i2 | s0 & s1 & i3


Figure 4.3: Comparator BDD for the Best Variable Ordering and the worst

Figure 4.4: A 4-to-1 mux with good variable ordering (left) and a bad ordering (right)

#---end of Mux41Good.txt---

#---Mux41Bad.txt begins here and ends where shown below---
Var_Order : i0 i1 i2 i3 s1 s0
Main_Exp : ~s0 & ~s1 & i0 | s0 & ~s1 & i1 | ~s0 & s1 & i2 | s0 & s1 & i3

To summarize, a good variable ordering is one that minimizes the BDD size. It may not be unique (there could be two equally good orderings). Also, in practice, it depends on how closely related a collection of variables is in determining the truth value of the function. The sooner (after reading the fewest inputs) we can decide the function output, the better.
By studying BDDs in CS 2100, we will have several gains:
Learn another representation (a canonical representation) for Boolean functions.
A representation that makes sense to use in practice (exponentially better than truth tables in many important cases).
Knuth's observation: There are 2^(2^N) Boolean functions over N inputs.
Most are uninteresting in practice.
Therefore, there must be a compressed representation for those that matter in practice.
Much like compression of images etc. (many pixels that really don't matter that much...).
Will learn how to obtain mux-based circuits straight out of BDDs.
It is easy to learn how to read out CNF and DNF representations out of BDDs (this is in my more advanced books for CS 2100).
Will be able to do combinatorics pertaining to unstructured information using BDDs.

4.1.4 A Little Bit of History

BDDs are the culmination of a gradual evolution of ideas (1970s, notably Sheldon Akers). In 1986, Randy Bryant introduced the concept of reduced ordered BDDs, or ROBDDs (this is what we call a BDD). He invented it in the context of electronic digital circuit simulation/analysis. Since Bryant's invention, BDDs took off like wildfire. They are the basis of many tools. Knuth's Volume 4a (http://www-cs-faculty.stanford.edu/~knuth/) covers BDDs and their use in combinatorics and other applications quite extensively. Knuth calls BDDs one of the most important data structures to be introduced in the last 25 years.
Example: Design and debugging of a comparator BDD
Suppose we are given a bit-vector [a2,a1,a0] of three bits, where a2 is the MSB and a0 is the LSB. Similarly, suppose [b2,b1,b0] is another bit-vector. Suppose we want to define the < relation between these bit-vectors. One definition that was attempted recently proved to be incorrect; it is:

# A < B
# i.e. a2,a1,a0 < b2,b1,b0
Var_Order : a2, b2, a1, b1, a0, b0
Main_Exp : ~a2 & b2 | ~a1 & b1 | ~a0 & b0

From Figure 4.5 (left), we can see that this BDD is not correct. Go through all possible paths and see if you can spot errors. One clue: what happens when a2 is 1 and b2 is 0? What should the output be? (In a correct comparator, the answer must be 0.)


Figure 4.5: Incorrect (left) and Corrected (right) magnitude comparator for
the Less-than relation <. The mistake is for instance in not completely specifying the decodings.

The corrected comparator's description is below, and its BDD is in Figure 4.5 (right). Notice that we do a full case analysis of how the comparison must go.

# A < B
# i.e. a2,a1,a0 < b2,b1,b0
Var_Order : a2, b2, a1, b1, a0, b0
Main_Exp : ~a2 & b2 | (a2 <=> b2) & (~a1 & b1 | (a1 <=> b1) & ~a0 & b0)
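The corrected Main_Exp can be sanity-checked in a few lines by comparing a transcription of it against integer comparison over all 64 input combinations; a sketch:

```python
from itertools import product

def less_than(a2, a1, a0, b2, b1, b0):
    # Transcription of the corrected Main_Exp:
    # ~a2 & b2 | (a2 <=> b2) & (~a1 & b1 | (a1 <=> b1) & ~a0 & b0)
    return bool((not a2 and b2) or
                ((a2 == b2) and ((not a1 and b1) or
                                 ((a1 == b1) and (not a0 and b0)))))

# Compare against Python's < over the 3-bit unsigned interpretations.
for a2, a1, a0, b2, b1, b0 in product([0, 1], repeat=6):
    a = 4*a2 + 2*a1 + a0
    b = 4*b2 + 2*b1 + b0
    assert less_than(a2, a1, a0, b2, b1, b0) == (a < b)
```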

4.2 Checking Proofs using BDDs

In this section, we shall illustrate how having a tool allows us to automate some of the hand proofs and produce mistake-free (machine-checked) proofs.

4.2.1 Checking a Correct Direct Proof

Consider a direct proof:

PREMISES
P0. a
P1. a → b
P2. b → c
P3. c → d
P4. d → !e
GOAL
G. b ∧ !e
Let us use the BDD tool to enter this proof:

Var_Order: a, b, c, d, e
P0 = a
P1 = a -> b
P2 = b -> c
P3 = c -> d
P4 = d -> !e

Premises = P0 & P1 & P2 & P3 & P4
Goal = b & !e
Main_Exp : Premises -> Goal
The result of the BDD tool in Figure 4.6 shows that indeed this proof is valid; that is, the goal G does follow from the given premises. That is, P → G did end up being a tautology.
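The same check the BDD tool performs can be mimicked by exhaustive enumeration; a sketch:

```python
from itertools import product

def implies(x, y):
    return (not x) or y

# Premises P0..P4 and the goal b & !e from this section.
for a, b, c, d, e in product([False, True], repeat=5):
    premises = (a and implies(a, b) and implies(b, c) and
                implies(c, d) and implies(d, not e))
    goal = b and not e
    # Premises -> Goal holds on every assignment: the BDD reduces to 1.
    assert implies(premises, goal)
```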


Figure 4.6: A Successful Direct Proof

4.2.2 Checking an Incorrect Direct Proof

By leaving out premise P0, we get evidence that the goal can't quite be proven (Figure 4.7). The BDD is crying out to become 1, but since the status of a is not given, it shows both possibilities (of a being 1 and of a being 0). In other words, P → G did not end up being a tautology, because it has paths to 0 also (it can be falsified)! An astute user will immediately see the flaw and add premise P0, thus rescuing the proof.

4.2.3 Checking a Correct Proof by Contradiction

Let us use the BDD tool to enter this proof:

Var_Order: a, b, c, d, e
P0 = a
P1 = a -> b
P2 = b -> c
P3 = c -> d
P4 = d -> !e

Premises = P0 & P1 & P2 & P3 & P4
Goal = b & !e
NegatedGoal = !Goal
Main_Exp : Premises & NegatedGoal

Figure 4.8 shows how a successful proof by contradiction shows up as the BDD output. That is, P ∧ ¬G did end up being a contradiction.


Figure 4.7: An Unsuccessful Direct Proof

Figure 4.8: A Successful Proof by Contradiction



Figure 4.9: An Unsuccessful Proof by Contradiction

4.2.4 Checking an Incorrect Proof by Contradiction

Again, by leaving out premise P0, we get the result of an incorrect proof by contradiction, as in Figure 4.9. In other words, P ∧ ¬G did not end up being a contradiction, because it has paths to 1 also (it can be satisfied)!

4.3 Exercises

1. Verify that the proof in Section 3.2 ended up proving that P → G is true, where P is the conjunction of P1 through P6 and G is the given goal G. Use the Binary Decision Diagram tool (to be demonstrated in class). The BDD tool is available from here:

http://www.cs.utah.edu/fv

Look for "Software", then "PBDD", then "Web Interface". This webpage comes with a self-contained example. Here is what you type for this example; then build a BDD for Main_Exp, and then describe your observation(s) about this BDD in a few neat sentences. Specifically, relate it to the discussion on Direct proof on/near Page 3. Is the purpose of a proof as captured there being accomplished? Reflect this understanding in your answer.

Var_Order: a, b, c, d, e, h, k
P1 = !a -> e
P2 = b -> !k
P3 = !c -> !a
P4 = e -> h
P5 = d -> b
P6 = h -> k

Premises = P1 & P2 & P3 & P4 & P5 & P6
Goal = d -> c
Main_Exp : Premises -> Goal
2. Artificially introduce a mistake by changing the goal d → c to d → ¬c. Rerun the BDD tool. What does the Main_Exp look like now, and what is it telling you?
3. Verify that the proof in Section 3.3 ended up proving that P ∧ ¬G is false, where P is the conjunction of P1 through P6 and G is the given goal G. Use the Binary Decision Diagram tool. Encode the Premises as given in Question 1, but do add another premise: the negated goal. Then plot Main_Exp. Does this reflect the intent of a proof by contradiction as outlined on/near Page 3?
4. Artificially introduce a mistake by changing the goal d → c to d → ¬c. Rerun the BDD tool for the proof-by-contradiction approach. What does the Main_Exp look like now, and what is it telling you?
5. Study Section 4.1.2 and Section 4.1.3, which discuss the notion of bad variable orderings. Write in 4-5 clear sentences which variable orderings can (heuristically) be considered good, which are considered bad, and why.



6. Study Section 4.1.4 where we make a mistake in a Boolean equation.
Describe the mistake and its correction in a few clear sentences. How
did the BDD help in discovering the mistake?
7. Verify the proof in Question 3 of Chapter 3 using the BDD tool. The
requested proof was this.
From Premises:
P1. a.b c
P2. c d
P3. e b
P4. e. f ! d
P5. f a
Infer the goal G given by e ! f .
You are free to choose whichever approach (a direct proof or a proof by
contradiction) that makes this proof easier. But since you are using
BDDs, try both.

4.4 Lecture Outline

A typical lecture covering this chapter may go through the following topics:
Show the advantages of BDDs as opposed to truth-tables.
Show the dependency of BDDs on variable order. Keeping the BDD small means choosing the order smartly.
Otherwise (apart from keeping the BDD small), the variable order plays no role whatsoever. Choose one and stay with it for all your calculations. Then two equivalent Boolean functions will have the same BDD graphs.
BDDs can be read as Mux21-based graphs. In this way, the BDD for any Boolean function is also a circuit for that function!
BDDs help us check proofs. If a proof of G from P is sound, then the BDD for P → G will be the 1 node. For a sound proof by contradiction, the BDD for P ∧ ¬G will be the 0 node.
BDDs can also be used to check that rules of inference are valid. Basically, in a rule with premises Ps written above the line and a conclusion C below it, the conjunction of the Ps implies C as a tautology.

Chapter 5
Addendum to Chapters

This chapter covers points that came up in our Canvas discussions plus the feedback I received through TAs. I chose to create an addendum so that you don't have to print everything again and lose all your hand-written notes (so just print from this PDF page onwards). I will now cover these FAQs: Books to Purchase (5.1), Operator Precedences (5.2), Gate Realizations (5.3), insights into Logical Equivalences (5.4), Muxes (5.5), and a Glossary of Formal Definitions (5.6).

5.1 Books to Purchase

For those who want to purchase a book, here are some points worth noting:
I gave you the link to a book by Grimaldi (inexpensive used copies; good content). There are also many notes online (this subject has been around for a century). It is good to read the material of this course from many sources so that you obtain many perspectives.
You may still not see many things I'm hoping to cover:
Which gates are universal, and why XOR is not. In computer science, impossibility results are just as important.
I've seldom seen a discussion of there being 2^(2^N) Boolean functions. Upper bounds (in this case, the number of gate forms, or the number of logically non-equivalent assertions one can make over N variables) are another important aspect of computer science.
Books in this area do not often include important practical material. As an example, Professor Donald Knuth of Stanford is one of the luminaries of Computer Science. He has written at length about Binary Decision Diagrams (BDDs) in his latest book The Art of Computer Programming, Volume 4, Combinatorial Algorithms, highlighting their importance (you can get a peek at Prof. Knuth's draft manuscripts at http://www.cs.utsa.edu/~wagner/knuth/). We will learn about BDDs in Chapter 4 and put them to good use.

5.2 Operator Precedences

Operator precedences for Boolean expressions are as follows:

    Operator           Symbol          Alternate Symbol(s)     Precedence
    Negation           ¬               !, overbar              1 (highest)
    Conjunction        ∧               juxtaposition           2
    Disjunction        ∨               +                       3
    Other operators    →, ↔, =, ⊕                              4

Notes:
Juxtaposition (as in ab) can be used for conjunction (as in a ∧ b).
Parentheses override all precedences.
When implication chains are used, they right-associate, as in

a → b → c ≡ a → (b → c)

although I don't advise that you rely on this usage too much (error-prone for beginners).

5.2.1 Example

An expression

a ∧ b ∨ ¬c ∨ d → ¬e ∧ f ∨ g

can be read as

((a ∧ b) ∨ ¬c ∨ d) → ((¬e ∧ f) ∨ g)

although I would recommend that you write with some minimal usage of parentheses to enhance readability, with white-spaces judiciously used, as in

(a ∧ b) ∨ ¬c ∨ d → (¬e ∧ f) ∨ g

but better also as

((a ∧ b) ∨ ¬c ∨ d) → ((¬e ∧ f) ∨ g)

The Boolean math syntax can make things much more readable, as in

(ab + !c + d) → (!e f + g)
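As a sanity check, note that Python's `not`, `and`, and `or` have the same relative precedences as ¬, ∧, and ∨ above (`not` binds tightest, then `and`, then `or`). Here is a small sketch, with implication encoded explicitly since Python has no → operator:

```python
# Verify that the unparenthesized reading of the example matches the
# fully parenthesized reading over all 2**7 input assignments.
from itertools import product

def reading1(a, b, c, d, e, f, g):
    antecedent = a and b or not c or d        # a ∧ b ∨ ¬c ∨ d
    consequent = not e and f or g             # ¬e ∧ f ∨ g
    return (not antecedent) or consequent     # antecedent → consequent

def reading2(a, b, c, d, e, f, g):
    antecedent = ((a and b) or (not c) or d)
    consequent = (((not e) and f) or g)
    return (not antecedent) or consequent

print(all(reading1(*v) == reading2(*v)
          for v in product([False, True], repeat=7)))   # prints True
```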

5.2.2 Another Example

The expression

a ∧ b ∨ ¬c ∨ d → ¬e ∧ f ∨ g

can be read as (and must ideally be written as follows, for clarity)

((a ∧ b) ∨ ¬c ∨ d) → ((¬e ∧ f) ∨ g)

The Boolean math syntax can make things much more readable, as in

(ab + !c + d) → (!e f + g)

5.3 Gate Realizations

In writing your answer for gate realizations, suitably summarize or adapt the answer template I'm about to give below with respect to an example. The high-level steps are:
Please write down the equation for the given gate(s), drawing their symbols also for clear documentation.
Write down the equation for the gate(s) to be realized.
Write a sentence describing a method of construction.
Show the result as a full equation or as a schematic.
Example: Realize "Nand" using "Implication"
Given an Implication gate whose equation is ¬a + b, or !a + b (if you prefer to write it that way),
Here is its schematic (draw the schematic)
To realize Nand, whose equation is ¬(ab) (or !(ab)).


Figure 5.1: Nand gate made using two Implication gates; then connected in
a test-rig where it is compared against a genuine Nand. The XNOR gate
implements equality. Notice that its output LED is on for all input combinations, thus proving that our Nand construction works.


Method: Inversion is realized through Implication by setting b = 0. Then, the conjunction in Nand can be realized through DeMorgan's Law.
In more detail, look at !a + b. By setting b = 0, we get !a + 0 = !a. Thus we get inversion with respect to a. Set this inverter aside.
Take another copy of implication. Write its equation as !c + d.
Notice that I can apply the newly formed inverter to its d input, thus obtaining !c + !d.
From DeMorgan's Law, we know that this is equivalent to !(cd), which is the desired Nand gate.
If I don't mention a specific approach for gate realization, you may choose any method that works. For example, some of you may go by truth-tables, in case I don't give any constraints.
In Figure 5.1, we show how a Nand gate realized using Implication gates can be wired in a test rig. Please don't get confused by the large number of circuits: the two implication gates used to realize the Nand are at the top right corner. The first Or with a bubble is the inverter we realized by taking !a + b and setting b = 0. The second Or with the bubble is the !c + d gate we mentioned above.
What we've done in this construction is to also use a real Nand gate and then compare its output with the Nand we've made. This comparison is done by the XNOR gate at whose output we have attached an LED. Now we crank through all input combinations and find that the XNOR gate always outputs a 1, regardless of the inputs. Thus, the Nand we made using two implication gates indeed works.
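The test rig's exhaustive comparison can also be mimicked in a few lines of Python (a sketch; the function names are ours):

```python
# Two Implication gates realize Nand; compare against a genuine Nand
# for every input combination (the role the XNOR plays in Figure 5.1).
from itertools import product

def imp(a, b):               # Implication gate: !a + b
    return (not a) or b

def made_nand(c, d):
    inv_d = imp(d, False)    # imp with b = 0 acts as an inverter: !d
    return imp(c, inv_d)     # !c + !d, i.e., !(cd) by DeMorgan's Law

def real_nand(c, d):
    return not (c and d)

print(all(made_nand(c, d) == real_nand(c, d)
          for c, d in product([False, True], repeat=2)))   # prints True
```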

5.4 Insights Into Logical Equivalences

We studied several logical equivalences. Wouldn't it be cool to see DeMorgan's law (the most famous of logical equivalences) as a circuit? In Figure 5.2, we do exactly that: we provide a circuit that proves that ¬(a + b) ≡ ¬a · ¬b. We provide a circuit for both sides of this equivalence, and then use an XNOR to check whether they are equal under all inputs. We see this to be true as per this figure. Think of all Boolean laws as defining tautologies of this kind.

5.4.1 Jumping Around Implications (NEW)

We now discuss the "jumping around implications" rules.


Figure 5.2: DeMorgan's Law, ¬(a + b) ↔ (¬a ∧ ¬b), Illustrated Using a Circuit


The formula

(A ∧ B) → (C ∨ D)

is equivalent to

B → (¬A ∨ C ∨ D)

which is also equivalent to

(A ∧ B ∧ ¬C) → D

In other words, you can take a formula of the form

stack-of-ANDs → stack-of-ORs

and
move one of the conjuncts to the right of the arrow (after negating it), making it part of the OR-stack, or
move one of the disjuncts to the left of the arrow (after negating it), making it part of the AND-stack.
This is a valid rule because of a simple fact (proof):

(A ∧ B) → (C ∨ D)
≡ ¬(A ∧ B) ∨ (C ∨ D)
≡ (¬A ∨ ¬B ∨ C ∨ D)
≡ (¬B ∨ (¬A ∨ C ∨ D))
≡ (B → (¬A ∨ C ∨ D))

And similarly, jumping C to the left can be derived (try it).
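The full three-way equivalence is easy to confirm exhaustively; here is a small Python sketch:

```python
# All three forms of the "jumping around" example agree on every
# assignment to A, B, C, D.
from itertools import product

imp = lambda p, q: (not p) or q   # p -> q

ok = all(imp(A and B, C or D)
         == imp(B, (not A) or C or D)
         == imp(A and B and (not C), D)
         for A, B, C, D in product([False, True], repeat=4))
print(ok)   # prints True
```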

5.4.2 Telescoping Antenna Rule (NEW)

The Telescoping Antenna Rule allows us to mush together chains of implications, as if it's a telescoping antenna. That is,

A → (B → C) ≡ (A ∧ B) → C

The reason again is simple (let's formally derive this equivalence):



A → (B → C)
≡ ¬A ∨ (B → C)
≡ ¬A ∨ (¬B ∨ C)
≡ (¬A ∨ ¬B) ∨ C
≡ ¬(A ∧ B) ∨ C
≡ (A ∧ B) → C
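Here too, an exhaustive Python check (a sketch) confirms the rule:

```python
# Telescoping Antenna Rule: A -> (B -> C) is equivalent to (A and B) -> C.
from itertools import product

imp = lambda p, q: (not p) or q

print(all(imp(A, imp(B, C)) == imp(A and B, C)
          for A, B, C in product([False, True], repeat=3)))   # prints True
```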

5.5 Muxes

In Figure 5.3, we present the use of Mux21s to realize an implication gate. Basically, we wire the personality at the leaves. See how, for each input combination, the right bit of the personality is steered through the tree.
In Figure 5.4, we present the use of Mux21s to realize a 3-input XOR gate. Again the same construction method is followed: we wire the personality at the leaves. See how, for each input combination, the right bit of the personality is steered through the tree.
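The construction in Figures 5.3 and 5.4 can be sketched in Python as follows; `mux_tree` and the personality lists are our illustrative names:

```python
# A Mux21 steers i0 or i1 to the output based on selector s; a tree of
# Mux21s realizes any function whose personality is wired at the leaves.
def mux21(s, i0, i1):
    return i1 if s else i0

def mux_tree(sel, personality):
    if not sel:                   # no selectors left: we are at a leaf
        return personality[0]
    half = len(personality) // 2
    return mux21(sel[0],
                 mux_tree(sel[1:], personality[:half]),   # sel[0] = 0 half
                 mux_tree(sel[1:], personality[half:]))   # sel[0] = 1 half

imp_personality  = [1, 1, 0, 1]              # a -> b, rows ab = 00..11
xor3_personality = [0, 1, 1, 0, 1, 0, 0, 1]  # a ^ b ^ c, rows abc = 000..111

print(mux_tree([1, 0], imp_personality))      # 1 -> 0 gives 0
print(mux_tree([1, 1, 0], xor3_personality))  # 1 ^ 1 ^ 0 gives 0
```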

5.6 Glossary of Formal Definitions

Here are formal definitions of terms used in Chapters 1 through 4.

Chapter 1:
Declarative Sentence: A statement having true/false as its meaning.
Propositional variable: A mathematical variable that takes on true/false (commonly 1/0) as its values.
Propositional / Boolean: Terms that are interchangeably used to denote truth-valued propositions and concepts.
Propositional formula: A mathematical formula containing propositional variables connected using propositional operators.
Formal proposition: Also known as a propositional formula.


Figure 5.3: Mux21-based Implication


Figure 5.4: Mux21-based XOR3, a 3-input XOR


Boolean function: Formal propositions can also be viewed as mathematical functions that take Booleans as input and yield a single
Boolean (for each input combination) as output.
Truth table: A tabular presentation of a Boolean function having 2^N rows, one for each combination of Boolean inputs.
Personality: The entire output column of a truth-table, assuming a fixed enumeration order of the rows of the truth-table, going from all 0s to all 1s. The personality summarizes the behavior of the Boolean function. There are 2^(2^N) distinct personalities that can be obtained, given N inputs.
Gates: Circuit embodiment of a Boolean function.
Universal Gate: A gate-type (or a collection of gate types) that can
(typically with multiple copies employed) be used to realize any
other Boolean gate type.
Mux, Mux21: A multiplexor is a special gate type. A Mux21 is the most primitive multiplexor type, capable of steering one of its inputs i0 and i1 to the output, based on whether a selector input s is 0 or 1, respectively. Muxes are universal gates (see Mux tree, below).
Mux tree: A tree arrangement of Mux21s that can be used to build
any Boolean function by (1) placing the personality of the function
to be realized at the leaves, and (2) by employing the function
inputs as selection inputs at the right levels of the tree.
Chapter 2:
Propositional Identities: Identities or laws such as DeMorgan's Law or the Law of Contrapositives. These are most commonly stated as F1 ≡ F2, as in ¬(a ∧ b) ≡ (¬a ∨ ¬b).
Tautology: A propositional formula that evaluates to true under all assignments of values to its variables. Such formulae are also known as valid or simply true.
The negation of a tautology is a contradiction. Thus, x ∨ ¬x is a tautology, while ¬(x ∨ ¬x), which is ¬x ∧ x, is a contradiction.
Many tautologies contain ↔, as in ¬(a ∧ b) ↔ (¬a ∨ ¬b). But they need not, as in x ∨ ¬x.
Contradiction: A propositional formula that evaluates to false under all assignments of values to its variables. Such formulae are false. Unsatisfiable formulae are contradictions. The negation of a contradiction is a tautology.
Satisfiable: A propositional formula for which there is a value assignment that makes it true. Tautologies are special cases. In general, satisfiable formulae can also be falsifiable and hence not tautologies.
Non-Equivalent Assertions: Two assertions F1 and F2 for which F1 ≡ F2 fails to hold for at least one input value assignment.

Chapter 3:
Premise: A propositional formula that models a given fact.
Conclusion: A propositional formula that we want to prove.
Rule of Inference: A pattern that matches a collection of premises and spits out one or more formulae as output. For example, a rule of the form

    A → ¬B
    ───────  (Contrapositive)
    B → ¬A

matches anything of the form B → ¬A and outputs A → ¬B. Here, A and B could themselves be arbitrary propositional formulae.
Another example is

    A → B
    B → C
    ───────  (Chaining)
    A → C

In general, given

    P1
    P2
    ───────  (General Rule R)
    C1
    C2

it must be the case that (P1 ∧ P2) → (C1 ∧ C2) is valid. Otherwise, the given inference rule is not sound (it can allow us to prove incorrect conclusions).
If you take a close look at the contrapositive rule, it is more than an implication. That is, from the contrapositive rule, one can of course glean that

    (A → ¬B) → (B → ¬A),

but, by interpreting A as if it were ¬Q and B as if it were ¬P, one can also see that this rule contains another implication (after cancelling double negations):

    (¬Q → P) → (¬P → Q).

Thus, the contrapositive rule is really giving you a more powerful statement:

    (A → ¬B) ↔ (B → ¬A).
Number of Rules of Inference: There must be a minimal number of rules of inference (a detail you don't need to worry about). Extra ones are thrown in simply for convenience. For example, many books talk about Modus Tollens. It is entirely redundant (hence I'm avoiding its introduction in my book).
Proof: A chain of inferences, aided by either propositional identities or other rules of inference, such that starting from premises P we can prove a goal G. In a correct proof, the formula P → G will end up being valid.
For instance, we can prove a ∨ b from a. In this case, a → (a ∨ b) is easily checked to be valid. Notice that a ∨ b is not equivalent to a, but is weaker than a. In general, in a proof, G is equivalent to or weaker than P.
Direct Proof: A proof that begins with premises P and ends with a goal G.
Proof by Contradiction: A technique whereby we assert ¬G, conjoin it with the given premises P, and then apply the available rules of inference to produce False (or 0). At that point, we can conclude that P → G is valid.
Chapter 4:
Binary Decision Diagram: A graphical form that is like a Mux-tree,
except (1) it is constructed with respect to a fixed variable order.
(2) the better the suggested variable order, the more compact a
BDD will be. (3) BDDs share sub-functions maximally. (4) BDDs
need not decode every variable in the variable order along every
path (i.e. they can skip levels). BDDs are more properly called
Reduced Ordered Binary Decision Diagrams (ROBDD) but BDD
is easier to say.
Mux realization of BDDs: Any Mux-tree can be collapsed to become
a BDD (or ROBDD). Thereafter, the interior nodes of a BDD can
be realized using Mux21, thus obtaining a direct method to realize any Boolean function using Mux21s in a more efficient way
than through a plain Mux-tree.



Checking Direct Proofs Using BDDs: We build a BDD for P → G, and if G is indeed provable from P, then this BDD will be a 1 BDD. The proof itself is not going to be found (but at least you know that it is provable without spending a whole lot of time). If not provable, you get something other than a 1 BDD. By staring at that BDD, one can often discover flaws in the problem formulation.
Checking Proofs by Contradiction Using BDDs: We build a BDD for P ∧ ¬G, and if G is indeed provable from P, then this BDD will be a 0 BDD. The proof itself is not going to be found (but at least you know that it is provable without spending a whole lot of time). If not provable, you get something other than a 0 BDD. By staring at that BDD, one can often discover flaws in the problem formulation.

Chapter 6
Notes on BDDs as Mux21 Circuits

Suppose you are asked to build an And gate. You may be tempted to say: why bother, why not take it from a gate catalog?
But suppose we don't have And gates at all; i.e., we are given an FPGA board such as in Figure 6.1, which is full of Mux21s but nothing else. Then you cannot simply avail yourself of an And gate; instead, you might have to take the approach shown at the top of Figure 6.2, which is the approach of building any gate by programming its personality at the leaves of a Mux21 tree.
Unfortunately, such a Mux21 tree is guaranteed exponential in size (i.e., could be unacceptably inefficient). One way to make Mux21-based circuits compact is to employ a BDD package and generate a Binary Decision Diagram. If you pick the right variable order, BDDs can be much more efficient, and result in the circuit shown at the bottom of Figure 6.2.
While a circuit purist might not like the long path-lengths in such a circuit, it is still intellectually satisfying to know how to turn BDDs into Mux21 circuits. This is what we shall study now.
By typing in these commands at the online BDD package situated at http://formal.cs.utah.edu:8080/pbl/BDD.php, we can generate any desired BDD, in this case, the BDD for an And gate:

Var_Order: a b
Main_Exp: a & b
This BDD is shown on the left-hand side of Figure 6.3 (and likewise we can
obtain the other BDDs shown in this figure). Notice that the circuit at the

Figure 6.1: A prototyping board with Virtex-5 Field Programmable Gate Arrays (FPGAs), consisting of over 300K configurable logic blocks (essentially the Mux21 we studied), is shown (image courtesy of Xilinx/Digilent Inc.). In a research project at Utah called XUM (http://www.cs.utah.edu/fv/XUM), we have packed eight MIPS cores plus an interconnect into such a board.

Figure 6.2: The realization of a 2-input And, by programming the personality directly (top). The more optimized version (bottom) is obtained by converting an And BDD into a Mux circuit.


Figure 6.3: Some Common BDDs: And, Or, and Xor (from left to right). Blue is 1 and Red is 0. Memory aid: 0 is the most fundamental invention in math, and that goes with red (i.e., the U's color :)
bottom of Figure 6.2 and the BDD for And in Figure 6.3 are exactly the same, as far as the core information contained in them goes. In fact, you can now begin reading BDD graphs also as Mux21 circuits.
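For instance, reading the And BDD as a Mux21 circuit gives the following (a Python sketch, with our own function names):

```python
# The And BDD read as a Mux21 circuit: the root decodes a; its 0-branch
# goes straight to the constant 0 (skipping b entirely, as BDDs may),
# and its 1-branch decodes b.
def mux21(s, i0, i1):
    return i1 if s else i0

def and_via_bdd(a, b):
    return mux21(a, 0, mux21(b, 0, 1))

print([and_via_bdd(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 0, 0, 1]
```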

6.1 A Magnitude Comparator

Let us now present a magnitude comparator designed using BDDs. The design of this BDD is presented in Chapter 4, Figure 4.5 (right), which is the correct BDD for implementing A < B. We will now provide this BDD again, and contrast it with a Mux21 circuit that interprets this BDD, both given in Figure 6.4. This contrast should further help you understand how BDDs work. The remaining details are in Chapter 4.


Figure 6.4: A BDD for A < B and a direct Mux21 interpretation of this BDD.
Notice how the lights operate for the four cases shown: 000 < 100, 100 < 100,
100 < 110 and 110 < 111

Module 2

Chapter 7
Intuitive Description of Topics

In this module, we will study many basic topics of Discrete Mathematics. This section attempts to provide a cohesive overview of these topics, providing simple definitions and intuitive examples. This will hopefully minimize your fear (if any) as well as give you a sense of purpose when you descend into later chapters that detail these topics.
Some topics are inter-dependent in a chicken-and-egg manner. For instance, to define predicates, we need to assume that you know what sets are, and to define sets, we need to assume that you know a little bit about predicates. These circularities will be broken by providing convenient working definitions; e.g., when defining predicates, we will provide an English definition of sets.
Chapter 8:
Sets: Sets are collections of items without duplication. The items are drawn from a universe, the full list of things that the sets under discussion may be formed out of.
Characteristic Vector: A set can be modeled using a characteristic vector, a bit vector. Thus, if the universe of possible elements is {a, b, c}, then: (1) the characteristic vector 000 says none of a, b, c are present, i.e., denotes {}; (2) vector 010 denotes {b}; and (3) vector 111 denotes {a, b, c}.
Size of the Powerset of a Set: It is easy to then see that any given set S of N elements has a characteristic vector of length N and hence has 2^N possible subsets (the size of the powerset of S).


Special Sets: We often refer to some special sets: N, the set of natural
numbers; N+ , the set of positive natural numbers excluding 0; Z,
The set of integers or whole numbers; and R, the set of reals.
Defining Sets: There will be two fundamental ways in which to define sets: Explicit definition, and Set Builder. The Set Builder
notation is also known as Set Comprehension.
Predicates on Sets, yielding Truth Values: One can test sets using predicates: membership using ∈, emptiness (isempty), ⊆, ⊂, ⊇, and ⊃.
Operations on Sets, yielding Sets: There are many standard operations that combine sets to produce new sets. Some of the important ones are ∪, ∩, \, and complement (S̄).
Other Operations on Sets: Cartesian product takes two sets S 1 and
S 2 , and produces a set of ordered pairs. Powerset takes a set S
and produces a set of its subsets.

Chapter 9:
Predicates: Predicates are operators such as < and ≠ that yield truth-values by examining and comparing non-Boolean quantities.
Predicate Expressions: Predicate expressions are assertions involving non-Boolean variables and predicates. For example, z > 23 is a predicate expression.
Quantification: Quantification is a convenient way for asserting a
conjunction of many predicate expressions (or disjunction of many
predicate expressions). The two quantifications commonly used
are universal and existential.
Negating Quantified Expressions:

¬(∀x, Odd(x))

can be evaluated using DeMorgan's law to obtain

∃x, Even(x)
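On a finite slice of N, this rule can be checked directly in Python:

```python
# not-forall equals exists-not: ¬(∀x. Odd(x)) is ∃x. ¬Odd(x), and over
# the naturals ¬Odd(x) is the same as Even(x).
xs = range(20)
odd  = lambda x: x % 2 == 1
even = lambda x: x % 2 == 0

lhs = not all(odd(x) for x in xs)
rhs = any(even(x) for x in xs)
print(lhs == rhs)   # prints True
```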

Chapter 10:
Principles of Counting: The basic rules for counting are
Sum rule: If one can divide a counting problem into two disjoint cases, one can then count the two sub-cases and total up.
Inclusion/Exclusion: If the sets have overlaps, then one can count using the inclusion/exclusion rule.
Product rule: If there are N1 ways to do something and N2 ways to do something else, and if these actions are independent, then there are N1 × N2 ways to do both things together.
Permutations: Permutations are the number of subsequences of n
things taken r at a time.
Combinations: Combinations are the number of subsets of n things
taken r at a time.
Chapter 11:
General Principles of Induction: Induction is one of the most fundamental of proof techniques. It is used to prove properties of
infinite sets of items such as natural numbers where there is a
smallest item, and a next item larger than each item.
Deriving Summations of Series: We will learn how to derive and
verify formulae pertaining to summing arithmetic and geometric
progressions (series).
Properties of Trees: We will learn to count the number of leaves, as
well as the total number of nodes, in balanced trees.
Problems Relating to Recurrences: We will learn to apply induction to problems stated using recurrence relations.


Chapter 8
Sets

Sets are collections of items without duplication. The items can be anything, even other sets! Here are some examples of sets:
{1, 2, 3}: a set of numbers
{"dog", "cat", "mouse"}: a set of strings
{"dog", "cat", 22}: a set with two strings and a number (we don't need to ensure that all the items have the same type)
{"dog", "cat", 22, {"dog", 33}}: a set with one of the elements being another set; that is, the fourth element of the outer set is this set: {"dog", 33}.
{}: an empty set (an empty set of numbers, strings, etc.; since it is empty, we really can't tell its type)
Here are some non-examples of sets:
{1, 2, 2, 3}: duplicated number
{"dog", "cat", "dog"}: duplicated string
{{}, {}, 22}: duplicated inner set, i.e. the first and second elements are themselves empty sets
The universe, or Universal set (all the things we can talk about in a
given setting), is always known. For instance, the universe could be integers, just even numbers, a collection of countries, etc.
Sets are one of the central data structures in computer science and mathematics. Even in everyday situations, one can use sets. For instance, suppose in a committee C , there are two people from the US, three from UK,
one from Canada and five from India and zero from Japan (sorry). Then the
set of countries represented by the committee is
97


C = {UK, US, India, Canada}

We forget how many came from each country, and just record the presence/absence, the natural role assigned to a set data structure.

8.1 All of Mathematics Stems from Sets

This section tells you about the fundamental role played by sets in mathematics. It also drives the point home that the notion of sets containing other sets is not at all bizarre, but a fundamental idea that is widely used.
We will introduce the idea of how numbers are represented using sets through a short story. Consider Professor Sayno Toplastix, an avid plastic-bag recycler who wants to illustrate to his class how numbers are represented using sets. Prof. Toplastix simulates sets using supermarket plastic bags, which he has in plenty. Here is how a short session goes:
Prof. Toplastix shows the class: "Look, 0 is represented by this empty plastic bag." He inflates and explodes the bag for emphasis; he pops it so that it truly models ∅, that is, it can no longer reliably hold anything.
Representing 1 takes two bags: it is modeled by a bag within a bag.
Continuing on, 2 needs 4 bags: it is a bag containing (i) an empty bag, i.e. 0, and (ii) a bag containing an empty bag, i.e., 1.
You can now wonder how many plastic bags are needed to represent any number in this fashion. You can begin to observe that to represent N, we will need 2^N bags. More specifically, consider natural numbers (the set {0, 1, 2, . . .}):
0 is modeled as {}, the empty set, requiring 2^0 bags;
1 is modeled as {0}, or {{}}, the set containing 0, requiring 2^1 bags;
2 is modeled as {0, 1}, or {{}, {{}}}, requiring 2^2 bags;
3 is modeled as {0, 1, 2}, or {{}, {{}}, {{}, {{}}}}, requiring 2^3 bags; and so on.
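A small Python sketch (with frozensets playing the role of bags, and our own helper names) confirms the 2^N bag count:

```python
# Von Neumann encoding of naturals: 0 is {}, and n+1 is n ∪ {n}.
# bags() counts every set (bag), including all the nested ones.
def von_neumann(n):
    s = frozenset()
    for _ in range(n):
        s = s | frozenset([s])     # successor: n+1 = n ∪ {n}
    return s

def bags(s):
    return 1 + sum(bags(e) for e in s)

print([bags(von_neumann(n)) for n in range(5)])   # [1, 2, 4, 8, 16]
```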
This exponentially growing number of bags is of no real concern to a mathematician; all they care about is that one can represent everything using sets, i.e., numbers are a derived concept. All of mathematics can be derived from set theory.
Question: What would be the weight of the number 64 represented as above, if one plastic bag weighs about a gram (it actually weighs a lot more, but assuming one gram simplifies our calculations)?


Answer: The number 64 will weigh 2^64 grams.
Here is a quick table of powers of two, and their values:
2^0 = 1
2^1 = 2
2^10 = 1,024, a thousand grams
2^20 = 1,048,576, a million grams
2^30 = 1,073,741,824, a billion grams
2^32: about four billion
2^64: about 16 billion billion grams, or 16 trillion tons (there are 1,000 grams in a kilogram and 1,000 kilograms in a ton)
Thus, 2^64, in plastic bags, will weigh 16 trillion tons!

8.2 Characteristic Vector, Powerset

Characteristic vectors (also known as indicator vectors, https://en.wikipedia.org/wiki/Indicator_vector) are a standard way in which to denote finite sets and their subsets. Thus, if the universe of possible elements is {a, b, c}, then: (1) the characteristic vector 000 says none of a, b, c are present, i.e., denotes {}; (2) vector 010 denotes {b}; and (3) vector 111 denotes {a, b, c}.
In our example involving countries, the universe or Universal set (all the things we can talk about in a given setting) has five elements; namely
{US, UK, Canada, India, Japan}.
Then, committee C is also modeled by 11110.

A characteristic vector of a set over a universe U consisting of N elements is an N-bit vector of 0s and 1s, indicating the presence/absence of each of these N items.

Note: The empty set {} is often written as ∅.

The set of all possible subsets of a set is its powerset. For example,
the powerset of {a, b, c} is this set:

100

CHAPTER 8. SETS

{{},
{a},
{ b},
{ c},
{a, b},
{ b, c},
{a, c},
{a, b, c}}

The Powerset of a Set: The members of this powerset have a characteristic vector associated with them, as follows:

    Subset       Characteristic vector
    {}           000
    {a}          100
    {b}          010
    {c}          001
    {a, b}       110
    {b, c}       011
    {a, c}       101
    {a, b, c}    111

Thus, it is easy to see that any given set S of N elements has a characteristic vector of length N and hence has 2^N possible subsets (the size of the powerset of S).
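The subset-to-vector correspondence above is easy to program (a sketch; the names are ours):

```python
# Map subsets of the universe {a, b, c} to characteristic vectors and
# back; enumerating all N-bit vectors enumerates the whole powerset.
universe = ['a', 'b', 'c']

def char_vector(subset):
    return ''.join('1' if x in subset else '0' for x in universe)

def from_vector(bits):
    return {x for x, bit in zip(universe, bits) if bit == '1'}

print(char_vector({'b'}))            # 010
print(sorted(from_vector('101')))    # ['a', 'c']
print(2 ** len(universe))            # 8 subsets in the powerset
```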
In our committee example, the situation can be modeled using five switches, one for each country, all initially off (down). When one person from a country comes in, they push the switch up. If it's already up, another push won't be recorded; it still stays up.
Or, instead of switches, think of a computer word, all 0s. When someone comes in, they set their bit to a 1. If already set, setting it again keeps it a 1. In our committee example, assuming that Japan is modeled by the last switch, the switches will be
11110


i.e., we will model sets using bit-vectors such as this, with one bit per possible set member.
The powerset of the empty set: Note that the powerset of S is the set of all its subsets (not merely proper subsets, but all subsets). This is why {} has a powerset, which equals {{}}.
Remember that the powerset of any set, even an empty set, contains ∅.

8.3 Special Sets in Mathematics

We often refer to some special sets that help us model various (infinite) sets of numbers we shall often use in our work:

N: The set of natural numbers, i.e. the set {0, 1, 2, 3, 4, 5, . . .}. This is an infinite set of all the positive numbers and 0.

N+: The set of positive natural numbers excluding 0, {1, 2, 3, 4, 5, . . .}. This is also an infinite set.

Z: The set of integers or whole numbers, i.e. the set {0, 1, −1, 2, −2, 3, −3, 4, −4, . . .}. This is an infinite set of all the positive and negative numbers, and 0.

R: The set of reals, i.e. the set {0.1, −1.1222, 1.334, e, π, √2, . . .}. This is an infinite set of all the real numbers.

It is clear that we can derive other sets from the above sets. Some of these are:
Even: The set of even numbers, {0, 2, 4, 6, 8, . . .}
Odd: The set of odd numbers, {1, 3, 5, 7, 9, . . .}
Primes: The set of prime numbers, {2, 3, 5, 7, 11, 13, . . .}

8.4 Approaches to Define Sets

There will be two fundamental ways in which to define sets:


Explicit definition: The simplest way to introduce sets is to write them
out, as in
{1, 2, 33}

which is a set containing three items, namely 1, 2 and 33.


Set Builder: The notation for set builder is to give a template for including all those items that satisfy a condition. This notation is also
known as set comprehension and I shall use these terms interchangeably.

The template used in the set-builder notation is

{x : p(x)}

and it means: form a set of all those x for which the predicate expression p(x) is true.

Many books also use the following notation

{x | p(x)}

It is just a matter of the separator being a ':' or a '|', and we may occasionally use the latter separator.

Characteristic Predicate: For a set S defined using the set-builder notation as follows

S = {x : p(x)}

we call p the characteristic predicate for S. It is assumed that S is defined over a universe U and that x ranges over U also.


Examples of Set Builder (Set Comprehension): Examples of the set builder notation now follow:
{x : (x > 10) ∧ (x ≤ 15)}: This yields the set {11, 12, 13, 14, 15}.
You may ask how I knew to pick only integers, i.e. could this set not also contain fractions, as in

{10.01, 11, 11.02, 14.999, . . .}

This detail is usually pinned down in the set comprehension in two ways:
{x ∈ N : (x > 10) ∧ (x ≤ 15)}, or
{x : (x ∈ N) ∧ (x > 10) ∧ (x ≤ 15)}.
These definitions say what the type of x is.
Test your understanding:
What is {x ∈ N : True}?
Answer: N, because for every x ∈ N, True is true (it does not depend on x).
What is {x ∈ N : isPrime(7)}?
Answer: N, for the same reason as above, because isPrime(7) is true.
What is {x ∈ N : 1 < 2}?
Answer: N.
What is {x ∈ N : False}?
Answer: ∅, because False is false, no matter which x, and this prevents all x's from being included in the set.
What is {x ∈ N : isPrime(4)}?
Answer: ∅, for the same reason as above.
What is {x ∈ N : even(x) ∧ isPrime(x)}?
Answer: {2}.
What is {x ∈ N : (x < 10) ∧ isPrime(x)}?
Answer: {2, 3, 5, 7}.
What is {x ∈ N : isPrime(x)}?
Answer: Primes.
What is {x ∈ N : odd(x)}?
Answer: Odd.
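These set-builder examples translate almost literally into Python set comprehensions; in this sketch, N is truncated to range(100), and is_prime is our own small helper:

```python
# Set builder {x ∈ N : p(x)} becomes {x for x in N if p(x)} in Python.
N = range(100)   # a finite stand-in for the naturals

def is_prime(x):
    return x >= 2 and all(x % d != 0 for d in range(2, x))

print({x for x in N if x > 10 and x <= 15})          # {11, 12, 13, 14, 15}
print({x for x in N if x % 2 == 0 and is_prime(x)})  # {2}
print({x for x in N if x < 10 and is_prime(x)})      # {2, 3, 5, 7}
print({x for x in N if False})                       # set(), i.e. ∅
```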

8.4.1 PYTHON EXECUTION

In the following sections, we will illustrate many examples using Python. You can run simple Python scripts even without installing it on your machine. Here are some approaches:
Run Python in your browser using http://www.skulpt.org/. You'll see a Demo window (above) as well as an interactive window (below). You may try the interactive window.
There are also other approaches:
In http://jupyter.org/, try Python in your browser.
Use the Python Tutor at http://www.pythontutor.com/.
Finally, Python installs easily even on your phone.
We really expect you to be running the suggested examples in Python while reading this chapter. This is a good way to obtain practice.

8.5 Operations on Sets

8.5.1 Cardinality or Size

The cardinality of a finite set is its size expressed as a number in N (a natural number). The cardinality of {} is 0. The cardinalities of {1}, {2}, {{}}, {{123}}, and {2016} are all 1. The cardinalities of {1, 2}, {{}, {1}}, and {2, "hi"} are all 2. The cardinality of infinite sets will be defined in a different way (this comes much later in our course).
There are two standard ways in which to write down the cardinality of a set S. They are: (i) |S|, and (ii) n(S).
For finite sets A and B:
If A ⊂ B, then |A| < |B|, or in the alternate notation, n(A) < n(B).
If A ⊆ B, then |A| ≤ |B|, or in the alternate notation, n(A) ≤ n(B).
In Python, the function len computes the cardinality of a set.
In Python, the function len computes the cardinality of a set.
>>> A = {1,2}
>>> B = {1,2,3}
>>> len(A)

8.5. OPERATIONS ON SETS

105

2
>>> len(B)
3
>>> A <= B
True

We will not, at this point, define the notion of cardinality for infinite sets; just keep in mind that this takes a whole different (but very interesting) approach!
The operator used to denote the size of a set S is either |S| or n(S) (standing for the number of elements). For example, |{}| = 0 and |{2, 3, 1}| = 3.
Notice that we can define sets using the range() function in Python. For instance, set(range(3)) is the set {0,1,2}. This is a very convenient way to generate a set, given its cardinality. Here are some variations of the range() function:
If you want to begin a set at a different point, provide an additional argument:
E.g., set(range(1,3)) is the set {1,2} (start from 1, but leave out 3).
E.g., set(range(10,13)) is the set {10,11,12} (Python's convention is inclusive/exclusive, i.e., start from 10, but leave out 13).
E.g., set(range(10,18,2)) returns {16, 10, 12, 14}. Notice that Python does not guarantee any standard way of printing the contents of a set, say in ascending or descending order.
Here, we get the set {10, 12, 14, 16}, which by the inclusive/exclusive convention leaves out things that touch or fall beyond 18.
Note: We have to wrap the range(3) call inside a set() call; otherwise, we
will often be left with a list, not a set.
NOTE: I deliberately change around the listing order of the contents of
a set to prevent you from taking advantage of this order. Thus, {1, 2, 3},
{2, 1, 3}, {3, 2, 1} are all the same set. By the same token:
Dangerous coding: Please don't take the str() (string of) operation of a set
and then assume that two equal sets have the same string representation.
They often don't! This was a nasty bug I ran into long ago.

8.6 Operations on Sets

The basic set operations are now introduced. I highly encourage you to try
these in Python (most definitions given here should work in Python3; if not,
try Python2). When I provide something in teletype fonts, it is usually
the Python syntax I'm referring to.
• Union, written s1 ∪ s2, or S1 | S2 in Python.
Example: {1, 2} ∪ {1, 3} or {1,2} | {1,3},
resulting in {3, 1, 2} or {3,2,1}.
• Intersection, written s1 ∩ s2, or S1 & S2 in Python.
Example: {1, 2} ∩ {1, 3} or {1,2} & {1,3},
resulting in {1} or {1}.
Example: {1, 2} ∩ {4, 3} or {1,2} & {4,3},
resulting in {} or set() (Python prints the empty set as set(), since
{} denotes an empty dictionary).
• Difference or subtraction, written s1 \ s2, or S1 - S2 in Python.
Example: {1, 2} \ {1, 3} or {1,2} - {1,3},
resulting in {2} or {2}.
Example: {1, 2} \ {4, 3} or {1,2} - {4,3},
resulting in {1, 2} or {1,2}.
Example: {1} \ {2, 3} or {1} - {2,3},
resulting in {1} or {1}.
Example: {1} \ {1, 2} or {1} - {1,2},
resulting in {} or set().
• Now, symmetric difference, written S1 ^ S2 in Python, has the
standard mathematical symbol △. s1 △ s2 stands for (s1 \ s2) ∪ (s2 \ s1).


Example: {1, 2} △ {1, 3} or {1,2} ^ {1,3},
resulting in {2, 3} or {2,3}.
Example: {1, 2} △ {4, 3} or {1,2} ^ {4,3},
resulting in {1, 4, 2, 3} or {2,1,3,4}.
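The definition above is easy to spot-check in Python; here is a minimal sketch using the sample sets from the examples:

```python
# A quick sanity check of the definition s1 ^ s2 == (s1 - s2) | (s2 - s1).
s1, s2 = {1, 2}, {4, 3}

lhs = s1 ^ s2                  # Python's built-in symmetric difference
rhs = (s1 - s2) | (s2 - s1)    # the definition, spelled out

assert lhs == rhs
assert lhs == {1, 2, 3, 4}     # listing order is irrelevant for sets
```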
• The complement of a set is defined with respect to a universal set U.
Its mathematical operator is written as an overbar.
Formally, given a set S and a universal set (or "universe") U, the
complement of set S with respect to U is given by U \ S (or U − S). In
many problems, you will be given a universal set U that is finite (and
quite small). Regardless, you always subtract the set S from U using
the set subtraction operator in order to complement S.
In a Venn diagram, the universal set U is drawn as an all-encompassing
rectangle. For example, in Figure 8.2, the universe is shown, and the
complement of set A with respect to the universe is the region within
this rectangle that is outside of circle A.
We will rarely (at least in CS 2100) perform a complement operation in
Python. The main reason is that complementation is often used when
the domain is infinite, and representing infinite domains is somewhat
non-trivial (hence skipped) in Python. Mathematics, on the other
hand, has no such issues.
Notice the spelling: it is "complement" and not "compliment."1
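When the universe is finite, complementation is just set subtraction; here is a minimal sketch (the universe U below is my own choice):

```python
# Complementation over a small finite universe.
U = set(range(10))
S = {2, 4, 6}

S_bar = U - S                  # complement of S with respect to U

assert S | S_bar == U          # a set and its complement cover U
assert S & S_bar == set()      # ... and they share no elements
```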
• The subset operation ⊆ is written <= in Python, and the proper subset
operation ⊂ is written <.
Example: {1, 2} ⊆ {1, 2} or {1,2} <= {1,2},
resulting in true or True.
Example: {1, 2} ⊆ {1, 2, 3} or {1,2} <= {1,2,3},
resulting in true or True.
1 The latter is what I will do if you earn an A grade in this course. The former is what
you do to flip a set.

Example: {} ⊆ {1, 2, 3} or {} <= {1,2,3},
resulting in true or True.
Example: {1, 2, 3, 4} ⊆ {1, 2, 3} or {1,2,3,4} <= {1,2,3},
resulting in false or False.
Example: {1, 2} ⊂ {1, 2} or {1,2} < {1,2},
resulting in false or False.
Example: {1, 2} ⊂ {1, 2, 3} or {1,2} < {1,2,3},
resulting in true or True.
Example: {} ⊂ {1, 2, 3} or {} < {1,2,3},
resulting in true or True.
Example: {1, 2, 3, 4} ⊂ {1, 2, 3} or {1,2,3,4} < {1,2,3},
resulting in false or False.

• The superset operation ⊇ is written >= in Python, and the proper
superset operation ⊃ is written >.
Now, A ⊆ B if and only if B ⊇ A.
Now, A ⊂ B if and only if B ⊃ A.
Please infer the related facts about the Python operators. Try it out.
Almost everything we define for sets also applies equally to lists.
Try it out.
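The subset/superset duality can be checked directly with Python's comparison operators; the sample sets are my own choice:

```python
# Subset/superset duality, checked with Python's set comparison operators.
A, B = {1, 2}, {1, 2, 3}

assert (A <= B) == (B >= A)    # A is a subset of B iff B is a superset of A
assert (A < B) == (B > A)      # likewise for proper subset/superset
assert A <= A                  # every set is a subset of itself...
assert not (A < A)             # ...but never a *proper* subset of itself
```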

Here is a terminal session illuminating a few things (notice that in
Python 2, range() creates a list; in Python 3 it creates a lazy range
object, so the last comparison below would be False there):

>>> set(range(2)) <= {0,1}
True
>>> set(range(2)) >= {0,1}
True
>>> range(2) == {0,1}
False
>>> range(2) == [0,1]
True

8.7 Venn Diagrams

John Venn, the English mathematician of the 19th century, evolved a
convention for depicting sets and their relationships that has acquired the
name "Venn diagrams." A good illustration of the use of Venn diagrams is
given in [2], a web article. The distinction between "Tiffany likes shoes
that are expensive" and "Tiffany likes shoes, which are expensive"
(notice the comma after "shoes") is best captured by a Venn diagram as in
Figure 8.1. The former looks for common elements between Shoes and
Expensive Items, whereas the latter looks at Expensive Items and finds
a subset within it called Expensive Shoes.

Figure 8.1: "That" versus "Which" in English usage (adapted from
http://home.earthlink.net/~llica/wichthat.htm). The diagram contrasts
"Shoes that are expensive" (the overlap of the Shoes and Expensive Items
circles) with "Shoes, which are expensive" (Expensive Shoes drawn as a
subset of Expensive Items).

We will of course not be delving too much into English grammar in this
course, but it is good to know that Venn diagrams can come in handy even
to disambiguate English constructions in technical writing.
We will be studying Venn diagrams more in depth later in this chapter.

8.7.1 Details of Venn Diagrams

Figure 8.2: The Familiar Venn Diagram of 3 sets (drawn inside an
all-encompassing rectangle labeled "Universe")

Figure 8.3: Venn Diagrams of order 5 (left); of order 5 with regions
colorized (middle); and order 7 (right). Images courtesy of http://mathworld.
wolfram.com/VennDiagram.html and http://www.theory.csc.uvic.ca/
~cos/inf/comb/SubsetInfo.html#Venn.

Venn diagrams are one of the most widely used of notations to depict sets
and their inclusion relationships. Usually one draws the universal set as a
rectangle, and within it depicts closed curves representing various sets. I am
sure you have seen simple Venn diagrams showing three circles representing
three sets A, B, and C, and showing all the regions defined by the sets (e.g.,
Figure 8.2), namely the eight sets: A ∩ B ∩ C (points in all three
sets); A ∩ B, B ∩ C, and A ∩ C (points in any two sets chosen among the
three); then A, B, and C (points in the three individual sets); and finally
∅ (points in no set at all, shown outside of the circles).


Venn diagrams are schematic diagrams used in logic theory to depict
collections of sets and represent their relationships [4, 5]. More formally,
an order-N Venn diagram is a collection of simple closed curves in the
plane such that
1. The curves partition the plane into connected regions, and
2. Each subset S of {1, 2, . . . , N} corresponds to a unique region formed
by the intersection of the interiors of the curves in S [3].
Venn diagrams involving five and seven sets are beautifully depicted in
these websites, and the associated combinatorics is also worked out. Two
illustrations from the latter site are shown in Figure 8.3, where
the colors represent the number of regions included inside the closed curves.
Illustration of the total number of regions in a Venn diagram: For
the Venn diagram of Figure 8.3 (middle), there are a total of 2⁵ = 1 + 5 + 10 +
10 + 5 + 1 = 32 regions. Why this follows this rule (a power of 2) will be the
subject of our study later; it is a beautiful result covering Permutations,
Combinations, and Binomial Coefficients.
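The count 2⁵ = 1 + 5 + 10 + 10 + 5 + 1 is a sum of binomial coefficients, which we can confirm in Python (math.comb requires Python 3.8 or later):

```python
from math import comb

# Regions of an order-5 Venn diagram, grouped by how many curves
# enclose them: comb(5, k) regions lie inside exactly k curves.
n = 5
regions = sum(comb(n, k) for k in range(n + 1))  # 1+5+10+10+5+1

assert regions == 2 ** n == 32
```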

8.8 Set Identities

Sets are set up very similarly to propositional logic, and hence there are many
set identities that track logical identities. We provide a listing in a table,
reusing some of the logical identities also. We take candidate sets A, B, and
C in our discussions. Here, two sets S1 and S2 are equal if they have the
same elements; or, in other words:

(S1 = S2) ⇔ (S1 ⊆ S2) ∧ (S2 ⊆ S1)

That is, S1 = S2 if and only if each set contains the other.
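Python's == on sets agrees with this "mutual containment" reading, as a quick check (with sample sets of my own choosing) shows:

```python
# Set equality is mutual containment; Python's == agrees with that reading.
S1 = {1, 2, 3}
S2 = {3, 2, 1}   # same elements, listed in a different order

assert (S1 == S2) == ((S1 <= S2) and (S2 <= S1))
assert S1 == S2  # listing order never matters for sets
```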
Precedences: As far as parsing set expressions, ∩ follows the same rules
as ∧, and ∪ follows the same rules as ∨. Also, ¬ and complementation bind
the tightest. When in doubt (i.e., almost always), we shall use parentheses.
We shall gradually build toward showing you set identities, after making
sure that you see how the basic relationships between sets and logic work.

Or-distribution:    (p ∧ (q ∨ r)) ≡ ((p ∧ q) ∨ (p ∧ r))    A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
And-distribution:   ((p ∨ q) ∧ r) ≡ ((p ∧ r) ∨ (q ∧ r))    (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
And-commutation:    p ∧ q ≡ q ∧ p                          A ∩ B = B ∩ A
Or-commutation:     p ∨ q ≡ q ∨ p                          A ∪ B = B ∪ A
Negation:           p ∧ ¬p ≡ False                         A ∩ Ā = ∅
Implied Negation:   p ∧ ¬(p ∧ q) ≡ p ∧ ¬q                  A − (A ∩ B) = A − B
DeMorgan:           ¬(p ∨ q) ≡ (¬p ∧ ¬q)                   the complement of (A ∪ B) = Ā ∩ B̄
Complementation:    (x ∨ ¬x) ≡ 1                           A ∪ Ā = U

Figure 8.4: Set Identities (note how similar to Logical Identities)
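Several rows of this table can be spot-checked in Python over a small finite universe (the universe and sample sets below are my own choices; this checks instances, not the general identities):

```python
# Spot-checking several set identities on small sample sets.
U = set(range(8))
A, B, C = {1, 2, 3}, {2, 3, 4}, {3, 4, 5}

def comp(S):
    """Complement of S with respect to the finite universe U."""
    return U - S

assert A & (B | C) == (A & B) | (A & C)    # Or-distribution
assert (A | B) & C == (A & C) | (B & C)    # And-distribution
assert A & comp(A) == set()                # Negation
assert A - (A & B) == A - B                # Implied Negation
assert comp(A | B) == comp(A) & comp(B)    # DeMorgan
assert A | comp(A) == U                    # Complementation
```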

8.8.1 Connection between Operators in Logic and Sets

It must be apparent that ∧ (and) behaves similar to intersection, and
∨ (or) behaves similar to union. For example, if x belongs to sets A and
B, then it belongs to their intersection. Likewise, complementation of sets
and negation behave similarly. Here are some of these connections, more
formally, for sets S1 and S2 defined over a universe called U, and for x ∈ U
being an arbitrary item in U.
Let the characteristic predicates of S1 and S2 be p1 and p2 respectively.
That is,

S1 = {x ∈ U : p1(x)}   and   S2 = {x ∈ U : p2(x)}

Connections between logic and sets:
• Union (∪): An element belongs to a union if it belongs to either set
(according to the characteristic predicates p1 and p2).

S1 ∪ S2 = {x : p1(x) ∨ p2(x)}


• Intersection (∩):

S1 ∩ S2 = {x : p1(x) ∧ p2(x)}

• Complement (written with an overbar): the complement of S1 is

{x ∈ U : ¬p1(x)}

• Subtraction (−, or sometimes shown as \):

S1 − S2 = {x : p1(x) ∧ ¬p2(x)}

Containment vs. Implication: With sets and subsets, there is a nice
connection with implication. We will not present too many implication-
oriented rules regarding sets; but keep in mind this nifty fact:

S1 ⊆ S2 ⇔ (for every x ∈ U, p1(x) → p2(x))

That is, set containment (⊆) holds between two sets S1 and S2 if the
fact that an element is in S1 (determined by applying p1) implies that
the element is in set S2 also (as per p2(x)).
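This nifty fact can also be checked over a finite universe; the predicates below are my own examples:

```python
# Containment tracks implication of the characteristic predicates.
U = set(range(10))

def p1(x): return x % 4 == 0   # characteristic predicate of S1
def p2(x): return x % 2 == 0   # characteristic predicate of S2

S1 = {x for x in U if p1(x)}   # multiples of 4 in U
S2 = {x for x in U if p2(x)}   # even numbers in U

# S1 <= S2 holds exactly when p1(x) -> p2(x) holds for every x in U:
implication_holds = all((not p1(x)) or p2(x) for x in U)
assert (S1 <= S2) and implication_holds
```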

8.8.2 Python Illustration of Set/Logic Connection

The beauty of studying sets using Python is that you get ready
reinforcement by typing things into a terminal. You can not only work out
a problem by hand, but also check your answer, and also try out many
problems on your own. With these ideas in mind, we provide you with a few
snippets of examples that you may try on your own:

>>> U = set(range(10))
>>> U
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> S_1 = {x for x in U if x < 5 }
>>> S_2 = {x for x in U if x > 3 }
>>> S_1
{0, 1, 2, 3, 4}
>>> S_2
{4, 5, 6, 7, 8, 9}
>>> S_1cup2 = {x for x in U if (x < 5) or (x > 3) }
>>> S_1cup2
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> S_1cap2 = {x for x in U if (x < 5) and (x > 3) }
>>> S_1cap2
{4}
>>> S_1bar = {x for x in U if not(x < 5) }
>>> S_1bar
{8, 9, 5, 6, 7}
>>> S_1 <= S_2
False
>>> S_1 <= U
True
>>> S_2 - S_1
{8, 9, 5, 6, 7}
>>> S_2minus1 = { x for x in U if ((x > 3) and not(x < 5)) }
>>> S_2minus1
{8, 9, 5, 6, 7}

8.8.3 Formal Proofs of Set Identities

Using the logical definitions of sets and their identities given before, as well
as within Figure 8.4, we will now provide proofs for a few important set
identities (we also leave a few as exercises).

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A Formal Proof (see Figure 8.5):

A ∩ (B ∪ C)
  = {x : x ∈ A ∧ x ∈ (B ∪ C)}                      (definition of ∩)
  = {x : x ∈ A ∧ (x ∈ B ∨ x ∈ C)}                  (definition of ∪)
  = {x : (x ∈ A ∧ x ∈ B) ∨ (x ∈ A ∧ x ∈ C)}        (∧ distributes)
  = {x : x ∈ (A ∩ B) ∨ x ∈ (A ∩ C)}                (definition of ∩)
  = {x : x ∈ ((A ∩ B) ∪ (A ∩ C))}                  (definition of ∪)
  = (A ∩ B) ∪ (A ∩ C)

Figure 8.5: Venn diagram for A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). The
left-hand side shades B | C and then A & (B | C); the right-hand side
shades A & B and then (A & B) | (A & C).


The complement of (A ∪ B) equals Ā ∩ B̄
A Formal Proof (see Figure 8.6):

the complement of (A ∪ B)
  = {x : ¬(x ∈ (A ∪ B))}                  (definition of complement)
  = {x : ¬((x ∈ A) ∨ (x ∈ B))}            (definition of ∪)
  = {x : (¬(x ∈ A)) ∧ (¬(x ∈ B))}         (DeMorgan's Law)
  = {x : (x ∉ A) ∧ (x ∉ B)}               (definition of ∉)
  = {x : (x ∈ Ā) ∧ (x ∈ B̄)}              (definition of complement)
  = {x : x ∈ (Ā ∩ B̄)}                    (definition of ∩)
  = Ā ∩ B̄

(A △ B) = (A ∪ B) − (A ∩ B)
A Formal Proof (see Figure 8.7)

Figure 8.6: Venn diagram for: the complement of (A ∪ B) equals Ā ∩ B̄.
With universe S, the left-hand side builds S - A, S - B, and then
(S - A) & (S - B); the right-hand side builds A | B and then
S - ((S - A) & (S - B)).

Figure 8.7: Venn diagram for (A △ B) = (A ∪ B) − (A ∩ B). The left-hand
side shades A | B, A & B, and A ^ B; the right-hand side shades A ^ B as
(A | B) - (A & B).

This one is pretty long. Notes are put below the previous line.

A △ B
  = {x : (x ∈ A ∧ x ∉ B) ∨ (x ∈ B ∧ x ∉ A)}
        (definition of △)
  = {x : ((x ∈ A ∧ x ∉ B) ∨ x ∈ B) ∧ ((x ∈ A ∧ x ∉ B) ∨ x ∉ A)}
        (∨ distributes)
  = {x : ((x ∈ A ∨ x ∈ B) ∧ (x ∉ B ∨ x ∈ B)) ∧ ((x ∈ A ∧ x ∉ B) ∨ x ∉ A)}
        (∨ distributes again, on the left)
  = {x : ((x ∈ A ∨ x ∈ B) ∧ true) ∧ ((x ∈ A ∧ x ∉ B) ∨ x ∉ A)}
        (p ∨ ¬p is always true)
  = {x : (x ∈ A ∨ x ∈ B) ∧ ((x ∈ A ∧ x ∉ B) ∨ x ∉ A)}
        (p ∧ true has the same truth value as p)
  = {x : (x ∈ A ∨ x ∈ B) ∧ ((x ∈ A ∨ x ∉ A) ∧ (x ∉ B ∨ x ∉ A))}
        (∨ distributes again, on the right)
  = {x : (x ∈ A ∨ x ∈ B) ∧ (true ∧ (x ∉ B ∨ x ∉ A))}
        (p ∨ ¬p is always true)
  = {x : (x ∈ A ∨ x ∈ B) ∧ (x ∉ B ∨ x ∉ A)}
        (true ∧ p has the same truth value as p)
  = {x : (x ∈ A ∨ x ∈ B) ∧ ¬(x ∈ B ∧ x ∈ A)}
        (DeMorgan's Law)
  = {x : (x ∈ A ∨ x ∈ B) ∧ ¬(x ∈ (A ∩ B))}
        (definition of ∩)
  = {x : (x ∈ (A ∪ B)) ∧ (x ∉ (A ∩ B))}
        (definition of ∪)
  = (A ∪ B) − (A ∩ B)
        (definition of −)

8.8.4 Checking the Proofs Using Python

The proof given in 8.8.3 for A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) can be checked
in Python as follows. While the checking is being done for specific input
sets, it at least gives reassurance that no simple superficial mistakes have
been made.

# A = set([2, 4, 6, 8, 10, 12, 14, 16])
A = { i for i in range(2,17) if (i%2 == 0) }
# B = set([3, 5, 7, 9, 11, 13, 15])
B = { i for i in range(2,17) if (i%2 == 1) }
# C = set([3, 4, 6, 8, 9, 12, 15, 16])
C = { i for i in range(2,17) if ((i%3 == 0) | (i%4 == 0)) }
# S = set([2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
S = { i for i in range(2,17) }
# ---------- BEGIN -- A & (B | C) == (A & B) | (A & C)
# --- Using & and | for set operations; "and" and "or" for logical operations
# LHS
T0 = A & (B | C)
# LHS, written in set comprehension form
T1 = { x for x in S if x in A & (B | C) }
# defn of & on sets : set to logic
T2 = { x for x in S if (x in A) and (x in B|C) }
# defn of | on sets : set to logic
T3 = { x for x in S if (x in A) and ((x in B) or (x in C)) }
# "and" distributes : in logic
T4 = { x for x in S if ((x in A) and (x in B)) or ((x in A) and (x in C)) }
# defn of & : logic to set
T5 = { x for x in S if (x in A & B) or (x in A & C) }
# defn of | : logic to set
T6 = { x for x in S if x in (A & B) | (A & C) }
# RHS
T7 = (A & B) | (A & C)
# One way to put in assertions
assert(T0 == T1 == T2 == T3 == T4 == T5 == T6 == T7), \
    "T0 == T1 == T2 == T3 == T4 == T5 == T6 == T7 VIOLATED!!"

8.9 Cartesian Product and Powerset

We now provide two important operations that build new sets from existing
sets. The first of these, the cartesian product (8.9.1), allows us to take two
sets and pair up elements across them. We also define the notion of an
ordered pair in this section. The second of these, the powerset (discussed
briefly in 8.2), allows us to take all the subsets of a set, and will be
presented in more detail in 8.9.3.

8.9.1 Cartesian Product

Ordered Pairs, Triples, etc. There is a data type called the ordered pair. It
looks like (1,2). It is not a set. It just pairs up things. One can pair up
dissimilar things also. Please see some examples from Python:
• (2, "a"), an ordered pair of a number and a string.
• (2, {2}), an ordered pair of a number and a set.
We can also triple things (put three things together):
• (2,{},"a"): A triple of a number, a set, and a string.
• (2,{2},"2"): Another triple of a number, a set, and a string.
• (2,{3},{2,{3}}): Another example of a triple.
In mathematics, ordered pairs are, in turn, defined using sets. For instance,
the ordered pair (2, 3) is commonly modeled in mathematics as the set
{{2}, {2, 3}} (the Kuratowski encoding). This is mainly for our general
knowledge (we will not have much use of this definition elsewhere in this
book).

Cartesian Product We now introduce a set operator called the cartesian
product (some books call this the cross product). Given two sets A and
B, their cartesian product A × B is defined as follows:

A × B = {(x, y) : x ∈ A and y ∈ B}

The notation above defines all pairs (x, y) such that x belongs to A and y
belongs to B. To understand cartesian products, we can readily obtain some
practice with Python:
>>> { (x,y) for x in {1,2,3} for y in {11,22} }


set([(1, 22), (3, 22), (2, 11), (3, 11), (2, 22), (1, 11)])
>>> { (x,y) for x in {10,20,30} for y in {"he", "she"} }
set([(10, 'he'), (30, 'she'), (20, 'she'), (20, 'he'), (10, 'she'), (30, 'he')])
>>> { (x,y) for x in {} for y in {"he", "she"} }
set([])

8.9.2 Cardinality of a Cartesian Product

Notice that the cardinality of the cartesian product of two sets S1 and S2
equals the product of the cardinalities of the sets S1 and S2. That is,

|S1 × S2| = |S1| × |S2|

Thus, if S1 has 4 elements and S2 has 5 elements, their cartesian product
will have 20 elements. If one of the sets is empty (size 0), the cartesian
product results in an empty set (as the size of the resulting set must be 0
times something, which is 0).
Let us see some examples that confirm these facts:
>>> S1 = {1,2,3,4}
>>> len(S1)
4
>>> S2 = {"he","she","it"}
>>> len(S2)
3
>>> S1timesS2 = { (x,y) for x in S1 for y in S2 }
>>> len(S1timesS2)
12
>>> S0 = {}
>>> len(S0)
0
>>> S0timesS1 = { (x,y) for x in S0 for y in S1 }
>>> S0timesS1
set()
>>> len(S0timesS1)
0
>>>

8.9.3 Powerset

In this section we discuss powersets, how to generate them in Python, and
some of the real-world situations where powersets occur.

Figure 8.8: Powerset as a Lattice

The way the powerset algorithm works is easy to explain with respect
to the structure of the recursion in Figure 8.10. We explain it through the
following steps:
1. The powerset of the empty set {} is {{}}, because we are supposed to
return the set of subsets of {}; and there is only one subset for {}, which
is itself:

    L = list(S)
    if L == []:
        return([[]])

2. For a non-empty set, the powerset is calculated as follows:
(a) First, calculate the powerset of the rest of the set:

    else:
        pow_rest0 = pow(L[1:])

(b) Then calculate the set obtained by pasting the first element of the
original set onto every set in pow_rest0:

        pow_rest1 = list(map(lambda ls: [L[0]]+ls, pow_rest0))

(c) Finally, compute a set of sets, containing all the sets within pow_rest0
and pow_rest1:

        return(pow_rest0 + pow_rest1)

8.9.4 Application: Electoral Maps

You have seen maps such as in Figure 8.9. There are a total of 2⁵⁰ such
electoral maps possible, with Republican (red) and Democrat (blue) states
shown [1]. The reason is obvious: any subset of states could be won by
either party.

Figure 8.9: Recent electoral maps of the USA. Notice that each state can
be won by Democrats (blue) or Republicans (red). Let's take all possible
electoral maps. This must clearly be equal to the powerset of the set of
states in the US (all states won by Democrats, all the way to zero states won
by them). Thus, there are 2⁵⁰ possible electoral maps. Which one will it be,
in 2016?

def pow(S):
    """Powerset of a set S. Since sets/lists are unhashable,
    we convert the set to a list, perform the powerset operations,
    leaving the result as a list (can't convert back to a set).
    pow(set(['ab', 'bc'])) --> [['ab', 'bc'], ['bc'], ['ab'], []]
    """
    L = list(S)
    if L == []:
        return([[]])
    else:
        pow_rest0 = pow(L[1:])
        pow_rest1 = list(map(lambda ls: [L[0]]+ls, pow_rest0))
        return(pow_rest0 + pow_rest1)

>>> pow
<function pow at 0x026E1FB0>
>>> pow({1,2,3})
[[], [3], [2], [2, 3], [1], [1, 3], [1, 2], [1, 2, 3]]
>>> pow({})
[[]]
>>> pow({'hi','there',5})
[[], [5], ['there'], ['there', 5], ['hi'], ['hi', 5], ['hi', 'there'], ['hi', 'there', 5]]
>>> len(pow(range(1)))
2
>>> len(pow(range(2)))
4
>>> len(pow(range(4)))
16
>>> len(pow(range(10)))
1024
>>> len(pow(range(20)))
1048576

Figure 8.10: The Powerset function, and how it recurses

Chapter 9

Predicate Logic

In computer programming, it is important to be able to make assertions
about numbers, sets, trees, hash-tables, etc. After all, you may test any of
these data structures and take a branch in a piece of code. For example,
consider a program that looks up a hash-table H for a key k; if the
key is present, and the value v against the key is odd, the program control
branches one way; else it branches the other way. Already, we have used two
predicates:
• The hash-table has a key, modeled by predicate has, as in its usage has(k, H).
• The key is associated with a value, modeled by lookup, as in its usage
isodd(lookup(k, H)).
Clearly, in order to understand programs and compute their flow-paths (say,
for program testing), one needs to reason about predicates, and tell when
they will become true.
This chapter will give you more such examples, and then introduce the
idea of stating interesting facts in predicate logic. We will also study a
generalized form of DeMorgan's law that we will use to negate quantified
statements.

9.1 Predicates and Predicate Expressions

Predicates are operators such as < and ≠ that yield truth-values by
examining and comparing non-Boolean quantities. We also saw two
predicates, isodd and has, in our example above.

Books on mathematics split hairs over "predicate symbols" vs. "predicates."
For now, we will assume that they are one and the same. Later, when we
study relations, we will define this distinction better. As we make progress,
we will often get sloppier, and use "predicate" even for "predicate
expression." These are widely tolerated notational abuses.
One can write predicate expressions such as 2 < 3 using predicates. We
know that 2 < 3 is true. It helps state assertions about non-Boolean items
such as 2 and 3, and also non-Boolean (integer) variables such as z.
Other examples of predicates are Brother ("is a brother of"), Older ("is
older than"), and Colder ("is colder than"). Here are their usages to build
some predicate expressions:
• Brother: Brother(x, y) might mean "x is a brother of y."
• Older: Older(x, y) might mean "x is older than y." (You have to pick a
convention: is the first argument the older guy?)
• Colder: Colder(MyHand, Ice) might mean "my hand is colder than ice."
There are other predicates that we have studied in Chapter 8 in conjunction
with sets. For instance:
• ∈: "is an element of"
• ⊂: "is a proper subset of"
• ⊆: "is a subset of, or is the same as"

Predicate Expressions We will define things other than propositions
that have truth-values. For example, if x, y are Boolean variables, they
can take on truth values, and so can x ∧ y. But when I write z > 23, it is
clear that z is a number such as 24 or 25, for which this assertion is true.
Predicate expressions are assertions involving non-Boolean variables
and predicates. For example, z > 23 is a predicate expression. Once we
absorb this idea, we can define conjunctions of predicate expressions and
such, exactly as in propositional logic.
Some examples:
• z > 23 ∧ z < 25: This is parsed (z > 23) ∧ (z < 25). In this case, z is
pinned to be 24.
• z > 23 ∧ z ≤ 25: This is parsed (z > 23) ∧ (z ≤ 25). In this case, z
could be one of 24 or 25.
• x ∈ {1, 2, 3}: x is a member of the set {1, 2, 3}.
• {1, 2} ⊂ {1, 2, 3}: {1, 2} is a proper subset of {1, 2, 3}.
• Odd(x) ∧ Colder(MyHand, DryIce): x is odd and my hand is colder
than dry ice.
Programming language conditional statements such as

((x == 0) or (y > z))

are indeed predicate expressions. We already saw how to negate them
using DeMorgan's laws, in our homeworks.

Quantification Quantification is a convenient way of asserting a
conjunction of many predicate expressions (or a disjunction of many
predicate expressions). With infinite sets, quantification is the only way to
express such conjunctions/disjunctions. The two quantifications commonly
used are universal (written ∀), standing for repeated conjunction, and
existential (written ∃), standing for repeated disjunction.
Some details and examples:
• ∀, or "Forall," which looks like an upside-down A. This is a quantifier,
asserting lots of "and"s (..and..and..and over many items).
Usage of Forall:
∀x, Odd(x) ∨ Odd(x + 1): This might be true in some cases. This is a
way of saying "For all x, either x is odd or x + 1 is odd." You have to
say more (e.g., where does x come from?), but these are the kinds of
things one likes to say using quantification.
I hope you see that this is really like saying

(Odd(0) ∨ Odd(1)) ∧ (Odd(1) ∨ Odd(2)) ∧ (Odd(2) ∨ Odd(3)) ∧ . . .

This is like other notations in mathematics that repeat operators. For
example, ∏ repeats multiplication, as in

∏_{i=1}^{5} i = 1 × 2 × 3 × 4 × 5 = 120

and ∑ repeats addition, as in

∑_{i=1}^{5} i = 1 + 2 + 3 + 4 + 5 = 15.

Likewise, ∀ helps compactly describe repeated conjunctions, and ∃
helps compactly describe repeated disjunctions.
• ∃, or "Exists," which looks like a backward E. This is a quantifier,
asserting lots of "or"s (..or..or..or over many items). As said before, ∃
repeats disjunction.
Consider the assertion ∃x, Odd(x). This assertion might be true,
depending on where the x are drawn from. For instance, ∃x ∈ Even, Odd(x)
is false (if Even denotes all even numbers), while ∃x ∈ N, Odd(x) is true.
I hope you see that these existential assertions are really a shorthand
for an assertion of the form

Odd(0) ∨ Odd(1) ∨ Odd(2) ∨ Odd(3) ∨ . . .


Negating Quantified Expressions We already mentioned that ∃x, Odd(x)
is a short-hand for

(Odd(0) ∨ Odd(1) ∨ Odd(2) ∨ Odd(3) ∨ . . .)

Thus, it must be clear that

¬(∃x, Odd(x))

can be evaluated using DeMorgan's law. The result will be

¬(Odd(0) ∨ Odd(1) ∨ Odd(2) ∨ Odd(3) ∨ . . .)
≡ (¬Odd(0) ∧ ¬Odd(1) ∧ ¬Odd(2) ∧ ¬Odd(3) ∧ . . .)
≡ (Even(0) ∧ Even(1) ∧ Even(2) ∧ Even(3) ∧ . . .).

That is, the negation of "there exists an odd x" is "forall x, it is the case
that x is even." Whether true or false, that is what the negation asserts.
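Over a finite domain, Python's all() plays the role of ∀ and any() plays the role of ∃, so this generalized DeMorgan's law can be checked mechanically (the domain D below is my own choice):

```python
# Generalized DeMorgan over a finite domain: all() is forall, any() is exists.
D = range(20)

def odd(x): return x % 2 == 1

# not(exists x, Odd(x))  is equivalent to  forall x, not Odd(x):
assert (not any(odd(x) for x in D)) == all(not odd(x) for x in D)

# not(forall x, Odd(x))  is equivalent to  exists x, not Odd(x):
assert (not all(odd(x) for x in D)) == any(not odd(x) for x in D)
```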

9.2 Examples

Here are usages of quantifiers and their negations.


• "All men are mortal."
Negation: "Some men are immortal."
Notice that all forall and exists statements are repeated conjunctions
or disjunctions. Thus, the entire statement is true or false. In this
case, "All men are mortal" may be assumed to be true, in which case
its negation is false.
• "All squares are rectangles": For all s that are squares, they are always
rectangles.
Negation: "Some squares are not rectangles."
Find out which (given or negation) is true.
• "Some rectangles are squares": There exist rectangles r that are squares.
Negation: "All rectangles are not squares."
Find out which (given or negation) is true.
• "Some rectangles are triangles": Well, this can be said in first-order
logic, but when it comes to evaluating the truth, these sentences will
be deemed to be false.
Negation: "All rectangles are not triangles."
• "Forall x, x equals 0": Again, it can be said, but is false.
Negation?
• "All rectangles are squares": False, because while some rectangles are
squares, not all of them are.
Negation: "Some rectangles are not squares."

9.3 Illustrating Nested Quantifiers

We now discuss simple examples that offer us practice on negating
quantified statements.

General Rules Here are the general rules to follow while negating
quantifiers. We also provide many special cases for the sake of illustration:
• Generic example: ¬(∀x ∈ D, p(x)) ≡ (∃x ∈ D, ¬p(x))
This is a simple example of negating a forall.

• Generic example: ¬(∀x ∈ D, (p(x) → q(x))) ≡ (∃x ∈ D, p(x) ∧ ¬q(x))
This is a special case of negating a forall where the innermost
predicate is an implication, whose negation becomes p(x) ∧ ¬q(x).
Let's take a friendly dog-example:

∀x ∈ D, (dog(x) → animal(x))

If you doubt the above (true) statement, negate and see what you get:

¬(∀x ∈ D, (dog(x) → animal(x))) ≡ (∃x ∈ D, dog(x) ∧ ¬animal(x))

This reads "there exists x ∈ D that is a dog but not an animal."
This is obviously false.
• A few generic examples of nested quantifications being negated:
¬(∀x ∈ D, ∃y ∈ E, p(x)) ≡ (∃x ∈ D, ∀y ∈ E, ¬p(x))
¬(∃x ∈ D, ∀y ∈ E, p(x)) ≡ (∀x ∈ D, ∃y ∈ E, ¬p(x))
¬(∀x ∈ D, ∀y ∈ E, p(x)) ≡ (∃x ∈ D, ∃y ∈ E, ¬p(x))

• Now, let's take an assertion "there exist infinitely-sized subsets of N":

∃S ⊆ N, S ≠ ∅ ∧ ∀x ∈ S, ∃y ∈ S, y > x

This assertion can be understood as follows: There is at least one
non-empty subset S ⊆ N, such that for every x in S, there is a larger
number y, also in S. Such a set must have no largest element, because
for every such element, there must be another element that is higher in
magnitude.
Again, if you doubt this, negate and see what you get:

¬(∃S ⊆ N, S ≠ ∅ ∧ ∀x ∈ S, ∃y ∈ S, y > x)

becomes

∀S ⊆ N, S = ∅ ∨ ∃x ∈ S, ∀y ∈ S, y ≤ x

This reads "every subset of N is either the empty set, or a set with a
largest element."
Do you agree? I hope you won't. There are many infinite subsets of N,
including N itself.

• Other handy identities:

∀x > y, p(x)   is equivalent to   ∀x, x > y → p(x)

Thus, ¬(∀x > y, p(x)) is equivalent to

∃x, ¬(x > y → p(x))

which is equivalent to

∃x, x > y ∧ ¬p(x)

And this is an abbreviation for

∃x > y, ¬p(x)

What this shows is that you can roll conditions such as x > y into
quantifiers. They "stay put" across negations.
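On a finite domain, the unrolled forms of these bounded quantifiers can be compared directly; the domain, bound, and predicate below are my own choices:

```python
# Bounded quantifiers unroll into implications/conjunctions; check that
# negation flips them as claimed, over a finite domain.
D = range(50)
y = 10

def p(x): return x % 7 != 0    # an arbitrary predicate

# "forall x > y, p(x)" unrolls to "forall x, (x > y -> p(x))":
forall_bounded = all((not (x > y)) or p(x) for x in D)

# Its negation, "exists x > y, not p(x)", unrolls to
# "exists x, (x > y and not p(x))":
exists_negated = any(x > y and not p(x) for x in D)

assert forall_bounded == (not exists_negated)
```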
Additional Examples We now provide an array of additional examples
relating to negating quantified formulae. I hope you can use these for
practice. Some are in English, and some in math.
• In all countries c, for all people p who study discrete structures in these
countries, either p goes on to become a theoretician or a hacker.
Negation: There exists a country c and a person p in country c where
p neither becomes a theoretician nor a hacker.
• There exists a subset P of N where every member of P is above 1, and
those members are divisible only by 1 or by themselves. Obviously,
such a P is the set of prime numbers (but see Section 9.4). (Note:
In mathematics, 1 is considered not to be a prime. There are many
reasons; here is one video that explains the reasons at a high level:
https://www.youtube.com/watch?v=IQofiPqhJ_s).

In mathematical logic, this becomes

∃S ⊆ N, ∀x ∈ S, (x > 1 ∧ [∀y ∈ S, divides(y, x) → (y = x ∨ y = 1)])

Negating the above assertion, we get

∀S ⊆ N, ∃x ∈ S, (x ≤ 1 ∨ [∃y ∈ S, divides(y, x) ∧ (y ≠ x ∧ y ≠ 1)])

This says that every subset of N either contains 1 or has a composite
number.
MAJOR EDIT: This is not quite saying that S is all and only the
Primes. See 9.4 for the fix.
A More Involved Example Suppose we are presented with the assertion:
"For all natural numbers p ∈ N, if p is odd, then
there exists another natural number r > p, such that
for all natural numbers q < r,
q ≤ p."
Tasks for you:
• Write the above assertion in logic
• Negate it
• Reconstruct an explanation in English for the negation
Solution:
• ∀p ∈ N, [odd(p) → ∃r > p, ∀q < r, q ≤ p]
• The fact odd(p) really does not matter. It is there just to add detail to
this example, for the sake of practice.
• Also, the r in question is p + 1, because the q value can't be between p
and r.
• Negating this, we get
∃p ∈ N, [odd(p) ∧ ∀r > p, ∃q < r, q > p]
• This is false. Take r = p + 1. In this case, if q < r, then q can't also be
greater than p. Thus the "∀r > p" claim fails at r = p + 1.
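Restricting the quantifiers to a finite slice of N lets Python confirm the analysis; the bound LIMIT is an arbitrary choice of mine, and p stops just short of it so that the witness r = p + 1 always lies inside the slice:

```python
# Check "forall p, odd(p) -> exists r > p, forall q < r, q <= p"
# on a finite slice of the naturals.
LIMIT = 100

def odd(n): return n % 2 == 1

holds = all(
    (not odd(p))
    or any(all(q <= p for q in range(r)) for r in range(p + 1, LIMIT))
    for p in range(LIMIT - 1)
)
assert holds   # r = p + 1 is a witness for every odd p
```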


Illustration on Fermat's Last Theorem   To obtain some practice on negating quantified formulae, let us consider Fermat's Last Theorem. In number theory, Fermat's Last Theorem (sometimes called Fermat's conjecture, especially in older texts) states that no three positive integers a, b, and c can satisfy the equation a^n + b^n = c^n for any integer value of n greater than two; see http://en.wikipedia.org/wiki/Fermats_Last_Theorem.
∀a, b, c, n : ( ((a, b, c > 0) ∧ (n ≥ 3)) ⇒ (a^n + b^n) ≠ c^n )

This theorem was first conjectured by Pierre de Fermat in 1637, famously in the margin of a copy of Arithmetica, where he claimed he had a proof that was too large to fit in the margin. See http://en.wikipedia.org/wiki/Fermats_Last_Theorem for a discussion of the history of this theorem, which remained open for nearly 360 years before it was proved by Andrew Wiles, then working at Princeton University.
Suppose Fermat's Last Theorem were false; then, the negation of

∀a, b, c, n : ( ((a, b, c > 0) ∧ (n ≥ 3)) ⇒ (a^n + b^n) ≠ c^n )

would have been true; i.e.,

∃a, b, c, n : ( (a, b, c > 0) ∧ (n ≥ 3) ∧ ((a^n + b^n) = c^n) )

Unfortunately, try as much as you wish, you will never find a set of numbers (a, b, c, n) such that this equation holds. Following Wiles's proof, we know why.

9.4 Primes Fixed

The reason for the error is obviously that S could just be empty! We have not pinned it down sufficiently.
Let N⁺⁺ be the set N − {0, 1}, i.e., the set {2, 3, 4, 5, . . .}. Which of these is the properly fixed version of Primes, and why?
1. Version-1
∃S ⊆ N⁺⁺,
[ ∀z ∈ N⁺⁺,
( ∀y ∈ N, divides(y, z) ⇒ (y = z ∨ y = 1) )
⇒ (z ∈ S)
]

2. Version-2
∃S ⊆ N⁺⁺,
[ ∀z ∈ N⁺⁺,
( ∀y ∈ N, divides(y, z) ⇒ (y = z ∨ y = 1) )
⇔ (z ∈ S)
]
Version-2 is correct. (Version-1 can include junk, i.e., non-primes, also.)
Version-2 can be read as follows.
There is a set S ⊆ N⁺⁺;
you are allowed to put a z ∈ N⁺⁺ into S
EXACTLY WHEN,
for every y ∈ N,
y divides z means y = z or y = 1.
Think about it!
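As a quick sanity check (illustrative, not part of the formal development; the function names are made up), Version-2's biconditional can be turned directly into a set comprehension that builds S for a bounded portion of N⁺⁺:

```python
def divides(y, z):
    """True exactly when y divides z (y > 0 assumed)."""
    return z % y == 0

def primes_up_to(limit):
    """Version-2's rule, restricted to N++ up to `limit`:
    z enters S exactly when every y dividing z is z itself or 1."""
    return {z for z in range(2, limit + 1)
            if all(y == z or y == 1
                   for y in range(1, limit + 1) if divides(y, z))}

print(sorted(primes_up_to(20)))   # [2, 3, 5, 7, 11, 13, 17, 19]
```

Note how the `exactly when` of Version-2 becomes set membership governed by the condition, with no junk admitted.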

Chapter 10
Combinatorics
In the movie Rain Man, Dustin Hoffman (the Rainman) shows his amazing ability of counting things at a glance. In one scene, a nurse accidentally spills a box of toothpicks, and the Rainman takes one glance and immediately says "82, 82, 82" (meaning 82+82+82): there are 246 toothpicks on the floor. Indeed he was right! You may have some fun seeing this amazing piece of acting on Youtube https://www.youtube.com/watch?v=kthFUFBwbZg.
Unfortunately, in real life, most of us need to be counting more abstract things, and we certainly don't have access to our friendly Rainman in any case. This chapter will therefore introduce methods for counting that help us count large collections of things systematically and reliably. After all, we don't want to be caught in the position of the famous king who promised one of his subjects one grain of rice for the first square of a chessboard, two for the second square, and so on (doubling for each square). The king thought that he was returning a favor on the cheap by providing only a few bags of rice.¹

10.1 Permutations versus Combinations

Permutations and combinations are central to many counting situations. To understand these concepts, let us take a real-world situation involving airlines, let's say Delta and Southwest (you'll soon realize why I'm picking (on) these airlines!).

¹ You can imagine how such a gesture ends! Please calculate the weight of 2^65 − 1 grains of rice, if one grain weighs 26 grams. The king must take CS 2100 before making promises!

10.1.1 Delta vs. Southwest Airlines: Ticket Sales

Delta Airlines Sales


One day, for a certain flight, Delta found it has three vacant seats (say, seat
1, 2, and 3), but there are five potential buyers (numbered 1 through 5).
How many different sales can be made? Remember that Delta has assigned
seating, meaning a person gets a numbered seat and not just a seat. We
will use the notation (a, b, c) to denote that seat 1 is sold to person a, seat 2
to person b, and seat 3 to person c. Here are various sales:
(1,2,3) sell seat 1 to person 1, 2 to person 2, and 3 to person 3.
We now realize there are many many sales possible:
(1,2,3), (1,2,4), (1,2,5), (2,1,3), (2,1,4), . . . (5,1,2), (5,1,3), . . ., (5,4,3)
Notice that sales (1,2,3) and (2,1,3) are different (because of assigned
seating).
We soon proceed to think systematically as follows:
There are 5 ways to fill the first component of the triple, 4 ways to fill the second component, and 3 ways to fill the third component.
Thus, there are 5 × 4 × 3 = 60 different sales possible for Delta.
In the above reasoning, we ended up using the so-called product rule of counting.

Product Rule of Counting   If a given task layers itself into k stages (sub-tasks) where there are n_1 ways to finish the first stage, and independently, n_2 ways to finish the second stage, all the way to n_k ways to finish the k-th stage, there are a total of n_1 × n_2 × . . . × n_k ways to finish all the stages, thus finishing the overall task.

The product 5 × 4 × 3 that we formed for solving our example is an instance of the product rule being applied, where each stage is concerned with filling the appropriate spot of the triple. Thus, we have three layers, where the first layer has 5 choices of people to assign to the first seat, the second layer has 4 choices, and the third layer has 3 choices.

More Examples of the Product Rule

1. In calculating the number of truth-table rows for an n-input Boolean function, we can layer the problem as follows: (i) the first variable can be assigned in 2 ways; (ii) the second variable in another 2 ways; and so on for all the variables. This product gives us the familiar answer of 2^n.

2. If a combination lock has 3 dials, each going from 0 through 9, we can layer the problem by considering dial-1, then dial-2, and finally dial-3, for a total of 10 × 10 × 10 = 1000 combinations.
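The Delta count can be cross-checked by brute force; the following sketch (illustrative, not from the text) enumerates every assignment of 5 buyers to 3 numbered seats:

```python
from itertools import permutations

# Each sale is a triple (a, b, c): seat 1 to person a, seat 2 to person b,
# seat 3 to person c; order matters because seats are assigned.
sales = list(permutations(range(1, 6), 3))
print(len(sales))                              # 60, matching 5 * 4 * 3
print((1, 2, 3) in sales, (2, 1, 3) in sales)  # True True: distinct sales
```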
Important facts about permutations

The product n × (n − 1) × . . . × (n − r + 1) is called P(n, r), or sometimes written nPr. It is known as the number of permutations of n items taken r at a time. The word "permutation" reminds us that the order of items matters.

Also notice that P(n, n − 1) and P(n, n) are equal:

P(n, n − 1) = n × (n − 1) × (n − 2) × . . . × 2

while

P(n, n) = n × (n − 1) × (n − 2) × . . . × 2 × 1

and both equal n!.

One can also notice that P(n, r) = n! / (n − r)!.

Southwest Airlines Sales


(You probably already know that) Southwest does not have assigned seating;
in other words, it has open seating (anyone can sit anywhere). In other
words, Southwest picks sets of lucky folks e.g., set {1, 2, 3} chosen, set {3, 2, 1}
chosen, etc. These are the people whose lucky bit gets set! To summarize,
when counting the number of distinct sales that Southwest can make in this situation, we are asked to count the number of distinct subsets of cardinality 3 from a universe of five elements.
It is easy to observe that given a set of size 3 (say, {3, 2, 5}), one can form P(3, 3) different 3-tuples over it. This fact easily generalizes: given a set of size n, one can form P(n, n) different n-tuples over it. So, to "forget" the assigned seats in our example, all we need to do is to divide the P(5, 3) distinct seat assignments by P(3, 3). In our example, we divide P(5, 3) = 60 by P(3, 3) = 3! = 6, resulting in 10 different sales. This is called combinations, and its notation is C(n, r) (written nCr in some books).

To count combinations, we count "n choose r", written C(n, r).

We can also observe that C(n, r) = P(n, r) / r!.

It is also possible to observe that C(n, r) = P(n, r) / P(r, r), because P(r, r) is nothing but r!.

Given that P(n, r) = n! / (n − r)!, we can write

C(n, r) = n! / (r! × (n − r)!)

We will now once again review permutations and combinations, presenting additional examples as needed to illustrate various points. We will also
present (in 10.5) Python code that helps you experiment with these notions.
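For quick experiments before we get to §10.5, note that Python's standard library (3.8 and later) already provides these counts; this is just a sanity check of the formulas above:

```python
from math import comb, perm, factorial

# Delta/Southwest example: 5 buyers, 3 seats.
print(perm(5, 3))                # 60 ordered seat assignments, P(5, 3)
print(comb(5, 3))                # 10 unordered sales, C(5, 3)
print(perm(5, 3) // perm(3, 3))  # 10 again: divide out the 3! orderings
print(comb(5, 3) == factorial(5) // (factorial(3) * factorial(2)))  # True
```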

10.1.2 Properties of Permutations

A whole list of things can be observed about P(n, r):

Read P(n, r) as "the number of ways to choose permutations of r items, given n items". Thus, we are counting the number of distinct r-long sequences (or r-tuples) formable from n elements.

P(n, 1) = n, as there are n distinct one-long sequences (one-tuples). Example: P(5, 1) = 5.

P(n, 2) = n × (n − 1), as there are n ways to pick who is in the first position, and then (n − 1) ways to pick the second position's occupant.

10.1. PERMUTATIONS VERSUS COMBINATIONS

139

Example: P (5, 2) = 5 4 = 20. Thus, if the n items are {a, b, c, d, e}, the
sequences are (a,b), (a,c), (a,d), (a,e), (b,a), (b,c), (b,d), (b,e), etc, all
the way to (e,a), (e,b), (e,c), and (e,d). There are 20 of these 2-long
sequences (2-tuples).
P ( n, 3) = n ( n 1) ( n 2).
P ( n, n 1) = ( n 0) ( n 1) ( n 2) . . . ( n ( n 2)). This accounts for the
n 1 different seats that n guys need to try and occupy.
This product is the same as n ( n 1) ( n 2) . . . 2.
Similarly, P ( n, n) = ( n 0) ( n 1) ( n 2) . . . ( n ( n 1)).
This product is the same as n ( n 1) ( n 2) . . . 1.
The reason that P ( n, n 1) equals P ( n, n) is because once we find n 1
items to occupy the first n 1 positions, the item to occupy the n-th
position is forced. As a specific example, the number of 4-tuples over
the set {a, b, c, d, e} is the same as the number of 5-tuples over this set.
What is P ( n, 0)? How many ways can 0 items be chosen out of n
items? You can do this exactly in one way, and so P ( n, 0) = 1. Determining these boundary values requires care.
What is P (0, 0)? By convention (and for deeper reasons), 0! = 1.
We consider it undefined to have n < r in P ( n, r ).

10.1.3 Combinations as Ways to Set Lucky Bits

Suppose that we have to choose sets of 3 items out of a set of 5 items. We can employ characteristic vectors, and find out the number of ways in which to set 3 bits out of 5. This is how:
The characteristic vectors that select 3 out of 5 elements are
11100, 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, and
00111.
There are exactly 10 of these combinations. This also gives us an added
result presented below.


The number of distinct ways in which to set r bits out of n bits is C(n, r). In a sense, when r lucky bits are selected, we only care to pull out the elements indicated by the 1 bits and form a set out of them.
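This view is easy to check by enumeration; the sketch below (illustrative) lists the 5-bit characteristic vectors with exactly three 1-bits:

```python
# Enumerate all 5-bit characteristic vectors having exactly three 1-bits.
vecs = [format(v, '05b') for v in range(2 ** 5) if bin(v).count('1') == 3]
print(len(vecs))   # 10, i.e., C(5, 3)
print(vecs[:3])    # ['00111', '01011', '01101']
```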
Additional properties of combinations:

It is clear that C(n, n) = 1: we have to choose all the elements.

It is clear that C(n, n − 1) = n, because we just need to decide who not to choose, accomplished in n ways.

C(n, 1) = n, because we just need to decide which of the n items to choose each time.

Finally, C(n, 0) = 1, because there is exactly one way to choose 0 items from a set of n items. This also means C(0, 0) = 1.

We consider it undefined to have n < r in C(n, r).

10.2 Recursive Formulation of Combinations

We now model a combinations problem arising on a hypothetical circus floor, thus arriving at a recursive formulation of the choose operation. Consider the circus act of firing clowns from cannons. Say there are n clowns quaking in their own cannons, and we have to choose r lucky clowns to be fired into safety nets. One can proceed as follows:
We walk up to one of these clowns (say the first), and toss a coin.

If the coin is heads, we fire that clown,² and then we have to choose r − 1 clowns from the n − 1 remaining cannons.

If the coin is tails, we do not fire that clown,³ but now we must choose r clowns from the n − 1 remaining clowns.

² With all the other clowns watching and grinning, not remembering that they might be launched next!
³ This clown lets out a huge sigh of relief and sticks out his/her tongue at the others!

This argument allows us to observe

C(n, r) = C(n − 1, r − 1) + C(n − 1, r)
Illustration of the Recursive Rule for Combinations   Let us revisit our familiar example, that of C(5, 3). The recursive formula for combinations allows us to express this as C(4, 2) + C(4, 3).

In C(4, 2) + C(4, 3), the latter simplifies to 4. Now we can focus on C(4, 2), and write it as

C(4, 2) = C(3, 1) + C(3, 2)

which evaluates to 3 + 3 = 6.

Therefore, C(5, 3) = 4 + 6 = 10, exactly what was concluded above.
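Together with the boundary values C(n, 0) = C(n, n) = 1 noted earlier, this recursion translates directly into code; a minimal sketch:

```python
def choose(n, r):
    """C(n, r) via the clown-cannon recursion; assumes 0 <= r <= n."""
    if r == 0 or r == n:      # boundary values
        return 1
    # heads: fire clown 1, pick r-1 from n-1; tails: pick r from n-1
    return choose(n - 1, r - 1) + choose(n - 1, r)

print(choose(5, 3))   # 10
print(choose(4, 2))   # 6
```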
We can capture the idea behind the recursive formulation of combinations
in a more general fashion via the sum rule of counting.

Sum Rule of Counting   Suppose a task splits into two disjoint cases (either / or). Suppose there are n_1 ways to finish the task under the first (either) case and n_2 ways under the second (or) case. Then, there are a total of n_1 + n_2 ways to accomplish the task. The original problem (choose r lucky clowns) splitting into two disjoint cases is a good illustration of the application of the sum rule.

More Examples of the Sum Rule

1. Suppose we have to find the cardinality of A ∪ B. We can divide the space of interest into three disjoint cases and apply the sum rule, yielding
|A ∪ B| = |A − B| + |B − A| + |A ∩ B|.
2. Suppose a waiter asks "soup or salad" and offers a choice of 3 soups and 2 salads. If the waiter truly meant soup XOR salad (as is the most common meaning of this offer, meaning you can have only one or the other), then, clearly, there are 5 ways (sum rule). If the waiter meant soup OR salad (meaning you can have both), and you want both, then you can pick (as per the product rule) one of each, in 6 ways.

10.3 Examples: Permutations and Combinations

We will now present many real-world counting situations and help you identify whether you need to use permutations or combinations.

10.3.1 Birthday Problem

Suppose we consider non-leap years (with 365 days), and we are in a room with n ≤ 365 individuals. In how many ways can these n individuals have distinct birthdays?

It is clear that the first individual could have been born on any one of these 365 days, the second on any of the remaining 364 days, etc. Then the answer is clear: there are P(365, n) ways in which all these individuals can have distinct birthdays. The probability of this happening,

P(365, n) / 365^n

becomes very low as n grows, as will be illustrated by the Python program in §10.5. (We will study Probability Theory much more thoroughly later in this course.)

10.3.2 A Variant of the Birthday Problem

Suppose we have n individuals in a room. What is the probability that none was born on Christmas?

This is a situation where we just need to set apart one of the dates; then any of the individuals can choose from any of the remaining dates. The product rule comes into play, allowing each person to pick from 364 days, for a total of 364^n ways. The probability would be

364^n / 365^n
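This probability is a one-liner to evaluate; a few illustrative values:

```python
# Probability that none of n people was born on Christmas: (364/365)**n.
for n in (10, 100, 500):
    print(n, (364 / 365) ** n)
```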

10.3.3 Hanging Colored Socks

Suppose we have 5 red socks, 4 blue socks and 3 green socks. How many distinct ways can we hang these on a clothesline? The problem is one of describing sequences of length 12 with 5 Rs, 4 Bs and 3 Gs.

Much like in any combinatorics problem, the first thing to do is to model the situation. Modeling comes with experience; and the better the modeling, the easier the approach to a solution will prove to be.
Here, we suggest that we model this as a "choose" problem. Suppose we reduce the problem to the following:
1. Choose, from among the 12 spots, five (5) spots for the Rs;
2. Then choose, from among the remaining 7 spots, four (4) spots for the Bs;
3. The choice for the Gs is now forced. There are exactly 3 Gs and 3 spots.
Having reduced the problem to this state, we just need to now think
through the rule (sum or product) that applies. Here is the insight for this
part of our solution:
Depending on where the five Rs sit, the placement of the Bs will change.
This clearly is a layer as per the product rule.
Once this insight is obtained, we have our answer:

C(12, 5) × C(7, 4) × C(3, 3)

This formulation already shows the forced situation of the Gs having no latitude: C(3, 3), that is, 1 choice left by the time we hit the third layer.

Question: Will the choice of which socks to hang first matter? Try different orders, and convince yourself that the product rule works no matter what, resulting in the same final answer.
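As a check, the layered count agrees with the equivalent multinomial form 12!/(5!·4!·3!), and with a different hanging order (Bs first):

```python
from math import comb, factorial

layered = comb(12, 5) * comb(7, 4) * comb(3, 3)       # Rs, then Bs, then Gs
multinomial = factorial(12) // (factorial(5) * factorial(4) * factorial(3))
bs_first = comb(12, 4) * comb(8, 5) * comb(3, 3)      # Bs, then Rs, then Gs
print(layered, multinomial, bs_first)                 # 27720 27720 27720
```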

10.4 Binomial Theorem

This section puts many ideas together, celebrating a brilliant theorem due to Sir Isaac Newton. This is the famous Binomial Theorem. This theorem helps us determine the expansion of (a + b)^N. Let us proceed systematically, starting from the familiar identity (a + b)^2 = a^2 + 2ab + b^2. The general power (a + b)^N is obtained through the following reasoning steps:
It is clear that when we write the product of terms T_1, T_2, . . . , T_N, where each term is (a + b), a situation we depict as

(a + b) × (a + b) × . . . × (a + b)
  T_1       T_2             T_N

then at each term T_i, we can choose either an a or a b, and proceed by multiplying this variable with the variables chosen from the following terms.

One may choose all a's: a × a × . . . × a (one a from each of T_1 through T_N).

One may choose all b's: b × b × . . . × b.

In general, one may choose k a's and (N − k) b's in many ways:
* This being one way: a's from T_1 through T_k, then b's from T_{k+1} through T_N;
* . . . and this being another way (mixtures of a and b): a's and b's drawn from the terms in some other interleaved order.

It is clear that each combination of "choose k a's and (N − k) b's" is disjoint, for each k. Thus, we can use the sum rule, and add up the various combinations.

Now, choosing k a's can be accomplished in C(N, k) ways (and this forces the choice of N − k b's).

The term generated by this choice is

C(N, k) × a^k × b^(N−k)

Putting it all together, we can express (a + b)^N as a summation:

(a + b)^N = Σ_{r=0}^{N} C(N, r) × a^r × b^(N−r)


The term C(N, k) is called a binomial coefficient. Let us determine the value of these coefficients for various values of N and k by expanding (a + b) to various powers N. Let us denote the sequence of coefficients within [. . .].

(a + b)^0 = [1], i.e. [C(0, 0)]

(a + b)^1 = 1·a^1 + 1·b^1, i.e. [C(1, 1), C(1, 0)]

(a + b)^2 = 1·a^2 + 2·a·b + 1·b^2, i.e. [C(2, 2), C(2, 1), C(2, 0)]

(a + b)^3 = 1·a^3 + 3·a^2·b + 3·a·b^2 + 1·b^3, i.e. [C(3, 3), C(3, 2), C(3, 1), C(3, 0)]

If you look carefully, the coefficients above form the famous Pascal's triangle:

      1
     1 1
    1 2 1
   1 3 3 1
    . . .

In §10.5, we will provide Python programs to produce these coefficients.


We can immediately observe the following facts:

The zeroth row of the Pascal's triangle, namely [C(0, 0)], models the binomial coefficients of (a + b)^0. The sum of the elements in this row is 1, or 2^0.

The first row of the Pascal's triangle, namely [C(1, 1), C(1, 0)], models the binomial coefficients of (a + b)^1. The sum of the elements in this row is 2, or 2^1.

The second row of the Pascal's triangle, namely [C(2, 2), C(2, 1), C(2, 0)], models the binomial coefficients of (a + b)^2. The sum of the elements in this row is 4, or 2^2.

In general, the kth row of the Pascal's triangle, namely [C(k, k), C(k, k−1), . . . , C(k, 1), C(k, 0)], models the binomial coefficients of (a + b)^k. The sum of the elements in this row is 2^k. That is,

Σ_{i=0}^{k} C(k, i) = 2^k

because, as you recall,

C(k, 0) is the number of ways to select 0 (lucky) bits out of k bits;
C(k, 1) is the number of ways to select 1 (lucky) bit out of k bits;
C(k, 2) is the number of ways to select 2 (lucky) bits out of k bits;
. . .
C(k, k) is the number of ways to select k (lucky) bits out of k bits.

Since these are disjoint cases, we can again apply the sum rule and surmise that, together, these count all the ways in which to set bits in a k-bit word. This is, as we know, 2^k.
Another view (taking a 4-bit vector as an example):

* One way to enumerate the bit-combinations of a 4-bit vector is to follow the standard binary counting order:
0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
Total number of ways = 16

* Another way to enumerate the 16 bit-combinations of a 4-bit vector: proceed in groupings of the number of 1-bits set, and employ the sum rule:

0000 : zero 1-bits set : C(4, 0) = 1 way
0001, 0010, 0100, 1000 : one 1-bit set : C(4, 1) = 4 ways
0011, 0101, 1001, 0110, 1010, 1100 : two 1-bits set : C(4, 2) = 6 ways
0111, 1011, 1101, 1110 : three 1-bits set : C(4, 3) = 4 ways
1111 : four 1-bits set : C(4, 4) = 1 way

Total ways = 1 + 4 + 6 + 4 + 1 = 16 again!
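Both enumerations are easy to replay mechanically; this sketch (illustrative) groups the sixteen 4-bit values by how many 1-bits each contains:

```python
from collections import Counter

# Count 4-bit values by their number of 1-bits (their "popcount").
groups = Counter(bin(v).count('1') for v in range(16))
print(dict(groups))          # {0: 1, 1: 4, 2: 6, 3: 4, 4: 1}
print(sum(groups.values()))  # 16, i.e., 2**4
```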

10.5 Combinatorics Concepts via Python Code

The Python code that follows illuminates pretty much all of what we studied in this chapter.

10.5.1 Permutations

from functools import reduce

def Perm(n, r):
    """
    Implements P(n, r), or nPr.
    Precondition: n >= r, n >= 0, r >= 0.
    """
    assert n >= r, "Error: Fed n < r"
    # Multiply n * (n-1) * ... * (n-r+1); the empty product (r = 0) is 1.
    return reduce(lambda x, y: x * y, range(n, n - r, -1), 1)

Testing Perm: The first routine we code up is P(n, r). We check all preconditions, throwing an assertion error if the inputs are illegal.

>>> Perm(0, 0)
1
>>> Perm(1, 0)
1
>>> Perm(0, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in Perm
AssertionError: Error: Fed n < r
>>> Perm(1, 1)
1
>>> Perm(5, 3)
60
>>> list(range(5, 5 - 3, -1))
[5, 4, 3]
>>> reduce(lambda x, y: x * y, [5, 4, 3])
60

The workings of Perm are clear from the example above. We employ range(..) to enumerate the list of numbers to be multiplied, and then use a reduction tree (realized via reduce(..)) to multiply these numbers.

10.5.2 Factorial

def Fact(n):
    """
    Factorial of n. Builds on Perm.
    """
    return Perm(n, n)

Testing Fact: Realizing factorial is easy, since P(n, n) = n!. We test this for some input values.

>>> Fact(5)
120
>>> Fact(50)
30414093201713378043612608166064768844377641568960512000000000000
>>> Fact(500)
1220....000 (huge number)

10.5.3 Combinations

def Comb(n, r):
    """
    Implements C(n, r), or nCr.
    Precondition: n >= r, n >= 0, r >= 0.
    """
    return Perm(n, r) // Fact(r)

Testing Comb: Combinations is obtained as an integer quotient (denoted by the use of //) of P(n, r) and r!.
>>> Comb(5, 3)
10
>>> [Comb(3, i) for i in range(4)]
[1, 3, 3, 1]
>>> sum([Comb(3, i) for i in range(4)])
8
>>> [Comb(4, i) for i in range(5)]
[1, 4, 6, 4, 1]
>>> sum([Comb(4, i) for i in range(5)])
16
>>> [Comb(5, i) for i in range(6)]
[1, 5, 10, 10, 5, 1]
>>> sum([Comb(5, i) for i in range(6)])
32

We observe that not only are the combinations working correctly, but we can also obtain the summation of the binomial coefficients

Σ_{i=0}^{k} C(k, i) = 2^k

as discussed in §10.4, and see that the 2^k result indeed follows.

10.5.4 Pascal's Triangle

def PascTri(N):
    """
    Print rows 0 through N of Pascal's Triangle,
    i.e., [C(n, i) for 0 <= i <= n] for each 0 <= n <= N.
    """
    for n in range(N + 1):
        print([Comb(n, i) for i in range(n + 1)])

Testing PascTri: We can generate the Pascal's triangle of any size simply by running through Comb:
>>> PascTri(0)
[1]
>>> PascTri(1)
[1]
[1, 1]
>>> PascTri(4)
[1]
[1, 1]
[1, 2, 1]
[1, 3, 3, 1]
[1, 4, 6, 4, 1]

10.5.5 Birthday Conjecture

def bdayColl(n):
    """
    Given n people in a room, return the probability that all
    have distinct birthdays. Obtained as 365 P n / 365^n, where
    the numerator represents the size of the event that all n of
    them have distinct birthdays, and 365^n is the size of the
    sample space. 365 P n is realized using reduction.
    """
    return float(Perm(365, n)) / (365.0 ** n)

def plotBdayColl(N):
    """
    Invoke bdayColl for 1 through N and print the decreasing
    probability as N increases.
    """
    for i in range(1, N + 1):
        print(str(i) + " : " + str(bdayColl(i)))

Testing bdayColl: We test the Birthday conjecture by printing the probability of there being unique birthdays as n increases. Specifically, we print P(365, n)/365^n as n increases. The results are below (retaining every tenth value after 10). The probability of distinct birthdays decreases dramatically: by 40 people it is already near 11%.

>>> plotBdayColl(80)
1 : 1.0
2 : 0.9972602739726028
3 : 0.9917958341152187
4 : 0.9836440875334497
5 : 0.9728644263002064
6 : 0.9595375163508885
7 : 0.9437642969040246
8 : 0.925664707648331
9 : 0.9053761661108333
10 : 0.8830518222889224
..
20 : 0.58856161641942
..
30 : 0.2936837572807313
..
40 : 0.10876819018205101
..
50 : 0.0296264204220116
..
60 : 0.005877339134652057
..
70 : 0.0008404240348429087
..
80 : 8.56680506865053e-05


Chapter 11
Probability
Probability theory is an important topic underlying modern computer science. Everything from photo-tagging software to neural networks that help recognize speech is designed based on probability theory. Handwriting recognition is widely used in the Postal Service to automatically sort mail.
Probability gets even more interesting when radio hosts take on this topic. Last Fall, three such hosts were discussing, on NPR, how the New England Patriots managed to win 19 of the 25 coin tosses in that season.
"Dumb luck??" asked one host;
"Was the coin deflated?" asked another;¹
The third host sounded much more self-assured. He said: "While the probability is low for one team, the probability of any one team having such a winning streak is rather high, considering the number of teams playing."
How do we verify whether the probability of a winning streak is rather
high as the third reporter seemed to say? Fortunately, we will be studying
the basics of such calculations in this chapter! We will study the details of
this unusual coin-toss winning rate in Section 11.3.1.
The words probability and statistics are often used in the same setting (and some folks informally use the word ProbStats to refer to these topics collectively). We will be drawing heavily from the fun book Cartoon Guide to Statistics, which actually introduces both topics.² In the rest of these notes, we will exclusively focus on Probability Theory.

¹ This joke will not be apparent to you if you haven't heard of the cheating incident in which the football was underinflated a few times last season, allegedly leading to some Patriots victories!

11.1 Probability

It is indeed remarkable that probability theory was developed over 400 years
ago as a tool for understanding games (including gambling). Of course, as
you may have guessed, probability theory now has applications far beyond
gambling. It powers almost all the automation we encounter in daily life
(the Siri system of iPhones, Google search, photo tagging, voice recognition
systems, etc.).
The annals of mathematics continue to show how all useful ideas are connected, and also build on each other. In fact, Isaac Newton is said to have said³:

If I have seen further, it is by standing on the shoulders of giants.


While the seeds of thought leading to probability theory were present even as early as the 12th century, it was the combined effort of Blaise Pascal and Pierre de Fermat that really laid the foundations of modern probability theory (see https://en.wikipedia.org/wiki/Pierre_de_Fermat and https://en.wikipedia.org/wiki/Blaise_Pascal). Since then, the tower of humans standing on each other's shoulders has elevated probability theory to what it is now.
Scientists' personal lives often go unmentioned, but it is always insightful to know a little about them. The fact that Pascal did his pioneering work amidst serious personal health issues (e.g., see http://www.iep.utm.edu/pascal-b/) is a testament to his dedication. Pascal has many other claims to fame, including the design of the earliest mechanical calculators for his father's use.⁴ Of course, Pascal's triangle is another of his discoveries!

² These cartoons are available on the class Canvas page.
³ https://en.wikipedia.org/wiki/Standing_on_the_shoulders_of_giants
⁴ In 2012, I had the distinct pleasure of seeing many of these calculators in the Museum of Arts and Crafts in Paris: https://en.wikipedia.org/wiki/Mus%C3%A9e_des_Arts_et_M%C3%A9tiers, https://en.wikipedia.org/wiki/Pascals_calculator

11.1.1 Unconditional and Conditional Probability

We will be studying ways to formally define the likelihood of certain discrete outcomes occurring when we repeatedly perform experiments. For example, an experiment may be a single roll of a fair (unbiased) six-sided die.⁵ Such likelihood will be measured in terms of a measure called probability: a real number between 0 and 1. In our example, the probability of seeing a 6 emerge is 1/6. This is because a 6 is just one of the six elementary events or outcomes of rolling a single die.
At a high level, the words "event" and "outcome" may seem strange, but they capture a simple idea: the situation whose probability we would like to measure. For instance, if all of you in this class stand on each other's shoulders and make a human pyramid, what is the probability that you can touch the ceiling? In this problem, the event is the sum of your heights adding up to the height of the room.
As another example, consider an experiment where two dice are tossed one after the other. The probability that their values add up to 10 is the probability of getting a (6, 4) pair, a (5, 5) pair, or a (4, 6) pair. The elementary events for this example are getting a (6, 4) pair, getting a (5, 5) pair, and getting a (4, 6) pair. The event of interest is "adds up to 10". Notice that this event includes all three elementary events we just pointed out.

Thus, this single (compound or non-elementary) event includes three elementary events, namely (6, 4), (5, 5), (4, 6), out of the 36 possible elementary events, namely (1, 1), (1, 2), . . . , (6, 5), (6, 6). This is why we calculate the probability of the event "adds up to 10" to be 3/36, or 1/12.

Notice that we modeled each outcome as a pair (6, 4) rather than as a set {6, 4}, because we wanted to record that 6 is the first outcome (from the first die) and 4 is the second outcome (from the second die).
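The 3/36 figure can be confirmed by brute-force enumeration of the sample space (an illustrative sketch):

```python
from itertools import product

# Sample space: all ordered pairs of two die rolls.
space = list(product(range(1, 7), repeat=2))
event = [p for p in space if sum(p) == 10]
print(event)                        # [(4, 6), (5, 5), (6, 4)]
print(len(event), '/', len(space))  # 3 / 36
```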
In the above discussions, we pretended that we first recorded the two
tosses, and then only asked the question what is the probability of the two
tosses adding up to 10? But now, consider a slightly different situation.
⁵ The word "die" is the correct singular form, and "dice" the correct plural form.


Suppose we finish making the first toss, and see that we got a 5. Suppose
we now ask: what is the probability that the second die roll (which we are
about to do) would yield a number N such that 5 + N = 10? That is, we are
asking a question about when the second toss would end up creating a sum
of 10, knowing that the first toss already gave us a 5. We clearly know
that the second toss must also be a 5 in order for the total to be a 10. The
probability of getting just a 5 from a single toss is, as we know, 1/6.
In other words, the probability of the second toss resulting in a sum of 10
given that the first toss yielded a 5 is 1/6. Thus, the knowledge of the first
toss being a 5 restricts the space of values we must consider with respect to
the second toss. The underlying idea here is that of conditional probability.
Let us change the example slightly. What is the probability of the sum of
the tosses being a 10, knowing that the first toss is a 1? We know that no
matter what the second toss is, the sum cannot be 10. Thus, the conditional
probability now becomes 0. In the same vein, the probability that the sum of
the tosses exceeds 1, given that the first toss is a 1 is 1 (or 100%). It becomes
a certainty.
In the rest of this chapter, we will be studying the basics of unconditional
probability first, and then move on to the study of conditional probability.
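The conditional examples above can also be replayed by filtering the sample space; the helper below is an illustrative sketch (its name is made up), using exact fractions:

```python
from itertools import product
from fractions import Fraction

# Sample space: all ordered pairs of two die rolls.
space = list(product(range(1, 7), repeat=2))

def cond_prob(event, given):
    """P(event | given), by counting within the restricted space."""
    restricted = [p for p in space if given(p)]
    favorable = [p for p in restricted if event(p)]
    return Fraction(len(favorable), len(restricted))

print(cond_prob(lambda p: sum(p) == 10, lambda p: p[0] == 5))  # 1/6
print(cond_prob(lambda p: sum(p) == 10, lambda p: p[0] == 1))  # 0
print(cond_prob(lambda p: sum(p) > 1,  lambda p: p[0] == 1))   # 1
```

Each answer matches the reasoning in the text: knowing the first toss restricts the space considered for the second.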

11.1.2

Unconditional Probability

There are many chance events such as the tossing of a coin, the roll of a
single die, or the roll of a pair of dice. In probability theory, we use the term
random experiment to describe such activities. We now describe the four-step
process advocated by Lehman, Leighton, and Meyer in their book, which
has been posted on Canvas and is called Mathematics for Computer
Science (MCS).
Step-1: Determine the Sample Space that suitably models a problem.
The set of all possible observations is called the sample space and
each possible outcome or in other words, each member of the sample
space is termed an elementary outcome or an elementary event.
For a single die, the sample space is the set {1, 2, 3, 4, 5, 6}, and the numbers 1
through 6 are the elementary events or elementary outcomes. Note: strictly


speaking, {1} through {6} are the elementary outcomes, but if clear from
context, we can regard 1 through 6 themselves as the elementary outcomes.
That is, when talking about elementary outcomes or elementary events,
we will hereafter leave out the { and }, and simply refer to 1, 2, etc., as the
outcomes or elementary events. For compound events, we will employ the
braces ({ and }), i.e., view these compound events as sets such as {1, 2}
or {(4, 6), (5, 5), (6, 4)}. In particular, {(4, 6), (5, 5), (6, 4)} can be regarded as the
event "a two-toss sequence adds up to 10."
For a pair of dice, the sample space is

{1, 2, 3, 4, 5, 6} × {1, 2, 3, 4, 5, 6}

with its 36 members, i.e., (1, 1), (1, 2), . . ., (6, 5), and (6, 6) as the elementary
events. Does it matter whether you throw both dice at the same time,
or do it one after the other? A moment's reflection should convince you that
it does not matter. This is because we do not capture extraneous aspects
into our model, such as whether the human knew that the first toss was
already a 5 before making the second toss.[6]
In our example pertaining to the height of people, the sample space could
be viewed as the set of all numbers in the range [50, 300],[7] with each
possible height (say, expressed as an integer) being an elementary outcome.
In general, one has picked an appropriate sample space if it meets a few
simple checks. First, it must include all possible elementary outcomes that
one likes to consider. But it may also include outcomes that one may never
see, although doing so is often unnatural. For instance, one can select
{1, 2, 3, 4, 5, 6, 7} as the sample space modeling the outcome of tossing a
regular 6-faced die. It is not a crime to have put in 7: one can simply set the
probability of seeing a 7 to 0, and everything would work out. Of course, in
most of our examples, we will select the most obvious and compact of sample
spaces, such as {1, 2, 3, 4, 5, 6} for one die.

The selection of a suitable sample space is the first significant step toward solving almost any problem in probability theory.
(Footnote 6: This assumes many practical realities; for example, looking at the first toss does not give the person a sweaty palm that somehow influences the result of the second toss.)
(Footnote 7: Assuming that nobody is likely to be taller than 300 centimeters or shorter than 50 centimeters.)


Step-2: Define the Elementary Events and Events of your interest.


An event is a subset of the sample space. An elementary event is a singleton
subset of the sample space. Probability is a measure that we associate
with elementary events as well as with events. Here are the definitions, with
examples:
Probability of All Elementary Events:
Each elementary event has a probability value (a real number) in
the range 0 to 1.
As an example, for a single die, the probability of each of the
outcomes 1 through 6 is 1/6.
Notice that some of the elementary events can indeed have a probability of 0. An elementary event can also have a probability of 1.
If one of the elementary events has probability 1, then, by definition, all other elementary events must have a probability of 0.
Event Probability:
The probability of any event e equals the sum of the probabilities
of all elementary events belonging to e.
As an example, for a single die, an event can be {1, 3, 5}. This is
not an elementary event. This event models a die toss that
results in an odd-numbered outcome.
The probability of the above event ("odd-numbered outcomes") is
1/6 + 1/6 + 1/6 = 0.5.
Sample Space Probability:
The probability of the whole sample-space is 1.0
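These three definitions can be turned into a few lines of Python. This is a sketch of our own (the names p and event_prob are not from the text), assuming a fair six-sided die:

```python
from fractions import Fraction

# Assign each elementary outcome of a fair die the probability 1/6.
p = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

def event_prob(event):
    """Probability of an event = sum of its elementary-event probabilities."""
    return sum(p[o] for o in event)

print(event_prob({1, 3, 5}))           # 1/2 (odd-numbered outcome)
print(event_prob({1, 2, 3, 4, 5, 6}))  # 1 (the whole sample space)
```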

The selection of suitable events (whose probabilities you are then interested in) is the second significant step toward solving almost any problem in probability theory.
Often, the selection of these events requires considerable care.
You may find it easier to model and analyze the complement of
the actual event you are interested in. Often, you have to keep
the axioms associated with probability spaces, as well as the
events, clearly in mind. This helps you avoid making mistakes,
and also to simplify the analysis.


Step-3: Use the Axioms of Probability Spaces Wisely.


The use of axioms of probability is almost always required in solving any
problem. One occasion to use these axioms is in figuring out the complement
of an event. Another occasion arises when we ask whether two events are
disjoint.

The notions of disjoint and independent may sound alike, but
are totally unrelated! They are easily confused. Two events E1
and E2 are disjoint if E1 ∩ E2 = ∅. Notice that by this token,
distinct elementary events are always disjoint.
Two events E1 and E2 are independent if the occurrence of one
does not affect the occurrence (or the likelihood of occurrence)
of the other. This notion squarely belongs to the topic of conditional probability, and we shall discuss it there.

(Definition, to be used below): A collection of sets
E1, E2, . . . , E(n-1), En forms a partition of a set S if
- Ei ∩ Ej = ∅ for all pairs i ≠ j in {1, . . . , n} (the condition of
being mutually exclusive)
- E1 ∪ E2 ∪ . . . ∪ E(n-1) ∪ En = S (the condition of being exhaustive, which says that the union of these events equals
the whole set)
Examples:
- {{1, 3, 5}, {2, 4, 6}} is a partition of {1, 2, 3, 4, 5, 6} because
  {1, 3, 5} ∩ {2, 4, 6} = ∅ (mutually exclusive)
  {1, 3, 5} ∪ {2, 4, 6} = {1, 2, 3, 4, 5, 6} (exhaustive)
- {{1, 2, 3, 4, 5}, {}, {6}} is a partition of {1, 2, 3, 4, 5, 6}, again because
  the sets in this partition are pairwise mutually exclusive, that
  is,
  * {1, 2, 3, 4, 5} ∩ {} = ∅
  * {1, 2, 3, 4, 5} ∩ {6} = ∅


  * {} ∩ {6} = ∅
  the sets in this partition are exhaustive, that is,
  {1, 2, 3, 4, 5} ∪ {} ∪ {6} = {1, 2, 3, 4, 5, 6}
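The two defining conditions can be checked mechanically. Here is a small sketch of ours (is_partition is not a function from the text); following the text's second example, empty blocks are allowed:

```python
from itertools import combinations

def is_partition(blocks, S):
    """True iff `blocks` are pairwise disjoint and their union equals S."""
    pairwise_disjoint = all(a & b == set() for a, b in combinations(blocks, 2))
    exhaustive = set().union(*blocks) == set(S)
    return pairwise_disjoint and exhaustive

S = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 3, 5}, {2, 4, 6}], S))         # True
print(is_partition([{1, 2, 3, 4, 5}, set(), {6}], S))  # True
print(is_partition([{1, 2}, {2, 3, 4, 5, 6}], S))      # False: 2 appears twice
```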

Axioms of Probability: With the above definitions in place, we can now introduce the axioms of probability. These axioms are intuitively summarized
in the Gonick/Smith cartoons. Briefly, the axioms are the following:
- All probability values are associated with events (including elementary events), and are real numbers r such that 0 ≤ r ≤ 1. Examples:
  The probability of getting a 2 in a die-toss is 1/6 ({2} is an elementary event)
  The probability of getting an odd value in a die-toss is 1/2 ({1, 3, 5}
  is a non-elementary event)
- The sum of the probability values of all elementary events adds up to
1.
- The probability of the empty event, i.e., the empty set, is 0:

p({}) = p(∅) = 0
- If events E1, E2, . . ., En partition the sample space, then the probability values of the Ei add up to 1. That is,

p(E1) + p(E2) + . . . + p(En) = 1

Notice I said partition the sample space. Any partitioning cuts up a set
into a collection of mutually exclusive and exhaustive events. Here are
two familiar examples:
  The probability of getting an odd or an even value is p({1, 3, 5}) +
  p({2, 4, 6}) = 1,
  p({1, 2, 3, 4, 5}) + p({}) + p({6}) = 1
* We of course know that p({}) = 0.


- For two non-disjoint events E1 and E2, p(E1 ∪ E2) = p(E1) + p(E2) −
p(E1 ∩ E2). Examples:
  The probability of getting an odd value: 1/2.
  The probability of getting a value above 4 is p({5, 6}) = 1/3.
  But the probability of getting an odd value or a value above 4 is
  not 1/2 + 1/3 = 5/6; rather, it is

  p({1, 3, 5} ∪ {5, 6}) = p({1, 3, 5, 6}) = 4/6,

  obtained as 1/2 + 1/3 − p({1, 3, 5} ∩ {5, 6})
  i.e., 1/2 + 1/3 − p({5})
  i.e., 1/2 + 1/3 − 1/6
  i.e., 5/6 − 1/6
  i.e., 4/6.
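The inclusion-exclusion computation above is easy to verify by counting outcomes. A sketch under the uniform single-die model (the helper p is ours, not from the text):

```python
from fractions import Fraction

DIE = set(range(1, 7))

def p(event):
    """Probability of an event under the uniform model for one fair die."""
    return Fraction(len(set(event) & DIE), 6)

odd, above4 = {1, 3, 5}, {5, 6}
lhs = p(odd | above4)                       # p({1, 3, 5, 6}) = 4/6
rhs = p(odd) + p(above4) - p(odd & above4)  # 1/2 + 1/3 - 1/6
print(lhs, rhs)  # 2/3 2/3
```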
Step-4: Use a Decision Tree Diagram (or approximate it).

Figure 11.1: Decision tree for one coin (cartoon from Gonick/Smith). The single toss branches to outcome H with probability 1/2 and to outcome T with probability 1/2.
For simple problems, it helps to draw out a full decision tree, so that you do
not make mistakes. For more involved problems, drawing suitably approximated decision trees can still help you think clearly and avoid mistakes.



Figure 11.2: Decision tree for two coins. Toss 1 branches to H and T (probability 1/2 each); each branch splits again on Toss 2, giving the outcomes (H,H), (H,T), (T,H), and (T,T), each with probability 1/4. The outcomes (H,H), (H,T), and (T,H) are checkmarked as contributing to the event "at least one H."

Draw decision trees similar to those in the Lehman/Leighton/Meyer book Mathematics for Computer Science.[8] These decision trees are noteworthy in many ways:
(1) they depict the stages of each random experiment (or game), annotating the edges with probabilities; (2) they show
the elementary outcomes as leaves, assigning probabilities to
them; (3) they put checkmarks against collections of elementary events, writing what events they contribute to. (Note:
decision trees are in fact even more useful for understanding
conditional probabilities, as we shall soon see.)

11.1.3 A Collection of Examples

Probability theory is best learned by solving many problems.


Toss of a Single Fair Coin: Figure 11.1 presents the decision tree for the
toss of a single (fair) coin. In the decision tree, we label the action and
the outcome as shown.
The toss of two coins in sequence: Figure 11.2 presents the decision tree
for the toss of two coins in sequence. We can see how the actions, outcomes (or elementary events), and finally, the events of interest are
annotated.


Figure 11.3: Sample Space and Events for two dice (from Gonick/Smith)


Figure 11.4: Strange Dice: A versus B (from the MCS book)


Sample Space and Events: Two Dice: Figure 11.3 discusses the sample
space and events associated with two dice.
A versus B : Strange Dice: Figure 11.4 analyzes the probability of strange
die A winning over strange die B.
Use of Or: Disjoint and Non-Disjoint: Figure 11.5 discusses the or of
two events: disjoint and non-disjoint.
Use of the Not of an event: Figure 11.6 shows how the use of Not can
simplify the analysis of probabilities.
De Méré's problem: Use of Complements: Figure 11.8 analyzes De Méré's
problems using the not operator. It demonstrates that the use of the
complement of an event can simplify analysis.
Birthday Paradox: Another use of Complements: Some code to execute the Birthday paradox is given in Figure 11.9. The problem
and its encoding are in the comments of function bdayColl. You can
clearly see the decreasing probability of having distinct birthdays as N
increases:
By applying the rule of complements, you can then surmise that the
probability of collision increases as N grows. This exact logic underlies the design of hash tables. The rule of hash-table sizing in response
to this observation is discussed on a number of sites e.g., http://
cseweb.ucsd.edu/~kube/cls/100/Lectures/lec16/lec16-5.html.

>>> plotBdayColl(100)
1 : 1.0
2 : 0.9972602739726028
3 : 0.9917958341152187
4 : 0.9836440875334497
5 : 0.9728644263002064
6 : 0.9595375163508885
7 : 0.9437642969040246
8 : 0.925664707648331
9 : 0.9053761661108333
10 : 0.8830518222889224
11 : 0.858858621678267
12 : 0.8329752111619356
13 : 0.8055897247675705
14 : 0.7768974879950271
15 : 0.7470986802363135


Figure 11.5: Or of two events: disjoint and non-disjoint cases (Gonick/Smith)


Figure 11.6: Use of Not of an event (Gonick/Smith)

Figure 11.7: De Méré's Conundrum (Courtesy, Gonick and Smith)


Antoine Gombaud, Chevalier de Méré, and his problems.

Figure 11.8: De Méré's problem (cartoons courtesy Gonick/Smith). Left: the probability of seeing no 6 in four throws of a die is (5/6)^4, read off a four-stage decision tree whose edges each carry probability 5/6. Right: the probability of seeing no double-6 in twenty-four throws of two dice is (35/36)^24, read off a twenty-four-stage tree whose edges each carry probability 35/36.


from functools import reduce

def Perm(n, r):
    """
    Implements P(n,r) or n P r.
    Precondition: n >= r, n >= 0, r >= 0.
    """
    assert n >= r, "Error: Fed n < r"
    return reduce(lambda x, y: x * y, range(n, n - r, -1), 1)  # Returns 1 when r = 0

def Fact(n):
    """
    Factorial of n. Builds on Perm: Fact(n) = Perm(n, n).
    """
    return Perm(n, n)

def Comb(n, r):
    """
    Implements C(n,r) or n C r.
    Precondition: n >= r, n >= 0, r >= 0.
    """
    return Perm(n, r) // Fact(r)

def PascTri(N):
    """
    Print Pascal's Triangle: row i lists i C 0 through i C i, for 0 <= i <= N.
    """
    for i in range(N + 1):
        print([Comb(i, j) for j in range(i + 1)])

def bdayColl(n):
    """
    Given n people in a room, return the probability that all of them
    have distinct birthdays. Obtained as 365 P n / 365^n, where the
    numerator is the size of the event that all n of them have distinct
    birthdays, and 365^n is the size of the sample space. 365 P n is
    realized using reduction.
    """
    return float(Perm(365, n)) / (365.0 ** n)

def plotBdayColl(N):
    """
    Invoke bdayColl for 1..N and print the decreasing probability as N increases.
    """
    for i in range(1, N + 1):
        print(str(i) + " : " + str(bdayColl(i)))

#-- Poker-hand probabilities: From
#   http://www.math.hawaii.edu/~ramsey/Probability/PokerHands.html
def singlePairProb():
    return (Comb(13, 1) * Comb(4, 2) * Comb(12, 3) * (4 * 4 * 4)) / float(Comb(52, 5))
#--end

Figure 11.9: Some Python code to execute the Birthday Paradox, plus poker hands, etc.

16 : 0.7163959947471501
17 : 0.6849923347034393
18 : 0.6530885821282106
19 : 0.6208814739684633
20 : 0.58856161641942
21 : 0.5563116648347942
22 : 0.5243046923374499
23 : 0.4927027656760146
24 : 0.4616557420854712
25 : 0.43130029603053616
26 : 0.401759179864061
27 : 0.37314071773675805
28 : 0.3455385276576006
29 : 0.31903146252222303
30 : 0.2936837572807313
31 : 0.26954536627135617
32 : 0.2466524721496793
33 : 0.225028145824228
34 : 0.20468313537984573
35 : 0.18561676112528477
36 : 0.1678178936201205
...
53 : 0.01886188651608717
...
88 : 1.0719834084561783e-05
...
100 : 3.0724892785157736e-07
>>>

In the next section, we proceed to discuss the topic of conditional probability.

11.2 Conditional Probability

11.2.1 Conditional Probability Basics

Figure 11.10 discusses the basics of conditional probability. Suppose you


stand next to someone who has a closed fist containing two dice. Let events
A and C be as defined in Figure 11.3, meaning: A is the event that the
dice add up to 3, and C is the event that the white die shows a 1. Now,
P(A) = 2/36, as both (1, 2) and (2, 1) sum to 3 and there are 36 outcomes
in the sample space.
But suppose the person reveals that C has occurred (as in Figure 11.10);
then under this condition, the probability of A becomes 1/6, because the
white die has to be a 1, and there are 6 such outcomes: (1, 1), (1, 2), (1, 3),
(1, 4), (1, 5), and (1, 6); within this set, event A corresponds only to (1, 2).
Thus we invent a new notation P(A | C), meaning the probability that
A occurs in the reduced sample space modeled by C having occurred.
This is 1/6. Mathematically, P(A | C) is defined only if P(C) ≠ 0, and
is given by

    P(A | C) = P(A ∩ C) / P(C)

and its value is 1/6 in this example.
If P(C) = 0, then P(A | C) is undefined.
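This definition is easy to check by enumerating the 36 outcomes. A minimal sketch (the variable names are ours):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def p(event):
    return Fraction(len(event), len(space))

A = {o for o in space if sum(o) == 3}  # dice add up to 3
C = {o for o in space if o[0] == 1}    # white (first) die shows 1

print(p(A))             # 1/18, i.e., 2/36
print(p(A & C) / p(C))  # 1/6, i.e., P(A | C)
```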
The exact Venn diagram describing conditional probabilities is given in Figure 11.11. We now describe this diagram.
- This Venn diagram depicts all people in the world (the sample space).
- It shows the set of people who live in Cambridge, a city in Massachusetts
(where MIT is). This is set B.
- It then shows those who are MIT students (set A).
- Thus, P(A | B) means the probability that the person is an MIT student, given that the person lives in Cambridge.
- This is given by the dark shaded area (P(A ∩ B)) divided by the light
shaded area (P(B)). Notice that P(B) ≠ 0.

Figure 11.10: Basics of Conditional Probability

Figure 11.11: Venn Diagram Illustrating Conditional Probabilities (from
Mathematics for Computer Science by Lehman, Leighton, and Meyer,
MIT Educational Resource)

Also note the following very important connection between
disjointness and independence. Two events A and B are independent if and only if P(B) = 0 or

    P(A | B) = P(A)

That is, in case P(B) is non-zero, the occurrence of A is not conditioned upon B having occurred.
Notice that if A and B are disjoint, their intersection (the dark
shaded region) is empty.
Suppose this happens when B is non-empty. Then it can only
mean one thing:
- P(A ∩ B) = 0 (the dark shaded region is empty)
- P(B) ≠ 0
- Thus P(A | B) = P(A ∩ B)/P(B) = 0
- But P(A) ≠ 0 is possible
- Thus P(A | B) ≠ P(A)
- In other words, disjoint events (of non-zero probability) are not independent.
This makes sense. If two events A and B are disjoint, then
B having occurred means A did not occur!


If two events A and B are independent, then we can rewrite

    P(A | B) = P(A ∩ B) / P(B)

as

    P(A | B) = P(A) = P(A ∩ B) / P(B)

or, in other words,

    P(A ∩ B) = P(A) · P(B)

11.2.2 Derivation of Bayes' Theorem

Figure 11.12 discusses Bayes' Theorem and associated results, accompanied
by examples. Since the use of conditional probability is really error-prone,
we list some of the underlying formal results that guide us in its application:
First of all, whenever P(B) ≠ 0, we have

    P(A | B) = P(A ∩ B) / P(B)

By the same token, whenever P(A) ≠ 0, we have

    P(B | A) = P(B ∩ A) / P(A)

Putting these together, we have Bayes' Theorem (or Bayes' rule):

    P(B | A) · P(A) = P(A | B) · P(B) = P(A ∩ B)

11.2.3 Law of Total Probability

If P(E) and P(!E) are non-zero (where !E denotes the complement of E), then

    P(A) = P(A | E) · P(E) + P(A | !E) · P(!E)
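As a quick sanity check of this law on a fair die, take E = "even value," !E = "odd value," and A = "value above 4." The numbers below are our own worked example, not from the text:

```python
from fractions import Fraction

p_E = Fraction(1, 2)             # P(even) = p({2, 4, 6})
p_notE = 1 - p_E                 # P(odd)
p_A_given_E = Fraction(1, 3)     # P(above 4 | even) = p({6}) / p({2, 4, 6})
p_A_given_notE = Fraction(1, 3)  # P(above 4 | odd)  = p({5}) / p({1, 3, 5})

p_A = p_A_given_E * p_E + p_A_given_notE * p_notE
print(p_A)  # 1/3, which indeed equals p({5, 6}) computed directly
```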


Figure 11.13: Patient Testing: Use of Bayes Theorem

A: Patient has disease; B: Patient tests positive.

    P(A)  = .001       P(B|A)  = .99      P(!B|A)  = .01
    P(!A) = .999       P(B|!A) = .02      P(!B|!A) = .98

    Leaves of the decision tree (joint probabilities):
    B & A  : .00099        !B & A  : .00001
    B & !A : .01998        !B & !A : .979

Figure 11.14: Decision Tree for Medical Testing


11.2.4 Patient Testing: Bayes' Theorem

Figure 11.13 presents the basics of conditional probability as used for drug
testing and for determining the likelihood of having a disease if one tests
positive for it. Figure 11.14 presents the decision tree associated with this
example. Here is a complete explanation of this highly important example,
which ties together all the concepts introduced thus far:
- Medical testing is seldom 100% fool-proof. Suppose the probability of
having the disease is .001 (shown as P(A)).
- Suppose the probability of the test coming out positive, given that one
has the disease, is .99. That is, P(B | A) = .99, as in the figure.
- By the above discussion, the probability of not having the disease is
.999 (shown as P(!A)).
- Testing can still yield a positive result even when one does not have
the disease! Thus P(B | !A) = .02 is possible, as in the diagram.
- But fortunately, P(!B | !A) = .98 (that is, with this probability the test
is negative when one has no disease).
- Question: Suppose one tests positive; what is the probability that one
has the disease? In other words, what is P(A | B)?
By the definition of conditional probability, we have

    P(A | B) = P(A ∩ B) / P(B)

We see that P(A ∩ B) = .00099 from the decision tree.


Now, what is P(B)? This is the probability of "the person tests positive." Using the law of total probability, we can write

    P(B) = P(B | A) · P(A) + P(B | !A) · P(!A)

From the diagram we can read off this value to be

    P(B) = .00099 + .01998 = .02097


Thus, P(A | B) = .00099/.02097 = 0.0472.
In other words, even after testing positive, you have the disease only
with a 4.72% chance!!
Wow. This low an efficacy of testing? In practice, most tests are not
this bad.


Such a low number results from the disease being so rare (0.001, or
0.1% of the population), and from the testing having such a high false-positive
(false-alarm) rate: even 2% of those who don't have the disease test
positive. In practice, hopefully, things are far better.
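The entire calculation above fits in a few lines of Python. A sketch using the text's numbers (the variable names are ours):

```python
p_A = 0.001            # P(disease)
p_B_given_A = 0.99     # P(positive | disease)
p_B_given_notA = 0.02  # P(positive | no disease): the false-positive rate

# Law of total probability, then Bayes' rule.
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)
p_A_given_B = p_B_given_A * p_A / p_B

print(round(p_B, 5))          # 0.02097
print(round(p_A_given_B, 4))  # 0.0472
```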

11.2.5 More Examples on Independence and Dependence

Independence of two dice events


Suppose we consider the toss of two dice, one white and the other black.
Suppose C is the event "white is 1" and D is the event "black is 1."
Clearly, these events appear to be independent: the occurrence of C does
not affect that of D (and vice versa).[9] Let us calculate these results (see
Figure 11.5, which also highlights these events):
- P(C | D) = P(C ∩ D) / P(D)
- There is exactly one outcome in C ∩ D, namely (1, 1); hence P(C ∩ D) =
1/36
- P(D) = 1/6, as there are 6 outcomes in this event.
- Thus P(C | D) = (1/36)/(1/6) = 1/6 = P(C). Thus, C and D are independent.
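The same check can be done by enumeration, which also confirms the product rule P(C ∩ D) = P(C) · P(D) for these two events (a sketch; the names are ours):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # (white, black) pairs

def p(event):
    return Fraction(len(event), len(space))

C = {o for o in space if o[0] == 1}  # white die is 1
D = {o for o in space if o[1] == 1}  # black die is 1

print(p(C & D))         # 1/36
print(p(C) * p(D))      # 1/36 as well: C and D are independent
print(p(C & D) / p(D))  # 1/6 = p(C)
```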
Independence of two disjoint events
Consider E to be "adds up to 6" and F to be "adds up to 3" (see Figure 11.5,
which also highlights these events). These events are disjoint, so P(E ∩ F) =
0. Both P(E | F) and P(F | E) are P(E ∩ F) divided by something that
is non-zero (P(F) and P(E), respectively), and hence both are 0. Since P(E)
and P(F) are themselves non-zero, we have P(E) ≠ P(E | F), and also
P(F) ≠ P(F | E): the events are not independent.
Independence of two non-disjoint events
Now consider the A and C events discussed in Figure 11.10. We have
P(A | C) = 1/6, while P(A) = 2/36 = 1/18. Thus, A is not independent of C.
Independence in a decision tree
Figure 11.15 tells us how, by inspecting a decision tree, we can immediately
tell that two events are independent.
(Footnote 9: Unless the dice are coupled by a thin spring, as in one of the Gonick/Smith cartoons.)


Here P(A) = y and P(!A) = 1−y, and the B branch carries the same probability x in both cases: P(B|A) = P(B|!A) = x (and P(!B|A) = P(!B|!A) = 1−x). The leaves are:

    B & A  : x·y          !B & A  : (1−x)·y
    B & !A : x·(1−y)      !B & !A : (1−x)·(1−y)

By total probability, P(B) = x·y + x·(1−y) = x. Since P(B & A) = x·y and P(B) = x,
we get P(A|B) = x·y/x = y = P(A), and likewise P(B|A) = x = P(B).
Thus A and B are independent, and we can tell this by the x versus 1−x symmetry
under the B event, for both cases of the A event.

Figure 11.15: Independence as evident from a decision tree

11.3 Advanced Examples

11.3.1 New England Patriots

I provide analysis of this situation at http://tinyurl.com/Coin-Deflate-Gate.

11.3.2 Independence, and how it allows the Product Rule

To see that independence is crucial for applying the product rule, let us work
out the following example. Consider the toss of two dice. Let event WO =
"white is odd" and SELE4 = "the dice sum to an even number that is at most 4"
(when the white die is odd, this forces the black die to also be odd, and at
that only 1 or 3). Let us analyze this situation to see if

P(SELE4 | WO) = P(SELE4)

i.e., whether SELE4 is independent of WO.
- P(WO) = 1/2.
- P(SELE4): happens in these cases: (1,1), (1,3), (3,1), (2,2).
  Probability is 4/36 = 1/9.
- P(SELE4 ∩ WO): happens in these cases: (1,1), (1,3), (3,1).
  Probability is 3/36 = 1/12.
- P(SELE4 | WO) = P(SELE4 ∩ WO)/P(WO) = (1/12)/(1/2) = 1/6.
- P(SELE4) = 1/9.
- Since P(SELE4 | WO) ≠ P(SELE4), we conclude that these are dependent events.

If, instead of SELE4, we just say SE = "dice add up to even," then we
will find that the events end up being independent. (Try this!)


Now we have P(SELE4 | WO) ≠ P(SELE4). Thus,

    P(SELE4 ∩ WO) ≠ P(WO) · P(SELE4),

or: given that these events are dependent, one may not apply the
product rule!
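All of the numbers in this example can be reproduced by enumeration, and trying SE in place of SELE4 (as the note above suggests) shows independence. A sketch (the event names follow the text; the code is ours):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # (white, black)

def p(event):
    return Fraction(len(event), len(space))

WO = {o for o in space if o[0] % 2 == 1}                       # white is odd
SELE4 = {o for o in space if sum(o) % 2 == 0 and sum(o) <= 4}  # sum even, <= 4
SE = {o for o in space if sum(o) % 2 == 0}                     # sum even

print(p(SELE4 & WO) / p(WO), p(SELE4))  # 1/6 vs 1/9: dependent
print(p(SE & WO) / p(WO), p(SE))        # 1/2 vs 1/2: independent
```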

11.3.3 Independence is Symmetric

If A depends on B, then surely B depends on A. Let us set up a proof by contradiction.
- A depends on B.
- Thus P(A | B) ≠ P(A).
- Thus P(A ∩ B)/P(B) ≠ P(A). (*)
- Now assume B is independent of A.
- That is, P(B | A) = P(B), or that P(B ∩ A) = P(A) · P(B).
- Then we can obtain P(B ∩ A) = P(A ∩ B) = P(A) · P(B).
- This yields: P(A ∩ B)/P(B) = P(A).
- We obtain a contradiction with (*).

11.3.4 New England Patriots Game

Are the Patriots deflating the coin?

The website http://www.npr.org/2015/11/06/455049089/luck-of-the-flip-new-england-patriots-defy-probability-with-coin-toss-wins
carries the story "Luck Of The Flip: New England Patriots Defy Probability With Coin Toss Wins" (www.npr.org, Nov 6, 2015).
Apparently the New England Patriots are winning tosses at an impressive rate (19 out of 25 so far). There is analysis given here:
http://nesn.com/2015/11/numbers-bill-belichick-patriots-win-pregame-coin-flip-at-impossible-rate/
"Assuming the coin toss is a 50/50 proposition, the probability of winning it at least 19 times in 25 tries is 0.0073.
That is less than three-quarters of 1 percent." (Emphasis theirs.)


I decided to do some analysis of the probability of winning exactly 19
tosses out of 25.
The exact toss-sequence is an arbitrary bit-pattern of length 25. A toss-call
sequence of length 25 matches the toss-sequence in exactly 19 places if there
are exactly 6 mistakes. These 6 mistakes could be anywhere, and so we choose
6 places out of 25 where the toss-call differs from the toss-sequence.
The set of all toss-call sequences is the sum of:
- sequences that are wrong in 0 places: 25 C 0
- sequences that are wrong in 1 place: 25 C 1
- ...
- sequences that are wrong in 6 places: 25 C 6   [[ event of interest ]]
- ...
- sequences that are wrong in all 25 places: 25 C 25.

This sum is 2^25 (the Binomial theorem applied to (1 + 1)^25). This independently makes sense, as there are 2^25 ways to generate bit strings out of 25
bits (but I wanted to do it directly based on the problem at hand, and not
jump onto a familiar formula just because it is there). This is the sample
space.
The event of interest is marked above (the Patriots managed to pick those
sequences with exactly 6 mistakes). The probability is (25 C 6) / 2^25. Using my
Python code, I get

>>> Comb(25,6) / (2**25)
0.00527799129486084

Since this number does not match the analyst's number, I decided
to do another calculation. Suppose we mean not exactly 6 mistakes, but
anywhere from 0 to 6 mistakes. Then?
Then the event of interest is the sum of 25 C i for i = 0 through 6. Because these events are disjoint for every i, we can apply the rule of the sum. First, a test run to
estimate the event size:


>>> sum( [Comb(25,i) for i in range(7) ])
245506

Now for the real probability:

>>> sum( [ Comb(25,i) for i in range(7) ]) / (2 ** 25)
0.007316648960113525

This matches the result given above - reassuring!
Now, MACEACHERN goes on to say this:
MACEACHERN: If we're thinking about professional football, there are a lot of
teams. And if instead of focusing only on the Patriots, you ask what's the chance
that at least one of the teams wins 19 out of 25, the probability then is, of course,
much larger.
MCEVERS: But Steve MacEachern says the chance of winning or losing the toss
will always stay at about 50-50.
SIEGEL: Plus, he says, it's pretty hard to deflate a coin.
:-)

The Probability of Some Team Being Lucky


How many teams are there in the NFL? How many coin-toss experiments
could be engaged in independently by these teams? Note that we are making
a huge assumption: that the toss outcomes of the teams are independent.
As per http://espn.go.com/nfl/teams, there are 32 teams. Suppose
all the teams (ALL TEAMS) toss, and every one of them ends up making 7 or
more mistakes in its toss-calls. Then we have a big "32-way AND" event. The
probability of the complement of this event is what we are after. Again, this
rests on independence, which licenses us to apply the product rule.
Here are the calculations:

1 - (((25 C 7) + ... + (25 C 25)) / (2 ** 25)) ** 32

>>> 1 - ( sum( [ Comb(25,i) for i in range(7,26,1) ] ) / (2 ** 25) ) ** 32
0.20942401274128541

Wow, this is pretty high!! A 20% chance that some team will win 19 or
more of its 25 tosses!


A Cleaner Derivation
In my haste, I typed a redundant calculation:

sum( [ Comb(25,i) for i in range(7,26,1) ] ) / (2 ** 25)

But the discerning reader will note that this part evaluates to 0.9926833510398865,
which is

1 - 0.007316648960113525

That is, it is the complement of the event probability we already evaluated
earlier, namely via

sum( [Comb(25,i) for i in range(7) ]) / (2 ** 25)

This forms another nice illustration of the use of complementary events!

Chapter 12

Functions, Relations, Infinite Sets
In this chapter, we will present an overview of three inter-connected topics,
namely
functions,
relations, and
infinite sets.
Through these topics, we will learn many concepts central to everyday computer science. The subject of functions and relations is fundamental to Discrete Structures yet vast. In the interest of time, we will focus on a small
selection of topics; we provide a summary below:
- Functions and Correspondences: Functions are mappings from domains to codomains (Section 12.4). We will study functions along these lines.
  - Types of Functions: 1-1 and Onto: Some functions are 1-1, while
    others are many-to-one. It is important to know under what conditions functions are 1-1.
  - Showing whether a given function is a correspondence: This is
    a pictorial proof that we will present in order to show whether a
    given function is invertible. We will call it the "Tarzan proof." It
    conjures the image of Tarzan being able to swing from any point
    in the domain to the codomain and back. Similarly, we (Tarzan)
    must be able to swing from any point in the codomain to the domain, and back! (Section 12.4.2)


- Gödel Hashes: Prime Factorization to Ship Secrets: By the so-called
  fundamental theorem of arithmetic, every natural number above 1 can be written uniquely as a product of primes. This
  allows us to encode tuples of natural numbers into a single natural number and vice versa (Section 12.4.3). We will give you some cool
  Python code that you should fully understand, and then run some
  examples using it.
- Infinite Sets and Cardinalities: We obtain some surprises when
  we apply familiar ideas from finite sets to infinite sets. For instance, for infinite sets A and B, it is possible that A is a proper subset of B, and
  yet they have the same cardinality. This argument is based on
  exhibiting correspondences (Section 12.5). A very cool theorem called the
  Cantor-Schröder-Bernstein (C-S-B) theorem will allow us to easily find correspondences.
- Diagonalization: Showing Correspondences Don't Exist: In some
  cases, we would want to argue that a correspondence cannot exist
  without introducing a contradiction. A famous proof technique in
  this area is called diagonalization (Section 12.6).

12.1 Overview of Functions and Relations

Functions and relations are used to relate items between two given sets. The
first of these sets is called the Domain and the second the Codomain. We
assume that both the domain and the codomain are non-empty sets. These
kinds of mappings or associations appear in many places in computing.
A function tends to model a piece of code that processes some input. For
instance, a spell-checker is a function that, given a piece of text, consults
a dictionary and emits all the misspelt words. With respect to a given dictionary and a given piece of prose, the list of misspelt words is
uniquely determined. That is, for a combination (prose, dictionary), there
can't be two distinct lists of misspelt words. In such a situation, one can employ a function (a one-to-one or a many-to-one map, but never a one-to-many
map).
A relation can model food or beverage preferences of individuals. Each
individual typically likes more than one food item. Thus, the mapping
from people to their preferred food items is a one-to-many map here is


where functions cant be used. A relation is a generalization of functions


that allows a one-to-many mapping as well.
More formally, let A be a domain and B be a codomain. A function f : A →
B relates items from A, yielding items in B. Functions are single-valued
mappings. That is, given x ∈ A, there is only one y ∈ B that is yielded. In this
sense, functions are also relations; we then say that the relation is functional.
One should not confuse the mathematical idea of functions with
the realization of functions in a computer. Inside a computer, a function wanders about for a little while and (hopefully) emerges with
an answer.¹ This behavior, when examined over all possible inputs, defines a
mapping. In the limit, we obtain the entire mapping of the alleged function
that underlies a computer program.
Every function f must work for every x ∈ A, i.e., yield a mapping for
every x ∈ A. For instance, suppose A = N × N and B = R. Is / : A → B a
function? Unfortunately, it is not, because / is undefined on pairs of the form (x, 0). Thus,
one must define / with domain N × N+ (where N+ = N \ {0}). In this case, the
domain avoids (x, 0) for any x, and then the / function is defined everywhere
on such a domain.
Relations are not single-valued; that is, they can associate more than
one element in B with each element of A. For example, a relation that models
food or beverage preferences of individuals can be
{(Ali, Kebab), (Yuki, Sushi), (Krishna, Idli),
(Krishna, Dosa), (Yuki, Tempura), (Ali, Falafal),
(Miguel, Tamales)}

In this example, Ali likes Kebab and Falafal, Krishna likes Dosa and
Idli, Yuki likes Sushi and Tempura, while Miguel likes only Tamales.
If you want to make food preferences functional, you have to force each
person to choose only one food type; it is still possible for multiple individuals
to prefer one food item. An example of a functional food-preference relation
¹ It is an entirely different issue that we cannot tell whether such a function has decided
to enter into an infinite loop or not. Things that infinitely loop cannot be associated with
mathematical functions, as they must be defined everywhere in the domain. The computer-science notion of functions does allow for functions infinitely looping. This is achieved
by introducing the notion of partial functions. Such functions model looping by returning a special value called ⊥, or bottom. More on that when you study the Denotational
Semantics of Programming Languages.


CHAPTER 12. FUNCTIONS, RELATIONS, INFINITE SETS

would be:
{(Ali, Idli), (Yuki, Idli), (Krishna, Dosa), (Miguel, Tamales)}

In this example, we have eliminated the situation of one person preferring
more than one food type (multiple individuals may still share a favorite,
as Ali and Yuki do here).
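To make this concrete, here is a small Python sketch (the helper name is ours) that stores a food-preference relation as a set of (person, food) pairs and tests whether it is functional:

```python
def is_functional(relation):
    # A relation is functional when no left element (person)
    # is paired with two different right elements (foods).
    chosen = {}
    for person, food in relation:
        if person in chosen and chosen[person] != food:
            return False
        chosen[person] = food
    return True

# The one-to-many preference relation from the text.
prefers = {("Ali", "Kebab"), ("Ali", "Falafal"), ("Yuki", "Sushi"),
           ("Yuki", "Tempura"), ("Krishna", "Idli"), ("Krishna", "Dosa"),
           ("Miguel", "Tamales")}

# The functional version: each person maps to exactly one food.
functional = {("Ali", "Idli"), ("Yuki", "Idli"),
              ("Krishna", "Dosa"), ("Miguel", "Tamales")}
```

Running is_functional on the two relations returns False and True, respectively: Ali's two choices break functionality in the first relation, while Ali and Yuki sharing Idli in the second is perfectly fine.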
In our book, the rule for relations is specified as follows. Let A be the
domain and B the codomain. The rule for a relation is then a subset of A × B. In
the book, only binary relations are defined.
In general, relations can have higher arity. For instance, a ternary relation over A, B, C is a subset of A × B × C. For example, A can be People, B
can be Food preferences, and C can be Age. Such triples may be stored in
a hotel database to, say, recommend food for different age-groups; example:
{(Mikey, PBnJ, 4), (Shaq, Steak, 30+), (Trump, RumpRoast, 70)}

Now, let us gain familiarity with functions and relations through more examples.

12.2 Overview of Functions

Functions are maps from domains to codomains, as in Figure 12.1. For every
domain point x and function f, there is no more than one range point y
such that f(x) = y. Functions must be defined everywhere in their domain.
Further details about functions are given in the caption of Figure 12.1.

12.2.1 Example Function: Mapping (0, 1] to [1, ∞)

Let us consider subsets of R defined by intervals such as [1, ∞) and (0, 1].
Here, an interval [1, ∞) means all numbers in R from 1 up to numbers approaching ∞. Note that ∞ is not a number, and so we can't quite write
[1, ∞], which would mean that a number actually equals ∞. Similarly, (0, 1] represents
numbers from 1 down to numbers approaching 0. Such intervals are called
semi-open intervals. (By contrast, an interval of the form [a, b] is called a
closed interval.)
Is there a function that maps every point in the domain (0, 1] to a point
in the codomain [1, ∞) such that


Figure 12.1: The general shape of a function mapping. The entire domain
is mapped from, but the points hit in the codomain (the range of the function shown in purple) can be a proper subset of the codomain for into
functions. If the range and codomain coincide, the function is onto. If the
collapsing arrows are absent (two yellow points going to one purple point),
the function is one-to-one. One-to-one and onto functions are called correspondences or bijections. Correspondences have inverses. Inverses are also
correspondences, with the codomain and domain switched around.
• Every codomain point results uniquely from a single domain point,
• Every codomain point is mapped onto, and
• (Of course) the function works on every domain point.
The answer is of course yes. The rule to apply is 1/x. We can see that when
fed numbers approaching 0, the result 1/x tends to ∞. When the input approaches 1,
the result also approaches 1.
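A quick numeric sanity check of the 1/x rule, written as a Python sketch (the sample points are our own; powers of two are chosen so the arithmetic is exact):

```python
def f(x):
    # The rule mapping (0, 1] to [1, oo): f(x) = 1/x.
    assert 0 < x <= 1, "x must lie in (0, 1]"
    return 1 / x

samples = [1.0, 0.5, 0.25, 0.125]
images = [f(x) for x in samples]   # [1.0, 2.0, 4.0, 8.0]
```

Every image lands in [1, ∞), and the smaller the input, the larger the result, matching the discussion above.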
Question: Define the rule for a function that maps (0, ∞) to
(1, ∞). Hint: Numbers close to 0 may be sent closer to ∞.
Answer: Consider the rule (x + 1)/x.

12.2.2 Example Function: Map Q to N

There are many ways to map Q, the domain of rational numbers, to the
codomain of natural numbers, N. Since every x ∈ Q is of the form a/b, we
can write one of many possible maps. The real question is what we want the
map to represent. We now present some possibilities:
• Don't care: Given a/b, return some fixed number c.


• Just throw away b: Given a/b, return a.
• Map in a many-to-one manner: Given a/b, return a + b.
• Map in a one-to-one manner: Given a/b, return 2^a · 3^b. This is a one-to-one map because of the fundamental theorem of arithmetic, otherwise known as the property of unique factorization of natural numbers. That is, every natural number above 1 can be expressed in one
and only one way as a product of primes. This result appears under
the name fundamental theorem of arithmetic (p. 129 in our book).
Gdel Hash
The idea of encoding numbers using powers of primes has a name: Gdel
hashing! Here is the idea: suppose you want to ship the triple (6, 37, 155, 3)
to your friend. Here are the encoding steps:
Obtain the first four prime numbers to package the four elements of
this quadruple. The primes are 2, 3, 5, 7.
Obtain 26 337 5155 73 , and ship this huge number.
>>> (2**6)*(3**37)*(5**155)*(7**3)
2164268760214856240692772513553339929342581849870101035060
9901117235549251462830796545466771618748680339194834232330
322265625000000L
>>>

And here are the decoding steps:
• Upon receiving the above huge number,
• Divide by 2 until we cannot do so evenly; this achieves 6 divisions
by 2, and so write down 6.
• Similarly, achieve 37 divisions by 3, and write down 37.
• Now achieve 155 divisions by 5, and write down 155.
• Finally, achieve 3 divisions by 7, and write down 3.
• Emit (6, 37, 155, 3), the decoded secret!
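The encoding and decoding steps above can be sketched compactly in Python (a stand-in for the fuller code of Figures 12.6 and 12.7; the function names here are ours):

```python
def encode(tup):
    # Pack a tuple into one number using powers of the first primes.
    first_primes = [2, 3, 5, 7, 11, 13]
    n = 1
    for p, e in zip(first_primes, tup):
        n *= p ** e
    return n

def decode(n, arity):
    # Recover the tuple: for each prime in turn, count how many
    # times it divides n evenly.
    first_primes = [2, 3, 5, 7, 11, 13]
    tup = []
    for p in first_primes[:arity]:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        tup.append(e)
    return tuple(tup)
```

For instance, decode(encode((6, 37, 155, 3)), 4) recovers (6, 37, 155, 3), exactly the round trip described in the step lists.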

12.2.3 Example Function: Map N to N × N

Again, one can arrive at many rules, depending on what one wants to accomplish. Here are some examples:
• Don't care: emit some member of N × N.
• Many-to-one: given x, emit some (a, b) in N × N such that a + b = x (and
to be deterministic, i.e., predictable, we could keep a ≤ b).


Figure 12.2: Dovetailing (zig-zag) correspondence (bijection) from N to N × N
• One-to-one: Enumerate all pairs (a, b) in N × N such that a + b = 0. Then
consider all that add up to 1, then 2, and so on. The full sequence may
look something like this, and corresponds to the zig-zag or dovetailing walk shown in Figure 12.2:
0 → (0, 0)
1 → (1, 0), 2 → (0, 1)
3 → (0, 2), 4 → (1, 1), 5 → (2, 0)
6 → (3, 0), 7 → (2, 1), 8 → (1, 2), 9 → (0, 3)
(and so on)
As it turns out, this can be a one-to-one and onto map. The standard
name for one-to-one and onto maps is correspondence, and we will soon
be discussing correspondences and their significance in §12.5.
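The zig-zag walk can be generated mechanically. Here is a sketch that reproduces the sequence above: diagonal d holds the pairs with a + b = d, and alternating diagonals are walked in opposite directions.

```python
def dovetail(n):
    # Find which diagonal the n-th pair lives on.
    d = 0
    while n > d:
        n -= d + 1
        d += 1
    # n is now the position k within diagonal d.
    # Even diagonals run (0, d), (1, d-1), ...; odd ones run backwards.
    return (n, d - n) if d % 2 == 0 else (d - n, n)
```

[dovetail(i) for i in range(10)] reproduces the ten mappings listed above, starting with (0, 0), (1, 0), (0, 1), (0, 2), ...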

12.2.4 Inverse of a function

The notion of inverse is important to grasp without any loose ends in your
understanding. Functions f : A → B and g : B → A are inverses of each other
if for every a ∈ A and b ∈ B, f(a) = b if and only if g(b) = a. In predicate logic,
we have
∀a ∈ A, ∀b ∈ B, f(a) = b ⇔ g(b) = a


Try however I might, I could not read this statement without my head hurting. Then one day I immediately saw how to present this: It is a Tarzan
Proof! Why? Look what is being said:
• If Tarzan can start from a ∈ A, and can swing to b ∈ B via f (one rope
by which Tarzan swings from tree a to tree b), then Tarzan can come
back to a from b by riding the g rope.
• If Jane (Tarzan's partner) can start from b ∈ B, and can swing to a ∈ A
via g (one rope by which Jane swings from tree b to tree a), then Jane
can come back to b from a by riding the f rope.
That is it!
For further details, please see §12.4.2.

With this definition, let us examine if the following function has an inverse:
• Name: f
• Domain: N+
• Codomain: N+
• Rule for f: 2x
In other words, f is the function lambda x: 2*x. It turns out that it does
not have an inverse g of this type. We want, for every a, b ∈ N+, the Tarzan
conditions to hold:
• Take the rule x/2.
• Unfortunately, this rule applied to 1 and 3 doesn't yield points in N+.
• Hence this inverse, over this domain and codomain, does not exist.

However, if you changed the domain and codomain to R+ = R \ {0} (remove 0
from R), then the said inverse does exist.

12.2.5 Composition of Functions

When two functions f and g are given, naturally one can compose them,
written f ∘ g. We define f ∘ g to be the function such that, given x, (f ∘ g)(x) =
f(g(x)).
A familiar example from trigonometry is sin and sin⁻¹. If we write
sin⁻¹ ∘ sin, we obtain a new function such that, given x, (sin⁻¹ ∘ sin)(x) =
sin⁻¹(sin(x)), which of course is x. Thus, this function composition yields
the identity mapping (in the Lambda notation it would be lambda x: x).
One can compose other functions also; for instance, composing the function lambda x: x*x with itself yields a function that takes the fourth power
of a given input. (One must also specify the domain and codomain, to make


these definitions unambiguous.)
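Composition itself can be written as a small higher-order Python function (a sketch; the helper names are ours):

```python
def compose(f, g):
    # (f o g)(x) = f(g(x)): apply g first, then f.
    return lambda x: f(g(x))

square = lambda x: x * x
fourth = compose(square, square)                      # the fourth-power function
identity_like = compose(lambda x: x + 1, lambda x: x - 1)
```

fourth(3) is 81, and identity_like behaves as lambda x: x, echoing the sin and sin⁻¹ example above.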

12.2.6 Example Functional Relation: Map Faculty to Ranks

Within a department, each member of the faculty holds exactly one rank: say,
AsstProf, AssocProf, or Professor. Thus, one can set up a relation from
Faculty to Rank. As it turns out, this will be a functional relation in most
departments.

12.3 Overview of Relations

Figure 12.3: The general shape of a relational mapping. The entire domain need not be mapped from (i.e., just the pink region may be mapped
from). The same way, the codomain need not be mapped onto fully. Most
commonly, we discuss binary relations over a set X (i.e., the domain
and codomain are the same set X). For a relation over X, if all the X points
are mapped from, or if all the X points are mapped to (or both), the relation
is said to be total or complete. Relational inverses always exist for any binary
relation over X, regardless of whether the relation is total; the inverse is the relation you
see when you turn the arrows around.
Relations are maps from domains to codomains, as in Figure 12.3. For
every domain point x and relation r, there could be more than one range
point y such that (x, y) is in relation r (or, in other words, r(x, y) is true).
However, unlike functions, relations need not be defined everywhere in their
domain. Thus, ∅ is a relation: one that maps nothing to nothing.

12.3.1 Example Relation: Map Faculty to Committees

Within a department, a member of the faculty can be assigned to multiple
committees. This will require the mapping to be modeled using relations (a
one-to-many map).

12.3.2 Example Relation: The inverse of a non-1-1 function

Consider the mapping from Q to N given by the rule: upon input a/b, output
a + b. This defines a many-to-one mapping. For example, given 3/4 or 4/3,
we emit 7.
But what about the inverse mapping? That is, given 7, we want to yield
one of the pairs (expressed as a rational number) that adds up to 7. Now we
do have a relation.
So in summary, the inverse of a many-to-one function is not a function,
but it is definitely a relation.

12.3.3 Inverse of a relation

Relational inverse is an easy concept. Given a relation R over A × B, the
inverse of R, denoted R⁻¹, is defined as follows: (x, y) ∈ R if and only if
(y, x) ∈ R⁻¹. Since it is R that is given, the construction of R⁻¹ is achieved by
taking every pair in R and flipping it.
One can think of relations as arrow diagrams, as in Figure 12.3. In some
contexts, relations can also be interpreted as capturing directed graphs of
node pairs. For example, the relation R over the set {a, b, c}
{(a, b), (a, c), (b, c)}

can be viewed as a graph (or arrow diagram) in which there are two arrows
emanating from a and hitting b and c, and there is an arrow emanating
from b and hitting c. Then, R⁻¹ is the relation in which all the graph edges are
reversed. It would be
{(b, a), (c, a), (c, b)}
now with arrows from b and c hitting a, and an arrow from c hitting b.
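The "flip every pair" construction is a one-liner in Python (a sketch, with our own helper name):

```python
def rel_inverse(R):
    # R is a set of (x, y) pairs; the inverse flips each pair.
    return {(y, x) for (x, y) in R}

R = {("a", "b"), ("a", "c"), ("b", "c")}
R_inv = rel_inverse(R)   # {("b", "a"), ("c", "a"), ("c", "b")}
```

Flipping twice returns the original relation, which is exactly the statement (R⁻¹)⁻¹ = R.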
The caption of Figure 12.3 provides a few additional facts about relations.
It defines the notion of a binary relation over a set X: a very important
special case when the domain and codomain are both the same set X (which
is what we shall study quite extensively in Chapter 13).
Note that relational inverses exist even for relations other than those
over X; i.e., even if R ⊆ A × B, R's inverse is perfectly well defined.
Figure 12.3 defines when a binary relation is total: when there are (x, y)
pairs for all x ∈ X, or when there are (x, y) pairs for all y ∈ X. Total
relations are further discussed in Chapter 13.

Figure 12.4: Illustration of Natural Join (from Wikipedia, https://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators)

12.3.4 Composition of Binary Relations

Relations can similarly be composed. Suppose A ⊆ P × Q is a relation, and
B ⊆ Q × R is a relation. Then A ∘ B is the relation
{(a, b) : ∃x ∈ Q, (a, x) ∈ A ∧ (x, b) ∈ B}

One can think of relations as graphs, as the arrow diagrams in our book
have suggested thus far. Viewed this way, interpret a directed graph G as a
relation RG. Then, RG ∘ RG is a relation that takes two steps at a time (along
the arrow paths of G).
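The defining formula for A ∘ B translates directly into a set comprehension (a sketch):

```python
def rel_compose(A, B):
    # {(a, b) : there exists x with (a, x) in A and (x, b) in B}
    return {(a, b) for (a, x1) in A for (x2, b) in B if x1 == x2}

G = {(1, 2), (2, 3), (3, 4)}     # a path graph, viewed as a relation
two_steps = rel_compose(G, G)    # pairs reachable in exactly two steps
```

Here two_steps is {(1, 3), (2, 4)}: exactly the two-step hops along the path, matching the "takes two steps at a time" description.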
Relational composition finds many uses. In a generalized setting, when
we compose database relations, operators such as join are examples of relational composition. There are many types of joins, and we describe only one
type, called the natural join, an example of which appears on Wikipedia. It
is given in Figure 12.4. There are two differences that come to our attention:


• First, these relations are not binary; they can be of any arity.
• Second, after the natural join, the common entries (across the two tables) are also
retained.
Such join queries are very expensive to evaluate across very large databases,
and modern research approximately computes such joins, trading off accuracy in order to gain performance (as Dr. Li's group in the SoC at Utah is
working on).
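To see a natural join in miniature, here is a Python sketch over two toy tables of our own (rows as dicts): rows are merged exactly when they agree on every attribute name the two tables share.

```python
def natural_join(R, S):
    out = []
    for r in R:
        for s in S:
            shared = set(r) & set(s)            # common attribute names
            if all(r[k] == s[k] for k in shared):
                merged = dict(r)                # shared columns kept once
                merged.update(s)
                out.append(merged)
    return out

employees = [{"Name": "Harry", "DeptId": 34},
             {"Name": "Sally", "DeptId": 33}]
depts = [{"DeptId": 34, "DeptName": "Clerical"},
         {"DeptId": 35, "DeptName": "Marketing"}]
```

Only Harry's row survives the join: Sally's DeptId 33 has no matching row in depts.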

12.4 Functions in Depth

A function is specified by presenting
• its domain (a non-empty set),
• its codomain (a non-empty set), and
• a rule that describes how each domain point is mapped to a codomain
point.
It is a function only if these two conditions are met:
• Totality: Every domain point is mapped to a codomain point.
• Single-value: A domain point is mapped to exactly one codomain point.
It is possible for a function to have these:
• Some codomain points are not mapped onto by any domain point.
• Some codomain points are mapped onto by multiple domain points.
The Signature of a function: The signature of a function is a syntactic convention for presenting the domain and codomain of a function. The signature
is written
f : D → C
meaning that a function named f maps a domain D to the codomain C.
Correspondence or Bijection:
A function f : D → C is a correspondence if f is 1-1 and onto.
Correspondences are also known as bijections.

12.4.1 Examples of Functions

Example: succ1, Successor function from Z to Z
• Let the domain and codomain be Z, which is the infinite set {0, 1, −1, 2, −2, . . .}.
• Let the rule be: map x to x + 1.


• This is a function, because for any member of Z, there is a codomain
point defined, namely, the next higher value.
• All codomain points are mapped onto. Such functions are called onto
functions.
• Each codomain point is mapped onto (targeted) by exactly one domain point. Such functions are called one-to-one (or 1-1)
functions.
Example: succ2, Successor function from N to N
• Let the domain and codomain be N, which is the infinite set {0, 1, 2, . . .}.
• Let the rule be: map x to x + 1.
• This is a function, because for any member of N, there is a codomain
point defined, namely, the next higher value.
• There is one codomain point that is not mapped onto, namely, 0. Thus,
succ2 is not onto. It is still a 1-1 function.
Example: c23, Constant function from N to N
• Let the domain and codomain be N, which is the infinite set {0, 1, 2, . . .}.
• Let the rule be: map x to 23.
• This is a function, because for any member of N, there is a codomain
point defined, namely, always 23.
• This is neither 1-1 nor onto.
Example: Addition function add2 from N × N to N
• Let the domain be N × N and the codomain be N.
• Let the rule be: take the x and y of a domain point (x, y) ∈ N × N,
and send it to x + y.
• This is a function: addition works for all pairs of natural numbers,
and yields a unique sum.
• This is not 1-1, but it is onto.

Example: A familiar Boolean function
• Let the domain and codomain be B, or {0, 1}.
• Let the rule be: map x to ¬x.
• This is the not function.


• This is a one-to-one and onto function.


Truth tables are a convenient way to present the mapping yielded by
Boolean functions.

Example: Another familiar Boolean function
• Let the domain be B × B, or {0, 1} × {0, 1}, and the codomain be B.
• Let the rule be: map (x, y) to xor(x, y).
• This is the xor function.
• This is an onto function. It is not 1-1 because, for instance, 0 is yielded
by (0, 0) and (1, 1).
• It is also not 1-1 because, for instance, 1 is yielded by (0, 1) and (fill this
answer here).
Again, truth tables are a convenient way to present the mapping yielded
by all Boolean functions.

Example: div2 function from N to N
• Let the domain and codomain be N.
• Let the rule be: map x to x div 2. Thus,
  0 and 1 map to 0,
  2 and 3 map to 1,
  4 and 5 map to 2, etc.
• This is not 1-1, but it is onto.

Example: r1, A Function from Rnn to Rnn
• Let Rnn be the set of non-negative reals.
• Let the domain and codomain be Rnn.
• Let the rule be: map x to √x + 33.
• This is a function, because for any member of the domain x ∈ Rnn, there
is a codomain point √x + 33.
• This is not onto. There is no mapping into the codomain points [0, 33).
The signatures of the functions seen so far are listed below:
• succ1 : Z → Z
• succ2 : N → N
• c23 : N → N. Even though this function always yields 23 as the answer, we can set its codomain to be N.
  Of course, someone else may come around and define a codomain
  containing exactly one point, namely 23:
  c23 : N → {23}
  Strictly speaking, this c23 is not the same function as before.
  While its mapping is the same, its declared domain and/or codomain
  are different.
• add2 : N × N → N
• not : B → B
• xor : B × B → B
• div2 : N → N
• r1 : Rnn → Rnn

12.4.2 Correspondences, Invertibility, and Tarzan Proofs

We now offer a formal definition of correspondences, and of when a function
is invertible. We will refer to Figure 12.5, which also depicts the Tarzan
proof.
Let a function f : D → C be given (it maps domain D to codomain C).
Such a function f is invertible, or has an inverse, if there is a function g (serving as the inverse of f) such that:
• g : C → D; i.e., g is a function from codomain C to domain D.
• For all points x ∈ D, if f(x) = y (and we know y ∈ C), it is the case
that g(y) = x.
• We also want this:
  For all points y0 ∈ C, if g(y0) = x0 (and we know x0 ∈ D), it is the
  case that f(x0) = y0.
We called our proof a Tarzan proof because if you think of the
domain and codomain as a forest full of trees, then starting from
any tree x in the domain, we can swing to a tree y in the codomain via f, and swing back to the same tree x in the domain
via g. The same is also true if we started from y0, swung to x0,
and swung back to y0.
The arrows in Figure 12.5 have the following significance:
* Arrows 1,2 form the Tarzan swing from the domain to the
codomain and back.


Figure 12.5: Tarzan Proof to show that a function is a correspondence

* Arrows 3,4 form the Tarzan swing from the codomain to the
domain and back.
A function is a correspondence if it is
• 1-1, and
• onto.
A function is invertible only if it is a correspondence. That is,
• If a function is not 1-1, it does not have an inverse. The reason is
clear: we do not know which input point to come back to.
• If a function is not onto, then too the function is not invertible:
we do not have any mappings that define which domain point the
inverse must map to.
Correspondences are important for many other reasons also:
• They help argue that two finite sets have the same size.
• They help define that two infinite sets have the same cardinality.


Inverse of succ1, Successor function from Z to Z

The inverse of succ1 is a function from Z to Z, with rule "map x to
x − 1". Call this function pred1.
Tarzan Proof:
• For all domain points x ∈ Z, we have
  pred1(succ1(x)) = x
  because (x + 1) − 1 = x.
• For all codomain points y0 ∈ Z, we have
  succ1(pred1(y0)) = y0
  because (y0 − 1) + 1 = y0.
No Inverse for succ2, Successor function from N to N
The inverse of succ2 does not exist. Let us claim that the rule "map
x to x − 1" implements the inverse function, called pred2. The Tarzan
Proof will now fail:
Tarzan Proof attempt:
• For all domain points x ∈ N, we have
  pred2(succ2(x)) = x
  because (x + 1) − 1 = x. This part of the Tarzan swing works.
• For all codomain points y0 ∈ N, we do not always have a domain point x0
  under the mapping pred2. In particular, for 0 ∈ N in the codomain,
  pred2(0) = −1
  which is not in the domain N.
Inverse Exists if we change D or C
Suppose we define the signature of succ2 as
succ2 : N → N+
where, recall, N+ = N \ {0}, i.e., it is N minus the set {0}. Then, there
is an inverse for succ2! This is because, with this modified codomain, we can
swing back from N+ to N.
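Both Tarzan swings for succ2 : N → N+ can be spot-checked over finite samples of each set (a sketch over samples, not a proof):

```python
succ2 = lambda x: x + 1   # domain N, codomain N+ (N minus {0})
pred2 = lambda y: y - 1   # the inverse, from N+ back to N

# Swing from the domain and back, on a finite sample of N.
assert all(pred2(succ2(x)) == x for x in range(100))

# Swing from the codomain and back, on a finite sample of N+.
assert all(succ2(pred2(y)) == y for y in range(1, 101))

# With the original codomain N, the attempt fails at 0:
# pred2(0) lands outside N.
assert pred2(0) == -1
```

The first two checks are the two swings of the Tarzan proof; the last line is the exact failure point exhibited above.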


No Inverse for add2 and div2

Both add2 and div2 are not correspondences, because they are many-to-one. Hence, they do not have inverses. Here is the proof, taking div2 as
an example (the reasons for add2 are similar):
• Suppose we think of a function div2_inv. We need to come up with a
rule to invert div2. Let us say that div2_inv works as follows:
  0 is sent to 0,
  1 is sent to 1, and so on.
  In other words, we think of the identity map.
• But the Tarzan proof won't go through:
  * For all x ∈ N, we don't have the guarantee that div2_inv will
    send div2(x) back to x.
  * For instance, div2_inv(div2(0)) = 0, BUT div2_inv(div2(1)) = 0,
    because of the many-to-one mapping.

12.4.3 Gödel Hashes

Any natural number greater than 1 can be uniquely expressed as a product
of primes. Here are examples, where we express each natural number as an
N-tuple of exponents of primes (typed as lists below):
• 22 = [1, 0, 0, 0, 1], obtained as 2^1 · 3^0 · 5^0 · 7^0 · 11^1
• 254 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
  obtained as 2^1 · 127^1
• 256 = [8], obtained as 2^8
• 258 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1], obtained as 2^1 · 3^1 · 43^1
We can run the code in Figure 12.7, which relies on prime generation
via recursive sieving, given in Figure 12.6. Here are some more examples.
[ gUnhash(x) for x in list(range(2,11)) ]
gives [[1], [0, 1], [2], [0, 0, 1], [1, 1], [0, 0, 0, 1],
[3], [0, 2], [1, 0, 1]]

[ GodelHash(x) for x in [[1], [0, 1], [2], [0, 0, 1], [1,
1], [0, 0, 0, 1], [3], [0, 2], [1, 0, 1]] ]
gives [2, 3, 4, 5, 6, 7, 8, 9, 10]
Consider the Gödel hash operation as a function f defined for all tuples
where not all positions are 0. That is, the tuples on which f applies
are: [1], [0,1], [1,0], [0,0,0,1], etc. Consider Gödel unhash as a
function g. These functions are inverses of each other.
• The domain of f is the union of all possible k-tuples of N for k > 0,
  but avoiding all-0 tuples.
• The codomain is N \ {0, 1}, i.e., 2 and up.
• The forward mapping function f takes each tuple (a, b, c, . . .) and
  position-wise computes 2^a · 3^b · 5^c · · ·.
• The inverse mapping function g successively divides each number
  in the codomain by powers of primes, and produces a tuple of
  integers.

12.5 Infinite Sets, Cardinalities

This section discusses how to measure the size of infinite sets. You will
employ many of the ideas found in this chapter in later courses such as CS
3100 to argue the existence of non-computable functions.
The cardinality of a set is its size. The cardinality of a finite set is measured using natural numbers; for example, the size of {1, 4} is 2. How do we
measure the size of infinite sets? The answer is that we use funny numbers called cardinal numbers. The smallest cardinal number is ℵ0, the next
larger cardinal number is ℵ1, and so on. If one infinite set has size ℵ0, while
a second has size ℵ1, we will say that the second is larger than the first,
even though both sets are infinite. Moreover, ℵ0 is the number of elements
of Nat, while ℵ1 is the number of elements of Real. All these ideas will be
made clear in this section.
To understand that there could be smaller infinities and bigger infinities, think of two infinitely sized dogs, Fifi and Howard. While Fifi is infinitely sized, every finite patch of her skin has a finite amount of hair. This
means that if one tries to push apart the hair on Fifi's back, they will eventually find two adjacent hairs between which there is no other hair. Howard is
not only huge: every finite patch of his skin has an infinite amount of hair!
This means that if one tries to push apart the hair on Howard's back, they
will never find two hairs that are truly adjacent. In other words, there will
be a hair between every pair of hairs! This can happen if Fifi has an ℵ0 amount
of hair on her entire body while Howard has an ℵ1 amount of hair on his body.²
Real numbers are akin to hair on Howard's body; there is a real number
that lies properly between any two given real numbers. Natural numbers

² Hope this wouldn't be viewed as splitting hairs. . .


#!/usr/bin/env python3
import sys
import math

def primes(N):
    """
    Calculate the list of primes upto and including N.
    Recursively compute the primes upto and including ceil(sqrt(N)).
    Then sieve this list out of ceil(sqrt(N))...N.
    """
    if (N <= 1):
        return []
    elif (N == 2):
        return [2]
    else:
        sq = int(math.ceil(math.sqrt(N)))
        p1 = primes(sq)
        p2 = sieve(p1, list(range(sq, N+1)))
        return p1 + p2

def sieve(divs, lst):
    """
    This function sieves the list of numbers passed in through divs
    from the list lst. Essentially, the multiples of the numbers from
    divs are removed from lst.
    """
    if (divs == []):
        return lst
    else:
        knock1 = knock_off(divs[0], lst)
        return sieve(divs[1:], knock1)

def knock_off(d, lst):
    """
    This function removes all multiples of d from lst.
    """
    return list(filter(lambda x: (x % d != 0), lst))

def isPrime(N):
    """
    This function checks if N is a prime.
    """
    if (N <= 1):
        return False
    elif (N == 2):
        return True
    else:
        sq = int(math.ceil(math.sqrt(N)))
        p2 = sieve(list(range(2, sq+1)), [N])
        return (p2 != [])

def isComposite(N):
    """
    Composite numbers are not prime.
    """
    return not(isPrime(N))

Figure 12.6: Illustration of Prime Generation via Recursive Sieving


p1000000 = primes(1000000)  # Store all primes in the range 2..1000000 here.

def GodelHash(L):
    """
    Given a list of numbers, compute the Godel hash
    of that list of numbers. Example:
    GodelHash([1,2,0,3]) returns 6174.
    6174 = 2**1 * 3**2 * 5**0 * 7**3.
    """
    if (L == []):
        print("Error")
        return 0
    else:
        return hh(L, p1000000, 1)

def hh(L, prl, N):
    """
    This is a hash-helper called from GodelHash.
    """
    if (L == []):
        return N
    else:
        return hh(L[1:], prl[1:], N * (prl[0] ** L[0]))

def gUnhash(N):
    """
    Successively find primeIndex values with respect
    to the list of primes in p1000000. This unhashes a given number.
    For instance, gUnhash(100) = [2, 0, 2] because 100 = 2**2 * 5**2.
    Note that GodelHash(gUnhash(i)) = i.
    """
    assert (N >= 2), "gUnhash given an N that is < 2"
    i = 0
    L = []
    (ind, residue) = primeIndex(N, p1000000[i])
    L = L + [ind]
    while (residue > 1):
        i = i + 1
        (ind, residue) = primeIndex(residue, p1000000[i])
        L = L + [ind]
    return L

def primeIndex(N, p):
    """
    Given a natural number N and a prime p, find the largest
    exponent i such that p**i divides N. Return the pair (i, N // p**i).
    primeIndex(50, 3) returns (0, 50), as 3**0 divides 50, but not 3**1.
    primeIndex(50, 5) returns (2, 2), as 5**2 divides 50, but not 5**3.
    primeIndex(50, 2) returns (1, 25), as 2**1 divides 50, but not 2**2.
    """
    i = 0
    while (N % p == 0):
        i = i + 1
        N = N // p
    return (i, N)
#--end

Figure 12.7: Illustration of Gdel hashing and unhashing using Primes


are akin to hair on Fifi's body; there is no natural number between adjacent
natural numbers.

12.5.1 Matching up the sizes of infinite sets

Questions such as these arise easily:
• Are there the same number of natural numbers in N as there are
even numbers in Even?
• Are there the same number of natural numbers as there are real
numbers?
Strictly, we cannot be counting the sizes of two infinite sets to see
if the sizes agree. Instead, we adopt the idea of matching the sizes. This is
achieved by using the idea of correspondence. In this setting, a correspondence
is like a barter agreement: "if we can't count, at least match up!". Note that
correspondences are also often called bijections, and we may occasionally
slip into this term.
More specifically,
• Two infinite sets have the same cardinality if there is a correspondence
between them. Thus, N and Even have a correspondence (namely, the
2x rule). Thus, they have the same cardinality, even though Even ⊂ N,
i.e., the Evens are properly contained inside the Natural Numbers.
• If we can show that two infinite sets do not have a correspondence
between them, we say that they have different cardinalities. Then,
knowing which set is a proper subset of the other, we can tell which set
has higher cardinality. Thus, N and R do not have the same cardinality; in fact, R has higher cardinality.
• In fact, one can show that P(N) and R stand in correspondence. Hint:
Each subset of N can be modeled using an infinite bit vector. Such
infinite bit vectors with a pretend decimal point at the left end can
be the numeral representation (in binary) of all R in the range [0, 1).
• The cardinality of the natural number set N is ℵ0, and that of the Reals,
R, is ℵ1. Each time one takes the P() of an infinite set, you land in a set
with higher cardinality.
• The cardinality (or a set with cardinality) ℵ0 is called countable infinity or countably many, while ℵ1 is called uncountable infinity or
uncountably many. These terms help make one feel silly when one
starts numbering the Reals: "here is my first Real, here is my second, ...".
Such a numbering does not exist.


• There are higher cardinal numbers ℵ2, and so on. For instance, ℵ2
corresponds to the powerset of R. I've seen in Gamow's book 1-2-3
Infinity that this can model the set of all curves one can draw in R × R.

12.5.2 Cantor-Schröder-Bernstein Theorem

Since finding a correspondence directly is quite hard, we can rely on the
Cantor-Schröder-Bernstein theorem (or simply the Schröder-Bernstein theorem,
as it is commonly known), which states, for given infinite sets A and B:
If there is a 1-1 map from A into B (not necessarily onto),
and if there is a 1-1 map from B into A (not necessarily onto),
then there is a correspondence between A and B,
i.e., these sets have the same cardinality!
Application: cardinality of all C programs
As our first application of the Schröder-Bernstein theorem, let us arrive at
the cardinality of the set of all C programs, CP. We show that this is ℵ0 by
finding 1-1 (into) maps from Nat to CP and vice versa. The real beauty
of this theorem is that we can choose such maps completely arbitrarily. For instance, consider the class of C programs beginning with main(){}. This
is, believe it or not, a legal C program! The next longer such weird but legal C program is main(){;}. The next ones are main(){;;}, main(){;;;},
main(){;;;;}, and so on! Now,
A function f : Nat → CP that is 1-1, total, and into is the following:
Map 0 to the legal C program main(){}
Map 1 to another legal C program main(){;}
Map 2 to another legal C program main(){;;}
..., map i to the C program whose body contains i occurrences of ;.
A function g : CP → Nat that is 1-1, total, and into is the following:
view each C program as a string of bits, and obtain the value of this
bit-stream viewed as an unsigned binary number.
By virtue of the existence of the above functions f and g, from the Schröder-Bernstein theorem, it follows that |CP| = |Nat|.
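The two 1-1 maps used in this argument can be sketched directly. The helper names f and g below follow the text; the byte-level encoding in g is one arbitrary way of "viewing the program as a bit-string" (a sketch, not from the text):

```python
# Sketch of the two 1-1 maps used with the Schroeder-Bernstein theorem
# for CP, the set of all C programs.
def f(i):
    """Nat -> CP: the i-th 'weird but legal' C program."""
    return "main(){" + ";" * i + "}"

def g(program):
    """CP -> Nat: view the program text as an unsigned binary number."""
    return int.from_bytes(program.encode("ascii"), "big")

# Both maps are 1-1 on this sample: distinct i give distinct programs,
# and distinct program texts give distinct numbers.
programs = [f(i) for i in range(50)]
assert len(set(programs)) == 50
assert len(set(g(p) for p in programs)) == 50
```

Neither map needs to be onto; Schröder-Bernstein then guarantees a correspondence exists, without our having to exhibit one.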


CHAPTER 12. FUNCTIONS, RELATIONS, INFINITE SETS

Illustration: Comparing N × Z and Z

Problem: show that A = N × Z and B = Z have the same cardinality.
Here is the 1-1 map from A into B: λ⟨x, y⟩. sign(y) · (2^x · 3^|y|), taking sign(0) = +1. That is,
take every pair ⟨x, y⟩ ∈ N × Z,
preserve the sign of y,
then do the Gödel hash 2^x · 3^|y|.
The reverse map is much easier: just pair the Int with some arbitrary
Nat, that is:
λx. ⟨0, x⟩.
Then, as per the Cantor-Schröder-Bernstein theorem (or the C-S-B theorem), N × Z and Z have the same cardinality.
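Both maps can be sketched and spot-checked (in Python, not from the text); the injectivity check relies on taking sign(0) = +1, so that pairs with y = 0 are not all collapsed to 0:

```python
def sign(y):
    return -1 if y < 0 else 1   # sign(0) = +1, so the map stays 1-1

def h(x, y):
    """1-1 map from N x Z into Z: a Goedel-style hash."""
    return sign(y) * (2 ** x * 3 ** abs(y))

def rev(z):
    """1-1 map from Z into N x Z: pair the integer with an arbitrary Nat."""
    return (0, z)

# Spot-check injectivity of h on a finite grid of (x, y) pairs: by unique
# factorization, distinct (x, |y|) give distinct 2^x * 3^|y|, and the sign
# separates positive from negative y.
grid = [(x, y) for x in range(8) for y in range(-8, 9)]
images = [h(x, y) for (x, y) in grid]
assert len(set(images)) == len(grid)
```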

12.6 Cantor's Diagonalization Proof

Let us return to our original question: is there a bijection from Nat to
Real? (Note that bijection is a synonym for correspondence; these
words mean exactly the same thing!) The answer is no, and we proceed to show
how. We follow the powerful approach, developed by Cantor, called diagonalization. Diagonalization is a particular application of the principle of
proof by contradiction (or reductio ad absurdum) in which the solution space
is portrayed as a square matrix, and the contradiction is observed along the
diagonal of this matrix. In other words, this is another illustration of the
proof by contradiction approach.
We now walk you through the proof, providing section headings to the
specific steps to be performed along the way.
Most textbooks prove this result using numbers represented in decimal,
which is much easier than what we are going to present in this section, namely, a proof in binary. We leave the proof in decimal as an exercise for
you. In addition to being a fresh as well as illuminating proof, a proof for
the binary case also allows us to easily relate the cardinality of Reals to that of
languages over some alphabet. Here, then, are the steps in this proof.
Simplify the set in question
We first simplify our problem as follows. Note that (λx. 1/(1 + x)) is a bijection
from [0, ∞) ⊆ Real onto (0, 1] ⊆ Real. Given this, it suffices to show that there
is no bijection from Nat to [0, 1] ⊆ Real, since bijections are closed under
composition. We do this because the interval [0, 1] is easier to work with.
We can use binary fractions to capture each number in this range, and this
will make our proof convenient to present.
Avoid dual representations for numbers
The next difficulty we face is that certain numbers have two fractional representations. As a simple example, if the manufacturer of Ivory soap claims
that their soap is 99.99% pure, it is not the same as saying it is 99.999%
pure.³ However, if they claim it is 99.99... % pure, with the 9s repeating forever,
then it is equivalent to saying it is 100% pure. Therefore, in the decimal system, any number with infinitely repeating 9s can also be represented without infinitely repeating 9s. As another example, 5.123999... (9 repeating) = 5.124.
The same dual representations exist in the binary system also. For example, in the binary system, the fraction 0.010000... (0.010 followed by
an infinite number of 0s) represents 0.25 in decimal. However, the fraction
0.010111... (0.010 followed by an infinite number of 1s) represents 0.0110 in binary, or 0.375 in decimal. Since we would like to avoid dual representations,
we will avoid dealing with the number 1.0 (which has the dual representation
0.111...). Hence, we will perform our proof by showing that there is no bijection
from Nat to [0, 1) ⊆ Real. This is an even stronger result.
Let us represent each real number in the set [0, 1) ⊆ Real in binary. For
example, 0.5 would be 0.100 . . ., and 0.375 would be 0.01100 . . .. We shall continue
to adhere to our convention that we shall never use any bit representation
involving an infinite trailing sequence of 1s. Fortunately, every number in [0, 1) can be represented without
ever using such a tail of 1s. (This, again, is the reason for leaving out 1.0, as we don't wish
to represent it as 0.111..., or 1.0.)
Claiming a bijection, and refuting it
For simplicity of exposition, we first present a proof that is nearly right,
and much simpler than the actual proof. In the next section, we repair this
proof, giving us the actual proof. Suppose there is a bijection f that puts
Nat and [0, 1) in correspondence C1 as follows:
0 → .b00 b01 b02 b03 . . .
³Such Ivory soap may still float.



1 → .b10 b11 b12 b13 . . .
...
n → .bn0 bn1 bn2 bn3 . . .
...
where each bij is 0 or 1 (i indexes the row, j the bit position).
Now, consider the real number
D = 0. b00' b11' b22' b33' . . .
where b' denotes the complement of bit b. This number is not in the above listing, because it differs from the i-th number at bit position i, for every i. Since this number D is not represented,
f cannot be a bijection as claimed. Hence such an f does not exist.
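The diagonal construction can be caricatured on a finite listing (a sketch, not from the text; the real argument works on infinite bit-sequences, which no program can enumerate):

```python
# Finite caricature of Cantor's diagonal argument: given any listing of
# bit-sequences (truncated here to n bits), the complemented diagonal
# differs from every row, so it appears in no row.
def complemented_diagonal(listing):
    return [1 - listing[i][i] for i in range(len(listing))]

listing = [
    [0, 1, 0, 1],   # row 0: .0101...
    [1, 1, 1, 1],   # row 1: .1111...
    [0, 0, 0, 0],   # row 2: .0000...
    [1, 0, 1, 0],   # row 3: .1010...
]
D = complemented_diagonal(listing)
# D differs from row i at bit position i, so D is in no row.
assert all(D[i] != listing[i][i] for i in range(len(listing)))
assert D not in listing
```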
Fixing the proof a little bit
Actually, the above proof needs a small fix: what if the complement of the
diagonal happens to end in an infinite tail of 1s? The danger then is that we cannot claim
that a number equal to the complemented diagonal does not appear in our
listing: it might exist in our listing of Reals in its alternative form, without the trailing 1s.
We overcome this problem through a simple correction. This correction
ensures that the complemented diagonal will never end in an infinite tail of 1s. In fact,
we arrange things so that the complemented diagonal will contain zeros infinitely often. This is achieved by placing a 1 on the uncomplemented diagonal every so often; we choose to do so for all even positions, by listing the
Real number .1...1 0 0 0 . . . (2n + 1 ones followed by an infinite tail of zeros) at position 2n, for all n.
Consider the following correspondence, for example:
0 → .10 000 . . .
1 → .c00 c01 c02 c03 . . .
2 → .1110 000 . . .
3 → .c10 c11 c12 c13 . . .
4 → .111110 000 . . .
5 → .c20 c21 c22 c23 . . .
6 → .11111110 000 . . .
...
2n → .1...1 0 0 0 . . . (2n + 1 ones followed by zeros)
2n + 1 → .cn0 cn1 cn2 cn3 . . .


...
Call this correspondence C2. We obtain C2 as follows. We know that the
numbers .10 00..., .1110 00..., .111110 00..., etc., exist in the original correspondence C1.
C2 is obtained from C1 by first permuting it so that these elements are
moved to the even positions within C2 (they may exist arbitrarily scattered
or grouped within C1). We then go through C1, strike out the above-listed
elements, and list its remaining elements in the odd positions within C2. We
represent C2 using rows of .cij, as above.
We can now finish our argument as follows. The complemented diagonal does not end in an infinite tail of 1s, because 0 occurs in it infinitely often.
Now, this complemented diagonal cannot exist anywhere in our .cij listing.
The complemented diagonal is therefore a Real number missed by the original correspondence C1 (and hence also missed by C2). Hence, we arrive
at a contradiction with the claim that we have a correspondence; therefore, we cannot
assign the same cardinal number (ℵ0) to the set [0, 1) ⊆ Real. It is of
higher cardinality.
The conclusion we draw from the above proof is that Real and Nat have
different cardinalities. Further details of this topic are usually covered in
classes on formal languages and computability.


Chapter 13
Classifying Relations
This chapter covers various types of relations, introducing their theoretical
and practical connotations. The classification of relations will be in terms
of notions called reflexive, symmetric, antisymmetric, transitive, etc. These
are best presented using succinct phrases due to Andrew Hodges, presented
in §13.1.1. We also talk about equivalence relations, equivalence classes,
and partitions.

13.1 Why Classify Relations?

We classify relations to understand and catalog familiar properties, and to
avoid inadvertent conclusions. It is like type-checking: the more one keeps
track of higher-level properties (such as types), the fewer mistakes one makes.
Relations are crucial building blocks of database reasoning engines and network routing tables. Mistakes in defining and manipulating
relations can sow serious bugs, hence our motivation to classify relations.
Here are some examples of how relations are classified (typed):
Consider < ⊆ N × N. We know that if a < b and b < c, then a < c. Thus,
< is a transitive relation. In other words, knowing that a relation is
transitive allows us to bridge through: if (a, b) ∈ Reln and (b, c) ∈
Reln, it is safe to jump to the conclusion that (a, c) ∈ Reln.
We know that x < x is false for any x ∈ N. But we know that x ≤ x is
true for any x ∈ N. We flag this by saying that < is irreflexive (does not
hold for any x). On the other hand, ≤ is reflexive.
Now consider ≠ ⊆ N × N. We know that if a ≠ b and b ≠ c, then a ≠ c
does not follow. In fact, we have 3 ≠ 4 and 4 ≠ 3, and we know that 3 ≠ 3
does not hold. Thus, ≠ is not a transitive relation. In fact, it is a non-transitive relation.
In social-media websites, link relations are maintained. Suppose
(a, b) ∈ Linked; that is, a and b are linked. Likewise, suppose (b, c) ∈
Linked. Can we infer that (a, c) ∈ Linked? At least, the site can infer
that (a, c) may benefit from being linked, and send nag-messages to c
(and/or a) to try and befriend the other.

13.1.1 Andrew Hodges' Definitions for Types of Relations

We shall be mainly concerned with binary relations over a set S. Such relations occur widely. Most relations we encounter, such as <, ≤, ⊂, ⊆, and ≠,
are binary relations (over suitable sets).
Binary relations help impart structure to sets of related elements. They
help form various meaningful orders as well as equivalences, and hence are
central to mathematical reasoning. Our definitions in this chapter follow
several books and webpages, notably
Naive Set Theory, Halmos.
Programming Semantics, Loeckx and Sieber.
The Oxford Philosophy webpage, http://logic.philosophy.ox.ac.uk/.
A binary relation R on S is a subset of S × S. It is a relation that can be
expressed by a 2-place predicate. Examples: (i) x loves y; (ii) x > y.
Set S is the domain of the relation. It is possible that the domain S is
empty (in which case R will be empty). In all instances that we consider, the
domain S will be non-empty. However, it is also possible that S is non-empty
while R is empty (in which case none of the pairs of elements happen to be
related: the situation of an empty relation¹).
We now proceed to examine various types of binary relations. In all these
definitions, we assume that the binary relation R in question is on S, i.e., a
subset of S × S. For a relation R, two standard prefixes are employed: irr- and non-. Their usages will be clarified in the sequel.
Relations can be depicted as graphs. Here are conventions attributed to
Andrew Hodges (described in the Oxford Philosophy page). The domain is
¹A situation where nobody loves anybody else (including themselves!) is an example of
S ≠ ∅ and R = ∅.


represented by a closed curve (e.g., circle, square, etc.) and the individuals
in the domain by dots labeled, perhaps, a, b, c, and so on. The fact that
⟨a, b⟩ ∈ R will be depicted by drawing a single arrow (or equivalently, a one-way arrow) from dot a to dot b. We represent the fact that both ⟨a, b⟩ ∈ R
and ⟨b, a⟩ ∈ R by drawing a double arrow between a and b. We represent the
fact that ⟨a, a⟩ ∈ R by drawing a double arrow from a back to itself (this is
called a loop). We shall present examples of these drawings in the sequel.

Types of binary relations

Figure 13.1: Some example binary relations, R1 through R6, drawn as graphs.


We shall use the following examples. Let S = {1, 2, 3}, R1 = {⟨x, x⟩ | x ∈ S},
R2 = S × S, and
R3 = {⟨1, 1⟩, ⟨2, 2⟩, ⟨3, 3⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨2, 3⟩, ⟨3, 2⟩}.
All these (and three more) relations are depicted in Figure 13.1.


Reflexive, and Related Notions

R is reflexive if for all x ∈ S, ⟨x, x⟩ ∈ R. Equivalently,
In R's graph, there is no dot without a loop.
Informally, every element is related to itself.

A relation R is irreflexive if there are no reflexive elements; i.e., for no
x ∈ S is it the case that ⟨x, x⟩ ∈ R. Equivalently,
In R's graph, no dot has a loop.

Note that irreflexive is not the negation (complement) of reflexive. This is
because the logical negation of the definition of reflexive would be "there
exists x ∈ S such that ⟨x, x⟩ ∉ R." This is not the same as irreflexive, because
all such pairs must be absent in an irreflexive relation.

A relation R is non-reflexive if it is neither reflexive nor irreflexive.
Equivalently,
In R's graph, at least one dot has a loop and at least
one dot does not.

Examples:
R1, R2, R3 are all reflexive.
If S = ∅ (the empty domain), then R = ∅ is reflexive and irreflexive.
It is not non-reflexive.
For x, y ∈ Nat, x = y^2 is non-reflexive (true for x = y = 1, false for x =
y = 2).
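These three reflexivity notions are easy to machine-check on finite relations. The checker functions below are a sketch (not from the text), applied to R1, R2, R3 and to the x = y^2 example:

```python
# Checkers for the three reflexivity notions, over the example
# relations R1, R2, R3 on S = {1, 2, 3}.
S = {1, 2, 3}
R1 = {(x, x) for x in S}
R2 = {(x, y) for x in S for y in S}
R3 = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1), (2, 3), (3, 2)}

def reflexive(R, S):
    return all((x, x) in R for x in S)

def irreflexive(R, S):
    return all((x, x) not in R for x in S)

def non_reflexive(R, S):
    return not reflexive(R, S) and not irreflexive(R, S)

assert reflexive(R1, S) and reflexive(R2, S) and reflexive(R3, S)
# On the empty domain, the empty relation is reflexive AND irreflexive:
assert reflexive(set(), set()) and irreflexive(set(), set())
# x = y*y over a slice of Nat is non-reflexive: holds for 1, fails for 2.
SQ = {(x, y) for x in range(5) for y in range(5) if x == y * y}
assert non_reflexive(SQ, set(range(5)))
```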


Symmetric, and Related Notions

R is symmetric if for all x, y ∈ S, ⟨x, y⟩ ∈ R → ⟨y, x⟩ ∈ R. Here, x and y
need not be distinct. Equivalently,
In R's graph, there are no single arrows. If the relation
holds one way, it also holds the other way.

Examples: R1, R2, and R3 are symmetric relations. Also note that ∅ is a
symmetric relation.

R is asymmetric if for x, y ∈ S, not necessarily distinct, if ⟨x, y⟩ ∈ R,
then ⟨y, x⟩ ∉ R. Example: elder brother is an asymmetric relation, and
so is < over Nat. Asymmetric relations need not be total; that is, it is not
required that for two arbitrary x, y we have elderbrother(x, y)
or elderbrother(y, x). But if the relation holds one way, it does not hold the other
way. Equivalently,
There are no double arrows in its graph; if the relation
holds one way, it does not hold the other.

Curiously, this rules out ≤. We have 0 ≤ 0, and asymmetry would then require
¬(0 ≤ 0), because of the "not necessarily distinct" aspect.
Again, note that asymmetric is not the same as the negation of (the definition
of) symmetric. The negation of the definition of symmetric would be that
there exist distinct x and y such that ⟨x, y⟩ ∈ R, but ⟨y, x⟩ ∉ R.

R is non-symmetric if it is neither symmetric nor asymmetric
(there is at least one single arrow and at least one double arrow).

Example: ∅ is symmetric and asymmetric, but not non-symmetric.


R is antisymmetric if for all x, y ∈ S, ⟨x, y⟩ ∈ R ∧ ⟨y, x⟩ ∈ R → x = y (they
are the same element). Equivalently,
There is no double arrow unless it is a loop.

Antisymmetry is a powerful notion that, unfortunately, is too strong for
many purposes. Consider the elements of 2^S, the powerset of S, as an example. If, for any two elements x and y in 2^S, we have x ⊆ y and y ⊆ x, then we
can conclude that x = y. Therefore, the set containment relation ⊆ is antisymmetric; and hence, antisymmetry is appropriate for comparing two sets
in the "less than or equals" sense.
Consider, on the other hand, two basketball players, A and B. Suppose
the coach of their team defines the relation BB as follows: A BB B if and
only if B has more abilities than, or the same abilities as, A. Now, if we have
two players x and y such that x BB y and y BB x, we can conclude that they
have identical abilities; they don't end up becoming the very same person,
however! Hence, BB must not be deemed antisymmetric. Therefore, depending on
what we are comparing, antisymmetry may or may not be appropriate.

Transitive, and Related Notions

To define transitivity in terms of graphs, we need the notions of a broken
journey and a short cut. There is a broken journey from dot x to dot z via
dot y if there is an arrow from x to y and an arrow from y to z. Note that
dot x might be the same as dot y, and dot y might be the same as dot z.
Therefore, if ⟨a, a⟩ ∈ R and ⟨a, b⟩ ∈ R, there is a broken journey from a to b
via a. Example: there is a broken journey from Utah to Nevada via Arizona.
There is also a broken journey from Utah to Nevada via Utah.
There is a short cut precisely when there is an arrow directly from x to z. So if
⟨a, b⟩ ∈ R and ⟨b, c⟩ ∈ R and also ⟨a, c⟩ ∈ R, we have a broken journey from a
to c via b, together with a short cut. Also, if ⟨a, a⟩ ∈ R and ⟨a, b⟩ ∈ R, there is
a broken journey from a to b via a, together with a short cut.
Example: There is a broken journey from Utah to Nevada via Arizona, and
a short cut from Utah to Nevada.


R is transitive if for all x, y, z ∈ S, ⟨x, y⟩ ∈ R ∧ ⟨y, z⟩ ∈ R → ⟨x, z⟩ ∈ R.
Equivalently,
There is no broken journey without a short cut.

R is intransitive if for all x, y, z ∈ S, ⟨x, y⟩ ∈ R ∧ ⟨y, z⟩ ∈ R → ⟨x, z⟩ ∉ R.
Equivalently,
There is no broken journey with a short cut.

R is non-transitive if and only if it is neither transitive nor intransitive.
Equivalently,
There is at least one broken journey with a short cut
and at least one without.

Examples:
Relations R1 and R2 above are transitive.
R3 is non-transitive, since it lacks the pair ⟨1, 3⟩.
Another non-transitive relation is ≠ over Nat, because from a ≠ b and
b ≠ c, we cannot always conclude that a ≠ c.
R4 is irreflexive, transitive, and asymmetric.
R5 is still irreflexive. It is not transitive, as there is no loop at 1. It is
not intransitive, because there is a broken journey (2 to 3 via 1) with
a short cut (2 to 3). It is non-transitive because there is one broken
journey with a short cut and one without.
R5 is not symmetric because there are single arrows.
R5 is not asymmetric because there are double arrows.


From the above, it follows that R5 is non-symmetric.
R5 is not antisymmetric because there is a double arrow that is not a
loop.

13.1.2 Preorder (reflexive plus transitive)

If R is reflexive and transitive, then it is known as a preorder.
Continuing with the example of basketball players, let the BB relation for
three members A, B, and C of the team be
{⟨A, A⟩, ⟨A, B⟩, ⟨B, A⟩, ⟨B, B⟩, ⟨A, C⟩, ⟨B, C⟩, ⟨C, C⟩}.
This relation is a preorder because it is reflexive and transitive. It helps
compare the three players A, B, and C, treating A and B as equivalent in
abilities, and C as superior in abilities to both.

13.1.3 Partial order (preorder plus antisymmetric)

If R is reflexive, antisymmetric, and transitive, then it is known as a
partial order.
As shown in Section 13.1.1 under the heading of antisymmetry, the subset-or-equals relation ⊆ is a partial order.
Example: Members of a Powerset. Figure 8.8 depicts the powerset of
the set {1, 2, 3} as a lattice. As shown in this figure, this relation is the
partial order
{
(∅, ∅),
(∅, {1}), (∅, {2}), (∅, {3}),
({1}, {1, 2}), ({1}, {1, 3}),
({2}, {1, 2}), ({2}, {2, 3}),
({3}, {1, 3}), ({3}, {2, 3}),
({1, 2}, {1, 2, 3}), ({2, 3}, {1, 2, 3}), ({1, 3}, {1, 2, 3})
}
However, this relation has even more pairs in it, for example (∅, {1, 2, 3}).
These are generally left out, as the transitivity of a partial order implies
these pairs (you should bridge through any such un-mentioned pairs also).

Figure 13.2: Let us define the Interval Containment partial order as shown
here. An interval is a pair (a, b) ∈ N × N, representing a closed
interval of, say, natural numbers. In this setting, an interval I1 = [a1, b1] is
contained in another interval I2 = [a2, b2] exactly when a1 ≥ a2 and b1 ≤ b2.
One can check this containment visually by seeing that the intervals overlap,
and the end-points of the contained interval are neatly tucked away within
the bounds of the containing interval.
Example: Interval Containment Partial Order. Figure 13.2 depicts a
partial order obtained by using a relation over intervals. We consider intervals to be pairs of natural numbers such as [a0, b0] shown in this figure. An
interval is contained in another as defined and illustrated in the figure. We
obtain the interval-containment partial order as shown in this figure.


The fact that this is a partial order is easy to see. Suppose we call our relation R ⊆ I × I, where I denotes the set of intervals; R denotes interval containment.
More formally, I = N × N, where the first number is assumed to be less than
or equal to the second number. We must now argue that R is a partial order
over I.
For example, I = (2, 4) is an ordered pair of 2 and 4 (sometimes written
in math books as ⟨2, 4⟩). It represents the closed interval [2, 4]. We will
not consider intervals of the form (4, 3) (one can think of these as being the
empty interval; but we won't go there).
OK, now, what does R look like?
R must contain pairs as shown below:
{((2, 4), (1, 5)), ((2, 4), (2, 4)), ((2, 4), (2, 30)), . . .}
That is, interval (2, 4) is contained in interval (1, 5), etc.
R must not contain
{((1, 5), (2, 4)), ((2, 40), (2, 30)), . . .}
This models the fact that interval (1, 5) is not contained in interval
(2, 4), etc.
In general, ((a, b), (c, d)) ∈ R if and only if
(a ≥ c) ∧ (b ≤ d)
Proof:
R is reflexive because for all intervals I, (I, I) is in R.
R is antisymmetric:
If ((a, b), (c, d)) and ((c, d), (a, b)) are both in R, then (a ≥ c) ∧ (b ≤ d) and (c ≥ a) ∧ (d ≤ b). Thus, a = c and b = d, i.e., they are the
same interval.
Thus, antisymmetry is satisfied.
R is transitive:
If ((a, b), (c, d)) and ((c, d), (e, f)) are both in R, then (a ≥ c) ∧ (b ≤ d)
and (c ≥ e) ∧ (d ≤ f). Thus, a ≥ e and b ≤ f.
This means that ((a, b), (e, f)) must be in R.
Thus, transitivity is satisfied.
Hence, R is a partial order.
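The three proof obligations above can be spot-checked mechanically on a finite set of intervals (a sketch, not from the text):

```python
# Interval containment as a predicate, checked to be a partial order
# over all intervals (a, b) with 0 <= a <= b < 5.
def contained(i1, i2):
    """(a, b) is contained in (c, d) iff a >= c and b <= d."""
    (a, b), (c, d) = i1, i2
    return a >= c and b <= d

intervals = [(a, b) for a in range(5) for b in range(a, 5)]

# Reflexive: every interval contains itself.
assert all(contained(i, i) for i in intervals)
# Antisymmetric: mutual containment forces equality.
assert all(not (contained(i, j) and contained(j, i)) or i == j
           for i in intervals for j in intervals)
# Transitive: containment chains collapse.
assert all(not (contained(i, j) and contained(j, k)) or contained(i, k)
           for i in intervals for j in intervals for k in intervals)
```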

13.1.4 Total order, and related notions

A total order is a special case of a partial order. R is a total order if,
in addition, for all x, y ∈ S, either ⟨x, y⟩ ∈ R or ⟨y, x⟩ ∈ R. Here, x and y need not be
distinct (this is consistent with the fact that total orders are reflexive).
The relation ≤ on Nat is a total order. Note that < is not a total order,
because it is not reflexive.² However, < is transitive. Curiously, < is antisymmetric (vacuously so).
A relation R is said to be total if for all x ∈ S, there exists y ∈ S such that
⟨x, y⟩ ∈ R. In other words, a total relation is one in which every element x
is related to at least one other element y. If we consider y to be the image
(mapping) of x under R, this definition is akin to the definition of a total
function.
Note again that R being a total order is not the same as R being a partial
order and a total relation. For example, consider the following relation R
over the set S = {a, b, c, d}:
R = {⟨a, a⟩, ⟨b, b⟩, ⟨c, c⟩, ⟨d, d⟩, ⟨a, b⟩, ⟨c, d⟩}
R is a partial order. R is also a total relation. However, R is not a total order,
because there is no relationship between b and c (neither ⟨b, c⟩ nor ⟨c, b⟩ is
in R).

13.1.5 Relational Inverse

The inverse of a relation R can be defined as follows:
R⁻¹(y, x) if and only if R(x, y).
Thus, if
R = {⟨x, y⟩ | p(x, y)}
for some characteristic predicate p, then R⁻¹ is as follows:
R⁻¹ = {⟨y, x⟩ | p(x, y)}.
²Some authors are known to abuse these definitions, and consider < to be a total order.
It is better referred to as a strict total order or an irreflexive total order.


Example: The inverse of the < relation over the natural numbers Nat is
the relation > over Nat. It is not the same as ≥. (Note that if we
negate the characteristic predicate defining <, we obtain ≥. This is,
however, not how you obtain relational inverses. Relational
inverses are obtained by flipping the tuples around.)
Example: The inverse of the < relation over the integers Int (positive and
negative whole numbers) is the relation > over Int.
Observation: If we take every edge in the graph of relation R and
reverse the edges, we obtain the edges in the graph of relation R⁻¹.
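Flipping the tuples around is a one-liner. The sketch below (not from the text) also confirms, over a finite slice of Nat, that the inverse of < is > and not ≥:

```python
# Relational inverse: flip every tuple of the relation.
def inverse(R):
    return {(y, x) for (x, y) in R}

N = range(5)
LT  = {(x, y) for x in N for y in N if x < y}
GT  = {(x, y) for x in N for y in N if x > y}
GEQ = {(x, y) for x in N for y in N if x >= y}

assert inverse(LT) == GT
assert inverse(LT) != GEQ          # negating the predicate gives >=, not the inverse
assert inverse(inverse(LT)) == LT  # flipping twice restores the relation
```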

13.1.6 Equivalence (Preorder plus Symmetry)

An equivalence relation is reflexive, symmetric, and transitive.
Consider the BB relation for three basketball players A, B, and C. Now,
consider a specialization of this relation, BB≡, obtained by leaving out certain
edges:
BB≡ = {⟨A, A⟩, ⟨A, B⟩, ⟨B, A⟩, ⟨B, B⟩, ⟨C, C⟩}.
This relation is an equivalence relation, as can be easily verified.
Note that BB≡ = BB ∩ BB⁻¹. In other words, this equivalence relation
is obtained by taking the preorder BB and intersecting it with its inverse.
The fact that BB ∩ BB⁻¹ is an equivalence relation is not an accident. The
following section demonstrates a general result in this regard.

13.1.7 Equivalence class

An equivalence relation R over S partitions the elements of S into equivalence classes. Intuitively, the equivalence classes E_i are those subsets of S
such that every pair of elements in E_i is related by R, and the E_i are the maximal such subsets. In other words, for distinct E_i and E_j, an element x ∈ E_i
and an element y ∈ E_j are not related.

Figure 13.3: Equivalence Classes Explained

Figure 13.3 presents an equivalence relation formed over the set {0, 1, 2, 3, 4, 5}
by treating two numbers as equivalent if their div 2 answers are the
same; thus, 2 ≡ 3 under this equivalence relation. The figure shows the
initial relation missing the self-equivalences (the black edges only list interesting equivalences, such as between 0 and 1, 2 and 3, and 4 and 5). One
can then come around and add the blue edges also (all the self-equivalences
are added). The relation now becomes reflexive, symmetric, and transitive.
We can also learn the notion of transitive closure from this example. Suppose we initially add the equivalences between 0 and 1, 2 and 3, and 4 and
5. Then suppose we take a transitive closure. Since we have (0, 1) and (1, 0)
in the relation, we will end up adding the transitive edge (0, 0). Similarly, since we have (1, 0) and (0, 1), we will end up adding the transitive edge
(1, 1). Thus, to build up to the equivalence relation, one can also start from
the black edges, take a transitive closure, and thus add in the reflexive
edges.
The equivalence classes on the right-hand side partition S = {0, 1, 2, 3, 4, 5}
into
{{0, 1}, {2, 3}, {4, 5}}
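The div-2 equivalence and its classes can be computed directly (a sketch, not from the text):

```python
# Partition S = {0,...,5} by the "same div 2 answer" equivalence.
S = [0, 1, 2, 3, 4, 5]

def equivalent(x, y):
    return x // 2 == y // 2

# Build the equivalence classes: group elements by their div-2 key.
classes = {}
for x in S:
    classes.setdefault(x // 2, []).append(x)
partition = [set(c) for c in classes.values()]

assert partition == [{0, 1}, {2, 3}, {4, 5}]
# The induced relation is reflexive and symmetric (transitivity follows
# from the classes being built from a common key).
assert all(equivalent(x, x) for x in S)
assert all(equivalent(y, x) for x in S for y in S if equivalent(x, y))
```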


Figure 13.4: The infinite set of all possible Boolean formulae over two
Boolean variables is shown partitioned according to Boolean equivalence. As we studied in Chapters 1 and 2, there are 16 Boolean functions
possible over 2 Boolean variables. Thus, there will be 16 equivalence classes
in this diagram. Some of the equivalence classes and their members are
shown in this figure.
Recall that a partition of a set S is a set of pairwise disjoint subsets that
are exhaustive (whose union equals the full set). From such a
partition, we can easily read off the equivalence relation: (1) any member of
a partition block is related to itself (reflexive); (2) any two members of a partition block
are related to each other both ways (symmetric); and (3) the partition blocks are
transitively closed as well.
Figure 13.4 further illustrates equivalence classes. Recall that we have
already learned (from Chapters 1 and 2) that there are 2^(2^N) distinct Boolean
functions over N variables. This number is 16 for N = 2. Thus, if we keep listing all possible syntactically expressible Boolean formulae,³ then these formulae will neatly arrange themselves into 16 bins (or equivalence classes).
Why? Because it should not be possible to express a 17th semantically distinct formula; there are only 16 Boolean functions, after all! (Section 14.4
presents this as the pigeon-hole theorem.) This is another use of the notion
of equivalence classes.
³Simply create a formula diarrhea of all possible formulae, somehow listed.

13.1.8 Reflexive and transitive closure

The reflexive closure of R, denoted by R⁰, is
R⁰ = R ∪ {⟨x, x⟩ | x ∈ S}.
This results in a relation that is reflexive.
The transitive closure of R, denoted by R⁺, is
R⁺ = R ∪ {⟨x, z⟩ | ∃y ∈ S : ⟨x, y⟩ ∈ R ∧ ⟨y, z⟩ ∈ R⁺}.
R⁺ is the least such set. The use of + highlights the fact that the transitive
closure relates items that are one or more steps away.
The reflexive and transitive closure of a relation R, denoted by R*, is
R* = R⁰ ∪ R⁺.
The use of * highlights the fact that the reflexive and transitive closure relates
items that are zero or more steps away.
Example: Consider a directed graph G with nodes a, b, c, d, e, and f. Suppose it is necessary to define the reachability relation among the nodes of
G. Oftentimes, it is much easier to instead define the one-step reachability
relation
Reach = {⟨a, b⟩, ⟨b, c⟩, ⟨c, d⟩, ⟨e, f⟩}
and let the users perform the reflexive and transitive closure of Reach. Doing so results in Reach_RTclosed, which has all the missing reflexive and transitive pairs of nodes in it:
Reach_RTclosed = {⟨a, b⟩, ⟨b, c⟩, ⟨c, d⟩, ⟨e, f⟩, ⟨a, a⟩, ⟨b, b⟩, ⟨c, c⟩, ⟨d, d⟩,
⟨e, e⟩, ⟨f, f⟩, ⟨a, c⟩, ⟨a, d⟩, ⟨b, d⟩}.
Such reflexive-transitive closures can help us store maps succinctly. Thus,
if a = Utah, b = Nevada, and c = California, and the relation is reachability, then before the reflexive-transitive closure is taken, we are saying "Utah
can reach Nevada" and "Nevada can reach California." After the reflexive-transitive closure, we would have added many more facts: Utah can reach
Utah; Nevada can reach Nevada; California can reach California; also, Utah
can reach California; etc.
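The closure computation can be sketched as a fixed-point loop (not from the text; a Warshall-style algorithm would be more efficient) and checked against the Reach example above:

```python
# Reflexive and transitive closure of a finite relation R over S:
# add all self-pairs, then repeatedly add bridging pairs until no
# new pair appears (a fixed point).
def rt_closure(R, S):
    closure = set(R) | {(x, x) for x in S}      # reflexive closure R0
    changed = True
    while changed:                               # transitive closure R+
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

S = {'a', 'b', 'c', 'd', 'e', 'f'}
Reach = {('a', 'b'), ('b', 'c'), ('c', 'd'), ('e', 'f')}
expected = Reach | {(x, x) for x in S} | {('a', 'c'), ('a', 'd'), ('b', 'd')}
assert rt_closure(Reach, S) == expected
```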


Chapter 14
Review of Functions and Relations
In this chapter, we will provide a review of much of the material from previous chapters, and also provide some examples.

14.1 Gödel Hashing

Here are some exercises on Gödel hashing and unhashing. These exercises
teach us that the "DNA" of any natural number is in its prime factors. This
is because any natural number
either is a prime number, or
is a composite number, in which case it has prime factors.
Thus, 80 = 2^4 · 3^0 · 5^1. Thus, the DNA sequence of 80 is (4, 0, 1). This DNA
sequence is unique because of the fundamental theorem of arithmetic, which
states that every natural number is expressible uniquely as a product of
primes. For a proof, see Chapter 15.
1. Encode the tuple (4, 3, 0, 1) using Gödel hashing.
Solution: Using the prime numbers 2, 3, 5, 7, . . ., we can map (4, 3, 0, 1) in a
1-1 fashion through the expression 2^4 · 3^3 · 5^0 · 7^1 = 16 · 27 · 7 = 3,024.
2. Encode the tuple (3, 0, 2, 1) using Gödel hashing.
Solution: Using the prime numbers 2, 3, 5, 7, . . ., we can map (3, 0, 2, 1) in a
1-1 fashion through the expression 2^3 · 3^0 · 5^2 · 7^1 = 8 · 25 · 7 = 1,400.


3. Suppose you receive 88 as a result of Gödel hashing from a tuple of unknown size. Decode the result and present it as a tuple.
Solution: The idea is to divide successively by primes, noting the exponent of each prime factor, until the quotient attains the value 1. This yields (3, 0, 0, 0, 1): we have 2^3, and the remaining factor 11 is the fifth prime, so we use zero exponents for 3, 5, and 7.
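The hashing and unhashing procedures above can be sketched in Python. This is a sketch rather than code from the book; `godel_hash` and `godel_unhash` are hypothetical helper names.

```python
def first_primes(n):
    """Return the first n primes: 2, 3, 5, 7, ..."""
    ps = []
    candidate = 2
    while len(ps) < n:
        if all(candidate % p != 0 for p in ps):
            ps.append(candidate)
        candidate += 1
    return ps

def godel_hash(tup):
    """Encode an exponent tuple as a product of prime powers."""
    result = 1
    for p, e in zip(first_primes(len(tup)), tup):
        result *= p ** e
    return result

def godel_unhash(n):
    """Decode by dividing successively by primes until the quotient is 1."""
    exponents = []
    k = 1
    while n > 1:
        p = first_primes(k)[-1]          # the k-th prime
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exponents.append(e)
        k += 1
    return tuple(exponents)
```

For example, `godel_hash((4, 3, 0, 1))` reproduces 3,024 and `godel_unhash(88)` reproduces (3, 0, 0, 0, 1), matching the exercises above.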

14.2 Relations and Functions

Now we will review some of the basics of relations and functions.


1. What is the smallest relation that can be defined over D × C (or, for that matter, for any non-empty domain and codomain)?
Solution: The answer is ∅, the empty relation. This contains no pairs. This is allowed for relations.

2. What is the smallest function that can be defined over D × C (or, what is meant by the size of a function f : D → C viewed as a relation)?
Solution: Unlike with relations, we must map every domain element in D. Thus, there will be as many pairs as there are elements in D. All functions over D × C will have the same size. Examples:

Nand: Nand maps B × B → B.
Nand = {((0, 0), 1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)}
The size of the Nand function is 4 because all the combos (0, 0), (0, 1), (1, 0), (1, 1) are being mapped.

And: And maps B × B → B.
And = {((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)}
The size of the And function is also 4.

Const0: Const0 maps B × B → B. Let Const0 always yield 0.
Const0 = {((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 0)}
The size of this function is also 4, as it still has to handle the four tuples.
3. Can there ever be a function that maps ∅ to something? If so, provide an example of such a function. Can there ever be a function that maps something to ∅? If so, provide an example of such a function.
Solution: Surely so! The Size function that takes the size of a set is one example of the former. For the latter, think of a function that maps natural numbers to sets, where the empty set can be returned for, say, 0.

4. Consider the domain D_1 = {1, 2, 3} and codomain C_1 = {A, B, C}.

(a) Is R_1 = {(1, A), (2, B), (3, C)} a (properly defined) relation over D_1 × C_1?
Solution: It is, as R_1 is a subset of D_1 × C_1, and both D_1 and C_1 are non-empty. Whenever we have these, relations such as R_1 are well defined. Relations are simply sets of tuples, and these sets of tuples can come from suitable domains and codomains.

(b) Is R_1 a function?
Solution: Yes, it is, because there is no domain point that is mapped to two distinct codomain points. Also, every domain point is mapped. Hence it is a function.

(c) Answer these questions, now considering R_1 to be a function:
Please write it in signature form: i.e., f : P → Q, filling in the correct P and Q.
Solution: f : D_1 → C_1.
Is f one-to-one? onto? invertible? a correspondence?
Solution: f satisfies all these conditions, so yes for all.

(d) Is R_2 = {(0, A), (2, B), (3, C)} a (properly defined) relation over D_1 × C_1? Give reasons.
Solution: Not so, as R_2 includes 0 as the first component of one of its pairs. However, D_1 does not have 0 in it.

(e) Consider R_3 = {(1, A), (1, B), (3, C)}.
What is R_3's inverse? Is it (R_3's inverse) a function? If so, what type of function (1-1, onto, correspondence)?
Solution: It is {(A, 1), (B, 1), (C, 3)}. This is a function, but many-to-one. Hence not a correspondence.
Is R_3 a function?
Solution: It is not a function, as 1 is mapped to both A and B.
5. How many functions can you define over domain {0, 1} and codomain {0, 1}? Name all these functions (they have standard names).
Solution: There are 2^2 = 4 such functions: the identity, the inverter (Not), the constant-0 function, and the constant-1 function.

6. How many functions can you define over domain {0, 1}^N and codomain {0, 1}? Name three of these functions for N = 2.
Solution: There are 2^(2^N) functions over this domain (16 for N = 2). Three of the familiar functions are And, Nand, and Xor.


7. How many functions can you define over domain {0, 1}^N and codomain {0}?
Solution: In this case, we can define only one function for any value of N: the constant function that always returns 0.

8. How many functions can you define over domain {0, 1, 2}^N and codomain {0, 1, 2, 3}?
Solution: The domain size is 3^N, obtained by measuring the size of {0, 1, 2}^N. Against each element of the domain can be listed the output, which comes from the codomain of size 4. Thus, the answer is 4^(3^N). Comparing this against 2^(2^N), the number of Boolean functions of N inputs, it is clear that this is a generalization of the derivation we did when we studied Boolean functions.

9. How many correspondences can exist between {0, . . . , 7} and itself? What are these correspondences called (from your study of permutations and combinations)?
Solution: These correspondences must map from a domain of size 8 to a codomain of size 8 through a non-collapsing map. Each map is a permutation. For instance,
{(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)}
is one such correspondence. Another one is
{(0, 1), (1, 0), (2, 3), (3, 2), (4, 5), (5, 4), (6, 7), (7, 6)},
and in general there are n! such correspondences over a set of size n; here, 8! = 40,320.


10. Consider the correspondence f : {0, . . . , 7} → {0, . . . , 7} with rule (x + 1) mod 8. Describe f ∘ f ∘ . . . ∘ f (N times) as the N-fold composition of f with itself. How many distinct correspondences (across all possible N) exist?
Solution: Each such composition rotates the elements. For instance, a 0-fold composition results in
{(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7)},
while a 1-fold composition results in
{(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 0)}.
A 2-fold composition results in
{(0, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 7), (6, 0), (7, 1)}.
It is now clear that after 8 rotations, we would be back to the original situation of 0 rotations. Thus, there are 8 such distinct compositions possible.
11. How many relations R ⊆ A × A exist, where A = {0, 1, 2}?
Solution: This asks for the number of subsets of A × A. There are 9 elements in A × A, and therefore 2^9 = 512 such relations.
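Several of the counting answers above (exercises 5 through 8, and 11) can be cross-checked by brute-force enumeration for small N. The following is a sketch, not code from the book:

```python
from itertools import product

def count_functions(domain, codomain):
    """Count total functions domain -> codomain by enumerating every
    way of choosing one codomain output per domain element."""
    return sum(1 for _ in product(codomain, repeat=len(domain)))

def count_relations(a):
    """Count relations R over a x a, i.e., subsets of a x a."""
    return 2 ** len(list(product(a, repeat=2)))
```

For instance, `count_functions([0, 1], [0, 1])` confirms the answer 4 to exercise 5, and `count_relations([0, 1, 2])` confirms 2^9 for exercise 11.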

14.3 Invertibility of Functions

1. Suppose f(x) = 3x − 30 is a function from R to R. Show that f has an inverse.
Solution: We have to show the Tarzan proof.
Let the inverse g be g(y) = (y + 30)/3. This g undoes the operations that f carries out.
For every x ∈ R, we have to show that g(f(x)) = x. This is seen to be true by substitution:
((3x − 30) + 30)/3 = x
For every y ∈ R, we have to show that f(g(y)) = y. This is seen to be true by substitution:
((y + 30)/3) · 3 − 30 = y
2. Consider the domain D of a function f to be the power set of {1, 2, 4, 8}; that is, P({1, 2, 4, 8}). Let the codomain C be {0, 1, 2, . . . , 15}. Let f take every x ∈ D and do the following. Recall that x is a set. The rule for f is: add the members of x. Thus, {1, 2, 8} ∈ P({1, 2, 4, 8}) maps to 1 + 2 + 8 = 11. Is this a 1-1 function? A correspondence?
Solution: It is a correspondence. Notice that 1, 2, 4, 8 occupy distinct bits in the binary representation of these numbers. Thus, the rule of f simply sets these bits for each addition. For instance, 1 + 2 + 8 can be thought of as the following in binary:
0001 + 0010 + 1000
This evaluates to 1011 because these bits are or-ed in. Such additions result in 1-1 maps. Also, by placing these bits in all combinations, we will generate all the codomain elements. Thus, f is invertible. Write out a few of these mappings and check for yourselves.
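Both exercises above can be checked mechanically. The sketch below (not from the book) verifies the Tarzan proof for f(x) = 3x − 30 using exact rational arithmetic, and enumerates the subset-sum map over P({1, 2, 4, 8}) to confirm it hits every element of {0, . . . , 15} exactly once:

```python
from fractions import Fraction
from itertools import combinations

def f(x):
    return 3 * x - 30

def g(y):
    # Candidate inverse; Fraction keeps the division by 3 exact.
    return Fraction(y + 30, 3)

def subset_sum(s):
    """The rule of the second exercise: add the members of the set s."""
    return sum(s)

def all_subsets(base):
    """Every subset of base, as frozensets, sizes 0 through len(base)."""
    return [frozenset(c) for r in range(len(base) + 1)
            for c in combinations(base, r)]
```

Enumerating `sorted(subset_sum(s) for s in all_subsets((1, 2, 4, 8)))` yields 0 through 15 with no repeats, confirming the correspondence.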

14.4 Pigeon-hole Theorem, Finite Domains

There are a few simple theorems regarding functions between finite domains and codomains. Let the domain be A, the codomain be B, n(A) be the size of the domain, and n(B) be the size of the codomain.

n(A) > n(B): All functions f : A → B must be many-to-one. This is known as the pigeon-hole principle: if there are n pigeon-holes and n + k pigeons (for k > 0), then there must be one pigeon-hole that contains more than one pigeon.

n(A) < n(B): No function of the form f : A → B can be onto. This is clear because members of A cannot map to more than one element of B. However, f can still be many-to-one. For instance, it is still possible that all members of A map to one member of B.

n(A) < n(B) or A ⊊ B: For finite sets, if A ⊊ B, then a 1-1, onto map (a correspondence) from A to B cannot exist.

Note that for infinite sets A and B, even if A ⊊ B, it is possible to have such a 1-1, onto function f : A → B. For instance, suppose A = Even and B = N. We can define f(x) = x div 2, which maps A onto B in a 1-1 manner.

14.5 Correspondences Between Infinite Sets

It is important to become familiar with the construction of correspondences between infinite sets.
1. Show, by proposing a correspondence, that there are as many points in (1, 2] ⊆ R as in [2, ∞) ⊆ R (both these sets have cardinality ℵ₁).
Solution: This means we must map every point in (1, 2] to [2, ∞) in a 1-1, onto, and total map. How about sending 2 to 2? In that case, one can send points approaching 1 to points approaching ∞. This is achieved by the function 2/(x − 1).
2. Show, by proposing a correspondence, that there are as many points in [0, ∞) ⊆ R as in [0, 1) ⊆ R (both these sets have cardinality ℵ₁).
Solution: This is achieved by the function x/(x + 1).
As x approaches ∞, the ratio approaches 1.
As x approaches 0, the ratio approaches 0.
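A finite sample cannot prove a bijection, but it can sanity-check the two proposed maps: both should be strictly monotonic (hence 1-1) and stay inside the claimed codomains. A small sketch, not from the book:

```python
def h1(x):
    """Proposed map from (1, 2] to [2, infinity): 2/(x - 1)."""
    return 2 / (x - 1)

def h2(x):
    """Proposed map from [0, infinity) to [0, 1): x/(x + 1)."""
    return x / (x + 1)
```

Sampling h1 over points of (1, 2] shows values at or above 2 that strictly decrease as x grows, and sampling h2 over [0, ∞) shows values in [0, 1) that strictly increase, consistent with both maps being injective.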

Chapter 15
Induction
In mathematics and in computer science, one likes to prove facts about all elements of an infinite set. Examples:

The sum of all natural numbers from 1 to N is N(N + 1)/2.

The sum of the binomial coefficients C(N, 0) through C(N, N) is 2^N.

An ant decides to walk on a graph paper starting from the origin (coordinate (0, 0)), heading toward point (N, N) toward a sugar cube. It always goes one unit right or one unit up. This ant has a total of (2N)!/(N!)^2 different walks, for any N.
We can of course check these assertions for a few N values. For instance:

The sum of 1 through 5 is 1 + 2 + 3 + 4 + 5, which is 15. Plugging N = 5 into N(N + 1)/2, we get 5 · (5 + 1)/2, or 15.

The sum of the binomial coefficients C(4, 0) through C(4, 4) is (from a suitable Pascal's triangle row)
1 + 4 + 6 + 4 + 1,
which simplifies to 16, or indeed 2^4.

Tracing the ant from (0, 0) to (2, 2), it can go six different ways, as follows:
(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)

(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)


(0, 0), (1, 0), (1, 1), (1, 2), (2, 2)
(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)
(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)
(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)

Now this fits the equation (2 · 2)!/(2!)^2, which is 4!/4, or 6.
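The ant-walk count can also be checked programmatically for small N by enumerating every right/up path recursively. This is a sketch, not code from the book:

```python
from math import factorial

def count_walks(x, y, n):
    """Count monotone walks from (x, y) to (n, n), each step going
    one unit right or one unit up."""
    if x == n and y == n:
        return 1
    total = 0
    if x < n:
        total += count_walks(x + 1, y, n)   # step right
    if y < n:
        total += count_walks(x, y + 1, n)   # step up
    return total
```

`count_walks(0, 0, 2)` reproduces the six walks listed above, and for small n the counts match (2n)!/(n!)^2.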

However, checking these assertions for a few values isn't any guarantee that they hold true for all N. Induction is the central approach for showing such general results.

15.1 Basic Idea Behind Induction

The basic idea behind induction is to use a proof pattern. Let us derive this
pattern through a few attempts, culminating in the correct version.

15.1.1 First Incorrect Pattern for Induction

Let us try erecting a simple pattern:

Assume that the assertion is true at 0; show that it is true at 1.
Assume that the assertion is true at 1; show that it is true at 2.
Assume that the assertion is true at 2; show that it is true at 3.
...
(Keep doing this)

Clearly, this is infeasible, as we don't know when to stop. It is also plain wrong! For example, suppose one wants to show that for every n, it is the case that n = n + 1. Suppose someone suggests proceeding as follows (clearly, all this is incorrect, but we just want to make a point):

Assume that the assertion is true at 0, i.e. assume that 0 = 1. Then one can show that 1 = 2 by adding 1 to both sides.
Now that we know 1 = 2, we can show 2 = 3, and so on.

What we ended up doing is this. Suppose P(n) is the assertion that n = n + 1. Then, the above argument achieved the following:

We showed P(0) ⇒ P(1); i.e., assuming P(0) (or 0 = 1), we established P(1) (or 1 = 2).


Likewise, we showed P(1) ⇒ P(2).

Speaking in general, we showed that for every n, P(n) ⇒ P(n + 1).

15.1.2 Correct Pattern for Induction

The stack of implications

P(0) ⇒ P(1)
P(1) ⇒ P(2)
P(2) ⇒ P(3)
P(3) ⇒ P(4)
...

does not allow us to infer anything! For all you know, each statement above may be equivalent to "IF the moon is made of green cheese THEN horses can fly." Anything (including false assertions) can be put after the IF.

We know that to apply modus ponens, we need a trigger. That is, suppose we also manage to show P(0). Then we will have a much better situation:

P(0) (is true)
P(0) ⇒ P(1)
P(1) ⇒ P(2)
P(2) ⇒ P(3)
P(3) ⇒ P(4)
...

We can now apply modus ponens and derive P(1), and then P(2), and so on. This then proves that for all n, it is the case that P(n) is true. In a sense, the stack of implications is like a row of dominoes, and the trigger is the push to the first domino!

15.1.3 Induction: Basis Case and Step Case

We can now summarize the rule of induction systematically. There are basically two approaches, called arithmetic induction and complete induction.
Arithmetic induction This is the most basic pattern that we shall follow.
Goal: Prove that for all n, P ( n) is true.
Approach:



Prove the Basis Case: Show that P(0) is true.
Prove the Step Case: Show that P(n) ⇒ P(n + 1) is true (or valid).

One can state it formally thus: for showing ∀n, P(n) for any predicate P,
Show that P(0) is true.
Show that ∀n, [P(n) ⇒ P(n + 1)] is valid.
In other words, assuming P(n) for an arbitrary n, we can show that P(n + 1) is true.

It is important to keep in mind that we may change the basis case to P(1) or P(k) for some k ∈ N. We may also need to establish multiple basis cases. These variations will be introduced depending on the problem. In all cases, the "tip the stack of dominoes" pattern of proofs will hold.
Complete induction While theoretically equivalent to arithmetic induction, this rule often proves handier in many situations. Please see §15.4 for an illustration of this rule.
Goal: Prove that for all n, P(n) is true.
Approach:
No Explicit Basis Case: You heard us right; you won't be showing an explicit basis case!
Prove the Step Case for Complete Induction: Show that by assuming P(m) true for all m < n, we can show P(n).
Catch! When you take n = 0, you won't have an m < n (typically you induct from 0 and up). Thus, you'll have to show P(0) without the benefit of assuming it for m < n. This way, you will be forced to prove a basis case anyhow.
One can state it formally thus: for showing ∀n, P(n) for any predicate P,
Show that ∀n, [(∀m < n, P(m)) ⇒ P(n)].


In other words, for an arbitrary n, assume that P(m) holds for all m < n. Using this, try to show P(n).

Failure! You will not be presented with problems where you'll fail to prove by induction (provided you try reasonably hard). But when one fails to prove something by induction (despite trying extremely hard), one of two things can be concluded:
Either what we are trying to prove is false, or
The formula may be true, but not inductive; that is, one may have to prove something for a stronger P. We won't face too many of these situations (we will provide one example in §15.5).

15.2 A Template for Writing Induction Proofs

Induction proofs must be written in such a way that you can trace your arguments, and so can we when we grade your work. The basic steps to be listed in your answers are:

Induction variable: State what we are inducting on (which variable). Typical step: "induct on n."

Formulate proof goal: Formulate and write down the forall query to be verified. Typical step: "To show that for all n, Property(n) holds."

Basis case(s): Think of the basis case(s). Typical step: "We now show that Property(b_1), Property(b_2), etc. hold (for the basis cases b_1, b_2, etc.)."

Induction hypothesis: State the induction hypothesis (what you assume to be true of (n − 1); the book standardizes on the induction hypothesis being with respect to (n − 1), though you may assume it for n also). Typical step: "Assume that Property(n) holds."

Induction step: Write down the induction step (what you should be seeking to conclude as the induction step). Typical step: "We now show that Property(n + 1) holds."


Finishing the proof: Apply algebra to simplify the induction step (where the induction hypothesis is involved, write it down).

15.3 Examples

We will now consider several examples. These are the situations in which our examples will arise.

General Principles of Induction: Induction is one of the most fundamental of proof techniques. It is used to prove properties of infinite sets of items, such as natural numbers, where there is a smallest item, and a next item larger than each item.

Deriving Summations of Series: We will learn how to derive and verify formulae pertaining to summing arithmetic and geometric progressions (series).

Properties of Trees: We will learn to count the number of leaves, as well as the total number of nodes, in balanced trees.

Problems Relating to Recurrences: We will learn to apply induction to problems stated using recurrence relations.

15.3.1 Series Summation Problems-1

Question: Prove by induction that

Σ_{i=0}^{n} r^i = (r^{n+1} − 1)/(r − 1),

where r stands for the common ratio (with r ≠ 1).

Solution:
Induction variable: n
Proof goal:
∀n, Σ_{i=0}^{n} r^i = (r^{n+1} − 1)/(r − 1)
It is a bit tedious to write this down, so define
S(n) = Σ_{i=0}^{n} r^i
So, the proof goal becomes:
∀n, S(n) = (r^{n+1} − 1)/(r − 1)

Basis case: Show for n = 0 that the property is true. That is, show that
S(0) = (r^{0+1} − 1)/(r − 1)
From the definition of S(n), we know
S(0) = Σ_{i=0}^{0} r^i = 1
But this is also what (r^{0+1} − 1)/(r − 1) evaluates to. Thus, the property holds for n = 0.

Induction hypothesis: Assume the formula for S(n − 1), i.e.,
S(n − 1) = (r^n − 1)/(r − 1)

Induction step: Show that the property holds for n. That is, show that
S(n) = (r^{n+1} − 1)/(r − 1)
Key observation: We can write S(n) as S(n − 1) + r^n. This is because we are adding one more element to the summation.
S(n) = S(n − 1) + r^n
= (by induction hypothesis) (r^n − 1)/(r − 1) + r^n
= (r^n − 1 + r^n (r − 1))/(r − 1)
= (by algebra) (r^{n+1} − 1)/(r − 1)
Hence proved!
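For integer r, the closed form just proved can be cross-checked against a direct summation. A sketch, not from the book:

```python
def geometric_sum(r, n):
    """Direct summation r^0 + r^1 + ... + r^n."""
    return sum(r ** i for i in range(n + 1))

def geometric_closed_form(r, n):
    """The formula proved above; exact in integers for r != 1,
    since (r - 1) always divides r^(n+1) - 1."""
    return (r ** (n + 1) - 1) // (r - 1)
```

Comparing the two over a range of r and n gives identical values every time.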


15.3.2 Series Summation Problems-2

Question: Prove by induction that

Σ_{i=1}^{n} i^3 = n^2 (n + 1)^2 / 4

Solution:
Induction variable: n
Proof goal:
∀n, Σ_{i=1}^{n} i^3 = n^2 (n + 1)^2 / 4
It is a bit tedious to write this down, so define Sc(n) to stand for the sum of cubes up to n:
Sc(n) = Σ_{i=1}^{n} i^3
The proof goal then becomes: ∀n, Sc(n) = n^2 (n + 1)^2 / 4.

Basis case: One basis case suffices. Show for n = 1:
Sc(1) = 1^2 (1 + 1)^2 / 4 = 1
This is true by algebra.

Induction hypothesis: Assume the formula for Sc(n − 1), i.e.,
Sc(n − 1) = (n − 1)^2 ((n − 1) + 1)^2 / 4
i.e.,
Sc(n − 1) = (n − 1)^2 n^2 / 4

Induction step: Show
Sc(n) = n^2 (n + 1)^2 / 4
Key observation: We can write Sc(n) as Sc(n − 1) + n^3. This is because we are adding one more element to the summation; the i^3 term becomes n^3.
Sc(n) = Sc(n − 1) + n^3
= (by induction hypothesis) (n − 1)^2 n^2 / 4 + n^3
= ((n^2 − 2n + 1) n^2 + 4 n^3) / 4
= (n^4 + 2 n^3 + n^2) / 4
= n^2 (n + 1)^2 / 4
Hence proved!
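The sum-of-cubes formula can be cross-checked the same way (a sketch, not book code):

```python
def sum_of_cubes(n):
    """Direct summation 1^3 + 2^3 + ... + n^3."""
    return sum(i ** 3 for i in range(1, n + 1))

def cubes_closed_form(n):
    """n^2 (n+1)^2 / 4; exact in integers since n(n+1) is even,
    making n^2 (n+1)^2 divisible by 4."""
    return n * n * (n + 1) * (n + 1) // 4
```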

15.3.3 Series Summation Problems-3

Given a sequence defined as follows:

a_1 = b
a_n = b + (n − 1) k

prove by induction the summation closed-form expression

Σ_{i=1}^{n} a_i = (n/2) (2b + (n − 1) k)

Solution:
Denote the summation up to n by S_n; that is, we have to show

S_n = (n/2) (2b + (n − 1) k)

Basis Case: Show that the formula gives S_1 = b for n = 1. The summation S_1 amounts to
Σ_{i=1}^{1} a_i = b,
while the formula gives (1/2) (2b + (1 − 1) k), which simplifies to b, thus matching the summation.

Induction Case: Assume the above identity for n and show it holds for n + 1.
We know that S_{n+1} = S_n + (b + nk); i.e., add a_{n+1} to S_n to obtain the summation up to element n + 1.
Employ the induction hypothesis (i.e., that the identity holds up to n) to expand S_n in the above formula, to get

S_{n+1} = (n/2) (2b + (n − 1) k) + (b + nk)
= (1/2) (n (2b + (n − 1) k) + 2 (b + nk))
= (1/2) (2b + 2nb + n(n − 1) k + 2nk)
= (1/2) ((n + 1) 2b + n^2 k + nk)
= (1/2) ((n + 1) 2b + nk (n + 1))
= ((n + 1)/2) (2b + nk)

Thus, the formula for S_n holds for all n. We can thus say

∀n ∈ N, S_n = (n/2) (2b + (n − 1) k)
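The arithmetic-series formula can likewise be cross-checked for integer b and k (a sketch, not book code):

```python
def arith_sum(b, k, n):
    """Direct summation of a_i = b + (i - 1) k for i = 1..n."""
    return sum(b + (i - 1) * k for i in range(1, n + 1))

def arith_closed_form(b, k, n):
    """(n/2)(2b + (n - 1) k); exact in integers since
    n (2b + (n - 1) k) = 2bn + n(n - 1)k is always even."""
    return n * (2 * b + (n - 1) * k) // 2
```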

15.3.4 Series Summation Problems-4

Prove by induction on n ≥ 0 that Σ_{i=1}^{n} i (i + 1) = n (n + 1) (n + 2) / 3. Provide all requisite details for an induction proof.

Induction Variable: n
Proof Goal: S_n = Σ_{i=1}^{n} i (i + 1) = n (n + 1) (n + 2) / 3
Basis Case: S_0 = 0, and 0 · 1 · 2 / 3 = 0.
Induction Hypothesis: S_n = n (n + 1) (n + 2) / 3
Induction Step: To show S_{n+1} = (n + 1) (n + 2) (n + 3) / 3
Proof:
S_{n+1} = S_n + (n + 1)(n + 2)
= n (n + 1) (n + 2) / 3 + (n + 1)(n + 2) (by ind. hyp.)
= [n (n + 1) (n + 2) + 3 (n + 1) (n + 2)] / 3
= [(n + 1) (n + 2) (n + 3)] / 3

Hence proved.
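A direct cross-check of this identity (a sketch, not book code):

```python
def sum_i_times_next(n):
    """Direct summation of i (i + 1) for i = 1..n."""
    return sum(i * (i + 1) for i in range(1, n + 1))

def closed_form_i_times_next(n):
    """n (n+1) (n+2) / 3; exact in integers since a product of
    three consecutive integers is always divisible by 3."""
    return n * (n + 1) * (n + 2) // 3
```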

15.3.5 Proving an Inequality-1

Question: Show that

∀n, n ≥ 7 ⇒ 3^n < n!

Induction variable: n
Proof goal:
∀n, Cond(n)
where
Cond(n) = (n ≥ 7) ⇒ (3^n < n!)

We should test n = 6 to understand the given condition well:
Cond(6) = (6 ≥ 7) ⇒ (3^6 < 6!)
Now, 3^6 = 729 while 6! = 720. Thus, 729 < 720 does not hold! Thus, we are avoiding a bad spot by using the implication.
Hopefully, things will work above 6; let's check: 3^7 = 2187, while 7! = 5040; and 2187 < 5040. Yay, the inequality seems to want to work! Thus, we now productively go forward inducting.

Basis case: For n = 7:
Cond(7) = 3^7 < 7!
This is true (checked above).

Induction hypothesis: Assume Cond(n − 1) is true, i.e.,
3^{n−1} < (n − 1)!
for (n − 1) ≥ 7.

Induction step: Show
Cond(n)
i.e., show that
3^n < n!
Obviously if (n − 1) ≥ 7, then n ≥ 7 also, so we don't need to carry the baggage of the implication any more. We can simply focus on the juicy part of the proof goal.
Thus, to take stock of things:
We know that 3^{n−1} < (n − 1)!
Must show that 3^n < n!
I.e., must show that 3 · 3^{n−1} < n · (n − 1)!
I.e., must show that P · Q < R · S, where
P = 3
Q = 3^{n−1}
R = n
S = (n − 1)!
But observe that Q < S (induction hypothesis).
Also, observe that P < R (i.e., 3 < n, since n ≥ 8 here).
Thus, P · Q < R · S holds!
I.e., 3^n < n! holds!

Hence, proved.
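A quick numeric check of Cond(n), including the "bad spot" at n = 6 (a sketch, not book code):

```python
from math import factorial

def cond(n):
    """Cond(n) = (n >= 7) implies (3^n < n!)."""
    return (n < 7) or (3 ** n < factorial(n))
```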

15.3.6 Proving an Inequality-2

Prove by induction that n^3 + 2n is divisible by 3, i.e.,

∀n ≥ 0, (n^3 + 2n) mod 3 = 0

Induction variable: n
Proof goal:
∀n, (n^3 + 2n) mod 3 = 0

Basis case: We should test n = 0, and it works out: 0^3 + 2 · 0 = 0, and 0 mod 3 = 0.

Induction hypothesis: Assume
((n − 1)^3 + 2(n − 1)) mod 3 = 0

Induction step: Show
(n^3 + 2n) mod 3 = 0.
Let us call IH = ((n − 1)^3 + 2(n − 1)) and IS = (n^3 + 2n).
Let us find out the difference between IS and IH:
(n^3 + 2n) − ((n − 1)^3 + 2(n − 1))
Use the fact that (n − 1)^3 = n^3 − 3n^2 + 3n − 1 to obtain
(n^3 + 2n) − (n^3 − 3n^2 + 3n − 1 + 2n − 2)
This simplifies to 3n^2 − 3n + 3, which is divisible by 3.
Thus, IS − IH is divisible by 3, and IH is divisible by 3 (by the induction hypothesis).
Thus, IS is divisible by 3; that is, the induction step is established.
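A direct check of the divisibility claim over a range of n (a sketch, not book code):

```python
def divisible_by_3(n):
    """Check the claim (n^3 + 2n) mod 3 == 0 for a single n."""
    return (n ** 3 + 2 * n) % 3 == 0
```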

15.3.7 Proving an Inequality-3

Prove by induction on n ≥ 5 that 2^n > n^2. Hint: 2^{n+1} = 2^n + 2^n. Provide all requisite details for an induction proof. Also argue why n ≥ 4 does not work.

Induction Variable: n
Proof Goal: 2^n > n^2 in the range 5 and above
Basis Case: 2^5 = 32 > 25 = 5^2 (notice that this does not work for 4, since 2^4 = 16 is not greater than 4^2 = 16)
Induction Hypothesis: 2^n > n^2
Induction Step: To show 2^{n+1} > (n + 1)^2
Proof:
2^{n+1} = 2^n + 2^n
> n^2 + n^2 (by ind. hyp.)
> n^2 + (2n + 1) (since n^2 > 2n + 1 in the range 5 and above)
= (n + 1)^2.
Hence proved.
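A numeric check of the inequality, its failure at n = 4, and the side condition n^2 > 2n + 1 used in the step (a sketch, not book code):

```python
def ineq_holds(n):
    """Check 2^n > n^2 for a single n."""
    return 2 ** n > n ** 2
```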

15.3.8 Sequence Summation Needing TWO Basis Cases

This example is from Ensley and Crawley's book on Discrete Structures. The goal is to show that the sequence defined by

a_k = a_{k−1} + 2 a_{k−2}

for k ≥ 3, where a_1 = 1 and a_2 = 2, is equivalently described by the formula

a_n = 2^{n−1}

Induction variable: k
Proof goal:
∀k ≥ 1, a_k = 2^{k−1}

Basis cases: We should test two basis cases, namely a_1 and a_2. This is because the sequence of interest starts off at these two basis cases and only then recursively builds up. Thus we have
a_1 = 1 = 2^{1−1}
a_2 = 2 = 2^{2−1}

Induction hypothesis: Assume for all k up to and including (n − 1) that
a_k = 2^{k−1}

Induction step: Show
a_n = 2^{n−1}
According to the sequence definition, we have
a_n = a_{n−1} + 2 a_{n−2}
According to the induction hypothesis, we have
a_{n−1} = 2^{(n−1)−1}
a_{n−2} = 2^{(n−2)−1}
Thus, using the induction hypothesis, we can write a_n as
a_n = 2^{n−2} + 2 · 2^{n−3}
= 2^{n−2} + 2^{n−2}
= 2^{n−1}
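The closed form can be cross-checked by computing the sequence directly from its two basis cases (a sketch, not book code):

```python
def seq_a(n):
    """The sequence a_1 = 1, a_2 = 2, a_k = a_{k-1} + 2 a_{k-2}."""
    vals = [1, 2]
    while len(vals) < n:
        vals.append(vals[-1] + 2 * vals[-2])
    return vals[n - 1]
```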


15.3.9 Riffle Shuffles

Here, there are two decks, with N_1 and N_2 cards respectively. From Chapter 10, we have seen that there are (N_1 + N_2)! / (N_1! · N_2!) riffle-shuffles possible. Let us establish this result by induction, following the complete-induction recipe.

Assume: For riffle-shuffles of all lower sizes of decks of cards, the formula works correctly. Thus, for the N_1, (N_2 − 1) deck combo and the (N_1 − 1), N_2 deck combo, assume the formulae work. (When one of the decks is empty, there is exactly one shuffle, and the formula gives N!/(0! · N!) = 1, establishing the basis.)

Every shuffle ends in one of two ways: we either obtain an (N_1 − 1) against N_2 shuffle and plop the final card of the first deck, or obtain an N_1 against (N_2 − 1) shuffle and plop the final card of the other deck.

That is, we recursively divided the problem into these two cases, and together these two cases give the shuffles that constitute the whole:

((N_1 − 1) + N_2)! / ((N_1 − 1)! · N_2!) + (N_1 + (N_2 − 1))! / (N_1! · (N_2 − 1)!)

This can be algebraically simplified to ((N_1 + N_2 − 1)! · (N_1 + N_2)) / (N_1! · N_2!), or to (N_1 + N_2)! / (N_1! · N_2!), which is what we want to prove.
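The two-case recursion above translates directly into code, which can be compared against the closed formula (a sketch, not book code):

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def riffle_count(n1, n2):
    """Count riffle shuffles recursively: the final card comes from
    one of the two decks; an empty deck admits exactly one shuffle."""
    if n1 == 0 or n2 == 0:
        return 1
    return riffle_count(n1 - 1, n2) + riffle_count(n1, n2 - 1)
```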

15.4 Proof by Induction of the Fundamental Theorem of Arithmetic

The fundamental theorem of arithmetic states that

Every natural number is expressible uniquely as a product of primes.

Proof by complete induction (see §15.1.3):

Either the given natural number n is a prime, in which case its exponent tuple will be of the form (0, . . . , 0, 1), with a single 1 in the position of that prime. Thus, 17 ↦ (0, 0, 0, 0, 0, 0, 1) because it is equal to
2^0 · 3^0 · 5^0 · 7^0 · 11^0 · 13^0 · 17^1



In this section, we will use ↦ in this sense when we compare numbers and tuples. Such n have unique prime factorizations. This establishes the basis case for us, actually!

Or, the given n is composite, and is a product of two smaller numbers, i.e., n = n_1 · n_2. Clearly, n_1 and n_2 are less than n.
By complete induction, assume that all n_i below n have unique prime factorizations.
Thus, n_1 and n_2 have unique prime factorizations

n_1 ↦ (a_{p_1}, a_{p_2}, . . . , a_{p_{m1}})

and

n_2 ↦ (b_{p_1}, b_{p_2}, . . . , b_{p_{m2}})

That is, n_1 involves going up to prime p_{m1} (the last prime exponent needed to express n_1), and n_2 involves going up to prime p_{m2} (the last prime exponent needed to express n_2). Without loss of generality, assume that m2 > m1. Then

n ↦ ((a_{p_1} + b_{p_1}), (a_{p_2} + b_{p_2}), . . . , (a_{p_{m1}} + b_{p_{m1}}), . . . , b_{p_{m2}})

For instance, 131784 = 68 · 1938 = (4 · 17) · (2 · 3 · 17 · 19). And so, if we inductively assume that these numbers have unique prime factorizations, i.e.,

68 ↦ (2, 0, 0, 0, 0, 0, 1)

and

1938 ↦ (1, 1, 0, 0, 0, 0, 1, 1)

then we can express

68 · 1938 ↦ ((2 + 1), (0 + 1), (0 + 0), (0 + 0), (0 + 0), (0 + 0), (1 + 1), 1)

i.e.,

68 · 1938 ↦ (3, 1, 0, 0, 0, 0, 2, 1)

which is a way of saying that

68 · 1938 = 2^3 · 3^1 · 5^0 · 7^0 · 11^0 · 13^0 · 17^2 · 19^1

Thus we obtain a unique encoding for n also.
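The exponent-tuple view of multiplication can be checked directly: factor each operand, pad the shorter tuple with zeros, and compare the element-wise sum with the factorization of the product. A sketch with hypothetical helper names, not code from the book:

```python
def factor_exponents(n):
    """Exponent list of n over successive primes 2, 3, 5, 7, ..."""
    exponents = []
    p = 2
    while n > 1:
        if all(p % q != 0 for q in range(2, p)):   # p is prime
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exponents.append(e)
        p += 1
    return exponents

def add_exponents(a, b):
    """Element-wise sum, padding the shorter list with zeros."""
    length = max(len(a), len(b))
    a = a + [0] * (length - len(a))
    b = b + [0] * (length - len(b))
    return [x + y for x, y in zip(a, b)]
```

Running this on 68 and 1938 reproduces the tuples shown above, and the sum of their exponent lists equals the factorization of 131784.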

15.5 Failing to Prove by Induction: Strengthening

Suppose we are engaged in an experiment which goes on forever: we take a jug, and at every time-step t ≥ 1, we add 2 more liters of water to it. Suppose the whole experiment starts at t = 0. Suppose someone wants you to prove that for all t, volume(t) ≠ 3; that is, the proof goal is that the volume of water is never 3.

Let us begin dutifully inducting:

Basis case of t = 0: 0 ≠ 3. Check.

Induction step case: assume that at t, volume(t) = m, and that m ≠ 3. Show that at time t + 1, volume(t + 1) ≠ 3. This amounts to:

m ≠ 3 ⇒ m + 2 ≠ 3

Alas, this does not work, because m could be an odd number, say 1, in which case we will get 1 + 2 = 3.

While we (as humans) know that m cannot be odd, the proof rule of induction, when blindly applied, does not know that.

This situation often gets arbitrarily complex in practice. Thus, when such failure occurs, one has to think hard and prove a stronger result. For us:

Prove that ∀t, [even(volume(t)) ∧ volume(t) ≠ 3]

Then the step case becomes:

[even(m) ∧ (m ≠ 3)] ⇒ [even(m + 2) ∧ (m + 2 ≠ 3)]

which holds: if m is even then m + 2 is even, and an even number cannot equal 3.

Thus, we emerge having proved something stronger: ∀t, even(volume(t)). From this, what we wanted proven (that volume(t) ≠ 3) follows.
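The strengthened invariant is easy to test by simulating the experiment (a sketch, not book code):

```python
def volume(t):
    """Volume after t steps: start at 0 liters, add 2 liters per step."""
    v = 0
    for _ in range(t):
        v += 2
    return v
```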


