
Complex Information Processing

Paper #96
March 13, 1967



The Measurement of Task Size

In a sufficiently small problem space, W, where a member of W, once discovered, can be identified easily as a solution, the task of discovering solutions may be trivial (e.g., a T-maze for a rat with food in one branch). The difficulties in complex problem solving usually arise from some combination of two factors: the size of the set of possible solutions that must be searched, and the task of identifying whether a proposed solution actually satisfies the conditions of the problem. In any particular case, either or both of these may be the sources of problem difficulty. By using our formal models of problem solving we can often obtain meaningful measures of the size and difficulty of particular problems, and measures of the effectiveness of particular problem solving processes and devices. Let us consider some examples.

The Logic Theorist

We have made some estimates of the size of the space of possible solutions (proofs) for the problems handled by the Logic Theorist. By a possible proof, which we shall take as the element of the set W (Figure 2) in this case, we mean an arbitrary list of symbolic logic expressions. If we impose no limits on the length or other characteristics of such sequences, their number, obviously, is infinite. Hence, we must suppose at the outset that we are not concerned with the whole set of possible proofs, but with some subset comprising, say, the "simpler" elements of that set. We might restrict W, for example, to lists consisting of not more than twenty logic expressions, with each expression not more than 23 symbols in length and involving only the variables p, q, r, s, and t, and the connectives "v" and "⊃". The number of possible proofs meeting these restrictions is about 10^235, one followed by 235 zeros.
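The scale of such a count can be sketched with a few lines of arithmetic. The alphabet size below is an assumption (the text names only the five variables and two connectives; negation and parentheses are presumed), and this unrestricted count comes out far larger than 10^235; the paper's figure presumably reflects further restrictions, such as well-formedness of the expressions, that are not spelled out here.

```python
import math

# Order-of-magnitude sketch of the "possible proof" count. The
# alphabet size is an assumption: five variables plus "v", the
# horseshoe, negation, and parentheses -- call it 10 symbols.
ALPHABET = 10
MAX_LEN = 23      # symbols per expression (from the text)
MAX_EXPRS = 20    # expressions per proof (from the text)

# Symbol strings of length 1..MAX_LEN:
strings = sum(ALPHABET ** k for k in range(1, MAX_LEN + 1))

# Lists of 1..MAX_EXPRS such strings:
proofs = sum(strings ** n for n in range(1, MAX_EXPRS + 1))

print(round(math.log10(proofs)))   # about 10^461 under these assumptions
```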
The task of verifying that a particular element of the set W, as we have just defined it, is a proof of a particular problem in logic is also not trivial; for it is necessary to determine whether each expression in the sequence is an axiom or follows from some of the expressions preceding it by the rules of deductive inference. In addition, of course, the expression to be proved has to be contained in the sequence.

Clearly, selecting possible proofs by sheer trial and error and testing whether each element selected is actually the desired proof is not a feasible method for proving logic theorems for either humans or machines. The set to be searched is too large and the testing of the elements selected is too difficult. How can we bring this task down to manageable proportions?

First of all, the number just computed--10^235--is not only exceedingly large but also arbitrary, for it depends entirely on the restrictions of simplicity we impose on W. By strengthening these conditions, we reduce the size of W; by weakening them, we increase its size. We must look for a more meaningful way to describe the size of the set W. It does not appear that there is any non-trivial measure of W that is entirely independent of assumptions about the solution process.

The best we can do is to define a very simple "brute-force" process that produces members of W in a certain order, and ask how many members the generator would have to produce, on the average, to obtain solutions to problems of a specified class. Stated otherwise, we can construct some sort of measure of the size of a problem space by measuring the power of an exceedingly weak heuristic applied to the problem. This measure will not be independent of the heuristic we select as yardstick, but it will allow us to compare the power of alternative heuristics with the power of that heuristic as a standard.
We shall now describe such a standard for problems of proving theorems in logic. Let us generate elements of W according to the following simple scheme (which we call the British Museum Algorithm in honor of the primates who are credited with employing it):

(1) We consider only lists of logic expressions that are valid proofs; that is, whose initial expressions are axioms, and each of whose expressions is derived from prior ones by valid rules of inference. By generating only sequences that are proofs (of something), we eliminate the major part of the task of verification.

(2) We generate first those proofs that consist of a single expression (the axioms themselves), then proofs two expressions long, and so on, limiting the alphabet of symbols as before. Given all the proofs of length k, we generate those of length (k+1) by applying the rules of inference in all permissible ways to the former to generate new derived expressions that can be added to the sequences.* That is, we generate a maze (see again Figure 1), each choice point (a1, b1, b2, etc.) of which represents a proof, with the alleys leading from the choice point representing the legitimate ways of deriving new expressions as immediate consequences of the expressions contained in the proof. Thus, in the figure, d1 is a proof that can be derived as an immediate consequence of c1, using path 2.
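The generation scheme can be sketched in miniature. The axiom and the two rewrite rules below are hypothetical stand-ins for Principia's axioms and rules of inference; unlike the real algorithm, this sketch applies rules only to the last expression of each proof and does not filter duplicates.

```python
# Toy sketch of the British Museum Algorithm's generation order.
# The axiom and rules are made-up stand-ins, not Principia's.
AXIOMS = ["p v ~p"]

def immediate_consequences(expr):
    """Two invented one-step rewrites standing in for rules of inference."""
    yield expr.replace("v", ">", 1)   # replace the first "v" with ">"
    yield "~(" + expr + ")"           # negate the whole expression

def british_museum(max_len):
    """Yield all proofs of length 1, then length 2, ... up to max_len."""
    frontier = [[axiom] for axiom in AXIOMS]   # one-expression proofs
    yield from frontier
    for _ in range(max_len - 1):
        frontier = [proof + [expr]
                    for proof in frontier
                    for expr in immediate_consequences(proof[-1])]
        yield from frontier

proofs = list(british_museum(3))
print(len(proofs))   # 1 + 2 + 4 = 7 proofs, shortest first
```

Every sequence produced is a "proof" by construction, which is exactly how the algorithm eliminates most of the verification task.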
Figure 4 shows how the set of n-step proofs generated by the algorithm increases with n at the very start of the proof-generating process. This enumeration only extends to replacements of "v" with "⊃", "⊃" with "v", and negation of variables (e.g., "~p" for "p"). No detachments and no complex substitutions (e.g., "q v r" for "p") are included. No specializations have been made (e.g., substitution of p for q in "p v q"). If we include the specializations, which take three more steps, the algorithm will generate an (estimated) additional 600 theorems, thus providing a set of proofs, all of 11 steps or less, containing almost 1,000 theorems, none of them duplicates.
What does the algorithm do with the sixty-odd theorems of Chapter 2 of Principia? One theorem (2.01) is obtained in step (4) of the generation, hence is among the first 42 theorems proved. Three more (2.02, 2.03, and 2.04) are obtained in step (6), hence among the first 115. One more (2.05) is obtained in step (8), hence in the first 246. Only one more

* A number of fussy but not fundamental points must be taken care of in constructing the algorithm. The phrase "all permissible substitutions" needs to be qualified, for there is an infinity of these. Care must be taken not to duplicate expressions that differ only in the names of their variables. We will not go into details here, but simply state that these difficulties can be removed. The essential feature in constructing the algorithm is to allow only one thing to happen in generating each new expression, i.e., one replacement, substitution of "not-p" for "p".
[Figure 4]

is included in the first 1,000, theorem (2.07). The proofs of all the remainder require either complex substitutions or detachment.

We have no way at present to estimate how many proofs must be generated to include proofs of all theorems of Chapter 2 of Principia. We will show later that it is almost certainly more than 10 . Moreover, apart from the six theorems listed, there is no reason to suppose that the proofs of the Principia theorems would occur early in the list.
Our information is too poor to estimate more than very roughly the times required to produce such proofs by the algorithm using a computer like JOHNNIAC; but we can estimate times of about 16 minutes (40 seconds on an IBM 7090) to do the first 250 theorems of Figure 4 (i.e., through step (8)), assuming processing times comparable with those in LT. The first part of the algorithm has an additional special property, which holds only to the point where detachment is first used: that no check for duplication is necessary. Thus the time of computing the first few thousand proofs increases only linearly with the number of theorems generated. For the theorems requiring detachments, duplication checks must be made, and the total computing time increases as the square of the number of expressions generated. At this rate it would take eons of eons of computation for the British Museum Algorithm to generate proofs for the theorems in Chapter 2 of Principia. By this measure, the set W is very large indeed, and something more effective is needed than the British Museum Algorithm in order for a man or machine to solve problems in symbolic logic in a reasonable time.
It is worth observing that, in spite of its spectacular inefficiency, the generating process of the British Museum Algorithm is far from random. In adding to subsequences of expressions, it applies only admissible operators; hence all the sequences it generates are valid proofs (of something!). The algorithm does not explore the space of possible sequences of expressions, but only the space of proofs. Even in this space, it proceeds in an orderly way from "short" to "long" proofs.

The Moore-Anderson Logic Problems


Before leaving the Logic Theorist, we wish to mention a variant of the Whitehead and Russell problems which we have also studied, and which will be the subject of detailed analysis in Chapter . At Yale, O. K. Moore and Scarvia Anderson ( ) have studied the problem solving behavior of subjects who were given a small set (from one to four) of logic expressions as premises and asked to derive another expression from these, using twelve specified rules of transformation. (For details see the discussion in the next section and Chapter 4.) If we again suppose derivations to be generated by working forward from the premises, we can, in the case where there is a single premise, make simple estimates of the number of possible derivations of given length--and hence characterize this particular problem maze.
Assuming (which are oversimplifications) that each rule of transformation operates on one premise, and that each such rule is applicable to any premise, this particular maze branches in twelve directions (one for each rule of transformation) at each choice point. That is, we start out with a single premise; depending on which rule of transformation we apply, we obtain one of twelve possible new expressions from each of these, and so on. Thus, the number of possible sequences of length k is 12^k. If a problem expression can be derived from the premises in a minimum of seven steps, then a trial-and-error search for the derivation would require, on the average, the construction of 1/2 x 12^7, about 18,000,000 sequences. If only four rules of transformation were actually applicable, on the average, at each stage (a more realistic assumption, since expressions must be of particular forms for particular rules to be applicable to them), the number of sequences of length 7 would still be 4^7 = 16,384.
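The arithmetic behind these two estimates is easily checked:

```python
# Moore-Anderson maze: twelve rules at each choice point, minimum
# derivation of seven steps.
sequences = 12 ** 7
print(sequences // 2)   # average blind-search trials: 17,915,904

# Under the more realistic four-rules-per-stage assumption:
print(4 ** 7)           # 16,384 sequences of length 7
```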

Chess Playing

Let us turn now to a second example: choosing a move in chess. On the average, a chess player whose turn it is to move has his choice among twenty to thirty legal alternatives. There is no difficulty, therefore, in "finding" possible moves, but great difficulty in determining whether a particular legal move is a good move. The problem lies in the verifier and not in the generator. However, a principal technique for evaluating a move is to consider some of the opponent's possible replies to it, one's own replies to his, and so on, only attempting to evaluate the resulting positions after this maze of possible move sequences has been explored to some depth. Even though, in this scheme, we restrict ourselves to legal moves, hence to sequences generated by admissible operators, the maze of move sequences is tremendously large. If we consider the number of continuations five moves deep for each player, assuming an average of 30 legal continuations at each stage, we find that the set of such move sequences has about 10^15 (one million billion) members.
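The chess figure follows the same pattern:

```python
# Move maze: about 30 legal continuations per ply, ten plies
# (five moves for each player).
continuations = 30 ** 10
print(continuations)   # 590,490,000,000,000 -- about 10^15
```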

Opening a Safe

We can make similar estimates of the sizes of the set W for the other examples of problem-solving tasks we have listed. In all cases the set is so large as to foreclose (at least with human processing speeds) a solution-generating process that makes an essentially random search through the set for possible solutions. It will be useful to consider one additional "synthetic" example that has a simpler structure than any we have discussed so far, and that will be helpful later in understanding how various heuristic devices cut down the amount of search required to find problem solutions. Consider a safe whose lock has ten independent dials, each with numbers running from 00 to 99 on its face. The safe will have 100^10 = 10^20, or one hundred billion billion, possible settings, only one of which will unlock it. A would-be safe cracker, trying the combinations systematically, could expect to take on the average 50 billion billion trials to open it.
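The safe's numbers follow directly:

```python
# Ten independent dials, each with 100 settings (00 to 99).
settings = 100 ** 10           # 10^20: one hundred billion billion
average_trials = settings // 2 # 50 billion billion expected trials
print(settings, average_trials)
```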


Thus far we have emphasized the large sizes of typical problem spaces as an explanation for problem difficulty. Not all difficult problems, however, have large spaces of possibilities associated with them. There are non-trivial problems, with certain characteristics that we would call "puzzle-like," having very small spaces. Not all puzzles have small W, but it will be interesting to see whether we can discover a common property of puzzles that makes their difficulty relatively independent of W.
An example of a problem with small W that is difficult for most people is the Missionaries and Cannibals problem. Three missionaries and three cannibals are on one bank of a river, waiting to cross. They have a boat that will hold only two men at a time, and all members of the party know how to paddle it. At no time may a missionary or set of missionaries be left on either bank with a larger number of cannibals. What sequence of boat loads will get the party across the river safely? Let us generate all the possible solution paths that satisfy the conditions of the problem (no more than two in the boat, missionaries never to be outnumbered on either bank), terminating any path if it loops, that is, returns to a situation previously achieved. We find that there are only four paths, and each of these leads to a solution of the problem! (The reader can easily construct the maze for himself.)
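The maze can also be enumerated mechanically. The sketch below is one straightforward encoding, not the paper's own program: a state records the missionaries and cannibals on the starting bank plus the boat's side, and any path that revisits a state is cut off, as in the text.

```python
# Exhaustive enumeration of the Missionaries and Cannibals maze.
# State: (missionaries on left bank, cannibals on left bank, boat on left?).
LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]   # boat holds one or two

def safe(m, c):
    """No bank may have its missionaries outnumbered by cannibals."""
    return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

def successors(state):
    m, c, boat_left = state
    sign = -1 if boat_left else 1    # a crossing empties the boat's bank
    for dm, dc in LOADS:
        nm, nc = m + sign * dm, c + sign * dc
        if 0 <= nm <= 3 and 0 <= nc <= 3 and safe(nm, nc):
            yield (nm, nc, not boat_left)

def solution_paths(state=(3, 3, True), visited=None):
    visited = (visited or set()) | {state}
    if state == (0, 0, False):       # everyone across: a solution
        yield [state]
        return
    for nxt in successors(state):
        if nxt not in visited:       # terminate any looping path
            for path in solution_paths(nxt, visited):
                yield [state] + path

paths = list(solution_paths())
print(len(paths))   # the four paths of the text, each of 11 crossings
```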

Thus, simply by listing the possible paths, a solution of the Missionaries and Cannibals problem can be obtained in a matter of minutes with pencil and paper. Yet intelligent people often take a half hour or more to solve the problem. Why? The solution calls for eleven boat trips: six across the river and five back. In all the trips across, two men are carried in the boat; in all the trips back, except the third, one man crosses alone. But two men have to be carried on the third return trip. This step appears implausible to most people because it leads away from the final goal, which is to get the members of the party across the river. Hence this alternative is generally avoided for a long time. The small total number of alternative paths does not facilitate solving the problem, because the correct paths are not traversed at all.
Other well-known puzzles can be shown to have the same property. Thus, their puzzle-like character is not simply a function of the size of the problem space, but has to do also with the characteristics of the generators that people commonly bring to the problem. Such puzzles would be trivial problems for sufficiently "stupid," brute-force problem-solving programs. We shall see presently what properties of solution generating programs create the difficulty in puzzle situations.

Information-Theoretic Measures of Size

In theoretical analyses of selection and search, the size of the problem space is often measured in terms of bits, units introduced in information theory. The number of bits in a finite set is simply the logarithm, to the base 2, of the number of elements in the set. There is nothing particularly sacred about the base 2; if we want a logarithmic measure of size, we can as easily use digits, the logarithm to the base 10. One digit is about 3.3 bits (log2 10 ≈ 3.3).

The safe in the example of the previous paragraph constituted a problem space of 20 digits, or about 66 bits. Knowledge of the correct setting of any given dial would reduce the space to 18 digits, a reduction of 2 digits. Hence, we can say that a dial setting provides 2 (decimal) digits of information. This relation holds in general; a decimal number of n digits provides n digits of information, for it selects a particular one out of 10^n possible numbers.
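The digit and bit counts for the safe can be checked directly:

```python
import math

# The ten-dial safe of the previous example.
settings = 100 ** 10                 # 10^20 possible settings
digits = math.log10(settings)        # a 20-digit problem space
bits = math.log2(settings)           # about 66 bits

# Fixing one dial leaves 100 ** 9 settings, an 18-digit space, so
# each dial setting conveys 2 decimal digits of information.
print(round(digits), round(bits))    # 20 66
```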
The advantage of measuring the size of a set by the number of its elements rather than the logarithm of that number resides in the fact that the average time required to find a specified member of the set by non-

heuristics as LT (although Gelernter's powerful diagram-using heuristic has no close analogue in LT). Both LT and the Geometry Theorem Machine work backward. Because they do so, they both have available a powerful matching heuristic for determining which substitutions to make for the variables in the axioms. Both programs generate subproblems that are placed on a list where they can be examined to determine which subproblem should be attempted next. In both programs, the subproblems generated provide guarantees that a path (or, in some cases, a conjunction of paths) from the axioms to the final goal will be a valid proof. These heuristics are not only common to these two programs, but they are applicable to any realm of mathematics where proofs involve some kind of substitution of appropriate constants for variables. We will encounter these same heuristics again in other contexts.
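The subproblem-list scheme described here can be sketched abstractly. The axiom and rules below are hypothetical stand-ins; the sketch works backward from the goal, replacing each pending subproblem by the premises of a rule that concludes it, until everything is grounded in axioms.

```python
from collections import deque

# Hypothetical axioms and rules: each rule maps a conclusion to the
# list of premises that suffice to establish it.
AXIOMS = {"A"}
RULES = {"C": ["B"], "B": ["A"]}

def prove(goal):
    """Work backward from the goal, keeping pending subproblems on a list."""
    subproblems = deque([goal])
    while subproblems:
        sub = subproblems.popleft()     # choose the next subproblem to try
        if sub in AXIOMS:
            continue                    # grounded in an axiom
        if sub not in RULES:
            return False                # nothing concludes it: dead end
        subproblems.extend(RULES[sub])  # replace it by its premises
    return True

print(prove("C"))   # True: C reduces to B, which reduces to axiom A
```

Because every subproblem placed on the list comes from a rule whose conclusion it matches, the chain of reductions that empties the list is itself a valid (backward) proof, which mirrors the guarantee described in the text.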