doi:10.1068/p5973
Gerald Westheimer
Division of Neurobiology, 144 Life Sciences Addition, University of California at Berkeley, Berkeley,
CA 94720-3200, USA; e-mail: gwestheimer@berkeley.edu
Received 22 December 2007; in revised form 15 January 2008; published online 15 May 2008
Abstract. Modern developments in machine vision and object recognition have generated renewed
interest in the proposal for drawing inferences put forward by the Rev. Thomas Bayes (1701–1759).
In this connection the epistemological studies by Hermann Helmholtz (1821–1894) are often cited
as laying the foundation of the currently popular move to regard perception as Bayesian inference.
Helmholtz in his mature writings tried to reconcile the German idealist notions of reality-as-
hypothesis with scientists' quests for the laws of nature, and espoused the view that we ``attain
knowledge of the lawful order in the realm of the real, but only in so far as it is represented in the
tokens within the system of sensory impressions''. His proposal that objects are inferred from internal
sensory signals by what he called `unconscious inferences' has led to Helmholtz being regarded as
a proto-Bayesian. But juxtaposing Bayes's original writings, the modern formulation of Bayesian
inference, and Helmholtz's views of perception reveals only a tenuous relationship.
Figure 1. The Rev. Thomas Bayes, F.R.S. (1701–1759).
Figure 2. Exzellenz Hermann v. Helmholtz (1821–1894).
Sometimes, a more particular formulation is used for cases in which there is a set of
parameters w within the hypothesis H. For example, if H is the hypothesis that a Poisson
distribution is involved, and w is the variance, the equation takes the following form
$$P(w \mid E, H) = \frac{P(E \mid w, H) \times P(w \mid H)}{\sum P(E \mid w, H) \times P(w \mid H)}.$$
The term $P(E \mid w, H)$ is unproblematic and involves the calculation of an inverse probability,
viz the probability of the event E, given the hypothesis. Most of the discussion
has centered on the origin and ultimate source of the `prior'.
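The parametric form can be illustrated numerically. The following is a minimal sketch, assuming a discrete grid of candidate values for w and a flat prior P(w | H); the grid, the observed count, and the function names are illustrative assumptions and do not come from the paper.

```python
import math

def poisson_pmf(k, w):
    """P(E | w, H): probability of observing count k under a Poisson law with parameter w."""
    return math.exp(-w) * w**k / math.factorial(k)

def posterior_over_w(k, w_grid, prior):
    """Return P(w | E, H) for each candidate w on a discrete grid."""
    weighted = [poisson_pmf(k, w) * p for w, p in zip(w_grid, prior)]
    total = sum(weighted)  # the normalizing sum in the denominator
    return [v / total for v in weighted]

# Hypothetical setup: candidate parameters 1..10, flat prior, one observed count of 4.
w_grid = list(range(1, 11))
flat_prior = [1.0 / len(w_grid)] * len(w_grid)
posterior = posterior_over_w(4, w_grid, flat_prior)
best_w = w_grid[posterior.index(max(posterior))]
```

With a flat prior the posterior is simply the normalized likelihood, so any preference among values of w comes from the data alone; a non-flat `prior` list is what would change that.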
When applied to object recognition, the task is to decide whether a specific com-
ponent in a captured picture is in fact the image of a real object. In this connection,
the event is the presence in the image of a structure or a pattern that can be fully
characterized, for example by the number and value of pixels. The hypothesis would be
the existence of a specific object with a pre-defined structure in the real world. Enough
would have to be known about the actual image and each of the hypothesized objects to
calculate the probability that in the mode of transmission each of these objects would
generate such an image. Also required are prior estimates of the probability of the objects
being in fact out there in the real world.
With this information it is possible to use the above equations to calculate the
posterior probability that what has been received is in fact the image of a specific object.
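The calculation just described can be sketched for a finite set of candidate objects. The object names, likelihoods, and priors below are hypothetical numbers chosen only to show the mechanics; in a real application each would have to be estimated as outlined above.

```python
def posterior_over_objects(likelihood, prior):
    """Bayes's rule over a finite set of candidate objects.

    likelihood[name] = P(image | object), prior[name] = P(object).
    """
    joint = {name: likelihood[name] * prior[name] for name in likelihood}
    evidence = sum(joint.values())  # total probability of the received image
    return {name: value / evidence for name, value in joint.items()}

# Hypothetical numbers, for illustration only.
likelihood = {"cup": 0.30, "bowl": 0.10, "clutter": 0.01}
prior = {"cup": 0.05, "bowl": 0.05, "clutter": 0.90}
posterior = posterior_over_objects(likelihood, prior)
most_probable = max(posterior, key=posterior.get)
```

In this toy case the object that best explains the image also ends up most probable, but its margin over "clutter" is set largely by the prior, which is why the origin of the prior attracts so much discussion.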
``Given the number of times in which an unknown event has happened and failed: required
the chance that the probability of its happening in a single trial lies somewhere between
any two degrees of probability that can be named.'' (page 298)
What follows has been described by one of its foremost interpreters as ``one of the
most difficult works to read in the history of statistics'' (Stigler 1982). Bayes deals
with the location along a distance AC of the point in which a thrown ball comes to
rest, given that ``there shall be the same probability that it rests upon any one equal
part of the plane as another, and that it must necessarily rest somewhere upon it''.
First, a ball is thrown once, landing in O (figure 3), but the position of O is not
known. What is then determined, though, is that a second similar ball, when thrown
n times, landed p times to the right of O and q = n − p times to the left. Bayes is
interested in the probability of O lying between any arbitrary points F and B along AC.
In each ball throw ``there shall be the same probability that it rests upon any one equal
part of the plane as another'' and ``the happening or failing of an event in different
trials are so many independent events''.
Figure 3. Bayes considers the situation of a ball being thrown to land anywhere with equal
probability along the distance AC, calling the event a success if it comes to rest to the right of a
fixed point O whose position is unknown. The law governing the event is the binomial distribution
with parameter x if one stipulates the event to have taken place on p occasions out of
p + q = n trials. Bayes then goes on to calculate the chance of p successes in n trials for the
interval between two points F and B within AC. It is the ratio of the area under the likelihood
distribution of binomials between parameters f and b to that under the whole distribution
between A and C.
Bayes calls the situation in which a ball lands to the right of O ``the happening
of the event M'' and argues that its occurrence p times in p + q = n trials is governed
by the binomial distribution $\frac{n!}{p!\,q!} \times x^p \times (1 - x)^q$, where x is the ratio of the distances
AO and AC. In modern parlance, the ball coming to rest to the right of O is the event
and the binomial distribution is the hypothesis. Bayes then presciently used what is
now known as the inverse approach: For any other point along AC, say F, the chance
that a ball tossed n times would on p occasions land to the right of F is also governed
by the binomial, now $\frac{n!}{p!\,q!} \times f^p \times (1 - f)^q$, f being the ratio of AF to AC. Hence
the desired answer, viz the chance that O lies between F and B, is given by the sum
of the binomials with the parameters of all points lying between F and B. In order to
arrive at a probability, ie a value between zero and one, this would have to be divided
by the sum for all points between A and C. According to Bayes, this is also the
probability that a single throw from the same ensemble lands in that interval.
Bayes's approach is to identify the event E and the governing hypothesis w, H
(that the chance of a tossed ball landing to the right of any dividing point of a line
of unit length is given by the binomial distribution with the parameter w, the ratio of
the dividing point distance to the length of the whole line) and then to compute
$\sum P(E \mid w, H)$, the chance of the event occurring for the stipulated parameter range of
the hypothesis. In other words, Bayes's operations take place in the realm of likelihood.
Was Helmholtz a Bayesian? 645
Just to illustrate what Bayes was after: If 5 out of 25 ball tosses gave a positive
answer to the question whether the event has occurred, what is the probability that o
lies in the middle 10% of the line AC, ie between f = 0.45 and b = 0.55?
If $B(x; n, p) = \frac{n!}{p!\,(n - p)!} \times x^p \times (1 - x)^q$, with q = n − p, then the required probability is
$$\frac{\sum_{x = 0.45}^{x = 0.55} B(x; 25, 5)}{\sum_{x = 0}^{x = 1.00} B(x; 25, 5)}.$$
The argument proceeds along the following lines: For x = 0.45, what is the probability
that a single throw lands to the right of it? Here n = p = 1 and B(0.45; 1, 1) = 0.45.
For one throw out of two it is B(0.45; 2, 1) = 0.495, because either both throws go into
one or the other division of the whole distance, with probabilities of $(0.45)^2$ and $(0.55)^2$
respectively, or one throw goes into each, with probability 0.45 × 0.55 = 0.2475, which
can occur in two ways. Finally, for exactly 5 throws out of 25, the binomial value
B(0.45; 25, 5) is equal to 0.0063; $\sum B(x; 25, 5)$ for the range 0.45 to 0.55 computes to
0.0248 and for the full distance 1 ≥ x ≥ 0 to 3.846. Hence, in this particular application
of Bayes's problem, namely `If 5 out of 25 tosses had a positive outcome, what is
the probability of 0.55 ≥ o ≥ 0.45?' the answer is 0.0248/3.846 = 0.0065, or 1 in 154.
On the other hand, for 0.3 ≥ x ≥ 0.2 it is 0.453 or 1 in 2.2.
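The quoted figures can be checked numerically. The sketch below assumes that the sums are taken over x values spaced 0.01 apart with both endpoints included; the text does not state the step, but this choice gives sums close to the quoted 0.0248 and 3.846. The function names are illustrative.

```python
from math import comb

def binom(x, n, p):
    """B(x; n, p): chance of exactly p successes in n trials with success probability x."""
    return comb(n, p) * x**p * (1 - x)**(n - p)

def grid_sum(lo, hi, n, p, steps=100):
    """Sum B(x; n, p) over grid points x = k/steps for k = lo..hi inclusive."""
    return sum(binom(k / steps, n, p) for k in range(lo, hi + 1))

def chance_o_between(lo, hi, n=25, p=5, steps=100):
    """Ratio of the binomial sum over [lo/steps, hi/steps] to the sum over [0, 1]."""
    return grid_sum(lo, hi, n, p, steps) / grid_sum(0, steps, n, p, steps)

single = binom(0.45, 25, 5)        # the text quotes 0.0063
middle = chance_o_between(45, 55)  # the text quotes 0.0065
lower = chance_o_between(20, 30)   # the text quotes 0.453
```

Indexing the grid by integers (`k / steps`) rather than accumulating floating-point increments keeps the endpoints exactly where they are meant to be. Under the assumed flat prior the full-range sum times the step approximates 1/(n + 1), which for n = 25 is 1/26, consistent with the quoted 3.846.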
But it is the prior that is the crucial component of all the Bayesian discussions. In
Bayes's words, in arriving at the value of his expectation he wanted to consider whether
there was anything that may give ``reason to think that, in a certain number of trials,
[the event] should rather happen any one possible number of times than another''.
Even here, unfortunately, Bayes's writing has led to debates. Having called the
outcome of the ball-throw the ``event M'' in prop. 9 where he lays out the binomial
proposition, he goes on to say:
``In what follows therefore I shall take for granted that the rule given concerning the event M
in prop. 9 is also the rule to be used in relation to any event concerning the probability of
which nothing at all is known antecedently to any trials made or observed concerning it.''
(cited in Barnard 1958, page 306).
Such a phrasing might give the impression that Bayes rejected the idea of a prior, but
this is not the case, for he states explicitly concerning both the fiduciary and test throws
``that if either ... be thrown there shall be the same probability that it rests upon any
one equal part of the plane as another'', ie the prior is flat.
Stigler, who subjected the situation to the most thorough critical analysis, concluded
that for Bayes the prior is associated with the locations of the subsequent throws and
not those leading to the formulation of the hypothesis H, as is now embodied in the
canon of Bayesian inference. As it happens, in the situation analyzed by Bayes, both
have a flat prior and the distinction is not material. However, the substitution of P(E)
for P(H) in equation (1) would reformulate Bayes's theorem into
$$P(H \mid E) = \frac{P(E \mid H) \times P(E)}{\sum P(E \mid H) \times P(E)}$$
and deprive it of its main attraction, namely the ability to assign a different probability
to one hypothesis or model, or one range of parameters within a hypothesis, than to
others. The starting point of modern Bayesian analyses is a particular set of observa-
tions and the assignment of higher probabilities to some than to others surely must
relate to hypotheses and not observations.
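The loss can be made explicit with a short derivation. It assumes that the sum in the denominator runs over the competing hypotheses, written H' here, and that P(E) does not depend on which hypothesis is being considered; both are assumptions of this sketch rather than statements in Bayes's essay.

```latex
P(H \mid E)
  = \frac{P(E \mid H)\, P(E)}{\sum_{H'} P(E \mid H')\, P(E)}
  = \frac{P(E \mid H)}{\sum_{H'} P(E \mid H')}
```

Because P(E) is a common factor, it cancels, and what remains is the normalized likelihood. This is the same result that a flat prior over hypotheses would give, so the substituted form cannot favor one hypothesis over another beyond what the likelihood already does.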
Thus it seems that, if being a Bayesian means being a strict follower of Bayes, it does
not necessarily imply unconditional acceptance of the current interpretation of all the
terms of equation (1), even if there were no other challenges to their application.
Bayes was ignored for at least a century and a half. When his work was resur-
rected, it was in the context of arguments between warring factions in statistics and
probability theory who tried to come to terms with the extremely deep problem of
defining probability. The situation is one in which Einstein's famous dictum (made
in connection with geometry) might be rephrased: Insofar as probability definitions
are mathematical, they do not refer to reality; insofar as they refer to reality they are not
mathematical. In mathematics one looks for proofs of convergence of infinite series.
But in situations where probability is invoked, even in as long a sequence of dice throws
(or balls drawn from an urn with replacement) as might be contemplated, the outcome
is ultimately uncertain and hence unsatisfactory in the strictest logico-deductive envi-
ronment of mathematics. Again and again, in probability theory and statistics, recourse
has to be sought in how the individual regards chance and would deal with it. Bayes
writes about expectation, gain and loss, and it is not by accident that the one surviving
original document of his was found in the archives of an insurance company.
4 Helmholtz's epistemology
No single person, before or since, contributed more to the knowledge of the human
sensory apparatus than Hermann Helmholtz, and throughout his career he kept concern-
ing himself with questions of the origin of our visual experiences. He first broached
the subject in an 1854 lecture, as a 34-year-old beginning professor of physiology in
Königsberg, and returned to it in a variety of settings till almost the last essay he wrote
during the year of his death in 1894. The introduction to part III of the first edition
of the Handbuch der physiologischen Optik, which forms the basis of the English version,
contains a sentence which best encapsulates the views most widely attributed to him.
``The general rule according to which visual representations determine themselves ... is that
we always find present in the visual field such objects as would have to exist in order for
them to produce the same impression on the neural apparatus under the usual normal
conditions of the use of our eyes.'' (Helmholtz 1867, page 428/1910, page 4) (2)
In gauging the evolution of Helmholtz's views, however, it is important to realize
that this passage was omitted in the second edition (published in 1895, but not avail-
able in an English translation). In its place Helmholtz wrote an extensive revision of
the section which is more in line with the pivotal formulations to which we now turn.
The roots of Helmholtz's epistemological writings are twofold. First, he was one
of the ablest and most consequential practitioners of mid-nineteenth century natural
science (and by far the most successful of those in sensory physiology) who erected a
securely structured body of knowledge based on relatively uncomplicated empirical
observations and deductive rules of manageable mathematical complexity. Second, he
was firmly grounded in the academic and cultural (and later even the industrial) elite
of that other solidly and successfully constructed element of nineteenth-century Euro-
pean history, the Prussian state. The two streams converged in 1878, when as Rector
of Berlin University he took the occasion of the solemn Founder's-Day address to
present the credo of his epistemological system. The speech represented a synthesis of
the aims of a working scientist in the area of sensory perception and those of a loyal
member of an intellectual community in which Kant and Goethe were revered.(3)
(2) The wording of the passage differs somewhat from the one in Southall's English translation which,
as many of the English versions of Helmholtz's writing, does not always capture the subtleties of
Helmholtz's phrasing. All quotes in this paper have been rendered into English by the author.
(3) Helmholtz's standing in the German Imperial establishment is attested to in a memo from the Min-
ister of Education to Chancellor Bismarck relating to the appointment to the presidency of the new
Imperial Bureau of Standards. Helmholtz should maintain his association with the University because
it ``would retain a man who has been viewed for many years as its scientific head, who contributed
more than anyone else to smooth out ... the contrast between the natural sciences and the humanities,
and who, in the arena of politics, fostered the moderately conservative tendency in which Berlin Univer-
sity is well in advance among German Universities'' (Koenigsberger 1903, page 353). In 1883 he was
raised to the hereditary nobility and in 1891 he became Privy Councilor to be addressed `His Excellency'.
Its very title affords an illustration of the difficulties encountered in viewing Helmholtz
entirely via the English version of his writings. Helmholtz toyed with various other titles
(Koenigsberger 1903), such as ``Prinzipien der Wahrnehmung'' (Principles of perception),
``Was ist wirklich?'' (What is real?), and a favorite citation from Goethe's Faust ``Alles
Vergängliche ist nur ein Gleichniss'' (All that is transitory is only a metaphor). In the end
he chose ``Tatsachen in der Wahrnehmung'' which surely should be rendered ``Facts in
perception'' and not, as a prominent version would have it, ``The facts of perception''.
He quickly cut to the chase:
``What is truth in our percepts and thoughts? In what way do our ideas correspond to reality?
Philosophy: it tries to sift what in our knowledge and ideas is due to the influence of
the material world in order to establish what belongs to the purely innate activity of the
mind. Science: it, to the contrary, tries to sift what is definition, notation, manner of
representation, hypothesis, in order to lay out, in a pure form, what belongs to the world
of reality whose laws it seeks.'' (Helmholtz 1878/1903, page 218)
In order to mediate between them, Helmholtz first of all recognizes and accepts
the validity of the program of the two camps over which he as Rector was presiding.
And no one was better suited for the mediating task than this iconic presence in
natural science who at the same time could boast as family acquaintance the founding
philosopher of German idealism, and predecessor in the rectorate a half-century
earlier, Johann Gottlieb Fichte. Helmholtz was driven to his cogitations when, after
sequentially analyzing the optical, anatomical, physiological, and psychophysical stages
of vision, he confronted the nature and origin of perception. He realized that the
physicalist/materialist approach that guided him through the physics and biology thus
far had come to the end of its rope, here in the borderland between the material and
mental worlds. Seeking guidance in Kant's writing, he could not refute that philoso-
pher's proposition that a deep chasm divides the concept of an object's representation
in the mind from that object's hypothesized existence in the outside world. Helmholtz
was a medical graduate, a sometime Professor of Physiology at Heidelberg, now head
of one of the great departments of Physics, and he owed it to his constituents in these
professions and to his own scientific achievements to hold fast to the concept of a
reality in which the laws articulated by natural science held sway. At the same time he
was thoughtful and thoroughgoing and could not, therefore, ignore Kant's teaching.
The reconciliation is elegant:
``The distinction between thought and reality is possible only when we know how to distin-
guish between what the `I' can change and what the `I' cannot change. ... What we then attain
is knowledge of the lawful order in the realm of the real, but only in so far as it is
represented in the tokens within the system of sensory impressions.'' (Helmholtz 1878/
1903, page 242)
The phrasing becomes more impressive when it is realized that `I' and `not-I' (`Ich' and
`Nicht-Ich') are essential formulations in Fichte's writing, who counterposed the incarna-
tions of mind, will, faith, morals, to those of the material and of nature (Schmidt 1969)
and who, as an idealist philosopher, even went so far as to posit that the `I' generates
the `not-I'. All the ingredients of Helmholtz's argument are contained in the paragraph:
there is a realm of the real with its lawful order; what we know about it is represented
by tokens within the system of sensory impressions; and finally, our knowledge hinges
on active exploration: reality is what remains invariant when the expected changes
due to willed movements are factored out from the sensory impressions.
5 Helmholtz on inference
The single most enveloping aspect of Helmholtz as a scientist and epistemologist was a
belief in empiricism which he espoused throughout his career. Specifically, he continued
to assert that our knowledge of the real world is derived by using our motor system as
exploring organ to deduce invariances by trial and error. Generating a movement is the
activity of the `I' which keeps track of the instructions. (A modern term is efference
copy, meaning the record that is maintained of the outgoing or efferent signals from
the central nervous system to the muscles.) The associated change in sensory signals is
registered, and inferences can be drawn from a `before' and `after' comparison. Through
knowledge of the actuated movement it can be determined what in the changes of the
sensory impressions can be ascribed to the movements; what remains, by inference,
is of the real world. Such inferences are the same as the deductions in the realm of
ordinary logic, only here we are not aware of them and hence the appellation `unconscious
inferences'.
Helmholtz's clearest and most specific illustration of how he imagines this process
to operate is in the arena of retinal local signs. He agrees with Lotze that each location
of the retinal periphery has its own spatial value. In the fully developed organism a
nexus has been established between this spatial value and the eye rotation necessary to
foveate a target imaged on that peripheral retinal location. When a willed eye rotation
has been executed, and both its extent and the shift in retinal location have been
registered internally, the spatial location and extent of the object can be inferred.
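The bookkeeping in this account can be shown with a toy model. The sketch below is a one-dimensional illustration only: angles are in degrees, an eye rotation of +r is assumed to shift the image of a stationary object by −r on the retina, and the function names are hypothetical. It is not Helmholtz's own formalism.

```python
def infer_object_shift(retinal_before, retinal_after, eye_rotation):
    """Return the part of the retinal change not explained by the willed eye rotation.

    Assumes a rotation of +r moves a stationary object's image by -r on the retina.
    """
    observed_change = retinal_after - retinal_before
    expected_change = -eye_rotation  # predicted from the record of the motor command
    return observed_change - expected_change

def object_direction(retinal_position, eye_position):
    """Direction of the object relative to the head: eye position plus retinal local sign."""
    return eye_position + retinal_position

# A target imaged 10 degrees into the periphery is foveated by a 10-degree rotation.
residual = infer_object_shift(retinal_before=10.0, retinal_after=0.0, eye_rotation=10.0)
before = object_direction(retinal_position=10.0, eye_position=0.0)
after = object_direction(retinal_position=0.0, eye_position=10.0)
```

Here `residual` is zero and `before` equals `after`: once the change expected from the willed movement is factored out, nothing is left to ascribe to the object, and its inferred direction is unchanged.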
Helmholtz here has satisfied the imperatives both of the natural scientist of his day,
to whom a real world was a given, and of Kantian thinkers, for whom the `Ding an
sich' is in principle unreachable: the real world was a hypothesis. As a practicing
scientist he is, of course, obliged to argue that there is nothing wrong with hypotheses,
and in fact the whole enterprise of science rests, quite firmly, on this premise:
``In its essence, each properly constituted hypothesis proposes a more general law of
phenomena than we had obtained by immediate observations up to then, and is an attempt
to rise to an ever more general and comprehensive set of laws. Any new facts asserted
by such a hypothesis must be tested and affirmed by observations and experiments.''
(Helmholtz 1878/1903, page 242)
He will not be pushed into a corner by absolutists and pure materialists:
``Any reduction of phenomena to underlying causes and forces asserts that we have found
something permanent and final. Such an unconditional statement is, however, never
justified; the incompleteness of our knowledge does not allow this, nor does the nature
of conclusions from inferences on which, right from the beginning, our perceptions of the
real are based.'' (page 243)
Nor is he unaware of the problem of causality that had exercised Hume's mind:
``Every inductive inference is based on the trust that previously observed lawful behavior
will be found valid in all cases that have yet to be observed. This is the trust in the
lawfulness of all phenomena. But that lawfulness is the condition of comprehensibility. ...
The law of causality expresses a trust in the complete comprehensibility (vollkommene
Begreifbarkeit) of the world. Comprehending, in the sense in which I have described it,
is the method used by our thinking to subdue the world, to order facts, to predict the future.
It has the right and the duty to extend its method to occurrences, and it has indeed
harvested great results in this way. For the utilization of the law of causality, however, we
have no guarantee other than its success.'' (page 243)