
Gauss’s and Laplace’s Orbit Determination Methods and the Hunt for Least Squares


Melody Dodd
Advised by Dr. Donald A. Teets

May 2010

Introduction

The discovery in 1801 of the dwarf planet Ceres not only marked a significant
event in the history of astronomy, but also led to a remarkable scientific
and mathematical drama. When Ceres was discovered, Pierre-Simon
Laplace was the preeminent authority of the time in celestial mechanics,
and was in the prime of his mathematical career. Laplace had, years earlier,
developed a method for determining the orbit of a celestial object based on
observational data. Yet it seems that his methods were unable to produce an
accurate orbit for Ceres when such a calculation was desperately needed by
the scientific community. It was, instead, the young and relatively unknown
Carl Friedrich Gauss who saved the day, thereby gaining international fame
and establishing himself as a highly respected member of the mathematical
community.
Natural questions arise concerning these events. Why had Laplace’s well-
known and well-respected method apparently failed? What exactly made
Gauss’s method superior to Laplace’s? Surprisingly, these questions have
gone largely unanswered for over two centuries, because the ever
enigmatic Gauss never published his original orbit determination methods,
and it seems that his notes regarding his earliest work on Ceres were not
preserved.
Further questions arose a few years later when Adrien-Marie Legendre
published the method of least squares. Gauss promptly claimed that he
had used least squares in the computation of Ceres’ orbit, thereby asserting
priority for the discovery. Gauss, having failed to keep adequate records, ap-
parently had no direct evidence to support his claim, but he maintained his
position nonetheless. The issue was never settled between the two mathe-
maticians, and indeed, this controversy remains unresolved within the math-
ematical community even today.
Can we ever have answers to these questions? Is it possible to sift through
the evidence available and put to rest a two-hundred-year-old unsolved mys-
tery? Frustratingly, the answer appears to be no: there is simply not enough
data available to definitively answer these mathematical and historical rid-
dles. However, there is much to be learned from the process of examining
what evidence we do have. The mathematics involved is engaging, and the
story is entertaining in its own right. In this paper, we examine the events
surrounding the discovery of Ceres and take a careful look at the obtainable
data, both from a historical and a mathematical perspective, to gain new in-
sight into an old mystery. The reader may also find new appreciation for the
genius of Laplace, Legendre, and Gauss, who came to be connected through
an exceptional set of circumstances, all beginning with the discovery of a
little planet called Ceres.

Background: The Discovery of Ceres

Ceres was discovered by Joseph Piazzi, the Italian astronomer and founder
of the Palermo Observatory, on January 1, 1801. Piazzi, who sighted the
object quite by accident while collecting data for a star catalogue, at first
suspected he had discovered a comet. He soon became hopeful, however,
that the object was something more exciting, for the scientific community
in Europe, headed by Baron Xavier Von Zach, had in fact been actively
searching for a “missing” planet between the orbits of Mars and Jupiter.
Piazzi, reluctant to assert whether or not the “moving star” he had ob-
served was in fact the long-searched-for planet, kept his discovery secret
initially. He did eventually notify two of his colleagues regarding the discov-
ery, but did not immediately give them observational data so that they or
others could themselves observe Ceres [2].
Ceres, which indeed orbits between Mars and Jupiter, remained visible
in the night sky for a period of 41 days. Piazzi was able to record a total of
19 complete observations during this time, but Ceres disappeared into the
glare of the sun after his final observation on February 11, 1801. Further
observation then became impossible as Ceres passed behind the sun, and
the newly discovered object was effectively lost, with Piazzi being the only
human who had ever observed it [2], [10].
News of the object’s discovery quickly swept the scientific community in
Europe, and leading astronomers and mathematicians launched an earnest
effort to relocate Ceres. Piazzi’s observations were made available and were
circulated within the scientific community. Beginning in August of 1801, as-
tronomers began systematically watching the skies for the object’s reappear-
ance. To assist their efforts, astronomer and mathematician Jean Charles
Burckhardt was given the task of computing an orbit for Ceres from Pi-
azzi’s observations, using known orbit determination techniques. Burck-
hardt, along with others working independently, indeed produced prelimi-
nary orbits, but astronomers were unable to locate Ceres using ephemerides
derived from such computations [2], [10].
By September the situation was becoming tense as the best efforts of the
scientific and mathematical community still yielded nothing. Piazzi’s reputation
was on the line, for he alone had originally observed the object, and
doubts began to arise about Piazzi’s credibility and the accuracy of his
observational techniques. Von Zach complained that Piazzi had, by keeping his
discovery private too long, precluded the possibility of other astronomers
helping to understand the object and its movements. Burckhardt, likely
troubled that his computations had failed to be useful, apparently asserted
that Piazzi’s observations must have been inaccurate [2].

Gauss Steps In
Carl Friedrich Gauss, then 24 years of age, became absorbed with the
problem at about this time. He had, by happy coincidence, been working
on topics dealing with celestial mechanics at the time of Ceres’ discovery.
Having obtained a full set of Piazzi’s observations after they were published
in the September issue of the Monatliche Correspondenz, Gauss became de-
termined to produce an accurate orbit for Ceres. In the space of just a
few months, he succeeded in his efforts, computing an orbit that was much
different from ones computed earlier by Burckhardt and others. This new
orbit was immediately employed to produce new ephemerides for Ceres, and
Ceres was then successfully relocated on December 7, 1801, in precisely the
position predicted by Gauss’s computations [2], [10].
Gauss’s achievement earned him instant and international acclaim; the
discovery of Ceres effectively launched Gauss’s reputation as a mathemat-
ical genius. To relocate Ceres when hope was nearly lost, almost a year
after its initial sighting and using only a tiny amount of observational data,
was indeed a remarkable feat. Despite the magnitude of his accomplish-
ment, however, Gauss never published the orbit determination methods he
so painstakingly developed, and apparently saved little of his original work
from 1801. His later methods for orbit determination, published in his
famous Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem
Ambientium (Theory of the Motion of the Heavenly Bodies Moving about the
Sun in Conic Sections) of 1809, were, by Gauss’s own admission, vastly
different from the techniques he employed to relocate Ceres [3], [10].
Our knowledge of Gauss’s work on the Ceres problem is therefore largely
incomplete. In fact, we would likely know nothing of Gauss’s original orbit
determination methods had it not been for Gauss’s friendship with German
physician and astronomer Wilhelm Olbers. Olbers had been intimately in-
volved in the efforts to relocate Ceres, and was intensely interested in the
mathematics utilized in Gauss’s work. In a letter dated August 6, 1802,
Gauss described his method to Olbers in some detail, though many points
were left undeveloped and somewhat vague. Fortunately for us, this letter
was preserved by Olbers and was eventually published in 1809 by von Zach
under the title Summarische Übersicht der zur Bestimmung der Bahnen der
beiden neuen Hauptplaneten angewandten Methoden (Summary Survey of
the Methods used to Determine the Orbits of the Two New Major Planets),
[4], [10].

A Mathematical Mystery
The circumstances surrounding the discovery of Ceres raise some interesting
questions. The apparent inability of the finest scientific minds of
the day to compute a correct orbit for Ceres is rather surprising in some
respects, given that methods for determining the orbit of a celestial object
were well known at the time. Pierre-Simon Laplace, age 51 when Ceres was
first sighted by Piazzi, was a leading expert on the subject. Laplace had
himself developed and described such an orbit determination method in his
1780 Mémoires de l’Académie Royale des Sciences de Paris, which later
became a chapter in his crowning achievement, the Mécanique Céleste, his
monumental treatise on celestial mechanics [6], [5].
Laplace was very active in celestial mechanics at the time of Ceres’ dis-
covery and was, in fact, in the midst of writing and compiling the Mécanique
Céleste. Laplace’s method was well-known and readily available when Ceres
was discovered, not to mention the fact that Laplace himself was at the top
of his career. While no evidence seems to remain of what method Burck-
hardt and others employed in their efforts to compute Ceres’ orbit, it seems
reasonable that they were using Laplace’s previously published method, or
some variation of this. Why then, armed with the genius of Laplace, did
astronomers and mathematicians of the day fail to compute an accurate
orbit for Ceres? More importantly, how had the young Gauss succeeded
when the most determined efforts of the best minds had come to nothing?
These are questions that have never been fully answered, due to our lack of
information concerning the intricacies of Gauss’s method.

The Least Squares Controversy


The rediscovery of Ceres ended several months of tension in the scien-
tific community and saved Piazzi’s reputation, but the drama was not over
for Gauss. Controversy and still more unanswered questions ensued several
years later, in a way that Gauss could not have foreseen. In 1805, Adrien-Marie
Legendre published his Nouvelles Méthodes pour la Détermination
des Orbites des Comètes (New Methods for the Determination of the Orbits
of Comets), a work that contained, among other new techniques, the first
documented account of the method of least squares [7].
Gauss obtained a copy of Nouvelles Méthodes not long after its publication,
and quickly made a connection between techniques that he had been
using in his orbit determination methods and the least squares techniques
contained within Legendre’s work. In fact, in a letter from Gauss to Olbers,
dated July 30, 1806, Gauss states,

[T]he principle which I have used since 1794, that the sum of
squares must be minimized for the best representation of several
quantities which cannot all be represented exactly, is also used
in Legendre’s work and is most thoroughly developed [7].

Gauss was at the time working assiduously on his Theoria Motus, and
when the work was published in 1809 Gauss included his own version of the
method of least squares. In Theoria Motus, Gauss acknowledged that Legendre
had priority in publishing the least squares technique, but insisted that it
was he, rather than Legendre, who first discovered the method, stating, “our
principle, which we have made use of since 1795, has lately been published
by Legendre,” [3], [7].
Legendre, who received a copy of Theoria Motus soon after its publi-
cation, was understandably annoyed by Gauss’s claims. In a letter from
Legendre to Gauss dated May 31, 1809, Legendre expresses his dismay with
Gauss’s assertions in Theoria Motus, arguing,

There is no discovery that one cannot claim for oneself by saying
that one had found the same thing some years previously; but
if one does not supply the evidence by citing the place where
one has published it, this assertion becomes pointless and serves
only to do a disservice to the true author of the discovery [7].

Gauss, who often did not keep clear and accurate records, apparently
never managed to provide Legendre with the evidence requested, stating in
a letter to Olbers in 1812, “The papers have now been lost, in which I ap-
plied that method in earlier years, e.g. in Spring 1799,” [7]. In fact, though
Gauss averred for the remainder of his life that he deserved credit for the
discovery, no direct evidence ever emerged of Gauss’s use of least squares
prior to the method’s publication by Legendre. The two mathematicians
thus became locked in a decades-long feud over the issue, which was never
resolved.
What does all this have to do with Ceres? In the 1812 letter to Ol-
bers referenced above, in the sentence before Gauss’s statement of the lost
records, Gauss offers us an intriguing bit of information. Gauss writes, “In
Autumn 1802 I entered in my astronomical notebook the eighth set of ele-
ments of Ceres, found by the method of least squares,” [7]. Gauss neglects
to describe anywhere, however, the exact way in which the method was em-
ployed in his computations of Ceres’ orbit, or whether the method was used
in his original 1801 computations for Ceres.

Unanswered Questions
It seems we are left with several unanswered questions. What differences
were there between Gauss’s and Laplace’s orbit determination methods that
caused Gauss’s to emerge as superior to Laplace’s in 1801? If what Gauss
claimed was true, and he had been using least squares since 1795 or so, did
he put the technique to service in his original computations of Ceres’ orbit?
Is this, perhaps, what made the difference between the two methods in the
end? Most importantly, is there any way to gain additional insight on any
of these questions?
While it may be unlikely that we will ever fully unravel the mysteries
surrounding Ceres’ discovery, we can learn a great deal using the facts that
we do have. We have Laplace’s method, published in full, in both his Mémoires
and the Mécanique Céleste. We also have Gauss’s letter to Olbers of
1802, which was later published as the Summarische Übersicht, in which
Gauss describes at least some portions of his method in detail. These
resources, as the reader will discover, can go a long way toward helping us gain
a better understanding of the issues at hand. We will thus end the historical
discussion here, and move on into more mathematical topics, beginning first
with a short review of the basics of orbit determination.

A Brief Lesson in Orbit Determination

This portion of the discussion is intended to provide a sketch of the
basics of planetary orbits and the process of determining an orbit from
observational data, with the purpose of acquainting the reader with some
fundamental terminology and concepts. If the reader wishes to obtain a more
thorough understanding of these topics, there are a great many excellent
resources available to accomplish this end, including several listed in the
references for this paper.

Information Provided from Observational Data


When Piazzi observed Ceres through his telescope, he was able to record
its location in the sky, relative to the Earth, with great accuracy. While there
are a few different systems used to describe an object’s apparent position
in the sky, the method we shall use throughout this paper is that shown in
Figure 1. As can be seen in the figure, the angles λ and β are measured from
the Earth’s center with respect to a three-dimensional coordinate system.
This system, known as the heliocentric ecliptic coordinate system, places the
sun at the origin, defines the XY plane to coincide with the ecliptic plane
(the plane of Earth’s orbit), and places the X axis along the line joining
the sun and the Earth on the first day of spring in the northern hemisphere,
with the positive direction pointing from the Earth toward the sun on that day.
The angles λ and β are respectively called the geocentric ecliptic longitude
and latitude, and it is these angles that can be determined directly from
observational data. When Piazzi discovered Ceres in 1801, these angles
were recorded for the 19 original observation times and were published in
the Monatliche Correspondenz. These angles and the corresponding times
were all the information that Gauss had when he computed Ceres’ orbit.
What the astronomer is unable to determine from observational data
alone is the object’s distance from the Earth. Essentially, the angles (λ, β)
together define the direction of the vector extending from Earth’s center to
Ceres’ center, but we do not know this vector’s magnitude. Furthermore, we
do not know Ceres’ distance from the sun at the observation time. If either of
these quantities were known, we could obtain the other from geometry. Any
one (λ, β) pair corresponding to an observation time therefore can provide
information about an object’s angular position with respect to the Earth at
that time, but that is all. To obtain this missing distance information is the
primary goal of both Gauss’s and Laplace’s orbit determination methods.

Figure 1: Geocentric Ecliptic Longitude and Latitude (λ, β).

Determining Distance Information: Two Options


To compute distance information, we need more than one of these (λ, β)
pairs, corresponding to different observation times. How many of these
pairs, then, are needed? Rather surprisingly, both Gauss’s and Laplace’s
basic methods rely on only three initial pairs; the fact that we can obtain
useful information about an object’s orbit from such a small amount of ob-
servational data is somewhat remarkable. These pairs will be designated
(λ1 , β1 ), (λ2 , β2 ), and (λ3 , β3 ), corresponding to observation times t1 , t2 , and
t3 .
Laplace and Gauss each solve the problem of obtaining this distance
information in very different ways, which shall be described in more detail
shortly. Laplace’s method takes the three (λ, β) pairs and derives from these
the heliocentric position vector r and the velocity vector v corresponding
to the middle observation time t2 , as shown in Figure 2. In this setup, the
vector r is defined to originate at the center of the sun and terminate at
the center of Ceres, with v being the corresponding instantaneous velocity
of Ceres. Gauss’s method, on the other hand, uses the same three (λ, β)
pairs to compute two separate heliocentric position vectors, r1 and r3 , cor-
responding to the observation times t1 and t3 , as shown in Figure 3.

Figure 2: Results Obtained using Laplace’s Method: (r, v)

Figure 3: Results Obtained using Gauss’s Method: (r1 , r3 )

Once we have either the (r, v) pair or the (r1 , r3 ) pair, we can proceed
with the computation of the orbit using methods developed by Newton,
Kepler, and others. These methods were established long before the time
of Gauss and Laplace, and were considered standard problems in celestial
mechanics in 1801. In the next few paragraphs we give an overview of
these processes, but a great many details have been omitted for the sake of
brevity. It should be noted that a complete description of these methods
can be found in many publications, and the interested reader is very much
encouraged to consult these sources for more information regarding the de-
tails of orbit computations. Recommended reading includes “Computation
of Planetary Orbits” and “The Discovery of Ceres: How Gauss Became Fa-
mous,” both by Teets and Whitehead, and Fundamentals of Astrodynamics
by Bate, Mueller, and White [1], [9], [10].

Computation of the Orbital Elements


Whether we derive the (r, v) pair by Laplace’s method or the (r1 , r3 )
pair by Gauss’s method, the next step in the orbit determination process
is to compute the orbital elements. These are a set of parameters that
completely describe the orbit of a celestial object. As the reader most likely
knows, planetary orbits are ellipses with the sun at one focus, as given by
Kepler’s First Law of Planetary Motion. The first orbital parameter, a, is
simply the semi-major axis length of the orbit. Next, the eccentricity e is
just the eccentricity of the ellipse measured in the usual way. Together, the
elements a and e describe the shape of the orbit.
To orient the ellipse in our heliocentric ecliptic coordinate system, we
need three angles, as can be seen in Figure 4. The first of these angles is
the inclination, denoted i, which measures the angle between the planet’s
orbital plane and the ecliptic (XY ) plane. The angle denoted Ω is measured
from the positive X axis to the line of nodes (the line of intersection of
the orbital plane and the ecliptic plane), and is called the longitude of the
ascending node. The third angle, ω, is measured from the line of nodes to
the major axis of the ellipse, and is known as the argument of perihelion.
The perihelion position, which is intersected by the major axis, is the point
on the orbit where the planet is closest to the sun. This brings us to the
final element, τ , the time of perihelion passage, which determines the
object’s position on the orbit at various times.
The matter of computing the orbital elements from either (r, v) or
(r1 , r3 ) is a straightforward computation from start to finish, relying on
Kepler’s laws, geometry, and solar data. The whole process, however, is
rather lengthy to describe in any detail, and is not the focus of this paper.
For this reason, the computations are here omitted, and the reader is once
again invited to pursue further reading such as the previously mentioned
sources for a more thorough treatment of these methods.
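
Although the full element computation is omitted here, a small part of it is easy to illustrate. The following Python sketch is not taken from any of the sources cited in this paper. It assumes the standard two-body relations (the vis-viva equation for a, the eccentricity vector for e, and the angular momentum vector r × v for i), takes mu to be the product of the gravitational constant and the mass of the sun in consistent units, and uses the purely illustrative function name shape_and_tilt.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(a, a))


def shape_and_tilt(r, v, mu):
    """Return (a, e, i) from heliocentric position r and velocity v.

    Hypothetical helper: assumes an elliptical two-body orbit and
    consistent units for r, v, and mu.
    """
    r_mag = norm(r)
    v_sq = dot(v, v)
    # Vis-viva equation solved for the semi-major axis.
    a = 1.0 / (2.0 / r_mag - v_sq / mu)
    # The eccentricity vector points toward perihelion; its length is e.
    rv = dot(r, v)
    e_vec = tuple(((v_sq - mu / r_mag) * rk - rv * vk) / mu
                  for rk, vk in zip(r, v))
    e = norm(e_vec)
    # h = r x v is normal to the orbital plane; i is its angle from the Z axis.
    h = cross(r, v)
    i = math.acos(h[2] / norm(h))
    return a, e, i
```

The remaining elements (Ω, ω, and τ) follow from the same vectors, but are left to the references mentioned above.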

Figure 4: The Orbital Angles

Computing Position from the Elements


Once an element set has been found for a celestial object, the orbit
computation is essentially finished. The six orbital elements completely de-
scribe the orbit and give all the data necessary to compute the position of
the planet at any time, both in the heliocentric ecliptic coordinate system
and with respect to the Earth. The process of converting the orbital ele-
ments back into position data was also a well-known process at the time of
Gauss and Laplace, and is considered another standard problem in celestial
mechanics. Once again, however, the techniques are lengthy to describe and
will not, therefore, be developed in this paper. Suffice it to say, once the
orbital elements have been computed for a planet, it is a straightforward
process to compute a position vector r and then a (λ, β) pair for any given
time.
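
One step of this conversion can be sketched briefly. The Python fragment below is an illustration under stated assumptions rather than the procedure of any cited source: it assumes an elliptical orbit, solves Kepler's equation M = E − e sin E by Newton's iteration, and returns the planet's coordinates in its own orbital plane, with the first axis pointing toward perihelion. The rotation through ω, i, and Ω into the heliocentric ecliptic system is left out, and all function names are hypothetical.

```python
import math


def eccentric_anomaly(M, e, tol=1e-12, max_iter=50):
    """Solve Kepler's equation M = E - e*sin(E) for E by Newton's method."""
    E = M  # starting guess; assumed adequate for small or moderate e
    for _ in range(max_iter):
        step = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= step
        if abs(step) < tol:
            break
    return E


def position_in_orbital_plane(a, e, tau, t, mu):
    """Coordinates at time t in the orbital plane (x axis toward perihelion)."""
    n = math.sqrt(mu / a ** 3)   # mean motion
    M = n * (t - tau)            # mean anomaly, measured from perihelion passage
    E = eccentric_anomaly(M, e)
    x = a * (math.cos(E) - e)
    y = a * math.sqrt(1.0 - e * e) * math.sin(E)
    return x, y
```

At t = τ the sketch places the planet at perihelion, a distance a(1 − e) from the sun, which is a convenient sanity check.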
The overall process of orbit determination is given visually as a flowchart
in Figure 5. This chart shows that the special methods developed by Gauss
and Laplace constitute only a small part of the overall process, although
this is the most critical step, and it was this missing step in the procedure
that caused the scientific community of Europe so much trouble in the late
months of 1801.

Figure 5: Flowchart of Orbit Determination Process

Laplace's Method

We will now introduce Laplace’s method for orbit determination. This
method, which is described by Laplace in both the Mécanique Céleste and
the earlier Mémoires, predates Gauss’s method by some twenty years [5],
[6]. The reader should note that either of these sources can be consulted for
a complete treatment of Laplace’s method. The method described next is a
synopsis of a somewhat more modern version of Laplace’s method, largely
consistent with techniques presented in Fundamentals of Astrodynamics by
Bate, Mueller, and White [1]. Note that while the computations presented
here may appear somewhat different from those contained in Laplace’s texts,
the method is truly and fundamentally equivalent to Laplace’s original work.
Recall that Laplace’s method begins with three geocentric longitude and
latitude pairs: (λ1 , β1 ), (λ2 , β2 ), and (λ3 , β3 ), corresponding to observation
times t1 , t2 , t3 . The goal of Laplace’s method is, once again, the computation
of the position vector r and the corresponding velocity vector v for the
middle observation time t2 .
We begin with some variable definitions. Let R denote the vector from
the sun’s center to Earth’s center at any given time. Note that both the
magnitude and direction of this vector are known in our problem. We further
define ρ to be the vector from Earth’s center to Ceres’ center. This vector
has known direction, but unknown magnitude. Once again, the vector from
the sun’s center to Ceres’ center is r, which has unknown magnitude and
unknown direction. Finally, let L denote the unit vector that originates at
Earth’s center and points in the direction of Ceres, along the vector ρ. This
setup is shown in Figure 6.

Figure 6: The Vectors R, r, ρ, and L

Our first task is to compute the three unit vectors Li , one for each of
our observation times ti . This is simple enough; each of these unit vectors,
as one can see from the geometry in Figure 6, is given by

\[
\mathbf{L}_i = \begin{pmatrix} \cos\lambda_i \cos\beta_i \\ \sin\lambda_i \cos\beta_i \\ \sin\beta_i \end{pmatrix}.
\]
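
As a small illustration of this first step, the sketch below converts a geocentric longitude and latitude, given in degrees, into the unit vector with components (cos λ cos β, sin λ cos β, sin β). The function name line_of_sight is hypothetical, and the angles in the usage note are made-up sample values, not Piazzi's data.

```python
import math


def line_of_sight(lam_deg, beta_deg):
    """Unit vector from Earth's center toward the object.

    lam_deg, beta_deg: geocentric ecliptic longitude and latitude in degrees.
    """
    lam = math.radians(lam_deg)
    beta = math.radians(beta_deg)
    return (math.cos(lam) * math.cos(beta),
            math.sin(lam) * math.cos(beta),
            math.sin(beta))
```

For example, line_of_sight(53.0, 3.0) returns a vector of length one that lies slightly above the ecliptic plane.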

Next, we must derive an expression relating the unit vector L (at any
arbitrary time) to the vector R and the scalar values r and ρ, which are the
unknown magnitudes of the previously defined vectors. To begin, observe
that a simple vector addition gives us

r = ρL + R. (1)

Differentiating this twice with respect to time then produces

r̈ = 2ρ̇L̇ + ρ̈L + ρL̈ + R̈. (2)

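Furthermore, from physics, we have the inverse square law

\[
\ddot{\mathbf{r}} = -\frac{\mu}{r^3}\,\mathbf{r}, \tag{3}
\]

where the constant µ is known, being equal to the product of the gravitational
constant G and the mass of the sun. Substituting (1) into (3) gives

\[
\ddot{\mathbf{r}} = -\frac{\mu}{r^3}\,\rho\,\mathbf{L} - \frac{\mu}{r^3}\,\mathbf{R}, \tag{4}
\]

and by equating (2) and (4) we obtain

\[
\mathbf{L}\ddot{\rho} + 2\dot{\mathbf{L}}\dot{\rho} + \left(\ddot{\mathbf{L}} + \frac{\mu}{r^3}\mathbf{L}\right)\rho = -\left(\ddot{\mathbf{R}} + \mu\frac{\mathbf{R}}{r^3}\right). \tag{5}
\]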

Note that (5) is a vector equation in three dimensions, composed of three
separate equations that relate the various quantities for any time t. For any
given time, the constant µ and the vector quantities L, R, and R̈ are known
(R̈ can be computed using R and some basic equations from the laws of
planetary motion). This leaves us with unknowns L̇, L̈, ρ, ρ̇, ρ̈, and r.
Our next endeavor will be to find numeric values for L̇ and L̈ at the
middle observation time t2 using interpolation. To accomplish this, the
previously computed vectors L1 , L2 , and L3 are utilized, along with the cor-
responding times t1 , t2 and t3 . Using the Lagrange interpolation formula to
express L as a function of t, we write

\[
\mathbf{L}(t) = \frac{(t-t_2)(t-t_3)}{(t_1-t_2)(t_1-t_3)}\,\mathbf{L}_1 + \frac{(t-t_1)(t-t_3)}{(t_2-t_1)(t_2-t_3)}\,\mathbf{L}_2 + \frac{(t-t_1)(t-t_2)}{(t_3-t_1)(t_3-t_2)}\,\mathbf{L}_3. \tag{6}
\]

Observe that L(t) = Li when t = ti . Equation (6) can then be differentiated
twice to obtain L̇ and L̈ (a straightforward process, so the expressions for
the derivatives are not included here). Then setting t = t2 will give numeric
values for L̇(t2 ) = L̇2 and L̈(t2 ) = L̈2 . From this point forward, all com-
putations will concern the middle observation time t2 , so we will drop the
subscripts under this assumption.
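
For readers who wish to see the omitted derivatives, the following sketch evaluates them at t = t2 . It simply differentiates each quadratic coefficient of the Lagrange interpolation formula once and twice; the function name lagrange_derivatives is illustrative, and vectors are assumed to be plain 3-tuples.

```python
def lagrange_derivatives(t1, t2, t3, L1, L2, L3):
    """First and second time derivatives of the interpolant at t = t2.

    L1, L2, L3 are the unit vectors at times t1, t2, t3 (3-tuples).
    Returns (L_dot, L_ddot) as 3-tuples.
    """
    # Derivatives of the three quadratic coefficients, evaluated at t2.
    d1 = (t2 - t3) / ((t1 - t2) * (t1 - t3))
    d2 = (2.0 * t2 - t1 - t3) / ((t2 - t1) * (t2 - t3))
    d3 = (t2 - t1) / ((t3 - t1) * (t3 - t2))
    # Second derivatives of the coefficients (independent of t).
    s1 = 2.0 / ((t1 - t2) * (t1 - t3))
    s2 = 2.0 / ((t2 - t1) * (t2 - t3))
    s3 = 2.0 / ((t3 - t1) * (t3 - t2))
    L_dot = tuple(d1 * a + d2 * b + d3 * c for a, b, c in zip(L1, L2, L3))
    L_ddot = tuple(s1 * a + s2 * b + s3 * c for a, b, c in zip(L1, L2, L3))
    return L_dot, L_ddot
```

Because the interpolant is a quadratic in t, these values are exact whenever the sampled vector function is itself quadratic, and only approximate otherwise.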
Now, returning to (5), it is clear that there are four remaining unknowns:
ρ, ρ̇, ρ̈, and r. To obtain these missing quantities, we first need to solve (5)
for ρ, ρ̇, and ρ̈ as functions of r. To begin this part of the process, we rewrite
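(5) as

\[
\mathbf{L}\left(\ddot{\rho} + \frac{\mu}{r^3}\rho\right) + 2\dot{\mathbf{L}}\dot{\rho} + \ddot{\mathbf{L}}\rho = -\left(\ddot{\mathbf{R}} + \mu\frac{\mathbf{R}}{r^3}\right)
\]

and then again in its matrix form

\[
\begin{pmatrix} \uparrow & \uparrow & \uparrow \\ \ddot{\mathbf{L}} & 2\dot{\mathbf{L}} & \mathbf{L} \\ \downarrow & \downarrow & \downarrow \end{pmatrix}
\begin{pmatrix} \rho \\ \dot{\rho} \\ \ddot{\rho} + \frac{\mu}{r^3}\rho \end{pmatrix}
= -\left(\ddot{\mathbf{R}} + \mu\frac{\mathbf{R}}{r^3}\right). \tag{7}
\]
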
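For notational simplicity, let

\[
A = \begin{pmatrix} \uparrow & \uparrow & \uparrow \\ \ddot{\mathbf{L}} & 2\dot{\mathbf{L}} & \mathbf{L} \\ \downarrow & \downarrow & \downarrow \end{pmatrix}.
\]

Then multiplying (7) through by A−1 produces

\[
\begin{pmatrix} \rho \\ \dot{\rho} \\ \ddot{\rho} + \frac{\mu}{r^3}\rho \end{pmatrix}
= -A^{-1}\ddot{\mathbf{R}} + \frac{\mu}{r^3}\left(-A^{-1}\mathbf{R}\right). \tag{8}
\]

Now let

\[
-A^{-1}\ddot{\mathbf{R}} = \begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix}, \qquad
-A^{-1}\mathbf{R} = \begin{pmatrix} c_x \\ c_y \\ c_z \end{pmatrix}.
\]
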
Then we can write

\[
\rho = b_x + \frac{\mu}{r^3}\,c_x \tag{9}
\]

and

\[
\dot{\rho} = b_y + \frac{\mu}{r^3}\,c_y, \tag{10}
\]

thereby expressing ρ and ρ̇ as functions of r, where the quantities bx , by , cx
and cy along with the constant µ are all known. Note that the unknown ρ̈
is no longer needed; its presence in the process was merely the by-product
of intermediate computation steps.
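
The linear algebra of this step can be sketched as follows. Rather than forming the inverse of A explicitly, the sketch solves the two systems A b = −R̈ and A c = −R by Cramer's rule, which is adequate for a 3 × 3 matrix; this is a choice made here for brevity, not something prescribed by Laplace or by [1]. All function names are hypothetical, and vectors are plain 3-tuples.

```python
def det3(c1, c2, c3):
    """Determinant of the 3x3 matrix whose columns are c1, c2, c3."""
    return (c1[0] * (c2[1] * c3[2] - c2[2] * c3[1])
            - c2[0] * (c1[1] * c3[2] - c1[2] * c3[1])
            + c3[0] * (c1[1] * c2[2] - c1[2] * c2[1]))


def solve3(c1, c2, c3, rhs):
    """Solve [c1 c2 c3] x = rhs by Cramer's rule."""
    d = det3(c1, c2, c3)
    if d == 0.0:
        raise ValueError("matrix A is singular")
    return (det3(rhs, c2, c3) / d,
            det3(c1, rhs, c3) / d,
            det3(c1, c2, rhs) / d)


def laplace_coefficients(L, L_dot, L_ddot, R, R_ddot):
    """Return the vectors b = -A^{-1} R_ddot and c = -A^{-1} R.

    The columns of A are L_ddot, 2*L_dot, and L, in that order.
    """
    col1 = L_ddot
    col2 = tuple(2.0 * x for x in L_dot)
    col3 = L
    b = solve3(col1, col2, col3, tuple(-x for x in R_ddot))
    c = solve3(col1, col2, col3, tuple(-x for x in R))
    return b, c


def rho_and_rate(r, b, c, mu):
    """Evaluate rho and rho_dot for a trial value of r."""
    k = mu / r ** 3
    return b[0] + k * c[0], b[1] + k * c[1]
```

Only the first two components of b and c are used afterward, matching the remark that ρ̈ is no longer needed.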
We have now reduced our original three equations given in (5) down to
two equations, but there are still three unknowns, ρ, ρ̇, and r. We also know
r = ρL + R given before, however. Taking the dot product of each side with
itself produces
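\[
r^2 = \rho^2 + 2\rho\,(\mathbf{L}\cdot\mathbf{R}) + R^2, \tag{11}
\]

thus giving two equations in the unknowns ρ and r. Substituting (9) into
(11) gives an 8th degree polynomial in the variable r,

\[
r^8 - \left(b_x^2 + R^2 + 2b_x(\mathbf{L}\cdot\mathbf{R})\right)r^6 - 2\mu c_x\left(b_x + \mathbf{L}\cdot\mathbf{R}\right)r^3 - \mu^2 c_x^2 = 0, \tag{12}
\]
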
where the values of all coefficients are known. This can be solved using
any available method (though one must truly feel sympathy and respect for
Laplace solving this by hand in 1780).
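
With a computer the task is far less painful. The sketch below evaluates the left side of (12) and locates a root by bisection. Bisection is used here only because it is simple and needs no external library; it is an assumption of this sketch that the caller can supply an interval [lo, hi] on which the polynomial changes sign. The function names are hypothetical, and the numbers in the usage note are artificial values chosen so that the root is r = 3.

```python
def laplace_polynomial(r, b_x, c_x, L_dot_R, R_mag, mu):
    """Left side of equation (12) for a trial value of r."""
    return (r ** 8
            - (b_x ** 2 + R_mag ** 2 + 2.0 * b_x * L_dot_R) * r ** 6
            - 2.0 * mu * c_x * (b_x + L_dot_R) * r ** 3
            - mu ** 2 * c_x ** 2)


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("the interval does not bracket a sign change")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid                 # the sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid    # the sign change is in the upper half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For instance, with µ = 1, R = 1, L · R = 0.35, bx = 2, and cx = 13.5, calling bisect on the interval [2, 4] returns a value very close to 3, and equation (9) then gives ρ = 2.5.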
Now that r is known, ρ and ρ̇ can be found easily from (9) and (10).
Then the vector r is given by r = ρL+R, and the velocity vector v is simply
the derivative of this, given by

v = ṙ = ρ̇L + ρL̇ + Ṙ.

Thus we have achieved our goal; Laplace’s method has produced a position
vector r and a velocity vector v for the middle observation time t2 . At this
point, we would proceed with element computations as described earlier to
produce an orbit for Ceres.
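
This final assembly is short enough to write out in full. In the sketch below, the function name state_at_t2 is illustrative, all vectors are 3-tuples evaluated at the middle observation time, and R_dot (the Earth's heliocentric velocity) is assumed to be available from Earth data.

```python
def state_at_t2(rho, rho_dot, L, L_dot, R, R_dot):
    """Return (r, v) from r = rho*L + R and v = rho_dot*L + rho*L_dot + R_dot."""
    r = tuple(rho * l + x for l, x in zip(L, R))
    v = tuple(rho_dot * l + rho * ld + xd
              for l, ld, xd in zip(L, L_dot, R_dot))
    return r, v
```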

Gauss's Method

We now turn to the method employed by Gauss when he computed Ceres’
orbit in 1801. The method presented here, with a few notation changes, is
the technique given in Gauss’s letter to Olbers of 1802, which was later pub-
lished in its original form as the Summarische Übersicht. Note that Gauss’s
later orbit determination techniques, such as those published in the Theoria
Motus, are much different from the work outlined here.
The reader may well agree that Laplace’s method, while being perhaps
computationally tedious in places, is predominantly an intuitive process that
unfolds naturally with easily understandable steps. Gauss’s method, on the
other hand, is not so. In fact, the various steps of the method appear largely
unmotivated, and even mathematically mature readers may have difficulty
following Gauss’s arguments. For these reasons, and to prevent this sec-
tion from becoming overlong, we will skip most of the derivations in this
paper. The curious reader is very much encouraged to refer to Gauss’s
Summarische Übersicht, and the previously mentioned “The Discovery of
Ceres: How Gauss Became Famous,” for derivations of the equations that
are to follow in this section [4], [10].
Recall that Gauss’s method begins in much the same way as Laplace’s,
with three geocentric longitude and latitude pairs (λ1 , β1 ), (λ2 , β2 ), and
(λ3 , β3 ), corresponding to the three observation times t1 , t2 , and t3 . The end
goal of Gauss’s method, as was mentioned previously, differs from Laplace’s,
however. Whereas Laplace’s techniques led to the computation of a posi-
tion vector r and velocity vector v corresponding to the middle observation
time t2 , Gauss sought to find two position vectors (r1 , r3 ), corresponding to
observation times t1 and t3 .
We begin by defining the vectors R, r and ρ as before. Additionally,
let δ denote the vector projection of ρ onto the ecliptic plane, as shown in
Figure 7. Note that the direction of δ is given by the known angle λ, but
its magnitude, like that of the vector ρ, is unknown. The heliocentric
longitude of the Earth, denoted L, is the angle measured from the positive
X axis to the Earth’s position (be sure not to confuse this with the vector
L given in Laplace’s method). The angle L is known, as is (once again) the
vector R. As before, with any of these symbols, we can indicate a specific
quantity corresponding to an observation time ti by adding the appropriate
subscript.

Figure 7: The Vectors R, r, ρ, and δ and the angle L

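Now let π = ⟨cos(λ), sin(λ), tan(β)⟩ and let P = ⟨cos(L), sin(L), 0⟩.
Gauss derives

\[
\delta_1 = \frac{t_3 - t_1}{t_3 - t_2}\,
\frac{(\boldsymbol{\pi}_2 \times \boldsymbol{\pi}_3)\cdot\mathbf{P}_2}
     {(\boldsymbol{\pi}_1 \times \boldsymbol{\pi}_3)\cdot\mathbf{P}_2}\,\delta_2
\tag{13}
\]

and

\[
\delta_3 = \frac{t_3 - t_1}{t_2 - t_1}\,
\frac{(\boldsymbol{\pi}_1 \times \boldsymbol{\pi}_2)\cdot\mathbf{P}_2}
     {(\boldsymbol{\pi}_1 \times \boldsymbol{\pi}_3)\cdot\mathbf{P}_2}\,\delta_2,
\tag{14}
\]

which are approximations for the magnitudes of the previously defined
vectors δ1 and δ3 .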
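
As a rough illustration of how equations (13) and (14) might be evaluated, the following sketch computes δ1 and δ3 from a given δ2. It assumes the form of the equations in which both denominators are (π1 × π3) · P2, all angles are in radians, and the function names (`pi_vec`, `outer_deltas`) are hypothetical choices made only for this example.

```python
import math

def pi_vec(lam, beta):
    """Direction vector <cos(lam), sin(lam), tan(beta)>; angles in radians."""
    return (math.cos(lam), math.sin(lam), math.tan(beta))

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def outer_deltas(delta2, times, lams, betas, L2):
    """Approximate (delta1, delta3) from delta2 in the manner of (13) and (14).

    times, lams, betas are 3-tuples for the three observations; L2 is the
    heliocentric longitude of the Earth at the middle observation.
    """
    t1, t2, t3 = times
    p1, p2, p3 = (pi_vec(lam, beta) for lam, beta in zip(lams, betas))
    P2 = (math.cos(L2), math.sin(L2), 0.0)
    denom = dot(cross(p1, p3), P2)  # assumed common denominator (pi1 x pi3) . P2
    delta1 = (t3 - t1) / (t3 - t2) * dot(cross(p2, p3), P2) / denom * delta2
    delta3 = (t3 - t1) / (t2 - t1) * dot(cross(p1, p2), P2) / denom * delta2
    return delta1, delta3
```

The denominator is a small quantity for observations that are close together in time, so in practice the result is sensitive to errors in the observed angles.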
Additionally, Gauss constructs

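\[
\left(1 - \left(\frac{R_2}{r_2}\right)^{3}\right)\frac{R_2}{\delta_2}
= \frac{-2}{(M_2 - M_1)(M_3 - M_2)}\,
\frac{\tan(\beta_2)\sin(\lambda_3 - \lambda_1)
      - \tan(\beta_1)\sin(\lambda_3 - \lambda_2)
      - \tan(\beta_3)\sin(\lambda_2 - \lambda_1)}
     {\tan(\beta_1)\sin(L_2 - \lambda_3) - \tan(\beta_3)\sin(L_2 - \lambda_1)}
\tag{15}
\]

and

\[
\frac{R_2}{r_2} = \frac{R_2}{\delta_2}
\left(1 + \tan^2(\beta_2) + \left(\frac{R_2}{\delta_2}\right)^{2}
      + 2\,\frac{R_2}{\delta_2}\cos(\lambda_2 - L_2)\right)^{-1/2}
\tag{16}
\]

where M1 , M2 , and M3 in equation (15) denote the mean anomaly of the
Earth at the three observation times. These are angles related to the Earth’s
displacement from perihelion, and are easily computed using Earth data.
The last fraction in (15) is the scalar triple product π2 · (π1 × π3) divided
by (π1 × π3) · P2 , written out in terms of the observed angles.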
Note that in (15) and (16), the quantities R2 , r2 and δ2 refer to the mag-
nitudes of the vectors R, r and δ at the middle observation time. Observe
also that in both these equations, all values are known except the quantities
r2 and δ2 .
For notational simplicity, let N denote the entire right hand side of (15),
where all quantities are known. Furthermore, let a = 1 + tan²(β2 ), b =
2 cos(λ2 − L2 ), x = R2 /δ2 , and y = R2 /r2 , so that a and b are known
quantities and x and y are unknowns. Then equations (15) and (16) become

\[ N = x\left(1 - y^{3}\right) \]

and

\[ y = x\left(x^{2} + bx + a\right)^{-1/2}, \]

respectively. Then, substitution gives an 8th degree polynomial in the
quantity x = R2 /δ2 ,

\[ x^{8} - (x - N)^{2}\left(x^{2} + bx + a\right)^{3} = 0. \tag{17} \]

This polynomial can be solved for the quantity x = R2 /δ2 , from which we
can easily obtain δ2 , and then δ1 and δ3 are given by equations (13) and
(14). Next we can compute the vectors ρ1 and ρ3 from geometry using

\[ \boldsymbol{\rho}_1 = \langle\, \delta_1\cos(\lambda_1),\ \delta_1\sin(\lambda_1),\ \delta_1\tan(\beta_1) \,\rangle \]

and

\[ \boldsymbol{\rho}_3 = \langle\, \delta_3\cos(\lambda_3),\ \delta_3\sin(\lambda_3),\ \delta_3\tan(\beta_3) \,\rangle . \]
These vectors then yield the desired r1 and r3 if we observe that r + R = ρ
for any time. Thus the computation is complete, and we could proceed with
element computations if desired.
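
The paper does not say how the 8th degree polynomial was solved, so the following is only one possible sketch: it scans the positive x axis for sign changes of x⁸ − (x − N)²(x² + bx + a)³ and refines each bracket by bisection. Because the polynomial arises from squaring, some of its roots do not satisfy the original pair N = x(1 − y³), y = x(x² + bx + a)^(−1/2), so a second helper filters the candidates. The function names, the search range `x_max`, and the grid size are assumptions made for this example.

```python
import math

def poly17(x, N, a, b):
    """Left-hand side of the 8th degree polynomial, equation (17)."""
    return x**8 - (x - N)**2 * (x * x + b * x + a)**3

def positive_roots(N, a, b, x_max=5.0, steps=5000, tol=1e-13):
    """Bracket sign changes of poly17 on (0, x_max] and refine by bisection."""
    roots = []
    h = x_max / steps
    lo, f_lo = h, poly17(h, N, a, b)
    for i in range(2, steps + 1):
        hi = i * h
        f_hi = poly17(hi, N, a, b)
        if f_hi == 0.0:
            roots.append(hi)
        elif f_lo * f_hi < 0.0:
            left, right, f_left = lo, hi, f_lo
            while right - left > tol:
                mid = 0.5 * (left + right)
                f_mid = poly17(mid, N, a, b)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
        lo, f_lo = hi, f_hi
    return roots

def satisfies_unsquared(x, N, a, b, tol=1e-9):
    """True if x also solves N = x(1 - y^3) with y = x (x^2 + b x + a)^(-1/2)."""
    q = x * x + b * x + a
    if q <= 0.0:
        return False
    y = x / math.sqrt(q)
    return abs(x * (1.0 - y**3) - N) < tol
```

Given an accepted root x, the distance follows as δ2 = R2 / x. A grid scan can miss roots that lie closer together than the grid spacing, which is why the step count is left as a parameter.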

A Comparison of Methods

We have now seen the techniques employed in both Gauss’s and Laplace’s
basic orbit determination methods. While there are many differences be-
tween the methods, the reader will no doubt have noticed a few interesting
similarities as well. Most obvious is the fact that both methods rely on
the use of three (λ, β) pairs from three separate observation times. Both
methods also employ approximation techniques that relate the three time
values (interpolation polynomials in Laplace’s method, and equations (13)
and (14) in Gauss’s method). Perhaps more intriguing, however, is the fact
that both methods hinge on the solving of an 8th degree polynomial, and
in both cases the quantity being solved for is a distance that relates to the
geometry of the planet’s position in the solar system.
Indeed, it seems Wilhelm Olbers noticed some similarities as well, re-
marking to Gauss in a letter dated September 11, 1802, in reference to one
of the results contained within the Summarische Übersicht,

[Y]ou must have known the great analogy of your principle equa-
tion with the Laplacian . . . This analogy must naturally occur.
So as you justly note, that every useful method must give at least
an equally accurate orbit from the same observations, so must
also every direct dissolution of the comet problem from three
observations . . . lead to similar equations. It comes only to the
more or less simple course of analysis, and on the convenience
and brevity with which one can calculate the required outcome
for the equation [8].

Olbers goes on to extol the virtues of Gauss’s technique, asserting that
while it is perhaps equivalent to Laplace’s method in its accuracy, it is
superior to Laplace’s in its computational simplicity. Indeed, if one undertakes
the task of computing an orbit using Laplace’s and then Gauss’s method,
it quickly becomes clear that Gauss’s method is the less computationally
burdensome of the two methods.
In Gauss’s reply to Olbers, dated September 14, 1802, we obtain a
glimpse of Gauss’s rather egotistical attitude toward the whole matter. As
one reads this, one should remember that Laplace developed his method
some two decades before Gauss, and was a much older, more established
and respected mathematician in 1802.

The Laplace formula, that I had seen many years ago in his Theory
of Elliptical Motion, had come completely out of my memory,
until I very recently received the Mécanique Céleste. I think one
must be able to derive it from [my formula] very easily [8].

Could it be that Gauss’s and Laplace’s methods yield similar results?
The historical evidence suggests that Gauss’s method was superior, but is
this actually the case? The answer, somewhat surprisingly, appears to be
no, at least for the method in its uncorrected form given previously (we
will soon clarify this). Table 1 gives some indication of the accuracy of each
method. The results in this table are the average absolute errors, in radians,
of the (λ, β) values computed for Ceres for Piazzi’s 19 original observation
times.

Table 1: Average Absolute Error from Observed Values (Radians)

Method              λ          β
Laplace’s Method    8.75E-04   8.34E-05
Gauss’s Method      5.69E-04   7.03E-05

The results in Table 1 were obtained via the following process.

1. Choose 3 of Piazzi’s 19 original observations.

2. Use Laplace’s or Gauss’s method to find distance information as
described previously.

3. Compute an element set for Ceres using results from step 2.

4. Compute (λ, β) values for all nineteen observation times using results
from step 3.

5. Subtract each computed (λ, β) pair from the corresponding (λ, β) pair
observed by Piazzi to obtain the error value, and take the absolute value
of this result.

6. Find the arithmetic mean of the errors from step 5.
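
Steps 5 and 6 of the process above amount to a mean absolute error taken separately in λ and β. A minimal sketch follows, assuming both lists hold (λ, β) pairs in radians in the same order; the wrapping of longitude differences into [−π, π] is an assumption added here so that values on either side of 0 compare sensibly, and the function name is hypothetical.

```python
import math

def mean_abs_errors(computed, observed):
    """Return (mean |error in lambda|, mean |error in beta|) in radians."""
    if len(computed) != len(observed) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    n = len(observed)
    # Longitude differences are wrapped into [-pi, pi] before taking |.|.
    lam_err = sum(abs(math.remainder(c[0] - o[0], 2.0 * math.pi))
                  for c, o in zip(computed, observed)) / n
    beta_err = sum(abs(c[1] - o[1]) for c, o in zip(computed, observed)) / n
    return lam_err, beta_err
```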

This process yields some interesting results. As we can plainly see in Ta-
ble 1, Gauss’s method seems to produce results that are only negligibly more
accurate than Laplace’s method.

Possible Correction Methods

The results contained within Table 1 raise a perplexing question: if
Gauss’s method is no better than Laplace’s, then how did Gauss solve the
Ceres problem when Laplace’s method apparently failed? Gauss offers us at
least a partial explanation in the Summarische Übersicht [4]. In a section
entitled “Improvement Methods,” Gauss discloses,
If one calculates the position for time [t2 ] from the approximate
elements found through the previous methods, and if one finds
the same to be in agreement with the observation, the work is
done. Usually the agreement will be very large (often the dif-
ference in my calculations amounted to only a few seconds), but
seldom complete, partly because only approximate hypotheses
underlie [the calculations], partly because the locus of the sun
itself, which one uses therein, is not elliptical, but includes small
disturbances.
Gauss then mentions that one could obtain greater accuracy by modifying
equations (13) and (14) to improve the approximations for δ1 and δ3 , but
goes on to say that this would be very computationally difficult, and
notes that there are easier ways to improve accuracy. A few paragraphs
later, Gauss explains,
One calculates the heliocentric positions according to 3 hypothe-
ses from the two outer geocentric positions, in which one first
assumes the approximated distances for these observations, and
afterwards changes first the one and then the other a little bit.
From the elements found in all 3 hypotheses one calculates the
position for the middle observation, which one compares with
the observed. Through interpolation one then finds the corrected
distances, and if one wants also the corrected elements . . .
The above passage is, admittedly, somewhat difficult to interpret, but it
is clear that Gauss is recommending an interpolation method to improve the
“approximated distances,” which we can take to mean the (δ1 , δ3 ) values
obtained from the uncorrected computation. It seems that Gauss used such
a correction method when he computed the orbit of Ceres in 1801, and it
seems reasonable that this could at least partially answer questions regarding
the relative accuracy of Gauss’s and Laplace’s methods. From this point on
we shall refer to the technique described within the Summarische Übersicht
as the Three Hypotheses method.

The Three Hypotheses Method
We will now attempt to unravel the meaning of Gauss’s advice. It should
be noted that the method presented here is only one possible interpretation
of the previously reproduced passage, although the reader will likely agree
that this interpretation seems to match Gauss’s words quite closely.
Gauss states that we first assume the “approximated distances,” and
then we change “first the one and then the other a little bit,” thus giving
the three hypotheses. These steps are explained below.
1. We perform the computations explained in Gauss’s method to achieve
an approximated (δ1 , δ3 ) pair for the planet. We will denote these
original approximations as (δ1 (1) , δ3 (1) ).

2. We then change our computed δ1 “a little bit,” producing a modified
value that we will denote δ1 (2) . Thus we have a second pair,
(δ1 (2) , δ3 (1) ).

3. Finally, we change our computed δ3 “a little bit,” producing a modified
value δ3 (3) and hence a third pair, (δ1 (1) , δ3 (3) ).
Gauss then goes on to say that we should use the results of each hypothesis
to compute the position (i.e. a (λ, β) pair), for the middle observation time,
and then interpolate the results to achieve corrected values for δ1 and δ3 .
Then “if one wants,” the corrected distances can be used to compute a cor-
rected element set for the planet. We will denote the (λ, β) values computed
from the first, second, and third hypotheses as (λ(1) , β (1) ), (λ(2) , β (2) ), and
(λ(3) , β (3) ), respectively, keeping in mind that these are all computed posi-
tions for the middle observation time t2 .
What kind of interpolation process does Gauss intend for us to use? Ob-
serve that the process of obtaining a (λ, β) pair from a (δ1 , δ3 ) pair can be
thought of as a function with two inputs and two outputs; thus we have three
pairs of approximate inputs and the corresponding three pairs of outputs.
Additionally, we have the observed (λ, β) pair for the middle observation
time, which we assume is the desired output. The problem then is to deter-
mine, using some form of interpolation, the corrected values for (δ1 , δ3 ) that
will yield this desired output.
To gain perspective on this problem, pretend for a moment that instead
of two inputs and two outputs, we had a function with one input δ and one
output λ. The first hypothesis in this case is given by δ = δ (1) , and the
second by δ = δ (2) (in this scenario, there is no third hypothesis). These
inputs then yield outputs λ(1) and λ(2) . Suppose that our desired (observed)
output is λ∗ . Our goal is to find a value δ ∗ that when used as an input will
yield λ∗ . This is a simple two-dimensional linear interpolation problem, and
is presented pictorially in Figure 8.

Figure 8: Two-Dimensional Linear Interpolation

The interpolation here is trivial. Simply find the slope of the line that
passes through all three points as shown in Figure 8, using

\[ m = \frac{\lambda^{(2)} - \lambda^{(1)}}{\delta^{(2)} - \delta^{(1)}}. \tag{18} \]

Then the equation of this line is given by

\[ m\left(\delta^{*} - \delta^{(1)}\right) = \lambda^{*} - \lambda^{(1)}, \tag{19} \]

which yields δ ∗ easily.
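
For concreteness, the one-input case of equations (18) and (19) can be written in a few lines. This is only a sketch of the simplified scenario described above; the function and argument names are hypothetical.

```python
def interpolate_delta(d_first, d_second, lam_first, lam_second, lam_star):
    """Solve m (d* - d_first) = lam* - lam_first for d*, with m from (18)."""
    if d_second == d_first or lam_second == lam_first:
        raise ValueError("the two hypotheses must differ in input and output")
    m = (lam_second - lam_first) / (d_second - d_first)  # slope, equation (18)
    return d_first + (lam_star - lam_first) / m          # rearranged (19)
```

For an exactly linear relation such as λ = 3δ + 1, the hypotheses δ = 1 and δ = 1.5 give outputs 4 and 5.5, and a desired output of 7 is matched by δ∗ = 2.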


A similar process can be used to interpolate the results in the actual
case of two inputs and two outputs, although the picture, having 2 input
dimensions and 2 output dimensions, becomes a little harder to draw. We
therefore visualize the set of inputs and the set of outputs as two separate
coordinate planes, where our computations represent a function that maps
points in the (δ1 , δ3 ) plane to the (λ, β) plane.
Suppose, now, that we draw a vector in the (δ1 , δ3 ) plane that originates
at the point of the first hypothesis, (δ1 (1) , δ3 (1) ), and terminates at the point
of the second hypothesis, (δ1 (2) , δ3 (1) ). This vector maps to a corresponding
vector in the (λ, β) system that originates at (λ(1) , β (1) ) and terminates
at (λ(2) , β (2) ). Similarly, another vector connects the inputs from the first
and third hypotheses and maps to an analogous output vector. Our problem,
then, is to find the vector originating at (δ1 (1) , δ3 (1) ) and terminating at
some unknown point (δ1 ∗ , δ3 ∗ ) that maps to the vector which originates at
(λ(1) , β (1) ) and terminates at the pair of desired (observed) outputs (λ∗ , β ∗ ).
This setup is shown in Figure 9.

Figure 9: Visualizing the Three Hypotheses Method

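By examining Figure 9, we see that we can write

\[
M \begin{pmatrix} \delta_1^{(2)} - \delta_1^{(1)} \\ 0 \end{pmatrix}
= \begin{pmatrix} \lambda^{(2)} - \lambda^{(1)} \\ \beta^{(2)} - \beta^{(1)} \end{pmatrix}
\tag{20}
\]

and

\[
M \begin{pmatrix} 0 \\ \delta_3^{(3)} - \delta_3^{(1)} \end{pmatrix}
= \begin{pmatrix} \lambda^{(3)} - \lambda^{(1)} \\ \beta^{(3)} - \beta^{(1)} \end{pmatrix},
\tag{21}
\]

where M denotes the transformation matrix that maps the vectors from the
(δ1 , δ3 ) plane to the (λ, β) plane. The matrix M can be found using linear
algebra, and we can write

\[
M = \begin{pmatrix}
\dfrac{\lambda^{(2)} - \lambda^{(1)}}{\delta_1^{(2)} - \delta_1^{(1)}} &
\dfrac{\lambda^{(3)} - \lambda^{(1)}}{\delta_3^{(3)} - \delta_3^{(1)}} \\[2ex]
\dfrac{\beta^{(2)} - \beta^{(1)}}{\delta_1^{(2)} - \delta_1^{(1)}} &
\dfrac{\beta^{(3)} - \beta^{(1)}}{\delta_3^{(3)} - \delta_3^{(1)}}
\end{pmatrix},
\tag{22}
\]

which is analogous to equation (18) in the 2-dimensional example. Then the
interpolation is accomplished by

\[
M \begin{pmatrix} \delta_1^{*} - \delta_1^{(1)} \\ \delta_3^{*} - \delta_3^{(1)} \end{pmatrix}
= \begin{pmatrix} \lambda^{*} - \lambda^{(1)} \\ \beta^{*} - \beta^{(1)} \end{pmatrix},
\tag{23}
\]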

which is analogous to equation (19) in the 2-dimensional example. Equation
(23) can then easily be solved for the unknowns δ1 ∗ and δ3 ∗ , thus completing
the interpolation process.

Table 2: Average Absolute Error from Observed Values (Radians)

Method                                  λ          β
Laplace’s Method                        8.75E-04   8.34E-05
Gauss’s Method Uncorrected              5.69E-04   7.03E-05
Gauss’s Method with Three Hypotheses    6.19E-05   6.84E-05
Results Computed by Gauss               1.08E-06   1.19E-06

Having now computed improved values for δ1 and δ3 , which are the
“corrected distances” referred to in the Summarische Übersicht, we can now
proceed with the remaining computations to find a corrected element set
and the resulting corrected positions.
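
The interpolation of equations (20) through (23) can be sketched as a small 2 × 2 linear solve. The code below is an illustration of the interpretation given in this section, not a reconstruction of Gauss’s own arithmetic; it assumes the second hypothesis perturbs only δ1 by `eps1` and the third perturbs only δ3 by `eps3`, and every name in it is hypothetical.

```python
def three_hypotheses_correction(d1, d3, out1, out2, out3, target, eps1, eps3):
    """Return corrected (delta1*, delta3*) by linear interpolation.

    d1, d3   : distances of the first hypothesis
    out1..3  : computed (lambda, beta) at t2 for hypotheses 1, 2 and 3
    target   : observed (lambda, beta) at t2
    eps1/3   : the small changes applied to d1 (hyp. 2) and d3 (hyp. 3)
    """
    # Columns of M as in equation (22): output change per unit input change.
    m11 = (out2[0] - out1[0]) / eps1
    m21 = (out2[1] - out1[1]) / eps1
    m12 = (out3[0] - out1[0]) / eps3
    m22 = (out3[1] - out1[1]) / eps3
    det = m11 * m22 - m12 * m21
    if det == 0.0:
        raise ValueError("the two perturbations give parallel output changes")
    # Right-hand side of equation (23).
    r_lam = target[0] - out1[0]
    r_beta = target[1] - out1[1]
    # Solve M [dd1, dd3]^T = [r_lam, r_beta]^T by Cramer's rule.
    dd1 = (m22 * r_lam - m12 * r_beta) / det
    dd3 = (m11 * r_beta - m21 * r_lam) / det
    return d1 + dd1, d3 + dd3
```

If the map from distances to computed position were exactly linear, a single application would reproduce the observed middle position; since it is only approximately linear, the corrected pair could in principle be used as a new first hypothesis and the process repeated.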

Results from the Three Hypotheses Method


Application of the Three Hypotheses method resulted in some encourag-
ing outcomes. Indeed, an element set for Ceres produced using this method
proved to be significantly more accurate (and closer to Gauss’s computed
element set for Ceres) than element sets produced using either Laplace’s
method or Gauss’s method uncorrected. When position computations were
performed, the absolute errors (explained earlier) were improved as well, as
shown in Table 2.
Table 2 gives also, however, the errors that Gauss’s own computations
apparently yielded. This is based on a list of (λ, β) values and the corre-
sponding errors from observation that Gauss himself computed for Piazzi’s
19 original observation times [11]. We can see that these errors are extremely
small, and are much smaller, in fact, than the results obtained using any of
the attempted methods. So while the Three Hypotheses method produced
better results, Gauss himself was able to compute still more accurate posi-
tion values for Ceres.
How was Gauss able to compute Ceres’ position with such accuracy?
After performing all the computations given in the Summarische Übersicht
in full, including his recommended correction method (to the best of our
interpretive abilities), we still cannot replicate Gauss’s results. It would seem
that there is more to Gauss’s method than he gives in the Summarische
Übersicht.

Least Squares?
We turn now, again, to Gauss’s possible use of least squares. Gauss
specifically stated that he had used least squares since at least 1795, and
that at some point, at least by 1802, he used least squares in his computa-
tion of the elements of Ceres [7]. Gauss gives us no more information than
this, and we are left with a puzzle with several key pieces missing. Assuming
Gauss was truthful on these points, how did Gauss use least squares in his
computations? Did he use the method when he first computed Ceres’ orbit
in 1801, or did he use least squares only in subsequent computations after
Ceres was rediscovered? Although we may never be able to fully answer
these questions, we now consider two potential least-squares-based correc-
tions that Gauss could have used.
The first possible least squares method is a direct extension of the Three
Hypotheses interpolation method. The Three Hypotheses method uses the
(λ, β) values from the middle observation time t2 as the desired outputs
(λ∗ , β ∗ ) to find the interpolated inputs (δ1 ∗ , δ3 ∗ ). Suppose we keep the val-
ues for δ1 (1) , δ1 (2) , δ3 (1) , and δ3 (3) fixed, and add n observed (λ∗i , β ∗i ) pairs
from n “middle” observation times. Piazzi, after all, recorded a total of 19
(λ, β) pairs, and we have thus far only used three of them in our
computations. Each added observation then produces two new equations analogous
to (20) and (21), and M goes from being 2 × 2 to being 2n × 2. Equation
(23) then becomes

\[
M \begin{pmatrix} \delta_1^{*} - \delta_1^{(1)} \\ \delta_3^{*} - \delta_3^{(1)} \end{pmatrix}
= \begin{pmatrix}
\lambda^{*1} - \lambda^{(1)} \\
\beta^{*1} - \beta^{(1)} \\
\lambda^{*2} - \lambda^{(1)} \\
\beta^{*2} - \beta^{(1)} \\
\vdots \\
\lambda^{*n} - \lambda^{(1)} \\
\beta^{*n} - \beta^{(1)}
\end{pmatrix}.
\tag{24}
\]

This is an overdetermined system that becomes more overdetermined with
each additional observation. As the reader has probably noticed, this is a
classic least squares problem of the form Mx = b, which can be solved using
the least squares estimate x ≈ (MᵀM)⁻¹Mᵀb.
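
Because there are only two unknowns, the estimate (MᵀM)⁻¹Mᵀb can be written out by hand. The sketch below forms the 2 × 2 normal equations and solves them directly; it assumes M is supplied as a list of two-entry rows and b as a list of the same length, and the function name is hypothetical. For larger or badly conditioned problems a library least squares routine would be the more usual choice.

```python
def least_squares_two_unknowns(M, b):
    """Least squares solution of M x = b for x = (x1, x2) via normal equations."""
    if len(M) != len(b) or len(M) < 2:
        raise ValueError("need at least two rows, with len(M) == len(b)")
    # Entries of the symmetric 2 x 2 matrix M^T M.
    s11 = sum(row[0] * row[0] for row in M)
    s12 = sum(row[0] * row[1] for row in M)
    s22 = sum(row[1] * row[1] for row in M)
    # Entries of the vector M^T b.
    g1 = sum(row[0] * bi for row, bi in zip(M, b))
    g2 = sum(row[1] * bi for row, bi in zip(M, b))
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("columns of M are linearly dependent")
    return ((s22 * g1 - s12 * g2) / det, (s11 * g2 - s12 * g1) / det)
```

Here x stands for the pair of corrections (δ1∗ − δ1(1), δ3∗ − δ3(1)), which are added to the first-hypothesis distances afterwards.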

The second possible least squares method involves making small changes
to each of the six computed elements, calculating the modified (λ, β) values
that result in each case, and then performing a similar least squares estimate
process to determine the corrected elements. This computation can be per-
formed in conjunction with the Three Hypotheses method or independently.
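
One way such an element correction might be organized is as a single linearized least squares step: perturb each element in turn, record how the computed angles change, and solve the resulting normal equations for the element corrections. The sketch below is an assumption about the general shape of that computation, not the procedure used for the results reported here. The callable `predict`, which maps an element list to a flat list of computed angles, and the step sizes are hypothetical inputs supplied by the user.

```python
def solve_linear(A, y):
    """Solve the square system A z = y by Gaussian elimination with pivoting."""
    n = len(y)
    aug = [list(A[i]) + [y[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[piv][col] == 0.0:
            raise ValueError("singular normal equations")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / aug[i][i]
    return z

def correct_elements(predict, elements, observed, steps):
    """One linearized least squares correction of an element set."""
    base = predict(elements)
    resid = [o - p for o, p in zip(observed, base)]
    # Forward-difference sensitivity of every output to each element.
    cols = []
    for k, h in enumerate(steps):
        bumped = list(elements)
        bumped[k] += h
        out = predict(bumped)
        cols.append([(q - p) / h for q, p in zip(out, base)])
    n = len(elements)
    JtJ = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(n)]
           for i in range(n)]
    Jtr = [sum(a * r for a, r in zip(cols[i], resid)) for i in range(n)]
    dz = solve_linear(JtJ, Jtr)
    return [e + d for e, d in zip(elements, dz)]
```

With six orbital elements and the 38 angles from Piazzi’s 19 observations, the sensitivity matrix would be 38 × 6; the step could be repeated from the corrected elements if the computed positions depend noticeably nonlinearly on them.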
So what results were obtained using these correction methods? The short
answer is that both methods seem to offer some improvement to the com-
puted elements and positions, but neither method succeeded in reproducing
Gauss’s results. The outcomes from both of these computations were, there-
fore, inconclusive, and we are still no closer to knowing how Gauss used least
squares in his early orbit determination methods.

Conclusion

Can we ever have an answer to the questions examined within this pa-
per? Will we ever know exactly how Gauss computed the orbit of Ceres in
1801? Can we discover how and when he used least squares in his computa-
tions? After carefully examining the available evidence, after meticulously
performing the many necessary computations, after scrutinizing and dissect-
ing Gauss’s words, it appears that there is simply not enough information
available to provide conclusive answers to our questions.
It is unlikely that we will uncover any direct written explanations for our
queries, and it is also unlikely that we will hit upon a method that exactly
reproduces Gauss’s results. For all we know, Gauss did use one of the least
squares methods presented in this paper, but we may be unable to reproduce
his computed results due to differences in values used for constants, Earth
data, and other parameters throughout the process. Then again, Gauss
could have used an entirely different process that was only obvious to him,
and those of us who are, perhaps, slightly less mathematically gifted than
Gauss will have an impossible time stumbling upon the same method.
It therefore seems likely that the mysteries explored in this paper will
live on, possibly forever. We close, then, with the hope that the reader has
gained an appreciation for the genius of our mathematical forebears, has
gleaned some insight into orbit determination processes, and has formulated
his or her own thoughts regarding the problems contained within this doc-
ument. At the very least, it is hoped that the reader has been entertained
by a good story.

References

[1] R. Bate, D. Mueller, J. White, Fundamentals of Astrodynamics, Dover, New York, 1971.

[2] G. Foderà Serio, A. Manara, P. Sicoli, Giuseppe Piazzi and the Discovery of Ceres, Asteroids
III, U. of Arizona Press, Tucson, AZ, 2002, pp. 17-24.

[3] C. F. Gauss, Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic
Sections, translation by C. H. Davis, Dover, New York, NY, 1963.

[4] C. F. Gauss, Summarische Übersicht der zur Bestimmung der Bahnen der beiden neuen
Hauptplaneten angewandten Methoden, Monatliche Correspondenz zur Beförderung der Erd-
und Himmels-Kunde, XX (1809), pp. 197-224.

[5] P. S. Laplace, Mécanique Céleste, translation by N. Bowditch, Hilliard et al., Boston, MA,
1829.

[6] P. S. Laplace, Mémoires de l’Académie Royale des Sciences de Paris, 1780.

[7] R. L. Plackett, Studies in the History of Probability and Statistics XXIX, The Discovery of
the Method of Least Squares, Biometrika 59 (1972), pp. 239-251.

[8] C. Schilling, Wilhelm Olbers, Sein Leben und Seine Werke, Verlag von Julius Springer, Berlin,
1900, p. 78.

[9] D. Teets and K. Whitehead, The Computation of Planetary Orbits, The College Mathematics
Journal 29 (1998), pp. 397-404.

[10] D. Teets and K. Whitehead, The Discovery of Ceres: How Gauss Became Famous,
Mathematics Magazine 72 (1999), pp. 83-93.

[11] X. Von Zach, Fortgesetzte Nachrichten über den Längst Vermutheten Neuen Haupt-Planeten
Unseres Sonnen-Systems [Ceres], Monatliche Correspondenz zur Beförderung der Erd- und
Himmels-Kunde, III (1801).
