An Application of The Calculus of Variations To Sturm Liouville Theory

Tim Kwan
May 12, 2013
1 Introduction
Eigenvalue problems pervade many areas of applied and pure mathematics. They involve
the discovery of eigenvalues: special scalars for which non trivial solutions exist in the
relevant vector space. The Sturm Liouville equation is one such eigenvalue problem and
will be the focus of this report. It is a linear second order differential equation subject to
boundary conditions.
This equation will be recast as an isoperimetric problem and a number of helpful theorems
will be derived in order to solve for or approximate eigenvalues. This will involve looking
at functionals of $y \in C^2[x_0, x_1]$. It will be found that each eigenvalue $\lambda_n$ corresponds to
an eigenfunction $\phi_n$ which is unique up to a constant multiple. This report will open
with a brief overview of Sturm Liouville theory and then will explain how the calculus of
variations may be applied. The application will be organised into theorems and methods
pertaining to the first eigenvalue and then to higher eigenvalues.
2 Sturm Liouville Theory

Sturm Liouville theory was developed collaboratively by Charles-François Sturm (1803-1855)
and Joseph Liouville (1809-1882) in order to generalise a relatively disorganised
array of second order linear differential equations used to model physical problems. These
included Bernoulli’s work on vibrating strings and Liouville’s own work on heat conduction
(Lützen, 1984). The theory was particularly significant because it provided the first
qualitative theory of differential equations, and was thus very useful for equations that
could not be solved explicitly. Sturm and Liouville made significant contributions to
Fourier approximations, to use an anachronistic term, and discovered helpful ways to
approximate the eigenvalues and eigenfunctions of their famous equation,
$$-(p(x)y'(x))' + q(x)y(x) - \lambda r(x)y(x) = 0 \tag{1}$$
subject to the boundary conditions
$$\alpha_0 y(x_0) + \beta_0 y'(x_0) = 0,$$
$$\alpha_1 y(x_1) + \beta_1 y'(x_1) = 0,$$
$$\alpha_k^2 + \beta_k^2 \neq 0, \quad k = 0, 1.$$
Note that on the interval $[x_0, x_1]$, $y(x)$ is real valued, $q(x)$ and $r(x)$ are continuous,
$p(x)$ has a continuous first order derivative and $p(x), r(x) > 0$.
As a result of its generality the equation has a broad range of applications, since many
systems can be modelled by second order differential equations. As in most of the
motivating problems, the boundary conditions make the equation suitable for describing
standing waves. For example, it can be applied to quantum theory by representing the
rearranged one dimensional time independent Schrödinger equation
$$-(\psi'(x))' + \frac{2m}{\hbar^2}V(x)\psi(x) - \lambda\psi(x) = 0,$$
where $y(x) = \psi(x)$, $p(x) = 1$, $q(x) = \frac{2m}{\hbar^2}V(x)$ and $r(x) = 1$, so that
$\lambda = 2mE/\hbar^2$ for a state of energy $E$.
The following theorems are integral to Sturm-Liouville theory and will be assumed
but not proven in this report. Refer to Hanus (2008) for the full proofs.
Theorem 1. The spectrum $\{\lambda_n\}$ is an infinite, monotonically increasing sequence with
$\lim_{n\to\infty} \lambda_n = \infty$.
Theorem 2. Each eigenvalue corresponds to precisely one eigenfunction, up to a constant
multiple. That is to say that the eigenvalues are simple.
Theorem 3. If $\lambda_m$ and $\lambda_n$ are distinct eigenvalues with corresponding eigenfunctions
denoted $\phi_m$ and $\phi_n$ respectively, then
$$\langle \phi_m, \phi_n \rangle = 0,$$
where
$$\langle \phi_m, \phi_n \rangle = \int_{x_0}^{x_1} r(x)\phi_m(x)\phi_n(x)\,dx.$$
2.1 A short and important example

We can let p = r = 1, q = 0 and y(0) = y(l) = 0 in the Sturm Liouville equation to give
y ′′ (x) + λy(x) = 0
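For $\lambda < 0$, the solution is
$$y = c_1 e^{\sqrt{-\lambda}\,x} + c_2 e^{-\sqrt{-\lambda}\,x}.$$
Imposing the boundary conditions shows $c_1 = c_2 = 0$, which gives only the trivial solution.
For $\lambda = 0$, the solution is
$$y = c_1 x + c_2,$$
which also only gives the trivial solution.
However, for $\lambda > 0$, the general solution is
$$y = c_1 \sin(x\sqrt{\lambda}) + c_2 \cos(x\sqrt{\lambda}).$$
The condition $y(0) = 0$ forces $c_2 = 0$, so that
$$y = c_1 \sin(x\sqrt{\lambda}),$$
and the condition $y(l) = 0$ then requires
$$0 = c_1 \sin(l\sqrt{\lambda}).$$
A nontrivial solution needs $c_1 \neq 0$, hence $l\sqrt{\lambda} = n\pi$, and we obtain the eigenvalues
$$\lambda_n = \frac{n^2\pi^2}{l^2}, \quad n = 1, 2, \ldots \tag{2}$$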
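As a numerical sanity check of (2), the sketch below uses a shooting method, which is not the method of this report: it integrates $y'' = -\lambda y$ from $y(0) = 0$, $y'(0) = 1$ with a Runge-Kutta step and bisects on $\lambda$ until $y(l) = 0$. The function names, the step count and the choice of brackets around the predicted values are assumptions made for illustration only.

```python
import math

def shoot(lam, l=1.0, steps=1000):
    """Integrate y'' = -lam*y with y(0)=0, y'(0)=1 by classical RK4; return y(l)."""
    h = l / steps
    y, v = 0.0, 1.0
    for _ in range(steps):
        # First-order system: y' = v, v' = -lam*y
        k1y, k1v = v, -lam * y
        k2y, k2v = v + 0.5 * h * k1v, -lam * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -lam * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y

def eigenvalue_between(lo, hi, l=1.0, tol=1e-9):
    """Bisect on lam, assuming y(l; lam) changes sign exactly once in [lo, hi]."""
    f_lo = shoot(lo, l)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, l)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

l = 1.0
# Brackets placed halfway between consecutive predicted eigenvalues (an assumption).
numeric = [
    eigenvalue_between(((n - 0.5) * math.pi / l) ** 2, ((n + 0.5) * math.pi / l) ** 2, l)
    for n in (1, 2, 3)
]
exact = [(n * math.pi / l) ** 2 for n in (1, 2, 3)]
for n, (a, b) in enumerate(zip(numeric, exact), start=1):
    print(f"n={n}: shooting {a:.6f}, formula (2) {b:.6f}")
```

The two columns should agree to several decimal places, which is consistent with (2) for the first three eigenvalues.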
3 Recasting as a Variational Problem

The Sturm Liouville equation can be recast as an isoperimetric problem whereby the
Euler-Lagrange equation is equivalent to the original equation. We will require that
β0 = β1 = 0, so that the boundary conditions are y(x0 ) = y(x1 ) = 0. However, this is not
an essential imposition as demonstrated in Chapter 5 of Courant and Hilbert’s Methods
of Mathematical Physics (1989).
The reformulation is attained by a number of smart choices for both the functional J
and the isoperimetric constraint I:
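\begin{align}
J(y) &= \int_{x_0}^{x_1} f(x, y, y')\,dx \tag{3}\\
     &= \int_{x_0}^{x_1} \left(p(x)[y'(x)]^2 + q(x)[y(x)]^2\right)dx \tag{4}\\
I(y) &= \int_{x_0}^{x_1} g(x, y, y')\,dx \tag{5}\\
     &= \int_{x_0}^{x_1} r(x)[y(x)]^2\,dx = L \tag{6}
\end{align}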
Often L = 1, which is to say that the normalisation restriction has been imposed or
that y has been scaled by an appropriate constant. We will demonstrate that the Euler
Lagrange equation under the isoperimetric constraint is the Sturm Liouville equation.
Firstly, the Euler-Lagrange equation for the isoperimetric constraint functional $I$ is
$$-2r(x)y(x) = 0.$$
Since $r > 0$, this is satisfied only by $y = 0$, so no nontrivial $y$ can be an extremal of $I$.
Recall the following theorem (van Brunt, 2004; Wang, 2013):
Theorem 4. Suppose $J$ has an extremum at $y$ subject to the boundary conditions and
isoperimetric constraint, and $y$ is not an extremal of $I$. Then there exists $\lambda$ which satisfies
$$\frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0, \tag{7}$$
where
$$F = f - \lambda g. \tag{8}$$
By substituting $f$ and $g$ into (8), we find
$$F = py'^2 + qy^2 - \lambda r y^2.$$
We can hence compute (7) to get
$$-(py')' + qy - \lambda r y = 0 \tag{9}$$

which is the Sturm Liouville equation. Hence, for our choices of $J$ and $I$, the Lagrange
multiplier plays the role of the eigenvalue parameter. In analogy to the first variation
test, the condition is necessary but not sufficient: if $y$ is extremal and $y \neq 0$, then
$y$ satisfies (9) for some eigenvalue $\lambda$. By Theorem 1 and Theorem 2, the eigenvalues are
distinct and each corresponds to a single eigenfunction.
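The computation of (7) for this choice of $F$ can be spelled out in two steps. The block below is only a worked version of the step already stated above, with $F = py'^2 + qy^2 - \lambda r y^2$ and no additional assumptions.

```latex
\begin{align*}
\frac{\partial F}{\partial y'} &= 2p\,y', &
\frac{\partial F}{\partial y} &= 2q\,y - 2\lambda r\,y, \\
\frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y}
  &= 2(p\,y')' - 2q\,y + 2\lambda r\,y = 0 .
\end{align*}
```

Dividing by $-2$ gives $-(py')' + qy - \lambda r y = 0$, which is (9).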
4 The First Eigenvalue

The first eigenvalue $\lambda_1$ is defined as the least element of the monotonically increasing
spectrum of eigenvalues $\{\lambda_n\}$ (see Theorem 1). This is usually the most important eigenvalue
in physical applications because it tends to mark the first point at which the relevant effect occurs.
For example, in quantum theory, it corresponds to the lowest energy wavefunction and
in engineering it may correspond to the point at which a structure will instantaneously
collapse. It will be proven that the first eigenvalue is equal to the minimum of the Rayleigh
quotient defined below. This will provide a helpful method of approximation for the first eigenvalue.
4.1 Characterisation of the first eigenvalue

The following Rayleigh quotient $R$ is important in Sturm Liouville theory because, once the
problem has been recast as a variational problem, its value at an eigenfunction equals the
corresponding eigenvalue.
Definition 1. $R(y) = \dfrac{J(y)}{I(y)}$
Its relevance is immediately apparent from Lemma 1.

Lemma 1. $R(\phi_n) = \lambda_n$, $n = 1, 2, \ldots$
Proof. We start with the Sturm Liouville equation (1), multiply each side by $y$ and
integrate between $x_0$ and $x_1$ to give
$$\int_{x_0}^{x_1} -p'yy'\,dx + \int_{x_0}^{x_1} (-pyy'' + qy^2)\,dx = \lambda \int_{x_0}^{x_1} ry^2\,dx.$$
Integrating the first term by parts and noting that $y(x_0) = y(x_1) = 0$ gives
$$\Big[-pyy'\Big]_{x_0}^{x_1} + \int_{x_0}^{x_1} (py'^2 + pyy'')\,dx + \int_{x_0}^{x_1} (-pyy'' + qy^2)\,dx = \lambda \int_{x_0}^{x_1} ry^2\,dx,$$
that is,
$$J(y) = \lambda I(y).$$
We notice that we can choose $y$ to be any nontrivial solution $\phi_n$, so that
$$R(\phi_n) = \lambda_n.$$
We may now prove the following theorem. Note that it is different to the lemma,
because y is not necessarily a solution to (1).
Theorem 5. $\min_{y \in S'} R(y) = \lambda_1$,
where $S'$ is the set of functions in $C^2[x_0, x_1]$, not identically zero, that satisfy the boundary conditions.
Proof. Let
$$\min_{y \in S'} R(y) = \Lambda, \qquad \hat{y} = y + \epsilon\eta,$$
where $\hat{y} \in S'$ and $\eta$ is any smooth function with $\eta(x_0) = \eta(x_1) = 0$.

To emphasise, in this proof $y$ denotes the function in $S'$ at which $R$ attains its minimum,
in contrast to the rest of this report, where $y$ may be any function in $S'$. The reason for this
decision is to present the equations in their familiar form.
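We start by integrating $g(x, \hat{y}, \hat{y}')$ from $x_0$ to $x_1$ so that
$$I(\hat{y}) = I(y) + O(\epsilon).$$
Similarly, we integrate $f(x, \hat{y}, \hat{y}')$ from $x_0$ to $x_1$ and apply an integration by parts
similar to that in Lemma 1, so that
$$J(\hat{y}) = J(y) + 2\epsilon \int_{x_0}^{x_1} \eta\left(-(py')' + qy\right)dx + O(\epsilon^2).$$
By the definition of $\Lambda$ we have $J(y) = \Lambda I(y)$, so
$$J(\hat{y}) = \Lambda I(y) + 2\epsilon \int_{x_0}^{x_1} \eta\left(-(py')' + qy\right)dx + O(\epsilon^2).$$
Subtracting $\Lambda I(\hat{y})$ from each side and rearranging, using
$I(\hat{y}) = I(y) + 2\epsilon \int_{x_0}^{x_1} r y \eta\,dx + O(\epsilon^2)$:
$$J(\hat{y}) - \Lambda I(\hat{y}) = 2\epsilon \int_{x_0}^{x_1} \eta\left(-(py')' + qy - \Lambda r y\right)dx + O(\epsilon^2).$$
Hence, we are equipped to compute the following, remembering that $I(y) = 0$ only for
the trivial solution.
\begin{align*}
R(\hat{y}) - R(y) &= \frac{J(\hat{y})}{I(\hat{y})} - \frac{J(y)}{I(y)} \\
 &= \frac{J(\hat{y}) - \Lambda I(\hat{y})}{I(\hat{y})} \\
 &= \frac{2\epsilon \int_{x_0}^{x_1} \eta\left(-(py')' + qy - \Lambda r y\right)dx + O(\epsilon^2)}{I(\hat{y})}
\end{align*}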
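As $R(\hat{y}) - R(y)$ is never negative, the term of order $\epsilon$ must vanish, because
otherwise, for small $\epsilon$, a suitable choice of $\eta$ and of the sign of $\epsilon$ would make the
difference negative.
Since $\eta$ is arbitrary, this gives the Sturm Liouville equation (1) with $\lambda = \Lambda$, and therefore
$\Lambda$ must be an eigenvalue.
From Lemma 1, for all $n$,
$$\lambda_n = R(\phi_n) \geq \min_{y \in S'} R(y) = \Lambda.$$
Since $\Lambda$ is itself an eigenvalue and no eigenvalue is smaller, $\Lambda = \lambda_1$. Therefore,
$$\min_{y \in S'} R(y) = \lambda_1.$$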
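A short worked example may help to show how sharp Theorem 5 can be. Take the simple problem of Section 2.1 ($p = r = 1$, $q = 0$ on $[0, l]$) and the trial function $y = x(l - x)$, which satisfies the boundary conditions but is not an eigenfunction. The choice of trial function is an assumption made for illustration.

```latex
\begin{align*}
J(y) &= \int_0^l (l - 2x)^2\,dx = \frac{l^3}{3}, \\
I(y) &= \int_0^l x^2 (l - x)^2\,dx = \frac{l^5}{30}, \\
R(y) &= \frac{J(y)}{I(y)} = \frac{10}{l^2} \;\geq\; \frac{\pi^2}{l^2} = \lambda_1
  \approx \frac{9.87}{l^2}.
\end{align*}
```

Even this crude polynomial overestimates $\lambda_1$ by less than two percent, and the inequality $R(y) \geq \lambda_1$ holds as the theorem requires.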
4.2 Approximation of the First Eigenvalue

Because the first eigenvalue is so widely used in applications, it is helpful to have
a ready method of approximation. Because it is rarely easy to find the global minimum
of the functional $R$, which is non linear and fractional, we find an upper and a lower bound.
However, due to the generality of the following approximation, for many particular
equations the bounds can be improved by slight modifications to
the process.
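From Theorem 5, $R(y) \geq \lambda_1$ for every $y \in S'$, and so we have our upper bound. To find a
lower bound, we create a comparison problem by constructing $\bar{R}(y) := \bar{J}(y)/\bar{I}(y)$ such that
it is no greater than $R(y)$ and always explicitly solvable. To achieve the desired effect, we
change $p$, $q$ and $r$ to $\bar{p}$, $\bar{q}$ and $\bar{r}$ accordingly. If we impose $p(x) \geq \bar{p}(x)$ and
$q(x) \geq \bar{q}(x)$ this gives
$$\bar{J}(y) = \int_{x_0}^{x_1} (\bar{p}y'^2 + \bar{q}y^2)\,dx \leq J(y).$$
Similarly, by choosing $r(x) \leq \bar{r}(x)$, we find
$$\bar{I}(y) = \int_{x_0}^{x_1} \bar{r}y^2\,dx \geq I(y).$$
As required, for any $y$ with $\bar{J}(y) \geq 0$ (or whenever $\bar{r} = r$), we have
$$\bar{R}(y) = \frac{\bar{J}(y)}{\bar{I}(y)} \leq \frac{J(y)}{I(y)} = R(y),$$
so that
$$\min_{y \in S'} R(y) = \lambda_1 \geq \min_{y \in S'} \bar{R}(y) =: \bar{\lambda}_1.$$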

However, in order for this result to be useful, we need the comparison problem for $\bar{R}(y)$
to be explicitly solvable, and therefore it can be helpful to choose its coefficients as constants:
$$\bar{p}(x) = \min_{x \in [x_0, x_1]} p(x) \equiv p_m$$
$$\bar{q}(x) = \min_{x \in [x_0, x_1]} q(x) \equiv q_m$$
$$\bar{r}(x) = \max_{x \in [x_0, x_1]} r(x) \equiv r_M$$
The explicit solution to the Sturm Liouville equation associated with $\bar{R}(y)$ (by
Section 3), with boundary conditions $y(0) = y(x_1 - x_0) = 0$, has already been illustrated
in (2):
$$\bar{\lambda}_n = \frac{1}{r_M}\left(\frac{p_m n^2 \pi^2}{(x_1 - x_0)^2} + q_m\right).$$
Hence, we have the following bounds:
$$\min_{y \in S'} \bar{R}(y) \leq \lambda_1 \leq R(y)$$
or more specifically
$$\frac{1}{r_M}\left(\frac{p_m \pi^2}{(x_1 - x_0)^2} + q_m\right) \leq \lambda_1 \leq R(y). \tag{10}$$
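The two-sided bound (10) is straightforward to evaluate numerically. The following is a minimal sketch, assuming the coefficients and a trial function are supplied as Python callables; the function names, the sample coefficients $p = 1 + x$, $q = x$, $r = 1 + x^2$ on $[0, 1]$ and the trial function $\sin(\pi x)$ are all hypothetical choices, and the minima and maximum are approximated by sampling on a grid.

```python
import math

def trapezoid(values, h):
    """Composite trapezoid rule for equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def first_eigenvalue_bounds(p, q, r, y, dy, x0, x1, n=4000):
    """Return (lower, upper) for lambda_1 following (10).

    lower uses the constants p_m, q_m, r_M sampled on a grid;
    upper is the Rayleigh quotient R(y) = J(y)/I(y) of the trial function y.
    """
    h = (x1 - x0) / n
    xs = [x0 + i * h for i in range(n + 1)]
    p_m = min(p(x) for x in xs)
    q_m = min(q(x) for x in xs)
    r_M = max(r(x) for x in xs)
    lower = (p_m * math.pi ** 2 / (x1 - x0) ** 2 + q_m) / r_M
    J = trapezoid([p(x) * dy(x) ** 2 + q(x) * y(x) ** 2 for x in xs], h)
    I = trapezoid([r(x) * y(x) ** 2 for x in xs], h)
    return lower, J / I

# Hypothetical coefficients with q >= 0, so the comparison quotient is non-negative.
lower, upper = first_eigenvalue_bounds(
    p=lambda x: 1.0 + x,
    q=lambda x: x,
    r=lambda x: 1.0 + x * x,
    y=lambda x: math.sin(math.pi * x),            # trial function, y(0) = y(1) = 0
    dy=lambda x: math.pi * math.cos(math.pi * x),  # its derivative
    x0=0.0,
    x1=1.0,
)
print(f"{lower:.4f} <= lambda_1 <= {upper:.4f}")
```

For these sample coefficients $p_m = 1$, $q_m = 0$ and $r_M = 2$, so the lower bound is $\pi^2/2$. The gap between the two bounds illustrates how coarse the constant-coefficient comparison can be when the coefficients vary appreciably.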
5 Higher Eigenvalues
5.1 Characterisation of Higher Eigenvalues
The following is a useful theorem that will not be proven. See Wan (1993, pp. 284-285) for
a brief sketch of the proof.
Theorem 6. Let $\Omega_{n-1}$ be the set of $z = (z_1, \ldots, z_{n-1})$ such that $z_k \in S'$, $k = 1, \ldots, n-1$,
and, for a given $z$, let $S_n'$ be the set of functions $y \in S'$ such that $\langle y, z_k \rangle = 0$ for
$k = 1, \ldots, n-1$. Then
$$\lambda_n = \max_{z \in \Omega_{n-1}} \left(\min_{y \in S_n'} R(y)\right).$$
5.2 Approximation of Higher Eigenvalues

The first step in the approximation of eigenvalues is to transform the Sturm Liouville
equation (1) into a form which is simpler insofar as $p = r = 1$.
Under the rather messy transformation
\begin{align}
\psi(x) &= (r(x)p(x))^{1/4}\, y(x) \tag{11}\\
t &= \int_0^x \sqrt{\frac{r(\xi)}{p(\xi)}}\,d\xi \tag{12}\\
f(x) &= \frac{\left((r(x)p(x))^{1/4}\right)''}{(r(x)p(x))^{1/4}} + \frac{q(x)}{r(x)} \tag{13}\\
l &= \int_0^\pi \sqrt{\frac{r(x)}{p(x)}}\,dx \tag{14}\\
\psi(0) &= \psi(l) = 0, \tag{15}
\end{align}
the Sturm Liouville equation (1) becomes
$$\psi''(t) - f(t)\psi(t) + \lambda\psi(t) = 0 \tag{17}$$

We can describe this new equation using notation familiar from the calculus of variations:
$$R(\psi) = \frac{J(\psi)}{I(\psi)}$$
$$J(\psi) = \int_0^l \left(\psi'^2 + f(t)\psi^2\right)dt$$
$$I(\psi) = \int_0^l \psi^2\,dt$$
Our first and only approximation is to introduce the constant
$$k = \max_{t \in [0, l]} |f(t)|. \tag{18}$$
What follows are a number of definitions of functionals which use this constant
and will be used to bound $J$ and $R$:
$$J^+(\psi) := \int_0^l (\psi'^2 + k\psi^2)\,dt$$
$$J^-(\psi) := \int_0^l (\psi'^2 - k\psi^2)\,dt$$
$$R^+(\psi) := \frac{J^+(\psi)}{I(\psi)}$$
$$R^-(\psi) := \frac{J^-(\psi)}{I(\psi)}$$
Rearrangement of the Rayleigh quotients gives
$$R^+(\psi) = \frac{\int_0^l \psi'^2\,dt}{I(\psi)} + k$$
$$R^-(\psi) = \frac{\int_0^l \psi'^2\,dt}{I(\psi)} - k$$
so that
$$R^-(\psi) \leq R(\psi) \leq R^+(\psi)$$
or in other words,
$$-k \leq R(\psi) - \hat{R}(\psi) \leq k \tag{19}$$
where
$$\hat{R}(\psi) = \frac{\int_0^l \psi'^2\,dt}{I(\psi)}.$$
From (4), (6) and Definition 1, $\hat{R}(\psi)$ is associated with $r = p = 1$, $q = 0$. From Section
3, the following Sturm Liouville equation is an equivalent problem:
$$\psi'' + \hat{\lambda}\psi = 0$$
$$\psi(0) = \psi(l) = 0$$
This is explicitly solvable and indeed was solved in (2):
$$\hat{\lambda}_n = \frac{n^2\pi^2}{l^2} \tag{20}$$
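From Theorem 6,
\begin{align*}
\lambda_n &= \max_{z \in \Omega_{n-1}} \left(\min_{\psi \in S_n'} R(\psi)\right) \\
 &= \max_{z \in \Omega_{n-1}} \left(\min_{\psi \in S_n'} \left(\hat{R}(\psi) + C\right)\right), \quad -k \leq C \leq k \\
 &= \max_{z \in \Omega_{n-1}} \left(\min_{\psi \in S_n'} \hat{R}(\psi)\right) + C \\
 &= \hat{\lambda}_n + O(1) \\
 &= \frac{n^2\pi^2}{l^2} + O(1) \\
 &= n^2\pi^2 \left(\int_0^\pi \sqrt{\frac{r(x)}{p(x)}}\,dx\right)^{-2} + O(1).
\end{align*}
The second line uses (19), the fifth line (20) and the sixth line (14).
Therefore, dividing by $n^2$ and letting $n \to \infty$,
$$\lim_{n \to \infty} \frac{\lambda_n}{n^2} = \frac{\pi^2}{\left(\int_0^\pi \sqrt{\frac{r(x)}{p(x)}}\,dx\right)^2} \tag{21}$$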
6 Case study of the Mathieu equation

This report will conclude with an application of the methods of approximation that have
been developed from the synthesis of Sturm Liouville theory and the calculus of variations.
This will demonstrate the applicability of the approximations, but also that the
methods may be tweaked slightly to improve the error bounds for a particular example.
This example will be the Mathieu equation, for which $p = 1$, $r = 1$ and $q = 2\theta\cos(2x)$:
$$y'' + (\lambda - 2\theta\cos(2x))y = 0 \tag{22}$$
We will be using the boundary conditions $y(0) = y(\pi) = 0$.

The Mathieu equation (whose solutions are known as Mathieu functions) is a special case of
the Sturm Liouville equation that was developed by Émile Léonard Mathieu in 1868, after
Sturm and Liouville's seminal work. In his Traité de physique mathématique (Treatise on
Mathematical Physics) he applied his canonical equation to a number of two dimensional wave
problems including elasticity and electrodynamics (Complete Dictionary of Scientific
Biography, 2013). Since then, it has been applied to unforeseen fields including quantum
theory and general relativity (Garcia-Ravelo et al. 2012). However, the equation is not in
general explicitly solvable, and large, useful and uninteresting tables have been developed
containing numerical approximations for the many different values which may be taken by the
parameters (Bickley, 1945).
6.1 The First Eigenvalue
Finding a lower bound is a simple matter of following precisely the method established
in Section 4.2. This method involves choosing $\bar{p}$ and $\bar{q}$ as the minimum values of $p$
and $q$, and $\bar{r}$ as the maximum value of $r$. That is, $p_m = 1$, $r_M = 1$ and $q_m = -2|\theta|$.
Substituting into (10) with $x_1 - x_0 = \pi$, we immediately obtain
$$\lambda_1 \geq 1 - 2|\theta|.$$
For the upper bound, recall that any $y \in S'$ may be chosen, and we choose $y(x) = \sin x$.
Then,
$$R(y) = \frac{\int_0^\pi \left(((\sin x)')^2 + 2\theta\cos(2x)\sin^2 x\right)dx}{\int_0^\pi \sin^2 x\,dx} = 1 - \theta.$$
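The value $R(\sin x) = 1 - \theta$ follows from three elementary integrals. The block below fills in this intermediate step, using only the identity $\sin^2 x = (1 - \cos 2x)/2$.

```latex
\begin{align*}
\int_0^\pi \cos^2 x\,dx &= \frac{\pi}{2}, \qquad
\int_0^\pi \sin^2 x\,dx = \frac{\pi}{2}, \\
\int_0^\pi \cos(2x)\sin^2 x\,dx
  &= \int_0^\pi \cos(2x)\,\frac{1 - \cos(2x)}{2}\,dx = -\frac{\pi}{4}, \\
R(\sin x) &= \frac{\dfrac{\pi}{2} + 2\theta\left(-\dfrac{\pi}{4}\right)}{\dfrac{\pi}{2}} = 1 - \theta .
\end{align*}
```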
In summary, Figure 1 indicates the upper and lower bounds of the first eigenvalue for
different values of θ.
Figure 1: The approximation of the first eigenvalue of the Mathieu equation. Values in
the range are in blue and the true values are in red.
6.2 Higher eigenvalues

Using the method established in Section 5.2, we can simply substitute $p = 1$ and $r = 1$
into (21) and find that
$$\lim_{n \to \infty} \frac{\lambda_n}{n^2} = 1.$$
Figure 2 indicates that in fact this can be a remarkably successful approximation, but
a priori, we have no reason to expect this.
Figure 2: The eigenvalues of the Mathieu equation compared to their approximation.
An approximation without knowledge of the error is meaningless, and so we will
derive error bounds. This derivation additionally illustrates that the method from Section
5.2 may not be optimal, and we will use this method with one major alteration. The
approximation was to replace $f(t)$ with $k$, a constant term that was denoted $O(1)$. This
time, we will bound $f(t)$ by a term dominated by $\theta$. Firstly, since $r = p = 1$ (so that
$t = x$), we calculate that
$$f(t) = \frac{((rp)^{1/4})''}{(rp)^{1/4}} + \frac{q}{r} = 2\theta\cos(2x).$$
It is clear that $|f(t)| \leq 2|\theta|$.
Therefore, the approximation in Section 5.2 can be repeated with $k = 2|\theta|$ so that
\begin{align*}
\lambda_n &= \max_{z \in \Omega_{n-1}} \left(\min_{\psi \in S_n'} R(\psi)\right) \\
 &\leq \max_{z \in \Omega_{n-1}} \left(\min_{\psi \in S_n'} \hat{R}(\psi)\right) + 2|\theta| \\
 &= n^2 + 2|\theta|,
\end{align*}
and similarly $\lambda_n \geq n^2 - 2|\theta|$. That is,
$$|\lambda_n - n^2| \leq 2|\theta|.$$
The following (Figure 3) illustrates the implication of this inequality for some values
of θ.
11
.
Figure 3: The range in which higher eigenvalues of the Mathieu equation must be found
when |λn − n2 | ≤ 2θ.
Unfortunately as with the first eigenvalue, the approximation is less successful for
larger values of θ. We would expect that a better approximation would exist, and the
methods in common usage are iterative.
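The bounds of this section can be checked against a direct numerical solution. The sketch below is not one of the methods of this report: it discretises (22) with second order central differences, which is a swapped-in technique, and compares the lowest eigenvalues of the resulting matrix with $1 - 2|\theta| \leq \lambda_1 \leq 1 - \theta$ and $|\lambda_n - n^2| \leq 2|\theta|$. The function name, the grid size and the value $\theta = 1$ are assumptions chosen for illustration.

```python
import numpy as np

def mathieu_dirichlet_eigenvalues(theta, count=5, n=800):
    """Approximate the lowest eigenvalues of -y'' + 2*theta*cos(2x) y = lam*y
    with y(0) = y(pi) = 0, using central differences on n interior grid points."""
    h = np.pi / (n + 1)
    x = h * np.arange(1, n + 1)
    main = 2.0 / h**2 + 2.0 * theta * np.cos(2.0 * x)  # diagonal: -y'' part plus q(x)
    off = -np.ones(n - 1) / h**2                        # off-diagonals of -y''
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(A)[:count]                # ascending order

theta = 1.0
lams = mathieu_dirichlet_eigenvalues(theta)
ns = np.arange(1, len(lams) + 1)
deviation = np.abs(lams - ns**2)

for n_, lam, dev in zip(ns, lams, deviation):
    print(f"n={n_}: lambda_n ~ {lam:.4f}, |lambda_n - n^2| ~ {dev:.4f}")
print("first-eigenvalue bounds:", 1 - 2 * abs(theta), "<= lambda_1 <=", 1 - theta)
```

In this sketch the deviations $|\lambda_n - n^2|$ shrink as $n$ grows, which suggests that the uniform bound $2|\theta|$ is most pessimistic for the higher eigenvalues and tightest for the first.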
7 Concluding Remarks
The application of the calculus of variations to this Sturm Liouville problem increases one’s
appreciation for the vast scope of this area of mathematics, and in particular, the famous
Euler Lagrange equation. The Sturm Liouville equation is recast through intelligent
algebraic manipulations and most of the consequent theorems utilise similar tactics.
Although this report has asserted that the calculus of variations has been applied,
this is not to say that the proofs revolve around the discovery of the extrema of J.
As was observed in Section 3, these would not be sufficient (only necessary) for each
eigenvalue, and consequently new theorems had to be developed which would cover the
entire spectrum. Otherwise, one could not be certain that the lowest λ was the first
eigenvalue, nor that the limits were infinite.
The calculus of variations lends itself readily to eigenvalue problems, and indeed pro-
vides the origin and solution of many.
8 References
Bickley, W.G. 1945 Mathematical Tables and Other Aids to Computation American
Mathematical Society, 1(11), 409-419.
Complete Dictionary of Scientific Biography 2013 Mathieu, Emile Leonard
http://www.encyclopedia.com/doc/1G2-2830902861.html accessed 9 May 2013.
Courant, R., Hilbert, D. 1989 Methods of Mathematical Physics Volume 1, Wiley, New
York.
Garcia-Ravelo, J., Schulze-Halberg, A., Trujillo, A.L. 2012 Explicit formulas for
generalized harmonic perturbations of the infinite quantum well with an application to
Mathieu equations Journal of Mathematical Physics, 53(10), 1-14.
Hanus, R.G. 2008 Sturm Liouville Theory with Applications to Quantum Mechanics Texas
A&M University, ProQuest, UMI Dissertations Publishing, Ann Arbor.
Lützen, J. 1984 Sturm and Liouville's Work on Ordinary Linear Differential Equations.
The Emergence of Sturm-Liouville Theory Archive for History of Exact Sciences,
29(4), 309-376.
van Brunt, B. 2004 The Calculus of Variations Springer.
Wan, F.Y.M. 1993 Introduction to the Calculus of Variations and its Applications
Chapman & Hall.
Wang, K. 2013 Calculus of Variation An Introduction to Isoperimetric Problems
http://www.maths.usyd.edu.au/u/UG/IM/MATH2916/ accessed 2 May 2013.
