
Spring 2011

A collection of useful math


David M. Wood
Department of Physics, Colorado School of Mines

The following sections survey mathematics used frequently in physics courses; they are meant as reminders,
not derivations. These are widely used but typically ‘fall through the cracks’ of lots of physics (or math) courses.
To the (limited) extent that a section depends on another section, they have been placed so that dependent
sections occur downstream.

1. Zeroth Law of math in physics


We are not permitted to add, subtract, or set equal quantities with different dimensions [units]. Corollary:
The functions we encounter in applications must in general depend on dimensionless quantities. Proof: (i)
An arbitrary function has an infinite number of terms in its Taylor series; (ii) if the argument of a function
has dimensions [units, e.g., cm], each term in the series has different dimensions; (iii) since this violates the
Zeroth Law, only dimensionless quantities can occur as arguments; QED. Example: If a problem already
has a natural length scale ℓ, we expect the dependence on distance x to be via the ratio x/ℓ. Warning:
There are important special cases: functions which are pure ‘power laws’, e.g., the Coulomb potential. Since
there is only one power in the function, we often say that ‘there is no natural length scale’:
V_c(r) = −α/r          no natural length scale    (1)

V_y(r) = −α e^{−λr}/r          length scale 1/λ.    (2)

2. Buckingham Π theorem
If a physical problem contains N variables that depend on only P distinct units, there are N − P dimensionless quantities which describe the physics.

(a) Example: Cooking a turkey. The heat conduction equation describes the rate of change of the temper-
ature of a material in terms of material properties and the temperature’s spatial dependence:

∂T/∂t = κ ∇²T,    (3)

where κ is¹ the ‘thermal diffusivity’ (units: m²/sec). We define a turkey as ‘cooked’ when its core
temperature has reached the ‘cooked’ temperature Tc ; see panel (a) of Fig. (1). We expect the relevant
physical variables to be: t, Tc , κ, T , and the turkey volume V . We seek ‘dimensionless groups’ in the
form
T^a T_c^b t^c κ^d V^e.    (5)
Important: Familiarity with statistical mechanics or kinetic theory might suggest that you should ex-
press temperatures in units of (energy/kB ), where kB is Boltzmann’s constant. This would be a mistake,
and you would have to recapitulate a brief but intense controversy in the early 20th century. The up-
shot was that, for logical consistency, the temperature must be assigned its own units, often denoted
[θ]. With this proviso, our dimensionless group in Eq. (5) has net units
[θ]^{a+b} [sec]^c (cm²/sec)^d [cm]^{3e} = [θ]^{a+b} [cm]^{2d+3e} [sec]^{c−d}.    (6)

So we have 5 physical quantities which depend on only 3 distinct units (temperature, time, and length),
so we should be able to construct (according to the Π theorem) 2 dimensionless quantities. We do so
by equating to zero the exponents of each distinct unit. We find a ‘chunk’ of the form

dimensionless quantity = (T/T_c)^a (κt/V^{2/3})^d.    (7)
¹You don’t need to know where κ comes from, but it pops right out of a derivation of the heat flow equation as

κ = σ_therm/(C_V ρ_m)    (4)

where σ_therm is the thermal conductivity [units J/(m·sec·°C)], C_V is the specific heat (at constant volume) [units J/(kg·°C)], and ρ_m is the mass density [units kg/m³], so that κ has units m²/sec.

Since a and d are independent, we have identified the two dimensionless groups—the quantities raised to
the powers a and d. We conclude that the natural unit in which to measure the turkey temperature T is
T_c, and the natural unit in which to measure elapsed time in the oven is V^{2/3}/κ. We can thus write
 
T/T_c = f(κt/V^{2/3}).    (8)
We expect that f(t/t_scl) will in general have an infinite number of terms in its Taylor series, so the
value of d is irrelevant—all powers will be present. When T(t) = T_c the turkey is cooked (see Fig. 1),

Figure 1: Examples of use of dimensionless variables. Panel (a): schematic dependence of turkey temperature T on time t, with the core reaching T_c at the cooking time t_c. Panel (b): solution u(τ) of an ODE using strictly dimensionless variables.

so that we have deduced that the cooking time


t_c ∝ V^{2/3}/κ ∝ (1/κ)(M/ρ_m)^{2/3}.    (9)

Note that we have deduced this without knowing anything about the actual mathematical solutions to
the heat flow equation. However, we can think about the solution a bit: the faster heat diffuses into
the turkey (the larger κ), the shorter the cooking time. V^{2/3} is proportional to the surface area of the
turkey: the only way heat (energy) can flow into the turkey is through its surface. The shape of the
object to be cooked does not affect these ‘scaling arguments’.
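The exponent bookkeeping above can be automated. Here is a minimal sketch (mine, assuming sympy is available; the variable ordering is my choice): the null space of the matrix of unit exponents yields one vector per dimensionless group.

    import sympy as sp

    # Rows: exponents of [theta], [cm], [sec]; columns: T, T_c, t, kappa, V
    # (kappa ~ cm^2/sec, V ~ cm^3), following Eq. (6).
    units = sp.Matrix([
        [1, 1, 0,  0, 0],   # [theta]
        [0, 0, 0,  2, 3],   # [cm]
        [0, 0, 1, -1, 0],   # [sec]
    ])

    # Buckingham: N - P = 5 - 3 = 2 independent dimensionless groups.
    for v in units.nullspace():
        print(v.T)   # each basis vector is a dimensionless group:
                     # powers of T/T_c and of kappa*t/V^(2/3)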
(b) Finding complete solution of an ODE using dimensionless variables
The equation for a particle (mass M) falling vertically with air resistance might look like

M d²z/dt² = −Mg − γ dz/dt,    (10)

with initial conditions, say, z(t = 0) = 0 and (dz/dt)(t = 0) = 1. At first glance, in order to predict the
behavior of the particle we would need to find z = z(t, M, g, γ), so that a plot of the solution would
require 5 dimensions: bummer. The Buckingham Π theorem, however, says that out of the 5 variables
z, t, M, g, γ—which depend only on the 3 units kg, m, and sec—we can construct two dimensionless
groups. This amounts to finding the natural time scale T and distance scale  and casting the solution
(and the ODE) into the form (don’t confuse mass M with the unit meters, m)
 
z/ℓ = f(t/T).    (11)
First we find the dimensions (units) of γ from the ODE, then we look for

ℓ = M^a g^b γ^c = [kg]^a [m/sec²]^b [kg/sec]^c = [m]^b [kg]^{a+c} [sec]^{−2b−c}    (12)

T = M^d g^e γ^f = [kg]^d [m/sec²]^e [kg/sec]^f = [m]^e [kg]^{d+f} [sec]^{−2e−f}    (13)

This time we want ℓ to have units m¹ and T to have units sec¹. Now we find no undetermined exponents,
and ℓ = M²g/γ² and T = M/γ, and we can rewrite the original ODE as

d²u/dτ² + du/dτ + 1 = 0.    (14)

This equation depends on no parameters; whatever (two) initial conditions we are given can be scaled
as well. Thus from requiring four dimensions for a plot we have collapsed down to u = u(τ).
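To see the collapse to u = u(τ) in practice, here is a minimal numerical sketch (mine, assuming scipy): Eq. (14) is solved once, and any particular (M, g, γ) case follows by rescaling with ℓ = M²g/γ² and T = M/γ.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Dimensionless ODE (14): u'' + u' + 1 = 0, as a first-order system.
    def rhs(tau, y):
        u, up = y
        return [up, -up - 1.0]

    sol = solve_ivp(rhs, (0.0, 14.0), [0.0, 1.0], dense_output=True)

    # Recover a dimensional solution for one illustrative parameter set:
    M, g, gamma = 2.0, 9.8, 0.5                # example values
    ell, T = M**2 * g / gamma**2, M / gamma    # natural length and time scales
    t = np.linspace(0.0, 14.0 * T, 5)
    print(ell * sol.sol(t / T)[0])             # z(t) = ell * u(t/T)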

3. Common series

1/(1 + x) = 1 − x + x² − x³ + ...    (15)

converges for |x| < 1.

e^x = 1 + x + x²/2! + ... + x^n/n! + ...    (16)

converges for all x.

sin x = x − x³/3! + x⁵/5! − ...    (17)

converges for all x.

(a + b)^n = a^n + n a^{n−1} b + [n(n−1)/2!] a^{n−2} b² + ... = Σ_{j=0}^{n} [n!/(j!(n−j)!)] a^{n−j} b^j

is the binomial theorem. Since this coincides with a Taylor series of (a + b)^n about b = 0, it applies even
for non-integer values of n, so that, for example, √(1 + δ) ≈ 1 + δ/2 − δ²/8 + ...

log(1 + x) = x − x²/2 + x³/3 − ... = Σ_{n=1}^{∞} (−1)^{n−1} x^n/n    (18)
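A quick sympy check of these expansions (my sketch, not part of the original notes):

    import sympy as sp

    x, d = sp.symbols('x delta')

    print(sp.series(1/(1 + x), x, 0, 4))       # 1 - x + x**2 - x**3 + O(x**4)
    print(sp.series(sp.exp(x), x, 0, 4))       # 1 + x + x**2/2 + x**3/6 + O(x**4)
    print(sp.series(sp.sin(x), x, 0, 6))       # x - x**3/6 + x**5/120 + O(x**6)
    print(sp.series(sp.log(1 + x), x, 0, 4))   # x - x**2/2 + x**3/3 + O(x**4)
    # Binomial theorem with non-integer n = 1/2:
    print(sp.series(sp.sqrt(1 + d), d, 0, 3))  # 1 + delta/2 - delta**2/8 + O(delta**3)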

4. Summing finite geometrical series


The finite sum

S ≡ Σ_{n=0}^{N} z^n    (19)

can be evaluated by subtracting zS from S:

S = Σ_{n=0}^{N} z^n = 1 + z + z² + ... + z^N    (20)

zS = z Σ_{n=0}^{N} z^n = z + z² + ... + z^{N+1}    (21)

so that

S − zS = S(1 − z) = 1 + (z − z) + (z² − z²) + ... + (0 − z^{N+1})    (22)

and thus

S = (1 − z^{N+1})/(1 − z).    (23)
This works for complex z too, e.g.,


Σ_{n=0}^{N} e^{inz} = Σ_{n=0}^{N} (e^{iz})^n = [1 − e^{i(N+1)z}] / [1 − e^{iz}]
   = e^{+i(N+1)z/2} [e^{−i(N+1)z/2} − e^{+i(N+1)z/2}] / { e^{+iz/2} [e^{−iz/2} − e^{+iz/2}] }
   = e^{iNz/2} [ sin((N+1)z/2) / sin(z/2) ].
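A numerical sanity check of the closed forms for a complex z (my sketch, assuming numpy):

    import numpy as np

    z, N = 0.7 + 0.2j, 12

    direct = sum(np.exp(1j * n * z) for n in range(N + 1))
    closed = (1 - np.exp(1j * (N + 1) * z)) / (1 - np.exp(1j * z))
    phased = np.exp(1j * N * z / 2) * np.sin((N + 1) * z / 2) / np.sin(z / 2)
    print(np.allclose(direct, closed), np.allclose(closed, phased))  # True True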

5. Volume elements

d(volume) = dV = dr = d³x = dx dy dz,    (24)


where the last pair is appropriate to Cartesian (x, y, z) coordinates. All of these are useful and much used;
dr is perhaps the most flexible, reminding us that the volume element (although not a vector, despite the
notation) depends on the coordinate system selected.

6. Heaviside (step) function


This is defined by

θ(x) = 0 for x ≤ 0;   θ(x) = 1 for x > 0.    (25)

7. Dirac δ function

• Properties:

∫_{−∞}^{+∞} dx f(x) δ(x − a) = f(a)    (26)

∫_a^b dx f(x) δ(x − c) = f(c) θ(b − c) θ(c − a)    (27)

(means: the delta function can only contribute if the point where it diverges lies in the range of inte-
gration).
• Relation to step function

dθ(x)/dx = δ(x)    (28)

[Proof: integrate ∫ dx f(x) (dθ/dx) by parts.]
• Common representations The following functions in the appropriate limit have the properties of the
Dirac δ function, and occur in various physical applications:
(a) (1/2π) ∫_{−∞}^{+∞} dk e^{ikx}

(b) lim_{b→∞} sin(bx)/(πx)

(c) lim_{n→∞} sin[(n + ½)x] / [2π sin(x/2)]

(d) lim_{a→0} (1/π) a/(a² + x²)

(e) lim_{b→∞} (b/√π) e^{−b²x²}

(f) Box function: lim_{a→∞} { 0 for x < −1/(2a);  a for −1/(2a) < x < +1/(2a);  0 for x > +1/(2a) }
There are more at http://functions.wolfram.com/GeneralizedFunctions/DiracDelta/09/
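As a numerical illustration of representation (d) (my sketch, assuming scipy), the Lorentzian applied to a smooth test function reproduces f(0) as a → 0:

    import numpy as np
    from scipy.integrate import quad

    f = lambda x: 1.0 / (1.0 + x**2)   # smooth test function; f(0) = 1

    for a in (1.0, 0.1, 0.01):
        # Representation (d): (1/pi) a/(a^2 + x^2) -> delta(x) as a -> 0
        val, _ = quad(lambda x: f(x) * a / (np.pi * (a**2 + x**2)),
                      -np.inf, np.inf)
        print(a, val)   # tends to f(0) = 1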
• Other properties

(a) δ(−x) = δ(x)
(b) x δ(x) = 0
(c) δ(ax) = δ(x)/|a|
(d) δ functions of more complicated arguments:

δ[φ(x)] = Σ_i δ(x − x_i)/|dφ/dx|_{x=x_i}

(where the {x_i} are the roots of φ(x) = 0). Proof: Expand φ(x) around one of its zeroes and use
the previous item; each zero must contribute, so we must sum over all of them. (If the derivatives
at the roots are themselves zero, we must do more work.)

• Delta functions in non-Cartesian coordinate systems
If the length elements associated with curvilinear coordinates u, v, and w are du/U(u, v, w), dv/V(u, v, w),
and dw/W(u, v, w), then

δ(r − r′) = U·V·W δ(u − u′) δ(v − v′) δ(w − w′).    (29)

Example: In spherical polar coordinates the product of path lengths which gives rise to the volume
element is (r sinθ dφ)(r dθ)(dr), so with u = r, v = θ, w = φ we can identify du/U = dr, dv/V = r dθ,
dw/W = r sinθ dφ. Thus U = 1, V = 1/r, W = 1/(r sinθ), and

δ(r − r′) = [1/(r² sinθ)] δ(r − r′) δ(θ − θ′) δ(φ − φ′).    (30)

Note that the units are what they should be.

8. Fourier analysis and the Dirac δ function


Using one common convention for how to distribute the necessary factors of 2π between the function and
its Fourier transform²,

f(x) = ∫_{−∞}^{+∞} (dk/√(2π)) f(k) e^{ikx}

f(k) = ∫_{−∞}^{+∞} (dx/√(2π)) f(x) e^{−ikx}

so

f(x) = ∫_{−∞}^{+∞} (dk/√(2π)) e^{ikx} ∫_{−∞}^{+∞} (dx′/√(2π)) f(x′) e^{−ikx′}
     = ∫_{−∞}^{+∞} dx′ f(x′) ∫_{−∞}^{+∞} (dk/2π) e^{ik(x−x′)},

possible only if

δ(x − x′) = ∫_{−∞}^{+∞} (dk/2π) e^{ik(x−x′)},    (31)
which is the most important ‘representation’ of the Dirac δ function. In three dimensions this becomes

δ(r − r′) = δ(x − x′) δ(y − y′) δ(z − z′) = ∫ [dk/(2π)³] e^{ik·(r−r′)}.    (32)

9. Fourier transforms in multiple dimensions (including our favorite, three)


As mentioned above, physicists usually use the Fourier transform conventions (in d dimensions)

f(r) = ∫_{all k} [d^d k/(2π)^d] f(k) e^{ik·r}    (33)

f(k) = ∫_{all r} d^d r f(r) e^{−ik·r}    (34)

for functions which are sufficiently well behaved (‘square integrable’). Here the dimensions (units) of
f(k) are ([d-dimensional] volume) times those of f(r), and d^d k = Π_{i=1}^{d} dk_i, e.g., in three dimensions
dk = dk_x dk_y dk_z = d³k. With this asymmetric choice, the factors of 2π are lumped into the inverse
Fourier transform, rather than symmetrically distributed between the Fourier transform and the inverse
transform, as in some texts.
Examples

(a) General spherically symmetrical function


For functions which depend only on the distance from some origin (i.e., which are ‘spherically sym-
metrical’ functions f(r)), it is natural to use spherical polar coordinates (r, θ, φ) [see Fig. 2]. In
²In physics the most common convention is actually:

f(x) = ∫_{−∞}^{+∞} (dk/2π) f(k) e^{ikx},    f(k) = ∫_{−∞}^{+∞} dx f(x) e^{−ikx}

Figure 2: Geometry and spherical polar coordinates (with the z axis chosen along k) used to evaluate the Fourier
transform of a spherically symmetric function.

these coordinates the ‘polar angle’ θ ranges from 0 to π (radians) and the ‘azimuthal angle’ φ ranges
from 0 to 2π (radians), while the length r of the vector r ranges from 0 to ∞. The volume element
dr = dV = dx dy dz in these coordinates is r² dr sinθ dθ dφ.
Since in calculating the Fourier transform f (k) we need to do an integral over all space, we can adopt
any orientation of our coordinate system which is convenient. It is wildly convenient to rotate our
axes so that the z axis lies along the vector k [as shown in Fig. 2]. Then, if f (r) = f (r ) [spherically
symmetrical function],
k·r = kr cosθ,

and our integral becomes

f(k) = ∫_0^∞ dr r² ∫_0^π dθ sinθ ∫_0^{2π} dφ f(r) e^{ikr cosθ}
     = 2π ∫_0^∞ dr r² f(r) ∫_{cosθ=1}^{cosθ=−1} [−d(cosθ)] e^{ikr cosθ}
     = 2π ∫_0^∞ dr r² f(r) ∫_{−1}^{+1} dμ e^{ikrμ},

where we have done the φ integral and made a change of variables to μ ≡ cosθ. But

∫_{−1}^{+1} dμ e^{ikrμ} = [e^{ikrμ}/(ikr)]_{μ=−1}^{+1} = 2 sin(kr)/(kr).

Finally, then, we have for any spherically symmetric function f(r) that

f(k) = ∫_0^∞ dr 4π r² f(r) [sin(kr)/(kr)].

The same process can be applied to compute inverse Fourier transforms of functions f(|k|) = f(k)
(aligning the k_z axis along the vector r), yielding

f(r) = [1/(2π)³] ∫_0^∞ dk 4π k² f(k) [sin(kr)/(kr)] = f(r).
(b) Specific example
Suppose that

f(r) = −(Ze/r) e^{−λr}.

(Such a form emerges from the ‘Thomas-Fermi’ model for the electrostatic potential due to a point
charge Ze after ‘screening’ by a gas of electrons; here λ depends on the density of that gas. It is
also known as the ‘Yukawa potential’ in particle physics.) It is easy to use Mathematica to evaluate
such integrals, as is shown in Fig. 3. Note that in the limit λ → 0 we properly recover the Fourier
transform of the ‘bare’ 1/r Coulomb potential, namely

f_Coulomb(k) = −4π Ze/k².    (35)

Example: screened Coulomb potential. Original function, Fourier transform, back transform:

    f[r_] := -Z e Exp[-Λ r]/r

    fofk = Simplify[4 Pi Integrate[r^2 Sin[k r]/(k r) f[r], {r, 0, Infinity},
        GenerateConditions -> False]]
        (* -> -(4 e Pi Z)/(k^2 + Λ^2) *)

    inverseFT = Integrate[4 Pi k^2 Sin[k r]/(k r) fofk, {k, 0, Infinity},
        GenerateConditions -> False]/(2 Pi)^3 // PowerExpand
        (* -> -(E^(-r Λ Sign[r]) Z Sign[r])/r *)

    Simplify[inverseFT, r > 0]
        (* -> -(E^(-r Λ) Z)/r *)

Figure 3: Sample Mathematica input and output used to evaluate the Fourier transform and its inverse.
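An independent numerical check of the screened-Coulomb transform (my sketch, assuming scipy; units suppressed and Ze set to 1), using the radial formula derived above:

    import numpy as np
    from scipy.integrate import quad

    lam, k = 0.8, 1.3   # example screening parameter and wavevector

    f = lambda r: -np.exp(-lam * r) / r   # Yukawa form with Ze = 1

    # f(k) = Int_0^inf dr 4 pi r^2 f(r) sin(kr)/(kr)
    num, _ = quad(lambda r: 4 * np.pi * r**2 * f(r) * np.sin(k * r) / (k * r),
                  0, np.inf)
    print(num, -4 * np.pi / (k**2 + lam**2))   # should agree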

10. Fourier transforms of periodic functions


For a function f(r) periodic in a three dimensional lattice with translation vectors {R} we require that

f(r + R) − f(r) ≡ 0    (36)
   = ∫ [dk/(2π)³] f(k) e^{ik·r} (e^{ik·R} − 1),    (37)

so that only a discrete set of ‘reciprocal lattice vectors’ (RLVs) {G} obeying e^{iG·R} ≡ 1 contributes. Writing
therefore

f(r) = Σ_G f_G e^{iG·r},    (38)

we may multiply both sides by e^{−iK·r} (where K is another reciprocal lattice vector) and integrate over
all space, finding

∫_{all r} dr f(r) e^{−iK·r} = Σ_G f_G ∫_{all r} dr e^{i(G−K)·r}    (39)
   = (2π)³ δ(K − G)   (defn. of delta function)    (40)
   = Ω δ_{G,K}   (treating RLVs as discrete)    (41)

where Ω is the volume of the crystal (nominally infinite). So we have two useful results:

Σ_k → [Ω/(2π)³] ∫_{all k space} dk    (42)

f_G = (1/Ω) ∫_crys dr f(r) e^{−iG·r} = (1/v_prim) ∫_{prim cell} dr f(r) e^{−iG·r}.    (43)

The last line follows from the fact that the integration over the entire crystal can be reduced to one over
a single ‘primitive cell’ of volume vprim ≡ Ω/Ncell , where we envision the (infinite) crystal volume as
consisting of an integral number Ncell of primitive cells.

11. Green functions and Dirac delta functions


You will fairly often encounter ‘Green functions’ in the solution of partial differential equations. The idea
is that rather than re-solving the PDE each time we change the inhomogeneous (‘source’ or ‘driving’) term
on the right hand side of

L̂ φ(r, t) = h(r, t)    (44)

we can solve once for the Green function, which obeys

L̂ G(r, r′, t, t′) = δ(r − r′) δ(t − t′),    (45)

and be assured that

φ(r, t) = ∫ dr′ ∫ dt′ G(r, r′, t, t′) h(r′, t′)    (46)

for any well-behaved h(r, t). Thus the Green function for a PDE is the solution corresponding to a ‘pulse’
initial condition which is a spatial and temporal Dirac delta function. Example: The three-dimensional
diffusion equation for particle density n(r, t) in the presence of a source term h(r, t) is

∂n(r, t)/∂t − D∇²n(r, t) ≡ L̂ n(r, t) = h(r, t)    (47)
∂t
where D is the diffusion constant. The Green function (non-zero only for t > t′) is readily evaluated by
contour integration and obeys

G(r, r′, t, t′) = ∫ [dk/(2π)³] e^{ik·(r−r′)} ∫ (dω/2π) e^{−iω(t−t′)} [1/(Dk² − iω)]    (48)

G(r, r′, t, t′) = [1/(2√(πD(t−t′)))³] exp[ −(r − r′)²/(4D(t−t′)) ]    (49)

∫ dΔ G(Δ, t, t′) = ∫_0^∞ dΔ 4πΔ² G(Δ, t, t′) = { 1 for t > t′;  0 for t < t′ }    (50)

where Δ = r − r′ and G actually depends on r, r′ only through the difference |r − r′|. The above properties
of Green functions are crucial in that, for example, they preserve probability (in the context of quantum
mechanics) or particle number (in the case of particle diffusion above). Figure 4 shows an example.

Figure 4: Panel (a): radial Green function 4πΔ² G(Δr = |r − r′|, Δt = t − t′) for the three-dimensional diffusion
equation with D = 1. Panel (b): examples of forces derived from scalar and tensor potentials.
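A numerical check (my sketch, assuming scipy) that the Green function (49) obeys the normalization (50) for t > t′, i.e., that diffusion conserves particle number:

    import numpy as np
    from scipy.integrate import quad

    D, dt = 1.0, 0.37   # example diffusion constant and elapsed time t - t'

    def G(delta):       # Eq. (49) as a function of delta = |r - r'|
        return np.exp(-delta**2 / (4 * D * dt)) / (2 * np.sqrt(np.pi * D * dt))**3

    total, _ = quad(lambda d: 4 * np.pi * d**2 * G(d), 0, np.inf)
    print(total)        # ~1.0, as in Eq. (50)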

12. Dummy indices and dummy variables


When we sum over (or integrate over) a variable, the result no longer depends on that variable. This is very
obvious, but is used in a number of potentially confusing ways in physical derivations and manipulations.
Example: Suppose we know that two functions f(x) and g(x) are equal (e.g., f(x) = tan²x and g(x) =
sec²x − 1), and we know the power series for each about the same point. Then, schematically,

Σ_{n=0}^{∞} a_n x^n = Σ_{m=0}^{∞} b_m x^m.    (51)

If we recognize that the integers m and n are dummy indices—that is, what we call the variable we’re
summing over (or integrating over) is up to us—we can for convenience replace m by n, and rewrite

Σ_{n=0}^{∞} a_n x^n = Σ_{n=0}^{∞} b_n x^n,    (52)

or

Σ_{n=0}^{∞} (a_n − b_n) x^n = 0,    (53)

only possible (since x is arbitrary) if a_n = b_n. (Strictly speaking we can’t do this unless the series are
uniformly convergent.)
The same works for integrals over dummy variables too, e.g., if two functions f(r) and g(r) are equal, then
in terms of their Fourier transforms

∫ [dk/(2π)³] [f̃(k) − g̃(k)] e^{ik·r} = 0,    (54)

only possible for arbitrary r if f̃(k) = g̃(k).

13. Pair potentials, double sums, and thermodynamic limit


One often encounters double sums of the form

U = ½ Σ_{i=1}^{N} Σ′_{j=1}^{N} φ(r_ij ≡ r_i − r_j)    (55)
  = ½ Σ′_{i,j} φ(r_ij) = Σ_{i<j} φ(r_ij)    (56)

where the particles labeled by i and j interact via the ‘pair potential’ φ(r). The restriction j ≠ i is needed
in order not to include non-physical self-interactions and is often indicated by the prime on the sum, and
the prefactor of ½ occurs because in the double sum a given pair (e.g., 1-2) occurs twice: i = 1, j = 2 and
i = 2, j = 1, while we wish to include the pair energy of interaction only once. Make sure you understand
why each line is equivalent to the previous lines above, since all are common notations. Discrete sums
occur because above the coordinates of the N particles are assumed given. Very commonly the
coordinates vary continuously over a volume, so that the total potential energy U above may be written
 
U = ½ ∫dr ∫dr′ φ(r − r′);    (57)

as usual, r and r′ are ‘dummy variables’. Because φ depends only on the difference r − r′ we can define x ≡ r − r′,
finding

U = ½ ∫dr ∫dx φ(x)    (58)
  = ½ Ω ∫dx φ(x),    (59)

where the volume of the system Ω = ∫dr has been assumed finite but unspecified. Thus

U/Ω = ½ ∫dx φ(x)    (60)

is the total potential energy per unit volume of the system. For an infinite system (such as is often assumed
as a model for a crystal) U and Ω are each infinite, but the physically relevant quantity U/Ω is well-defined
and finite provided the integral is finite. If there are N particles in the system, U/N = (U/Ω)/(N/Ω) is
the potential energy per particle and N/Ω is the particle density. This situation (U → ∞ and Ω → ∞)
is commonly known as the thermodynamic limit, since it is the situation in which a conventional three-
dimensional thermodynamic description of a system is used³.
A pair potential which obeys φ(r) = φ(|r|) = φ(r) is ‘central’, that is, the corresponding force is directed
along the line from the origin [center] to the point r. For such a φ(x),

U = ½ Ω ∫dx φ(x)    (61)
  = ½ Ω ∫_0^∞ dx 4π x² φ(x).    (62)
³For finite-size systems for which the ratio A/V^{2/3} is appreciable there are additional surface and other terms. Here A is the system surface
area and V its volume.

This argument at least suggests (correctly) that there is something horribly pathological about the Coulomb
(1/x) potential, the longest-range potential in nature. Central potentials are so common it’s easy to forget
that not all pair potentials are central. For example, to explain the properties of the deuteron a ‘tensor
force’ must be invoked, whose potential is of the form

V_tens(r) = λ(r) [6 (S·r̂)² − S·S];    (63)

the corresponding force is conservative despite being anisotropic.


For periodic potentials, such as occur in a solid, further simplifications are common. For simple crystals the
‘view’ (potential, force, etc.) from every site in the crystal is identical and the vector distances r_ij = r_i − r_j
between sites are ‘quantized’ by the crystal. One such distance is the vector r = 0, which must be excluded; thus

U = ½ Σ_{i=1}^{N} Σ′_{j=1}^{N} φ(r_i − r_j)    (64)
  = ½ Σ_{i=1}^{N} Φ(0)    (65)
  = (N/2) Φ(0),    (66)

where Φ(0) ≡ Σ′_R φ(R) is the total potential energy at a given site due to all the other sites; once again we
find a finite potential energy per particle, provided that the total potential energy at a given site is finite.
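As an illustration of a convergent per-site sum (my sketch, assuming numpy; the Lennard-Jones potential and simple cubic lattice are my choices, not the notes’):

    import numpy as np

    phi = lambda r: 4.0 * (r**-12 - r**-6)   # example short-ranged central potential

    # Phi(0) = sum over all other sites of a simple cubic lattice (spacing 1.1),
    # truncated at increasing radius L:
    for L in (3, 6, 12):
        idx = np.arange(-L, L + 1)
        X, Y, Z = np.meshgrid(idx, idx, idx, indexing='ij')
        R = 1.1 * np.sqrt(X**2 + Y**2 + Z**2).ravel()
        R = R[R > 0]                 # exclude the self-interaction r = 0
        print(L, phi(R).sum() / 2)   # U/N = Phi(0)/2, converging with L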

14. Using algebraic variables and counting arguments for solutions of a linear equation
We consider as an example the solution of the one-dimensional Schrödinger equation with a Dirac delta
function potential centered at x = a, V(x) = V₀ δ(x − a):

d²ψ/dx² + (2m/ℏ²)[E − V(x)]ψ = 0.    (67)

As always, the wavefunction is continuous even where the potential is singular at x = a, but the singularity
Figure 5: Sample potential V(x) = V₀ δ(x − a) for the Schrödinger equation, with region I (x < a) to the left and
region II (x > a) to the right of the singularity, and the cases E > 0 and E < 0 indicated.

forces us to examine the solution to the left and to the right of x = a; in these regions V (x) = 0, so the ODE
is the same on either side of the singularity. Integrating the SE from just below to just above the singularity,
we find the usual expression for the jump in slope. We thus have the two equations

ψ(a⁺) − ψ(a⁻) = 0    (68)

ψ′(a⁺) − ψ′(a⁻) = (2mV₀/ℏ²) lim_{ε→0} ∫_{a−ε}^{a+ε} dx′ δ(x′ − a) ψ(x′) = (2mV₀/ℏ²) ψ(a)    (69)

It is at this point that we have to commit to the sign of E: E > 0 yields oscillatory solutions, E < 0 yields
exponentially growing or damped solutions.

Case E > 0: We proceed as usual: in regions I and II, to the left and to the right of the singularity,

ψ(x) = A e^{ikx} + B e^{−ikx},   x < a,  E > 0;   k ≡ √(2mE/ℏ²)    (70)
ψ(x) = C e^{ikx} + D e^{−ikx},   x > a,  E > 0.    (71)

Note: This solution is never correct! This is for physical, not mathematical reasons: E > 0 corresponds to
unbound (‘scattering’) states. Because we are solving4 the time-independent SE, we must treat this problem
in a way similar to how we’d treat the optics problem of a beam of light incident on an interface from one
side or the other. We don’t ask what the time-evolution of the beam is, but instead we solve for steady-state
conditions, when there is no explicit time dependence (in our case, because our quantum system has fixed
energy, so that all pieces of the wavefunction evolve in time in an identical way). Thus we MUST specify
the physical boundary conditions: whether our particle is incident from the left (∝ e^{ikx}) or from the right
(∝ e^{−ikx}). If from the left we expect a REFLECTED beam (∝ e^{−ikx}) and a TRANSMITTED beam (∝ e^{ikx}). We
do not have, in addition to the particles incident from the left, any particles incident from the right (x = ∞,
propagating ∝ e−ikx in region II). Thus for E > 0 we must set D ≡ 0 (if we have in mind a beam incident
from the left). Equations (using ψ(a) in terms of C rather than two terms involving A and B, and dividing
through the derivative equation to simplify):
[ 1    e^{−2ika}    −1                   ] [ A ]   [ 0 ]
[ 1   −e^{−2ika}   −(1 + 2imV₀/(ℏ²k))   ]·[ B ] = [ 0 ]    (72)
[ 0    0             0                   ] [ C ]   [ 0 ]

Case E < 0: On the other hand, for purely mathematical reasons we know the solution for E < 0:

ψ(x) = A e^{κx},    x < a,  E < 0;   κ ≡ √(−2mE/ℏ²)    (73)
ψ(x) = D e^{−κx},   x > a,  E < 0,    (74)

since terms with wrong-sign exponents would diverge. Equations (again, cleaning up the derivative jump
equation a little):

[ 1   −e^{−2κa}                   ] [ A ]   [ 0 ]
[ 1   (1 + 2mV₀/(ℏ²κ)) e^{−2κa}  ]·[ D ] = [ 0 ]    (75)
The profound differences between E > 0 and E < 0 can be detected simply by counting equations and
unknowns:
(i) For E > 0 there are 3 unknowns (A, B, and C) and 2 equations. Thus we will only be able to solve for the
ratios r ≡ B/A and t ≡ C/A, finding

r = e^{2ika}/(−1 + iχ)    (76)

t = χ/(i + χ),    (77)

where χ ≡ ℏ²k/(mV₀). The reflection and transmission coefficients R ≡ |r|² and T ≡ |t|² depend only on
χ², so they don’t even depend on the sign of V₀; and R + T = 1, as it must.
(ii) For E < 0 there are 2 unknowns (A and D) and 2 equations (same as before). This situation might seem
easier and better but it is not. The SE is a linear equation: if ψ(x) is a solution, so is5 any constant times
ψ(x). Something has gone wrong when we find as many equations as unknowns: we have tacitly assumed
that any E < 0 will work in the solutions above. In fact, this is not true: only ‘magic’ values of E < 0 (the
eigenvalues) will solve the SE. In fact, for a non-trivial solution we must set the determinant of the matrix
appearing in Eq. (75) equal to zero, finding
 
2 e^{−2κa} [1 + mV₀/(ℏ²κ)] = 0.    (78)
⁴The officially correct thing to do would be to construct a ‘wave packet’, the object which resembles a classical particle as much as is
consistent with the Uncertainty Principle. We could then watch the time evolution of a wave packet incident from x = −∞: the (slowly
spreading) wavepacket would come in, its high Fourier components (with a higher phase velocity) would begin to interact with the potential,
and eventually part of the wavepacket would pinch off (to give rise to the reflected wave) and the remainder would exit to x = +∞ (the
‘transmitted beam’).
⁵The requirement that the wavefunction be normalized might appear to be a third condition, but it is a physical, not a mathematical,
requirement.

m and ℏ are positive, e^{−x} is never equal to zero for finite x, and we selected κ > 0 in order to have the
wavefunction behave properly for large positive or negative x. Thus if V₀ > 0 (as shown in the figure) there
are no real roots for this equation. On the other hand, if V₀ < 0 (so we could write V₀ = −|V₀|), we do have
one solution, κ = m|V₀|/ℏ². We have made no assumptions about the sign of V₀ in our manipulations, and
so are at liberty to assume either sign. This is an example of treating V₀ as an algebraic variable: we assume
it to be positive, find conditions on a solution, and discover that in fact it needs to be negative. Note: As for
the E > 0 case, we cannot in fact solve for A and D. Instead, we plug the permitted κ back in (again,
valid only for V₀ < 0). We will then obtain two equations which both agree about the ratio D/A = e^{+2κa}.
For either case, E > 0 or E < 0, we can now proceed to impose the physical requirement of normalization.
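A numerical sanity check of Eqs. (76)-(78) (my sketch, assuming numpy; units chosen so ℏ = m = 1):

    import numpy as np

    hbar = m = 1.0
    V0, a, E = -2.0, 0.7, 1.5                   # attractive delta; one E > 0

    k = np.sqrt(2 * m * E) / hbar
    chi = hbar**2 * k / (m * V0)
    r = np.exp(2j * k * a) / (-1 + 1j * chi)    # Eq. (76)
    t = chi / (1j + chi)                        # Eq. (77)
    print(abs(r)**2 + abs(t)**2)                # R + T = 1

    # Bound state (E < 0) exists only for V0 < 0, with kappa = m|V0|/hbar^2:
    kappa = m * abs(V0) / hbar**2
    print(1 + m * V0 / (hbar**2 * kappa))       # bracket in Eq. (78) -> 0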

15. Manipulating vectors and matrices

(a) A note on indices


In specific problems, we often refer to the Cartesian components of a vector, e.g., we ask ‘What is the
x component A_x of the vector A?’ When dealing with formal manipulations of vector properties, it is
much more convenient to use the [‘Roman’] indices i, j, k, ..., to refer to A_i as the ‘i-th component’ of the
vector A, and to use the conventions

x ↔ i = 1,   x₁ ≡ x
y ↔ i = 2,   x₂ ≡ y    (79)
z ↔ i = 3,   x₃ ≡ z

Note: The relations we find by such index manipulations are valid for any i (for example). The matrix
relation between two n-dimensional column vectors, for example,

a = G ·b (80)

in components becomes

n
ai = Gij bj , (81)
j=1

is valid for i = 1 (the x component) or i = 2 (y) or i = 3.


(b) Cyclic permutations
Generally if we have a vector defined, for example, by a cross product and know one of the components
(e.g., the x component), we can construct the y and z components ‘by inspection’, knowing that they
are symmetrically defined. For example, given that C = A × B, if we know that

Cx = (A × B)x = Ay Bz − Az By , (82)

instead of going through two more laborious calculations to see what are Cy and Cz , we can perform
cyclic permutations: Using Cartesian indices i = 1 ↔ x, i = 2 ↔ y, i = 3 ↔ z, make the replacements

x → y
y → z
z → x.

E.g., this would mean that


Cy = (A × B)y = Az Bx − Ax Bz , (83)
(c) Compact notation for derivatives
∂_i f ≡ ∂f/∂x_i    (84)
Note: we need to know the point at which this derivative is evaluated; it has been suppressed for
compactness.
(d) Kronecker δ function

δ_{i,j} = δ_{j,i} ≡ 1 if i = j,  0 if i ≠ j.    (85)

Note that the commas between the indices are often omitted, and that we can subtract any constant
from both indices and preserve the behavior of the Kronecker δ function:

δ_{m,n+p} = δ_{m−p,n} = δ_{n+p−m,0}    (86)

(e) Levi-Civita tensor (‘completely antisymmetric third-rank tensor’)
This object is incredibly useful in constructing cross products of vectors or vector operators (e.g., the
angular momentum operator in quantum mechanics, or the curl in E&M). It is used, for example, in the
definition

(A × B)_i = Σ_{j,k} ε_ijk A_j B_k.    (87)

Observe: the first index of ε is the component i we specify, the second is the component of the first
vector in the cross product, and the third is the component of the second vector.
What good is this? —Instead of having to deal first with the x component, then with the y, then z
components, we have an algebraic expression (albeit involving a sum) which gives us all the (Cartesian)
components—we just need to know which i to specify. In this expression


ε_ijk = +1 if ijk = 123 or ‘cyclic permutations’ 312, 231;
      = −1 if ijk = 213 or cyclic permutations 321, 132;    (88)
      =  0 if any two indices are the same.

Thus the whole job of this object is to select out of a sum only the pieces for which i ≠ j ≠ k, crucial
in constructing cross products.
Key properties:
• Cyclic permutations
εijk = εkij = εjki (89)
[the ‘cyclic permutations’ of the original indices ijk], true for any ijk.
• Index swaps
εjik = −εijk (90)
Key identity:

ε_ijk ε_kℓm = δ_iℓ δ_jm − δ_im δ_jℓ    (91)

Note well the structure⁶: we can only use this identity when the last index of the first ε is the first index
of the second ε. If we instead have the product

ε_ijk ε_ℓkm    (92)

we can use the index swap identity to rewrite ε_ℓkm = −ε_kℓm and then use the ‘delta identity’ Eq. (91) above.
(f) Einstein summation convention
In physical applications indices which occur twice in sums (all examples below) are in fact summed
over. So we adopt a compact convention for quantities constructed from vectors, tensors, etc.: any
index occurring twice is assumed to be summed over.
Examples
• Dot product: for n-dimensional vectors


Σ_{i=1}^{n} a_i b_i ≡ a_i b_i = a·b    (93)

or, commonly used, the row-times-column form

(a₁ a₂ ⋯ a_n)·(b₁ b₂ ⋯ b_n)ᵀ.    (94)
• Multiplication of an n-dimensional vector by an n×n matrix:

a_i = Σ_{j=1}^{n} G_ij b_j ≡ G_ij b_j,   i.e.,  a = G·b,    (95)

⁶We already know about ε_ijk that it’s zero if any of the indices are the same. Thus, given that the two terms share the index k, the
product ε_ijk ε_kℓm is non-zero only if (i) i = ℓ AND j = m, OR (ii) i = m AND j = ℓ. In case (i) the product becomes ε_ijk ε_kij, which
is +1 since kij is just a cyclic permutation of ijk; in case (ii) the product becomes ε_ijk ε_kji, which is −1 because kji is obtained from ijk by
swapping i and k. The statement Eq. (91) is just a compact way of saying this, taking a linear combination which guarantees that if i = j
or if ℓ = m we must get zero.

or, written out,

[ a₁ ]   [ G₁₁ G₁₂ ... G₁ₙ ] [ b₁ ]
[ a₂ ] = [ G₂₁ G₂₂ ... G₂ₙ ]·[ b₂ ]    (96)
[ ⋮  ]   [  ⋮   ⋮   ⋱   ⋮  ] [ ⋮  ]
[ aₙ ]   [ Gₙ₁ Gₙ₂ ... Gₙₙ ] [ bₙ ]
(g) Combining identities above
• Laplacian

∂_i ∂_i f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z² ≡ ∇²f    (97)
• Vector Taylor expansion
f(r + a) = f(r) + a_i ∂_i f + (a_i a_j/2!) ∂_i ∂_j f + ...    (98)
         = f(r) + a·∇f + (1/2!)(a·∇)(a·∇)f + ...    (99)
where all derivatives are evaluated at r.
• Vector identities

(A × B)·(C × D) = ε_ijk A_j B_k ε_iℓm C_ℓ D_m    (100)
   = ε_jki ε_iℓm A_j B_k C_ℓ D_m   (2 cyclic perms of ε_ijk)    (101)
   = (δ_jℓ δ_km − δ_jm δ_kℓ) A_j B_k C_ℓ D_m    (102)
   = A_j C_j B_k D_k − A_j D_j B_k C_k    (103)
   = (A·C)(B·D) − (A·D)(B·C)    (104)

[∇ × (∇ × A)]_i = ε_ijk ∂_j (ε_kℓm ∂_ℓ A_m)    (105)
   = (δ_iℓ δ_jm − δ_im δ_jℓ) ∂_j ∂_ℓ A_m    (106)
   = ∂_j ∂_i A_j − ∂_j ∂_j A_i    (107)
   = ∂_i (∂_j A_j) − ∂_j ∂_j A_i    (108)
   = ∇_i (∇·A) − ∇² A_i    (109)

so that

∇ × (∇ × A) = ∇(div A) − ∇²A    (110)
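A numerical spot-check of identity (104) (my sketch, assuming numpy):

    import numpy as np

    rng = np.random.default_rng(0)
    A, B, C, D = rng.standard_normal((4, 3))

    lhs = np.dot(np.cross(A, B), np.cross(C, D))
    rhs = np.dot(A, C) * np.dot(B, D) - np.dot(A, D) * np.dot(B, C)
    print(np.isclose(lhs, rhs))   # True: Eq. (104)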
• Vector manipulations with inverse power laws
E.g., derive the ‘coordinate-free’ form⁷ for the magnetic field B from the vector potential A(r)
of a magnetic dipole,

A_dip(r) = a (μ × r̂)/r².    (111)
Since B = ∇ × A and r = x_m x̂_m (= x x̂ + y ŷ + z ẑ),

B_i = ε_ijk ∂_j A_k    (112)
    = ε_ijk ∂_j [a ε_kℓm μ_ℓ x_m/r³]    (113)
    = a (δ_iℓ δ_jm − δ_im δ_jℓ) μ_ℓ ∂_j (x_m/r³),    (114)

since the m-th Cartesian component of r is x_m (and r̂/r² = r/r³). Thus

B_i = a μ_i ∂_j (x_j/r³) − a μ_j ∂_j (x_i/r³)    (116)
    = a μ_i ∇·(r̂/r²) − a μ_j ∂_j (x_i/r³).    (117)
⁷A ‘coordinate-free’ expression is one which manifestly does not depend on the particular coordinate system chosen. Examples of such
expressions are: φ(r) = −E·r (a scalar function) and G(r) (a vector function of a vector argument—the vectors G and r can be expressed in any
coordinate system, but the vectors are the same). A non-coordinate-free form would be φ(x, y, z) = −E_x x − E_y y − E_z z, because it explicitly
depends on Cartesian coordinates.



The term ∇·(r̂/r²) is very peculiar (as we see, for example, in PHGN 361), since it is related to the
charge density of a point charge, a three-dimensional Dirac delta function: ∇·(r̂/r²) = 4π δ(r). In
the second term, it is convenient to write r³ = (r²)^{3/2} and r² = x_ℓ x_ℓ, so

B_i = 4π a μ_i δ(r) − a μ_j ∂_j [x_i/(x_ℓ x_ℓ)^{3/2}]    (118)
    = 4π a μ_i δ(r) − a μ_j [ δ_ij/r³ − (3/2) x_i (2 x_j)/(x_ℓ x_ℓ)^{5/2} ],    (119)

where, as usual, ∂_j x_i ≡ ∂x_i/∂x_j = δ_ij. In order to simplify, we need to be careful using the Einstein
summation convention. We know that any index which occurs twice is summed over, but what
if an index occurs more than twice, as happens in the second term in square brackets? First we
simplify the terms linear in x_ℓ, then we replace x_ℓ x_ℓ by r², to find

B_i = 4π a μ_i δ(r) − a μ_i/r³ + 3a μ_j x_i x_j/r⁵    (120)
    = 4π a μ_i δ(r) + a [3 (μ·r̂)(r̂)_i − μ_i]/r³,    (121)

where we have chosen to write, e.g., x_i/r = (r̂)_i. So,

B_dip(r) = 4π a δ(r) μ + a [3 (μ·r̂) r̂ − μ]/r³.    (122)

The ‘strange’ term proportional to the δ function is physically very important in, for example, the
theory of the ‘hyperfine interaction’ in atoms.
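A sympy check (mine) of the key derivative used above, ∂_j(x_i/r³) = δ_ij/r³ − 3x_i x_j/r⁵, valid away from the origin:

    import sympy as sp

    x = sp.symbols('x0:3', real=True)
    r = sp.sqrt(sum(xi**2 for xi in x))

    for i in range(3):
        for j in range(3):
            lhs = sp.diff(x[i] / r**3, x[j])
            rhs = (1 if i == j else 0) / r**3 - 3 * x[i] * x[j] / r**5
            assert sp.simplify(lhs - rhs) == 0
    print("identity verified for r != 0")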

16. Pauli spin matrices


The intrinsic ‘spin’ angular momentum of a particle has no coordinate representation (unlike orbital angular
momentum, which in the coordinate representation gives rise to spherical harmonics), so we must fall back
on a matrix representation of the spin operator in a basis of spin eigenstates⁸. For spin-½ particles such
as electrons, protons, or neutrons, we generally replace the spin operator by the corresponding matrix, via
Ŝ → S ≡ (ℏ/2)σ, where

σ_x = ( 0  1 ),   σ_y = ( 0  −i ),   σ_z = ( 1   0 )    (123)
      ( 1  0 )          ( +i  0 )          ( 0  −1 )
where the diagonal form of σz indicates that we must be using the eigenstates of σz as basis vectors. These
‘Pauli spin matrices’ obey

σ_i σ_i = 1   (NO ESC)

[σ_i, σ_j] = 2i ε_ijk σ_k = 2i Σ_{k=1}^{3} ε_ijk σ_k

{σ_i, σ_j} = 0   (i ≠ j)    (124)

where 1 is the unit 2 × 2 matrix and the phrase NO ESC means that we are not using the Einstein summation
convention, so that σ_i σ_i = σ_x·σ_x for i = 1, for example. Thus the spin-½ operators obey the usual
commutation relations for an angular momentum, the square of a Pauli spin matrix is a unit matrix, and
any two different Pauli spin matrices anticommute. A particularly concise way to encapsulate all of these
properties is the identity

σ_i σ_j = i ε_ijk σ_k + δ_ij 1.    (125)

It is an easy exercise to show that the three results in Eq. (124) follow from Eq. (125).
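The ‘easy exercise’ can also be done numerically (my sketch, assuming numpy): verify σ_i σ_j = iε_ijk σ_k + δ_ij 1 for all i, j.

    import numpy as np

    sigma = np.array([[[0, 1], [1, 0]],
                      [[0, -1j], [1j, 0]],
                      [[1, 0], [0, -1]]], dtype=complex)

    eps = lambda i, j, k: (i - j) * (j - k) * (k - i) / 2   # Levi-Civita symbol

    for i in range(3):
        for j in range(3):
            rhs = (i == j) * np.eye(2) \
                + 1j * sum(eps(i, j, k) * sigma[k] for k in range(3))
            assert np.allclose(sigma[i] @ sigma[j], rhs)
    print("Eq. (125) verified")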

17. Integrals with oscillatory integrands


Integrals of the form

I(k) = ∫_{−∞}^{+∞} dx f(x) e^{ikx}   [integrand ≡ g(x, k)],    (126)

⁸If this means nothing to you, ignore this sentence.

where f(x) is well-behaved, generally decay in value as k increases. (I(k) here happens to be the Fourier
transform of f(x), but these remarks hold if g(x, k) is any oscillatory function whose argument is an
increasing function of k, e.g., g = sin(k²x).) Physicists just say ‘the oscillations cause f to average away to
zero as k increases’. A particular example, using the function f(x) = x² exp[−a(x − x₀)²] with a = 1 and
x₀ = 2, is shown in Fig. 6. (Much more sophisticated examples occur in the ‘method of steepest descent’
and ‘saddle point methods’.)

Figure 6: Sample oscillatory integrand and its value. Plots show Re I(k) versus k, the function f(x), and
Re[f(x) e^{ikx}] for k = 0 and k = 20, with a = 1, x₀ = 2.
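A numerical illustration of the decay of I(k) (my sketch, assuming scipy) for the Gaussian-weighted example above:

    import numpy as np
    from scipy.integrate import quad

    a, x0 = 1.0, 2.0
    f = lambda x: x**2 * np.exp(-a * (x - x0)**2)

    for k in (0.0, 2.0, 5.0, 20.0):
        re, _ = quad(lambda x: f(x) * np.cos(k * x), x0 - 8, x0 + 8, limit=400)
        im, _ = quad(lambda x: f(x) * np.sin(k * x), x0 - 8, x0 + 8, limit=400)
        print(k, abs(re + 1j * im))   # |I(k)| falls off rapidly with k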

18. Mathematical description of waves

(a) Dispersion relations


Very frequently, e.g., when describing wave-like phenomena in classical or quantum mechanics and
electrodynamics, we encounter modes with the (in general, three-dimensional) dispersion relation ω = ω(k).
Examples:
i. Light in vacuum or long-wavelength sound: ω = ck, equivalent to the relation λν = c.
ii. Waves along a 1-dimensional chain of masses and springs separated by distance a: ω = ω₀ sin(ka/2).
iii. Quantum non-relativistic free particle with mass m: ω = E/ℏ = ℏk²/(2m).
(b) Density of modes or states
Often what is measurable experimentally depends only on the number of such modes in a given
frequency range and not explicitly on the vector k. It becomes convenient to focus on the number of
modes dN in a frequency range dω rather than in the volume element in k-space. Since for quantum
systems generally E = ℏω(k), the ‘energy density of states’

dN_states = g(E) dE    (127)

dN_modes = g(ω) dω    (128)

obeys

dN_modes = [dk/(volume per allowed state in k space)]·(degeneracy of modes for given k)

For instance, for (transverse) electromagnetic waves (described ultimately by spin-1 photons of zero
mass) there are two distinct polarizations⁹ for each propagation direction n̂ = k/k, giving rise to a
prefactor of 2 when counting photon states. On the other hand, finite-mass electrons have spin ½, so
acquire two (2·½ + 1 = 2) distinct electron states (spin up, spin down) for each spatial wavefunction.

⁹Not three, as would be expected from the 2s + 1 counting for spin s, since the photon has zero mass.
As we may see in some class, the definition of the energy density of states differs slightly in form
depending on the energy distribution of levels:
i. Discrete energy levels:

g(E) = Σ_j g_j δ(E − E_j),    (129)

where g_j (a common notation, not to be confused with the DOS itself) is the degeneracy (the number
of physically distinct states with the same energy¹⁰) of the level with energy E_j.
ii. Continuous energy levels: There are two common and equivalent forms:

g(E) = [V/(2π)³] ∫ dk δ(E − ε(k))    (130)
     = [V/(2π)³] ∮_S dS/|∇_k ε(k)|.    (131)

In Eq. (130) the integration is either over all of k-space (for a free particle) or over the first Brillouin
zone (for a Bloch electron). In Eq. (131) the surface integral is over the surface ε(k) = E. Note that
g(E)/V is what is physically relevant—it is a nice intensive quantity characteristic of the system,
not diverging as the system volume grows large.

Example: Free particle in three dimensions, ε(k) = ℏ²k²/(2m).
i. Dirac δ version

g(E) = [V/(2π)³] ∫ dk δ(E − ℏ²k²/(2m))    (132)
     = [V/(2π)³] (2m/ℏ²) ∫ dk 4π k² δ(k² − 2mE/ℏ²)    (133)
     = [4πV/(2π)³] (2m/ℏ²) (1/2) ∫ d(k²) k δ(k² − 2mE/ℏ²)    (134)
     = V (m/ℏ²) [4π/(2π)³] √(2mE/ℏ²),    (135)

where in the second line we have used spherical polar coordinates in k space and the facts that

δ(ax) = δ(x)/|a|,   δ(−x) = δ(+x).

ii. Surface integral version

We note that

∇_k ε(k) = (x̂ ∂/∂k_x + ŷ ∂/∂k_y + ẑ ∂/∂k_z) [ℏ²(k_x² + k_y² + k_z²)/(2m)],    (136)

so that

∇_k ε(k) = (ℏ²/2m) · 2(x̂ k_x + ŷ k_y + ẑ k_z)    (137)

|∇_k ε(k)| = ℏ²k/m   (|k| = k).    (138)
¹⁰For example: a hydrogen-like atom has energy levels labeled by (n, ℓ, m_ℓ, m_s). Because the energy of such a level is independent of ℓ,
m_ℓ, and m_s, each level labeled by n has

Σ_{m_s=−1/2}^{+1/2} Σ_{ℓ=0}^{n−1} Σ_{m_ℓ=−ℓ}^{+ℓ} 1 = 2n²

levels with the same energy.

Thus

g(E) = [V/(2π)³] ∮_S dS/|∇_k ε(k)|    (139)
     = [V/(2π)³] (m/ℏ²) ∮_S dS/k    (140)
     = [V/(2π)³] (m/ℏ²) (1/k) ∮_S dS    (141)
     = [V/(2π)³] (m/ℏ²) (1/k) · 4πk²    (142)
     = V (m/ℏ²) [4π/(2π)³] √(2mE/ℏ²),    (143)

where we have observed that the magnitude k does not vary across the surface S, whose area is
4πk².
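A brute-force check of g(E) (my sketch, assuming numpy; ℏ = m = 1): count allowed k-states of a box of volume V = L³ (spacing 2π/L) in a narrow energy window and compare with Eq. (135)/(143).

    import numpy as np

    hbar = m = 1.0
    L = 60.0                                # box side; V = L^3
    dk = 2 * np.pi / L                      # spacing of allowed k values

    n = np.arange(-20, 21) * dk
    KX, KY, KZ = np.meshgrid(n, n, n, indexing='ij')
    eps = hbar**2 * (KX**2 + KY**2 + KZ**2) / (2 * m)

    E, dE = 1.0, 0.02                       # window [E, E + dE]
    count = np.count_nonzero((eps >= E) & (eps < E + dE))

    V = L**3
    g = V * (m / hbar**2) * (4 * np.pi / (2 * np.pi)**3) * np.sqrt(2 * m * E / hbar**2)
    print(count / dE, g)                    # comparable numbers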
(c) Wave packets
Most waves are dispersive, i.e., the dispersion relation ω(k) is not linear, so that the spatial and time
frequencies are not simply proportional. These include light moving in a medium with an ω-dependent
index of refraction, matter waves, and mechanical waves (e.g., water waves). In a one-dimensional example,
we expect a dispersive wave to have the mathematical form

φ(x, t) = ∫_{−∞}^{+∞} (dk/√(2π)) A(k) e^{i[kx−ω(k)t]},    (144)

that is, for any instant t, the wave is a superposition of distinct wavevectors, as described by the
function A(k). This is precisely what we would expect from Fourier analysis, except that the wave has
a time dependence which we must acknowledge via ω(k) since the wave is dispersive.

If there were only a single Fourier component in A(k), i.e., A(k) = √(2π) δ(k − k₀), we would have a
simple wave ∝ exp[i(k₀x − ωt)] whose phase is

k₀x − ω(k₀)t = k₀ [x − (ω(k₀)/k₀) t],    (145)

so that the wave fronts (spatial locations where the phase is fixed) clearly move at the phase velocity
v_phase = ω(k₀)/k₀. More generally, for the Fourier transform of the φ(x, t) to exist¹¹ we need

∫_{−∞}^{+∞} dx |φ(x, t)|² < ∞    (146)

∫_{−∞}^{+∞} dk |A(k)|² < ∞.    (147)

The integrand in Eq. (144) is thus proportional to e^{iθ(k)} with

θ(k) = kx − ω(k)t.    (148)

Writing A(k) = |A(k)| e^{iφ(k)}, our integrand

A(k) e^{iθ(k)} = |A(k)| e^{i[φ(k)+θ(k)]}    (149)

is thus very oscillatory, and the integral will be completely dominated by k regions where

|A(k)| = maximum   and   θ(k) + φ(k) ≈ constant.

If the ‘power spectrum’ |A(k)|² is strongly peaked¹² about a particular wavevector k̄, we may Taylor
expand

ω(k) ≈ ω(k̄) + (dω/dk)|_{k̄} (k − k̄) + ½ (d²ω/dk²)|_{k̄} (k − k̄)² + ...    (150)
¹¹In quantum mechanics this is the route to making a normalizable wavefunction.
¹²Note that the only purpose for this observation is to identify the main spatial frequency k̄ present in the wavepacket.

We define the group velocity of the wave as v_g(k̄) ≡ (dω/dk)|_{k̄}; our integral thus becomes, neglecting
the second derivative,

φ(x, t) ≈ ∫_{−∞}^{+∞} (dk/√(2π)) A(k) e^{ikx} e^{−i[ω(k̄) + v_g(k̄)(k − k̄)]t}    (151)
        = e^{−it[ω(k̄) − v_g(k̄) k̄]} ∫_{−∞}^{+∞} (dk/√(2π)) A(k) e^{ik(x − v_g(k̄)t)}.    (152)

[We recognize the prefactor’s phase as k̄t(v_g − v_phase).] Thus, remarkably, apart from a phase factor,
we can find the wave packet at time t from that at time t = 0 via the simple substitution x → x − v_g(k̄)t.
This means that the wave packet preserves its shape over the times permitted by the approximations
above; for this reason wave packets are used in quantum mechanics as the mathematical object most
nearly like a classical particle: it has a well-defined momentum ℏk̄ and a well-defined position, consistent
with the momentum-position uncertainty relations. We also conclude that in the analysis of a
‘localized’ wave train or wave packet, two distinct velocities are important:
v_phase = ω/k    (153)

v_group = dω/dk,    (154)

each evaluated at k̄. If we include corrections to the description above (e.g., we retain the quadratic
term (dv_g/dk)|_{k̄}), then (i) the wave packet will spread out for long times and (ii) its shape will change.

For light waves in empty space (vg = vphase = c) wave packets preserve their shape for all times.
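A minimal numerical illustration (mine, assuming numpy) for the free-particle dispersion ω = k²/2 (ℏ = m = 1): a packet with |A(k)|² peaked at k̄ moves at v_g = k̄, not at v_phase = k̄/2.

    import numpy as np

    kbar, sig, t = 5.0, 0.5, 2.0
    k = np.linspace(kbar - 6 * sig, kbar + 6 * sig, 2001)
    A = np.exp(-(k - kbar)**2 / (2 * sig**2))      # peaked spectrum
    omega = k**2 / 2                               # dispersion relation

    x = np.linspace(-5, 25, 601)
    phase = 1j * (np.outer(x, k) - omega * t)      # i(kx - omega t)
    phi = (A * np.exp(phase)).sum(axis=1) * (k[1] - k[0])   # Eq. (144)
    print(x[np.argmax(np.abs(phi))], kbar * t)     # packet peak near v_g * t = 10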

19. ‘Principal value’ integrals


Functions which diverge at physically-relevant parameter values are not uncommon in physics, e.g., mechan-
ical or optical resonances, or integrals whose integrands diverge somewhere in the region of integration.
Consider the integral

I(a, b) ≡ ∫_0^∞ dx x² e^{−ax}/(x − b),    (155)

whose integrand is well-behaved at the integration limits but which diverges at x = b; see Fig. 8, panel (a).
Formally I is ill-defined; with the customary interpretation of the definite integral as the area under a curve,
however, we would physically expect the divergent positive and negative areas to compensate, yielding a
finite answer. We define the principal value of the integral via

P ∫_α^β dx f(x)/(x − x₀) ≡ lim_{δ→0} [ ∫_α^{x₀−δ} dx f(x)/(x − x₀) + ∫_{x₀+δ}^β dx f(x)/(x − x₀) ],    (156)
where δ ≥ 0.
Using the residue theorem and contour integration it is quite easy to show that for integrands with this
particular form of singularity (a ‘simple pole’ on the path of integration)
∫_α^β dx f(x)/[x − (x₀ ± iε)] = P ∫_α^β dx f(x)/(x − x₀) ± iπ f(x₀).    (157)
Here we have acknowledged that in many physical problems the ‘resonant’ value of a parameter (here x0 )
may for physical reasons have a slight imaginary part. For example, the function describing light absorption
near an optical resonance at ω0 might have the form
f_reson(ω) = 1/[ω − (ω₀ + iΓ)]    (158)

where Γ is positive but otherwise assumed very small¹³. Because the construction associated with the
principal value is a linear operator applied to the integration and doesn’t depend on the function f(x), one
frequently writes the mnemonic form

1/[x − (x₀ ± iε)] ≡ P [1/(x − x₀)] ± iπ δ(x − x₀),    (159)

which is meant to be applied ‘inside the integral’ over x of f(x).
¹³We must make sure that the contour we pick always encloses or always excludes the pole as we take δ → 0. As a result, we select the
semicircular contour below the pole if the pole lies in the upper complex plane and vice versa. The ± in the formal expression comes from
the sign of the contribution as we traverse the semicircle in a positive or negative sense.

Figure 7: How to select the semicircle used to make contour integration well defined when the path of integration
(from α to β) crosses a simple pole near x₀ slightly displaced from the real axis.

Figure 8: Example of a function with a simple pole on the path of integration. Panel (a): the integrand
f(x) = x² e^{−ax}/(x − b) for a = 3, b = 2. Panel (b): the principal value
P ∫_0^∞ dx x² e^{−ax}/(x − b) = −b² e^{−ab} Ei(ab) + b/a + 1/a², plotted as a function of the parameters a and b.
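scipy can evaluate such principal-value integrals directly (quad’s ‘cauchy’ weight supplies the 1/(x − b) factor); a sketch (mine) checking the closed form quoted in Fig. 8:

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import expi

    a, b = 3.0, 2.0

    # P Int_0^50 x^2 e^{-ax}/(x - b) dx; the tail beyond 50 is negligible
    pv, _ = quad(lambda x: x**2 * np.exp(-a * x), 0, 50, weight='cauchy', wvar=b)
    exact = -b**2 * np.exp(-a * b) * expi(a * b) + b / a + 1 / a**2
    print(pv, exact)   # should agree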

20. Kramers-Kronig relations
A central result of the theory of functions of a complex variable is Cauchy’s theorem (Cauchy’s integral
formula)

f(z) = (1/2πi) ∮_C dz′ f(z′)/(z′ − z)    (160)

where f(z) is analytic (has a derivative everywhere) within and on the closed contour C (traversed so as to
keep its interior on the left). It’s used to prove the residue theorem, but is noteworthy here because of
the factor i: it means that the real part of f(z) is related to an integral over the imaginary part and vice
versa. In PH511 it is often proved that for a complex-valued function of a real, positive argument defined on
the interval (0, ∞) [here the dielectric function ε(ω) as it depends on the positive definite light frequency
ω]

Re ε(ω) = 1 + (2/π) P ∫_0^∞ dω′ ω′ Im ε(ω′)/(ω′² − ω²)    (161)

Im ε(ω) = −(2ω/π) P ∫_0^∞ dω′ [Re ε(ω′) − 1]/(ω′² − ω²).    (162)
These are the ‘Kramers-Kronig’ relations and P reminds you to use the nefarious principal value. (I’m sorry
to simply cite this result; it’s proved in Jackson and is not difficult, just a little tedious.) They hold generally
for the ‘linear response functions’ describing the response of a causal linear medium to an external driving
force and therefore are incredibly useful in a very broad array of applications.
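A numerical check of Eq. (161) (my sketch, assuming scipy; the single Lorentz oscillator ε(ω) = 1 + ω_p²/(ω₀² − ω² − iγω) is my choice of causal model). Substituting s = ω′² reduces the right-hand side to a Cauchy principal value that quad can handle:

    import numpy as np
    from scipy.integrate import quad

    wp, w0, gam = 1.0, 2.0, 0.3
    eps = lambda w: 1 + wp**2 / (w0**2 - w**2 - 1j * gam * w)

    w = 1.4   # test frequency
    # Re eps(w) - 1 = (1/pi) P Int_0^inf ds Im eps(sqrt(s))/(s - w^2)
    pv, _ = quad(lambda s: eps(np.sqrt(s)).imag / np.pi, 0, 1e4,
                 weight='cauchy', wvar=w**2, limit=200)
    print(1 + pv, eps(w).real)   # should agree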

21. The method of Lagrange multipliers


To find the maximum or minimum of a function f (x, y, z) subject to constraints (e.g.) φ(x, y, z) = C1 and
χ(x, y, z) = C2 ,

(i) Define the function


F (x, y, z) ≡ f (x, y, z) + λ φ(x, y, z) + μ χ(x, y, z), (163)
where λ and μ are called ‘Lagrange multipliers’.
(ii) Set the partial derivatives of F with respect to x, y, z equal to zero and solve these equations for x, y, z
in terms of λ and μ; identify the values of λ and μ from C1 and C2 .

Note that (i) it is obvious how to extend this procedure to more variables and more constraints—no matter
what the dimensionality of the space, there will be as many equations as unknowns, so that a solution will
generally exist—and (ii) each additional constraint simply adds another equation. [The operation count
between this approach, the ‘method of Lagrange multipliers’, and direct brute-force solution by finding the
extrema and then imposing the constraints, can differ by orders of magnitude.]
Example: Find the volume of the largest box with edges parallel to the axes which can be inscribed inside
the ellipsoid

Figure 9: Box with corner at (x, y, z) and edge lengths 2x, 2y, 2z inscribed in the ellipsoid.

x²/a² + y²/b² + z²/c² = 1,    (164)

noting that the volume of such a box is 8xyz. Solution: We have only one constraint: that the box just
touch the surface defined by the ellipsoid; this will happen at 8 places, but we need worry about only one
(in the octant where x, y, z > 0). Assuming a > 0, b > 0, c > 0 and that the volume is positive, requiring
(x, y, z) to lie on the surface of the ellipsoid we find

(x, y, z) = (a, b, c)/√3,   λ = −4abc/√3,

so that the volume of the largest inscribed box (for given a, b, c) is 8abc/(3√3).
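A sympy verification (mine) that this stationary point satisfies ∂F/∂(x, y, z, λ) = 0 with F from Eq. (163); note the multiplier comes out negative with this sign convention:

    import sympy as sp

    x, y, z, a, b, c = sp.symbols('x y z a b c', positive=True)
    lam = sp.symbols('lam')   # the multiplier itself may be negative

    F = 8*x*y*z + lam*(x**2/a**2 + y**2/b**2 + z**2/c**2 - 1)
    eqs = [sp.diff(F, v) for v in (x, y, z, lam)]

    sol = {x: a/sp.sqrt(3), y: b/sp.sqrt(3), z: c/sp.sqrt(3),
           lam: -4*a*b*c/sp.sqrt(3)}
    print([sp.simplify(e.subs(sol)) for e in eqs])   # [0, 0, 0, 0]
    print(sp.simplify((8*x*y*z).subs(sol)))          # 8*sqrt(3)*a*b*c/9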

22. Legendre transformations


In several applications (e.g., classical mechanics and thermodynamics) it is useful to be able to systematically
change variables. If, for example, y = y(x), we say that dy = (dy/dx) dx and that the natural variable for y is
x, since its differential depends only on dx. On the other hand, consider the function

χ = y(x) − z x    (165)

where z = dy/dx. Now dχ = (dy/dx) dx − z dx − x dz, or, using the definition of z,

dχ = (dy/dx) dx − (dy/dx) dx − x dz = −x dz.    (166)

We then say that χ depends only on z = dy/dx, so that the ‘natural variable’ of χ is¹⁴ z.

(i) Thermodynamics: We write the First Law in differential form as dU = T dS − P dV + μ dN, and remark
that the ‘natural variables’ of the (internal energy) U are S, V , and N. On the other hand, if we define
G = U + P V − T S, we find

dG = dU + (P dV + V dP ) − (T dS + S dT ) (167)
= (T dS − P dV + μ dN) + (P dV + V dP ) − (T dS + S dT ) (168)
= −S dT + V dP + μ dN, (169)

so that the natural variables of G are T, P, and N. (This is a double Legendre transform of the original
function U.)

(ii) From Lagrangian to Hamiltonian mechanics: Traditionally one begins with the Lagrangian L({q_i, q̇_i}),
for which the ‘equation of motion’ reads

d/dt (∂L/∂q̇_i) = ∂L/∂q_i,   or    (170)

ṗ_i ≡ dp_i/dt = F_i    (171)

(since generally L = T − V in terms of the kinetic energy T and potential energy V; we recognize
F_i = −∂V/∂q_i).

Now we seek a way to change variables from ({q_i, q̇_i}, t) to ({q_i, p_i}, t), where p_i is the ‘canonical
momentum conjugate to the generalized coordinate q_i’. If we write

χ = Σ_i p_i q̇_i − L({q_i, q̇_i}, t)    (172)

we find (since L depends on time both explicitly and via the q_i and q̇_i)

dχ = Σ_i [p_i dq̇_i + q̇_i dp_i] − Σ_i [(∂L/∂q̇_i) dq̇_i + (∂L/∂q_i) dq_i] − (∂L/∂t) dt    (173)
   = Σ_i q̇_i dp_i + Σ_i dq̇_i [p_i − ∂L/∂q̇_i] − Σ_i (∂L/∂q_i) dq_i − (∂L/∂t) dt.    (174)
i
¹⁴This is like choosing to describe a function not by its dependence on x but by the family of curves which are tangent (i.e., have specified
slope) to the curve y(x).

But the term in brackets in the second term cancels by the definition of the generalized momentum,
and the third term can be simplified using the Lagrange equation of motion:

∂L/∂q̇_i ≡ p_i    (175)

∂L/∂q_i = ṗ_i    (176)

so that now (replacing χ by its official label H, the Hamiltonian), we have

dH = Σ_i [q̇_i dp_i − ṗ_i dq_i] − (∂L/∂t) dt,    (177)

which yields the Hamilton equations

∂H/∂p_i = q̇_i    (178)

∂H/∂q_i = −ṗ_i    (179)

∂H/∂t = −∂L/∂t    (180)
We have thus formally shifted to the {q_i, p_i} as mechanical variables.
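A sympy sketch (mine) of this Legendre transform for the one-dimensional harmonic oscillator L = ½mq̇² − ½kq²:

    import sympy as sp

    m, k, q, qdot, p = sp.symbols('m k q qdot p', positive=True)

    L = m*qdot**2/2 - k*q**2/2                  # Lagrangian
    p_def = sp.diff(L, qdot)                    # canonical momentum, p = m*qdot
    qdot_p = sp.solve(sp.Eq(p, p_def), qdot)[0]

    H = (p*qdot - L).subs(qdot, qdot_p)         # Eq. (172) with chi -> H
    print(sp.expand(H))                         # p**2/(2*m) + k*q**2/2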

23. Useful Gaussian integrals


If

I(α) ≡ ∫_{−∞}^{∞} dx e^{−αx²},

how can we easily evaluate I? Write the square of this integral as

I²(α) = ∫_{−∞}^{+∞} dx ∫_{−∞}^{+∞} dy e^{−αx²} e^{−αy²} = ∫∫ dx dy e^{−α(x²+y²)}.

Now change to polar coordinates in the x-y plane, replacing the area element dx dy by 2πr dr with r
now ranging from 0 to ∞. Thus

I²(α) = 2π ∫_0^∞ dr r e^{−αr²} = π/α,

so that

I(α) = √(π/α).
But wait! There’s more! We can actually generate a whole slew of similar useful integrals by treating I(α)
as a known function of α. For example,

dI(α)/dα = (d/dα) ∫_{−∞}^{+∞} dx e^{−αx²} = ∫_{−∞}^{+∞} dx (−x²) e^{−αx²} = −∫_{−∞}^{+∞} dx x² e^{−αx²}

so that

∫_{−∞}^{+∞} dx x² e^{−αx²} = −dI(α)/dα = −(d/dα)√(π/α) = −√π (−½ α^{−3/2}) = (1/2α)√(π/α).

This trick works for any convergent integrand dependent on a parameter. It also works if we have a con-
vergent series which depends on a parameter.
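A sympy check (mine) of the differentiate-with-respect-to-a-parameter trick:

    import sympy as sp

    x, alpha = sp.symbols('x alpha', positive=True)

    I = sp.integrate(sp.exp(-alpha * x**2), (x, -sp.oo, sp.oo))
    print(I)                                     # sqrt(pi)/sqrt(alpha)

    lhs = sp.integrate(x**2 * sp.exp(-alpha * x**2), (x, -sp.oo, sp.oo))
    print(sp.simplify(lhs + sp.diff(I, alpha)))  # 0, i.e. lhs = -dI/dalpha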

24. Operators, eigenvalues, eigenfunctions


In quantum mechanics, physical observables correspond to linear operators; in the ‘coordinate representation’,
these are often linear differential operators. The operator Q̂, for example, when applied to a typical
function f(x) yields some other nasty function:

Q̂ f(x) = g(x)

where, for example, Q̂ might be of the form

Q̂ = 3 d²/dx² − 2 d/dx + 3.

However, if

Q̂ h(x) = q h(x)

where q is an ordinary number (possibly complex), then

• q is called an eigenvalue of the operator Q̂.


• h(x) is called an eigenfunction of the operator Q̂.

25. Operator algebra and functions


In formal quantum mechanics one often encounters expressions such as e^{−iĤt/ℏ}, 1/(E − Ĥ), and f(Ô), where Ĥ
and Ô are operators. What meaning can we give these expressions? We generally assume the functions we
encounter are well-behaved enough to have Taylor expansions; this is also what we mean by functions of
operators, e.g.,

e^{−iĤt/ℏ} = 1̂ + (−iĤt/ℏ) + (1/2!)(−iĤt/ℏ)² + ...    (181)

If acting on an eigenfunction |o⟩ of the operator Ô [defined by Ô|o⟩ = o|o⟩], for example, we find

f(Ô)|o⟩ ≡ (f₀ + f₁Ô + f₂ÔÔ + ...)|o⟩
        = (f₀ + f₁o + f₂o² + ...)|o⟩
        = f(o)|o⟩.    (182)
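A finite-dimensional illustration (my sketch, assuming numpy/scipy): for a Hermitian matrix H with eigenpair (E, |o⟩), the matrix exponential obeys e^{−iHt}|o⟩ = e^{−iEt}|o⟩ (ℏ = 1).

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    H = (A + A.T) / 2                    # Hermitian 'Hamiltonian'

    E, vecs = np.linalg.eigh(H)
    o, t = vecs[:, 0], 0.8               # eigenvector with eigenvalue E[0]

    lhs = expm(-1j * H * t) @ o
    rhs = np.exp(-1j * E[0] * t) * o
    print(np.allclose(lhs, rhs))         # True: f(H)|o> = f(E)|o>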

26. Proof by induction


On Liquor Production by David M. Smith

A friend who’s in liquor production


Owns a still of astounding construction
The alcohol boils
Through old magnet coils;
She says that it’s "proof by induction."

Idea: In induction, we begin by demonstrating that a result holds for an assumed ‘starting’ value of a discrete
index n, usually n = 0 or n = 1. Next, we assume it to hold for an arbitrary value of n, then prove
that it holds when n is replaced by n + 1. Since n was arbitrary to begin with, the identity must hold for
arbitrary n. Thus we have a chain of deduction that reaches from n = 0 (or 1) through arbitrary n (since
for any n we get the result for n + 1 by the replacement n → n + 1), and thus we have a completely general
result.
Example: Prove by induction that

(x − d/dx)^n e^{−x²/2} = (−1)^n e^{+x²/2} (d^n/dx^n) e^{−x²}.    (183)

Proof

First, does it hold for n = 0? Yes, since (x − d/dx)⁰ ≡ 1, (−1)⁰ = 1, and the zeroth derivative of a function is
the function itself. Now on to arbitrary n. Replace n by n + 1 in the equation above, but isolate the part
which looks most like Eq. (183):

(x − d/dx)^{n+1} e^{−x²/2} = (x − d/dx) [ (x − d/dx)^n e^{−x²/2} ]

The quantity in [ ] on the right side is by hypothesis equal to the right hand side of Eq. (183), so

(x − d/dx) [ (x − d/dx)^n e^{−x²/2} ] = (x − d/dx) [ (−1)^n e^{+x²/2} (d^n/dx^n) e^{−x²} ].

Expanding out the derivatives on the right side,

(x − d/dx) [ (−1)^n e^{+x²/2} (d^n/dx^n) e^{−x²} ]
   = (−1)^n x e^{+x²/2} (d^n/dx^n) e^{−x²} − (−1)^n (1/2)(2x) e^{+x²/2} (d^n/dx^n) e^{−x²}
     − (−1)^n e^{+x²/2} (d^{n+1}/dx^{n+1}) e^{−x²}.    (184)

The last task is to make the right hand side depend only on the index n + 1, in order to see if the identity
holds when n is replaced by n + 1. So we have to simplify; this is easy in this example since the first two
terms cancel identically. We find

(x − d/dx)^{n+1} e^{−x²/2} = (−1)^{n+1} e^{+x²/2} (d^{n+1}/dx^{n+1}) e^{−x²},    (185)

which is of the same form as Eq. (183) with n replaced by n + 1. Thus we have proved this result by induction:
since n was unspecified, it holds for n = 0, 1, ..., i.e., for arbitrary n.
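A direct sympy check of Eq. (183) for the first few n (my sketch):

    import sympy as sp

    x = sp.symbols('x')

    def lhs(n):                          # (x - d/dx)^n applied to exp(-x^2/2)
        expr = sp.exp(-x**2 / 2)
        for _ in range(n):
            expr = x * expr - sp.diff(expr, x)
        return expr

    for n in range(5):
        rhs = (-1)**n * sp.exp(x**2 / 2) * sp.diff(sp.exp(-x**2), x, n)
        assert sp.simplify(lhs(n) - rhs) == 0
    print("Eq. (183) verified for n = 0..4")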
