The following sections survey mathematics used frequently in physics courses; they are meant as reminders,
not derivations. These are widely used but typically ‘fall through the cracks’ of lots of physics (or math) courses.
To the (limited) extent that a section depends on another section, they have been placed so that dependent
sections occur downstream.
2. Buckingham Π theorem
If a physical problem contains N variables that depend on only P distinct units, there are N − P dimensionless quantities which describe the physics.
(a) Example: Cooking a turkey. The heat conduction equation describes the rate of change of the temper-
ature of a material in terms of material properties and the temperature’s spatial dependence:
\[ \frac{\partial T}{\partial t} = \kappa \nabla^2 T . \tag{3} \]
where κ is (see footnote 1) the ‘thermal diffusivity’ (units: m²/sec). We define a turkey as ‘cooked’ when its core
temperature has reached the ‘cooked’ temperature $T_c$; see panel (a) of Fig. (1). We expect the relevant
physical variables to be: $t$, $T_c$, $\kappa$, $T$, and the turkey volume $V$. We seek ‘dimensionless groups’ in the
form
\[ T^{a}\, T_c^{b}\, t^{c}\, \kappa^{d}\, V^{e} . \tag{5} \]
Important: Familiarity with statistical mechanics or kinetic theory might suggest that you should ex-
press temperatures in units of (energy/kB ), where kB is Boltzmann’s constant. This would be a mistake,
and you would have to recapitulate a brief but intense controversy in the early 20th century. The up-
shot was that, for logical consistency, the temperature must be assigned its own units, often denoted
[θ]. With this proviso, our dimensionless group in Eq. (5) has net units
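\[ [\theta]^{a+b}\,[\mathrm{sec}]^{c} \left[\frac{\mathrm{cm}^2}{\mathrm{sec}}\right]^{d} [\mathrm{cm}]^{3e} = [\theta]^{a+b}\,[\mathrm{cm}]^{2d+3e}\,[\mathrm{sec}]^{c-d} . \tag{6} \]
So we have 5 physical quantities which depend on only 3 distinct units (temperature, time, and length),
so we should be able to construct (according to the Π theorem) 2 dimensionless quantities. We do so by
equating to zero the exponents of each distinct unit, which gives $b = -a$, $c = d$, and $e = -2d/3$. We find a ‘chunk’ of the form
\[ \text{dimensionless quantity} = \left(\frac{T}{T_c}\right)^{a} \left(\frac{\kappa t}{V^{2/3}}\right)^{d} . \tag{7} \]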
[Footnote 1: You don’t need to know where κ comes from, but it pops right out of a derivation of the heat flow equation as
\[ \kappa = \frac{\sigma_{\rm therm}}{C_V\, \rho_m} \tag{4} \]
where $\sigma_{\rm therm}$ is the thermal conductivity [units J/(m·sec·°C)], $C_V$ is the specific heat (at constant volume) [units J/(kg·°C)], and $\rho_m$ is the mass
density [units kg/m³], so that κ has units m²/sec.]
Since a and d are independent, we have identified the two dimensionless groups: the quantities raised to
those powers. We conclude that the natural unit in which to measure the turkey temperature T is
$T_c$, and the natural unit in which to measure elapsed time in the oven is $V^{2/3}/\kappa$. We can thus write
\[ \frac{T}{T_c} = f\!\left(\frac{\kappa t}{V^{2/3}}\right) . \tag{8} \]
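The exponent bookkeeping behind Eqs. (6)-(8) can be checked mechanically. The following is a minimal sketch, not part of the original notes: the table `UNITS` and the helper `net_units` are hypothetical names introduced here, and each variable is assumed to carry the units stated above (temperature, length, time).

```python
from fractions import Fraction

# Assumed unit exponents (temperature, length, time) for each variable
# of the turkey problem, following the text above.
UNITS = {
    "T":     (1, 0, 0),
    "Tc":    (1, 0, 0),
    "t":     (0, 0, 1),
    "kappa": (0, 2, -1),   # length^2 / time
    "V":     (0, 3, 0),    # length^3
}

def net_units(exponents):
    """Net (temperature, length, time) exponents of a product of powers."""
    total = [Fraction(0)] * 3
    for name, power in exponents.items():
        for i, unit_power in enumerate(UNITS[name]):
            total[i] += Fraction(power) * unit_power
    return tuple(total)

# The two groups identified in Eq. (7).
group_1 = {"T": 1, "Tc": -1}
group_2 = {"kappa": 1, "t": 1, "V": Fraction(-2, 3)}

if __name__ == "__main__":
    print(net_units(group_1), net_units(group_2))
```

Both groups should come out with all three exponents equal to zero; any other combination of powers (for example `{"kappa": 1, "t": 1}` alone) leaves a residual length exponent.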
We expect that f (t/tscl ) will in general have an infinite number of terms in its Taylor series, so the
value of d is irrelevant—all powers will be present. When T (t) = Tc the turkey is cooked (see Fig. 1),
[Figure 1 plots: panel (a), T versus t, with T_c and t_c marked on the axes; panel (b), u versus τ.]
Figure 1: Examples of use of dimensionless variables. Panel (a): Schematic dependence of turkey temperature on
time. Panel (b): Solution to ODE using strictly dimensionless variables.
Note that we have deduced this without knowing anything about the actual mathematical solutions to
the heat flow equation. However, we can think about the solution a bit: the faster heat diffuses into
the turkey (the larger κ) the shorter the cooking time. $V^{2/3}$ is proportional to the surface area of the
turkey: the only way heat (energy) can flow into the turkey is through its surface. The shape of the
object to be cooked does not affect these ‘scaling arguments’.
(b) Finding complete solution of an ODE using dimensionless variables
The equation for a particle (mass M) falling vertically with air resistance might look like
\[ M \frac{d^2 z}{dt^2} = -Mg - \gamma \frac{dz}{dt} , \tag{10} \]
with initial conditions, say, $z(t=0) = 0$ and $\frac{dz}{dt}(t=0) = 1$. At first glance, in order to predict the
behavior of the particle we would need to find z = z(t, M, g, γ), so that a plot of the solution would
require 5 dimensions: bummer. The Buckingham Π theorem, however, says that out of the 5 variables
z, t, M, g, γ—which depend only on the 3 units kg, m, and sec—we can construct two dimensionless
groups. This amounts to finding the natural time scale $T$ and distance scale $\ell$ and casting the solution
(and the ODE) into the form (don’t confuse mass M with the unit meters, m)
\[ \frac{z}{\ell} = f\!\left(\frac{t}{T}\right) . \tag{11} \]
First we find the dimensions (units) of γ from the ODE, then we look for
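This time we want $\ell$ to have units m¹ and $T$ to have units sec¹. Now we find no undetermined exponents
and $\ell = M^2 g/\gamma^2$ and $T = M/\gamma$, and with $u \equiv z/\ell$ and $\tau \equiv t/T$ we can rewrite the original ODE as
\[ \frac{d^2 u}{d\tau^2} + \frac{du}{d\tau} + 1 = 0 . \tag{14} \]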
This equation depends on no parameters; whatever (two) initial conditions we are given can be scaled
as well. Thus from requiring five dimensions for a plot we have collapsed down to u = u(τ).
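Because Eq. (14) contains no parameters, a single numerical solution covers every choice of M, g, and γ. The sketch below is an illustration under stated assumptions, not the notes' own method: it integrates the scaled equation with a classical fourth-order Runge-Kutta step and compares against the closed form $u(\tau) = u_0 + (v_0 + 1)(1 - e^{-\tau}) - \tau$, which can be verified by substitution. The function names and the scaled initial values are illustrative choices.

```python
import math

def rhs(u, v):
    """First-order form of u'' + u' + 1 = 0 with v = du/dtau."""
    return v, -v - 1.0

def integrate(u0, v0, tau_end, steps=1000):
    """Classical RK4 integration of the scaled ODE from tau = 0 to tau_end."""
    h = tau_end / steps
    u, v = u0, v0
    for _ in range(steps):
        k1u, k1v = rhs(u, v)
        k2u, k2v = rhs(u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = rhs(u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = rhs(u + h * k3u, v + h * k3v)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return u

def exact(u0, v0, tau):
    """Closed-form solution of u'' + u' + 1 = 0 with u(0) = u0, u'(0) = v0."""
    return u0 + (v0 + 1.0) * (1.0 - math.exp(-tau)) - tau

if __name__ == "__main__":
    # Illustrative scaled initial conditions (assumed values).
    print(integrate(0.0, 1.0, 5.0), exact(0.0, 1.0, 5.0))
```

To recover a dimensional trajectory, multiply u by $\ell = M^2 g/\gamma^2$ and τ by $T = M/\gamma$.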
3. Common series
\[ \frac{1}{1+x} = 1 - x + x^2 - x^3 + \ldots \tag{15} \]
converges for |x| < 1.
\[ e^x = 1 + x + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!} + \ldots \tag{16} \]
converges for all x.
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots \tag{17} \]
converges for all x.
\[ (a+b)^n = a^n + n a^{n-1} b + \frac{n(n-1)}{2!} a^{n-2} b^2 + \ldots = \sum_{j=0}^{n} \frac{n!}{j!\,(n-j)!}\, a^{n-j} b^{j} \]
is the binomial theorem. Since this coincides with a Taylor series of $(a+b)^n$ about b = 0, it applies even
for non-integer values of n, so that, for example, $\sqrt{1+\delta} \approx 1 + \frac{\delta}{2} - \frac{\delta^2}{8} + \ldots$.
\[ \log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \ldots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} x^n}{n} \tag{18} \]
\[ S = \sum_{n=0}^{N} z^n = 1 + z + z^2 + \ldots + z^N \tag{20} \]
\[ zS = z \sum_{n=0}^{N} z^n = z + z^2 + \ldots + z^{N+1} \tag{21} \]
so that
\[ S - zS = S(1-z) = 1 + (z - z) + (z^2 - z^2) + \ldots + (0 - z^{N+1}) \tag{22} \]
and thus
\[ S = \frac{1 - z^{N+1}}{1 - z} . \tag{23} \]
This works for complex z too, e.g.,
\[ \sum_{n=0}^{N} e^{inz} = \sum_{n=0}^{N} \left(\exp iz\right)^n = \frac{1 - \exp i(N+1)z}{1 - \exp iz} \]
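A quick numerical comparison of the term-by-term sum with the closed form of Eq. (23) is sketched below, including the complex case. The function names are illustrative and the test values are arbitrary.

```python
import cmath

def direct_sum(z, N):
    """Sum z**n for n = 0..N term by term."""
    return sum(z ** n for n in range(N + 1))

def closed_form(z, N):
    """Closed form (1 - z**(N+1)) / (1 - z); requires z != 1."""
    return (1 - z ** (N + 1)) / (1 - z)

if __name__ == "__main__":
    z = cmath.exp(0.3j)          # a point on the unit circle, exp(i*0.3)
    print(direct_sum(z, 10), closed_form(z, 10))
```

The closed form is undefined at z = 1, where the sum is simply N + 1.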
5. Volume elements
7. Dirac δ function
• Properties:
\[ \int_{-\infty}^{+\infty} dx\, f(x)\, \delta(x-a) = f(a) \tag{26} \]
\[ \int_{a}^{b} dx\, f(x)\, \delta(x-c) = f(c)\, \theta(b-c)\, \theta(c-a) \tag{27} \]
(means: the delta function can only contribute if the point where it diverges lies in the range of integration).
• Relation to step function
\[ \frac{d}{dx}\theta(x) = \delta(x) \tag{28} \]
[Proof: Integrate by parts with $u = 1$, $dv = \frac{d}{dx}\theta(x)\,dx$]
• Common representations The following functions in the appropriate limit have the properties of the
Dirac δ function, and occur in various physical applications:
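(a) $\frac{1}{2\pi} \int_{-\infty}^{+\infty} dk\, \exp(ikx)$
(b) $\lim_{b\to\infty} \frac{\sin xb}{\pi x}$
(c) $\lim_{n\to\infty} \frac{\sin\left[\left(n+\frac{1}{2}\right)x\right]}{2\pi \sin\frac{x}{2}}$
(d) $\lim_{a\to 0} \frac{1}{\pi} \frac{a}{a^2+x^2}$
(e) $\lim_{b\to\infty} \frac{b}{\sqrt{\pi}} \exp\left(-b^2 x^2\right)$
(f) Box function
\[ \lim_{a\to\infty} \begin{cases} 0, & x < -\frac{1}{2a} \\ a, & -\frac{1}{2a} < x < +\frac{1}{2a} \\ 0, & x > +\frac{1}{2a} \end{cases} \]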
There are more at http://functions.wolfram.com/GeneralizedFunctions/DiracDelta/09/
• Other properties
(a) δ(−x) = δ(x)
(b) x δ(−x) = 0
(c) $\delta(ax) = \frac{1}{|a|}\, \delta(x)$
(d) δ functions of more complicated arguments
\[ \delta[\varphi(x)] = \sum_i \frac{\delta(x - x_i)}{\left|\frac{d\varphi}{dx}\right|_{x=x_i}} \]
(where the $\{x_i\}$ are the roots of ϕ(x) = 0). Proof: Expand ϕ(x) around one of its zeroes and use
the previous item; each zero must contribute so we must sum over all of them. (If the derivatives
at the roots are themselves zero, we must do more work.)
• Delta functions in non-Cartesian coordinate systems
If the length elements associated with curvilinear coordinates u, v, and w are du/U (u, v, w), dv/V (u, v, w)
and dw/W (u, v, w), then
Example: In spherical polar coordinates the product of path lengths which give rise to the volume
element is $(r\sin\theta\, d\phi)(r\, d\theta)\, dr$, so with u = r, v = θ, w = φ, du/U = dr, we can identify dv/V =
r dθ, dw/W = r sin θ dφ. Thus U = 1, V = 1/r, W = 1/(r sin θ), and
\[ \delta(\mathbf{r} - \mathbf{r}') = \frac{1}{r^2 \sin\theta}\, \delta(r - r')\, \delta(\theta - \theta')\, \delta(\phi - \phi') . \tag{30} \]
Note that the units are what they should be.
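The sifting property and the rule for $\delta[\varphi(x)]$ can be checked numerically by standing in a narrow Gaussian, representation (e) above, for the δ function. This is a rough sketch under assumptions: the width parameter b = 50 and the trapezoid-rule helper are illustrative choices, and agreement is only expected up to corrections of order $1/b^2$.

```python
import math

def delta_gauss(x, b):
    """Representation (e): (b / sqrt(pi)) * exp(-b^2 x^2), sharp for large b."""
    return b / math.sqrt(math.pi) * math.exp(-(b * x) ** 2)

def integrate(func, lo, hi, n=40000):
    """Composite trapezoid rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h

def sift(f, a, b=50.0):
    """Approximate the integral of f(x) delta(x - a), which should tend to f(a)."""
    return integrate(lambda x: f(x) * delta_gauss(x - a, b), a - 1.0, a + 1.0)

def sift_composite(f, phi, lo, hi, b=50.0):
    """Approximate the integral of f(x) delta[phi(x)] over [lo, hi]."""
    return integrate(lambda x: f(x) * delta_gauss(phi(x), b), lo, hi)

if __name__ == "__main__":
    print(sift(math.cos, 0.0))                       # close to cos(0) = 1
    # phi(x) = x^2 - 1 has roots +1 and -1 with |phi'| = 2 at each, so the
    # result should approach [f(1) + f(-1)] / 2 = 2 for f(x) = x + 2.
    print(sift_composite(lambda x: x + 2.0, lambda x: x * x - 1.0, -3.0, 3.0))
```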
[Figure 2 sketch: the z axis, the position vector r, and the wavevector k.]
Figure 2: Geometry and spherical polar coordinates used to evaluate Fourier transform for spherically symmetric
function.
these coordinates the ‘polar angle’ θ ranges from 0 to π (radians) and the ‘azimuthal angle’ φ ranges
from 0 to 2π (radians), while the length r of the vector r ranges from 0 to ∞. The volume element
dr = dV = dxdydz in these coordinates is r 2 dr sin θdθdφ.
Since in calculating the Fourier transform f (k) we need to do an integral over all space, we can adopt
any orientation of our coordinate system which is convenient. It is wildly convenient to rotate our
axes so that the z axis lies along the vector k [as shown in Fig. 2]. Then, if f (r) = f (r ) [spherically
symmetrical function],
k·r = kr cos θ,
and our integral becomes
\[ f(\mathbf{k}) = \int_0^\infty dr\, r^2 \int_0^\pi d\theta\, \sin\theta \int_0^{2\pi} d\phi\, f(r) \exp(ikr\cos\theta) \]
\[ = 2\pi \int_0^\infty dr\, r^2 f(r) \int_{\cos\theta=1}^{\cos\theta=-1} \left[-d(\cos\theta)\right] \exp(ikr\cos\theta) \]
\[ = 2\pi \int_0^\infty dr\, r^2 f(r) \int_{-1}^{+1} d\mu\, \exp(ikr\mu) \]
where we have done the φ integral and made a change of variables to μ ≡ cos θ. But
\[ \int_{-1}^{+1} d\mu\, \exp(ikr\mu) = \left. \frac{\exp(ikr\mu)}{ikr} \right|_{-1}^{+1} = 2\, \frac{\sin kr}{kr} \]
Finally, then, we have for any spherically symmetric function f(r) that
\[ f(k) = \int_0^\infty dr\, 4\pi r^2 f(r)\, \frac{\sin kr}{kr} \]
The same process can be applied to compute inverse Fourier transforms of functions f(|k|) = f(k)
(aligning the $k_z$ axis along the vector r), yielding
\[ f(\mathbf{r}) = \frac{1}{(2\pi)^3} \int_0^\infty dk\, 4\pi k^2 f(k)\, \frac{\sin kr}{kr} = f(r) \]
(b) Specific example
Suppose that
\[ f(r) = -\frac{Ze}{r} \exp(-\lambda r). \]
(Such a form emerges from the ‘Thomas-Fermi’ model for the electrostatic potential due to a point
of charge Ze after ‘screening’ by a gas of electrons; here λ depends on the density of that gas. It is
also known as the ‘Yukawa potential’ in particle physics.) It is easy to use Mathematica to evaluate
such integrals, as is shown in Fig. 3. Note that in the limit of λ → 0 we properly recover the Fourier
transform of the ‘bare’ $\frac{1}{r}$ Coulomb potential, namely
\[ f_{\rm Coulomb}(k) = \frac{4\pi Ze}{k^2} \tag{35} \]
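The radial transform formula can be tested on the screened form without a computer algebra system. The sketch below is an illustration under assumptions: the prefactor multiplying $\exp(-\lambda r)/r$ is set to +1 so that only the shape is tested, the upper limit is truncated at a finite `r_max`, and a midpoint rule is used so the integrand is never evaluated at r = 0. For this unit-amplitude form the expected result is $4\pi/(k^2 + \lambda^2)$, which follows from $\int_0^\infty e^{-\lambda r} \sin kr\, dr = k/(k^2+\lambda^2)$.

```python
import math

def radial_transform(f, k, r_max, n=200000):
    """Midpoint-rule estimate of the integral of 4 pi r^2 f(r) sin(kr)/(kr) on [0, r_max]."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h            # midpoints avoid the r = 0 singularity of f
        total += 4.0 * math.pi * r * r * f(r) * math.sin(k * r) / (k * r)
    return total * h

def screened(r, lam=1.0):
    """Unit-amplitude screened Coulomb (Yukawa) form exp(-lam r) / r (assumed normalization)."""
    return math.exp(-lam * r) / r

if __name__ == "__main__":
    k = 2.0
    print(radial_transform(screened, k, 40.0), 4.0 * math.pi / (k * k + 1.0))
```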
Example: screened Coulomb potential
    f[r_] := Z e/r Exp[-Λ r]
    fofk = Simplify[4 π Integrate[r^2 Sin[k r]/(k r) f[r], {r, 0, ∞}, GenerateConditions -> False]]
        (output)  4 e π Z/(k^2 + Λ^2)
    inverseFT = Integrate[4 π k^2 Sin[k r]/(k r) fofk, {k, 0, ∞}, GenerateConditions -> False]/(2 π)^3 // PowerExpand
    Simplify[inverseFT, r > 0]
        (output)  e Exp[-r Λ] Z/r
Figure 3: Sample Mathematica input and output used to evaluate Fourier transform and its inverse.
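\[ f(\mathbf{r} + \mathbf{R}) - f(\mathbf{r}) \equiv 0 \tag{36} \]
\[ = \int \frac{d\mathbf{k}}{(2\pi)^3}\, f(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}} \left[ e^{i\mathbf{k}\cdot\mathbf{R}} - 1 \right] , \tag{37} \]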
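so that only a discrete set of ‘reciprocal lattice vectors’ (RLVs) {G} obeying exp(iG·R) ≡ 1 contribute. Writing
therefore
\[ f(\mathbf{r}) = \sum_{\mathbf{G}} f_{\mathbf{G}}\, e^{i\mathbf{G}\cdot\mathbf{r}} , \tag{38} \]
we may multiply both sides by exp(−iK·r) (where K is another reciprocal lattice vector) and integrate over
all space, finding
\[ \int_{\text{all } \mathbf{r}} d\mathbf{r}\, f(\mathbf{r})\, e^{-i\mathbf{K}\cdot\mathbf{r}} = \sum_{\mathbf{G}} f_{\mathbf{G}} \int_{\text{all } \mathbf{r}} d\mathbf{r}\, e^{i(\mathbf{G}-\mathbf{K})\cdot\mathbf{r}} , \tag{39} \]
in which the integral on the right evaluates to
\[ \int_{\text{all } \mathbf{r}} d\mathbf{r}\, e^{i(\mathbf{G}-\mathbf{K})\cdot\mathbf{r}} = (2\pi)^3\, \delta(\mathbf{K} - \mathbf{G}) \quad \text{(Defn. of delta function)} \tag{40} \]
\[ = \Omega\, \delta_{\mathbf{G},\mathbf{K}} \quad \text{(treating RLVs as discrete)} \tag{41} \]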
where Ω is the volume of the crystal (nominally infinite). So we have two useful results:
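\[ \sum_{\mathbf{k}} \to \frac{\Omega}{(2\pi)^3} \int_{\text{all } \mathbf{k} \text{ space}} d\mathbf{k} \tag{42} \]
\[ f_{\mathbf{G}} = \frac{1}{\Omega} \int_{\rm crys} d\mathbf{r}\, f(\mathbf{r})\, e^{-i\mathbf{G}\cdot\mathbf{r}} = \frac{1}{v_{\rm prim}} \int_{\text{prim cell}} d\mathbf{r}\, f(\mathbf{r})\, e^{-i\mathbf{G}\cdot\mathbf{r}} . \tag{43} \]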
The last line follows from the fact that the integration over the entire crystal can be reduced to one over
a single ‘primitive cell’ of volume vprim ≡ Ω/Ncell , where we envision the (infinite) crystal volume as
consisting of an integral number Ncell of primitive cells.
we can solve once for the Green function, which obeys
for any well-behaved h(r, t). Thus the Green function for a PDE is the solution corresponding to a ‘pulse’
initial condition which is a spatial and temporal Dirac delta function. Example: The three-dimensional
diffusion equation for particle density n(r, t) in the presence of a source term h(r, t) is
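\[ \frac{\partial n(\mathbf{r}, t)}{\partial t} - D \nabla^2 n(\mathbf{r}, t) \equiv \hat{L}\, n(\mathbf{r}, t) = h(\mathbf{r}, t) \tag{47} \]
where D is the diffusion constant. The Green function (non-zero only for $t > t'$) is readily evaluated by
contour evaluation and obeys
\[ G(\mathbf{r}, \mathbf{r}', t, t') = \int \frac{d\mathbf{k}}{(2\pi)^3}\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')} \int \frac{d\omega}{2\pi}\, e^{-i\omega(t-t')}\, \frac{1}{Dk^2 - i\omega} \tag{48} \]
\[ G(\mathbf{r}, \mathbf{r}', t, t') = \frac{1}{\left(2\sqrt{\pi D (t-t')}\right)^3} \exp\left[ -\frac{(\mathbf{r}-\mathbf{r}')^2}{4D(t-t')} \right] \tag{49} \]
\[ \int d\boldsymbol{\Delta}\, G(\Delta, t, t') = \int d\Delta\, 4\pi \Delta^2\, G(\Delta, t, t') = \begin{cases} 1, & t > t' \\ 0, & t < t' \end{cases} \tag{50} \]
where $\boldsymbol{\Delta} = \mathbf{r} - \mathbf{r}'$ and G actually depends on $\mathbf{r}, \mathbf{r}'$ only through the difference $|\mathbf{r} - \mathbf{r}'|$. The above properties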
of Green functions are crucial in that, for example, they preserve probability (in the context of quantum
mechanics) or particle number (in the case of particle diffusion above). Figure 4 shows an example.
[Figure 4 plots: panel (a), ‘Green function for 3D diffusion equation’, plotted against Δr and Δt; panel (b), curves labeled ‘scalar pot’ and ‘tensor pot’.]
Figure 4: Radial Green function $4\pi\Delta^2 G(\Delta r = |\mathbf{r} - \mathbf{r}'|, \Delta t = t - t')$, for three dimensional diffusion equation for
D = 1, panel (a). Panel (b): examples of forces derived from scalar and tensor potentials.
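The normalization property in Eq. (50) is easy to confirm numerically from the Gaussian form of Eq. (49). The following is a minimal sketch: the cutoff radius (twelve standard deviations) and the midpoint rule are illustrative choices, and the function names are hypothetical.

```python
import math

def green(delta, dt, D=1.0):
    """Eq. (49) as a function of the separation delta = |r - r'| and dt = t - t' > 0."""
    return math.exp(-delta ** 2 / (4.0 * D * dt)) / (2.0 * math.sqrt(math.pi * D * dt)) ** 3

def total_weight(dt, D=1.0, n=20000):
    """Midpoint-rule estimate of the integral of 4 pi delta^2 G over all separations."""
    r_max = 12.0 * math.sqrt(2.0 * D * dt)   # far into the Gaussian tail
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += 4.0 * math.pi * r * r * green(r, dt, D)
    return total * h

if __name__ == "__main__":
    for dt in (0.1, 1.0, 5.0):
        print(dt, total_weight(dt))   # each should be close to 1
```

The result stays at 1 as the pulse spreads, which is the particle-number conservation mentioned above.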
If we recognize that the integers m and n are dummy indices, that is what we call the variable we’re inte-
grating over (here, summing over) is up to us, we can for convenience replace m by n, and rewrite
\[ \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} b_n x^n , \tag{52} \]
or
\[ \sum_{n=0}^{\infty} (a_n - b_n)\, x^n = 0, \tag{53} \]
only possible (since x is arbitrary) if an = bn . (Strictly speaking we can’t do this unless the series are
uniformly convergent.)
The same works for integrals over dummy variables too, e.g., if two functions f (r) and g(r) are equal, then
in terms of their Fourier transforms
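\[ \int \frac{d\mathbf{k}}{(2\pi)^3} \left[ \tilde{f}(\mathbf{k}) - \tilde{g}(\mathbf{k}) \right] \exp(i\mathbf{k}\cdot\mathbf{r}) = 0, \tag{54} \]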
where the particles labeled by i and j interact via the ‘pair potential’ φ(r). The restriction j ≠ i is needed
in order not to include non-physical self-interactions and is often indicated by the prime on the sum, and
the prefactor of 1/2 occurs because in the double sum a given pair (e.g., 1-2) occurs twice: i = 1, j = 2 and
i = 2, j = 1, while we wish to include the pair energy of interaction only once. Make sure you understand
why each line is equivalent to the previous lines above, since all are common notations. Discrete sums
occur because above we assume the coordinates of the N particles are given. Very commonly the
coordinates vary continuously over a volume, so that the total potential energy U above may be written
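\[ U = \frac{1}{2} \int d\mathbf{r} \int d\mathbf{r}'\, \phi(\mathbf{r} - \mathbf{r}'); \tag{57} \]
as usual, $\mathbf{r}$ and $\mathbf{r}'$ are ‘dummy variables’. Because φ depends only on the difference $\mathbf{r} - \mathbf{r}'$ we can define $\mathbf{x} \equiv \mathbf{r} - \mathbf{r}'$,
finding
\[ U = \frac{1}{2} \int d\mathbf{r} \int d\mathbf{x}\, \phi(\mathbf{x}) \tag{58} \]
\[ = \frac{1}{2}\, \Omega \int d\mathbf{x}\, \phi(\mathbf{x}) \tag{59} \]
where the volume of the system $\Omega = \int d\mathbf{r}$ has been assumed finite but unspecified. Thus
\[ \frac{U}{\Omega} = \frac{1}{2} \int_{\Omega} d\mathbf{x}\, \phi(\mathbf{x}) \tag{60} \]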
is the total potential energy per unit volume of the system. For an infinite system (such as is often assumed
as a model for a crystal) U and Ω each are infinite, but the physically relevant quantity U /Ω is well-defined
and finite provided the integral is finite. If there are N particles in the system, U/N = (U/Ω)/(N/Ω) is
the potential energy per particle and N/Ω is the particle density. This situation (U → ∞ and Ω → ∞)
is commonly known as the thermodynamic limit, since it is the situation in which a conventional
three-dimensional thermodynamic description of a system is used (see footnote 3).
A pair potential which obeys φ(r) = φ(|r|) = φ(r ) is ‘central’, that is, the corresponding force is directed
along the line from the origin [center] to the point r. For such a φ(x),
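\[ U = \frac{1}{2}\, \Omega \int d\mathbf{x}\, \phi(\mathbf{x}) \tag{61} \]
\[ = \frac{1}{2}\, \Omega \int_0^\infty dx\, 4\pi x^2\, \phi(x). \tag{62} \]
[Footnote 3: For finite size systems for which the ratio $A/V^{2/3}$ is appreciable there are additional surface and other terms. Here A is the system surface
area and V its volume.]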
This argument at least suggests (correctly) that there is something horribly pathological about the Coulomb
(1/x) potential, the longest range potential in nature. Central potentials are so common it’s easy to forget
that not all pair potentials are central. For example, to explain the properties of the deuteron a ‘tensor
force’ must be invoked, whose potential is of the form
2
Vtens (r) = λr 6 (S · r̂) − S · S ; (63)
\[ = \frac{1}{2} \sum_{i=1}^{N} \phi(0) \tag{65} \]
\[ = \frac{N}{2}\, \phi(0); \tag{66} \]
once again we find a finite potential energy per particle, provided that the total potential energy at a given
site is finite.
14. Using algebraic variables and counting arguments for solutions of a linear equation
We consider as an example the solution of the one-dimensional Schrödinger equation with a Dirac delta
function centered at x = a, V (x) = V0 δ(x − a):
\[ \frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2} \left[ E - V(x) \right] \psi = 0. \tag{67} \]
[Figure 5 plot: V(x) versus x, showing the spike V(x) = V0 δ(x − a) at x = a, regions I and II on either side, and the levels E > 0 and E < 0.]
Figure 5: Sample potential for Schrödinger equation
As always, the wavefunction is continuous where the potential is singular at x = a but the singularity
forces us to examine the solution to the left and to the right of x = a; in these regions V(x) = 0, so the ODE
is the same on either side of the singularity. Integrating the SE from just below to just above the singularity,
we find the usual expression for the jump in slope. We thus have the two equations
It is at this point that we have to commit to the sign of E: E > 0 yields oscillatory solutions, E < 0 yields
exponentially growing or damped solutions.
Case E > 0: We proceed as usual: in regions I and II, to the left and to the right of the singularity,
\[ \psi(x) = A e^{ikx} + B e^{-ikx}, \quad x < a, \; E > 0; \qquad k \equiv \sqrt{\frac{2mE}{\hbar^2}} \tag{70} \]
\[ \psi(x) = C e^{ikx} + D e^{-ikx}, \quad x > a, \; E > 0. \tag{71} \]
Note: This solution is never correct! This is for physical, not mathematical reasons: E > 0 corresponds to
unbound (‘scattering’) states. Because we are solving the time-independent SE (see footnote 4), we must treat this problem
in a way similar to how we’d treat the optics problem of a beam of light incident on an interface from one
side or the other. We don’t ask what the time-evolution of the beam is, but instead we solve for steady-state
conditions, when there is no explicit time dependence (in our case, because our quantum system has fixed
energy, so that all pieces of the wavefunction evolve in time in an identical way). Thus we MUST specify
the physical boundary conditions: whether our particle is incident from the left ($\propto e^{ikx}$) or from the right
($\propto e^{-ikx}$). If from the left we expect a REFLECTED beam ($\propto e^{-ikx}$) and a TRANSMITTED beam ($\propto e^{ikx}$). We
do not have, in addition to the particles incident from the left, any particles incident from the right (x = ∞,
propagating $\propto e^{-ikx}$ in region II). Thus for E > 0 we must set D ≡ 0 (if we have in mind a beam incident
from the left). Equations (using ψ(a) in terms of C rather than two terms involving A and B, and dividing
through the derivative equation to simplify):
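\[ \begin{pmatrix} 1 & e^{-2ika} & -1 \\ 1 & -e^{-2ika} & -\left(1 + \frac{2imV_0}{\hbar^2 k}\right) \\ 0 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} A \\ B \\ C \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \tag{72} \]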
Case E < 0: On the other hand, for purely mathematical reasons we know the solution for E < 0:
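\[ \psi(x) = A e^{\kappa x}, \quad x < a, \; E < 0; \qquad \kappa \equiv \sqrt{\frac{-2mE}{\hbar^2}} \tag{73} \]
\[ \psi(x) = D e^{-\kappa x}, \quad x > a, \; E < 0, \tag{74} \]
since terms with wrong-sign exponents would diverge. Equations (again, cleaning up the derivative jump
equation a little):
\[ \begin{pmatrix} 1 & -e^{-2\kappa a} \\ 1 & \left(1 + \frac{2mV_0}{\hbar^2 \kappa}\right) e^{-2\kappa a} \end{pmatrix} \cdot \begin{pmatrix} A \\ D \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{75} \]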
The profound differences between E > 0 and E < 0 can be detected simply by counting equations and
unknowns:
(i) For E > 0 there are 3 unknowns (A, B, and C) and 2 equations. Thus we will only be able to solve for the
ratios r ≡ B/A and t ≡ C/A, finding
\[ r = \frac{e^{2ika}}{-1 + i\chi} \tag{76} \]
\[ t = \frac{\chi}{i + \chi} . \tag{77} \]
where $\chi \equiv \hbar^2 k/(mV_0)$. The reflection and transmission coefficients $R \equiv |r|^2$ and $T \equiv |t|^2$ depend only on
$\chi^2$, so don’t even depend on the sign of $V_0$, and R + T = 1.
(ii) For E < 0 there are 2 unknowns (A and D) and 2 equations (same as before). This situation might seem
easier and better but it is not. The SE is a linear equation: if ψ(x) is a solution, so is5 any constant times
ψ(x). Something has gone wrong when we find as many equations as unknowns: we have tacitly assumed
that any E < 0 will work in the solutions above. In fact, this is not true: only ‘magic’ values of E < 0 (the
eigenvalues) will solve the SE. In fact, for a non-trivial solution we must set the determinant of the matrix
appearing in Eq. (75) equal to zero, finding
\[ 2\, e^{-2\kappa a} \left( 1 + \frac{mV_0}{\hbar^2 \kappa} \right) = 0. \tag{78} \]
[Footnote 4: The officially correct thing to do would be to construct a ‘wave packet’, the object which resembles a classical particle as much as is
consistent with the Uncertainty Principle. We could then watch the time evolution of a wave packet incident from x = −∞: the (slowly
spreading) wavepacket would come in, its high Fourier components (with a higher phase velocity) would begin to interact with the potential,
and eventually part of the wavepacket would pinch off (to give rise to the reflected wave) and the remainder would exit to x = +∞ (the
‘transmitted beam’).]
[Footnote 5: The requirement that the wavefunction be normalized might appear to be a third condition, but it is a physical, not a mathematical
requirement.]
m and ħ are positive, $e^{-x}$ is never equal to zero for finite x, and we selected κ > 0 in order to have the
wavefunction behave properly for large positive or negative x. Thus if $V_0 > 0$ (as shown in the figure) there
are no real roots for this equation. On the other hand, if $V_0 < 0$ (so we could write $V_0 = -|V_0|$), we do have
one solution, $\kappa = m|V_0|/\hbar^2$. We have made no assumptions about the sign of $V_0$ in our manipulations, and
so are at liberty to assume either sign. This is an example of treating $V_0$ as an algebraic variable: we assume
it to be positive, find conditions on a solution, and discover that in fact it needs to be negative. Note: As for
the E > 0 case, we cannot in fact solve for A and D. Instead, we plug back in the permitted solution (again,
valid only for $V_0 < 0$). We will then obtain two equations which both agree about the ratio $D/A = e^{+2\kappa a}$.
For either case, E > 0 or E < 0 we can now proceed to impose the physical requirement of normalization.
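The counting argument above can be checked numerically. The sketch below is an illustration, not the notes' own code: it evaluates r and t from Eqs. (76)-(77), substitutes A = 1, B = r, C = t into the two non-trivial rows of Eq. (72), and does the same for the bound state with $\kappa = m|V_0|/\hbar^2$ and $D/A = e^{2\kappa a}$ in Eq. (75). Units with m = ħ = 1 are an assumed default, and all function names are hypothetical.

```python
import cmath
import math

def scattering_amplitudes(k, a, V0, m=1.0, hbar=1.0):
    """r = B/A and t = C/A from Eqs. (76)-(77), with chi = hbar^2 k / (m V0)."""
    chi = hbar ** 2 * k / (m * V0)
    r = cmath.exp(2j * k * a) / (-1.0 + 1j * chi)
    t = chi / (1j + chi)
    return r, t

def scattering_residuals(k, a, V0, m=1.0, hbar=1.0):
    """Left-hand sides of the two non-trivial rows of Eq. (72) for A = 1, B = r, C = t."""
    r, t = scattering_amplitudes(k, a, V0, m, hbar)
    phase = cmath.exp(-2j * k * a)
    row1 = 1.0 + r * phase - t
    row2 = 1.0 - r * phase - (1.0 + 2j * m * V0 / (hbar ** 2 * k)) * t
    return row1, row2

def bound_state_kappa(V0, m=1.0, hbar=1.0):
    """The single root of Eq. (78); it exists only for an attractive spike, V0 < 0."""
    if V0 >= 0:
        return None
    return m * abs(V0) / hbar ** 2

def bound_residuals(a, V0, m=1.0, hbar=1.0):
    """Left-hand sides of Eq. (75) for A = 1 and D = exp(2 kappa a)."""
    kappa = bound_state_kappa(V0, m, hbar)
    D = math.exp(2.0 * kappa * a)
    decay = math.exp(-2.0 * kappa * a)
    row1 = 1.0 - decay * D
    row2 = 1.0 + (1.0 + 2.0 * m * V0 / (hbar ** 2 * kappa)) * decay * D
    return row1, row2

if __name__ == "__main__":
    r, t = scattering_amplitudes(1.3, 0.7, 2.0)
    print(abs(r) ** 2 + abs(t) ** 2)          # R + T
    print(scattering_residuals(1.3, 0.7, 2.0))
    print(bound_state_kappa(-2.0), bound_residuals(0.7, -2.0))
```

Flipping the sign of `V0` leaves R and T unchanged but removes or restores the bound state, which is the contrast the two cases are meant to show.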
Note: The relations we find by such index manipulations are valid for any i (for example). The matrix
relation between two n-dimensional column vectors, for example,
\[ \mathbf{a} = \overleftrightarrow{G} \cdot \mathbf{b} \tag{80} \]
in components becomes
\[ a_i = \sum_{j=1}^{n} G_{ij}\, b_j , \tag{81} \]
Cx = (A × B)x = Ay Bz − Az By , (82)
instead of going through two more laborious calculations to see what are Cy and Cz , we can perform
cyclic permutations: Using Cartesian indices i = 1 ↔ x, i = 2 ↔ y, i = 3 ↔ z, make the replacements
x → y
y → z
z → x.
(e) Levi-Civita tensor (‘completely antisymmetric third-rank tensor’)
This object is incredibly useful in constructing cross products of vectors or vector operators (e.g., the
angular momentum operator in quantum mechanics, or the curl in E&M). It is used, for example, in the
definition
\[ (\mathbf{A} \times \mathbf{B})_i = \sum_{j,k} \varepsilon_{ijk}\, A_j B_k . \tag{87} \]
Observe: the first index of ε is the component i we specify, the second is the component of the first
vector in the cross product, and the third is the component of the second vector.
What good is this? —Instead of having to deal first with the x component, then with the y, then z
components, we have an algebraic expression (albeit involving a sum) which gives us all the (Cartesian)
components—we just need to know which i to specify. In this expression
\[ \varepsilon_{ijk} = \begin{cases} 1 & ijk = 123 \text{ or ‘cyclic permutations’ } 312,\ 231 \\ -1 & ijk = 213 \text{ or cyclic permutations } 321,\ 132 \\ 0 & \text{any two indices same} \end{cases} \tag{88} \]
Thus the whole job of this object is to select out of a sum only the pieces for which i ≠ j ≠ k, crucial
in constructing cross products.
Key properties:
• Cyclic permutations
εijk = εkij = εjki (89)
[the ‘cyclic permutations’ of the original indices ijk], true for any ijk.
• Index swaps
εjik = −εijk (90)
Key identity:
\[ \varepsilon_{ijk}\, \varepsilon_{k\ell m} = \delta_{i\ell}\, \delta_{jm} - \delta_{im}\, \delta_{j\ell} \tag{91} \]
Note well the structure: we can only use this identity when the last index of the first ε is the first index
of the second ε. If we instead have the product
we can use the index swap identity to rewrite $\varepsilon_{\ell k m} = -\varepsilon_{k \ell m}$ and use the ‘delta identity’ Eq. (91) above.
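The definitions in Eqs. (87), (88), and (91) can be exercised directly. The following is a small sketch using 0-based indices (0, 1, 2 standing in for x, y, z); the compact product formula for ε is a convenience chosen here, not something taken from the notes, and the function names are illustrative.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}: +1 cyclic, -1 anticyclic, 0 if any repeat."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def cross(a, b):
    """Cross product built from Eq. (87): (a x b)_i = sum_jk epsilon_ijk a_j b_k."""
    return [
        sum(levi_civita(i, j, k) * a[j] * b[k] for j in range(3) for k in range(3))
        for i in range(3)
    ]

def delta_identity_holds():
    """Check Eq. (91) for every choice of the free indices i, j, l, m."""
    for i in range(3):
        for j in range(3):
            for l in range(3):
                for m in range(3):
                    lhs = sum(levi_civita(i, j, k) * levi_civita(k, l, m) for k in range(3))
                    rhs = delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
                    if lhs != rhs:
                        return False
    return True

if __name__ == "__main__":
    print(cross([1, 0, 0], [0, 1, 0]))   # x cross y
    print(delta_identity_holds())
```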
(f) Einstein summation convention
In physical applications indices which occur twice in sums (all examples below) are in fact summed
over. So we adopt a compact convention for quantities constructed from vectors, tensors, etc.: any
index occurring twice is assumed to be summed over.
Examples
• Dot product: for n-dimensional vectors
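\[ \sum_{i=1}^{n} a_i b_i \equiv a_i b_i = \mathbf{a} \cdot \mathbf{b} \tag{93} \]
\[ a_i = \sum_{j=1}^{n} G_{ij}\, b_j \equiv G_{ij}\, b_j = \left( \overleftrightarrow{G} \cdot \mathbf{b} \right)_i \tag{95} \]
or
\[ \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} G_{11} & G_{12} & \ldots & G_{1n} \\ G_{21} & G_{22} & \ldots & G_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ G_{n1} & G_{n2} & \ldots & G_{nn} \end{pmatrix} \cdot \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \tag{96} \]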
(g) Combining identities above
• Laplacian
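\[ \partial_i \partial_i f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \equiv \nabla^2 f \tag{97} \]
• Vector Taylor expansion
\[ f(\mathbf{r} + \mathbf{a}) = f(\mathbf{r}) + a_i \partial_i f + \frac{a_i a_j}{2!}\, \partial_i \partial_j f + \ldots \tag{98} \]
\[ = f(\mathbf{r}) + \mathbf{a} \cdot \nabla f + \frac{1}{2!} (\mathbf{a} \cdot \nabla)(\mathbf{a} \cdot \nabla) f + \ldots \tag{99} \]
where all derivatives are evaluated at r.
• Vector identities
\[ = \partial_i \partial_j A_j - \partial_j \partial_j A_i \tag{108} \]
\[ = \nabla_i (\nabla \cdot \mathbf{A}) - \nabla^2 A_i \tag{109} \]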
so that
∇ × (∇ × A) = ∇ (div A) − ∇2 A (110)
• Vector manipulations with inverse power laws
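e.g. Derive the ‘coordinate-free’ form for the magnetic field B from the vector potential A(r)
of a magnetic dipole
\[ \mathbf{A}_{\rm dip}(\mathbf{r}) = a\, \frac{\boldsymbol{\mu} \times \hat{\mathbf{r}}}{r^2} . \tag{111} \]
Since $\mathbf{B} = \nabla \times \mathbf{A}$ and $\mathbf{r} = x_m \hat{\mathbf{x}}_m$ ($= x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$),
\[ B_i = \varepsilon_{ijk}\, \partial_j A_k \tag{112} \]
\[ = \varepsilon_{ijk}\, \partial_j \left[ a\, \varepsilon_{k\ell m}\, \mu_\ell \left( \frac{\hat{\mathbf{r}}}{r^2} \right)_m \right] \tag{113} \]
\[ = a \left( \delta_{i\ell}\, \delta_{jm} - \delta_{im}\, \delta_{j\ell} \right) \mu_\ell\, \partial_j \frac{x_m}{r^3} \tag{114} \]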
(115)
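The term $\nabla \cdot \frac{\hat{\mathbf{r}}}{r^2}$ is very peculiar (as we see, for example, in PHGN 361), since it is related to the
charge density of a point charge, a three-dimensional Dirac delta function: $\nabla \cdot \frac{\hat{\mathbf{r}}}{r^2} = 4\pi\, \delta(\mathbf{r})$. In
the second term, it is convenient to write $r^3 = \left(r^2\right)^{3/2}$ and $r^2 = x_\ell x_\ell$, so
\[ B_i = 4\pi a\, \mu_i\, \delta(\mathbf{r}) - a\, \mu_j\, \partial_j \frac{x_i}{r^3} \tag{118} \]
\[ = 4\pi a\, \mu_i\, \delta(\mathbf{r}) - a\, \mu_j \left[ \frac{\delta_{ij}}{r^3} - \frac{3}{2}\, \frac{x_i \cdot 2 x_\ell\, \delta_{\ell j}}{(x_\ell x_\ell)^{5/2}} \right] , \tag{119} \]
where, as usual, $\partial_j x_i \equiv \frac{\partial x_i}{\partial x_j} = \delta_{ij}$. In order to simplify, we need to be careful using the Einstein
summation convention. We know that any index which occurs twice is summed over, but what
if the index occurs more than twice, as happens in the second term in square brackets? First, we
simplify terms linear in $x_\ell$, then we replace $x_\ell x_\ell$ by $r^2$, to find
\[ B_i = 4\pi a\, \mu_i\, \delta(\mathbf{r}) - a\, \frac{\mu_i}{r^3} + 3a\, \mu_j\, \frac{x_i x_j}{r^5} \tag{120} \]
\[ = 4\pi a\, \mu_i\, \delta(\mathbf{r}) + a\, \frac{\left[ 3 (\boldsymbol{\mu} \cdot \hat{\mathbf{r}})\, (\hat{\mathbf{r}})_i - \mu_i \right]}{r^3} , \tag{121} \]
where we have chosen to write, e.g., $\frac{x_i}{r} = (\hat{\mathbf{r}})_i$. So,
\[ \mathbf{B}_{\rm dip}(\mathbf{r}) = 4\pi a\, \delta(\mathbf{r})\, \boldsymbol{\mu} + a\, \frac{\left[ 3 (\boldsymbol{\mu} \cdot \hat{\mathbf{r}})\, \hat{\mathbf{r}} - \boldsymbol{\mu} \right]}{r^3} \tag{122} \]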
The ‘strange’ term proportional to the δ function is physically very important in, for example, the
theory of the ‘hyperfine interaction’ in atoms.
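\[ \sigma_i \sigma_i = \mathbb{1} \quad \text{(NO ESC)} \]
\[ \left[ \sigma_i, \sigma_j \right] = 2i \sum_{k=1}^{3} \varepsilon_{ijk}\, \sigma_k = 2i\, \varepsilon_{ijk}\, \sigma_k \]
\[ \left\{ \sigma_i, \sigma_j \right\} = 0 \quad (i \neq j) \tag{124} \]
where $\mathbb{1}$ is a unit 2 × 2 matrix, the phrase NO ESC means that we are not using the Einstein summation
convention, so that $\sigma_i \sigma_i = \sigma_x \cdot \sigma_x$ for i = 1, for example. Thus the spin-$\frac{1}{2}$ operators obey the usual
commutation relations for an angular momentum, the square of a Pauli spin matrix is a unit matrix, and
any two different Pauli spin matrices anticommute. A particularly concise way to encapsulate all of these
properties is the identity
\[ \sigma_i \sigma_j = i\, \varepsilon_{ijk}\, \sigma_k + \delta_{ij} . \tag{125} \]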
It is an easy exercise to show that the three results in Eq. (124) follow from Eq. (125).
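Eq. (125) can also be confirmed by brute force over all nine index pairs. The sketch below assumes the standard representation of the Pauli matrices (the notes do not write them out in this section) and uses plain nested tuples rather than a matrix library; the helper names are illustrative.

```python
# Standard Pauli matrices sigma_x, sigma_y, sigma_z (assumed representation).
SIGMA = [
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
]
IDENTITY = ((1, 0), (0, 1))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def right_hand_side(i, j):
    """i epsilon_ijk sigma_k + delta_ij times the unit matrix, summed over k."""
    out = [[(1 if i == j else 0) * IDENTITY[r][c] for c in range(2)] for r in range(2)]
    for k in range(3):
        eps = levi_civita(i, j, k)
        for r in range(2):
            for c in range(2):
                out[r][c] += 1j * eps * SIGMA[k][r][c]
    return tuple(tuple(row) for row in out)

def identity_holds():
    """True if sigma_i sigma_j equals the right-hand side of Eq. (125) for all i, j."""
    return all(
        matmul(SIGMA[i], SIGMA[j]) == right_hand_side(i, j)
        for i in range(3)
        for j in range(3)
    )

if __name__ == "__main__":
    print(identity_holds())
```

All entries are 0, ±1, or ±i, so the comparison is exact rather than approximate.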
where f(x) is well-behaved, generally decay in value as k increases. (I(k) here happens to be the Fourier
transform of f(x), but these remarks hold if g(x, k) is any oscillatory function whose argument is an
increasing function of k, e.g., $g = \sin(k^2 x)$.) Physicists just say ‘the oscillations cause f to average away to
zero as k increases’. A particular example, using the function $f(x) = x^2 \exp\left[-a(x - x_0)^2\right]$ with a = 1 and
$x_0 = 2$, is shown in Fig. 6. (Much more sophisticated examples occur in the ‘method of steepest descent’
[Figure 6 plot: Re I versus k, where $I = \int_{-\infty}^{+\infty} dx\, f(x) \exp(ikx)$ and $f(x) = x^2 \exp\left[-a(x - x_0)^2\right]$ with a = 1, $x_0$ = 2.]
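The decay of the oscillatory integral with increasing k can be reproduced with a few lines of quadrature. This is a rough sketch for the example function above with a = 1 and x0 = 2; the integration window and the midpoint rule are illustrative choices, justified only by the rapid Gaussian decay of f.

```python
import cmath
import math

def f(x, a=1.0, x0=2.0):
    """Example function f(x) = x^2 exp[-a (x - x0)^2]."""
    return x * x * math.exp(-a * (x - x0) ** 2)

def fourier_integral(k, lo=-8.0, hi=12.0, n=4000):
    """Midpoint-rule estimate of the integral of f(x) exp(i k x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0j
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * cmath.exp(1j * k * x)
    return total * h

if __name__ == "__main__":
    for k in (0.0, 1.0, 2.0, 3.0, 5.0):
        value = fourier_integral(k)
        print(k, value.real, abs(value))
```

At k = 0 the integral reduces to $\sqrt{\pi}\,(x_0^2 + \tfrac{1}{2}) = 4.5\sqrt{\pi}$, and the magnitude falls off rapidly from there.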
obeys
\[ dN_{\rm modes} = \frac{d\mathbf{k}}{\text{volume per allowed state in } \mathbf{k} \text{ space}} \cdot (\text{degeneracy of modes for given } \mathbf{k}) \]
For instance, for (transverse) electromagnetic waves (described ultimately by spin-1 photons of zero
mass) there are two distinct polarizations (see footnote 9) for each propagation direction $\hat{\mathbf{n}} = \mathbf{k}/k$, giving rise to a
prefactor of 2 when counting photon states. On the other hand, finite-mass electrons have spin $\frac{1}{2}$, so
acquire two ($2 \cdot \frac{1}{2} + 1$) distinct electron states (spin up, spin down) for each spatial wavefunction.
[Footnote 9: Not three, as would be expected from the 2s + 1 counting for spin s, since the photon has zero mass.]
As we may see in some class, the definition of the energy density of states differs slightly in form
depending on the energy distribution of levels:
i. Discrete energy levels:
\[ g(E) = \sum_j g_j\, \delta\left( E - E_j \right) , \tag{129} \]
where gj (a common notation, not to be confused with the DOS itself) is the degeneracy (the number
of physically distinct states with the same energy10 ) of the level with energy Ej .
ii. Continuous energy levels: There are two common and equivalent forms:
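\[ g(E) = \frac{V}{(2\pi)^3} \int d\mathbf{k}\, \delta\left( E - \varepsilon(\mathbf{k}) \right) \tag{130} \]
\[ = \frac{V}{(2\pi)^3} \oint_S \frac{dS}{|\nabla_{\mathbf{k}}\, \varepsilon(\mathbf{k})|} . \tag{131} \]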
In Eq. (130) the integration is either over all of k-space (for a free particle) or over the first Brillouin
zone (for a Bloch electron). In Eq. (131) the surface integral is over the surface ε(k) = E. Note that
g(E)/V is what is physically relevant—it is a nice intensive quantity characteristic of the system,
not diverging as the system volume grows large.
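Example: Free particle in three dimensions, $\varepsilon(\mathbf{k}) = \frac{\hbar^2 k^2}{2m}$.
i. Dirac δ version
\[ g(E) = \frac{V}{(2\pi)^3} \int d\mathbf{k}\, \delta\left( E - \frac{\hbar^2 k^2}{2m} \right) \tag{132} \]
\[ = \frac{V}{(2\pi)^3}\, \frac{2m}{\hbar^2} \int dk\, 4\pi k^2\, \delta\left( k^2 - \frac{2mE}{\hbar^2} \right) \tag{133} \]
\[ = \frac{4\pi V}{(2\pi)^3}\, \frac{2m}{\hbar^2} \cdot \frac{1}{2} \int d(k^2)\, \left(k^2\right)^{1/2} \delta\left( k^2 - \frac{2mE}{\hbar^2} \right) \tag{134} \]
\[ = V\, \frac{m}{\hbar^2}\, \frac{4\pi}{(2\pi)^3} \sqrt{\frac{2mE}{\hbar^2}} \tag{135} \]
where in the second line we have used spherical polar coordinates in k space and the facts that
\[ \delta(ax) = \frac{1}{|a|}\, \delta(x) \]
\[ \delta(-x) = \delta(+x) \]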
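Thus
\[ g(E) = \frac{V}{(2\pi)^3} \oint_S \frac{dS}{|\nabla_{\mathbf{k}}\, \varepsilon(\mathbf{k})|} \tag{139} \]
\[ = \frac{V}{(2\pi)^3}\, \frac{m}{\hbar^2} \oint_S \frac{dS}{k} \tag{140} \]
\[ = \frac{V}{(2\pi)^3}\, \frac{m}{\hbar^2}\, \frac{1}{k} \oint_S dS \tag{141} \]
\[ = \frac{V}{(2\pi)^3}\, \frac{m}{\hbar^2}\, \frac{1}{k} \cdot 4\pi k^2 \tag{142} \]
\[ = V\, \frac{m}{\hbar^2}\, \frac{4\pi}{(2\pi)^3} \sqrt{\frac{2mE}{\hbar^2}} , \tag{143} \]
where we have observed that the magnitude k does not vary across the surface S, whose area is
$4\pi k^2$.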
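Integrating the free-particle result from 0 to E gives the total number of states below E, $N(E) = V k^3/(6\pi^2)$ with $k = \sqrt{2mE/\hbar^2}$. That prediction can be compared with a direct count of allowed wavevectors. The sketch below rests on assumptions not spelled out in this section: periodic boundary conditions in a cube of side L (so the allowed wavevectors are $2\pi \mathbf{n}/L$ with integer n, one state per volume $(2\pi)^3/V$ of k space) and a degeneracy of 1. In terms of the dimensionless radius $kL/2\pi$ the continuum prediction is just the volume of a sphere.

```python
import math

def count_states(n_radius):
    """Count integer triples (nx, ny, nz) with nx^2 + ny^2 + nz^2 <= n_radius^2."""
    r = int(n_radius)
    limit = n_radius * n_radius
    count = 0
    for nx in range(-r, r + 1):
        for ny in range(-r, r + 1):
            for nz in range(-r, r + 1):
                if nx * nx + ny * ny + nz * nz <= limit:
                    count += 1
    return count

def continuum_count(n_radius):
    """V k^3 / (6 pi^2) rewritten with n_radius = k L / (2 pi): (4/3) pi n_radius^3."""
    return 4.0 * math.pi * n_radius ** 3 / 3.0

if __name__ == "__main__":
    for n_radius in (5, 10, 15):
        print(n_radius, count_states(n_radius), continuum_count(n_radius))
```

The relative difference shrinks as the sphere grows, which is why the continuum density of states is adequate for macroscopic volumes.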
(c) Wave packets
Most waves are dispersive, i.e., the dispersion relation is not linear and the spatial and time frequencies
are related. These include light moving in a medium with an ω-dependent index of refraction, matter
waves, and mechanical waves (e.g., water waves). In a one-dimensional example, we expect a dispersive
wave to have the mathematical form
\[ \varphi(x, t) = \int_{-\infty}^{+\infty} \frac{dk}{\sqrt{2\pi}}\, A(k)\, e^{i[kx - \omega(k)t]} , \tag{144} \]
that is, for any instant t, the wave is a superposition of distinct wavevectors, as described by the
function A(k). This is precisely what we would expect from Fourier analysis, except that the wave has
a time dependence which we must acknowledge via ω(k) since the wave is dispersive.
If there were only a single Fourier component in A(k), i.e., $A(k) = \sqrt{2\pi}\, \delta(k - k_0)$, we would have a
simple wave $\propto \exp i(k_0 x - \omega t)$ whose phase is
\[ k_0 x - \omega(k_0)\, t = k \left[ x - \frac{\omega(k)}{k}\, t \right]_{k_0} , \tag{145} \]
so that the wave fronts (spatial locations where the phase is fixed) clearly move at the phase velocity
$v_{\rm phase} = \omega(k_0)/k_0$. More generally, for the Fourier transform of the ϕ(x, t) to exist (see footnote 11)
\[ \int_{-\infty}^{+\infty} dx\, |\varphi(x, t)|^2 < \infty \tag{146} \]
\[ \int_{-\infty}^{+\infty} dk\, |A(k)|^2 < \infty . \tag{147} \]
Our integrand
\[ A(k)\, e^{i\theta(k)} = |A(k)|\, e^{i[\phi(k) + \theta(k)]} \tag{149} \]
is thus very oscillatory and the integral will be completely dominated by k regions where
\[ |A(k)| = \text{maximum}, \qquad \theta(k) + \phi(k) \approx \text{constant}. \]
If the ‘power spectrum’ $|A(k)|^2$ is strongly peaked (see footnote 12) about a particular wavevector $\bar{k}$, we may Taylor
expand
\[ \omega(k) \approx \omega(\bar{k}) + \left. \frac{d\omega}{dk} \right|_{\bar{k}} \cdot (k - \bar{k}) + \frac{1}{2} \left. \frac{d^2\omega}{dk^2} \right|_{\bar{k}} \cdot (k - \bar{k})^2 + \ldots \tag{150} \]
[Footnote 11: In quantum mechanics this is the route to making a normalizable wavefunction.]
[Footnote 12: Note that the only purpose for this observation is to identify the main spatial frequency $\bar{k}$ present in the wavepacket.]
We define the group velocity of the wave as $v_g(\bar{k}) \equiv d\omega(\bar{k})/dk$; our integral thus becomes, neglecting
the second derivative,
\[ \varphi(x, t) \approx \int_{-\infty}^{+\infty} \frac{dk}{\sqrt{2\pi}}\, A(k)\, e^{ikx}\, e^{-it[\omega(\bar{k}) + v_g(\bar{k})(k - \bar{k})]} \tag{151} \]
\[ = e^{-it(\omega(\bar{k}) - v_g \bar{k})} \int_{-\infty}^{+\infty} \frac{dk}{\sqrt{2\pi}}\, A(k)\, e^{ik(x - v_g(\bar{k})\, t)} . \tag{152} \]
[We recognize the prefactor’s phase as $\bar{k}\, t\, (v_g - v_{\rm phase})$.] Thus, remarkably, apart from a phase factor,
we can find the wave packet at time t from that at time t = 0 via the simple substitution $x \to x - v_g(\bar{k})\, t$.
This means that the wave packet preserves its shape over the times permitted by the approximations
above; for this reason wave packets are used in quantum mechanics as the mathematical object most
nearly like a classical particle: it has a well-defined momentum $\hbar\bar{k}$ and a well-defined position,
consistent with the momentum-position uncertainty relations. We also conclude that in the analysis of a
‘localized’ wave train or wave packet, two distinct velocities are important:
\[ v_{\rm phase} = \frac{\omega}{k} \tag{153} \]
\[ v_{\rm group} = \frac{d\omega}{dk} , \tag{154} \]
each evaluated at $\bar{k}$. If we include corrections to the description above (e.g., we retain the quadratic
term $(dv_g/dk)_{\bar{k}}$), (i) the wave packet will spread out for long times and (ii) its shape will change.
For light waves in empty space ($v_g = v_{\rm phase} = c$) wave packets preserve their shape for all times.
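The distinction between the two velocities shows up clearly in a direct numerical superposition. The sketch below is an illustration under assumptions that are not part of the notes: a free-matter-wave dispersion $\omega = k^2/2$ (units with ħ = m = 1), a Gaussian A(k) centered at $\bar{k} = 5$, and a simple Riemann sum over k. With these choices $v_{\rm group} = 5$ and $v_{\rm phase} = 2.5$, and the peak of |ϕ| should travel at the group velocity.

```python
import cmath
import math

K_BAR = 5.0      # central wavevector (assumed value)
SIGMA = 0.5      # width of A(k) in k space (assumed value)

def omega(k):
    """Assumed dispersion relation: free matter wave with hbar = m = 1."""
    return 0.5 * k * k

def amplitude(k):
    """Gaussian spectrum A(k) peaked at K_BAR."""
    return math.exp(-((k - K_BAR) ** 2) / (2.0 * SIGMA ** 2))

def packet(x, t, n_k=201):
    """Riemann-sum approximation to Eq. (144) over K_BAR +/- 5 SIGMA."""
    k_lo = K_BAR - 5.0 * SIGMA
    dk = 10.0 * SIGMA / (n_k - 1)
    total = 0j
    for j in range(n_k):
        k = k_lo + j * dk
        total += amplitude(k) * cmath.exp(1j * (k * x - omega(k) * t))
    return total * dk / math.sqrt(2.0 * math.pi)

def peak_position(t, x_lo=-5.0, x_hi=20.0, n_x=501):
    """Grid location of the maximum of |phi(x, t)|."""
    best_x, best_val = x_lo, -1.0
    for i in range(n_x):
        x = x_lo + i * (x_hi - x_lo) / (n_x - 1)
        val = abs(packet(x, t))
        if val > best_val:
            best_val, best_x = val, x
    return best_x

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(t, peak_position(t))
```

Dividing the peak displacement by the elapsed time recovers $d\omega/dk$ at $\bar{k}$ rather than $\omega/k$; the peak height also drops slowly with time, which is the spreading caused by the quadratic term.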
(☆)
α δ β
x0 x
☆
()
Figure 7: How to select the semicircle used to make contour integration well defined when the path of integration
crosses a simple pole slightly displaced from the real axis.
Figure 8: Example of function with simple pole on path of integration. Panel (a): the integrand $f(x) = x^2 e^{-ax}/(x - b)$, plotted for $a = 3$, $b = 2$. Panel (b): plot of the principal value of the integral,
\[
P\!\int_0^\infty dx\, \frac{x^2 e^{-ax}}{x - b} = -b^2 e^{-ab}\, \mathrm{Ei}(ab) + \frac{b}{a} + \frac{1}{a^2} ,
\]
as a function of the parameters $a$ and $b$.
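The closed form shown with Figure 8, $-b^2 e^{-ab}\,\mathrm{Ei}(ab) + b/a + 1/a^2$, can be checked against a direct numerical principal value. The sketch below is one possible approach, not the method used for the figure: on the interval $[0, 2b]$, which is symmetric about the pole, it subtracts the value of the numerator at the pole (this is allowed because the principal value of $\int_0^{2b} dx/(x-b)$ vanishes), and it integrates the remaining regular tail with the midpoint rule. The helper names, the power-series evaluation of Ei, and the cutoff are assumptions made for illustration.

```python
import math


def numerator(x, a):
    """g(x) = x^2 exp(-a x); the full integrand is g(x) / (x - b)."""
    return x * x * math.exp(-a * x)


def principal_value(a, b, n=200000):
    """Midpoint-rule estimate of P-integral of g(x)/(x - b) over (0, infinity)."""
    # Interval [0, 2b]: subtracting g(b) makes the integrand regular at x = b.
    h = 2.0 * b / n
    g_at_pole = numerator(b, a)
    inner = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        if x != b:  # guard; midpoints do not normally land on the pole
            inner += (numerator(x, a) - g_at_pole) / (x - b)
    inner *= h
    # Interval [2b, 2b + 40/a]: no singularity; beyond it exp(-a x) is negligible.
    tail = 40.0 / a
    h_tail = tail / n
    outer = 0.0
    for i in range(n):
        x = 2.0 * b + (i + 0.5) * h_tail
        outer += numerator(x, a) / (x - b)
    outer *= h_tail
    return inner + outer


def exp_integral_ei(x, terms=200):
    """Ei(x) for x > 0 from the series gamma + ln x + sum_k x^k / (k k!)."""
    total = 0.5772156649015329 + math.log(x)
    term = 1.0
    for k in range(1, terms):
        term *= x / k          # term is now x^k / k!
        total += term / k
    return total


def closed_form(a, b):
    return -b * b * math.exp(-a * b) * exp_integral_ei(a * b) + b / a + 1.0 / a ** 2


numeric = principal_value(3.0, 2.0)
analytic = closed_form(3.0, 2.0)
print(numeric, analytic)
```

The two printed numbers should agree closely for the parameters of panel (a). The series for Ei is adequate for moderate values of $ab$ only.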
20. Kramers-Kronig relations
A central result of the theory of functions of a complex variable is Cauchy’s theorem (Cauchy’s integral formula)
\[
f(z) = \frac{1}{2\pi i} \oint_C \frac{f(z')}{z' - z}\, dz' \tag{160}
\]
where $f(z)$ is analytic (has a derivative everywhere) within and on the closed contour $C$ (traversed so as to keep its interior on the left). It is used to prove the residue theorem, but is noteworthy here because of the factor $i$: it means that the real part of $f(z)$ is related to an integral over the imaginary part and vice versa. In PH511 it is often proved that for a complex-valued function of a real, positive argument defined on the interval $(0, \infty)$ [here the dielectric function $\varepsilon(\omega)$ as it depends on the positive light frequency $\omega$]
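\begin{align}
\mathrm{Re}\, \varepsilon(\omega) &= 1 + \frac{2}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{\omega'}{\omega'^2 - \omega^2}\, \mathrm{Im}\, \varepsilon(\omega') \tag{161} \\
\mathrm{Im}\, \varepsilon(\omega) &= -\frac{2\omega}{\pi}\, P\!\int_0^\infty d\omega'\, \frac{1}{\omega'^2 - \omega^2}\, \left[ \mathrm{Re}\, \varepsilon(\omega') - 1 \right] . \tag{162}
\end{align}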
These are the ‘Kramers-Kronig’ relations and P reminds you to use the nefarious principal value. (I’m sorry
to simply cite this result; it’s proved in Jackson and is not difficult, just a little tedious.) They hold generally
for the ‘linear response functions’ describing the response of a causal linear medium to an external driving
force and therefore are incredibly useful in a very broad array of applications.
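As an illustration of Eq. (161), one can take a simple causal model and recover its real part from its imaginary part. The sketch below assumes a single Lorentz oscillator, $\varepsilon(\omega) = 1 + \omega_p^2/(\omega_0^2 - \omega^2 - i\gamma\omega)$, with arbitrary illustrative parameters; the model and all names are assumptions, not taken from the notes. The principal value is handled by subtracting the numerator at $\omega' = \omega$ and adding back the elementary integral $P\!\int_0^W d\omega'/(\omega'^2 - \omega^2) = \ln[(W - \omega)/(W + \omega)]/(2\omega)$.

```python
import math

W0, WP, GAMMA = 1.0, 1.0, 0.5   # assumed resonance, plasma frequency, damping


def eps(w):
    """Assumed Lorentz-oscillator dielectric function (causal response)."""
    return 1.0 + WP ** 2 / (W0 ** 2 - w ** 2 - 1j * GAMMA * w)


def re_eps_from_kk(w, cutoff=50.0, n=200000):
    """Re eps(w) from Im eps via Eq. (161), for 0 < w < cutoff (midpoint rule)."""
    def g(u):
        return u * eps(u).imag

    g_at_w = g(w)
    h = cutoff / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        if u != w:  # guard; the subtracted integrand is regular at u = w
            total += (g(u) - g_at_w) / (u * u - w * w)
    total *= h
    # Add back the subtracted piece using the elementary principal value.
    total += g_at_w * math.log((cutoff - w) / (cutoff + w)) / (2.0 * w)
    return 1.0 + 2.0 / math.pi * total


for w in (0.5, 1.5):
    print(w, re_eps_from_kk(w), eps(w).real)
```

The contribution from frequencies above the cutoff is neglected; for this model it falls off as the inverse cube of the cutoff, so a modest cutoff suffices.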
Note that (i) it is obvious how to extend this procedure to more variables and more constraints: no matter what the dimensionality of the space, there will be as many equations as unknowns, so that a solution will generally exist; and (ii) each additional constraint simply adds another equation (and another multiplier). [The operation counts of this approach, the ‘method of Lagrange multipliers’, and of a direct brute-force solution (finding the extrema, then imposing the constraints) can differ by orders of magnitude.]
Example: Find the volume of the largest box with edges parallel to the axes which can be inscribed inside the ellipsoid
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 , \tag{164}
\]
noting that the volume of such a box is $8xyz$, where $(x, y, z)$ is the corner of the box in the octant $x, y, z > 0$ and the edges have lengths $2x$, $2y$, $2z$. Solution: We have only one constraint: that the box just touch the surface defined by the ellipsoid; this will happen at 8 places, but we need worry about only one (in the octant where $x, y, z > 0$). Assuming $a > 0$, $b > 0$, $c > 0$ and that the volume is positive, requiring $x, y, z$ to lie on the surface of the ellipsoid we find
\begin{align*}
(x, y, z) &= \frac{1}{\sqrt{3}}\, (a, b, c) \\
\lambda &= \frac{4abc}{\sqrt{3}} ,
\end{align*}
so that the volume of the largest inscribed box (for given $a, b, c$) is $\dfrac{8abc}{3\sqrt{3}}$.
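The quoted solution can be sanity-checked numerically. The sketch below assumes the multiplier convention $\nabla V = \lambda \nabla g$ with $V = 8xyz$ and $g = x^2/a^2 + y^2/b^2 + z^2/c^2 - 1$ (a convention consistent with the value of $\lambda$ quoted above); the semi-axes and all names are illustrative. It verifies the stationarity condition at the claimed point and confirms that randomly sampled points on the ellipsoid never give a larger box.

```python
import math
import random

a, b, c = 3.0, 2.0, 1.0   # assumed semi-axes, chosen only for illustration


def box_volume(x, y, z):
    """Volume of the axis-parallel box with corner (x, y, z) in the first octant."""
    return 8.0 * x * y * z


# Claimed stationary point, multiplier, and maximal volume.
x0, y0, z0 = a / math.sqrt(3.0), b / math.sqrt(3.0), c / math.sqrt(3.0)
lam = 4.0 * a * b * c / math.sqrt(3.0)
v_max = 8.0 * a * b * c / (3.0 * math.sqrt(3.0))

# Stationarity: grad V = lam * grad g, component by component.
grad_v = (8.0 * y0 * z0, 8.0 * x0 * z0, 8.0 * x0 * y0)
grad_g = (2.0 * x0 / a ** 2, 2.0 * y0 / b ** 2, 2.0 * z0 / c ** 2)
residual = max(abs(gv - lam * gg) for gv, gg in zip(grad_v, grad_g))

# Brute-force comparison: sample the first-octant part of the ellipsoid surface.
random.seed(0)
best_sampled = 0.0
for _ in range(20000):
    theta = random.uniform(0.0, math.pi / 2.0)
    phi = random.uniform(0.0, math.pi / 2.0)
    px = a * math.sin(theta) * math.cos(phi)
    py = b * math.sin(theta) * math.sin(phi)
    pz = c * math.cos(theta)
    best_sampled = max(best_sampled, box_volume(px, py, pz))

print("stationarity residual:", residual)
print("claimed maximum:", v_max, " best sampled:", best_sampled)
```

The random search approaches the claimed maximum from below, while the multiplier conditions pin it down with three equations plus the constraint, which illustrates the remark above about operation counts.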
\[
\chi = y(x) - z\, x \tag{165}
\]
where $z = dy/dx$. Now $d\chi = \frac{dy}{dx}\, dx - z\, dx - x\, dz$, or using the definition of $z$
\[
d\chi = \frac{dy}{dx}\, dx - \frac{dy}{dx}\, dx - x\, dz = -x\, dz . \tag{166}
\]
We then say that $\chi$ depends only on $z = dy/dx$, so that the ‘natural variable’ of $\chi$ is¹⁴ $z$.
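A concrete instance may help. The sketch below assumes the sample function $y(x) = e^x$ (an illustrative choice, not from the notes), for which $z = e^x$, $x = \ln z$, and $\chi(z) = z - z \ln z$. It checks Eq. (166) in the form $d\chi/dz = -x$ with a central finite difference.

```python
import math


def y(x):
    """Assumed sample function y(x) = exp(x); its slope is z = exp(x)."""
    return math.exp(x)


def x_of_z(z):
    """Invert z = dy/dx = exp(x) to express x in terms of the slope z."""
    return math.log(z)


def chi(z):
    """Legendre transform chi = y(x) - z x, written as a function of z alone."""
    x = x_of_z(z)
    return y(x) - z * x


z0 = 2.0
step = 1e-5
dchi_dz = (chi(z0 + step) - chi(z0 - step)) / (2.0 * step)
print(dchi_dz, -x_of_z(z0))   # Eq. (166) predicts these agree
```

The inversion $x(z)$ is single-valued here because the slope of $e^x$ is monotonic; for a function whose slope is not monotonic the transform would have to be defined branch by branch.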
(i) Thermodynamics: We write the First Law in differential form as dU = T dS − P dV + μ dN, and remark
that the ‘natural variables’ of the (internal energy) U are S, V , and N. On the other hand, if we define
G = U + P V − T S, we find
dG = dU + (P dV + V dP ) − (T dS + S dT ) (167)
= (T dS − P dV + μ dN) + (P dV + V dP ) − (T dS + S dT ) (168)
= −S dT + V dP + μ dN, (169)
so that the natural variables of G are T, P, and N. (This is a double Legendre transform of the original function U, exchanging S for T and V for P.)
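(ii) From Lagrangian to Hamiltonian mechanics: Traditionally one begins with the Lagrangian $L(\{q_i, \dot{q}_i\})$ for which the ‘equation of motion’ reads
\begin{align}
\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_i} \right) &= \frac{\partial L}{\partial q_i} \quad \text{or} \tag{170} \\
\dot{p}_i \equiv \frac{dp_i}{dt} &= F_i \tag{171}
\end{align}
(since generally $L = T - V$ in terms of the kinetic energy $T$ and potential energy $V$; we recognize $F_i = -\partial V/\partial q_i$).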
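Now we seek a way to change variables from the $(\{q_i, \dot{q}_i\}, t)$ to $(\{q_i, p_i\}, t)$, where $p_i$ is the ‘canonical momentum conjugate to the generalized coordinate $q_i$’. If we write
\[
\chi = \sum_i p_i\, \dot{q}_i - L(\{q_i, \dot{q}_i\}, t) \tag{172}
\]
we find (since $L$ depends on time both explicitly and via the $q_i$ and $\dot{q}_i$)
\begin{align}
d\chi &= \sum_i \left( p_i\, d\dot{q}_i + \dot{q}_i\, dp_i \right) - \sum_i \left[ \frac{\partial L}{\partial \dot{q}_i}\, d\dot{q}_i + \frac{\partial L}{\partial q_i}\, dq_i \right] - \frac{\partial L}{\partial t}\, dt \tag{173} \\
&= \sum_i \dot{q}_i\, dp_i + \sum_i d\dot{q}_i \left[ p_i - \frac{\partial L}{\partial \dot{q}_i} \right] - \sum_i \frac{\partial L}{\partial q_i}\, dq_i - \frac{\partial L}{\partial t}\, dt \tag{174}
\end{align}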
14 This is like choosing to describe a function not by its dependence on x but by the family of curves which are tangent (i.e., have specified slope) to the curve y(x).
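But the bracket in the second term vanishes by the definition of the generalized momentum, and the third term can be simplified using the Lagrange equation of motion:
\begin{align}
\frac{\partial L}{\partial \dot{q}_i} &\equiv p_i \tag{175} \\
\frac{\partial L}{\partial q_i} &= \dot{p}_i \tag{176}
\end{align}
so that now (replacing $\chi$ by its official label $H$, the Hamiltonian), we have
\[
dH = \sum_i \left( \dot{q}_i\, dp_i - \dot{p}_i\, dq_i \right) - \frac{\partial L}{\partial t}\, dt , \tag{177}
\]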
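Now change to polar coordinates in the $x$-$y$ plane, replacing the area element $dx\, dy$ by $2\pi r\, dr$ with $r$ now ranging from 0 to $\infty$. Thus
\[
I^2(\alpha) = 2\pi \int_0^\infty dr\, r\, e^{-\alpha r^2} = \frac{\pi}{\alpha} ,
\]
so that
\[
I(\alpha) = \sqrt{\frac{\pi}{\alpha}} .
\]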
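But wait! There’s more! We can actually generate a whole slew of similar useful integrals by treating $I(\alpha)$ as a known function of $\alpha$. For example,
\begin{align*}
\frac{dI(\alpha)}{d\alpha} &= \frac{d}{d\alpha} \int_{-\infty}^{+\infty} dx\, e^{-\alpha x^2} \\
&= \int_{-\infty}^{+\infty} dx\, (-x^2)\, e^{-\alpha x^2} \\
&= -\int_{-\infty}^{+\infty} dx\, x^2\, e^{-\alpha x^2}
\end{align*}
so that
\begin{align*}
\int_{-\infty}^{+\infty} dx\, x^2\, e^{-\alpha x^2} &= -\frac{dI(\alpha)}{d\alpha} \\
&= -\frac{d}{d\alpha} \sqrt{\frac{\pi}{\alpha}} \\
&= -\sqrt{\pi} \left( -\frac{1}{2}\, \alpha^{-3/2} \right) \\
&= \frac{1}{2\alpha} \sqrt{\frac{\pi}{\alpha}} .
\end{align*}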
This trick works for any convergent integral whose integrand depends smoothly on a parameter, provided differentiation and integration may be interchanged. It also works if we have a convergent series which depends on a parameter.
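The results above are easy to confirm numerically. The following sketch uses an arbitrary illustrative value of $\alpha$, a plain midpoint rule, and a finite integration window of half-width $10/\sqrt{\alpha}$ (all of these are assumptions of the example). It compares the integrals with $\sqrt{\pi/\alpha}$ and $\frac{1}{2\alpha}\sqrt{\pi/\alpha}$, and also estimates $-dI/d\alpha$ by a finite difference.

```python
import math


def midpoint_integral(f, lower, upper, n=20000):
    """Plain midpoint rule; ample accuracy for smooth, rapidly decaying integrands."""
    h = (upper - lower) / n
    return h * sum(f(lower + (i + 0.5) * h) for i in range(n))


def gaussian_integral(alpha):
    """I(alpha): integral of exp(-alpha x^2), truncated where the tail is negligible."""
    half_width = 10.0 / math.sqrt(alpha)
    return midpoint_integral(lambda x: math.exp(-alpha * x * x), -half_width, half_width)


def second_moment(alpha):
    """Integral of x^2 exp(-alpha x^2) over the same truncated range."""
    half_width = 10.0 / math.sqrt(alpha)
    return midpoint_integral(lambda x: x * x * math.exp(-alpha * x * x), -half_width, half_width)


alpha = 1.7      # arbitrary illustrative value
step = 1e-4
exact_I = math.sqrt(math.pi / alpha)
exact_moment = math.sqrt(math.pi / alpha) / (2.0 * alpha)
minus_dI = -(gaussian_integral(alpha + step) - gaussian_integral(alpha - step)) / (2.0 * step)

print(gaussian_integral(alpha), exact_I)
print(second_moment(alpha), minus_dI, exact_moment)
```

The direct integral, the finite-difference derivative, and the closed form should agree, which is the content of the differentiation trick.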
However, if
Q̂h(x) = qh(x)
where q is an ordinary number (possibly complex), then
\begin{align}
&= \left( f_0 + f_1\, o + f_2\, o^2 + \dots \right) |o\rangle \notag \\
&= f(o)\, |o\rangle . \tag{182}
\end{align}
Idea: In induction, we begin by demonstrating that a result holds for a ‘starting’ value of a discrete index n, usually n = 0 or n = 1. Next, we assume it to hold for a specified but arbitrary value of n, then prove that it holds when n is replaced by n + 1. Since n was arbitrary to begin with, the identity must hold for arbitrary n. Thus we have a chain of deduction that reaches from n = 0 (or 1) through arbitrary n (since for any n we get the result for n + 1 by the replacement n → n + 1), and thus have a completely general result.
Example: Prove by induction that
\[
\left( x - \frac{d}{dx} \right)^{n} e^{-x^2/2} = (-1)^n\, e^{+x^2/2}\, \frac{d^n}{dx^n}\, e^{-x^2} . \tag{183}
\]
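Before working through the proof, it can be reassuring to confirm Eq. (183) for the first few n by direct computation. The sketch below is an illustrative check, not a substitute for the induction. It writes the left side as $p_n(x)\, e^{-x^2/2}$, so that applying $(x - d/dx)$ gives $p_{n+1} = 2x\, p_n - p_n'$, and writes $d^n e^{-x^2}/dx^n = q_n(x)\, e^{-x^2}$, so that $q_{n+1} = q_n' - 2x\, q_n$. Eq. (183) then reduces to the polynomial statement $p_n = (-1)^n q_n$. Polynomials are stored as integer coefficient lists, lowest power first; all helper names are assumptions of this example.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...]."""
    return [i * coeffs[i] for i in range(1, len(coeffs))] or [0]


def times_x(coeffs):
    """Multiply a polynomial by x."""
    return [0] + coeffs


def combine(p, q, scale_p, scale_q):
    """Return scale_p * p + scale_q * q, with trailing zero coefficients trimmed."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    result = [scale_p * u + scale_q * v for u, v in zip(p, q)]
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return result


def lhs_polynomial(n):
    """p_n with (x - d/dx)^n exp(-x^2/2) = p_n(x) exp(-x^2/2)."""
    p = [1]
    for _ in range(n):
        p = combine(times_x(p), derivative(p), 2, -1)    # p -> 2 x p - p'
    return p


def rhs_polynomial(n):
    """(-1)^n q_n with d^n/dx^n exp(-x^2) = q_n(x) exp(-x^2)."""
    q = [1]
    for _ in range(n):
        q = combine(derivative(q), times_x(q), 1, -2)    # q -> q' - 2 x q
    return [(-1) ** n * coefficient for coefficient in q]


for n in range(6):
    print(n, lhs_polynomial(n), lhs_polynomial(n) == rhs_polynomial(n))
```

Because the coefficients are integers, the comparison is exact. The polynomials generated this way are the (physicists’) Hermite polynomials.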
Proof
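First, does it hold for $n = 0$? Yes, since $\left( x - \frac{d}{dx} \right)^0 \equiv 1$, $(-1)^0 = 1$, and the zeroth derivative of a function is the function itself. Now on to arbitrary $n$. Replace $n$ by $n + 1$ in the equation above, but isolate the part which looks most like Eq. (183):
\[
\left( x - \frac{d}{dx} \right)^{n+1} e^{-x^2/2} = \left( x - \frac{d}{dx} \right) \left[ \left( x - \frac{d}{dx} \right)^{n} e^{-x^2/2} \right] .
\]
The quantity in [] on the right side is by hypothesis equal to the right hand side of Eq. (183), so
\[
\left( x - \frac{d}{dx} \right) \left[ \left( x - \frac{d}{dx} \right)^{n} e^{-x^2/2} \right] = \left( x - \frac{d}{dx} \right) \left[ (-1)^n\, e^{+x^2/2}\, \frac{d^n}{dx^n}\, e^{-x^2} \right] .
\]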