
Contents

1 Differential Calculus
  1.1 Ordinary and Partial Derivatives
  1.2 Directional Derivatives and the Gradient
    1.2.1 Directional Derivatives
    1.2.2 Meaning of the Gradient
    1.2.3 The Gradient in Three Dimensions
    1.2.4 Physical Applications of the Gradient
  1.3 Summary

2 Single integrals
  2.1 Fundamental Theorem of Calculus
  2.2 Variable Transformations in Integrals
  2.3 Proof of the Fundamental Theorem
  2.4 Summary

3 Double Integrals
  3.1 Integrals in Cartesian Coordinates
  3.2 Circular Polar Coordinates
    3.2.1 Definitions of Coordinate Systems
    3.2.2 Transformation of the Integration Element
    3.2.3 Double Integrals in Polar Coordinates
  3.3 Summary

4 Triple Integrals
  4.1 Integrals in Cartesian Coordinates
  4.2 Cylindrical Polar Coordinates
    4.2.1 Definition of the Coordinate System
    4.2.2 The Integration Element
    4.2.3 Triple Integrals in Cylindrical Polar Coordinates
  4.3 Spherical Polar Coordinates
    4.3.1 Definition of the Coordinate System
    4.3.2 The Integration Element
    4.3.3 Triple Integrals in Spherical Polar Coordinates
  4.4 Surface Integrals
  4.5 Summary

5 Line Integrals
  5.1 Work in Classical Mechanics
  5.2 Line Integrals over Closed Curves
  5.3 Exact and Inexact Differentials
  5.4 Arc Length
  5.5 Summary

6 The Divergence and the Divergence Theorem
  6.1 Definition of the Divergence
  6.2 The Divergence Theorem
  6.3 Gauss' Law
  6.4 Summary

7 The Curl and Stokes' Theorem
  7.1 The Curl in Two Dimensions
  7.2 Green's Theorem
  7.3 Stokes' Theorem
    7.3.1 Line Integrals in Three Dimensions
    7.3.2 The Curl Vector
    7.3.3 The Curl of Three-Dimensional Vector Fields
  7.4 Summary

Chapter 1 Differential Calculus


There are two basic operations in calculus: differentiation and integration. These are used throughout physics, both as direct operations on physical quantities and for expressing equations that govern the behavior of physical systems. The first chapter of these notes discusses differentiation. Ordinary derivatives are discussed first: these apply to functions of a single variable. The definition is then extended to functions of two or, in general, n variables. Even in these general cases, however, partial differentiation is carried out only along the coordinate axis of the given variable. There is no reason why we could not carry out differentiation in an arbitrary direction. Indeed, when we introduce directional derivatives this is just what we do: we define the derivative in a direction specified by a unit vector u.

1.1 Ordinary and Partial Derivatives


The derivative of a function f of a single independent variable x is defined by the following limit:

$$\frac{df}{dx} \equiv \lim_{\Delta x \to 0}\left[\frac{f(x+\Delta x) - f(x)}{\Delta x}\right]. \tag{1.1}$$

As the construction in Fig. 1.1 demonstrates, the derivative is the slope of the tangent to f at the point x. The derivative of f is often written as $f'(x)$.

Example. Consider the function $f(x) = x^2$. The derivative of this function with respect to x can be calculated from first principles by using the definition in Eq. (1.1)


Fig. 1.1: The construction of the derivative in Eq. (1.1). The left panel shows a line through the point (x, f) with slope $\Delta f/\Delta x$. The right panel shows the effect of taking the limit $\Delta x \to 0$, which results in a line through (x, f) that is tangent to f at x.

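as follows:

$$\begin{aligned}
\frac{d(x^2)}{dx} &= \lim_{\Delta x \to 0}\left[\frac{(x+\Delta x)^2 - x^2}{\Delta x}\right] \\
&= \lim_{\Delta x \to 0}\left[\frac{2x\,\Delta x + (\Delta x)^2}{\Delta x}\right] \\
&= \lim_{\Delta x \to 0}\,(2x + \Delta x) \\
&= 2x .
\end{aligned} \tag{1.2}$$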
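The limit in Eq. (1.2) can also be watched numerically. The following is a minimal sketch, not part of the original notes: the function names `difference_quotient` and `square` are chosen here for illustration. It evaluates the quotient inside the limit of Eq. (1.1) for shrinking step sizes.

```python
def difference_quotient(f, x, dx):
    """Return [f(x + dx) - f(x)] / dx, the quantity inside the limit of Eq. (1.1)."""
    return (f(x + dx) - f(x)) / dx


def square(x):
    """The example function f(x) = x^2."""
    return x * x


if __name__ == "__main__":
    x = 3.0
    for dx in (1.0, 0.1, 0.01, 0.001):
        # For f(x) = x^2 the quotient is 2x + dx, so it approaches 2x as dx shrinks.
        print(dx, difference_quotient(square, x, dx))
```

For $f(x) = x^2$ the quotient equals $2x + \Delta x$, so the printed values should approach 2x from above as the step decreases, in line with the third line of Eq. (1.2).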

The basic definition in Eq. (1.1) can be used to derive the following well-known formulae for sums, products, and quotients of functions, and the chain rule for composite functions (i.e. functions of functions):

$$\frac{d}{dx}(a f + b g) = a\frac{df}{dx} + b\frac{dg}{dx}, \tag{1.3}$$

$$\frac{d(fg)}{dx} = \frac{df}{dx}\,g + f\,\frac{dg}{dx}, \tag{1.4}$$

$$\frac{d}{dx}\left(\frac{f}{g}\right) = \frac{1}{g^2}\left(g\frac{df}{dx} - f\frac{dg}{dx}\right), \tag{1.5}$$


Fig. 1.2: (a) A section of a surface f(x, y). (b) The partial derivative of f with respect to x, and (c) the partial derivative of f with respect to y at the same point. The constructions in (b) and (c) show that the two partial derivatives of f can be obtained by slicing the surface parallel to the appropriate axis.

$$\frac{d f(g(x))}{dx} = \frac{df}{dg}\frac{dg}{dx}, \tag{1.6}$$

in which a and b are any constants and f and g are any differentiable functions. Specific derivatives that will be used throughout this course are

$$\frac{d x^n}{dx} = n x^{n-1}, \tag{1.7}$$

$$\frac{d \sin x}{dx} = \cos x, \tag{1.8}$$

$$\frac{d \cos x}{dx} = -\sin x, \tag{1.9}$$

$$\frac{d\, e^{f(x)}}{dx} = \frac{df}{dx}\, e^{f(x)}, \tag{1.10}$$

$$\frac{d \ln x}{dx} = \frac{1}{x}, \tag{1.11}$$

where n is an integer. All of these results will be derived from the definition in Eq. (1.1) in Classwork 1 and Problem Set 1.

The derivative can be extended to functions of more than one variable. For a function f of two independent variables x and y, the partial derivative of f with respect to x is defined as (Fig. 1.2)

$$\frac{\partial f}{\partial x} \equiv \lim_{\Delta x \to 0}\left[\frac{f(x+\Delta x, y) - f(x, y)}{\Delta x}\right], \tag{1.12}$$


with an analogous expression for the partial derivative $\partial f/\partial y$:

$$\frac{\partial f}{\partial y} \equiv \lim_{\Delta y \to 0}\left[\frac{f(x, y+\Delta y) - f(x, y)}{\Delta y}\right]. \tag{1.13}$$

As these definitions indicate, when taking the partial derivative with respect to a particular independent variable, the other independent variables are held fixed. Thus, the usual rules of differentiation apply, with these other variables treated effectively as constants. Partial derivatives are often abbreviated with a subscript to indicate the independent variable used for the derivative. In this notation, the two derivatives in Eqs. (1.12) and (1.13) are written as $f_x$ and $f_y$, respectively. Similarly, the three second-order derivatives are written as $f_{xx}$, $f_{xy}$, and $f_{yy}$. The generalization of partial derivatives to any number of independent variables is straightforward.

Example. Consider the function $f(x, y) = x \sin y$. The two first partial derivatives are

$$\frac{\partial f}{\partial x} = \sin y, \tag{1.14}$$

$$\frac{\partial f}{\partial y} = x \cos y. \tag{1.15}$$

The derivative can also be applied to vectors. Consider the quantity

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}, \tag{1.16}$$

where i, j, and k are the usual unit vectors along the x, y, and z directions, respectively. This may be imagined as the position of a particle in space at time t. The derivative of r with respect to t, which is the instantaneous velocity of the particle, is given by

$$\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} + \frac{dz}{dt}\,\mathbf{k}. \tag{1.17}$$

This vector is tangent to the trajectory traced out by r(t), with a magnitude that is equal to the speed of the particle.
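Returning to the partial-derivative example $f(x, y) = x \sin y$, the statement that the other variable is held fixed translates directly into code. The sketch below is illustrative only: the helper names `partial_x` and `partial_y` are assumptions, and a symmetric (central) difference is used instead of the one-sided quotient of Eqs. (1.12) and (1.13) because it converges faster. Both have the same limit.

```python
import math


def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative with respect to x; y is held fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative with respect to y; x is held fixed."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def f(x, y):
    """The example function f(x, y) = x sin y."""
    return x * math.sin(y)


if __name__ == "__main__":
    x0, y0 = 2.0, 0.5
    print(partial_x(f, x0, y0), math.sin(y0))       # compare with Eq. (1.14)
    print(partial_y(f, x0, y0), x0 * math.cos(y0))  # compare with Eq. (1.15)
```

Each printed pair should agree to several decimal places, which is a quick consistency check on Eqs. (1.14) and (1.15).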

1.2 Directional Derivatives and the Gradient


A function f(x) of a single independent variable can be characterized at any point $x_0$ by the slope, in terms of the first derivative $f'(x_0)$, the curvature, in terms of


the second derivative $f''(x_0)$, and so on. These quantities enable a function to be visualized in terms of its steepness and the sharpness of its bends. A function f of two or more independent variables can be similarly characterized in terms of its partial derivatives. But partial derivatives such as $f_x$ and $f_y$ are unnecessarily restrictive in that they represent the rates of change of a function only along the x- and y-axes (Fig. 1.2). In fact, derivatives can be taken along any direction in the space of the independent variables. The computation of such directional derivatives is the basis for introducing a new type of derivative for functions of two or more variables, called the gradient. This quantity is a vector that provides similar information to contour plots, such as isobars on a weather map or a relief map of a mountain range, but in a much more succinct form. In some respects, the gradient is the natural generalization to higher dimensions of the ordinary derivative in that it determines the tangent plane to the surface of a function (for the case of two independent variables), just as the ordinary derivative is the slope of the tangent line to a function. The gradient has a wide variety of applications, ranging from the calculation of the flow of physical quantities such as heat, through the solution of certain types of linear equations (the conjugate gradient method), to image processing, where the gradient is used to extract information about the edges in an image, which is especially important in biological and medical imaging.

1.2.1 Directional Derivatives

We will confine our discussion initially to functions of two independent variables because the various quantities associated with derivatives are easier to visualize than for the case of three independent variables and the results obtained are straightforward to generalize. The task at hand is the calculation of the derivative of a function along a particular direction. This proceeds by specifying the point $\mathbf{r}_0$ and the direction along a unit vector u:

$$\mathbf{r}_0 = x_0\,\mathbf{i} + y_0\,\mathbf{j}, \tag{1.18}$$

$$\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j}, \tag{1.19}$$

where the stipulation that u must be a unit vector, $\mathbf{u}\cdot\mathbf{u} = 1$, means that $a^2 + b^2 = 1$. We now form the vector

$$\mathbf{r} = \mathbf{r}_0 + \mathbf{u}\,s = (x_0 + as)\,\mathbf{i} + (y_0 + bs)\,\mathbf{j}, \tag{1.20}$$

where $0 \le s \le 1$ is a parameter, as shown in Fig. 1.3. In terms of Cartesian components, $\mathbf{r} = (x, y)$,

$$x = x_0 + as, \tag{1.21}$$

$$y = y_0 + bs. \tag{1.22}$$

Thus, for s = 0, $\mathbf{r} = \mathbf{r}_0$, and for s = 1, $\mathbf{r} = \mathbf{r}_0 + \mathbf{u}$ (Fig. 1.3). Now consider a scalar function f(x, y). Along the direction defined by r,

$$f(x, y) = f[x(s), y(s)] = f(x_0 + as,\, y_0 + bs). \tag{1.23}$$

Fig. 1.3: The vectors $\mathbf{r}_0$ (s = 0), $\mathbf{r}_0 + \mathbf{u}$ (s = 1), and those for which 0 < s < 1.

Thus, the derivative of f with respect to s is, according to the chain rule,

$$\frac{df}{ds} = \frac{\partial f}{\partial x}\frac{dx}{ds} + \frac{\partial f}{\partial y}\frac{dy}{ds} = \frac{\partial f}{\partial x}\,a + \frac{\partial f}{\partial y}\,b. \tag{1.24}$$

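We can rewrite this expression in vector notation as

$$\frac{df}{ds} = \underbrace{\left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right)}_{\nabla f}\cdot\underbrace{(a\,\mathbf{i} + b\,\mathbf{j})}_{\mathbf{u}} = \frac{\partial f}{\partial x}\,a + \frac{\partial f}{\partial y}\,b, \tag{1.25}$$

where we have defined the gradient of f, denoted by $\nabla f$, as

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}. \tag{1.26}$$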

The symbol $\nabla$ is referred to as nabla, grad, or del. The gradient is a vector field calculated from the scalar function f. The quantity df/ds is the directional derivative of f in the direction of the unit vector u, and is written as

$$\nabla_{\mathbf{u}} f \equiv \nabla f \cdot \mathbf{u}, \tag{1.27}$$

which is a scalar quantity because it is obtained as the dot product of two vectors. This derivative can be written in a form analogous to the ordinary derivative in Sec. 1.1. Using the notation $f(\mathbf{r}) \equiv f(x, y)$, we have that

$$\nabla_{\mathbf{u}} f\,\Big|_{\mathbf{r}_0} = \lim_{s \to 0}\left[\frac{f(\mathbf{r}_0 + s\,\mathbf{u}) - f(\mathbf{r}_0)}{s}\right]. \tag{1.28}$$
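The equivalence of the dot-product form in Eq. (1.27) and the limit form in Eq. (1.28) can be checked numerically. The following sketch makes several assumptions for illustration: the test function $f = x \sin y$ is borrowed from Sec. 1.1, the gradient is estimated by central differences, and the function names are invented here.

```python
import math


def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy), the components of Eq. (1.26)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def directional_derivative(f, x, y, a, b, h=1e-6):
    """Eq. (1.27): grad f . u, after normalising (a, b) to a unit vector."""
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    fx, fy = gradient(f, x, y, h)
    return fx * a + fy * b


def limit_form(f, x, y, a, b, s=1e-6):
    """Eq. (1.28) evaluated at a small but finite s."""
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return (f(x + s * a, y + s * b) - f(x, y)) / s


def f(x, y):
    """Test function f(x, y) = x sin y (an illustrative choice)."""
    return x * math.sin(y)


if __name__ == "__main__":
    print(directional_derivative(f, 1.0, 0.3, 0.6, 0.8))
    print(limit_form(f, 1.0, 0.3, 0.6, 0.8))
```

Normalising (a, b) inside the functions enforces the unit-vector requirement, so a caller may pass any nonzero direction.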


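Example. Consider the special cases where u = i and u = j. From Eq. (1.27), we obtain

$$\nabla_{\mathbf{i}} f = \left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right)\cdot\mathbf{i} = \frac{\partial f}{\partial x}, \tag{1.29}$$

$$\nabla_{\mathbf{j}} f = \left(\frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}\right)\cdot\mathbf{j} = \frac{\partial f}{\partial y}. \tag{1.30}$$

Thus, the directional derivatives along the directions of the coordinate axes reduce to the familiar partial derivatives. Along any other direction, the directional derivative is a linear combination of these two derivatives, weighted by the components a and b of u. This explains why u must be a unit vector: the role of u is only to provide the direction in the directional derivative and not to affect its magnitude.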

1.2.2 Meaning of the Gradient


Magnitude. To provide an interpretation of the gradient, we first observe that, since u is a unit vector, we can write the directional derivative as

$$\nabla_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = |\nabla f|\,|\mathbf{u}| \cos\theta = |\nabla f| \cos\theta, \tag{1.31}$$

where $\theta$ is the angle between $\nabla f$ and u. The maximum value of the right-hand side is obtained for $\theta = 0$, when $\nabla f$ and u are parallel. We conclude that

$$\nabla_{\mathbf{u}} f \le |\nabla f|, \tag{1.32}$$

i.e. the absolute value of the gradient of a function is the maximum rate of change of that function.

Direction. We can proceed further and obtain an interpretation of the direction of the gradient. Consider the surface f(x, y) = constant and a curve [x(s), y(s)] that lies on this surface, so f[x(s), y(s)] = constant for all s. These curves are called contour lines and they represent constant function values, as shown in Fig. 1.4(a). If a contour line is projected onto the x-y plane, any point of the projected curve is given by

$$\mathbf{r}(s) = x(s)\,\mathbf{i} + y(s)\,\mathbf{j} \tag{1.33}$$

Fig. 1.4: Contour lines of constant function value (a). The position vector r(s) is used to describe a contour line (b).

as shown in Fig. 1.4(b). So for the contour lines we may write

$$f[x(s), y(s)] = \text{const.}, \tag{1.34}$$

which gives, after differentiation of both sides,

$$\frac{d f[x(s), y(s)]}{ds} = 0 \;\Rightarrow\; \frac{\partial f[x(s), y(s)]}{\partial x}\frac{dx(s)}{ds} + \frac{\partial f[x(s), y(s)]}{\partial y}\frac{dy(s)}{ds} = 0, \tag{1.35}$$

as follows from the chain rule. The above equation can be written as a scalar product of two vectors:

$$\nabla f \cdot \mathbf{v} = 0 \tag{1.36}$$

with

$$\mathbf{v}(s) = \frac{dx(s)}{ds}\,\mathbf{i} + \frac{dy(s)}{ds}\,\mathbf{j} = \frac{d\mathbf{r}(s)}{ds}. \tag{1.37}$$

Because v(s) is the derivative of r(s) with respect to the curve parameter s, it is tangent to the contour line at every point. By Eq. (1.36), $\nabla f$ is perpendicular to v(s), i.e. the gradient of a function is perpendicular to lines of constant function value. The properties of the gradient for functions of two independent variables are illustrated by the following example.

Example. Consider the function

$$f(x, y) = 1 - x^2 - y^2 \tag{1.38}$$


for the ranges of variables $-1 \le x \le 1$ and $-1 \le y \le 1$. This surface, which is an inverted paraboloid, is shown in Fig. 1.5(a). The gradient of f is defined as the two-dimensional vector

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}. \tag{1.39}$$

The partial derivatives of f are straightforward to calculate:

$$\frac{\partial f}{\partial x} = -2x, \qquad \frac{\partial f}{\partial y} = -2y, \tag{1.40}$$

so the gradient is

$$\nabla f = -2x\,\mathbf{i} - 2y\,\mathbf{j}. \tag{1.41}$$

According to the discussion in this section, the magnitude of the gradient is the maximum rate of change of f, and the gradient points in the direction normal to surfaces of constant f. In this case, the surfaces of constant f are curves in the x-y plane. For $f = z_0$, these curves are given by

$$x^2 + y^2 = 1 - z_0, \tag{1.42}$$

Fig. 1.5: The function $z = 1 - x^2 - y^2$ for $-1 \le x \le 1$ and $-1 \le y \le 1$ (a). The contours of constant f in Eq. (1.38) and the gradient field in Eq. (1.41) (b).

which are circles of radius $\sqrt{1 - z_0}$ centered at the origin. As $z_0$ increases from 0 to 1, the radii of the circles decrease from 1 to 0, i.e. the height of f above the x-y plane increases toward the origin, which is the maximum of f, as shown in Fig. 1.5(a). These contours are shown in Fig. 1.5(b), together with the gradient calculated in Eq. (1.41). The gradient is seen to be a radial vector field that points toward the origin, i.e. along the direction of the maximum rate of change of f. Also evident is that the gradient vectors are normal to the circles of constant f.


Having computed the gradient of f, we can now determine its directional derivative. For any unit vector $\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j}$, where $a^2 + b^2 = 1$, we have from Eq. (1.27),

$$\nabla_{\mathbf{u}} f = (-2x\,\mathbf{i} - 2y\,\mathbf{j})\cdot(a\,\mathbf{i} + b\,\mathbf{j}) = -2ax - 2by. \tag{1.43}$$

The calculation of this derivative requires the specification of a point (x, y) and a direction (a, b). For example, at the point $(\tfrac12, \tfrac12)$,

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac12,\frac12)} = -a - b. \tag{1.44}$$

The maximum of the directional derivative is obtained by using the constraint $a^2 + b^2 = 1$ to write $b = \pm\sqrt{1 - a^2}$ and calculating the extrema of the function

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac12,\frac12)} = -a \mp \sqrt{1 - a^2} \equiv f(a), \tag{1.45}$$

which gives

$$\frac{d f(a)}{da} = -1 \pm \frac{a}{\sqrt{1 - a^2}} = 0, \tag{1.46}$$

so that $a = \pm\sqrt{2}/2$. By choosing the negative sign we have $a = -\sqrt{2}/2$ and $b = -\sqrt{2}/2$, which gives the maximum of the directional derivative,

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac12,\frac12)} = \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} = \sqrt{2}, \tag{1.47}$$

which is equal to the magnitude of the gradient. The minimum value is obtained for $a = \sqrt{2}/2$ and $b = \sqrt{2}/2$:

$$\nabla_{\mathbf{u}} f\,\Big|_{(\frac12,\frac12)} = -\sqrt{2}, \tag{1.48}$$

whose absolute value is also equal to the magnitude of the gradient. The directional derivative vanishes for $a = \sqrt{2}/2$ and $b = -\sqrt{2}/2$, and for $a = -\sqrt{2}/2$ and $b = \sqrt{2}/2$, which are the tangential directions to the contours of constant f.
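The extremal directions found in this example can be confirmed by brute force. The sketch below is an illustration rather than part of the notes: it parametrizes the unit vector as $\mathbf{u} = (\cos\varphi, \sin\varphi)$, which is an assumed but standard way to satisfy $a^2 + b^2 = 1$ automatically, and scans the directional derivative of $f = 1 - x^2 - y^2$ over a grid of angles.

```python
import math


def directional_derivative(x, y, phi):
    """Directional derivative of f = 1 - x^2 - y^2 along u = (cos phi, sin phi): -2ax - 2by."""
    a, b = math.cos(phi), math.sin(phi)
    return -2 * a * x - 2 * b * y


def extreme_directions(x, y, steps=3600):
    """Scan a uniform grid of angles; return ((phi_max, value_max), (phi_min, value_min))."""
    angles = [2 * math.pi * k / steps for k in range(steps)]
    values = [directional_derivative(x, y, phi) for phi in angles]
    k_max = max(range(steps), key=values.__getitem__)
    k_min = min(range(steps), key=values.__getitem__)
    return (angles[k_max], values[k_max]), (angles[k_min], values[k_min])


if __name__ == "__main__":
    (phi_max, v_max), (phi_min, v_min) = extreme_directions(0.5, 0.5)
    print(math.degrees(phi_max), v_max)  # direction of steepest increase
    print(math.degrees(phi_min), v_min)  # direction of steepest decrease
```

At the point (1/2, 1/2) the maximum should occur near 225 degrees, i.e. along (-1, -1) toward the origin, with a value close to the square root of 2, and the minimum near 45 degrees with the opposite sign.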

1.2.3 The Gradient in Three Dimensions


All of the discussion in Secs. 1.2.1 and 1.2.2 can be generalized to functions of three independent variables with only minor modifications necessitated by the added dimension. We only quote the main results here and leave the derivations as an exercise. The gradient of a function f(x, y, z) of three independent variables is

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}. \tag{1.49}$$

The directional derivative at a point $(x_0, y_0, z_0)$ in the direction of the unit vector

$$\mathbf{u} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}, \tag{1.50}$$

where $a^2 + b^2 + c^2 = 1$, is

$$\nabla_{\mathbf{u}} f = \nabla f \cdot \mathbf{u}. \tag{1.51}$$

Fig. 1.6: The surface $x^2 + y^2 + z^2 = \text{constant}$ shown together with the gradient at several points.

The properties of the gradient are the same as for functions of two independent variables, namely, that the magnitude of the gradient of f at a point is the maximum rate of change of f at that point, and the gradient points in the direction of the maximum rate of change of f, normal to surfaces of constant f.

Example. Consider the function

$$f(x, y, z) = x^2 + y^2 + z^2. \tag{1.52}$$

The surfaces of constant f are concentric spheres, as shown in Fig. 1.6. The partial derivatives of f are

$$\frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 2y, \qquad \frac{\partial f}{\partial z} = 2z, \tag{1.53}$$

so the gradient of f is

$$\nabla f = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k}. \tag{1.54}$$

Several of these vectors are plotted in Fig. 1.6. These vectors are normal to the spherical surface and point away from the origin, because f increases with increasing distance from the origin. The directional derivative can be calculated, for example at (1, 0, 0), from Eqs. (1.54) and (1.50) as

$$\nabla_{\mathbf{u}} f\,\Big|_{(1,0,0)} = 2a. \tag{1.55}$$

This derivative has a maximum value when a = 1, i.e. for u along the x-axis (as shown in Fig. 1.6), and vanishes if a = 0. The latter case corresponds to any unit vector in the y-z plane, for example the four unit vectors pointing along the positive and negative y- and z-axes, i.e. to directions normal to the gradient.

Example. We may use the results of the preceding example to calculate the tangent plane of a sphere. The equation of a general plane is given by

$$\mathbf{N}\cdot(\mathbf{r} - \mathbf{a}) = 0, \tag{1.56}$$

where N is a vector normal to the plane, $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is a position vector, and a is a vector pointing to a known point in the plane. Suppose we would like to know the equation of the plane tangent to the sphere

$$x^2 + y^2 + z^2 = 3 \tag{1.57}$$

at $\mathbf{a} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, which lies on this sphere. The normal at a = (1, 1, 1) is given from Eq. (1.54) by $\mathbf{N} = 2\,\mathbf{i} + 2\,\mathbf{j} + 2\,\mathbf{k}$, and so Eq. (1.56) yields

$$(2\,\mathbf{i} + 2\,\mathbf{j} + 2\,\mathbf{k})\cdot\left[(x - 1)\,\mathbf{i} + (y - 1)\,\mathbf{j} + (z - 1)\,\mathbf{k}\right] = 0, \tag{1.58}$$

which gives

$$x + y + z = 3, \tag{1.59}$$

which is the equation of the plane.
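The tangent-plane construction amounts to two steps: evaluate the gradient at the chosen point to obtain the normal N, then expand N · (r − a) = 0 into the form N · r = d. The sketch below assumes the function $f = x^2 + y^2 + z^2$ of Eq. (1.52) and treats the plane as tangent to the level surface of f that passes through the chosen point. The names `gradient_sphere` and `tangent_plane` are illustrative.

```python
def gradient_sphere(x, y, z):
    """Gradient of f = x^2 + y^2 + z^2, as in Eq. (1.54)."""
    return (2 * x, 2 * y, 2 * z)


def tangent_plane(point):
    """Return (N, d) such that N . r = d is the plane tangent at `point`.

    This is N . (r - a) = 0 from Eq. (1.56) with a = point, rearranged to N . r = N . a.
    """
    normal = gradient_sphere(*point)
    d = sum(n_i * a_i for n_i, a_i in zip(normal, point))
    return normal, d


if __name__ == "__main__":
    normal, d = tangent_plane((1, 1, 1))
    # 2x + 2y + 2z = 6 is the same plane as x + y + z = 3.
    print(normal, d)
```

Dividing the result at (1, 1, 1) by the common factor 2 recovers x + y + z = 3, the plane obtained in Eq. (1.59).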

1.2.4 Physical Applications of the Gradient


There are several physical situations that are formulated in terms of the gradient. Perhaps the most familiar application of the gradient is in classical mechanics. Given an energy-conserving potential V(r) of a particle as a function of position r, the force F acting on the particle is the negative gradient of the potential:

$$\mathbf{F} = -\nabla V. \tag{1.60}$$


The minus sign indicates that the force acting on the particle points in the direction of decreasing potential energy. The motion of the particle is obtained by solving Newton's second law of motion:

$$m\frac{d^2\mathbf{r}}{dt^2} = -\nabla V, \tag{1.61}$$

where m is the mass of the particle. The existence of a potential associated with a force is discussed in Sec. 5.3. The electrostatic field E is conservative, so the work done on a particle depends only on the initial and final positions of the particle, and not on the path followed. As discussed in Sec. 5.3, a potential energy can be associated with each conservative force. For the electrostatic field, the associated potential $\phi$ is calculated from

$$\mathbf{E} = -\nabla\phi. \tag{1.62}$$

In most metals and semiconductors, the relationship between the electrical current density j and the applied electric field E is given by Ohm's law:

$$\mathbf{j} = \sigma\,\mathbf{E}, \tag{1.63}$$

where $\sigma$ is the electrical conductivity. By using Eq. (1.62), we can write this relation as

$$\mathbf{j} = -\sigma\,\nabla\phi. \tag{1.64}$$

As the following discussion shows, there are several phenomena that are described by equations of this form. The heat flow q in the presence of variations of temperature T is expressed in terms of Fourier's law:

$$\mathbf{q} = -C\,\nabla T, \tag{1.65}$$

where C is the coefficient of thermal conductivity. The coefficient C may vary with the temperature, and certainly varies from one substance to another, but it is always positive. This makes intuitive sense, at least if the molecular concept of temperature is invoked: heat (kinetic energy at the microscopic scale) tends to flow from regions of high concentration of internal energy to regions of low internal energy. This is consistent with Fourier's law, in which the heat flow is directed opposite to the gradient of the temperature, i.e. along the direction in which the temperature decreases most rapidly. Similar concepts apply to particle diffusion. The current of particles j in the presence of a varying concentration c of the particles is given by Fick's law:

$$\mathbf{j} = -D\,\nabla c, \tag{1.66}$$

where D is the diffusion coefficient. The negative sign indicates that transport of material is from high to low concentrations, so that any variations in the concentrations tend to be smoothed out.
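All of the laws in this section share one computational pattern: take the gradient of a scalar field and multiply by a negative constant. The sketch below illustrates this for the force law, F equal to minus the gradient of V. The harmonic potential $V = \tfrac12 k |\mathbf{r}|^2$, the spring constant k = 2, and the function names are assumptions chosen for illustration, not taken from the notes.

```python
def numerical_gradient(scalar_field, r, h=1e-6):
    """Central-difference gradient of scalar_field at the point r (a sequence of coordinates)."""
    grad = []
    for i in range(len(r)):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        grad.append((scalar_field(plus) - scalar_field(minus)) / (2 * h))
    return grad


def force(potential, r, h=1e-6):
    """F = -grad V: the force points toward decreasing potential energy."""
    return [-g for g in numerical_gradient(potential, r, h)]


def harmonic_potential(r, k=2.0):
    """Illustrative potential V = (1/2) k |r|^2, for which F = -k r exactly."""
    return 0.5 * k * sum(c * c for c in r)


if __name__ == "__main__":
    print(force(harmonic_potential, [1.0, -2.0, 0.5]))
```

Replacing the potential by a temperature or concentration field, and the prefactor by the conductivity or diffusion coefficient, gives the corresponding flux for Fourier's or Fick's law with the same helper.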

1.3 Summary
This chapter has introduced a derivative operation on scalar functions of several independent variables called the gradient, denoted by $\nabla$:

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}. \tag{1.67}$$

The gradient is a vector quantity calculated from a scalar function. The gradient is an intrinsic property of the function: it has the magnitude of the maximum rate of change of f and points in the direction normal to surfaces of constant f. The gradient is used in the following applications:

1. Identify the magnitude and direction of the maximum rate of change of a function at a given point.
2. Calculate the directional derivative at a point in a direction specified by a unit vector.
3. Find the normal to a surface at a specified point.
4. Determine the tangent plane to a surface at a given point.

Chapter 2 Single integrals


The integral of a function f(x) over an interval $a \le x \le b$ is defined as the limit of Riemann sums, which approximate the area bounded by f, the x-axis, and the lines x = a and x = b, as indicated in Fig. 2.1. A Riemann sum is constructed by first dividing the interval into N subintervals of length $\Delta x_N \equiv (b - a)/N$. Associated with each subinterval is a strip of area $f(a + n\,\Delta x_N)\,\Delta x_N$. The Riemann sum is obtained by adding all of these areas together. The integral of f over this interval is obtained as the limiting value of this sum as the length of the subintervals vanishes ($N \to \infty$):

$$\int_a^b f(x)\,dx \equiv \lim_{N \to \infty}\left[\sum_{n=1}^{N} f(a + n\,\Delta x_N)\,\Delta x_N\right]. \tag{2.1}$$

Example. The integral of f(x) = x between x = a and x = b is calculated by first constructing the Riemann sum. For this function, we have that

$$f(a + n\,\Delta x_N) = a + n\,\Delta x_N, \tag{2.2}$$

so the area corresponding to each strip is $(a + n\,\Delta x_N)\,\Delta x_N$. Hence, the definition in Eq. (2.1) reduces to

$$\int_a^b x\,dx = \lim_{N \to \infty}\left[\sum_{n=1}^{N} (a + n\,\Delta x_N)\,\Delta x_N\right]. \tag{2.3}$$

With $\Delta x_N = (b - a)/N$, we have

$$\int_a^b x\,dx = \lim_{N \to \infty}\left[\sum_{n=1}^{N}\left(a + n\,\frac{b - a}{N}\right)\left(\frac{b - a}{N}\right)\right] = \lim_{N \to \infty}\left\{\sum_{n=1}^{N}\left[a\left(\frac{b - a}{N}\right) + n\left(\frac{b - a}{N}\right)^2\right]\right\}. \tag{2.4}$$

Fig. 2.1: The approximation by Riemann sums (left panel) of the area between a curve and the x-axis (right panel).

We can break up the right-hand side of this equation into two separate sums. The first of these can be easily evaluated because there is no explicit n-dependence:

$$\sum_{n=1}^{N} a\left(\frac{b - a}{N}\right) = N a\left(\frac{b - a}{N}\right) = a(b - a). \tag{2.5}$$

The second sum,

$$\sum_{n=1}^{N} n\left(\frac{b - a}{N}\right)^2 = \left(\frac{b - a}{N}\right)^2 \sum_{n=1}^{N} n, \tag{2.6}$$

requires the following result:

$$\sum_{n=1}^{N} n = \tfrac12 N(N + 1). \tag{2.7}$$

Thus,

$$\left(\frac{b - a}{N}\right)^2 \sum_{n=1}^{N} n = \tfrac12 (b - a)^2\,\frac{N + 1}{N}. \tag{2.8}$$

Combining these summations and taking the limit $N \to \infty$ allows us to evaluate the integral:

$$\int_a^b x\,dx = a(b - a) + \tfrac12 (b - a)^2 \underbrace{\lim_{N \to \infty}\left(\frac{N + 1}{N}\right)}_{=1} = \tfrac12 (b^2 - a^2). \tag{2.9}$$
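The Riemann-sum construction for f(x) = x is easy to reproduce numerically. The following sketch uses illustrative function names. It evaluates the sum in Eq. (2.1) at the right-hand ends of the strips and compares it with the finite-N closed form obtained by adding Eqs. (2.5) and (2.8).

```python
def riemann_sum(f, a, b, n_strips):
    """Sum of f(a + n*dx) * dx for n = 1..N, the bracketed expression in Eq. (2.1)."""
    dx = (b - a) / n_strips
    return sum(f(a + n * dx) * dx for n in range(1, n_strips + 1))


def closed_form_for_x(a, b, n_strips):
    """Finite-N value for f(x) = x: a(b - a) + (1/2)(b - a)^2 (N + 1)/N."""
    return a * (b - a) + 0.5 * (b - a) ** 2 * (n_strips + 1) / n_strips


if __name__ == "__main__":
    a, b = 1.0, 3.0
    for n_strips in (10, 100, 1000):
        print(n_strips, riemann_sum(lambda x: x, a, b, n_strips), closed_form_for_x(a, b, n_strips))
    print("limit:", 0.5 * (b * b - a * a))
```

The two columns should agree at every N, and both should approach $\tfrac12(b^2 - a^2)$ with an excess proportional to 1/N, which is the factor (N + 1)/N in Eq. (2.8).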

2.1 Fundamental Theorem of Calculus


The calculation of an integral as the limit of Riemann sums is much more cumbersome than determining the derivative of a function from Eq. (1.1). Fortunately, the Fundamental Theorem of Calculus removes the need for such calculations for a large class of integrals. This theorem states that

$$\int_a^b f(x)\,dx = F(b) - F(a), \tag{2.10}$$

where

$$\frac{dF}{dx} = f. \tag{2.11}$$

The function F, whose derivative is equal to f, is called the anti-derivative or the primitive function of f. Note the structure of the Fundamental Theorem. The integral of f is an expression that involves the values of f at every point within the interval (a, b). But the evaluation of this integral with the primitive function F of f requires the values of F only at the endpoints a and b of this interval. The basic theorems of vector calculus will be seen to have an analogous structure. A proof of the Fundamental Theorem of Calculus is given in the last section of this chapter.

Example. Consider the integral

$$\int_a^b x\,dx, \tag{2.12}$$

which was evaluated in the preceding section using Riemann sums. To use the Fundamental Theorem of Calculus, we first identify the primitive function F of x as

$$F(x) = \tfrac12 x^2 + A, \tag{2.13}$$

where A is a constant (called a constant of integration). Then, the value of this integral is

$$\int_a^b x\,dx = \left[\tfrac12 x^2 + A\right]_a^b = \tfrac12 (b^2 - a^2). \tag{2.14}$$

Note that the constant A makes no contribution to the value of the integral. Thus, for the purposes of evaluating definite integrals, constants of integration can be omitted from the primitive function F.


The Fundamental Theorem of Calculus enables a number of important properties of integrals to be obtained. Higher-dimensional versions of this theorem form one of the major themes of this course. The following properties of definite integrals are implied by the Fundamental Theorem:

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx, \tag{2.15}$$

$$\frac{d}{dx}\int_a^x f(s)\,ds = f(x), \tag{2.16}$$

$$\frac{d}{dx}\int_x^b f(s)\,ds = -f(x), \tag{2.17}$$

$$\frac{d}{dx}\int_{u(x)}^{v(x)} f(s)\,ds = f(v(x))\,\frac{dv}{dx} - f(u(x))\,\frac{du}{dx}. \tag{2.18}$$
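The last of these properties, the derivative of an integral with variable limits, can be checked numerically. The sketch below rests on several assumptions made only for illustration: the integrand is f(s) = cos s, the limits are u(x) = x and v(x) = x², the integral is evaluated with a composite Simpson rule, and its x-derivative with a central difference.

```python
import math


def simpson(f, lo, hi, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3


def lhs(f, u, v, x, h=1e-4):
    """Left-hand side of Eq. (2.18): d/dx of the integral of f from u(x) to v(x)."""
    upper = simpson(f, u(x + h), v(x + h))
    lower = simpson(f, u(x - h), v(x - h))
    return (upper - lower) / (2 * h)


def rhs(f, u, v, du, dv, x):
    """Right-hand side of Eq. (2.18): f(v) dv/dx - f(u) du/dx."""
    return f(v(x)) * dv(x) - f(u(x)) * du(x)


if __name__ == "__main__":
    u, du = (lambda x: x), (lambda x: 1.0)
    v, dv = (lambda x: x * x), (lambda x: 2 * x)
    x0 = 1.2
    print(lhs(math.cos, u, v, x0), rhs(math.cos, u, v, du, dv, x0))
```

Setting u to a constant reduces the check to Eq. (2.16), and setting v to a constant reduces it to Eq. (2.17).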

2.2 Variable Transformations in Integrals


The Fundamental Theorem of Calculus establishes a connection between derivatives and integrals and provides a way of evaluating integrals in principle. But evaluating the anti-derivative of particular functions (i.e. finding F such that dF/dx = f) may still prove a challenging proposition and in some cases may require numerical evaluation (e.g. the trapezoidal method). Several methods have been developed for finding anti-derivatives in particular cases, including integration by parts, trigonometric substitution and other variable transformations, and partial fractions. Variable transformations in particular provide a versatile way of changing difficult integrals into expressions that are more manageable. We first work through an example.

Example. Consider the integral

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}}. \tag{2.19}$$

This is a standard example of an integral whose evaluation benefits from a change of variables, in this case based on trigonometric functions. We define a new variable of integration $\theta$ through

$$x = \sin\theta. \tag{2.20}$$

To transform the integral, we must consider the effect of this transformation on the integrand, the integration element, and the limits of integration. By using the trigonometric identity $\cos^2\theta + \sin^2\theta = 1$, the integrand is transformed to

$$\frac{1}{\sqrt{1 - x^2}} = \frac{1}{\sqrt{1 - \sin^2\theta}} = \frac{1}{\cos\theta}. \tag{2.21}$$

The integration element is calculated by applying the chain rule to Eq. (2.20):

$$dx = \cos\theta\,d\theta. \tag{2.22}$$

Lastly, the new limits of integration are determined by identifying the values of $\theta$ for which x is 0 at the lower limit and 1 at the upper limit. These are identified as

$$\sin(0) = 0, \qquad \sin(\tfrac12\pi) = 1. \tag{2.23}$$

Thus, the original integral is transformed to

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \int_0^{\frac12\pi} d\theta, \tag{2.24}$$

the right-hand side of which is easily evaluated, and we obtain

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \tfrac12\pi. \tag{2.25}$$

This example illustrates the power of variable transformations: what seemed a difficult evaluation has been transformed, through a judicious choice of a new integration variable, to a much simpler expression. An appropriate transformation is sometimes apparent from the integral itself, as in this example, but may involve an element of trial and error. Most modern computational mathematics software, e.g. Maple and Mathematica, performs a series of transformations to determine the simplest form of an integral. We can now formulate in general terms the transformation of an integral

$$\int_a^b f(x)\,dx \tag{2.26}$$

under the change of variables $x \to t(x)$. The integrand becomes

$$f(x) = f(x(t)), \tag{2.27}$$

where x(t) is obtained from the inverse of t(x). In the example above, the change of variables was defined in this form. The integration element is transformed to

$$dx = \frac{dx}{dt}\,dt \tag{2.28}$$

and the limits of integration are now t(a) and t(b). Hence, the general form of a change of variables in an integral is

$$\int_a^b f(x)\,dx = \int_{t(a)}^{t(b)} f(x(t))\,\frac{dx}{dt}\,dt. \tag{2.29}$$
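The practical benefit of the change of variables can be seen by integrating both sides of the trigonometric example numerically. The sketch below assumes a simple midpoint rule, which never evaluates the integrand at the endpoints, and uses illustrative function names. The original integrand diverges at x = 1, while the transformed integrand $f(x(\theta))\,dx/d\theta$ with $x = \sin\theta$ is constant.

```python
import math


def midpoint_rule(g, lo, hi, n):
    """Midpoint rule with n strips; the endpoints themselves are never evaluated."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def original_integrand(x):
    """1 / sqrt(1 - x^2), which diverges as x approaches 1."""
    return 1.0 / math.sqrt(1.0 - x * x)


def transformed_integrand(theta):
    """f(x(theta)) * dx/dtheta for x = sin(theta), as in the general rule of Eq. (2.29)."""
    return original_integrand(math.sin(theta)) * math.cos(theta)


if __name__ == "__main__":
    exact = 0.5 * math.pi
    print("original   :", midpoint_rule(original_integrand, 0.0, 1.0, 1000) - exact)
    print("transformed:", midpoint_rule(transformed_integrand, 0.0, exact, 100) - exact)
```

The error of the transformed integral should sit near rounding level, whereas the untransformed one converges slowly because of the endpoint singularity.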

The choice of transformation is usually dictated by the requirement that the primitive function of the transformed integrand, f(x(t))(dx/dt), is easier to determine than that of the original function. The quantity dx/dt represents the change in the density of integration points induced by the change of variables. This is a key quantity that arises whenever integration variables are changed and will be encountered again when we discuss coordinate transformations in two and three dimensions. For one-dimensional integrals there is an alternative interpretation of the transformation of the integration element. If we regard x(t) as the position of a particle as a function of time, then dx represents the differential change in the position during a time interval dt:

$$dx(t) = x(t + dt) - x(t). \tag{2.30}$$

Intuitively, we know that dx is given by the instantaneous velocity v(t) multiplied by the time interval dt: dx(t) = v(t) dt. But v(t) = dx/dt, so

$$dx(t) = \frac{dx}{dt}\,dt, \tag{2.31}$$

which is the same as Eq. (2.28). A somewhat more formal argument goes as follows. We may expand the function f(x + dx) in the vicinity of x by using the Taylor series:

$$f(x + dx) = f(x) + dx\,f'(x) + \tfrac{1}{2}\,dx^2 f''(x) + \dots = f(x) + dx\,f'(x) + O(dx^2) , \tag{2.32}$$

where O(y) denotes an error of order y. However, $f(x + dx) - f(x)$ is just a small increment of the function f, which can be denoted by df. Hence, for small but finite dx we have $df = f'(x)\,dx$. Note that df and dx are called differential changes of f and x, respectively, while $f'(x) = df(x)/dx$ denotes the derivative of the function f with respect to the variable x.

The following indefinite integrals, which can be derived from the basic formulae in Eqs. (1.7)-(1.11), will be used throughout this course:

$$\int x^n\,dx = \frac{1}{n+1}\,x^{n+1} + A , \tag{2.33}$$

$$\int \sin x\,dx = -\cos x + A , \tag{2.34}$$

$$\int \cos x\,dx = \sin x + A , \tag{2.35}$$

$$\int e^x\,dx = e^x + A , \tag{2.36}$$

$$\int \frac{1}{x}\,dx = \ln x + A , \tag{2.37}$$

$$\int \cos^2 x\,dx = \tfrac{1}{2}\,x + \tfrac{1}{2}\sin x\cos x + A , \tag{2.38}$$

$$\int \sin x\cos^n x\,dx = -\frac{1}{n+1}\cos^{n+1} x + A , \tag{2.39}$$

$$\int x^2 e^{-x}\,dx = -(x^2 + 2x + 2)\,e^{-x} + A , \tag{2.40}$$

where n is a positive integer and A is a constant of integration that is eliminated once the integrals are evaluated between specific upper and lower limits. As guaranteed by the Fundamental Theorem, the derivative of the right-hand side of each of these expressions yields the integrand on the corresponding left-hand side.
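The statement that differentiating each right-hand side recovers the integrand can be spot-checked numerically. The following is a minimal sketch, assuming the integrand of Eq. (2.37) is 1/x and that of Eq. (2.40) is x^2 e^(-x). The helper `derivative`, the sample points and the choice n = 3 are illustrative assumptions.

```python
import math

def derivative(F, x, h=1e-5):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

n = 3  # any positive integer; chosen here only for illustration

# Pairs (integrand f, primitive F with A = 0) for Eqs. (2.33)-(2.40).
table = [
    (lambda x: x ** n, lambda x: x ** (n + 1) / (n + 1)),
    (math.sin, lambda x: -math.cos(x)),
    (math.cos, math.sin),
    (math.exp, math.exp),
    (lambda x: 1.0 / x, math.log),
    (lambda x: math.cos(x) ** 2,
     lambda x: 0.5 * x + 0.5 * math.sin(x) * math.cos(x)),
    (lambda x: math.sin(x) * math.cos(x) ** n,
     lambda x: -math.cos(x) ** (n + 1) / (n + 1)),
    (lambda x: x * x * math.exp(-x),
     lambda x: -(x * x + 2.0 * x + 2.0) * math.exp(-x)),
]

# Largest mismatch between F'(x) and f(x) over a few positive sample points.
max_error = max(abs(derivative(F, x) - f(x))
                for f, F in table for x in (0.5, 1.0, 2.0))
print(max_error)
```

The constant A plays no role in this check because it vanishes on differentiation.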

2.3 Proof of the Fundamental Theorem


Proving the Fundamental Theorem of Calculus requires that we first prove the Mean Value Theorem: If f is a real continuous function on an interval [a, b] and differentiable on the open interval (a, b), then there is a point x within (a, b) at which

$$f(b) - f(a) = (b - a)\,f'(x) .$$

This theorem is straightforward to understand in terms of the diagram shown below. The quantity

$$\frac{f(b) - f(a)}{b - a}$$

represents the slope of the straight line passing through the end-points (a, f(a)) and (b, f(b)). The Mean Value Theorem states that, if f is differentiable everywhere within (a, b), there is a point x within this interval where the slope f'(x) of f is given by

$$f'(x) = \frac{f(b) - f(a)}{b - a} .$$

[Figure: graph of f between a and b. The chord through (a, f(a)) and (b, f(b)) is labelled "Slope = (f(b) − f(a))/(b − a)", and the point x is marked between a and b.]

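This is shown by the emboldened line in the figure above. To use the Mean Value Theorem to prove the Fundamental Theorem of Calculus, we define the function F by

$$F(x) = \int_a^x f(t)\,dt$$

for some function f and $a \le x \le b$. We will first show that F is a differentiable function where f is continuous. Using the definition in Eq. (1.1), we write the derivative of F as

$$\begin{aligned}
\frac{dF}{dx} &= \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{1}{\Delta x}\left[\int_a^{x + \Delta x} f(t)\,dt - \int_a^x f(t)\,dt\right] \\
&= \lim_{\Delta x \to 0} \frac{1}{\Delta x}\left[\int_a^x f(t)\,dt + \int_x^{x + \Delta x} f(t)\,dt - \int_a^x f(t)\,dt\right] \\
&= \lim_{\Delta x \to 0} \frac{1}{\Delta x}\int_x^{x + \Delta x} f(t)\,dt .
\end{aligned}$$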

The integral in the last line of this equation can be approximated by the area of a strip of height f(x) and width $\Delta x$, with a correction of order $(\Delta x)^2$:

$$\int_x^{x + \Delta x} f(t)\,dt = f(x)\,\Delta x + O(\Delta x^2) .$$

Hence, upon substitution of this expression into the definition of the derivative of F, we obtain

$$\frac{dF}{dx} = \lim_{\Delta x \to 0} \frac{1}{\Delta x}\left[f(x)\,\Delta x + O(\Delta x^2)\right] = \lim_{\Delta x \to 0}\left[f(x) + O(\Delta x)\right] = f(x) ,$$

which demonstrates that the derivative of F exists for every point x where f is continuous. In particular, if f is a continuous function on [a, b], then F is differentiable at every point in that interval. Thus, consider the partition of [a, b] into N intervals such that $x_{n-1} \le x \le x_n$, where $x_0 = a$ and $x_N = b$:

$$x_n = a + n\left(\frac{b - a}{N}\right) , \tag{2.41}$$

where n = 1, 2, . . . , N. This is shown schematically in the following diagram:

We now use the Mean Value Theorem to choose a point $t_n$ within the nth interval that satisfies

$$F(x_n) - F(x_{n-1}) = (x_n - x_{n-1})\,F'(t_n) = (x_n - x_{n-1})\,f(t_n) .$$

Then

$$F(b) - F(a) = \sum_{n=1}^{N}\left[F(x_n) - F(x_{n-1})\right] = \sum_{n=1}^{N} f(t_n)\,\Delta x_n ,$$

where $\Delta x_n = x_n - x_{n-1}$. The right-hand side of this equation is represented by the shaded area in the right panel of the figure above and is seen to be the same basic construction as that used for the Riemann sums shown in the figure above. Accordingly, if we now take the limit $N \to \infty$, this approaches the area under the curve, and we have

$$F(b) - F(a) = \int_a^b f(t)\,dt ,$$

which is the Fundamental Theorem of Calculus.
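The role of the Mean Value Theorem points $t_n$ in this proof can be illustrated numerically. The sketch below assumes the specific pair f = cos and F = sin on [0, 1.2], chosen only because $t_n$ can then be solved for in closed form. The function name `mvt_riemann_sum` is hypothetical.

```python
import math

def mvt_riemann_sum(a, b, N):
    """Sum of f(t_n) * dx_n with t_n chosen by the Mean Value Theorem.

    Assumes f = cos and F = sin, with [a, b] inside [0, pi/2] so that
    cos is monotonic and acos returns the point t_n inside each interval.
    """
    total = 0.0
    for n in range(1, N + 1):
        x_prev = a + (n - 1) * (b - a) / N
        x_n = a + n * (b - a) / N
        dx = x_n - x_prev
        # Solve F(x_n) - F(x_prev) = dx * f(t_n) for t_n.
        t_n = math.acos((math.sin(x_n) - math.sin(x_prev)) / dx)
        total += math.cos(t_n) * dx
    return total

a, b = 0.0, 1.2
coarse = mvt_riemann_sum(a, b, 4)
fine = mvt_riemann_sum(a, b, 40)
target = math.sin(b) - math.sin(a)
print(coarse, fine, target)
```

With this choice of $t_n$ the sum telescopes, so it equals F(b) − F(a) for every N up to rounding. The limit N → ∞ is needed only to identify the sum with the integral.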

2.4 Summary
This chapter has reviewed the main results of single integral calculus. The evaluation of integrals in higher dimensions, which are introduced in the following chapters, always reduces to one-dimensional evaluations of the type discussed here. Moreover, the Fundamental Theorem of Calculus will have analogues in higher spatial dimensions.

Chapter 3 Double Integrals


Quantities such as mass and charge density are often defined for systems within continuous regions of two- or three-dimensional space. Three-dimensional systems are the typical case, but two-dimensional systems are of interest either as limiting cases of three dimensions (such as thin plates), or as inherently low-dimensional systems (such as electrons within nano-structured materials that are a few atoms wide). Determining the amount of a physical quantity within a given region requires performing integrals over that region. This leads to the notion of double and triple integrals. These integrals are natural extensions of the Riemann integrals in one dimension that were discussed in Sec. 2. In this chapter, we discuss the evaluation of double integrals, initially in Cartesian coordinates, and then in circular polar coordinates, which find frequent application when dealing with circular boundaries.

3.1 Integrals in Cartesian Coordinates


Consider a two-dimensional system within which a function f is defined at every point (x, y). This function can represent a physical quantity such as a mass or charge density, the local temperature or, more abstractly, a probability density, which occurs in quantum mechanics and in the physics of random processes. Calculating the amount of this quantity within a specified region A in the x-y plane necessitates performing an integral over the ranges of x and y that span the interior of A. Because this operation involves integration over two variables, it is represented as a double integral:

$$\iint_A f(x, y)\,dx\,dy . \tag{3.1}$$

Fig. 3.1: Geometric representation of a double integral of a function f (x, y), represented as the surface z = f (x, y) (shown shaded). The integral is bounded by integration range A in the x-y plane, the corresponding region mapped onto the surface, and the vertical extensions of the boundaries of A to f . This interpretation is analogous to that in Fig. 2.1 for one-dimensional integrals.

Double integrals have a geometrical interpretation that is analogous to that of one-dimensional integrals. As shown in Fig. 3.1, the function f(x, y) can be represented as the surface z = f(x, y) in three-dimensional space. The region A in the x-y plane is mapped to the region f(A). The integral in Eq. (3.1) therefore corresponds to the volume bounded by the x-y plane, the surface f(A), and the boundaries of A extended vertically to f. There are several important points to note about double integrals:

1. Once the area A is specified, the integral has a unique value.
2. The integrations over x and y can be carried out in any order.
3. For the special case f(x, y) = 1, the region of integration is a cylinder of unit height with base area A. The value of the integral is, therefore, the area of A.

Example. Suppose $f = x^2 y$ and A is the rectangular region shown in Fig. 3.2. The evaluation of the double integral

$$\iint_A x^2 y\,dx\,dy \tag{3.2}$$

proceeds by first determining the ranges of x and y for the coordinates of every point within A. Since the boundaries of A are parallel to the x- and y-axes, these ranges are readily determined as

$$1 \le x \le 3, \qquad 1 \le y \le 2 . \tag{3.3}$$

The double integral to be evaluated is

$$\int_1^3 dx \int_1^2 dy\; x^2 y = \int_1^3 x^2\,dx \int_1^2 y\,dy . \tag{3.4}$$

Fig. 3.2: The integration region A, shown shaded, for the double integral in Eq. (3.2).

The original integral has thereby been reduced to two one-dimensional integrals. This is a general feature of multiple integrals: their evaluation always reduces to a sequence of one-dimensional integrals. The final step is the evaluation of the integrals in Eq. (3.4):

$$\int_1^3 x^2\,dx = \frac{1}{3}\,x^3\,\Big|_1^3 = 9 - \frac{1}{3} = \frac{26}{3} , \tag{3.5}$$

$$\int_1^2 y\,dy = \frac{1}{2}\,y^2\,\Big|_1^2 = 2 - \frac{1}{2} = \frac{3}{2} , \tag{3.6}$$

from which we obtain

$$\iint_A x^2 y\,dx\,dy = \frac{26}{3} \times \frac{3}{2} = 13 . \tag{3.7}$$
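The value in Eq. (3.7) can be reproduced with a direct two-dimensional midpoint sum over the rectangle. This is a minimal sketch. The helper name `double_midpoint` and the grid size are assumptions made for illustration.

```python
def double_midpoint(f, x_lo, x_hi, y_lo, y_hi, n=400):
    """Midpoint-rule estimate of a double integral over a rectangle."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += f(x, y)
    # Each sample represents a cell of area hx * hy.
    return total * hx * hy

# Integrand and ranges of Eqs. (3.2) and (3.3).
value = double_midpoint(lambda x, y: x * x * y, 1.0, 3.0, 1.0, 2.0)
print(value)  # close to 13
```

Because the ranges of x and y are independent, the same sum could equally be written as the product of two one-dimensional sums, mirroring Eq. (3.4).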


Double integrals over regions bounded by lines that are parallel to the coordinate axes are especially straightforward to evaluate because the ranges of x and y are independent of one another. But, as the following example shows, this is not always the case.

Example. Suppose that, in the double integral in Eq. (3.2), the integration region A is the triangle shown in Fig. 3.3. Identifying the ranges of x and y for this region requires a different procedure from that in Fig. 3.2. There are two ways that this integral can be done: by carrying out the y-integration first followed by the x-integration, and by carrying out the x-integration first followed by the y-integration.

Fig. 3.3: The integration region A, shown shaded, for the double integral in Eq. (3.2).

Method I. If x is allowed to range over the interval $0 \le x \le 1$ then, as shown in Fig. 3.4(a), the values of y corresponding to a particular value of x must lie in the range $0 \le y \le 2x$ because A is bounded from below by the x-axis and from above by the line y = 2x. Thus, the double integral over A is written as

$$\int_0^1 dx \int_0^{2x} dy\; x^2 y = \int_0^1 x^2\,dx \int_0^{2x} y\,dy = \int_0^1 x^2\left(\int_0^{2x} y\,dy\right)dx . \tag{3.8}$$

Notice that, because the upper limit of the y-integration is a function of x, this integral must be performed before the integral over x. Such a multiple integral is called an iterated integral. In effect, the double integral over A has been represented as an integral over each vertical strip that runs parallel to the y-axis within A, followed by an integral over all of these strips. Accordingly, the integration over y yields

$$\int_0^{2x} y\,dy = \frac{1}{2}\,y^2\,\Big|_0^{2x} = 2x^2 . \tag{3.9}$$

The integral over x can now be carried out, and we obtain

$$2\int_0^1 x^4\,dx = \frac{2}{5}\,x^5\,\Big|_0^1 = \frac{2}{5} . \tag{3.10}$$

Method II. This integral can also be evaluated by performing the integration over x first. Referring to Fig. 3.4(b), the range of y is $0 \le y \le 2$. For a given value of y, the corresponding values of x within A are bounded from the left by y = 2x and from the right by x = 1. Thus, the corresponding range of x is $\tfrac{1}{2}y \le x \le 1$. The double integral is now written as

$$\int_0^2 dy \int_{y/2}^1 dx\; x^2 y = \int_0^2 y\,dy \int_{y/2}^1 x^2\,dx = \int_0^2 y\left(\int_{y/2}^1 x^2\,dx\right)dy . \tag{3.11}$$

Fig. 3.4: Two ways of setting up the ranges of integration for the region shown in Fig. 3.3. (a) The allowed values of y for a given value of x in the range $0 \le x \le 1$. (b) The allowed values of x for a given value of y in the range $0 \le y \le 2$.

The integration over x must now be performed first, with the result

$$\int_{y/2}^1 x^2\,dx = \frac{1}{3}\,x^3\,\Big|_{y/2}^1 = \frac{1}{3} - \frac{1}{24}\,y^3 .$$

The integration over y then yields

$$\frac{1}{3}\int_0^2 y\,dy - \frac{1}{24}\int_0^2 y^4\,dy = \frac{1}{6}\,y^2\,\Big|_0^2 - \frac{1}{120}\,y^5\,\Big|_0^2 = \frac{2}{3} - \frac{32}{120} = \frac{2}{5} , \tag{3.12}$$

which agrees with the result in Eq. (3.10).

Example. We now evaluate the same function $f(x, y) = x^2 y$ as in the previous example, but now over the unshaded area in Fig. 3.3, that is, the triangle between the y-axis, the line y = 2x and the line y = 2.

Method I. We allow the independent variable x to vary such that $0 \le x \le 1$. The dependent variable y then needs to vary between 2x and 2: $2x \le y \le 2$. Thus the double integral over the new area A is given by

$$\int_0^1 dx \int_{2x}^2 dy\; x^2 y = \int_0^1 x^2\,dx \int_{2x}^2 y\,dy = \int_0^1 x^2\left[\frac{y^2}{2}\right]_{2x}^{2} dx = 2\int_0^1 x^2(1 - x^2)\,dx = \frac{4}{15} . \tag{3.13}$$

Method II. We now choose the independent variable to be y. Accordingly, as the figure shows, we allow y to vary such that $0 \le y \le 2$. The dependent variable is x. Since we are now integrating over the unshaded triangle, we need to allow x to vary between 0 and y/2: $0 \le x \le y/2$. Accordingly, the integral now reads:

$$\int_0^2 dy \int_0^{y/2} dx\; x^2 y = \int_0^2 y\,dy \int_0^{y/2} x^2\,dx = \int_0^2 y\left[\frac{x^3}{3}\right]_0^{y/2} dy = \int_0^2 y\,\frac{y^3}{3 \cdot 2^3}\,dy = \frac{4}{15} . \tag{3.14}$$

Clearly the two methods yield the same result. However, the two examples result in different answers. This is not surprising when we note that what we are calculating is the volume that is bounded by the x-y plane from below, the function $x^2 y$ from above, and the three planes whose intersection with the x-y plane is the outline of area A. If the function values above the area change then, of course, so will the resulting volume.

The approach described in the preceding examples can be applied to any region bounded by straight line segments. In some cases, the same methods can be used for regions with circular boundaries. The following example shows how to calculate the area of a semi-circular region.

Example. Consider the integral

$$\iint_A dx\,dy , \tag{3.15}$$

where the area A, shown in Fig. 3.5, is bounded from below by the x-axis, and from above by the boundary of the circle $x^2 + y^2 = 1$. Because the integrand is unity, the value of this integral is equal to the area of A. We will evaluate this integral by performing the integral over y first. The range of x is $-1 \le x \le 1$. For a given value of x, the values of y within A are bounded from below by the x-axis and from above by the circular boundary. Thus, the range of y is $0 \le y \le \sqrt{1 - x^2}$. The double integral is therefore written as

$$\int_{-1}^{1} dx \int_0^{\sqrt{1 - x^2}} dy . \tag{3.16}$$

Fig. 3.5: The semi-circular region A, shown shaded, for the double integral in Eq. (3.15).

The integral over y is straightforward to carry out and we obtain

$$\int_{-1}^{1} \sqrt{1 - x^2}\,dx . \tag{3.17}$$

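This integral can be evaluated by trigonometric substitution. We set $x = \sin\theta$. Then,

$$\sqrt{1 - x^2} = \sqrt{1 - \sin^2\theta} = \cos\theta , \tag{3.18}$$

$$dx = \cos\theta\,d\theta , \tag{3.19}$$

and the limits of integration are transformed as

$$x = -1 \;\to\; \theta = -\tfrac{1}{2}\pi , \qquad x = 1 \;\to\; \theta = \tfrac{1}{2}\pi . \tag{3.20}$$

The transformed integral is

$$\int_{-1}^{1} \sqrt{1 - x^2}\,dx = \int_{-\pi/2}^{\pi/2} \cos^2\theta\,d\theta = \tfrac{1}{2}\pi , \tag{3.21}$$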

which is the area of the semi-circular region. With the examples in this section as background, we can summarize the evaluation of double integrals over any region A in the x-y plane by the two approaches illustrated in Fig. 3.6. In Fig. 3.6(a), the range of x is $x_A \le x \le x_B$ and the corresponding range of y at a particular value of x is $u_1(x) \le y \le u_2(x)$, and the double integral is written as

$$\iint_A f(x, y)\,dx\,dy = \int_{x_A}^{x_B} dx \int_{u_1(x)}^{u_2(x)} dy\; f(x, y) . \tag{3.22}$$

Fig. 3.6: The two methods of evaluating a double integral over a region in the x-y plane, shown shaded. (a) Integration over y between $u_1(x)$ and $u_2(x)$, followed by integration over x between $x_A$ and $x_B$. (b) Integration over x between $v_1(y)$ and $v_2(y)$, followed by integration over y between $y_A$ and $y_B$.

In Fig. 3.6(b), the range of y is $y_A \le y \le y_B$ and the corresponding range of x for a particular value of y is $v_1(y) \le x \le v_2(y)$, and the double integral is

$$\iint_A f(x, y)\,dx\,dy = \int_{y_A}^{y_B} dy \int_{v_1(y)}^{v_2(y)} dx\; f(x, y) . \tag{3.23}$$
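The general prescription of Eq. (3.22) translates directly into a nested sum in which the inner limits depend on the outer variable. The sketch below is illustrative only. The function name `iterated_integral` and its signature are assumptions, and the midpoint rule is just one convenient choice of quadrature.

```python
import math

def iterated_integral(f, x_a, x_b, u1, u2, n=400):
    """Midpoint estimate of Eq. (3.22): inner y-range [u1(x), u2(x)], outer x-range [x_a, x_b]."""
    hx = (x_b - x_a) / n
    total = 0.0
    for i in range(n):
        x = x_a + (i + 0.5) * hx
        lo, hi = u1(x), u2(x)
        hy = (hi - lo) / n
        # Integrate along the vertical strip at this value of x.
        inner = sum(f(x, lo + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

# Triangle of Fig. 3.3: 0 <= x <= 1, 0 <= y <= 2x, integrand x^2 y.
triangle = iterated_integral(lambda x, y: x * x * y, 0.0, 1.0,
                             lambda x: 0.0, lambda x: 2.0 * x)

# Semi-circle of Fig. 3.5: -1 <= x <= 1, 0 <= y <= sqrt(1 - x^2), integrand 1.
semicircle = iterated_integral(lambda x, y: 1.0, -1.0, 1.0,
                               lambda x: 0.0, lambda x: math.sqrt(1.0 - x * x))

print(triangle, semicircle)  # close to 2/5 and pi/2
```

Swapping the roles of x and y in the arguments gives the order of integration in Eq. (3.23).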

Although these expressions indicate the order in which the integrals over x and y are to be carried out, the actual evaluation of these integrals may prove problematic for certain types of boundaries and integrands. For some common cases there are special methods available. We consider an example.

Example. Consider the integral

$$\iint_A e^{-x^2 - y^2}\,dx\,dy , \tag{3.24}$$

where the region A is shown in Fig. 3.5. Integrals of this type arise in quantum mechanics and in the physics of random processes. We set up the integral using the same steps that led to Eq. (3.16):

$$\int_{-1}^{1} dx \int_0^{\sqrt{1 - x^2}} dy\; e^{-x^2 - y^2} = \int_{-1}^{1} e^{-x^2}\,dx \int_0^{\sqrt{1 - x^2}} e^{-y^2}\,dy . \tag{3.25}$$

We arrive at an impasse because there is no explicit expression for the primitive function of $e^{-y^2}$. The problem is not the boundary, but the integrand. In fact, the semi-circular boundary provides the basis for an alternative way of writing this integral that enables it to be evaluated in a straightforward manner. This involves the transformation to a new coordinate system and will be discussed in the next section.

3.2 Circular Polar Coordinates


3.2.1 Definitions of Coordinate Systems
The ability to transform integrals and derivatives between different coordinate systems is one of the cornerstones of modern physics. Cartesian coordinates provide a simple way of labelling any point in the x-y plane. Lines parallel to the coordinate axes are drawn from the point and their intersection points with the x and y axes define the coordinates of that point, as shown in Fig. 3.7(a). The ranges of x and y,

$$-\infty < x < \infty , \qquad -\infty < y < \infty , \tag{3.26}$$

include all points in the x-y plane. This coordinate system is conceptually simple and has natural extensions to higher dimensions. But there are other ways of labelling points that may be more suitable in particular circumstances.

Fig. 3.7: Two ways of labelling the same point in the plane: (a) (x, y) in a Cartesian coordinate system and (b) $(r, \phi)$ in a circular polar coordinate system.

The basic idea of circular polar coordinates is to specify any point (x, y) in terms of two new variables: (i) a radius r that specifies the distance of the point from the origin, and (ii) an angular variable $\phi$, called the azimuthal angle, that specifies the angle of the radial vector with respect to some axis, by convention taken to be the positive x-axis, with the angle increasing from zero in the counterclockwise direction. These quantities are shown in Fig. 3.7(b). The variable r is an inherently non-negative quantity so its range is

$$0 \le r < \infty . \tag{3.27}$$

The azimuthal angle must account for all orientations with respect to the positive x-axis while maintaining a unique labelling for all points, so its range is

$$0 \le \phi < 2\pi . \tag{3.28}$$

The relationship between the two coordinate systems can be determined from a standard trigonometric analysis:

$$x = r\cos\phi , \qquad y = r\sin\phi , \tag{3.29}$$

or, alternatively,

$$r = \sqrt{x^2 + y^2} , \qquad \phi = \tan^{-1}\left(\frac{y}{x}\right) . \tag{3.30}$$

The restriction of the range of $\phi$ in Eq. (3.28) can now be understood as restricting the trigonometric functions in Eq. (3.29) to a single period. The differences between the Cartesian and circular polar coordinates are best appreciated pictorially by plotting the coordinate curves, i.e. the curves where

one of the coordinates is held constant. The coordinate curves of the Cartesian coordinate system are straight lines parallel to the x and y axes, as shown in Fig. 3.8(a). For the circular polar coordinate system, the curves of constant r are, from Eq. (3.30), concentric circles centered at the origin. The curves of constant azimuthal angle are straight lines through the origin that make an angle $\phi$ with respect to the positive x-axis. These are shown in Fig. 3.8(b).

3.2.2 Transformation of the Integration Element


The transformation from (x, y) to $(r, \phi)$ necessitates changes to the element of integration in double integrals. The basic calculation involves making differential changes of the variables and calculating the area enclosed as a result of these changes. For Cartesian coordinates, at any point (x, y), changing x to x + dx and y to y + dy encloses an area dx dy everywhere in the x-y plane. We now consider the corresponding calculation in circular polar coordinates, where we calculate the area enclosed by the differential changes $r \to r + dr$ and $\phi \to \phi + d\phi$. The steps are shown in Fig. 3.9. The area between the circles of radii r and r + dr, depicted in the left panel of Fig. 3.9, is calculated either as $\pi(r + dr)^2 - \pi r^2$, keeping only the term of first order in dr, or directly as the differential of the area of a circle of radius r:

$$d(\pi r^2) = 2\pi r\,dr . \tag{3.31}$$

The fraction of this area contained between the azimuths $\phi$ and $\phi + d\phi$ is $d\phi/2\pi$, i.e. the fraction of $2\pi$ that $d\phi$ represents. Hence, the area dA enclosed by the infinitesimal variations of r and $\phi$ is

$$dA = 2\pi r\,dr\,\frac{d\phi}{2\pi} = r\,dr\,d\phi . \tag{3.32}$$

Fig. 3.8: The coordinate curves of the (a) Cartesian and (b) circular polar coordinate system. Also shown is a circle to indicate how the polar coordinates provide a more natural description of such boundaries than the Cartesian coordinates.

Fig. 3.9: The steps used to calculate the integration element in circular polar coordinates. The area between the circles of radii r and r + dr is shown in the left panel. The fraction of this area contained between the azimuthal angles $\phi$ and $\phi + d\phi$ is shown in the right panel.

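Note the factor of r multiplying $dr\,d\phi$. This results from the fact that, for a fixed angular increment $d\phi$, the arc length between $\phi$ and $\phi + d\phi$ is $r\,d\phi$. An alternative way of deriving the integration element is to first construct the vector $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ which, in terms of the variables in Eq. (3.29), is

$$\mathbf{r} = r\cos\phi\,\mathbf{i} + r\sin\phi\,\mathbf{j} . \tag{3.33}$$

We now calculate the vectors $d\mathbf{r}_r$ and $d\mathbf{r}_\phi$ resulting from the differentials with respect to r and $\phi$, respectively:

$$d\mathbf{r}_r = dr\cos\phi\,\mathbf{i} + dr\sin\phi\,\mathbf{j} , \tag{3.34}$$

$$d\mathbf{r}_\phi = -r\sin\phi\,d\phi\,\mathbf{i} + r\cos\phi\,d\phi\,\mathbf{j} . \tag{3.35}$$

These vectors are orthogonal,

$$d\mathbf{r}_r \cdot d\mathbf{r}_\phi = (dr\cos\phi\,\mathbf{i} + dr\sin\phi\,\mathbf{j}) \cdot (-r\sin\phi\,d\phi\,\mathbf{i} + r\cos\phi\,d\phi\,\mathbf{j}) = -r\,dr\cos\phi\sin\phi\,d\phi + r\,dr\cos\phi\sin\phi\,d\phi = 0 , \tag{3.36}$$

so the area defined by these vectors is obtained by multiplying their magnitudes:

$$dA = |d\mathbf{r}_r|\,|d\mathbf{r}_\phi| = dr \cdot r\,d\phi = r\,dr\,d\phi . \tag{3.37}$$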

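This procedure can be generalized to any coordinate transformation given in terms of new variables u and v: x = x(u, v) and y = y(u, v). The vector $\mathbf{r}$ is

$$\mathbf{r} = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} , \tag{3.38}$$

and the differential changes to u and v yield the vectors

$$d\mathbf{r}_u = \frac{\partial x}{\partial u}\,du\,\mathbf{i} + \frac{\partial y}{\partial u}\,du\,\mathbf{j} , \tag{3.39}$$

$$d\mathbf{r}_v = \frac{\partial x}{\partial v}\,dv\,\mathbf{i} + \frac{\partial y}{\partial v}\,dv\,\mathbf{j} . \tag{3.40}$$

Since these vectors are not necessarily orthogonal, we must calculate the area from their cross product, which we write as

$$dA = |d\mathbf{r}_u \times d\mathbf{r}_v| = \left|\;\begin{vmatrix} \dfrac{\partial x}{\partial u}\,du & \dfrac{\partial y}{\partial u}\,du \\[2ex] \dfrac{\partial x}{\partial v}\,dv & \dfrac{\partial y}{\partial v}\,dv \end{vmatrix}\;\right| = \left|\;\begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix}\;\right| du\,dv . \tag{3.41}$$

The determinant on the right-hand side of this equation is called the Jacobian and denoted by J(u, v):

$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} \\[2ex] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} \end{vmatrix} , \tag{3.42}$$

so the integration element is written as

$$dA = |J(u, v)|\,du\,dv , \tag{3.43}$$

where the absolute value is taken because dA is an inherently positive quantity, while the sign of J can be changed simply by interchanging the two vectors in the cross product. The Jacobian provides the weight of the integration elements as a function of position. In the case of circular polar coordinates, the Jacobian factor indicates that the weight associated with an element $dr\,d\phi$ increases linearly with r:

$$J(r, \phi) = \begin{vmatrix} \cos\phi & \sin\phi \\ -r\sin\phi & r\cos\phi \end{vmatrix} = r(\cos^2\phi + \sin^2\phi) = r , \tag{3.44}$$

as in Eq. (3.37).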
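The Jacobian of Eq. (3.42) can be estimated for any transformation by finite differences, which gives a quick way to confirm the polar result J = r. The sketch below is illustrative. The helper name `jacobian`, the step size h and the sample point are assumptions.

```python
import math

def jacobian(x, y, u, v, h=1e-6):
    """Finite-difference estimate of the determinant in Eq. (3.42) at (u, v)."""
    dx_du = (x(u + h, v) - x(u - h, v)) / (2.0 * h)
    dy_du = (y(u + h, v) - y(u - h, v)) / (2.0 * h)
    dx_dv = (x(u, v + h) - x(u, v - h)) / (2.0 * h)
    dy_dv = (y(u, v + h) - y(u, v - h)) / (2.0 * h)
    # Rows are derivatives with respect to u and v, as in Eq. (3.42).
    return dx_du * dy_dv - dy_du * dx_dv

# Circular polar coordinates, Eq. (3.29), with u = r and v = phi.
x_polar = lambda r, phi: r * math.cos(phi)
y_polar = lambda r, phi: r * math.sin(phi)

r0, phi0 = 2.0, 0.7  # arbitrary sample point
J = jacobian(x_polar, y_polar, r0, phi0)
print(J)  # close to r0
```

The same function can be applied to other transformations x(u, v), y(u, v) to obtain the weight |J| in Eq. (3.43) at a given point.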

3.2.3 Double Integrals in Polar Coordinates


We first consider the example at the end of Sec. 3.1. The integral is given in Eq. (3.24),

$$\iint_A e^{-x^2 - y^2}\,dx\,dy ,$$

where the region A is shown in Fig. 3.5. In the circular polar coordinates of Eq. (3.29), the integrand becomes

$$e^{-x^2 - y^2} = e^{-r^2} , \tag{3.45}$$

and the integration element becomes $r\,dr\,d\phi$. There remains only the specification of the ranges of r and $\phi$. For the semi-circular region in Fig. 3.5,

$$0 \le r \le 1 , \qquad 0 \le \phi \le \pi , \tag{3.46}$$

where the restriction on the range of $\phi$ results from the fact that the integration region is the upper half-circle. The integral to be evaluated is

$$\int_0^{\pi} d\phi \int_0^1 r\,e^{-r^2}\,dr . \tag{3.47}$$

Both integrals are straightforward to carry out, and we obtain

$$\int_0^{\pi} d\phi \int_0^1 r\,e^{-r^2}\,dr = -\tfrac{1}{2}\pi\,e^{-r^2}\,\Big|_0^1 = \tfrac{1}{2}\pi\left(1 - e^{-1}\right) . \tag{3.48}$$

Thus, the transformation into circular polar coordinates has enabled us to evaluate an integral that was intractable in Cartesian coordinates.

Example. Consider the integral of $f(x, y) = 2x + 4y^2$ over the region between the circles $x^2 + y^2 = 1$ and $x^2 + y^2 = 4$. In the circular polar coordinates of Eq. (3.29), f is

$$2x + 4y^2 = 2r\cos\phi + 4r^2\sin^2\phi . \tag{3.49}$$

The range of r is restricted by the radii of the bounding circles, $1 \le r \le 2$, and the range of $\phi$ is $0 \le \phi < 2\pi$ to account for the entire region between the circles. Hence, the integral to be evaluated is

$$\begin{aligned}
\int_1^2 r\,dr \int_0^{2\pi} d\phi\,(2r\cos\phi + 4r^2\sin^2\phi)
&= 2\int_1^2 r^2\,dr \underbrace{\int_0^{2\pi} \cos\phi\,d\phi}_{=\,0} + 4\int_1^2 r^3\,dr \underbrace{\int_0^{2\pi} \sin^2\phi\,d\phi}_{=\,\pi} \\
&= 4\pi\int_1^2 r^3\,dr = \pi r^4\,\Big|_1^2 = 15\pi .
\end{aligned} \tag{3.50}$$
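Both polar-coordinate examples can be checked with a midpoint sum over r and the azimuthal angle that includes the weight r from the integration element. This is a sketch under stated assumptions: the helper `polar_integral`, its argument order and the grid size are illustrative choices, and the angle is written `phi` in the code.

```python
import math

def polar_integral(f, r_lo, r_hi, phi_lo, phi_hi, n=400):
    """Midpoint estimate of the integral of f(x, y) over a region given by
    constant limits in r and phi, using the element dA = r dr dphi."""
    hr = (r_hi - r_lo) / n
    hp = (phi_hi - phi_lo) / n
    total = 0.0
    for i in range(n):
        r = r_lo + (i + 0.5) * hr
        for j in range(n):
            phi = phi_lo + (j + 0.5) * hp
            # Evaluate f at the Cartesian point and weight it by r.
            total += f(r * math.cos(phi), r * math.sin(phi)) * r
    return total * hr * hp

# Annulus example: f = 2x + 4y^2 for 1 <= r <= 2 and 0 <= phi < 2*pi.
annulus = polar_integral(lambda x, y: 2.0 * x + 4.0 * y * y,
                         1.0, 2.0, 0.0, 2.0 * math.pi)

# Semi-circle example: f = exp(-x^2 - y^2) for 0 <= r <= 1 and 0 <= phi <= pi.
gauss = polar_integral(lambda x, y: math.exp(-x * x - y * y),
                       0.0, 1.0, 0.0, math.pi)

print(annulus, 15.0 * math.pi)
print(gauss, 0.5 * math.pi * (1.0 - math.exp(-1.0)))
```

Omitting the factor r in the sum gives noticeably different values, which is a simple way to see why the Jacobian weight is needed.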

Fig. 3.10: The most common types of region, shown shaded, used for integrations in circular polar coordinates.

The most common regions over which integrations are carried out in circular polar coordinates are shown in Fig. 3.10. Figure 3.10(a) represents the interior of a circle of radius R. The ranges of r and $\phi$ are

$$0 \le r \le R , \qquad 0 \le \phi < 2\pi . \tag{3.51}$$

For the interior of the wedge-shaped region in Fig. 3.10(b), we have

$$0 \le r \le R , \qquad 0 \le \phi \le \phi_0 , \tag{3.52}$$

where $\phi_0$ is the angle of the wedge. Finally, for the region in Fig. 3.10(c), which is a partial annular region,

$$R_1 \le r \le R_2 , \qquad \phi_1 \le \phi \le \phi_2 . \tag{3.53}$$

Integrals over all of these regions can therefore be written as

$$\iint_A f(x, y)\,dx\,dy = \int_{R_1}^{R_2} \int_{\phi_1}^{\phi_2} f(r\cos\phi, r\sin\phi)\,r\,dr\,d\phi . \tag{3.54}$$

Although regions of the type in Fig. 3.10 are the most natural for circular polar coordinates, the following example shows that integrals over areas with straight boundaries can also be carried out in this coordinate system.

Example. Consider the area shown in Fig. 3.11. The azimuthal angle is seen to range between $\tan^{-1}(1) = \tfrac{1}{4}\pi$ and $\tan^{-1}(-1) = \tfrac{3}{4}\pi$:

$$\tfrac{1}{4}\pi \le \phi \le \tfrac{3}{4}\pi . \tag{3.55}$$

Fig. 3.11: Integration area for the example.

The lower bound of r is r = 0. The upper bound is determined by writing the upper boundary of the triangle, y = 1, in circular polar coordinates. Given that $y = r\sin\phi$, this boundary is $r\sin\phi = 1$. The range of r is therefore given by

$$0 \le r \le \frac{1}{\sin\phi} , \tag{3.56}$$

and the area integral is

$$\int_{\pi/4}^{3\pi/4} d\phi \int_0^{1/\sin\phi} r\,dr . \tag{3.57}$$

The radial integration must be carried out first, and we obtain

$$\int_0^{1/\sin\phi} r\,dr = \frac{1}{2}\,r^2\,\Big|_0^{1/\sin\phi} = \frac{1}{2\sin^2\phi} , \tag{3.58}$$

and the area integral reduces to

$$\frac{1}{2}\int_{\pi/4}^{3\pi/4} \frac{d\phi}{\sin^2\phi} . \tag{3.59}$$

The primitive function of the integrand is $-\cot x$:

$$\frac{d(\cot x)}{dx} = \frac{d}{dx}\left(\frac{\cos x}{\sin x}\right) = -\frac{\sin^2 x + \cos^2 x}{\sin^2 x} = -\frac{1}{\sin^2 x} , \tag{3.60}$$

so the area integral is evaluated as

$$\frac{1}{2}\int_{\pi/4}^{3\pi/4} \frac{d\phi}{\sin^2\phi} = \frac{1}{2}\left(-\cot x\,\Big|_{\pi/4}^{3\pi/4}\right) = \frac{1}{2}(1 + 1) = 1 , \tag{3.61}$$

which is the area of the triangle.
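The area of this triangle can also be obtained numerically in polar coordinates, with the radial upper limit depending on the azimuthal angle as in Eq. (3.56). The sketch below is illustrative. The function name `triangle_area_polar` and the grid size are assumptions.

```python
import math

def triangle_area_polar(n=400):
    """Midpoint estimate of the area integral in Eq. (3.57)."""
    phi_lo, phi_hi = math.pi / 4.0, 3.0 * math.pi / 4.0
    h_phi = (phi_hi - phi_lo) / n
    area = 0.0
    for i in range(n):
        phi = phi_lo + (i + 0.5) * h_phi
        r_max = 1.0 / math.sin(phi)          # boundary y = 1 in polar form
        h_r = r_max / n
        # Radial integral of r dr from 0 to r_max for this angle.
        radial = sum((j + 0.5) * h_r for j in range(n)) * h_r
        area += radial * h_phi
    return area

area = triangle_area_polar()
print(area)  # close to 1
```

The result agrees with the elementary formula for a triangle of base 2 and height 1.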

3.3 Summary
The double integral of a function f(x, y) represents the volume under the surface of that function within a specified region in the x-y plane. This extends to functions of two independent variables the discussion in Sec. 2 of one-dimensional integrals. The change of variables from Cartesian to another system of coordinates, such as circular polar coordinates, introduces a term, called the Jacobian, into the integral. The Jacobian is the higher-dimensional analogue of the term obtained by applying the chain rule to the integration element in one-dimensional integrals. It takes into account the changes of the element of integration area across the x-y plane. The evaluation of double integrals proceeds by the successive evaluation of a sequence of one-dimensional integrals.

Chapter 4 Triple Integrals


Triple integrals are the natural extensions of double integrals to three dimensions. The basic physical motivation of such integrals is the same as for double integrals: determining the amount of a quantity, typically expressed as a density, within a three-dimensional region necessitates performing a triple integral of the quantity over that region. Just as for double integrals, there are coordinate systems other than Cartesian that are convenient for integrating over certain types of regions. We will discuss the two most common of such coordinate systems, circular cylindrical coordinates and spherical polar coordinates, and show how integrals are transformed into these coordinate systems.

4.1 Integrals in Cartesian Coordinates


Suppose there is a quantity f that represents the density of a physical quantity, such as the mass or charge density at every point (x, y, z) in a region of three-dimensional space. The amount of this quantity within a region V is obtained by integrating over the ranges of x, y, and z that span the interior of V. Since this calculation involves three separate integrations, it is called a triple integral, and is written as

$$\iiint_V f(x, y, z)\,dx\,dy\,dz . \tag{4.1}$$

Following our discussion of double integrals, there are several points to note about triple integrals:

1. Once the volume V has been specified, the integral has a unique value.
2. The integrals over x, y, and z can be carried out in any order.
3. If f = 1, the integral yields the volume of the integration region:

$$\iiint_V dx\,dy\,dz = V . \tag{4.2}$$

The evaluation of triple integrals proceeds in direct analogy to the cases described in Chapter 3 for double integrals. The following examples illustrate the different situations that can arise.

Example. Suppose f = xyz, and that V is the volume shown in Fig. 4.1. We must first determine the ranges of the integration variables. The volume V is a cube in the positive octant of space with one corner at the origin. The points (x, y, z) within the cube have coordinates within the ranges

$$0 \le x \le 1 , \qquad 0 \le y \le 1 , \qquad 0 \le z \le 1 . \tag{4.3}$$

Fig. 4.1: The cubic region for the triple integral in Eq. (4.4).

The triple integral of f = xyz within the cube in Fig. 4.1 is therefore calculated as

$$\iiint_V f(x, y, z)\,dx\,dy\,dz = \underbrace{\int_0^1 x\,dx}_{1/2}\;\underbrace{\int_0^1 y\,dy}_{1/2}\;\underbrace{\int_0^1 z\,dz}_{1/2} = \frac{1}{8} . \tag{4.4}$$

This type of region, where the ranges of x, y, and z are specified independently, is the simplest for triple integrals. The most general volume of this type is a rectangular prism aligned with the coordinate axes, where each side is a rectangle parallel to one of the coordinate planes. The next two examples have volumes which do not satisfy these criteria, with the result that the triple integrals become iterated integrals.

Example. Suppose that f = xyz, as in the preceding example, and V is the wedge shown in Fig. 4.2. We first determine the ranges of the integration variables. The wedge is bounded from above by the plane y − z = 0, with all other bounding planes lying parallel to coordinate planes. Thus, the range of x is

$$0 \le x \le 1 . \tag{4.5}$$

Fig. 4.2: The volume for the triple integral in Eq. (4.9).

The triangular sides of the wedge are parallel to the plane x = 0, so the ranges of the y and z coordinates cannot be specified independently. Referring to Fig. 3.4, the ranges of these variables are

$$0 \le y \le 1 , \qquad 0 \le z \le y . \tag{4.6}$$

An alternative choice is (cf. Fig. 3.4)

$$z \le y \le 1 , \qquad 0 \le z \le 1 . \tag{4.7}$$

Using the ranges in Eq. (4.6), the triple integral is

$$\iiint_V f(x, y, z)\,dx\,dy\,dz = \int_0^1 x\,dx \int_0^1 y\,dy \int_0^y z\,dz . \tag{4.8}$$

As was the case for double integrals, this is called an iterated integral because the upper limit of the z-integral is y, which necessitates evaluating this integral before the y-integral. The x-integral can be carried out independently of the other two. Thus, carrying out the required integrations,

$$\underbrace{\int_0^1 x\,dx}_{1/2} \int_0^1 y\,dy \int_0^y z\,dz = \frac{1}{2}\int_0^1 y\,dy\,\underbrace{\left(\frac{1}{2}\,z^2\,\Big|_0^y\right)}_{y^2/2} = \frac{1}{4}\int_0^1 y^3\,dy = \frac{1}{16} . \tag{4.9}$$
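The iterated structure of Eq. (4.8), in which the z-limits depend on y, can be mirrored in a nested midpoint sum. The sketch below is illustrative. The helper `triple_midpoint` and its convention of passing the y- and z-ranges as functions are assumptions made for this example.

```python
def triple_midpoint(f, x_rng, y_rng, z_rng, n=50):
    """Iterated midpoint sum for a triple integral.

    x_rng is a (lower, upper) pair; y_rng(x) and z_rng(x, y) return
    (lower, upper) pairs, so inner limits may depend on outer variables.
    """
    x_lo, x_hi = x_rng
    hx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        y_lo, y_hi = y_rng(x)
        hy = (y_hi - y_lo) / n
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            z_lo, z_hi = z_rng(x, y)
            hz = (z_hi - z_lo) / n
            inner = sum(f(x, y, z_lo + (k + 0.5) * hz) for k in range(n))
            total += inner * hz * hy * hx
    return total

f = lambda x, y, z: x * y * z

# Wedge of Fig. 4.2 with the ranges of Eqs. (4.5) and (4.6).
wedge = triple_midpoint(f, (0.0, 1.0),
                        lambda x: (0.0, 1.0),
                        lambda x, y: (0.0, y))
print(wedge)  # close to 1/16
```

Setting the z-range to the constant pair (0, 1) reproduces the cube of Eq. (4.4), for which the sum is close to 1/8.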

The evaluation of this integral with the ranges in Eq. (4.7) is left as an exercise.

Example. Consider now the integration of f = xyz over the volume in Fig. 4.3. This region is contained in the positive octant, bounded from below by the x-y plane and from above by the plane x + y + z = 1. The ranges of the integration variables are obtained by first observing that, in the x-y plane, where z = 0, the (x, y) coordinates within V are bounded by the line x + y = 1. Hence, the ranges of x and y may be chosen as

$$0 \le x \le 1 , \qquad 0 \le y \le 1 - x . \tag{4.10}$$

Fig. 4.3: The volume for the triple integral in Eq. (4.12).

The lower bound for the range of z for all values of x and y is z = 0. The upper bound is obtained from the equation of the plane, solved for z: z = 1 − x − y. Hence,

$$0 \le z \le 1 - x - y , \tag{4.11}$$

so the integral to be evaluated is

$$\iiint_V f(x, y, z)\,dx\,dy\,dz = \int_0^1 x\,dx \int_0^{1-x} y\,dy \int_0^{1-x-y} z\,dz . \tag{4.12}$$

This is again an iterated integral in which the z-integration must be evaluated first, then the y-integration, and finally the x-integration. The integral over z is evaluated as

$$\int_0^{1-x-y} z\,dz = \frac{1}{2}\,z^2\,\Big|_0^{1-x-y} = \frac{1}{2}(1 - x - y)^2 . \tag{4.13}$$

By substituting this result into the y-integral and carrying out an integration by parts, we obtain

$$\begin{aligned}
\frac{1}{2}\int_0^{1-x} y(1 - x - y)^2\,dy
&= \underbrace{-\frac{1}{6}\,y(1 - x - y)^3\,\Big|_0^{1-x}}_{=\,0} + \frac{1}{6}\int_0^{1-x} (1 - x - y)^3\,dy \\
&= -\frac{1}{24}(1 - x - y)^4\,\Big|_0^{1-x} = \frac{1}{24}(1 - x)^4 .
\end{aligned} \tag{4.14}$$

Finally, substitution of this expression into the x-integral and again integrating by parts yields

$$\frac{1}{24}\int_0^1 x(1 - x)^4\,dx = \underbrace{-\frac{1}{120}\,x(1 - x)^5\,\Big|_0^1}_{=\,0} + \frac{1}{120}\int_0^1 (1 - x)^5\,dx = -\frac{1}{720}(1 - x)^6\,\Big|_0^1 = \frac{1}{720} \tag{4.15}$$

as the value of the integral in Eq. (4.12).

Example. It is also possible to solve this problem from first principles. Imagine that the volume is divided up into slices parallel to the x-y plane. Each slice has a thickness $\Delta z$ in the z direction, as shown in Fig. 4.4. When the volume is viewed from above (with the z-axis pointing towards the viewer), the lines that bound each slice appear as shown on the right-hand side of Fig. 4.4. The equation of the line delimiting the nth slice is $y = (1 - n\Delta z) - x$.

Fig. 4.4: Figure for the derivation of the volume integral from first principles.

We want to write a volume integral for the entire object and for this we proceed as follows: rst we determine the area of each one of the triangular slices. The volume of the given slice will be just obtained as the area times the height which of course is z. Finally we will some together the contribution from all slices and calculate the limit for z ! 0. This solution therefore exemplies how a volume integral problem can be reduced to an area integration problem. The area of a given triangle is determined by using a double integral. As Fig. 4.4 shows the limit for x integration will be 0 x 1 n x as a line intersects the x axis at 1 n z. The corresponding y limit is 0 y 1 n z x.

Therefore
$$\text{area} = \int_0^{1-n\Delta z} dx \int_0^{1-n\Delta z - x} dy ,$$
and the volume of an individual slice is given by
$$\text{volume} = \int_0^{1-n\Delta z} dx \int_0^{1-n\Delta z - x} dy \;\Delta z .$$

Because we need to integrate the function $f(x, y, z) = xyz$, we must weight the differential volume by this function. The value of the integral for a given slice is therefore
$$\int_0^{1-n\Delta z} x\,dx \int_0^{1-n\Delta z - x} y\,dy \;(n\Delta z)\,\Delta z .$$
The volume integral can now be written as the limit of a Riemann sum, with $\Delta z = 1/N$:
$$\lim_{N\to\infty} \sum_{n=0}^{N} \int_0^{1-n\Delta z} x\,dx \int_0^{1-n\Delta z - x} y\,dy \;(n\Delta z)\,\Delta z = \int_0^1 z\,dz \int_0^{1-z} x\,dx \int_0^{1-z-x} y\,dy .$$

This integral can readily be evaluated in the following way:
$$\int_0^1 z\,dz \int_0^{1-z} x\,dx \int_0^{1-z-x} y\,dy = \frac{1}{2}\int_0^1 z\,dz \int_0^{1-z} x(1-z-x)^2\,dx = \frac{1}{24}\int_0^1 z(z-1)^4\,dz = \frac{1}{720} .$$

Cartesian coordinates are convenient for evaluating triple integrals within volumes bounded by planes. But there are many situations where other geometries are used, the most common of which are spheres and volumes contained within surfaces of revolution. In the next two sections, we will discuss two coordinate systems that considerably extend the capabilities of triple integrals.
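The slice construction of the preceding example can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: it uses the closed form $a^4/24$ for the inner double integral $\int_0^a x\,dx\int_0^{a-x} y\,dy$ with $a = 1 - n\Delta z$, takes $\Delta z = 1/N$, and the function names are hypothetical. It sums the slice contributions exactly with rational arithmetic and compares the result with 1/720.

```python
from fractions import Fraction

def slice_integral(a):
    """Inner double integral int_0^a x dx int_0^(a-x) y dy, which equals a**4 / 24."""
    return a**4 / 24

def riemann_sum(N):
    """Sum over slices n = 0..N of slice_integral(1 - n*dz) * (n*dz) * dz, with dz = 1/N."""
    dz = Fraction(1, N)
    total = Fraction(0)
    for n in range(N + 1):
        z = n * dz
        total += slice_integral(1 - z) * z * dz
    return total

exact = Fraction(1, 720)
for N in (10, 100, 1000):
    approx = riemann_sum(N)
    # Print N, the Riemann sum, and its distance from 1/720
    print(N, float(approx), float(abs(approx - exact)))
```

The difference from 1/720 shrinks as N grows, which is the numerical counterpart of taking the limit $\Delta z \to 0$.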

4.2 Cylindrical Polar Coordinates


4.2.1 Definition of the Coordinate System
Cylindrical polar coordinates generalize circular polar coordinates (Sec. 3.2) to three dimensions by adding the height z to indicate the position of a point relative to the x-y plane [Fig. 4.5(a)]. The complete transformation from $(x, y, z)$ to $(r, \phi, z)$ is
$$x = r\cos\phi , \qquad y = r\sin\phi , \qquad z = z , \tag{4.16}$$
where
$$0 \le r < \infty , \qquad 0 \le \phi < 2\pi , \qquad -\infty < z < \infty . \tag{4.17}$$
This transformation is depicted in Fig. 4.5. The expressions for r and $\phi$ in terms of x and y are the same as those in Eq. (3.29).

4.2.2 The Integration Element

The integration element of this coordinate system can be obtained in two ways. The simplest way is to observe that the z-coordinate simply adds a thickness dz to the integration element in circular polar coordinates:
$$dV = r\,dr\,d\phi\,dz . \tag{4.18}$$

Fig. 4.5: Two illustrations of cylindrical polar coordinates. (a) The definitions of r, $\phi$, and z. (b) The representation of any point as the intersection of the surface of constant r (the cylinder), constant $\phi$ (the vertical plane), and constant z (the horizontal plane).

The other method, described in Problem Set 4, is based on writing any point (x, y, z) as a radius vector $\mathbf{r}$,
$$\mathbf{r} = r\cos\phi\,\mathbf{i} + r\sin\phi\,\mathbf{j} + z\,\mathbf{k} , \tag{4.19}$$
and calculating the integration element from the triple product
$$dV = |d\mathbf{r}_r \cdot d\mathbf{r}_\phi \times d\mathbf{r}_z| , \tag{4.20}$$
where $d\mathbf{r}_r$, $d\mathbf{r}_\phi$, and $d\mathbf{r}_z$ are the differential changes of $\mathbf{r}$ with respect to r, $\phi$, and z, respectively:
$$d\mathbf{r}_r = dr\cos\phi\,\mathbf{i} + dr\sin\phi\,\mathbf{j} , \tag{4.21}$$
$$d\mathbf{r}_\phi = -r\sin\phi\,d\phi\,\mathbf{i} + r\cos\phi\,d\phi\,\mathbf{j} , \tag{4.22}$$
$$d\mathbf{r}_z = dz\,\mathbf{k} . \tag{4.23}$$
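As a quick check on the second method, the triple product of the three differential vectors can be evaluated numerically. This is a minimal sketch, assuming the azimuthal angle is called phi and the differentials dr, dphi, dz are factored out so that only the coefficient vectors are used; the helper names are hypothetical. The result should equal r, the factor in dV = r dr dphi dz.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(p*q for p, q in zip(a, b))

def cylindrical_volume_factor(r, phi):
    """|t_r . (t_phi x t_z)|, where t_r, t_phi, t_z are the coefficients of dr, dphi, dz."""
    t_r = (math.cos(phi), math.sin(phi), 0.0)
    t_phi = (-r*math.sin(phi), r*math.cos(phi), 0.0)
    t_z = (0.0, 0.0, 1.0)
    return abs(dot(t_r, cross(t_phi, t_z)))

for r, phi in [(0.5, 0.3), (2.0, 1.7), (3.0, 4.0)]:
    # The second printed number should match r for every angle phi
    print(r, cylindrical_volume_factor(r, phi))
```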

4.2.3 Triple Integrals in Cylindrical Polar Coordinates

Fig. 4.6: The unit upper half-sphere, $x^2 + y^2 + z^2 = 1$, for $z \ge 0$.

Example. Consider the sphere with unit radius in the upper half-space, as shown in Fig. 4.6. The equation of the surface is
$$x^2 + y^2 + z^2 = 1 , \tag{4.24}$$
where $z \ge 0$. To calculate this volume as an integral in cylindrical polar coordinates, we must first determine the ranges of the integration variables. The ranges of r and $\phi$ span the interior of the half-sphere:
$$0 \le r \le 1 , \qquad 0 \le \phi < 2\pi . \tag{4.25}$$
The upper bound for the range of z is obtained from the equation of the sphere, solved for z:
$$z^2 = 1 - x^2 - y^2 = 1 - r^2 . \tag{4.26}$$
The half-sphere is bounded from below by the x-y plane, where z = 0. Hence, the range of z is $0 \le z \le \sqrt{1 - r^2}$. Thus, the volume integral of the half-sphere is given by
$$V = \int_0^1 r\,dr \underbrace{\int_0^{2\pi} d\phi}_{2\pi} \underbrace{\int_0^{\sqrt{1-r^2}} dz}_{\sqrt{1-r^2}} = 2\pi \int_0^1 r\sqrt{1-r^2}\,dr = 2\pi\left[-\frac{1}{3}(1-r^2)^{3/2}\right]_0^1 = 2\pi\cdot\frac{1}{3} = \frac{2}{3}\pi , \tag{4.27}$$

which is one-half the volume of the unit sphere.

Example. Consider the cone in Fig. 4.7. The surface is given by
$$x^2 + y^2 = (1 - z)^2 , \tag{4.28}$$
for $0 \le z \le 1$. The ranges of r and $\phi$ are
$$0 \le r \le 1 , \qquad 0 \le \phi < 2\pi . \tag{4.29}$$
The range of z is calculated by following the steps in the preceding example. The cone is bounded from below by the x-y plane, where z = 0. The upper bound of z is determined by the surface of the cone which, in cylindrical coordinates, is $r^2 = (1 - z)^2$. Thus, the range of z is
$$0 \le z \le 1 - r . \tag{4.30}$$
The volume integral of the cone is
$$V = \int_0^1 r\,dr \underbrace{\int_0^{2\pi} d\phi}_{2\pi} \underbrace{\int_0^{1-r} dz}_{1-r} = 2\pi\int_0^1 (r - r^2)\,dr = 2\pi\left(\frac{1}{2}r^2\Big|_0^1 - \frac{1}{3}r^3\Big|_0^1\right) = 2\pi\cdot\frac{1}{6} = \frac{1}{3}\pi . \tag{4.31}$$
The preceding two examples showed how cylindrical polar coordinates are used to calculate the volumes bounded by surfaces of revolution, i.e. surfaces obtained by rotating a curve about an axis, in those cases the z-axis. We now consider a more substantial example by calculating the volume enclosed by another surface of revolution, the torus.
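Both examples above reduce to a single radial integral, $V = 2\pi\int_0^1 r\,z_{\rm top}(r)\,dr$, once the angular and z-integrations are carried out. The following sketch estimates that integral with a midpoint rule; the function name, the step count, and the choice of the midpoint rule are assumptions made for illustration only.

```python
import math

def volume_of_revolution(z_top, r_max, n=20000):
    """Midpoint-rule estimate of 2*pi * int_0^r_max r * z_top(r) dr."""
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr          # midpoint of the k-th radial interval
        total += r * z_top(r) * dr
    return 2.0 * math.pi * total

# Upper bounds of z taken from the two examples: sqrt(1 - r^2) and 1 - r
half_sphere = volume_of_revolution(lambda r: math.sqrt(1.0 - r*r), 1.0)
cone = volume_of_revolution(lambda r: 1.0 - r, 1.0)

print(half_sphere, 2.0*math.pi/3.0)   # numerical estimate next to 2*pi/3
print(cone, math.pi/3.0)              # numerical estimate next to pi/3
```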

Fig. 4.7: The surface determined by $x^2 + y^2 = (1 - z)^2$, for $0 \le z \le 1$.

Example. A torus is a surface of revolution generated by rotating a circle of radius $\rho$ whose center is a distance $R > \rho$ from the origin about an axis, usually taken as the z-axis. The calculation of the volume of a torus does not actually require an expression for the surface. The ranges of r, $\phi$, and z can be determined by referring to Fig. 4.8. Consider first the left panel. Suppose we take the range of z as
$$-\rho \le z \le \rho . \tag{4.32}$$
The equation of the circle in the x-z plane is
$$(x - R)^2 + z^2 = \rho^2 , \tag{4.33}$$
so the range of r is obtained by solving this equation for x and referring to Fig. 4.8(b):
$$R - \sqrt{\rho^2 - z^2} \le r \le R + \sqrt{\rho^2 - z^2} . \tag{4.34}$$

Fig. 4.8: (a) The circle in the x-z plane that is rotated about the z-axis. (b) The section of the torus in the x-y plane. The emboldened line is the path traced out by the center of the circle.

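Figure 4.8(b) indicates that the range of $\phi$ is
$$0 \le \phi < 2\pi . \tag{4.35}$$
The volume integral of the torus is thus given by
$$V = \int_0^{2\pi} d\phi \int_{-\rho}^{\rho} dz \int_{R-\sqrt{\rho^2-z^2}}^{R+\sqrt{\rho^2-z^2}} r\,dr . \tag{4.36}$$
The radial integral is evaluated as
$$\int_{R-\sqrt{\rho^2-z^2}}^{R+\sqrt{\rho^2-z^2}} r\,dr = \frac{1}{2}r^2\Big|_{R-\sqrt{\rho^2-z^2}}^{R+\sqrt{\rho^2-z^2}} = \frac{1}{2}\left[\left(R+\sqrt{\rho^2-z^2}\right)^2 - \left(R-\sqrt{\rho^2-z^2}\right)^2\right] = 2R\sqrt{\rho^2-z^2} . \tag{4.37}$$
The integral over the azimuthal angle in Eq. (4.36) is $2\pi$, so the volume integral reduces to
$$V = 4\pi R\int_{-\rho}^{\rho}\sqrt{\rho^2 - z^2}\,dz . \tag{4.38}$$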

This integral can be evaluated by the trigonometric substitution $z = \rho\sin\theta$, where $-\tfrac{1}{2}\pi \le \theta \le \tfrac{1}{2}\pi$. Carrying out the required changes to the integrand, the integration element, and the limits of integration yields
$$V = 4\pi R\int_{-\rho}^{\rho}\sqrt{\rho^2 - z^2}\,dz = 4\pi R\rho^2 \underbrace{\int_{-\pi/2}^{\pi/2}\cos^2\theta\,d\theta}_{=\,\pi/2} = 2\pi^2 R\rho^2 . \tag{4.39}$$
By writing this result as
$$V = (2\pi R)(\pi\rho^2) , \tag{4.40}$$
the volume of a torus can be interpreted as the product of the area of the circle that is rotated about the z-axis to form the torus ($\pi\rho^2$) and the length of the path taken by the center of the circle ($2\pi R$). This is a special case of Pappus' Theorem¹: let $\mathcal{R}$ be a planar region that lies entirely on one side of an axis (usually the z-axis) in the plane. If $\mathcal{R}$ is rotated about this axis, the volume of the resulting solid is the product of the area A of $\mathcal{R}$ and the distance travelled by its centroid.
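The torus result can be confirmed numerically by evaluating the integral after the trigonometric substitution. This is a sketch under stated assumptions: the tube radius is written rho and the substitution angle theta (names chosen here for illustration), and a midpoint rule with a hypothetical step count replaces the analytic integration.

```python
import math

def torus_volume(R, rho, n=1000):
    """Midpoint-rule estimate of 4*pi*R*rho**2 * int_{-pi/2}^{pi/2} cos^2(theta) dtheta."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5*math.pi + (k + 0.5)*dtheta
        total += math.cos(theta)**2 * dtheta
    return 4.0 * math.pi * R * rho**2 * total

R, rho = 3.0, 1.0   # illustrative values with R > rho
print(torus_volume(R, rho), 2.0*math.pi**2*R*rho**2)        # compare with 2*pi^2*R*rho^2
print((2.0*math.pi*R)*(math.pi*rho**2))                      # Pappus form: path length times area
```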

4.3 Spherical Polar Coordinates


One of the most important coordinate systems in physics is spherical polar coordinates. These are appropriate whenever there is a spherical boundary or a section of such a boundary. Spherical polar coordinates are especially important for the quantum mechanical theory of atoms, which is based on spherical symmetry.

4.3.1 Definition of the Coordinate System

In spherical polar coordinates, a point (x, y, z) is expressed in terms of the radius r, which measures the distance of the point from the origin, the azimuthal angle $\phi$, which measures the orientation of the radius vector with respect to the positive x-axis, with positive $\phi$ taken in the counterclockwise direction, and the polar angle $\theta$, which measures the orientation of the radius vector with respect to the z-axis. These definitions and conventions are depicted in Fig. 4.9. The transformation between the coordinates $(x, y, z)$ and $(r, \phi, \theta)$ is determined from the trigonometric construction in Fig. 4.10. The projection of the radial vector onto the x-y plane has length $r\sin\theta$. The x and y coordinates are obtained by projecting this quantity onto the x and y axes:
$$x = r\sin\theta\cos\phi , \tag{4.41}$$
$$y = r\sin\theta\sin\phi . \tag{4.42}$$

¹ Pappus of Alexandria, who lived in the 4th century, is considered to be the last of the great Greek geometers.

Fig. 4.9: Two depictions of spherical polar coordinates. (a) The definitions and ranges of r, $\phi$, and $\theta$. (b) The representation of any point as the intersection of the surface of constant r (the sphere), constant $\phi$ (the plane), and constant $\theta$ (the cone).

The projection of the radius onto the z-axis is
$$z = r\cos\theta . \tag{4.43}$$

These are the transformations that relate Cartesian coordinates to spherical polar coordinates. The ranges of the radial and azimuthal variables are determined by referring to Fig. 4.9(a). As in circular polar coordinates (Sec. 3.2),
$$0 \le r < \infty , \qquad 0 \le \phi < 2\pi . \tag{4.44}$$
The range of $\theta$ is determined by requiring that the transformation between Cartesian and spherical polar coordinates is single-valued, i.e. that one and only one set of spherical polar coordinates $(r, \phi, \theta)$ corresponds to a particular set of Cartesian coordinates $(x, y, z)$. This necessitates restricting the range of $\theta$ to
$$0 \le \theta \le \pi . \tag{4.45}$$
To understand this, consider a point $\mathbf{r}$:
$$\mathbf{r} = r\cos\phi\sin\theta\,\mathbf{i} + r\sin\phi\sin\theta\,\mathbf{j} + r\cos\theta\,\mathbf{k} . \tag{4.46}$$

Fig. 4.10: The orientation of a radial vector with respect to the z-axis, showing the projections $r\cos\theta$ and $r\sin\theta$.

Suppose that we transform this point by rotating the azimuthal angle by $\pi$: $\phi \to \phi + \pi$. The coordinates of the transformed point $\mathbf{r}'$ are obtained by applying standard trigonometric identities:
$$\mathbf{r}' = -r\cos\phi\sin\theta\,\mathbf{i} - r\sin\phi\sin\theta\,\mathbf{j} + r\cos\theta\,\mathbf{k} . \tag{4.47}$$
Now suppose that we rotate the polar angle of $\mathbf{r}$ so that $\theta \to 2\pi - \theta$ ($< 2\pi$). The coordinates of the transformed point $\mathbf{r}''$ are again determined by applying standard trigonometric identities:
$$\mathbf{r}'' = -r\cos\phi\sin\theta\,\mathbf{i} - r\sin\phi\sin\theta\,\mathbf{j} + r\cos\theta\,\mathbf{k} . \tag{4.48}$$
By comparing these coordinates with those in Eq. (4.47), we conclude that $\mathbf{r}'' = \mathbf{r}'$, i.e. that there are two ways of labelling the same point. To avoid this unacceptable result, the range of $\theta$ is restricted to the range in Eq. (4.45).
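The double labelling described above is easy to see numerically. The sketch below assumes the convention that phi is the azimuthal angle and theta the polar angle, uses an arbitrary illustrative point, and relies on a hypothetical helper `to_cartesian` that applies Eqs. (4.41)-(4.43).

```python
import math

def to_cartesian(r, phi, theta):
    """Eqs. (4.41)-(4.43) with phi the azimuthal angle and theta the polar angle."""
    return (r*math.sin(theta)*math.cos(phi),
            r*math.sin(theta)*math.sin(phi),
            r*math.cos(theta))

r, phi, theta = 2.0, 0.4, 1.1                         # arbitrary illustrative point
p_azimuth = to_cartesian(r, phi + math.pi, theta)     # phi -> phi + pi
p_polar = to_cartesian(r, phi, 2.0*math.pi - theta)   # theta -> 2*pi - theta

print(p_azimuth)
print(p_polar)
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(p_azimuth, p_polar))
print(same)   # True when both labels give the same Cartesian point
```

Restricting theta to the interval from 0 to pi removes the second label, since 2*pi - theta then falls outside the allowed range for every theta strictly between 0 and pi.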

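4.3.2 The Integration Element

The integration element in spherical polar coordinates is most easily obtained with the procedure in Problem Set 4. The radius vector associated with a point (x, y, z) is written as
$$\mathbf{r} = r\cos\phi\sin\theta\,\mathbf{i} + r\sin\phi\sin\theta\,\mathbf{j} + r\cos\theta\,\mathbf{k} , \tag{4.49}$$
and the integration element is calculated from the triple product
$$dV = |d\mathbf{r}_r \cdot d\mathbf{r}_\phi \times d\mathbf{r}_\theta| , \tag{4.50}$$
where $d\mathbf{r}_r$, $d\mathbf{r}_\phi$, and $d\mathbf{r}_\theta$ are the differential changes of $\mathbf{r}$ with respect to r, $\phi$, and $\theta$, respectively:
$$d\mathbf{r}_r = dr\cos\phi\sin\theta\,\mathbf{i} + dr\sin\phi\sin\theta\,\mathbf{j} + dr\cos\theta\,\mathbf{k} , \tag{4.51}$$
$$d\mathbf{r}_\phi = -r\sin\phi\sin\theta\,d\phi\,\mathbf{i} + r\cos\phi\sin\theta\,d\phi\,\mathbf{j} , \tag{4.52}$$
$$d\mathbf{r}_\theta = r\cos\phi\cos\theta\,d\theta\,\mathbf{i} + r\sin\phi\cos\theta\,d\theta\,\mathbf{j} - r\sin\theta\,d\theta\,\mathbf{k} . \tag{4.53}$$

These vectors are mutually orthogonal, so the integration element is obtained from the product of their magnitudes:
$$dV = r^2\sin\theta\,dr\,d\theta\,d\phi . \tag{4.54}$$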
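The magnitudes that enter this product can be written out explicitly. The following worked step is a sketch, taking $\phi$ as the azimuthal angle and $\theta$ as the polar angle, and using $\sin\theta \ge 0$ on $0 \le \theta \le \pi$:

```latex
\begin{aligned}
|d\mathbf{r}_r| &= dr\,\sqrt{\cos^2\phi\,\sin^2\theta + \sin^2\phi\,\sin^2\theta + \cos^2\theta} = dr ,\\
|d\mathbf{r}_\phi| &= r\sin\theta\,d\phi\,\sqrt{\sin^2\phi + \cos^2\phi} = r\sin\theta\,d\phi ,\\
|d\mathbf{r}_\theta| &= r\,d\theta\,\sqrt{\cos^2\phi\,\cos^2\theta + \sin^2\phi\,\cos^2\theta + \sin^2\theta} = r\,d\theta ,\\
dV &= |d\mathbf{r}_r|\,|d\mathbf{r}_\phi|\,|d\mathbf{r}_\theta| = r^2\sin\theta\,dr\,d\theta\,d\phi .
\end{aligned}
```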


4.3.3 Triple Integrals in Spherical Polar Coordinates


Integrals of a function F(x, y, z) over a volume V are written in spherical polar coordinates as
$$\iiint_V F(x, y, z)\,dx\,dy\,dz = \iiint_{V'} F\big(x(r,\phi,\theta),\, y(r,\phi,\theta),\, z(r,\phi,\theta)\big)\,r^2\sin\theta\,dr\,d\theta\,d\phi \equiv \iiint_{V'} f(r,\phi,\theta)\,r^2\sin\theta\,dr\,d\theta\,d\phi , \tag{4.55}$$
where $V'$ is the volume V expressed in spherical polar coordinates. There are two important special cases of this integral. If f has no $\phi$-dependence, $f = f(r, \theta)$, then f is said to have azimuthal symmetry. According to the transformations in Eqs. (4.42) and (4.43) and Fig. 4.8, this corresponds to rotational symmetry about the z-axis. Surfaces of revolution have this type of symmetry. A physical situation with this type of symmetry is discussed in Problem Set 4. The integral over $\phi$ can be evaluated immediately and the general expression in Eq. (4.55) becomes
$$2\pi\iint f(r,\theta)\,r^2\sin\theta\,dr\,d\theta . \tag{4.56}$$
In the second case, where f has neither $\phi$- nor $\theta$-dependence, f is said to be isotropic. This corresponds to spherical symmetry in that f depends only on the radius r and not on any angular orientation. The integrals over $\phi$ and $\theta$ can be evaluated immediately and the general integral in Eq. (4.55) reduces to
$$4\pi\int f(r)\,r^2\,dr . \tag{4.57}$$
This integral is seen to correspond to the integration over radial shells.

Example. We consider first the calculation of the volume of a sphere of radius R. Referring to Eq. (4.55), this corresponds to the case f = 1. The ranges of the integration variables are obtained directly from Fig. 4.9(a):
$$0 \le r \le R , \qquad 0 \le \phi < 2\pi , \qquad 0 \le \theta \le \pi , \tag{4.58}$$
so the volume integral is
$$V = \underbrace{\int_0^R r^2\,dr}_{\frac{1}{3}R^3}\; \underbrace{\int_0^{2\pi} d\phi}_{2\pi}\; \underbrace{\int_0^{\pi}\sin\theta\,d\theta}_{2} = \frac{4}{3}\pi R^3 . \tag{4.59}$$

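The generalization of this procedure to sections of a sphere between given azimuthal and polar angles and to spherical shells with given inner and outer radii is straightforward.

Example. Consider the integral of $f = e^{-\alpha r}$ over all space. This is an example of a function with spherical symmetry that occurs frequently in quantum mechanics. The ranges of the integration variables are
$$0 \le r < \infty , \qquad 0 \le \phi < 2\pi , \qquad 0 \le \theta \le \pi , \tag{4.60}$$
so the integral of f becomes
$$\int_0^\infty r^2 e^{-\alpha r}\,dr\; \underbrace{\int_0^{2\pi} d\phi}_{2\pi}\; \underbrace{\int_0^{\pi}\sin\theta\,d\theta}_{2} = 4\pi\int_0^\infty r^2 e^{-\alpha r}\,dr . \tag{4.61}$$
The radial integral is evaluated by performing successive integrations by parts:
$$\begin{aligned}
4\pi\int_0^\infty \underbrace{r^2}_{u}\,\underbrace{e^{-\alpha r}\,dr}_{dv} &= \underbrace{-\frac{4\pi}{\alpha}\,r^2 e^{-\alpha r}\Big|_0^\infty}_{=\,0} + \frac{8\pi}{\alpha}\int_0^\infty r e^{-\alpha r}\,dr \\
&= \underbrace{-\frac{8\pi}{\alpha^2}\,r e^{-\alpha r}\Big|_0^\infty}_{=\,0} + \frac{8\pi}{\alpha^2}\int_0^\infty e^{-\alpha r}\,dr \\
&= \frac{8\pi}{\alpha^2}\left(-\frac{1}{\alpha}e^{-\alpha r}\right)\Big|_0^\infty = \frac{8\pi}{\alpha^3} .
\end{aligned} \tag{4.62}$$
Notice that, in arriving at this result, we have twice used the fact that
$$\lim_{x\to\infty} x^n e^{-\alpha x} = 0 \tag{4.63}$$
for any n (and $\alpha > 0$).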
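The closed form for the exponential integral can be compared with a direct numerical estimate of the radial-shell integral. This is a rough sketch under several assumptions: the decay constant is called alpha, the infinite range is truncated at r = cutoff/alpha (the neglected tail is exponentially small), and the midpoint rule and function name are choices made here for illustration.

```python
import math

def integral_exp_all_space(alpha, n=20000, cutoff=60.0):
    """Estimate 4*pi * int_0^inf r^2 exp(-alpha*r) dr, truncated at r = cutoff/alpha."""
    r_max = cutoff / alpha
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr          # midpoint of the k-th radial shell
        total += r*r*math.exp(-alpha*r)*dr
    return 4.0 * math.pi * total

for alpha in (0.5, 1.0, 2.0):
    # Numerical estimate next to the closed form 8*pi/alpha^3
    print(alpha, integral_exp_all_space(alpha), 8.0*math.pi/alpha**3)
```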

4.4 Surface Integrals


A particular case of integrals in three dimensions involves integrals over surfaces. A common type of surface integral is one in which one of the three variables is held constant. Consider the surface of a sphere of radius R. According to Eq. (4.49), the radius vector $\mathbf{r}$ at any point on the sphere is
$$\mathbf{r} = R\cos\phi\sin\theta\,\mathbf{i} + R\sin\phi\sin\theta\,\mathbf{j} + R\cos\theta\,\mathbf{k} . \tag{4.64}$$

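The element of area integration is obtained by calculating the differential of this vector for changes in turn of $d\phi$ and $d\theta$:
$$d\mathbf{r}_\phi = -R\sin\phi\sin\theta\,d\phi\,\mathbf{i} + R\cos\phi\sin\theta\,d\phi\,\mathbf{j} , \tag{4.65}$$
$$d\mathbf{r}_\theta = R\cos\phi\cos\theta\,d\theta\,\mathbf{i} + R\sin\phi\cos\theta\,d\theta\,\mathbf{j} - R\sin\theta\,d\theta\,\mathbf{k} . \tag{4.66}$$
These vectors are orthogonal,
$$d\mathbf{r}_\phi \cdot d\mathbf{r}_\theta = 0 , \tag{4.67}$$
so the differential area dA corresponding to these differential changes is obtained from the product of the magnitudes of $d\mathbf{r}_\phi$ and $d\mathbf{r}_\theta$:
$$dA = |d\mathbf{r}_\phi|\,|d\mathbf{r}_\theta| = R\sin\theta\,d\phi \cdot R\,d\theta = R^2\sin\theta\,d\phi\,d\theta . \tag{4.68}$$

Example. The surface area of a sphere of radius R is represented as
$$R^2\int_0^{2\pi} d\phi\int_0^{\pi}\sin\theta\,d\theta = R^2\cdot 2\pi\cdot 2 = 4\pi R^2 . \tag{4.69}$$
The corresponding expression for the surface area subtended by azimuthal angles $\phi_1$ and $\phi_2$ and polar angles $\theta_1$ and $\theta_2$ is
$$R^2\int_{\phi_1}^{\phi_2} d\phi\int_{\theta_1}^{\theta_2}\sin\theta\,d\theta = R^2(\phi_2 - \phi_1)(\cos\theta_1 - \cos\theta_2) . \tag{4.70}$$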

The other type of surface integral we will encounter involves a cylinder of radius R. The radius vector is, from Eq. (4.19), given by
$$\mathbf{r} = R\cos\phi\,\mathbf{i} + R\sin\phi\,\mathbf{j} + z\,\mathbf{k} . \tag{4.71}$$
The differentials $d\mathbf{r}$ corresponding to differential changes $d\phi$ and dz are
$$d\mathbf{r}_\phi = -R\sin\phi\,d\phi\,\mathbf{i} + R\cos\phi\,d\phi\,\mathbf{j} , \tag{4.72}$$
$$d\mathbf{r}_z = dz\,\mathbf{k} . \tag{4.73}$$
These vectors are manifestly orthogonal, $d\mathbf{r}_\phi \cdot d\mathbf{r}_z = 0$, so the differential area dA corresponding to these differential changes is obtained from the product of the magnitudes of $d\mathbf{r}_\phi$ and $d\mathbf{r}_z$:
$$dA = R\,d\phi\,dz . \tag{4.74}$$

Example. The surface area of a cylinder of radius R and height H is calculated as
$$R\int_0^{2\pi} d\phi\int_0^{H} dz = 2\pi RH . \tag{4.75}$$
The surface area of a cylinder between heights $H_1$ and $H_2$ and azimuthal angles $\phi_1$ and $\phi_2$ is similarly calculated as
$$R\int_{\phi_1}^{\phi_2} d\phi\int_{H_1}^{H_2} dz = R(\phi_2 - \phi_1)(H_2 - H_1) . \tag{4.76}$$
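The two area elements can be exercised on simple patches. In this sketch the function names, the sample radius, and the midpoint rule are assumptions for illustration; phi denotes the azimuthal angle and theta the polar angle. The spherical patch is summed numerically and printed next to the closed forms for a full sphere and for one octant, and the cylindrical patch uses the constant integrand directly.

```python
import math

def sphere_patch_area(R, phi1, phi2, theta1, theta2, n=2000):
    """Midpoint sum of dA = R^2 sin(theta) dphi dtheta over a patch of the sphere."""
    dtheta = (theta2 - theta1) / n
    strip = sum(math.sin(theta1 + (k + 0.5)*dtheta) for k in range(n)) * dtheta
    return R*R*(phi2 - phi1)*strip

def cylinder_patch_area(R, phi1, phi2, h1, h2):
    """dA = R dphi dz has a constant integrand, so the area is a simple product."""
    return R*(phi2 - phi1)*(h2 - h1)

R = 1.5
full_sphere = sphere_patch_area(R, 0.0, 2.0*math.pi, 0.0, math.pi)
octant = sphere_patch_area(R, 0.0, 0.5*math.pi, 0.0, 0.5*math.pi)

print(full_sphere, 4.0*math.pi*R**2)
print(octant, R**2*(0.5*math.pi)*(math.cos(0.0) - math.cos(0.5*math.pi)))
print(cylinder_patch_area(R, 0.0, 2.0*math.pi, 0.0, 2.0), 2.0*math.pi*R*2.0)
```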

4.5 Summary
The triple integral of a function f(x, y, z), viewed as a density of some physical quantity, is the amount of that quantity within a volume in three-dimensional space. There is considerably more freedom to specify other coordinate systems than in two dimensions, and many applications in physics rely on such transformations to enable calculations to be carried out. From Cartesian coordinates, we transformed triple integrals into cylindrical polar coordinates, which are the natural generalization of circular polar coordinates to three dimensions and are appropriate to situations with azimuthal symmetry, and into spherical polar coordinates, for situations that involve spherical symmetry. The Jacobians obtained in each case reflect the position dependence of the magnitude of the differential volume elements.

Chapter 5 Line Integrals


For a function f of a single variable, the integral of f over an interval is uniquely determined once the limits of integration are specified. Extending this construction to the integration of a function f of two or more variables along a path in space connecting specified initial and final points (the limits of integration) leads to entirely new mathematical issues. Foremost among these is that the value of such an integral, called a line integral, generally depends not just on the limits of integration, but also on the path that connects these points and along which the integration of f is carried out. Thus, the information required to perform a line integral of a given function comprises the initial and final points and the path connecting them. In this respect, line integrals represent a significant conceptual departure from double and triple integrals.

In this chapter, we will first motivate the mathematical structure of a large class of line integrals, using the calculation of work in classical mechanics, and work through several examples to demonstrate through explicit calculations that the value of a line integral can depend on the path connecting two points. We will then examine some general properties of line integrals and determine the criterion for the value of a line integral to be independent of the integration path between the limits of integration. Path-dependent and path-independent line integrals each have important applications in several areas of physics, including mechanics, electromagnetic theory, and thermodynamics, which we mention at various places in this chapter.

5.1 Work in Classical Mechanics


A standard calculation in classical mechanics is the work W done by a force $\mathbf{F}$ along a path between two points a and b. If the force has a constant magnitude and direction and acts at an angle $\theta$ along a path of length r, as shown in Fig. 5.1, the work W done by the force is $W = \mathbf{F}\cdot\mathbf{r}$. Similarly, if $\mathbf{F}$ acts only over an infinitesimal distance $d\mathbf{r}$, the corresponding work dW done is
$$dW = \mathbf{F}\cdot d\mathbf{r} . \tag{5.1}$$

Fig. 5.1: A force $\mathbf{F}$ acting at an angle $\theta$ along a displacement $\mathbf{r}$.

Suppose now that the force is a function of position. We consider this situation in one dimension first: F = F(x). The calculation of the work between two points x = a and x = b proceeds according to the construction in Fig. 5.2. The interval (a, b) is first divided into N subintervals of length $\Delta x = (b - a)/N$. The force acting within each of these subintervals is taken to be the constant value at the left endpoint of that interval. Thus, we obtain
$$F(a)\,\Delta x + F(a + \Delta x)\,\Delta x + \cdots + F(b - \Delta x)\,\Delta x \tag{5.2}$$
as an approximation of the work done over the interval. As $\Delta x \to 0$, this approximation becomes increasingly accurate and the work done approaches the shaded region in the right panel of the figure. Referring to Sec. 2, the procedure depicted in Fig. 5.2 is the same as that used for the Riemann sum construction of the integral of a function, so we conclude that
$$W = \int_a^b F\,dx . \tag{5.3}$$
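The left-endpoint construction of Eq. (5.2) translates directly into a few lines of code. The force law below, F(x) = 3x^2 on the interval from 0 to 2, is a hypothetical choice made only so that the exact work (8) is known; the function name is likewise an assumption.

```python
def work_left_riemann(force, a, b, n):
    """Left-endpoint sum F(a)dx + F(a+dx)dx + ... + F(b-dx)dx of Eq. (5.2)."""
    dx = (b - a) / n
    return sum(force(a + k*dx) for k in range(n)) * dx

# Hypothetical force law chosen for illustration: F(x) = 3x^2,
# for which the exact work from x = 0 to x = 2 is 2^3 = 8.
force = lambda x: 3.0*x*x

for n in (10, 100, 1000, 10000):
    # The sum approaches the exact work as the subintervals shrink
    print(n, work_left_riemann(force, 0.0, 2.0, n))
```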

Similar considerations apply for paths in two and three dimensions. In this chapter, we will consider the two-dimensional case. A force $\mathbf{F}$ in two dimensions is a vector field:
$$\mathbf{F}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} , \tag{5.4}$$
where P and Q are functions of x and y, and $\mathbf{i}$ and $\mathbf{j}$ are the unit vectors along the x- and y-axes. This expression indicates that every point (x, y) is assigned a vector $\mathbf{F}$ whose x-component is given by $P\,\mathbf{i}$ and whose y-component is $Q\,\mathbf{j}$. The path along which $\mathbf{F}$ acts is a curve $\mathcal{P}$ in the x-y plane between an initial point i and a final point f, as shown in Fig. 5.3. The work done along this path is calculated as in Eq. (5.1) by first considering the incremental work dW done by the force along a distance $d\mathbf{r}$: $dW = \mathbf{F}\cdot d\mathbf{r}$, where $d\mathbf{r}$ is the incremental displacement along the path. Then, with the position vector given by
$$\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} , \tag{5.5}$$
we have that the incremental change along the path is
$$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} , \tag{5.6}$$
so the work done along $\mathcal{P}$ is
$$W = \int_{\mathcal{P}} \mathbf{F}\cdot d\mathbf{r} = \int_{\mathcal{P}} \big[P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}\big]\cdot(dx\,\mathbf{i} + dy\,\mathbf{j}) = \int_{\mathcal{P}} \big[P(x, y)\,dx + Q(x, y)\,dy\big] . \tag{5.7}$$

Fig. 5.2: (Left panel) Construction used to calculate the work done from x = a to x = b by a position-dependent force. The shaded area corresponds to the work calculated by regarding the force as constant over each subinterval. (Right panel) The corresponding calculation for infinitesimal subintervals, which is seen to represent the area bounded by F, the x-axis, and the lines x = a and x = b.

Fig. 5.3: A path in a vector field between an initial point i and the final point f.
This is an example of a line integral. In addition to this example from classical mechanics, line integrals appear in thermodynamics and in electricity and magnetism. In thermodynamics, $\mathcal{P}$ represents a process between initial and final values of thermodynamic variables (e.g. pressure, temperature, volume). The line integral of such variables yields quantities such as heat flow and the work done during the process. In electricity and magnetism, $\mathcal{P}$ is a path in space, and line integrals represent quantities such as the electromotive force. In all of these cases the mathematical form of a line integral is
$$\int_{\mathcal{P}} \big[f(x, y)\,dx + g(x, y)\,dy\big] , \tag{5.8}$$
where f and g are any functions that can be integrated and $\mathcal{P}$ is the path connecting the initial and final points. As we stressed in the introduction, specifying the integration path $\mathcal{P}$ is as important as specifying the initial and final points. The path provides a functional relationship between x and y and allows the integrals to be evaluated; otherwise the variable y in the term f(x, y) dx and the variable x in the term g(x, y) dy appear superfluous. Additionally, the value of the line integral may depend explicitly on the path, so specifying only the initial and final points is not necessarily sufficient to obtain a unique value. The following example illustrates these ideas.

Example. Consider the line integral
$$\int_{\mathcal{P}} xy\,dx , \tag{5.9}$$

which is of the general form in Eq. (5.8) with f = xy and g = 0. In the context of the calculation of work, this corresponds to a force $\mathbf{F}(x, y) = xy\,\mathbf{i}$. We will evaluate this integral over the three paths shown in Fig. 5.4, each of which has its initial point at the origin (0, 0) and its final point at (1, 1). We first consider $\mathcal{P}_1$. This path is composed of two straight segments: $(0, 0) \to (1, 0)$ and $(1, 0) \to (1, 1)$. The first segment lies along the x-axis, so we have that
$$y = 0 , \qquad 0 \le x \le 1 . \tag{5.10}$$
Hence, since y = 0, the integrand vanishes, so the contribution from this segment also vanishes. The second segment is parallel to the y-axis, so
$$x = 1 , \qquad dx = 0 , \qquad 0 \le y \le 1 . \tag{5.11}$$
Since dx = 0, the contribution along this segment also vanishes. Therefore, the integral along $\mathcal{P}_1$ vanishes:
$$\int_{\mathcal{P}_1} xy\,dx = 0 . \tag{5.12}$$

Fig. 5.4: The three paths, labelled $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$, between (0, 0) and (1, 1) used for evaluating the line integral in Eq. (5.9).


The path $\mathcal{P}_2$ connects (0, 0) to (1, 1) with the straight line y = x. Thus, along this path, the integrand can be expressed entirely as a function of x: $xy = x^2$, with $0 \le x \le 1$. The line integral is thereby evaluated as
$$\int_{\mathcal{P}_2} xy\,dx = \int_0^1 x^2\,dx = \frac{1}{3}x^3\Big|_0^1 = \frac{1}{3} . \tag{5.13}$$
Finally, the path $\mathcal{P}_3$ connects (0, 0) to (1, 1) with the parabola $y = x^2$. Along this path, the integrand can be written as $xy = x^3$, with $0 \le x \le 1$, and the line integral becomes
$$\int_{\mathcal{P}_3} xy\,dx = \int_0^1 x^3\,dx = \frac{1}{4}x^4\Big|_0^1 = \frac{1}{4} . \tag{5.14}$$

We have thus obtained three different values for the line integral in Eq. (5.9) along the three paths shown in Fig. 5.4. This result can be understood by interpreting this integral as the work done by the force $\mathbf{F} = xy\,\mathbf{i}$ over the three paths:
$$\int_{\mathcal{P}_i} \mathbf{F}\cdot d\mathbf{r} = \int_{\mathcal{P}_i} xy\,dx , \tag{5.15}$$
for i = 1, 2, 3. This vector field is shown in Fig. 5.5 superimposed on the paths $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$. We can see immediately from this diagram that the work done along $\mathcal{P}_1$ must vanish because $\mathbf{F}$ vanishes along the x-axis (the first segment of $\mathcal{P}_1$), and acts in the normal direction to the second segment of this path. By contrast, the line integrals along $\mathcal{P}_2$ and $\mathcal{P}_3$ are both necessarily positive because $\mathbf{F}$ has a positive component along the direction of each path, producing positive work.

Fig. 5.5: The vector field $\mathbf{F} = xy\,\mathbf{i}$ and the paths $\mathcal{P}_1$, $\mathcal{P}_2$, and $\mathcal{P}_3$ shown in Fig. 5.4 used for the evaluation of the line integral in Eq. (5.9).

This example illustrates two fundamental points about line integrals. (i) The value of a line integral may depend on the path over which it is evaluated. There are physical manifestations of this property that have important consequences in mechanics, thermodynamics, and electricity and magnetism. (ii) The path between given initial and final points establishes a relationship between the independent variables. Once this information is incorporated into the line integral, the evaluation reduces to that of an ordinary integral (Sec. 2).
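The path dependence found in this example can be reproduced numerically by parametrizing each path and summing f dx + g dy over small steps. The sketch below is one possible discretization, not a prescribed method: the parameter t running from 0 to 1, the function names, and the midpoint sampling of the integrand are all assumptions made for illustration.

```python
def line_integral(f, g, path, n=2000):
    """Approximate int (f dx + g dy) along path(t) for 0 <= t <= 1."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = path((k - 0.5) / n)      # point on the path used to sample f and g
        total += f(xm, ym)*(x1 - x0) + g(xm, ym)*(y1 - y0)
        x0, y0 = x1, y1
    return total

def p1(t):
    """(0,0) -> (1,0) along the x-axis, then (1,0) -> (1,1) parallel to the y-axis."""
    return (2.0*t, 0.0) if t < 0.5 else (1.0, 2.0*t - 1.0)

def p2(t):
    """Straight line y = x."""
    return (t, t)

def p3(t):
    """Parabola y = x^2."""
    return (t, t*t)

f = lambda x, y: x*y      # integrand of Eq. (5.9)
g = lambda x, y: 0.0

for name, path in (("P1", p1), ("P2", p2), ("P3", p3)):
    print(name, line_integral(f, g, path))   # compare with 0, 1/3, and 1/4
```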

Example. Consider the line integral
$$\int_{\mathcal{P}} (xy^2\,dx + x^2 y\,dy) \tag{5.16}$$
evaluated along the three paths in Fig. 5.4. Along the first segment of $\mathcal{P}_1$, y = 0, and therefore dy = 0, so there is no contribution from either term in the integral. Along the second segment x = 1, dx = 0, and $0 \le y \le 1$. Thus, only the second term in the integral makes a contribution to the integral, and we obtain
$$\int_{\mathcal{P}_1} (xy^2\,dx + x^2 y\,dy) = \int_0^1 y\,dy = \frac{1}{2}y^2\Big|_0^1 = \frac{1}{2} . \tag{5.17}$$
Along $\mathcal{P}_2$, y = x, so dy = dx. Thus, both terms in the integrand can be written in terms of either x or y alone:
$$\int_{\mathcal{P}_2} (xy^2\,dx + x^2 y\,dy) = 2\int_0^1 x^3\,dx = 2\cdot\frac{1}{4}x^4\Big|_0^1 = \frac{1}{2} , \tag{5.18}$$
which is the same value obtained in Eq. (5.17). Finally, along $\mathcal{P}_3$, $y = x^2$, so $dy = 2x\,dx$. We can express the integrand in terms of x alone to obtain
$$\int_{\mathcal{P}_3} (xy^2\,dx + x^2 y\,dy) = \int_0^1 (x^5\,dx + 2x^5\,dx) = 3\int_0^1 x^5\,dx = 3\cdot\frac{1}{6}x^6\Big|_0^1 = \frac{1}{2} , \tag{5.19}$$
which is the same as that obtained for the other two paths. A natural question arises: Is this a coincidence, or does this integral always have the same value when evaluated over different paths between fixed initial and final points? The results we have obtained in this example are certainly suggestive, but to address this question in a mathematically concise framework, we must derive some additional properties of line integrals. This is the subject of the next two sections.
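The question can be probed a little further by trying more paths. The sketch below evaluates the integral of Eq. (5.16) along the family y = x^k from (0, 0) to (1, 1), reducing it to an ordinary integral in x as in the text. The cases k = 1 and k = 2 are the paths used above; k = 3 and k = 5 are additional paths introduced here only for illustration, and the function name and midpoint rule are assumptions. Agreement of the values is numerical evidence, not a proof of path independence.

```python
def integral_along_power_path(k, n=2000):
    """int (x y^2 dx + x^2 y dy) along y = x**k from (0,0) to (1,1), midpoint rule in x."""
    dx = 1.0 / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * dx
        y = x**k
        dydx = k * x**(k - 1)              # dy = (dy/dx) dx along the path
        total += (x*y*y + x*x*y*dydx) * dx
    return total

for k in (1, 2, 3, 5):
    print(k, integral_along_power_path(k))   # each value can be compared with 1/2
```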

5.2 Line Integrals over Closed Curves


An important class of line integrals is that for which the path of integration forms a simple closed curve, i.e. a path that returns to the initial point but does not cross itself. Such integrals find many applications in thermodynamics, where they are called cycles, and in electricity and magnetism, where they form the mathematical expression of Ampère's law, and are referred to as loop integrals.


In this section, we will re-express the question of the path-dependence of a line integral in terms of the value of that integral around a closed curve. We first determine the effect that reversing the sense of the integration path has on the value of a line integral. Consider a line integral over a path $\mathcal{P}$ between an initial point i and a final point f, as shown in Fig. 5.6(a):
$$\int_{\mathcal{P}} (f\,dx + g\,dy) . \tag{5.20}$$
Suppose that this path is reversed, so that the new initial point is f and the new final point is i, as shown in Fig. 5.6(b). We signify this path by $-\mathcal{P}$ and write the corresponding line integral as
$$\int_{-\mathcal{P}} (f\,dx + g\,dy) . \tag{5.21}$$
The relationship between the values of these two line integrals is straightforward to understand. As the examples in the preceding section show, the evaluation of a line integral always reduces to an ordinary integral. Thus, reversing the integration path in a line integral has the effect of interchanging the upper and lower limits of integration. According to the Fundamental Theorem of Calculus, this changes the sign of the integral [Eq. (2.15)]. Thus, the line integrals in Eqs. (5.20) and (5.21) have the same absolute value, but opposite signs:
$$\int_{-\mathcal{P}} (f\,dx + g\,dy) = -\int_{\mathcal{P}} (f\,dx + g\,dy) . \tag{5.22}$$

Fig. 5.6: (a) The path $\mathcal{P}$ between points i and f, and (b) the reverse path $-\mathcal{P}$.

Consider now a line integral over a closed curve C (Fig. 5.7). Such integrals, often called loop integrals, have a special notation to indicate that the integration path is a closed curve:
$$\oint_C (f\,dx + g\,dy) . \tag{5.23}$$
Choose any two distinct points A and B on C and denote by $\mathcal{P}_1$ the path on C from A to B and by $\mathcal{P}_2$ the path that returns from B to A along C. The integral over C can be expressed as the sum of line integrals over $\mathcal{P}_1$ and $\mathcal{P}_2$:
$$\oint_C (f\,dx + g\,dy) = \int_{\mathcal{P}_1} (f\,dx + g\,dy) + \int_{\mathcal{P}_2} (f\,dx + g\,dy) . \tag{5.24}$$

Suppose that the value of the line integral in Eq. (5.20) is independent of the path $\mathcal{P}$ for any initial and final points. The closed curve C in Fig. 5.7 defines two paths from A to B: the path $\mathcal{P}_1$ and the reverse of the path $\mathcal{P}_2$. Path-independence requires that the line integrals over $\mathcal{P}_1$ and $-\mathcal{P}_2$ are equal:
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) = \int_{-\mathcal{P}_2} (f\,dx + g\,dy) . \tag{5.25}$$
By invoking Eq. (5.22), we can write this equation as
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) = -\int_{\mathcal{P}_2} (f\,dx + g\,dy) ,$$
or
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) + \int_{\mathcal{P}_2} (f\,dx + g\,dy) = \oint_C (f\,dx + g\,dy) = 0 . \tag{5.26}$$

This shows that, if the value of a line integral is independent of the path between any initial and final points, the loop integral vanishes for any closed curve. The converse of this statement is also true. If a loop integral vanishes for any closed curve C, then we can choose any two points A and B on C as initial and final points of line integrals along the corresponding paths $\mathcal{P}_1$ and $\mathcal{P}_2$. Then, by reversing the steps leading to Eq. (5.26), we find that
$$\int_{\mathcal{P}_1} (f\,dx + g\,dy) = \int_{-\mathcal{P}_2} (f\,dx + g\,dy) , \tag{5.27}$$
which implies path independence. Thus, we have shown that the path independence of a line integral is both necessary [Eq. (5.27)] and sufficient [Eq. (5.26)] for the loop integral to vanish over any closed curve. In other words, these two properties are equivalent:

Fig. 5.7: A closed curve C in the x-y plane. $\mathcal{P}_1$ is a path between any two points A and B on C and $\mathcal{P}_2$ is the path from B to A that completes the loop. The closed curve is the sum of these two paths: $C = \mathcal{P}_1 + \mathcal{P}_2$.

A line integral
$$\int_{\mathcal{P}} (f\,dx + g\,dy)$$
is independent of the path $\mathcal{P}$ between any two points i and f if and only if
$$\oint_C (f\,dx + g\,dy) = 0$$
for any closed curve C.

This result provides an alternative statement of the fact that line integrals fall into two classes: (i) path-dependent, with typically non-vanishing values over closed curves, and (ii) path-independent, with vanishing values over closed curves. Both types of line integral are important in applications to physics, and understanding the physical circumstances that lead to one type of integral or the other is a central theme in several disciplines. We conclude this section with two examples.

Example. Consider the loop integral
$$\oint_C y\,dx , \tag{5.28}$$
where C is a circle of radius a centered at (1, 1), as shown in Fig. 5.8. We represent the circle as follows:
$$x = 1 - a\cos\theta , \qquad y = 1 + a\sin\theta , \tag{5.29}$$
where $0 \le \theta < 2\pi$. This parametrization sweeps through the circle in a clockwise direction beginning at $(1 - a, 1)$.

Fig. 5.8: The circle of radius a centered at (1, 1), showing the definition of $\theta$ for carrying out a loop integral over this curve.

The integral in Eq. (5.28) can be expressed as an integral over $\theta$ by using Eq. (5.29) to transform the integrand, the integration element, and the limits of integration. The integrand y is given by the second of Eqs. (5.29), an application of the chain rule to $x(\theta)$ yields
$$dx = a\sin\theta\,d\theta , \tag{5.30}$$

and the limits of integration are $0 \le \theta < 2\pi$. The original integral thereby becomes
$$\oint_C y\,dx = \int_0^{2\pi} (1 + a\sin\theta)\,a\sin\theta\,d\theta = a\underbrace{\int_0^{2\pi}\sin\theta\,d\theta}_{=\,0} + a^2\underbrace{\int_0^{2\pi}\sin^2\theta\,d\theta}_{=\,\pi} = \pi a^2 . \tag{5.31}$$

Fig. 5.9: The evaluation of the loop integral in Eq. (5.28) around an arbitrary closed curve C, showing (a) the separation of C into upper and lower paths $\mathcal{P}_1$ and $\mathcal{P}_2$, (b) and (c) the evaluation of the integral along these paths, and (d) the cumulative effect of the loop integral.

This is readily identied as the area of the circle enclosed by C. This result can be generalized to any closed curve in the x-y plane by following the steps shown in Fig. 5.9. We rst identify the points A = (xA , yA ) and B = (xB , yB ) that allow C to be written as the sum of upper and lower paths P2 and P2 , which can be represented as functions y1 (x) and y2 (x), respectively. The integral

Line Integrals
along P1 is Z Z
xB xA

69

y dx =
P1

y1 (x) dx .

(5.32)

This is an ordinary integral whose value is represented by the area bounded by $y_1(x)$, the x-axis, and the lines $x = x_A$ and $x = x_B$, as shown in Fig. 5.9(b). The loop is completed by integrating $y_2(x)$ from $x_B$ to $x_A$. The integral

$$\int_{x_A}^{x_B} y_2(x)\,dx \tag{5.33}$$

is represented by the area shown in Fig. 5.9(c). But the integral we need to complete $C$ has the upper and lower limits interchanged, so its value corresponds to the negative of this quantity. Hence, the loop integral is calculated as

$$\oint_C y\,dx = \int_{x_A}^{x_B} y_1(x)\,dx - \int_{x_A}^{x_B} y_2(x)\,dx , \tag{5.34}$$

which is represented in Fig. 5.9(d). The integral over $y_2(x)$ cancels the contribution from the integral over $y_1(x)$ that represents the area below $P_2$, leaving only the area enclosed by $C$. We have thereby shown that

$$\oint_C y\,dx = A , \tag{5.35}$$

where $A$ is the area enclosed by $C$. We conclude this section with an example of a loop integral that does vanish.

Example. Consider the integral

$$\oint_C (xy^2\,dx + x^2 y\,dy) , \tag{5.36}$$

where $C$ is the closed curve in Fig. 5.10. The integrand is the same as that in the second example in Sec. 5.1. The closed curve is composed of four straight segments, so we will evaluate the loop integral by considering each segment separately. Beginning at $(-1, -1)$, the segments are characterized as follows:

$$\begin{array}{llll}
(-1,-1) \to (-1,1): & x = -1 & dx = 0 & -1 \le y \le 1 \\
(-1,1) \to (1,1): & y = 1 & dy = 0 & -1 \le x \le 1 \\
(1,1) \to (1,-1): & x = 1 & dx = 0 & -1 \le y \le 1 \\
(1,-1) \to (-1,-1): & y = -1 & dy = 0 & -1 \le x \le 1
\end{array}$$

Fig. 5.10: The closed contour for the integral in Eq. (5.36), a square with corners at $(-1, -1)$, $(-1, 1)$, $(1, 1)$, and $(1, -1)$.

If $dx = 0$, the first term in Eq. (5.36) makes no contribution, while if $dy = 0$, the second term makes no contribution. The integral can therefore be written as (note the upper and lower limits of each integral!)

$$\int_{-1}^{1} y\,dy + \int_{-1}^{1} x\,dx + \int_{1}^{-1} y\,dy + \int_{1}^{-1} x\,dx = 0 . \tag{5.37}$$
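The two loop integrals of this section can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `loop_integral`, `circle`, and `square` are hypothetical, and the integral is approximated by a midpoint rule on short chords of the curve. It should return a value close to $\pi a^2$ for Eq. (5.28) and a value close to zero for Eq. (5.36).

```python
import math

def loop_integral(f, g, path, n=20000):
    """Midpoint-rule estimate of the closed line integral of f dx + g dy.

    `path(t)` maps t in [0, 1] to a point (x, y) with path(0) == path(1).
    """
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)  # midpoint of the chord
        total += f(xm, ym) * (x1 - x0) + g(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total

def circle(a):
    # Clockwise circle of radius a about (1, 1), starting at (1 - a, 1), as in Eq. (5.29).
    return lambda t: (1 - a * math.cos(2 * math.pi * t),
                      1 + a * math.sin(2 * math.pi * t))

def square(t):
    # Square of Fig. 5.10, traversed from (-1, -1) via (-1, 1), (1, 1), (1, -1).
    corners = [(-1, -1), (-1, 1), (1, 1), (1, -1), (-1, -1)]
    s = 4 * t
    k = min(int(s), 3)   # index of the current side
    u = s - k            # fraction of that side already covered
    (xa, ya), (xb, yb) = corners[k], corners[k + 1]
    return (xa + u * (xb - xa), ya + u * (yb - ya))

a = 0.5  # illustrative radius
area_estimate = loop_integral(lambda x, y: y, lambda x, y: 0.0, circle(a))
exact_loop = loop_integral(lambda x, y: x * y**2, lambda x, y: x**2 * y, square)
print(area_estimate, math.pi * a**2)  # the two numbers should nearly agree
print(exact_loop)                     # should be close to zero
```

Reversing the direction of traversal of the circle would change the sign of the first result, which is why the clockwise orientation of Eq. (5.29) matters.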

5.3 Exact and Inexact Differentials


Although the results of the preceding section allow us to re-express the path-independence of a line integral in terms of a loop integral, we are no closer to determining a priori whether or not a given line integral is independent of path between given initial and final points. In this section, we will derive a condition that allows us to address this question without having to perform any integration whatsoever. Consider the line integral

$$\int_P (f\,dx + g\,dy) \tag{5.38}$$

over a path $P$ between an initial point $(x_i, y_i)$ and a final point $(x_f, y_f)$. The path establishes a relation between $x$ and $y$ that we represent as $y(x)$. This enables us to write the line integral as an integral over $x$ only by following the procedure in Sec. 2.2. We have

$$\int_P f(x, y)\,dx = \int_{x_i}^{x_f} f[x, y(x)]\,dx , \tag{5.39}$$

$$\int_P g(x, y)\,dy = \int_{x_i}^{x_f} g[x, y(x)]\,\frac{dy}{dx}\,dx . \tag{5.40}$$

Thus,

$$\int_P (f\,dx + g\,dy) = \int_{x_i}^{x_f} \left\{ f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} \right\} dx . \tag{5.41}$$

The right-hand side of this equation is an ordinary integral over $x_i \le x \le x_f$. Accordingly, if the line integral is path-independent, we can use the Fundamental Theorem of Calculus to write

$$\int_{x_i}^{x_f} \left\{ f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} \right\} dx = F(x_f) - F(x_i) , \tag{5.42}$$

where

$$\frac{dF}{dx} = f[x, y(x)] + g[x, y(x)]\,\frac{dy}{dx} . \tag{5.43}$$

By writing $F$ as $F[x, y(x)]$, we also have

$$\frac{dF}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,\frac{dy}{dx} , \tag{5.44}$$

from which we identify

$$\frac{\partial F}{\partial x} = f , \qquad \frac{\partial F}{\partial y} = g . \tag{5.45}$$

The quantity $F$ is called the potential. On account of Eq. (5.45) we can write the differential of $F$ as

$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy = f\,dx + g\,dy , \tag{5.46}$$

in which case the line integral of the quantity on the right-hand side is independent of the path. This quantity is called an exact differential. Otherwise, the quantity $f\,dx + g\,dy$ is called an inexact differential and the corresponding line integral is path-dependent. Hence, a line integral of an exact differential can be represented as

$$\int_P (f\,dx + g\,dy) = \int_P dF , \tag{5.47}$$

in which case we have that

$$\int_P (f\,dx + g\,dy) = F[x_f, y(x_f)] - F[x_i, y(x_i)] = F(x_f, y_f) - F(x_i, y_i) . \tag{5.48}$$

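In terms of our original formulation in Sec. 5.1, this equation states that the work done between an initial point $i$ and a final point $f$ is equal to the change in the potential $F$. Equation (5.45) provides a method of testing for the exactness of a differential. By differentiating the first of these equations with respect to $y$,

$$\frac{\partial}{\partial y}\left(\frac{\partial F}{\partial x}\right) = \frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial f}{\partial y} , \tag{5.49}$$

the second with respect to $x$,

$$\frac{\partial}{\partial x}\left(\frac{\partial F}{\partial y}\right) = \frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial g}{\partial x} , \tag{5.50}$$

and equating the mixed second partial derivatives of $F$, $F_{yx} = F_{xy}$, we obtain

$$\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x} . \tag{5.51}$$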

The discussion leading to this equation shows that it is a necessary condition for a differential to be exact. The procedure described in Problem 5, Problem Set 5 shows that this is also a sufficient condition for exactness, thus demonstrating the equivalence between Eq. (5.51) and the exactness of a differential.

Example. Consider the line integral

$$\int_P y\,dx , \tag{5.52}$$

which was discussed in an earlier Example in this section. In the notation of Eq. (5.8), $f = y$ and $g = 0$. Thus,

$$\frac{\partial f}{\partial y} = 1 , \qquad \frac{\partial g}{\partial x} = 0 . \tag{5.53}$$

Since these two partial derivatives are unequal, we conclude from Eq. (5.51) that $y\,dx$ is an inexact differential, so the line integral in Eq. (5.52) is path-dependent. This result is to be expected in view of Eq. (5.35). As a second example, we consider the line integral

$$\int_P (xy^2\,dx + x^2 y\,dy) . \tag{5.54}$$

This integral has been discussed in Secs. 5.1 and 5.2. In the notation of Eq. (5.8), $f = xy^2$ and $g = x^2 y$, and we find

$$\frac{\partial f}{\partial y} = 2xy , \qquad \frac{\partial g}{\partial x} = 2xy . \tag{5.55}$$

The equality of these partial derivatives means that $xy^2\,dx + x^2 y\,dy$ is an exact differential, so the line integral in Eq. (5.54) is path-independent. This conclusion confirms our expectations based on the results already obtained for this line integral.
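The test in Eq. (5.51) can be tried numerically on both examples. The sketch below is an illustration only: the function name `is_exact`, the sample points, the step `h`, and the tolerance are all assumptions. It compares central-difference estimates of $\partial f/\partial y$ and $\partial g/\partial x$ at a few points. Agreement at sample points is only evidence of exactness, whereas a single disagreement is enough to show that a differential is inexact.

```python
def is_exact(f, g, points, h=1e-5, tol=1e-6):
    """Return True if df/dy and dg/dx agree (central differences) at every sample point."""
    for x, y in points:
        df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
        dg_dx = (g(x + h, y) - g(x - h, y)) / (2 * h)
        if abs(df_dy - dg_dx) > tol:
            return False
    return True

# Arbitrary sample points chosen for illustration.
points = [(0.3, -1.2), (1.0, 2.0), (-0.7, 0.4)]

# Eq. (5.52): f = y, g = 0.
y_dx_exact = is_exact(lambda x, y: y, lambda x, y: 0.0, points)

# Eq. (5.54): f = x y^2, g = x^2 y.
second_exact = is_exact(lambda x, y: x * y**2, lambda x, y: x**2 * y, points)

print(y_dx_exact, second_exact)
```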

The exactness of this differential implies that there is an underlying potential function $F$ such that

$$\frac{\partial F}{\partial x} = xy^2 , \qquad \frac{\partial F}{\partial y} = x^2 y . \tag{5.56}$$

Integrating the first of these with respect to $x$ yields

$$F(x, y) = \tfrac{1}{2} x^2 y^2 + h(y) , \tag{5.57}$$

where $h(y)$ is an arbitrary function of $y$ (analogous to the constants of integration obtained when integrating functions of one variable). Differentiating both sides of this equation with respect to $y$,

$$\frac{\partial F}{\partial y} = x^2 y + h'(y) , \tag{5.58}$$

and requiring that this result be consistent with the second of Eqs. (5.56) necessitates setting $h'(y) = 0$, so $h$ is a constant. Thus,

$$F(x, y) = \tfrac{1}{2} x^2 y^2 + \text{constant} . \tag{5.59}$$

The constant term disappears upon integration:

$$\int_P (xy^2\,dx + x^2 y\,dy) = \tfrac{1}{2}\left(x_f^2 y_f^2 - x_i^2 y_i^2\right) . \tag{5.60}$$
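Equation (5.60) can be illustrated by integrating along two different paths between the same endpoints and comparing with the change in the potential. This is a rough numerical sketch under stated assumptions: the endpoints $(0, 0)$ and $(1, 2)$, the two paths, and the helper names `line_integral` and `potential` are chosen here for illustration and do not appear in the notes.

```python
def line_integral(f, g, path, n=2000):
    """Midpoint-rule estimate of the line integral of f dx + g dy along path(t), 0 <= t <= 1."""
    total = 0.0
    x0, y0 = path(0.0)
    for k in range(1, n + 1):
        x1, y1 = path(k / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += f(xm, ym) * (x1 - x0) + g(xm, ym) * (y1 - y0)
        x0, y0 = x1, y1
    return total

def potential(x, y):
    # F(x, y) from Eq. (5.59), with the constant set to zero.
    return 0.5 * x**2 * y**2

f = lambda x, y: x * y**2
g = lambda x, y: x**2 * y

# Two illustrative paths from (0, 0) to (1, 2).
straight = line_integral(f, g, lambda t: (t, 2 * t))
parabola = line_integral(f, g, lambda t: (t, 2 * t**2))
predicted = potential(1.0, 2.0) - potential(0.0, 0.0)

print(straight, parabola, predicted)  # all three should be close to one another
```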

5.4 Arc Length


A line integral with a mathematical structure different from that in Eq. (5.8) is obtained by calculating the distance travelled by a particle moving along a trajectory. The trajectory is described by a curve $[x(t), y(t)]$, where $t$ is the time between an initial time $t_i$ and a final time $t_f$. The distance $ds$ travelled by the particle in time $dt$, when $x$ changes by $dx$ and $y$ changes by $dy$, is given by the relation

$$ds^2 = dx^2 + dy^2 , \tag{5.61}$$

or,

$$ds = (dx^2 + dy^2)^{1/2} . \tag{5.62}$$

Thus, the total distance $S$ travelled by the particle, called the arc length of the path, is

$$S = \int_i^f ds = \int_i^f \sqrt{dx^2 + dy^2} . \tag{5.63}$$

The integrand on the right-hand side can be written in a more physically suggestive form as

$$\sqrt{dx^2 + dy^2} = dt\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right]^{1/2} = dt\,\sqrt{v_x^2 + v_y^2} = v\,dt , \tag{5.64}$$

where $v_x$ and $v_y$ are the $x$ and $y$ components of the instantaneous velocity of the particle and $v$ is its instantaneous speed. The arc length along the trajectory can thereby be represented as

$$S = \int_{t_i}^{t_f} v(t)\,dt . \tag{5.65}$$

The general form of the arc length,

$$\int_P \sqrt{dx^2 + dy^2} , \tag{5.66}$$

is used to represent the distance along any curve $P$.

Example. We will illustrate the methodology of computing the arc length by considering $y = \cosh x$ between $x = 0$ and $x = a$. For $y = \cosh x$, we have

$$dy = \sinh x\,dx , \tag{5.67}$$

so the integrand in Eq. (5.66) becomes

$$\sqrt{dx^2 + dy^2} = \sqrt{dx^2 + \sinh^2 x\,dx^2} = dx\,\sqrt{1 + \sinh^2 x} = \cosh x\,dx . \tag{5.68}$$

Thus,

$$\int_P \sqrt{dx^2 + dy^2} = \int_0^a \cosh x\,dx = \sinh x\,\Big|_0^a = \sinh a . \tag{5.69}$$
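The result in Eq. (5.69) can be compared with a direct polygonal approximation of Eq. (5.66), in which the curve is replaced by many short chords of length $\sqrt{\Delta x^2 + \Delta y^2}$. The following is a minimal sketch; the function name `arc_length`, the number of chords, and the choice $a = 2$ are assumptions made for illustration.

```python
import math

def arc_length(y, x0, x1, n=10000):
    """Approximate the arc length of the graph of y(x) on [x0, x1] by summing chord lengths."""
    total = 0.0
    xa, ya = x0, y(x0)
    for k in range(1, n + 1):
        xb = x0 + (x1 - x0) * k / n
        yb = y(xb)
        total += math.hypot(xb - xa, yb - ya)  # sqrt(dx^2 + dy^2) for one chord
        xa, ya = xb, yb
    return total

a = 2.0  # illustrative upper limit
estimate = arc_length(math.cosh, 0.0, a)
print(estimate, math.sinh(a))  # the two numbers should nearly agree
```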

5.5 Summary
We can summarize the main results we have obtained on line integrals by noting that the following statements are equivalent in that any one implies any other. If any one statement is false, all others are false as well.

1. $f\,dx + g\,dy$ is an exact differential;

2. $\int_P (f\,dx + g\,dy)$ is independent of the path $P$ between fixed endpoints;

3. $\oint_C (f\,dx + g\,dy) = 0$ for any closed curve $C$;

4. $\dfrac{\partial f}{\partial y} = \dfrac{\partial g}{\partial x}$;

5. There is a potential function $F$ such that $F_x = f$ and $F_y = g$, so $dF = f\,dx + g\,dy$;

6. $\int_P (f\,dx + g\,dy) = F(x_f, y_f) - F(x_i, y_i)$ for any initial point $(x_i, y_i)$ and final point $(x_f, y_f)$.

Chapter 6 The Divergence and the Divergence Theorem


There are two basic types of function that describe physical quantities in higher spatial dimensions: scalar functions of several variables and vector fields. Scalar functions assign a number to every point in a given region, while vector fields assign a vector, i.e. a magnitude and direction, to every point in a region. For example, temperature is a scalar function, but the wind direction and speed on a weather map is an example of a vector field. In Chapter 1, we showed that the gradient of a scalar function is a vector field that characterizes the magnitude and direction of the maximum rate of change of that function. By their very nature, vector fields are more difficult to represent and visualize than scalar functions, so operations that characterize properties of vector fields are especially useful. In this chapter, we introduce the divergence of a vector field, a suggestively named quantity that quantifies the extent to which the vector field points toward or away (i.e. diverges) from a point. The divergence is a central quantity in the mathematical formulations of continuum theories such as electromagnetism, fluid mechanics, and elasticity.

6.1 Definition of the Divergence


We will work initially in two spatial dimensions; the results obtained can be taken over to three dimensions with minimal procedural changes. Consider the construction in Fig. 6.1, which shows a rectangular region in the x-y plane of area $\Delta x\,\Delta y$. A vector field that crosses the boundary of this region is also indicated. We now ask the question: what is the flux of the vector field across that region? The flux of a vector field across a boundary is defined as the part of the vector field that is normal to that surface. The component of the vector parallel to the boundary does not contribute to the flux across the boundary.

Fig. 6.1: The region of area $\Delta x\,\Delta y$ used to calculate the divergence of a vector field that crosses the boundary of that region. The outward unit normal of each face of this region is also indicated.

If we denote this vector field by

$$\mathbf{V} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} , \tag{6.1}$$

and the outward unit normal of the surface by $\mathbf{n}$, the total flux across the boundary of the region can be expressed as a line integral of the dot product $\mathbf{V}\cdot\mathbf{n}$ over the boundary:

$$\int \mathbf{V}\cdot\mathbf{n}\,ds . \tag{6.2}$$

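Note the sign convention used here: positive for flux out of the region and negative for flux into the region. We can now calculate the contribution to this quantity from each of the four sides of the rectangular region. Beginning with the part of the boundary contained between $(x, y)$ and $(x + \Delta x, y)$ and moving counterclockwise, we obtain the following expressions:

$$(\mathbf{V}\cdot\mathbf{n})\,\Delta x = [P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}]\cdot(-\mathbf{j})\,\Delta x = -Q(x, y)\,\Delta x , \tag{6.3}$$

$$(\mathbf{V}\cdot\mathbf{n})\,\Delta y = [P(x + \Delta x, y)\,\mathbf{i} + Q(x + \Delta x, y)\,\mathbf{j}]\cdot\mathbf{i}\,\Delta y = P(x + \Delta x, y)\,\Delta y , \tag{6.4}$$

$$(\mathbf{V}\cdot\mathbf{n})\,\Delta x = [P(x, y + \Delta y)\,\mathbf{i} + Q(x, y + \Delta y)\,\mathbf{j}]\cdot\mathbf{j}\,\Delta x = Q(x, y + \Delta y)\,\Delta x , \tag{6.5}$$

$$(\mathbf{V}\cdot\mathbf{n})\,\Delta y = [P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}]\cdot(-\mathbf{i})\,\Delta y = -P(x, y)\,\Delta y , \tag{6.6}$$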

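where the signs are due to the directions of the unit normals of each face, which are $-\mathbf{j}$, $\mathbf{i}$, $\mathbf{j}$ and $-\mathbf{i}$, respectively. Note the arguments of $P$ and $Q$ in each term! The variation along each face has been neglected because the lengths of each side will become infinitesimal later in this calculation. Moreover, the reference points for each side have been chosen so that the leading corrections to $\mathbf{V}(x, y)$ are of order $\Delta x\,\Delta y$, rather than $(\Delta x)^2$ or $(\Delta y)^2$, which would vanish in the infinitesimal limit. Neither of these choices is essential for our calculation, but they make the intermediate step much simpler. Upon summing up the contributions to the flux, we obtain

$$\begin{aligned}
\int \mathbf{V}\cdot\mathbf{n}\,ds &= [P(x + \Delta x, y) - P(x, y)]\,\Delta y + [Q(x, y + \Delta y) - Q(x, y)]\,\Delta x \\
&= \left\{ \left[\frac{P(x + \Delta x, y) - P(x, y)}{\Delta x}\right] + \left[\frac{Q(x, y + \Delta y) - Q(x, y)}{\Delta y}\right] \right\} \Delta x\,\Delta y .
\end{aligned} \tag{6.7}$$

The expressions within the square brackets on the right-hand side are discrete approximations to partial derivatives of $P$ and $Q$, as given in Eqs. (1.12) and (1.13). Thus, by dividing both sides of this equation by $\Delta x\,\Delta y$ and taking the limit $\Delta x \to 0$ and $\Delta y \to 0$, we obtain

$$\begin{aligned}
\lim_{\Delta x \to 0,\,\Delta y \to 0} \left[\frac{1}{\Delta x\,\Delta y} \int \mathbf{V}\cdot\mathbf{n}\,ds\right] &= \lim_{\Delta x \to 0} \left[\frac{P(x + \Delta x, y) - P(x, y)}{\Delta x}\right] + \lim_{\Delta y \to 0} \left[\frac{Q(x, y + \Delta y) - Q(x, y)}{\Delta y}\right] \\
&= \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} .
\end{aligned} \tag{6.8}$$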

The quantity on the right-hand side of this equation is called the divergence of $\mathbf{V}$. From the way we have arrived at this quantity, the divergence is defined as the flux density across the boundary of an infinitesimal region. This point merits further discussion. We are familiar with other types of densities, such as mass density and charge density. Each of these refers to an extensive quantity in that the total amount of mass or charge within a region is obtained by integrating the corresponding density at each point within the region. The density at any point can be determined by calculating the amount of mass or charge within a region surrounding the point, dividing by the volume of the region, and shrinking the volume of the region to that point. Flux is another such quantity, as Eq. (6.8) demonstrates. This is of fundamental significance for the applications of the divergence to physical theories.

We can write the divergence of a vector field in a more compact form that has a natural extension to three dimensions by introducing the del operation

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z} . \tag{6.9}$$

By regarding this operation as a vector, we take the dot product with

$$\mathbf{V} = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k} , \tag{6.10}$$

to obtain

$$\nabla\cdot\mathbf{V} = \left(\mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}\right)\cdot(P\,\mathbf{i} + Q\,\mathbf{j} + R\,\mathbf{k}) = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} . \tag{6.11}$$

The definition of the divergence in Eq. (6.8) can now be written as

$$\nabla\cdot\mathbf{V} = \lim_{\Delta A \to 0} \left(\frac{1}{\Delta A} \int \mathbf{V}\cdot\mathbf{n}\,ds\right) , \tag{6.12}$$

where we have written $\Delta A = \Delta x\,\Delta y$. This definition has several similarities with the definition of the derivative in Eq. (1.1). We used Eq. (1.1) explicitly in arriving at the expression for the divergence, but the similarities run much deeper. The right-hand sides of both equations are expressed in terms of a function evaluated on the boundary of a region [the endpoints of an interval in the definition in Eq. (1.1)], and the corresponding derivatives of those functions are obtained by shrinking this region to zero. These similarities will be extended further when we derive the Fundamental Theorem associated with the divergence. We first consider some examples of vector fields and their divergences.

Example. As our first example, we consider the vector field

$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} , \tag{6.13}$$

which is shown in a region around the origin in Fig. 6.2.

Fig. 6.2: Plot of $\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j}$.

The divergence of $\mathbf{V}$ is calculated from the definition in Eq. (6.11), with $P = x$ and $Q = y$:

$$\nabla\cdot\mathbf{V} = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} = 1 + 1 = 2 . \tag{6.14}$$

According to our convention, the positive divergence of this vector field indicates that the flux density is directed outward. This is certainly apparent at the origin, but the fact that the divergence is independent of position means that this interpretation is valid for every point in the x-y plane, i.e. the flux density is directed outward from every point. Understanding how this conclusion is drawn is central to the concept of what the divergence means. The details of this interpretation are discussed in Classwork 7 for this and other vector fields.

Example. Consider now the vector field

$$\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j} , \tag{6.15}$$

which is shown in Fig. 6.3.

Fig. 6.3: Plot of $\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j}$.

The divergence of $\mathbf{V}$ is calculated from Eq. (6.11), with $P = x$ and $Q = -y$:

$$\nabla\cdot\mathbf{V} = \frac{\partial x}{\partial x} - \frac{\partial y}{\partial y} = 1 - 1 = 0 . \tag{6.16}$$

This zero divergence indicates that there is no net flux density. This is again apparent at the origin, but the fact that the divergence is independent of position means that this interpretation is valid for any point in the x-y plane.

Example. Our final example is the vector field

$$\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} , \tag{6.17}$$

which is shown in Fig. 6.4.

Fig. 6.4: Plot of $\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j}$.

The divergence of $\mathbf{V}$ is calculated from Eq. (6.11), with $P = -y$ and $Q = x$:

$$\nabla\cdot\mathbf{V} = \frac{\partial(-y)}{\partial x} + \frac{\partial x}{\partial y} = 0 + 0 = 0 . \tag{6.18}$$

The vanishing divergence of this vector field indicates that there is no net flux density, but here because the vector field appears as a vortex. As in the preceding examples, this conclusion is valid at every point.
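The definition of the divergence as a flux density, Eq. (6.12), can be evaluated directly for the three example fields. The sketch below is an illustration only: the function name `flux_density`, the sample point, and the side length `h` are assumptions. Unlike Eqs. (6.3) to (6.6), which sample the field at the corners of the region, this version samples it at the midpoint of each side, which makes no difference in the limit of a vanishing region.

```python
def flux_density(P, Q, x, y, h=1e-3):
    """Net outward flux through a small h-by-h square with lower-left corner (x, y), per unit area."""
    right = P(x + h, y + h / 2) * h    # outward normal +i
    left = -P(x, y + h / 2) * h        # outward normal -i
    top = Q(x + h / 2, y + h) * h      # outward normal +j
    bottom = -Q(x + h / 2, y) * h      # outward normal -j
    return (right + left + top + bottom) / (h * h)

x0, y0 = 0.7, -1.3  # arbitrary sample point away from the origin

div_radial = flux_density(lambda x, y: x, lambda x, y: y, x0, y0)   # Eq. (6.13)
div_saddle = flux_density(lambda x, y: x, lambda x, y: -y, x0, y0)  # Eq. (6.15)
div_vortex = flux_density(lambda x, y: -y, lambda x, y: x, x0, y0)  # Eq. (6.17)

print(div_radial, div_saddle, div_vortex)  # compare with Eqs. (6.14), (6.16), (6.18)
```

Moving the sample point leaves the three values unchanged, in line with the remark that these divergences are independent of position.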

6.2 The Divergence Theorem


The divergence was derived in the preceding section by calculating the flux through an infinitesimal region in the x-y plane. In this section, we will integrate Eq. (6.8) to obtain the fundamental theorem of calculus associated with the divergence. The key point here is that for adjacent regions that share a boundary, the flux from one region exactly cancels the flux into the next region. The only net contribution to the flux is where there is no adjacent region, as shown in Fig. 6.5. Thus, we can partition any region in the x-y plane into contiguous regions of the type in Fig. 6.1 and invoke the cancellation of flux along adjacent boundaries (Fig. 6.5). The sequence of steps is shown in Fig. 6.6. If we denote the boundary of the $i$th region by $\sigma_i$ and the area $\Delta x\,\Delta y$ by $\tau_i$, then for each region we have

$$\int_{\sigma_i} \mathbf{V}\cdot\mathbf{n}_i\,ds = \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)\tau_i , \tag{6.19}$$

where $\mathbf{n}_i$ represents the outward normal of the $i$th region. Then, summing over each region yields

$$\sum_i \int_{\sigma_i} \mathbf{V}\cdot\mathbf{n}_i\,ds = \sum_i \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)\tau_i . \tag{6.20}$$

Fig. 6.5: Adjacent regions showing that, where there is a common boundary, indicated by bold lines, the flux from one region exactly cancels the flux into the adjacent region. Only unshared boundaries contribute to the net flux.

Fig. 6.6: (a) Partition of a region in the x-y plane bounded by a curve. (b) Cancellation of integrals over adjacent regions, as shown in Fig. 6.5. (c) As the area of the basic regions becomes smaller ($\Delta x \to 0$ and $\Delta y \to 0$), the partitioning in (a) provides a successively more accurate representation of the region, yielding a more accurate representation of the curve surrounding the region.

By taking the limit $\Delta x \to 0$ and $\Delta y \to 0$, we obtain

$$\lim_{\Delta x \to 0,\,\Delta y \to 0} \left[\sum_i \int_{\sigma_i} \mathbf{V}\cdot\mathbf{n}_i\,ds\right] = \int_{\sigma} \mathbf{V}\cdot\mathbf{n}\,ds \equiv \int_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma , \tag{6.21}$$

where $\sigma$ is the boundary of the entire region and $\mathbf{n}$ is the corresponding outward unit normal, and

$$\lim_{\Delta x \to 0,\,\Delta y \to 0} \left[\sum_i \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right)\tau_i\right] = \iint_{\tau} \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) d\tau . \tag{6.22}$$

Thus,

$$\int_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \iint_{\tau} \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y}\right) d\tau , \tag{6.23}$$

which is the divergence theorem in two dimensions. The same arguments apply in three dimensions, and we can write this in a more compact form by using the notation in Eq. (6.11):

$$\int_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \iint_{\tau} \nabla\cdot\mathbf{V}\,d\tau , \tag{6.24}$$

where $\sigma$ is the curve bounding the area $\tau$ in two dimensions, and

$$\iint_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \iiint_{\tau} \nabla\cdot\mathbf{V}\,d\tau , \tag{6.25}$$

where $\sigma$ is the surface bounding the volume $\tau$ in three dimensions. Note that these equations have the structure of the Fundamental Theorem of Calculus in Eqs. (2.10) and (2.11), which we combine as

$$F(b) - F(a) = \int_a^b \frac{dF}{dx}\,dx . \tag{6.26}$$

The right-hand sides of all these equations involve the integral of the derivative of a function over the interior of a region, while the left-hand sides involve the evaluation of the function over the boundary of that region.

Example. We will verify the divergence theorem, Eq. (6.24), for the vector field

$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} \tag{6.27}$$

over the region of the x-y plane given by $0 \le x \le 1$ and $0 \le y \le 1$, as depicted in Fig. 6.7. The divergence of $\mathbf{V}$ was calculated in the Example in the preceding section, $\nabla\cdot\mathbf{V} = 2$, so the right-hand side of the divergence theorem can be evaluated immediately:

$$\iint_{\tau} \nabla\cdot\mathbf{V}\,d\tau = \int_0^1 \int_0^1 2\,dx\,dy = 2 . \tag{6.28}$$

Fig. 6.7: Plot of $\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j}$ in the region $0 \le x \le 1$ and $0 \le y \le 1$ of the x-y plane, shown shaded with emboldened boundaries.

The left-hand side is evaluated in an analogous manner to that used to obtain Eq. (6.8). Beginning with the segment along the x-axis and proceeding in a counterclockwise direction, we have

$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot(-\mathbf{j}) = -y , \tag{6.29}$$

$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot\mathbf{i} = x , \tag{6.30}$$

$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot\mathbf{j} = y , \tag{6.31}$$

$$\mathbf{V}\cdot\mathbf{n} = (x\,\mathbf{i} + y\,\mathbf{j})\cdot(-\mathbf{i}) = -x . \tag{6.32}$$

Along these four segments, we have, respectively, that $y = 0$, $x = 1$, $y = 1$, and $x = 0$. Thus, only the expressions in Eqs. (6.30) and (6.31) have nonzero contributions. This can be understood from Fig. 6.7 because $\mathbf{V}$ is seen to have only components parallel to the boundaries that lie along the x- and y-axes. Hence, there is no flux of $\mathbf{V}$ across these boundaries. We thereby obtain

$$\int_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \int_0^1 x\,\Big|_{x=1}\,dy + \int_0^1 y\,\Big|_{y=1}\,dx = 1 + 1 = 2 , \tag{6.33}$$

which agrees with Eq. (6.28).
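The same comparison can be carried out numerically. The sketch below is an illustration under stated assumptions: the helper names `boundary_flux` and `area_integral_of_divergence` are hypothetical, both integrals use midpoint rules, and the divergence is estimated by central differences. The second field, $\mathbf{V} = x^2 y\,\mathbf{i} + xy\,\mathbf{j}$, is not taken from the notes and is included only to show the check on a field whose divergence is not constant.

```python
def boundary_flux(P, Q, n=1000):
    """Outward flux of (P, Q) through the boundary of the unit square (midpoint rule per side)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += -Q(s, 0.0) * h  # bottom side, outward normal -j
        total += P(1.0, s) * h   # right side,  outward normal +i
        total += Q(s, 1.0) * h   # top side,    outward normal +j
        total += -P(0.0, s) * h  # left side,   outward normal -i
    return total

def area_integral_of_divergence(P, Q, n=200, eps=1e-6):
    """Midpoint-rule integral over the unit square of a central-difference divergence."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            div = ((P(x + eps, y) - P(x - eps, y)) / (2 * eps)
                   + (Q(x, y + eps) - Q(x, y - eps)) / (2 * eps))
            total += div * h * h
    return total

# Field of Eq. (6.27).
flux_1 = boundary_flux(lambda x, y: x, lambda x, y: y)
area_1 = area_integral_of_divergence(lambda x, y: x, lambda x, y: y)

# Illustrative second field (an assumption, not from the text).
flux_2 = boundary_flux(lambda x, y: x**2 * y, lambda x, y: x * y)
area_2 = area_integral_of_divergence(lambda x, y: x**2 * y, lambda x, y: x * y)

print(flux_1, area_1)  # both should be close to the value in Eq. (6.28)
print(flux_2, area_2)  # the two numbers should nearly agree with each other
```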

Example. We now turn our attention to the divergence theorem in three dimensions in Eq. (6.25). Consider the flux of the vector field

$$\mathbf{V} = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k} \tag{6.34}$$

through the surface of the unit sphere, $x^2 + y^2 + z^2 = 1$ (Fig. 6.8). To evaluate the right-hand side of Eq. (6.25), we first calculate the divergence of $\mathbf{V}$. Using Eq. (6.11) with $P = y$, $Q = -x$, and $R = z$, we obtain

$$\nabla\cdot\mathbf{V} = \frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} + \frac{\partial z}{\partial z} = 0 + 0 + 1 = 1 . \tag{6.35}$$

The integral of this quantity over the volume of the sphere can be carried out either by inspection or explicitly in spherical polar coordinates:

$$\iiint_{\tau} \nabla\cdot\mathbf{V}\,d\tau = \int_0^1 r^2\,dr \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\,d\theta = \frac{1}{3}\times 2\pi \times 2 = \frac{4\pi}{3} , \tag{6.36}$$

which is just the volume of the unit sphere. The evaluation of the left-hand side of Eq. (6.25) requires determining the dot product $\mathbf{V}\cdot\mathbf{n}$ over the surface of the unit sphere. The outward unit normal is obtained from the gradient

$$\nabla(x^2 + y^2 + z^2) = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} . \tag{6.37}$$

The length of this vector is

$$|\nabla(x^2 + y^2 + z^2)| = \sqrt{4x^2 + 4y^2 + 4z^2} = 2\sqrt{x^2 + y^2 + z^2} = 2 \tag{6.38}$$

on the surface of the unit sphere (where $x^2 + y^2 + z^2 = 1$). Hence,

$$\mathbf{n} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} , \tag{6.39}$$

so

$$\mathbf{V}\cdot\mathbf{n} = (y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k})\cdot(x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}) = xy - xy + z^2 = z^2 . \tag{6.40}$$

Fig. 6.8: The vector field in Eq. (6.34) evaluated at the surface of the unit sphere.

The integral of this quantity over the surface of the unit sphere is carried out in spherical polar coordinates, with $z^2 = \cos^2\theta$:

$$\iint_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \underbrace{\int_0^{2\pi} d\phi}_{=\,2\pi}\;\underbrace{\int_0^{\pi} \sin\theta\cos^2\theta\,d\theta}_{=\,-\frac{1}{3}\cos^3\theta\,\big|_0^{\pi}\,=\,\frac{2}{3}} = \frac{4\pi}{3} , \tag{6.41}$$

which agrees with Eq. (6.36).
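The surface integral in Eq. (6.41) can also be estimated without using the analytic form of $\mathbf{V}\cdot\mathbf{n}$. The following is a minimal sketch, assuming a midpoint rule in the spherical angles $\theta$ and $\phi$; the function name `sphere_flux` and the grid sizes are hypothetical choices.

```python
import math

def sphere_flux(V, R=1.0, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the outward flux of V through the sphere of radius R about the origin."""
    dth = math.pi / n_theta
    dph = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            # Outward unit normal at this point of the sphere.
            nx = math.sin(th) * math.cos(ph)
            ny = math.sin(th) * math.sin(ph)
            nz = math.cos(th)
            vx, vy, vz = V(R * nx, R * ny, R * nz)
            # Surface element of the sphere: R^2 sin(theta) dtheta dphi.
            total += (vx * nx + vy * ny + vz * nz) * R * R * math.sin(th) * dth * dph
    return total

flux = sphere_flux(lambda x, y, z: (y, -x, z))  # field of Eq. (6.34)
print(flux, 4 * math.pi / 3)  # the two numbers should nearly agree
```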

6.3 Gauss' Law


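Gauss' law is a special case of the divergence theorem that has several important applications in physics. We begin by considering the divergence theorem for the particular case that

$$\mathbf{V} = \nabla\Phi(r) , \tag{6.42}$$

i.e. $\mathbf{V}$ is the gradient of a scalar function and $r = (x^2 + y^2 + z^2)^{1/2}$ is the usual radial variable that measures the distance of the point $(x, y, z)$ to the origin. We will evaluate the left-hand side of Eq. (6.25) for a spherical surface of radius $R$, which means that we need to determine $\mathbf{V}$ and $\mathbf{n}$ over this surface. The equation of this surface is $x^2 + y^2 + z^2 = R^2$, so the outward unit normal is obtained by taking the gradient of this expression and normalizing the resulting vector, as in the steps leading to Eq. (6.39):

$$\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k} . \tag{6.43}$$

To calculate $\mathbf{V}$, we use the chain rule to obtain

$$\begin{aligned}
\nabla\Phi(r) &= \frac{\partial\Phi}{\partial x}\,\mathbf{i} + \frac{\partial\Phi}{\partial y}\,\mathbf{j} + \frac{\partial\Phi}{\partial z}\,\mathbf{k} \\
&= \frac{d\Phi}{dr}\frac{\partial r}{\partial x}\,\mathbf{i} + \frac{d\Phi}{dr}\frac{\partial r}{\partial y}\,\mathbf{j} + \frac{d\Phi}{dr}\frac{\partial r}{\partial z}\,\mathbf{k} \\
&= \frac{d\Phi}{dr}\left(\frac{x}{r}\,\mathbf{i} + \frac{y}{r}\,\mathbf{j} + \frac{z}{r}\,\mathbf{k}\right) .
\end{aligned} \tag{6.44}$$

On the surface of the sphere, $r = R$ and this expression reduces to

$$\nabla\Phi(r) = \frac{d\Phi}{dr}\bigg|_{r=R}\left(\frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k}\right) , \tag{6.45}$$

so

$$\mathbf{V}\cdot\mathbf{n} = \nabla\Phi(r)\cdot\mathbf{n} = \frac{d\Phi}{dr}\bigg|_{r=R}\left(\frac{x^2 + y^2 + z^2}{R^2}\right) = \frac{d\Phi}{dr}\bigg|_{r=R} , \tag{6.46}$$

which, since both $\mathbf{V}$ and $\mathbf{n}$ are radial vector fields, is a constant on the surface of the sphere. Thus, integrating this quantity (which is a constant) over the surface of the sphere of radius $R$ yields

$$\iint_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = 4\pi R^2\,\frac{d\Phi}{dr}\bigg|_{r=R} . \tag{6.47}$$

We now observe that, if we specialize our choice to $\Phi(r) = A/r$, where $A$ is any constant, then $d\Phi/dr = -A/r^2$ and

$$\iint_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = -4\pi A , \tag{6.48}$$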
i.e. independent of the radius of the sphere!

This result has several important consequences. Figure 6.9 shows a two-dimensional depiction that we will use in the following discussion. Figure 6.9(a) shows a sphere of any radius. Since Eq. (6.48) is independent of the radius, the value of this integral is unaffected by any deformation of this sphere as long as the resulting surface contains the origin. One such deformation is shown in Fig. 6.9(b). There is no flux through the radial planes because the vector field is radial, so only the spherical portions contribute to the flux. Any section within a fixed subtended angle can be moved to any radius with no effect on the flux through that section. Hence, since any surface can be decomposed into such radial and spherical sections, the result in Eq. (6.48) will be obtained for any surface that encloses the origin, as shown in Fig. 6.9(c).

Fig. 6.9: Schematic depiction in two dimensions of Gauss' law for surfaces that enclose the origin, with (a) a spherical surface, (b) a deformed spherical surface, which leaves the value of Eq. (6.48) unaffected, and (c) a general surface that can be represented by the construction in (b).

An altogether different result is obtained if the surface does not contain the origin. This situation is depicted in Fig. 6.10. The surface in Fig. 6.10(a) is composed of spherical sections. Since this surface surrounds a region that excludes the origin, the flux into the volume exactly cancels the flux out of the volume because every spherical section which admits flux has a corresponding region that expels the same amount of flux. Hence, the flux integral over such a surface vanishes! Figure 6.10(b) shows a smooth surface that can be decomposed into sections as in Fig. 6.10(a). Figures 6.9 and 6.10 summarize the essence of Gauss' law.

Fig. 6.10: Schematic depiction in two dimensions of Gauss' law for surfaces that do not enclose the origin, with (a) a deformed surface with spherical sections, (b) a general surface that can be represented by the construction in (a).

One of the most far-reaching applications of Gauss' law is to electrostatics, where the function $\Phi(r)$ represents the electrostatic potential of a charge $q$ located at the origin:

$$\Phi(r) = \frac{q}{4\pi\epsilon_0 r} , \tag{6.49}$$

where $\epsilon_0$ is the permittivity of free space. The associated electric field $\mathbf{E}$ is then given by

$$\mathbf{E} = -\nabla\Phi = \frac{q}{4\pi\epsilon_0 r^2}\,\hat{\mathbf{r}} , \tag{6.50}$$

where $\hat{\mathbf{r}}$ is the unit vector in the radial direction, and Gauss' law reads

$$\iiint_{\tau} \nabla\cdot\mathbf{E}\,d\tau = \iint_{\sigma} \mathbf{E}\cdot\mathbf{n}\,d\sigma = \frac{q}{\epsilon_0} \tag{6.51}$$

if the surface $\sigma$ encloses the origin. More generally, $q$ on the right-hand side is replaced by the total charge enclosed by $\sigma$, i.e. the total charge contained within the volume $\tau$. Gauss' law results from the divergence theorem applied to Coulomb's law and leads to one of the four Maxwell equations, the equations that govern the behavior of all electromagnetic phenomena.
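The two cases described above, a surface that encloses the origin and one that does not, can be explored numerically for the field $\mathbf{V} = \nabla(A/r)$. The sketch below is an illustration only: the function names `sphere_flux_about` and `grad_A_over_r`, the value of $A$, and the choice of spheres are all assumptions. It uses a midpoint rule in the spherical angles measured about the center of each sphere.

```python
import math

def sphere_flux_about(V, center, R, n_theta=200, n_phi=200):
    """Midpoint-rule estimate of the outward flux of V through a sphere of radius R about `center`."""
    cx, cy, cz = center
    dth = math.pi / n_theta
    dph = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            nx = math.sin(th) * math.cos(ph)
            ny = math.sin(th) * math.sin(ph)
            nz = math.cos(th)
            vx, vy, vz = V(cx + R * nx, cy + R * ny, cz + R * nz)
            total += (vx * nx + vy * ny + vz * nz) * R * R * math.sin(th) * dth * dph
    return total

A = 1.5  # illustrative constant

def grad_A_over_r(x, y, z):
    # Gradient of A / r, i.e. -A (x, y, z) / r^3.
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-A * x / r3, -A * y / r3, -A * z / r3)

small = sphere_flux_about(grad_A_over_r, (0.0, 0.0, 0.0), 0.5)    # encloses the origin
large = sphere_flux_about(grad_A_over_r, (0.0, 0.0, 0.0), 2.0)    # encloses the origin
outside = sphere_flux_about(grad_A_over_r, (3.0, 0.0, 0.0), 1.0)  # excludes the origin

print(small, large, -4 * math.pi * A)  # the first two should match the third
print(outside)                         # should be close to zero
```

With $A = q/4\pi\epsilon_0$ and $\mathbf{E} = -\nabla\Phi$, the first two results change sign and correspond to the flux $q/\epsilon_0$ in Eq. (6.51).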

6.4 Summary
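This chapter has introduced the divergence of a vector field $\mathbf{V} = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$:

$$\nabla\cdot\mathbf{V} = \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z} . \tag{6.52}$$

The divergence represents the flux density of the vector field and, because of the derivative operation, has an associated Fundamental Theorem of Calculus called the divergence theorem:

$$\iint_{\sigma} \mathbf{V}\cdot\mathbf{n}\,d\sigma = \iiint_{\tau} \nabla\cdot\mathbf{V}\,d\tau , \tag{6.53}$$

for a surface $\sigma$ surrounding a volume $\tau$.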

Chapter 7 The Curl and Stokes' Theorem


The divergence of a vector field is a derivative operation that yields the flux density of that field as a function of position. Another quantity used to characterize a vector field is called the curl or, sometimes, the rotation. The curl is obtained by taking a derivative, and is associated with the circulation density of the vector field, so there is a Fundamental Theorem of Calculus for the curl, called Stokes' theorem. In this chapter we develop these concepts by following similar steps to those used in the preceding chapter for the divergence: the definition in terms of infinitesimal quantities is followed by integration of this quantity over a region to obtain the fundamental theorem. The divergence and curl provide complementary information about a vector field. In fact, a theorem due to Helmholtz states that a vector field that vanishes at infinity is uniquely defined by its curl and divergence. Maxwell's equations, the fundamental equations of electromagnetism, utilize this theorem by specifying the divergence and curl of the electric and magnetic fields.

7.1 The Curl in Two Dimensions


The basic construction used to derive the curl in two dimensions is shown in Fig. 7.1 for a vector field $\mathbf{V}$ given by

$$\mathbf{V} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j} . \tag{7.1}$$

We consider a rectangular region of area $\Delta x\,\Delta y$ and calculate the line integral around its perimeter, which is denoted by $\partial(\Delta A)$,

$$\oint_{\partial(\Delta A)} \mathbf{V}\cdot d\mathbf{r} = \oint_{\partial(\Delta A)} (P\,dx + Q\,dy) \tag{7.2}$$

in the counterclockwise direction. The integrand of this line integral represents the projection of $\mathbf{V}$ along the integration path, so a positive (resp., negative) value of

the integral implies that $\mathbf{V}$ has a positive (resp., negative) circulation. The direction of positive circulation is simply a matter of convention. If the line integral vanishes, $\mathbf{V}$ has no circulation in the region.

Fig. 7.1: The region of area $\Delta x\,\Delta y$ used to calculate the curl of a vector field, with corners at $(x, y)$, $(x + \Delta x, y)$, $(x + \Delta x, y + \Delta y)$, and $(x, y + \Delta y)$. The projections of the components of the vector field onto the directions around this region are indicated by arrows.

The line integral in Eq. (7.2) is composed of four segments:

$$\begin{aligned}
\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = {} & \int_x^{x+\Delta x} P(x', y)\,dx' + \int_y^{y+\Delta y} Q(x + \Delta x, y')\,dy' \\
& + \int_{x+\Delta x}^{x} P(x', y + \Delta y)\,dx' + \int_{y+\Delta y}^{y} Q(x, y')\,dy' .
\end{aligned} \tag{7.3}$$

By using Eq. (2.15),

$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx , \tag{7.4}$$

we can combine terms on the right-hand side of Eq. (7.3) to obtain

$$\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \int_x^{x+\Delta x} \left[P(x', y) - P(x', y + \Delta y)\right] dx' + \int_y^{y+\Delta y} \left[Q(x + \Delta x, y') - Q(x, y')\right] dy' . \tag{7.5}$$

Since the limits $\Delta x \to 0$ and $\Delta y \to 0$ are to be taken, we can regard $P$ and $Q$ as constant over their intervals of integration. Our line integral then becomes

$$\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \left[P(x, y) - P(x, y + \Delta y)\right]\Delta x + \left[Q(x + \Delta x, y) - Q(x, y)\right]\Delta y . \tag{7.6}$$

We now divide both sides of this equation by $\Delta x\,\Delta y$,

$$\frac{1}{\Delta x\,\Delta y} \oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \frac{P(x, y) - P(x, y + \Delta y)}{\Delta y} + \frac{Q(x + \Delta x, y) - Q(x, y)}{\Delta x} , \tag{7.7}$$

and take the limits $\Delta x \to 0$ and $\Delta y \to 0$. The right-hand side can be evaluated by using the definitions in Eqs. (1.12) and (1.13):

$$\lim_{\Delta y \to 0} \left[\frac{P(x, y) - P(x, y + \Delta y)}{\Delta y}\right] = -\frac{\partial P}{\partial y} , \tag{7.8}$$

$$\lim_{\Delta x \to 0} \left[\frac{Q(x + \Delta x, y) - Q(x, y)}{\Delta x}\right] = \frac{\partial Q}{\partial x} . \tag{7.9}$$

Thus, we obtain

$$\lim_{\Delta x \to 0,\,\Delta y \to 0} \left[\frac{1}{\Delta x\,\Delta y} \oint_{\partial(\Delta A)} (P\,dx + Q\,dy)\right] = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} . \tag{7.10}$$

This is the definition of the curl. It represents the circulation of a vector field around an infinitesimal area at $(x, y)$. Notice that in deriving this quantity, we have used only the components along $d\mathbf{r}$, whereas the corresponding derivation of the divergence in Sec. 6.1 used the normal components. This provides an intuitive basis for understanding why the divergence and curl uniquely specify a vector field.

Example. We calculate the curls of the vector fields in the examples in Sec. 6.1. Consider

$$\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} , \tag{7.11}$$

which is shown in Fig. 7.2. This is a radial vector field with a divergence that was calculated as $\nabla\cdot\mathbf{V} = 2$. To calculate the curl of this vector field, we apply the definition in Eq. (7.10) with $P = x$ and $Q = y$ to obtain

$$\frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} = 0 - 0 = 0 . \tag{7.12}$$

Thus, $\mathbf{V}$ is specified completely by its divergence. These conclusions are valid for any radial vector field (Problem Set 9).

Fig. 7.2: Plot of V = x i + y j.

Now consider the vector field

\[
\mathbf{V} = x\,\mathbf{i} - y\,\mathbf{j} , \tag{7.13}
\]

which is shown in Fig. 7.3. This vector field was found in Sec. 6.1 to have a divergence that vanishes: $\nabla \cdot \mathbf{V} = 0$. With P = x and Q = -y, the curl of V is

\[
\frac{\partial (-y)}{\partial x} - \frac{\partial x}{\partial y} = 0 - 0 = 0 , \tag{7.14}
\]

which is also zero! Thus, both the divergence and curl vanish for this vector field. This example shows that, even if the curl and divergence of a vector field are both zero, the vector field itself need not reduce to a constant everywhere. However, if we further stipulate that a vector field must vanish at infinity, then a vanishing divergence and curl do indeed imply that V = 0. This discussion has important implications for the governing equations of electric and magnetic fields in electromagnetism (Maxwell's equations).
Fig. 7.3: Plot of V = x i - y j.

Fig. 7.4: Plot of V = -y i + x j.

Finally, consider

\[
\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} , \tag{7.15}
\]

which is shown in Fig. 7.4. With P = -y and Q = x, the curl is calculated as

\[
\frac{\partial x}{\partial x} - \frac{\partial (-y)}{\partial y} = 1 + 1 = 2 , \tag{7.16}
\]

so this vector field has a positive circulation according to our convention. Indeed, the plot of the vector field in Fig. 7.4 is suggestive of a circulation around the origin. But, as in the discussion in Sec. 6.1, the fact that the curls of the vector fields in this example are constants means that any interpretation assigned to them must be valid for every point in the x-y plane. This point is taken up in Problem Set 9.
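The three curls in this example can also be checked directly from the limiting definition in Eq. (7.10). The following is a minimal numerical sketch, not part of the original derivation: it assumes a small square of side h, evaluates each edge with the midpoint rule, and uses the illustrative names `circulation_density` and `fields`, as well as an arbitrary sample point.

```python
def circulation_density(P, Q, x, y, h=1e-3, n=200):
    """Approximate (1 / (dx * dy)) times the counter-clockwise line integral
    of P dx + Q dy around the square [x, x + h] x [y, y + h]."""
    step = h / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * step                 # midpoint of the k-th sub-interval
        total += P(x + s, y) * step          # bottom edge, left to right
        total += Q(x + h, y + s) * step      # right edge, upwards
        total -= P(x + s, y + h) * step      # top edge, right to left
        total -= Q(x, y + s) * step          # left edge, downwards
    return total / (h * h)


# The three fields of the example, written as (P, Q) pairs.
fields = {
    "x i + y j": (lambda x, y: x, lambda x, y: y),
    "x i - y j": (lambda x, y: x, lambda x, y: -y),
    "-y i + x j": (lambda x, y: -y, lambda x, y: x),
}

# Evaluate at an arbitrary point; the curls here are constants.
densities = {name: circulation_density(P, Q, 0.3, -0.7)
             for name, (P, Q) in fields.items()}

for name, value in densities.items():
    print(f"{name}: {value:.6f}")
```

Because all three fields are linear in x and y, the estimate should not depend noticeably on the chosen point or on h, which is consistent with the constant curls found above.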

7.2 Green's Theorem


The curl was obtained by considering the circulation around an infinitesimal region in the x-y plane. We can integrate Eq. (7.10) to obtain the Fundamental Theorem of Calculus associated with the curl in a manner analogous to that for the divergence. The key point again is that for adjacent elemental regions that share a boundary, the integral along the shared boundary for one region exactly cancels the corresponding integral for the adjacent region, because the sense of integration is opposite on these faces (Fig. 7.5). Thus, the only net contribution to the total circulation comes from segments where there is no adjacent region. We begin by rearranging Eq. (7.10) as

\[
\oint_{\partial(\Delta A)} (P\,dx + Q\,dy) = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \Delta A . \tag{7.17}
\]

Fig. 7.5: (a) Partition of a region in the x-y plane bounded by a curve. (b) Cancellation of line integrals over adjacent regions. (c) As the area of the basic regions becomes smaller ($\Delta x \to 0$ and $\Delta y \to 0$), the partitioning in (a) provides a successively more accurate representation of the region, yielding a more accurate representation of the curve surrounding the region.

The left-hand side of this equation is the line integral over the perimeter of the region $\Delta A$ and the right-hand side is the circulation density (i.e. the curl) multiplied by the area $\Delta A$ of the region, which yields the total circulation within the region. Now, any region in the x-y plane can be partitioned into such contiguous elemental rectangular regions, as shown in Fig. 7.5(a). For each such region we can carry out the indicated calculations in Eq. (7.17). Thus, the corresponding quantities for the entire region A are obtained by summing over the elemental regions:

\[
\sum_i \oint_{\partial(\Delta A_i)} (P\,dx + Q\,dy) = \sum_i \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \Delta A_i , \tag{7.18}
\]

where $\Delta A_i$ is the area of the ith elemental region. This is depicted in Fig. 7.5(b). For extended regions $\Delta A_i$, the partitioning does not give an accurate representation of either the interior or the boundary, but improves as $\Delta A_i$ decreases. Thus, by taking $\Delta A_i \to 0$, we can obtain an exact relation between the line integral over the perimeter of the region and the curl within the region.

We consider the left-hand side of Eq. (7.18) first. As Figs. 7.5(b,c) show, the sum of the line integrals over the elemental rectangular regions contains only those segments that have no neighboring elemental regions; the interior line integrals cancel pairwise. Thus,

\[
\lim_{\Delta A_i \to 0} \left[ \sum_i \oint_{\partial(\Delta A_i)} (P\,dx + Q\,dy) \right] = \oint_{\partial A} (P\,dx + Q\,dy) , \tag{7.19}
\]

where $\partial A$ is the perimeter of the region A in the x-y plane and the integral is taken in the counter-clockwise direction.
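The pairwise cancellation of interior line integrals can be seen numerically. The sketch below is an illustration under stated assumptions rather than part of the text: the region is taken to be the unit square, each edge integral is approximated by the midpoint rule, and the field P = -y^2, Q = xy and all function names are chosen only for this example.

```python
def cell_circulation(P, Q, x, y, dx, dy):
    """Counter-clockwise line integral of P dx + Q dy around one rectangle,
    with each edge evaluated by the midpoint rule."""
    bottom = P(x + dx / 2, y) * dx
    right = Q(x + dx, y + dy / 2) * dy
    top = -P(x + dx / 2, y + dy) * dx
    left = -Q(x, y + dy / 2) * dy
    return bottom + right + top + left


def sum_over_cells(P, Q, n):
    """Sum the circulations of the n * n cells that tile the unit square."""
    d = 1.0 / n
    return sum(cell_circulation(P, Q, i * d, j * d, d, d)
               for i in range(n) for j in range(n))


def outer_boundary(P, Q, n):
    """Line integral around the unit square using only the outer segments."""
    d = 1.0 / n
    total = 0.0
    for k in range(n):
        m = (k + 0.5) * d
        total += P(m, 0.0) * d   # bottom, left to right
        total += Q(1.0, m) * d   # right, upwards
        total -= P(m, 1.0) * d   # top, right to left
        total -= Q(0.0, m) * d   # left, downwards
    return total


# Illustrative field (an assumption of this sketch, not taken from the text).
P = lambda x, y: -y * y
Q = lambda x, y: x * y

cells = sum_over_cells(P, Q, 20)
boundary = outer_boundary(P, Q, 20)
print(cells, boundary)
```

For this field the curl is 3y, whose integral over the unit square is 3/2, so both printed values should be close to 1.5: the sum over all cells retains only the contributions from the outer perimeter.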


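The right-hand side of Eq. (7.18) is a two-dimensional Riemann sum (cf. Eq. 2.1). In particular, as $\Delta A_i \to 0$, the summation becomes an integral over the interior of the region A:

\[
\lim_{\Delta A_i \to 0} \left[ \sum_i \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \Delta A_i \right] = \iint_A \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA . \tag{7.20}
\]

Thus, the results in Eqs. (7.19) and (7.20) show that the limiting form of Eq. (7.18) obtained as $\Delta A_i \to 0$ is

\[
\oint_{\partial A} (P\,dx + Q\,dy) = \iint_A \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy , \tag{7.21}
\]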

which is known as Green's theorem. This equation has the structure of the Fundamental Theorem of Calculus in that the integral of a derivative of a quantity (the curl) over the interior of a region is equal to that quantity evaluated on the boundary of the region.

Example. Consider the vector field

\[
\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} , \tag{7.22}
\]

which is shown in Fig. 6.4, and suppose that the area in the x-y plane is a circle of radius R. With P = -y and Q = x, the left-hand side of Eq. (7.21) is

\[
\oint_{\partial A} (P\,dx + Q\,dy) = \oint_{\partial A} (x\,dy - y\,dx) . \tag{7.23}
\]

In circular polar coordinates, $x = R\cos\phi$, $y = R\sin\phi$, where $0 \le \phi < 2\pi$, and

\[
dx = -R\sin\phi\,d\phi , \qquad dy = R\cos\phi\,d\phi , \tag{7.24}
\]

so the integral becomes

\[
\oint_{\partial A} (x\,dy - y\,dx) = \int_0^{2\pi} \bigl( R^2\cos^2\phi + R^2\sin^2\phi \bigr)\,d\phi = R^2 \int_0^{2\pi} d\phi = 2\pi R^2 . \tag{7.25}
\]

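To evaluate the right-hand side of Eq. (7.21), we have that the curl is

\[
\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 1 - (-1) = 2 , \tag{7.26}
\]

so, again using polar coordinates,

\[
\iint_A \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy = 2 \int_0^R \int_0^{2\pi} r\,dr\,d\phi = 2 \cdot \tfrac{1}{2} R^2 \cdot 2\pi = 2\pi R^2 , \tag{7.27}
\]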

which agrees with Eq. (7.25).

Example. Consider the vector field

\[
\mathbf{V} = x\,\mathbf{i} + y\,\mathbf{j} , \tag{7.28}
\]

which is shown in Fig. 7.2. With P = x and Q = y, the left-hand side of Eq. (7.21) is

\[
\oint_{\partial A} (P\,dx + Q\,dy) = \oint_{\partial A} (x\,dx + y\,dy) . \tag{7.29}
\]

The curl of V vanishes:

\[
\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 0 - 0 = 0 . \tag{7.30}
\]

Thus, according to Green's theorem,

\[
\oint_{\partial A} (x\,dx + y\,dy) = 0 \tag{7.31}
\]

for any area A! This ostensibly surprising result is, in fact, to be expected from the discussion in Sec. 5.2, where we showed that the value of a line integral is independent of the path if and only if the integral over any closed curve vanishes. Indeed, we showed in Sec. 5.3 that the condition for the value of a line integral,

\[
\oint_{\partial A} (P\,dx + Q\,dy) , \tag{7.32}
\]

to be independent of the path is

\[
\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} , \tag{7.33}
\]

i.e. the vanishing curl of the vector field V = P i + Q j. The three-dimensional generalization of this result will be derived in the next section.
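Both examples of Green's theorem on a circle of radius R can be reproduced numerically. The following is a rough sketch under stated assumptions: the boundary is parametrized by the polar angle (called `phi` here), both integrals use the midpoint rule, the curl is supplied by hand as a function, and the helper names and the value R = 1.5 are illustrative only.

```python
import math


def line_integral_circle(P, Q, R, n=2000):
    """Counter-clockwise integral of P dx + Q dy around the circle of radius R,
    using x = R cos(phi), y = R sin(phi) and the midpoint rule in phi."""
    dphi = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * dphi
        x, y = R * math.cos(phi), R * math.sin(phi)
        dx, dy = -R * math.sin(phi) * dphi, R * math.cos(phi) * dphi
        total += P(x, y) * dx + Q(x, y) * dy
    return total


def curl_integral_disc(curl, R, nr=200, nphi=200):
    """Integral of curl(x, y) over the disc of radius R, area element r dr dphi."""
    dr, dphi = R / nr, 2 * math.pi / nphi
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for k in range(nphi):
            phi = (k + 0.5) * dphi
            total += curl(r * math.cos(phi), r * math.sin(phi)) * r * dr * dphi
    return total


R = 1.5
# Rotational field V = -y i + x j, whose curl is 2.
lhs_rot = line_integral_circle(lambda x, y: -y, lambda x, y: x, R)
rhs_rot = curl_integral_disc(lambda x, y: 2.0, R)
# Radial field V = x i + y j, whose curl is 0.
lhs_rad = line_integral_circle(lambda x, y: x, lambda x, y: y, R)

print(lhs_rot, rhs_rot, 2 * math.pi * R ** 2)
print(lhs_rad)
```

The first line of output should show three nearly equal numbers, and the second a value close to zero, in line with the two worked examples.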


7.3 Stokes' Theorem


Green's theorem establishes a relationship between the curl of a two-dimensional vector field and line integrals of that vector field in the x-y plane. The motivation for the mathematical structure of line integrals in terms of the work done by a force field along a path (Sec. 5.1) suggests that line integrals in three dimensions might also benefit from the analysis in Secs. 7.1 and 7.2. In this section, we generalize Green's theorem to three dimensions by examining each side of Eq. (7.21) separately.

7.3.1 Line Integrals in Three Dimensions


Consider a three-dimensional vector field V given by

\[
\mathbf{V} = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k} . \tag{7.34}
\]

The work done along a three-dimensional path $\mathcal{P}$ is obtained by calculating the component of V projected along the direction of the path. With the position vector given by r = x i + y j + z k, we have

\[
\int_{\mathcal{P}} \mathbf{V} \cdot d\mathbf{r} = \int_{\mathcal{P}} (P\,dx + Q\,dy + R\,dz) . \tag{7.35}
\]

This is the generalization of the left-hand side of Eq. (7.21) to three dimensions.
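As a concrete illustration of the three-dimensional line integral, the sketch below approximates the integral of V · dr along a parametrized path by summing V · Δr over short chords. Everything specific here is an assumption made for the example: the helix r(t) = (cos t, sin t, ct) with c = 0.5, the field V = -y i + x j + z k, and the function names. For this choice V · dr = (1 + c²t) dt, so one turn of the helix should give 2π + 2π²c².

```python
import math


def line_integral(V, path, t0, t1, n=20000):
    """Approximate the integral of V . dr along r = path(t), t0 <= t <= t1.
    Each small displacement dr is the chord between neighbouring path points,
    and V is evaluated at the parameter midpoint of that chord."""
    dt = (t1 - t0) / n
    total = 0.0
    x0, y0, z0 = path(t0)
    for k in range(n):
        x1, y1, z1 = path(t0 + (k + 1) * dt)
        xm, ym, zm = path(t0 + (k + 0.5) * dt)
        P, Q, R = V(xm, ym, zm)
        total += P * (x1 - x0) + Q * (y1 - y0) + R * (z1 - z0)
        x0, y0, z0 = x1, y1, z1
    return total


c = 0.5  # pitch parameter of the illustrative helix


def helix(t):
    return (math.cos(t), math.sin(t), c * t)


def V(x, y, z):
    return (-y, x, z)


work = line_integral(V, helix, 0.0, 2 * math.pi)
expected = 2 * math.pi + 2 * math.pi ** 2 * c ** 2
print(work, expected)
```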

7.3.2 The Curl Vector

The curl derived in Sec. 7.1 is endowed with a sign in that the counter-clockwise direction of the circulation is taken as positive by convention. But when this construction is extended to three dimensions, the concept of clockwise versus counterclockwise is not precise enough to identify the direction of circulation. For example, the counterclockwise direction observed by looking down onto the x-y plane from the positive z-axis appears as the clockwise direction when looking up to the x-y plane from the negative z-axis. This ambiguity can be alleviated by using the right-hand rule to assign an orientation to positive (i.e. counterclockwise) circulation: when the fingers of your right hand bend in the counterclockwise direction, your thumb points in the direction of the positive z-axis.

Fig. 7.6: The curl vector according to the right-hand rule.

Thus, we can write the curl of a vector V = P i + Q j as

\[
\left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} . \tag{7.36}
\]

We can represent this expression in a more suggestive form by utilizing the definition of the del operation,

\[
\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z} , \tag{7.37}
\]

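to represent the curl of a vector as the cross product between this operation and V. The calculation of this quantity proceeds in direct analogy with the representation of the cross product of two ordinary vectors as a determinant:

\[
\nabla \times \mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & 0 \\ P & Q & 0 \end{vmatrix} = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} , \tag{7.38}
\]

which is the same as Eq. (7.36). To convert this to a scalar quantity, we take the dot product of this quantity with k:

\[
(\nabla \times \mathbf{V}) \cdot \mathbf{k} = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} . \tag{7.39}
\]

Thus, combining Eqs. (7.35) and (7.39), Green's theorem can be written as

\[
\oint_{\partial A} \mathbf{V} \cdot d\mathbf{r} = \iint_A (\nabla \times \mathbf{V}) \cdot \mathbf{k}\,dx\,dy . \tag{7.40}
\]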

7.3.3 The Curl of Three-Dimensional Vector Fields


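The general vector field V is given by Eq. (7.34). The construction in Eq. (7.39) yields the following expression for the curl:

\[
\nabla \times \mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ P & Q & R \end{vmatrix} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} . \tag{7.41}
\]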
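The component formula for the three-dimensional curl translates directly into a finite-difference estimate. The sketch below is only illustrative: it assumes central differences with a fixed step h, represents the field as a Python function returning the tuple (P, Q, R), and tests two simple linear fields chosen for this example.

```python
def curl(V, x, y, z, h=1e-5):
    """Central-difference estimate of the curl of V = (P, Q, R) at (x, y, z)."""
    def d(component, axis):
        # Partial derivative of V[component] with respect to coordinate `axis`
        # (0 -> x, 1 -> y, 2 -> z).
        lo, hi = [x, y, z], [x, y, z]
        lo[axis] -= h
        hi[axis] += h
        return (V(*hi)[component] - V(*lo)[component]) / (2 * h)

    return (d(2, 1) - d(1, 2),   # dR/dy - dQ/dz
            d(0, 2) - d(2, 0),   # dP/dz - dR/dx
            d(1, 0) - d(0, 1))   # dQ/dx - dP/dy


# Two sample linear fields, evaluated at an arbitrary point.
c1 = curl(lambda x, y, z: (z, x, y), 0.2, -1.0, 0.5)
c2 = curl(lambda x, y, z: (-y, x, z), 0.2, -1.0, 0.5)
print(c1)
print(c2)
```

For V = z i + x j + y k the estimate should be close to (1, 1, 1), and for V = -y i + x j + z k close to (0, 0, 2), since each component of the determinant expansion reduces to a difference of constants.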


The meaning of this vector field follows from that for the two-dimensional curl: it represents the circulation density, with a direction given by the right-hand rule.

Example. The curl of the vector field

\[
\mathbf{V} = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k} \tag{7.42}
\]

is

\[
\nabla \times \mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ z & x & y \end{vmatrix} = \mathbf{i} + \mathbf{j} + \mathbf{k} . \tag{7.43}
\]

The vector components of the curl of V each have unit circulation in directions given by the right-hand rule, so the total circulation is along the direction i + j + k.

The right-hand side of Eq. (7.21) can now be expressed in terms of the local unit normal n to the surface $\sigma$ as

\[
\iint_A (\nabla \times \mathbf{V}) \cdot \mathbf{k}\,dx\,dy \;\to\; \iint_{\sigma} (\nabla \times \mathbf{V}) \cdot \mathbf{n}\,d\sigma . \tag{7.44}
\]

We thus arrive at Stokes' theorem:

\[
\oint_{\partial\sigma} \mathbf{V} \cdot d\mathbf{r} = \iint_{\sigma} (\nabla \times \mathbf{V}) \cdot \mathbf{n}\,d\sigma , \tag{7.45}
\]

where $\partial\sigma$ is the bounding curve of the surface $\sigma$. An important consequence of the structure of this equation is that, given a vector field V, the left-hand side is determined completely by the bounding curve, independent of the surface $\sigma$. To appreciate the significance of this, consider the three surfaces shown in Fig. 7.7. Each surface has the same bounding curve, namely, the unit circle in the x-y plane. The left-hand side of Stokes' theorem and, therefore, the right-hand side, is the same for all three surfaces! This highlights the fact that Stokes' theorem is a fundamental theorem of calculus for the curl in that the evaluation of the right-hand side of Eq. (7.45) is determined entirely by the nature of the boundary $\partial\sigma$.

Example. Consider the surface $\sigma$ given by the upper half-sphere of radius R:

\[
\sigma : \; x^2 + y^2 + z^2 = R^2 , \quad (z \ge 0) . \tag{7.46}
\]

The bounding curve is therefore given by the circle of radius R in the x-y plane:

\[
\partial\sigma : \; x^2 + y^2 = R^2 . \tag{7.47}
\]

Fig. 7.7: Three surfaces that have the same bounding curve in the x-y plane, which is shown emboldened: (a) an upper half-sphere, (b) a cylinder, and (c) a cone. For each of these surfaces and for a given vector field, the left-hand side of Stokes' theorem in Eq. (7.45) is identical.

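These quantities are shown in Fig. 7.7(a). We will evaluate both sides of Stokes' theorem in Eq. (7.45) for the vector field

\[
\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k} . \tag{7.48}
\]

We consider the left-hand side of Eq. (7.45) first. We have

\[
\mathbf{V} \cdot d\mathbf{r} = (-y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k}) \cdot (dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) = -y\,dx + x\,dy + z\,dz . \tag{7.49}
\]

On $\partial\sigma$, which lies in the x-y plane, where z = 0, this expression reduces to

\[
\mathbf{V} \cdot d\mathbf{r}\,\Big|_{\partial\sigma} = -y\,dx + x\,dy . \tag{7.50}
\]

In circular polar coordinates, $x = R\cos\phi$, $y = R\sin\phi$, where $0 \le \phi < 2\pi$, we have

\[
dx = -R\sin\phi\,d\phi , \qquad dy = R\cos\phi\,d\phi , \tag{7.51}
\]

from which we obtain

\[
\mathbf{V} \cdot d\mathbf{r}\,\Big|_{\partial\sigma} = -y\,dx + x\,dy = R^2\sin^2\phi\,d\phi + R^2\cos^2\phi\,d\phi = R^2\,d\phi . \tag{7.52}
\]

Thus, the left-hand side of Stokes' theorem evaluates to

\[
\oint_{\partial\sigma} \mathbf{V} \cdot d\mathbf{r} = \int_0^{2\pi} R^2\,d\phi = 2\pi R^2 . \tag{7.53}
\]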


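To evaluate the right-hand side of Stokes' theorem, we first calculate the curl of V:

\[
\nabla \times \mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ -y & x & z \end{vmatrix} = (1 + 1)\,\mathbf{k} = 2\,\mathbf{k} . \tag{7.54}
\]

The unit normal is given in Eq. (6.43):

\[
\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k} . \tag{7.55}
\]

Thus,

\[
(\nabla \times \mathbf{V}) \cdot \mathbf{n} = 2\,\mathbf{k} \cdot \left( \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} + \frac{z}{R}\,\mathbf{k} \right) = \frac{2z}{R} . \tag{7.56}
\]

We will evaluate the integral of this quantity over the upper half-sphere in spherical polar coordinates, where $z = R\cos\theta$ and $d\sigma = R^2\sin\theta\,d\theta\,d\phi$:

\[
\iint_{\sigma} (\nabla \times \mathbf{V}) \cdot \mathbf{n}\,d\sigma = R^2 \int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta \left( \frac{2 R\cos\theta}{R} \right) d\theta = 4\pi R^2 \underbrace{\int_0^{\pi/2} \sin\theta\cos\theta\,d\theta}_{\frac{1}{2}\sin^2\theta\,\big|_0^{\pi/2} \,=\, \frac{1}{2}} = 2\pi R^2 , \tag{7.57}
\]

which agrees with Eq. (7.53).

Example. Consider the surface $\sigma$ given by a cylinder of radius R and height H:

\[
\sigma : \; x^2 + y^2 = R^2 , \quad (0 \le z \le H) . \tag{7.58}
\]

The bounding curve is again the circle of radius R in the x-y plane:

\[
\partial\sigma : \; x^2 + y^2 = R^2 . \tag{7.59}
\]

These quantities are shown in Fig. 7.7(b). We again have the vector field

\[
\mathbf{V} = -y\,\mathbf{i} + x\,\mathbf{j} + z\,\mathbf{k} . \tag{7.60}
\]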

Since the bounding curve and the vector field are the same as in the preceding example, the evaluation of the left-hand side of Stokes' theorem again yields the value $2\pi R^2$. To evaluate the right-hand side of Stokes' theorem, we again have that $\nabla \times \mathbf{V} = 2\,\mathbf{k}$. The unit normal to the top of the cylinder is n = k, so for this part of the surface integral we have

\[
(\nabla \times \mathbf{V}) \cdot \mathbf{n} = 2 . \tag{7.61}
\]

The integral of this quantity over the top of the cylinder is

\[
\iint (\nabla \times \mathbf{V}) \cdot \mathbf{n}\,d\sigma = 2 \int_0^R \int_0^{2\pi} r\,dr\,d\phi = 2\pi R^2 . \tag{7.62}
\]

For the integral over the sides of the cylinder, we have that the unit normal is

\[
\mathbf{n} = \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} , \tag{7.63}
\]

and, therefore, we find that

\[
(\nabla \times \mathbf{V}) \cdot \mathbf{n} = 2\,\mathbf{k} \cdot \left( \frac{x}{R}\,\mathbf{i} + \frac{y}{R}\,\mathbf{j} \right) = 0 . \tag{7.64}
\]

Thus, the integral over the sides of the cylinder vanishes and the total integral over the surface of the cylinder is given by the integral over the top of the cylinder, which is independent of the height of the cylinder, and equal to $2\pi R^2$.
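The surface independence in these two examples can be checked numerically. The following is a minimal sketch under stated assumptions: the curl of V = -y i + x j + z k is taken as the constant vector 2k, the angles are named `theta` (polar) and `phi` (azimuthal), all integrals use the midpoint rule, and the function names and the values R = 2, H = 3 and H = 7 are illustrative.

```python
import math


def flux_hemisphere(R, n=400):
    """Integral of (curl V) . n = 2z/R over the upper half-sphere, with
    z = R cos(theta) and area element R^2 sin(theta) dtheta dphi."""
    dtheta = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        integrand = 2.0 * math.cos(theta)            # 2z/R on the sphere
        total += integrand * R * R * math.sin(theta) * dtheta
    return 2 * math.pi * total                       # the phi integral gives 2*pi


def flux_cylinder(R, H, n=400):
    """Flux of curl V = 2k through the top disc plus the side wall of the cylinder."""
    dr, dphi = R / n, 2 * math.pi / n
    curl = (0.0, 0.0, 2.0)
    # Top disc: n = k, area element r dr dphi.
    top = sum(curl[2] * (i + 0.5) * dr * dr for i in range(n)) * 2 * math.pi
    # Side wall: n = (cos(phi), sin(phi), 0), area element R dphi dz,
    # with the z integral over [0, H] carried out as a factor of H.
    side = 0.0
    for k in range(n):
        phi = (k + 0.5) * dphi
        normal = (math.cos(phi), math.sin(phi), 0.0)
        dot = sum(c * m for c, m in zip(curl, normal))
        side += dot * R * dphi * H
    return top + side


def boundary_circulation(R, n=400):
    """Line integral of -y dx + x dy around the circle of radius R."""
    dphi = 2 * math.pi / n
    return sum((R * math.sin((k + 0.5) * dphi)) ** 2 * dphi
               + (R * math.cos((k + 0.5) * dphi)) ** 2 * dphi
               for k in range(n))


R = 2.0
print(boundary_circulation(R), flux_hemisphere(R),
      flux_cylinder(R, 3.0), flux_cylinder(R, 7.0))
```

All four printed values should be close to 2πR², with the two cylinder results identical because the side wall contributes nothing regardless of the height.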

7.4 Summary
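This chapter has introduced the curl of a vector field V = P(x, y, z) i + Q(x, y, z) j + R(x, y, z) k:

\[
\nabla \times \mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial_x & \partial_y & \partial_z \\ P & Q & R \end{vmatrix} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} . \tag{7.65}
\]

The curl represents the circulation density of the vector field and, because of the derivative operation, has an associated Fundamental Theorem of Calculus called Stokes' theorem:

\[
\oint_{\partial\sigma} \mathbf{V} \cdot d\mathbf{r} = \iint_{\sigma} (\nabla \times \mathbf{V}) \cdot \mathbf{n}\,d\sigma , \tag{7.66}
\]

for a surface $\sigma$ with a bounding curve $\partial\sigma$.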
