Pietro-Luciano Buono
Advanced Calculus
De Gruyter Graduate
Differential Calculus and Stokes' Theorem

Mathematics Subject Classification 2010
35-02, 65-02, 65C30, 65C05, 65N35, 65N75, 65N80
Author
Prof. Pietro-Luciano Buono
University of Ontario
Institute of Technology
2000 Simcoe St North
Oshawa ON L1H 7K4
Canada
luciano.buono@uoit.ca
ISBN 978-3-11-043821-5
e-ISBN (PDF) 978-3-11-043822-2
e-ISBN (EPUB) 978-3-11-042911-4
Library of Congress Cataloging-in-Publication Data

A CIP catalog record for this book has been applied for at the Library of Congress.
Bibliografische Information der Deutschen Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2016 Walter de Gruyter GmbH, Berlin/Boston

Cover image: Pietro-Luciano Buono via Asymptote: The Vector Graphics Language
Printing and binding: CPI books GmbH, Leck
Printed on acid-free paper
Printed in Germany
www.degruyter.com
To Isabelle, for her love, her support and her patience.
Contents
Preface
1 Introduction
1.1 Review of Set Theory
1.2 Review of Linear Algebra
1.3 Coordinate systems
1.4 Functions and Mappings: including partial derivatives
1.5 Parametric representation of curves
1.6 Quadrics
2 Calculus of Vector Functions
2.1 Derivatives and Integrals
2.2 Best Linear Approximation and Tangent Lines
2.3 Reparametrizations and arc-length parameter
3 Tangent Spaces and 1-forms
3.1 Tangent spaces
3.2 Differentials
3.3 1-forms
4 Line Integrals
4.1 Integration of 1-forms
4.2 Arc-length, Metrics and Applications
4.3 Line integrals of vector fields
5 Differential Calculus of Mappings
5.1 Graphs and Level Sets
5.2 Limits and Continuity
5.3 Best Linear Approximation and Derivatives
5.5 The Chain Rule
5.6 Higher Derivatives
5.7 Taylor expansions
6 Applications of Differential Calculus
6.1 Optimization
6.2 Parametrizations
6.3 Differential Operators
6.4 Application of Clairaut's theorem to 1-forms
7 Double and Triple Integrals
7.1 Area and Volume Forms
7.2 Double integrals
7.3 Green's Theorem
7.4 Three-dimensional domains
8 Wedge Products and Exterior Derivatives
8.1 More on Wedge Products
8.2 Differential Forms
8.3 Exterior Derivative
9 Integration of Forms
9.1 Pullbacks of k-forms: k = 1, 2, 3
9.2 Integrals of Forms: change of variables formula
9.3 Integrals on a surface
9.4 Orientation of Surfaces
9.5 General Pullback Formula
10 Stokes' Theorem and Applications
10.1 More on orientation of curves and surfaces
10.2 Stokes' Theorem
10.3 Stokes' Theorem for Vector Fields
Bibliography
Index
Preface
This book is an outgrowth of the notes I have been using to teach a one semester
Calculus III course at the University of Ontario Institute of Technology since 2012.
It is intended for students who have already completed at least one semester of
Elementary Linear Algebra and two semester long courses in Calculus.
The approach taken in this book is to take full advantage of Linear Algebra in
order to present the Calculus concepts in as much generality as possible. Because
of this bias towards using Linear Algebra, I decided also to go one step further
than many other books and introduce the concept of tangent space early in the
text, from which it is possible to define properly the differential of a function and,
from there, differential forms and pullbacks in the context of line integrals. In the
following chapters, those are generalized just enough to provide a unified treatment
of integration and the generalized Stokes' theorem in $\mathbb{R}^3$ (Green, Classical Stokes
and Divergence). Therefore, this book can also serve as a gentle introduction to
the theory of differential forms and prepare the reader to delve into more advanced
topics from differential geometry and mathematical physics.
The book begins with an introductory chapter on basic topics which are recur-
rent in the remainder of the book. Several of those topics may already be familiar
to some readers. In order to provide an incremental progression in the blending of
Linear Algebra and Calculus, sprinkled with differential forms theory, the next three
chapters discuss almost exclusively vector functions of one variable and culminate
with the Fundamental Theorem of Line Integrals. Chapter 3 is the cornerstone for
the whole book and it uses tangent vectors to vector functions to introduce tangent
spaces to curves, to $\mathbb{R}^n$ and to surfaces. It is then possible to define differentials and
1-forms as acting on tangent vectors. The second part of the book is made up of
Chapters 5 and 6 and focuses on differentiable mappings from $\mathbb{R}^n$ to $\mathbb{R}^m$. Chapters
7 through 9 introduce, in a blended way, additional concepts of differential form the-
ory along with the theory of multiple integrals. Finally, Chapter 10 puts the results
from the previous chapters together in the statement and proof of Stokes' theorem
(Green, Classical and Divergence) using differential forms and exterior derivatives.
The statement is also rewritten in terms of the classical differential operators.
One of the advantages I see in the unconventional ordering of topics adopted here
is that it is now possible to start introducing terminology in the context of curves
(i.e. one dimensional geometrical objects) which are typically easier to understand
and for which the calculations do not require the notational machinery needed with
more variables. After a discussion of mappings and especially the introduction of
the Jacobian, it is then possible to extend the differential form concepts to higher
dimensions. At the same time, this allows for a repetition of the new concepts and
terminology which is typically beneficial for learning. Another learning goal that
this text attempts to achieve is for the reader to start distinguishing between the
definition of mathematical concepts, the geometric content of the definition and the
computational formulae which are useful in most problem solving. It also provides
an algorithmic presentation of some computations which I hope will make their
usage more straightforward; one such example is for determining the arc-length
parametrization of a curve.
The content of this book has seen several versions since 2012 and I would
like to thank all my students who have ploughed through those various versions.
I am grateful to all of those that provided feedback on the presentation and
noticed mistakes, typos, etc. I would like to thank Eryn Frawley (Calculus III,
2013) who assisted me by producing a great number of figures using Tikz, solu-
tions to many problems and proofreading of chapters. I am also indebted to my
teaching assistants, and especially Jamil Jabbour, who have read major portions
of the material and challenged me on the presentation in several places. All fig-
ures were done using Tikz (https://sourceforge.net/projects/pgf/) and Asymptote
(http://asymptote.sourceforge.net). I would like to thank all the users of those soft-
ware packages for posting examples and code snippets. In particular, I am indebted
to the gallery maintained at http://asy.marris.fr/asymptote/index.html. I hope to
be able to give back soon to the community by making some of the code for the
figures of this book available online.
For any inquiries about this textbook, or to report errors or typos, please contact
me at: luciano.buono@uoit.ca.
Luciano Buono
Oshawa, June 2016.
1 Introduction
This chapter gives an overview of several of the topics necessary for the remainder of
the book. The first sections on Set Theory and Linear Algebra are review sections.
1.1 Review of Set Theory
We begin with a quick review of basic concepts from set theory with a focus on real
numbers R. A set of real numbers is an unordered collection of real numbers. One
denotes sets inside brackets in enumerative style as follows

$$A = \{1, 3, \pi, 1/\sqrt{2}\}.$$
The symbol $\in$ means "element of" and denotes the belonging of an element to a set.
We can also describe a set using a defining condition
$$B = \{x \in \mathbb{R} \mid x > 2\}$$
which is read as:
"$x$ is the placeholder for elements of $\mathbb{R}$ such that $x$ is greater than 2".
In the defining condition notation, to verify whether a number belongs to a
given set one has to check if the condition is satisfied. Is $\sqrt{3} \in B$? Let $x = \sqrt{3}$, then
$\sqrt{3} > 2$ is false; therefore, $\sqrt{3}$ is not an element of $B$ and it is denoted: $\sqrt{3} \notin B$. Sets
can have a finite number of elements as $A$ or an infinite number of elements as $B$.
Fig. 1.1. In bold, the set $B$.

An important type of sets on the real line are the intervals, defined as follows: Let
$a, b \in \mathbb{R}$ and $a < b$
$$[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}, \quad (a, b) = \{x \in \mathbb{R} \mid a < x < b\},$$
$$(a, b] = \{x \in \mathbb{R} \mid a < x \le b\}, \quad [a, b) = \{x \in \mathbb{R} \mid a \le x < b\}.$$
If $a = -\infty$ or $b = \infty$ then we always use $(a$ or $b)$.
Fig. 1.2. From left to right, the intervals $[a, b]$, $(a, b)$ and $(a, b]$.
A set $E$ is a subset of a set $F$ if every element of $E$ is also an element of $F$. We
then write $E \subset F$. Another notation is $E \subseteq F$ which allows for $E$ and $F$ to have
exactly the same elements, that is $E = F$.
Example 1.1.1. Let
$$E = \{x \in \mathbb{R} \mid x = 4n \text{ with } n \in \mathbb{N}\}$$
and $F$ be the set of even integers. Is $E \subset F$? To see this, one has to make sure
that every element of $E$ is an even integer. But $E$ contains an infinite number of
elements and so we use the defining condition instead. At this point, if one believes
that $E \subset F$, then we can proceed to verify it properly as follows: let $x = 4n$ for an
arbitrary natural number $n \in \mathbb{N}$; then $x = 2(2n)$ is divisible by 2 and so $x$ is an even
integer. Because $x$ can be chosen to be any element of $E$, this means $E \subset F$.
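Example 1.1.2. Let
$$E = \{x \in \mathbb{R} \mid x = p/q,\ p, q \in \mathbb{Z},\ p, q \le 10\}$$
and $F$ be the interval $(0, 1)$. Is $E \subset F$? In this case, one can check that for $p = 2 \in \mathbb{Z}$
and $q = 1 \in \mathbb{Z}$, $p, q \le 10$ and so $x = 2/1 \in E$. But $x = 2 > 1$ and so $x \notin F$. Thus,
$E \not\subset F$.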
The main operations on sets are the union, the intersection and the complement.
Let $A \subset \mathbb{R}$ and $B \subset \mathbb{R}$ be two sets. Then the union of $A$ and $B$ is
$$A \cup B := \{x \in \mathbb{R} \mid x \in A \text{ or } x \in B\}.$$
The intersection of $A$ and $B$ is
$$A \cap B := \{x \in \mathbb{R} \mid x \in A \text{ and } x \in B\}.$$
The complement of $A$ is
$$A^c := \{x \in \mathbb{R} \mid x \notin A\}.$$
In particular, one can check that for any two sets $A$ and $B$:
$$A \cap B \subset A \cup B.$$
One is often familiar with these operations in the context of Venn diagrams; see Figure 1.3.
Because sets are often given using defining conditions, the union and intersection
operations are given by adding "or" for unions and "and" for intersections to the
defining conditions.
Example 1.1.3. Let $A = \{x \in \mathbb{R} \mid x \text{ is an even integer}\}$ and $B = (3, 5)$. Then,
$$A \cup B = \{x \in \mathbb{R} \mid x \text{ is an even integer or } 3 < x < 5\},$$
$$A \cap B = \{x \in \mathbb{R} \mid x \text{ is an even integer and } 3 < x < 5\}.$$
For the complement, one writes for instance
$$A^c = \{x \in \mathbb{R} \mid x \notin A\} = \{x \in \mathbb{R} \mid x \text{ is not an even integer}\}.$$
Fig. 1.3. Venn diagram of three sets $A$, $B$ and $C$.
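The defining-condition notation translates directly into code: a set becomes a predicate (a function answering "is x in the set?"), and union, intersection and complement become "or", "and" and "not". The following Python sketch illustrates this with the sets of Example 1.1.3; the function names are illustrative choices and are not part of the text.

```python
# A set given by a defining condition is modelled as a predicate x -> True/False.

def is_even_integer(x):
    """Defining condition of A: x is an even integer."""
    return float(x).is_integer() and int(x) % 2 == 0

def in_open_3_5(x):
    """Defining condition of the interval (3, 5)."""
    return 3 < x < 5

def union(p, q):
    """Union: add 'or' between the two defining conditions."""
    return lambda x: p(x) or q(x)

def intersection(p, q):
    """Intersection: add 'and' between the two defining conditions."""
    return lambda x: p(x) and q(x)

def complement(p):
    """Complement: negate the defining condition."""
    return lambda x: not p(x)
```

For instance, `intersection(is_even_integer, in_open_3_5)(4)` evaluates the condition "x is an even integer and 3 < x < 5" at x = 4.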
Exercises
(1) Write the following descriptions of sets using defining conditions.

(a) Let $E$ be the set of elements of $\mathbb{R}$ that are squares of integers.
(b) Let $F \subset \mathbb{R}$ be the set of points with distance less than 1 from $\pi$.
(c) Let A be the set of integers with absolute value greater than 2.
(d) Let B be the set of rational numbers with denominator less than 30.
(2) Determine the union and intersection of the sets A, B, E, F of Exercise 1.
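(3) For each problem, determine whether $E \subset F$ or not. Explain.
(a) Let $E = \{x \in \mathbb{R} \mid |x - 3| < 2\}$ and $F = \{x \in \mathbb{R} \mid 2 < x < 4\}$.
(b) Let $E = \{x \in \mathbb{R} \mid x = \cos(n), \text{ for any } n \in \mathbb{Z}\}$ and $F = \{x \in \mathbb{R} \mid -1 \le x \le 1\}$.
(c) Let $E = \{x \in \mathbb{R} \mid |1 - x| > 0\}$ and $F = \{x \in \mathbb{R} \mid |x - 1| < 0\}$.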
1.2 Review of Linear Algebra
This review of linear algebra focuses on the definitions and results revolving around
the concept of vector space and on geometrical aspects of linear algebra. For review
of solution of linear systems using row reduction, matrices, inverses of matrices, and
others, the reader can consult their favourite textbook on linear algebra.
One, two and three dimensional Euclidean spaces are the line, the plane and the
ambient space familiar to us. They are represented mathematically as $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$.
The space $\mathbb{R}^n$ is defined as the set of all ordered collections of $n$ real numbers written as
$$(x_1, x_2, \dots, x_n).$$
For $n = 2$ and $n = 3$ we have
$$(x_1, x_2) \quad \text{and} \quad (x_1, x_2, x_3)$$
where $x_1, x_2, x_3$ are any real numbers.

Euclidean spaces are vector spaces; that is, all elements of $\mathbb{R}^n$ satisfy the following
two conditions.
(1) For any two elements $(x_1, x_2, \dots, x_n)$ and $(y_1, y_2, \dots, y_n)$ in $\mathbb{R}^n$, the sum
$$(x_1, x_2, \dots, x_n) + (y_1, y_2, \dots, y_n) = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n)$$
is also in $\mathbb{R}^n$.
(2) For $a \in \mathbb{R}$ and $(x_1, x_2, \dots, x_n) \in \mathbb{R}^n$,
$$a(x_1, x_2, \dots, x_n) = (ax_1, \dots, ax_n) \in \mathbb{R}^n.$$
In general, let $V \subset \mathbb{R}^n$; then $V$ is a vector space (or a vector subspace of $\mathbb{R}^n$) if for
all elements $v_1, v_2 \in V$ and $a \in \mathbb{R}$ the following two properties are satisfied:
(1) $v_1 + v_2 \in V$
(2) $a v_1 \in V$.
If a vector space $W$ is a subset of a vector space $V$, we say that $W$ is a vector
subspace of $V$.
Example 1.2.1. The subset
$$V = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_3 = 2x_1 + x_2\} \subset \mathbb{R}^3$$
is a vector subspace. We need to check the two conditions. Two general elements of
$V$ must satisfy the defining condition. We write
$$v_1 = (a_1, a_2, 2a_1 + a_2) \quad \text{and} \quad v_2 = (b_1, b_2, 2b_1 + b_2).$$
Then,
$$v_1 + v_2 = (a_1, a_2, 2a_1 + a_2) + (b_1, b_2, 2b_1 + b_2) = (a_1 + b_1, a_2 + b_2, 2(a_1 + b_1) + (a_2 + b_2)).$$
Therefore, the third component of the sum satisfies the defining condition for $V$.
Now check for $c \in \mathbb{R}$
$$c v_1 = c(a_1, a_2, 2a_1 + a_2) = (ca_1, ca_2, 2(ca_1) + ca_2)$$
and here we also see that the third component satisfies the defining condition for $V$.
Therefore, $V$ is a vector space.
Example 1.2.2. The subset
$$W = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 = x_1 + 1\} \subset \mathbb{R}^2$$
is not a vector subspace of $\mathbb{R}^2$. Two general elements of $W$ must satisfy the defining
condition. We write
$$w_1 = (a_1, a_1 + 1) \quad \text{and} \quad w_2 = (b_1, b_1 + 1).$$
Then,
$$w_1 + w_2 = (a_1, a_1 + 1) + (b_1, b_1 + 1) = (a_1 + b_1, (a_1 + b_1) + 2).$$
The second component of the sum is $(a_1 + b_1) + 2$ rather than $(a_1 + b_1) + 1$, so it does
not satisfy the defining condition and $W$ is not a vector space.
The elements of a vector space $V$ are called vectors. A linear combination of a
collection of $k$ vectors $v_1, v_2, \dots, v_k$ of a vector space $V$ is given by
$$a_1 v_1 + a_2 v_2 + \dots + a_k v_k$$
for some real numbers $a_1, a_2, \dots, a_k$ called the coefficients of the linear combination.
Example 1.2.3. Consider the vectors $v_1 = (1, 0, 2)$, $v_2 = (-3, -1, 5)$ and $v_3 =
(-0.2, 1, 0.6)$ in $\mathbb{R}^3$. Then,
$$(-1.1)v_1 + v_2 + (-1)v_3$$
is a linear combination of $v_1, v_2, v_3$.
We now recall the definitions of linear dependence and linear independence of vectors.
Consider the following example first.
Example 1.2.4. Let $v_1 = (1, 2)$ and $v_2 = (1.3, 2.6)$, then we can see that $v_2 =
(1.3)v_1$. We can rewrite this as a linear combination
$$(1.3)v_1 + (-1)v_2 = 0.$$
That is, the linear combination is equal to zero, but the coefficients of the linear
combination are not zero.
A collection of vectors $v_1, \dots, v_k$ in a vector space $V$ is linearly dependent if there
exists a linear combination
$$a_1 v_1 + a_2 v_2 + \dots + a_k v_k = 0$$
for which not all the coefficients are zero. A collection of vectors $v_1, \dots, v_k$ is linearly
independent if the only way to have a linear combination
$$a_1 v_1 + a_2 v_2 + \dots + a_k v_k = 0$$
is for $a_1 = a_2 = \dots = a_k = 0$.
Example 1.2.5. The vectors $v_1 = (1, 0, 2)$, $v_2 = (-3, -1, 5)$ and $v_3 = (-0.2, 1, 0.6)$ of
Example 1.2.3 are linearly independent. One can see this by writing
$$a_1 v_1 + a_2 v_2 + a_3 v_3 = (a_1 - 3a_2 - 0.2a_3,\ -a_2 + a_3,\ 2a_1 + 5a_2 + 0.6a_3) = 0$$
from which we obtain $a_3 = a_2$ by looking at the second component. Substituting
in the first component we obtain $a_1 - 3a_2 - 0.2a_2 = 0$ which means $a_1 = 3.2a_2$.
Substituting in the third component we have $2(3.2)a_2 + 5a_2 + 0.6a_2 = 12a_2 = 0$. But
this forces $a_2 = 0$ and therefore $a_1 = a_3 = 0$. The only way for a linear combination
of $v_1, v_2, v_3$ to equal zero is for all the coefficients to be zero.
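For three vectors in $\mathbb{R}^3$, a quick numerical cross-check of linear independence uses a standard fact not stated above: the vectors are linearly independent exactly when the determinant of the matrix having them as rows is nonzero. The Python sketch below is only an illustration; the function names, the tolerance `tol` and the sample vectors in the usage note are assumptions made here and are not taken from the text.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose rows are u, v, w (expansion along the first row)."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def linearly_independent(u, v, w, tol=1e-12):
    """True if the three vectors of R^3 are linearly independent (nonzero determinant)."""
    return abs(det3(u, v, w)) > tol
```

With the canonical basis vectors $(1,0,0)$, $(0,1,0)$, $(0,0,1)$ the function reports independence, while for $(1,0,2)$, $(0,1,1)$ and their sum $(1,1,3)$ the determinant vanishes and the vectors are dependent.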
For a collection of vectors $v_1, v_2, \dots, v_k \in V$, the span of $v_1, v_2, \dots, v_k$ is the set of
all linear combinations of $v_1, \dots, v_k$. We write
$$\mathrm{span}(v_1, \dots, v_k) := \{a_1 v_1 + a_2 v_2 + \dots + a_k v_k \mid a_1, a_2, \dots, a_k \in \mathbb{R}\}.$$
Consider the following example.
Example 1.2.6. The span of $v_1 = (1, 0, 2)$ and $v_2 = (-3, 1, 5)$ is
$$\mathrm{span}(v_1, v_2) = \{a_1(1, 0, 2) + a_2(-3, 1, 5) \mid a_1, a_2 \in \mathbb{R}\} = \{(a_1 - 3a_2,\ a_2,\ 2a_1 + 5a_2) \mid a_1, a_2 \in \mathbb{R}\}.$$
This last line gives an explicit expression for elements of the span. One can now verify
whether a vector belongs to the span of $v_1, v_2$ by checking explicitly. For instance,
is $v_4 = (-4, 1, 3)$ an element of $\mathrm{span}(v_1, v_2)$? We need
$$-4 = a_1 - 3a_2, \quad 1 = a_2, \quad 2a_1 + 5a_2 = 3.$$
We see that $a_2 = 1$ and $a_1 = -1$ solve the three equations. That is, $v_4 \in \mathrm{span}(v_1, v_2)$.
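The membership check above amounts to solving two of the component equations for the coefficients and then verifying the remaining ones. A minimal Python sketch of that procedure for the span of two vectors follows; it uses Cramer's rule on a pair of components, which is a choice made here rather than the text's method, and the function name and sample vectors are hypothetical.

```python
def span_coefficients(v1, v2, target, tol=1e-12):
    """Return (a1, a2) with a1*v1 + a2*v2 == target, or None if target is not in span(v1, v2).

    Sketch only: assumes v1 and v2 are linearly independent.
    """
    n = len(target)
    for i in range(n):
        for j in range(i + 1, n):
            # 2x2 determinant of the system formed by components i and j
            d = v1[i] * v2[j] - v1[j] * v2[i]
            if abs(d) > tol:
                # Cramer's rule for the two chosen equations
                a1 = (target[i] * v2[j] - target[j] * v2[i]) / d
                a2 = (v1[i] * target[j] - v1[j] * target[i]) / d
                # the candidate coefficients must satisfy every component equation
                ok = all(abs(a1 * v1[k] + a2 * v2[k] - target[k]) <= tol
                         for k in range(n))
                return (a1, a2) if ok else None
    return None  # no invertible 2x2 subsystem: v1 and v2 are linearly dependent
```

For the illustrative vectors $v_1 = (1, 0, 2)$ and $v_2 = (0, 1, 1)$, the target $(2, 3, 7)$ lies in the span with coefficients $a_1 = 2$, $a_2 = 3$, whereas $(2, 3, 8)$ does not.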
Recall the following result concerning span of vectors.
Proposition 1.2.7. Let $v_1, \dots, v_k$ be a collection of vectors in the vector space $V$.
Then, $\mathrm{span}(v_1, \dots, v_k)$ forms a vector subspace of $V$.
We now have all the ingredients to discuss the concepts of basis and dimension. A
set of vectors $B = \{v_1, \dots, v_k\}$ in a vector space $V$ is a basis for $V$ if
(1) the vectors in $B$ are linearly independent, and
(2) $V = \mathrm{span}(B)$.
Example 1.2.8. If $V = \mathbb{R}^n$, the set $B = \{e_1, \dots, e_n\}$ where
$$e_j = (0, \dots, 1, \dots, 0) \quad \text{(with the 1 in the $j$th position)}$$
is a basis for $\mathbb{R}^n$. It is called the canonical basis of $\mathbb{R}^n$. The linear independence is
straightforward to check. For $n = 2$, we have
$$e_1 = (1, 0) \quad \text{and} \quad e_2 = (0, 1).$$
Let $v = (x_1, x_2)$ be an arbitrary element of $\mathbb{R}^2$, then $v$ is in the span of $B$:
$$v = (x_1, x_2) = x_1 e_1 + x_2 e_2.$$
Fig. 1.4. Canonical basis of $\mathbb{R}^2$: $\{e_1, e_2\}$.
The elements $x_1, x_2$ are called the coordinates of $v$. For general $n$, any element
$w = (x_1, \dots, x_n) \in \mathbb{R}^n$ can be written uniquely as
$$w = x_1 e_1 + x_2 e_2 + \dots + x_n e_n$$
where $x_1, \dots, x_n$ are said to be coordinates of $w$.
The dimension of a vector space $V$ is given by the number of elements of a basis of $V$.
The choice of basis is not unique. For the Euclidean spaces, the canonical basis
is the preferred one and other bases are used typically in special circumstances when
one wants to single out a particular geometrical feature.
Example 1.2.9. We are interested in the subspace of $\mathbb{R}^2$ spanned by the vector $d_1 =
(1, 1)$. This subspace is the bisector line of the first and third quadrants. A basis of
$\mathbb{R}^2$ which includes $d_1$ is $\{d_1, e_2\}$. For this we check the two properties of a basis: (1)
for $a, b \in \mathbb{R}$, $a d_1 + b e_2 = (a, a + b) = 0$ forces $a = b = 0$, so the vectors are
linearly independent, and (2) for any $(x_1, x_2) \in \mathbb{R}^2$,
$$(x_1, x_2) = x_1 d_1 + (x_2 - x_1) e_2.$$
Thus, $\{d_1, e_2\}$ spans $\mathbb{R}^2$.
We now consider the concept of length of vectors in $\mathbb{R}^n$ when written in the canonical
basis. This is given by the norm of a vector $v = (x_1, x_2, \dots, x_n)$ defined by
$$\|v\| := \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}.$$
In $\mathbb{R}^2$, this corresponds to the hypotenuse of the right angle triangle with sides $x_1$
and $x_2$.
In Example 1.2.9, a choice which may seem more natural is to take the vector
$d_2 = (1, -1)$ which spans the bisector of the second and fourth quadrants. The
vectors $d_1$ and $d_2$ are at right angles, just as the canonical basis elements. The
dot product or scalar product (also called inner product) of two vectors $v$ and
$w$ written in the canonical basis is denoted by $v \cdot w$ and defined as follows: let
$v = (x_1, x_2, \dots, x_n)$ and $w = (y_1, y_2, \dots, y_n)$ then
$$v \cdot w = x_1 y_1 + x_2 y_2 + \dots + x_n y_n.$$
Fig. 1.5. The vector $v = (x_1, x_2)$ has norm $\|v\| = \sqrt{x_1^2 + x_2^2}$.
This formula comes from the following argument. If $v$ and $w$ are at right angles or
perpendicular, then by Pythagoras' theorem (valid in $\mathbb{R}^n$) we have
$$\|v\|^2 + \|w\|^2 = \|v - w\|^2$$
$$\|v\|^2 + \|w\|^2 = (x_1 - y_1)^2 + \dots + (x_n - y_n)^2$$
$$\|v\|^2 + \|w\|^2 = \|v\|^2 + \|w\|^2 - 2(x_1 y_1 + \dots + x_n y_n).$$
Fig. 1.6. The vector $v - w$.
Therefore, the left and right hand sides are equal if and only if
$$x_1 y_1 + \dots + x_n y_n = 0 = v \cdot w.$$
Two vectors $v, w$ are orthogonal (i.e. perpendicular) if and only if $v \cdot w = 0$. A basis
$B$ for a vector space $V$ is orthogonal if all basis elements are mutually orthogonal. If
in addition, the vectors in $B$ have norm 1, the basis is called orthonormal.
Example 1.2.10. Consider $\mathbb{R}^2$ with the basis $\{d_1, e_2\}$ of Example 1.2.9. Then, $d_1 \cdot
e_2 = 1$ and so this basis is not an orthogonal basis. If instead we choose the basis
$\{d_1, d_2\}$, then $d_1 \cdot d_2 = 0$ and so this forms an orthogonal basis, but it is not
orthonormal because $\|d_1\| = \|d_2\| = \sqrt{2}$. To make it orthonormal, we need to form a
basis using $d_1' = d_1/\|d_1\|$ and $d_2' = d_2/\|d_2\|$.
It is a straightforward exercise to check that the canonical basis of $\mathbb{R}^n$ is an
orthonormal basis.
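The computations of Example 1.2.10 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper names `dot`, `norm` and `normalize` are chosen here, and $d_2 = (1, -1)$ is taken as the vector spanning the bisector of the second and fourth quadrants (its negative $(-1, 1)$ would give the same results).

```python
import math

def dot(v, w):
    """Dot product of two vectors written in the canonical basis."""
    return sum(x * y for x, y in zip(v, w))

def norm(v):
    """Norm (length) of v."""
    return math.sqrt(dot(v, v))

def normalize(v):
    """Vector of norm 1 in the direction of v (v must be nonzero)."""
    length = norm(v)
    return tuple(x / length for x in v)

d1, d2, e2 = (1, 1), (1, -1), (0, 1)
print(dot(d1, e2))    # nonzero, so {d1, e2} is not orthogonal
print(dot(d1, d2))    # zero, so {d1, d2} is orthogonal
print(norm(normalize(d1)), norm(normalize(d2)))  # both normalized vectors have norm 1 up to rounding
```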
An important property of the scalar product of $v$ and $w$ is that it gives the
product of the norm of one vector with the projection of the other vector along it.
If $v$ and $w$ are vectors written in some orthonormal basis, then
$$v \cdot w = \|v\| \, \|w\| \cos\theta,$$
where $\theta$ is the angle between $v$ and $w$.
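We continue with a discussion of matrices and linear mappings. Consider an $m \times n$
matrix $M$ of real numbers with entries labelled $a_{ij}$, where $i$ is the row label
and $j$ is the column label. We obtain
$$M = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
Let $v = (v_1, \dots, v_n)$ be a vector in $\mathbb{R}^n$. Recall that matrix-vector multiplication
$Mv$ is defined by
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} a_{11} v_1 + a_{12} v_2 + \dots + a_{1n} v_n \\ a_{21} v_1 + a_{22} v_2 + \dots + a_{2n} v_n \\ \vdots \\ a_{m1} v_1 + a_{m2} v_2 + \dots + a_{mn} v_n \end{pmatrix}.$$
In particular, matrix-vector multiplication has a linearity property. That is, for
vectors $v_1, v_2 \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$,
$$M(v_1 + \alpha v_2) = M v_1 + \alpha M v_2.$$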
The $m \times n$ matrices are examples of linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. Linear
transformations are often more easily described without the use of matrices as the
next example shows.
Example 1.2.11. Consider the spaces of polynomials of degree $\le 2$ and $\le 1$. We
write
$$P_2 = \{a_0 + a_1 x + a_2 x^2 \mid a_0, a_1, a_2 \in \mathbb{R}\} \quad \text{and} \quad P_1 = \{b_0 + b_1 x \mid b_0, b_1 \in \mathbb{R}\}.$$
Consider the differentiation operation $D$ applied to elements of $P_2$
$$D(a_0 + a_1 x + a_2 x^2) = a_1 + 2a_2 x \in P_1.$$
Therefore $D : P_2 \to P_1$ and we know from elementary calculus that differentiation
is a linear operation; that is, if $p, q \in P_2$ and $\alpha \in \mathbb{R}$ then
$$D(p + \alpha q) = D(p) + \alpha D(q).$$
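To make the linearity of $D$ concrete, a polynomial $a_0 + a_1 x + a_2 x^2$ can be stored as its coefficient tuple $(a_0, a_1, a_2)$, so that $D$ becomes the map $(a_0, a_1, a_2) \mapsto (a_1, 2a_2)$. The Python sketch below checks the linearity property on one pair of polynomials; the tuple representation, the helper `add_scaled` and the scalar name `alpha` are assumptions made for illustration.

```python
def D(p):
    """Differentiate a0 + a1*x + a2*x^2, given as (a0, a1, a2); returns (b0, b1) in P1."""
    a0, a1, a2 = p
    return (a1, 2 * a2)

def add_scaled(p, q, alpha):
    """Coefficients of p + alpha*q for two polynomials of the same degree bound."""
    return tuple(pi + alpha * qi for pi, qi in zip(p, q))

p = (1, 2, 3)   # 1 + 2x + 3x^2
q = (4, 5, 6)   # 4 + 5x + 6x^2
alpha = 2

left = D(add_scaled(p, q, alpha))        # D(p + alpha*q)
right = add_scaled(D(p), D(q), alpha)    # D(p) + alpha*D(q)
print(left == right)
```

A single numerical check is of course not a proof; it only illustrates the identity $D(p + \alpha q) = D(p) + \alpha D(q)$ on sample data.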
This example shows that using exclusively matrices to discuss linear transformations
is not optimal in all situations. Moreover, writing out a matrix requires one to fix
a basis for the vector spaces. Therefore, it is more appropriate to define the set of
linear transformations independently of bases as follows.
Definition 1.2.12. We say that $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if it satisfies
$T(v_1 + \alpha v_2) = T(v_1) + \alpha T(v_2)$ where $\alpha \in \mathbb{R}$ and $v_1, v_2 \in \mathbb{R}^n$. The set of all linear
transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$ is denoted by $L(\mathbb{R}^n, \mathbb{R}^m)$.
In this book, we shall often discuss linear transformations in an abstract way, without
specifying a basis, and so it is more convenient to adopt the linear transformation
formalism, rather than writing all our transformations in matrix form. Finally, recall
that two vector spaces $V$ and $W$ are isomorphic if there exists a linear transformation
$T : V \to W$ such that $T^{-1}$ exists. In particular, any two vector spaces of the same
finite dimension are automatically isomorphic.
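Returning to matrices, recall the definition of determinant for $2 \times 2$ and $3 \times 3$
matrices which we use repeatedly:
$$\det A = \det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21}$$
and
$$\det A = \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = a_{11}(a_{22} a_{33} - a_{23} a_{32}) - a_{12}(a_{21} a_{33} - a_{23} a_{31}) + a_{13}(a_{21} a_{32} - a_{22} a_{31}).$$
If no confusion with the absolute value is possible, we sometimes write $\det A = |A|$.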
We now conclude with some geometric properties of determinants.
Fig. 1.7. Vectors $v$ and $w$ and parallelogram $P(v, w)$.
If $v = (a, b)$ and $w = (c, d)$ are two vectors in the plane based at some point $q$, then
the area of the parallelogram $P_{vw}$ generated by $v$ and $w$ is given by the absolute
value of the determinant of the matrix
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
That is,
$$\mathrm{Area}(P_{vw}) = |ad - bc|.$$
There are various ways to check this result, but we do not pursue this here. However,
this result is important in understanding the area of a parallelogram in
three-dimensional space.
Another geometric way of obtaining the area of a parallelogram is as follows.
Consider the parallelogram $P_{vw}$ formed by two vectors $v$ and $w$ in the plane. The
area of this parallelogram is also given by the formula
$$\text{base of parallelogram} \times \text{height of parallelogram}.$$
Fig. 1.8. Parallelogram $P$ formed by the vectors $v$ and $w$.
The height is obtained by drawing a perpendicular from a vertex to the opposite side
as in Figure 1.8. The length of this height is, using basic trigonometry, $h = \|v\| \sin\theta$
where $\theta$ is the angle opposite the dashed line. Geometrically, this means
$$\mathrm{Area}(P_{vw}) = \|w\| \, \|v\| \sin\theta.$$
Another way to compute the area of a parallelogram is with the cross product,
also called vector product. The cross product is defined for pairs of vectors in $\mathbb{R}^3$
(or $\mathbb{R}^2$ if the same component of the vectors is zero). Let $v = (x_1, x_2, x_3)$ and
$w = (y_1, y_2, y_3)$, written in an orthonormal basis, then the cross product $v \times w$ is a
vector given by
$$v \times w := \left( \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix},\ -\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix},\ \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \right) \tag{1.1}$$
where $|\cdot|$ is the $2 \times 2$ determinant. This means that the components of the cross
product correspond to the oriented areas of the parallelograms obtained by the
projection of the parallelogram $P(v, w)$ onto the coordinate planes.
The cross product $v \times w$ is perpendicular to both $v$ and $w$ and so is perpendicular
to the plane in which $v$ and $w$ lie. This is verified by checking that
$$v \cdot (v \times w) = 0 \quad \text{and} \quad w \cdot (v \times w) = 0.$$
An important feature of the cross product is called its alternating property,
namely
$$v \times w = -(w \times v).$$
This reflects the fact that there are two vectors perpendicular to the plane containing
$v$ and $w$ and they are opposite of each other. Finally, let $P_{vw}$ be the parallelogram
generated by $v$ and $w$ then
$$\mathrm{Area}(P_{vw}) = \|v \times w\|.$$
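The properties of the cross product listed above are easy to check numerically. The following Python sketch implements the component formula with 2 x 2 determinants and verifies perpendicularity, the alternating property and the area formula on sample vectors; the function names and the sample vectors are illustrative assumptions, not taken from the text.

```python
import math

def cross(v, w):
    """Cross product of v = (x1, x2, x3) and w = (y1, y2, y3) via 2x2 determinants."""
    return (v[1] * w[2] - v[2] * w[1],
            -(v[0] * w[2] - v[2] * w[0]),
            v[0] * w[1] - v[1] * w[0])

def dot(v, w):
    """Dot product in the canonical basis."""
    return sum(x * y for x, y in zip(v, w))

def area_parallelogram(v, w):
    """Area of the parallelogram generated by v and w, computed as ||v x w||."""
    c = cross(v, w)
    return math.sqrt(dot(c, c))

v, w = (1, 2, 3), (4, 5, 6)
n = cross(v, w)
print(n, dot(v, n), dot(w, n))   # n is perpendicular to both v and w
print(cross(w, v))               # the opposite vector, by the alternating property

# For vectors in the xy-plane the area reduces to |ad - bc|.
a, b, c, d = 2, 1, 1, 3
print(area_parallelogram((a, b, 0), (c, d, 0)), abs(a * d - b * c))
```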
Exercises
(1) Show that the set
$$V = \{(x, y, z) \in \mathbb{R}^3 \mid z = 3x + 2y\}$$
is a vector subspace (i.e. check that the two conditions of a vector space are
satisfied).
(2) Show that a line $\ell$ in $\mathbb{R}^2$ is a vector subspace if and only if $\ell$ passes through the
origin.
(3) Use the scalar multiplication property of vector spaces to show that a closed
ball $B$ of any radius $r > 0$ cannot be a vector space. Explicitly
$$B = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 \le r^2\}.$$
(4) Consider the vectors given. Find out if they are linearly dependent or indepen-
dent. Compute the span of these vectors and find the dimension of the subspace
spanned by the vectors.
(a) v1 = (1, 1, 0), v2 = (0, 2, 0)
(b) v1 = (2, 1, 1), v2 = (0, 1, 3) and v3 = (2, 0, 4).
(c) v1 = (3, 4, 1, 0), v2 = (0, 1, 0, 1) and v3 = (2, 2, 0, 1).
(5) Determine if the following vectors are orthogonal.
(a) v1 = (1, 3, 1), v2 = (2, 1, 1).
(b) v1 = (2, 1, 0, 0), v2 = (0, 1, 1, 0).
(c) v1 = (1, 1, 2, 2), v2 = (1, 1, 1, 1).
(6) Find a vector (a, b, c, d) orthogonal to (3, 1, 0, 1).
(7) Find a vector v orthogonal to the subspace V of Exercise 1.
(8) Normalize the vectors of Exercise 5.

(9) Verify that $v_1 = (2/\sqrt{5}, 1/\sqrt{5}, 0)$, $v_2 = (0, 0, 1)$ and $v_3 = (1/\sqrt{5}, -2/\sqrt{5}, 0)$
form an orthonormal basis of $\mathbb{R}^3$.
(10) Recall the determinant formula to compute the cross product of two vectors.
Let $v = (x_1, y_1, z_1)$ and $w = (x_2, y_2, z_2)$ then
$$v \times w = \det \begin{pmatrix} i & j & k \\ x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \end{pmatrix}$$
where $i, j, k$ correspond to the basis vectors $e_1, e_2, e_3$ (notation used in physics).
Show that the result corresponds to formula (1.1).
(11) Show that for vectors in the $xy$-plane $v = (x_1, y_1, 0)$, $w = (x_2, y_2, 0)$, the only
possible nonzero component of the cross product $v \times w$ is in the $e_3$ direction.
(12) Compute the area of the parallelogram given by the vectors $v = (1, 0, 2)$ and
$w = (3, 1, 1)$.
(13) Show that for $\alpha \neq 0$, $(\alpha v) \times w = \alpha (v \times w)$.
(14) If $w = \alpha v$ for some $\alpha \neq 0$, show that $v \times w = 0$.
1.3 Coordinate systems
Coordinate systems are commonly used to identify locations whether on earth with
the longitudes and latitudes or for celestial objects in the night sky using the Ele-
vation/Azimuthal coordinate system. The mathematical definition of a coordinate
system used throughout this text is the following.
Definition 1.3.1. A coordinate system on $\mathbb{R}^n$ is a collection of $n$ families of curves
such that any point in $\mathbb{R}^n$ corresponds uniquely to the intersection of one curve from
each family.
Coordinate systems, as opposed to coordinates of a basis of a vector space, do not
need to be linear. In fact, among the commonly used coordinate systems only the
Cartesian coordinate system is made up exclusively of straight lines.
We define the following families of coordinate curves
$$F_i := \left\{ x^i_a(t_i) = (a_1, \dots, a_{i-1}, t_i, a_{i+1}, \dots, a_n) \;\middle|\; t_i \in \mathbb{R} \text{ and } a = (a_1, \dots, \hat{a}_i, \dots, a_n) \in \mathbb{R}^{n-1} \right\} \tag{1.2}$$
for $i = 1, \dots, n$ and where the hat denotes the missing coordinate in the vector. The
Cartesian coordinate system on $\mathbb{R}^n$ consists of the $n$ families of straight lines given
by (1.2). A point $p \in \mathbb{R}^n$ at the intersection of $n$ lines, one from each family $F_i$, with values $t_1^p, t_2^p, \dots, t_n^p$
has coordinates $(t_1^p, t_2^p, \dots, t_n^p)$.
Example 1.3.2. Let $u = 1.4e_1 + 1.7e_2 \in \mathbb{R}^2$, then $u$ is at the intersection of the lines
$x^1_{1.7}(t_1)$ and $x^2_{1.4}(t_2)$ with $t_1 = 1.4$ and $t_2 = 1.7$ and so has coordinates $(1.4, 1.7)$.
See Figure 1.9.
One notices from this example that the coordinates of a point $u$ given in the canonical
basis correspond to the coordinates of $u$ in the Cartesian coordinate system.
Because of this, the dividing line between the canonical basis and the Cartesian coordinate
system is often overlooked. However, the two should not be identified, as they
correspond to different mathematical objects: a set of vectors for the canonical basis
and a family of straight lines for the Cartesian coordinates.
Fig. 1.9. $u = (1.4, 1.7)$ at the intersection of two coordinate lines.
Fig. 1.10. Polar coordinate system showing four (equally spaced) radii and eight radial vectors.
The polar coordinate system in $\mathbb{R}^2$ is obtained from the Cartesian coordinate
system by defining a family of rays from the origin and a family of circles centered
at the origin. Let $p \in \mathbb{R}^2$ be at the intersection of the ray with angle $\theta_0$ from the
$x$-axis and the circle of radius $r_0$; in coordinates we write $p = (r_0, \theta_0)$. In terms of
Cartesian coordinates, the formulae for a ray at angle $\theta_0 \in [0, 2\pi)$ and a circle of
radius $r_0$ are respectively
$$r_{\theta_0}(t_1) = (t_1 \cos\theta_0, t_1 \sin\theta_0) \quad \text{and} \quad \theta_{r_0}(t_2) = (r_0 \cos(t_2), r_0 \sin(t_2)).$$
The polar coordinate system is an example of a curvilinear coordinate system,
because at least one family of curves is not made up of straight lines.
Example 1.3.3. Consider again the point $u = (1.4, 1.7) \in \mathbb{R}^2$. In polar coordinates,
the coordinate curves passing through $u$ are determined by the ray $r_{\theta_0}(t)$ joining the
origin to $u$ and the circle $\theta_{r_0}(t)$ passing through the point $u$. The angle $\theta_0$ is obtained
using basic trigonometry since
$$\tan(\theta_0) = \frac{1.7}{1.4};$$
and so $\theta_0 = \arctan(1.7/1.4) \approx 5\pi/18$. The radius of the circle is obtained from
Pythagoras' theorem: $r_0 = \sqrt{1.4^2 + 1.7^2} \approx 2.20$.
Fig. 1.11. $u = (1.4, 1.7)$ at the intersection of two coordinate curves in Cartesian and polar coordinates.
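The conversion carried out in Example 1.3.3 can be sketched in Python as follows. The function names are illustrative, and the code uses `math.atan2` rather than a plain arctangent of $y/x$: this is a choice made here so that the angle also comes out correctly when $x \le 0$, a case the example does not need.

```python
import math

def to_polar(x, y):
    """Polar coordinates (r, theta) of the point (x, y), with theta in [0, 2*pi)."""
    r = math.hypot(x, y)                        # sqrt(x^2 + y^2)
    theta = math.atan2(y, x) % (2 * math.pi)    # angle measured from the positive x-axis
    return r, theta

def to_cartesian(r, theta):
    """Cartesian coordinates of the point at radius r and angle theta."""
    return r * math.cos(theta), r * math.sin(theta)

r0, theta0 = to_polar(1.4, 1.7)
print(round(r0, 2))                  # radius of the circle through u
print(theta0, 5 * math.pi / 18)      # the angle and the approximation quoted in the example
print(to_cartesian(r0, theta0))      # recovers (1.4, 1.7) up to rounding
```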
We now look at the main curvilinear coordinate systems in $\mathbb{R}^3$ obtained via the
Cartesian coordinate system. The cylindrical coordinate system is described by
families of curves extending the polar coordinate system in the third dimension using
straight lines:
$$r_{\theta_0, z_0}(t) = (t \cos\theta_0, t \sin\theta_0, z_0), \quad \theta_{r_0, z_0}(t) = (r_0 \cos t, r_0 \sin t, z_0)$$
and $x^3_{(r_0, \theta_0)}(t) = (r_0 \cos\theta_0, r_0 \sin\theta_0, t)$. Figure 1.12 shows the radial and angular coordinates in
the plane and the vertical axis.
Fig. 1.12. Cylindrical coordinates: polar coordinates in the plane and vertical axis.
Note that there are equivalent coordinate systems
with polar coordinates in the other coordinate planes:
$$r_{\theta_0, x_0}(t) = (x_0, t \cos\theta_0, t \sin\theta_0), \quad \theta_{r_0, x_0}(t) = (x_0, r_0 \cos t, r_0 \sin t)$$
and $x^1_{(r_0, \theta_0)}(t) = (t, r_0 \cos\theta_0, r_0 \sin\theta_0)$. Similarly,
$$r_{\theta_0, y_0}(t) = (t \cos\theta_0, y_0, t \sin\theta_0), \quad \theta_{r_0, y_0}(t) = (r_0 \cos t, y_0, r_0 \sin t)$$
and $x^2_{(r_0, \theta_0)}(t) = (r_0 \cos\theta_0, t, r_0 \sin\theta_0)$.

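The spherical coordinate system is obtained by fixing a radius $\rho_0 > 0$ in $\mathbb{R}^3$ and defining two families of curves on a sphere of radius $\rho_0$ as follows
\[
\begin{aligned}
\rho_{\theta_0,\phi_0}(t) &= (t\cos\theta_0\sin\phi_0,\; t\sin\theta_0\sin\phi_0,\; t\cos(\phi_0)) \\
\theta_{\rho_0,\phi_0}(t) &= (\rho_0\cos t\sin\phi_0,\; \rho_0\sin t\sin\phi_0,\; \rho_0\cos(\phi_0)) \\
\phi_{\rho_0,\theta_0}(t) &= (\rho_0\cos\theta_0\sin t,\; \rho_0\sin\theta_0\sin t,\; \rho_0\cos t)
\end{aligned}
\tag{1.3}
\]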
The definitions of coordinate systems above give us formulae for the correspondence of points from a curvilinear coordinate system to the Cartesian coordinate system. If a point $p \in \mathbb{R}^2$ has Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$ then
\[ x = r\cos\theta, \quad y = r\sin\theta, \qquad r = \sqrt{x^2 + y^2}, \quad \theta = \arctan(y/x). \tag{1.4} \]
The relationship between the Cartesian coordinate system in $\mathbb{R}^3$ and the cylindrical coordinate system is a direct extension of the polar coordinate system equations. For spherical coordinates, equations (1.3) give us
\[ x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi \quad \text{and} \quad z = \rho\cos\phi. \]
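We obtain $\rho$, $\theta$, $\phi$ as functions of x, y, z as follows. The radius of a sphere is given by
\[
\begin{aligned}
x^2 + y^2 + z^2 &= (\rho\cos\theta\sin\phi)^2 + (\rho\sin\theta\sin\phi)^2 + (\rho\cos\phi)^2 \\
&= \rho^2\sin^2\phi\,(\cos^2\theta + \sin^2\theta) + \rho^2\cos^2\phi \\
&= \rho^2(\sin^2\phi + \cos^2\phi) \\
&= \rho^2,
\end{aligned}
\]
so $\rho = \sqrt{x^2 + y^2 + z^2}$. Notice that $y/x = \tan\theta$ which means
\[ \theta = \arctan(y/x). \]
Finally, $\cos\phi = z/\rho$ and writing $\rho = \sqrt{x^2 + y^2 + z^2}$ we obtain
\[ \phi = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right). \]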
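Fig. 1.13. Spherical coordinates
We summarize as follows; see Figure 1.13 for an illustration where the right-angled triangle has sides of length $\rho\sin\phi$, $\rho\cos\phi$ and hypotenuse $\rho$.
\[
\begin{aligned}
x &= \rho\cos\theta\sin\phi, & \rho &= \sqrt{x^2 + y^2 + z^2} \\
y &= \rho\sin\theta\sin\phi, & \theta &= \arctan\left(\frac{y}{x}\right) \\
z &= \rho\cos\phi, & \phi &= \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right).
\end{aligned}
\tag{1.5}
\]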
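The two directions of the correspondence (1.5) can be sketched as a pair of functions and checked with a round trip. This is an illustrative sketch only: the function names are hypothetical, and `math.atan2(y, x)` is substituted for $\arctan(y/x)$ as an assumption, so that the angle also comes out in the correct quadrant when $x \leq 0$.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """(rho, theta, phi) -> (x, y, z), following the left column of (1.5)."""
    x = rho * math.cos(theta) * math.sin(phi)
    y = rho * math.sin(theta) * math.sin(phi)
    z = rho * math.cos(phi)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """(x, y, z) -> (rho, theta, phi) for a point other than the origin."""
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x)      # assumption: atan2 instead of arctan(y/x)
    phi = math.acos(z / rho)      # angle measured from the positive z-axis
    return rho, theta, phi

point = (1.0, 2.0, 2.0)
rho, theta, phi = cartesian_to_spherical(*point)
back = spherical_to_cartesian(rho, theta, phi)
print("rho =", rho)
print("round trip:", back)
```

For the sample point the radius is $\sqrt{1 + 4 + 4} = 3$, and the round trip returns the original coordinates up to rounding.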
Here are some examples on how to perform the algebra from one coordinate system
to another.
Example 1.3.4. Consider the locus of points C in $\mathbb{R}^2$ which satisfies the equation $x^2 + y^2 = 2y$. We rewrite this locus of points using polar coordinates. We substitute for x and y to obtain
\[
\begin{aligned}
(r^2\cos^2\theta + r^2\sin^2\theta) &= 2r\sin\theta \\
r^2 &= 2r\sin\theta \\
r &= 2\sin\theta.
\end{aligned}
\]
C is shown in Figure 1.14; it is the circle of radius 1 centered at (0, 1).
Fig. 1.14. The circle C of radius 1 centered at (0, 1)
Consider this time an example in $\mathbb{R}^3$.
Example 1.3.5. We write the locus of points S in $\mathbb{R}^3$ given by $3x^2 + 3y^2 - z^2 = 1$ using the cylindrical and spherical coordinates. In cylindrical coordinates, one obtains
\[
\begin{aligned}
3(r^2\cos^2\theta) + 3(r^2\sin^2\theta) - z^2 &= 1 \\
3r^2 - z^2 &= 1.
\end{aligned}
\]
One can write, either
\[ z^2 = 3r^2 - 1 \quad \text{or} \quad r^2 = \frac{1}{3}(1 + z^2). \]
In spherical coordinates,
\[
\begin{aligned}
3(\rho^2\cos^2\theta\sin^2\phi) + 3(\rho^2\sin^2\theta\sin^2\phi) - (\rho^2\cos^2\phi) &= 1 \\
3\rho^2\sin^2\phi - \rho^2\cos^2\phi &= 1 \\
\rho^2(4\sin^2\phi - 1) &= 1
\end{aligned}
\]
where the last line is obtained by adding and subtracting $\rho^2\sin^2\phi$ and simplifying with the $\rho^2\cos^2\phi$ term. Therefore, one can write
\[ \rho^2 = \frac{1}{4\sin^2\phi - 1}. \]
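A quick numerical sanity check of this example: pick points that satisfy the cylindrical form $3r^2 - z^2 = 1$, convert them, and confirm that both the Cartesian equation and the spherical form $\rho^2(4\sin^2\phi - 1) = 1$ hold. The helper name and the sample values below are illustrative assumptions.

```python
import math

def surface_residuals(r, theta, sign=1.0):
    """Residuals of the Cartesian and spherical equations at a point of S.

    The point is built in cylindrical coordinates with z chosen so that
    3 r^2 - z^2 = 1, which requires 3 r^2 >= 1.
    """
    z = sign * math.sqrt(3.0 * r * r - 1.0)
    x, y = r * math.cos(theta), r * math.sin(theta)
    cartesian = 3 * x * x + 3 * y * y - z * z - 1.0
    rho = math.sqrt(x * x + y * y + z * z)
    phi = math.acos(z / rho)
    spherical = rho * rho * (4.0 * math.sin(phi) ** 2 - 1.0) - 1.0
    return cartesian, spherical

samples = [(r, th, s) for r in (0.6, 1.0, 2.5)
           for th in (0.0, 1.0, 4.0)
           for s in (1.0, -1.0)]
max_residual = max(abs(v) for args in samples for v in surface_residuals(*args))
print("largest residual:", max_residual)
```

Both residuals should vanish up to rounding, since $\rho^2 = r^2 + z^2$ and $\rho^2\sin^2\phi = r^2$ give $\rho^2(4\sin^2\phi - 1) = 3r^2 - z^2$.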
We conclude this section with coordinate systems in $\mathbb{R}^4$. Obvious choices are the Cartesian coordinate system given by $(x_1, x_2, x_3, x_4)$, but one can also take two sets of polar coordinates $(r_1, \theta_1, r_2, \theta_2)$ or a three dimensional coordinate system, say spherical, plus an additional Cartesian coordinate $(\rho, \theta, \phi, x_4)$. Several other options are possible.
Exercises
(1) Consider the disk D of radius $a > 0$ in the plane
\[ D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \leq a\}. \]
Write the definition of D in polar coordinates. What is the shape of D in the $(r, \theta)$ plane?
(2) Consider the region
\[ R = \left\{(r, \theta) \in \mathbb{R}^2 \;\middle|\; 2 < r < 3,\ \frac{\pi}{6} \leq \theta \leq \frac{2\pi}{3}\right\}. \]
Draw R in the $(r, \theta)$ plane. Describe R in Cartesian coordinates and draw the region in the $(x, y)$ plane.
(3) Write $x^2 + y^2 = 2xy$ in polar coordinates. Simplify if possible.
(4) Write $r = \dfrac{1}{1 - \cos\theta}$ in Cartesian coordinates. Simplify if possible.
(5) Write $z = x^2 + y^2$ in cylindrical coordinates. Simplify if possible.
(6) Write $z^2 = x^2 + y^2$ in spherical coordinates. Simplify if possible.
(7) Consider the region $R \subset \mathbb{R}^3$ given by
\[ R = \left\{(x, y, z) \;\middle|\; x^2 + y^2 \leq 1 - z^2,\ \frac{1}{2} < z < 1\right\}. \]
Describe this region in spherical coordinates and draw it.
(8) Look up hyperbolic coordinates online. Describe this coordinate system.
(9) Look up paraboloidal coordinates online. Describe this coordinate system.
(10) Search the internet for other coordinate systems. How do they relate to the
Cartesian coordinate system?
1.4 Functions and Mappings: including partial derivatives
One is typically familiar with functions of the form $y = f(x)$ where $x \in I \subset \mathbb{R}$. The set I is called the domain of f and the rule f assigns a unique value y for each $x \in I$; this forms the image or range of f. The general definition of a function follows the same pattern.
Definition 1.4.1. Let A, B be two sets and let f be a rule which assigns to every $a \in A$ a unique value $b = f(a) \in B$. Then the triplet $(A, B, f)$ is called a function, where A is called the domain of the function and B is the image or range of the function.
In this book, we consider functions of the following form. Let $U \subset \mathbb{R}^n$ and
\[ f : U \to \mathbb{R}^m \]
be the rule. As a shortcut, it is typical to refer to f as the function without mentioning the domain. However, one must always keep track of the domain of the function even if it is not explicitly stated. Several cases of n and m values are given special names, those are:
(1) n = m = 1: functions of one variable,

(2) n > 1 and m = 1: functions of several variables,
(3) n = 1 and m > 1: vector functions.
We look at a few examples.
Example 1.4.2. Here are some examples of functions of several variables.
$U = \mathbb{R}^2$ and $f(x, y) = x^2 + y^2$.
$U \subset \mathbb{R}^3$ and $g(\rho, \theta, \phi) = \rho^2 - \dfrac{1}{4\sin^2\phi - 1}$.
$U = \mathbb{R}^4$ and $h(x, y, z, w) = xyzw$.
Example 1.4.3. Consider now examples of vector functions.
$r(t) = (t, t^2)$
$r(t) = (\cos t, \sin t, e^t)$
$r(t) = (1 - t, |t|, 2t, \cos t)$.
For general n and m values, it is also customary to refer to $f : U \to \mathbb{R}^m$ as a mapping. Changes of coordinates between coordinate systems are very useful mappings. Here are a few examples.
Example 1.4.4. Consider the mapping $f : \mathbb{R}^2 \setminus (0, 0) \to \mathbb{R}^2$ given by
\[ f(r, \theta) = (r\cos\theta, r\sin\theta). \]
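Example 1.4.5. Let $A : \mathbb{R}^n \to \mathbb{R}^n$ be a linear mapping; i.e. a matrix. Suppose that $|A| \neq 0$; then A is a linear change of coordinates in $\mathbb{R}^n$. For instance, let $(x, y, z)$ be coordinates in $\mathbb{R}^3$ and
\[ A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
We can use A to make a change of coordinates
\[ \begin{pmatrix} u \\ v \\ w \end{pmatrix} = A \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]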
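As a small illustration of this linear change of coordinates, the sketch below takes the matrix entries as they appear above, checks that the determinant is nonzero, and applies the matrix to a sample point. The helper names `det3` and `apply_matrix` and the sample point are illustrative assumptions.

```python
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def apply_matrix(m, vec):
    """Matrix-vector product m * vec for a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * vec[j] for j in range(3)) for i in range(3)]

determinant = det3(A)
u, v, w = apply_matrix(A, [1.0, 2.0, 3.0])
print("det A =", determinant)
print("(u, v, w) =", (u, v, w))
```

With these entries the change of coordinates simply exchanges the first two coordinates and leaves the third one unchanged.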
Example 1.4.6. Another change of coordinates is from Cartesian to spherical coordinates. Those are seen in equation (1.5):
\[ f(x, y, z) = \left(\sqrt{x^2 + y^2 + z^2},\ \arctan\left(\frac{y}{x}\right),\ \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)\right). \]
The inverse transformation mapping is
\[ g(\rho, \theta, \phi) = (\rho\cos\theta\sin\phi,\ \rho\sin\theta\sin\phi,\ \rho\cos\phi). \]
1.4.1 Partial Derivatives
For functions of several variables of the form $f : \mathbb{R}^n \to \mathbb{R}$, there is a partial extension to the concept of derivative of a function: the so-called partial derivative, which we define in the case of a function of two variables. The general case is straightforward once this one is understood.
Consider $f(x, y)$ near a point $(x_0, y_0)$ in its domain. If the following limits exist
\[ \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \quad \text{and} \quad \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0} \]
we call those the partial derivatives of f with respect to x and with respect to y respectively. Those are denoted by
\[ \frac{\partial f}{\partial x}(x_0, y_0) := \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \]
and
\[ \frac{\partial f}{\partial y}(x_0, y_0) := \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0}. \]
Those can also be rewritten for an arbitrary point $(x, y)$ as:
\[ \frac{\partial f}{\partial x}(x, y) := \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \quad \frac{\partial f}{\partial y}(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}. \]
The partial derivatives of familiar functions are computed in the same way as deriva-
tives of functions of one variable because the other variable is considered fixed as
a constant. Moreover, all the differentiation rules apply in the same way: addition,
multiplication rule, quotient rule, chain rule.
Example 1.4.7. Let $f(x, y) = x^2 y\cos(xy) + x/(1 - y)$, then considering y fixed
\[ \frac{\partial f}{\partial x} = 2xy\cos(xy) - x^2 y\sin(xy)\,y + \frac{1}{1 - y} \]
and keeping x fixed
\[ \frac{\partial f}{\partial y} = x^2\cos(xy) - x^2 y\sin(xy)\,x + \frac{x}{(1 - y)^2}. \]
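Partial derivatives computed by hand can be checked against the limit definition by using a small but finite h. The sketch below does this with a central difference quotient, taking the last term of f as $x/(1 - y)$, which is the form the displayed derivatives correspond to. The sample point, the step size and all function names are illustrative assumptions.

```python
import math

def f(x, y):
    return x**2 * y * math.cos(x * y) + x / (1.0 - y)

def df_dx(x, y):
    # y is held fixed: product rule and chain rule in x
    return 2 * x * y * math.cos(x * y) - x**2 * y * math.sin(x * y) * y + 1.0 / (1.0 - y)

def df_dy(x, y):
    # x is held fixed: product rule and chain rule in y
    return x**2 * math.cos(x * y) - x**2 * y * math.sin(x * y) * x + x / (1.0 - y) ** 2

def central_difference(g, x, y, var, h=1e-6):
    """Approximate the partial derivative of g in x (var=0) or y (var=1)."""
    if var == 0:
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

point = (0.7, 0.3)
err_x = abs(df_dx(*point) - central_difference(f, *point, var=0))
err_y = abs(df_dy(*point) - central_difference(f, *point, var=1))
print("error in df/dx:", err_x)
print("error in df/dy:", err_y)
```

A large discrepancy in such a check usually points to a missed factor from the product or chain rule.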
For functions of n variables, the recipe is the same. Consider a function $f(x_1, \ldots, x_n)$ and a point $(x_{10}, \ldots, x_{n0}) \in \mathbb{R}^n$, then for $j = 1, \ldots, n$ the partial derivative is
\[ \frac{\partial f}{\partial x_j}(x_{10}, \ldots, x_{n0}) = \lim_{x_j \to x_{j0}} \frac{f(x_{10}, \ldots, x_j, \ldots, x_{n0}) - f(x_{10}, \ldots, x_{n0})}{x_j - x_{j0}} \]
if the limit on the right-hand side exists. Similarly, at an arbitrary point $(x_1, \ldots, x_n)$ we have
\[ \frac{\partial f}{\partial x_j}(x_1, \ldots, x_n) = \lim_{h \to 0} \frac{f(x_1, \ldots, x_j + h, \ldots, x_n) - f(x_1, \ldots, x_n)}{h}. \]
As explained above, the computation of partial derivatives in the case of n variables
follows the same rules as for n = 2.
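Example 1.4.8. We compute the partial derivatives of $f(x, y, z) = (xy + z^2)e^{yz}$:
\[ \frac{\partial f}{\partial x} = ye^{yz}, \quad \frac{\partial f}{\partial y} = xe^{yz} + (xy + z^2)ze^{yz}, \quad \frac{\partial f}{\partial z} = 2ze^{yz} + (xy + z^2)ye^{yz}. \]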
We now define an object known as the gradient and which is our first example of a
differential operator.
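Definition 1.4.9. Let $f(x_1, \ldots, x_n)$ be a function of several variables in Cartesian coordinates for which partial derivatives with respect to all variables exist. The gradient of f is defined as
\[ \nabla f(x_1, \ldots, x_n) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right). \]
If a function is not specified, we write
\[ \nabla := \left(\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n}\right). \]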
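To make the definition concrete, the sketch below assembles the gradient of the function of Example 1.4.8, $f(x, y, z) = (xy + z^2)e^{yz}$, from its three partial derivatives and compares it with a numerical gradient built from central difference quotients. The function names, the step size and the sample point are illustrative assumptions.

```python
import math

def f(x, y, z):
    return (x * y + z**2) * math.exp(y * z)

def gradient_exact(x, y, z):
    """Gradient assembled from the partial derivatives of Example 1.4.8."""
    e = math.exp(y * z)
    return (y * e,
            x * e + (x * y + z**2) * z * e,
            2 * z * e + (x * y + z**2) * y * e)

def gradient_numeric(func, point, h=1e-6):
    """Approximate each partial derivative by a central difference quotient."""
    grad = []
    for j in range(len(point)):
        forward, backward = list(point), list(point)
        forward[j] += h       # only the j-th variable is perturbed
        backward[j] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return tuple(grad)

p = (0.5, -0.2, 1.1)
exact = gradient_exact(*p)
approx = gradient_numeric(f, p)
max_error = max(abs(a - b) for a, b in zip(exact, approx))
print("exact  :", exact)
print("approx :", approx)
```

The loop in `gradient_numeric` mirrors the definition: one component per variable, each obtained by varying that variable alone.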
As we show in the example above, the chain rule applies in the same way as for func-
tions of one variable. However, there is an important formula concerning the chain
rule for functions of several variables which generalizes the regular chain rule from
elementary calculus. This formula is valuable for computing derivatives in concrete
examples, but it is mostly used theoretically and we refer to this formula often in
the following chapters.
Proposition 1.4.10. Consider a function of several variables $g(x_1, \ldots, x_n)$ and suppose that $x_j = \xi_j(u_1, \ldots, u_k)$ for $j = 1, \ldots, n$. Define
\[ f(u_1, \ldots, u_k) := g(\xi_1(u_1, \ldots, u_k), \ldots, \xi_n(u_1, \ldots, u_k)). \]
Then, for $i = 1, \ldots, k$ we have
\[ \frac{\partial f}{\partial u_i} = \frac{\partial g}{\partial x_1}\frac{\partial \xi_1}{\partial u_i} + \cdots + \frac{\partial g}{\partial x_n}\frac{\partial \xi_n}{\partial u_i}, \]
where the partial derivatives of g are evaluated at $x_j = \xi_j(u_1, \ldots, u_k)$ for $j = 1, \ldots, n$.
The proof can be obtained as a special case of a more general chain rule which we
present in Chapter 5 and we decide not to burden the presentation with lengthy
calculations at this stage.
Let us look at this formula for the cases of a few variables.
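Example 1.4.11. Let $g(x, y)$ with $x = \xi_1(u_1, u_2)$, $y = \xi_2(u_1, u_2)$ and $f(u_1, u_2) = g(\xi_1(u_1, u_2), \xi_2(u_1, u_2))$. Then
\[ \frac{\partial f}{\partial u_1} = \frac{\partial g}{\partial x}\frac{\partial \xi_1}{\partial u_1} + \frac{\partial g}{\partial y}\frac{\partial \xi_2}{\partial u_1}. \]
It is customary to label the functions of u with the variables x, y instead of $\xi_1$, $\xi_2$. That is, we write $f(u_1, u_2) = g(x(u_1, u_2), y(u_1, u_2))$ and
\[ \frac{\partial f}{\partial u_2} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial u_2} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial u_2}. \]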
We use the formula to compute the following partial derivatives.
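Example 1.4.12. Let $f(x, y, z) = x^2 + y^2 + xz^2$ and consider the change of variables in spherical coordinates $x = x(\rho, \theta, \phi) = \rho\cos\theta\sin\phi$, $y = y(\rho, \theta, \phi) = \rho\sin\theta\sin\phi$ and $z = z(\rho, \theta, \phi) = \rho\cos\phi$. We consider $f(x(\rho, \theta, \phi), y(\rho, \theta, \phi), z(\rho, \theta, \phi))$ and compute
\[ \frac{\partial f}{\partial \rho} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial \rho} \]
where the partial derivatives of f are evaluated at $x = x(\rho, \theta, \phi) = \rho\cos\theta\sin\phi$, $y = y(\rho, \theta, \phi) = \rho\sin\theta\sin\phi$ and $z = z(\rho, \theta, \phi) = \rho\cos\phi$. We obtain
\[
\begin{aligned}
\frac{\partial f}{\partial x} &= (2x + z^2)\big|_{x = x(\rho,\theta,\phi),\, z = z(\rho,\theta,\phi)} = 2\rho\cos\theta\sin\phi + \rho^2\cos^2\phi, \\
\frac{\partial f}{\partial y} &= 2y\,\big|_{y = y(\rho,\theta,\phi)} = 2\rho\sin\theta\sin\phi, \\
\frac{\partial f}{\partial z} &= 2xz\,\big|_{x = x(\rho,\theta,\phi),\, z = z(\rho,\theta,\phi)} = 2\rho^2\cos\theta\sin\phi\cos\phi,
\end{aligned}
\]
and
\[ \frac{\partial x}{\partial \rho} = \cos\theta\sin\phi, \quad \frac{\partial y}{\partial \rho} = \sin\theta\sin\phi, \quad \frac{\partial z}{\partial \rho} = \cos\phi. \]
Putting all those calculations together we obtain
\[ \frac{\partial f}{\partial \rho} = 2\rho\cos^2\theta\sin^2\phi + 2\rho\sin^2\theta\sin^2\phi + 3\rho^2\cos\theta\sin\phi\cos^2\phi. \]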
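The chain rule computation of this example can be cross-checked numerically: the sketch below evaluates $\partial f/\partial\rho$ three ways, namely term by term with the chain rule, with the final closed form, and with a central difference quotient in $\rho$ applied to the composed function. All function names, the step size and the sample point are illustrative assumptions.

```python
import math

def f(x, y, z):
    return x**2 + y**2 + x * z**2

def to_cartesian(rho, theta, phi):
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

def df_drho_chain(rho, theta, phi):
    """Sum of (partial of f) times (partial of the coordinate in rho)."""
    x, y, z = to_cartesian(rho, theta, phi)
    fx, fy, fz = 2 * x + z**2, 2 * y, 2 * x * z
    xr = math.cos(theta) * math.sin(phi)
    yr = math.sin(theta) * math.sin(phi)
    zr = math.cos(phi)
    return fx * xr + fy * yr + fz * zr

def df_drho_closed_form(rho, theta, phi):
    c, s = math.cos(theta), math.sin(theta)
    return (2 * rho * c**2 * math.sin(phi)**2
            + 2 * rho * s**2 * math.sin(phi)**2
            + 3 * rho**2 * c * math.sin(phi) * math.cos(phi)**2)

def df_drho_numeric(rho, theta, phi, h=1e-6):
    up = f(*to_cartesian(rho + h, theta, phi))
    down = f(*to_cartesian(rho - h, theta, phi))
    return (up - down) / (2 * h)

q = (1.3, 0.4, 0.9)
values = (df_drho_chain(*q), df_drho_closed_form(*q), df_drho_numeric(*q))
print(values)
```

The same three-way comparison can be repeated for the derivatives with respect to the two angles.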

1.4.2 Open and closed sets
Before we begin our discussion on functions, it is important to notice that sometimes

only some subset of Euclidean space may be of interest. Here is an important type
of subset.
Definition 1.4.13. A subset $U \subset \mathbb{R}^n$ is open if for all points $p \in U$, there exists a ball
\[ B_\epsilon(p) := \{x \in \mathbb{R}^n \mid \|x - p\| < \epsilon\} \]
of radius $\epsilon > 0$ containing p such that $B_\epsilon(p) \subset U$.
For $p \in \mathbb{R}^n$ and $\epsilon > 0$, a ball $B_\epsilon(p)$ is open. In $\mathbb{R}$, a ball is just an interval of length $2\epsilon$ centered at p. In $\mathbb{R}^2$ with the Euclidean norm, $B_\epsilon(p)$ is a disk of radius $\epsilon$ centered at p. In $\mathbb{R}^3$, $B_\epsilon(p)$ is a ball in the common sense of the word. In general, a ball of radius $\epsilon$ centered at p is the set of points whose distance to p is less than $\epsilon$.
Fig. 1.15. $B_\epsilon(p)$ is a disk of radius $\epsilon$.
Example 1.4.14. On the real line, let $a < b$ and consider an interval $(a, b)$. We verify using the definition that this interval is open. For each point p in the interval $(a, b)$, we must find a ball $B_\epsilon(p)$ of a certain radius $\epsilon > 0$ such that $B_\epsilon(p) \subset (a, b)$. The point p is either closer to a, closer to b or in the middle of the interval. Let $\delta$ be the smallest value of the distances between p and a and p and b. This is written
\[ \delta = \min(|p - a|, |p - b|). \]
See Figure 1.16 for an illustration. Let $\epsilon = \delta/2$ and choose an arbitrary point $x \in B_\epsilon(p)$. By definition of $B_\epsilon(p)$, we must have $|x - p| < \epsilon$. By choosing $\epsilon$ to be half the distance to the nearest boundary point, this makes sure that $x \in (a, b)$. Therefore, any point $x \in B_\epsilon(p)$ is also in $(a, b)$. This means $B_\epsilon(p) \subset (a, b)$. Because p is chosen arbitrarily in $(a, b)$ this completes the verification that $(a, b)$ is open.
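The choice of radius in this argument can be written as a small function: given p in (a, b), take half the distance to the nearest endpoint and confirm that the resulting ball stays inside the interval. This is only an illustrative sketch with hypothetical function names; checking finitely many points is not a proof.

```python
def ball_radius(p, a, b):
    """Half the distance from p to the nearest endpoint of (a, b)."""
    if not (a < p < b):
        raise ValueError("p must lie in the open interval (a, b)")
    delta = min(abs(p - a), abs(p - b))
    return delta / 2.0

def ball_inside_interval(p, a, b):
    """Check that the ball (p - eps, p + eps) is contained in (a, b)."""
    eps = ball_radius(p, a, b)
    return a < p - eps and p + eps < b

checks = [ball_inside_interval(p, 0.0, 1.0) for p in (0.001, 0.25, 0.5, 0.999)]
print(checks)
```

Calling `ball_radius` with an endpoint such as p = b raises an error, which reflects the fact that no admissible radius exists at a boundary point.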
Consider the following subsets of $\mathbb{R}$: $(a, b]$, $[a, b)$ and $[a, b]$. Those three sets are not open and this can also be seen using the definition. The problem lies in the inclusion of at least one boundary point in the set. Consider the first case $(a, b]$. Suppose $p = b$, then any ball $B_\epsilon(p)$ with $\epsilon > 0$ contains points x such that $b < x < b + \epsilon$. But this means $x \notin (a, b]$ and so $B_\epsilon(p) \not\subset (a, b]$ no matter how small $\epsilon > 0$ is chosen. Thus, $(a, b]$ is not open. The same applies automatically to the other cases. In particular, if $a = b$ then $[a, b]$ consists of one point and so singletons are not open sets. We now look at examples of sets in $\mathbb{R}^2$ and $\mathbb{R}^3$.
Example 1.4.15. Consider the set
\[ O := \{(x, y) \in \mathbb{R}^2 \mid x > y\}. \]
This corresponds to the half-plane shown in Figure 1.18. We now show it is an open set. The key here is the strict inequality used in the definition of the set. Let $p = (x_0, y_0) \in O$. Then, $x_0 > y_0$. Let $\delta$ be the distance between p and the bisector line $x = y$. Because $x_0 > y_0$, then $\delta > 0$. Choose $\epsilon = \delta/4$ and consider the ball
\[ B_\epsilon(p) = \{(x, y) \in \mathbb{R}^2 \mid \|(x, y) - (x_0, y_0)\| < \epsilon\}, \]
which is strictly contained in $B_\delta(p)$ and so strictly contained in O.
Fig. 1.16. Three possible cases of location of p in $(a, b)$: midpoint, right and left of midpoint
Fig. 1.17. $(a, b]$ with $p = b$.
Fig. 1.18. Balls $B_\epsilon(p)$ from Example 1.4.15
Here is an example of an open set in higher dimension defined in an abstract way.
Example 1.4.16. Consider the motion of three (celestial) bodies in $\mathbb{R}^3$ of mass $m_1, m_2, m_3$. Let $q_1, q_2, q_3 \in \mathbb{R}^3$ be the position vectors of the three bodies. The set of non-collision positions defined by
\[ C = \{(q_1, q_2, q_3) \in (\mathbb{R}^3)^3 \mid q_1 \neq q_2,\ q_2 \neq q_3 \text{ and } q_1 \neq q_3\} \]
is an open subset of $(\mathbb{R}^3)^3$.
An interval $[a, b]$ containing its boundary points is known as a closed interval. The concept of a closed set is trickier to define than that of an open set. Even on the real line, closed sets can have bizarre properties which are beyond the scope of this document. We do not define closed sets formally, but focus our attention on a special type of set which is a generalization of the closed interval.
Example 1.4.17. Consider an open set U in $\mathbb{R}^2$ and add the boundary to it to form a set F. The set F is a closed set. For instance, let
\[ U = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| < 1\}. \]
This is the diamond shape region shown in Figure 1.19. If we add the boundary, that is, the lines given by $|x| + |y| = 1$, the set
\[ F = \{(x, y) \in \mathbb{R}^2 \mid |x| + |y| \leq 1\} \]
is a closed set.
One way to find out if a set is closed is by using the following result.
Proposition 1.4.18. A set $U \subset \mathbb{R}^n$ is open if and only if its complement $U^c$ is closed.
Fig. 1.19. The open set U is inside the diamond bounded by the lines $-x + y = 1$, $x + y = 1$, $-x - y = 1$ and $x - y = 1$. The closed set F is composed of U and includes the boundary lines.
We omit the proof of this proposition as it is also beyond the scope of this text.
Example 1.4.19. Proposition 1.4.18 shows that the complement of the set O in Example 1.4.15 defined by
\[ O^c = \{(x, y) \in \mathbb{R}^2 \mid x \leq y\} \]
is closed. In this case, the portion with no boundary is not a problem; infinity acts as a boundary to this set.
Exercises
(1) Find the domain for each of the following mappings

(a) $f(x, y) = \dfrac{1}{x^2 - y^2}$.
(b) $g(r, \theta, z) = \sqrt{1 - r^2 - z^2}$. Describe the region geometrically.
(c) $h(\rho, \theta, \phi) = \dfrac{\rho^2 - 1}{1 - \cos\theta\sin\phi}$.
(d) $r(t) = (\sqrt{t^2 - 1}, \tan(t))$
(2) Compute all the partial derivatives and express the gradient of each function.
(a) $f(x, y) = \ln(\cos(xy))$
(b) $f(x, y) = \dfrac{xy}{x^2 + y^2}$
(c) $f(x, y, z) = (x^2 + y^2 + z^2)\cos\left(1/\sqrt{x^2 + y^2 + z^2}\right)$
(d) $f(x_1, x_2, x_3, x_4) = x_1 x_4 e^{x_1 x_2 x_3 x_4}$
(3) Compute the partial derivatives
\[ \frac{\partial f}{\partial \theta} \quad \text{and} \quad \frac{\partial f}{\partial \phi} \]
in Example 1.4.12 using the formula of Proposition 1.4.10. Write explicitly the composition of f with the functions of $\rho$, $\theta$ and $\phi$ and compute the partial derivatives with respect to $\rho$, $\theta$ and $\phi$ using the regular chain rule for functions of one variable. If you do not obtain the same thing with the two methods, verify your calculations.
(4) Determine whether the sets below are open, closed or neither.
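(a) $A = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \leq 1,\ y > 0\}$
(b) $B = \{(x, y) \in \mathbb{R}^2 \mid 1 \leq x \leq 2,\ 3 \leq y \leq 4\}$
(c) $C = \{(x, y) \in \mathbb{R}^2 \mid x - y \neq 0\}$
(d) $D = \{(r, \theta) \in \mathbb{R}^2 \mid r > 1,\ 0 \leq \theta \leq \pi\}$
(e) $E = \{(r, \theta) \in \mathbb{R}^2 \mid 1 \leq r < 2\}$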
(5) The union of two open sets is an open set. Give an explanation (or a proof if
you can) of why this is true.
(6) The intersection of two closed sets is a closed set. Give an explanation (or a
proof if you can) of why this is true.
1.5 Parametric representation of curves
In this section, we begin the study of curves defined using so-called parametric representations. A curve C in $\mathbb{R}^n$ is a geometric object which can be described by a unique number; its parameter. Let $t \in \mathbb{R}$, a curve C in $\mathbb{R}^n$ has parametric representation given by
\[ x_1(t), \ldots, x_n(t) \]
where $t \in [a, b]$ with $a < b$, where $a = -\infty$ and $b = \infty$ are allowed.
Remark 1.5.1. Note that for a given curve C, the choice of parametric representa-
tion is not unique. Therefore, one has to distinguish between the geometric object C
and the possible representations that can describe C.
The graphical representation of the real-line comes with an orientation; the arrow
points to the right and this determines the direction of increasing real numbers. The
choice of pointing to the right is arbitrary and is the convention adopted, possibly
unanimously, and is called the positive orientation of the real-line. If the arrow points
to the left, so that positive numbers increase in that direction, we talk about negative
orientation. An orientation of a curve C is the orientation given by a consistent
direction given by an arrow along the length of C.
For a curve C with parametric representation $r(t)$ with $t \in [a, b]$, the orientation of C is given by the direction of travel along the curve C as t increases from a to b.
Fig. 1.20. Parabola $(t, t^2)$ with $t \in [0, 1]$.
Example 1.5.2. Let $y = f(x)$ where $f : [a, b] \to \mathbb{R}$ is a function. The graph of $f(x)$ is a curve C in $\mathbb{R}^2$. The parametric representation is given by
\[ x(t) := x_1(t) = t, \quad y(t) := x_2(t) = f(t). \]
Consider the parabola C given by $y = x^2$ defined for all $x \in \mathbb{R}$; it has parametric representation
\[ x(t) = t, \quad y(t) = t^2 \quad \text{with } t \in \mathbb{R}. \]
Now, the portion of parabola given by $y = x^2$ with $x \in [0, 1]$ is a different geometric object which we denote $C_1$ and has parametric representation
\[ x(t) = t, \quad y(t) = t^2 \quad \text{with } t \in [0, 1]. \]
We can use parametric representations to describe more complicated curves which

cannot be obtained by a single function y = f (x).
Example 1.5.3. A circle of radius $r_0$ has equation $x^2 + y^2 = r_0^2$. A parametric representation is given by
\[ x(t) = r_0\cos t, \quad y(t) = r_0\sin t, \quad t \in [0, 2\pi) \]
because substituting $x(t)$ and $y(t)$ in the equation is an identity for all $t \in [0, 2\pi)$. If the equation of the circle is given in polar coordinates $r = r_0$, then parametric equations are
\[ r(t) = r_0, \quad \theta(t) = t, \quad t \in [0, 2\pi). \]
In some simple cases, it is possible to convert from a parametric representation of

a curve to Cartesian coordinates in R2 . The cases where either x(t) = t or y(t) = t
are straightforward as noticed above. For more complicated cases, the approach is
to find functions of x(t) and y(t) from which we obtain an equality.
Example 1.5.4. Consider the curve C given by $x(t) = 2t^2$, $y(t) = t^6$. Then,
\[ \frac{1}{8}x(t)^3 = t^6 = y(t) \]
and so C is given by $x^3 = 8y$.
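A short numerical check of this elimination of the parameter: sample the parametric curve at a few values of t and confirm that each point satisfies $x^3 = 8y$. The sample values and names below are illustrative assumptions.

```python
def curve(t):
    """Parametric representation x(t) = 2 t^2, y(t) = t^6."""
    return 2 * t**2, t**6

samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
points = [curve(t) for t in samples]
residuals = [x**3 - 8 * y for x, y in points]
print(residuals)
print("all x >= 0:", all(x >= 0 for x, _ in points))
```

Since $x(t) = 2t^2 \geq 0$ for every t, the parametric curve only traces the part of $x^3 = 8y$ with $x \geq 0$, and each such point with $x > 0$ is reached for two values of t of opposite sign.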
Although this approach is useful sometimes for graphing, our main emphasis in this textbook is on parametric representations and their properties. Let us look at a commonly presented example in $\mathbb{R}^3$.
Example 1.5.5. Consider the curve C given by $x(t) = \cos t$, $y(t) = \sin t$ and $z(t) = t$ for $t \in [0, 4\pi]$. We see that in the xy-plane, this is just a circle and in the z-direction, we have linear growth. The curve C is called a helix and is illustrated in Figure 1.21.
Fig. 1.21. Helix curve of radius 1
We now look at an example with quite an intricate structure in R3 .
Example 1.5.6. Consider the curve $C_1$ given by
\[ x(t) = 4\cos t + \cos t\cos(3t), \quad y(t) = 4\sin t + \sin t\cos(3t), \quad z(t) = 2\sin(3t). \]
We see from Figure 1.22 (left) that we obtain a closed curve. Compare with the curve $C_2$ obtained by changing $3t$ to $\sqrt{2}t$ (Figure 1.22 right),
\[ x(t) = 4\cos t + \cos t\cos(\sqrt{2}t), \quad y(t) = 4\sin t + \sin t\cos(\sqrt{2}t), \quad z(t) = 2\sin(\sqrt{2}t). \]
This curve lies on a surface which has the shape of a donut or bagel (depending on your taste!), known as a torus. In fact, the curve $C_1$ also lies on the same torus.
Parametric representations in higher dimensions commonly arise in many problems. However, their graphical representations can only be glimpsed by looking at projections to $\mathbb{R}^3$. Consider the following one which is reminiscent of the previous example.
Fig. 1.22. Left: closed curve lying on a torus, Right: non-closed curve lying on a torus.
Example 1.5.7. Consider the curve C in $\mathbb{R}^4$ given by the parametrization
\[ x_1(t) = \cos t, \quad x_2(t) = \sin t, \quad x_3(t) = \cos\sqrt{2}t, \quad x_4(t) = \sin\sqrt{2}t. \]
We see that the projections to the $x_1, x_2$ and $x_3, x_4$ planes are just circles with periods respectively of $2\pi$ and $2\pi/\sqrt{2}$. This kind of parametric curve describes the motion of a double pendulum subject to a small displacement. See Figure 1.23 obtained for $t \in [0, 100]$.
Fig. 1.23. Parametric curve of Example 1.5.7
One can think of parametric curves as describing the motion of a point particle
in space. This leads to a subtle aspect of parametric curves with respect to their
intersections as the following example shows.
Example 1.5.8. Two particles A and B travel in the paths given by $x_A(t) = t$, $y_A(t) = t^3$ and $x_B(t) = \cos t$, $y_B(t) = \sin t$. Do the particles collide? The paths traced by these parametric curves intersect in two points as seen from the figure.
However, in order to have the particles collide, one would need the intersections to occur for the same value $t_0$. It is not the case in this particular example. To check this, one would need to find a value $t_0$ such that $t_0 = \cos t_0$. By drawing the graphs of $t$ and $\cos t$, we see that there is exactly one such solution. For this (approximate) solution, compute $t_0^3 - \sin t_0$ and verify that this is not zero.
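The check suggested in this example can be sketched numerically. A collision requires the same parameter value in both coordinates, so the sketch first solves $t = \cos t$ by bisection and then evaluates $t^3 - \sin t$ at the root. The function $t - \cos t$ is non-decreasing (its derivative is $1 + \sin t \geq 0$) and changes sign on $[0, 1]$, so that bracket contains its only real root. The bisection helper and its tolerance are illustrative assumptions.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi] by bisection, assuming g(lo) and g(hi) differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

# Equal x-coordinates at the same time: t = cos t.
t0 = bisect(lambda t: t - math.cos(t), 0.0, 1.0)
# Difference of the y-coordinates at that time.
y_gap = t0**3 - math.sin(t0)
print("t0 =", t0)
print("y_A(t0) - y_B(t0) =", y_gap)
```

A value of `y_gap` clearly different from zero means the particles are at different heights at the only time their x-coordinates agree, so they do not collide even though their paths cross.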
Fig. 1.24. The curves have two intersection points.
1.5.1 Conics
We now look more closely at the planar curves known as conics. Those are obtained geometrically by taking various cross-sections of a cone and have the following definitions in their simplest forms. Those are for $a, b \in \mathbb{R}$:
(1) Ellipse:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]
(2) Hyperbola: Left-Right and Up-Down cases.
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \quad \text{or} \quad \frac{y^2}{a^2} - \frac{x^2}{b^2} = 1. \]
(3) Parabola:
\[ 4ay = x^2 \quad \text{or} \quad 4ax = y^2. \]
Fig. 1.25. Ellipse with $a = 2$ and $b = 1$.
Fig. 1.26. Left-Right hyperbola with $a = b = 1$. The dashed lines are the asymptotes of the hyperbola (labelled $y = x$ and $y = -x$).
For the ellipse, the largest of a and b represents the major axis and the other one, the minor axis. The ellipse in Figure 1.25 has major axis $a = 2$ and minor axis $b = 1$. In the case of the hyperbola, a is the distance between the origin and the vertices which are the nearest points of the hyperbola to the centre. For $|x|$ large, the hyperbola approaches the asymptotes of slope $\pm b/a$. The parabola $4ay = x^2$ opens up or down depending on whether $a > 0$ or $a < 0$. The case $4ax = y^2$ opens either left or right.
Fig. 1.27. Parabola $4ay = x^2$ with $a = 1/4$.
We now describe the conics using parametric representations. The first way one can do this is by solving for y as a function of x. In the case of the ellipse, we have two functions
\[ y = f_\pm(x) := \pm b\sqrt{1 - \frac{x^2}{a^2}} \]
and $f_\pm(x)$ is well-defined as a function for $x \in [-a, a]$. This leads to
\[ x(t) = t, \quad y(t) = \pm b\sqrt{1 - \frac{t^2}{a^2}}, \quad t \in [-a, a]. \]
Another parametrization similar to the parametrization of the circle is obtained as
\[ x(t) = a\cos t, \quad y(t) = b\sin t, \quad t \in [0, 2\pi). \]
Indeed,
\[ \frac{x(t)^2}{a^2} + \frac{y(t)^2}{b^2} = \frac{a^2\cos^2 t}{a^2} + \frac{b^2\sin^2 t}{b^2} = 1. \]
For the hyperbola, the $y = f(x)$ parametrization is done as above. Recall the hyperbolic functions
\[ \cosh t = \frac{e^t + e^{-t}}{2} \quad \text{and} \quad \sinh t = \frac{e^t - e^{-t}}{2}. \]
Another parametrization of the hyperbola is given by
\[ x(t) = a\cosh t, \quad y(t) = b\sinh t, \quad t \in \mathbb{R}. \]
This justifies the name hyperbolic function.
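Both trigonometric and hyperbolic parametrizations can be verified by substituting sample points back into the defining equations. The sketch below does this for the ellipse and for the Left-Right hyperbola; the values $a = 2$, $b = 1$ and the sample parameters are illustrative assumptions.

```python
import math

def ellipse_point(a, b, t):
    return a * math.cos(t), b * math.sin(t)

def hyperbola_point(a, b, t):
    return a * math.cosh(t), b * math.sinh(t)

def ellipse_residual(a, b, x, y):
    return x**2 / a**2 + y**2 / b**2 - 1.0

def hyperbola_residual(a, b, x, y):
    return x**2 / a**2 - y**2 / b**2 - 1.0

a, b = 2.0, 1.0
ts = [-3.0, -1.0, 0.0, 0.5, 2.0]
ellipse_errors = [abs(ellipse_residual(a, b, *ellipse_point(a, b, t))) for t in ts]
hyperbola_errors = [abs(hyperbola_residual(a, b, *hyperbola_point(a, b, t))) for t in ts]
print(max(ellipse_errors), max(hyperbola_errors))
```

Because $\cosh t \geq 1$, the points $(a\cosh t, b\sinh t)$ with $a > 0$ all have $x \geq a$, so this parametrization covers only the right-hand branch; the other branch is obtained with $x(t) = -a\cosh t$.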
Exercises
(1) Use computer software to draw the curves given by the following parametric representations.
(a) $x(t) = t^2$, $y(t) = \cos t$; $t \in [0, 2\pi]$
(b) $x(t) = t\cos t$, $y(t) = t\sin t$; $t \in [0, \pi]$
(c) $x(t) = t^2$, $y(t) = t^3$; $t \in [-2, 2]$
(2) Find the intersections of the curves in Exercise 1. (A difficult exercise!)
(3) Verify the statement of Example 1.5.8 by using the approach suggested.
(4) Transform the following equations of conics into their standard form and draw
a rough sketch of the conic.
(a) $2x^2 + 5y^2 = 3$
(b) $4y^2 - x^2 = 3$
(5) Find the intersection points of the two conics given.
(a) $4x^2 + y^2 = 1$ and $x^2 + 4y^2 = 1$.
(b) $x^2 - 3y^2 = 1$ and $3x^2 + 5y^2 = 1$.
(c) $3y^2 - x^2 = 1$ and $y = 4x^2$.
(d) $9x^2 - 3y^2 = 3$ and $y^2 - \dfrac{x^2}{3^2} = 1$.
Fig. 1.28. Ellipsoid and Hyperbolic Paraboloid
Fig. 1.29. Cone and Elliptic Paraboloid
1.6 Quadrics
The quadrics are a family of surfaces in $\mathbb{R}^3$ which contain some well-known examples such as the cone and the ellipsoid. The general equation of a quadric, with real coefficients, is given by:
\[ \alpha x^2 + \beta y^2 + \gamma z^2 + 2\delta xy + 2\epsilon xz + 2\zeta yz + \lambda x + \mu y + \nu z + p = 0 \]
which covers the case of quadrics anywhere in space and in any orientation. We focus on the special cases where the quadrics are centered at the origin and have an orientation given by the z-axis. We list the quadrics in their simplest form below. They are the ones that are used in the remainder of the book. Let $a, b, c \in \mathbb{R}$:
(i) Ellipsoid:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]
(ii) Hyperboloid of one sheet:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1. \]
(iii) Hyperboloid of two sheets:
\[ -\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]
(iv) Elliptic paraboloid:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - z = 0. \]
(v) Hyperbolic paraboloid:
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} - z = 0. \]
(vi) Cone:
\[ \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}. \]
Fig. 1.30. Hyperboloid of one sheet and Hyperboloid of two sheets
Note that all the quadrics are aligned along the z axis in the notation above. However, those quadrics can also be aligned along the x and y axes and the equations are the same up to a permutation of the x, y and z. In the case of the elliptic paraboloid, one obtains
\[ \frac{y^2}{a^2} + \frac{z^2}{b^2} - x = 0 \quad \text{and} \quad \frac{x^2}{a^2} + \frac{z^2}{b^2} - y = 0. \]
The geometry of the quadrics can be understood by taking intersections with planes. We define horizontal planes by setting $z = K_3$, vertical planes parallel to the xz-plane by setting $y = K_2$ and vertical planes parallel to the yz-plane by setting $x = K_1$ for some $K_1, K_2, K_3 \in \mathbb{R}$ acting as a parameter which we can vary. The intersections of quadrics with these planes are called traces. The following examples illustrate the traces of some quadrics.
Example 1.6.1. Consider an ellipsoid
\[ \frac{x^2}{9} + \frac{y^2}{4} + z^2 = 1. \]
Consider the traces given by $z = K_3$; we obtain
\[ \frac{x^2}{9} + \frac{y^2}{4} = 1 - K_3^2 \]
which is the equation of an ellipse as long as $1 - K_3^2 \geq 0$. Thus, the ellipses are getting smaller as $|z|$ increases from zero.
Fig. 1.31. A hyperbolic paraboloid with its traces: a hyperbola for the horizontal trace and two parabolae, one for each vertical trace.
Example 1.6.2. Consider a hyperbolic paraboloid
\[ \frac{x^2}{2} - \frac{y^2}{10} - z = 0 \]
and take all three types of traces. Setting $z = K_3$ we can simplify the equation to
\[ \frac{x^2}{(\sqrt{2K_3})^2} - \frac{y^2}{(\sqrt{10K_3})^2} = 1 \]
which is the equation of a hyperbola. Setting $x = K_1$ we have
\[ \frac{K_1^2}{2} - \frac{y^2}{10} - z = 0 \]
which becomes
\[ z = -\frac{y^2}{10} + \frac{K_1^2}{2}. \]
This is the equation of a parabola opening downward with vertex at $z = K_1^2/2$. Finally, setting $y = K_2$ yields
\[ z = \frac{x^2}{2} - \frac{K_2^2}{10} \]
which is also a parabola, now opening upward and with vertex at $z = -K_2^2/10$. Those are shown in Figure 1.31.
Often, in applications, it is necessary to determine the curves of intersections of

quadric surfaces. This is done by equating the equations for the quadrics and per-
forming some manipulations. The following examples illustrate this procedure.
Example 1.6.3. We find the intersection of the cone $z^2 = x^2 + \dfrac{y^2}{3}$ and the ellipsoid $x^2 + 2y^2 + 3z^2 = 1$. Isolating $z^2$ from both equations, we then have
\[ x^2 + \frac{y^2}{3} = \frac{1}{3}(1 - x^2 - 2y^2) \]
which simplifies to
\[ 4x^2 + 3y^2 = 1 \iff \frac{x^2}{\left(\frac{1}{2}\right)^2} + \frac{y^2}{\left(\frac{1}{\sqrt{3}}\right)^2} = 1; \]
an ellipse with minor axis $a = 1/2$ and major axis $b = 1/\sqrt{3}$.
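The result of this example can be checked by walking along the ellipse $4x^2 + 3y^2 = 1$, lifting each point to the cone, and confirming that the lifted point also lies on the ellipsoid. In the sketch below, the parametrization $x = \tfrac{1}{2}\cos t$, $y = \tfrac{1}{\sqrt{3}}\sin t$ and all names are illustrative assumptions.

```python
import math

def intersection_point(t, upper=True):
    """Point above (or below) the ellipse 4x^2 + 3y^2 = 1, lifted to the cone."""
    x = 0.5 * math.cos(t)
    y = math.sin(t) / math.sqrt(3.0)
    z = math.sqrt(x * x + y * y / 3.0)   # from the cone z^2 = x^2 + y^2/3
    return (x, y, z) if upper else (x, y, -z)

def cone_residual(x, y, z):
    return z * z - x * x - y * y / 3.0

def ellipsoid_residual(x, y, z):
    return x * x + 2 * y * y + 3 * z * z - 1.0

points = [intersection_point(t, upper)
          for t in (0.0, 0.7, 2.0, 3.5, 5.0)
          for upper in (True, False)]
worst = max(max(abs(cone_residual(*p)), abs(ellipsoid_residual(*p))) for p in points)
print("largest residual:", worst)
```

The ellipse in the example lives in the xy-plane; the intersection itself consists of two space curves, one with $z > 0$ and one with $z < 0$, which both project onto that ellipse.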
Quadrics can be expressed in the form of one or two functions of several variables $z = f(x, y)$. The elliptic and hyperbolic paraboloid are written, respectively,
\[ z = \frac{x^2}{a^2} + \frac{y^2}{b^2} \quad \text{and} \quad z = \frac{x^2}{a^2} - \frac{y^2}{b^2}. \]
The remaining quadrics are obtained using square roots of z and so are expressed using two functions. For instance, the hyperboloid of two sheets is given by
\[ z = \pm c\sqrt{1 + \frac{x^2}{a^2} + \frac{y^2}{b^2}}. \]
Note that it is sometimes more convenient to isolate either x or y, rather than z.
Example 1.6.4. The surface obtained by rotating the line $x = 2y$ about the x-axis is a cone. The equation of this cone is obtained as follows. For a fixed $x \neq 0$ value, the trace is a circle of radius $|x|/2$ which projects to the yz-plane. The equation of this circle is
\[ \left(\frac{x}{2}\right)^2 = y^2 + z^2 \]
which we rewrite as
\[ x^2 = \frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2} \quad \text{or} \quad x = \pm\sqrt{\frac{y^2}{(1/2)^2} + \frac{z^2}{(1/2)^2}}. \]
Figure 1.32 shows the cone along with the line $x = 2y$.
1.6.1 Cylinders
The concept of cylinder is a familiar one. One can describe it as a surface with constant horizontal trace given by a circle of fixed radius. The equation in this case is
\[ x^2 + y^2 = r^2 \]
where $r > 0$ is the radius of the cylinder. This definition can be generalized to any curve in the plane. Let C be a curve in the xy-plane, then a cylinder over C is the surface with constant horizontal trace given by C.
(1) The curve $y = x^2$ defines a parabolic cylinder.
(2) Note that cylinders do not need to be defined in the xy-plane. Consider the parametric curve $x(t) = 0$, $y(t) = t$ and $z(t) = e^{t^3}$ with $t \in [-1, 1]$. See Figure 1.33.
Fig. 1.32. Cone obtained from revolving the line $x = 2y$ around the x axis.
The intersection of cylinders and quadrics also occurs frequently in applications and defines curves which are often better understood using parametric representations.
Example 1.6.5. We find the intersection curve of the parabolic cylinder y = x2 and
the top half of the ellipsoid x2 + 3y 2 + 3z 2 = 9 and describe it using its parametric
equations.
The top half of the ellipsoid is obtained for $z \ge 0$. To obtain the intersection
curve, we substitute $y = x^2$ into the equation of the ellipsoid:
\[ y + 3y^2 + 3z^2 = 9. \]
By completing the square for $y$, we obtain that this equation has the form
\[ \left(y + \frac{1}{6}\right)^2 + z^2 = 3 + \frac{1}{36}. \]
This equation describes a circle of radius $\sqrt{3 + \frac{1}{36}}$ with centre at $(y, z) = (-\frac{1}{6}, 0)$. We
can solve for $z$ and keep only the $+$ solution because we are interested in the top half
of the ellipsoid:
\[ z = \sqrt{\frac{109}{36} - \left(y + \frac{1}{6}\right)^2}. \tag{1.6} \]
But the expression under the square root needs to be nonnegative, so $\frac{109}{36} - (y + \frac{1}{6})^2 \ge 0$.
Fig. 1.33. Cylinder defined by the parametric curve $y(t) = t$, $z(t) = e^{t^3}$ in the yz-plane.
We can now describe the intersection curve in parametric form. Let $x(t) = t$, and
since $y = x^2$ then $y(t) = t^2$. Equation (1.6) completes the description. We have
\[ x(t) = t, \quad y(t) = t^2, \quad z(t) = \sqrt{\frac{109}{36} - \left(t^2 + \frac{1}{6}\right)^2} \]
with domain of $t$ obtained by isolating $t$ in $\frac{109}{36} - (t^2 + \frac{1}{6})^2 \ge 0$. This yields
\[ -\sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}} \;\le\; t \;\le\; \sqrt{\sqrt{\frac{109}{36}} - \frac{1}{6}}. \]
See Figure 1.34.
Fig. 1.34. Ellipsoid and cylinder of Example 1.6.5 with its intersection curve.
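The parametric description of Example 1.6.5 can be checked numerically. The sketch below is only an illustration: the function names `intersection_point` and `residuals` are hypothetical, and the check simply substitutes the curve into the two surface equations.

```python
import math

# Largest admissible |t|, from t^2 <= sqrt(109/36) - 1/6 (the domain found above).
T_MAX = math.sqrt(math.sqrt(109 / 36) - 1 / 6)

def intersection_point(t):
    """Point (x, y, z) with x = t, y = t^2 and z given by equation (1.6)."""
    y = t * t
    radicand = 109 / 36 - (y + 1 / 6) ** 2
    # Clamp tiny negative round-off that can occur at the ends of the domain.
    z = math.sqrt(max(radicand, 0.0))
    return t, y, z

def residuals(t):
    """Defects in the equations y = x^2 and x^2 + 3y^2 + 3z^2 = 9 at parameter t."""
    x, y, z = intersection_point(t)
    return y - x * x, x * x + 3 * y * y + 3 * z * z - 9
```

Both residuals should vanish up to round-off for every $t$ in the domain, and $z(t) \ge 0$ confirms that the curve stays on the top half of the ellipsoid.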
Exercises
(1) Write the following quadrics in their standard form and identify the quadric
(note that the x, y, z might be permuted with respect to the equations given at
the beginning of the section).
(a) $2x^2 - y^2 + z^2 = 3$
(b) $x - y^2 - 2z^2 = 0$
(c) $2z^2 - x^2 + 4y^2 = 0$
x2
(d) 2 y 2 + 3z 2 = 2
2
(e) $x^2 - 4y^2 - 5z^2 = 2$
(2) Find the equations of the traces of the quadrics of Exercise 1.
(3) Consider the quadric Q with traces given by two families of hyperbolae and one
family of circles defined for all values of the constant K. Which quadric is Q?
(4) Consider the quadric Q with traces given by two families of parabolae and one
family of circles. Which quadric is Q?
(5) Find the curve of intersection of the cone $z^2 = 2x^2 + y^2$ with the hyperboloid
of two sheets
\[ -4x^2 - \frac{y^2}{3} + z^2 = 1. \]
What is this curve?
(6) Find the curve of intersection of the ellipsoid $3x^2 + y^2 + z^2 = 1$ and the
hyperboloid of one sheet $x^2 + 3y^2 - z^2 = 1$. What is this curve?
(7) Find the curve of intersection of the elliptic paraboloid $z = 3x^2 + 2y^2$ and the
hyperbolic paraboloid $z = x^2 - y^2$. What is this curve?
(8) Consider the cylinders given by $x = y^2$ and $y^2 + z^2 = 4$. Find the intersection
curve of those surfaces and write the result in parametric form.
(9) Consider the cylinder given by the curve $x(t) = t^3$, $y(t) = t^2$ and the cone
$z^2 = x^2 + y^2$. Write the intersection curve in parametric form.
2 Calculus of Vector Functions
The previous chapter showed how a curve $C$ can be expressed in terms of a
parametric representation
\[ x_1(t), \ldots, x_n(t) \]
with $t \in [a, b]$. A convenient way to write parametric representations is using vector
functions $r : \mathbb{R} \to \mathbb{R}^n$ of the form
\[ r(t) = (x_1(t), \ldots, x_n(t)). \]
In this chapter, we show how to apply calculus techniques to curves, and this is done
more conveniently by using vector functions rather than parametric representations.
2.1 Derivatives and Integrals
Consider a curve $C$ with parametrization $r(t)$. Examples from the previous section
show that many of the curves defined are quite smooth in the sense that there
are no sharp corners. In fact, they are also continuous in the sense that there is no
jump in the tracing of the curve as one follows it. The concepts of continuity and
smoothness (i.e. derivatives) for functions $f : [a, b] \to \mathbb{R}$ are among the main topics
in an introductory course on Calculus. We show in this section how these concepts
extend to curves, but with some warnings!
2.1.1 Limits and Continuity
To discuss limits of vector functions, consider first the case of $r(t) = (t, f(t))$ with
$t \in [a, b]$. Those correspond to functions $y = f(x)$ with $x \in [a, b]$. Recall that $f$ has
a limit $L \in \mathbb{R}$ at some $x_0 \in [a, b]$ if
\[ \lim_{x \to x_0} f(x) = L. \tag{2.1} \]
For the convenience of the reader, recall that the exact definition of (2.1) is that
\[ \forall \epsilon > 0,\ \exists \delta > 0 \text{ such that if } 0 < |x - x_0| < \delta \text{ then } |f(x) - L| < \epsilon. \]
The vector function $r(t) = (t, f(t))$ describes the same curve as $y = f(x)$. Consider
$x(t) = t$ and $y(t) = f(t)$ separately. Then, the limits as $t \to t_0$ exist in both cases:
\[ \lim_{t \to t_0} x(t) = \lim_{t \to t_0} t = t_0 \quad\text{and}\quad \lim_{t \to t_0} y(t) = \lim_{t \to t_0} f(t) = L. \]
Therefore, it makes sense to define
\[ \lim_{t \to t_0} r(t) = \left( \lim_{t \to t_0} x(t),\ \lim_{t \to t_0} y(t) \right) = (t_0, L). \]
The definition of limit for general vector functions $r(t)$ follows the same approach.
Definition 2.1.1. Consider the vector function
\[ r(t) = (x_1(t), \ldots, x_n(t)) \]
for $t \in [a, b]$. Then, the limit of $r(t)$ at $t_0 \in [a, b]$ exists if
\[ \lim_{t \to t_0} x_j(t) = L_j \in \mathbb{R} \]
for all $j = 1, \ldots, n$. Thus,
\[ \lim_{t \to t_0} r(t) = (L_1, \ldots, L_n). \]
Note that for $t_0 = a$ or $t_0 = b$ we take the one-sided limit: $t \to t_0^+$ or $t \to t_0^-$
respectively.
We now look at several examples.
Example 2.1.2. This example is similar to a function $y = f(x)$ with a jump discontinuity. Let
\[ y(t) = \begin{cases} -1, & t \in [-1, 0) \\ \phantom{-}1, & t \in [0, 1] \end{cases} \]
then the vector function
\[ r(t) = (t, y(t), t^2), \quad t \in [-1, 1] \]
has two separate pieces depending on whether $t \in [-1, 0)$ or $t \in [0, 1]$, see Figure 2.1.
Fig. 2.1. Discontinuous vector function
With the above definition of limit, we can now discuss the concept of continuity.
Definition 2.1.3 (Continuity). A vector function $r(t)$ with $r : [a, b] \to \mathbb{R}^n$ is continuous
at $t_0 \in [a, b]$ if $r(t_0) = r_0 \in \mathbb{R}^n$ and
\[ \lim_{t \to t_0} r(t) = r_0. \]
In Example 2.1.2, $r(0) = (0, 1, 0)$ and
\[ \lim_{t \to 0^-} (t, y(t), t^2) = (0, -1, 0) \quad\text{and}\quad \lim_{t \to 0^+} (t, y(t), t^2) = (0, 1, 0) \]
so the limit does not exist and $r(t)$ is not continuous at $t = 0$.

Consider instead a vector function $r(t) = (t^2, t^3)$. Then, $r(t_0) = (t_0^2, t_0^3)$ and
\[ \lim_{t \to t_0} (t^2, t^3) = (t_0^2, t_0^3), \]
so $r(t)$ is continuous at every $t_0$.
In most of our examples, the vector functions are continuous.
2.1.2 Derivatives, Smoothness and Integrals
We begin with the definition of derivative for a vector function.
Definition 2.1.4. A vector function $r(t)$ is differentiable at $t = t_0$ if
\[ \lim_{t \to t_0} \frac{r(t) - r(t_0)}{t - t_0} \]
exists. The limit is denoted by $r'(t_0)$ and called the derivative of $r(t)$ at $t_0$.
An equivalent definition is obtained by setting $t = t_0 + h$, then
\[ r'(t_0) = \lim_{h \to 0} \frac{r(t_0 + h) - r(t_0)}{h}. \]
This form of the definition is useful when one needs to obtain a general formula for
the derivative of $r(t)$ independent of $t_0$. A vector function $r(t)$ defined for $t \in [a, b]$
is differentiable on $(a, b)$ if $r(t)$ is differentiable at each $t \in (a, b)$.
As one may expect, if each coordinate function of r(t) is differentiable on its
interval of definition then r(t) is differentiable. This is expressed in the following
result.
Proposition 2.1.5. A vector function $r(t) = (x_1(t), \ldots, x_n(t))$ is differentiable on
$(a, b)$ if and only if $x_j(t)$ is differentiable on $(a, b)$ for $j = 1, \ldots, n$.
Proof. The proof can be done as a single calculation as follows. Let $t_0 \in (a, b)$ and
we write
\begin{align*}
\lim_{t \to t_0} \frac{r(t) - r(t_0)}{t - t_0}
&= \lim_{t \to t_0} \frac{(x_1(t) - x_1(t_0), \ldots, x_n(t) - x_n(t_0))}{t - t_0} \\
&= \lim_{t \to t_0} \left( \frac{x_1(t) - x_1(t_0)}{t - t_0}, \ldots, \frac{x_n(t) - x_n(t_0)}{t - t_0} \right).
\end{align*}
So, if the left-hand side limit exists, then the right-hand side limit in the last row
must exist for each component. Similarly, if the right-hand side limit exists for each
component, then the left-hand side limit exists.
From Proposition 2.1.5, we see that the derivative of vector functions depends completely
on each coordinate function and it is known from an introductory calculus
course how to compute derivatives of functions $x_j : [a, b] \to \mathbb{R}$.
We know that $r(t) = (t, |t|)$ is not differentiable at $t = 0$ and this is a consequence
of the corner of the function $|t|$ at $t = 0$. However, vector functions are different from
functions $y = f(x)$ because the curve $C$ defined by a vector function can have all its
coordinates differentiable, but still have a corner as the next example shows.
Example 2.1.6. Consider
\[ r(t) = (t^2, t^3), \quad t \in [-1, 1]. \]
Then, $x(t) = t^2$ and $y(t) = t^3$ are differentiable, but Figure 2.2 shows clearly a
corner at $t = 0$. We see that $x(t)^3 = t^6 = y(t)^2$ so $C$ corresponds to the curve given
by $y^2 = x^3$ and called a cusp curve.
Fig. 2.2. The cusp curve has a sharp corner at (0, 0)
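The corner of the cusp curve can also be seen numerically. The following sketch approximates $r'(t)$ component by component with a central difference (the helper names `derivative` and `unit_tangent` and the step size are choices made here, not part of the text) and shows that the derivative vanishes at $t = 0$ while the direction of travel reverses in $x$.

```python
import math

def r(t):
    """The cusp curve of Example 2.1.6."""
    return (t ** 2, t ** 3)

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t), one coordinate at a time."""
    forward, backward = f(t + h), f(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(forward, backward))

def unit_tangent(f, t):
    """Direction of f'(t); only meaningful where f'(t) is nonzero."""
    v = derivative(f, t)
    norm = math.hypot(*v)
    return tuple(c / norm for c in v)
```

Evaluating `derivative(r, 0.0)` gives a vector that is zero up to round-off, while `unit_tangent(r, -1e-3)` and `unit_tangent(r, 1e-3)` point in nearly opposite directions along the x-axis.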
Here is an example without a corner.
Example 2.1.7. Consider $r(t) = (t, t^2, e^t)$ with $t \in [-1, 1]$. Figure 2.3 shows the
curve $C$ corresponding to this vector function. The fact that there are no corners at
any point shows a greater degree of smoothness than the previous example.
Therefore, the concept of derivative and the smoothness of the curve C corresponding
to the vector function are not equivalent. We now discuss the question of smoothness
of the curve associated with a differentiable vector function r(t). We define smooth-
ness here and show in the following section that smoothness at a point p of a curve
C corresponds to the existence of a line tangent to C at p.
Fig. 2.3. Curve with no corner.
Definition 2.1.8. A curve $C$ has a smooth parametrization given by a vector function
$r(t)$ for $t \in [a, b]$ if $r'(t) \neq 0$ for all $t \in (a, b)$. If $C$ has a smooth parametrization,
then we say that $C$ is a smooth curve.
Clearly, the cusp curve is not smooth at $t = 0$. This next example has a non-obvious
corner point.
Example 2.1.9. Consider the curve $r(t) = (t^2, t^2)$ with $t \in [-1, 1]$. This curve starts
at $(1, 1)$ for $t = -1$ and evolves down the diagonal to $(0, 0)$ and returns on itself
until it reaches $(1, 1)$ at $t = 1$. In order to return on its path, the curve must stop at
$t = 0$ and indeed, $r'(0) = (0, 0)$. Therefore, this is not a smooth curve.
Fig. 2.4. A piecewise smooth curve with five pieces and non-smooth points at t1 , t2 , t3 , t4 .
A curve $C$ is piecewise smooth if there exists a parametrization given by $r(t)$ such
that $r'(t) = 0$ only at a finite number of points $t_1, \ldots, t_k \in (a, b)$. Most of the
examples presented in this book are at least piecewise smooth. Figure 2.4 sketches
a piecewise smooth curve.
A piecewise smooth curve C must often be given by distinct vector functions as
the next example shows. But it is not always the case as Example 2.1.9 shows.
Fig. 2.5. Curve C of Example 2.1.10
Example 2.1.10. Consider the curve $C$ which is the triangle with vertices at $(0, 0)$,
$(2, 0)$ and $(0, 1)$. We obtain a vector function for $C$ (with a consistent orientation)
by writing a parametrization of each side of the triangle.
\[ r(t) = \begin{cases} (2t, 0) & t \in [0, 1] \\ (2 - 2t, t) & t \in [0, 1] \\ (0, 1 - t) & t \in [0, 1]. \end{cases} \]
See Figure 2.5.
Computing the integral of a vector function is a straightforward process. Let $r(t)$
with $t \in [a, b]$ be a continuous vector function with $r(t) = (x_1(t), \ldots, x_n(t))$, then
\[ \int_a^b r(t)\, dt = \left( \int_a^b x_1(t)\, dt, \ldots, \int_a^b x_n(t)\, dt \right). \]
Example 2.1.11. Let $r(t) = (t^2, \cos(2\pi t))$ with $t \in [0, 1]$ then
\[ \int_0^1 r(t)\, dt = \left( \int_0^1 t^2\, dt,\ \int_0^1 \cos(2\pi t)\, dt \right) = \left( \frac{1}{3}, 0 \right). \]
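Since the integral is taken coordinate by coordinate, any one-variable quadrature rule extends directly to vector functions. The sketch below uses composite Simpson's rule as one possible choice (the function name `integrate_vector` and the number of subintervals are assumptions of this illustration) and reproduces the value of Example 2.1.11.

```python
import math

def integrate_vector(r, a, b, n=1000):
    """Composite Simpson's rule applied to each coordinate of r on [a, b]; n must be even."""
    h = (b - a) / n
    dim = len(r(a))
    total = [0.0] * dim
    for i in range(n + 1):
        # Simpson weights: 1 at the endpoints, 4 at odd nodes, 2 at even interior nodes.
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        point = r(a + i * h)
        for j in range(dim):
            total[j] += weight * point[j]
    return tuple(h / 3 * s for s in total)

def r(t):
    """The vector function of Example 2.1.11."""
    return (t ** 2, math.cos(2 * math.pi * t))
```

Calling `integrate_vector(r, 0.0, 1.0)` should return a pair close to $(1/3, 0)$.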
2.1.3 Position and velocity of particles
The use of vector functions is crucial in physics (especially Newtonian mechanics)
and in this section we present the main interpretations of the concepts of the previous
section in this context. As an object moves in space, its velocity is the instantaneous
rate of change of its position. If the position is given by a parametric representation,
we have the following definition.
Definition 2.1.12. Let $r(t)$ be the vector function describing the motion of a particle
$p$. The velocity vector of $p$ is $r'(t)$ and the speed is given by $\|r'(t)\|$.
The acceleration is the rate of change of velocity and so we have the following definition.
Definition 2.1.13. Let $r(t)$ be the parametrization describing the motion of a particle
$m$. The acceleration vector of $m$ is $r''(t)$.
Let us consider some examples.
Example 2.1.14. Suppose that an object's position is given by $r(t) = (t, t, t(2 - t))$
with $t \in [0, 2]$. The velocity vector is $r'(t) = (1, 1, 2 - 2t)$ and the speed is
$\|r'(t)\| = \sqrt{1^2 + 1^2 + (2 - 2t)^2} = \sqrt{6 - 8t + 4t^2}$. The acceleration vector is $r''(t) = (0, 0, -2)$.
We can also use integration of vector functions to solve the inverse problem.
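Example 2.1.15. Suppose that an object has acceleration vector $a(t) = (1, t, t^2)$ and
has initial position vector $(0, 0, 0)$ and initial velocity vector $(1, 0, 0)$. Integrating the
acceleration vector gives the velocity vector:
\begin{align*}
r'(t) = \int a(t)\, dt &= \left( \int 1\, dt,\ \int t\, dt,\ \int t^2\, dt \right) = \left( t + c_1,\ \tfrac{1}{2} t^2 + c_2,\ \tfrac{1}{3} t^3 + c_3 \right) \\
&= \left( t,\ \tfrac{1}{2} t^2,\ \tfrac{1}{3} t^3 \right) + (c_1, c_2, c_3).
\end{align*}
We know $r'(0) = (c_1, c_2, c_3) = (1, 0, 0)$ implies $c_1 = 1$, $c_2 = c_3 = 0$. So,
\[ r'(t) = \left( t + 1,\ \frac{t^2}{2},\ \frac{t^3}{3} \right). \]
Integrating the velocity vector:
\[ r(t) = \left( \frac{t^2}{2} + t + k_1,\ \frac{t^3}{6} + k_2,\ \frac{t^4}{12} + k_3 \right). \]
Then $r(0) = (k_1, k_2, k_3) = (0, 0, 0)$ implies
\[ r(t) = \left( \frac{t^2}{2} + t,\ \frac{t^3}{6},\ \frac{t^4}{12} \right). \]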
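A result obtained by integrating twice can be checked by differentiating it back. The sketch below does this numerically for Example 2.1.15 with a central difference; the helper name `central_difference`, the step size and the sample time are arbitrary choices for this illustration.

```python
def position(t):
    """r(t) found in Example 2.1.15."""
    return (t ** 2 / 2 + t, t ** 3 / 6, t ** 4 / 12)

def velocity(t):
    """r'(t) found in Example 2.1.15."""
    return (t + 1, t ** 2 / 2, t ** 3 / 3)

def acceleration(t):
    """The given acceleration a(t) = (1, t, t^2)."""
    return (1.0, t, t ** 2)

def central_difference(f, t, h=1e-5):
    """Componentwise approximation of f'(t)."""
    return tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))
```

At any sample time, `central_difference(position, t)` should agree with `velocity(t)` and `central_difference(velocity, t)` with `acceleration(t)`, and the initial conditions can be read off from `position(0)` and `velocity(0)`.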
Newton's second law relates the force vector $F$ acting on an object with the acceleration
vector $a$ in this famous formula:
\[ F = ma \]
where $m$ is the mass of the object.
Example 2.1.16. Consider a ball with mass $m$ thrown from the ground at an angle $\theta$
and with initial speed $v_0$. If the only external force acting on the ball is the
gravitational force $F_g$ with acceleration $g = 9.8\,\mathrm{m/s^2}$ in the $(0, 0, -1)$ direction, we
find the position $r(t)$ of the ball and the angle $\theta$ that maximizes the horizontal distance
traveled.
We assume that at time $t = 0$, the ball is at the origin $r(0) = (0, 0, 0)$ and we
suppose that the motion of the ball happens completely inside the yz-plane. Then,
the initial velocity vector is given by $r'(0) = (0, v_0 \cos\theta, v_0 \sin\theta)$.
Fig. 2.6. Projections on the y and z axis of the initial velocity $r'(0)$
Because the gravitational force $F_g = 9.8\, m\, (0, 0, -1)$ is the only one acting on the ball,
from Newton's second law $F_g = m a$, we obtain the acceleration
\[ a(t) = 9.8\, (0, 0, -1). \]
Note that the mass cancels. Integrating the acceleration, we obtain the velocity vector
\[ r'(t) = 9.8\, (c_1, c_2, -t + c_3) \]
and the values of the constants are computed using the initial velocity:
$r'(0) = (0, v_0 \cos\theta, v_0 \sin\theta) = 9.8\, (c_1, c_2, c_3)$. Therefore,
\[ r'(t) = \left( 0,\ v_0 \cos\theta,\ -9.8\, t + v_0 \sin\theta \right). \]
Integrating the velocity vector gives the position vector
\[ r(t) = \left( d_1,\ t v_0 \cos\theta + d_2,\ -\frac{9.8\, t^2}{2} + t v_0 \sin\theta + d_3 \right). \]
Because $r(0) = (0, 0, 0)$, we have $d_1 = d_2 = d_3 = 0$ and so
\[ r(t) = \left( 0,\ t v_0 \cos\theta,\ -4.9\, t^2 + t v_0 \sin\theta \right). \]
The ball lands at the time $t^* > 0$ for which $z(t^*) = 0$ and solving for $t^*$ we obtain
$t^* = v_0 \sin\theta / 4.9$. Thus, the distance traveled by the ball is given by
$d(\theta) := y(t^*) = v_0^2 \cos\theta \sin\theta / 4.9$. Using elementary calculus, $d(\theta)$ has a maximum value for
$\theta = \pi/4$.
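The elementary calculus step at the end of Example 2.1.16 can be spelled out as follows. Here $K > 0$ stands for the factor multiplying $\cos\theta \sin\theta$ in $d(\theta)$; it does not depend on $\theta$, and the restriction to $0 \le \theta \le \pi/2$ is an assumption made for this sketch.

```latex
\begin{align*}
d(\theta) &= K \cos\theta \sin\theta = \tfrac{K}{2} \sin(2\theta), \\
d'(\theta) &= K \cos(2\theta) = 0 \iff \theta = \tfrac{\pi}{4} \quad (0 \le \theta \le \tfrac{\pi}{2}), \\
d''\!\left(\tfrac{\pi}{4}\right) &= -2K \sin\!\left(\tfrac{\pi}{2}\right) = -2K < 0,
\end{align*}
```

so the critical point $\theta = \pi/4$ is a maximum of $d$.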
Exercises
(1) For the vector functions below, determine if they are differentiable and/or
smooth on their interval of definition.
(a) r(t) = (t2 2t, sin(2t)) t [0, 1]
2
(b) r(t) = (3t2 , cos t, e1/t ), t [1, 1]
(c) r(t) = ((t )2 , cos t, 2 sin(t/2)), t [2, 2]
(d) r(t) = (4t3 , t2 ), t [1, 3]

2
(e) r(t) = (3t cos t, t2/3 , et )), t [1, 1]
(2) For each vector function of exercise (1), compute the integral of the vector
function.
(3) Let each vector function of exercise (1) describe the trajectory of an object
moving in space. Compute the velocity vector, speed and acceleration vector.
(4) Show that if a vector function r(t) is differentiable at t = t0 , then it is continuous
at t = t0 .
(5) Show that any curve given by a vector function (t, f (t)) is smooth.
(6) Suppose that a force F (t) = (t2 , cos t, sin t) is applied on an object of mass 1.
Find the general formula for the velocity and position vector.
(7) An object falls from a cliff of height $h_0 > 0$ and is subject only to the gravitational
acceleration $g = 9.8\,(0, 0, -1)\ \mathrm{m/s^2}$. Find the velocity and position vectors if the
initial velocity is $v_0$. Determine a formula for the final velocity as the object hits
the ground.
2.2 Best Linear Approximation and Tangent Lines
To discuss the existence of a tangent line at a point $p$ of a curve, we introduce the
concept of best linear approximation. This concept is defined as follows.
Definition 2.2.1. Consider a function $f(x)$ and a linear function $L(x) = a + bx$. We
say that $L(x)$ is the best linear approximation of $f(x)$ at $x = x_0$ if
\[ \lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0. \]
The best linear approximation limit means that the difference $f(x) - L(x)$ approaches
zero near $x = x_0$ at a much faster rate than $1/(x - x_0)$ approaches infinity near $x = x_0$
so that the product tends to zero.
We look at the case of functions of one variable as a reminder. Let $f(x)$ be a
function which is at least twice differentiable. We do a Taylor expansion of $f(x)$ at
$x = x_0$ to first order,
\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|) \tag{2.2} \]
where the symbol $o(|x - x_0|)$ is a short-hand expression for terms of higher degrees;
it is called little o and its exact definition is:
\[ \lim_{x \to x_0} \frac{o(|x - x_0|)}{|x - x_0|} = 0. \]
The right-hand side expression of (2.2) provides an approximation of $f(x)$ for $x$ close
to $x_0$. Consider the linear function $L(x) = f(x_0) + f'(x_0)(x - x_0)$, then we have
\[ \lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = \lim_{x \to x_0} \frac{o(|x - x_0|)}{x - x_0} = 0. \]
In fact, $L(x)$ is the only linear function for which
\[ \lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0. \]
This is shown in the next result.
Theorem 2.2.2. Suppose $f(x)$ is differentiable at $x = x_0$ and $L(x) = a + bx$. Then,
\[ \lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0} = 0 \]
if and only if $f(x_0) = L(x_0)$ and $f'(x_0) = L'(x_0)$.
Proof. This is an if and only if statement that can be proved in one calculation. We
write $L(x) = a + bx = (a + bx_0) + b(x - x_0)$. Begin with the limit:
\begin{align*}
\lim_{x \to x_0} \frac{f(x) - L(x)}{x - x_0}
&= \lim_{x \to x_0} \frac{f(x_0) + f'(x_0)(x - x_0) + o(|x - x_0|) - (a + bx_0 + b(x - x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \frac{[f(x_0) - (a + bx_0)] + (f'(x_0) - b)(x - x_0) + o(|x - x_0|)}{x - x_0} \\
&= \lim_{x \to x_0} \left( \frac{f(x_0) - (a + bx_0)}{x - x_0} + (f'(x_0) - b) + \frac{o(|x - x_0|)}{x - x_0} \right).
\end{align*}
Now, the limit on the last line is zero if and only if the limit of each term is zero.
We check each case. First,
\[ \lim_{x \to x_0} \frac{f(x_0) - (a + bx_0)}{x - x_0} \]
exists if and only if the numerator $f(x_0) - (a + bx_0)$ is zero, in which case the limit
is zero. The third limit is automatically zero by definition of $o(|x - x_0|)$. Therefore,
the second limit is zero if and only if $f'(x_0) = b$. Therefore $a = f(x_0) - f'(x_0)x_0$ and
so $L(x) = f(x_0) + f'(x_0)(x - x_0)$ which means $L(x_0) = f(x_0)$ and $L'(x_0) = f'(x_0)$.
This completes the proof.
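Theorem 2.2.2 can be illustrated numerically: the quotient $(f(x) - L(x))/(x - x_0)$ tends to zero for the tangent line, but tends to $f'(x_0) - b$ for a line through the same point with a different slope $b$. The sketch below uses $f = \sin$ and $x_0 = 1$, which are arbitrary choices for this illustration, as are the names `ratio`, `tangent` and `wrong_slope`.

```python
import math

def ratio(f, line, x0, h):
    """The quotient (f(x) - L(x)) / (x - x0) evaluated at x = x0 + h."""
    x = x0 + h
    return (f(x) - line(x)) / (x - x0)

x0 = 1.0
f = math.sin

def tangent(x):
    """L(x) = f(x0) + f'(x0)(x - x0), the best linear approximation of sin at x0."""
    return math.sin(x0) + math.cos(x0) * (x - x0)

def wrong_slope(x):
    """Same value at x0 as the tangent line, but with slope 0.5 instead of cos(x0)."""
    return math.sin(x0) + 0.5 * (x - x0)
```

For small `h`, `ratio(f, tangent, x0, h)` shrinks roughly in proportion to `h`, whereas `ratio(f, wrong_slope, x0, h)` stays close to `math.cos(x0) - 0.5`.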
Definition 2.2.3. The tangent line to the curve given by $y = f(x)$ at $(x_0, f(x_0))$ is
given by the best linear approximation $L(x)$ of $f(x)$ at $x_0$.
Note that in this case, the tangent line is always given by a function $y = mx + b$.
This is no longer the case for general vector functions, as we show below.
A vector function $q : \mathbb{R} \to \mathbb{R}^n$ defined by
\[ q(t) = a + bt \]
where $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$ is called an affine vector function. If
$a = 0$, it is a linear function. Let $r(t)$ be a parametrization of the smooth curve
$C$. The concept of best linear approximation for $f(x)$ extends naturally to vector
functions and we have the following definition.
Definition 2.2.4. Let $r, q : \mathbb{R} \to \mathbb{R}^n$ where $r(t)$ is a vector function and $q(t)$ is an
affine vector function. We say that $q(t)$ is the best linear approximation of $r(t)$ at
$t = t_0$ if
\[ \lim_{t \to t_0} \frac{r(t) - q(t)}{t - t_0} = 0. \]
The idea of a best linear approximation is used in the sections and chapters that
follow in order to provide optimal approximations to lengths of curves, areas of
surfaces, etc. As with the function f (x) above, we can characterize the best linear
approximation using the derivative.

Theorem 2.2.5. Let $r(t)$ be a vector function and $q(t) = a + bt$ where $a, b \in \mathbb{R}^n$.
Then, $q(t)$ is a best linear approximation at $t = t_0$ if and only if $q(t_0) = r(t_0)$ and
$q'(t_0) = r'(t_0)$.
Proof. The argument is similar to the proof of Theorem 2.2.2 done on each component
of $r(t)$ and $q(t)$ separately.
Therefore, if the curve $C$ is smooth at $p = r(t_0)$, the best linear approximation is
nonconstant and so we have the following definition.
Definition 2.2.6. The tangent line at a smooth point $p \in C$ is given by the best
linear approximation $q(t)$.
Example 2.2.7. Consider the circle of radius 1 given by
\[ r(t) = (\cos t, \sin t) \]
and let $t_0 = 0$. Then, the best linear approximation is $q(t) = (1, t)$ for $t \in \mathbb{R}$. It is a
vertical line in the plane and so we cannot write it as $y = mx + b$.
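The formula for $q(t)$ in Example 2.2.7 follows from the two conditions of Theorem 2.2.5, which force $q(t) = r(t_0) + r'(t_0)(t - t_0)$. A short worked check with $t_0 = 0$:

```latex
\begin{align*}
r(0) &= (\cos 0, \sin 0) = (1, 0), \\
r'(t) &= (-\sin t, \cos t), \qquad r'(0) = (0, 1), \\
q(t) &= r(0) + r'(0)\,(t - 0) = (1, 0) + t\,(0, 1) = (1, t).
\end{align*}
```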
2.2.1 Construction of the Tangent Line
Let $C$ be the curve defined by the vector function $r(t) = (x(t), y(t))$ for $t \in [a, b]$
and differentiable for $t \in (a, b)$. Let $t_0 \in (a, b)$, let $\ell$ be a tangent line at the point $r(t_0)$
and let $(p, q) \in \ell$. See Figure 2.7. Then,
\[ v_{p,q} := (p, q) - (x(t_0), y(t_0)) \]
is the vector joining $r(t_0)$ to $(p, q)$ and $v_{p,q}$ is a nonzero vector. Because $r(t)$ is
differentiable at $t = t_0$, then $v_{p,q} = s\, r'(t_0)$ for some $s \neq 0$ and this means $r'(t_0) \neq 0$.
This corresponds to what is seen in Example 2.1.6: corners can appear only if
$r'(t_0) = 0$.
The construction seen in Figure 2.7 gives us a method to obtain the formula
for the tangent line at points where a curve C is smooth. This is illustrated in the
following example.
Fig. 2.7. A curve $C$ given by $r(t)$ with its tangent vector $r'(t_0)$ at $r(t_0)$, tangent line $\ell$ and a
vector $v_{p,q}$ along $\ell$ joining the point $(p, q) \in \ell$ to $r(t_0)$.
Example 2.2.8. Consider the curve $C$ given by the smooth parametrization $r(t) =
(t^2, 1 - t)$ for $t \in [0, 1]$. The tangent line at $t = \frac{1}{2}$ is obtained by first obtaining the
tangent vector $r'(t) = (2t, -1)$ and evaluating at $t = \frac{1}{2}$: $r'(\frac{1}{2}) = (1, -1)$. Using the
base point $r(\frac{1}{2}) = (\frac{1}{4}, \frac{1}{2})$ and the tangent vector, the equation of the tangent line at
$r(\frac{1}{2})$ is
\[ \ell(s) = \left( \frac{1}{4}, \frac{1}{2} \right) + s\,(1, -1) = \left( \frac{1}{4} + s,\ \frac{1}{2} - s \right) \]
where $s$ is the variable parameterizing the tangent line.
The general method to compute a tangent line is similar to the one presented in
Example 2.2.8. Let $r(t)$ be the smooth parametrization of a curve $C$; we want
to compute the tangent line at $t = t_0$.
(1) Evaluate the vector function at $t = t_0$: $r(t_0)$.
(2) Compute the tangent vector at $t = t_0$: $r'(t_0)$.
(3) Write the formula:
\[ \ell(s) = r(t_0) + s\, r'(t_0). \]
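The three steps translate directly into a small helper. The sketch below assumes the derivative is supplied by hand as a second function (the name `tangent_line` and its signature are hypothetical), and it refuses points where the tangent vector vanishes, since the curve is not smooth there.

```python
def tangent_line(r, r_prime, t0):
    """Return the map s -> r(t0) + s * r'(t0) for a curve r with derivative r_prime."""
    base = r(t0)            # step (1): the base point
    direction = r_prime(t0)  # step (2): the tangent vector
    if all(c == 0 for c in direction):
        raise ValueError("r'(t0) = 0: the curve is not smooth at t0")
    # step (3): the parametrized line
    return lambda s: tuple(b + s * d for b, d in zip(base, direction))

# Example 2.2.8: r(t) = (t^2, 1 - t) at t0 = 1/2.
line = tangent_line(lambda t: (t ** 2, 1 - t), lambda t: (2 * t, -1), 0.5)
```

Here `line(0)` is the base point $(1/4, 1/2)$ and `line(s)` moves along the direction $(1, -1)$, in agreement with Example 2.2.8.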
Example 2.2.9. Consider the curve $C$ with smooth parametrization
\[ r(t) = (\cos(2t),\ e^t \sin t,\ 2t^2) \]
for $t \in [0, 1]$. We find the general formula for the tangent line at any point on the
curve using the approach given above.
(1) Evaluate the vector function: $r(t) = (\cos(2t),\ e^t \sin t,\ 2t^2)$
(2) Compute the tangent vector: $r'(t) = (-2\sin(2t),\ e^t \cos t + e^t \sin t,\ 4t)$
(3) Write the formula:
\[ \ell(s) = r(t) + s\, r'(t). \]
We obtain
\[ \ell(s) = (\cos(2t),\ e^t \sin t,\ 2t^2) + s\,(-2\sin(2t),\ e^t \cos t + e^t \sin t,\ 4t). \]
This formula can now be evaluated at any point $t \in [0, 1]$.
Exercises
(1) Determine the general equation of the tangent line for each vector function.
Then, evaluate at the given t0 value.
(a) r(t) = (t2 2t, sin(2t), t), t0 =
(b) r(t) = (4t3 , t2 , et ), t0 = 0
(c) r(t) = (3tet , t2/3 , t2 ), t0 = 1
(d) r(t) = (et , tet , t2 et ), t0 = 0
2.3 Reparametrizations and arc-length parameter
Consider a straight line segment on the x-axis from $x = a$ to $x = b$. A natural
parametrization is
\[ x(t) = t, \quad y(t) = 0 \]
with $t \in [a, b]$. Note that $\|(x'(t), y'(t))\| = 1$. Let $\lambda > 0$ and consider instead the
parametrization
\[ x(t) = \lambda t, \quad y(t) = 0 \]
with $t \in [a/\lambda, b/\lambda]$. Now $\|(x'(t), y'(t))\| = \lambda$ and the length of the domain is $\lambda^{-1}(b - a)$.
Therefore, the speed of the parametrization being $\lambda$ leads to a contraction of the
domain by a factor of $\lambda$.
Consider the parametrization of a circle of radius $r_0$ given by
\[ x(t) = r_0 \cos(t/r_0), \quad y(t) = r_0 \sin(t/r_0) \]
with $t \in [0, 2\pi r_0)$ and here again we have $\|(x'(t), y'(t))\| = 1$. This parametrization
is similar to the one from Example 1.5.3, but the domain has been dilated by a factor
$r_0$ with the division of $t$ by $r_0$ in cos and sin.
These are examples of reparametrization of a curve.
Remark 2.3.1. Note that for both examples, the curve has a parametric represen-
tation which travels along the curve at speed 1 and the length of the curve (known
from elementary geometry) is the same as the length of the domain interval.
In this section, we present an argument showing that for any curve $C$ there exists a
parametric representation given by a vector function $r(t)$ such that $\|r'(t)\| = 1$. We
single out such parametric representations in the following definition.
Definition 2.3.2. Let $C$ be a curve with parametrization $r(t)$ with $t \in [a, b]$. A
reparametrization of $C$ is an invertible differentiable function $\phi : [c, d] \to [a, b]$,
$t = \phi(s)$, from which we define a parametrization
\[ r(s) = r(\phi(s)) \]
with $s \in [c, d]$ where $c = \phi^{-1}(a)$ and $d = \phi^{-1}(b)$.
In the circle example, the reparametrization from Example 1.5.3 to the one above is
given by $\phi(s) = s/r_0$.
Example 2.3.3. Let $C$ be a curve with parametrization $r(t) = (e^t, e^{2t}, e^{3t})$ with
$t \in [0, 3]$ and consider the reparametrization $t = \phi(s) = \ln(s)$. Then,
\[ r(s) = r(\phi(s)) = (e^{\ln(s)}, e^{2\ln(s)}, e^{3\ln(s)}) = (s, s^2, s^3) \]
with $s \in [1, e^3]$.
Definition 2.3.4. A curve $C$ has an arc-length parametrization $r(s)$ if
\[ \|r'(s)\| = 1 \]
for all $s$.
Example 2.3.5. Let $C$ be a curve with parametrization $r(t) = (t^2/2, t^3/3)$ with
$t \in [0, 1]$. Then $\|r'(t)\| = \sqrt{t^2 + t^4} = t\sqrt{1 + t^2}$. We now find a reparametrization
$t = \phi(s)$ such that $r(s) = r(\phi(s))$ and $\|r'(s)\| = 1$. Begin by computing
\[ \|r'(s)\| = \left\| r'(\phi(s))\, \frac{d\phi}{ds} \right\| = \|r'(\phi(s))\| \left| \frac{d\phi}{ds} \right|. \]
Because we want $\|r'(s)\| = 1$, we set up the equation
\[ 1 = \|r'(\phi(s))\| \left| \frac{d\phi}{ds} \right|. \]
It is reasonable to assume that the parametrization does not change the orientation,
so we have $d\phi/ds > 0$. Since $t = \phi(s)$, we can write
\[ \frac{dt}{ds} = \frac{1}{\|r'(t)\|} = \frac{1}{t\sqrt{1 + t^2}} \tag{2.3} \]
and using the method of separation of variables this becomes
\[ \int_{s(0)}^{s(t)} ds = \int_0^t \tau \sqrt{1 + \tau^2}\, d\tau. \]
The integral on the right can be computed with the substitution rule ($u = 1 + \tau^2$)
and we obtain
\[ s(t) - s(0) = \frac{1}{3}\left( (1 + t^2)^{3/2} - 1 \right). \]
The value of $s(0)$ is not fixed and so we assume $s(0) = \frac{1}{3}$, therefore
\[ s(t) = \frac{1}{3}(1 + t^2)^{3/2}. \tag{2.4} \]
But we can invert this function and obtain
\[ t = \phi(s) = \sqrt{(3s)^{2/3} - 1}. \tag{2.5} \]
Therefore, $C$ has an arc-length parametrization given by
\[ r(s) = \left( \phi(s)^2/2,\ \phi(s)^3/3 \right) = \left( \frac{1}{2}\left( (3s)^{2/3} - 1 \right),\ \frac{1}{3}\left( (3s)^{2/3} - 1 \right)^{3/2} \right) \]
where the range of the parameter $s$ is obtained from (2.4) with $t \in [0, 1]$. That is,
\[ s \in \left[ \frac{1}{3},\ \frac{1}{3}\, 2^{3/2} \right]. \]
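The unit-speed property of the reparametrized curve in Example 2.3.5 can be checked numerically. The sketch below builds $r(\phi(s))$ from equation (2.5) and approximates its speed with a central difference; the names `phi`, `r_arc` and `speed` and the step size are choices made for this illustration.

```python
import math

def phi(s):
    """Equation (2.5): the inverse of s(t) = (1 + t^2)^(3/2) / 3."""
    return math.sqrt((3 * s) ** (2 / 3) - 1)

def r(t):
    """Original parametrization of Example 2.3.5."""
    return (t ** 2 / 2, t ** 3 / 3)

def r_arc(s):
    """Reparametrized curve r(phi(s)), defined for s in [1/3, 2^(3/2)/3]."""
    return r(phi(s))

def speed(curve, s, h=1e-6):
    """Norm of a central-difference approximation of curve'(s)."""
    forward, backward = curve(s + h), curve(s - h)
    return math.hypot(*((f - b) / (2 * h) for f, b in zip(forward, backward)))
```

For sample values of $s$ strictly inside the domain, `speed(r_arc, s)` should be close to 1, whereas `speed(r, t)` follows $t\sqrt{1 + t^2}$.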
From this example, we can extract an algorithm to find the arc-length parametrization
of a vector function $r(t)$ with domain $[a, b]$.
(i) Compute $\|r'(t)\|$. If $\|r'(t)\| = 1$ then $r(t)$ is already an arc-length parametrization.
If not, go to step (ii).
(ii) Set up the equation
\[ s(t) - s(a) = \int_a^t \|r'(\tau)\|\, d\tau. \tag{2.6} \]
Compute the integral on the right if possible; $s(a)$ is arbitrary so you can set it
to a convenient value.
(iii) If possible, invert the formula given by (2.6) to obtain
\[ t = \phi(s). \]
(iv) Determine the domain of $s$: the interval $[s(a), s(b)]$.
(v) Write $r(s) = r(\phi(s))$.
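When steps (ii) and (iii) cannot be carried out by hand, they can still be approximated numerically. The sketch below is one possible approach, not a prescribed method: it takes the speed $\|r'(t)\|$ as a given function, uses composite Simpson's rule for the integral in (2.6) with the convention $s(a) = 0$, and inverts by bisection, which works because $s(t)$ is increasing. All function names are hypothetical.

```python
import math

def arc_length(speed, a, t, n=200):
    """Step (ii): s(t) - s(a) by composite Simpson's rule with n (even) subintervals."""
    h = (t - a) / n
    total = speed(a) + speed(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(a + i * h)
    return total * h / 3

def invert(speed, a, b, s, tol=1e-10):
    """Step (iii): solve arc_length(speed, a, t) = s for t in [a, b] by bisection."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if arc_length(speed, a, mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def example_speed(t):
    """Speed ||r'(t)|| = t * sqrt(1 + t^2) of Example 2.3.5."""
    return t * math.sqrt(1 + t * t)
```

With `example_speed` on $[0, 1]$, `arc_length` should reproduce $\frac{1}{3}((1 + t^2)^{3/2} - 1)$, which differs from (2.4) only by the choice $s(0) = 0$ instead of $s(0) = \frac{1}{3}$.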
It is well known that antiderivatives of continuous functions are in many cases very
difficult to find, and that they may not even exist in terms of elementary functions
(polynomial, rational, trigonometric, exponential, logarithmic functions). Even when
an antiderivative is available, the resulting relation between $s$ and $t$ may not be
invertible explicitly, as the next case shows.
Example 2.3.6. Consider the curve $C$ with parametrization $x(t) = t$ and $y(t) = e^t$,
then a calculation as above leads to a differential equation
\[ \frac{dt}{ds} = \frac{1}{\sqrt{1 + e^{2t}}} \]
and so one must compute the integral
\[ \int (1 + e^{2t})^{1/2}\, dt. \]
The substitution $u = (1 + e^{2t})^{1/2}$ gives the antiderivative $u + \frac{1}{2}\ln\frac{u - 1}{u + 1}$, but the
resulting equation relating $s$ and $t$ cannot be solved explicitly for $t$, and so the arc-length
parametrization cannot be obtained analytically.
Although we cannot compute the arc-length parametrization analytically in all cases

by solving a differential equation such as (2.3), we can show abstractly that the arc-
length parametrization must always exist. This is the content of the next result.
Theorem 2.3.7. A smooth curve $C$ always has an arc-length parametrization.
Proof. Let $r(t)$ be a parametrization of $C$ with $t \in [a, b]$. Let $t = u(s)$ be a
reparametrization with $r(s) = r(u(s))$. We must find a function $u$ such that
\[ \|r'(s)\| = \left\| \frac{dr}{du} \right\| \left| \frac{du}{ds} \right| = 1 \]
for all $s$. This means
\[ \frac{du}{ds} = \frac{1}{\|dr/du\|} = \frac{1}{\sqrt{x_1'(u)^2 + x_2'(u)^2 + \cdots + x_n'(u)^2}} := g(u). \tag{2.7} \]
We can solve this differential equation by separation of variables
\[ \int \frac{1}{g(u)}\, du = \int ds = s. \tag{2.8} \]
Because $r(t)$ is smooth and $g(u) \neq 0$, then $\frac{1}{g(u)}$ is continuous. By the Fundamental
Theorem of Calculus, there exists $F(u)$ such that $F'(u) = 1/g(u)$. So, the solution
is
\[ F(u) = s \]
but $F'(u) = 1/g(u) > 0$ so the inverse of $F$ exists. This means
\[ u(s) = F^{-1}(s) \]
satisfies the required property. Therefore, there always exists an arc-length
parametrization.
Exercises
(1) For the following vector functions, use the transformation t = (s) given to
reparametrize and change the domain accordingly. In each case compute the
reparametrization r(s) with its domain, compute its speed and identify the ones
with arc-length parametrization.
4
(a) r(t) = (t2 , t2 et ), t [0, 2]; t = (s) = s.
(b) r(t) = (cos(8t), sin(8t)), t [0, 4]; t = (s) = s/8.

(c) r(t) = (et , e2t / 2, e3t /3), t [0, ln(10)]; t = (s) = ln(s).
s
(d) r(t) = (a + bt, c dt, g + ht), t = (s) =
t [0, 1]; .
b2 + d2 + h2
(2) Compute s(t) given by equation (2.6) for the following vector functions. Invert
the relationship to t = (s) if possible.
(a) r(t) = (t cos t, t sin t)
(b) r(t) = (et , e2t , e4t )
(c) r(t) = (t2 , cos(t2 ), sin(t2 ))
(d) r(t) = (2et/2 cos t, 2et/2 sin t)
3 Tangent Spaces and 1-forms
We define the concept of tangent space using specific examples: curves, the spaces
$\mathbb{R}^n$ and finally two-dimensional surfaces given by $z = f(x, y)$. The tangent space
to a geometric object is an important topic in advanced calculus and differential
geometry. The second part introduces a formal definition of differential and the
construction of the so-called 1-forms.
3.1 Tangent spaces
We begin by discussing how tangent lines lead to tangent spaces and build more
general tangent spaces for $\mathbb{R}^n$ and surfaces.
3.1.1 From Tangent Lines to Tangent Spaces
The previous section shows how to determine the tangent line to a curve $C$ at some
point $p \in C$ using the derivative of a vector function $r(t)$ defining $C$. The goal of
this section is to extract the vectors which lie on the tangent lines so that we can
add them together without leaving the tangent line.
Example 3.1.1. Consider the plane curve $C$ given by $x(t) = t^2$, $y(t) = 1 - t$ for
$t \in [0, 1]$. We obtain the tangent line equation at $t = 1/2$ by the formula
\[ \ell(s) = (x(1/2), y(1/2)) + s\,(x'(1/2), y'(1/2)) = (1/4, 1/2) + s\,(1, -1) \]
where $s \in \mathbb{R}$ is a parameter. The tangent line is not a subspace: if one adds two
elements of $\ell(s)$, then the result is not in $\ell(s)$ anymore:
\[ \left( \frac{1}{4} + s_1,\ \frac{1}{2} - s_1 \right) + \left( \frac{1}{4} + s_2,\ \frac{1}{2} - s_2 \right) = \left( \frac{1}{2} + (s_1 + s_2),\ 1 - (s_1 + s_2) \right). \]
However, if we decide to fix the base point $(x(1/2), y(1/2))$ and only add the
parametrized part and add the base point after, this leaves us on the tangent line:
\[ s_1(1, -1) + s_2(1, -1) = (s_1 + s_2, -s_1 - s_2) = (s_1 + s_2)(1, -1) \]
and so $(x(1/2), y(1/2)) + (s_1 + s_2)(1, -1)$ is a point of the tangent line. See Figure 3.1
for an illustration.
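The difference between adding points of the tangent line and adding tangent vectors can be checked with a few lines of code. The sketch below uses the fact that the tangent line of Example 3.1.1 is the set of points with $x + y = 3/4$; the helper names are hypothetical.

```python
def on_tangent_line(point, tol=1e-12):
    """True if point = (1/4, 1/2) + s(1, -1) for some s, that is, if x + y = 3/4."""
    x, y = point
    return abs(x + y - 0.75) < tol

def line_point(s):
    """The point l(s) = (1/4 + s, 1/2 - s) of the tangent line."""
    return (0.25 + s, 0.5 - s)

def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))

def scale(c, v):
    """Scalar multiple c * v."""
    return tuple(c * a for a in v)
```

Adding two points of the line, for instance `add(line_point(1.0), line_point(2.0))`, gives a point that fails `on_tangent_line`, while adding the vectors `scale(1.0, (1, -1))` and `scale(2.0, (1, -1))` first and then attaching the base point `(0.25, 0.5)` stays on the line.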
With this in mind, we can now introduce the concept of tangent space, which is a
crucial aspect of advanced calculus and of differential geometry in general.
Definition 3.1.2. Let $C$ be a smooth space curve (in $\mathbb{R}^n$). If $p \in C$ then the tangent
space of $C$ at $p$, denoted by $T_pC$, is the set of all vectors tangent to $C$ at the point
$p$.
Fig. 3.1. Curve $C$ given by $r(t) = (t^2, 1 - t)$ and the tangent line $\ell(s)$ at $p = (1/4, 1/2)$. The orientation is from $(0, 1)$ to $(1, 0)$.
If the vector function $r(t)$ defines $C$ and $p = r(t_0)$, $T_pC$ is a 1-dimensional vector
space with basis $\{r'(t_0)\}$. In Example 3.1.1, $r'(1/2) = (1, -1)$ and for $p = (1/4, 1/2)$,
\[ T_pC = \mathrm{span}(r'(1/2)) = \{\alpha(1, -1) \mid \alpha \in \mathbb{R}\}. \]
$T_pC$ lies on top of the tangent line $\ell(s)$, but it is a different geometric object: it is
made up of vectors.
Apart from the fact that tangent spaces are vector spaces, another advantage is
that they do not depend on the vector function used to parametrize $C$.
Example 3.1.3. Let $C$ be the circle of radius $a$ and $p = (0, a)$. The vector functions
\[ r_1(t) = (a\cos(t), a\sin(t)), \quad t \in [0, \pi] \]
\[ r_2(t) = (-t, \sqrt{a^2 - t^2}), \quad t \in [-a, a] \]
both parametrize the upper portion of $C$ with the counterclockwise orientation and
$p = r_1(\pi/2) = r_2(0)$. Now
\[ r_1'(\pi/2) = (-a\sin(\pi/2), a\cos(\pi/2)) = (-a, 0) \]
and
\[ r_2'(0) = (-1, 0). \]
So, $r_1'(\pi/2) = a\, r_2'(0)$ and they span the same tangent space as shown in Figure 3.2.
3.1.2 Tangent Spaces in any Dimension
We extend the concept of tangent space to any dimension. The following example
illustrates the situation.
Example 3.1.4. Consider Cartesian coordinate lines in $\mathbb{R}^2$: $C_1$ horizontal, $x_1(t) = t$,
$y_1(t) = y_0$ and $C_2$ vertical, $x_2(t) = x_0$, $y_2(t) = t$. Then, for $C_1$ and $C_2$ we have
\[ (x_1'(t), y_1'(t)) = (1, 0) = e_1 \quad\text{and}\quad (x_2'(t), y_2'(t)) = (0, 1) = e_2 \]
for all $t \in \mathbb{R}$. Let $p_1 = (t_1, y_0)$ and $p_2 = (x_0, t_2)$ for $t_1, t_2 \in \mathbb{R}$. Then,
\[ T_{p_1}C_1 = \{\alpha e_1 \mid \alpha \in \mathbb{R}\} \quad\text{and}\quad T_{p_2}C_2 = \{\alpha e_2 \mid \alpha \in \mathbb{R}\}. \]
Fig. 3.2. Tangent vectors of two different parametric representations of $C$ at $(0, a)$.
Consider $p_1, q_1 \in C_1$ with $p_1 \neq q_1$, then $T_{p_1}C_1$ and $T_{q_1}C_1$ are spanned by the same
basis vector $e_1$, but $T_{p_1}C_1 \neq T_{q_1}C_1$ because those are located at different points. To
emphasize this distinction, our convention is to distinguish the vectors in a tangent
space by labelling those using their base points, for instance $e_1(p_1) \in T_{p_1}C_1$ and
$e_1(q_1) \in T_{q_1}C_1$.
Fig. 3.3. Tangent spaces of $C_1$ at $p$ and $q$ with representative vectors.
Example 3.1.5. We return to the plane curve $C$ of Example 3.1.1 given by $x(t) =
t^2$, $y(t) = 1 - t$ for $t \in [0, 1]$ at $t = 1/2$ and consider the elements of $T_pC$ with
$p = (1/4, 1/2)$. Those can be written as a linear combination of the tangent vectors
of the coordinate lines. Indeed, $v \in T_pC$ is written $v = \alpha\, r'(1/2)$ and
\[ r'(1/2) = 1\,e_1(p) + (-1)\,e_2(p) \]
where $p$ is also an element of $C_1 \cap C_2$ and $e_1(p) \in T_pC_1$ and $e_2(p) \in T_pC_2$.
In particular, this example shows that $\{e_1(p), e_2(p)\}$ can span the tangent vectors of
any curve passing through $p$. We define basis vectors at a point $p$ in $\mathbb{R}^n$ from which
we can decompose any vector located at $p$. This leads us to this other important
definition.
Fig. 3.4. Decomposition of vector $r'(1/2) = (1, -1) \in T_pC$ along the $e_1(p)$ and $e_2(p)$ directions.
Definition 3.1.6. The tangent space at the point $p \in \mathbb{R}^n$, denoted by $T_p\mathbb{R}^n$, is the
set of all vectors based at the point p.
Example 3.1.7. We look at the cases n = 1, 2.
(1) Let $t_0 \in \mathbb{R}$, then $T_{t_0}\mathbb{R} = \{v = h e_1 \mid h \in \mathbb{R}, e_1 = 1\}$.
(2) As we saw above, using $e_1(p) = (1, 0)$ and $e_2(p) = (0, 1)$ then
$T_p\mathbb{R}^2 = \{v = \xi_1 e_1(p) + \xi_2 e_2(p) \mid \xi_1, \xi_2 \in \mathbb{R}\}$.
The tangent space $T_p\mathbb{R}^n$ can be constructed using families of curves passing through
p. As an example, let n = 2 again. We show that any vector $v \in T_p\mathbb{R}^2$ can be
obtained as the tangent vector of a curve passing through p. Let $p = (x_0, y_0)$ and
$v = (v_1, v_2) \in T_p\mathbb{R}^2$. Consider the curve
$r(t) = (x_0 + t v_1, y_0 + t v_2)$.
Then $r(0) = p$ and
$r'(0) = (v_1, v_2)$.
Proposition 3.1.8. $T_p\mathbb{R}^n$ is an n-dimensional vector space and the set
$\{e_1(p), \ldots, e_n(p)\}$
of unit tangent vectors to the coordinate lines forms a basis. The notation $e_j(p)$ is
used to emphasize that the location of the basis vectors is at p.
Proof. See exercises.

The role of the derivative is intimately linked to the tangent spaces as the following
result shows.
Proposition 3.1.9. Let r(t) define the smooth curve C and $p = r(t_0) \in C$. Then,
$r'(t_0) : T_{t_0}\mathbb{R} \to T_p C$. That is, $r'(t_0)$ is a mapping taking vectors from $T_{t_0}\mathbb{R}$ to $T_p C$.
Proof. We know $T_p C = \{\lambda r'(t_0) \mid \lambda \in \mathbb{R}\}$. But $\lambda$ can be expressed as $\lambda e_1 \in T_{t_0}\mathbb{R}$.
Thus, any vector $w \in T_p C$ can be written as $w = r'(t_0) v$ where $v \in T_{t_0}\mathbb{R}$.
Fig. 3.5. The derivative $r'(t)$ is a mapping from $T_t\mathbb{R}$ into $T_p C$. For any $v \in T_t\mathbb{R}$, one obtains
$w = r'(t) v \in T_p C$.
This result shows that for a vector function $r : [a, b] \to \mathbb{R}^n$ defining a curve C, the
derivative is a function which takes vectors in $T_t\mathbb{R}$ and gives a vector in $T_p C$. This
interpretation of the derivative as a function on tangent spaces is fundamental.
This is shown in Figure 3.5.
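The reading of the derivative as a map between tangent spaces can be sketched in a few lines of Python. The sketch assumes the curve of Example 3.1.1 in the form $r(t) = (t^2, 1 - t)$; the function names `r_prime` and `pushforward` are hypothetical and chosen for this illustration.

```python
def r_prime(t):
    """Exact derivative of r(t) = (t**2, 1 - t)."""
    return (2.0 * t, -1.0)

def pushforward(t0, h):
    """Apply r'(t0) to the tangent vector v = h*e1 in T_{t0}R.

    The result is the vector h*r'(t0), which lies in T_pC with p = r(t0).
    """
    return tuple(h * c for c in r_prime(t0))

w = pushforward(0.5, 3.0)        # image of v = 3*e1 based at t0 = 1/2
w_sum = pushforward(0.5, 5.0)    # image of (2 + 3)*e1
w_parts = tuple(x + y for x, y in zip(pushforward(0.5, 2.0), pushforward(0.5, 3.0)))
print(w, w_sum, w_parts)
```

The last two values agree, which illustrates the linearity of $v \mapsto r'(t_0)v$ on this particular curve.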
3.1.3 Vector Fields
Using the concept of tangent space, we can introduce a useful interpretation of
mappings $f : \mathbb{R}^n \to \mathbb{R}^n$ called the vector field. Let $p \in \mathbb{R}^n$ and consider $T_p\mathbb{R}^n$.
As a vector space $T_p\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$ because they have the same dimension.
Therefore, in the discussion that follows, we drop the dependence on p in the tangent
space notation and label all tangent spaces as just $\mathbb{R}^n$.
Definition 3.1.10. A mapping $f : \mathbb{R}^n \to \mathbb{R}^n$ is a vector field if for $p \in \mathbb{R}^n$, $f(p) \in \mathbb{R}^n$
is interpreted as a vector in $T_p\mathbb{R}^n$.
The concept of vector field is very important in physics and is at the heart of the
qualitative theory of differential equations pioneered by Henri Poincaré in the late
19th century. We now show some examples to illustrate this concept.
Example 3.1.11. Let F, G : R2 R2 be vector fields defined by F (x, y) = (x, y)

and G(x, y) = (y, x), shown in Figure 3.6. The arrows are obtained by evaluating F
and G at a number of sample points. For instance, F (0, 0) = (0, 0), F (1, 1) = (1, 1)
and F (1, 0) = (1, 0).
Fig. 3.6. Vector fields F (x, y) (top) and G(x, y) (bottom). Note that the size of the arrows is
normalized to a unique size for convenience.
The next example is an important example from physics.
Example 3.1.12. Newton's law of gravitation says that two bodies are attracted
with a force proportional to the masses of the bodies m and M and inversely proportional to the square of the distance. The magnitude of this force is written
$F(r) = \frac{mMG}{r^2}$
where G is Newton's gravitational constant. In the case of an isolated body of
large mass M (e.g. the Earth) and much lighter objects in its neighborhood, it is
customary to place the centre of mass of M at the origin. Therefore, all bodies are
attracted radially towards the origin and for a body located at $\mathbf{x} = (x, y, z)$, we write
$\mathbf{F}(\mathbf{x}) = -F(r)\,\frac{\mathbf{x}}{\|\mathbf{x}\|}$
where $r = \|\mathbf{x}\|_2$ and the factor $-\mathbf{x}/\|\mathbf{x}\|$ is the unit direction vector pointing radially
towards the origin. The function $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ defines a vector field.
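As an illustration, the gravitational vector field can be evaluated at sample points. This is a minimal sketch, assuming the field has the form $-F(r)\,\mathbf{x}/\|\mathbf{x}\|$ with $F(r) = mMG/r^2$; the function name `gravity` and the unit masses used below are hypothetical, and the value of `G` is the usual SI approximation.

```python
import math

G = 6.674e-11  # gravitational constant in SI units (approximate value)

def gravity(x, m, M):
    """Force on a mass m located at x, due to a mass M placed at the origin."""
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        raise ValueError("the field is not defined at the origin")
    magnitude = m * M * G / r**2
    # Multiply the magnitude by the unit vector -x/||x|| pointing towards the origin.
    return tuple(-magnitude * c / r for c in x)

f_near = gravity((1.0, 0.0, 0.0), 1.0, 1.0)
f_far = gravity((2.0, 0.0, 0.0), 1.0, 1.0)
print(f_near, f_far)
```

Doubling the distance divides the magnitude by four, and the arrows point back towards the origin, as the formula predicts.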
Example 3.1.13. Here is an example of a vector field on $\mathbb{R}$: $F(x) = x(1 - x)$. In this
case, all arrows lie directly on the line. For zero vectors, one draws a point.
Fig. 3.7. Vector field on $\mathbb{R}$ given by $F(x) = x(1 - x)$. The size of the arrows is relative.
A vector field can be defined not only over whole spaces, but also subsets of spaces.
For instance, one can define a vector field on a curve C.
Example 3.1.14. To define a vector field on a curve, one must use a parametric representation of the curve. Consider a circle C of radius 1 with parametrization given
by $r(\theta) = (\cos\theta, \sin\theta)$ with $\theta \in [-\pi, \pi)$. An example of a vector field on C is
given by v : [, ) Tp C ' R with v() = 2 2 2 . This vector field is shown in
Figure 3.8.
Fig. 3.8. Vector field on a circle.

The arrows are in the tangent
spaces at all points of the circle.

Note the 0 arrows at = .
2
Vector fields are discussed in more detail in several upcoming sections.
3.1.4 Tangent Plane to a Surface
For curves in space, the question of the existence of a tangent line to the curve is
an important aspect to consider as it gives a best linear approximation to the curve
locally. In particular, tangent lines exist at points in C where the curve is smooth.
Consider now a two-dimensional surface S given by z = f(x, y) and consider the
question of the existence and computation of a tangent plane to a surface. We do
not consider the question of whether a tangent plane exists or not just yet, but focus
on the computation.
Fig. 3.9. Two-dimensional surface S with curves $C_1$ and $C_2$ projecting to coordinate lines in the
xy-plane and intersecting at $p = (x_0, y_0, f(x_0, y_0))$.
Definition 3.1.15. Let S be a surface in $\mathbb{R}^3$ given by z = f(x, y) and $p = (x_0, y_0, z_0)$
where $z_0 = f(x_0, y_0)$. If $p \in S$ is a point at which a unique tangent plane exists, then
the tangent space of S at p, denoted by $T_p S$, is the set of all vectors tangent to S at
p.
With the following calculation, we begin our characterization of $T_p S$. Consider a
point $p = (x_0, y_0, f(x_0, y_0)) \in S$ and the curves $C_1$ and $C_2$ given by
$r_1(t) = (x_0 + t, y_0, f(x_0 + t, y_0))$ and $r_2(t) = (x_0, y_0 + t, f(x_0, y_0 + t))$
passing through p; that is, $r_1(0) = p = r_2(0)$. The projections of $C_1$ and $C_2$ in the
xy-plane are Cartesian coordinate lines. Figure 3.9 shows a surface S with the curves
$C_1$ and $C_2$.
Computing the derivative at t = 0 gives the tangent vectors of both curves at
p. Using the chain rule for partial derivatives we obtain
$r_1'(0) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right)$ and $r_2'(0) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right)$.
These tangent vectors to $C_1$ and $C_2$ are shown in Figure 3.10. This shows that $r_1'(0)$
and $r_2'(0)$ are vectors tangent to S at p. We can now state our result.
Theorem 3.1.16. Let S be the surface given by z = f(x, y) and
$p = (x_0, y_0, f(x_0, y_0)) \in S$.
Then, $T_p S$ is a vector space of dimension two with basis given by $\{\tau_1(p), \tau_2(p)\}$ where
$\tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right)$ and $\tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right)$.
Fig. 3.10. Two-dimensional surface S with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at $p = (x_0, y_0, f(x_0, y_0))$.
The proof is interesting but technical, and is left to the end of the section. We begin
with the simplest example.
Example 3.1.17. Consider the plane P given by equation $3x - y + 2z = 3$. A plane
tangent to P at any point p should correspond to P itself. We verify this. We compute
the tangent plane to P at the point p = (1, 2, 1). We write $z = f(x, y) = \frac{1}{2}(3 - 3x + y)$
and from Theorem 3.1.16 we have
$\tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, -3/2)$
and
$\tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 1/2)$.
Those are indeed linearly independent vectors tangent to P at p. The tangent plane
of P at p is obtained by taking
$T_p P = \mathrm{span}(\tau_1(p), \tau_2(p)) = \{\alpha(1, 0, -3/2) + \beta(0, 1, 1/2) \mid \alpha, \beta \in \mathbb{R}\} = \{(\alpha, \beta, -3\alpha/2 + \beta/2) \mid \alpha, \beta \in \mathbb{R}\}$.
As expected, $T_p P$ is independent of p.
We look at an example where the partial derivatives depend on the base point, so
the tangent plane depends on its location.
Example 3.1.18. Consider the elliptic paraboloid surface E given by
$z = f(x, y) = x^2 + y^2$
for $(x, y) \in \mathbb{R}^2$. We determine the equation for the tangent plane at p = (1, 2, 5). We
obtain the tangent vectors to E
$\tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(1, 2)\right) = (1, 0, 2)$
and
$\tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(1, 2)\right) = (0, 1, 4)$
based at p and so
$T_p E = \{(\alpha, \beta, 2\alpha + 4\beta) \mid \alpha, \beta \in \mathbb{R}\}$.
The paraboloid and its tangent plane at p are shown in Figure 3.11.
Fig. 3.11. Two-dimensional surface S with tangent vectors $\tau_1(p)$ and $\tau_2(p)$ at p.
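The basis of Theorem 3.1.16 can be approximated numerically for any surface z = f(x, y). The following is a minimal sketch that uses central finite differences in place of exact partial derivatives; the function names `partials` and `tangent_basis` are hypothetical, and the paraboloid of Example 3.1.18 serves as test case.

```python
def partials(f, x0, y0, h=1e-6):
    """Central finite-difference approximations of df/dx and df/dy at (x0, y0)."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    return fx, fy

def tangent_basis(f, x0, y0):
    """Return the two vectors (1, 0, f_x) and (0, 1, f_y) based at (x0, y0, f(x0, y0))."""
    fx, fy = partials(f, x0, y0)
    return (1.0, 0.0, fx), (0.0, 1.0, fy)

def paraboloid(x, y):
    return x**2 + y**2

t1, t2 = tangent_basis(paraboloid, 1.0, 2.0)
print(t1, t2)
```

For smooth f the approximation error is of order h squared, so the printed vectors should agree with (1, 0, 2) and (0, 1, 4) up to rounding.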
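Proof of Theorem 3.1.16. We begin by showing that all vectors in the vector
subspace $\mathrm{span}\{\tau_1(p), \tau_2(p)\}$ are tangent to S at p. Choose an arbitrary element
$w \in \mathrm{span}\{\tau_1(p), \tau_2(p)\}$. Let $\alpha, \beta \in \mathbb{R}$ and
$w := \alpha\tau_1(p) + \beta\tau_2(p) = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right)$
and consider the curve on S given by the vector function
$r(t) = (x_0 + \alpha t, y_0 + \beta t, f(x_0 + \alpha t, y_0 + \beta t))$, $t \in \mathbb{R}$.
Then, $r(0) = p$ and
$r'(0) = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right)$
is tangent to S at p with $w = r'(0)$. So $w \in T_p S$ and therefore
$\mathrm{span}\{\tau_1(p), \tau_2(p)\} \subset T_p S$.
If $T_p S$ contains a vector v not in
$\mathrm{span}\{\tau_1(p), \tau_2(p)\}$,
then v is linearly independent from $\tau_1(p)$ and $\tau_2(p)$ and
$\mathrm{span}\{\tau_1(p), \tau_2(p), v\}$
is a three-dimensional vector space which would mean $T_p S = \mathbb{R}^3$. This implies
$S = \mathbb{R}^3$ which is a contradiction. Therefore, $T_p S = \mathrm{span}\{\tau_1(p), \tau_2(p)\}$.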
Exercises
(1) For each curve C given by r(t), answer the question:
(a) $r(t) = (t\cos t, t\sin t)$. Find $T_p C$ where $p = r(\pi)$.
(b) $r(t) = (e^t, e^{2t}, e^{4t})$. Find $T_p C$ where $p = r(\ln 2)$.
(c) $r(t) = (\cos(8t), \sin(8t))$. Find a general formula for $T_p C$ at an arbitrary
point p = r(t).
(d) $r(t) = (e^t, e^{2t}/\sqrt{2}, e^{3t}/3)$. Find a general formula for $T_p C$ at an arbitrary
point p = r(t).
(2) Draw (by hand) the vector field F (x, y) by sampling a sufficient number of
points in each quadrant. Compare your answer with a picture obtained using
software.
(a) F (x, y) = (1 xy, 2 + x + y)
(b) F (x, y) = (2x, 3y)
(c) F (x, y) = (y, x x2 )
(3) For each surface S given by z = f(x, y), determine $T_p S$ by finding $\tau_1(p)$ and
$\tau_2(p)$.
(a) $z = 3x^2 - 4y^2$ at p = (2, 1, 8).
(b) $z = \cos(xy)$ at $p = (1, \pi/2, 0)$.
(c) $z = x^3 - 3xy^2$ at p = (0, 0, 0).
(d) $z = \sqrt{1 - (x^2 + y^2)}$ at $p = (1/2, 1/2, 1/\sqrt{2})$.
(4) Check whether the vectors below are in Tp S or not for problems 3(b) and 3(c).
(a) v = (2, 3, 2 3) (b) w = (1, 2, 1)

(c) u = (0, 4, 16) (d) q = (1, 2, 0).
(5) $T_p S$ can also be computed for surfaces given in other coordinate systems. In
the case of the cylindrical coordinate system, the basis vectors for $T_p S$ are given
by evaluating along the r and $\theta$ coordinate lines in the xy-plane. Let $p = (r_0, \theta_0, f(r_0, \theta_0)) \in S$.
(a) Draw a sample picture for S (as in Figure 3.9) showing the curves $C_1$ and
$C_2$ on S given by $r_1(t) = (r_0 + t, \theta_0, f(r_0 + t, \theta_0))$ and $r_2(t) = (r_0, \theta_0 + t, f(r_0, \theta_0 + t))$.
(b) Compute the derivative of $r_1(t)$ and $r_2(t)$ at t = 0.
(c) Argue geometrically that the vectors $f_1(p) = r_1'(0)$ and $f_2(p) = r_2'(0)$ are
linearly independent and conclude that
$T_p S = \mathrm{span}\{f_1(p), f_2(p)\}$.
(6) Using the method outlined in the previous problem, compute $T_p S$ for S given
by $z = f(r, \theta) = \sqrt{1 - r^2}$ at $(r, \theta, z) = (\sqrt{2}/2, \pi/4, 1/\sqrt{2})$. Compare with problem
3(d): are the tangent spaces the same?
(7) Consider the curve C given by r(t) and the point p specified. In each case, answer
the question.
(a) $r(t) = (1 + t^2, t - 3)$ and $p = r(1) = (2, -2)$: find the vector $v \in T_1\mathbb{R}$ such
that $(10, 5) = r'(1)v$.
(b) $r(t) = (t, e^t, e^{2t})$ and $p = r(0) = (0, 1, 1)$: Is $(4, 4, 8) \in T_p C$? Explain why.
(c) $r(t) = (t^2, 2t, t^3)$ and $p = r(1) = (1, 2, 1)$: show that if $v_1 = 2 \in T_1\mathbb{R}$ and
$v_2 = 3 \in T_1\mathbb{R}$, then $r'(1)(v_1 + v_2) = r'(1)v_1 + r'(1)v_2$.
(8) Show Proposition 3.1.8.
3.2 Differentials
In several elementary books about calculus, the concept of the differential of a function y = f(x) is introduced in a simple fashion, by saying that the differential is
$dy = f'(x)\,dx$ where dx is an independent variable taking real values. The differential reappears when discussing definite integrals under the integral sign as one takes
the limit of Riemann sums to define the integral. The reason for the appearance of
dx under the integral sign is either not mentioned or the author states that it has
no meaning save to identify the variable to be integrated or it is the x magically
transforming into dx as the limit is taken. Finally, when introducing the substitution
rule, say u = g(x), then dx suddenly has a meaning again since now the dx under
the integral sign must be changed to du using $du = g'(x)\,dx$. After this, one would
understand a student to be confused about the concept of differential.
Our goal in this section is to put the concept of differential over solid foundations
such that its use in differentiation and integration becomes clear. The differential we
define in this section must at least satisfy the following properties:
(1) dx should take values in $\mathbb{R}$.
(2) If y = f(x), then we must have $dy = f'(x)\,dx$.
(3) The dx under the $\int$ sign must have a geometric meaning.
3.2.1 The differential in one-dimension
Example 3.2.1. Consider the curve C on $\mathbb{R}$ given by I(t) = t.
(1) At a point $t_0 \in \mathbb{R}$, we compute the tangent vector $A = I'(t_0)e_1 = e_1$.
(2) Let $v = h e_1 \in T_{t_0}\mathbb{R}$ and compute the dot product of v with A:
$A \cdot v = (1 e_1) \cdot (h e_1) = h(e_1 \cdot e_1) = h \in \mathbb{R}$.
(3) Let $v, w \in T_{t_0}\mathbb{R}$, then using the properties of the scalar product
$A \cdot (v + w) = A \cdot v + A \cdot w$ and $A \cdot (bv) = b(A \cdot v)$.
We use this example to define the differential.
Definition 3.2.2. The differential on $\mathbb{R}$, written $(dt)_{t_0}$, is a linear function $(dt)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined by:
$(dt)_{t_0}(v) := A \cdot v = h$, where $v = h e_1$
and $A = I'(t_0)e_1$. Because $(dt)_{t_0}$ is the same at all base points $t_0$, we write only dt.
Example 3.2.3. The differential and the norm return different information. For instance, let $v = (-5)e_1$, then $dt(v) = -5$ while $\|v\| = 5$.
Thus, we use the tangent vector A to the curve I(t) = t to define a function dt
which takes tangent vectors and returns a real number corresponding to the length
and the direction of those vectors. The direction of the vector is lost when using the
norm instead of the differential. This definition may seem cumbersome and clumsy
at the level of R and one should see this as the initial brick to more complicated con-
structions. Its strength is that it generalizes in a natural way to higher dimensional
spaces $\mathbb{R}^n$, to functions, curves and surfaces. The differential of a general function
f(t) is defined by thinking of the curve C on $\mathbb{R}$ with parametrization given by f(t).
Definition 3.2.4. The differential of a differentiable function $f : \mathbb{R} \to \mathbb{R}$ at $t_0 \in \mathbb{R}$
is a function $(df)_{t_0} : T_{t_0}\mathbb{R} \to \mathbb{R}$ defined as follows. Let A be the tangent vector of f at
$t_0$, $A = f'(t_0)e_1$, and let $v = h e_1$, then
$(df)_{t_0}(v) := A \cdot v$.
Part (3) of Example 3.2.1 shows that dt is a linear function and this generalizes
automatically to differentiable functions.
Proposition 3.2.5. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function, $v, w \in T_{t_0}\mathbb{R}$ and
$\alpha, \beta \in \mathbb{R}$, then
$(df)_{t_0}(\alpha v + \beta w) = \alpha(df)_{t_0}(v) + \beta(df)_{t_0}(w)$.
We now can obtain our first important result with our definition of differential.
3.2.2 The classical differential formula
Using the definition above we now obtain the differential as seen in elementary
calculus classes. Let s = f(t) and $v = h e_1$ and recall that dt(v) = h, then
$$\begin{aligned}
(ds)_{t_0}(v) &= (df)_{t_0}(v) \\
&= (f'(t_0)e_1) \cdot v \\
&= f'(t_0)\,h\,(e_1 \cdot e_1) \\
&= f'(t_0)\,h \\
&= f'(t_0)\,dt(v).
\end{aligned} \tag{3.1}$$
Therefore, we recover the well-known formula from elementary calculus courses,
namely
$(ds)_{t_0} = f'(t_0)\,dt$.
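The formula can be mirrored in code by treating a differential as a function on tangent vectors. This is a minimal sketch in which a tangent vector $v = h e_1$ is represented by the number h; the function name `differential` and the sample function $f(t) = t^3$ are assumptions made for the illustration.

```python
def differential(f_prime, t0):
    """Return (df)_{t0} as a function on tangent vectors v = h*e1, encoded by h."""
    def df(h):
        # (df)_{t0}(v) = (f'(t0) e1) . (h e1) = f'(t0) * h
        return f_prime(t0) * h
    return df

t0 = 2.0
dt = differential(lambda t: 1.0, t0)          # differential of I(t) = t
ds = differential(lambda t: 3.0 * t**2, t0)   # differential of f(t) = t**3

h = 0.5
print(ds(h), 3.0 * t0**2 * dt(h))  # both sides of (ds)_{t0} = f'(t0) dt
```

Note that `dt(-5.0)` returns -5.0, so the sign of the vector is kept, unlike with the norm.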
Fig. 3.12. Geometric representation of the differential as the increment along the y-axis given by
the projection of a tangent vector to the curve with projection $\Delta t$ along the x-axis.
The geometric meaning of the differential is similar to that in elementary calculus. Let
$t_0 \in \mathbb{R}$ and $v = (\Delta t)e_1 \in T_{t_0}\mathbb{R}$ be an increment from $t_0$. Then,
$ds(\Delta t) = f'(t_0)\,dt(\Delta t)$.
However, for s = f(t), the formula $ds = f'(t)\,dt$ now has additional meaning since we
know that dt (and so ds) are functions defined on tangent spaces. We can add and
multiply those functions as with any other function and this is useful in the following
sections. In particular, this justifies the Leibniz notation for the derivative,
$\frac{ds}{dt} = f'(t)$
where the term on the left is genuinely the division of ds by dt.
3.2.3 Differentials on higher dimensional spaces
We now extend the differential to tangent spaces of arbitrary dimensions. Essentially,
a coordinate line $C_j$ can be seen as a function $\mathbb{R} \to \mathbb{R}$ since all the other coordinates
$x_i = x_{i0}$ are constant for $i \neq j$. We begin with $\mathbb{R}^2$. Let $p = (x_0, y_0) \in \mathbb{R}^2$ and
$u = (\xi_1, \xi_2) \in T_p\mathbb{R}^2$. Because the tangent spaces of coordinate lines $T_p C_1$ and $T_p C_2$
are one-dimensional subspaces of $T_p\mathbb{R}^2$, we extend our definition of differential to
$dx, dy : T_p\mathbb{R}^2 \to \mathbb{R}$
as follows. The coordinate lines $C_1$ and $C_2$ are given as $(x_1(t), y_1(t)) = (t, y_0)$ and
$(x_2(t), y_2(t)) = (x_0, t)$, then
$(dx)_{(x_0,y_0)}(u) := (x_1'(t), y_1'(t)) \cdot u = (1, 0) \cdot (\xi_1, \xi_2) = \xi_1$
and
$(dy)_{(x_0,y_0)}(u) := (x_2'(t), y_2'(t)) \cdot u = (0, 1) \cdot (\xi_1, \xi_2) = \xi_2$.
Therefore, the geometric meaning of the differentials dx and dy is that they return the
projections of u along the x and y directions respectively. This construction extends
automatically to higher dimensional spaces.
Proposition 3.2.6. Let $x_1, \ldots, x_n$ be Cartesian coordinates on $\mathbb{R}^n$ and consider the
vector $v = (\xi_1, \ldots, \xi_n) \in T_p\mathbb{R}^n$, then
$dx_j(v) = \xi_j$
for $j = 1, \ldots, n$.
We can now begin our discussion of differentials in the context of functions of several
variables.
Differentials for functions of several variables

Consider a surface S given by a function z = f(x, y). We saw in Section 3.1.4 that
the tangent space at a point $p = (x_0, y_0, z_0) \in S$ is given by $\mathrm{span}\{\tau_1(p), \tau_2(p)\}$ where
$\tau_1(p) = \left(1, 0, \frac{\partial f}{\partial x}(x_0, y_0)\right)$ and $\tau_2(p) = \left(0, 1, \frac{\partial f}{\partial y}(x_0, y_0)\right)$.
We now define $(df)_{(x_0,y_0)}$ on vectors $v \in T_p S$. We need $(df) : T_p S \to \mathbb{R}$. Let $v \in T_p S$,
then
$v = \left(\alpha, \beta, \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)\right)$.
Now,
$(dx)_p(v) = \alpha$, $(dy)_p(v) = \beta$
and
$(dz)_p(v) = \alpha\frac{\partial f}{\partial x}(x_0, y_0) + \beta\frac{\partial f}{\partial y}(x_0, y_0)$.
But this means
$(dz)_p(v) = \frac{\partial f}{\partial x}(x_0, y_0)(dx)_p(v) + \frac{\partial f}{\partial y}(x_0, y_0)(dy)_p(v)$.
Equivalently, recalling the gradient in Cartesian coordinates, we have
$(dz)_p(v) = \nabla f(x_0, y_0) \cdot ((dx)_p(v), (dy)_p(v))$. (3.2)
But (3.2) has exactly the form of our previous definitions of differential: a derivative
in scalar product with a vector. Because z = f(x, y) and the values of dx and dy
do not depend on the point p explicitly, we are led to the following definition.
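Definition 3.2.7. The differential of f at $(x_0, y_0)$ is defined by
$(df)_{(x_0,y_0)} := \frac{\partial f}{\partial x}(x_0, y_0)\,dx + \frac{\partial f}{\partial y}(x_0, y_0)\,dy$. (3.3)
In particular, $(df)_{(x_0,y_0)}$ can be evaluated at any vector $v \in T_{(x_0,y_0)}\mathbb{R}^2$.
The formula is written more simply
$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy$.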
Geometrically, we see that df(v) gives the variation of the function f in the direction
of the vector v. For this reason, it is also called the directional derivative of f. In
particular, because we can express
$df(v) = \nabla f(x, y) \cdot (dx(v), dy(v))$
by choosing a unit vector v, we see that the directional derivative of f is maximal if
v is parallel to the vector $\nabla f(x, y)$ and pointing in the same direction. If v is chosen
perpendicular to $\nabla f(x, y)$, then df(v) = 0 and this corresponds to a direction where
f does not vary. This leads us to introduce the concept of level set curves of a
function f(x, y), defined as
$f^{-1}(c) := \{(x, y) \mid f(x, y) = c\}$
where $c \in \mathbb{R}$. Note that for a fixed $c \in \mathbb{R}$, typically, f(x, y) = c determines a curve.
The function f is constant along $f^{-1}(c)$ and so the gradient vector is perpendicular to
level set curves from the calculation above.
A similar construction of the differential can be done for differentiable functions
of more than two variables and leads to the formula:
$df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i$.
However, the proof of this case must wait for the general definition of tangent spaces
for n-dimensional surfaces in Chapter 5, Section 5.4. We now look at two examples.
Example 3.2.8. Consider the surface S given by $z = f(x, y) = 3 - 3x + 2xy^2$. Let
$p = (1, -2)$; we compute $(df)_p$ evaluated on the vector $v = (3, 4) \in T_p\mathbb{R}^2$:
$(df)_p(v) = \frac{\partial f}{\partial x}(1, -2)\,dx(v) + \frac{\partial f}{\partial y}(1, -2)\,dy(v) = 5(3) + (-8)(4) = -17$.
We now determine the direction of maximal increase of f. We do it by first determining
the direction of no increase of f, which is obtained using
$0 = \nabla f(1, -2) \cdot (v_1, v_2) = (5, -8) \cdot (v_1, v_2)$
where $(v_1, v_2)$ is a unit vector. Therefore, $v_2 = 5v_1/8$ and $\|(v_1, 5v_1/8)\| = 1$.
Therefore,
$(v_1, v_2) = \pm\left(\frac{8}{\sqrt{89}}, \frac{5}{\sqrt{89}}\right)$.
The unit vectors perpendicular to these are
$\pm\left(\frac{5}{\sqrt{89}}, -\frac{8}{\sqrt{89}}\right)$.
Therefore, the direction of maximal increase is given by
$\left(\frac{5}{\sqrt{89}}, -\frac{8}{\sqrt{89}}\right)$.
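Example 3.2.9. Consider the function $f : \mathbb{R}^4 \to \mathbb{R}$ defined by
$f(x, y, z, w) = x^2 + y^2 + z^2 + w^2$.
Then
$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial w}\,dw = 2x\,dx + 2y\,dy + 2z\,dz + 2w\,dw$.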
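To make the action of df on vectors concrete, the 1-form of Example 3.2.9 can be evaluated at a base point on a tangent vector. This is a minimal sketch; the function names `grad_f` and `df`, the base point `p` and the vector `v` are all hypothetical choices for the illustration.

```python
def grad_f(p):
    """Gradient of f(x, y, z, w) = x**2 + y**2 + z**2 + w**2 at the point p."""
    return tuple(2.0 * c for c in p)

def df(p):
    """Return (df)_p as a linear function on tangent vectors based at p."""
    g = grad_f(p)
    def apply(v):
        # df = sum_i (df/dx_i) dx_i, and dx_i(v) is the i-th component of v
        return sum(gi * vi for gi, vi in zip(g, v))
    return apply

p = (1.0, 2.0, 3.0, 4.0)    # hypothetical base point
v = (1.0, 0.0, -1.0, 2.0)   # hypothetical tangent vector at p
value = df(p)(v)
print(value)
```

The same value is obtained from $2x\,dx(v) + 2y\,dy(v) + 2z\,dz(v) + 2w\,dw(v)$ by reading off the components of v.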

Fig. 3.13. Basis vectors $\partial/\partial r$ and $\partial/\partial\theta$ of $T_p\mathbb{R}^2$ and $T_q\mathbb{R}^2$ where p = (1, 1) and q = ( 3/2, 1/2). The dashed circles emphasize the tangency of $\partial/\partial\theta$ with the circles through p
and q and the dashed rays the radial directions of $\partial/\partial r$.

3.2.4 Differentials of curvilinear coordinate systems
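Differentials can be defined in any coordinate system. Let $(r, \theta)$ be polar coordinates
on the plane and let $u_0 = (r_0, \theta_0) \in \mathbb{R}^2$. From Section 3.1, Problem (4), $T_{u_0}\mathbb{R}^2$ is
spanned by
$(1, 0) = \frac{d}{dt}(r_0 + t, \theta_0)\big|_{t=0}$ and $(0, 1) = \frac{d}{dt}(r_0, \theta_0 + t)\big|_{t=0}$.
Let $r_1(t) = ((r_0 + t)\cos\theta_0, (r_0 + t)\sin\theta_0)$ and $r_2(t) = (r_0\cos(\theta_0 + t), r_0\sin(\theta_0 + t))$.
Consider the vectors
$r_1'(0) = (\cos\theta_0, \sin\theta_0)$ and $r_2'(0) = (-r_0\sin\theta_0, r_0\cos\theta_0)$
based at p. See Figure 3.13. We define the following vectors at $p = (x, y) = (r\cos\theta, r\sin\theta)$:
$\left.\frac{\partial}{\partial r}\right|_p := (\cos\theta, \sin\theta)$ and $\left.\frac{\partial}{\partial\theta}\right|_p := (-r\sin\theta, r\cos\theta)$, (3.4)
which form an orthogonal basis of the tangent space $T_p\mathbb{R}^2$. Note however that it
is not an orthonormal basis since $\|\partial/\partial\theta\| = r$. Thus, we see that the standard
orthogonal basis of $T_{(r,\theta)}\mathbb{R}^2$ is sent via the derivative of the vector functions $r_1$ and
$r_2$ to the basis $\{\partial/\partial r, \partial/\partial\theta\}$ at $T_p\mathbb{R}^2$. Therefore, a vector $v \in T_p\mathbb{R}^2$ can be written
$v = v_r\frac{\partial}{\partial r} + v_\theta\frac{\partial}{\partial\theta}$,
and in coordinates we write $v = (v_r, v_\theta)$. We conclude this discussion by defining the
scalar product for vectors in $T_p\mathbb{R}^2$ written in the basis $(\partial/\partial r, \partial/\partial\theta)$. We know that
the scalar product of v and w should be the product of the length of the projection
of v onto w times the length of w. This means
$\frac{\partial}{\partial r} \cdot \frac{\partial}{\partial r} = \left\|\frac{\partial}{\partial r}\right\|^2 = 1$ and $\frac{\partial}{\partial\theta} \cdot \frac{\partial}{\partial\theta} = \left\|\frac{\partial}{\partial\theta}\right\|^2 = r^2$.
Hence, letting
$v = a\frac{\partial}{\partial r} + b\frac{\partial}{\partial\theta}$ and $w = c\frac{\partial}{\partial r} + d\frac{\partial}{\partial\theta}$,
then $v \cdot w := ac + bd\,r^2$ or we can also write
$v \cdot w = (a, b)\begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}(c, d)^T$. (3.5)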
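Formula (3.5) can be compared numerically with the ordinary Cartesian dot product. The following is a minimal sketch, assuming the basis vectors $(\cos\theta, \sin\theta)$ and $(-r\sin\theta, r\cos\theta)$ of (3.4); the function names and the sample values of r, $\theta$, v and w are hypothetical.

```python
import math

def polar_basis(r, theta):
    """Basis vectors d/dr and d/dtheta at p = (r cos(theta), r sin(theta))."""
    d_r = (math.cos(theta), math.sin(theta))
    d_theta = (-r * math.sin(theta), r * math.cos(theta))
    return d_r, d_theta

def to_cartesian(coeffs, r, theta):
    """Convert v = a d/dr + b d/dtheta into Cartesian components (vx, vy)."""
    a, b = coeffs
    d_r, d_theta = polar_basis(r, theta)
    return (a * d_r[0] + b * d_theta[0], a * d_r[1] + b * d_theta[1])

def polar_dot(v, w, r):
    """Scalar product in the polar basis: ac + bd r^2."""
    return v[0] * w[0] + v[1] * w[1] * r**2

r, theta = 2.0, 0.7                 # hypothetical base point in polar coordinates
v, w = (1.0, -0.5), (0.3, 2.0)      # hypothetical coefficients (a, b) and (c, d)
vc, wc = to_cartesian(v, r, theta), to_cartesian(w, r, theta)
cartesian = vc[0] * wc[0] + vc[1] * wc[1]
polar = polar_dot(v, w, r)
print(cartesian, polar)
```

The two numbers agree up to rounding, since the cross terms vanish by orthogonality of the basis.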

Fig. 3.14. Basis vectors $\left.\frac{\partial}{\partial r}\right|_p$ and $\left.\frac{\partial}{\partial\theta}\right|_p$ at p = (1, 1) with tangent vector $v = (v_r, v_\theta) = (1, 1)$.
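The relationship between differentials in different coordinate systems is straightforward to obtain. We obtain dr, $d\theta$ as functions of dx and dy using $r = f(x, y) = \sqrt{x^2 + y^2}$, $\theta = g(x, y) = \arctan(y/x)$. We know
$dr = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = \frac{x}{\sqrt{x^2 + y^2}}\,dx + \frac{y}{\sqrt{x^2 + y^2}}\,dy$ (3.6)
and
$d\theta = \frac{\partial g}{\partial x}\,dx + \frac{\partial g}{\partial y}\,dy = -\frac{y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy$. (3.7)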
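Proposition 3.2.10. Let $v = (v_r, v_\theta) \in T_p\mathbb{R}^2$. The differentials dr and $d\theta$ at p evaluated at v are given by
$(dr)_p(v) = v_r$ and $(d\theta)_p(v) = v_\theta$.
Proof. We use (3.6), (3.7) and the definitions (3.4) to compute
$(dr)_p\left(\frac{\partial}{\partial r}\right) = \frac{x}{r}\,dx\left(\frac{\partial}{\partial r}\right) + \frac{y}{r}\,dy\left(\frac{\partial}{\partial r}\right) = \frac{r\cos\theta}{r}\cos\theta + \frac{r\sin\theta}{r}\sin\theta = 1$
and
$(d\theta)_p\left(\frac{\partial}{\partial\theta}\right) = -\frac{y}{r^2}\,dx\left(\frac{\partial}{\partial\theta}\right) + \frac{x}{r^2}\,dy\left(\frac{\partial}{\partial\theta}\right) = -\frac{r\sin\theta}{r^2}(-r\sin\theta) + \frac{r\cos\theta}{r^2}\,r\cos\theta = 1$.
From the above, it is straightforward to check that $(dr)_p(\partial/\partial\theta) = (d\theta)_p(\partial/\partial r) = 0$.
Therefore, by linearity of the differentials, the proposition is verified.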
Using these formulae, we can evaluate dr and $d\theta$ on vectors $v \in T_p\mathbb{R}^2$ in Cartesian
coordinate systems. Consider for instance, $v = (v_x, v_y) = (-1, 1)$ at p = (1, 1) as
seen in Figure 3.15; the vector v is tangent to the circle of radius $\sqrt{2}$ and so $(dr)_p$
should be zero and $(d\theta)_p$ nonzero on this vector. We verify with the computation:
$(dr)_p(v) = \frac{1}{\sqrt{2}}\,dx(-1, 1) + \frac{1}{\sqrt{2}}\,dy(-1, 1) = 0$,
$(d\theta)_p(v) = -\frac{1}{2}\,dx(-1, 1) + \frac{1}{2}\,dy(-1, 1) = 1$.
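Formulae (3.6) and (3.7) translate directly into code. The sketch below assumes the tangent vector $v = (v_x, v_y) = (-1, 1)$ at p = (1, 1); the function names `dr` and `dtheta` are chosen here for the illustration.

```python
import math

def dr(p, v):
    """Evaluate (dr)_p on v = (vx, vy): (x dx(v) + y dy(v)) / sqrt(x^2 + y^2)."""
    x, y = p
    return (x * v[0] + y * v[1]) / math.hypot(x, y)

def dtheta(p, v):
    """Evaluate (dtheta)_p on v = (vx, vy): (-y dx(v) + x dy(v)) / (x^2 + y^2)."""
    x, y = p
    return (-y * v[0] + x * v[1]) / (x * x + y * y)

p = (1.0, 1.0)
v = (-1.0, 1.0)   # tangent to the circle through p centred at the origin
print(dr(p, v), dtheta(p, v))
```

Both functions require p to differ from the origin, where polar coordinates are singular.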
Exercises
(1) Using the differential formula (3.3), compute the differential of the functions
listed below.
(a) f (x, y) = x/y. Find the direction of maximal increase of f at (1, 1).
(b) f (x, y) = x2 y y 2 x. Find the direction of no increase of f at (2, 1).
(c) f (x, y, z) = 3 cos(xyz)

Fig. 3.15. Basis vectors $\left.\frac{\partial}{\partial r}\right|_p$ and $\left.\frac{\partial}{\partial\theta}\right|_p$ at p = (1, 1) with tangent vector $v = (v_x, v_y) = (-1, 1)$.
(d) f (x1 , . . . , xn ) = exp((x1 a1 ) (xn an )).

(2) Show the following statements
(a) $d(f(t)g(t)) = (f'(t)g(t) + f(t)g'(t))\,dt$
(b) If $g(t) \neq 0$, $d(f(t)/g(t)) = (f'(t)g(t) - f(t)g'(t))\,g(t)^{-2}\,dt$.
(c) $d((f \circ g)(t)) = f'(g(t))g'(t)\,dt$.
(3) Consider a curve C given by r(t) = (x(t), y(t)) where $t \in [\alpha, \beta]$. Let $\alpha = t_0 < t_1 < \ldots < t_{n-1} < t_n = \beta$ and consider the vector $v_j = t_{j+1} - t_j$ based at $t_j \in \mathbb{R}$
for $j = 0, \ldots, n - 1$ (i.e. $v_0 = t_1 - t_0$ is a vector based at $t_0$, $v_1 = t_2 - t_1$ is a
vector based at $t_1$, etc). We denote $p_j = r(t_j)$.
(a) Compute $dt(v_j)$.
(b) Explain why the vector $W_j := r'(t_j)\,dt(v_j)$ is in the tangent space $T_{p_j} C$.
(c) Compute $dx(W_j)$, $dy(W_j)$.
(4) Generalize the problem above to a curve r(t) = (x1 (t), . . . , xn (t)) with t [a, b]
by doing part (c) in this context; that is, compute dxi (Wj ) for i = 1, . . . , n.
(5) Evaluate dr and $d\theta$ at the points $p \in \mathbb{R}^2$ and vectors $v \in T_p\mathbb{R}^2$ given in polar and
Cartesian coordinates using the direct definition of dr and $d\theta$ and the formulae (3.6) and (3.7). Check that the answers agree. Illustrate the vectors at the
point p.
(a) p = (2, 0); $v = (v_r, v_\theta) = (1/2, 1/2)$ and $v = (v_x, v_y) = (1/2, 1)$.
(b) $p = (1/2, \sqrt{3}/2)$; $v = (v_r, v_\theta) = (1, 1)$ and
$v = (v_x, v_y) = \left(\frac{1 - \sqrt{3}}{2}, \frac{1 + \sqrt{3}}{2}\right)$.
(6) Compute dx and dy in terms of dr and $d\theta$ using $x = r\cos\theta$ and $y = r\sin\theta$.
Check that your result agrees with solving for dx and dy from the equations (3.6)
and (3.7).
(7) Consider the spherical coordinate system $(\rho, \theta, \phi)$ and compute $d\rho$, $d\theta$ and $d\phi$
as functions of dx, dy and dz.
(8) Compute dx, dy and dz as functions of $d\rho$, $d\theta$ and $d\phi$. You can do it directly
using the formulae relating (x, y, z) with $(\rho, \theta, \phi)$ or solve for dx, dy and dz from
the previous problem.
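(9) Let $p = (r\cos\theta, r\sin\theta, z)$. By defining curves
$r_1(t) = ((r + t)\cos\theta, (r + t)\sin\theta, z)^T$,
$r_2(t) = (r\cos(\theta + t), r\sin(\theta + t), z)^T$,
$r_3(t) = (r\cos\theta, r\sin\theta, z + t)^T$
show that the vectors
$\left.\frac{\partial}{\partial r}\right|_p := (\cos\theta, \sin\theta, 0)$, $\left.\frac{\partial}{\partial\theta}\right|_p := (-r\sin\theta, r\cos\theta, 0)$, $\left.\frac{\partial}{\partial z}\right|_p := (0, 0, 1)$
form an orthogonal basis of $T_p\mathbb{R}^3$.
(10) Let $p = (\rho\cos\theta\sin\phi, \rho\sin\theta\sin\phi, \rho\cos\phi)$. By defining curves
$r_1(t) = ((\rho + t)\cos\theta\sin\phi, (\rho + t)\sin\theta\sin\phi, (\rho + t)\cos\phi)^T$,
$r_2(t) = (\rho\cos\theta\sin(\phi + t), \rho\sin\theta\sin(\phi + t), \rho\cos(\phi + t))^T$,
$r_3(t) = (\rho\cos(\theta + t)\sin\phi, \rho\sin(\theta + t)\sin\phi, \rho\cos\phi)^T$
show that the vectors
$\left.\frac{\partial}{\partial\rho}\right|_p := (\cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi)$,
$\left.\frac{\partial}{\partial\phi}\right|_p := (\rho\cos\theta\cos\phi, \rho\sin\theta\cos\phi, -\rho\sin\phi)$,
$\left.\frac{\partial}{\partial\theta}\right|_p := (-\rho\sin\theta\sin\phi, \rho\cos\theta\sin\phi, 0)$
form an orthogonal basis of $T_p\mathbb{R}^3$ with
$\|\partial/\partial\rho\| = 1$, $\|\partial/\partial\phi\| = \rho$, $\|\partial/\partial\theta\| = \rho\sin\phi$.
(11) Show that the scalar products in the basis of the previous two problems are
(a) cylindrical coordinates:
$(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1 w_1 + r^2 v_2 w_2 + v_3 w_3$.
(b) spherical coordinates:
$(v_1, v_2, v_3) \cdot (w_1, w_2, w_3) = v_1 w_1 + \rho^2 v_2 w_2 + \rho^2\sin^2\phi\, v_3 w_3$.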
3.3 1-forms
In this section, we introduce a class of objects called 1-forms, which include the
differentials. We begin with a reminder from elementary calculus. The Fundamental
Theorem of Calculus states that for any continuous function $f : [a, b] \to \mathbb{R}$, one can
find a differentiable function F(x) (the antiderivative) such that $F'(x) = f(x)$, or in
the language of differentials
$dF = F'(x)\,dx = f(x)\,dx$.
Consider now the problem in two dimensions. Let f(x, y) and g(x, y) be continuous
functions. Is it always possible to find F(x, y) such that
$dF = f(x, y)\,dx + g(x, y)\,dy$
which means
$\frac{\partial F}{\partial x} = f(x, y)$ and $\frac{\partial F}{\partial y} = g(x, y)$? (3.8)
The answer is negative and the cases for which it is possible are studied in more
detail in a forthcoming section. Here is a case where (3.8) cannot be satisfied. Let
$f(x, y) = y$ and $g(x, y) = x^2$.
Indeed, by integrating with respect to x,
$\frac{\partial F}{\partial x} = y \implies F(x, y) = xy + G(y)$
where G(y) is some arbitrary function of y. But then,
$\frac{\partial F}{\partial y} = x + G'(y) = x^2$
cannot be satisfied, since $G'(y)$ would have to equal $x^2 - x$, which depends on x.
Note however that the expression
$y\,dx + x^2\,dy$
on its own has a well-defined mathematical meaning even though it is not the differential of any function. In fact, expressions such as
$f(x, y)\,dx + g(x, y)\,dy$ (3.9)
have many uses in physics as we see below. All differentials and expressions such
as (3.9) are examples of mathematical objects called 1-forms.
Definition 3.3.1. Let U be an open subset of $\mathbb{R}^n$ and consider functions $a_j(x) := a_j(x_1, \ldots, x_n)$ for all $j = 1, \ldots, n$.
(1) If $a_j$ is a continuous function for $j = 1, \ldots, n$, then
$\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$
is a continuous 1-form on U.
(2) If $a_j$ is a differentiable function for $j = 1, \ldots, n$, then
$\omega(x) = a_1(x)\,dx_1 + \ldots + a_n(x)\,dx_n$
is a differentiable 1-form on U.
1-forms are defined using the differentials $dx_1, \ldots, dx_n$; therefore, they act on vectors
in the tangent space of points $p \in U \subset \mathbb{R}^n$. That is, for each $p \in U$,
$\omega(p) : T_p\mathbb{R}^n \to \mathbb{R}$.
Example 3.3.2. Consider the 1-form
$\omega(x, y) = y\,dx + x^2\,dy$,
$p = (3, 2) \in \mathbb{R}^2$ and $v = (1, 5) \in T_p\mathbb{R}^2$, then
$\omega(3, 2)\langle 1, 5\rangle = (2)\,dx(1, 5) + (3)^2\,dy(1, 5) = 2(1) + 9(5) = 47$.
We use the notation $\langle\,\cdot\,\rangle$ for the vectors on which $\omega$ is applied. This should alleviate
possible confusion when writing down such expressions.
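The two-step evaluation of a 1-form, first at a point and then on a vector, can be sketched with nested functions. This is a minimal illustration for $\omega = y\,dx + x^2\,dy$; the constructor name `one_form` and its interface are hypothetical.

```python
def one_form(coefficients):
    """Build a 1-form from its coefficient functions a_1, ..., a_n.

    The result is used as omega(p)(v): first choose the base point p,
    then apply the resulting linear function to a tangent vector v at p.
    """
    def omega(p):
        values = [a(*p) for a in coefficients]   # a_i evaluated at p
        def on_vector(v):
            # dx_i(v) is the i-th component of v
            return sum(ai * vi for ai, vi in zip(values, v))
        return on_vector
    return omega

omega = one_form([lambda x, y: y, lambda x, y: x**2])
value = omega((3, 2))((1, 5))
print(value)
```

Separating the point from the vector mirrors the notation $\omega(p)\langle v\rangle$ and makes the linearity in v explicit.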
1-forms can also be defined using curvilinear coordinate systems.
Example 3.3.3. Consider the 1-form $\omega = r^2\,dr + r\theta\,d\theta$. We evaluate $\omega$ at $p = (r_0, \theta_0) = (2, \pi/3)$ on the vector $v = (v_r, v_\theta) = (0.2, 1.3) \in T_p\mathbb{R}^2$.
$\omega(2, \pi/3)\langle 0.2, 1.3\rangle = (2)^2\,dr(0.2, 1.3) + (2)(\pi/3)\,d\theta(0.2, 1.3) = 4(0.2) + (2\pi/3)(1.3)$.
If the tangent vector is in Cartesian coordinates $v = (v_x, v_y)$, then one needs to use
the formulae (3.6) and (3.7) to evaluate dr and $d\theta$.
We conclude this section by looking at the set of (continuous or differentiable) 1-forms in n dimensions at a point p which we denote
$\Omega^n_p := \left\{\omega : T_p\mathbb{R}^n \to \mathbb{R} \;\middle|\; \omega = \sum_{i=1}^{n} a_i(x_1, x_2, \ldots, x_n)\,dx_i\right\}$.
We have the following result.
Proposition 3.3.4. $\Omega^n_p$ is a vector space (over $\mathbb{R}$).
Proof. One needs to check that the sum of two 1-forms $\omega_1, \omega_2 \in \Omega^n$ is also in $\Omega^n$
and that for any $a \in \mathbb{R}$ and $\omega \in \Omega^n$, $a\omega \in \Omega^n$. We leave the details to the reader.
Let $U \subset \mathbb{R}^n$ be an open set; we denote by $\Omega^n(U)$ the vector space of 1-forms defined
on U.
3.3.1 1-Forms in Physics
As noted above, 1-forms are useful in describing quantities in physics, in particular
for computing work. However, many more examples exist. For now, we focus on
work.
In elementary physics, the work (W) done by a force is described as the product
of a force F and the distance d travelled by a mass moved in the direction of F:
W = F d, and has units of Newton-meters (N m). In fact, a more general and practical
way of expressing work is using 1-forms. The force may depend on its location and
so
$F = F(p) = (F_1(p), \ldots, F_n(p))$
is a vector field where $p = (x_1, \ldots, x_n) \in \mathbb{R}^n$. One defines
$dW(p) = \sum_{i=1}^{n} F_i(p)\,dx_i$. (3.10)
At a point p, one can approximate a small distance in the direction of motion given
by r(t) by taking a vector $v = s\,r'(t) \in T_p\mathbb{R}^n$ for s small and so $dx_i(v) = x_i'(t)s = x_i'(t)\,dt(s)$. Therefore, for $v \in T_p\mathbb{R}^n$, $F_i(p)\,dx_i(p)\langle v\rangle$ is the product of the force F
in the ith direction times the projection of the distance vector v also in the ith
direction. Hence, work at p is the linear superposition of the work done in every
Cartesian coordinate direction.
Example 3.3.5. Let $F(x, y) = (x, -3)$ be the force of the wind and suppose that a cyclist travels along the path from $P = (3, 2)$ to $Q = (3, -1)$ and then from $Q$ to $R = (-2, -3)$. We compute the work done at each point $p$ by the wind on the cyclist. Suppose that the path $PQ$ is parametrized by $r_1(t) = (3, 2 - t)$ with $t \in [0, 3]$ and that the velocity of the cyclist along $PQ$ is given by tangent vectors of the form $v = r_1'(t) = (0, -1)$. Then
\[ dW(3, 2 - t)\langle v\rangle = x\,dx(v) + (-3)\,dy(v) = (3)(0) - 3(-1) = 3. \]
On the path $QR$, a parametrization is given by $r_2(t) = (3 - 5t, -1 - 2t)$ with $t \in [0, 1]$, and suppose also that the velocity of the cyclist is given by tangent vectors of the form $w = r_2'(t) = (-5, -2)$. Then
\[ dW(3 - 5t, -1 - 2t)\langle w\rangle = x\,dx(w) + (-3)\,dy(w) = (3 - 5t)(-5) + (-3)(-2) = -9 + 25t. \]
Fig. 3.16. Trajectory taken by the cyclist: from $P = (3, 2)$ to $Q = (3, -1)$ and from $Q$ to $R = (-2, -3)$.
In the following section, we show that the total work over a path can be computed by integrating $dW$.
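The pointwise evaluation in Example 3.3.5 can be reproduced with a few lines of Python. This is only a sketch: it assumes the wind force is $F(x, y) = (x, -3)$ and the parametrizations $r_1(t) = (3, 2 - t)$, $r_2(t) = (3 - 5t, -1 - 2t)$, which is what the worked values $3$ and $-9 + 25t$ indicate; the function name `work_form` is hypothetical.

```python
def work_form(p, v):
    """Evaluate dW = x dx + (-3) dy at the point p on the tangent vector v."""
    x, y = p
    force = (x, -3.0)                      # assumed wind force F(x, y) = (x, -3)
    # dx(v) and dy(v) simply read off the components of v.
    return force[0] * v[0] + force[1] * v[1]

def r1(t):
    return (3.0, 2.0 - t)                  # path PQ, t in [0, 3]

def r2(t):
    return (3.0 - 5.0 * t, -1.0 - 2.0 * t) # path QR, t in [0, 1]

v = (0.0, -1.0)    # r1'(t)
w = (-5.0, -2.0)   # r2'(t)

on_pq = [work_form(r1(t), v) for t in (0.0, 1.5, 3.0)]
on_qr = [work_form(r2(t), w) for t in (0.0, 0.5, 1.0)]
print(on_pq)  # constant value 3 along PQ
print(on_qr)  # values of -9 + 25 t along QR
```

Along $PQ$ the value does not depend on $t$, while along $QR$ it changes sign: the wind first works against the cyclist and then helps.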
Exercises
(1) Evaluate the 1-forms at the point p and on the vector v given.
(a) = 2x dx + (3xy y 2 ) dy, p = (2, 1), v = (1, 1).
(b) = cos(x + y) dx + 3xyz dy + (x2 + z 2 ) dz, p = (0, 0, 1), v = (3, 2, 4).
(c) = x1 x2 dx1 + x1 x3 dx2 + x2 x3 dx3 + x3 x4 dx4 , p = (3, 2, 1, 1), v =
(2, 1, 1, 3).
(d) = ( + r) dr + (r2 ) d, p = (r, ) = (1, ) and v = (vr , v ) = (1, 2).
(e) = 2r2 dr + (r) d, p = (x, y) = (1, 1) and v = (vx , vy ) = (2, 3).
(2) Consider the 1-form $\omega = 2xy\,dx + x^2\,dy$. Find a function $F(x, y)$ such that $dF = \omega$. Is this function $F$ the unique function for which $dF = \omega$?
(3) Prove that n is a vector space (Proposition 3.3.4).
(4) Compute the work 1-form dW done by the force F (x, y) = (3x, 2xy) at each
point of the path r(t) = (t2 , 1 t) with t [0, 1].
(5) Compute the work 1-form dW done by the gravitational force

mM G x
F (x) =
r2 ||x||
on an object of mass m falling on the earth with the trajectory r(t) = ((1
t) cos(t), (1 t) sin(t), 1 t) with t [0, 1].
4 Line Integrals
We introduce the concept of line integrals starting with the integration of 1-forms.
This leads to the first of the important theorems of Vector Calculus: the Fundamental
Theorem of Line Integrals which is a generalization of the Fundamental Theorem of
Calculus seen in elementary calculus courses. We then extend these results to the
context of vector fields.
4.1 Integration of 1-forms
When introducing differentials, we mentioned that expressions such as $dt$, $dx$, $dy$ need to satisfy conditions (1), (2), (3) given at the beginning of Section 3.2. The first two are satisfied by the definition given above, which we even generalized to higher dimensions. We now look at condition (3), which has to do with integration. However, as the previous section shows, the concept of 1-forms is more general (at least in dimensions greater than 1), and so we define what it means to integrate 1-forms.
4.1.1 Revisiting integration in one-dimension
We can now properly define integration, not of functions, but of 1-forms over space
curves in R. This is done using Riemann sums as in elementary calculus.
Fig. 4.1. Curve $C$ is the interval $[a, b]$ with partition $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$ and vectors $v_j$, $j = 0, \ldots, n - 1$.
Let $C$ be the positively oriented curve given by the interval $[a, b] \subset \mathbb{R}$, with parametrization $r(t) = t$, $t \in [a, b]$. Let $\omega = f(t)\,dt$ be a continuous 1-form defined over $C$.
We begin by defining a partition of $[a, b]$:
\[ a = t_0 < t_1 < \cdots < t_n = b. \]
At each point $t_j$ of the partition, define vectors in the tangent space
\[ v_j := t_{j+1} - t_j \in T_{t_j}\mathbb{R}, \quad j = 0, 1, \ldots, n - 1, \]
and write $v_j = \Delta t_j\, e_1$ where $\Delta t_j = t_{j+1} - t_j$. Now, evaluate $\omega$ at each $t_j$ on $v_j$:
\[ \omega(t_j)\langle v_j\rangle = f(t_j)\,dt(v_j), \]
and take the sum over all $j = 0, \ldots, n - 1$:
\[ \sum_{j=0}^{n-1} \omega(t_j)\langle v_j\rangle = \sum_{j=0}^{n-1} f(t_j)\,dt(v_j). \tag{4.1} \]
Since $dt(v_j) = \Delta t_j$, (4.1) is just a Riemann sum of the kind used to define the integral in elementary calculus. Adding points to the partition so that every subinterval $[t_{j-1}, t_j]$ is always subdivided, we define
\[ \int_C \omega := \lim_{n\to\infty} \sum_{j=0}^{n-1} \omega(t_j)\langle v_j\rangle = \lim_{n\to\infty} \sum_{j=0}^{n-1} f(t_j)\,dt(v_j) = \int_a^b f(t)\,dt. \]
With the formulation given by (4.1), the significance of the $dt$ in the integral is justified, because this is how the base of the rectangles in the Riemann sum is computed. As the limit is taken, the $v_j$'s vanish, but the $dt$ remains. Thus, the integration of 1-forms is well-defined and blends nicely with the integration of functions known from elementary calculus. In particular, the properties of integration of 1-forms are identical.
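The sum (4.1) can be evaluated directly on a computer. The following Python sketch uses a uniform partition and an illustrative 1-form $\omega = \cos(t)\,dt$ on $[0, \pi/2]$; both the 1-form and the function name `riemann_sum_1form` are choices made here, not taken from the text.

```python
import math

def riemann_sum_1form(f, a, b, n):
    """Sum of omega(t_j)<v_j> = f(t_j) * dt(v_j) over a uniform partition of [a, b]."""
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    total = 0.0
    for j in range(n):
        v_j = ts[j + 1] - ts[j]   # tangent vector v_j = (t_{j+1} - t_j) e_1
        dt_of_v = v_j             # dt reads off the single component of v_j
        total += f(ts[j]) * dt_of_v
    return total

# Illustrative choice (not from the text): omega = cos(t) dt on C = [0, pi/2].
approx = [riemann_sum_1form(math.cos, 0.0, math.pi / 2, n) for n in (10, 100, 1000)]
exact = 1.0                       # integral of cos over [0, pi/2]
errors = [abs(s - exact) for s in approx]
print(errors)                     # the error shrinks as the partition is refined
```

Refining the partition plays the role of the limit $n \to \infty$ in the definition.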
Proposition 4.1.1. The integral of a 1-form $\omega = f(t)\,dt$ satisfies the properties below.
(a) If $C = [a, b]$, $C_1 = [a, c]$ and $C_2 = [c, b]$, then
\[ \int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega. \]
(b) Let $-C$ be the curve $C$ travelled in the opposite orientation. Then,
\[ \int_{-C} \omega = -\int_C \omega. \]
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$, are 1-forms, then
\[ \int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2. \]
Proof. These follow from the same properties for the Riemann integral.
The next example not only shows how to use the substitution rule in the context of 1-forms, but also introduces a new operation called the pullback, which is how changes of variables are applied to 1-forms (and to 2-forms, 3-forms, etc., which are defined in subsequent chapters).
Example 4.1.2 (Substitution rule). Consider the 1-form
\[ \omega(t) = 2t\cos(t^2)\,dt \]
over the positively oriented curve $C = [0, \pi]$. We know from elementary calculus that the integral of $\omega$ can be computed using the substitution rule as follows. Let $s = t^2$; then $ds = 2t\,dt$ and the $s$-variable takes values in $[0, \pi^2]$, so
\[ \int_C \omega = \int_0^{\pi} 2t\cos(t^2)\,dt = \int_0^{\pi^2} \cos s\,ds. \]
Thus, we see that the substitution rule gives rise to a new 1-form
\[ \tilde\omega(s) = \cos(s)\,ds \]
defined on the curve $C' = [0, \pi^2]$ and such that
\[ \int_C \omega(t) = \int_{C'} \tilde\omega(s). \]
Note that we can obtain $\tilde\omega(s)$ by using $t = \sqrt{s}$ and $dt = \frac{1}{2}s^{-1/2}\,ds$:
\[ \omega = 2t\cos(t^2)\,dt = 2\sqrt{s}\,\cos s \cdot \tfrac{1}{2}s^{-1/2}\,ds = \cos s\,ds = \tilde\omega. \]
Changes of variable are more often expressed by writing the old variable $t$ in terms of a new variable $s$, so we favour this last formulation. This process of starting with a 1-form $\omega$ and using a change of variables to obtain a new 1-form $\tilde\omega$ is what the pullback is about; the exact definition follows.
Definition 4.1.3. Let $\omega(t) = a(t)\,dt$ be a 1-form and $t = g(s)$ where $g : \mathbb{R} \to \mathbb{R}$ is a differentiable function. The pullback of $\omega$ by $g$ is also a 1-form, denoted by $g^*\omega$, defined by
\[ (g^*\omega)(s) := a(g(s))\,g'(s)\,ds. \]
In Figure 4.2, we see how the reparametrization of the domain between the $s$- and $t$-variables is used to take the 1-form $\omega(t)$ and pull back a 1-form in the $s$-domain.
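Example 4.1.4. We compute the pullback of
\[ \omega(t) = \frac{t}{\sqrt{1 + t^2}}\,dt \]
by $t = g(s) = \sqrt{s - 1}$. In $\omega(t)$, we have $a(t) = t/\sqrt{1 + t^2}$, so applying the formula for the pullback we obtain
\[ g^*\omega(s) = a(g(s))\,g'(s)\,ds = \frac{\sqrt{s - 1}}{\sqrt{1 + (\sqrt{s - 1})^2}} \cdot \frac{1}{2\sqrt{s - 1}}\,ds = \frac{1}{2\sqrt{s}}\,ds. \]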
Fig. 4.2. Diagram illustrating the relationship between reparametrizations and pullbacks in the case of Example 4.1.2: the reparametrization $t = \sqrt{s}$ maps $s \in [0, \pi^2]$ to $t \in [0, \pi]$, and the pullback takes $\omega(t) = 2t\cos(t^2)\,dt$ to $\tilde\omega(s) = \cos(s)\,ds$ using $dt = \frac{1}{2\sqrt{s}}\,ds$.
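The pullback formula $a(g(s))\,g'(s)\,ds$ is easy to check numerically. The sketch below assumes the data of Example 4.1.2 read as $C = [0, \pi]$ and $t = g(s) = \sqrt{s}$; the helper names `pullback` and `midpoint_integral` are hypothetical, and the midpoint rule is just one convenient quadrature.

```python
import math

def pullback(a, g, dg):
    """Coefficient of g*omega = a(g(s)) g'(s) ds for omega = a(t) dt."""
    return lambda s: a(g(s)) * dg(s)

def midpoint_integral(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

# Example 4.1.2 (as read here): omega = 2t cos(t^2) dt and t = g(s) = sqrt(s).
a = lambda t: 2.0 * t * math.cos(t * t)
g = math.sqrt
dg = lambda s: 0.5 / math.sqrt(s)
pulled = pullback(a, g, dg)

# The pulled-back coefficient should coincide with cos(s).
coeff_gap = max(abs(pulled(s) - math.cos(s)) for s in (0.5, 1.0, 4.0, 9.0))

lhs = midpoint_integral(a, 0.0, math.pi)               # integral over C = [0, pi]
rhs = midpoint_integral(math.cos, 0.0, math.pi ** 2)   # integral over C' = [0, pi^2]
print(coeff_gap, lhs, rhs, math.sin(math.pi ** 2))
```

Both integrals agree with $\sin(\pi^2)$, the value obtained from the antiderivative of $\cos s$.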
4.1.2 Integration of 1-forms
We now show how to integrate the 1-form basis elements over any curve $C$ in $\mathbb{R}^n$. We use the case $n = 3$ to illustrate the general case. Let $r(t) = (x(t), y(t), z(t))$ with $t \in [a, b]$ define a curve $C$ and consider a point $p = r(t_0)$ on $C$. If $v \in T_pC$, we can write, for some $s \in \mathbb{R}$,
\[ v = (s\,x'(t_0), s\,y'(t_0), s\,z'(t_0)). \]
Then, setting $s = dt(s)$ we have
\[ (dx)_p(v) = x'(t_0)\,dt(s), \quad (dy)_p(v) = y'(t_0)\,dt(s), \quad (dz)_p(v) = z'(t_0)\,dt(s). \tag{4.2} \]
Consider the case of $dx$ and note that for some small $s > 0$
\[ \begin{aligned} x(t_0 + s) - x(t_0) &= \frac{x(t_0 + s) - x(t_0)}{s}\,s \\ &= \frac{x(t_0 + s) - x(t_0)}{s}\,dt(s) \\ &\approx x'(t_0)\,dt(s) = dx(v). \end{aligned} \tag{4.3} \]
Therefore, $dx(v)$ approximates the small increment in the $x$-direction projected from the tangent vector $v \in T_pC$. We now define
\[ \int_C dx \]
using the following process.
(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)e_1 \in T_{t_j}\mathbb{R}$; then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of $C$ near $r(t_j)$.
(3) We take the sum of $dx$ evaluated at the tangent vectors
\[ r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j), z'(t_j))\,dt(v_j) \]
located at the points $r(t_j)$:
\[ R_n = \sum_{j=0}^{n-1} dx(r'(t_j)\,dt(v_j)) \approx \sum_{j=0}^{n-1} \big(x(t_{j+1}) - x(t_j)\big) = x(b) - x(a), \]
where the approximation is given by (4.3). But
\[ R_n = \sum_{j=0}^{n-1} x'(t_j)\,dt(v_j), \]
where the right-hand side is a Riemann sum in $\mathbb{R}$.
(4) Therefore,
\[ \int_C dx := \lim_{n\to\infty} R_n = \int_a^b x'(t)\,dt, \tag{4.4} \]
where the limit is taken by adding points to the partition so that every interval is subdivided.
Another way of writing (4.4) is using the pullback operation on $dx$:
\[ \int_C dx = \int_a^b r^*dx. \]
The formula in terms of the pullback is the one which we use for general 1-forms. We have the formulae
\[ \begin{aligned} \int_C dx &:= \int_a^b x'(t)\,dt = x(b) - x(a), \\ \int_C dy &:= \int_a^b y'(t)\,dt = y(b) - y(a), \\ \int_C dz &:= \int_a^b z'(t)\,dt = z(b) - z(a). \end{aligned} \tag{4.5} \]
We see that these are, respectively, the variations of $C$ along the $x$, $y$ and $z$ directions. Another way of seeing (4.5) is in terms of displacement on $C$ along the $x$, $y$ and $z$ directions as an object travels on $C$ from $r(a)$ to $r(b)$.
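Formulae (4.5) say that the Riemann sums of $dx$, $dy$, $dz$ along a curve converge to the displacement between the endpoints. The Python sketch below checks this on a hypothetical curve $r(t) = (\cos t, \sin t, t^2)$, $t \in [0, 2]$, which is not taken from the text; it samples at midpoints of the subintervals rather than at the left endpoints $t_j$, a choice made here only for accuracy.

```python
import math

def integrate_basis_forms(dr, a, b, n=2000):
    """Riemann sums of dx, dy, dz evaluated on the tangent vectors r'(t) dt(v_j)."""
    h = (b - a) / n                     # dt(v_j) for a uniform partition
    sums = [0.0, 0.0, 0.0]
    for j in range(n):
        t = a + (j + 0.5) * h           # midpoint sample of the j-th subinterval
        for i, component in enumerate(dr(t)):
            sums[i] += component * h    # dx_i picks out the i-th component
    return sums

# Hypothetical curve, for illustration only.
r = lambda t: (math.cos(t), math.sin(t), t * t)
dr = lambda t: (-math.sin(t), math.cos(t), 2.0 * t)

sums = integrate_basis_forms(dr, 0.0, 2.0)
displacement = [end - start for start, end in zip(r(0.0), r(2.0))]
print(sums)
print(displacement)
```

The two printed lists agree to several decimal places, whatever the shape of the curve between its endpoints.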
Example 4.1.5. Consider a smooth curve $C$ with endpoints at
\[ r(a) = (x(a), y(a)) = (1, 0) \quad\text{and}\quad r(b) = (x(b), y(b)) = (1, 4). \]
Fig. 4.3. Smooth curve $C$ from Example 4.1.5.
The displacement on $C$ along the $x$-direction is 0 and the displacement on $C$ along the $y$-direction is 4. These can be verified using formulae (4.5):
\[ \int_C dx = x(b) - x(a) = 0, \qquad \int_C dy = y(b) - y(a) = 4. \]
We now generalize the pullback to a general 1-form in $\mathbb{R}^n$.
Definition 4.1.6. Let $\omega$ be a 1-form on $\mathbb{R}^n$ given by
\[ \omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i, \]
and let $C$ be a curve with parametrization $r(t) = (g_1(t), \ldots, g_n(t))$. The pullback of $\omega$ along $C$ is the 1-form on $\mathbb{R}$ given by
\[ (r^*\omega)(t) = \sum_{i=1}^{n} a_i(r(t))\,g_i'(t)\,dt. \]
Example 4.1.7. Consider the context of Example 3.3.5 with
\[ dW = x\,dx + (-3)\,dy. \]
On the path from $P = (3, 2)$ to $Q = (3, -1)$ given by $r_1(t) = (x(t), y(t)) = (3, 2 - t)$, we compute the pullback of $\omega := dW$ using the three-step method outlined below:
(1) Write the formula: for $n = 2$, the general formula is
\[ r^*\omega = \big(a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t)\big)\,dt. \]
(2) Identify the pieces: from the formula for $\omega$, the coefficients are
\[ a_1(x, y) = x \quad\text{and}\quad a_2(x, y) = -3, \]
so on $r_1(t)$ we have
\[ a_1(r_1(t)) = a_1(3, 2 - t) = 3 \quad\text{and}\quad a_2(r_1(t)) = a_2(3, 2 - t) = -3. \]
The derivatives are $x_1'(t) = 0$ and $y_1'(t) = -1$.
For the path $r_2(t) = (3 - 5t, -1 - 2t)$ from $Q$ to $R = (-2, -3)$, we have
\[ a_1(r_2(t)) = a_1(3 - 5t, -1 - 2t) = 3 - 5t \quad\text{and}\quad a_2(3 - 5t, -1 - 2t) = -3, \]
with $x_2'(t) = -5$ and $y_2'(t) = -2$.
(3) Assemble the pieces: we can now substitute the computations of the second step into the formula:
\[ (r_1^*dW)(t) = r_1^*\omega = \big(3(0) + (-3)(-1)\big)\,dt = 3\,dt, \]
\[ (r_2^*dW)(t) = r_2^*\omega = \big((3 - 5t)(-5) + (-3)(-2)\big)\,dt = (25t - 9)\,dt. \]
We notice that the pullback $r^*$ is an operator that takes a 1-form in $\mathbb{R}^n$ and gives a 1-form in $\mathbb{R}$. From the previous section, we know that 1-forms in $\mathbb{R}$ can be integrated in a straightforward way because they correspond to Riemann integrals.
Geometric construction of the integral

We show the geometric construction of the integral of 1-forms in the case of $\mathbb{R}^2$, as illustrated in Figure 4.4. Consider a curve $C$ given by $r(t)$ with $t \in [a, b]$ and $\omega = a_1(x, y)\,dx + a_2(x, y)\,dy$.
Fig. 4.4. Geometric construction of the integral of a 1-form.

(1) Partition $[a, b] \subset \mathbb{R}$: $a = t_0 < t_1 < \cdots < t_{n-1} < t_n = b$.
(2) Let $v_j = (t_{j+1} - t_j)e_1 \in T_{t_j}\mathbb{R}$; then $dt(v_j) = t_{j+1} - t_j$ and $r'(t_j)\,dt(v_j) \in T_{r(t_j)}C$. Recall that $r'(t_j)\,dt(v_j)$ is the best linear approximation of $C$ near $r(t_j)$.
(3) We take the following sum at the points $r(t_j)$, evaluated at $r'(t_j)\,dt(v_j) = (x'(t_j), y'(t_j))\,dt(v_j)$:
\[ R_n = \sum_{j=0}^{n-1} \Big[ a_1(r(t_j))\,dx(r'(t_j)\,dt(v_j)) + a_2(r(t_j))\,dy(r'(t_j)\,dt(v_j)) \Big]. \]
This sum can be seen as the contributions of areas of rectangles with heights $a_1$ and $a_2$ and bases $dx$ and $dy$ along the $x$ and $y$ coordinate directions, as seen in Figure 4.5:
\[ \underbrace{a_1(r(t_j))}_{\text{height at } r(t_j)}\ \underbrace{dx(r'(t_j)\,dt(v_j))}_{\text{base in } x\text{-direction}}, \qquad \underbrace{a_2(r(t_j))}_{\text{height at } r(t_j)}\ \underbrace{dy(r'(t_j)\,dt(v_j))}_{\text{base in } y\text{-direction}}. \]
Fig. 4.5. Contribution of areas of rectangles in the $x$ and $y$ directions.
Therefore, we can think of $R_n$ as the addition of the Riemann sums along each direction. We define
\[ \int_C \omega := \lim_{n\to\infty} R_n, \]
where the limit is taken by adding points to the partition so that every interval is always subdivided.
(4) Evaluating $dx$ and $dy$ we obtain
\[ R_n = \sum_{j=0}^{n-1} \big(a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j)\big)\,dt(v_j) \]
and so
\[ \lim_{n\to\infty} R_n = \lim_{n\to\infty} \sum_{j=0}^{n-1} \big(a_1(r(t_j))\,x'(t_j) + a_2(r(t_j))\,y'(t_j)\big)\,dt(v_j) = \int_a^b \big(a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t)\big)\,dt = \int_a^b r^*\omega. \]
(5) Therefore,
\[ \int_C \omega = \int_a^b r^*\omega. \]
Using the above geometric construction for $n = 2$, we now state the general definition.
Definition 4.1.8. Let $\omega$ be a 1-form defined on an open set $U \subset \mathbb{R}^n$ and let $C$ be a curve given by the vector function $r(t)$ with $t \in [a, b]$. Then, the integral of $\omega$ along $C$ is given by
\[ \int_C \omega := \int_a^b r^*\omega. \]
In coordinates, this means that if
\[ \omega(x) = \sum_{i=1}^{n} a_i(x)\,dx_i \]
and $r(t) = (x_1(t), \ldots, x_n(t))$, then
\[ \int_C \omega = \int_a^b \sum_{i=1}^{n} a_i(r(t))\,x_i'(t)\,dt. \]
Remark 4.1.9. In the context of vector fields (treated in an upcoming section), the expression "line integral along $C$" is preferred, but it can also be used in the context of 1-forms.
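The coordinate formula of Definition 4.1.8 translates directly into a short numerical routine. The following Python sketch integrates the pullback with the midpoint rule; the function name `integrate_1form`, the test 1-form $\omega = 2xy\,dx + x^2\,dy$ and the curve $r(t) = (1 + t, t^2)$, $t \in [0, 1]$, are illustrative assumptions. For this data the pullback is $(2t + 6t^2 + 4t^3)\,dt$, whose integral over $[0, 1]$ is $4$.

```python
def integrate_1form(coeffs, r, dr, a, b, n=4000):
    """Approximate the integral of omega = sum_i a_i dx_i along r(t), t in [a, b].

    coeffs: list of functions a_i taking a point p (a tuple) and returning a number
    r, dr:  the parametrization and its derivative, both returning tuples
    The pullback sum_i a_i(r(t)) x_i'(t) dt is integrated with the midpoint rule.
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        p, v = r(t), dr(t)
        total += sum(a_i(p) * v_i for a_i, v_i in zip(coeffs, v)) * h
    return total

# Illustrative data: omega = 2xy dx + x^2 dy along r(t) = (1 + t, t^2).
coeffs = [lambda p: 2.0 * p[0] * p[1], lambda p: p[0] ** 2]
r = lambda t: (1.0 + t, t * t)
dr = lambda t: (1.0, 2.0 * t)

value = integrate_1form(coeffs, r, dr, 0.0, 1.0)
print(value)   # close to 4, the hand-computed value of the pullback integral
```

The same routine works unchanged in any dimension, since it only relies on pairing coefficient functions with the components of $r'(t)$.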
We now look at several examples.
Example 4.1.10. Set up the integral of $\omega = y\,dx + 3x\,dy$ over $C$ given by $r(t) = (t, \sin(2t))$ with $t \in [1, 2]$. We first write the pullback of $\omega$:
(1) Formula: the general formula for the pullback of a 1-form with $n = 2$ is
\[ r^*\omega = \big(a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t)\big)\,dt. \]
(2) Identify the pieces: $a_1(x, y) = y$ and $a_2(x, y) = 3x$. Then on $r(t)$ we have
\[ a_1(r(t)) = \sin(2t) \quad\text{and}\quad a_2(r(t)) = 3t. \]
Also, $x'(t) = 1$ and $y'(t) = 2\cos(2t)$.
(3) Assemble the pieces in the formula:
\[ (r^*\omega)(t) = \sin(2t)\,dt + (3t)\,2\cos(2t)\,dt = \big(\sin(2t) + 6t\cos(2t)\big)\,dt. \]
We can now set up the integral
\[ \int_C \omega = \int_1^2 r^*\omega = \int_1^2 \big(\sin(2t) + 6t\cos(2t)\big)\,dt. \]
Example 4.1.11. We set up the integral of $\omega = (z - y)\,dx + x^2\,dy - zy\,dz$ over $C$ given by $r(t) = (t, -2t, t)$ with $t \in [0, 1]$.
(1) Write the formula of the pullback:
\[ r^*\omega(t) = \big(a_1(r(t))\,x'(t) + a_2(r(t))\,y'(t) + a_3(r(t))\,z'(t)\big)\,dt. \]
(2) Identify the pieces: we have $a_1(x, y, z) = z - y$, $a_2(x, y, z) = x^2$, and $a_3(x, y, z) = -zy$. Now $r(t) = (x(t), y(t), z(t)) = (t, -2t, t)$, which implies $r'(t) = (x'(t), y'(t), z'(t)) = (1, -2, 1)$. Therefore,
\[ a_1(r(t)) = 3t, \quad a_2(r(t)) = t^2, \quad a_3(r(t)) = 2t^2. \]
(3) Assemble the pieces:
\[ r^*\omega(t) = \big((3t)(1) + t^2(-2) + 2t^2(1)\big)\,dt = 3t\,dt. \]
Therefore,
\[ \int_C \omega = \int_0^1 3t\,dt. \]
This integral is easily computable and we leave the details to the reader.
We can now state the general properties of integrals of 1-forms.
Proposition 4.1.12. The integral of a 1-form $\omega$ on $\mathbb{R}^n$ satisfies the following properties.
(a) If a curve $C$ is the (disjoint) union of $C_1$ and $C_2$, then
\[ \int_C \omega = \int_{C_1} \omega + \int_{C_2} \omega. \]
(b) Let $-C$ be the curve $C$ travelled in the opposite orientation. Then,
\[ \int_{-C} \omega = -\int_C \omega. \]
(c) If $a, b \in \mathbb{R}$ and $\omega_i$, $i = 1, 2$, are 1-forms, then
\[ \int_C (a\omega_1 + b\omega_2) = a\int_C \omega_1 + b\int_C \omega_2. \]
Proof. (a) We suppose that $C$ is given by $r(t)$ with $t \in [a, b]$, with $C_1$ obtained by restricting $t \in [a, c]$ and $C_2$ by restricting $t \in [c, b]$. We can decompose the integral as follows:
\[ \int_C \omega = \int_a^b r^*\omega = \int_a^c r^*\omega + \int_c^b r^*\omega, \]
because the integral $\int_a^b r^*\omega$ is the integral of a 1-form in $\mathbb{R}$. The two integrals on the right-hand side correspond to $\int_{C_1}\omega$ and $\int_{C_2}\omega$.
(b) Let $C$ be given by $r(t) = (x_1(t), \ldots, x_n(t))$ with $t \in [a, b]$. Then $-C$ is given by $\tilde r(t) = r(a + b - t)$ with $t \in [a, b]$. Then, $\tilde r(a) = r(b)$, $\tilde r(b) = r(a)$ and $\tilde r'(t) = -r'(a + b - t)$. We compute
\[ \begin{aligned} \int_{-C} \omega &= \int_a^b \left( \sum_{i=1}^{n} a_i(\tilde r(t))\,\big(-x_i'(a + b - t)\big) \right) dt \\ &= \int_a^b \sum_{i=1}^{n} a_i(r(a + b - t))\,\big(-x_i'(a + b - t)\big)\,dt, \quad \text{set } u = a + b - t, \\ &= \int_b^a \sum_{i=1}^{n} a_i(r(u))\,\big(-x_i'(u)\big)\,(-du) \\ &= -\int_a^b \sum_{i=1}^{n} a_i(r(u))\,x_i'(u)\,du = -\int_C \omega. \end{aligned} \]
(c) This property is straightforward to check and left as an exercise.
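Part (b) can be observed numerically by integrating the same 1-form along a curve and along its reversal $t \mapsto r(a + b - t)$. The Python sketch below uses an illustrative 1-form $\omega = -y\,dx + x\,dy$ on a quarter circle, chosen here because its pullback is simply $dt$; the helper name `integrate_1form` is hypothetical.

```python
import math

def integrate_1form(coeffs, r, dr, a, b, n=4000):
    """Midpoint-rule integral of the pullback sum_i a_i(r(t)) x_i'(t) dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        p, v = r(t), dr(t)
        total += sum(c(p) * v_i for c, v_i in zip(coeffs, v)) * h
    return total

# Illustrative data (not from the text): omega = -y dx + x dy on a quarter circle.
coeffs = [lambda p: -p[1], lambda p: p[0]]
a, b = 0.0, math.pi / 2
r = lambda t: (math.cos(t), math.sin(t))
dr = lambda t: (-math.sin(t), math.cos(t))

# Reversed curve -C: r_rev(t) = r(a + b - t), hence r_rev'(t) = -r'(a + b - t).
r_rev = lambda t: r(a + b - t)
dr_rev = lambda t: tuple(-c for c in dr(a + b - t))

forward = integrate_1form(coeffs, r, dr, a, b)
backward = integrate_1form(coeffs, r_rev, dr_rev, a, b)
print(forward, backward)   # approximately pi/2 and -pi/2
```

The sign change comes entirely from the tangent vectors, which point the other way on $-C$; the coefficients are evaluated at the same points of the curve.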
We illustrate the use of part (a) of Proposition 4.1.12.
Example 4.1.13. Find the total work done in Example 3.3.5 on the given path $C$ from $P$ to $R$. Because the path is piecewise smooth, the integral is the sum of the integrals over the two smooth pieces $C_1$ and $C_2$:
\[ \begin{aligned} \int_C dW &= \int_{C_1} dW + \int_{C_2} dW = \int_0^3 r_1^*dW + \int_0^1 r_2^*dW \\ &= \int_0^3 3\,dt + \int_0^1 (25t - 9)\,dt \\ &= 9 + \left[ \frac{25}{2}t^2 - 9t \right]_0^1 \\ &= 9 + \frac{25}{2} - 9 = \frac{25}{2}. \end{aligned} \]
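The total work can be cross-checked numerically by integrating the pullback on each smooth piece and adding the results. The sketch assumes the force $F(x, y) = (x, -3)$ and the parametrizations $r_1(t) = (3, 2 - t)$ on $[0, 3]$ and $r_2(t) = (3 - 5t, -1 - 2t)$ on $[0, 1]$, as indicated by the computed pullbacks $3\,dt$ and $(25t - 9)\,dt$; the function name is hypothetical.

```python
def integrate_work(force, r, dr, a, b, n=1000):
    """Midpoint-rule integral of the pullback of dW = F_1 dx + F_2 dy along r."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        (x, y), (vx, vy) = r(t), dr(t)
        fx, fy = force(x, y)
        total += (fx * vx + fy * vy) * h
    return total

force = lambda x, y: (x, -3.0)     # assumed wind force

work_pq = integrate_work(force, lambda t: (3.0, 2.0 - t),
                         lambda t: (0.0, -1.0), 0.0, 3.0)
work_qr = integrate_work(force, lambda t: (3.0 - 5.0 * t, -1.0 - 2.0 * t),
                         lambda t: (-5.0, -2.0), 0.0, 1.0)
total_work = work_pq + work_qr
print(work_pq, work_qr, total_work)   # about 9, 3.5 and 12.5 = 25/2
```

Splitting the path at the corner $Q$ is exactly the decomposition of part (a) of Proposition 4.1.12.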
Exercises
(1) Set up the integral of = 2xy dx + x2 dy over the curve C given by r(t) =
(1 + t, 3t2 ) with t [0, 1].
(2) Set up the integral of = (z y) dx + x2 dy zy dz over the curve C given by
r(t) = (t, 2t, t) with t [0, 1].
(3) Set up

C
where = exy dx + xey dy over the piecewise smooth curve C consisting of C1
the piece of parabola y = x2 from (0, 0) to (1, 1) , followed by the line segment
C2 from (1, 1) to (2, 0), and finally the line C3 from (2, 0) to (0, 0).
(4) Compute the integral of = y sin(z) dx + z sin(x) dy + x sin(y) dz over the curve
C given by r(t) = (cos(t), sin(t), sin(5t)) with t [0, ].
(5) Set up the integral for the work done by the force F (x, y, z) = (xy 2 , xy +
yz, 3z 3 + y 2 ) over the piecewise smooth path C given by the piece of helix
C1 parametrized by r(t) = (cos(t), sin(t), t) with t [0, ] and C2 the line seg-
ment from (1, 0, ) to (0, 1, 0).
(6) Consider a cyclist of mass 1 on a road up a mountain where the path C is given

by r(t) = ( 2 t cos(2t), 2 t sin(2t), t) with t [0, 2].
(a) Verify that the path verifies the equation of the elliptic paraboloid z =
2 (x2 + y 2 ).
(b) Determine the work needed against gravity (a = (0, 0, 9.8)) to climb up
the path C.
(c) Suppose there is a wind with force F (x, y, z) = (3z 1, 0, 0), compute the
work done by the cyclist against the wind.
(7) Show part (c) of Proposition 4.1.12.
4.2 Arc-length, Metrics and Applications
We show how to use the ideas from 1-forms to compute important quantities related to curves, such as arc-length and curvature. We also introduce integration over the length of a curve and use it to compute the mass and centre of mass of curve-shaped objects. Finally, we make the link between 1-forms and vector fields and between the integration of 1-forms and the line integrals of vector fields.
4.2.1 Arc-length
Suppose one wants to know the length of a piece of rope lying on the floor. It can just be picked up, straightened and measured using a measuring tape. The length of the rope is independent of its shape on the floor. We exploit this idea to give an intrinsic definition of arc-length by finding a mathematical way of straightening a curve $C$ on top of a coordinate axis to obtain the arc-length. As a first step, the parametrization should not travel along the curve several times. Consider the following example.
Example 4.2.1. Let $C$ be the circle given by $r(t) = (\cos(t), \sin(t))$ with $t \in [0, 4\pi]$. The section of the curve for $t \in [2\pi, 4\pi]$ is a second winding around the circle. Thus, for $t_1 \in [0, 2\pi]$ and $t_2 = t_1 + 2\pi$, two different elements in the domain, $t_1 \neq t_2$, are sent to the same point in $\mathbb{R}^2$: $r(t_1) = r(t_2)$. Computing the length of the circle over the whole domain would give us twice the length. See Figure 4.6.
Fig. 4.6. Parametrization of the circle in Example 4.2.1, with $r(t_1) = r(t_1 + 2\pi)$.
This example leads us to consider parametrizations that are one-to-one, also called injective. We recall the definition in the context of vector functions.
Definition 4.2.2. Let $r(t)$ be a vector function with $t \in [a, b]$. Then, $r(t)$ is one-to-one or injective if for $t_1, t_2 \in [a, b]$,
\[ r(t_1) = r(t_2) \implies t_1 = t_2. \]
Another way to write this implication is: if $t_1 \neq t_2$ then $r(t_1) \neq r(t_2)$.
Consider the following example.
Example 4.2.3. Let $C$ be a curve given by the parametric representation $r(t) = (t^2, t^4)$ with $t \in [-1, 1]$. We check whether this vector function is injective by setting $r(t_1) = r(t_2)$. The goal is to find out whether this equality forces $t_1$ to be equal to $t_2$. If so, then it is injective. But we can check directly that for $t_1 = -1$ and $t_2 = 1$ we have
\[ (t_1^2, t_1^4) = (t_2^2, t_2^4). \]
Thus, it is not injective. In fact, for $t_2 = -t_1$ we have $r(t_1) = r(t_2)$. Restricting the domain of $r(t)$ to $t \in [-1, 0]$ or $t \in [0, 1]$ leads to an injective vector function.
As noted in Remark 2.3.1, the domain of the arc-length parametrization corresponds

in the examples at the beginning of Section 6.2 to the length of the curve C as we
know from basic geometry. This is true in general as the construction below shows.
Arc-Length: Geometric Construction

We use tangent vectors at points along $C$ to build a Riemann sum for the arc-length. See Figure 4.7 for an illustration of this construction. Let $r(s)$ with $s \in [a, b]$ be the arc-length parametrization of a curve $C$.
(1) Let
\[ a = s_0 < s_1 < \cdots < s_n = b \]
be a partition of $[a, b]$.
(2) Let $\Delta s_i = s_{i+1} - s_i \in T_{s_i}\mathbb{R}$ for $i = 0, \ldots, n - 1$.
We know that the line segment $r'(s_i)\,ds(\Delta s_i)$ is the best linear approximation of $C$ near $r(s_i)$ and
\[ \|r'(s_i)\,ds(\Delta s_i)\| = ds(\Delta s_i) \]
for all $i = 0, \ldots, n - 1$, because $\|r'(s_i)\| = 1$ in the arc-length parametrization.
Fig. 4.7. Tangent vectors at points $r(t_j)$ along $C$.
We see that the arc-length parametrization preserves the length of tangent vectors from $T_{s_j}\mathbb{R}$ to $T_{r(s_j)}C$. Now, summing those line segments, we obtain an approximation of the length of $C$ which is independent of $n$, the number of elements in the partition:
\[ \sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = \sum_{i=0}^{n-1} |ds(\Delta s_i)| = b - a \tag{4.6} \]
and in particular
\[ \lim_{n\to\infty} \sum_{i=0}^{n-1} \|r'(s_i)\,ds(\Delta s_i)\| = b - a. \tag{4.7} \]
But the approximation of $C$ by the line segments $r'(s_i)\,ds(\Delta s_i)$ improves as $ds(\Delta s_i)$ becomes smaller. Thus, in (4.7), we let $n \to \infty$ in such a way that all intervals $\Delta s_i \to 0$ for $i = 0, \ldots, n - 1$.
Definition 4.2.4. Let $C$ be a curve with parametrization given by the vector function $r(s)$ with $s \in [a, b]$ corresponding to the arc-length parametrization. Then, the arc-length of $C$ is given by
\[ \ell(C) := \int_a^b ds = b - a. \]
The above calculation shows that, using the arc-length parametrization, approximations to the curve by tangent vectors always add up naturally to the length of the domain of the arc-length parametrization. We are, in some sense, straightening out pieces of the curve on the tangent vectors.
As mentioned in the previous chapter, finding the arc-length parametrization is not possible in many cases, and so we must obtain a formula from which the arc-length can be expressed no matter which parametrization is used.
Metrics and arc-length

Consider a vector $v = (v_1, \ldots, v_n) \in T_p\mathbb{R}^n$. We know that the differentials $dx_i$ applied to $v$ yield the projections of $v$ on the coordinate axes $x_i$: $dx_i(v) = v_i$. We use the differentials to define an alternative way of computing the length of vectors in $T_p\mathbb{R}^n$ using the metric operator $ds : T_p\mathbb{R}^n \to \mathbb{R}$ defined by
\[ ds := \sqrt{\sum_{i=1}^{n} dx_i^2}. \]
Note that the metric is not a 1-form because it is not linear, but it is built using 1-forms. The notation $ds$ for the metric operator should not be confused with the differential $ds$ of the previous section. We see indeed that
\[ ds(v) = \sqrt{\sum_{i=1}^{n} dx_i(v)^2} = \sqrt{\sum_{i=1}^{n} v_i^2} = \|v\|. \]
Note that the norm can be defined in curvilinear coordinate systems. Consider the differentials $dr$ and $d\theta$ from polar coordinates. We perform the change of coordinates from $dx$, $dy$ to $dr$, $d\theta$:
\[ dx = \cos\theta\,dr - r\sin\theta\,d\theta \quad\text{and}\quad dy = \sin\theta\,dr + r\cos\theta\,d\theta. \]
So,
\[ dx^2 + dy^2 = (\cos\theta\,dr - r\sin\theta\,d\theta)^2 + (\sin\theta\,dr + r\cos\theta\,d\theta)^2 = dr^2 + r^2\,d\theta^2 \]
and therefore
\[ ds = \sqrt{dr^2 + r^2\,d\theta^2}. \]
We see in the next result that the metric is the right concept to produce a general formula for computing arc-length: we can compute the arc-length of a curve $C$ by integrating the metric $ds$. We assume that the curve $C$ is given by an injective parametrization $r(t)$ with $t \in [a, b]$.
Theorem 4.2.5. The arc-length $\ell(C)$ of a curve $C$ in $\mathbb{R}^n$ with injective parametrization $r(t)$, $t \in [c, d]$, is given by
\[ \ell(C) = \int_c^d \|r'(t)\|\,dt. \]
Proof. Let $r(s)$ with $s \in [a, b]$ be the arc-length parametrization of $C$; then $\ell(C) = b - a$. The parametrizations $r(s)$ and $r(t)$ are related by $t = u(s)$, where $u : [a, b] \to [c, d]$ is differentiable, and we suppose, without loss of generality, that $u'(s) > 0$. By the chain rule, we know that
\[ 1 = \|r'(s)\| = \|r'(t)\|\,|u'(s)|. \]
We now use $ds$ to obtain an approximation of the length of $C$. Let
\[ c = t_0 < \cdots < t_n = d \]
be a partition of $[c, d]$ and consider the vectors $r'(t_i)\,dt(\Delta t_i)$ based at $r(t_i)$, where $\Delta t_i = t_{i+1} - t_i \in T_{t_i}\mathbb{R}$. Then, writing $t_i = u(s_i)$,
\[ \sum_{i=0}^{n-1} ds\big(r'(t_i)\,dt(\Delta t_i)\big) = \sum_{i=0}^{n-1} \sqrt{x_1'(t_i)^2 + \cdots + x_n'(t_i)^2}\;dt(\Delta t_i) = \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i). \]
But $dt(\Delta t_i) = u(s_{i+1}) - u(s_i) \approx u'(s_i)\,ds(\Delta s_i)$. Thus,
\[ \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,dt(\Delta t_i) \approx \sum_{i=0}^{n-1} \frac{1}{|u'(s_i)|}\,u'(s_i)\,ds(\Delta s_i) = b - a. \]
Letting $n \to \infty$ one has
\[ \int_c^d \|r'(t)\|\,dt = b - a = \ell(C) \]
and this completes the proof.
The following result is immediate from the proof of the above theorem.
Corollary 4.2.6. Let $C$ be a curve. The arc-length is independent of the vector function used to describe $C$. That is, if $r_1(t)$ with $t \in [a_1, b_1]$ and $r_2(t)$ with $t \in [a_2, b_2]$ are two different parametrizations of $C$, then
\[ \int_{a_1}^{b_1} \|r_1'(t)\|\,dt = \int_{a_2}^{b_2} \|r_2'(t)\|\,dt. \]
Proof. Any parametrization can be reparametrized to the arc-length parametrization, and so the formula is the same, with the bounds of integration depending on the parametrization.
Remark 4.2.7. (1) A more widespread notation for arc-length is
\[ \int_C ds := \ell(C). \]
(2) Note that the integral for computing arc-length is the same as the one we need to compute the arc-length parametrization.
(3) Reversing the orientation leaves $\int_C ds$ invariant because
\[ ds\big(-r'(t_i)\,dt(\Delta t_i)\big) = ds\big(r'(t_i)\,dt(\Delta t_i)\big). \]
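The independence of arc-length from the parametrization, and from the orientation, can be observed numerically. The Python sketch below uses a hypothetical curve, the piece of parabola $y = x^2$ from $(0, 0)$ to $(1, 1)$, described in three ways: $r_1(t) = (t, t^2)$, $r_2(u) = (u^2, u^4)$ (injective on $[0, 1]$) and the reversed curve $r_3(t) = (1 - t, (1 - t)^2)$. The function name `arc_length` and the midpoint rule are choices made here.

```python
import math

def arc_length(dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of ||r'(t)|| over [a, b]."""
    h = (b - a) / n
    return h * sum(math.hypot(*dr(a + (k + 0.5) * h)) for k in range(n))

# Three descriptions of the same piece of parabola (illustrative choice).
dr1 = lambda t: (1.0, 2.0 * t)                   # r1(t) = (t, t^2)
dr2 = lambda u: (2.0 * u, 4.0 * u ** 3)          # r2(u) = (u^2, u^4)
dr3 = lambda t: (-1.0, -2.0 * (1.0 - t))         # r3(t) = (1 - t, (1 - t)^2)

lengths = [arc_length(dr, 0.0, 1.0) for dr in (dr1, dr2, dr3)]
# Closed form of the integral of sqrt(1 + 4 t^2) over [0, 1].
exact = 0.5 * math.sqrt(5.0) + 0.25 * math.asinh(2.0)
print(lengths, exact)
```

All three values agree with the closed form, even though the speeds $\|r_i'(t)\|$ differ from one parametrization to the next.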
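Example 4.2.8. We compute the arc-length of $C$ given by
\[ r(t) = (t\cos(2\pi t), t\sin(2\pi t)) \]
with $t \in [0, 2]$. The derivative is
\[ r'(t) = \big(\cos(2\pi t) - 2\pi t\sin(2\pi t),\ \sin(2\pi t) + 2\pi t\cos(2\pi t)\big) \]
and $\|r'(t)\| = \sqrt{1 + 4\pi^2t^2}$. Therefore,
\[ \begin{aligned} \int_C ds &= \int_0^2 \sqrt{1 + 4\pi^2t^2}\,dt \\ &= \frac{1}{2}\left[ t\sqrt{1 + 4\pi^2t^2} + \frac{1}{2\pi}\ln\!\big(2\pi t + \sqrt{1 + 4\pi^2t^2}\big) \right]_0^2 \\ &= \frac{1}{2}\left( 2\sqrt{1 + 16\pi^2} + \frac{1}{2\pi}\ln\!\big(4\pi + \sqrt{1 + 16\pi^2}\big) \right). \end{aligned} \]
Fig. 4.8. Curve $C$ of Example 4.2.8, $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$.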
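The closed form in Example 4.2.8 can be compared with a direct numerical integration of the speed. The sketch assumes the spiral is $r(t) = (t\cos(2\pi t), t\sin(2\pi t))$ on $[0, 2]$, which is what the factor $4\pi^2$ under the square root indicates; variable names are hypothetical.

```python
import math

c = 2.0 * math.pi

def dr(t):
    """Derivative of r(t) = (t cos(2 pi t), t sin(2 pi t))."""
    return (math.cos(c * t) - c * t * math.sin(c * t),
            math.sin(c * t) + c * t * math.cos(c * t))

# Numerical arc-length with the midpoint rule on [0, 2].
n = 20000
h = 2.0 / n
numeric = h * sum(math.hypot(*dr((k + 0.5) * h)) for k in range(n))

# Closed form: (1/2) * (2 sqrt(1 + 16 pi^2) + ln(4 pi + sqrt(1 + 16 pi^2)) / (2 pi)).
root = math.sqrt(1.0 + 16.0 * math.pi ** 2)
closed = 0.5 * (2.0 * root + math.log(4.0 * math.pi + root) / (2.0 * math.pi))
print(numeric, closed)
```

The two numbers agree to many decimal places, which also confirms that $\|r'(t)\|$ simplifies to $\sqrt{1 + 4\pi^2t^2}$.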
Example 4.2.9. Set up the integral for the arc-length of $C$ given by $r(t) = (e^t, te^t, t^2e^t)$ with $t \in [0, 2]$. The derivative is $r'(t) = (e^t, e^t + te^t, 2te^t + t^2e^t)$, so $\|r'(t)\|^2 = e^{2t}\big(1 + (1 + t)^2 + (2t + t^2)^2\big)$ and $\|r'(t)\| = e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}$. Then
\[ \int_C ds = \int_0^2 e^t\sqrt{2 + 2t + 5t^2 + 4t^3 + t^4}\,dt. \]
More Metrics
We begin with some examples of metrics in three dimensions, the simplest one being obtained for cylindrical coordinates, which is just a direct extension of the polar coordinate case. We have
\[ ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2}. \]
The metric in spherical coordinates requires more work to derive. We begin by writing $dx$, $dy$, $dz$ as functions of $d\rho$, $d\theta$ and $d\phi$. Recall
\[ x = \rho\cos\theta\sin\phi, \quad y = \rho\sin\theta\sin\phi, \quad z = \rho\cos\phi. \]
Then,
\[ \begin{aligned} dx &= \cos\theta\sin\phi\,d\rho - \rho\sin\theta\sin\phi\,d\theta + \rho\cos\theta\cos\phi\,d\phi, \\ dy &= \sin\theta\sin\phi\,d\rho + \rho\cos\theta\sin\phi\,d\theta + \rho\sin\theta\cos\phi\,d\phi, \\ dz &= \cos\phi\,d\rho - \rho\sin\phi\,d\phi, \end{aligned} \]
and a straightforward but lengthy computation shows
\[ ds^2 = dx^2 + dy^2 + dz^2 = d\rho^2 + \rho^2\sin^2\phi\,d\theta^2 + \rho^2\,d\phi^2. \]
This metric is useful for measuring curves lying on a sphere or following a spherical trajectory, possibly of non-constant radius.
Example 4.2.10. We set up the integral for the length of the curve $C$ given by the vector function
\[ r(t) = (t\cos t\sin t,\ t\sin^2 t,\ t\cos t) \]
with $t \in [0, \pi/2]$. We write the curve in spherical coordinates:
\[ \rho = \sqrt{t^2\cos^2 t\sin^2 t + t^2\sin^4 t + t^2\cos^2 t} = t, \]
with $\theta = \arctan(y(t)/x(t))$ and $\cos\phi = z(t)/\rho(t)$. Because $t \in [0, \pi/2]$, we obtain
\[ \theta = t, \quad \phi = t. \]
Therefore, $d\rho^2 = dt^2$, $d\theta^2 = dt^2$ and $d\phi^2 = dt^2$. Thus,
\[ \int_C ds = \int_0^{\pi/2} \sqrt{1 + t^2\sin^2 t + t^2}\,dt. \]
Fig. 4.9. The curve given by $r(t) = (t\cos t\sin t, t\sin^2 t, t\cos t)$ in Example 4.2.10.
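The integrand obtained from the spherical metric should coincide with the Cartesian speed $\|r'(t)\|$. The Python sketch below compares the two at sample points of $[0, \pi/2]$, assuming the curve $r(t) = (t\cos t\sin t, t\sin^2 t, t\cos t)$ and the spherical description $\rho = \theta = \phi = t$; function names are hypothetical.

```python
import math

def dr(t):
    """Cartesian derivative of r(t) = (t cos t sin t, t sin^2 t, t cos t)."""
    return (math.cos(t) * math.sin(t) + t * math.cos(2.0 * t),
            math.sin(t) ** 2 + t * math.sin(2.0 * t),
            math.cos(t) - t * math.sin(t))

def metric_speed(t):
    """Speed from ds^2 = d rho^2 + rho^2 sin^2(phi) d theta^2 + rho^2 d phi^2
    with rho = theta = phi = t."""
    return math.sqrt(1.0 + t * t * math.sin(t) ** 2 + t * t)

samples = [k * math.pi / 40.0 for k in range(21)]     # points of [0, pi/2]
gap = max(abs(math.sqrt(sum(c * c for c in dr(t))) - metric_speed(t))
          for t in samples)
print(gap)   # agreement up to rounding error
```

Working in spherical coordinates avoids differentiating the products in $r(t)$ by hand, at the price of first identifying $\rho$, $\theta$ and $\phi$ along the curve.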
Exercises
(1) For the following curves C, determine if the parametrization is injective. If not,
restrict the domain of t to make it injective
(a) Consider the curve C given by r(t) = (a cos(2t), b sin t) with t [ 2 , 2 ].
(b) Consider the curve C given by r(t) = (t2 , sin(4t2 )) with t [1, 1].
(2) Find the length of the curve C given by r(t) = (t2 /2, t3 /3) with t [0, 1].
Compare your answer with the domain obtained in Example 2.3.5.
(3) Find the length of the Cycloid curve C given by r(t) = (r(t sin t), r(1 cos t))
with t [0, 2]. Use a computer software to plot C.
(4) Find the length of the Cardioid curve C given by r(t) =(a(2 cos t cos(2t)), a(2 sin t
sin(2t))) with t [0, 2]. Use a computer software to plot C.
(5) Find the length of the Deltoid curve C given by
r(t) = (2a cos t + a cos(2t), 2a sin t a sin(2t))
with t [0, 2]. Use a computer software to plot C. p

(6) Show that for a vector function r(t) = (t, f (t)), then ds = 1 + f 0 (t)2 dt.
(7) Find the length of the helix curve C given by r(t) = (a cos(2t), a sin(2t), t)
with t [0, 2]. Use the metric ds in cylindrical coordinates.
(8) Find the length of the curve C given by r(t) = (t cos t, t sin t, t) with t [0, 2].
Is a given metric preferable here?
(9) Find the length of the curve C given by r(t) = (e2t cos t, 2, e2t sin t) with t
[0, 2].
4.2.2 Integral of functions over curves: including applications
Consider the situations shown in Figure 4.10:
Fig. 4.10. Fences of constant height (left) and variable height $h(p)$ (right) over a curve $C$.
(1) Suppose that a fence of constant height $h > 0$ is built on top of a curve $C$. The intuition, which is correct, is that the area of the fence is given by the length of the curve $C$ times the height of the fence:
\[ \int_C h\,ds. \]
If the height depends on the location along the fence, $h = h(p)$ where $p \in C$, then by analogy with the area under a curve seen in elementary calculus, we expect the area to be given by an expression such as
\[ \int_C h(p)\,ds. \tag{4.8} \]
(2) A very thin wire (of uniform radial cross-section) is made of a single material and is arranged in the shape given by a curve $C$. It is possible that the material is unevenly distributed along the wire, so that the density of material in a small piece $\Delta s$ of the curve varies along $C$. This means that for small portions of wire near two different points $p, q \in C$, the masses in equal neighborhoods of $p$ and $q$ could be different. The mass near $p$ and $q$ is approximated by $\rho(p)\,ds(v)$ and $\rho(q)\,ds(w)$, where $\rho$ is the mass density (kg/m), evaluated on $C$, and $ds$ (m) measures a small distance near the points $p$ and $q$, with $v \in T_pC$, $w \in T_qC$. Thus, the mass of the wire should be given by
\[ \int_C \rho(p)\,ds. \tag{4.9} \]
We want to make sense of the expressions (4.8) and (4.9). Let $r(t)$, $t \in [a, b]$, be a parametrization of a curve $C$ and let $f(x)$ be a function defined on $C$, where $x = (x_1, \ldots, x_n)$. Let $\Delta t_j := t_{j+1} - t_j \in T_{t_j}\mathbb{R}$ and define
\[ R_n := \sum_{j=0}^{n-1} f(r(t_j))\,ds\big(r'(t_j)\,dt(\Delta t_j)\big) = \sum_{j=0}^{n-1} f(r(t_j))\sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\;dt(\Delta t_j). \]
We define
\[ \begin{aligned} \int_C f(x)\,ds &:= \lim_{n\to\infty} R_n \\ &= \lim_{n\to\infty} \sum_{j=0}^{n-1} f(r(t_j))\sqrt{x_1'(t_j)^2 + \cdots + x_n'(t_j)^2}\;dt(\Delta t_j) \\ &= \int_a^b f(r(t))\sqrt{x_1'(t)^2 + \cdots + x_n'(t)^2}\,dt \\ &= \int_a^b f(r(t))\,\|r'(t)\|\,dt. \end{aligned} \]
This leads us to the following definition.
Definition 4.2.11. Consider a function $f : \mathbb{R}^n \to \mathbb{R}$ and $C$ a curve defined by $r(t)$ with $t \in [a, b]$. The integral of $f$ over $C$ is given by
\[ \int_C f(x_1, \ldots, x_n)\,ds = \int_a^b f(r(t))\,\|r'(t)\|\,dt. \]
(1) Write the formula for the integral of $f(x, y) = xy$ over the curve $C$ defined by $r(t) = (t, t^3)$ with $t \in [0, 1]$. Begin by obtaining the derivative $r'(t) = (1, 3t^2)$ and $\|r'(t)\| = \sqrt{1 + 9t^4}$. Since $f(r(t)) = t \cdot t^3 = t^4$,
\[ \int_C f\,ds = \int_0^1 t^4\sqrt{1 + 9t^4}\,dt. \]
(2) Compute the integral of $f(x, y, z) = 2x$ over the curve $C$ defined by $r(t) = (t, 3\cos t, 3\sin t)$ with $t \in [0, 2\pi]$. We compute directly $\|r'(t)\| = \sqrt{10}$ and so
\[ \int_C f\,ds = \int_0^{2\pi} 2t\sqrt{10}\,dt = \sqrt{10}\,t^2\Big|_0^{2\pi} = 4\pi^2\sqrt{10}. \]
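The formula $\int_C f\,ds = \int_a^b f(r(t))\,\|r'(t)\|\,dt$ can be turned into a small numerical routine. The Python sketch below applies it to the helix computation above, assuming $f(x, y, z) = 2x$, $r(t) = (t, 3\cos t, 3\sin t)$ and $t \in [0, 2\pi]$, so that the expected value is $4\pi^2\sqrt{10}$; the function name `integrate_scalar` is hypothetical.

```python
import math

def integrate_scalar(f, r, dr, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(r(t)) ||r'(t)|| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        speed = math.sqrt(sum(c * c for c in dr(t)))   # ds evaluated on r'(t)
        total += f(*r(t)) * speed * h
    return total

# f(x, y, z) = 2x over the helix r(t) = (t, 3 cos t, 3 sin t), t in [0, 2 pi].
value = integrate_scalar(
    lambda x, y, z: 2.0 * x,
    lambda t: (t, 3.0 * math.cos(t), 3.0 * math.sin(t)),
    lambda t: (1.0, -3.0 * math.sin(t), 3.0 * math.cos(t)),
    0.0, 2.0 * math.pi)
exact = 4.0 * math.pi ** 2 * math.sqrt(10.0)
print(value, exact)
```

Unlike the integral of a 1-form, this integral uses the norm of $r'(t)$, so it does not change sign when the curve is traversed in the opposite direction.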
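Example 4.2.12. The formula for the integral of $f$ over $C$ is also valid in other coordinate systems. Consider the curve given by the vector function
\[ r(t) = (a\cos t\sin t,\ a\sin^2 t,\ a\cos t) \]
with $t \in [0, \pi/2]$, seen in Figure 4.11. This curve has a form similar to the one of Example 4.2.10, and one can verify that $C$ is located on a sphere of radius $a$. We compute $d\rho$, $d\theta$ and $d\phi$ explicitly. We start with
\[ \rho^2 = x^2 + y^2 + z^2 \]
and take the differential to obtain
\[ 2\rho\,d\rho = 2x\,dx + 2y\,dy + 2z\,dz. \]
Then,
\[ \begin{aligned} \rho\,d\rho &= x\,dx + y\,dy + z\,dz \\ &= a^2\cos t\sin t\,(-\sin^2 t + \cos^2 t)\,dt + a^2\sin^2 t\,(2\sin t\cos t\,dt) + a\cos t\,(-a\sin t\,dt) \\ &= a^2(\cos^3 t\sin t + \sin^3 t\cos t - \sin t\cos t)\,dt \\ &= a^2\big(\sin t\cos t(\cos^2 t + \sin^2 t) - \sin t\cos t\big)\,dt = 0. \end{aligned} \]
Similarly, since $x^2 + y^2 = \rho^2\sin^2\phi$,
\[ \begin{aligned} \rho^2\sin^2\phi\,d\theta = -y\,dx + x\,dy &= \big(-a^2\sin^2 t\,(-\sin^2 t + \cos^2 t) + a^2\cos t\sin t\,(2\sin t\cos t)\big)\,dt \\ &= a^2(\sin^4 t + \sin^2 t\cos^2 t)\,dt \\ &= a^2\sin^2 t\,dt, \end{aligned} \]
and
\[ dz = \cos\phi\,d\rho - \rho\sin\phi\,d\phi \]
becomes $-a\sin t\,dt = -a\sin\phi\,d\phi$. Since $\rho^2\sin^2\phi = a^2(1 - \cos^2\phi) = a^2\sin^2 t$, we obtain $d\rho = 0$, $d\theta = dt$ and $d\phi = dt$, so
\[ ds = \sqrt{d\rho^2 + \rho^2\sin^2\phi\,d\theta^2 + \rho^2\,d\phi^2} = a\sqrt{1 + \sin^2 t}\,dt. \]
Therefore, with the substitution $u = \sin t$,
\[ \begin{aligned} \int_C z\,ds &= \int_0^{\pi/2} a^2\cos t\,\sqrt{1 + \sin^2 t}\,dt \\ &= a^2\int_0^1 \sqrt{1 + u^2}\,du \\ &= \frac{a^2}{2}\Big[ u\sqrt{1 + u^2} + \ln\!\big(u + \sqrt{1 + u^2}\big) \Big]_0^1 = \frac{a^2}{2}\big(\sqrt{2} + \ln(1 + \sqrt{2})\big). \end{aligned} \]
Fig. 4.11. Curve $r(t) = (2\cos t\sin t, 2\sin^2 t, 2\cos t)$ from Example 4.2.12.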
Mass and centre of mass

As described above, for a thin wire with mass density given by a function $\rho(x, y, z)$, the mass $m$ of the wire is given by
\[ m = \int_C \rho(x, y, z)\,ds. \]
Example 4.2.13. We compute the mass of the wire of density $\rho(x, y, z) = z$ and shape $(\cos t, \sin t, t)$ with $t \in [0, 4\pi]$. This is done in cylindrical coordinates with $ds = \sqrt{dr^2 + r^2\,d\theta^2 + dz^2}$. We know
\[ r\,dr = x\,dx + y\,dy = \cos t\,(-\sin t\,dt) + \sin t\,(\cos t\,dt) = 0, \quad dz = dt \]
and
\[ r^2\,d\theta = -y\,dx + x\,dy = -\sin t\,(-\sin t\,dt) + \cos t\,(\cos t\,dt) = dt \]
with $r^2 = \cos^2 t + \sin^2 t = 1$. Thus, $ds = \sqrt{2}\,dt$ and
\[ \int_C \rho(x, y, z)\,ds = \int_0^{4\pi} t\sqrt{2}\,dt = 8\sqrt{2}\,\pi^2. \]
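The centre of mass of a wire with density $\rho(x, y, z)$ is located at $(\bar{x}, \bar{y}, \bar{z})$ given by
the formulae
\[ \bar{x} = \frac{1}{m}\int_C x\,\rho(x, y, z)\, ds, \qquad \bar{y} = \frac{1}{m}\int_C y\,\rho(x, y, z)\, ds, \qquad \bar{z} = \frac{1}{m}\int_C z\,\rho(x, y, z)\, ds, \]
where $m$ is the mass of the wire. For a wire lying in a plane, we need only consider
$(\bar{x}, \bar{y})$ and $\rho(x, y)$.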
Example 4.2.14. Compute the centre of mass of a wire with constant density
$\rho(x, y) = 3$ of shape $C$ given by $r(t) = (\cos t, \sin t)$ with $t \in [0, \pi]$. We begin
by computing the mass using $ds = \sqrt{dr^2 + r^2\, d\theta^2}$ (it is also straightforward with
$ds = \sqrt{dx^2 + dy^2} = dt$). Because $C$ has constant radius, $dr = 0$ can be easily checked.
Now, $\tan\theta = \tan t$ so $\theta = t$, but the domain must be split at $t = \pi/2$. Therefore,
$d\theta = dt$ and
\[ m = \int_C 3\, ds = 3\int_C ds = 3\left(\int_0^{\pi/2} dt + \int_{\pi/2}^{\pi} dt\right) = 3(\pi/2 + \pi/2) = 3\pi. \]
Fig. 4.12. Wire in a semicircle shape and location of the centre of mass $(0, 2/\pi)$ in Example 4.2.14.
Then,
\[ \bar{x} = \frac{1}{3\pi}\int_C 3x\, ds = \frac{1}{\pi}\int_0^{\pi} \cos t\, dt = \frac{1}{\pi}\sin t\,\Big|_0^{\pi} = 0, \]
\[ \bar{y} = \frac{1}{3\pi}\int_C 3y\, ds = \frac{1}{\pi}\int_0^{\pi} \sin t\, dt = -\frac{1}{\pi}\cos t\,\Big|_0^{\pi} = \frac{2}{\pi}. \]
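As a numerical cross-check of Example 4.2.14, the sketch below evaluates the mass and centre of mass formulae directly from a parametrization. It is an illustration only: the helper names `integrate` and `wire_mass_and_centre` are hypothetical, and the integrals are approximated with a midpoint rule rather than computed in polar coordinates as in the text.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def wire_mass_and_centre(density, r, dr, a, b):
    """Return (mass, centre) of a wire r(t), t in [a, b], with the given density."""
    def speed(t):
        return math.sqrt(sum(c * c for c in dr(t)))  # ||r'(t)||

    mass = integrate(lambda t: density(*r(t)) * speed(t), a, b)
    centre = tuple(
        integrate(lambda t, k=k: r(t)[k] * density(*r(t)) * speed(t), a, b) / mass
        for k in range(len(r(a))))
    return mass, centre

# Semicircular wire of Example 4.2.14 with constant density 3.
mass, centre = wire_mass_and_centre(
    lambda x, y: 3.0,
    lambda t: (math.cos(t), math.sin(t)),
    lambda t: (-math.sin(t), math.cos(t)),
    0.0, math.pi)
print(mass, centre)
```

The printed values should be close to $3\pi$ and $(0, 2/\pi)$.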

Exercises
(1) Let $C$ be the curve given by $r(t) = (t, t^2)$ for $t \in [0, 1]$ and compute
\[ \int_C x\, ds. \]
(2) Let $C$ be the curve of intersection of the cone $z^2 = x^2 + y^2$ and the plane
$z = 2 - x - y$. Compute
\[ \int_C z\, ds. \]
(3) Find the mass and the centre of mass of the wire $C$ with density $\rho(x, y) = 1 + x + y$
with shape given by $r(t) = (1 - t)(1, 0) + t(2, 3)$.
(4) Let $C$ be the curve given by $r(t) = (e^t\cos t, e^t\sin t)$ with $t \in [0, 2\pi]$ and
compute
\[ \int_C \sqrt{x^2 + y^2}\, ds. \]
(5) Find the mass and centre of mass of the wire $C$ with density $\rho(x, y, z) = xy$
with shape given by $r(t) = (\cos(2t), \sin(2t), 3t)$ with $t \in [0, 1]$.
(6) Let $C$ be the curve given by $r(t) = ((1 - t)\cos t\sin t,\ (1 - t)\sin^2 t,\ (1 - t)\cos t)$
with $t \in [0, 2\pi]$. Set up the integral
\[ \int_C x\, ds. \]
4.2.3 Curvature
We denote the unit tangent vector to a curve $C$ given by a vector function $r(t)$ by
\[ T(t) := \frac{r'(t)}{\|r'(t)\|} \]
and look at its evolution as $t$ changes. In fact, we begin the discussion by assuming
that $r(s)$ is the arc-length parametrization of $C$, then $T(s)\cdot T(s) = 1$ and
\[ \frac{d}{ds}\big(T(s)\cdot T(s)\big) = 2\,T(s)\cdot\frac{dT(s)}{ds} = 0. \tag{4.10} \]
Fig. 4.13. Curve $C$ with unit tangent vector $T(s)$ and the vector $\frac{dT}{ds}$.
This means the vector $\frac{dT(s)}{ds}$ is perpendicular to the tangent vector, but need not be a unit vector. See Figure 4.13.
Definition 4.2.15. The curvature of a curve with the arc-length parametrization is
\[ \kappa = \left\|\frac{dT}{ds}\right\| \]
where $T$ is the unit tangent vector.
We check this definition with some intuitive examples.
Example 4.2.16. Consider two points $p, q \in \mathbb{R}^n$ and $C$ the line joining these points.
The arc-length parametrization is given by
\[ r(s) = \frac{(\|q - p\| - s)\,p + s\,q}{\|q - p\|} \]
with $s \in [0, \|q - p\|]$. Since $T(s) = r'(s) = (q - p)/\|q - p\|$ is constant,
\[ \frac{dT}{ds} = 0 \]
and the curvature $\kappa = 0$. This agrees with our intuition that a straight line has no
curvature.
Example 4.2.17. Consider the circle of radius $a$ given by its arc-length parametrization
$r(s) = (a\cos(s/a), a\sin(s/a))$ with $s \in [0, 2\pi a]$. Then,
\[ T(s) = r'(s) = (-\sin(s/a), \cos(s/a)) \quad\text{and}\quad \frac{dT}{ds} = \left(-a^{-1}\cos(s/a), -a^{-1}\sin(s/a)\right) \]
which means $\kappa = 1/a$. A circle has constant curvature which is the reciprocal of its
radius. Therefore, the larger the circle, the smaller its curvature. In plain words, this
means that for an arc of the same length on the small and large circles, the unit tangent
vector on the small circle has a greater variation of its direction as compared to the
larger circle; it is more curved!
Fig. 4.14. Comparison of the curvature of circles of radius 1 ($\kappa = 1$) and 10 ($\kappa = 1/10$).
As we have seen already several times, the arc-length parametrization is not always
computable and so we now obtain formulas for $\kappa$ in terms of general parametrizations
$r(t)$.
Theorem 4.2.18. Let $C$ be a curve given by $r(t)$ with $t \in [a, b]$. Then
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|} = \frac{\|r'(t)\times r''(t)\|}{\|r'(t)\|^3}. \]
Proof. Let $t = t(s)$ be the reparametrization between $r(t)$ and its arc-length
parametrization $r(s)$. Then, $ds/dt = \|r'(t)\|$. Recall that
\[ T(t) = \frac{r'(t)}{\|r'(t)\|} \]
and so
\[ \frac{dT}{ds} = T'(t)\,\frac{dt}{ds} = \frac{T'(t)}{ds/dt} = \frac{T'(t)}{\|r'(t)\|}. \]
Thus,
\[ \kappa = \frac{\|T'(t)\|}{\|r'(t)\|}. \]
The second formula is obtained by noticing that
\[ r'(t) = \|r'(t)\|\,T(t) = \frac{ds}{dt}\,T(t). \]
The rest of the proof is an exercise in computing various derivatives. First,
\[ r''(t) = \frac{d^2 s}{dt^2}\,T(t) + \frac{ds}{dt}\,T'(t). \]
Using the fact that $T(t)\cdot T'(t) = 0$, one can show that
\[ \|r'(t)\times r''(t)\| = \left(\frac{ds}{dt}\right)^2 \|T'(t)\|. \]
Isolating $\|T'(t)\|$ from this equation yields the result.
The formula for $\kappa$ in terms of $r'(t)$ and $r''(t)$ is typically easier to use.
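The step left to the reader in the proof can be sketched as follows. This is one possible way to fill it in, assuming the curve lies in $\mathbb{R}^3$ so that the cross product is defined; the shorthand $s' = ds/dt$ and $s'' = d^2s/dt^2$ is introduced here only for brevity.

```latex
\begin{aligned}
r'(t) \times r''(t) &= (s'\,T) \times (s''\,T + s'\,T') \\
                    &= s's''\,(T \times T) + (s')^2\,(T \times T') \\
                    &= (s')^2\,(T \times T'), \\
\|r'(t) \times r''(t)\| &= (s')^2\,\|T\|\,\|T'\|\,\sin(\pi/2)
                         = \left(\frac{ds}{dt}\right)^2 \|T'(t)\|.
\end{aligned}
```

Here $T \times T = 0$, and the angle between $T$ and $T'$ is $\pi/2$ because $T \cdot T' = 0$, while $\|T\| = 1$. Dividing by $(ds/dt)^2 = \|r'(t)\|^2$ and substituting into $\kappa = \|T'(t)\|/\|r'(t)\|$ gives the second formula of Theorem 4.2.18.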
Example 4.2.19. We compute the curvature of the parabola $C$ given by $r(t) = (t, t^2)$.
Begin with writing $C$ as $r(t) = (t, t^2, 0)$, then
\[ r'(t) = (1, 2t, 0) \quad\text{and}\quad r''(t) = (0, 2, 0) \]
and $r'(t)\times r''(t) = (0, 0, 2)$, so that
\[ \|r'(t)\times r''(t)\| = 2. \]
Now, $\|r'(t)\| = \sqrt{1 + 4t^2}$ which yields
\[ \kappa = \frac{2}{(1 + 4t^2)^{3/2}}. \]
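The cross-product formula of Theorem 4.2.18 is easy to evaluate numerically once $r'(t)$ and $r''(t)$ are known. The sketch below is illustrative only (the function names are assumptions, not from the text); it evaluates the formula for the parabola of Example 4.2.19 and for a circle of radius 10, each embedded in $\mathbb{R}^3$ with zero third component.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def curvature(dr, ddr, t):
    """kappa(t) = ||r'(t) x r''(t)|| / ||r'(t)||^3."""
    velocity, acceleration = dr(t), ddr(t)
    return norm(cross(velocity, acceleration)) / norm(velocity) ** 3

def parabola_curvature(t):
    # r(t) = (t, t^2, 0)
    return curvature(lambda s: (1.0, 2 * s, 0.0), lambda s: (0.0, 2.0, 0.0), t)

def circle_curvature(radius, t):
    # r(t) = (radius cos t, radius sin t, 0)
    return curvature(
        lambda s: (-radius * math.sin(s), radius * math.cos(s), 0.0),
        lambda s: (-radius * math.cos(s), -radius * math.sin(s), 0.0),
        t)

print(parabola_curvature(0.0), parabola_curvature(1.0), circle_curvature(10.0, 0.3))
```

The parabola is most curved at its vertex, where $\kappa = 2$, and the circle of radius 10 has curvature $1/10$ at every point, in agreement with Example 4.2.17.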
Exercises
(1) Compute the curvature of the following curves.
(a) $r(t) = (e^t\cos t, e^t\sin t)$
(b) $r(t) = (1 + t, 2 - 5t^2)$
(c) $r(t) = (t, t^2, t^3)$
(d) $r(t) = (\cos t, \sin t, t)$
(2) For the first two curves of the previous problem, plot on the same graph $r(t)$
and $\kappa(t)$.
(3) Show that for a curve given by $r(t) = (t, f(t))$, the curvature formula is
\[ \kappa = \frac{|f''(t)|}{(1 + (f'(t))^2)^{3/2}}. \]
4.3 Line integrals of vector fields
There is a tight link between vector fields and 1-forms. Let $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$,
then
\[ \{\text{Vector fields in } \mathbb{R}^n\} \longleftrightarrow \{\text{1-forms on } \mathbb{R}^n\} \]
\[ F(x) = (F_1(x), \ldots, F_n(x)) \longleftrightarrow \omega = F_1(x)\, dx_1 + \cdots + F_n(x)\, dx_n. \]
In particular, one can write
\[ \omega = (F_1(x), \ldots, F_n(x))\cdot(dx_1, \ldots, dx_n). \]
Because of the correspondence above, we can define a line integral of vector fields. If
$F$ is a vector field and $C$ is a curve given by $r(t)$ with $t \in [a, b]$, then we can evaluate
the contribution of $F$ along $C$ by the formula:
\[ F(r(t))\cdot r'(t) \]
for all $t \in [a, b]$.
Definition 4.3.1. The line integral of a vector field $F$ over a curve $C$ given by $r(t)$
with $t \in [a, b]$ is
\[ \int_a^b F(r(t))\cdot r'(t)\, dt. \]
Using (3.6), we obtain the notation
\[ \int_a^b F(r(t))\cdot r'(t)\, dt = \int_C F\cdot dr = \int_C F\cdot T\, ds \]
where $T = r'(t)/\|r'(t)\|$.
Example 4.3.2. Consider the vector field $F(x, y) = (3x + xy, 5xy)$ and let $C$ be
the curve given by $r(t) = (t^3, -2t^2)$ with $t \in [0, 1]$. Then, the line integral of $F$ over
$C$ is obtained as follows. Compute
\[ F(r(t)) = (3(t^3) + t^3(-2t^2),\ 5t^3(-2t^2)) = (3t^3 - 2t^5,\ -10t^5) \]
and so
\[ F(r(t))\cdot r'(t) = (3t^3 - 2t^5, -10t^5)\cdot(3t^2, -4t) = 9t^5 - 6t^7 + 40t^6. \]
Therefore,
\[ \int_C F\cdot dr = \int_0^1 (9t^5 - 6t^7 + 40t^6)\, dt = \frac{3}{2} - \frac{3}{4} + \frac{40}{7} = \frac{181}{28}. \]
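Definition 4.3.1 translates directly into a short numerical routine. The sketch below is an illustration, not the author's method: it takes $F(x, y) = (3x + xy, 5xy)$ and $r(t) = (t^3, -2t^2)$, the choice consistent with the integrand $9t^5 - 6t^7 + 40t^6$ of Example 4.3.2, and approximates $\int_a^b F(r(t))\cdot r'(t)\, dt$ with a midpoint rule. The helper name `vector_line_integral` is hypothetical.

```python
def vector_line_integral(F, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        # Dot product of the field at r(t) with the velocity r'(t).
        total += sum(Fk * vk for Fk, vk in zip(F(*r(t)), dr(t)))
    return total * h

work = vector_line_integral(
    lambda x, y: (3 * x + x * y, 5 * x * y),
    lambda t: (t ** 3, -2 * t ** 2),
    lambda t: (3 * t ** 2, -4 * t),
    0.0, 1.0)
print(work, 181 / 28)
```

The two printed numbers should agree to several decimal places.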
Definition 4.3.3. The following two definitions are in direct correspondence.
(1) A 1-form $\omega$ on $U \subset \mathbb{R}^n$ is exact if there exists a differentiable function
$f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that $\omega = df$.
(2) A vector field $F(x)$ defined on $U \subset \mathbb{R}^n$ is said to be gradient or conservative if
there exists a differentiable function $f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that $F(x) = \nabla f(x)$.
In this case, $f$ is called the potential function.
Fig. 4.15. Left: a simple closed curve. Right: a closed curve which is not simple because of the
self-intersection.
The name conservative comes from physics where force fields are given by vector
fields. In the case of conservative vector fields, we show at the end of the section the
Principle of Conservation of Energy, which justifies the use of the term conservative.
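Example 4.3.4. Let $\omega = ye^{xy}\, dx + xe^{xy}\, dy$. We show that $\omega$ is exact with $\omega = df$;
that is,
\[ \frac{\partial f}{\partial x} = ye^{xy} \quad\text{and}\quad \frac{\partial f}{\partial y} = xe^{xy}. \]
We integrate
\[ f(x, y) = \int \frac{\partial f}{\partial x}\, dx = \int ye^{xy}\, dx = e^{xy} + g(y). \]
Therefore,
\[ \frac{\partial f}{\partial y} = xe^{xy} + g'(y) = xe^{xy} \]
so $g'(y) = 0$ which implies $g(y) = K$ is a constant. Thus, $f(x, y) = e^{xy} + K$, where
$K$ is a constant of integration.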
For exact 1-forms (and conservative vector fields), the function $f$ such that $\omega = df$
is called an antiderivative in analogy with the case of functions of one variable, and
so the integration of $\omega$ is done directly using the antiderivative. This is the content
of the next result which is one of the cornerstones of advanced calculus.
Before we write the statement, we need to introduce the concept of simple closed
curve. A curve $C$ is closed if it can be parametrized by $r(t)$ with $t \in [a, b]$ such that
$r(a) = r(b)$. A closed curve $C$ is simple if it has no self-intersection. See Figure 4.15.
Theorem 4.3.5. [Fundamental Theorem of Calculus for Line Integrals] Consider
\[ \omega = \sum_{i=1}^n a_i(x)\, dx_i \longleftrightarrow F(x) = (a_1(x), \ldots, a_n(x)) \]
with $\omega = df$ and so $F(x) = \nabla f(x)$. Let $C$ be a curve given by $r(t)$ with parameter
$t \in [a, b]$, then
(a) \[ \int_C \omega = f(r(b)) - f(r(a)) = \int_C F\cdot dr. \]
(b) If $C$ and $C'$ are two curves joining $p \in \mathbb{R}^n$ to $q \in \mathbb{R}^n$, then
\[ \int_C \omega = \int_{C'} \omega \quad\text{and}\quad \int_C F\cdot dr = \int_{C'} F\cdot dr. \]
The integral is independent of the path joining $p$ and $q$.
(c) If $C$ is a simple closed curve, then
\[ \oint_C \omega = 0 = \oint_C F\cdot dr, \]
where $\oint$ is the symbol used to emphasize that the line integral is taken over a
simple closed curve.
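Proof. (a) Because $\omega$ is exact, by definition there exists a differentiable function $f$
such that $\omega = df$. Let $r(t) = (x_1(t), \ldots, x_n(t))$, then
\[
\begin{aligned}
r^*\omega &= r^*\left(\frac{\partial f}{\partial x_1}(x)\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}(x)\, dx_n\right) \\
&= \frac{\partial f}{\partial x_1}(r(t))\,x_1'(t)\, dt + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\, dt \\
&= \left(\frac{\partial f}{\partial x_1}(r(t))\,x_1'(t) + \cdots + \frac{\partial f}{\partial x_n}(r(t))\,x_n'(t)\right) dt \\
&= \nabla f(r(t))\cdot r'(t)\, dt \\
&= \frac{d}{dt} f(r(t))\, dt
\end{aligned}
\]
where the last equality follows from the chain rule. Thus,
\[ \int_C \omega = \int_a^b \frac{d}{dt} f(r(t))\, dt = f(r(b)) - f(r(a)). \]
(b) Because $C$ and $C'$ have the same endpoints, the integrals are equal by part (a).
(c) A simple closed curve is such that $r(b) = r(a)$ and so the integral is zero.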
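Path independence can be observed numerically for the exact form of Example 4.3.4. The sketch below is an illustration under stated assumptions: it uses the potential $f(x, y) = e^{xy}$, so $F = \nabla f = (ye^{xy}, xe^{xy})$, and two paths from $(0, 0)$ to $(1, 1)$ chosen here for convenience (a straight segment and a parabolic arc). Both integrals should equal $f(1, 1) - f(0, 0) = e - 1$.

```python
import math

def vector_line_integral(F, r, dr, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(r(t)) . r'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(Fk * vk for Fk, vk in zip(F(*r(t)), dr(t)))
    return total * h

def grad_f(x, y):
    # Gradient of the potential f(x, y) = exp(x*y).
    return (y * math.exp(x * y), x * math.exp(x * y))

# Two different paths joining (0, 0) to (1, 1).
along_segment = vector_line_integral(
    grad_f, lambda t: (t, t), lambda t: (1.0, 1.0), 0.0, 1.0)
along_parabola = vector_line_integral(
    grad_f, lambda t: (t, t * t), lambda t: (1.0, 2 * t), 0.0, 1.0)

potential_difference = math.exp(1.0) - 1.0
print(along_segment, along_parabola, potential_difference)
```

All three printed values agree up to the discretization error, as parts (a) and (b) of the theorem predict.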
Remark 4.3.6. In particular, the last implication means
\[ \omega \text{ is exact} \implies \oint_C \omega = 0. \]
Example 4.3.7. Consider the 1-form
\[ \omega = \frac{-y\, dx + x\, dy}{x^2 + y^2}. \]
In polar coordinates, $\omega = d\theta$. Consider now a circle $C$ of radius $a$ centered at $(0, 0)$
and compute
\[ \oint_C \omega = \int_0^{2\pi} d\theta = 2\pi \neq 0. \]
Therefore, $\omega$ is not exact.
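The integral of this 1-form over a circle can also be approximated directly in Cartesian coordinates. The sketch below is illustrative (the function name and the test circles are assumptions): it integrates $(-y\, dx + x\, dy)/(x^2 + y^2)$ over a counterclockwise circle with a given centre and radius.

```python
import math

def integrate_omega_on_circle(cx, cy, radius, n=20000):
    """Approximate the integral of (-y dx + x dy)/(x^2 + y^2) over the
    counterclockwise circle of the given radius centred at (cx, cy).
    The circle must not pass through the origin."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t)  # x'(t)
        dy = radius * math.cos(t)   # y'(t)
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total * h

around_origin = integrate_omega_on_circle(0.0, 0.0, 3.0)
away_from_origin = integrate_omega_on_circle(2.0, 0.0, 1.0)
print(around_origin, away_from_origin)
```

A circle surrounding the origin gives approximately $2\pi$, whatever its radius, while a circle that does not surround the origin gives approximately 0.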
In Chapters 6 and 8, we investigate in more detail the question of exactness of
1-forms and establish a necessary and sufficient condition for a 1-form to be (locally)
exact. Example 4.3.7 is crucial in the understanding of this problem.
Example from Physics

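We now look at applications of the Fundamental Theorem of Calculus for line
integrals in physics.
Example 4.3.8. The radial gravitational force field given by
\[ F(x) = -\frac{mMG}{r^2}\,\frac{x}{\|x\|} \]
where $x = (x, y, z)$ and $r = \|x\|$ is conservative. The potential function is:
\[ f(x) = \frac{mMG}{\|x\|}. \]
Indeed, since $\|x\| = \sqrt{x^2 + y^2 + z^2}$,
\[ \nabla f(x) = -\frac{mMG}{r^2}\left(\frac{x}{\sqrt{x^2 + y^2 + z^2}}, \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \frac{z}{\sqrt{x^2 + y^2 + z^2}}\right) = F(x). \]
We compute the work done by $F$ on the following paths.
(1) $C$ given by $r(t) = (\cos t\sin t, \sin^2 t, \cos t)$ with $t \in [0, \pi/4]$. Because $F$ is
conservative,
\[
\begin{aligned}
\int_C F\cdot dr &= f(r(\pi/4)) - f(r(0)) \\
&= f\left(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{\sqrt{2}}{2}\right) - f(0, 0, 1) \\
&= mMG\left(\frac{1}{\sqrt{\tfrac{1}{4} + \tfrac{1}{4} + \tfrac{1}{2}}} - \frac{1}{\sqrt{0^2 + 0^2 + 1^2}}\right) \\
&= 0,
\end{aligned}
\]
because both endpoints lie on the unit sphere.
(2) $C$ is given by $r(t) = (\cos t, \sin t, \cos(3t))$ with $t \in [0, 2\pi]$. Then,
\[ \int_C F\cdot dr = f(r(2\pi)) - f(r(0)) = f(1, 0, 1) - f(1, 0, 1) = 0. \]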
Consider the work done by a force $F$ on an object moved between $p$ and $q$ in $\mathbb{R}^n$. If
$F$ is conservative, then by Theorem 4.3.5(b), the work done by $F$ is independent of
the path chosen. Suppose that an object of mass $m$ is moved on a path $C$ given by
$r(t)$ between $p = r(a)$ and $q = r(b)$ in $\mathbb{R}^3$. The force applied on the object at $r(t)$ is
\[ F(r(t)) = m\,r''(t) \]
by Newton's second law. The force $F$ is not necessarily conservative, but the work
1-form $dW = F\cdot(dx, dy, dz)$ on $r(t)$ is
\[ r^* dW = F(r(t))\cdot dr = m\,r''(t)\cdot r'(t)\, dt = \frac{m}{2}\frac{d}{dt}\|r'(t)\|^2\, dt \]
and integrating we have
\[ \int_C dW = \frac{m}{2}\int_a^b \frac{d}{dt}\|r'(t)\|^2\, dt = \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right). \]
The quantity $\frac{1}{2}m\|r'(t)\|^2$ is called the kinetic energy of the object and is denoted
$K(r(t))$. This gives us the following result.
Theorem 4.3.9. The work done by a force $F$ in moving an object of mass $m$ on a
path $C$ between points $p, q \in \mathbb{R}^n$ is the difference in the kinetic energy at points $p$
and $q$.
In the case of a conservative force $F$ with potential function $f$, we know furthermore
that
\[ \int_C F\cdot dr = f(r(b)) - f(r(a)). \tag{4.11} \]
The potential energy of an object located at $p$ in a conservative field $F$ is $P(p) = -f(p)$
and therefore, $F(p) = -\nabla P(p)$. Therefore, (4.11) becomes
\[ \int_C F\cdot dr = P(r(a)) - P(r(b)). \]
Together with Theorem 4.3.9 we have
\[ \frac{m}{2}\left(\|r'(b)\|^2 - \|r'(a)\|^2\right) = P(r(a)) - P(r(b)). \]
We define $E(p) := K(p) + P(p)$ to be the total energy at $p$. Rearranging the terms,
we obtain
\[ E(r(b)) = K(r(b)) + P(r(b)) = K(r(a)) + P(r(a)) = E(r(a)). \]
This leads us to one of the Fundamental Laws of Physics.
Theorem 4.3.10. [Principle of Conservation of Energy] In a conservative field $F$,
the total energy is conserved along any path $C$ joining two points $p$ and $q$.
Exercises
(1) Compute the line integrals of vector fields $F$ over the curves $C$ given below.
(a) $F(x, y) = (x\sin(y), \cos(y))$ where $C$ is given by $r(t) = (\cos(t), t)$ with
$t \in [0, 2\pi]$.
(b) $F(x, y, z) = (y^2e^x, z\cos(yz), xz)$ where $C$ is given by $r(t) = (\ln(t), t, 2t)$
and $t \in [1, 3]$.
(2) Show that the line integrals below are independent of path and compute the
integrals.
(a) $\int_C (2xy + z^2)\, dx + (x^2 - 2z^2)\, dy + (2xz - 4yz)\, dz$ where $C$ is any path from
$(1, 2, 0)$ to $(0, 2, 3)$.
(b) $\int_C (e^x\sin(y) + 3x^2)\, dx + (e^x\cos(y) - e^y)\, dy$ where $C$ is any path from $(0, 0)$
to $(1, 0)$.
(3) Let $\omega = (-y\, dx + x\, dy)/(x^2 + y^2)$ and show the following integrals.
(a) $\frac{1}{2\pi}\oint_C \omega = 1$ where $C$ is the square with vertices $(1, 1)$, $(-1, 1)$, $(-1, -1)$,
$(1, -1)$ in counterclockwise direction.
(b) $\frac{1}{2\pi}\oint_C \omega = 0$ where $C$ is the circle of radius 1 centered at $(2, 0)$ in the
counterclockwise direction.
(c) $\frac{1}{2\pi}\oint_C \omega = n$ where $C$ is given by $r(t) = (\cos(nt), \sin(nt))$ with $t \in [0, 2\pi]$
and $n \in \mathbb{N}$ is a fixed positive integer.
(4) From the previous problem, we see that if $\omega = (-y\, dx + x\, dy)/(x^2 + y^2)$, then
$\omega$ is either non-exact, or could be exact if the integral $\frac{1}{2\pi}\oint_C \omega$ is zero. Notice that
$\omega$ is not defined at $(0, 0)$ and in the first and third integral, the curve $C$ surrounds
$(0, 0)$, but not the second curve. Consider $\tilde{\omega} = (-y\, dx + (x - 2)\, dy)/((x - 2)^2 + y^2)$
and show that
(a) $\frac{1}{2\pi}\oint_C \tilde{\omega} = 1$ where $C$ is the circle of radius 1 centered at $(2, 0)$.
(b) $\frac{1}{2\pi}\oint_C \tilde{\omega} = 0$ where $C$ is given by $r(t) = (\cos(nt), \sin(nt))$ with $t \in [0, 2\pi]$.
What is your conclusion? (Hint: use the substitution $u = x - 2$).
(5) Consider the electric force field
x
F (x) = qQ .
||x||3
generated by a charged particle Q located at the origin on an electric charge
q located at x = (x, y, z). Suppose an electron is located at the origin with
charge Q = 1.6 1019 Coulomb and a positive charge q > 0 is located at
(x, y, z) = (1012 , 0, 0) (in meters). Find the work done by the electric force field
on the positive charge as it moves to the location (x, y, z) = (0, 108 , 0) (use the
value = 8.985 109 ).
5 Differential Calculus of Mappings
The goal of this chapter is to generalize the basic results of differential calculus to
mappings. We begin by distinguishing between functions (mappings) and the graphs
of functions. This is often blurred when studying functions $f : \mathbb{R} \to \mathbb{R}$, but for general
mappings it is an essential step. We go on to discuss various ways of visualizing
some types of mappings. The following section deals with the concepts of limit and
continuity. The definitions are similar to the case of real functions of a single variable;
however, examples have much more exotic properties. We continue with the concept of
best linear approximation, introduced previously for vector functions, which we now
apply to general mappings and use to obtain a definition of derivative. We present
the main properties of the derivative, in particular, the Chain Rule. We conclude
this chapter with higher derivatives and Taylor expansions.
5.1 Graphs and Level Sets
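We begin the study of local properties of general functions or mappings $f : \mathbb{R}^n \to \mathbb{R}^m$
where $n, m \geq 1$. We write those mappings as column vectors. For instance
(1) $r : \mathbb{R} \to \mathbb{R}^3$:
\[ r(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix}. \]
(2) $f : \mathbb{R}^2 \to \mathbb{R}^2$:
\[ f(x, y) = \begin{pmatrix} f_1(x, y) \\ f_2(x, y) \end{pmatrix} \]
where $f_1, f_2 : \mathbb{R}^2 \to \mathbb{R}$.
(3) $f : \mathbb{R}^5 \to \mathbb{R}^3$:
\[ f(x_1, x_2, x_3, x_4, x_5) = \begin{pmatrix} f_1(x_1, x_2, x_3, x_4, x_5) \\ f_2(x_1, x_2, x_3, x_4, x_5) \\ f_3(x_1, x_2, x_3, x_4, x_5) \end{pmatrix} \]
where $f_1, f_2, f_3 : \mathbb{R}^5 \to \mathbb{R}$.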
We now define explicitly the graph of a function. In the case of functions $f : \mathbb{R} \to \mathbb{R}$
or $f : \mathbb{R}^2 \to \mathbb{R}$, the graph of $f$ can be easily visualized and the function and its graph
are often thought of as one and the same. For vector functions of several variables,
this is not feasible and we must then make sure that the concepts of a function and
its graph are well-understood in their own rights.
Definition 5.1.1. Let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}^m$ be a mapping. The
graph of $f$ is the set
\[ \operatorname{graph}(f) = \{(x, f(x)) \mid x \in U\}. \]
We look at a few examples.
Example 5.1.2. Consider a function $f : [a, b] \to \mathbb{R}$, then
\[ \operatorname{graph}(f) = \{(x, f(x)) \mid x \in [a, b]\}. \]
We can plot the points of $\operatorname{graph}(f)$ in the plane and this gives us the familiar
representation of a curve over an interval in the plane as shown in Figure 5.1.
Fig. 5.1. Graph of a function $f : [a, b] \to \mathbb{R}$.
Let $U \subset \mathbb{R}^2$ and $f : U \to \mathbb{R}$, then
\[ \operatorname{graph}(f) = \{(x, y, f(x, y)) \mid (x, y) \in U\}. \]
As shown in Figure 5.2, we can plot $\operatorname{graph}(f)$ in a representation of
three-dimensional space.
Fig. 5.2. Graph of a function $f : U \subset \mathbb{R}^2 \to \mathbb{R}$.
In Chapter 1, we plot several curves in the plane given by vector functions
$r : \mathbb{R} \to \mathbb{R}^2$. The graph of $r$ can also often be visualized in three-dimensional space, as
seen in Figure 5.3.
Fig. 5.3. Graph of a vector function in $\mathbb{R}^3$; the curve shown is $r(t) = (t\cos(t)\sin(t),\ t\sin^2(t),\ t\cos(t))$.
Graphical Representations
The previous examples show visualizations of graphs of functions and mappings.
Those can be obtained for $f : \mathbb{R}^n \to \mathbb{R}^m$ with $n + m \leq 3$. We can complement the
graphical representations with other visualizations in the case of vector functions
$r : [a, b] \to \mathbb{R}^3$ and vector fields $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $f : \mathbb{R}^3 \to \mathbb{R}^3$ as presented in
previous chapters.
Another approach to obtain information about mappings (and their graph) for
functions of several variables is through level sets. The traces of conics of Chapter
1 are examples of level sets and we briefly mention level set curves in Section 3.2.3.
We now formalize the concept.
Definition 5.1.3. Let $U \subset \mathbb{R}^n$, $f : U \to \mathbb{R}$. The level sets of $f$ are given by
\[ \{(x_1, \ldots, x_n) \in U \mid f(x_1, \ldots, x_n) = c\}, \tag{5.1} \]
where $c \in \mathbb{R}$. We denote the set (5.1) by $f^{-1}(c)$ and it is called the inverse image of
$c$ by $f$. In particular, $f^{-1}(c)$ can be the empty set.
Note that the definition of inverse image is similar to the inverse of a function; in
fact, if a function is invertible, its inverse image consists of isolated points. We now
look at a few examples.
Example 5.1.4. The Monkey Saddle is a surface given by the graph of $f(x, y) = x^3 - 3xy^2$,
see Figure 5.4. We look at its level sets by computing the inverse image
$f^{-1}(c)$ for various values of $c$. It is typical, as a first step, to consider $c = 0$ and one
value of $c < 0$ and $c > 0$. We have for $c = 0$ that $f(x, y) = x(x^2 - 3y^2) = 0$ and this
holds for $x = 0$ and $x = \pm\sqrt{3}\,y$. Therefore,
$f^{-1}(0)$ is made up of three lines through $(0, 0)$ at angles of $2\pi/3$. For $c \neq 0$ the task
is more difficult. See Figure 5.5 for some level set curves.
Fig. 5.4. Monkey saddle surface.
Fig. 5.5. Level sets $f^{-1}(-1)$, $f^{-1}(0)$ and $f^{-1}(1)$ of the Monkey saddle surface.
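The description of $f^{-1}(0)$ can be spot-checked by evaluating $f$ on the three lines. The sketch below is a small illustration; the function names and the tolerance are assumptions made here, and a floating-point tolerance is needed because $\sqrt{3}$ is not represented exactly.

```python
import math

def monkey_saddle(x, y):
    """f(x, y) = x^3 - 3 x y^2."""
    return x ** 3 - 3 * x * y ** 2

def on_zero_level_set(x, y, tol=1e-9):
    """True when (x, y) lies on f^{-1}(0) up to the tolerance tol."""
    return abs(monkey_saddle(x, y)) < tol

sample_heights = [-2.0, -1.0, 0.5, 1.5]
line_points = (
    [(0.0, y) for y in sample_heights]                    # the line x = 0
    + [(math.sqrt(3) * y, y) for y in sample_heights]     # the line x = sqrt(3) y
    + [(-math.sqrt(3) * y, y) for y in sample_heights])   # the line x = -sqrt(3) y

print(all(on_zero_level_set(x, y) for x, y in line_points))
print(on_zero_level_set(1.0, 0.0))  # f(1, 0) = 1, so this point is on f^{-1}(1)
```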
Example 5.1.5. The motion of a pendulum without friction has total energy given
by
\[ E(x, y) = \frac{1}{2}y^2 - \cos(x). \]
Figure 5.6 shows the surface given by $E$ and Figure 5.7, level sets for $c = -1$,
$c = -1/2$ and $c = 2$. We see a transition from isolated points (at $x = 2k\pi$), to
circles, and finally to curves extending to $\pm\infty$ in $x$.
Example 5.1.6. Let $x = (x_1, x_2)$ be coordinates on the plane and $y = (y_1, y_2)$ the
velocity of the particle at $x$. The harmonic oscillator in the plane has total energy
given by
\[ H(x, y) = \frac{1}{2}(y\cdot y + x\cdot x) \]
where $y\cdot y/2$ is the kinetic energy and $x\cdot x/2$ is the potential energy.
Fig. 5.6. Surface given by the energy $E$ of the pendulum without friction.
The level sets of $H$ are given by
\[ H^{-1}(h) = \{(x, y) \in \mathbb{R}^4 \mid \tfrac{1}{2}(y\cdot y + x\cdot x) = h\}. \]
For $h > 0$ we let $2h = k^2$ for $k > 0$. The level sets $H^{-1}(h)$ correspond to
three-dimensional spheres of radius $k$ since they satisfy an equation analogous to the equation of
a sphere in $\mathbb{R}^3$, but with an additional variable. For $h = 0$, the level set is only the
origin $(0, 0, 0, 0)$, while for $h < 0$ the level set is empty.
Exercises
(1) Write the expression for the graph of the functions.

(a) $f(x, y, z) = (x^2, xy, yz^3)^T$.
(b) $g(A) = \det(A)$ where $A$ is a $2 \times 2$ matrix.
(2) Write the level sets of the functions and give a geometrical description.
(a) g(x, y) = cos x cos y
(b) f (x, y, z) = 2x2 + y 2 + z 2
5.2 Limits and Continuity
In this section we extend the concepts of limit and continuity to general mappings.
The computation of limits for functions of several variables is much trickier than for
functions $f : \mathbb{R} \to \mathbb{R}$.
Fig. 5.7. Level sets $E^{-1}(-1)$, $E^{-1}(-\tfrac{1}{2})$ and $E^{-1}(2)$ of the energy $E$ of the pendulum without friction.
Example 5.2.1. Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ f(x, y) = \begin{cases} x & \text{if } y = 0 \\ y & \text{if } x = 0 \\ 1 & \text{otherwise} \end{cases} \]
and we look at various approaches towards $(0, 0)$. Consider the path $(x(t), y(t)) = (t, t)$, then
\[ \lim_{(t, t)\to(0, 0)} f(x, y) = 1 \]
because $f(x, y) = 1$ on the path $(t, t)$ for $t \neq 0$. But
\[ \lim_{(x, 0)\to(0, 0)} f(x, y) = 0. \]
Therefore, the limit cannot exist at $(0, 0)$.
The definition of limit for mappings, therefore, needs to consider all possible
approaches of $x$ towards $x_0$.
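Definition 5.2.2. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping, then the limit of $f$ at $x_0 \in \mathbb{R}^n$
exists if there exists a unique $\ell \in \mathbb{R}^m$ such that
\[ \lim_{x\to x_0} f(x) = \ell, \]
which means
\[ \forall \epsilon > 0,\ \exists \delta > 0 \text{ such that } \|x - x_0\|_{\mathbb{R}^n} < \delta \implies \|f(x) - \ell\|_{\mathbb{R}^m} < \epsilon. \]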
Example 5.2.3. We show using the definition that
\[ \lim_{(u, v)\to(0, 0)} |u| + |v| = 0. \]
We begin by choosing some $\epsilon > 0$. The function in our case is $f : \mathbb{R}^2 \to \mathbb{R}$ defined
by $f(u, v) = |u| + |v|$ and
\[ \|f(x) - \ell\|_{\mathbb{R}^1} = \big||u| + |v| - 0\big| = |u| + |v|. \]
We choose $(u, v)$ such that $\|(u, v) - (0, 0)\|_{\mathbb{R}^2} = \sqrt{u^2 + v^2} < \delta$ where $\delta$ is not yet
fixed with respect to $\epsilon$. Note that the following inequalities hold for all $u$ and $v$:
\[ |u| \leq \sqrt{u^2 + v^2} \quad\text{and}\quad |v| \leq \sqrt{u^2 + v^2}. \]
Therefore,
\[ \|f(x) - \ell\|_{\mathbb{R}^1} = |u| + |v| \leq 2\sqrt{u^2 + v^2} < 2\delta \]
and setting $\delta = \epsilon/2$ we obtain $\|f(x) - \ell\|_{\mathbb{R}^1} < \epsilon$. This completes the proof.
As seen in Example 5.2.1, in order to show that a limit does not exist, it is often a
good strategy to determine a direction of approach to the limiting point for which
the one-dimensional limit does not exist or find two approaches which yield different
values for the limit. We illustrate this method with the following two examples.
Example 5.2.4. We check whether
\[ \lim_{(x, y)\to(0, 0)} \frac{xy}{3x^2 + 2y^2} \]
exists. Substituting $(0, 0)$ into the formula yields $0/0$ and this is an indeterminate
form. However, it cannot be resolved using l'Hôpital's rule as it is only valid for
functions of one variable. Consider the line $y = mx$ and substitute in the expression;
we obtain
\[ \frac{xy}{3x^2 + 2y^2} = \frac{mx^2}{3x^2 + 2m^2x^2} = \frac{m}{3 + 2m^2}. \]
Therefore,
\[ \lim_{(x, y)\to(0, 0)} \frac{xy}{3x^2 + 2y^2} = \lim_{(x, mx)\to(0, 0)} \frac{m}{3 + 2m^2} = \frac{m}{3 + 2m^2}, \]
which means the value of the limit depends on the line of approach towards the origin.
We must then conclude that the limit does not exist.
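The dependence on the slope $m$ is easy to observe numerically. The sketch below is illustrative (the function names and the sample point `x = 1e-8` are assumptions): it evaluates the quotient at a point very close to the origin on several lines $y = mx$ and compares the result with $m/(3 + 2m^2)$.

```python
def quotient(x, y):
    """The function xy / (3x^2 + 2y^2), defined away from the origin."""
    return x * y / (3 * x ** 2 + 2 * y ** 2)

def value_along_line(m, x=1e-8):
    """Value of the quotient at the point (x, m*x) on the line y = m*x."""
    return quotient(x, m * x)

def predicted_limit(m):
    return m / (3 + 2 * m ** 2)

slopes = [-1.0, 0.0, 1.0, 2.0]
for m in slopes:
    print(m, value_along_line(m), predicted_limit(m))
```

Different slopes give different values arbitrarily close to the origin, which is exactly why the two-variable limit fails to exist.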
We now look at a mapping example.
Example 5.2.5. Consider $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}^2$ defined by
\[ f(x, y) = \frac{(x, y)^T}{\|(x, y)\|}. \]
We show that the limit as $(x, y) \to (0, 0)$ does not exist. Let $y = mx$ with $x > 0$ and substitute
in the expression
\[
\begin{aligned}
\lim_{(x, y)\to(0, 0)} \frac{(x, y)^T}{\|(x, y)\|} &= \lim_{(x, mx)\to(0, 0)} \frac{(x, mx)^T}{\|(x, mx)\|} \\
&= \lim_{(x, mx)\to(0, 0)} \frac{(x, mx)^T}{x\sqrt{1 + m^2}} \\
&= \lim_{(x, mx)\to(0, 0)} \frac{(1, m)^T}{\sqrt{1 + m^2}}.
\end{aligned}
\]
Again, the vector value of the limit depends on the line of approach to the origin, so
the limit does not exist.
This example is similar to the case of the function
\[ f(x) = \begin{cases} \dfrac{x}{|x|} & x \neq 0 \\ 0 & x = 0, \end{cases} \]
which is illustrated in Figure 5.8. It is clear that the function does not have a
limit at $x = 0$.
Fig. 5.8. Function $f(x)$ with a jump discontinuity at $x = 0$.
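For functions of several variables, $f : \mathbb{R}^n \to \mathbb{R}$, the properties of the limit are the
same as the ones already known for functions of one variable. Those are listed in the
next result.
Proposition 5.2.6. Let $f, g : \mathbb{R}^n \to \mathbb{R}$, $x_0 \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, $L_1, L_2 \in \mathbb{R}$ and suppose
\[ \lim_{x\to x_0} f(x) = L_1 \quad\text{and}\quad \lim_{x\to x_0} g(x) = L_2. \]
Then,
(1) (Linearity Property)
\[ \lim_{x\to x_0} \alpha f(x) + g(x) = \alpha L_1 + L_2. \]
(2)
\[ \lim_{x\to x_0} f(x)g(x) = L_1L_2. \]
(3) If $L_2 \neq 0$, then
\[ \lim_{x\to x_0} \frac{f(x)}{g(x)} = \frac{L_1}{L_2}. \]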
Proof. We only do the second case and leave the other ones as exercises at the
end of the section. We must use the definition of limit. Let $\epsilon' > 0$ and consider
$|f(x)g(x) - L_1L_2|$. Then,
\[
\begin{aligned}
|f(x)g(x) - L_1L_2| &= |f(x)g(x) - L_1g(x) + L_1g(x) - L_1L_2| \\
&= |(f(x) - L_1)g(x) + L_1(g(x) - L_2)| \\
&\leq |(f(x) - L_1)g(x)| + |L_1(g(x) - L_2)| \\
&= |f(x) - L_1||g(x)| + |L_1||g(x) - L_2|.
\end{aligned}
\]
Let $\epsilon > 0$. Because the limit of $g(x)$ as $x \to x_0$ exists, there exists $\delta_1 > 0$ such that if $\|x - x_0\| < \delta_1$
then $|g(x) - L_2| < \epsilon$. Rearranging this last inequality we obtain $-\epsilon + L_2 < g(x) < \epsilon + L_2$,
thus $|g(x)| < \epsilon + |L_2|$ for all $x$ with $\|x - x_0\| < \delta_1$. For the same
$\epsilon > 0$, there exists $\delta_2 > 0$ such that if $\|x - x_0\| < \delta_2$ then $|f(x) - L_1| < \epsilon$. Therefore, choosing $x$ such that
$\|x - x_0\| < \min(\delta_1, \delta_2)$ we have
\[ |f(x) - L_1||g(x)| + |L_1||g(x) - L_2| < \epsilon(\epsilon + |L_2| + |L_1|). \]
Choosing $\epsilon$ so that $\epsilon(\epsilon + |L_2| + |L_1|) = \epsilon'$, the proof is complete.
As it is done for functions of one variable, we use the limit to introduce the concept
of continuity of a function. The exact definition follows.
Definition 5.2.7. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping. Then $f$ is continuous at $x_0 \in \mathbb{R}^n$
if $f(x_0)$ is defined and
\[ \lim_{x\to x_0} f(x) = f(x_0). \]
Example 5.2.8. Consider $f(x, y) = (|x| + |y|, |x| - |y|)^T$ and compute
\[ \lim_{(x, y)\to(0, 0)} f(x, y) = (0, 0)^T = f(0, 0). \]
We now use the definition to prove this statement. Let $\epsilon > 0$. The first step is to
estimate $f(x, y) - (0, 0)^T$ in the Euclidean norm on $\mathbb{R}^2$:
\[
\begin{aligned}
\|f(x, y) - (0, 0)^T\| &= \sqrt{(|x| + |y|)^2 + (|x| - |y|)^2} \\
&= \sqrt{2(|x|^2 + |y|^2)} \\
&= \sqrt{2}\sqrt{x^2 + y^2} = \sqrt{2}\,\|(x, y)\|.
\end{aligned}
\]
If $(x, y)$ is chosen so that $\|(x, y)\| < \delta$ and we choose $\delta = \epsilon/\sqrt{2}$ then
\[ \|f(x, y) - (0, 0)^T\| = \sqrt{2}\,\|(x, y)\| < \sqrt{2}\,\delta = \epsilon \]
and the proof is complete.

The properties of continuous functions of several variables follow directly from the
properties of limits seen in Proposition 5.2.6.
Proposition 5.2.9. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be continuous at $x_0$. Then, $\alpha f(x) + g(x)$ and
$f(x)g(x)$ are continuous at $x_0$. Moreover, if $g(x_0) \neq 0$, then $f(x)/g(x)$ is continuous
at $x_0$.
Proof. By setting $L_1 = f(x_0)$ and $L_2 = g(x_0)$ the proof of Proposition 5.2.6 yields
the result.
We now look at properties of limits and continuity of general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$.
Of course, the multiplication and division properties of limit do not have an analogue
if $m > 1$. However, the linearity property still holds: if $f, g : \mathbb{R}^n \to \mathbb{R}^m$, $v_1, v_2 \in \mathbb{R}^m$,
and
\[ \lim_{x\to x_0} f(x) = v_1 \quad\text{and}\quad \lim_{x\to x_0} g(x) = v_2 \]
then for $\alpha \in \mathbb{R}$
\[ \lim_{x\to x_0} \alpha f(x) + g(x) = \alpha v_1 + v_2. \tag{5.2} \]
However, we can use the results for functions of several variables, as follows, to
obtain limit and continuity results for vector functions of several variables.
Proposition 5.2.10. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping where $f = (f_1, \ldots, f_m)^T$ and
$v = (v_1, \ldots, v_m) \in \mathbb{R}^m$. Then,
\[ \lim_{x\to x_0} f(x) = v \]
if and only if $\lim_{x\to x_0} f_j(x) = v_j$ for all $j = 1, \ldots, m$.
Proof. This is an if and only if statement and so we need to prove two implications.
We begin by assuming the limit for $f$ exists. Let $\epsilon > 0$, then there exists $\delta > 0$ such that for $\|x - x_0\| < \delta$
\[ \|f(x) - v\| = \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \epsilon. \]
Recall that $|u| = \sqrt{u^2}$, and therefore
\[ |f_j(x) - v_j| \leq \sqrt{(f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2} < \epsilon \]
for all $j = 1, \ldots, m$. This implies
\[ \lim_{x\to x_0} f_j(x) = v_j \]
for all $j = 1, \ldots, m$ and so one implication is proved. Suppose now that
\[ \lim_{x\to x_0} f_j(x) = v_j \]
for all $j = 1, \ldots, m$. Let $\epsilon > 0$; there exists $\delta > 0$ (the smallest of the $m$ values obtained for the individual components) such that if $\|x - x_0\| < \delta$ then
$|f_j(x) - v_j| < \epsilon$ for $j = 1, \ldots, m$. So, if $x$ is such that $\|x - x_0\| < \delta$, then
\[ \|f(x) - v\|^2 = (f_1(x) - v_1)^2 + \cdots + (f_m(x) - v_m)^2 < m\epsilon^2. \]
Thus, $\|f(x) - v\| < \sqrt{m}\,\epsilon$ and the proof is complete.
We can use the above result to determine continuity directly. The proof is straight-
forward by setting vj = fj (x0 ) for j = 1, . . . , m.
Corollary 5.2.11. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ where $f = (f_1, \ldots, f_m)^T$, then $f$ is continuous
at $x_0$ if and only if $f_j$ is continuous at $x_0$ for all $j = 1, \ldots, m$.
Proposition 5.2.10 and Corollary 5.2.11, along with Proposition 5.2.6 and
Proposition 5.2.9, give a recipe for verifying whether a mapping has a limit or is continuous.
See the next example for an illustration of the procedure.
Example 5.2.12. Consider the mapping
\[ f(x, y, z) = \begin{pmatrix} x^2\sin(xyz) \\ (x + yz)e^x \end{pmatrix}. \]
We use properties of functions of several variables to show continuity of $f$ as
$(x, y, z) \to (0, 1, 1)$. Let $f = (f_1, f_2)^T$ and we study each function in turn.
Because $x^2 \to 0$ as $x \to 0$ and $\sin(xyz) \to 0$ as $(x, y, z) \to (0, 1, 1)$ and we have
$f_1(0, 1, 1) = 0$, then $\lim_{(x, y, z)\to(0, 1, 1)} f_1(x, y, z) = f_1(0, 1, 1)$. Checking the second
one, we have $(x + yz) \to 1$ as $(x, y, z) \to (0, 1, 1)$ and $e^x \to 1$ as $x \to 0$; therefore,
$\lim_{(x, y, z)\to(0, 1, 1)} f_2(x, y, z) = f_2(0, 1, 1)$. By Corollary 5.2.11, $f$ is continuous at
$(0, 1, 1)$.
We can extend the definition of continuity at a point to continuity in a domain.
Definition 5.2.13. Let $U \subset \mathbb{R}^n$ be an open set and $f : U \to \mathbb{R}^m$. Then, $f$ is
continuous on $U$ if for any $x_0 \in U$,
\[ \lim_{x\to x_0} f(x) = f(x_0). \]
Before we address the question of derivatives of mappings, let us consider the fol-
lowing example which shows the limitations of partial derivatives.
Example 5.2.14. Let
\[ f(x, y) = \begin{cases} x & \text{if } y = 0 \\ y & \text{if } x = 0 \\ 1 & \text{otherwise.} \end{cases} \]
We begin by showing that the partial derivatives exist at $(0, 0)$:
\[ \frac{\partial f}{\partial x}(0, 0) = \lim_{h\to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h\to 0} \frac{h}{h} = 1, \]
\[ \frac{\partial f}{\partial y}(0, 0) = \lim_{h\to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h\to 0} \frac{h}{h} = 1. \]
However, the function itself does not have a limit at $(0, 0)$ as shown above in
Example 5.2.1. Therefore, the existence of partial derivatives at a point does not provide
any information about the existence of a limit at that same point.
In fact, there are examples where all directional derivatives exist at a point, but the
limit does not.
Exercises
(1) Compute the following limits.

(a) lim x sin(xy)
(x,y)(1,0)
ez
(b) lim
(x,y,z)(0,0,0) 1 + x2 + y 2
(c) lim x2 2xy + y 2 2

(x,y)(2,2)
! ! !
3 7 x 1
(d) lim +
(x,y)(1,1) 2 4 y 5
(e) lim ||(x 1, y + 2, z 3)||
(x,y,z)(0,0,0)
(f) lim (2 sin cos , 2 sin sin , 2 cos )
(,)(,/2)
(2) Show using approaches from various directions that the limits below do not
exist.
3y 6
(a) lim
(x,y)(1,2) ||(x 1, y 2)||
3xy 2
(b) lim (Hint: try a quadratic approach)
(x,y)(0,0) 2x xy 2
5
(y, x, 5z)T
(c) lim
(x,y,z)(0,0,0) ||(x, y, z)||
x
(d) lim qQ , where x = (x, y, z).
x0 ||x||3
(3) Show using the definition that the limit exists.
(a) lim 7x 3y = 0
(x,y)(0,0)
(b) lim xy + x y 1 = 0
(x,y)(1,1)
(x, y)||(x, y)||
(c) lim =0
(x,y)(0,0) 1 + ||(x, y)||
(d) lim (x, y, z) (1, 1, 1) = (0, 0, 0)
(x,y,z)(1,1,1)
(4) Show cases (1) and (3) of Proposition 5.2.6.

(5) Determine whether the following functions are continuous at the point given. If
yes, provide a calculation or use the definition of limit. If not continuous, show
why using the method of your choice.
1 + xy
(a) lim
(x,y,z)(/4,2,/4) cos(y(x + z))
!
1 3 x
(b) lim (A(x, y)) where A(x, y) =
(x,y)(0,0) 2y x
(c) lim (cos sin , sin sin , cos )
(,)(/4,/4)
(6) Show the continuity of the following mappings using properties of continuity of
functions of several variables using Corollary 5.2.11.
x2

1/xy
(a) lim ,e
(x,y)(0,0) 3x + 4xy
tanh(1/(xyz)), x2 + y 2 + z 2

(b) lim
(x,y,z)(0,0,0)
( cos sin , sin sin , cos )
(c) lim
(,,)(5,0,0) ||(, 0, 0)||
5.3 Best Linear Approximation and Derivatives
We begin the discussion on derivatives by addressing the concept of tangency at a
point.
Example 5.3.1. The following examples show functions which are tangent at a point.
(1) $x^2$ and $x^3$ at $x = 0$.
Fig. 5.9. Tangency of $x^2$ and $x^3$ at $x = 0$.
(2) $\sin x$ and $\tanh(x)$ at $x = 0$.
Fig. 5.10. Tangency of $\sin(x)$ and $\tanh(x)$ at $x = 0$.
Figure 5.9 and Figure 5.10 show the tangency; however, we need a formula from
which we can determine whether two curves are tangent at a point. This is given by
the next definition.
Definition 5.3.2. Consider mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^n \to \mathbb{R}^m$. We say $f$
and $g$ are tangent at $x_0 = (x_0^1, \ldots, x_0^n) \in \mathbb{R}^n$ if $f(x_0) = g(x_0)$ and
\[ \lim_{x\to x_0} \frac{\|f(x) - g(x)\|_{\mathbb{R}^m}}{\|x - x_0\|_{\mathbb{R}^n}} = 0. \]
The idea behind this definition is that tangency occurs if the approach of $f(x) - g(x)$
towards zero as $x \to x_0$ happens at a rate much faster than the linear rate of approach
given by $x - x_0$. We verify that the examples above agree with this definition.
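Example 5.3.3. In the case of $f(x) = x^2$ and $g(x) = x^3$ at $x = 0$ we have
\[ \lim_{x \to 0} \frac{f(x) - g(x)}{x - 0} = \lim_{x \to 0} \frac{x^2 - x^3}{x} = \lim_{x \to 0} (x - x^2) = 0. \]
Take $f(x) = \sin x$ and $g(x) = \tanh(x)$. Recall from elementary calculus that
\[ \lim_{x \to 0} \frac{\sin x}{x} = 1, \]
and $\tanh' x = 1 - \tanh^2 x$. Then, using l'Hôpital's rule,
\[ \lim_{x \to 0} \frac{\tanh(x)}{x} = 1. \]
Therefore,
\[ \lim_{x \to 0} \frac{\sin x - \tanh(x)}{x} = \lim_{x \to 0} \frac{\sin x}{x} - \lim_{x \to 0} \frac{\tanh(x)}{x} = 0. \]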
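The limits above can also be observed numerically. The following is a minimal sketch, not part of the text's argument: the helper name `tangency_ratio` and the chosen step sizes are illustrative assumptions. It evaluates the quotient of Definition 5.3.2 in one variable for shrinking $h$.

```python
import math

def tangency_ratio(f, g, x0, h):
    """Quotient |f(x0+h) - g(x0+h)| / |h| from the tangency definition (one variable)."""
    x = x0 + h
    return abs(f(x) - g(x)) / abs(h)

# Illustrative step sizes h = 1e-1, ..., 1e-5 (an arbitrary choice).
steps = [10.0 ** (-k) for k in range(1, 6)]

# x^2 versus x^3 at x0 = 0: the quotient equals h - h^2.
poly_ratios = [tangency_ratio(lambda x: x**2, lambda x: x**3, 0.0, h) for h in steps]

# sin versus tanh at x0 = 0: the quotient behaves like h^2 / 6.
trig_ratios = [tangency_ratio(math.sin, math.tanh, 0.0, h) for h in steps]

for h, p, t in zip(steps, poly_ratios, trig_ratios):
    print(f"h={h:.0e}  x^2 vs x^3: {p:.3e}  sin vs tanh: {t:.3e}")
```

Both columns shrink towards zero as $h$ decreases, which is what tangency at $x = 0$ requires. A numerical table of this kind only suggests the limit; it does not prove it.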
We now look at a case in several variables.
Example 5.3.4. Consider $f(x, y) = x^2 + y^2$ and $g(x, y) = 2(x + y - 1)$. We show that
\[ \lim_{(x,y) \to (1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} = 0. \]
Note that
\begin{align*}
|f(x, y) - g(x, y)| &= |x^2 + y^2 - 2(x + y - 1)| \\
&= |(x^2 - 2x + 1) + (y^2 - 2y + 1)| \\
&= |(x - 1)^2 + (y - 1)^2| = \|(x - 1, y - 1)\|^2.
\end{align*}
Thus,
\begin{align*}
\lim_{(x,y) \to (1,1)} \frac{|f(x, y) - g(x, y)|}{\|(x, y) - (1, 1)\|} &= \lim_{(x,y) \to (1,1)} \frac{\|(x - 1, y - 1)\|^2}{\|(x - 1, y - 1)\|} \\
&= \lim_{(x,y) \to (1,1)} \|(x - 1, y - 1)\| = 0.
\end{align*}
We now refine the above definition to consider only functions g which are linear and
this leads us to the concepts of Best Linear Approximation and the Derivative.
Definition 5.3.5. Consider the mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ and $x_0 = (x_0^1, \ldots, x_0^n) \in \mathbb{R}^n$:
(1) If $g(x) = a + Lx$ where $a \in \mathbb{R}^m$, $L : \mathbb{R}^n \to \mathbb{R}^m$ and $g$ is tangent to $f$ at $x = x_0$ then $g(x)$ is called the best linear approximation to $f$ at $x_0$.
(2) If $f$ has a best linear approximation $g(x)$ at $x_0$, then $f$ is said to be differentiable at $x_0$ and $L$ is called the derivative of $f$ at $x_0$. We denote $L = Df(x_0)$.
Properties of Tangency and Best Linear Approximation

We now look at properties of tangency and best linear approximation which are
direct consequences of the definitions above.
(1) Suppose $f(x)$ and $g(x)$ are continuous and tangent at $x_0$, then $f(x_0) = g(x_0)$.
Proof. Because $f(x)$ and $g(x)$ are continuous at $x_0$,
\[ \lim_{x \to x_0} f(x) = f(x_0) \quad \text{and} \quad \lim_{x \to x_0} g(x) = g(x_0). \]
Tangency of $f$ and $g$ implies $\lim_{x \to x_0} (f(x) - g(x)) = 0$. Therefore, $f(x_0) = g(x_0)$.
(2) If $g$ is a best linear approximation to $f$ at $x_0$ then $g(x) = f(x_0) + L(x - x_0)$.
Proof. We know $g(x) = a + Lx$, so $g(x_0) = a + Lx_0 = f(x_0)$ by the above result. This implies $a = f(x_0) - Lx_0$ and
\[ g(x) = a + Lx = f(x_0) + L(x - x_0). \]
(3) Best linear approximations are unique, which means derivatives are unique.
Proof. Suppose there are two best linear approximations $g_1(x) = f(x_0) + L_1(x - x_0)$ and $g_2(x) = f(x_0) + L_2(x - x_0)$. Then, $g_1$ and $g_2$ are tangent to each other and
\[ 0 = \lim_{x \to x_0} \frac{\|g_1(x) - g_2(x)\|}{\|x - x_0\|} = \lim_{x \to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|}. \]
We now proceed by contradiction. Suppose that $L_1 - L_2 \neq 0$. Let $v = x - x_0$ be a vector not in the kernel of $L_1 - L_2$ and consider $w := (L_1 - L_2)v$. Then $\|w\| = \lambda \|v\|$ for some nonzero $\lambda \in \mathbb{R}$. In the $v$ direction, we can write
\begin{align*}
\lim_{x \to x_0} \frac{\|(L_1 - L_2)(x - x_0)\|}{\|x - x_0\|} &= \lim_{t \to 0} \frac{\|(L_1 - L_2)tv\|}{\|tv\|} \\
&= \lim_{t \to 0} \frac{\|tw\|}{\|tv\|} \\
&= \lim_{t \to 0} \lambda \neq 0.
\end{align*}
Thus we have a contradiction and so $L_1 = L_2$.

We look at some familiar examples.
(1) For $f : \mathbb{R} \to \mathbb{R}$, the derivative $Df(x)$ is a $1 \times 1$ matrix and corresponds to the regular derivative $f'(x)$.
(2) For vector functions $f : \mathbb{R} \to \mathbb{R}^n$ this is shown in Chapter 2.
(3) For $f : \mathbb{R}^2 \to \mathbb{R}$ the derivative $Df(x)$ is a $1 \times 2$ matrix. The next result shows that $Df(x) = \nabla f(x)$.
Proposition 5.3.6. Let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable at a point $x_0$. Then the derivative is $L = Df(x_0) = \nabla f(x_0)$.
Proof. Let $L = Df(x_0) = (a_1, \ldots, a_n)$ be a $1 \times n$ matrix. Because $f$ is differentiable at $x = x_0$, its best linear approximation $g(x) = f(x_0) + Df(x_0)(x - x_0)$ satisfies
\[ \lim_{x \to x_0} \frac{|f(x) - g(x)|}{\|x - x_0\|} = 0 \]
no matter which path is used to approach $x_0$. Let $x_0 = (x_{01}, \ldots, x_{0n})$ and consider the path parallel to the $j$th coordinate axis given by
\[ x(t) = (x_1(t), \ldots, x_j(t), \ldots, x_n(t)) \]
where $x_i(t) = x_{0i}$ for $i \neq j$ and $x_j(t) = x_{0j} + t$ where $t \in \mathbb{R}$. Then,
\begin{align*}
0 &= \lim_{x \to x_0} \frac{|f(x) - f(x_0) - Df(x_0)(x - x_0)|}{\|x - x_0\|} \\
&= \lim_{t \to 0} \frac{|f(x_{01}, \ldots, x_{0j} + t, \ldots, x_{0n}) - f(x_0) - Df(x_0)e_j t|}{|t|} \\
&= \lim_{t \to 0} \frac{|f(x_{01}, \ldots, x_{0j} + t, \ldots, x_{0n}) - f(x_0) - a_j t|}{|t|} \\
&= \left| \frac{\partial f}{\partial x_j}(x_0) - a_j \right|.
\end{align*}
Hence $a_j = \frac{\partial f}{\partial x_j}(x_0)$. For $j = 1, \ldots, n$ we obtain the partial derivatives in all coordinate directions and $Df(x_0) = \nabla f(x_0)$.
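Example 5.3.7. The derivative of $f(x_1, x_2, x_3, x_4) = x_2^2 x_3 - e^{x_1 x_4}$ at
\[ x_0 = (x_{01}, x_{02}, x_{03}, x_{04}) = (1, 1, 1, 1) \]
is obtained by computing the gradient
\begin{align*}
Df(x_0) &= \left( \frac{\partial f}{\partial x_1}(x_0), \frac{\partial f}{\partial x_2}(x_0), \frac{\partial f}{\partial x_3}(x_0), \frac{\partial f}{\partial x_4}(x_0) \right) \\
&= \left. (-x_4 e^{x_1 x_4}, 2x_2 x_3, x_2^2, -x_1 e^{x_1 x_4}) \right|_{x_0} = (-e, 2, 1, -e).
\end{align*}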
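A gradient computed by hand can be cross-checked with central finite differences. The sketch below is illustrative only: it assumes the function of Example 5.3.7 reads $f = x_2^2 x_3 - e^{x_1 x_4}$, and the helper `numerical_gradient` and the step size `h` are hypothetical choices, not part of the text.

```python
import math

def f(x1, x2, x3, x4):
    # Assumed reading of Example 5.3.7: f = x2^2 * x3 - exp(x1 * x4).
    return x2**2 * x3 - math.exp(x1 * x4)

def numerical_gradient(func, point, h=1e-6):
    """Approximate each partial derivative by a central difference quotient."""
    grad = []
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad

approx = numerical_gradient(f, [1.0, 1.0, 1.0, 1.0])
exact = [-math.e, 2.0, 1.0, -math.e]  # gradient obtained analytically

for a, e in zip(approx, exact):
    print(f"numerical {a:+.8f}   analytic {e:+.8f}")
```

Agreement between the two columns supports the analytic gradient, under the stated assumption on the sign of the exponential term.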
The Jacobian Matrix

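Using the characterization of the derivative for a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$ in terms of the gradient, we can extend our study to general mappings $f : \mathbb{R}^n \to \mathbb{R}^m$ for arbitrary $n, m \in \mathbb{N}$. This is given in the next theorem.
Theorem 5.3.8. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be a mapping and $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ with
\[ f(x) = \begin{pmatrix} f_1(x_1, \ldots, x_n) \\ f_2(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{pmatrix}. \]
Suppose that $Df(x_0)$ exists. Then,
\[ Df(x_0) = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x_0) & \frac{\partial f_1}{\partial x_2}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\
\frac{\partial f_2}{\partial x_1}(x_0) & \frac{\partial f_2}{\partial x_2}(x_0) & \cdots & \frac{\partial f_2}{\partial x_n}(x_0) \\
\vdots & \vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(x_0) & \frac{\partial f_m}{\partial x_2}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0)
\end{pmatrix}. \tag{5.3} \]
The matrix (5.3) is called the Jacobian matrix of $f$ and is written $Jf(x)$.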
The proof is done in a similar way as for the derivative of a function f : Rn R and
we do not present it here. The interested reader can supply the details. The Jacobian
matrix Jf (x) depends only on partial derivatives and has importance of its own, and
should be computed even before checking that a mapping f is differentiable at some
point. It is a useful construction and is fundamental in what follows. However, it
should not be confused with the derivative since the derivative can exist at points
where the Jacobian matrix cannot be computed as we see in the next section.
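Example 5.3.9. We compute the derivative of
\[ f(x, y, z, u, w) = \begin{pmatrix} uz^2 - 3wx^2 \\ \cos(yz) + xuw \\ x e^{-1/(yu)} \end{pmatrix} \]
at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$. We begin by writing $f = (f_1, f_2, f_3)^T$ and the Jacobian matrix formula is
\[ Jf(x, y, z, u, w) = \begin{pmatrix}
\frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} & \frac{\partial f_1}{\partial z} & \frac{\partial f_1}{\partial u} & \frac{\partial f_1}{\partial w} \\
\frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} & \frac{\partial f_2}{\partial z} & \frac{\partial f_2}{\partial u} & \frac{\partial f_2}{\partial w} \\
\frac{\partial f_3}{\partial x} & \frac{\partial f_3}{\partial y} & \frac{\partial f_3}{\partial z} & \frac{\partial f_3}{\partial u} & \frac{\partial f_3}{\partial w}
\end{pmatrix}. \]
We compute each partial derivative:
\begin{align*}
\frac{\partial f_1}{\partial x} &= -6wx, & \frac{\partial f_1}{\partial y} &= 0, & \frac{\partial f_1}{\partial z} &= 2uz, & \frac{\partial f_1}{\partial u} &= z^2, & \frac{\partial f_1}{\partial w} &= -3x^2, \\
\frac{\partial f_2}{\partial x} &= uw, & \frac{\partial f_2}{\partial y} &= -z \sin(yz), & \frac{\partial f_2}{\partial z} &= -y \sin(yz), & \frac{\partial f_2}{\partial u} &= xw, & \frac{\partial f_2}{\partial w} &= xu, \\
\frac{\partial f_3}{\partial x} &= e^{-1/(yu)}, & \frac{\partial f_3}{\partial y} &= \frac{x}{uy^2} e^{-1/(yu)}, & \frac{\partial f_3}{\partial z} &= 0, & \frac{\partial f_3}{\partial u} &= \frac{x}{yu^2} e^{-1/(yu)}, & \frac{\partial f_3}{\partial w} &= 0,
\end{align*}
and evaluate at $(x, y, z, u, w) = (0, 1, 0, 1, 0)$ to obtain
\[ Jf(0, 1, 0, 1, 0) = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ e^{-1} & 0 & 0 & 0 & 0 \end{pmatrix}. \]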
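The Jacobian matrix of this example can be approximated column by column with central differences. This is a sketch under assumptions: the signs of the mapping are read as $f = (uz^2 - 3wx^2,\ \cos(yz) + xuw,\ x e^{-1/(yu)})^T$, and `numerical_jacobian` is a hypothetical helper introduced only for illustration.

```python
import math

def f(x, y, z, u, w):
    # Assumed reading of the mapping in Example 5.3.9.
    return [u * z**2 - 3 * w * x**2,
            math.cos(y * z) + x * u * w,
            x * math.exp(-1.0 / (y * u))]

def numerical_jacobian(func, point, h=1e-6):
    """Return the m x n matrix of central difference quotients of func at point."""
    m = len(func(*point))
    n = len(point)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        f_forward = func(*forward)
        f_backward = func(*backward)
        for i in range(m):
            J[i][j] = (f_forward[i] - f_backward[i]) / (2 * h)
    return J

J = numerical_jacobian(f, [0.0, 1.0, 0.0, 1.0, 0.0])
for row in J:
    print(["{:+.6f}".format(entry) for entry in row])
```

Under the stated reading, the only entry that is not numerically zero is the one in row 3, column 1, which should be close to $e^{-1}$.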
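Example 5.3.10. We determine if the Jacobian of
\[ g(x, y, z) = \begin{pmatrix} y \sin(x) \\ y \ln|1 + x| \\ z^4 y^2 + \cos(x) \end{pmatrix} \]
computed at $(x, y, z) = (0, 2, 1)$ is triangular. We begin by writing $g = (g_1, g_2, g_3)^T$ and the Jacobian matrix formula is
\[ Jg(x, y, z) = \begin{pmatrix}
\frac{\partial g_1}{\partial x} & \frac{\partial g_1}{\partial y} & \frac{\partial g_1}{\partial z} \\
\frac{\partial g_2}{\partial x} & \frac{\partial g_2}{\partial y} & \frac{\partial g_2}{\partial z} \\
\frac{\partial g_3}{\partial x} & \frac{\partial g_3}{\partial y} & \frac{\partial g_3}{\partial z}
\end{pmatrix}. \]
We compute each partial derivative:
\begin{align*}
\frac{\partial g_1}{\partial x} &= y \cos(x), & \frac{\partial g_1}{\partial y} &= \sin(x), & \frac{\partial g_1}{\partial z} &= 0, \\
\frac{\partial g_2}{\partial x} &= \frac{y}{1 + x}, & \frac{\partial g_2}{\partial y} &= \ln|1 + x|, & \frac{\partial g_2}{\partial z} &= 0, \\
\frac{\partial g_3}{\partial x} &= -\sin(x), & \frac{\partial g_3}{\partial y} &= 2z^4 y, & \frac{\partial g_3}{\partial z} &= 4z^3 y^2,
\end{align*}
and evaluate at $(x, y, z) = (0, 2, 1)$ to obtain
\[ Jg(0, 2, 1) = \begin{pmatrix} 2 & 0 & 0 \\ 2 & \ln 1 & 0 \\ 0 & 4 & 16 \end{pmatrix} \]
which is a lower triangular matrix.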
Alternate terminology and notations are as follows:
(1) The Jacobian is sometimes written
\[ Jf(x) = \frac{\partial(f_1, f_2, \ldots, f_m)}{\partial(x_1, \ldots, x_n)}(x). \]
(2) The Jacobian matrix is sometimes referred to as the linear part of f or the
linearization of f .
We conclude this section with the linearity property for derivatives of mappings.
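Proposition 5.3.11. [Linearity Property] Let $f, g : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable mappings on $U \subset \mathbb{R}^n$, and $\lambda \in \mathbb{R}$. Then,
\[ D(f + g)(x) = Df(x) + Dg(x) \quad \text{and} \quad D(\lambda f)(x) = \lambda Df(x). \]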
Example 5.3.12. In the study of differential equations, mappings such as this one are often encountered:
\[ f(x, y) = A \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 2x^2 + 4xy \\ 5x^3 - 3xy^2 \end{pmatrix} \]
where $A$ is a matrix of constants. The derivative is
\[ Df(x, y) = A + \begin{pmatrix} 4x + 4y & 4x \\ 15x^2 - 3y^2 & -6xy \end{pmatrix}. \]
Exercises
(1) Determine whether the functions are continuous; if so, prove it using the definition. If not, give an explanation why. If possible, find a candidate for the best linear approximation function $g$ and show the tangency with the function at the point. (Hint: the Jacobian matrix may be useful in finding $g$.)
(a) $f(x)$ at $x = 0$ where $f(x) = \begin{cases} x^2 & x \geq 0 \\ -x^2 & x < 0 \end{cases}$
(b) $f(x, y) = (x^2 + y^2)^{5/2}$ at $(0, 0)$.
(c) $f(x, y, z) = \begin{cases} xyz \sin(1/x) & x \neq 0 \\ 0 & x = 0 \end{cases}$ at $(0, 0, 0)$.
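(d) Consider the function $f : \mathbb{R}^2 \to \mathbb{R}^2$
\[ f(x, y) = \begin{cases} \begin{pmatrix} \frac{x}{|x|}\, y \\ xy \end{pmatrix}, & x \neq 0 \\[2ex] \begin{pmatrix} 0 \\ 0 \end{pmatrix}, & x = 0 \end{cases} \]
at $(x, y) = (0, 0)$.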

(2) Compute the Jacobian matrix of the following mappings:

(a) $f : \mathbb{R}^3 \to \mathbb{R}$ defined by $f(x, y, z) = xyz$.
(b) $f : \mathbb{R}^3 \to \mathbb{R}^2$ defined by
!
x2 eyz
f (x, y, z) = .
(1 + xyz)2
(c) f : R2 R3 defined by

xy 2

f (x, y) = exy .

xy
(d) f : R4 R4 defined by

x21 + x2 x3

x1 x2 + x2 x4
f (x1 , x2 , x3 , x4 ) =
x x +x x

1 3 2 4
x2 x3 + x24
5.3.1 Conditions for differentiability
The regularity of a function/mapping is often used to refer to how nice a function/mapping is. For instance, we know that a differentiable function $f : \mathbb{R} \to \mathbb{R}$ is automatically continuous. But we know that the converse is not true; for instance $f(x) = |x|$ at $x = 0$ is continuous, but not differentiable. Therefore, one says that differentiable functions have more or higher regularity than continuous functions. In this section, we begin our discussion of regularity of mappings by generalizing the implication just mentioned for functions of one variable.
Theorem 5.3.13. Let $f : \mathbb{R}^n \to \mathbb{R}^m$ be differentiable at $x = x_0$, then it is continuous at $x = x_0$.
Proof. The proof is left as an exercise.

We continue with examples from which we are able to derive sufficient conditions
guaranteeing that a mapping is indeed differentiable and in the process, also make a
distinction between the derivative of a function and its Jacobian matrix. We begin
with an example from elementary calculus.
Example 5.3.14. Consider
\[ f(x) = \begin{cases} x^2 \sin(1/x) & x \neq 0 \\ 0 & x = 0. \end{cases} \]
We begin by noting that $f(x)$ is continuous at $x = 0$. The details of this are left as an exercise. The Jacobian is given by
\[ Jf(x) = 2x \sin(1/x) - \cos(1/x) \]
and this function is not defined at $x = 0$. Moreover, it cannot be made continuous at $x = 0$ because of the oscillating nature of the term $\cos(1/x)$. However, $f$ is differentiable at $x = 0$ and $Df(0) = f'(0) = 0$, which can be computed directly via the limit definition of the derivative. Thus, we have
\[ Df(x) = \begin{cases} 2x \sin(1/x) - \cos(1/x) & x \neq 0 \\ 0 & x = 0 \end{cases} \]
and $Df(x)$ is defined for all $x$, but $Jf(x)$ is not. This means $Jf(x)$ is not a good indicator of whether $Df(x)$ exists at $x = 0$.
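This example can be easily extended to functions of several variables.
Example 5.3.15. Consider
\[ f(x, y) = \begin{cases} (x^2 + y^2) \sin\left( \dfrac{1}{\sqrt{x^2 + y^2}} \right) & (x, y) \neq (0, 0) \\ 0 & (x, y) = (0, 0). \end{cases} \]
The Jacobian matrix is $Jf(x, y) = (\partial f/\partial x, \partial f/\partial y)$ with
\begin{align*}
\frac{\partial f}{\partial x} &= 2x \sin\left( \frac{1}{\|(x, y)\|} \right) - \frac{x}{\|(x, y)\|} \cos\left( \frac{1}{\|(x, y)\|} \right) \\
\frac{\partial f}{\partial y} &= 2y \sin\left( \frac{1}{\|(x, y)\|} \right) - \frac{y}{\|(x, y)\|} \cos\left( \frac{1}{\|(x, y)\|} \right).
\end{align*}
Note again that $Jf(x, y)$ is not defined at $(0, 0)$ since neither of these formulas is defined at $(0, 0)$. But we show that $Df(0, 0) = (0, 0)$ using the definition. Let $g$ be the best linear approximation to $f$ at $(0, 0)$: $g(x, y) = g(0, 0) + L(x - 0, y - 0)^T$ with $g(0, 0) = 0$. We let $L = (0, 0)$ and write
\begin{align*}
\lim_{(x,y) \to (0,0)} \frac{f(x, y) - g(x, y)}{\|(x, y) - (0, 0)\|} &= \lim_{(x,y) \to (0,0)} \frac{\|(x, y)\|^2 \sin\left( \frac{1}{\|(x, y)\|} \right)}{\|(x, y)\|} \\
&= \lim_{(x,y) \to (0,0)} \|(x, y)\| \sin\left( \frac{1}{\|(x, y)\|} \right) \\
&= 0,
\end{align*}
where the last equality follows from the squeeze theorem since
\[ -1 \leq \sin(1/\|(x, y)\|) \leq 1. \]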
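The contrast between the tangency quotient and the partial derivative formula can be seen numerically. The sketch below is illustrative: the function names, the fixed direction of approach and the sample points are assumptions chosen for the demonstration.

```python
import math

def f(x, y):
    """f(x, y) = (x^2 + y^2) sin(1 / ||(x, y)||), extended by 0 at the origin."""
    r = math.hypot(x, y)
    return 0.0 if r == 0.0 else r**2 * math.sin(1.0 / r)

def dfdx(x, y):
    """Partial derivative formula, valid only away from the origin."""
    r = math.hypot(x, y)
    return 2 * x * math.sin(1.0 / r) - (x / r) * math.cos(1.0 / r)

def tangency_quotient(x, y):
    """|f - g| / ||(x, y)|| with the candidate best linear approximation g = 0."""
    return abs(f(x, y)) / math.hypot(x, y)

# Approach the origin along a fixed direction (angle of 1 radian, arbitrary).
radii = [10.0 ** (-k) for k in range(1, 7)]
quotients = [tangency_quotient(r * math.cos(1.0), r * math.sin(1.0)) for r in radii]

# Along the x-axis the formula for df/dx keeps oscillating between about -1 and +1.
near_minus_one = dfdx(1.0 / (2000 * math.pi), 0.0)
near_plus_one = dfdx(1.0 / (2001 * math.pi), 0.0)

print(quotients)
print(near_minus_one, near_plus_one)
```

The quotients are bounded by the radius and shrink to zero, consistent with $Df(0, 0) = (0, 0)$, while the partial derivative formula has no limit at the origin.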
Because partial derivatives are easy to compute, we determine a criterion based on partial derivatives which guarantees the existence of $Df$. This is the content of the next result.
Theorem 5.3.16. Let $U \subset \mathbb{R}^n$ be an open set and consider $f : U \to \mathbb{R}^m$ with $f = (f_1, \ldots, f_m)^T$. Then, $f$ is differentiable on $U$ if $f$ is continuous on $U$ and all the partial derivatives
\[ \frac{\partial f_j}{\partial x_i}(x), \quad j = 1, \ldots, m; \ i = 1, \ldots, n \]
exist and are continuous functions on $U$.
We omit the proof of this theorem. The following definition is helpful in describing levels of differentiability.
Definition 5.3.17. A mapping $f$ differentiable on $U \subset \mathbb{R}^n$ and for which $Df(x)$ is a continuous function on $U$ is said to be of class $C^1$ on $U$. We write $f \in C^1$ on $U \subset \mathbb{R}^n$ or $f \in C^1(U)$.
We can now illustrate the logical relationship between various conditions of regularity of a function.
\begin{align*}
f \text{ satisfies Theorem 5.3.16} &\iff f \text{ is } C^1 \text{ on } U \\
f \text{ is } C^1 \text{ on } U &\implies f \text{ is differentiable on } U \\
f \text{ is differentiable on } U &\implies f \text{ is continuous on } U.
\end{align*}
Exercises
(1) Consider the mapping $f : \mathbb{R}^3 \to \mathbb{R}$ defined by $f(x, y, z) = 3x \cos(yz)$ and verify that $f$ is differentiable for all $(x, y, z) \in \mathbb{R}^3$.
(2) Consider !
(x + y)4/3

(x, y) 6= (0, 0)
(x y)2/3

f (x, y) =

(0, 0)T (x, y) = (0, 0).

Compute the partial derivatives, check to see if the partial derivatives are defined
at (0, 0) and if they are continuous at (0, 0). Using this information, can one tell
if Df (0, 0) exists?
(3) Show Theorem 5.3.13.
(4) For each of the mappings below (i) Find the domain of the mapping. (ii) Com-
pute the Jacobian matrix of the mapping. (iii) Find out if the Jacobian matrix
is defined at all points of the domain.
(a) p
x2 y 2 uz

f (x, y, z, u) =
cos yz

ln(x2 u2 )
(b)

xy + 2z 2

f (x, y, z, u) =
uxeyz

cos(x2 y 2 )
5.4 Tangent spaces
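In Sections 3.1.1 and 3.1.4, we define tangent spaces to curves and to surfaces defined by $z = f(x, y)$. The tangent space to a function $y = f(x)$ is given by
\[ T_p C = \{\lambda (1, f'(x_0)) \mid \lambda \in \mathbb{R}\}, \]
where $\lambda \in T_{x_0}\mathbb{R}$. In general, tangent spaces of space curves are given by
\[ T_p C = \{\lambda r'(t_0) \mid \lambda \in \mathbb{R}\}, \]
with $\lambda \in T_{t_0}\mathbb{R}$. Let $S$ be a surface given by $z = f(x, y)$. Then
\[ T_p S = \operatorname{span}\{\tau_1(p), \tau_2(p)\}, \]
where
\[ \tau_1(p) = \left( 1, 0, \frac{\partial f}{\partial x}(x_0, y_0) \right) \quad \text{and} \quad \tau_2(p) = \left( 0, 1, \frac{\partial f}{\partial y}(x_0, y_0) \right). \]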
We now use the derivative to generalize the above constructions to arbitrary mappings $f : \mathbb{R}^n \to \mathbb{R}^m$. Recall that $\operatorname{graph}(f)$ is a surface of dimension $n$ in $\mathbb{R}^n \times \mathbb{R}^m$.
Definition 5.4.1. Consider the set $S$ given by $\operatorname{graph} f$ and suppose that $f$ is differentiable at $x_0$. Then the tangent space at $p = (x_0, f(x_0)) \in S$ is given by
\[ T_p S = \{(v^T, [Df(x_0)v]^T) \in T_{x_0}\mathbb{R}^n \times T_{f(x_0)}\mathbb{R}^m \mid v \in T_{x_0}\mathbb{R}^n\} \]
and recall that $v$ is a column vector.
If $S$ is an $n$-dimensional surface given by a differentiable mapping $f : U \subset \mathbb{R}^n \to \mathbb{R}^m$ then the tangent space at any point $p \in S$ is a vector space of dimension $n$. This can be easily shown with a calculation. We check that a linear combination of elements of $T_p S$ is also in $T_p S$. Let $(v_1^T, [Df(x_0)v_1]^T), (v_2^T, [Df(x_0)v_2]^T) \in T_p S$ and $\lambda \in \mathbb{R}$, then
\begin{align*}
(v_1^T, [Df(x_0)v_1]^T) + \lambda (v_2^T, [Df(x_0)v_2]^T) &= (v_1^T + \lambda v_2^T, [Df(x_0)v_1]^T + \lambda [Df(x_0)v_2]^T) \\
&= (v_1^T + \lambda v_2^T, [Df(x_0)(v_1 + \lambda v_2)]^T)
\end{align*}
where this last element is indeed in $T_p S$.
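Example 5.4.2. We now verify that the formula of the definition corresponds to the tangent space of a two-dimensional surface. Let $z = f(x, y)$ be differentiable at $(x_0, y_0)$. The definition gives
\[ T_p S = \{(v^T, [Df(x_0, y_0)v]^T) \in T_{(x_0,y_0)}\mathbb{R}^2 \times T_{f(x_0,y_0)}\mathbb{R} \mid v^T = (v_1, v_2) \in T_{(x_0,y_0)}\mathbb{R}^2\} \]
with
\begin{align*}
Df(x_0, y_0)v &= \left( \frac{\partial f(x_0, y_0)}{\partial x}, \frac{\partial f(x_0, y_0)}{\partial y} \right) (v_1, v_2)^T \\
&= \frac{\partial f(x_0, y_0)}{\partial x} v_1 + \frac{\partial f(x_0, y_0)}{\partial y} v_2.
\end{align*}
Thus,
\begin{align*}
(v^T, [Df(x_0, y_0)v]^T) &= \left( v_1, v_2, \frac{\partial f(x_0, y_0)}{\partial x} v_1 + \frac{\partial f(x_0, y_0)}{\partial y} v_2 \right) \\
&= v_1 \left( 1, 0, \frac{\partial f(x_0, y_0)}{\partial x} \right) + v_2 \left( 0, 1, \frac{\partial f(x_0, y_0)}{\partial y} \right),
\end{align*}
which confirms the correspondence.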
From the definition of $T_p S$, we see that the linear operator $Df(x_0)$ is, in fact, a linear mapping that sends vectors $v$ in the tangent space at $x_0$ to vectors in the tangent space at $f(x_0)$:
\[ Df(x_0) : T_{x_0}\mathbb{R}^n \to T_{f(x_0)}\mathbb{R}^m, \quad v \mapsto Df(x_0)v. \]
Fig. 5.11. The pair $(f, Df)$ maps the pair $(x_0, v_0) \in \mathbb{R}^n \times T_{x_0}\mathbb{R}^n$ to the corresponding pair $(f(x_0), Df(x_0)) \in \mathbb{R}^m \times T_{f(x_0)}\mathbb{R}^m$.
This is illustrated in Figure 5.11. Similarly, we can say that $T_p S$ is the graph of the linear mapping $Df(x_0)$.
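Example 5.4.3. We compute the tangent space of $f(x, y) = (2xy, x^2 + y^2)^T$ at an arbitrary point $p = (x, y, f(x, y))$. $S = \operatorname{graph} f$ is a two-dimensional surface in $\mathbb{R}^2 \times \mathbb{R}^2$. We proceed first by writing $f = (f_1, f_2)^T$ and computing
\[ Df(x, y) = \begin{pmatrix} \frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\ \frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y} \end{pmatrix} = \begin{pmatrix} 2y & 2x \\ 2x & 2y \end{pmatrix}. \]
Let $v = (v_1, v_2)^T$ and compute
\[ Df(x, y)v = \begin{pmatrix} 2yv_1 + 2xv_2 \\ 2xv_1 + 2yv_2 \end{pmatrix}. \]
Then,
\[ T_p S = \{(v_1, v_2, 2yv_1 + 2xv_2, 2xv_1 + 2yv_2) \mid (v_1, v_2)^T \in T_{(x,y)}\mathbb{R}^2\}. \]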
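The description of $T_p S$ as the graph of $Df(x, y)$ translates directly into a membership test. The following sketch is illustrative; the helper names `tangent_vector` and `in_tangent_space` and the tolerance are assumptions, not notation from the text.

```python
def Df(x, y):
    """Derivative of f(x, y) = (2xy, x^2 + y^2)^T as a 2 x 2 matrix."""
    return [[2 * y, 2 * x],
            [2 * x, 2 * y]]

def tangent_vector(x, y, v):
    """Return the element (v, Df(x, y) v) of the tangent space at p = (x, y, f(x, y))."""
    J = Df(x, y)
    image = [J[i][0] * v[0] + J[i][1] * v[1] for i in range(2)]
    return (v[0], v[1], image[0], image[1])

def in_tangent_space(x, y, w, tol=1e-12):
    """A vector w in R^4 lies in T_p S exactly when its last two entries equal Df(x, y)(w1, w2)."""
    expected = tangent_vector(x, y, (w[0], w[1]))
    return all(abs(a - b) <= tol for a, b in zip(expected, w))

print(tangent_vector(1, 2, (1, 0)))
print(in_tangent_space(1, 2, (1, 1, 6, 6)))
print(in_tangent_space(1, 2, (1, 1, 6, 5)))
```

The same test applies to any differentiable mapping once `Df` is replaced by the appropriate derivative matrix.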
We are now in a position to show the differential formula for functions of several variables. Recall from Chapter 3 that if $f : \mathbb{R}^n \to \mathbb{R}$ is such that all its partial derivatives exist, then
\[ df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} dx_i. \]
Let $S$ be the $n$-dimensional surface given by the graph of $f(x_1, \ldots, x_n)$. Then
\[ T_p S = \{(\xi_1, \ldots, \xi_n, \nabla f (\xi_1, \ldots, \xi_n)^T) \mid \xi_1, \ldots, \xi_n \in \mathbb{R}\}. \]
Letting $v = (\xi_1, \ldots, \xi_n, \nabla f (\xi_1, \ldots, \xi_n)^T)$ we have $dx_j(v) = \xi_j$ for $j = 1, \ldots, n$ and
\[ df(v) = dx_{n+1}(v) = \nabla f (\xi_1, \ldots, \xi_n)^T = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} dx_i(v). \]
Exercises
(1) Consider the mapping

!
xy 2
f (x, y) =
x + 2(x3 + y 3 )
and consider S = graph (f ).

(a) Verify that the point p = (1, 1, 1, 0) S.
(b) Compute Tp S.
(c) Is the vector w = (2, 1, 0, 20) Tp S?
(2) Consider the differentiable mapping
!
yx2 y 2 + xyz
f (x, y, z) = .
3xy + xyz 2
(a) Compute Df (x, y, z). Find values of (x, y, z) for which Df is the zero matrix.
(b) Compute the general formula for the tangent space to graph (f ) = S for an
arbitrary point p = (x, y, z, f (x, y, z)).
(3) Consider the differentiable mapping
9 1
g(x, y, z, u) = 6xyz zx2 z 2 y x2 + 9x x4 2uy + u.
2 2
(a) Compute the general formula for the tangent space to graph (g) = S for an
arbitrary point p = (x, y, z, u, g(x, y, z, u)).
(b) Find all the values of (x, y, z, u) for which Dg(x, y, z, u) = 0.
5.5 The Chain Rule
Let $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^k$, then we can compose $f$ and $g$:
\[ (g \circ f)(x) = g(f(x)). \]
We look at two examples.
(1) Let $f(x, y, z) = x^2 + y^2 + z^2$ and $g(u) = \sqrt{u}$, we compute $(g \circ f)(x)$:
\[ (g \circ f)(x) = g(f(x)) = g(x^2 + y^2 + z^2) = \sqrt{x^2 + y^2 + z^2}. \]
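(2) Let $f : \mathbb{R}^2 \to \mathbb{R}^4$ be defined by
\[ f(x, y) = \begin{pmatrix} x & y \\ \cos(x) & 0 \end{pmatrix} \]
and $g$ be the determinant function. We compute $(g \circ f)$:
\begin{align*}
(g \circ f)(x) = g(f(x)) &= g\left( \begin{pmatrix} x & y \\ \cos(x) & 0 \end{pmatrix} \right) \\
&= \det \begin{pmatrix} x & y \\ \cos(x) & 0 \end{pmatrix} \\
&= -y \cos(x).
\end{align*}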
We recall the chain rule for functions of one variable. Let $f, g : \mathbb{R} \to \mathbb{R}$ and suppose that $g \circ f : \mathbb{R} \to \mathbb{R}$ is well-defined. The chain rule formula is
\[ (g \circ f)'(x) = \frac{d}{dx} g(f(x)) = g'(f(x)) f'(x). \]
Thus, we multiply the derivatives of $g$ and $f$. The same multiplication of derivatives characterizes the chain rule for mappings. However, we must now consider multiplication of matrices with the proper sizes. The exact statement follows.
Proposition 5.5.1. [Chain rule] Let $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^k$ be two differentiable functions. Then,
\[ D(g \circ f)(x) = Dg(f(x)) Df(x). \]
We now show what this formula means and how it can be unpacked.
Example 5.5.2. We compute the derivative of Example (1). The formula is
\[ D(g \circ f)(x) = Dg(f(x)) Df(x). \]
We compute each derivative separately:
\[ (Dg)(u) = \frac{1}{2\sqrt{u}}, \qquad Df(x) = (2x, 2y, 2z). \]
Evaluating $Dg(u)$ at $u = f(x)$ we obtain
\[ Dg(f(x)) = \frac{1}{2\sqrt{x^2 + y^2 + z^2}}. \]
Putting these computations together in the formula we conclude that
\[ D(g \circ f)(x) = Dg(f(x)) Df(x) = \frac{(x, y, z)}{\sqrt{x^2 + y^2 + z^2}}. \]
In particular, since $(g \circ f)(x, y, z) = \|(x, y, z)\|$, we have
\[ D(\|(x, y, z)\|) = \left( \frac{x}{\|(x, y, z)\|}, \frac{y}{\|(x, y, z)\|}, \frac{z}{\|(x, y, z)\|} \right). \]
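We compute the derivative of Example (2).
Example 5.5.3. We start by computing each derivative separately. Let
\[ A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \]
and rewrite $g$ as a mapping from $\mathbb{R}^4 \to \mathbb{R}$:
\[ g(a_1, a_2, a_3, a_4) = a_1 a_4 - a_2 a_3. \]
Then
\[ Dg(A) = (a_4, -a_3, -a_2, a_1). \]
Write $f$ as $f(x, y) = (x, y, \cos(x), 0)^T$ and
\[ Df(x, y) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -\sin(x) & 0 \\ 0 & 0 \end{pmatrix}. \]
Evaluating $Dg$ at $A$ written as an $\mathbb{R}^4$ vector $(x, y, \cos(x), 0)$ we have $Dg(A) = (0, -\cos(x), -y, x)$. We now assemble in the formula.
\begin{align*}
D(g \circ f)(x, y) &= (0, -\cos(x), -y, x) \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -\sin(x) & 0 \\ 0 & 0 \end{pmatrix} \\
&= (y \sin(x), -\cos(x)).
\end{align*}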
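The matrix product in the chain rule can be compared with a direct numerical derivative of the composition. This is a sketch only: the function names and the sample point are illustrative assumptions, and the signs follow from $g(a_1, a_2, a_3, a_4) = a_1 a_4 - a_2 a_3$.

```python
import math

def f(x, y):
    # The 2 x 2 matrix of Example (2), flattened row by row into a vector of R^4.
    return [x, y, math.cos(x), 0.0]

def g(a1, a2, a3, a4):
    # Determinant of the 2 x 2 matrix with rows (a1, a2) and (a3, a4).
    return a1 * a4 - a2 * a3

def Df(x, y):
    return [[1.0, 0.0],
            [0.0, 1.0],
            [-math.sin(x), 0.0],
            [0.0, 0.0]]

def Dg(a1, a2, a3, a4):
    return [a4, -a3, -a2, a1]

def chain_rule(x, y):
    """Row vector Dg(f(x, y)) multiplied by the 4 x 2 matrix Df(x, y)."""
    row = Dg(*f(x, y))
    J = Df(x, y)
    return [sum(row[k] * J[k][j] for k in range(4)) for j in range(2)]

def direct(x, y, h=1e-6):
    """Central differences applied to the composition g(f(x, y))."""
    comp = lambda s, t: g(*f(s, t))
    return [(comp(x + h, y) - comp(x - h, y)) / (2 * h),
            (comp(x, y + h) - comp(x, y - h)) / (2 * h)]

point = (0.7, 1.3)  # arbitrary sample point
print(chain_rule(*point))
print(direct(*point))
```

Both outputs should agree with $(y \sin(x), -\cos(x))$ evaluated at the sample point, up to the finite-difference error.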
Exercises
(1) Compute the derivative of the mappings g f for f and g given below.
(a) f (x, y, z) = xyz and g(t) = (t, t2 , t3 ).
(b) f (x, y) = (xy 2 , 2x2 y 2 ) and g(u, v) = (uv, u2 , u v 2 ).
(c) Let A, B be n n matrices, f (A) = A2 and g(B) = tr B.
(d) f (t) = (cos(t), t sin(t), te2t ), g(x, y, z) = (xy, x2 2y 2 ).
(2) Show that if $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ and $g(x) = \|x\|$, then
\[ Dg(x) = \frac{x}{\|x\|}. \]
(3) Use the chain rule to compute the following derivatives where f (x, y, z) = x2 +
yz, g(x, y) = y 3 + xy and h(x) = sin(x).
(a) F (x, y, z) = f (h(x), g(x, y), z).
(b) G(x, y, z) = h(f (x, y, z)g(x, y)).
(c) H(x, y, z) = g(f (x, y, h(x)), g(z, y)).
5.6 Higher Derivatives
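As in the case of real functions, the derivative of a mapping $f : \mathbb{R}^n \to \mathbb{R}^m$ is also a mapping. Before looking at a few examples, consider the set $L(\mathbb{R}^n, \mathbb{R}^m)$ of all real linear transformations from $\mathbb{R}^n$ to $\mathbb{R}^m$. This set is a vector space since for any two linear transformations $L_1, L_2 \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $\lambda \in \mathbb{R}$, $L_1 + \lambda L_2$ is also an element of $L(\mathbb{R}^n, \mathbb{R}^m)$. If one prefers thinking about matrices, then it is straightforward to show that $\mathrm{Mat}_{m \times n}(\mathbb{R})$, the vector space of all $m \times n$ matrices with real entries, and $L(\mathbb{R}^n, \mathbb{R}^m)$ are isomorphic as vector spaces.
(1) Let $f : \mathbb{R} \to \mathbb{R}$, then $Df(x)$ is a $1 \times 1$ matrix. For instance, if $f(x) = \sin x$, then
\[ Df : \mathbb{R} \to L(\mathbb{R}, \mathbb{R}), \quad x \mapsto (\cos(x)). \]
(2) Let $f : \mathbb{R}^4 \to \mathbb{R}^2$, then $Df(x, y, z, w)$ is a $2 \times 4$ matrix. If $f(x, y, z, w) = (x^2 + yzw, y^2 + xw)^T$, then
\[ Df : \mathbb{R}^4 \to L(\mathbb{R}^4, \mathbb{R}^2), \quad (x, y, z, w) \mapsto \begin{pmatrix} 2x & zw & yw & yz \\ w & 2y & 0 & x \end{pmatrix}. \]
We are thus led to the following characterization. If $f : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable, then $Df$ is a mapping:
\[ Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}^m), \quad x = (x_1, \ldots, x_n) \mapsto Df(x). \]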
In the case of real functions of one variable, the above characterization (as we know from elementary calculus) shows that iterating derivatives does not change the matrix structure of $Df$.
Proposition 5.6.1. Let $f : \mathbb{R} \to \mathbb{R}$ be an infinitely differentiable function. Then, for any $k \in \mathbb{N}$, $D^k f : \mathbb{R} \to L(\mathbb{R}, \mathbb{R})$.
For general mappings, the situation becomes complicated quite quickly as this example shows.
Example 5.6.2. Let $f : \mathbb{R}^2 \to \mathbb{R}$, then $Df(x, y) \in L(\mathbb{R}^2, \mathbb{R})$; that is, it can be represented as a $1 \times 2$ matrix. Note that the space of $1 \times 2$ matrices (or $L(\mathbb{R}^2, \mathbb{R})$) can be identified with $\mathbb{R}^2$. Thus, we can write $Df : \mathbb{R}^2 \to \mathbb{R}^2$. We can now take the derivative of the $Df$ mapping
\[ D(Df) : \mathbb{R}^2 \to L(\mathbb{R}^2, \mathbb{R}^2). \]
That is, we define $D^2 f(x, y) := D(Df)(x, y)$, and it is a $2 \times 2$ matrix. We illustrate with $f(x, y) = x^2 y^3$. We compute $Df(x, y) = (2xy^3, 3x^2 y^2)^T$ and so
\[ D^2 f(x, y) = \begin{pmatrix} 2y^3 & 6xy^2 \\ 6xy^2 & 6x^2 y \end{pmatrix}. \]
Continuing this way, $D^3 f$ should be a three-dimensional $2 \times 2 \times 2$ array, and this can still be visualized. However, for $D^k f$ with $k \geq 4$ we cannot visualize the derivatives anymore in this way.
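The second derivative matrix of $f(x, y) = x^2 y^3$ can be cross-checked with second-order finite differences. This is a minimal sketch; the helper `hessian_numeric`, the step size and the sample point are illustrative assumptions.

```python
def f(x, y):
    return x**2 * y**3

def hessian_exact(x, y):
    """Matrix D^2 f(x, y) computed by differentiating Df = (2xy^3, 3x^2 y^2)."""
    return [[2 * y**3, 6 * x * y**2],
            [6 * x * y**2, 6 * x**2 * y]]

def hessian_numeric(func, x, y, h=1e-4):
    """Second-order central differences for the three distinct second partials."""
    fxx = (func(x + h, y) - 2 * func(x, y) + func(x - h, y)) / h**2
    fyy = (func(x, y + h) - 2 * func(x, y) + func(x, y - h)) / h**2
    fxy = (func(x + h, y + h) - func(x + h, y - h)
           - func(x - h, y + h) + func(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

point = (1.5, -0.5)  # arbitrary sample point
print(hessian_exact(*point))
print(hessian_numeric(f, *point))
```

The numerical routine uses a single mixed difference for both off-diagonal entries, so it cannot be used to test symmetry; it only checks the values of the entries.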
Example 5.6.3. Consider again $f : \mathbb{R}^4 \to \mathbb{R}^2$. Then $Df(x) \in L(\mathbb{R}^4, \mathbb{R}^2)$ is a $2 \times 4$ matrix, therefore $D^2 f$ already must have the form of a three-dimensional array.
These examples illustrate the notational complexity arising for higher order deriva-
tives of general mappings. In the remainder of this section, our focus is on functions
of several variables and we do not cover the case of more general mappings. The
next section deals specifically with the second derivative which has important appli-
cations.
5.6.1 Second Derivative: functions of several variables
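We consider only functions $f : \mathbb{R}^n \to \mathbb{R}$ for which $Df$ is differentiable; that is, we say that $f : \mathbb{R}^n \to \mathbb{R}$ is twice differentiable if $Df : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R})$ exists and is a differentiable mapping. In Example 5.6.2, we see that we can identify $L(\mathbb{R}^2, \mathbb{R})$ with $\mathbb{R}^2$. This is true in any dimension and we can write $L(\mathbb{R}^n, \mathbb{R}) \simeq \mathbb{R}^n$. Therefore, we also have the identification
\[ L(\mathbb{R}^n, \mathbb{R}^n) \simeq L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \]
where we recall that $L(\mathbb{R}^n, \mathbb{R}^n)$ can be identified with $\mathrm{Mat}_{n \times n}(\mathbb{R})$. We can now state the main theorem of this section.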
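Theorem 5.6.4. Let $f : \mathbb{R}^n \to \mathbb{R}$ be twice differentiable where $x = (x_1, \ldots, x_n)$. Then, $D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ can be represented by the $n \times n$ matrix
\[ \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_2 \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_1} \\
\frac{\partial^2 f}{\partial x_1 \partial x_2} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_2} \\
\vdots & \vdots & & \vdots \\
\frac{\partial^2 f}{\partial x_1 \partial x_n} & \frac{\partial^2 f}{\partial x_2 \partial x_n} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{pmatrix} \]
where each partial derivative is evaluated at $x = (x_1, \ldots, x_n)$.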
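Example 5.6.5. The second derivative of $f(x, y) = \cos(xy)$ is
\[ D^2 f(x, y) = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial y \partial x} \\ \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} -y^2 \cos(xy) & -\sin(xy) - xy \cos(xy) \\ -\sin(xy) - xy \cos(xy) & -x^2 \cos(xy) \end{pmatrix}. \]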
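Example 5.6.6. We compute $D^2 f$ for $f(x, y, z, w) = x^2 + e^{yz} + w$. From Theorem 5.6.4, we have
\begin{align*}
D^2 f(x, y, z, w) &= \begin{pmatrix}
\frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial z \partial x} & \frac{\partial^2 f}{\partial w \partial x} \\
\frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial z \partial y} & \frac{\partial^2 f}{\partial w \partial y} \\
\frac{\partial^2 f}{\partial x \partial z} & \frac{\partial^2 f}{\partial y \partial z} & \frac{\partial^2 f}{\partial z^2} & \frac{\partial^2 f}{\partial w \partial z} \\
\frac{\partial^2 f}{\partial x \partial w} & \frac{\partial^2 f}{\partial y \partial w} & \frac{\partial^2 f}{\partial z \partial w} & \frac{\partial^2 f}{\partial w^2}
\end{pmatrix} \\
&= \begin{pmatrix}
2 & 0 & 0 & 0 \\
0 & z^2 e^{yz} & (1 + zy) e^{yz} & 0 \\
0 & (1 + zy) e^{yz} & y^2 e^{yz} & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\end{align*}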
Definition 5.6.7. A mapping $f : \mathbb{R}^n \to \mathbb{R}$ is of class $C^2$ if
\[ D^2 f : \mathbb{R}^n \to L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R})) \]
exists and is a continuous function.
A very important property of $C^2$ functions is the following.
Theorem 5.6.8 (Clairaut's Theorem). Suppose that $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^2$ mapping, then the matrix $D^2 f(x)$ is symmetric; that is, all mixed partial derivatives are equal:
\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \quad \text{for } i, j = 1, \ldots, n. \]
We see that the functions of Examples 5.6.5 and 5.6.6 satisfy the conclusions of Clairaut's theorem. In particular, Clairaut's theorem implies that the second derivative of any $C^2$ function $f$ is a symmetric matrix; that is,
\[ D^2 f(x) = [D^2 f(x)]^T. \]
We see above that third and higher derivatives are problematic to represent in terms
of matrices. In order to come up with a reasonable way of writing higher derivatives,
we look at a more abstract set-up for the second derivative, which generalizes nicely
to higher dimensions.
Algebraic meaning of the 2nd derivative

For a function of several variables $f : \mathbb{R}^n \to \mathbb{R}$, $Df(x)$ sends vectors in $T_x\mathbb{R}^n$ to vectors in $T_{f(x)}\mathbb{R}$:
\[ Df(x) : T_x\mathbb{R}^n \to T_{f(x)}\mathbb{R}, \quad v \mapsto Df(x)v. \]
We can check that $Df(x)$ is a linear mapping. Let $v, w \in T_x\mathbb{R}^n$ and $\lambda \in \mathbb{R}$ then
\[ Df(x)(v + \lambda w) = Df(x)(v) + Df(x)\lambda w = Df(x)v + \lambda Df(x)w. \]
Therefore, $Df(x) \in L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R})$, but because we can make the identification $T_p\mathbb{R}^m \simeq \mathbb{R}^m$ for any point $p$ and any $m$, we write $Df(x) \in L(\mathbb{R}^n, \mathbb{R})$ as noted above. We know that $D^2 f(x) \in L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ where we should think of $\mathbb{R}^n$ as $T_x\mathbb{R}^n$ and $\mathbb{R}$ as $T_{f(x)}\mathbb{R}$. Therefore, the correct expression is $D^2 f(x) \in L(T_x\mathbb{R}^n, L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R}))$. If we reason in terms of vectors, for $v \in T_x\mathbb{R}^n$ a column vector, then $D^2 f(x)$ applied to $v$ must be a $1 \times n$ vector. This is obtained as follows:
\[ D^2 f(x) : T_x\mathbb{R}^n \to L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R}), \quad v \mapsto v^T D^2 f(x). \]
Since $v^T D^2 f(x) \in L(T_x\mathbb{R}^n, T_{f(x)}\mathbb{R})$, if $w$ is a column vector in $T_x\mathbb{R}^n$ then $v^T D^2 f(x)$ applied to $w$ must be a real number. This is done as follows:
\[ v^T D^2 f(x) : T_x\mathbb{R}^n \to T_{f(x)}\mathbb{R}, \quad w \mapsto v^T D^2 f(x) w. \]
We see that if $D^2 f(x)$ is applied to two vectors $v, w \in T_x\mathbb{R}^n$ then the formula $v^T D^2 f(x) w$ indeed returns an element of $T_{f(x)}\mathbb{R}$.
Example 5.6.9. Consider Example 5.6.5 with $f(x, y) = \cos(xy)$, for which the mixed partial derivative is $\frac{\partial^2 f}{\partial x \partial y} = -\sin(xy) - xy \cos(xy)$. Then, if $v = (v_1, v_2)^T$ and $w = (w_1, w_2)^T$ we have
\begin{align*}
v^T D^2 f(x, y) w &= (v_1, v_2) \begin{pmatrix} -y^2 \cos(xy) & -\sin(xy) - xy \cos(xy) \\ -\sin(xy) - xy \cos(xy) & -x^2 \cos(xy) \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} \\
&= -y^2 \cos(xy) v_1 w_1 - (\sin(xy) + xy \cos(xy))(v_2 w_1 + v_1 w_2) - x^2 \cos(xy) v_2 w_2.
\end{align*}
The following definition formalizes this type of mathematical object.
Definition 5.6.10. A mapping $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is a real bilinear form on $\mathbb{R}^n$ if for any $u, v, u_1, u_2, v_1, v_2 \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$ then
(1) $B(\alpha u_1 + \beta u_2, v) = \alpha B(u_1, v) + \beta B(u_2, v)$, and
(2) $B(u, \alpha v_1 + \beta v_2) = \alpha B(u, v_1) + \beta B(u, v_2)$.
We denote by $L^2(\mathbb{R}^n, \mathbb{R})$ the set of all real bilinear forms. Moreover, if $B(u, v) = B(v, u)$ for all $u, v \in \mathbb{R}^n$ then $B$ is called a symmetric bilinear form and the set of all symmetric bilinear forms is denoted $L^2_s(\mathbb{R}^n, \mathbb{R})$. Consider the following example.
Example 5.6.11. Let $A$ be a symmetric $n \times n$ matrix and $v, w \in \mathbb{R}^n$ be two column vectors. Define
\[ B(v, w) = v^T A w. \tag{5.4} \]
Let $u \in \mathbb{R}^n$ be another vector and $\lambda \in \mathbb{R}$, then
\begin{align*}
B(u + \lambda v, w) = (u + \lambda v)^T A w &= (u^T + \lambda v^T) A w \\
&= u^T A w + \lambda v^T A w \\
&= B(u, w) + \lambda B(v, w).
\end{align*}
A similar calculation shows that
\[ B(v, u + \lambda w) = B(v, u) + \lambda B(v, w). \]
Thus, $B$ defined using (5.4) is a bilinear form. We now show $B$ is symmetric as follows. Note that since $v^T A w$ is a real number, $v^T A w = (v^T A w)^T$ and so
\[ v^T A w = (v^T A w)^T = w^T A^T v = w^T A v, \]
where the last equality holds because $A$ is symmetric. Therefore, $B(v, w) = B(w, v)$.
This last example provides a proof of the following statement.
Proposition 5.6.12. Let $A$ be an $n \times n$ symmetric matrix and $u, v \in \mathbb{R}^n$ be two column vectors. Then,
\[ B(u, v) = u^T A v \]
is a symmetric bilinear form.
Proposition 5.6.12 shows that we can construct symmetric bilinear forms by using symmetric matrices. A quadratic form $q : \mathbb{R}^n \to \mathbb{R}$ is obtained from a symmetric bilinear form $B : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ by the formula
\[ q(v) := B(v, v). \]
Quadratic forms are important in several fields of mathematics and in particular,

they are encountered in the next section.
Example 5.6.13. Consider the quadratic form $q(x, y) = B((x, y)^T, (x, y)^T)$ where $B$ is obtained from the matrix
\[ A = \begin{pmatrix} 1 & -2 \\ -2 & 3 \end{pmatrix}. \]
Then,
\[ q(x, y) = x^2 + 3y^2 - 4xy. \]
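The passage from a symmetric matrix to its bilinear and quadratic forms is easy to sketch in code. The helper names below are illustrative, and the matrix entries assume the reading $A = \begin{pmatrix} 1 & -2 \\ -2 & 3 \end{pmatrix}$, which is the one consistent with $q(x, y) = x^2 + 3y^2 - 4xy$.

```python
def bilinear(A, v, w):
    """B(v, w) = v^T A w for a square matrix A given as a list of rows."""
    n = len(A)
    return sum(v[i] * A[i][j] * w[j] for i in range(n) for j in range(n))

def quadratic(A, v):
    """q(v) = B(v, v)."""
    return bilinear(A, v, v)

# Assumed entries of the symmetric matrix in Example 5.6.13.
A = [[1, -2],
     [-2, 3]]

for (x, y) in [(1, 0), (0, 1), (2, 1), (-1, 3)]:
    print((x, y), quadratic(A, (x, y)), x**2 + 3 * y**2 - 4 * x * y)
```

Because $A$ is symmetric, swapping the two arguments of `bilinear` gives the same value, in line with Proposition 5.6.12.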
Exercises
(1) Let $r : \mathbb{R} \to \mathbb{R}^n$ be a vector function defined by $r(t) = (x_1(t), \ldots, x_n(t))^T$. Show that if the $k$th derivative of $r$ exists, then $D^k r(t) = (x_1^{(k)}(t), \ldots, x_n^{(k)}(t))^T$ where $x_j^{(k)}(t)$ is the $k$th derivative of $x_j(t)$.
(2) Compute the fourth derivative of r(t) = (t2 , e2t , cos(t))T .

(3) Compute the second derivative of the following mappings.
(a) f (x, y, z) = xy exp(xz)
!
x1 x2
(b) f (x1 , x2 , x3 , x4 ) = det
x3 x4
! !
3 1 x
(c) f (x, y) = (x, y)
1 2 y
(4) Let f (x, y) = x2 y 2 + 1
and g(u) = e2u .
Compute D2 (g f )(x, y) by first express-
ing g f explicitly and then taking its second derivative.
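(5) Consider the function
\[ f(x, y) = \begin{cases} \dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \]
Compute the second partial derivatives $\frac{\partial^2 f}{\partial x \partial y}(0, 0)$ and $\frac{\partial^2 f}{\partial y \partial x}(0, 0)$. Is $f$ a $C^2$ function? (Hint: Not an easy question.)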
(6) If q(v) = v T Av where A is a symmetric matrix, compute D2 q(v) in two ways:
by writing q explicitly in coordinate form by setting v = (x1 , . . . , xn )T or by
using the tangency limit and the best linear approximation formula.
(7) Consider the quadratic form $q : \mathbb{R}^3 \to \mathbb{R}$ where $q(v) = B(v, v)$ and $B$ is given by the matrix
\[ A = \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}. \]
Show that
\[ q(x, y, z) = ax^2 + dy^2 + fz^2 + 2bxy + 2cxz + 2eyz. \]
5.7 Taylor expansions
Recall Taylor's expansions for functions $f : \mathbb{R} \to \mathbb{R}$ at a point $x_0$. If $f$ is differentiable $r$ times, then
\begin{align*}
f(x) = f(x_0) &+ \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 \\
&+ \cdots + \frac{f^{(r)}(x_0)}{r!}(x - x_0)^r + R_r(x)
\end{align*}
where $R_r(x)$ is the remainder term and can be expressed as
\[ R_r(x) = \frac{f^{(r+1)}(\xi)}{(r + 1)!}(x - x_0)^{r+1} \]
where $\xi \in (x_0, x)$. We can rewrite the above compactly as
\[ f(x) = \sum_{j=0}^{r} \frac{f^{(j)}(x_0)}{j!}(x - x_0)^j + R_r(x). \]
Example 5.7.1. We compute the Taylor expansion of $\exp(x)$ at $x = 1$ up to order 3:
\begin{align*}
f(x) &= f(1) + f'(1)(x - 1) + \frac{f''(1)}{2}(x - 1)^2 + \frac{f'''(1)}{3!}(x - 1)^3 + R_3(x) \\
&= e + e(x - 1) + \frac{1}{2} e (x - 1)^2 + \frac{1}{6} e (x - 1)^3 + R_3(x) \\
&= e \left( 1 + (x - 1) + \frac{1}{2}(x - 1)^2 + \frac{1}{6}(x - 1)^3 \right) + R_3(x)
\end{align*}
where $R_3(x) = \frac{1}{4!} e^{\xi} (x - 1)^4$ for some $\xi \in (1, x)$. Note that because $e^x$ is infinitely differentiable, we can take the Taylor expansion up to any order.
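The remainder formula gives a computable error bound. The sketch below is illustrative: the function names are assumptions, and the bound uses the fact that $e^{\xi} \leq e^{\max(1, x)}$ for any $\xi$ between $1$ and $x$.

```python
import math

def taylor3_exp_at_1(x):
    """Third-order Taylor polynomial of exp at x0 = 1."""
    d = x - 1.0
    return math.e * (1 + d + d**2 / 2 + d**3 / 6)

def remainder_bound(x):
    """Upper bound on |R_3(x)| = e^xi |x - 1|^4 / 4! with xi between 1 and x."""
    return math.exp(max(1.0, x)) * abs(x - 1.0) ** 4 / 24

for x in [0.5, 1.2, 2.0]:
    error = abs(math.exp(x) - taylor3_exp_at_1(x))
    print(f"x={x}: error={error:.6f}, bound={remainder_bound(x):.6f}")
```

For each sample point the actual error stays below the bound, and both shrink quickly as $x$ approaches $1$.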
Here is an example where the function is only differentiable a finite number of times.
Example 5.7.2. We compute the Taylor expansion of $f(x) = x^{5/2}$ at $x = 0$ up to order 2 because this function is only twice differentiable. We obtain $f(0) = 0$, $f'(0) = 0$, $f''(0) = 0$ with $f'''(x) = \frac{15}{8} x^{-1/2}$, therefore
\[ f(x) = f(0) + f'(0)x + \frac{1}{2} f''(0) x^2 + R_2(x) = 0 + R_2(x) \]
where $R_2(x) = \frac{15}{48} \xi^{-1/2} x^3$ for some $\xi \in (0, x)$.
We now show a Taylor's expansion formula for functions $f : \mathbb{R}^n \to \mathbb{R}$ at some point $x_0 = (x_{01}, \ldots, x_{0n}) \in \mathbb{R}^n$. We do this by using the function $g : \mathbb{R} \to \mathbb{R}$ defined by
\[ g(t) = f(x_0 + t(x - x_0)) \]
where $x = (x_1, \ldots, x_n)$ and begin by computing the Taylor expansion at $t = 0$. We assume that $f$ is infinitely differentiable and so obtain the expression
\[ g(t) = g(0) + g'(0)t + \frac{1}{2!} g''(0) t^2 + \cdots + \frac{g^{(r)}(0)}{r!} t^r + \cdots \]
where $g(0) = f(x_0)$. We now compute the higher derivatives. Set $u(t) = x_0 + t(x - x_0)$ and note that $u(0) = x_0$ and $u'(t) = x - x_0$. Then
\[ g'(t) = \frac{d}{dt} f(u(t)) = Df(u(t)) u'(t) \]
and so
\[ g'(0) = Df(x_0)(x - x_0). \]
We now proceed to the second derivative.
\[ g''(t) = \frac{d}{dt} \left( Df(u(t))(x - x_0) \right). \]
Note first that $Df(u(t))(x - x_0)$ is a $1 \times 1$ matrix and taking its transpose we have
\[ Df(u(t))(x - x_0) = (x - x_0)^T Df(u(t))^T. \]
Therefore,
\begin{align*}
g''(t) = \frac{d}{dt} \left( Df(u(t))(x - x_0) \right) &= \frac{d}{dt} \left( (x - x_0)^T Df(u(t))^T \right) \\
&= (x - x_0)^T D^2 f(u(t)) (x - x_0)
\end{align*}
where the last equality holds because $D^2 f$ is a symmetric matrix. Thus, we obtain
\[ g''(0) = (x - x_0)^T D^2 f(x_0) (x - x_0). \]
We stop the calculations here and summarize our findings in the next result.
Theorem 5.7.3. Let $f : U \subset \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function on $U$. Then, the Taylor expansion of $f$ truncated to second order is
\[ T_2 f(x) := f(x_0) + Df(x_0)(x - x_0) + \frac{1}{2!}(x - x_0)^T D^2 f(x_0)(x - x_0). \]
This result is obtained by taking
\[ g(t) = g(0) + g'(0)t + \frac{1}{2!} g''(0) t^2 \]
and evaluating at $t = 1$; that is, computing $g(1)$. As one can imagine, higher deriva-
tives of g should be yielding higher derivatives of f . However, since we have not
studied formally the higher derivatives of functions of several variables, we do not
pursue the Taylor expansions to orders higher than two. Notice also that we do not
provide a remainder formula, but only look at the second order truncation. We look
at a few examples on how to apply the Taylor expansion formula.
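Example 5.7.4. Compute the Taylor expansion truncated to second order of $f(x, y) = \exp(xy)$ at $(x_0, y_0) = (1, 1)$. The first derivative is
$$Df(x, y) = (y\exp(xy),\ x\exp(xy)), \qquad Df(1, 1) = (e, e).$$
The second derivative is
$$D^2 f(x, y) = \begin{pmatrix} y^2\exp(xy) & \exp(xy)(1 + xy) \\ \exp(xy)(1 + xy) & x^2\exp(xy) \end{pmatrix}$$
and so
$$D^2 f(1, 1) = \begin{pmatrix} e & 2e \\ 2e & e \end{pmatrix}.$$
Therefore,
$$\begin{aligned}
T_2 f(x, y) &= e + (e, e)\begin{pmatrix} x - 1 \\ y - 1 \end{pmatrix} + \frac{1}{2}(x - 1, y - 1)\begin{pmatrix} e & 2e \\ 2e & e \end{pmatrix}\begin{pmatrix} x - 1 \\ y - 1 \end{pmatrix} \\
&= e\left(1 + (x - 1) + (y - 1) + \tfrac{1}{2}(x - 1, y - 1)\begin{pmatrix} (x - 1) + 2(y - 1) \\ 2(x - 1) + (y - 1) \end{pmatrix}\right) \\
&= e\left(-1 + x + y + \tfrac{1}{2}(x - 1)^2 + 2(x - 1)(y - 1) + \tfrac{1}{2}(y - 1)^2\right) \\
&= e\left(2 - 2x - 2y + \tfrac{1}{2}x^2 + 2xy + \tfrac{1}{2}y^2\right).
\end{aligned}$$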
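As a numerical sanity check of Theorem 5.7.3 on this example, the following minimal sketch evaluates the second-order truncation directly from $Df(1,1)$ and $D^2 f(1,1)$ and compares it with $f$ near $(1,1)$. The function names and the test points are illustrative assumptions, not part of the text.

```python
import math

def f(x, y):
    """The function of Example 5.7.4."""
    return math.exp(x * y)

def taylor2(x, y):
    """Second-order Taylor truncation of exp(xy) at (1, 1)."""
    dx, dy = x - 1.0, y - 1.0
    e = math.e
    grad = (e, e)                      # Df(1, 1)
    hess = ((e, 2 * e), (2 * e, e))    # D^2 f(1, 1)
    linear = grad[0] * dx + grad[1] * dy
    quadratic = (hess[0][0] * dx * dx
                 + 2 * hess[0][1] * dx * dy
                 + hess[1][1] * dy * dy)
    return e + linear + 0.5 * quadratic

# The error should shrink roughly like h^3 as the point approaches (1, 1).
for h in (0.1, 0.01):
    x, y = 1 + h, 1 - 2 * h
    print(h, abs(f(x, y) - taylor2(x, y)))
```

Shrinking the step by a factor of ten should reduce the printed error by roughly a factor of a thousand, which is the behaviour expected of a second-order truncation.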
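Example 5.7.5. We compute the Taylor expansion truncated to second order of $f(x, y, z) = \sin(xyz)$ at $(x, y, z) = (1, \pi, 1/2)$, where $xyz = \pi/2$ and $f(1, \pi, 1/2) = 1$. The first derivative is
$$Df(x, y, z) = (yz\cos(xyz),\ xz\cos(xyz),\ xy\cos(xyz)), \qquad Df(1, \pi, 1/2) = (0, 0, 0).$$
Writing $s = \sin(xyz)$ and $c = \cos(xyz)$, the second derivative is
$$D^2 f(x, y, z) = \begin{pmatrix}
-(yz)^2 s & zc - xyz^2 s & yc - xy^2 z s \\
zc - xyz^2 s & -(xz)^2 s & xc - x^2 yz s \\
yc - xy^2 z s & xc - x^2 yz s & -(xy)^2 s
\end{pmatrix}$$
and so, since $c = 0$ and $s = 1$ at the point,
$$D^2 f(1, \pi, \tfrac{1}{2}) = -\begin{pmatrix}
\frac{\pi^2}{4} & \frac{\pi}{4} & \frac{\pi^2}{2} \\
\frac{\pi}{4} & \frac{1}{4} & \frac{\pi}{2} \\
\frac{\pi^2}{2} & \frac{\pi}{2} & \pi^2
\end{pmatrix}.$$
The Taylor expansion truncated to second order is, setting $u = (x - 1, y - \pi, z - \tfrac{1}{2})$,
$$\begin{aligned}
T_2 f(x, y, z) &= 1 - \frac{1}{2}\, u \begin{pmatrix}
\frac{\pi^2}{4} & \frac{\pi}{4} & \frac{\pi^2}{2} \\
\frac{\pi}{4} & \frac{1}{4} & \frac{\pi}{2} \\
\frac{\pi^2}{2} & \frac{\pi}{2} & \pi^2
\end{pmatrix} u^T \\
&= 1 - \frac{\pi^2}{8}(x - 1)^2 - \frac{1}{8}(y - \pi)^2 - \frac{\pi^2}{2}\left(z - \tfrac{1}{2}\right)^2 - \frac{\pi}{4}(x - 1)(y - \pi) \\
&\quad - \frac{\pi^2}{2}(x - 1)\left(z - \tfrac{1}{2}\right) - \frac{\pi}{2}(y - \pi)\left(z - \tfrac{1}{2}\right).
\end{aligned}$$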
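The entries of $D^2 f(1, \pi, 1/2)$ can be cross-checked without symbolic differentiation. The sketch below is an illustration only: it approximates the Hessian of $\sin(xyz)$ by central differences (a numerical technique not used in the text, with an assumed step size `h`) and compares it with the matrix obtained above.

```python
import math

def f(p):
    """f(x, y, z) = sin(xyz), the function of Example 5.7.5."""
    x, y, z = p
    return math.sin(x * y * z)

def numerical_hessian(func, p, h=1e-3):
    """Approximate D^2 func(p) with central differences of step h."""
    n = len(p)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Move coordinate i by si*h and coordinate j by sj*h.
                q = list(p)
                q[i] += si * h
                q[j] += sj * h
                return func(q)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return hess

pi = math.pi
point = (1.0, pi, 0.5)
exact = [[-pi**2 / 4, -pi / 4, -pi**2 / 2],
         [-pi / 4, -1 / 4, -pi / 2],
         [-pi**2 / 2, -pi / 2, -pi**2]]
approx = numerical_hessian(f, point)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(3) for j in range(3))
print(max_error)
```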
Exercises
(1) Compute the Taylor expansions truncated to second order of the following functions.
(a) $f(x, y) = x^4 y^3$ at $(x, y) = (1, 1)$.
(b) $f(x, y, z) = \ln(x + y + z)$ at $(x, y, z) = (1, 0, 0)$.
(2) Consider the case $f(x, y)$ evaluated at $(x_0, y_0)$ and compute explicitly $g'''(0)$. Rearrange the terms of $\frac{1}{3!}g'''(0)$ to obtain
$$\begin{aligned}
&\frac{1}{3!}\frac{\partial^3 f}{\partial x^3}(x_0, y_0)(x - x_0)^3 + \frac{1}{2!}\frac{\partial^3 f}{\partial x^2 \partial y}(x_0, y_0)(x - x_0)^2 (y - y_0) \\
&\quad + \frac{1}{2!}\frac{\partial^3 f}{\partial x \partial y^2}(x_0, y_0)(x - x_0)(y - y_0)^2 + \frac{1}{3!}\frac{\partial^3 f}{\partial y^3}(x_0, y_0)(y - y_0)^3.
\end{aligned}$$
This is the expression for the cubic terms of the Taylor expansion of $f(x, y)$ in powers of $(x - x_0, y - y_0)$. It can be written in short form as $\frac{1}{3!}D^3 f(x_0, y_0)(u - u_0, u - u_0, u - u_0)$ where $u - u_0 = (x - x_0, y - y_0)$. The operator $D^3 f(x, y)$ is an example of a trilinear form.
(3) Using the formula of the previous exercise, compute the cubic terms of the Taylor expansion of
(a) $f(x, y) = x^4 y^3$ at $(1, 1)$.
(b) $f(x, y) = \exp(xy)$ at $(1, 1)$.
(4) Compute the Taylor expansion of $f(x, y) = x^2 + xy - 2y^2$ at $(x, y) = (0, 0)$ truncated to second order. Set $y = x$ in the Taylor expansion and determine the curve obtained. Set $y = -x/2$ in the Taylor expansion and determine the curve obtained. Draw each curve in $xyz$-space.
6 Applications of Differential Calculus
This chapter contains a collection of different topics where the derivative of a mapping is applied. The first section is on optimization, which is an important topic in Calculus; however, its results are not used in the following chapters. This is followed by a section on parametrizations, which is crucial for the remainder of the book. The next section is on differential operators; it puts on firm footing the concept of differentials and gradients and introduces more differential operators. This section is also used in the following chapters. Finally, we conclude by discussing how Clairaut's theorem is used to complete the theory of exact 1-forms discussed in Chapter 4.
6.1 Optimization
This section is concerned with generalizing the optimization theory of functions $f : \mathbb{R} \to \mathbb{R}$ to the setting of general functions of several variables $f : \mathbb{R}^n \to \mathbb{R}$. We begin by generalizing the concept of critical point.
Critical Points and Extremals
Recall that for differentiable functions $f : \mathbb{R} \to \mathbb{R}$, a point $x_0$ is a critical point if $f'(x_0) = 0$. This generalizes naturally to the $n$-variable case.
Definition 6.1.1. A point $u_0 \in \mathbb{R}^n$ is a critical point of a differentiable mapping $f : \mathbb{R}^n \to \mathbb{R}$ if
$$Df(u_0) = 0,$$
which is the same as $\nabla f(u_0) = 0$.
We find the critical points of the following examples.
Example 6.1.2. Let $f(x, y) = 2x^2 - x^3 y + xy$, then
$$Df(x, y) = (4x - 3x^2 y + y,\ -x^3 + x) = (0, 0)$$
is solved by seeing first that $-x^3 + x = 0$ has solutions $x = 0$ and $x = \pm 1$. Setting $x = 0$ in the first equation we have the solution $y = 0$. Substituting $x = \pm 1$ in the first equation we have $\pm 4 - 3y + y = 0$, which has solution $y = \pm 2$. There are three critical points $(x_0, y_0) = (0, 0)$, $(x_1, y_1) = (1, 2)$ and $(x_2, y_2) = (-1, -2)$.
Example 6.1.3. Consider $f(x, y, z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z$. We set
$$Df(x, y, z) = (3z^2 - 3 + 4y,\ -2 + 4x,\ 6xz + 2z - 15) = (0, 0, 0)$$
and solve for $(x, y, z)$. We have three equations in three unknowns:
$$\begin{aligned}
3z^2 - 3 + 4y &= 0 \\
-2 + 4x &= 0 \\
6xz + 2z - 15 &= 0.
\end{aligned}$$
Solving the second equation we obtain $x = 1/2$, which we substitute in the third equation. Thus, we obtain $5z - 15 = 0$ with solution $z = 3$. Finally, substituting in the first equation we have $y = -6$. Therefore, $f$ has a unique critical point at $u_0 = (x_0, y_0, z_0) = (1/2, -6, 3)$.
Example 6.1.4. Consider the mapping
$$g(x, y, z, u) = 6xyz - zx^2 - z^2 y - \frac{9}{2}x^2 + 9x - \frac{1}{2}x^4 - 2uy + u.$$
Set
$$Dg(x, y, z, u) = (6yz - 2xz - 9x + 9 - 2x^3,\ 6xz - z^2 - 2u,\ 6xy - x^2 - 2yz,\ -2y + 1) = (0, 0, 0, 0)$$
and we obtain the nonlinear system of equations
$$\begin{aligned}
6yz - 2xz - 9x + 9 - 2x^3 &= 0 \\
6xz - z^2 - 2u &= 0 \\
6xy - x^2 - 2yz &= 0 \\
-2y + 1 &= 0.
\end{aligned}$$
The fourth equation is solved for $y = 1/2$ and we substitute in the other equations to obtain
$$\begin{aligned}
3z - 2xz - 9x + 9 - 2x^3 &= 0 \\
6xz - z^2 - 2u &= 0 \\
3x - x^2 - z &= 0.
\end{aligned}$$
Solving the last equation we have $z = 3x - x^2$ and we substitute in the first equation to obtain
$$3(3x - x^2) - 2x(3x - x^2) - 9x + 9 - 2x^3 = -9x^2 + 9 = 0$$
which has solutions $x = \pm 1$. Therefore, for $x = 1$ we have $z = 2$ and for $x = -1$ we have $z = -4$. Finally, we solve for $u$ in the second equation,
$$u = 3xz - \frac{1}{2}z^2;$$
substituting $x = 1$, $z = 2$ we have $u = 4$, while for $x = -1$, $z = -4$ we have $u = 4$. The critical points are
$$(x_0, y_0, z_0, u_0) = (1, 1/2, 2, 4) \quad\text{and}\quad (x_1, y_1, z_1, u_1) = (-1, 1/2, -4, 4).$$
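Solving a nonlinear system by hand invites sign slips, so it is worth substituting the answers back. The short sketch below (function and variable names are illustrative) evaluates the four partial derivatives of $g$ at the two critical points found above.

```python
def grad_g(x, y, z, u):
    """Partial derivatives of g from Example 6.1.4, in the order (x, y, z, u)."""
    return (
        6 * y * z - 2 * x * z - 9 * x + 9 - 2 * x**3,
        6 * x * z - z**2 - 2 * u,
        6 * x * y - x**2 - 2 * y * z,
        -2 * y + 1,
    )

critical_points = [(1, 0.5, 2, 4), (-1, 0.5, -4, 4)]
for p in critical_points:
    # Each gradient should be the zero vector.
    print(p, grad_g(*p))
```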

Recall that for a function $f : \mathbb{R}^n \to \mathbb{R}$, the tangent space at $p = (x, f(x)) \in \operatorname{graph}(f)$ is
$$T_p S = \{(v^T, (Df(x)v)^T) \mid v \in T_x \mathbb{R}^n\}.$$
Thus, at a critical point $u_0$, if $p_0 = (u_0, f(u_0))$ then
$$T_{p_0} S = \{(v^T, 0) \mid v \in T_{u_0} \mathbb{R}^n\}$$
and we can say that $T_{p_0} S$ is parallel to $T_{u_0} \mathbb{R}^n$. For functions $f : \mathbb{R} \to \mathbb{R}$ and $f : \mathbb{R}^2 \to \mathbb{R}$, this leads to saying that $T_{p_0} S$ is horizontal, since it is perpendicular to the vertical axis.
We know that the relationship between critical points and extremals is important. However, the definition of extremal is independent of the concept of critical point. For instance, the function $f(x) = |x|$ has a (global) minimum at $x = 0$, but the function is not differentiable at $x = 0$ and so we cannot say that $x = 0$ is a critical point.
Definition 6.1.5. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function of several variables.
(1) A point $u_0 \in \mathbb{R}^n$ is a local minimum of $f$ if there exists a number $r > 0$ such that $f(x) \geq f(u_0)$ for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < r$.
(2) A point $u_0 \in \mathbb{R}^n$ is a local maximum of $f$ if there exists a number $R > 0$ such that $f(x) \leq f(u_0)$ for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < R$.
We consider an example.
Example 6.1.6. We show that $(x_0, y_0) = (0, 0)$ is a local maximum of $f(x, y) = 2 - x^2 - x^4$. This is straightforward because $f(0, 0) = 2$ and, for all values of $(x, y)$, the expression $x^2 + x^4 \geq 0$. Therefore, $f(x, y) = 2 - (x^2 + x^4) \leq 2 = f(0, 0)$.
We establish the link between extremals and critical points.
Theorem 6.1.7. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a differentiable mapping. If $u_0$ is a local maximum or a local minimum of $f$, then $u_0$ is a critical point of $f$.
Proof. Let $u_0$ be a local maximum of $f$ (the case of a local minimum is identical with the inequalities reversed). Then there exists $R > 0$ such that $f(x) \leq f(u_0)$ for all $x \in \mathbb{R}^n$ satisfying $\|x - u_0\| < R$. Consider the path $x_j(t) = u_0 + te_j$ where $e_j$ is the $j$th canonical basis vector of $\mathbb{R}^n$. Then, $x_j(0) = u_0$, $x_j'(0) = e_j$ and we consider the function $g_j : \mathbb{R} \to \mathbb{R}$ defined by $g_j(t) = f(x_j(t))$. Then $g_j(0) = f(u_0)$ and for $|t|$ small enough we have $\|x_j(t) - u_0\| < R$, which implies $g_j(t) = f(x_j(t)) \leq f(u_0) = g_j(0)$. That is, $g_j$ has a local maximum at $t = 0$. Therefore, for all $j = 1, \ldots, n$,
$$0 = g_j'(0) = \frac{d}{dt} f(x_j(t))\Big|_{t=0} = Df(x_j(0))\,x_j'(0) = Df(u_0)e_j.$$
This means all partial derivatives of $f$ at $u_0$ vanish and so $Df(u_0) = 0$; $u_0$ is a critical point of $f$.
Theorem 6.1.7 gives us a criterion to identify possible extrema for differentiable functions. As in the simpler case of functions $f : \mathbb{R} \to \mathbb{R}$, the second derivative contains information about whether a critical point is a local minimum, local maximum or an inflection point. This is the topic of the next section.
6.1.1 Second derivative criterion for extremals
Consider the case of a differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$ with a critical point at $u_0 = (x_0, y_0)$. Theorem 6.1.7 tells us that $u_0$ could be a local maximum or a local minimum, but it need not be either. The hyperbolic paraboloid $z = f(x, y) = x^2 - y^2$ has a critical point at $(0, 0)$, but this critical point is not an extremal. Indeed, for $x = 0$ we have $z = f(0, y) = -y^2$, which is a parabola with a maximum at $(0, 0)$, and for $y = 0$, $z = f(x, 0) = x^2$ has a minimum at $(0, 0)$. This type of critical point is an example of a saddle point: a critical point which is a local minimum in at least one direction and a local maximum in at least one direction. The next example shows another situation.
Example 6.1.8. Consider $z = f(x, y) = x^3 - 3xy^2$; this is the surface shown in Figure 5.4. Then, $Df(x, y) = (3x^2 - 3y^2, -6xy) = (0, 0)$ implies $(x, y) = (0, 0)$ is the only critical point. This critical point is not a local extremum or a saddle point. A problem in the exercises asks to show that, in fact, any straight line path through $(0, 0)$ has a point of inflection at $(0, 0)$, except for the three lines $x = 0$ and $y = \pm x/\sqrt{3}$, along which $f$ vanishes identically.
The criterion for determining local maxima and minima is the second derivative test and it is sometimes expressed as follows in elementary calculus courses.
Proposition 6.1.9. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$. Define
$$D(x_0, y_0) := \frac{\partial^2 f}{\partial x^2}(x_0, y_0)\,\frac{\partial^2 f}{\partial y^2}(x_0, y_0) - \left(\frac{\partial^2 f}{\partial x \partial y}(x_0, y_0)\right)^2.$$
(a) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$ and $D > 0$ then $(x_0, y_0)$ is a local minimum.
(b) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$ and $D > 0$ then $(x_0, y_0)$ is a local maximum.
(c) If $D(x_0, y_0) < 0$ then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(d) If $D(x_0, y_0) = 0$, then the test is inconclusive.
Proof. (a) Let $v = (v_1, v_2)^T \in \mathbb{R}^2$ be a nonzero vector and consider the straight line path $r(t) = (x_0 + tv_1, y_0 + tv_2)$ through $u_0$ in $\mathbb{R}^2$. Define the function $h(t) = f(r(t))$. Then, one can check that $h(0) = f(u_0)$ and $h'(0) = 0$. For simplicity of notation let
$$a = \frac{\partial^2 f(u_0)}{\partial x^2}, \quad b = \frac{\partial^2 f(u_0)}{\partial x \partial y}, \quad c = \frac{\partial^2 f(u_0)}{\partial y^2}.$$
Then
$$\begin{aligned}
h''(0) &= v^T D^2 f(u_0) v \\
&= av_1^2 + cv_2^2 + 2bv_1 v_2 \\
&= a(v_1 + (b/a)v_2)^2 + (c - (b^2/a))v_2^2 \\
&= a(v_1 + (b/a)v_2)^2 + \frac{1}{a}(ac - b^2)v_2^2 > 0
\end{aligned} \tag{6.1}$$
because $a > 0$ and $D = ac - b^2 > 0$. Thus, $t = 0$ is a local minimum of $h$ for all straight line paths $r(t)$ through $u_0$. Thus, for $|t|$ small enough $f(r(t)) \geq f(u_0)$ for all paths, hence $f(x, y) \geq f(u_0)$ for all $(x, y)$ close enough to $u_0$ and so it is a local minimum. A similar argument works for local maxima in part (b) using (6.1). For part (c), suppose first that $a > 0$. Then (6.1) gives $h''(0) = av_1^2 > 0$ for the path with $v = (v_1, 0)$, while $h''(0) = \frac{1}{a}(ac - b^2)v_2^2 < 0$ for the path with $v_1 = -(b/a)v_2$, since $D < 0$. If $a < 0$, the signs are reversed, and if $a = 0$ then $b \neq 0$ and $h''(0) = v_2(cv_2 + 2bv_1)$ again takes both signs. In every case $h''(0)$ has different signs depending on the path. Therefore, $u_0$ is a local minimum with respect to one path and a local maximum with respect to another. Hence, $u_0$ is a saddle point. For part (d), Example 6.1.8, where $D = 0$, shows that the situation may be other than a local extremum or saddle point and so we cannot conclude.
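The four cases of Proposition 6.1.9 translate directly into a small decision procedure. The sketch below is one possible encoding; the function name and the string labels are illustrative choices, and the inputs are the second partial derivatives $a$, $b$, $c$ at a critical point, as in the proof.

```python
def second_derivative_test(a, b, c):
    """Classify a critical point from a = f_xx, b = f_xy, c = f_yy."""
    d = a * c - b * b          # the quantity D of Proposition 6.1.9
    if d > 0:
        # d > 0 forces a != 0, so the sign of a decides between (a) and (b).
        return "local minimum" if a > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"

# f = x^2 - y^2 at (0, 0): a = 2, b = 0, c = -2
print(second_derivative_test(2, 0, -2))
# f = x^3 - 3xy^2 at (0, 0): all second derivatives vanish
print(second_derivative_test(0, 0, 0))
```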
Example 6.1.10. We continue with Example 6.1.2, where $f(x, y) = 2x^2 - x^3 y + xy$, and determine the nature of its critical points, which lie at $x = 0$ and $x = \pm 1$. We compute
$$\frac{\partial^2 f}{\partial x^2} = 4 - 6xy$$
which is positive at $(0, 0)$. However, since $\frac{\partial^2 f}{\partial y^2} = 0$ and $\frac{\partial^2 f}{\partial x \partial y} = -3x^2 + 1$,
$$D = \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial x \partial y}\right)^2 = -(-3x^2 + 1)^2$$
which means $D$ is negative at $x = 0$ and at $x = \pm 1$. Therefore, all critical points are saddle points.
We now show that the conditions of the second derivative test of Proposition 6.1.9 can be rewritten in terms of the second derivative $D^2 f$. Before doing so, we introduce a widely used terminology for the second derivative when it is expressed as a matrix.
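Definition 6.1.11. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function. The Hessian matrix of $f$ at $u_0$ is
$$H(f)(u_0) := D^2 f(u_0).$$
The Hessian matrix is just another name for the matrix of the second derivative of a $C^2$ function, just as the Jacobian matrix is the matrix of the first derivative. Using the Hessian matrix we notice that for a $C^2$ function $f : \mathbb{R}^2 \to \mathbb{R}$,
$$\det H(f)(u_0) = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial y \partial x} \\[2mm] \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{vmatrix} = \frac{\partial^2 f}{\partial x^2}\,\frac{\partial^2 f}{\partial y^2} - \left(\frac{\partial^2 f}{\partial y \partial x}\right)^2 = D(x_0, y_0).$$
Thus, we rewrite the second derivative test as follows.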
Proposition 6.1.12. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$.
(a) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) > 0$ and $\det H(f)(x_0, y_0) > 0$ then $(x_0, y_0)$ is a local minimum.
(b) If $\frac{\partial^2 f}{\partial x^2}(x_0, y_0) < 0$ and $\det H(f)(x_0, y_0) > 0$ then $(x_0, y_0)$ is a local maximum.
(c) If $\det H(f)(x_0, y_0) < 0$ then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(d) If $\det H(f)(x_0, y_0) = 0$ then the test is inconclusive.
We now come to the last characterization of the second derivative test. Recall that if $A$ is an $n \times n$ matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$ then $\det A = \lambda_1 \cdots \lambda_n$. Therefore, if $\lambda_1, \lambda_2$ are the eigenvalues of a $2 \times 2$ Hessian matrix $H(f)(x_0, y_0)$ then
$$\det H(f)(x_0, y_0) = \lambda_1 \lambda_2.$$
Using this correspondence, the second derivative test can be rewritten.
Proposition 6.1.13. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be a $C^2$ function with a critical point at $u_0 = (x_0, y_0)$.
(1) If the eigenvalues of $H(f)(x_0, y_0)$ are both positive, then $(x_0, y_0)$ is a local minimum.
(2) If the eigenvalues of $H(f)(x_0, y_0)$ are both negative, then $(x_0, y_0)$ is a local maximum.
(3) If the eigenvalues of $H(f)(x_0, y_0)$ have different signs, then $(x_0, y_0)$ is neither a maximum nor a minimum, but a saddle point.
(4) If $H(f)(x_0, y_0)$ has a zero eigenvalue, then the test is inconclusive.
Proof. Write
$$H(f)(x_0, y_0) = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$$
and let $\lambda_1, \lambda_2$ be its eigenvalues. Then $ac - b^2 = \lambda_1 \lambda_2$ and $a + c = \lambda_1 + \lambda_2$. Thus $ac - b^2 > 0$ if and only if $\lambda_1$ and $\lambda_2$ are both positive or both negative. In that case $ac > b^2 \geq 0$, so $a$ and $c$ have the same sign, which by $a + c = \lambda_1 + \lambda_2$ is the sign of the eigenvalues; this corresponds to the first two cases of local minimum and local maximum from Proposition 6.1.12. If $\lambda_1, \lambda_2$ have different signs then $ac - b^2 < 0$. The last case is straightforward.
Let us apply Proposition 6.1.13 to some examples.
Example 6.1.14. Consider $f(x, y) = \cos x \cos y$ with $0 \leq x < \pi$ and $0 \leq y < \pi$. We set $Df(x, y) = (-\sin x \cos y, -\cos x \sin y) = (0, 0)$ and critical points are found at $(x, y) = (0, 0)$ and $(x, y) = (\pi/2, \pi/2)$. The Hessian matrix is
$$H(f)(x, y) = \begin{pmatrix} -\cos x \cos y & \sin x \sin y \\ \sin x \sin y & -\cos x \cos y \end{pmatrix}$$
and we evaluate at each critical point. We have
$$H(f)(0, 0) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad H(f)(\pi/2, \pi/2) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
Because $H(f)(0, 0)$ is a diagonal matrix, the eigenvalues are the elements of the diagonal: $-1$ with multiplicity two. Thus, $(0, 0)$ is a local maximum. The eigenvalues of $H(f)(\pi/2, \pi/2)$ are given by solving the characteristic polynomial $p(\lambda) = \det(\lambda I - H(f)(\pi/2, \pi/2)) = 0$. A calculation shows that $p(\lambda) = \lambda^2 - 1 = 0$ and the eigenvalues are $\pm 1$, which implies that $(\pi/2, \pi/2)$ is a saddle point.
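For a symmetric $2 \times 2$ matrix with rows $(a, b)$ and $(b, c)$ the eigenvalues have the closed form $\frac{a + c}{2} \pm \sqrt{\left(\frac{a - c}{2}\right)^2 + b^2}$, so Proposition 6.1.13 is easy to apply numerically. The following sketch does this for the two critical points of this example; the helper names and the tolerance `eps` used to detect a zero eigenvalue are assumptions made for illustration.

```python
import math

def hessian(x, y):
    """Entries (a, b, c) of the Hessian [[a, b], [b, c]] of cos(x) cos(y)."""
    a = -math.cos(x) * math.cos(y)
    b = math.sin(x) * math.sin(y)
    return a, b, a

def symmetric_eigenvalues(a, b, c):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius

def classify(eigenvalues, eps=1e-9):
    low, high = eigenvalues
    if abs(low) <= eps or abs(high) <= eps:
        return "inconclusive"
    if low > 0:
        return "local minimum"
    if high < 0:
        return "local maximum"
    return "saddle point"

for point in [(0.0, 0.0), (math.pi / 2, math.pi / 2)]:
    eigs = symmetric_eigenvalues(*hessian(*point))
    print(point, eigs, classify(eigs))
```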
Example 6.1.15. Consider $f(x, y) = x^3 - 3xy^2$ which we know has a unique critical point at $(0, 0)$ from Example 6.1.8. The Hessian at $(0, 0)$ is
$$H(f)(0, 0) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
and both eigenvalues are zero. Therefore, the second derivative test is inconclusive.
We can now turn our attention to the general case of $C^2$ functions of several variables. Before we do so, we restrict our attention to critical points which exclude the cases that are inconclusive in the classification given by Proposition 6.1.13.
Definition 6.1.16. A critical point $u_0$ of $f : \mathbb{R}^n \to \mathbb{R}$ is a nondegenerate critical point of $f$ if $\det H(f)(u_0) \neq 0$.
The characterization of local minimum, local maximum and saddle points in terms of eigenvalues generalizes in a straightforward manner to functions of several variables. Note that in $n$ dimensions, the concept of saddle point has many more cases than in the two-dimensional case. We do not seek to classify all those cases, which is the topic of Morse theory, and we only consider a few examples.
Example 6.1.17. Let $f(x, y, z, u) = x^2 + y^2 + z^2 - u^2$, then
$$Df(x, y, z, u) = (2x, 2y, 2z, -2u)$$
and there is a unique critical point at $(0, 0, 0, 0)$. Setting $y = z = u = 0$, $x = z = u = 0$ and $x = y = u = 0$ shows that $f$ has a minimum at $(0, 0, 0, 0)$ along the respective paths, while for $x = y = z = 0$, $f$ has a maximum at $(0, 0, 0, 0)$ along the path.
Consider instead $f(x, y, z, u) = x^2 + y^2 - z^2 - u^2$. Then, the critical point at $(0, 0, 0, 0)$ is a minimum along the $x$ and $y$ axis paths and a maximum along the $z$ and $u$ axis paths. We see that as the number of dimensions increases, there are different types of saddle points with different numbers of (local) minimum and maximum directions.
Considering the variety of saddle points possible, one can imagine that cases where a zero eigenvalue occurs in the Hessian matrix at a critical point can easily get out of control. By focusing on nondegenerate critical points, the situation remains manageable and we have the following result.
Theorem 6.1.18. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^2$ function with a nondegenerate critical point at $u_0$. Then,
(1) If all the eigenvalues of the Hessian $H(f)(u_0)$ are positive, then $u_0$ is a local minimum.
(2) If all the eigenvalues of the Hessian $H(f)(u_0)$ are negative, then $u_0$ is a local maximum.
(3) If the Hessian $H(f)(u_0)$ has both positive and negative eigenvalues, then $u_0$ is a saddle point.
Proof. Using paths as in the proof of Proposition 6.1.9 is quite cumbersome and we seek a different approach. Instead, we use the second-order Taylor expansion of $f$ at a critical point $u_0$:
$$T_2 f(x) = f(u_0) + \frac{1}{2}(x - u_0)^T H(f)(u_0)(x - u_0)$$
since $Df(u_0) = 0$. Because the matrix $H(f)(u_0)$ is symmetric, it has only real eigenvalues and is diagonalizable. This is a well-known result which is found in any textbook on linear algebra [6]. Therefore, there exists an orthogonal matrix $P$ such that $P^T H(f)(u_0) P = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ is the diagonal matrix with entries given by the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $H(f)(u_0)$ from top left to bottom right. Note that $H(f)(u_0) = P \operatorname{diag}(\lambda_1, \ldots, \lambda_n) P^T$ and so the second-order Taylor expansion is written
$$T_2 f(x) = f(u_0) + \frac{1}{2}(x - u_0)^T P \operatorname{diag}(\lambda_1, \ldots, \lambda_n) P^T (x - u_0).$$
Set $v = P^T(x - u_0) := (v_1, \ldots, v_n)^T$, then
$$f(x) \approx f(u_0) + \frac{1}{2} v^T \operatorname{diag}(\lambda_1, \ldots, \lambda_n) v = f(u_0) + \frac{1}{2}\left(\lambda_1 v_1^2 + \cdots + \lambda_n v_n^2\right).$$
We first proceed as if the neglected higher-order terms were absent. For part (1), if $\lambda_1 > 0, \ldots, \lambda_n > 0$ then $f(x) \geq f(u_0)$. For part (2), if $\lambda_1 < 0, \ldots, \lambda_n < 0$, then $f(x) \leq f(u_0)$. Finally, for part (3), suppose that $\lambda_1 > 0$ and $\lambda_2 < 0$; then for $x$ values such that $v_1 \neq 0$ and $v_2 = \cdots = v_n = 0$ we have $f(x) \geq f(u_0)$, while for $x$ values such that $v_1 = 0$, $v_2 \neq 0$, $v_3 = \cdots = v_n = 0$ we have $f(x) \leq f(u_0)$. The argument is similar if $\lambda_i < 0$, $\lambda_j > 0$ for $i \neq j$. Adding back the higher-order terms and considering $x$ close enough to $u_0$, the inequalities above are maintained and the result holds.
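The proof relies on diagonalizing the symmetric Hessian by an orthogonal change of variables. As an illustration of how the classification can be carried out in practice, the sketch below approximates the eigenvalues with cyclic Jacobi rotations, a standard numerical method that is not discussed in the text, and then applies the three cases of Theorem 6.1.18. The function names, the tolerances and the label "degenerate" for a (numerically) zero eigenvalue are assumptions of this sketch.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-22, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [[float(entry) for entry in row] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Angle of the plane rotation that zeroes the entry a[p][q].
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # multiply by the rotation on the right
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):      # multiply by its transpose on the left
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def classify_critical_point(hessian, eps=1e-9):
    eigenvalues = jacobi_eigenvalues(hessian)
    if any(abs(lam) <= eps for lam in eigenvalues):
        return "degenerate"
    if all(lam > 0 for lam in eigenvalues):
        return "local minimum"
    if all(lam < 0 for lam in eigenvalues):
        return "local maximum"
    return "saddle point"

# Hessian of x^2 + y^2 + z^2 - u^2 at the origin (Example 6.1.17).
print(classify_critical_point([[2, 0, 0, 0], [0, 2, 0, 0],
                               [0, 0, 2, 0], [0, 0, 0, -2]]))
```

Only the signs of the eigenvalues matter for the classification, so modest accuracy in the eigenvalue computation is enough as long as no eigenvalue is close to zero.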
Using Theorem 6.1.18, we can now return to Examples 6.1.3 and 6.1.4, and determine
the nature of the critical points.
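Example 6.1.19. For $f(x, y, z) = 3xz^2 - 3x - 2y + 4xy + z^2 - 15z$ the critical point is $(x, y, z) = (1/2, -6, 3)$. The second derivative is
$$H(f)(x, y, z) = \begin{pmatrix} 0 & 4 & 6z \\ 4 & 0 & 0 \\ 6z & 0 & 6x + 2 \end{pmatrix}$$
and so the Hessian matrix at the critical point is
$$H(f)(1/2, -6, 3) = \begin{pmatrix} 0 & 4 & 18 \\ 4 & 0 & 0 \\ 18 & 0 & 5 \end{pmatrix}.$$
The characteristic polynomial is given by
$$\begin{aligned}
p(\lambda) &= \det(H(f)(1/2, -6, 3) - \lambda I) \\
&= \det \begin{pmatrix} -\lambda & 4 & 18 \\ 4 & -\lambda & 0 \\ 18 & 0 & -\lambda + 5 \end{pmatrix} \\
&= -\lambda^3 + 5\lambda^2 + 340\lambda - 80.
\end{aligned}$$
We must solve $p(\lambda) = 0$. This can be done using a mathematical computation software package and we obtain the approximate eigenvalues $\lambda_1 \approx -16.24$, $\lambda_2 \approx 0.23$ and $\lambda_3 \approx 21.01$. Since the Hessian has both positive and negative eigenvalues, the critical point is a saddle point.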
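Example 6.1.20. For $g(x, y, z, u) = 6xyz - zx^2 - z^2 y - \frac{9}{2}x^2 + 9x - \frac{1}{2}x^4 - 2uy + u$, recall that the critical points are
$$(x_0, y_0, z_0, u_0) = (1, 1/2, 2, 4) \quad\text{and}\quad (x_1, y_1, z_1, u_1) = (-1, 1/2, -4, 4).$$
The Hessian matrix is
$$H(g)(x, y, z, u) = \begin{pmatrix}
-2z - 9 - 6x^2 & 6z & 6y - 2x & 0 \\
6z & 0 & 6x - 2z & -2 \\
6y - 2x & 6x - 2z & -2y & 0 \\
0 & -2 & 0 & 0
\end{pmatrix}.$$
We evaluate the Hessian at each critical point:
$$H(g)(x_0, y_0, z_0, u_0) = \begin{pmatrix}
-19 & 12 & 1 & 0 \\
12 & 0 & 2 & -2 \\
1 & 2 & -1 & 0 \\
0 & -2 & 0 & 0
\end{pmatrix}, \qquad
H(g)(x_1, y_1, z_1, u_1) = \begin{pmatrix}
-7 & -24 & 5 & 0 \\
-24 & 0 & 2 & -2 \\
5 & 2 & -1 & 0 \\
0 & -2 & 0 & 0
\end{pmatrix}.$$
The eigenvalues are computed from the characteristic polynomials
$$p_0(\lambda) = \det(H(g)(x_0, y_0, z_0, u_0) - \lambda I) = \begin{vmatrix}
-19 - \lambda & 12 & 1 & 0 \\
12 & -\lambda & 2 & -2 \\
1 & 2 & -1 - \lambda & 0 \\
0 & -2 & 0 & -\lambda
\end{vmatrix} = \lambda^4 + 20\lambda^3 - 134\lambda^2 - 348\lambda - 72$$
and
$$p_1(\lambda) = \det(H(g)(x_1, y_1, z_1, u_1) - \lambda I) = \begin{vmatrix}
-7 - \lambda & -24 & 5 & 0 \\
-24 & -\lambda & 2 & -2 \\
5 & 2 & -1 - \lambda & 0 \\
0 & -2 & 0 & -\lambda
\end{vmatrix} = \lambda^4 + 8\lambda^3 - 602\lambda^2 - 156\lambda + 72.$$
Numerical solutions can be computed using software and one obtains for $p_0(\lambda)$ three negative eigenvalues ($\approx -24.84, -1.84, -0.23$) and one positive ($\approx 6.91$). Now, $p_1(\lambda)$ has two negative eigenvalues ($\approx -28.75, -0.50$) and two positive eigenvalues ($\approx 0.24, 21.01$). In both cases, the critical point is a saddle point.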
Exercises
(1) Compute the critical points of the following functions and determine if they are nondegenerate. For the nondegenerate cases, determine whether each critical point is a local maximum, local minimum or saddle point.
(a) $f(x, y) = x^2 y + 2xy^3 + y$.
(b) $g(x, y) = \ln(1 + x^2 + y^2)$.
(c) $f(x, y, z) = e^{xyz + xy - xz + 2yz}$.
(d) $g(x, y, z) = \cos x \cos y \sin z$, with domain $0 \leq x, y, z \leq \pi/2$.
(e) $h(x, y, z, u) = x^3 - 2y^2 + 3xz - 3zu + z^3 + 3u$.
(f) $k(x, y, z, u) = x^2 + u^2 + 3z^2 + (y + 1)^2 - 2y$.
(2) Show that $f(x, y) = x^3 - 3xy^2$ along any straight line path through $(0, 0)$ has an inflection point at $(0, 0)$, except for three lines. (Hint: this is the function defining the Monkey Saddle of Chapter 5.)
(3) Show that $f(x, y) = x^2 + y^4$ has a degenerate critical point at $(0, 0)$. Using the definition, explain why $(0, 0)$ is a local minimum. Can you find a similar mapping with a local maximum?
(4) Find a function $g(x, y)$ which has a degenerate critical point at $(0, 0)$ but which behaves as a saddle point at $(0, 0)$.
6.2 Parametrizations
In this section, we explore parametrizations of regions in $\mathbb{R}^2$ and $\mathbb{R}^3$ and then discuss the parametrization of graphs of functions $f : U \subset \mathbb{R}^2 \to \mathbb{R}$. We consider the case of differentiable parametrizations. The techniques and examples in this section are crucial for the computation of double integrals, triple integrals and surface integrals.
6.2.1 Parametrization of 2D and 3D regions
We state most of the definitions and results for regions in $\mathbb{R}^3$, with the $\mathbb{R}^2$ case naturally included implicitly. The goal of a parametrization is to provide a description of a region via another region which has a simpler description.
Definition 6.2.1. A differentiable parametrization of a region $E \subset \mathbb{R}^3$ (also in $\mathbb{R}^2$) is a differentiable mapping
$$\Phi : R \subset \mathbb{R}^3 \to E$$
such that $\Phi^{-1}$ exists and is a differentiable mapping.
Note that any region $E$ has the trivial parametrization given by the identity mapping. That is, let $R = E$ and $\Phi(x, y, z) = (x, y, z)^T$. However, unless the region $E$ is simple enough, e.g. a rectangle parallel to the axes in $\mathbb{R}^2$ or a cube parallel to the axes in $\mathbb{R}^3$, we seek parametrizations which simplify the description of the region $E$. The main types of parametrizations we use involve curvilinear coordinate systems
(1) Polar: $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)^T$
(2) Cylindrical: $\Phi(r, \theta, z) = (r\cos\theta, r\sin\theta, z)^T$
(3) Spherical: $\Phi(\rho, \theta, \varphi) = (\rho\cos\theta\sin\varphi, \rho\sin\theta\sin\varphi, \rho\cos\varphi)^T$
and linear mappings
$$\Phi(u_1, \ldots, u_n) = A(u_1, \ldots, u_n)^T$$
where $A$ is an invertible $n \times n$ matrix. Note that in each case, the mappings are differentiable and
$$\det(D\Phi) \neq 0$$
at all points in the domain. This implies that $\Phi^{-1}$ exists and is a differentiable function. This is proved using a result called the Inverse Function Theorem, which is not covered in this book.
We begin with examples of regions $D$ in $\mathbb{R}^2$ and the important class of parametrizations in $\mathbb{R}^2$ done with polar coordinates.
Example 6.2.2. Consider the domain
$$D = \{(x, y) \in \mathbb{R}^2 \mid a^2 \leq x^2 + y^2 \leq b^2,\ 0 < a < b,\ x, y \geq 0\}.$$
This region is the part of the annulus centered at the origin, with radius varying from $a$ to $b$, that lies in the first quadrant, as shown in Figure 6.1. A parametrization in polar coordinates is given by $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)^T$ and the domain of $\Phi$ is
$$R = \{(r, \theta) \mid a \leq r \leq b,\ 0 \leq \theta < \pi/2\}.$$
Fig. 6.1. Annulus region bounded by circles of radius a and b with a < b.
We see in Figure 6.2 that the domain $R$ is a rectangle with sides parallel to the axes in the $(r, \theta)$ plane. Therefore, the parametrization provides a description of $D$ via a domain $R$ which has a simpler form.
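To see the parametrization at work, the following sketch samples the rectangle $R$ on a grid and checks that every sample is sent into the quarter annulus. The radii `a = 1`, `b = 2`, the grid size and the small tolerance `eps` for rounding are assumed values chosen only for illustration.

```python
import math

def polar(r, theta):
    """Polar coordinates parametrization (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def in_quarter_annulus(x, y, a, b, eps=1e-12):
    """Membership test for a^2 <= x^2 + y^2 <= b^2 with x, y >= 0."""
    rr = x * x + y * y
    return a * a - eps <= rr <= b * b + eps and x >= -eps and y >= -eps

a, b = 1.0, 2.0
n = 20
samples = [(a + (b - a) * i / n, (math.pi / 2) * j / n)
           for i in range(n + 1) for j in range(n + 1)]
all_inside = all(in_quarter_annulus(*polar(r, t), a, b) for r, t in samples)
print(all_inside)
```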
Fig. 6.2. Domain of the parametrization.
Another important type of parametrization is given by linear transformations. These can be useful if the domain is bounded by straight lines.
Example 6.2.3. Consider the domain
$$D = \{(x, y) \mid 0 \leq y \leq 1,\ -y \leq x \leq y\}.$$
This is a triangular region with vertices at $(-1, 1)$, $(1, 1)$ and $(0, 0)$, see Figure 6.3. The goal of the parametrization we use here is to have a triangular domain $R$ with two sides on the horizontal and vertical axes. We define
$$\Phi(u, v) = (u + v, u - v)^T = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} u \\ v \end{pmatrix}$$
with domain
$$R = \{(u, v) \mid 0 \leq u \leq 1,\ u - 1 \leq v \leq 0\}.$$
Fig. 6.3. Domain D of Example 6.2.3.
Using $\Phi^{-1}$ we show how $R$ is constructed from $D$. It is straightforward to compute
$$\Phi^{-1}(x, y) = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \left(\frac{1}{2}(x + y), \frac{1}{2}(x - y)\right)^T.$$
Fig. 6.4. Domain R of the parametrization.
Consider the boundary $y = x$ with $0 \leq x \leq 1$. Then
$$(u, v)^T = \Phi^{-1}(x, x) = (x, 0)^T,$$
so $u = x$ and $0 \leq u \leq 1$. The boundary $y = -x$ with $-1 \leq x \leq 0$ leads to
$$(u, v)^T = \Phi^{-1}(x, -x) = (0, x)^T.$$
Thus $v = x$ and $-1 \leq v \leq 0$. From the boundary $y = 1$ with $-1 \leq x \leq 1$, we have
$$(u, v)^T = \Phi^{-1}(x, 1)$$
which leads to
$$u = \frac{1}{2}(x + 1) \quad\text{and}\quad v = \frac{1}{2}(x - 1).$$
Isolating $x$ in each equation, we obtain $x = 2u - 1$ and $x = 2v + 1$, which means $2u - 1 = 2v + 1$. Thus, $v = u - 1$ with $0 \leq u \leq 1$. This completes the characterization of $R$, see Figure 6.4.
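The boundary computation can be complemented by a direct check that the linear map sends the triangle $R$ into $D$ and that the formula for the inverse undoes it. The sketch below uses exact rational arithmetic to avoid rounding at the boundary; the names `phi`, `phi_inverse` and the grid size are illustrative assumptions.

```python
from fractions import Fraction

def phi(u, v):
    """Linear parametrization (u, v) -> (u + v, u - v)."""
    return (u + v, u - v)

def phi_inverse(x, y):
    """Inverse map (x, y) -> ((x + y)/2, (x - y)/2)."""
    return ((x + y) / 2, (x - y) / 2)

def in_D(x, y):
    """Membership test for the triangle 0 <= y <= 1, -y <= x <= y."""
    return 0 <= y <= 1 and -y <= x <= y

# Grid points of R = {0 <= u <= 1, u - 1 <= v <= 0}.
n = 8
samples = [(Fraction(i, n), -Fraction(j, n))
           for i in range(n + 1) for j in range(n + 1 - i)]

maps_into_D = all(in_D(*phi(u, v)) for u, v in samples)
round_trip = all(phi_inverse(*phi(u, v)) == (u, v) for u, v in samples)
print(maps_into_D, round_trip)
# The vertices of R are sent to the vertices of D.
print(phi(0, 0), phi(1, 0), phi(0, -1))
```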
Our main examples of parametrizations in $\mathbb{R}^3$ use cylindrical and spherical coordinates. We consider an example of each.
Example 6.2.4. Consider the region $E$ inside the cylinder $x^2 + y^2 = 1$ and in the sphere $x^2 + y^2 + z^2 = 4$, see Figure 6.5 on the left. We can describe this region in Cartesian coordinates as
$$E = \left\{(x, y, z) \mid -1 \leq x \leq 1,\ -\sqrt{1 - x^2} \leq y \leq \sqrt{1 - x^2},\ -\sqrt{4 - x^2 - y^2} \leq z \leq \sqrt{4 - x^2 - y^2}\right\}.$$
Fig. 6.5. Region E (left) and the portion z ≥ 0 of the domain R of its parametrization (right).
If we use cylindrical coordinates, the cylinder equation becomes $r^2 = 1$ and the sphere $r^2 + z^2 = 4$. The cylindrical coordinates parametrization
$$\Phi(r, \theta, z) = (r\cos\theta, r\sin\theta, z)^T$$
with domain
$$R = \left\{(r, \theta, z) \mid 0 < r \leq 1,\ 0 \leq \theta \leq 2\pi,\ -\sqrt{4 - r^2} \leq z \leq \sqrt{4 - r^2}\right\}$$
gives a simpler description of $E$. Indeed, the domain $R$ is the region extending above and below the rectangle $0 < r \leq 1$ and $0 \leq \theta \leq 2\pi$, bounded above and below by $z = \pm\sqrt{4 - r^2}$. See Figure 6.5 (right) for the $z \geq 0$ portion of $R$; the $z < 0$ portion is symmetric.
We now turn to a case where spherical coordinates are useful.
Example 6.2.5. Consider a shell region $E$ between two spheres of radius 1 and 2 in the first octant, see Figure 6.6. We use spherical coordinates. To stay in the first octant, one needs $\theta \in [0, \pi/2]$ and $\varphi \in [0, \pi/2]$. We define
$$R = \left\{(\rho, \theta, \varphi) \mid 1 \leq \rho \leq 2,\ 0 \leq \theta \leq \frac{\pi}{2},\ 0 \leq \varphi \leq \frac{\pi}{2}\right\}$$
which is a box in $(\rho, \theta, \varphi)$ space with sides parallel to the coordinate planes. The domain $R$ is pictured in Figure 6.7.
Fig. 6.6. Region E between spheres of radius 1 and 2.
Fig. 6.7. Box domain R from Example 6.2.5.
6.2.2 Two-Dimensional Surfaces in R3
We now show examples of parametrizations of two-dimensional surfaces in $\mathbb{R}^3$. Because a surface $S$ is a two-dimensional set, we need two variables to describe $S$; that is, parametrizations are of the form
$$\Phi : D \subset \mathbb{R}^2 \to S.$$
The main example of parametrization is using graphs of functions. Consider a surface $S$ given by $z = f(x, y)$ where $(x, y) \in D$. A parametrization is given by the graph
$$\Phi(x, y) = (x, y, f(x, y)).$$
All quadric surfaces can be written as combinations of the above.
Example 6.2.6. Consider a hyperbolic paraboloid $z = 3x^2 - 4y^2$ with domain $D = \mathbb{R}^2$. Then,
$$\Phi(x, y) = (x, y, 3x^2 - 4y^2)^T$$
is a parametrization. For a hyperboloid of one sheet $x^2 + y^2 - z^2 = 1$ the parametrization is obtained using two mappings:
$$\Phi_1(x, y) = \left(x, y, +\sqrt{x^2 + y^2 - 1}\right)^T \quad\text{and}\quad \Phi_2(x, y) = \left(x, y, -\sqrt{x^2 + y^2 - 1}\right)^T$$
with domain $D = \{(x, y) \mid x^2 + y^2 \geq 1\}$.
The circular nature of the domain suggests that polar coordinates can also be used for the hyperboloid of one sheet. In fact, for all quadrics with circular traces, parametrizations can be achieved with polar coordinates.
Fig. 6.8. Cone intersecting the hemisphere in Example 6.2.8.
Example 6.2.7. Consider the paraboloid $P$ of equation $z = x^2 + y^2$ with disk domain $D$ of radius 3 centered at the origin. A simple parametrization with polar coordinates can be done with $z = r^2$:
$$\Phi(r, \theta) = (r\cos\theta, r\sin\theta, r^2)^T$$
with
$$D = \{(r, \theta) \mid 0 \leq r \leq 3,\ 0 \leq \theta \leq 2\pi\}.$$
Example 6.2.8. Consider the piece of sphere $S$ of radius $a > 0$ lying inside the cone $z^2 = x^2 + y^2$ in the upper hemisphere. Parametrizations exist in Cartesian, cylindrical and spherical coordinates. The cylindrical coordinate system parametrization is obtained with the cone given by $z = r$ and the sphere becomes $r^2 + z^2 = a^2$. The intersection of the cone and sphere is obtained by substituting $z^2 = r^2$ from the cone into the sphere equation, from which we obtain $2r^2 = a^2$. Solving this equation yields $r = a/\sqrt{2}$. Therefore, the cone and the sphere intersect in the upper hemisphere at the height $z = a/\sqrt{2}$. A parametrization of the portion of sphere is
$$\Phi(r, \theta) = \left(r\cos\theta, r\sin\theta, \sqrt{a^2 - r^2}\right)^T$$
with
$$D = \{(r, \theta) \mid 0 \leq r \leq a/\sqrt{2},\ 0 \leq \theta \leq 2\pi\}.$$
We now obtain parametrizations for quadrics with non-circular traces.
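Example 6.2.9. Consider the ellipsoid $S$ given by
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
A parametrization for $S$ is
$$\Phi(\theta, \varphi) = (a\cos\theta\sin\varphi, b\sin\theta\sin\varphi, c\cos\varphi)^T$$
with domain
$$D = \{(\theta, \varphi) \mid 0 \leq \theta \leq 2\pi,\ 0 \leq \varphi \leq \pi\}.$$
This is verified by substituting $x = a\cos\theta\sin\varphi$, $y = b\sin\theta\sin\varphi$ and $z = c\cos\varphi$ in the ellipsoid equation.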
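For completeness, the substitution can be written out as follows. The angle names $\theta$ and $\varphi$ are the ones assumed here for the two spherical angles.

```latex
\begin{aligned}
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}
  &= \cos^2\theta \sin^2\varphi + \sin^2\theta \sin^2\varphi + \cos^2\varphi \\
  &= (\cos^2\theta + \sin^2\theta)\sin^2\varphi + \cos^2\varphi \\
  &= \sin^2\varphi + \cos^2\varphi = 1.
\end{aligned}
```

The semi-axes $a$, $b$, $c$ cancel in the first step, which is why the same angular domain as for the unit sphere can be used.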
Example 6.2.10. Consider the hyperboloid of one sheet
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$
Let $x = au$, $y = bv$ and $z = cw$ and substitute in the equation for the hyperboloid of one sheet. We obtain
$$u^2 + v^2 - w^2 = 1.$$
Isolating $w$, we obtain
$$w^2 = (u^2 + v^2) - 1.$$
Writing $u, v$ in polar coordinates: $u = r\cos\theta$, $v = r\sin\theta$, the equation for $w$ simplifies to $w^2 = r^2 - 1$. A parametrization is given by the mappings
$$\Phi_+(r, \theta) = \left(ar\cos\theta, br\sin\theta, c\sqrt{r^2 - 1}\right)^T, \qquad \Phi_-(r, \theta) = \left(ar\cos\theta, br\sin\theta, -c\sqrt{r^2 - 1}\right)^T$$
with
$$D = \{(r, \theta) \mid r > 1,\ 0 \leq \theta \leq 2\pi\}.$$
Here is a more exotic example of a very important surface in mathematics.
Fig. 6.9. Torus.
Example 6.2.11. A torus is a two-dimensional surface which can be described as the Cartesian product of two circles. It is the shape of donuts and bagels! See Figure 6.9. Several parametrizations exist and we look at one using angles. Consider a circle $S^1_a$ of radius $a$ in the $yz$-plane with centre at $(0, b, 0)$ where $b > a$. We obtain the torus surface by taking the circle $S^1_a$ and making a full revolution of the circle around the origin, keeping the centre in the $xy$-plane; the centre thus traces a second circle of radius $b$ which we denote $S^1_b$, see Figure 6.10.
Fig. 6.10. The two circles $S^1_a$ and $S^1_b$ generating the torus.
We now give a description in terms of coordinates. The circle $S^1_a$ has parametrization $y = b + a\cos u$, $z = a\sin u$ with $0 \leq u \leq 2\pi$. Because the height $z$ does not depend on the location of $S^1_a$ along its revolution, $z = a\sin u$. For the $x$ and $y$ coordinates, this is not true: they depend on the location along $S^1_b$, which is parametrized by $x = b\cos v$, $y = b\sin v$ with $0 \leq v \leq 2\pi$. Note that the circle $S^1_a$ has $x = 0$, which corresponds to $b\cos v$ with $v = \pi/2$. We now put together the two circle parametrizations: after rotating by the angle $v$, the distance $b + a\cos u$ from the axis of revolution is shared between $x$ and $y$, and we obtain
$$x = (b + a\cos u)\cos v \quad\text{and}\quad y = (b + a\cos u)\sin v.$$
The parametrization of the torus is
$$\Phi(u, v) = ((b + a\cos u)\cos v, (b + a\cos u)\sin v, a\sin u)^T$$
with domain
$$D = \{(u, v) \mid 0 \leq u \leq 2\pi,\ 0 \leq v \leq 2\pi\}.$$
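A torus obtained by revolving a circle of radius $a$ centred at distance $b$ from the $z$-axis satisfies the implicit equation $\left(\sqrt{x^2 + y^2} - b\right)^2 + z^2 = a^2$. The sketch below checks this equation on a grid of parameter values, taking the distance of a point from the $z$-axis to be $b + a\cos u$, in line with the circle parametrization $y = b + a\cos u$, $z = a\sin u$. The radii `a = 1`, `b = 3` and the grid size are assumed values for illustration.

```python
import math

def torus(u, v, a=1.0, b=3.0):
    """Point of the torus for the angles u (small circle) and v (revolution)."""
    radial = b + a * math.cos(u)       # distance from the z-axis
    return (radial * math.cos(v), radial * math.sin(v), a * math.sin(u))

def implicit_residual(x, y, z, a=1.0, b=3.0):
    """Value of (sqrt(x^2 + y^2) - b)^2 + z^2 - a^2, zero on the torus."""
    return (math.hypot(x, y) - b) ** 2 + z ** 2 - a ** 2

n = 24
max_residual = max(
    abs(implicit_residual(*torus(2 * math.pi * i / n, 2 * math.pi * j / n)))
    for i in range(n + 1) for j in range(n + 1)
)
print(max_residual)
```

At $v = \pi/2$ the point reduces to $(0, b + a\cos u, a\sin u)$ up to rounding, which is the generating circle in the $yz$-plane.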
Arbitrary Surfaces given by Parametrizations
The examples above show cases where a surface $S$ described either in words or by an equation can be written in terms of a parametrization $\Phi : D \subset \mathbb{R}^2 \to S$. Recall that an arbitrary vector function $r : [a, b] \to \mathbb{R}^n$ defines a curve $C$, but this curve may not necessarily have a description in words or via a formula. The same is true for (differentiable) mappings $\Phi : D \subset \mathbb{R}^2 \to \mathbb{R}^3$: these always define surfaces, but there may not exist an easy equation in Cartesian coordinates describing the surface. The following example is solvable, but it is easy to produce mappings for which such a description would be challenging or even impossible.
Example 6.2.12. The mapping $\Phi(u, v) = (u^2, u + v, v - u)^T$ with domain $D = \{(u, v) \mid 0 \le u \le 1,\ 0 \le v \le 1\}$ defines a surface as shown in Figure 6.11. We express the surface given by $\Phi$ as a single equation in Cartesian coordinates. Multiplying $y$ and $z$ we have $yz = v^2 - u^2$ and so $x + yz = v^2$. Since $y^2 + z^2 = 2u^2 + 2v^2$, we can obtain
\[ v^2 = \frac{1}{2}(y^2 + z^2) - x. \]
Therefore $\Phi$ is the parametrization of
\[ \frac{1}{2}(y^2 + z^2) - yz - 2x = 0. \]
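As a quick check of the elimination above, the following sketch evaluates the Cartesian equation at points of the parametrized surface. The function names `phi` and `surface_eq` are chosen here for illustration only.

```python
def phi(u, v):
    """The mapping of Example 6.2.12: (u, v) -> (u^2, u + v, v - u)."""
    return (u * u, u + v, v - u)

def surface_eq(x, y, z):
    """Left-hand side of (1/2)(y^2 + z^2) - yz - 2x = 0."""
    return 0.5 * (y * y + z * z) - y * z - 2.0 * x

# Evaluate on an 11 x 11 grid covering the domain D = [0, 1] x [0, 1].
grid = [i / 10 for i in range(11)]
worst = max(abs(surface_eq(*phi(u, v))) for u in grid for v in grid)
print(worst)   # expected to be at the level of rounding error
```

Note that this only verifies that the image of the mapping lies on the quadric; it does not show that the whole quadric is covered.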
Exercises
(1) Find parametrizations of the following regions in R2 .

(a) E is the region enclosed by the circles of radius 1 and 2 in the first, second
and third quadrants, the x-axis between 1 and 2, and the y-axis between
1 and 2.
Fig. 6.11. Surface given by $\Phi(u, v) = (u^2, u + v, v - u)^T$.
(b) E is the region bounded by the x-axis between 0 and 1, the line y = (x 1)
for 1 x 2, the line y = 1 and the y-axis.
(c) E is the region bounded by the curve $C$ given by
\[ r(t) = ((t + 1)\cos(2\pi t),\ (t + 1)\sin(2\pi t)) \]
with $t \in [0, 1]$ and the line joining the points $(1, 0)$ and $(2, 0)$.
(2) In each case, show that the parametrization maps R into E.

(a) Let E be the half-disk of radius 2 with y x ,
= (r cos , r sin )
and
2, /4 3/4}.
D = {(r, ) | 0 r

1 |x| 1 |x|
(b) Let E = (x, y) | 2 x 2, + y ,
2 4 2 4

1
(u, v) = 2(1 + u + v), (u + v)
2
and R = {(u, v) | 0 u 1, 0 v 1}.
(3) Find parametrizations of the following regions in R3 .
(a) E is the region bounded below by the paraboloid z = x2 + y 2 and bounded
above by the sphere of radius 1, with z > 0.
(b) E is the region bounded by the cylinder x2 + z 2 = 1 with x 0, the plane
x = 0 and the planes y = 1, y = 1.
(c) E is the region bounded below by the plane z = 1, above by the sphere of
radius 2 and in the first octant.
(d) E is the region bounded above by the paraboloid z = x2 + y 2 , inside
the cylinder (x 1)2 + y 2 = 1 and above the xy-plane. (Hint: attempt
a parametrization using cylindrical coordinates.)
(4) In each case, show that the parametrization maps R into E.

(a) E = {(x, y, z) | y 0},
(, , ) = ( cos sin , sin sin , cos )
with
R = {(, , ) | > 0, 0 , 0 /2}
(b) E is the tetrahedron with vertices

(1, 0, 0), (1/2, 3/2, 0), (1/2, 3/2, 0) and (0, 0, 1),
with
1 3
(u, v, w) = 1 + (u + v), (u + v), w
2 2
and R = {(u, v, w) | u, v, w 0, 0 u + v + w 1}.
(5) Find parametrizations of the following surfaces S in R3 . Note that you have to
determine correctly the domain D in each case.
(a) S = graph (f ) where f (x, y) = 4 (x2 + y 2 ), not extending below the
xy-plane.
(b) S is the hyperbolic paraboloid z = x2 2y 2 for x 0.
(c) S is the sphere of radius 1 between the planes z = 0 and y = 0 and for
y 0.
(d) S is one eight of the torus of Example 6.2.11 lying in the first octant.
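(6) Consider $\Phi_1 : \mathbb{R}^3 \to \mathbb{R}^3$ given by $\Phi_1(u, v, w) = (au, bv, cw)$; it is a rescaling of all of $\mathbb{R}^3$. Show that $\Phi_1$ composed with the parametrization $\Phi_2(\theta, \phi) = (\cos\theta\sin\phi, \sin\theta\sin\phi, \cos\phi)$ with $0 \le \theta \le 2\pi$, $0 \le \phi \le \pi$ gives the parametrization of the ellipsoid in Example 6.2.9.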
6.3 Differential Operators
The gradient and differential are examples of differential operators. This section
gives formal definitions of those differential operators as well as introducing new
differential operators that are important in Vector Calculus and in the theory of
partial differential equations (which is not a topic of this book).
6.3.1 Differentials and Gradients
In Cartesian coordinates, the link between the differential and the gradient of a differentiable function $f(x_1, \dots, x_n)$ is straightforward:
\[ df = \nabla f \cdot (dx_1, \dots, dx_n) = \nabla f\,(dx_1, \dots, dx_n)^T \]
where we use the fact that the scalar product in Cartesian coordinates corresponds to multiplication of a row and column vector.
We begin by showing that the structure of the differential is invariant under changes of coordinates. This is illustrated with the next example, which shows the process by which the general case is proved.
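Example 6.3.1. Consider $\mathbb{R}^2$ and the change of coordinates to polar
\[ x = x(r, \theta) = r\cos\theta \quad\text{and}\quad y = y(r, \theta) = r\sin\theta. \tag{6.2} \]
The inverse change of coordinates is
\[ r = r(x, y) = \sqrt{x^2 + y^2} \quad\text{and}\quad \theta = \theta(x, y) = \arctan(y/x). \tag{6.3} \]
Let $f(x, y)$ be a differentiable function, then we write
\[ df = \nabla f(x, y)\,(dx, dy)^T. \]
To convert $df$ to polar coordinates, we must obtain the partial derivatives as a function of $r$ and $\theta$ using (6.3):
\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial x} \quad\text{and}\quad \frac{\partial f}{\partial y} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial f}{\partial \theta}\frac{\partial \theta}{\partial y}, \]
where the partial derivatives of $r$, $\theta$ with respect to $x$ and $y$ are expressed in polar coordinates. That is, in matrix form we have
\[
\begin{aligned}
\nabla f(x, y) &= \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right)
\begin{pmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\[2mm] \dfrac{\partial \theta}{\partial x} & \dfrac{\partial \theta}{\partial y} \end{pmatrix} \\
&= \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right)
\begin{pmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2mm] -\dfrac{y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{pmatrix} \\
&= \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right)
\begin{pmatrix} \cos\theta & \sin\theta \\[1mm] -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{pmatrix}.
\end{aligned}
\tag{6.4}
\]
Using (6.2) we compute
\[ dx = \frac{\partial x}{\partial r}\,dr + \frac{\partial x}{\partial \theta}\,d\theta \quad\text{and}\quad dy = \frac{\partial y}{\partial r}\,dr + \frac{\partial y}{\partial \theta}\,d\theta \tag{6.5} \]
which we rewrite
\[ (dx, dy)^T = \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix} (dr, d\theta)^T = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} (dr, d\theta)^T. \tag{6.6} \]
Notice that the matrices in (6.4) and (6.6) are inverses of each other. Therefore,
\[ df = \nabla f(x, y)\,(dx, dy)^T = \left(\frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta}\right)(dr, d\theta)^T. \]
However, note that we do not write the vector of partial derivatives with respect to the polar coordinates in terms of the $\nabla$ operator. The issue of change of coordinates for the gradient vector is trickier.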
We illustrate the calculation above with an explicit computation.
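Example 6.3.2. Let $f(x, y) = x^2 y$ and consider the polar change of coordinates. We compute first $df = 2xy\,dx + x^2\,dy$ and, using (6.5), we obtain
\[
\begin{aligned}
df &= 2r^2\cos\theta\sin\theta\,(\cos\theta\,dr - r\sin\theta\,d\theta) + r^2\cos^2\theta\,(\sin\theta\,dr + r\cos\theta\,d\theta) \\
&= 3r^2\cos^2\theta\sin\theta\,dr + (-2r^3\cos\theta\sin^2\theta + r^3\cos^3\theta)\,d\theta.
\end{aligned}
\]
Writing $f$ directly in polar coordinates we have $f(r, \theta) = r^3\cos^2\theta\sin\theta$ and
\[
\begin{aligned}
df &= 3r^2\cos^2\theta\sin\theta\,dr + r^3(-2\cos\theta\sin^2\theta + \cos^3\theta)\,d\theta \\
&= 3r^2\cos^2\theta\sin\theta\,dr + r^3\cos\theta\,(-2\sin^2\theta + \cos^2\theta)\,d\theta.
\end{aligned}
\]
The two results match, as the previous example predicts.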
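The agreement of the two computations can also be checked numerically. The sketch below is an illustration, not part of the text: it substitutes $dx = \cos\theta\,dr - r\sin\theta\,d\theta$ and $dy = \sin\theta\,dr + r\cos\theta\,d\theta$ into $df = 2xy\,dx + x^2\,dy$, and compares the resulting coefficients of $dr$ and $d\theta$ with central-difference approximations of the partial derivatives of $f(r, \theta) = r^3\cos^2\theta\sin\theta$. The sample point and the step size are arbitrary choices.

```python
import math

def f_polar(r, t):
    """f(x, y) = x^2 y written in polar coordinates."""
    return r ** 3 * math.cos(t) ** 2 * math.sin(t)

def coeffs_from_cartesian(r, t):
    """Coefficients of dr and dtheta after substituting dx, dy into df = 2xy dx + x^2 dy."""
    x, y = r * math.cos(t), r * math.sin(t)
    fx, fy = 2 * x * y, x * x
    coeff_dr = fx * math.cos(t) + fy * math.sin(t)
    coeff_dt = fx * (-r * math.sin(t)) + fy * (r * math.cos(t))
    return coeff_dr, coeff_dt

def coeffs_by_differences(r, t, h=1e-6):
    """Central-difference approximations of df/dr and df/dtheta."""
    coeff_dr = (f_polar(r + h, t) - f_polar(r - h, t)) / (2 * h)
    coeff_dt = (f_polar(r, t + h) - f_polar(r, t - h)) / (2 * h)
    return coeff_dr, coeff_dt

print(coeffs_from_cartesian(1.3, 0.7))
print(coeffs_by_differences(1.3, 0.7))
```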
We now state the general result about changes of coordinates.
Proposition 6.3.3. Let $(x_1, \dots, x_n)$ be Cartesian coordinates, let $f(x_1, \dots, x_n)$ be a differentiable function and let $x_i = x_i(u_1, \dots, u_n)$ for $i = 1, \dots, n$ be a change of coordinates. Then
\[ df = \frac{\partial f}{\partial u_1}\,du_1 + \cdots + \frac{\partial f}{\partial u_n}\,du_n. \]
Proof. The proof follows the approach outlined in the polar coordinates example above. Write the inverse change of coordinates $u_i = u_i(x_1, \dots, x_n)$ for $i = 1, \dots, n$. We know $df = \nabla f \cdot (dx_1, \dots, dx_n)$,
\[ \nabla f(x_1, \dots, x_n) = \left(\frac{\partial f}{\partial u_1}, \dots, \frac{\partial f}{\partial u_n}\right) \left.\frac{\partial(u_1, \dots, u_n)}{\partial(x_1, \dots, x_n)}\right|_{x_i = x_i(u_1, \dots, u_n)} \]
and
\[ (dx_1, \dots, dx_n)^T = \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)}\,(du_1, \dots, du_n)^T. \]
The Jacobian matrices
\[ \left.\frac{\partial(u_1, \dots, u_n)}{\partial(x_1, \dots, x_n)}\right|_{x_i = x_i(u_1, \dots, u_n)} \quad\text{and}\quad \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} \]
are inverses of each other as they are obtained via the coordinate change and its inverse. Therefore, the formula follows.
Because the structure of $df$ is unchanged with respect to coordinate changes, we can extend the definition of the differential, which we obtained for the Cartesian coordinate system, to any coordinate system.
Definition 6.3.4. Let $q_1, \dots, q_n$ be a coordinate system in $\mathbb{R}^n$ and $f(q_1, \dots, q_n)$ be a differentiable function. Then, the differential of $f$ is
\[ df = \frac{\partial f}{\partial q_1}\,dq_1 + \cdots + \frac{\partial f}{\partial q_n}\,dq_n. \]
As mentioned above, changing coordinates for the gradient is not so straightforward and we see below that the structure of the vector may well be modified from the Cartesian case. Before we can present this, however, we note that we have not yet provided a formal definition of the gradient. We do this now.
Definition 6.3.5. Let $q_1, \dots, q_n$ be a coordinate system in $\mathbb{R}^n$ and $f(q_1, \dots, q_n)$ be a differentiable function. Then, the gradient of $f$ is the vector $\nabla_q f$ such that
\[ df = \nabla_q f \cdot (dq_1, \dots, dq_n), \]
where $\cdot$ is the scalar product for tangent space vectors in the coordinate system $(q_1, \dots, q_n)$.
We now work in the polar coordinate case again and show how the structure of $\nabla f$ changes and, also, that it depends on the basis chosen for the tangent space in polar coordinates.
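Example 6.3.6. Consider $\mathbb{R}^2$ with polar coordinates where the scalar product for tangent vectors in $\mathbb{R}^2$ in the basis $(\partial/\partial r, \partial/\partial\theta)$ is the one defined in Chapter 3, see (3.5). For a differentiable function $f(r, \theta)$, let $\nabla_{(r,\theta)} f = (\nabla_r f, \nabla_\theta f)$ and by definition
\[ df = \nabla_{(r,\theta)} f \cdot (dr, d\theta) = \nabla_r f\,dr + r^2\,\nabla_\theta f\,d\theta. \]
Comparing with Definition 6.3.4, we therefore have
\[ \nabla_{(r,\theta)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r^2}\frac{\partial f}{\partial \theta}\right). \]
In many other references, the gradient in polar coordinates does not have this form and this is a consequence of those authors normalizing the basis of $T_p\mathbb{R}^2$ in polar coordinates:
\[ (e_r, e_\theta) := \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right). \]
The alternative formula is obtained as follows:
\[
\begin{aligned}
\nabla_{(r,\theta)} f &= \frac{\partial f}{\partial r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial f}{\partial \theta}\frac{\partial}{\partial \theta} \\
&= \frac{\partial f}{\partial r}\,e_r + \frac{1}{r^2}\frac{\partial f}{\partial \theta}\,r\left(\frac{1}{r}\frac{\partial}{\partial \theta}\right) \\
&= \frac{\partial f}{\partial r}\,e_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\,e_\theta.
\end{aligned}
\]
Therefore, the gradient in the orthonormal basis is
\[ \nabla'_{(r,\theta)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \theta}\right). \]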
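The following sketch illustrates that the gradient written in the orthonormal polar basis represents the same vector as the Cartesian gradient. It rests on assumptions not spelled out in the example: $e_r$ and $e_\theta$ are identified with the Cartesian vectors $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$, the test function $f(x, y) = x^2 y$ is borrowed from Example 6.3.2, and the polar partial derivatives are approximated by central differences.

```python
import math

def f(x, y):
    return x * x * y

def polar_partials(r, t, h=1e-6):
    """Central differences of f(r cos t, r sin t) with respect to r and theta."""
    g = lambda r_, t_: f(r_ * math.cos(t_), r_ * math.sin(t_))
    return ((g(r + h, t) - g(r - h, t)) / (2 * h),
            (g(r, t + h) - g(r, t - h)) / (2 * h))

def gradient_from_polar(r, t):
    """Cartesian components of f_r e_r + (1/r) f_theta e_theta."""
    f_r, f_t = polar_partials(r, t)
    e_r = (math.cos(t), math.sin(t))      # unit vector in the radial direction
    e_t = (-math.sin(t), math.cos(t))     # unit vector in the angular direction
    return tuple(f_r * e_r[k] + (f_t / r) * e_t[k] for k in range(2))

r, t = 1.3, 0.7
x, y = r * math.cos(t), r * math.sin(t)
print(gradient_from_polar(r, t))   # assembled from polar data
print((2 * x * y, x * x))          # Cartesian gradient of x^2 y
```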
Exercises
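(1) Let $(u_1, \dots, u_n) = A(x_1, \dots, x_n)$ where $A$ is an orthogonal matrix ($A^T A = AA^T = I$). Show that the gradient operator has the form
\[ \nabla_u = \left(\frac{\partial}{\partial u_1}, \dots, \frac{\partial}{\partial u_n}\right). \]
(2) In each case, show the formula for the gradient operator in the following coordinate systems in the given basis.
(a) In cylindrical coordinates with basis $(\partial/\partial r, \partial/\partial\theta, e_3)$
\[ \nabla_{(r,\theta,z)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r^2}\frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial z}\right) \]
and in its orthonormalized basis
\[ \nabla'_{(r,\theta,z)} f = \left(\frac{\partial f}{\partial r}, \frac{1}{r}\frac{\partial f}{\partial \theta}, \frac{\partial f}{\partial z}\right). \]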
(b) In spherical coordinates with basis (/, /, /)

f 1 f 1 f
(,,) f = , 2 , 2 2 .
sin
and in its orthonormalized basis.

f 1 f 1 f
(,,) f = , , .
sin
6.3.2 Laplacian, Divergence and Curl Differential Operators
The gradient operator is useful to define other differential operators. We now intro-
duce the Laplacian, the divergence and the curl operators.
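The Laplacian operator is defined as
\[ \Delta = \nabla^2 := \nabla \cdot \nabla, \]
where $\cdot$ is the dot product. For instance, for functions $\mathbb{R}^2 \to \mathbb{R}$ we have
\[ \Delta = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right) \cdot \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right) = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \]
and for functions $\mathbb{R}^3 \to \mathbb{R}$
\[ \Delta = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \cdot \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \]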
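Example 6.3.7. We apply the Laplacian operator to $f(x, y) = x^2 y + \sin(xy)$:
\[
\begin{aligned}
\Delta f &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \\
&= (2y - y^2\sin(xy)) + (-x^2\sin(xy)) = 2y - (x^2 + y^2)\sin(xy).
\end{aligned}
\]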
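A computation like the one in Example 6.3.7 can be verified with a finite-difference approximation of the Laplacian. The sketch below uses the standard five-point stencil, which is not discussed in the text; the sample point and step size are arbitrary.

```python
import math

def f(x, y):
    return x * x * y + math.sin(x * y)

def laplacian_fd(func, x, y, h=1e-4):
    """Five-point finite-difference approximation of the Laplacian of func at (x, y)."""
    return (func(x + h, y) + func(x - h, y)
            + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)

def laplacian_exact(x, y):
    """Closed form obtained in Example 6.3.7."""
    return 2 * y - (x * x + y * y) * math.sin(x * y)

print(laplacian_fd(f, 0.8, -1.1), laplacian_exact(0.8, -1.1))
```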
By transforming the gradient to other coordinate systems, we can obtain the Lapla-
cian after coordinate change. Of course, this depends on the basis for the tangent
vectors chosen. If two relevant choices are available, as in the polar coordinate case,
we obtain both.
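Example 6.3.8. The Laplacian in polar coordinates is given as follows; recall that the scalar product depends on the basis chosen. In the basis $(\partial/\partial r, \partial/\partial\theta)$,
\[
\begin{aligned}
\Delta_{(r,\theta)} f &= \nabla_{(r,\theta)} \cdot \nabla_{(r,\theta)} f \\
&= \left(\frac{\partial}{\partial r}, \frac{1}{r^2}\frac{\partial}{\partial \theta}\right) \cdot \left(\frac{\partial}{\partial r}, \frac{1}{r^2}\frac{\partial}{\partial \theta}\right) f \\
&= \left(\frac{\partial^2}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}\right) f.
\end{aligned}
\]
We obtain the same result in the orthonormal basis $(e_r, e_\theta)$. Recall that the scalar product does not have an $r^2$ factor now. We have
\[
\begin{aligned}
\Delta'_{(r,\theta)} f &= \nabla'_{(r,\theta)} \cdot \nabla'_{(r,\theta)} f \\
&= \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right) \cdot \left(\frac{\partial}{\partial r}, \frac{1}{r}\frac{\partial}{\partial \theta}\right) f \\
&= \left(\frac{\partial^2}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2}\right) f.
\end{aligned}
\]
Note that both computations are formal products of the components of the gradient operator; they do not differentiate the basis vectors, which depend on the point in polar coordinates. Converting $\partial^2 f/\partial x^2 + \partial^2 f/\partial y^2$ to polar coordinates with the chain rule yields an additional first-order term:
\[ \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}. \]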
We now define two differential operators which appear in the study of vector fields and the integration theory of the next chapters. The divergence operator is defined on differentiable vector fields $F : \mathbb{R}^n \to \mathbb{R}^n$, $F = (f_1, \dots, f_n)^T$, as follows:
\[ \operatorname{div} F(x) := \nabla \cdot F = \frac{\partial f_1}{\partial x_1} + \cdots + \frac{\partial f_n}{\partial x_n}. \]
The physical meaning of the divergence is discussed in Chapter 10.
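The curl operator is defined on vector fields in $\mathbb{R}^3$, $F : \mathbb{R}^3 \to \mathbb{R}^3$, $F = (f_1, f_2, f_3)^T$, using the following formula:
\[ \operatorname{curl} F(x) := \nabla \times F = \begin{vmatrix} i & j & k \\[1mm] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] f_1 & f_2 & f_3 \end{vmatrix} \]
where $i, j, k$ correspond to the standard basis vectors of $\mathbb{R}^3$. Therefore, the curl of a vector field in $\mathbb{R}^3$ is also a vector field in $\mathbb{R}^3$. The explicit expression from the formula is
\[ \operatorname{curl} F(x) = \left(\frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y}\right). \]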
The curl operator can also be applied to vector fields in R2 of the form F (x, y) =
(F1 (x, y), F2 (x, y)) by extending the vector field to the z component with 0:
F (x, y, z) = (F1 (x, y), F2 (x, y), 0). The physical meaning of the curl operator is
also discussed in Chapter 10.
Example 6.3.9. Recall that the radial gravitational force vector field is given by
\[ F(\mathbf{x}) = -\frac{mMG}{r^2}\frac{\mathbf{x}}{\|\mathbf{x}\|} \]
where $\mathbf{x} = (x, y, z)$ and $r = \|\mathbf{x}\|$. To compute the div and curl, we first decompose $F$ into its components
\[ f_1(x, y, z) = -\frac{mMG\,x}{r^3}, \quad f_2(x, y, z) = -\frac{mMG\,y}{r^3}, \quad f_3(x, y, z) = -\frac{mMG\,z}{r^3}. \]
Then,
\[
\begin{aligned}
\operatorname{div} F &= -mMG\,\frac{r^3 - 3rx^2}{r^6} - mMG\,\frac{r^3 - 3ry^2}{r^6} - mMG\,\frac{r^3 - 3rz^2}{r^6} \\
&= -\frac{mMG}{r^6}\left(3r^3 - 3r(x^2 + y^2 + z^2)\right) = 0
\end{aligned}
\]
because $x^2 + y^2 + z^2 = r^2$, and
\[ \operatorname{curl} F = \frac{mMG}{r^6}\,(3yzr - 3zyr,\ 3rxz - 3rzx,\ 3ryx - 3rxy) = (0, 0, 0). \]
The geometric and physical significance of the values obtained here is explained in Chapter 10.
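The vanishing of the divergence and curl away from the origin can be observed numerically. In this sketch, which is an illustration rather than part of the text, the constant `K` stands in for the product $mMG$ (its value is an arbitrary choice), the force is taken to point toward the origin, and all partial derivatives are approximated by central differences.

```python
import math

K = 1.0  # hypothetical stand-in for the constant m*M*G

def F(x, y, z):
    """Radial inverse-square field pointing toward the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return (-K * x / r ** 3, -K * y / r ** 3, -K * z / r ** 3)

def partial(i, j, p, h=1e-5):
    """Central-difference approximation of d f_i / d x_j at the point p."""
    plus, minus = list(p), list(p)
    plus[j] += h
    minus[j] -= h
    return (F(*plus)[i] - F(*minus)[i]) / (2 * h)

def div(p):
    return sum(partial(i, i, p) for i in range(3))

def curl(p):
    return (partial(2, 1, p) - partial(1, 2, p),
            partial(0, 2, p) - partial(2, 0, p),
            partial(1, 0, p) - partial(0, 1, p))

point = (1.0, -2.0, 0.5)
print(div(point), curl(point))   # both expected to be close to zero
```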
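The gradient, Laplacian, divergence and curl differential operators are linear; that is, for functions $g_1, g_2 : \mathbb{R}^n \to \mathbb{R}$, vector fields $F_1, F_2 : \mathbb{R}^n \to \mathbb{R}^n$, $G_1, G_2 : \mathbb{R}^3 \to \mathbb{R}^3$ and scalars $\alpha, \beta \in \mathbb{R}$,
\[
\begin{aligned}
\nabla(\alpha g_1 + \beta g_2) &= \alpha\nabla g_1 + \beta\nabla g_2, \\
\Delta(\alpha g_1 + \beta g_2) &= \alpha\Delta g_1 + \beta\Delta g_2, \\
\operatorname{div}(\alpha F_1 + \beta F_2) &= \alpha\operatorname{div} F_1 + \beta\operatorname{div} F_2, \\
\operatorname{curl}(\alpha G_1 + \beta G_2) &= \alpha\operatorname{curl} G_1 + \beta\operatorname{curl} G_2.
\end{aligned}
\tag{6.7}
\]
The linearity of $\nabla$ is automatic from the linearity of the derivative. The linearity of $\Delta$, div and curl is left to the reader in the exercises section.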
Exercises
(1) Apply the differential operator $\Delta$ to the following functions.
(a) $f(x, y) = \exp(x)\cos(y)$
(b) $f(x, y, z) = x^2 y z^3$
(c) $f(x, y, z) = \frac{1}{2} z \exp(2x)\cos(2y)$
(d) $f(x, y, z, w) = (w^2 + x^2)e^{zy}$
(2) A function $f$ is said to be harmonic if $\Delta f = 0$. Which functions in exercise 1 are harmonic?
(3) Apply the differential operator div to the following vector fields and curl if
applicable. In each case, plot the vector field using a software package and make
a guess about the meaning of the div and curl operators on vector fields.
(a) F (x, y) = (y + x(x2 + y 2 ), x + y(x2 + y 2 ))
(b) F (x, y, z) = (x, y, z)
(c) F (x, y, z, w) = (x, y, z, w)
(d) F (x, y) = (2y + 12x4 y 2 , 6x 16x3 y 3 )
(e) F (x, y, z) = (xy, xz 2 , y 2 z)
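(4) Show that $\Delta$, div and curl are linear operators. That is, show that the equations (6.7) are satisfied.
(5) Let $u, v$ be functions and $F$ and $G$ be vector fields. Show the following differential operator identities.
(a) $\nabla(uv) = u\nabla v + v\nabla u$
(b) $\nabla \cdot (uF) = (\nabla u) \cdot F + u(\nabla \cdot F)$
(c) $\nabla \cdot (\nabla \times F) = 0$
(d) $\nabla \times (\nabla u) = 0$
(e) $\nabla \cdot (F \times G) = (\nabla \times F) \cdot G - F \cdot (\nabla \times G)$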
(6) Show that the Laplacian operator after the coordinate change is given by the
formulae below. Do the calculation using the gradient obtained for both bases
in Problem (2) of Section 6.3.1.
(a) cylindrical coordinates:
2f 1 f 2f
f = + + .
r2 r2 r2 z 2
(b) spherical coordinates:
2f 1 2f 1 2f
f = 2
+ 2 2 + 2 2 .
sin 2
6.4 Application of Clairaut's theorem to 1-forms
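We begin by considering 1-forms defined in $U \subset \mathbb{R}^2$ where $U$ is open and for which the coefficients are $C^1$. Let
\[ \omega = a_1(x, y)\,dx + a_2(x, y)\,dy \]
and suppose that $\omega$ is exact, so that there exists a $C^2$ function $f : U \to \mathbb{R}$ with
\[ a_1 = \frac{\partial f}{\partial x} \quad\text{and}\quad a_2 = \frac{\partial f}{\partial y}. \]
Then, from Clairaut's theorem, we have the equality of the mixed second derivatives of $f$. This leads to the following equalities:
\[ \frac{\partial a_1}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial a_2}{\partial x}. \]
Thus, we can summarize this result as follows:
Theorem 6.4.1. If $\omega = a_1(x, y)\,dx + a_2(x, y)\,dy$ is exact, then
\[ \frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 0. \tag{6.8} \]
This gives us a negative criterion for verifying whether $\omega$ is exact; that is, if the condition on the partial derivatives of $a_1$ and $a_2$ is not satisfied, then $\omega$ is not exact. We look at an example.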
Example 6.4.2. Let $\omega = 2xy\,dx + x^2 y\,dy$ with $a_1(x, y) = 2xy$ and $a_2(x, y) = x^2 y$. Then
\[ \frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 2xy - 2x \neq 0 \]
and so $\omega$ is not exact.
We return to Example 4.3.7 from Chapter 4.

x y
= dx + 2 dy
x2 +y 2 x + y2
where the coefficients are C 1 on U = R2 \ {(0, 0)}. We now compute

a2 a1 2xy 2xy
= 2 2 = 0.
x y (x + y 2 )2 (x + y 2 )2
But, we know that is not exact.
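Therefore, the opposite implication is not necessarily true, that is,
\[ \frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = 0 \ \text{ does not necessarily imply that } \omega = a_1(x, y)\,dx + a_2(x, y)\,dy \text{ is exact.} \]
We see below that there are cases where this implication is in fact true. Before addressing this question, we generalize the above argument to $C^1$ differentiable 1-forms in $U \subset \mathbb{R}^n$. Suppose that
\[ \omega = \sum_{i=1}^n a_i(x_1, \dots, x_n)\,dx_i \]
is exact; then there exists a $C^2$ function $f : U \subset \mathbb{R}^n \to \mathbb{R}$ such that
\[ a_i = \frac{\partial f}{\partial x_i} \]
for all $i = 1, \dots, n$. From Clairaut's theorem, we can write the equality of mixed partial derivatives for every $i, j$, for which we only need to consider $i < j$. We obtain
\[ \frac{\partial a_i}{\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i} = \frac{\partial^2 f}{\partial x_i\,\partial x_j} = \frac{\partial a_j}{\partial x_i}. \]
We illustrate this calculation in the case $n = 3$. We only need to consider the cases $(i, j) = (1, 2)$, $(i, j) = (1, 3)$ and $(i, j) = (2, 3)$, therefore
\[ \frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} = 0, \quad \frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3} = 0, \quad \frac{\partial a_2}{\partial x_3} - \frac{\partial a_3}{\partial x_2} = 0. \]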
We now summarize the calculation above in the following result.
Theorem 6.4.4. Suppose that the $C^1$ differentiable 1-form
\[ \omega = \sum_{i=1}^n a_i(x_1, \dots, x_n)\,dx_i \]
defined in $U \subset \mathbb{R}^n$ is exact. Then,
\[ \frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j} = 0 \tag{6.9} \]
for all $i, j = 1, \dots, n$ with $i < j$. This means that if at least one of the conditions in (6.9) is not satisfied, then $\omega$ is not exact.
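The negative criterion lends itself to a quick numerical test. The sketch below is illustrative only: the helper names are hypothetical, the partial derivatives are approximated by central differences, and the conditions are tested at a single sample point. A nonzero defect at one point shows that the form is not exact, whereas a zero defect at one point proves nothing by itself. The second form, $d(x^2 y) = 2xy\,dx + x^2\,dy$, is an example chosen here for comparison.

```python
def mixed_partial_defect(a, i, j, p, h=1e-5):
    """Approximate d a_j / d x_i - d a_i / d x_j at the point p (0-based indices)."""
    def d(k, l):  # central difference for d a_k / d x_l
        plus, minus = list(p), list(p)
        plus[l] += h
        minus[l] -= h
        return (a(*plus)[k] - a(*minus)[k]) / (2 * h)
    return d(j, i) - d(i, j)

def satisfies_conditions(a, n, p, tol=1e-6):
    """True if every condition with i < j holds at p up to the tolerance tol."""
    return all(abs(mixed_partial_defect(a, i, j, p)) < tol
               for i in range(n) for j in range(i + 1, n))

omega_example = lambda x, y: (2 * x * y, x * x * y)   # the 1-form of Example 6.4.2
omega_exact = lambda x, y: (2 * x * y, x * x)         # d(x^2 y), chosen for comparison

print(satisfies_conditions(omega_example, 2, (1.0, 2.0)))
print(satisfies_conditions(omega_exact, 2, (1.0, 2.0)))
```

The sample point matters: for the form of Example 6.4.2 the defect $2xy - 2x$ vanishes along $y = 1$ and along $x = 0$, so a point on those lines would not detect the failure.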
Let us return to Example 4.3.7 and consider the function
\[ f(x, y) = \arctan(x/y). \]
It is a straightforward exercise to check that $df = \omega$, and this would make $f$ a potential function for $\omega$ and therefore $\omega$ exact! How can we explain this seemingly contradictory statement? Note that $f$ is not defined for $y = 0$, therefore $f$ cannot be a valid potential function for all $(x, y)$. This means that $\omega$ is not globally exact on all its domain of definition $\mathbb{R}^2 \setminus \{(0, 0)\}$.
We say that a 1-form $\omega$ defined on an open subset $U \subset \mathbb{R}^n$ is locally exact if for every $p \in U$, there exists a neighborhood $V \subset U$ of $p$ and a function $f : V \to \mathbb{R}$ such that $\omega = df$ on $V$. Therefore, we can say that $\omega$ in Example 4.3.7 is locally exact on the domain $U = \mathbb{R}^2 \setminus \{(0, 0)\}$.
We now state a computational criterion which guarantees that a 1-form is locally exact.
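Theorem 6.4.5. Let $\omega$ be a $C^1$ differentiable 1-form defined in $U \subset \mathbb{R}^n$ where $U$ is an open subset. Then, $\omega$ is locally exact if and only if conditions (6.9) are satisfied.
Proof. We only need to prove that if conditions (6.9) are satisfied, then $\omega$ is locally exact. The other implication is given in Theorem 6.4.4. Let $p = (x_1^0, \dots, x_n^0) \in U$ and $\epsilon > 0$ be small enough so that the ball
\[ B_\epsilon(p) := \{x = (x_1, \dots, x_n) \in \mathbb{R}^n \mid \|x - p\| < \epsilon\} \subset U. \]
Let $x = (x_1, \dots, x_n) \in B_\epsilon(p)$ be an arbitrary point and consider the straight line path $L$ joining $p$ and $x$ with parametrization $r(t) = p + t(x - p)$, $0 \le t \le 1$. Note that
\[ r'(t) = x - p = (x_1 - x_1^0, \dots, x_n - x_n^0). \]
We are now ready to define the potential function as
\[ f(x) = \int_L \omega = \int_0^1 r^*\omega = \int_0^1 \sum_{i=1}^n a_i(r(t))(x_i - x_i^0)\,dt. \]
Thus, we need to verify that $df = \omega$ and this is where the conditions (6.9) are crucial. We do this by computing the partial derivative
\[
\begin{aligned}
\frac{\partial f(x)}{\partial x_j} &= \frac{\partial}{\partial x_j}\int_0^1 \sum_{i=1}^n a_i(r(t))(x_i - x_i^0)\,dt \\
&= \int_0^1 \left[\sum_{i=1}^n \frac{\partial}{\partial x_j}\big[a_i(p + t(x - p))\big](x_i - x_i^0) + a_j(r(t))\frac{\partial (x_j - x_j^0)}{\partial x_j}\right] dt \\
&= \int_0^1 \left[\sum_{i=1}^n \frac{\partial a_i}{\partial x_j}(r(t))\,(x_i - x_i^0)\,t + a_j(r(t))\right] dt
\end{aligned}
\tag{6.10}
\]
where we can differentiate under the integral because the integral is with respect to $t$ and the functions are continuous. We know from conditions (6.9) that for all $i \neq j$ we have
\[ \frac{\partial a_i}{\partial x_j} - \frac{\partial a_j}{\partial x_i} = 0. \]
We make the substitution in the last line of (6.10) to obtain
\[ \frac{\partial f(x)}{\partial x_j} = \int_0^1 \left[\sum_{i=1}^n \frac{\partial a_j}{\partial x_i}(r(t))\,(x_i - x_i^0)\,t + a_j(r(t))\right] dt. \]
We notice the following simplification, which is the chain rule applied to $t \mapsto a_j(r(t))$:
\[ \sum_{i=1}^n \frac{\partial a_j}{\partial x_i}(r(t))\,(x_i - x_i^0) = \frac{d}{dt}\,a_j(r(t)) \]
and this leads to
\[
\begin{aligned}
\frac{\partial f(x)}{\partial x_j} &= \int_0^1 \left[t\,\frac{d}{dt}\,a_j(r(t)) + a_j(r(t))\right] dt \\
&= \int_0^1 \frac{d}{dt}\big(a_j(r(t))\,t\big)\,dt = a_j(r(1)) = a_j(x).
\end{aligned}
\]
So $f$ is indeed the potential function for $\omega$:
\[ df = \sum_{i=1}^n \frac{\partial f(x)}{\partial x_i}\,dx_i = \sum_{i=1}^n a_i(x)\,dx_i = \omega. \]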
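The construction in the proof can be carried out numerically. The sketch below approximates the integral defining the potential along the straight line from $p$ to $x$ with a midpoint rule. The 1-form $2xy\,dx + x^2\,dy$, the base point $p = (0, 0)$ and the number of steps are assumptions chosen for illustration; for this form the integral can be evaluated by hand and gives $f(x, y) = x^2 y$.

```python
def potential(a, p, x, steps=2000):
    """Midpoint-rule approximation of the integral over [0, 1] of
    sum_i a_i(r(t)) (x_i - p_i), where r(t) = p + t (x - p)."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        point = [pi + t * (xi - pi) for pi, xi in zip(p, x)]
        coeffs = a(*point)
        total += sum(c * (xi - pi) for c, xi, pi in zip(coeffs, x, p))
    return total / steps

omega = lambda x, y: (2 * x * y, x * x)   # satisfies conditions (6.9)

value = potential(omega, (0.0, 0.0), (1.5, -2.0))
print(value, 1.5 ** 2 * (-2.0))   # numerical potential vs. x^2 y
```

For a form that violates (6.9) the same integral can still be computed, but its differential no longer reproduces the form.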
Exercises
(1) Determine whether the 1-forms are locally exact.

x y
(a) = dx + dy
y x
(b) = (x cos(ey ) + z 3 ) dx + 12 x2 sin(ey )ey + 2z dy + (3xz 2 + x + 2y) dz

(c) = (2xy 2 + 3x2 ) dx + 2x2 y dy + 2z dz

x y z
(d) = dx + dy + dz.
||(x, y, z)||3 ||(x, y, z)||3 ||(x, y, z)||3
7 Double and Triple Integrals
This chapter is concerned with the definitions of double and triple integrals as well as practical formulae for their computation. However, we begin with a section introducing a new concept called the wedge product, which we apply to differentials. The wedge product is useful to compute areas and volumes and forms the backbone of the definitions of double and triple integrals. The second section is about the double integral and may be considered as review for some readers, except for the last part of the section where we discuss Green's theorem. This theorem links the computation of line integrals of 1-forms to double integrals and is considered one of the fundamental theorems of vector calculus, the second presented in this text after Theorem 4.3.5. The final section discusses the triple integral over boxes and more general domains.
7.1 Area and Volume Forms
We show in Chapter 1, Section 1.5 that curves can be given an orientation. We

generalize the construction to higher dimensional spaces and surfaces. We say that
R2 is positively (negatively) oriented if angles increase when rotating around the
origin in the counterclockwise (clockwise) direction. This means that when defining
polar coordinates in the typical way, we are giving R2 a positive orientation.
Orientations of $\mathbb{R}^3$ are obtained as follows. Choose a positive orientation of $\mathbb{R}^2$ with axes $x$ and $y$ and let the $z$-axis be perpendicular to the $xy$-plane. If the positive numbers of the $z$-axis point up, we say that $\mathbb{R}^3$ has positive orientation; otherwise, it has negative orientation. One can give a positive and negative orientation to $\mathbb{R}^n$ for any $n$ inductively by choosing a positive orientation for $\mathbb{R}^{n-1}$; then the orientation of the $x_n$-axis determines a positive or negative orientation of $\mathbb{R}^n$. However, it is not possible to visualize the orientation in dimensions higher than three. In Chapter 10, we define orientations for two-dimensional surfaces in $\mathbb{R}^3$.
For computing integrals over two dimensional and three dimensional domains,
one approximates the area or volume by splitting the region into small pieces for
which the area or volume is known. We use parallelograms and parallelepipeds to
decompose our regions.
Parallelogram
The area of a parallelogram of base $b$ and height $h$ is $bh$. However, a parallelogram is often given in terms of the vectors defining its sides; for instance, let $u = (a_1, a_2)$ and $v = (b_1, b_2)$ be two vectors in $\mathbb{R}^2$ as Figure 7.1 shows. Recall from linear algebra that the area of the parallelogram given by $u$ and $v$ is also given by the norm of the cross-product $u \times v$, and we know that $u \times v = -v \times u$. Therefore, it seems that the cross-product of vectors is a suitable tool for our purposes. One drawback of the cross-product is that it is not generalizable to dimensions higher than three, and this alone is reason enough to consider instead the following approach.

Fig. 7.1. Vectors u and v

The area $A$ of the parallelogram $P(u, v)$ obtained with $u$ and $v$ is given by
\[ A(u, v) = |a_1 b_2 - a_2 b_1|. \]
Note that $P(u, v)$ and $P(v, u)$ give the same parallelogram and in the formula for the area,
\[ A(u, v) = |a_1 b_2 - a_2 b_1| = |b_1 a_2 - b_2 a_1| = A(v, u) \]
because of the absolute value. If one removes the absolute value, we obtain a signed area of $P(u, v)$ which is the negative of the signed area of $P(v, u)$. This suggests that the ordering of the vectors in the definition of a parallelogram, that is $P(u, v)$ versus $P(v, u)$, can be identified with the orientations of $\mathbb{R}^2$. We want to keep track of the orientation of parallelograms in $\mathbb{R}^2$ when computing area.
We know that basic 1-forms dx and dy take values that depend on the orientation
of vectors in R2 . Therefore, we define a product of 1-forms which does exactly what
we want.
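Definition 7.1.1. Let $u = (a_1, a_2)$ and $v = (b_1, b_2)$ be two vectors in $\mathbb{R}^2$ in the standard basis. The wedge product of the 1-forms $dx$, $dy$ on $(u, v)$ is a mapping $dx \wedge dy : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by
\[ (dx \wedge dy)(u, v) := \begin{vmatrix} dx(u) & dy(u) \\ dx(v) & dy(v) \end{vmatrix} = a_1 b_2 - a_2 b_1. \]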
The expression $(dx \wedge dy)(u, v)$ is the signed area of the parallelogram generated by the vectors $u$ and $v$.
If $dx$ and $dy$ are interchanged, this causes a change of orientation of $\mathbb{R}^2$ and the sign of the area changes:
\[
\begin{aligned}
(dy \wedge dx)(u, v) &= \begin{vmatrix} dy(u) & dx(u) \\ dy(v) & dx(v) \end{vmatrix} \\
&= a_2 b_1 - b_2 a_1 \\
&= -(dx \wedge dy)(u, v).
\end{aligned}
\]
This is called the skew-symmetry property of the wedge product. That is, the interchange of $dx$ and $dy$ induces a change in sign.
The wedge product of two differentiable 1-forms is an example of a differentiable 2-form; these are introduced in Chapter 8. We can now return to the problem of area and define the area form as
\[ dA = |dx \wedge dy| \]
which always gives a non-negative area. Note that it is not a 2-form.
Example 7.1.2. We compute the signed area of the parallelogram generated by the
vectors u = (1, 4) and v = (2, 1). We have,

dx(u) dy(u) 1 4
(dx dy)(u, v) = = = 7.

dx(v) dy(v) 2 1
The parallelogram P (u, v) is positively oriented and so the signed area is positive.
A geometric criterion to determine the sign of the area of a parallelogram P (u, v) is

to use a rotation with smallest angle to align u to v. If the smallest angle is achieved
via a counterclockwise rotation, then P (u, v) is positively oriented, otherwise it is
negatively oriented.
Fig. 7.2. Positive orientation of the parallelogram generated by u and v
It is straightforward to extend the wedge product of two basic 1-forms $dx_i$, $dx_j$ to vectors in higher dimensions as follows. For vectors $u = (a_1, \dots, a_n)$ and $v = (b_1, \dots, b_n)$, we define
\[ (dx_i \wedge dx_j)(u, v) := \begin{vmatrix} dx_i(u) & dx_j(u) \\ dx_i(v) & dx_j(v) \end{vmatrix} = a_i b_j - b_i a_j. \]
The vectors $u$ and $v$ generate a parallelogram $P(u, v)$ in $\mathbb{R}^n$. This means $(dx_i \wedge dx_j)(u, v)$ is the signed area of the projection of $P(u, v)$ to the two-dimensional $(x_i, x_j)$-subspace of $\mathbb{R}^n$. Indeed, the projection of $u$ and $v$ to the $(x_i, x_j)$-plane is given by the vectors $(a_i, a_j)$ and $(b_i, b_j)$.
Example 7.1.3. Consider the vectors u = (2, 2, 3) and v = (1, 3, 1). We com-
pute the signed area of the projections of the parallelogram P (u, v):

2 2
(dx1 dx2 )(u, v) = = 8

1 3

2 3
(dx1 dx3 )(u, v) = =5

1 1

2 3
(dx2 dx3 )(u, v) = = 7.

3 1
The parallelogram P (u, v) and its projection on the xy-plane are illustrated in Fig-
ure 7.3.
Consider the parallelogram in $\mathbb{R}^3$ given by the vectors
\[ u = (a_1, a_2, a_3) \quad\text{and}\quad v = (b_1, b_2, b_3). \]
The formula for the area of the parallelogram is also obtained from the norm of the cross-product
\[
\begin{aligned}
\|u \times v\| &= \sqrt{(a_2 b_3 - b_2 a_3)^2 + (a_1 b_3 - b_1 a_3)^2 + (a_1 b_2 - a_2 b_1)^2} \\
&= \sqrt{(dy \wedge dz)^2(u, v) + (dx \wedge dz)^2(u, v) + (dx \wedge dy)^2(u, v)}.
\end{aligned}
\]
Thus, we automatically have the following theorem.
Theorem 7.1.4. Let $u$ and $v$ be two vectors in $\mathbb{R}^3$. Then, the area $A_P$ of the parallelogram $P$ spanned by $u$ and $v$ is
\[ A_P = \sqrt{[(dy \wedge dz)(u, v)]^2 + [(dz \wedge dx)(u, v)]^2 + [(dx \wedge dy)(u, v)]^2}. \]
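The wedge product of two basic 1-forms and the area formula of Theorem 7.1.4 translate directly into code. In the sketch below, the function names, the 0-based indexing and the two sample vectors are choices made for illustration; the result is compared with the norm of the cross-product.

```python
import math

def wedge2(i, j, u, v):
    """(dx_i ^ dx_j)(u, v) = u_i v_j - v_i u_j, with 0-based indices i, j."""
    return u[i] * v[j] - v[i] * u[j]

def area(u, v):
    """Area of the parallelogram spanned by u, v in R^3 via the three projections."""
    return math.sqrt(wedge2(1, 2, u, v) ** 2      # (dy ^ dz)(u, v)
                     + wedge2(2, 0, u, v) ** 2    # (dz ^ dx)(u, v)
                     + wedge2(0, 1, u, v) ** 2)   # (dx ^ dy)(u, v)

def cross_norm(u, v):
    """Norm of the cross-product u x v, for comparison."""
    c = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    return math.sqrt(sum(x * x for x in c))

u, v = (1.0, 2.0, 0.0), (0.0, 1.0, 3.0)   # hypothetical sample vectors
print(area(u, v), cross_norm(u, v))
print(wedge2(0, 1, u, v), wedge2(0, 1, v, u))   # skew-symmetry: opposite signs
```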
We now turn our attention to parallelepipeds.

Fig. 7.3. Projection of the parallelogram generated by u and v onto the xy-plane.
Parallelepipeds
Consider a parallelepiped $Q(u, v, w)$ given by three vectors $u = (a_1, a_2, a_3)$, $v = (b_1, b_2, b_3)$ and $w = (c_1, c_2, c_3)$. Similar to parallelograms, the volume of $Q(u, v, w)$ is given by the product of the area of the base of $Q(u, v, w)$ times the height. In terms of the vectors $u, v, w$, recall the triple product
\[ Vol(Q(u, v, w)) = u \cdot (v \times w). \]
Then,
\[
\begin{aligned}
Vol(Q(u, v, w)) &= (a_1, a_2, a_3) \cdot (b_2 c_3 - b_3 c_2,\ -(b_1 c_3 - b_3 c_1),\ b_1 c_2 - b_2 c_1) \\
&= a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1) \\
&= \det \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix}.
\end{aligned}
\]
This shows a pattern similar to the one we have seen with the area of a parallelogram, and we recast the volume of a parallelepiped in terms of wedge products. This is easily done by tacking on another wedge and extending the definition to a $3 \times 3$ determinant.
Definition 7.1.5. Let $u = (a_1, a_2, a_3)$, $v = (b_1, b_2, b_3)$ and $w = (c_1, c_2, c_3)$ be three vectors in the standard basis of $\mathbb{R}^3$. Then, the wedge product of $dx$, $dy$ and $dz$ is
\[ dx \wedge dy \wedge dz : \mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R} \]
where
\[ (dx \wedge dy \wedge dz)(u, v, w) = \begin{vmatrix} dx(u) & dy(u) & dz(u) \\ dx(v) & dy(v) & dz(v) \\ dx(w) & dy(w) & dz(w) \end{vmatrix}. \]
This is the signed volume of a parallelepiped. It is an example of a differentiable 3-form. One sees that the wedge product formalism is easily extendable to higher dimensions in the way shown in Definition 7.1.5. Definition 7.1.5 also has skew-symmetry properties as above:
\[ dx \wedge dz \wedge dy = -dx \wedge dy \wedge dz = dy \wedge dx \wedge dz = dz \wedge dy \wedge dx \]
from the interchange of columns in the determinant.
Definition 7.1.6. The volume form is defined as
\[ dV = |dx \wedge dy \wedge dz|. \]
It always gives a non-negative volume. It is not a 3-form.
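The signed volume and the volume form can be sketched in a few lines. The function names and the sample vectors below are hypothetical and chosen for illustration; interchanging $dy$ and $dz$ is modelled by swapping the second and third components of every vector, which swaps two columns of the determinant.

```python
def wedge3(u, v, w):
    """(dx ^ dy ^ dz)(u, v, w): the 3 x 3 determinant with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def volume(u, v, w):
    """dV(u, v, w) = |(dx ^ dy ^ dz)(u, v, w)|, always non-negative."""
    return abs(wedge3(u, v, w))

def swap_yz(p):
    """Interchange the y and z components of a vector."""
    return (p[0], p[2], p[1])

u, v, w = (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (1.0, 1.0, 3.0)   # hypothetical vectors
print(wedge3(u, v, w))                               # signed volume
print(wedge3(swap_yz(u), swap_yz(v), swap_yz(w)))    # (dx ^ dz ^ dy)(u, v, w)
print(volume(u, v, w))
```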
Example 7.1.7. Compute the signed volume and volume of the parallelepiped given
by u = (2, 1, 2), v = (0, 3, 2) and w = (1, 1, 3).

dx(u) dy(u) dz(u)

(dx dy dz)(u, v, w) = dx(v) dy(v) dz(v)

dx(w) dy(w) dz(w)

2 1 2

= 0
3 2 = 14.

1 1 3
Therefore dV (u, v, w) = 14.
The signed areas and signed volume forms are used in a later section when we
introduce integration of forms. The area and volume forms are used in the next
section to define double and triple integrals.
Exercises
(1) Determine whether the parallelograms P (u, v) below are positively or negatively
oriented and compute their areas.
(a) P (u, v) with u = (1, 2), v = (3, 2)
(b) P (u, v) with u = (3, 1), v = (2, 4)

(c) P (u, v) with u = (0, 3), v = (1/2, 2 3 ).
(2) Compute the area of the parallelograms P (u, v) below
(a) P (u, v) with u = (3, 1, 0), v = (1, 2, 3)
(b) P (u, v) with u = (1, 0, 3), v = (1, 4, 1)
(c) P (u, v) with u = (2, 3, 3), v = (5, 1, 1)
(3) Show that $dx_i \wedge dx_j = 0$ if $i = j$ and that $dx_i \wedge dx_j \wedge dx_\ell = 0$ if $i = j$, $i = \ell$ or $j = \ell$.
(4) Compute the volume of the parallelepipeds Q(u, v, w) defined below.
(a) u = (2, 0, 0), v = (1, 1, 3), w = (3, 2, 4).
(b) u = (1, 1, 1), v = (1, 1, 1), w = (1, 0, 1).
(c) u = (4, 3, 1), v = (2, 1, 3), w = (0, 5, 7).
(5) Show that if $u = (a_1, a_2, a_3, a_4)^T$ and $v = (b_1, b_2, b_3, b_4)^T$ are vectors in $\mathbb{R}^4$, then
\[ \omega(u, v) = (dx_1 \wedge dx_3)(u, v) + (dx_2 \wedge dx_4)(u, v) \]
can be written
\[ \omega(u, v) = u^T J v = (a_1, a_2, a_3, a_4)\,J \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} \]
where
\[ J = \begin{pmatrix} 0 & I_2 \\ -I_2 & 0 \end{pmatrix} \]
with $I_2$ the $2 \times 2$ identity matrix and $0$ the $2 \times 2$ zero matrix.
(6) Using the representation of $\omega(u, v)$ of the previous problem, show that $\omega$ is unchanged if $(u, v)$ is replaced by $(Au, Av)$ where $A$ is a $4 \times 4$ matrix satisfying $A^T J A = J$.
7.2 Double integrals
We now study the question of integrating functions of two variables over rectangular and more general domains in $\mathbb{R}^2$. We begin with functions of two variables over rectangular domains. Suppose we want to compute the volume under a surface $z = f(x, y)$ over a rectangle $R \subset \mathbb{R}^2$. We look at some easy examples.
Example 7.2.1. Define the piecewise constant function $f$ on $R = [1, 2] \times [0, 1]$ by
\[
f(x, y) = \begin{cases}
2, & (x, y) \in R_1 = [1, 3/2] \times [0, 1/2] \\
1, & (x, y) \in R_2 = [3/2, 2] \times [0, 1] \\
3, & (x, y) \in R_3 = [1, 5/4] \times [1/2, 1] \\
3/2, & (x, y) \in R_4 = [5/4, 3/2] \times [1/2, 1].
\end{cases}
\]

Fig. 7.4. Splitting of the domain for f(x, y)

See Figure 7.4. Then, the volume under the function $f(x, y)$ is the sum of the volumes of the boxes with heights given by $f$ over the rectangles defined in the domain. We have
\[
\begin{aligned}
Volume &= 2\,Area(R_1) + 1\,Area(R_2) + 3\,Area(R_3) + \tfrac{3}{2}\,Area(R_4) \\
&= 2(1/4) + (1/2) + 3(1/8) + \tfrac{3}{2}(1/8) = 25/16.
\end{aligned}
\]
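The sum of box volumes in Example 7.2.1 can be reproduced with exact rational arithmetic. The sketch below encodes each rectangle together with the constant value of $f$ on it; the data layout is a choice made here for illustration.

```python
from fractions import Fraction as Fr

# (height, x-interval, y-interval) for the rectangles R1, ..., R4 of Example 7.2.1
pieces = [
    (Fr(2),    (Fr(1),    Fr(3, 2)), (Fr(0),    Fr(1, 2))),
    (Fr(1),    (Fr(3, 2), Fr(2)),    (Fr(0),    Fr(1))),
    (Fr(3),    (Fr(1),    Fr(5, 4)), (Fr(1, 2), Fr(1))),
    (Fr(3, 2), (Fr(5, 4), Fr(3, 2)), (Fr(1, 2), Fr(1))),
]

# Volume of each box is height times the area of its base rectangle.
volume = sum(h * (x1 - x0) * (y1 - y0) for h, (x0, x1), (y0, y1) in pieces)

# The four rectangles should tile R = [1, 2] x [0, 1], which has area 1.
total_area = sum((x1 - x0) * (y1 - y0) for _, (x0, x1), (y0, y1) in pieces)

print(volume, total_area)
```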
Let $R$ be a rectangle in $\mathbb{R}^2$, $R = [a_0, a_1] \times [b_0, b_1]$, and $f : R \subset \mathbb{R}^2 \to \mathbb{R}$ with $f(x, y) \ge 0$. We now estimate the volume between the $xy$-plane and the function $f$. First, we define a partition of $R$ into small subrectangles by partitioning each interval separately. We have
\[
\begin{aligned}
a_0 &= x_0 < x_1 < \dots < x_{n-1} < x_n = a_1 \\
b_0 &= y_0 < y_1 < \dots < y_{m-1} < y_m = b_1.
\end{aligned}
\]
There are $mn$ subrectangles labelled $R_{ij} := [x_i, x_{i+1}] \times [y_j, y_{j+1}]$ with $i = 0, \dots, n-1$ and $j = 0, \dots, m-1$. We obtain an approximation to the volume between the $xy$-plane and the function $f$ by creating a piecewise constant function, evaluating $f$ at a point $(x_{ij}^*, y_{ij}^*)$ for each subrectangle $R_{ij}$. We take the sum of the volumes of the boxes with base $R_{ij}$ and height $f(x_{ij}^*, y_{ij}^*)$; we obtain
\[ RS := \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x_{ij}^*, y_{ij}^*)\,dA(R_{ij}). \]
The expression $RS$ is called a Riemann sum, similar to the one in the definition of the integral of one-dimensional functions $f(x)$. Refining the grid of subrectangles $R_{ij}$ by letting $n, m \to \infty$ so that $dA(R_{ij}) \to 0$ for all $i, j$, the question which arises is the existence of the following limit
\[ \lim_{n, m \to \infty} \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x_{ij}^*, y_{ij}^*)\,dA(R_{ij}). \tag{7.1} \]

Fig. 7.5. Subrectangle $R_{ij}$ given by the partitions of $[a_0, a_1]$ and $[b_0, b_1]$.

The above construction is valid for functions $f(x, y)$ which are not necessarily positive and we have the following definition.
Definition 7.2.2. If the limit of the Riemann sum (7.1) of $f$ exists on $R$, then $f$ is said to be integrable on $R$ and
\[ \iint_R f(x, y)\,dA := \lim_{n, m \to \infty} \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} f(x_{ij}^*, y_{ij}^*)\,dA(R_{ij}). \]
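A Riemann sum over a uniform partition is straightforward to program. The following sketch makes several choices that the definition leaves open: the partition is uniform, the sample point in each subrectangle is its centre, and the test function $f(x, y) = x^2 + y$ on $[0, 1] \times [0, 2]$ is picked here because its double integral, $8/3$, is easy to compute by hand.

```python
def riemann_sum(f, a0, a1, b0, b1, n, m):
    """Riemann sum of f over [a0, a1] x [b0, b1] with an n x m uniform partition,
    sampling f at the centre of each subrectangle."""
    dx = (a1 - a0) / n
    dy = (b1 - b0) / m
    total = 0.0
    for i in range(n):
        for j in range(m):
            x_star = a0 + (i + 0.5) * dx
            y_star = b0 + (j + 0.5) * dy
            total += f(x_star, y_star) * dx * dy   # height times area of R_ij
    return total

f = lambda x, y: x * x + y
exact = 8.0 / 3.0
coarse = riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 10, 10)
fine = riemann_sum(f, 0.0, 1.0, 0.0, 2.0, 100, 100)
print(coarse, fine, exact)   # the finer partition should be closer to 8/3
```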
It is in general difficult to check for integrability and compute the integral of $f$ directly from the definition. Therefore, we first need to determine general conditions under which a function is integrable. This is done in the next theorem.
Theorem 7.2.3. Let $f : R \subset \mathbb{R}^2 \to \mathbb{R}$. Then,
(1) If $f : R \subset \mathbb{R}^2 \to \mathbb{R}$ is continuous, then $f$ is integrable on $R$.
(2) If $f : R \subset \mathbb{R}^2 \to \mathbb{R}$ is bounded on $R$ and has only finite size jump discontinuities over a finite set of smooth curves, then $f$ is integrable on $R$.
Proof. If $f$ is a continuous function on $R$, then it is bounded on $R$. For bounded functions with finite size jump discontinuities, the values $f(x_{ij}^*, y_{ij}^*)$ on each rectangle $R_{ij}$ can be bounded above and below and, as the area of $R_{ij}$ shrinks to zero, the limit converges.
As mentioned above, we also need a more effective way of computing the double in-
tegral than using the definition. We know from elementary calculus that for simple
integrals of functions of one variable, the Fundamental Theorem of Calculus enables
one to relate the value of the integral to the existence of an antiderivative, or indef-
inite integral. There is no such result for double integrals. However, it is possible to
decompose the double integral into a succession of two simple integrals as the next
result shows.
Theorem 7.2.4 (Fubini). If $f$ is integrable on the rectangle $R = [a_0, a_1] \times [b_0, b_1]$, then
\[ \iint_R f(x, y)\, dA = \int_{a_0}^{a_1} \int_{b_0}^{b_1} f(x, y)\, dy\, dx = \int_{b_0}^{b_1} \int_{a_0}^{a_1} f(x, y)\, dx\, dy. \]
The idea of this theorem is that one can decompose the rectangle $[a_0, a_1] \times [b_0, b_1]$
using slices parallel to one of the axes. This is done as follows, say for slices parallel
to the $y$-axis. Fix a value $x \in [a_0, a_1]$ and consider $f(x, y)$ with $y \in [b_0, b_1]$. Then,
we can compute the integral of the slice obtained using $f(x, y)$:
\[ \int_{b_0}^{b_1} f(x, y)\, dy. \tag{7.2} \]
Now, (7.2) depends on the value of $x$ chosen to obtain the slice and so the result
of (7.2) is a function of $x$. We write
\[ G(x) = \int_{b_0}^{b_1} f(x, y)\, dy. \]
Fig. 7.6. Slice at $x = x_0$ of the domain.
Thus, for every $x \in [a_0, a_1]$, $G(x)$ is the integral of a slice. Fubini's theorem tells us
that summing up all the integrals $G(x)$ for $x \in [a_0, a_1]$ corresponds to the integral
of $f(x, y)$ over $R$.
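Example 7.2.5. Let $R = [0, 3] \times [-1, 1]$ and $f(x, y) = e^{x+y}$. We compute
\[ \iint_R f(x, y)\, dA = \int_0^3 \int_{-1}^{1} e^{x+y}\, dy\, dx
   = \left( \int_0^3 e^x\, dx \right) \left( \int_{-1}^{1} e^y\, dy \right)
   = (e - e^{-1})(e^3 - 1). \]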
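The Riemann sum of this section can be evaluated numerically as a sanity check on Example 7.2.5. The following is a minimal sketch, not part of the original text: the function name `riemann_sum` and the choice of midpoints as sample points $(x_{ij}^*, y_{ij}^*)$ are assumptions made for illustration.

```python
import math

def riemann_sum(f, a0, a1, b0, b1, n, m):
    """Riemann sum of f over [a0, a1] x [b0, b1] using n*m subrectangles.

    Sample points are the midpoints of the subrectangles (an illustrative choice;
    the definition allows any point of each subrectangle).
    """
    dx = (a1 - a0) / n
    dy = (b1 - b0) / m
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(m):
            y = b0 + (j + 0.5) * dy
            total += f(x, y) * dx * dy  # f(x*, y*) times the area of R_ij
    return total

f = lambda x, y: math.exp(x + y)
approx = riemann_sum(f, 0.0, 3.0, -1.0, 1.0, 300, 200)
exact = (math.e - 1.0 / math.e) * (math.exp(3.0) - 1.0)
print(approx, exact)
```

Refining the grid (larger `n` and `m`) brings `approx` closer to the closed-form value obtained from Fubini's theorem.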
Computation of mass and populations

The double integral can be used to compute physical quantities such as total mass
or total electric charge if the mass or charge density is known. Suppose that a
rectangular plate $R$ of a certain material has density (with units of mass/unit area,
say kg/m$^2$) given by a continuous function $\rho(x, y)$. The total mass of the plate can
be computed as follows.
Decompose $R$ into subrectangles $R_{ij}$ for $i = 1, \dots, n$, $j = 1, \dots, m$ as shown in
the definition of the double integral. Then, we evaluate $\rho(x, y)$ at an arbitrary point
$(x_i, y_j)$ of $R_{ij}$. Let $A(R_{ij})$ be the area of the rectangle $R_{ij}$, then $\rho(x_i, y_j) A(R_{ij})$
has units of mass and gives an approximation of the mass of the plate defined by
$R_{ij}$. Therefore, we can write a Riemann sum with $\rho(x, y)$ giving an approximate
value for the total mass of the plate. Because $\rho$ is a continuous function, by taking
the limit of the Riemann sum, the total mass is given by
\[ \iint_R \rho(x, y)\, dA. \]
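Example 7.2.6. Suppose that $R = [-1, 1] \times [-1, 1]$ and $\rho(x, y) = |x| + |y|$, then
\begin{align*}
\iint_R \rho(x, y)\, dA &= \int_{-1}^{1} \int_{-1}^{1} \big(|x| + |y|\big)\, dx\, dy \\
&= \int_0^1 \int_0^1 (x + y)\, dx\, dy + \int_{-1}^{0} \int_{-1}^{0} (-x - y)\, dx\, dy \\
&\quad + \int_0^1 \int_{-1}^{0} (-x + y)\, dx\, dy + \int_{-1}^{0} \int_0^1 (x - y)\, dx\, dy \\
&= \int_0^1 \left[ \tfrac{1}{2}x^2 + yx \right]_0^1 dy - \int_{-1}^{0} \left[ \tfrac{1}{2}x^2 + yx \right]_{-1}^{0} dy \\
&\quad + \int_0^1 \left[ -\tfrac{1}{2}x^2 + yx \right]_{-1}^{0} dy + \int_{-1}^{0} \left[ \tfrac{1}{2}x^2 - yx \right]_0^1 dy \\
&= 4
\end{align*}
where the result is obtained by a simple computation of the integrals: each of the four
quadrants contributes 1.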
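The quadrant-by-quadrant computation of Example 7.2.6 can be checked numerically. This is a minimal sketch under stated assumptions: the helper `midpoint_double` is a hypothetical name, and it uses midpoint sample points, which integrate the piecewise-linear density exactly on each quadrant.

```python
def midpoint_double(f, x0, x1, y0, y1, n=200, m=200):
    """Midpoint Riemann sum of f over the rectangle [x0, x1] x [y0, y1]."""
    dx = (x1 - x0) / n
    dy = (y1 - y0) / m
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(m):
            y = y0 + (j + 0.5) * dy
            total += f(x, y)
    return total * dx * dy

rho = lambda x, y: abs(x) + abs(y)  # density of Example 7.2.6

# The four quadrants of [-1, 1] x [-1, 1], as (x0, x1, y0, y1).
quadrants = [(0, 1, 0, 1), (-1, 0, -1, 0), (-1, 0, 0, 1), (0, 1, -1, 0)]
pieces = [midpoint_double(rho, *q) for q in quadrants]
mass = sum(pieces)
print(pieces, mass)
```

Each quadrant carries the same mass by symmetry, so the total is four times the first-quadrant integral of $x + y$ over the unit square.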
We can apply this approach to any density function, for instance, population density.
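Example 7.2.7. Suppose that a bacterial colony is cultivated in a square domain $R$
of sides of length $a$ and that the density of population is estimated by
$\rho(x, y) = 3 - (x^2 + y^2)$. The total population is obtained by integrating over the domain:
\begin{align*}
\iint_R \rho(x, y)\, dA &= \int_0^a \int_0^a \big(3 - (x^2 + y^2)\big)\, dy\, dx \\
&= \int_0^a \left[ (3 - x^2) y - \frac{y^3}{3} \right]_0^a dx \\
&= \int_0^a \left( (3 - x^2) a - \frac{a^3}{3} \right) dx \\
&= \left[ 3xa - \frac{x^3}{3} a - \frac{a^3}{3} x \right]_0^a \\
&= 3a^2 - \frac{2a^4}{3}.
\end{align*}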
7.2.1 Integrals over general two-dimensional domain
Let $D \subset \mathbb{R}^2$ be a bounded region and $f(x, y)$ be a continuous function defined on
$D$. We suppose that the boundary of $D$ is made up of a finite number of smooth
curves. Suppose that $D \subset R$ where $R$ is a rectangle in $\mathbb{R}^2$, see Figure 7.7. We show
that the double integral over $D$ can be defined. Let
\[ F(x, y) = \begin{cases} f(x, y) & \text{if } (x, y) \in D \\ 0 & \text{if } (x, y) \in R \setminus D. \end{cases} \]
$F(x, y)$ is bounded on $R$ and continuous on the pieces $D$ and $R \setminus D$, with a jump
discontinuity on a finite number of smooth curves. By Theorem 7.2.3, $F(x, y)$ is
integrable on $R$ and we define
\[ \iint_D f(x, y)\, dA := \iint_R F(x, y)\, dA. \]
Fig. 7.7. Domain $D$ with boundary given by piecewise smooth curves.
This definition is satisfactory because the contribution of the double integral of
$F(x, y)$ on $R \setminus D$ is zero. We now enumerate a list of cases of domains $D$ in $\mathbb{R}^2$ that
are computationally tractable.
Type I:
A planar domain $D$ is said to be of type I if it lies between the graphs of two continuous
functions of $x$ with the same domain, see Figure 7.8.
Fig. 7.8. Schematic of a domain of Type I, lying between the graphs of $g_1(x)$ and $g_2(x)$.
\[ D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ g_1(x) \le y \le g_2(x)\} \]
The general formula for the double integral is
\[ \iint_D f(x, y)\, dA = \int_a^b \left( \int_{g_1(x)}^{g_2(x)} f(x, y)\, dy \right) dx. \]
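Example 7.2.8. Compute the volume of the region $R$ between the $xy$-plane with
$y \ge 0$, the elliptic paraboloid $z = x^2 + y^2$ and the cylinder $x^2 + y^2 = 1$. This region is
bounded below by the $xy$-plane and above by the elliptic paraboloid. But the domain
$D$ in the $xy$-plane is bounded by the cylinder and by the line $y = 0$. We have
\[ D = \{(x, y) \in \mathbb{R}^2 \mid -1 \le x \le 1,\ 0 \le y \le \sqrt{1 - x^2}\}. \]
Therefore, the volume of $R$ is given by
\begin{align*}
\mathrm{Volume}(R) &= \iint_D (x^2 + y^2)\, dA = \int_{-1}^{1} \left( \int_0^{\sqrt{1 - x^2}} (x^2 + y^2)\, dy \right) dx \\
&= \int_{-1}^{1} \left[ x^2 y + \frac{1}{3} y^3 \right]_0^{\sqrt{1 - x^2}} dx \\
&= \int_{-1}^{1} \left( x^2 \sqrt{1 - x^2} + \frac{1}{3} (1 - x^2)^{3/2} \right) dx \\
&= \left[ -\frac{1}{4} x (1 - x^2)^{3/2} + \frac{1}{8} x \sqrt{1 - x^2} + \frac{1}{8} \arcsin x \right]_{-1}^{1} \\
&\quad + \frac{1}{3} \left[ \frac{1}{4} x (1 - x^2)^{3/2} + \frac{3}{8} x \sqrt{1 - x^2} + \frac{3}{8} \arcsin x \right]_{-1}^{1} \\
&= \frac{\pi}{4}.
\end{align*}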
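The Type I formula translates directly into a nested numerical integration in which the inner limits depend on $x$. The sketch below is illustrative only: `type1_integral` is a hypothetical helper using midpoint sums, applied to the half-disk of Example 7.2.8, where the expected value is $\pi/4$.

```python
import math

def type1_integral(f, a, b, g1, g2, n=400, m=400):
    """Midpoint approximation of int_a^b ( int_{g1(x)}^{g2(x)} f(x, y) dy ) dx."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        lo, hi = g1(x), g2(x)          # the slice at x runs from g1(x) to g2(x)
        dy = (hi - lo) / m
        inner = sum(f(x, lo + (j + 0.5) * dy) for j in range(m)) * dy
        total += inner * dx
    return total

volume = type1_integral(
    lambda x, y: x * x + y * y,        # height of the paraboloid z = x^2 + y^2
    -1.0, 1.0,
    lambda x: 0.0,                     # g1(x) = 0
    lambda x: math.sqrt(1.0 - x * x),  # g2(x) = sqrt(1 - x^2)
)
print(volume, math.pi / 4)
```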
Type II:
A planar domain $D$ is said to be of type II if it lies between the graphs of two
continuous functions of $y$.
\[ D = \{(x, y) \in \mathbb{R}^2 \mid c \le y \le d,\ f_1(y) \le x \le f_2(y)\}. \]
An example of a type II domain is found in Figure 7.9. The general formula for the
double integral is
\[ \iint_D f(x, y)\, dA = \int_c^d \left( \int_{f_1(y)}^{f_2(y)} f(x, y)\, dx \right) dy. \]
Fig. 7.9. Schematic domain of Type II, lying between the graphs of $f_1(y)$ and $f_2(y)$.
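Example 7.2.9. Compute the volume of the region $R$ enclosed by the cylinder
$x - |y| - 1 = 0$, the planes $y = 1$, $y = -1/2$, $x = 0$ and below the function $f(x, y) = 2y$.
We see that $x$ values are constrained to be between 0 and the lines given by $x = 1 + |y|$,
where $y$ takes its values between $-1/2$ and 1. This is indeed a region of type II and
we write
\[ D = \{(x, y) \in \mathbb{R}^2 \mid -1/2 \le y \le 1,\ 0 \le x \le 1 + |y|\}. \]
Therefore, the (signed) volume is
\begin{align*}
\iint_D 2y\, dA &= \int_{-1/2}^{1} \left( \int_0^{1 + |y|} 2y\, dx \right) dy \\
&= 2 \int_{-1/2}^{1} y (1 + |y|)\, dy \\
&= 2 \left( \int_{-1/2}^{0} y (1 - y)\, dy + \int_0^1 y (1 + y)\, dy \right) \\
&= 2 \left( -\frac{1}{6} + \frac{5}{6} \right) = \frac{4}{3}.
\end{align*}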
Type III:
A domain $D$ is of type III if it can be written as a type I or as a type II domain.
Here are some examples.
(1) $D$ is the region bounded by the ellipse
\[ \frac{x^2}{4} + \frac{y^2}{3} = 1 \]
in the first quadrant, see Figure 7.10. We can write
\begin{align*}
D &= \{(x, y) \mid 0 \le x \le 2,\ 0 \le y \le \sqrt{3 (1 - x^2/4)}\} \\
  &= \{(x, y) \mid 0 \le y \le \sqrt{3},\ 0 \le x \le 2 \sqrt{1 - y^2/3}\}.
\end{align*}
(2) $D$ is the region bounded by the curves $x = y^2$ and $y = e^x - 1$, see Figure 7.11.
These two curves intersect at $(0, 0)$ and at $(x_0, y_0)$ with
\[ y_0 = e^{y_0^2} - 1 \]
where $y_0 \approx 0.7469$ which means $x_0 = y_0^2 \approx 0.5578$. Therefore,
\begin{align*}
D &= \{(x, y) \mid 0 \le x \le x_0,\ e^x - 1 \le y \le \sqrt{x}\} \\
  &= \{(x, y) \mid 0 \le y \le y_0,\ y^2 \le x \le \ln(y + 1)\}.
\end{align*}
Fig. 7.10. Domain of Type III bounded by the ellipse in the first quadrant.
Fig. 7.11. Domain of Type III bounded by the exponential and the parabola.
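The intersection point of $x = y^2$ and $y = e^x - 1$ has no closed form, so the value $y_0 \approx 0.7469$ has to be found numerically. A minimal sketch using bisection follows; the helper name `bisect` and the bracket $[0.5, 1]$ are assumptions chosen for this illustration (the function $g(y) = e^{y^2} - 1 - y$ changes sign on that interval).

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Bisection root finder; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # sign change in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # sign change in [mid, hi]
    return 0.5 * (lo + hi)

# Nonzero solution of y = exp(y^2) - 1.
g = lambda y: math.exp(y * y) - 1.0 - y
y0 = bisect(g, 0.5, 1.0)
x0 = y0 ** 2
print(round(y0, 4), round(x0, 4))
```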
It is often the case that for type III domains, one of the parametrizations of $D$ leads
to an easier computation.
Example 7.2.10. Consider the double integral
\[ \iint_D e^{-x^2}\, dA \]
where $D$ is the domain bounded by the lines $y = 0$, $y = x$ and $x = 1$. This is a
domain of type III and we write
\[ D = \{(x, y) \in \mathbb{R}^2 \mid 0 \le y \le 1,\ y \le x \le 1\}. \]
Then
\[ \iint_D e^{-x^2}\, dA = \int_0^1 \int_y^1 e^{-x^2}\, dx\, dy \]
but the inner integral does not have an antiderivative which can be written in what
are called elementary terms: polynomials, rational functions, trigonometric functions,
exponential and logarithmic functions. Hence, the integral cannot be computed
explicitly in this order.
If we write instead
\[ D = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x \le 1,\ 0 \le y \le x\}, \]
then the order of integration is changed and we have
\begin{align*}
\iint_D e^{-x^2}\, dA &= \int_0^1 \int_0^x e^{-x^2}\, dy\, dx \\
&= \int_0^1 e^{-x^2} \left( \int_0^x dy \right) dx \\
&= \int_0^1 x e^{-x^2}\, dx \\
&= \frac{1}{2} \int_0^1 e^{-u}\, du \qquad (u = x^2) \\
&= \frac{1}{2} (1 - e^{-1}).
\end{align*}
Fig. 7.12. Triangular domain of Type III.
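Both orders of integration in Example 7.2.10 give the same number, even though only one of them is elementary. The sketch below is an illustration under assumptions: the triangle is described as $y \le x \le 1$ for $0 \le y \le 1$, the inner integral in the $dx\,dy$ order is expressed through the error function `math.erf` (a non-elementary special function available in the Python standard library), and `midpoint` is a hypothetical one-dimensional quadrature helper.

```python
import math

def midpoint(h, a, b, n=2000):
    """One-dimensional midpoint rule for the integral of h over [a, b]."""
    dx = (b - a) / n
    return sum(h(a + (k + 0.5) * dx) for k in range(n)) * dx

# Order dy dx: the inner integral over y in [0, x] is simply x * exp(-x^2).
dy_dx = midpoint(lambda x: x * math.exp(-x * x), 0.0, 1.0)

# Order dx dy: the inner integral over x in [y, 1] of exp(-x^2) needs erf:
# int_y^1 exp(-x^2) dx = (sqrt(pi) / 2) * (erf(1) - erf(y)).
inner = lambda y: 0.5 * math.sqrt(math.pi) * (math.erf(1.0) - math.erf(y))
dx_dy = midpoint(inner, 0.0, 1.0)

exact = 0.5 * (1.0 - math.exp(-1.0))
print(dy_dx, dx_dy, exact)
```

The agreement of the two numerical values with $\tfrac12(1 - e^{-1})$ illustrates why choosing the order of integration is a matter of convenience, not of correctness.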
More general domains

If a domain is not of type I, II or III one can in general split the domain into such
regions and perform the integrals.
Proposition 7.2.11. Let $D \subset \mathbb{R}^2$ and suppose $D = D_1 \cup D_2$ where the intersection
of $D_1$ and $D_2$ is only along smooth curves. Then
\[ \iint_D f\, dA = \iint_{D_1} f\, dA + \iint_{D_2} f\, dA. \]
We illustrate the statement in the next example.
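Example 7.2.12. Let $D$ be the region enclosed by the lines $L_1, \dots, L_6$ with $L_1$ joining
$(0, 0)$ to $(0, 2)$, $L_2$ joining $(0, 2)$ to $(1, 3)$, $L_3$ joining $(1, 3)$ to $(2, 2)$, $L_4$ joining $(2, 2)$ to
$(\tfrac{3}{2}, 1)$, $L_5$ joining $(\tfrac{3}{2}, 1)$ to $(2, 0)$ and finally $L_6$ joining $(2, 0)$ to $(0, 0)$. See Figure 7.13.
We can split $D$ into four regions:
\begin{align*}
D_1 &= \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le x + 2\} \\
D_2 &= \{(x, y) \mid 1 \le x \le 2,\ 2 \le y \le 4 - x\} \\
D_3 &= \{(x, y) \mid 1 \le y \le 2,\ 1 \le x \le \tfrac{y}{2} + 1\} \\
D_4 &= \{(x, y) \mid 0 \le y \le 1,\ 1 \le x \le -\tfrac{y}{2} + 2\}.
\end{align*}
Then, if $f : D \to \mathbb{R}$ is integrable we can write
\begin{align*}
\iint_D f\, dA &= \iint_{D_1} f\, dA + \iint_{D_2} f\, dA + \iint_{D_3} f\, dA + \iint_{D_4} f\, dA \\
&= \int_0^1 \left( \int_0^{x + 2} f\, dy \right) dx + \int_1^2 \left( \int_2^{4 - x} f\, dy \right) dx \\
&\quad + \int_1^2 \left( \int_1^{1 + y/2} f\, dx \right) dy + \int_0^1 \left( \int_1^{2 - y/2} f\, dx \right) dy.
\end{align*}
Fig. 7.13. Union of domains $D_1$, $D_2$, $D_3$, $D_4$ of Types I and II.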
Exercises
(1) Calculate the iterated integrals.
(a) $\int_1^3 \int_1^5 4xy\, dy\, dx$
(b) $\int_0^2 \int_0^1 y \cos(xy)\, dx\, dy$
(c) $\int_{-1}^{1} \int_0^{y} e^{y^2}\, dx\, dy$
(2) Find the volume of the solid that lies under the hyperbolic paraboloid
$z = 9y^2 - 3x^2 + 6$ and above the rectangle $R = [-1, 1] \times [1, 2]$.
(3) Consider the domains below and determine if they are of type I, II, III.
(a) D1 = {(x, y) | 1 y 1, y 2 x y}.
(b) D2 is the triangular region with vertices (0, 1), (1, 2), (4, 1).
(c) $D_3 = \{(x, y) \mid 0 \le x \le \pi,\ 0 \le y \le \sin(x)\}$

(d) D4 is the circle with centre at the origin and radius 2.
(4) Evaluate the following double integrals

(a) 2y dA where D is bounded by y = x, y = x3 , x 0.
D
(b) x2 y 2 dA where D is enclosed by y = |x| and y = 2.

D
(c) (x + y) dA where D is enclosed by x3 = y 2 and y = x + 1.

D
(5) Compute the double integral of f (x, y) = x2 y + 3 over the domain
D = {(x, y) | 1 x 1, x + 1 y x + 1}
(6) A heavy snowfall leaves on a flat roof of rectangular shape $R$ of dimensions
20 m $\times$ 15 m an uneven depth of snow due to the wind. Let $R = [0, 20] \times [0, 15]$
and we approximate the depth of the snow cover (in meters) on the roof by
the function $f(x, y) = 1 + \frac{1}{1000} x^2 + \frac{3}{1000} y^2$. Suppose that the snow density is
estimated as 50 kg/m$^3$, determine the density of snow per m$^2$ and calculate the
total mass of snow on the roof.
(7) The density of woodland caribou (Rangifer tarandus caribou) living in the boreal
forest of northern Québec has been estimated (for the Temiscamie herd)
at 2.04 caribou/100 km$^2$ [4]. Suppose that for conservation purposes, an area of
prime ecological conditions $D$ for the woodland caribou is protected from human
disturbances and has the shape given in Figure 7.14. How many woodland
caribou can be estimated to inhabit the conservation area?
Fig. 7.14. Domain $D$ of Exercise 7 (edge lengths labelled in the figure: 12 km, 12 km, 6 km, 12 km, 45 km, 48 km, 3 km, 6 km, 12 km, 18 km).
7.3 Green's Theorem
After the Fundamental Theorem of Line Integrals, Green's theorem is the next major
theorem of Vector Calculus. It provides a relation similar to the Fundamental
Theorem, but between line integrals of 1-forms in $\mathbb{R}^2$ and double integrals. We begin
with the statement in the simplest case, that of a unique simple closed curve
surrounding a domain where a 1-form is defined.
Theorem 7.3.1 (Green's Theorem: simplest case). Let $C \subset \mathbb{R}^2$ be a simple closed
curve oriented counterclockwise surrounding an open domain $D \subset \mathbb{R}^2$. Let
$\omega = a(x, y)\, dx + b(x, y)\, dy$ be a 1-form with $C^2$ coefficients defined for all $(x, y) \in D$.
Then,
\[ \int_C \omega = \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)}. \]
There are several proofs of Green's theorem and we provide a proof for the more
general case described below.
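Example 7.3.2. The simplest situation which illustrates Green's theorem is in the
case of a curve $C = C_1 \cup C_2 \cup C_3 \cup C_4$ enclosing a rectangular region $D$ given
by $[0, \alpha] \times [0, \beta]$. We compute the line integral using parametrizations $r_1(t) = (t, 0)$
for $t \in [0, \alpha]$, $r_2(t) = (\alpha, t)$ for $t \in [0, \beta]$, $r_3(t) = (\alpha - t, \beta)$ for $t \in [0, \alpha]$ and
$r_4(t) = (0, \beta - t)$ for $t \in [0, \beta]$. Then, one obtains
\begin{align*}
\int_C \omega &= \int_{C_1} \omega + \int_{C_2} \omega + \int_{C_3} \omega + \int_{C_4} \omega \tag{7.3} \\
&= \int_0^\alpha a(t, 0)\, dt + \int_0^\beta b(\alpha, t)\, dt - \int_0^\alpha a(\alpha - t, \beta)\, dt - \int_0^\beta b(0, \beta - t)\, dt \\
&= \int_0^\alpha a(t, 0)\, dt + \int_0^\beta b(\alpha, t)\, dt - \int_0^\alpha a(u, \beta)\, du - \int_0^\beta b(0, u)\, du \\
&= -\int_0^\alpha \big( a(t, \beta) - a(t, 0) \big)\, dt + \int_0^\beta \big( b(\alpha, t) - b(0, t) \big)\, dt
\end{align*}
where we use the substitution rule above, and return from the $u$ to the $t$ variable in
the last line. Using the Fundamental Theorem of Calculus, the last line is equal to
the following
\begin{align*}
&= \int_0^\beta \int_0^\alpha \frac{\partial b}{\partial x}(x, t)\, dx\, dt - \int_0^\alpha \int_0^\beta \frac{\partial a}{\partial y}(t, y)\, dy\, dt \\
&= \int_0^\alpha \int_0^\beta \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dy\, dx \tag{7.4} \\
&= \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)}.
\end{align*}
This example is important not only in understanding the origin of the formula of
Green's theorem, but we use it also in Chapter 10 to provide a more general proof
than the one we provide in this chapter. Stay tuned.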
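The rectangle computation can be tested numerically on a concrete 1-form. The following is a sketch under assumptions that are not in the text: the coefficients $a(x, y) = x y^2$ and $b(x, y) = x^2 + y$, the rectangle $[0, 2] \times [0, 1]$, and the helper `midpoint` are all chosen here purely for illustration. Both sides of Green's formula evaluate to 2 for this choice.

```python
def midpoint(h, lo, hi, n=200):
    """One-dimensional midpoint rule for the integral of h over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(h(lo + (k + 0.5) * dx) for k in range(n)) * dx

alpha, beta = 2.0, 1.0                      # rectangle [0, alpha] x [0, beta]
a = lambda x, y: x * y ** 2                 # coefficient of dx (illustrative)
b = lambda x, y: x ** 2 + y                 # coefficient of dy (illustrative)
db_dx = lambda x, y: 2.0 * x                # partial derivative of b in x
da_dy = lambda x, y: 2.0 * x * y            # partial derivative of a in y

# Line integral over the four sides, traversed counterclockwise.
line = (
    midpoint(lambda t: a(t, 0.0), 0.0, alpha)             # bottom side, left to right
    + midpoint(lambda t: b(alpha, t), 0.0, beta)          # right side, upwards
    - midpoint(lambda t: a(alpha - t, beta), 0.0, alpha)  # top side, right to left
    - midpoint(lambda t: b(0.0, beta - t), 0.0, beta)     # left side, downwards
)

# Double integral of db/dx - da/dy as an iterated integral.
double = midpoint(
    lambda x: midpoint(lambda y: db_dx(x, y) - da_dy(x, y), 0.0, beta),
    0.0, alpha,
)
print(line, double)
```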
We now look at some applications of Green's theorem.
Example 7.3.3. Let $\omega = x\, dy$. Then $a(x, y) = 0$ and $b(x, y) = x$, so Green's theorem
yields
\[ \int_C \omega = \iint_D dA_{(x,y)} \]
where the right-hand side is the area of the domain $D$. The reader can check that the
same result can be obtained by using the 1-forms $\omega = -y\, dx$ or $\omega = \tfrac{1}{2}(-y\, dx + x\, dy)$.
It is a remarkable consequence of Green's theorem that the line integral of a particular
1-form gives the area of the region it encloses.
Fig. 7.15. Rectangular domain with boundary oriented counterclockwise.
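For a polygon, the line integral of $x\, dy$ along each straight edge can be computed exactly, which turns the area formula into a short algorithm. The sketch below is an illustration, not part of the original text; the function name `area_by_line_integral` and the two sample polygons are assumptions.

```python
def area_by_line_integral(vertices):
    """Integral of x dy around the closed polygon through `vertices`.

    For a counterclockwise polygon this equals the enclosed area;
    a clockwise traversal gives the negative of the area.
    """
    total = 0.0
    k = len(vertices)
    for i in range(k):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % k]
        # Along a straight edge, x is linear in the parameter, so the integral
        # of x dy is (average of x) times (change in y).
        total += 0.5 * (x0 + x1) * (y1 - y0)
    return total

rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]                 # counterclockwise
hexagon = [(0, 0), (2, 0), (1.5, 1), (2, 2), (1, 3), (0, 2)]  # counterclockwise
print(area_by_line_integral(rectangle))        # 2.0
print(area_by_line_integral(hexagon))          # 4.5
print(area_by_line_integral(hexagon[::-1]))    # -4.5
```

The sign change under reversal of the vertex order reflects the role of orientation in Green's theorem.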
The more general version of Green's theorem takes into account the fact that the
boundary of the region $D$ may be made up of a simple closed curve $C$ forming the
outer boundary and several simple closed curves surrounding holes within the region
bounded by $C$. See Figure 7.16 and note that the closed curves inside $C$ are given
the orientation opposite to that of $C$. The convention is to take $C$ with
counterclockwise orientation, referred to as positive orientation; the inner curves then
have clockwise orientation, referred to as negative orientation. This choice is not
arbitrary and guarantees that the formula for Green's theorem of Theorem 7.3.1 is
unchanged for a domain with holes.
Indeed, consider the simple situation in $\mathbb{R}^2$ where $D$ is the region bounded by
$\partial D = C_1 \cup C_2$ where $C_1$ is a circle of radius 1 and $C_2$ is a circle of radius 2 both
oriented counterclockwise, see Figure 7.17. Let $D_i$ be the region enclosed by the
curve $C_i$, for $i = 1, 2$. Then,
\[ \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) = \iint_{D_2} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) - \iint_{D_1} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right). \]
Applying Green's theorem to each of the curves $C_2$ and $C_1$, we obtain
\[ \iint_{D_2} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) - \iint_{D_1} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) = \int_{C_2} \omega - \int_{C_1} \omega. \]
Fig. 7.16. A simple closed curve $C$ surrounding a region with holes bounded by several
simple closed curves. Note the opposite orientation of the inner boundary curves.
Fig. 7.17. Annular domain bounded by the curves $C_1$ and $C_2$ with counterclockwise orientations.
Let $-C_1$ be the curve $C_1$ with opposite orientation, then
\[ \int_{-C_1} \omega = -\int_{C_1} \omega. \]
Thus,
\[ \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) = \int_{C_2} \omega + \int_{-C_1} \omega = \int_{C_2 \cup (-C_1)} \omega. \]
This example illustrates why the boundary curves are chosen with opposite orientations:
in this case, $\partial D = C_2 \cup (-C_1)$ where $C_2$ and $-C_1$ have opposite orientations.
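Theorem 7.3.4. Let $D \subset \mathbb{R}^2$ be a domain and $C = \partial D$. Let $\omega = a(x, y)\, dx +
b(x, y)\, dy$ be a 1-form with $C^2$ coefficients defined for all $(x, y) \in D$. Then,
\[ \int_C \omega = \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)}. \tag{7.5} \]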
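Proof. We verify the formula of Green's theorem for regions of Type I and Type II,
but only do the proof for Type I regions as the proof for Type II regions is completely
analogous. More general regions can be decomposed into a union of Type I and Type
II regions and we return to this case at the end of the proof.
Let
\[ D = \{(x, y) \mid \alpha \le x \le \beta,\ g_1(x) \le y \le g_2(x)\}. \]
The boundary curve $C$ is made up of four pieces
\begin{align*}
C_1 &= \{(t, g_1(t)) \mid \alpha \le t \le \beta\}, & C_2 &= \{(\beta, t) \mid g_1(\beta) \le t \le g_2(\beta)\}, \\
C_3 &= \{(t, g_2(t)) \mid \alpha \le t \le \beta\}, & C_4 &= \{(\alpha, t) \mid g_1(\alpha) \le t \le g_2(\alpha)\}.
\end{align*}
We compute
\begin{align*}
\int_C \omega &= \int_{C_1} \omega + \int_{C_2} \omega + \int_{-C_3} \omega + \int_{-C_4} \omega \\
&= \int_{C_1} \omega + \int_{C_2} \omega - \int_{C_3} \omega - \int_{C_4} \omega. \tag{7.6}
\end{align*}
The following integrals can be verified by the reader using the pullback formula
\begin{align*}
\int_{C_1} \omega &= \int_\alpha^\beta \big( a(t, g_1(t)) + b(t, g_1(t))\, g_1'(t) \big)\, dt \\
\int_{C_2} \omega &= \int_{g_1(\beta)}^{g_2(\beta)} b(\beta, t)\, dt \\
\int_{C_3} \omega &= \int_\alpha^\beta \big( a(t, g_2(t)) + b(t, g_2(t))\, g_2'(t) \big)\, dt \tag{7.7} \\
\int_{C_4} \omega &= \int_{g_1(\alpha)}^{g_2(\alpha)} b(\alpha, t)\, dt.
\end{align*}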
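Collecting the integrals involving $a(x, y)$ from the $C_1$ and $C_3$ integrals we obtain
\[ \int_\alpha^\beta \big( a(t, g_1(t)) - a(t, g_2(t)) \big)\, dt = -\int_\alpha^\beta \left( \int_{g_1(t)}^{g_2(t)} \frac{\partial a}{\partial y}(t, y)\, dy \right) dt = -\iint_D \frac{\partial a}{\partial y}\, dA_{(x,y)}. \]
We now consider the computation of the integrals involving $b(x, y)$. Let $B^y(x, y)$ be
the antiderivative of $b(x, y)$ with respect to $y$. Then
\[ \int_{g_1(\beta)}^{g_2(\beta)} b(\beta, y)\, dy = B^y(\beta, g_2(\beta)) - B^y(\beta, g_1(\beta)) \]
and
\[ -\int_{g_1(\alpha)}^{g_2(\alpha)} b(\alpha, y)\, dy = -B^y(\alpha, g_2(\alpha)) + B^y(\alpha, g_1(\alpha)). \]
To compute the integrals involving $b(t, g_1(t))$ and $b(t, g_2(t))$, we compute for $i = 1, 2$,
\begin{align*}
\frac{d B^y(t, g_i(t))}{dt} &= \frac{\partial B^y}{\partial x} \frac{dx}{dt} + \frac{\partial B^y}{\partial y} \frac{dy}{dt} \\
&= \frac{\partial B^y}{\partial x}(t, g_i(t)) + b(t, g_i(t))\, g_i'(t).
\end{align*}
Integrating with respect to $t$ we have
\begin{align*}
B^y(\beta, g_i(\beta)) - B^y(\alpha, g_i(\alpha)) &= \int_\alpha^\beta \frac{d B^y(t, g_i(t))}{dt}\, dt \\
&= \int_\alpha^\beta \frac{\partial B^y}{\partial x}(t, g_i(t))\, dt + \int_\alpha^\beta b(t, g_i(t))\, g_i'(t)\, dt.
\end{align*}
Therefore,
\[ \int_\alpha^\beta b(t, g_i(t))\, g_i'(t)\, dt = B^y(\beta, g_i(\beta)) - B^y(\alpha, g_i(\alpha)) - \int_\alpha^\beta \frac{\partial B^y}{\partial x}(t, g_i(t))\, dt. \]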
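Collecting the terms together using (7.6) and (7.7), the ones involving only $B^y$ cancel
out each other and because $x = t$ on $C_1$ and $C_3$, we are left with
\begin{align*}
\int_C \omega &= \int_\alpha^\beta \left( \frac{\partial B^y}{\partial x}(x, g_2(x)) - \frac{\partial B^y}{\partial x}(x, g_1(x)) \right) dx - \iint_D \frac{\partial a}{\partial y}\, dA_{(x,y)} \\
&= \int_\alpha^\beta \left( \int_{g_1(x)}^{g_2(x)} \frac{\partial^2 B^y(x, y)}{\partial y\, \partial x}\, dy \right) dx - \iint_D \frac{\partial a}{\partial y}\, dA_{(x,y)} \qquad (*) \\
&= \int_\alpha^\beta \left( \int_{g_1(x)}^{g_2(x)} \frac{\partial b(x, y)}{\partial x}\, dy \right) dx - \iint_D \frac{\partial a}{\partial y}\, dA_{(x,y)} \\
&= \iint_D \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA_{(x,y)},
\end{align*}
where the equality following $(*)$ (on the second line above) is justified from the
fact that $B^y$ is a $C^2$ function and so we can interchange the order of the mixed
derivatives, that is:
\[ \frac{\partial^2 B^y(x, y)}{\partial y\, \partial x} = \frac{\partial^2 B^y(x, y)}{\partial x\, \partial y} = \frac{\partial}{\partial x} \frac{\partial B^y(x, y)}{\partial y} = \frac{\partial b}{\partial x}(x, y). \]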
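Suppose that $D = D_1 \cup \dots \cup D_k$ where each $D_j$ is a region of Type I or II. Let
$C_{i\ell}$ be the (nonempty) smooth boundary curve between $D_i$ and $D_\ell$. Then, $C_{i\ell}$ as a
portion of $\partial D_i$ has the opposite orientation to $C_{i\ell}$ as a portion of $\partial D_\ell$. Therefore,
\begin{align*}
\iint_{D_i \cup D_\ell} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA
&= \iint_{D_i} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA + \iint_{D_\ell} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA \\
&= \int_{\partial D_i} \omega + \int_{\partial D_\ell} \omega \\
&= \int_{\partial D_i \setminus C_{i\ell}} \omega + \left( \int_{C_{i\ell}} \omega + \int_{-C_{i\ell}} \omega \right) + \int_{\partial D_\ell \setminus C_{i\ell}} \omega \\
&= \int_{\partial D_i \setminus C_{i\ell}} \omega + \int_{\partial D_\ell \setminus C_{i\ell}} \omega \\
&= \int_{\partial (D_i \cup D_\ell)} \omega
\end{align*}
and we can continue this process by iterating over the rest of the components of $D$
to obtain the result. This completes the proof.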
7.3.1 Green's Theorem: examples and applications
We begin with the 1-form from Example 4.3.7 of Chapter 4. We know that this
1-form is not exact and compute its line integral around a circle of radius $a$ centered
at the origin. We generalize this result to any simple closed curve surrounding the
origin.
Example 7.3.5. Let
\[ \omega = \frac{-y\, dx + x\, dy}{x^2 + y^2} \]
and $C$ be any simple closed curve around the origin in $\mathbb{R}^2$ oriented in counterclockwise
direction. We now show that
\[ \int_C \omega = 2\pi. \]
Let $D$ be the region enclosed by $C$ and consider $C_a$ a circle of radius $a > 0$ small
enough such that $C_a \subset D$. We define $D'$ as the region between $C_a$ and $C$, see
Figure 7.18. Thus, $\partial D' = C \cup C_a$. $C$ has positive orientation going counterclockwise
and $C_a$ has positive orientation going clockwise. Let $r(t) = (a \cos t, -a \sin t)$ with
$t \in [0, 2\pi]$ be a clockwise parametrization of $C_a$. Then,
\[ \int_{C_a} \omega = -2\pi \]
and so
\[ \int_{\partial D'} \omega = \int_C \omega + \int_{C_a} \omega = \int_C \omega - 2\pi. \]
Fig. 7.18. Curves $C$ and $C_a$ forming the boundary of the domain $D'$.
Recall that
\[ \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} = 0 \]
and so by Green's theorem
\[ \int_{\partial D'} \omega = \iint_{D'} \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) dA = 0, \]
which means
\[ \int_C \omega = 2\pi. \]
Therefore, one sees that the value of the integral of the 1-form does not depend on
the curve $C$ itself, but on whether the curve $C$ winds around the origin, which is the only
point where $\omega$ is not defined.
We now explain this result from the viewpoint of polar coordinates. Indeed, in
polar coordinates, it is straightforward to compute that $\omega = d\theta$. Therefore, $\int_C \omega$
computes the angular displacement along the curve $C$. For simple closed curves
encircling the origin, we obtain $2\pi$.
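The independence of the integral from the particular curve can be observed numerically. The sketch below is illustrative and rests on assumptions not found in the text: the helper `integrate_omega` approximates the line integral of $\omega = (-y\, dx + x\, dy)/(x^2 + y^2)$ by a sum over short chords, and the two sample curves (an off-centre ellipse enclosing the origin and a circle that does not enclose it) are chosen here only as examples.

```python
import math

def integrate_omega(curve, n=20000):
    """Approximate the line integral of (-y dx + x dy) / (x^2 + y^2)
    along the closed curve t -> curve(t), t in [0, 2*pi], using n chords."""
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        x0, y0 = curve(k * dt)
        x1, y1 = curve((k + 1) * dt)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # chord midpoint
        total += (-ym * (x1 - x0) + xm * (y1 - y0)) / (xm * xm + ym * ym)
    return total

# Counterclockwise ellipse centred at (0.5, -0.3); the origin lies inside it.
around = lambda t: (0.5 + 3.0 * math.cos(t), -0.3 + math.sin(t))
# Counterclockwise unit circle centred at (3, 0); the origin lies outside it.
away = lambda t: (3.0 + math.cos(t), math.sin(t))

print(integrate_omega(around), 2.0 * math.pi)
print(integrate_omega(away))
```

The first value is close to $2\pi$ and the second is close to 0, consistent with the interpretation of $\omega$ as $d\theta$.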
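The following example shows the derivation of one of Green's formulas [2].
Example 7.3.6. Let $D \subset \mathbb{R}^2$ be an open subset with $\partial D = C$ given by a simple closed
curve. Let $u = u(x, y)$ and $v = v(x, y)$. Recall that
\[ \Delta v(x, y) = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}. \]
Consider the 1-form
\[ \omega = -u \frac{\partial v}{\partial y}\, dx + u \frac{\partial v}{\partial x}\, dy. \]
Then, a straightforward computation shows that
\[ d\omega = (u \Delta v + \nabla u \cdot \nabla v)\, dx \wedge dy. \tag{7.8} \]
Thus, taking the double integral over $D$ and applying Green's theorem, we obtain
\[ \int_C \omega = \iint_D d\omega = \iint_D (u \Delta v + \nabla u \cdot \nabla v)\, dA. \]
Writing the line integral explicitly and rearranging the terms we have
\[ \iint_D \nabla u \cdot \nabla v\, dA = -\iint_D u \Delta v\, dA + \int_C \left( -u \frac{\partial v}{\partial y}\, dx + u \frac{\partial v}{\partial x}\, dy \right). \tag{7.9} \]
We simplify this last integral further as follows. We know that if $C$ has a parametrization
$r(t) = (x(t), y(t))$, then $r'(t)$ is tangent to $C$. The vector $N(t) = (y'(t), -x'(t))$
is perpendicular to $r'(t)$ and therefore, perpendicular to the boundary curve $C$. Therefore,
the pullback of $\omega$ by $r(t)$ is
\[ u(r(t)) \left( -\frac{\partial v}{\partial y}(r(t))\, x'(t) + \frac{\partial v}{\partial x}(r(t))\, y'(t) \right) dt = u(r(t)) \big( \nabla v(r(t)) \cdot N(t) \big)\, dt. \]
Thus, we can rewrite (7.9) as
\[ \iint_D \nabla u \cdot \nabla v\, dA = -\iint_D u \Delta v\, dA + \int_C u (\nabla v \cdot N)\, dt. \tag{7.10} \]
This last equation is the first Green's formula and is a useful tool in the context of
partial differential equations.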
The following example is called the Bendixson-Dulac theorem and is useful in the
geometric theory of differential equations, see for instance [7].
Example 7.3.7. Let $F(x, y) = (f(x, y), g(x, y))$ be a vector field in $\mathbb{R}^2$. If there exists
a simple closed curve $C$ with parametrization, say $r(t) = (x(t), y(t))$, such that $r'(t) =
F(r(t))$, then $C$ is called a periodic solution of the vector field $F$. Periodic solutions
are a fundamental concept in the theory of differential equations.
Suppose the vector field $F$ is such that there exists a function $\rho : \mathbb{R}^2 \to \mathbb{R}$ so
that
\[ \frac{\partial (\rho f)}{\partial x} + \frac{\partial (\rho g)}{\partial y} > 0 \]
in $\mathbb{R}^2$. We now show that $F$ does not have any periodic solution. We do this by
contradiction. We suppose there does exist a simple closed curve $C$ with parametrization
$r(t)$ ($t \in [0, 2\pi]$) such that $r'(t) = F(r(t))$ and show that this leads to an impossible
situation.
Let $D \subset \mathbb{R}^2$ be the region with boundary $\partial D = C$, a simple closed curve. We
consider the 1-form $\omega = -\rho(x, y) g(x, y)\, dx + \rho(x, y) f(x, y)\, dy$. Then, by Green's
theorem
\[ \int_C \omega = \iint_D d\omega = \iint_D \left( \frac{\partial (\rho f)}{\partial x} + \frac{\partial (\rho g)}{\partial y} \right) dA_{(x,y)} > 0 \]
because the integrand is strictly positive. Note that on $C$ we have $dx = x'(t)\, dt =
f(x(t), y(t))\, dt$ and $dy = y'(t)\, dt = g(x(t), y(t))\, dt$ because $r(t) = (x(t), y(t))$ is a
solution of the differential equation. So
\[ \int_C \omega = \int_0^{2\pi} \big( -\rho(x(t), y(t))\, g(x(t), y(t))\, f(x(t), y(t)) + \rho(x(t), y(t))\, f(x(t), y(t))\, g(x(t), y(t)) \big)\, dt = 0. \]
Therefore, we have a contradiction since the line integral of $\omega$ along $C$ cannot
be simultaneously positive and zero. This means the vector field $F$ does not allow
periodic solutions.
Exercises
(1) Let $D$ be the region bounded by the lines $y = 0$, $y = x - 2$, $x = 0$. Parametrize
the curves so that $\partial D$ has positive orientation.
(2) Consider the annular region $D$ bounded by the circles of radius 1 and of radius 4
centered at the origin. Parametrize the curves so that $\partial D$ has positive
orientation.
(3) Let $D$ be the region bounded by the ellipse $x^2 + y^2/4 = 1$ and the ellipse
$2x^2 + 4y^2 = 1$. Parametrize the curves so that $\partial D$ has negative orientation.
(4) Let $C$ be the curve given by the line from $(0, 0)$ to $(1, 1)$ and the arc of circle
from $(1, 1)$ to $(\sqrt{2}, 0)$, and finally the line from $(\sqrt{2}, 0)$ to $(0, 0)$. Let
\[ \omega = (x - x^2 y)\, dx + (x y^2 - y^3)\, dy \]
and evaluate the integral
\[ \int_C \omega. \]

2
(5) Compute where = (xex + 3y 2 ) dx + (6x tanh y 2 ) dy where C is the
C
positively oriented curve which forms the boundary of the half-disk of radius a
where y x.
(6) Consider the vector field $F(x, y) = (a(x, y), b(x, y))$. Apply Green's theorem to
the 1-form $\omega' = -b(x, y)\, dx + a(x, y)\, dy$ and show that for any curve $C$ bounding
a domain $D \subset \mathbb{R}^2$
\[ \int_C F \cdot n\, ds = \iint_D \operatorname{div} F\, dA \]
where $n$ is the unit vector perpendicular to $T_p C$ at all $p \in C$. This is often
referred to as the Divergence theorem in the plane. Chapter 10 discusses the
general Divergence theorem.
(7) For each of the following problems, use the Divergence theorem in the plane of
Exercise 6.
(a) Let F (x, y) = (3x + 4 cos(y 2 ), 4yx) and C be the triangle defined by the
vertices (0, 0), (2, 0) and (1, 1).
(b) Let F (x, y) = (xy 2 , yx2 ) and C be given by the segment joining (0, 0) to
(a, 0), the portion of the circle of radius a in the quadrants 1, 2 and 3, and
finally the segment joining (a, 0) to (0, 0).
7.4 Three-dimensional domains
We now extend integration to functions of three variables. We begin with cubic domains,
which are the simplest type of domains for which Fubini's theorem applies. As
in the previous section, we define three types of domains for which it is straightfor-
ward to decompose the triple integral into three simple integrals. Several examples
are presented.
7.4.1 Triple integral over cubes
We now present the construction of integrals of functions of three variables over a
box domain $B$ defined by $[a_0, a_1] \times [b_0, b_1] \times [c_0, c_1]$. In this case, we can also define
an integral over $B$ as before. Let $f : \mathbb{R}^3 \to \mathbb{R}$ and consider partitions of each interval:
\begin{align*}
a_0 &= x_0 < x_1 < \dots < x_{n-1} < x_n = a_1 \\
b_0 &= y_0 < y_1 < \dots < y_{m-1} < y_m = b_1 \\
c_0 &= z_0 < z_1 < \dots < z_{\ell-1} < z_\ell = c_1
\end{align*}
from which we obtain a decomposition into subboxes $B_{ijk}$ for $i = 1, \dots, n$,
$j = 1, \dots, m$ and $k = 1, \dots, \ell$. A subbox $B_{ijk}$ is illustrated in Figure 7.19.
Fig. 7.19. A subbox element $B_{ijk}$ decomposing the box domain $B$.
We define a Riemann sum as in the double integral
\[ RS = \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{\ell} f(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*)\, dV(B_{ijk}) \tag{7.11} \]
and obtain a similar definition as above.
Definition 7.4.1. Let $(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*) \in B_{ijk}$ for all $i = 1, \dots, n$, $j = 1, \dots, m$ and
$k = 1, \dots, \ell$ and $dV(B_{ijk}) \to 0$ as $n, m, \ell \to \infty$. If the limit of the Riemann sum
exists then $f$ is said to be integrable on $B$ and
\[ \iiint_B f(x, y, z)\, dV = \lim_{n,m,\ell \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{\ell} f(x_{ijk}^*, y_{ijk}^*, z_{ijk}^*)\, dV(B_{ijk}). \]
Continuity of $f$ on $B$ is again a sufficient condition for integrability, and piecewise
continuous functions are also integrable. This is described precisely in the following
result.
Theorem 7.4.2. Let $f : B \subset \mathbb{R}^3 \to \mathbb{R}$. Then,
(1) If $f$ is continuous on $B$, then it is integrable on $B$.
(2) If $f$ is bounded and has only finite size jump discontinuities over a finite set of
smooth surfaces, then $f$ is integrable on $B$.
As in the case of double integrals, Fubini's theorem is applicable and we can compute
the triple integral as an iterated integral.
Theorem 7.4.3. If $f$ is integrable on the box $B = [a_0, a_1] \times [b_0, b_1] \times [c_0, c_1]$, then
\[ \iiint_B f(x, y, z)\, dV = \int_{a_0}^{a_1} \int_{b_0}^{b_1} \int_{c_0}^{c_1} f(x, y, z)\, dz\, dy\, dx \]
and the iterated integrals taken in all other orders are equal to it.
Again, just as with double integrals, we can compute interesting physical quantities
such as total mass, moments, centre of mass, total charge, etc. in three-dimensional
regions using the triple integral.
Example 7.4.4. Consider an electrically charged region $B$ in the shape of a box.
The charge density in $B$ (given in coulombs per cubic metre, C/m$^3$) is given by a
continuous function $\rho(x, y, z)$. The Riemann sum over the region $B$ with $\rho(x, y, z)$
gives an approximation to the total charge and we define the total charge as the limit
of the Riemann sum. Thus, the total charge is defined by
\[ \iiint_B \rho(x, y, z)\, dV. \]
Suppose that $B = [0, 1] \times [0, 1] \times [0, 1]$ and $\rho(x, y, z) = xyz$. The total charge in $B$ is
\[ \iiint_B \rho(x, y, z)\, dV = \int_0^1 x\, dx \int_0^1 y\, dy \int_0^1 z\, dz = \frac{1}{8}. \]
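The three-dimensional Riemann sum (7.11) can be evaluated in the same way as in the plane. The following sketch is illustrative only; `riemann_sum_3d` is a hypothetical helper that uses the midpoints of the subboxes as sample points, applied to the charge density $xyz$ on the unit cube.

```python
def riemann_sum_3d(f, box, n, m, l):
    """Midpoint Riemann sum of f over box = ((a0, a1), (b0, b1), (c0, c1))
    using n * m * l subboxes."""
    (a0, a1), (b0, b1), (c0, c1) = box
    dx, dy, dz = (a1 - a0) / n, (b1 - b0) / m, (c1 - c0) / l
    total = 0.0
    for i in range(n):
        x = a0 + (i + 0.5) * dx
        for j in range(m):
            y = b0 + (j + 0.5) * dy
            for k in range(l):
                z = c0 + (k + 0.5) * dz
                total += f(x, y, z)
    return total * dx * dy * dz     # every subbox has volume dx * dy * dz

unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))
charge = riemann_sum_3d(lambda x, y, z: x * y * z, unit_cube, 20, 20, 20)
print(charge)
```

For this particular integrand the midpoint choice reproduces the value $1/8$ up to rounding, because the density is linear in each variable separately.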
7.4.2 Triple integrals over more general regions
We can define the triple integral over a general region $E$ as in the case of double
integrals. Suppose that the boundary of $E$ is made up of a finite number of smooth
surfaces and let $E \subset B$ where $B$ is a box in $\mathbb{R}^3$. Assume $f(x, y, z)$ is continuous on $E$.
We define
\[ F(x, y, z) = \begin{cases} f(x, y, z) & (x, y, z) \in E \\ 0 & (x, y, z) \in B \setminus E. \end{cases} \]
Then, $F$ is bounded on $B$ and continuous except for finite jump discontinuities
over a finite number of smooth surfaces. By Theorem 7.4.2, $F$ is integrable and we define
\[ \iiint_E f(x, y, z)\, dV := \iiint_B F(x, y, z)\, dV. \]
We can proceed to a similar classification of domains in R3 into three types.
Type I:
A region $E$ is of type I if it lies between the graphs of two continuous functions
$u_1(x, y)$ and $u_2(x, y)$ with domain $D$.
\[ E = \{(x, y, z) \mid (x, y) \in D,\ u_1(x, y) \le z \le u_2(x, y)\}. \]
The general formula is of the form
\[ \iiint_E f(x, y, z)\, dV = \iint_D \left( \int_{u_1(x, y)}^{u_2(x, y)} f(x, y, z)\, dz \right) dA. \]
Example 7.4.5. We set up the triple integral of $f(x, y, z) = (x + y) z$ on the region
$E$ enclosed by the ellipsoid
\[ x^2 + \frac{y^2}{4} + z^2 = 1 \]
and the cylinder
\[ \left( x - \frac{1}{2} \right)^2 + y^2 = \frac{1}{4}. \]
The region $E$ can be obtained as the region inside the cylinder, bounded below and
above by the portion of the ellipsoid inside the cylinder for $z < 0$ and $z > 0$ respectively.
Figure 7.20 shows the portion for $z > 0$. We write
\[ D = \left\{ (x, y) \;\Big|\; \left( x - \frac{1}{2} \right)^2 + y^2 \le \frac{1}{4} \right\} \]
and
\[ E = \left\{ (x, y, z) \mid (x, y) \in D,\ -\sqrt{1 - x^2 - y^2/4} \le z \le \sqrt{1 - x^2 - y^2/4} \right\}. \]
Then,
\[ \iiint_E (x + y) z\, dV = \iint_D \left( \int_{-\sqrt{1 - x^2 - y^2/4}}^{\sqrt{1 - x^2 - y^2/4}} (x + y) z\, dz \right) dA. \]
Fig. 7.20. Region $E$ enclosed by the ellipsoid and the cylinder in Example 7.4.5.
Type II:
A region $E$ is of type II if it lies between the graphs of two continuous functions
$v_1(y, z)$ and $v_2(y, z)$ with domain $D$.
\[ E = \{(x, y, z) \mid (y, z) \in D,\ v_1(y, z) \le x \le v_2(y, z)\}. \]
\[ \iiint_E f(x, y, z)\, dV = \iint_D \left( \int_{v_1(y, z)}^{v_2(y, z)} f(x, y, z)\, dx \right) dA. \]
Example 7.4.6. We set up the triple integral of $f(x, y, z) = x^2$ on the region $E$
enclosed by the planes $x - y + z = 2$, $x = -1$, $y = 0$ and $z = 0$. The planes
delimiting the region $E$ are shown in Figure 7.21. We choose to let $x = 2 + y - z$ and
consider the region in the plane $x = -1$ bounded by the remaining planes to define
\[ D = \{(y, z) \mid 0 \le z \le 3,\ z - 3 \le y \le 0\}. \]
Then,
\[ E = \{(x, y, z) \mid (y, z) \in D,\ -1 \le x \le 2 + y - z\}. \]
Fig. 7.21. The various planes delimiting the region $E$ in Example 7.4.6.
The triple integral is
\[ \iiint_E x^2\, dV = \iint_D \left( \int_{-1}^{2 + y - z} x^2\, dx \right) dA. \]
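The setup of Example 7.4.6 can be carried through numerically. This is a sketch under assumptions: the base $D$ is taken to be the triangle $0 \le z \le 3$, $z - 3 \le y \le 0$ in the plane $x = -1$ (the set where $-1 \le 2 + y - z$ and $y \le 0$), so that $E$ is a tetrahedron with vertices $(-1, 0, 0)$, $(-1, -3, 0)$, $(-1, 0, 3)$ and $(2, 0, 0)$. The helpers `midpoint`, `inner_x` and `over_D` are hypothetical names introduced for this illustration.

```python
def midpoint(h, lo, hi, n=200):
    """One-dimensional midpoint rule for the integral of h over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(h(lo + (k + 0.5) * dx) for k in range(n)) * dx

def inner_x(y, z):
    """Closed form of the inner integral of x^2 for x from -1 to 2 + y - z."""
    upper = 2.0 + y - z
    return (upper ** 3 - (-1.0) ** 3) / 3.0

def over_D(g):
    """Integrate g(y, z) over D: 0 <= z <= 3, z - 3 <= y <= 0."""
    return midpoint(lambda z: midpoint(lambda y: g(y, z), z - 3.0, 0.0), 0.0, 3.0)

integral = over_D(inner_x)                                 # triple integral of x^2 over E
volume = over_D(lambda y, z: (2.0 + y - z) - (-1.0))       # triple integral of 1 over E
print(integral, volume)
```

The volume agrees with the elementary formula for a tetrahedron (base area $9/2$, height 3), which gives some confidence in the description of $D$ used here.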
Type III:
A region $E$ is of type III if it lies between the graphs of two continuous functions
$w_1(x, z)$ and $w_2(x, z)$ with domain $D$.
\[ E = \{(x, y, z) \mid (x, z) \in D,\ w_1(x, z) \le y \le w_2(x, z)\}. \]
\[ \iiint_E f(x, y, z)\, dV = \iint_D \left( \int_{w_1(x, z)}^{w_2(x, z)} f(x, y, z)\, dy \right) dA. \]
Fig. 7.22. Region $E$ bounded by the cone and the sphere in Example 7.4.7.
Example 7.4.7. We set up the triple integral of $f(x, y, z) = xyz$ on the region $E$
enclosed by the cone $y^2 = x^2 + z^2$ and the sphere $x^2 + y^2 + z^2 = 1$ and containing
the $y$-axis. See Figure 7.22.
The region $E$ projects to a disk in the $(x, z)$ plane with maximal radius $1/\sqrt{2}$.
This can be seen by obtaining the intersection curve of the cone and the sphere. We
substitute $x^2 + z^2$ by $y^2$ in the sphere equation to obtain $2y^2 = 1$ and so $y^2 = 1/2$.
The projection of the intersection curve to the $(x, z)$ plane is indeed $x^2 + z^2 = 1/2$.
We can now write
\[ D = \{(x, z) \mid x^2 + z^2 \le 1/2\} \]
and so
\[ E = \{(x, y, z) \mid (x, z) \in D,\ -\sqrt{1 - x^2 - z^2} \le y \le \sqrt{1 - x^2 - z^2}\}. \]
From this we can write the triple integral
\[ \iiint_E xyz\, dV = \iint_D \left( \int_{-\sqrt{1 - x^2 - z^2}}^{\sqrt{1 - x^2 - z^2}} xyz\, dy \right) dA. \]
The following is a straightforward result to prove using Riemann sums.
Proposition 7.4.8. Let $B \subset \mathbb{R}^3$ and $B = B_1 \cup B_2$ where the intersection of $B_1$ and
$B_2$ is only along smooth surfaces and let $g$ be a continuous function. Then
\[ \iiint_B g\, dV = \iiint_{B_1} g\, dV + \iiint_{B_2} g\, dV. \]
We illustrate the use of this result in the next example.
Example 7.4.9. Let $B$ be the region enclosed inside the cone $(z - 1)^2 = x^2 + y^2$,
bounded below by $z = 0$ and above by $z = 3$. We split $B = B_1 \cup B_2$ as follows. Let
$D_1$ be the disk of radius 1 centered at the origin and $D_2$ be the disk of radius 2
centered at the origin. Then,
\begin{align*}
B_1 &= \{(x, y, z) \mid (x, y) \in D_1,\ 0 \le z \le 1 - \sqrt{x^2 + y^2}\}, \\
B_2 &= \{(x, y, z) \mid (x, y) \in D_2,\ 1 + \sqrt{x^2 + y^2} \le z \le 3\}.
\end{align*}
See Figure 7.23. If $g : B \to \mathbb{R}$ is an integrable function, then
\begin{align*}
\iiint_B g\, dV &= \iiint_{B_1} g\, dV + \iiint_{B_2} g\, dV \\
&= \iint_{D_1} \left( \int_0^{1 - \sqrt{x^2 + y^2}} g\, dz \right) dA + \iint_{D_2} \left( \int_{1 + \sqrt{x^2 + y^2}}^{3} g\, dz \right) dA.
\end{align*}
Fig. 7.23. Region $B$ made up of two regions $B_1$ and $B_2$ bounded by cones.
Exercises
(1) Find the volume of the solid that lies between the planes $z = 3 + 2x - y$, $z = 2$
and which projects to the rectangle $R = [-1, 1] \times [1, 2]$.
(2) Compute the volume under the surface z = 4 x2 y over the domain D bounded
by the lines y = 1, x = 0, x = 2, y = x + 4, y = x + 2.
(3) Calculate the iterated integrals.

1 1 2
(a) z 2 xy dx dy dz
0 1 1
1 2
1
2 xyz
(b) xy e dz dx dy
01 0 2 0 2
(c) cos(x)yz dy dz dx
1 1 0
(4) Consider the domains below and determine if they are of type I, II, or III.
(a) R1 = {(x, y, z) | 1 y 1, 2 x 0, (x + y)2 z 4}.
(b) R2 is the tetrahedron determined by the vertices (0, 0, 0), (1, 0, 0), (0, 1, 0)
and (0, 0, 1).
(c) R3 is the region enclosed by the circular cylinder of radius 1 centered at the
origin in the xz-plane and the sphere of radius 2 centered at the origin, and
containing the y-axis.
(d) R4 = {(x, y, z) | 0 y 3, 0 z ln(1 + y), 3 x y 2 + z 2 }.
(5) Find the volume of the solid enclosed by the cylinder z = 1 y 2 , the z = 0 plane
and the planes x = 1 and x = 2 y.
(6) Find the volume of the solid $E$ enclosed by the elliptic paraboloid $z = x^2 + y^2$
and the plane $z = 4$. The answer is an integer multiple of $\pi$.
(7) Find the volume of the solid bounded below by the elliptic paraboloid x =
z 2 + y 2 1 and above by the plane x = 4 z + 2y.
8 Wedge Products and Exterior Derivatives
This chapter begins with an extension of the wedge product operation to arbitrary
forms. This enables us to obtain formulae relating area and volume forms under
changes of coordinates. These calculations are used at the beginning of Chapter 9
when discussing pullbacks. From the definition of wedge product, we can now extend
the concept from differentiable 1-forms to differentiable $k$-forms with an emphasis
on 2-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$ and 3-forms in $\mathbb{R}^3$. We then generalize the differential $d$
acting on functions by introducing the exterior derivative, which can be applied to
any $k$-form. We do not prove all the properties of the exterior derivative and only
show its use on 1-forms and 2-forms. In particular, we redefine the concepts of closed and
exact forms using the exterior derivative.
8.1 More on Wedge Products
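We generalize the wedge product defined for differentials $dx_i$ to 1-forms. This is done
by a formal extension of the definition.
Definition 8.1.1 (Wedge product for 1-forms). Let $\omega_1, \omega_2$ be 1-forms in $\mathbb{R}^n$ and
$u, v \in \mathbb{R}^n$. Then,
\[ (\omega_1 \wedge \omega_2)(u, v) := \begin{vmatrix} \omega_1(u) & \omega_2(u) \\ \omega_1(v) & \omega_2(v) \end{vmatrix}. \]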
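The wedge product has the following properties that are easily verifiable using the
definition.
Proposition 8.1.2. Let $\omega_1, \omega_2, \omega_3$ be 1-forms on $\mathbb{R}^n$ and $a, b \in \mathbb{R}$.
(a) $\omega_1 \wedge \omega_1 \equiv 0$.
(b) $(a\omega_1 + b\omega_2) \wedge \omega_3 = a(\omega_1 \wedge \omega_3) + b(\omega_2 \wedge \omega_3)$.
(c) $\omega_1 \wedge \omega_2 = -(\omega_2 \wedge \omega_1)$.
(d) $(\omega_1 \wedge \omega_2)(u, u) = 0$.
(e) If $u_1, v_1, u_2, v_2 \in \mathbb{R}^n$ then
\[ (\omega_1 \wedge \omega_2)(au_1 + bv_1, u_2) = a(\omega_1 \wedge \omega_2)(u_1, u_2) + b(\omega_1 \wedge \omega_2)(v_1, u_2) \]
and
\[ (\omega_1 \wedge \omega_2)(u_1, au_2 + bv_2) = a(\omega_1 \wedge \omega_2)(u_1, u_2) + b(\omega_1 \wedge \omega_2)(u_1, v_2). \]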
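Proof. (a) With $\omega_1 = \omega_2$ in the wedge product, the determinant has identical
columns and so the determinant is identically zero. (b) This property is obtained
directly by writing
\begin{align*}
[(a\omega_1 + b\omega_2) \wedge \omega_3](u, v)
&= \begin{vmatrix} (a\omega_1 + b\omega_2)(u) & \omega_3(u) \\ (a\omega_1 + b\omega_2)(v) & \omega_3(v) \end{vmatrix} \\
&= (a\omega_1 + b\omega_2)(u)\,\omega_3(v) - \omega_3(u)\,(a\omega_1 + b\omega_2)(v) \\
&= a\omega_1(u)\omega_3(v) + b\omega_2(u)\omega_3(v) - a\omega_3(u)\omega_1(v) - b\omega_3(u)\omega_2(v) \\
&= a(\omega_1(u)\omega_3(v) - \omega_3(u)\omega_1(v)) + b(\omega_2(u)\omega_3(v) - \omega_3(u)\omega_2(v)) \\
&= a(\omega_1 \wedge \omega_3)(u, v) + b(\omega_2 \wedge \omega_3)(u, v).
\end{align*}
(c) The columns of $\omega_2 \wedge \omega_1$ are interchanged from $\omega_1 \wedge \omega_2$, therefore one is minus
the other. (d) is straightforward from the definition because, with both arguments equal
to $u$, the two rows of the determinant are equal. (e) is done by writing explicitly the
left-hand side of the formula and computing explicitly the determinant.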
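The properties of Proposition 8.1.2 can also be checked numerically. The following is a minimal sketch, assuming 1-forms with constant coefficients stored as plain coefficient lists; the helper names `apply_form` and `wedge` and all sample values are hypothetical and chosen only for illustration.

```python
# A 1-form with constant coefficients on R^n is stored as its coefficient list:
# omega = c[0] dx_1 + ... + c[n-1] dx_n, so omega(u) = sum of c[i] * u[i].

def apply_form(coeffs, u):
    """Evaluate the 1-form with the given coefficients on the vector u."""
    return sum(c * ui for c, ui in zip(coeffs, u))

def wedge(omega1, omega2):
    """Return the map (u, v) -> (omega1 ^ omega2)(u, v) of Definition 8.1.1."""
    def two_form(u, v):
        # 2x2 determinant with rows (omega1(u), omega2(u)) and (omega1(v), omega2(v))
        return (apply_form(omega1, u) * apply_form(omega2, v)
                - apply_form(omega2, u) * apply_form(omega1, v))
    return two_form

# Hypothetical sample 1-forms on R^3, vectors and scalars
w1, w2, w3 = [1, 2, 0], [0, 1, -3], [4, 0, 1]
u, v = [1, 0, 2], [3, -1, 1]
a, b = 2, 5

combo = [a * p + b * q for p, q in zip(w1, w2)]  # coefficients of a*w1 + b*w2

prop_a = wedge(w1, w1)(u, v)                      # property (a): should vanish
prop_b = (wedge(combo, w3)(u, v)
          == a * wedge(w1, w3)(u, v) + b * wedge(w2, w3)(u, v))
prop_c = wedge(w1, w2)(u, v) == -wedge(w2, w1)(u, v)
prop_d = wedge(w1, w2)(u, u)                      # property (d): should vanish
print(prop_a, prop_b, prop_c, prop_d)
```

Integer coefficients are used so that every comparison is exact; a check at one pair of vectors is of course only an illustration, not a proof.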
We look at some examples of computations using the properties (a), (b), (c) above.
Example 8.1.3. Let
\[ \omega_1 = a_1(x, y)\,dx + a_2(x, y)\,dy \quad\text{and}\quad \omega_2 = b_1(x, y)\,dx + b_2(x, y)\,dy. \]
Writing down the formula and using property (b) at first we obtain
\begin{align*}
\omega_1 \wedge \omega_2 &= (a_1(x, y)\,dx + a_2(x, y)\,dy) \wedge (b_1(x, y)\,dx + b_2(x, y)\,dy) \\
&= a_1(x, y)b_1(x, y)\,dx \wedge dx + a_1(x, y)b_2(x, y)\,dx \wedge dy \\
&\quad + a_2(x, y)b_1(x, y)\,dy \wedge dx + a_2(x, y)b_2(x, y)\,dy \wedge dy \\
&= a_1(x, y)b_2(x, y)\,dx \wedge dy + a_2(x, y)b_1(x, y)\,dy \wedge dx \quad\text{using (a)} \\
&= (a_1(x, y)b_2(x, y) - a_2(x, y)b_1(x, y))\,dx \wedge dy \quad\text{using (c)}.
\end{align*}
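We now show that general wedge products are useful in dealing with changes of
coordinates. We begin with a general calculation.
Example 8.1.4. Let $x = x(u, v)$, $y = y(u, v)$ and $z = z(u, v)$ and we compute their
differentials:
\[ dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv, \quad dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv, \quad dz = \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv. \]
According to the calculation in Example 8.1.3 we set
\[ a_1 = \frac{\partial x}{\partial u}, \quad a_2 = \frac{\partial x}{\partial v}, \quad b_1 = \frac{\partial y}{\partial u}, \quad b_2 = \frac{\partial y}{\partial v} \]
and so
\[ dx \wedge dy = \left( \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} \right) du \wedge dv. \]
This can be rewritten
\[ dx \wedge dy = \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} du \wedge dv = \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) du \wedge dv. \]
Similar formulae with $dy \wedge dz$ and $dz \wedge dx$ yield
\[ dy \wedge dz = \det\left( \frac{\partial(y, z)}{\partial(u, v)} \right) du \wedge dv, \quad dz \wedge dx = \det\left( \frac{\partial(z, x)}{\partial(u, v)} \right) du \wedge dv. \]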
The formulae of this example can be summarized in the following result for $\mathbb{R}^n$.
Proposition 8.1.5. Consider $(x_1, \ldots, x_n) \in \mathbb{R}^n$ and suppose that $x_i = x_i(u, v)$ for
$i = 1, \ldots, n$ where the dependence is differentiable. Then, for $i \neq j$
\[ dx_i \wedge dx_j = \det\left( \frac{\partial(x_i, x_j)}{\partial(u, v)} \right) du \wedge dv. \]
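We now consider specific changes of coordinates.
Example 8.1.6. (1) For the transformation to polar coordinates, we set $x = r\cos\theta$,
$y = r\sin\theta$, then
\[ dx \wedge dy = \det \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} dr \wedge d\theta = r\,dr \wedge d\theta. \]
(2) Consider the linear transformation $x = u + v$, $y = u - v$, then
\[ dx \wedge dy = \det \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} du \wedge dv = -2\,du \wedge dv. \]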
We translate those results directly to area forms $dA$. In coordinates $x, y$, we write
$dA_{(x,y)}$ and recall that $dA_{(x,y)} = |dx \wedge dy|$. We have for $x = x(u, v)$ and $y = y(u, v)$,
\[ dA_{(x,y)} = \left| \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) \right| dA_{(u,v)}. \]
(1) For polar coordinates, $dA_{(x,y)} = r\,dA_{(r,\theta)}$.
(2) For $x = u + v$ and $y = u - v$, $dA_{(x,y)} = 2\,dA_{(u,v)}$.
(3) More generally, suppose that
\[ (x, y)^T = B\,(u, v)^T \]
where $B$ is a $2 \times 2$ matrix of real numbers with $\det(B) \neq 0$. Then
\[ dA_{(x,y)} = |\det(B)|\,dA_{(u,v)}. \]
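The Jacobian determinants behind these area factors can be approximated numerically. The sketch below is an illustration only: it assumes a central-difference approximation of the partial derivatives, and the function names (`jacobian_det_2d`, `polar`, `linear`) and sample points are hypothetical.

```python
import math

def jacobian_det_2d(phi, u, v, h=1e-6):
    """Approximate det d(x, y)/d(u, v) of phi at (u, v) by central differences."""
    xu = (phi(u + h, v)[0] - phi(u - h, v)[0]) / (2 * h)
    xv = (phi(u, v + h)[0] - phi(u, v - h)[0]) / (2 * h)
    yu = (phi(u + h, v)[1] - phi(u - h, v)[1]) / (2 * h)
    yv = (phi(u, v + h)[1] - phi(u, v - h)[1]) / (2 * h)
    return xu * yv - xv * yu

def polar(r, theta):
    """Polar coordinate map (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def linear(u, v):
    """Linear map x = u + v, y = u - v."""
    return (u + v, u - v)

det_polar = jacobian_det_2d(polar, 2.0, 0.7)     # should be close to r = 2
det_linear = jacobian_det_2d(linear, 0.3, 1.1)   # should be close to -2
# The area factor is the absolute value of the determinant.
print(det_polar, abs(det_linear))
```

The signed determinant enters the wedge product, while only its absolute value enters the area form.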
The formulae above are obtained by computing wedge products and from algebraic
manipulations. We still need to determine which vectors are arguments of $dx \wedge dy$
and $du \wedge dv$ and look at the geometric meaning of the relationship between the area
forms.
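Consider the change of variables given by the mapping $(x, y)^T = \Phi(u, v)$ and
the point $(u_0, v_0) \in \mathbb{R}^2$. Consider now the vectors $\vec{\alpha} = (\alpha_1, \alpha_2)$ and $\vec{\beta} = (\beta_1, \beta_2)$ in
$T_{(u_0,v_0)}\mathbb{R}^2$. Let $(x_0, y_0) = \Phi(u_0, v_0)$, then the vectors $D\Phi(\vec{\alpha})$ and $D\Phi(\vec{\beta})$
are in $T_{(x_0,y_0)}\mathbb{R}^2$, and generate a parallelogram $P(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}))$ based at $(x_0, y_0)$.
Let $e_1 = (1, 0)^T$ and $e_2 = (0, 1)^T$ be the canonical basis vectors for $T_{(u_0,v_0)}\mathbb{R}^2$.
Let $P(e_1, e_2)$ be the parallelogram generated by $e_1$ and $e_2$, then $(du \wedge dv)(e_1, e_2) = 1$.
Suppose that $\vec{\alpha} = e_1$ and $\vec{\beta} = e_2$. Then
\[ (D\Phi(e_1), D\Phi(e_2)) = \left( \begin{pmatrix} \dfrac{\partial x}{\partial u} \\[2mm] \dfrac{\partial y}{\partial u} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial v} \end{pmatrix} \right). \]
Therefore,
\begin{align*}
(dx \wedge dy)(D\Phi(e_1), D\Phi(e_2)) &= \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) \\
&= \det \begin{pmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{pmatrix} (du \wedge dv)(e_1, e_2)
\end{align*}
where the last equality holds because $(du \wedge dv)(e_1, e_2) = 1$.
If we let $\vec{\alpha}, \vec{\beta}$ be arbitrary vectors,
\[ (D\Phi(\vec{\alpha}), D\Phi(\vec{\beta})) = (\alpha_1 D\Phi(e_1) + \alpha_2 D\Phi(e_2),\ \beta_1 D\Phi(e_1) + \beta_2 D\Phi(e_2)) \]
because the derivative $D\Phi$ is a linear operator. We now compute the signed area of the
parallelogram $P(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}))$:
\begin{align*}
(dx \wedge dy)(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta})) &= (dx \wedge dy)(\alpha_1 D\Phi(e_1) + \alpha_2 D\Phi(e_2),\ \beta_1 D\Phi(e_1) + \beta_2 D\Phi(e_2)) \\
&= \alpha_1\beta_2\,(dx \wedge dy)(D\Phi(e_1), D\Phi(e_2)) + \alpha_2\beta_1\,(dx \wedge dy)(D\Phi(e_2), D\Phi(e_1)) \\
&= (\alpha_1\beta_2 - \alpha_2\beta_1) \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) \\
&= \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) (du \wedge dv)(\vec{\alpha}, \vec{\beta})
\end{align*}
with the last line obtained because
\[ (du \wedge dv)(\vec{\alpha}, \vec{\beta}) = \begin{vmatrix} du(\vec{\alpha}) & dv(\vec{\alpha}) \\ du(\vec{\beta}) & dv(\vec{\beta}) \end{vmatrix} = \alpha_1\beta_2 - \alpha_2\beta_1. \]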
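We summarize these computations in the following result.
Theorem 8.1.7. Let $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ be a change of coordinates (i.e. differentiable and
invertible) such that $\Phi(u_0, v_0) = (x_0, y_0)$. Let $\vec{\alpha}$ and $\vec{\beta}$ be tangent vectors at $(u_0, v_0)$.
Then, $D\Phi(\vec{\alpha})$ and $D\Phi(\vec{\beta})$ are tangent vectors at $(x_0, y_0)$ and the signed areas of the
parallelograms $P(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}))$ and $P(\vec{\alpha}, \vec{\beta})$ are related by the formula
\[ (dx \wedge dy)(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta})) = \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) (du \wedge dv)(\vec{\alpha}, \vec{\beta}). \]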
Fig. 8.1. The rectangle $R$ is mapped by the polar coordinates mapping $\Phi$ to a portion of annulus
$\Phi(R)$. The vectors $\vec{\alpha}$ and $\vec{\beta}$ are mapped to $D\Phi(\vec{\alpha})$ and $D\Phi(\vec{\beta})$.
We look at the important case of polar coordinate transformations.
Example 8.1.8. Consider the case of polar coordinates with a rectangle $R$ in $(r, \theta)$
space generated by $\vec{\alpha}$ and $\vec{\beta}$, where we assume that $\vec{\alpha}$ and $\vec{\beta}$ are relatively small. Then,
$\Phi(R)$ is a portion of annulus in $(x, y)$-space as in Figure 8.1. The parallelogram
$P(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}))$
provides an approximation to the area of the portion of annulus, with the approximation
improving as $\vec{\alpha}$ and $\vec{\beta}$ approach zero.
We continue with an application to volume forms. Recall that
\[ dV = |dx \wedge dy \wedge dz|. \]
We now explore the effect of changes of coordinates in $\mathbb{R}^3$ on the triple wedge product,
and therefore the volume form.
Theorem 8.1.9. Let $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ be a differentiable
change of coordinates. Then
\[ dx \wedge dy \wedge dz = \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) du \wedge dv \wedge dw. \]
In particular,
\[ dV_{(x,y,z)} = \left| \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) \right| \, |du \wedge dv \wedge dw|. \]
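Proof. We write the wedge product explicitly in terms of the differentials
\begin{align*}
dx &= \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \frac{\partial x}{\partial w}\,dw, \\
dy &= \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \frac{\partial y}{\partial w}\,dw, \\
dz &= \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw.
\end{align*}
We obtain
\begin{align*}
dx \wedge dy \wedge dz &= (dx \wedge dy) \wedge dz \\
&= \left[ \left( \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv + \frac{\partial x}{\partial w}\,dw \right) \wedge \left( \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv + \frac{\partial y}{\partial w}\,dw \right) \right] \\
&\quad \wedge \left( \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw \right) \\
&= \left[ \left( \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} \right) du \wedge dv + \left( \frac{\partial x}{\partial u}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial u} \right) du \wedge dw \right. \\
&\quad \left. + \left( \frac{\partial x}{\partial v}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial v} \right) dv \wedge dw \right] \wedge \left( \frac{\partial z}{\partial u}\,du + \frac{\partial z}{\partial v}\,dv + \frac{\partial z}{\partial w}\,dw \right) \\
&= \left[ \left( \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u} \right) \frac{\partial z}{\partial w} - \left( \frac{\partial x}{\partial u}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial u} \right) \frac{\partial z}{\partial v} \right. \\
&\quad \left. + \left( \frac{\partial x}{\partial v}\frac{\partial y}{\partial w} - \frac{\partial x}{\partial w}\frac{\partial y}{\partial v} \right) \frac{\partial z}{\partial u} \right] du \wedge dv \wedge dw \\
&= \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) du \wedge dv \wedge dw.
\end{align*}
The result for volume forms is straightforward.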
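We apply this result to the main three-dimensional changes of coordinates.
Example 8.1.10 (Cylindrical coordinates). It is a direct extension of the polar case
and the details are left to the reader. We obtain
\[ dx \wedge dy \wedge dz = r\,dr \wedge d\theta \wedge dz. \]
Example 8.1.11 (Spherical coordinates). For spherical coordinates, we use
\[ x = x(\rho, \phi, \theta), \quad y = y(\rho, \phi, \theta), \quad z = z(\rho, \phi, \theta) \]
and so the Jacobian matrix is
\[ \begin{pmatrix} \cos\theta\sin\phi & \rho\cos\theta\cos\phi & -\rho\sin\theta\sin\phi \\ \sin\theta\sin\phi & \rho\sin\theta\cos\phi & \rho\cos\theta\sin\phi \\ \cos\phi & -\rho\sin\phi & 0 \end{pmatrix} \]
which has determinant $\rho^2\sin\phi$. Therefore,
\[ dx \wedge dy \wedge dz = \rho^2\sin\phi\,d\rho \wedge d\phi \wedge d\theta. \]
We denote the volume forms in Cartesian and spherical coordinates respectively by
$dV_{(x,y,z)}$ and $dV_{(\rho,\phi,\theta)}$, then
\[ dV_{(x,y,z)} = |dx \wedge dy \wedge dz| = \rho^2|\sin\phi|\,|d\rho \wedge d\phi \wedge d\theta| = \rho^2\sin\phi\,dV_{(\rho,\phi,\theta)}. \]
Note that $|\sin\phi| = \sin\phi$ since $\phi \in [0, \pi]$.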
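The spherical Jacobian determinant can be checked numerically. This is a minimal sketch under stated assumptions: the coordinates are ordered $(\rho, \phi, \theta)$ with $x = \rho\cos\theta\sin\phi$, $y = \rho\sin\theta\sin\phi$, $z = \rho\cos\phi$ (with the other order of the two angles the determinant changes sign), partial derivatives are approximated by central differences, and the helper names are hypothetical.

```python
import math

def spherical(rho, phi, theta):
    """Assumed spherical map (rho, phi, theta) -> (x, y, z)."""
    return (rho * math.cos(theta) * math.sin(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(phi))

def jacobian_3d(f, p, h=1e-6):
    """Central-difference Jacobian matrix of f: R^3 -> R^3 at the point p."""
    cols = []
    for k in range(3):
        plus = list(p)
        minus = list(p)
        plus[k] += h
        minus[k] -= h
        fp, fm = f(*plus), f(*minus)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # cols[k][i] = d f_i / d p_k; transpose so that rows are indexed by i
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

rho, phi, theta = 1.5, 0.8, 2.0          # hypothetical sample point
numeric = det3(jacobian_3d(spherical, (rho, phi, theta)))
exact = rho ** 2 * math.sin(phi)         # rho^2 sin(phi)
print(numeric, exact)
```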

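We now state a result similar to Theorem 8.1.7.
Theorem 8.1.12. Let $\Phi : \mathbb{R}^3 \to \mathbb{R}^3$ be a change of coordinates (i.e. differentiable and
invertible) such that $\Phi(u_0, v_0, w_0) = (x_0, y_0, z_0)$. Let $\vec{\alpha}$, $\vec{\beta}$ and $\vec{\gamma}$ be tangent vectors at
$(u_0, v_0, w_0)$. Then, $D\Phi(\vec{\alpha})$, $D\Phi(\vec{\beta})$ and $D\Phi(\vec{\gamma})$ are tangent vectors at $(x_0, y_0, z_0)$ and
the signed volumes of the parallelepipeds $P(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}), D\Phi(\vec{\gamma}))$ and $P(\vec{\alpha}, \vec{\beta}, \vec{\gamma})$
are related by the formula
\[ (dx \wedge dy \wedge dz)(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta}), D\Phi(\vec{\gamma})) = \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) (du \wedge dv \wedge dw)(\vec{\alpha}, \vec{\beta}, \vec{\gamma}). \]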
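The calculation proceeds as in the 2D case. Consider the box generated by the
canonical basis vectors $e_1$, $e_2$ and $e_3$ in the tangent space at $(u_0, v_0, w_0)$. Then,
\[ (D\Phi(e_1), D\Phi(e_2), D\Phi(e_3)) = \left( \begin{pmatrix} \dfrac{\partial x}{\partial u} \\[2mm] \dfrac{\partial y}{\partial u} \\[2mm] \dfrac{\partial z}{\partial u} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial v} \\[2mm] \dfrac{\partial z}{\partial v} \end{pmatrix}, \begin{pmatrix} \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial w} \\[2mm] \dfrac{\partial z}{\partial w} \end{pmatrix} \right) \]
and so
\[ (dx \wedge dy \wedge dz)(D\Phi(e_1), D\Phi(e_2), D\Phi(e_3)) = \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) (du \wedge dv \wedge dw)(e_1, e_2, e_3). \]
Let $\vec{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, $\vec{\beta} = (\beta_1, \beta_2, \beta_3)$ and $\vec{\gamma} = (\gamma_1, \gamma_2, \gamma_3)$ be three vectors at
$(u_0, v_0, w_0)$, then by writing
\begin{align*}
\vec{\alpha} &= \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3, \\
\vec{\beta} &= \beta_1 e_1 + \beta_2 e_2 + \beta_3 e_3, \\
\vec{\gamma} &= \gamma_1 e_1 + \gamma_2 e_2 + \gamma_3 e_3
\end{align*}
and using the linearity property of the wedge product, we obtain the result of
Theorem 8.1.12.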
Exercises
(1) Compute the wedge product of the 1-forms
\[ \omega_1 = 5x_1\,dx_1 + 3x_3x_5\,dx_2 - x_2^2\,dx_3 + x_3^2x_4\,dx_4 \]
and
\[ \omega_2 = 2x_5x_3\,dx_1 - x_4\,dx_2 - x_3^3\,dx_3 + x_1x_4x_5\,dx_4 + x_1x_5^2\,dx_5. \]
(2) Let $x = a\cosh(u)$ and $y = a\sinh(u)$. Compute $dA_{(x,y)}$ in terms of $dA_{(a,u)}$.
(3) Let $x = u + 2v - 2w$, $y = 2u - 3w$ and $z = 3v + w$. Compute $dV_{(x,y,z)}$ in terms
of $dV_{(u,v,w)}$.
(4) Compute $dV_{(x,y,z)}$ in terms of the volume form in cylindrical coordinates $(r, \theta, z)$.
(5) Let $\omega_1 = a_1(x, y, z)\,dx + a_2(x, y, z)\,dy + a_3(x, y, z)\,dz$,
$\omega_2 = b_1(x, y, z)\,dx + b_2(x, y, z)\,dy + b_3(x, y, z)\,dz$, and
$\omega_3 = c_1(x, y, z)\,dx + c_2(x, y, z)\,dy + c_3(x, y, z)\,dz$.
If $u, v, w$ are vectors in $\mathbb{R}^3$, verify that computing
$(\omega_1 \wedge \omega_2 \wedge \omega_3)(u, v, w)$
using the definition is equal to
\[ (\omega_1 \wedge \omega_2 \wedge \omega_3)(u, v, w) = \det \begin{pmatrix} \omega_1(u) & \omega_2(u) & \omega_3(u) \\ \omega_1(v) & \omega_2(v) & \omega_3(v) \\ \omega_1(w) & \omega_2(w) & \omega_3(w) \end{pmatrix}. \]
(6) The elliptic coordinate system is given by
\[ x = \cosh\mu\cos\nu, \quad y = \sinh\mu\sin\nu \]
where $\mu \ge 0$ and $\nu \in [0, 2\pi)$. Show that
\[ dx \wedge dy = (\sinh^2\mu + \sin^2\nu)\,d\mu \wedge d\nu. \]
Recall that $(\cosh x)' = \sinh x$, $(\sinh x)' = \cosh x$ and $\cosh^2 x - \sinh^2 x = 1$.
(7) The oblate spheroidal coordinate systems is given by the formula
x = cosh cos cos , y = cosh sin cos , z = sinh sin .
where 0, [, ] and [/2, /2]. (Recall that cosh0 (x) = sinh(x),

sinh0 (x) = cosh(x) and cosh2 (x) sinh2 (x) = 1.). Show that
dx dy dz = cosh cos (sin2 + sinh2 ) d d d.
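(8) The parabolic coordinate system is given by
\[ x = \sigma\tau \quad\text{and}\quad y = \frac{1}{2}(\tau^2 - \sigma^2) \]
where $\sigma, \tau \in \mathbb{R}$. Show that $dx \wedge dy = (\sigma^2 + \tau^2)\,d\sigma \wedge d\tau$.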
8.2 Differential Forms
Using the wedge product of differentials, we now define differential 2-forms and
3-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$. The subject of differential forms is quite extensive in both
mathematics and physics, including differential geometry, control theory, dynamical
systems, Hamiltonian mechanics and thermodynamics. We do not launch into the
general definition of $k$-forms, but rather build up the differential forms which are
needed in this book.
Definition 8.2.1. Differentiable 2-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$ are defined, respectively, as
\[ \omega = a(x, y)\,dx \wedge dy, \]
\[ \omega = a_1(x, y, z)\,dy \wedge dz + a_2(x, y, z)\,dz \wedge dx + a_3(x, y, z)\,dx \wedge dy, \]
where $a, a_1, a_2, a_3$ are $C^1$ functions. Differentiable 3-forms in $\mathbb{R}^3$ are defined as
\[ \omega = b(x, y, z)\,dx \wedge dy \wedge dz \]
where $b$ is a $C^1$ function.
In Chapter 4, we show that differentiable 1-forms can be defined in $\mathbb{R}^n$ for any $n \in \mathbb{N}$.
This is true also of 2-forms and 3-forms. A general study of $k$-forms is beyond the
scope of this text. In a nutshell, a $k$-form in $\mathbb{R}^n$ can be defined by taking linear
combinations of
\[ dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \]
by $C^1$ functions, where for vectors $u_1, \ldots, u_k$
\[ (dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k})(u_1, \ldots, u_k) := \begin{vmatrix} dx_{i_1}(u_1) & dx_{i_2}(u_1) & \cdots & dx_{i_k}(u_1) \\ dx_{i_1}(u_2) & dx_{i_2}(u_2) & \cdots & dx_{i_k}(u_2) \\ \vdots & \vdots & \ddots & \vdots \\ dx_{i_1}(u_k) & dx_{i_2}(u_k) & \cdots & dx_{i_k}(u_k) \end{vmatrix}. \]
By this definition of the multiple wedge product of differentials $dx_{i_1}, \ldots, dx_{i_k}$, we
see that if $k > n$ then automatically
\[ dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \equiv 0 \]
because two columns of the determinant must be equal. Therefore, $k$-forms in $\mathbb{R}^n$
are nontrivial only for $k \le n$. In particular, differentiable 0-forms are differentiable
functions $f : \mathbb{R}^n \to \mathbb{R}$.
Example 8.2.2. Consider the case $n = 4$ with $X = (x_1, x_2, x_3, x_4)$, then we can
define differentiable $k$-forms for $k = 0, 1, 2, 3, 4$. Those are
(1) 0-form: $a(X)$,
(2) 1-form: $\sum_{i=1}^{4} a_i(X)\,dx_i$,
(3) 2-form: $b_1(X)\,dx_1 \wedge dx_2 + b_2(X)\,dx_1 \wedge dx_3 + b_3(X)\,dx_1 \wedge dx_4 + b_4(X)\,dx_2 \wedge dx_3 +
b_5(X)\,dx_2 \wedge dx_4 + b_6(X)\,dx_3 \wedge dx_4$,
(4) 3-form: $c_1(X)\,dx_1 \wedge dx_2 \wedge dx_3 + c_2(X)\,dx_1 \wedge dx_2 \wedge dx_4 + c_3(X)\,dx_1 \wedge dx_3 \wedge dx_4 +
c_4(X)\,dx_2 \wedge dx_3 \wedge dx_4$,
(5) 4-form: $g(X)\,dx_1 \wedge dx_2 \wedge dx_3 \wedge dx_4$,
where $a : \mathbb{R}^4 \to \mathbb{R}$, $a_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, 2, 3, 4$, $b_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, \ldots, 6$,
$c_i : \mathbb{R}^4 \to \mathbb{R}$ for $i = 1, 2, 3, 4$ and $g : \mathbb{R}^4 \to \mathbb{R}$ are all differentiable functions.
We now define the wedge product of arbitrary forms as follows.
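Definition 8.2.3 (Wedge product). Let $X = (x_1, \ldots, x_n)$, $k, \ell \le n$ and
\[ \omega_1 = a(X)\,dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \quad\text{and}\quad \omega_2 = b(X)\,dx_{j_1} \wedge dx_{j_2} \wedge \cdots \wedge dx_{j_\ell}. \]
Then
\[ \omega_1 \wedge \omega_2 = a(X)b(X)\,dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k} \wedge dx_{j_1} \wedge dx_{j_2} \wedge \cdots \wedge dx_{j_\ell}. \]
In particular, $\omega_1 \wedge \omega_2 \equiv 0$ if $dx_{i_n} = dx_{j_m}$ for some $n$ and $m$.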
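Example 8.2.4. Let $X \in \mathbb{R}^3$, $\omega_1 = a(X)\,dx \wedge dy$, $\omega_2 = b(X)\,dz$ and $\omega_3 = c(X)\,dx$.
Then, $\omega_1 \wedge \omega_3 \equiv 0$,
\[ \omega_1 \wedge \omega_2 = a(X)b(X)\,dx \wedge dy \wedge dz \quad\text{and}\quad \omega_2 \wedge \omega_3 = b(X)c(X)\,dz \wedge dx. \]
For general $k$-forms and $\ell$-forms in $\mathbb{R}^n$, the wedge product is taken by linearity: If
$\omega_1 \in \Omega^k(\mathbb{R}^n)$, $\omega_2 \in \Omega^\ell(\mathbb{R}^n)$, $\omega_3 \in \Omega^m(\mathbb{R}^n)$ and $\alpha, \beta \in \mathbb{R}$ then
\[ (\alpha\omega_1 + \beta\omega_2) \wedge \omega_3 = \alpha\,\omega_1 \wedge \omega_3 + \beta\,\omega_2 \wedge \omega_3. \]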
Example 8.2.5. Let $\omega_1 = a_1(x, y, z)\,dy \wedge dz + a_2(x, y, z)\,dz \wedge dx + a_3(x, y, z)\,dx \wedge dy$
and $\omega_2 = b_1(x, y, z)\,dx + b_2(x, y, z)\,dy + b_3(x, y, z)\,dz$. Then,
\begin{align*}
\omega_1 \wedge \omega_2 &= (a_1(x, y, z)\,dy \wedge dz + a_2(x, y, z)\,dz \wedge dx + a_3(x, y, z)\,dx \wedge dy) \\
&\quad \wedge (b_1(x, y, z)\,dx + b_2(x, y, z)\,dy + b_3(x, y, z)\,dz) \\
&= a_1(x, y, z)b_1(x, y, z)\,dy \wedge dz \wedge dx + a_2(x, y, z)b_2(x, y, z)\,dz \wedge dx \wedge dy \\
&\quad + a_3(x, y, z)b_3(x, y, z)\,dx \wedge dy \wedge dz \\
&= (a_1b_1 + a_2b_2 + a_3b_3)\,dx \wedge dy \wedge dz,
\end{align*}
where the last equality follows by interchanging the differentials (twice) in the wedge
products of the first two terms.
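The bookkeeping of signs in such products can be mechanized. The following is a minimal sketch, not a general-purpose implementation: a form is assumed to be stored at a single point as a dictionary from index tuples to numerical coefficients, with $x, y, z$ numbered 1, 2, 3; the function names and the sample coefficient values are hypothetical.

```python
def sort_with_sign(indices):
    """Sort dx-indices; return (sign, sorted tuple), with sign 0 if an index repeats."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()          # a repeated differential gives the zero form
    sign = 1
    # Bubble sort: each adjacent interchange flips the sign of the wedge product.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(form1, form2):
    """Wedge two forms given as {index tuple: coefficient}, extended by linearity."""
    result = {}
    for I, a in form1.items():
        for J, b in form2.items():
            sign, K = sort_with_sign(I + J)
            if sign != 0:
                result[K] = result.get(K, 0) + sign * a * b
    return result

# Example 8.2.5 evaluated at one point, with hypothetical coefficient values
a1, a2, a3 = 2, -1, 4
b1, b2, b3 = 3, 5, 7
omega1 = {(2, 3): a1, (3, 1): a2, (1, 2): a3}   # a1 dy^dz + a2 dz^dx + a3 dx^dy
omega2 = {(1,): b1, (2,): b2, (3,): b3}          # b1 dx + b2 dy + b3 dz
product = wedge(omega1, omega2)
print(product)   # single key (1, 2, 3) with coefficient a1*b1 + a2*b2 + a3*b3
```

Storing only the sorted index tuple as the key is a design choice that makes terms such as $dy \wedge dz \wedge dx$ and $dx \wedge dy \wedge dz$ accumulate into the same entry.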
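We conclude this section by looking at the space of $k$-forms. We denote by $\Omega^k(\mathbb{R}^n)$
the space of differentiable $k$-forms. One can show that $\Omega^k(\mathbb{R}^n)$ is a vector space by
verifying the following property: if $\omega_1, \omega_2 \in \Omega^k(\mathbb{R}^n)$ and $a, b \in \mathbb{R}$ then
\[ a\omega_1 + b\omega_2 \in \Omega^k(\mathbb{R}^n). \]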
Exercises
(1) Compute $\omega_1 \wedge \omega_2$ where $\omega_1 = x_1\,dx_1 + x_2\,dx_2 + x_3\,dx_3$ and
$\omega_2 = x_1\,dx_1 \wedge dx_2 + dx_1 \wedge dx_3$.
(2) Let $\omega_1 = x_2x_4\,dx_1 \wedge dx_3$ and $\omega_2 = x_1x_3\,dx_2 \wedge dx_4$. Compute $\omega_1 \wedge \omega_2$.
(3) Denote by $\omega$ the 2-form in $\mathbb{R}^4$ given in Example 8.2.2. Show that $\omega \wedge \omega$ is
not necessarily zero.
(4) Show that $\Omega^k(\mathbb{R}^n)$ is a vector space.
(5) This is a generalization of Exercises 5 and 6 of Section 7.1. Consider the coordinates
of $\mathbb{R}^{2n}$ given by $(q_1, p_1, \ldots, q_n, p_n)$ and define the 2-form
\[ \omega = \sum_{i=1}^{n} dp_i \wedge dq_i. \]
(a) Let $u = (u_1, v_1, \ldots, u_n, v_n)$ and $v = (w_1, z_1, \ldots, w_n, z_n)$ be vectors in $\mathbb{R}^{2n}$.
Compute $\omega(u, v)$. Notice that it corresponds to the sum of the signed areas
of the parallelograms $P((u_i, v_i), (w_i, z_i))$ for $i = 1, \ldots, n$.
(b) Let
\[ J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} \]
where $I_n$ is the $n \times n$ identity matrix and $A$ be a $2n \times 2n$ matrix such that
$A^T J A = J$. Show that
\[ \omega(Au, Av) = \omega(u, v). \]
(Hint: represent $\omega$ using the $J$ matrix). This 2-form is a symplectic form
and is an important tool for the study of Hamiltonian systems.
8.3 Exterior Derivative
In the first subsection we define a differential operator called the exterior derivative
and denoted by $d$ (just as the differential) which we can use on 1-forms and 2-forms. The
formal definition extends directly from the one we give here, but we do not present
it. The main property of the exterior derivative is that the composition with itself is
equal to 0. We show this fact in a particular case and go on to discuss the concepts of
closed and exact $k$-forms from this point of view. The second subsection is optional
on a first reading and shows how to use the exactness conditions (6.9) for a 1-form $\omega$ to
formally construct the exterior derivative for 1-forms. In particular, this construction
exhibits directly that $d\omega$ is in fact a 2-form.
8.3.1 Definition and Properties of the Exterior Derivative
In general, the exterior derivative is an operator $d : \Omega^k(\mathbb{R}^n) \to \Omega^{k+1}(\mathbb{R}^n)$. In the
case of 1-forms in $\mathbb{R}^n$
\[ \omega(x_1, \ldots, x_n) = \sum_{i=1}^{n} a_i(x_1, \ldots, x_n)\,dx_i \]
the exterior derivative is defined by
\[ d\omega := \sum_{i=1}^{n} da_i(x_1, \ldots, x_n) \wedge dx_i. \tag{8.1} \]
For 2-forms in $\mathbb{R}^3$,
\[ \omega(x, y, z) = a_1(x, y, z)\,dy \wedge dz + a_2(x, y, z)\,dz \wedge dx + a_3(x, y, z)\,dx \wedge dy, \]
then
\[ d\omega(x, y, z) := da_1(x, y, z) \wedge (dy \wedge dz) + da_2(x, y, z) \wedge (dz \wedge dx) + da_3(x, y, z) \wedge (dx \wedge dy). \]
The principle of the exterior derivative for $k$-forms follows the recipe above: take
the differential of the coefficient functions and wedge it with the wedge product
corresponding to each coefficient. We compute some useful general formulae with 1-forms
and 2-forms in $\mathbb{R}^3$.
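As we know from Definition 8.3.6, if
\[ \omega = a_1(x, y, z)\,dx + a_2(x, y, z)\,dy + a_3(x, y, z)\,dz \]
then
\begin{align*}
d\omega &= \left( \frac{\partial a_1}{\partial y}\,dy + \frac{\partial a_1}{\partial z}\,dz \right) \wedge dx + \left( \frac{\partial a_2}{\partial x}\,dx + \frac{\partial a_2}{\partial z}\,dz \right) \wedge dy \\
&\quad + \left( \frac{\partial a_3}{\partial x}\,dx + \frac{\partial a_3}{\partial y}\,dy \right) \wedge dz,
\end{align*}
which simplifies to
\begin{align*}
d\omega &= \left( \frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} \right) dx \wedge dy + \left( \frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} \right) dy \wedge dz \\
&\quad + \left( \frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} \right) dz \wedge dx. \tag{8.2}
\end{align*}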
Consider now the 2-form in $\mathbb{R}^3$
\[ \eta = b_1(x, y, z)\,dy \wedge dz + b_2(x, y, z)\,dz \wedge dx + b_3(x, y, z)\,dx \wedge dy. \tag{8.3} \]
In the computation of $db_1$, $db_2$ and $db_3$ we only need to consider the differential term
which does not appear in the associated wedge product. We obtain
\[ d\eta = \left( \frac{\partial b_1}{\partial x} + \frac{\partial b_2}{\partial y} + \frac{\partial b_3}{\partial z} \right) dx \wedge dy \wedge dz \]
because the reordering of the wedge products to $dx \wedge dy \wedge dz$ always requires an even
number of interchanges. The reader is invited to perform this calculation.
Correspondence between $k$-forms and vector fields

Formulae (8.2) and (8.3) should remind you of the curl and divergence operators
applied to vector fields, presented in Chapter 6, Section 6.3.2. We make this
correspondence explicit by considering differentiable vector fields $F(x, y, z)$ and $G(x, y, z)$
in $\mathbb{R}^3$. We define 1-forms and 2-forms
\[ \omega_F(x, y, z) = F(x, y, z) \cdot (dx, dy, dz), \quad \omega_G = G(x, y, z) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy). \]
Then,
\begin{align*}
d\omega_F &= \operatorname{curl} F(x, y, z) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy), \\
d\omega_G &= \operatorname{div} G(x, y, z)\,dx \wedge dy \wedge dz.
\end{align*}
We return to these relationships in the formulation of Stokes' theorem in terms of
differential operators (see Chapter 10).
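The correspondence with curl and divergence can be illustrated with a small exact computation. The sketch below makes several assumptions: the field $F = (y^2, xz, xy)$ is a hypothetical example, the helper names are invented for illustration, and partial derivatives are taken by central differences with rational arithmetic, which is exact only because each component is a polynomial of degree at most 2 in each variable.

```python
from fractions import Fraction

def partial(f, p, k, h=Fraction(1, 2)):
    """Central difference of f in the k-th variable at p.

    Exact when f is a polynomial of degree at most 2 in that variable."""
    plus = list(p)
    minus = list(p)
    plus[k] += h
    minus[k] -= h
    return (f(*plus) - f(*minus)) / (2 * h)

def d_one_form(a1, a2, a3, p):
    """Coefficients of d(a1 dx + a2 dy + a3 dz) on (dy^dz, dz^dx, dx^dy), see (8.2)."""
    return (partial(a3, p, 1) - partial(a2, p, 2),
            partial(a1, p, 2) - partial(a3, p, 0),
            partial(a2, p, 0) - partial(a1, p, 1))

def d_two_form(b1, b2, b3, p):
    """Coefficient of d(b1 dy^dz + b2 dz^dx + b3 dx^dy) on dx^dy^dz."""
    return partial(b1, p, 0) + partial(b2, p, 1) + partial(b3, p, 2)

# Hypothetical vector field F = (y^2, xz, xy)
F = (lambda x, y, z: y * y,
     lambda x, y, z: x * z,
     lambda x, y, z: x * y)
p = (Fraction(1), Fraction(2), Fraction(3))

curl_at_p = d_one_form(*F, p)   # components of curl F = (0, -y, z - 2y) at p
div_at_p = d_two_form(*F, p)    # div F = 0 for this field
print(curl_at_p, div_at_p)
```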
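Exact and Closed $k$-forms

We now come to the fundamental property of the exterior derivative when applied to
differentiable $k$-forms. The proof is presented only in the case of 1-forms.
Theorem 8.3.1. For any $k$-form $\omega$ with $C^2$ coefficients,
\[ d^2\omega = d(d\omega) = 0. \]
Proof for 1-forms: We present the proof for 1-forms in $\mathbb{R}^n$. The general proof
follows a similar argument using the general form of a $k$-form explained in the
previous section. Let $x = (x_1, \ldots, x_n)$ and
\[ \omega = \sum_{i=1}^{n} a_i(x)\,dx_i. \]
Then,
\[ d\omega = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial a_i}{\partial x_j}\,dx_j \wedge dx_i \]
and
\[ d^2\omega = d(d\omega) = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \sum_{k=1}^{n} \frac{\partial^2 a_i}{\partial x_k \partial x_j}\,dx_k \right) \wedge dx_j \wedge dx_i. \]
For any term in the triple sum for which $i = j$, $i = k$ or $j = k$, we have $dx_k \wedge dx_j \wedge dx_i \equiv 0$.
Suppose now that $i, j, k$ are different and fix $i$. Considering only the terms
$j = m_1$, $k = m_2$ and $j = m_2$, $k = m_1$ in the sum, we have
\begin{align*}
&\frac{\partial^2 a_i}{\partial x_{m_2} \partial x_{m_1}}\,dx_{m_2} \wedge dx_{m_1} \wedge dx_i + \frac{\partial^2 a_i}{\partial x_{m_1} \partial x_{m_2}}\,dx_{m_1} \wedge dx_{m_2} \wedge dx_i \\
&\quad = \left( \frac{\partial^2 a_i}{\partial x_{m_2} \partial x_{m_1}} - \frac{\partial^2 a_i}{\partial x_{m_1} \partial x_{m_2}} \right) dx_{m_2} \wedge dx_{m_1} \wedge dx_i \\
&\quad = 0
\end{align*}
by the equality of mixed second partial derivatives guaranteed by Clairaut's theorem.
That is, all terms in the sum cancel in pairs and this completes the proof for 1-forms.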
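A small worked instance may help to see the cancellation at work. The 1-form $\omega = xy\,dz$ in $\mathbb{R}^3$ is chosen here purely for illustration; it is not taken from the text.

```latex
\begin{align*}
\omega      &= xy \, dz, \\
d\omega     &= d(xy) \wedge dz = y \, dx \wedge dz + x \, dy \wedge dz, \\
d(d\omega)  &= dy \wedge dx \wedge dz + dx \wedge dy \wedge dz \\
            &= -\,dx \wedge dy \wedge dz + dx \wedge dy \wedge dz = 0 .
\end{align*}
```

The two surviving terms come from the mixed partial derivatives $\partial^2 (xy)/\partial y\,\partial x$ and $\partial^2 (xy)/\partial x\,\partial y$, which are equal, while the wedge products differ by one interchange.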
We extend the definition of exactness to general $k$-forms; moreover, we introduce
the concept of closed $k$-forms.
Definition 8.3.2 (exact and closed forms). Consider a differential form $\omega \in \Omega^k(\mathbb{R}^n)$.
(a) $\omega$ is exact if there exists a $(k - 1)$-form $\eta$ such that $\omega = d\eta$.
(b) $\omega$ is closed if $d\omega = 0$.
These definitions lead directly to the following result.
Proposition 8.3.3. If $\omega$ is an exact $k$-form, then $\omega$ is closed.
Proof. $\omega$ is exact implies there exists a $(k - 1)$-form $\eta$ such that $\omega = d\eta$. Then,
$d\omega = d(d\eta) = 0$ by Theorem 8.3.1.
Example 8.3.4. Let $\omega = a(x)\,dx + b(y)\,dy + c(z)\,dz$ where $a(x)$, $b(y)$ and $c(z)$ are
continuously differentiable functions. Then,
\[ d\omega = a'(x)\,dx \wedge dx + b'(y)\,dy \wedge dy + c'(z)\,dz \wedge dz = 0. \]
So, $\omega$ is closed and is also exact where $\omega = df$ with
\[ f(x, y, z) = \int a(x)\,dx + \int b(y)\,dy + \int c(z)\,dz. \]
We conclude with a generalization of Theorem 6.4.5 to general differentiable $k$-forms.
We do not prove this result, but refer to [1].
Theorem 8.3.5 (Poincaré's Lemma). A $C^2$ $k$-form $\omega$ defined on $U \subset \mathbb{R}^n$ is closed if
and only if $\omega$ is locally exact.
Exercises
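(1) Show that if $\omega_1$ is a $k$-form and $\omega_2$ is an $s$-form with $k, s \in \{0, 1, 2, 3\}$ then
\[ \omega_1 \wedge \omega_2 = (-1)^{ks}(\omega_2 \wedge \omega_1). \]
(2) Let $\omega = xyz\,dx + yz\,dy + (x + z)\,dz$ and compute $d\omega$.
(3) Show that if $\omega_1, \omega_2$ are both either 1, 2 or 3 forms in $\mathbb{R}^3$, then
\[ d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2. \]
(4) Show that if $\omega = \frac{1}{2}(-y\,dx + x\,dy)$ then $d\omega = dx \wedge dy$.
(5) Consider $\omega = (xy + \frac{1}{2}y)\,dx - \frac{1}{2}x\,dy$. Verify that $\omega = \omega_e + \omega_r$ where
\[ \omega_e = \frac{xy}{2}\,dx + \frac{x^2}{4}\,dy \quad\text{and}\quad \omega_r = \left( \frac{xy}{2} + \frac{y}{2} \right) dx + \left( -\frac{x}{2} - \frac{x^2}{4} \right) dy. \]
Show that $\omega_e$ is closed and find the 0-form $f$ such that $df = \omega_e$. Show that $\omega_r$
is not exact.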
(6) Let = a(x, y) dx + b(x, y) dy where a, b are differentiable and defined for all of
R2 and F (x, y) = (x, y) be a vector field. Consider
1
H(x, y) := (x, y)hF (x, y)i d.
0
Then, H(x, y) is a function. Explain why dH is exact? In each case below,

compute H(x, y), dH and rewrite
= dH + ( dH).
(a) = (xy + y/2) dx (x/2) dy,

(b) = (5x2 y 2xy) dx + (2x xy 2 ) dy,
(c) = exy dx + xey dy.
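(7) Consider the 2-form
\[ \omega = (xy + 2z^2y^2)\,dy \wedge dz + \left( xz^2 - \frac{y^2}{2} \right) dz \wedge dx + (x^3 - y^2)\,dx \wedge dy. \]
Verify that
\[ \omega = \omega_e + \omega_r \]
where $\omega_e = xy\,dy \wedge dz - \frac{y^2}{2}\,dz \wedge dx$ and
\[ \omega_r = 2z^2y^2\,dy \wedge dz + xz^2\,dz \wedge dx + (x^3 - y^2)\,dx \wedge dy. \]
Show that $\omega_e$ is closed and find a 1-form $\eta$ such that $d\eta = \omega_e$. Show that $\omega_r$ is
not exact.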
8.3.2 A Construction of the Exterior Derivative
As Theorem 6.4.5 shows, the conditions (6.9) are important quantities to determine
whether a 1-form $\omega$ is locally exact. Now, those expressions are obtained from
differentiating the coefficients of $\omega$ and therefore we would like to express (6.9) in terms of
a differential operator we can apply directly to $\omega$, in analogy with differential
operators such as $\nabla$, div, curl or $\Delta$ seen in Section 6.3. The exterior derivative defined
in the previous section is the correct differential operator for this purpose; however,
it is stated without any justification. In this section, we present one way of
constructing the exterior derivative as applied to 1-forms.
Before doing the derivation, recall the following properties of matrices. Let $A$
be an $n \times n$ matrix, then $A$ is symmetric if $A = A^T$, while $A$ is skew-symmetric (or
anti-symmetric) if $A^T = -A$. Then, any matrix $A$ can be written as the sum of a
symmetric and an antisymmetric matrix. Explicitly,
\[ A = B + C, \quad\text{where } B = \tfrac{1}{2}(A + A^T) \text{ and } C = \tfrac{1}{2}(A - A^T). \]
It is straightforward to check that $B$ is symmetric and $C$ is skew-symmetric. We
illustrate with a $2 \times 2$ matrix
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & \frac{1}{2}(b + c) \\ \frac{1}{2}(b + c) & d \end{pmatrix} + \frac{1}{2}(c - b) \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]
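The splitting into symmetric and skew-symmetric parts is easy to verify on a concrete matrix. This is a minimal sketch with exact rational arithmetic; the function names and the sample matrix are hypothetical.

```python
from fractions import Fraction

def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*A)]

def split(A):
    """Return (B, C) with A = B + C, B symmetric and C skew-symmetric."""
    At = transpose(A)
    n = len(A)
    half = Fraction(1, 2)
    B = [[half * (A[i][j] + At[i][j]) for j in range(n)] for i in range(n)]
    C = [[half * (A[i][j] - At[i][j]) for j in range(n)] for i in range(n)]
    return B, C

A = [[1, 2], [5, 4]]   # hypothetical sample with a = 1, b = 2, c = 5, d = 4
B, C = split(A)
# Here C equals (1/2)(c - b) times the matrix with rows (0, -1) and (1, 0).
print(B, C)
```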
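Consider the 1-form $\omega = a(x, y)\,dx + b(x, y)\,dy$ where $a, b$ are $C^1$ functions in
$\mathbb{R}^2$. We know that at each $(x, y) \in \mathbb{R}^2$, $\omega$ can be applied to a vector $v \in T_{(x,y)}\mathbb{R}^2$. If
the point $(x, y)$ is not specified, the vector $v$ can vary with the base point $(x, y)$ and
it is then more convenient to think not only of $\omega$, but of a pair $(\omega, F)$ depending on
$(x, y)$ only, where $F = F(x, y)$ is some vector field. Thus, we consider $\omega(x, y)\langle F \rangle$ as
a function of $(x, y)$. Now, let $G = G(x, y)$ be another vector field and we compute
the derivative of $\omega\langle F \rangle$ applied to vectors in $G$; that is,
\[ D\omega(x, y)\langle F \rangle : T_{(x,y)}\mathbb{R}^2 \to \mathbb{R}, \quad G \mapsto D\omega(x, y)\langle F \rangle \cdot G. \]
In explicit coordinates, we have the following expression:
\begin{align*}
D\omega(x, y)\langle F \rangle \cdot G
&= \left( \frac{\partial a}{\partial x}\,dx(F) + \frac{\partial b}{\partial x}\,dy(F),\ \frac{\partial a}{\partial y}\,dx(F) + \frac{\partial b}{\partial y}\,dy(F) \right) \cdot G \\
&= \left( \frac{\partial a}{\partial x}\,dx(F) + \frac{\partial b}{\partial x}\,dy(F) \right) dx(G) + \left( \frac{\partial a}{\partial y}\,dx(F) + \frac{\partial b}{\partial y}\,dy(F) \right) dy(G) \\
&= (dx(F), dy(F)) \begin{pmatrix} \dfrac{\partial a}{\partial x} & \dfrac{\partial a}{\partial y} \\[2mm] \dfrac{\partial b}{\partial x} & \dfrac{\partial b}{\partial y} \end{pmatrix} \begin{pmatrix} dx(G) \\ dy(G) \end{pmatrix} \\
&= (dx(F), dy(F)) \begin{pmatrix} \dfrac{\partial a}{\partial x} & \dfrac{1}{2}\left( \dfrac{\partial a}{\partial y} + \dfrac{\partial b}{\partial x} \right) \\[2mm] \dfrac{1}{2}\left( \dfrac{\partial a}{\partial y} + \dfrac{\partial b}{\partial x} \right) & \dfrac{\partial b}{\partial y} \end{pmatrix} \begin{pmatrix} dx(G) \\ dy(G) \end{pmatrix} \\
&\quad + \frac{1}{2}\left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) (dx(F), dy(F)) \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} dx(G) \\ dy(G) \end{pmatrix}
\end{align*}
where the last equality comes from the splitting of a matrix into its symmetric and
antisymmetric parts. In this form, we see the desired expression
\[ \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \]
appear in the second term. Moreover, the last term corresponds to the signed area of the
parallelogram generated by the vectors $F$ and $G$; in other words, we can rewrite
\[ (dx(F), dy(F)) \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} dx(G) \\ dy(G) \end{pmatrix} = dy(F)\,dx(G) - dx(F)\,dy(G) = (dx \wedge dy)(G, F). \]
Denote by $S$ the symmetric matrix appearing above. A straightforward computation
shows that interchanging the roles of $F$ and $G$ yields
\begin{align*}
D\omega(x, y)\langle G \rangle \cdot F
&= (dx(G), dy(G))\, S \begin{pmatrix} dx(F) \\ dy(F) \end{pmatrix} + \frac{1}{2}\left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) (dx \wedge dy)(F, G) \\
&= (dx(G), dy(G))\, S \begin{pmatrix} dx(F) \\ dy(F) \end{pmatrix} - \frac{1}{2}\left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) (dx \wedge dy)(G, F).
\end{align*}
We can now conclude this calculation by noticing that the symmetric parts cancel
out and so we isolate the antisymmetric part, which is a 2-form:
\[ D\omega(x, y)\langle G \rangle \cdot F - D\omega(x, y)\langle F \rangle \cdot G = \left( \frac{\partial b}{\partial x} - \frac{\partial a}{\partial y} \right) (dx \wedge dy)(F, G). \]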
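Let us look at a similar calculation with a 1-form in $\mathbb{R}^3$, using the $(x_1, x_2, x_3)$
notation. Let
\[ \omega = a_1(x_1, x_2, x_3)\,dx_1 + a_2(x_1, x_2, x_3)\,dx_2 + a_3(x_1, x_2, x_3)\,dx_3 \]
and $F = (F_1, F_2, F_3)^T$ and $G = (G_1, G_2, G_3)^T$ be vector fields. Then, we obtain
\begin{align*}
D\omega(x_1, x_2, x_3)\langle F \rangle \cdot G
&= F^T S\, G \\
&\quad + \frac{1}{2}\left( \frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} \right) (dx_1(F), dx_2(F), dx_3(F)) \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} dx_1(G) \\ dx_2(G) \\ dx_3(G) \end{pmatrix} \\
&\quad + \frac{1}{2}\left( \frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3} \right) (dx_1(F), dx_2(F), dx_3(F)) \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} dx_1(G) \\ dx_2(G) \\ dx_3(G) \end{pmatrix} \\
&\quad + \frac{1}{2}\left( \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} \right) (dx_1(F), dx_2(F), dx_3(F)) \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} dx_1(G) \\ dx_2(G) \\ dx_3(G) \end{pmatrix}
\end{align*}
where $S$ is the symmetric matrix
\[ S = \begin{pmatrix} \dfrac{\partial a_1}{\partial x_1} & \dfrac{1}{2}\left( \dfrac{\partial a_1}{\partial x_2} + \dfrac{\partial a_2}{\partial x_1} \right) & \dfrac{1}{2}\left( \dfrac{\partial a_1}{\partial x_3} + \dfrac{\partial a_3}{\partial x_1} \right) \\[2mm] \dfrac{1}{2}\left( \dfrac{\partial a_1}{\partial x_2} + \dfrac{\partial a_2}{\partial x_1} \right) & \dfrac{\partial a_2}{\partial x_2} & \dfrac{1}{2}\left( \dfrac{\partial a_2}{\partial x_3} + \dfrac{\partial a_3}{\partial x_2} \right) \\[2mm] \dfrac{1}{2}\left( \dfrac{\partial a_1}{\partial x_3} + \dfrac{\partial a_3}{\partial x_1} \right) & \dfrac{1}{2}\left( \dfrac{\partial a_2}{\partial x_3} + \dfrac{\partial a_3}{\partial x_2} \right) & \dfrac{\partial a_3}{\partial x_3} \end{pmatrix}. \]
By noticing that
\[ (dx_1(F), dx_2(F), dx_3(F)) \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} dx_1(G) \\ dx_2(G) \\ dx_3(G) \end{pmatrix} = (dx_1 \wedge dx_2)(G, F) \]
and similarly for the other two expressions, we can rewrite
\begin{align*}
D\omega(x_1, x_2, x_3)\langle F \rangle \cdot G
&= F^T S\, G + \frac{1}{2}\left( \frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} \right) (dx_1 \wedge dx_2)(G, F) \\
&\quad + \frac{1}{2}\left( \frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3} \right) (dx_1 \wedge dx_3)(G, F) + \frac{1}{2}\left( \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} \right) (dx_2 \wedge dx_3)(G, F).
\end{align*}
Therefore,
\begin{align*}
D\omega(x_1, x_2, x_3)\langle G \rangle \cdot F - D\omega(x_1, x_2, x_3)\langle F \rangle \cdot G
&= \left( \frac{\partial a_2}{\partial x_1} - \frac{\partial a_1}{\partial x_2} \right) (dx_1 \wedge dx_2)(F, G) \\
&\quad - \left( \frac{\partial a_3}{\partial x_1} - \frac{\partial a_1}{\partial x_3} \right) (dx_3 \wedge dx_1)(F, G) + \left( \frac{\partial a_3}{\partial x_2} - \frac{\partial a_2}{\partial x_3} \right) (dx_2 \wedge dx_3)(F, G)
\end{align*}
because the symmetric parts cancel out. This calculation, unsurprisingly, also yields
a 2-form.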
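Repeating the calculation above with a general 1-form in $\mathbb{R}^n$ we obtain the
following formula. Let $x = (x_1, \ldots, x_n)$,
\[ \omega = \sum_{i=1}^{n} a_i(x)\,dx_i, \]
and let $F = (F_1, \ldots, F_n)$, $G = (G_1, \ldots, G_n)$ be vector fields in $\mathbb{R}^n$. Then,
\[ D\omega(x)\langle G \rangle \cdot F - D\omega(x)\langle F \rangle \cdot G = \sum_{i < j} \left( \frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j} \right) (dx_i \wedge dx_j)(F, G). \]
These calculations motivate the following definition.
Definition 8.3.6. Let $\omega = \sum_{i=1}^{n} a_i(x)\,dx_i$ be a differentiable 1-form in $\mathbb{R}^n$. The
exterior derivative of $\omega$ is the differential operator $d : \Omega^1(\mathbb{R}^n) \to \Omega^2(\mathbb{R}^n)$ defined by
\[ d\omega\langle F, G \rangle := \sum_{i < j} \left( \frac{\partial a_j}{\partial x_i} - \frac{\partial a_i}{\partial x_j} \right) (dx_i \wedge dx_j)(F, G). \]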
Exercises
(1) Verify that the exterior derivative formula for a general 1-form
\[ \omega = \sum_{i=1}^{n} a_i(x)\,dx_i \]
obtained in Definition 8.3.6 corresponds to the exterior derivative computed
with formula (8.1).
9 Integration of Forms
In this chapter, we put together many of the results obtained in Chapter 7 and
Chapter 8 in order to develop the theory of integration of 2-forms and 3-forms.
This includes broadening the discussion on pullbacks, specifically for the integration
of 2-forms and 3-forms. The results are used in the second section to establish a
change of variables formula which we state for 2 and 3 forms, along with examples.
We continue with the definition of surface integral and in particular, integrating 2-
forms on a surface. Next, we present an introduction to the orientation of surfaces.
We conclude with a brief section on general pullbacks presenting results linking
pullbacks with wedge products and the exterior derivative.
9.1 Pullbacks of k-forms: k = 1, 2, 3
We begin by generalizing the pullback operation described in Chapter 4 for 1-forms
to the case of 2-forms in $\mathbb{R}^2$ and $\mathbb{R}^3$ and 3-forms in $\mathbb{R}^3$. We then extend the discussion
to more general pullbacks.
Recall first the pullback operation for 1-forms over a curve $C$ parametrized by
$r(t)$ with $t \in [a, b]$. Consider a 1-form in $\mathbb{R}^2$
\[ \omega = a(x, y)\,dx + b(x, y)\,dy. \]
The pullback is a way of restricting $\omega$, defined for all of $\mathbb{R}^2$, to the curve $C \subset \mathbb{R}^2$. To
do this, we evaluate the coefficients $a$ and $b$ on the parametrization $r(t)$ and evaluate
the differentials $dx$ and $dy$ on vectors in the tangent space at all points of the curve
$C$.
The definition of pullback for 2-forms and 3-forms (and more general $k$-forms too)
follows a similar process. This time the geometric objects under study are not curves,
but two and three dimensional domains, as well as surfaces in $\mathbb{R}^3$. Recall that
Section 6.2 shows how to obtain parametrizations of such regions. We can now begin our
derivations.
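Case 1:
Consider a parametrization of $\mathbb{R}^2$ given by
\[ \Phi : \mathbb{R}^2 \to \mathbb{R}^2, \quad (x, y)^T = \Phi(u, v). \]
Let $\omega = a(x, y)\,dx \wedge dy$ be a 2-form defined on $\mathbb{R}^2$. Let $\vec{\alpha}$ and $\vec{\beta}$ be vectors in the
tangent space of some point in $(u, v)$-space. As in Theorem 8.1.7, we use the mapping
$\Phi$ to obtain
\[ (dx \wedge dy)(D\Phi(\vec{\alpha}), D\Phi(\vec{\beta})) = \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) (du \wedge dv)(\vec{\alpha}, \vec{\beta}). \]
The pullback of $\omega$ by $\Phi$ is completed by evaluating the coefficient of $\omega$ on the
parametrization $\Phi$. Thus, the pullback defines a new 2-form obtained using the
substitutions of variables just shown:
\[ (\Phi^*\omega)(u, v) := a(\Phi(u, v)) \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) du \wedge dv \]
applied to pairs of vectors $(\vec{\alpha}, \vec{\beta})$ in $T_{(u,v)}\mathbb{R}^2$.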
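Example 9.1.1. Consider the polar coordinate mapping
\[ \Phi(r, \theta) = (r\cos\theta, r\sin\theta) \]
and the two-form in $\mathbb{R}^2$: $\omega = 2x\sqrt{x^2 + y^2}\,dx \wedge dy$. The coefficient is $a(x, y) = 2x\sqrt{x^2 + y^2}$
and so
\[ a(\Phi(r, \theta)) = 2r^2\cos\theta. \]
Because we know $dx \wedge dy = r\,dr \wedge d\theta$,
\[ (\Phi^*\omega)(r, \theta) = 2r^3\cos\theta\,dr \wedge d\theta. \]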
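Case 2:
For a parametrization of $\mathbb{R}^3$
\[ \Phi : \mathbb{R}^3 \to \mathbb{R}^3, \quad (x, y, z)^T = \Phi(u, v, w) \]
the pullback operation is defined in the same way as for 2-forms. Consider the 3-form
on $\mathbb{R}^3$ given by $\omega(x, y, z) = a(x, y, z)\,dx \wedge dy \wedge dz$. We know that
\[ dx \wedge dy \wedge dz = \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) du \wedge dv \wedge dw. \]
We define a new 3-form using the direct substitutions into $\omega$:
\[ (\Phi^*\omega)(u, v, w) = a(\Phi(u, v, w)) \det\left( \frac{\partial(x, y, z)}{\partial(u, v, w)} \right) du \wedge dv \wedge dw \]
applied to triples of vectors $(\vec{\alpha}, \vec{\beta}, \vec{\gamma})$ in the tangent space of the point $(u, v, w)$.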
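Example 9.1.2. Consider the spherical coordinate mapping
\[ \Phi(\rho, \phi, \theta) = (\rho\cos\theta\sin\phi, \rho\sin\theta\sin\phi, \rho\cos\phi) \]
and the three-form $\omega = e^{x^2 + y^2 + z^2}\,dx \wedge dy \wedge dz$. Then, $a(x, y, z) = e^{x^2 + y^2 + z^2}$ becomes
$a(\Phi(\rho, \phi, \theta)) = e^{\rho^2}$, which implies
\[ (\Phi^*\omega)(\rho, \phi, \theta) = e^{\rho^2}\rho^2\sin\phi\,d\rho \wedge d\phi \wedge d\theta. \]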
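Case 3:
We now turn to the case of the parametrization of a surface $S$ given by
\[ \Phi : R \subset \mathbb{R}^2 \to \mathbb{R}^3 \]
and consider the 2-form
\[ \omega(x, y, z) = a_1(x, y, z)\,dy \wedge dz + a_2(x, y, z)\,dz \wedge dx + a_3(x, y, z)\,dx \wedge dy. \]
The pullback operation is defined as follows. We need to evaluate the coefficients on
the surface given by the parametrization $\Phi(u, v)$; that is,
\[ a_1(\Phi(u, v)), \quad a_2(\Phi(u, v)), \quad a_3(\Phi(u, v)). \]
We then use the parametrization to transform the 2-forms $dy \wedge dz$, $dz \wedge dx$ and
$dx \wedge dy$ to 2-forms involving $du \wedge dv$. For convenience, we write
\[ (x, y, z)^T = \Phi(u, v) = (x(u, v), y(u, v), z(u, v))^T. \]
Therefore,
\begin{align*}
dx \wedge dy &= \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) du \wedge dv, \\
dy \wedge dz &= \det\left( \frac{\partial(y, z)}{\partial(u, v)} \right) du \wedge dv, \tag{9.1} \\
dz \wedge dx &= \det\left( \frac{\partial(z, x)}{\partial(u, v)} \right) du \wedge dv.
\end{align*}
For vectors $\vec{\alpha}, \vec{\beta} \in T_{(u,v)}\mathbb{R}^2$, the vectors $D\Phi(\vec{\alpha})$ and $D\Phi(\vec{\beta})$
are elements of $T_{\Phi(u,v)}S$. The wedge products $dx \wedge dy$, $dy \wedge dz$ and $dz \wedge dx$ are applied
to the vectors $D\Phi(\vec{\alpha})$ and $D\Phi(\vec{\beta})$.
Putting all those substitutions together, we define the pullback of $\omega$ by $\Phi$ as
\begin{align*}
(\Phi^*\omega)(u, v) = \bigg[ & a_1(\Phi(u, v)) \det\left( \frac{\partial(y, z)}{\partial(u, v)} \right) + a_2(\Phi(u, v)) \det\left( \frac{\partial(z, x)}{\partial(u, v)} \right) \\
& + a_3(\Phi(u, v)) \det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) \bigg]\, du \wedge dv
\end{align*}
applied to vectors $\vec{\alpha}, \vec{\beta} \in T_{(u,v)}\mathbb{R}^2$.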
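Example 9.1.3. Consider the surface $S$ with parametrization
\[ \Phi(u, v) = (u^2, u + v, v - u) \]
and the 2-form
\[ \omega = 3xyz\,dy \wedge dz + zy\,dz \wedge dx + x^2y\,dx \wedge dy. \]
Then, $a_1(x, y, z) = 3xyz$, $a_2(x, y, z) = zy$, $a_3(x, y, z) = x^2y$ and so
\begin{align*}
a_1(\Phi(u, v)) &= 3u^2(u + v)(v - u) = 3u^2(v^2 - u^2), \\
a_2(\Phi(u, v)) &= (v - u)(u + v) = v^2 - u^2, \\
a_3(\Phi(u, v)) &= (u^2)^2(u + v) = u^5 + u^4v.
\end{align*}
Moreover,
\begin{align*}
\det\left( \frac{\partial(y, z)}{\partial(u, v)} \right) &= \det \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} = 2, \\
\det\left( \frac{\partial(z, x)}{\partial(u, v)} \right) &= \det \begin{pmatrix} -1 & 1 \\ 2u & 0 \end{pmatrix} = -2u, \\
\det\left( \frac{\partial(x, y)}{\partial(u, v)} \right) &= \det \begin{pmatrix} 2u & 0 \\ 1 & 1 \end{pmatrix} = 2u,
\end{align*}
which leads to
\begin{align*}
(\Phi^*\omega)(u, v) &= (6u^2(v^2 - u^2) - 2u(v^2 - u^2) + 2u(u^5 + u^4v))\,du \wedge dv \\
&= (6u^2v^2 - 6u^4 - 2uv^2 + 2u^3 + 2u^6 + 2u^5v)\,du \wedge dv.
\end{align*}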
The main use of pullbacks in our context is to compute integrals as we show in the
next section.
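The pullback formula for 2-forms lends itself to a quick numerical sanity check. The following Python sketch is not part of the original text: it approximates the three 2x2 Jacobian determinants by central differences and compares the resulting coefficient of $du \wedge dv$ with the polynomial found in Example 9.1.3. The helper names `jacobian_minors` and `pullback_coefficient` and the step size `h` are illustrative assumptions.

```python
def jacobian_minors(param, u, v, h=1e-6):
    """Approximate (d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) by central differences."""
    xu, yu, zu = [(s - t) / (2 * h) for s, t in zip(param(u + h, v), param(u - h, v))]
    xv, yv, zv = [(s - t) / (2 * h) for s, t in zip(param(u, v + h), param(u, v - h))]
    return (yu * zv - zu * yv, zu * xv - xu * zv, xu * yv - yu * xv)


def pullback_coefficient(coeffs, param, u, v):
    """Coefficient of du^dv in the pullback of a1 dy^dz + a2 dz^dx + a3 dx^dy."""
    a1, a2, a3 = coeffs(*param(u, v))
    m_yz, m_zx, m_xy = jacobian_minors(param, u, v)
    return a1 * m_yz + a2 * m_zx + a3 * m_xy


def phi(u, v):
    # Parametrization of Example 9.1.3
    return (u**2, u + v, v - u)


def a(x, y, z):
    # Coefficients (a1, a2, a3) of the 2-form of Example 9.1.3
    return (3 * x * y * z, z * y, x**2 * y)


def closed_form(u, v):
    # Polynomial obtained by hand in Example 9.1.3
    return 6*u**2*v**2 - 6*u**4 - 2*u*v**2 + 2*u**3 + 2*u**6 + 2*u**5*v


print(pullback_coefficient(a, phi, 0.7, 1.3), closed_form(0.7, 1.3))
```

The two printed numbers should agree up to rounding error, since the components of this parametrization are polynomials of degree at most two and central differences are exact for those.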
Exercises
(1) Compute the following pullbacks of 2-forms and 3-forms.

(a) (x, y) = (x2 y 2 )dx dy with (u, v) = (u cosh(v), u sinh(v)).
(b) (x, y) = exp(xy)dx dy with (u, v) = (u2 v 2 , u2 + v 2 )
p
(c) (x, y) = (3 x2 + y 2 )dx dy with (u, v) = (u cos v, u sin v)
(d) (x, y, z) = (3x y)dy dz + (y z)dz dx (x + y + z)dx dy with
(u, v) = ( 31 u + v, v, u v).
(e) (x, y, z) = xdy dz + ydz dx + zdx dy with
(u, v) = (2 cos u sin v, 2 sin u sin v, 2 cos v).
(f) (x, y, z) = (xz y)dy dz + (y 2 z)dz dx + xyzdx dy with (u, v) =

(u, uv + u2 , v + u).
2 2 2

(g) (x, y, z) = x22 + y12 + z32 dx dy dz with (u, v, w) = (2u, v, 3w).
(h) (x, y, z) = exp(xyz)dx dy dz with (u, v, w) = (ln uv, v 2 , uw).
(i) (x, y, z) = ||(x, y, z)||dx dy dz with (u, v, w) = (u cos v, u sin v, w).
9.2 Integrals of Forms: change of variables formula
We begin by considering the integration of 2-forms over two-dimensional domains

and 3-forms over three-dimensional domains. We show that the integrals of those
forms correspond to regular double and triple integrals as defined in Chapter 7. To
make this correspondence, we return to wedge products and orientation.
Note that for any rectangle $P(u, v)$ with $u$ parallel to the $x$-axis and $v$ parallel to the $y$-axis, we have $(dx \wedge dy)(u, v) = dA(P(u, v))$. Similarly, if $B(u, v, w)$ is a box with $u$ parallel to the $x$-axis, $v$ parallel to the $y$-axis and $w$ parallel to the $z$-axis, then $(dx \wedge dy \wedge dz)(B) = dV(B)$.
Definition 9.2.1. Let $\omega^2(x, y) = f(x, y) \, dx \wedge dy$ and $\omega^3(x, y, z) = g(x, y, z) \, dx \wedge dy \wedge dz$. Then, for a rectangular domain $D$ in $\mathbb{R}^2$ and a box domain $B$ in $\mathbb{R}^3$,
\[
\iint_D \omega^2 \quad \text{and} \quad \iiint_B \omega^3
\]
are defined as limits of Riemann sums over grids subdividing $D$ and $B$ as in the definition of double and triple integrals; see Definition 7.2.2 and Definition 7.4.1.
Proposition 9.2.2. Let $\omega^2(x, y)$ be a 2-form and $\omega^3(x, y, z)$ be a 3-form, as in Definition 9.2.1. For a domain $D \subset \mathbb{R}^2$, we have
\[
\iint_D \omega^2 = \iint_D f(x, y) \, dA,
\]
and for a domain $E \subset \mathbb{R}^3$,
\[
\iiint_E \omega^3 = \iiint_E g(x, y, z) \, dV,
\]
if the integrals on the right exist. Therefore, all results shown for double and triple integrals apply to integrals of 2-forms and 3-forms.
Proof. The properties of $dx \wedge dy$ and $dx \wedge dy \wedge dz$ on $P(u, v)$ and $B(u, v, w)$ noted above show that the Riemann sums of the forms equal the Riemann sums defining double and triple integrals. In particular, the integrals exist over rectangular domains $R$ and box domains $B$ for 2-forms and 3-forms whose coefficients are bounded and have a finite number of jump discontinuities of finite size. Therefore, we can define those integrals for domains $D$ and $E$ more general than rectangles and boxes by setting $\omega^2$ and $\omega^3$ to zero outside $D$ and $E$.
We now present the change of variables formula, which generalizes the substitution rule (or u-substitution) to multiple integrals. In Section 4.1, we show that the substitution rule can be rewritten in terms of pullbacks of 1-forms. It is therefore natural that the change of variables formula for multiple integrals can also be obtained using pullbacks. With the formulae obtained above, it is now a straightforward result.
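Theorem 9.2.3 (Change of variables formula). Let $D \subset \mathbb{R}^2$ and $E \subset \mathbb{R}^3$ be subsets for which
\[
\iint_D f(x, y) \, dA, \qquad \iiint_E g(x, y, z) \, dV
\]
exist. Consider the parametrizations $\Phi_1 : R_1 \to D$ and $\Phi_2 : R_2 \to E$:
\[
\Phi_1(u, v) = (x(u, v), y(u, v)) \quad \text{and} \quad \Phi_2(u, v, w) = (x(u, v, w), y(u, v, w), z(u, v, w)).
\]
Then
\[
\begin{aligned}
\iint_D f(x, y) \, dA_{(x,y)} &= \iint_{R_1} f(\Phi_1(u, v)) \left| \det\frac{\partial(x, y)}{\partial(u, v)} \right| dA_{(u,v)}, \\
\iiint_E g(x, y, z) \, dV_{(x,y,z)} &= \iiint_{R_2} g(\Phi_2(u, v, w)) \left| \det\frac{\partial(x, y, z)}{\partial(u, v, w)} \right| dV_{(u,v,w)}.
\end{aligned}
\]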
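Proof. Using Proposition 9.2.2, we can work directly with
\[
\iint_D \omega^2 \quad \text{and} \quad \iiint_E \omega^3
\]
where $\omega^2 = f(x, y) \, dx \wedge dy$ and $\omega^3 = g(x, y, z) \, dx \wedge dy \wedge dz$.
Without loss of generality, we consider a rectangular domain $\tilde{D}$ containing $D$ and $\tilde{\omega}^2$ equal to $\omega^2$ on $D$ and zero outside. Decompose $\tilde{D}$ into $nm$ subrectangles $D_{ij}$ with $i = 1, \ldots, n$ and $j = 1, \ldots, m$, where $D_{ij}$ is generated by the vectors $\Delta x_i = x_i - x_{i-1}$ and $\Delta y_j = y_j - y_{j-1}$. Moreover, note
\[
\vec{\xi}_i = D\Phi_1^{-1}(\Delta x_i) \quad \text{and} \quad \vec{\eta}_j = D\Phi_1^{-1}(\Delta y_j).
\]
We write the integral of $\omega^2$ using its Riemann sum
\[
\begin{aligned}
\iint_D \omega^2 &= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x, y) \, (dx \wedge dy)(\Delta x_i, \Delta y_j) \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi_1(u, v)) \, (dx \wedge dy)(\Delta x_i, \Delta y_j) \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi_1(u, v)) \, \det\frac{\partial(x, y)}{\partial(u, v)} \, (du \wedge dv)(\vec{\xi}_i, \vec{\eta}_j) \\
&= \iint_{\Phi_1^{-1}(D)} \Phi_1^* \omega^2
\end{aligned}
\]
since the next to last line is the Riemann sum of $\Phi_1^* \omega^2$ over $\Phi_1^{-1}(D)$. A similar calculation shows that
\[
\iiint_E \omega^3 = \iiint_{R_2} \Phi_2^* \omega^3.
\]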
We now proceed with several examples of the change of variables formula.
Fig. 9.1. Region E bounded by the cylinder and paraboloid described in Example 9.2.4.
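Example 9.2.4. Find the volume of the solid that lies under the paraboloid $z = x^2 + y^2$, inside the cylinder $(x - 1)^2 + y^2 = 1$ and above the $xy$-plane. See Figure 9.1.
We begin by describing the domain $E$ enclosed by the paraboloid and the cylinder. We write
\[
E = \{(x, y, z) \mid (x, y) \in D, \; 0 \le z \le x^2 + y^2\}
\]
where
\[
D = \{(x, y) \mid 0 \le (x - 1)^2 + y^2 \le 1\}.
\]
Because the region is defined partly using a cylinder, it is a reasonable guess to move to cylindrical coordinates: $(x, y, z) = \Phi(r, \theta, z)$ where $\Phi : R \to E$. We need to determine the domain $R$. From the change of coordinates, we have
\[
0 \le z \le x^2 + y^2 \implies 0 \le z \le r^2
\]
and
\[
(x - 1)^2 + y^2 = 1 \implies (r \cos\theta - 1)^2 + r^2 \sin^2\theta = 1 \implies r = 2 \cos\theta.
\]
Thus, the projection of the interior of the cylinder is parametrized by
\[
0 \le r \le 2 \cos\theta \quad \text{and} \quad -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}.
\]
Therefore,
\[
R = \left\{ (r, \theta, z) \;\middle|\; -\frac{\pi}{2} \le \theta \le \frac{\pi}{2}, \; 0 \le r \le 2 \cos\theta, \; 0 \le z \le r^2 \right\}.
\]
We can now apply the change of variables formula:
\[
\begin{aligned}
\iiint_E dV_{(x,y,z)} &= \iiint_R |\det D\Phi| \, dV_{(r,\theta,z)} \\
&= \iiint_R r \, dV_{(r,\theta,z)} \\
&= \int_{-\pi/2}^{\pi/2} \left( \int_0^{2\cos\theta} \left( \int_0^{r^2} r \, dz \right) dr \right) d\theta \\
&= \int_{-\pi/2}^{\pi/2} 4 \cos^4\theta \, d\theta = \frac{3\pi}{2}.
\end{aligned}
\]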
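The iterated integral of Example 9.2.4 can be checked numerically. The sketch below is an illustration rather than part of the text: it applies a midpoint rule in $(r, \theta)$ after performing the $z$-integration by hand (which turns the Jacobian factor $r$ into $r \cdot r^2$). The function name and the grid sizes are assumptions chosen for the example.

```python
import math


def volume_under_paraboloid(n_theta=400, n_r=400):
    """Midpoint rule for int_{-pi/2}^{pi/2} int_0^{2 cos(theta)} r^3 dr dtheta."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = -math.pi / 2 + (i + 0.5) * d_theta
        r_max = 2 * math.cos(theta)      # boundary of the shifted disk, r = 2 cos(theta)
        d_r = r_max / n_r
        for j in range(n_r):
            r = (j + 0.5) * d_r
            # integrating the Jacobian r over 0 <= z <= r^2 gives r * r^2
            total += r**3 * d_r * d_theta
    return total


approx = volume_under_paraboloid()
print(approx, 3 * math.pi / 2)
```

The printed approximation should be close to $3\pi/2$, the value obtained analytically above.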

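Example 9.2.5. Compute $\iint_D (x^2 - xy + y^2) \, dA$ where $D$ is the region bounded by the ellipse $x^2 - xy + y^2 = 2$. First, set $h(x, y) = x^2 - xy + y^2$ and use $(x, y) = \Phi(u, v)$ given by
\[
x = \sqrt{2} \, u - \sqrt{2/3} \, v, \qquad y = \sqrt{2} \, u + \sqrt{2/3} \, v.
\]
The domain inside the ellipse is described by
\[
D = \{(x, y) \mid 0 \le x^2 - xy + y^2 \le 2\}
\]
and so $\Phi : D_1 \to D$ is a parametrization of $D$ where
\[
D_1 = \{(u, v) \mid u^2 + v^2 \le 1\}.
\]
From the change of variables formula, we obtain
\[
\iint_D h(x, y) \, dA_{(x,y)} = \iint_{D_1} h(\Phi(u, v)) \, |\det D\Phi| \, dA_{(u,v)}.
\]
We just need to compute the terms in the integrand on the right-hand side of the equality. We have
\[
h(\Phi(u, v)) = 2(u^2 + v^2) \quad \text{and} \quad |\det D\Phi| = \frac{4}{\sqrt{3}}.
\]
Therefore,
\[
\iint_D h(x, y) \, dA_{(x,y)} = \frac{8}{\sqrt{3}} \int_{-1}^{1} \left( \int_{-\sqrt{1 - v^2}}^{\sqrt{1 - v^2}} (u^2 + v^2) \, du \right) dv.
\]
The iterated integral on the right simplifies in polar coordinates. Let $(u, v) = \Phi_1(r, \theta)$ with $\Phi_1 : D_2 \to D_1$ where $D_2 = \{(r, \theta) \mid 0 \le r \le 1, \; 0 \le \theta \le 2\pi\}$. Then,
\[
\begin{aligned}
\frac{8}{\sqrt{3}} \iint_{D_1} (u^2 + v^2) \, dA_{(u,v)} &= \frac{8}{\sqrt{3}} \iint_{D_2} r^2 \, |\det D\Phi_1| \, dA_{(r,\theta)} \\
&= \frac{8}{\sqrt{3}} \int_0^1 \int_0^{2\pi} r^3 \, d\theta \, dr = \frac{4\pi}{\sqrt{3}}.
\end{aligned}
\]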
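As a hedged check of Example 9.2.5, the following Python sketch (not from the text) verifies at a sample point that the linear change of variables turns $h$ into $2(u^2 + v^2)$, and then approximates the integral over the unit disk with a midpoint rule in polar coordinates. The constant `JAC` is the value $4/\sqrt{3}$ of $|\det D\Phi|$, which follows from $\sqrt{2} \cdot \sqrt{2/3} + \sqrt{2/3} \cdot \sqrt{2}$; the function names are illustrative.

```python
import math


def phi(u, v):
    """Linear change of variables mapping the unit disk onto the elliptical region."""
    x = math.sqrt(2) * u - math.sqrt(2 / 3) * v
    y = math.sqrt(2) * u + math.sqrt(2 / 3) * v
    return x, y


def h(x, y):
    return x * x - x * y + y * y


JAC = 4 / math.sqrt(3)  # |det DPhi|, constant because Phi is linear


def integral_over_ellipse(n_r=200, n_t=200):
    """Midpoint rule for the pulled-back integrand in polar coordinates on the unit disk."""
    d_r, d_t = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * d_r
        for j in range(n_t):
            t = (j + 0.5) * d_t
            u, v = r * math.cos(t), r * math.sin(t)
            total += h(*phi(u, v)) * JAC * r * d_r * d_t
    return total


print(h(*phi(0.3, -0.4)), 2 * (0.3**2 + 0.4**2))
print(integral_over_ellipse(), 4 * math.pi / math.sqrt(3))
```

Each printed pair should agree closely; the second pair compares the numerical integral with $4\pi/\sqrt{3}$.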
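Example 9.2.6. Compute the volume inside the torus given by
\[
x = (2 + \cos v) \cos u, \quad y = (2 + \cos v) \sin u, \quad z = \sin v
\]
where $0 \le u \le 2\pi$ and $0 \le v \le 2\pi$. The parametrization of the region inside the torus is similar to the spherical coordinate case. The centre circle of the tube lies at distance 2 from the rotation axis (the $z$-axis) and the tube radius is 1. The region $E$ inside the torus is parametrized by $\Phi : R \to E$ given by
\[
\Phi(\rho, u, v) = ((2 + \rho \cos v) \cos u, \; (2 + \rho \cos v) \sin u, \; \rho \sin v)
\]
from a domain
\[
R = \{(\rho, u, v) \mid 0 \le \rho \le 1, \; 0 \le u \le 2\pi, \; 0 \le v \le 2\pi\}.
\]
Thus,
\[
\iiint_E dV_{(x,y,z)} = \iiint_R |\det D\Phi| \, dV_{(\rho,u,v)}
\]
where
\[
\det D\Phi = \det\begin{pmatrix}
\cos v \cos u & -(2 + \rho \cos v) \sin u & -\rho \sin v \cos u \\
\cos v \sin u & (2 + \rho \cos v) \cos u & -\rho \sin v \sin u \\
\sin v & 0 & \rho \cos v
\end{pmatrix} = \rho (2 + \rho \cos v) \ge 0.
\]
This leads to
\[
\begin{aligned}
\iiint_E dV_{(x,y,z)} &= \int_0^1 \int_0^{2\pi} \int_0^{2\pi} \rho (2 + \rho \cos v) \, dv \, du \, d\rho \\
&= \int_0^1 \int_0^{2\pi} \left. (2\rho v + \rho^2 \sin v) \right|_0^{2\pi} du \, d\rho \\
&= \int_0^1 \int_0^{2\pi} 4\pi\rho \, du \, d\rho \\
&= \int_0^1 8\pi^2 \rho \, d\rho \\
&= 4\pi^2.
\end{aligned}
\]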
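The torus computation can be cross-checked without using the closed form of the determinant. The Python sketch below is an illustration, not the author's method: it approximates the 3x3 Jacobian determinant of the solid-torus parametrization by central differences and sums it with a midpoint rule over the box $R$. The names `torus_map`, `jacobian_det` and `torus_volume`, the step `h` and the grid size `n` are assumptions.

```python
import math


def torus_map(rho, u, v):
    """Solid-torus parametrization: tube radius rho <= 1 around a circle of radius 2."""
    ring = 2 + rho * math.cos(v)
    return (ring * math.cos(u), ring * math.sin(u), rho * math.sin(v))


def jacobian_det(f, point, h=1e-6):
    """3x3 Jacobian determinant of f at point, using central differences."""
    cols = []
    for k in range(3):
        hi, lo = list(point), list(point)
        hi[k] += h
        lo[k] -= h
        cols.append([(s - t) / (2 * h) for s, t in zip(f(*hi), f(*lo))])
    c0, c1, c2 = cols  # partial derivatives with respect to rho, u, v
    cross = (c1[1] * c2[2] - c1[2] * c2[1],
             c1[2] * c2[0] - c1[0] * c2[2],
             c1[0] * c2[1] - c1[1] * c2[0])
    # the determinant with columns c0, c1, c2 equals c0 . (c1 x c2)
    return sum(s * t for s, t in zip(c0, cross))


def torus_volume(n=24):
    """Midpoint rule for the integral of |det DPhi| over [0,1] x [0,2pi] x [0,2pi]."""
    d_rho, d_ang = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            u = (j + 0.5) * d_ang
            for k in range(n):
                v = (k + 0.5) * d_ang
                total += abs(jacobian_det(torus_map, (rho, u, v))) * d_rho * d_ang * d_ang
    return total


volume = torus_volume()
print(volume, 4 * math.pi**2)
```

The numerical determinant at any sample point can also be compared with $\rho(2 + \rho \cos v)$, and the volume with $4\pi^2$.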
Exercises
(1) Set up the iterated integral for the following integrals of 2-forms and 3-forms
over the given domain.
(a) Let (x, y) = (x2 y 2 )dxdy and the domain D is parametrized by (u, v) =
(u cosh(v), u sinh(v)) with 1 u 3 and 0 v .
(b) Let (x, y) = yex with domain D parametrized by (u, v) = (ln(uv), uv)
where 1 u 2, 1 v 3.
(c) Let
x2 y2 z2

(x, y, z) = + + dx dy dz
22 12 32
and the domain E given by the parametrization (u, v, w) = (2u, v, 3w)
with 0 u 1, 0 v 2, 1 w 1.
(d) Let (x, y, z) = (x2 + y 2 z 2 )dx dy dz and the domain E is parametrized
by (, , ) = (cosh cos cos , c cosh , cos , sin , sinh sin ) where
0 < 1, /2 /2 and 0 .
(2) Evaluate the following integrals

(a) S (x + y) dA where S is the region inside the disk x2 +y 2 = a2 and between
the lines y = x and y = x and containing the y-axis.

(b) D tan(x2 + y 2 ) dA where D is the disk of radius 1 centered at (0, 0).
(3) Let T be the region bounded by the triangle with vertices (1, 0), (0, 1) and
(1, 0). Evaluate
2
yx
dA
T y+x
by using the change of variables u = y x, v = y + x.
(4) Consider the region E bounded by the tetrahedron with vertices (0, 0, 0), (0, 1, 1),

( 3/2, 1/2, 1), ( 3/2, 1/2, 1). Evaluate

(xz + 2y) dV
E
using the change of variables

z x z x z
u=y , v= , w= + .
2 3 2 3 2
(5) In each case, find the volume of the given region.
(a) The region inside the cone of equation $z = \sqrt{x^2 + y^2}$, below the sphere of equation $x^2 + y^2 + z^2 = a^2$.
(b) The region located in the first octant between the planes of equation $y = 0$, $y = x$ and inside the ellipsoid of equation
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.
\]
(6) Consider a colony of bacteria living in a Petri dish in the shape of a disk of
radius 10cm. Suppose that the density of bacteria is estimated by (x, y) =
3 2 exp((x2 + y 2 )1 ) bacteria/cm2 . Compute the total number of bacteria in
the dish and determine the average density of bacteria.
(7) Consider an electrically charged region E in the shape of a torus as in Exam-
ple 9.2.6. We assume the charge density in E (given in Coulomb per square
meters C/m2 ) to be given by a function (x, y, z) = z(x2 + y 2 ). Evaluate the
total charge density in E.
(8) Consider a ball of radius 8 cm filled with sand lying on a table and assume that
the south pole is at the origin of R3 . Suppose that the mass density is estimated

by the function (x, y, z) = 5 z/4 g/cm3 . Find the centre of mass (x, y, z) of
the ball by computing the integrals

1 1 1
x= x dV, y = y dV, z = z dV
M E M E M E
where E is the domain bounded by the ball.
9.3 Integrals on a surface
Consider a surface $S \subset \mathbb{R}^3$ and a point $p \in S$. We begin by defining an area form, but for parallelograms defined in $T_pS$. Recall that if $\vec{\xi}, \vec{\eta}$ are vectors in $\mathbb{R}^3$ then the area of the parallelogram $P(\vec{\xi}, \vec{\eta})$ is given by the formula
\[
A(P) = \sqrt{(dy \wedge dz)^2(\vec{\xi}, \vec{\eta}) + (dz \wedge dx)^2(\vec{\xi}, \vec{\eta}) + (dx \wedge dy)^2(\vec{\xi}, \vec{\eta})}.
\]
See Section 7.1.

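Definition 9.3.1. Let $S \subset \mathbb{R}^3$ be a surface for which a unique tangent space exists at each point $p \in S$ and let $\vec{\xi}, \vec{\eta}$ be vectors in $T_pS$. Then, the surface area form is defined by
\[
dS(\vec{\xi}, \vec{\eta}) = \sqrt{(dy \wedge dz)^2(\vec{\xi}, \vec{\eta}) + (dz \wedge dx)^2(\vec{\xi}, \vec{\eta}) + (dx \wedge dy)^2(\vec{\xi}, \vec{\eta})}.
\]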
The surface area form is non-negative and is a generalization of the area form to any
two-dimensional surface in R3 . We now use the surface area form to define various
surface integrals.
9.3.1 Surface integral
We begin with the surface integral of a real-valued function $f$ on a surface $S$. This is the simplest type of integral on a surface and one of its main uses is to compute the area of surfaces.
Consider a rubber sheet of area $A$ and a curved surface $S$ we would like to cover with the rubber sheet. We take the rubber sheet, stretch it and bend it to fit on the surface. What is the surface area of the curved surface? If we do not have to stretch the rubber sheet and we can cover the surface entirely, then it is intuitively clear that the curved surface also has area $A$. Suppose a stretching is needed, and the stretching and bending can be described by the mapping $\Phi$. We show in this section how to use the mapping $\Phi$ to compute the surface area of $S$. The pullback transformation is the key ingredient here.
Fig. 9.2. Surface S decomposed via a grid into portions Sij
Consider a surface $S$ which has a tangent space at all points, and is partitioned into a grid as in Figure 9.2 with $k_i$, the $i$th curve parallel to the $y$-axis, and $\ell_j$, the $j$th curve parallel to the $x$-axis on $S$. Let $q_i(t)$ be a parametrization of $k_i$ with $t \in [0, 1]$ and $r_j(\tau)$ a parametrization of $\ell_j$ with $\tau \in [0, 1]$. Let $p_{ij} = q_i(t_j) = r_j(\tau_i)$ be the intersection point of $k_i$ and $\ell_j$ and define
\[
\vec{\xi}_{ij} := q_i'(t_j)(t_{j+1} - t_j), \qquad \vec{\eta}_{ij} := r_j'(\tau_i)(\tau_{i+1} - \tau_i),
\]
two vectors based at $p_{ij}$ and tangent to $k_i$ and $\ell_j$ respectively. The parallelogram $P(\vec{\xi}_{ij}, \vec{\eta}_{ij})$ approximates the portion of surface $S_{ij}$ bounded by the curves $k_i$, $k_{i+1}$, $\ell_j$, $\ell_{j+1}$ and it has area $dS(\vec{\xi}_{ij}, \vec{\eta}_{ij})$. By refining the grid, see Figure 9.3, the approximation improves since each $S_{ij}$ becomes flatter as the pieces become smaller.
Fig. 9.3. Refinement of the grid on the surface $S$.
Consider now a function $f : S \to \mathbb{R}$. The value of $f$ can be approximated at the points $p_{ij}$. We use this construction to define the integral of $f$ on the surface $S$.
Definition 9.3.2. Let $S$ be a differentiable surface and $f(x, y, z)$ a function taking real values on $S$. That is, $f(p) \in \mathbb{R}$ for all $p \in S$. Then,
\[
\iint_S f(x, y, z) \, dS
\]
is called the surface integral of $f$ on $S$ and its value is given by the limit of the Riemann sum
\[
\lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(p_{ij}) \, dS(\vec{\xi}_{ij}, \vec{\eta}_{ij}), \tag{9.2}
\]
if it exists as the size of $\vec{\xi}_{ij}, \vec{\eta}_{ij}$ tends to 0 as $n, m \to \infty$.
Now that we have a definition of the surface integral, we need to obtain a computational formula to evaluate the integral. For this, we need $S$ to be described by a parametrization and our goal is to write $dS$ in terms of this parametrization.
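Let $\Phi : D \to S$, $\Phi(u, v) = (x(u, v), y(u, v), z(u, v))$ be a parametrization of $S$. Let $(u_0, v_0) \in D$ and $p = \Phi(u_0, v_0) \in S$. Let $\vec{\xi}, \vec{\eta} \in T_{(u_0,v_0)}\mathbb{R}^2$, then $D\Phi(\vec{\xi}), D\Phi(\vec{\eta}) \in T_pS$ and
\[
\begin{aligned}
(dy \wedge dz)(D\Phi(\vec{\xi}), D\Phi(\vec{\eta})) &= \det\frac{\partial(y, z)}{\partial(u, v)} \, (du \wedge dv)(\vec{\xi}, \vec{\eta}), \\
(dz \wedge dx)(D\Phi(\vec{\xi}), D\Phi(\vec{\eta})) &= \det\frac{\partial(z, x)}{\partial(u, v)} \, (du \wedge dv)(\vec{\xi}, \vec{\eta}), \\
(dx \wedge dy)(D\Phi(\vec{\xi}), D\Phi(\vec{\eta})) &= \det\frac{\partial(x, y)}{\partial(u, v)} \, (du \wedge dv)(\vec{\xi}, \vec{\eta}).
\end{aligned}
\]
Therefore, we obtain
\[
dS(D\Phi(\vec{\xi}), D\Phi(\vec{\eta})) = \sqrt{\left( \det\frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(x, y)}{\partial(u, v)} \right)^2} \; (du \wedge dv)(\vec{\xi}, \vec{\eta})
\]
where the right-hand side is similar to a pullback of $dS$ via $\Phi$, and we define
\[
(\Phi^* dS) := \sqrt{\left( \det\frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(x, y)}{\partial(u, v)} \right)^2} \; (du \wedge dv).
\]
This now leads to a computational formula for the surface integral in terms of the pullback of $dS$.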
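Proposition 9.3.3. Let $S$ be a surface parametrized by $\Phi : D \subset \mathbb{R}^2 \to S$ and let $f : S \to \mathbb{R}$ be a continuous function. Then,
\[
\iint_S f \, dS = \iint_D f(\Phi(u, v)) \, (\Phi^* dS).
\]
Proof. Consider the elements in the Riemann sum (9.2): $p_{ij} = \Phi(u_i, v_j)$ for some $(u_i, v_j) \in D$, and so $f(p_{ij}) = f(\Phi(u_i, v_j))$. Moreover,
\[
(\vec{\xi}_{ij}, \vec{\eta}_{ij}) = (D\Phi(\tilde{\xi}_{ij}), D\Phi(\tilde{\eta}_{ij}))
\]
for some $\tilde{\xi}_{ij}, \tilde{\eta}_{ij} \in T_{(u_i,v_j)}\mathbb{R}^2$. Then
\[
dS(\vec{\xi}_{ij}, \vec{\eta}_{ij}) = (\Phi^* dS)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij}).
\]
Substituting in (9.2) we obtain
\[
\iint_S f \, dS = \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(\Phi(u_i, v_j)) \sqrt{\left( \det\frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(x, y)}{\partial(u, v)} \right)^2} \; (du \wedge dv)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij})
\]
where the right-hand side is the Riemann sum leading to the double integral of the function
\[
f(\Phi(u, v)) \sqrt{\left( \det\frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \det\frac{\partial(x, y)}{\partial(u, v)} \right)^2}
\]
on $D$. This proves the result.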
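We now look at several examples, beginning with the area of surfaces, which is obtained if $f(x, y, z) = 1$.
Example 9.3.4. Suppose $S$ is a sphere of radius $a$ with parametrization $\Phi(\phi, \theta)$ given by
\[
x = a \cos\theta \sin\phi, \quad y = a \sin\theta \sin\phi, \quad z = a \cos\phi
\]
where $D = \{(\phi, \theta) \mid 0 \le \phi \le \pi, \; 0 \le \theta \le 2\pi\}$. Then,
\[
\begin{aligned}
\iint_S dS &= \iint_D (\Phi^* dS) \\
&= \int_0^{2\pi} \int_0^{\pi} \sqrt{\left( \det\frac{\partial(y, z)}{\partial(\phi, \theta)} \right)^2 + \left( \det\frac{\partial(z, x)}{\partial(\phi, \theta)} \right)^2 + \left( \det\frac{\partial(x, y)}{\partial(\phi, \theta)} \right)^2} \; d\phi \, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi} \sqrt{(a^2 \cos\theta \sin^2\phi)^2 + (a^2 \sin\theta \sin^2\phi)^2 + (a^2 \cos^2\theta \sin\phi \cos\phi + a^2 \sin^2\theta \sin\phi \cos\phi)^2} \; d\phi \, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi} a^2 \sin\phi \; d\phi \, d\theta \\
&= 4\pi a^2.
\end{aligned}
\]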
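The surface area formula of Proposition 9.3.3 with $f = 1$ can be tried out numerically. The sketch below is an illustration under stated assumptions, not part of the text: it approximates the three Jacobian minors by central differences, forms the square root appearing in $\Phi^* dS$, and sums over a midpoint grid. The helper names, the radius `a = 1.5` and the grid sizes are arbitrary choices.

```python
import math


def sphere(a):
    """Return the parametrization Phi(phi, theta) of the sphere of radius a."""
    def param(p, t):
        return (a * math.cos(t) * math.sin(p),
                a * math.sin(t) * math.sin(p),
                a * math.cos(p))
    return param


def area_density(param, u, v, h=1e-6):
    """Square root of the sum of the squared 2x2 Jacobian minors at (u, v)."""
    xu, yu, zu = [(s - t) / (2 * h) for s, t in zip(param(u + h, v), param(u - h, v))]
    xv, yv, zv = [(s - t) / (2 * h) for s, t in zip(param(u, v + h), param(u, v - h))]
    m_yz = yu * zv - zu * yv
    m_zx = zu * xv - xu * zv
    m_xy = xu * yv - yu * xv
    return math.sqrt(m_yz**2 + m_zx**2 + m_xy**2)


def surface_area(param, u_range, v_range, n_u=200, n_v=50):
    """Midpoint rule for the double integral of the area density over a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n_u, (v1 - v0) / n_v
    total = 0.0
    for i in range(n_u):
        for j in range(n_v):
            total += area_density(param, u0 + (i + 0.5) * du, v0 + (j + 0.5) * dv) * du * dv
    return total


a = 1.5
approx = surface_area(sphere(a), (0.0, math.pi), (0.0, 2 * math.pi))
print(approx, 4 * math.pi * a**2)
```

The same `surface_area` helper should work for any smooth parametrization on a rectangular parameter domain, which is why the sphere is passed in as a function.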
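We now look at a family of examples of surface integrals for surfaces $S$ given by $z = g(x, y)$ where $g : D \subset \mathbb{R}^2 \to \mathbb{R}$ is differentiable and $f : S \to \mathbb{R}$. The parametrization is $\Phi(x, y) = (x, y, g(x, y))$, with $\Phi : D \to S$. Then,
\[
\det\frac{\partial(y, z)}{\partial(x, y)} = -\frac{\partial g}{\partial x}, \quad \det\frac{\partial(z, x)}{\partial(x, y)} = -\frac{\partial g}{\partial y} \quad \text{and} \quad \det\frac{\partial(x, y)}{\partial(x, y)} = 1.
\]
Therefore,
\[
\iint_S f \, dS = \iint_D f(\Phi(x, y)) \sqrt{1 + \left( \frac{\partial g}{\partial x} \right)^2 + \left( \frac{\partial g}{\partial y} \right)^2} \; dx \, dy.
\]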
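Example 9.3.5. Consider the elliptic paraboloid $S$ given by $g(x, y) = x^2 + y^2$ defined over $D$, the disk of radius 2:
\[
D = \{(r, \theta) \mid 0 \le r \le 2, \; 0 \le \theta \le 2\pi\}.
\]
We write the formula and convert to polar coordinates:
\[
\iint_S dS = \iint_D \sqrt{1 + 4x^2 + 4y^2} \; dx \, dy = \int_0^{2\pi} \int_0^2 \sqrt{1 + 4r^2} \; r \, dr \, d\theta.
\]
The $r$-integral is obtained by the substitution $s = 1 + 4r^2$ and so
\[
\iint_S dS = \frac{\pi (17^{3/2} - 1)}{6}.
\]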
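For a graph $z = g(x, y)$ over a disk with $g(x, y) = x^2 + y^2$, the area integrand $\sqrt{1 + g_x^2 + g_y^2}$ equals $\sqrt{1 + 4r^2}$ in polar coordinates, and evaluating $\int_0^{2\pi} \int_0^2 \sqrt{1 + 4r^2} \, r \, dr \, d\theta$ in closed form with the substitution $s = 1 + 4r^2$ gives $\pi(17^{3/2} - 1)/6$. The Python sketch below, an illustration rather than part of the text, checks this value with a midpoint rule; the function name `graph_area_polar` and its argument `grad_norm_sq` are assumptions.

```python
import math


def graph_area_polar(grad_norm_sq, radius, n_r=2000):
    """Area of z = g(x, y) over a disk when g_x^2 + g_y^2 depends only on r.

    grad_norm_sq(r) must return g_x^2 + g_y^2 at distance r from the origin.
    """
    d_r = radius / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * d_r
        total += math.sqrt(1.0 + grad_norm_sq(r)) * r * d_r
    return 2 * math.pi * total  # the integrand does not depend on theta


approx = graph_area_polar(lambda r: 4 * r * r, 2.0)
exact = math.pi / 6 * (17 ** 1.5 - 1)
print(approx, exact)
```

Passing a different `grad_norm_sq` gives the area of other rotationally symmetric graphs, for instance `lambda r: r * r` for $g = (x^2 + y^2)/2$.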
We now consider a few applications to quantities expressed as densities. Let $\rho(x, y, z)$ be the density of a quantity (mass, population, biomass, etc.) and restrict $\rho(x, y, z)$ to $S$. The total quantity over $S$ is then expressed using the surface integral
\[
\iint_S \rho(x, y, z) \, dS.
\]
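Example 9.3.6. We compute the mass of a thin material in the shape of a conic surface $S$ with parametric representation $\Phi(r, \theta) = (r \cos\theta, r \sin\theta, r)$ with $0 \le r \le 1$ and $0 \le \theta \le 2\pi$. Suppose $\rho(x, y, z) = z^2$ is the mass density at $(x, y, z) \in S$. Then,
\[
\begin{aligned}
m = \iint_S \rho(x, y, z) \, dS &= \iint_D r^2 \sqrt{2} \, r \, dr \, d\theta \\
&= 2\pi\sqrt{2} \int_0^1 r^3 \, dr \\
&= \frac{\sqrt{2}\,\pi}{2}.
\end{aligned}
\]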
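Example 9.3.7. We consider a situation where the density of vegetation cover on a mountain depends on the height and on whether we consider the south or north face of the mountain. Suppose the mountain has height $H > 0$ and a cone-like shape, assumed to be circularly symmetric with radius $10H$ at the base $z = 0$. We set the top of the mountain at $(x, y, z) = (0, 0, H)$ and assume the density of biomass to be given by the function
\[
\rho(x, y, z) = \begin{cases}
2C \left( 1 - \dfrac{z}{H} \right) & \text{if } 0 \le y, \\[2mm]
C \left( 1 - \dfrac{z}{H} \right) & \text{if } y < 0,
\end{cases}
\]
where $C$ is some constant. Let $S$ be the surface of the mountain given by $z = g(x, y) := H - \frac{1}{100H}(x^2 + y^2)$ with $D = \{(x, y) \mid x^2 + y^2 \le (10H)^2\}$. The total quantity of vegetation is given by
\[
\iint_S \rho(x, y, z) \, dS.
\]
The parametrization of $S$ is $\Phi(x, y) = \left( x, y, H - \frac{1}{100H}(x^2 + y^2) \right)$ with domain $D$. To obtain the integral, we begin by splitting $D = D_+ \cup D_-$ where $D_+ = D \cap \{(x, y) \mid y \ge 0\}$ and $D_- = D \cap \{(x, y) \mid y < 0\}$. We define $S_+ = S|_{y \ge 0}$ and $S_- = S|_{y < 0}$, and so $S_\pm$ is parametrized by $\Phi$ with domain $D_\pm$. Moreover, $\rho|_{D_+} = 2C\left(1 - \frac{z}{H}\right)$ and $\rho|_{D_-} = C\left(1 - \frac{z}{H}\right)$. Therefore, we can write
\[
\begin{aligned}
\iint_S \rho(x, y, z) \, dS &= \iint_{S_+} \rho(x, y, z) \, dS + \iint_{S_-} \rho(x, y, z) \, dS \\
&= \iint_{D_+} 2C \left( 1 - \frac{g(x, y)}{H} \right) \sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}} \; dx \, dy \\
&\quad + \iint_{D_-} C \left( 1 - \frac{g(x, y)}{H} \right) \sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}} \; dx \, dy.
\end{aligned}
\]
Transforming to polar coordinates in $(x, y)$, the parametrization of $S$ is given by $\Phi(r, \theta) = \left( r \cos\theta, r \sin\theta, H - \frac{1}{100H} r^2 \right)$ with domains $D_+ = \{(r, \theta) \mid r \le 10H, \; \theta \in (0, \pi)\}$ and $D_- = \{(r, \theta) \mid r \le 10H, \; \theta \in (\pi, 2\pi)\}$. Thus, we obtain
\[
\iint_{D_\pm} \left( 1 - \frac{g(x, y)}{H} \right) \sqrt{1 + \frac{4(x^2 + y^2)}{(100H)^2}} \; dx \, dy = \iint_{D_\pm} \frac{r^2}{100H^2} \sqrt{1 + \frac{4r^2}{(100H)^2}} \; r \, dr \, d\theta.
\]
Writing in terms of iterated integrals, and using the fact that the integrand is independent of $\theta$ while $\theta$ ranges over an interval of length $\pi$, we now have
\[
\iint_{D_\pm} \frac{r^2}{100H^2} \sqrt{1 + \frac{4r^2}{(100H)^2}} \; r \, dr \, d\theta = \pi \int_0^{10H} \frac{r^2}{100H^2} \sqrt{1 + \frac{4r^2}{(100H)^2}} \; r \, dr.
\]
This last integral can be solved by setting
\[
u^2 = 1 + \frac{4r^2}{(100H)^2} \quad \text{with} \quad u \, du = \frac{4}{(100H)^2} \, r \, dr.
\]
The bounds of integration are $u = 1$ and $u = \sqrt{26}/5$. We leave the details to the reader.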
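For readers who want to check their work, one possible way to carry out the substitution is sketched here. This is a reconstruction under the stated substitution, not the author's own computation; the symbol $I$ for the radial integral is introduced only for this sketch. With $u^2 = 1 + 4r^2/(100H)^2$ we have $r^2 = 2500H^2(u^2 - 1)$ and $r\,dr = 2500H^2\,u\,du$, and $r = 10H$ corresponds to $u^2 = 26/25$.

```latex
\begin{aligned}
I &:= \int_0^{10H} \frac{r^2}{100H^2} \sqrt{1 + \frac{4r^2}{(100H)^2}} \; r \, dr
   = \int_1^{\sqrt{26}/5} 25(u^2 - 1) \cdot u \cdot 2500H^2 \, u \, du \\
  &= 62500\,H^2 \int_1^{\sqrt{26}/5} (u^4 - u^2) \, du
   = 62500\,H^2 \left[ \frac{u^5}{5} - \frac{u^3}{3} \right]_1^{\sqrt{26}/5}, \\
\iint_S \rho \, dS &= 2C\pi I + C\pi I = 3\pi C\, I .
\end{aligned}
```

As a rough plausibility check, the square root stays between 1 and about 1.02 on the whole range, so $I$ should be slightly larger than $\int_0^{10H} r^3/(100H^2)\,dr = 25H^2$.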
Exercises
(1) In each case, set up the integral to compute the surface area of $S$ and, if possible, compute the integral.
(a) S is given by z = 2x2 + y 2 with domain D = {(x, y) | 0 2x2 + y 2 4}.
(b) S is the portion of the cylinder x2 + z 2 = 2 with z 0 and 3 y 3.
(c) S is given by (u, v) = (uv, uv 2 , 2 + u) with 1 u 1 and 0 v 1.
(d) S is the portion of sphere of radius 1 with 1/2 z 1.
(e) S is the surface of the tetrahedron with vertices (0, 0, 0), (1, 0, 0), (0, 1, 0)
and (0, 0, 1).
(f) S is the positively oriented cone with a vertex at (0, 0, 2) with axis given by
the z-axis and base given by the disk of radius 2 in the xy plane.
(g) S is the portion of sphere given by
(, ) = (2 cos sin , 2 sin sin , 2 cos )
with D = {(, ) | /3 2/3, /4 3/2}.

(2) Suppose that the density of population of a wildlife species A depends on the
height of the landscape and is given by the formula (x, y, z) = 0 z and does
not live at heights greater than 0 . If the landscape can be approximated by
z = x4 y 3 + 2x2 y + 3xy 2 where 1 x 1 and 0 y 1. Set up the integral
for the total population of A.
(3) Cell membranes are covered with channels enabling the transport of ions from the interior of the cell to the extracellular medium. Channels typically only enable the passage of a unique type of ion. The Na+ ion channel density on a given type of cell is 60 channels/µm² [5]. Suppose that a cell is spherically shaped with radius 2 µm. How many ion channels are on the sphere?
(4) The European otter Lutra lutra exhibits a mean hair density of about 70000 hairs/cm², not including appendages [3]. The body of the otter is 57 to 95 cm long, not counting the tail. If we assume that the body of the otter is approximately a cylinder of length $\ell$ (with $\ell \in [57, 95]$) and radius 8 cm, determine the total number of hairs on the otter.
9.3.2 Integral of a 2-form on a surface
We begin with a formal definition of the integral of a 2-form over a surface using Riemann sums. Then, we use the pullback formula for 2-forms in $\mathbb{R}^3$ via a mapping $\Phi$ acting as a parametrization of $S$ to obtain a convenient formula for the integral of 2-forms on surfaces.
Consider a surface $S$ in the context given before the definition of surface integral; that is, Definition 9.3.2. Let $R = [a, b] \times [c, d] \subset \mathbb{R}^2$ be a rectangular region such that $D \subset R$ and consider partitions of $[a, b]$ and $[c, d]$ leading to the grid of points $(x_i, y_j) \in R$ with $1 \le i \le n$ and $1 \le j \le m$. Let $p_{ij} = g(x_i, y_j) \in S$ and let $\vec{\xi}_{ij}$ and $\vec{\eta}_{ij}$ be vectors in $T_{p_{ij}}S$ parallel to the $x$ and $y$ axes, respectively.
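Definition 9.3.8. Let $S$ be a surface as described above and $\omega^2 = a_1 \, dy \wedge dz + a_2 \, dz \wedge dx + a_3 \, dx \wedge dy$ be a differentiable 2-form. Then, the integral of $\omega^2$ on $S$ is defined as
\[
\iint_S \omega^2 = \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( a_1(p_{ij}) \, (dy \wedge dz)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) + a_2(p_{ij}) \, (dz \wedge dx)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) + a_3(p_{ij}) \, (dx \wedge dy)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) \Big)
\]
if the limit of the Riemann sum on the right-hand side of the equality sign exists as the size of $\vec{\xi}_{ij}, \vec{\eta}_{ij}$ tends to 0 as $n, m \to \infty$.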
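Theorem 9.3.9. Consider a surface $S$ with a parametrization $\Phi : D \subset \mathbb{R}^2 \to \mathbb{R}^3$:
\[
\Phi(u, v) = (x(u, v), y(u, v), z(u, v)).
\]
Let $\omega$ be a 2-form, then
\[
\iint_S \omega = \iint_D \Phi^* \omega
\]
where the term on the right is a double integral in $\mathbb{R}^2$.
Proof. Begin with the Riemann sum formula of Definition 9.3.8. Set $\omega = a_1 \, dy \wedge dz + a_2 \, dz \wedge dx + a_3 \, dx \wedge dy$, $p_{ij} = \Phi(u_i, v_j) \in S$, $\vec{\xi}_{ij}, \vec{\eta}_{ij} \in T_{p_{ij}}S$ and $\tilde{\xi}_{ij} = D\Phi^{-1}(\vec{\xi}_{ij})$, $\tilde{\eta}_{ij} = D\Phi^{-1}(\vec{\eta}_{ij})$. Then
\[
\begin{aligned}
\iint_S \omega &= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( a_1(p_{ij}) \, (dy \wedge dz)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) + a_2(p_{ij}) \, (dz \wedge dx)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) + a_3(p_{ij}) \, (dx \wedge dy)(\vec{\xi}_{ij}, \vec{\eta}_{ij}) \Big) \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Big( a_1(\Phi(u_i, v_j)) \det\frac{\partial(y, z)}{\partial(u, v)} \, (du \wedge dv)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij}) \\
&\qquad + a_2(\Phi(u_i, v_j)) \det\frac{\partial(z, x)}{\partial(u, v)} \, (du \wedge dv)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij}) + a_3(\Phi(u_i, v_j)) \det\frac{\partial(x, y)}{\partial(u, v)} \, (du \wedge dv)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij}) \Big) \\
&= \lim_{n,m \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( a_1(\Phi(u_i, v_j)) \det\frac{\partial(y, z)}{\partial(u, v)} + a_2(\Phi(u_i, v_j)) \det\frac{\partial(z, x)}{\partial(u, v)} + a_3(\Phi(u_i, v_j)) \det\frac{\partial(x, y)}{\partial(u, v)} \right) (du \wedge dv)(\tilde{\xi}_{ij}, \tilde{\eta}_{ij}) \\
&= \iint_D \Phi^* \omega,
\end{aligned}
\]
where the last equality holds as one recognizes the Riemann sum of the pullback of $\omega$ in the previous line.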
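We illustrate the use of the formula of Theorem 9.3.9 in the following examples.
Example 9.3.10. Let $S$ be parametrized by $(x, y, z) = \Phi(u, v) = (u^2, u + v, v - u)$ with $D = \{(u, v) \mid 0 \le u \le 1, \; 0 \le v \le 1\}$ and
\[
\omega = 3xyz \, (dy \wedge dz) + zy \, (dz \wedge dx) + x^2 y \, (dx \wedge dy).
\]
We compute $\iint_S \omega$.
This case is straightforward because the domain $D$ and the parametrization $\Phi$ are given explicitly. All we need is to compute the pullback of $\omega$. We compute first
\[
\frac{\partial(y, z)}{\partial(u, v)} = \begin{vmatrix} 1 & 1 \\ -1 & 1 \end{vmatrix} = 2, \quad
\frac{\partial(z, x)}{\partial(u, v)} = \begin{vmatrix} -1 & 1 \\ 2u & 0 \end{vmatrix} = -2u, \quad
\frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} 2u & 0 \\ 1 & 1 \end{vmatrix} = 2u.
\]
We have
\[
\Phi^* \omega = \left( 3u^2 (u + v)(v - u)(2) + (v - u)(u + v)(-2u) + u^4 (u + v)(2u) \right) du \wedge dv
\]
and so
\[
\iint_S \omega = \int_0^1 \int_0^1 (u + v)(6u^2 v - 6u^3 - 2uv + 2u^2 + 2u^5) \, du \, dv.
\]
This last integral is straightforward and we leave the details to the reader.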
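The remaining double integral is a polynomial integral over the unit square, so it can be evaluated exactly term by term: the monomial $c\,u^i v^j$ integrates to $c / ((i+1)(j+1))$. The Python sketch below is an illustration, not part of the text; it uses the expanded pullback coefficient $6u^2v^2 - 6u^4 - 2uv^2 + 2u^3 + 2u^6 + 2u^5 v$ found in Example 9.1.3 for the same parametrization and 2-form, stored in a dictionary whose layout is an assumption of this sketch.

```python
from fractions import Fraction

# Pullback coefficient as {(power of u, power of v): coefficient}
INTEGRAND = {
    (2, 2): 6,
    (4, 0): -6,
    (1, 2): -2,
    (3, 0): 2,
    (6, 0): 2,
    (5, 1): 2,
}


def integrate_unit_square(poly):
    """Exact integral of a polynomial in (u, v) over [0, 1] x [0, 1]."""
    return sum(Fraction(c, (i + 1) * (j + 1)) for (i, j), c in poly.items())


result = integrate_unit_square(INTEGRAND)
print(result)  # 3/35
```

Working with `Fraction` avoids rounding, so the printed value can be compared directly with a hand computation.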
Example 9.3.11. Let $S$ be the surface which forms the boundary of the region determined by the cylinder $x^2 + z^2 = 1$ and the planes $y = 0$ and $x + y = 2$. See Figure 9.4. Let
\[
\omega = 5 \, (dy \wedge dz) + y \, (dz \wedge dx) + x \, (dx \wedge dy).
\]
We compute $\iint_S \omega$. This computation is much more involved than the previous one and we outline only the major steps. First, there are three separate surfaces making up $S$ and second, the parametrizations are not given explicitly. The three separate surfaces are: the disk of radius 1 centered at the origin in the $(x, z)$-plane ($S_1$), the cylinder ($S_2$) and the portion of the plane given by $x + y = 2$ ($S_3$). Therefore, we write
\[
\iint_S \omega = \iint_{S_1} \omega + \iint_{S_2} \omega + \iint_{S_3} \omega.
\]
Fig. 9.4. Surface $S$ made up of three separate surfaces $S_1$, $S_2$ and $S_3$.
$S_1$ is parametrized by $\Phi_1 : D_1 \to S_1$ where $\Phi_1$ is the polar coordinates transformation and $D_1 = \{(r, \theta) \mid 0 \le r \le 1, \; 0 \le \theta \le 2\pi\}$. We have
\[
S_2 = \{(x, y, z) \mid x^2 + z^2 = 1, \; 0 \le y \le 2 - x\}
\]
parametrized by $\Phi_2 : D_2 \to S_2$ where
\[
\Phi_2(\theta, y) = (\cos\theta, y, \sin\theta) \quad \text{and} \quad D_2 = \{(\theta, y) \mid 0 \le \theta \le 2\pi, \; 0 \le y \le 2 - \cos\theta\}.
\]
Finally,
\[
S_3 = \{(x, y, z) \mid y = 2 - x, \; x^2 + z^2 \le 1\}
\]
and $\Phi_3 : D_3 \to S_3$ is
\[
\Phi_3(r, \theta) = (r \cos\theta, \; 2 - r \cos\theta, \; r \sin\theta), \quad D_3 = \{(r, \theta) \mid 0 \le r \le 1, \; 0 \le \theta \le 2\pi\}.
\]
One now has all the information necessary to complete the problem (see Exercises).
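The link between surface integrals and integrals of 2-forms on a surface
Consider a 2-form $\omega = a_1 \, dy \wedge dz + a_2 \, dz \wedge dx + a_3 \, dx \wedge dy$ and a surface $S$ parametrized by $\Phi : D \to S$. Notice that the formula given by Theorem 9.3.9 can be transformed as follows:
\[
\begin{aligned}
\iint_S \omega &= \iint_D \left( a_1(\Phi(u, v)) \frac{\partial(y, z)}{\partial(u, v)} + a_2(\Phi(u, v)) \frac{\partial(z, x)}{\partial(u, v)} + a_3(\Phi(u, v)) \frac{\partial(x, y)}{\partial(u, v)} \right) du \, dv \\
&= \iint_D a(\Phi(u, v)) \cdot \left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right) du \, dv,
\end{aligned}
\tag{9.3}
\]
where we define
\[
a(\Phi(u, v)) := (a_1(\Phi(u, v)), a_2(\Phi(u, v)), a_3(\Phi(u, v))).
\]
Just as we do for 1-forms in Chapter 4, we can interpret the coefficients of the 2-form as the components of a vector field $a(p) := (a_1(p), a_2(p), a_3(p)) \in T_p\mathbb{R}^3$ where $p = (x, y, z)$. Thus, we see that the integrand is the scalar product of the vector field $a(p)$ with a vector
\[
\vec{n}(p) := \left( \frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)} \right)
\]
associated only with the surface $S$. Note that
\[
\|\vec{n}(p)\| = \sqrt{\left( \frac{\partial(y, z)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(z, x)}{\partial(u, v)} \right)^2 + \left( \frac{\partial(x, y)}{\partial(u, v)} \right)^2}.
\]
Therefore, normalizing the vector $\vec{n}(p)$ in the integrand of (9.3) we obtain
\[
\iint_S \omega = \iint_D a(p) \cdot \frac{\vec{n}(p)}{\|\vec{n}(p)\|} \, \|\vec{n}(p)\| \, du \, dv = \iint_S f(p) \, dS \tag{9.4}
\]
where $dS$ corresponds to $\|\vec{n}(p)\| \, du \, dv$, and
\[
f(p) := a(p) \cdot \frac{\vec{n}(p)}{\|\vec{n}(p)\|} \in \mathbb{R}.
\]
Thus, the integral of a 2-form on a surface $S$ corresponds to the surface integral of the function $f$. In the next section, we give a geometric meaning to $\vec{n}$; namely, $\vec{n}(p)$ is a vector perpendicular to $S$ at $p$ (also called a normal vector). The interpretation of the coefficients of the 2-form as a vector field and the geometric significance of $f$ are explained in Chapter 10.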
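As a small worked illustration of formula (9.4), not taken from the text, consider the unit sphere with the parametrization $\Phi(\phi, \theta) = (\cos\theta \sin\phi, \sin\theta \sin\phi, \cos\phi)$, $0 \le \phi \le \pi$, $0 \le \theta \le 2\pi$ (the order of the parameters is an assumption made here), and the 2-form $\omega = x \, dy \wedge dz + y \, dz \wedge dx + z \, dx \wedge dy$, so that $a(p) = (x, y, z)$. Computing the three Jacobian determinants with respect to $(\phi, \theta)$ gives:

```latex
\begin{aligned}
\vec{n}(p) &= (\cos\theta \sin^2\phi,\ \sin\theta \sin^2\phi,\ \sin\phi \cos\phi) = \sin\phi \,(x, y, z), \\
\|\vec{n}(p)\| &= \sin\phi, \qquad
f(p) = a(p) \cdot \frac{\vec{n}(p)}{\|\vec{n}(p)\|} = x^2 + y^2 + z^2 = 1, \\
\iint_S \omega &= \iint_S f \, dS = \iint_S dS = 4\pi .
\end{aligned}
```

In this case the function $f$ is constant, so the integral of the 2-form reduces to the area of the sphere.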
Exercises
(1) Set up the iterated integrals for the computation of the integral of the 2-form on the given surface. Compute the integral if possible.
(a) Let S be the surface given by (u, v) = (u + v, uv, u2 ) with 0 u 1,
1 v 1 and = 2xyz dy dz + 3y 2 x dz dx + zx2 dx dy.
(b) Let S be the portion of the ellipsoid
x2 z2
+ y2 + =1
4 3
with y 0 and the 2-form is = z dy dz + xy dz dx
(c) Let S be the cylinder with cross-section given by (x 1)2 + y 2 = 1 with
0 z 2 and the 2-form is = x dy dz + y dz dx.
(d) Let S be the surface of rotation generated by y = (1 x)2 with 0 x 1
around the z-axis and the 2-form is = xeyz dydz+exz dzdx+z 2 dxdy.
(e) Let S be the portion of sphere given by
(, ) = (cos sin , sin sin , cos )
with /4 /4 and the 2-form is = y dy dz + dz dx + zdx dy.

(2) Transform the integrals of the previous problem into surface integrals using
formula (9.4).
(3) Perform the computations of Example 9.3.11.
9.4 Orientation of Surfaces
Just as curves have two possible orientations, we now explain how to define and
compute the orientation on surfaces. The simplest example of a surface is a plane
and we begin with this case as the general case builds on this one.
We know that the $(x, y)$-plane can be given an orientation, positive (counterclockwise rotation) or negative (clockwise rotation). Consider now a plane $P \subset \mathbb{R}^3$, not necessarily containing the origin. We center the plane at some $p \in P$ and consider an axis perpendicular to $P$ and passing through $p$. Rotations in one direction are defined as positive while rotations in the opposite direction are defined as negative. For nonvertical planes $P$ in $(x, y, z)$-space, we say that the plane $P$ has positive orientation at $p \in P$ if the rotation around the axis at $p$, projected on the $(x, y)$-plane, is in the positive orientation. Therefore, all points on a plane can be given a positive orientation induced by the orientation on the $(x, y)$-plane. For a vertical plane, we can use the projection to the $(y, z)$-plane or the $(x, z)$-plane to define the orientation.
Fig. 9.5. Consistent orientation at points p and q of a plane P induced by the orientation on the
xy-plane.
Let $p, q \in P$. We say that the orientations at $p$ and $q$ are consistent if they are the same. Therefore, the orientation induced by the $(x, y)$-plane for all $p \in P$ is consistent. See Figure 9.5. Because orientations are well-defined and easily computable for planes, those are used for the definition of orientation for a general surface.
Definition 9.4.1. A two-dimensional surface $S$ is orientable if for any simple closed curve $C \subset S$, it is possible to prescribe a consistent orientation at each point of the curve; that is, for any $p, q \in C$, $T_qS$ and $T_pS$ have the same orientation.
Just as with planes, visualizing the orientation for two-dimensional surfaces in $\mathbb{R}^3$ is done by considering a vector perpendicular to the tangent space, see Figure 9.6.
Definition 9.4.2. Let $S$ be a two-dimensional surface and $p \in S$. A vector $\vec{n} \in T_p\mathbb{R}^3$ is called a normal vector to $S$ if $\vec{n}$ is perpendicular to $T_pS$. If $\vec{n}$ is a normal vector with $\|\vec{n}\| = 1$, it is called a unit normal vector.
Given any normal vector $\vec{n}$ at $p \in S$, the vector $-\vec{n}$ is also a normal vector. We use this duality to attach a normal vector to each orientation. We adopt the convention that the direction of the normal vector corresponding to the positive orientation should follow the right-hand rule; this means that if you bend your four fingers in the counterclockwise direction on the tangent space to the surface using the axis perpendicular to the tangent space, your thumb points in the direction of the positive normal vector.
Fig. 9.6. Surface $S$ with tangent space $T_pS$ and normal vectors $\vec{n}$.
Example 9.4.3. A sphere $S$ is orientable. Indeed, for each point $p \in S$, one can always choose the unit normal vector $\vec{n}$ perpendicular to $T_pS$ as pointing towards the outside of the sphere. Thus, for any curve $C$ joining two points $p, q \in S$, the normal vector $\vec{n}$ prescribes a consistent orientation for all tangent spaces along the curve $C$ joining $p$ and $q$.
We now conclude this section with the famous Möbius strip, which is the simplest example of a non-orientable surface.
Example 9.4.4. A Möbius strip $S$ is a surface obtained by taking a strip of length $L$ with much smaller width, giving the edges a half-turn and connecting the two ends together as shown in Figure 9.7. The Möbius strip is not orientable.
Fig. 9.7. Thin rectangle used to make a Möbius strip by twisting one end and connecting the two more remote sides.
From Figure 9.8, for a point chosen on the line in the middle of the strip, the curve returns to its original location after one revolution, but the normal vector attached to the path now points in the opposite direction from the original one. This means that as we follow a path along the middle of the strip, the orientation on the surface is not consistent. This is also true for any path which winds once around the strip. For instance, consider a point at a constant positive distance from the middle of the strip. We see in the picture that the normal vector after one revolution points in the opposite direction from the original normal vector. Joining the end point to the starting point by a straight line then carries the normal vector to minus the normal vector at the starting point.
Fig. 9.8. Paths along the Möbius strip with normal vectors attached along the paths.
9.4.1 Computation of Normal Vectors
The previous section shows that the computation of normal vectors perpendicular
to a surface is obtained by computing the tangent plane and determining a vector
perpendicular to the tangent plane. We now derive a general formula to compute
normal vectors for parametrized surfaces.
Let S be a surface parametrized by (u, v) = (x(u, v), y(u, v), z(u, v)) and p =
(u0 , v0 ) S. A normal vector ~n(p) = (nx , ny , nz ) must satisfy ~n(p) Tp S, but
we know that an infinite number of vectors satisfy this condition. Therefore, we also
impose a condition on the norm of this vector. One choice would be to restrict to unit
vectors. However, we prefer the following condition which has a stronger geometric
significance in the context of parametrized surfaces.
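Consider the horizontal and vertical paths through $(u_0, v_0)$ given by $(u, v_0)$ and
$(u_0, v)$. Then $\Phi(u, v_0)$ and $\Phi(u_0, v)$ are paths in S through p. Let $\vec{u}_1 = (1, 0)$ and
$\vec{v}_1 = (0, 1)$ be unit vectors in $T_{(u_0,v_0)}\mathbb{R}^2$ and consider the vectors
$$\vec{\xi} = D\Phi(u_0, v_0)\,\vec{u}_1 = \left.\left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\right)\right|_{(u_0,v_0)}$$
and
$$\vec{\eta} = D\Phi(u_0, v_0)\,\vec{v}_1 = \left.\left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\right)\right|_{(u_0,v_0)}.$$
See Figure 9.10 for an illustration.
Fig. 9.9. Horizontal and vertical paths through $(u_0, v_0)$.
Fig. 9.10. Surface S with vectors $\vec{\xi}$, $\vec{\eta}$ obtained using the derivative $D\Phi$.
Then, $T_p S = \mathrm{span}(\vec{\xi}, \vec{\eta})$ and so $\vec{n}(p) \cdot \vec{\xi} = \vec{n}(p) \cdot \vec{\eta} = 0$ are the orthogonality
conditions. The second condition needed to characterize $\vec{n}(p)$ is that the norm of the
normal vector should be equal to the area of the parallelogram $P(\vec{\xi}, \vec{\eta})$; that is,
$$\|\vec{n}(p)\| = A(P(\vec{\xi}, \vec{\eta})).$$
The explicit formula is a direct computation as
$$\|\vec{n}(p)\| = \sqrt{(dy \wedge dz)^2(\vec{\xi}, \vec{\eta}) + (dz \wedge dx)^2(\vec{\xi}, \vec{\eta}) + (dx \wedge dy)^2(\vec{\xi}, \vec{\eta})}.$$
Therefore, the normal vector to S at a point p = (x, y, z) is given by
$$\vec{n}(p) = \left.\left(\frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)}\right)\right|_{(u_0,v_0)}.$$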
With this choice, the normal vector is connected to the formula for dS. We now
look at some important examples for which the formulae are used repeatedly in the
next sections.
Example 9.4.5. Consider the surface given by the graph of a differentiable function
f(x, y). Then, $\Phi(u, v) = (u, v, f(u, v))$ is a parametrization and the normal vector
at a point p = (x, y, z) is given by
$$\vec{n}(p) = \left.\left(\frac{\partial(y, z)}{\partial(u, v)}, \frac{\partial(z, x)}{\partial(u, v)}, \frac{\partial(x, y)}{\partial(u, v)}\right)\right|_{(u_0,v_0)} = \left.\left(-\frac{\partial f}{\partial u}, -\frac{\partial f}{\partial v}, 1\right)\right|_{(u_0,v_0)}.$$
This normal vector corresponds to the positive orientation induced from the positive
orientation in the (x, y)-plane.
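As a short worked check of the graph case, the three Jacobian determinants can be written out explicitly. The shorthand $f_u = \partial f/\partial u$ and $f_v = \partial f/\partial v$ is introduced here only for compactness, with $x = u$, $y = v$, $z = f(u, v)$.

```latex
\begin{aligned}
\frac{\partial(y,z)}{\partial(u,v)} &= \det\begin{pmatrix} 0 & 1 \\ f_u & f_v \end{pmatrix} = -f_u, \\
\frac{\partial(z,x)}{\partial(u,v)} &= \det\begin{pmatrix} f_u & f_v \\ 1 & 0 \end{pmatrix} = -f_v, \\
\frac{\partial(x,y)}{\partial(u,v)} &= \det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1 .
\end{aligned}
```

The third component is always 1, so this normal vector always has a positive z-component.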
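Example 9.4.6. Let S be the sphere of radius 1 centered at the origin with parametrization
given by $\Phi(\phi, \theta) = (\cos\theta \sin\phi, \sin\theta \sin\phi, \cos\phi)$ with $0 \le \phi \le \pi$, $0 \le \theta \le 2\pi$.
Then,
$$\vec{n}(p) = \left(\frac{\partial(y, z)}{\partial(\phi, \theta)}, \frac{\partial(z, x)}{\partial(\phi, \theta)}, \frac{\partial(x, y)}{\partial(\phi, \theta)}\right) = (\cos\theta \sin^2\phi, \sin\theta \sin^2\phi, \sin\phi \cos\phi).$$
Note that $\vec{n}(p)$ points towards the outside of the sphere. This can be seen by letting
$\theta = 0$, $\phi = \pi/2$; then the normal vector is (1, 0, 0).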
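The normal-vector formula can be checked numerically. The following is a minimal sketch, not part of the text's development: it approximates the partial derivatives of the sphere parametrization by central differences, forms the three 2x2 Jacobian determinants, and compares the result with the closed form of Example 9.4.6. The function names, the step size `h` and the sample point are illustrative assumptions.

```python
import math

def sphere(p, t):
    """Unit sphere: Phi(phi, theta) = (cos t sin p, sin t sin p, cos p)."""
    return (math.cos(t) * math.sin(p), math.sin(t) * math.sin(p), math.cos(p))

def partials(f, u, v, h=1e-6):
    """Central-difference approximations of df/du and df/dv (3-vectors)."""
    fu = [(a - b) / (2 * h) for a, b in zip(f(u + h, v), f(u - h, v))]
    fv = [(a - b) / (2 * h) for a, b in zip(f(u, v + h), f(u, v - h))]
    return fu, fv

def normal(f, u, v):
    """(d(y,z)/d(u,v), d(z,x)/d(u,v), d(x,y)/d(u,v)) as 2x2 determinants."""
    (xu, yu, zu), (xv, yv, zv) = partials(f, u, v)
    return (yu * zv - yv * zu, zu * xv - zv * xu, xu * yv - xv * yu)

def closed_form(p, t):
    """Normal vector of the unit sphere as stated in Example 9.4.6."""
    return (math.cos(t) * math.sin(p) ** 2,
            math.sin(t) * math.sin(p) ** 2,
            math.sin(p) * math.cos(p))

# Arbitrary sample point (phi, theta) = (0.7, 1.9), chosen for illustration.
n_num = normal(sphere, 0.7, 1.9)
n_exact = closed_form(0.7, 1.9)
max_err = max(abs(a - b) for a, b in zip(n_num, n_exact))
print(n_num, n_exact, max_err)
```

The same `normal` helper applies to any parametrization with three components, for instance the cylinder or torus of the exercises below.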
Exercises
(1) Consider the surface S given by the cylinder $x^2 + y^2 = 4$ with $0 \le z \le 1$
given by the parametrization $\Phi(\theta, z) = (2\cos\theta, 2\sin\theta, z)$. Determine (graphically)
the positive orientation of S. Show that the normal vector computed with
the parametrization automatically gives the positive orientation.
(2) Compute the normal vector on a torus with parametrization $\Phi(u, v) = ((2 + \cos v)\cos u, (2 + \cos v)\sin u, \sin v)$ with $0 \le u, v \le 2\pi$.
(3) Compute the normal vectors of the surface given by the basic quadrics:
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1,$$
$$-\frac{x^2}{a^2} - \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} - z = 0,$$
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - z = 0, \qquad \frac{z^2}{c^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}.$$
(4) Compute the normal vector of the helicoid surface given by the parametric
representation
$$\Phi(\rho, \theta) = (\rho\cos(a\theta), \rho\sin(a\theta), \theta)$$
where $\rho > 0$, $\theta \in [0, 2\pi)$ and a is a fixed constant.
(5) Consider the tetrahedron with vertices at (0, 0, 0), (1, 0, 0), (0, 1, 0) and (0, 0, 1).
Determine the positively oriented normal vectors for this surface. (Hint: the
normal vectors for the coordinate planes are straightforward to obtain).
9.5 General Pullback Formula
The formulae obtained in the above section are special cases of a pullback formula
valid for any k-form, as we now explain. Let $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable
mapping and $\omega$ be a k-form in $\mathbb{R}^m$. That is, $\omega$ is evaluated at $q \in \mathbb{R}^m$ on vectors
$\xi_1, \ldots, \xi_k$ where $\xi_j \in T_q\mathbb{R}^m$ for j = 1, . . . , k:
$$\omega(q)\langle \xi_1, \ldots, \xi_k \rangle \in \mathbb{R}.$$
Using $\Phi$, we want to define a k-form on $\mathbb{R}^n$ denoted by $\Phi^*\omega$ and defined at $p \in \mathbb{R}^n$
on vectors $v_1, \ldots, v_k$ where $v_j \in T_p\mathbb{R}^n$ for j = 1, . . . , k:
$$(\Phi^*\omega)(p)\langle v_1, \ldots, v_k \rangle := \omega(\Phi(p))\langle D\Phi(v_1), \ldots, D\Phi(v_k) \rangle \qquad (9.5)$$
where the left-hand side is well-defined at a point $p \in \mathbb{R}^n$ and applied to vectors
in $T_p\mathbb{R}^n$ because $q = \Phi(p) \in \mathbb{R}^m$ and $D\Phi(v_j) \in T_q\mathbb{R}^m$ for j = 1, . . . , k. We do not
present the procedure to compute the explicit k-forms in complete generality, but
instead illustrate in the following example how one obtains the new k-form explicitly
with such a formula.
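Example 9.5.1. Let
$$\omega = a_1(x, y, z)\,dx + a_2(x, y, z)\,dy + a_3(x, y, z)\,dz$$
be a 1-form in $\mathbb{R}^3$ and consider the mapping $\Phi : \mathbb{R}^2 \to \mathbb{R}^3$. We use the pullback
formula and this gives us a 1-form at a point $p = (\alpha, \beta) \in \mathbb{R}^2$ and applied at
$v \in T_p\mathbb{R}^2$:
$$(\Phi^*\omega)(p)\langle v \rangle = a_1(\Phi(\alpha, \beta))\,dx\langle D\Phi(v) \rangle + a_2(\Phi(\alpha, \beta))\,dy\langle D\Phi(v) \rangle + a_3(\Phi(\alpha, \beta))\,dz\langle D\Phi(v) \rangle \qquad (9.6)$$
which we now unpack. We write $\Phi(\alpha, \beta) = (\Phi_1(\alpha, \beta), \Phi_2(\alpha, \beta), \Phi_3(\alpha, \beta))^T$ and for
$v \in T_p\mathbb{R}^2$, the derivative is
$$D\Phi(v) = (D\Phi_1(v), D\Phi_2(v), D\Phi_3(v))^T.$$
By definition of the differentials we have
$$dx\langle D\Phi(v) \rangle = D\Phi_1(v), \quad dy\langle D\Phi(v) \rangle = D\Phi_2(v), \quad dz\langle D\Phi(v) \rangle = D\Phi_3(v).$$
Moreover, letting $v = (d\alpha(v), d\beta(v))$, for i = 1, 2, 3 we have
$$D\Phi_i(v) = \frac{\partial \Phi_i}{\partial \alpha}\,d\alpha(v) + \frac{\partial \Phi_i}{\partial \beta}\,d\beta(v).$$
Collecting this information into equation (9.6) and simplifying, we obtain (eliminating
the dependency on v)
$$(\Phi^*\omega)(\alpha, \beta) = \left(\sum_{i=1}^{3} a_i(\Phi(\alpha, \beta))\frac{\partial \Phi_i}{\partial \alpha}\right) d\alpha + \left(\sum_{i=1}^{3} a_i(\Phi(\alpha, \beta))\frac{\partial \Phi_i}{\partial \beta}\right) d\beta.$$
Note that this example also contains the subcase of $\Phi : \mathbb{R}^2 \to \mathbb{R}^2$ if $\Phi_3 = 0$.
From this example, we can now present a result about changes of variables in line
integrals which is useful in proving Stokes' theorem in Chapter 10.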
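Proposition 9.5.2. Let $\omega$ be a 1-form in $\mathbb{R}^3$ and $\Phi : \mathbb{R}^2 \to \mathbb{R}^3$ be a differentiable
mapping. If C is a curve in $\mathbb{R}^2$, then
$$\int_C \Phi^*\omega = \int_{\Phi(C)} \omega.$$
Proof. Let $\omega = a_1(x, y, z)\,dx + a_2(x, y, z)\,dy + a_3(x, y, z)\,dz$, and $r(t) = (\alpha(t), \beta(t)) = (r_1(t), r_2(t))$
with $t \in [a, b]$ be a parametrization of C. Then, the curve $\Phi(C)$ has
parametrization given by
$$x(t) = \Phi_1(r(t)), \quad y(t) = \Phi_2(r(t)), \quad z(t) = \Phi_3(r(t))$$
with $t \in [a, b]$ and
$$dx(\Phi(r(t))) = \frac{d}{dt}\Phi_1(r(t))\,dt = \left(\frac{\partial \Phi_1}{\partial \alpha} r_1'(t) + \frac{\partial \Phi_1}{\partial \beta} r_2'(t)\right) dt$$
$$dy(\Phi(r(t))) = \frac{d}{dt}\Phi_2(r(t))\,dt = \left(\frac{\partial \Phi_2}{\partial \alpha} r_1'(t) + \frac{\partial \Phi_2}{\partial \beta} r_2'(t)\right) dt$$
$$dz(\Phi(r(t))) = \frac{d}{dt}\Phi_3(r(t))\,dt = \left(\frac{\partial \Phi_3}{\partial \alpha} r_1'(t) + \frac{\partial \Phi_3}{\partial \beta} r_2'(t)\right) dt.$$
Therefore, putting all the formulae above together we obtain
$$\int_{\Phi(C)} \omega = \int_a^b \big(a_1(\Phi(r(t)))\,dx(\Phi(r(t))) + a_2(\Phi(r(t)))\,dy(\Phi(r(t))) + a_3(\Phi(r(t)))\,dz(\Phi(r(t)))\big)$$
$$= \int_a^b \left(\left(\sum_{i=1}^{3} a_i(\Phi(r(t)))\frac{\partial \Phi_i}{\partial \alpha}\right) r_1'(t) + \left(\sum_{i=1}^{3} a_i(\Phi(r(t)))\frac{\partial \Phi_i}{\partial \beta}\right) r_2'(t)\right) dt$$
$$= \int_C \Phi^*\omega,$$
where the last equality follows from Example 9.5.1.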
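The identity of Proposition 9.5.2 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the 1-form $y\,dx + z\,dy + x\,dz$, the mapping $\Phi(\alpha, \beta) = (\alpha, \beta, \alpha\beta)$ and the unit circle C are all sample choices made here. It evaluates the integral of the pullback over C using the coefficient formula of Example 9.5.1, and the integral of the form over the image curve directly.

```python
import math

def coeffs(x, y, z):
    """Coefficients (a1, a2, a3) of the sample 1-form  y dx + z dy + x dz."""
    return (y, z, x)

def Phi(al, be):
    """Sample mapping R^2 -> R^3 (an assumption made for this illustration)."""
    return (al, be, al * be)

def dPhi(al, be):
    """Partial derivatives of Phi with respect to alpha and to beta."""
    return (1.0, 0.0, be), (0.0, 1.0, al)

def r(t):
    return (math.cos(t), math.sin(t))      # unit circle C in the (alpha, beta)-plane

def dr(t):
    return (-math.sin(t), math.cos(t))

def midpoint(f, lo, hi, n=2000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def pullback_integrand(t):
    """Integrand of the pullback form along C, built as in Example 9.5.1."""
    al, be = r(t)
    d_al, d_be = dPhi(al, be)
    a = coeffs(*Phi(al, be))
    c_al = sum(ai * di for ai, di in zip(a, d_al))   # coefficient of d(alpha)
    c_be = sum(ai * di for ai, di in zip(a, d_be))   # coefficient of d(beta)
    r1, r2 = dr(t)
    return c_al * r1 + c_be * r2

def image_integrand(t):
    """Integrand of the form along Phi(C) = (cos t, sin t, cos t sin t)."""
    x, y, z = Phi(*r(t))
    dx, dy, dz = -math.sin(t), math.cos(t), math.cos(2 * t)  # velocity, by hand
    a1, a2, a3 = coeffs(x, y, z)
    return a1 * dx + a2 * dy + a3 * dz

lhs = midpoint(pullback_integrand, 0.0, 2 * math.pi)
rhs = midpoint(image_integrand, 0.0, 2 * math.pi)
print(lhs, rhs)   # both are close to -pi for this sample
```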

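We conclude with some important properties of the interaction between wedge products,
exterior derivatives and pullbacks. These are used in the proof of Stokes' theorem,
which is presented in Chapter 10.
Proposition 9.5.3. Let $\omega_1, \ldots, \omega_\ell$ be 1-forms in $\mathbb{R}^m$, $\omega$ and $\eta$ be two forms in $\mathbb{R}^m$
and $\Phi : \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable mapping where $\Phi(x_1, \ldots, x_n) = (y_1, \ldots, y_m)$.
Then,
(1) $\Phi^*(\omega_1 \wedge \ldots \wedge \omega_\ell) = \Phi^*\omega_1 \wedge \ldots \wedge \Phi^*\omega_\ell$
(2) $\Phi^*(\omega \wedge \eta) = \Phi^*\omega \wedge \Phi^*\eta$
(3) $d(\Phi^*\omega) = \Phi^*(d\omega)$.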
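Proof. The proof of these results follows the ones found in [1]; we reproduce the one
for (1) and part of (3) here for completeness. (1) This is a straightforward application
of the general definition of pullback and the definition of wedge product of 1-forms
using the determinant. We have
$$(\Phi^*(\omega_1 \wedge \ldots \wedge \omega_\ell))(p)\langle v_1, \ldots, v_\ell \rangle = (\omega_1 \wedge \ldots \wedge \omega_\ell)(\Phi(p))\langle D\Phi(v_1), \ldots, D\Phi(v_\ell) \rangle$$
$$= \det(\omega_i(D\Phi(v_j)))$$
$$= \det(\Phi^*\omega_i(v_j))$$
$$= (\Phi^*\omega_1 \wedge \ldots \wedge \Phi^*\omega_\ell)\langle v_1, \ldots, v_\ell \rangle.$$
(2) One can use part (1) to construct the proof of this case. Working out some
simple examples may be helpful.
(3) We begin by proving the result for differentiable 0-forms $f : \mathbb{R}^m \to \mathbb{R}$. We
have
$$\Phi^*(df) = \Phi^*\left(\sum_{i=1}^{m} \frac{\partial f}{\partial y_i}\,dy_i\right) = \sum_{i=1}^{m} \left(\frac{\partial f}{\partial y_i} \circ \Phi\right) dy_i(D\Phi) = \sum_{i=1}^{m} \left(\frac{\partial f}{\partial y_i} \circ \Phi\right) \sum_{j=1}^{n} \frac{\partial \Phi_i}{\partial x_j}\,dx_j$$
where this last equality is obtained because $dy_i$ computes the differential of the ith
component of $D\Phi$ as illustrated in Example 9.5.1. This last term is then equal to
$$\sum_{i=1}^{m} \sum_{j=1}^{n} \left(\frac{\partial f}{\partial y_i} \circ \Phi\right) \frac{\partial \Phi_i}{\partial x_j}\,dx_j = \sum_{j=1}^{n} \frac{\partial (f \circ \Phi)}{\partial x_j}\,dx_j = d(f \circ \Phi) = d(\Phi^* f).$$
The remainder of the proof uses part (2) and the first part of this proof, but we do
not continue it here explicitly.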
Exercises
(1) Show that Cases 1, 2 and 3 in Section 9.1 can be obtained using the general
pullback formula (9.5).
(2) Let $\omega$ be a 2-form in $\mathbb{R}^4$ and $\Phi : \mathbb{R}^3 \to \mathbb{R}^4$ be a differentiable mapping. Compute
explicitly the 2-form in $\mathbb{R}^3$ given by $\Phi^*\omega$.
(3) Complete the proof of Proposition 9.5.3 by consulting [1].
10 Stokes Theorem and Applications
We are now at the point of presenting the main theorem of Vector Calculus: Stokes'
Theorem. In the traditional setting of Vector Calculus, where one studies vector
fields and differential operators without discussing differentiable k-forms and exterior
derivatives, Stokes' theorem comes in three flavours: Green's theorem for
vector fields in the plane, which is described in Chapter 7, Stokes' theorem for vector
fields on surfaces, and the Divergence Theorem (also known as Gauss' Theorem) for
vector fields in $\mathbb{R}^3$. However, the formulae involved are quite different in each case.
We present those three theorems using the theory of differential forms, in which the
formulae have exactly the same structure, depending only on the type of k-form and the
dimension of the geometric objects. Stating Stokes' theorem in complete
generality, for arbitrary k-forms and in any dimension, requires developing the
theory of differentiable k-forms and the formulae of exterior derivatives at a much
deeper level and is beyond the scope of this book. We recommend the reference [1]
to interested readers.
We begin this chapter with a basic discussion of orientation of curves on surfaces
as a generalization of the orientation of curves around two-dimensional domains seen
for Green's theorem in Chapter 7. We state Stokes' theorem in the three versions
mentioned above and present some examples and applications. We then state Stokes'
theorem in the classical way using the differential operators and present some
examples.
10.1 More on orientation of curves and surfaces
Before we can state Stokes' theorem in the three versions, we need to define the
orientation of curves on surfaces and the orientation of surfaces forming the boundary of
bounded regions in $\mathbb{R}^3$.
10.1.1 Orientation of curves on surfaces
Recall the convention for orienting boundary curves in Chapter 7 in the context of
Green's theorem. Let $D \subset \mathbb{R}^2$ be a subset with boundary denoted by $\partial D$ given by
only one simple closed curve C. We say that C is oriented positively if it is travelled in
the counterclockwise direction. There is also the more subtle situation where the domain
D has holes in its interior, bounded by simple closed curves $C_1, \ldots, C_k$. Then the
curves $C_1, \ldots, C_k$ have positive orientation if the orientation is in the clockwise
direction.
One way to understand the convention described here is to consider the xy-plane
in $\mathbb{R}^3$ and take the unit normal vector $\vec{n}$ pointing in the positive z direction; then,
as the normal vector is moved along C, the domain D remains on the left. In the case
of a domain with holes, this choice is made so that the domain D is, again, always
on the left of the normal vector $\vec{n}$ as it is moved along any of the curves $C_1, \ldots, C_k$.
Figure 10.2 gives a schematic of this process.
Fig. 10.1. Curve C made up of three distinct curves, bounding a domain in the plane where the
orientation of the inner curves is opposite the orientation of the outer curve.
Fig. 10.2. The domain D lies to the left of the normal vector as it is moved along
the curves $C_1$ and $C_2$.
Let S be an orientable surface with boundary $C = \partial S$ (and no holes). We induce
an orientation on C using the orientation of S. Because S is orientable, we can choose
a consistent orientation for all tangent spaces $T_p S$. Even though $T_p S$ may not be
defined at points $p \in C$, we impose the same orientation as S on the boundary curve
C by a straightforward limiting process; that is, we take the normal vector in the
direction consistent with normal vectors on $S \setminus \partial S$. Then, the orientation on C is
defined so that as the normal vector $\vec{n}$ is moved along C, the surface S must remain
on the left.
If S has holes, determined by simple closed boundary curves $C_1, \ldots, C_k$, then
the orientations induced from the surface S are obtained as above by moving the
induced normal vector $\vec{n}$ along the curve $C_i$ (for each i = 1, . . . , k) so that the
surface S remains on the left. Note that the induced orientations of the boundary
curves $C_1, \ldots, C_k$ of holes in S are opposite to the orientation of the (outside)
boundary curve C. See Figure 10.3.
Example 10.1.1. Consider the surface S given by the unit sphere with $z \ge 0$. Then, the
boundary curve is given by the circle $x^2 + y^2 = 1$ at z = 0. The orientation on
S is obtained using the parametrization $\Phi(\phi, \theta) = (\sin\phi \cos\theta, \sin\phi \sin\theta, \cos\phi)$ with
$0 \le \phi \le \pi/2$ and $0 \le \theta \le 2\pi$. At a point $p \in S$,
$$\vec{n}(p) = (\cos\theta \sin^2\phi, \sin\theta \sin^2\phi, \sin\phi \cos\phi).$$
At the boundary, the normal vector is $\vec{n}(\Phi(\pi/2, \theta)) = (\cos\theta, \sin\theta, 0)$ which points
away from (0, 0, 0). This means the positive orientation along $\partial S$ is in the
counterclockwise direction. See Figure 10.4.
Fig. 10.3. Surface with outside boundary curve positively oriented in the counterclockwise
direction. The boundary curve inside is oriented positively in the clockwise direction.
Fig. 10.4. Surface S of Example 10.1.1 with positively oriented boundary curve.
10.1.2 Orientation of boundary surfaces
Let $E \subset \mathbb{R}^3$ be an open subset. E is said to be bounded if there exists a ball B
of radius R centered at the origin such that $E \subset B$. Moreover, we say that E is
connected if for any two points $p_0, p_1 \in E$, there exists a continuous curve $C \subset E$
joining $p_0$ to $p_1$ (a more general definition of connectedness exists, but we do not
need it).
Fig. 10.5. Three-dimensional region E bounded by the closed surface $\partial E$ with positive
orientation from the outward pointing normal vectors.
Consider an open bounded and connected set E such that its boundary S is an
orientable piecewise smooth surface, made up of a collection of smooth (orientable)
surfaces $S_1, \ldots, S_k$. We denote $S = \partial E$. We also refer to E as a solid region. The
boundary of E is known as a closed surface as it separates $\mathbb{R}^3$ into two disjoint
regions: E which lies inside $\partial E$ and $\mathbb{R}^3 \setminus \overline{E}$ outside of $\partial E$. Then, we say that $\partial E$
is positively oriented if the normal vectors $\vec{n}$ at all points $p \in S$ point towards
$\mathbb{R}^3 \setminus \overline{E}$. We determine the positive orientation of $\partial E$ for some often encountered
parametrizations of surfaces.
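Example 10.1.2. Consider a sphere S of radius R with parametrization
$$\Phi(\phi, \theta) = (R\cos\theta \sin\phi, R\sin\theta \sin\phi, R\cos\phi).$$
S is the boundary of the region E which is a ball of radius R,
$$E = \{(x, y, z) \mid x^2 + y^2 + z^2 < R^2\}.$$
Then, the normal vector is given by
$$\vec{n}(p) = \left(\det\frac{\partial(y, z)}{\partial(\phi, \theta)}, \det\frac{\partial(z, x)}{\partial(\phi, \theta)}, \det\frac{\partial(x, y)}{\partial(\phi, \theta)}\right) = R^2(\cos\theta \sin^2\phi, \sin\theta \sin^2\phi, \sin\phi \cos\phi),$$
which is $R^2$ times the vector computed in Example 9.4.6, and so the parametrization
automatically yields the positive orientation on the sphere. This can be verified by
setting $\phi = \pi/2$ for which
$$\vec{n}(p) = (R^2\cos\theta, R^2\sin\theta, 0)$$
points towards $\mathbb{R}^3 \setminus \overline{E}$.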
Example 10.1.3. Consider the region E bounded by the cylinder over the ellipse
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
with a, b > 0, and the planes z = k and z = h for k < h. A parametrization of the
cylinder is given in cylindrical coordinates by
$$\Phi(\theta, z) = (a\cos\theta, b\sin\theta, z).$$
Thus,
$$\vec{n}(p) = \left(\frac{\partial(y, z)}{\partial(\theta, z)}, \frac{\partial(z, x)}{\partial(\theta, z)}, \frac{\partial(x, y)}{\partial(\theta, z)}\right) = (b\cos\theta, a\sin\theta, 0).$$
By evaluating at $\theta = 0$, we have the vector $\vec{n} = (b, 0, 0)$ at the point (a, 0, z), which points
towards $\mathbb{R}^3 \setminus \overline{E}$ and has positive orientation. Parametrizations of the planes are given by
$\Phi_A(x, y) = (x, y, A)$ where A = k and A = h. Then, $\vec{n}(p) = (0, 0, 1)$ which in the case of the plane
z = h points to the outside of E. However, for z = k, we must take $\vec{n}(p) = (0, 0, -1)$
in order to have positive orientation.
Fig. 10.6. Cylinder bounded at z = k and z = h with k < h.
Fig. 10.7. Normal vectors $\vec{n}$ computed using the formula from the parametrization in
Example 10.1.4. To obtain a positive orientation, the normal vectors of the cone must be chosen in
the opposite direction.
Example 10.1.4. Let E be the region bounded by the cone $z^2 = x^2 + y^2$ and the
elliptic paraboloid $z = 4 - x^2 - y^2$ with z > 0. Parametrizations are given by
$\Phi_1(r, \theta) = (r\cos\theta, r\sin\theta, r^2)$ and $\Phi_2(r, \theta) = (r\cos\theta, r\sin\theta, 4 - r^2)$ respectively. For
the cone we have
$$\vec{n}(p) = \left(\frac{\partial(y, z)}{\partial(r, \theta)}, \frac{\partial(z, x)}{\partial(r, \theta)}, \frac{\partial(x, y)}{\partial(r, \theta)}\right) = (-2r^2\cos\theta, -2r^2\sin\theta, r).$$
Note that the normal vector cannot be computed at the vertex of the cone. Set $\theta = 0$;
then $\vec{n} = (-2r^2, 0, r)$ points to the inside of the region E. Therefore, the positive orientation
is given by $(2r^2\cos\theta, 2r^2\sin\theta, -r)$.
For the elliptic paraboloid we have
$$\vec{n}(p) = \left(\frac{\partial(y, z)}{\partial(r, \theta)}, \frac{\partial(z, x)}{\partial(r, \theta)}, \frac{\partial(x, y)}{\partial(r, \theta)}\right) = (2r^2\cos\theta, 2r^2\sin\theta, r)$$
and we can check that at $\theta = 0$, we obtain $\vec{n}(r, 0, 4 - r^2) = (2r^2, 0, r)$ and this vector
points towards $\mathbb{R}^3 \setminus \overline{E}$ and so has positive orientation. Figure 10.7 shows a few normal
vectors.
Exercises
(1) Consider the torus with parametrization $\Phi(u, v) = ((2 + \cos v)\cos u, (2 + \cos v)\sin u, \sin v)$
with $0 \le u \le 2\pi$ and $0 \le v \le 2\pi$. Let S be the portion of the
torus with $y \ge 0$. Identify the boundary of S and find a parametrization so that
$\partial S$ has positive orientation.
(2) Consider the sphere of radius 1 and the plane P given by the equation $2z - y - 1 = 0$.
Let the surface S be the portion of the sphere lying above the plane P.
Determine the boundary curve of S and find a parametrization such that $\partial S$
has positive orientation.
(3) Consider the plane P of equation $z = 2 - x - y$ lying in the first octant and consider
the circular cylinder of radius 1/4 centered at (x, y) = (1, 1/2). The surface
S is the portion of the plane P lying outside the cylinder. Identify the boundary
curve $\partial S$ and find a parametrization so that $\partial S$ has positive orientation.
(4) Let E be the region in the first octant $x, y, z \ge 0$ where $z \le 2$ and $x^2 + y^2 \le 4$.
Find the normal vectors that give $\partial E$ its positive orientation.
(5) Let E be the region bounded by the planes y = 0, $y = 3 - x$ and the cylinder
$x^2 + z^2 = 1$. Find the normal vectors that give $\partial E$ its positive orientation.
(6) Let E be the region bounded by the cylinder $x^2 + y^2 = a^2$, the sphere $x^2 + y^2 + z^2 = 2a^2$,
and not containing the z-axis. Find the normal vectors that give $\partial E$
its positive orientation.
10.2 Stokes Theorem
We now come to the statement of Stokes' theorem in the three versions. First, we rewrite
Green's theorem about the integration of 1-forms along curves in $\mathbb{R}^2$ forming the
boundary of some region D. Then, we state what is classically known as Stokes' theorem for
the integration of 1-forms along curves forming the boundary of a surface S. Finally,
we state the Divergence theorem (or Gauss' theorem) about the integration of 2-forms on
closed surfaces.
Theorem 10.2.1 (Stokes' Theorem).
(1) Green's Theorem (flat surfaces): Let $D \subset \mathbb{R}^2$ and $\partial D$ be the boundary of D
oriented positively. If $\omega$ is a 1-form with $C^1$ coefficients then
$$\int_{\partial D} \omega = \int_D d\omega.$$
(2) Classical Stokes' theorem: Let $S \subset \mathbb{R}^3$ be a positively oriented surface with
boundary curve $\partial S$ having the induced orientation. If $\omega$ is a 1-form with $C^1$
coefficients in $\mathbb{R}^3$ then
$$\int_{\partial S} \omega = \int_S d\omega.$$
(3) Divergence theorem (Gauss' Theorem): Let $E \subset \mathbb{R}^3$ be a solid region with
boundary surface $\partial E$ oriented positively. If $\omega$ is a 2-form with $C^1$ coefficients in
$\mathbb{R}^3$, then
$$\int_{\partial E} \omega = \int_E d\omega.$$
The proofs of these theorems are deferred to Section 10.2.1. We first show the power
of these results with some examples, beginning with the Classical Stokes' theorem.
Example 10.2.2. Let
$$\omega = (y + \sin x)\,dx + (z^2 + \cos y)\,dy + x^3\,dz$$
and C the curve given by $r(t) = (\sin t, \cos t, \sin 2t)$ for $0 \le t \le 2\pi$. Let us evaluate
$$\int_C \omega.$$
Note that if one attempts to compute this integral directly by using the pullback via r,
the resulting integral cannot be easily computed. We now show that Stokes' theorem
makes the calculation tractable.
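Note that C lies on the surface given by z = 2xy, since $\sin 2t = 2\sin t\cos t$. Therefore, let S be the
portion of this surface enclosed by C, that is, the portion lying over the unit disk.
A parametrization of S is given in polar coordinates by
$$\Phi(r, \theta) = (r\cos\theta, r\sin\theta, 2r^2\cos\theta\sin\theta)$$
with domain
$$D = \{(r, \theta) \mid 0 \le r < 1, 0 \le \theta < 2\pi\}.$$
The normal vector computed from $\Phi$ has positive z-component, so the induced positive
orientation of $\partial S$ is counterclockwise when viewed from above. The curve C is travelled
clockwise, since r(0) = (0, 1, 0) and $r(\pi/2) = (1, 0, 0)$. Hence C is $\partial S$ with the
opposite orientation and Stokes' theorem states that
$$\int_C \omega = -\int_S d\omega.$$
We compute
$$d\omega = d(y + \sin x) \wedge dx + d(z^2 + \cos y) \wedge dy + d(x^3) \wedge dz$$
$$= dy \wedge dx + 2z\,dz \wedge dy + 3x^2\,dx \wedge dz$$
$$= -2z\,dy \wedge dz - 3x^2\,dz \wedge dx - dx \wedge dy.$$
Noticing that $2\cos\theta\sin\theta = \sin 2\theta$, so that $z = r^2\sin 2\theta$ on S, we compute
$$\det\frac{\partial(y, z)}{\partial(r, \theta)} = \det\begin{pmatrix} \sin\theta & r\cos\theta \\ 2r\sin 2\theta & 2r^2\cos 2\theta \end{pmatrix} = 2r^2(\sin\theta\cos 2\theta - \cos\theta\sin 2\theta) = -2r^2\sin\theta,$$
$$\det\frac{\partial(z, x)}{\partial(r, \theta)} = \det\begin{pmatrix} 2r\sin 2\theta & 2r^2\cos 2\theta \\ \cos\theta & -r\sin\theta \end{pmatrix} = -2r^2(\sin 2\theta\sin\theta + \cos 2\theta\cos\theta) = -2r^2\cos\theta$$
where the last equalities in the above two determinants follow by applying the difference
of angles formulae $\sin\theta = \sin(2\theta - \theta)$ and $\cos\theta = \cos(2\theta - \theta)$. We know already
that
$$\det\frac{\partial(x, y)}{\partial(r, \theta)} = r.$$
Therefore, the pullback of $d\omega$ by $\Phi$ is
$$\Phi^*(d\omega) = \left(-2r^2\sin 2\theta\,\det\frac{\partial(y, z)}{\partial(r, \theta)} - 3r^2\cos^2\theta\,\det\frac{\partial(z, x)}{\partial(r, \theta)} - \det\frac{\partial(x, y)}{\partial(r, \theta)}\right) dr \wedge d\theta$$
$$= \left(4r^4\sin\theta\sin 2\theta + 6r^4\cos^3\theta - r\right) dr \wedge d\theta$$
$$:= G(r, \theta)\,dr \wedge d\theta.$$
This leads to
$$\int_C \omega = -\int_0^1 \int_0^{2\pi} G(r, \theta)\,d\theta\,dr.$$
This double integral is straightforward to compute and we leave the details to the
interested reader.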
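As a numerical cross-check of Example 10.2.2, the sketch below evaluates the line integral directly from r(t) and compares it with the double integral of the pullback of $d\omega$ over the unit disk. It is an illustration only: the midpoint rule, the grid sizes and the function names are choices made here. Because C is travelled clockwise when viewed from above while the polar parametrization induces the upward normal, the surface integral is taken with a minus sign.

```python
import math

def midpoint(f, lo, hi, n):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def line_integrand(t):
    """omega evaluated along r(t) = (sin t, cos t, sin 2t)."""
    x, y, z = math.sin(t), math.cos(t), math.sin(2 * t)
    dx, dy, dz = math.cos(t), -math.sin(t), 2 * math.cos(2 * t)
    return (y + math.sin(x)) * dx + (z ** 2 + math.cos(y)) * dy + x ** 3 * dz

def G(r, th):
    """Coefficient of dr ^ dtheta in the pullback of d(omega) by Phi(r, theta)."""
    x, z = r * math.cos(th), r * r * math.sin(2 * th)
    x_r, x_t = math.cos(th), -r * math.sin(th)
    y_r, y_t = math.sin(th), r * math.cos(th)
    z_r, z_t = 2 * r * math.sin(2 * th), 2 * r * r * math.cos(2 * th)
    j_yz = y_r * z_t - y_t * z_r      # d(y,z)/d(r,theta)
    j_zx = z_r * x_t - z_t * x_r      # d(z,x)/d(r,theta)
    j_xy = x_r * y_t - x_t * y_r      # d(x,y)/d(r,theta)
    # d(omega) = -2z dy^dz - 3x^2 dz^dx - dx^dy
    return -2 * z * j_yz - 3 * x * x * j_zx - j_xy

direct = midpoint(line_integrand, 0.0, 2 * math.pi, 2000)
surface = midpoint(lambda r: midpoint(lambda th: G(r, th), 0.0, 2 * math.pi, 200),
                   0.0, 1.0, 200)
via_stokes = -surface   # sign flip: C is the boundary with the opposite orientation
print(direct, via_stokes)   # both are close to pi
```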
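Stokes' theorem is useful in order to establish some properties of differential operators.
We show an example here.
Example 10.2.3. Let $\phi$ and $\psi$ be two $C^2$ functions from $\mathbb{R}^3 \to \mathbb{R}$ and consider the
vector field
$$F(x, y, z) = \phi\nabla\psi.$$
We define the 1-form $\omega_F = F \cdot (dx, dy, dz)$ and let S be a surface with a simple closed
curve boundary $C = \partial S$. By Stokes' theorem
$$\int_C \omega_F = \int_S d\omega_F.$$
It is straightforward to verify that
$$d\omega_F = (\nabla\phi \times \nabla\psi) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy).$$
Similarly, the vector field
$$G(x, y, z) = \psi\nabla\phi$$
is used to define the 1-form $\omega_G = G \cdot (dx, dy, dz)$. Then by Stokes' theorem
$$\int_C \omega_G = \int_S d\omega_G$$
and
$$d\omega_G = (\nabla\psi \times \nabla\phi) \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy) = -d\omega_F.$$
Therefore, we conclude that
$$\int_C \phi\nabla\psi \cdot dr = -\int_C \psi\nabla\phi \cdot dr.$$
Moreover, this calculation also tells us that
$$\mathrm{div}\,(\nabla\phi \times \nabla\psi) = 0$$
since this corresponds to $d^2\omega_F$.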
In the next example, we use Stokes theorem to derive a meaning for the curl operator
in the context of fluid flows.
Example 10.2.4. Let v(x, y, z) be the vector field describing the velocity flow of some
fluid defined in some open set $U \subset \mathbb{R}^3$. We define the circulation 1-form as
$$\omega = v \cdot (dx, dy, dz).$$
The circulation of v around a simple closed curve C with parametrization r(t), $t \in [a, b]$,
is defined as
$$\int_C \omega = \int_a^b v \cdot dr = \int_a^b (v \cdot T(t))\,ds$$
where we recall that $T(t) = r'(t)/\|r'(t)\|$ and $ds = \|r'(t)\|\,dt$. Let $p_0 \in U$ and $S_a$ be
a small disk of radius a > 0 in U (not necessarily lying in a plane parallel to the
xy-plane) with $p_0$ in the centre and let $\partial S_a = C_a$. We apply Stokes' theorem to this
situation where we recall that $d\omega = \mathrm{curl}\,v \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy)$ and we denote
by $\mathbf{n}(p) = \vec{n}(p)/\|\vec{n}(p)\|$ the unit normal vector to $S_a$ at p. We obtain
$$\int_{C_a} \omega = \int_{S_a} d\omega = \int_{S_a} \mathrm{curl}\,v \cdot \mathbf{n}(p)\,dS.$$
Now, for $a \approx 0$, we can say that $\mathrm{curl}\,v(p) \cdot \mathbf{n}(p) \approx \mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0)$ for all $p \in S_a$.
Thus, we have
$$\int_{C_a} \omega \approx \mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0)\,\mathrm{Area}(S_a) = \mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0)\,\pi a^2$$
which we rearrange as
$$\mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0) \approx \frac{1}{\pi a^2}\int_{C_a} \omega. \qquad (10.1)$$
Finally, we can write
$$\mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0) = \lim_{a \to 0} \frac{1}{\pi a^2}\int_{C_a} \omega.$$
The interpretation of formula (10.1) is illustrated in Figure 10.8 and described as
follows. At a point $p_0 \in \mathbb{R}^3$, choose a unit vector $\mathbf{n}(p_0)$ and consider the plane P
perpendicular to $\mathbf{n}(p_0)$. For a circle $C_a$ of (small) radius a in P centered at $p_0$, the
curl operator of v at $p_0$ defines a vector at $p_0$ whose orientation with respect to
$\mathbf{n}$ measures the amount of rotation the vector field v near $p_0$ has in the plane P.
Notice that the magnitude of the circulation of the vector field v along the circle $C_a$
is maximized if $\mathrm{curl}(v)(p_0)$ and $\mathbf{n}(p_0)$ are parallel. It is minimal if $\mathrm{curl}(v)(p_0)$
and $\mathbf{n}(p_0)$ are perpendicular, meaning that there is no rotation of v near $p_0$ in
the plane P.
Fig. 10.8. Normal vector $\mathbf{n}(p_0)$ and vector field at points on the perpendicular plane P.
Note that the vectors do not necessarily lie on the plane.
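Formula (10.1) can be illustrated numerically. The sketch below is a minimal example under assumptions made here: the sample field $v = (-y, x^3, 0)$, the point $p_0 = (1, 0, 0)$ and the normal (0, 0, 1) are illustrative choices, for which $\mathrm{curl}\,v = (0, 0, 3x^2 + 1)$ and hence $\mathrm{curl}\,v(p_0) \cdot \mathbf{n}(p_0) = 4$. The circulation around circles of shrinking radius, divided by the disk area, approaches this value.

```python
import math

def v(x, y, z):
    """Sample velocity field (an assumption made for this illustration)."""
    return (-y, x ** 3, 0.0)

def circulation(a, n=2000):
    """Circulation of v around the circle of radius a centred at p0 = (1, 0, 0)
    in the plane z = 0, travelled counterclockwise as seen from +z."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h                      # midpoint rule in t
        x, y = 1.0 + a * math.cos(t), a * math.sin(t)
        dx, dy = -a * math.sin(t), a * math.cos(t)
        v1, v2, _ = v(x, y, 0.0)
        total += (v1 * dx + v2 * dy) * h
    return total

curl_dot_n = 4.0   # (3 x^2 + 1) at x = 1, paired with n = (0, 0, 1)
ratios = {a: circulation(a) / (math.pi * a * a) for a in (0.5, 0.1, 0.01)}
for a, ratio in ratios.items():
    print(a, ratio)   # the ratio tends to curl_dot_n as a shrinks
```

For this particular field, Green's theorem gives the ratio exactly as $4 + 3a^2/4$, which makes the rate of convergence visible.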
We now look at some applications of the Divergence theorem.

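Example 10.2.5. Let
$$\omega = z^2 x\,dy \wedge dz + \left(\frac{1}{3}y^3 + \tan z\right) dz \wedge dx + (x^2 z + y^2)\,dx \wedge dy$$
and let S be the top half of the sphere $x^2 + y^2 + z^2 = 1$. We compute
$$\int_S \omega$$
using the Divergence theorem. Because S is not a closed surface, we need to define a
region bounded by a closed surface using S. We do this by closing the region defined
by S using the disk D of radius 1 centered at the origin of the xy-plane. Of course,
other choices are possible, but this is one of the simplest. Then, $\tilde{S} := S \cup D$ is a closed
surface, E is the solid region with $\partial E = \tilde{S}$ and we choose a positive orientation for
$\partial E$. Therefore,
$$\int_E d\omega = \int_{\tilde{S}} \omega = \int_S \omega + \int_D \omega$$
and
$$\int_S \omega = \int_E d\omega - \int_D \omega.$$
We can now proceed with the computation. We have
$$d\omega = (x^2 + y^2 + z^2)\,dx \wedge dy \wedge dz.$$
We use the polar parametrization $\Phi : \tilde{D} \to D$,
$$\Phi(r, \theta) = (r\cos\theta, r\sin\theta, 0)$$
with domain $\tilde{D} = \{(r, \theta) \mid 0 \le \theta < 2\pi, 0 \le r < 1\}$. Then,
$$\Phi^*\omega = r^3\sin^2\theta\,dr \wedge d\theta.$$
The normal vector induced by $\Phi$ is (0, 0, r), which points into E, whereas the positive
orientation of $\partial E$ requires the outward normal (0, 0, -1) on D. Hence
$\int_D \omega = -\int_{\tilde{D}} \Phi^*\omega$.
For the triple integral, we use spherical coordinates with domain
$$R = \{(\rho, \phi, \theta) \mid 0 \le \rho < 1, 0 \le \phi < \pi/2, 0 \le \theta < 2\pi\}.$$
Putting it all together we have
$$\int_S \omega = \int_0^1 \int_0^{\pi/2} \int_0^{2\pi} \rho^4\sin\phi\,d\theta\,d\phi\,d\rho + \int_0^1 \int_0^{2\pi} r^3\sin^2\theta\,d\theta\,dr = \frac{2\pi}{5} + \frac{\pi}{4} = \frac{13\pi}{20}.$$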
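The value of the integral over S in Example 10.2.5 can be cross-checked by a direct numerical evaluation that does not use the Divergence theorem. The sketch below is an illustration under assumptions made here (midpoint rule, grid size, function names): it integrates $F \cdot \vec{n}$ over the upper hemisphere with $F = (z^2 x, \tfrac{1}{3}y^3 + \tan z, x^2 z + y^2)$ and the outward normal of the spherical parametrization from Example 9.4.6, and compares the result with $2\pi/5 + \pi/4$, the value obtained when the closing disk carries the outward (downward) normal.

```python
import math

def F(x, y, z):
    """Coefficients of omega paired with (dy^dz, dz^dx, dx^dy)."""
    return (z * z * x, y ** 3 / 3.0 + math.tan(z), x * x * z + y * y)

def flux_integrand(p, t):
    """F(Phi(phi, theta)) . n(phi, theta) for the unit sphere, outward normal."""
    sp, cp = math.sin(p), math.cos(p)
    x, y, z = math.cos(t) * sp, math.sin(t) * sp, cp
    n = (math.cos(t) * sp * sp, math.sin(t) * sp * sp, sp * cp)
    return sum(fi * ni for fi, ni in zip(F(x, y, z), n))

def double_midpoint(f, a, b, c, d, n=200):
    """Midpoint rule for the integral of f(u, v) over [a, b] x [c, d]."""
    hu, hv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * hu
        for j in range(n):
            total += f(u, c + (j + 0.5) * hv)
    return total * hu * hv

flux = double_midpoint(flux_integrand, 0.0, math.pi / 2, 0.0, 2 * math.pi)
expected = 2 * math.pi / 5 + math.pi / 4     # = 13 pi / 20
print(flux, expected)
```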
Flux through a surface

We begin by giving an interpretation of 2-forms as the rate of flow, also called flux,
of a vector field through a surface. Consider a surface $S \subset \mathbb{R}^3$ and a vector field
$$F(x, y, z) = (F_1(x, y, z), F_2(x, y, z), F_3(x, y, z))^T.$$
The vector field can have several physical meanings as a flux per unit area: for
instance, the flux per unit area of a fluid, an electric field, heat, concentration of
a chemical, etc. We can write $F = \rho\vec{v}$ where $\rho$ is a density and $\vec{v}$ a velocity field.
Note that $\rho$ has units of kg/m^3 and $\vec{v}$ has units of m/s, therefore $\rho\vec{v}$ has units of
(kg/s)/m^2, which is indeed a flux per unit area.
Fig. 10.9. Unit normal vectors to S and vector field at points of S. Arrows with
thick heads are the normal vectors to S.
Let $p_0 = (x_0, y_0, z_0)$ and let $\mathbf{n} = \vec{n}/\|\vec{n}\|$ be the unit normal vector to S. We approximate
the flux of F through S near $p_0$ in the direction $\mathbf{n}$ using
$$F(p_0) \cdot \mathbf{n}(p_0).$$
Because the above is a flux per unit area, let $\vec{\xi}$ and $\vec{\eta}$ be a pair of linearly independent
vectors in $T_{p_0}S$ and $P(\vec{\xi}, \vec{\eta})$ a parallelogram with area given by $\|\vec{n}\|$. Then, the flux
through S near $p_0$ is approximated by
$$F(p_0) \cdot \mathbf{n}(p_0)\,\mathrm{Area}(P(\vec{\xi}, \vec{\eta}))$$
which we can rewrite using the surface area form as
$$F(p_0) \cdot \mathbf{n}(p_0)\,dS(P(\vec{\xi}, \vec{\eta})).$$
Recall that
$$dS(P(\vec{\xi}, \vec{\eta})) = \sqrt{(dy \wedge dz)^2(\vec{\xi}, \vec{\eta}) + (dz \wedge dx)^2(\vec{\xi}, \vec{\eta}) + (dx \wedge dy)^2(\vec{\xi}, \vec{\eta})}.$$
But, since $\|\vec{n}\| = dS(P(\vec{\xi}, \vec{\eta}))$, then
$$F(p_0) \cdot \mathbf{n}(p_0)\,dS(P(\vec{\xi}, \vec{\eta})) = F(p_0) \cdot \big((dy \wedge dz)(\vec{\xi}, \vec{\eta}), (dz \wedge dx)(\vec{\xi}, \vec{\eta}), (dx \wedge dy)(\vec{\xi}, \vec{\eta})\big)$$
where we recognize the right-hand side of the above equation as a 2-form.
Fig. 10.10. Zoom in near a point $p_0$ with the normal vector $\mathbf{n}(p_0)$, the vector field $F(p_0)$ and
the projection of the vector field on the normal vector.
We summarize as follows:
(a) Let F be a vector field describing a flux per unit area (fluid, heat, etc.).
(b) Define a 2-form: $\omega_F = F \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy)$.
(c) Let S be a surface with unit normal vector $\mathbf{n}$.
(d) Then, on S we have: $F \cdot \mathbf{n}\,dS = \omega_F$.
(e) The flux through S is defined by
$$\int_S F \cdot \mathbf{n}\,dS = \int_S \omega_F.$$
(f) If S is parametrized by $\Phi : D \to S$ then
$$\int_S F \cdot \mathbf{n}\,dS = \int_S \omega_F = \int_D \Phi^*\omega_F.$$
We now show an example of flux.
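Example 10.2.6. A fluid has density $\rho = 870$ kg/m^3 and flows with velocity $v = (z, y^2, x^2)$
where x, y, z are measured in meters and the components of v in m/s. We
find the flux outward through the side of the cylinder $x^2 + y^2 = 4$ for $0 \le z \le 1$.
A parametrization of the cylinder is given by
$$\Phi(\theta, z) = (2\cos\theta, 2\sin\theta, z)$$
where $D = \{(\theta, z) \mid 0 \le \theta < 2\pi, 0 \le z < 1\}$. Let $F = \rho v$ be the flux per unit area;
then the flux 2-form is
$$\omega_F = F \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy) = 870(z\,dy \wedge dz + y^2\,dz \wedge dx + x^2\,dx \wedge dy).$$
We compute the pullback
$$\Phi^*\omega_F = 870\left(z\frac{\partial(y, z)}{\partial(\theta, z)} + y^2\frac{\partial(z, x)}{\partial(\theta, z)} + x^2\frac{\partial(x, y)}{\partial(\theta, z)}\right) d\theta \wedge dz$$
$$= 870\big(z(2\cos\theta) + 4\sin^2\theta\,(2\sin\theta)\big)\,d\theta \wedge dz$$
$$= 870(2z\cos\theta + 8\sin^3\theta)\,d\theta \wedge dz.$$
Therefore,
$$\int_S F \cdot \mathbf{n}\,dS = \int_S \omega_F = \int_0^1 \int_0^{2\pi} 870(2z\cos\theta + 8\sin^3\theta)\,d\theta\,dz = 0.$$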
Remark 10.2.7. We introduce the following notation, which is widely used, for the
flux through a surface
$$\int_S F \cdot d\mathbf{S} := \int_S F \cdot \mathbf{n}\,dS.$$
That is, we define $d\mathbf{S} = \mathbf{n}\,dS$.
The formulation of flux in terms of a 2-form leads directly to the formulation of
Stokes' theorem for vector fields.
10.2.1 Proof of Theorem 10.2.1: Stokes
We now perform the proof of Stokes' theorem. Note that the proof of Green's theorem
found in Chapter 7 is obtained by assuming that the 1-form $\omega$ is $C^2$. This is an
unnecessary assumption and we now show the proof for the case where $\omega$ is only $C^1$.
Proof of Green's theorem. Any domain D in $\mathbb{R}^2$ can be decomposed as a union of
domains of Type I and Type II as discussed in the proof of Green's theorem in
Chapter 7. We focus on the domains of Type I and let D be such a domain; that is,
$D = \{(x, y) \mid a \le x \le b, g_1(x) \le y \le g_2(x)\}$. Let $B = [0, 1] \times [0, 1]$. Then there exists
a parametrization $\Phi : B \to D$ defined using $x(u) = a(1 - u) + bu$ by
$$\Phi(u, v) = (x(u), g_1(x(u))(1 - v) + g_2(x(u))v).$$
By the change of variables formula and Proposition 9.5.3,
$$\int_D d\omega = \int_B \Phi^* d\omega = \int_B d(\Phi^*\omega).$$
Moreover, because $\Phi(\partial B) = \partial D$, then
$$\int_{\partial D} \omega = \int_{\partial B} \Phi^*\omega$$
by Proposition 9.5.2. But, we know from Example 7.3.2 that for a rectangular region
$$\int_{\partial B} \Phi^*\omega = \int_B d(\Phi^*\omega).$$
Note that Example 7.3.2 only requires $\omega$ to be $C^1$ and so this completes the proof.
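The parametrization of a Type I domain by the unit square used in this proof can be sketched in code. The example below is an illustration under assumptions made here: the bounds a = 0, b = 2 and the curves $g_1(x) = x^2$, $g_2(x) = x + 2$ are sample choices, as are the function names. It builds $\Phi$, checks where the corners of the square are sent, and recovers the area of D by integrating the Jacobian determinant $(b - a)(g_2(x(u)) - g_1(x(u)))$ over the square.

```python
def make_param(a, b, g1, g2):
    """Return Phi: [0,1]^2 -> D and its Jacobian determinant for the Type I
    domain D = {a <= x <= b, g1(x) <= y <= g2(x)}."""
    def x_of(u):
        return a * (1 - u) + b * u

    def Phi(u, v):
        x = x_of(u)
        return (x, g1(x) * (1 - v) + g2(x) * v)

    def jac(u, v):
        # dx/du = b - a, dx/dv = 0, dy/dv = g2(x) - g1(x)
        x = x_of(u)
        return (b - a) * (g2(x) - g1(x))

    return Phi, jac

def double_midpoint(f, n=200):
    """Midpoint rule for the integral of f(u, v) over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h

# Sample Type I domain (hypothetical choice): 0 <= x <= 2, x^2 <= y <= x + 2.
Phi, jac = make_param(0.0, 2.0, lambda x: x * x, lambda x: x + 2.0)

area = double_midpoint(jac)          # area of D via change of variables
print(Phi(0.0, 0.0), Phi(0.0, 1.0), Phi(1.0, 0.0), Phi(1.0, 1.0), area)
```

For this sample the exact area is $\int_0^2 (x + 2 - x^2)\,dx = 10/3$. The edges v = 0 and v = 1 of the square are sent to the graphs of $g_1$ and $g_2$.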
We now use Green's theorem to provide a straightforward proof of the Classical
Stokes' theorem using the pullback and exterior derivative properties obtained at
the end of Section 9.1.
Proof of the Classical Stokes' theorem. Consider a surface S with regular boundary
$\partial S$. We assume S can be split into the union of subsurfaces $S_1$, $S_2$, sharing a boundary
given by C. Therefore $S_1$ and $S_2$ have boundaries given respectively by $C_1 \cup C$
and $C_2 \cup (-C)$, and $\partial S = C_1 \cup C_2$. If Stokes' theorem holds for each surface $S_1$ and
$S_2$ we have
$$\int_S d\omega = \int_{S_1} d\omega + \int_{S_2} d\omega$$
$$= \int_{C_1 \cup C} \omega + \int_{C_2 \cup (-C)} \omega$$
$$= \int_{C_1} \omega + \int_C \omega + \int_{C_2} \omega + \int_{-C} \omega$$
$$= \int_{C_1} \omega + \int_{C_2} \omega$$
$$= \int_{\partial S} \omega.$$
This construction can be generalized to a finite number of surfaces $S_1, \ldots, S_k$ with
the common boundaries cancelling out in the same way as in the calculation just above
and as in the case of Green's theorem; see the proof in Chapter 7.
Therefore, all we need is to prove the formula of Stokes' theorem for surfaces
S with boundary $\partial S$ given by a parametrization $\Phi : D \subset \mathbb{R}^2 \to \mathbb{R}^3$. Note that
$\partial S = \Phi(\partial D)$. For this, we use Green's theorem and Proposition 9.5.3:
$$\int_S d\omega = \int_D \Phi^* d\omega$$
$$= \int_D d(\Phi^*\omega)$$
$$= \int_{\partial D} \Phi^*\omega$$
$$= \int_{\Phi(\partial D)} \omega$$
$$= \int_{\partial S} \omega$$
where the fourth equality holds by Proposition 9.5.2.
In order to provide an argument for the proof of the Divergence theorem with a
simple calculation, we begin by noticing that the regions of Type I, Type II and
Type III for triple integrals can be given a simple parametrization using a cube as a
domain, similar to what is done for domains in $\mathbb{R}^2$ in the proof of Green's theorem
in this chapter. We show the details in the case of a Type I region.
Let $E = \{(x, y, z) \in \mathbb{R}^3 \mid (x, y) \in D, g_1(x, y) \le z \le g_2(x, y)\}$ where we assume
that $D \subset \mathbb{R}^2$ is a domain of Type I; that is, $D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b, h_1(x) \le y \le h_2(x)\}$.
Let $B = [0, 1] \times [0, 1] \times [0, 1]$ and consider the parametrization $\Phi : B \to E$
defined by
$$x(u) = a(1 - u) + bu, \quad y(u, v) = h_1(x(u))(1 - v) + h_2(x(u))v$$
and
$$\Phi(u, v, w) = (x(u), y(u, v), g_1(x(u), y(u, v))(1 - w) + g_2(x(u), y(u, v))w). \qquad (10.2)$$
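Proof of the Divergence theorem. Any solid region E can be decomposed into a
union of regions of Type I, II and III with each subregion sharing a boundary via a
piecewise smooth surface. We return to the decomposition aspect at the end of the
proof and assume, without loss of generality, for the first part of the proof that E is
a region of Type I.
We know there exists a parametrization $\Phi : B \to E$ as given by (10.2). We use
$\Phi$ to pull back the calculation to B:
$$\int_{\partial E} \omega = \int_{\partial B} \Phi^*\omega$$
and
$$\int_E d\omega = \int_B \Phi^* d\omega = \int_B d(\Phi^*\omega)$$
with the last equality guaranteed by Proposition 9.5.3. Suppose that
$$\omega = a(x, y, z)\,dy \wedge dz + b(x, y, z)\,dz \wedge dx + c(x, y, z)\,dx \wedge dy.$$
A straightforward computation shows that
$$\Phi^*\omega = \tilde{a}(u, v, w)\,dv \wedge dw + \tilde{b}(u, v, w)\,dw \wedge du + \tilde{c}(u, v, w)\,du \wedge dv$$
where the exact form of the coefficients is not relevant for our purposes. Therefore,
$$d(\Phi^*\omega) = \left(\frac{\partial \tilde{a}}{\partial u} + \frac{\partial \tilde{b}}{\partial v} + \frac{\partial \tilde{c}}{\partial w}\right) du \wedge dv \wedge dw.$$
We now parametrize the six sides of $\partial B$ using the mappings $\Phi_{j\epsilon}(\alpha, \beta)$ with
domain $D = [0, 1] \times [0, 1]$, where j = u, v, w describes the direction of the normal
vector to the side and $\epsilon = 0, 1$ the location of the side along the j-axis. For instance,
$\Phi_{u1}(\alpha, \beta) = (1, \alpha, \beta)$ is the parametrization of the side of the cube located at u = 1;
similarly, $\Phi_{v\epsilon}(\alpha, \beta) = (\alpha, \epsilon, \beta)$ and $\Phi_{w\epsilon}(\alpha, \beta) = (\alpha, \beta, \epsilon)$.
Then, it is a straightforward calculation to show that $\Phi_{j\epsilon}^*(dv \wedge dw) = 0$ for
j = v, w, $\Phi_{j\epsilon}^*(dw \wedge du) = 0$ for j = u, w and $\Phi_{j\epsilon}^*(du \wedge dv) = 0$ for j = u, v. Moreover,
$$\Phi_{u\epsilon}^*(dv \wedge dw) = d\alpha \wedge d\beta, \quad \Phi_{v\epsilon}^*(dw \wedge du) = -d\alpha \wedge d\beta, \quad \Phi_{w\epsilon}^*(du \wedge dv) = d\alpha \wedge d\beta.$$
The normal vectors induced by $\Phi_{u1}$, $\Phi_{v0}$ and $\Phi_{w1}$ point out of B, while those induced by
$\Phi_{u0}$, $\Phi_{v1}$ and $\Phi_{w0}$ point into B. Taking these orientations and the sign of
$\Phi_{v\epsilon}^*(dw \wedge du)$ into account, we can conclude that
$$\int_{\partial B} \Phi^*\omega = \int_D \tilde{a}(\Phi_{u1}(\alpha, \beta))\,d\alpha\,d\beta - \int_D \tilde{a}(\Phi_{u0}(\alpha, \beta))\,d\alpha\,d\beta$$
$$+ \int_D \tilde{b}(\Phi_{v1}(\alpha, \beta))\,d\alpha\,d\beta - \int_D \tilde{b}(\Phi_{v0}(\alpha, \beta))\,d\alpha\,d\beta$$
$$+ \int_D \tilde{c}(\Phi_{w1}(\alpha, \beta))\,d\alpha\,d\beta - \int_D \tilde{c}(\Phi_{w0}(\alpha, \beta))\,d\alpha\,d\beta$$
$$= \int_D (\tilde{a}(1, \alpha, \beta) - \tilde{a}(0, \alpha, \beta))\,d\alpha\,d\beta + \int_D (\tilde{b}(\alpha, 1, \beta) - \tilde{b}(\alpha, 0, \beta))\,d\alpha\,d\beta$$
$$+ \int_D (\tilde{c}(\alpha, \beta, 1) - \tilde{c}(\alpha, \beta, 0))\,d\alpha\,d\beta$$
$$= \int_D \int_0^1 \frac{\partial \tilde{a}(u, v, w)}{\partial u}\,du\,dv\,dw + \int_D \int_0^1 \frac{\partial \tilde{b}(u, v, w)}{\partial v}\,dv\,du\,dw + \int_D \int_0^1 \frac{\partial \tilde{c}(u, v, w)}{\partial w}\,dw\,du\,dv$$
$$= \int_B \left(\frac{\partial \tilde{a}(u, v, w)}{\partial u} + \frac{\partial \tilde{b}(u, v, w)}{\partial v} + \frac{\partial \tilde{c}(u, v, w)}{\partial w}\right) du\,dv\,dw$$
$$= \int_B d(\Phi^*\omega).$$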
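Suppose now that $E = E_1 \cup \cdots \cup E_k$ where the regions $E_j$ are of Type I, II or III and the boundary between $E_j$ and the other regions is given by a piecewise smooth surface $S_j$. That is, we can write $\partial E_j = (\partial E_j \setminus S_j) \cup S_j$. This means that
\[
\partial E = \bigcup_{j=1}^{k} (\partial E_j \setminus S_j).
\]
We must also write explicitly the piecewise smooth surface $S_j$ as
\[
S_j = \bigcup_{i=1}^{\ell_j} S_{ji}
\]
where $S_{ji}$ are smooth surfaces. But notice that
\[
\sum_{j=1}^{k} \sum_{i=1}^{\ell_j} \int_{S_{ji}} \omega = 0
\]
because for a fixed $j$, the computation of the integral for $S_{ji}$ has normal vector pointing towards the outside of $E_j$, while for neighbouring regions, the same boundary surfaces have normal vectors pointing in the opposite directions, thus all integrals cancel out. The argument is similar to the one used for Green's theorem and the details are left to the reader.
We can now compute
\begin{align*}
\int_E d\omega &= \sum_{j=1}^{k} \int_{E_j} d\omega \\
&= \sum_{j=1}^{k} \int_{\partial E_j} \omega \\
&= \sum_{j=1}^{k} \int_{\partial E_j \setminus S_j} \omega + \sum_{j=1}^{k} \int_{S_j} \omega \\
&= \sum_{j=1}^{k} \int_{\partial E_j \setminus S_j} \omega + \sum_{j=1}^{k} \sum_{i=1}^{\ell_j} \int_{S_{ji}} \omega \\
&= \sum_{j=1}^{k} \int_{\partial E_j \setminus S_j} \omega \\
&= \int_{\partial E} \omega
\end{align*}
and this completes the proof of the theorem.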
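The identity proved on the unit cube can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the polynomial coefficients `a`, `b`, `c` are hypothetical choices (not taken from the text), the boundary side is written as the outward flux of $(a, b, c)$ through the six faces, and both sides are approximated with a simple midpoint rule.

```python
# Sample coefficients of a 2-form  a dv^dw + b dw^du + c du^dv  on the unit cube.
# These particular polynomials are illustrative choices, not taken from the text.
def a(u, v, w):
    return u * u * v

def b(u, v, w):
    return v * w + u

def c(u, v, w):
    return w ** 3 * u

def div(u, v, w):
    # da/du + db/dv + dc/dw, differentiated by hand
    return 2 * u * v + w + 3 * w * w * u

def midpoints(n):
    # Midpoints of n equal subintervals of [0, 1]
    return [(i + 0.5) / n for i in range(n)]

def boundary_integral(n=40):
    """Outward flux through the six faces, midpoint rule on each face."""
    pts = midpoints(n)
    total = 0.0
    for s in pts:
        for t in pts:
            total += a(1, s, t) - a(0, s, t)  # faces u = 1 and u = 0
            total += b(s, 1, t) - b(s, 0, t)  # faces v = 1 and v = 0
            total += c(s, t, 1) - c(s, t, 0)  # faces w = 1 and w = 0
    return total / n ** 2

def volume_integral(n=40):
    """Midpoint-rule approximation of the integral of div over the cube."""
    pts = midpoints(n)
    total = 0.0
    for u in pts:
        for v in pts:
            for w in pts:
                total += div(u, v, w)
    return total / n ** 3

# Both values should be close to 3/2 for these coefficients.
print(boundary_integral(), volume_integral())
```

For these coefficients each pair of opposite faces contributes 1/2 and each term of the divergence integrates to 1/2, so both sides equal 3/2.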

Exercises
(1) Evaluate the line integral of $\omega$ over the given simple closed curve $C$ oriented positively.
(a) Let $\omega = (\cos x + y^3)\, dx + (4x - e^{y^2})\, dy$ and $C$ be the curve bounding the half-disk $x^2 + y^2 \leq a^2$ with $x \geq 0$.
(b) Let $\omega = (\cos(e^x) - y^3)\, dx + (y^5 + x^3)\, dy$ and $D$ be the region between $y = x$, $y = -x$ and the circle of radius $a$, with $y \geq 0$.
(2) Let $u, v : \mathbb{R}^2 \to \mathbb{R}$ be differentiable and let $D \subset \mathbb{R}^2$ be a bounded, connected region. Let $N(t)$ be a vector normal to $\partial D$. Show that
\[
\iint_D (u \Delta v - v \Delta u)\, dA = \int_{\partial D} \bigl( u (\nabla v \cdot N) - v (\nabla u \cdot N) \bigr)\, dt.
\]
(3) Show that $\omega_1 = x\, dy$, $\omega_2 = -y\, dx$ are 1-forms which yield the area of a bounded region $D \subset \mathbb{R}^2$.

(4) Show that if $\omega$ is an exact 2-form, then $\int_S \omega = 0$ for any closed surface $S$.
(5) Use Classical Stokes or Divergence theorem to compute the following integrals.

(a) $\int_C \omega$ where $\omega = y e^x\, dx + (x + e^x)\, dy + z^2\, dz$ and $C$ is the curve with parametrization $r(t) = (1 + \cos t, 1 + \sin t, 1 - \sin t - \cos t)$ with $0 \leq t \leq 2\pi$. (Hint: the curve $C$ lies on a plane $z = A - Bx - Cy$ where you need to find $A$, $B$ and $C$).

(b) $\int_S d\omega$ where $\omega = 3y\, dx - 2xz\, dy + (x^2 - y^2)\, dz$ and $S$ is the positively oriented hemisphere of equation $x^2 + y^2 + z^2 = a^2$ with $z \geq 0$.

(c) $\int_S \omega$ where $\omega = (x + y^2)\, dy \wedge dz + (3x^2 y + y^3 - x^3)\, dz \wedge dx + (z + 1)\, dx \wedge dy$ and $S$ is the positively oriented cone with a vertex at $(0, 0, 2)$, axis given by the $z$-axis, and with base given by the disk of radius 2 in the $xy$-plane.
(d) $\int_E d\omega$ where $\omega = (1 - (x^2 + y^2)^3)\, dy \wedge dz + 2(1 - (x^2 + y^2)^3)\, dz \wedge dx + x^2 z^2\, dx \wedge dy$ and $E \subset \mathbb{R}^3$ is the region bounded by the cylinder of radius 1 along the $z$-axis and between the planes $z = 0$ and $z = 1$.

(6) Set up the explicit integral for $\int_S \omega_F$ computing the rate of flow (i.e. flux) of the vector field $F$ across $S$. (Hint: you must find a parametrization for $S$ along with its domain $D$. Determine the 2-form $\omega_F$ corresponding to $F$, compute the pullback and set up the bounds of the integrals.)
(a) $F(x, y, z) = (xy, yz, zx)^T$ and $S$ is the part of the paraboloid $z = 4 - x^2 - y^2$ that lies above the square $0 \leq x \leq 1$, $0 \leq y \leq 1$.
(b) $F(x, y, z) = (y, z - y, x)^T$ and $S$ is the surface of the tetrahedron with vertices $(0, 0, 0)$, $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$.
(c) $F(x, y, z) = (x, z, y)^T$ and $S$ is the part of the sphere $x^2 + y^2 + z^2 = 4$ in the first octant.
(7) Let $S_1$ be the elliptic paraboloid surface $z = x^2 + y^2$ and $S_2$ be the plane $z = 2 - 2x - 2y$. Show that the intersection curve $C$ of the paraboloid and the plane has parametric representation $r(t) = (-1 + 2\cos t, -1 + 2\sin t, 6 - 4\cos t - 4\sin t)$ with $t \in [0, 2\pi)$. Consider the 1-form
\[
\omega = (3xz - y^2)\, dx + (\exp(y^2) - 2xy)\, dy + \left( z^3 + \frac{3}{2} x^2 \right) dz
\]
and evaluate $\int_C \omega$.
10.3 Stokes' Theorem for Vector Fields
We now write Stokes' theorem explicitly using the vector field formalism. The formulae follow directly from the correspondence seen in Section 8.3.1 between 1-forms and vector fields, and between the exterior derivative and the curl and divergence differential operators acting on vector fields.
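Theorem 10.3.1 (Stokes' theorem in Vector Calculus). (1) Let $D \subset \mathbb{R}^2$ be a region whose boundary $\partial D$ is positively oriented using $r : [a, b] \to \mathbb{R}^2$, and let $F(x, y) = (f_1(x, y), f_2(x, y))$ be a vector field on $D$. Then,
\[
\int_{\partial D} F \cdot dr = \iint_D \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right) dA.
\]
(2) Let $S \subset \mathbb{R}^3$ be a positively oriented surface whose boundary $\partial S$ carries the induced orientation using $r : [a, b] \to \mathbb{R}^3$. If $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a differentiable vector field, then
\[
\int_{\partial S} F \cdot dr = \iint_S \operatorname{curl} F \cdot n\, dS.
\]
(3) Let $E \subset \mathbb{R}^3$ be a solid region with boundary surface $\partial E$ oriented positively. If $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a differentiable vector field then
\[
\iint_{\partial E} F \cdot n\, dS = \iiint_E \operatorname{div} F\, dV.
\]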
We now use the vector field statement of the Divergence theorem to explore the
physical meaning of the divergence operator.
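Example 10.3.2. Let $F(x, y, z)$ be a differentiable vector field and $p_0 = (x_0, y_0, z_0)$. Let $B_\epsilon$ be a ball of radius $\epsilon > 0$ centered at $p_0$ with boundary sphere $S_\epsilon$. We suppose that $\partial B_\epsilon$ is oriented positively and we use the Divergence theorem for the flux of $F$ through $S_\epsilon$:
\[
\iint_{S_\epsilon} F \cdot n\, dS = \iiint_{B_\epsilon} \operatorname{div} F\, dV.
\]
Now, notice that for $\epsilon \ll 1$, we can approximate $\operatorname{div}(F)(p) \approx \operatorname{div}(F)(p_0)$ for all $p \in B_\epsilon$. Therefore, writing $dV(B_\epsilon)$ for the volume of $B_\epsilon$, we have the following computation
\[
\lim_{\epsilon \to 0} \frac{1}{dV(B_\epsilon)} \iint_{S_\epsilon} F \cdot n\, dS = \lim_{\epsilon \to 0} \frac{1}{dV(B_\epsilon)} \iiint_{B_\epsilon} \operatorname{div}(F)\, dV = \operatorname{div}(F)(p_0).
\]
The divergence operator can thus be interpreted in the following way. If $\operatorname{div}(F)(p_0) > 0$, it means that for $S_\epsilon$ with $\epsilon \ll 1$ near $p_0$, the total flux of $F$ points towards $\mathbb{R}^3 \setminus B_\epsilon$ and so there is a positive flux away from $p_0$ and we say that $p_0$ acts as a source point for the vector field. If $\operatorname{div}(F)(p_0) < 0$, then there is a total negative flux pointing towards the inside of $B_\epsilon$ and we say that $p_0$ acts as a sink point for the vector field.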
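The limit in this example can be observed numerically. The sketch below is an illustration only: the field $F = (x^3, y, xz)$ and the point $p_0 = (0.5, 0.2, -0.3)$ are hypothetical choices, and the flux through the sphere of radius $\epsilon$ is approximated with a midpoint rule in spherical angles. For this field one can show by hand that the flux per unit volume equals $\operatorname{div}(F)(p_0) + \tfrac{3}{5}\epsilon^2$, so it approaches $\operatorname{div}(F)(p_0) = 2.25$ as $\epsilon \to 0$.

```python
import math

# Hypothetical test field and its divergence (computed by hand).
def F(x, y, z):
    return (x ** 3, y, x * z)

def div_F(x, y, z):
    return 3 * x * x + 1 + x

def flux_through_sphere(p0, eps, n_phi=200, n_theta=400):
    """Outward flux of F through the sphere of radius eps centered at p0."""
    x0, y0, z0 = p0
    h_phi = math.pi / n_phi
    h_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * h_phi
        for k in range(n_theta):
            theta = (k + 0.5) * h_theta
            # outward unit normal at the quadrature point
            nx = math.cos(theta) * math.sin(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(phi)
            fx, fy, fz = F(x0 + eps * nx, y0 + eps * ny, z0 + eps * nz)
            # surface element on the sphere: eps^2 sin(phi) dphi dtheta
            total += (fx * nx + fy * ny + fz * nz) * eps ** 2 * math.sin(phi) * h_phi * h_theta
    return total

def flux_density(p0, eps):
    """Flux through the sphere divided by the volume of the ball."""
    volume = 4.0 / 3.0 * math.pi * eps ** 3
    return flux_through_sphere(p0, eps) / volume

p0 = (0.5, 0.2, -0.3)
for eps in (0.5, 0.1, 0.01):
    print(eps, flux_density(p0, eps), div_F(*p0))
```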
Here are a few more examples with applications of Stokes' theorem and the Divergence theorem.
Example 10.3.3 (Stokes' theorem: Balloon). Suppose that air diffuses out of the Mr. Winter hot air balloon. The bottom of the balloon is circular with radius 1. The hot air escapes the porous membrane with velocity $V(x, y, z) = \operatorname{curl} F$ with $F(x, y, z) = (-y, x, 0)$. We compute the outward flux of air through the surface if the density is $\rho(x, y, z) = K$.
Fig. 10.11. Mr. Winter hot air balloon.
The vector field associated with the air flow is given by
\[
G(x, y, z) = \rho V(x, y, z) = K \operatorname{curl}(F).
\]
The crucial point here is that the boundary of the hot air balloon surface is a circle $C$ of radius 1. We assume that the boundary circle lies in the $xy$-plane and we parametrize $C$ by $r(t) = (\cos t, \sin t, 0)$ with $t \in [0, 2\pi]$. Since $K$ is constant, Stokes' theorem gives
\begin{align*}
\iint_S K \operatorname{curl}(F) \cdot n\, dS &= K \int_{\partial S} F \cdot dr \\
&= K \int_C (-y, x, 0) \cdot (-\sin t, \cos t, 0)\, dt \\
&= K \int_0^{2\pi} (-\sin t, \cos t, 0) \cdot (-\sin t, \cos t, 0)\, dt \\
&= K \int_0^{2\pi} (\sin^2 t + \cos^2 t)\, dt \\
&= 2\pi K.
\end{align*}
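The balloon computation can be cross-checked numerically. In the sketch below the actual balloon shape is unknown, so the upper unit hemisphere is used as a stand-in surface with the same boundary circle, and `K = 1.5` is an arbitrary hypothetical density; both are assumptions made only for illustration. By Stokes' theorem the surface flux of $K \operatorname{curl} F$ and $K$ times the circulation of $F$ around $C$ should agree, and both should equal $2\pi K$.

```python
import math

K = 1.5  # hypothetical constant density, chosen only for this check

def F(x, y, z):
    return (-y, x, 0.0)

def curl_F(x, y, z):
    # curl of F = (-y, x, 0), computed by hand
    return (0.0, 0.0, 2.0)

def surface_flux(n_phi=200, n_theta=200):
    """Flux of K*curl F through the upper unit hemisphere, outward normal."""
    h_phi = (math.pi / 2) / n_phi
    h_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * h_phi
        for k in range(n_theta):
            theta = (k + 0.5) * h_theta
            # on the unit sphere the outward normal equals the position vector
            n = (math.cos(theta) * math.sin(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(phi))
            c = curl_F(*n)
            dot = c[0] * n[0] + c[1] * n[1] + c[2] * n[2]
            total += K * dot * math.sin(phi) * h_phi * h_theta
    return total

def boundary_circulation(n=1000):
    """K times the line integral of F along r(t) = (cos t, sin t, 0)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        fx, fy, fz = F(math.cos(t), math.sin(t), 0.0)
        total += (fx * (-math.sin(t)) + fy * math.cos(t)) * h
    return K * total

print(surface_flux(), boundary_circulation(), 2 * math.pi * K)
```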
The next example is important in the theory of electricity and magnetism.
Example 10.3.4 (Divergence Theorem: Electric Fields). Consider the electric field
\[
E(x) = \frac{Q}{|x|^3}\, x
\]
where the electric charge $Q$ is located at the origin and $x = (x, y, z)$ is a position vector. We show that the electric flux of $E$ through any closed surface $S$ that encloses the origin is (recall Remark 10.2.7 for the notation),
\[
\iint_S E \cdot dS = 4\pi Q.
\]
Notice that $E$ is not defined at $(0, 0, 0)$ and this should remind you of the 1-form in Example 7.3.5 which is not defined at $(0, 0)$. The argument is similar to the one used for Example 7.3.5.
Fig. 10.12. Region $\tilde{R}$ bounded by the surfaces $S$ and $S_a$. The unit normal vectors are $\vec{n}_1$ and $\vec{n}_2$.
Let $R$ be the region enclosed by the surface $S$ and let $S_a$ be a sphere of radius $a$ centered at the origin and lying entirely in $R$. We define $\tilde{R}$ as the region bounded outside by $S$ and inside by $S_a$ and let $\partial \tilde{R}$ have positive orientation. That is, we denote by $n_1$ the unit normal vector of $S$ and $n_2$ the unit normal vector of $S_a$, which points towards the origin. By the Divergence theorem
\[
\iint_{\partial \tilde{R}} E \cdot n\, dS = \iiint_{\tilde{R}} \operatorname{div}(E)\, dV.
\]
But, it is a straightforward computation to show that $\operatorname{div}(E) = 0$. Therefore,
\[
0 = \iint_{\partial \tilde{R}} E \cdot dS = \iint_S E \cdot n_1\, dS + \iint_{S_a} E \cdot n_2\, dS.
\]
Using the unit normal vector $-n_2$ instead, the last equality can be rewritten as
\[
\iint_S E \cdot n_1\, dS = \iint_{S_a} E \cdot (-n_2)\, dS.
\]
We now compute
\[
\iint_{S_a} E \cdot (-n_2)\, dS
\]
by parametrizing the sphere with
\[
\sigma(\theta, \phi) = (a \cos\theta \sin\phi, a \sin\theta \sin\phi, a \cos\phi)
\]
and on the sphere, we have $\|x\| = a$. We define the 2-form
\[
\omega = E \cdot (dy \wedge dz, dz \wedge dx, dx \wedge dy).
\]
Thus,
\begin{align*}
\sigma^{*}\omega &= \frac{Q}{a^3}\, \sigma(\theta, \phi) \cdot (a^2 \cos\theta \sin^2\phi, a^2 \sin\theta \sin^2\phi, a^2 \cos\phi \sin\phi)\, d\phi \wedge d\theta \\
&= Q (\cos^2\theta \sin^3\phi + \sin^2\theta \sin^3\phi + \cos^2\phi \sin\phi)\, d\phi \wedge d\theta \\
&= Q \sin\phi\, d\phi \wedge d\theta
\end{align*}
and we compute
\begin{align*}
\iint_{S_a} E \cdot (-n_2)\, dS &= \int_{S_a} \omega \\
&= \int_0^{2\pi} \int_0^{\pi} Q \sin\phi\, d\phi\, d\theta \\
&= 2\pi Q \int_0^{\pi} \sin\phi\, d\phi = 4\pi Q.
\end{align*}
Therefore,
\[
\iint_S E \cdot dS = 4\pi Q
\]
for all closed surfaces $S$ enclosing a region containing the origin.
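The conclusion that the flux does not depend on the shape of $S$ can be illustrated numerically. The sketch below is an assumption-laden check, not part of the argument: it takes $S$ to be the surface of a rectangular box (a hypothetical choice), sets $Q = 1$, and approximates the outward flux through the six faces with a midpoint rule. A box enclosing the origin should give a value close to $4\pi Q$, and a box that does not enclose the origin should give a value close to 0, since $\operatorname{div}(E) = 0$ away from the origin.

```python
import math

def E(x, y, z, Q):
    """Field Q * x / |x|^3 of a point charge at the origin."""
    r3 = (x * x + y * y + z * z) ** 1.5
    return (Q * x / r3, Q * y / r3, Q * z / r3)

def flux_through_box(lo, hi, Q=1.0, n=120):
    """Outward flux of E through the boundary of the box [lo, hi] (midpoint rule)."""
    total = 0.0
    for axis in range(3):
        # the two coordinates that vary along a face normal to `axis`
        i, j = [k for k in range(3) if k != axis]
        di = (hi[i] - lo[i]) / n
        dj = (hi[j] - lo[j]) / n
        # outward normal is -e_axis on the low face and +e_axis on the high face
        for side, sign in ((lo[axis], -1.0), (hi[axis], 1.0)):
            for p in range(n):
                for q in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = side
                    point[i] = lo[i] + (p + 0.5) * di
                    point[j] = lo[j] + (q + 0.5) * dj
                    total += sign * E(point[0], point[1], point[2], Q)[axis] * di * dj
    return total

# Box enclosing the origin (not centered on it) versus a box that misses it.
print(flux_through_box((-1.0, -1.0, -0.5), (2.0, 1.0, 1.0)), 4 * math.pi)
print(flux_through_box((1.0, 1.0, 1.0), (2.0, 2.0, 2.0)))
```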
Exercises
(1) Use Classical Stokes or Divergence theorem to compute the following integrals. (Hint: transform the integrand to a 1, 2 or 3 form and use the methods of the previous section.)
(a) $\iint_S \operatorname{curl} F \cdot n\, dS$ where $F(x, y, z) = (xz - y^3 \cos z, x^3 e^z, xyz e^{x^2 + y^2 + z^2})$ and $S$ is the surface with equation $x^2 + y^2 + 2(z - 1)^2 = 6$ with $z \geq 0$ and $n$ is the unit normal vector pointing away from the $z$-axis.

(b) $\int_C F \cdot dr$ where $F(x, y, z) = (xy, yz, zx)$ where $C$ is the triangular curve joining the vertices $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$ oriented in the positive direction (counterclockwise).
(c) Compute the flux of F (x, y, z) = (y +xz, y +yz, 2xz 2 ) in the direction of
positive normal orientation across the sphere of radius a in the first octant.
(d) Let E be the region defined by x2 + y 2 + z 2 2a, x2 + y 2 a2 . Let S = E
be the boundary of $E$ which is the union of a cylindrical surface $S_1$ and a spherical surface $S_2$. Compute the flux of $F(x, y, z) = (x + yz, y - xz, z - e^x \sin y)$ in the positive orientation, across (a) $S$, (b) $S_1$, (c) $S_2$.

(2) If $\omega$ is an exact 2-form, then $\int_S \omega = 0$ for any closed surface $S$. Write this statement in terms of the curl operator.
(3) Find the flux of the vector field $F(x, y, z) = (x + yz, y + xz, z + xy)^T$ through the boundary of the region $E$ delimited by the first octant $x, y, z \geq 0$ where $z \leq 2$, $x^2 + y^2 \leq 4$ and $\partial E$ is oriented positively.
(4) Find the flux of the vector field $F(x, y, z) = (x^2, y^2, z^2)^T$ through the surface of the sphere of radius $a$ oriented positively.
Bibliography
[1] M.P. DoCarmo. Differential Forms and applications. Springer-Verlag, 1994.
[2] R. Haberman. Applied Partial Differential Equations: With Fourier Series and Boundary
Value Problems. Pearson Prentice Hall, 2004.
[3] Rachel A. Kuhn, Hermann Ansorge, Szymon Godynicki, and Wilfried Meyer. Hair density in
the eurasian otter lutra lutra and the sea otter enhydra lutris. Acta Theriologica, 55:211
222, 2010.
[4] T. D. Rudolph, P Drapeau, M-H St-Laurent, and L. Imbeau. Status of woodland caribou
(rangifer tarandus caribou) in the james bay region of northern qubec. Technical report,
Wooldand Caribou Recovery Task Force Scientific Advisory Group - Northern Qubec, Mon-
tral, 2012.
[5] Biswa Sengupta, A Aldo Faisal, Simon B Laughlin, and Jeremy E Niven. The effect of cell
size and channel density on neuronal information encoding and energy efficiency. Journal of
Cerebral Blood Flow & Metabolism, 33:14651473, 2013.
[6] G. Strang. Linear Algebra and Its Applications. Thomson, Brooks/Cole, 2006.
[7] S.H. Strogatz. Nonlinear Dynamics And Chaos. Studies in nonlinearity. Sarat Book House,
2007.