Linear algebra: SMTA 022


A STUDY GUIDE

Amartya Goswami

University of Limpopo
Preface

This study guide is a compilation of material taken from the resources listed in the
references. The notes are designed to motivate students in pure mathematics through the study of
finite-dimensional vector spaces. Problems are provided after each section to help the reader
grasp the relevant ideas.
The aim of this course is to give a basic foundation in finite-dimensional vector
spaces.
Amartya Goswami

How to do Mathematics?

When you are working through any mathematical text, never read it like a story book. You need
to study it actively. What does that mean? Suppose you come across a definition.
What should you do? First check whether you know all the “mathematical words” in
the definition (except the word being defined). If not, do the following with each unknown word:
try to find examples of it. It is even better if you come up with your own
examples that are not written in the text. Then try to find things that are not examples
of that mathematical word, that is, “counter-examples”. Once you have done this for all
unknown mathematical words, try to find examples and counter-examples of the word being defined.
On completing this process, you will understand the definition.
Next, you will probably see a theorem or a proposition. Under some extra conditions,
such results generally describe properties of the mathematical words you have encountered
before. The first thing to do is to look at the result carefully for a while. Ask yourself:
What does it mean? Does the property still hold if we remove one or more assumptions from the
statement of the theorem? Do I know all the “ingredients” needed to prove the result? Once you
are satisfied with your answers to these questions, try to prove the result yourself without looking at
the proof given in the text. Proving a result by yourself is the biggest achievement in
learning mathematics. Once you have a proof, try to understand the overall meaning of the result
and how it contributes to the theory.
Finally, solving exercises is just a way of checking your understanding of the text you
have just read. Therefore, if you have understood the text to the point where it feels trivial, solving
exercises becomes merely routine work.

Greek alphabet

Capital   Lower case   Greek name

Α         α            Alpha
Β         β            Beta
Γ         γ            Gamma
Δ         δ            Delta
Ε         ε            Epsilon
Ζ         ζ            Zeta
Η         η            Eta
Θ         θ            Theta
Ι         ι            Iota
Κ         κ            Kappa
Λ         λ            Lambda
Μ         μ            Mu
Ν         ν            Nu
Ξ         ξ            Xi
Ο         ο            Omicron
Π         π            Pi
Ρ         ρ            Rho
Σ         σ            Sigma
Τ         τ            Tau
Υ         υ            Upsilon
Φ         φ            Phi
Χ         χ            Chi
Ψ         ψ            Psi
Ω         ω            Omega

CONTENTS

Preface
How to do Mathematics?
Greek alphabet

0 Introduction
    0.1 What is linear algebra?
    0.2 Mathematical prerequisites

1 Finite-dimensional vector spaces
    1.1 Vector spaces
    1.2 Linear maps
    1.3 Vector subspaces
    1.4 Quotient spaces
    1.5 Bases and coordinates
    1.6 Dimension
    1.7 Dual spaces

2 Linear maps and matrices
    2.1 The linear map associated with a matrix
    2.2 The matrix associated with a linear map
    2.3 Bases, matrices, and linear maps

3 Eigenvectors and eigenvalues
    3.1 Eigenvectors and eigenvalues
    3.2 The characteristic polynomial

4 Scalar products and orthogonality
    4.1 Scalar products
        4.1.1 The real positive definite case
    4.2 Orthogonal bases, positive definite case

References

0 INTRODUCTION

0.1 What is linear algebra?
Mathematics is the study of mathematical structures. In particular, linear algebra is the
study of the algebraic structures called vector spaces (or linear spaces) and of the
structure-preserving maps between them, called linear transformations or linear maps. As with other
mathematical structures, from given vector spaces we construct new vector spaces: products of vector
spaces, subspaces, and quotient spaces. Every vector space has a set of free generators called a basis,
and in general a vector space may have many bases. The number of elements in a basis is the
dimension of the vector space.
Explicit calculations with a linear map usually depend on a representation of that linear
map by a matrix of scalars. Each linear map has such a matrix representation with respect to given
bases of the domain and the codomain of the map. The idea is to develop ways of
replacing operations on linear maps by operations on the corresponding matrices, as well as
methods of describing the change in the matrix representing a linear map caused by a change
in the choice of bases.

0.2 Mathematical prerequisites


In this section we summarise all the necessary definitions and results needed for this course.
We suggest [4] for details.

DEFINITION 0.2.1 The cartesian product A × B of sets A and B is defined as

A × B = {(a, b) | a ∈ A and b ∈ B}.

DEFINITION 0.2.2 A relation R : A → B is a subset of A × B; we write aRb when (a, b) ∈ R.

Instead of specifying A and B in advance, we could say that a relation is a triple R = (A, G, B),
where G is a subset of A × B called the graph of R. The sets A and B are called the domain of
R and the codomain of R, respectively. We shall also say that R is a relation from A to B.


DEFINITION 0.2.3 Let X be a set. A relation E : X → X is said to be:

(a) reflexive, if ∀x∈X (x, x) ∈ E;
(b) symmetric, if ∀x∈X ∀y∈X ((x, y) ∈ E ⇒ (y, x) ∈ E);
(c) transitive, if ∀x∈X ∀y∈X ∀z∈X (((x, y) ∈ E ∧ (y, z) ∈ E) ⇒ (x, z) ∈ E);
(d) an equivalence relation (on X), if it is reflexive, symmetric, and transitive.

Remark 0.2.1. When E : X → X is an equivalence relation and x is an element of X, we shall
write
[x]_E = {y ∈ X | (x, y) ∈ E},
and call this set the equivalence class of x under E, or simply the class of x, written [x];
another standard notation is cls_E(x) = cls(x). Note that, by reflexivity, x
is always an element of [x]_E.
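
For a relation on a small finite set, the conditions of Definition 0.2.3 and the classes [x]_E of Remark 0.2.1 can be checked by brute force. The following Python lines are only a sketch for experimenting; the helper names and the example relation (equal remainders modulo 3) are assumptions made for illustration and do not come from the text.

```python
# Hypothetical helpers: a relation E on X is stored as a set of pairs.
def is_reflexive(X, E):
    return all((x, x) in E for x in X)

def is_symmetric(E):
    return all((y, x) in E for (x, y) in E)

def is_transitive(E):
    return all((x, w) in E for (x, y) in E for (z, w) in E if y == z)

def equivalence_class(X, E, x):
    # [x]_E = {y in X | (x, y) in E}
    return {y for y in X if (x, y) in E}

X = set(range(6))
# Illustrative relation: x and y leave the same remainder on division by 3.
E = {(x, y) for x in X for y in X if x % 3 == y % 3}

# Collect the distinct equivalence classes of E.
classes = {frozenset(equivalence_class(X, E, x)) for x in X}
```

In this example the distinct classes are pairwise disjoint and their union is X.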

DEFINITION 0.2.4 A set S of subsets of X is said to be a partition of X if it satisfies the
following conditions:
(a) ∪S = {x ∈ X | ∃A∈S x ∈ A} = X;
(b) ∀A∈S ∀B∈S (A ∩ B ≠ ∅ ⇒ A = B).

DEFINITION 0.2.5 A relation R : A → B is said to be a map, or a function, if for every a in A
there exists a unique b in B with aRb; we then write R(a) = b, and call b the image of a under
R, or simply the R-image of a. We also say that the function R maps a to b, and denote this
by a ↦ b.

DEFINITION 0.2.6 Given maps f : A → B and g : B → C, the composite of f and g,
written g ◦ f : A → C, is defined by

g ◦ f = {(a, c) ∈ A × C | c = g(f(a))}.

Remark 0.2.2. Given sets A, B, C, and maps f : A → B, g : B → C, and h : A → C, whenever
we have h = g ◦ f, we say that the following diagram commutes:

          f
     A ------> B
      \        |
     h \       | g
        v      v
           C

DEFINITION 0.2.7 A map f : A → B is said to be:

(a) injective (or an injection), if ∀a∈A ∀a′∈A (f(a) = f(a′) ⇒ a = a′), that is, if f(a) = f(a′) in
B only when a = a′ in A;
(b) surjective (or a surjection), if ∀b∈B ∃a∈A f(a) = b, that is, for every b in B, there exists a
in A with f(a) = b;
(c) bijective (or a bijection), if it is injective and surjective at the same time.
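
For maps between small finite sets, the three conditions of Definition 0.2.7 can be tested directly. The Python sketch below is an illustration only: the function names and the two example maps are hypothetical, and `is_surjective` assumes that f really maps A into B.

```python
# Hypothetical helpers; A is given as a list of distinct elements.
def is_injective(f, A):
    images = [f(a) for a in A]
    # f is injective exactly when no image value is repeated.
    return len(set(images)) == len(images)

def is_surjective(f, A, B):
    # Assumes f(a) lies in B for every a in A.
    return {f(a) for a in A} == set(B)

def is_bijective(f, A, B):
    return is_injective(f, A) and is_surjective(f, A, B)

A = [-2, -1, 0, 1, 2]
square = lambda a: a * a   # viewed as a map A -> {0, 1, 4}
shift = lambda a: a + 1    # viewed as a map A -> {-1, 0, 1, 2, 3}

square_report = (is_injective(square, A), is_surjective(square, A, [0, 1, 4]))
shift_report = is_bijective(shift, A, [-1, 0, 1, 2, 3])
```

Here `square` is surjective onto {0, 1, 4} but not injective, since square(−1) = square(1), while `shift` is a bijection.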

DEFINITION 0.2.8 Suppose that g : T → S and f : S → T, so that the composite f ◦ g is defined.
If this composite is the identity, 1_T = f ◦ g, call f a left inverse of g and g a right inverse of
f. When the composites in both orders are identities, so that f ◦ g = 1_T and g ◦ f = 1_S, call f
a two-sided inverse of g (and hence g a two-sided inverse of f).

THEOREM 0.2.1 A function with non-empty domain is an injection if and only if it has a left
inverse. A function is a surjection if and only if it has a right inverse.

DEFINITION 0.2.9 Let A be a subset of a set B. The inclusion map i : A → B, defined by
i(a) = a for each a in A, is always injective. It is surjective, or, equivalently, bijective, if and
only if A = B, in which case it is called the identity map of A and denoted by 1_A.

Remark 0.2.3. A map f : A → B is a bijection if and only if for every b in B there exists a
unique a in A with f (a) = b.

THEOREM 0.2.2 A map f : A → B is a bijection if and only if there exists a map g : B → A
such that g ◦ f = 1_A and f ◦ g = 1_B.

DEFINITION 0.2.10 Let f : A → B be a map. For a subset X of A, the image f(X) of X under
f is defined by
f(X) = {b ∈ B | ∃x∈X f(x) = b}.
For a subset X of B, the inverse image f⁻¹(X) of X under f is
defined by
f⁻¹(X) = {a ∈ A | f(a) ∈ X}.
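
The two constructions of Definition 0.2.10 are easy to compute for a map on a finite set. The following Python sketch is meant only as an illustration; the helper names and the squaring map on A = {−2, …, 2} are assumptions chosen for the example.

```python
# Hypothetical helpers for a map f defined on a finite set A.
def image(f, X):
    # f(X) = {f(x) | x in X}
    return {f(x) for x in X}

def inverse_image(f, A, Y):
    # f^{-1}(Y) = {a in A | f(a) in Y}
    return {a for a in A if f(a) in Y}

A = {-2, -1, 0, 1, 2}
f = lambda a: a * a

img = image(f, {1, 2})
pre = inverse_image(f, A, {1, 4})
# Taking the inverse image of an image can enlarge the original subset.
round_trip = inverse_image(f, A, image(f, {1}))
```

Here f⁻¹(f({1})) = {−1, 1} is strictly larger than {1}, which reflects the fact that this f is not injective.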

DEFINITION 0.2.11 A binary operation on a set A is a map ◦ : A × A → A.

Remark 0.2.4. Using such an operation we shall often write a ◦ b instead of ◦(a, b).

DEFINITION 0.2.12 A binary operation ◦ : A × A → A is called

(a) associative if x ◦ (y ◦ z) = (x ◦ y) ◦ z for all x, y, z ∈ A;
(b) commutative if x ◦ y = y ◦ x for all x, y ∈ A;
(c) idempotent if x ◦ x = x for all x ∈ A.
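
On a finite set, each property in Definition 0.2.12 can be checked by running through all elements. The sketch below uses hypothetical helper names and two illustrative operations on integers. Note the limitation: on a finite sample of an infinite set, a failed check disproves the property, whereas a passed check is only evidence and not a proof.

```python
from itertools import product

# Hypothetical brute-force checks over a finite collection A.
def is_associative(op, A):
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product(A, repeat=3))

def is_commutative(op, A):
    return all(op(x, y) == op(y, x) for x, y in product(A, repeat=2))

def is_idempotent(op, A):
    return all(op(x, x) == x for x in A)

A = list(range(-3, 4))
subtract = lambda x, y: x - y

max_report = (is_associative(max, A), is_commutative(max, A), is_idempotent(max, A))
sub_report = (is_associative(subtract, A), is_commutative(subtract, A),
              is_idempotent(subtract, A))
```

Taking the maximum satisfies all three properties on A, while subtraction already fails all three on this sample.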

DEFINITION 0.2.13 Let ◦ be a binary operation on a set X, and let ◦′ be another such operation
on a set X′. A morphism or a homomorphism f : (X, ◦) → (X′, ◦′) is defined to be a map from
X to X′ such that f(x ◦ y) = f(x) ◦′ f(y) for all x, y ∈ X.

DEFINITION 0.2.14 A morphism f : (X, ◦) → (X′, ◦′) is said to be

(a) an endomorphism if X = X′ and ◦ = ◦′;
(b) an isomorphism if f is a bijection;
(c) an automorphism if f is a bijection and an endomorphism.

THEOREM 0.2.3 The inverse of an isomorphism is an isomorphism.

Remark 0.2.5. Two sets (X, ◦) and (X′, ◦′), each with a binary operation, are called isomorphic
when there exists some map between them which is an isomorphism; we denote this by
(X, ◦) ≅ (X′, ◦′).

DEFINITION 0.2.15 A group is a set G together with a binary operation G × G → G, written
(a, b) ↦ ab, such that:
(i) The operation is associative.
(ii) There is an element e ∈ G with ea = a = ae for all a ∈ G.
(iii) For this element e, there is for each element a ∈ G an element a′ ∈ G with aa′ = e = a′a.

Remark 0.2.6. The element ab is called the product of a and b in G, while e is the unit or the
identity of G, and a0 the inverse of a in G. Here the binary operation in G has been written as
a product; often it may be written as a sum (a, b) ↦ a + b; we say accordingly that the group is
multiplicative or additive. For additive or multiplicative groups we denote the corresponding
identity elements by 0 or 1, whereas the inverse of an element a is denoted by −a or 1/a. The
letter G will stand both for the set of elements of the group and for this set together with its
binary operation.
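
For a small finite set with an operation, the three axioms of Definition 0.2.15 can be verified exhaustively. The Python sketch below is an illustration under stated assumptions: `is_group` is a hypothetical helper, and the examples (integers modulo 6 and the non-zero integers modulo 7) are chosen here and are not taken from the text.

```python
from itertools import product

def is_group(G, op):
    """Brute-force check of the group axioms on a finite set G (hypothetical helper)."""
    G = list(G)
    # The operation must map G x G into G.
    closed = all(op(a, b) in G for a, b in product(G, repeat=2))
    # Axiom (i): associativity.
    associative = all(op(a, op(b, c)) == op(op(a, b), c)
                      for a, b, c in product(G, repeat=3))
    # Axiom (ii): existence of an identity element e.
    identities = [e for e in G if all(op(e, a) == a == op(a, e) for a in G)]
    if not (closed and associative and identities):
        return False
    e = identities[0]
    # Axiom (iii): every element has an inverse with respect to e.
    return all(any(op(a, b) == e == op(b, a) for b in G) for a in G)

add_mod6 = lambda a, b: (a + b) % 6
mul_mod6 = lambda a, b: (a * b) % 6
mul_mod7 = lambda a, b: (a * b) % 7

additive = is_group(range(6), add_mod6)         # additive group, identity 0
not_a_group = is_group(range(6), mul_mod6)      # 0 has no multiplicative inverse
multiplicative = is_group(range(1, 7), mul_mod7)  # multiplicative group, identity 1
```

The first and third examples illustrate the additive and multiplicative notation of Remark 0.2.6.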

DEFINITION 0.2.16 A group G is called abelian if the binary operation is commutative. A
subset S of a group G is called a subgroup of G if S itself is a group under the binary operation
of G.

Remark 0.2.7. If G and G′ are groups, a morphism f : G → G′ of groups is a map from G
to G′ which is a morphism for the binary operations involved. In case G and G′ are both
multiplicative groups, this requirement means that f(ab) = f(a)f(b) for all a, b ∈ G, whereas
in case G and G′ are both additive groups, this requirement means that f(a + b) = f(a) + f(b)
for all a, b ∈ G.

THEOREM 0.2.4 In a group G, the identity element is unique and for each element a ∈ G the
inverse element a′ of a is also unique.

THEOREM 0.2.5 Any group homomorphism preserves identity and inverse elements.

DEFINITION 0.2.17 Let f : G → H be any morphism of groups. The image of f is the set
Im f = {f(a) ∈ H | a ∈ G}. The set Ker f = {a ∈ G | f(a) = 0}, where 0 denotes the identity
element of H (written additively), is called the kernel of the morphism f.

THEOREM 0.2.6 For any morphism f : G → H of groups, the sets Im f and Ker f are subgroups
of H and G respectively.

THEOREM 0.2.7 A group morphism f : G → H is

(a) a surjection if and only if Im f = H;
(b) an injection if and only if Ker f = 0;
(c) an isomorphism if and only if Im f = H and Ker f = 0.

DEFINITION 0.2.18 A field is a set F with two binary operations, addition + and multiplication ·,
such that (F, +, 0) and (F \ {0}, ·, 1) are abelian groups, and for all α, β, γ ∈ F the
following distributive property holds:

α · (β + γ) = α · β + α · γ.

Remark 0.2.8. Having in mind · as the multiplicative operation, we often write the product of
two elements α and β of F as αβ instead of α · β .

DEFINITION 0.2.19 A morphism of fields α : F → F′ is a map from one field F to a second
field F′ which is a morphism of addition, of multiplication, and of multiplicative identity. In
other words, α must satisfy the conditions:

α(a + b) = α(a) + α(b), α(ab) = α(a)α(b), α(1) = 1′

for all elements a, b ∈ F, where 1 is the multiplicative identity of F and 1′ that of F′.

DEFINITION 0.2.20 A subset of a field F is a subfield if it contains the multiplicative identity
and is closed under subtraction, multiplication, and multiplicative inverse (of non-zero
elements).

Review problems
PROBLEM 0.2.1 Let ℝ be the set of real numbers. Which of the following relations ℝ → ℝ
are maps?
(a) {(x, y) ∈ ℝ × ℝ | x = y};
(b) {(x, y) ∈ ℝ × ℝ | x ≠ y};
(c) {(x, y) ∈ ℝ × ℝ | x² = y};
(d) {(x, y) ∈ ℝ × ℝ | xy = 1}.

PROBLEM 0.2.2 Which of the following define binary operations on the set of integers? Of
those that do, which are associative? Which are commutative? Which are idempotent?
(a) m ◦ n = mn + 1;
(b) m ◦ n = (m + n)/2;
(c) m ◦ n = m;
(d) m ◦ n = mn²;
(e) m ◦ n = m² + n²;
(f) m ◦ n = 3.

PROBLEM 0.2.3 Let α, β, and γ be maps ℤ → ℤ defined by α(n) = 2n, β(n) = n + 1,
and γ(n) = n². Write a formula for each of the following compositions:
(a) α ◦ α;
(b) γ ◦ α;
(c) α ◦ β;
(d) β ◦ β;
(e) β ◦ γ;
(f) γ ◦ γ.

PROBLEM 0.2.4 For each of the following maps, decide whether it is an injection, a surjection,
or a bijection:
(a) f : ℕ → ℤ; f(x) = 2x;
(b) f : ℤ → ℤ; f(x) = x − 4;
(c) f : ℕ → ℕ; f(x) = x²;
(d) f : ℝ → ℝ; f(x) = x³.

PROBLEM 0.2.5 Each of the following maps is from ℝ to ℝ. Find the inverse map in each
case:
(a) f(x) = 5x;
(b) f(x) = x − 4;
(c) f(x) = −x/2;
(d) f(x) = x³.

PROBLEM 0.2.6 Let ℝ be the set of real numbers, and R : ℝ → ℝ a relation. In which of the
following cases is it (i) reflexive? (ii) symmetric? (iii) transitive?
(a) R = {(x, y) ∈ ℝ × ℝ | 3x + 3x + 1 = 3y + 3y + 1};
(b) R = {(x, y) ∈ ℝ × ℝ | x + y = 0};
(c) R = {(x, y) ∈ ℝ × ℝ | x² − y² = 0};
(d) R = {(x, y) ∈ ℝ × ℝ | x ≤ y}.

PROBLEM 0.2.7 Let ℝ be the set of real numbers, f : ℝ → ℝ a map, and E the equivalence
relation on ℝ defined by E = {(x, y) ∈ ℝ × ℝ | f(x) = f(y)}. Describe the equivalence classes
of the numbers 0, 1, and 2 in the following cases:
(a) f(x) = 2x + 1 for all x ∈ ℝ;
(b) f(x) = x³ for all x ∈ ℝ;
(c) f(x) = x³ + x for all x ∈ ℝ;
(d) f(x) = x³ − x for all x ∈ ℝ.

PROBLEM 0.2.8 (a) Prove that any group G satisfies the left and right cancellation laws:
for all a, b, c ∈ G, ab = ac ⇒ b = c; ba = ca ⇒ b = c.
(b) Let e be the identity element of a group G. Prove that, for all a, b ∈ G,

e⁻¹ = e, (a⁻¹)⁻¹ = a, (ab)⁻¹ = b⁻¹a⁻¹.

(c) Prove that any homomorphism f : G → G′ of groups preserves the identity and
inverse elements, i.e. f(e) = e′ and f(a⁻¹) = (f(a))⁻¹.
(d) Assume that a, b are elements of a group G. Show that ab = ba if and only if (ab)² = a²b².

PROBLEM 0.2.9 Let F be any field and a, b, c, d ∈ F. Show that the following hold:
(a) (a/b) + (c/d) = (ad + bc)/(bd), b, d ≠ 0;
(b) (a/b)(c/d) = (ac)/(bd), b, d ≠ 0;
(c) −(a/b) = (−a)/b = a/(−b), b ≠ 0;
(d) (a/b)⁻¹ = b/a, a, b ≠ 0;
(e) (a/b)/(c/d) = (ad)/(bc), b, c, d ≠ 0.

PROBLEM 0.2.10 (a) Show that for any field morphism α : F → F′ and a, b ∈ F with b ≠ 0 we
have α(a/b) = α(a)/α(b).
(b) Show that every morphism of fields is an injection.

1 FINITE-DIMENSIONAL VECTOR SPACES

1.1 Vector spaces
DEFINITION 1.1.1 A vector space V over a field F is an additive abelian group together with
a map F × V → V, written (α, v) ↦ αv, and subject to the following axioms, for all elements
α, β ∈ F and v, w ∈ V:

α(v + w) = αv + αw,    (1)
(α + β)v = αv + βv,    (2)
(αβ)v = α(βv),         (3)
1v = v.                (4)

Remark 1.1.1. The elements of the vector space V are called the vectors. The binary operation
of the abelian group V is called the addition, and the additive identity of the abelian group
V is called the zero vector or the null vector of V. The elements of the field F are called
the scalars, and the map F × V → V is called the scalar multiplication. The multiplicative
identity of F is denoted by 1.
Remark 1.1.2. For the rest of this course V will denote a vector space over a field F, and our
fields will be either ℝ or ℂ unless otherwise stated. If F = ℝ, we call V a real vector space,
whereas if F = ℂ, we call V a complex vector space.
Remark 1.1.3. We will use the notation u − v for the sum u + (−v) of the two vectors u and
−v, where −v is the additive inverse of v. We will use the same symbol 0 to denote both the
zero vector of V (that is, the additive identity of V) and the scalar zero of F (that is, the additive
identity of F).

PROPOSITION 1.1.1 For every v ∈ V and α ∈ F:

(a) 0v = 0.
(b) α0 = 0.
(c) (−α)v = α(−v) = −αv.
(d) αv = 0 implies either α = 0 or v = 0.


Proof. (a) 0v + v = 0v + 1v = (0 + 1)v = 1v = v. Adding −v to both sides shows that 0v = 0.


(b), (c), and (d) are left as exercises.

PROPOSITION 1.1.2 Let U and V be two vector spaces over the same field F. The cartesian
product U × V forms a vector space over F with respect to the following two operations:

(u, v) + (u′, v′) = (u + u′, v + v′),
α(u, v) = (αu, αv),

where u, u′ ∈ U, v, v′ ∈ V, and α ∈ F.

Proof. Exercise.
Remark 1.1.4. The vector space U ×V in Proposition 1.1.2 is called the product of the vector
spaces U and V.

Problems
PROBLEM 1.1.1 Show that in a vector space there is only one zero vector.

PROBLEM 1.1.2 Let V be a vector space and v ∈ V. Show that there is exactly one vector w ∈ V
such that v + w = w + v = 0.

PROBLEM 1.1.3 Let V be a vector space and v, w two vectors of V. If v + w = 0, show that
w = −v.

PROBLEM 1.1.4 Let V be a vector space, and v, w two vectors of V such that v + w = v. Show
that w = 0.

PROBLEM 1.1.5 In any vector space V over a field F, prove that n(αv) = α(nv), where α ∈ F,
v ∈ V and n ∈ ℤ.

PROBLEM 1.1.6 If F is a field, show that Fⁿ is a vector space over F under the following
two operations:
(i) (α₁, α₂, . . . , αₙ) + (β₁, β₂, . . . , βₙ) = (α₁ + β₁, α₂ + β₂, . . . , αₙ + βₙ),
(ii) α(α₁, α₂, . . . , αₙ) = (αα₁, αα₂, . . . , ααₙ),
where α ∈ F, (α₁, α₂, . . . , αₙ) ∈ Fⁿ, and (β₁, β₂, . . . , βₙ) ∈ Fⁿ.

PROBLEM 1.1.7 Let S be a set and F be a field. Let V be the set of all maps f : S → F of S
into F. Prove that V is a vector space over F under the following two operations:
(i) (f + g)(x) = f(x) + g(x),
(ii) (αf)(x) = αf(x),
where α ∈ F, and f, g ∈ V.

PROBLEM 1.1.8 Which of the following sets of polynomials in ℝ[x] are vector spaces over ℝ:
(i) all polynomials of degree exactly 4;
(ii) all polynomials of degree at most 4;
(iii) all monic polynomials;
(iv) all polynomials of even degree?

PROBLEM 1.1.9 Prove that the set V = {f : ℝ → ℝ | f(x) ≠ 0 ∀x ∈ ℝ} does not form a vector
space over ℝ under the following two operations:
(i) (f + g)(x) = f(x) + g(x), (ii) (αf)(x) = αf(x), where α ∈ ℝ, and f, g ∈ V.

PROBLEM 1.1.10 Let V = {x ∈ ℝ | x > 0}. For x, y ∈ V and α ∈ ℝ define addition and scalar
multiplication as: x + y = xy and αx = x^α. Show that V is a vector space over ℝ.

1.2 Linear maps


DEFINITION 1.2.1 Let V and V′ be vector spaces over the field F. An F-linear map, or
F-linear transformation, or morphism f : V → V′ is a map which satisfies the following
two properties:
(a) For all v, v′ ∈ V we have f(v + v′) = f(v) + f(v′).
(b) For all α ∈ F and v ∈ V we have f(αv) = αf(v).

Remark 1.2.1. Since we usually deal with a fixed field F, we omit the prefix F, and say simply
that f is linear.
Remark 1.2.2. A linear map f : V → V (that is an endomorphism) is also often called an
operator, whereas linear maps f : V → F with codomain F (vector space F over itself) are
called linear functionals or linear forms.
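
The two conditions of the definition of a linear map can be spot-checked numerically on sample vectors. The following Python sketch is an illustration only: the helper names and the two example maps on pairs of integers are hypothetical. A failed check shows that a map is not linear, while a passed check on finitely many samples is merely evidence of linearity, not a proof.

```python
from itertools import product

# Vectors are modelled as tuples of integers (an assumption for this sketch).
def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    return tuple(k * a for a in v)

def looks_linear(f, vectors, scalars):
    # Condition (a): f(v + w) = f(v) + f(w) on all sample pairs.
    additive = all(f(add(v, w)) == add(f(v), f(w))
                   for v, w in product(vectors, repeat=2))
    # Condition (b): f(kv) = k f(v) on all sample scalars and vectors.
    homogeneous = all(f(scale(k, v)) == scale(k, f(v))
                      for k in scalars for v in vectors)
    return additive and homogeneous

vectors = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
scalars = range(-3, 4)

f = lambda v: (v[0] - v[1], 3 * v[0])   # passes both checks on the samples
g = lambda v: (v[0] + 1, v[1])          # a shift by (1, 0); fails the checks

f_ok = looks_linear(f, vectors, scalars)
g_ok = looks_linear(g, vectors, scalars)
```

The map g fails already because g(0, 0) = (1, 0) is not the zero vector.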

THEOREM 1.2.1 Let U × V be the product of vector spaces U and V over the same field
F. Define the maps π₁ : U × V → U and π₂ : U × V → V by π₁(u, v) = u and π₂(u, v) = v
respectively. Then
(a) the maps π₁ and π₂ are linear.
(b) For any vector space W with two linear maps f : W → U and g : W → V, there exists
a unique linear map h : W → U × V such that the two triangles in the following diagram
commute:

            π₁                π₂
     U <--------- U × V ---------> V
       ^            ^            ^
        \           |           /
       f \          | h        / g
          \         |         /
                    W

that is, f = π₁ ◦ h and g = π₂ ◦ h.

Proof. (a) For (u, v), (u′, v′) ∈ U × V and α ∈ F, we observe that

π₁((u, v) + (u′, v′)) = π₁(u + u′, v + v′) = u + u′ = π₁(u, v) + π₁(u′, v′),

and
π₁(α(u, v)) = π₁(αu, αv) = αu = απ₁(u, v).
Therefore π₁ is linear. The proof of linearity of π₂ is similar.
(b) Define a map h : W → U × V by h(w) = (f(w), g(w)). Immediately we notice that for
w ∈ W, (π₁ ◦ h)(w) = π₁(h(w)) = π₁(f(w), g(w)) = f(w), that is, f = π₁ ◦ h. The proof of
g = π₂ ◦ h is similar.
For linearity of h, we notice that
h(w + w′) = (f(w + w′), g(w + w′))
= (f(w) + f(w′), g(w) + g(w′))
= (f(w), g(w)) + (f(w′), g(w′))
= h(w) + h(w′),
and
h(αw) = (f(αw), g(αw))
= (αf(w), αg(w))
= α(f(w), g(w))
= αh(w).
Finally, h is the only such map: if h′ : W → U × V also satisfies f = π₁ ◦ h′ and g = π₂ ◦ h′,
then for every w ∈ W the two components of h′(w) are π₁(h′(w)) = f(w) and π₂(h′(w)) = g(w),
so h′(w) = (f(w), g(w)) = h(w).
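
The map h(w) = (f(w), g(w)) from part (b) of the proof can be written down concretely. The Python sketch below is only an illustration under assumptions: it takes W = ℝ², U = ℝ and V = ℝ² with integer sample points, and the names `pi1`, `pi2`, `pair` and the particular maps f and g are chosen here for the example.

```python
# Projections of the product U x V, with elements stored as pairs (u, v).
def pi1(p):
    return p[0]

def pi2(p):
    return p[1]

def pair(f, g):
    # The map h : W -> U x V with h(w) = (f(w), g(w)).
    return lambda w: (f(w), g(w))

f = lambda w: 2 * w[0] + w[1]   # illustrative linear map W -> U
g = lambda w: (w[1], -w[0])     # illustrative linear map W -> V
h = pair(f, g)

samples = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
# Both triangles commute on every sample point: f = pi1 . h and g = pi2 . h.
triangles_commute = all(pi1(h(w)) == f(w) and pi2(h(w)) == g(w) for w in samples)
```

The check mirrors the equalities f = π₁ ◦ h and g = π₂ ◦ h on the chosen sample points.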

PROPOSITION 1.2.1 An ℝ-linear map f : ℝ → V is determined by the vector f(1).

Proof. For any α ∈ ℝ, we have f(α) = f(α · 1) = αf(1), where α · 1 is the scalar multiple of
the vector 1 ∈ ℝ by the scalar α. Therefore a linear
map f : ℝ → V is completely determined by the vector f(1).

THEOREM 1.2.2 If V and V′ are two vector spaces over a field F, then the set Hom_F(V, V′) =
{f | f : V → V′ is a linear map} is an abelian group under pointwise addition of maps.

Remark 1.2.3. Hereafter “hom”, with lower case “h”, refers to the set of these morphisms f,
and “Hom”, with capital “H”, to the additive group of these f.
Proof. The pointwise sum of two morphisms f, g : V → V′ is the map f + g defined for all v ∈
V by (f + g)(v) = f(v) + g(v). For any scalar κ,
(f + g)(κv) = f(κv) + g(κv) = κf(v) + κg(v) = κ(f(v) + g(v)) = κ[(f + g)(v)],
so f + g is a morphism for each scalar multiplication v ↦ κv. It is also a morphism for addition
because, for any two vectors v, w ∈ V,
(f + g)(v + w) = f(v + w) + g(v + w) = f(v) + f(w) + g(v) + g(w),
while
(f + g)(v) + (f + g)(w) = f(v) + g(v) + f(w) + g(w);
the two results are equal because addition in V′ is commutative. This sum gives a binary
operation (f, g) ↦ f + g on Hom_F(V, V′). As a pointwise sum, it is associative and
commutative. The zero for this sum is the morphism 0 which sends every v to 0 (the zero morphism
0 : V → V′). The additive inverse of f is the map −f defined by (−f)(v) = −f(v). Hence
Hom_F(V, V′) is indeed an abelian group.

THEOREM 1.2.3 For F-linear maps in the configuration

     U --f--> V ==g, g′==> V′ --h--> W

(that is, f : U → V, two parallel maps g, g′ : V → V′, and h : V′ → W),
both distributive laws hold: h ◦ (g + g′) = h ◦ g + h ◦ g′, (g + g′) ◦ f = g ◦ f + g′ ◦ f.

Proof. To prove the first law, we take any v ∈ V and write

[h ◦ (g + g′)](v) = h(g(v) + g′(v)) = h(g(v)) + h(g′(v)) = (h ◦ g + h ◦ g′)(v),

using the fact that h is a linear map. The proof of the second distributive law is similar, but
does not use the linearity of f.

LEMMA 1.2.1 If f : V → V′ is an F-linear map then for each α ∈ F the pointwise multiple αf
is also an F-linear map from V to V′.

Proof. Here αf : V → V′ is the map defined by (αf)(v) = αf(v) for each v ∈ V. Clearly, αf
is a morphism of addition (prove it!). Moreover, for any scalar κ and any v ∈ V,

(αf)(κv) = α[f(κv)] = α[κf(v)] = (ακ)f(v) = (κα)f(v) = κ[αf(v)] = κ[(αf)(v)];

here we can write ακ = κα because multiplication in F is commutative. Hence,
αf : V → V′ is an F-linear map, as asserted.

THEOREM 1.2.4 For vector spaces V and V′ over a field F, the set Hom_F(V, V′) is itself a
vector space over F under pointwise addition and pointwise scalar multiplication.

Proof. The proof is a straightforward verification of the vector space axioms for the multiples
(α, f) ↦ αf defined by the above lemma.

THEOREM 1.2.5 A linear map f : V → V′ is

(a) a surjection if and only if Im f = V′;
(b) an injection if and only if Ker f = 0;
(c) an isomorphism if and only if Im f = V′ and Ker f = 0.

Proof. (a) The statement just reformulates the definition, for f is a surjection precisely when
Im f = V′.
(b) We know f(0) = 0. Therefore 0 ∈ Ker f. Suppose f is an injection and v ∈ Ker f. Then

f(v) = 0 = f(0) ⇒ v = 0,

where we have used the fact that f is an injection. Conversely, suppose that Ker f = 0. If
f(v) = f(w), then f(v − w) = f(v) − f(w) = 0, so v − w ∈ Ker f. Since Ker f = 0, we have
v = w; hence, f is indeed injective.
(c) The statement follows from (a) and (b).
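
Part (b) can be illustrated by searching for kernel vectors of concrete maps ℝ² → ℝ². The Python sketch below is an illustration under assumptions: the two maps and the helper `kernel_on_grid` are hypothetical, and the search only looks at integer points in a small grid. Finding a non-zero kernel vector proves that a linear map is not injective; finding none in the grid proves nothing by itself.

```python
def kernel_on_grid(f, grid):
    # Collect the grid vectors that f sends to the zero vector.
    return [v for v in grid if f(v) == (0, 0)]

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

f = lambda v: (v[0] + 2 * v[1], v[1])           # illustrative map with Ker f = 0
g = lambda v: (v[0] - v[1], 2 * v[0] - 2 * v[1])  # illustrative map with non-zero kernel

ker_f = kernel_on_grid(f, grid)
ker_g = kernel_on_grid(g, grid)
# A non-zero kernel vector gives two different vectors with the same image.
collision = g((1, 1)) == g((0, 0))
```

For f, the equations x + 2y = 0 and y = 0 force x = y = 0, which is the algebraic reason the search finds only the zero vector.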

Problems
PROBLEM 1.2.1 Determine which of the following maps f are linear:
(i) f : ℝ³ → ℝ² defined by f(x, y, z) = (x, z).
(ii) f : ℝ² → ℝ² defined by f(x, y) = (2x + y, y).
(iii) f : ℝ² → ℝ² defined by f(x, y) = (2, y − x).
(iv) f : ℝ² → ℝ² defined by f(x, y) = (y, x).
(v) f : ℝ² → ℝ defined by f(x, y) = xy.

PROBLEM 1.2.2 Let f : V → V′ be a linear map. Show that:
(i) f(0) = 0.
(ii) f(−v) = −f(v) for all v ∈ V.

PROBLEM 1.2.3 Let V be a vector space over ℝ, and let v, w ∈ V. The line passing through v
and parallel to w is defined to be the set of all vectors v + tw with t ∈ ℝ. The line segment
between v and v + w is defined to be the set of all vectors v + tw with 0 ≤ t ≤ 1. Let f : V → U
be a linear map. Show that the image under f of a line segment in V is a line segment in U.
Between what points? Show that the image of a line under f is either a line or a point.

PROBLEM 1.2.4 Let V be the vector space of maps which have derivatives of all orders, and
let D : V → V be the derivative. What is the kernel of D?

PROBLEM 1.2.5 Let V be the space of all infinitely differentiable functions, and let D : V → V
be the derivative.
(i) Let L = D − I, where I is the identity map. What is the kernel of L?
(ii) Same question if L = D − aI, where a is a real number.

PROBLEM 1.2.6 Let f : ℝ² → ℝ² be the linear map indicated. Show that f is a bijection
in each case.
(i) f(x, y) = (x + y, x − y).
(ii) f(x, y) = (2x + y, 3x − 5y).

PROBLEM 1.2.7 (i) Let L : V → V be a linear map such that L² = O. Show that I − L is a
bijection.
(ii) Let L : V → V be a linear map such that L² + 2L + I = O. Show that L is a bijection.
(iii) Let L : V → V be a linear map such that L³ = O. Show that I − L is a bijection.
Here Lⁿ = L ◦ L ◦ · · · ◦ L is the composition of n copies of L, and I is the identity map on V.

1.3 Vector subspaces


DEFINITION 1.3.1 A non-empty subset S of a vector space V is called a subspace of V if S is
closed under (vector) addition and under scalar multiplication, that is,
(a) s + s′ ∈ S for all s, s′ ∈ S;
(b) αs ∈ S for all α ∈ F and for all s ∈ S.

Remark 1.3.1. We notice that every subspace S of a vector space V over a field F is itself a
vector space over the same field F.
Remark 1.3.2. Among the subspaces of V, the subspace V itself and the set 0 consisting of
the zero vector alone are called the trivial or improper subspaces of V. Any subspace of V
different from these two is said to be a non-trivial or proper subspace of V.

PROPOSITION 1.3.1 The set of subspaces of a vector space V satisfies the following properties:
(a) Every subspace S is a subspace of itself.
(b) Let S and T be two subspaces of V. If S is a subspace of T and T is a subspace of S then
S = T.
(c) Let S, T, and U be three subspaces of V. If S is a subspace of T, and T is a subspace of U
then S is a subspace of U.

Proof. Exercise.

PROPOSITION 1.3.2 If S and T are two subspaces of a vector space V and v ∈ V, then the
following are also subspaces of V:
(a) the intersection S ∩ T = {v | v ∈ S and v ∈ T};
(b) the sum S + T = {s + t | s ∈ S and t ∈ T};
(c) the span Fv = {αv | α ∈ F}.

Proof. Exercise.

THEOREM 1.3.1 For any linear map f : V → V′ the sets Im f and Ker f are subspaces of V′ and
V respectively.

Remark 1.3.3. Sometimes Ker f is called the null space and Im f the range space of f .

PROPOSITION 1.3.3 If S and T are two subspaces of a vector space V, then the following
properties are equivalent:
(a) S ∩ T = {0}.
(b) Every vector in S + T can be written uniquely in the form s + t, where s ∈ S and t ∈ T.

Proof. To show that (b) implies (a), we note that if v ∈ S ∩ T, then v = v + 0 = 0 + v are two
ways of writing v as a sum of a vector of S and a vector of T, and the
uniqueness forces v = 0. Conversely, if (a) holds, then for s, s′ ∈ S and t, t′ ∈ T, s + t = s′ + t′
gives s − s′ = t′ − t. But s − s′ ∈ S and t′ − t ∈ T. Hence s − s′ = t′ − t ∈ S ∩ T = {0}, so
s = s′ and t = t′.
Remark 1.3.4. When conditions (a) and (b) of the Proposition 1.3.3 are satisfied, we say S + T
is a direct sum of S and T. If, also, V = S + T, we say that S and T are supplementary
subspaces of V.
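
As a small worked instance of a direct sum, one may take V = ℝ² with S = {(x, 0)} and T = {(t, t)}; these two subspaces are chosen here purely for illustration. Their intersection is {0}, and every vector (a, b) splits as (a − b, 0) + (b, b). The Python sketch below, with the hypothetical helper `decompose`, computes this splitting.

```python
def decompose(v):
    """Split v = (a, b) as s + t with s in S = {(x, 0)} and t in T = {(t, t)}."""
    a, b = v
    s = (a - b, 0)   # component in S
    t = (b, b)       # component in T
    return s, t

v = (5, 2)
s, t = decompose(v)
# The two components add back up to v.
recombined = (s[0] + t[0], s[1] + t[1])
```

The splitting is forced: the second coordinate of v determines t, and then s = v − t, in line with the uniqueness in condition (b).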

DEFINITION 1.3.2 Let a ∈ V. The map f_a : V → V defined by f_a(v) = a + v is called the
translation by a. If S is any subset of V, then f_a(S) = {a + s | s ∈ S} is called the translation
of S by a. We often denote it by a + S. If S is a subspace of a vector space V, then a + S is
called a linear affine variety or simply a linear variety of V.

THEOREM 1.3.2 (i) If S is a subspace of V and a ∈ S, then a + S = S.
(ii) If S, T are two subspaces of V and a, b two vectors in V, then a + S ⊆ b + T if and only if
S ⊆ T and a − b ∈ T.

Proof. (i) Since S is closed under addition, a ∈ S implies a + s ∈ S for each s ∈ S, so
a + S ⊆ S. Conversely, each s ∈ S can be written as s = a + (s − a) with s − a ∈ S, so S ⊆ a + S.
(ii) Now a + S ⊆ b + T if and only if every s ∈ S satisfies an equation a + s = b + t for some
suitable t ∈ T; that is, s = c + t, where c = b − a. In particular, taking s = 0, we have c = −t ∈ T
and hence −c = a − b ∈ T. By (i), c + T = T, which shows that for all s ∈ S, s ∈ c + T = T, or
S ⊆ T. On the other hand, if S ⊆ T and a − b ∈ T, then a + S ⊆ a + T = a + ((b − a) + T) = b + T,
since T = (b − a) + T.
Remark 1.3.5. Theorem 1.3.2 (ii) shows that to each linear variety L of V there corresponds
a unique subspace S such that L = a + S. On the other hand, the vector a is not unique,
as it may be replaced by any vector a + b with b ∈ S; in other words, the vector a can be chosen
arbitrarily within L. We call S the direction of the linear variety L.

PROPOSITION 1.3.4 Two linear varieties a + S, b + T of V have a non-empty intersection if and
only if a − b ∈ S + T. The intersection is then a linear variety with direction S ∩ T.

Proof. If x belongs to both a + S and b + T, there exist s ∈ S, t ∈ T such that x = a + s = b + t,
or a − b = −s + t ∈ S + T. The converse is obviously true (why?). Thus, the varieties can be
written as x + S, x + T, clearly showing that their intersection is precisely x + (S ∩ T).

DEFINITION 1.3.3 Let a + S and b + T be two linear varieties of V. We say that a + S and b + T
are parallel if S ⊆ T.

PROPOSITION 1.3.5 Parallel varieties a + S and b + T are either totally disjoint or one of them
is contained in the other. Through any vector v ∈ V, there is one and only one variety with the
given direction S (and therefore parallel to b + T ), namely v + S.

Proof. If x belongs to both a + S and b + T, then a + S = x + S ⊆ x + T = b + T.

THEOREM 1.3.3 (i) If a, b are two vectors of V, then fa+b = fa ◦ fb .
(ii) If a is a vector of V, then the translation fa : V → V has an inverse map which is the
translation f−a .
(iii) The set of all translations S = { fa | a ∈ V } on V forms an abelian group under the binary
operation ◦ : S × S → S defined by fa ◦ fb = fa+b .
(iv) The group (S, ◦) is isomorphic to the additive group structure of V.

Proof. (i) For v ∈ V,
fa+b (v) = (a + b) + v = a + (b + v) = fa (b + v) = ( fa ◦ fb )(v).
(ii) For v ∈ V, we note that
( f−a ◦ fa )(v) = f−a ( fa (v)) = f−a (a + v) = −a + (a + v) = (−a + a) + v = v,
and
( fa ◦ f−a )(v) = fa ( f−a (v)) = fa (−a + v) = a + (−a + v) = (a − a) + v = v.
Hence f−a is the inverse map of fa for each a ∈ V.
(iii) We observe that for any fa ∈ S,
( fa ◦ f0 )(v) = fa ( f0 (v)) = fa (0 + v) = a + v = fa (v),
and
( f0 ◦ fa )(v) = f0 ( fa (v)) = f0 (a + v) = 0 + (a + v) = a + v = fa (v).
Therefore, f0 is the identity element for (S, ◦).
Also, for fa , fb , fc ∈ S, we notice that
( fa ◦ ( fb ◦ fc ))(v) = fa ( fb (c + v))
= fa (b + (c + v))
= a + ((b + c) + v)
= ((a + b) + c) + v
= (( fa ◦ fb ) ◦ fc )(v).
Finally, for fa , fb ∈ S, we have
( fa ◦ fb )(v) = fa (b + v)
= a + (b + v)
= (a + b) + v
= (b + a) + v
= b + (a + v)
= fb (a + v)
= ( fb ◦ fa )(v).
Hence (S, ◦) is an abelian group.
(iv) We define a map φ : (V, +, 0) → (S, ◦, f0 ) as φ (a) = fa . To show φ is an isomorphism, we
need to show:
(a) φ is a morphism: φ (a + b) = fa+b = fa ◦ fb = φ (a) ◦ φ (b).
(b) φ is an injection: φ (a) = φ (b) =⇒ fa = fb =⇒ a = fa (0) = fb (0) = b.
(c) φ is a surjection: For each fa ∈ S, we have the vector a ∈ V such that φ (a) = fa .

PROPOSITION 1.3.6 A translation fa : V → V is a linear map if and only if a = 0.

Proof. If a = 0, then f0 is the identity map, which is linear. Conversely, if fa is linear, then
fa (0) = 0, and hence 0 = fa (0) = a + 0 = a.
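
The identities of Theorem 1.3.3 and the obstruction in Proposition 1.3.6 can be checked on concrete vectors. The sketch below is only an illustration, assuming V = Q² with vectors stored as tuples of integers; the helper names translation, add and neg are not from the notes.

```python
def translation(a):
    """Return the translation f_a : v -> a + v on tuples of numbers."""
    return lambda v: tuple(ai + vi for ai, vi in zip(a, v))

def add(a, b):
    """Componentwise sum a + b."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def neg(a):
    """Additive inverse -a."""
    return tuple(-ai for ai in a)

a, b, v = (1, 2), (3, -1), (5, 7)
f_a, f_b = translation(a), translation(b)

# Theorem 1.3.3 (i): f_{a+b} = f_a o f_b
composed = f_a(f_b(v))
direct = translation(add(a, b))(v)
print(composed == direct)            # True, both are (9, 8)

# Theorem 1.3.3 (ii): f_{-a} undoes f_a
print(translation(neg(a))(f_a(v)) == v)   # True

# Proposition 1.3.6: f_a moves the zero vector unless a = 0
image_of_zero = f_a((0, 0))
print(image_of_zero)                 # (1, 2), so f_a is not linear
```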

Problems
PROBLEM 1.3.1 Which of the following subsets of R² are not subspaces?
(i) The line x = y.
(ii) The unit circle x² + y² = 1.
(iii) The line 2x + y = 1.
(iv) The set {(x, y) ∈ R² | x ≥ 0, y ≥ 0}.

PROBLEM 1.3.2 Prove that all lines through the origin and planes through the origin in R³ are
subspaces.

PROBLEM 1.3.3 Let U, V, W be three subspaces of a vector space E. Show that if U ⊆ V, then
U + (V ∩ W ) = (U + V ) ∩ (U + W ). Is this true if U is not a subspace of V ?

PROBLEM 1.3.4 Let U, V, W be three subspaces of a vector space E. Show that if V ⊆ U, then
U ∩ (V + W ) = (U ∩ V ) + (U ∩ W ). Is this true if V is not a subspace of U?

PROBLEM 1.3.5 Let f : V → V′ be a linear map and S, T be subspaces of V. Show that
(i) f (S ∩ T ) ⊆ f (S) ∩ f (T ).
(ii) f (S + T ) = f (S) + f (T ).

PROBLEM 1.3.6 Let f : V → V′ be a linear map and S′, T′ be subspaces of V′. Show that
(i) f⁻¹(S′ ∩ T′) = f⁻¹(S′) ∩ f⁻¹(T′),
(ii) f⁻¹(S′ + T′) ⊇ f⁻¹(S′) + f⁻¹(T′),
where f⁻¹(X) denotes the inverse image of X.

PROBLEM 1.3.7 Let V be a vector space. Let P : V → V be a linear map such that P² = P.
Show that V = Ker P + Im P and Ker P ∩ Im P = {0}.

PROBLEM 1.3.8 Let V be a vector space, and let P, Q be linear maps of V into itself. Assume
that they satisfy the following conditions:
(a) P + Q = I (identity map).
(b) PQ = QP = 0.
(c) P² = P and Q² = Q.
Show that V is equal to the direct sum of Im P and Im Q.

1.4 Quotient spaces


PROPOSITION 1.4.1 Let S be a subspace of a vector space V. A relation R ⊆ V × V on V is
defined as follows: For u, v ∈ V, (u, v) ∈ R if v = u + s for some s ∈ S. The relation R is an
equivalence relation.

Proof. Exercise.
Remark 1.4.1. An equivalence class of the above equivalence relation is called a coset of S in
V, and is denoted by a + S = {a + s | s ∈ S}. By V/S = {a + S | a ∈ V }, we denote the set of
all equivalence classes of S in V.

THEOREM 1.4.1 For a + S, b + S ∈ V/S, exactly one of the following is true:
(a) (a + S) ∩ (b + S) = ∅.
(b) a + S = b + S.
Moreover, a + S = b + S if and only if a − b ∈ S.

Proof. Let a + S and b + S have non-empty intersection. Then there exists v ∈ V such that v ∈
a + S and v ∈ b + S, that is, v = a + s and v = b + s′ for some s, s′ ∈ S. But then a − b = s′ − s ∈ S.
In particular, a + S = b + S implies a − b ∈ S.
Conversely, let u = a − b ∈ S, and let x ∈ a + S. Then x = a + s″ for some s″ ∈ S. Since
a = (a − b) + b and u = a − b ∈ S, we see that x = u + b + s″, or x = b + s1 where s1 = u + s″ ∈ S.
Thus x ∈ b + S. Since x was an arbitrary element of a + S, we have thus proved that a + S ⊆ b + S.
Interchanging a and b (note that b − a = −u ∈ S), we see that b + S ⊆ a + S. Hence two cosets
with non-empty intersection are equal, and a + S = b + S if and only if a − b ∈ S.

PROPOSITION 1.4.2 The set V/S forms a vector space over F under the following two operations:
(a) (a + S) + (b + S) = (a + b) + S;
(b) α(a + S) = (αa) + S,
where α ∈ F and a + S, b + S ∈ V/S.

Proof. Exercise.

THEOREM 1.4.2 The map p : V → V/S defined by p(a) = a + S is a surjective linear map with
Ker p = S.

Proof. Linearity:
(i) p(a + b) = (a + b) + S = (a + S) + (b + S) = p(a) + p(b),
(ii) p(αa) = (αa) + S = α(a + S) = α p(a).
Surjectivity: Each element a + S ∈ V/S is the image p(a) of the vector a ∈ V.
Kernel of p : Since the zero vector of V/S is the coset 0 + S = S, Theorem 1.4.1 gives
Ker p = {v ∈ V | p(v) = S} = {v ∈ V | v + S = 0 + S} = S.

THEOREM 1.4.3 Let S be a subspace of a vector space V and p : V → V/S be the linear map
defined by p(a) = a + S. If we have another linear map f : V → W such that S ⊆ Ker f , then
there exists a unique linear map h : V/S → W such that the following diagram commutes:

          p
      V -----> V/S
       \       /
      f \     / h
         v   v
           W

that is, f = h ◦ p. Also, we have Im h = Im f and Ker h = Ker f /S.

Proof. Since S ⊆ Ker f , we have
f (a + s) = f (a) + f (s) = f (a) + 0 = f (a),
for all s ∈ S, so f takes the same value on all elements of a coset a + S. Hence the map
h : V/S → W given by h(a + S) = f (a) is well defined, and we immediately notice:
h((a + S) + (b + S)) = h((a + b) + S)
= f (a + b)
= f (a) + f (b)
= h(a + S) + h(b + S),
and
h(α(a + S)) = h((αa) + S)
= f (αa)
= α f (a)
= αh(a + S).
Therefore h is a linear map. Next, from the definition of h we get
(h ◦ p)(a) = h(p(a)) = h(a + S) = f (a),
that is f = h ◦ p.
For uniqueness of h, let h′ : V/S → W be a linear map such that f = h′ ◦ p. Since p is a
surjection, p has a right inverse r : V/S → V such that p ◦ r = 1V/S , where 1V/S is the identity
map on V/S. Hence
h = h ◦ 1V/S = h ◦ (p ◦ r) = (h ◦ p) ◦ r = (h′ ◦ p) ◦ r = h′ ◦ (p ◦ r) = h′ ◦ 1V/S = h′ .
The fact Imh = Im f follows from the definition of h. To find Kerh, we note that
h(a + S) = 0 ⇔ f (a) = 0 ⇔ a ∈ Ker f ⇔ a + S ∈ Ker f/S.
Therefore, Kerh = Ker f/S.
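
To see concretely why h in Theorem 1.4.3 does not depend on the chosen representative, one can compute in a small quotient. The sketch below is an illustration only: it assumes V = Q², S = {(x, 0)} and the hypothetical linear map f (x, y) = 3y, none of which come from the notes, and represents each coset a + S by its unique element of the form (0, y).

```python
def f(v):
    """A linear map Q^2 -> Q whose kernel contains S = {(x, 0)} (illustrative choice)."""
    x, y = v
    return 3 * y

def coset_rep(a):
    """Canonical representative of a + S: subtract the element (a[0], 0) of S."""
    return (0, a[1])

def same_coset(a, b):
    """a + S = b + S iff a - b ∈ S, i.e. iff the second coordinates agree."""
    return a[1] - b[1] == 0

def h(a):
    """Induced map h(a + S) = f(a), evaluated through the canonical representative."""
    return f(coset_rep(a))

a, b = (4, 2), (-7, 2)           # a - b = (11, 0) lies in S
print(same_coset(a, b))          # True
print(h(a) == h(b) == f(a))      # True: h is constant on the coset and f = h o p
```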

COROLLARY 1.4.1 Let S be a subspace of a vector space V and p : V → V/S be the linear
map defined by p(a) = a + S. If we have another linear surjective map f : V → W such that
S = Ker f , then we have an isomorphism V/Ker f ≅ Im f .

Proof. Exercise.

Problems
PROBLEM 1.4.1 Find the cosets in each of the following cases:
(i) V = R² , S = {(x, 0) | x ∈ R}.
(ii) V = R³ , S = {(x, y, 0) | x, y ∈ R}.
(iii) V = R² , S = {(x, y) | ax + by = 0}, where (a, b) is a fixed non-zero vector of R² .

PROBLEM 1.4.2 Let M, N be subspaces of V. Prove that the map
f : (M + N)/N → M/(M ∩ N)
defined by f (m + n + N) = m + (M ∩ N) is a linear isomorphism.

PROBLEM 1.4.3 Let V be a direct sum of M and N. Prove that the map
f : M → V/N
defined by f (m) = m + N is a linear isomorphism.

1.5 Bases and coordinates


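Let V be a vector space over a field F, let n be a natural number, and let n = {1, . . . , n} denote
the set of the first n positive integers (the same letter is used for the number and for this index
set; the context shows which is meant).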

DEFINITION 1.5.1 A map v : n → V, sending each i ∈ n to some vi ∈ V, is called a list or an
n-tuple (v1 , . . . , vn ) of n vectors in V.

Remark 1.5.1. The field F is itself a vector space, and a function ξ : n → F is a list of n scalars
ξ1 , . . . , ξn . The set Fⁿ of all such lists ξ is a vector space under termwise addition and (left)
scalar multiplication.

DEFINITION 1.5.2 A linear combination of n vectors v1 , . . . , vn of a vector space V over a
field F is an expression ξ1 v1 + · · · + ξn vn , where ξ1 , . . . , ξn ∈ F.

DEFINITION 1.5.3 An n-tuple v of n vectors of V spans the vector space V when every vector
u ∈ V can be expressed in at least one way as a linear combination
u = ξ1 v1 + · · · + ξn vn (5)
of the vectors of v.

Remark 1.5.2. The sum Fv1 + · · · + Fvn = {ξ1 v1 + · · · + ξn vn | ξ1 , . . . , ξn ∈ F} consists of all
F-linear combinations of the elements vi of the given list and is a subspace of V, called the
subspace spanned by v1 , . . . , vn . It is the intersection of all subspaces of V containing these n
vectors. Similarly, the subspace of V spanned by any subset X of V is the intersection of all the
subspaces of V containing X. Since every subspace contains the zero vector 0, the subspace of
V spanned by the empty subset of V is the subspace 0. A vector space V is said to be finitely
spanned or of finite type if V = Fv1 + · · · + Fvn for some finite list v1 , . . . , vn of vectors of V.

DEFINITION 1.5.4 An n-tuple v of vectors of V is a basis for the vector space V when every
vector u ∈ V has exactly one expression as a linear combination of vectors of v.

DEFINITION 1.5.5 An n-tuple v of vectors of V is called linearly independent in V when
every vector u ∈ V has at most one expression as a linear combination of vectors of v.

Remark 1.5.3. The n-tuple v is linearly independent when, for all n-tuples α and β of n scalars
each,

    ∑_{i=1}^{n} αi vi = ∑_{i=1}^{n} βi vi  =⇒  α1 = β1 , α2 = β2 , . . . , αn = βn .

Since ∑_{i=1}^{n} αi vi = ∑_{i=1}^{n} βi vi implies ∑_{i=1}^{n} (αi − βi )vi = 0, it is enough to
require, for each n-tuple α of n scalars, that

    ∑_{i=1}^{n} αi vi = 0  =⇒  α1 = α2 = · · · = αn = 0.

DEFINITION 1.5.6 An n-tuple v of vectors of V is called linearly dependent in V when it is
not linearly independent, that is, when ∑_{i=1}^{n} αi vi = 0 for some n-tuple of scalars αi not all
zero.

Remark 1.5.4. Note that an n-tuple v of n vectors is a basis for V precisely when v spans V
and is linearly independent in V.
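
For lists of vectors in Qⁿ, the conditions of Remark 1.5.3 and Remark 1.5.4 can be tested mechanically. The sketch below is a hedged illustration rather than part of the notes: it assumes vectors are tuples of rationals and uses the rank obtained by Gaussian elimination (a list is independent exactly when its rank equals its length); the names rank, is_independent, spans and is_basis are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def is_independent(vs):
    """Only the zero list of scalars gives the zero combination."""
    return rank(vs) == len(vs)

def spans(vs, n):
    """The list vs spans Q^n."""
    return rank(vs) == n

def is_basis(vs, n):
    """Remark 1.5.4: a basis is a spanning, linearly independent list."""
    return is_independent(vs) and spans(vs, n)

dependent = [(1, 0, 1), (0, 1, 1), (1, 1, 2)]    # third vector = first + second
triangular = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
print(is_independent(dependent))   # False
print(is_basis(triangular, 3))     # True
```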

PROPOSITION 1.5.1 Each n-tuple v of n vectors in a vector space V over a field F determines
a linear map fv : Fⁿ → V by fv (ξ ) = ξ1 v1 + · · · + ξn vn . Relative to this linear map,
(a) v spans V if and only if fv is a surjection;
(b) v is linearly independent in V if and only if fv is an injection;
(c) v is a basis of V if and only if fv is a bijection.

Proof. To show fv is linear, let ξ and η be two lists of n scalars. The sum ξ + η is the list of
the scalars ξi + ηi . Now

    fv (ξ + η) = ∑_{i=1}^{n} (ξi + ηi )vi = ∑_{i=1}^{n} ξi vi + ∑_{i=1}^{n} ηi vi = fv (ξ ) + fv (η).

If κ is any scalar, then κξ is the list of the scalars κξi , and

    fv (κξ ) = ∑_{i=1}^{n} (κξi )vi = κ ( ∑_{i=1}^{n} ξi vi ) = κ fv (ξ ).

Given that fv is linear, the statement that fv is surjective means exactly that the list v spans V.
Also, fv is injective means exactly that fv (ξ ) = fv (η) implies ξ = η, which by Remark 1.5.3 is
the linear independence of v. This is equivalent to saying that fv has kernel 0. Finally, fv is a
bijection exactly when every u ∈ V has exactly one expression as a linear combination of the
vectors of v, that is, when v is a basis.

DEFINITION 1.5.7 The coordinates of a vector v ∈ V relative to a basis b of V are the n
scalars ξi of the unique list ξ used in the expression v = ξ1 b1 + · · · + ξn bn of v as a linear
combination of the bi . Each scalar ξi is called the i-th coordinate of v, relative to b.

THEOREM 1.5.1 If b : n → V is a basis of the vector space V, then the map sending each vector
v ∈ V to the list ξ of its coordinates, relative to b, is an isomorphism V ≅ Fⁿ of vector spaces.
The inverse of this isomorphism is the linear map

    fb : Fⁿ ≅ V, (6)

assigning to each list ξ the vector ∑_{i=1}^{n} ξi bi .

Proof. The proof follows from Proposition 1.5.1 (c) and Definition 1.5.7.
Remark 1.5.5. Thus a vector space V with a finite basis is isomorphic to a vector space Fⁿ of
n-tuples of scalars, and isomorphic in many ways, one isomorphism for each choice of a basis.
Since
e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1) (7)
is a basis of Fⁿ , the isomorphism fb of (6) takes these unit vectors to the basis vectors
b1 , . . . , bn .
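
Finding coordinates relative to a basis amounts to solving a linear system. The sketch below is an illustration under stated assumptions: V = Qⁿ, the list passed as basis really is a basis (otherwise the pivot search fails), and the names coordinates and combine are hypothetical. It solves the system by Gauss-Jordan elimination and then applies the inverse map fb of (6).

```python
from fractions import Fraction

def coordinates(basis, v):
    """Coordinates of v relative to `basis`, a list of n independent vectors in Q^n."""
    n = len(basis)
    # Augmented matrix [B | v], where column j of B is basis[j].
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for c in range(n):
        # Raises StopIteration if `basis` is not actually a basis.
        pivot = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [row[n] for row in m]

def combine(basis, xi):
    """The map f_b of (6): send the list xi to the vector sum of xi[i] * basis[i]."""
    n = len(basis[0])
    return tuple(sum(x * b[k] for x, b in zip(xi, basis)) for k in range(n))

b = [(1, 2), (3, 5)]
xi = coordinates(b, (1, 1))
print(xi == [-2, 1])                 # True: (1, 1) = -2 (1, 2) + 1 (3, 5)
print(combine(b, xi) == (1, 1))      # True: f_b inverts the coordinate map
```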

PROPOSITION 1.5.2 A list v of vectors is linearly dependent in V if and only if some one
vector vk of the list is zero or a linear combination of the previous vectors of the list. When
this is the case, removal of the vector vk gives a new list with the same span as v.

Proof. If vk is a linear combination η1 v1 + · · · + η_{k−1} v_{k−1} of previous vectors, then

    η1 v1 + η2 v2 + · · · + η_{k−1} v_{k−1} + (−1)vk = 0,

so there is a linear combination with at least one coefficient (to wit, −1) not zero, so the
list v is indeed linearly dependent. Moreover, let w be any vector in the subspace spanned
by all the v1 , . . . , vn , so that w = ∑_{i=1}^{n} ξi vi . Replacing vk here by its expression
η1 v1 + · · · + η_{k−1} v_{k−1} represents w as a linear combination of the shorter list
(v1 , . . . , v_{k−1} , v_{k+1} , . . . , vn ) with vk removed, as asserted.
Conversely, suppose that the list v is linearly dependent; then ∑_{i=1}^{n} ξi vi = 0 for scalars ξi not
all zero. Let k be the last index with a coefficient ξk ≠ 0, so that ξ1 v1 + · · · + ξk vk = 0. Since
F is a field, the inverse ξ_k^{-1} exists in F; multiply by ξ_k^{-1} and solve for vk as

    vk = (−ξ1 ξ_k^{-1}) v1 + · · · + (−ξ_{k−1} ξ_k^{-1}) v_{k−1};

the vector vk is indeed a linear combination of previous vectors unless k = 1, in which case
this equation shows that vk must be zero.

COROLLARY 1.5.1 Any vector space of finite type has a basis.

Proof. Since V is of finite type, it is spanned by some finite list v of n vectors. If this list v
happens to be also linearly independent, it is a basis of V . If not, v is linearly dependent; by
the proposition above we remove vectors one by one from this list till we get a shorter list, still
spanning V, which is independent.
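
The removal process in this proof can be run as a single pass over the list: a vector is dropped when it is zero or a linear combination of the vectors before it, which is the same as lying in the span of the vectors kept so far. The sketch below is an illustration for lists in Qⁿ; the name extract_basis is hypothetical, and membership in a span is decided by a rank computation not developed in the notes.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def extract_basis(vectors):
    """Keep each vector that is not a combination of the previously kept ones."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) > len(kept):   # v enlarges the span of `kept`
            kept.append(v)
    return kept

spanning = [(1, 0, 0), (2, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 0)]
print(extract_basis(spanning))   # [(1, 0, 0), (0, 1, 0)]
```

The kept list spans the same subspace as the original list and is linearly independent, so it is a basis of that subspace.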

Problems
PROBLEM 1.5.1 In any vector space, prove:
(a) A list of just one vector is linearly independent if and only if the vector is non-zero;
(b) a list of two vectors is linearly dependent if and only if one of them is a scalar multiple of
the other.

PROBLEM 1.5.2 Show that any two vectors (ξ1 , ξ2 ) and (η1 , η2 ) are linearly independent in
F² if and only if ξ1 η2 − ξ2 η1 ≠ 0.

PROBLEM 1.5.3 Let v, w be vectors of a vector space V over F, and assume that v ≠ 0. If v, w
are linearly dependent, show that there is a scalar α ∈ F such that w = αv.

PROBLEM 1.5.4 For any choices of scalars κ, λ , and µ show that the vectors (1, κ, λ ), (0, 1, µ),
and (0, 0, 1) form a basis of F³ .

PROBLEM 1.5.5 Do (1, 2, 3), (2, 3, 4), and (3, 4, 5) form a basis of Q³ ?

PROBLEM 1.5.6 (a) Show that the list (1, 1, 0), (1, 0, 1), and (0, 1, 1) is a basis of Q³ .
(b) Find the coordinates of the unit vectors (1, 0, 0), (0, 1, 0), and (0, 0, 1) of Q³ relative to
this basis.

PROBLEM 1.5.7 If κu + λ v + µw = 0, where κµ ≠ 0, show that the vectors u and v span the
same subspace as do v and w.

PROBLEM 1.5.8 If w is a linearly independent list of n vectors of V, show that a vector v is a
linear combination of the list w if and only if the list v, w1 , . . . , wn is linearly dependent.

PROBLEM 1.5.9 If u, v, and w are three linearly independent vectors in a vector space over Q,
prove that v + w, w + u, and u + v are also linearly independent. Is this true over every field?

PROBLEM 1.5.10 If f : V → W is a linear map and v is a list of n vectors in V, prove the
following properties of the composite list f ◦ v = ( f (v1 ), . . . , f (vn )) :
(a) If v spans V, then f ◦ v spans Im f .
(b) If f ◦ v is linearly independent in W, then v is linearly independent in V.
(c) If v is linearly independent in V and f is an injection, then f ◦ v is independent in W.

1.6 Dimension
DEFINITION 1.6.1 The number n of vectors in any one basis of a vector space V is called the
dimension of V, written dimV = n.

Remark 1.6.1. This number does not depend on the chosen basis (Corollary 1.6.1 below). We say
that dimV is infinite when V has no (finite) basis. Since the unit vectors (7) form a basis of Fⁿ ,
we have dim Fⁿ = n. A vector space of finite type is also called a finite-dimensional space.

THEOREM 1.6.1 If a vector space V is spanned by some list of n vectors and also has a list v of
m linearly independent vectors, then m ≤ n. Moreover, V can be spanned by a list of n vectors
containing the given list v.

Proof. Given n, we shall prove both conclusions together by induction on m. For m = 0, the
result is immediate. Suppose then that the conclusion holds for all lists of m independent
vectors, and let v be a list of m + 1 linearly independent vectors. The first m vectors of this list
v are still linearly independent, so, by the induction assumption, n − m ≥ 0 and V is spanned
by some list
(v1 , v2 , . . . , vm , w1 , . . . , wn−m ) (8)
of n vectors including v1 , v2 , . . . , vm . In particular, vm+1 , as a vector of V, must be a linear
combination of these spanning vectors, say as
vm+1 = ξ1 v1 + · · · + ξm vm + η1 w1 + · · · + ηn−m wn−m . (9)
Now we can prove m + 1 ≤ n. If not, m = n, there are no vectors wi in (9), and this for-
mula expresses vm+1 in terms of v1 , . . . , vm , a contradiction to the hypothesis that the list
(v1 , . . . , vm , vm+1 ) is linearly independent.
Now adjoin the vector vm+1 to the list (8) of n vectors, getting a new list
(v1 , . . . , vm , vm+1 , w1 , . . . , wn−m )
of n + 1 vectors which still spans V (because the shorter list (8) did). Moreover, by the relation
(9) above, these n + 1 vectors are linearly dependent. By Proposition 1.5.2, we can remove
some one vector from this list; namely, the first one which is a linear combination of previous
vectors. The vector thereby removed is surely not one of the v1 , . . . , vm+1 , for these vectors
are known to be linearly independent. Therefore, the vector removed must be one in the list
w, say the vector w j . We now have a new list

(v1 , . . . , vm , vm+1 , w1 , . . . , w j−1 , w j+1 , . . . , wn−m )

of n vectors still spanning V and containing the whole list v. This completes the induction,
hence the proof.

COROLLARY 1.6.1 (Invariance of Dimension) Any two bases for a vector space V of finite
type have the same number of vectors.

Proof. Let (b1 , . . . , bm ) and (c1 , . . . , cn ) be two bases of m and n vectors, respectively. Since b
is linearly independent and c spans V, the above theorem implies m ≤ n. Since c is independent
and b spans, n ≤ m. Together, these conclusions give n = m, as desired.

COROLLARY 1.6.2 In a vector space V of dimension n :
(i) Any list of n + 1 vectors of V is linearly dependent.
(ii) No list of n − 1 vectors of V can span V.

THEOREM 1.6.2 In a finite-dimensional vector space V :
(i) Any linearly independent list of vectors is a part of a basis.
(ii) Any list of vectors spanning V has a part which is a basis.

Proof. To prove (ii), start with a list spanning V and remove dependent vectors by Proposition
1.5.2 till the resulting list is independent, and hence a basis. As for (i), let the list v be inde-
pendent, while some other list spans V, because V is finite-dimensional. Theorem 1.6.1 then
yields a list containing v which spans V. Remove dependent vectors, as before, until this list
becomes a basis containing v.

COROLLARY 1.6.3 In a vector space V of dimension n :
(i) Any list of n linearly independent vectors is a basis.
(ii) Any list of n vectors spanning V is a basis.

Proof. For (i), let v be a list of n independent vectors. By Theorem 1.6.2, v is part of a
basis; by the invariance of dimension, this basis has exactly n vectors, hence must be just the
original list v. The proof of (ii) is similar.

THEOREM 1.6.3 Let V and W be n-dimensional vector spaces over a field F. Let b be a basis
of V. Let w be a list of n vectors of W. Then there exists a unique linear map f : V → W such
that f (bi ) = wi for i = 1, . . . , n.

Proof. Let v ∈ V. Since b is a basis, v = ∑_{i=1}^{n} αi bi for a unique list α of scalars. Define a
map f : V → W by f (v) = ∑_{i=1}^{n} αi wi . In particular, we have f (bi ) = wi .
Linearity of f : Let u, v ∈ V such that u = ∑_{i=1}^{n} αi bi and v = ∑_{i=1}^{n} βi bi , where αi , βi ∈ F.
Then, by the definition of f ,

    f (u + v) = f ( ∑_{i=1}^{n} αi bi + ∑_{i=1}^{n} βi bi )
              = f ( ∑_{i=1}^{n} (αi + βi )bi )
              = ∑_{i=1}^{n} (αi + βi )wi
              = ∑_{i=1}^{n} αi wi + ∑_{i=1}^{n} βi wi
              = f (u) + f (v).

Let α ∈ F. Then αv = ∑_{i=1}^{n} (αβi )bi , so

    f (αv) = ∑_{i=1}^{n} (αβi )wi = α ∑_{i=1}^{n} βi wi = α f (v).

Hence f is a linear map.

To prove the uniqueness part of the theorem, let f′ : V → W be another linear map such that
f′ (bi ) = wi . Then for any v = ∑_{i=1}^{n} αi bi ∈ V,

    f′ (v) = f′ ( ∑_{i=1}^{n} αi bi ) = ∑_{i=1}^{n} αi f′ (bi ) = ∑_{i=1}^{n} αi wi = f (v).

This completes the proof.

Remark 1.6.2. There is a parallel treatment for (possibly) infinite-dimensional vector spaces
V. In this treatment, a basis B is a possibly infinite set of vectors of V such that every vector
v ∈ V has exactly one expression as a linear combination of a finite number of vectors of B.
Using the axiom of choice, one can prove that every vector space V has a basis B, and that for
any two bases B and B′ of the same space there is a bijection B ≅ B′ . Also, a subset X ⊆ V is
said to be linearly independent in V when every finite subset of X is linearly independent, in
the sense of our previous definition.

PROPOSITION 1.6.1 If S is a subspace of a finite-dimensional vector space V, each basis of S
is part of a basis of V. Hence, dimS ≤ dimV ; moreover, dimS < dimV whenever S is a proper
subspace of V.

Proof. Any list of vectors of S linearly independent in S is obviously linearly independent in
V. By Theorem 1.6.2, each such list is thus part of a basis of V. Moreover, a basis for S can be
a basis for the whole space V only if every vector of V is already in S.
Remark 1.6.3. In the next few theorems, we have occasion to “combine” two lists v = (v1 , . . . , vm )
and w = (w1 , . . . , wr ) of vectors into a single list which will be written as

v ∨ w = (v1 , . . . , vm , w1 , . . . , wr ).

THEOREM 1.6.4 If f : V → V′ is a linear map with a finite-dimensional domain, then

    dimV = dim(Ker f ) + dim(Im f ). (10)

In more detail, if a basis v for Ker f is part of a basis v ∨ w for V, then f ◦ w is a basis for Im f .

Proof. Let v = (v1 , . . . , vm ) and w = (w1 , . . . , wr ), so that dim(Ker f ) = m and dimV = m + r.
Any vector u ∈ V is a linear combination

    u = ∑ ξi vi + ∑ ηj wj

for suitable lists ξ and η of scalars, and since f (vi ) = 0, each vector f (u) is ∑ ηj f (wj ). Hence,
the list f ◦ w spans Im f . On the other hand, ∑ ηj f (wj ) = 0 means f (∑ ηj wj ) = 0; hence,
∑ ηj wj ∈ Ker f . But v is a basis for Ker f , so ∑ ηj wj = ∑ ξi vi for some list ξ of scalars. But
v ∨ w linearly independent makes all the ηj and all the ξi zero. Hence, the list f ◦ w is linearly
independent in Im f . Since it is independent and also spans Im f , it is a basis, as asserted.
Therefore dim(Im f ) = r, and (10) follows from dimV = m + r.
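
Formula (10) can be checked on a concrete matrix by computing a basis of the kernel and the number of pivots independently. The sketch below is an illustration only: it assumes f : Q³ → Q³ is given by a matrix A acting on column vectors, the matrix itself is a made-up example, and the helper names rref and kernel_basis are hypothetical.

```python
from fractions import Fraction

def rref(matrix):
    """Reduced row echelon form over Q; returns (rows, pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def kernel_basis(matrix):
    """Basis of Ker f for f(x) = matrix * x, one vector per non-pivot column."""
    m, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for row, p in zip(m, pivots):
            vec[p] = -row[free]
        basis.append(vec)
    return basis

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
rank_f = len(rref(A)[1])            # dim(Im f) = number of pivots
nullity_f = len(kernel_basis(A))    # dim(Ker f)
print(rank_f, nullity_f, rank_f + nullity_f == len(A[0]))   # 2 1 True
```

In this example the kernel is spanned by (−1, −1, 1), so dim(Ker f ) = 1 and dim(Im f ) = 2, in agreement with dimV = 3.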

DEFINITION 1.6.2 The rank and nullity of a linear map f : V → V′ are defined by rank f =
dim(Im f ), nullity f = dim(Ker f ).

COROLLARY 1.6.4 If two vector spaces V and V′ have the respective finite dimensions n and
n′ , then the rank r of any linear map f : V → V′ is at most the smaller of n and n′ . For each
such map f there is a basis b of V and a basis b′ of V′ such that

    f (b1 ) = b′1 , . . . , f (br ) = b′r , f (br+1 ) = 0, . . . , f (bn ) = 0.

Proof. The inequality r ≤ n follows from (10) and Definition 1.6.2, and r ≤ n′ holds because
Im f is a subspace of V′ (Proposition 1.6.1). To get the indicated bases, we use the notation of
Theorem 1.6.4, set b = w ∨ v, and make f ◦ w part of a basis b′ of V′ .

COROLLARY 1.6.5 Let V and V′ be vector spaces of the same finite dimension. Then any
surjective linear map f : V → V′ , and also any injective linear map f : V → V′ , is necessarily
an isomorphism.

Proof. Set dimV = n = dimV′ and use (10). If f is surjective, then dim(Im f ) = n, and (10)
implies that dim(Ker f ) = 0, so f is also injective and hence an isomorphism. If f is injective,
then Ker f = 0, and (10) implies that dim(Im f ) = n, so Im f must be all of V′ (Proposition
1.6.1); and f is again an isomorphism.

COROLLARY 1.6.6 If S is any subspace of a finite-dimensional vector space V, the corresponding
quotient space V/S has the dimension

dim(V/S) = dimV − dimS. (11)

Proof. Apply Theorem 1.6.4 to the linear map p : V → V/S, which has Ker p = S and
Im p = V/S by Theorem 1.4.2.

COROLLARY 1.6.7 If S is a subspace of a finite-dimensional vector space V, then there is a
subspace T of V with S + T = V and S ∩ T = 0; moreover T ≅ V/S.

Proof. Take a basis of S and adjoin vectors w1 , . . . , wm to get a basis of V. The subspace T
of V spanned by the adjoined vectors w1 , . . . , wm has S + T = V and S ∩ T = 0, as required.
Moreover, Theorem 1.6.4 applied to the linear map p : V → V/S states that the list
w1 + S, . . . , wm + S of cosets is a basis of V/S. Hence, the assignment ∑ ηj wj ↦ ∑ ηj (wj + S) is
an isomorphism T ≅ V/S.

PROPOSITION 1.6.2 If a vector space W is the direct sum of two finite-dimensional subspaces
V1 and V2 , then any basis b′ of V1 and any basis b″ of V2 combine to form a basis b′ ∨ b″ of W.
Hence, the direct sum W is finite-dimensional, and dimW = dimV1 + dimV2 .

Proof. W is the direct sum of V1 and V2 means V1 ∩ V2 = 0 and V1 + V2 = W, so each vector
w ∈ W can be written uniquely as a sum w = v1 + v2 of vectors vi ∈ Vi . By the description
of a basis, v1 is uniquely a linear combination of the vectors of the list b′ , and v2 a similar
combination of b″ . Hence, each w is uniquely a linear combination of the list b′ ∨ b″ , and the
proposition follows.

COROLLARY 1.6.8 If S and T are finite-dimensional subspaces of a vector space, then the
subspace S + T is also finite-dimensional, and

    dimS + dimT = dim(S ∩ T ) + dim(S + T ).

Proof. The quotient space (S + T )/(S ∩ T ) is the direct sum of its subspaces S/(S ∩ T ) and
T/(S ∩ T ). Hence, the previous results on the dimensions of quotients and direct sums yield

    dim(S + T ) = dim ((S + T )/(S ∩ T )) + dim(S ∩ T )
                = dim(S/(S ∩ T )) + dim(T/(S ∩ T )) + dim(S ∩ T )
                = dimS + dimT − dim(S ∩ T ).
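
In practice the corollary gives dim(S ∩ T ) from three rank computations, since S + T is spanned by the combined list of spanning vectors. The sketch below is a small illustration in Q⁴ with made-up subspaces; the helper name rank is hypothetical and the rank is found by Gaussian elimination.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors over Q, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

S = [(1, 0, 0, 0), (0, 1, 0, 0)]
T = [(0, 1, 0, 0), (0, 0, 1, 0)]

dim_S, dim_T = rank(S), rank(T)
dim_sum = rank(S + T)                         # S + T is spanned by the combined list
dim_intersection = dim_S + dim_T - dim_sum    # Corollary 1.6.8
print(dim_S, dim_T, dim_sum, dim_intersection)   # 2 2 3 1
```

Here S ∩ T is the line spanned by (0, 1, 0, 0), which agrees with the computed dimension 1.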

Problems
PROBLEM 1.6.1 Show that any part of a linearly independent list is linearly independent.

PROBLEM 1.6.2 If w is a list of vectors in V and if some part of w spans V, show that w spans
V.

PROBLEM 1.6.3 If the vector v is not in the subspace S, but is in the subspace spanned by S
and the vector w, show that w is in the subspace spanned by S and v.

PROBLEM 1.6.4 Let S and T be the subspaces of Q⁴ spanned, respectively, by the vectors

    S : (1, −1, 2, −3), (1, 1, 2, 0), (3, −1, 6, −6);
    T : (0, −2, 0, −3), (1, 0, 1, 0).

Find the dimensions of S, of T, of S ∩ T, and of S + T.

PROBLEM 1.6.5 If S and T are distinct two-dimensional subspaces of a three-dimensional
space, show that their intersection S ∩ T has dimension 1. What does this mean geometrically?

PROBLEM 1.6.6 Suppose that V is finite-dimensional and U is a subspace of V such that
dimU = dimV. Prove that U = V.

PROBLEM 1.6.7 Suppose U and W are subspaces of R⁸ such that dimU = 3, dimW = 5, and
U + W = R⁸ . Prove that U ∩ W = {0}.

PROBLEM 1.6.8 Suppose that U and W are both five-dimensional subspaces of R⁹ . Prove that
U ∩ W ≠ {0}.

PROBLEM 1.6.9 If U1 , U2 , and U3 are subspaces of a finite-dimensional vector space, then is
the following formula true?

    dim(U1 + U2 + U3 ) = dimU1 + dimU2 + dimU3
                         − dim(U1 ∩ U2 ) − dim(U1 ∩ U3 ) − dim(U2 ∩ U3 )
                         + dim(U1 ∩ U2 ∩ U3 ).

PROBLEM 1.6.10 If three subspaces S, T, and T′ of a finite-dimensional V satisfy S ∩ T =
S ∩ T′ , S + T = S + T′ , and T ⊆ T′ , prove that T = T′ .

1.7 Dual spaces


DEFINITION 1.7.1 Let V be a vector space over a field F. The vector space

    V∗ = HomF (V, F) = {all F-linear maps f : V → F}

is called the dual space of V.

Remark 1.7.1. The elements of V∗ are called linear functionals or linear forms.

THEOREM 1.7.1 Let b be a basis of n vectors for the vector space V. Then to each list µ of n
scalars there is exactly one linear form f : V → F with f ◦ b = µ; namely, the map f defined
for each list ξ of n scalars as

    f ( ∑_{i=1}^{n} ξi bi ) = ∑_{i=1}^{n} ξi µi .

Moreover, the assignment f ↦ f ◦ b is an isomorphism V∗ ≅ Fⁿ of vector spaces.

COROLLARY 1.7.1 Any finite-dimensional vector space V has the same dimension as its dual
space V∗ .

Proof. The isomorphism V∗ ≅ Fⁿ shows that V∗ has dimension n = dimV.
Remark 1.7.2. If V is a finite-dimensional vector space of dimension n, then by Theorem 1.5.1
and Theorem 1.7.1 we obtain V∗ ≅ Fⁿ ≅ V, that is, every finite-dimensional vector space is
isomorphic to its dual space. Theorem 1.7.1 also yields an explicit basis for V∗ ,
as follows:

COROLLARY 1.7.2 If V has a basis b1 , . . . , bn , its dual V∗ has a basis x1 , . . . , xn , consisting of
the linear forms xi : V → F defined for i ∈ n by

    xi (bi ) = 1, (12)
    xi (bj ) = 0, i ≠ j, j = 1, . . . , n. (13)

Proof. The isomorphism f ↦ f ◦ b of Theorem 1.7.1 carries the linear forms xi to the unit
vectors ei = (0, . . . , 1, . . . , 0) of (7), which together constitute a basis of Fⁿ . Hence, x1 , . . . , xn
is a basis of V∗ .
Remark 1.7.3. By (12) and (13), xi ( ∑_{j=1}^{n} ξj bj ) = ξi . Hence, the linear form xi : V → F is just
the map sending each vector v of V to its ith coordinate ξi , relative to b. In brief, given a basis,
the “coordinate maps” relative to that basis are elements of the dual space and form a basis
there.
This basis x1 , . . . , xn of V∗ is called the dual basis to the given basis b. With the following
Kronecker delta notation for i, j ∈ n,

    δij = 1 if i = j,    δij = 0 if i ≠ j,

the conditions (12) and (13) that a basis x of V∗ be dual to a basis b of V read

    xi (bj ) = δij , i, j = 1, . . . , n. (14)
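
For V = Qⁿ, condition (14) says that the coefficient rows of the dual basis form the inverse of the matrix whose columns are b1 , . . . , bn . The sketch below is an illustration under that assumption, identifying a linear form with its list of coefficients on the unit vectors (7); the names dual_basis and evaluate are hypothetical, and the inverse is computed by Gauss-Jordan elimination.

```python
from fractions import Fraction

def dual_basis(basis):
    """Coefficient rows of the dual basis x_1, ..., x_n of a basis of Q^n."""
    n = len(basis)
    # Row-reduce [B | I] to [I | B^{-1}], where column j of B is basis[j].
    m = [[Fraction(basis[j][i]) for j in range(n)] +
         [Fraction(int(i == k)) for k in range(n)] for i in range(n)]
    for c in range(n):
        # Raises StopIteration if `basis` is not actually a basis.
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [row[n:] for row in m]

def evaluate(form, v):
    """Value of the linear form with coefficient list `form` on the vector v."""
    return sum(c * x for c, x in zip(form, v))

b = [(1, 2), (3, 5)]
x = dual_basis(b)
print(x == [[-5, 3], [2, -1]])                              # True
print([[evaluate(xi, bj) for bj in b] for xi in x])         # Kronecker delta table
print([evaluate(xi, (1, 1)) for xi in x] == [-2, 1])        # coordinates of (1, 1)
```

The last line illustrates Remark 1.7.3: evaluating the dual basis on a vector returns its coordinates relative to b.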

LEMMA 1.7.1 To each non-zero linear form f on V there is at least one vector v with f (v) = 1.

Proof. Since f : V → F is a map, f ≠ 0 means that f (u) ≠ 0 for some vector u; a suitable
scalar multiple v of u then has f (v) = 1.
Remark 1.7.4. When V is finite-dimensional, the “symmetrical” property holds:

LEMMA 1.7.2 For any non-zero vector u of a finite-dimensional vector space V, there exists at
least one linear form f with f (u) = 1.

Proof. Since u ≠ 0, the list (u) is linearly independent, hence part of a basis of V. Take f to
be the corresponding linear form in the dual basis.
Remark 1.7.5. An element f ∈ V ∗ is by definition a map v 7→ f (v) of a “variable” v ∈ V. If
we hold v fixed and let f vary, then f 7→ f (v) becomes a map V ∗ → F. We label this map as
ωv : V ∗ → F. Thus ω maps V into its double dual V ∗∗ .

COROLLARY 1.7.3 Any finite-dimensional vector space V is isomorphic to its double dual
V∗∗ under the morphism ω : V → V∗∗ which sends each vector v of V into the linear form
ωv : V∗ → F, where ωv is defined for each f ∈ V∗ by ωv ( f ) = f (v).

Proof. That v ↦ ωv as defined is a linear map V → V∗∗ is immediate. Lemma 1.7.2 implies
that Ker ω = 0. Since dimV = dimV∗∗ also (apply Corollary 1.7.1 twice), it follows from
Corollary 1.6.5 that ω is an isomorphism, as required.

Remark 1.7.6. Note that the map ω : V → V∗∗ is defined without reference to any basis, while
a basis was used to construct the isomorphism V ≅ V∗ in Corollary 1.7.2.
To bring out the parallel between Lemma 1.7.1 and Lemma 1.7.2, we will write ⟨ f , v⟩ for
the value f (v) of the form f ∈ V∗ on the vector v ∈ V. This gives a map

    ( f , v) ↦ ⟨ f , v⟩ ,    V∗ × V → F.

To emphasize the symmetry of V and its dual V∗ in this situation, we write W for the dual
space V∗ , so that ⟨w, v⟩ is a map W × V → F. This map of two arguments w and v is linear in
each argument, in the sense that the equations

    ⟨κ1 w1 + κ2 w2 , v⟩ = κ1 ⟨w1 , v⟩ + κ2 ⟨w2 , v⟩ ,    (15)
    ⟨w, κ1 v1 + κ2 v2 ⟩ = κ1 ⟨w, v1 ⟩ + κ2 ⟨w, v2 ⟩ ,    (16)

hold for all scalars κi ∈ F and for all vectors w, wi ∈ W, v, vi ∈ V. In this language, Lemma
1.7.1 and Lemma 1.7.2 become symmetrical statements about fixed vectors w0 ∈ W or v0 ∈ V :
(i) If ⟨w0 , v⟩ = 0 for all v ∈ V, then w0 = 0.
(ii) If ⟨w, v0 ⟩ = 0 for all w ∈ W, then v0 = 0.
More generally,

DEFINITION 1.7.2 If S is any subset of V, the annihilator of S is defined as

Annih S = {w | w ∈ W and ⟨w, s⟩ = 0 for all s ∈ S},

consisting of those linear forms w which carry every s to 0. Similarly, if T is a subset of W,
the annihilator of T is the subset

Annih T = {v | v ∈ V and ⟨t, v⟩ = 0 for all t ∈ T}

of V.

Remark 1.7.7. Because of the linearity (15) and (16), it follows that AnnihT is a subspace of
V and AnnihS is a subspace of W. Also conditions (i) and (ii) in Remark 1.7.6 can now be
written as
AnnihV = 0, and AnnihW = 0. (17)

THEOREM 1.7.2 If a map ⟨ , ⟩ : W × V → F on finite-dimensional vector spaces W and V over F
has the properties (15), (16), and (17), then it determines isomorphisms W ≅ V∗ and V ≅ W∗.

Proof. For each vector v ∈ V the assignment w ↦ ⟨w, v⟩ is a map, written as ⟨−, v⟩ : W → F,
which is left-linear on W by (15). Therefore the assignment v ↦ ⟨−, v⟩ is a map on V to the
dual W∗, and by (16) it is a right-linear map V → W∗. The assumption Annih W = 0 of (17)
states that this map has kernel 0 (is an injective linear map); hence by Proposition 1.6.1 and
Corollary 1.7.1,
dim V ≤ dim W∗ = dim W.
Symmetrically, the assignment w ↦ ⟨w, −⟩ is a right-linear injection W → V∗ (its kernel is
Annih V = 0), so that dim W ≤ dim V∗ = dim V. Combined with the previous inequality this gives
dim W = dim V and implies by Corollary 1.6.5 that both linear injections are isomorphisms
V ≅ W∗ and W ≅ V∗. This also yields V ≅ V∗∗, as in the isomorphism ω : V → V∗∗ of Corollary 1.7.3.

THEOREM 1.7.3 For subspaces S of a finite-dimensional vector space V and T of its dual
W = V∗ there are isomorphisms

S∗ ≅ V∗/Annih S, (V/S)∗ ≅ Annih S, (18)
T∗ ≅ W∗/Annih T, (W/T)∗ ≅ Annih T. (19)

Proof. By the symmetry of ⟨ , ⟩ : W × V → F it suffices to consider the case of the subspace S.
For each w ∈ W, the map ⟨w, −⟩ : V → F, if restricted to elements s ∈ S in the subspace S ⊆ V,
is a linear map denoted by

⟨w, −⟩|S : S → F; s ↦ ⟨w, s⟩ ∈ F.

This assignment w ↦ ⟨w, −⟩|S, called restriction to S, is a linear map

resS : W = V∗ → S∗.

Now any basis of the subspace S ⊆ V is by Theorem 1.6.2 part of a basis of V, so any linear
map g : S → F can be extended to a map g′ : V → F linear on all of V (say by making g′ zero
on the additional part of the chosen basis of V). This observation means that every g in S∗ is
the restriction to S of some g′ = ⟨w, −⟩, so the map resS above is a linear surjection. Its kernel
is Annih S, by the very definition of this annihilator. Therefore, by Theorem 1.4.1 there is an
isomorphism S∗ ≅ V∗/Annih S, as in the first of (18).
To treat the second isomorphism of (18), consider any h : V/S → F in the dual space (V/S)∗.
Composing h with the projection p : V → V/S yields a linear map h ∘ p : V → F which
annihilates S. On the other hand, any linear map f : V → F which annihilates S must, by
Theorem 1.4.3, factor through the projection p : V → V/S as f = h ∘ p for a unique h.
Therefore, h ↦ h ∘ p is the desired isomorphism (V/S)∗ ≅ Annih S.

PROPOSITION 1.7.1 For any subspace S of a finite-dimensional vector space V,

dim S + dim(Annih S) = dim V, (20)

Annih(Annih S) = S. (21)

Proof. If dim V = n and dim S = k, then dim(V/S) = n − k, and by the second equation of (18),
dim(Annih S) = dim(V/S)∗ = n − k. This is (20), as desired. Repeating the argument gives
dim(Annih(Annih S)) = k. But if s ∈ S, then ⟨w, s⟩ = 0 for all w ∈ Annih S, so s ∈ Annih(Annih S)
and therefore S ⊆ Annih(Annih S). Since these spaces have the same dimension k, they must
by Proposition 1.6.1 be equal, as asserted in (21).

Remark 1.7.8. Note that this result depends essentially upon the fact that V is of finite dimen-
sion.
For a subspace T of W = V ∗ the equation corresponding to (20) is

dim(AnnihT ) = dimV − dimT. (22)

This conceptual result is really a statement of the fundamental fact about systems of homo-
geneous linear equations. For if f ∈ V ∗ and v ∈ V the equation f v = 0 is a homogeneous
linear equation f in n unknowns (the coordinates of v). Thus if f1 , . . . , fk is a list of k elements
fi ∈ V ∗ and T the subspace of V ∗ which they span, then AnnihT is just the set of all v ∈ V with

f1 v = 0, · · · , fk v = 0;

in other words, AnnihT is the set (actually, the subspace) of all solutions of these k simultane-
ous homogeneous linear equations. Thus (22) states that the number of linearly independent
solutions is dimV , the number of unknowns, minus dimT , the maximum number of linearly
independent equations.
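
The count in (22) can be checked on a small example. The following is a minimal sketch in Python, not part of the original notes: the helper `rank` and the three sample forms are hypothetical choices, the field is taken to be the rationals, and each form is stored as its list of coefficients.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, given as a list of rows, by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Hypothetical data: three linear forms on Q^4, each given by its coefficients.
# The third form is the sum of the first two, so they span a 2-dimensional T.
forms = [
    [1, 0, 2, -1],
    [0, 1, 1, 3],
    [1, 1, 3, 2],
]
dim_V = 4
dim_T = rank(forms)
dim_annih_T = dim_V - dim_T  # equation (22)

# Two independent vectors, checked against every form of the system.
solutions = [[-2, -1, 1, 0], [1, -3, 0, 1]]
values = [[sum(c * x for c, x in zip(f, v)) for f in forms] for v in solutions]
```

Here the three equations contain only two independent ones, so the solution space of the system in four unknowns has dimension 4 − 2 = 2, and the two listed vectors form a basis of it.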

Problems
PROBLEM 1.7.1 Consider the basis {(2, 1), (3, 1)} of R². Find the dual basis.

PROBLEM 1.7.2 Let V = {a + bt | a, b ∈ R}, the vector space of real polynomials of degree
≤ 1. Find the basis {v1, v2} of V that is dual to the basis {φ1, φ2} of V∗ defined by
\[
\varphi_1(f(t)) = \int_0^1 f(t)\,dt \quad\text{and}\quad \varphi_2(f(t)) = \int_0^2 f(t)\,dt.
\]

PROBLEM 1.7.3 Let S1, S2 be subspaces of a finite-dimensional vector space V, while
⟨ , ⟩ : W × V → F is a dual pairing. Prove:
(a) S1 ⊆ S2 ⟹ Annih S1 ⊇ Annih S2.
(b) Annih(S1 + S2) = (Annih S1) ∩ (Annih S2).
(c) Annih(S1 ∩ S2) = (Annih S1) + (Annih S2).

CHAPTER 2. LINEAR MAPS AND MATRICES

2.1 The linear map associated with a matrix
DEFINITION 2.1.1 Let F be a field and let m = {1, …, m}, n = {1, …, n} denote respectively
the sets of the first m and n positive integers. An F-valued m × n matrix is a map M : m × n → F,
written (i, j) ↦ αij, where i ∈ m, j ∈ n, and αij ∈ F; it may be displayed as
\[
M = \begin{pmatrix}
\alpha_{11} & \cdots & \alpha_{1j} & \cdots & \alpha_{1n} \\
\alpha_{21} & \cdots & \alpha_{2j} & \cdots & \alpha_{2n} \\
\vdots & & \vdots & & \vdots \\
\alpha_{m1} & \cdots & \alpha_{mj} & \cdots & \alpha_{mn}
\end{pmatrix}.
\]

Let
\[
A = \begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}
\]
be an m × n matrix. We can associate with A a map
LA : F n → F m
by letting
LA (X) = AX
for every column vector X in F n . Thus LA is defined by the association X 7→ AX, the product
being the product of matrices. We observe that LA is linear, indeed
A(X +Y ) = AX + AY and A(cX) = cAX
for all vectors X,Y in F n and all numbers c. We call LA the linear map associated with the
matrix A.
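
The two linearity identities can be checked numerically. The following is a minimal sketch in Python; the function name `apply_matrix` and the sample matrix and vectors are hypothetical, chosen only for illustration, with a column vector stored as a flat list.

```python
def apply_matrix(A, X):
    """Return L_A(X) = AX for a matrix A (list of rows) and a column vector X (flat list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

# Hypothetical 2 x 3 integer matrix and vectors in F^3.
A = [[1, 2, 0],
     [0, 1, -1]]
X = [1, 0, 2]
Y = [3, 1, 1]
c = 3

sum_first = apply_matrix(A, [x + y for x, y in zip(X, Y)])                   # A(X + Y)
first_sum = [p + q for p, q in zip(apply_matrix(A, X), apply_matrix(A, Y))]  # AX + AY
scaled_first = apply_matrix(A, [c * x for x in X])                           # A(cX)
first_scaled = [c * p for p in apply_matrix(A, X)]                           # c(AX)
```

Both pairs agree, as the identities A(X + Y) = AX + AY and A(cX) = cAX predict. A check on one example is of course not a proof; the proof is the distributivity of the matrix product.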

THEOREM 2.1.1 If A, B are m × n matrices and if LA = LB, then A = B. In other words, if
matrices A, B give rise to the same linear map, then they are equal.

Proof. By definition, we have Ai · X = Bi · X for all i and all X, if Ai is the i-th row of A and
Bi is the i-th row of B. Hence (Ai − Bi) · X = 0 for all i and all X. Taking X to be each unit
vector ej in turn shows that every component of Ai − Bi is 0. Hence Ai − Bi = O, and Ai = Bi
for all i. Hence A = B.

Problems
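PROBLEM 2.1.1 In each case, find the vector LA(X).
(a) \( A = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix},\ X = \begin{pmatrix} 3 \\ -1 \end{pmatrix}. \)
(b) \( A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\ X = \begin{pmatrix} 5 \\ 1 \end{pmatrix}. \)
(c) \( A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\ X = \begin{pmatrix} 4 \\ 1 \end{pmatrix}. \)
(d) \( A = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},\ X = \begin{pmatrix} 7 \\ -3 \end{pmatrix}. \)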

2.2 The matrix associated with a linear map


We first consider a special case.

THEOREM 2.2.1 Let L : F^n → F be a linear map. There exists a unique vector A in F^n such
that L = LA, i.e. such that for all X we have L(X) = A · X.

Proof. Let e1, …, en be the unit vectors in F^n. If X = x1 e1 + ··· + xn en is any vector, then

L(X) = L(x1 e1 + ··· + xn en)
= x1 L(e1) + ··· + xn L(en).
If we now let ai = L(ei ), we see that L(X) = x1 a1 + · · · + xn an = X · A. This proves what we
wanted. It also gives us an explicit determination of the vector A such that L = LA , namely the
components of A are precisely the values L(e1 ), . . . , L(en ), where ei (i = 1, . . . , n) are the unit
vectors of F n .
We shall now generalize this to the case of an arbitrary linear map into F m , not just into F.

THEOREM 2.2.2 Let L : F^n → F^m be a linear map. Then there exists a unique matrix A such
that L = LA.

Proof. As usual, let e1, …, en be the unit column vectors in F^n, and let ε1, …, εm be the unit
column vectors in F^m. We can write any vector X in F^n as a linear combination
\[
X = x_1 e_1 + \cdots + x_n e_n = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},
\]
where xj is the j-th component of X. We view e1, …, en as column vectors. By linearity, we
find that L(X) = x1 L(e1) + ··· + xn L(en) and we can write each L(ej) in terms of ε1, …, εm. In
other words, there exist numbers aij such that
L(e1) = a11 ε1 + ··· + am1 εm
  ⋮
L(en) = a1n ε1 + ··· + amn εm
or in terms of the column vectors,
\[
L(e_1) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \ \ldots, \
L(e_n) = \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix}. \tag{23}
\]
Hence
L(X) = x1 (a11 ε1 + ··· + am1 εm) + ··· + xn (a1n ε1 + ··· + amn εm)
= (a11 x1 + ··· + a1n xn)ε1 + ··· + (am1 x1 + ··· + amn xn)εm.
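Consequently, if we let A = (aij), then we see that L(X) = AX. Written out in full, this reads
\[
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix}
a_{11} x_1 + \cdots + a_{1n} x_n \\
\vdots \\
a_{m1} x_1 + \cdots + a_{mn} x_n
\end{pmatrix}.
\]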
Thus L = LA is the linear map associated with the matrix A. We also call A the matrix asso-
ciated with the linear map L. We know that this matrix is uniquely determined by Theorem
2.1.1.
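
The recipe in this proof (the j-th column of A is L(ej)) is easy to carry out mechanically. The following is a minimal sketch in Python; the names `matrix_of` and `linear_map`, and the particular map R^3 → R^2, are hypothetical and serve only as an illustration.

```python
def matrix_of(L, n):
    """Matrix of a linear map L : F^n -> F^m; column j is L(e_j), as in (23)."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th unit vector
        columns.append(L(e_j))
    m = len(columns[0])
    # Turn the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def linear_map(x):
    """Hypothetical linear map R^3 -> R^2, (x1, x2, x3) -> (x1 + 2*x3, 3*x2 - x3)."""
    x1, x2, x3 = x
    return [x1 + 2 * x3, 3 * x2 - x3]

A = matrix_of(linear_map, 3)

# Compare AX with L(X) for one sample vector.
X = [1, 1, 1]
AX = [sum(a * x for a, x in zip(row, X)) for row in A]
```

For this map the construction gives the rows (1, 0, 2) and (0, 3, −1), and AX agrees with the value of the map at X.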

THEOREM 2.2.3 Let A be an n × n matrix, and let A1, …, An be its columns. Then A is
invertible if and only if A1, …, An are linearly independent.

Problems
PROBLEM 2.2.1 Find the matrix associated with the following linear maps:
(a) f : R⁴ → R² given by f(x1, x2, x3, x4) = (x1, x2). (relative to the bases of unit vectors)
(b) f : R² → R² given by f(x, y) = (3x, 3y). (relative to the basis of unit vectors)
(c) f : R² → R² given by f(x, y) = (2x + 3y, 4x − 5y). (relative to the basis {(1, 2), (2, 5)})
(d) V is the vector space of differentiable maps over R, D : V → V defined by D(f(t)) =
d(f(t))/dt. (relative to the basis {sin t, cos t, e^{3t}})

PROBLEM 2.2.2 A linear map L : R² → R² is called a rotation if its associated matrix can be
written in the form
\[
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]
Let (1, 2) be a point of R². Let R be the rotation through an angle of π/4. What are the
coordinates of R(1, 2) relative to the usual basis {(1, 0), (0, 1)}?

PROBLEM 2.2.3 Let R be a rotation through an angle θ. Show that for any vector v = (x, y) in
R² we have ‖v‖ = ‖R(v)‖, where ‖(x, y)‖ = √(x² + y²).

PROBLEM 2.2.4 Let c be a real number, and let f : R³ → R³ be the linear map such that
f(v) = cv, where v ∈ R³. What is the matrix associated with this linear map?

2.3 Bases, matrices, and linear maps


In the first two sections we considered the relation between matrices and linear maps of F^n into
F^m. Now let V, W be arbitrary finite dimensional vector spaces over F. Let

b = {v1, …, vn} and b′ = {w1, …, wm}

be bases of V and W respectively. Then we know that elements of V and W have coordinate
vectors with respect to these bases. In other words, if v ∈ V then we can express v uniquely as
a linear combination
v = x1 v1 + ··· + xn vn, xi ∈ F.
Thus V is isomorphic to F^n under the map F^n → V given by

(x1, …, xn) ↦ x1 v1 + ··· + xn vn.

Similarly for W. If f : V → W is a linear map, then using the above isomorphisms, we can
interpret f as a linear map of F^n into F^m, and thus we can associate a matrix with f, depending
on our choice of bases, and denoted by M^b_{b′}(f). This matrix is the unique matrix A having the
following property:

THEOREM 2.3.1 If X is the (column) coordinate vector of an element v of V relative to the
basis b, then AX is the (column) coordinate vector of f(v) relative to the basis b′.

To use a notation which shows that the coordinate vector X depends on v and on the basis b
we let Xb(v) denote this coordinate vector. Then the above property can be stated in a formula.

THEOREM 2.3.2 Let V, W be vector spaces over F, and let f : V → W be a linear map. Let b
be a basis of V and b′ of W. If v ∈ V then X_{b′}(f(v)) = M^b_{b′}(f) Xb(v).

Proof. Let A = M^b_{b′}(f), and let X be the coordinate vector of v with respect to b. Then by
definition, f(v) = (A1 · X)w1 + ··· + (Am · X)wm, where Ai denotes the i-th row of A. This
matrix A is determined by the effect of f on the basis elements as follows.

Let

f(v1) = a11 w1 + ··· + am1 wm
  ⋮ (24)
f(vn) = a1n w1 + ··· + amn wm.

Then A turns out to be the transpose of the matrix
\[
\begin{pmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{pmatrix}.
\]

Indeed, we have

f(v) = f(x1 v1 + ··· + xn vn) = x1 f(v1) + ··· + xn f(vn).

Using expression (24) for f(v1), …, f(vn) we find that

f(v) = (a11 x1 + ··· + a1n xn)w1 + ··· + (am1 x1 + ··· + amn xn)wm
= (A1 · X)w1 + ··· + (Am · X)wm.

This proves our assertion.

COROLLARY 2.3.1 Let V be a vector space, and let b, b′ be bases of V. Let v ∈ V. Then
X_{b′}(v) = M^b_{b′}(1V) Xb(v).

Remark 2.3.1. The corollary expresses in a succinct way the manner in which the coordinates
of a vector change when we change the basis of the vector space.

THEOREM 2.3.3 Let V, W be vector spaces. Let b be a basis of V, and b′ a basis of W. Let f, g
be two linear maps of V into W. Let M = M^b_{b′}. Then M(f + g) = M(f) + M(g). If c is a number,
then M(cf) = cM(f). The association f ↦ M^b_{b′}(f) is an isomorphism between the space of
linear maps Hom(V, W) and the space of m × n matrices (if dim V = n and dim W = m).

Proof. The first formulas, showing that f ↦ M(f) is linear, follow at once from the
definition of the associated matrix. The association f ↦ M(f) is injective since M(f) = M(g)
implies f = g, and it is surjective since every m × n matrix represents some linear map of V
into W. Hence f ↦ M(f) gives an isomorphism as stated.
We now pass from the additive properties of the associated matrix to the multiplicative
properties.

THEOREM 2.3.4 Let V, W, U be vector spaces. Let b, b′, b″ be bases for V, W, U respectively.
Let f : V → W and g : W → U be linear maps. Then M^{b′}_{b″}(g) M^b_{b′}(f) = M^b_{b″}(g ∘ f).

Proof. Let A be the matrix associated with f relative to the bases b, b′ and let B be the matrix
associated with g relative to the bases b′, b″. Let v be an element of V and let X be its
(column) coordinate vector relative to b. Then the coordinate vector of f(v) relative to b′ is
AX. By definition, the coordinate vector of g(f(v)) relative to b″ is B(AX), which is equal to
(BA)X. But g(f(v)) = (g ∘ f)(v). Hence the coordinate vector of (g ∘ f)(v) relative to the basis
b″ is (BA)X. By definition, this means that BA is the matrix associated with g ∘ f, and proves
our theorem.
Remark 2.3.2. In many applications, one deals with linear maps of a vector space V into itself.
If a basis b of V is selected, and f : V → V is a linear map, then the matrix M^b_b(f) is usually
called the matrix associated with f relative to b (instead of saying relative to b, b). From
the definition, we see that M^b_b(1V) = I, where I is the unit n × n matrix (n = dim V). As a
direct consequence of Theorem 2.3.4 we obtain

COROLLARY 2.3.2 Let V be a vector space and b, b′ bases of V. Then

M^b_{b′}(1V) M^{b′}_b(1V) = I = M^{b′}_b(1V) M^b_{b′}(1V).

In particular, M^b_{b′}(1V) is invertible.

Proof. Take V = W = U in Theorem 2.3.4, and f = g = 1V and b″ = b. Our assertion then
drops out.
The general formula of Theorem 2.3.4 will allow us to describe precisely how the matrix
associated with a linear map changes when we change bases.

THEOREM 2.3.5 Let f : V → V be a linear map, and let b, b′ be bases of V. Then there exists
an invertible matrix N such that M^{b′}_{b′}(f) = N⁻¹ M^b_b(f) N. In fact, we can take N = M^{b′}_b(1V).

Proof. Applying Theorem 2.3.4 step by step, we find that

M^{b′}_{b′}(f) = M^b_{b′}(1V) M^b_b(f) M^{b′}_b(1V).

Corollary 2.3.2 implies the assertion to be proved.
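
A small numerical case may help to fix the roles of the matrices in Theorem 2.3.5. The following is a minimal sketch in Python; the helpers `matmul` and `inverse_2x2` and the sample data are hypothetical. We take V = Q², b the basis of unit vectors, b′ = {(1, 1), (1, −1)}, and the map f whose matrix relative to b is M. The columns of N are the coordinate vectors of the elements of b′ relative to b.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(N):
    """Inverse of an invertible 2 x 2 matrix, computed exactly with fractions."""
    (a, b), (c, d) = N
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Matrix of the hypothetical map f relative to the unit vectors b.
M = [[2, 1],
     [1, 2]]
# Change of basis matrix: column j holds the coordinates of the j-th vector of b'.
N = [[1, 1],
     [1, -1]]

# Matrix of f relative to b', by the formula of Theorem 2.3.5.
M_prime = matmul(inverse_2x2(N), matmul(M, N))
```

In this example the new matrix comes out diagonal, with diagonal entries 3 and 1, because f sends (1, 1) to 3(1, 1) and (1, −1) to itself.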


Remark 2.3.3. Let V be a finite dimensional vector space over F, and let f : V → V be a linear
map. A basis b of V is said to diagonalize f if the matrix associated with f relative to b is
a diagonal matrix. If there exists such a basis which diagonalizes f , then we say that f is
diagonalizable. It is not always true that a linear map can be diagonalized. Later on, we shall
find sufficient conditions under which it can. If A is an n × n matrix in a field F, we say that
A can be diagonalized (in F) if the linear map on F n represented by A can be diagonalized.
From Theorem 2.3.5, we conclude at once:

THEOREM 2.3.6 Let V be a finite dimensional vector space over F, let f : V → V be a linear
map, and let M be its associated matrix relative to a basis b. Then f (or M) can be diagonalized
(in F) if and only if there exists an invertible matrix N in F such that N⁻¹MN is a diagonal
matrix.

Remark 2.3.4. In view of the importance of the map M ↦ N⁻¹MN, we give it a special name.
Two matrices M, M′ are called similar (over a field F) if there exists an invertible matrix N
in F such that M′ = N⁻¹MN.

Problems
PROBLEM 2.3.1 In each of the following cases, find M^b_{b′}(1V), where V = R³ is the vector
space in each case.
(a) b = {(1, 1, 0), (−1, 1, 1), (0, 1, 2)}
b0 = {(2, 1, 1), (0, 0, 1), (−1, 1, 1)}
(b) b = {(3, 2, 1), (0, −2, 5), (1, 1, 2)}
b0 = {(1, 1, 0), (−1, 2, 4), (2, −1, 1)}

PROBLEM 2.3.2 Let f : V → V be a linear map. Let b = {v1, …, vn} be a basis of V. Suppose
that there are numbers c1, …, cn such that f(vi) = ci vi for i = 1, …, n. What is M^b_b(f)?

PROBLEM 2.3.3 For each real number θ, let f_θ : R² → R² be the linear map represented by
the matrix
\[
r(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]
Show that if θ, θ′ are real numbers, then f_θ f_θ′ = f_{θ+θ′}. Also show that f_θ⁻¹ = f_{−θ}.

PROBLEM 2.3.4 Let X = (1, 2) be a point of the plane. Let r be the rotation through an angle
of π/4. What are the coordinates of r(X) relative to the usual basis {e1, e2}?

PROBLEM 2.3.5 In each of the following cases, let D = d/dt be the derivative. We give a set
of linearly independent maps b. These generate a vector space V, and D is a linear map from
V into itself. Find the matrix associated with D relative to the bases b, b.
(a) {e^t, e^{2t}}
(b) {1, t}
(c) {e^t, te^t}
(d) {1, t, t²}
(e) {1, t, e^t, e^{2t}, te^{2t}}

PROBLEM 2.3.6 (a) Let N be a square matrix. We say that N is nilpotent if there exists a
positive integer r such that N^r = 0. Prove that if N is nilpotent, then I − N is invertible.
(b) State and prove the analogous statement for linear maps of a vector space into itself.

CHAPTER 3. EIGENVECTORS AND EIGENVALUES

3.1 Eigenvectors and eigenvalues
Let V be a vector space and let A : V → V be a linear map of V into itself. An element v ∈ V
is called an eigenvector of A if there exists a number λ such that Av = λv. If v ≠ 0 then λ
is uniquely determined, because λ1 v = λ2 v implies λ1 = λ2. In this case, we say that λ is an
eigenvalue of A belonging to the eigenvector v. We also say that v is an eigenvector with the
eigenvalue λ. Instead of eigenvector and eigenvalue, one also uses the terms characteristic
vector and characteristic value.
If A is a square n × n matrix then an eigenvector of A is by definition an eigenvector of
the linear map of F^n into itself represented by this matrix. Thus an eigenvector X of A is a
(column) vector of F^n for which there exists λ ∈ F such that AX = λX.

THEOREM 3.1.1 Let V be a vector space and let A : V → V be a linear map. Let λ ∈ F. Let Vλ
be the subspace of V generated by all eigenvectors of A having λ as eigenvalue. Then every
non-zero element of Vλ is an eigenvector of A having λ as eigenvalue.

Proof. Let v1, v2 ∈ V be such that Av1 = λv1 and Av2 = λv2. Then

A(v1 + v2) = Av1 + Av2 = λv1 + λv2 = λ(v1 + v2).

If c ∈ F then A(cv1) = cAv1 = cλv1 = λcv1. This proves our theorem.

Remark 3.1.1. The subspace Vλ in Theorem 3.1.1 is called the eigenspace of A belonging to
λ. Note that if v1, v2 are non-zero eigenvectors of A with different eigenvalues λ1 ≠ λ2 then
v1 + v2 is not an eigenvector of A. In fact, we have the following theorem:

THEOREM 3.1.2 Let V be a vector space and let A : V → V be a linear map. Let v1, …, vm be
non-zero eigenvectors of A, with eigenvalues λ1, …, λm respectively. Assume that these
eigenvalues are distinct, i.e. λi ≠ λj if i ≠ j. Then v1, …, vm are linearly independent.

Proof. By induction on m. For m = 1, an element v1 ∈ V, v1 ≠ 0 is linearly independent.
Assume m > 1. Suppose that we have a relation

c1 v1 + · · · + cm vm = 0 (25)

with scalars ci . We must prove all ci = 0. We multiply our relation (25) by λ1 to obtain

c1 λ1 v1 + · · · + cm λ1 vm = 0.

We also apply A to our relation (25). By linearity, we obtain

c1 λ1 v1 + · · · + cm λm vm = 0.

We now subtract these last two expressions, and obtain

c2 (λ2 − λ1 )v2 + · · · + cm (λm − λ1 )vm = 0.

Since λj − λ1 ≠ 0 for j = 2, …, m, we conclude by induction that

c2 = · · · = cm = 0.

Going back to our original relation, we see that c1 v1 = 0, whence c1 = 0, and our theorem is
proved.

Remark 3.1.2. In Theorem 3.1.2, suppose V is a vector space of dimension n and A : V → V is
a linear map having n non-zero eigenvectors v1, …, vn whose eigenvalues λ1, …, λn are distinct.
Then {v1, …, vn} is a basis of V.
Remark 3.1.3. Let V be a finite dimensional vector space, and let L : V → V be a linear map.
Let {v1 , . . . , vn } be a basis of V . We say that this basis diagonalizes L if each vi is an eigen-
vector of L, so Lvi = ci vi with some scalar ci . Then the matrix representing L with respect to
this basis is the diagonal matrix
\[
A = \begin{pmatrix}
c_1 & 0 & \cdots & 0 \\
0 & c_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & c_n
\end{pmatrix}.
\]

We say that the linear map L can be diagonalized if there exists a basis of V consisting of
eigenvectors.

Problems
PROBLEM 3.1.1 Let a ∈ F and a ≠ 0. Prove that the eigenvectors of the matrix
\( \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \) generate a 1-dimensional space, and give a
basis for this space.

PROBLEM 3.1.2 Prove that the eigenvectors of the matrix
\( \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} \) generate a 2-dimensional space, and give a
basis for this space. What are the eigenvalues of this matrix?

PROBLEM 3.1.3 Let A be a diagonal matrix with diagonal elements a11, …, ann. What is the
dimension of the space generated by the eigenvectors of A? Exhibit a basis for this space, and
give the eigenvalues.

PROBLEM 3.1.4 Let A = (aij) be an n × n matrix such that for each i = 1, …, n we have
\( \sum_{j=1}^{n} a_{ij} = 0 \). Show that 0 is an eigenvalue of A.

PROBLEM 3.1.5 Show that if θ ∈ R, then the matrix
\( A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \)
always has an eigenvector in R², and in fact that there exists a vector v1 such that Av1 = v1.
[Hint: Let the first component of v1 be x = sin θ/(1 − cos θ) if cos θ ≠ 1. Then solve for y.
What if cos θ = 1?]

PROBLEM 3.1.6 Let V be a finite dimensional vector space. Let A, B be linear maps of V into
itself. Assume that AB = BA. Show that if v is an eigenvector of A, with eigenvalue λ, then
Bv is an eigenvector of A, with eigenvalue λ also if Bv ≠ 0.

3.2 The characteristic polynomial


We shall now see how we can use determinants to find the eigenvalues of a matrix.

THEOREM 3.2.1 Let V be a finite dimensional vector space, and let λ be a number. Let
A : V → V be a linear map. Then λ is an eigenvalue of A if and only if A − λI is not invertible.

Proof. Assume that λ is an eigenvalue of A. Then there exists an element v ∈ V, v ≠ 0 such
that Av = λv. Hence Av − λv = 0, and (A − λI)v = 0. Hence A − λI has a non-zero kernel, and
A − λI cannot be invertible. Conversely, assume that A − λI is not invertible. Since V is finite
dimensional, A − λI must then have a non-zero kernel, meaning that there exists an element
v ∈ V, v ≠ 0 such that (A − λI)v = 0. Hence Av − λv = 0, and Av = λv. Thus λ is an eigenvalue
of A. This proves our theorem.

Let A be an n × n matrix, A = (aij). We define the characteristic polynomial PA to be the
determinant PA(t) = Det(tI − A), or written in full,
\[
P_A(t) = \begin{vmatrix}
t - a_{11} & -a_{12} & \cdots & -a_{1n} \\
-a_{21} & t - a_{22} & \cdots & -a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n1} & -a_{n2} & \cdots & t - a_{nn}
\end{vmatrix}.
\]

We can also view A as a linear map from F n to F n , and we also say that PA (t) is the charac-
teristic polynomial of this linear map.

THEOREM 3.2.2 Let A be an n × n matrix. A number λ is an eigenvalue of A if and only if λ
is a root of the characteristic polynomial of A.

Proof. Assume that λ is an eigenvalue of A. Then λI − A is not invertible by Theorem 3.2.1,
hence Det(λI − A) = 0. Consequently λ is a root of the characteristic polynomial. Conversely,
if λ is a root of the characteristic polynomial, then Det(λI − A) = 0, and hence we conclude
that λI − A is not invertible. Hence λ is an eigenvalue of A by Theorem 3.2.1.
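
For a 2 × 2 matrix with rows (a, b) and (c, d), expanding Det(tI − A) gives t² − (a + d)t + (ad − bc), so Theorem 3.2.2 can be tried out by hand or by machine. The following is a minimal sketch in Python; the function names and the sample matrix are hypothetical.

```python
def char_poly_2x2(A):
    """Coefficients (highest degree first) of Det(tI - A) for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)

def evaluate(coeffs, t):
    """Evaluate a polynomial, given by its coefficients, at t (Horner's scheme)."""
    result = 0
    for coeff in coeffs:
        result = result * t + coeff
    return result

def apply_matrix(A, X):
    """Return AX for a matrix A (list of rows) and a column vector X (flat list)."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

# Hypothetical sample matrix.
A = [[2, 1],
     [1, 2]]
P = char_poly_2x2(A)  # coefficients of t^2 - 4t + 3 = (t - 1)(t - 3)

# The roots 1 and 3 are eigenvalues; (1, -1) and (1, 1) are matching eigenvectors.
image_1 = apply_matrix(A, [1, -1])  # equals 1 * (1, -1)
image_3 = apply_matrix(A, [1, 1])   # equals 3 * (1, 1)
```

The number 2, on the other hand, is not a root of this polynomial, and accordingly A − 2I is invertible.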

Remark 3.2.1. Theorem 3.2.2 gives us an explicit way of determining the eigenvalues of a
matrix, provided that we can determine explicitly the roots of its characteristic polynomial.
This is sometimes easy, especially in exercises at the end of sections when the matrices are
adjusted in such a way that one can determine the roots by inspection, or simple devices. It is
considerably harder in other cases.
Suppose the field of scalars F is the complex numbers. We then know the fact that every
non-constant polynomial with complex coefficients has a complex root. If A is a complex n × n
matrix, then the characteristic polynomial of A has complex coefficients, and has degree n ≥ 1,
so has a complex root which is an eigenvalue. Thus we have:

THEOREM 3.2.3 Let A be an n × n matrix with complex components. Then A has a non-zero
eigenvector and an eigenvalue in the complex numbers.

This is not true over the real numbers.

THEOREM 3.2.4 Let A, B be two n × n matrices, and assume that B is invertible. Then the
characteristic polynomial of A is equal to the characteristic polynomial of B⁻¹AB.

Proof. By definition, and properties of the determinant,

Det(tI − A) = Det(B⁻¹(tI − A)B) = Det(tB⁻¹B − B⁻¹AB)
= Det(tI − B⁻¹AB).

This proves what we wanted.


Let L : V → V be a linear map of a finite dimensional vector space into itself, so L is an
operator. Select a basis b for V and let A = M^b_b(L) be the matrix associated with L with respect
to this basis. We then define the characteristic polynomial of L to be the characteristic
polynomial of A. If we change basis, then A changes to B⁻¹AB where B is invertible. By
Theorem 3.2.4, this implies that the characteristic polynomial does not depend on the choice
of basis.
Theorem 3.2.3 can be interpreted for L as stating:

THEOREM 3.2.5 Let V be a finite dimensional vector space over C of dimension > 0. Let
L : V → V be an operator. Then L has a non-zero eigenvector and an eigenvalue in the complex
numbers.

It should be remembered that in the case of complex eigenvalues, the vector space is over
the complex numbers, so it consists of linear combinations of the given basis elements with
complex coefficients.

Problems
PROBLEM 3.2.1 Let A be a diagonal matrix,
\[
A = \begin{pmatrix}
a_1 & 0 & \cdots & 0 \\
0 & a_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & a_n
\end{pmatrix}.
\]
(a) What is the characteristic polynomial of A?
(b) What are its eigenvalues?

PROBLEM 3.2.2 Let A be a triangular matrix,
\[
A = \begin{pmatrix}
a_{11} & 0 & \cdots & 0 \\
a_{21} & a_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]
What is the characteristic polynomial of A, and what are its eigenvalues?

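PROBLEM 3.2.3 Find the characteristic polynomial, eigenvalues, and bases for the eigenspaces
of the following matrices:
(a) \( \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix} \).
(b) \( \begin{pmatrix} 3 & 2 \\ -1 & 0 \end{pmatrix} \).
(c) \( \begin{pmatrix} 4 & 0 & 1 \\ -2 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix} \).
(d) \( \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix} \).

PROBLEM 3.2.4 Find the eigenvalues of the following matrices. Show that the eigenvectors
form a 1-dimensional space.
(a) \( \begin{pmatrix} 2 & -1 \\ 1 & 0 \end{pmatrix} \).
(b) \( \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \).
(c) \( \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \).
(d) \( \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \).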

PROBLEM 3.2.5 Let V be an n-dimensional vector space and assume that the characteristic
polynomial of a linear map A : V → V has n distinct roots. Show that V has a basis consisting
of eigenvectors of A.

PROBLEM 3.2.6 Let A be an invertible matrix. If λ is an eigenvalue of A show that λ ≠ 0 and
that λ⁻¹ is an eigenvalue of A⁻¹.

PROBLEM 3.2.7 Let V be the space generated over R by the two functions sin t and cos t.
Does the derivative (viewed as a linear map of V into itself) have any nonzero eigenvectors in
V? If so, which?

PROBLEM 3.2.8 Let A, B be square matrices of the same size. Show that the eigenvalues of
AB are the same as the eigenvalues of BA.

CHAPTER 4. SCALAR PRODUCTS AND ORTHOGONALITY

4.1 Scalar products
DEFINITION 4.1.1 Let V be a vector space over a field F. A scalar product on V is an
association which to any pair of elements v, w of V associates a scalar, denoted by ⟨v, w⟩, or also
v · w, satisfying the following properties:
SP1 We have ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V.
SP2 If u, v, w are elements of V, then ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩.
SP3 If x ∈ F, then ⟨xu, v⟩ = x⟨u, v⟩ and ⟨u, xv⟩ = x⟨u, v⟩.
The scalar product is said to be non-degenerate if in addition it also satisfies the condition:
SP4 If v is an element of V, and ⟨v, w⟩ = 0 for all w ∈ V, then v = 0.

Let V be a vector space with a scalar product. We define elements v, w of V to be orthogonal
or perpendicular, and write v ⊥ w, if ⟨v, w⟩ = 0. If S is a subset of V, we denote by S⊥ the
set of all elements w ∈ V which are perpendicular to all elements of S, i.e. ⟨v, w⟩ = 0 for all
v ∈ S. Then using SP2 and SP3, one verifies at once that S⊥ is a subspace of V, called the
orthogonal space of S. If w is perpendicular to S, we also write w ⊥ S. Let U be the subspace
of V generated by the elements of S. If w is perpendicular to S, and if v1, v2 ∈ S, then

⟨w, v1 + v2⟩ = ⟨w, v1⟩ + ⟨w, v2⟩ = 0.

If c is a scalar, then
⟨w, cv1⟩ = c⟨w, v1⟩ = 0.
Hence w is perpendicular to every linear combination of elements of S, and hence w is
perpendicular to U.
Let V again be a vector space over the field F, with a scalar product. Let {v1, …, vn} be a
basis of V. We say that it is an orthogonal basis if ⟨vi, vj⟩ = 0 for all i ≠ j. We shall show later
that if V is a finite dimensional vector space, with a scalar product, then there always exists
an orthogonal basis. However, we shall first discuss important special cases over the real and
complex numbers.


4.1.1 The real positive definite case


DEFINITION 4.1.2 Let V be a vector space over R, with a scalar product. We say this scalar
product is positive definite if ⟨v, v⟩ ≥ 0 for all v ∈ V, and ⟨v, v⟩ > 0 if v ≠ 0.

Let V be a vector space over R, with a positive definite scalar product denoted by ⟨ , ⟩. Let
W be a subspace. Then W has a scalar product defined by the same rule defining the scalar
product in V. In other words, if w, w′ are elements of W, we may form their scalar product
⟨w, w′⟩. This scalar product on W is obviously positive definite.

DEFINITION 4.1.3 We define the norm of an element v ∈ V by ‖v‖ = √⟨v, v⟩.

Remark 4.1.1. If c is any number, then we immediately get ‖cv‖ = |c| ‖v‖, because

‖cv‖ = √⟨cv, cv⟩ = √(c²⟨v, v⟩) = |c| ‖v‖.

The distance between two elements v, w of V is defined to be

dist(v, w) = ‖v − w‖.

Let us justify our definition of perpendicularity. The intuition of plane geometry and the
following figure tell us that v is perpendicular to w if and only if ‖v − w‖ = ‖v + w‖.

[Figure: the points −v, 0, v on a line, the point w above 0, and the segments of lengths
‖w + v‖ and ‖w − v‖ joining w to −v and to v.]

But then by algebra:

‖v − w‖ = ‖v + w‖ ⇔ ‖v − w‖² = ‖v + w‖²
⇔ (v − w)² = (v + w)²
⇔ v² − 2v · w + w² = v² + 2v · w + w²
⇔ 4v · w = 0
⇔ v · w = 0.

This is the desired justification.

D EFINITION 4.1.4 We say that an element v ∈ V is a unit vector if kvk = 1. If v ∈ V and v 6= 0,


then v/kvk is a unit vector.

THEOREM 4.1.1 (The Pythagoras theorem.) If $v, w$ are perpendicular, then
\[ \|v + w\|^2 = \|v\|^2 + \|w\|^2. \]



THEOREM 4.1.2 (The parallelogram law.) For any $v, w$ we have
\[ \|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2. \]
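
Before proving these two identities, it can help to test them on numbers. The sketch below is only a numerical check on particular vectors, not a proof. It assumes $V = \mathbb{R}^3$ with the standard dot product, and the helper names (`dot`, `add`, `sub`, `norm_sq`) are our own.

```python
def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = <v, v>; staying with squares keeps integers exact."""
    return dot(v, v)

# Pythagoras: v and w are perpendicular because <v, w> = 2 + 2 - 4 = 0.
v, w = (1, 2, 2), (2, 1, -2)
print(dot(v, w))                                # 0
print(norm_sq(add(v, w)), norm_sq(v) + norm_sq(w))  # 18 18

# Parallelogram law: holds for any pair, perpendicular or not.
u, x = (1, 2, 3), (4, -1, 0)
lhs = norm_sq(add(u, x)) + norm_sq(sub(u, x))
rhs = 2 * norm_sq(u) + 2 * norm_sq(x)
print(lhs, rhs)                                 # 62 62
```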

THEOREM 4.1.3 Let $w$ be an element of $V$ such that $\|w\| \neq 0$. For any $v$ there exists a unique number $c$ such that $v - cw$ is perpendicular to $w$.

Proof. For $v - cw$ to be perpendicular to $w$ we must have $\langle v - cw, w \rangle = 0$, whence $\langle v, w \rangle - \langle cw, w \rangle = 0$ and $\langle v, w \rangle = c \langle w, w \rangle$. Thus
\[ c = \frac{\langle v, w \rangle}{\langle w, w \rangle}. \]
Conversely, letting $c$ have this value shows that $v - cw$ is perpendicular to $w$ (prove it!).

We call $c$ the component of $v$ along $w$. We call $cw$ the projection of $v$ along $w$.
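
The formula for $c$ translates directly into a computation. The following is a minimal sketch, assuming $V = \mathbb{R}^n$ with the standard dot product and integer or rational coordinates, so that exact fractions can be used. The function names `component` and `projection` are our own.

```python
from fractions import Fraction

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def component(v, w):
    """Component c = <v, w> / <w, w> of v along w (w must be non-zero)."""
    return Fraction(dot(v, w), dot(w, w))

def projection(v, w):
    """Projection c*w of v along w."""
    c = component(v, w)
    return [c * b for b in w]

v = [1, 2, 3]
w = [1, 1, 1]
c = component(v, w)
print(c)                                   # 2
residual = [a - p for a, p in zip(v, projection(v, w))]
print([int(x) for x in residual])          # [-1, 0, 1]
print(dot(residual, w))                    # 0, so v - cw is perpendicular to w
```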

THEOREM 4.1.4 (Schwarz inequality.) For all $v, w \in V$ we have $|\langle v, w \rangle| \leq \|v\|\,\|w\|$.

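Proof. If $w = 0$, then both sides are equal to $0$ and our inequality is obvious. Next, assume that $w = e$ is a unit vector, that is $e \in V$ and $\|e\| = 1$. If $c$ is the component of $v$ along $e$, then $v - ce$ is perpendicular to $e$, and also perpendicular to $ce$. Hence by the Pythagoras theorem, we find
\[ \|v\|^2 = \|v - ce\|^2 + \|ce\|^2 = \|v - ce\|^2 + c^2. \]
Hence $c^2 \leq \|v\|^2$, so that $|c| \leq \|v\|$. Since $\langle e, e \rangle = 1$, we have $c = \langle v, e \rangle$, so this says $|\langle v, e \rangle| \leq \|v\|$. Finally, if $w$ is an arbitrary non-zero element of $V$, then $w / \|w\|$ is a unit vector, and therefore
\[ \left| \left\langle v, \frac{w}{\|w\|} \right\rangle \right| \leq \|v\|. \]
This yields $|\langle v, w \rangle| \leq \|v\|\,\|w\|$, as desired.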
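
A quick numerical sanity check of the inequality can be sketched as follows. We assume $V = \mathbb{R}^3$ with the standard dot product and compare the squared quantities $\langle v, w \rangle^2$ and $\langle v, v \rangle \langle w, w \rangle$, which is equivalent to the stated inequality because both sides are non-negative. The name `schwarz_holds` is our own.

```python
def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def schwarz_holds(v, w):
    """Check <v, w>^2 <= <v, v><w, w>, the squared form of the Schwarz inequality."""
    return dot(v, w) ** 2 <= dot(v, v) * dot(w, w)

v = (1, 2, 3)
w = (4, -1, 2)
print(dot(v, w) ** 2, dot(v, v) * dot(w, w))    # 64 294
print(schwarz_holds(v, w))                      # True

# For these vectors the two sides coincide, since u is a scalar multiple of v.
u = (2, 4, 6)  # u = 2v
print(dot(v, u) ** 2 == dot(v, v) * dot(u, u))  # True
```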

THEOREM 4.1.5 (Triangle inequality.) If $v, w \in V$, then $\|v + w\| \leq \|v\| + \|w\|$.

THEOREM 4.1.6 Let $v_1, \dots, v_n$ be non-zero elements of $V$ which are mutually perpendicular, that is $\langle v_i, v_j \rangle = 0$ if $i \neq j$. Let $v$ be an element of $V$, and let $c_i$ be the component of $v$ along $v_i$. Then $v - c_1 v_1 - \cdots - c_n v_n$ is perpendicular to $v_1, \dots, v_n$.

Proof. To prove this, all we have to do is to take the scalar product with $v_j$ for any $j$. All the terms involving $\langle v_i, v_j \rangle$ will give $0$ if $i \neq j$, and we shall have two remaining terms $\langle v, v_j \rangle - c_j \langle v_j, v_j \rangle$ which cancel. Thus subtracting linear combinations as above orthogonalizes $v$ with respect to $v_1, \dots, v_n$.

The next theorem shows that $c_1 v_1 + \cdots + c_n v_n$ gives the closest approximation to $v$ as a linear combination of $v_1, \dots, v_n$.

THEOREM 4.1.7 Let $v_1, \dots, v_n$ be vectors which are mutually perpendicular, and such that $\|v_i\| \neq 0$ for all $i$. Let $v$ be an element of $V$, and let $c_i$ be the component of $v$ along $v_i$. Let $a_1, \dots, a_n$ be numbers. Then
\[ \left\| v - \sum_{k=1}^{n} c_k v_k \right\| \leq \left\| v - \sum_{k=1}^{n} a_k v_k \right\|. \]

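Proof. We know that $v - \sum_{k=1}^{n} c_k v_k$ is perpendicular to each $v_i$, $i = 1, \dots, n$. Hence it is perpendicular to any linear combination of $v_1, \dots, v_n$. By the Pythagoras theorem we therefore have:
\begin{align*}
\left\| v - \sum_{k=1}^{n} a_k v_k \right\|^2
&= \left\| v - \sum_{k=1}^{n} c_k v_k + \sum_{k=1}^{n} (c_k - a_k) v_k \right\|^2 \\
&= \left\| v - \sum_{k=1}^{n} c_k v_k \right\|^2 + \left\| \sum_{k=1}^{n} (c_k - a_k) v_k \right\|^2 \\
&\geq \left\| v - \sum_{k=1}^{n} c_k v_k \right\|^2,
\end{align*}
and thus our theorem is proved.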
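
The two preceding theorems can be tried out on a small example. The sketch below assumes $V = \mathbb{R}^3$ with the standard dot product; the vectors and the competing coefficients $a_k$ are arbitrary choices of ours. It checks on this one example that the residual $v - \sum c_k v_k$ is perpendicular to each $v_k$, and that replacing the $c_k$ by other numbers does not bring the combination closer to $v$.

```python
from fractions import Fraction

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def residual(v, vectors, coeffs):
    """Return v - sum_k coeffs[k] * vectors[k]."""
    out = [Fraction(x) for x in v]
    for a, u in zip(coeffs, vectors):
        out = [x - a * y for x, y in zip(out, u)]
    return out

v1, v2 = (1, 1, 0), (1, -1, 0)     # mutually perpendicular, non-zero
v = (2, 3, 5)

# Components of v along v1 and v2.
c = [Fraction(dot(v, u), dot(u, u)) for u in (v1, v2)]
print(c[0], c[1])                  # 5/2 -1/2

best = residual(v, (v1, v2), c)
print(dot(best, v1), dot(best, v2))  # 0 0

# Compare squared distances with another choice of coefficients.
other = residual(v, (v1, v2), [1, 1])
print(dot(best, best), dot(other, other))  # 25 34
```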

THEOREM 4.1.8 (Bessel inequality.) If $v_1, \dots, v_n$ are mutually perpendicular unit vectors, and if $c_i$ is the component of $v$ along $v_i$, then
\[ \sum_{i=1}^{n} c_i^2 \leq \|v\|^2. \]

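Proof. The elements $v - \sum_{i=1}^{n} c_i v_i, v_1, \dots, v_n$ are mutually perpendicular. Therefore:
\begin{align*}
\|v\|^2 &= \left\| v - \sum_{i=1}^{n} c_i v_i \right\|^2 + \left\| \sum_{i=1}^{n} c_i v_i \right\|^2 \\
&\geq \left\| \sum_{i=1}^{n} c_i v_i \right\|^2 \\
&= \sum_{i=1}^{n} c_i^2
\end{align*}
because $v_1, \dots, v_n$ are mutually perpendicular and $\|v_i\|^2 = 1$. This proves the theorem.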
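
As a small check of the Bessel inequality, the following sketch assumes $V = \mathbb{R}^3$ with the standard dot product and uses standard unit vectors as the mutually perpendicular unit vectors. For a unit vector $v_i$ the component is simply $c_i = \langle v, v_i \rangle$. The name `bessel_sides` is our own.

```python
def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def bessel_sides(v, unit_vectors):
    """Return (sum of c_i^2, ||v||^2), where c_i = <v, v_i> for unit vectors v_i."""
    left = sum(dot(v, u) ** 2 for u in unit_vectors)
    return left, dot(v, v)

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
v = (3, 4, 12)

print(bessel_sides(v, [e1, e2]))      # (25, 169): strict inequality
print(bessel_sides(v, [e1, e2, e3]))  # (169, 169): equality for a full orthonormal basis
```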

Problems
PROBLEM 4.1.1 Prove Theorem 4.1.1, Theorem 4.1.2, and Theorem 4.1.5.

PROBLEM 4.1.2 Let $V$ be a vector space with a scalar product. Show that $\langle 0, v \rangle = 0$ for all $v$ in $V$.

PROBLEM 4.1.3 Assume that the scalar product is positive definite. Let $v_1, \dots, v_n$ be non-zero elements which are mutually perpendicular, that is $\langle v_i, v_j \rangle = 0$ if $i \neq j$. Show that they are linearly independent.

4.2 Orthogonal bases, positive definite case

Let $V$ be a vector space with a positive definite scalar product throughout this section. A basis $\{v_1, \dots, v_n\}$ of $V$ is said to be orthogonal if its elements are mutually perpendicular, i.e. $\langle v_i, v_j \rangle = 0$ whenever $i \neq j$. If in addition each element of the basis has norm $1$, then the basis is called orthonormal.

THEOREM 4.2.1 Let $V$ be a finite dimensional vector space, with a positive definite scalar product. Let $W$ be a subspace of $V$, and let $\{w_1, \dots, w_m\}$ be an orthogonal basis of $W$. If $W \neq V$, then there exist elements $w_{m+1}, \dots, w_n$ of $V$ such that $\{w_1, \dots, w_n\}$ is an orthogonal basis of $V$.

Proof. The method of proof is as important as the theorem, and is called the Gram-Schmidt orthogonalization process. We know that we can find elements $v_{m+1}, \dots, v_n$ of $V$ such that $\{w_1, \dots, w_m, v_{m+1}, \dots, v_n\}$ is a basis. Of course, it is in general not an orthogonal basis. Let $W_{m+1}$ be the space generated by $w_1, \dots, w_m, v_{m+1}$. We shall first obtain an orthogonal basis of $W_{m+1}$. The idea is to take $v_{m+1}$ and subtract from it its projection along $w_1, \dots, w_m$. Thus we let
\[ c_1 = \frac{\langle v_{m+1}, w_1 \rangle}{\langle w_1, w_1 \rangle}, \quad \dots, \quad c_m = \frac{\langle v_{m+1}, w_m \rangle}{\langle w_m, w_m \rangle}. \]
Let $w_{m+1} = v_{m+1} - c_1 w_1 - \cdots - c_m w_m$. Then $w_{m+1}$ is perpendicular to $w_1, \dots, w_m$. Furthermore, $w_{m+1} \neq 0$ (otherwise $v_{m+1}$ would be linearly dependent on $w_1, \dots, w_m$) and $v_{m+1}$ lies in the space generated by $w_1, \dots, w_{m+1}$ because $v_{m+1} = w_{m+1} + c_1 w_1 + \cdots + c_m w_m$. Hence $\{w_1, \dots, w_{m+1}\}$ is an orthogonal basis of $W_{m+1}$. We can now proceed by induction, showing that the space $W_{m+s}$ generated by $w_1, \dots, w_m, v_{m+1}, \dots, v_{m+s}$ has an orthogonal basis
\[ \{w_1, \dots, w_{m+1}, \dots, w_{m+s}\} \]
with $s = 1, \dots, n - m$. This concludes the proof.

COROLLARY 4.2.1 Let $V$ be a finite dimensional vector space with a positive definite scalar product. Assume that $V \neq \{0\}$. Then $V$ has an orthogonal basis.

Proof. By hypothesis, there exists an element $v_1$ of $V$ such that $v_1 \neq 0$. We let $W$ be the subspace generated by $v_1$, and apply the theorem to get the desired basis.

We summarize the procedure of Theorem 4.2.1 once more. Suppose we are given an arbitrary basis $\{v_1, \dots, v_n\}$ of $V$. We wish to orthogonalize it. We proceed as follows. We let
\begin{align*}
v_1' &= v_1, \\
v_2' &= v_2 - \frac{\langle v_2, v_1' \rangle}{\langle v_1', v_1' \rangle} v_1', \\
v_3' &= v_3 - \frac{\langle v_3, v_2' \rangle}{\langle v_2', v_2' \rangle} v_2' - \frac{\langle v_3, v_1' \rangle}{\langle v_1', v_1' \rangle} v_1', \\
&\;\;\vdots \\
v_n' &= v_n - \frac{\langle v_n, v_{n-1}' \rangle}{\langle v_{n-1}', v_{n-1}' \rangle} v_{n-1}' - \cdots - \frac{\langle v_n, v_1' \rangle}{\langle v_1', v_1' \rangle} v_1'.
\end{align*}
Then $\{v_1', \dots, v_n'\}$ is an orthogonal basis. Given an orthogonal basis, we can always obtain an orthonormal basis by dividing each vector by its norm.
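
The procedure just summarized can be sketched in a few lines of Python. This is a minimal illustration, not a numerically robust implementation: it assumes $V = \mathbb{R}^n$ with the standard dot product, that the input vectors are linearly independent, and that their coordinates are integers or rationals so that exact fractions can be used. The function names `gram_schmidt` and `normalize` are our own.

```python
import math
from fractions import Fraction

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(basis):
    """Orthogonalize a linearly independent list of vectors.

    Each v_k' is v_k minus its projections along the earlier v_1', ..., v_{k-1}'.
    """
    ortho = []
    for v in basis:
        v = [Fraction(x) for x in v]
        new = list(v)
        for u in ortho:
            c = dot(v, u) / dot(u, u)      # component of v_k along u
            new = [a - c * b for a, b in zip(new, u)]
        ortho.append(new)
    return ortho

def normalize(v):
    """Divide v by its norm (the result is in floating point)."""
    length = math.sqrt(dot(v, v))
    return [float(x) / length for x in v]

basis = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
ortho = gram_schmidt(basis)
for u in ortho:
    print([str(x) for x in u])
# ['1', '1', '0']
# ['1/2', '-1/2', '1']
# ['-2/3', '2/3', '2/3']

orthonormal = [normalize(u) for u in ortho]
```

Exact fractions are used for the orthogonal vectors so that perpendicularity can be checked exactly; normalizing introduces square roots, so the orthonormal vectors are only approximate floating point values.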

THEOREM 4.2.2 Let $V$ be a vector space over $\mathbb{R}$ with a positive definite scalar product, of dimension $n$. Let $W$ be a subspace of $V$ of dimension $r$. Let $W^{\perp}$ be the subspace of $V$ consisting of all elements which are perpendicular to $W$. Then $V$ is the direct sum of $W$ and $W^{\perp}$, and $W^{\perp}$ has dimension $n - r$. In other words, $\dim W + \dim W^{\perp} = \dim V$.

Proof. If $W$ consists of $0$ alone, or if $W = V$, then our assertion is obvious. We therefore assume that $W \neq V$ and $W \neq \{0\}$. Let $\{w_1, \dots, w_r\}$ be an orthonormal basis of $W$. By Theorem 4.2.1, and after dividing each new vector by its norm, there exist elements $u_{r+1}, \dots, u_n$ of $V$ such that $\{w_1, \dots, w_r, u_{r+1}, \dots, u_n\}$ is an orthonormal basis of $V$. We shall prove that $\{u_{r+1}, \dots, u_n\}$ is an orthonormal basis of $W^{\perp}$.

Let $u$ be an element of $W^{\perp}$. Then there exist numbers $x_1, \dots, x_n$ such that
\[ u = x_1 w_1 + \cdots + x_r w_r + x_{r+1} u_{r+1} + \cdots + x_n u_n. \]
Since $u$ is perpendicular to $W$, taking the scalar product with any $w_i$ ($i = 1, \dots, r$) we find
\[ 0 = \langle u, w_i \rangle = x_i \langle w_i, w_i \rangle = x_i. \]
Hence all $x_i = 0$ ($i = 1, \dots, r$). Therefore $u$ is a linear combination of $u_{r+1}, \dots, u_n$.

Conversely, let $u = x_{r+1} u_{r+1} + \cdots + x_n u_n$ be a linear combination of $u_{r+1}, \dots, u_n$. Taking the scalar product with any $w_i$ yields $0$. Hence $u$ is perpendicular to all $w_i$ ($i = 1, \dots, r$), and hence is perpendicular to $W$. This proves that $u_{r+1}, \dots, u_n$ generate $W^{\perp}$. Since they are mutually perpendicular, and of norm $1$, they form an orthonormal basis of $W^{\perp}$, whose dimension is therefore $n - r$. Furthermore, an element of $V$ has a unique expression as a linear combination
\[ x_1 w_1 + \cdots + x_r w_r + x_{r+1} u_{r+1} + \cdots + x_n u_n \]
and hence a unique expression as a sum $w + u$ with $w \in W$ and $u \in W^{\perp}$. Hence $V$ is the direct sum of $W$ and $W^{\perp}$.
Remark 4.2.1. The space $W^{\perp}$ is called the orthogonal complement of $W$.
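
The proof of Theorem 4.2.2 suggests a way to compute an orthogonal basis of the orthogonal complement: orthogonalize a basis of $W$, then continue the Gram-Schmidt process with further vectors of $V$ and keep the new non-zero vectors. The sketch below is one possible way to do this, under our own assumptions: $V = \mathbb{R}^n$ with the standard dot product, the given vectors of $W$ are linearly independent with rational coordinates, and the standard unit vectors are used as candidates for the extension. The function names are hypothetical.

```python
from fractions import Fraction

def dot(v, w):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def subtract_projections(v, ortho):
    """Return v minus its projections along the mutually perpendicular vectors in ortho."""
    v = [Fraction(x) for x in v]
    new = list(v)
    for u in ortho:
        c = dot(v, u) / dot(u, u)
        new = [a - c * b for a, b in zip(new, u)]
    return new

def orthogonal_complement(w_basis, n):
    """Orthogonal basis of W-perp in R^n, where W is spanned by w_basis."""
    ortho = []
    for v in w_basis:                      # orthogonal basis of W
        ortho.append(subtract_projections(v, ortho))
    r = len(ortho)
    for i in range(n):                     # try to extend with e_1, ..., e_n
        e = [1 if j == i else 0 for j in range(n)]
        new = subtract_projections(e, ortho)
        if any(x != 0 for x in new):       # zero means e_i was already in the span
            ortho.append(new)
    return ortho[r:]                       # the vectors added after the basis of W

comp = orthogonal_complement([(1, 1, 0)], 3)
print(len(comp))                           # 2, which is n - r = 3 - 1
for u in comp:
    print([str(x) for x in u], dot(u, (1, 1, 0)))
# ['1/2', '-1/2', '0'] 0
# ['0', '0', '1'] 0
```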

Problems
PROBLEM 4.2.1 What is the dimension of the subspace of $\mathbb{R}^6$ perpendicular to the two vectors $(1, 1, -2, 3, 4, 5)$ and $(0, 0, 1, 1, 0, 7)$?

PROBLEM 4.2.2 Find an orthonormal basis for the subspace of $\mathbb{R}^3$ generated by the following vectors: (a) $(1, 1, -1)$ and $(1, 0, 1)$. (b) $(2, 1, 1)$ and $(1, 3, -1)$.

PROBLEM 4.2.3 In problems 4.2.3 through 4.2.5 we consider the vector space of continuous real-valued maps on the interval $[0, 1]$. We define the scalar product of two such maps $f, g$ by the rule
\[ \langle f, g \rangle = \int_0^1 f(t) g(t) \, dt. \]
Using standard properties of the integral, verify that this is a scalar product.

PROBLEM 4.2.4 Let $V$ be the subspace of maps generated by the two maps $f, g$ such that $f(t) = t$ and $g(t) = t^2$. Find an orthonormal basis for $V$.

PROBLEM 4.2.5 Let $V$ be the subspace generated by the three maps $1, t, t^2$ (where $1$ is the constant map). Find an orthonormal basis for $V$.

PROBLEM 4.2.6 Let $V$ be a finite dimensional vector space over $\mathbb{R}$, with a positive definite scalar product. Let $\{v_1, \dots, v_m\}$ be a set of elements of $V$, of norm $1$, and mutually perpendicular (i.e. $\langle v_i, v_j \rangle = 0$ if $i \neq j$). Assume that for every $v \in V$ we have
\[ \|v\|^2 = \sum_{i=1}^{m} \langle v, v_i \rangle^2. \]
Show that $\{v_1, \dots, v_m\}$ is a basis of $V$.

PROBLEM 4.2.7 Let $V$ be a finite dimensional vector space over $\mathbb{R}$, with a positive definite scalar product. Prove the parallelogram law: for any elements $u, v \in V$,
\[ \|u + v\|^2 + \|u - v\|^2 = 2 \left( \|u\|^2 + \|v\|^2 \right). \]



BIBLIOGRAPHY

[1] J. Dieudonné, Linear algebra and geometry, 1969, Hermann.

[2] Paul R. Halmos, Finite-dimensional vector spaces, 1987, Springer.

[3] Paul R. Halmos, Linear algebra problem book, 1995, The Mathematical Association of
America.

[4] S. Mac Lane, G. Birkhoff, Algebra, third edition, 1999, AMS Chelsea Publishing.

[5] S. Lang, Linear Algebra, third edition, 1987, Springer.
