
Graduate Algebra I (MATH600, UMD) Lecture Notes

Fall 2016, Prof. Tamvakis


Notes by Joseph Heavner
September 24, 2016
Abstract
These are simply a retyping of my in-class notes, perhaps with a few
gaps filled in. They are not sufficiently detailed to replace our course
text (Dummit & Foote) or Prof. Tamvakis's lectures, but these notes
may serve as a good supplement to such resources, particularly for review
purposes. Any special material should be marked in some way or another
as supplementary.
Send any corrections or suggestions to jheavner (at) umd (dot) edu.

Contents
1 August 30: Introduction to the Course, Vector Spaces over a Field
  1.1 Logistics
  1.2 The Course
  1.3 Guiding Philosophy
  1.4 Notation
  1.5 High-Level Overview
  1.6 Vector Spaces

2 September 1: Linear Independence, Bases, and the Dimension of V
  2.1 Linear Independence
  2.2 Bases
  2.3 Dimension

3 September 6: Choices, Infinite Dimensions, and Linear Maps & Their Matrices
  3.1 Zorn's Lemma
  3.2 Infinite Dimensional Vector Spaces
  3.3 Linear Maps & Their Matrices
  3.4 Preview

4 September 8: Isomorphisms, Direct Sums & Products, Quotient Spaces
  4.1 Isomorphisms
  4.2 Direct Sums & Direct Products
  4.3 Quotient Spaces
  4.4 Preview

5 September 13: The Isomorphism Theorems & Universal Properties
  5.1 Isomorphism Theorems
  5.2 Universal Properties
    5.2.1 Universal Property of the Quotient
    5.2.2 Universal Property of the Direct Product
    5.2.3 Universal Property of Direct Sum

6 September 15: Hom, Duals, Coordinates, Eigenvalues
  6.1 More Subspace Constructions
  6.2 Coordinate Systems
  6.3 Matrix of a Linear Map

7 September 20: Diagonalization, Inner Products, Gram-Schmidt
  7.1 Diagonalizable Linear Maps

1 August 30: Introduction to the Course, Vector Spaces over a Field

1.1 Logistics

There are three problem types in this course's p-sets: A, B, and C. Type A problems are standard and fairly straightforward, testing comprehension of the material. Type B problems are more involved and difficult, and they are designed to be similar to research in that they consist of many parts, one leading into another. Type C problems are particularly difficult and serve as extra credit. Homework will be assigned on Thursdays (posted to the course website, http://www.math.umd.edu/~harryt/teaching/600.html) and due the following Thursday. You are permitted two late submissions.
The heart of the course is in the homework and lectures, but the university requires that we give exams for any course covered by a qualifying exam, such as this one. Therefore, there will be one in-class midterm on Tuesday, November 1, and a final exam on Monday, December 9, 1:30-3:30 p.m., as scheduled by the university.

1.2 The Course

Algebra is essentially the study of sets with some additional structure, especially algebraic operations and symmetries. What distinguishes it from analysis is the concept of a limit. In this way, analysis is the study of inequalities; algebra is the study of equalities. Analysts always have some fudge room, but so do algebraists, in the sense that equality really means isomorphism in most contexts, a weaker but more interesting condition. Both areas enter geometry, a synthetic field utilizing analysis locally and algebra globally.

1.3 Guiding Philosophy

We begin with linear algebra, unlike most books, including our own. Why do
we do this? Well, we basically know all about linear algebra at this point in the
sense that we can completely classify vector spaces up to isomorphism. This is
not the case for groups, the usual starting point, or even rings (another starting point), although finite simple groups have been classified across several thousand pages of research. Also, linear algebra is more fundamental from a historical
perspective (historically, mathematicians worked with representations of groups
rather than the abstract groups themselves). Succinctly, groups can be abstract
and difficult to start with. (Note: Evan Chen and I recently discussed how to
introduce algebra in the context of his Napkin project, and we both agreed linear
algebra followed by rings followed by groups followed by the rest is probably best.)

1.4 Notation

Tamvakis mentioned more than this, but I only wrote down the notation that differs from standard usage or where there are several widely used variants. We write A ⊆ B to include the possibility that A = B; if we wish to emphasize that equality cannot happen, we will write A ⊊ B. For instance, we might write

⋃_{λ>0, λ∈R} (−λ, λ) = R   &   ⋂_{λ>0, λ∈R} (−λ, λ) = {0}

Please do not confuse {0} with ∅. You are better than that. (That last part was me.) Oh, and we also exclude 0 from N. (I hope I can force myself to stick with this, because I always include 0, so forgive me if I mess this up.) Also, we will always assume the axiom of choice unless otherwise mentioned.

1.5 High-Level Overview

Algebra is involved in the study of set inclusions, the passage from one distinguished set to another, in particular the chain of inclusions
N ⊂ Z ⊂ Q ⊂ R ⊂ C ⊂ H ⊂ O
where the last two are the quaternions and octonions. Of course, there is one inclusion there not studied by us. Can you guess which it is? (Hopefully, you said Q ⊂ R.)
Vector spaces form a category. (By this point in your career, you probably
have some vague idea what this is, but do not worry about it.) In this and
similar categories, we will discuss sub-objects, quotients, direct products and
direct sums, morphisms, kernels, images, cokernels, and so forth. Part of the
beauty of thinking of things categorically is seeing the connections like this.

1.6 Vector Spaces

Definition. A real vector space is a set V with two operations, internal addition + : V × V → V and external (scalar) multiplication · : R × V → V, such that
1. u + v = v + u for all u, v ∈ V
2. (u + v) + w = u + (v + w) for all u, v, w ∈ V
3. there exists a unique 0 ∈ V with v + 0 = v for all v ∈ V
4. for each v ∈ V there exists a unique −v ∈ V with v + (−v) = 0
5. (λ + μ)v = λv + μv for all λ, μ ∈ R and all v ∈ V
6. λ(u + v) = λu + λv for all λ ∈ R and u, v ∈ V
7. (λμ)v = λ(μv) for all λ, μ ∈ R and v ∈ V
8. 1v = v for all v ∈ V
Examples.
1. Rⁿ := {(x1, . . . , xn) : xi ∈ R for all i ∈ {1, . . . , n}}
2. Mm,n(R) := {real m × n matrices}
3. Generalizing, take A a non-empty set and consider Fun(A, R) = {f : A → R}; e.g., example 1 can be seen as A = {1, . . . , n} with values x(1), . . . , x(n), where we might just write x(i) as xi to match our familiar notation for sequences and, importantly, matrices
4. C([0, 1]) = {f : [0, 1] → R : f is continuous}
5. Pn, the polynomials (in the formal sense) of degree at most n, i.e., expressions
p(x) = Σ_{i=0}^{n} ai x^i,  ai ∈ R
Also, the vector space of all real polynomials of any degree, which you might know as R[x]; it is a vector space because R is a field.
The eight axioms imply vectors in real space behave like we expect. We
prove two easy but essential theorems.
Theorem. 0v = 0 for all v ∈ V (the 0 on the left is the scalar zero, the one on the right the zero vector).
Proof. Functional-equation-type manipulation:
0 + 0 = 0 ⟹ (0 + 0)v = 0v ⟹ 0v + 0v = 0v = 0 + 0v ⟹ 0v = 0
by cancellation.

Theorem. −v = (−1)v for all v ∈ V.
Proof. Same sort of thing. We have 1 + (−1) = 0 ⟹ 1v + (−1)v = 0v = 0 by the first theorem, and so v + (−1)v = 0 = v + (−v), whence (−1)v = −v by cancellation.

We now generalize the idea of a vector space. If you take any field (basically anything resembling R, but you should really know what this means, as undergraduate algebra is a prerequisite for this course), e.g., Q, R, C, you define a vector space over the field F (an F-vector space) by replacing R with F in the definition of a real vector space. So, vector spaces over general fields are not too hard.
Examples.
1. Fⁿ   2. Mm,n(F)   3. Cⁿ   4. Q[√2], which adjoins the square root of two to the rationals



We now look at more complicated constructions we can do.


Definition. For a vector space V and H ⊆ V, H is a subspace of V if H is closed under the inherited operations, i.e., if u, v ∈ H and λ ∈ F imply u + v ∈ H and λu ∈ H, so that H is itself easily seen to be a vector space.
Examples.
1. The subspaces of R² are the lines through the origin, the zero subspace {0}, and the whole space (the last two are always subspaces and might be called trivial).
2. R³ is basically the same, with the lines and 2-planes through the origin being the non-trivial subspaces.
3. The solution space of a homogeneous linear system, for A ∈ Mm,n(R),
H := {x = (x1, . . . , xn)ᵀ : Ax = 0}
is a subspace of Rⁿ. One can easily translate the matrix equation into m linear equations in n variables, all set equal to 0.
Definition. A linear combination of v1, . . . , vn ∈ V is
Σ_{i=1}^{n} λi vi,  λi ∈ F.

Definition. For any subset S ⊆ V, the span of S is
Span(S) = ⟨S⟩ = { Σ_{i=1}^{n} λi vi : n ∈ N, λi ∈ F, vi ∈ S },
i.e., the set of all (finite) linear combinations of vectors in the set. (We will sometimes say that S generates its span. This and the alternate notation should be clear from your experience with groups.)
Theorem. The span of a set is the smallest subspace containing the set, or precisely
Span(S) = ⋂_{H a subspace containing S} H
Proof. We prove both inclusions to show equality. For any subspace H containing S, closure gives ⟨S⟩ ⊆ H, so ⟨S⟩ is contained in the intersection. Conversely, ⟨S⟩ is itself a subspace containing S, so it is one of the H's being intersected, and hence the intersection is contained in ⟨S⟩.

Next time we will look at linear independence and bases to get to the dimension of a vector space. Let us define just one last thing before we leave, and
this is cute: we can define finite dimensional vector spaces right now, but we
still need machinery to define the dimension of a vector space. Anyway, here
you go.

Definition. Dim(V) is finite (V is finite dimensional) if there exists a finite subset S ⊆ V such that S generates V, i.e., ⟨S⟩ = V.

2 September 1: Linear Independence, Bases, and the Dimension of V

2.1 Linear Independence

Definition. S ⊆ V, for V a vector space over F, is linearly independent if
Σ_{i=1}^{n} λi vi = 0 (λi ∈ F, vi ∈ S distinct) ⟹ λi = 0 for all i.
Otherwise, we call the subset linearly dependent, and there is a dependence relation among the vectors.
Proposition. (1) Any v ≠ 0 in V is linearly independent as a singleton, and (2) two vectors u, v are linearly dependent if and only if v = λu (or u = λv) for some λ, so that they are scalar multiples.
Proof. Omitted (clear).
Example. Take A ∈ Mm,n(F) with columns a1, . . . , an and consider Ax = Σ_{i=1}^{n} xi ai = b. Asking whether Ax = b has a solution x over F is equivalent to asking whether b ∈ Span(a1, . . . , an). Asking whether Ax = 0 has only the trivial solution is equivalent to asking whether {a1, . . . , an} is linearly independent.
Exercise. Find the equation of the plane spanned by (1, 2, 3), (4, 5, 6) ∈ R³.
Consider when
[1 4]         [x]
[2 5] (p, q)ᵀ = [y] ∈ R³
[3 6]         [z]
has a solution (p, q). Augment the matrix and perform Gauss-Jordan elimination:
[1  4 | x]       [1  4 | x       ]
[0 −3 | y − 2x] → [0 −3 | y − 2x  ]
[0 −6 | z − 3x]   [0  0 | x − 2y + z]
Whence, the equation of the plane is
x − 2y + z = 0

2.2 Bases

Definition. A (possibly infinite) subset B ⊆ V is a basis if it is linearly independent and spans V.
Examples. 1. Fⁿ has basis {e1, . . . , en}, where the ith component of ei is 1 and all others are zero (this is known as the standard basis). 2. Mm,n(F) has basis {Ei,j} with 1 ≤ i ≤ m and 1 ≤ j ≤ n, where only the (i, j) entry of Ei,j is 1 and all others are 0 (again, this is standard for the space). 3. Pn(F) has basis B = {1, x, . . . , xⁿ}. 4. P(F) has infinite basis B = {1, x, . . .}.
Proposition. B = {v1, . . . , vn} is a basis iff every v ∈ V can be written uniquely as
v = Σ_{i=1}^{n} λi vi
with the lambdas and v's coming from the obvious places.


Proof. We prove the forward implication first. Choose v ∈ V. Then the span of the basis being V means we may write
v = Σ_{i=1}^{n} λi vi   (∗)
where vi ∈ B. This expression is unique, because if not we would have 0 = Σ λi vi − Σ μi vi for some other constants μi, not all identical to the λi, and that would imply B is not linearly independent and so not a basis.
Going the opposite direction, we suppose that every v can be written uniquely as above. It is clear from (∗) that B spans V. And B is linearly independent, because 0 = Σ 0·vi, and that expression's uniqueness guarantees that only zero coefficients give 0.

Consequently, a basis gives rise to a coordinate system, because the vectors Σ λi vi and the coordinate tuples {λi} are in bijection.
Proposition 1. If v1, . . . , vn ∈ V are linearly dependent, then some vi is a linear combination of the previous ones, and conversely.
Proof. Suppose there is a linear combination equal to 0 with not all coefficients zero, say Σ λi vi = 0. Let k = max{i : λi ≠ 0}. So we have that
Σ_{i=1}^{k} λi vi = 0 ⟹ vk = −(1/λk) Σ_{i=1}^{k−1} λi vi
Conversely, if some vk equals a linear combination of the previous k − 1 vectors, say vk = Σ_{i=1}^{k−1} λi vi, then
Σ_{i=1}^{k−1} λi vi − 1·vk + Σ_{j=k+1}^{n} 0·vj = 0
with at least the coefficient −1 of vk non-zero.



Proposition 2. Every non-zero finite dimensional vector space has a basis (disclaimer: not unique).
Proof. Suppose the span of v1, . . . , vn is V. If the vectors are linearly independent, then we are done. Otherwise, write
vk = Σ_{i=1}^{k−1} λi vi
for some k. Then ⟨v1, . . . , v̂k, . . . , vn⟩ = ⟨v1, . . . , vn⟩ = V, where vk being hatted means it is excluded. If the set with vk excluded is linearly independent, then we are again done. Otherwise, continue this process of throwing out vectors until you reach a linearly independent set. (This is guaranteed to terminate, because a non-zero singleton is always linearly independent; the set is also guaranteed to remain spanning at each step.)

Proposition 3. If V is finite dimensional, with u1, . . . , um ∈ V linearly independent, then we can find um+1, . . . , um+r such that u1, . . . , um, um+1, . . . , um+r is a basis.
Proof. Let {v1, . . . , vn} be a basis of V. Then {u1, . . . , um, v1, . . . , vn} spans V. Now throw out vectors as in Prop. 2's proof so that we get a basis. Note that no ui is ever removed, because no ui is a linear combination of the previous vectors, by the assumption of linear independence.

Proposition 4. If {v1, . . . , vn} is a basis of V and {w1, . . . , wm} are linearly independent, then m ≤ n.
Proof. Assume m ≥ 2 to throw out the trivial case. Then {wm, v1, . . . , vn} is linearly dependent and spans V, so we have a subset {wm, vi1, . . . , vik} that is a basis, because wm never gets thrown out. Do the same with wm−1 and continue until you reach {w2, . . . , wm, vk1, . . . , vks}. At each step, we replace at least one vi with a wj, but at the end we still have at least one vi left, since w1 is not a linear combination of the others. So the number of steps is m − 1 and m − 1 ≤ n − 1, i.e., m ≤ n.

Corollary.
Any two bases of a finite dimensional vector space have the same cardinality.

2.3 Dimension

Definition. The dimension of V is the size of any basis of V, i.e., Dim(V) = |{v1, . . . , vn}| = n where {v1, . . . , vn} is any basis of V.
Examples. 1. Dim(Fⁿ) = n   2. Dim(Mm,n) = mn   3. Dim(Pn) = n + 1
It is important to note that the dimension of a vector space depends on what
field you are working over, so we should really write DimF (V ) when it might
be unclear. The canonical example here is that Dim(C) = 2 over R but 1 over
C.
Corollary. Let V be of dimension n. (1) ⟨S⟩ = V ⟹ |S| ≥ n, with equality only when S is a basis, and (2) L ⊆ V linearly independent implies |L| ≤ n, with equality only when L is a basis.
In other words, a basis is a minimal spanning set and a maximal linearly
independent set.
We now go to one last corollary. This one is not so easy, because it tricks you.
You will think spanning set when you should be thinking linear independence.
Corollary. Suppose V is finite-dimensional and W is a subspace of V (we may write W < V for this in the future, in the spirit of group theory). Then W is finite dimensional with dim W ≤ dim V, with equality only when W = V.


Proof. Suppose w1, . . . , wm is a linearly independent subset of W and W is infinite dimensional. Then there exists wm+1 ∈ W but not in ⟨w1, . . . , wm⟩, because that span is not W; but then {w1, . . . , wm+1} is still linearly independent (by Prop. 1). In this way, we get infinitely many linearly independent wi ∈ W, but these must also be independent in V (obvious), so we have a contradiction, because V is finite dimensional.

The rest is left as an exercise.


3 September 6: Choices, Infinite Dimensions, and Linear Maps & Their Matrices

3.1 Zorn's Lemma

In the first part of this lecture, we will just be extending some results to the case of infinite dimensional vector spaces. Note that a (possibly infinite) subset of a vector space is defined to be linearly independent if no finite subset of it admits a dependence relation.
Definition. Let I ≠ ∅ be an index set and {Ai}_{i∈I} a family of non-empty sets. A choice function is any f : I → ⋃_{i∈I} Ai such that f(i) ∈ Ai for all i. The Cartesian product ∏_{i∈I} Ai is the set of all choice functions from I to ⋃ Ai. Elements are denoted ∏_{i∈I} ai, where ai ∈ Ai; this means f takes i to ai for all i ∈ I.
We now define the second most controversial axiom in mathematics (as canonically formalized in ZFC set theory).
Definition. The axiom of choice states that choice functions exist. Equivalently, any family of non-empty sets has a non-empty Cartesian product.
Note that when I is finite, ∏_{i∈I} Ai = A1 × · · · × An, and its choice functions, which look like (f(1), . . . , f(n)), are just n-tuples of elements with the ith component belonging to the ith set.
This is not nearly as tame an axiom as it may seem. It leads to all sorts
of odd matters, including the infamous Banach-Tarski paradox, but it is also
very important to prove some theorems we want, like that all vector spaces
have a basis. (I discussed the axiom of choice, whether I accept it, and why
in a Quora answer not too long ago, as did some other mathematically inclined individuals such as David Joyce at Clark, c.f. https://www.quora.com/
Do-you-believe-in-the-axiom-of-choice/answer/Joseph-Heavner?srid=
OteC.)
Let us quickly look at the most controversial axiom in the foundations of
mathematics, just for fun/culture: the axiom of infinity. As many know, we
construct the natural numbers by taking the empty set and saying it is 0, the
set containing it is 1, and so forth, i.e.,
0 ; {} 1 ; {, {}} 2 ; {, {}, {, {}}} 3 ;
The axiom of infinity says the collection of all of these things (natural numbers) form a set, N. Yet, one never sees infinite objects, and so this is somewhat
odd, and it leads to more weirdness with real numbers and cardinal infinities (or,
worse, ordinal infinities) and so forth; it is, however, also too important for most
mathematicians to want to let go of, because it weakens math substantially and
makes some things we know and love impossible or weird. (Some alternative
philosophies of math include finitism, intuitionism, and constructivism, and you
can look up N.J. Wildberger for an actual mathematician who believes in this
sort of thing and pushes hard for others to, too.)
We now define some things so that we can define Zorn's lemma, which is equivalent to the axiom of choice, even if we cannot prove that here. (It is pretty hard to do, except maybe for the set theorists among us, but we always knew those guys were, err, different. —Joe)
Definition. A partially ordered set (poset) is a set A with a relation ≤ on it that is reflexive (x ≤ x for all x ∈ A), anti-symmetric (x ≤ y and y ≤ x ⟹ x = y), and transitive (x ≤ y and y ≤ z implies x ≤ z).
Note that this is not a total order, so we do not require that any two elements are comparable (i.e., that x ≤ y or y ≤ x for all x, y ∈ A).
Definition. A total order is a partial order such that if x, y ∈ A, then x ≤ y or y ≤ x. B ⊆ A is a chain if it is totally ordered.
Definition. B ⊆ A has an upper bound M if b ≤ M for all b ∈ B.
Definition. a ∈ A is maximal if a ≤ x implies a = x for all x ∈ A.
Take A = P(R) \ {R} (recall the power set is the set of all subsets) and let ≤ be given by set inclusion. Then the maximal elements of A are punctured lines, i.e., R \ {x} for some x ∈ R, so that only one point is removed. (Not removing R makes the maximal element R itself, clearly, and indeed the power set of any set under set inclusion has the whole set as its maximal element in this way.)
Definition. Zorn's lemma states that if P ≠ ∅ is a poset such that every chain in P has an upper bound, then P has a maximal element.
While the axiom of choice is somewhat intuitive, this is not at all obvious.
It is, however, very important for us to transition from finite dimensional things
to infinite dimensional things (not that it always works), so we should try to
build some intuition for why it might be true.
Proto-Proof. I am loosely borrowing Lurie's terminology here, because I wanted to distinguish this explanation, even if it is not the most revealing one of all time (not at the fault of any person). We basically argue by contradiction. If P did not have a maximal element, then any m1 would have some m2 ≥ m1 (with m2 ≠ m1), and so forth, giving a chain
m1 ≤ m2 ≤ · · ·
But this being a chain means it has an upper bound, so there exists some n1 ≥ mi for all i; since n1 is not maximal either, we get another, longer chain starting from n1, and we repeat ad infinitum. In a sense, this does not seem desirable.

Anyway, we can get to what we have been aiming at. This is a tough proof the first time around if you have not seen Zorn's lemma before, but it is very important, just as is knowing Zorn's lemma. (I actually like Lang's Algebra for a reference on this, because it has a pretty decent appendix on set theory, which covers Zorn's lemma in some detail, and because I think this is a good book for serious algebraic type people to have some familiarity with, anyway.)


3.2 Infinite Dimensional Vector Spaces

Theorem. Any non-zero vector space (including the possibility that it has
infinite dimension) has a basis.
Proof. Let P := {all linearly independent subsets C ⊆ V}, ordered by set inclusion. We check that every chain in P has an upper bound. Let C be a chain in P. We claim that
D := ⋃_{C∈C} C
is an upper bound for C in P.
Clearly, D contains C for all C ∈ C by construction. So we just check that D ∈ P, i.e., that D is linearly independent. Choose v1, . . . , vn ∈ D. Then vi ∈ Ci for some Ci ∈ C, 1 ≤ i ≤ n. Since C is a chain, we can totally order C1, . . . , Cn by inclusion. Let j ∈ [1, n] be such that
Cj = ⋃_{i=1}^{n} Ci
Then v1, . . . , vn (distinct) all lie in Cj and Cj is linearly independent, so v1, . . . , vn are linearly independent. So the chain has an upper bound given by the union.
Whence, by Zorn's lemma, P has a maximal element, B, which we claim is a basis (think maximally linearly independent). Linear independence is given by construction, so we need only show it spans V. If it did not span V, we could choose v0 ∈ V \ Span(B) and see that B ∪ {v0} is linearly independent. (Prove this!) It also strictly contains B, but that contradicts the maximality of B in P, and we are done.

Similarly, one can show that any linearly independent subset S ⊆ V can be extended to a basis of V, even in the infinite dimensional case. Also, we can
handle infinite linearly independent sets and that sort of thing, but that is in
analysis, for instance in Hilbert or Banach spaces; infinite bases for these take
some care and are not well-suited for algebra.
But, we now move on to part two of this lecture, which covers linear maps
(transformations), which are the homomorphisms of vector spaces, i.e., the morphisms or structure preserving functions of vector spaces. As is always the case
with algebra (and math in general), these morphisms are key. Luckily, you
already have some familiarity with them from linear algebra.

3.3 Linear Maps & Their Matrices

Definition. Let V, W be vector spaces over a shared field F. A function T : V → W is linear if it is additive (T(x + y) = T(x) + T(y)) and respects scalar multiplication (T(cx) = cT(x)), or equivalently if T(cx + y) = cT(x) + T(y). (Here c ∈ F and x, y ∈ V.)

It is easy to see that T(0) = T(0·0) = 0·T(0) = 0, where only the second and fourth zeros are scalar zeros and the last is the zero vector in W.
Examples.
1. Any m × n matrix A with entries in F induces a linear transformation TA : Fⁿ → Fᵐ with TA(x) = Ax.
2. Conversely, if T : Fⁿ → Fᵐ is any linear map, then there is a unique A ∈ Mm,n(F) such that T = TA, i.e., T(x) = Ax for all x ∈ Fⁿ. Indeed, note that
T(x) = T(Σ_{i=1}^{n} xi ei) = Σ_{i=1}^{n} xi T(ei) = (T(e1), . . . , T(en)) (x1, . . . , xn)ᵀ
We might set A := (T(e1), . . . , T(en)). This is bad notation, because we almost always use column vectors so as to have left multiplication (f x, not x f), but you will live. Anyway, consider the example T(x, y) = (2x + y, 3x − 4y). This is represented by the matrix (where the first column is T((1, 0)ᵀ) and the second is T((0, 1)ᵀ)):
[2  1]
[3 −4]
3. Let Rθ be counterclockwise rotation by θ radians in R². Then Rθ : R² → R² is represented by the matrix
(Rθ(e1)  Rθ(e2)) = [cos θ  −sin θ]
                   [sin θ   cos θ]
which can be seen by drawing the unit vectors and parametrizing the plane as usual.
4. C∞(R) is the set of all infinitely differentiable real functions f : R → R. Clearly, this is a real vector space. Moreover, D : C∞(R) → C∞(R) given by D : f ↦ f′ is linear, because differentiation is linear.
Here is a good question: why do we multiply matrices the way we do? It is a good question because the elementary theory gives no reason why we should not just do entry-wise array multiplication; however, matrices are not arrays: they are linear transformations in disguise. Therefore, it makes sense that matrix multiplication is defined to correspond with composition of linear maps. In particular, composition of maps F^p → F^n → F^m corresponds to multiplication of the unique matrices which represent them.
Remark. T : V → W is uniquely determined by the values T(b) for b ∈ B, where B is any basis of V.
Proof. Clear.


Corollary. Given any basis B = {vi}_{i∈I} of V and vectors wi ∈ W for i ∈ I, there exists a unique linear map T : V → W so that T(vi) = wi for all i ∈ I.
Proof. Given v ∈ V, write v uniquely as v = Σ_{i∈I} λi vi (a finite sum). Then
T(v) = T(Σ_{i∈I} λi vi) = Σ_{i∈I} λi wi
Check that the resulting map is linear as an exercise.



Definition. For T : V → W we define the kernel of T to be Ker(T) := {v ∈ V : T(v) = 0}. We define the image of T to be Im(T) := {T(v) : v ∈ V}.
Proposition. Im(T) is a subspace of W and Ker(T) is a subspace of V.
Proof. Easy.
Worked Examples.
1. T : R³ → R³ given by T(x, y, z) = (x − y, y − z, z − x). Find the kernel and the image.
Well, the kernel is the set of (x, y, z) for which every component of the output is 0, which means x − y = 0, y − z = 0, and z − x = 0, so clearly x = y = z, and so the kernel consists of vectors of the form (x, x, x), which is the span ⟨(1, 1, 1)⟩. The image, however, is whatever we can get out. Rewriting, T(x, y, z) = x(1, 0, −1) + y(−1, 1, 0) + z(0, −1, 1), so the image is spanned by (1, 0, −1), (−1, 1, 0), (0, −1, 1); but we can just throw out that last vector, because clearly the three are linearly dependent. It turns out that we are then left with a basis {(1, 0, −1), (−1, 1, 0)}, which also describes the image.
2. Compute the image and the kernel of the map from C∞(R) to itself given by differentiation.
It is instructive to do some work yourself. I will just say the kernel consists of the constant functions, as they are exactly those with derivative 0. (What spans this?) The image is all of C∞(R); this is basically the fundamental theorem of calculus (every smooth function has a smooth antiderivative).
Also, oddly, we get a surjective map with a non-zero kernel, something which does not happen in the finite dimensional case.
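The first worked example can be recomputed mechanically; here is a small sympy sketch of mine (not part of the lecture) using the matrix of T in the standard basis.

from sympy import Matrix

# Matrix of T(x, y, z) = (x - y, y - z, z - x) in the standard basis.
A = Matrix([[1, -1, 0],
            [0, 1, -1],
            [-1, 0, 1]])
print(A.nullspace())    # [Matrix([[1], [1], [1]])] : kernel = <(1, 1, 1)>
print(A.columnspace())  # two independent columns: the image is the plane they span
print(A.rank())         # 2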

3.4 Preview

If T = TA : Fⁿ → Fᵐ where A ∈ Mm,n(F), then the kernel is called the null space and the image is called the column space (of A). The dimension of the null space is the nullity; the dimension of the column space is the rank. We will prove the important rank-nullity theorem, which says (for finite dimensional V, at least)
dim Ker T + dim Im T = dim V
(Of course, you can translate this to nullity and rank as you please.)
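A tiny numerical illustration of the statement (mine, not from the lecture): for any matrix, the rank and the nullity add up to the number of columns.

from sympy import Matrix

A = Matrix([[1, 2, 0, -1],
            [0, 1, 1, 3],
            [1, 3, 1, 2]])                        # a 3x4 matrix, so the domain is F^4
print(A.rank(), len(A.nullspace()))               # 2 2
print(A.rank() + len(A.nullspace()) == A.cols)    # True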


4 September 8: Isomorphisms, Direct Sums & Products, Quotient Spaces

4.1 Isomorphisms

Proposition. Let T : V → W be a linear map. Then (1) T is injective iff ker T = {0}, and (2) T is surjective iff Im T = W.
Proof. The second part is basically the definition of surjectivity. As for (1), this is a good fact to know, so we will prove it. If T is one-to-one, then let v ∈ ker T and see that T(v) = 0 = T(0); but T is an injection, so v = 0. For the other direction, if ker T = {0}, then
T(v1) = T(v2) ⟹ T(v1 − v2) = 0 ⟹ v1 − v2 ∈ ker T ⟹ v1 = v2

Definition. A linear map T : V → W is an isomorphism if there exists S : W → V such that S ∘ T = 1V and T ∘ S = 1W.
Theorem. Every vector space of dimension n is isomorphic to Fⁿ. (n = 0 is a stupid case.)
Proof. Choose a basis v1, . . . , vn. Then any v ∈ V is Σ_{i=1}^{n} λi vi. Define T : V → Fⁿ by v ↦ (λ1, . . . , λn). It is left as an exercise to check that this is linear and has an inverse.

Examples.
1. An isomorphism φ : Fⁿ → Fⁿ corresponds to an invertible matrix in Mn(F) (i.e., to GL(n, F)). This is easy to see once you have the correspondence between linear maps between vector spaces and matrices.
2. R⁴ ≅ P3 ≅ M2,2(R), given by mapping bases. Taking the standard ones, we get
(e1, e2, e3, e4) ↦ (1, x, x², x³) ↦ ([1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1])
This gives the classic identification of (a, b, c, d) with a + bx + cx² + dx³ with [a b; c d].

4.2 Direct Sums & Direct Products

There are two notions of a direct product for vector spaces; these are intimately
related, and in practice they may even be considered the same.
Definition. Let V1, V2 be vector spaces over F. The external direct sum is V1 ⊕ V2 := {(v1, v2) : v1 ∈ V1, v2 ∈ V2}, with addition and scalar multiplication defined component-wise.
Theorem. If V1 and V2 are finite dimensional, then dim(V1 ⊕ V2) = dim V1 + dim V2.

Proof. Concatenate bases. If B1 = (v1, . . . , vm) and B2 = (v1′, . . . , vn′), then ((v1, 0), . . . , (vm, 0), (0, v1′), . . . , (0, vn′)) is a basis. (Prove!)

How would you define addition of vector subsets? Well, the obvious way is to say that if A, B ⊆ V, then A + B := {a + b : a ∈ A, b ∈ B}, but this is not necessarily a subspace. (For that, we want A, B to be subspaces.)
Definition. For U1, U2 < V (subspaces), we say V is the internal direct sum of U1, U2 if each v ∈ V has a unique expression as v = u1 + u2 for u1 ∈ U1 and u2 ∈ U2.
Theorem. V ≅ U1 ⊕ U2 (external), with the isomorphism given by v = u1 + u2 ↦ (u1, u2).
Proof. Exercise.

This is why we throw in the uniqueness condition in the definition. Note that, as we mentioned before, there is not a big distinction in practice between these two technically different ideas, because internal direct sums give rise to external direct sums and vice-versa.
Proposition. V = U1 ⊕ U2 iff V = U1 + U2 (as a set) and U1 ∩ U2 = {0}. If dim V < ∞, then this is equivalent to V = U1 + U2 and dim U1 + dim U2 = dim V.
Proof. We begin with the left implication. Suppose V = U1 + U2 but expressions are not unique. Why must u1 = u1′ and u2 = u2′? Well, u1 − u1′ = u2′ − u2 lies in the intersection, so both sides are zero, and whence the result follows. For the right implication, we clearly have V = U1 + U2 by definition, so we check that the subspaces intersect trivially. Indeed, if u ∈ U1 ∩ U2, then 0 + 0 = 0 = u + (−u) with u ∈ U1 and −u ∈ U2, and by uniqueness u = 0.

Exercise.

Prove the dimension part.

Definition. Let U < V and U′ < V be such that V = U ⊕ U′. Then U′ is called a complement of U in V.
Theorem. For any vector space V with U < V, there exists U′ < V such that U ⊕ U′ = V, i.e., complements exist.
Proof. Assume U ≠ {0} and U ≠ V. Let B1 be a basis of U. Then B1 is linearly independent in V and we can find B2 ⊆ V such that B1 ∪ B2 is a basis of V. Set U′ = ⟨B2⟩. It follows that V = U ⊕ U′.

Example. If V = R² and U = ⟨(1, 1)⟩, then what are the complements of U? The answer: any other line, i.e., all lines through the origin not equal to U.
Clearly, complements are not unique. There is no canonical choice of complement, either, at least not without additional structure (e.g., if you give the space an inner product, then orthogonality gives a canonical complement). We now generalize these constructions to more than two vector spaces.
Definition. If {Ui}_{i∈I} is a family of vector spaces over F, then the direct product ∏_{i∈I} Ui is the Cartesian product of the Ui as sets, with point-wise addition and scalar multiplication, i.e.,
∏_{i∈I} ai + ∏_{i∈I} bi = ∏_{i∈I} (ai + bi)   and   λ ∏_{i∈I} ai = ∏_{i∈I} λai.

Definition. The direct sum ⊕_{i∈I} Ui is the linear subspace of ∏_{i∈I} Ui (the ∏ is just another notation for the Cartesian product, really, so we use both) consisting of those ∏ ai with only finitely many non-zero ai.
These really are different when I is infinite: the direct product is much, much larger than the direct sum, for instance. Note that V = ⊕_{i∈I} Ui if and only if each v ∈ V has a unique expression as a finite sum v = Σ_i ai with ai ∈ Ui.
Example. Let B = {bi} be a basis of V. Then V = ⊕_{i∈I} F bi (this is weird notation, but F bi is just the span of bi over the field, so we really have the spans of the bi), so V ≅ ⊕_{i∈I} F (≠ ∏_{i∈I} F = F^I), where all but finitely many coordinates are zero.
Theorem. There is an analogue of the criterion in the finite case: V = ⊕_{i=1}^{n} Ui if and only if V = Σ_{i=1}^{n} Ui and Ui ∩ (U1 + · · · + Ûi + · · · + Un) = 0 for all i. (As usual, the hat denotes exclusion.)
Proof. This theorem is boring enough, kinda ugly to look at. It is not hard to see, though, so we will not prove it.

4.3 Quotient Spaces

It is much nicer to speak of vector spaces than groups or most other categories when it comes to quotients. We can draw nice pictures, and who doesn't love pictures? A picture is worth a thousand words, right? Here is the idea. Take some line U < R². There are many, albeit no canonical, complements. So we make the set of all lines parallel to U into a vector space, R²/U, which essentially plays the role of a canonical complement (even though it is not actually a subspace!). (Draw this. I would, but I am too lazy to make a nice picture, and it is not complicated enough for me to want to expose you to a crappy MS Paint drawing.) We add lines by choosing v1 ∈ ℓ1 and v2 ∈ ℓ2, adding, and taking ℓ1 + ℓ2 = ℓ3 to be the unique line through v1 + v2 parallel to U. (Similarly for scalar multiplication.) The magic is that this is actually independent of the choice of v1, v2! (Alternatively, we could define the addition with the set addition A + B
We now digress a bit to review equivalence relations.
Definition. A relation ∼ on a set S is an equivalence relation if and only if ∼ is (1) reflexive (x ∼ x), (2) symmetric (x ∼ y ⟹ y ∼ x), (3) transitive (x ∼ y and y ∼ z means x ∼ z).
Definition. An equivalence class of x ∈ S under the relation ∼ is denoted Cx or [x] and is defined as [x] := {y ∈ S : y ∼ x}. (Note that this is always non-empty, as x ∈ [x].)
Examples.
1. If θ1, θ2 ∈ R, then θ1 ∼ θ2 if and only if θ1 − θ2 = 2kπ, k ∈ Z. So the classes [θ] correspond to points on the unit circle.
2. Define S := {all days, numbered} (clearly this is a loose one). Then day 1 is equivalent to day 2 if and only if the days are equivalent mod 7, i.e., if day 1 − day 2 is divisible by 7. The equivalence classes are just Monday, . . . , Sunday.

Proposition. The condition Cx ∩ Cy ≠ ∅ is equivalent to Cx = Cy, which is equivalent to x ∼ y.
Proof. Exercise.


Consequently, the {Cx} form a partition of S (i.e., they cover S and are pairwise disjoint), and conversely any partition C of S, written as a disjoint union S = ⨿_{C∈C} C (that notation just means the union is disjoint; it comes from the fact that the coproduct in Set is just this), gives rise to an equivalence relation x ∼ y ⟺ x, y ∈ C for some C ∈ C. So the equivalence classes are obviously the C ∈ C.
Definition. The set of equivalence classes is denoted S/∼ = {Cx} and read "S mod ∼".

We have a nice picture for this, whereby modding out by ∼ collapses points considered equal under the relation into single points.

Definition. Given U < V, define ∼ on V by v1 ∼ v2 if and only if v1 − v2 ∈ U. (Check this is an equivalence relation.) This corresponds to the example before (x, y lie on a line ℓ parallel to U). Equivalence classes of x ∈ V are cosets x + U := {x + u : u ∈ U} (translating the subspace U by x). The quotient V/U is then given by {x + U : x ∈ V}, which is just V/∼ as a set, but it has operations (x + U) + (y + U) := (x + y) + U and λ(x + U) := λx + U.
The notations are a bit weird, because we must choose a representative.
Therefore, we should check that this is well-defined.
Theorem. V/U has well-defined operations.
Proof. For addition, we want x + U = x′ + U and y + U = y′ + U to imply (x + U) + (y + U) = (x′ + U) + (y′ + U). This holds because x − x′, y − y′ ∈ U ⟹ (x + y) − (x′ + y′) = (x − x′) + (y − y′) ∈ U, since U is a subspace. Similarly, scalar multiplication is well-defined.



4.4 Preview

Next time, we will look at the isomorphism theorems, as they appear on the
p-set.


5 September 13: The Isomorphism Theorems & Universal Properties

Proposition. Let U′ be a complementary subspace of U in V. Then φ : U′ → V/U with φ(u′) = u′ + U is an isomorphism.
Proof. The map φ is linear by construction, so we need only show it is a bijection. We first show injectivity: ker φ = 0, because u′ ∈ ker φ ⟹ φ(u′) = 0_{V/U} ⟹ u′ + U = U ⟹ u′ ∈ U ∩ U′ = {0}. It is surjective essentially because we can write every v ∈ V uniquely as u + u′, since V = U ⊕ U′. So φ(u′) = u′ + U = (v − u) + U = v + U.

The converse of this theorem is not true, but we do have a nice corollary.
Corollary. If V is finite dimensional, then dim(V/U) = dim(V) − dim(U).
Proof. If U′ is a complement of U in V, then dim U′ = dim V − dim U.
Example. Let A ∈ Mm,n(F), b ∈ Fᵐ, and S := {x ∈ Fⁿ : Ax = b}. Either S is empty, or S is a coset of H = Null(A). Indeed, if x0 ∈ S, then S = x0 + H.

5.1 Isomorphism Theorems

Theorem. (The First Isomorphism Theorem.) Let f : V → W be a linear map. Define f̄(v + K) = f(v) (where K = ker f). This gives an isomorphism V/ker f ≅ Im f.
Proof. We need to show that f̄ is well-defined: if v + ker f = v′ + ker f, then we must have f(v) = f(v′). This is true since the former is equivalent to saying v − v′ ∈ ker f ⟹ f(v − v′) = 0 ⟹ f(v′) = f(v), using that f is linear, and so is f̄. Clearly f̄(V/ker f) = Im f, so it is onto. It remains to show it is one-to-one. Note that ker f̄ = {v + ker f : f(v) = 0} = {v + ker f : v ∈ ker f} = {ker f} = {0_{V/ker f}}.



Corollary. If dim V < ∞ and f : V → W is linear, then dim ker f + dim Im f = dim V.
Proof. Take dimensions of both sides of V/ker f ≅ Im(f).
This gives a method for problems: to prove an isomorphism between a quotient V/U and a space W, find a surjective linear map f : V → W with ker f = U. In practice, we rarely construct actual isomorphisms out of quotients directly.
Theorem. (The Second Isomorphism Theorem.) If M, N < V, then we have (M + N)/M ≅ N/(M ∩ N).


Proof. This is a direct consequence of the first isomorphism theorem. Joe thinks it is boring, but Joe will write out the proof given in class, anyway. Consider f : N → (M + N)/M where f(n) = n + M (the m is absorbed). This is clearly linear. Check that ker f = M ∩ N as follows:
ker f = {n ∈ N : n + M = 0_{(M+N)/M} = M} = {n ∈ N : n ∈ M} = N ∩ M
The map is a surjection, because if you take a general (m + n) + M, you get f(n) = n + M = m + n + M = (m + n) + M.

Corollary. If M, N are finite dimensional subspaces of a vector space V, then so is M + N, and dim(M + N) + dim(M ∩ N) = dim M + dim N.
Proof. Just take dimensions.


There are other theorems with the name isomorphism theorem, but Tamvakis has chosen to cover just these two, thankfully.

5.2 Universal Properties

We have officially arrived at our first hint of category theory for the year. I have
decided to add a bit more than was in class; in particular, I define a category
in full and give four examples.
Definition. A category C consists of the following data: a class of objects, denoted Obj(C); a class of morphisms (arrows) between objects, denoted Hom(a, b) (or Mor(a, b)) for objects a, b ∈ C (an abuse of notation); and a composition law of sorts, Hom(b, c) × Hom(a, b) → Hom(a, c) for all objects a, b, c, for which two conditions hold: (i) associativity and (ii) the existence of identities for all objects x.
In a sense, Tamvakis says this is all just a change of notation. Around the
end of the nineteenth century we began to think of maps taking precedence
over objects, of a function being a map between sets, but this is where we
only care about the maps (even calling them arrows), and basically ignore the
objects altogether. Category theory has been affectionately (as well as not so
affectionately) described as abstract nonsense. It, like homological algebra
which uses category-theoretic methods heavily, has a lot of diagram chasing.
I will now draw a simple commutative diagram (using the diagrams package,
which can be found on Paul Taylor's site), which represents a category, its objects (A, B, C, D), and the maps between them; we have left out the identity maps A → A, B → B, etc.
A --a--> B
|        |
b        c
v        v
C --d--> D

We say this diagram commutes, because we can follow maps two different ways to get to the same object, i.e., because d ∘ b = c ∘ a. (Hence the name

commutative diagram.) Some examples of categories include Set, which has


objects as sets and arrows as functions, Grp, which has groups as objects and
arrows as homomorphisms, Top, which has topological spaces as objects and
continuous maps as arrows, and Vect k , which has vector spaces over the field k
as objects and linear maps as arrows. All of these are realized in concrete ways.
The arrows are just maps as we know and love, but categories need not be so.
The classic example here involves posets, which I encourage you to look up in, e.g., Evan Chen's An Infinitely Large Napkin.
We now return to the lecture proper. We introduce universal properties by looking at examples of them. The best one we have seen is that the quotient space V/U comes with a canonical map π : v ↦ v + U. We won't really define universal properties at all, at least not in general, but you should get a feel for what they are. The formal definition is somewhat complicated and much more categorically involved than we would like to get.
5.2.1 Universal Property of the Quotient

In this case, we have a universal property because we have a pair (V/U, π) that satisfies the following conditions: (i) U ⊆ ker π (actually, we even have equality here), (ii) for any pair (W, f) of a vector space and a linear map f : V → W such that U ⊆ ker f, there exists a unique linear map f̄ : V/U → W such that f̄ ∘ π = f, i.e., such that the following diagram commutes.

V ---π---> V/U
  \        :
 f \       : f̄
    \      :
     v     v
        W

We often draw a little curly arrow in the middle of the diagram to indicate the direction in which one should read it to see the commutativity, but Joe sucks at drawing commutative diagrams in LaTeX as it is. The dotted line indicates that the map is induced by the others, i.e., given the others there exists a unique map, called f̄, such that the diagram commutes.
Generally speaking, given a diagram and another copy of it with perhaps certain conditions, there exists a unique set of induced arrows such that all the triangles commute.

Proof. We now show that the diagram makes sense. Given f, we are forced to define f̄(v + U) = f̄(π(v)) = f(v) in order for f̄ ∘ π = f. So f̄, if it exists, is unique. We have to check that the map is well-defined. Well, v + U = v′ + U implies v − v′ ∈ U implies v − v′ ∈ ker f, which gives the result. (This is why the condition with the kernel is required.)
Fundamentally, universal properties characterize the objects and morphisms that obey them up to isomorphism via abstract nonsense . . . Oh, what the hell, let us go ahead and quickly run through it. (Tamvakis did not use this language. That would be moi.)
Consider

V ---π---> V/U
  \        :
 f \       : f̄
    \      :
     v     v
        W

with the condition U ⊆ ker π. Suppose the pair (W, f : V → W) were also universal. Then U ⊆ ker f, so universality of (V/U, π) gives a unique f̄ : V/U → W with f̄ ∘ π = f, and universality of (W, f) gives a unique ψ : W → V/U with ψ ∘ f = π, so the triangle commutes both ways. We claim that ψ ∘ f̄ = id_{V/U} and f̄ ∘ ψ = id_W, so that f̄ is an isomorphism. We get the diagram

V ---π---> V/U
  \        :
 π \       : ψ ∘ f̄
    \      :
     v     v
       V/U

Since (ψ ∘ f̄) ∘ π = ψ ∘ f = π = id_{V/U} ∘ π, the uniqueness condition gives ψ ∘ f̄ = id_{V/U}. The other composite is analogous.

5.2.2 Universal Property of the Direct Product


There exist canonical projection morphisms πi : ∏_{j∈I} Uj → Ui. So the universal property is: for any V and morphisms fi : V → Ui, there exists a unique map f : V → ∏_{j∈I} Uj so that πi ∘ f = fi for all i. We are basically forced to have f(v) = ∏_i fi(v).

          fi
   V ----------> Ui
    \            ^
 ∃!f \          / πi
      v        /
      ∏_j Uj

5.2.3 Universal Property of Direct Sum

We will see once more that this really is technically distinct from the direct product; in particular, the arrows are reversed! There are canonical inclusions ιi : Ui ↪ ⊕_{j∈I} Uj. The property is that for any V and linear maps fi : Ui → V there exists a unique map f : ⊕_{j∈I} Uj → V so that f ∘ ιi = fi for all i. Again, something is forced: in particular, f(Σ ui) = Σ fi(ui). (Note that all but finitely many ui are 0.)
many ui are 0.


          fi
   Ui ----------> V
    \            ^
  ιi \          : ∃!f
      v        :
      ⊕_j Uj


6 September 15: Hom, Duals, Coordinates, Eigenvalues

6.1 More Subspace Constructions

Definition. Consider vector spaces V, W over a common field. Define Hom(V, W) := {T : V → W : T linear}. The operations are point-wise, so that (S + T)(v) = S(v) + T(v) and (λS)(v) = λ(S(v)).
If dim V = n and dim W = m, then V ≅ Fⁿ and W ≅ Fᵐ. Using these and the representation T ↦ TA of any T ∈ Hom(Fⁿ, Fᵐ) by a matrix A, we identify Hom(V, W) ≅ Mm,n(F), implying dim Hom(V, W) = mn.
Definition. If W = F in the Hom construction above, we get the dual space V∗ := Hom(V, F) of linear functionals V → F.
Definition. Assuming dim V = n < ∞ and B = (v1, . . . , vn) a basis of V, we construct the dual basis (v1∗, . . . , vn∗) for V∗ by setting vi∗(vj) = δij for i, j = 1, . . . , n (the Kronecker delta, which is 0 except where i = j).
Theorem. The dual basis is a basis.
Proof. The vectors are linearly independent, since if Σ_{i=1}^{n} λi vi∗ = 0, then
λj = (Σ_{i=1}^{n} λi vi∗)(vj) = 0 for all j.
Clearly, the dimension of the dual space is n, and so we have n linearly independent vectors, i.e., a basis.
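Concretely, if V = Rⁿ and a basis is written as the columns of an invertible matrix B, then the dual basis functionals are the rows of B⁻¹, since (B⁻¹B)ij = δij. A small numpy sketch of mine (the particular basis is an arbitrary choice, not from the lecture):

import numpy as np

B = np.array([[1.0, 3.0],        # basis vectors b1, b2 as columns
              [2.0, 4.0]])
B_inv = np.linalg.inv(B)
# The i-th dual functional is x |-> (B^{-1} x)_i, i.e. the i-th row of B^{-1}.
for i in range(2):
    for j in range(2):
        print(i, j, np.isclose(B_inv[i] @ B[:, j], 1.0 if i == j else 0.0))  # delta_ij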

We obtain an isomorphism V ≅ V∗ by sending one basis to another, but this is not canonical, as it depends on the bases. There is a canonical map Φ : V → V∗∗ given by x ↦ x̂, where x̂(f) = f(x) ∈ F. Clearly, this is canonical.
Theorem. The map Φ is an isomorphism.
Proof. The map is clearly linear. Since dim V = dim V∗∗ < ∞ by assumption, it suffices to show that ker Φ = {0}. Suppose x ≠ 0. We find f ∈ V∗ with x̂(f) = f(x) ≠ 0. Relabel x =: v ∈ V. Extend v to a basis v, v2, . . . , vn of V. Then v∗, v2∗, . . . , vn∗ is the dual basis, and f := v∗ works, since v∗(v) = 1 ≠ 0.


6.2 Coordinate Systems

This is mostly notation, to be honest.
Let V be a real vector space (real just for simplicity; this generalizes to any field). Choose an ordered basis B = (b1, . . . , bn) of V. The basis induces a coordinate isomorphism between V and Rⁿ given by v ↦ (v)B, where if v = x1 b1 + · · · + xn bn, then (v)B = (x1, . . . , xn)ᵀ. If B′ = (b1′, . . . , bn′) is another basis, we get new coordinates.


Theorem. The coordinate vectors transform as (v)B′ = (x1 b1 + · · · + xn bn)B′ = (id)B,B′ (v)B, where (id)B,B′ = ((b1)B′, . . . , (bn)B′) is the change of basis matrix.
Note that (id)B′,B = (id)B,B′⁻¹.
Example.
1. Let B = (b1, . . . , bn) be a basis for Rⁿ and let E = (e1, . . . , en) be the standard basis, as usual. Then
(id)E,B = (id)B,E⁻¹ = (b1, . . . , bn)⁻¹
2. Let B = ((1, 2)ᵀ, (3, 4)ᵀ) ⊆ R² and let v = (2, 2)ᵀ. We have
(id)B,E = [1 3]    so    (id)E,B = [1 3]⁻¹ = −(1/2) [ 4 −3]
          [2 4]                    [2 4]            [−2  1]
So (v)B = (id)E,B (v)E = (−1, 1)ᵀ, and v = −b1 + b2.
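A quick numerical check of this example (a numpy sketch of mine, not from the lecture):

import numpy as np

P = np.array([[1.0, 3.0],            # (id)_{B,E}: columns are b1, b2
              [2.0, 4.0]])
v = np.array([2.0, 2.0])
coords = np.linalg.solve(P, v)       # (v)_B = (id)_{E,B} v = P^{-1} v
print(coords)                        # [-1.  1.], i.e. v = -b1 + b2
print(np.allclose(P @ coords, v))    # True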

6.3 Matrix of a Linear Map

Let T : V → W be a linear map with dim V = n and dim W = m, let B = (b1, . . . , bn) be a basis of V, and let C = (c1, . . . , cm) be a basis of W. Then we can assign the matrix (T)B,C to T (dependent on B and C, of course) such that (T(v))C = (T)B,C (v)B for all v ∈ V. We then get the following diagram.

  V ----T----> W
  |            |
  v            v
  Rⁿ ---TA---> Rᵐ

Above, the vertical maps are the coordinate maps and the map TA is the linear map corresponding to the matrix A = (T)B,C.
To find the matrix A, write v = x1 b1 + · · · + xn bn, so that (v)B = (x1, . . . , xn)ᵀ and (T(v))C = (x1 T(b1) + · · · + xn T(bn))C, so that
(T)B,C = ((T(b1))C, . . . , (T(bn))C)
Now the notation (id)B,B′ should make some sense. One brief notational note: suppose T : V → V is a linear map and B a basis of V; then we might denote (T)B,B by (T)B for convenience.
Example.
1. Let T : P2 → P3 be the linear transformation given by f(x) ↦ (2x + 5)f(x), with B, C the standard bases on their respective vector spaces. Find (T)B,C and compute (2x + 5)(1 − 3x + 4x²).
Since T(1) = 5 + 2x, T(x) = 5x + 2x², and T(x²) = 5x² + 2x³,
(T)B,C = [5 0 0]                      [5 0 0]  [ 1]   [  5]
         [2 5 0]   &   T(1 − 3x + 4x²) = [2 5 0]  [−3] = [−13]
         [0 2 5]                      [0 2 5]  [ 4]   [ 14]
         [0 0 2]                      [0 0 2]          [  8]
i.e., (2x + 5)(1 − 3x + 4x²) = 5 − 13x + 14x² + 8x³. (A computational check appears after the next example.)

2. Suppose T : Rⁿ → Rᵐ is represented by A in the standard bases E, E′, so that A = (T)E,E′ = (T(e1), . . . , T(en)). Let B be a basis for Rⁿ and C a basis for Rᵐ. We then have a sequence of maps
(v)B ↦ (id)B,E (v)B = (v)E ↦ A(v)E = (T(v))E′ ↦ (id)E′,C (T(v))E′ = (T(v))C
so that (T)B,C = (id)E′,C (T)E,E′ (id)B,E.
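To double-check the polynomial example above (a sympy sketch of mine, not from class), build (T)B,C column by column from the images of the basis vectors:

from sympy import Matrix, Poly, symbols

x = symbols("x")

def coords(p):
    # Coefficient vector of p in the basis 1, x, x^2, x^3 (constant term first).
    c = Poly(p, x).all_coeffs()[::-1]
    return Matrix(c + [0] * (4 - len(c)))

T_BC = Matrix.hstack(*[coords((2*x + 5) * b) for b in (1, x, x**2)])
print(T_BC)                          # the 4x3 matrix from the example
print(T_BC * Matrix([1, -3, 4]))     # (5, -13, 14, 8): coefficients of (2x+5)(1-3x+4x^2)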


Corollary. If A0 := P AQ for P, Q invertible (non-singular), then A0 is represented by the matrix TA with respect to the bases B = (q1 , . . . ,n ) and
C = (p1 , . . . , pn )1 . When B = C on Rn we get T (x) = Ax that (T )B =
(id)E,B A (id)B,E = P 1 AP where P = (b1 , . . . , bn ), and conversely.
Definition. Matrices A, B ∈ Mn(R) are similar if and only if there exists P ∈ GL(n, R) such that A = P⁻¹BP.
So, similar matrices represent the same linear map with respect to different coordinate systems.
Question. By changing bases B, C, how simple (interpret as you will) can we make (T)B,C in general?
We, the students, all failed to get this correct. Granted, this was about two minutes of throwing out ideas, ranging from as simple as the identity to as complex as the Jordan canonical form and so forth, which is actually in some sense harder than this.
Theorem. Let T : V → W be a linear map. There exist bases B, C such that
(T)B,C = [Ir 0]
         [0  0]
where r = Rnk(T) = dim Im(T).


Corollary. For all A ∈ Mm,n(R) there exist P ∈ GL(m, R) and Q ∈ GL(n, R) such that
PAQ = [Ir 0]
      [0  0]
This is somewhat similar to Gaussian elimination, actually.


Proof. Let u1, . . . , uk be a basis for ker T. Extend this to a basis B = (v1, . . . , vr, u1, . . . , uk) of V, where r + k = n. Set wi = T(vi) for 1 ≤ i ≤ r. The wi span Im T, which has dimension r, so they are linearly independent. Extend to a basis of W: C = (w1, . . . , wr, ℓ1, . . . , ℓs). Then (T)B,C is as desired.

Let f : Mn(F) → F be any function such that f(A) = f(P⁻¹AP) for all P ∈ GL(n, F); then f makes sense for linear transformations T ∈ Hom(V, V).

Definition. A scalar λ ∈ F is called an eigenvalue of a linear transformation T ∈ Hom(V, V) if and only if there exists v ≠ 0 in V with T(v) = λv. That v is called an eigenvector, or specifically a λ-eigenvector of T.
This happens if and only if ker(T − λI) ≠ 0, so that T − λI is singular (not invertible). (Check this by setting the determinant equal to 0.) Of course, we have not rigorously spoken about determinants, but we will eventually, in much more generality than vector spaces.
Theorem. The quantity det(A − λI) equals (−1)ⁿ det(λI − A), where A ∈ Mn(F).
Proof. Omitted.


Definition. The characteristic polynomial is p(x) := det(xI − A) (note that others might define this to be det(A − xI), which differs by at most a sign), where A is a matrix of the linear transformation T : V → V with respect to some basis.
So, the eigenvalues of T are the roots of p(x), and the eigenvalues of a
triangular matrix (look this up if you do not know what it means) are its diagonal
entries.
Exercise. Check that p(x) is independent of basis. So, if A ∼ B (similar), then pA(x) = pB(x):
det(xI − PAP⁻¹) = det(P(xI − A)P⁻¹) = det(xI − A)
Theorem. The characteristic polynomial p(x) of a linear map T with associated matrix A can be written
p(x) = xⁿ − (Tr A)xⁿ⁻¹ + · · · + (−1)ⁿ det A
Proof. Omitted.


The hard part is obviously what the coefficients of the middle terms are. We will see this later. (One guess involved minors, which is right to an extent.) The answer is that, up to sign, ck = Tr(Λᵏ A), where Λᵏ denotes the kth exterior power. Have fun trying to make something of that at this point, though.

3 2 0
Example. The -eigenspace V is the kernel of T I. Take A = 1 4 0.
0 0 2

1 2 0
Compute p(x) to be (x2)2 (xt), so that V2 = ker(A2I) = Null 1 2 0 =
0 0 0


2
0
1
h1 , 0i and similarly V5 = h1i.
0
1
0
Nota Bene. These notes are a bit scrambled; hopefully, I will fix the ordering later.

Corollary. If T : V → V is a linear map and V is of dimension n, then T has at most n eigenvalues.
Corollary. If F = C is the base field of the vector space V and V ≠ 0, then T has at least one eigenvalue and hence one eigenvector.


Example. The matrix R90 = [0 −1; 1 0], with p(x) = x² + 1, has no real eigenvalues, since p(x) has no real roots.


7 September 20: Diagonalization, Inner Products, Gram-Schmidt

Nota Bene. I actually missed class this day, so thank you to Robert Adkins (the
fellow undergraduate in class) for the notes. The end of his notes for the last
lecture also made it in (the last few sentences), because I could not read what
I wrote.

7.1 Diagonalizable Linear Maps

Definition. A linear map T : V → V is diagonalizable if V has a basis B of eigenvectors of T. In this case, (T)B is a diagonal matrix (its only non-zero entries are along the main diagonal). A matrix A ∈ Mn(F) is diagonalizable if TA is, i.e., if A is similar to a diagonal matrix.
Proposition. If v1, . . . , vk ∈ V are eigenvectors of T : V → V corresponding to distinct eigenvalues, then {v1, . . . , vk} is linearly independent; hence, the eigenspaces Vλi are independent.
Proof. If the claim fails, then there is some r > 1 so that vr = Σ_{i=1}^{r−1} ai vi with v1, . . . , vr−1 linearly independent (say a1 ≠ 0). Apply T: λr vr = Σ_{i=1}^{r−1} ai λi vi. Also, λr vr = Σ_{i=1}^{r−1} ai λr vi. Subtraction yields Σ ai(λi − λr)vi = 0, which implies a1(λ1 − λr) = 0, which means either a1 = 0 or λ1 = λr, both of which are contradictions.

Theorem. Let T : V → V be a linear map with dim V < ∞. Let λ1, . . . , λk be the distinct eigenvalues of T and let Wi = ker(T − λi I) = Vλi. Then the following are equivalent:
1. T is diagonalizable.
2. The characteristic polynomial of T factors as a product of linear factors, i.e., p(x) = ∏_i (x − λi)^{di}, and di = dim Wi, so the dimension of each eigenspace is the multiplicity of the corresponding eigenvalue in p(x).
3. V = W1 ⊕ · · · ⊕ Wk.
Proof. (1) ⟹ (2): If
A = diag(λ1, . . . , λ1, . . . , λk, . . . , λk),
with λi appearing in a diagonal block of size di, then p(x) = det(xI − A) = (x − λ1)^{d1} · · · (x − λk)^{dk}. (Sorry for my crappy block matrix.)

(2) ⟹ (3): The proposition above shows that the sum W1 + · · · + Wk is direct. Hence, (2) implies Σ_i dim Wi = Σ_i di = dim V, and so (3) follows.
(3) ⟹ (1): Form a basis B of V by concatenating bases of W1, . . . , Wk. Then that is a basis of eigenvectors, so T is diagonalizable by definition.

Note that in general ei := dim Wi ≤ mult(λi). This is because T(Wi) ⊆ Wi, so we can find a basis B of V such that
(T)B = [λi Iei  ∗]
       [0       C]
so pT(x) = det(xI − (T)B) = (x − λi)^{ei} det(xI − C).


Corollary. If an n × n matrix A has n distinct eigenvalues, then it is diagonalizable.

Example. If A ∈ Mn(F) and B = (b1, . . . , bn) is a basis of eigenvectors for A with eigenvalues λ1, . . . , λn, and P = diag(λ1, . . . , λn), then A = B P B⁻¹, where we also write B for the matrix with columns b1, . . . , bn. This is because (A)E = (id)B,E (A)B (id)E,B. E.g., with the matrix from before,
[3 2 0]   [−2 1 0] [2 0 0] [−2 1 0]⁻¹
[1 4 0] = [ 1 1 0] [0 5 0] [ 1 1 0]
[0 0 2]   [ 0 0 1] [0 0 2] [ 0 0 1]
Diagonalization is very useful in computing the powers of a matrix.
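For instance (a numpy sketch of mine, not from the lecture), with the matrix above, A^k = Q D^k Q⁻¹ only requires raising the diagonal entries to the k-th power:

import numpy as np

A = np.array([[3.0, 2.0, 0.0],
              [1.0, 4.0, 0.0],
              [0.0, 0.0, 2.0]])
Q = np.array([[-2.0, 1.0, 0.0],      # columns: eigenvectors for 2, 5, 2
              [ 1.0, 1.0, 0.0],
              [ 0.0, 0.0, 1.0]])
d = np.array([2.0, 5.0, 2.0])        # matching eigenvalues
k = 5
A_k = Q @ np.diag(d ** k) @ np.linalg.inv(Q)
print(np.allclose(A_k, np.linalg.matrix_power(A, k)))   # True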


Theorem. (a) Let T : V → V be a linear transformation where V is a finite dimensional complex vector space. Then there is a basis B of V such that (T)B is upper triangular. (b) Any A ∈ Mn(C) is similar to an upper triangular matrix.
Proof. We proceed by induction on n = dim V. We know T has an eigenvector v1 with eigenvalue λ1. Extend this to a basis B = (v1, . . . , vn) of V. Then
(T)B = [λ1  ∗]
       [0   R]
for some R ∈ Mn−1(C). In matrix form, this means that given A ∈ Mn(C) there exists invertible P with PAP⁻¹ = (T)B. By the induction hypothesis, there exists Q ∈ GL(n − 1, C) such that QRQ⁻¹ is upper triangular. Set
Q1 := [1  0]
      [0  Q]
Then
(Q1 P)A(Q1 P)⁻¹ = Q1 (PAP⁻¹) Q1⁻¹ = Q1 (T)B Q1⁻¹ = [λ1    ∗  ]
                                                    [0   QRQ⁻¹]
which is upper triangular.


Later we will see that there is a canonical representative in each similarity
class of Mn (C), the Jordan form. Henceforth in this lecture, we will always
restrict the base field to be R or C unless otherwise stated.
Definition. Let V be a real vector space. An inner product on V is a function ⟨·, ·⟩ : V × V → R that is (i) linear in the first variable (⟨ax, y⟩ = a⟨x, y⟩ and ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩), (ii) symmetric (⟨x, y⟩ = ⟨y, x⟩), (iii) positive definite (⟨v, v⟩ ≥ 0 with equality only when v = 0).
Definition. For a complex vector space, a hermitian inner product is a map ⟨·, ·⟩ : V × V → C such that (i), (ii), and (iii) above hold, except that (ii) is altered so that ⟨v, w⟩ = conj(⟨w, v⟩). Hence, a hermitian inner product is conjugate linear in the second variable: ⟨x, y + λy′⟩ = ⟨x, y⟩ + λ̄⟨x, y′⟩.
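A minimal numpy sketch of the standard examples (mine, not from class): the dot product on Rⁿ and the hermitian inner product ⟨x, y⟩ = Σ xi conj(yi) on Cⁿ, with the convention above (linear in the first slot).

import numpy as np

# Standard inner product on R^n.
u, v = np.array([1.0, 2.0, 3.0]), np.array([4.0, -1.0, 0.0])
print(u @ v)                                      # 2.0

# Hermitian inner product on C^n: <x, y> = sum_i x_i * conj(y_i).
x = np.array([1 + 1j, 2j])
y = np.array([3.0 + 0j, 1 - 1j])
herm = np.sum(x * np.conj(y))
print(herm)                                       # (1+5j)
print(np.conj(np.sum(y * np.conj(x))) == herm)    # True: <v, w> = conj(<w, v>)
print(np.sum(x * np.conj(x)).real > 0)            # True: positive definiteness for this x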
Example.

33