Вы находитесь на странице: 1из 138

O X F O R D M A T H E M A T IC A L M O N O G R A P H S

M E R O M O E P H IC FU N C TIO N S
By w.

k. haxm an.

1963

T H E T H E O R Y OF L A M I N A R B O U N D A R Y L A Y E R S
IN C O M P R E SSIB L E F L U ID S
By K. SIEWASTSON. 1964
C LA SSIC A L H A R M O N IC A N A L Y S I S A N D
L O C A L L Y COMPACT GROUPS
By H. EBITER. 1968
Q U A N T U M - S T A T I S T I C A L F O U N D A T I O N S OF
C H E M IC A L K IN E T IC S
By s.

golden.

1969

CO M P LE M E N TA R Y V A R IA T IO N A L PR IN C IP LE S
By

a . m. a e t h u b s .

1970

V A R IA T IO N A L P R IN C IP L E S IN H E A T T R A N SF E R
By MAURICE A. BIOT. 1970
P A R T IA L W A V E A M P L IT U D E S AND R ESO N AN CE POLES
By J. H a m i l t o n and

b. tromborg.

1972

THE DISTRIBUTION OF
PRIME NUMBERS
Large sieves and zero-density theorems
BY

M. N . H U X L E Y

OXFORD
AT TH E CLAREN D O N PRESS
1972

Oxford University Press, Ely House, London W. 1


G LA SGOW
C APE TO W N
D ELHI

NEW YORK

IB A D A N

BOM BAY

TORONTO

N A IR O B I
CALCU TTA

K U A L A LU M PU R

M E LB O U R N E

D A K ES SALAAM
MADRAS

SIN G A PO R E

W E LLIN G TO N

LUSAKA

KARACHI
HONG K O N G

A D D IS A B A llA

LA H O R E
TO K Y O

Oxford University Press 1072

Printed in Great Britain


at the University Press, Oxford
by Vivian Midler
Printer to the University

DACCA

TO
T H E M E M O R Y OF
P R O F E S S O R H. D A V E N P O R T

PREFACE
T h i s book has grown out o f lectures given at Oxford in 1970 and at

University College, Cardiff, intended in each ease for graduate students


as an introduction to analytic number theory. The lectures were based
on D avenport s Multiplicative Number Theory, but incorporated simpli
fications in several proofs, recent work, and other extra material.
Analytic number theory, whilst containing a diversity o f results, has
one unifying method: that o f uniform distribution, mediated b y certain
sums, which m ay be exponential sums, character sums, or Dirichlet
polynomials, according to the type o f uniform distribution required.
The study o f prime numbers leads to all three. Hopes o f elegant asym
ptotic formulae are dashed b y the existence o f complex zeros o f the
Riemann zeta function and o f the Dirichlet L-functions. The primenumber theorem depends on the qualitative result that all zeros have
real parts less than one. A zero-density theorem is a quantitative result
asserting that not many zeros have real parts close to one. In recent
years many problems concerning prime numbers have been reduced to
that o f obtaining a sufficiently strong zero-density theorem.
The first part o f this book is introductory in nature; it presents the
notions o f uniform distribution and o f large sieve inequalities. In the
second part the theory o f the zeta function and L-functions is developed
and the prime-number theorem proved. The third part deals with large
sieve results and mean-value theorems for L-functions, and these are
used in the fourth part to prove the main results. These are the theorem
o f Bombieri and A. I. Vinogradov on primes in arithmetic progressions, a
result on gaps between prime numbers, and I. M. Vinogradovs theorem
that every large odd number is a sum o f three primes. The treatment is
self-contained as far as possible; a few results are quoted from Hardy
and W right (1960) and from Titchmarsh (1951).
Parts o f prime-number theory not touched here, such as the problem
o f the least prime in an arithmetical progression, are treated in Prachars
Primzahlverteilung (Springer 1957). Further work on zero-density
theorems is to be found in Montgomery (1971), who also gives a wide list
o f references covering the field.
M. N. H.
Cardiff
1971

CONTENTS

PART

I.

INTRODUCTORY

RESULTS

1. Arithmetical functions

2. Some sum functions

3. Characters

10

4. P olyas theorem

14

5. Dirichlet series

18

6. Schinzels hypothesis

23

7. The large sieve

28

8. The upper-bound sieve

32

9. Franels theorem

36

P A R T II.

THE

PRIM E-NUM BER

THEOREM

10. A modular relation

40

11. The functional equations

45

] 2. HadamarcPs product formula

50

13. Zeros o f (s)

55

14. Zeros o f {s, x)

58

15. The exceptional zero

61

16. The prime-number theorem

66

17. The prime-number theorem for an arithmetic progression

70

PART

III.

THE

NECESSARY

TOOLS

18. A survey o f sieves

73

19. The hybrid sieve

79

20. An approximate functional equation (I)

84

21. An approximate functional equation (II)

89

22. Fourth powers o f ^-functions

93

CONTENTS

PAET IV.

ZEROS AND

PRIM E N U M B E R S

23. Ingham s theorem

98

24. Bom bieris theorem

103

25. I. M. Vinogradovs estimate

107

26. I. M. Vinogradovs three-primes theorem

110

27. Halaszs method

114

28. Gaps between prime numbers

118

N O T A TIO N

123

B IB LIO G R A P H Y

124

IN D E X

127

PART

Introductory Results

A R ITH M E TIC A L

FU N CTION S

An Expotition . . . means a long line of everybody

I. 110
T h i s chapter serves as a brief resume o f the elementary theory o f prime

numbers. A positive integer m can be written uniquely as a product


o f primes

m _

( 1,1)

where t h e ^ are primes in increasing order o f size, and the ai are positive
integers. W e shall reserve the letter p for prime numbers, and write a
sum over prime numbers as 2 ancl a product as JT- The p roof o f
p
i>
unique factorization rests on E uclids algorithm that the highest com
mon factor (m, n) o f two integers (not both zero) can be written as
(m,n) = m u + n v,

( 1.2)

where u, v are integers. W e use (m, n) for the highest common factor
and \m, n\ for the lowest common multiple o f two integers where these
are defined.
Let # be a positive integer. Then the statement that m is congruent
to n (m od#), written m = n (m od#), means that m n is a multiple o f q.
Congruence m od q is an equivalence relation, dividing the integers into
q classes, called residue classes m od#. A convenient set o f representa
tives o f the residue classes mod q is 0, 1, 2,..., q I. The residue classes
m od# form a cyclic group under addition, and the exponential maps
m -> eQ(am),
where a is a fixed integer, and

(1.3)

IN T R O D U C T O R Y RESULTS

e(a) = exp(27ria),

eg(a) = exp(27ria/g),

1.1

(1.4)

are homomorpliisms from this group to the group o f complex numbers


o f unit modulus under multiplication. There are q distinct maps, corre
sponding to a = 0, 1, 2,..., q1. They too can be given a group structure,
forming a cyclic group o f order q. They have the important property

( i -5>
where the summation is over a complete set o f representatives o f the
residue classes m od# (referred to briefly as a complete set o f residues
m od#). I f on the left-hand side o f eqn (1.5) we replace to b y t o + 1 , the
sum is still over a complete set o f residues, but it has been multiplied
b y ea(a), which is not unity unless a = 0 (m od#). The sum is therefore
zero unless a = 0 (m odg), when every term is unity. Interchange o f a
and to leads to a corresponding identity for the sum o f the images o f m
under a complete set o f maps (a 0, 1,..., q 1). These identities arise
because the images lie in a multiplicative not an additive group.
From E uclid s algorithm comes the Chinese remainder theorem: if
to, n are positive integers and (m, n) = 1, then any pair o f residue classes
a (mod to) and b (mod?i) (which are themselves unions o f residue classes
modwm) intersect in exactly one class c (modtow), given by
c = bmu-\-anv (mod tow)

( 1,6)

in the notation o f eqn (1.2). N ow let /(to ) be the number o f solutions


(ordered sets (x1,...,x r) o f residue classes) o f a set o f congruences
gi{xi,...,xr) = 0 (m odto),

(1.7)

where the gi are polynomials in


xr with integer coefficients. When
(m ,n) = 1, gi(x1>...,x r) is a multiple o f mn if and only if it is a multiple
both o f to and o f n. Hence
f(m n ) = f(m )f(r i)

whenever (m ,n ) = 1.

(1.8)

Equation (1.8) is the defining property o f a multiplicative arithmetical


function. An arithmetical function is an enumerated subset o f the
complex numbers, that is, a s e q u e n c e /( l ) ,/( 2),... o f complex numbers.
The property

f(m n )= f(m )f(n )

(1.9)

for all positive integers m and n seems more natural; if eqn (1.9) holds
as well as (1.8) th e n /(to ) is said to be totally multiplicative, but (1.8) is
the property fundamental in the theory.

1.1

A R IT H M E T IC A L FU N C T IO N S

The Chinese remainder theorem enables us to construct more compli


cated multiplicative functions. W e call a residue class a (mod#) reduced
if the highest common factor (a, q) is unity. A sum over reduced residue
classes is distinguished by an asterisk.
W ith this notation we introduce Eulers function <p(m) by
<?M =

2 *

a mo dm

I-

(i-io )

To show that <p(to) is multiplicative, we must verify that in eqn (1.6)


(c, mn) = 1 i f and only if both (a, to) and (b, n) are unity. Equation
(1.6) implies also that Ram anujans sum
cg{m) =

2 *
a mods

eq(am)

( 1 . 11)

is multiplicative in q for each to. W e see this if we write

where

a = c^q^+c^q^,

( 1.12)

Si'^2+ 0'2% ~ 1;

(1.13)

note that % (m od#]) and uz (m od#2) are reduced residue classes, that
< W m) =

2 *

2 *

aiinod<?i agTnodfo

e9s( 2 2)eai(a i% ),

(1.14)

and that ax % runs through a complete set o f reduced residues mod q1


when ax does so.
Two examples follow o f totally multiplicative arithmetical functions.
The first is

/(to ) = ms,

(1-15)

s CT+it,

(1.16)

where s is a complex variable

cr and t being real. This notation is traditional among number theorists.


To introduce our second example we note that the reduced residues
m od# (algebraically the invertible elements in the ring o f integers
m od#) form under multiplication an Abelian group o f order <p(g). By
considering the images o f the generators o f this group, we can see that
from this group to the group o f complex numbers o f unit modulus under
multiplication there are <p(q) maps x with the homomorphism property
x (mn) = x(m)x(n)-

(L17)

These include the trivial map for which x(m ) 1 f r each reduced class m.
W e turn these maps into arithmetical functions by defining
x(m) = 0

if (to, q) > 1.

(1-18)

W ith this definition, eqn (1.17) still holds. W e have now assigned a
complex number to each residue class mod q. Hence we have constructed

1.1

IN T R O D U C T O R Y RESULTS

a totally multiplicative periodic function, which is called a Dirichlets


character m od#, or more briefly a character. Characters can be defined
as those totally multiplicative functions that are periodic. Since negative
integers also belong to well-defined residue classes m od q, we can speak
o f x(m) when
is a negative integer; in particular, we shall refer to
X( !)
It is possible to build new multiplicative functions from old. W e say
that d divides m, written d |to, when the integer to is a multiple o f the
positive integer d ; another paraphrase is d is a divisor o f to . (Note that
the divisors o f 6 are 1, 2, 3, 6.) Nowlet/(TO) andgf(TO) be multiplicative.
Then so are the arithmetical functions

and

h(m) = f(m )g(m ),

(1.19)

H>) = 2 M ,
d\m

( 1-20)

h(m) = '2,f(d)g(mjd).

(1-21)

d\m

W e shall consider eqn (1.21), since (1.20) is a special case, and (1.19) is
evident. W hen (m,n) 1, the divisor d o f mn can be written uniquely
as d = ctb, where a |to and b \n, and (a, b) = 1. Hence
h(mn) 2 f(^ )s imnld)
d\mn

= H 2 f ( ab)g(mnlab)
a\m b\n

= 2

( i - 22)

a\m b\n

which is h(m)h{n) as required. Thus


d(m) = 2 1 .

(1.23)

d\n

the number o f divisors o f to, and


(m) = 2,d ,

(1.24)

d\m

the sum o f the divisors o f m, are multiplicative functions.


W e can invert eqn (1.20) and return from h(m) to f(d ) b y using
M obiuss multiplicative function fi(m), defined by
/x(l) = 1
n(p) = 1
p,(pa) = 0

for primes p
for prime powers p a with a > ]

}.

(1.25)

1.1

A R IT H M E T IC A L F U N C TIO N S

I f the positive integer m factorizes according to (1.1), then


2 p(d) = X I
d\m

')}

= n a - i ) = o,

(1.26)

unless m = 1, when the product in eqns (1.1) and (1.26) is empty. We


have now proved the following lemma.
Lem m a.

I f m is a positive integer, then


2 ^ ) = (1
cm
'
|0

% m =1>
t f m > 1.

(1.27)
K
'

From the lemma we have the corollary:


C o ro lla ry .

I f h(m) and f(m ) are related by eqn (1.20), then


f ( n ) = 2 l*(m)h(nlm),

(1.28)

? n \n

and i f eqn (1.28) holds then so does eqn (1.20).


To prove the corollary we substitute as follows.
2 ji(m)h(njm) = 2 i (w )

m\n

ni\n

= 2 m
d\n

d\{nfm)

m\(nld)

f (d)

K m)

(i-29)

when we interchange orders o f summation. The inner sum is zero by


eqn (1.27), unless d = n, when only one term/(cZ) remains. The converse
is proved similarly.
W e can also define an additive function to be an arithmetical function
/(to ) with

f(m n )= f(m )-\ -f(n )

when (in, n) 1.

(1.30)

The simplest examples are log to and the number o f prime factors o f m.
There are useful arithmetical functions that are neither multiplicative
nor additive. W e shall make much use o f A(m), given by
/[im \
|0

i f m is a prime power p a, a ^ 1,
if m is not a prime power.

(131)

2 A(d) = log to.

(1.32)

It satisfies the equation


d\m

W e could have used eqn (1.32) to define A(m) and recovered the defini
tion (1.31) b y M obius s inversion formula (1.28).

2
SOME

SUM

FU N CTIO N S

T h e study o f the sum functions o f arithmetical functions is important


in analytic number theory. For instance, we shall treat many o f the
properties o f prime numbers b y using the sum function
>ft(x) =

2 A(m).

(2.1)

Our object is to express the sum function as a smooth main term (a power
o f x or o f log x, for example) plus an error term. In place o f the cumber SOme

l/(OI = 0{cj(x)),

we shall often write

f(x )

g(x),

(2.2)
(2.3)

and other asymptotic inequalities similarly. Some sum functions can be


estimated by writing the arithmetical function as a sum over divisors
and rearranging. In this chapter we shall give examples o f this method.
From the theory o f the logarithmic function we borrow the relation

where y is a constant lying between \ and 1. W e deduce the useful


formula
,

]T 111 =

= lo g (M + l)+ y + 0 (M ~ i).

(2.5)

Our first example is an asymptotic formula for


() = 2 9 ()

(2.6)

as x tends to infinity. Since 9 (m) is the number o f integers r with


1 ^ r ^ m and (r,m) = 1, eqn (1.27) gives

( )

1.2

SOME SUM F U N C T IO N S

Hence

and 80

9 (to) = 2 2
r=1 d|m
fZ|r
m

(2.8)

2 2 2 mW
m<# r=X d|m
d\r

/ (^)

m
2

m<a

?=!

??i = 0(m.ocld) r^O(raodfZ)

2 ^(d)

d^x

2
m<$

ml d

m = 0( m o d d )

= 2 /*(<*)n^xjd
2 n>

(2-9)

wliere we have written ?i for m/d, ancl the sum over n is


%{x/d)2+ 0 (x/d).

( 2 . 10 )

W e now write eqn (2.9) as

in2
d

d<aj
= i 2 ^

^ | r + 0() + O (a;loga:)

1
= cx 2+ 0 (xlogx),

(2.11)

where the constant c can be evaluated; in the notation o f Chapter 5 it


is {2^(2)}-x, which can be shown to be 3 t t ~ 2. The size o f the error term
is highly satisfactory: when x is a prime, &(x) has a discontinuity x 1 ,
and the actual error in (2.11) is > x infinitely often. Even so, the upper
bound O(.TlogiK) for the error term in (2.11) has been improved a little,
W hen we first apply the same method to
D(x) =

2 d(m),

(2.12)

m ^x

we find that

D (x ) =

2 2 1
m*^x r\m

r^ x '

= a;(log -j-y+ 0 (x~1))-\-0 (x)


xlogx-\-0(x),

(2.13)

by (2.5). The error term here is much larger in proportion to the main
853518S

IN T R O D U C T O R Y RESULTS

1.2

terra than in (2.12); in fact a little cunning enables us to improve


(2.13). W e let y be the positive integer for which y 2 ^ x ^ ( ? /+ l ) 2,
and write m = qr in eqn (2.12), so that
D (x) =

2 1
QV^X

= 2
2 i + 2
2 i- 2
2 i
q^y r^xla
r^yq^x/r
q^y >'<?/
= 2 2

{x !q + 0 (l))~ ~ y*

a^u

= xlogx-\ -(2y 1) + 0 (x i),

(2.14)

when we substitute (2.5) and the value o f y.


Dirichlets divisor problem is to improve the exponent o f a; in the error
term o f (2.14). I. M. Vinogradov (1955) has shown that the error term
in (2.14) is
,# lo g 2a b y elementary arguments. Van der Corptit and
x s with values o f S a little less than
others have obtained estimates
their method involves writing the error in (2.14) as a contour integral.
On the other side, it has been shown that the error term is
x s for
each < J. The limitation on the accuracy o f (2.14) is not so easily
explained as that o f (2.11): it is not difficult to show that
d(m)

(2.15)

for each t > 0, and so the error in (2.14) must often be very much greater
than any individual step in the value o f D(x).
To illustrate more difficult examples, we consider
2 d2(m)/m.

(2.16)

m ^x

Let dr(m) be the number o f ways o f writing the positive integer m as


a product o f r positive integers (so that d(m) = dz(m)). B y induction
on (2.14),
D r{x) = 2 dr(m)
m ^x

2 dr - M
uv^x

= ~

+ 0(x\ogr-*x)>

(2.17)

and b y partial summation


V
-<

=
m

D i(x ) i p Aog3^'
m ( m + 1)
I x

= <&log4a ;+ 0 (log3a:).

(2.18)

1.2

S O M E SU M F U N C T I O N S

W e have chosen to compare cl2(m) with cZ4(to) since, when m is a prime


power p a,

di (m) = J(a + l ) ( a + 2) ( a + 3),

(2.19)

which is equal to d2(m) if a = 0 or 1. The next step is to find a function


b(m) for which

d2(m) =

ch{u)b{v)'

(2.20)

uv=m

Since

\(a-\-l)(aJr 2)(aJr '&)

l ) a ( a + l ) = (a + 1 )2,

(2.21)

the choice 6(1) = 1, b(p2) = 1, b(pa) = 0, for prime powers p a with


a not zero or two, satisfies eqn (2.20) when m is a prime power. I f we
complete the definition o f b(m) by making it multiplicative, then
(2.20) holds for all m. The choice is thus

6(m) = H n)

(0

(2.22)

if m is not a perfect square.

W e now complete the proof. Equations (2.20) and (2.22) give


d2(m) __ ^ d{u)b{v)
m
-4
uv
u v^ x

__.

t2

Z-4
n
uK;Xjt2
co

W hen we substitute (2.18) and the value 67r~2 o f 2

have

70, ,
2

. ,
=

i11^0 this, we

^ 2+ o (l))lo g % .

\4t72

(2.24)

W e require (2.24) and upper estimates for similar sums in the later work.
Any
estimate for a sum involving divisor functions that we quote
will be a corollary o f (2.14) or o f (2.24), possibly using partial summation.
The method we employ in this chapter can be summarized as follows.
To work out a general sum function
2

F(x) =

we try to write

f(m ) =

/(

m),

(2.25)

a,(u)b(v),

(2.26)

uv=m

where we have an asymptotic formula for the sum function o f the a(m),
and b(m) is in some sense smaller than a(m ). A necessary condition is that
b(p) = o{\a(p) |).

(2.27)

Our determination o f &(x) depended on the equation


<p(m) =

uii(v),

(2.28)

uv=m

which is o f the form (2.26); and eqn (2.20) is clearly o f this form.

3
CHARACTERS
T h e reduced residue classes m od # form a group under multiplication,
and the characters m od# correspond to the maps to->- x(m ) from this
group to the group o f complex numbers o f unit modulus under multi
plication. W e define a group operation on the set o f the cp(q) characters
m od#: the product X1 X2 f two characters xi and Xz m od# is the map
m -* Xi(m)X2.{m)- The unit o f this group is the trivial character mod#.
W hen the group o f reduced residues and the group o f characters are
each expressed as a direct sum o f cyclic groups, we can see that they
are isomorphic and that the homomorphisms o f the group o f characters
to the complex numbers are given b y % -> x(m)> where m runs through
the reduced residues m od#. Two finite Abelian groups related in this
way are said to be dual.
W e have already seen that
2

mmodg

()= (

\0

11 Cl

0 ( HIOQ^),

(3.D

for the maps to h* eg(am) from the group o f residue classes to (mod #)
under addition to the complex numbers o f unit modulus. W e have
similar results for the characters xim) m od#:
2
x M = l 9(q)
mmodg
(0

^
istriv ia l>
i f ^ is non-trivial.

(3.2)

Here x(m) is non-zero when the residue class to (mod#) is reduced, and
thus when m is a member o f the group o f reduced residue classes under
multiplication. N ow eqn (3.1) has a dual interpretation, in terms o f
residue classes a (m od#) and maps a -> ea(am). This corresponds to
T vim) = (9{q)
X'mod2
(o

if m = 1 (m o d 2)>
i f to ^ 1 (mod #).

/o o\

The proof o f eqn (3.2) runs as follows. I f x is trivial, then x(to) = 1


when (to, #) = 1 and 0 otherwise, so that the sum on the left-hand side
o f (3.2) is the definition (1.10) o f <p(#). I f % is non-trivial, there is an
integer r for which (r, q) = 1 but x(r) is not unity. W e multiply the sum
on the left o f eqn (3.2) by x(t') I this gives 2 x(mr)> where to runs through

CHARACTERS

1.3

11

a complete set o f residues m od#. Since (r, #) = 1, mr also runs through


a complete set o f residues m od#, and tlie sum is unchanged. This
contradicts the choice o f r with x{f ) ^ lifth e s u m is n o tz e r o . Similarly,
in eqn (3.3) if m is not congruent to unity m od# there is a character xi
with Xi(m)
1- W hen we multiply the left-hand side o f (3.3) bj^
2 x(m)xi(m) is a sum over all characters m od#, and again this sum
must be zero, for we have multiplied b y a constant that is not xuiity
but have succeeded only in permuting the terms o f the sum. O f course,
eqns (3.1), (3.2), and (3.3) are essentially special cases o f a theorem on
dual finite Abelian groups.
An important notion is the propriety o f characters. Let #2be a multiple
o f qv and xi a character m od#x. The group o f reduced residue classes
mod #2 maps homomorphically onto the corresponding group m od#x,
and we define a character x% mod #2 b y the equation
(3.4)
W e note that xi and x% are different arithmetical functions. I f #x = 3,
#a = 6, and xi takes the values
1, - 1, 0, 1, - 1, 0

(3.5)

for m = 1, 2, 3, 4, 5, 6, then %2 takes the values


1, 0 , 0 , 0 , 1 , 0 ,

(3.6)

since Xz is zero when m is a multiple o f 2 as well as when m = 0 (mod 3).


W hen Xi is constructed b y eqn (3.4), we say that ^1 m o d #1 induces
Xz mod #2, and, if #2 ^ #1( that y2 m o d #2 is improper. A. proper character
mod # (also called a primitive character) is one that is not induced by
a character mod d for any divisor d o i q other than # itself. The smallest
/ for which a character Xi m o d / induces x mod # is called the conductor
o f x The customary letter / is the initial o f a German word for a tram
conductor.
W e shall now discuss Gauss's sum r(x), defined by
t( x)

mmo clg

x(m)eg(m).

(3.7)

In this curious, expression, the factors correspond to the multiplicative


group o f reduced residues mod # and the additive group o f all residues
m od#. The absolute value o f r(x) is found below. Gauss (preceding
Dirichlet) considered onty characters x for which x(m) takes the values
rb 1 and zero only. In this case r 2 is real, but it is still not easy to find
the argument o f r(x).

IN T R O D U C T O R Y RESULTS

12

1.3

W e shall use t ( x ) to remove characters from a summation. W ith x ( m )


denoting the complex conjugate o f x(m) (the inverse o f x hi the group
o f characters), eqn (3.7) gives
r(x)x(m) =

x(aK {am )

(3.8)

flm odg

whenever (to,#) = 1. I f r(x) is non-zero, we can use (3.8) to change a


summation over x(m) to a summation over the exponential maps ea(am),
which are easier to manipulate. A defect in eqn (3.8) is the condition
(to, q) = 1. If, however, x is proper m od q eqn (3.8) holds for all integers
m. W e must show that the sum on the right-hand side o f (3.8) is zero
when

m = tn,

In this case,

a m od a

q = tr,

xia)eq(am) =

am o d g

t > 1.

(3.9)

xia)er{an)-

(3.10)

Since % is not induced by any character m odr, there is an integer b


with (b,q) = 1, b = 1 (m odr), but x(b) ^ 1- Our standard p roof now
applies. Multiplication by x(b) permutes the residue classes in the sum
on the left o f eqn (3.10), but multiplies the value o f the sum by a constant
that is not unity. The sums in (3.10) and (3.8) are thus zero if x is proper
m od q and (to, q) > 1.
For eqn (3.8) to be o f use we must be sure that t ( % ) is non-zero. When
X is proper mod q there is an elegant demonstration. For each m m od q,
|x(to)t(x)|2 =

x ()x (6)ea(m M -

(3.11)

x(a)x(b) 2

(3.12)

a m o d s 6 m od s

EE0IIC6
2

w m od ff

\x(m)?\T( x ) ? =

a m o d g &mod<z

m m odg

ea(ambm).

The inner sum is zero b y eqn (3.1), unless a = b, when it is q, so that


H x ) l2

mmodtz

\x{)\2 = q

am od g

lx ( ) l2-

( 3 . i3 )

Since x is proper m od# if and only if its inverse x is proper m od#, we


deduce that
, ,
f
\r \X)\
q ir X proper m o d q.
(3.14)
Equation (3.8) thus applies when x is proper m od#, the factor t (x ) being
non-zero.
W e conclude with the case when x m od# has con du ctor/, and # = fg
where g > 1. I f the lowest common multiple h = [/, g] o f f and g is not
# itself, x is not merely induced b y some character Xa, mo(i
but is
actually equal to X2> since any integer prime to h is already prime to
fg q. Replacement o f m in the sum in eqn (3.7) b y m-\-h permutes
the residue classes m od#, but multiplies r(x) b y eq(h), which is not unity.
In this case therefore r(x) is zero.

CHARACTERS

1.3

13

W hen (f,g) = 1 we invoke the Chinese remainder theorem.


eqn (1.2) there are integers u, v with
f u + g v = 1.

By

(3.15)

Residue classes a (m odfg) correspond to pairs o f classes b (m o d /),


c (m od#) according to the relation
a = cfu-^bgv (modfg).

(3.16)

In (3.16), a (m odfg) is a reduced class if and only if both b (m od /) and


c (m od#) are reduced, and thus
H x) =

2 *

amodtf

x ( a ) ea (a )

= bm2*
2* X{cfu+bgv)ea(cu)ef{bv)
od f cnioclf/
=

2 *

b mod/

x ( b )ef( b v )

2 *

cmodtf

e f, M

= Xi(v)r(xiK(u)}

(3.17)

where xi m od/ is the character inducing % mod #, and cy(u) is Ram a


nujan s sum, defined in eqn (1.11).
W e now proceed to compute Ram anujans sum. B y eqns (1.11) and
^ ' 27^

c(?0 =
=

amod{/

eu(au) X ^ ( d)
d\a

mu

2 P'W bmo&gld
2 %ld{bu),

d\g

(3.18)

where we have written a = bd. From eqn (3.1) the inner sum is zero
unless u is a multiple o f gjd. Writing h = g/d, we have
o(u ) = 2 ( # )

(3' 19)

It is possible to continue and to express cu{u) in terms o f Eulers cp


function. In our application, eqn (3.15) ensures that (g,u) = 1 and
thus that ca(u) is /u.(#), which is itself zero i f # has a repeated prime
factor. Ram anujans sum with (#, %) = 1 can be regarded as Gauss s
sum for the trivial character %0 m od#, for which Xoim) 1 whenever
(g,m) = 1.
In this chapter, we have shown that r(x) is zero unless # = # // is
composed solely o f those primes whose squares do not divide #, in
which case

|r(X)|2 = / .

(3.20)

4
PO LY A S THEOREM

F r o m eqn (3.2) we see that the sum function

x (v) = 2 xim)

(4.1)

is bounded when x is a non-trivial character m odg. Since the sum over


any q consecutive integers is zero, the absolute value o f X (x) can be at
most \q. Polya (1918) proved the following sharper result.
T h e o re m .

Let x be a non-trivial character m.odq with conductor f.


(4.2)

Then
where the term o (1) is to be interpreted as f -> oo.

P olya s theorem was discovered independently by I. M. Vinogradov


(1955, chapter 3, example 12), with a different constant in the upper
bound. Later proofs have been given b y Linnik and Renyi (1947) and
by Knapowski in an unpublished manuscript. W e shall follow P olyas
argument, as it is the most precise and can easily be adapted to show
that the sum in (4.2) is frequently ! > / 3.
First we introduce some notation. I f a is a real number, we write [ct]
for the largest integer not exceeding a, and ||a|| for the distance from a
to the nearest integer, so that
[a] = max m,

(4.3)

|a|| = min|ma|,

(4.4)

where the maximum in (3.3) and the minimum in (3.4) are over all
integers m. W e now state a lemma.
Lemma. The Fourier series

(4.5)
oo

m=0

1.4

P O L Y A S

converges to

15

t h e o r e m

a __[a] _ i

if a

no%an integer,

i f a is an integer,

(4.6)

and the partial sums satisfy the relation


M e(mct)
-M

(4.7)

2 ttw i

m 7^=0

when a is not an integer.


Proof. W e prove (4.7). Since H(a) has period 1 and H( a) = H (a),
we suppose that 0 < a < -J. Now
(2TTim)-1e(m<x) (2TTbn)~1( ~ l ) m = J e(mt) At,

and thus

(4.8)

* M

_-M
nr
in # 0

{2nim)-1e(ma.)-\-^ a =

J _ i\j

e(mt) di

e(M t+%t)e(M t lt) ^


e ( ) ~ e( ~ )
f sin(2ili"-|-l)7rf
d t,
sin 771

(4.9)

which by one o f the mean-value theorems for integrals does not exceed
in modulus

sin(2Jf-(-l)77i

sin 77a

At

1
2a (2M + \ ) t t

(4.10)

2 ttM cx

for some j3 between a and


This has proved (4.7), which gives (4.6) if
a is not an integer; if a is an integer every term in H (a) is zero.
W e now return to the sum in eqn (4.1). Since x is non-trivial, we have
rq+q
2 x(w) = o,

(4.11)

rq + l

and so we can replace the sum X (x) by a sum o f the same form with
0 ^ x < q. I f now x > \q, we can replace m by q-\-m and obtain a
sum with 0 ^ x ^ \q, multiplied by x ( 1). Thus we can suppose
that 0 < x ^ \cq in (4.1). W e now use H (a) to construct 0(a), where
n
0() = I \
\0

if 0 < a < xjq,


if a = 0 or a = xjq,
if xjq < a < 1,

(4.12)

16

IN T R O D U C T O R Y RESULTS

1.4

using the equation


0(a) = xjq-\-H(a.xjq)H(ot).

(4.13)

W e now have
X(l) + x(2) + ...+ l* (a 0 =

x(m)0(mlq).

(4.14)

m 1

W hen we use eqn (4.7) to truncate the sums for H (a xjq) and H(a) in
(4.14), the total error in modulus is at most

2J

[ ,/ )-! <

1 dh?

m=l

< 277-~1i l f - 1#logg.

(4.15)

W e now have finite sums to manipulate. Writing


CO

0(a) = 2 a(m)e(ma),

(4.16)

CO

we have to consider
q

2 X(r)

J .= l

es(TOr)-

(4.17)

Supposing first that x is proper m od#, we have from eqn (3.8)


a

2 a (m) 2

-M

x0')e9(') =

>=1

2 a(m)x(m)r(x).

(4.18)

Since x is non-trivial a(0)%(0) is zero, and by eqns (4.5) and (4.13), for
\a(m)\ < \TTin\-1.

(4.19)

The modulus o f the expression in (4.18) is now


/

< ? i 2 | (nm)-1
m=l

W e choose
so that (4.20) is

< 2 7 T ^ ^ (lo g J f+ 0 (l)).

(4.20)

M = qi+s,

(4.21)

77-^1gi lo g g (l-l-0 (S)),

(4.22)

and after (4.15) the tails o f the series give


< ( n M ^ q l o g q = o(qHogq).

(4.23)

Finally the omitted term \xix ) is 0(1), and we have P olyas theorem
when x is proper mod q.

P 6 L Y A S

1.4

IT

t h e o r e m

I f x is induced by Xi proper m o d /, where / < q, we could complete


the proof similarly, but it is easier to deduce this case.
2

x ( ) =

=
<

m^cc

X iM

d\q
d\m

M d)

^(d)Xi(d)

/ } l o g '/ { ^ + o ( l ) } ,

d\q

m ^xfd

Xi()
(4.24)

d\q

(d, f) = 1

since xi is proper m od /. The number o f terms in the sum is at most


d(qjf), and we have completed the proof o f P olyas theorem.
As a simple corollary we prove that for each prime p > 2 there is an
integer m with

m < p i\ogp

(4.25)

u2 = m (m od#)

(4.26)

for which the congruence

has no solution. Since 1, 4, 9,..., (# l )2 are distinct m od#, there are


\(p 1) reduced residue classes that are congruent to squares m od#,
and these form a subgroup o f index 2. There is therefore a character
X m od # with x(m) 1 when m is congruent to the square o f a reduced
residue and 1 when (m,q) = 1 but m is not congruent to a square.
W e choose
.
i,
,.
x> p -\ og p
(4.27)
in (4.2). N ot all terms in the sum (4.1) can be non-negative if p is
sufficiently large, and so there is an m < # - l o g # with xim) = !
The exponent \ in (4.25) can be improved, but the conjecture that the
asymptotic inequality

< #s

(4.28)

for each 8 > 0 holds in place o f (4.25) has not yet been proved or
confounded.

5
D IR IC H LE T

SERIES

Well, said Owl, the customary procedure in such cases


is as follows.
What does Crustimoney Proseedcake mean ? said Pooh.
For I am a Bear of Very Little Brain, and long words bother
me.
It means the Thing to Do.
I. 48

A Dirichlet series is an analytic function o f the complex variable


s = c j+ it defined b y a series
/ ( ) = 1 a{m)m~s,

(5.1)

or a generalization thereof. All the Dirichlet series that we need are


special cases o f (5.1). I f eqn (5.1) converges at s0 = cr0-(-ii0, then
\a(m)m~8<
\ is bounded. This simple observation is the basis for the
theory o f convergence o f Dirichlet series. B y partial summation
(5.1) converges whenever a > <r0 and converges absolutely when
a > <r0+ L The convergence is uniform in a half-plane a ^ a, provided
a > o-0+ l . W e see that the region o f definition o f f(s ) is a half-plane
bounded to the left b y some vertical line; this line is called the abscissa
o f convergence.
If

4 ( )= 2 a (m )

(5.2)

is the sum function o f the coefficients in (5.1), then


CO

f(s ) = j s x - ^ A i x ) da;.

(5.3)

Formula (5.3) can be inverted: fro m /(s) we can recover the sum function
A(x) o f the coefficients. Let a and u be positive real numbers. Then
ar-f- i oo

23

(0

a- ic0

\1

if

0 <C 11 <C 1 ,

if =

1,

(5.4)

if U > 1.

For tt ^ 1 we consider the integral from a i2\ to a-f-iT%. W hen u > 1,


this is equal to the integral round the three remaining sides o f the

1.5

D IR IC H L E T SER IE S

19

rectangle whose other corners are Rjlogu^iT^, R jlo g u iT^


modulus o f the integral in eqn (5.4) is thus
o~R
R ______
-^ e -K .
ir
___________________________
2tt t 277^ log u 1 2irT2i\ogti

The

( 5 gx

a number which tends to zero as R, Tx, and T2 tend to -|-oo. W hen


u < 1, Rflogu is negative, and we must add the residue from the pole
o f s-1 at s = 0 ; this gives unity. Finally when u = 1 we define the
value o f the integral (5.4) to be the limit o f the integral from a i T to
a-\-iT when T -> oo. This reduces to an inverse tangent integral.
T

Rj logu

Of.

1' 1
F ig .

I f f(s ) defined b y eqn (5.1) converges uniformly in t on the line a a,


then for x > 0 term-by-term integration gives
oi-J-ico

f
J

a ico

x ^ f i s ) ds = 2 a(m)+a(x),
m<x

(5.6)

where the last term occurs only if x is an integer. There are many
integral transforms from Dirichlet series to their coefficient sums, all
proved b y the same method. The simplest one after (5.6) itself is
a-f ico

h J

2 ( - )<>

a ico

A special class o f Dirichlet series occurs naturally in number theory.


The functions are known as L-functions after Dirichlets functions

Lis>x) = 2

(5.8)

m 1

where x is a Dirichlets character to some modulus q, or as zeta functions


after Riem anns function
() =

2 m ~s>
m 1

(5.9)

IN TR O D U C TO R Y RESULTS

20

1.5

which is the special case o f Dirichlets definition when q = 1 and x is


trivial.
^-functions are defined by two properties. First, the coefficients a(m)
are multiplicative, so that Eulers product identity
a(m ) _

-| r (

a(p )

a ( p 2)

m=1
holds in a half-plane a ^ a in which one side o f eqn (5.10) converges
absolutely. I f the product in (5.10) converges, f(s ) can be zero only
when one o f the factors on the right-hand side o f (5.10) is zero. The
convergence o f the left-hand side o f (5.10) alone does not im ply that o f
the product; L(s, x) with x non-trivial has a series (5.8) converging for
a > 0, but the function itself has zeros in a >
preventing the product
from converging in 0 < a < J-.
The second defining property is that f(s ) should have a functional
equation

f{s)G (s) = f * ( r - s ) G * ( r - s ) ,

(5.11)

where r is a positive integer, G(s) is essentially a product o f gamma


functions, and the operation * has ( /* )* = / and ((?*)* = G. As an
example, in the functional equation for L (s ,x ) in Chapter 11, L*(s, x)
is L (s ,x ). An important conjecture about Z-functions is the Biemann
hypothesis that if f(s ) satisfies eqns (5.10) and (5.11) then all zeros o f
f(s)G (s) have real part Jr. The truth or falsity o f this hypothesis is not
settled for any Z-function.
Two generalizations that are often called zeta functions are
CO

1 ( m + 8) - s,
m=l

(5.12)

where 8 is a fixed real number, and


00
2 r(m)m~s,
m=1

(5.13)

where r(m) is the number o f representations o f to by a positive definite


quadratic form. E xcept in special cases these fail to have a product
formula o f the form (5.10), and not all o f their zeros lie on the appropriate
line. Some authors even use zeta function as a synonym for Dirichlet
series .
In Chapter 11 we shall obtain analytic continuations o f (s) and other
.L-functions over the whole plane. Since the sum function X (x ) formed

D IR IC H L E T SE R IE S

1.5

21

with a non-trivial character %is bounded, b y partial summation (5.8) con


verges for ct > 0 except when x is trivial. Similarly, the function
co

X ( l ) m-1TO~~s = (1 21-S)(s)
m1

(5.14)

converges for a > 0 and provides an analytic continuation for (s).


W hen we make s
1 in (5.14), we see that (s) has a pole o f residue 1
at s = 1. W hen we p u t /(s ) = (s) in (5.6), the integrand has a simple
p o le a ts = 1 with residue. The value o f the right-hand side o f eqn (5.6)
is between x 1 and x. I f we deform the contour in (5.6) so that it passes
to the left o f the pole, the residue makes the main contribution, and the
contour integral left over is bounded. Let
<A() =
and

M (x) =

2 A{m)

(5.15)

2 Mm)>

(5.16)

m <x

m < (G

which are the coefficient sums o f '(s)/(s) and o f l/(s) (we shall prove
this below). The function '(s)/(s) also has a pole o f residue 1 at
s = 1, but l/(s) does not. I f the corresponding contour integrals were
negligible, we should have
t(x ) = x + o ( x ) ,

(5.17)

M (x) = o(x).

(5.18)

These are forms o f the prime-number theorem, which we shall prove in


Part II.
Writing m = f g , we have
co

CO

co

2 m~s 2 ( / W W / ) = ( 2 ( / ) / *) ( 2 H g ) r s)-

m= 1

f\m

'/= 1

' ' ( 7=1

'

(5 19)

I f b(g) = 1 and a (f) = /x(/) for each pair o f integers / , g,


co

(5) 2 K m)m~a =
m =l

CO

2 w s 2 M /) = 1

m =l

/|m

(5'2)

from eqn (1.27). Since eqns (5.3) and (5.6) im ply that expansions in
Dirichlet series are unique, we have shown that the series on the left-hand
side o f (5.20) represents l/(s) wherever it converges. Similarly, using
eqns (5.19) and (1.32) we can check that '(s)/(s) has a Dirichlet series
with coefficients A(m).
For fifty years (1898-1948) the only proofs known o f eqns (5.17) and
(5.18) used contour integration and other complex-variable techniques.
In 1948, Selberg and Erdos gave a real p roof o f (5.18) (see Hardy and

22

IN TR O D U CTO R Y RESULTS

1.5

W right (1960), Chapter 22). The real-variable approach is not so well


understood, and the strongest forms o f (5.17) and (5.18) (those in which
the error term is smallest) have been obtained b y analytic methods.
The form (16.22) in which we shall prove (5.17) is a little stronger than
the best so far obtained b y Selbergs method.
Aj)art from the analytic arguments, studjr o f log(A'r!) suggests the
form (5.17) as a conjecture. B y eqn (1.32),

2 MN/e),

(5.21)

e < iV

where weJiave written m = tie in the first sum. On the other hand, by
expression (2.5),

= N logN ~ N -}-0(logN )

(5.22)

which agrees with the result o f substituting (5.17) into (5.21). In this
way, Gauss was led to conjecture that
(5.23)
2

which is another form o f the prime-number theorem. B y consideration


o f the binomial coefficient 2lvCjv> ^ can be shown that <p(2N)>fj(N) lies
between boimded multiples o f N (Hardy and Wright, 1960); but there
are too many terms in the sum (5.21) to allow (5.17) to be deduced
from (5.22).

6
SC H IN ZE L S H Y PO TH E SIS
And all the good things which an animal lilies
Have the wrong sort of swallow or too many spikes.
II. 30
M a n y problems in prime-number theory follow a similar pattern.

Various constraints are laid on a set o f integer unknowns, and we ask


whether the integer unknowns can all be jsrime simultaneously, and
whether this happens infinitely often. Many o f these problems are
subsumed under SchinzeVs hypothesis: i f1(x1,...,xn) ,...,fm(x1,...,xn) are
polynomials (with integer coefficients) irreducible over the integers, and
there is no prime p for which J J / { = 0 (m od#) for all sets
xn o f
residues m od# , then there are infinitely many sets x v ..., xn o f integers
for which the absolute values o f / 1;..., f m are all prime.
There is a conjectiired asymptotic formula for the number o f sets
ik-l,..., x n with 0 < xi ^ N for each i with each o f
f m prime, and
some bold authors have conjectured that the asymptotic formula can
be stated with an error term smaller than the main term b y a factor
for each e greater than zero. Thus the simplest case is one p oly
nomial, f ( x ) = x, and the conjecture now states that tt(N), the number
o f primes up to N, satisfies the relation
( 6 . 1)
2

a very strong form o f the prime-number theorem (5.23). The accuracy


o f (6.1) seems unattainable. W e shall prove later that the hypothesis
is true for one linear polynom ial/(a;) = qx-\-a\ this is the prime-number
theorem for arithmetical progressions, but the error term in the asym
ptotic formula will only be shown to be slightly smaller than the leading
term.
The next simplest case concerns two linear polynomials, f ^ x ) = x,
f 2(x) = x 2. Here the conjectured formula is
Ar

2 XJ {1
853518X

l ) " 2} I (log ) 2

-f-error term,

(6.2)

24

IN T R O D U C T O R Y RESU LTS

1.6

W e shall now describe how to write down the conjectured asymptotic


formulae. Let

a/ \
S (a )=

v
2,
p^N

n.

m o\
(6>3)

such an expression is called an exponential sum or a trigonometric sum.


B y the fundamental relation
/ /
t
(1
J e(m) * . = (

i f m 0,
jj

(6-4)

we see that the number o f primes p ^ N for which p 2 is also prime is


l
J S(a)S(a)e( 2a) da.
o

(6.5)

W e cannot, o f course, work out this integral, but we can suggest a


plausible value for it. Writing
n (N ;q ,b )=

1.

(6.6)

P<N

p = b (modg)

we have

S(a/q) =

ea(ab)ir(N;q,b).

(6.7)

fcniodg

N ow the sum in eqn (6.6) is 0 or 1 i f b has a common factor with q. I f


we make the approximation that the primes are divided equally between
the <p(q) residue classes b with (b,q) 1, the expression in (6.7) is
{tp(g)}-177(iV) 2 * eQ(ab) = {<p(?)}~Mi\T)ca(a),
fcinodg

(6.8)

where cq(a) is Ram anujans sum (1.11). I f (a, q) = 1 the Ram anujans
sum is just /x(g), from eqn (3.19). This argument suggests that S(a) has
a spike at a/q o f height proportional to ii(q)l<p(q), that is, that |$(a)|
has a local maximum close to ajq. Now the area under the graph o f
|$(a)|2 near 0 (which certainly is the site o f a spike) may plausibly be
written

2V'"17r2(iV*)-{-error term,

(6.9)

and this can be proved to be true. I f we assume further that all the
spikes at rational points are the same shape, the spikes at rational
points ajq contribute
N^1tt2(N) 2 lJ'2(q){<?(q))~z 2 *

amodg

ea(2a)+error term,

(6.10)

and the sum over q in (6.10) converges to


2n { i - ( ? - n
P> 2

which gives the main term in eqn (6.2).

(6. n )

1.6

SC H IN ZE L S H YP O TH E SIS

25

For small q the argument above can be made rigorous; but then part
o f the range o f integration in (6.5) does not support spikes. Away from
a spike we cannot estimate S(a) except b y replacing it b y its absolute
l
value; and the spikes with small q contribute very little to J |$(a) |2 da.
o
In the integral o f |$(a)|3 the spikes do dominate, and b y this method
I. M. Vinogradov was able to prove that every large odd number is the
sum o f three primes.
The approach to Schinzels hypothesis through exponential sums does
lead to an upper bound for the number o f sets x 1}..., xn o f integers not
exceeding N for w h ic h /l v .., f m are all prime. To explain the method
we shall take n = 1, so we are considering integers * in the range
1 ^ x
N for w liich /1(a;),...,/m(a;) are all prime. W e now work modulo
a prime # . Apart from the finite number o f x for which one o f f t(x),...,
f m(x) is p, x must be such that none o f fx(x),...,fm(x) ~ 0 (m odp). This
means that x must be confined to certain residue classes m od#. W e
therefore divide the residue classes m od # into a set H (#) o f /(# ) forbidden
classes and a set K (p ) o f g(p) = # / ( # ) permitted classes; h m od # is
forbidden if and only if one o f the p o ly n o m ia ls /^ ) is a multiple o f # .
I f x falls into a forbidden class for any prime # smaller than each o f the
fi(x), then one at least o f t h e / i(x) cannot be prime.
The values o f x that make f x(x),..., f m(x) primes greater than some
bound Q form a sifted sequence, in the following sense. The increasing
sequence J f o f positive integers % , n 2>... is sifted b y the primes # < Q
if for each prime # < Q there is a set H(p) (possibly empty) o f /( # )
residue classes m od # into which no member o f J r falls. W e shall show
in Chapter 8 that, i f J/' satisfies the above condition, the number o f
members o f J f in any interval o f N consecutive integers is
N
v ... >,...wi, w ..+ error t e r m >

I /**(?)/(?)/?(?)

( 6 - 12)

a<Q

where

f{q) = q H f ( p ) l p ,

(6-!3)

g(i) = ? IT { l - f i v ) h ? } v\a

(6-14)

W e shall work out examples o f this upper bound in Chapter 8 ; in each


case the leading term is a multiple o f the leading term in the conjectured
formula.
Upper bounds o f the right order o f magnitude were first found by
Viggo Brun using combinatorial arguments. Rosser used Bruns method

26

IN T R O D U C T O R Y RESULTS

1.6

to obtain expression (6.12), which was found in a different way b y Selberg.


A n outline o f Selberg s method follows. It rests on the construction of
an exponential sum T(a) with the same spikes at rational points as
S () =

eK )

!< N

(6-15)

is conjectured to have. Let K (q ) be the set o f g(q) residue classes that


are in K (p ) (permitted classes) for each prime factor p o f q. A t a/q we
expect a spike o f height proportional to
K {a,q) =

k eK (q )

ea(ak)l9(q)-

(6' 16)

I f p is a repeated prime factor o f q, then the classes k-\-qjp are also in


K{q), and since replacement o f k in eqn (6.16) b y k + q j p multiplies the
right-hand side b y ep(a), which is not unity, K(a, q) is zero if q has any
repeated prime factor. N o av the simplest exponential sum i s
W

2 e(ma),
m=l

(6.17)

which has a spike at a = 0, since


]F (a) | = |sin ttN a/sin tt<x \.

(6.18)

W e therefore compare 8(a) with


y () = 2

2*

q^ Q omodg

K (a ,q )F (a -cilq ).

(6.19)

The coefficient o f e(ma) in T(a) is

gCQ

2fteK (g) aTm o d q

= g2 :Q

k eK (q )

(6 .2 0 )

using the definition (1.11) o f Ram anujans sum. W e know the sum
over 1c must be zero if q has a repeated prime factor; i f q is square-free
then
i \

e K(q
n)
k eK

2d\n
d\q

d\(k-m)

M 'I

g\g(g)
d\q
meK(d)

w M )

= pte)9te)M ftei)lgtei),

(6.21)

where qx is the largest factor o f q for which m e K ( q 1). W e write


2 =
so
m e H ( c l ) then d divides q2. The expression in

(6.21) becomes

A<lMlM<l)lf(b)-

(6.22)

1.6

SC H IN ZE L S H Y PO TH ESIS

N ow

27

(6.23)

and so the coefficient o f e(ma) in eqn (6,19) is

d=C_Q

meH(d)

ii{d)d

fid )

Z-,

'

^{q)f{q)

g = o(m ocld )

(6>24)

glq)

s<Q

Selbergs argument proceeds as follows. I f m is in


only when d = 1, so that b y eqn (6.4)

jo W

nm^N 'a^QT

W e obtain (6.12) b y Cauchys inequality,


l
i
i
J S{<x)T(-~a) d a 2 < j ^ ( a ) ! 2
0
noting that, from eqn (6.4),
1

and

'

<6'26>

J |T(/3)|2 dj3,
0
0

(6.26)

f | (a )| 2 d a =

da

then m e H(d)

f j S ( a ) /S ( -a ) d a =

( 6 '2 7 )

0J

f |T (a)|2 da = N ^
o
sQ

+ error term.

(6.28)

W e can prove eqn (6.28) either directly from (6.19) or b y writing ^ ( a )!2
as T(a)T( a) and using (6.4) again and the expression (6.24) for the
coefficients. Selberg used the second method, which can be expressed
entirely in terms o f the manipulation o f the coefficients (6.24), with the
integrations (6.25) and (6.28) occurring only implicitly. The first proof
uses eqn (6.18) to show that the spikes o f T(a) provide the main contribu
tion to the integral in (6.28), and the observation that
\K(a,qW = ^(q)f(q)lg(q).

(6.29)

am odg

The usual account o f Selbergs method in terms o f coefficient manipu


lations can be found in Halberstam and R oth (1966, Chapter 4). The
sketch above is presented in terms o f exponential sums for comparison
with the large sieve o f the next two chapters.

7
THE

LARGE

SIEVE

But whatever his weight in pounds, shillings and ounces,


He always seems bigger because of his bounces.
II.30

W e have seen that the behaviour o f a given sequence o f integers con


sidered m odg is reflected in the behaviour o f the sums 8(a/q) where
JV
S(ot) =
(7.1)
1
in which am is 1 if m is in the given sequence, and 0 if m is not. An ujjper
bound for the sum
^
^
|(g(a/?)|a
(7 2)
q ^ Q amodg

(or for some related expression) is called a large sieve for residue classes.
Other large sieves will appear in Chapter 18. In this chapter we prove
what is probably the simplest o f the upper bounds for the sum (7.2), and
in the next chapter we shall use it to prove an upper bound o f the form
(6.12) for the number o f elements o f a sifted sequence in a bounded
interval.
The p roof does not require that am in eqn (7.1) takes only the values
0 or 1, and it treats (7.2) as a special case o f the sum

f |S(*,)la
(7.3)
}.=i
where 0 ^ x x < x 2 < ... < xR ^ 1. Since ajq occurs in the sum (7.2) only
i f it is in its lowest terms, the points x r are distinct in our proposed
application. W e write
S = min{*2a^Xgx z,...,xx+ l x ^

(7.4)

and suppose that S > 0. Before proving an upper bound for (7.3), we
consider what form it might possibly take. Certainly there will exist
sequences o f coefficients am and points xr for which
b
1
2 |S( ,)|>8- i f |S()|d
= !
J

= 8-1 K P 1

(7-5)

THE LARG E SIEVE

1.7

29

On the other hand, any one term in the sum (7.3) may be as large as
( 2 K n!)2> ari(l ^ ail the

are equal in modulus this is


t f f h J 81

(7.6)

An optimistic conjecture is that the inequality


|

\S(xr) \ * ^ ( N + 8 ~ i ) f \ a m\*

)=i

(7.7)

always holds. Surprisingly, the right-hand side o f (7.7) has the correct
order o f magnitude: we shall prove that
f

\S(x,.)\^ ( i V + p - W 3 + 0 ( l ) ) i

(7.8)

ml2;

r=l

this result is due essentially to Bombieri (1972).


To prove the relation (7.8) we use the language o f ^-dim ensional
vectors over the complex numbers. The inner product (g, h) o f two
vectors g = {g-s_,---,gN) and h (hv ...,hN) is given by
N

and the norm ||g|[ b y

(g, h) = % g j i ns
x

(7.9)

||g|| = ((g,g))*.

(7.10)

W e can now state a fundamental lemma.


Lemma. Let u, f (1),..., f(fi) be N-dimensional vectors, and cv ..., cR be any
complex coefficients. Then

f c ,( u ,f W )
r 1

< ||u||(| | c,| f ( max


' 1

'

|(fM,fM)|)*.

1,...,R 8=1

(7.11)

'

Proof. The left-hand side o f (7.11) is


R

(u , | c,fW)

2 M W

(7.12)

The square o f the second factor on the right-hand side o f (7.12) is

f crcs(w,f) < 1 f

j = l s = l

r=l

r=l

| (Icri+Kia)i(fw,fw)i
s==l

K l2 2

s=1

l(f(,,)>f(8))l

(7.13)

from which the result follows.


In applying the lemma we choose c,, so that each summand on the left
o f (7.11) is real and positive. In Chapter 27 we shall apply the lemma

30

IN T R O D U C T O R Y RESULTS

1.7

with cr o f unit modulus to obtain Halaszs method for estimating the


number o f times 8(a) is large. To prove (7.8) we take cr given b y
cr = (fMf<s>)

(7.14)

and deduce the corollary.


We have

C o ro lla ry .

|(u,fW)|2 < jjujj3 m ax

r=1

|(f(,,),f (s))|.

(7.15)

1<9'<B s = l

W e write

M
S(u.) = e(M<xJr a) 2 bmc(ffla),

(7,16)

-M

where 2M = N or N 1, whichever is even. W e choose


= &me(m,),

(7.17)

um = b j k m,

(7.18)

where k_M,..., kM are given by


km = 1

for \m\ ^ M
(7.19)

k.m = 0

for \m\ ^ M-\-L

here L is a parameter which we shall choose below. W e now have


M

2 b me(mx,) = \8(xr)\.

l ( , f (,,))l =

Moreover

M
!ll!2 =

and

-M

(7 .2 0 )

N
IKWK, = 2 K l 2
i

( f fW) = K (x s~ x r),

(7.21)
(7.22)

M + L

where

K(a) =

e(ma)
-M -L

sin2(ilf -f- L )n a sin2ilf7ra

(7.23)

Hence we have
J*

S= 1

l-^r(s' '*'') I

2-ZI4"-f--Zv
}2

i(B -l)

Z -1 coseo27TW
CO

< 2 i + JL + 2 L - 17T2 |;TO-2S--2+ 0 ( L - i )


l

< 2 i f + Z + ( 3 i > 2) - 1+ 0 ( i v - i ) .

(7,24)

1.7

THE LARG E SIEVE

31

When we choose L to be an integer close to (SV3)-1, (7.24) becomes


< 2 iY + f S - 1V3 + 0(1).

(7.25)

W e substitute (7.20), (7.21), and (7.25) into (7.15) to obtain (7.8).


Our inequality (7.8) represents an improvement o f an inequality o f
R oth (1965), which has led to much recent work. The best upper bounds
known for the sum (7.3) at the time o f writing are

S-1(l + 270Ar3S3) f

\am\2,

(7.27)

and

2max(AT, S^1) 2

(7.28)

Of these, (7.2 6) is the result o f this chapter, appropriate when N > f 8 _1V3,
and (7.27) and (7.28) are results o f Bombieri and Davenport (1969,1968).
(7.27) is appropriate when S_1 > 3(10)i/V, and (7.28) for the inter
mediate range.
Note added in proof. H. M ontgomery and R . C. Vaughan have now
proved the conjecture (7.7). This supersedes (7.26) and (7.28) but not
(7.27).

8
THE

U P P E R -B O U N D

SIEVE

1Its a comfortable sort of thing to have said Christopher


Robin, folding up the paper and putting it into his pocket.
II. 170

I n this chapter we obtain the upper bound (6.12) as an application o f


the large sieve. The notation is that o f Chapter 6. J r is a sequence
o f positive integers, and for each q ^ Q there are sets H(q) and K(q) o f
residue classes m od#. The f(q ) classes o f H(q) are precisely those that
are not congruent to any member o f J r m od# for any prim e# dividing q,
so that, if h is in a class o f H(q) and n e J f,
(nh,q) = l.

(8.1)

The g(q) classes o f K(q) are those that for each # dividing q are congruent
m o d # to some member o f J r \their union contains all members o f the
sequence J f.
W e work with the exponential sum o f eqn (6.15):
#( ) =

( ).

(8 '2 )

where the sum is over members nv n2,... o f the sequence J r . I f h is a


class o f H(q),
2*

S(alq)ea{ - a h ) =

am odg

c ^ h)

2
U; < N

= p{q)M,

(8.3)

where M is the number o f members o f J r , and we have used eqn (3.19) in


the sjiecial case (8.1). Hence

2* 2 8{alq)eq(-ah).

(8.4)

a m o d g heH (q)

Cauchys inequality now gives


,i*(q)f*(q)M* ^

2 *

' am odg

| S ( / ? ) l a) (

' ' am odg

heH ( q)
hef-T(n\

'

>

( 8 -5 )

TH E U P P E R -B O U N D SIE V E

1.8

33

The second sum over a on the right-hand side o f (8.5) can be rearranged
as follows.
2*

amodg heH(q)

ea( ah) 2

geH(q)

%iacJ)

geH(q) JisHiq)

05H(q) heH(q)

c fl( 0 A)

2 M v l d)
d\(g-h)

= 'ZMqld)P(q)lf(d).

(8.6)

d\q

In eqn (8.6) we have a sum that is a Mobius inverse (in the sense o f
(1.27)) o f that in

' /*2(})fl,(r) _
rlm

f (r )

(8.7)

m y

an equation that expresses the fact that for a prime modulus p, every
residue class is either in H(p) or in K{p). The terms involving d in
(8.6) therefore come to p?(q)g(q)lf(q), and we can put (8.5) into the form
'V '*

j^
/
J
2

""

q lq )
Jw

Z -i
amodg

Sa

( 8 .8)

W e apply the large sieve (7.8) with the rationals ajq with q ^ Q and
(a ,q ) = 1 as the points x x,..., xR, so that
= (Q (Q -l))-1

(8.9)

in eqn (7.4). The upper bound (7.8) now gives


2

2*|-S(/?)|2 < ( ^ + 0 ( e 2) ) ^ .

q<,Q amodg

(8-10)

Combining (8.8) and (8.10), we have


M 2 2 [i2(q)f(q)lg(q) < (N + 0 { Q * ) ) M ,

(8.11)

q^Q

or

M ^
^ + ^ ( ^ 2)
M ^ W
i i i '
q<Q

fo i9\
( '

which is the relation (6.12) with an explicit error term. W e have stated
our result for the interval 1 ;;C n $ ' N o f the sequence J /\ but (8.12)
holds as an upper bound for the number o f members o f J r in any interval
LA-1 < % <c L-\-N) we have to consider merely the integers n L,
where n is in J f , themselves a sifted sequence with the same values o f
f(q ) and g(q) but the sets H(q) and K (q) translated b y L. The deduction
o f (8.12) from (7.8) is due to Montgomery (1968). His argument differs
from that given here; there are several alternative proofs.

34

IN T R O D U C T O R Y RESU LTS

1.8

Tw o worked examples follow. First we consider the perfect squares


not exceeding N. These are a sifted sequence: the set H(p) contains the
residue classes m od # that do not contain squares, and thus
/ ( 2) = 0,

g( 2) = 2

f ( p ) = U p - 1)>

9(P) = U P + !)

fo r # > 3

I t is not difficult to sIioav that


'2,p2te)fte)lg(q) =

(< H -o (i))Q ,

(s.i4 )

where c is a constant, as Q -> oo. Choosing Q N -, we have shown that


the number o f perfect squares not exceeding N is 0(2V*). This is very
encouraging, since the sieve upper boim d is sharp , differing only by
a constant factor from the actual number o f squares. It is surprising
that we have not lost the correct order o f magnitude in combining so
many inequalities.
Our second, less trivial example concerns the primes between Q and
N ; these form a sifted sequence with
f ( p ) = 1>
and thus
a

v(p

g(q)

- 1 =

( 8 -1 5 )

Z-t <p(q)
a
Tu'
= (l+ o (l))lo g Q ,

(8.16)

when worked out as in Chapter 2. W e can obtain a lower bound for


the sum in (8.16) with less effort, since

2 ^ = 2 fir n K + ? + ~ )

q^Q r

q^Q

P\q '

= 2 iK

t8-1?)

the sum being over all m whose prime factors are at most Q. All m ^ Q
are included in this sum, and thus

2 ^T W '> 2m<<3s > ' ^


for q > 1, by (2.5). When we choose Q a little smaller than
that the primes between 1 and N number
< Q + { N + O m ) / l o g Q < (2 + o (l))N/logN.

<8'I8>
we see
(8.19)

The right-hand side o f (8.19) is just double the true value (5.23). M ore
over, (8.19) is also an upper bound for the number o f primes in any
interval o f length N.

1.8

TH E U P P E R -B O U N D SIE V E

35

W e derived the inequality (8.12) from Cauchys inequality; the


difference between the two sides o f the inequality (8.12) is a measure
o f how closely the values o f S(a/q) are proportional to those o f
O /ta))'1 2

lx K (q )

e(cl&)

(8'2)

and this in its turn measures how evenly J f is distributed among the
g(q) residue classes m od# into which it is allowed to fall. W e could add
an explicit term on the right o f (8.8) to measure the unevenness (what
statisticians might term a variance). The inequality (7.8) gives a strong
upper bound for this variance as well as for the main term. W hen we
use (7.8) to prove Bom bieris theorem, it is the variance bound that is
important, not the bound for the main term.

9
FR A N E L S THEOREM
Its just a thing you discover, said Christopher Robin
carelessly, not being quite sure himself.
I. 109

T h e Farey sequence o f order Q consists o f the fractions ajq in their lowest


terms (i.e. (a, q) = 1), with q
Q ancl 0 < a
q. W e name them
f r = ar/qr in increasing order, so that f x = 1IQ ,f2 = l/(Q l),...,fF = 1.
For notational convenience we may refer to f F+r] this is to be interpreted
as 1+ / r Here F is the number o f terms in the Farey sequence, so that

F = 2 ?(?) = 377-*Q*+0(Q log Q)

(9.1)

from eqn (2.11). The properties o f the Farey sequence are discussed by
H ardy and W right (1960, Chapter 3).
W e shall sketch a p roof that
fr+l

fr =

( M m ) - 1-

(9.2 )

Let us represent rational numbers ajq (not necessarily in their lowest


terms) by points (a, q) o f two-dimensional Euclidean space. Since/,, and
f r+1 are consecutive, the only integer points in the closed triangle with
vertices 0 (0, 0), Pr (ar, qr), and Pr+1 {ar+1, qr+1) are its three vertices. B y
symmetry, the only integer points in the parallelogram OPr TPr+1 are
its vertices, where T is (ar-\-ar+1, qr-\-qr+1). W e can now cover the plane
with the translations o f this parallelogram in such a way that integer
points occur only at the vertices o f parallelograms. It follows that
OPr TPr+1 has unit area, which is the assertion (9.2).
Before stating FranePs theorem we introduce some notation. For
0 < a < 1 we write
E{a) =

1- a F ,

(9.3)

so that E(a) is the excess number o f Farey fractions in (0, a] beyond the
expected number aF. Franel considered the sum
S Im fr) I2

r= l

(9.4)

>

F R A N E L S TH EO REM

1.9

37

and showed that it is o (Ql) i f and only i f eqn (5.18) holds; and an upper
bound
for (9.4) with 3 < A < 4 is valid if and only if
\M{x)\

(9.5)

W e can also connect M (x) with


l
J |2?(o:)|2 da.
o

(9-6)

In fact, the ratio o f (9.4) and (9.6) lies between bounded multiples o f F ;
but this fact requires proof, as the Riemann sum corresponding to the
integral (9.6) and the p o in t s /i,...,/F is
f

)=1

(fr+ l-fr) i m W ,

(9.7)

and from eqn (9.2) we see that the difference f r+1f r varies from Q -1
almost down to Q~2. Franel (1924) produced a curious identity for the
sum (9.4). W e shall show in (9.20) that (9.4) is less than a bounded
multiple o f Franels expression involving M (x), and deduce the only i f
clause o f Franels result b y a method o f Landau (1927, Vol. II, pp. 16977).
W e use the function H(a) o f eqns (4.5) and (4.6):
tt, \
# ( ) =

M \
n
(0

i f a is not an integer,
+
it-fa is an integer.

9-8

W e have

_ (F(a.)
i f a is in the Farey sequence,
\ J ? ( ) - i if not,

. .
(J' J)

for 0 < a -<C I. To verify eqn (9.9), we observe that the left-hand side
is a,F plus a step function that is zero at a = 0 and has discontinuities
\ on each side o f the Farey p o in ts /r Taking (4.7) in the form
5 (a ) +

-
J]
m# 0

e(mot)
277-1TO

(9.10)

Q2INI

we have from (9.9)


F

t=i

V, rt# 0

38

IN TRO D U C TO RY RESULTS

1.9

B y (9.2), the consecutive Farey fractions are at least Q~2 apart, and so
the sum of the error terms in (9.11) is
< Q~z X Q2lt < log -P < log Q,

(9.12)

t= i

where we have used expressions (2.5) and (9.1). We can now replace
the term t = r (since H ( 0) = 0) and rearrange the first term on the
right o f (9.11) as
F

Q2

t~ 1

Q2

{27nm)-1e{mfrmfi)

2
Q2

Q1
m# 0

to #

3 <Q

X*

e (m /^ )e ( a m )

amodg

2 (2wim)-1e(TO/1.) X cq(-).
S o
a<1
(9.13)

We apply the large sieve (7.8) with (9.13) as the exponential sum and
/x,..., fff as the points. By eqn (9.2), S-1 is here Q(Q 1). We have
F
Q2
Q
X (2 7r i m ) - 1e (m /).) X c a( m )
2 cg(m)
X
r= 1 -Q2
to#0
TO
TO##0
Q2 2

2<z

ra=l

2 = 0 (m o d d )

(9.14)

where we have used eqn (3.19) for Ramanujans sum. The coefficient
of Q 2 in (9.14) is now
<

2
d^Q

dM(Qld) X f M (Qlf)
f^Q

m = 0 (m o d tf)
w = 0 (m o d /)

m~2>

(9 J 5 )

where we have taken the sum over to from m = 1 to infinity, since the
coefficient o f to~2 in (9.14) is non-negative. The inner sum in (9.15) is
772

_ tr2 ( d J Y

6 \ d ,ff ~

6 dT

1 '

Let u(m) be the arithmetical function with


2

u(d) = to2.

(9.17)

d\m

By Mobiuss inversion (1.28), we can verify that


0 < u(t) < t2.

(9.18)

We can now write (9.15) as


Itt2 2
<K Q

dM(Qld) X m Q l f ) d ~ y - * X Mt) < X


/< Q

tf

(0 (

\ 4 f r S o a ,)

d -'M (Q ld )y.
1

(9.19)

1.9

FR A N E LS THEOREM

39

Squaring (9.11) and substituting (9.12) and (9.19), we have


2 W ) I 2 < Q2 2 (*)(
r=1

'

t^ Q

c i-m (Q id )Y + F iog *Q

< Q 2l H

'

d<Q
(Z= 0 (m od t)

KO v d^Q

d -i M ( Q l d ) Y + Q * l o g * Q .

'

d = 0(m od t)

(9.20)

The leading term on the right of (9.20) is essentially the sum involving
Mobius functions o f Franels identity.
The inequality from (9.8) to M (x ) is less involved. Since, by (3.19),
2 * ea(a) = c9(l) = rfq),

(9.21)

am odg

we have
Now

M (Q ) = 2 e(fr)-

(9-22)

r1

\ e (f^ e {r lF ) \^ 2 n \ f - r l F \

< 2 ^ 1 E ( f r)\.
Since
from (3.1), we have

2 ^(rjF) = 0

(9.23)
(9.24)

r=l

|jf(<3)|< 2 ^ 2 TO O!
r=l

< Q - i( z m

\r=i

r) \ f ,

(9-25)

and so a bound for (9.8) of the form o(Q 4) implies the prime-number
theorem in the form (5.18).

853518

t)

P A R T II

T h e P r im e -N u m b e r T h e o r e m

10
A MODULAR

RELATION

Winnie-the-Pooh read the two notices very oarefully, first


from left to right, and afterwards, in case he had missed
some of it, from right to left.

I. 46
I n this second part we study (s) and L(s, x) as functions o f the complex

variable s, and work towards the prime-number theorem. Our investiga


tions are based on the functional equations for 'C(s) and for L(s, x). The
first step is therefore to prove these.
W e need a lemma from the theory of Fourier series.
(Poissons summation formula.) Let Tc, I be integers. L e t f ( x )

Lem m a.

be a differentiable ftmction of a real variable tvith


\ m \ < A
on [k, I]. Then

>

ki l

f m e(rn x)d x = i m + f ( } c + l ) + ...+ f ( k + l ~ l ) ^ f ( l c + l )

X
m = -co

(10 .1 )

and moreover the -partial sums satisfy the inequality


M

l,e + 1

f(x )e (m x )d x -y (k )-...-y (k + l)
M

^ lA M -H o g M .
( 10 . 2 )

Proof. Equation (10.1) is additive on intervals: its truth for [/<:,r]


and for j>, t] implies its truth for [k, t]; so we may suppose 1 = 1. By

2.10

A M ODULAR RE LA T IO N

41

a change of variable we may suppose also Jc = 0. We use the function


H(x) of eqn (4.4):
H (x)=

y
M
= H
Z-t 27rim
10
mco
v
m^O

if o < * < i,
if x ~ 0 or I,

and in particular if 0 < x < 1 we have (4.6):

^ > + 2M. 27rim


5

7)1#0

Hence
/

V2

H ( x ) Jr 2 (277im)_1e(m3;) da;

2 m -1 d a ;+

1'

da;
1I'M

M -1 log M .

(10.5)

We now have
jif, }
2
- m

I f(x)e(mx) da;
J

m#0 0
=

[/(%)

(277-iTO)_1e(?wa;)j J jV'O) 2

m^0

(27rim)_1e(OTa;) d*.

(10.6)

The first term on the right of eqn (10.6) is zero, since the terms for m
and m cancel, and by (10.5) the second is
1

J f ( x ) H ( x ) d x - { - 0 ( A M ^ 1log M ) .
0

(10.7)

The integral in (10,7) is


1

j / '( * ) ( * - * ) da = i / ( 0 ) + i / ( l ) ~ J /(* ) d.
o
o

(10.8)

Combining (10.6), (10.7), and (10.8), we have


M

2 f f{x)e(mx)

dx = | /( 0 ) + i /( l ) + 0 (^ ilf-1log Jf),

(10.9)

and letting M tend to infinity we have the case k = 0, 1 = 1 of (10.1).


A general method of proving functional equations is to write the
required function as an infinite series, apply Poissozis summation
formula to a partial sum, and then let the length of the sum tend to
infinity. This entails a change in the order of summation in a double
infinite series. We could prove the functional equations for ( (s) and

42

THE P R IM E -N U M B E R TH EOREM

2.10

L(s, x) directly by this method, but it is more troublesome to justify


the interchange o f summations and more difficult to identify the
functions that arise. We shall therefore prove the identity
co

oo

exp{(to-|-8)27t2:-1} = x i

m = co

m=

exj)( i7i2Trx)e(m8),

(10.10)

co

where x is real and 0 s ' S <1 1. Rather than apply (10.2) directly, we
vise eqn (10.9), so that
M

| exp{(r + i+ S )2^ - 1} di

-M

| e x p { -(r + 8 )2} + i e x p { - ( r + l + ) 2} +
+ 0[ne_1exp{ ( r - ^ S y n x ^ j M ^ l o g Jf],

(10.11)

and the double series


00

OO

m = co

e x i ^ { t + d f n x ^ } e { m t ) At

(10.12)

r = co ^

converges and is equal to


co

2
y =

e x p { - ( r + 8 ) 277a;-1}.

(10.13)

CO

A typical term in (10.12) is


CO

J exp{ (t-\-i)lTTX^1Jr 2nimt} At


CO
CO

J exp{ TTXZ2Jr2nT\m(xz 8)}* dz,


co
co

= icexp{ nxm? 2mmZ) J e x p { rrx(z im)zJAz,

(10.14)

CO

where we have put t-\-8 = xz. The integral in (10.14) can naturally be
regarded as the contour integral of e~z2 along the line Ixnz = to. To
evaluate the integral, we can move the line o f integration to the real
axis, when (10.14) becomes
CO

a exp(
rrxm2 2mm)

J exp(irxt2) At
CO

= C77~sa;=exp(nxm2 2irim),

(10.15)

co

where

c=

J e x p ( t2) At.

(10.16)

A M ODULAR R E LA T IO N

2.10

43

I f we combine (10.15) and (10.12) we have proved that


CO

CO

2
?=

exp{ ('-)- 8)27rai-1} =


CO

j
7)1

C7r~%*exp(7rm2x)e(mS).

(10.17)

co

Putting S = 0, x = 1 verifies that


c = ttI,
(10.18)
and we have proved (10.10).
Equation (10.17) should not be let pass without some comment. Let
us put
m
(w) =
2
exp(7rmaai)
(10.19)
m= oo
for values of to for which the series in (10.19) converges, that is, for
complex to with positive real part. Then we have
6(to + 2 ) =

8 ( w ),

(1 0 .2 0 )

and (10.10) with 8 = 0 gives us


02( - l / ) = to82{to),

(10.21)

provided w is pure imaginary. However, since eqn (10.21) holds along


the imaginary axis, the two sides o f (10.21) have the same derivatives
at points iy with y > 0, and, since power-series expansions of regular
functions are unique, (10.21) must hold whenever 8(to) and 6( l/to) are
both defined, which is whenever the imaginary part of oj is positive.
From eqns (10.20) and (10.21) we see that
04H dw

(10.22)

is invariant under the group of transformations o f the upper half o f the


complex plane generated by co -> w + 2 and w -> 1/to.
We are now in the realm of the elliptic modular functions. A modular
function is one that is invariant under the group of transformations
generated by a> ->oj-|-1 and oj -> 1/co or under a subgroup o f finite
index in this group. The derivatives of a modular function are not
invariant under these transformations, since dw itself is not invariant;
functions that satisfy the transformation law for a power of a derivative
o f a modular function are called modular forms. The name elliptic
modular functions arises as follows. The periods o f an elliptic function
form a free Abelian group on two generators tox and to2. A modular
form corresponds to a function of two complex variables
and w2
which is homogeneous and whose value does not change when we replace
ojl and to2 by another pair of generators o f the same free Abelian group
(or, more generally, which takes a finite set of different values when toJ

44

THE P R IM E -N U M B E R TH EOREM

and a)2 are generators of the same Abelian group). Here


Thus we may consider
f{a>i,co2) = io^e^iojcox),

w ith

and

/(o j

2,

OJjJ

U)2~1 02( W j/o jg ) =

2.10
co

= w2/wi(10.23)

162(co2/co1)

= / K . w8)
/(oij, a)2+2co1) cox 102(co2/a)1-)-2) = /(aij, co2).

(10.24)
(10.25)

11
THE

FUNCTIONAL

EQUATIONS

When he awoke in the morning, the first thing he saw was


Tigger, sitting in front of the glass and looking at himself.
H a llo ! said Pooh.
H allo! said Tigger. I ve found somebody just like me.
I thought I was the only one of them.

II. 21

W e return to (s) and L ( s , x ) The definition


CO

r () = / e - v y ^ d y ,

(11.1)

0
valid for cr > 0 (we recall s cr+itf, cr and t real), becomes
A i s)

CO

e~~m7rxx is~1

da;

(H -2)

when we write y = -nmPx. Summation over m gives


co

7 t - } sr ( \ s ) i ( s )

>

q - 2itxx Is - 1

da;.

( 11 .3 )

m= 1 0

Since the sum and integral in eqn (11.3) each converge absolutely, we
can rearrange the right-hand side of (11.3) as
co

CO

J 2

dx =

J 4 (0 (i )-l)* * -1 da;,

(11.4)

where 6(a>) is the function of eqn (10.19). Next we write

>

| i ( 0 (i )-l) * -1 d a != J (0(i/f) l)$ -ls-1 dt

(11.5)

and use eqn (10.10) to put (11.5) into the form


CO

co

f ^ ( ^ ( i t J -lJ H 8- 1 dt =

f I0(it ) t ~ ^ i dt - K 2 S - 1)

i
CO

= J 4 ( 0 ( i i ) - l ) H - * d i - s - 1- ( l - a ) - 1.

(11.6)

46

THE P R IM E -N U M B E R TH EOREM

2.11

The left-hand side of (11.3) has now been expressed as


CO

J (0(ia;) l ) ( J - i + a r + ' - * ) d { ( i a) } - i

(1 1 .7 )

The expression (11.7) was obtained under the assumption a > 1, but,
Since

0 (i )-l< e -

(1 1 .8 )

for x > 1, the integral in (11.7) converges for all complex s. Since
r(|s) is a known function, and ( m s ) ) - 1 is integral (single-valued and
regular over the whole s-plane), we can take (11.3) with (11.7) as the
CO

definition of (s), knowing that X m ~s agrees with our new definition


i

when the series converges. We have now continued l(s) over the whole
plane. Further, (11.7) is unchanged when we replace s by 1s, so that
^ r ( i s ) U s ) = t ^ - ^ T ( ! - 1 S) ( 1 - S),

(11.9)

the promised functional equation. Since


'M
1 I2

= 21^7r^T(s)cosis7r,

(11.10)

2 )

an alternative form o f (11.9) is


(1s) = 21- s7T~sr(s)cos %stt t,(s).

(11-11)

We now list some properties of F(s) (see for example Jeffreys and
Jeffreys 1962, Chapter 15). The product
1
r(s+ i)

where y is tbe constant of (2.5), converges for all s, and defines F(s) as
a function tliat is never zero and has simple poles at 0, 1, 2,....
Using this information in (11.7) we see that the pole of (11.7) at 1 comes
from (s), the pole at 0 from -T(^s), and that (s) must have zeros at
s = 2 , 4,..., to cancel the other poles of F(^s). From eqn (11.12),
T (s + 1 ) = sF(s),

and

Jrr( 1-)s)_Z^(1s) =

its cosec

(11.13)

77s,

(11.14)

where we have used the product formula for sin 7rs. We can verify
eqn ( 11. 10) by showing that the ratio of the two sides is a constant.
Equation (11.1) is obtained by evaluation of the limit of
N

C *10+1( 1t/N )N dt

(11.15)

2.11

THE F U N C TIO N A L E Q U A TIO N S

47

in two ways as N tends to infinity. We can also obtain from (11.12) the
asymptotic formulae
log ics) = (s -P o g s -s + llo g 2 7 r + 0 (l/| s | ),
and

! / /

= lo g s+0(l/|s|)

(11.16)
(11.17)

which hold as |s| co uniformly in any angle 7i+S < args < tt S
for any 8 > 0.
Next we consider an L-function L(s, x) with x a proper character
mod#. There are two cases. I f x(~~l) is 1> we argue as above up to
00

Tr-*sqissr(^ s)L(s, x) =

x(m)e~m*n!>:lQ

a;is-1 2

m=1

00
=

(11.18)

x i ^ '? (x ,x )d x ,

0
co

where

2 x ( m ) e ~m*nx^a'

( p ( x ,x ) =

(11.19)

CO

W e approach cp(.T, x ) through (10.10):


CO

CO

2 e -( n + 8 )M * =

xh 2 e ~ mS7r-r e ( m S ) .

co

( 1 1 .2 0 )

co

W e put S = cijq and use eqn (3.8):


x (m)T(x)

2*

(H.21)

x ( a )ea (a m )>

am o d g

so that

r(xMx>x)

=* 2* r n 2
co
= 2* x{a)(<llxf 2 e^m+a^v2/a!, (11 .22)
am o d g

m = co

am o d g

m ~ co

which we may rearrange as


co

co

2*

= (?/*)*

co a m od g

?= oo

x ( r ) ^ ,m

(q lx M l/ x ,x ).

(H .23)

This will play the part of the modular relation (10.10). As before, we
split up the range of integration in (11.18) and find that
1

CO

J x ^ c p ix , x) da; = J ^ is- 1(r(x)) 1(g<)i<?(^ x) da;.

(11.24)

48

THE P R IM E -N U M B E R TH E O R E M

2.11

The analogue of (11.7) is now seen to be


co

co

TT~lsqisr(^s)L{s, x ) = i J

x)

x~^-^(x, x) d a .

(11.25)

As before, the right-hand side of (11.25) converges for all s, so that


L(s, x) has an analytic continuation over the whole plane, with no
singularities. Moreover, L ( s ,x ) must have zeros at 0, 2, 4,... to
cancel the poles of F(\s). We proceed to deduce the functional equation.
We have
t (x )

mmodg

X(m)eq(m ) =

mmodg

x ( - ) e g{ m)

(11.26)

= r(x),

since it was assumed that x( 1) = 1. By eqn (3,14), since x is proper


g*/r(x) = r ( x ) l r .

m0dq

(11.27)

We now see that the right-hand side of (11.25) is r(x)g,_i times the
corresponding expression with s replaced by 1-s and x by x> which
gives the functional equation
ir-bqi*r($s)L(8,x) = T(x)q~lTT-l+*sq ^ sr ( ~ s ) L ( l - s , jf).

(11.28)
We now consider characters xim) proper modg with %( 1) = 1.
Since we want to consider a sum from co to co, we use mx(m) in place
o f x(m). Writing s + 1 for s in (11.2), we have
7T-Ks+I)^i(s+I)p (i(s-|-I))i(s, x) =

OD

2 me-'! % !s- da:


o m=1
co

co

where

p(x, x)

= iJ p (a ? ,X)**-*da!,
0

(11.29)
(11.30)

co

We find a functional equation for p(x, x) by differentiating (10.10) with


respect to 8. We get
t (x )p (x >X)

= i ^ x - i p ( l f x , x).

(11.31)

Arguing as before, we find


w-*-tgi+ *r(i(*+ l))i(s,x)
CO

= i | />(*, x ) ^ d x + ^ ) f

CO

P(X>x)*-** d.

(11.32)

THE F U N C TIO N A L E Q U A TIO N S

2 .II

49

Again, the integrals on the right of eqn (11.32) converge for all s, so that
L (s ,x ) has an analytic continuation; it must have zeros at 1, 3,
5,... to cancel the poles of r (| (s + l)), and satisfies the functional
equation
is

lq i

i s r ( l % 8)L(l s,x)
=

^ - s+1 (s+1)r(| (S+ l ) ) L ( S, x).

(11.33)

To check this, we note that when %( 1) = 1


(11.34)
= ^(x)There is also an analytic continuation of L(s, y) when % mod q is not
proper. I f Xi proper m o d / induces % modg, then for cr > 1
t (x )

(11.35)

when we write m = dr. The sum over r in (11.35) is L(s, ^i), which has
an analytic continuation since Xi m o d / is proper, and the sum over d
is defined for all /. The corresponding functional equation for L(s, x)
contains the sum over d explicitly. We shall not need this case again.
A number of proofs of the functional equation can be found in
Chapter 2 of Titchmarsh (1951).

12
HAD AM ARD S PRODUCT

FORMULA

Suddenly Christopher Robin began to tell Pooh about some


of the things: People called Kings and Queens and something
called Factors.
II. 174

In proving the prime-number theorem, Hadamard studied integral


functions of finite order, that is, functions f(s ) regular over the whole

plane, with

log|/(s)| < ||^

(12.1)

for some constant A , as |s| -s*oo. The order o f/(s ) is the lower bound
o f those A for which an inequality o f the form (12.1) holds. Hadamard
showed that an integral function o f finite order can be written as an
infinite product containing a factor s p corresponding to each zero p
o f the function. This generalizes the theorem that a polynomial can be
written as a product of linear factors. Weierstrass s definition (11.12) of
the gamma function is an example. The product is especially simple
w h en /(s) has order at most unity. The order of l/r(s-\-l) is unity,
from (11.16). We shall obtain the product formulae for (s) an(l (s>x)
g iv e n b y
and

= 8(l-8)n-*r(i8)Z(a)

(,*) = ( q M ^ r { U s + a ) ) L ( s , x ),

( 12.2)
(12.3)

where % is proper modg and a = 0 or 1 accoiding to the relation


%( 1) = ( 1). Note that (11.9) is just the assertion that (1s) is
equal to (s), and (11.28) or (11.33) implies that
lf(l- .X )l =

\(S,X)\-

(12.4)

First we show that |(s, x) has order one. By eqn (5.3), if a > 0,
L(s, x ) =

f
j

2 x (m) dx.

(12.5)

By Polyas theorem (4.2), the sum over m is bounded, and thus


00
!-(*> X)I < 2 ^ ogc| J Is la -^ d o : < a-i|s|gUogg,
(12.6)

2.12

H A D A M A R D S PRODUCT FORMULA

61

and for cr ^ we have


i o g l ^ x ) ! <logg+log|5|.
Stirling s formula (11.16) is uniform in

(12.7)

and we have

logics, X)( < ls llog?ll


(12'8)
uniformly in q and in a
From (12.4), inequality (12.8) is also true
for a ^ -J.
To bound (s) we recall eqn (5.14):
co

(1 -2 1 -) (S) = 2 { - v r - h n , - ,
(12.9)
m=1
valid for a > 0. As in the derivation of (12.7) from (12.5), we deduce
that the right-hand side of (12.9) is
|s| for cr 5;
Now when a ^
w e lm v e

( 1 s ) / ( l 2 i- s ) <

and so

|a |,

(1 2 .1 0 )

log|(ls)(s)| <^log|s| + l,

(12.11)

and Stirling s formula gives


logics)I < Is l(log|| + 1)

( 12.12)

uniformly in cr Js
Since (ls) = (s), (12.12) is also valid for cr
i.
We have now shown that (s) and (s, %) have order at most one. I f s
is real and positive then (12.7) and (12.12) are best possible, by Stirlings
formula, so that the order of (s) and (s , x)
exactly unity.
We now discuss the zeros p of the functions (s) and |(s, %). First we
prove a lemma.
Lem m a.

(Jensens formula.)

Let f( s ) be a function of the complex

variable s = rcis# which is regular in |s| ^ R with no zeros on |s| = R,


fo r which f ( 0 ) 1. Then
2

(2-rr)-1 J log|/(i?cis0)| dd J r_1iV(>) dr,

(12.13)

where N(r) is the number of zeros o f f(s ) inside the circle G(r): |s| = r.
Proof. We can write the left-hand side of (12.13) as
27T R

27T R

(2tt)-1 J JRe(d/ / / ) dr d 0 = (Zir)-1J

0 i-=0

J R e cis 6 drdB

0 r=0
R

27T
27T

= Re f i f W
J

'=o

and the inner integral is 2niN(r).

2niJ f(s)
6=0

(12.14)

>

THE P R IM E -N U M B E R TH EO R EM

52

2.12

Now (0) = 1, so we can apply (12.13) to (s) at once. We shall prove


later that (0, x) is non-zero, so that (12.13) can be applied to (s, x)/(0, x).
To avoid a circular argument, we choose a 8 for which (S, x) is non-zero
and apply Jensens formula to g(s, x)/(&> x)(12.7) or (12,12) we
have
B
J r~xN(r) dr < ^ .B l o g B ,
(12.15)
o
and as T
oo
N (T )
T log T.
(12.16)
Here, N ( T ) is the number o f zeros o f (s) or o f (s, x) with |s| ^ T.
When we examine the formulae (12.2) and (12.3) for (s) and |(s, x),
we see that any zero of (s, x) must be a zero of L(s, x), and similarly
for (s ). The converse is not true, because L(s, x) has extra zeros at
negative integer values to cancel the poles o f the gamma function in
(12.3). I f s = o-\-rl with a > I, Eulers product formula
^ . X ) = = n { l - X ( ^ - S} - 1

(12.17)

converges absolutely and so is non-zero. By the functional equation,


(s, x) is therefore non-zero for cr < 0, since g(s, x) is non-zero for a > 1.
Thus all zeros p = /3+iy o f (s, x) have 0 ^
^ 1, and the same is true
for (s) by a similar argument. Riemanns hypothesis is that /3is always -J.
Riemann stated the hypothesis for (s), but it is difficult to conceive a
proof o f the hypothesis for (s) that would not generalize to (s, x).
We shall prove later that 0 < j8 < 1: this statement is equivalent to the
prime-number theorem in the form (5.17) in the sense that each can be
derived from the other.
For later use we now prove a result more precise than (12.16).
Lemma. The number o f zeros p = /3-f-iy of (s, x) in the rectangle B ,
0 < j8 < l,
is at most

T ^ ys^ T + 1 ,

(12.18)

^log(g(|y|-fe)),

(12.19)

< l o g ( m + e).

(12.20)

and of i(s) in B is at most

Proof. Let s0 be the point 2 + i(T + | ). Then

|(*o>x)l > i - i - i - A - - > i


We apply Jensens formula (12.13) with B = 3 and

( 12 .21 )

/( ) = ( - * , x)/(* 0.X).
( 12.22)
By (12.7) and Stirlings formula (11.16) the left-hand side o f eqn (12.13) is
< l o g g+log(|T|+e).

(12.23)

2.13

H A D A M A R D S PRODUCT FORMULA

53

The circle radius 3 and centre s0 clears the box B by a distance at least \,
so that, if N is the number of zeros required, the right-hand side of
eqn (12.13) is

> t flo g 6 /5 .

(12.24)

This proves (12,19), and (12.20) follows similarly.


We shall not need the following more accurate formula. I f T ^ e,
the number of zeros of (s, x) with 0 -<C [3
1 and 0 sC y -<C T is
(27r)-12, log(gy/27J-e) + 0(logg21),

(12.25)

from which (12.19) and (12.16) follow readily.


We can now prove the product formula. Let f (s) be (s, x) or (s),
and let p run through all zeros of f(s). By (12.19) or (12.20) the series
2 Iph2 converges, and so therefore does the product

IT ( ! s/p)exP(//>)>
(12.26)
P# o
where if 0 is a zero we add a factor s at the beginning. P(s) is a regular
function with the same zeros as f(s ) . We should like f(s)jP (s) to be a
constant or some other simple function. Certainly
p (s) =

g(s) = log {f(s)lP(s))

(12.27)

can be defined to be single-valued and regular over the whole s-plane.


We shall prove that
g{g) = A + B s _
(12 28)
By (12.19) or (12.20), there is a sequence of B tending to infinity with
i?-|p| > ( l o g i ? ) - i

(12.29)

for each zero p. W e want a lower bound for log P(s) on the circle |s] = B .
Now

log|(l//>)exp(/p)| < B

0<\p\<iB

0<|p\<iR

l/>l_1 < ^ l o g 2E,


(12.30)

by (12.19) or (12.20), and


-

a)exp(s/p)|
log|(1 s/p)exp(s/
p) | <

(1+logIogU )

B log B loglog B ,

(12.31)

by (12.29). Finally
-

lpl>2B

log'l(1*(p)exp(s//>))| < B 2

lp|>2B

l/>l~2 < R ^ o g B .

(12.32)

On such a circle we have


log|/(5)/p (5)l < Rlog^B.

(12.33)

The left-hand side of (12.33) is the real part of g(s). I f we write


s = B cis d,

T H E P R IM E -N U M B E R T H E O R E M

54

2.12

the power-series expansion of g(s) makes g(R cis 9) a Fourier series in 6:


g(R cis 6) =

so that

B,eg(Rcmd) =

and

ra(n)Rn =

(12.34)

(a(n)Jr ib(n))Rn cisnd>


R n(a(n)cosn6 b(n)sinnd)

(12.35)

2tt
f cos(?i0)Reg{R cis 6) |dd.

(12.36)

Hence
*7/

J |Re<7(i?cis 0) d0

\a(n)\Bn

o
27T

(\Reg(R cis d j l + K e g i R cis 0) - a(0)) dd


27T

1+

J max{R>e g(R cis 6), 0} dd


o

< 1 -f RlogZR.

(12.37)

Since (12.37) holds for an infinite sequence o f R, a(2), tt(3),... must be


zero, and similarly so must 6(2), 6(3),..., and we have proved eqn (12.28).
W e have therefore
(s) = eBs JJ (1s//>)exp(s/p),

(12.38)

p
C(x)eB{x)s I I ( ! slp)exp(slp),
(12.39)
p
with the modification mentioned above if p = 0 occurs as a zero. These
i(s, x) =

products converge for all s. By taking logs and differentiating, we get


^ 1 = B ----- j + i l o g 2n I F 'j f S + 1) + y ( ~ h -V
(s)
s 1
2 r ( 2s + l )
^ \S~ P
pj

(12.40)

and
(S>X) _ R/,,\

110" ^

l-^ '(2 (S+ ffl)) | X ' / 1

fl^ 41)

where p runs over all zeros of ((s) or L(s, x) that do not coincide with
the gamma-function poles. We can substitute the equation

- l m

= ir +

(12'42)

which follows from the corresponding product formula (11.13) for F(s),
and obtain a sum over all zeros (except p 0) of (s) or L(s, x). It can
be shown that

= I0g 2 + i l o g w - l - i y ,

(12.43)

but no simple expression for G(x) and B(x) is known in general.

13
ZEROS

O F f ( s)

You remember how he discovered the North P ole; well, he


was so proud of this that he asked Christopher Robin
if there were any other Poles such as a Bear of Little Brain
might discover.

I. 131
W e saw in the last chapter that i(s) and g(s, x) have 110 zeros p /3+iy
with /3 > 1. The prime-number theorem corresponds to the fact that
no zeros of (5) (these include the zeros o f (s)) have j8 = 1. All the
direct proofs that j8 < 1 at a zero are based on the following argument.
The function '(s)/(s) has poles of residue 1 at the poles o f (s) and
poles of residue 1 at the zeros. Now

2 log p i l p - * ) - 1
(13.1)
where A(m) is the function o f eqn (1.31). At s = 1, the series diverges
to + 00, corresponding to the pole of residue -(-1. Now, if 1 + iy were
a zero of (s), we should expect the series in (13.1) to diverge to co,
and the partial sums to be as large as those for the case s = 1, but
negative. To achieve these, the numbers to-1? must be predominantly
near 1. The values of to2i>/ are therefore predominantly near + 1 , and
there is a pole at l + 2iy with residue -(-1) aild so a simple pole o f (s)
at s = l + 2iy, which we know does not occur.
To make this argument rigorous, we use (13.1) with s = a + ii where
a > 1, so that the series converges. For all real 9,

Since

3 + 4cos0 + cos 29 = 2(l + cos0)2 ^ 0.


00
Re('(s)/()) = 2 A(m )m ~aco&(it\ogm),

(13.3)
(13.4)

we have
853518 X

(13.2)

56

TH E P R IM E -N U M B E R TH EO R EM

2.13

We now make cx+if tend to a zero j8+iy. Since (s) has a pole at 1, there
is a circle centre 1 and some radius r, within which (s) is non-zero.
(Calculation shows that r = 3 has this property.) I f we suppose that
j3 > 1 - f r ,

(13.5)

then |y| ^ r, and so is bounded away from zero. In eqn (12.40),

t'W
m

P,

W l )

V / 1 ,1)

= ~ B + i " ll0 8 2 - i T ( i i + i ) - 2 , 1 ^ + W

(13'6)

we shall assume 1 < a < 2, \t\ ^ fr > 0. Here the sum is over zeros p
o f (s), not over all zeros o f (s), and, since s p and p have positive
real part, we have
/ -,
^
Re _ i _ + > 0
(13.7)
\S~~-P

PI

whenever a > 1 and p = /3+iy has 0 ^ / 3 ^ 1. By (11.17) the term in


^ ( i + l ) is <log(|i|+e).
We now write down three inequalities. Since there is a pole at s = 1
of residue 1, we have
-i'(o)H (o) = (a -l)-H O (l).

(13.8)

At s = cr-j-iy we omit all terms in (13.6) except those from the particular
zero with which we are concerned; by (12.7) this gives us the inequality
_ '(cr+ iy)/(ff+ i y) < - {a- p ) - i + 0 ( l o g ( \ y \ + e ) ) .
Similarly,

_ '( ff+2iy)/(a+2iy) < 0(log(|y|+e)).

(13.9)
(13.10)

When we substitute (13.8), (13.9), and (13.10) into (13.4), we have


4 ( a - j 8 ) - i - 3 ( a - l ) - i < 0(log(|y|+c)),

(13.11)

valid as a -> 1 from the right. By giving a a suitable value, we see that
j3 < 1,
log(|y|+e)

(13. 12)

where c is an absolute constant. I f we constrain c to be less than f r


then, by (13.5), (13.12) also holds when |y| < |r and is thus valid for
all zeros p = /3+iy o f (s).
For twenty years, (13.12) remained the best known upper bound; in
1922 J. E. Littlewood proved that
c2loglog(|y|+e2)

P < 1

k O T + ^ T

no

ion

(13J3)

2.13

ZE R O S 03?

(s)

57

for some constant c2 (Titchmarsh 1951, theorem 5.17). The latest


result is
C3(e)
ft < 1 ,
nTT T IVfTe
(13' 14)
log(|y|+e)^
where the constant c3 depends on e. This was proved in 1958 by Korobov
and by I. M. Vinogradov independently. Intermediate improvements
on (13.13) used the intricate methods o f I. M. Vinogradov (1954).

14
ZEROS

O F |(s, x )

W hat do you think youll answer ?


I shall have to wait till I catch up with it, said Winniethe-Pooh.
I. 34

In this chapter and the next we prove results like (13.12) for the zeros
p = /3+iy o f L(s, x), with uniform constants in the upper bounds. We
actually work with (s, %), where % is proper mod#; the zeros of (s, x)
are those zeros of L(s, x) that are not at negative integers and so are
not cancelled by gamma-function poles. We use the product formula
in the differentiated form (12.41),
L ( s ,x )

TT

Z-4\s p

2 i (!( + ))

pj

where the sum is over zeros p of (s, x)> the term 1/p being omitted if
p = 0. We shall assume throughout the chapter that s = a-\-it with
1 < (7 < f.
The first complication is the elimination o f B(x) from eqn (14.1). We
subtract from (14.1) its value at s 2, noting that
CO

\L'(2, x )\I\L(2>x)\ <

2
(14.2)
m=1
which is bounded independently o f x Estimating the gamma-function
term from (11.17), we have

tM

5 5

2 R6( ^ , - ^ ) +0(log(|,1+e))

( 1 4 '3 )

where the term yo = 0, if it occurs, is now included in the sum. We now


note that

^ ^ - f ) - 1 = 2 ( 2 ^ ) \ 2 ~ p \ -2 < lo g ?,
p

p
l(t) = log{q(\t\+e)},

by (12.19). Writing
we have

Be ^

^ < "V Re - \-0(l(t))

JjyS} ^ j

for 1 < a ^ |, the implied constant being absolute.

(14.4)
(14.5)
(14.6)

2.14

ZE R O S OF

{{s,

59

x )

When we apply (13.2) we obtain the relation


_ 3

o) - 4 Re 7; ' (<7 : ] 1 y) - Re L 'ja+ 2^ X2)


L i>Xo)

L ( a Jr it, x )

p,

( 14.7)

L (a -\ -2it,x)

validforcr > 1, as the analogue of (13.4), Here, Xo is the trivial character


modg, whose value xim ) is 1 when (m, q) = 1 and 0 otherwise, and x 2 is
the character whose value at m is (x(m)}2. Although we have supposed x
to be proper mod q, x 2 might be trivial and certainly need not be proper
mod <7. The trivial character xo is not proper mod <7. However, if Xi
proper mod/induces x 2 mod 5, th e n i'(s, Xz)l^is >X2) and _Z7(s, Xi)!L{s, X-d
differ only by terms involving powers of those primes that divide q but
n o t /. For a > 1, these terms give at most

m=2?
p\q

p \Xf

^ 2 los p K p 1 ) < log ? p\q

( u -8)

The inequality (14.8) applies also f o r / = 1, X2 = Xoconclude that


(14.6) is valid for any non-trivial xm odg, possibly with a different
O-constant, and that
_ R e ^ ^

Z jL _ R e V _ i |- 0 (l (t))
(14.9)
Is 1|
for the trivial character Xo mod q.
I f xa is non-trivial, substitution of (14.6) and (14.9) into (14.7) gives
L(s,Xo)

implying that

4(uj8)_1 < 3(ct 1 )-!+ O(Z(0),

(14-10)

/? < 1 -cJ H y )

(14.11)

for some absolute constant cx when we choose a appropriately. I f x 2 is


trivial, then
4(ctjS) - 1 < 3 ( a - l ) - 1+ ( a - l ) / { ( a - l ) 2+ 4 y 2}+0{Z(y)}, (14.12)
which is consistent with j8 = 1 when a - 1. However, if
\y\ > cjl(y)

(14.13)

for some positive c2, then by choice of a in (14.12) we can show that
0 < 1- c 8/Z(y),

(14.14)

with a smaller absolute constant c3. We have now shown that either
(14.14) is true or

|y| < 8/log?,

(14.15)

where is an absolute constant. The absolute constant c3 in (14.14)


depends on the choice of 8 in (14.15). When (14.15) is satisfied with

THE P RIM E -N U M B E R THEOREM

60

2.14

y ^ 0 we can still deduce an upper bound for f3, but it is very close to
unity, and tends to 1 as y
0.
The third and greatest difficulty is to deal with zeros close to unity
when x2 is trivial. First we show that there is at most one. We have
_ R e ^ 4

L(a, x)

= Re y ^
A*

- )

co

^ 2 A(m )m ~a
i

> {a 1)-X+ 0 (1 ).

(14.16)

I f px = /Sx-(-iYi and p2 = j82+ iy 2 are two zeros satisfying (14.15), then


Re y ^

L(a,x)

^ Re ------- -Re ------\- 0(logq)


a Pl

^ (cr-jSi)*+yf ~

p2

+ 0(l0g q)

< (7 -j3 1)*+ 8 ?iogg)-a+ '9(l0g^)

(14J7)

if /32 ^ j8x ^ 1S/(logg). I f 8 is small enough, this implies that


Pi ^ 1

cJ l(Yi)-

(14.18)

Clearly we can choose c4 ^ c3, and so (14.18) is true for every zero
fti+iyi f
x) except (possibly) p2, and the possible exception p2
occurs only if ^(m) is always real, so that x %is trivial. Since p2 is also
a zero when L(s, x) has real coefficients, if p2 fails to satisfy (14.18) we
conclude that p2 is real. We devote the next chapter to the case of an
exceptional zero ft on the real axis.

15
THE

EXCEPTIONAL

ZERO

Piglet said that the best place would be somewhere where


a Heffalump was, just before he fell into it, only about a foot
farther on.

I. 57
Ik this chapter we consider real characters, that is, characters for which
x(m) is always real and thus
is trivial. As far as we know, the corre
sponding -L-functions may have real zeros /3 with \ ^ /3
1. Just as
before we saw that L (s ,x ) cannot have two zeros both close to 1, we
shall now see that two functions L(s, x) corresponding to different proper
characters cannot both have zeros close to I. Suppose Xi is proper
m od?!, x% is proper mod<j2, and the corresponding i-functions vanish
at j8j and jS2. In place of (13.2) we use
(1+ Xi(m)) (1+ X2()) > 0,

(15.1)

which implies that


I'(a)

L'(o, Xi)

L ' ( a , x 2)

L'{a,

Xss) ^

(ct)

L{a,X\)

L (a>X2)

L(a>X 1 X2)

^15 2j

where X1 X2 denotes the character modg'1g'2 whose value at m is


Xi(m )Xz(m )' When xi and Xz are different, the character X1 X2
non
trivial, and (14.6) gives
L '(o > X iX 2 )m J>XiXz) < ( l g W h ) ,

(15-3)

L '(<r>Xi)lL (o,X 1) < (-~ P d ^ + O i l o g q J ,

(15.4)

and for L(a, xi)

and similarly for


(a

and if

In place of (14.17) we have


fti)-1~MCT fi%)~1

{a

I)-1

O ilogq ^),

(15.5)

^ /?2 then /32 at any rate satisfies the relation


As < i-C iA logff^g),

(15.6)

where cx is an absolute constant. We deduce a uniform zero-free region.

THE P R IM E -N U M B E R THEOREM

62

2.15

By (15.6) and (14.18) there is a constant c2with the following property.


Let Q > 1. Then no i-function formed with a character x mod q with
q ^ Q has a zero p = /3-)-iy with
p >

1ca/log{Q(|y|+e)}

(15.7)

except possibly at a point J81 on the real axis, where L(s, x) has at worst
a simple zero. All ^modr? with q ^ Q for which L (pv X ) 0 are
induced by the same real character.
To prove the prime-number theorem for an arithmetic progression
with common difference q, we need to know that neither (s) nor any
.L-function formed with a character x m o d q has a zero p = fi-j-iy with
P = 1. The proof is simpler if we have p explicitly bounded away from
unity. We have to deal only with the case x real, p real. One method
is to interpret L( 1, x) as the density of ideals in a quadratic number field.
This gives a very weak bound. We shall prove Siegels theorem.
Theorem. For each e > 0 there is a constant c(e) such that if
L Wi> X) = >
ivhere x is a character mod q then

^ < 1c(e)<r.

(15.8)

The constant c(e) in Siegel s theorem is ineffective; that is, the proof does
not enable us to calculate it. All previous constants in upper bounds,
such as c2 in (15.7), have been ones we could calculate, given a table of
values o f (s) for |s|
3 and standard inequalities such as Stirling s
formula.
Following Estermanns account (1948) of Siegels theorem, we con
sider the function
F(s) =

(s)L(s, xi)L(s, Xz)L (s , Xi Xz)>

(15-9)

where Xi and Xz are real characters proper m o d ^ and mod <72 respec
tively. By (15.1), the Dirichlet series
CO

log F{s) =

J {i + Xi(m ) + Xz(m ) + XiXz(m )}A (m )m ^s

7)11

(15-10)

has non-negative coefficients; it converges for a > 1. For a > 1 we can


take the exponential of (15.10):
CO

F(s) =

2
m=1

(15.11)

with non-negative coefficients. F(s) has (at worst) a simple pole at


s = 1 of residue

A = L (l, Xl)L (l, Xz)L(l, XlXz),

(15-12)

2.15

THE E X C E P T IO N A L ZERO

63

and no pole if A = 0. Moreover, F(s) has a power-series expansion,


CO

(15.13)

F(s) = 2 b ( r ) ( 2 - 8y ,

where

b(r) =

(-l)'\ F (2 )/r!
(15.14)

In particular, b(0) = F ( 2), which is at least unity. The function


F ( s ) - X / ( s - l ) = 2 {&(>') A}(2 8)r

(15.15)

has no singularities, and so its power series converges everywhere. If


A = 0, the series on the right of eqn (15.15) is positive on the negative
real axis and represents F(s). Since F(s) is zero at 0, 2, 4,..., we
conclude that A ^ 0.
We now have
< 1. To prove the more precise result (15.9), we use
Cauchys formula, integrating round a circle, centre 2 and radius f, to
obtain the coefficients b(r). On the circle, (s) is bounded, and for the
^-functions we use (12.6),
(15.16)

|(>x)l < c r 1|s|^logg,

which was proved from eqn (5.3) and Polyas theorem, and is thus true
for any non-trivial x mod?, proper or not. We have

< ( i Y i l l log V i M log q2)(qlq\ log qxq2)

(15.17)

< (t)r?l?2 1g8?l?2>

the constant implied in the


sign being absolute. We restrict ourselves
to the range t9q ^ a < 1 and estimate the tail of the series (15.13)
for F{s).
CO

2
r =--R
R+1

|6(*)A|(2cr)*- < 2 <Z1?2log3g1(?2( | ( 2 - (7))''


R+l
CO

< q 1qalog'q1qs 2 (tt)r

2?-}"1

< q 1q2logaq1q2{ ^ ) It,

(15.18)

64

T H E P R IM E -N U M B E R T H E O R E M

2.15

and thus (taking only the first term in the series for F(s))
F ( a ) ----- > 1 - A f ( 2 - Cr) - 0 ( g 1g2log3g1g'2(ii)iJ)

a 1

> 1- A( (2 - a)5^1-1 ) /( 1 - a) - 0 {qxq2log3?! ga(ii)B).


(15.19)
We choose R so that the error term in (15.19) is less than
possible with a choice of R with
R = A log qx q2,

this is
(15.20)

where A is an absolute constant. Now


F ()

> i -A ( 2 -a ) * /( l-( T )

> i A(lo-)-1exp{i?log(l + ( l a))}


>

A ( l a)^ 1e x p { i? ( l cr)}

> i - A ( l - a ) - i ( ?1?a)^-">,

(15-21)

where B is another absolute constant.


We now choose xt so that L(s, x z) has a real zero j8a with
l - e f t A + l) < j8 2 <

1;

(15.22)

if the choice is impossible then (15.8) certainly holds. Since/(/3a) = 0,


we have

^<

<Aq f ^ \

(15.23)

where the constant depends on the choice of ^2 and so on e. This is a


lower bound for the product A of the three L-functions at 1. Since
-(l,X'a) is a known constant, we seek an upper bound far L ( l , x x Xt)From eqn (5.3),
m
L (S, X ) =

Let

f sa;-*-1 2

x M d:K-

r = gHogg,

(15.24)
(15.25)

the upper bound in Polya s theorem. Then


r

|L(,X)| ^

oo

da; +

J \s\x~a-h' da;.

I f a ^ l - ( l o g q ) - 1, the first integral is


< | s| ,a ild s

(15.26)

jsjlogq and the second is

| i(,x )l< l* | lo g 2 .

(15.27)

When we integrate round a circle radius i(lo g g )-1 to find L '(s ,x ), we


have for

> i-^ lo g ? )"1

(15.28)

the bound

\L'(a>x)\ < l o g 2q.

(15.29)

2.15

T H E E X C E P T I O N A L ZE R O

65

Hence, if L(s, xi) has a zero j8x in the range (15.28),


A = L ( l , X i ) L ( 1 > Xi ) L ( l > Xi X z ) < 1 g ? i i (1.Xi)

< l o g q 1{ l - P 1)L,{a,Xl)

(15-30)

for some a in /3X < a < 1, and so by (15.23) and (15.29) we have
1 < ( l - / W (1-^ lo g 3<Zi < ( 1 - A )qb

(15.31)

where the constants implied depend on xz and so on e. I f /3X does not


satisfy (15.28), then (15.31) certainly holds, and Siegels theorem
(15,8) follows from (15.31). '

16
THE

PRIM E-NUM BER

THEOREM

The clock slithered gently over the mantelpiece, collecting


vases on the way.
II. 135

WE can now prove the prime-number theorem in the form (5.17). Com
bining (5.4) and (5.5), we have

277-1

if u

'0 ( u a{T\\ogu\)-x)

a + iT

-A s

cciT

i + 0 ( a /T )

<

1,

if u = 1,

(16.1)

,l + 0 ( o:(?, log?()~1) if u > 1.

On the line a a, where a > 1, the series


Z'(s)lt{s) = 2

( 16.2 )

A(m)m-S

m= l

converges absolutely. We now suppose that x is an integer plus one-half.


Then
a-f-i T

J_

27rt J

a_ iT

Z-,
rn=l

ms s

m<a;

A(m) +

o k

\T

'

no

J -V
log x m\ m a

Z-i
m=1 1 a '

(16.3)

To estimate the series in the error term of (16.3), we first note that
2 A (m )< 2/,

m <y

(16.4)

by the sieve upper bound (8.19). Partial summation gives


V
A (m )
| V
A (m )
^
1
Z., m a\\ogxjm\
Z-t a: log xlm
a l

(16.5)

In the remaining range for m we write on == x-\-^r, where r is an odd


integer. Thus

ix < m < 2 x

A ^ -<^ - loga: T |loga:(l-f h'lx) h 1


x alog\xm\
& r=T-x
2

2x

a;- loga; 2 xlr


)=!
^ ^ - log2.

(16.6)

2.16

T H E P R IM E -N U M B E R T H E O R E M

67

We can now write the right-hand side o f eqn (16.3) as


xa

x lo g 2x\

T (a 1) 1

( 16. 7)

where ^{x), as usual, is the sum function of the coefficients A(m) of

-mm*

\T

-i r
F ig . 2. The contour G

We must now find an approximate value for the integral on the left
of eqn (16.3). By (12.20), there is a value of T in each interval
t < T < t+ 1

for which

^ n __m\~i
\v\ T\ > (log T)-

(16.8)

for each zero p = /3+iy of (s)- With such a value of T we move the
integral to the contour 0 consisting of
C^: the line segment [ai T,
Gz: the line segment [

-in

iT , | + i T \

Os : the line segment [J + i ? 1, a-f- iT].

In eqn (12.40),
^ 1

= 1

we subtract the value at 2 + i t from that at

using the fact that

THE P R IM E -N U M B E R TH EOREM

68

2,16

'(s)/(s) is bounded on a = 2, where its Dirichlet series converges


uniformly.
'(*) _.p m
()

1 r'(tts + i))

1 r (f+ ijQ

2 -r d + ii^ )

^ -^'(i('s+ l ) )

/ 1
^

Vs p

.1

2+ i i p )

2a
(a + iip)(2-(-itp)
(16.10)
where we have used (12.20) and (16.8) on the sum over zeros, and
(11.17) on the gamma-function terms.
We can now estimate the integral round C. First
-1 f S ^ d .
27ri
m
a

iog2y

x a do-

x a log2T j (T log x ),

(16.11)

and similarly for the integral along C3, whilst


1 f V(s)xs
ds
27ri J (s) s

: x ~ -l o g zT

ds
s

o2

_ ilog ST.

(16.12)

At this stage it is convenient to make the choice of a


a =

l-f(log.'K)-1,

(16.13)

whereupon we have

1
r
277-i , () 5
0

. *log 2T
j-ai^log3! 1.
T log x

(16.14)

The contour has been moved past several poles of the integrand. The
residues of '(s)/(s) are + 1 at zeros of (s), 1 at poles, and so the
integral in (16.3) differs from that in (16.14) by
xP

2
Iy \ < T

'(0)

(16.15)

(0)

where the sum is over zeros p = /3+i y of (s) with )3 > 0. These are all
zeros o f (s), and by inequality (13.12) each satisfies the relation
(16.16)

2.16

THE P R IM E -N U M B E R TH EOREM

69

By (12.20), the sum over zeros in (16.15) is in modulus

rsSalogr
< a ;1- slog2T )
where

w lyl<2r
21

8 = cJlogiT-^e).

(16.17)
(16.18)

We have now shown that


>p(x) = 3!+ o (* l y f X + a;1" 8log2y j ,

(16.19)

and when we choose


lo g ? 1 = (lo g * )* + 0 (l),

(16.20)

where the 0(1) is to allow us to find a T for which (16.8) holds, the
error terms in (16.19) are bounded by expressions of the form
* e x p {c^loga:)4}

(16.21)

with different values of c. Hence there exists an absolute constant c2


such that

= a;_|_o|-;Cexp(c2(loga:)*)}.

(16.22)

We can quickly deduce from (16.22) that


2 logp = ;+ 0 { a e x p ( - c 8(loga;)J)},

(16.23)

p<x

and integration by parts gives


X

A x) =

2 1 = f + 0 { e x p ( - c 4(loga;)i )},

p<x

(16.24)

the classical form of the prime-number theorem. Clearly, only the


O-constants in (16.22), (16.23), and (16.24) are affected when we drop
the restriction that x be an integer plus one-half.
We have approached the prime-number theorem by the classical
route of Riemanns functional equation and Hadamards product
formula. Neither o f these is necessary for the proof o f eqn (16.22). There
are elementary proofs of eqn (5.17), ones which do not use the complex
variable at all (see for example Hardy and Wright 1960, Chapter 22).
The elementary proofs are disappointing in that much extra effort is
needed to prove the prime-number theorem with an explicit error term,
rather than as an asymptotic equality. An analytic proof that does not
use the functional equation is given by Titchmarsh (1951, Chapter 3).

17
THE

PRIM E-NUM BER

AN ARITHM ETIC

THEOREM

FOR

PROGRESSION

P(x>x) = mX,
A ('m)x(m)<x

L et

C17 1)

When x is proper mod q, we apply the method o f the last chajjter to


obtain
<"*
ipl>

where <x is given by (16.13), and T and R are chosen so that


||y[-T| X l o g g T ) - 1,

(17.3)

IIHi?| X l o g g ) - 1

(17.4)

for each zero p = jS+iy of |(s, x). By (12.19), we can suppose that
A < R < J. The contour C consists of
Gx: the line segment [aiT, f i? 1],
C2: the line segment [

\T, | + i T],

C3: the line segment [ | + i T, a + iT ],


Ga: the circle centre 0 , radius R described negatively.

The circle Gi avoids the pole of x*/s and a possible Z-function zero at
s = 0.
As in (16.10), the inequality
(17.5)

\L'(S, x )IL(S, x ) \ ^ \ o g \ T

holds on the contour G, the constant being absolute. Thus


1
f L ' {S X) X8ds
2771 , L ( s , x ) 8

(17.6)

We now treat the sum in (17.2). By (14.18), non-exceptional zeros


satisfy the relation

ft ^

(17.7)

1 c J lo g q T

with an absolute constant cv I f q satisfies an inequality


log

< c2(log i# ,

(17.8)

2.17

A R IT H M E T IC P R O G R E S S IO N

71

with c2 < 1, we can choose T so that (17.3) holds, and


log q T = (loga;)J+ 0 ( l ) .

(17.9)

As in the last chapter, we deduce

/'(a, x)

f 3^7/?!+ 0 { x exp(c3(log x )i )}

,0 {* e x p (c3(loga:)*)}

if there is an exceptional
zero Ji>
(17.10)
if not.

The value o f c3 depends on that of c2 in (17.8), but can be effectively


calculated.

We can absorb the term *&//?! only if q satisfies


q < logiVa;

(17.11)

for some N > 0. By Siegel s theorem with e = ^ N -1,

and

ft > 1c4g~e > 1c4(loga;)-^

(17.12)

a;A/ft <; &exp{c^logx)*}.

(17.13)

W e have now shown that the inequality


\i/j { x ,

x)| < exp{ c6(loga:)*},

(17.14)

where c5 is an absolute constant, holds if (17.11) does, or if (17.8) holds


and there is no exceptional zero.
853518X

THE P R IM E -N U M B E R TH EO REM

72

2.17

Our characters can at last abandon propriety. I f x niod q is induced


b y Xi proper mod/ then
<(>(x >Xi)-~>l)(x >x) = 2 logP 2

Xi(Pr)>

(17.15)

an expression whose modulus does not exceed


lo g ^ f l
2
i>
la

log ,t\
< log a; lo g ?.
k>g'W

(17.16)

The estimate (17.16) also holds if x is trivial mod#, <jj(x, Xi) being tfi(x).
This error term absorbs easily into that o f (17.14), which thus holds
whenever % is non-trivial mod q. For the trivial character Xo mo(i q>
(17.16) gives
1#^ Xo)

x \ < x exp(c6(loga) *) + log x 2 1,


vVi

and again the first error term will absorb the second.
Let
? , )=
2
A(m).
m<a?

(17.17)

(17.18)

w = a(mod<2)

There are two cases. I f (a, q) is not unity, only the powers o f primes
that divide q can occur in the sum (17.18). If, however, (a,q) = 1, we
have from eqn (3.3)
if>(x;q,a) =

2
Xiad9

X) =

0 { * e x p ( - c 7(loga:)*)},
(17.19)

the prime-number theorem for the arithmetic progression with first


term a and common difference q. We have proved (17.19) if (17.8) holds
and no character m od? has an exceptional zero in its i-function, or if
the stronger'condition (17.11) on q holds. The result (17.19) subject
only to (17.11) is called the Siegel-Walfisch theorem. When (a,q) 1
the arithmetic progression a, a+g,... contains infinitely many primes,
but we have not proved that the first xjq terms include a prime unless
(17.11) holds, so that x is very much larger than q. For a given q we
may have to choose x larger still to ensure that the second term in
(17.19) is smaller than the first, and t/i(x, q, a) is non-zero. Thus (17.19) is
useless for numerical work, since we do not know the value o f c7. From
the proof o f Siegel s theorem it follows that, if we knew one L-function
with an exceptional zero, we should be able to find the c7 corresponding
to one particular value of N in (17.11). I f no Z/-function has an excep
tional zero, then (17.19) holds subject only to (17.8), and we can find c7.

PART

III

The Necessary Tools

18
A SURVEY

OF SIE VE S

It wasnt what Christopher Robin expected, and the more


he looked at it, the more he thought what a Brave and
Clever Bear Pooh was.
I. 140

S u p p o s e that we have a classification of the integers into certain

classes, and a system o f functions a from the classes to the complex


numbers with the properties (i) and (ii) below, which generalize those
that the functions
, * = ea(am)
, \
0 \
ct(to)
(18.1)
possess for the classification of integers

to into

residue classes modg.

(i) Apart from the trivial function a for which a(m) is always 1, the
sum (or integral as appropriate) of ct(to) over a complete set of classes
is zero.
(ii) I f to and n are in different classes there is at least one function a
for which a (to) ^ a(n).
Then a sequence m i of integers is uniformly distributed among the
classes if and only if
| t% ) = o ( X |a(TOf)|)

(18.2)

for each a except the trivial one. This is known as W eyls criterion for
uniform distribution. As an example, the prime numbers form a sequence
that is uniformly distributed among those residue classes a mod q for
which (a,q) = 1, in the sense that a proportion (l/<p(<z)+o(l)) of the
primes up to some bound x falls into each reduced residue class. We

74

T H E N E C E S S A R Y TOOLS

3.18

deduced this from the fact that j/r(x, %) is of smaller order than x when
X is a non-trivial character mod?.
Franels theorem o f Chapter 9 fits this scheme. We were concerned
there with the distribution of the F Farey fractions in the interval [0,1].
The appropriate functions a(m) are e(bfm), where b is an integer, and a
calculation with Ramanujan sums shows that
F

(18.3)

e(bfm) = 2 d M ( Q l d ) ,
d\b

in the notation of Chapter 9. Franels theorem relates a measure of the


uniform distribution of Farey fractions in the interval to the mean
square of the sums in eqn (18.3) for varying b.
Many results in number theory rest on the uniform distribution of
some sequence among certain classes. This is the final step in the proof
of the prime-number theorem for arithmetic progressions. Often the
uniformity result is needed at an intermediate stage, and in these cases
it is usually sufficient to know (in a quantitative sense) that (18.2) holds
most o f the time. An assertion that, for a given sequence m t ,
and
set o f classes, eqn (18.2) holds for almost all functions cr is called a largesieve result. A common form of large-sieve result is that the mean square
o f the left-hand side o f (18.2) is less than the mean square of the right.
The sieve (8.10) of Chapter 8 is such a result with a(m) given by (18.1).
In this chapter we shall also consider the characters x(m ) an(l the powers
m~s as functions a(m). These correspond to distribution among reduced
residue classes, and distribution of the sequence within intervals.
The proof o f a large-sieve result depends on an upper bound for an
average of some convenient auxiliary function. The simplest example
is a result for all characters % to a fixed modulus q. Let u(m) be any
complex coefficients and
(18.4)
In Chapter 7, we introduced the sum
M +N

$ ( ) =

m=M + l

(18.5)

u(m)e(mtx);

and we obtain the result by comparing the identities


2

(18.6)

3.18

A S U R V E Y OF S I E V E S

75

wliicii follows from eqn (3.1), and


iU+iV
X mo clq

b moclg'

(18.7)

m.M + 1

m = 6(modg)

which follows from eqn (3.3). The stun on the left of (18.6) can be bounded
by the large sieve for exponential sums (7.8), and we have
M+iV
I \8X\ < r W s H x + o i r 1)) I
M)l2(18-8)
X mod q

m M + 1

When we restrict our attention to proper characters we can sum over q


on the left of (18.8); by eqn (3.8) for % proper mod?,
A x)x(m ) =

and we have

I*

am odg

(18.9)

x ( a ) e a( a m ) ,

M +N
r(x)>s x =

2
2 * X ( a ) u { m ) e q{ a m )
m=M + 1 amodg

(18.10)
* qn * >
-t-t-i/-/!
''-L'
a2
mod
By (3.3) again we have

^modg

x{a)S
X mod a

< ? (?) 2 ;l

(18.11)

am odg

where the asterisk on the sum over % indicates a restriction to proper


characters. When % is proper m od? then |t ( x )|2 = ?, and we have
y
jL j

jl y
ap(q) Z-*

g^Q TX-t/ ^m odg

g< 0

am odg
M +N

< {N + o m )

m=M+l

k * )is

(18.12)

by (8.10), that is, by (7.8) again.


A third sieve result for characters is possible if the coefficients u(m)
are 0 whenever to has any factor smaller than Q; for then (to,?) = 1,
and (18.9) holds for all characters m od?. For example, if M = 0, and
u(m) is 1 when to is a prime greater than Q and 0 otherwise, we have

2 2

a=SQ x mocla

p(<z)

< { N + 0 ( Q 2) ) X |m(to)|2
1
< (A ^+0(Q 2))( tx(iY) + 0(Q )).

(18.13)

76

3.18

T H E N E C E S S A R Y TOOLS

The left-hand side of (18.13) can be expanded as

f^Q

2
xmocl/

(18.14)

j s O fm o d /)

<p(<7)

s=s Q

By eqn (8.16), the term / = 1 gives


(77(iV))2l o g Q (l+ o (l)) ,

(18.15)

which is approximately half the right-hand side of (18.13) when Q is


almost N$. I f for some % L(s, x) has an exceptional zero in the sense
o f (15.7), the term in |^|2 can be as big as (18.15). This give sanother
proof that at most one x has an exceptional zero , blit the range for q
is not as large as in (15.7) when we work out the details.
To obtain a sieve result for a(m) = m~a we return to first principles
and use what we shall refer to as Gallaghers first lemma (1967).
Lem m a. Let f(x )

be a differentiable function of a real variable x.

Suppose that a^,..., xR are at least apart and

Z . + i S < xr < Z a- J S
for r = 1,..., R.

(18.10)

Then

it

Z
I l/( r )la < g J
r=1

!/() 12 da; + 1 J jf(x)j2 d x j i j \f'(x)\*dx


X,
W
' 'Xi
(18.17)

Proof. We integrate the identity

!/(*,) I2 = \f(x)\2~

Ax

(18.18)

j/(a;)|2 da;

over an interval of length S and obtain


a,v+i8

S|/(a;,)|2 <

9,v+*8 V
\f(x)\2 da; -f-

!'f- id

Xr-iS

j - ^ \ f ( x ) \ 2 dx

dy

X,

x r+ tS

\f(x)\2 da
da; -f-

Xr

da;

da;.

(18.19)

i r i S

By (18.16), these intervals are disjoint and subintervals of [X 1; X 2], and


r

2 I /M 2<

f l/(*)l2 da; + i f |2/(a;)/'(a;)I da;,

and (18.17) follows from Cauchys inequality.

(18.20)

3.18

A S U R V E Y OF S I E V E S

77

To obtain large-sieve results from Gallaghers lemma, we arrange the


sum on the left o f eqn (18.2) to b e f ( x ) , different xr giving the different
functions cr. For instance, for S(<x) given by (18.5) and ct(to) by (18.1), the
points x r are the rationals ajq in their lowest terms, and Gallaghers
lemma gives
M +N

s la

< (Q2+

Q amodg 1

ttN)

K()!2-

m =M +l

(18-21)

Like the relation (7.8), (18.21) is an approach to the conjecture (7.7).


Whereas (7.8) gives N the conjectured coefficient unity, (18.21) obtains
the conjectured coefficient of Q2 but not that of N ,
The application of Gallaghers first lemma to m rs was made by
Davenport. Let
N
f(t) = 2
(18.22)
?n=1
where a is fixed. The first integral on the right of (18.22) is

2'i

Tt

T2

V i ^ l L 2 f dt + V

Z-t

m 2a

rp

Z-t

m ana

J \mj

11

rp

d*.

(18-23)

The second term in (18.23) is

22
m

2 2

u ( m) u ( n ) (n j m ) iT2 (,/w)iTi

ma?ia

n
m^n

ilo &(nlm)

|2
\(n m)lm\ \2

?w=l ]nK)K2m x

, V / V

1 V

m2a
\

+ mZ=1 n>2m
Z + ?KZ?n. 7

1 1-1/7<T)\12'
2

n2a

1 A l()l2 , 1 M l 2'!

no t>/n

(18'24)

'
where we have used the geometric-mean inequality. The coefficient of
\u{m)\hn~iCL in (18.24) is 0 { N \ o g N ) , and thus we can write the righthand side of (18.23) as
(T2 T y -}-0 (N lo g N )) 2 \u{m) |2m_2 .
m=i

(18.25)

Since

f'(t) =

2 iiog m u (m )m ~ a~il,

(18.26)

and log to ^ logiV, the upper bound for the integral o f \f'{t)\2 is at

78

T H E N E C E S S A R Y TOOLS

3.18

most log2lY times that for (18.23). We have now proved that if
tR
satisfy the relations
,
,
2
/,., /, >
(18.27)
for r = 1,..., R 1 and
T y + l 8 < tr < T2- 18

(18.28)

for r = 1,..., R, then


|u(m) |2
m 2a
1

(18.29)

which is Davenports sieve, first published by Montgomery (1969a).

19
THE

HYBRID

SIEVE

Kanga saicl to Roo, Drink up your milk first, dear, and talk
afterwards. So Roo, who was drinking his milk, tried to say
that he could do both at once . . . and had to be patted on the
back and dried for quite a long time afterwards.
I. 151

In the last chapter we obtained sieve results in the form of inequalities


with two terms on the right-hand side, one of which corresponded to
the maximum size of the summands, the other to the mean square of
the function times the number of summands. In the series for L(s, %)
we can sieve either over % or over s. Montgomery (1969a) sieved over
both x and s and obtained a hybrid result, which is better than the one
we should obtain by sieving over the one and summing over the other.
P. X . Gallagher has found a simple method of obtaining hybrid sieve
results, which we describe here.
The function f(t) of a real variable t is said to be the Fourier transform
of f { x ) when

f(t)=

\ f(x)e(xt) dx.

(19.1)

CO

Under suitable conditions (these include the continuity of / at x),


oo

J / W e ( - ^ ) di.

/( ) =

(19.2)

CO

Equations (19.1) and (19.2) imply formally that


co

00

CO

J 1/(0 r d* = J f(t) J j( x ) e ( xt) clad/,

oo

co

co

co

co

= J /> ) j f(t)e( xt) dtdx

CO

CO

CO

=j
co

l/(*)lada,

(19,3)

80

T H E N E C E S S A R Y TOO LS

3.19

Plancherels identity. We shall use eqns (19.1) and (19.2) w ith /a linear
combination o f the values of

* H r &!>$
Here

F{t) = 2 ^ ,

(19.5)

and we can check by means of a contour integral that (19.2) holds


except at the points o f discontinuity x | 8 . We can verify from
first principles that the interchange o f integrations in the proof of
(19.3) is valid. We now deduce Gallaghers second lemma.
Lem m a.

S(t) = 2 o(v)e(vt),

Let

(19.6)

where the sum is over a finite set of real numbers v. Let


D{x) =

o{v).

(19.7)

oo

J |^(<)|2d / < 772T 2 f |D(*)|2da;.

Then

(19.8)

co

Proof. Let S = ( 2 T ) - 1 in eqn (19.5). Then


D{x) = S 2 c { v ) F { x - v )

(19.9)

and

B(i) = J c(v)e(rf)l(i) = S(t)_F(i).

(19.10)

By (19.3),
co

co

J |X>(*)|a d = Sa J \S(t)\*\F(t)\* dt ^ 8* f |(/)|2|jfy)|2 dZ,

co

co

*27

(19.11)
and

\F(t)\^2/TT

(19.12)

for |f|
T , which gives (19.8).
We apply the lemma to a Dirichlet polynomial, that is, a finite Dirichlet
series
N
$ (s>x) = X, a(m)x(m)m~a~it>
(19.13)
m=l

so that the numbers v are of the form logm, and


D ( x) =

where

Ae*
2 a(m)x(m)>
A-ie*

A = exp((2T )-1).

(19.14)
(19.15)

3.19

T H E H Y B R I D S IE V E

81

The large sieve (18.8) gives


2

.2

XinodQ a~1gx

()x()

< (( A -A -1)e *+ ? ) f

\a{m)\

A_1ez

< ( 2 1- 1ea:+ ? ) X

|(m)|2.

(19.16)

A ~'e*

i (U>

xyl
* - ( l o f > ' xY)-1

--

11

1*

1 + (lo g A r)_1

i(log A ) - 1

F ig . 4. T h e contour C

By the lemma,
T

f
J T

logfiiA

|S(s,x)la tlf < T 2 X

xm oda

m=l

( T - V + g ) dx

log(m./A)

X {m-\-qT)\a(m)\2m ~Za.

=1

(19.17)

A more realistic application is the following. For each character


X mod q there is a set of points s(r, x) = o(r, x) + i t(r>x) within a rectangle
<ct<1,

T s ^ t s ^ T ,

(19.18)

satisfying the condition


t(r + l,x )-t(r ,x )> 8 ,

(19.19)

where is a given real number, not necessarily the S of equation (19.4).


Clearly we need Gallaghers first lemma (18.17), but it is not directly
applicable, since a is also a function of r. We turn to Cauchys identity.
Let C be the rectangle (Fig. 4) whose vertices are
a(log iV J -^ i^ og A7) - 1,

1-)- (log N ) _1 i(lo g N )-1.

82

T H E N E C E S S A R Y TOOLS

3.19

Then if s = cr+i/ satisfies (19.18) we have


S *(s,x ) = f
2wi

and hence

du,

|S(s, %)|2 < logi\r j |$(t(4~i/> x) I2 \&u \>

(19.20)
(19.21)

uniformly in cr. We sum over r and apply (18.17) under the integral sign.
Next we sum over x mod?. One of the terms on the right of (18.17) is
a geometric mean, but Cauchys inequality allows us to sum over X hi
each factor before taking the geometric mean. We have now to estimate
two sums over x of integrals, which are of the form (19.17). In each,
a and T have been replaced by A and f + f S , where %i = A+ir, A and r
real; in the first a(m) has been replaced by a(m)m~iT, and in the second
by ia(m)logmro~iT. It is simpler to describe this inequality in words
than to write it out! After applying (19.17) we have
R(x)

2 \s {u+ it(r>x)> x)\2 < ( 3 - 1+loglY ) X (m+qT)\a(m)\2m -* x.

vmodtf i==l

m 1

(19.22)

Finally we integrate round 0 , noting that


o i+ (lo g iV )_1

\du\ <^TO-2 a (log.ZV) - 1 -|-

J
a -(lo g iV )-1

m - 2 '' dA

m_2am in{la+ClogiV)-1, (log to)-1}


(m2 log(TO-f e))_1.

(19.23)

Hence
x 2 "

(19.24)
where for each X the points sr s(r,x) satisfy (19.18) and (19.19).
We can use (18.12) in place of (18.8) and show that

f 2 w ) 2 * 1S(* * )l' d < 2


_rp (7<Q
^moclg
m1

<19'26

and
R(X)

2
T'

2* 2
Xmocla r=l

-'* * '* )1
N

< ( S - . + lgJ 0 2 ( + 13) E M

^ J ^ l !.

m= 1
&v
7
where the points $(?*,%) satisfy (19.18) and (19.19) for each

(19.26)

3.19

T H E H Y B R I D S IE V E

83

The point of these hybrid sieve results is that we have N-\-Q 2T (in
fact, m-\-Q2T) in (19.25) etc. in place o f (N-\-Q2) T or (N -\ -T ) Q 2 which
we would have from sieving one parameter and summing the other.
A useful mnemonic for large-sieve results is that N 2 \a(m)\2m ~2a corre
sponds to the square o f the trivial term t = 0, % trivial, and that there
are 0 ( Q 2T) non-trivial terms, whose mean square is 2 |(to)|2to 2ct.
We have certainly shown that eqn (18.2) holds for almost all pairs s,

20
AN A P P R O X I M A T E F U N C T I O N A L
E Q U A T I O N (I)
In a corner of the room, the tablecloth began to wriggle.
Then it wrapped itself into a ball and rolled across the room.
Then it jumped up and down once or twice, and put out two
ears. It rolled across the room again, and unwound itself.
II. 135

W e now have efficient tools for averaging finite Dirichlet series. We


would like to average powers of L(s, x) over points s and characters xIn this chapter and the next, we express powers of L(s, x) as finite sums
plus error terms. The best results of this type contain partial sums for
L(s, x ), as one would expect, but also partial sums for L ( l s, x). They
have thus earned the nickname of approximate functional equations.
There are many ways of proving approximate functional equations; we
give a straightforward method of H. L. Montgomerys which uses the
functional equation explicitly. Our application leads us to consider
L 2(s, x), where x is proper m od?. This has the functional equation
L 2( l u , x) = (qlir)2'<-1G( u)L*(u, x ),

(20.1)

in which we have put

where a is 0 if x( 1) is 1, 1 if x( 1) is 1. We shall use u for a complex


variable u = A-fir, where A and r are real. By Stirlings formula
(11.16), when u tends to infinity in a region A
|r|5 we have
| I | = (27r)i|pe-'W (l + 0(|T|-^)),

(20.3)

and so under the same condition on u,


\{qlrr)lu~1G{u) |~

(ferM/TT-)2*-1.

(20.4)

We use the ideas of Chapter 5 to get a partial sum for the Dirichlet
series of L 2(s, x). In the fundamental equation (5.4) the convergence of

3.20

A N A P P R O X I M A T E F U N C T I O N A L E Q U A T I O N (I)

85

the integral is only conditional, so we introduce a kernel K(co) to make


the convergence absohite. Let
ii() = ^7r4e " { ( 77-i)(o) |7Ti)(oj + l7Ti)(co+|7ri)}-1

and

(20.5)

K(w) = Z ^ w J + ^ - o * ) = ^ c o s h w f l (<(r

-l

( 2 0 . 6)

We have normalized so that K ( 0) is 1. In partial fractions,


K t {aj)

1 e

'r '

u>

2 a)

/ w 3 2 ( 2 ? ) ! ( 1 - ) - ? ') ! ai (?'

(20.7)

For real positive x, repeated use of eqn (5.4) gives


2+
+ ico
lco

2rri
'tt i

m w

co

\x j

2 ico

where
c(v) = 0

if v > e

c(w) = -J3^ sin(^77-logw)3^ sin(f77-log-y)


c(v) = 1

if e -1 < v ^ e

J. (20.9)

if 0 < v < e-1

Lets = cr+ii be a fixed complex number with J ^ a < f . The restriction


on a is not essential for the proof, but it is sufficient for our application.
Then
2 cr+ ico

d{'m] ^ m') ct~~) = 2~1

L%s+aj,x)daj.

(20.10)

By construction K(to) is an integral function, and the only pole of the


integrand in (20 . 10) is at co = 0, with residue L 2{s,x)- The difference
between L 2(s, x) and the sum in (20.10) is therefore given by an integral
along a contour passing to the left o f this pole, which we shall take as
the vertical line Reco = i cr. Equations (20.1) and (20.4) provide
bounds for the integrand that justify moving the contour: for example,
they imply that
| ( - i + i r , x )|2
(20. 11)
since L ( |ir, %) is bounded, and that
|Z(A+ir)| < r - 4

(20. 12)

in 1 ^ A ^ 2, |r| ^ 10, so that the integral along Reco =


a
converges absolutely. We change the variable of integration to u

T H E N E C E S S A R Y TOOLS

86

3.20

( = A+ir) where s+a> = \ u, and use the functional equation (20.1) to


write

ox
|+ico

1
2771

) ( z Y u~1 Q(u ) L n u>x ) d u .

(20.13)

s + tt 1

f i c o

Because of the factor K(s-\-u 1), the integrand in (20.13) is largest


when is close to 1s. Now when it is near 1s the terms (20.4) approxi
mate the (2A l)th power of the constant \ c[\s \Itt. In modulus at any
rate the main contribution to (20.13) is an integral of the form (20.10)
with s and x replaced by 1 s and x , and x replaced by y, where the
integer y is given by
y =
(20.14)
More precisely, provided |Im(s-|-tt 1) |<
1

a\21,-1

'

we have
*\2\A1

G(%)xl- Uy x~n

47r2x y

?ki

< (l + 0(|/|^) + 0(|2/|-1))2i'l* < 1,

provided that

j|-'.

I f we choose y by

4772x y = q2t2

(20.15)
(20.16)
(20.17)

exactly, so that y is not necessarily an integer, the term 0 ( y ^ 1) is not


present and (20.15) is true in any case; but in this application we shall
g^) and (20.16) will hold easily.
have y
The series for L2(f + ir , x) converges absolutely uniformly, so that we
may integrate it term by term. By analogy with the term-by-term
integration o f (20.10), we break the series up into 91+93 or into
(according to whether we take K-^s-^-u 1) or 1^(1s u)), where
cp4 are given by
9i(u) ~
<?2(u ) =

d(m)x(m)m-u N

d (m )x{m )m -,

d(m)x (m )m -v

m<ev
m <e~ iy

(Pz(u )

(20.18)

m >ey
? i(u )

d{m )x{m)m-u

We write the integral in (20.13) as the sum o f four integrals numbered


correspondingly
/ 4. First we consider the integral I3 involving

3.20

A N A P P R O X I M A T E F U N C T I O N A L E Q U A T I O N (I)

87

cp3('ii)Z1(s+% 1). When (20.15) is valid, the integrand decreases in


modulus as we move the contour to the right. We employ the contour 0
consisting (in the case \t\ ^ 10) o f
Gy. the line segment (2ioo, 2it i||}],
G2: the semicircle centre 2it, radius 111*to the right o f the line A = 2,
Gs : the line segment [2ii-f-i|i|*, 2+ioo),

l-.s

Pig. 5. The contour G

which is the line A = 2 with an indentation around the poles of


1). I f \t\ < 10 we take G to be the line A = 2; the estimates
below will then hold with (if)-1replaced by 1 and with different implied
constants.
For A ^ 2 we have

l<Pa( )l <

m >oy

d{m )m ~x

2 m ~ \ D (m ) D { m 1)}
m>ey
< {ey^ log y,

(20.19)
by partial summation and the estimate (2.13) for the sum function D(m)
of the divisor function. By (20.19) and (20.4),
T .'B1- s~X 1(s+'M 1) j g^2" - 1 (
G(u)<ps(u) du,
2m J
s-\-u 1
1

<J
Ci

,.1 -a -A

Jxyir y ~ *

< ?-

y^logy dr

{l+\t+r\f\2rr j

< x - at~2

853518 X

/^|t |\2A-1

4qH2)
_1iog y-

q\t\logy
( 20 . 20 )

88

T H E N E C E S S A R Y TOOLS

3.20

The same estimate holds for the integral along Cs. On the semicircle C2,
(20.4) and (20.15) are valid, and we have the upper bound
< x - a(l+ \ t\ i) - 5q\t\logy(ir\t\i)
<^.qx~a\t\-1\ogy.

(20.21)

Hence / 3 and similarly / 4 are bounded by the expression in (20.21).

21
AN A P P R O X I M A T E F U N C T I O N A L
E Q U A T I O N (II)
Wliy, whats happened to your tail ? he said in surprise.
Wliat lias happened to it ? said Eeyore. It isnt there!
I. 43

W e have now shown that I 3 and I i can be regarded as error terms.


We treat (px and cp2 by moving the contour the other way. The first case
considered is |i| > 10. I f t > 10 we use the contour D given by
D x: the line segment (

ioo, J it i||*],

D 2: the semicircle, centre \ it, radius \t^!, to the left o f A =


D s : the line segment [Ji-|-i|i|% -J-i],

D 4: the line segment [

i, Ji],

Z>5: the line segment [Ji, J + i],


D 6: the line segment [J + i, i + i ] ,
D 1\ the line segment [ 1 + i, J ico).

The contour D for t < 10 is the reflection in the real axis of that
described above. The indentation about the origin avoids a possible
double pole of G(u) at u = 0. The analogue of (20.19) is that, uniformly
l<Pi( )l < ( e y ^ l o g y .
1

(21 . 1)

C x 1~s~uK 1(s-}-u 1)

!7ii J
D

Using (21.1) in place of (20.19) we see that


|ii| <4qx-\t\~nogy,

(21.3)

and the right-hand side of (21.3) is similarly an upper bound for / 2 in


which K x( l - ~ s u)(p2(u) replaces K x(s-\-u l)tp1(it).

90

T H E N E C E S S A R Y TOOLS

3.21

Between the contours G and D lie the poles of w - 1K ( w ) at iv = 0 and


ibO'IV i, where w = s-\-u 1. The residues at these can be
calculated term by term; in total they give

w =

d(m)x(m) ,lm
c
m 1s

(21.4)

Fig . 6. The contour D

where
c'(v) = 0

if v ^ e

+2

1
9
32(1+J*)!(2?)! \n2vxyi

X ( ? ( l 5 + ( r l)7ii)

(21.5)

if e_1 ^ v < e

12s

c'{v)

G ( l s)

if 0 < v < e_1

If
is
then Q{u ) is regular at 0. For \t\ ^ 10 we take the
contour D to be the line A = f . The bound for Jt and / 2 on this contour

3.21

A N A P P R O X I M A T E F U N C T I O N A L E Q U A T I O N (II)

91

is the same, but without the factor ||-1 and with a different implied
constant. I f x( 1) is + 1> however, there is a double pole. I f a is
bounded away from 1 we can run the contour D between the origin
and the poles of /^ (s+ m l)/(s-\-u 1). I f a < 1 we take D as
D x\ the line segment (

D 2: the line segment


D 3: the line segment
Z>4: the line segment
D h\ the line segment

ioo,
i],
[1 i, 1(1 cr)i],
[1(1 cr)i , l ( l cr)+i],
[1(1 cr)+i, 1 + i],
[ 1 + i, 1+ioo).

The upper bound for the integral along I)3 will be


(21.6)

qx~(l a)~zlo g y.

I f all else fails, we can take I) as the vertical line X = \ and get an
explicit but very complicated extra term in the approximate functional
equation from the residue at n 0 of the multiple pole. When || < 10,
(20.14) is interpreted as
q 2 < ^ x y < g 2.
(21.7)
We have now shown that an equation of the form
7jV jx) _ 2

- M

A ) + 2

,(5 )+

H
( 2 1 .8 )

holds both in \ ^ a ^ f, \t\ > 10, in which case each of / 1;..., / 4 is


<g.'B-'|/|-1log2/,
and in \ < a < 1, \l\ < 10, when each of

( 2 1 .9 )

/ 4 is

(1 o ) - 2qx~ulo g y,

(21.10)

the estimates being uniform in the ranges stated.


Although we have assumed q > 1, so that L(s, %) is not (s), we can
treat 2(s) by the same method. We assume \t\ > 10, since it would be
unprofitable to try to approximate near the double
at s = 1.
When we move the integral in (20.13) to the line Re?o = 1cr, this
double pole gives a residue
2Aiv~1K{iv) + A

[to^ K iiv))

(21.11)

evaluated at to = 1s, where A is the second coefficient (in fact, A is


Eulers constant y) in the Laurent expansion
() = ( a - l ) - H - 4 + 0 ( - l )

(21.12)

92

T H E N E C E S S A R Y TOOLS

3.21

of (s) about the pole at s = 1. We must add to eqn (21.8) the additional
terms (21.11); they are
^
(21.13)
uniformly i n | / | ^ 1 0 , j ^ c 7 ^ f .
There are many approximate functional equations in the literature.
The one we have proved is easy to sieve, but has more complicated
coefficients. Approximate functional equations for (s) and for 2(s) are
given by Titchmarsh (1951, Chapter 4), and generalized by Chandra
sekhar an and Narasimhan (1963). A. F. Lavrik has a number of papers
on the subject in the Doklady and Izvestia of the USSR Academy of
Sciences. Fogels (1969) has a form rather like that of Montgomerys
given here.

F O U R T H P O W E R S OF L - F U N C T I O N S
They were out of the snow now, but it was very cold, and to
keep themselves warm they sang Poohs song right through
six times, Piglet doing the tiddley-poms and Pooh doing the
rest of it, and both of them thumping on top of the gate with
pieces of stick at the proper places.
II. 7

T h e approximate functional equation for L 2(s, x) is used to obtain


upper bounds for |L(s, x) |at individual values of s, or on average. Our
inspiration is the Lindelof hypothesis,
( 22 .1)

for each e > 0, with a constant depending on e. In this chapter, we


show that the mean fourth power o f L(s, x) satisfies the inequality
(22.1). Beyond the fourth power, the sums in the approximate functional
equation are too long for us to prove (22.1) even on average. We use
the hybrid sieve (19.26),

( 2 2 .2 )

Although (22.2) allows us to vary a, we take a \ throughout, since


\{ql'rr)1^2s(xle)^r7TiG (l s + > '7 7 i ) |

(22.3)

for a = J, the gamma functions in 0 being evaluated at conjugate


complex points. The proof is no simpler, but the form of the upper
bounds is less complicated.
Cauchys inequality applied to eqn (21.8) gives
\L(*> X)l4 < 6|2d(ra)x(m)m-c(m/)|a+
4

3.22

T H E N E C E S S A R Y TOOLS

94

and we have a similar result for (s), with 7 instead of 6 and an extra
term 0 ( |/|~10). The integers x and y in (22.4) are connected by (20.14):
y = [qH*l( 4 A ) ] .

(22.5)

When we fix x and average over x and t, y is varying. For a good upper
bound, x and y must be of the same order of magnitude. We restrict
ourselves for the moment to P < q < 2P and U < \t\ < 2 U , and
average the right-hand side of (22.4) over all integer values of a; between
\ P U j n and \ P U jit . For each fixed x and t ^he corresponding values
of y given bjr (22.5) are distinct and lie between ^PUj-rr 1 and 1 6 P U / tt.
The average of the square of the terms involving y taken over all integers
y in this range is at least
times the corresponding average over the
values o f y that actualfy occur in the sum. This device allows us to sum
a Dirichlet series of fixed length ey over varying x and t, provided that
x does not occur in the coefficients (and vice versa).
For each value of x in the above range, (22.2) gives
R(X)

<p(q)

# < < K 2P T'-t/ ymoclg

d ( m )x (m )

hn'

TO^-X)

cx

r 1

(S_1+loga;) ^

(m +P ^U )

m1

cZ2(to) log x
mlog (m +e)

< P 2/log5(P ?7+e),

(22.6)

where we have used (2.24) for ^ d2(m)jm, and partial summation. We


have assumed

8> 1

(22 7)

It is only slightly harder to average over the reflected series. For


e ^ y < to < ey we must break the coefficients c'(m/y) up into five terms

corresponding to the five poles of


fl\

I-

We take out a factor

/ .2/^A Y n i

(i)

(22'8)

from the term corresponding to the x^ole of K - ^ w ) ^ at ttt\, where


r == 0,
+ 1 . By (22.3), the modulus o f the expression (22.8) is fixed
independently of x, q, t, and the parameter a in the definition of 0(u),
which is 1 if x( 1 ) = - - 1 and 0 if p^( 1) = + 1 . After this factor has
been removed, a sum like that in (22.6) remains, with no gamma-function
factors and no concealed dependence on *. For each r, the right-hand
side of (22.6) is an upper estimate for this sum, possibly with a different
0 -constant.

3.22

F O U R T H P O W E R S O F L -F U N C T I O N S

95

The contour integrals Iv ..., / 4 present complications. First we fix t


and average only over characters. We have for any function F(u) for
which the integrals exist
2
F(u) d u

''>
Ti

71
'' '' D

'

(22.9)
and similarly for integrals along the contour G. The first factor here is
We shall find an estimate for the average of Iv Here,
F(u) = x1- s- u(sJr u l ) - lK 1(sJr u \)(qlir)'in- 1G(u,x)qii{w)( 2 2 . 10 )

We wish to average this over % with the length of the sum cp^w) fixed,
that is, with y fixed. By (22.4), x is approximately varying as q2, and
is bounded on the contour D by a function o f t and y alone, by
(20.14). Similarly, (20.4) permits us to take out (q'/tt)2 1 G?(^<-) at its
maximum. B y (18.12),

xmods

msgey
< ( y + P 2)(e/)i-2Mog!ty,

(22.11)

where we have used the estimate


2 d2(m) < l o g 3i,

(22.12)

which can be deduced from (2.24). The sum of the integrals involving
F(u) on the right-hand side of (22.9) is now
yy,\ 22 u/^y\4u 2

< max max \ t \ l ar


(q,x) med (||T

>

G*{u) ( y + P i){ey)i--*x \ogzy

(max&)1_2l7|i|~*(?/-j-P2)log3?/
< (P 2+P|i|)|i|-Jlog3(P(|<|+e)),

(22.13)

where we have used (20.15). Hence


q
]T * l^l2 < ( P 2+P|<|)|<h4log3(P(|f|+e)),
<
p
(<
7
)
P<8<2P T'-t' ^moda

(22.14)
and similarly the same upper bound holds for the corresponding sum
involving / 2.

96

T H E N E C E S S A R Y TO OLS

3.22

To apply a Gallagher sieve we need a similar bound for a sum involving


The contour D varies with t, but this was only to help estimation,
and we can keep D fixed and differentiate the integrand. A similar
calculation will now give

(22.15)
Gallaghers lemma (18.17) gives
277
i
it
kj+
T"2

,/ ^
2d'~
V/~a
+i

x\ ,1 /, 2U + i

J i/ii2^+ J i-^rdiW J

U < t r^ W

U~i

'u - t

'

'77-i

i j
dt 1

. jl

\t

dtj

(22.16)
We sum (22.16) over % (using Cauchys inequality on the second term)
to find
,

_ mx)
2
Jo ) Z
2
IAI8 < ( - P + ^ ^ - 8log4(P (?7+ e)) (22.17)
P<g<2P

x m od2 r = 1

when we substitute in (22.14) and (22.15). The same upper bound holds
for the sum with Ia.
The integrals I 3 and / 4 present a further complication in that <p3 and
cp4 are not finite sums. To treat 93 we write
__
2

19
d (m )x (m )m ~ u\

>i>ey

/ J"?

\/ ^

( 2 ?l_2)( 2 ,l2
V=1

' 'n=l

,_i
2

leM
y<m<eM
+1j/

\
d (m )x (m )m ~ u 2)
'

(22.

By (18.11) we have
g
cp(<7)

d(m)x (m)

'V *

. ,

P<q<2P rvx/ ^mod(2 enu<m^en+1v

in

(e +1 /+ P 2)(e,li/) 1_2;'(ii+log?/)3,

(22.19)

and the sum of terms of the form (22.19) from n = 1 to infinity converges
uniformly in A to

^ ( y + P a)(ey)i-^ 1og 8y.

( 22.20)

When we use (22.20) in place of (22 . 11) we show that the sums over I 3
and J4 also satisfy the bound (22.17).

3.22

F O U R T H P O W E R S OF ^ -F U N C T I O N S

97

We have now shown that

< P 2C71og6(P (?7+e)) + (P + 7 )P /-3log4(P (?7+e)).

(22.21)

By (21.13), if the zeta function is included in the sum there is an extra


term TJ~9, which is easily absorbed in the second term on the right
of (22.21).
When we sum over values of P and U running through powers of two,
we can show that

provided that for r =

1 ,...,

R(x) we have

(22.23)
and for r = 1,..., R ( x ) ~ 1 we have
t(r + l,x )-t{r ,x ) >

1,

(22.24)

and for q = 1 \t(r, x)| ^ 1 .


We could replace (22.24) by a weaker condition
t { r + l , x ) ^ t ( r>X) > (log Q T ) - 1,

(22.25)

and still have the upper bound (22.22), so that the root mean fourth
power of L(s, %) is
log Q T , Without the sieving over T this method
proves that the root mean fourth power at afixedsis <^log(Q(|s|+e)).
A simpler proof of a fourth-power moment is in Gallagher (1967).
Another result is given by Linnik (1964, section 41). These relate to
summing over x- The situation is much easier if we merely integrate
over f; there are examples in Titchmarsh (1951, Chapter 7). The hybrid
moment we have proved here is sketched in Montgomerys paper (19696),
but with a larger logarithm power.

P A R T IV

Zeros and Prime Numbers

23
INGHAMS THEOREM
He tried Counting Slieep, which is sometimes a good way of
getting to sleep, and, as tliat was no good, he tried counting
Heffalumps.
I. 62

W e shall estimate the number of zeros p j 3 + i y of L(s,x) or o f (s)


in a rectangle

< 0 < 1,

T < y < T

(23.1)

where re > -J, T > J. Let X be a large integer and


M (s , x) =

(23.2)

m<:X

I f Riemanns hypothesis were true, then for a > \ M (s , x) would tend


to {(s, x)}_1 as
tended to infinity. We write
L ( S, x ) M ( s , x ) = l + f ( s , x ) ,

(23.3)

w here/(s, x) lias the Dirichlet series

a(m)x{'in)m~s,

(23.4)

m >X

a(m) =

2 M<Z).

(23-5)

d^X

The series (23.4) converges without any hypothesis for a > 1. Until
A. I. Vinogradovs work (1965), the number of zeros was customarily
estimated by integration o f the logarithm of 1+/(> x) round the
boundary of the rectangle (23.1). As we shall see below, f(s , x ) is less
than unity in root mean square, whether averaged over t or over x r

4.23

IN G H A M S THEOREM

99

over both, provided a > J. At a zero, f(p , x) is 1. A. I. Vinogradov


revived an alternative approach: to count directly the number of times
f ( s , x) has modulus at least unity. While no simpler in detail, this method
is the more flexible, and we follow it here.
The integral transform
2 +ico

I f
2t t i

JY\W
r ( w ) ^ j W dIV = e - m'r

(23.6)

2 ico

is easily verified by moving the line of integration to Be vo = -J R,


where R is a large positive integer, and letting R tend to infinity: the
residues at poles of F(w) give the terms o f the exponential series. Hence
m >X

2 + ico

277i

L i p + t v , x ) M ( p Jr w, x ) Y wr (w ) div.

(23.7)

2 ico

Here p is a zero of L (s , x) so that the zero of L(p-\-w, x) cancels the pole


o f F(w) at w = 0. We take the integral back to the line Rew =
(3.
I f x is not trivial, the right-hand side is equal to
i-p+im

1
27ri

L (p Jr iv, x ) M ( p Jr iv, x ) Y wF(w) dw,

(23.8)

and if x mod q is trivial there is an extra term


9(q)q-1M ( l , x ) Y ^ P F ( l - p )

(23.9)

from the pole at p .\-w = 1.


We write

I log Q T

and suppose that

logAr .<C 10Z,

lo g !7

(23.10)
10Z.

(23.11)

The terms on the left of (23.7) with to > 100IY contribute less than ,
if I is sufficiently large. We recall Stirlings formula in the form
(20.3), valid if A ^ |t |j :
|r(A+ir)| = e-^W|A+ir|;-i{(277^ + 0(|T|^)}.

(23.12)

The term (23.9) is now seen to be at most ^ when |y| ^ 100Z, again
provided I is sufficiently large. Apart from the zeros o f (s) with
|y|^ 100Z, all zeros fall into one or both o f the following classes.
Class (i). Zeros p with
y

a(m)x('>n)m-Pe~m!Y > .

(23.13)

ZEROS A N D PRIME NUMBERS

100

4.23

Class (ii). Zeros p with


- ; 8 +ico

L(p~>r'W, x)iY(/>+ti>, x ) Y lr(w ) dtv > 77.

(23.14)

i-fl-ic o

We subdivide class (i) by writing the range (X , lOOZF] as the union of


intervals/,.: 2rY < m ^ 2,,+1I r (the first and last intervals having instead
the end points X and 100IT). A zero is of class (i,r) if
2 a(m)x(ni)m~Pe~m,r > {20(r2+ l)} ^ 1.

(23.15)

nisi?

By (12.19) or (12.20) the number of zeros of a fixed function L(s, x)


in a subrectangle t
y ^ t-\-1, a ^ j8 ^ 1 of (23.1) is 0(1). From each
class of zeros in (23.1) we can pick out a sequence of zeros whose imaginary
parts differ by at least unity, in such a way that the sequence contains
a proj)ortion o f at least ;> l~x of the zeros of L(s, x) of that class in
(23.1). It is possible that our sequence contains all p at which L (p ,x )
has a zero of the given class, but these p are multiple zeros of order 0(1).
We write Nt(x) for the number of class (i) zeros of L(s, x) in (23.1), N2(x)
for class (ii) zeros, and N (x ) for the total number. We call our sub
sequence the representative zeros.
Summing over representative zeros p of class (i, r) we have
Y

JL

V * V

Z-t m(q) Z-i

<1<Q

xmoda

Z-,
p

V
/-i

m elr

a(m)x(m ) e -,ir 2
mP

^ Z 2 ( +Q 2y)^2()~2a:exp( 2')
melr

(2'T)1~2(Q2I 7+ 2 ,T )log4r e x p ( 2r)


< Z6exp(-2'')(<92y (2 'T )1- 2 + (2 'T )2- 2a),

(23.16)

by (19.26) with 8 = 1 ; we have used (23.3) to estimate a(m) by a divisor


function, then (2.24) and partial summation.
Comparing (23.15) and (23.16) and adding 12 for the zeros of (s) with
|y|< 100Z, we have
2

2 " Nlix) < ^ 2+ / i ; ( r 2+ 1)exP( ~ 2 0 x

q^Q ''-i-' ^mod(2

X{<922,(2,T ) 1- 2 + (2 ,T ) 2- 2}

(23.17)

where we have assumed X > Q2T. We now choose


X = Q2Tl.

(23.18)

There are several ways of treating zeros of class (ii). The ingenious
work of Montgomery (19696) makes much use of this flexibility. To

4.23

IN G H A M S THEOREM

101

obtain Inghams theorem we raise the expression on the left o f (23.14) to


the four-thirds power and sum over representative zeros p /3-f-iy of
class (ii). By Holders inequality we have
i - 0 + icO

V JL V* V
(pftf) jL-4 Z-4

q ^ Q T SJ-' xmod q

L(p-\-w, x)M(p-\-io, x )Y wr(iv) dto 3

) 1 CO

i +ico
2 ^ 2 * 2
f
a<Q9 U ;xmoda P s-ico
4+1C

2 ^
' e=so

2 *

\2

xmas p

| d| r x

l ^ + i y - x)\z\r iu ~P)\* |dlj

(23.19)

By (23.12) and (22.22) the first integral on the right o f (23.19) is


i

<

f Q2T l5m & x . { t + B - t ) - t d t +


o
P
CO

+ j" Q2( T Jr t)(lJr \ogt)5msbx(e-27rllH~iPls) di


<

(23.20)

When we apply the hybrid sieve (19.26), the second integral on the right
of (23.19) becomes
x ( m + Q 2T)
lo g Z
m
log(m +e)
m=1

<^i(a- i ) - n x + Q m )

(23.21)

<

by (23.18). The right-hand side of (23.19) is


(23.22)

<

and thus

^
a<Q

We now choose

2 *
xmoda

^ {ot l ) - * Q iTY*a-*<Mii.

Y (Q2T l - 2)8l(-i - 2oc\

(23.23)
(23.24)

so that (23.17) and (23.23) together give


2 H 2 * iY(x) ^ (a -4 )-l( Q 82,)a(1- fl,(S!- )ZW-oO
2*SQ
^morta

(23.25)

102

Z E R O S A N D P R IM E N U M B E R S

4.23

We have assumed that 100ZY is greater than X ; this is certainly


true if

a > $ + fZ -ilo g Z.

(23.26)

I f a is less than this bound we quote (12.19) or (12.20), which give an


upper bound

N (x) < T l ,

so that

^
a^Q

T
iV(/Y) ^ Q2Tl
xvaoaq

(23.27)
(23.28)

in any case. In fact, (23.27) and (23.28) give

2m2

< min{Q 2Tl, ( - JJ-^QaTJsa-aJKa-^s/OJ-a)}.

(23.29)

Ingham proved his theorem for the zeta function only, with no sum
over
his result (1940) was
N < ysa-oo/ta-oO^

(23.30)

where N refers to the zeros of the zeta function satisfying (23.1). (23.25) is
stronger than (23.30) for a near but weaker for a > f . We have lost
a logarithm factor by taking representative zeros instead of counting
according to multiplicity.

24
BOMBIERFS THEOREM
I see, I see, said Pooh, nodding his head. Talking about
large somethings, he went on dreamily, I generally have a
small something about now about this time in the morning.
I. 48

W e proved the prime-number theorem for arithmetic progressions


a modr/ uniformly in q <, Q, where
Q =

(log x)N,

(24.1)

in the notation of Chapter 15. In 1965, Bombieri (1965) and A. I.


Vinogradov (1965) provedindejDendentlythat the prime-number theorem
holds uniformly for almost all progressions with Q a little smaller than
Bombieris result was the following theorem.
T h e o r e m . Given A > 0 toe can find B > 0 such that for all large x

. am odg y ^ x

Hy> <?>)-

q<Q

9 (g)

^ (log)^

Q = x*Q.ogx)-B

where

(24'2)
(24.3)

and the asterisk indicates a restriction to reduced residtie classes.

The right-hand side of (24.2) is smaller by a factor (loga;)^+1 than the


sum o f the corresponding main terms. The proportion of progressions
for which the error term in the prime-number theorem for the interval
1 ^ P ^ x is greater than (logx)~A times the main term is O((loga:))-1.
In calculations we often apply the prime-number theorem to several
progressions, and its failure in a small number of cases affects only the
error term in the eventual asymptotic formula. An example is the proof
that every large even integer is the sum o f a prime and an integer with
at most three prime factors. It may be that (24.2) holds with some larger
constant in place of but ?, is the limit of present methods.
An upper bound for the left-hand side of (24.2) is

q ^ Q TV- "
863518 X

xm odg

X non-trivial

"

</<Q
H

l!% ' j;>~ * '!'

, 2 4 -4 )

ZEROS A N D P R I M E N U M B E R S

104

4.24

When we use eqn (17.15) to pass to proper characters, the sum in


(24.4) does not exceed

1 / Q

x m o c l/

V<X

s s O t m o d /)

a<Q

+max|i/i(2/ ) - 2/l S ' - ^ r + A2Q,

(24.5)

A = log*.

(24.6)

v^o;

where we have written

<?(q)

The coefficient of the term in % mod/ in (24.5) is


V

ffl(ff)

Z -!

g= 0(m od/) Tv~i '

a<Q

ro(f) Z -, m(rn)
Tw '

'

cp(/)

(24.7)

rw '

Relations (16.22) and (24,3) assure us that the second and third terms
on the right of (24.5) are
^ x^_A
^24 ^
for x sufficiently large. Since / -<C XiB is of the form (24.1), the SiegelWalfisch inequality (17.14) implies that (24.8) is an upper bound for the
terms in (24.5) w it h / < A43; we are assuming x is sufficiently large.
For q > A4B we combine (17.2) and (17.6) into-

4>{y, x)

= ~ 2 ^+

i+ ^l)log2? Ty\

(24.9)

where the sum is over zeros o f (s, %) with |p| > R and |y| < T , R and
T being chosen for each % so that (17.3) and (17.4) hold; in particular
the O-constant in (24.9) is independent of
T, or y and uniform in
^ R
J. When we choose T to be a little smaller than x we have

Ip{y>x)\<

Y^+O{x^logzx)

(24.10)

whenever y < x, with an absolute O-constant. The error term in


(24.10) contributes
O( Q x Ho g ^ )
(24.11)
to the sum (24.4).
We now divide the rectangle 0 ^ / 3 < 1, ~ T ^ y < T into smaller
rectangles, so that /3 is in one o f the ranges [0, J+(log )-1],...,
[i+ r ^ o g a O -^ l+ ^ -flX lo g * )-1],..., and |y| in one of the ranges
[0,1], [1,2],..., [2r, 2>+1],.... Within each rectangle, xP{\p\ varies by a
bounded factor. Again, we divide the range (A45, Q\ for q into intervals

4.24

B O M BIB RI S THEOREM

105

P < q ^ 2P. We use Inghams theorem: the number of zeros p = |3+iy


counted with weight q[<p(q) o f functions L ( s , j ) with % proper modg,
P < q
2P, a < /3 < a+(loga;)_1, and Z7 < y < 2 U is
< min(P8l7A, (pa*7)8d-/<s-<*A8),

(24.12)

by (23.25) and (23.28). Each of these zeros contributes at most


^

_A_ <

\P \ <?(q) ^

JL
P U <p(q)

(24.13)
K

to the sum in (24.5). We multiply the expression in (24.12) by Xxa/ P U


and sum over the appropriate sequence of values of P U . First, we have
2

/ 3 ( l - a ) / ( 2 - a ) 1

^ ^

(24.14)

when a > |- and the sum is over powers of two not exceeding x.
In the sum over P we distinguish two cases. The exponent o f P in
(24.12) is greater than unity for a < f and less than unity for a. > f.
The values of P are the powers of two between JA4B and Q; they are
0(A) in number. Hence for a ^ f we have
2 p8(l-o0/(2-a)-l

^^4B)6(l-a)/(2-a)-l

^ ^ l+ 4 B(0(l- a )- l)

< a ;i A1- 4B,

(24.15)

when we assume x to be sufficiently large. For \ < a ^ we have


2 _p6(la)/(2
a)1

AQ6a-)/(2-a)-l

P
a. 3 ( l - a ) / ( 2 - a ) - ^ l - J ?

< a ;1 A1- I?,

(24.16)

where we have substituted Q = a^A~B from (24.3). The terms from zeros
o f L-functions formed with characters whose conductors exceed A43 are
now seen to be

^ aA10--5

(24.17)

by (24.12), (24.13), (24.14), (24.15), and (24.16). The other terms in


(24.5) are of the form (24.17), by (24.8) and (24.11), when we choose
B = ,4 + 10.

(24.18)

We have thus proved (24.2) with B given by (24.18), which clearly can
be improved slightly, since (24.15) and (24.16) can be improved.
There are two ways o f proving Bombieri s theorem. The shorter, due
to Gallagher (1968), is to perform the sieving directly on L'(s, x)jL(s, x)

106

ZEROS A N D PRIM E NUM BERS

4.24

and related functions in contour integrals. The proof is an anagram of


the one we used; multiplication by M (s , X) and fourth-power averages
o f L(s, x) occur in it. The longer way is to prove a zero-density theorem
and deduce Bombieri s theorem from that, as in this chapter. In either
case, the Siegel-Walfisch form of the prime-number theorem has to be
used, and the constants are non-elfective. Without using the hybrid
sieve or the approximate functional equation, we should obtain a weaker
zero-density result than (23.29), but one still powerful enough to enable
(24.2) to be deduced.

>

25
I. M. V I N O G R A D O V S E S T I M A T E
I f we are to capture Baby Roo, we must get a Long Start,
because Kanga runs faster than any of Us, even Me.

I. 93
T h e sum

8{oc) = S(y, a) =

X logpe(pa)

(25.1)

is of the type considered in Chapter 6, with local maxima near rational


points ajq (if y is sufficiently large). In this chapter, we adapt the proof
o f Bombieris theorem to show rigorously that 8 (a ) is small except
possibly when a is close to a rational point with small denominator.
More precisely, if B is fixed and x is large, then
(25.2)
where

(25.3)

and ajq is in its lowest terms with


(25.4)
Inequality (25.2) is I. M. Vinogradovs famous minor arcs estimate ,
with 8 in place of Vinogradovs
His proof is elementary, making no
use of Dirichlet s series; an analytic proof was published by Linnilc (1945),
We remark first that
S (y,a) =

2 A(m)e(ma) + 0 ( y n o g y ) ,

(25.5)

the error term arising from squares and higher powers of primes. When
ajq is in its lowest terms,

the error term arising from powers of the O(logg) primes that divide q;
these are not congruent to reduced residues b mod#. The sum of the
terms involving b in (25.6) is x(a)T(x)> and by ecln (3.20) the modulus

ZEROS AND PRIME NUMBERS

108

4.25

o f this expression is at most g*. We combine (24.10) and (17.15) into


(25-7)
IPI > 7,

1 /5 1

lyl<
whenever y + x, the O-constant being absolute. The stun is over zeros
o f (s, Xj), where Xi proper mod/ induces x modg. Since every zero of
i ( s >Xi) i,s a zero
x)> we shall take the sum in (25.7) to be over
zeros of (s, x). From (25.5), (25.6), and (25.7), we have

<*)

TW/ xmod ip|>

where y ^ a; and q ^ a:.


We do not use Bombieris theorem itself, but analogous results for
zeros o f the functions L(s, x) where x induces a character to the fixed
modulus q. This argument is not as natural as that leading to Bombieris
theorem, and the various inequalities do not fit together so well. First
we give a form of Inghams theorem for the set of all characters % mod q,
where q is fixed. I f x modg is induced by xi proper m od/, we have
L(s, X) = L(s, Xl)

(l ~

p \
<j

(25.9)

\
'

p/ f

The method of Chapter 22 with (19.24) instead o f (19.26) gives

2*

R(x)

\L(^+it(r>x))\l < c P(/)T lo g !/ T ,

(25.10)

Xmod/ r=l

and hence

2
2
xmoda >=!

m + it(r ,

x))l4 <

T lo g tq T

Y <p(/) T T 1 + -*

II \

p\

PXf

T l o g 5q T J ! (p l + l+^J- *)
p\a

qT\og5qT,

(25.11)

since the product over primes dividing q is at most g(f). Here we


sirppose that the points t(r, x) satisfy conditions (22.23) and (22.24).
When we use (18.8) and (25.11) in place of (18.12) and (22.22), Inghams
theorem becomes: the number o f zeros p = jS-f-iy of functions L (s ,x )
formed with characters x mod q in the rectangle a ^ / J ^ l , T ^ y ^ ^ 1
is at most
max{<p(g)51A, (a|) 1(gJ1)3(1^ (2a)(cp(g)A/g)1^2- )}
where

A = log qT.

(25.12)
(25.13)

4.25

I. M. V I N O G R A D O V S E S T I M A T E

109

As in the proof of Bombieris theorem, we sum zeros over rectangles.


For a ^
the analogue of (24.15) is
^3(1-00/(2-00-* <^(^3(1-00/(2-00-* ^ x l-ocl-iB
(25.14)
when x is sufficiently large, and for a < f the analogue of (24.16) is
^3(l-a)/(2-a0-.} ^ (xl~B)3(l-a0/(2-a)-} <^ ^1-^-5 B_
(25.15)
Since the primes less than q and not dividing it are reduced mod q, we
see from the prime-number theorem that y{q)X6jq is greater than unity
for q satisfying (25.4). By (25.14), (25.15), and (24.14) the sum over x
and p in (25.8) is
<
(25.16)
and we have proved (25.2) for /3 = 0.
We now show that (25.2) holds (with a different O-constant) when
j8 is different from zero. Since
m
j 2irifie(fiy) dy = e(m/3) 1,
(25.17)
o
we have the identity
S (x ,a l q + P )
X

= 8(x,alq) + 2iril3 [ { S(x,alq) S(y,a/q)}e(Py) dy.


o
The right-hand side of eqn (25.18) is in modulus
< (l + 47r|/3|a;)max|$(2/,a/q)\,
and we have proved (25.2).

(25.18)

(25.19)

26
I. M. V I N O G R A D O V S T H R E E - P R I M E S
THEOREM
W i t h S ( y } a) given

eqn (25.1) and x an integer we have

x
f S 3(x, ot)e(-xot) da = 2 2 2 lo g ^ jlo g ^ lo g ^ ,
(j

(26.1)

Pi P 2 Pa

where the sum is over tiiples (Pi,P 2,P s) of primes with


P i+P z+P a = x -

(26.2)

We shall find an asymptotic formula for the left-hand side of eqn


(26.1), and deduce that every sufficiently large odd integer is a sum of
three primes. Since we have to use the prime-number theorem for
arithmetical progressions, the constant in the error term of the asym
ptotic formula depends on Siegels theorem and cannot be stated; how
ever, there are weaker results than Siegels theorem in which the con
stants are effective, and in principle a finite set could be given in which
any exceptional x must lie.
Let

Q = [xl~2S],

(26.3)

where I = log a:. Since S(x,u) is periodic, we can take the range of
integration in (26.1) to be the unit interval [1/(Q +1), (Q-t-2)/(Q-|-l)].
We divide this interval into arcs
- i

Ir

A r + ir -l

a,+a,.+1

(26.4)

5V+2V + 1.

corresponding to the fractions ar/qr of the Farey sequence of order Q.


By eqn (9.2) if ar/qr~{-/3 is on then
IPI < M ) - 1-

(26.5)

The intervals I r are known as Farey arcs (if we think o f e(a) as a com
plex variable, the integration is round the unit circle). We divide the Ir
into major arcs, on which we work out the integral over the spike at
arlqr explicitly, as promised in Chapter 6, and minor arcs, on which we

4.20

I. M. V I N O G R A D O V S T H R E E - P R I M E S T H E O R E M

111

use an upper estimate for the integrand. We call Ir a minor are if qr > Z26,
when (25.2) with B 26 gives
\S(x, ar/qr+ P ) | < (1 +Z-aeQ -1)a:Z-8 < x

(26.6)

for ar/qr lying on Ir.


On the major arcs we approximate 8(a). First we calculate the size
of the spike at ajq. From (17.19) we have
V

log
?(?

p itv ,,

3?= &(modg)

(26.7)

xl~

uniformly in q < 126 and in b reduced mod q. We recall that


(26.8)

2 * ea(a&) = cq(a) = fx(q)


b mode

when a is reduced modg. Hence


\ S (y,a J g ) n(q)\jf]/(p(q)\

< 1+

y
log -
., 8
?(?)
35=vDk
(mvodg)

2 *
bm odq

(26.9)
cp(q)xl~32.

<
The simplest sum with a spike is
F(a) =

By the identity (25.18),

(26.10)

^ ( 4

m^x

<g; (l+|/3|)max
s^

y^x

) m

l2Sq - 1cp(q)xl-3i

(26.11)

xh*.

We now have
/A\?) F ( m ,
S ( a l q + P ) - V M F 3(j3) < x l ^ [ \ S ( a l q + P ) \ * + ^
? 3(q)
? a(?)
(26.12)
and the integral in eqn (26.1) is
,
2

(Q+2)/(Q+D

Ir majore Ir

|*%)|2da) +

I K Q + l)

+ {

\ I r m ajor

I,

'( -? J

dot .

(26.13)

The integral in the first error term in (26.13) is


2 log2jJ <a;Z

2J^x

(26.14)

112

ZEROS A N D PRIME N U M B ERS

4.26

by the prime-number theorem or by (8.19), and the integral in the


second error term is
,2
< f |P(j8)|2 d/3 = x.
(26.15)
Using (8.16) for 2

?(q)> we see that the error terms in (26.13) are

both

<a:Z~3.

(26.16)

Finally we must work out the first integral in (26.13). Since


^ (a )= e j ^ + a ) - l <
e(a) 1

1 ^
||a||

we have
*
eqr( - a rx)

J JP8(i8)c(-j8)dj8-Ji,8|a-|:Je(-a[)da

when qr < 126. The integral over /3 on the left of (26.18) is


(26.19)

%x{x l),

and the integral in eqn (26.1) is now seen to be


2

'* eg( - ^ V ( f f ) ! 0 (x*\


2*

< >

Q^V20 amodg

We can replace the sum over q ^ W by one over q from 1 to infinity


with an error term 0(xH~2&). Now

U1 am odg

s(q)

/-<

22
g

d\x

d\x
d\q

<f(q)

dix{qld)iJ?{q)
'

* *'

g^O(modfZ) r

\ n l-i

n + ( p - wny

(p - 1)3

( p - i ) 3 ( ^ - i ) 3+ i

4.26

I. M. V I N O G R A D O V S T H R E E - P R I M E S T H E O R E M

113

which is zero if x is even and positive if x is odd. We put A {x ) for the


constant in (26.21), so that the left-hand side o f (26.1) is
\ A (x)x2-\-0(xH~z),

(26.22)

which is non-zero for x odd and sufficiently large.


Since primes less than xl ~2 can contribute at most xH 1 to the sum on
the right of eqn (26.1), and
I ^ lo g p ^ I 2log?

(26.23)

if x ^ p ^ xl~2, we can assert that the number of solutions of (26.2) is


i A ( x ) x H - s+ 0 ( x H ~ i logl),

(26.24)

ail expression which again is seen to be non-zero for x odd and sufficiently
large. The formula (26.24) is the famous theorem of I. M. Vinogradov.

27
H A L A S Z S METHOD
And I know it seems easy, said Piglet to himself, but it
isnt everyone who could do it.
II. 18

W e have been concerned with estimating how often various sums S are
large. The large-sieve results that we have had gave an upper bound
for the sum o f |$|2 over different values of a parameter. The upper
bounds we obtained contained two terms; the first was the maximum
o f \S\2 and the second the mean square o f |S| multiplied by the number
of values of the parameter. Thus if
N

S(t) =

2
m=1

(27.1)

then (18.29) gives

r= l

|(y|3 < iY lo g 2V 2 |o(m)|a+ (2 1 ,-!7 1 )lo g ^ 2 M m)l2

(27-2)

where the points tr are at least (log N )^ 1 apart and lie between
and 2\,
The second term in (27.2) corresponds to the mean value o f |$()'|2 and
the first to its maximum. Halasz s method sometimes enables us to
count the number o f values o f r for which \S(tr) \ is large, without the
second terms being present on the right of (27.2).
Halaszs method is based on the lemma of Chapter 7,
f c,(u,fW) < ||u||( % |c,|2W max f |(fW,f<fl>)|a)*.

(27.3)

In the proof of (27.3), we assumed that the vectors u,


had
finite dimension N , but the proof remains valid when the dimension is
infinite, provided that each vector involved has finite norm. We choose
the coefficients cr so that ^ .(u ,^ ) is real and positive. For (7.15) we took
c},to be the complex conjugate of (u, fw), but here we give c,, unit modulus,

4.27

H A L A S Z S M ETH OD

115

so that the second factor on the right of (27.3) is R*. I f each term on
the left o f (27.3) is at least V we can square (27.3) to give
i?2T/2 < P||u||2 max f |(fW,ffe>)|
< -B||u||2(max||f(,')||2+ ( P l)max|(fW,ffe))|],
\ r
}
r,g

(27.4)

r^q

We can now deduce Halasz s lemma.


L e m m a . Let

u,

for r = 1, 2,,.., R .

f(R) be vectors of finite norm with


|(u,fM| > V

(27.5)

R < 2F~2||u||2 m ax ||fW||2


l<r^B

(27.6)

F2 ^ 2||u||2max|(fw,f (9))|.

(27.7)

Then

provided that

<l=r

We shall apply (27.6) with


(u,fW) =
where sr = or-\-itr, 0 ^ ar <

Following Montgomery (1969a), we take

<YYi\
um = emlNa(m)

um = 0

iiff 1
1

<T" N'
AH
^ rm.
m^
if m > N

fm = e - mlNm - ar<--iir

Here,

(27.8)

m^N

||u||2 < e2 f

(27.9)

for all m ^ 1.

(27.10)

|a(m)|2

(27.11)

CO

and

||f||a < 2 e_2m,iV = W + 0 ( 1).


1

(27.12)

Writing a for o-,.+crg and t for tq tr, we have


00
(f(r), f(^) =

0
2//iV^crit

m 1
2 + ico

f
2771

J
2 ico

by the integral transform (23.6).

r(w)(%N)w(w-{-<j-\-it) dw

(27.13)

116

ZE RO S A N D PRIME N U M B E R S

4.27

Before estimating the integral in eqn (27.13), we move the line of


integration to the contour 0 consisting of
C'1: the
C2' the
the
Gs : the

line segment (ico, i(logiV)-1],


semicircle, centre the origin, radius (logA7)-1, to the right of
imaginary axis,
line segment [i^og-A^'+ico).

A residue

(27.14)

r ( l cr~ i t ) ( ^ N ) 1~'J- it

accrues from the pole of


formula in the form (20.3):

a tw + ij+ ii = 1. We recall Stirlings

!r( A+k)| = e-^H|A+ir|^{(27r)i+0(|r|^)},

valid when A

(27.15)

|r|-. Hence if
\t\ > logA7

(27.16)

the residue (27.14) is bounded.


Next we need a bound for |(A+ir)|. Erom the approximate func
tional equation of Chapter 21 we have for 0 ^ A ^
|r| ^ 1 0
|(1 Air)|2 <

m<0(|r|)

d { m ) m x~1+

-H K A + iT + ^ i)}

+ max

r
{i(l-A
- i- r - f 77i)}
-uv'

M<2

m^o('|r|)

< M Alog2M,

(27.17)

where we have used (20.4) and (2.14), and partial summation. The
functional equation and the estimate (20.4) now give
|(A+ir)|2 < M ^ lo g ^ r l,

(27.18)

and (27.17) and (27.18) together give


m + i r ) \ <|Tp-A>logM

(27.19)

uniformly i n O ^ A ^ l , |t | ^ 10.
We have now shown that

|(f,ffa>)|
< 1 + J | r^ )| | ?(ty + A + ir)| | P 7|,u |dw|
C
CO

< l-j-||*log||logiV+ J e_-7I"rr (l + jr|=l og(j r

dr

< |i|*logaJV||,

(27.20)

4.27

HALASZS

m ethod

117

where we have used (27.15) for the gamma function and (27.19) for the
zeta function. We can now restate the condition (27.8) as
T < T0)

(27.21)

where Ta is such that equality holds in (27.8) when we use (27.20) in


substituting for the scalar product. Hence
F2 > f

|a(m)|a2 * lo g O T .

(27.22)

Clearly when T > Ta we must divide up the range for T into intervals
of length at most Tg. Repeated application o f the inequality (27.7)
gives us

IT
\
R < (m +A y-*

\a{m)\m.

v*0
/
m= 1
When we substitute for T0 we have the result which follows:
T h e o r e m . Let

If

(27.23)

< 3 = 2 \a(m)\2.
m=1

(27.24)

(27.25)

^ F

m 1

for s =

sR, where sr = ar-\-itr with 0 < cr,,. ^ J and


T >

\ K - t a\ > lo g iV

(27.26)

for q ^ r, then
R <^ G N V -*+G P N T V -*lo ffN T ,

(27.27)

the implied constants being absolute.

The form of the second term in (27.27) arises from our choice o f
functions f^ ; it is larger than the first term unless (27.22) holds with
T in place o f T0. A plausible conjecture is that
R < G N V -2

whenever

F2

GTS

(27.28)
(27.29)

for any fixed 8 > 0.


The use of the zeta function to prove (27.27) is a curious feature of
Halaszs method. I f Lindelofs hypothesis is true, we can take the line
of integration in (27.13) to Re w l . with the effect o f replacing T\ in
(27.22) by JV'Syg for any e > 0. This is an improvement for T > N
(and if T < N then (27.28) follows trivially from (27.2)), but it is still
a long way from weakening the condition on T0 to (27.29).

28
GAPS B E T W E E N P R I M E N U M BERS
I shall do i t , said Pooh, after waiting a little longer, by
means of a trap. And it must be a Cunning Trap, so you
will have to help me, Piglet.
I. 56

F i r s t we jno v e a theorem on the zeros of (s), replacing the large sieve

(19.26) by Halaszs method in the work of Chapter 23. We shall use


the notation of that chapter with Q = 1, so that only zeros of the zeta
function are considered. The definition o f class (i) and class (ii) zeros
remains as before. We pick representatives o f each class o f zeros in such
a way that their imaginary parts differ by at least 21, where
l = \ogT,

(28.1)

but the representatives are in number ;> l ~2times the zeros in that class.
We suppose a > f , since the result (28.19) which we obtain below
improves on Inghams theorem only for a > f . The parameters X and
Y will satisfy

x ^ y2;

im Y ^ T 2

(28.2)

In the definition (23.14) o f a class (ii) zero p = j8+i y,


l - f l + im

l[pJr io)M(pJr w)Ywr(iv) dw > 7t,

(28.3)

ito

the parts of the integrand with |Imt| > 100? give less than ^ (if I is
sufficiently large). The integral of |.T(|-+ii)| converges rapidly so, for
(28.3) to hold, there must be some t with \t y\ sj 100? for which
m + i t ) M ( i + i t ) \ > c Y P -i,

(28.4)

where c is an absolute constant. We pick as representatives o f the


class (ii) zeros a sequence o f values o f t satisfying (28.4).
By (22.22) the number of these t with
|(f+if)| > U,

(28.5)

where we choose U below, is


< T U - H 5.

(28.6)

4.28

GAPS B E T W E E N P R I M E N U M B E R S

119

Otherwise we have
|i(i+ii)| > V = c U - lY - ! ,

(28.7)

and by (27,28) the number of such t is

We choose

jj

< X V - H + X T V - 9? .

(28.8)

__ ^Y-i/ioys(2a-i)/io;

(28.9)

V = eX1'10^ 2*-1)'5,

(28.10)

and on adding (28.6) and (28.8) and multiplying by I2we see that class (ii)
zeros number

^ jf2/5y-e(2a~i)/5^9^_jy4/5y-2(2a-i)/5^3;

(28.11)

the second term in (28.11) being less than the first provided
Z 2r 2- < TH30.

(28.12)

A zero is o f class (i,r) if


j J a(m)m~Pe~mir j > {20(?-a+ l ) } - 1.
I m elr

(28.13)

We pick representatives and apply (27.28) with


G 2 \a(m)\^m~2ae~2mlY
melr

(2>'Y)1- 2aexp( 2'')l3.

(28.14)

The number of representatives is thus


< *4(2'T)2- 2aexp(-2'-+1)P + f12(2T )4- 6 21e x p ( - 3 . 2r+1)l13.
(28.15)
Summing over r and multiplying by I2, we see that there are at most
< r 2- toi6+ Z 4- 6 T P

(28.16)

class (i) zeros. Choosing


X = 21^a_1)^4+_i),
Y

y 5 o;-3)|{aa+a:-l))

(28.17)
(28.18)

we find (28.12) is satisfied for 0 ^ a ^ 1, and that the number N(a, T)


of zeros p = jS+i y o f (s) with j8 ^ ot and |y( ^ T satisfies the relation
N(a, T) < T5<x-3Xl-M
))Kas+a-l)^7
(28.19)
for f ^ a ^ 1. The result (28.19) is also true for \ ^ a ^ | b y Ingham s
theorem.
We now sketch tlie proof o f our theorem on gaps between prime
numbers.
Theorem. Let e be a real number greater than

Then whenever a: is

sufficiently large, there is a prime p with


x < p ^ x - ^ x 0.
853518 x

(28.20)

120

ZEROS AN D PRIME NUM BERS

4.28

Such a result was first proved by Hoheisel (1930) with c a little less
than one. Ingham (1937) obtained the result with c > f and indicated
how to replace f with a smaller number by improving an upper bound
for |(--f-i?)|. Several authors achieved this by means of intricate
arguments. Recently Montgomery (19696) obtained the result for c >
by the method given here, but with a less efficient use o f the Halasz
lemma; the improvement to
was seen by the author in preparing
the present exposition. As we have seen, Montgomery s method rests
on the Halasz lemma, and thus on bounds for |( 1+i?) |. As with Inghams
result c > , improvements at -|+i? improve the constant, in that a good
estimate for the mean of a higher power than |(-J+it) |4 would decrease
the estimate for class (ii) zeros, both in (28.19) and in Ingham s theorem.
However, even if we knew Lindel5fs hypothesis, we should only be able
to deduce (28.20) for c >
It has long been conjectured (Cramer 1936)
that for large x there is always a prime p with
x < p ^ a;+ 0(log2a;),

(28.21)

but there seems no chance of approaching this conjecture by present


methods.
There are two essentials for a proof of (28.20) with c < 1: a zerodensity theorem such as (28.19) and a result on zeros of (s) with /3 close
to 1. We shall assume that (s) has no zeros p = /3+iy with
> l-4 {lo g (| y | + c)}-* ,

(28.22)

where B < 1. The inequality (28.22) is proved by Hadamards double


height method just as (13.12) was, but the proof uses such inequalities as
|(l+tt)| < l o g c(|*|+e)

(28.23)

with c < 1. No better way to prove bounds for |(l-f~i?) |is known than
to replace exp(iilogm ) by exp{itP(m)}, where P(m) is a polynomial
arising from the first few terms in the expansion of the logarithmic series.
Since m runs through integer values, the resulting sum depends only on
the fractional part of ^ / tt. More direct arguments fail, because t is much
larger than any other parameter involved. After this transformation we
must use the intricate methods of I. M. Vinogradov; W eyls simpler
approach gives only
l(+i*)l ^ io g '^ l+ e y io g lo g ^ l+ e 2),

(28.24)

where any higher power o f loglog t than the first would suffice for the
application. The proof of (28.22) and (28.23) occupies one and a half
chapters of Titchmarsh (1951).

4.28

GAPS B E T W E E N P R IM E N U M B E R S

121

While proving the prime number theorem in Chapter 16 we saw that


# * )_ * -

-+ o K ^ ),

Iy \ < T

'

(28.25)

'

where T is chosen less than x with the property that each zero p = /3+iy
of l(s ) has

\y T\ > l o g 2 \

(28.26)

the sum in (28.25) being over all zeros p of (s, x) with \y\ < T . Hence
ftx+li)-t(x)=h+

xp~ { x + h ) p + o ( XP j ,

Iy\ < T

where we have written

'

A = logo:.

(28.27)

'

(28.28)

By the mean-value theorem the sum over zeros in (28.27) is in modulus


<

H x + B h f-1

xP-1

(28.29)

Iy \ < T

\y\<T

for some 9 in 0 < 9 < 1. To estimate the sum in (28.29) we divide the
interval [0,1] into ranges [0, i], [|, i+ A ^ 1],..., [| + rA -\ \ + { r + 1)A-1] ,....
The number of zeros p = /3+i y with a ^ j3 ^ a+A -1 is
y*.*<i-<*>A27,

(28.30)

this being by (28.19) for a ^ f and by Inghams theorem (23.29) for


\ ^ a jSC f . The sum in (28.29) is thus
< 7iA28 max (ay-M/ejP-i,

(28.31)

\y\< T

the maximum being over zeros p with |y| < T . With


T = x5/12- 8,

(28.32)

where 8 > 0, the expression in (28.31) is


< 7jA28 max

< 7iA28expfSA^A-*)},

(28.33)

Iy\<T

which is o (1) as x tends to infinity; here we have used (28.22). The error
term in (28.27) is also less than h when
h > a:7'12+8A2.

(28.34)

I f (28.34) holds with a sufficiently large constant then


tf,(x+h) ^ x ) > {o(l)}A.

(28.35)

122

ZE R O S A N D P R IM E N U M B E R S

4.28

Finally we note that the prime powers up to x-\-h contribute


(28.36)
to the sum >/i(x-\-h) tfi(x), and so for sufficiently large x
2

x <p^x+7i

loEP >

(28.37)

and we have proved (28.20) when we choose


= i(c

tt )'

(28.38)

NOTATION
2 .n
V

indicate a stim ox1 a product over primes only (Chapter 1)

(m , n )

highest common factor of the integers m and n (Chapter 1)

m = n (mod g) m n is a multiple of q (Chapter 1)


e(a) = expZwia. ea(a) = exp(27ria/g) (1.4)
a

sum over a set of representatives of residue classes modg

m od q

2*
amodg
9(m)
cq(m)
s = cr+ ii
x (m )

d(m)
A(m)
>fi(x)
fi x ) < g(x)

fix)
t(x )
X o (m )

M
IN I

H(a)
L (s ,X)

()
S(a)
7t ( x )

a/q
r(s)
(s)> (s X)
p

= /S+iy

< A ( . x)
ip(x;q,a)
Sx
X*
Xiiiodg
u = A+ ir
0(u)
M( 8 , x )
N ( X )

(Chapter 1)
sum over a set of representatives of reduced residue classes mod q
(Chapter 1)
Eulers function (1.10)
Ramanujans sum (1.11)
complex variable (1.10)
a Dirichlets character (Chapter 1)
the number of divisors of m (1.23)
Mobiuss function (1.25)
logp if m is a prime power p a, otherwise 0 (1.31)
sum function of A(m) (2.1)
\f(x)\ = 0(g(x)) (2.3)
the conductor of a character (Chapter 3)
Gausss sum (3.7)
a trivial character (Chapter 3)
largest integer not exceeding a (4.3)
distance of a from the nearest integer (4.4)
the saw-tooth Fourier series (4.5)
Dirichlets -function (5.8)
Riemanns zeta function (5.9)
an exponential sum (Chapter 6)
number of primes up to x (Chapter 6)
a rational number in its lowest terms (Chapter 6)
Eulers gamma function (11.1), (11.12)
functions occurring in function equations for (s) and L{s, x)
(12.2), (12.3)
a zero of (s) or of (s,x) (Chapter 12)
sum function of A(m)x(m) (17.1)
sum function of A(m) in the arithmetic progression a (modg)
(17.18)
a character sum corresponding to the exponential sum S(a) (18.4)
a sum over proper characters modg (Chapter 18)
auxiliary complex variable (Chapter 20)
all the junk in the functional equation (20.2)
partial stun for the inverse of L(s, x) (23.2)
number of zeros of L (s ,x ) m a rectangle (Chapter 23)

BIBLIOGRAPHY
The epigraphs are from
I. Winnie the Pooh
II. The House at Pooli Corner
B o m b i e r i , E. (1965). On the large sieve. Matliematika 12, 201-25.
------- (1972). A note on the large sieve. To appear.
------- and D a v e n p o r t , H . ( 1 9 6 8 ) . On the large sieve method. Abhandlungen aus
Zahlentheorie und Analysis zur Erinnerung an Edmund Landau. Berlin.
----------------(1969). Some inequalities involving trigonometric polynomials. Annali
Scu. norm, sup., Pisa 23, part 2, 223-41.
C h a n d r a s e k h a r a n , K . and N a r a s i m h a n , R. (1963). The approximate functional
equation for a class of zeta-functions. Math. Annaln 152, 30-64.
C r a m e r , H . (1936). On the order of magnitude of the difference between con
secutive prune numbers. Acta arith. 2, 23-46.
D a v e n p o r t , H. (1967). Multiplicative number theory. Markham, Chicago.
E s t e r m a n n , T. (1948). On Dirichlets ^-functions. J. Lond. math. Soc. 23, 275-9.
F o g e l s , E . (1969). Approximate functional equation for Heckes ^-functions of
quadratic field. Acta arith. 16, 161-78.
F r a n e l , J. (1924). Les suites de Farey et les problemes des nombres premiers.
Nachr. Qes. Wiss. Gottingen, 198-201.
G a l l a g h e r , P. X . (1967). The large sieve. Matliematika 14, 14-20.
------- (1968). Bombieris mean value theorem. Ibid. 15, 1-6.
H a l A s z , G. and T u r A n , P. (1969). On the distribution of the roots of Riemann
Zeta and allied functions I. J. Number Theory 1, 121-37.
H a l b e r s t a m , H . and R o t h , K . F. (1966). Sequences, vol. 1. Oxford.
H a p i d y , G. H. and W r i g h t , E. M. (1960). A n introduction to the theory of numbers.
4th edn. Oxford.
H o h e i s e l , G. (1930). Primzahlprobleme in der Analysis. Sber. berl. math. Qes.
580-8.
I n g h a m , A. E. (1937). On the difference between consecutive primes. Q. Jl Math.
8, 255-66.
------- (1940). On the estimation of N(a, T). Ibid. 11, 291-2.
J e f f r e y s , H . and J e f f r e y s , B. (1962). Methods of mathematical physios.
Cambridge.
L a n d a u , E. (1927). Vorlesungen iiber Zahlentheorie. Leipzig.
L i n n i k , Y u . V . (1945). On the possibility of a unique method in certain prob
lems of additive and multiplicative number theory. Doklady Alcad. Nauh
SSSB,, ser. mat. 49, 3-7.
------- (1964). All large numbers are sums of a prime and two squares (a problem
of Hardy and Littlewood) II. Am. math. Soe. Transl. (2) 37, 197-240.
--------- and R e n y i , A. (1947). On some hypotheses in the theory of Dirichlet
charactexs. Izv. Alcad. Nauk SSSB, ser. mat. 11, 539-46.
M o n t g o m e r y , H . L . (1968). A note on the large sieve. J. Lond. math. Soc. 43,
93-8.
------- (1969a). Mean and large values of Dirichlet jjolyxiomials. Invent, math. 8,
334-45.
------- (19696). Zeros of i-functions. Ibid. 8, 346-54.
------- (1971). Lectures on multiplicative number theory. Springer.

B IBLIO G RAPH Y
P o ly a ,

125

G. (1918). Uber die Verteilung der quadratischen Reste und Nichtreste.

Nachr. Ges. Wiss. Gottingen, 21-9.


P b a c h a b , , K . (1957). Primmhlverteilung. Springer.
R o th , K . F. (1965). On the large sieves of Linnik and Renyi. Mathematika 12,
1-9.
T i t c h m a b s h , E. C. (1951). The theory of the Riemann zeta-function. Oxford.
V i n o g r a d o v , A. I. (1965). On the density hypothesis for Dirichlet i-functions.
Izv. Akad. Nauk SSSB, ser. mat. 29, 903-34.
V i n o g b a d o v , I. M. (1954). The method of trigonometric sums in the theory of
numbers (transl. A . Davenport and K . F. Roth). Intersoience, New York.
------- (1955). A n introduction to the theory of numbers (translation). Pergamon,
Oxford.

INDEX
abscissa o f covergence, 18
additive function, 5
approxim ate functional equation, 84-93,
106, 116
arithm etical functions, 2
additive, 5
m ultiplicative, 2
totally m ultiplicative, 2
B om bieris theorem , 35, 103-9
ch aracters:
con du ctor of, 11
defined, 4
induced, 11
propriety of, 11
sieved, 74-6, 79-82
trivial, 3, 13
classification o f zeros, 99-100
congruence 1
D irich le t:
characters, see characters
divisor problem , 8
polynom ials, 80, 84
series, defined, 18
divisor fu n ctio n :
averaged, 7 -9
defined, 4
elem entary proofs, 69, 107
E ulers :
function defined, 3
product., 20, 52
exceptional zero, 60-5, 7072, 76
exponential maps, 1
exponential s u m :
defined, 24
sieved, 28-31, 77
spikes of, 24-6, 110-11
F arey:
arcs, 110
sequence, 36, 74, 110
F o u rie r:
series, 14, 37, 40-1
transform, 79
Franels theorem, 36-9, 74
functional equation, 20, 41, 45-50, 69,
84, 116
Gallaghers :
first lemma, 76-8, 812
second lemma, 80

Gauss, 11, 22
Gausss sum, 1113
H adam ards product, 50-4, 69
H alaszs m ethod, 30, 114-18
ineffective constants, 62, 72, 110
Ingham s theorem , 98-102, 105,
118-19, 121
integral functions, 50, 85

108,

Jensens form ula, 512


_L-functions defined, 19-20
L in delof hypothesis, 93, 117, 120
m ajor arcs, 11012
m inor arcs, 107, 110-11
m od u la r:
functions, 43
relation, 43
M ob iu s:
fun ction defined, 4
inversion, 5, 33, 38
m ultiplicative fun ction defined, 2
Planoherels identity, 79-80
P oisson s sum m ation form ula, 40-42
P o ly a s theorem , 14-17, 50, 63
prime num ber th eorem :
for an arithm etic progression, 23, 62,
71-2, 74, 103, 110
form s of, 21-2, 23, 39, 55, 62
proved, 66-72
propriety o f characters, 11
R am an ujan s sum, 3, 13, 24, 26, 38, 111
representative zeros, 100, 119
residues:
classes of, 1 -2
com plete set of, 2 , 1011
reduced, 3, 10
R iem ann hypothesis, 20, 52
Schinzels hypothesis, 23-5
Selberg, A ., 21-2, 26-7
Siegels theorem , 62-5, 71-2, 110
SiegelW alfisch theorem , 72, 104, 106
s ie v e :
for characters, 74-6
for exponential sums, 28-31, 77
hybrid, 79-83, 93, 101

128

IN DEX

sieve (em it.)


large, 27, 28-35, 38, 74-8, 114, 118
upper bound, 25-7, 325
sifted sequence, 25, 32-4
spikes o f an exponential sum, 24-6, 110-11
Stirling s formula, 47, 51, 52, 84, 99, 116
totally m ultiplicative functions, 2
trigonometries sums, see exponential sums
uniform distribution, 73-4
V in ogradov, A. I., 98-9, 103

V inogradov, I. M., 8, 14, 25, 57, 107, 113.


120
estimate, 1079
m ethod, 57, 120
theorem, 113
W ey ls criterion, 73
zero-density theorems, 98-102, 105, 108-9,
118-20
zero-free region, 56-65, 120
zeta-function defined, 19-20