Chandrasekharan, Introduction to Analytic Number Theory

Die Grundlehren der mathematischen Wissenschaften
in Einzeldarstellungen
mit besonderer Berücksichtigung
der Anwendungsgebiete
Volume 148
Edited by
J. L. Doob · E. Heinz · F. Hirzebruch · E. Hopf · H. Hopf
W. Maak · S. MacLane · W. Magnus · D. Mumford
M. M. Postnikov · F. K. Schmidt · D. S. Scott · K. Stein
Managing editors
B. Eckmann and B. L. van der Waerden
K. Chandrasekharan
Introduction to
Analytic Number Theory
I
Springer-Verlag New York Inc. 1968
Prof. Dr. K. Chandrasekharan
Eidgenössische Technische Hochschule Zürich
Managing editors:
Prof. Dr. B. Eckmann
Eidgenössische Technische Hochschule Zürich
Prof. Dr. B. L. van der Waerden
Mathematisches Institut der Universität Zürich
ISBN-13: 978-3-642-46126-2
e-ISBN-13: 978-3-642-46124-8
DOI: 10.1007/978-3-642-46124-8
All rights reserved. No part of this book may be translated or reproduced in any form without the written permission of Springer-Verlag.
© by Springer-Verlag Berlin · Heidelberg 1968
Softcover reprint of the hardcover 1st edition 1968
Library of Congress Catalog Card Number 68-21990
Title No. 5131
Preface
This book has grown out of a course of lectures I have given at the
Eidgenössische Technische Hochschule, Zürich. Notes of those lectures,
prepared for the most part by assistants, have appeared in German.
This book follows the same general plan as those notes, though in style,
and in text (for instance, Chapters III, V, VIII), and in attention to
detail, it is rather different. Its purpose is to introduce the non-specialist
to some of the fundamental results in the theory of numbers, to show
how analytical methods of proof fit into the theory, and to prepare
the ground for a subsequent inquiry into deeper questions. It is published in this series because of the interest evinced by Professor Beno
Eckmann.
I have to acknowledge my indebtedness to Professor Carl Ludwig
Siegel, who has read the book, both in manuscript and in print, and
made a number of valuable criticisms and suggestions. Professor
Raghavan Narasimhan has helped me, time and again, with illuminating
comments. Dr. Harold Diamond has read the proofs, and helped me
to remove obscurities. I have to thank them all.
August 1968
K.C.
Contents
Chapter I
The unique factorization theorem
§ 1. Primes
§ 2. The unique factorization theorem
§ 3. A second proof of Theorem 2
§ 4. Greatest common divisor and least common multiple
§ 5. Farey sequences
§ 6. The infinitude of primes
Chapter II
Congruences
§ 1. Residue classes
§ 2. Theorems of Euler and of Fermat
§ 3. The number of solutions of a congruence
Chapter III
Rational approximation of irrationals and Hurwitz's theorem
§ 1. Approximation of irrationals
§ 2. Sums of two squares
§ 3. Primes of the form 4k ± 1
§ 4. Hurwitz's theorem
Chapter IV
Quadratic residues and the representation of a number
as a sum of four squares
§ 1. The Legendre symbol
§ 2. Wilson's theorem and Euler's criterion
§ 3. Sums of two squares
§ 4. Sums of four squares
Chapter V
The law of quadratic reciprocity
§ 1. Quadratic reciprocity
§ 2. Reciprocity for generalized Gaussian sums
§ 3. Proof of quadratic reciprocity
§ 4. Some applications
Chapter VI
Arithmetical functions and lattice points
§ 1. Generalities
§ 2. The lattice point function r(n)
§ 3. The divisor function d(n)
§ 4. The function σ(n)
§ 5. The Möbius function μ(n)
§ 6. Euler's function φ(n)
Chapter VII
Chebyshev's theorem on the distribution of prime numbers
§ 1. The Chebyshev functions
§ 2. Chebyshev's theorem
§ 3. Bertrand's postulate
§ 4. Euler's identity
§ 5. Some formulae of Mertens
Chapter VIII
Weyl's theorems on uniform distribution and Kronecker's theorem
§ 1. Introduction
§ 2. Uniform distribution in the unit interval
§ 3. Uniform distribution modulo 1
§ 4. Weyl's theorems
§ 5. Kronecker's theorem
Chapter IX
Minkowski's theorem on lattice points in convex sets
§ 1. Convex sets
§ 2. Minkowski's theorem
§ 3. Applications
Chapter X
Dirichlet's theorem on primes in an arithmetical progression
§ 1. Introduction
§ 2. Characters
§ 3. Sums of characters, orthogonality relations
§ 4. Dirichlet series, Landau's theorem
§ 5. Dirichlet's theorem
Chapter XI
The prime number theorem
§ 1. The non-vanishing of ζ(1 + it)
§ 2. The Wiener-Ikehara theorem
§ 3. The prime number theorem
A list of books
Notes
Subject index
Chapter I
The unique factorization theorem
§ 1. Primes. We assume as known the positive integers 1,2,3, ... ,
the negative integers - 1, - 2, - 3, ... , and zero, which we reckon as
an integer. By the non-negative integers we mean the positive integers
together with zero. We assume as known the elementary arithmetical
operations on integers.
An integer $a$ is said to be divisible by an integer $b \neq 0$, if there exists an integer $c$, such that $a = bc$. We then say that $b$ divides $a$, or $b$ is a divisor of $a$, and indicate this by writing $b \mid a$. We also say that $a$ is an integral multiple or just a multiple of $b$. We write $b \nmid a$ to indicate that $b$ does not divide $a$. The following propositions are easily verified:
if $b \mid a$, and $a > 0$, and $b > 0$, then $1 \leq b \leq a$;
if $b \mid a$, and $c \mid b$, then $c \mid a$;
if $b \mid a$, and $c \neq 0$, then $bc \mid ac$;
if $c \mid a$, and $c \mid b$, then $c \mid (ma + nb)$, for all integers $m$ and $n$.
Given two integers $a$ and $b$, $b \neq 0$, there exist unique integers $q$ and $r$, such that $a = bq + r$, where $0 \leq r < |b|$. We call $q$ the quotient, and $r$ the remainder in the division of $a$ by $b$. If $b \mid a$, then $r = 0$.
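The division statement above can be tried out numerically. The following is a minimal sketch, not part of the original text; the function name `quotient_remainder` is an assumption made for illustration. It returns the pair $q$, $r$ with $0 \leq r < |b|$, also when $a$ or $b$ is negative.

```python
def quotient_remainder(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < |b| (hypothetical helper)."""
    if b == 0:
        raise ValueError("b must be non-zero")
    r = a % abs(b)      # the modulus is positive, so Python gives 0 <= r < |b|
    q = (a - r) // b    # exact division, because b divides a - r
    return q, r


for a, b in [(7, 2), (7, -2), (-7, 2), (6, 3)]:
    q, r = quotient_remainder(a, b)
    print(f"{a} = ({b})*({q}) + {r}")
```

The remainder is taken with respect to $|b|$ because Python's built-in `divmod` would return a negative remainder when $b < 0$, which does not match the convention used here.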
An integer p, where p> 1, is a prime number, or a prime, if its only
positive divisors are 1 and p. An integer greater than 1, which is not a
prime, is called composite.
In this chapter we shall prove that every integer greater than 1
can be represented as a product of primes, and that such a representation
as a product is unique, except for the order of the factors. We shall also
prove that there exist infinitely many primes.
§ 2. The unique factorization theorem. We begin with the following simple
THEOREM 1. If $n$ is an integer greater than 1, then $n$ is a product of primes.
PROOF. Either $n$ is a prime, or it is composite. In the former case, there is nothing more to prove. If $n$ is composite, then, by definition, there exist integers $d$, such that $1 < d < n$, and $d \mid n$. Let $m$ be the least of such divisors. Then $m$ must be a prime, for otherwise there exists an integer $k$, such that $1 < k < m$, and $k \mid m$. That would imply that $k \mid n$, and $1 < k < m$, which contradicts the definition of $m$. Thus $m$ is a prime $p_1$, say. We then write $n = p_1 \cdot r$, where $1 < r < n$, and repeat the same process with $r$, to obtain $n = p_1 \cdot p_2 \cdot s$, where $p_2 \geq p_1$, and $1 \leq s < r < n$. This process clearly breaks off after a finite number of steps, since there are only finitely many integers between 1 and $n$. We therefore obtain
$$n = p_1 p_2 \cdots p_j, \tag{1}$$
which concludes the proof.
We note, in passing, that if $n = ab$, then $a$ and $b$ cannot both be greater than $\sqrt{n}$. It follows that any composite integer $n$ has a prime factor $p$, such that $p \leq \sqrt{n}$.
By grouping together the equal primes in the representation (1), and changing the indices, if necessary, we can rewrite (1) as
$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \tag{2}$$
where $p_1 < p_2 < \cdots < p_k$, and $a_i > 0$, for $i = 1, 2, \ldots, k$. This is called the standard form of $n$.
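The procedure in the proof of Theorem 1 (split off the least divisor greater than 1, then repeat with the cofactor) can be sketched in code. This is only an illustration under stated assumptions: the names `least_divisor`, `prime_factors` and `standard_form` are hypothetical, and the search for the least divisor stops at $\sqrt{n}$, using the remark that a composite $n$ has a prime factor not exceeding $\sqrt{n}$.

```python
from collections import Counter


def least_divisor(n: int) -> int:
    """Least divisor d > 1 of n; it is a prime, as in the proof of Theorem 1."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def prime_factors(n: int) -> list[int]:
    """Primes p_1 <= p_2 <= ... whose product is n, for n > 1."""
    factors = []
    while n > 1:
        p = least_divisor(n)
        factors.append(p)
        n //= p
    return factors


def standard_form(n: int) -> list[tuple[int, int]]:
    """Pairs (p_i, a_i) with p_1 < p_2 < ... and n equal to the product of p_i**a_i."""
    return sorted(Counter(prime_factors(n)).items())


print(prime_factors(360))   # [2, 2, 2, 3, 3, 5]
print(standard_form(360))   # [(2, 3), (3, 2), (5, 1)]
```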
We are now in a position to prove the unique factorization theorem, which is also known as the fundamental theorem of arithmetic (Theorem 2).
THEOREM 2. The standard form of an integer $n$, which is greater than 1, is unique.
We shall give three proofs of this theorem. The first proof uses only Theorem 1. The second is connected with the solution of linear equations in integers, while the third makes use of the theory of Farey sequences.
FIRST PROOF OF THEOREM 2. The standard form of a prime is clearly unique.
Suppose, if possible, that some positive integers $> 1$ have two different standard forms. Let $N$ be the smallest such integer, with
$$N = p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_m,$$
where the $p$'s and $q$'s are primes. Every $p$ is distinct from every $q$, since any prime common to both the representations would divide $N$ to yield an integer $N' < N$ with the same property as $N$, which is impossible by the definition of $N$.
We may assume that
$$p_1 \leq p_2 \leq \cdots \leq p_k, \qquad q_1 \leq q_2 \leq \cdots \leq q_m.$$
Now $p_1 \neq q_1$. Let us suppose, as we may, that $p_1 < q_1$. We define the number
$$P = p_1 q_2 \cdots q_m.$$
Since $p_1 \mid P$, and $p_1 \mid N$, it follows that $p_1 \mid (N - P)$, where
$$N - P = (q_1 - p_1) q_2 \cdots q_m > 1.$$
Therefore we can write
$$N - P = p_1 t_1 t_2 \cdots t_h, \tag{3}$$
where the $t_i$ are primes for $i = 1, 2, \ldots, h$. We can also write $q_1 - p_1$ as a product of primes, say
$$q_1 - p_1 = r_1 r_2 \cdots r_s,$$
if $q_1 - p_1 > 1$. Then we get
$$N - P = r_1 r_2 \cdots r_s \, q_2 \cdots q_m \tag{4}$$
as another representation of $N - P$ as a product of primes. We have seen that none of the $p$'s is equal to a $q$. In particular, $p_1$ is not equal to any $q$. Nor is $p_1$ equal to any $r$, for it is clear that $p_1 \nmid (q_1 - p_1)$, so that no factorization of $q_1 - p_1$ can contain $p_1$. Thus the integer $N - P$ has two factorizations, namely (3) and (4), which are distinct, since only one of them contains $p_1$. This is the case even if $q_1 - p_1 = 1$. But $1 < N - P < N$, which contradicts the minimality of $N$. Hence there exists no integer $n > 1$ with more than one standard form.
§ 3. A second proof of Theorem 2. This is based on the solution of certain linear equations in integers. We need some preparation.
Let $a$ and $b$ denote integers, not both zero. Their greatest common divisor, denoted by $(a, b)$, is defined to be the largest positive integer which divides both $a$ and $b$. If $(a, b) = 1$, we say that $a$ is prime to $b$, or that $a$ and $b$ are relatively prime. We shall see that if $(a, b) = d$, the equation $ax + by = d$ has a solution in integers $x, y$. It follows from this that if $p$ is a prime, and $p \mid ab$, then $p \mid a$ or $p \mid b$, and this, in turn, implies the unique factorization theorem.
A non-empty set of integers $S$ with the property
$$m \in S \ \text{and} \ n \in S \ \Rightarrow \ m - n \in S,$$
is called a module. It follows from the definition that if $m, n \in S$, then
$$0 = m - m \in S, \qquad -n = 0 - n \in S, \qquad m + n = m - (-n) \in S.$$
More generally, if $a \in S$, $b \in S$, then $ax + by \in S$, where $x$ and $y$ are integers. If a module contains only 0, we call it the trivial module. A non-trivial module obviously contains infinitely many positive, and negative, integers. We can say a little more.
THEOREM 3. Every non-trivial module $S$ consists of all integral multiples of a positive integer.
PROOF. Since $S$ is not the trivial module, it contains some positive integers. Let $d$ be the smallest such integer. Then $S$ contains all integral multiples of $d$. In order to show that these are the only elements of $S$, take any $n \in S$. We can write $n = dk + c$, where $k$ and $c$ are integers, and $0 \leq c < d$. Since $d \in S$, it follows that $dk \in S$. Since $n \in S$, we have $n - dk \in S$, that is $c \in S$. But $c < d$, and $d$ is the smallest positive integer in $S$. Hence $c = 0$. Therefore $n$ is an integral multiple of $d$.
From this we deduce
THEOREM 4. If $a$ and $b$ are given integers, the module $S = \{ax + by\}$, where $x$ and $y$ are integers, is the set of all integral multiples of $d = (a, b)$.
PROOF. It is easy to see that the set $S$ is a module. By Theorem 3 we know that $S$ is the set of all integral multiples of some positive integer $e$. Therefore $e$ divides all elements of $S$; in particular, $e \mid a$, and $e \mid b$. Since $d$ is the greatest common divisor of $a$ and $b$, we must have $e \leq d$. On the other hand, $d \mid (ax + by)$ for all integers $x, y$, so that $d$ divides every element of $S$. In particular, $d \mid e$. Hence $d \leq e$. Thus $e = d$, and the result follows.
It is now clear that the following theorem holds:
THEOREM 5. The equation $ax + by = n$ is soluble in integers $x$ and $y$ if and only if $(a, b) \mid n$.
COROLLARY 1. If $(a, b) = d$, then $ax + by = d$ is soluble in integers $x$ and $y$. In other words, the greatest common divisor of $a$ and $b$ is a linear combination of these integers with integer coefficients.
COROLLARY 2. Any common divisor of $a$ and $b$ divides $(a, b)$.
These results lead to
THEOREM 6 (EUCLID). If $a \mid bc$, and $(a, b) = 1$, then $a \mid c$.
PROOF. Since $(a, b) = 1$, there exist integers $x$ and $y$, such that $ax + by = 1$. If we multiply by $c$, we get $acx + bcy = c$, and since $a \mid bc$, it follows that $a \mid (acx + bcy)$, or $a \mid c$.
COROLLARY. If $p$ is a prime, and $p \mid \prod_{i=1}^{r} p_i$, where $p_i$ is a prime for $i = 1, 2, \ldots, r$, then $p = p_i$ for at least one $i$.
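Corollary 1 of Theorem 5 asserts that $(a, b)$ is a linear combination $ax + by$, but the module argument does not exhibit $x$ and $y$. One standard way to compute them is the extended Euclidean algorithm, which is not the method of the text; the sketch below swaps it in for illustration, and the name `extended_gcd` is an assumption.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) >= 0 and a*x + b*y == d.

    Extended Euclidean algorithm (an illustrative substitute for the
    module argument of Theorems 3 and 4).
    """
    # Invariant: a0*x0 + b0*y0 == a and a0*x1 + b0*y1 == b,
    # where a0, b0 are the original arguments.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:  # normalise so that the divisor returned is non-negative
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0


d, x, y = extended_gcd(240, 46)
print(d, x, y, 240 * x + 46 * y)
```

By Theorem 5, a solution of $ax + by = n$ exists exactly when $d \mid n$, and one is then obtained by multiplying $x$ and $y$ by $n/d$.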
We are now in a position to give
A SECOND PROOF OF THEOREM 2. Suppose that an integer $N$ has two different standard forms,
$$N = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = q_1^{b_1} q_2^{b_2} \cdots q_r^{b_r}.$$
Then $p_1 \mid q_1^{b_1} q_2^{b_2} \cdots q_r^{b_r}$, hence, by the Corollary of Theorem 6, $p_1 = q_i$ for some $i$, $1 \leq i \leq r$. In the same way we see that every $p$ equals some $q$, and every $q$ equals some $p$. Therefore $k = r$, and since both forms are arranged in ascending order, we have
$$p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = p_1^{b_1} p_2^{b_2} \cdots p_k^{b_k},$$
with $p_1 < p_2 < \cdots < p_k$. We shall see that $a_i = b_i$ for $i = 1, 2, \ldots, k$. For if $a_i > b_i$ for some $i$, we can divide both sides by $p_i^{b_i}$ and obtain
$$p_1^{a_1} \cdots p_i^{a_i - b_i} \cdots p_k^{a_k} = p_1^{b_1} \cdots p_{i-1}^{b_{i-1}} p_{i+1}^{b_{i+1}} \cdots p_k^{b_k},$$
where $p_i$ divides the left-hand side, but not the right-hand side, which is impossible. Similarly it is impossible that $a_i < b_i$. Hence $a_i = b_i$ for all $i$, and the standard form is unique.
§ 4. Greatest common divisor and least common multiple. Related to the greatest common divisor of two integers $a$ and $b$, defined in § 3, is the least common multiple.
DEFINITION. The least common multiple $\{a, b\}$ of two integers $a$ and $b$, where $ab \neq 0$, is the smallest positive integer which is divisible by both $a$ and $b$.
The relationship between $(a, b)$ and $\{a, b\}$, where $ab > 0$, is expressed by the identity
$$ab = (a, b) \cdot \{a, b\}. \tag{5}$$
To prove this, consider the integer $\mu = ab/(a, b)$. Since $(a, b) \mid b$, $\mu$ is an integral multiple of $a$. Similarly $\mu$ is an integral multiple of $b$. Thus $\mu$ is a common multiple of $a$ and $b$. Let $\nu$ be an integer which is some other common integral multiple of $a$ and $b$, and consider the number
$$\frac{\nu}{\mu} = \frac{\nu \cdot (a, b)}{ab}.$$
We know that $(a, b) = ax + by$ for some integers $x$ and $y$. Hence
$$\frac{\nu}{\mu} = \frac{\nu \cdot (ax + by)}{ab} = \frac{\nu x}{b} + \frac{\nu y}{a}.$$
But $\nu/a$ and $\nu/b$ are integers, hence $\nu/\mu$ is an integer. Thus any common integral multiple of $a$ and $b$ is an integral multiple of $\mu$. Hence $\mu$ is their least common multiple, and
$$\mu = \frac{ab}{(a, b)} = \{a, b\}.$$
Incidentally we have shown that the least common multiple of $a$ and $b$ divides any common multiple of $a$ and $b$.
If $a$ is a positive integer, we can write
$$a = \prod_p p^{\alpha}, \qquad \alpha \geq 0,$$
where the product extends over all primes $p$, and $\alpha$ is a non-negative integer which is zero except for finitely many $p$. If a prime $p$ does not divide $a$, then the corresponding exponent $\alpha$ is zero. Similarly we have
$$b = \prod_p p^{\beta}, \qquad \beta \geq 0.$$
It is easy to see that
$$(a, b) = \prod_p p^{\min[\alpha, \beta]}. \tag{6}$$
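Identity (5) can be checked numerically for small positive integers. In the sketch below the least common multiple is computed directly from its definition (the smallest positive common multiple), so that the check is not circular; `lcm_by_definition` is a hypothetical helper name.

```python
from math import gcd


def lcm_by_definition(a: int, b: int) -> int:
    """Smallest positive integer divisible by both a and b (a, b positive)."""
    multiple = a
    while multiple % b != 0:   # run through a, 2a, 3a, ... until b divides it
        multiple += a
    return multiple


for a, b in [(4, 6), (12, 18), (7, 13), (10, 10)]:
    d, m = gcd(a, b), lcm_by_definition(a, b)
    print(a, b, d, m, a * b == d * m)
```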
§ 5. Farey sequences. If $h$ and $k$ are integers, and $k > 0$, we call $h/k$ a fraction, with numerator $h$, and with denominator $k$.
A fraction $h/k$ is called irreducible, or reduced, if $(h, k) = 1$. A fraction $h/k$ is called proper, if $0 \leq h/k \leq 1$.
A Farey sequence of order $n$, where $n$ is a positive integer, is the sequence $F_n$ of all irreducible, proper fractions $h/k$, with $1 \leq k \leq n$, arranged in non-decreasing order. For example, $F_5$ is the sequence
$$\frac{0}{1}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{3}{5}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \frac{1}{1}.$$
A Farey fraction is a term in a Farey sequence of some order. We note that every rational number $m/n$, such that $0 \leq m/n \leq 1$, is equal to a Farey fraction.
It follows from the unique factorization theorem (Theorem 2) that a reduced fraction is unique. In other words, two reduced fractions which are equal must be identical. Since we do not wish to use Theorem 2, however, we have to allow for the possibility that two Farey fractions may be equal without being identical. In that case, we arrange them in increasing order of their numerators. The following theorem rules out such a possibility in fact, and prepares the ground for a third proof of Theorem 2.
THEOREM 7 (FAREY-CAUCHY). If $l/m$ is the immediate successor of $h/k$ in the Farey sequence $F_N$, then $kl - hm = 1$.
PROOF. The result is seen to be true, by actual verification, for $F_N$, $1 \leq N \leq 5$. We shall assume it true for $F_N$, and prove it for $F_{N+1}$.
Let $a/b$ be a reduced proper fraction which does not belong to $F_N$. Then $b \geq N + 1$, and $a/b$ must lie between some two consecutive fractions $h/k$ and $l/m$ of $F_N$, say
$$\frac{h}{k} \leq \frac{a}{b} \leq \frac{l}{m},$$
equality being allowed, since the uniqueness of reduction of a fraction is not assumed.
Define the integers $\lambda$ and $\mu$ as follows:
$$\lambda = ka - hb, \qquad \mu = bl - am.$$
Then $\lambda \geq 0$, $\mu \geq 0$, and $\lambda + \mu > 0$, since we have assumed the theorem to be true for $F_N$, to which $h/k$ and $l/m$ belong. Further
$$\lambda l + \mu h = kal - ham = a(kl - hm) = a,$$
since $kl - hm = 1$ by the induction hypothesis on $F_N$. Similarly
$$\lambda m + \mu k = b, \tag{7}$$
and $(\lambda, \mu) = 1$, since $(a, b) = 1$. Thus, if $h/k \leq a/b \leq l/m$, and $(a, b) = 1$, then
$$\frac{a}{b} = \frac{\lambda l + \mu h}{\lambda m + \mu k}, \qquad \lambda \geq 0, \quad \mu \geq 0, \quad \lambda + \mu > 0, \quad (\lambda, \mu) = 1.$$
Conversely, if $\lambda$ and $\mu$ are integers, such that $\lambda \geq 0$, $\mu \geq 0$, $\lambda + \mu > 0$, $(\lambda, \mu) = 1$, and we define $a, b$ by $a = \lambda l + \mu h$, $b = \lambda m + \mu k$, then uniquely $\lambda = ka - hb$, $\mu = bl - am$, and $(a, b) = 1$, so the fraction $a/b$ is reduced, and $h/k \leq a/b \leq l/m$, since $kl - hm = 1$. Thus $a/b$ belongs to $F_M$, for some $M$.
Since $k > 0$, $m > 0$, $(\lambda, \mu) = 1$, we also see that $b \leq m + k$ exactly in the three cases $\lambda, \mu = 0, 1$; $1, 1$; $1, 0$; giving $a, b = h, k$; $l + h, m + k$; $l, m$.
Now $\lambda \neq 0$, for if $\lambda = 0$, then $a/b = (\mu h)/(\mu k)$, which is not reduced unless $\mu = 1$, in which case $b = k$ by (7), and that contradicts the assumption that $b \geq N + 1 > k$. Similarly $\mu \neq 0$. Hence $b \leq m + k$ only if $\lambda = \mu = 1$. Now $b \geq N + 1$, and if $(a/b) \in F_{N+1}$, then $b = N + 1$. Further $m + k \geq N + 1$, since
$$\frac{h}{k} < \frac{h + l}{k + m} < \frac{l}{m},$$
$h/k$ and $l/m$ being consecutive terms in $F_N$. It follows that if $b = N + 1$, then $\lambda = 1$ and $\mu = 1$. Hence
$$\frac{a}{b} \in F_{N+1} \ \Rightarrow \ a = h + l, \quad b = k + m, \quad \frac{a}{b} = \frac{h + l}{k + m},$$
and this fraction $a/b$ clearly satisfies the theorem with respect to its neighbours $h/k$ and $l/m$, since $kl - hm = 1$, by the induction hypothesis on $F_N$. Thus the theorem holds for $F_{N+1}$ if it holds for $F_N$. Since we know that it does hold for $F_1$, it holds for all $F_n$.
It follows from Theorem 7 that a reduced fraction is unique.
DEFINITION. The fraction $(h + l)/(k + m)$ is called the mediant of the fractions $h/k$ and $l/m$.
Implicit in the proof of Theorem 7 is the result that the mediant of two Farey fractions is a Farey fraction, as well as
THEOREM 8. The fractions which belong to $F_{N+1}$ but not to $F_N$ are mediants of the neighbouring fractions in $F_N$.
A consequence of Theorem 7 is
THEOREM 9. If $h/k$, $h''/k''$, $h'/k'$ are successive fractions belonging to the same Farey sequence, then $h''/k'' = (h + h')/(k + k')$.
PROOF. By Theorem 7, we have $kh'' - hk'' = 1$, and $k''h' - h''k' = 1$, and by subtraction we get the required relation.
THEOREM 10. If $h/k$ and $l/m$ are two successive fractions in a Farey sequence $F_N$, then $k + m \geq N + 1$.
PROOF. Since
$$\frac{h}{k} < \frac{h + l}{k + m} < \frac{l}{m},$$
the mediant of $h/k$ and $l/m$ does not belong to $F_N$, hence $k + m > N$.
Finally we prove
THEOREM 11. If $N > 1$, no two successive fractions in $F_N$ have the same denominator.
PROOF. Let $k > 1$. If $h'/k$ is the immediate successor of $h/k$ in $F_N$, then $h + 1 \leq h' < k$, and we would have
$$\frac{h}{k} < \frac{h}{k - 1} < \frac{h + 1}{k} \leq \frac{h'}{k}.$$
Thus $h/(k - 1)$ would lie between $h/k$ and $h'/k$ in $F_N$, which contradicts our assumption about $h/k$ and $h'/k$.
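Theorems 7, 9, 10 and 11 lend themselves to a direct numerical check. The sketch below builds $F_N$ by brute force from the definition and tests the statements for each pair or triple of neighbours; the function names `farey` and `check_neighbours` are assumptions made for this illustration. It relies on Python's `fractions.Fraction`, which stores every fraction in reduced form.

```python
from fractions import Fraction
from math import gcd


def farey(n: int) -> list[Fraction]:
    """All reduced proper fractions h/k with 1 <= k <= n, in increasing order."""
    terms = {Fraction(h, k)
             for k in range(1, n + 1)
             for h in range(0, k + 1)
             if gcd(h, k) == 1}
    return sorted(terms)


def check_neighbours(n: int) -> bool:
    """Check Theorems 7, 9, 10 and 11 on F_n; raises AssertionError on failure."""
    seq = farey(n)
    for left, right in zip(seq, seq[1:]):
        h, k = left.numerator, left.denominator
        l, m = right.numerator, right.denominator
        assert k * l - h * m == 1          # Theorem 7
        assert k + m >= n + 1              # Theorem 10
        if n > 1:
            assert k != m                  # Theorem 11
    for first, middle, last in zip(seq, seq[1:], seq[2:]):
        mediant = Fraction(first.numerator + last.numerator,
                           first.denominator + last.denominator)
        assert middle == mediant           # Theorem 9
    return True


print([str(x) for x in farey(5)])
print(all(check_neighbours(n) for n in range(1, 12)))
```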
THIRD PROOF OF THEOREM 2. We can now apply our knowledge of Farey sequences to prove that the equation $ax + by = 1$, where $(a, b) = 1$, is soluble in integers $x, y$. This implies, as we have already seen, Theorem 2.
Since the conclusion is trivially true when $ab = 0$, or when $a = b$, we shall suppose that $b > a > 0$, and $(a, b) = 1$. Consider the fraction $a/b$. It occurs as a term in a Farey sequence, for example in $F_b$. Let $h/k$ be its immediate predecessor in that sequence. Then by Theorem 7 we have $ka - hb = 1$, so that $x = k$ and $y = -h$ give a solution of our equation.
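The construction in the third proof can be carried out mechanically. The sketch below, with the hypothetical name `solve_via_farey`, finds the immediate predecessor $h/k$ of $a/b$ in $F_b$ by taking, for each denominator $k \leq b$, the largest fraction with that denominator lying strictly below $a/b$, and then keeping the largest of these. It is a brute-force illustration, not an efficient method.

```python
from fractions import Fraction
from math import gcd


def solve_via_farey(a: int, b: int) -> tuple[int, int]:
    """Return (x, y) with a*x + b*y == 1, assuming 0 < a < b and (a, b) = 1."""
    if not (0 < a < b) or gcd(a, b) != 1:
        raise ValueError("need 0 < a < b and (a, b) = 1")
    # For each denominator k, (a*k - 1) // b is the largest numerator h
    # with h/k < a/b. The maximum over k <= b is the predecessor in F_b.
    predecessor = max(Fraction((a * k - 1) // b, k) for k in range(1, b + 1))
    h, k = predecessor.numerator, predecessor.denominator
    return k, -h   # Theorem 7 gives k*a - h*b == 1


x, y = solve_via_farey(3, 7)
print(x, y, 3 * x + 7 * y)
```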
§ 6. The infinitude of primes. We have obtained three different proofs
of the unique factorization theorem. We shall now show that there are
infinitely many primes.
THEOREM 12 (EUCLID). The number of primes is infinite.
We shall give two different proofs of this theorem, the first by Euclid,
and the second by G. Pólya. A third proof, due to Euler, is given in
Chapter VII, § 1.
FIRST PROOF OF THEOREM 12 (EUCLID). Let 2,3,5, ... ,p be the set of
all primes up to p, and consider the integer
q=(2·3·5 ... p)+1.
It is not divisible by any of the primes up to p. Since q> 1, either q is
itself a prime greater than p, or is divisible by a prime greater than p.
In either case, there exists a prime greater than p. Hence the number
of primes is infinite.
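If $p_n$ denotes the $n$th prime, it follows from this argument that
$$p_m \,\Big|\, \prod_{i=1}^{n} p_i + 1$$
for an $m > n$. Hence
$$p_{n+1} \leq p_m \leq \prod_{i=1}^{n} p_i + 1 < p_n^n + 1 \quad \text{for } n > 1.$$
Actually the argument can be made to yield a little more. One can prove that
$$p_n \leq 2^{2^{n-1}}, \qquad n \geq 1,$$
with $p_n < 2^{2^{n-1}}$ for $n > 1$. For suppose that
$$p_1 \leq 2, \quad p_2 \leq 2^2, \quad p_3 \leq 2^4, \quad \ldots, \quad p_n \leq 2^{2^{n-1}}.$$
Then
$$p_{n+1} \leq p_1 p_2 \cdots p_n + 1 \leq 2^{1 + 2 + \cdots + 2^{n-1}} + 1 = 2^{2^n - 1} + 1 < 2^{2^n},$$
and we have the required result by induction.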
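Euclid's construction and the two estimates for $p_n$ can be checked for the first few primes. This is a small sketch with hypothetical helper names (`first_primes`, `euclid_number`); it also shows that the integer $q$ need not itself be prime.

```python
def first_primes(count: int) -> list[int]:
    """The first `count` primes, by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def euclid_number(primes: list[int]) -> int:
    """The integer q = (product of the given primes) + 1."""
    q = 1
    for p in primes:
        q *= p
    return q + 1


primes = first_primes(9)
for n in range(1, 9):
    q = euclid_number(primes[:n])
    assert all(q % p == 1 for p in primes[:n])   # q is divisible by none of them
    assert primes[n] <= q                        # p_{n+1} <= p_1 ... p_n + 1
    assert primes[n - 1] <= 2 ** (2 ** (n - 1))  # p_n <= 2^(2^(n-1))

print(euclid_number([2, 3, 5, 7, 11, 13]))       # 30031, which equals 59 * 509
```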
Pólya's proof of Theorem 12 uses a property of Fermat numbers. A Fermat number $f_n$ is an integer of the form $f_n = 2^{2^n} + 1$, $n \geq 1$. We shall see that Theorem 12 is a consequence of
THEOREM 13 (PÓLYA). Any two different Fermat numbers are relatively prime.
PROOF. Let $f_n$ and $f_{n+k}$ ($k > 0$) be any two Fermat numbers. Suppose that $m$ is a positive integer, such that $m \mid f_n$, and $m \mid f_{n+k}$. Setting $x = 2^{2^n}$, we have
$$\frac{f_{n+k} - 2}{f_n} = \frac{x^{2^k} - 1}{x + 1} = x^{2^k - 1} - x^{2^k - 2} + \cdots - 1,$$
so that $f_n \mid (f_{n+k} - 2)$. It follows that $m \mid (f_{n+k} - 2)$. Since $m$ also divides $f_{n+k}$, this implies that $m \mid 2$. But Fermat numbers are odd. Therefore $m = 1$, which proves Theorem 13.
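Both steps of the proof of Theorem 13 can be observed numerically: $f_n$ divides $f_{n+k} - 2$, and the greatest common divisor of two different Fermat numbers is 1. A minimal sketch follows; the function name `fermat` is an assumption.

```python
from math import gcd


def fermat(n: int) -> int:
    """The Fermat number f_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


for n in range(1, 6):
    for k in range(1, 4):
        assert (fermat(n + k) - 2) % fermat(n) == 0   # f_n | (f_{n+k} - 2)
        assert gcd(fermat(n), fermat(n + k)) == 1     # Theorem 13

print([fermat(n) for n in range(1, 5)])   # [5, 17, 257, 65537]
```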
SECOND PROOF OF THEOREM 12 (PÓLYA). It follows from Theorem 13 that each of the Fermat numbers $f_1, f_2, \ldots, f_n$ is divisible by an odd prime which does not divide the others. Hence there are at least $n$ odd primes not exceeding $f_n$. Consequently there are infinitely many primes.
Further, if we allow $n = 0$, with $f_0 = 3$, then since $p_1 = 2$, and there are at least $n$ odd primes not exceeding $f_n$ for $n \geq 1$, we obtain $p_{n+2} \leq f_n$, where $p_n$ denotes the $n$th prime. That is
$$p_{n+2} \leq 2^{2^n} + 1,$$
which is better than the previous estimate.
Fermat observed that
$$f_1 = 5, \quad f_2 = 17, \quad f_3 = 257, \quad f_4 = 65537$$
are all primes, and conjectured that all $f_n$ are primes. This was disproved, however, by Euler, who showed that $f_5$ is divisible by 641. A simple proof, due to G. T. Bennett, runs as follows:
$$f_5 = 2^{2^5} + 1 = 2^{32} + 1 = (2 \cdot 2^7)^4 + 1.$$
Set $2^7 = a$, and $5 = b$. Then $f_5 = (2a)^4 + 1 = 2^4 a^4 + 1$. Now $2^4 = 1 + 3b$, or $2^4 = 1 + b(a - b^3)$. Hence
$$f_5 = (1 + ab - b^4) a^4 + 1 = (1 + ab)\left[a^4 + (1 - ab)(1 + a^2 b^2)\right],$$
which implies that $1 + ab \ (= 641)$ divides $f_5$.
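Each step of Bennett's argument is an identity between integers, so it can be confirmed by direct evaluation. The following sketch simply evaluates both sides with $a = 2^7$ and $b = 5$; the variable names are chosen for this illustration.

```python
a, b = 2 ** 7, 5
f5 = 2 ** (2 ** 5) + 1

step = 1 + b * (a - b ** 3)                       # should equal 2^4
expanded = (1 + a * b - b ** 4) * a ** 4 + 1      # first form of f_5
factored = (1 + a * b) * (a ** 4 + (1 - a * b) * (1 + a ** 2 * b ** 2))

print(step == 2 ** 4, expanded == f5, factored == f5)
print(1 + a * b, f5 % (1 + a * b))                # 641 and remainder 0
```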
It does not seem to be known whether any Fermat numbers, other
than the first four, are primes.
Chapter II
Congruences
§ 1. Residue classes. Let $a$, $b$, and $m$ be integers, and $m > 0$. We say that $a$ is congruent to $b$ modulo $m$, if $m \mid (a - b)$. We express this in symbols as: $a \equiv b \pmod{m}$, and call it a congruence. If $m \nmid (a - b)$, we say that $a$ is incongruent to $b$ modulo $m$, and write $a \not\equiv b \pmod{m}$.
The congruence relation is an equivalence relation, for it is reflexive, since $a \equiv a \pmod{m}$; symmetric, since $a \equiv b \pmod{m}$ implies $b \equiv a \pmod{m}$; and transitive, since $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$ imply $a \equiv c \pmod{m}$. Thus the relation "$\equiv \pmod{m}$" partitions the integers into disjoint equivalence classes $A, B, C, \ldots$, such that two integers are congruent modulo $m$ if and only if they lie in the same class. These classes are called residue classes modulo $m$.
Clearly the integers $0, 1, \ldots, m - 1$ all lie in different residue classes. Since any integer $n$ can be written as $n = qm + r$, $0 \leq r \leq m - 1$, every integer is congruent modulo $m$ to one of the integers $0, 1, \ldots, m - 1$. Therefore there are exactly $m$ residue classes modulo $m$, and the integers $0, 1, \ldots, m - 1$ form a set of representatives of these classes.
Congruences can be added, subtracted, or multiplied, like ordinary equalities. If $a \equiv b \pmod{m}$, and $c \equiv d \pmod{m}$, then $a + c \equiv b + d \pmod{m}$, $a - c \equiv b - d \pmod{m}$, and $ac \equiv bd \pmod{m}$. For, if $m \mid (a - b)$, and $m \mid (c - d)$, then $m \mid \{(a - b) \pm (c - d)\}$; further $m \mid (a - b)c$, so that $ac \equiv bc \pmod{m}$; and $m \mid (c - d)b$, so that $bc \equiv bd \pmod{m}$; and since the congruence relation is transitive, we have $ac \equiv bd \pmod{m}$.
In general one cannot divide congruences. We have $2 \equiv 12 \pmod{10}$, but $1 \not\equiv 6 \pmod{10}$.
Let $A$ and $B$ be two residue classes. Then, according to the above rules, if $a$ is an arbitrary element of $A$, and $b$ of $B$, then $a + b$ always lies in the same residue class, which we call the sum $A + B$. Likewise we use the notations $A - B$ and $A \cdot B$, and speak of the difference, or product, of two residue classes.
It is easy to see that the residue classes modulo $m$ form an abelian group with respect to addition. The zero element of this group is the class which contains all integral multiples of $m$, and the inverse of a class $A$ is the class $A'$ which contains the negatives of all members of $A$.
The congruence
$$ax \equiv c \pmod{m}$$
is equivalent to the linear equation
$$ax - my = c,$$
and, by Theorem 5 of Chapter I, we see that it has a solution in integers $x, y$ if $(a, m) = 1$. The solution is unique, up to congruence, for if
$$ax_1 \equiv c \pmod{m}, \quad \text{and} \quad ax_2 \equiv c \pmod{m},$$
then $a(x_1 - x_2) \equiv 0 \pmod{m}$, or $m \mid a(x_1 - x_2)$. But since $(a, m) = 1$, this implies that $m \mid (x_1 - x_2)$, or $x_1 \equiv x_2 \pmod{m}$.
Therefore, if $x_0, y_0$ is a particular solution of the linear equation
$$ax + by = n, \qquad (a, b) = 1,$$
the general solution is given by $x = x_0 - bt$, $y = y_0 + at$, where $t$ is an integer.
We can also express the result which we have just obtained for congruences by saying that if $A$, $C$ and $X$ are residue classes modulo $m$, the equation $AX = C$ has a single solution $X$, if the elements of $A$ are prime to $m$.
Those residue classes modulo $m$ whose elements are prime to $m$ are called prime residue classes. They form an abelian group with respect to multiplication, the unit class being the one which contains the integer 1. Each prime residue class has an inverse, for if $(a, m) = 1$, there exists an integer $a'$ such that $aa' \equiv 1 \pmod{m}$.
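The unique solution of $ax \equiv c \pmod{m}$ for $(a, m) = 1$ can be computed by multiplying $c$ by an inverse $a'$ of $a$. The sketch below uses Python's built-in three-argument `pow` with exponent $-1$, available from Python 3.8, to obtain $a'$; the function name `solve_linear_congruence` is an assumption.

```python
from math import gcd


def solve_linear_congruence(a: int, c: int, m: int) -> int:
    """Return the x in 0..m-1 with a*x ≡ c (mod m), assuming (a, m) = 1."""
    if m <= 0 or gcd(a, m) != 1:
        raise ValueError("need m > 0 and (a, m) = 1")
    a_inverse = pow(a, -1, m)     # a * a_inverse ≡ 1 (mod m), Python 3.8+
    return (a_inverse * c) % m


x = solve_linear_congruence(3, 4, 7)
print(x, (3 * x) % 7)

# Uniqueness up to congruence: exactly one residue class satisfies the congruence.
print([t for t in range(7) if (3 * t - 4) % 7 == 0])
```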
Let us consider the additive group of all residue classes modulo a
prime p. With the exception of the zero class, they are all prime residue
classes, hence form also a multiplicative abelian group. The distributive
law A(B + C) = A B + A C is a simple consequence of the distributive
law for integers. We therefore have
THEOREM 1. The residue classes modulo a prime $p$ form a field of $p$
elements.
RESIDUE SYSTEMS. We have distinguished the prime residue classes
modulo m from among all the m residue classes modulo m.
A complete residue system modulo m consists of one representative of
each residue class. Thus a set of m integers is a complete residue system
modulo m only if its members are pairwise incongruent modulo m. On
the other hand, a complete prime residue system modulo m consists of
one representative of each prime residue class modulo m.
For example, the integers 0,1, ... , 7 form a complete residue system
(mod 8), while 1,3,5 and 7 form a complete prime residue system (mod 8).
EULER'S FUNCTION $\varphi$. Euler's function $\varphi$ is defined for all positive integers $n$ by the relation: $\varphi(n)$ equals the number of integers among $1, 2, \ldots, n$ which are prime to $n$.
It follows from the definition that $\varphi(n)$ is also the number of prime residue classes modulo $n$.
§ 2. Theorems of Euler and of Fermat. If $a_1, a_2, \ldots, a_m$ is a complete residue system modulo $m$, and if $k$ is an integer prime to $m$, then the set $ka_1, ka_2, \ldots, ka_m$ is also a complete residue system modulo $m$, for these $m$ integers are easily seen to be pairwise incongruent modulo $m$.
More generally, if $(k, m) = 1$, and $h$ is some integer, the set $ka_i + h$ ($i = 1, 2, \ldots, m$) is also a complete residue system modulo $m$.
On the other hand, if $r_1, r_2, \ldots, r_{\varphi(m)}$ is a complete prime residue system modulo $m$, and if $(a, m) = 1$, then the integers $ar_1, ar_2, \ldots, ar_{\varphi(m)}$ also form a complete prime residue system. Hence
$$\prod_{i=1}^{\varphi(m)} (a r_i) \equiv \prod_{i=1}^{\varphi(m)} r_i \pmod{m},$$
or
$$a^{\varphi(m)} \prod_{i=1}^{\varphi(m)} r_i \equiv \prod_{i=1}^{\varphi(m)} r_i \pmod{m}.$$
Since $r_1, r_2, \ldots, r_{\varphi(m)}$ are prime to $m$, we have
THEOREM 2 (EULER). If $(a, m) = 1$, then $a^{\varphi(m)} \equiv 1 \pmod{m}$.
A particular case of this theorem, where $m$ is a prime, was discovered by Fermat.
THEOREM 3 (FERMAT). If $p$ is a prime, and $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod{p}$.
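The argument behind Euler's theorem has two observable parts: multiplication by $a$ permutes a complete prime residue system, and consequently $a^{\varphi(m)} \equiv 1 \pmod{m}$. A brute-force sketch follows, with hypothetical names `phi` and `check_euler`; here $\varphi$ is computed by counting, directly from its definition.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's function, by counting the integers among 1..n prime to n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def check_euler(m: int) -> bool:
    """Check the permutation property and Euler's theorem for the modulus m."""
    residues = [r for r in range(1, m + 1) if gcd(r, m) == 1]
    for a in residues:
        # a*r_1, ..., a*r_phi(m) is again a complete prime residue system
        products = sorted((a * r) % m if (a * r) % m else m for r in residues)
        if products != residues:
            return False
        if pow(a, phi(m), m) != 1 % m:     # 1 % m covers the case m = 1
            return False
    return True


print(phi(8), pow(3, phi(8), 8))
print(all(check_euler(m) for m in range(1, 60)))
```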
To prove an important property of Euler's function, we need
THEOREM 4. Let $(m, m') = 1$. If $a$ runs through a complete residue system $\pmod{m}$, and $a'$ through a complete residue system $\pmod{m'}$, then $am' + a'm$ runs through a complete residue system $\pmod{mm'}$.
PROOF. There are $mm'$ integers $am' + a'm$, and every two of them are incongruent $\pmod{mm'}$, for if
$$a_1' m + a_1 m' \equiv a_2' m + a_2 m' \pmod{mm'},$$
then
$$a_1 m' \equiv a_2 m' \pmod{m},$$
from which it follows, since $(m, m') = 1$, that $a_1 \equiv a_2 \pmod{m}$. Similarly $a_1' \equiv a_2' \pmod{m'}$.
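Theorem 4 is easy to observe for small moduli, and the same computation shows why the hypothesis $(m, m') = 1$ is needed. In this sketch `combined_system` is a hypothetical helper, and the second argument `m_prime` stands for $m'$.

```python
def combined_system(m: int, m_prime: int) -> list[int]:
    """Sorted residues of a*m' + a'*m modulo m*m', for a in 0..m-1 and a' in 0..m'-1."""
    return sorted((a * m_prime + a_prime * m) % (m * m_prime)
                  for a in range(m)
                  for a_prime in range(m_prime))


# (4, 9) = 1: every residue class modulo 36 occurs exactly once.
print(combined_system(4, 9) == list(range(36)))

# (4, 6) = 2: all the values are even, so the system is not complete.
print(combined_system(4, 6) == list(range(24)))
```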
DEFINITION. An arithmetical function is a complex-valued function defined on the set of positive integers.
An arithmetical function $f$ is multiplicative, if (i) $f$ is not identically zero, and (ii) $(m, n) = 1$ implies that $f(mn) = f(m) f(n)$.
Theorem 4 can be used to prove
THEOREM 5. Euler's function $\varphi$ is multiplicative.
PROOF. Since $\varphi(1) = 1$, $\varphi$ is not identically zero. Let $(m, m') = 1$, and let $a$ and $a'$ run through complete residue systems modulo $m$, and modulo $m'$, respectively. Then, by Theorem 4, $am' + a'm$ runs through a complete residue system $\pmod{mm'}$. Therefore $\varphi(mm')$ is the number of integers $am' + a'm$ which satisfy the condition $(am' + a'm, mm') = 1$. But this is equivalent to the two conditions
$$(am' + a'm, m) = 1, \quad \text{and} \quad (am' + a'm, m') = 1,$$
or to
$$(am', m) = 1, \quad \text{and} \quad (a'm, m') = 1,$$
or to
$$(a, m) = 1, \quad \text{and} \quad (a', m') = 1.$$
Since there are $\varphi(m)$ values of $a$ for which $(a, m) = 1$, and $\varphi(m')$ values of $a'$ for which $(a', m') = 1$, there are $\varphi(m) \cdot \varphi(m')$ values of $am' + a'm$ which are prime to $mm'$. Hence
$$\varphi(mm') = \varphi(m) \cdot \varphi(m').$$
This proof leads also to the following
THEOREM 5'. If $(m, m') = 1$, and if $a$ runs through a complete prime residue system $\pmod{m}$, and $a'$ through a complete prime residue system $\pmod{m'}$, then $am' + a'm$ runs through a complete prime residue system $\pmod{mm'}$.
Theorem 5 can be used to calculate $\varphi(n)$. Every integer $n > 1$ can be written in the standard form
$$n = \prod_{i=1}^{r} p_i^{a_i},$$
so that
$$\varphi(n) = \prod_{i=1}^{r} \varphi(p_i^{a_i}),$$
and $\varphi(n)$ is known if we know $\varphi(p^a)$ for a prime $p$. We have obviously $\varphi(p) = p - 1$. If $a > 1$, consider the complete residue system modulo $p^a$, namely $1, 2, \ldots, p^a$. Exactly $p^{a-1}$ of these integers are not prime to $p^a$, namely the multiples $p, 2p, 3p, \ldots, p^a$ of $p$. Therefore
$$\varphi(p^a) = p^a - p^{a-1} = p^a \left(1 - \frac{1}{p}\right).$$
Thus
$$\varphi(n) = \prod_{i=1}^{r} \varphi(p_i^{a_i}) = \prod_{i=1}^{r} p_i^{a_i} \left(1 - \frac{1}{p_i}\right),$$
or
$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right). \tag{1}$$
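Formula (1) gives a much faster way to evaluate $\varphi(n)$ than counting. The sketch below compares the two; the names `phi_by_counting` and `phi_by_formula` are assumptions, and the product over $p \mid n$ is evaluated in integer arithmetic by replacing the running value $v$ with $v - v/p$ for each prime divisor $p$.

```python
from math import gcd


def phi_by_counting(n: int) -> int:
    """Number of integers among 1..n which are prime to n (the definition)."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def phi_by_formula(n: int) -> int:
    """phi(n) = n * product over primes p | n of (1 - 1/p), in integer arithmetic."""
    value = n
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            while remaining % p == 0:
                remaining //= p
            value -= value // p      # multiply by (1 - 1/p); p divides value here
        p += 1
    if remaining > 1:                # one prime factor larger than sqrt is left
        value -= value // remaining
    return value


print(phi_by_formula(360), phi_by_counting(360))   # both 96
```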
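Another important property of $\varphi$ is given by
THEOREM 6.
$$\sum_{d \mid m} \varphi(d) = m.$$
PROOF. Let $m = \prod_{i=1}^{r} p_i^{a_i}$. The divisors of $m$ are then of the form $\prod_{i=1}^{r} p_i^{b_i}$, where $0 \leq b_i \leq a_i$. Hence
$$\sum_{d \mid m} \varphi(d) = \sum_{\substack{(b_1, \ldots, b_r) \\ 0 \leq b_i \leq a_i}} \varphi\left(\prod_{i=1}^{r} p_i^{b_i}\right) = \sum_{\substack{(b_1, \ldots, b_r) \\ 0 \leq b_i \leq a_i}} \prod_{i=1}^{r} \varphi(p_i^{b_i}),$$
by Theorem 5. By writing out the terms and rearranging, we obtain
$$\sum_{d \mid m} \varphi(d) = \prod_{i=1}^{r} \sum_{b_i = 0}^{a_i} \varphi(p_i^{b_i}) = \prod_{i=1}^{r} \left[\varphi(1) + \varphi(p_i) + \cdots + \varphi(p_i^{a_i})\right]$$
$$= \prod_{i=1}^{r} \left[1 + (p_i - 1) + p_i(p_i - 1) + \cdots + p_i^{a_i - 1}(p_i - 1)\right] = \prod_{i=1}^{r} p_i^{a_i} = m.$$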
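Theorem 6 can be checked directly by summing $\varphi(d)$ over the divisors $d$ of $m$. This is a brute-force sketch; the helper names `phi` and `divisor_phi_sum` are assumptions.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's function, by counting the integers among 1..n prime to n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def divisor_phi_sum(m: int) -> int:
    """Sum of phi(d) over all positive divisors d of m."""
    return sum(phi(d) for d in range(1, m + 1) if m % d == 0)


# For m = 12 the divisors are 1, 2, 3, 4, 6, 12.
print([(d, phi(d)) for d in range(1, 13) if 12 % d == 0])
print(divisor_phi_sum(12))
```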
§ 3. The number of solutions of a congruence. We have seen earlier
in this chapter that if (a,m) = 1, the linear congruence ax=c(modm)
is soluble, and has, up to congruence, but one solution. We now raise
the question of the number of solutions of a polynomial congruence
aO x n+a 1 x n- 1 + ". +an=O(modp),
where ao, a 1 , .•• , an are integers, n> 1, and P is a prime.
II
Congruences
16
If X is a solution of this congruence, so is any integer congruent to x
(modp). For this reason, when we speak of the number of solutions of a
congruence, we mean the number of residue classes whose elements
satisfy the congruence. The number of solutions is therefore equal to the
number of representatives of a complete residue system (modp) which
satisfy the congruence.
Such congruences may have solutions or not. For example $x^2 \equiv 3 \pmod{7}$ has no solution.
On the other hand, we know by Fermat's theorem (Theorem 3) that the congruence
$$x^{p-1} \equiv 1 \pmod{p}$$
has the $p-1$ solutions $x = 1, 2, \ldots, p-1$.
Since $x^{p-1} \equiv 1 \pmod{p}$ if $p \nmid x$, we have $x^p \equiv x \pmod{p}$ for all $x$, and $x^{p+1} \equiv x^2 \pmod{p}$, and so on; any power greater than $p-1$ can be reduced, so that we may assume the degree $n < p$. Further we shall suppose that $(a_0, p) = 1$, to ensure that the congruence is really of degree $n$.
The answer to the question raised at the beginning of this section
is given by
THEOREM 7 (LAGRANGE). The congruence
$$a_0 x^n + a_1 x^{n-1} + \cdots + a_n \equiv 0 \pmod{p}, \qquad (a_0, p) = 1 \tag{2}$$
has at most $n$ solutions.
PROOF. We use induction. The theorem is true for $n = 1$, since $(a_0, p) = 1$. Now suppose the theorem true with $n-1$ in place of $n$. It is trivially true for the degree $n$, if the congruence (2) has no solution. If it does have a solution, say $x_1$, then
$$a_0 x_1^n + a_1 x_1^{n-1} + \cdots + a_n \equiv 0 \pmod{p}. \tag{3}$$
If we subtract this from (2), we get
$$a_0 (x^n - x_1^n) + a_1 (x^{n-1} - x_1^{n-1}) + \cdots + a_{n-1}(x - x_1) \equiv 0 \pmod{p}, \tag{4}$$
which is obviously satisfied by any solution of (2). But (4) can be written as
$$(x - x_1)(a_0 x^{n-1} + b_1 x^{n-2} + \cdots + b_{n-1}) \equiv 0 \pmod{p},$$
where $b_1, b_2, \ldots, b_{n-1}$ are integers which depend on $x_1$ and on the integers $a_0, \ldots, a_{n-1}$. Therefore every solution of (2) must satisfy either the congruence
$$(x - x_1) \equiv 0 \pmod{p},$$
which yields the original solution $x = x_1$, or
$$a_0 x^{n-1} + b_1 x^{n-2} + \cdots + b_{n-1} \equiv 0 \pmod{p}, \qquad (a_0, p) = 1,$$
which is of degree $n-1$, and has, by the induction hypothesis, at most $(n-1)$ solutions. In either case (2) can have at most $n$ solutions, as claimed.
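The counting convention of this section (solutions are residue classes) and the bound of Theorem 7 can be illustrated by brute force. This is only a sketch; `count_roots` is a hypothetical helper, and the coefficient list is assumed to be ordered from $a_0$ to $a_n$.

```python
def count_roots(coeffs, p):
    """Number of residue classes x mod p with a0*x^n + ... + an ≡ 0 (mod p).

    coeffs = [a0, a1, ..., an], leading coefficient first.
    """
    def value(x):
        acc = 0
        for a in coeffs:            # Horner's rule, reduced mod p at each step
            acc = (acc * x + a) % p
        return acc
    return sum(1 for x in range(p) if value(x) == 0)

print(count_roots([1, 0, -3], 7))              # x^2 ≡ 3 (mod 7)
print(count_roots([1, 0, 0, 0, 0, 0, -1], 7))  # x^6 ≡ 1 (mod 7)
```

The first call reports no solution and the second reports $p - 1 = 6$ solutions, in agreement with the two examples given above.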
Chapter III
Rational approximation of irrationals
and Hurwitz's theorem
§ 1. Approximation of irrationals. Let $\xi$ be a real number which is irrational. Then given $\varepsilon > 0$, we know that there exists a rational number $h/k$, such that $|\xi - h/k| < \varepsilon$, since the set of rational numbers is dense in the space of real numbers. The problem we now wish to consider is the size of the difference $|\xi - h/k|$ as a function of $k$.
Unless there is a statement to the contrary, we shall assume that $0 < \xi < 1$, and that $h/k$ is irreducible, and $k > 0$.
THEOREM 1. If $\xi$ is irrational, and $N$ a positive integer, then there exists a rational number $h/k$, with denominator $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{kN}.$$
PROOF. For any real number $x$, let $[x]$ denote the integral part of $x$, that is the integer $m$, such that $m \le x < m+1$. We then have $0 < n\xi - [n\xi] < 1$, the first inequality being strict since $\xi$ is irrational. As $n$ takes the values $1, 2, \ldots, N$, we get $N$ different numbers $n\xi - [n\xi]$, all of which lie in the open interval $(0,1)$. Consider the $N$ sub-intervals $(0, 1/N), (1/N, 2/N), \ldots, ((N-1)/N, 1)$. Either each of these sub-intervals contains in its interior exactly one of the numbers $n\xi - [n\xi]$, or there exists a sub-interval which contains more than one of them. In the first case, the interval $(0, 1/N)$ contains one of the numbers, and therefore $0 < m\xi - [m\xi] < 1/N$, for an integer $m$ such that $1 \le m \le N$. That is, $0 < \xi - [m\xi]/m < 1/(mN)$, and we have thus found a rational number $h/k$ with the desired property.
If the sub-interval $(0, 1/N)$ contains none of the numbers $n\xi - [n\xi]$, $1 \le n \le N$, then there exists another sub-interval which contains at least two such numbers, say $n\xi - [n\xi]$ and $m\xi - [m\xi]$. We then have two integers $m$ and $n$, with $0 < m < n \le N$, such that
$$\left|(n\xi - [n\xi]) - (m\xi - [m\xi])\right| < \frac{1}{N},$$
or
$$\left|(n-m)\xi - ([n\xi] - [m\xi])\right| < \frac{1}{N}.$$
If we set $n - m = k$, and $[n\xi] - [m\xi] = h$, then we have again
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{kN},$$
with $k < N$.
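A direct search illustrates Theorem 1. The sketch below is not the pigeonhole argument of the proof: it simply tries each denominator $k \le N$ with the nearest numerator, which can only do better than the pair produced in the proof. It works in floating point, so it is meant for small $N$ only, and the returned fraction is not necessarily in lowest terms. The name `dirichlet_approximation` is illustrative.

```python
from math import sqrt

def dirichlet_approximation(xi, N):
    """Return the first (h, k), 1 <= k <= N, with |xi - h/k| < 1/(k*N), or None."""
    for k in range(1, N + 1):
        h = round(k * xi)                  # nearest integer to k*xi
        if abs(xi - h / k) < 1 / (k * N):
            return h, k
    return None

xi = sqrt(2) - 1                           # an irrational number in (0, 1)
print(dirichlet_approximation(xi, 10))     # (2, 5): |xi - 2/5| is about 0.0142 < 1/50
```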
A slightly stronger result than Theorem 1 is
THEOREM 2. If $\xi$ is irrational, and $N$ a positive integer, then there exists a rational number $h/k$, with $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k(N+1)}.$$
PROOF. This can be proved in the same way as Theorem 1. Let $x_0 = 0, x_1, x_2, \ldots, x_N, x_{N+1} = 1$ be the $N+2$ different numbers $0$, $1$, and $n\xi - [n\xi]$, $n = 1, 2, \ldots, N$, in ascending order. Then $1$ is the sum of the $N+1$ positive and irrational differences $x_{n+1} - x_n$, $n = 0, 1, \ldots, N$, hence $x_{n+1} - x_n < 1/(N+1)$ for at least one value of $n$. This implies, as in the proof of Theorem 1, that there exists a rational number $h/k$ such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k(N+1)},$$
where $k \le N$.
Another proof of Theorem 2 uses Farey sequences. If $F_N$ denotes the Farey sequence of order $N$, then, since $\xi$ is irrational, $\xi \notin F_N$ for any $N$. But $\xi$ lies between some two consecutive fractions $a/b$ and $c/d$ belonging to $F_N$. Let $a/b < \xi < c/d$. Consider the mediant $(a+c)/(b+d)$. From Chapter I we know that $a/b < (a+c)/(b+d) < c/d$. Hence either $a/b < \xi < (a+c)/(b+d)$, or $(a+c)/(b+d) < \xi < c/d$. But $(a+c)/(b+d) \notin F_N$, since $a/b$ and $c/d$ are consecutive terms in $F_N$. Hence $b + d \ge N + 1$. Therefore we have either
$$0 < \xi - \frac{a}{b} < \frac{a+c}{b+d} - \frac{a}{b} = \frac{bc - ad}{b(b+d)} = \frac{1}{b(b+d)} \le \frac{1}{b(N+1)},$$
or
$$0 < \frac{c}{d} - \xi < \frac{c}{d} - \frac{a+c}{b+d} = \frac{bc - ad}{d(b+d)} = \frac{1}{d(b+d)} \le \frac{1}{d(N+1)}.$$
Since $a/b$ and $c/d$ belong to $F_N$, they are irreducible, and $b \le N$, $d \le N$. We have therefore obtained the required approximant $h/k$ (equal to $a/b$ or $c/d$).
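The Farey argument can be traced numerically. The sketch below finds the neighbours of $\xi$ in $F_N$ by repeatedly taking mediants, starting from $0/1$ and $1/1$; this mediant descent is a standard way of locating consecutive Farey fractions and is an assumption of the example, not a construction taken from the text. The name `farey_neighbours` is illustrative, and floating point limits it to small $N$.

```python
from math import sqrt

def farey_neighbours(xi, N):
    """Consecutive terms a/b < xi < c/d of F_N, for an irrational 0 < xi < 1."""
    a, b, c, d = 0, 1, 1, 1
    while b + d <= N:                      # the mediant still belongs to F_N
        if xi < (a + c) / (b + d):
            c, d = a + c, b + d            # xi lies to the left of the mediant
        else:
            a, b = a + c, b + d            # xi lies to the right of the mediant
    return (a, b), (c, d)

xi, N = sqrt(2) - 1, 5
(a, b), (c, d) = farey_neighbours(xi, N)
print((a, b), (c, d), b * c - a * d, b + d)   # (2, 5) (1, 2) 1 7
```

Here $bc - ad = 1$ and $b + d = 7 \ge N + 1$, and $\xi - 2/5 \approx 0.0142$ is below $1/(b(N+1)) = 1/30$.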
We can consider the validity of Theorem 2 when $\xi$ is rational, say $\xi = l/m$, with $(l,m) = 1$, and $m > N$. Then $\xi \notin F_N$, and we can follow the same proof as above, except that we may now have $\xi = (a+c)/(b+d)$, which would not allow us to claim the strict inequality of Theorem 2. Thus we have
THEOREM 3. If $\xi$ is rational, and $N$ a positive integer, and $\xi = l/m$, $(l,m) = 1$, where $m > N$, then there exists an irreducible fraction $h/k$ with denominator $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| \le \frac{1}{k(N+1)}.$$
Theorem 1 implies, since $N \ge k$, the following
THEOREM 4. If $\xi$ is irrational, then there exist infinitely many rationals $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k^2}.$$
This is sometimes expressed by saying that an irrational $\xi$ can be approximated to within $1/k^2$ by a rational $h/k$.
Since $\xi - h/k$ can be written as $(\xi + n) - (h + kn)/k$, where $n$ is an integer, Theorems 1, 2, 3, and 4 are true without the assumption $0 < \xi < 1$.
§ 2. Sums of two squares. Theorem 3 can be used to show that certain integers are representable as sums of two squares.
THEOREM 5. If $n$ and $A$ are positive integers, such that $n \mid (A^2 + 1)$, then there exist integers $s$ and $t$, such that $n = s^2 + t^2$.
PROOF. The case $n = 1$ is trivial. We assume therefore that $n \ge 2$, and define $N = [\sqrt{n}]$. Then $n > N$ for $n \ge 2$. Since $n \mid (A^2 + 1)$, it follows that $(n, A) = 1$. Hence $A/n$ is a reduced fraction with denominator $n > N$, and by Theorem 3 there exists a reduced fraction $r/s$, such that
$$\left|\frac{A}{n} - \frac{r}{s}\right| \le \frac{1}{(N+1)s}, \qquad 0 < s \le N.$$
That is
$$|As - rn| \le \frac{n}{N+1} = \frac{n}{[\sqrt{n}] + 1} < \sqrt{n}.$$
Let $As - rn = t$. Then $t$ is an integer, and $s^2 + t^2 = s^2 + (As - rn)^2 = s^2(A^2 + 1) - 2Asrn + r^2 n^2$. Since $n$ divides the right-hand side of the equation, we must have $n \mid (s^2 + t^2)$. But $0 < s \le N \le \sqrt{n}$, and $|t| < \sqrt{n}$, which together imply that $0 < s^2 + t^2 < 2n$. Since $s^2 + t^2$ is a multiple of $n$, we must have $s^2 + t^2 = n$, so that $n$ is representable as a sum of two squares.
It is easy to see, besides, that $(s,t) = 1$. For $(s,t) = (s, As - rn) = (s, rn)$. However, $r/s$ is irreducible, hence $(r,s) = 1$. Thus $(s,t) = (s,n)$. But $n = s^2 + t^2$, hence
$$1 = \frac{s^2(A^2 + 1)}{n} - 2Asr + r^2 n.$$
Since, by hypothesis, $(A^2 + 1)/n$ is an integer, it follows that any common divisor of $s$ and $n$ must divide $1$, hence $(s,n) = 1 = (s,t)$.
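The proof of Theorem 5 is constructive in spirit: some $s$ with $0 < s \le [\sqrt{n}]$ makes $t = As - rn$ small enough. A minimal sketch under that reading follows; it searches over $s$ and takes for $t$ the residue of $As$ modulo $n$ of least absolute value. The name `two_squares_from_root` is hypothetical.

```python
from math import isqrt

def two_squares_from_root(n, A):
    """Given n | A*A + 1, return (s, t) with s*s + t*t == n."""
    if (A * A + 1) % n != 0:
        raise ValueError("n must divide A^2 + 1")
    if n == 1:
        return 1, 0
    N = isqrt(n)                      # N = [sqrt(n)]
    for s in range(1, N + 1):
        t = (A * s) % n               # t ≡ A*s (mod n)
        if t > n // 2:
            t -= n                    # least absolute residue, t = A*s - r*n
        if s * s + t * t == n:
            return s, t
    return None                       # not expected when n | A^2 + 1, by Theorem 5

print(two_squares_from_root(13, 5))   # (2, -3), since 13 | 26 and 4 + 9 = 13
print(two_squares_from_root(25, 7))   # (3, -4), since 25 | 50 and 9 + 16 = 25
```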
COROLLARY. If $n$ is a positive integer, and $n \mid (A^2 + B^2)$, where $(A,B) = 1$, then there exist integers $s$ and $t$, such that $n = s^2 + t^2$.
PROOF. We use the identity
$$(A^2 + B^2)(C^2 + D^2) = (AD + BC)^2 + (AC - BD)^2.$$
Since $(A,B) = 1$, we know from Chapter I that there exist integers $C$ and $D$, such that $AC - BD = 1$. We then have
$$(A^2 + B^2)(C^2 + D^2) = (AD + BC)^2 + 1,$$
so that if $n \mid (A^2 + B^2)$, then $n \mid \{(AD + BC)^2 + 1\}$. This, by Theorem 5, implies that $n = s^2 + t^2$.
§ 3. Primes of the form $4k \pm 1$. Euclid's proof of the existence of infinitely many primes was given in Chapter I. Every prime number other than 2 is odd, and an odd number is either of the form $4k - 1$ or $4k + 1$, where $k$ is an integer. We shall show, by arguments similar to Euclid's, that both these sequences contain infinitely many primes.
THEOREM 6. There exist infinitely many primes of the form $4k - 1$.
PROOF. Let $q_1, q_2, \ldots, q_r$ be the first $r$ primes of the form $4k - 1$. Define $N = 4 q_1 q_2 \cdots q_r - 1$. Then $N$ is an odd number. Therefore all its divisors are of the form $4k - 1$ or $4k + 1$. But $N$ cannot have only divisors of the form $4k + 1$, since the product of two integers of that form is again of the same form, whereas $N$ is of the form $4k - 1$. Hence $N$ has a prime divisor of the form $4k - 1$. But $N$ is not divisible by $q_1, \ldots, q_r$. Therefore there exists a prime of the form $4k - 1$, which is greater than $q_r$.
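The construction in the proof of Theorem 6 can be carried out for a short list of primes. The sketch below forms $N = 4 q_1 \cdots q_r - 1$ and finds a prime factor of the form $4k - 1$ by trial division; the helper name is illustrative.

```python
def prime_factor_3_mod_4(N):
    """Smallest prime factor q of N with q % 4 == 3 (None if there is none)."""
    d = 2
    while d * d <= N:
        if N % d == 0:                 # d is prime: smaller factors were removed
            if d % 4 == 3:
                return d
            while N % d == 0:
                N //= d
        d += 1
    return N if N > 1 and N % 4 == 3 else None

primes = [3, 7, 11]                    # the first three primes of the form 4k - 1
N = 4 * 3 * 7 * 11 - 1                 # N = 923 = 13 * 71
print(N, prime_factor_3_mod_4(N))      # 923 71
```

The factor 71 is of the form $4k - 1$ and exceeds 11, as the proof predicts.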
THEOREM 7. There exist infinitely many primes of the form $4k + 1$.
PROOF. Suppose, if possible, that $5, 13, \ldots, p$ are the only primes of the form $4k + 1$, of which $p$ is the largest. Define the integer
$$N = (2 \cdot 5 \cdot 13 \cdots p)^2 + 1.$$
Since $N$ is odd, all its divisors must be odd. By Theorem 5, every prime divisor $q$ of $N$ is of the form $q = s^2 + t^2$. For $q$ to be odd, one of the two integers $s$ and $t$ must be odd, and the other even. Then $q = s^2 + t^2 \equiv 1 \pmod{4}$. That is, every prime divisor of $N$ is of the form $4k + 1$. This leads, however, to a contradiction, since $N > 1$, and is not divisible by any of the primes $5, 13, \ldots, p$, which, according to our hypothesis, were the only primes of the form $4k + 1$.
§ 4. Hurwitz's theorem. We begin by sharpening Theorem 4.
THEOREM 8. If $\xi$ is irrational, there exist infinitely many irreducible fractions $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{2k^2}.$$
PROOF. Let $F_N$ be the Farey sequence of order $N > 1$. Then $\xi$ lies between some two consecutive fractions $a/b$, $c/d$ belonging to $F_N$, so that
$$\frac{a}{b} < \xi < \frac{c}{d}.$$
We shall prove the theorem by showing that one at least of the inequalities
$$\xi - \frac{a}{b} < \frac{1}{2b^2}, \qquad \frac{c}{d} - \xi < \frac{1}{2d^2} \tag{1}$$
holds. For, if this were false, we should have, since $\xi$ is irrational,
$$\xi - \frac{a}{b} > \frac{1}{2b^2}, \qquad \frac{c}{d} - \xi > \frac{1}{2d^2}, \tag{2}$$
which imply, since $bc - ad = 1$, that $(b - d)^2 < 0$. Hence we have either
$$\xi - \frac{a}{b} < \frac{1}{2b^2}, \qquad \text{or} \qquad \frac{c}{d} - \xi < \frac{1}{2d^2}.$$
Thus there exists a fraction $h/k$ in $F_N$ (equal to $a/b$ or $c/d$), such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{2k^2}.$$
Since $(c/d) - (a/b) = 1/(bd)$, and because of the choice of $h/k$, we have
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{bd} \le \frac{1}{b + d - 1} \le \frac{1}{N},$$
if we note that $b + d \ge N + 1$ because of Theorem 10 of Chapter I. There exist infinitely many such fractions $h/k$, since $N$ is at our disposal, which proves Theorem 8.
In Theorem 4 we showed that any irrational $\xi$ can be approximated to within $1/k^2$ by an infinity of rationals $h/k$. In Theorem 8 that approximation was improved to $1/(2k^2)$. The question arises whether this result can be further improved. Does there exist a number $c > 2$, such that $\xi$ can be approximated to within $1/(ck^2)$ by an infinity of rationals $h/k$? The answer to this question is given by Hurwitz's theorem, which follows.
THEOREM 9 (HURWITZ). If $\xi$ is an irrational number, and $c$ any positive real number, such that $c \le \sqrt{5}$, then there exist infinitely many rational numbers $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{ck^2}.$$
If $c > \sqrt{5}$, then there exist irrationals $\xi$ for which the above approximation holds only for finitely many rationals $h/k$.
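PROOF (KHINCHIN). Let $F_N$ be a Farey sequence of order $N > 1$, and $h/k$, $h'/k'$ two successive terms in it such that $h/k < \xi < h'/k'$. We may suppose that
$$\text{either} \quad k' > \frac{(\sqrt{5} + 1)k}{2}, \quad \text{or} \quad k' < \frac{(\sqrt{5} - 1)k}{2}.$$
For if
$$\frac{(\sqrt{5} - 1)k}{2} < k' < \frac{(\sqrt{5} + 1)k}{2},$$
then
$$k + k' > \frac{\sqrt{5} + 1}{2} \max(k, k'),$$
and we can replace $F_N$ by $F_M$, $M = k + k'$, and one of $h/k$, $h'/k'$ by their mediant $(h + h')/(k + k')$, since $k(h + h') - h(k + k') = (k + k')h' - (h + h')k' = 1$ (cf. Theorem 7, Chapter I).
Thus, if $k'/k = \omega$, then $\omega > (\sqrt{5} + 1)/2$, or $\omega < (\sqrt{5} - 1)/2$. In either case we have $1 + \omega^{-2} > \sqrt{5}\,\omega^{-1}$, since
$$\frac{1}{\sqrt{5}}\left(1 + \frac{1}{\omega^2}\right) - \frac{1}{\omega} = \frac{1}{\sqrt{5}\,\omega^2}\left(\omega - \frac{\sqrt{5} + 1}{2}\right)\left(\omega - \frac{\sqrt{5} - 1}{2}\right) > 0.$$
Hence
$$\frac{1}{\sqrt{5}}\left(\frac{1}{k^2} + \frac{1}{k'^2}\right) = \frac{1}{\sqrt{5}\,k^2}\left(1 + \frac{1}{\omega^2}\right) > \frac{1}{\omega k^2},$$
so that
$$\frac{h'}{k'} - \frac{h}{k} = \frac{1}{kk'} = \frac{1}{k^2 \omega} < \frac{1}{\sqrt{5}}\left(\frac{1}{k^2} + \frac{1}{k'^2}\right),$$
which implies that
$$\frac{h}{k} + \frac{1}{\sqrt{5}\,k^2} > \frac{h'}{k'} - \frac{1}{\sqrt{5}\,k'^2}.$$
Hence one of the open intervals
$$\left(\frac{h}{k},\ \frac{h}{k} + \frac{1}{\sqrt{5}\,k^2}\right), \qquad \left(\frac{h'}{k'} - \frac{1}{\sqrt{5}\,k'^2},\ \frac{h'}{k'}\right)$$
contains $\xi$. Reasoning as before, we see that there exist an infinity of such approximations, which proves the first part of the theorem.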
To prove the second part, we assume that $c > \sqrt{5}$, and consider the irrational number $\xi = \tfrac{1}{2}(1 + \sqrt{5})$. We shall see that $\xi$ has only finitely many rational approximants $h/k$ satisfying the inequality
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{ck^2}. \tag{3}$$
Let $c = \sqrt{5}/\alpha$, where $0 < \alpha < 1$, and suppose that
$$\left|\frac{h}{k} - \frac{1 + \sqrt{5}}{2}\right| < \frac{\alpha}{\sqrt{5}\,k^2}.$$
This can be written as $|\theta| < \alpha$, if we set
$$\frac{h}{k} - \frac{1 + \sqrt{5}}{2} = \frac{\theta}{\sqrt{5}\,k^2},$$
in which case
$$h - \frac{k}{2} = \frac{\sqrt{5}\,k}{2} + \frac{\theta}{\sqrt{5}\,k},$$
or
$$h^2 - hk - k^2 = \theta + \frac{\theta^2}{5k^2}.$$
Since $h$ and $k$ are integers, it follows that $h^2 - hk - k^2$ cannot be zero unless $h = k = 0$. But it is impossible that $k = 0$, hence $|h^2 - hk - k^2| \ge 1$. Since $|\theta| < \alpha < 1$, we have
$$1 \le \left|\theta + \frac{\theta^2}{5k^2}\right| < \alpha + \frac{\alpha^2}{5k^2},$$
or
$$k^2 < \frac{\alpha^2}{5(1 - \alpha)}. \tag{4}$$
Thus the denominator $k$ of a fraction $h/k$ which satisfies (3) must satisfy (4). Since $\alpha$ is given, $k$ can have only finitely many integral values, and because of (3), $h$ can have only finitely many integral values. Thus, if $c > \sqrt{5}$, inequality (3) can hold only for finitely many fractions $h/k$, which completes the proof of Theorem 9.
Following the remark at the end of Theorem 4, Theorems 8 and 9 hold without the assumption that $0 < \xi < 1$.
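The dichotomy in Theorem 9 can be observed for $\xi = \tfrac{1}{2}(1 + \sqrt{5})$ by listing the fractions that satisfy (3) up to a bound on $k$. This is a floating-point sketch with illustrative names; the values $c = 2.5$ and $c = 2.2$ are chosen here only as samples on either side of $\sqrt{5} \approx 2.236$.

```python
from math import sqrt

def approximants(xi, c, k_max):
    """All (h, k), 1 <= k <= k_max, with |xi - h/k| < 1/(c*k*k).

    For c >= 2 only the integer h nearest to k*xi can satisfy the inequality.
    """
    found = []
    for k in range(1, k_max + 1):
        h = round(k * xi)
        if abs(xi - h / k) < 1 / (c * k * k):
            found.append((h, k))
    return found

golden = (1 + sqrt(5)) / 2
print(approximants(golden, 2.5, 1000))        # c > sqrt(5): only (2, 1) appears
print(len(approximants(golden, 2.2, 1000)))   # c < sqrt(5): the list keeps growing with k_max
```

For $c = 2.5$ we have $\alpha = \sqrt{5}/2.5 \approx 0.894$, and inequality (4) gives $k^2 < 1.52$, so that $k = 1$ is the only possible denominator, in agreement with the first line of output.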
Chapter IV
Quadratic residues and the representation of a number
as a sum of four squares
§ 1. The Legendre symbol. The theory of quadratic residues is a fundamental part of the theory of numbers. It can, for instance, be applied
to prove such elegant results as Euler's theorem that every prime number of the form 4 k + 1 is a sum of two squares, and Lagrange's theorem
that every positive integer is a sum of four squares.
Let $p$ be an odd prime, and $a$ an integer such that $(a,p) = 1$. If there exists an integer $x$ such that $x^2 \equiv a \pmod{p}$, then $a$ is called a quadratic residue modulo $p$. If there exists no such $x$, then $a$ is called a quadratic non-residue modulo $p$.
We shall sometimes write $aRp$ to indicate that $a$ is a quadratic residue modulo $p$, and $aNp$ to indicate that it is a quadratic non-residue modulo $p$.
In order to find out how many of the integers $1, 2, 3, \ldots, p-1$ are quadratic residues modulo $p$, we should know how many of the congruences
$$x^2 \equiv a \pmod{p} \tag{1}$$
are soluble when $a$ runs through the integers $1, 2, 3, \ldots, p-1$.
Let us consider the integers
$$1^2, 2^2, 3^2, \ldots, \left(\frac{p-1}{2}\right)^2.$$
They are all mutually incongruent $\pmod{p}$. For if we take any two of them, say $r^2$ and $s^2$, $r \ne s$, then $r^2 \equiv s^2 \pmod{p}$ would imply that $r \equiv s \pmod{p}$, or $r \equiv -s \pmod{p}$, and both alternatives are excluded, since $1 \le r, s \le (p-1)/2$. Further, $r^2 \equiv (p - r)^2 \pmod{p}$. It follows from these two remarks that the integer $a$ in (1) assumes $\tfrac{1}{2}(p-1)$ different values, when $x$ runs through the set $1, 2, 3, \ldots, p-1$. Hence there are exactly $\tfrac{1}{2}(p-1)$ quadratic residues modulo $p$, and $\tfrac{1}{2}(p-1)$ quadratic non-residues.
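The count of quadratic residues can be reproduced directly from the squares $1^2, \ldots, \left(\tfrac{p-1}{2}\right)^2$. A minimal sketch, with an illustrative function name:

```python
def quadratic_residues(p):
    """Quadratic residues modulo an odd prime p, from the squares 1^2, ..., ((p-1)/2)^2."""
    return {x * x % p for x in range(1, (p - 1) // 2 + 1)}

for p in (7, 11, 13):
    residues = quadratic_residues(p)
    print(p, sorted(residues), len(residues) == (p - 1) // 2)
```

For $p = 7$ the residues are $1, 2, 4$, so that $3$ is a non-residue modulo $7$.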
THE LEGENDRE SYMBOL. Let $p$ be an odd prime, and $m$ an integer such that $(m,p) = 1$. We define the Legendre symbol $\left(\frac{m}{p}\right)$ by the relations
$$\left(\frac{m}{p}\right) = \begin{cases} +1, & \text{if } mRp, \\ -1, & \text{if } mNp. \end{cases} \tag{2}$$
It is convenient to extend Legendre's definition by defining
$$\left(\frac{m}{p}\right) = 0, \quad \text{if } p \mid m.$$
Since there are as many quadratic residues as non-residues $\pmod{p}$, it follows that
$$\sum_{m=1}^{p-1} \left(\frac{m}{p}\right) = 0.$$
§ 2. Wilson's theorem and Euler's criterion. The following result, known as Wilson's theorem, but first proved by Lagrange, expresses a characteristic property of primes.
THEOREM 1. If $p$ is a prime, then $(p-1)! \equiv -1 \pmod{p}$.
PROOF. If $p = 2$, the conclusion is obvious. Therefore let $p > 2$. From the discussion in § 1 of Chapter II it follows that to any $x$ in the set $1, 2, \ldots, (p-1)$, there corresponds one and only one $x'$ in the same set, such that
$$xx' \equiv 1 \pmod{p}. \tag{3}$$
Further $x = x'$ if and only if $x = 1$ or $p - 1$. For the congruence $x^2 \equiv 1 \pmod{p}$ is equivalent to $(x-1)(x+1) \equiv 0 \pmod{p}$, so that either $x \equiv 1 \pmod{p}$, which implies that $x = 1$, or $x \equiv -1 \pmod{p}$, which implies that $x = p - 1$.
From (3) it follows that
$$2 \cdot 3 \cdots (p-2) \equiv 1 \pmod{p}.$$
If we multiply this, in turn, by the congruence
$$1 \cdot (p-1) \equiv -1 \pmod{p},$$
we get
$$1 \cdot 2 \cdot 3 \cdots (p-1) \equiv -1 \pmod{p}, \tag{4}$$
which is Wilson's theorem.
Note that if $p$ is composite, then it can be factorized as $p = qr$, $1 < q < p$. Hence $q$ occurs as a factor in the product $1 \cdot 2 \cdot 3 \cdots (p-1)$, and the congruence
$$(p-1)! + 1 \equiv 0 \pmod{q}$$
is impossible; so also the congruence $(p-1)! + 1 \equiv 0 \pmod{p}$. Thus Wilson's theorem states a property characteristic of the primes.
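Since the congruence $(n-1)! \equiv -1 \pmod{n}$ holds exactly for primes, it can serve as a (very slow) primality test. The sketch below is illustrative only, with hypothetical function names; it reduces the factorial modulo $n$ at every step.

```python
def wilson_residue(n):
    """(n-1)! reduced modulo n."""
    r = 1
    for k in range(2, n):
        r = r * k % n
    return r

def is_prime_by_wilson(n):
    """True exactly when n > 1 and (n-1)! ≡ -1 (mod n)."""
    return n > 1 and wilson_residue(n) == n - 1

print([n for n in range(1, 30) if is_prime_by_wilson(n)])
```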
Now let $p$ be an odd prime, and $(a,p) = 1$. We shall see that if $a$ is a quadratic residue modulo $p$, then
$$a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}.$$
For the congruence $x^2 \equiv a \pmod{p}$ is then soluble, and $(x,p) = 1$, since $(a,p) = 1$. If we raise this congruence to the power $\tfrac{1}{2}(p-1)$, which is an integer since $p$ is odd, we get
$$x^{p-1} \equiv a^{\frac{1}{2}(p-1)} \pmod{p}.$$
But by Fermat's theorem (Theorem 3, Chapter II), we know that $x^{p-1} \equiv 1 \pmod{p}$. Hence $a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}$.
On the other hand, the congruence
$$x^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}$$
has at most $\tfrac{1}{2}(p-1)$ solutions, because of Lagrange's theorem (Theorem 7, Chapter II). And we know from § 1 that there are exactly $\tfrac{1}{2}(p-1)$ quadratic residues. Each of them, as we have just seen, satisfies it, hence there are no other solutions. Thus we obtain
THEOREM 2 (Euler's criterion). Suppose $p$ is an odd prime, and $a$ is any integer. Then
$$a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p},$$
if and only if $a$ is a quadratic residue modulo $p$.
Now if $p$ is an odd prime, and $(x,p) = 1$, then by Fermat's theorem
$$x^{p-1} - 1 = \left(x^{\frac{p-1}{2}} - 1\right)\left(x^{\frac{p-1}{2}} + 1\right) \equiv 0 \pmod{p}.$$
Hence either
$$x^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}, \tag{5}$$
or
$$x^{\frac{1}{2}(p-1)} \equiv -1 \pmod{p}. \tag{6}$$
Since, by Theorem 2, a quadratic non-residue does not satisfy (5), it must satisfy (6). Combining this observation with the definition of the Legendre symbol, we obtain
THEOREM 3. If $p$ is an odd prime, then
$$m^{\frac{1}{2}(p-1)} \equiv \left(\frac{m}{p}\right) \pmod{p}.$$
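Theorem 3 gives a practical way of evaluating the Legendre symbol by modular exponentiation. The following sketch uses Python's built-in three-argument `pow`; the function name `legendre` is illustrative.

```python
def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, computed as m^((p-1)/2) mod p."""
    r = pow(m, (p - 1) // 2, p)          # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

print([legendre(m, 7) for m in range(7)])   # [0, 1, 1, -1, 1, -1, -1]
```

The value $-1$ at $m = 3$ reflects the fact, noted in Chapter II, that $x^2 \equiv 3 \pmod{7}$ has no solution.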
COROLLARY. We have
$$\left(\frac{m}{p}\right)\left(\frac{n}{p}\right) = \left(\frac{mn}{p}\right),$$
which means that the product of two quadratic residues, or non-residues, modulo $p$, is again a quadratic residue; but the product of a quadratic residue with a quadratic non-residue, modulo $p$, is again a quadratic non-residue.
§ 3. Sums of two squares. Let $p$ be an odd prime, and set $m = p - 1$ in Theorem 3. Since $p - 1 \equiv -1 \pmod{p}$, we get
$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{1}{2}(p-1)} \pmod{p}.$$
But $\left(\frac{-1}{p}\right) = \pm 1$, and $(-1)^{\frac{1}{2}(p-1)} = \pm 1$, and $p \ge 3$. Hence
$$\left(\frac{-1}{p}\right) = (-1)^{\frac{1}{2}(p-1)},$$
from which it follows that $-1$ is a quadratic residue $\pmod{p}$ of all primes $p \equiv 1 \pmod{4}$, and a quadratic non-residue $\pmod{p}$ of all primes $p \equiv 3 \pmod{4}$. This leads us to
THEOREM 4 (EULER). Every prime of the form $4k + 1$ is representable as a sum of two squares.
PROOF. If $p$ is a prime of the form $4k + 1$, then $-1$ is a quadratic residue of $p$. That is, the congruence $x^2 \equiv -1 \pmod{p}$ has a solution. Therefore there exists an integer $A$, such that $p \mid (A^2 + 1)$. This implies, by Theorem 5 of Chapter III, that $p$ is a sum of two squares.
The result that if $p$ is a prime of the form $4k + 1$, then $p \mid (A^2 + 1)$, for some integer $A$, can be sharpened as follows.
THEOREM 5. If $p$ is a prime, such that $p \equiv 1 \pmod{4}$, then there exists an integer $x$, such that
$$x^2 + 1 = mp, \quad \text{where} \quad 0 < m < p.$$
PROOF. Since $-1$ is a quadratic residue of $p$, there exists an integer $x$ of the set $1, 2, 3, \ldots, \tfrac{1}{2}(p-1)$, which satisfies the congruence
$$x^2 \equiv -1 \pmod{p}.$$
That is, $x^2 + 1 = mp$ for some integer $m$. But $x < p/2$, therefore $x^2 + 1 < (p/2)^2 + 1 < p^2$. Hence $x^2 + 1 = mp$, with $0 < m < p$.
A result similar to Theorem 5 is the following
THEOREM 6. If $p$ is an odd prime, there exist integers $x$ and $y$ such that
$$1 + x^2 + y^2 = mp, \quad \text{where} \quad 0 < m < p.$$
PROOF. The integers $x^2$, $0 \le x \le \tfrac{1}{2}(p-1)$, are pairwise incongruent $\pmod{p}$; so are the integers $-1 - y^2$, $0 \le y \le \tfrac{1}{2}(p-1)$. But these two sets together contain $p + 1$ integers, and since there are only $p$ residue classes $\pmod{p}$, some member $x^2$ of the first set must be congruent to some member $-1 - y^2$ of the second set. Thus
$$x^2 \equiv -1 - y^2 \pmod{p}$$
or
$$1 + x^2 + y^2 = mp.$$
But $0 \le x, y \le \tfrac{1}{2}(p-1)$. Therefore
$$0 < 1 + x^2 + y^2 < 1 + 2\left(\frac{p}{2}\right)^2 < p^2,$$
hence
$$1 + x^2 + y^2 = mp, \qquad 0 < m < p,$$
as claimed in the theorem.
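The pigeonhole argument in the proof of Theorem 6 translates into a short search: tabulate the residues of $x^2$, then look up the residue of $-1 - y^2$. The sketch below assumes $p$ is an odd prime and uses an illustrative function name.

```python
def one_plus_two_squares(p):
    """Return (x, y, m) with 1 + x*x + y*y == m*p and 0 < m < p, for an odd prime p."""
    half = (p - 1) // 2
    # Residues of x^2 for 0 <= x <= (p-1)/2; these are pairwise distinct.
    squares = {x * x % p: x for x in range(half + 1)}
    for y in range(half + 1):
        x = squares.get((-1 - y * y) % p)
        if x is not None:
            return x, y, (1 + x * x + y * y) // p
    return None   # not expected for an odd prime p, by Theorem 6

print(one_plus_two_squares(7))    # (3, 2, 2): 1 + 9 + 4 = 14 = 2 * 7
```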
We have seen that every prime $p$ such that $p \equiv 1 \pmod{4}$ is representable as a sum of two squares. But other integers also have that property. For instance $10 = 1^2 + 3^2$. The following theorem gives a necessary and sufficient condition for a positive integer to be representable as a sum of two squares.
THEOREM 7. A positive integer $n$ is a sum of two squares if and only if all its prime factors of the form $4k + 3$ have even exponents in the standard form of $n$.
For the proof of Theorem 7 we need two lemmas. We call a representation $n = x^2 + y^2$ primitive if $(x,y) = 1$, and imprimitive otherwise.
LEMMA 1. If $n$ is divisible by a prime $p$, where $p \equiv 3 \pmod{4}$, then $n$ has no primitive representations.
PROOF. If $n$ has a primitive representation, say
$$n = x^2 + y^2, \qquad (x,y) = 1,$$
then $p \mid (x^2 + y^2)$, but $p \nmid x$, $p \nmid y$. And since $(p,x) = 1$, the equation $mx - tp = c$ is soluble in integers $m$ and $t$, for all integral $c$, and in particular for $c = y$. Hence there exists an integer $m$ such that
$$mx \equiv y \pmod{p},$$
which implies that
$$x^2 + (mx)^2 \equiv x^2 + y^2 \equiv 0 \pmod{p}.$$
Therefore $p \mid x^2(m^2 + 1)$, and since $p \nmid x$, it follows that $p \mid (m^2 + 1)$. That is, $m^2 \equiv -1 \pmod{p}$. In other words, $-1$ is a quadratic residue modulo a prime $p$ of the form $4k + 3$, which is impossible, as we have seen at the beginning of § 3. Thus the lemma is proved.
LEMMA 2. If $p$ is a prime, $p \equiv 3 \pmod{4}$, and $c$ is an odd integer, such that $p^c \mid n$ but $p^{c+1} \nmid n$, then $n$ cannot be represented as a sum of two squares.
PROOF. Suppose, if possible, that $n = x^2 + y^2$, where $(x,y) = d$. Then we have $x = dX$, $y = dY$, with $(X,Y) = 1$, and $n = d^2(X^2 + Y^2) = d^2 N$, say. Let $p^r$ be the highest power of $p$ which divides $d$. Then $p^{c-2r}$ is the highest power of $p$ which divides $N$. And $c - 2r > 0$, since $c$ is odd. Thus we have an integer $N$, such that $N = X^2 + Y^2$, $(X,Y) = 1$, and $p \mid N$, where $p \equiv 3 \pmod{4}$. This contradicts Lemma 1, hence Lemma 2 is proved.
PROOF OF THEOREM 7. The condition is necessary, for Lemma 2 implies that if $n$ is a sum of two squares, then every prime factor of $n$, of the form $4k + 3$, has an even exponent in the standard form of $n$.
The condition is also sufficient, for if $n$ is a positive integer such that every prime factor of the form $4k + 3$ which occurs in its standard form has an even exponent, then $n$ can be written as $n = n_1^2 n_2$, where $n_2$ has no prime factors of the form $4k + 3$. Therefore the only prime factors of $n_2$ are either the number 2 or odd primes of the form $4k + 1$. Now 2 is representable as a sum of two squares $1^2 + 1^2$, and every odd prime of the form $4k + 1$ can be represented as a sum of two squares. Further the identity
$$(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - x_2 y_1)^2$$
shows that the product of two numbers each of which is representable as a sum of two squares is likewise representable. Hence $n_2 = a^2 + b^2$, which implies that $n = (n_1 a)^2 + (n_1 b)^2$.
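Theorem 7 can be compared against a brute-force search for small $n$. In the sketch below, both function names are illustrative; the first tests representability directly, the second tests the condition on prime factors by trial division.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute force: is n = x^2 + y^2 for some integers x, y >= 0?"""
    return any(isqrt(n - x * x) ** 2 == n - x * x for x in range(isqrt(n) + 1))

def satisfies_criterion(n):
    """True if every prime factor q ≡ 3 (mod 4) of n occurs to an even power."""
    q = 2
    while q * q <= n:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        if q % 4 == 3 and e % 2 == 1:
            return False
        q += 1
    return not (n > 1 and n % 4 == 3)   # a leftover prime factor has exponent 1

print(all(is_sum_of_two_squares(n) == satisfies_criterion(n) for n in range(1, 2000)))
```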
§ 4. Sums of four squares. We conclude this chapter with a result which is as famous as it is elegant.
THEOREM 8 (LAGRANGE). Every positive integer $n$ is a sum of four squares.
PROOF. Since $1 = 1^2 + 0^2 + 0^2 + 0^2$, we suppose in what follows that $n > 1$. The identity
$$(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) = z_1^2 + z_2^2 + z_3^2 + z_4^2, \tag{7}$$
where
$$\begin{aligned}
z_1 &= x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4, \\
z_2 &= x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3, \\
z_3 &= x_1 y_3 - x_3 y_1 + x_4 y_2 - x_2 y_4, \\
z_4 &= x_1 y_4 - x_4 y_1 + x_2 y_3 - x_3 y_2,
\end{aligned}$$
shows that a product of two integers, each of which is representable as a sum of four squares, is likewise representable. Every integer $n > 1$ is a product of primes, and $2 = 1^2 + 1^2 + 0^2 + 0^2$. It suffices therefore to show that every odd prime is representable as a sum of four squares.
It follows from Theorem 6 that if $p$ is an odd prime, then there exists an integer $m < p$, such that
$$mp = x_1^2 + x_2^2 + x_3^2 + x_4^2,$$
where $x_1, x_2, x_3, x_4$ are not all divisible by $p$.
Given any odd prime $p$, let $m_0$ denote the smallest positive integer such that
$$m_0 p = x_1^2 + x_2^2 + x_3^2 + x_4^2. \tag{8}$$
If $m_0 = 1$, there is nothing more to prove.
Suppose that $m_0 > 1$. We shall first show that $m_0$ must be odd. For if $m_0$ is even, then $x_1, x_2, x_3, x_4$ are either all even, or all odd, or two even and two odd (for instance $x_1, x_2$ even, and $x_3, x_4$ odd). Since
$$\tfrac{1}{2} m_0 p = \left(\frac{x_1 + x_2}{2}\right)^2 + \left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_3 + x_4}{2}\right)^2 + \left(\frac{x_3 - x_4}{2}\right)^2,$$
we see that $\tfrac{1}{2} m_0 p$ is a sum of four integral squares, not all of which are divisible by $p$. But this contradicts the minimality of $m_0$.
Hence $m_0 \ge 3$, and we can write
$$x_i = b_i m_0 + y_i, \qquad (i = 1, 2, 3, 4), \tag{9}$$
where the integer $b_i$ can be so chosen that $|y_i| < \tfrac{1}{2} m_0$. For if the division of $x_i$ by the odd number $m_0$ gives $x_i = b_i' m_0 + y_i'$, where $y_i' > \tfrac{1}{2} m_0$, then we can write $x_i = (b_i' + 1) m_0 + (y_i' - m_0) = b_i m_0 + y_i$, where $-\tfrac{1}{2} m_0 < y_i < 0$.
Now $x_1, x_2, x_3, x_4$ are not all divisible by $m_0$, for that would imply that $m_0 \mid p$, which is impossible, since $1 < m_0 < p$. Therefore
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 > 0.$$
Thus we have
$$0 < y_1^2 + y_2^2 + y_3^2 + y_4^2 < 4\left(\tfrac{1}{2} m_0\right)^2 = m_0^2.$$
But it follows from (8) and (9) that
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv 0 \pmod{m_0}.$$
Thus we have integers $x_i, y_i$ $(i = 1, 2, 3, 4)$, such that
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = m_0 p, \qquad m_0 < p,$$
and
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 = m_0 m_1, \qquad 0 < m_1 < m_0.$$
Identity (7) therefore gives us four integers $z_1, z_2, z_3, z_4$, such that
$$z_1^2 + z_2^2 + z_3^2 + z_4^2 = m_0^2 m_1 p. \tag{10}$$
But
$$z_1 = \sum_{i=1}^{4} x_i y_i = \sum_{i=1}^{4} x_i (x_i - b_i m_0) \equiv \sum_{i=1}^{4} x_i^2 \pmod{m_0} \equiv 0 \pmod{m_0}.$$
Similarly
$$z_2 \equiv z_3 \equiv z_4 \equiv 0 \pmod{m_0}.$$
Hence $z_i = m_0 t_i$, where $t_i$ is an integer for $i = 1, 2, 3, 4$. On substituting these values in (10), we get
$$m_1 p = t_1^2 + t_2^2 + t_3^2 + t_4^2,$$
with $0 < m_1 < m_0$. But this contradicts the minimality of $m_0$. Hence $m_0 = 1$, and Theorem 8 is proved.
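Identity (7) is easy to check on concrete integers. The sketch below implements the four expressions $z_1, \ldots, z_4$ exactly as written above; the function names `euler_product` and `norm` are illustrative.

```python
def euler_product(x, y):
    """The integers z1, z2, z3, z4 of identity (7) for x = (x1..x4), y = (y1..y4)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4,
        x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2,
    )

def norm(v):
    """Sum of the squares of the entries of v."""
    return sum(t * t for t in v)

x, y = (1, 2, 3, 4), (5, 6, 7, 8)
z = euler_product(x, y)
print(z, norm(x) * norm(y), norm(z))   # (70, -8, 0, -16) 5220 5220
```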
Chapter V
The law of quadratic reciprocity
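§ 1. Quadratic reciprocity. Let $p$ and $q$ be two distinct odd primes. Then the Legendre symbols $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ are defined. Can $\left(\frac{p}{q}\right)$ be determined if $\left(\frac{q}{p}\right)$ is known? Gauss's law of quadratic reciprocity shows that that is indeed possible.
THEOREM 1 (GAUSS). If $p$ and $q$ are distinct odd primes, then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{1}{2}(p-1) \cdot \frac{1}{2}(q-1)}.$$
Since $\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(q-1)$ is odd if and only if $p \equiv q \equiv 3 \pmod{4}$, Theorem 1 can be restated as follows:
$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right), \quad \text{if } p \equiv q \equiv 3 \pmod{4},$$
and
$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right), \quad \text{otherwise.}$$
We shall deduce the law of quadratic reciprocity from a reciprocity formula for certain exponential sums.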
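Before the proof, the statement of Theorem 1 can be tested on small primes. The sketch below evaluates the Legendre symbols by Euler's criterion (Theorem 3 of Chapter IV); the function names are illustrative.

```python
def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) == (-1)^(((p-1)/2) * ((q-1)/2)) for distinct odd primes p, q."""
    sign = -1 if ((p - 1) // 2) * ((q - 1) // 2) % 2 else 1
    return legendre(p, q) * legendre(q, p) == sign

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(all(reciprocity_holds(p, q) for p in odd_primes for q in odd_primes if p != q))
```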
§ 2. Reciprocity for generalized Gaussian sums. Let $m$ and $n$ be two non-zero integers. Then a generalized Gaussian sum is defined as
$$g(m,n) = \sum_{k=1}^{|n|} e^{\pi i \frac{m}{n} k^2 + \pi i m k}. \tag{1}$$
When $m$ is even, this reduces to a Gaussian sum. Theorem 1 can be deduced from a formula connecting $g(m,n)$ and $g(-n,m)$, which we state as
THEOREM 2. If $m$ and $n$ are non-zero integers, then
$$\frac{1}{\sqrt{|n|}}\, g(m,n) = e^{\frac{\pi i}{4}(1 - |mn|)\,\mathrm{sgn}(mn)}\, \frac{1}{\sqrt{|m|}}\, g(-n,m), \tag{2}$$
where $\mathrm{sgn}\, r = r/|r|$ if $r \ne 0$, and $\mathrm{sgn}\, r = 0$ if $r = 0$.
PROOF. We shall use complex integration for the proof. Consider the integral
$$f(X) = f(X, \tau) = \int_C \Phi(u)\, du, \tag{3}$$
where
$$\Phi(u) = \Phi(u, X) = \Phi(u, X, \tau) = \frac{e^{\pi i \tau u^2 + 2\pi i X u}}{e^{2\pi i u} - 1}. \tag{4}$$
Here $u$ is a complex variable, $X$ an arbitrary complex number, $\tau$ a complex number with positive real part, and $C$ a line in the complex $u$-plane through the point $u = \tfrac{1}{2}$ which is inclined at an angle $\pi/4$ to the positive real axis. We shall first show that the integral converges. For this we shall estimate the function $\Phi$ in any strip (of finite width), which is bounded by two lines parallel to $C$. If we set
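$$u = c + \rho e^{\frac{i\pi}{4}},$$
where $c$ and $\rho$ are real, $c$ bounded and $\rho$ variable, and
$$\tau = \operatorname{Re}\tau + i \operatorname{Im}\tau,$$
then
and
$$\tau u^2 + 2Xu = i\tau\rho^2 + 2e^{\frac{i\pi}{4}}(\tau c + X)\rho + (\tau c + 2X)c,$$
so that
Hence
where $A$ and $B$ are constants independent of $\rho$.
Further
$$|e^{2\pi i u} - 1| \ge \bigl|1 - |e^{2\pi i u}|\bigr| = |1 - e^{-\sqrt{2}\pi\rho}|.$$
Now $\rho \to \pm\infty$ as $|u| \to \infty$ in the strip, so that if $|u|$ is large enough, then
(6)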
Combining (5) and (6) we have
$$|\Phi(u)| \le A_1 e^{-\pi\rho^2 \operatorname{Re}\tau + B|\rho|}, \tag{7}$$
in the strip chosen, if $|u|$ is large enough. Hence the integral $\int_C \Phi(u)\, du$ converges.
We shall next show that $g(m,n)$, for $n > 0$, is the value of the integral $\int_\gamma \Phi(u)\, du$ for a suitably chosen contour $\gamma$.
Let $\gamma$ be the parallelogram formed by the line $C$, the line $C_n$ parallel to $C$ which cuts the real axis at the point $n + \tfrac{1}{2}$, $n > 0$, and the lines $L_1$, and $L_2$, in the upper and lower half-planes respectively, which are parallel to the real axis and at a positive distance from it (Fig. 1).
Fig. 1
Now $\Phi(u)$ is a meromorphic function of $u$, and if $\gamma$ is taken in the positive sense, then by Cauchy's theorem of residues, we have
$$\int_\gamma \Phi(u)\, du = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}. \tag{8}$$
Because of (7), $\Phi(u) \to 0$ uniformly as $|u| \to \infty$ in the strip, while the two sides of $\gamma$ parallel to the real axis are of constant width. Hence the integrals along these sides tend to zero, when $L_1$ and $L_2$ go to infinity, away from the real axis. Thus we are left with
$$\int_{C_n} \Phi(u)\, du - \int_C \Phi(u)\, du = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}. \tag{9}$$
From (4), however, we have
$$\Phi(u + n, X) = e^{\pi i \tau n^2 + 2\pi i X n}\, \Phi(u, X + \tau n),$$
so that
$$\int_{C_n} \Phi(u)\, du = e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n),$$
where $f$ is defined as in (3). Hence (9) becomes
$$e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n) - f(X) = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}, \tag{10}$$
which is a relation between $f(X)$ and $f(X + \tau n)$. We shall now seek another such relation and compare the two.
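For this purpose, we start with the identity
$$f(X+1) - f(X) = \int_C \frac{e^{\pi i \tau u^2}\{e^{2\pi i (X+1) u} - e^{2\pi i X u}\}}{e^{2\pi i u} - 1}\, du = \int_C e^{\pi i \tau u^2 + 2\pi i X u}\, du = e^{-\pi i \frac{X^2}{\tau}} \int_C e^{\pi i \tau \left(u + \frac{X}{\tau}\right)^2} du.$$
Now let $C'$ be the line parallel to $C$, obtained by the translation $u \to u + X/\tau$. Then
$$f(X+1) - f(X) = e^{-\pi i \frac{X^2}{\tau}} \int_{C'} e^{\pi i \tau u^2}\, du.$$
That this integral converges is clear from the estimate (5). By integrating again around a parallelogram, as before, and using the estimate (5) with $X = 0$, it can be seen that
$$\int_{C'} e^{\pi i \tau u^2}\, du = \int_{C_0} e^{\pi i \tau u^2}\, du,$$
where $C_0$ is the line parallel to $C'$ through the origin. On $C_0$ we have $u = \rho e^{\pi i/4}$, with $\rho$ real. Therefore
$$\int_{C_0} e^{\pi i \tau u^2}\, du = e^{\frac{\pi i}{4}} \int_{-\infty}^{\infty} e^{-\pi \tau \rho^2}\, d\rho = e^{\frac{\pi i}{4}} I_\tau,$$
say. Hence
$$f(X+1) - f(X) = e^{\pi i \left(\frac{1}{4} - \frac{X^2}{\tau}\right)} I_\tau.$$
By iteration of this formula $m$ times, we get
$$f(X+m) - f(X) = I_\tau \sum_{\nu=0}^{m-1} e^{\pi i \left(\frac{1}{4} - \frac{(X+\nu)^2}{\tau}\right)},$$
where $m$ is a positive integer. If we replace $X$ by $X + \tau n - m$, we get the second relation we are seeking, namely
$$f(X + \tau n) - f(X + \tau n - m) = I_\tau \sum_{\nu=0}^{m-1} e^{\pi i \left[\frac{1}{4} - \frac{(X + \tau n - m + \nu)^2}{\tau}\right]}. \tag{11}$$
From (11) and (10) we obtain
$$e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n - m) - f(X) = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k} - I_\tau\, e^{\pi i \tau n^2 + 2\pi i X n} \sum_{\nu=1}^{m} e^{\pi i \left[\frac{1}{4} - \frac{(X + \tau n - \nu)^2}{\tau}\right]}. \tag{12}$$
If in this we put $X = m/2$, and $\tau = m/n$, $m > 0$, $n > 0$, we have
$$\sum_{k=1}^{n} e^{\pi i k^2 \frac{m}{n} + \pi i m k} = I_{m/n}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{\pi i \nu n - \nu^2 \pi i \frac{n}{m}}. \tag{12'}$$
Here if we set $m = n = 1$, we get $I_1 = 1$, that is
$$\int_{-\infty}^{\infty} e^{-\pi t^2}\, dt = 1.$$
If we now make the substitution $t \to t\sqrt{\tau}$, where $\tau$ is real and positive, we get
$$I_\tau = \int_{-\infty}^{\infty} e^{-\pi \tau t^2}\, dt = \frac{1}{\sqrt{\tau}}. \tag{13}$$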
If we use formula (13), with $\tau = m/n$, in formula (12'), then we get
$$\frac{1}{\sqrt{n}} \sum_{k=1}^{n} e^{\pi i k^2 \frac{m}{n} + \pi i m k} = \frac{1}{\sqrt{m}}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{\pi i \nu n - \nu^2 \pi i \frac{n}{m}} = \frac{1}{\sqrt{m}}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{-\pi i \nu n - \nu^2 \pi i \frac{n}{m}},$$
and this, by the definition of $g(m,n)$, leads to
$$\frac{1}{\sqrt{n}}\, g(m,n) = \frac{1}{\sqrt{m}}\, e^{\frac{\pi i}{4}(1 - mn)}\, g(-n,m), \tag{14}$$
which proves the theorem for $m > 0$, $n > 0$.
If $m > 0$, and $n < 0$, then $-n, m > 0$, and (14) gives
$$\frac{1}{\sqrt{m}}\, g(-n,m) = \frac{1}{\sqrt{-n}}\, e^{\frac{\pi i}{4}(1 + mn)}\, g(-m,-n),$$
or
$$\frac{1}{\sqrt{|n|}}\, g(-m,-n) = e^{-\frac{\pi i}{4}(1 - |mn|)}\, \frac{1}{\sqrt{m}}\, g(-n,m).$$
But by definition, $g(-m,-n) = g(m,n)$, hence
$$\frac{1}{\sqrt{|n|}}\, g(m,n) = e^{\frac{\pi i}{4}(1 - |mn|)\,\mathrm{sgn}(mn)}\, \frac{1}{\sqrt{m}}\, g(-n,m),$$
as claimed.
If $m < 0$ and $n < 0$, the reciprocity formula (2) remains valid, since $g(-m,-n) = g(m,n)$, $g(n,-m) = g(-n,m)$, and $(1 - |mn|)\,\mathrm{sgn}(mn)$ remains unchanged if $m$ and $n$ are replaced by $-m$, $-n$ respectively. This concludes the proof of Theorem 2.
It may be noted that this proof does not assume the result
$$\int_{-\infty}^{\infty} e^{-\pi t^2}\, dt = 1,$$
but obtains it as a byproduct.
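The reciprocity formula (2) lends itself to a numerical check for small $m$ and $n$. The following floating-point sketch evaluates both sides with the standard `cmath` module; the function names `g` and `reciprocity_gap` are chosen here for illustration.

```python
import cmath
from math import pi, sqrt

def g(m, n):
    """Generalized Gaussian sum (1): sum over k = 1..|n| of exp(pi*i*(m/n)*k^2 + pi*i*m*k)."""
    return sum(cmath.exp(1j * pi * (m * k * k / n + m * k)) for k in range(1, abs(n) + 1))

def reciprocity_gap(m, n):
    """|LHS - RHS| of formula (2); it should vanish up to rounding error."""
    sgn = 1 if m * n > 0 else -1
    lhs = g(m, n) / sqrt(abs(n))
    rhs = cmath.exp(1j * pi / 4 * (1 - abs(m * n)) * sgn) * g(-n, m) / sqrt(abs(m))
    return abs(lhs - rhs)

values = [k for k in range(-6, 7) if k != 0]
print(max(reciprocity_gap(m, n) for m in values for n in values))   # a number of rounding-error size
```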
§ 3. Proof of quadratic reciprocity. The law of quadratic reciprocity, stated in Theorem 1, can be elegantly deduced from the reciprocity formula for generalized Gaussian sums proved in Theorem 2.
Since k2 == k(mod 2), we can replace k by k2 in the definition of
g(m,n) given in (1), and write
g(m,n)=
~
L.
nik 2 "'(n+ 1)
en.
k=1
Now let n be an odd prime, and m some integer prime to n. We then have
n-l
g(m,n)=l+Le
k=1
m
nik2-(n
n
+ 1)
•
40
v
The law of quadratic reciprocity
If k 2 == p (mod n), then it is easy to see that
xik2!!!(n + 1)
e
=
n
xip!!!(n + 1)
e
n
But if
== p (mod n), and 1 ~ k ~ n -1, then p is a quadratic residue
modulo n, and (n - k)2 == k 2 == p (mod n). Thus if k runs through the
integers 1,2, ... ,n-1, then k 2 (taken modulo n) runs twice through the
set of quadratic residues modulo n. Hence
k2
g(m,n)
xip!!!(n+ 1)
= 1+2Ie
n
(15)
,
p
where p runs through the set of quadratic residues modulo an odd
prime n.
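Now consider the sum
$$\sum_{\nu} e^{\pi i \nu\frac{m}{n}(n+1)},$$
where $\nu$ runs through the quadratic non-residues modulo $n$ (and $\rho$, as in (15), through the quadratic residues). We obviously have
$$1+\sum_{\rho} e^{\pi i \rho\frac{m}{n}(n+1)}+\sum_{\nu} e^{\pi i \nu\frac{m}{n}(n+1)} = \sum_{k=0}^{n-1} e^{\pi i k\frac{m}{n}(n+1)}.$$
But $n+1$ is even, and therefore $e^{\pi i k\frac{m}{n}(n+1)}$ is the $k$th power of an $n$th
root of unity, say $\eta$, and $\eta \neq 1$ since $n \nmid m$. Thus the sum on the right-hand side vanishes, and (15) gives
$$g(m,n) = \sum_{\rho} e^{\pi i \rho\frac{m}{n}(n+1)} - \sum_{\nu} e^{\pi i \nu\frac{m}{n}(n+1)}. \tag{17}$$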
We now consider the two possibilities $\left(\frac{m}{n}\right) = +1$, and $\left(\frac{m}{n}\right) = -1$.
(a) If $m$ is a quadratic residue modulo $n$, and $\rho$ runs through all
quadratic residues modulo $n$, then by the Corollary of Theorem 3 of
Chapter IV, $\rho m$ likewise runs through all the quadratic residues. And
if $\nu$ runs through all the non-residues, so does $\nu m$. Hence, by (17),
$$g(m,n) = \sum_{\rho} e^{\pi i \rho\frac{n+1}{n}} - \sum_{\nu} e^{\pi i \nu\frac{n+1}{n}} = g(1,n) = \left(\frac{m}{n}\right) g(1,n).$$
(b) If $m$ is a quadratic non-residue modulo $n$, then by reasoning
again as in case (a), we have
$$g(m,n) = \sum_{\nu} e^{\pi i \nu\frac{n+1}{n}} - \sum_{\rho} e^{\pi i \rho\frac{n+1}{n}} = -g(1,n) = \left(\frac{m}{n}\right) g(1,n).$$
We have therefore shown that if $n$ is an odd prime, and $(m,n) = 1$,
then
$$g(m,n) = \left(\frac{m}{n}\right) g(1,n). \tag{18}$$
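On the other hand, from Theorem 2 it follows that
$$\frac{1}{\sqrt{n}}\, g(1,n) = e^{\frac{\pi i}{4}(1-n)}\, g(-n,1),$$
and since, by definition, $g(-n,1) = 1$, we have
$$g(1,n) = \sqrt{n}\; e^{\frac{\pi i}{4}(1-n)}. \tag{19}$$
From (18) and (19) we get the important formula
$$\left(\frac{m}{n}\right) = \frac{1}{\sqrt{n}}\, e^{\frac{\pi i}{4}(n-1)}\, g(m,n), \tag{20}$$
where $n$ is an odd prime, and $m$ is an integer such that $(m,n) = 1$.
If $m = -1$, this gives
$$\left(\frac{-1}{n}\right) = \frac{1}{\sqrt{n}}\, e^{\frac{\pi i}{4}(n-1)}\, g(-1,n).$$
But by (2), we have
$$\frac{1}{\sqrt{n}}\, g(-1,n) = e^{\frac{\pi i}{4}(n-1)}\, g(-n,-1) = e^{\frac{\pi i}{4}(n-1)},$$
since $g(-n,-1) = 1$. Hence
$$\left(\frac{-1}{n}\right) = e^{\frac{\pi i}{2}(n-1)} = (-1)^{\frac{n-1}{2}}. \tag{21}$$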
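Here $n$ is an odd prime. Let us now assume that $m$ is also an odd prime.
Then it follows from (20) and (2) that
$$\left(\frac{m}{n}\right) = e^{\frac{\pi i}{4}(n-1)}\cdot e^{\frac{\pi i}{4}(1-mn)}\cdot\frac{1}{\sqrt{m}}\, g(-n,m).$$
If we use (20) once again, we get
$$\left(\frac{m}{n}\right) = e^{\frac{\pi i}{4}(n-1)}\cdot e^{\frac{\pi i}{4}(1-mn)}\cdot e^{-\frac{\pi i}{4}(m-1)}\left(\frac{-n}{m}\right).$$
But
$$\left(\frac{-n}{m}\right) = \left(\frac{-1}{m}\right)\left(\frac{n}{m}\right) = e^{\frac{\pi i}{2}(m-1)}\left(\frac{n}{m}\right),$$
because of (21). Hence
$$\left(\frac{m}{n}\right) = e^{-\frac{\pi i}{4}(n-1)(m-1)}\left(\frac{n}{m}\right) = (-1)^{\frac{n-1}{2}\cdot\frac{m-1}{2}}\left(\frac{n}{m}\right).$$
Since $\left(\frac{n}{m}\right)^{2} = 1$, it follows that
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2}},$$
which proves Theorem 1.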
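Formula (20) and the reciprocity law can be checked numerically for small primes. The following is only an illustrative sketch: the function names are hypothetical, the sum $g(m,n)$ is taken in the form $\sum_{k=1}^{n} e^{\pi i (k^{2} m/n + km)}$ for $n>0$, and the Legendre symbol is computed independently by Euler's criterion, which is assumed here rather than derived.

```python
import cmath
import math


def g(m: int, n: int) -> complex:
    """Generalized Gaussian sum g(m, n) for n > 0 (floating-point evaluation)."""
    return sum(
        cmath.exp(1j * math.pi * (k * k * m / n + k * m))
        for k in range(1, n + 1)
    )


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion (assumed)."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def symbol_via_20(m: int, n: int) -> complex:
    """Right-hand side of (20): n^(-1/2) * exp(pi*i*(n-1)/4) * g(m, n)."""
    return cmath.exp(1j * math.pi * (n - 1) / 4) * g(m, n) / math.sqrt(n)


def reciprocity_sign(m: int, n: int) -> int:
    """(-1)^(((m-1)/2) * ((n-1)/2)) for odd m and n."""
    return -1 if (((m - 1) // 2) * ((n - 1) // 2)) % 2 else 1
```

For instance, `symbol_via_20(2, 7)` should agree with `legendre(2, 7)` up to rounding error, and `legendre(p, q) * legendre(q, p)` should equal `reciprocity_sign(p, q)` for distinct odd primes `p`, `q`.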
§ 4. Some applications. Theorem 1 was concerned with the value
of $\left(\frac{p}{q}\right)$, when $p$ and $q$ are distinct odd primes. In order to determine
whether or not a given even integer is a quadratic residue modulo an
odd prime, we have to evaluate the Legendre symbol $\left(\frac{2}{p}\right)$. This can
be done by an application of (2) and (20).
THEOREM 3. If $p$ is an odd prime, then
$$\left(\frac{2}{p}\right) = (-1)^{\frac{p^{2}-1}{8}}. \tag{22}$$
In other words,
$$\left(\frac{2}{p}\right) = \begin{cases} +1, & \text{if } p \equiv \pm 1 \pmod 8,\\ -1, & \text{if } p \equiv \pm 3 \pmod 8. \end{cases} \tag{23}$$
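PROOF. From (20) we have
$$\left(\frac{2}{p}\right) = \frac{1}{\sqrt{p}}\, e^{\frac{\pi i}{4}(p-1)}\, g(2,p),$$
and from (2) we have
$$\frac{1}{\sqrt{p}}\, g(2,p) = e^{\frac{\pi i}{4}(1-2p)}\,\frac{1}{\sqrt{2}}\, g(-p,2),$$
while from the definition of $g(m,n)$ we have
$$g(-p,2) = 1+e^{\frac{\pi i p}{2}}.$$
Thus
$$\left(\frac{2}{p}\right) = \frac{e^{-\frac{\pi i p}{4}}}{\sqrt{2}}\left(1+e^{\frac{\pi i p}{2}}\right) = \frac{1}{\sqrt{2}}\left(e^{-\frac{\pi i p}{4}}+e^{\frac{\pi i p}{4}}\right) = (-1)^{\frac{p^{2}-1}{8}}.$$
AN EXAMPLE. Let us use Theorems 1 and 3 to evaluate
$$\left(\frac{12703}{16361}\right).$$
Here both 12703 and 16361 are primes. By Theorem 1 we have
$$\left(\frac{12703}{16361}\right) = \left(\frac{16361}{12703}\right),$$
and since $16361 \equiv 3658 \pmod{12703}$, we have
$$\left(\frac{16361}{12703}\right) = \left(\frac{3658}{12703}\right).$$
Since $\left(\frac{mn}{p}\right) = \left(\frac{m}{p}\right)\left(\frac{n}{p}\right)$, and $3658 = 2\cdot 31\cdot 59$, we have
$$\left(\frac{3658}{12703}\right) = \left(\frac{2}{12703}\right)\left(\frac{31}{12703}\right)\left(\frac{59}{12703}\right)$$
$$= \left(\frac{31}{12703}\right)\left(\frac{59}{12703}\right) \quad \text{(by Theorem 3)}$$
$$= -\left(\frac{12703}{31}\right)\cdot -\left(\frac{12703}{59}\right) \quad \text{(by Theorem 1)}$$
$$= \left(\frac{24}{31}\right)\left(\frac{18}{59}\right) = \left(\frac{4}{31}\right)\left(\frac{6}{31}\right)\left(\frac{2}{59}\right)\left(\frac{9}{59}\right).$$
Since $\left(\frac{4}{31}\right) = \left(\frac{2}{31}\right)^{2} = 1$, and similarly $\left(\frac{9}{59}\right) = 1$, we get finally
$$\left(\frac{12703}{16361}\right) = \left(\frac{2}{31}\right)\left(\frac{3}{31}\right)\left(\frac{2}{59}\right) = 1\cdot\left(\frac{3}{31}\right)(-1) = \left(\frac{31}{3}\right) = \left(\frac{1}{3}\right) = 1.$$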
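The value obtained in the example, and the rule (22) of Theorem 3, can be cross-checked by machine. The sketch below is an independent check, not the method of the text: it uses Euler's criterion $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod p$ (assumed here), and the helper names are hypothetical.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion (assumed)."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def two_symbol(p: int) -> int:
    """Value of (2/p) predicted by (22): (-1)^((p^2 - 1)/8) for odd p."""
    return -1 if ((p * p - 1) // 8) % 2 else 1


# The symbol evaluated in the example, computed directly.
example_value = legendre(12703, 16361)
```

Here `example_value` is computed by modular exponentiation alone, so it does not depend on the chain of reductions used in the example.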
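A REMARK. As seen in Chapter IV, if $p$ is a given odd prime, then
$\left(\frac{m}{p}\right) = \left(\frac{m'}{p}\right)$ for all integers $m' \equiv m \pmod p$.
On the other hand, from Theorem 3 we know that $\left(\frac{2}{p}\right)$ has the
same value for all odd primes $p$ which lie in the arithmetical progressions
$8m\pm 1$, or in $8m\pm 3$.
Theorem 1 can be used to show, more generally, that if $q$ is a fixed
odd prime, then
$$\left(\frac{q}{p}\right) = \left(\frac{q}{p'}\right), \tag{24}$$
where $p'$ is a prime such that $p' \equiv p \pmod{4q}$.
For if $p' \equiv p \pmod{4q}$, then $p' \equiv p \pmod 4$, so that $\tfrac12(p'-1) \equiv \tfrac12(p-1)
\pmod 2$. By Theorem 1 we have
$$\left(\frac{q}{p'}\right)\left(\frac{p'}{q}\right) = (-1)^{\frac{p'-1}{2}\cdot\frac{q-1}{2}} = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} = \left(\frac{q}{p}\right)\left(\frac{p}{q}\right).$$
Further, since $p' \equiv p \pmod{4q}$, we have $p' \equiv p \pmod q$, hence $\left(\frac{p'}{q}\right) = \left(\frac{p}{q}\right)$.
Thus $\left(\frac{q}{p'}\right) = \left(\frac{q}{p}\right)$, as claimed in (24).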
Chapter VI
Arithmetical functions and lattice points
§ 1. Generalities. We recall that an arithmetical function is a
complex-valued function defined on the set of positive integers. Many
of the arithmetical functions we shall consider are integer-valued.
An arithmetical function $f$ is multiplicative, if (i) $f$ is not identically zero, and (ii) $f(mn) = f(m)\cdot f(n)$, if $(m, n) = 1$. Condition (i) may be
given an alternative form, namely $f(1)=1$.
Euler's function $\varphi$, introduced in Chapter II, is an example. We
have proved that it is multiplicative, and that $\varphi(p^{a}) = p^{a}(1-1/p)$, for
every prime $p$, and positive integer $a$.
Many arithmetical functions behave irregularly, and it is often more
interesting to study the summatory function of an arithmetical function
$f$, namely
$$F(N) = \sum_{n=1}^{N} f(n),$$
than $f$ itself.
Some of the arithmetical functions in which we are interested have a
simple geometrical interpretation. They count the number of lattice
points in certain regions. A lattice point is a point in $n$-dimensional
Euclidean space, $n \ge 1$, with integer co-ordinates.
§ 2. The lattice point function r(n). The arithmetical function $r(n)$
gives the number of representations of an integer $n \ge 1$ as a sum of
two integral squares; in other words, the number of solutions of the
equation $x^{2} + y^{2} = n$, in integers $x$, $y$. Solutions which differ only in
sign, or order, are counted as distinct. Thus $r(1) = 4$, since $1 = (\pm 1)^{2} + 0^{2}
= 0^{2} + (\pm 1)^{2}$. It follows that $r(n)$ is not multiplicative.
We have seen in Chapter IV, Theorem 7, that $r(n)=0$, if $n$ is a prime
of the form $4k+3$. On the other hand, we have seen in Chapter III,
Theorem 6, that there are infinitely many such primes. Hence $r(n)=0$
for infinitely many values of $n$, and since $r(n) \ge 0$, it follows that
$$\liminf_{n\to\infty} r(n)=0.$$
One can seek to estimate the order of magnitude of $r(n)$, and prove
that $r(n)=O(n^{\varepsilon})$, for every $\varepsilon>0$. That is, $|r(n)|\,n^{-\varepsilon}<K$, where $K$ is a
constant independent of $n$. It is more interesting, however, to study the
(modified) summatory function
$$R(N)=\sum_{n=0}^{N} r(n), \qquad r(0)=1.$$
Geometrically speaking, $R(N)$ is the number of lattice points inside
and on the circumference of the circle $x^{2}+ y^{2} = N$. It is easy to see
that the magnitude of $R(N)$ is approximately equal to the area of the
circle.
THEOREM 1 (GAUSS). $R(N)=\pi N + O(\sqrt{N})$.
PROOF. The lattice points in the plane are the vertices of squares each
of which is of unit area. To each lattice point inside or on the circle
$x^{2} + y^{2} = N$, we can associate a square, of which it is, for instance, the
"south-west" corner. Then $R(N)$ is equal to the sum of the areas of
these squares.
Some squares are not entirely inside the circle; on the other hand,
some parts of the circle are not covered by the squares (Fig. 2).
[Fig. 2]
However, since the diagonal of each square is $\sqrt{2}$, all the squares
are contained inside the circle $x^{2}+ y^{2} = (\sqrt{N} + \sqrt{2})^{2}$, so that
$$R(N)<\pi(\sqrt{N} + \sqrt{2})^{2}.$$
Similarly the squares completely cover the smaller circle of radius
$\sqrt{N} - \sqrt{2}$, so that
$$R(N)>\pi(\sqrt{N} - \sqrt{2})^{2}, \qquad N \ge 2.$$
We thus have
$$\pi(N -2\sqrt{2N} +2)<R(N)<\pi(N +2\sqrt{2N} +2),$$
and hence
$$R(N)=\pi N + O(\sqrt{N}).$$
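The two inequalities in the proof are easy to observe numerically. The following sketch (function names are hypothetical) counts the lattice points column by column: for each integer $x$ with $x^{2} \le N$ there are $2[\sqrt{N-x^{2}}]+1$ admissible values of $y$.

```python
import math


def R(N: int) -> int:
    """Number of lattice points (x, y) with x^2 + y^2 <= N."""
    s = math.isqrt(N)
    return sum(2 * math.isqrt(N - x * x) + 1 for x in range(-s, s + 1))


def gauss_bounds(N: int) -> tuple:
    """Lower and upper bounds pi*(sqrt(N) -/+ sqrt(2))^2 from the proof (N >= 2)."""
    lo = math.pi * (math.sqrt(N) - math.sqrt(2)) ** 2
    hi = math.pi * (math.sqrt(N) + math.sqrt(2)) ** 2
    return lo, hi
```

For a given `N >= 2`, `R(N)` should lie strictly between the two values returned by `gauss_bounds(N)`, and `R(N) - math.pi * N` stays of the order of `math.sqrt(N)`.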
§ 3. The divisor function d(n). The arithmetical function $d(n)$ gives
the number of positive divisors of the positive integer $n$.
THEOREM 2. The divisor function $d(n)$ is multiplicative.
PROOF. We have obviously $d(1)= 1$. And if $(m,n)= 1$, then every
divisor of $mn$ can be uniquely represented as the product of a divisor
of $m$, and of a divisor of $n$. Conversely, every such product is a divisor
of $mn$. Hence $d(mn)=d(m)\cdot d(n)$.
THEOREM 3. If $n > 1$, with the standard form $n=\prod_{i=1}^{r} p_i^{a_i}$, then
$$d(n)=\prod_{i=1}^{r} (a_i+ 1).$$
PROOF. Since $d(n)$ is multiplicative, we have
$$d(n) = \prod_{i=1}^{r} d(p_i^{a_i}).$$
The only positive divisors of $p_i^{a_i}$ are the $(a_i+1)$ integers $1, p_i, p_i^{2}, \ldots, p_i^{a_i}$.
Hence
$$d(n) = \prod_{i=1}^{r} (a_i + 1).$$
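The product formula of Theorem 3 can be compared with a direct count of divisors. This is a minimal sketch with hypothetical helper names; the factorization is done by plain trial division, which is adequate only for small $n$.

```python
def factorize(n: int) -> dict:
    """Standard form of n as a dict {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def d(n: int) -> int:
    """Number of positive divisors of n, as the product of (a_i + 1)."""
    result = 1
    for exponent in factorize(n).values():
        result *= exponent + 1
    return result


def d_brute(n: int) -> int:
    """Number of positive divisors of n, counted directly."""
    return sum(1 for t in range(1, n + 1) if n % t == 0)
```

For example, $12 = 2^{2}\cdot 3$, so `d(12)` is $(2+1)(1+1) = 6$, matching the six divisors 1, 2, 3, 4, 6, 12.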
The divisor function can be interpreted geometrically. The number
of positive divisors of $n$ is equal to the number of solutions of $xy = n$
in positive integers $x$, $y$. Therefore $d(n)$ is the number of lattice points
$(x,y)$ in the "upper right quadrant" of the $(x,y)$-plane, which lie on the
hyperbola $xy=n$.
THE ORDER OF d(n). It follows from Theorem 3 that $d(n)$ can be
made as large as we please. But $d(n) = 2$, if $n$ is a prime. Therefore
$$\liminf_{n\to\infty} d(n)=2.$$
THEOREM 4. For every $\Delta >0$, there exists a sequence of integers $n_i$
for which
$$\frac{d(n_i)}{(\log n_i)^{\Delta}} \to \infty, \quad \text{as } i\to\infty. \tag{1}$$
PROOF. If $\Delta > 0$, let $k$ be the integer defined by $k \le \Delta < k + 1$. Let
$p_{k+1}$ be the $(k+ 1)$th prime, and let
$$n=(2\cdot 3\cdot 5\cdots p_{k+1})^{m},$$
where $m$ is a positive integer. By Theorem 3, we have
$$d(n) = (m+1)^{k+1} > m^{k+1}.$$
But
$$m^{k + 1} = \left\{\frac{\log n}{\log(2\cdot 3\cdot 5 \cdots p_{k+1})}\right\}^{k+ 1} > c(\log n)^{k+ 1}, \tag{2}$$
where $c$ is a constant independent of $n$.
If we now take $m = 1,2,3, \ldots$, we get an infinite sequence of positive
integers $n$, for which
$$d(n) > c(\log n)^{k+ 1},$$
and if we set $k + 1 = \Delta + \delta$ $(\delta > 0)$, then for that sequence, we have
$$\frac{d(n)}{(\log n)^{\Delta}} > c(\log n)^{\delta}\to \infty, \quad \text{as } n\to\infty,$$
so that the theorem is proved.
On the other hand, we have
THEOREM 5. $d(n)=o(n^{\delta})$, for every $\delta>0$.
In other words, $d(n)/n^{\delta}\to 0$, as $n\to\infty$. For the proof of this theorem
we require
THEOREM 6. If $f$ is a multiplicative, arithmetical function, and
$$f(p^{m})\to 0, \quad \text{as} \quad p^{m}\to\infty,$$
where $p$ is a prime, and $m$ a positive integer, (that is, $f(n)\to 0$, as $n$ runs
through the set of prime powers), then $f(n)\to 0$, as $n \to\infty$.
PROOF. Since $f(p^{m})\to 0$, as $p^{m}\to \infty$, $f$ satisfies the following conditions:
(i) there exists a positive constant $A$, such that
$$|f(p^{m})| < A,$$
for all $p$ and $m$;
(ii) there exists a constant $B$, such that if $p^{m}> B$, then $|f(p^{m})| < 1$;
and
(iii) given $\varepsilon>0$, there exists an $N(\varepsilon)$, such that if $p^{m}>N(\varepsilon)$, then
$|f(p^{m})| <\varepsilon$.
Clearly $A$ and $B$ are independent of $\varepsilon$, $p$ and $m$, and $N(\varepsilon)$ depends
only on $\varepsilon$.
Let $n> 1$, with the standard form
$$n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}. \tag{3}$$
Since $f$ is multiplicative, we have
$$f(n) = f(p_1^{a_1})\cdot f(p_2^{a_2}) \cdots f(p_r^{a_r}). \tag{4}$$
Consider all prime powers $p^{a}$, and let $C$ be the number of those prime
powers which do not exceed $B$. Then $C$ is independent of $n$ and $\varepsilon$. For
the corresponding factors $f(p_i^{a_i})$ in (4) we can apply inequality (i); their
product, in absolute value, is therefore less than $A^{C}$. The remaining
factors of $f(n)$ are, in absolute value, all less than 1, by (ii).
Again there are only finitely many integers of the form $p^{a}$ which do
not exceed $N(\varepsilon)$. Therefore there are only finitely many integers whose
standard form contains only factors of the form $p^{a}$ with $p^{a}\le N(\varepsilon)$. Let
$P(\varepsilon)$ be the upper bound of all such integers.
If we now choose $n > P(\varepsilon)$, then the standard form of $n$ must contain
at least one factor $p^{a} > N(\varepsilon)$, and we can therefore apply (iii), namely
$|f(p^{a})| <\varepsilon$.
Hence, if $n > P(\varepsilon)$, then we have
$$|f(n)|< A^{C}\cdot\varepsilon,$$
so that $f(n)\to 0$ as $n\to\infty$.
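PROOF OF THEOREM 5. The function $f(n) = d(n)/n^{\delta}$ is multiplicative,
and
$$f(p^{m}) = \frac{m+1}{p^{m\delta}} \le \frac{2m}{p^{m\delta}} = \frac{2}{\log p}\cdot\frac{\log p^{m}}{(p^{m})^{\delta}}.$$
Since $\log p \ge \log 2$, it follows that for every $\delta>0$, we have
$$f(p^{m}) \le \frac{2}{\log 2}\cdot\frac{\log p^{m}}{(p^{m})^{\delta}} \to 0, \quad \text{as} \quad p^{m} \to \infty.$$
Hence, by Theorem 6, we have
$$\frac{d(n)}{n^{\delta}} \to 0, \quad \text{as} \quad n \to\infty,$$
for every $\delta> 0$, as claimed.
It can be shown that given $\varepsilon>0$, there exists a number $N(\varepsilon)$, such
that
$$d(n)<2^{(1+\varepsilon)\frac{\log n}{\log\log n}}, \quad \text{for} \quad n>N(\varepsilon),$$
and that, for infinitely many integers $n$, we have
$$d(n) > 2^{(1-\varepsilon)\frac{\log n}{\log\log n}}.$$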
THE AVERAGE ORDER OF d(n). Let us consider the summatory function
$$D(N) = \sum_{n=1}^{N} d(n).$$
Since $d(n) = \sum_{t\mid n} 1= \sum_{xy=n} 1$, we have
$$D(N)= \sum_{n=1}^{N} d(n)= \sum_{1\le n\le N}\ \sum_{xy=n} 1,$$
or
$$D(N)= \sum_{1\le xy\le N} 1.$$
Clearly $D(N)$ is the number of lattice points in the "first" quadrant
(that is, upper right), which lie on or below the hyperbola $xy = N$, the
points on the axes being excluded since $xy=0$ for them.
To estimate the order of magnitude of $D(N)$, we need
THEOREM 7. If $g$ is a monotone decreasing function of the real variable
$t$, defined for $t\ge 1$, with $g(t)>0$ for $t\ge 1$, then
$$\sum_{1\le n\le X} g(n)= \int_{1}^{X} g(t)\,dt+A+O(g(X)),$$
where $n$ is a positive integer, $X\ge 1$, and $A$ is a constant depending only on $g$.
PROOF. Consider the closed interval $[n, n + 1]$. Since $g$ is decreasing,
we have
$$g(n+ 1)\le \int_{n}^{n+1} g(t)\,dt\le g(n).$$
Therefore
$$0\le A_n=g(n)-\int_{n}^{n+1} g(t)\,dt\le g(n)-g(n+ 1).$$
If $M$ and $N$ are arbitrary positive integers, with $M < N$, then
$$\sum_{n=M}^{N} A_n\le \sum_{n=M}^{N} \{g(n)-g(n+ 1)\} =g(M)-g(N + 1),$$
and since $g(t) > 0$ for $t \ge 1$, it follows that
$$\sum_{n=M}^{N} A_n\le g(M), \quad \text{for all} \quad N>M. \tag{5}$$
In particular, $\sum_{n=1}^{\infty} A_n\le g(1)$, since $g$ is defined at 1, so that the series $\sum_{n=1}^{\infty} A_n$
converges. Set
$$A = \sum_{n=1}^{\infty} A_n.$$
Then, by (5), we have
$$A= \sum_{n=1}^{N} A_n+ \sum_{n=N+1}^{\infty} A_n= \sum_{n=1}^{N} A_n+O(g(N + 1)),$$
or
$$A = \sum_{n=1}^{N} \left\{g(n) - \int_{n}^{n+1} g(t)\,dt\right\} + O(g(N + 1)),$$
from which it follows that
$$\sum_{n=1}^{N} g(n)= \int_{1}^{N+1} g(t)\,dt+A+O(g(N+1)).$$
If we set $N = [X]$, then this takes the form
$$\sum_{1\le n\le X} g(n)=\int_{1}^{[X]+1} g(t)\,dt+A+O(g([X]+1)),$$
where $n$ runs through integer values only.
But $g$ is positive and decreasing, so that
$$\int_{X}^{[X]+1} g(t)\,dt\le g(X), \qquad 0<g([X] + 1)\le g(X),$$
hence
$$\sum_{1\le n\le X} g(n)=\int_{1}^{X} g(t)\,dt+A+O(g(X)),$$
as claimed.
COROLLARY 1. There exists a constant $\gamma$ (Euler's constant), such that
$$\sum_{1\le n\le X} \frac{1}{n} = \log X +\gamma+O\left(\frac{1}{X}\right).$$
COROLLARY 2. Since
$$\int_{2}^{X} \frac{dt}{t\log t} = \log\log X -\log\log 2,$$
we have
$$\sum_{2\le n\le X} \frac{1}{n\log n} = \log\log X +B+O\left(\frac{1}{X\log X}\right),$$
where $B$ is a constant.
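Corollary 1 can be illustrated numerically. In the sketch below the helper names are hypothetical, and the decimal value used for Euler's constant is the standard one, supplied as an assumption since the text does not quote it. The remainder $\sum_{n\le X} 1/n - \log X - \gamma$ is then seen to be positive and smaller than $1/X$ for the sample values tried.

```python
import math

# Standard numerical value of Euler's constant (an assumption, not quoted in the text).
EULER_GAMMA = 0.5772156649015329


def harmonic(X: float) -> float:
    """Sum of 1/n over positive integers n <= X."""
    return sum(1.0 / n for n in range(1, int(X) + 1))


def remainder(X: float) -> float:
    """harmonic(X) - log X - gamma, which Corollary 1 asserts is O(1/X)."""
    return harmonic(X) - math.log(X) - EULER_GAMMA
```

Evaluating `remainder(X) * X` for growing integer `X` gives values that stay bounded, as the corollary predicts.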
We are now in a position to prove
THEOREM 8. $D(N)=N\log N + O(N)$.
PROOF. As already mentioned, $D(N)$ is the number of lattice points
in the upper right quadrant of the $(x,y)$ plane, which lie on or below
the hyperbola $xy = N$, but not on the axes. Clearly these points lie to
the left of the line $x = N$, and below the line $y = N$ (Fig. 3).
[Fig. 3]
We count them by considering the lattice points on each vertical line with an
integral abscissa. The number of lattice points on an ordinate of length
$N/x$ is $[N/x]$, so that
$$D(N) = \sum_{x=1}^{N} \left[\frac{N}{x}\right].$$
If we set $[N/x]=N/x-\theta_x$, $0\le\theta_x<1$, then
$$D(N)=N \sum_{x=1}^{N} \frac{1}{x} - \sum_{x=1}^{N} \theta_x=N \sum_{x=1}^{N} \frac{1}{x} + O(N),$$
since $\sum_{x=1}^{N} \theta_x<N$.
From Corollary 1 of Theorem 7 it follows that
$$D(N)=N\log N + O(N),$$
as claimed.
Theorem 8 can be considerably sharpened. As a first step we prove
THEOREM 9 (DIRICHLET). $D(N)=N\log N +(2\gamma-1)N + O(\sqrt{N})$, where
$\gamma$ is Euler's constant.
PROOF. The hyperbola $xy = N$ is symmetric relative to the line
$x = y$. Therefore the regions ABGEO and CDOFG (in Fig. 4) contain
the same number of lattice points.
[Fig. 4]
The total number of lattice points,
in the "first" quadrant, which are on or below the hyperbola (but not on
the axes) is therefore equal to twice the number of lattice points in
ABGEO, minus the number of lattice points in the square OFGE. Thus
$$D(N)=2 \sum_{\substack{1\le x\le\sqrt{N}\\ 1\le xy\le N}} 1-[\sqrt{N}]^{2} =2 \sum_{1\le x\le\sqrt{N}}\ \sum_{1\le y\le N/x} 1-[\sqrt{N}]^{2} =2 \sum_{1\le x\le\sqrt{N}} \left[\frac{N}{x}\right] - [\sqrt{N}]^{2}.$$
If we set $[N/x]=N/x-\theta_x$, $0\le\theta_x<1$, and $[\sqrt{N}]=\sqrt{N}-\theta$, $0\le\theta<1$,
then we get
$$D(N)=2N \sum_{1\le x\le\sqrt{N}} \frac{1}{x} - 2\sum_{1\le x\le\sqrt{N}} \theta_x - N + 2\theta\sqrt{N} - \theta^{2}.$$
But
$$\sum_{1\le x\le\sqrt{N}} \theta_x = O(\sqrt{N}), \qquad \theta^{2} =O(1),$$
hence
$$D(N)=2N \sum_{1\le x\le\sqrt{N}} \frac{1}{x} - N + O(\sqrt{N}).$$
An application of Corollary 1 of Theorem 7 now gives the result claimed.
The error term $O(\sqrt{N})$ was improved by G. VORONOI to $O(N^{1/3}\log N)$.
It is conjectured that the correct error term is $O(N^{\frac14 +\varepsilon})$, with an arbitrary
$\varepsilon>0$. On the other hand, it is known that the error term is not $O(N^{1/4})$.
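The counting identity used in Dirichlet's proof, $D(N) = 2\sum_{x\le\sqrt{N}} [N/x] - [\sqrt{N}]^{2}$, can be compared with the column count $D(N)=\sum_{x\le N}[N/x]$ from Theorem 8. The sketch below uses hypothetical function names, and the decimal value of Euler's constant is the standard one, supplied as an assumption.

```python
import math

# Standard numerical value of Euler's constant (an assumption, not quoted in the text).
EULER_GAMMA = 0.5772156649015329


def D_columns(N: int) -> int:
    """D(N) as the sum of [N/x] over 1 <= x <= N (the count used for Theorem 8)."""
    return sum(N // x for x in range(1, N + 1))


def D_hyperbola(N: int) -> int:
    """D(N) via the symmetry argument: 2 * sum_{x <= sqrt N} [N/x] - [sqrt N]^2."""
    s = math.isqrt(N)
    return 2 * sum(N // x for x in range(1, s + 1)) - s * s


def dirichlet_error(N: int) -> float:
    """D(N) - (N log N + (2*gamma - 1) N), which Theorem 9 asserts is O(sqrt N)."""
    return D_hyperbola(N) - (N * math.log(N) + (2 * EULER_GAMMA - 1) * N)
```

The second form needs only about $\sqrt{N}$ terms, which is the practical gain from the symmetry of the hyperbola.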
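§ 4. The function σ(n). Associated with the function $d(n)$ is the arithmetical function $\sigma(n)$ which gives the sum of the positive divisors of $n$.
More generally, one can define
$$\sigma_k(n)= \sum_{d\mid n} d^{k}, \qquad k=0,1,2, \ldots,$$
so that $\sigma_0(n)=d(n)$ and $\sigma(n)=\sigma_1(n)$.
THEOREM 10. The arithmetical function $\sigma_k(n)$ is multiplicative.
PROOF. The same considerations as in Theorem 2 imply that if
$(m, n) = 1$, then
$$\sum_{d\mid m} d\cdot \sum_{d'\mid n} d'= \sum_{d^{*}\mid mn} d^{*},$$
which shows that $\sigma(n)$ is multiplicative, and similarly also $\sigma_k(n)$.
THEOREM 11. If $n> 1$, with the standard form $n = \prod_{i=1}^{r} p_i^{a_i}$, then
$$\sigma_k(n) = \prod_{i=1}^{r} \left(1 + p_i^{k} + p_i^{2k} + \cdots + p_i^{a_i k}\right). \tag{6}$$
PROOF. Since $\sigma_k$ is multiplicative, we have
$$\sigma_k(n) = \prod_{i=1}^{r} \sigma_k(p_i^{a_i}) = \prod_{i=1}^{r} \left(1 + p_i^{k} + p_i^{2k} + \cdots + p_i^{a_i k}\right).$$
In particular, if $k = 1$, then
$$\sigma(n) = \prod_{i=1}^{r} \frac{p_i^{a_i+1}-1}{p_i-1}. \tag{7}$$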
An old problem concerning the function $\sigma(n)$ is that of perfect
numbers. A positive integer $N$ is called perfect if $\sigma(N)=2N$. That is,
$N$ equals the sum of all its positive divisors which are smaller than $N$.
For example, 6 and 28 are perfect numbers.
A Mersenne number is an integer of the form $2^{n} -1$; if it is a prime,
it is called a Mersenne prime.
Mersenne primes and perfect numbers are connected by the following
THEOREM 12. If $2^{n+ 1}-1$ is a prime, then $2^{n}(2^{n + 1}-1)$ is a perfect
number.
PROOF. Let $N = 2^{n}(2^{n+ 1}-1) = 2^{n} p$, where $p$ is a prime. Then, by (7),
$$\sigma(N)=(2^{n+ 1}-1)(p+ 1)=(2^{n+ 1}-1)2^{n+ 1}=2N.$$
Hence $N$ is a perfect number.
Euler observed that this result has a partial converse, namely
THEOREM 13 (EULER). Every even perfect number is of the form $2^{n} p$,
where $p = 2^{n + 1} -1$ is a Mersenne prime.
PROOF. Let $N=2^{n}N'$ be perfect, $n\ge 1$, and $N'$ odd. Then
$$\sigma(N)=2N =2^{n+1} N'.$$
Since $\sigma$ is multiplicative, we have
$$\sigma(N) = \sigma(2^{n})\, \sigma(N'),$$
and since $\sigma(2^{n}) = 2^{n+ 1} -1$ by (7), we have
$$(2^{n+ 1}-1)\,\sigma(N') = 2^{n+ 1}N'.$$
Hence $(2^{n+1}-1)\mid N'$. If we set $N'=(2^{n+1}-1)N''$, then $\sigma(N')=2^{n+ 1}N''$,
and $N'' <N'$. But $N'+N'' =2^{n+1} N''=\sigma(N')$. Now both $N'$ and $N''$
divide $N'$, and their sum is $\sigma(N')$. Hence $N'$ has no other divisors, and
therefore is a prime. But $N' =(2^{n+ 1}-1)N''$. Therefore $N'' = 1$, and
$N' =2^{n + 1}-1$, which proves Theorem 13.
It is not known whether there exist infinitely many even perfect
numbers (that is, infinitely many primes of the form $2^{n}-1$). Nor is it
known whether there exist odd perfect numbers.
Mersenne primes are primes of the form $2^{n} -1$. It is simple to see
that if $n> 1$, and $a$ is a positive integer, and $a^{n} - 1$ is a prime, then
$a=2$ and $n$ is a prime. For if $a>2$, then $(a-1)\mid(a^{n}-1)$; and if $a=2$,
and $n=kl$, $1<k\le l$, then $(2^{k}-1)\mid(2^{n}-1)$.
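Theorems 12 and 13 can be observed on small cases. The sketch below uses hypothetical helper names and a brute-force divisor sum, which is adequate only for small arguments. It checks that $2^{n}(2^{n+1}-1)$ is perfect whenever $2^{n+1}-1$ is prime, and lists the even perfect numbers in a small range.

```python
import math


def sigma(n: int) -> int:
    """Sum of the positive divisors of n, pairing each divisor t with n // t."""
    total = 0
    for t in range(1, math.isqrt(n) + 1):
        if n % t == 0:
            total += t
            if t != n // t:
                total += n // t
    return total


def is_prime(n: int) -> bool:
    """Primality by trial division (small n only)."""
    if n < 2:
        return False
    return all(n % t for t in range(2, math.isqrt(n) + 1))


def perfect_from_mersenne(limit: int) -> list:
    """Numbers 2^n * (2^(n+1) - 1), 1 <= n <= limit, with 2^(n+1) - 1 prime."""
    return [
        2 ** n * (2 ** (n + 1) - 1)
        for n in range(1, limit + 1)
        if is_prime(2 ** (n + 1) - 1)
    ]
```

For instance, `perfect_from_mersenne(6)` begins with 6 and 28, the two examples quoted in the text.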
§ 5. The Möbius function μ(n). The Möbius function $\mu$ is an arithmetical function defined by the following three properties:
(i) $\mu(1) = 1$;
(ii) $\mu(n)=(-1)^{k}$, if $n$ is a product of $k$ different primes;
(iii) $\mu(n)=0$, otherwise; that is, if $n$ is divisible by a square different
from 1.
An immediate consequence of the definition is
THEOREM 14. The Möbius function $\mu$ is multiplicative.
THEOREM 15. We have
$$\sum_{d\mid n}\mu(d) = \begin{cases} 1, & \text{if } n=1,\\ 0, & \text{if } n> 1. \end{cases}$$
PROOF. Let $n > 1$, with the standard form $n = \prod_{i=1}^{m} p_i^{a_i}$. The only
divisors $d$ of $n$, for which $\mu(d)\neq 0$, are 1 and the products of different primes taken from $p_1, \ldots, p_m$.
Thus, for each $k$ with $1 \le k \le m$, there are exactly $\binom{m}{k}$ such divisors which are products of $k$ different primes, and $\mu(d) = (-1)^{k}$ for each of them;
hence
$$\sum_{d\mid n}\mu(d) = 1 - \binom{m}{1} + \binom{m}{2} - \binom{m}{3} + \cdots = (1-1)^{m} = 0.$$
One can alternatively define the Möbius function by Theorem 15,
and deduce properties (i), (ii), (iii) from it.
The most important applications of this function stem from the
so-called Möbius inversion formulae.
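THEOREM 16. (The first Möbius inversion formula). If $f$ is an arithmetical function, and
$$g(n) = \sum_{d\mid n} f(d),$$
then
$$f(n)= \sum_{d\mid n} \mu(d)\, g\!\left(\frac{n}{d}\right).$$
PROOF.
$$\sum_{d\mid n}\mu(d)\, g\!\left(\frac{n}{d}\right) = \sum_{d\mid n}\mu(d) \sum_{d'\mid\frac{n}{d}} f(d') = \sum_{dd'\mid n} \mu(d)f(d') = \sum_{d'\mid n} f(d') \sum_{d\mid\frac{n}{d'}} \mu(d) = f(n) \quad \text{(by Theorem 15)}.$$
Theorem 16 has a converse given by
THEOREM 17. If
$$h(n) = \sum_{d\mid n}\mu(d)\, f\!\left(\frac{n}{d}\right) = \sum_{d\mid n}\mu\!\left(\frac{n}{d}\right) f(d),$$
then
$$f(n) = \sum_{d\mid n} h(d).$$
PROOF. When $d$ runs through the divisors of $n$, so does $n/d$. Hence
$$\sum_{d\mid n} h(d) = \sum_{d\mid n} h\!\left(\frac{n}{d}\right) = \sum_{d\mid n}\ \sum_{d'\mid\frac{n}{d}} \mu\!\left(\frac{n}{dd'}\right) f(d') = \sum_{dd'\mid n} \mu\!\left(\frac{n}{dd'}\right) f(d') = \sum_{d'\mid n} f(d') \sum_{d\mid\frac{n}{d'}} \mu\!\left(\frac{n}{dd'}\right) = f(n) \quad \text{(by Theorem 15)}.$$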
As an application of Theorem 16, let us consider the relation
$$\sum_{d\mid n} \varphi(d) = n,$$
which was proved in Chapter II, Theorem 6. From Theorem 16 it
follows that
$$\varphi(n) = \sum_{d\mid n} \mu(d)\,\frac{n}{d} = n \sum_{d\mid n} \frac{\mu(d)}{d}. \tag{8}$$
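Theorem 15 and formula (8) lend themselves to a direct check. The following sketch uses hypothetical helper names; $\mu$ is computed from its definition by trial division, and $\varphi$ is computed independently by counting the integers $k$, $1\le k\le n$, with $(k,n)=1$.

```python
from math import gcd


def mobius(n: int) -> int:
    """Moebius function: 0 if a square > 1 divides n, else (-1)^(number of primes)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result


def divisors(n: int) -> list:
    """All positive divisors of n (brute force)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def phi_by_inversion(n: int) -> int:
    """Euler's function via (8): sum over d | n of mu(d) * n / d."""
    return sum(mobius(d) * (n // d) for d in divisors(n))


def phi_by_counting(n: int) -> int:
    """Euler's function by counting k in 1..n with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

For example, `phi_by_inversion(12)` evaluates $12 - 6 - 4 + 2 = 4$, in agreement with the four integers 1, 5, 7, 11 prime to 12.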
As another application, we can consider the von Mangoldt function $\Lambda$,
defined by
$$\Lambda(n) = \begin{cases} \log p, & \text{if } n \text{ is a prime power } p^{m},\ m > 0,\\ 0, & \text{otherwise.} \end{cases}$$
THEOREM 18. $\sum_{d\mid n}\Lambda(d) = \log n$.
PROOF. Let $n> 1$, and have the standard form $n = \prod_{i=1}^{r} p_i^{a_i}$. Then, by
the definition of $\Lambda$, we have
$$\sum_{d\mid n}\Lambda(d) = \sum_{i=1}^{r}\ \sum_{a=1}^{a_i} \Lambda(p_i^{a}) = \sum_{i=1}^{r} a_i\log p_i = \log n,$$
which proves the theorem.
In conjunction with the first Möbius inversion formula, Theorem 18
gives
$$\Lambda(n) = \sum_{d\mid n} \mu(d) \log \frac{n}{d}.$$
Since $\sum_{d\mid n}\mu(d)=0$, if $n> 1$, by Theorem 15, and $\log 1 =0$, it follows that
$$\Lambda(n) = - \sum_{d\mid n}\mu(d)\log d. \tag{9}$$
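THEOREM 19. (The second Möbius inversion formula). If $f$ is a
function defined for $x \ge 1$, and
$$g(x) = \sum_{n\le x} f\!\left(\frac{x}{n}\right),$$
then
$$f(x) = \sum_{n\le x} \mu(n)\, g\!\left(\frac{x}{n}\right), \quad \text{for} \quad x\ge 1,$$
and conversely.
The sum $\sum_{n\le x}$ is interpreted as $\sum_{n=1}^{[x]}$, and a sum without terms is 0.
PROOF. From the definition of $g$ we have, if $x \ge 1$,
$$\sum_{n\le x} \mu(n)\, g\!\left(\frac{x}{n}\right) = \sum_{n\le x} \mu(n) \sum_{m\le\frac{x}{n}} f\!\left(\frac{x}{mn}\right) = \sum_{\substack{m,n\\ 1\le mn\le x}} \mu(n)\, f\!\left(\frac{x}{mn}\right).$$
If we rearrange this last sum by grouping together terms for which
$mn=r$, $1 \le r\le x$, we get
$$\sum_{\substack{m,n\\ 1\le mn\le x}} \mu(n)\, f\!\left(\frac{x}{mn}\right) = \sum_{1\le r\le x} f\!\left(\frac{x}{r}\right) \sum_{n\mid r}\mu(n) = f(x),$$
and the first part of Theorem 19 is proved.
To prove the converse, let
$$f(x) = \sum_{n\le x} \mu(n)\, g\!\left(\frac{x}{n}\right), \quad x \ge 1.$$
Then
$$\sum_{m\le x} f\!\left(\frac{x}{m}\right) = \sum_{m\le x}\ \sum_{n\le\frac{x}{m}} \mu(n)\, g\!\left(\frac{x}{mn}\right) = \sum_{\substack{m,n\\ 1\le mn\le x}} \mu(n)\, g\!\left(\frac{x}{mn}\right),$$
and, as above, this last sum can be written as
$$\sum_{1\le r\le x} g\!\left(\frac{x}{r}\right) \sum_{n\mid r}\mu(n) = g(x).$$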
§ 6. Euler's function φ(n). We return to Euler's function $\varphi$. We know
that $\varphi(n) < n$, if $n> 1$. On the other hand, if $n = p^{m}$, where $p$ is a prime,
$m\ge 1$, and $p> 1/\varepsilon$, $0<\varepsilon< 1$, then [cf. Chapter II]
$$\varphi(n) =n \left(1-\frac{1}{p}\right) > n(1- \varepsilon).$$
From these inequalities we obtain
THEOREM 20.
$$\limsup_{n\to\infty} \frac{\varphi(n)}{n} = 1.$$
Another result on the order of magnitude of $\varphi(n)$ is
THEOREM 21. For every $\delta > 0$, we have
$$\frac{\varphi(n)}{n^{1-\delta}}\to\infty, \quad \text{as} \quad n\to\infty.$$
PROOF. The result is trivial if $\delta > 1$. If $\delta\le 1$, we set
$$f(n)=\frac{n^{1-\delta}}{\varphi(n)}.$$
Then $f$ is multiplicative, and because of Theorem 6, it is sufficient to
prove that $f(p^{m})\to 0$ as $p^{m} \to\infty$. In fact, we have, for every $\delta > 0$,
$$\frac{1}{f(p^{m})} = \frac{\varphi(p^{m})}{p^{m(1-\delta)}} = p^{m\delta}\left(1-\frac{1}{p}\right) \ge \frac{1}{2}\, p^{m\delta}\to\infty, \quad \text{as} \quad p^{m}\to\infty.$$
It follows from Theorem 20, or Theorem 21, that the assertion
$\varphi(n) = O(n^{\Delta})$ is false for every $\Delta < 1$.
THE AVERAGE ORDER OF φ(n). Let us consider the behaviour of the
summatory function of $\varphi$, namely
$$\Phi(t) = \sum_{n\le t} \varphi(n);$$
$\Phi(N)$ is the number of terms in the Farey sequence of order $N$.
THEOREM 22 (MERTENS). $\Phi(t) = \dfrac{3t^{2}}{\pi^{2}} + O(t\log t)$.
PROOF. Since
$$\Phi(t) = \sum_{1\le n\le t}\ \sum_{\substack{1\le m\le n\\ (m,n)=1}} 1 = \sum_{\substack{1\le m\le n\le t\\ (m,n)=1}} 1,$$
we see that $\Phi(t)$ is equal to the number of lattice points with relatively
prime co-ordinates, which lie inside or on a right-angled triangle:
$0<y\le x\le t$.
We consider the square $0 < x \le t$, $0 < y \le t$. The line $x = y$ divides
it into two right-angled triangles, each of which contains the same
number of lattice points, with relatively prime co-ordinates. One of
them is given by $0 < y \le x \le t$. The only lattice point with relatively
prime co-ordinates on the line $x = y$ is the point $x = y = 1$.
If $\Psi(t)$ denotes the number of lattice points with relatively prime
co-ordinates in the above mentioned square, then
$$\Psi(t)=2\Phi(t)-1, \tag{10}$$
for the point $x = y = 1$ is counted in both the triangles.
The total number of lattice points in the square $0<x\le t$, $0<y\le t$
is $[t]^{2}$, so that
$$[t]^{2} = \sum_{\substack{0<m\le t\\ 0<n\le t}} 1.$$
If we arrange them according to the size of the greatest common divisor
of their co-ordinates $m$ and $n$, we have
$$[t]^{2}= \sum_{1\le d\le t}\ \sum_{\substack{0<m\le t\\ 0<n\le t\\ (m,n)=d}} 1. \tag{11}$$
Since $(m, n) = d$, if and only if $(m/d, n/d) = 1$, there exists a one-one
correspondence between the lattice points with the co-ordinates $m$, $n$
such that $0<m\le t$, $0<n\le t$, $(m,n)=d$, and the pairs of integers $m'$, $n'$,
such that
$$0<m'\le\frac{t}{d}, \qquad 0<n'\le\frac{t}{d}, \qquad (m',n') = 1.$$
But by the definition of $\Psi$, there are exactly $\Psi(t/d)$ such pairs $m'$, $n'$.
Hence (11) can be written as
$$[t]^{2}= \sum_{1\le d\le t} \Psi\!\left(\frac{t}{d}\right). \tag{12}$$
Since it is $\Psi(t)$ which we want, we apply the second Möbius inversion
formula to (12), and obtain
$$\Psi(t)= \sum_{1\le d\le t} \mu(d)\left[\frac{t}{d}\right]^{2}, \quad t\ge 1.$$
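Now $t/d = [t/d] + \theta$, with $0 \le \theta < 1$, so that
$$\Psi(t) = \sum_{1\le d\le t}\mu(d)\left\{\frac{t}{d} +O(1)\right\}^{2} = t^{2} \sum_{1\le d\le t} \frac{\mu(d)}{d^{2}}+2t\cdot O\!\left(\sum_{1\le d\le t} \frac{1}{d}\right)+O\!\left(\sum_{1\le d\le t} 1\right),$$
since $|\mu(n)| \le 1$. From Corollary 1 of Theorem 7, we know that
$$2t\cdot O\!\left(\sum_{1\le d\le t} \frac{1}{d}\right) = 2t\cdot O\!\left(\log t+\gamma+O\!\left(\frac{1}{t}\right)\right) =O(t\log t),$$
and $O\!\left(\sum_{1\le d\le t} 1\right) =O(t)$. Hence
$$\Psi(t)=t^{2} \sum_{1\le d\le t} \frac{\mu(d)}{d^{2}} + O(t\log t). \tag{13}$$
To estimate the sum in (13), we observe that
$$\sum_{1\le d\le t} \frac{\mu(d)}{d^{2}} = \sum_{d=1}^{\infty} \frac{\mu(d)}{d^{2}} - \sum_{d=[t]+1}^{\infty} \frac{\mu(d)}{d^{2}},$$
and
$$\left|\sum_{d=[t]+1}^{\infty} \frac{\mu(d)}{d^{2}}\right| < \sum_{d=[t]+1}^{\infty} \frac{1}{d^{2}} < \int_{[t]}^{\infty} \frac{du}{u^{2}} = O\!\left(\frac{1}{t}\right).$$
Thus (13) gives
$$\Psi(t)=t^{2} \sum_{d=1}^{\infty} \frac{\mu(d)}{d^{2}} + O(t\log t). \tag{14}$$
Here the series $\sum_{d=1}^{\infty} \mu(d)/d^{2}$ can be evaluated as follows. Since $\sum_{n=1}^{\infty} n^{-2}$, and
$\sum_{m=1}^{\infty} \mu(m)m^{-2}$ are both absolutely convergent, we can multiply them
out, and get
$$\sum_{n=1}^{\infty} \frac{1}{n^{2}} \cdot \sum_{m=1}^{\infty} \frac{\mu(m)}{m^{2}} = \sum_{\nu=1}^{\infty} \frac{c_\nu}{\nu^{2}}, \quad \text{where} \quad c_\nu = \sum_{k\mid\nu} \mu(k).$$
Since $c_1 =1$, and $c_n=0$ for $n>1$, by Theorem 15, and $\sum_{n=1}^{\infty} n^{-2} =\pi^{2}/6$, we
have
$$\sum_{n=1}^{\infty} \frac{\mu(n)}{n^{2}} = \left(\sum_{n=1}^{\infty} \frac{1}{n^{2}}\right)^{-1} = \frac{6}{\pi^{2}}.$$
If we substitute this in (14), we get
$$\Psi(t) = \frac{6t^{2}}{\pi^{2}} + O(t\log t),$$
which, together with (10), gives
$$\Phi(t) = \frac{3t^{2}}{\pi^{2}} + O(t\log t), \tag{15}$$
as claimed.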
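The identities used in the proof of Theorem 22 can be tested on integers. The sketch below (hypothetical function names) computes $\Phi(N)$ by counting coprime pairs directly, computes $\Psi(N)$ from the inverted formula $\sum_{d\le N}\mu(d)[N/d]^{2}$, and compares $\Phi(N)$ with $3N^{2}/\pi^{2}$.

```python
from math import gcd, log, pi


def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def Phi_by_counting(N: int) -> int:
    """Number of pairs 1 <= m <= n <= N with gcd(m, n) = 1."""
    return sum(1 for n in range(1, N + 1) for m in range(1, n + 1) if gcd(m, n) == 1)


def Psi(N: int) -> int:
    """Coprime pairs in the square, via sum over d <= N of mu(d) * [N/d]^2."""
    return sum(mobius(d) * (N // d) ** 2 for d in range(1, N + 1))


def Phi(N: int) -> int:
    """Phi(N) recovered from relation (10): Psi = 2 * Phi - 1."""
    return (Psi(N) + 1) // 2
```

The difference `Phi(N) - 3 * N * N / pi ** 2` can then be compared with `N * log(N)` for a few values of `N`.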
RELATION BETWEEN φ AND σ. It is interesting to note that the results
on the order of $\varphi$ lead to results on the order of $\sigma$, and vice versa. This
follows from
THEOREM 23. There exists a positive constant $C$, such that
$$C< \frac{\sigma(n)\,\varphi(n)}{n^{2}} < 1, \quad \text{for } n \ge 2. \tag{16}$$
PROOF. If $n = \prod_{p\mid n} p^{a}$, then we know from (7), with the obvious change
in notation,
$$\sigma(n) = \prod_{p\mid n} \frac{p^{a+1}-1}{p- 1} =n \prod_{p\mid n} \frac{1-p^{-a-1}}{1- p^{-1}}.$$
Since
$$\varphi(n)=n \prod_{p\mid n} \left(1 - \frac{1}{p}\right),$$
we have
$$\frac{\sigma(n)\,\varphi(n)}{n^{2}} = \prod_{p\mid n}\left(1 - \frac{1}{p^{a+1}}\right) < 1,$$
which proves the second inequality in (16).
On the other hand,
$$\prod_{p\mid n}\left(1 - \frac{1}{p^{a+1}}\right) \ge \prod_{p\mid n} \left(1 - \frac{1}{p^{2}}\right) > \prod_{p} \left(1 - \frac{1}{p^{2}}\right),$$
since $1-1/p^{2} < 1$, and the product on the right extends over all the
primes $p$. This gives the first inequality in (16).
Chapter VII
Chebyshev's theorem on the distribution
of prime numbers
§ 1. The Chebyshev functions. We have seen in Chapter I that there
are infinitely many prime numbers. If we denote by $\pi(x)$ the number of
primes not exceeding $x$, it follows that $\pi(x)\to\infty$ as $x\to\infty$. The prime
number theorem, which we shall prove in Chapter XI, tells us much
more, namely that
$$\lim_{x\to\infty} \frac{\pi(x)}{x/\log x} = 1.$$
There are several intermediate results of interest, which we shall prove
in this chapter. We begin with a result, due to Euler, that the sum $\sum 1/p$,
extended over all the primes, diverges, from which it follows that the
number of primes is infinite.
THEOREM 1 (EULER). The sum $\sum 1/p$, and the product $\prod (1-1/p)^{-1}$,
are both divergent, as $p$ runs through all the prime numbers.
PROOF. We shall first show that the product diverges, and then
deduce that the series also does. Let
$$P(x)= \prod_{p\le x}\left( 1 - \frac{1}{p}\right)^{-1}, \qquad S(x)= \sum_{p\le x} \frac{1}{p}, \qquad x\ge 2.$$
If $u$ is a real number, $0 < u < 1$, and $m$ a positive integer, we have
$$\frac{1}{1-u} > \frac{1-u^{m + 1}}{1-u} = 1+u+ \cdots +u^{m}.$$
We can set $u = 1/p$, where $p$ is a prime. If we do this for all primes $p \le x$,
and multiply the resulting inequalities, we get
$$P(x)> \prod_{p\le x} \left(1 + \frac{1}{p} + \cdots + \frac{1}{p^{m}}\right).$$
We now choose $m$, such that $2^{m}\ge x$. Then
$$\prod_{p\le x} \left(1 + \frac{1}{p} + \cdots + \frac{1}{p^{m}}\right) \ge \sum_{n=1}^{[x]} \frac{1}{n},$$
for the integers $n$, such that $1 < n \le [x]$, have as prime factors only those
primes $p \le x$, and the inequality $2^{m} \ge x$ ensures that every term in the
sum on the right-hand side comes from the product on the left. Hence
$$P(x)> \sum_{n= 1}^{[x]} \frac{1}{n} > \int_{1}^{[x]+ 1} \frac{du}{u} > \log x.$$
Hence the product $\prod(1-1/p)^{-1}$ diverges.
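To prove the divergence of the series, we consider the expansion
$$\log \left( \frac{1}{1-u}\right) = u + \frac{u^{2}}{2} + \frac{u^{3}}{3} + \cdots, \quad -1 \le u < 1.$$
If $u>0$, we have
$$\log \left( \frac{1}{1-u}\right) - u < \frac{u^{2}}{2}\left(1 + u + u^{2} + \cdots\right).$$
The geometric series on the right converges for $|u| < 1$, so that we get
$$\log \left( \frac{1}{1-u}\right) - u < \frac{u^{2}}{2(1-u)}, \quad 0 < u < 1.$$
Setting $u = 1/p$, for all $p \le x$, and adding together the resulting inequalities, we obtain
$$\log P(x)-S(x)<\frac{1}{2} \sum_{p\le x} \frac{1}{p(p-1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \frac{1}{2},$$
so that
$$S(x)> \log P(x) -\tfrac12> \log\log x -\tfrac12.$$
Hence $\sum 1/p$ diverges, which completes the proof of Theorem 1.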
THE FUNCTIONS ϑ AND ψ. Chebyshev's functions $\vartheta$ and $\psi$ are defined
as follows:
$$\vartheta(x)= \sum_{p\le x} \log p, \quad x>0, \quad p \text{ a prime}, \tag{1}$$
and
$$\psi(x)= \sum_{p^{m}\le x} \log p, \quad x>0. \tag{2}$$
The sum in (2) extends over pairs $p$, $m$, where $p$ is a prime, and $m$ is a
positive integer, such that $p^{m}\le x$. This means that if $p^{m}$ is the highest
power of $p$ not exceeding $x$, then $\log p$ is counted exactly $m$ times in
the sum. For example,
$$\psi(10) = 3\log 2 + 2\log 3 + \log 5 + \log 7.$$
In Chapter VI, § 5, we introduced the von Mangoldt function
$$\Lambda(n) = \begin{cases} \log p, & \text{if } n = p^{m},\ m \text{ a positive integer},\\ 0, & \text{otherwise.} \end{cases}$$
From (2) it is immediate that
$$\psi(x) = \sum_{n\le x} \Lambda(n). \tag{3}$$
Further it follows from (1) and (2) that $e^{\vartheta(x)}$ equals the product of all
primes $p\le x$; and, for $x \ge 1$, $e^{\psi(x)}$ is the least common multiple of all
positive integers $\le x$. If $p^{m}\le x$, then $p\le x^{1/m}$, and conversely. Hence
(2) leads to the relation
$$\psi(x)=\vartheta(x)+\vartheta(x^{1/2})+\vartheta(x^{1/3})+ \cdots, \tag{4}$$
the series being finite, since $\vartheta(x)=0$ for $x<2$. If $p^{m}\le x<p^{m+1}$, $x\ge 1$,
then $\log p$ occurs exactly $m$ times in $\psi(x)$, and $m= [\log x/\log p]$. Hence
we have a fourth expression for $\psi(x)$, namely
$$\psi(x)= \sum_{p\le x} \left[ \frac{\log x}{\log p}\right] \cdot\log p. \tag{5}$$
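The different expressions for $\psi$ can be compared numerically. The sketch below uses hypothetical helper names. It evaluates $\psi(x)$ from (5), with the exponent $[\log x/\log p]$ found by integer multiplication rather than floating-point logarithms, and from (3) as a sum of values of $\Lambda$; it also forms the least common multiple of $1, 2, \ldots, n$ for comparison with $e^{\psi(n)}$.

```python
import math
from math import gcd


def primes_upto(x: int) -> list:
    """Primes p <= x, by the sieve of Eratosthenes."""
    if x < 2:
        return []
    sieve = [True] * (x + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            for q in range(p * p, x + 1, p):
                sieve[q] = False
    return [p for p in range(2, x + 1) if sieve[p]]


def theta(x: int) -> float:
    """Chebyshev's theta: sum of log p over primes p <= x."""
    return sum(math.log(p) for p in primes_upto(x))


def psi(x: int) -> float:
    """Chebyshev's psi via (5): each prime p <= x contributes m * log p, p^m <= x < p^(m+1)."""
    total = 0.0
    for p in primes_upto(x):
        m, power = 0, p
        while power <= x:
            m += 1
            power *= p
        total += m * math.log(p)
    return total


def mangoldt(n: int) -> float:
    """von Mangoldt function: log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = next(q for q in range(2, n + 1) if n % q == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


def lcm_upto(n: int) -> int:
    """Least common multiple of 1, 2, ..., n."""
    result = 1
    for k in range(2, n + 1):
        result = result * k // gcd(result, k)
    return result
```

For small `n`, `round(math.exp(psi(n)))` should reproduce `lcm_upto(n)`; for larger `n` the floating-point exponential is no longer exact, so the comparison is better made through logarithms.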
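We shall now establish a connexion between the functions
$$\frac{\pi(x)}{x/\log x}, \qquad \frac{\vartheta(x)}{x}, \qquad \frac{\psi(x)}{x}.$$
THEOREM 2. Let
$$l_1 = \liminf_{x\to\infty} \frac{\pi(x)}{x/\log x}, \qquad L_1 = \limsup_{x\to\infty} \frac{\pi(x)}{x/\log x},$$
$$l_2 = \liminf_{x\to\infty} \frac{\vartheta(x)}{x}, \qquad L_2 = \limsup_{x\to\infty} \frac{\vartheta(x)}{x},$$
$$l_3 = \liminf_{x\to\infty} \frac{\psi(x)}{x}, \qquad L_3 = \limsup_{x\to\infty} \frac{\psi(x)}{x}.$$
Then $l_1 = l_2 = l_3$, and $L_1 = L_2 = L_3$.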
PROOF. It follows from (4) that $\vartheta(x)\le \psi(x)$, and from (5) that
$$\psi(x)\le \sum_{p\le x} \frac{\log x}{\log p}\cdot\log p=\log x \sum_{p\le x} 1,$$
that is
$$\psi(x)\le \pi(x)\log x.$$
Hence
$$\vartheta(x)\le \psi(x)\le \pi(x)\log x.$$
If we divide throughout by $x$, and let $x \to \infty$, we get
$$L_2 \le L_3 \le L_1. \tag{6}$$
Let us choose a real number $\alpha$, $0 <\alpha < 1$, and keep it fixed. Let $x> 1$.
Then
$$\vartheta(x)\ge \sum_{x^{\alpha} <p\le x} \log p,$$
and since $\log p>\log x^{\alpha}$, we have
$$\vartheta(x)\ge\alpha\log x \sum_{x^{\alpha} <p\le x} 1,$$
which implies that
$$\vartheta(x) \ge \alpha\log x\,\{\pi(x) - \pi(x^{\alpha})\}.$$
But $\pi(x^{\alpha}) < x^{\alpha}$ trivially, so that
$$\vartheta(x) > \alpha\,\pi(x)\log x - \alpha\, x^{\alpha}\log x,$$
or
$$\frac{\vartheta(x)}{x} > \alpha\,\pi(x)\, \frac{\log x}{x} - \alpha\, \frac{\log x}{x^{1-\alpha}}.$$
Since $0<\alpha<1$, it follows that $(\log x/x^{1-\alpha})\to 0$, as $x\to\infty$. Hence
$$L_2 \ge\alpha L_1,$$
for every real $\alpha$, such that $0 <\alpha < 1$. Hence $L_2 \ge L_1$. On combining
this with (6), we get $L_1 =L_2 =L_3$.
The proof that $l_1 = l_2 = l_3$ runs along similar lines.
It follows from Theorem 2 that if one of the three functions
$$\frac{\pi(x)}{x/\log x}, \qquad \frac{\vartheta(x)}{x}, \qquad \frac{\psi(x)}{x}$$
tends to a limit as $x\to\infty$, then so do the others, and all three limits
are the same. Thus in order to prove the prime number theorem, it is
sufficient to show that $\lim_{x\to\infty} \psi(x)/x = 1$.
§ 2. Chebyshev's theorem. We shall use Theorem 2 to prove the
following
THEOREM 3 (CHEBYSHEV). There exist constants $a$ and $A$, $0<a<A$,
such that if $x$ is sufficiently large, we have
$$a\, \frac{x}{\log x} < \pi(x)<A\, \frac{x}{\log x}.$$
PROOF. Let
$$l= \liminf_{x\to\infty} \frac{\pi(x)}{x/\log x}, \qquad L= \limsup_{x\to\infty} \frac{\pi(x)}{x/\log x}.$$
We shall prove Theorem 3 by showing that $L\le 4\log 2$, and $l\ge\log 2$.
By Theorem 2 these two inequalities are, however, equivalent to
$$L= \limsup_{x\to\infty} \frac{\vartheta(x)}{x} \le 4\log 2, \tag{7}$$
$$l= \liminf_{x\to\infty} \frac{\psi(x)}{x} \ge \log 2. \tag{8}$$
PROOF OF (7). The binomial coefficient
$$N = \binom{2n}{n} = \frac{(n+ 1)(n+2) \cdots (2n)}{1\cdot 2\cdot 3\cdots n}$$
has the following properties: (i) $N$ is an integer, which occurs as the
largest term in the binomial expansion of $(1 + 1)^{2n}$, which has $(2n+ 1)$
positive terms, so that
$$N < 2^{2n} < (2n+1)N; \tag{9}$$
(ii) $N$ is divisible by the product of all primes $p$, such that $n < p \le 2 n$,
for every such prime appears in the numerator of $N$, while its denominator
is not divisible by any prime $p > n$.
Because of (ii), we have $N \ge \prod_{n<p\le 2n} p$, hence
$$\log N\ge \sum_{n<p\le 2n} \log p=\vartheta(2n)-\vartheta(n).$$
But from (9) we get $\log N <2n\log 2$. Hence
$$\vartheta(2n)- \vartheta(n) < 2n\log 2. \tag{10}$$
If we set $n=1,2,2^{2}, \ldots,2^{m-1}$ in (10), and add the resulting inequalities,
we get
$$\vartheta(2^{m})-\vartheta(1)<\log 2 \sum_{r=1}^{m} 2^{r}<2^{m+1}\log 2,$$
or
$$\vartheta(2^{m}) < 2^{m+1}\log 2, \tag{11}$$
since $\vartheta(1)=0$.
Now let $x>1$, and $m$ a positive integer, such that $2^{m-1}\le x<2^{m}$.
Since the function $\vartheta$ is non-decreasing, (11) gives
$$\vartheta(x) \le \vartheta(2^{m}) < 2^{m+ 1}\log 2 \le 4x\log 2.$$
Hence
$$\frac{\vartheta(x)}{x} < 4\log 2,$$
which implies that
$$L= \limsup_{x\to \infty} \frac{\vartheta(x)}{x} \le 4\log 2,$$
as claimed in (7).
PROOF OF (8). The second part of Chebyshev's theorem is proved
differently. It uses an important formula for the number of times a given
prime divides m!
We say that a prime p divides the integer n exactly k times, if $p^k \mid n$,
and $p^{k+1} \nmid n$.
LEMMA. The number of times a prime p exactly divides m! is equal to
\[ \left[\frac{m}{p}\right] + \left[\frac{m}{p^2}\right] + \left[\frac{m}{p^3}\right] + \cdots, \]
the series being finite since $[x] = 0$ for $0 < x < 1$.
Among the integers $1, 2, \ldots, m$, there are exactly $[m/p]$ which are
divisible by p, namely
\[ p,\ 2p,\ \ldots,\ \left[\frac{m}{p}\right] p. \tag{12} \]
The integers between 1 and m which are divisible by $p^2$ (a subset of
the set (12)) are
\[ p^2,\ 2p^2,\ \ldots,\ \left[\frac{m}{p^2}\right] p^2, \tag{13} \]
which are $[m/p^2]$ in number, and so on.
The number of integers between 1 and m, which are divisible by $p^r$
but not by $p^{r+1}$ is exactly $[m/p^r] - [m/p^{r+1}]$. Hence p divides m! exactly
\[ \sum_{r \ge 1} r\left(\left[\frac{m}{p^r}\right] - \left[\frac{m}{p^{r+1}}\right]\right) = \sum_{r \ge 1} \left[\frac{m}{p^r}\right] \tag{14} \]
times, which proves the lemma.
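The lemma can be checked on small cases. The sketch below, with function names chosen for this illustration, evaluates the sum $[m/p] + [m/p^2] + \cdots$ and compares it with a direct count of the factors p in $1, 2, \ldots, m$.

```python
def legendre_exponent(m, p):
    """Exponent of the prime p in m!, as [m/p] + [m/p^2] + ... (a finite sum)."""
    total, power = 0, p
    while power <= m:
        total += m // power
        power *= p
    return total

def direct_exponent(m, p):
    """Same exponent, obtained by dividing p out of each of 1, 2, ..., m."""
    count = 0
    for k in range(1, m + 1):
        while k % p == 0:
            k //= p
            count += 1
    return count

# 100! is divisible by 2 exactly 97 times and by 5 exactly 24 times.
print(legendre_exponent(100, 2), legendre_exponent(100, 5))  # prints "97 24"
```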
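In order to prove (8), we consider the integer
\[ N = \binom{2n}{n} = \frac{(2n)!}{(n!)^2}. \]
Let p be any prime, such that $p \le 2n$. Then the numerator of N is divisible by p exactly
\[ \sum_{r \ge 1} \left[\frac{2n}{p^r}\right] \]
times, and n! is divisible by p exactly
\[ \sum_{r \ge 1} \left[\frac{n}{p^r}\right] \]
times, so that the denominator of N is divisible by p exactly
\[ 2\sum_{r \ge 1} \left[\frac{n}{p^r}\right] \]
times. Hence N is divisible by p exactly $\nu_p$ times, where
\[ \nu_p = \sum_{r \ge 1} \left(\left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right]\right). \]
Therefore
\[ N = \prod_{p \le 2n} p^{\nu_p}. \]
Since
\[ \left[\frac{2n}{p^r}\right] = \left[\frac{n}{p^r}\right] = 0 \quad\text{when } p^r > 2n, \]
that is when
\[ r > \left[\frac{\log 2n}{\log p}\right], \]
we have
\[ \nu_p = \sum_{r=1}^{M_p} \left(\left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right]\right), \qquad M_p = \left[\frac{\log 2n}{\log p}\right]. \tag{15} \]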
However, for any real y, we have
\[ [y] \le y < [y] + 1, \]
or
\[ 2[y] \le 2y < 2[y] + 2, \]
and
\[ [2y] \le 2y < [2y] + 1, \]
from which it follows that $-1 < [2y] - 2[y] < 2$, hence
\[ [2y] - 2[y] = 0, \text{ or } 1. \tag{16} \]
On using this in (15), we get $\nu_p \le M_p$, hence
\[ N = \prod_{p \le 2n} p^{\nu_p} \le \prod_{p \le 2n} p^{M_p}. \tag{17} \]
On the other hand, (5) and (15) give
\[ \psi(2n) = \sum_{p \le 2n} \left[\frac{\log 2n}{\log p}\right] \log p = \sum_{p \le 2n} M_p \log p, \]
so that
\[ e^{\psi(2n)} = \prod_{p \le 2n} p^{M_p}, \]
hence by (17),
\[ \log N \le \psi(2n). \]
From (9) we have
\[ \log N > 2n\log 2 - \log(2n+1). \]
Hence for every positive integer n, we have
\[ \psi(2n) > 2n\log 2 - \log(2n+1). \tag{18} \]
Let x be now a real number, $x > 2$, and let $n = [x/2] \ge 1$. Then
$n > (x/2) - 1$, and $2n \le x$. From (18), therefore, we get
\[ \psi(x) \ge \psi(2n) > (x-2)\log 2 - \log(x+1), \]
or
\[ \frac{\psi(x)}{x} > \frac{x-2}{x}\log 2 - \frac{\log(x+1)}{x}, \]
hence
\[ l = \liminf_{x\to\infty} \frac{\psi(x)}{x} \ge \log 2, \]
which proves Theorem 3.
It follows from Theorem 3 that the number of primes is infinite,
and, in fact, that the series $\sum 1/p$, extended over all the primes, diverges.
Let $p_n$ be the nth prime. Then $\pi(p_n) = n$, and since we have
\[ \pi(x) > a\cdot\frac{x}{\log x}, \qquad a > 0, \]
for sufficiently large x, it follows that
\[ n = \pi(p_n) > a\cdot\frac{p_n}{\log p_n} > \sqrt{p_n}, \]
if n is sufficiently large. Hence $\log p_n < 2\log n$, so that
\[ a p_n < n\log p_n < 2n\log n, \]
for sufficiently large n. It follows that the series $\sum_{n=1}^{\infty} 1/p_n$ diverges, in
comparison with the divergent series $\sum_{n=2}^{\infty} 1/(n\log n)$.
§ 3. Bertrand's postulate. The following theorem was conjectured
by BERTRAND but first proved by CHEBYSHEV.
THEOREM 4 (Bertrand's postulate). If n is a positive integer, there
exists a prime p such that $n < p \le 2n$.
Chebyshev's proof of this is based on ideas similar to those used in
the proof of Theorem 3. The result is first proved for large values of n,
and then verified for smaller values with the aid of a table of primes.
We shall give here a proof due to S. S. PILLAI, which is simpler,
inasmuch as it avoids the use of Stirling's formula for $\Gamma(n)$, and reduces
the number of verifications to a minimum.
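In proving Chebyshev's theorem, we applied inequality (9), namely
\[ \frac{2^{2n}}{2n+1} < N < 2^{2n}, \]
for the binomial coefficient $N = \binom{2n}{n}$, and deduced (11) from it, namely
\[ \vartheta(2^m) < 2^{m+1}\log 2. \tag{11} \]
We shall now require the sharper estimate
\[ \frac{2^{2n}}{2\sqrt{n}} < N < \frac{2^{2n}}{\sqrt{2n}}, \qquad n \ge 2, \tag{19} \]
in order to prove that (11) holds not only for powers of 2, but for all
positive integers n, that is
\[ \vartheta(n) < 2n\log 2, \qquad n \ge 1. \tag{20} \]
PROOF OF (19). Define the number
\[ P = \frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}. \]
Since
\[ P = \frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}\cdot\frac{2\cdot 4\cdot 6\cdots(2n)}{2\cdot 4\cdot 6\cdots(2n)} = \frac{(2n)!}{2^{2n}(n!)^2}, \]
we have $2^{2n}P = N$. It is obvious that
\[ 1 > \left(1-\frac{1}{2^2}\right)\left(1-\frac{1}{4^2}\right)\left(1-\frac{1}{6^2}\right)\cdots\left(1-\frac{1}{(2n)^2}\right), \]
which can also be written as
\[ 1 > \left(\frac{1\cdot 3}{2^2}\right)\left(\frac{3\cdot 5}{4^2}\right)\left(\frac{5\cdot 7}{6^2}\right)\cdots\left(\frac{(2n-1)(2n+1)}{(2n)^2}\right), \]
or
\[ 1 > (2n+1)P^2 > 2nP^2 = \frac{2n}{2^{4n}}N^2, \]
which gives the second inequality in (19).
Similarly we have
\[ 1 > \left(1-\frac{1}{3^2}\right)\left(1-\frac{1}{5^2}\right)\left(1-\frac{1}{7^2}\right)\cdots\left(1-\frac{1}{(2n-1)^2}\right), \]
which can be written as
\[ 1 > \left(\frac{2\cdot 4}{3^2}\right)\left(\frac{4\cdot 6}{5^2}\right)\left(\frac{6\cdot 8}{7^2}\right)\cdots\left(\frac{(2n-2)\,2n}{(2n-1)^2}\right), \]
or
\[ 1 > \frac{1}{4nP^2} = \frac{2^{4n}}{4nN^2}, \]
which gives the first inequality in (19). Thus (19) is proved.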
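Inequality (19) can be tested exactly in integer arithmetic by squaring both sides: the lower bound becomes $4nN^2 > 2^{4n}$ and the upper bound $2nN^2 < 2^{4n}$. The following sketch (the function name `check_19` is an assumption of this example) does so with Python's `math.comb`.

```python
import math

def check_19(n):
    """True if 2^(2n)/(2 sqrt(n)) < C(2n, n) < 2^(2n)/sqrt(2n), in exact integers."""
    N = math.comb(2 * n, n)
    four_pow = 4 ** (2 * n)              # 2^(4n)
    lower = 4 * n * N * N > four_pow     # squared form of the first inequality
    upper = 2 * n * N * N < four_pow     # squared form of the second inequality
    return lower and upper

print(all(check_19(n) for n in range(2, 300)))
print(check_19(1))  # the lower bound is an equality at n = 1, hence the restriction n >= 2
```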
PROOF OF (20). This is trivial for n = 1 and n = 2. Assuming that
it is true for some $n \ge 2$, we shall deduce that $\vartheta(2n-1) < 2(2n-1)\log 2$,
which would imply that
\[ \vartheta(2n) = \vartheta(2n-1) < 4n\log 2. \]
Consider the integer
\[ \frac{N}{2} = \frac{1}{2}\binom{2n}{n} = \frac{(2n)!}{(n!)^2}\cdot\frac{n}{2n} = \frac{(2n-1)!}{n!\,(n-1)!} = \binom{2n-1}{n-1}. \]
This is divisible by all primes p, such that $n < p \le 2n-1$, and therefore
also by their product. Hence
\[ \frac{N}{2} \ge \prod_{n<p\le 2n-1} p. \]
On taking logarithms, we get
\[ \log\frac{N}{2} \ge \vartheta(2n-1) - \vartheta(n). \]
But from (19) we have
\[ \log N < 2n\log 2 - \tfrac{1}{2}\log 2n. \]
On combining these two inequalities, we get
\[ \vartheta(2n-1) - \vartheta(n) < (2n-1)\log 2 - \tfrac{1}{2}\log 2n. \]
But, by hypothesis, we have $\vartheta(n) < 2n\log 2$, hence
\[ \vartheta(2n-1) < 2n\log 2 + (2n-1)\log 2 - \tfrac{1}{2}\log 2n, \]
which implies, since $n \ge 2$, that
\[ \vartheta(2n-1) < 2(2n-1)\log 2, \]
which is the sought inequality. Thus if (20) is proved for a certain positive
integer $n \ge 2$, then it also holds for the integer $2n-1$, and hence for
2n. If $\vartheta(n) < 2n\log 2$, for every n in an interval of the form
\[ 2^{r-1} < n \le 2^{r}, \qquad r \ge 1, \]
then it is true also for every n in the interval
\[ 2^{r} < n \le 2^{r+1}. \]
It follows by induction that (20) is true for $n \ge 1$.
We shall need (19) and (20) for Pillai's proof of Theorem 4.
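PROOF OF THEOREM 4 (S. S. PILLAI). In order to prove Theorem 4,
we shall prove that $\vartheta(2n) - \vartheta(n) > 0$ for $n \ge 2^6$, and verify the inequality
directly for $1 \le n < 2^6$.
We consider once again the binomial coefficient (cf. (17))
\[ N = \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \prod_{p \le 2n} p^{\nu_p}, \]
where
\[ \nu_p = \sum_{r=1}^{M_p} \left(\left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right]\right). \]
Then
\[ \log N = \sum_{p \le 2n} \nu_p \log p. \tag{21} \]
We split this sum into four parts $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ and $\Sigma_4$, corresponding
to the following four different ranges of values of the prime p, namely
(i) $n < p \le 2n$;
(ii) $\tfrac{2n}{3} < p \le n$;
(iii) $\sqrt{2n} < p \le \tfrac{2n}{3}$, $n \ge 5$;
(iv) $p \le \sqrt{2n}$.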
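In $\Sigma_1$ we have $n/p < 1$, so that $[n/p] = 0$; and $1 \le 2n/p < 2$, so that
$[2n/p] = 1$, and $[2n/p^2] = 0$. Hence $\nu_p = 1$, and we obtain
\[ \Sigma_1 = \sum_{n<p\le 2n} \nu_p \log p = \sum_{n<p\le 2n} \log p = \vartheta(2n) - \vartheta(n). \tag{22} \]
In $\Sigma_2$ we have $1 \le n/p < \tfrac{3}{2}$, so that $[n/p] = 1$ and $[2n/p] = 2$. Further,
if $n \ge 3$, then $[2n/p^2] = 0$. Hence
\[ \Sigma_2 = 0, \quad\text{for } n \ge 3. \tag{23} \]
In $\Sigma_3$ we have $n \ge 5$, and $n/p^2 < 2n/p^2 < 1$, so that $\nu_p = [2n/p] - 2[n/p] = 0$, or 1 (cf. (16)). Hence
\[ \Sigma_3 \le \sum_{\sqrt{2n}<p\le 2n/3} \log p = \vartheta\!\left(\tfrac{2n}{3}\right) - \vartheta\!\left(\sqrt{2n}\right). \]
But
\[ \vartheta\!\left(\sqrt{2n}\right) = \sum_{p \le \sqrt{2n}} \log p \ge \log 2 \sum_{p \le \sqrt{2n}} 1 = \pi\!\left(\sqrt{2n}\right)\log 2. \]
Hence
\[ \Sigma_3 \le \vartheta\!\left(\tfrac{2n}{3}\right) - \pi\!\left(\sqrt{2n}\right)\log 2. \tag{24} \]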
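In $\Sigma_4$ we apply Chebyshev's inequality (cf. (17))
\[ \nu_p \le M_p = \left[\frac{\log 2n}{\log p}\right], \]
and get
\[ \Sigma_4 \le \sum_{p \le \sqrt{2n}} M_p \log p \le \sum_{p \le \sqrt{2n}} \frac{\log 2n}{\log p}\cdot\log p = \log 2n \sum_{p \le \sqrt{2n}} 1, \]
that is
\[ \Sigma_4 \le \pi\!\left(\sqrt{2n}\right)\log 2n. \tag{25} \]
By combining (21), (22), (23), (24) and (25), we obtain, for $n \ge 5$,
\[ \log N \le \vartheta(2n) - \vartheta(n) + \vartheta\!\left(\tfrac{2n}{3}\right) - \pi\!\left(\sqrt{2n}\right)(\log 2 - \log 2n), \]
which can be written as
\[ \vartheta(2n) - \vartheta(n) \ge \log N - \vartheta\!\left(\tfrac{2n}{3}\right) - \pi\!\left(\sqrt{2n}\right)\log n. \tag{26} \]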
From this we shall deduce that $\vartheta(2n) - \vartheta(n) > 0$, for sufficiently large n.
For this purpose we need three inequalities:
(a) $\log N > 2n\log 2 - \log(2\sqrt{n})$,
which is a consequence of the first inequality in (19);
(b) $\vartheta\!\left(\tfrac{2n}{3}\right) = \vartheta\!\left(\left[\tfrac{2n}{3}\right]\right) < 2\left[\tfrac{2n}{3}\right]\log 2$, if $n \ge 2$,
because of (20); and
(c) $\pi(n) \le \tfrac{n}{2}$, if $n \ge 8$,
because every even integer greater than 2 is composite.
On using (a), (b) and (c) in (26), we get, for $n \ge 32$,
\[ \vartheta(2n) - \vartheta(n) > 2n\log 2 - \log(2\sqrt{n}) - \frac{4n}{3}\log 2 - \frac{\sqrt{2n}}{2}\log n, \]
which can also be written as
\[ \vartheta(2n) - \vartheta(n) > \left(\frac{2n}{3} - 1\right)\log 2 - \frac{\sqrt{2n}+1}{2}\log n. \tag{27} \]
It remains for us to show that
\[ \left(\frac{2n}{3} - 1\right)\log 2 - \frac{\sqrt{2n}+1}{2}\log n > 0, \tag{28} \]
for sufficiently large n. It is easy to see that (28) holds for $n = 2^6$. We
shall prove that it holds also for $n > 2^6$. For this purpose we write (28)
in the form
\[ \sqrt{2n} - \frac{3}{2}\,\frac{\log n}{\log 2} - \frac{3\sqrt{2}}{\log 2}\cdot\frac{\log\sqrt{4n}}{\sqrt{4n}} > 0. \tag{29} \]
If we replace n by a real variable x, and observe that both the functions
\[ \sqrt{2x} - \frac{3}{2}\,\frac{\log x}{\log 2}, \quad\text{and}\quad -\frac{3\sqrt{2}}{\log 2}\cdot\frac{\log\sqrt{4x}}{\sqrt{4x}} \]
have a positive derivative for $x \ge 2^6$, so that they are increasing in that
range, while their sum is positive for $x = 2^6$, it follows that the sum
remains positive for $x > 2^6$. Hence
\[ \vartheta(2n) - \vartheta(n) > 0, \qquad n \ge 2^6. \tag{30} \]
That is, Bertrand's postulate is true for $n \ge 2^6 = 64$.
Now every prime, but the first, in the sequence
\[ 2, 3, 5, 7, 13, 23, 43, 67 \tag{31} \]
is smaller than twice its predecessor. Hence to each positive integer
$n \le 66$, there corresponds at least one prime p, such that $n < p \le 2n$.
This completes the proof of Theorem 4.
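The finite verification at the end of the proof is easy to reproduce. The sketch below (helper names are assumptions of this example, and trial division is used only for brevity) checks the chain (31) and tests Theorem 4 directly for a range of n.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True

def has_prime_between(n):
    """True if some prime p satisfies n < p <= 2n."""
    return any(is_prime(p) for p in range(n + 1, 2 * n + 1))

# The sequence (31): each term is prime and smaller than twice its predecessor.
chain = [2, 3, 5, 7, 13, 23, 43, 67]
chain_ok = (all(is_prime(p) for p in chain)
            and all(b < 2 * a for a, b in zip(chain, chain[1:])))

print(chain_ok, all(has_prime_between(n) for n in range(1, 2000)))
```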
§ 4. Euler's identity. The identity
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} (1 - p^{-s})^{-1}, \qquad s \text{ real},\ s > 1, \tag{32} \]
where p runs through all the primes, is a special case of the following
THEOREM 5. Let f be a multiplicative arithmetical function, and let the
series $\sum_{n=1}^{\infty} f(n)$ be absolutely convergent. Then we have the identity
\[ \sum_{n=1}^{\infty} f(n) = \prod_{p} \left(1 + f(p) + f(p^2) + \cdots\right), \tag{33} \]
where the product on the right-hand side is absolutely convergent.
If f is completely multiplicative, that is $f(mn) = f(m)f(n)$, for all
positive integers m, n, then
\[ \sum_{n=1}^{\infty} f(n) = \prod_{p} (1 - f(p))^{-1}. \tag{34} \]
PROOF. Since f is multiplicative, $f(1) = 1$. Let
\[ P(x) = \prod_{p \le x} \left(1 + f(p) + f(p^2) + \cdots\right). \]
Since P(x) is the product of finitely many absolutely convergent series,
we can multiply them out and get
\[ P(x) = \sum f(n'), \]
where n' runs through all positive integers which have no prime factor
greater than x. If we set
\[ S = \sum_{n=1}^{\infty} f(n), \]
then
\[ P(x) - S = -\sum f(n''), \]
where n'' runs through all positive integers which have at least one
prime factor greater than x. Obviously $n'' > x$, so that
\[ |P(x) - S| \le \sum |f(n'')| \le \sum_{n > x} |f(n)|. \]
If we let $x \to \infty$, then $\sum_{n>x} |f(n)| \to 0$, since $\sum_{n=1}^{\infty} |f(n)|$ is, by hypothesis,
convergent. Hence $\lim_{x\to\infty} P(x) = S$, as claimed in (33).
The product on the right-hand side of (33) converges absolutely,
since
\[ \sum_{p \le x} |f(p) + f(p^2) + \cdots| \le \sum_{p \le x} \left(|f(p)| + |f(p^2)| + \cdots\right) \le \sum_{n=2}^{\infty} |f(n)| < \infty. \tag{35} \]
We now consider the case in which f is completely multiplicative.
We see from (35) that the series
\[ \sum_{p} \left(|f(p)| + |f(p^2)| + \cdots\right), \]
extended over all the primes, is convergent. But now $f(p^n) = (f(p))^n$,
hence
\[ \sum_{p} \left(|f(p)| + |f(p)|^2 + \cdots\right) \]
is convergent. Since each term in this sum is a geometric series, it follows
that $|f(p)| < 1$. Hence
\[ \sum_{n=1}^{\infty} f(n) = \prod_{p} \left(1 + f(p) + f(p^2) + \cdots\right) = \prod_{p} \left(1 + f(p) + (f(p))^2 + \cdots\right) = \prod_{p} (1 - f(p))^{-1}, \]
which completes the proof of Theorem 5.
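For $f(n) = n^{-s}$ the two sides of (34) can be compared numerically. The sketch below takes s = 2, where the common value is the well-known $\pi^2/6$; the truncation points and function names are assumptions chosen for this illustration.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n (assumes n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]

def zeta_partial_sum(s, terms):
    """Sum of n^(-s) for 1 <= n <= terms."""
    return sum(n ** -s for n in range(1, terms + 1))

def euler_partial_product(s, bound):
    """Product of (1 - p^(-s))^(-1) over primes p <= bound, i.e. P(bound) in the proof."""
    product = 1.0
    for p in primes_up_to(bound):
        product *= 1.0 / (1.0 - p ** -s)
    return product

s = 2.0
print(zeta_partial_sum(s, 100_000), euler_partial_product(s, 10_000), math.pi ** 2 / 6)
```

Both truncations lie below the limit, as the proof predicts: P(x) is the sum of $f(n')$ over integers with no prime factor greater than x, which omits only positive terms.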
Euler's identity results from (34), if we set $f(n) = n^{-s}$, $s > 1$. Let
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} (1 - p^{-s})^{-1} \qquad (s \text{ real},\ s > 1). \]
Then
\[ \log\zeta(s) = -\sum_{p} \log(1 - p^{-s}) = \sum_{p}\sum_{m} \frac{1}{m\,p^{ms}}, \]
where p runs through all the primes, and m through all positive integers.
Differentiating term by term, we get
\[ \frac{\zeta'(s)}{\zeta(s)} = -\sum_{p} \frac{p^{-s}\log p}{1 - p^{-s}} = -\sum_{p}\sum_{m} \frac{\log p}{p^{ms}}, \]
hence
\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \qquad (s \text{ real},\ s > 1), \tag{36} \]
where $\Lambda$ is the von Mangoldt function defined in Chapter VI, § 5. The
term-wise differentiation is permissible, because both the series
$\sum_{p} \log(1 - p^{-s})$, and $\sum_{p} \dfrac{p^{-s}\log p}{1 - p^{-s}}$ converge uniformly for $s \ge 1 + \delta > 1$.
The right-hand side of (36) is a Dirichlet series of the form $\sum_{n=1}^{\infty} a_n n^{-s}$,
whose coefficients $a_n$ are given by the von Mangoldt function $\Lambda(n)$.
With the help of (36) we shall show that if any of the functions
\[ \frac{\pi(x)}{x/\log x}, \qquad \frac{\vartheta(x)}{x}, \qquad \frac{\psi(x)}{x} \]
tends to a limit as $x \to \infty$, that limit must be equal to 1. We know already
from Theorem 2 that if any of these three functions tends to a limit, so
do the others, and all three limits are the same.
We shall work with the function $\psi(x)/x$, and use the relation
\[ \psi(x) = \sum_{n \le x} \Lambda(n). \]
We shall need the identity
\[ -\frac{\zeta'(s)}{\zeta(s)} = s\int_{1}^{\infty} \frac{\psi(x)}{x^{s+1}}\,dx \qquad (s \text{ real},\ s > 1). \]
This can be obtained from Abel's summation formula.
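THEOREM 6 (ABEL). Let $0 \le \lambda_1 \le \lambda_2 \le \cdots$ be a sequence of real numbers, such that $\lambda_n \to \infty$ as $n \to \infty$, and let $(a_n)$ be a sequence of complex
numbers. Let $A(x) = \sum_{\lambda_n \le x} a_n$, and $\varphi(x)$ a complex-valued function defined
for $x \ge 0$. Then
\[ \sum_{n=1}^{k} a_n\varphi(\lambda_n) = A(\lambda_k)\varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n)\bigl(\varphi(\lambda_{n+1}) - \varphi(\lambda_n)\bigr). \tag{37} \]
If $\varphi$ has a continuous derivative in $(0,\infty)$, and $x \ge \lambda_1$, then (37) can be
written as
\[ \sum_{\lambda_n \le x} a_n\varphi(\lambda_n) = A(x)\varphi(x) - \int_{\lambda_1}^{x} A(t)\varphi'(t)\,dt. \tag{38} \]
If, in addition, $A(x)\varphi(x) \to 0$ as $x \to \infty$, then
\[ \sum_{n=1}^{\infty} a_n\varphi(\lambda_n) = -\int_{\lambda_1}^{\infty} A(t)\varphi'(t)\,dt, \tag{39} \]
provided that either side is convergent.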
PROOF. If we define $A(\lambda_0) = 0$, then we have
\[ \sum_{n=1}^{k} a_n\varphi(\lambda_n) = \sum_{n=1}^{k} \bigl(A(\lambda_n) - A(\lambda_{n-1})\bigr)\varphi(\lambda_n) = A(\lambda_k)\varphi(\lambda_k) - \sum_{n=1}^{k-1} A(\lambda_n)\bigl(\varphi(\lambda_{n+1}) - \varphi(\lambda_n)\bigr), \]
which proves (37). To prove (38), let k be the largest integer, such that
$\lambda_k \le x$. Then, since $\varphi$ has a continuous derivative $\varphi'$, the sum on the
right-hand side of (37) equals
\[ \sum_{n=1}^{k-1} A(\lambda_n) \int_{\lambda_n}^{\lambda_{n+1}} \varphi'(t)\,dt, \]
while the first term on the right-hand side of (37) equals
\[ A(\lambda_k)\varphi(\lambda_k) = A(x)\varphi(x) - \int_{\lambda_k}^{x} A(t)\varphi'(t)\,dt, \]
since A(t) is a step function which is constant in the interval $\lambda_k \le t < \lambda_{k+1}$.
Thus (38) follows from (37), and (39) from (38) if we let $x \to \infty$. This
completes the proof of Theorem 6.
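Identity (37) is a purely algebraic rearrangement, so it can be confirmed in exact rational arithmetic. The sketch below assumes strictly increasing $\lambda_n$ (so that $A(\lambda_n) = a_1 + \cdots + a_n$); the sample data and the function name are assumptions made for the illustration.

```python
from fractions import Fraction

def abel_both_sides(a, lam, phi):
    """Return (left side, right side) of (37) for k = len(a), with lam strictly increasing."""
    k = len(a)
    partial, running = [], 0
    for term in a:                       # partial[n] = a_1 + ... + a_(n+1) = A(lam[n])
        running += term
        partial.append(running)
    lhs = sum(a[n] * phi(lam[n]) for n in range(k))
    rhs = partial[k - 1] * phi(lam[k - 1]) - sum(
        partial[n] * (phi(lam[n + 1]) - phi(lam[n])) for n in range(k - 1))
    return lhs, rhs

lam = [Fraction(n) for n in range(1, 21)]            # lambda_n = n
a = [Fraction((-1) ** n * n) for n in range(1, 21)]  # arbitrary sample coefficients
lhs, rhs = abel_both_sides(a, lam, lambda x: 1 / (x * x))
print(lhs == rhs)
```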
If we set $\lambda_n = n$, $a_n = \Lambda(n)$, and $\varphi(x) = x^{-s}$ (s real, $s > 1$), then
$A(x) = \psi(x)$, and $A(x)\varphi(x) \to 0$ as $x \to \infty$, since $\psi(x) \le \pi(x)\log x < x\log x$
(cf. Proof of Theorem 2), so that $A(x)\varphi(x) = O(x^{1-s}\log x) = o(1)$. Thus
from (36) and (39) we obtain
\[ -\frac{\zeta'(s)}{\zeta(s)} = s\int_{1}^{\infty} \frac{\psi(x)}{x^{s+1}}\,dx \qquad (s \text{ real},\ s > 1). \tag{40} \]
We are now in a position to prove
THEOREM 7.
\[ \liminf_{x\to\infty} \frac{\pi(x)}{x/\log x} \le 1 \le \limsup_{x\to\infty} \frac{\pi(x)}{x/\log x}. \]
PROOF. We shall prove that
\[ \liminf_{x\to\infty} \frac{\psi(x)}{x} \le 1 \le \limsup_{x\to\infty} \frac{\psi(x)}{x}, \]
and apply Theorem 2.
Let $f(s) = -\zeta'(s)/\zeta(s)$, for every real $s > 1$, and let
\[ l = \liminf_{x\to\infty} \frac{\psi(x)}{x}, \qquad L = \limsup_{x\to\infty} \frac{\psi(x)}{x}, \]
\[ l' = \liminf_{s\to 1+0} (s-1)f(s), \qquad L' = \limsup_{s\to 1+0} (s-1)f(s). \]
Obviously we have $l \le L$, and $l' \le L'$. We shall first show that
$l \le l' \le L' \le L$, and then that $l' = L' = 1$. Together they give Theorem 7.
If $B > L$, then $\psi(x)/x < B$ for $x \ge x_0 = x_0(B)$, and we may assume
that $x_0 > 1$. From (40) we have, for $s > 1$,
\[ f(s) = s\int_{1}^{\infty} \frac{\psi(x)}{x^{s+1}}\,dx < s\int_{1}^{x_0} \frac{\psi(x)}{x^{s+1}}\,dx + s\int_{x_0}^{\infty} \frac{B}{x^{s}}\,dx, \]
so that
\[ f(s) < s\int_{1}^{x_0} \frac{\psi(x)}{x^{s+1}}\,dx + s\int_{1}^{\infty} \frac{B}{x^{s}}\,dx < s\int_{1}^{x_0} \frac{\psi(x)}{x^{2}}\,dx + \frac{sB}{s-1}, \]
which can be written in the form
\[ (s-1)f(s) < s(s-1)K + sB, \]
where
\[ K = \int_{1}^{x_0} \frac{\psi(x)}{x^{2}}\,dx. \]
If $s \to 1+0$, we obtain $L' \le B$. Since this holds for every $B > L$, we
must have $L' \le L$. Similarly we prove that $l \le l'$, so that $l \le l' \le L' \le L$.
To show that $l' = L' = 1$, we shall show that
\[ \lim_{s\to 1+0} -(s-1)^2\zeta'(s) = 1, \]
and
\[ \lim_{s\to 1+0} (s-1)\zeta(s) = 1. \]
Together they imply that $(s-1)f(s) \to 1$, as $s \to 1+0$.
For $s > 1$, the function $x^{-s}$ is a decreasing function of x, so that
\[ \int_{1}^{\infty} \frac{dx}{x^s} < \sum_{n=1}^{\infty} \frac{1}{n^s} < 1 + \int_{1}^{\infty} \frac{dx}{x^s}, \]
that is
\[ \frac{1}{s-1} < \zeta(s) < \frac{s}{s-1}, \]
which implies that $(s-1)\zeta(s) \to 1$ as $s \to 1+0$.
On the other hand, for $s > 1$, and $x \ge e$, the function $x^{-s}\log x$ is
decreasing, so that
\[ -\zeta'(s) = \sum_{n=1}^{\infty} \frac{\log n}{n^s} = \int_{1}^{\infty} \frac{\log x}{x^s}\,dx + O(1), \]
and on substituting $x^{s-1} = e^{y}$, we get
\[ -\zeta'(s) = \frac{1}{(s-1)^2}\int_{0}^{\infty} y e^{-y}\,dy + O(1) = \frac{1}{(s-1)^2} + O(1). \]
Thus
\[ (s-1)f(s) = -\frac{(s-1)^2\zeta'(s)}{(s-1)\zeta(s)} \to 1, \quad\text{as } s \to 1+0. \]
Hence $l' = L' = 1$, which implies that $l \le 1 \le L$. Taken together with
Theorem 2, this proves Theorem 7.
It follows that if $\dfrac{\pi(x)}{x/\log x}$ tends to a limit as $x \to \infty$, then that limit
must be equal to 1.
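§ 5. Some formulae of Mertens.
THEOREM 8. As $x \to \infty$ we have
\[ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1); \qquad \sum_{p \le x} \frac{\log p}{p} = \log x + O(1), \tag{41} \]
\[ \int_{1}^{x} \frac{\psi(t)}{t^2}\,dt = \log x + O(1), \tag{42} \]
\[ \sum_{p \le x} \frac{1}{p} = \log\log x + C + O\!\left(\frac{1}{\log x}\right), \tag{43} \]
where C is a constant.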
PROOF. We use a weak form of Stirling's formula, namely
\[ \log(m!) = m\log m + O(m), \quad\text{as } m \to \infty. \tag{44} \]
We know from Theorems 2 and 3 that
\[ \psi(m) = O(m), \quad\text{as } m \to \infty. \tag{45} \]
By the Lemma proved in the course of Theorem 3, we have
\[ m! = \prod_{p \le m} p^{\left[\frac{m}{p}\right] + \left[\frac{m}{p^2}\right] + \cdots}, \]
or
\[ \log(m!) = \sum_{p^r \le m} \left[\frac{m}{p^r}\right]\log p = \sum_{n \le m} \left[\frac{m}{n}\right]\Lambda(n), \tag{46} \]
where $\Lambda$ is the von Mangoldt function [cf. (3)].
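To prove (41), we put
\[ \frac{m}{n} = \left[\frac{m}{n}\right] + \varepsilon_n, \]
where $0 \le \varepsilon_n < 1$, in (46), so that
\[ \log(m!) = \sum_{n \le m} \frac{m}{n}\Lambda(n) + O(m), \]
on using (45). If we divide by m, and apply (44), we get
\[ \sum_{n \le m} \frac{\Lambda(n)}{n} = \log m + O(1). \]
Replacing the integer m by the real variable x, we get the first formula
in (41). The second formula in (41) follows from the inequality
\[ \left|\sum_{n \le x} \frac{\Lambda(n)}{n} - \sum_{p \le x} \frac{\log p}{p}\right| \le \sum_{p \le x} \left(\frac{1}{p^2} + \frac{1}{p^3} + \cdots\right)\log p < \sum_{p} \frac{\log p}{p(p-1)} < \infty. \]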
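We can deduce (42) from (41) by using (45). For $\psi(t) = \sum_{n \le t} \Lambda(n)$,
and for $x \ge 1$, we have
\[ \int_{1}^{x} \frac{\psi(t)}{t^2}\,dt = \int_{1}^{x} \sum_{n \le t} \Lambda(n)\,\frac{dt}{t^2} = \sum_{n \le x} \Lambda(n)\int_{n}^{x} \frac{dt}{t^2} = \sum_{n \le x} \Lambda(n)\left(\frac{1}{n} - \frac{1}{x}\right) = \sum_{n \le x} \frac{\Lambda(n)}{n} - \frac{\psi(x)}{x}. \]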
Formula (43) can be proved by using (41) together with Abel's summation formula. Let $(p_n)$ be the sequence of primes in natural order, and
\[ A(x) = \sum_{p_n \le x} a_n, \quad\text{where } a_n = \frac{\log p_n}{p_n}, \]
and
\[ B(x) = \sum_{p_n \le x} b_n, \quad\text{where } b_n = \frac{1}{p_n}. \]
If $x \ge 2$, then, by Theorem 6, we have
\[ B(x) = \sum_{p_n \le x} \frac{a_n}{\log p_n} = \frac{A(x)}{\log x} + \int_{2}^{x} \frac{A(u)\,du}{u(\log u)^2}. \]
From the second formula in (41), we have $A(x) = \log x + E(x)$, where
$|E(x)| < K$, for all $x \ge 2$, K being a constant. Hence
\[ B(x) = 1 + \frac{E(x)}{\log x} + \int_{2}^{x} \frac{du}{u\log u} + \int_{2}^{x} \frac{E(u)\,du}{u(\log u)^2} \]
\[ \phantom{B(x)} = 1 + \frac{E(x)}{\log x} + (\log\log x - \log\log 2) + \int_{2}^{x} \frac{E(u)\,du}{u(\log u)^2}. \]
Since $|E(x)| < K$, the integral $\int_{2}^{\infty} \dfrac{E(u)\,du}{u(\log u)^2}$ converges, and
\[ B(x) = \log\log x + \left(1 - \log\log 2 + \int_{2}^{\infty} \frac{E(u)\,du}{u(\log u)^2}\right) + E^*(x), \]
where
\[ E^*(x) = \frac{E(x)}{\log x} - \int_{x}^{\infty} \frac{E(u)\,du}{u(\log u)^2}, \]
so that
\[ |E^*(x)| < \frac{2K}{\log x}, \quad\text{for } x \ge 2. \]
This proves (43).
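The formulae (41) and (43) can be observed numerically. The sketch below, with helper names that are assumptions of this illustration, tabulates $\sum_{p\le x} \log p/p - \log x$, which should remain bounded, and $\sum_{p\le x} 1/p - \log\log x$, which should settle towards the constant C.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n (assumes n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]

def mertens_differences(x):
    """Return (sum log p / p - log x, sum 1/p - log log x), sums over primes p <= x."""
    primes = primes_up_to(x)
    sum_log = sum(math.log(p) / p for p in primes)
    sum_rec = sum(1.0 / p for p in primes)
    return sum_log - math.log(x), sum_rec - math.log(math.log(x))

for x in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5):
    print(x, mertens_differences(x))
```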
Chapter VIII
Weyl's theorems on uniform distribution
and Kronecker's theorem
§ 1. Introduction. We have seen in Chapter III that to any given
irrational number $\xi$, there correspond infinitely many rational numbers
p/q, such that $|\xi - p/q| < 1/q^2$. From this follows Dirichlet's theorem
that corresponding to any given irrational number $\xi$, there exist infinitely many pairs of integers p and q, such that $q\xi$ differs from p by
as little as we please. For given $\varepsilon$, $0 < \varepsilon < 1$, we consider the integer
$1 + [1/\varepsilon]$. Since there exist infinitely many rationals p/q, such that
$|q\xi - p| < 1/q$, it follows that there exist infinitely many fractions p/q,
with denominator $q \ge 1 + [1/\varepsilon]$, for which we have $|q\xi - p| < 1/q < \varepsilon$.
Dirichlet's theorem can be generalized as follows. Given any irrational number $\theta$, an arbitrary real number $\alpha$, and positive real numbers N and $\varepsilon$, there exist integers n and p, such that
\[ n > N, \quad\text{and}\quad |n\theta - p - \alpha| < \varepsilon. \]
If $\alpha = 0$, this reduces to the above-mentioned theorem of DIRICHLET.
If $0 < \alpha < 1$, and $\varepsilon$ is an arbitrarily small positive number, it follows
that the fractional part of $n\theta$, namely $\{n\theta\} = n\theta - [n\theta]$, is arbitrarily
close to $\alpha$. In other words, the numbers $(\{n\theta\})$, n = 1,2,3, ..., are everywhere dense in the interval [0,1).
This generalization of Dirichlet's theorem is itself a special case of
a deeper result due to HERMANN WEYL on the uniform distribution of
numbers, which we shall prove in this chapter.
If we are concerned with the fractional parts of real numbers, it is
of advantage to introduce a new notion. Two real numbers $x_1$, $x_2$ are
said to be congruent modulo 1, if they differ by an integer. The relation
of being congruent modulo 1 is clearly an equivalence relation, which
partitions all real numbers into equivalence classes, the elements of each
equivalence class consisting of all real numbers with the same fractional
part. The map $x \to e^{2\pi i x}$ induces a one-one correspondence between
these equivalence classes and the points of the unit circle.
§ 2. Uniform distribution in the unit interval. Let S be a finite set of
real numbers $\alpha_1, \alpha_2, \ldots, \alpha_Q$ contained in the interval [0,1), that is
\[ 0 \le \alpha_j < 1, \qquad 1 \le j \le Q. \]
Given any pair of real numbers a, b, such that $0 \le a < b \le 1$, we define
an interval function $\varphi(a,b)$ by the requirement that $\varphi(a,b)$ equals the
number of $\alpha$'s which are contained in the interval [a, b), that is those
numbers $\alpha_j$ for which we have
\[ a \le \alpha_j < b, \qquad 1 \le j \le Q. \]
We define the discrepancy of the set S to be the number D, where
\[ D = \sup_{a,b} \left|\frac{\varphi(a,b)}{Q} - (b-a)\right|. \tag{1} \]
Clearly $0 < D \le 1$. If we denote the interval [a, b) by I, and its length
by |I|, and write $\varphi(I)$ for $\varphi(a,b)$, then (1) takes the form
\[ D = \sup_{I \subset [0,1)} \left|\frac{\varphi(I)}{Q} - |I|\right|. \tag{1$'$} \]
Given an infinite sequence of real numbers $\alpha_1, \alpha_2, \ldots$, in the interval
[0,1), we denote by $D_n$ the discrepancy of the first n terms of the sequence.
We say that the sequence $(\alpha_i)$ is uniformly distributed, if $D_n \to 0$ as
$n \to \infty$.
Let $\varphi_n(a,b) = \varphi_n(I)$ be the number of $\alpha_j$'s with $a \le \alpha_j < b$ and
$1 \le j \le n$. It follows from the definition that if the sequence $(\alpha_i)$ is uniformly distributed in [0,1), then clearly
\[ \frac{\varphi_n(a,b)}{n} \to (b-a), \tag{2} \]
as $n \to \infty$, for each pair of real numbers a, b, such that $0 \le a < b \le 1$.
But the converse is also true: if (2) holds for each such interval [a, b),
then the sequence $(\alpha_i)$ is uniformly distributed.
For the interval [0,1) can be split up into a finite number of subintervals $(I_k)$, say, each of length $\delta$, $0 < \delta < 1$. Now given any interval
[c,d), where $0 \le c < d \le 1$, let r denote the number of intervals $(I_k)$,
each of length $\delta$, which lie in the interior of [c,d). Their total length
is $r\delta$, and we have $r\delta > (d-c) - 2\delta$. If r' denotes the number of intervals $I_k$ which intersect [c,d), then $r'\delta < (d-c) + 2\delta$.
Since (2) holds for each interval [a,b), it holds, in particular, for an
interval $I_k$ of length $\delta$. Thus given $\varepsilon > 0$, there exists a number $N(\varepsilon)$,
such that
\[ \left|\frac{\varphi_n(I_k)}{n} - \delta\right| < \varepsilon \]
for all $n > N(\varepsilon)$, and all k. If we choose $\varepsilon = \delta^2$, we get
\[ (1-\delta)\delta \le \frac{\varphi_n(I_k)}{n} \le (1+\delta)\delta, \]
for all $n > N'(\delta)$, which implies that
\[ \bigl((d-c) - 2\delta\bigr)(1-\delta) \le \frac{\varphi_n(c,d)}{n} \le \bigl((d-c) + 2\delta\bigr)(1+\delta), \]
and since $d - c \le 1$, it follows that
\[ \left|\frac{\varphi_n(c,d)}{n} - (d-c)\right| \le 3\delta + 2\delta^2, \]
for $n > N'(\delta)$, for any interval $[c,d) \subset [0,1)$, with $\delta$ independent of the
interval. This implies that $D_n \to 0$ as $n \to \infty$. Thus we have proved
THEOREM 1. An infinite sequence of real numbers $(\alpha_i)$, i = 1,2, ..., such
that $0 \le \alpha_i < 1$, is uniformly distributed, if and only if
\[ \frac{\varphi_n(a,b)}{n} \to (b-a), \]
as $n \to \infty$, for each pair of real numbers a and b, such that $0 \le a < b \le 1$.
Here $\varphi_n(a,b)$ equals the number of $\alpha_j$, such that $a \le \alpha_j < b$, and $1 \le j \le n$.
We remark that a uniformly distributed sequence $(\alpha_i)$ is everywhere
dense in the unit interval [0,1).
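The discrepancy (1) of a finite set can be evaluated by a finite search. The sketch below rests on an observation that is not stated in the text and is therefore an assumption of this example: the supremum is approached by intervals whose end points are taken from the points themselves together with 0 and 1, so that it suffices to compare, for each such pair $x \le y$, the number of points in the closed interval [x, y] (an excess) and in the open interval (x, y) (a deficiency) with the length y - x.

```python
from bisect import bisect_left, bisect_right
import math

def discrepancy(points):
    """Discrepancy D of a finite set of points in [0,1), by a search over end points."""
    pts = sorted(points)
    q = len(pts)
    cands = sorted(set([0.0, 1.0] + pts))
    best = 0.0
    for i, x in enumerate(cands):
        for y in cands[i:]:
            closed = bisect_right(pts, y) - bisect_left(pts, x)          # points in [x, y]
            opened = max(bisect_left(pts, y) - bisect_right(pts, x), 0)  # points in (x, y)
            best = max(best, closed / q - (y - x), (y - x) - opened / q)
    return best

def fractional_parts(xi, n):
    """First n terms of the sequence ({h * xi}), h = 1, 2, ..., n."""
    return [(h * xi) % 1.0 for h in range(1, n + 1)]

print(discrepancy([0.0, 0.25, 0.5, 0.75]))
for n in (25, 100, 400):
    print(n, discrepancy(fractional_parts(math.sqrt(2), n)))
```

The search costs on the order of $Q^2\log Q$ operations, which is adequate for illustration but not for large sets.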
§ 3. Uniform distribution modulo 1. An infinite sequence of real
numbers $(\alpha_i)$, not necessarily contained in the unit interval, is said to
be uniformly distributed modulo 1, if the corresponding sequence of fractional parts $(\{\alpha_i\})$ is uniformly distributed in the sense already defined
in § 2. Thus, if $D_n$ is the discrepancy, as defined in § 2, of the first n terms
of the sequence $(\{\alpha_i\})$, then $D_n \to 0$ as $n \to \infty$. We shall see that this
condition has an alternative, but equivalent, formulation in terms of a
new notion of discrepancy modulo 1.
Given a set S of real numbers $\alpha_1, \alpha_2, \ldots, \alpha_Q$, let T denote the set of real
numbers $(\alpha_k + t)$, where $1 \le k \le Q$, and t runs through all integers.
Given any pair of real numbers a and b, such that $b \ge a$, let $\varphi^*(a,b)$
denote the number of elements of T, which are contained in the interval
[a,b). Then
\[ \varphi^*(a+t, b+t) = \varphi^*(a,b) \tag{3} \]
for any integer t. Further
\[ \varphi^*(a,b) = \varphi(a,b), \quad\text{if } 0 \le a < b \le 1, \tag{4} \]
where $\varphi(a,b)$ is defined, as in § 2, for $(\{\alpha_k\})$, $1 \le k \le Q$.
The discrepancy modulo 1 of the set S is defined to be $D^*$, where
\[ D^* = \sup_{0 \le b-a \le 1} \left|\frac{\varphi^*(a,b)}{Q} - (b-a)\right|. \tag{5} \]
Here a runs through all real numbers, but in view of (3), we may assume
that $0 \le a < 1$.
If D is the discrepancy of the fractional parts of the numbers in S,
we have trivially $D \le D^*$, because of (1), (4) and (5). On the other hand,
we also have $D^* \le 2D$, since any interval [a,b), where $0 \le a < 1$,
and $b - a \le 1$, is the disjoint union of at most two intervals each of
which is of the form [a', b'), where either $0 \le a' < b' \le 1$, or $1 \le a' < b' \le 2$.
Thus
\[ \varphi^*(a,b) = \sum \varphi^*(a',b'), \qquad b - a = \sum (b' - a'), \]
where the sum $\sum$ extends over at most two terms. Hence
\[ \left|\frac{\varphi^*(a,b)}{Q} - (b-a)\right| \le \sum \left|\frac{\varphi^*(a',b')}{Q} - (b'-a')\right| \le 2D, \]
because of (1), (3) and (4), and of the fact that there are at most two
terms in $\sum$. Therefore $D^* \le 2D$.
Thus, given a set S of real numbers $(\alpha_j)$, $1 \le j \le Q$, we have defined
first the discrepancy D of their fractional parts, and secondly $D^*$, their
discrepancy modulo 1, and the two are connected by the inequalities
\[ D \le D^* \le 2D. \tag{6} \]
If $(\alpha_j)$ is an infinite sequence of real numbers, not necessarily contained
in the unit interval, let $D_n$ denote the discrepancy of the first n terms of
the corresponding sequence of fractional parts $(\{\alpha_j\})$, while $D_n^*$ denotes
their discrepancy modulo 1. It follows from (6) that if $D_n \to 0$ as $n \to \infty$,
then $D_n^* \to 0$ as $n \to \infty$, and conversely. Thus we have proved
THEOREM 2. An infinite sequence of real numbers $(\alpha_j)$ is uniformly
distributed modulo 1, if and only if $D_n^* \to 0$ as $n \to \infty$, where $D_n^*$ is the
discrepancy modulo 1 of the first n terms of the sequence $(\alpha_j)$.
§ 4. Weyl's theorems.
THEOREM 3. If $(\alpha_j)$ is an infinite sequence of real numbers, such that
$0 \le \alpha_j < 1$, for j = 1,2, ..., a necessary and sufficient condition for $(\alpha_j)$
to be uniformly distributed is that
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} f(\alpha_h) = \int_{0}^{1} f(x)\,dx, \tag{7} \]
for every function f which is Riemann integrable in $0 \le x \le 1$.
PROOF. We may assume f to be real-valued, for otherwise we can
consider the real and imaginary parts separately.
The sufficiency of condition (7) for the sequence $(\alpha_j)$ to be uniformly
distributed is easy to prove. Given any interval [a, b), such that $0 \le a < b \le 1$, we take f to be the characteristic function of [a,b): $f(x) = 1$
if $a \le x < b$, while $f(x) = 0$ otherwise. Then
\[ \frac{1}{n}\sum_{h=1}^{n} f(\alpha_h) = \frac{\varphi_n(a,b)}{n}, \tag{8} \]
while $\int_{0}^{1} f(x)\,dx = b - a$. Condition (7) therefore implies that
\[ \lim_{n\to\infty} \frac{\varphi_n(a,b)}{n} = b - a, \tag{9} \]
which, by Theorem 1, implies that the sequence $(\alpha_j)$ is uniformly distributed.
Conversely, if $(\alpha_j)$ is uniformly distributed, then (9) holds, so that (7)
holds for the characteristic function f of any interval [a, b) contained
in [0,1], and because of linearity, (7) holds also for any step function in
[0,1]. If f is Riemann integrable in [0,1], then given $\varepsilon > 0$, one can
find two step functions $f_1$, $f_2$ such that $f_1 \le f \le f_2$, and
$\int_{0}^{1} (f_2(x) - f_1(x))\,dx < \varepsilon$. Since (7) holds for $f_1$, we have
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} f_1(\alpha_h) = \int_{0}^{1} f_1(x)\,dx > \int_{0}^{1} f(x)\,dx - \varepsilon, \]
so that, if n is sufficiently large,
\[ \frac{1}{n}\sum_{h=1}^{n} f_1(\alpha_h) > \int_{0}^{1} f(x)\,dx - 2\varepsilon. \]
Since $f \ge f_1$, it follows that
\[ \frac{1}{n}\sum_{h=1}^{n} f(\alpha_h) > \int_{0}^{1} f(x)\,dx - 2\varepsilon, \]
for sufficiently large n. Similarly we get
\[ \frac{1}{n}\sum_{h=1}^{n} f(\alpha_h) < \int_{0}^{1} f(x)\,dx + 2\varepsilon, \]
for sufficiently large n. Thus
\[ \left|\frac{1}{n}\sum_{h=1}^{n} f(\alpha_h) - \int_{0}^{1} f(x)\,dx\right| < 2\varepsilon, \]
for sufficiently large n, which proves (7) for every Riemann integrable
function in [0,1].
THEOREM 4. If $(\beta_j)$ is an infinite sequence of real numbers, not necessarily contained in the unit interval, a necessary and sufficient condition for
$(\beta_j)$ to be uniformly distributed modulo 1 is that
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i m\beta_h} = 0, \tag{10} \]
for every integer $m \ne 0$, where $i^2 = -1$.
PROOF. Let $(\beta_j)$ be uniformly distributed modulo 1, and let $\alpha_j$ denote
the fractional part of $\beta_j$. Then $(\alpha_j)$ is uniformly distributed in the unit
interval. If in Theorem 3 we take $f(x) = e^{2\pi i m x}$, where m is an integer,
and $m \ne 0$, it follows that
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i m\alpha_h} = \int_{0}^{1} e^{2\pi i m x}\,dx = 0, \]
which is the same as (10), since $\alpha_h$ differs from $\beta_h$ by an integer.
Conversely, if (10) holds for every integer $m \ne 0$, we have
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i m\alpha_h} = 0, \]
and we shall show that condition (7) is satisfied for every Riemann
integrable function in [0,1]. Obviously (7) holds for $f(x) = 1$, and it
holds, by our hypothesis, for $f(x) = e^{2\pi i m x}$, where m is an integer different
from zero. Hence it holds also for any trigonometric polynomial of the
form
\[ a_0 + (a_1\cos 2\pi x + b_1\sin 2\pi x) + \cdots + (a_m\cos 2\pi m x + b_m\sin 2\pi m x), \]
where the a's and b's are constants. Now any continuous periodic
function f, of period 1, can be approximated by a trigonometric polynomial of that kind. That is, given $\varepsilon > 0$, there exists a trigonometric
polynomial $f_\varepsilon$, such that
\[ |f - f_\varepsilon| < \varepsilon. \]
Set $f_1 = f_\varepsilon - \varepsilon$, and $f_2 = f_\varepsilon + \varepsilon$, so that $f_1 \le f \le f_2$, and $\int_{0}^{1} (f_2(x) - f_1(x))\,dx = 2\varepsilon$. As in the proof of Theorem 3, it follows that (7) holds for any
continuous periodic function of period 1. Confining attention to the
basic interval [0,1], for any step function f in [0,1] we can find
two continuous periodic functions $f_1$ and $f_2$, such that $f_1 \le f \le f_2$, and
$\int_{0}^{1} (f_2(x) - f_1(x))\,dx < \varepsilon$. Hence (7) holds for a step function f in [0,1],
which implies, as before, that it holds for any Riemann integrable
function in [0,1]. This completes the proof of Theorem 4.
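As an application of Theorem 4, we have
THEOREM 5. If $\xi$ is any irrational number, then the infinite sequence
$(n\xi)$, n = 1,2, ..., is uniformly distributed modulo 1.
PROOF. Let m be an integer different from zero. Set $m\xi = \eta$. We
wish to show that
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i h\eta} = 0. \]
As $\eta$ is real, but not integral, since $\xi$ is irrational, we have
\[ \left|\sum_{h=1}^{n} e^{2\pi i h\eta}\right| = \left|\frac{e^{2\pi i\eta}\,(1 - e^{2\pi i n\eta})}{1 - e^{2\pi i\eta}}\right| \le \frac{2}{|1 - e^{2\pi i\eta}|} = \frac{1}{|\sin\pi\eta|}, \]
so that
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i h\eta} = 0, \]
and Theorem 4 then gives the result.
COROLLARY. If $\xi$ is an irrational number, then the sequence of fractional parts $(\{n\xi\})$, n = 1,2,3, ..., is everywhere dense in the unit interval.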
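Weyl's criterion (10) for the sequence $(n\xi)$ can be watched numerically. The sketch below takes $\xi = \sqrt{2}$ and compares the mean of the exponential sum with the geometric-series bound $1/(n\,|\sin\pi m\xi|)$; the function name and the chosen values of m and n are assumptions of this illustration.

```python
import cmath
import math

def weyl_mean(xi, m, n):
    """(1/n) * sum over h = 1..n of exp(2 pi i m h xi)."""
    return sum(cmath.exp(2j * math.pi * m * h * xi) for h in range(1, n + 1)) / n

xi = math.sqrt(2)
n = 10_000
for m in range(1, 6):
    bound = 1.0 / (n * abs(math.sin(math.pi * m * xi)))
    print(m, abs(weyl_mean(xi, m, n)), bound)

# For a rational xi the means need not tend to zero: with xi = 1/2 and m = 2
# every term of the sum equals 1.
print(abs(weyl_mean(0.5, 2, 1000)))
```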
The concept of uniform distribution can be generalized to spaces
of dimension greater than one. Let $(P^{(j)})$ be an infinite sequence of
points in a p-dimensional Euclidean space, where $p \ge 1$, and let the
coordinates of the point $P^{(j)}$ be given by $(x_{j1}, x_{j2}, \ldots, x_{jp})$. Let $\alpha_{jr}$
denote the fractional part of $x_{jr}$, namely $\{x_{jr}\}$, so that $0 \le \alpha_{jr} < 1$, for
$1 \le r \le p$. If we denote by $\{P^{(j)}\}$ the vector of fractional
parts $(\{x_{j1}\}, \{x_{j2}\}, \ldots, \{x_{jp}\})$, then the point $\{P^{(j)}\}$ lies in the unit cube
defined by $0 \le x_j < 1$, $1 \le j \le p$. Let V denote a rectangle, that is the
cartesian product of p intervals, contained in the unit cube, and let
|V| denote its (Lebesgue) measure, which is the product of the lengths
of the corresponding intervals. We say that the infinite sequence $(P^{(j)})$
is uniformly distributed modulo 1, if and only if the corresponding sequence
$(\{P^{(j)}\})$ is uniformly distributed in the unit cube; that is, if and only if
\[ \lim_{n\to\infty} \frac{\varphi_n(V)}{n} = |V|, \]
for every rectangle V contained in the unit cube, where $\varphi_n(V)$ denotes
the number of points among the first n terms of the sequence $(\{P^{(j)}\})$
which are contained in V. As in the one-dimensional case, this is equivalent to the statement
\[ \sup_{V} \left|\frac{\varphi_n(V)}{n} - |V|\right| \to 0, \quad\text{as } n \to \infty. \]
THEOREM 5'. The sequence $(\{P^{(j)}\})$ is uniformly distributed in the unit
cube if and only if
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{h=1}^{n} e^{2\pi i\,[m_1\alpha_{h1} + m_2\alpha_{h2} + \cdots + m_p\alpha_{hp}]} = 0, \]
for every set of integers $(m_1, m_2, \ldots, m_p) \ne (0, 0, \ldots, 0)$.
The proof here runs along the same lines as in the case of one variable.
We have only to observe that a 'step function' can be approximated, for
example, by twice continuously differentiable functions, which have
uniformly convergent Fourier series.
The following generalization of Theorem 5 is a consequence.
THEOREM 6. If $\xi_1, \xi_2, \ldots, \xi_p$ are real numbers, such that $\xi_1, \xi_2, \ldots, \xi_p, 1$
are linearly independent over the integers (that is, there exists no linear
relation of the form $\sum_{j=1}^{p} l_j\xi_j = l$, where l and $l_j$ are integers, and $(l_1, l_2, \ldots, l_p, l) \ne (0, 0, \ldots, 0, 0)$), then the sequence $(n\xi)$, where $n\xi = (n\xi_1, n\xi_2, \ldots, n\xi_p)$,
n = 1,2, ..., is uniformly distributed modulo 1.
§ 5. Kronecker's theorem. Theorem 6 implies that the sequence
$(\{n\xi\})$, where $\{n\xi\} = (\{n\xi_1\}, \{n\xi_2\}, \ldots, \{n\xi_p\})$, is everywhere dense in
the unit cube. This is known as Kronecker's theorem, and is a generalization to higher dimensions of the theorem mentioned in § 1. We state
it as
THEOREM 7. If $\theta_1, \theta_2, \ldots, \theta_k, 1$ are real numbers linearly independent
over the integers, $\alpha_1, \alpha_2, \ldots, \alpha_k$ are arbitrary real numbers, and N and $\varepsilon$
are positive real numbers, then there exist integers n and $p_1, p_2, \ldots, p_k$,
such that
\[ n > N, \quad\text{and}\quad |n\theta_m - p_m - \alpha_m| < \varepsilon, \]
for m = 1,2, ..., k.
We shall give another version of this theorem, namely
THEOREM 8. If $\theta_1, \theta_2, \ldots, \theta_k$ are real numbers which are linearly
independent over the integers, $\alpha_1, \alpha_2, \ldots, \alpha_k$ are arbitrary real numbers,
and T and $\varepsilon$ are positive real numbers, then there exist a real number t,
and integers $p_1, p_2, \ldots, p_k$, such that
\[ t > T, \quad\text{and}\quad |t\theta_m - p_m - \alpha_m| < \varepsilon, \]
for m = 1,2, ..., k.
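We shall see that Theorem 7 is equivalent to Theorem 8. Let us first
assume Theorem 8, and show that Theorem 7 follows from it.
To prove Theorem 7 in the form given, it suffices to prove it with
$0 < \theta_m \le 1$ for $1 \le m \le k$. For if $1, \theta_1, \ldots, \theta_k$ are linearly independent
over the integers, so are $1, \theta_1', \ldots, \theta_k'$, where $\theta_j' = \theta_j - q_j$, and $(q_j)$ are
suitable integers; and the inequality $|n\theta_m' - p_m' - \alpha_m| < \varepsilon$, for an integer
$p_m'$, implies that $|n\theta_m - p_m - \alpha_m| < \varepsilon$, where $p_m = p_m' + nq_m$. Let us therefore
assume that $0 < \theta_m \le 1$ for $1 \le m \le k$, and $0 < \varepsilon < 1$, and that $\theta_1, \theta_2, \ldots, \theta_k, 1$
are linearly independent over the integers. Then by Theorem 8, with
k+1 instead of k, N+1 instead of T, and $\tfrac{1}{2}\varepsilon$ instead of $\varepsilon$, applied to
the set
\[ \theta_1, \theta_2, \ldots, \theta_k, 1; \qquad \alpha_1, \alpha_2, \ldots, \alpha_k, 0, \]
there exist integers $p_1, p_2, \ldots, p_{k+1}$, and a real t, with $t > N+1$, such that
\[ |t\theta_m - p_m - \alpha_m| < \tfrac{1}{2}\varepsilon, \quad m = 1, 2, \ldots, k, \]
and
\[ |t - p_{k+1}| < \tfrac{1}{2}\varepsilon. \]
It follows that $p_{k+1} > t - \tfrac{1}{2}\varepsilon > N$, since $t > N+1$, and $\varepsilon < 1$. And,
since $0 < \theta_m \le 1$, we have
\[ |p_{k+1}\theta_m - p_m - \alpha_m| \le |t\theta_m - p_m - \alpha_m| + |(p_{k+1} - t)\theta_m| \le |t\theta_m - p_m - \alpha_m| + |p_{k+1} - t| < \varepsilon, \]
for m = 1,2, ..., k. Thus Theorem 7 is proved with $n = p_{k+1}$.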
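Conversely, let us assume Theorem 7, and prove Theorem 8. If
k = 1, Theorem 8 is trivial, so that we assume k > 1. It is sufficient to
prove the theorem for $\theta_m > 0$, m = 1, ..., k. Let $\theta_1, \theta_2, \ldots, \theta_k$ be linearly
independent over the integers. Then the numbers
\[ \frac{\theta_1}{\theta_k}, \frac{\theta_2}{\theta_k}, \ldots, \frac{\theta_{k-1}}{\theta_k}, 1 \]
are also linearly independent. If we apply Theorem 7, with $N = T\theta_k$, to
the set
\[ \frac{\theta_1}{\theta_k}, \frac{\theta_2}{\theta_k}, \ldots, \frac{\theta_{k-1}}{\theta_k}; \qquad \alpha_1, \alpha_2, \ldots, \alpha_{k-1}, \]
it follows that there exist integers $p_1, p_2, \ldots, p_{k-1}$, and n with $n > N$,
such that
\[ \left|n\frac{\theta_m}{\theta_k} - p_m - \alpha_m\right| < \varepsilon, \quad m = 1, 2, \ldots, (k-1). \]
If we set $t = n/\theta_k$, then $t > T$, and
\[ |t\theta_m - p_m - \alpha_m| < \varepsilon, \quad m = 1, 2, \ldots, (k-1), \]
while trivially
\[ t\theta_k - n = 0, \]
so that we have the conclusion of Theorem 8 for the set
\[ \theta_1, \theta_2, \ldots, \theta_k; \qquad \alpha_1, \alpha_2, \ldots, \alpha_{k-1}, 0. \]
Similarly one can prove Theorem 8 for the set
\[ \theta_1, \theta_2, \ldots, \theta_k; \qquad 0, 0, \ldots, 0, \alpha_k. \]
These two conclusions together imply that Theorem 8 is valid for the
set
\[ \theta_1, \theta_2, \ldots, \theta_k; \qquad \alpha_1, \alpha_2, \ldots, \alpha_k; \]
for if the difference of $t\theta_m$ from $\alpha_m$ is nearly an integer, and the difference of $t'\theta_m$ from $\beta_m$ is nearly an integer, then the difference of $(t+t')\theta_m$
from $\alpha_m + \beta_m$ is nearly an integer. Thus the equivalence of Theorem 7
and Theorem 8 is proved.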
We shall now give a proof of Theorem 8 due to H. BOHR.

PROOF OF THEOREM 8. If $c$ is real, $T > 0$, and $i^2 = -1$, then we have
$$\lim_{T\to\infty} \frac1T \int_0^T e^{cit}\,dt = \begin{cases} 0, & \text{if } c \ne 0, \\ 1, & \text{if } c = 0. \end{cases}$$
Thus if $c_\nu$ is real, and
$$\chi(t) = \sum_{\nu=1}^{r} b_\nu e^{c_\nu i t}, \qquad c_m \ne c_n \ \text{ if } m \ne n, \tag{11}$$
then
$$b_\nu = \lim_{T\to\infty} \frac1T \int_0^T \chi(t)\, e^{-c_\nu i t}\,dt. \tag{12}$$
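The mean value used at the start of Bohr's proof can be checked directly, since for $c \ne 0$ the integral has the closed form $(e^{icT} - 1)/(icT)$, whose modulus is at most $2/(|c|T)$. The short Python sketch below evaluates this closed form; the function name `mean_exp` and the sample values of $c$ and $T$ are assumptions made for the illustration.

```python
import cmath

def mean_exp(c, T):
    """Exact value of (1/T) * integral_0^T exp(i*c*t) dt for real c and T > 0."""
    if c == 0:
        return 1.0 + 0j
    return (cmath.exp(1j * c * T) - 1) / (1j * c * T)

# The modulus decays like 1/T when c != 0, and the mean equals 1 when c = 0.
means = [abs(mean_exp(2.5, T)) for T in (10.0, 1e3, 1e5)]
mean_at_zero = mean_exp(0.0, 7.0)
```

Each entry of `means` is bounded by $2/(2.5\,T)$, which is the estimate behind the limit $0$ in the case $c \ne 0$.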
Let
$$F(t) = 1 + \sum_{m=1}^{k} e^{2\pi i (t\theta_m - \alpha_m)}, \tag{13}$$
where $t$ is real, and
$$\varphi(t) = |F(t)|.$$
Then obviously
$$0 \le \varphi(t) \le k + 1.$$
If Theorem 8 is true, then for a sufficiently large $t$, every number $t\theta_m - \alpha_m$ is nearly an integer, and $\varphi(t)$ is nearly $k+1$. For if $x_m = t\theta_m - \alpha_m$, and $\varepsilon > 0$ is given, there exists a $\delta$, such that if $p_m$ is an integer, and $|x_m - p_m| < \delta$, then $|e^{2\pi i x_m} - 1| < \varepsilon$.
Conversely, if $\varphi(t)$ is nearly $k+1$ for some large $t$, then every term in the sum (13) must be nearly 1, since no term can exceed 1 in absolute value, and Theorem 8 must be true. This can be seen as follows. If there exists an $\eta$, $0 < \eta < 1$, such that $\varphi(t) \ge k + 1 - \eta$, and $z = e^{2\pi i x_m} = x + iy$, say, then it follows that $|y| \le 2\eta^{1/2}$. For
$$k + 1 - \eta \le \varphi(t) \le (k - 1) + |1 + e^{2\pi i x_m}|,$$
or
$$2 \ge |1 + e^{2\pi i x_m}| \ge 2 - \eta, \qquad \text{for } m = 1, 2, \ldots, k.$$
And
$$|1 + z|^2 = (1 + x)^2 + y^2 = (1 + x)^2 + (1 - x^2) = 2 + 2x \ge (2 - \eta)^2 \ge 4 - 4\eta,$$
so that $1 \ge x \ge 1 - 2\eta$. Now
$$y^2 = 1 - x^2 = (1 - x)(1 + x) \le 2(1 - x) \le 4\eta,$$
which implies that $|y| \le 2\eta^{1/2}$. Therefore $|z - 1| < 4\eta^{1/2}$.
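Thus Theorem 8 will be proved if it is proved that
$$\limsup_{t\to\infty} \varphi(t) \ge k + 1. \tag{14}$$
Let
$$\psi(x_1, \ldots, x_k) = 1 + x_1 + \cdots + x_k,$$
and $p$ be a positive integer. Then
$$\psi^p = \sum_{\substack{n_1 + \cdots + n_k \le p \\ n_j \ge 0,\ j = 1, \ldots, k}} a_{n_1, \ldots, n_k}\, x_1^{n_1} \cdots x_k^{n_k}, \tag{15}$$
where the coefficients $a_{n_1, \ldots, n_k}$ have the following properties: (i) they are positive; (ii) their sum $\sum a_{n_1, \ldots, n_k} = \psi^p(1, 1, \ldots, 1) = (k+1)^p$; (iii) they are at most $(p+1)^k$ in number.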
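The three properties of the coefficients in (15) are easy to confirm for small $k$ and $p$. The following Python sketch is an illustration only: the function name `expansion_coefficients` and the sample values $k = 3$, $p = 4$ are assumptions. It computes the coefficients of $(1 + x_1 + \cdots + x_k)^p$ as multinomial coefficients.

```python
from itertools import product
from math import factorial

def expansion_coefficients(k, p):
    """Coefficients of (1 + x_1 + ... + x_k)**p, keyed by the exponent tuple (n_1, ..., n_k)."""
    coeffs = {}
    for ns in product(range(p + 1), repeat=k):
        total = sum(ns)
        if total <= p:
            # Multinomial coefficient p! / ((p - total)! * n_1! * ... * n_k!).
            denom = factorial(p - total)
            for n in ns:
                denom *= factorial(n)
            coeffs[ns] = factorial(p) // denom
    return coeffs

coeffs = expansion_coefficients(k=3, p=4)
```

For these values the coefficients are positive, their sum is $(k+1)^p = 256$, and their number is well below $(p+1)^k = 125$.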
We use this formalism to consider
$$F^p(t) = \Bigl\{ 1 + \sum_{m=1}^{k} e^{2\pi i (t\theta_m - \alpha_m)} \Bigr\}^p.$$
If we use (15) with $e^{2\pi i (t\theta_j - \alpha_j)}$ in place of $x_j$, we see that $F^p(t)$ is a sum of the form given in (11), with $2\pi(n_1\theta_1 + \cdots + n_k\theta_k)$ taking the place of $c_\nu$. Since the $\theta$'s are linearly independent, the $c_\nu$'s are all different. In place of the $b_\nu$ in (11), we have the $a_{n_1, \ldots, n_k}$ given in (15), multiplied by the factor $e^{-2\pi i (n_1\alpha_1 + \cdots + n_k\alpha_k)}$. Hence
$$|b_\nu| = a_{n_1, \ldots, n_k}. \tag{16}$$
Since $\varphi(t) \le k + 1$, to prove (14) it is sufficient to prove that
$$\limsup_{t\to\infty} \varphi(t) < k + 1 \tag{17}$$
is impossible. Now (17) implies that
$$|F(t)| = \varphi(t) \le \lambda < k + 1,$$
for sufficiently large $t$, hence
$$\limsup_{T\to\infty} \frac1T \int_0^T |F(t)|^p\,dt \le \lim_{T\to\infty} \frac1T \int_0^T \lambda^p\,dt = \lambda^p.$$
However, by (12),
$$b_\nu = \lim_{T\to\infty} \frac1T \int_0^T F^p(t)\, e^{-c_\nu i t}\,dt,$$
hence
$$|b_\nu| \le \limsup_{T\to\infty} \frac1T \int_0^T |F(t)|^p\,dt \le \lambda^p,$$
so that every coefficient in (15) satisfies the inequality
$$a_{n_1, \ldots, n_k} \le \lambda^p.$$
Since there are at most $(p+1)^k$ such coefficients, we have
$$(k+1)^p = \sum a_{n_1, \ldots, n_k} \le (p+1)^k \lambda^p.$$
Since $\mu = \lambda/(k+1) < 1$, and $\mu^p (p+1)^k \to 0$, as $p \to \infty$, it follows that (17) is impossible, so that (14) is established, and with it the theorem.
Chapter IX
Minkowski's theorem on lattice points in convex sets
§ 1. Convex sets. We have encountered in Chapter VI problems connected with the number of lattice points in certain regions of the plane. If $R^n$ denotes the Euclidean space of dimension $n$, $n \ge 1$, we call a point in it a lattice point if all its co-ordinates are integers. In this chapter we shall prove Minkowski's theorem that a convex set in $R^n$, symmetric about the origin, whose volume is greater than $2^n$, contains a lattice point other than the origin.

DEFINITIONS. Let $S$ be a set in $R^n$. If $\lambda$ is a real number, we denote by $\lambda S$ the set obtained by magnifying $S$ by the factor $\lambda$, that is
$$\lambda S = [\lambda x \mid x \in S].$$
We say that $S$ is convex, if and only if $x \in S$ and $y \in S$ imply that $\lambda x + \mu y \in S$, for all real numbers $\lambda$, $\mu$, such that $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$. If $S$ is convex, so is $\lambda S$.

We say that $S$ is symmetric with respect to the origin, or just symmetric, if and only if $x \in S$ implies that $-x \in S$. If $S$ is symmetric, so is $\lambda S$.

If $g$ is a lattice point in $R^n$, the set $S_g$, called the translate of $S$ by $g$, is defined by the property that $x \in S_g$ if and only if $x - g \in S$.

If $S$ is a Lebesgue measurable set of measure $V(S)$, then $V(S) = V(S_g)$, for any lattice point $g$.
CONVEX SYMMETRIC SETS. (a) If $S$ is convex and symmetric, and $x \in S$, then $\lambda x \in S$, for every real $\lambda$, such that $|\lambda| \le 1$.

For if $x \in S$, then $-x \in S$ because $S$ is symmetric, and
$$\Bigl(\frac12 + \frac\lambda2\Bigr) x + \Bigl(\frac12 - \frac\lambda2\Bigr)(-x) = \lambda x \in S,$$
if $|\lambda| \le 1$, because $S$ is convex.

(b) If $S$ is convex and symmetric, and $x \in S$, $y \in S$, then $\lambda x + \mu y \in S$, for all real $\lambda$ and $\mu$, such that $|\lambda| + |\mu| \le 1$.

If $\lambda = 0$, or $\mu = 0$, this reduces to property (a). Let us therefore assume that $\lambda \ne 0$, and $\mu \ne 0$, and define $\varepsilon_1 = \operatorname{sgn}\lambda$, $\varepsilon_2 = \operatorname{sgn}\mu$. Then, because of property (a), and of the assumption $|\lambda| + |\mu| \le 1$, we have $x' = \varepsilon_1(|\lambda| + |\mu|)x \in S$, $y' = \varepsilon_2(|\lambda| + |\mu|)y \in S$. If we define
$$\rho = \frac{|\lambda|}{|\lambda| + |\mu|}, \qquad \sigma = \frac{|\mu|}{|\lambda| + |\mu|},$$
then $\rho > 0$, $\sigma > 0$, and $\rho + \sigma = 1$. Since $S$ is convex, it follows that $\rho x' + \sigma y' \in S$. But $\rho x' + \sigma y' = \lambda x + \mu y$. Hence we have property (b).
§ 2. Minkowski's theorem.

THEOREM 1 (MINKOWSKI). A bounded, measurable, convex, symmetric set $S$ in $R^n$, of measure $V > 2^n$, contains a lattice point different from the origin.

We shall give a proof of this theorem, due to C. L. SIEGEL, which is based on a formula for the measure of a bounded, measurable, convex, symmetric set which does not contain a lattice point different from the origin. The assumption of boundedness in Theorem 1 is not necessary (cf. Theorem 3, and the Notes on Chapter IX).

PROOF OF THEOREM 1 (SIEGEL). Let $S$ be a bounded, measurable, convex, symmetric set in $R^n$ of measure $V$, and let $L^2(S)$ denote the set of square-integrable functions on $S$. Let $\varphi \in L^2(S)$, and define $\varphi(x) = 0$ for $x \notin S$.

We write, as usual, $k = (k_1, k_2, \ldots, k_n)$, $x = (x_1, x_2, \ldots, x_n)$, $kx = k_1x_1 + k_2x_2 + \cdots + k_nx_n$, and $dx = dx_1\,dx_2 \cdots dx_n$.

Consider the function
$$f(x) = \sum_k \varphi(2x - 2k), \tag{1}$$
where $k$ runs through all the lattice points in $R^n$. For any given $x$, this sum is finite, since $\varphi$ vanishes outside $S$, and $S$ is bounded. Since $k$ runs through all lattice points, the sum remains unaltered by the substitution $k_\nu \to k_\nu + 1$. Thus $f(x)$ is periodic in each of the variables $x_1, x_2, \ldots, x_n$, with period 1.
Parseval's formula for the Fourier series of $f$ gives
$$\int_E |f|^2\,dx = \sum_l |a_l|^2, \tag{2}$$
where $E$ is an $n$-dimensional cube of side 1, $l$ a lattice point in $R^n$, and $a_l$ is the Fourier coefficient of $f$, namely
$$a_l = \int_E f(x)\, e^{-2\pi i l x}\,dx.$$
Because of (1), this implies that
$$a_l = \int_E \sum_k \varphi(2x - 2k)\, e^{-2\pi i l x}\,dx = \sum_k \int_E \varphi(2x - 2k)\, e^{-2\pi i l x}\,dx, \tag{3}$$
where $k$ runs through all the lattice points in $R^n$. Set $x - k = t$. Then as $x$ ranges over $E$, and $k$ over all lattice points, $t$ ranges over all of $R^n$. Thus, since $e^{-2\pi i l k} = 1$,
$$a_l = \int_{R^n} \varphi(2t)\, e^{-2\pi i l t}\,dt.$$
If we now write $2t = x$, then since $\varphi$ vanishes outside $S$, we get
$$a_l = 2^{-n} \int_S \varphi(x)\, e^{-\pi i l x}\,dx. \tag{4}$$
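On the other hand, we get from (1),
$$\int_E |f|^2\,dx = \int_E \sum_k \Bigl( \sum_{k'} \varphi(2x - 2k)\, \overline{\varphi(2x - 2k')} \Bigr) dx = \int_{R^n} \sum_k \varphi(2x - 2k)\, \overline{\varphi(2x)}\,dx$$
$$= 2^{-n} \int_{R^n} \sum_k \varphi(x - 2k)\, \overline{\varphi(x)}\,dx = 2^{-n} \sum_k \int_S \varphi(x - 2k)\, \overline{\varphi(x)}\,dx. \tag{5}$$
If we use (4) and (5) in (2), we get
$$\sum_k \int_S \varphi(x - 2k)\, \overline{\varphi(x)}\,dx = 2^{-n} \sum_l \Bigl| \int_S \varphi(x)\, e^{-\pi i l x}\,dx \Bigr|^2. \tag{6}$$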
Now if $\varphi(x - 2k)\,\overline{\varphi(x)} \ne 0$, then we have $x \in S$, and $x - 2k \in S$. And because $S$ is symmetric and convex, it follows that $\tfrac12 x + \tfrac12(2k - x) = k \in S$. Therefore, if $S$ contains no lattice point different from the origin, we must have $\varphi(x - 2k)\,\overline{\varphi(x)} = 0$ for $k \ne 0$, in which case (6) reduces to
$$\int_S |\varphi(x)|^2\,dx = 2^{-n} \sum_l \Bigl| \int_S \varphi(x)\, e^{-\pi i l x}\,dx \Bigr|^2. \tag{7}$$
If we now choose $\varphi$, such that $\varphi(x) = 1$ for $x \in S$, then $\int_S |\varphi(x)|^2\,dx = V$, and (7) gives
$$V = 2^{-n} \sum_l \Bigl| \int_S e^{-\pi i l x}\,dx \Bigr|^2 = 2^{-n} \Bigl( V^2 + \sum_{l \ne 0} \Bigl| \int_S e^{-\pi i l x}\,dx \Bigr|^2 \Bigr).$$
Since $-l$ runs through all lattice points if $l$ does, we can write this in the form
$$2^n = V + \frac1V \sum_{l \ne 0} \Bigl| \int_S e^{\pi i l x}\,dx \Bigr|^2, \tag{8}$$
which is Siegel's formula for the measure $V$ of a bounded, measurable, convex, symmetric set $S$ in $R^n$, which contains no lattice point other than the origin. It follows that $V \le 2^n$, and Theorem 1 is an immediate consequence.
If we wanted only to prove Minkowski's theorem, and not formula (8), we could use Schwarz's inequality
$$\int_E |f|^2\,dx \ge |a_0|^2,$$
instead of Parseval's formula. We have
$$a_0 = 2^{-n} \int_S \varphi(x)\,dx = 2^{-n} V$$
by (4), and if $S$ contains no lattice point other than the origin, then by (5), we have
$$\int_E |f|^2\,dx = 2^{-n} \int_S |\varphi(x)|^2\,dx = 2^{-n} V,$$
hence $V \le 2^n$.
The conclusion of Theorem 1 is false for some bounded, measurable, convex, symmetric sets of measure $V = 2^n$, as can be seen by considering the set: $|x_i| < 1$, $1 \le i \le n$. This has measure $V = 2^n$, but contains no lattice point other than the origin.
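The boundary case can be made concrete by enumeration. The Python sketch below, with a function name and sample dimension $n = 3$ chosen here purely for illustration, lists the nonzero lattice points in the open cube $|x_i| < 1$ and in the slightly dilated cube $|x_i| < 1.01$, whose measure exceeds $2^n$.

```python
from itertools import product

def nonzero_lattice_points_in_open_cube(half_side, n):
    """Nonzero integer points x in R^n with |x_i| < half_side for every i."""
    bound = int(half_side) + 1
    return [x for x in product(range(-bound, bound + 1), repeat=n)
            if any(x) and all(abs(c) < half_side for c in x)]

inside_unit = nonzero_lattice_points_in_open_cube(1.0, 3)      # measure exactly 2^3
inside_dilated = nonzero_lattice_points_in_open_cube(1.01, 3)  # measure greater than 2^3
```

The first list is empty, while the second contains every nonzero point with co-ordinates in $\{-1, 0, 1\}$, in agreement with Theorem 1.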
If $S$ is closed, however, we have

THEOREM 2. A closed, bounded, convex, symmetric set $S$ in $R^n$, of measure $V(S) \ge 2^n$, contains a lattice point other than the origin.

PROOF. Given $\varepsilon$, $0 < \varepsilon < 1$, consider the set $S' = (1 + \varepsilon)S$. Since $S$ is measurable, so is $S'$, and if $V(S)$ and $V(S')$ denote the respective measures, then
$$V(S') = (1 + \varepsilon)^n V(S) \ge 2^n (1 + \varepsilon)^n > 2^n.$$
Therefore, by Theorem 1, $S'$ contains a lattice point $l_\varepsilon$ other than the origin. Since $S$ is bounded, so is $S'$, and there are only a finite number of possibilities for $l_\varepsilon$. Therefore there exists a lattice point $l_0$, other than the origin, such that $l_0 \in (1 + \varepsilon)S$ for every $\varepsilon$, $0 < \varepsilon < 1$. That is $l_0/(1 + \varepsilon) \in S$. If $\varepsilon \to 0$, it follows that $l_0 \in S$, since $S$ is closed, and the proof of Theorem 2 is complete.

Theorem 2 implies the following

THEOREM 2'. If $S$ is a bounded, convex, symmetric set of measure $V(S) \ge 2^n$, then there exists a lattice point, other than the origin, in the closure of $S$.

PROOF. Given a bounded, convex, symmetric set $S$, consider $\bar S$, the closure of $S$. $\bar S$ is convex since $S$ is; it is closed; and bounded, since $S$ is; and $V(\bar S) \ge V(S) \ge 2^n$. Hence $\bar S$ is a set which satisfies the conditions of Theorem 2. Hence it contains a lattice point other than the origin.
To provide an alternative proof of Minkowski's theorem, we first prove the following

LEMMA (G. D. BIRKHOFF). If $S$ is a measurable set in $R^n$, of measure $V(S) > 1$, then there exist two distinct points $x \in S$ and $y \in S$, such that $x - y$ is a lattice point.

PROOF. Let $g = (g_1, g_2, \ldots, g_n)$ be any lattice point, and consider the cube $[x \mid g_i \le x_i < g_i + 1]$, $i = 1, 2, \ldots, n$. Let $S^g$ denote the intersection of $S$ with this cube:
$$S^g = S \cap [(x_1, \ldots, x_n) \in R^n,\ g_i \le x_i < g_i + 1,\ 1 \le i \le n].$$
Let $S^g_{-g}$ be the translate of $S^g$ by $-g$ (cf. § 1). Then $S^g_{-g}$ is contained in the unit cube $0 \le x_i < 1$, $1 \le i \le n$. Let its measure be $V_g$. Then $V_g$ is also the measure of $S^g$, and $\sum_g V_g = V > 1$. Since the unit cube has measure 1, it follows that there exist at least two sets $S^g_{-g}$ and $S^{g'}_{-g'}$, where $g$ and $g'$ are different lattice points, which overlap. In other words, there exist two points $x \in S^g$, $y \in S^{g'}$, such that $x - g = y - g'$. Therefore $x \in S$, $y \in S$, and $x - y = g - g'$, which is a lattice point. (It need not, of course, be in $S$.) Hence the lemma.
With this lemma we prove

THEOREM 3 (MINKOWSKI). If $S$ is a measurable, convex, symmetric set of measure $V > 2^n$ (possibly $V = \infty$), then it contains a lattice point other than the origin.

PROOF. Consider the set $\tfrac12 S$, whose measure is $(\tfrac12)^n V > 1$. By the above lemma, there exist two different points $x \in \tfrac12 S$, $y \in \tfrac12 S$, such that $x - y = g$, a lattice point. Now $\tfrac12 S$ is convex and symmetric, because $S$ is. It follows that $\tfrac12 x - \tfrac12 y = \tfrac12 g \in \tfrac12 S$, by property (b) of § 1; hence $g \in S$, and $g$ is not the origin since $x$ and $y$ are distinct. Thus the proof of the theorem is complete.
These theorems can be applied to homogeneous linear forms. Let
$$\xi_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n, \qquad i = 1, 2, \ldots, n, \tag{9}$$
be $n$ homogeneous linear forms in $n$ variables $x_1, \ldots, x_n$ with real coefficients $a_{ij}$. Let $\Delta$ be the determinant of the matrix $(a_{ij})$. We suppose, at first, that $\Delta \ne 0$.

These forms define a linear transformation of the $x$-space into the $\xi$-space, and if a set $S$ is convex and symmetric in the $x$-space, its image $T$ in the $\xi$-space is also convex and symmetric, since convexity and symmetry are unaffected by linear transformations. But the measure is altered, and if $\Delta \ne 0$, then
$$\int_T d\xi_1\,d\xi_2 \cdots d\xi_n = |\Delta| \int_S dx_1\,dx_2 \cdots dx_n, \tag{10}$$
so that the measure of $T$ is $|\Delta|$ times the measure of $S$.

Consider the linear transformation $L$ of $R^n$ into itself given by $(x_1, x_2, \ldots, x_n) \to (\xi_1, \ldots, \xi_n)$. The image of points with integer co-ordinates is called the lattice $\Lambda$ associated with $L$. The determinant of $L$ is called the determinant of the lattice $\Lambda$.

An application of Theorem 3 to the $\xi$-space gives

THEOREM 4. If $\Lambda$ is a lattice with determinant $\Delta \ne 0$, and $P$ a measurable, convex, symmetric set of measure $V > 2^n|\Delta|$ (possibly $V = \infty$), then $P$ contains a point of $\Lambda$ different from the origin.

An application of Theorem 2 leads to

THEOREM 4'. If $\Lambda$ is a lattice with determinant $\Delta \ne 0$, and $P$ is a closed, bounded, convex, symmetric set of measure $V \ge 2^n|\Delta|$, then $P$ contains a point of $\Lambda$ different from the origin.
§ 3. Applications. (A) Consider the closed set $S$ in the $x$-space defined by the inequalities
$$|\xi_i| \le c_i, \qquad i = 1, 2, \ldots, n. \tag{11}$$
$S$ is obviously symmetric. It is convex, for if $x \in S$, $y \in S$, and $z = \lambda x + \mu y$, where $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then
$$|a_{i1}z_1 + a_{i2}z_2 + \cdots + a_{in}z_n| \le \lambda|a_{i1}x_1 + \cdots + a_{in}x_n| + \mu|a_{i1}y_1 + \cdots + a_{in}y_n|$$
$$\le \max\bigl(|a_{i1}x_1 + \cdots + a_{in}x_n|,\ |a_{i1}y_1 + \cdots + a_{in}y_n|\bigr).$$
$S$ is bounded, for if $(\alpha_{ij})$ is the inverse matrix of $(a_{ij})$, then $\xi_i = \sum_{j=1}^n a_{ij}x_j$ implies that $x_i = \sum_{j=1}^n \alpha_{ij}\xi_j$, so that $|x_i| \le \sum_j |\alpha_{ij}|c_j$. By formula (10), the measure of $S$ is $2^n|\Delta|^{-1}c_1c_2 \cdots c_n$. The corresponding set in the $\xi$-space is a rectangle of measure $2^n c_1c_2 \cdots c_n$.

An application of Theorem 4' therefore gives

THEOREM 5. If $\xi_1, \xi_2, \ldots, \xi_n$ are homogeneous linear forms in the variables $x_1, x_2, \ldots, x_n$, with real coefficients, and determinant $\Delta \ne 0$, and if $c_1, c_2, \ldots, c_n$ are real numbers $> 0$, such that $c_1c_2 \cdots c_n \ge |\Delta|$, then there exist integers $x_1, x_2, \ldots, x_n$, not all zero, for which $|\xi_1| \le c_1$, $|\xi_2| \le c_2$, $\ldots$, $|\xi_n| \le c_n$.

We can, in particular, choose $c_i = |\Delta|^{1/n}$, $i = 1, 2, \ldots, n$, and have the same bound for all the $n$ inequalities in (11).
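Theorem 5 asserts existence only, but for small examples the integers can be found by exhaustive search. The following Python sketch is an illustration under assumptions made here: the forms $\xi_1 = x_1 + \sqrt2\,x_2$, $\xi_2 = x_1 - \sqrt2\,x_2$, the bounds $c_2 = 0.1$ and $c_1 = |\Delta|/c_2$, the search box, and the function name `small_values` are all hypothetical choices, not taken from the text.

```python
import math
from itertools import product

def small_values(a, c, box=60):
    """Nonzero integer vectors x with |x_j| <= box and |sum_j a[i][j]*x[j]| <= c[i] for all i."""
    n = len(a)
    hits = []
    for x in product(range(-box, box + 1), repeat=n):
        if any(x) and all(abs(sum(a[i][j] * x[j] for j in range(n))) <= c[i]
                          for i in range(n)):
            hits.append(x)
    return hits

r2 = math.sqrt(2)
a = [[1.0, r2], [1.0, -r2]]     # xi_1 = x_1 + sqrt(2) x_2, xi_2 = x_1 - sqrt(2) x_2
delta = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0])   # |determinant| = 2 sqrt(2)
c = [delta / 0.1, 0.1]          # c_1 * c_2 = |delta|, as Theorem 5 requires
hits = small_values(a, c)
```

With these bounds the pair $(x_1, x_2) = (7, 5)$ qualifies, since $7 - 5\sqrt2 \approx -0.071$ and $7 + 5\sqrt2 \approx 14.07$; the exhaustive search is only practical for small $n$ and small boxes.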
We have, so far, assumed that $\Delta \ne 0$. If $\Delta = 0$, then it is easily seen that the set $S$ in the $x$-space defined by (11) has infinite volume if $c_i > 0$ for every $i$, and the conclusion of Theorem 5 remains valid.

If, instead of (11), we consider fewer inequalities than the number of variables, namely
$$|\xi_i| \le c_i, \qquad i = 1, 2, \ldots, m, \quad m < n, \tag{12}$$
then the set which they define in the $x$-space cannot be bounded. But the conclusion of Theorem 5 holds good, because of Theorem 3. There exist integers $x_1, x_2, \ldots, x_n$, not all zero, which satisfy the $m$ inequalities in (12).

We note that the case $m < n$ is reduced to the former case $m = n$, $\Delta = 0$, by writing condition (12) for $i = m$ exactly $n - m + 1$ times.
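(B) As a second application, consider the set $T$ in the $\xi$-space defined by the inequality
$$|\xi_1| + |\xi_2| + \cdots + |\xi_n| \le c.$$
It is obviously symmetric. It is convex, for if $\xi = (\xi_1, \ldots, \xi_n) \in T$, $\xi' = (\xi_1', \xi_2', \ldots, \xi_n') \in T$, and $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then
$$\sum_{k=1}^n |\lambda\xi_k + \mu\xi_k'| \le \lambda \sum_{k=1}^n |\xi_k| + \mu \sum_{k=1}^n |\xi_k'| \le \max\Bigl( \sum_{k=1}^n |\xi_k|,\ \sum_{k=1}^n |\xi_k'| \Bigr).$$
If $n = 2$, $T$ is a square; if $n = 3$, $T$ is an octahedron. The volume of $T$ can be calculated as follows. $T$ consists of $2^n$ congruent parts, one in each octant, and that part which lies in the octant $\xi_1 > 0$, $\xi_2 > 0$, $\ldots$, $\xi_n > 0$, has the volume
$$\frac{c^n}{n!},$$
hence $T$ has volume $V = (2c)^n/n!$.

If $c^n \ge n!\,|\Delta|$, Theorem 4' gives

THEOREM 6. There exist integers $x_1, x_2, \ldots, x_n$, not all zero, such that
$$|\xi_1| + |\xi_2| + \cdots + |\xi_n| \le (n!\,|\Delta|)^{1/n}.$$
Since $|\xi_1\xi_2 \cdots \xi_n|^{1/n} \le \frac1n(|\xi_1| + \cdots + |\xi_n|)$, this implies

THEOREM 6'. There exist integers $x_1, x_2, \ldots, x_n$, not all zero, such that
$$|\xi_1\xi_2 \cdots \xi_n| \le \frac{n!\,|\Delta|}{n^n}.$$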
(C) As a third application, we consider the set $P$ in the $\xi$-space defined by the inequality
$$\xi_1^2 + \xi_2^2 + \cdots + \xi_n^2 \le c^2.$$
It is symmetric; and convex, for if $\xi \in P$, $\xi' \in P$, and $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then
$$\sum_{k=1}^n (\lambda\xi_k + \mu\xi_k')^2 = \lambda^2 \sum_{k=1}^n \xi_k^2 + 2\lambda\mu \sum_{k=1}^n \xi_k\xi_k' + \mu^2 \sum_{k=1}^n \xi_k'^2 \le (\lambda + \mu)^2 c^2 = c^2,$$
the middle sum being at most $c^2$ by Schwarz's inequality. Its volume is
$$c^n \int \cdots \int_{\sum \xi_k^2 \le 1} d\xi_1 \cdots d\xi_n = \frac{c^n \pi^{n/2}}{\Gamma(n/2 + 1)} = c^n s_n, \text{ say.}$$
Hence, if $c \ge 2(|\Delta|/s_n)^{1/n}$, we can apply Theorem 4' and get

THEOREM 7. There exist integers $x_1, x_2, \ldots, x_n$, not all zero, such that
$$\xi_1^2 + \xi_2^2 + \cdots + \xi_n^2 \le 4\Bigl(\frac{|\Delta|}{s_n}\Bigr)^{2/n}.$$
This theorem can be carried over to a general positive definite quadratic form
$$Q(x_1, \ldots, x_n) = \sum_{r,s=1}^n a_{rs}x_rx_s,$$
with real $a_{rs} = a_{sr}$. $Q$ is positive definite if and only if $Q(x_1, \ldots, x_n) > 0$ for all $x_1, x_2, \ldots, x_n$, other than $0, 0, \ldots, 0$. The determinant $D$ of the matrix $(a_{rs})$ is called the determinant of $Q$, and $D > 0$, if $Q$ is positive definite. Any positive definite form $Q$ can be expressed as
$$Q = \xi_1^2 + \xi_2^2 + \cdots + \xi_n^2,$$
where the $\xi_k$ are linear forms in $x_1, x_2, \ldots, x_n$, with real coefficients, and determinant $\sqrt D$. Theorem 7 can therefore be restated as

THEOREM 8. If $Q$ is a positive definite quadratic form in $n$ variables, with determinant $D$, then there exist integers $x_1, x_2, \ldots, x_n$, not all zero, such that
$$Q(x_1, \ldots, x_n) \le 4\Bigl(\frac{D}{s_n^2}\Bigr)^{1/n},$$
where $s_n = \pi^{n/2}/\Gamma(n/2 + 1)$.
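The constant $s_n$ and the bound of Theorem 8 are easy to evaluate numerically. The Python sketch below is an illustration only; the function names and the sample form $Q = x_1^2 + x_2^2$ with $D = 1$ are assumptions made here.

```python
import math

def unit_ball_volume(n):
    """s_n = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def quadratic_form_bound(n, D):
    """Right-hand side 4 * (D / s_n^2)^(1/n) of Theorem 8."""
    return 4.0 * (D / unit_ball_volume(n) ** 2) ** (1.0 / n)

bound_2 = quadratic_form_bound(2, 1.0)   # Q = x_1^2 + x_2^2, D = 1
```

For $n = 2$, $D = 1$ the bound is $4/\pi$, and the value $Q(1, 0) = 1$ lies below it, as the theorem requires.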
Chapter X
Dirichlet's theorem on primes in an arithmetical progression
§ 1. Introduction. We have seen by elementary arguments that there exist infinitely many primes, and that, in fact, each of the arithmetical progressions $4k + 1$ and $4k + 3$, where $k = 1, 2, 3, \ldots$, contains infinitely many primes (Chapter III, § 3). We shall now prove Dirichlet's theorem that there exist infinitely many primes in any arithmetical progression $a + mk$, where $a$ and $m$ are integers, $m > 0$, $(a, m) = 1$, and $k$ runs through all positive integers.

We proved in Chapter VII that the series $\sum_p 1/p$ diverges, where $p$ runs through all the primes. The proof can be reformulated as follows. For real $s > 1$, we have Euler's identity
$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1},$$
and, for $0 < x < 1$, we have
$$\log\Bigl(\frac{1}{1 - x}\Bigr) = \sum_{n=1}^\infty \frac{x^n}{n} < \sum_{n=1}^\infty x^n = \frac{x}{1 - x},$$
so that, for $0 < x \le \tfrac12$, we have
$$\log\Bigl(\frac{1}{1 - x}\Bigr) < 2x.$$
Thus for any prime $p$, and real $s > 1$, we get the inequality
$$\log\Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} < \frac{2}{p^s}.$$
Hence
$$\log\zeta(s) = \sum_p \log\Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} < 2\sum_p p^{-s}, \qquad s > 1.$$
If $\sum_p 1/p$ were convergent, we should have $2\sum_p p^{-s} < 2\sum_p 1/p$. We know, however, that $\zeta(1 + \varepsilon) \to \infty$, as $\varepsilon \to +0$. Hence $\sum_p 1/p$ must diverge.
Just as the divergence of $\sum_p 1/p$ is connected with the behaviour of $\zeta(s) = \sum_{n=1}^\infty n^{-s}$ $(s > 1)$, the divergence of the series $\sum_{p \equiv a \ (\mathrm{mod}\ m)} 1/p$, where $a$ and $m$ are integers, $m > 0$, $(a, m) = 1$, is connected with the behaviour of Dirichlet series of the form $\sum_{n=1}^\infty a_n/n^s$, where both $s$ and the coefficients $a_n$ are complex numbers. We prepare for a study of the connexion by considering the function $\zeta(s)$ for complex values of $s$.

Let $s = \sigma + it$, where $\sigma$ and $t$ are real, and $i^2 = -1$. Let us assume, to begin with, that $\sigma > 1$. For real, positive $x$, we set $x^s = e^{s\log x}$, where $\log x$ is the real natural logarithm of $x$. We then have
$$\sum_{n=1}^\infty \Bigl|\frac{1}{n^s}\Bigr| = \sum_{n=1}^\infty \frac{1}{n^\sigma},$$
so that the series $\sum_{n=1}^\infty 1/n^s$ converges absolutely for $\sigma > 1$, and uniformly in any half-plane $\sigma \ge 1 + \delta > 1$, where it defines a regular analytic function.

Because of the absolute convergence of the series for $\sigma > 1$, by Theorem 5 of Chapter VII, the identity
$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = \prod_p \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1}$$
remains valid for complex $s$ with real part $\sigma > 1$.

The absolute convergence of the product $\prod_p (1 - 1/p^s)^{-1}$ for $\sigma > 1$ follows from that of the series $\sum_p 1/p^s$. Thus in the half-plane $\sigma > 1$, $\zeta(s)$ can be represented by this absolutely convergent product of non-zero factors. Hence $\zeta(s) \ne 0$, for $\sigma > 1$.
The function $\zeta(s)$, defined for $\sigma > 1$, by the relation
$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s},$$
is analytic in the half-plane $\sigma > 0$, except for a simple pole, with residue 1, at the point $s = 1$. In order to prove this, we use Abel's summation formula given in Theorem 6 of Chapter VII, with $\lambda_n = n$, $\varphi(x) = x^{-s}$, and $a_n = 1$. Then $A(x) = [x]$, the integral part of $x$, and
$$\sum_{n \le x} \frac{1}{n^s} = \frac{[x]}{x^s} + s\int_1^x \frac{[u]}{u^{s+1}}\,du.$$
Now $[x]/x^s \to 0$, as $x \to \infty$, for $\sigma > 1$; and the series $\sum_{n=1}^\infty 1/n^s$ converges for $\sigma > 1$. If we write $[u] = u - \{u\}$, we get the representation
$$\sum_{n=1}^\infty \frac{1}{n^s} = s\int_1^\infty \frac{du}{u^s} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du = \frac{s}{s - 1} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du,$$
that is
$$\sum_{n=1}^\infty \frac{1}{n^s} = 1 + \frac{1}{s - 1} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du \qquad (\sigma > 1). \tag{1}$$
Obviously we have $0 \le \{u\} < 1$. The integral in (1) is therefore absolutely and uniformly convergent in every half-plane $\sigma \ge \delta > 0$, and represents a regular function of $s$ for $\sigma > 0$. Hence $\zeta(s)$ is meromorphic in $\sigma > 0$, with a simple pole at $s = 1$ with residue 1. It is called Riemann's zeta-function.
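Formula (1) can be tested numerically for a real value of $s$. On each interval $[n, n+1)$ we have $\{u\} = u - n$, so the integral of $\{u\}/u^{s+1}$ over that interval is elementary. The Python sketch below sums these pieces up to a finite cut-off; the function name, the cut-off `terms`, and the comparison value $\zeta(2) = \pi^2/6$ (a classical result not derived in this section) are assumptions of the illustration.

```python
import math

def zeta_from_formula_1(s, terms=20_000):
    """Approximate 1 + 1/(s-1) - s * integral_1^inf {u}/u^(s+1) du for real s > 0, s != 1.

    The integral is truncated at u = terms + 1, so the result is only an approximation.
    """
    integral = 0.0
    for n in range(1, terms + 1):
        # Integral of (u - n) / u^(s+1) over [n, n+1], split as u^(-s) minus n * u^(-s-1).
        part_u = ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
        part_n = (n / s) * ((n + 1) ** (-s) - n ** (-s))
        integral += part_u + part_n
    return 1 + 1 / (s - 1) - s * integral

approx = zeta_from_formula_1(2.0)
```

For $s = 2$ the value agrees with $\pi^2/6$ to well within the truncation error; the same function can be called with $0 < s < 1$, where the original series diverges but the right-hand side of (1) still makes sense.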
§ 2. Characters. A character $\chi$ of a finite, abelian group $G$ is a complex-valued function, not identically zero, defined on the group, such that if $A \in G$, $B \in G$, then
$$\chi(AB) = \chi(A)\chi(B).$$
If $E$ denotes the unit element of $G$, and $A^{-1}$ denotes the group inverse of $A \in G$, the characters of $G$ have the following properties.

(i) $\chi(A) \ne 0$, for every $A \in G$. For if $\chi(A) = 0$, then $\chi(A)\chi(A^{-1}) = \chi(AA^{-1}) = \chi(E) = 0$. That is, $\chi(C) = \chi(E)\chi(C) = 0$, for every $C \in G$, which contradicts the definition of $\chi$. We observe that $\chi(E) = 1$.

(ii) If $G$ is of order $h$, then $A^h = E$, for every $A \in G$. Hence $(\chi(A))^h = \chi(A^h) = \chi(E) = 1$. That is, $\chi(A)$ is an $h$-th root of unity. The character $\chi_1$, defined by the property $\chi_1(A) = 1$ for every $A \in G$, is called the principal character of $G$.

(iii) An abelian group of order $h$ has exactly $h$ characters.

We first prove this property for cyclic groups. A group $G$ is cyclic, if it consists of the powers $A, A^2, \ldots, A^r = E$, of a single element $A$, which is called a generator of $G$. The order $r$ of $G$ is the smallest positive integer $r$, such that $A^r = E$.

Let $\chi$ be a character of the cyclic group $G$. Then (a) $\chi$ is completely defined by the value $\chi(A)$, for $\chi(A^n) = (\chi(A))^n$; (b) $A^r = E$ implies that $(\chi(A))^r = 1$, that is, $\chi(A)$ is an $r$-th root of unity; (c) if $\rho$ is an $r$-th root of unity, then we can define a character $\chi$ by the relation $\chi(A) = \rho$ (that is, $\chi(A^n) = \rho^n$), for if $A^{a_1} \cdot A^{a_2} = A^{a_3}$, then $a_1 + a_2 \equiv a_3 \pmod r$, hence $\rho^{a_1} \cdot \rho^{a_2} = \rho^{a_3}$. Since there exist only $r$ different $r$-th roots of unity, it follows from (a) and (b) that there are at most $r$ different characters of $G$. On the other hand, (c) implies that there are at least $r$ characters. Hence a cyclic group of order $r$ has exactly $r$ characters.

In order to prove property (iii) for an arbitrary abelian group $G$, we use the following result: every finite (multiplicative) abelian group $G$ is a direct product of cyclic groups. Suppose that $G = G_1 \times \cdots \times G_k$, where $G_j$ is cyclic for $1 \le j \le k$. Let $r_j$ be the order of $G_j$, and $A_j$ a generator of $G_j$. The order of $G$ is then $h = r_1r_2 \cdots r_k$, and every $A \in G$ can be uniquely expressed as $A = A_1^{t_1}A_2^{t_2} \cdots A_k^{t_k}$, $0 \le t_j \le r_j - 1$, $j = 1, 2, \ldots, k$. If $\chi$ is a character of $G$, we then have
$$\chi(A) = (\chi(A_1))^{t_1}(\chi(A_2))^{t_2} \cdots (\chi(A_k))^{t_k}.$$
If $\rho_j$ is an $r_j$-th root of unity, then there exists one and only one character $\chi$ of $G$, such that $\chi(A_j) = \rho_j$, $j = 1, 2, \ldots, k$. Since $\rho_j$ can take exactly $r_j$ different values, $G$ has exactly $h$ different characters, where $h = r_1r_2 \cdots r_k$.
(iv) Let $G$ be a finite, multiplicative, abelian group of order $h$. It follows from property (i) that $\chi(E) = 1$ for every character $\chi$ of $G$. We shall now see that given any $A \in G$, $A \ne E$, there exists a character $\chi$, such that $\chi(A) \ne 1$.

We again use the representation of $G$ as a direct product of cyclic groups. As in (iii), let $A = A_1^{t_1}A_2^{t_2} \cdots A_k^{t_k}$. Since $A \ne E$, not all $t_j$ are zero. For example, let $t_1 \ne 0$. We take $\chi(A_2) = \chi(A_3) = \cdots = \chi(A_k) = 1$, and $\chi(A_1) = e^{2\pi i/r_1}$. Then $\chi(A) = e^{2\pi i t_1/r_1} \ne 1$, since $0 < t_1 < r_1$.

(v) The characters of a finite, multiplicative, abelian group $G$ again form a finite, multiplicative, abelian group $\hat G$.

By the 'product' $\chi'\chi''$ of two characters $\chi'$ and $\chi''$ of $G$ we mean the character $\chi$ defined by the property: $\chi(A) = \chi'(A)\chi''(A)$, $A \in G$. To see that $\chi'\chi''$ is, in fact, a character, we observe that
$$\chi(AB) = \chi'(AB)\chi''(AB) = \chi'(A)\chi'(B)\chi''(A)\chi''(B) = \chi(A)\chi(B).$$
The principal character $\chi_1$ of $G$ is the unit element of $\hat G$. The inverse character $\chi^{-1}$ of a character $\chi$ is defined by the requirement $\chi^{-1}(A) = \chi(A^{-1})$, so that $\chi^{-1}(A) = (\chi(A))^{-1}$. We see that $\chi^{-1}$ is, in fact, a character, for $\chi^{-1}(AB) = \chi((AB)^{-1}) = \chi(A^{-1})\chi(B^{-1}) = \chi^{-1}(A)\chi^{-1}(B)$.

The character $\chi$ considered in (iv) generates a cyclic subgroup of $\hat G$, of order $r_1$. Similarly there exist cyclic subgroups of orders $r_2, \ldots, r_k$. The argument used to show that $G$ has exactly $h$ distinct characters, where $h$ is the order of $G$, shows that $\hat G$ is the direct product of these cyclic subgroups of orders $r_1, r_2, \ldots, r_k$. Hence $G$ and $\hat G$ are isomorphic, such an isomorphism depending on the decomposition of $G$ into cyclic factors, which is not unique in general, and on the choice of generators for these cyclic factors.
§ 3. Sums of characters, orthogonality relations. Let $G$ be a finite, multiplicative, abelian group, of order $h$. Let us consider the sum
$$S = \sum_A \chi(A),$$
where $A$ runs through all elements of $G$, and the sum
$$T = \sum_\chi \chi(A),$$
where $\chi$ runs through all elements of the character group $\hat G$.

If $B$ is a fixed element of $G$, and $A$ runs through all elements of $G$, so does $AB$. Hence
$$S \cdot \chi(B) = \sum_A \chi(AB) = \sum_A \chi(A) = S,$$
which implies that $(\chi(B) - 1)S = 0$. Hence either $S = 0$, or $S \ne 0$ and $\chi(B) = 1$ for every $B \in G$, in which case $\chi = \chi_1$, the principal character, and the sum $S$ has the value $h$, the order of $G$. Hence
$$S = \sum_A \chi(A) = \begin{cases} h, & \text{if } \chi = \chi_1, \\ 0, & \text{if } \chi \ne \chi_1. \end{cases} \tag{2}$$
If we multiply the sum $T$ by $\chi'(A)$, where $\chi'$ is some character of $G$, then we similarly obtain
$$\chi'(A) \cdot T = \sum_\chi \chi(A)\chi'(A) = \sum_\chi \chi(A) = T.$$
Hence either $T = 0$, or $\chi'(A) = 1$ for every $\chi' \in \hat G$, in which case, because of (iv) of § 2, $A = E$ and $T = h$. Thus
$$T = \sum_\chi \chi(A) = \begin{cases} h, & \text{if } A = E, \\ 0, & \text{if } A \ne E. \end{cases} \tag{3}$$
Let $m$ be a positive integer. We know that the $\varphi(m)$ prime residue classes modulo $m$ form a multiplicative abelian group of order $h = \varphi(m)$ (Chapter II, § 1). We can therefore consider the characters of this group.
But the definition of character can be carried over from the prime residue classes modulo $m$ to the integers themselves, as follows. We define
$$\chi(a) = \chi(A), \qquad \text{if } a \in A,$$
where $A$ is a prime residue class modulo $m$. Then obviously $\chi(a) = \chi(b)$, if $a \equiv b \pmod m$; and $\chi(ab) = \chi(a)\chi(b)$, if $(a, m) = (b, m) = 1$. Since $\chi(A) \ne 0$ for every prime residue class $A$, it follows that $\chi(a) \ne 0$, if $(a, m) = 1$.

This definition applies only to integers $a$ which are prime to $m$. We can extend it to all integers by the requirement that
$$\chi(a) = 0, \qquad \text{if } (a, m) > 1.$$
A character modulo $m$ is therefore an arithmetical function $\chi$, with the properties:

$\chi(a) = \chi(b)$, if $a \equiv b \pmod m$,
$\chi(ab) = \chi(a)\chi(b)$, for all integers $a$ and $b$,
$\chi(a) = 0$, if $(a, m) > 1$,
$\chi(a) \ne 0$, if $(a, m) = 1$.

There exist $\varphi(m)$ characters modulo $m$, where $\varphi(m)$ is the number of positive integers, not exceeding $m$, which are prime to $m$. They form a (multiplicative) abelian group, which is isomorphic to the group of prime residue classes $(\mathrm{mod}\ m)$. The unit element of this group is the principal character $\chi_1$, which is such that $\chi_1(a) = 1$ if $(a, m) = 1$. Further we have the relations of orthogonality:
$$\sum_{n \ (\mathrm{mod}\ m)} \chi(n) = \begin{cases} \varphi(m), & \text{if } \chi = \chi_1, \\ 0, & \text{if } \chi \ne \chi_1, \end{cases} \tag{4}$$
$$\sum_\chi \chi(n) = \begin{cases} \varphi(m), & \text{if } n \equiv 1 \pmod m, \\ 0, & \text{if } n \not\equiv 1 \pmod m. \end{cases} \tag{5}$$
EXAMPLES. (I) Let $m = 4$. Then there are 2 prime residue classes, namely the class $E$ consisting of integers congruent to $1 \pmod 4$, and the class $A$ of integers congruent to $3 \pmod 4$. $A$ and $E$ form a cyclic group of order 2. There are two characters $\chi_1$ and $\chi_2$, where
$$\chi_1(E) = \chi_1(A) = 1,$$
the principal character, and
$$\chi_2(E) = 1, \qquad \chi_2(A) = -1.$$
By the definition of character, carried over to the integers, we have
$$\chi_1(n) = \begin{cases} 0, & \text{if } n \text{ is even}, \\ 1, & \text{if } n \text{ is odd}, \end{cases}$$
and
$$\chi_2(n) = \begin{cases} 0, & \text{if } n \text{ is even}, \\ 1, & \text{if } n \equiv 1 \pmod 4, \\ -1, & \text{if } n \equiv 3 \pmod 4. \end{cases}$$
Further we have
$$\chi_1(1) + \chi_1(3) = 2, \qquad \chi_2(1) + \chi_2(3) = 0,$$
$$\chi_1(1) + \chi_2(1) = 2, \qquad \chi_1(3) + \chi_2(3) = 0.$$
(II) Let $m = 5$. Then the prime residue classes are $E, A, A^2, A^3$, where $A$ is the class of all integers congruent to $2 \pmod 5$. $A^2$ is then the class of integers congruent to $4 \pmod 5$, and $A^3$ the class of integers congruent to $3 \pmod 5$. $E$ contains all integers congruent to $1 \pmod 5$. The four characters are as follows:

$\chi_1(E) = \chi_1(A) = \chi_1(A^2) = \chi_1(A^3) = 1$,
$\chi_2(E) = 1$, $\chi_2(A) = i$, $\chi_2(A^2) = -1$, $\chi_2(A^3) = -i$,
$\chi_3(E) = 1$, $\chi_3(A) = -1$, $\chi_3(A^2) = 1$, $\chi_3(A^3) = -1$,
$\chi_4(E) = 1$, $\chi_4(A) = -i$, $\chi_4(A^2) = -1$, $\chi_4(A^3) = i$.
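The characters modulo 5 of Example (II) and the orthogonality relations (4) and (5) can be reproduced in a few lines. The Python sketch below is an illustration under assumptions made here: the names `M`, `G`, `index`, and `chi` are hypothetical, the generator is taken to be the class of 2 as in the example, and the characters are indexed by $j = 0, 1, 2, 3$, corresponding to $\chi_1, \chi_2, \chi_3, \chi_4$.

```python
import cmath

M = 5
G = 2  # generator of the prime residue classes mod 5 (the class A of Example II)

# Discrete logarithm table: index[a] = t with G**t congruent to a (mod M).
index = {pow(G, t, M): t for t in range(M - 1)}

def chi(j, n):
    """Value at n of the j-th character mod 5 (j = 0 is the principal character)."""
    if n % M == 0:
        return 0j
    return cmath.exp(2j * cmath.pi * j * index[n % M] / (M - 1))

# Relation (4): sum over a complete residue system, one sum per character.
row_sums = [sum(chi(j, n) for n in range(1, M + 1)) for j in range(M - 1)]
# Relation (5): sum over all characters, one sum per residue n = 1, 2, 3, 4.
col_sums = [sum(chi(j, n) for j in range(M - 1)) for n in range(1, M)]
```

Up to rounding, `row_sums` is $(4, 0, 0, 0)$ and `col_sums` is $(4, 0, 0, 0)$, in agreement with (4) and (5) for $\varphi(5) = 4$.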
§ 4. Dirichlet series, Landau's theorem. A Dirichlet series is a series of the form $\sum_{n=1}^\infty a_n n^{-s}$, where $s$ is a complex number, and the coefficients $a_n$ are likewise complex numbers. More generally, a series of the form
$$\sum_{n=1}^\infty \frac{a_n}{\lambda_n^s}, \qquad \text{or} \qquad \sum_{n=1}^\infty a_n e^{-s\lambda_n},$$
where $0 < \lambda_1 < \lambda_2 < \cdots$, and $\lambda_n \to \infty$ as $n \to \infty$, is called a Dirichlet series. Many of the Dirichlet series which appear in the theory of numbers are of the type $\sum a_n n^{-s}$, and we shall consider some elementary properties of such series. We usually write $s = \sigma + it$, where $\sigma$ and $t$ are real, and $i^2 = -1$.

THEOREM 1. If the series $\sum_{n=1}^\infty a_n/n^s$ converges for $s = s_0$, it converges uniformly in the angular region defined by $|\arg(s - s_0)| \le \pi/2 - \theta < \pi/2$.
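PROOF. We may suppose, without loss of generality, that $s_0 = 0$, for
$$\sum_{n=1}^\infty \frac{a_n}{n^s} = \sum_{n=1}^\infty \frac{a_n}{n^{s_0}} \cdot \frac{1}{n^{s - s_0}} = \sum_{n=1}^\infty \frac{b_n}{n^{s - s_0}}, \qquad b_n = a_n \cdot n^{-s_0},$$
and the convergence of $\sum a_n n^{-s}$, for $s = s_0$, is equivalent to the convergence of $\sum_{n=1}^\infty b_n$.

Let $\sum_{n=1}^\infty a_n$ converge. Then $\lim_{n\to\infty} r_n = 0$, where $r_n = \sum_{\nu=n+1}^\infty a_\nu$. Let $M$ and $N$ be positive integers, such that $M < N$. Then
$$\sum_{n=M}^N \frac{a_n}{n^s} = \sum_{n=M}^N \frac{r_{n-1} - r_n}{n^s} = \sum_{n=M}^N r_n \Bigl( \frac{1}{(n+1)^s} - \frac{1}{n^s} \Bigr) + \frac{r_{M-1}}{M^s} - \frac{r_N}{(N+1)^s}.$$
If $\sigma > 0$, we have
$$\Bigl| \frac{1}{(n+1)^s} - \frac{1}{n^s} \Bigr| = \Bigl| s \int_n^{n+1} \frac{du}{u^{s+1}} \Bigr| \le |s| \int_n^{n+1} \frac{du}{u^{\sigma+1}} = \frac{|s|}{\sigma} \Bigl( \frac{1}{n^\sigma} - \frac{1}{(n+1)^\sigma} \Bigr).$$
Further, if $\varepsilon > 0$ is given, then $|r_n| < \varepsilon$, for $n \ge n_0(\varepsilon)$, where $n_0$ is independent of $s$. Hence, for $M > n_0$, we have
$$\Bigl| \sum_{n=M}^N \frac{a_n}{n^s} \Bigr| \le \varepsilon \sum_{n=M}^N \Bigl| \frac{1}{(n+1)^s} - \frac{1}{n^s} \Bigr| + \frac{\varepsilon}{M^\sigma} + \frac{\varepsilon}{(N+1)^\sigma}.$$
If $\sigma > 0$, and $M > n_0(\varepsilon)$, we therefore have the estimate
$$\Bigl| \sum_{n=M}^N \frac{a_n}{n^s} \Bigr| \le \frac{\varepsilon|s|}{\sigma} \Bigl( \frac{1}{M^\sigma} - \frac{1}{(N+1)^\sigma} \Bigr) + \frac{\varepsilon}{M^\sigma} + \frac{\varepsilon}{(N+1)^\sigma} \le \frac{2\varepsilon|s|}{\sigma} \cdot \frac{1}{M^\sigma} \le \frac{2\varepsilon|s|}{\sigma},$$
since $|s|/\sigma \ge 1$. To prove the required uniform convergence, we observe that
$$\frac{|s|}{\sigma} = \frac{1}{\cos|\arg s|} \le \frac{1}{\cos(\pi/2 - \theta)} = \frac{1}{\sin\theta},$$
that is, for every $s$, such that $|\arg s| \le \pi/2 - \theta < \pi/2$, we have
$$\Bigl| \sum_{n=M}^N \frac{a_n}{n^s} \Bigr| \le \frac{2\varepsilon}{\sin\theta},$$
which proves Theorem 1.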
It follows that if $\sum a_n/n^s$ converges for $s = \sigma_0 + it_0$, then it converges for all $s = \sigma + it$, with $\sigma > \sigma_0$. Hence we have

THEOREM 2. If $\sum_{n=1}^\infty a_n/n^s$ converges for $s = s_0$, it converges in the half-plane $\sigma > \sigma_0$, and uniformly in every compact set contained in that half-plane.

From the uniform convergence we also have

THEOREM 3. If $\sum_{n=1}^\infty a_n/n^s$ converges for $s = s_0$ to sum $f(s_0)$, where $f(s)$ denotes its sum function in the half-plane $\sigma > \sigma_0$, then $f(s) \to f(s_0)$, as $s \to s_0$ along any path in the region $|\arg(s - s_0)| \le \pi/2 - \theta < \pi/2$.
Theorem 2 shows that the region of convergence of a Dirichlet series is a half-plane. For if the points of the real axis are divided into two classes $U$ and $L$, such that
$$U = \Bigl\{ \sigma \Bigm| \sum_{n=1}^\infty \frac{a_n}{n^\sigma} \text{ is convergent} \Bigr\}, \qquad L = \Bigl\{ \sigma \Bigm| \sum_{n=1}^\infty \frac{a_n}{n^\sigma} \text{ is divergent} \Bigr\},$$
then every member of $U$ is greater than any member of $L$, and the classification defines a real number $\sigma_0$, such that the series converges for $\sigma > \sigma_0$, and diverges for $\sigma < \sigma_0$, the case $\sigma = \sigma_0$ being undecided. If $U$ is empty, we define $\sigma_0 = +\infty$, and if $L$ is empty, $\sigma_0 = -\infty$.

The number $\sigma_0$ is called the abscissa of convergence, the line $\sigma = \sigma_0$ the line of convergence, and the half-plane $\sigma > \sigma_0$ the half-plane of convergence, of the Dirichlet series $\sum_{n=1}^\infty a_n/n^s$.

The series $\sum_{n=1}^\infty n!/n^s$ converges nowhere $(\sigma_0 = +\infty)$, while the series $\sum_{n=1}^\infty 1/(n!\,n^s)$ converges everywhere $(\sigma_0 = -\infty)$.
Theorem 1, together with Weierstrass's theorem on uniform limits of analytic functions, gives

THEOREM 4. A Dirichlet series $\sum_{n=1}^\infty a_n n^{-s}$ represents in its half-plane of convergence a regular analytic function of $s$, whose successive derivatives are obtained by termwise differentiation of the series.
These theorems do not say anything about the convergence of the
series, or the regularity of the sum function, on the line of convergence.
In contrast to a power series which always has a singularity on the
circle of convergence, a Dirichlet series need not necessarily have any
singularity on the line of convergence. Nor can we conclude from the
convergence or divergence of a Dirichlet series at a fixed point on the
line of convergence, the regularity or singularity of the sum function
of the series at that point. We shall revert to this question a little later.
ABSOLUTE CONVERGENCE. The series $\sum_{n=1}^{\infty} a_n/n^s$ is absolutely convergent
if $\sum_{n=1}^{\infty} |a_n|/n^{\sigma}$ is convergent. The abscissa of absolute convergence $\bar\sigma$ of
the series $\sum a_n/n^s$ is the abscissa of convergence of $\sum_{n=1}^{\infty} |a_n|/n^s$.
Obviously we have $\bar\sigma\ge\sigma_0$, since absolute convergence implies convergence. If $\bar\sigma>\sigma_0$, then there exists a strip of the complex $s$-plane in
which the series converges but not absolutely. This strip $\sigma_0<\sigma<\bar\sigma$ is
called the strip of conditional convergence.
To take an example, the series
$$\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^{s}}$$
converges for real $s>0$, since it is an alternating series of decreasing
terms. It obviously diverges for real $s<0$. Hence $\sigma_0=0$. It converges
absolutely for $\sigma>1$, and diverges absolutely for $\sigma<1$. Hence $\bar\sigma=1$.
The strip of conditional convergence has width 1.
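As a rough numerical illustration of this example (a sketch only, not part of the argument), one can look at partial sums at the point $s=1/2$, which lies inside the strip $0<\sigma<1$. The function names `eta_partial` and `abs_partial` and the cut-offs are illustrative choices made here.

```python
def eta_partial(s, terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) / n^s for real s."""
    return sum((-1) ** (n - 1) / n ** s for n in range(1, terms + 1))


def abs_partial(s, terms):
    """Partial sum of the series of absolute values, sum_{n>=1} 1 / n^s."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


s = 0.5  # a real point inside the strip of conditional convergence

# Consecutive partial sums of an alternating series with decreasing terms
# bracket the limit; their gap is exactly the last term added.
low = eta_partial(s, 10000)
high = eta_partial(s, 10001)
gap = high - low

# The series of absolute values keeps growing at the same point.
growth = abs_partial(s, 10000) - abs_partial(s, 1000)

print(low, high, gap, growth)
```

The bracketing interval shrinks like $n^{-1/2}$, while the block of absolute values between $n=1001$ and $n=10000$ alone contributes more than 100, which is consistent with convergence that is not absolute.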
It is interesting to note that
$$\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^{s}}=(1-2^{1-s})\,\zeta(s),\quad\text{for }\sigma>0,\tag{6}$$
where $\zeta(s)$ is the Riemann zeta-function, for the series on the left is
absolutely convergent for $\sigma>1$ and can therefore be rearranged:
$$\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^{s}}=\Big(\frac{1}{1^{s}}+\frac{1}{2^{s}}+\frac{1}{3^{s}}+\cdots\Big)-2\Big(\frac{1}{2^{s}}+\frac{1}{4^{s}}+\frac{1}{6^{s}}+\cdots\Big)=(1-2^{1-s})\,\zeta(s),\quad\text{for }\sigma>1.$$
But the series $\sum(-1)^{n-1}/n^{s}$ converges for $\sigma>0$, and the function
$\zeta(s)(1-2^{1-s})$ is regular for $\sigma>0$, the simple pole of $\zeta(s)$ at $s=1$ being
cancelled by the zero of $1-2^{1-s}$. Hence (6) is valid, by analytic continuation, for $\sigma>0$.
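The rearrangement behind (6) can be checked numerically at a point of absolute convergence. The sketch below takes $s=2$, where both sides should be close to $\pi^2/12$ (using the classical value $\zeta(2)=\pi^2/6$); the helper names and the number of terms are assumptions made for the illustration.

```python
import math


def eta_partial(s, terms):
    """Partial sum of the alternating series on the left of (6)."""
    return sum((-1) ** (n - 1) / n ** s for n in range(1, terms + 1))


def zeta_partial(s, terms):
    """Partial sum of sum 1/n^s, an approximation to zeta(s) for s > 1."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


s, terms = 2.0, 100000
lhs = eta_partial(s, terms)
rhs = (1 - 2 ** (1 - s)) * zeta_partial(s, terms)

print(lhs, rhs, math.pi ** 2 / 12)
```

The left side converges much faster than the right, since the truncation error of the alternating series is bounded by its next term, whereas the tail of $\sum 1/n^2$ is of order $1/N$.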
We have noted that the strip of conditional convergence of the series
in (6) is of width 1. It can be shown that the strip of conditional convergence of any Dirichlet series $\sum a_n/n^s$ can be at most of width 1, so that
if it converges for a given $s$, it converges absolutely when the real part
of $s$ is increased by $1+\varepsilon$ with any $\varepsilon>0$.
THEOREM 5. For any Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$, we have $\bar\sigma-\sigma_0\le 1$.
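PROOF. If $\sum_{n=1}^{\infty} a_n/n^{\sigma}$ converges, then $\lim_{n\to\infty}|a_n|/n^{\sigma}=0$, hence the series
$\sum_{n=1}^{\infty}|a_n|/n^{1+\sigma+\varepsilon}$ converges for $\varepsilon>0$.
This theorem does not hold for Dirichlet series of the more general
form $\sum a_n\lambda_n^{-s}$, where $(\lambda_n)$ is not the set of positive integers, as the following examples show:
$$\sum_{n=2}^{\infty}\frac{(-1)^{n}}{(\log n)^{s}}\quad\text{converges for }\sigma>0,\text{ but never absolutely;}$$
$$\sum_{n=2}^{\infty}\frac{(-1)^{n}}{\sqrt{n}\,(\log n)^{s}}\quad\text{converges for all }s,\text{ but never absolutely.}$$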
We now return to the question of the regularity of the sum function
of a Dirichlet series $\sum a_n/n^s$ on the line of convergence. In case the coefficients $(a_n)$ are non-negative, we have
THEOREM 6 (LANDAU). If $a_n\ge 0$ for all $n\ge 1$, and $\sigma_0$ is finite, then
the point of intersection of the real axis with the line of convergence is a
singularity of the sum function $f(s)$ of the Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$.
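PROOF. Since $a_n\ge 0$, we have $\bar\sigma=\sigma_0$. We can assume, without loss
of generality, that $\sigma_0=0$. We wish to show that the point $s=0$ is a
singularity of $f$. If $f$ were regular at $s=0$, then the Taylor series of $f$
at the point $s=1$ would have a radius of convergence $\rho>1$. Hence
there would exist a real $s<0$, for which the Taylor series
$$\sum_{\nu=0}^{\infty}\frac{(s-1)^{\nu}}{\nu!}\,f^{(\nu)}(1)$$
converges. But, for $\sigma>0$,
$$f(s)=\sum_{n=1}^{\infty}a_n e^{-s\log n},$$
and by Theorem 4,
$$f^{(\nu)}(s)=\sum_{n=1}^{\infty}a_n(-\log n)^{\nu}e^{-s\log n},$$
so that
$$f^{(\nu)}(1)=\sum_{n=1}^{\infty}\frac{a_n(-\log n)^{\nu}}{n}.$$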
The Taylor series of $f$ at $s=1$ is therefore
$$\sum_{\nu=0}^{\infty}\frac{(s-1)^{\nu}}{\nu!}\sum_{n=1}^{\infty}\frac{a_n(-\log n)^{\nu}}{n}=\sum_{\nu=0}^{\infty}\frac{(1-s)^{\nu}}{\nu!}\sum_{n=1}^{\infty}\frac{a_n(\log n)^{\nu}}{n}.$$
Since all terms of this double series are non-negative, if $s<0$, we may
interchange the order of summation, and it follows that
$$\sum_{n=1}^{\infty}\frac{a_n}{n}\sum_{\nu=0}^{\infty}\frac{(1-s)^{\nu}(\log n)^{\nu}}{\nu!}$$
converges for some $s<0$. However,
$$\sum_{\nu=0}^{\infty}\frac{(1-s)^{\nu}(\log n)^{\nu}}{\nu!}=e^{(1-s)\log n}.$$
Hence $\sum_{n=1}^{\infty}a_n n^{-s}$ converges for some $s<0$, which is impossible, since
$\sigma_0=0$. Hence the point $s=0$ must be a singularity of $f(s)$.
MULTIPLICATION OF DIRICHLET SERIES. The formal product of two given
Dirichlet series $\sum_{k=1}^{\infty}a_k/k^s$ and $\sum_{m=1}^{\infty}b_m/m^s$ is defined to be $\sum_{n=1}^{\infty}c_n/n^s$, where
$$c_n=\sum_{km=n}a_k b_m.$$
If both the given series are absolutely convergent for
a given $s$, they can be multiplied out and rearranged; and the series
$\sum_{n=1}^{\infty}c_n/n^s$ is then absolutely convergent, and is called the product of the
given series.
For $\sigma>\sigma_0$, let
$$f(s)=\sum_{k=1}^{\infty}\frac{a_k}{k^{s}},\qquad g(s)=\sum_{m=1}^{\infty}\frac{b_m}{m^{s}}.$$
The function $h(s)$, where $h(s)=f(s)\cdot g(s)$, is representable by the product
of the Dirichlet series in the half-plane $\sigma>\sigma_0+1$, by Theorem 5.
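The coefficient rule $c_n=\sum_{km=n}a_kb_m$ is easy to compute for truncated coefficient lists. The following is a minimal sketch; the function name `dirichlet_product` and the convention that index 0 is unused are choices made for this illustration. Taking $a_k=b_m=1$ gives the coefficients of $\zeta(s)^2$, which count the divisors of $n$.

```python
def dirichlet_product(a, b):
    """Coefficients of the formal product: c[n] = sum over k*m = n of a[k]*b[m].

    a and b are lists whose entry at index n is the n-th coefficient;
    index 0 is unused. Only indices up to the shorter list are produced.
    """
    size = min(len(a), len(b)) - 1
    c = [0] * (size + 1)
    for k in range(1, size + 1):
        # m ranges over all values with k*m <= size
        for m in range(1, size // k + 1):
            c[k * m] += a[k] * b[m]
    return c


ones = [0] + [1] * 12          # coefficients of zeta(s), truncated at n = 12
divisor_counts = dirichlet_product(ones, ones)
print(divisor_counts[1:])      # number of divisors of n for n = 1..12
```

The double loop visits each pair $(k,m)$ with $km\le N$ once, so the cost is of order $N\log N$ rather than $N^2$.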
The representation of a function by a Dirichlet series is unique, as
shown by the following
THEOREM 7. If the series $\sum_{n=1}^{\infty}a_n/n^s$ and $\sum_{n=1}^{\infty}b_n/n^s$ converge in a common
half-plane, and if their sum functions coincide in a non-empty open set
contained in that half-plane, then $a_n=b_n$ for all $n\ge 1$.
PROOF. Consider the Dirichlet series $\sum_{n=1}^{\infty}(a_n-b_n)/n^s$. It converges in a
half-plane $\sigma>\sigma_0$, say, where it defines a regular analytic function. That
function vanishes on a non-empty open set contained in that half-plane.
Hence it is identically zero in the whole half-plane $\sigma>\sigma_0$.
Let $M$ be the first value of the index $n$, such that $a_n\ne b_n$, and let
$c_n=a_n-b_n$. Then, for $\sigma>\sigma_0$, we have
$$\sum_{n=1}^{\infty}\frac{c_n}{n^{\sigma}}=\sum_{n=M}^{\infty}\frac{c_n}{n^{\sigma}}=0.$$
Hence
$$c_M=-M^{\sigma}\sum_{n=M+1}^{\infty}\frac{c_n}{n^{\sigma}}.$$
Because of the uniform convergence of the series for $\sigma>\sigma_0+2$, if we
let $\sigma\to\infty$, it follows that $c_M=0$. This contradicts the definition of $M$.
Hence $c_n=0$ for all $n\ge 1$.
§ 5. Dirichlet's theorem. We shall now apply the knowledge of
characters obtained from §3, and of Dirichlet series obtained from §4,
to series of the form
$$\sum_{n=1}^{\infty}\frac{\chi(n)}{n^{s}},\qquad s=\sigma+it,\tag{7}$$
where $\chi$ is a character modulo $m$.
There are $\varphi(m)$ such series, where $\varphi$ is Euler's function. Since
$|\chi(n)|\le 1$, the series in (7) converges for $\sigma>1$, in comparison with the
series $\sum 1/n^s$, and we denote its sum function by $L(s,\chi)$. For different
characters $\chi$, we get different functions $L(s,\chi)$, and these are called
Dirichlet's L-functions. To study their properties, it is convenient to
distinguish the case where $\chi$ is the principal character $\chi_1$, from the case
where $\chi\ne\chi_1$.
(i) If $\chi\ne\chi_1$, then the series in (7) converges in the half-plane $\sigma>0$,
since the partial sums $\sum_{n\le x}\chi(n)$ are bounded, which can be seen as follows.
If we partition the integers from 1 to $[x]$ into residue classes (mod $m$),
and write $[x]=mq+r$, $0\le r\le m-1$, then
$$\sum_{n\le x}\chi(n)=\sum_{n=1}^{[x]}\chi(n)=\Big(\sum_{n=1}^{m}+\sum_{n=m+1}^{2m}+\cdots+\sum_{n=m(q-1)+1}^{mq}\Big)\chi(n)+\sum_{n=mq+1}^{mq+r}\chi(n),$$
and because of the orthogonality relation (4), we have
$$\sum_{n\le x}\chi(n)=\sum_{n=mq+1}^{mq+r}\chi(n),$$
which implies that
$$\Big|\sum_{n\le x}\chi(n)\Big|\le\sum_{n=mq+1}^{mq+r}|\chi(n)|\le r<m.$$
Since $n^{-\sigma}$, for $\sigma>0$, decreases monotonically to zero as $n\to\infty$, it
follows that $\sum\chi(n)/n^s$ converges for real $s=\sigma>0$, and consequently
for all $s$ in the half-plane $\sigma>0$, if $\chi\ne\chi_1$. If $\sigma<0$, it obviously diverges.
Its abscissa of convergence $\sigma_0=0$, and the abscissa of absolute convergence $\bar\sigma=1$. By Theorem 4, the function $L(s,\chi)$, $\chi\ne\chi_1$, is a regular
analytic function of $s$, for $\sigma>0$.
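For a concrete instance of case (i), consider the non-principal character modulo 4, which takes the values $1,0,-1,0$ on $n\equiv 1,2,3,0 \pmod 4$. The sketch below (the name `chi4` and the cut-offs are illustrative assumptions) tracks the partial sums $\sum_{n\le x}\chi(n)$ and approximates $L(1,\chi)$, which for this character is the Leibniz series with sum $\pi/4$.

```python
import math


def chi4(n):
    """The non-principal character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


# Partial sums of chi(n): they should stay below m = 4 in absolute value.
partial = 0
largest = 0
for n in range(1, 10001):
    partial += chi4(n)
    largest = max(largest, abs(partial))

# Partial sum of the series (7) at s = 1.
l_value = sum(chi4(n) / n for n in range(1, 200001))

print(largest, l_value, math.pi / 4)
```

Here the partial sums only take the values 0 and 1, well inside the general bound $r<m$.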
(ii) If $\chi=\chi_1$, we use, once again, Euler's identity
$$\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^{s}}=\prod_{p}\Big(1-\frac{1}{p^{s}}\Big)^{-1},\quad\sigma>1,$$
where $p$ runs through all the primes. Since each character $\chi$ is a completely
multiplicative arithmetical function, by Theorem 5 of Chapter VII, we
have, for all $\chi$, the identity
$$L(s,\chi)=\sum_{n=1}^{\infty}\frac{\chi(n)}{n^{s}}=\prod_{p}\Big(1-\frac{\chi(p)}{p^{s}}\Big)^{-1},\quad\sigma>1.\tag{8}$$
This implies that $L(s,\chi)\ne 0$, for $\sigma>1$.
If $\chi_1$ is the principal character (mod $m$), we know that
$$\chi_1(a)=\begin{cases}1, & \text{if } (a,m)=1,\\ 0, & \text{if } (a,m)>1.\end{cases}$$
Using this in (8), we get
$$L(s,\chi_1)=\prod_{p\nmid m}\big(1-p^{-s}\big)^{-1},$$
or
$$L(s,\chi_1)=\zeta(s)\cdot\prod_{p\mid m}\big(1-p^{-s}\big)\quad(\sigma>1).\tag{9}$$
We have seen that $\zeta(s)$ is meromorphic in the half-plane $\sigma>0$, having
a simple pole at $s=1$, with residue 1, as its only singularity. Hence
$L(s,\chi_1)$ is regular for $\sigma>0$, except for the point $s=1$, where it has a
simple pole with residue $\prod_{p\mid m}(1-p^{-1})=\varphi(m)/m$ [cf. Chapter II, (1)].
For the proof of Dirichlet's theorem we need the following
LEMMA. If $\chi\ne\chi_1$, then $L(1,\chi)\ne 0$.
PROOF. It is sufficient to show that the product
$$P(s)=\prod_{\chi}L(s,\chi),$$
where $\chi$ runs through all characters (mod $m$), is not regular for $\sigma>0$. For
if $L(1,\chi)=0$ for at least one character $\chi\ne\chi_1$, then the simple pole at
$s=1$ of $L(s,\chi_1)$ in the product $P(s)$ would be cancelled by the zero of
$L(s,\chi)$ at $s=1$, and $P(s)$ would be regular for $\sigma>0$.
For $\sigma>1$, we have $|\chi(p)p^{-s}|\le p^{-\sigma}<1$, so that we define
$$\log\Big(1-\frac{\chi(p)}{p^{s}}\Big)^{-1}=\sum_{k=1}^{\infty}\frac{\chi(p^{k})}{k\,p^{ks}}.$$
Then the function $\log L(s,\chi)$ is uniquely defined in the half-plane $\sigma>1$,
and given by
$$\log L(s,\chi)=\sum_{p,k}\frac{\chi(p^{k})}{k\,p^{ks}},\tag{10}$$
where $p$ runs through all the primes, and $k$ through all positive integers.
The double series is absolutely convergent for $\sigma>1$. Further
$$e^{\log L(s,\chi)}=L(s,\chi).$$
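If we sum $\log L(s,\chi)$ over all the characters $\chi$ (mod $m$), we get
$$Q(s)=\log P(s)=\sum_{\chi}\log L(s,\chi)=\sum_{\chi}\sum_{p,k}\frac{\chi(p^{k})}{k\,p^{ks}}.$$
Since there are only finitely many $\chi$, we can interchange the order of
summation, and obtain
$$Q(s)=\sum_{p,k}\frac{1}{k\,p^{ks}}\sum_{\chi}\chi(p^{k}).$$
Since
$$\sum_{\chi}\chi(a)=\begin{cases}\varphi(m), & \text{if } a\equiv 1 \pmod m,\\ 0, & \text{otherwise,}\end{cases}$$
we have
$$Q(s)=\varphi(m)\sum_{p^{k}\equiv 1 \,(\mathrm{mod}\, m)}\frac{1}{k\,p^{ks}}.\tag{11}$$
If we define
$$a_n=\begin{cases}\varphi(m)/k, & \text{if } n=p^{k}\equiv 1 \pmod m,\\ 0, & \text{otherwise,}\end{cases}$$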
then
$$Q(s)=\sum_{n=1}^{\infty}\frac{a_n}{n^{s}},$$
where the coefficients $(a_n)$ are non-negative. We know that the series
converges for $\sigma>1$. In order to find its abscissa of convergence, let $p$
be a prime such that $p\nmid m$. By Euler's theorem (Theorem 2, Chapter II)
we have $p^{h}\equiv 1 \pmod m$, where $h=\varphi(m)$. If we consider the series (11)
for real $s$, and take only the terms for which $k=h$, then
$$Q(s)>\sum_{p\nmid m}\frac{1}{p^{hs}}=\sum_{p}\frac{1}{p^{hs}}-\sum_{p\mid m}\frac{1}{p^{hs}}.$$
Since $\sum_{p}1/p$ diverges, and $\sum_{p\mid m}1/p$ is finite, it follows that the series in
(11) diverges for $s=1/h$. Hence, if $\alpha$ is the abscissa of convergence of
the Dirichlet series $Q(s)$, then $\alpha\ge 1/h$. But
$$P(s)=e^{Q(s)}=1+Q(s)+\frac{Q^{2}(s)}{2!}+\cdots.\tag{12}$$
The product of two convergent Dirichlet series, with non-negative
coefficients, is again a Dirichlet series with non-negative coefficients,
which converges in the intersection of the two half-planes of convergence.
Hence, along with $Q(s)$ all the powers $Q^{n}(s)$ are absolutely convergent, so
that the series P(s) in (12) can be written as a Dirichlet series which has
non-negative coefficients.
Thus if the Dirichlet series of Q(s) converges, so does the Dirichlet
series of P(s). Conversely, if the Dirichlet series of P(s) converges for
some real s, then so does the Dirichlet series of Q(s), because its coefficients are non-negative, for that value of s.
Hence the Dirichlet series of $P(s)$, which is unique, has the same
abscissa of convergence $\sigma_0=\alpha$ as the Dirichlet series of $Q(s)$. By
Theorem 6, the point $s=\alpha$ is a singularity of $P(s)$. But we know that
$\alpha\ge 1/h>0$. Hence the function $P(s)$ is not regular in the whole half-plane $\sigma>0$. Thus the lemma is proved.
We are now in a position to prove the main theorem of this chapter,
namely
THEOREM 8 (DIRICHLET). If $m$ is a positive integer, and $(a,m)=1$,
then there exist infinitely many primes $p\equiv a \pmod m$.
PROOF. It is sufficient to prove that the series $\sum 1/p$ summed over
all primes $p\equiv a \pmod m$ diverges. For this purpose, we use the functions
$L(s,\chi)$.
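If $\sigma>1$, then by (10), we have
$$\log L(s,\chi)=\sum_{p}\sum_{k=1}^{\infty}\frac{\chi(p^{k})}{k\,p^{ks}}.$$
If we separate the terms for which $k=1$ from the others, we get
$$\log L(s,\chi)=\sum_{p}\chi(p)p^{-s}+R(s,\chi),\tag{13}$$
where the series
$$R(s,\chi)=\sum_{p}\sum_{k=2}^{\infty}\frac{\chi(p^{k})}{k\,p^{ks}}$$
converges for $\sigma>\tfrac12$.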
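Since $(a,m)=1$, there exists an integer $b$, such that $ab\equiv 1 \pmod m$. If
we multiply (13) by $\chi(b)$, and sum over all characters $\chi$ (mod $m$), we get
$$\sum_{\chi}\chi(b)\log L(s,\chi)=\sum_{p}\sum_{\chi}\chi(bp)p^{-s}+\sum_{\chi}\chi(b)R(s,\chi),\quad\sigma>1.$$
Since $R(s,\chi)$ is regular for $\sigma>\tfrac12$, the function $R^{*}(s)=\sum_{\chi}\chi(b)R(s,\chi)$ is
also regular for $\sigma>\tfrac12$. Further
$$\sum_{\chi}\chi(bp)=\begin{cases}h, & \text{if } bp\equiv 1 \pmod m,\\ 0, & \text{otherwise.}\end{cases}$$
If $ab\equiv 1 \pmod m$, then the congruence $bp\equiv 1 \pmod m$ is equivalent to
$p\equiv a \pmod m$. Hence
$$\sum_{\chi}\chi(b)\log L(s,\chi)=h\sum_{p\equiv a\,(\mathrm{mod}\, m)}p^{-s}+R^{*}(s).\tag{14}$$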
If we now let $s\to 1+0$ along the real axis, the left-hand side of (14)
tends to $\infty$. For $L(s,\chi_1)\to\infty$ as $s\to 1+0$; $L(s,\chi)$, $\chi\ne\chi_1$, is regular for
$\sigma>0$; $L(1,\chi)\ne 0$ for $\chi\ne\chi_1$ by the lemma; and $\log L(s,\chi)$, $\chi\ne\chi_1$, as
defined by (10), has a finite limit as $s\to 1+0$, because of the formula
$$\log L(s,\chi)=-\int_{s}^{c}\frac{L'(u,\chi)}{L(u,\chi)}\,du+\log L(c,\chi),\quad\text{for } s=\sigma>1,\ c>\sigma,$$
if we note that $L(u,\chi)\ne 0$ for $u\ge 1$, $\chi\ne\chi_1$, and that $L(s,\chi)$ is regular
for $\sigma>0$, $\chi\ne\chi_1$. Further $R^{*}(s)$ is regular for $\sigma>\tfrac12$. Hence
$$\sum_{p\equiv a\,(\mathrm{mod}\, m)}p^{-s}\to\infty,\quad\text{as } s\to 1+0.$$
Hence $\sum_{p\equiv a\,(\mathrm{mod}\, m)}1/p$ diverges.
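Theorem 8 is an existence statement, but a small computation gives a feel for it. The sketch below counts the primes up to a bound in each reduced residue class modulo $m$ and accumulates the partial sums of $1/p$ in each class; the function names and the choices $m=4$, bound $10^4$ are assumptions made for the illustration. It uses the known value $\pi(10^4)=1229$ only as a sanity check.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # strike out p*p, p*p + p, ... in one slice assignment
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


def tally(limit, m):
    """Counts of primes and partial sums of 1/p per class a (mod m), (a, m) = 1."""
    counts, reciprocal_sums = {}, {}
    for p in primes_up_to(limit):
        if math.gcd(p, m) == 1:
            a = p % m
            counts[a] = counts.get(a, 0) + 1
            reciprocal_sums[a] = reciprocal_sums.get(a, 0.0) + 1.0 / p
    return counts, reciprocal_sums


counts, reciprocal_sums = tally(10000, 4)
print(counts, reciprocal_sums)
```

The two classes 1 and 3 (mod 4) share the odd primes almost evenly, and both partial sums of $1/p$ keep growing, very slowly, as the bound is raised.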
Chapter XI
The prime number theorem
§ 1. The non-vanishing of $\zeta(1+it)$. We have seen in the preceding
chapter that Dirichlet's L-functions have the property that $L(1,\chi)\ne 0$
for $\chi\ne\chi_1$, and used it to show that every arithmetical progression of
the form $a+mk$, where $m>0$, $(a,m)=1$, and $k=1,2,\ldots$, contains infinitely many primes.
We shall now prove that the Riemann zeta-function has the property
that $\zeta(1+it)\ne 0$ for $t\ne 0$, and use it to prove the prime number theorem.
The prime number theorem is usually stated in the form
$$\pi(x)\sim\frac{x}{\log x},\tag{1}$$
where $\pi(x)$ denotes the number of primes not exceeding $x$, and the
symbol $\sim$ in (1) means that $\pi(x)/(x/\log x)\to 1$ as $x\to\infty$.
Since we have seen in Chapter VII that (1) is equivalent to the relation
$$\lim_{x\to\infty}\frac{\psi(x)}{x}=1,\tag{2}$$
where $\psi$ is Chebyshev's function, we shall prove the prime number
theorem in this form.
For this we need the relation
$$-\frac{\zeta'(s)}{\zeta(s)}=s\int_{1}^{\infty}\frac{\psi(u)}{u^{s+1}}\,du,\tag{3}$$
which we have proved in § 4, Chapter VII, for real $s>1$, as a consequence of Abel's summation formula. By analytic continuation, (3) is
valid for complex $s$ with real part $\sigma>1$. (We write, as usual, $s=\sigma+it$,
with $\sigma$, $t$ real, and $i^{2}=-1$.)
If we substitute $u=e^{x}$ in (3), we get
$$-\frac{\zeta'(s)}{s\,\zeta(s)}=\int_{0}^{\infty}\psi(e^{x})e^{-xs}\,dx,\quad\sigma>1,\tag{4}$$
from which we shall deduce that $\psi(e^{x})\sim e^{x}$, that is $\psi(x)\sim x$, as $x\to\infty$.
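We have already seen that $\zeta(s)$ is analytic in the half-plane $\sigma>0$,
except for a simple pole at $s=1$ with residue 1, and that $\zeta(s)\ne 0$ for
$\sigma>1$. We shall now prove that $\zeta(s)\ne 0$ on the line $\sigma=1$.
THEOREM 1 (HADAMARD-DE LA VALLÉE POUSSIN). If $t\ne 0$, then $\zeta(1+it)\ne 0$.
PROOF. If $\sigma>1$, then we have
$$\zeta(s)=\prod_{p}\big(1-p^{-s}\big)^{-1},$$
and if we take logarithms, then as in Chapter X,
$$\log\zeta(s)=\sum_{m,p}\frac{1}{m\,p^{ms}},\quad\sigma>1,\tag{5}$$
where $m$ runs through all positive integers, and $p$ through all primes.
Hence
$$\log|\zeta(s)|=\mathrm{Re}\big(\log\zeta(s)\big)=\mathrm{Re}\Big(\sum_{m,p}\frac{1}{m\,p^{ms}}\Big).$$
Now $\sum_{m,p}1/(m\,p^{ms})=\sum_{n=2}^{\infty}c_n/n^{s}$ is a Dirichlet series with coefficients
$$c_n=\begin{cases}1/m, & \text{if } n=p^{m},\\ 0, & \text{otherwise.}\end{cases}$$
Hence
$$\log|\zeta(s)|=\mathrm{Re}\sum_{n=2}^{\infty}\frac{c_n}{n^{s}}.$$
Since
$$\frac{c_n}{n^{s}}=\frac{c_n}{n^{\sigma}}\,n^{-it}=\frac{c_n}{n^{\sigma}}\big(\cos(t\log n)-i\sin(t\log n)\big),$$
it follows that
$$\log|\zeta(s)|=\sum_{n=2}^{\infty}\frac{c_n}{n^{\sigma}}\cos(t\log n).$$
Hence
$$\log\big|\zeta^{3}(\sigma)\,\zeta^{4}(\sigma+it)\,\zeta(\sigma+2it)\big|=3\log|\zeta(\sigma)|+4\log|\zeta(\sigma+it)|+\log|\zeta(\sigma+2it)|$$
$$=\sum_{n=2}^{\infty}\frac{c_n}{n^{\sigma}}\big(3+4\cos(t\log n)+\cos(2t\log n)\big)\ge 0,$$
since $c_n\ge 0$, and
$$3+4\cos\theta+\cos 2\theta=2(1+\cos\theta)^{2}\ge 0,\tag{6}$$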
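for real $\theta$. Hence
$$|\zeta(\sigma)|^{3}\,|\zeta(\sigma+it)|^{4}\,|\zeta(\sigma+2it)|\ge 1,$$
so that we have
$$\big|(\sigma-1)\zeta(\sigma)\big|^{3}\cdot\Big|\frac{\zeta(\sigma+it)}{\sigma-1}\Big|^{4}\cdot\big|\zeta(\sigma+2it)\big|\ge\frac{1}{\sigma-1}.\tag{7}$$
We shall show that the assumption that $\zeta(1+it)=0$ for $t=t_0\ne 0$, leads
to a contradiction. For if we take $t=t_0$ in (7), and let $\sigma\to 1+0$, then the
right-hand side tends to $\infty$, while the left-hand side tends to the limit
$|\zeta'(1+it_0)|^{4}\cdot|\zeta(1+2it_0)|$, under the assumption that $\zeta(1+it_0)=0$; and
the limit is finite, since $\zeta(s)$ is analytic for $\sigma>0$, $s\ne 1$. Hence $\zeta(1+it_0)\ne 0$,
which proves the theorem.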
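The whole argument rests on the elementary identity (6). A quick numerical check over one period is sketched below; the function names and the one-degree sampling grid are arbitrary choices made here, and the check is no substitute for the one-line proof via $\cos 2\theta=2\cos^{2}\theta-1$.

```python
import math


def trig_sum(theta):
    """Left side of (6): 3 + 4 cos(theta) + cos(2 theta)."""
    return 3 + 4 * math.cos(theta) + math.cos(2 * theta)


def square_form(theta):
    """Right side of (6): 2 (1 + cos(theta))^2."""
    return 2 * (1 + math.cos(theta)) ** 2


samples = [k * math.pi / 180 for k in range(361)]  # 0 to 360 degrees
max_gap = max(abs(trig_sum(t) - square_form(t)) for t in samples)
smallest = min(trig_sum(t) for t in samples)

print(max_gap, smallest)
```

The minimum is attained at $\theta=\pi$, where both sides vanish; this is why the inequality in (7) cannot be sharpened by this choice of coefficients alone.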
§ 2. The Wiener-Ikehara theorem. We deduce the prime number
theorem from the following
THEOREM 2 (WIENER-IKEHARA). Let $A(x)$ be a non-negative, non-decreasing function of $x$, defined for $0\le x<\infty$. Let the integral
$$\int_{0}^{\infty}A(x)e^{-xs}\,dx,\quad s=\sigma+it,$$
converge for $\sigma>1$ to the function $f(s)$. Let $f(s)$ be analytic for $\sigma\ge 1$,
except for a simple pole at $s=1$ with residue 1. Then
$$\lim_{x\to\infty}e^{-x}A(x)=1.$$
PROOF. We shall prove the theorem in two parts. Setting
$$B(x)=e^{-x}A(x),\tag{8}$$
we shall first prove that, for any $\lambda>0$,
$$\lim_{y\to\infty}\int_{-\infty}^{\lambda y}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv=\pi.\tag{9}$$
We shall then deduce from (9) that
$$\lim_{x\to\infty}B(x)=1.\tag{10}$$
FIRST PART. Since, for $\sigma>1$, we have
$$f(s)=\int_{0}^{\infty}A(x)e^{-xs}\,dx,\qquad\frac{1}{s-1}=\int_{0}^{\infty}e^{-(s-1)x}\,dx,$$
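it follows that
$$f(s)-\frac{1}{s-1}=\int_{0}^{\infty}\big(B(x)-1\big)e^{-(s-1)x}\,dx\quad(\sigma>1).$$
If we put
$$g(s)=f(s)-\frac{1}{s-1},\quad\text{and}\quad g_{\varepsilon}(t)=g(1+\varepsilon+it),\quad\varepsilon>0,$$
then $g(s)$ is analytic for $\sigma\ge 1$, because of the assumption on $f(s)$.
For $\lambda>0$, we have
$$\frac12\int_{-2\lambda}^{2\lambda}g_{\varepsilon}(t)\Big(1-\frac{|t|}{2\lambda}\Big)e^{iyt}\,dt=\frac12\int_{-2\lambda}^{2\lambda}\Big(1-\frac{|t|}{2\lambda}\Big)e^{iyt}\Big(\int_{0}^{\infty}\big(B(x)-1\big)e^{-(\varepsilon+it)x}\,dx\Big)dt.\tag{11}$$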
We wish to show that the order of integration in (11) can be interchanged.
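Since $A(x)$ is non-negative and non-decreasing, we have for real $s$,
and $x>0$,
$$f(s)=\int_{0}^{\infty}A(u)e^{-us}\,du\ge A(x)\int_{x}^{\infty}e^{-us}\,du=\frac{A(x)e^{-xs}}{s},$$
that is, $A(x)\le s f(s)e^{xs}$. Since $f(s)$ is analytic for $\sigma>1$, it follows that
$A(x)=O(e^{xs})$ for every $s>1$, which implies that $A(x)=o(e^{xs})$ for every
$s>1$. Hence $B(x)e^{-\delta x}=A(x)e^{-(1+\delta)x}=o(1)$, for every $\delta>0$. This
implies that the integral
$$\int_{0}^{\infty}\big(B(x)-1\big)e^{-(\varepsilon+it)x}\,dx$$
converges uniformly in the interval $-2\lambda\le t\le 2\lambda$. Hence we can interchange the order of integration in (11), and obtain
$$\frac12\int_{-2\lambda}^{2\lambda}g_{\varepsilon}(t)\Big(1-\frac{|t|}{2\lambda}\Big)e^{iyt}\,dt=\int_{0}^{\infty}\big(B(x)-1\big)e^{-\varepsilon x}\Big(\frac12\int_{-2\lambda}^{2\lambda}e^{i(y-x)t}\Big(1-\frac{|t|}{2\lambda}\Big)dt\Big)dx$$
$$=\int_{0}^{\infty}\big(B(x)-1\big)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.\tag{12}$$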
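Since $g(s)$ is analytic for $\sigma\ge 1$, it follows that $g_{\varepsilon}(t)\to g(1+it)$, as
$\varepsilon\to 0$, uniformly in any interval $-2\lambda\le t\le 2\lambda$. Further
$$\lim_{\varepsilon\to 0}\int_{0}^{\infty}e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx=\int_{0}^{\infty}\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.$$
Hence the limit
$$\lim_{\varepsilon\to 0}\int_{0}^{\infty}B(x)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx$$
exists. Further, since the integrand is non-negative, and monotone
increasing as $\varepsilon\to 0$, we have
$$\lim_{\varepsilon\to 0}\int_{0}^{\infty}B(x)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx=\int_{0}^{\infty}B(x)\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.$$
Thus we get from (12)
$$\frac12\int_{-2\lambda}^{2\lambda}g(1+it)\Big(1-\frac{|t|}{2\lambda}\Big)e^{iyt}\,dt=\int_{0}^{\infty}B(x)\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx-\int_{0}^{\infty}\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.$$
Now, if we let $y\to\infty$, then by the Riemann-Lebesgue lemma, the left-hand side tends to zero, while on the right-hand side the second term
gives
$$\lim_{y\to\infty}\int_{0}^{\infty}\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx=\lim_{y\to\infty}\int_{-\infty}^{\lambda y}\frac{\sin^{2}v}{v^{2}}\,dv=\pi;$$
hence
$$\lim_{y\to\infty}\int_{-\infty}^{\lambda y}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv=\pi,$$
which proves (9).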
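The constant $\pi$ in (9) comes from the classical value $\int_{-\infty}^{\infty}\sin^{2}v/v^{2}\,dv=\pi$. A crude numerical check is sketched below with a midpoint rule on a truncated range; the function name, the cut-off and the step are assumptions made for the illustration, and the truncation alone leaves an error of roughly $1/\text{cutoff}$.

```python
import math


def truncated_kernel_integral(cutoff, step=0.01):
    """Midpoint-rule value of the integral of (sin v / v)^2 over [-cutoff, cutoff]."""
    count = int(round(cutoff / step))
    half = 0.0
    for k in range(count):
        v = (k + 0.5) * step      # midpoints never hit v = 0
        half += (math.sin(v) / v) ** 2 * step
    return 2 * half               # the integrand is even


approx = truncated_kernel_integral(500.0)
print(approx, math.pi)
```

Since the integrand is non-negative, the truncated value approaches $\pi$ from below as the cut-off grows, which mirrors the way the upper limit $\lambda y$ in (9) tends to infinity.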
SECOND PART. We shall prove (10) in two steps, namely
$$\limsup_{x\to\infty}B(x)\le 1,\tag{13}$$
and
$$\liminf_{x\to\infty}B(x)\ge 1.\tag{14}$$
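Given positive numbers $a$ and $\lambda$, let $y>a/\lambda$. Then, by (9), we have
$$\limsup_{y\to\infty}\int_{-a}^{a}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv\le\pi,$$
because the integrand is non-negative. Further $A(u)=B(u)e^{u}$ is non-decreasing; hence, for $-a\le v\le a$, we have
$$e^{y-a/\lambda}B\Big(y-\frac{a}{\lambda}\Big)\le e^{y-v/\lambda}B\Big(y-\frac{v}{\lambda}\Big),$$
which implies that
$$B\Big(y-\frac{v}{\lambda}\Big)\ge B\Big(y-\frac{a}{\lambda}\Big)e^{(v-a)/\lambda}\ge B\Big(y-\frac{a}{\lambda}\Big)e^{-2a/\lambda}.$$
Hence
$$\limsup_{y\to\infty}\int_{-a}^{a}B\Big(y-\frac{a}{\lambda}\Big)e^{-2a/\lambda}\,\frac{\sin^{2}v}{v^{2}}\,dv\le\pi,$$
that is
$$\limsup_{y\to\infty}B\Big(y-\frac{a}{\lambda}\Big)\int_{-a}^{a}e^{-2a/\lambda}\,\frac{\sin^{2}v}{v^{2}}\,dv\le\pi.$$
For fixed numbers $a$ and $\lambda$ we have $\limsup_{y\to\infty}B(y-a/\lambda)=\limsup_{y\to\infty}B(y)$. Hence
$$e^{-2a/\lambda}\,\limsup_{y\to\infty}B(y)\int_{-a}^{a}\frac{\sin^{2}v}{v^{2}}\,dv\le\pi$$
for all $a>0$, $\lambda>0$. Now let $a\to\infty$ and $\lambda\to\infty$, in such a way that
$a/\lambda\to 0$. Then
$$\limsup_{y\to\infty}B(y)\int_{-\infty}^{\infty}\frac{\sin^{2}v}{v^{2}}\,dv\le\pi,$$
or
$$\pi\,\limsup_{y\to\infty}B(y)\le\pi,$$
which proves (13).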
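We shall use (13) to prove (14), for (13) implies that $|B(x)|\le c$, for a
suitable constant $c$, so that for fixed positive $a$ and $\lambda$, and a sufficiently
large $y$, we have
$$\int_{-\infty}^{\lambda y}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv\le c\Big[\int_{-\infty}^{-a}+\int_{a}^{\infty}\Big]\frac{\sin^{2}v}{v^{2}}\,dv+\int_{-a}^{a}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv.\tag{15}$$
As before, for $-a\le v\le a$, we have
$$B\Big(y-\frac{v}{\lambda}\Big)\le B\Big(y+\frac{a}{\lambda}\Big)e^{(a+v)/\lambda}\le B\Big(y+\frac{a}{\lambda}\Big)e^{2a/\lambda},$$
so that
$$\int_{-a}^{a}B\Big(y-\frac{v}{\lambda}\Big)\frac{\sin^{2}v}{v^{2}}\,dv\le B\Big(y+\frac{a}{\lambda}\Big)e^{2a/\lambda}\int_{-a}^{a}\frac{\sin^{2}v}{v^{2}}\,dv.\tag{16}$$
From (9), (15) and (16) we get
$$\pi\le c\Big[\int_{-\infty}^{-a}+\int_{a}^{\infty}\Big]\frac{\sin^{2}v}{v^{2}}\,dv+\liminf_{y\to\infty}B\Big(y+\frac{a}{\lambda}\Big)e^{2a/\lambda}\int_{-a}^{a}\frac{\sin^{2}v}{v^{2}}\,dv.$$
That is
$$\pi\le c\Big[\int_{-\infty}^{-a}+\int_{a}^{\infty}\Big]\frac{\sin^{2}v}{v^{2}}\,dv+\liminf_{y\to\infty}B(y)\,e^{2a/\lambda}\int_{-a}^{a}\frac{\sin^{2}v}{v^{2}}\,dv.$$
Now let $a\to\infty$, and $\lambda\to\infty$, such that $a/\lambda\to 0$. We then get
$$\pi\le\pi\,\liminf_{y\to\infty}B(y),$$
which proves (14), and hence also Theorem 2.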
§ 3. The prime number theorem. If $\psi$ denotes Chebyshev's function
(cf. Chapter VII), we take $A(x)=\psi(e^{x})$, and note that $\psi$ is non-decreasing and $\psi(e^{x})\ge 0$. Relation (4) enables us to verify the other
hypothesis of Theorem 2, for $\zeta(s)$ is analytic for $\sigma>0$, except for a
simple pole at $s=1$, and, after Theorem 1, $\zeta(s)$ does not vanish in the
half-plane $\sigma\ge 1$. Hence $\psi(e^{x})\sim e^{x}$, or $\psi(x)\sim x$, as $x\to\infty$, which is
the prime number theorem.
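The statement $\psi(x)\sim x$ can be observed numerically for moderate $x$. The sketch below computes $\psi(x)=\sum_{p^{k}\le x}\log p$ with a simple sieve; the function name `chebyshev_psi` and the sample points are illustrative assumptions, and the computation says nothing about the rate of convergence.

```python
import math


def chebyshev_psi(x):
    """psi(x): the sum of log p over all prime powers p^k <= x (x a positive integer)."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    total = 0.0
    for p in range(2, x + 1):
        if sieve[p]:
            # strike out the multiples of p starting at p*p (empty when p*p > x)
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
            power = p
            while power <= x:     # one contribution log p for each power p^k <= x
                total += math.log(p)
                power *= p
    return total


ratios = {x: chebyshev_psi(x) / x for x in (10 ** 3, 10 ** 4, 10 ** 5)}
print(ratios)
```

Already at $x=10^{5}$ the ratio $\psi(x)/x$ differs from 1 by well under one per cent.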
Thus the prime number theorem follows from the Wiener-Ikehara
theorem, if we assume that $\zeta(1+it)\ne 0$ for $t\ne 0$. On the other hand,
if we assume the prime number theorem, it is easy to deduce that
$\zeta(1+it)\ne 0$ for $t\ne 0$. For let
$$\Phi(s)=-\frac{\zeta'(s)}{s\,\zeta(s)}-\frac{1}{s-1}=\int_{1}^{\infty}\frac{\psi(x)-x}{x^{s+1}}\,dx,\quad\sigma>1.$$
Then $\Phi(s)$ is regular for $\sigma>0$, except for simple poles at the zeros of
$\zeta(s)$. The prime number theorem implies that $\psi(x)=x+o(x)$, as
$x\to\infty$. Hence, given $\varepsilon>0$, there exists a number $x_0(\varepsilon)$, such that for
$x\ge x_0(\varepsilon)>1$, we have
$$|\psi(x)-x|<\varepsilon x.$$
Thus, for $\sigma>1$, we have
$$|\Phi(s)|<\int_{1}^{x_0}\frac{|\psi(x)-x|}{x^{2}}\,dx+\int_{x_0}^{\infty}\frac{\varepsilon}{x^{\sigma}}\,dx,$$
and, since
$$\int_{x_0}^{\infty}\frac{\varepsilon}{x^{\sigma}}\,dx<\int_{1}^{\infty}\frac{\varepsilon}{x^{\sigma}}\,dx=\frac{\varepsilon}{\sigma-1},$$
we get
$$|\Phi(s)|<K+\frac{\varepsilon}{\sigma-1},\quad\sigma>1,$$
where $K=K(x_0)=K(\varepsilon)$. Thus
$$(\sigma-1)|\Phi(s)|<K(\sigma-1)+\varepsilon,\quad\sigma>1.$$
If we now let $\sigma\to 1+0$, we get, for any fixed $t$,
$$\lim_{\sigma\to 1+0}(\sigma-1)\,\Phi(\sigma+it)=0.\tag{17}$$
If $1+it$, for $t\ne 0$, were a zero of $\zeta(s)$, then the limit of $(\sigma-1)\Phi(\sigma+it)$,
as $\sigma\to 1+0$, would be equal to the residue of $\Phi(s)$ at the simple pole
$s=1+it$, and therefore different from zero, which contradicts (17).
Hence $\zeta(1+it)\ne 0$ for $t\ne 0$.
Thus the assertion that $\zeta(1+it)\ne 0$ for $t\ne 0$ is 'equivalent' to the
prime number theorem. Another equivalent assertion is that $p_n\sim n\log n$,
where $p_n$ denotes the $n$th prime, when the primes are arranged in natural
order.
For, if $\dfrac{\pi(x)\log x}{x}\to 1$, as $x\to\infty$, then
$$\log\pi(x)+\log\log x-\log x\to 0,$$
hence $\dfrac{\log\pi(x)}{\log x}\to 1$, so that
$$\frac{\pi(x)\log\pi(x)}{x}\to 1,$$
from which it follows that $p_n\sim n\log n$, if we take $x=p_n$.
Conversely, if $x$ is defined by the inequality $p_n\le x<p_{n+1}$, and
$p_n\sim n\log n$, then $p_{n+1}\sim(n+1)\log(n+1)\sim n\log n$, so that $x\sim n\log n$,
or $x\sim y\log y$, where $y=\pi(x)=n$. That is, $\log x\sim\log y$, hence
$$y\sim\frac{x}{\log x}.$$
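The asymptotic relation $p_n\sim n\log n$ converges slowly, as a small table shows. The sketch below computes $p_n/(n\log n)$ for a few values of $n$; the sieve bound 105000 is an assumption chosen so that the first $10^{4}$ primes are included (the 10000th prime is 104729), and the function names are illustrative.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]


primes = primes_up_to(105000)


def ratio(n):
    """p_n / (n log n), where p_n is the n-th prime (n >= 2)."""
    return primes[n - 1] / (n * math.log(n))


for n in (100, 1000, 10000):
    print(n, primes[n - 1], ratio(n))
```

The ratio decreases towards 1 only very gradually, at $n=10^{4}$ it is still about 1.14, which reflects the lower-order term $n\log\log n$ hidden in the symbol $\sim$.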
A list of books
L. E. Dickson, History of the theory of numbers (Carnegie Institution,
Washington), i (1919), ii (1920), iii (1923), reprinted (Chelsea, New
York, 1952).
G. H. Hardy and E. M. Wright, An introduction to the theory of numbers (Oxford University Press, 1938, 2nd edition, 1945).
A. E. Ingham, The distribution of prime numbers (Cambridge University
Press, 1932), reprinted (Stechert-Hafner, New York, 1964).
E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen,
(2 volumes, Teubner, Leipzig, 1909), reprinted (Chelsea, New York,
1953).
J. V. Uspensky and M. A. Heaslet, Elementary number theory (McGraw-Hill, New York, 1939).
I. M. Vinogradov, An introduction to the theory of numbers (Pergamon
Press, London, 1955).
Notes
Notes on Chapter I
As general references, see J. V. Uspensky and M. A. Heaslet, loc. cit.,
Chs. 1-6; and G. H. Hardy and E. M. Wright, loc. cit., Chs. 1-3.
§ 2. Theorem 2 was stated by Gauss, Disquisitiones Arithmeticae,
(1801), § 16, reprinted in his Werke, i (1863), 15.
For what we call the "first proof of Theorem 2", reference may be
made to E. Zermelo, Göttinger Nachrichten (new series), i (1934), 43-44.
According to Zermelo, his proof dates from 1912. See also H. Hasse,
J. für Math. 159 (1928), 3-6; and F. A. Lindemann, Quarterly J. Math.
(Oxford), 4 (1933), 319-320.
§ 3. For the "second proof of Theorem 2", see E. Hecke, Vorlesungen
über die Theorie der algebraischen Zahlen, (1923), Ch. 1. What we have
called a module of integers is simply a subgroup of the additive group
of integers.
For Theorem 6, see Euclid's Elements, book 7, prop. 30, given in
T. L. Heath's The thirteen books of Euclid's Elements (Cambridge, 1926).
§ 5. Farey's name is associated with the Farey sequences because
of Cauchy, who noticed J. Farey's statement of Theorem 7, without
proof, in 1816, and published a proof himself. See A. Cauchy, Oeuvres,
2e série, tome 6, 146. Theorems 7 and 9 seem to have been first stated
and proved by C. Haros in 1802. See Dickson's History, loc. cit., i,
156. The following comment by C. L. Siegel on the proof of Theorem 7
may be of interest: "Let $kl-hm=1$, $k>0$, $m>0$. The homogeneous
linear substitution $\lambda=ka-hb$, $\mu=-ma+lb$ of the integer variables
$a,b$ has the inverse $a=\lambda l+h\mu$, $b=m\lambda+k\mu$. Hence the conditions
$h/k\le a/b\le l/m$, $b>0$, $(a,b)=1$, are satisfied if and only if $\lambda\ge 0$, $\mu\ge 0$,
$\lambda+\mu>0$, $(\lambda,\mu)=1$, and then $b\le m+k$ exactly in the three cases
$\lambda,\mu=0,1;\ 1,1;\ 1,0$. This is independent of the notion of $F_n$."
§ 6. For Theorem 12, see Euclid's Elements, book 9, prop. 20.
For Pólya's proof of Theorem 13, see G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis, (1925), ii, 133, 342. The remark
about allowing $f_0=3$ is due to C. L. Siegel. The proof, by G. T. Bennett,
of Euler's result that $f_5$ is divisible by 641, is given in the book by Hardy
and Wright, loc. cit., 15. An alternative proof is given by Kraitchik,
Théorie des nombres (Paris, 1926), ii, 221.
Notes on Chapter II
As general references, see Uspensky and Heaslet, loc. cit., Chs. 6, 7;
Hardy and Wright, loc. cit., Ch. 5; and Vinogradov, loc. cit., Chs. 1, 2.
§ 1. The theory of congruences was developed by Gauss in his Disquisitiones Arithmeticae, loc. cit., though Fermat and Euler were perhaps
aware of some of the main results.
§ 2. For Fermat's statement of Theorem 3, in 1640, see his Oeuvres,
ii, 209. Euler proved Theorem 2 in 1760. See his Opera, (1), ii, 531. See
also Dickson's History, loc. cit., i, Ch. 3.
§ 3. For Theorem 7, see Lagrange, Oeuvres (1868), ii, 667-9.
Notes on Chapter III
§ 2. For the proofs of Theorems 5 and 7, see, for instance, H. Rademacher, Lectures on elementary number theory (Blaisdell, New York,
1964), 33 - 35.
§ 3. For Theorem 6, see Lucas, Théorie des nombres (1891), i, 353-4.
§ 4. Theorem 9 is due to A. Hurwitz, Math. Annalen, 39 (1891),
279-284. The proof given here is due to A. Khinchin (= A. Khintchine),
Math. Annalen, 111 (1935), 631-637, and the author's attention was
drawn to it by Raghavan Narasimhan. In the author's Einführung in
die Analytische Zahlentheorie, Springer Lecture Notes, 29 (1966), Ch. 3,
a different proof was sketched, which originated with L. R. Ford,
American Math. Monthly, 45 (1938), 586-601.
Notes on Chapter IV
As general references, see Uspensky and Heaslet, Hardy and Wright,
and Vinogradov, loc. cit.
§ 1. For the introduction of the Legendre symbol, see Legendre,
Essai sur la théorie des nombres (1798), 2nd edition (1808), § 135.
We do not consider the case $p=2$, since all integers are quadratic
residues modulo 2.
§ 2. The first published proof (1773) of Wilson's theorem is due to
Lagrange, Oeuvres, iii, 425. The theorem was first stated by Waring,
Meditationes algebraicae (1770), 218, and attributed to J. Wilson. Hardy
and Wright say that "there is evidence that it was known long before
to Leibniz".
§ 3. Theorems 5, 6, 7 can be found in Hardy and Wright's book,
loc. cit., 70, 297. The proof of Theorem 7 given here is due to Hermite,
Journal de Math. (1), 13 (1848), 15; Oeuvres, i, 264.
§ 4. Waring stated without proof that every positive integer is a
sum of four squares, Meditationes algebraicae (1770), 204-5, and
Lagrange proved it the same year, see his Oeuvres, iii, 189. See also
Dickson's History, loc. cit., ii, Ch. 8.
Notes on Chapter V
§ 1. Theorem 1, though stated by Euler, and partly proved by
Legendre, was completely proved by Gauss in 1795. See P. Bachmann,
Niedere Zahlentheorie (1902), i, Ch. 6, where several proofs are described.
§§ 2-3. The idea of proving Theorem 1 by means of a reciprocity
formula for Gaussian sums goes back to Kronecker, Monatsber. Kgl.
Preuss. Akad. Wiss. Berlin (1880), 686-698; 854-860; J. für die reine
und angewandte Math. 105 (1889), 267-268; Werke (1929), iv, 278-300.
[There is, however, a reference to a paper by Schaar (1848), on the reciprocity formula for Gaussian sums, in Lindelöf's Calcul des Résidus,
p. 68, as pointed out by C. L. Siegel.] It was extended to algebraic number fields by E. Hecke, Göttinger Nachrichten (1919), 265-278; Werke,
235-248; and by C. L. Siegel, Göttinger Nachrichten (1960), 1-16;
Ges. Abhandlungen (1966), iii, 334-349. The proof given here is, in
substance, Siegel's. The integral, used in the proof of Theorem 2, is of
importance in the theory of the zeta-function of Riemann. See C. L.
Siegel, Quellen und Studien zur Geschichte der Math. 2 (1932), 45-80;
Gesammelte Abhandlungen (1966), i, 275.
For the evaluation of ordinary Gaussian sums by contour integration, see also L. J. Mordell, Messenger of Math. 48 (1919), 54-56.
The deduction of (14) from (12) is slightly shorter here than in the
author's Lecture Notes (loc. cit. Notes, Ch. 3), as a result of a comment
by C. L. Siegel.
Since $g(-m,-n)=g(m,n)$, the case $m<0$, $n>0$ can be reduced to
the case $m>0$, $n<0$.
Relation (21) shows that $-1$ is a quadratic residue of primes $\equiv 1$
(mod 4), and a quadratic non-residue of primes $\equiv 3$ (mod 4).
§ 4. Theorem 3 is due to Euler, Opera, (1), iii, 240.
For the example and the remark, see Rademacher, (loc. cit., Notes,
Ch. 3), 74, 82.
Notes on Chapter VI
As general references, see Hardy and Wright, loc. cit., Chs. 16-18,
and Vinogradov, loc. cit.
§ 2. The statement that $r(n)=O(n^{\varepsilon})$, for every $\varepsilon>0$, is equivalent
to the statement that $r(n)=o(n^{\varepsilon})$, for every $\varepsilon>0$.
For Theorem 1, see Gauss, Werke, (ii), 272-5.
§ 3. For the proof of Theorem 4, see Pólya and Szegő (loc. cit.,
Notes, Ch. 1), ii, 160-1, 386.
For Theorems 5 and 6, see Hardy and Wright, loc. cit., 259.
Theorem 9 was proved by Dirichlet in 1849, see his Werke, ii, 49-66.
G. Voronoi's improvement of the error-term is given in Ann. Sci.
École Norm. Sup. (3), 21 (1904), 207-267; 459-533. That the error-term is not $O(N^{1/4})$ was proved by Hardy, Proc. London Math. Soc. (2)
15 (1916), 192-213.
§ 4. For the history of Mersenne numbers, and of perfect numbers,
see Dickson, loc. cit., i, Chs. 1-2.
§ 5. For Theorems 15 and 19, see A. F. Möbius, J. für die reine und
angewandte Math. 9 (1832), 105-123; Werke (1887), iv, 589-612. See
also Landau's Handbuch, loc. cit., §§ 150-152. Theorems 16 and 17 were
proved by Dedekind, J. für die reine und angewandte Math. 54 (1857), 21,
and by Liouville, J. de Math. pures et appliquées, (2) 2 (1857), 111, at
about the same time.
§ 6. For Theorem 20, see Landau's Handbuch, loc. cit., § 59. Theorem 22 is due to F. Mertens, J. für die reine und angewandte Math. 77
(1874), 290-291, and is given in Landau's book, § 152.
The evaluation of $\sum_{n=1}^{\infty}\mu(n)n^{-2}$, without the use of Euler's identity
(proved later in Chapter VII, § 4) is a result of a comment by Raghavan
Narasimhan. For a proof of the formula $\sum_{n=1}^{\infty}n^{-2}=\pi^{2}/6$, see, for instance, K. Knopp, Theory and application of infinite series (1951), 237,
323, 376.
Notes on Chapter VII
As general references, see Landau's Handbuch, loc. cit., §§ 12-28, and
Ingham's book, loc. cit., Ch. 1.
§ 1. For Theorem 1, see Euler, Opera (1), 8, § 279; (1), 14, 216-244.
§ 2. Theorem 3 is due to Chebyshev, Oeuvres, i, 49-70.
§ 3. S. S. Pillai's proof of Theorem 4 is given in Bull. Calcutta Math.
Soc. 36 (1944), 97-99; 37 (1944), 27. See also Landau's Handbuch, loc.
cit., § 22.
§ 4. Theorem 7 is due to Chebyshev, Oeuvres, i, 27-48. See
Ingham's book, loc. cit., 16-21. Euler used the formal identity.
§ 5. Theorem 8 is due to F. Mertens, J. für die reine und angewandte
Math. 78 (1874), 46-62. See Ingham's book, loc. cit., 22.
Stirling's formula is given, for instance, in the book by E. C. Titchmarsh, The theory of functions (Oxford, 1932), 2nd edition (1939), § 1.87.
Notes on Chapter VIII
§§ 1-4. Weyl's theorems were proved by him in Math. Annalen, 77
(1916), 313-352. An exposition using the notion of "discrepancy" is
given by J. W. S. Cassels, An introduction to Diophantine approximation
(Cambridge, 1957), Ch. 4.
§ 5. Kronecker proved his theorem in the Berliner Sitzungsberichte
(1884); see his Werke, iii (i) 47-110. For further developments, see
J. F. Koksma, Diophantische Approximationen, Ergebnisse der Math.
Band iv, Heft 4 (1937).
H. Bohr's proof of Theorem 8 is given in J. London Math. Soc. 9
(1934), 5-6. See also Hardy and Wright, Ch. 23.
Notes on Chapter IX
As general references, see Minkowski's Geometrie der Zahlen, 1st edition (1896), and his Diophantische Approximationen (1927). See also the
Lecture Notes on the Geometry of Numbers by C. L. Siegel (New York
University, 1945).
§ 2. Theorem 1 is true without the assumption that the set $S$ is
bounded. For if it is unbounded, with measure $V(S) > 2^n$, one can take
its intersection with a cube $K_M$ given by $|x_k| < M$, $1 \le k \le n$, and if $M$
is sufficiently large, then $S_M = S \cap K_M$ will be a bounded set satisfying
the required conditions, because of the countable additivity of Lebesgue
measure.
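The limiting step in this remark can be written out as follows. This is only a sketch: it assumes that the hypotheses of Theorem 1 are convexity, symmetry about the origin, and $V(S) > 2^n$, and it takes $M$ to run through the positive integers.

```latex
S_1 \subset S_2 \subset \cdots, \qquad
\bigcup_{M=1}^{\infty} S_M = S, \qquad
V(S) = V(S_1) + \sum_{M=1}^{\infty} V\bigl(S_{M+1} \setminus S_M\bigr)
     = \lim_{M \to \infty} V(S_M).
```

Since $V(S) > 2^n$, it follows that $V(S_M) > 2^n$ for all sufficiently large $M$. The cube $K_M$ is itself convex and symmetric about the origin, so $S_M$ inherits these properties from $S$, and any lattice point found in $S_M$ also lies in $S$.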
We do not seek to give the optimal hypotheses here since we do
not wish to go into questions of measurability in greater detail. The
formulation and proof of Theorem 3 support this line.
Minkowski proved Theorem 3 in 1891, see his Gesammelte Abhandlungen, i, 264.
Siegel's proof of his formula (8) is given in Acta Math. 65 (1935),
307-323.
The lemma which appears between Theorems 2 and 3 is due to G. D.
Birkhoff, as stated by Blichfeldt, Trans. American Math. Soc. 15 (1914),
230. See also an Appendix in Cassels's book (loc. cit., Notes, Ch. 8).
In Theorem 2 we use the fact that a closed set in $\mathbb{R}^n$ is Lebesgue
measurable.
Minkowski (loc. cit.) shows that a bounded convex set in $\mathbb{R}^n$ has a
volume in the sense of Jordan. See his Geometrie der Zahlen, 50-60;
also his Theorie der konvexen Körper, Ges. Abh. 2, 142-143; Blaschke's
Kreis und Kugel (Leipzig, 1916), 57.
If a convex set $S$ has Lebesgue measure $V(S)$, $0 < V(S) < \infty$, then it
is bounded. See J. W. S. Cassels, An introduction to the geometry of
numbers (Springer, 1959), 109.
Notes on Chapter X
As a general reference, see Landau's Handbuch, loc. cit., §§ 95-103.
See also C. L. Siegel, Lectures on Analytic Number Theory (New York
University, 1945).
§ 5. The main theorem of this chapter, namely Theorem 8, was first
proved by Dirichlet in 1837, see his Werke (i), 307-342. An elementary
proof was given by Mertens, Wiener Sitzungsberichte, 106 (1897), 254-286. A proof of the theorem by a new elementary method is due to
A. Selberg, Annals of Math. (2) 50 (1949), 297-304; Canadian J. of
Math. 2 (1950), 66-78. Another elementary proof is due to H. Zassenhaus, Comm. Math. Helvetici, 22 (1949), 232-259.
Notes on Chapter XI
As a general reference, see Landau's Handbuch (loc. cit.) including
the Appendix by P. T. Bateman, pp. 929-931, which gives a history of
the proof of the prime number theorem as an asymptotic relation. The
idea of connecting the behaviour of $\pi(x)$ with the properties of $\zeta(s)$,
where $s$ is a complex variable, and, in particular, with the location of
its non-real zeros, originated with Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Grösse, Monatsberichte der Preuss. Akad. der
Wissenschaften (Berlin, 1859-1860), 671-680; Werke (1st edition, 1876),
136-144; (2nd edition, 1892), 145-155.
§ 1. The first proof of the prime number theorem was given by
J. Hadamard, Bull. de la Soc. Math. de France, 24 (1896), 199-220; and
by C.-J. de la Vallée Poussin, Annales de la Soc. sci. de Bruxelles, $20_2$
(1896), 183-256. For a clear presentation of the classical proof, see
Ingham's book (loc. cit.), Ch. 2.
§ 2. For the theorem of Wiener-Ikehara, see S. Ikehara, J. Math.
Phys. Mass. Inst. Tech. 10 (1931), 1-12; N. Wiener, Annals of Math. 33
(1932), 1-100, 787; and N. Wiener, The Fourier integral (Cambridge,
1933), § 19. The theorem is true with weaker hypotheses, but for the
deduction of the prime number theorem, it does not make much difference what form we consider.
The proof of the Wiener-Ikehara theorem given here, which does not
use Wiener's general Tauberian theorem, is substantially that of S.
Bochner, Math. Zeit. 37 (1933), 1-9, as simplified by E. Landau, Berliner Sitzungsberichte (1932), 514-521, and by Bochner in his Lectures
on Fourier Analysis (Princeton University, 1936). It is the same as the
proof given in the author's Lectures on the Riemann zeta-function (Tata
Institute of Fundamental Research, Bombay, 1953). A proof of the prime
number theorem by a new elementary method has been given by A.
Selberg, Annals of Math. (2) 50 (1949), 305-313.
Subject index
Abel's summation formula 78
arithmetical function 14;
completely multiplicative - 76;
multiplicative - 14;
- $r(n)$ 45; - $R(N)$ 46;
- $d(n)$ 47; - $D(N)$ 50;
- $\sigma(n)$ 54; - $\mu(n)$ 55;
- $\Lambda(n)$ 57;
- $\varphi(n)$ 13;
- $\Phi(N)$ 59;
- $\pi(n)$ 63;
- $\vartheta(n)$ 64; - $\psi(n)$ 64
Bertrand's postulate 71
Birkhoff's lemma 100
Character: - of an abelian group 107; - modulo m 110;
principal- 107, 110
Chebyshev's: - functions 64;
- inequality 74; - lemma 68;
- theorem 67
composite number 1
congruences 11; sum, difference,
product of - 11
Congruent: - modulo m 11;
- modulo 1 84
Convergence: abscissa of - 113;
abscissa of absolute - 114;
half-plane of - 113; line of - 113; strip of conditional - 114
Dirichlet's: - formula for D(N) 53;
- theorems 84, 120;
- L-functions 117
Dirichlet series 78, 111; coefficients
of - 78, 111; formal product
of - 116; product of - 116;
uniqueness of - 116
discrepancy 85; - modulo 1 87
divisibility 1
divisor 1; greatest common - 3;
- function 47
Euclid's theorem 4
Euler's: - constant 51; - criterion
27; - function cp 13;
- identity 76; - theorem 13
Farey: - fraction 6; - sequence 6
Fermat number 10
Fermat's theorem 13
fraction 6; irreducible - 6;
proper - 6; reduced - 6
fractional part 84
Gaussian sum 34; generalized - 34
group: abelian - 107; cyclic - 107; generator of - 107
Hadamard's theorem 123
Hurwitz's theorem 23; Khinchin's
proof of - 23
integral part 18
interval function 85
Kronecker's theorem 91; Bohr's
proof of - 93
Lagrange's theorem: - on congruences 16; sums of squares 31
Landau's theorem 115
lattice 101; determinant of a - 101
lattice point 45; - function r(n)
45
Legendre symbol 26
linearly independent 91
von Mangoldt's function 57
mediant 8
Mersenne: - number 55;
-prime 55
Mertens's: - formulae 81;
- theorem 59
Minkowski's theorem 98; Siegel's
proof of 98
Möbius's function 55
Möbius inversion formula: first - 56;
second - 58
module 3; the trivial - 4
multiple 1; integral - 1; least
common - 5
Orthogonality relations 110
Pólya's theorem 10
perfect number 54
prime: - number 1; - residue
class 12; relatively - 3
prime number theorem 122
principal character 107, 110
quadratic: - residue 26; - nonresidue 26; - reciprocity law
34
quadratic form: positive definite - 104; determinant of a - 104
quotient 1
remainder 1
representation: imprimitive - 30;
primitive - 30
residue class 11; prime - 12
residue system: complete - 12;
complete prime - 12
Riemann's zeta-function 107
set: convex - 97; symmetric - 97; translate of a - 97
Siegel's formula 99
standard form 2
Stirling's formula 81
summatory function 45
uniformly distributed 85;
- modulo 1 86
unique factorization theorem 2
de la Vallée Poussin's theorem 123
Weyl's theorems 87
Wiener-Ikehara theorem 124
Wilson's theorem 27