A Course in Real Analysis

Mathematics
A Course in Real Analysis provides a rigorous treatment of the foundations of differential and integral calculus at the advanced undergraduate level.
The first part of the text presents the calculus of functions of one variable. This part covers traditional topics, such as sequences, continuity, differentiability, Riemann integrability, numerical series, and the convergence of sequences and series of functions. It also includes optional sections on Stirling’s formula, functions of bounded variation, Riemann–Stieltjes integration, and other topics.
The second part focuses on functions of several variables. It introduces the topological ideas needed (such as compact and connected sets) to describe analytical properties of multivariable functions. This part also discusses differentiability and integrability of multivariable functions and develops the theory of differential forms on surfaces in $\mathbb{R}^n$.
The third part consists of appendices on set theory and linear algebra as well as solutions to some of the exercises.
With clear proofs, detailed examples, and numerous exercises, this book gives a thorough treatment of the subject. It progresses from single variable to multivariable functions, providing a logical development of material that will prepare readers for more advanced analysis-based studies.
Features
• Provides a detailed axiomatic account of the real number system
• Develops the Lebesgue integral on $\mathbb{R}^n$ from the beginning
• Gives an in-depth description of the algebra and calculus of differential forms on surfaces in $\mathbb{R}^n$
• Offers an easy transition to the more advanced setting of differentiable manifolds by covering proofs of Stokes’s theorem and the divergence theorem at the concrete level of compact surfaces in $\mathbb{R}^n$
• Summarizes relevant results from elementary set theory and linear algebra
• Contains over 90 figures that illustrate the essential ideas behind a concept or proof
• Includes more than 1,600 exercises throughout the text, with selected solutions in an appendix
WITH VITALSOURCE® EBOOK
• Access online or download to your smartphone, tablet or PC/Mac
• Search the full text of this and other titles you own
• Make and share notes and highlights
• Copy and paste text and figures for use in your own documents
• Customize your view by changing font size and layout
K22153
www.crcpress.com
HUGO D. JUNGHENN
The George Washington University
Washington, D.C., USA
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2015 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20150109
International Standard Book Number-13: 978-1-4822-1928-9 (eBook - PDF)
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
TO THE MEMORY OF MY
PARENTS
Rita and Hugo
Contents
Preface
List of Figures
List of Tables
List of Symbols
I Functions of One Variable
1 The Real Number System
1.1 From Natural Numbers to Real Numbers
1.2 Algebraic Properties of $\mathbb{R}$
1.3 Order Structure of $\mathbb{R}$
1.4 Completeness Property of $\mathbb{R}$
1.5 Mathematical Induction
1.6 Euclidean Space
2 Numerical Sequences
2.1 Limits of Sequences
2.2 Monotone Sequences
2.3 Subsequences and Cauchy Sequences
2.4 Limits Inferior and Superior
3 Limits and Continuity on $\mathbb{R}$
3.1 Limit of a Function
*3.2 Limits Inferior and Superior
3.3 Continuous Functions
3.4 Properties of Continuous Functions
3.5 Uniform Continuity
4 Differentiation on $\mathbb{R}$
4.1 Definition of Derivative and Examples
4.2 The Mean Value Theorem
*4.3 Convex Functions
4.4 Inverse Functions
4.5 L’Hospital’s Rule
4.6 Taylor’s Theorem on $\mathbb{R}$
*4.7 Newton’s Method
5 Riemann Integration on $\mathbb{R}$
5.1 The Riemann–Darboux Integral
5.2 Properties of the Integral
5.3 Evaluation of the Integral
*5.4 Stirling’s Formula
5.5 Integral Mean Value Theorems
*5.6 Estimation of the Integral
5.7 Improper Integrals
5.8 A Deeper Look at Riemann Integrability
*5.9 Functions of Bounded Variation
*5.10 The Riemann–Stieltjes Integral
6 Numerical Infinite Series
6.1 Definition and Examples
6.2 Series with Nonnegative Terms
6.3 More Refined Convergence Tests
6.4 Absolute and Conditional Convergence
*6.5 Double Sequences and Series
7 Sequences and Series of Functions
7.1 Convergence of Sequences of Functions
7.2 Properties of the Limit Function
7.3 Convergence of Series of Functions
7.4 Power Series
II Functions of Several Variables
8 Metric Spaces
8.1 Definitions and Examples
8.2 Open and Closed Sets
8.3 Closure, Interior, and Boundary
8.4 Limits and Continuity
8.5 Compact Sets
*8.6 The Arzelà–Ascoli Theorem
8.7 Connected Sets
8.8 The Stone–Weierstrass Theorem
*8.9 Baire’s Theorem
9 Differentiation on $\mathbb{R}^n$
9.1 Definition of the Derivative
9.2 Properties of the Differential
9.3 Further Properties of the Differential
9.4 Inverse Function Theorem
9.5 Implicit Function Theorem
9.6 Higher Order Partial Derivatives
9.7 Higher Order Differentials and Taylor’s Theorem
*9.8 Optimization
10 Lebesgue Measure on $\mathbb{R}^n$
10.1 General Measure Theory
10.2 Lebesgue Outer Measure
10.3 Lebesgue Measure
10.4 Borel Sets
10.5 Measurable Functions
11 Lebesgue Integration on $\mathbb{R}^n$
11.1 Riemann Integration on $\mathbb{R}^n$
11.2 The Lebesgue Integral
11.3 Convergence Theorems
11.4 Connections with Riemann Integration
11.5 Iterated Integrals
11.6 Change of Variables
12 Curves and Surfaces in $\mathbb{R}^n$
12.1 Parameterized Curves
12.2 Integration on Curves
12.3 Parameterized Surfaces
12.4 m-Dimensional Surfaces
13 Integration on Surfaces
13.1 Differential Forms
13.2 Integrals on Parameterized Surfaces
13.3 Partitions of Unity
13.4 Integration on Compact m-Surfaces
13.5 The Fundamental Theorems of Calculus
*13.6 Closed Forms in $\mathbb{R}^n$
III Appendices
A Set Theory
B Linear Algebra
C Solutions to Selected Problems
Bibliography
Index
Preface
The purpose of this text is to provide a rigorous treatment of the foundations
of differential and integral calculus at the advanced undergraduate level. It
is assumed that the reader has had the traditional three semester calculus
sequence and some exposure to elementary set theory and linear algebra. As
regards the last two subjects, appendices provide a summary of most of the
results used in the text. Linear algebra will not be needed until Part II.
The book consists of three parts. Part I treats the calculus of functions of
one variable. Here, one can find the traditional topics: sequences, continuity,
differentiability, Riemann integrability, numerical series, and convergence of
sequences and series of functions. Optional sections on Stirling’s formula,
Riemann–Stieltjes integration, and other topics are also included. As the ideas
inherent in these subjects ultimately rest on properties of real numbers, the
book begins with a careful treatment of the real number system. For this we
take an axiomatic rather than a constructive approach, guided as much by
the need for efficiency of exposition as by pedagogical preference. Of course,
presenting the real number system in this way begs the excellent question
as to whether such a system exists. It is a question we do not answer, but
the interested reader may wish to consult a text on the construction of the
real number system from the natural numbers, or even on the philosophy of
mathematics.
Part II treats functions of several variables. Many of the results in Part I,
such as the chain rule, the inverse function theorem, and the change of variables
theorem, have counterparts in Part II. The reader’s exposure to the one-variable
results should make the multivariable versions more meaningful and accessible.
As might be expected, however, some results in Part II have no counterparts in
Part I, the implicit function theorem and the iterated integral (Fubini–Tonelli)
theorem being obvious examples.
Part II begins with a chapter on metric spaces. Here we introduce the
topological ideas needed to describe some of the analytical properties of
multivariable functions. Primary among these are the notions of compact set
and connected set, which, for example, allow the extension to higher dimensions
of the extreme value and intermediate value theorems. The remainder of Part II
covers differentiability and integrability of multivariable functions. As regards
integrability, we have chosen to develop from the beginning the Lebesgue
integral rather than to extend the Riemann integral to higher dimensions.
The additional time required for this approach is, in my view, more than offset
by the enormous added utility of the Lebesgue integral. The last chapter of
Part II develops the theory of differential forms on surfaces in $\mathbb{R}^n$. The chapter
culminates with proofs of Stokes’s theorem and the divergence theorem for
compact surfaces. It is hoped that exposure to these topics at the concrete
level of surfaces in $\mathbb{R}^n$ will ease the transition to more advanced courses such
as calculus on differentiable manifolds.
Part III consists of the aforementioned appendices on set theory and linear
algebra, as well as solutions to some of the over 1600 exercises found in the
text. For convenience, exercises with solutions that appear in the appendix
are marked with a superscript S. Exercises that will find important uses later
are marked with a downward arrow ⇓. Instructors with suitable bona fides
may obtain from the publisher a manual of complete solutions to all of the
exercises.
The book is an outgrowth of notes developed over many years of teaching
real analysis to undergraduates at George Washington University. The more
recent versions of these notes have been specifically tested in classes over the
last three years. During this period, the typical two-semester course closely
followed the non-starred sections of this text: Chapters 1–7 for the first semester
and 8–13 for the second. Given the wealth of material, it was necessary to
leave some proofs for students to read on their own, a not wholly unfortunate
compromise. Material in some starred sections was assigned as optional reading.
I would like to express my gratitude to the many students whose critical
eyes caught errors before they made their way into these pages. Of course, any
remaining errors are my complete responsibility. Special thanks are due to
Zehua Zhang, whose enlightened comments have improved the exposition of
several topics.
Finally, to my wife Mary for her support and understanding during the
writing of this book: thank you!
Hugo D. Junghenn
Washington, D.C.
September 2014
List of Figures
1.1 Supremum and infimum of $A$
1.2 Greatest integer function
2.1 Convergence of a sequence
2.2 Squeeze principle
2.3 Interval halving process
2.4 Limits supremum and infimum
3.1 Limit of a function
3.2 $L$ can’t be greater than $M$
3.3 One-to-one correspondence between $D$ and $\mathbb{Q}$
3.4 Intermediate value property
4.1 Trigonometric inequality
4.2 Local extrema
4.3 Mean value theorems
4.4 Convex function
4.5 Convex function inequalities
4.6 Intermediate value property implies monotonicity
4.7 Intermediate value property implies continuity
4.8 Newton’s method
5.1 Upper and lower sums
5.2 The partitions $P$ and $Q$
5.3 The partition $P_n$
5.4 The partitions $P'$, $P$, and $P''$
5.5 Riemann sum
5.6 The partitions $P^x$ and $P^y$
5.7 Trapezoidal rule approximation
5.8 Midpoint rule approximation
5.9 Simpson’s rule approximation
5.10 The partition $Q$
7.1 Uniform convergence
7.2 Pointwise convergence insufficient
8.1 An open ball is open
8.2 The functions $g_n$ and $g$
8.3 Convex and non-convex sets
8.4 The neighborhoods $U_x$ and $V_x$
8.5 A $2\varepsilon$ net
8.6 A bounded set in $\mathbb{R}^n$ is totally bounded
8.7 A separation $(U, V)$ of $E$
8.8 $C_1(-1, 0) \cup C_1(1, 0)$ is path connected
8.9 $E$ is path connected
8.10 A piecewise linear function
8.11 Sawtooth function
9.1 The domain of $\arg_{\theta_0}$
9.2 Saddle point
10.1 Interval grid
10.2 Coverings
10.3 Middle thirds
10.4 Ternary expansion algorithm
10.5 Decomposition into half-open intervals
10.6 $K = \mathrm{cl}(E) \setminus U$
10.7 The components of $f_k$
10.8 The components of $f_{k+1}$
11.1 Partition of an n-dimensional interval
11.2 Three-dimensional simplex
11.3 Concentric cube and ball
11.4 The paving $Q_r$
11.5 Theorem of Pappus
12.1 Curves in $\mathbb{R}^2$
12.2 A piecewise smooth curve with tangent vectors
12.3 Inscribed polygonal line
12.4 Vector field on $E$
12.5 Closed curve $\varphi$
12.6 Concatenation of curves
12.7 Tangent spaces at $p$
12.8 Affine space
12.9 The inward unit normal
12.10 Normal vector to $S$ at $p$
12.11 Surface of revolution
12.12 Möbius strip
12.13 The mapping $G_a^{-1}$
12.14 Transition mappings
12.15 Stereographic projection
12.16 The mapping $d\psi_x$
12.17 Cylinder-with-boundary
12.18 Surface element
12.19 Induced orientation of $T_a\partial S$
12.20 Stereographic projection from $p$
13.1 Parallelogram approximation to $\varphi(Q)$
13.2 Two dimensional simplex
13.3 A partition of unity subordinate to $U_1$ and $U_2$
13.4 The functions $h$ and $g$
13.5 The cubes $W_i$ and $V_i$
13.6 Regular region $E$
13.7 Annulus in $\mathbb{R}^2$ with exterior normal
13.8 The case $a \in E$
13.9 The case $a \in \mathrm{bd}(E)$
13.10 Regular region in $\mathbb{R}^2$
13.11 Piecewise smooth surfaces
13.12 Oriented cube without bottom face
13.13 Closed polygon
13.14 Surfaces $S_1$ and $S_2$ with common boundary $C$
13.15 Curves contracting to $p$ must pass through $q$
13.16 Boundary parametrization
13.17 Star-shaped and non-star-shaped regions
C.1 Open balls for Exercise 1
List of Tables
4.1 Newton’s method for $e^x + x - 2 = 0$
5.1 Table for evaluating $\int f h$ by parts
5.2 Table for evaluating $\int (x + 1)^3 e^{5x}\,dx$ by parts
5.3 A comparison of the methods
9.1 Values of $\Delta$
9.2 Values of $\Delta$
List of Symbols
$\mathbb{R}$    real number system
$\sum$    summation symbol
$\prod$    product symbol
$\mathbb{N}$    set of natural numbers
$\mathbb{Z}$    set of integers
$\mathbb{Q}$    set of rational numbers
$\mathbb{I}$    set of irrational numbers
$n!$    n factorial
$a < b$    less than
$b > a$    greater than
$a \le b$    less than or equal
$b \ge a$    greater than or equal
$|x|$    absolute value of x
$\max S$    maximum of S
$\min S$    minimum of S
$x^+$    positive part of x
$x^-$    negative part of x
$\sup A$    supremum of A
$\inf A$    infimum of A
$\lfloor x \rfloor$    greatest integer in x
$\overline{\mathbb{R}}$    extended real number system
$+\infty$, $-\infty$    positive infinity, negative infinity
$(a, b)$    open interval
$(a, b]$    left-open interval
$[a, b)$    right-open interval
$[a, b]$    closed interval
$\binom{n}{k}$    binomial coefficient
$\mathbb{R}^n$    Euclidean space
$x \cdot y$    Euclidean inner product
$\|x\|_2$    Euclidean norm
$\|x\|_1$    $\ell^1$ norm
$\|x\|_\infty$    max norm
$a \times b$    cross product
$\lim_n a_n$    limit of a sequence
$a_n \uparrow$    increasing sequence of real numbers
$a_n \downarrow$    decreasing sequence of real numbers
$a_n \uparrow a$    sequence increases to a
$a_n \downarrow a$    sequence decreases to a
$\liminf_n a_n$    limit infimum of a sequence
$\limsup_n a_n$    limit supremum of a sequence
$N(a) = N_r(a)$    neighborhood of a
$\lim_{x \to a,\, x \in E} f(x)$    limit of f along E
$\lim_{x \to a^-} f(x)$    left-hand limit
$\lim_{x \to a^+} f(x)$    right-hand limit
$\lim_{x \to a} f(x)$    two-sided limit
$\lim_{x \to +\infty} f(x)$    limit at $+\infty$
$\lim_{x \to -\infty} f(x)$    limit at $-\infty$
$\liminf_{x \to a,\, x \in E} f(x)$    limit inferior of f along E
$\limsup_{x \to a,\, x \in E} f(x)$    limit superior of f along E
$f' = Df = \frac{df}{dx}$    derivative of f
$D_\ell f(a) = f'_\ell(a)$    left-hand derivative at a
$D_r f(a) = f'_r(a)$    right-hand derivative at a
$f^{(n)}$    nth derivative of f
$T_n(x, a)$    Taylor polynomial
$R_n(x, a)$    Taylor remainder
$\|P\|$    mesh of partition P
$\underline{S}(f, P)$    lower Darboux sum
$\overline{S}(f, P)$    upper Darboux sum
$\underline{\int_a^b} f$    lower Darboux integral
$\overline{\int_a^b} f$    upper Darboux integral
$\int_a^b f$    Riemann–Darboux integral
$\mathcal{R}_a^b$    set of Riemann integrable functions on $[a, b]$
$S(f, P, \xi)$    Riemann sum
$\int f$    indefinite integral of f
$V_a^b(f)$    total variation of f on $[a, b]$
$S_w(f, P, \xi)$    Riemann–Stieltjes sum
$\int_a^b f\,dw$    Riemann–Stieltjes integral
$\overline{S}_w(f, P)$    upper Darboux–Stieltjes sum
$\underline{S}_w(f, P)$    lower Darboux–Stieltjes sum
$\overline{\int_a^b} f\,dw$    upper Darboux–Stieltjes integral
$\underline{\int_a^b} f\,dw$    lower Darboux–Stieltjes integral
$\sum_n a_n = \sum_{n=1}^{\infty} a_n$    infinite series of real numbers
$\lim_m \lim_n a_{m,n}$    iterated limit
$\lim_{m,n} a_{m,n}$    double limit
$\sum_{m,n} a_{m,n}$    double infinite series
$\sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{j,k}$    iterated series
$\sum_n f_n = \sum_{n=1}^{\infty} f_n$    infinite series of functions
$\sum_{n=0}^{\infty} c_n (x - a)^n$    power series in x about a
$R = \rho^{-1}$    radius of convergence
$\binom{a}{n}$    generalized binomial coefficient
$(X, d)$    metric space
$\|x\|$    norm of x
$(\mathcal{X}, \|\cdot\|)$    normed vector space
$d_2$    Euclidean metric on $\mathbb{R}^n$
$d_1$    $\ell^1$ metric on $\mathbb{R}^n$
$d_\infty$    max metric on $\mathbb{R}^n$
$B(S)$    space of bounded $f : S \to \mathbb{R}$
$\|f\|_\infty$    supremum norm of f
$\ell^\infty$    set of bounded sequences
$\ell^1$    set of summable sequences
$\|a\|_1$    $\ell^1$ norm of a
$d \times \rho$    product metric
$B_r(x)$    open ball
$C_r(x)$    closed ball
$S_r(x)$    sphere
$C([a, b])$    space of continuous f on $[a, b]$
$D([a, b])$    space of differentiable f on $[a, b]$
$[a : b]$    line segment from a to b
$\mathrm{cl}(E)$    closure of E
$\mathrm{int}(E)$    interior of E
$\mathrm{bd}(E)$    boundary of E
$\lim_{x \to a,\, x \in E} f(x)$    limit of f along E
$\lim_{(x,y) \to (a,b)} f(x, y)$    double limit
$\lim_{x \to a} \lim_{y \to b} f(x, y)$    iterated limit
$d(A)$    diameter of A
$d(A, B)$    distance between A and B
$C(X, Y)$    set of continuous $f : X \to Y$
$\mathrm{ext}(E)$    exterior of E
$C(X)$    space of continuous $f : X \to \mathbb{R}$
$\partial_j f = f_{x_j} = \frac{\partial f}{\partial x_j}$    partial derivative of f
$\nabla f$ or $\mathrm{grad}\, f$    gradient of f
$df_a : \mathbb{R}^n \to \mathbb{R}^m$    differential of f at a
$f'(a)$    Jacobian matrix of f at a
$J_f(a) = \frac{\partial(f_1, \ldots, f_n)}{\partial(x_1, \ldots, x_n)}(a)$    Jacobian of f
$\frac{\partial^m f}{\partial x_{i_1}^{m_1} \cdots \partial x_{i_n}^{m_n}}$    higher order partial derivative
$D^m f_x$    mth total differential of f
$\binom{m}{m_1, m_2, \ldots, m_n}$    multinomial coefficient
$T_m(x, a)$    Taylor polynomial
$R_m(x, a)$    Taylor remainder term
$\lambda^*$    Lebesgue outer measure
$M = M(\mathbb{R}^n)$    Lebesgue measurable sets
$\lambda = \lambda^n$    Lebesgue measure on $\mathbb{R}^n$
$B = B(\mathbb{R}^n)$    Borel measurable sets
$1_A$    indicator function of A
$S^+(F)$    set of F-measurable simple functions $\ge 0$
$\int f\,d\lambda$    Lebesgue integral of f
$\int_E f\,d\lambda$    Lebesgue integral of f on E
$L^1(E)$    space of integrable functions on E
$\int_{\mathbb{R}^p} \int_{\mathbb{R}^q} f(x, z)\,dz\,dx$    iterated integral
$f * g$    convolution of f and g
$\alpha_n$    volume of unit ball in $\mathbb{R}^n$
$\mathrm{length}(\varphi)$    length of curve $\varphi$
$\int_\varphi f\,ds$    line integral over $\varphi$
$\vec{F}$    vector field
$\vec{T}_\varphi$    unit tangent vector field along $\varphi$
$f_1\,dx_1 + \cdots + f_n\,dx_n$    1-form in $\mathbb{R}^n$
$\omega \cdot \vec{H}$    inner product of a form and vector field
$\varphi$    parameterized m-surface
$T_{\varphi(u)}$    tangent space of $\varphi$
$\mathrm{sign}(\varphi)$    sign of parametrization $\varphi$
$\partial\varphi^\perp$    normal vector to surface $\varphi$
$\vec{N}_\varphi$    normal vector field
$\varphi_a : U_a \to S_a$    local parametrization of S
$\varphi_{ab}$    transition mapping
$S_a$    surface element
$\mathbb{R}^{n-1}_+$    $\mathbb{R}^{n-1}$ with $x_{n-1} > 0$
$\mathbb{H}^{n-1}$    $\mathbb{R}^{n-1}$ with $x_{n-1} \ge 0$
$\partial\mathbb{H}^{n-1}$    boundary of $\mathbb{H}^{n-1}$
$\partial S$    boundary of S
$S \setminus \partial S$    interior of S
$dx_j$    multidifferential
$\omega_x$    differential form
$\omega \wedge \eta$    wedge product
$d\omega$    differential of $\omega$
$\varphi^* \omega$    pullback of $\omega$ by $\varphi$
$\mathrm{area}(\varphi)$    area of $\varphi$
$\int_\varphi f\,dS$    integral of f on a parameterized surface
R
ω
ϕ
L(U, V)
At
[T ]
det A
integral of a form on a para. surface . .
set of linear transformations T : U → V
transpose of A . . . . . . . . . . . . . .
matrix of T . . . . . . . . . . . . . . .
determinant of A . . . . . . . . . . . .
xxiii
.
.
.
.
.
. 466
. 510
. . 511
. 513
. 514
Part I
Functions of One Variable
Chapter 1
The Real Number System
If the notion of limit is the cornerstone of analysis, then the real number
system is the bedrock. In this chapter we provide a description of the real
number system that is sufficiently detailed to allow a careful development of
limit in the various forms that appear in this book.
The real number system is defined as a nonempty set R together with
two algebraic operations, called addition and multiplication, and an ordering
less than that collectively satisfy three sets of axioms: the algebraic or field
axioms, the order axioms, and the completeness axiom. These are discussed in
Sections 1.2–1.4. We begin, however, with a brief description of how the real
number system may be constructed from a more fundamental number system.
1.1 From Natural Numbers to Real Numbers
A rigorous construction of the real number system starts with the set of
natural numbers (positive integers) N and then proceeds to the set of integers
Z, the rational number system Q, and, finally, the real number system R. In
this approach the natural numbers are assumed to satisfy a set of axioms
called the Peano Axioms. These are used to define the operations of addition
and multiplication in N. Subtraction is introduced by enlarging the system
of natural numbers to Z, thereby allowing solutions of all equations of the
form x + m = n, m, n ∈ Z. To obtain division, Z is enlarged to Q by forming
all quotients m/n, where m, n ∈ Z, n ≠ 0. In this system, one may solve all
equations of the form ax + b = c, a ≠ 0. The final step, the construction of R
from Q, may be viewed as “filling in the gaps” of the rational number line,
these gaps corresponding to the so-called irrational numbers.¹
For the details of this “bottom up” approach, the interested reader is
referred to [7] or [10]. We shall instead take a “top down” approach, describing
the real number system axiomatically.
¹ This step results in a system that, while having the structure necessary to formulate a
robust theory of limits, does not allow solutions of all polynomial equations. This shortcoming
is removed by introducing complex numbers, a subject outside the scope of this book.
1.2 Algebraic Properties of R
In this section we list the axioms that govern the use of addition (+) and
multiplication (·) in the real number system. These axioms lead to all of the
familiar algebraic properties of real numbers.
The operations of addition and multiplication satisfy
the following field axioms, where a, b, c denote arbitrary
members of R:
• Closure under addition: a + b ∈ R.
• Associative law of addition: (a + b) + c = a + (b + c).
• Commutative law of addition: a + b = b + a.
• Existence of an additive identity: There exists a member 0 of R such
that a + 0 = a for all a ∈ R.
• Existence of additive inverses: For each a ∈ R there exists a member −a
of R such that a + (−a) = 0.
• Closure under multiplication: a · b ∈ R.
• Associative law of multiplication: (a · b) · c = a · (b · c).
• Commutative law of multiplication: a · b = b · a.
• Existence of a multiplicative identity: There exists a real number 1 ≠ 0
such that a · 1 = a for all a ∈ R.
• Existence of multiplicative inverses: For each a ≠ 0 there exists a member
$a^{-1}$ of R such that $a \cdot a^{-1} = 1$.
• Distributive law: a · (b + c) = a · b + a · c.
We use the following standard notation:
\[ a - b = a + (-b), \quad ab = a \cdot b, \quad \frac{a}{b} = a/b = ab^{-1}, \]
\[ a + b + c = (a + b) + c = a + (b + c), \quad abc = (ab)c = a(bc), \]
\[ a^n = \underbrace{aa \cdots a}_{n}, \quad a^{-n} = 1/a^n \ (a \neq 0), \quad\text{and}\quad a^0 = 1. \]
We also use the summation and product symbols $\sum$ and $\prod$ defined by
\[ \sum_{j=m}^{n} a_j = a_m + a_{m+1} + \cdots + a_n \quad\text{and}\quad \prod_{j=m}^{n} a_j = a_m a_{m+1} \cdots a_n. \]
The field axioms may be used to derive the standard rules of algebra.
Some of these are given in the following proposition; others may be found in
Exercise 1.
1.2.1 Proposition. The following algebraic properties hold in R:
(a) The additive identity is unique; that is, if 0′ is a real number such that
a + 0′ = a for all a ∈ R, then 0′ = 0.
(b) The additive inverse of a real number is unique; that is, if a + b = 0, then
b = −a.
(c) The multiplicative identity is unique; that is, if 1′ is a real number such
that a · 1′ = a for all a ∈ R, then 1′ = 1.
(d) a · 0 = 0 for all a ∈ R.
(e) The multiplicative inverse of a nonzero real number is unique; that is, if
ab = 1, then b = 1/a.
(f) If ab = 0, then either a = 0 or b = 0.
(g) If ab = ac and a ≠ 0, then b = c.
(h) If b ≠ 0 and d ≠ 0, then a/b = c/d if and only if ad = bc.
(i) If a ≠ 0 and b ≠ 0, then $(ab)^{-1} = a^{-1} b^{-1}$, or
\[ \frac{1}{ab} = \frac{1}{a}\,\frac{1}{b}. \]
Proof. (a) If a + 0′ = a for all a then, in particular, 0 + 0′ = 0. But, by
definition of 0 and commutativity of addition, 0 + 0′ = 0′. Therefore 0′ = 0.
(b) By associativity and commutativity of addition,
b = b + 0 = 0 + b = (−a + a) + b = −a + (a + b) = −a + 0 = −a.
(c) If a · 1′ = a for all a then, in particular, 1 · 1′ = 1. But, by definition of
the multiplicative identity and commutativity of multiplication, 1 · 1′ = 1′.
Therefore 1′ = 1.
(d) By the distributive property,
a · 0 = a(0 + 0) = a · 0 + a · 0.
Adding −(a · 0) to both sides of this equation and using associativity of
addition produces the desired equation.
(e) By associativity and commutativity of multiplication,
$b = 1 \cdot b = (a^{-1} a)b = a^{-1}(ab) = a^{-1} \cdot 1 = a^{-1}$.
(f) Assume a ≠ 0. By (d) and commutativity and associativity of multiplication,
$0 = a^{-1} \cdot 0 = a^{-1}(ab) = (a^{-1} a)b = 1 \cdot b = b$.
(g) By commutativity and associativity of multiplication,
$b = 1 \cdot b = (a^{-1} a)b = a^{-1}(ab) = a^{-1}(ac) = (a^{-1} a)c = c$.
(h) If a/b = c/d, then multiplying both sides by bd and using the commutativity
and associativity of multiplication we obtain ad = bc. Conversely, if ad = bc,
then multiplying both sides by 1/(bd) yields a/b = c/d.
(i) By associativity and commutativity of multiplication,
$(ab)(a^{-1} b^{-1}) = (a a^{-1})(b b^{-1}) = 1$.
Now apply (e).
The reader will notice that the assertions in the proposition are implications,
that is, statements of the form p implies q, frequently written p ⇒ q. Such
assertions may be proved directly by assuming p and then deducing q, or
indirectly by assuming the negation of q and arguing to a contradiction or
to the negation of p. Part (h) also contains an assertion of the form p if and
only if q (hereafter, shortened to p iff q). Such an assertion is established by
proving both p ⇒ q and q ⇒ p. Throughout the text, we shall encounter many
examples of such proofs. The reader is advised that a careful proof requires that
each (nontrivial) step be justified by citing hypotheses, appropriate axioms, or
previously proved results.
One more point of logic: To prove that a general statement involving the
universal quantifier “for every” (or “for all”) is false, one must construct a
counterexample. For example, the assertion that xy = x + y for all real numbers
x and y is clearly false. For a proof, one need only find a single pair of numbers
x and y such that xy ≠ x + y, for example x = y = 1. On the other hand,
to prove that $x^2 - y^2 = (x - y)(x + y)$ for all real numbers x and y, it is not
sufficient to verify the statement for a specific pair of numbers; a general proof
is needed here. For details on constructing proofs in mathematics, the reader
is referred to [2].
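The contrast between refuting and proving a universal statement can be illustrated with a short Python sketch. The function name `find_counterexample` and the grid of sample rationals are illustrative assumptions, not part of the text. A finite search can refute a claim, but passing finitely many checks is only evidence and never a substitute for a general proof.

```python
from fractions import Fraction
from itertools import product

def find_counterexample(claim, values):
    """Return the first pair (x, y) from values for which claim fails, else None."""
    for x, y in product(values, repeat=2):
        if not claim(x, y):
            return (x, y)
    return None

# A small grid of exact rationals (an arbitrary, illustrative choice).
grid = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

# The false claim xy = x + y: a single failing pair refutes it.
false_claim = lambda x, y: x * y == x + y

# The identity x^2 - y^2 = (x - y)(x + y): no failing pair is found,
# which is consistent with (but does not prove) the general statement.
true_claim = lambda x, y: x**2 - y**2 == (x - y) * (x + y)

print(find_counterexample(false_claim, grid))
print(find_counterexample(true_claim, grid))
```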
The number systems described in Section 1.1 are summarized as follows:
• N = {1, 2 := 1 + 1, 3 := 2 + 1, . . .} (positive integers),
• Z = {0, ±1, ±2, ±3, . . .} (integers),
• Q = {m/n : m, n ∈ Z, n ≠ 0} (rational numbers),
• I = R \ Q (irrational numbers).
An integer n is said to be even (odd) if n = 2k (n = 2k + 1) for some k ∈ Z.
A precise definition of N is given in Section 1.5. From this it is possible
to argue rigorously that the number system N is closed under addition and
multiplication. As a consequence, Z is closed under addition, subtraction, and
multiplication, and Q is closed under addition, subtraction, multiplication, and
division (Exercise 2).
Exercises
1. Prove the following properties of addition and multiplication in R:
(a) −(−a) = a.
(b)S −(ab) = (−a)b = a(−b).
(c)⇓² (−a)(−b) = ab.
(d)S (−1)a = −a.
(e) If b, c, d ≠ 0, then $\dfrac{a/b}{c/d} = \dfrac{ad}{bc}$.
(f)S If b ≠ 0 and d ≠ 0, then $\dfrac{a}{b} + \dfrac{c}{d} = \dfrac{ad + bc}{bd}$.
2. Let r, s ∈ Q. Assuming that Z is closed under addition and multiplication,
prove that r ± s, rs, r/s ∈ Q, the last provided that s ≠ 0.
3.S If r ∈ Q \ {0} and x ∈ I, prove that r ± x, rx, r/x ∈ I.
4. Let n ∈ N. Prove the following identities without using mathematical
induction:
(a)⇓³ $x^n - y^n = (x - y)\sum_{j=1}^{n} x^{n-j} y^{j-1}$.
(b) $x^n + y^n = (x + y)\sum_{j=1}^{n} (-1)^{j-1} x^{n-j} y^{j-1}$ if n is odd.
(c) $x^{-n} - y^{-n} = (y - x)\sum_{j=1}^{n} x^{j-n-1} y^{-j}$ if x ≠ 0 and y ≠ 0.
5.S Define 0! = 1 and, for n ∈ N, define n! = n(n − 1) · · · 2 · 1 (n factorial).
Prove the following:
(a) $(1 - 1/n)(1 - 2/n) \cdots \bigl(1 - (n-1)/n\bigr) = \dfrac{n!}{n^n}$.
(b) $1 \cdot 3 \cdot 5 \cdots (2n - 1) = \dfrac{(2n)!}{2^n\, n!}$.
6.⇓⁴ For $n \in \mathbb{Z}^+$ and k = 0, 1, . . . , n, define the binomial coefficient
\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]
(read “n choose k”). Prove that, for 1 ≤ k ≤ n,
\[ \binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}. \]
² This exercise will be used in 1.3.2.
³ This exercise will be used in 4.1.2.
⁴ This exercise will be used in 1.5.5.
7. Without using mathematical induction, prove that for any n ∈ N,
(a) $\displaystyle\sum_{k=0}^{n} \frac{1}{(k+1)(n-k+1)} = \frac{2}{n+2}\sum_{k=0}^{n} \frac{1}{k+1}$.
(b) $\displaystyle\sum_{k=0}^{n} \frac{1}{(2k+1)(2n-2k+1)} = \frac{1}{n+1}\sum_{k=0}^{n} \frac{1}{2k+1}$.
8.S Find a polynomial f(x) of degree 2 such that $\sum_{k=1}^{n} f(k) = n^3$ for all n.
1.3 Order Structure of R
The order relation on R is derived from the following order axiom.
There exists a nonempty subset P of R, closed under addition and multiplication, such that for each x ∈ R exactly
one of the following holds: x ∈ P, −x ∈ P, or x = 0.
The last part of the axiom is known as the trichotomy property. A real number
x is called positive if x ∈ P and negative if −x ∈ P.
1.3.1 Definition. Let a and b be real numbers. If b − a ∈ P, we write a < b
or b > a and say that a is less than b or that b is greater than a.
♦
1.3.2 Proposition. The order relation < on R has the following properties:
(a) a < b iff −a > −b (reflection property).
(b) If a < b and b < c, then a < c (transitive property).
(c) If a < b and c < d, then a + c < b + d (addition property).
(d) If a < b and c > 0, then ac < bc (multiplication property).
(e) For a, b ∈ R, exactly one of the following is true: a = b, a < b, or b < a
(trichotomy property).
(f) If x ≠ 0, then $x^2 > 0$. In particular, 1 > 0.
Proof. (a) a < b iff (−a) − (−b) = b − a ∈ P iff −a > −b.
(b) By hypothesis, b − a ∈ P and c − b ∈ P, hence, by closure under addition,
c − a = (b − a) + (c − b) ∈ P, that is, a < c.
(c) Similar to (b).
(d) Since b − a, c ∈ P, bc − ac = (b − a)c ∈ P, that is, ac < bc.
(e) This follows by applying the trichotomy property of P to a − b.
(f) If x > 0, then, by closure of P under multiplication, $x^2 > 0$. If x < 0, then
−x > 0 so, by Exercise 1.2.1(c), $x^2 = (-x)(-x) > 0$. Since 1 ≠ 0 and $1 = 1^2$, it follows that 1 > 0.
1.3.3 Definition. Let a and b be real numbers. If either a < b or a = b, we
write a ≤ b or b ≥ a and say that a is less than or equal to b or that b is greater
than or equal to a. If A ⊆ R, we define $A^+ = \{x \in A : x \ge 0\}$.
♦
Note that by the trichotomy property,
a ≤ b and b ≤ a ⇒ a = b.    (1.1)
The inequality a ≤ b is sometimes called weak inequality in contrast to
strict inequality a < b. The reader may check that parts (a)–(d) of the above
proposition are valid if strict inequality is replaced by weak inequality.
1.3.4 Definition. The absolute value of a real number x is defined by
\[ |x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x < 0. \end{cases} \]
♦
For example, |0| = 0 and |2| = | − 2| = 2.
1.3.5 Proposition. Absolute value has the following properties:
(a) |x| ≥ 0.
(b) |x| = 0 iff x = 0.
(c) | − x| = |x|.
(d) − |x| ≤ x ≤ |x|.
(e) |xy| = |x| |y|.
(f) $\left|\dfrac{x}{y}\right| = \dfrac{|x|}{|y|}$ (y ≠ 0).
(g) |x + y| ≤ |x| + |y|.
(h) $\bigl|\,|x| - |y|\,\bigr| \le |x - y|$.
(Properties (g) and (h) are the triangle inequalities.)
Proof. Properties (a)–(e) are easily established by considering cases. For example, in (e), if x ≥ 0 and y ≤ 0, then xy ≤ 0, hence
|xy| = −(xy) = x(−y) = |x| |y|.
For part (f), use (e) to obtain
\[ |x| = \left|\frac{x}{y}\, y\right| = \left|\frac{x}{y}\right| |y|, \]
and then divide both sides by |y|.
For (g), we have ±x ≤ |x| and ±y ≤ |y| by (d), hence ±(x + y) ≤ |x| + |y|.
Since one of the signed quantities on the left is |x + y|, the assertion follows.
From (g) we have
|x| = |(x − y) + y| ≤ |x − y| + |y|,
hence |x| − |y| ≤ |x − y|. Switching x and y and using (c) yields (h).
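As a sanity check on the two triangle inequalities, one may test them on a grid of exact rationals. The sketch below uses Python's `fractions.Fraction`; the helper names and the sample grid are illustrative assumptions, and the check complements the proof rather than replacing it.

```python
from fractions import Fraction
from itertools import product

def triangle_holds(x, y):
    """Check property (g): |x + y| <= |x| + |y|."""
    return abs(x + y) <= abs(x) + abs(y)

def reverse_triangle_holds(x, y):
    """Check property (h): ||x| - |y|| <= |x - y|."""
    return abs(abs(x) - abs(y)) <= abs(x - y)

# Quarter-integers between -3 and 3 (an arbitrary sample).
samples = [Fraction(n, 4) for n in range(-12, 13)]
all_ok = all(triangle_holds(x, y) and reverse_triangle_holds(x, y)
             for x, y in product(samples, repeat=2))
print(all_ok)
```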
1.3.6 Definition. Let S be a nonempty set of real numbers. The largest
element or maximum of S is a member max S of S that satisfies
max S ≥ s for all s ∈ S.
The smallest element or minimum of S, denoted by min S, is defined analogously.
A set may not have a largest or smallest member. The existence of max S
and min S for a nonempty finite set may be established by mathematical
induction. (See Exercise 1.5.2.)
1.3.7 Definition. The positive and negative parts of a real number x are
defined by
$x^+ = \max\{x, 0\}$ and $x^- = \max\{-x, 0\}$.
♦
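The positive and negative parts translate directly into code. The following minimal sketch (the function names `pos_part` and `neg_part` are our own) also spot-checks the identities $x = x^+ - x^-$ and $|x| = x^+ + x^-$, which appear in the exercises at the end of this section.

```python
from fractions import Fraction

def pos_part(x):
    """Positive part x^+ = max{x, 0}."""
    return max(x, 0)

def neg_part(x):
    """Negative part x^- = max{-x, 0}."""
    return max(-x, 0)

for x in (Fraction(-5, 2), Fraction(0), Fraction(7, 3)):
    # x is recovered as the difference, |x| as the sum, of the two parts.
    assert pos_part(x) - neg_part(x) == x
    assert pos_part(x) + neg_part(x) == abs(x)
```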
Exercises
Prove the following:
1. (a) If ab > 0, then a and b have the same sign.
(b) a > 0 iff 1/a > 0.
(c)S Suppose either b, d < 0 or b, d > 0. Then a/b > c/d iff ad > bc.
2. If x > 1, then $x^2 > x$. If 0 < x < 1, then $x^2 < x$.
3. (a) If 0 < x < y and 0 < a < b, then 0 < ax < by.
(b) If x < y < 0 and a < b < 0, then 0 < by < ax.
(c) Let x, y > 0. Then x < y iff $x^2 < y^2$.
4.S If either 0 < x < y or x < y < 0, then 1/y < 1/x.
5. If −1 < x < y or x < y < −1, then x/(x + 1) < y/(y + 1). What if
x < −1 < y?
6. If 0 < x < y and n ∈ N, then
(a)S $0 < y^n - x^n \le n(y - x)y^{n-1}$,
(b) $\dfrac{ny + 1}{nx + 1} < \dfrac{(n + 1)y + 1}{(n + 1)x + 1}$.
7. If x > 1, m, n ∈ N, and $\dfrac{x - 1}{x} < \dfrac{m}{n} < 1$, then n > x.
x
n
8.S If a < b and 0 < t < 1, then a < ta + (1 − t)b < b. In particular,
a < (a + b)/2 < b.
9. $x^2 + y^2 + axy \ge 0$ for all x, y ∈ R iff |a| ≤ 2.
10.S If a ≤ b + x for every x > 0, then a ≤ b.
11. If 0 < a ≤ bx for every x > 1, then a ≤ b.
12. If a/x ≤ x + 1 for every x > 0, then a ≤ 0.
13. For all x, y, z, w ∈ R,
(a) $2xy \le x^2 + y^2$.
(b)S $xy + yz + xz \le x^2 + y^2 + z^2$.
(c) $(xy + zw)^2 \le (x^2 + z^2)(y^2 + w^2)$.
(d) $(x + y)^2 \le 2(x^2 + y^2)$.
14.S If x, a > 0, then $x + a^2/x \ge 2a$. Equality holds iff x = a.
15. (a) |x − y| ≤ |x − z| + |z − y|.
(b) |x − L| < ε iff L − ε < x < L + ε.
16. Let S, T ⊆ R be finite and nonempty. Define −S := {−s : s ∈ S}. Then
(a) max(−S) = − min S.
(b) min(−S) = − max S.
(c) max(S ∪ T ) = max{max S, max T }.
(d) min(S ∪ T ) = min{min S, min T }.
17. For any x, y ∈ R,
(a) $x^+ \ge 0$, $x^- \ge 0$, $x = x^+ - x^-$, and $|x| = x^+ + x^-$.
(b) $x^+ = (|x| + x)/2$ and $x^- = (|x| - x)/2$.
(c) $x = y - z$ and $|x| = y + z$ imply $y = x^+$ and $z = x^-$.
(d) $(x + y)^+ \le x^+ + y^+$ and $(x + y)^- \le x^- + y^-$.
(e) $(x - y)^- \le y$, if x, y ≥ 0.
18.S If a ≤ x ≤ b, then |x| ≤ max{|a|, |b|}.
19. (a) $\max\{x, y\} = (x + y + |x - y|)/2$.
(b) $\min\{x, y\} = (x + y - |x - y|)/2$.
20. (a) $\max\{a, b, c\} = \frac{1}{4}\bigl(a + b + 2c + |a - b| + \bigl|a + b - 2c + |a - b|\bigr|\bigr)$.
(b) $\min\{a, b, c\} = \frac{1}{4}\bigl(a + b + 2c - |a - b| - \bigl|a + b - 2c - |a - b|\bigr|\bigr)$.
21.S Let $S = \{a_1, \ldots, a_n\}$, where $a_1 < \cdots < a_n$. Let 1 ≤ k < n and denote
by $S_1, \ldots, S_m$ the subsets obtained by removing exactly k members from
S, where $m = \binom{n}{k}$ is the binomial coefficient (see Theorem 1.5.5). Then
$\max\{\min S_1, \ldots, \min S_m\} = a_{k+1}$.
1.4 Completeness Property of R
A system (F, +, ·, <) with the algebraic and order properties described
in Sections 1.2 and 1.3 is called an ordered field. By Exercise 1.2.2, Q is an
ordered field under the algebraic operations and order relation inherited from
R. The same is true for the set
\[ \mathbb{Q}(\sqrt{2}) := \{x + \sqrt{2}\, y : x, y \in \mathbb{Q}\} \]
(Exercise 19). This suggests that there are infinitely many ordered subfields of R.
The property that distinguishes R from all other ordered fields is completeness,
described in this section.
1.4.1 Definition. A nonempty subset A of an ordered field F is said to be
bounded above if there exists a member u ∈ F, called an upper bound of A,
such that a ≤ u for all a ∈ A. The notions of bounded below and lower bound
are defined analogously. The set A is said to be bounded if it is bounded above
and below. Any set that is not bounded is said to be unbounded (either above
or below).
♦
The subsets Q and Z of R are neither bounded above nor bounded below;
N is bounded below but not above. The set {n/(n + 1) : n ∈ N} is bounded
above by 1 and below by 1/2.
1.4.2 Definition. Let A be a nonempty subset of an ordered field F. An
upper bound $u_0$ of A with the property that $u_0 \le u$ for all upper bounds u of
A is called a least upper bound, or supremum, of A, and is denoted by sup A.
A lower bound $\ell_0$ of A such that $\ell \le \ell_0$ for all lower bounds $\ell$ of A is called a
greatest lower bound, or infimum, of A, and is denoted by inf A. If sup A ∈ A,
then sup A is called the maximum of A. If inf A ∈ A, then inf A is called the
minimum of A.
♦
FIGURE 1.1: Supremum and infimum of A.
It follows from (1.1) that the supremum or infimum of a set, if it exists, is
unique.
It is not necessarily the case that a nonempty bounded subset of an ordered
field has an infimum or supremum. For example, because $\sqrt{2}$ is not rational
(1.4.11 below), the bounded set $\{x \in \mathbb{Q} : x^2 < 2\}$ has neither an infimum nor
a supremum in Q.
The following proposition will be used frequently in ensuing discussions
involving suprema and infima.
1.4.3 Approximation Property. Let A be a nonempty subset of an ordered
field F.
(a) If sup A exists, then for each r with r < sup A there exists a ∈ A such that
r < a ≤ sup A.
(b) If inf A exists, then for each r with inf A < r there exists a ∈ A such that
inf A ≤ a < r.
Proof. If r < sup A, then r cannot be an upper bound for A, hence there exists
a ∈ A with r < a. The proof of (b) is similar.
We may now state the property that distinguishes the real number system
from all other ordered fields.
Every nonempty subset of R that is
bounded above has a least upper bound.
This axiom is known as the completeness property of R. It is the key ingredient
needed for the formulation of a useful and robust theory of limits. From the
completeness property one may deduce (Exercise 1) the symmetrical property
Every nonempty subset of R that is
bounded below has a greatest lower bound.
The real number system may now be described as a complete ordered field.
It may be shown that (up to isomorphism) there is exactly one such structure.
The following important consequence of completeness is useful in determining the infimum or supremum of certain sets. It asserts that positive integer
multiples of a positive real number may be made arbitrarily large.
1.4.4 Archimedean Principle. For any real numbers a and b with a > 0
there exists n ∈ N such that na > b.
Proof. Suppose, for a contradiction, that na ≤ b for all n ∈ N. The set
S = {na : n ∈ N} is then bounded above and hence has a least upper bound
u. Since u − a < u, the approximation property for suprema implies that
u − a < na for some n ∈ N. But then u < (n + 1)a ∈ S, contradicting that u
is an upper bound for S.
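For rational data the integer promised by the Archimedean principle can be computed explicitly. The sketch below is an illustration under the assumption that a and b are exactly representable (for example as `Fraction` or `int`); the function name `archimedean_n` is hypothetical. It relies on the library floor function as a computational shortcut, whereas the text derives the floor from the Archimedean principle, so this is a check of the statement and not a proof.

```python
from fractions import Fraction
import math

def archimedean_n(a, b):
    """Return a natural number n with n * a > b, for rational a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    # floor(b / a) + 1 exceeds b / a; clamp to 1 so the result lies in N.
    return max(1, math.floor(Fraction(b) / Fraction(a)) + 1)

n = archimedean_n(Fraction(1, 1000), 7)
print(n, n * Fraction(1, 1000) > 7)
```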
1.4.5 Example. Let
\[ A = \left\{ (-1)^n \frac{n}{n+1} : n \in \mathbb{N} \right\} = \left\{ -\frac{1}{2}, \frac{2}{3}, -\frac{3}{4}, \ldots \right\}. \]
Since A is bounded above by 1 and below by −1, −1 ≤ inf A ≤ sup A ≤ 1. Let
0 < r < 1. By the Archimedean principle we may choose an even integer n
such that n > r/(1 − r). Then r < n/(n + 1) ∈ A, which shows that r cannot
be an upper bound of A. Therefore, sup A = 1. Similarly, inf A = −1.
♦
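The argument in Example 1.4.5 can be followed numerically. In the sketch below, `element` and `even_witness` are illustrative names of our own choosing; `even_witness(r)` returns an even n with n > r/(1 − r), so that the corresponding element of A exceeds r, exactly as in the example.

```python
from fractions import Fraction
import math

def element(n):
    """The n-th element (-1)^n * n/(n+1) of the set A."""
    return Fraction((-1) ** n * n, n + 1)

def even_witness(r):
    """For 0 < r < 1, return an even n with n > r/(1 - r)."""
    n = math.floor(r / (1 - r)) + 1
    return n if n % 2 == 0 else n + 1

r = Fraction(999, 1000)
n = even_witness(r)
print(n, element(n) > r)
```

Every element of A lies strictly between −1 and 1, yet for each r < 1 some element exceeds r, which is what sup A = 1 asserts.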
1.4.6 Well-Ordering Principle. Every nonempty subset A of N has a smallest member.
Proof. Since A is bounded below by 1, it has a greatest lower bound ℓ. The
theorem will follow if we show that ℓ ∈ A. Suppose, for a contradiction, that
ℓ ∉ A. By the approximation property for infima, there exists a ∈ A such
that ℓ < a < ℓ + 1. Choose any real number r with ℓ < r < a, for example,
r = (a + ℓ)/2. By the approximation property again, there exists a′ ∈ A such
that ℓ < a′ < r. We now have ℓ < a′ < a < ℓ + 1, which implies that a − a′ is
an integer strictly between 0 and 1. As this is impossible,⁵ it follows that ℓ
must be a member of A.
1.4.7 Greatest Integer Function. For each x ∈ R there exists a unique
integer ⌊x⌋ such that x − 1 < ⌊x⌋ ≤ x.
Proof. The uniqueness is clear. To prove existence, apply the Archimedean
principle twice: first to obtain an integer k such that x + k ≥ 1 and then
to conclude that the set A := {n ∈ N : n > x + k} is nonempty. By the
well-ordering principle, A has a least member a. Since 1 ≤ x + k < a, a − 1
is a positive integer. Since a − 1 < a, a − 1 cannot be in A so x + k ≥ a − 1.
Therefore, x − 1 < a − 1 − k ≤ x, hence the integer ⌊x⌋ := a − 1 − k has the
required property.
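The existence proof is constructive enough to be traced in code for rational x. The sketch below mirrors the two steps of the proof by naive search; the function name `floor_via_well_ordering` is hypothetical, and the loops are meant to illustrate the argument, not to be efficient.

```python
from fractions import Fraction
import math

def floor_via_well_ordering(x):
    """Follow the proof of 1.4.7 for a rational x (illustrative, not efficient)."""
    x = Fraction(x)
    # Step 1: pick an integer k with x + k >= 1.
    k = 0
    while x + k < 1:
        k += 1
    # Step 2: find the least natural number a with a > x + k.
    a = 1
    while a <= x + k:
        a += 1
    # The proof shows that a - 1 - k is the required integer.
    return a - 1 - k

for x in (Fraction(-7, 2), Fraction(5), Fraction(22, 7), Fraction(-3)):
    n = floor_via_well_ordering(x)
    print(x, n, x - 1 < n <= x)
```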
FIGURE 1.2: Greatest integer function.
The integer ⌊x⌋ is called the greatest integer in x or the floor of x. The
greatest integer function allows a simple proof of the following important
result:
⁵ This is intuitively clear. The abstract definition of N given in Section 1.5 may be used
to give a rigorous proof.
1.4.8 Density of the Rationals. Between any pair of distinct real numbers
there is a rational number.
Proof. Let a < b. By the Archimedean principle, n(b − a) > 1 for some n ∈ N.
Let m := ⌊na⌋ + 1. Then na < m ≤ na + 1 < nb, hence a < m/n < b.
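The proof of the density of the rationals is effectively an algorithm. The following sketch (the function name `rational_between` is our own, and exact rational inputs are assumed so that no rounding occurs) chooses n with n(b − a) > 1 and then sets m = ⌊na⌋ + 1, as in the proof.

```python
from fractions import Fraction
import math

def rational_between(a, b):
    """Return m/n with a < m/n < b, following the proof of 1.4.8 (requires a < b)."""
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1      # guarantees n(b - a) > 1
    m = math.floor(n * a) + 1            # m = floor(na) + 1
    return Fraction(m, n)

q = rational_between(Fraction(1, 3), Fraction(1, 2))
print(q, Fraction(1, 3) < q < Fraction(1, 2))
```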
1.4.9 Definition. (nth roots). Let n be a positive integer and let b > 0. The
unique positive solution of the equation $x^n = b$ is called the positive nth root
of b and is denoted by $b^{1/n}$. For m ∈ Z we define $b^{m/n} = (b^{1/n})^m$. As usual we
write $\sqrt{b}$ for $b^{1/2}$.
♦
The existence of $b^{1/n}$ is an easy consequence of the intermediate value
theorem, proved in Chapter 3. Uniqueness follows from Exercise 1.2.4(a).
We omit the straightforward (but admittedly tedious) proof of the following
theorem that summarizes the familiar rules of rational exponentiation.
1.4.10 Theorem. For r, s ∈ Q and positive real numbers a, b,
\[ b^r b^s = b^{r+s}, \quad \frac{b^r}{b^s} = b^{r-s}, \quad (b^r)^s = b^{rs}, \quad\text{and}\quad (ab)^r = a^r b^r. \]
The following proposition gives a simple way to generate irrational numbers.
1.4.11 Proposition. If n is a positive integer that is not a perfect square, then
$\sqrt{n}$ is irrational.
Proof. By definition of the greatest integer function, $\sqrt{n} - 1 < \lfloor\sqrt{n}\rfloor \le \sqrt{n}$.
Since $\sqrt{n}$ is assumed not to be an integer, the second inequality is strict,
hence $0 < \sqrt{n} - \lfloor\sqrt{n}\rfloor < 1$. Suppose, for a contradiction, that $\sqrt{n}$ is rational.
Then the set $A := \{m \in \mathbb{N} : m\sqrt{n} \in \mathbb{N}\}$ is nonempty. By the well-ordering
principle, A has a least member $m_0$. In particular, $m_0\sqrt{n} \in \mathbb{N}$, hence both of
the quantities
\[ m := m_0\bigl(\sqrt{n} - \lfloor\sqrt{n}\rfloor\bigr) \quad\text{and}\quad m\sqrt{n} = m_0\bigl(n - \sqrt{n}\,\lfloor\sqrt{n}\rfloor\bigr) \]
are positive integers. But then m ∈ A, which is impossible since $m < m_0$.
Therefore, $\sqrt{n}$ must be irrational.
In later chapters, we shall see other important examples of irrational
numbers, notably the base e of the natural logarithm.
1.4.12 Definition. The extended real number system is the set
\[ \overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}, \]
where +∞, −∞ are symbols with the following prescribed properties:
−∞ < x < +∞ for all x ∈ R,
x + ∞ = +∞ if −∞ < x ≤ +∞,
x − ∞ = −∞ if −∞ ≤ x < +∞,
x · (+∞) = +∞ if 0 < x ≤ +∞,
x · (+∞) = −∞ if −∞ < x < 0,
x · (−∞) = −∞ if 0 < x < +∞,
x · (−∞) = +∞ if −∞ ≤ x < 0,
x/(+∞) = x/(−∞) = 0 if −∞ < x < +∞.
♦
The above algebraic conventions are derived from limit considerations. Note
that the operations
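\[ \pm\infty \mp \infty, \quad (\pm\infty)\cdot(\mp\infty), \quad \frac{\pm\infty}{\pm\infty}, \quad \frac{\pm\infty}{\mp\infty}, \quad\text{and}\quad 0\cdot(\pm\infty) \tag{1.2} \]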
are not defined.
1.4.13 Definition. If A ≠ ∅ is not bounded above, we set sup A = +∞.
Similarly, if A is not bounded below, we set inf A = −∞. We also define
sup ∅ = −∞ and inf ∅ = +∞.
♦
The reader may verify that the approximation properties for suprema and
infima given in 1.4.3 hold in the extended system $\overline{\mathbb{R}}$.
1.4.14 Definition. An interval in R is a nonempty set I with the property
that a, b ∈ I and a < x < b imply that x ∈ I. An interval containing more
than one point is said to be nondegenerate.
♦
Arguing cases, one may show that the definition of interval reduces to the
following familiar subsets of R:
(a, b) := {x : a < x < b},
(a, b] := {x : a < x ≤ b},
[a, b) := {x : a ≤ x < b},
[a, b] := {x : a ≤ x ≤ b}.
For example, if I is unbounded below and bounded above with b := sup I ∈
I, then I = (−∞, b]. If, instead, I is bounded below and above such that
a := inf I ∈ I and b := sup I ∉ I, then I = [a, b). Intervals that contain their
endpoints are said to be closed; those that don’t contain any endpoints are
called open.
The length |a − b| of a finite interval I with endpoints a, b will be denoted
by |I|. Note that the length of a degenerate interval is zero.
Exercises
1. Prove that inf (−A) = − sup A, where −A := {−a : a ∈ A}. Conclude
that every nonempty subset of R that is bounded below has a greatest
lower bound.
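2. Find the supremum and infimum of the following sets, where $r_n$ denotes
the remainder on division of n ∈ N by 3.⁶
(a)S $\{(-1)^n (r_n^2 + 3r_n + 2) : n \in \mathbb{N}\}$.
(b)S $\left\{\dfrac{(-1)^n\, n(r_n - 1)}{(n + 1)(r_n + 1)} : n \in \mathbb{N}\right\}$.
(c) $\left\{(-1)^n\, \dfrac{3n + 2}{2n + 3} : n \in \mathbb{N}\right\}$.
(d) $\left\{\dfrac{\bigl((-1)^{\lfloor n/3 \rfloor} - 1\bigr)\, n}{n + 1} : n \in \mathbb{N}\right\}$.
⁶ For the existence of $r_n$, see Exercise 1.5.15.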
3. Find the supremum and infimum of the following sets.
(a) $\{x : x^2 - 5x + 6 < 0\}$.
(b) {x : (x + 3)(x − 4) < −6}.
(c) {x : (x − 4)/(x − 3) < −2}.
(d)S {x : x − 2 < 1/(x − 1)}.
(e)S {x : (x − 1)/x < 4}.
(f)S {x > 0 : x/(2 − x) > 3}.
(g) $\{x : |x^2 - 3x + 2| \le 1/4\}$.
(h) {x : |x − 1| + |x − 2| ≤ 3}.
(i)S $\{x : \sqrt{x - 1/8} > x\}$.
(j)S $\{x : \sqrt{x + 1/8} > x\}$.
(k) {x : 2|x − 1| + 3|y − 2| < 6 for some y ∈ R}.
(l) $\{x : 2|x^2 - 1| + 3|y^2 - 2| < 6 \text{ for some } y \in \mathbb{R}\}$.
(m)S $\{(-1)^n \sin(n\pi/2) - n^{-1} : n \in \mathbb{N}\}$.
(n) $\{(-1)^n \sin(m\pi/2) - n^{-1} : m, n \in \mathbb{N}\}$.
4. Let A ⊆ B be nonempty subsets of R. Prove that sup A ≤ sup B and
inf A ≥ inf B.
5.S ⇓⁷ For a nonempty bounded set A define |A| := {|a| : a ∈ A}. Prove
that sup |A| − inf |A| ≤ sup A − inf A. Hint. Use $\bigl|\,|x| - |y|\,\bigr| \le |x - y|$.
6. For r ∈ Q, x ∈ R, and nonempty subsets A and B of R, define
xA = {xa : a ∈ A}
AB = {ab : a ∈ A, b ∈ B}
A + B = {a + b : a ∈ A, b ∈ B}
$A^r = \{a^r : a \in A\}$, A ⊆ (0, +∞).
Under the conventions described in 1.4.12, prove that
(a) sup (A + B) ≤ sup A + sup B, inf (A + B) ≥ inf A + inf B.
(b)S sup (xA) = x sup A, inf (xA) = x inf A if x ≥ 0.
(c) sup (AB) ≤ (sup A)(sup B) and inf (AB) ≥ (inf A)(inf B)
if A, B ⊆ (0, ∞).
(d) $\sup A^r = (\sup A)^r$, $\inf A^r = (\inf A)^r$ if A ⊆ (0, ∞) and r > 0.
(e) $\sup A^{-1} = 1/\inf A$, $\inf A^{-1} = 1/\sup A$ if A ⊆ (0, ∞).
7. Let A ⊆ R be nonempty such that inf{|x − y| : x, y ∈ A, x ≠ y} > 0
(for example, any set of integers). If A is bounded above, prove that
sup A ∈ A, that is, A has a maximum.
8. Let A be a nonempty bounded set and let r ∈ R such that x − y < r for
all x, y ∈ A. Show that sup A − inf A ≤ r.
9.S Prove that between any pair of distinct real numbers there is an irrational
number.
⁷ This exercise will be used in 5.2.6.
10. Prove that between any pair of real numbers a < b there exist infinitely
many rational numbers and infinitely many irrational numbers.
11. (Density of the dyadic rationals). Prove that for each pair of real numbers
a < b there exist m ∈ Z and n ∈ N such that $a < m/2^n < b$. (Suggestion.
You might want to use the fact that $2^n > n$, a consequence of the binomial
theorem, proved in the next section.) A number of the form $m/2^n$ is
called a dyadic rational.
12. Prove:
(a) ⌊x⌋ = ⌊−x⌋ iff x = 0.
(b)S ⌊x⌋ = −⌊−x⌋ iff x ∈ Z.
(c)S −1 < x + ⌊−x⌋ ≤ 0.
(d) ⌊x⌋ + ⌊m − x⌋ = m or m − 1.
13. Let m ∈ Z, n ∈ N, $x_j \in \mathbb{R}$, and define
\[ s := \sum_{j=0}^{n} x_j \quad\text{and}\quad t := \sum_{j=0}^{n} \lfloor x_j \rfloor. \]
Prove:
(a) 0 ≤ ⌊s⌋ − t ≤ n.
(b) k ≤ s − t < k + 1 for some k = 0, 1, . . . , n.
14.S Let b > 0. Prove that $b^{m/n} = (b^m)^{1/n}$.
15.⇓⁸ Prove that for a, b > 0 and n ∈ N,
\[ a^{1/n} - b^{1/n} = (a - b)\left(\sum_{j=1}^{n} a^{1 - j/n}\, b^{(j-1)/n}\right)^{-1}. \]
16. Show that if 0 ≤ a < b and n ∈ N, then $a^{1/n} < b^{1/n}$.
17.S Prove that if A is a bounded set, then there exists an integer N such
that |x| ≤ N for all x ∈ A.
18. Let a, b ∈ Q \ {0} and n ∈ N. Prove that $x := a + b\sqrt{n}$ is irrational iff n
is not a perfect square.
19. Show that if $x, y \in \mathbb{Q}(\sqrt{2})$, then $x \pm y,\ xy,\ x/y \in \mathbb{Q}(\sqrt{2})$, the last provided
that y ≠ 0. Conclude that $\mathbb{Q}(\sqrt{2})$ is an ordered subfield of R. Show that
$\mathbb{Q}(\sqrt{2})$ is not complete.
20.S (a) Find all n ∈ N such that $\sqrt{n + 11} + \sqrt{n} \in \mathbb{Q}$.
(b) Same question for $\sqrt{n + 21} + \sqrt{n}$.
21. Let p ∈ N be prime, that is, divisible only by 1 and itself. Prove that
$(\sqrt{n} + 1)(\sqrt{n + p} + 1)^{-1} \in \mathbb{Q}$ iff $n = (p - 1)^2/4$.
⁸ This exercise will be used in 4.1.2.
1.5 Mathematical Induction
In this section we give an abstract characterization of the natural number
system. This will lead directly to the principle of mathematical induction.
1.5.1 Definition. A set S of real numbers is said to be inductive if
• 1 ∈ S,
• x ∈ S implies x + 1 ∈ S.
The set N of natural numbers is then defined as the intersection of all inductive
subsets of R.
♦
The sets (a, +∞), and (a, +∞) ∩ Q, a < 1, are clearly inductive. More
importantly, N itself is inductive. Indeed, since 1 is common to all inductive
sets, 1 ∈ N, and if n is common to all inductive sets, then so is n + 1. We
may therefore characterize N as the smallest inductive set (in the sense of set
inclusion). The principle of mathematical induction follows immediately from
this characterization:
1.5.2 Principle of Mathematical Induction. For each n ∈ N, let P (n) be
a statement depending on n. Suppose that
(a) P (1) is true,
(b) P (n + 1) is true whenever P (n) is true.
Then P (n) is true for all n.
Proof. Let S denote the set of n ∈ N for which P (n) is true. Then (a) and (b)
imply that S is inductive and hence, as a subset of N, must in fact equal N.
In a particular application of 1.5.2, part (a) is called the base step and part
(b) the inductive step. The assumption in (b) that P (n) is true is called the
induction hypothesis.
The principle of mathematical induction has been loosely described as the
“domino principle”: If dominoes are lined up vertically in such a way that the
(n + 1)st domino will fall if the nth one falls, then, if the first domino is tipped,
all the dominoes will fall.
Mathematical induction may be used to give a rigorous proof that N is
closed under addition: Let P (n) be the statement that n + m ∈ N for all m ∈ N.
Then P (1) is true because N is inductive, and if, for some n, P (n) is true, that
is, if n + m ∈ N for all m, then clearly P (n + 1) is true. A similar argument
shows that N is closed under multiplication.
Mathematical induction is indispensable in proving many useful inequalities
and formulas. We offer two examples; others may be found in the exercises.
1.5.3 Example. We prove by induction that $3^n n! > n^n$ for all n ∈ N. This
is obvious for n = 1. For the induction step, we need the fact (verified in
Example 2.2.4) that $(1 + 1/n)^n < 3$, or equivalently, $(n + 1)^n < 3n^n$, for all n.
Assuming this, we see that if $3^n n! > n^n$, then
\[ 3^{n+1}(n + 1)! = 3(n + 1)\,3^n n! > 3(n + 1)n^n > (n + 1)^{n+1}. \]
♦
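Both the claim of Example 1.5.3 and the auxiliary inequality used in its induction step can be spot-checked with exact integer arithmetic. The names `claim` and `step_fact` below are illustrative; checking finitely many n is of course no substitute for the induction argument.

```python
from math import factorial

def claim(n):
    """The statement P(n): 3^n * n! > n^n."""
    return 3**n * factorial(n) > n**n

def step_fact(n):
    """The auxiliary fact (n + 1)^n < 3 * n^n used in the induction step."""
    return (n + 1)**n < 3 * n**n

checked = all(claim(n) and step_fact(n) for n in range(1, 201))
print(checked)
```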
1.5.4 Example. We derive a closed formula for $f(n) := \sum_{k=1}^{n} (3k - 1)^2$ and
then verify the result by induction. A little experimentation suggests that we
should try a polynomial in n of degree 3, say $g(n) := An^3 + Bn^2 + Cn + D$.
Then
\[ g(n + 1) - g(n) = A\bigl[(n + 1)^3 - n^3\bigr] + B\bigl[(n + 1)^2 - n^2\bigr] + C\bigl[(n + 1) - n\bigr] = 3An^2 + (3A + 2B)n + A + B + C \]
and
\[ f(n + 1) - f(n) = \sum_{k=1}^{n+1} (3k - 1)^2 - \sum_{k=1}^{n} (3k - 1)^2 = \bigl(3(n + 1) - 1\bigr)^2 = 9n^2 + 12n + 4. \]
Assuming that f(n) = g(n) for all n, we may equate coefficients to obtain
A = 3, B = 3/2, and C = −1/2. Since f(1) = 4, we see that D = 0. Thus,
under the assumption that the sum has a closed form that is a cubic polynomial,
we obtain the formula
\[ \sum_{k=1}^{n} (3k - 1)^2 = 3n^3 + \tfrac{3}{2} n^2 - \tfrac{1}{2} n. \]
To prove the validity of the formula we use induction. When n = 1, each
side equals 4. Assuming the formula holds for n, we have
\[ \sum_{k=1}^{n+1} (3k - 1)^2 = \sum_{k=1}^{n} (3k - 1)^2 + \bigl(3(n + 1) - 1\bigr)^2 = \bigl(3(n + 1) - 1\bigr)^2 + 3n^3 + \tfrac{3}{2} n^2 - \tfrac{1}{2} n. \]
A little algebra shows that the last expression reduces to
\[ 3(n + 1)^3 + \tfrac{3}{2} (n + 1)^2 - \tfrac{1}{2} (n + 1). \]
Thus the formula holds for n + 1, completing the induction.
♦
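The closed formula of Example 1.5.4 can be compared with the defining sum directly. The sketch below uses exact rational arithmetic so that the coefficients 3/2 and 1/2 introduce no rounding; the names `f` and `g` mirror the text, while the use of Python's `Fraction` type is an assumption of this illustration.

```python
from fractions import Fraction

def f(n: int) -> int:
    """The sum of (3k - 1)^2 for k = 1, ..., n, computed term by term."""
    return sum((3 * k - 1) ** 2 for k in range(1, n + 1))

def g(n: int) -> Fraction:
    """The cubic 3n^3 + (3/2)n^2 - (1/2)n derived in Example 1.5.4."""
    return 3 * n**3 + Fraction(3, 2) * n**2 - Fraction(1, 2) * n

if __name__ == "__main__":
    # The two sides should agree for every n tested.
    print(all(f(n) == g(n) for n in range(1, 501)))
```

Agreement on finitely many values does not prove the formula, but a mismatch would expose an arithmetic slip in the coefficients before the induction is attempted.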
The stalwart reader may wish to use the methods of the last example to
derive and then verify by induction the formula
\[ \sum_{k=1}^{n} k^4 = \frac{n}{30}\left(6n^4 + 15n^3 + 10n^2 - 1\right). \]
There are many other types of applications of the principle of mathematical
induction, some of which are given in the exercises. The following has important
consequences in combinatorics, probability theory, and infinite series.
1.5.5 Binomial Theorem. Let $a, b \in \mathbb{R}$ and $n \in \mathbb{N}$. Then
\[ (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}, \quad\text{where}\quad \binom{n}{k} := \frac{n!}{k!\,(n - k)!}. \]
Proof. For $n = 1$ the formula asserts that
\[ a + b = \binom{1}{0} a^0 b^1 + \binom{1}{1} a^1 b^0, \]
which follows from the convention $0! = 1$. Suppose that the formula holds for
some $n \ge 1$. Writing $(a + b)^{n+1}$ as $(a + b)(a + b)^n$ and using the induction
hypothesis, we have
\begin{align*}
(a + b)^{n+1} &= \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n-k} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{n+1-k} \\
&= \sum_{k=1}^{n} \binom{n}{k-1} a^k b^{n+1-k} + \sum_{k=1}^{n} \binom{n}{k} a^k b^{n+1-k} + a^{n+1} + b^{n+1} \\
&= \sum_{k=1}^{n} \left[\binom{n}{k-1} + \binom{n}{k}\right] a^k b^{n+1-k} + a^{n+1} + b^{n+1} \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} a^k b^{n+1-k},
\end{align*}
where, for the last step, we used Exercise 1.2.6. By induction, the formula
holds for all $n$.
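The two ingredients of this proof can be checked numerically: the expansion itself and the identity $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$ that the last step requires (presumably the content of the exercise cited there). The sketch below is illustrative only; the function names are not from the text, and rational inputs are used so that the comparison is exact.

```python
from fractions import Fraction
from math import comb

def binomial_expansion(a, b, n: int):
    """Right-hand side of 1.5.5: the sum of C(n, k) a^k b^(n-k) for k = 0..n."""
    return sum(comb(n, k) * a**k * b**(n - k) for k in range(n + 1))

def pascal_rule_holds(n: int, k: int) -> bool:
    """True if C(n, k-1) + C(n, k) == C(n+1, k), for 1 <= k <= n."""
    return comb(n, k - 1) + comb(n, k) == comb(n + 1, k)

if __name__ == "__main__":
    a, b = Fraction(2, 3), Fraction(-5, 7)
    print(all(binomial_expansion(a, b, n) == (a + b) ** n for n in range(1, 30)))
    print(all(pascal_rule_holds(n, k) for n in range(1, 30) for k in range(1, n + 1)))
```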
Exercises
1. ⇓9 Let $0 < a < x_1, y_1 < b := a + 1$ and define
\[ x_{n+1} = a + \sqrt{|x_n - a|} \quad\text{and}\quad y_{n+1} = b - \sqrt{|b - y_n|}. \]
Prove that $a < x_n < x_{n+1} < b$ and $a < y_{n+1} < y_n < b$ for all $n \in \mathbb{N}$.
2. Use induction to prove that a nonempty finite set has a maximum and a
minimum.
3.S ⇓10 Verify by induction that
\[ \sum_{k=1}^{2n} \frac{(-1)^{k+1}}{k} = \sum_{k=n+1}^{2n} \frac{1}{k} \quad\text{for all } n \ge 1. \]
9 This exercise will be used in 2.2.3.
10 This exercise will be used in 6.4.8.
4. Establish the following formulas by mathematical induction:
(a) $\sum_{k=1}^{n} k = n(n + 1)/2$.
(b) $\sum_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6$.
(c) $\sum_{k=1}^{n} k^3 = [n(n + 1)/2]^2$.
(d) $\sum_{k=1}^{n} (2k - 1)^2 = n(4n^2 - 1)/3$.
(e) $\sum_{k=1}^{n} (2k - 1)^3 = n^2(2n^2 - 1)$.
(f) $\sum_{k=1}^{n} \dfrac{1}{\sqrt{k - 1} + \sqrt{k}} = \sqrt{n}$.
(g) $\sum_{k=1}^{n} (4k^3 - 6k^2 + 4k - 1) = n^4$.
(h) $\sum_{k=1}^{n} \dfrac{2k + \sqrt{k(k - 1)} - 1}{\sqrt{k} + \sqrt{k - 1}} = n\sqrt{n}$.
5.S Use the methods of 1.5.4 to derive and verify a closed formula for
$\sum_{k=1}^{n} (5k - 4)^2$.
6. Use known formulas to calculate
(a) 1 · 2 + 2 · 3 + 3 · 4 + · · · + 999 · 1000.
(b)S 1 · 3 + 3 · 5 + 5 · 7 + · · · + 999 · 1001.
(c) 1 · 3 + 5 · 7 + 9 · 11 + · · · + 1001 · 1003.
7.S Use the principle of mathematical induction to prove the following
variant: Let n0 ∈ Z and let P (n) be a statement depending on integers
n ≥ n0 such that
(a) P (n0 ) is true,
(b) if n ≥ n0 and P (n) is true, then P (n + 1) is true.
Then P (n) is true for every n ≥ n0 .
8. Use the variant of mathematical induction in Exercise 7 to verify the
following inequalities. (For (e) use $(1 + 1/n)^n > 2$, an easy consequence
of the binomial theorem.)
(a)S $2n + 1 < 2^n$, $n \ge 3$.
(b) $n^2 < 2^n$, $n \ge 5$.
(c) $2^n < n!$, $n \ge 4$.
(d) $3^n < n!$, $n \ge 7$.
(e)S $2^n n! < n^n$, $n \ge 6$.
(f) $8^n n! < (2n)!$, $n \ge 6$.
9.S Use the variant of mathematical induction in Exercise 7 to prove that
$n < \ln(n!)$, $n \ge 6$.
10. Prove Bernoulli's inequality: $(1 + x)^n \ge 1 + nx$, $n \in \mathbb{Z}^+$, $x \ge -1$.
11. Use the principle of mathematical induction to prove the following variant:
Let n0 ∈ Z and let P (n) be a statement depending on integers n ≥ n0
such that
(a) P (n0 ) is true,
(b) P (n + 1) is true whenever P (j) is true for all n0 ≤ j ≤ n.
Then P (n) is true for every n ≥ n0 .
12. (Prime Factorization). Use the variant of induction in Exercise 11 to
prove that every integer $n \ge 2$ may be written as a product of powers of
prime numbers (for example, $72 = 2^3 \cdot 3^2$).
13.S The Fibonacci numbers $f_n$ are defined recursively by
\[ f_0 = f_1 = 1 \quad\text{and}\quad f_{n+1} = f_n + f_{n-1}, \quad n \ge 1. \]
Use the variant of induction in Exercise 11 to prove that
\[ f_n = \frac{1}{\sqrt{5}}\left(a^{n+1} - b^{n+1}\right), \quad a := \frac{1 + \sqrt{5}}{2}, \quad b := \frac{1 - \sqrt{5}}{2}, \]
where $a, b$ are the zeros of $x^2 - x - 1$.
14. Let $a_0$ and $a_1$ be arbitrary and define
\[ a_{n+1} = \tfrac{1}{2}(a_n + a_{n-1}), \quad n \ge 1. \]
Use the variant of induction in Exercise 11 to prove that for all $n \ge 0$,
\[ a_n = \frac{(-1)^n}{3 \cdot 2^{n-1}}(a_0 - a_1) + \frac{1}{3}(a_0 + 2a_1). \]
15.S (Division algorithm). Prove that for each pair of integers m and n with
n > 0 there exist unique integers q and r such that
m = qn + r and 0 ≤ r ≤ n − 1.
(The integer q is called the quotient and r the remainder on division of
m by n.)
16. Use the variant of induction in Exercise 11 to prove that each $n \in \mathbb{N}$
may be uniquely expressed in the form $\sum_{k=0}^{p} d_k 10^k$ for some $p \in \mathbb{N}$ and
$d_k \in \{0, 1, \ldots, 9\}$. The representation
\[ n = d_p d_{p-1} \ldots d_0 \]
is called the decimal positional notation for $n$.
1.6 Euclidean Space
The real number system may be used to construct other important mathematical systems, such as n-dimensional Euclidean space and the complex
number system. In this section we construct the former. The reader may delay
reading this section, as the material will not be needed until Chapter 8.
For n ∈ N, let Rn denote the set of all n-tuples x := (x1 , x2 , . . . , xn ),
where xj ∈ R. Each such n-tuple is called a point or vector, depending on
context. The distinction between points and vectors is important in physics
and geometry, as it allows one to refer to a vector at a point, a notion useful
in describing, say, forces or tangent vectors.
The set Rn has an algebraic structure which is defined as follows: Let
x = (x1 , . . . , xn ), y = (y1 , . . . , yn ), and t ∈ R.
The operations of addition x + y and scalar multiplication tx in Rn are then
defined by
x + y = (x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ), and
tx = t(x1 , . . . , xn ) = (tx1 , . . . , txn ).
We also define
−x := (−x1 , . . . , −xn )
and
0 := (0, . . . , 0).
The following theorem asserts that Rn is a vector space under these operations
(see Appendix B). The straightforward proof is left to the reader.
1.6.1 Theorem. Addition and scalar multiplication on Rn have the following
properties:
• associativity of addition: (x + y) + z = x + (y + z);
• commutativity of addition: x + y = y + x;
• existence of an additive identity: x + 0 = x;
• existence of additive inverses: x + (−x) = 0;
• associativity of scalar multiplication: (st)x = s(tx);
• distributivity of a scalar over vector addition: s(x + y) = sx + sy;
• distributivity of a vector over scalar addition: (s + t)x = sx + tx;
• existence of a scalar multiplicative identity: 1x = x.
1.6.2 Definition. Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$. The Euclidean
inner product $x \cdot y$ of $x$ and $y$ and the Euclidean norm $\|x\|_2$ of $x$ are defined
by
\[ x \cdot y = \sum_{j=1}^{n} x_j y_j \quad\text{and}\quad \|x\|_2 = \Bigl(\sum_{j=1}^{n} x_j^2\Bigr)^{1/2} = \sqrt{x \cdot x}. \]
The set $\mathbb{R}^n$ with its vector space structure and the Euclidean inner product is
called n-dimensional Euclidean space.
♦
The structure of Euclidean space allows one to define lines, planes, length,
perpendicularity, angle between vectors, etc. These ideas will be useful in later
chapters.
1.6.3 Theorem. The inner product in $\mathbb{R}^n$ has the following properties:
(a) $x \cdot x = \|x\|_2^2$.
(b) $x \cdot y = y \cdot x$ (commutativity).
(c) $t(x \cdot y) = (tx) \cdot y = x \cdot (ty)$ (associativity).
(d) $x \cdot (y + z) = (x \cdot y) + (x \cdot z)$ (additivity).
(e) $|x \cdot y| \le \|x\|_2 \|y\|_2$ (Cauchy–Schwarz inequality).
Proof. Properties (a) and (b) are immediate and parts (c) and (d) follow
respectively from the calculations
\[ t \sum_{j=1}^{n} x_j y_j = \sum_{j=1}^{n} (t x_j) y_j = \sum_{j=1}^{n} x_j (t y_j) \quad\text{and}\quad \sum_{j=1}^{n} x_j (y_j + z_j) = \sum_{j=1}^{n} x_j y_j + \sum_{j=1}^{n} x_j z_j. \]
The inequality in (e) holds trivially if $y = 0$. Suppose $y \ne 0$, so $\|y\|_2 \ne 0$.
By properties (a)–(d),
\[ 0 \le \|x - ty\|_2^2 = (x - ty) \cdot (x - ty) = \|x\|_2^2 - 2t(x \cdot y) + t^2 \|y\|_2^2. \]
Setting $t = (x \cdot y)/\|y\|_2^2$, we obtain
\[ 0 \le \|x\|_2^2 - 2(x \cdot y)^2/\|y\|_2^2 + (x \cdot y)^2/\|y\|_2^2 = \|x\|_2^2 - (x \cdot y)^2/\|y\|_2^2, \]
which implies that $(x \cdot y)^2 \le \|x\|_2^2 \|y\|_2^2$. Taking square roots yields (e).
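The Cauchy–Schwarz inequality 1.6.3(e) is easy to probe numerically. The following Python sketch, with illustrative helper names that are not part of the text, computes the gap $\|x\|_2\|y\|_2 - |x \cdot y|$, which should be nonnegative up to floating-point rounding and should vanish when one vector is a multiple of the other.

```python
import math
import random

def dot(x, y):
    """Euclidean inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    """Euclidean norm, the square root of x . x."""
    return math.sqrt(dot(x, x))

def cauchy_schwarz_gap(x, y):
    """||x|| ||y|| - |x . y|; nonnegative by 1.6.3(e), up to rounding."""
    return norm2(x) * norm2(y) - abs(dot(x, y))

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    for _ in range(5):
        x = [rng.uniform(-1, 1) for _ in range(4)]
        y = [rng.uniform(-1, 1) for _ in range(4)]
        print(cauchy_schwarz_gap(x, y))
```

The choice $t = (x \cdot y)/\|y\|_2^2$ in the proof is the value that minimizes $\|x - ty\|_2^2$, which is why the gap closes exactly for parallel vectors.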
1.6.4 Theorem. The Euclidean norm on $\mathbb{R}^n$ has the following properties:
(a) $\|x\|_2 \ge 0$ (nonnegativity).
(b) $\|x\|_2 = 0$ iff $x = 0$ (coincidence).
(c) $\|tx\|_2 = |t|\,\|x\|_2$ (absolute homogeneity).
(d) $\|x + y\|_2 \le \|x\|_2 + \|y\|_2$ (triangle inequality).
Proof. Parts (a) and (b) are clear, and (c) follows from
\[ \|tx\|_2^2 = \sum_{j=1}^{n} (t x_j)^2 = t^2 \sum_{j=1}^{n} x_j^2 = t^2 \|x\|_2^2. \]
For (d) we use 1.6.3:
\begin{align*}
\|x + y\|_2^2 &= (x + y) \cdot (x + y) \\
&= \|x\|_2^2 + \|y\|_2^2 + 2(x \cdot y) \\
&\le \|x\|_2^2 + \|y\|_2^2 + 2\|x\|_2 \|y\|_2 \\
&= (\|x\|_2 + \|y\|_2)^2.
\end{align*}
Exercises
1.S Solve the following system of vector equations for $x$ and $y$ in terms of
$a$, $b$, $c$, $d$, and $e$, assuming that $(a \cdot b)(d \cdot b) \ne 1$.
\[ x + (y \cdot b)a = c, \qquad y + (x \cdot b)d = e. \]
2. Prove the following:
(a) $\|x + y\|_2^2 - \|x - y\|_2^2 = 4(x \cdot y)$ (polarization identity).
(b) $\|x + y\|_2^2 + \|x - y\|_2^2 = 2\left(\|x\|_2^2 + \|y\|_2^2\right)$ (parallelogram rule).
(c)S $\bigl|\,\|x\|_2 - \|y\|_2\,\bigr| \le \|x - y\|_2$.
(d) $\|x_1 + \cdots + x_n\|_2 \le \sum_{j=1}^{n} \|x_j\|_2$ (generalized triangle inequality).
3.S Suppose that $x_i \cdot x_j = 0$ for $i \ne j$. Prove that
\[ \|x_1 + \cdots + x_k\|_2^2 = \sum_{j=1}^{k} \|x_j\|_2^2. \]
4. ⇓11 For $x = (x_1, \ldots, x_n)$ define
\[ \|x\|_1 = \sum_{j=1}^{n} |x_j| \quad\text{and}\quad \|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}. \]
Verify that $\|\cdot\|_1$ and $\|\cdot\|_\infty$ have the properties (a)–(d) of 1.6.4.
5. A nonempty subset $C$ of $\mathbb{R}^n$ is said to be convex if $x, y \in C$ and $t \in [0, 1]$
imply that $tx + (1 - t)y \in C$. Let $r > 0$. Prove that $\{x \in \mathbb{R}^n : \|x\|_2 \le r\}$
is convex. Is the set $\{x \in \mathbb{R}^n : \|x\|_2 = r\}$ convex? What about the sets
$\{x \in \mathbb{R}^n : \|x\|_1 \le r\}$ and $\{x \in \mathbb{R}^n : \|x\|_\infty \le r\}$?
11 This exercise will be used in Section 8.1.
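6. Find positive constants $a, b, c$ such that for all $x \in \mathbb{R}^n$,
\[ \|x\|_2 \le a\|x\|_1, \quad \|x\|_1 \le b\|x\|_\infty, \quad\text{and}\quad \|x\|_\infty \le c\|x\|_2. \]
7.S Prove that $\|x\|_2 = \|y\|_2 = \|(x + y)/2\|_2 = 1 \Rightarrow x = y$. Is the same true
for $\|\cdot\|_\infty$ or $\|\cdot\|_1$?
8. Show that in $\mathbb{R}^3$, $a \cdot b = \|a\|\,\|b\| \cos\theta$, where $\theta$ is the (smaller) angle
between $a$ and $b$.
9. The cross product of vectors $a = (a_1, a_2, a_3)$ and $b = (b_1, b_2, b_3)$ in $\mathbb{R}^3$ is
defined by
\[ a \times b = \left( \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix},\; -\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix},\; \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \right)
= \langle a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1 \rangle. \]
Let $\theta$ be the (smaller) angle between $a$ and $b$. Verify the following:
(a) $(a \times b) \cdot a = (a \times b) \cdot b = 0$.
(b) $b \times a = -a \times b$.
(c) $a \times (tx + sy) = t(a \times x) + s(a \times y)$.
(d) $(a \times b) \cdot c = a \cdot (b \times c)$.
(e) $a \times (b \times c) = (a \cdot c)b - (a \cdot b)c$.
(f) $\|a \times b\| = \|a\|\,\|b\| \sin\theta$.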
Chapter 2
Numerical Sequences
2.1 Limits of Sequences
Simply stated, a sequence in a set E is a function from N to E. It is
more instructive, however, to think of a sequence as an infinite ordered list of
members of E. The list may be written out, for example, as
\[ a_1, a_2, \ldots, a_n, \ldots \]
or abbreviated by $\{a_n\}_{n=1}^{\infty}$ or simply by $\{a_n\}$. A sequence usually starts with
the index 1, although this is not necessary, 0 being a common alternative.
The set E in the definition of sequence is arbitrary. However, for Part I of
the book, we consider only numerical sequences, that is, sequences contained
in R.
Sequences may be defined by a closed formula, such as $a_n = (-1)^n$, or
recursively, such as the Fibonacci sequence, defined by
\[ a_0 = a_1 = 1 \quad\text{and}\quad a_{n+1} = a_n + a_{n-1}, \quad n \ge 1 \]
(see Exercise 1.5.13).
The following notion will occasionally be useful. A property $P$ of a sequence
$\{a_n\}$ is said to hold eventually if there exists an index $N$ such that $a_n$ has
property $P$ for all $n \ge N$. For example, by the Archimedean principle, the
sequence $\{1/n\}$ is eventually less than .001. Or, consider the sequence defined
by $a_n = n^2 + 100(-1)^n$; the reader may verify that eventually $a_n < a_{n+1}$.
Convergence of a sequence to a number a expresses the idea that eventually
the terms of the sequence will be as close to a as desired. The following
definition makes this precise.
2.1.1 Definition. A sequence $\{a_n\}$ in $\mathbb{R}$ is said to converge to a real number $a$,
written
\[ a_n \to a \quad\text{or}\quad \lim_{n} a_n = \lim_{n \to +\infty} a_n = a, \]
if for each $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that
\[ |a_n - a| < \varepsilon \quad (a - \varepsilon < a_n < a + \varepsilon) \quad\text{for all } n \ge N. \]
If no such real number $a$ exists, then the sequence is said to diverge.
♦
FIGURE 2.1: Convergence of a sequence to $a$.
It follows immediately from the definition that $a_n \to a$ iff the terms of
the sequence eventually lie in any open interval containing $a$. The definition also
implies that $a_n \to a$ iff $|a_n - a| \to 0$.
Limits, if they exist, are unique. Indeed, if an → a and an → b, then by
the triangle inequality |a − b| ≤ |a − an | + |b − an | → 0, hence a = b.
Examples. (a) The sequence $\{(-1)^n\}$ oscillates between $-1$ and $1$ and so
cannot converge. For a rigorous proof, suppose $(-1)^n \to a$ for some $a \in \mathbb{R}$.
Choose $N$ such that $a - 1 < (-1)^n < a + 1$ for all $n \ge N$. Thus, if $n \ge N$ is
even, then $1 < a + 1$, and if $n \ge N$ is odd, then $a - 1 < -1$. Adding these
inequalities produces the absurdity $a < a$.
(b) To show that
\[ \lim_{n} \frac{(-1)^n}{n} = 0, \]
let $\varepsilon > 0$ and choose an integer $N > 1/\varepsilon$ (Archimedean principle). Then
$|(-1)^n/n - 0| = 1/n < \varepsilon$ for all $n \ge N$.
(c) To verify that
\[ \lim_{n} \frac{2n + 1}{3n + 5} = \frac{2}{3}, \]
note that
\[ \left|\frac{2n + 1}{3n + 5} - \frac{2}{3}\right| = \frac{7}{3(3n + 5)} < \frac{7}{n}, \]
so any index $N > 7/\varepsilon$ satisfies the condition in 2.1.1.
♦
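The choice of $N$ in Example (c) can be exercised numerically. The sketch below, whose function names are illustrative and not from the text, picks an integer $N > 7/\varepsilon$ and checks the inequality $|a_n - 2/3| < \varepsilon$ on a finite range of indices beyond $N$; a finite check of this kind illustrates the definition but cannot replace the estimate above.

```python
import math

def a(n: int) -> float:
    """The general term (2n + 1) / (3n + 5) of Example (c)."""
    return (2 * n + 1) / (3 * n + 5)

def index_for(eps: float) -> int:
    """An integer N with N > 7/eps, as suggested by the estimate in Example (c)."""
    return math.floor(7 / eps) + 1

def within_eps_after(eps: float, how_many: int = 1000) -> bool:
    """Check |a_n - 2/3| < eps for n = N, ..., N + how_many - 1."""
    N = index_for(eps)
    return all(abs(a(n) - 2 / 3) < eps for n in range(N, N + how_many))

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01, 0.001):
        print(eps, index_for(eps), within_eps_after(eps))
```

The bound $7/n$ is deliberately generous: the exact error is $7/(3(3n+5))$, so much smaller indices would also work, but the definition only asks for some $N$.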
2.1.2 Definition. A sequence {an } is said to be bounded (above, below ) if
the set of its terms is bounded (above, below).
♦
2.1.3 Proposition. A convergent sequence in R is bounded.
Proof. Assume that an → a ∈ R. Choose N such that |an − a| < 1 for all
n > N . Since |an | − |a| ≤ |an − a|, we see that |an | ≤ |an − a| + |a| < 1 + |a|
for all n > N . Thus |an | ≤ max{1 + |a|, |a1 |, . . . , |aN |} for all n ∈ N.
2.1.4 Theorem. Let {an } and {bn } be sequences with an → a and bn → b. If
an ≤ bn for infinitely many n, then a ≤ b.
Proof. Suppose b < a. Then b < (a + b)/2 < a, hence we may choose indices
N1 and N2 such that bn < (a + b)/2 for all n ≥ N1 and an > (a + b)/2 for
all n ≥ N2 . But then bn < an for all n ≥ max{N1 , N2 }, contradicting the
hypothesis.
Note that, as a consequence of the preceding theorem, a convergent sequence
in a closed interval I must have its limit in I.
2.1.5 Theorem (Squeeze principle). Let {an }, {bn }, and {cn } be sequences
in R such that an ≤ bn ≤ cn for all n. If limn an = limn cn = x ∈ R, then
limn bn = x.
Proof. Given ε > 0, choose N1 , N2 ∈ N such that |an − x| < ε for all n ≥ N1
and |cn − x| < ε for all n ≥ N2 . For n ≥ max{N1 , N2 }, the inequalities
−ε < an − x ≤ bn − x ≤ cn − x < ε imply that |bn − x| < ε.
FIGURE 2.2: The squeeze principle.
2.1.6 Example. We show that $\lim_n n r^n = 0$ for any $r \in (0, 1)$. Let $h = r^{-1} - 1$.
Then $h > 0$ and, by the binomial theorem,
\[ r^{-n} = (1 + h)^n = 1 + nh + \tfrac{1}{2} n(n - 1)h^2 + \cdots > \tfrac{1}{2} n(n - 1)h^2, \]
hence
\[ 0 < n r^n < \frac{2}{(n - 1)h^2}, \quad n > 1. \]
Since the term on the right tends to 0 as $n \to +\infty$, the squeeze principle shows
that $n r^n \to 0$. (See Exercise 16 for an extension of this result.)
♦
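The two-sided bound in Example 2.1.6 can be verified exactly for sample values of $r$. The sketch below assumes rational $r$ and uses Python's `Fraction` type so that very small powers are not lost to floating-point underflow; the function names are illustrative only.

```python
from fractions import Fraction

def upper_bound(n: int, r: Fraction) -> Fraction:
    """The bound 2 / ((n - 1) h^2) with h = 1/r - 1, valid for n > 1."""
    h = 1 / r - 1
    return 2 / ((n - 1) * h**2)

def squeeze_holds(r: Fraction, n_max: int = 300) -> bool:
    """Check 0 < n r^n < 2 / ((n - 1) h^2) for n = 2, ..., n_max."""
    return all(0 < n * r**n < upper_bound(n, r) for n in range(2, n_max + 1))

if __name__ == "__main__":
    for r in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
        print(r, squeeze_holds(r))
```

When $r$ is close to 1 the quantity $h$ is small and the bound is large for moderate $n$, which matches the slow decay of $n r^n$ in that regime.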
For another illustration of the squeeze principle we prove
2.1.7 Proposition. For any real number x there exist sequences {an } in Q
and {bn } in I such that limn an = limn bn = x.
Proof. By 1.4.8 and Exercise 1.4.9, for each n ∈ N we may choose points
an ∈ (x − 1/n, x + 1/n) ∩ Q and bn ∈ (x − 1/n, x + 1/n) ∩ I. The squeeze
principle then implies that an , bn → x.
2.1.8 Definition. (Infinite limits) A sequence $\{a_n\}$ in $\mathbb{R}$ is said to diverge
to $+\infty$, written
\[ a_n \to +\infty \quad\text{or}\quad \lim_{n} a_n = \lim_{n \to +\infty} a_n = +\infty, \]
if for each real number $M$ there exists an index $N$ such that $a_n \ge M$ for all
$n \ge N$. Divergence to $-\infty$ is defined analogously.
♦
2.1.9 Example. If $r > 1$, then $r^n/n \to +\infty$. This follows from 2.1.6: Given
$M > 0$ there exists $N \in \mathbb{N}$ such that $n/r^n < 1/M$, hence $r^n/n > M$, for all
$n \ge N$.
♦
2.1.10 Example. If $r > 0$, then $a_n := r^n n! \to +\infty$. Indeed, since
\[ \frac{a_n}{a_{n-1}} = rn \to +\infty, \]
there exists $N \in \mathbb{N}$ such that $a_n > 2a_{n-1}$ for all $n > N$. Iterating, we see that
$a_n > 2^k a_{n-k} \ge k a_{n-k}$, so taking $k = n - N$ we have $a_n > (n - N)a_N$ for all
$n > N$. Since $\lim_n (n - N)a_N = +\infty$ (Archimedean principle), the assertion
follows.
♦
For the following theorem, recall the conventions regarding addition and
multiplication in the extended real number system $\overline{\mathbb{R}}$ (1.4.12).
2.1.11 Theorem. Let $\{a_n\}$ and $\{b_n\}$ be sequences in $\mathbb{R}$. The following limit
properties hold in $\overline{\mathbb{R}}$ in the sense that if the expression on the right side of the
equation exists in $\overline{\mathbb{R}}$, then the limit on the left side exists and equality holds.
(a) $\lim_n (s a_n + t b_n) = s \lim_n a_n + t \lim_n b_n$, $s, t \in \mathbb{R}$.
(b) $\lim_n a_n b_n = \lim_n a_n \lim_n b_n$.
(c) $\lim_n a_n/b_n = \lim_n a_n / \lim_n b_n$, if $\lim_n b_n \ne 0$.
(d) $\lim_n |a_n| = |\lim_n a_n|$.
(e) $\lim_n \sqrt{a_n} = \sqrt{\lim_n a_n}$ if $a_n \ge 0$ for all $n$.
Proof. Let $a_n \to a$, $b_n \to b$. We prove the theorem first for the case $a, b \in \mathbb{R}$.
Let $\varepsilon > 0$.
For (a) choose $N_1$ and $N_2$ so that
\[ |a_n - a| < \frac{\varepsilon}{2(|s| + 1)} \ \text{ for all } n \ge N_1 \quad\text{and}\quad |b_n - b| < \frac{\varepsilon}{2(|t| + 1)} \ \text{ for all } n \ge N_2. \]
If $n \ge N := \max\{N_1, N_2\}$, then both of these inequalities hold, hence, by the
triangle inequality,
\[ |s a_n + t b_n - (sa + tb)| \le |s|\,|a_n - a| + |t|\,|b_n - b| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
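To prove (b), choose $M > 0$ with $M \ge |a|$ so that $|b_n| \le M$ for all $n$ (2.1.3) and choose
$N$ so that $|a_n - a| < \varepsilon/(2M)$ and $|b_n - b| < \varepsilon/(2M)$ for all $n \ge N$. For such $n$,
\begin{align*}
|a_n b_n - ab| &= |(a_n - a)b_n + a(b_n - b)| \le |a_n - a|\,|b_n| + |a|\,|b_n - b| \\
&\le M|a_n - a| + M|b_n - b| < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{align*}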
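For (c) it suffices to show that $1/b_n \to 1/b$. Choose $N$ such that
\[ |b_n - b| < \min\{|b|/2,\ \varepsilon b^2/2\} \quad\text{for all } n \ge N. \]
For such $n$, $|b_n| \ge |b| - |b_n - b| > |b|/2$, hence
\[ \left|\frac{1}{b_n} - \frac{1}{b}\right| = \frac{|b_n - b|}{|b\,b_n|} \le \frac{2|b_n - b|}{b^2} < \varepsilon. \]
Part (d) follows from the inequality $\bigl|\,|a_n| - |a|\,\bigr| \le |a_n - a|$.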
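For (e), observe first that $a \ge 0$ (2.1.4). If $a = 0$, choose $N$ such that
$a_n < \varepsilon^2$ for all $n \ge N$. If $a > 0$, choose $N$ such that $|a_n - a| < \varepsilon\sqrt{a}$ for all
$n \ge N$. For such $n$,
\[ |\sqrt{a_n} - \sqrt{a}| = \frac{|a_n - a|}{\sqrt{a_n} + \sqrt{a}} \le \frac{|a_n - a|}{\sqrt{a}} < \varepsilon. \]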
To illustrate the remaining cases $a = \pm\infty$ or $b = \pm\infty$, we prove part (b) for
the case $-\infty < a < 0$ and $b_n \to +\infty$. To show that $a_n b_n \to -\infty$, let $M < 0$
and choose $N$ so that
\[ a_n < a/2 \quad\text{and}\quad b_n > 2M/a \quad\text{for all } n \ge N. \]
For such $n$,
\[ -a_n b_n > (-a/2)(2M/a) = -M, \]
hence $a_n b_n < M$.
2.1.12 Example. To find
\[ \lim_{n} \frac{\sqrt{4n^6 - 3n^2 + 5}}{2n^3 + 7n + 3}, \]
divide the numerator and denominator of the general term $a_n$ by $n^3$, the
highest power of $n$ occurring in the denominator, to obtain
\[ a_n = \frac{\sqrt{4 - 3/n^4 + 5/n^6}}{2 + 7/n^2 + 3/n^3}. \]
The quotients in the numerator and denominator tend to 0, hence, by 2.1.11,
$a_n \to \sqrt{4}/2 = 1$.
♦
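A quick numerical look at Example 2.1.12 shows the terms approaching 1, in line with the limit computed from 2.1.11. The snippet below is a rough floating-point illustration with an illustrative function name, not a proof.

```python
import math

def a(n: int) -> float:
    """General term sqrt(4n^6 - 3n^2 + 5) / (2n^3 + 7n + 3) of Example 2.1.12."""
    return math.sqrt(4 * n**6 - 3 * n**2 + 5) / (2 * n**3 + 7 * n + 3)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # The distance to the limit 1 should shrink as n grows.
        print(n, a(n), abs(a(n) - 1))
```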
Exercises
1. Let $a, b \in \mathbb{R}$. Find a closed formula for the $n$th term $a_n$ of the sequences
(a)S $a, b, a, b, \ldots$
(b) $a, a, b, b, a, a, \ldots$
(c) $a, a, a, b, b, b, a, a, a, \ldots$
(d) $a, b, a, c, a, b, a, c, \ldots$
(e) $1, 2, 3, 4, 1, 2, 3, 4, \ldots$
2. Find a recursive formula for the sequence $a, b, a, b, \ldots$
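3. Use the $\varepsilon$, $N$ definition of limit to prove that
(a) $\lim_n \dfrac{4n - 1}{2n + 7} = 2$.
(b)S $\lim_n \dfrac{2n^2 - n}{n^2 + 3} = 2$.
(c) $\lim_n \dfrac{5\sqrt{n} + 7}{3\sqrt{n} + 2} = \dfrac{5}{3}$.
(d) $\lim_n \dfrac{n - 1}{\sqrt{n + 1}} = +\infty$.
(e)S $\lim_n \left(2 + \dfrac{1}{n}\right)^3 = 8$.
(f) $\lim_n \sqrt{\dfrac{n + 2}{n + 1}} = 1$.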
4. Prove rigorously that the sequence {(−1)n n/(n + 1)} has no limit.
5.S Find $\lim_n \sin(n!\,r\pi)$ for $r \in \mathbb{Q}$.
6. Find $\lim_n \dfrac{1}{n}\left(n + \dfrac{1}{n}\right)^p$ for all $p \in \mathbb{R}$.
7.S Let {an } be contained in a finite set A. Prove that if an → a, then there
exists an index N such that an = a for all n ≥ N . In particular, a ∈ A.
8. Find $\lim_n b_n$ if
(a)S $a_n \to a$ and $3a_n + 2b_n \to c$.
(b) $a_n \to 2$ and $3a_n b_n + 5a_n^2 - 2b_n \to 1$.
9. Let $k \in \mathbb{N}$ and $a, b > 0$. Evaluate $\lim_n a_n$ if $a_n =$
(a) $\dfrac{2n + 1}{(n^k + 3n + 1)^{1/k}}$.
(b) $\left(\dfrac{an - 1}{bn + 1}\right)^{1/2}$.
(c) $\sqrt{n^2 + kn} - n$.
(d)S $\sqrt{an + b\sqrt{n}} - \sqrt{an}$.
(e)S $\dfrac{(n + k)!}{n!\,(n + k)^k}$.
(f) $n^k\left(\sqrt{a^2 + n^{-k}} - a\right)$.
(g)S $n\left[(a - 1/n)^k - a^k\right]$.
(h) $n\left[1 - (1 - a/n)^{1/k}\right]$.
(i) $(1 - 1/2)(1 - 1/3) \cdots (1 - 1/n)$.
(j) $\sum_{j=1}^{n} (n^2 + j)^{-1}$.
(k)S $(1 - 1/2^2)(1 - 1/3^2) \cdots (1 - 1/n^2)$.
(l) $\sum_{j=1}^{n} (n^k + j)^{-1/k}$, $k > 1$.
10. Let {an } be bounded and bn → 0. Prove that an bn → 0.
11.S Let $a_n \to a \in \mathbb{R}$, $b_n \to b \in \mathbb{R}$, and $r > 0$ such that $|a_n - b_n| \le r$ for all $n$.
Prove that $|a - b| \le r$.
12. Prove that if $n a_n \to a \in \mathbb{R}$, then $\sqrt{n}\, a_n \to 0$. Show that the converse is
false.
13. Let $a_n \ge 0$ for all $n$ and $a_n \to a$. Prove that $a_n^{1/k} \to a^{1/k}$, $k \in \mathbb{N}$.
14. Let $r > 0$ and $k \in \mathbb{N}$. Prove in each case that $a_n \to 1$:
(a)S $a_n = r^{1/n}$.
(b) $a_n = n^{1/n}$.
(c) $a_n = (r + n^k)^{1/n}$.
(d) $a_n = [\sin(1/n)]^{1/n}$.
15. Prove that $a_n \to a$ iff $a_n^+ \to a^+$ and $a_n^- \to a^-$. (See 1.3.7.)
16. Let $m \in \mathbb{N}$ and $r \in (-1, 1)$. Prove that $\lim_n n^m r^n = 0$.
17.S Let $0 < r < 1$, $a_n > 0$, and $a_{n+1}/a_n < r$ for all $n$. Prove that $a_n \to 0$.
Construct a sequence $\{a_n\}$ such that $a_n > 0$ and $a_{n+1}/a_n < 1$ for all $n$
but $a_n \not\to 0$.
18. Suppose that $a_n \to a \in \mathbb{R}$. Prove that
\[ \lim_{n} (a_1 + \cdots + a_n)/n = a. \]
Is the converse true?
19.S Let $a_n \to a \in \mathbb{R}$ and let $a_n \ge a$ for all $n$. Prove that
\[ \lim_{n} \min\{a_1, \ldots, a_n\} = a. \]
Does $\min\{a_1, \ldots, a_n\} \to a$ imply that $a_n \to a$?
20. Show that if $n^{-1} a_n \to 0$, then $n^{-1} \max\{a_1, \ldots, a_n\} \to 0$. Prove that the
converse holds if $\{a_n\}$ is bounded below. Give an example to show that
the converse is not generally true.
21. Let $0 < x_1 \le \cdots \le x_k$. Prove that
\[ \lim_{n} (x_1^n + \cdots + x_k^n)^{1/n} = x_k. \]
22.S Let $f(x)$ be any real-valued function on $\mathbb{R}$ such that $f(x) - x$ is bounded
for all $x$ (for example, $f(x) = \lfloor x \rfloor$). Use Exercise 1.5.4 to prove that
(a) $(1/n^2) \sum_{j=1}^{n} f(jx) \to x/2$.
(b) $(1/n^3) \sum_{j=1}^{n} f(j^2 x) \to x/3$.
23. Let $a_0, a_1 > 0$ and $a_n = \sqrt{a_{n-1} a_{n-2}}$, $n \ge 2$. Find $\lim_n a_n$.
24. Let $k \in \mathbb{N}$ and let $\{a_n\}$ be a sequence such that $a_{n+k} - a_n \to c \in \mathbb{R}$.
Prove that $a_n/n \to c/k$. Suggestion. Consider first the case $k = 1$ to get
the general idea.
2.2 Monotone Sequences
2.2.1 Definition. A sequence {an } in R is said to be increasing (strictly
increasing) if an ≤ an+1 (an < an+1 ) for all n. Decreasing and strictly decreasing sequences are defined analogously. A sequence that is either increasing or
decreasing is called monotone. If {an } is increasing (decreasing), we write an ↑
( an ↓). If an ↑ (an ↓) and an → a ∈ R, we write an ↑ a (an ↓ a).
♦
2.2.2 Monotone Sequence Theorem. If {an } is increasing (decreasing),
then an ↑ supk ak (an ↓ inf k ak ). In particular, every bounded monotone
sequence converges in R.
Proof. Assume {an } is increasing and let r < supk ak . By the approximation
property of suprema, r < aN ≤ supk ak for some N . Since {an } is increasing,
r < an ≤ supk ak for all n ≥ N . Therefore, an ↑ supk ak . The proof for the
decreasing case is similar.
2.2.3 Example. Let $0 < a < x_1, y_1 < b := a + 1$ and define $\{x_n\}$ and $\{y_n\}$
recursively by
\[ x_{n+1} = a + \sqrt{|x_n - a|} \quad\text{and}\quad y_{n+1} = b - \sqrt{|b - y_n|}. \]
By Exercise 1.5.1, $\{x_n\}$ is strictly increasing, $\{y_n\}$ is strictly decreasing, and
$a < x_n, y_n < b$ for all $n$. By 2.2.2, $x_n \uparrow x$ and $y_n \downarrow y$ for some $x, y \in \mathbb{R}$. To
find $x$, let $n \to \infty$ in the equation $x_{n+1} = a + \sqrt{x_n - a}$ to obtain $x = a + \sqrt{x - a}$.
This has solutions $x = a$ and $x = b$. Since $\{x_n\}$ is increasing, $x = b$. Similarly,
$y = a$.
♦
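The behaviour described in Example 2.2.3 can be watched numerically by iterating the two recursions. The sketch below uses floating-point arithmetic and sample values of $a$, $x_1$, $y_1$ chosen purely for illustration; the function name is likewise an assumption.

```python
import math

def iterate(a: float, x1: float, y1: float, steps: int):
    """Return the lists [x_1, ..., x_{steps+1}] and [y_1, ..., y_{steps+1}]
    generated by the recursions of Example 2.2.3, with b = a + 1."""
    b = a + 1
    xs, ys = [x1], [y1]
    for _ in range(steps):
        xs.append(a + math.sqrt(abs(xs[-1] - a)))
        ys.append(b - math.sqrt(abs(b - ys[-1])))
    return xs, ys

if __name__ == "__main__":
    xs, ys = iterate(a=2.0, x1=2.1, y1=2.9, steps=20)
    print(xs[-1], ys[-1])  # x_n climbs toward b = 3, y_n descends toward a = 2
```

Writing $d_n = x_n - a$, the recursion reads $d_{n+1} = \sqrt{d_n}$, so $d_n = d_1^{1/2^{n-1}} \to 1$, which is another way to see that $x_n \to a + 1 = b$.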
2.2.4 Example. We use the monotone sequence theorem to show that the
sequence $\{(1 + 1/n)^n\}$ converges. By the binomial theorem (1.5.5) and the
inequality $k! \ge 2^{k-1}$ (easily established by induction),
\begin{align*}
(1 + 1/n)^n &= \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} \\
&= 2 + \sum_{k=2}^{n} (1 - 1/n)(1 - 2/n) \cdots (1 - (k - 1)/n)/k! \\
&\le 2 + \sum_{k=2}^{n} 1/2^{k-1}.
\end{align*}
Since the sum in the last inequality is $\le 1$, $\{(1 + 1/n)^n\}$ is bounded above
by 3. Now let $m = n + 1$. Then
\[ 1 - k/m \ge 1 - k/n \ge 0, \quad k = 1, \ldots, n - 1, \]
hence
\begin{align*}
(1 + 1/m)^m &\ge 2 + \sum_{k=2}^{n} (1 - 1/m)(1 - 2/m) \cdots (1 - (k - 1)/m)/k! \\
&> 2 + \sum_{k=2}^{n} (1 - 1/n)(1 - 2/n) \cdots (1 - (k - 1)/n)/k! \\
&= (1 + 1/n)^n.
\end{align*}
Thus $\{(1 + 1/n)^n\}$ is increasing. By 2.2.2, the sequence has a limit in $\mathbb{R}$, which
is denoted by the letter $e$:
\[ e := \lim_{n} (1 + 1/n)^n = 2.71828182845905\ldots \]
♦
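Both properties established in Example 2.2.4, monotonicity and the bound 3, can be confirmed for the first several terms in exact rational arithmetic. The following sketch uses an illustrative function name; the floating-point value printed at the end gives a feel for how slowly the sequence approaches $e$.

```python
from fractions import Fraction

def term(n: int) -> Fraction:
    """The exact value of (1 + 1/n)^n as a rational number."""
    return (1 + Fraction(1, n)) ** n

if __name__ == "__main__":
    terms = [term(n) for n in range(1, 101)]
    print(all(s < t for s, t in zip(terms, terms[1:])))  # strictly increasing
    print(all(t < 3 for t in terms))                     # bounded above by 3
    print(float(term(1000)))                             # still visibly below e
```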
Exercises
1.S Let $0 < a < 1 < b$. Prove that $a^{1/n} \uparrow 1$ and $b^{1/n} \downarrow 1$.
2. Let $a_n = a^n/n^k$ and $b_n = b^n/n^k$, where $0 < a < 1 < b$ and $k \in \mathbb{Z}^+$.
Prove that $\{a_n\}$ is strictly decreasing and that $\{b_n\}$ is eventually strictly
increasing.
3.S Let
\[ a_n = \frac{na}{1 + n^2 b}, \quad a, b > 0. \]
Prove: $a_n \downarrow 0$ (eventually) and $n a_n \uparrow a/b$.
4. Let $x_n > 0$ and $x_n \uparrow x$. Prove that $(x_1^n + \cdots + x_n^n)^{1/n} \to x$.
5. Prove that for any nonempty set $A$ of real numbers there exist sequences
$\{a_n\}$ and $\{b_n\}$ in $A$ such that $a_n \uparrow \sup A$ and $b_n \downarrow \inf A$.
6. Let $\{a_n\}$ be monotone and set $b_n := (a_1 + a_2 + \cdots + a_n)/n$. Prove that
$\{b_n\}$ is monotone. (Compare with Exercise 2.1.18.)
7.S Define $a_1 = 1$ and $a_n = 1 + (1 + a_{n-1})^{-1}$. Find $\lim_n a_n$ by first showing
that $1 \le a_n \le 2$, $\{a_{2n}\}$ is decreasing, and $\{a_{2n+1}\}$ is increasing.
8. Let $r > 0$, $a_0 = \sqrt{r}$, and $a_n = \sqrt{r + a_{n-1}}$, $n \ge 1$. Find $\lim_n a_n$.
9.S Let $r > 0$, $a_1 > 0$ and define
\[ a_n = \tfrac{1}{2}(a_{n-1} + r/a_{n-1}), \quad n > 1. \]
Show that $a_n \ge a_{n+1} \ge \sqrt{r}$ and find $\lim_n a_n$.
10. Prove that $e = \lim_n (1 - 1/n)^{-n}$.
11. Let $0 < x_0 < y_0$ and define
\[ x_{n+1} = \sqrt{x_n y_n} \quad\text{and}\quad y_{n+1} = (x_n + y_n)/2. \]
Prove that $0 < x_n < x_{n+1} < y_{n+1} < y_n$ and that $\lim_n x_n = \lim_n y_n$.
2.3 Subsequences and Cauchy Sequences
2.3.1 Definition. A subsequence of a sequence $\{a_n\}_{n=1}^{\infty}$ in $\mathbb{R}$ is a sequence
$\{a_{n_k}\}_{k=1}^{\infty}$, where the indices satisfy $1 \le n_1 < n_2 < \cdots$. The limit in $\overline{\mathbb{R}}$ of a
subsequence is called a cluster point of $\{a_n\}$.
♦
For example, in the following sequence the underlined terms define the
beginning of a subsequence {ank } with n1 = 3, n2 = 4, n3 = 6, etc.
a1 , a2 , a3 , a4 , a5 , a6 , a7 , a8 , a9 , a10 , a12 , a13 , a14 , a15 , . . .
Note that the indices nk of a subsequence satisfy nk ≥ k.
Examples. (a) The sequence
\[ \left\{1 - (-1)^{\lfloor (n-1)/2 \rfloor}\right\} = \{0, 0, 2, 2, 0, 0, 2, 2, \ldots\} \]
is a subsequence of
\[ \{1 - (-1)^n\} = \{2, 0, 2, 0, \ldots\}, \]
which has cluster points 0 and 2.
(b) The sequence $\{n \sin(n\pi/2)\}$ has cluster points 0 and $\pm\infty$.
(c) Let $\{r_1, r_2, \ldots\}$ be an arbitrary enumeration of the rational numbers
(see Appendix A). Then every real number is a cluster point of $\{r_n\}$. Indeed,
since every interval of the form $(x - 1/n, x + 1/n)$ contains infinitely many
terms of the sequence, we may choose $n_1 \ge 1$ such that $|x - r_{n_1}| < 1$, $n_2 > n_1$
such that $|x - r_{n_2}| < 1/2$, etc. In this way we may construct a subsequence
inductively such that $|x - r_{n_k}| < 1/k$ for all $k$, hence $r_{n_k} \to x$.
♦
Notation. It is occasionally convenient to use the following alternate method
to describe a subsequence: If we set $b_k = a_{n_k}$ and then change the index in
$\{b_k\}_{k=1}^{\infty}$ to $n$, then $\{b_n\}$ may be used to denote the subsequence $\{a_{n_k}\}$. This
provides a convenient way to denote a subsequence of a subsequence. In this
regard, note that if $\{c_n\}$ is a subsequence of $\{b_n\}$ and $\{b_n\}$ is a subsequence
of $\{a_n\}$, then $\{c_n\}$ is a subsequence of $\{a_n\}$.
The following proposition shows that a convergent sequence has a single
cluster point.
2.3.2 Proposition. If $\{a_n\}$ is a sequence in $\mathbb{R}$ and $a_n \to a \in \overline{\mathbb{R}}$, then $a_{n_k} \to a$
for any subsequence $\{a_{n_k}\}$ of $\{a_n\}$.
Proof. We prove the proposition for the case $a \in \mathbb{R}$ and leave the other cases
for the reader. Given $\varepsilon > 0$, choose $N$ such that $|a_n - a| < \varepsilon$ for all $n \ge N$.
Since $n_k \ge k$, $|a_{n_k} - a| < \varepsilon$ for all $k \ge N$. Therefore, $a_{n_k} \to a$.
2.3.3 Example. We calculate $\lim_n (1 + 1/n^2)^{3n^2 + 5}$ by writing
\[ \left(1 + \frac{1}{n^2}\right)^{3n^2 + 5} = \left[\left(1 + \frac{1}{n^2}\right)^{n^2}\right]^3 \left(1 + \frac{1}{n^2}\right)^5. \]
The term in the square brackets is a subsequence of $(1 + 1/n)^n$ and hence
converges to $e$ (see 2.2.4). It follows that $(1 + 1/n^2)^{3n^2 + 5} \to e^3$.
♦
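The limit in Example 2.3.3 can be compared numerically with $e^3$. The snippet below is a floating-point illustration only, with an illustrative function name; it also evaluates the bracketed factor separately to show that it behaves like the subsequence of $(1 + 1/n)^n$ with indices $n^2$.

```python
import math

def a(n: int) -> float:
    """General term (1 + 1/n^2)^(3n^2 + 5) of Example 2.3.3."""
    return (1 + 1 / n**2) ** (3 * n**2 + 5)

def bracket(n: int) -> float:
    """The bracketed factor (1 + 1/n^2)^(n^2), i.e. (1 + 1/m)^m at m = n^2."""
    return (1 + 1 / n**2) ** (n**2)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, bracket(n), a(n), math.exp(3))
```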
The following result will have important consequences in later chapters.
2.3.4 Bolzano–Weierstrass Theorem. Every bounded sequence in R has
a convergent subsequence.
Proof. The proof is based on the observation that if a union of two sets contains
infinitely many terms of a sequence, then at least one of the sets must contain
infinitely many of the terms of the sequence.
Let $\{a_n\}$ be a bounded sequence, say $c_0 \le a_n \le d_0$ for all $n$. Bisect the
interval $I_0 := [c_0, d_0]$. By the preceding observation, one of the resulting
subintervals, call it $I_1$, contains infinitely many terms of the sequence. Choose
one such term, say $a_{n_1}$. Now bisect $I_1$. Again, one of the resulting subintervals,
call it $I_2$, contains infinitely many terms of the sequence. Choose one such term
$a_{n_2}$ with $n_2 > n_1$. By repeating this procedure, we produce a subsequence
$\{a_{n_k}\}_{k=1}^{\infty}$ of $\{a_n\}$ and a sequence of intervals $I_k = [c_k, d_k]$, $k = 0, 1, \ldots$, such
that
\[ c_0 \le c_{k-1} \le c_k \le a_{n_k} \le d_k \le d_{k-1} \le d_0, \quad\text{and}\quad d_{k+1} - c_{k+1} = \tfrac{1}{2}(d_k - c_k). \]
Since $\{c_k\}$ and $\{d_k\}$ are monotone and bounded, $c_k \to c$ and $d_k \to d$ for some
$c, d \in \mathbb{R}$. Since $d_k - c_k = 2^{-k}(d_0 - c_0) \to 0$, $c = d$. By the squeeze principle,
$a_{n_k} \to c$.
FIGURE 2.3: Interval halving process.
The Bolzano–Weierstrass theorem may be extended as follows:
2.3.5 Theorem. Every sequence in $\mathbb{R}$ has a subsequence that converges in $\overline{\mathbb{R}}$.
Proof. If $\{a_n\}$ is bounded, then the Bolzano–Weierstrass theorem applies. Suppose that $\{a_n\}$ is unbounded above. Then for each $k \in \mathbb{N}$ there exist infinitely
many indices $n$ such that $a_n > k$. We may then construct a subsequence $\{a_{n_k}\}$
with $a_{n_k} > k$ for all $k$, so $a_{n_k} \to +\infty$. The case where $\{a_n\}$ is unbounded below is similar.
2.3.6 Corollary. A sequence $\{a_n\}$ in $\mathbb{R}$ has a limit in $\overline{\mathbb{R}}$ iff it has exactly one
cluster point in $\overline{\mathbb{R}}$.
Proof. The necessity is 2.3.2. For the sufficiency, suppose that $\{a_n\}$ has exactly
one cluster point $a \in \overline{\mathbb{R}}$. Consider first the case $a = +\infty$. We claim that
$a_n \to +\infty$. If not, then there exists $M \in \mathbb{R}$ such that $a_n \le M$ for infinitely
many $n$, hence there exists a subsequence $\{a_{n_k}\}$ of $\{a_n\}$ with $a_{n_k} \le M$ for
all $k$. By 2.3.5, $\{a_{n_k}\}$ has a cluster point $b \in \overline{\mathbb{R}}$. But $b \le M < a$, so $\{a_n\}$ has
more than one cluster point, a contradiction. Therefore, $a_n \to +\infty$, as claimed.
The case $a = -\infty$ is treated similarly.
Now suppose $a \in \mathbb{R}$. Then $a_n \to a$. If not, then there exists $\varepsilon > 0$ such that
$|a_n - a| \ge \varepsilon$ for infinitely many $n$, so there is a subsequence $\{a_{n_k}\}$ of $\{a_n\}$
with $|a_{n_k} - a| \ge \varepsilon$ for all $k$. By 2.3.5, $\{a_{n_k}\}$ has a cluster point $b$ in $\overline{\mathbb{R}}$. But
then either $b = \pm\infty$ or $|b - a| \ge \varepsilon$, so again $\{a_n\}$ has more than one cluster point.
2.3.7 Definition. A sequence $\{a_n\}$ is said to be Cauchy if for each $\varepsilon > 0$
there exists an index $N$ such that $|a_n - a_m| < \varepsilon$ for all $m, n \ge N$. We express
this condition by writing
\[ \lim_{m,n} (a_n - a_m) = 0. \]
♦
The definition asserts that the terms of a Cauchy sequence get closer to
one another. Thus the following result is not surprising.
2.3.8 Proposition. Every convergent sequence is Cauchy.
Proof. Let an → a. Given ε > 0, choose N such that |an − a| < ε/2 for all
n ≥ N . Then for n, m ≥ N ,
|an − am | = |(an − a) + (a − am )| ≤ |an − a| + |am − a| < ε.
It is of fundamental importance that the converse of 2.3.8 is true. To prove
this, we need the following lemma.
2.3.9 Lemma. A Cauchy sequence is bounded.
Proof. Let {an } be a Cauchy sequence. Choose N such that |an − am | < 1 for
all m, n ≥ N . Then |an | ≤ |an − aN | + |aN | < 1 + |aN | for all n ≥ N , hence
|an | ≤ max{1 + |aN |, |a1 |, |a2 |, . . . , |aN −1 |} for all n.
2.3.10 Cauchy Criterion. Every Cauchy sequence in R converges.
Proof. By 2.3.9 and the Bolzano–Weierstrass theorem, a Cauchy sequence
{an } has a convergent subsequence, say ank → a ∈ R. We claim that an → a.
Let ε > 0 and choose N such that |an −am | < ε for all m, n ≥ N . In particular,
|an − ank | < ε for n, k ≥ N . Fixing n ≥ N and letting k → ∞ in the last
inequality yields |an − a| ≤ ε, verifying the claim.
Exercises
1. Find all cluster points of {an }, where an =
nπ 2n + 1
2n + 1
2 nπ
n
S
n
. (b) (−1)
.
(a) (−1)
sin
cos2
4n + 3
3
n+5
4
(c)S $(-1)^{\lfloor n/3 \rfloor}(1 + 1/n)^2 + (-1)^{\lfloor n/4 \rfloor}(2 + 1/n)^2 + (-1)^{\lfloor n/5 \rfloor}(3 + 1/n)^2$.
(d) $(-1)^n r_n + r_{2n}$, where $r_k$ is the remainder on division of $k$ by 3.
2. Construct a sequence with precisely the cluster points 1, 2, 3, +∞.
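3. Let $k \in \mathbb{N}$. Use the fact that $\lim_n (1 + 1/n)^n = e$ (2.2.4) to find $\lim_n a_n$
for $a_n =$
(a) $\left(1 + \dfrac{1}{kn}\right)^n$.
(b) $\left(1 + \dfrac{1}{k + n}\right)^n$.
(c) $\left(\dfrac{1}{k} + \dfrac{1}{n}\right)^n$.
(d)S $\left(1 + \dfrac{1}{2n + k}\right)^{kn}$.
(e) $\left(1 + \dfrac{1}{3n^3 + 5}\right)^{7n^3 - 4}$.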
4. Let {an } and {bn } be bounded sequences. Show that there exist convergent subsequences of {an } and {bn } with the same indices.
5.S Prove that a sequence contained in a finite set has a constant subsequence.
6. Let −∞ < an < r ≤ +∞ with an → r. Show that {an } has a strictly
increasing subsequence.
7. Show that every sequence of distinct real numbers has a strictly monotone
subsequence.
8.S Let k ∈ N and suppose that the series $\sum_{n=1}^{\infty} |a_{n+k} - a_n|$ converges (see
Chapter 6). Prove that {an } has a convergent subsequence.
9. Let a0 , a1 be arbitrary and define an+1 = (an + an−1 )/2, n ≥ 1. Show
directly that {an } is a Cauchy sequence. (Its limit may be found from
Exercise 1.5.14.)
10.S Let 0 < p ≤ q and an > 0 for all n. Set $b_n = a_n^q/(1 + a_n^p)$. Show that
an → 0 iff bn → 0. Is the assertion true if 0 < q < p?
11. Let I be an open interval and let {an } have the property that each open
subinterval J of I contains an for infinitely many n. Prove that every
point of I is a cluster point of {an }. Give an example of such a sequence.
12. Suppose that the cluster points of {an } form a sequence {bn }. Show that
every cluster point b of {bn } is a cluster point of {an }. Hint. Choose a
subsequence {bnk } such that |bnk − b| < 1/k.
2.4 Limits Inferior and Superior
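For an arbitrary sequence {an } in R, define
$$\underline{a}_n = \inf_{k\ge n} a_k \quad\text{and}\quad \overline{a}_n = \sup_{k\ge n} a_k, \qquad n = 1, 2, \ldots.$$
Then $\{\underline{a}_n\}$ is increasing and $\{\overline{a}_n\}$ is decreasing, hence the limits
$$\liminf_n a_n := \lim_n \underline{a}_n \quad\text{and}\quad \limsup_n a_n := \lim_n \overline{a}_n$$
exist in $\overline{\mathbb{R}}$. These limits are called, respectively, the limit inferior and limit
superior of the sequence {an }.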
FIGURE 2.4: $\underline{a} = \liminf_n a_n$ and $\overline{a} = \limsup_n a_n$.
Clearly,
$$\underline{a}_n \le a_n \le \overline{a}_n \quad\text{and}\quad \liminf_n a_n \le \limsup_n a_n.$$
Furthermore, if {an } is unbounded below, then $\liminf_n a_n = -\infty$, and if {an }
is unbounded above, then $\limsup_n a_n = +\infty$.
Here are some examples:
(a) $\liminf_n \dfrac{(-1)^n n}{n+1} = -1$, $\quad \limsup_n \dfrac{(-1)^n n}{n+1} = 1$,
(b) $\liminf_n\, [(-1)^n + 1]^n = 0$, $\quad \limsup_n\, [(-1)^n + 1]^n = +\infty$,
(c) $\liminf_n \sin n = -1$, $\quad \limsup_n \sin n = 1$.
Example (c) follows from Example 8.3.10. (See Exercise 8.3.15.)
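Example (a) can be checked against the definition by computing tail infima and suprema over a finite window. This is only a sketch under stated assumptions: the helper `tail_inf_sup` and the cutoff `horizon` are hypothetical, and truncating the tail at `horizon` replaces the true infimum and supremum by approximations.

```python
def tail_inf_sup(a, n, horizon):
    """Finite-window stand-ins for inf_{k>=n} a_k and sup_{k>=n} a_k."""
    window = [a(k) for k in range(n, horizon + 1)]
    return min(window), max(window)

def a(n):
    # Example (a): a_n = (-1)^n n / (n + 1)
    return (-1) ** n * n / (n + 1)

for n in (1, 10, 100, 1000):
    lo, hi = tail_inf_sup(a, n, horizon=10_000)
    print(n, lo, hi)
```

For every starting index the windowed values stay close to −1 and 1, consistent with the stated limit inferior and limit superior.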
The next proposition shows that lim sup and lim inf have properties similar
to those of limits. Their usefulness derives from this fact together with the
property that, in contrast to ordinary limits, the limits inferior and superior
of a sequence always exist (in $\overline{\mathbb{R}}$).
2.4.1 Proposition. For any sequences {an } and {bn } in R,
(a) lim supn (−an ) = − lim inf n an .
(b) lim supn (an + bn ) ≤ lim supn an + lim supn bn if the right side is defined.
(c) lim inf n (an + bn ) ≥ lim inf n an + lim inf n bn if the right side is defined.
(d) lim supn can = c lim supn an , if c ≥ 0.
(e) lim inf n can = c lim inf n an , if c ≥ 0.
(f) lim supn (an bn ) ≤ (lim supn an )(lim supn bn ) if an , bn ≥ 0 for all n.
(g) lim inf n (an bn ) ≥ (lim inf n an )(lim inf n bn ) if an , bn ≥ 0 for all n.
(h) lim inf n an ≤ lim inf n bn , lim supn an ≤ lim supn bn if an ≤ bn for all n.
Proof. Part (a) follows from supk≥n (−ak ) = − inf k≥n ak and part (h) is a
direct consequence of the definitions. Part (b) follows by taking limits in the
inequality
$$\sup_{k\ge n}(a_k + b_k) \le \sup_{k\ge n} a_k + \sup_{k\ge n} b_k.$$
Part (f) follows similarly from
$$\sup_{k\ge n} a_k b_k \le \sup_{k\ge n} a_k \, \sup_{k\ge n} b_k.$$
Part (d) is a consequence of
$$\sup_{k\ge n} c\,a_k = c \sup_{k\ge n} a_k, \qquad c \ge 0.$$
Parts (c), (e), and (g) are proved in a similar manner.
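That the inequality in part (b) can be strict is easy to see for periodic sequences, where the limit superior is simply the maximum over one period. The sketch below is illustrative; the helper `limsup_periodic` and the choice an = (−1)^n, bn = (−1)^(n+1) are assumptions, not taken from the text.

```python
def limsup_periodic(a, period):
    """Exact lim sup of a sequence with a(n + period) == a(n): max over one period."""
    return max(a(n) for n in range(1, period + 1))

a = lambda n: (-1) ** n
b = lambda n: (-1) ** (n + 1)
s = lambda n: a(n) + b(n)      # identically zero

left = limsup_periodic(s, 2)                            # lim sup (a_n + b_n)
right = limsup_periodic(a, 2) + limsup_periodic(b, 2)   # lim sup a_n + lim sup b_n
print(left, right)
```

Here the left side is 0 and the right side is 2.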
2.4.2 Theorem. For any sequence {an } in R, the extended real numbers
$\underline{a} := \liminf_n a_n$ and $\overline{a} := \limsup_n a_n$ are cluster points of {an }. All other
cluster points of {an } in $\overline{\mathbb{R}}$ lie between these.
Proof. We leave the case $\overline{a} = -\infty$ to the reader. Assume then that $\overline{a} > -\infty$
and recall that $\overline{a}_n \downarrow \overline{a}$. Choose a strictly increasing sequence of real numbers
$r_n$ tending to $\overline{a}$. Since $r_1 < \overline{a}_1$, by the approximation property of suprema
there exists an index $n_1$ such that $r_1 < a_{n_1} \le \overline{a}_{n_1}$. Similarly, since $r_2 < \overline{a}_{n_1+1}$,
there exists an index $n_2 > n_1$ such that $r_2 < a_{n_2} \le \overline{a}_{n_2}$. In this way we may
construct inductively a subsequence $\{a_{n_k}\}$ such that $r_k < a_{n_k} \le \overline{a}_{n_k}$. By the
squeeze principle, $a_{n_k} \to \overline{a}$. The limit inferior case is treated similarly.
Now let $\{a_{n_k}\}$ be any subsequence of {an } with $a_{n_k} \to a \in \overline{\mathbb{R}}$. Then, for
any m and k ≥ m, $\underline{a}_m \le a_{n_k} \le \overline{a}_m$. Letting k → ∞ yields $\underline{a}_m \le a \le \overline{a}_m$.
Letting m → ∞ we obtain $\underline{a} \le a \le \overline{a}$.
Since $\lim_n a_n$ exists in $\overline{\mathbb{R}}$ iff {an } has exactly one cluster point (2.3.6), the
following result is immediate.
2.4.3 Corollary. For any sequence {an } in R, $\lim_n a_n$ exists in $\overline{\mathbb{R}}$ iff
$\liminf_n a_n = \limsup_n a_n$. In this case, all three limits are equal.
Exercises
1. Find lim inf n an and lim supn an if
(a)S $a_n = \dfrac{(-1)^n 5n + 7}{3n + 5}$.
(b) an = nsin(nπ/2) + (1/n) cos(n).
(c)S $a_n = (-1)^{\lfloor n/3\rfloor}(1 + 1/n)^2 + (-1)^{\lfloor n/4\rfloor}(2 + 1/n)^2 + (-1)^{\lfloor n/5\rfloor}(3 + 1/n)^2$.
(d) an = (2nrn + 1)/(nr2n + 1), rk the remainder on division of k ∈ N by 3.
(e) an = (−1)rn xn + (−1)rn+1 yn + (−1)rn+2 zn , where xn → x, yn → y,
zn → z, and x < y < z.
(f) a1 = 1, a2n = ra2n−1 , a2n+1 = ar + a2n , 0 < r < 1, a > 0.
(g) $a_n = 2^n + 2^{-n} + (-1)^n (2^n - 2^{-n})$.
(h)S an = [3n cos (nπ/4) + 2]/[2n sin (nπ/4) + 3].
2. Show by example that the inequalities (b), (c), (f), and (g) in 2.4.1 may
be strict.
3.S Let an > 0 for all n. Prove that
$$\limsup_n\, (1/a_n) = 1/\liminf_n a_n \quad\text{and}\quad \liminf_n\, (1/a_n) = 1/\limsup_n a_n.$$
4. Let {an } be bounded and nonnegative and let r ∈ Q+ . Prove that
$$\limsup_n a_n^r = \Bigl(\limsup_n a_n\Bigr)^r \quad\text{and}\quad \liminf_n a_n^r = \Bigl(\liminf_n a_n\Bigr)^r.$$
5.S Show that for any subsequence {ank } of {an },
$$\limsup_{k\to\infty} a_{n_k} \le \limsup_n a_n \quad\text{and}\quad \liminf_{k\to\infty} a_{n_k} \ge \liminf_n a_n.$$
6. Let bn → b ∈ (0, +∞). Prove that
$$\limsup_n\, (a_n + b_n) = b + \limsup_n a_n \quad\text{and}\quad \liminf_n\, (a_n + b_n) = b + \liminf_n a_n.$$
7.S Let an ≥ 0 for all n and bn → b ∈ (0, +∞). Prove that
$$\limsup_n a_n b_n = b \limsup_n a_n \quad\text{and}\quad \liminf_n a_n b_n = b \liminf_n a_n.$$
8. Prove that
$$\Bigl|\limsup_n a_n\Bigr| \le \limsup_n |a_n| \quad\text{and}\quad \Bigl|\liminf_n a_n\Bigr| \ge \liminf_n |a_n|.$$
Show by examples that the inequalities may be strict.
9. Let {nk } be a sequence of positive integers that contains each positive
integer exactly once. Show that
$$\limsup_k a_{n_k} = \limsup_n a_n \quad\text{and}\quad \liminf_k a_{n_k} = \liminf_n a_n.$$
In particular, if an → a, then $a_{n_k} \to a$. Note: $\{a_{n_k}\}_{k=1}^{\infty}$ is not necessarily
a subsequence of {an }.
10.S Let an → a > 0 and $\liminf_n b_n > 0$. If $b_n^2 - a_n b_n - 6a_n^2 \to 0$, prove that
$\limsup_{n\to\infty} b_n \le 3a$.
11. Prove that for any sequence {an },
$$\liminf_n a_n \le \liminf_n \frac{1}{n}\sum_{j=1}^{n} a_j \le \limsup_n \frac{1}{n}\sum_{j=1}^{n} a_j \le \limsup_n a_n.$$
12.S ⇓1 Let an > 0 for all n. Prove that
$$\liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n} \le \limsup_n \frac{a_{n+1}}{a_n}.$$
Use this to calculate $\lim_n n/(n!)^{1/n}$.
1 This exercise will be used in 7.4.2.
Chapter 3
Limits and Continuity on R
3.1 Limit of a Function
The definition of limit of a function f given in 3.1.3 below is a precise
formulation of the intuitive idea that as x gets closer to a number a, the function
value f (x) approaches some fixed number L. This notion is conveniently
described in terms of certain subsets of R called neighborhoods.
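3.1.1 Definition. Let r > 0. A neighborhood of $a \in \overline{\mathbb{R}}$ is an interval of the
form
$$N(a) = N_r(a) := \begin{cases} (a - r,\ a + r) & \text{if } a \in \mathbb{R},\\ (r,\ +\infty) & \text{if } a = +\infty,\\ (-\infty,\ -r) & \text{if } a = -\infty. \end{cases}$$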
If a ∈ R, the set N (a) \ {a} := (a − r, a) ∪ (a, a + r) is called a deleted
neighborhood of a.
♦
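The three cases of Definition 3.1.1 can be written out as a small function. This is a sketch for illustration only: the names `neighborhood` and `contains` are hypothetical, and extended real numbers are modeled by Python floats with `math.inf` standing in for ±∞.

```python
import math

def neighborhood(a, r):
    """Endpoints of the open interval N_r(a) for an extended real a and r > 0."""
    if r <= 0:
        raise ValueError("r must be positive")
    if a == math.inf:
        return (r, math.inf)
    if a == -math.inf:
        return (-math.inf, -r)
    return (a - r, a + r)

def contains(interval, x):
    """Membership in an open interval given by its endpoints."""
    lo, hi = interval
    return lo < x < hi

print(neighborhood(2, 0.5), neighborhood(math.inf, 3), neighborhood(-math.inf, 3))
```

For instance, `neighborhood(0, 1)` and `neighborhood(math.inf, 1)` share no points, an instance of the separation property noted after the definition.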
The reader should verify that the intersection of finitely many neighborhoods of a is again a neighborhood of a and that neighborhoods separate points,
that is, if a ≠ b are extended real numbers, then there exist neighborhoods
N (a) and N (b) such that N (a) ∩ N (b) = ∅.
3.1.2 Definition. An accumulation point of a nonempty set E of real numbers
is an extended real number a such that every neighborhood of a contains a
point of E not equal to a. A member of E that is not an accumulation point
of E is called an isolated point of E.
♦
For example, the set of accumulation points of E := Q ∩ (−1, 0) ∪ N is
[−1, 0] ∪ {+∞}, and the set of isolated points of E is N.
The following definition of limit is sufficiently general to include the usual
limits encountered in calculus: one-sided limits, two-sided limits, limits at
infinity, and infinite limits.
3.1.3 Definition. Let E ⊆ R, let f be a real-valued function whose domain
includes E, and let $a, L \in \overline{\mathbb{R}}$, where either a ∈ E or a is an accumulation point
of E (not necessarily in the domain of f ). We write
$$L = \lim_{\substack{x\to a\\ x\in E}} f(x)$$
if, for each neighborhood N (L) of L, there is a neighborhood N (a) of a such
that
x ∈ E ∩ N (a) implies f (x) ∈ N (L).
(3.1)
In this case we say that f (x) approaches L as x tends to a along E.
♦
The restrictions on a guarantee that E ∩ N (a) ≠ ∅, hence condition (3.1)
is not vacuously satisfied. Note that if a ∈ E is not an accumulation point of
E, then it must be an isolated point, in which case lim{x→a, x∈E} f (x) trivially
exists and equals f (a).
We single out the following important special cases, where a ∈ R and s > 0:
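(a) left-hand limit: $\lim_{x\to a^-} f(x) := \lim_{\substack{x\to a\\ x\in E}} f(x)$, $E = (a - s,\ a)$.
(b) right-hand limit: $\lim_{x\to a^+} f(x) := \lim_{\substack{x\to a\\ x\in E}} f(x)$, $E = (a,\ a + s)$.
(c) two-sided limit: $\lim_{x\to a} f(x) := \lim_{\substack{x\to a\\ x\in E}} f(x)$, $E = (a - s,\ a + s) \setminus \{a\}$.
(d) limit at +∞: $\lim_{x\to+\infty} f(x) := \lim_{\substack{x\to+\infty\\ x\in E}} f(x)$, $E = (s,\ +\infty)$.
(e) limit at −∞: $\lim_{x\to-\infty} f(x) := \lim_{\substack{x\to-\infty\\ x\in E}} f(x)$, $E = (-\infty,\ -s)$.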
FIGURE 3.1: δ works for ε1 but not for ε2 .
Applying the definition of limit to the cases (a)–(e) above produces the
standard limit definitions encountered in beginning calculus. For example, if
the limit L in (c) is finite, then, in the context of (c), 3.1.3 asserts that for
each ε > 0 there exists a δ ∈ (0, s) such that
|f (x) − L| < ε for all x with 0 < |x − a| < δ.
(See Figure 3.1.) For (e) and the case L = +∞, the definition asserts that for
each M ∈ R there exists an r > s such that
f (x) > M for all x with x < −r.
The advantage of having a single definition of limit is that it provides a
unified theory and allows for economy of thought and presentation.
As in the case of sequences, limits of functions are unique. Indeed, if
L1 ≠ L2 both satisfy criterion (3.1), then, given neighborhoods N (L1 ) and
N (L2 ), there would exist a neighborhood N (a) such that
x ∈ E ∩ N (a) ⇒ f (x) ∈ N (L1 ) ∩ N (L2 ).
However, N (L1 ) and N (L2 ) may be taken to be disjoint, and choosing any
x ∈ E ∩ N (a) then results in a contradiction.
In any discussion of limits we shall tacitly assume that a and E satisfy the
conditions of 3.1.3.
3.1.4 Example. Let f (x) = (3x + 2)/(2x − 1). Then
(a) limx→∞ f (x) = limx→−∞ f (x) = 3/2.
(b) limx→a f (x) = f (a), (a ≠ 1/2).
(c) limx→1/2+ f (x) = +∞.
(d) limx→1/2− f (x) = −∞.
To verify (a), let ε > 0 and note that the quantity
$$\left| f(x) - \frac{3}{2} \right| = \frac{7}{2\,|2x - 1|}$$
will be less than ε if $|2x - 1| > 7/(2\varepsilon)$. The latter inequality is satisfied if either
$x > (1 + 7/(2\varepsilon))/2$ or $x < (1 - 7/(2\varepsilon))/2$.
For (b), observe first that
$$|f(x) - f(a)| = \left| \frac{3x + 2}{2x - 1} - \frac{3a + 2}{2a - 1} \right| = \frac{7|x - a|}{|2x - 1|\,|2a - 1|}.$$
By the triangle inequality,
$$|2x - 1| \ge |2a - 1| - |(2a - 1) - (2x - 1)| = |2a - 1| - 2|a - x|.$$
Hence if |a − x| < |2a − 1|/4, then |2x − 1| > |2a − 1|/2 and therefore
$$|f(x) - f(a)| < \frac{14|x - a|}{|2a - 1|^2}.$$
It follows that |f (x) − f (a)| will be less than ε if we require additionally that
$|x - a| < \varepsilon|2a - 1|^2/14$. Therefore, any positive $\delta < \min\{|2a - 1|/4,\ \varepsilon|2a - 1|^2/14\}$ will
satisfy criterion (3.1).
To prove (c), note that if 0 < x − 1/2 < 1/2, then x > 0, hence
$$f(x) = \frac{1}{2} \cdot \frac{3x + 2}{x - 1/2} > \frac{1}{x - 1/2}.$$
Given M > 2, let δ = 1/M . Then 0 < x − 1/2 < δ ⇒ 1/(x − 1/2) > M ⇒
f (x) > M , proving (c). The proof of part (d) is similar.
♦
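The choice of δ in part (b) can be spot-checked numerically. The sketch below is an assumption-laden illustration, not part of the example: the names `delta_for` and `check`, the factor 0.9, and the sample points are all arbitrary, and sampling finitely many x can support but not prove the estimate.

```python
def f(x):
    return (3 * x + 2) / (2 * x - 1)

def delta_for(a, eps):
    """A delta strictly below min{|2a-1|/4, eps*|2a-1|^2/14}, as in part (b)."""
    bound = min(abs(2 * a - 1) / 4, eps * abs(2 * a - 1) ** 2 / 14)
    return 0.9 * bound  # any positive value below the bound would do

def check(a, eps, samples=1000):
    """Test |f(x) - f(a)| < eps at sample points x with |x - a| < delta."""
    d = delta_for(a, eps)
    xs = [a - d + 2 * d * (i + 0.5) / samples for i in range(samples)]
    return all(abs(f(x) - f(a)) < eps for x in xs)

print(check(2.0, 0.1), check(0.6, 0.01), check(-3.0, 1e-3))
```

Note how small δ must be when a is close to 1/2: the bound scales with |2a − 1|², reflecting the steepness of f near its pole.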
3.1.5 Theorem. Let f be a function with domain D and let E = E1 ∪E2 ⊆ D.
Suppose that one of the following holds:
• a is an accumulation point of both E1 and E2 .
• a is an isolated point of both E1 and E2 .
• a is an accumulation point of E1 and an isolated point of E2 .
• a is an accumulation point of E2 and an isolated point of E1 .
Then lim{x→a, x∈E} f (x) exists in R iff both limits lim{x→a, x∈E1 } f (x) and
lim{x→a, x∈E2 } f (x) exist in R and are equal. In this case all three limits are
equal.
Proof. If a is an accumulation point of E1 or E2 , then a is an accumulation
point of E. If a is an isolated point of E1 and E2 , then a is an isolated point
of E. This shows that in each case lim{x→a, x∈E} f (x) is at least defined.
Now suppose that L := lim{x→a, x∈E} f (x) exists. Then (3.1) holds for
E, so it must hold for each of the subsets E1 and E2 as well. Therefore,
lim{x→a, x∈E1 } f (x) and lim{x→a, x∈E2 } f (x) exist and equal L.
Conversely, suppose that the limits along E1 and E2 exist and equal K ∈ R.
Then, given a neighborhood N (K), there exists a neighborhood N (a) of a
such that x ∈ Ej ∩ N (a) implies f (x) ∈ N (K), j = 1, 2. Thus x ∈ E ∩ N (a)
implies f (x) ∈ N (K), proving that lim{x→a, x∈E} f (x) = K.
3.1.6 Example. Take E1 = N and E2 = (0, 2). Then 2 is an isolated point of
E1 and an accumulation point of E2 , and lim{x→2, x∈E1 } f (x) = f (2). Therefore,
by the theorem, lim{x→2, x∈E} f (x) exists iff limx→2− f (x) = f (2).
♦
3.1.7 Example. (Dirichlet function). Let
$$d(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q},\\ 0 & \text{otherwise.} \end{cases}$$
Since lim{x→a, x∈Q} d(x) = 1 and lim{x→a, x∈I} d(x) = 0, limx→a d(x) cannot
exist.
♦
The following is an immediate consequence of 3.1.5.
3.1.8 Corollary. limx→a f (x) exists iff limx→a− f (x) and limx→a+ f (x) exist
and are equal. In this case all three limits are equal.
The next result shows that function limits may be characterized in terms
of limits of sequences.
3.1.9 Sequential Characterization of Limit. Let f be a function whose
domain includes E and let a ∈ R be an accumulation point of E. Then
lim{x→a, x∈E} f (x) exists in R and equals L iff f (an ) → L for all sequences
{an } in E with an → a.
Proof. Assume that lim{x→a, x∈E} f (x) = L and let {an } be a sequence in
E with an → a. Given a neighborhood N (L), choose N (a) as in (3.1) and
then choose N such that an ∈ N (a) for all n ≥ N . For such n, f (an ) ∈ N (L).
Therefore, f (an ) → L.
Now suppose that lim{x→a, x∈E} f (x) ≠ L. Then there is a neighborhood N (L)
of L such that (3.1) fails for each neighborhood N (a) of a. Consider the case
a, L ∈ R. Then N (L) is of the form (L − r, L + r) for some r > 0. Taking
N (a) = (a − 1/n, a + 1/n) we see that for each n ∈ N there exists an ∈ E
with |an − a| < 1/n and |f (an ) − L| ≥ r. Thus an → a and f (an ) ↛ L, so the
sequential condition does not hold. A similar argument works if either a or L
is infinite.
3.1.10 Example. Let $f(x) = \sin(1/x)$, $x \ne 0$. Since $f\bigl(1/(n\pi)\bigr) = 0$ and
$f\bigl(2/((4n + 1)\pi)\bigr) = 1$, $\lim_{x\to 0^+} f(x)$ does not exist.
♦
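The two sequences used in this example are easy to tabulate. The following sketch simply evaluates f along them in floating point; the variable names `x_n` and `y_n` are introduced here for illustration, and the computed values equal 0 and 1 only up to rounding error.

```python
import math

def f(x):
    return math.sin(1 / x)

for n in (1, 10, 100, 1000):
    x_n = 1 / (n * math.pi)              # f(x_n) = sin(n*pi) = 0
    y_n = 2 / ((4 * n + 1) * math.pi)    # f(y_n) = sin(2n*pi + pi/2) = 1
    print(n, x_n, f(x_n), y_n, f(y_n))
```

Both sequences tend to 0 from the right while the function values stay near 0 and 1 respectively, so by 3.1.9 no right-hand limit can exist.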
3.1.11 Cauchy Criterion for Functions. Let a ∈ R be an accumulation point
of E. Then lim{x→a, x∈E} f (x) exists in R iff given ε > 0 there exists δ > 0
such that |f (x) − f (y)| < ε for all x, y ∈ E with |x − a| < δ and |y − a| < δ.
Proof. If L := lim{x→a, x∈E} f (x) exists in R, then, given ε > 0, there exists
δ > 0 such that |f (x) − L| < ε/2 for all x ∈ E with |x − a| < δ, and the triangle
inequality shows that the ε, δ-condition of the theorem holds.
Conversely, assume that the ε, δ-condition holds and let {an } be a sequence
in E with an → a. Since |an − a| < δ for all sufficiently large n, {f (an )} is a
Cauchy sequence and so converges to some real number L. Suppose {bn } is
another sequence in E converging to a. Then |an − a| < δ and |bn − a| < δ for
all sufficiently large n so, by the ε, δ-condition, f (an ) − f (bn ) → 0.
Therefore, f (bn ) → L. By 3.1.9, lim{x→a, x∈E} f (x) = L.
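3.1.12 Theorem. Let f and g be functions whose domains include E and let $a \in \overline{\mathbb{R}}$
be an accumulation point of E. Then the following properties hold in the sense
that if the expressions on the right exist in $\overline{\mathbb{R}}$, then the limits on the left exist
and the equality holds.
(a) $\lim_{\substack{x\to a\\ x\in E}} [s f(x) + t g(x)] = s \lim_{\substack{x\to a\\ x\in E}} f(x) + t \lim_{\substack{x\to a\\ x\in E}} g(x)$, s, t ∈ R.
(b) $\lim_{\substack{x\to a\\ x\in E}} f(x) g(x) = \lim_{\substack{x\to a\\ x\in E}} f(x) \, \lim_{\substack{x\to a\\ x\in E}} g(x)$.
(c) $\lim_{\substack{x\to a\\ x\in E}} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x\to a,\ x\in E} f(x)}{\lim_{x\to a,\ x\in E} g(x)}$ if $\lim_{\substack{x\to a\\ x\in E}} g(x) \ne 0$.
(d) $\lim_{\substack{x\to a\\ x\in E}} |f(x)| = \Bigl| \lim_{\substack{x\to a\\ x\in E}} f(x) \Bigr|$.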
Proof. The assertions follow immediately from 2.1.11 and 3.1.9. However, it
is instructive to formulate direct proofs. We do this for the finite version of
part (c). Assume that the limits
$$L := \lim_{\substack{x\to a\\ x\in E}} f(x) \quad\text{and}\quad M := \lim_{\substack{x\to a\\ x\in E}} g(x) \ne 0$$
are finite and let ε > 0. Choose N1 (a) such that
$$|g(x) - M| < |M|/2 \quad\text{for all } x \in E \cap N_1(a).$$
For such x, |g(x)| ≥ |M | − |M − g(x)| ≥ |M |/2, hence
$$\begin{aligned}
\left| \frac{f(x)}{g(x)} - \frac{L}{M} \right| &= \frac{|M f(x) - L g(x)|}{|M g(x)|} \\
&= \frac{|M (f(x) - L) + L (M - g(x))|}{|M g(x)|} \\
&\le \frac{|M|\,|f(x) - L| + |L|\,|M - g(x)|}{|M|^2/2} \\
&= \frac{2}{|M|}\,|f(x) - L| + \frac{2|L|}{M^2}\,|M - g(x)| \\
&\le K \bigl( |f(x) - L| + |M - g(x)| \bigr), \qquad K := 2/|M| + 2|L|/M^2.
\end{aligned}$$
Now choose N2 (a) so that |f (x) − L| < ε/(2K) and |M − g(x)| < ε/(2K) for all
x ∈ E ∩ N2 (a). Then x ∈ E ∩ N1 (a) ∩ N2 (a) ⇒ |f (x)/g(x) − L/M | < ε.
3.1.13 Example. (Limits of rational functions at infinity). Let f (x) =
P (x)/Q(x), where
$$P(x) = a_0 + a_1 x + \cdots + a_n x^n \quad\text{and}\quad Q(x) = b_0 + b_1 x + \cdots + b_m x^m, \qquad a_n, b_m \ne 0.$$
For any a, c ∈ R, $\lim_{x\to c} a = a$ and $\lim_{x\to c} x = c$, hence, by 3.1.12,
$\lim_{x\to c} f(x) = f(c)$, provided Q(c) ≠ 0. To calculate limits at +∞, write
$$f(x) = \frac{a_0 x^{-n} + a_1 x^{-n+1} + \cdots + a_{n-1} x^{-1} + a_n}{b_0 x^{-m} + b_1 x^{-m+1} + \cdots + b_{m-1} x^{-1} + b_m}\; x^{n-m}.$$
Since $\lim_{x\to+\infty} x^{-j} = 0$ for j ∈ N, we see that
$$\lim_{x\to+\infty} f(x) = \begin{cases} 0 & \text{if } m > n,\\ a_n/b_n & \text{if } m = n, \text{ and}\\ \pm\infty & \text{if } m < n, \end{cases}$$
where the sign in the last case is that of $a_n/b_m$.
♦
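The case analysis above translates directly into a small routine. This is a sketch for illustration: the function name `limit_at_plus_infinity` and the convention of passing coefficients lowest degree first are assumptions, and the caller is trusted to supply nonzero leading coefficients.

```python
from fractions import Fraction
import math

def limit_at_plus_infinity(p, q):
    """Limit of P/Q at +infinity, where p = [a0, ..., an], q = [b0, ..., bm],
    with an != 0 and bm != 0."""
    n, m = len(p) - 1, len(q) - 1
    lead = Fraction(p[-1]) / Fraction(q[-1])   # a_n / b_m
    if m > n:
        return Fraction(0)
    if m == n:
        return lead
    return math.inf if lead > 0 else -math.inf

print(limit_at_plus_infinity([2, 0, 3], [1, 0, -6]))    # (2 + 3x^2)/(1 - 6x^2)
print(limit_at_plus_infinity([1, 1], [1, 0, 1]))        # (1 + x)/(1 + x^2)
print(limit_at_plus_infinity([0, 0, 0, -5], [7, 2]))    # -5x^3/(7 + 2x)
```

Only the degrees and the two leading coefficients enter the result, which mirrors the factorization of f(x) used in the example.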
3.1.14 Theorem. Let f be a function whose domain includes E and let
a ∈ R be an accumulation point of E. If f (x) ≤ g(x) for all x ∈ E and if
L := lim{x→a, x∈E} f (x) and M := lim{x→a, x∈E} g(x) exist in R, then L ≤ M .
Proof. Assume, for a contradiction, that M < L. Choose any K ∈ (M, L)
and then choose neighborhoods N (L) ⊆ (K, +∞) and N (M ) ⊆ (−∞, K) (see
Figure 3.2). Then there exists a neighborhood N (a) such that f (x) ∈ N (L)
and g(x) ∈ N (M ) for all x ∈ E ∩ N (a). But for any such x, g(x) < f (x),
contradicting the hypothesis.
FIGURE 3.2: L can’t be greater than M .
3.1.15 Theorem (Squeeze principle for functions). Let f be a function
whose domain contains E and let a ∈ R be an accumulation point of E.
If f (x) ≤ g(x) ≤ h(x) for all x ∈ E and if the limits lim{x→a, x∈E} f (x) and
lim{x→a, x∈E} h(x) exist in R and are equal, then lim{x→a, x∈E} g(x) exists in
R and all three limits are equal.
Proof. Let L denote the common limit. For the case L ∈ R, given ε > 0 there
exists a neighborhood N (a) of a such that
L − ε ≤ f (x) ≤ g(x) ≤ h(x) < L + ε for all x ∈ E ∩ N (a).
The cases L = ±∞ are proved similarly.
3.1.16 Definitions. A function f is said to be strictly increasing on E if
f (x) < f (y) for all x, y ∈ E with x < y. Similarly, f is increasing on E if
f (x) ≤ f (y) for all x, y ∈ E with x < y. The notions of strictly decreasing and
decreasing are defined analogously. If f is either (strictly) increasing or (strictly)
decreasing on E, then f is said to be (strictly) monotone on E. Finally, f is
bounded on E if there exists a real number M such that |f (x)| ≤ M for all
x ∈ E.
♦
The reader should compare the following theorem with the monotone
sequence theorem (2.2.2).
3.1.17 Monotone Function Theorem. Let a, b, c ∈ R with a < c < b.
If f is monotone on (a, b), then $\lim_{x\to a^+} f(x)$, $\lim_{x\to b^-} f(x)$ exist in $\overline{\mathbb{R}}$ and
$\lim_{x\to c^-} f(x)$, $\lim_{x\to c^+} f(x)$ exist in R.
Proof. Assume that f is increasing. Let $s := \sup_{a<x<b} f(x)$. By the approximation property of suprema, for each r < s there exists $x_r \in (a, b)$ such
that $r < f(x_r) \le s$. Since f is increasing, r < f (x) ≤ s for all $x \in (x_r, b)$.
Therefore
$$\lim_{x\to b^-} f(x) = \sup_{a<x<b} f(x).$$
Similarly,
$$\lim_{x\to a^+} f(x) = \inf_{a<x<b} f(x).$$
The assertion regarding limits at c is proved in the same way noting that, since
f is bounded in a neighborhood of c, the limits are finite.
Exercises
1.S Show that all points of a finite set E are isolated.
2. Find all the accumulation points of the set {2/m − 3/n : m, n ∈ N}.
3. Prove that if E has an accumulation point a, then there exists a sequence
{an } of distinct points in E such that an → a.
4. Determine the limit and then use the definition to prove your result:
√
x+1
x+3
2
S
(a) lim (3x − 2x + 1).
(b) lim
.
(c) lim √
.
x→1
x→1 3x + 1
x→4
x−1
√
√
x+1
.
x2 + x − x.
(f) lim √
(d)S lim (x2 + x).
(e) lim
x→+∞
x→−∞
x→−∞
x−1
5. Discuss the possible values of $\lim_{x\to-\infty} f(x)$ for the function f in 3.1.13.
6. Use the results of this section together with standard trig identities and
the limits
$$\lim_{x\to 0} \cos x = \lim_{x\to 0} \frac{\sin x}{x} = 1$$
to evaluate the following limits without using l’Hospital’s rule.
sin(x − 1)
sin (2x)
sin x2
√ .
.
(b) lim
.
(c)
lim
x→1
x→0 sin (3x)
x→0+
x3 − 1
x
√
sin x
1 − cos (3x)
3x + 2 sin (5x)
lim
.
(e) lim
.
(f) lim
.
2
x→0
x→0
x→0+
x2
2x − 3 sin (7x)
sin x
tan(x2 − 3x + 2)
1 − sec(3/x)
sin(1/x)
√ .
lim
. (h) lim
. (i) lim
x→+∞ 1 − sec(5/x)
x→+∞ sin(1/ x)
x→1 tan(x2 − 4x + 3)
(a) S lim
(d) S
(g) S
7. Evaluate the following limits without using l’Hospital’s rule, where
m, n ∈ N, and a, b, c, d > 0.
1
(x − 15)
1
1
1
(a) lim
1+ 2
.
(b) S lim
−√
.
x→0 x x + 1
x→3 x − 3
(x + x)
x+1
xn − an
.
x→a xm − am
√
√
x−b− b
(e)
lim
.
x→2b+
x − 2b
p
√
(x + 1)(x + 2) − 2
(g) lim
.
x→0
x
x + bxc
(i)
lim−
.
x→n x − bxc
(c)
lim
x1/n − b1/n
.
x→b x1/m − b1/m
√
√
b+x− b−x
√
(f) S lim √
.
x→0
c+x− c−x
√
√
ax + b − b
√ .
(h) S lim √
x→0
cx + d − d
x + bxc
(j)
lim+
.
x→n x − bxc
(d)
lim
8. Let aj ∈ R and ε > 0. Show that there exists a ∈ R such that
$$a t^n + \sum_{j=1}^{n-1} a_j t^j > -\varepsilon \quad\text{for all } t \ge 0.$$
9.S Define
$$f(x) = \begin{cases} 4x^2 + 2x - 11 & \text{if } x \text{ is rational,}\\ 3x^2 + x - 5 & \text{if } x \text{ is irrational.} \end{cases}$$
Show that limx→a f (x) exists for precisely two values of a.
10. Let c ∈ R. Show that limx→c f (x) exists in R iff for each strictly increasing
sequence {an } converging to c and each strictly decreasing sequence {bn }
converging to c, limn f (an ) = limn f (bn ).
11. Evaluate the following limits without using l’Hospital’s rule, where
a, b, c > 0, m, n ∈ N.
√
baxc
b
(a) S lim
.
(b) lim
x
x.
x→+∞ x
x→+∞
x
bax + sin xc
bmxc + n
.
(d) lim
.
(c) lim
x→+∞ bbx + cos xc
x→+∞ bnxc + m
√
√
√ i
√ h√
ax + b − b
√ .
(e) S lim
ax bx + c − bx .
(f) lim √
x→+∞
x→+∞
cx + d − d
q
q
h
p
√
√
√ i
(g) lim
ax + bx − ax + cx . (h) lim x ax2 + b − x2 a .
x→+∞
x→+∞
12. ⇓1 Let $f(x) = x^n$, n ∈ N. Use Exercise 1.2.4 to prove that f is strictly
increasing on [0, +∞), and if n is odd then f is strictly increasing on R.
1 This exercise will be used in 3.4.5.
13.S Suppose that f is monotone on (0, +∞) and an → +∞. Prove that
limx→∞ f (x) = limn f (an ).
*3.2 Limits Inferior and Superior
Let f be a function whose domain includes E ⊆ R, and let $a \in \overline{\mathbb{R}}$ be either
a member of E or an accumulation point of E (not necessarily in the domain
of f ). For each neighborhood $N_r(a)$ define
$$\underline{f}(r) = \inf\{f(x) : x \in E \cap N_r(a)\} \quad\text{and}\quad \overline{f}(r) = \sup\{f(x) : x \in E \cap N_r(a)\}.$$
If a ∈ R, then $\underline{f}(r)$ increases and $\overline{f}(r)$ decreases as r ↓ 0. Similarly, if a = ±∞,
then $\underline{f}(r)$ increases and $\overline{f}(r)$ decreases as r ↑ +∞. By the monotone function
theorem, in the first case $\underline{f}$ and $\overline{f}$ have limits as r → 0+ and in the second
case $\underline{f}$ and $\overline{f}$ have limits as r → +∞. We then define
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) := \lim_{r\to 0^+} \underline{f}(r) \quad\text{and}\quad \limsup_{\substack{x\to a\\ x\in E}} f(x) := \lim_{r\to 0^+} \overline{f}(r) \quad\text{if } a \in \mathbb{R},$$
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) := \lim_{r\to+\infty} \underline{f}(r) \quad\text{and}\quad \limsup_{\substack{x\to a\\ x\in E}} f(x) := \lim_{r\to+\infty} \overline{f}(r) \quad\text{if } a = \pm\infty.$$
The extended real numbers
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) \quad\text{and}\quad \limsup_{\substack{x\to a\\ x\in E}} f(x)$$
are called, respectively, the limit inferior and limit superior of f , as x tends
to a along E. Clearly,
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) \le \limsup_{\substack{x\to a\\ x\in E}} f(x).$$
The above definitions include all the standard formulations of limit superior
and limit inferior. For these, we shall use notation analogous to that for limits.
For example, taking E = (0, +∞), we have
$$\liminf_{x\to 0^+} \sin(1/x) = -1 \quad\text{and}\quad \limsup_{x\to+\infty} \sin x = 1.$$
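The first of these values can be explored by sampling. The sketch below is only illustrative: the helper `lower_upper` and the uniform sampling grid are assumptions, and a finite sample gives approximations to the infimum and supremum of f over (0, r), not their exact values.

```python
import math

def lower_upper(f, r, samples=100_000):
    """Sampled stand-ins for the inf and sup of f over the interval (0, r)."""
    xs = [r * (i + 1) / (samples + 1) for i in range(samples)]
    values = [f(x) for x in xs]
    return min(values), max(values)

f = lambda x: math.sin(1 / x)
for r in (1.0, 0.1, 0.01):
    print(r, lower_upper(f, r))
```

However small r is taken, the sampled infimum and supremum remain close to −1 and 1, which is what the definitions predict for sin(1/x) as x → 0+.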
Limits superior and inferior of a function have properties similar to ordinary
limits but, unlike the latter, always exist (in $\overline{\mathbb{R}}$).
The following theorem establishes a connection with limits superior and
inferior of sequences.
3.2.1 Theorem. Let $a \in \overline{\mathbb{R}}$ be an accumulation point of E. Then there exist
sequences {xn } and {yn } in E tending to a such that
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) = \lim_n f(x_n) \quad\text{and}\quad \limsup_{\substack{x\to a\\ x\in E}} f(x) = \lim_n f(y_n). \tag{3.2}$$
Moreover, if A denotes the set of all sequences {an } in E tending to a, then
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) = \inf_{\{a_n\}\in A} \liminf_n f(a_n) \quad\text{and}\quad \limsup_{\substack{x\to a\\ x\in E}} f(x) = \sup_{\{a_n\}\in A} \limsup_n f(a_n).$$
Proof. We prove only the lim sup case. Let
$$s := \limsup_{\substack{x\to a\\ x\in E}} f(x) \quad\text{and}\quad t := \sup_{\{a_n\}\in A} \limsup_{n\to\infty} f(a_n).$$
We assume that both a and s are finite; the proofs for the other cases are
similar.
Choose a sequence $r_n \to 0^+$ such that $\overline{f}(r_n) \to s$. For each n, use the
approximation property of supremum to obtain a point $y_n \in E \cap N_{r_n}(a)$ such
that $\overline{f}(r_n) - 1/n < f(y_n) \le \overline{f}(r_n)$. Then yn → a and f (yn ) → s. This proves
the limsup part of (3.2) and also shows that s ≤ t.
For the reverse inequality, let {an } be in A and let $L := \limsup_{n\to\infty} f(a_n)$.
Let r > 0 and choose N ∈ N such that $a_n \in N_r(a)$ for all n ≥ N . For such n,
$f(a_n) \le \overline{f}(r)$, hence $L \le \overline{f}(r)$. Letting r → 0+ yields L ≤ s. Therefore,
t ≤ s.
3.2.2 Corollary. Let $a \in \overline{\mathbb{R}}$ be an accumulation point of E. Then
lim{x→a, x∈E} f (x) exists in $\overline{\mathbb{R}}$ iff
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) = \limsup_{\substack{x\to a\\ x\in E}} f(x), \tag{3.3}$$
in which case the three limits are equal.
Proof. Let (3.3) hold and denote the common value by L. By the theorem,
for any sequence {an } in E tending to a, lim supn f (an ) ≤ L ≤ lim inf n f (an ),
hence limn f (an ) exists and equals L. From the sequential characterization of
limit, lim{x→a, x∈E} f (x) exists and equals L.
Conversely, assume K := lim{x→a, x∈E} f (x) exists in R and let an ∈ E
with an → a. Then K = limn f (an ) hence lim supn f (an ) = lim inf n f (an ). By
the theorem, (3.3) holds and the common value is K.
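3.2.3 Proposition. Let f and g have domain containing E ⊆ R and let $a \in \overline{\mathbb{R}}$
be an accumulation point of E. Then
(a) $\limsup_{\substack{x\to a\\ x\in E}} [-f(x)] = -\liminf_{\substack{x\to a\\ x\in E}} f(x)$.
(b) $\limsup_{\substack{x\to a\\ x\in E}} [f(x) + g(x)] \le \limsup_{\substack{x\to a\\ x\in E}} f(x) + \limsup_{\substack{x\to a\\ x\in E}} g(x)$.
(c) $\liminf_{\substack{x\to a\\ x\in E}} [f(x) + g(x)] \ge \liminf_{\substack{x\to a\\ x\in E}} f(x) + \liminf_{\substack{x\to a\\ x\in E}} g(x)$.
(d) $\limsup_{\substack{x\to a\\ x\in E}} c f(x) = c \limsup_{\substack{x\to a\\ x\in E}} f(x)$, $\ \liminf_{\substack{x\to a\\ x\in E}} c f(x) = c \liminf_{\substack{x\to a\\ x\in E}} f(x)$ if c ≥ 0.
(e) $\limsup_{\substack{x\to a\\ x\in E}} f(x) g(x) \le \Bigl( \limsup_{\substack{x\to a\\ x\in E}} f(x) \Bigr) \Bigl( \limsup_{\substack{x\to a\\ x\in E}} g(x) \Bigr)$ if f, g ≥ 0.
(f) $\liminf_{\substack{x\to a\\ x\in E}} f(x) g(x) \ge \Bigl( \liminf_{\substack{x\to a\\ x\in E}} f(x) \Bigr) \Bigl( \liminf_{\substack{x\to a\\ x\in E}} g(x) \Bigr)$ if f, g ≥ 0.
(g) $\liminf_{\substack{x\to a\\ x\in E}} f(x) \le \liminf_{\substack{x\to a\\ x\in E}} g(x)$ and $\limsup_{\substack{x\to a\\ x\in E}} f(x) \le \limsup_{\substack{x\to a\\ x\in E}} g(x)$ if f ≤ g.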
Proof. In (b) and (c) it is assumed that the expressions on the right are defined
in R. The assertions of the proposition may be proved directly or by using
2.4.1 together with 3.2.1. We illustrate the latter approach in proving (b). The
proofs of the remaining parts are similar.
Let {an } be an arbitrary sequence in E tending to a. By part (b) of 2.4.1
and by 3.2.1,
$$\begin{aligned}
\limsup_n\, [f(a_n) + g(a_n)] &\le \limsup_n f(a_n) + \limsup_n g(a_n) \\
&\le \limsup_{\substack{x\to a\\ x\in E}} f(x) + \limsup_{\substack{x\to a\\ x\in E}} g(x).
\end{aligned}$$
Taking the supremum over all such sequences {an } yields (b).
Exercises
1. Calculate lim inf x→+∞ f (x) and lim supx→+∞ f (x) for each of the functions f (x) below, where r(k) is the remainder on division of k ∈ N by 3
and d(x) is the Dirichlet function (3.1.7).
(−1)bxc 3x + 7
(−1)bxc b2xc + 3
S
.
(c)
.
b3xc + 2
2x + 5(−1)bxc
4r(bxc)x + 5
ex + e−x
. (e) S
(d)
. (f) cos x sin x.
7r(bxc)x + 1
(−1)bxc (ex − e−x )
1
3 sin x
(g)
.
(h) sin x + cos x.
(i) S
.
2 + cos x
2 + sin x
π
2
1
(j) cos
sin x . (k)
.
(l) (−1)bxc−bx c .
2
3
1 + sin x
(a) S sin[xd(x)].
(b)
2. Prove the remaining parts of 3.2.3. Give examples to show that the
inequalities in (b), (c), (e), and (f) may be strict.
3.S Let E = E1 ∪ E2 , where a is an accumulation point of both E1 and E2 .
Prove:
$$\limsup_{\substack{x\to a\\ x\in E}} f(x) = \max\Bigl\{ \limsup_{\substack{x\to a\\ x\in E_1}} f(x),\ \limsup_{\substack{x\to a\\ x\in E_2}} f(x) \Bigr\}.$$
$$\liminf_{\substack{x\to a\\ x\in E}} f(x) = \min\Bigl\{ \liminf_{\substack{x\to a\\ x\in E_1}} f(x),\ \liminf_{\substack{x\to a\\ x\in E_2}} f(x) \Bigr\}.$$
Conclude in particular that
$$\limsup_{x\to a} f(x) = \max\Bigl\{ \limsup_{x\to a^-} f(x),\ \limsup_{x\to a^+} f(x) \Bigr\}.$$
$$\liminf_{x\to a} f(x) = \min\Bigl\{ \liminf_{x\to a^-} f(x),\ \liminf_{x\to a^+} f(x) \Bigr\}.$$
4.S Let f (x) > 0 for all x ∈ E. Prove that
$$\limsup_{\substack{x\to a\\ x\in E}} \frac{1}{f(x)} = \frac{1}{\liminf_{x\to a,\ x\in E} f(x)}.$$
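5. Prove that
$$\Bigl| \limsup_{\substack{x\to a\\ x\in E}} f(x) \Bigr| \le \limsup_{\substack{x\to a\\ x\in E}} |f(x)| \quad\text{and}\quad \Bigl| \liminf_{\substack{x\to a\\ x\in E}} f(x) \Bigr| \ge \liminf_{\substack{x\to a\\ x\in E}} |f(x)|.$$
Show by examples that the inequalities may be strict.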
6. Let f : [a, b) → R and $g(x) = \sup_{a\le t\le x} f(t)$, a ≤ x < b. Prove that
$g(x_0) \le \lim_{x\to x_0^+} g(x)$ for every $x_0 \in [a, b)$.
3.3 Continuous Functions
3.3.1 Definition. A function f with domain D is said to be continuous at a
point a ∈ D if lim{x→a, x∈D} f (x) = f (a); that is, for each ε > 0 there exists a
δ > 0 such that
|f (x) − f (a)| < ε for all x ∈ D with |x − a| < δ.
If f is continuous at each point of a subset E of D, then f is said to be
continuous on E. If f is continuous on D, then f is simply said to be continuous.
A point in D at which f is not continuous is called a discontinuity of f . ♦
The definition of continuity implies that any function f : D → R is
continuous at an isolated point of D. For example, if D is a finite set or a set
of integers, then every function f : D → R is continuous.
Continuity of f on E is not the same as continuity of the restriction f |E .
For example, the function on R that is identically equal to one on Z and zero
elsewhere is not continuous on Z, yet its restriction to Z is continuous (as a
function with domain Z).
From the sequential characterization of limit we have
3.3.2 Sequential Characterization of Continuity. A function f with
domain D is continuous at a ∈ D iff f (an ) → f (a) for all sequences {an } in
D with an → a.
3.3.3 Example. Let {r1 , r2 , . . .} be an enumeration of the rationals in (0, 1).
Define f on (0, 1) by f (rn ) = 1/n and f (x) = 0 if x is irrational. We use the
sequential characterization of continuity to show that f is continuous precisely
at the irrational numbers in (0, 1).
Let x ∈ (0, 1) be rational. Choose a sequence {xn } of irrational numbers
converging to x. Since f (xn ) = 0 for all n and f (x) ≠ 0, f (xn ) ↛ f (x).
Therefore, f is not continuous at any rational.
Now let x ∈ (0, 1) be irrational and let {xn } be any sequence converging
to x. If f (xn ) ↛ f (x), then there exists an N ∈ N and a subsequence {yn } of
{xn } such that f (yn ) ≥ 1/N for all n. By definition of f , yn ∈ {r1 , r2 , . . . , rN }.
But this implies that x ∈ {r1 , r2 , . . . , rN }, contradicting that x is irrational.
(For a variation of this example, see Exercise 10.)
♦
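The function in this example can be modeled on a finite initial segment of an enumeration. The sketch below is an illustration under explicit assumptions: the example allows any enumeration, and ordering by increasing denominator is merely one choice; the names `first_rationals` and `f` are hypothetical; and because only finitely many rationals are listed, unlisted rationals are (incorrectly, but harmlessly for the point being made) sent to 0.

```python
from fractions import Fraction
from math import gcd

def first_rationals(count):
    """First `count` terms of one enumeration of the rationals in (0, 1),
    ordered by increasing denominator."""
    out, q = [], 2
    while len(out) < count:
        for p in range(1, q):
            if gcd(p, q) == 1 and len(out) < count:
                out.append(Fraction(p, q))
        q += 1
    return out

def f(x, enumeration):
    """f(r_n) = 1/n for listed rationals r_n; 0 for anything not in the list."""
    for n, r in enumerate(enumeration, start=1):
        if x == r:
            return Fraction(1, n)
    return Fraction(0)

rs = first_rationals(10)
print(rs)
print([f(r, rs) for r in rs[:4]])
```

The key fact used in the proof is visible here: the points where f is at least 1/N all lie among the first N terms of the enumeration, a finite set.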
The following is an immediate consequence of 3.1.12.
3.3.4 Theorem. Let f and g be functions with domain D, let α, β ∈ R and
let a ∈ D. If f and g are continuous at a, then so are αf + βg, f g, f /g (the
last provided that g(a) ≠ 0).
3.3.5 Theorem. Let g : D → R and f : E → R with g(D) ⊆ E. If g is
continuous at a ∈ D and f is continuous at g(a), then f ◦ g is continuous at a.
Proof. Let b := g(a). Given ε > 0, choose η > 0 such that |f (y) − f (b)| < ε
for all y ∈ E with |y − b| < η. Next, choose δ > 0 such that |g(x) − b| < η for
all x ∈ D with |x − a| < δ. Then |x − a| < δ implies |f (g(x)) − f (b)| < ε.
A more succinct proof uses the sequential characterization of continuity:
$a_n \to a$ in D ⇒ $g(a_n) \to g(a)$ ⇒ $f(g(a_n)) \to f(g(a))$.
Constant functions and the function f (x) = x are clearly continuous. It
follows from 3.3.4 that polynomials and rational functions are continuous.
Continuity of trigonometric, logarithmic, and exponential functions will follow
from results in Chapter 4. Power functions $x^{\alpha} := e^{\alpha \ln x}$ are continuous as they
are compositions of continuous functions. Of course, in each case the domain
of the function must be carefully specified.
It is possible that a function is nowhere continuous. The Dirichlet function
(3.1.7) is an example. By contrast, we have
3.3.6 Theorem. A monotone function on an open interval I has at most
countably many discontinuities.
Proof. Assume without loss of generality that f is increasing on I. Let D
denote the set of discontinuities of f on I. For each t ∈ I, let
$$a_t = \lim_{x\to t^-} f(x) \quad\text{and}\quad b_t = \lim_{x\to t^+} f(x)$$
and let $I_t = (a_t, b_t)$. Clearly, $I_t \ne \emptyset$ iff t ∈ D (see Figure 3.3). Furthermore,
by monotonicity, s < t ⇒ $b_s \le a_t$. Therefore, the sets $I_t$ are pairwise disjoint.
For each t ∈ D, choose a rational number $r_t$ in $I_t$. Since the correspondence
$t \mapsto r_t$ is one-to-one and the set of rationals is countable, D is countable.
FIGURE 3.3: One-to-one correspondence between t ∈ D and rt ∈ Q.
Exercises
1.S Define
\[ f(x) = \begin{cases} mx + 3 & \text{if } x < 2, \\ 3x^2 + 7 & \text{if } x > 2. \end{cases} \]
If f is continuous at x = 2, find the values of f (2) and m.
2. Find all values of a for which the following function is continuous on R.
\[ f(x) = \begin{cases} 3x^2 + 5x - 7 & \text{if } x < a, \\ 2x^2 + 2x + 3 & \text{if } x \ge a. \end{cases} \]
3. Let f : (a, b) → R and g : (b, c) → R be continuous and suppose that
\[ \lim_{x \to b^-} f(x) = \lim_{x \to b^+} g(x). \]
Show that there exists a continuous function h : (a, c) → R such that
h = f on (a, b) and h = g on (b, c).
4.S Let g be continuous on R and let d(x) be the Dirichlet function. Show
that f (x) := g(x)d(x) is continuous precisely at the zeros of g.
5. Let f be defined on an open interval I and let c ∈ I. Show that f is
continuous at c iff for each strictly increasing sequence {an } converging to
c and each strictly decreasing sequence {bn } converging to c, f (an ) → f (c)
and f (bn ) → f (c).
6. Let f be a continuous function on [a, b] and let {an } be a sequence in
[a, b]. Prove:
(a) $f\bigl(\limsup_{n\to\infty} a_n\bigr) \le \limsup_{n\to\infty} f(a_n)$. (b) $f\bigl(\liminf_{n\to\infty} a_n\bigr) \ge \liminf_{n\to\infty} f(a_n)$.
Show that equality holds in each case if f is increasing. Give examples
to show that the inequalities may be strict.
7. Let f1 , . . . , fn be continuous at x0 . Prove that the functions
\[ M_n(x) := \max_{1 \le j \le n} f_j(x) \quad\text{and}\quad m_n(x) := \min_{1 \le j \le n} f_j(x) \]
are continuous at x0 . Give examples to show that the corresponding
result is not true for infinitely many functions, where max is replaced
by sup and min by inf.
8.S Let f : R → R be continuous at zero and satisfy f (x + y) = f (x) + f (y)
for all x, y ∈ R. Prove that f (tx) = tf (x) for all t, x ∈ R. Conclude that
f (x) = f (1)x for all x ∈ R.
9. A function f is right continuous at a if limx→a+ f (x) = f (a) and left continuous at a if limx→a− f (x) = f (a).
(a) Prove that f is continuous at a iff f is both right and left continuous
at a.
(b) Prove that the greatest integer function ⌊x⌋ is right continuous on R
but not left continuous at any integer.
(c)S Let {cn } be any sequence in R. For x ∈ R define
\[ f(x) = \sum_{n:\, c_n \le x} 2^{-n}, \]
where the notation indicates that the sum, possibly infinite, is taken over
all indices n for which cn ≤ x. (If there are no such indices, the sum
is defined to be 0.) Prove that f is right continuous everywhere. Prove
also that f is left continuous at a iff a is not equal to any cn . (Note
that, because the series $\sum_{n=1}^{\infty} 2^{-n}$ converges, the order of summation is
irrelevant (6.4.10). Thus f (x) is well-defined.)
(d) Let f be increasing on an interval I. Define g on I by
\[ g(x) = \lim_{t \to x^+} f(t) = \inf_{t > x} f(t). \]
Prove that g is increasing and right continuous on I and that g is
continuous at a iff f is continuous at a.
10. Define f : (0, 1) → R by
\[ f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational}, \\ 1/n & \text{if } x = m/n, \text{ reduced}. \end{cases} \]
Use the sequential characterization of continuity to show that f is continuous precisely at the irrational numbers in (0, 1).
11.S Let f : [0, 1] → R have the property that the limit g(x) := limt→x f (t)
exists in R for all x ∈ [0, 1]. Prove that
(a) g is continuous.
(b) f has at most countably many discontinuities.
Hint. For (a), use the sequential criterion. For (b), use ideas similar to
those used in the proof of 3.3.6.
3.4 Properties of Continuous Functions
3.4.1 Extreme Value Theorem. If f is continuous on a closed bounded
interval [a, b], then f has a maximum and a minimum; that is, there exist
xm , xM ∈ [a, b] such that
f (xm ) ≤ f (x) ≤ f (xM ) for all x ∈ [a, b].
Proof. We show first that f is bounded. Suppose, for instance, that f is
not bounded above. Then for each n ∈ N there exists an ∈ [a, b] such that
f (an ) > n. On the other hand, by the Bolzano–Weierstrass theorem, {an } has
a convergent subsequence, say ank → x0 . But then, by continuity,
nk < f (ank ) → f (x0 ) < +∞,
impossible. Thus f must be bounded above. Similarly, f is bounded below.
Now let M := sup{f (x) : x ∈ [a, b]}. By the first paragraph, M is finite.
By the approximation property for suprema, there exists a sequence xn ∈ [a, b]
such that f (xn ) → M . By the Bolzano–Weierstrass theorem again, there exists
a subsequence xnk converging to some xM ∈ [a, b]. By continuity, f (xM ) = M .
Therefore, f (xM ) is the maximum of f . The proof for the minimum case is
similar.
The examples f (x) = 1/x on (0, 1) and f (x) = x on [0, +∞) show that the
interval in the theorem must be both closed and bounded.
3.4.2 Definition. A function f is said to have the intermediate value property
on an interval I if, for each a, b ∈ I with a < b and each y0 between f (a) and
f (b), there exists an x0 ∈ (a, b) such that f (x0 ) = y0 .
♦
The intermediate value property simply asserts that f (I) is an interval
whenever I is an interval.
3.4.3 Intermediate Value Theorem. A continuous function f on an interval I has the intermediate value property.
Proof. Let a, b ∈ I with a < b and suppose that f (a) < y0 < f (b). The
set E := {x ∈ [a, b] : f (x) < y0 } contains a and is bounded above, hence
x0 := sup E exists and lies in [a, b]. By continuity of f at a, E contains an
interval [a, a + δ), hence x0 > a. Since f (x) < y0 for all x ∈ E, 3.1.14 and the
continuity of f at x0 imply that
\[ f(x_0) = \lim_{\substack{x \to x_0 \\ x \in E}} f(x) \le y_0. \]
In particular, x0 ≠ b. Similarly, since f (x) ≥ y0 for all x ∈ (x0 , b),
\[ f(x_0) = \lim_{x \to x_0^+} f(x) \ge y_0. \]
Therefore, y0 = f (x0 ). Figure 3.4 illustrates the proof.
FIGURE 3.4: y0 = f (x0 ).
Simple examples show that the continuity hypothesis is essential. Of course,
there are many discontinuous functions that have the intermediate value
property (see Exercise 5). Interestingly, all derivatives have the intermediate
value property, whether they are continuous or not (Exercise 4.2.25). Thus a
function without the intermediate value property cannot have an antiderivative.
Combining the extreme and intermediate value theorems we obtain
3.4.4 Corollary. If f is continuous on [a, b], then f [a, b] = [f (xm ), f (xM )].
3.4.5 Corollary (Existence of nth roots). For each b > 0 and n ∈ N, the
equation $x^n = b$ has a unique positive solution.
Proof. Let $f(x) = x^n$. Since $\lim_{x\to+\infty} x^n = +\infty$, we may choose c > 0 such
that f (c) > b > f (0) = 0. By the intermediate value theorem, the equation
f (x) = b has a positive solution. By Exercise 3.1.12, $x^n$ is strictly increasing
on (0, +∞), hence the solution is unique.
Here is another application of the intermediate value theorem.
3.4.6 Example. The equation
\[ f(x) := \frac{2\sqrt{x} + \sin(3x^2)}{(x-1)^3} + \frac{5x^2 + e^{2x+7}}{(x-2)^5} = 0 \]
has a solution x = x0 between 1 and 2. Indeed, since
\[ \lim_{x \to 1^+} f(x) = +\infty \quad\text{and}\quad \lim_{x \to 2^-} f(x) = -\infty, \]
there must exist 1 < a < b < 2 such that f (a) > 0 > f (b). By the intermediate
value theorem, f (x0 ) = 0 for some x0 ∈ (a, b).
♦
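The sign change used in Example 3.4.6 can be checked numerically. The following is a minimal sketch, not part of the original example: the sample points 1.01 and 1.99 are our own choices, taken close enough to the endpoints for the unbounded terms to dominate.

```python
import math

def f(x):
    """The function of Example 3.4.6, defined for x > 0 with x != 1 and x != 2."""
    first = (2 * math.sqrt(x) + math.sin(3 * x**2)) / (x - 1) ** 3
    second = (5 * x**2 + math.exp(2 * x + 7)) / (x - 2) ** 5
    return first + second

# Hand-picked sample points (an assumption, not taken from the text).
a, b = 1.01, 1.99
fa, fb = f(a), f(b)
print(fa > 0, fb < 0)
```

The point a has to be taken fairly close to 1: at x = 1.1 the second term already outweighs the first and f is negative there.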
Remark. The zeros of a continuous function f may be approximated using the
interval halving method, reminiscent of the proof of the Bolzano–Weierstrass
theorem: Suppose f (a) < 0 < f (b) so that a zero of f lies in (a, b). Bisect the
interval [a, b] and compute the values of f at the endpoints of the resulting
two intervals. If one of these values is zero, stop. If neither is zero, then for
one of the intervals, denote it by [a1 , b1 ], the values of f at the endpoints have
opposite signs. The intermediate value theorem then implies that a zero of f
lies in (a1 , b1 ), and we may approximate the zero by either a1 or b1 . Continuing
this process, we may (theoretically) approximate a zero of f to any desired
degree of accuracy. The procedure is easily programmable.
♦
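The interval halving method of the Remark can be sketched as follows. This is one possible implementation under stated assumptions: the function name `bisect`, the stopping tolerance `tol`, the iteration cap `max_iter`, and the test function x² − 2 are all illustrative choices, not taken from the text.

```python
def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    for _ in range(max_iter):
        mid = (a + b) / 2
        value = f(mid)
        if value == 0:
            return mid      # the midpoint is a zero: stop
        if value < 0:
            a = mid         # f changes sign on [mid, b]
        else:
            b = mid         # f changes sign on [a, mid]
        if b - a < tol:
            break
    return (a + b) / 2

# Illustrative use: the positive zero of x^2 - 2 on [1, 2].
root = bisect(lambda x: x * x - 2, 1.0, 2.0)
print(root)
```

Each pass halves the length of the interval known to contain a zero, so after k passes the error is at most (b − a)/2^k, up to floating-point rounding.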
Exercises
1. Find an example of a bounded function on [0, 1] with a single discontinuity
that has no maximum or minimum.
2.S Let f be continuous and positive on R with $\lim_{x\to\pm\infty} f(x) = 0$. Prove that
f has a maximum value on R.
3. Let f be continuous on R with $\lim_{x\to\pm\infty} f(x) = +\infty$. Prove that f has a
minimum value on R.
4. A function f defined on an interval J and taking values in R is said to
be upper (lower) semicontinuous at x0 ∈ J if
\[ f(x_0) \ge \limsup_{x \to x_0} f(x) \qquad \Bigl( f(x_0) \le \liminf_{x \to x_0} f(x) \Bigr), \]
where the limits are one-sided if x0 is an endpoint of J. If f is upper
(lower) semicontinuous at each point of J, then f is said to be upper
(lower) semicontinuous on J.
(a) Prove that f is upper semicontinuous at x0 iff −f is lower semicontinuous at x0 .
(b) Prove that f is continuous at x0 iff it is both upper and lower
semicontinuous at x0 .
(c) Show that, at any integer n, ⌊x⌋ is upper semicontinuous but not
lower semicontinuous.
(d) Let f (x) = sin (1/x), x ≠ 0, and f (0) = a. Show that f is upper
(lower) semicontinuous at 0 iff a ≥ 1 (a ≤ −1).
(e)S Let fi be defined on J and upper semicontinuous at x0 for every i
in some index set I. Define f (x) = inf i∈I fi (x), x ∈ J. Show that f
is upper semicontinuous at x0 . Give an example to show that f may
not be continuous at x0 even if each fi is continuous on J.
(f) (Semi-extreme value property) Prove: If f is upper (lower) semicontinuous at each point of [a, b], then f is bounded above (below) on [a, b]
and there exists x0 ∈ [a, b] such that f (x0 ) ≥ f (x) (f (x0 ) ≤ f (x))
for all x ∈ [a, b].
5. Give an example of a function on [0, 1] with the intermediate value
property that is
(a) discontinuous at precisely the points 1/n, n = 1, 2, . . . .
(b)S discontinuous everywhere.
6. Prove that a polynomial P of odd degree maps R onto R. In particular,
P has a real zero.
7. Use the intermediate value theorem to show that each of the following
equations has a solution in the indicated interval I.
(a) ln x + x = e, I = (1, e).
(b) sin x = ax, I = (π/2, π), 0 < a < 2/π.
(c)S tan x = x, I = (nπ, (n + 1/2)π), n ∈ N.
(d) ex = 4.82 sin x, I = (0, π/2) and I = (π/2, π).
(e) \[ \frac{x^4 + x^2 + 1}{x+1} + \frac{x^3 + 1}{x} + \frac{e^{-x} + x}{x-1} = 0, \quad I = (-1, 0) \text{ and } I = (0, 1). \]
(f)S
e1−x − x2
2x2 − 5
=
, I = (0, π/2).
sin x
cos x
2
8. Prove that the equation ex = xn (n ∈ N) has a solution in R iff n ≥ 3.
Hint. Find the minimum of ex /xn on (0, +∞).
9.S Let f : [a, b] → [a, b] be continuous. Prove that there exists x ∈ [a, b]
such that f (x) = x.
10. Prove that if n ∈ N is odd, then every real number has a unique nth
root.
11. Let f be continuous and nonzero on R. Let a0 be arbitrary and define
{an } recursively by an = an−1 + f (an−1 ), n ≥ 1. Show that either
an ↑ +∞ or an ↓ −∞.
3.5 Uniform Continuity
Recall that a function f is continuous on a set E if for each y ∈ E and each
ε > 0 there exists δ > 0 such that |f (x) − f (y)| < ε for all x in the domain of f
with |x − y| < δ. The number δ typically depends on both ε and y. Removing
the dependence on y results in the notion of uniform continuity:
3.5.1 Definition. A function f is said to be uniformly continuous on a subset
E of the domain of f if for each ε > 0 there exists δ > 0 such that
|f (x) − f (y)| < ε for all x, y ∈ E with |x − y| < δ.
♦
The following result is frequently useful in determining whether or not a
function is uniformly continuous.
3.5.2 Sequential Characterization of Uniform Continuity. A function
f is uniformly continuous on E iff f (xn ) − f (yn ) → 0 for all sequences {xn }
and {yn } in E with xn − yn → 0.
Proof. Let f be uniformly continuous on E and let {xn } and {yn } be sequences
in E with xn − yn → 0. Given ε > 0, choose δ > 0 so that |f (x) − f (y)| < ε
for all x, y ∈ E with |x − y| < δ. Next, choose N ∈ N such that |xn − yn | < δ
for all n ≥ N . For such n, |f (xn ) − f (yn )| < ε. Thus f (xn ) − f (yn ) → 0.
Now assume that f is not uniformly continuous on E. Then there exists an
ε > 0 and sequences xn , yn ∈ E with |xn − yn | < 1/n and |f (xn ) − f (yn )| ≥ ε.
Then xn − yn → 0 but f (xn ) − f (yn ) ↛ 0, so f does not satisfy the sequential
condition.
3.5.3 Example. The function f (x) = 1/x, x > 0, is uniformly continuous on
intervals of the form [r, +∞), r > 0, as may be seen from the inequality
\[ |f(x) - f(y)| = \frac{|x - y|}{xy} \le \frac{|x - y|}{r^2}, \quad x, y \ge r. \]
However, f is not uniformly continuous on (0, +∞). Indeed, if xn = 1/(2n) and
yn = 1/n, then xn − yn → 0 yet f (xn ) − f (yn ) = n → +∞.
♦
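The failure of the sequential criterion 3.5.2 for f (x) = 1/x on (0, +∞) can be observed numerically. The sketch below uses the sequences xn = 1/(2n) and yn = 1/n from the example; the particular sample indices and the list names `gaps` and `jumps` are illustrative choices.

```python
def f(x):
    return 1.0 / x

gaps = []    # x_n - y_n, which tends to 0
jumps = []   # f(x_n) - f(y_n), which equals n up to rounding
for n in (1, 10, 100, 1000):
    x_n, y_n = 1 / (2 * n), 1 / n
    gaps.append(x_n - y_n)
    jumps.append(f(x_n) - f(y_n))

for gap, jump in zip(gaps, jumps):
    print(gap, jump)
```

The gaps shrink like −1/(2n) while the differences of function values grow like n, so no single δ can serve a given ε on all of (0, +∞).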
3.5.4 Theorem. Let f , g be uniformly continuous on E and let α, β ∈ R.
Then
(a) αf + βg is uniformly continuous on E.
(b) If f and g are bounded, then f g is uniformly continuous on E.
(c) If g ≠ 0 and 1/g is bounded on E, then 1/g is uniformly continuous on E.
Proof. Part (a) follows easily from the sequential characterization of uniform
continuity. For (b), let M > 0 such that |f (x)|, |g(x)| ≤ M for all x ∈ E.
Uniform continuity of f g then follows from the inequalities
\[ \begin{aligned} |f(x)g(x) - f(y)g(y)| &\le |f(x)g(x) - f(y)g(x)| + |f(y)g(x) - f(y)g(y)| \\ &\le M|f(x) - f(y)| + M|g(x) - g(y)|. \end{aligned} \]
For (c), choose K > 0 such that 1/|g(x)| < K for all x ∈ E. Uniform
continuity of 1/g then follows from
\[ \left| \frac{1}{g(x)} - \frac{1}{g(y)} \right| = \frac{|g(x) - g(y)|}{|g(x)g(y)|} \le K^2 |g(x) - g(y)|, \quad x, y \in E. \]
The following theorem may be given a short proof based on the sequential
criterion for uniform continuity. We leave the details to the reader.
3.5.5 Theorem. Suppose that g is uniformly continuous on D, f is uniformly
continuous on E, and g(D) ⊆ E. Then f ◦ g is uniformly continuous on D.
The next theorem shows that on closed and bounded intervals the notions
of continuity and uniform continuity coincide.
3.5.6 Theorem. If f is continuous on a closed bounded interval [a, b], then
f is uniformly continuous there.
Proof. We use the sequential characterization of uniform continuity. Let {xn }
and {yn } be sequences in [a, b] with xn − yn → 0. Suppose, for a contradiction,
that f (xn ) − f (yn ) ↛ 0. Then |f (xn ) − f (yn )| > ε for some ε > 0 and
infinitely many n and hence for a subsequence of {n}. Changing notation if
necessary, we may suppose that the inequality holds for all n. By the Bolzano–
Weierstrass theorem, {xn } has a convergent subsequence, say xnk → x0 . Since
xnk − ynk → 0, ynk → x0 . But then, by continuity, |f (xnk ) − f (ynk )| → 0,
which is impossible.
The connection between continuity and uniform continuity on open intervals
is more complicated. For this, we need the following definitions.
3.5.7 Definition. A continuous function f on D is said to have a continuous
extension to a set D1 ⊇ D if there exists a continuous function f1 : D1 → R
such that f1 |D = f . In the special case D1 = D ∪ {a}, where a ∉ D, f (x) is
said to have a removable discontinuity at x = a.
♦
3.5.8 Proposition. Let f be defined and continuous on D and let a be an
accumulation point of D, a ∉ D. Then f has a removable discontinuity at
x = a iff L := lim{x→a, x∈D} f (x) exists in R.
Proof. The necessity is clear. For the sufficiency, simply set f (a) = L to obtain
a continuous extension of f to D ∪ {a}.
For example, the functions
\[ x \sin\frac{1}{x}, \qquad \frac{\sin x}{x}, \qquad\text{and}\qquad \frac{x}{\sqrt{|x|}}, \]
defined for x ≠ 0, have removable discontinuities at x = 0 and hence have
unique continuous extensions to R. On the other hand, since limx→0+ sin(1/x)
does not exist, the function sin(1/x) does not have a removable discontinuity
at x = 0.
The following theorem is the main result regarding uniform continuity of
functions on bounded open intervals.
3.5.9 Theorem. Let f be continuous on the bounded interval (a, b). The
following statements are equivalent:
(a) limx→a+ f (x) and limx→b− f (x) exist in R.
(b) f has a continuous extension to [a, b].
(c) f is uniformly continuous on (a, b).
Proof. (a) ⇒ (b) is immediate from 3.5.8.
(b) ⇒ (c): By 3.5.6, a continuous extension g of f to [a, b] is uniformly
continuous. Therefore, f = g|(a,b) is uniformly continuous.
(c) ⇒ (a): Let {an } be any sequence in (a, b) converging to a. Then {an }
is Cauchy and since f is uniformly continuous, {f (an )} is Cauchy (Exercise 7).
Therefore, L := limn→∞ f (an ) exists. We claim that limx→a+ f (x) exists and
equals L. To see this, let {a′n } be any sequence in (a, b) converging to a. Then
an − a′n → 0, hence, by uniform continuity, f (an ) − f (a′n ) → 0, so f (a′n ) → L.
By the sequential characterization of limit (3.1.9), limx→a+ f (x) = L. A similar
argument shows that limx→b− f (x) exists.
For example, since sin(1/x) has no continuous extension to [0, 1], it is
not uniformly continuous on (0, 1]. On the other hand, for any p > 0,
limx→0+ xp sin(1/x) = 0, hence xp sin(1/x) is uniformly continuous on (0, 1].
For another example, consider f (x) = (1 − cos x)/x on R \ {0}. By
l’Hospital’s rule, proved in the next chapter, limx→0 f (x) = limx→0 sin x = 0,
hence f has a continuous extension to R. Moreover, since limx→±∞ f (x) = 0,
f is uniformly continuous on R (Exercise 5).
3.5.10 Corollary. A bounded, continuous, monotone function f on a bounded
interval (a, b) is uniformly continuous there.
Proof. By 3.1.17, limx→a+ f (x) and limx→b− f (x) exist in R.
The following result relies on the mean value theorem proved in the next
chapter.
3.5.11 Theorem. If f has a bounded derivative on an interval I, then f is
uniformly continuous on I.
Proof. Let M be a bound for |f′| on I. By the mean value theorem, for any
x, y ∈ I there exists a z between x and y such that f (x) − f (y) = f′(z)(x − y).
Thus |f (x) − f (y)| ≤ M |x − y|, which implies uniform continuity.
For example, $\sin^n x$ and $\cos^n x$ have bounded derivatives for every n ∈ N,
hence are uniformly continuous on R. This also follows from periodicity (see
Exercise 11). On the other hand, $x^p$ is not uniformly continuous on (0, +∞)
for p > 1. Indeed, if $x_n = n + n^{(1-p)/2}$ and $y_n = n$, then, by the mean value
theorem, for each n there exists zn ∈ (yn , xn ) such that
\[ x_n^p - y_n^p = \frac{p z_n^{p-1}}{n^{(p-1)/2}} \ge p\, n^{(p-1)/2} \to +\infty. \]
Since xn − yn → 0, 3.5.2 implies that $x^p$ is not uniformly continuous.
Exercises
1.S Find functions f and g with f continuous and g uniformly continuous
such that neither f ◦ g nor g ◦ f is uniformly continuous.
2. Let r > 0. Show that the function f (x) = (3x + 2)/(2x − 1) in 3.1.4 is
uniformly continuous on Dr but not on its domain D, where
Dr := (−∞, 1/2 − r] ∪ [1/2 + r, +∞) and D = (−∞, 1/2) ∪ (1/2, +∞).
3. Let a, b > 0. Give a careful ε, δ proof that each of the following functions
is uniformly continuous on R.
(a)S $\sqrt{ax^2 + b}$. (b) $1/\sqrt{ax^2 + b}$. (c) |ax + b|.
4. Show that ln x is uniformly continuous on (r, +∞) for every r > 0 but is
not uniformly continuous on (0, 1).
5. Let f be continuous on [0, ∞). Prove that if limx→+∞ f (x) exists and
is finite, then f is uniformly continuous on [0, +∞). Give an example
of a bounded continuous function on [0, +∞) that is not uniformly
continuous.
6. Prove that each of the following functions is uniformly continuous on the
indicated interval, where n ∈ N:
(a) sin(1/x), [r, +∞), r > 0.
(b) x sin(1/x), [0, +∞).
(c) arctan x, (−∞, +∞).
(d) $x^n e^{-x}$, [0, +∞).
(e) $\cos\sqrt{x^2 + 1}$, (−∞, +∞).
(f) $x^p$, 0 < p ≤ 1, [0, +∞).
(g) $(1 + x^n)^{1/n}$, [0, +∞).
(h) $(1 + x^n)^{-1/n}$, [0, +∞).
7.S Let f be uniformly continuous on E and let {an } be a Cauchy sequence
in E. Prove that {f (an )} is Cauchy.
8. Suppose that f (x) is uniformly continuous on [0, +∞). Prove that the
function g is uniformly continuous on R, where
\[ g(x) := \begin{cases} f(x) & \text{if } x \ge 0, \\ f(-x) & \text{if } x < 0. \end{cases} \]
9.S Let f be uniformly continuous on R. Prove that f (|x|), |f (x)|, and
|f (|x|)| are uniformly continuous on R.
10. Let f be uniformly continuous on each of the intervals (a, b) and (c, d),
where a < b < c < d. Prove that f is uniformly continuous on the set
(a, b) ∪ (c, d). What if b = c?
11. Let f : R → R be periodic with period p > 0, that is, f (x + p) = f (x) for
all x ∈ R. If f is continuous on [0, p], prove that f is uniformly continuous
and bounded on R.
12. Let f1 , . . . , fn be uniformly continuous on E. Prove that the functions
\[ M(x) := \max_{1 \le j \le n} f_j(x) \quad\text{and}\quad m(x) := \min_{1 \le j \le n} f_j(x) \]
are uniformly continuous on E.
13.S Find all values of p > 0 for which the function $f(x) = x^{-p} \sin x$, x > 0,
has a continuous extension to [0, +∞). Prove that for all such p the
extension is uniformly continuous.
14. Let r > 0. Prove that $f(x) := \sin(x^p)$ is uniformly continuous on (r, +∞)
iff p ≤ 1.
15.S Prove that a uniformly continuous function f on a bounded interval
(a, b) is bounded. Give examples to show that the result is not true if
(a, b) is unbounded or if f is merely continuous.
16. Give examples to show that parts (b) and (c) of 3.5.4 are not necessarily
true if the boundedness conditions are removed.
17. Let f be continuous on [a, b]. Prove that
\[ g(x) := \sup_{a \le t \le x} f(t) \]
is continuous on [a, b].
18.S Let
\[ f(x) = (1 - e^{1/x})^{-1}, \quad x \ne 0. \]
Is it possible to define f (0) so that f is continuous on R? What about
for the function
\[ g(x) = x(1 - e^{1/x})^{-1}, \quad x \ne 0? \]
Chapter 4
Differentiation on R
The notion of rate of change of one quantity with respect to another is fundamental to many disciplines. It is expressed mathematically as the derivative of
a function. In this chapter we establish the main properties of this important
construct.
4.1 Definition of Derivative and Examples
4.1.1 Definition. A real-valued function f defined in a neighborhood of a ∈ R
is said to be differentiable at a if the limit
\[ f'(a) = Df(a) = \frac{df}{dx}\Big|_{a} := \lim_{x \to a} \frac{f(x) - f(a)}{x - a} = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \]
exists in R. The limit is then called the derivative of f at a. If f is differentiable
at each member of a set E, then f is said to be differentiable on E and the
function
\[ f' = Df = \frac{df}{dx} \]
is called the derivative of f on E. If f′ is continuous on E, then f is said to
be continuously differentiable on E.
♦
It follows immediately from the definition that the derivative of a constant
function is 0. Here are some nontrivial examples.
4.1.2 Example. We prove the following special cases of the power rule (the
general power rule will be proved later): Let n ∈ N and r = n or 1/n. Then
\[ D x^r = r x^{r-1}. \]
(In the second case x ≠ 0, and x > 0 if n is even.)
The case r = n is obtained by letting h → 0 in the identity
\[ \frac{(x + h)^n - x^n}{h} = \sum_{j=1}^{n} (x + h)^{n-j} x^{j-1} \]
(Exercise 1.2.4). Each term in the sum tends to $x^{n-1}$, and since there are n
terms the formula follows.
For the case r = 1/n we use the identity
\[ \frac{(x + h)^{1/n} - x^{1/n}}{h} = \Biggl[\, \sum_{j=1}^{n} (x + h)^{1 - j/n} x^{(j-1)/n} \Biggr]^{-1} \]
(Exercise 1.4.15). As h → 0, the term in square brackets tends to $n x^{1 - 1/n}$,
verifying the formula.
♦
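The identity used for the case r = n can be checked numerically. The sketch below is illustrative only: the helper names `difference_quotient` and `identity_sum` and the sample values x = 2, n = 5 are our own choices. Both sides should agree for each h and approach n xⁿ⁻¹ = 80 as h → 0.

```python
def difference_quotient(x, h, n):
    """Left-hand side: ((x + h)^n - x^n) / h."""
    return ((x + h) ** n - x ** n) / h

def identity_sum(x, h, n):
    """Right-hand side: sum over j = 1..n of (x + h)^(n - j) * x^(j - 1)."""
    return sum((x + h) ** (n - j) * x ** (j - 1) for j in range(1, n + 1))

x, n = 2.0, 5
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(x, h, n), identity_sum(x, h, n))
```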
For the next example, and indeed for the remainder of the book, we shall
use the standard definitions of cosine and sine as coordinates of points on the
unit circle.¹ From this one can derive the usual trigonometric identities, which
we shall invoke as needed.
FIGURE 4.1: sin h < h < tan h.
4.1.3 Example. D sin x = cos x. From the identity $\sin^2 h + \cos^2 h = 1$ and
the inequalities
\[ \sin h < h < \tan h, \quad 0 < h < \pi/2, \]
which may be derived with the help of Figure 4.1, we see that
\[ \sqrt{1 - h^2} < \sqrt{1 - \sin^2 h} = \cos h < \frac{\sin h}{h} < 1, \quad 0 < h < \pi/2. \tag{4.1} \]
Since sin(−h) = − sin h and cos(−h) = cos h, (4.1) holds for −π/2 < h < 0 as
well. By the squeeze principle,
\[ \lim_{h \to 0} \cos h = \lim_{h \to 0} \frac{\sin h}{h} = 1. \]
From this and the calculation
\[ \frac{\cos h - 1}{h} = \frac{\cos^2 h - 1}{h(\cos h + 1)} = -\left( \frac{\sin h}{h} \right)^{2} \frac{h}{\cos h + 1} \]
we see that
\[ \lim_{h \to 0} \frac{\cos h - 1}{h} = 0. \]
Therefore,
\[ \begin{aligned} \frac{\sin(x + h) - \sin x}{h} &= \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\ &= \frac{\cos h - 1}{h} \sin x + \frac{\sin h}{h} \cos x \\ &\to \cos x \text{ as } h \to 0. \end{aligned} \]
¹ A more rigorous approach to the calculus of trigonometric functions may be based on
the inverse sine function. This approach is described briefly in Section 4.4.
♦
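The two limits used in Example 4.1.3 can be observed numerically. The following sketch is illustrative: the helper name `check` and the sample values of h are our own choices, and the values are restricted to 0 < h < 1 so that the square root in the inequality chain is defined.

```python
import math

def check(h):
    """Return sin(h)/h, (cos(h) - 1)/h, and whether the inequality chain holds at h."""
    ratio = math.sin(h) / h
    slope = (math.cos(h) - 1) / h
    chain_holds = math.sqrt(1 - h * h) < math.cos(h) < ratio < 1
    return ratio, slope, chain_holds

for h in (0.5, 0.1, 0.01, 0.001):
    print(h, *check(h))
```

As h decreases, the first value approaches 1 and the second approaches 0, in line with the squeeze argument.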
It is occasionally necessary to consider one-sided derivatives, which are
defined by using one-sided limits in 4.1.1. Specifically, the left-hand and right-hand derivatives are, respectively,
\[ D_\ell f(a) = f'_\ell(a) := \lim_{x \to a^-} \frac{f(x) - f(a)}{x - a} \quad\text{and}\quad D_r f(a) = f'_r(a) := \lim_{x \to a^+} \frac{f(x) - f(a)}{x - a}. \]
From the general theory of limits, a function is differentiable at a iff it has
equal right-hand and left-hand derivatives at a. For example, at x = 0 the
function f (x) = |x| has right-hand derivative 1 and left-hand derivative −1
and so is not differentiable there.
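The one-sided derivatives of |x| at 0 can be illustrated with one-sided difference quotients. This is a minimal sketch: the helper name `one_sided_quotients` and the step h = 10⁻⁶ are illustrative choices, and a difference quotient at a single h only suggests the limit rather than proving it.

```python
def one_sided_quotients(f, a, h):
    """Difference quotients of f at a taken from the left and from the right (h > 0)."""
    left = (f(a - h) - f(a)) / (-h)
    right = (f(a + h) - f(a)) / h
    return left, right

left, right = one_sided_quotients(abs, 0.0, 1e-6)
print(left, right)
```

For |x| the quotients are constant in h, equal to −1 on the left and 1 on the right, so the two one-sided limits exist but differ.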
Although we shall have no need to do so, one may even consider the more
general expressions
\[ \liminf_{\substack{x \to a \\ x \in E}} \frac{f(x) - f(a)}{x - a} \quad\text{and}\quad \limsup_{\substack{x \to a \\ x \in E}} \frac{f(x) - f(a)}{x - a}, \]
where a is an accumulation point of E. The so-called Dini derivates are
obtained by taking E to be intervals of the form (c, a) and (a, c).
The following proposition provides a useful characterization of differentiability. It asserts that for x near a, f (x) is approximated by the linear function
y = f (a) + f′(a)(x − a), the equation of the tangent line at a.
4.1.4 Proposition. Let f be defined in a neighborhood N (a) of a. Then f is
differentiable at a iff there exists a function η on N (a), continuous at a, such
that
f (x) = f (a) + η(x)(x − a) for all x ∈ N (a).
In this case, f′(a) = η(a).
Proof. If such a function η exists, then
\[ \frac{f(x) - f(a)}{x - a} = \eta(x) \to \eta(a) \text{ as } x \to a, \]
hence f′(a) exists and equals η(a). Conversely, if f is differentiable at a, define
\[ \eta(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} & \text{if } x \in N(a) \setminus \{a\}, \\[1ex] f'(a) & \text{if } x = a. \end{cases} \]
Then η has the required properties.
4.1.5 Corollary. If f is differentiable at a, then f is continuous there.
Proof. Simply note that f (x) = f (a) + η(x)(x − a) → f (a) as x → a.
The example |x| considered above shows that the converse of the corollary
is false: |x| is continuous at 0 but not differentiable there. It is a remarkable
fact that there are continuous functions on R that are nowhere differentiable
(see 8.9.7).
4.1.6 Theorem. If c ∈ R and f and g are differentiable at a, then so are f + g,
cf , f g, and f /g, the last provided that g(a) ≠ 0. Moreover, in this case,
(a) (f + g)′(a) = f′(a) + g′(a),
(b) (cf )′(a) = cf′(a),
(c) (f g)′(a) = f (a)g′(a) + f′(a)g(a),
(d) $\left( \dfrac{f}{g} \right)'(a) = \dfrac{g(a)f'(a) - f(a)g'(a)}{g^2(a)}$.
Proof. We prove only (d). Let h = f /g. Since g is continuous at a and g(a) ≠ 0,
h is defined in a neighborhood N (a) on which g is not 0. For x ∈ N (a) \ {a},
a little algebra shows that
\[ \frac{h(x) - h(a)}{x - a} = \frac{g(a)\,\dfrac{f(x) - f(a)}{x - a} - f(a)\,\dfrac{g(x) - g(a)}{x - a}}{g(x)g(a)}. \]
Letting x → a, using the continuity of g at a, yields (d).
The preceding theorem, together with 4.1.2 and 4.1.3, shows that polynomials, rational functions, and trigonometric functions are differentiable. (See
Exercise 2.) The following important result will yield additional examples.
4.1.7 Chain Rule. Let g be differentiable at a and let f be differentiable at
g(a). Then f ◦ g is differentiable at a and (f ◦ g)′(a) = f′(g(a))g′(a).
Proof. Set b := g(a). By 4.1.4, there exists a function η, defined in a neighborhood N (b) of b and continuous at b with η(b) = f′(b), such that
\[ f(y) = f(b) + \eta(y)(y - b), \quad y \in N(b). \tag{4.2} \]
Since g is continuous at a, we may choose a neighborhood N (a) of a such that
g(N (a)) ⊆ N (b). Then f ◦ g is defined on N (a), and by (4.2)
\[ \frac{f(g(x)) - f(g(a))}{x - a} = \eta(g(x))\,\frac{g(x) - g(a)}{x - a}, \quad x \in N(a) \setminus \{a\}. \]
Letting x → a produces the desired result.
The formula (f ◦ g)′(x) = f′(g(x))g′(x) is sometimes easier to apply when
written in Leibniz notation as
\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}, \quad\text{where } y = f(u) \text{ and } u = g(x). \]
4.1.8 Example. The power rule
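\[ D x^r = r x^{r-1}, \quad r \in \mathbb{Q}, \]
follows from 4.1.2 and the chain rule: Let r = m/n, m, n ∈ N, and set $u = x^{1/n}$
and $y = u^m$. Then $y = x^r$ and
\[ \frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = m u^{m-1} \cdot \frac{1}{n}\, x^{1/n - 1} = \frac{m}{n}\, x^{m/n - 1} = r x^{r-1}. \]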
The case r < 0 may be verified using the quotient rule.
♦
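The chain rule 4.1.7 can be sanity-checked numerically. In the sketch below, the composite sin(x²) is an illustrative choice (its factors are differentiated using 4.1.2 and 4.1.3), and the central difference quotient is our own approximation device, not the definition of derivative used in the text.

```python
import math

def central_difference(F, a, h=1e-6):
    """Symmetric difference quotient, used here only as a numerical approximation."""
    return (F(a + h) - F(a - h)) / (2 * h)

a = 1.3
# (f o g)(x) = sin(x^2), with f = sin and g(x) = x^2.
numeric = central_difference(lambda x: math.sin(x * x), a)
# Chain rule: f'(g(a)) * g'(a) = cos(a^2) * 2a.
chain_rule = math.cos(a * a) * (2 * a)
print(numeric, chain_rule)
```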
Higher order derivatives of y = f (x) are defined inductively by
\[ f'' = D^2 f = \frac{d^2 y}{dx^2} := \frac{d}{dx}\frac{dy}{dx}, \quad \ldots, \quad f^{(n)} = D^n f = \frac{d^n f}{dx^n} := \frac{d}{dx}\frac{d^{n-1} f}{dx^{n-1}}. \]
By convention, we set $f^{(0)} = D^0 f := f$.
Exercises
1. Use the limit definition to find the derivative of
(a) $x^2 + x + 1$. (b)S $\sqrt{2x + 1}$. (c) $\dfrac{1}{x^2 + 1}$. (d)S $\dfrac{1}{\sqrt{3x + 2}}$.
2. Use the techniques of 4.1.3 to find the derivative of cos x. Use rules of
differentiation to obtain the derivatives of tan x, cot x, sec x, and csc x.
3. Use rules of differentiation to find f′ for each of the functions f :
2/3
2
√
√
2x + 5
x −1
5
S 3
S
(a)
5x + 7 3x + 2. (b)
.
(c) sin
.
7x + 2
x2 + 1
q
√
sin2 x − 1
(d)
.
(e) tan cos(1/x) . (f)
ax + bx + c.
2
sin x + 1
4. Assuming that y is a differentiable function of x that satisfies the given
equation, use the rules of differentiation to find dy/dx:
(a) $x^3 + y^3 - xy = 1$. (b)S $\sin(xy^2) + x^2 = 1$. (c) $\tan(x + y) + y^2 = x$.
5. Let $f(x) = x^n |x|$, n ∈ N. Find $f^{(n-1)}$ and $f^{(n)}$.
6. Let $f(x) = x^m \lfloor x \rfloor$, m ∈ N. Find $f'_\ell(n)$ and $f'_r(n)$, n ∈ Z.
7.S Find all values of a and b such that f′ exists on R, where
\[ f(x) = \begin{cases} ax^2 + bx + a/x & \text{if } x > 1, \\ x^3 & \text{if } x \le 1. \end{cases} \]
8. Find all values of a, b, and c such that f′ is continuous on (0, +∞), where
\[ f(x) = \begin{cases} ax^2 + bx & \text{if } x > 1, \\ c\sqrt{x} & \text{if } 0 < x \le 1. \end{cases} \]
9. Let
\[ f(x) = \begin{cases} ax^2 + bx + c & \text{if } x > 1, \\ x^3 & \text{if } x \le 1. \end{cases} \]
Find all values of a, b, and c such that
(a) f is continuous on R.
(b) f is differentiable on R.
(c) f′ is continuous on R.
(d) f″ exists on R.
10. Find all values of c such that f′(c) exists, where
\[ f(x) = \begin{cases} ax - 4 & \text{if } x > c, \\ 9x^2 & \text{if } x \le c. \end{cases} \]
Is f′ continuous at these values?
11. Let f be differentiable at a. Use the limit definition of derivative to
calculate
f (a + 5 sin h) − f (a + 2 sin h)
f (a + h2 ) − f (a − h)
. (b)S lim
.
h→0
h→0
h
h
(a) lim
12. Let g be differentiable on an open interval I and let f (x) = g(x)d(x),
where d(x) is the Dirichlet function (3.1.7). Let a be a zero of g. Prove
that f′(a) exists iff a is a zero of g′.
13. Let f be differentiable at c and let {an } and {bn } be sequences such that
an < c < bn and an , bn → c. Prove that
\[ \lim_{n \to \infty} \frac{f(b_n) - f(a_n)}{b_n - a_n} = f'(c). \]
14.S Let f be differentiable and increasing on (a, b). Prove that f′(x) ≥ 0 for
all x ∈ (a, b).
15. Let f be differentiable at a and nonnegative in a neighborhood of a with
f (a) = 0. Prove that f′(a) = 0.
16.S Prove Leibniz’s rule: If f and g are n times differentiable, then
\[ D^n(fg) = \sum_{k=0}^{n} \binom{n}{k} (D^k f)(D^{n-k} g). \]
17. Prove that if f has right-hand and left-hand derivatives at a (not necessarily equal), then f is continuous at a.
18. Assuming that f , g, and h have the necessary differentiability, find general
formulas for
(a) D(f ◦ (gh)). (b) D(f ◦ (g/h)). (c)S D²(f ◦ g). (d) D(f ◦ g ◦ h).
19. Find a formula for the nth derivative of
(a)S 1/x. (b) $1/\sqrt{x}$. (c) $x e^x$. (d) $x e^{-x}$.
20. Find all values of p ∈ R for which the function
\[ f(x) = \begin{cases} |x|^p \sin(1/x) & \text{if } x \ne 0, \\ 0 & \text{otherwise} \end{cases} \]
is (a) continuous, (b) differentiable, (c) continuously differentiable on R.
21.S Define f (0) = 0 and $f(x) = x^m \sin x^n$, x ≠ 0, where m ∈ Z, n ∈ N. For
what values of m and n does f′(0) exist? For which of these values is f′
continuous on R?
22. A function f defined on a symmetric neighborhood (−a, a) of 0 is said
to be odd if f (−x) = −f (x) and even if f (−x) = f (x).
(a) Prove that any function h : (−a, a) → R is the sum of an even
function f and an odd function g.
(b) Prove that if f is differentiable and odd (even), then f′ is even (odd).
(c) Is the converse true? That is, if f′ is even (odd), is f odd (even)?
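23.S Let fj , gj , and hj be differentiable, j = 1, 2, 3. Prove that
\[ \begin{vmatrix} f_1 & f_2 \\ g_1 & g_2 \end{vmatrix}' = \begin{vmatrix} f_1' & f_2' \\ g_1 & g_2 \end{vmatrix} + \begin{vmatrix} f_1 & f_2 \\ g_1' & g_2' \end{vmatrix} \]
and
\[ \begin{vmatrix} f_1 & f_2 & f_3 \\ g_1 & g_2 & g_3 \\ h_1 & h_2 & h_3 \end{vmatrix}' = \begin{vmatrix} f_1' & f_2' & f_3' \\ g_1 & g_2 & g_3 \\ h_1 & h_2 & h_3 \end{vmatrix} + \begin{vmatrix} f_1 & f_2 & f_3 \\ g_1' & g_2' & g_3' \\ h_1 & h_2 & h_3 \end{vmatrix} + \begin{vmatrix} f_1 & f_2 & f_3 \\ g_1 & g_2 & g_3 \\ h_1' & h_2' & h_3' \end{vmatrix}. \]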
4.2 The Mean Value Theorem
The mean value theorem relates the average rate of change of a function
to its instantaneous rate of change. It is one of the most useful theorems in
analysis and will play a central role in the proof of the fundamental theorem
of calculus in Chapter 5. The proof of the mean value theorem is based on the
existence of local extrema.
4.2.1 Definition. A function f is said to have a local maximum (local minimum) at c if f is defined on an open interval I containing c and f (x) ≤ f (c)
(f (x) ≥ f (c)) for all x ∈ I. In either case, f is said to have a local extremum
at c.
♦
FIGURE 4.2: Local extrema of f.
4.2.2 Local Extremum Theorem. If f has a local extremum at c and if f is differentiable at c, then $f'(c) = 0$.
Proof. Suppose that f has a local maximum at c. Let I be an open interval containing c such that f(x) ≤ f(c) for all x ∈ I. Then
\[ \frac{f(x) - f(c)}{x - c} \; \begin{cases} \ge 0 & \text{if } x \in I \text{ and } x < c, \\ \le 0 & \text{if } x \in I \text{ and } x > c. \end{cases} \]
It follows that the left-hand derivative of f at c is ≥ 0 and the right-hand derivative is ≤ 0. Since f is differentiable at c, both one-sided derivatives equal $f'(c)$, hence $f'(c) = 0$. The proof for the local minimum case is similar.
4.2.3 Rolle’s Theorem. Let f be continuous on [a, b] and differentiable on (a, b). If f(a) = f(b), then there exists a point c ∈ (a, b) such that $f'(c) = 0$.
Proof. By the extreme value theorem there exist $x_m, x_M \in [a, b]$ such that $f(x_m) \le f(x) \le f(x_M)$ for all x ∈ [a, b]. If $f(x_m) = f(x_M)$, then f is a constant function and the assertion of the theorem holds trivially. If $f(x_m) \ne f(x_M)$, then, because f(a) = f(b), the points $x_m$ and $x_M$ cannot both be endpoints. Thus either $x_m \in (a, b)$ or $x_M \in (a, b)$, and the conclusion follows from the local extremum theorem.
The following result is the key ingredient in the proof of l’Hospital’s rule
in Section 4.5.
4.2.4 Cauchy Mean Value Theorem. Let f and g be continuous on [a, b] and differentiable on (a, b). Then there exists a point c ∈ (a, b) such that
\[ [f(b) - f(a)]\,g'(c) = [g(b) - g(a)]\,f'(c). \]
Proof. The function
\[ h(x) := [f(b) - f(a)]\,g(x) - [g(b) - g(a)]\,f(x) \]
is continuous on [a, b], differentiable on (a, b), and satisfies h(a) = h(b). By Rolle’s theorem, $h'(c) = 0$ for some c ∈ (a, b), which is the assertion of the theorem.
FIGURE 4.3: (a) Cauchy mean value theorem. (b) Mean value theorem.
If $f(a) \ne f(b)$ and $f'(x) \ne 0$ on (a, b), then the conclusion of 4.2.4 may be written
\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(c)}{f'(c)}. \]
For smooth functions f and g, this equation asserts that at some point $(f(c), g(c))$ on the curve given parametrically by the equations x = f(t) and y = g(t), the line through the endpoints (f(a), g(a)) and (f(b), g(b)) is parallel to the line tangent to the curve at $(f(c), g(c))$. See Figure 4.3(a).
Taking g(x) = x in the Cauchy mean value theorem yields the standard
mean value theorem (Figure 4.3(b)):
4.2.5 Mean Value Theorem. If f is continuous on [a, b] and differentiable on (a, b), then there exists c ∈ (a, b) such that
\[ \frac{f(b) - f(a)}{b - a} = f'(c). \]
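The mean value theorem asserts only that a suitable c exists. As a rough numerical illustration, the following sketch locates such a point by bisection for the sample function f(x) = x³ on [0, 2]. The helper name `mean_value_point` and the test function are assumptions made for this example, and the bisection requires a sign change of f′ minus the chord slope at the endpoints, which the theorem itself does not guarantee.

```python
import math

def mean_value_point(f, df, a, b, tol=1e-12):
    """Locate c in (a, b) with df(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes df is continuous and df(x) - slope changes sign on [a, b];
    the theorem guarantees that c exists, but not this sign pattern.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    g_lo = df(lo) - slope
    if g_lo * (df(hi) - slope) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = df(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid                  # a root of df - slope lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid     # a root lies in [mid, hi]
    return (lo + hi) / 2

# Hypothetical sample: f(x) = x^3 on [0, 2]; the chord slope is 4, so 3c^2 = 4.
c = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
```

For this sample the point is c = 2/√3, which can be compared with the computed value.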
4.2.6 Corollary. Let f(x) and g(x) be differentiable on an open interval I such that $f'(x) = g'(x)$ for all x ∈ I. Then there exists a constant k such that f = g + k on I.
Proof. Let a, b ∈ I with a < b. By the mean value theorem applied to h := f − g, there exists c ∈ (a, b) such that $h(a) - h(b) = h'(c)(a - b)$. Since $h' = 0$, h(a) = h(b). Since a and b were arbitrary, h must be constant.
4.2.7 Corollary. Let f be differentiable on an open interval I.
(a) If $f' \ge 0$ ($f' > 0$) on I, then f is increasing (strictly increasing) on I.
(b) If $f' \le 0$ ($f' < 0$) on I, then f is decreasing (strictly decreasing) on I.
Proof. We prove (a) for the strictly increasing case. Let a, b ∈ I, a < b. By the mean value theorem, $f(b) - f(a) = f'(c)(b - a)$ for some c ∈ (a, b). Since $f'(c) > 0$, f(b) > f(a).
Exercises
1.S Show that $\cos x = \sqrt{x} - 1$ has exactly one solution x in the interval (0, π/2).
2. Find an interval I such that for each c ∈ I, $\sin x = x^2/2 + x + c$ has exactly one solution x in the interval (0, π/2).
3.S Show that $f(x) = x^4 - 4x^3 + 4x^2 + c$ has at most one zero in the interval (1, 2). For what interval of values of c does f have exactly one zero in (1, 2)?
4. Let f have k derivatives and n distinct zeros on an interval I. Prove that $f^{(k)}$ has at least n − k distinct zeros in I.
5. Let f have a continuous second derivative on [−1, 3], f(1) = 0, and set $g(x) = x^2 f(x)$. Prove that $g''$ has at least one zero in [−1, 2]. Hint. Consider the function $g_n(x) := x(x + 1/n)f(x)$.
6. Let P(x) be a polynomial of degree n and let a ≠ 0. Prove that the equation $e^{ax} = P(x)$ has at most n + 1 solutions.
7.S Let P(x) be a polynomial of degree n and let a ≠ 0. Prove that the equation sin(ax) = P(x) has at most n + 1 solutions.
8. Prove Bernoulli’s inequality: $(1 + x)^r \ge 1 + rx$ for all x ≥ −1 and all rational numbers r ≥ 1. (Cf. Exercise 1.5.10.)
9.S Let f and g be continuous on [a, b] and differentiable on (a, b) such that $|f'| \le |g'|$. If $g'$ is never zero on (a, b), prove that
\[ |f(x) - f(y)| \le |g(x) - g(y)| \quad \text{for all } x, y \in [a, b]. \]
10. Let f and g be differentiable on an open interval I and let a, b ∈ I with a < b. Prove that if f(a) = g(a) and $f' > g'$ on (a, b), then f > g on (a, b). Use this to show that
(a) ln x < x − 1 on the interval (1, +∞).
(b) sin x < x on the interval (0, π/2).
(c) cos x > 1 − x on the interval (0, π/2).
(d) tan x > x on the interval (0, π/2).
(e) $e^x > 1 + x + x^2/2! + \cdots + x^n/n!$ on the interval (0, +∞). (Use induction.)
11.S Show that $\dfrac{\sin x}{x}$ is a decreasing function on (0, π/2).
12. Show that on (0, π/2)
(a) x sin x + cos x > 1.
(b) x sin x + p cos x < p, p ≥ 2.
(c) $x^{-1}(1 - \cos x)$ is increasing.
(d) $x^{-2}(1 - \cos x)$ is decreasing.
13. Let a, b, p > 0, and for x ≥ 0 define $f(x) = a^p + x^p - (a + x)^p$. Show that for x > 0,
\[ f'(x) \; \begin{cases} > 0 & \text{if } 0 < p < 1, \\ < 0 & \text{if } p > 1. \end{cases} \]
Conclude that
\[ (a + b)^p \; \begin{cases} < a^p + b^p & \text{if } 0 < p < 1, \\ > a^p + b^p & \text{if } p > 1. \end{cases} \]
14. Let f and g have derivatives of order n on an open interval I and let a ∈ I. Suppose that
\[ f^{(j)}(a) = g^{(j)}(a) = 0, \ j = 0, \dots, n-1, \quad \text{and} \quad f^{(j)}(x)\,g^{(j)}(x) \ne 0 \ \text{for } x > a \text{ and } j = 0, \dots, n. \]
Prove that for any b ∈ I with b > a there exists c ∈ (a, b) such that
\[ \frac{f(b)}{g(b)} = \frac{f^{(n)}(c)}{g^{(n)}(c)}. \]
15. Suppose that f has a local maximum at c. Prove that
\[ \liminf_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0 \ge \limsup_{x \to c^+} \frac{f(x) - f(c)}{x - c}. \]
16. Let f and g be continuous on [a, b], differentiable on (a, b) and let f(a) = f(b) = 0. Show that there exists c ∈ (a, b) such that $f'(c) = g'(c)f(c)$.
17.S Show that for any polynomial P(x) there exist finitely many intervals with union R such that P is strictly monotone on each interval.
18. Suppose that f has the property
\[ |f(x) - f(y)| \le c|x - y|^{1+\varepsilon} \quad \text{for all } x, y \in R, \]
where c, ε > 0. Prove that f is constant.
19.S Let f have a bounded derivative on R. Prove that for sufficiently large r the function g(x) := rx + f(x) is one-to-one and maps R onto R.
20. Suppose f > 0 on (1, +∞) and $\lim_{x \to +\infty} x f'(x)/f(x) \in (1, +\infty)$. Prove that x/f(x) is decreasing on (b, +∞) for some b > 1.
21. Let f be twice differentiable on (0, a), $f'' \ge 0$, and $\lim_{x \to 0^+} f(x) = 0$. Prove that f(x)/x is increasing on (0, a). Show that the conclusion is false if the hypothesis $f'' \ge 0$ is dropped.
22.S Let $g(x) = x^2 \sin(1/x)$ if x ≠ 0 and g(0) = 0. Set f(x) = x + g(x). Show that $f'(0) > 0$ but f is not monotone on any neighborhood of 0.
23. Let limx→+∞ f 0 (x) = 0. Prove that if g ≥ c > 0 on (a, +∞), then
lim f x + g(x) − f (x) = 0.
x→+∞
24. Let f be differentiable on R with $\sup_{x \in R} |f'(x)| < 1$. Prove that the sequence $\{x_n\}$ defined by $x_{n+1} = f(x_n)$ converges, where $x_1$ is arbitrary. Conclude that f has a unique fixed point; that is, there exists a unique x ∈ R such that f(x) = x.
25.S Suppose f is differentiable on an open interval I. Show that $f'$ has the intermediate value property. Conclude that if $f'(x) \ne 0$ on I, then f is strictly monotone on I. Hint. Apply the extreme value theorem to the function $g(x) = f(x) - y_0(x - a)$, a ≤ x ≤ b.
26. Let f be differentiable on I := (1, +∞). Prove that if $f'$ has finitely many zeros in I, then $\lim_{x \to +\infty} f(x)$ exists in R.
27. Let f and g have continuous derivatives on an interval I with $g' \ne 0$ and let $a_j, b_j \in I$ with $a_j < b_j$, j = 1, . . . , n. Prove that there exists c ∈ I such that
\[ \sum_{j=1}^{n} [f(b_j) - f(a_j)]\,g'(c) = \sum_{j=1}^{n} [g(b_j) - g(a_j)]\,f'(c). \]
28.S A function f is said to be uniformly differentiable on an open interval I if, given ε > 0, there exists δ > 0 such that
\[ \left| \frac{f(x) - f(y)}{x - y} - f'(y) \right| < \varepsilon \]
for all x and y in I with 0 < |x − y| < δ. Prove that f is uniformly differentiable on I iff $f'$ exists and is uniformly continuous on I.
29. Generalize the preceding exercise as follows: Let f and g be differentiable on an open interval I with $g' \ne 0$ on I. Prove that $f'/g'$ is uniformly continuous on I iff, given ε > 0, there exists a δ > 0 such that
\[ \left| \frac{f(x) - f(y)}{g(x) - g(y)} - \frac{f'(y)}{g'(y)} \right| < \varepsilon \]
for all x and y in I with 0 < |x − y| < δ.
30. Let f be differentiable on [a, +∞) and suppose that the zeros of $f'$ form a strictly increasing sequence $a_n \uparrow +\infty$. Prove that if $L := \lim_n f(a_n)$ exists in R, then $\lim_{x \to +\infty} f(x) = L$.
31.S Prove that a function f is continuously differentiable on an open interval I iff there exists a continuous function ϕ on $I^2$ such that
\[ f(x) - f(y) = \varphi(x, y)(x - y) \quad \text{for all } x, y \in I. \]
32. Let f be continuous on (−r, r) and differentiable on (−r, 0) ∪ (0, r). If $\lim_{x \to 0} f'(x)$ exists, prove that $f'(0)$ exists and $f'$ is continuous at 0.
*4.3
Convex Functions
4.3.1 Definition. A function f is said to be convex on an interval (a, b) if
\[ f\big((1 - t)u + tv\big) \le (1 - t)f(u) + tf(v) \]
for all a < u < v < b and all t ∈ [0, 1]. f is concave if −f is convex.
♦
For example, |x| is convex on R, as is easily established using the triangle inequality.
To see the geometric significance of convexity, let $L_{uv} : [u, v] \to R$ denote the function whose graph is the line segment from (u, f(u)) to (v, f(v)). Since a typical point on the line segment may be written
\[ (1 - t)\big(u, f(u)\big) + t\big(v, f(v)\big) = \big((1 - t)u + tv, \ (1 - t)f(u) + tf(v)\big), \quad t \in [0, 1], \]
we see that
\[ L_{uv}\big((1 - t)u + tv\big) = (1 - t)f(u) + tf(v). \]
This shows that f is convex iff the line segment connecting any two points on the graph of f lies above the part of the graph between the two points. (See Figure 4.4.)
FIGURE 4.4: Convex function.
Now let x ∈ (u, v). Then for some t ∈ (0, 1),
\[ x = (1 - t)u + tv = t(v - u) + u = (1 - t)(u - v) + v, \]
hence
\[ t = (x - u)/(v - u) \quad \text{and} \quad 1 - t = (v - x)/(v - u). \]
It follows that f is convex on (a, b) iff
\[ f(x) \le L_{uv}(x) = f(u)\,\frac{v - x}{v - u} + f(v)\,\frac{x - u}{v - u} \quad \text{for all } a < u < x < v < b. \tag{4.3} \]
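Inequality (4.3) lends itself to a numerical spot check. The sketch below tests it on a finite grid of triples u < x < v. The function name `satisfies_chord_inequality`, the grid size, and the two sample functions are choices made for this illustration only. A passing check is evidence of convexity on the grid, not a proof.

```python
import math

def satisfies_chord_inequality(f, a, b, samples=40, tol=1e-12):
    """Spot-check f(x) <= L_uv(x) on a grid of triples a < u < x < v < b.

    True is only numerical evidence of convexity; False exhibits a
    violation of (4.3) at some grid triple (beyond the tolerance tol).
    """
    grid = [a + (b - a) * k / (samples + 1) for k in range(1, samples + 1)]
    for i, u in enumerate(grid):
        for j in range(i + 2, len(grid)):
            v = grid[j]
            for x in grid[i + 1:j]:
                # Value at x of the chord from (u, f(u)) to (v, f(v)).
                chord = f(u) * (v - x) / (v - u) + f(v) * (x - u) / (v - u)
                if f(x) > chord + tol:
                    return False
    return True

convex_ok = satisfies_chord_inequality(math.exp, -2.0, 2.0)      # exp is convex on R
concave_fails = satisfies_chord_inequality(math.log, 0.5, 4.0)   # log is concave, so (4.3) fails
```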
4.3.2 Theorem. If f : (a, b) → R has an increasing derivative, then f is convex. In particular, f is convex if $f'' \ge 0$.
Proof. Let a < u < x < v < b. By the mean value theorem applied to f on each of the intervals [u, x] and [x, v], there exist points y ∈ (u, x) and z ∈ (x, v) such that
\[ \frac{f(x) - f(u)}{x - u} = f'(y) \le f'(z) = \frac{f(v) - f(x)}{v - x}. \]
Solving the inequality for f(x) yields (4.3).
Thus $x^{2n}$ is convex on R for any n ∈ N, ln(x) is concave on (0, +∞), and $x^p$ is convex on (0, +∞) if p ≥ 1 and concave if 0 < p < 1.
There is a partial converse to 4.3.2. For this we need the following lemma.
4.3.3 Lemma. If f is convex and a < u < x ≤ y < v < b, then
(a)
\[ \frac{f(x) - f(u)}{x - u} \le \frac{f(y) - f(u)}{y - u} \le \frac{f(v) - f(y)}{v - y}, \quad \text{and} \]
(b)
\[ \frac{f(v) - f(x)}{v - x} \le \frac{f(v) - f(y)}{v - y}. \]
Proof. Referring to Figure 4.5, for (a) we have
\begin{align*}
\frac{f(x) - f(u)}{x - u} &\le \frac{L_{uy}(x) - f(u)}{x - u} && \text{by convexity, since } u < x \le y, \\
&= \frac{f(y) - f(u)}{y - u} && \text{by equality of slopes on } L_{uy}, \\
&\le \frac{L_{uv}(y) - f(u)}{y - u} && \text{by convexity, since } u < y < v, \\
&= \frac{L_{uv}(v) - L_{uv}(y)}{v - y} && \text{by equality of slopes on } L_{uv}, \\
&\le \frac{f(v) - f(y)}{v - y} && \text{by convexity, since } u < y < v.
\end{align*}
FIGURE 4.5: Convex function inequalities.
A similar calculation verifies (b):
\[ \frac{f(v) - f(y)}{v - y} \ge \frac{L_{xv}(v) - L_{xv}(y)}{v - y} = \frac{L_{xv}(v) - L_{xv}(x)}{v - x} = \frac{f(v) - f(x)}{v - x}. \]
4.3.4 Theorem. If f is convex, then $f'_r$ and $f'_\ell$ exist, are increasing, and $f'_\ell(x) \le f'_r(x)$.
Proof. Let a < u < x ≤ y < v < b. By (a) of the lemma, the difference quotients [f(x) − f(u)]/(x − u) decrease as $x \to u^+$, so $f'_r(u)$ exists in R and
\[ f'_r(u) \le \frac{f(v) - f(y)}{v - y} < +\infty. \]
Letting $v \to y^+$ shows that $f'_r(u) \le f'_r(y)$. Therefore, $f'_r$ is increasing. Similarly, by (b) the difference quotients [f(v) − f(y)]/(v − y) increase as $y \to v^-$, so $f'_\ell(v)$ exists in R and
\[ f'_\ell(v) \ge \frac{f(v) - f(x)}{v - x} > -\infty. \]
Taking x = y in (a) of the lemma, we have
\[ \frac{f(x) - f(u)}{x - u} \le \frac{f(v) - f(x)}{v - x}. \]
Letting u ↑ x and v ↓ x, we obtain $f'_\ell(x) \le f'_r(x)$. In particular, $f'_\ell(x)$ and $f'_r(x)$ are finite.
4.3.5 Corollary. A convex function f is continuous.
Proof. By the theorem, f has finite left-hand and right-hand derivatives and
hence is left and right continuous.
4.3.6 Theorem. If a convex function f is differentiable at x ∈ (u, v), then
\[ f'(x)(t - x) + f(x) \le f(t) \quad \text{for all } t \in (u, v). \]
That is, the tangent line at (x, f(x)) lies below the graph of f on (u, v).
Proof. Since the difference quotients [f(t) − f(x)]/(t − x) decrease as t ↓ x,
\[ f'_r(x) \le \frac{f(t) - f(x)}{t - x}, \quad t > x. \]
The same difference quotients increase as t ↑ x, hence
\[ f'_\ell(x) \ge \frac{f(t) - f(x)}{t - x}, \quad t < x. \]
Therefore, if $f'(x)$ exists, then $f'(x)(t - x) + f(x) \le f(t)$ for all t.
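The tangent line inequality of 4.3.6 can be illustrated numerically. The following sketch evaluates the gap between the graph and the tangent line for the sample convex function f(t) = t⁴ at the point x = 0.5. The helper `tangent_gap` and the sample values are assumptions made for the illustration, and floating-point rounding can make the gap at t = x differ from zero by a tiny amount.

```python
def tangent_gap(f, df, x, t):
    """Return f(t) - [f'(x)(t - x) + f(x)]; Theorem 4.3.6 says this is >= 0 for convex f."""
    return f(t) - (df(x) * (t - x) + f(x))

# Hypothetical sample: f(t) = t^4 is convex on R since f''(t) = 12 t^2 >= 0.
f = lambda t: t**4
df = lambda t: 4 * t**3

# Evaluate the gap at t = -2.0, -1.9, ..., 2.0 for the tangent line at x = 0.5.
gaps = [tangent_gap(f, df, 0.5, -2.0 + 0.1 * k) for k in range(41)]
min_gap = min(gaps)   # nonnegative up to rounding; the gap vanishes at t = x
```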
4.4
Inverse Functions
In this section we prove that under suitable conditions the inverse of a
one-to-one continuous (differentiable) function is continuous (differentiable).
For this we need the following two lemmas. The proof of the first is illustrated
in Figures 4.6 and 4.7.
4.4.1 Lemma. Let f be one-to-one on an interval I. If f has the intermediate
value property, then f is strictly monotone and continuous on I.
Proof. Let a, b be arbitrary points in I with a < b. Assume, for definiteness,
that f (a) < f (b). We claim that f (a) < f (x) < f (b) for all a < x < b.
Indeed, if, say f (x) < f (a), then f (a) lies between f (x) and f (b), hence, by
the intermediate value property, there exists c ∈ (x, b) such that f (c) = f (a),
contradicting that f is one-to-one.
Next we show that f is strictly increasing on [a, b]. Let $a < x_1 < x_2 < b$ and suppose that $f(x_2) < f(x_1)$. Then $f(x_2)$ lies between f(a) and $f(x_1)$, hence there exists $d \in (a, x_1)$ such that $f(d) = f(x_2)$, again contradicting that f is one-to-one. Thus f is strictly increasing on [a, b]. It follows that f must be strictly increasing on any closed and bounded subinterval of I containing [a, b]. Since every pair of points in I lies in such a subinterval, f is strictly increasing on I.
FIGURE 4.6: f(x) < f(a) or $f(x_1) > f(x_2)$ violates one-to-one hypothesis.
FIGURE 4.7: Intermediate value property implies continuity.
Now let $x_0 \in I$. To verify continuity of f at $x_0$, note that by monotonicity
\[ \alpha := \lim_{x \to x_0^-} f(x) \le f(x_0) \le \beta := \lim_{x \to x_0^+} f(x). \]
(If $x_0$ is an endpoint, only one of these inequalities holds.) Continuity of f at $x_0$ will then follow if we show that $\alpha = f(x_0) = \beta$. Suppose, for example, that $\alpha < f(x_0)$. Choose any $x_1 < x_0$ in I. Since $f(x_1) < \alpha < f(x_0)$, there exists some $x \in (x_1, x_0)$ such that f(x) = α. But choosing $x_2 \in (x, x_0)$ then produces the contradiction $f(x) = \alpha < f(x_2) < \alpha$.
4.4.2 Lemma. If f is strictly increasing (decreasing) on an interval I, then $f^{-1}$ is strictly increasing (decreasing) on f(I).
Proof. Assume that f is strictly increasing. If $y_1 = f(x_1) < y_2 = f(x_2)$, then $x_1 < x_2$ (that is, $f^{-1}(y_1) < f^{-1}(y_2)$), since otherwise $f(x_1) \ge f(x_2)$. Therefore, $f^{-1}$ is strictly increasing on f(I).
The next two theorems are the main results on inverse functions. They
assert that the properties of continuity or differentiability of a one-to-one
function are inherited by the inverse function.
4.4.3 Theorem. Let f be continuous and one-to-one on an interval I. Then J := f(I) is an interval and $f^{-1} : J \to I$ is continuous. Moreover, f and $f^{-1}$ are strictly monotone.
Proof. Since f is continuous, it has the intermediate value property, hence J is an interval. Moreover, by 4.4.1 and 4.4.2, f and $f^{-1}$ are strictly monotone. Since $I = f^{-1}(J)$ is an interval, $f^{-1}$ has the intermediate value property. The continuity of $f^{-1}$ now follows from 4.4.1.
4.4.4 Theorem. Let I be an open interval and let f : I → R be continuous and one-to-one on I. If f is differentiable at a ∈ I and $f'(a) \ne 0$, then $f^{-1}$ is differentiable at f(a), and
\[ (f^{-1})'(f(a)) = \frac{1}{f'(a)}. \]
Proof. Let y = f(x) and b = f(a). For x near a,
\[ \frac{f^{-1}(y) - f^{-1}(b)}{y - b} = \frac{x - a}{f(x) - f(a)} = \left[ \frac{f(x) - f(a)}{x - a} \right]^{-1}. \]
Since $f^{-1}$ is continuous, $x = f^{-1}(y) \to f^{-1}(b) = a$ as y → b and the conclusion follows.
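The derivative formula of 4.4.4 can be checked numerically even when the inverse has no convenient closed form. The sketch below inverts the sample function f(x) = x³ + x by bisection and compares a central difference quotient of the inverse with 1/f′(a). The names `inverse`, `numeric`, and `predicted`, the bracket [−10, 10], and the step h are assumptions chosen for this example.

```python
def inverse(f, y, lo, hi, iters=200):
    """Solve f(x) = y on [lo, hi] by bisection, assuming f is continuous and strictly increasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda x: x**3 + x           # strictly increasing on R, since f'(x) = 3x^2 + 1 > 0
df = lambda x: 3 * x**2 + 1

a = 1.0
b = f(a)                          # b = f(a) = 2
h = 1e-5
# Central difference quotient of the inverse function at b.
numeric = (inverse(f, b + h, -10, 10) - inverse(f, b - h, -10, 10)) / (2 * h)
predicted = 1 / df(a)             # Theorem 4.4.4: (f^{-1})'(f(a)) = 1 / f'(a) = 1/4
```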
If f is differentiable with nonzero derivative on I and $y = f^{-1}(x)$, then x = f(y) and the assertion of the theorem may be written in Leibniz notation as
\[ \frac{dy}{dx} = \frac{1}{dx/dy}. \]
From 4.4.3 we obtain the following result, which will be generalized in Chapter 9 to functions on open subsets of $R^n$.
4.4.5 Inverse Function Theorem. Let f be continuously differentiable on an open interval I. If $f'(a) \ne 0$, then there exist open intervals $I_a \subseteq I$ and $J_a = f(I_a)$ with $a \in I_a$ such that f is one-to-one on $I_a$ and $f^{-1} : J_a \to I_a$ is continuously differentiable.
Proof. Since $f'$ is continuous and $f'(a) \ne 0$, there exists an open interval $I_a$ containing a on which $f' \ne 0$. By the mean value theorem, f is one-to-one on $I_a$, hence, by 4.4.3, $J_a = f(I_a)$ is an interval, and, by 4.4.4, $f^{-1} : J_a \to I_a$ is continuously differentiable.
4.4.6 Global Inverse Function Theorem. Let f be continuously differentiable with $f'$ nonzero on an open interval I. Then f is one-to-one on I, J := f(I) is an open interval, and $f^{-1} : J \to I$ is continuously differentiable.
Proof. That f is one-to-one follows from the mean value theorem. By 4.4.3, J is an interval and $f^{-1} : J \to I$ is continuous. Since continuous differentiability is a local property, 4.4.5 implies that $f^{-1}$ is continuously differentiable.
The following examples, as well as exercises below, establish the existence
and basic properties of several well-known functions.
4.4.7 Example. Since x = sin y is strictly increasing on [−π/2, π/2], the inverse function $y = \sin^{-1} x$ exists, is strictly increasing on [−1, 1], and
\[ \frac{dy}{dx} = \left( \frac{dx}{dy} \right)^{-1} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - x^2}}, \quad -1 < x < 1. \]
Similarly, x = cos y is strictly decreasing on [0, π], hence $y = \cos^{-1} x$ exists, is strictly decreasing on [−1, 1], and
\[ \frac{dy}{dx} = \left( \frac{dx}{dy} \right)^{-1} = \frac{-1}{\sin y} = \frac{-1}{\sqrt{1 - x^2}}, \quad -1 < x < 1. \]
♦
An alternate approach to the preceding example is to define the inverse sine by
\[ \sin^{-1} x = \int_0^x \frac{dt}{\sqrt{1 - t^2}}, \quad -1 < x < 1, \]
and then obtain the sine function as the inverse of $\sin^{-1}$. This allows the derivation of the standard properties of sin x, and ultimately of the other trig functions, without relying on geometric arguments. The disadvantage of this approach is that verification of these properties is detailed and lengthy. Still another approach is based on complex infinite series. For the latter, the reader may wish to consult [7].
The following example illustrates the integral approach for the exponential
function. Some of the assertions in the example rely on results from Chapters 5
and 6 but should be familiar to the reader.
4.4.8 Example. The natural logarithm function is defined by
\[ \ln x := \int_1^x \frac{1}{t}\,dt, \quad x > 0. \]
One may show that all the familiar algebraic properties of the natural log follow from this definition. (See Exercise 5.) Since ln x is strictly increasing on (0, +∞), the inverse function
\[ \exp x := \ln^{-1} x \]
exists and is strictly increasing. Since ln 2 > 0,
\[ \ln 2^n = n \ln 2 \to +\infty \quad \text{and} \quad \ln 2^{-n} = -n \ln 2 \to -\infty, \]
hence
\[ \lim_{x \to +\infty} \ln x = +\infty \quad \text{and} \quad \lim_{x \to 0^+} \ln x = -\infty. \]
It follows from these limits and the intermediate value theorem that the range of ln x, that is, the domain of exp x, must be R. Thus, by Exercise 4,
\[ \lim_{x \to -\infty} \exp x = 0 \quad \text{and} \quad \lim_{x \to +\infty} \exp x = +\infty. \]
From the fundamental theorem of calculus, proved in the next chapter,
\[ \frac{d \ln y}{dy} = \frac{1}{y}, \quad \text{hence} \quad \frac{d \exp x}{dx} = \left( \frac{d \ln y}{dy} \right)^{-1} = y = \exp x. \]
Moreover, since
\[ 1 = \left. \frac{d \ln y}{dy} \right|_{y=1} = \lim_{n \to +\infty} \frac{\ln(1 + 1/n) - \ln 1}{1/n} = \lim_{n \to +\infty} \ln(1 + 1/n)^n, \]
continuity of exp and 2.2.4 imply that
\[ \exp 1 = \lim_{n \to +\infty} \exp\big( \ln(1 + 1/n)^n \big) = \lim_{n \to +\infty} (1 + 1/n)^n = e. \]
Additional properties of exp x may be found in the exercises, including the identity $\exp r = e^r$, r ∈ Q. Because of this identity, we frequently write $e^x$ for exp x. Indeed, the function exp is the basis for rigorous definitions of the general exponential function $a^x$, a > 0, and the power function $x^a$, x ≥ 0. (See Exercises 8 and 9.)
♦
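The construction in 4.4.8 can be imitated numerically. The sketch below approximates ln x by applying composite Simpson's rule to the defining integral and then recovers exp 1 by inverting that approximation with bisection. The function names, the number of subintervals, and the bracket [0.1, 10] are assumptions made for this illustration. Simpson's rule stands in for the exact integral and is not part of the text's development.

```python
import math

def ln_integral(x, n=2000):
    """Approximate ln x = integral from 1 to x of dt/t by composite Simpson's rule (n even)."""
    if x <= 0:
        raise ValueError("ln is defined only for x > 0")
    h = (x - 1.0) / n                  # negative when x < 1, which makes the integral negative
    total = 1.0 + 1.0 / x              # integrand 1/t at the endpoints t = 1 and t = x
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) / (1.0 + k * h)
    return total * h / 3.0

def exp_inverse(y, lo=0.1, hi=10.0, iters=60):
    """Invert ln_integral by bisection; assumes the solution of ln x = y lies in [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if ln_integral(mid) < y:       # ln is strictly increasing, so the solution is to the right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

e_from_inverse = exp_inverse(1.0)               # exp 1 obtained as ln^{-1}(1)
e_from_sequence = (1 + 1 / 10**6) ** 10**6      # one term of the sequence (1 + 1/n)^n
```

Both values can be compared with `math.e`. The sequence term converges slowly, so it agrees to fewer digits than the inverse.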
Exercises
1. Find $f^{-1}$ and its domain for each of the following functions f with the given domain:
(a) $x^2 - 4x + 5$, [2, +∞).
(b)S $\dfrac{3x + 2}{2x + 3}$, R \ {−3/2}.
(c) $\dfrac{5e^{-x} + 2}{3e^{-x} + 7}$, (−∞, +∞).
(d) $\sin^2 x - 4 \sin x + 3$, [−π/2, π/2].
(e) $e^x - 2e^{-x}$, (−∞, +∞).
(f)S $\dfrac{2 + \cos x}{3 + \cos x}$, (0, π).
2. Let f(x) = ax + |x| + |x − 1|. Find all values of a for which $f^{-1}$ exists on R. For these values, find $f^{-1}$.
3. Give an example of a one-to-one continuous function on the union of
two intervals that is (a) not monotone, (b) strictly monotone but with
discontinuous inverse.
4. Let f be defined, continuous, and strictly increasing on (a, b), so the limits
\[ c := \lim_{x \to a^+} f(x) \quad \text{and} \quad d := \lim_{x \to b^-} f(x) \]
exist in R. Show that the domain of $f^{-1}$ is (c, d) and that
\[ \lim_{x \to c^+} f^{-1}(x) = a \quad \text{and} \quad \lim_{x \to d^-} f^{-1}(x) = b. \]
5. Verify the following properties of ln x, as defined in 4.4.8:
(a) ln 1 = 0, ln e = 1.
(b)S ln(xy) = ln x + ln y.
(c) ln(x/y) = ln x − ln y.
(d) $\ln x^r = r \ln x$, r ∈ Q.
6. Prove that exp(x + y) = exp(x) exp(y).
7. For c, d ∈ R with c > 0, define $c^d = \exp(d \ln c)$. Show that this definition agrees with the usual one if d is rational and verify the following properties, where x, y ∈ R and a, b > 0.
(a) $\ln a^x = x \ln a$.
(b)S $a^x a^y = a^{x+y}$.
(c) $a^x / a^y = a^{x-y}$.
(d) $(a^x)^y = a^{xy}$.
(e) $(ab)^x = a^x b^x$.
(f) $a^{\ln b} = b^{\ln a}$.
8. Let a > 0, a ≠ 1, and define $a^x$ as in Exercise 7. Find $\lim_{x \to -\infty} a^x$, $\lim_{x \to +\infty} a^x$, and $(a^x)'$.
9.S Let a ∈ R and for x > 0 define $x^a$ as in Exercise 7. Prove the power rule $(x^a)' = a x^{a-1}$.
10. Prove that tan x restricted to (−π/2, π/2) has a differentiable inverse defined on R. Find $\lim_{x \to -\infty} \tan^{-1} x$, $\lim_{x \to +\infty} \tan^{-1} x$, and $(\tan^{-1} x)'$.
11. Prove that sec x restricted to [0, π/2) ∪ [π, 3π/2) has a continuous inverse defined on (−∞, −1] ∪ [1, +∞). Show that $\sec^{-1} x$ is differentiable on (−∞, −1) ∪ (1, +∞) and compute its derivative. Also, find $\lim_{x \to -\infty} \sec^{-1} x$ and $\lim_{x \to +\infty} \sec^{-1} x$.
12. Verify the inequalities
(a) $\dfrac{x - 1}{x} < \ln x < x - 1$, x > 1.
(b) $|\tan^{-1} x - \tan^{-1} y| \le |x - y|$.
(c) $\dfrac{y - x}{\sqrt{1 - x^2}} < |\sin^{-1} y - \sin^{-1} x| < \dfrac{y - x}{\sqrt{1 - y^2}}$, −1 < x < y < 1.
13. Verify the identities
(a) $\tan(\sin^{-1} x) = \dfrac{x}{\sqrt{1 - x^2}}$, −1 < x < 1.
(b) $\sin^{-1} x + \cos^{-1} x = \pi/2$, −1 ≤ x ≤ 1.
(c)S $\cos^{-1}\left( \dfrac{x^2 - 1}{x^2 + 1} \right) + 2 \tan^{-1} x = \pi$, x ≥ 0.
(d) $\cos^{-1} x = 2 \sin^{-1} \sqrt{\dfrac{1 - x}{2}}$, −1 ≤ x ≤ 1.
(e) $\tan^{-1} x + \tan^{-1}(2/x) + \tan^{-1}(x + 2/x) = \pi$, x ≠ 0.
14.S Suppose f satisfies f(x + y) = f(x)f(y) for all x, y ∈ R. Show that if $a := f'(0)$ exists, then $f(x) = f(0)e^{ax}$.
15. Suppose that f : [0, 1] → [0, 1] is continuous, one-to-one, onto, and $f = f^{-1}$. Prove that either f(x) = x for all x or f is monotone decreasing.
16. Suppose $f'$ is one-to-one on an open interval I. Show that $f'$ is continuous and strictly monotone on I. (See Exercise 4.2.25.)
17. Let f be differentiable on an open interval I with $f' \ne 0$. Let a, b ∈ I with a < b and suppose that f : [a, b] → [a, b] is one-to-one and onto. Prove that there exists c ∈ (a, b) such that
\[ \frac{f(b) - f(a)}{f^{-1}(b) - f^{-1}(a)} = f'(c)\, f'\big( f^{-1}(c) \big). \]
18.S Let f be twice differentiable and $f' \ne 0$ on an open interval I. Show that $(f^{-1})''(x)$ exists on f(I) and find a formula.
4.5
L’Hospital’s Rule
The rule for calculating the limit of a quotient of functions, namely,
\[ \lim_{\substack{x \to a \\ x \in E}} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a,\, x \in E} f(x)}{\lim_{x \to a,\, x \in E} g(x)}, \tag{4.4} \]
requires that the limits on the right are finite and the denominator is not 0. If, instead, the limits in the quotient are both zero or ±∞, then the expression on the left in (4.4) is called an indeterminate form of type $\frac{0}{0}$ or $\frac{\pm\infty}{\pm\infty}$, respectively. There are other types of indeterminate forms, but all may be converted to one of these. The following theorem describes a method for evaluating these limits.
4.5.1 l’Hospital’s Rule. Let J be an open interval, finite or infinite, and let a ∈ R be an accumulation point of J. Suppose that f and g are differentiable on E := J \ {a} and that $g(x)g'(x) \ne 0$ for every x ∈ E. If the limits
\[ A := \lim_{\substack{x \to a \\ x \in E}} f(x), \quad B := \lim_{\substack{x \to a \\ x \in E}} g(x), \quad \text{and} \quad L := \lim_{\substack{x \to a \\ x \in E}} \frac{f'(x)}{g'(x)} \]
exist in R and either A = B = 0 or B = ±∞, then
\[ \lim_{\substack{x \to a \\ x \in E}} \frac{f(x)}{g(x)} = L. \]
Proof. There are a number of cases to consider, but the proofs of many of these are essentially the same. We prove the theorem for four fundamentally different cases and for one-sided limits, so E = (a, c) or (c, a) for some c.
As a first step, we use the Cauchy mean value theorem to obtain, for every pair of distinct numbers x, b ∈ E, a number ξ = ξ(x, b) between x and b such that
\[ [f(x) - f(b)]\,g'(\xi) = [g(x) - g(b)]\,f'(\xi). \tag{4.5} \]
Now set
\[ h(x) = \frac{f(x)}{g(x)}. \]
Case 1: A = B = 0, a and L are finite, and E = (a, c). Extend f and g continuously to [a, c) by defining f(a) = g(a) = 0. Taking b = a and x ∈ (a, c) in (4.5) we see that
\[ h(x) = \frac{f'(\xi)}{g'(\xi)}. \]
Since ξ → a as x → a, $\lim_{x \to a^+} h(x) = L$, as required.
For the remaining cases, we use the Cauchy mean value theorem in the following form. Divide (4.5) by $g'(\xi)g(x)$ and solve the resulting equation for h = f/g to obtain
\[ h(x) = \frac{f(b)}{g(x)} + \left[ 1 - \frac{g(b)}{g(x)} \right] \frac{f'(\xi)}{g'(\xi)}, \quad x, b \in E. \tag{4.6} \]
Case 2: A = B = 0, a = L = +∞, and E = (c, +∞). Let M > 0 and choose $x_0 \in E$ such that
\[ \frac{f'(x)}{g'(x)} > 2M \quad \text{for } x > x_0. \]
Let $b > x > x_0$. For large b, g(b)/g(x) < 1/2, hence from (4.6)
\[ h(x) \ge \frac{f(b)}{g(x)} + \frac{1}{2}(2M) = \frac{f(b)}{g(x)} + M. \]
Letting b → +∞ we see that h(x) ≥ M. Therefore, $\lim_{x \to +\infty} h(x) = +\infty$.
Case 3: B = +∞, a and L are finite and E = (c, a). Given ε > 0, choose b ∈ E such that
\[ \left| \frac{f'(t)}{g'(t)} - L \right| < \varepsilon/2 \quad \text{for all } t \in (b, a). \]
Let x ∈ (b, a). By (4.6),
\[ h(x) - \frac{f'(\xi)}{g'(\xi)} = \frac{f(b)}{g(x)} - \frac{g(b)}{g(x)}\,\frac{f'(\xi)}{g'(\xi)}. \]
Since the right side tends to 0 as x → a,
\[ |h(x) - L| \le \left| h(x) - \frac{f'(\xi)}{g'(\xi)} \right| + \left| \frac{f'(\xi)}{g'(\xi)} - L \right| < \varepsilon/2 + \varepsilon/2 = \varepsilon \]
for all x near a. Therefore, $\lim_{x \to a^-} h(x) = L$.
Case 4: B = +∞, a = L = +∞, and E = (c, +∞). Given M > 0, choose b > c such that
\[ \frac{f'(t)}{g'(t)} > 3M \quad \text{for all } t > b. \]
Let x > b such that g(x) > g(b). By (4.6),
\[ h(x) \ge \frac{f(b)}{g(x)} + \left[ 1 - \frac{g(b)}{g(x)} \right] 3M. \]
Since the quotients on the right side tend to zero, for all sufficiently large x we have
\[ h(x) \ge -\frac{M}{2} + \left[ 1 - \frac{1}{2} \right](3M) = M. \]
Therefore, $\lim_{x \to +\infty} h(x) = +\infty$.
The following examples illustrate typical applications of l’Hospital’s rule.
Examples. (a) The limit
\[ L := \lim_{x \to 0} \frac{x - \tan x}{x^3} \]
is of the form $\frac{0}{0}$, hence
\[ L = \lim_{x \to 0} \frac{1 - \sec^2 x}{3x^2} = \lim_{x \to 0} \frac{2 \sec^2 x \tan x}{-6x} = \lim_{x \to 0} \frac{\sec^4 x + 2 \sec^2 x \tan^2 x}{-3} = -\frac{1}{3}. \]
Note that each step except the last produces a limit of the form $\frac{0}{0}$, allowing another application of l’Hospital’s rule. The validity of each step is ultimately justified by the existence of the final limit.
(b) The limit
\[ L := \lim_{x \to +\infty} \frac{\sin(1/x)}{e^{1/x} - 1} \]
is of the form $\frac{0}{0}$; however, it is complicated to apply l’Hospital’s rule directly. Making the substitution y = 1/x produces a more tractable problem:
\[ L = \lim_{y \to 0^+} \frac{\sin y}{e^y - 1} = \lim_{y \to 0^+} \frac{\cos y}{e^y} = 1. \]
(c) The limit
\[ L := \lim_{x \to +\infty} x \sin(1/x) \]
is of the form ∞ · 0, but a simple algebraic manipulation produces the form $\frac{0}{0}$:
\[ L = \lim_{x \to +\infty} \frac{\sin(1/x)}{1/x} = \lim_{y \to 0^+} \frac{\sin y}{y} = 1. \]
Here, l’Hospital’s rule was unnecessary, since we could use a known limit.
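(d) The limit $L := \lim_{x \to 1^+} x^{1/(x^p - 1)}$, p > 0, is of the form $1^\infty$, so we take logarithms to obtain the form $\frac{0}{0}$:
\[ \lim_{x \to 1^+} \ln\left[ x^{1/(x^p - 1)} \right] = \lim_{x \to 1^+} \frac{\ln x}{x^p - 1} = \lim_{x \to 1^+} \frac{1/x}{p x^{p-1}} = \frac{1}{p}. \]
Thus $L = e^{1/p}$.
(e) The technique used in (d) shows that
\[ \lim_{x \to +\infty} \left( 1 + \frac{t}{x} \right)^x = e^t, \]
since
\[ \lim_{x \to +\infty} \ln\left( 1 + \frac{t}{x} \right)^x = \lim_{y \to 0^+} \frac{\ln(1 + ty)}{y} = \lim_{y \to 0^+} \frac{t}{1 + ty} = t. \]
(f) The limit
\[ L := \lim_{x \to \pi/2^+} \left[ \frac{1}{x - \pi/2} + \sec x \right] \]
is of the form ∞ − ∞. Combining fractions we obtain a limit of the form $\frac{0}{0}$. Thus
\begin{align*}
L &= \lim_{x \to \pi/2^+} \frac{\cos x + x - \pi/2}{(x - \pi/2) \cos x} \\
&= \lim_{x \to \pi/2^+} \frac{1 - \sin x}{(\pi/2 - x) \sin x + \cos x} \\
&= \lim_{x \to \pi/2^+} \frac{-\cos x}{(\pi/2 - x) \cos x - 2 \sin x} \\
&= 0.
\end{align*}
♦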
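The limits in examples (a), (b), and (e) can be sanity-checked by evaluating the quotients at points close to the limiting value. The sketch below does this for one sample point each. The function names and sample points are choices made for this illustration, and such evaluations only suggest the value of a limit. They are no substitute for the arguments above.

```python
import math

def example_a(x):
    """Quotient from example (a); expected to approach -1/3 as x -> 0."""
    return (x - math.tan(x)) / x**3

def example_b(x):
    """Quotient from example (b); math.expm1(y) computes e^y - 1 accurately for small y."""
    return math.sin(1 / x) / math.expm1(1 / x)

def example_e(x, t):
    """Expression from example (e); expected to approach e^t as x -> +infinity."""
    return (1 + t / x) ** x

values = {
    "a": example_a(1e-2),        # compare with -1/3
    "b": example_b(1e4),         # compare with 1
    "e": example_e(1e6, 2.0),    # compare with math.exp(2.0)
}
```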
Exercises
1. Evaluate the following limits, where p, q > 0:
epx − eqx
epx − ep
(b) lim
x→0
x→1 tan(x − 1)
sin x
ln(sin px)
(d) S lim x 1 − e1/x (e) lim+
x→+∞
x→0 ln(sin qx)
−1
x
−
tan
x
1
1
S
(g) lim
(h) lim+
−
x→0 x − sin−1 x
x→0
x sin x
(a) S lim
(j)
(xp )
lim (sin x)(ln x) (k) lim+ x
x→0+
x→0
x−1
1
x
x+1
−
(n) lim+
(m) S lim+
x→1
x→0
tan x x
x−1
2
1 − cos x
sin x + cos x − 1
(p) S lim 2
(q) lim
x→0 x + x3 sin x
x→0
ln(1 + x)
−x
1
S
1/(ln ln x)p
(s) lim x
(t) lim
1− √
x→+∞
x→+∞
x
S
(v) S lim+ xsin x
x→0
x cos x − sin x
x→0
x2 sin x
(w) lim
ln(3x2 − 1)
x→+∞ ln(5x2 − 1)
p sin(px) − p2 x
(f) lim
x→0
x3
(c) lim
(i) lim+ ln(x − 1) ln x
x→1
1/x2
(l) lim (cos x)
x→0
√
x ln x
(o) lim
x→1 x − 1
x
x−1
(r) lim
x→+∞ x + 1
(u) lim+ (sin x)
x
x→0
(x) lim+
x→0
(1 + x)1/x − e
x
2. For each function f : (0, 1] → R below, define f(0) so that f is continuous on [0, 1].
(a) $\dfrac{1 - e^x}{x}$.
(b) $\dfrac{\ln(1 + x)}{x}$.
(c)S $\dfrac{\sin 5x}{\sin 3x}$.
(d) $\dfrac{x}{\tan x}$.
(e) $x \ln x$.
(f) $\dfrac{1 - \cos 2x}{1 - \cos 3x}$.
3. Find $\lim_n a_n$, where $a_n =$
(a)S $\sin^{1/n}(1/n)$.
(b) $n - n^2 \ln(1 + 1/n)$.
(c) $n\,[(1 + 1/n)^n - e]$.
4. Show that
\[ (n + 1/n)^p - n^p \to \begin{cases} 0 & \text{if } p < 2, \\ 2 & \text{if } p = 2, \\ +\infty & \text{if } p > 2. \end{cases} \]
5. By considering the sequences {n} and {n + 1/n}, use l’Hospital’s rule to prove that $e^x$ is not uniformly continuous on [0, +∞).
6.S Let $f(x) = x^{1 + 1/x}$. Evaluate $\lim_n [f(n + 1) - f(n)]$.
7. Let f be differentiable on (a, b) and suppose that $\lim_{x \to a^+} f(x)$ and $\lim_{x \to a^+} f'(x)$ exist in R. Find a continuous extension of f to [a, b) such that $f'$ exists and is continuous at a.
8. Let g be differentiable on (1, +∞) and h differentiable on (−∞, 1] with
\[ \lim_{x \to 1^+} g(x) = h(1) \quad \text{and} \quad \lim_{x \to 1^+} g'(x) = h'_\ell(1). \tag{†} \]
Define
\[ f(x) = \begin{cases} g(x) & \text{if } x > 1, \\ h(x) & \text{if } x \le 1. \end{cases} \]
Show that f is differentiable at x = 1 and hence on R. Conversely, suppose that $f'(1)$ exists. Do the limit equations in (†) hold?
9.S Let f and g be differentiable on (0, +∞) with
\[ \lim_{x \to +\infty} f(x) = \lim_{x \to +\infty} g(x) = +\infty, \quad \text{and} \quad \lim_{x \to +\infty} \frac{f'(x)}{g'(x)} \in (0, +\infty). \]
Evaluate
\[ \lim_{x \to +\infty} \frac{\ln f(x)}{\ln g(x)}. \]
10. Let f be differentiable in a neighborhood of a and suppose that $f''(a)$ exists. For α, β ∈ R calculate
(a)S $\displaystyle \lim_{h \to 0} \frac{\beta f(a + \alpha h) - \alpha f(a + \beta h) + (\alpha - \beta) f(a)}{h^2}$.
(b) $\displaystyle \lim_{h \to 0} \frac{f(a + \alpha h) + f(a + \beta h) - 2f(a)}{h^2}$ if $f'(a) = 0$.
11. Suppose that f has n derivatives on [a, +∞) and that $\lim_{x \to +\infty} f^{(n)}(x)$ exists in R. Prove that $\lim_{x \to +\infty} f(x)/x^n$ exists in R.
12.S Suppose that f has n derivatives on (0, a) and $L := \lim_{x \to 0^+} x^{2n} f^{(n)}(x)$ exists in R. Find $\lim_{x \to 0^+} x^n f(x)$ in terms of L.
13. Let f be differentiable on (1, +∞) and $\lim_{x \to +\infty} f(x) = 0$. Prove that if $\lim_{x \to +\infty} x^2 f'(x)$ exists in R, then $\lim_{x \to +\infty} x f(x)$ also exists in R. Is the converse true?
14. Suppose that, in a deleted neighborhood of 0, f is differentiable with $f' \ne 0$ and that $\lim_{x \to 0} f(x) = 0$. Prove that if $\lim_{x \to 0} f(x)/f'(x)$ exists, then it must equal 0.
15. Let g(x) be differentiable on (1, ∞) with g and $g'$ nonzero and let f(x) be differentiable in a neighborhood of 0. Suppose that $\lim_{x \to +\infty} g(x) = 0$, f(0) = 0 and $f'$ is continuous at 0. Find
\[ \lim_{x \to +\infty} \frac{f(g(x))}{g(x)}. \]
Give nontrivial examples of functions f and g that satisfy these conditions.
16.S Let f and g be differentiable on (1, +∞) with $g' \ne 0$ and suppose that $\lim_{x \to +\infty} f(x) = \lim_{x \to +\infty} g(x) = +\infty$ and that the limit $L := \lim_{x \to +\infty} f'(x)$ exists in R. Find
\[ \lim_{x \to +\infty} \frac{f(g(x))}{g(x)}. \]
Give nontrivial examples that satisfy these conditions with L finite.
17. Let f be differentiable on (1, +∞) and suppose that $\lim_{x \to +\infty} f(x)$ and $\lim_{x \to +\infty} f'(x)$ exist in R. Prove that the second limit must be zero. Does the assertion still hold if $\lim_{x \to +\infty} f(x)$ is infinite?
18.S Let f be differentiable on (1, +∞) and suppose that $\lim_{x \to +\infty} f(x)$ and $\lim_{x \to +\infty} x f'(x)$ exist in R. Prove that the second limit is zero. Does the assertion still hold if $\lim_{x \to +\infty} f(x)$ is infinite?
19. Let f be differentiable in a deleted neighborhood of 0 and suppose that $\lim_{x \to 0} f(x)$ and $\lim_{x \to 0} f'(x) \tan x$ exist in R. Prove that the second limit must be 0. Does the assertion still hold if $\lim_{x \to 0} f(x)$ is infinite?
20. Let f be differentiable on (0, b) and suppose that the limits $\lim_{x \to 0^+} f(x)$ and $\lim_{x \to 0^+} x^2 f'(x)$ exist in R. Prove that one of these limits must be zero. Does the assertion still hold if $\lim_{x \to 0^+} f(x)$ is infinite?
21. Let $f''$ exist and be continuous on (−1, 1) and $f(0) = f'(0) = 0$. Prove that there exists a continuous function g on (0, 1) such that $f(x) = x^2 g(x)$. Must g be differentiable at 0?
4.6 Taylor’s Theorem on R
Taylor’s theorem may be viewed as a generalization of the mean value
theorem. Its importance derives from its use in establishing various inequalities
and from its fundamental connection with power series.
4.6.1 Taylor’s Theorem. Let $f$ have $n + 1$ derivatives in an open interval $I$. Then, for each $x, a \in I$ with $x \ne a$, there exists a number $c$ between $x$ and $a$ such that
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}. \tag{4.7}$$
Proof. Assume for definiteness that $a < x$. Define a function $g$ on $[a, x]$ by
$$g(t) = \sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!}(x-t)^k + \alpha\,\frac{(x-t)^{n+1}}{(n+1)!} - f(x), \tag{4.8}$$
where $\alpha$ is chosen so that $g(a) = 0$. Since $g$ is continuous on $[a, x]$, differentiable on $(a, x)$ and $g(x) = g(a)$, there exists, by Rolle’s theorem, $c \in (a, x)$ such that $g'(c) = 0$. From the calculations
$$\frac{d}{dt}\,\frac{f^{(k)}(t)}{k!}(x-t)^k = \begin{cases} \dfrac{f^{(k+1)}(t)}{k!}(x-t)^k - \dfrac{f^{(k)}(t)}{(k-1)!}(x-t)^{k-1} & \text{if } k \ge 1, \\[1ex] f'(t) & \text{if } k = 0, \end{cases}$$
we have
$$g'(t) = \sum_{k=0}^{n} \frac{f^{(k+1)}(t)}{k!}(x-t)^k - \sum_{k=0}^{n-1} \frac{f^{(k+1)}(t)}{k!}(x-t)^k - \alpha\,\frac{(x-t)^n}{n!} = \frac{f^{(n+1)}(t)}{n!}(x-t)^n - \alpha\,\frac{(x-t)^n}{n!}.$$
In particular,
$$0 = g'(c) = \frac{f^{(n+1)}(c)}{n!}(x-c)^n - \alpha\,\frac{(x-c)^n}{n!},$$
hence $\alpha = f^{(n+1)}(c)$. Thus from (4.8),
$$0 = g(a) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1} - f(x),$$
which is (4.7).
Equation (4.7) is frequently written $f(x) = T_n(x, a) + R_n(x, a)$, where
$$T_n(x, a) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k \quad\text{and}\quad R_n(x, a) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.$$
The expression $T_n(x, a)$ is called the nth Taylor polynomial of $f$ about $a$, and $R_n(x, a)$ is called the remainder. It may be shown that $T_n(x, a)$ is the unique polynomial of degree $\le n$ that best approximates $f$ near $a$ in the sense that
$$\lim_{x\to a} \frac{f(x) - T_n(x, a)}{(x-a)^n} = 0.$$
(See Exercise 4.)
The remainder term Rn (x, a) has other forms, one of which is given in
Exercise 3. Observe that if Rn (x, a) → 0 as n → +∞, then Tn (x, a) → f (x),
which implies that f (x) is expressible as a power series about a. We exploit
this idea in Section 7.4.
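The decomposition of f into a Taylor polynomial and a remainder can be checked numerically. The following is a minimal sketch, assuming f = exp (so every derivative is again exp) and using hypothetical helper names `taylor_exp` and `lagrange_bound` that do not appear in the text. Since c lies between a and x, the remainder in (4.7) is bounded in absolute value by the largest value of exp between a and x times |x − a|^(n+1)/(n + 1)!.

```python
import math

def taylor_exp(x, a, n):
    """n-th Taylor polynomial T_n(x, a) of exp about a.

    Every derivative of exp is exp, so f^(k)(a) = exp(a) for all k.
    """
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))

def lagrange_bound(x, a, n):
    """Upper bound for |R_n(x, a)| obtained from the remainder term in (4.7).

    exp is increasing, so its maximum between a and x is exp(max(a, x)).
    """
    return math.exp(max(a, x)) * abs(x - a) ** (n + 1) / math.factorial(n + 1)

x, a = 1.0, 0.0
for n in (2, 5, 10):
    error = abs(math.exp(x) - taylor_exp(x, a, n))
    print(n, error, lagrange_bound(x, a, n))
```

In each printed row the actual error should not exceed the bound, and both shrink rapidly as n grows, which is the behavior R_n(x, a) → 0 described above.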
The following application of Taylor’s theorem is a generalization of the
second derivative test.
4.6.2 nth Derivative Test. Let $f$ have $n$ continuous derivatives on an open interval $I$ and let $a \in I$ with $f^{(j)}(a) = 0$, $1 \le j \le n - 1$, and $f^{(n)}(a) \ne 0$.
(a) If $n$ is even and $f^{(n)}(a) > 0$ ($f^{(n)}(a) < 0$), then $f$ has a local minimum (local maximum) at $a$.
(b) If $n$ is odd, then $f$ has neither a local minimum nor a local maximum at $a$.
Proof. Assume $f^{(n)}(a) > 0$. By continuity, $f^{(n)}(x) > 0$ for all $x$ in an open interval $J$ containing $a$. Let $x \in J$, $x \ne a$. By Taylor’s theorem, there exists $c$ between $a$ and $x$ such that
$$f(x) = f(a) + f^{(n)}(c)\,\frac{(x-a)^n}{n!}.$$
Thus if $n$ is even, then $f(x) > f(a)$, hence $f$ has a local minimum at $a$. If $n$ is odd, then $f(x) > f(a)$ if $x > a$ and $f(x) < f(a)$ if $x < a$, so $f$ has neither a local maximum nor a local minimum at $a$. A similar argument works for the case $f^{(n)}(a) < 0$.
Note that the familiar second derivative test, obtained by taking $n = 2$ in the theorem, is inconclusive for the function $f(x) = x^4$ at $a = 0$. Here, one must take $n = 4$.
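The test reduces to a simple decision rule once the derivative values at a are known. The sketch below is illustrative only: the function name `nth_derivative_test` and the convention that the caller supplies the list f′(a), f″(a), … are assumptions, not part of the text.

```python
def nth_derivative_test(derivatives):
    """Classify the point a from the values f'(a), f''(a), f'''(a), ...

    derivatives[j - 1] is assumed to hold f^(j)(a). The first nonzero
    entry plays the role of f^(n)(a) in 4.6.2.
    """
    for order, value in enumerate(derivatives, start=1):
        if value != 0:
            if order % 2 == 1:
                # n odd: neither a local minimum nor a local maximum
                return "neither"
            # n even: the sign of f^(n)(a) decides
            return "local minimum" if value > 0 else "local maximum"
    # all supplied derivatives vanish: the test gives no information
    return "inconclusive"

# f(x) = x^4 at a = 0: f' = f'' = f''' = 0 and f^(4) = 24
print(nth_derivative_test([0, 0, 0, 24]))
# f(x) = x^3 at a = 0: f' = f'' = 0 and f''' = 6
print(nth_derivative_test([0, 0, 6]))
```

Passing only the first two derivatives of x^4 returns "inconclusive", which mirrors the remark that the second derivative test alone does not settle that case.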
Exercises
1. Define $f(0) = 0$ and $f(x) = e^{-1/x^2}$, $x \ne 0$. Prove that $f^{(n)}$ exists on $\mathbb{R}$ and $f^{(n)}(0) = 0$ for all $n$. Conclude that every Taylor polynomial for $f$ about 0 is identically 0.
2. Verify the following inequalities:
(a) $\displaystyle \sum_{k=0}^{2n-1} (-1)^k x^k < \frac{1}{1+x} < \sum_{k=0}^{2n} (-1)^k x^k$, $x > 0$.
(b)S $\displaystyle \sum_{k=0}^{2n-1} \frac{(-1)^k}{k!} x^k < e^{-x} < \sum_{k=0}^{2n} \frac{(-1)^k}{k!} x^k$, $x > 0$.
(c) $\displaystyle \sum_{k=1}^{2n} \frac{(-1)^{k+1}}{(2k-1)!} x^{2k-1} < \sin x < \sum_{k=1}^{2n+1} \frac{(-1)^{k+1}}{(2k-1)!} x^{2k-1}$, $0 < x < \pi$.
(d) $\displaystyle \sum_{k=0}^{2n-1} \frac{(-1)^k}{(2k)!} x^{2k} < \cos x < \sum_{k=0}^{2n} \frac{(-1)^k}{(2k)!} x^{2k}$, $0 < x < \pi$.
(e) $\displaystyle \sum_{k=1}^{n-1} \frac{(-1)^{k-1}}{k} x^k < \ln(1 + x) < \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} x^k$ if $n$ is odd, the reverse inequalities if $n$ is even.
3.S ⇓2 Show that if $f^{(n+1)}$ is continuous on $I$, then
$$R_n(x, a) = \frac{1}{n!} \int_a^x (x-t)^n f^{(n+1)}(t)\,dt.$$
Hint. Integrate by parts $n$ times.
4. Prove that a polynomial $P_n(x) = \sum_{k=0}^{n} a_k (x-a)^k$ satisfies
$$\lim_{x\to a} \frac{f(x) - P_n(x)}{(x-a)^n} = 0$$
iff $P_n = T_n$, the nth Taylor polynomial of $f$ about $a$.
5.S Let $P(x) = \sum_{k=0}^{n} a_k (x-a)^k = \sum_{k=0}^{n} b_k (x-b)^k$. Show that
$$b_k = \sum_{j=0}^{n-k} \binom{j+k}{k} (b-a)^j a_{k+j}.$$
6. Let P be a polynomial of degree n. Prove that the polynomials P (x ± 1)
may be written as linear combinations of P (k) (x), k = 0, . . . , n. Find
simplified expressions for P (x + 1) ± P (x − 1).
7. Let $f$ have $n$ derivatives on $[0, 1]$. Show that for each $y \ne f(1)$ there exists an extension $g$ of $f$ to $[0, +\infty)$ with $n$ derivatives such that $g(b) = y$ for some $b > 1$.
*4.7 Newton’s Method
A simple zero of a differentiable function $f$ is a number $z$ such that $f(z) = 0$ and $f'(z) \ne 0$. Newton’s method is a rapidly converging recursion scheme for approximating such a zero. The idea is to choose $x_1$ near $z$ and then define a sequence $\{x_n\}$ recursively by
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \quad n = 1, 2, \ldots, \tag{4.9}$$
as illustrated in Figure 4.8. Under suitable conditions, the sequence is well-defined and converges to $z$, hence may be used to approximate $z$ to (theoretically) any desired degree of accuracy.
4.7.1 Newton’s Method. Let $f''$ be continuous on an open interval $I$ and let $z$ be a simple zero of $f$ in $I$. If $x_1$ is chosen sufficiently near $z$, then the sequence $\{x_n\}$ lies in $I$ and converges to $z$.
2 This exercise will be used in 5.6.3.
FIGURE 4.8: Newton’s method. (The graph $y = f(x)$ and the tangent line $y = f(x_n) + f'(x_n)(x - x_n)$, with the points $z$, $x_{n+2}$, $x_{n+1}$, $x_n$ marked on the $x$-axis.)
Proof. Since $f'(z) \ne 0$, there exists a neighborhood $I_z$ of $z$ contained in $I$ on which $|f'| \ge c > 0$. Suppose that $x_n \in I_z$. By Taylor’s theorem, for each $x \in I$ there exists $\xi$ between $x$ and $x_n$ such that
$$f(x) = f(x_n) + f'(x_n)(x - x_n) + \tfrac12 f''(\xi)(x - x_n)^2.$$
In particular,
$$0 = f(z) = f(x_n) + f'(x_n)(z - x_n) + \tfrac12 f''(\xi)(z - x_n)^2.$$
Dividing by $f'(x_n)$, we have
$$x_{n+1} - z = x_n - z - \frac{f(x_n)}{f'(x_n)} = \frac{f''(\xi)}{2f'(x_n)}(z - x_n)^2.$$
Thus if $d$ is the maximum of $|f''|$ on $I_z$, then
$$|x_{n+1} - z| \le \alpha |x_n - z|^2, \quad \alpha := \frac{d}{2c}.$$
Iterating, we have
$$|x_{n+1} - z| \le \cdots \le \alpha^{2^{k+1}-1} |x_{n-k} - z|^{2^{k+1}} \le \cdots \le \alpha^{2^n - 1} |x_1 - z|^{2^n}.$$
Thus if $x_1$ is sufficiently near $z$, and in particular if $\alpha |x_1 - z| < 1$, then $x_n \in I_z$ for all $n$ and $x_n \to z$.
4.7.2 Example. Let $f(x) = \sin x - x/3$. Since
$$f(3\pi/4) = 1/\sqrt{2} - \pi/4 < 0 < 1 - \pi/6 = f(\pi/2),$$
$f$ has a zero in $[\pi/2, 3\pi/4]$ by the intermediate value theorem. Taking $x_1 = 3\pi/4$ yields the zero 2.27886266, accurate to eight decimal places. Taking $x_1 = 1$ produces the symmetric zero −2.27886266, while $x_1 = \pi/4$ produces 0.
♦
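The recursion (4.9) translates directly into a short loop. The following is a minimal sketch for the function of Example 4.7.2 started at x1 = 3π/4; the helper name `newton`, the stopping tolerance, and the iteration cap are illustrative choices rather than anything prescribed by the text.

```python
import math

def newton(f, fprime, x1, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until the step is below tol.

    tol and max_iter are arbitrary safeguards chosen for this sketch.
    """
    x = x1
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

f = lambda x: math.sin(x) - x / 3
fprime = lambda x: math.cos(x) - 1 / 3

root = newton(f, fprime, 3 * math.pi / 4)
print(round(root, 8), f(root))
```

The printed root can be compared with the value 2.27886266 quoted in the example; other starting points can be tried by changing the third argument.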
If x1 is not sufficiently near z, then the sequence {xn } may converge more
slowly to z or may not converge at all (see Exercise 6).
4.7.3 Example. For an approximate solution of $e^x = 2 - x$ we apply Newton’s method to $f(x) = e^x + x - 2$. By the intermediate value theorem, $f$ has a zero in (0, 1). The recursion formula for $f$ is
$$x_{n+1} = x_n - (e^{x_n} + x_n - 2)(e^{x_n} + 1)^{-1}.$$
Table 4.1 gives the first few terms of the sequence $\{x_n\}$ and the corresponding values of $f$ (accurate up to seven decimal places) for the initial values $x_1 = 1$ and $x_1 = 5$. The convergence is significantly slower for the larger value. The solution, accurate to 10 decimal places, is .4428544010.

TABLE 4.1: Newton’s method for $e^x + x - 2 = 0$.

        Initial value x1 = 1          Initial value x1 = 5
  n     xn          f(xn)             xn          f(xn)
  1     1           1.7182818         5           151.4131591
  2     .5378828    .2502604          3.9866142   55.8587993
  3     .4456167    .0070696          2.9686340   20.4339472
  4     .4428567    .0000059          1.9701667   7.1420387
  5     .4428544    .0000000          1.0961884   2.0889256
♦
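The entries of Table 4.1 can be regenerated with a few lines of code. This is a sketch under the obvious reading of the recursion formula in Example 4.7.3; the helper name `newton_iterates` is hypothetical, and floating-point output may differ from the table in the last printed digit.

```python
import math

def newton_iterates(f, fprime, x1, count):
    """Return the list [x_1, ..., x_count] produced by Newton's recursion."""
    xs = [x1]
    for _ in range(count - 1):
        x = xs[-1]
        xs.append(x - f(x) / fprime(x))
    return xs

f = lambda x: math.exp(x) + x - 2
fprime = lambda x: math.exp(x) + 1

for start in (1.0, 5.0):
    print(f"initial value x1 = {start}")
    for n, x in enumerate(newton_iterates(f, fprime, start, 5), start=1):
        print(f"  x{n} = {x:.7f}   f(x{n}) = {f(x):.7f}")
```

Running more iterations from x1 = 5 shows the same limit being reached, only later, which is the slower convergence noted in the example.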
Exercises
1. Find a zero, accurate to eight decimal places, of the given polynomial in the indicated interval.
(a)S $x^3 - x + 2$, [−2, −1].
(b) $x^3 + x + 1$, [−1, 0].
(c) $x^3 - 2x + 2$, [−2, −1].
(d)S $x^5 - 2x + 3$, [−2, −1].
(e) $x^7 - x - 1$, [1, 2].
(f) $x^4 - 2x^3 + 5x^2 - 8x - 6$, [2, 3].
(g)S $20x^4 - 20x^3 - 8x^2 + 4x - 1$, [1, 2].
(h) $20x^4 - 20x^3 - 4x + 1$, [1, 2].
2. Find a solution of the given equation in the indicated interval, correct to eight decimal places.
(a)S $\sin x = x^2$, [.5, 1].
(b)
(c)
ln x + x = 2, [1, 2].
(d) 2 cos x = ex , [0, 1].
ln x = e−x , [1, 2].
(f)
(e)
S
sin x = x3 , [.5, 1].
tan x + x = 1, [0, 1].
3. Show that Newton’s method applied to the function $x^{-1} - c$ produces the equation $x_{n+1} = 2x_n - c x_n^2$. Use this to find 1/2.34567, correct to eight decimal places. Check your answer with a calculator.
4.S Use Newton’s method to find $\sqrt{63}$ correct to eight decimal places. Check your answer with a calculator.
5. What happens when you apply Newton’s method with $x_1 = 1$ to the polynomial in part (c) of Exercise 1?
6. Show that the sequence generated by Newton’s method applied to $f(x) = x^{1/3}$ cannot converge for any value of $x_1 \ne 0$.
Chapter 5
Riemann Integration on R
5.1 The Riemann–Darboux Integral
Throughout this section, f denotes an arbitrary bounded,
real-valued function on a closed and bounded interval [a, b].
The first step in the development of the Riemann–Darboux integral is to
partition the interval [a, b] into finitely many subintervals, which are used to
form upper and lower sums of f . Under suitable conditions, the sums converge
to the integral.
5.1.1 Definition. A partition of $[a, b]$ is a set $\mathcal{P} = \{x_0, x_1, \ldots, x_{n-1}, x_n\}$, where
$$x_0 := a < x_1 < \cdots < x_{n-1} < x_n := b.$$
The points $x_1, \ldots, x_{n-1}$ are called the interior points of the partition. The mesh of the partition is defined as
$$\|\mathcal{P}\| := \max_{1\le j\le n} \Delta x_j, \quad\text{where } \Delta x_j := x_j - x_{j-1},\ 1 \le j \le n.$$
A refinement of $\mathcal{P}$ is a partition containing $\mathcal{P}$. The common refinement of partitions $\mathcal{P}$ and $\mathcal{Q}$ is the partition $\mathcal{P} \cup \mathcal{Q}$.
♦
5.1.2 Example. Let $p \in \mathbb{N}$. Then, for each $n \in \mathbb{N}$,
$$\mathcal{P}_n := \{j/p^n : j = 0, 1, \ldots, p^n\}$$
is a partition of [0, 1], $\|\mathcal{P}_n\| = p^{-n}$, and $\mathcal{P}_{n+1}$ is a refinement of $\mathcal{P}_n$.
♦
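5.1.3 Definition. The lower and upper (Darboux) sums of $f$ over a partition $\mathcal{P}$ of $[a, b]$ are defined, respectively, by
$$\underline{S}(f, \mathcal{P}) := \sum_{j=1}^{n} m_j\,\Delta x_j \quad\text{and}\quad \overline{S}(f, \mathcal{P}) := \sum_{j=1}^{n} M_j\,\Delta x_j,$$
where
$$m_j = m_j(f) := \inf_{x_{j-1}\le x\le x_j} f(x) \quad\text{and}\quad M_j = M_j(f) := \sup_{x_{j-1}\le x\le x_j} f(x).$$
♦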
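For a concrete feel for these sums, they can be computed exactly when f is increasing, since then the infimum on each subinterval is attained at the left endpoint and the supremum at the right endpoint. The sketch below makes that monotonicity assumption explicitly; the function names are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def darboux_sums_increasing(f, partition):
    """Lower and upper Darboux sums of an INCREASING function f.

    For increasing f, m_j = f(x_{j-1}) and M_j = f(x_j), so no search
    for the infimum or supremum is needed. This shortcut is not valid
    for general bounded functions.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * Fraction(j, n) for j in range(n + 1)]

square = lambda x: x * x
for n in (4, 16, 64):
    partition = uniform_partition(Fraction(0), Fraction(1), n)
    lower, upper = darboux_sums_increasing(square, partition)
    print(n, float(lower), float(upper), float(upper - lower))
```

For this f the gap between the upper and lower sums on the uniform partition is exactly (f(1) − f(0))/n, so it shrinks as the mesh does.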
A geometric interpretation of the upper and lower sums for a positive
continuous function is given in Figure 5.1. The lower (upper) sum is the total
area of the smaller (larger) rectangles.
FIGURE 5.1: Upper and lower sums of f.
The following proposition asserts that refinements increase lower sums and
decrease upper sums.
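5.1.4 Proposition. If $\mathcal{Q}$ is a refinement of $\mathcal{P}$, then
$$\underline{S}(f, \mathcal{P}) \le \underline{S}(f, \mathcal{Q}) \le \overline{S}(f, \mathcal{Q}) \le \overline{S}(f, \mathcal{P}).$$
Proof. The middle inequality is clear. To prove the rightmost inequality, let $\mathcal{P} = \{x_0 = a < x_1 < \cdots < x_{n-1} < x_n = b\}$ and assume first that $\mathcal{Q} = \mathcal{P} \cup \{c\}$. Choose $k$ so that $x_{k-1} < c < x_k$ and set
$$M_k' = \sup_{x_{k-1}\le x\le c} f(x) \quad\text{and}\quad M_k'' = \sup_{c\le x\le x_k} f(x).$$
Then $M_k', M_k'' \le M_k$, hence
$$\begin{aligned} \overline{S}(f, \mathcal{Q}) &= \sum_{j=1}^{k-1} M_j\,\Delta x_j + \sum_{j=k+1}^{n} M_j\,\Delta x_j + M_k'(c - x_{k-1}) + M_k''(x_k - c) \\ &\le \sum_{j=1}^{k-1} M_j\,\Delta x_j + \sum_{j=k+1}^{n} M_j\,\Delta x_j + M_k(c - x_{k-1}) + M_k(x_k - c) \\ &= \overline{S}(f, \mathcal{P}). \end{aligned}$$
For the general case, observe that any refinement $\mathcal{Q}$ of $\mathcal{P}$ may be obtained by successively adding points to $\mathcal{P}$. At each step, the upper sum is decreased so that ultimately one obtains the desired inequality. The proof for lower sums is similar.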
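The chain of inequalities in 5.1.4 can be observed on a small example. The sketch below again assumes an increasing function (here x³ on [0, 1]) so that the infimum and supremum on each subinterval sit at the endpoints; the partitions P and Q and the helper name are illustrative choices.

```python
from fractions import Fraction

def darboux_sums_increasing(f, partition):
    """Lower and upper Darboux sums, valid only for increasing f."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper

cube = lambda x: x ** 3

# P is a coarse partition of [0, 1]; Q adds two points, so Q refines P.
P = [Fraction(0), Fraction(1, 2), Fraction(1)]
Q = sorted(set(P) | {Fraction(1, 4), Fraction(2, 3)})

lower_P, upper_P = darboux_sums_increasing(cube, P)
lower_Q, upper_Q = darboux_sums_increasing(cube, Q)

print(lower_P, lower_Q, upper_Q, upper_P)
print(lower_P <= lower_Q <= upper_Q <= upper_P)
```

Adding further points to Q and rerunning shows the lower sums creeping up and the upper sums creeping down, one inserted point at a time, as in the proof.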
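5.1.5 Corollary. For any partitions $\mathcal{P}$ and $\mathcal{Q}$ of $[a, b]$,
$$\underline{S}(f, \mathcal{Q}) \le \overline{S}(f, \mathcal{P}). \tag{5.1}$$
Proof. By 5.1.4, $\underline{S}(f, \mathcal{Q}) \le \underline{S}(f, \mathcal{P} \cup \mathcal{Q}) \le \overline{S}(f, \mathcal{P} \cup \mathcal{Q}) \le \overline{S}(f, \mathcal{P})$.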
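5.1.6 Definition. The lower and upper (Darboux) integrals of $f$ on $[a, b]$ are defined, respectively, by
$$\underline{\int_a^b} f = \underline{\int_a^b} f(x)\,dx := \sup_{\mathcal{P}} \underline{S}(f, \mathcal{P}) \quad\text{and}\quad \overline{\int_a^b} f = \overline{\int_a^b} f(x)\,dx := \inf_{\mathcal{P}} \overline{S}(f, \mathcal{P}),$$
where the supremum and infimum are taken over all partitions $\mathcal{P}$ of $[a, b]$. In each case, $f$ is called the integrand and $x$ the integration variable.
♦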
5.1.7 Proposition. For any partition $\mathcal{P}$ of $[a, b]$,
$$\underline{S}(f, \mathcal{P}) \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le \overline{S}(f, \mathcal{P}).$$
Proof. The left and right inequalities are immediate from the definition of lower and upper integrals. The middle inequality follows by taking the infimum over $\mathcal{P}$ and then the supremum over $\mathcal{Q}$ in (5.1).
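5.1.8 Proposition. The following statements are equivalent:
(a) $\displaystyle \underline{\int_a^b} f = \overline{\int_a^b} f$.
(b) For each $\varepsilon > 0$ there exists a partition $\mathcal{P}_\varepsilon$ of $[a, b]$ such that
$$\overline{S}(f, \mathcal{P}_\varepsilon) - \underline{S}(f, \mathcal{P}_\varepsilon) < \varepsilon.$$
Proof. (a) ⇒ (b): Given $\varepsilon > 0$, there exist partitions $\mathcal{P}'$ and $\mathcal{P}''$ such that
$$\underline{\int_a^b} f - \varepsilon/2 < \underline{S}(f, \mathcal{P}') \quad\text{and}\quad \overline{S}(f, \mathcal{P}'') < \overline{\int_a^b} f + \varepsilon/2.$$
By 5.1.4, the inequalities still hold if $\mathcal{P}'$ and $\mathcal{P}''$ are each replaced by their common refinement $\mathcal{P}_\varepsilon := \mathcal{P}' \cup \mathcal{P}''$. Subtracting the resulting inequalities and applying (a) yields (b).
(b) ⇒ (a): If the inequality in (b) holds then, by 5.1.7,
$$0 \le \overline{\int_a^b} f - \underline{\int_a^b} f \le \overline{S}(f, \mathcal{P}_\varepsilon) - \underline{S}(f, \mathcal{P}_\varepsilon) < \varepsilon.$$
Since $\varepsilon$ is arbitrary, the integrals must be equal.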
5.1.9 Definition. The function $f$ is said to be (Darboux) integrable on $[a, b]$ if one (hence both) of the conditions (a), (b) of 5.1.8 hold. In this case, the common value of the integrals in (a) is called the (Riemann–Darboux) integral of $f$ on $[a, b]$ and is denoted by
$$\int_a^b f = \int_a^b f(x)\,dx.$$
Also, define
$$\int_b^a f = -\int_a^b f \quad\text{and}\quad \int_a^a f = 0.$$
The collection of all integrable functions on $[a, b]$ is denoted by $\mathcal{R}_a^b$.
♦
The following theorem guarantees a rich supply of integrable functions.
5.1.10 Theorem. If $f$ is continuous on $[a, b]$ except possibly at finitely many points, then $f \in \mathcal{R}_a^b$.
Proof. Denote the points of discontinuity of $f$, if any, by $d_1 < \cdots < d_n$. For convenience, we assume that these lie in $(a, b)$; only a minor modification of the proof is needed if $d_1 = a$ or $d_n = b$. Let $\varepsilon > 0$. For each $j$, remove an open interval of width $r$ centered at $d_j$, the value of $r$ to be determined. Since $f$ is continuous on each of the resulting $n + 1$ closed intervals $I_0, \ldots, I_n$, it is uniformly continuous there. (If $f$ is continuous on $[a, b]$, then $n = 0$ and $I_0 = [a, b]$.) Thus there exists a $\delta > 0$ such that for each $j$,
$$|f(x) - f(y)| < \varepsilon/2(b - a) \quad\text{for all } x, y \in I_j \text{ with } |x - y| < \delta.$$
Now, the endpoints of the intervals $I_j$ form a partition $\mathcal{P}$ of $[a, b]$. If necessary, refine $\mathcal{P}$ by inserting points (marked by ∗ in Figure 5.2) into these intervals so that the distance between consecutive points is less than $\delta$. The subintervals of the resulting partition $\mathcal{Q}$ are of two types: those that contain some $d_j$, which we mark by $\alpha$, and those that do not, which we mark by $\beta$. Thus, in the obvious notation,
$$\overline{S}(f, \mathcal{Q}) - \underline{S}(f, \mathcal{Q}) = \sum_{\alpha} (M_j - m_j)\Delta x_j + \sum_{\beta} (M_j - m_j)\Delta x_j.$$
FIGURE 5.2: The partitions P and Q.
In the first sum, $\Delta x_j < r$ and in the second, $M_j - m_j \le \varepsilon/2(b - a)$. Since the first sum has $n$ terms (corresponding to the $n$ discontinuities $d_j$),
$$\overline{S}(f, \mathcal{Q}) - \underline{S}(f, \mathcal{Q}) < 2Mnr + \varepsilon/2,$$
where $M$ is a bound for $|f|$ on $[a, b]$. Choosing $r < \varepsilon/4Mn$, we then have $\overline{S}(f, \mathcal{Q}) - \underline{S}(f, \mathcal{Q}) < \varepsilon$, which shows that $f$ is integrable on $[a, b]$.
The set of discontinuities of an integrable function can be infinite but may
not be too large. We make this precise in Section 5.8. In the meantime, we
offer the following examples to illustrate the basic idea. In the first example,
the function is discontinuous only on a countably infinite set, while in the
second the function is discontinuous everywhere.
FIGURE 5.3: The partition Pn of Example 5.1.11.
5.1.11 Example. Let $f$ be any bounded function on [0, 1] such that $f(x) = 0$ if $x \notin \{1/n : n = 2, 3, \ldots\}$. We claim that $f$ is integrable and that $\int_0^1 f = 0$. The idea is to enclose the points of discontinuity of $f$ in small intervals, as in the proof of 5.1.10. Fix $n$ and let
$$\mathcal{P}_n = \{x_0 = 0, x_1, x_2, \ldots, x_{2n-2}, x_{2n-1} = 1\},$$
where
$$x_{2j-1} < 1/(n - j + 1) < x_{2j} < x_{2j+1} \quad\text{and}\quad \Delta x_{2j} = x_{2j} - x_{2j-1} < 1/n^2, \quad j = 1, 2, \ldots, n - 1.$$
(See Figure 5.3.) Let $|f| \le M$ on [0, 1]. Since $f = 0$ on $[x_{2j}, x_{2j+1}]$ and $m_j \ge -M$,
$$\begin{aligned} \underline{S}(f, \mathcal{P}_n) &= m_1 x_1 + m_2(x_2 - x_1) + \cdots + m_{2n-2}(x_{2n-2} - x_{2n-3}) \\ &\ge -M\big[x_1 + (x_2 - x_1) + (x_4 - x_3) + \cdots + (x_{2n-2} - x_{2n-3})\big] \\ &\ge -M\big(1/n + (n - 1)/n^2\big) = -M(2/n - 1/n^2). \end{aligned}$$
A similar calculation shows that $\overline{S}(f, \mathcal{P}_n) \le M(2/n - 1/n^2)$. Therefore,
$$\lim_n \underline{S}(f, \mathcal{P}_n) = \lim_n \overline{S}(f, \mathcal{P}_n) = 0,$$
hence $f$ is integrable with zero integral.
♦
5.1.12 Example. The Dirichlet function d(x) (3.1.7) is not integrable on any
(nondegenerate) interval [a, b]. Indeed, every upper sum of d(x) has the value
b − a and every lower sum has the value 0.
♦
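A useful characterization of integrability may be given in terms of the limits of $\underline{S}(f, \mathcal{P})$ and $\overline{S}(f, \mathcal{P})$ as $\|\mathcal{P}\| \to 0$.
5.1.13 Definition. Let $L \in \mathbb{R}$. We write $L = \lim_{\|\mathcal{P}\|\to 0} \overline{S}(f, \mathcal{P})$ if, given $\varepsilon > 0$, there exists $\delta > 0$ such that $|\overline{S}(f, \mathcal{P}) - L| < \varepsilon$ for all partitions $\mathcal{P}$ with $\|\mathcal{P}\| < \delta$. The limit $\lim_{\|\mathcal{P}\|\to 0} \underline{S}(f, \mathcal{P})$ is defined analogously.
♦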
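5.1.14 Lemma. Let $\mathcal{P}' = \{x_0' = a < x_1' < \cdots < x_n' < x_{n+1}' = b\}$ be a partition of $[a, b]$ and let $|f| \le M$ on $[a, b]$. Then
$$\overline{S}(f, \mathcal{P}) \le \overline{S}(f, \mathcal{P}') + 3nM\|\mathcal{P}\|$$
for all partitions $\mathcal{P}$ of $[a, b]$ with $\|\mathcal{P}\| < \delta' := \min_j \Delta x_j'$.
FIGURE 5.4: The partitions P′, P, and P″.
Proof. Since $\|\mathcal{P}\| < \Delta x_j'$, no interval of $\mathcal{P}$ can contain more than one interior point of $\mathcal{P}'$. Mark the intervals of $\mathcal{P}$ that contain exactly one interior point of $\mathcal{P}'$ by $\alpha$ and mark those that contain no interior point of $\mathcal{P}'$ by $\gamma$. Consider the common refinement $\mathcal{P}'' = \mathcal{P} \cup \mathcal{P}'$ of $\mathcal{P}$ and $\mathcal{P}'$. Some of the intervals of $\mathcal{P}''$ were formed from an interval of $\mathcal{P}$ of type $\alpha$; we mark those by $\beta$. The remaining intervals of $\mathcal{P}''$, intervals that were not formed from an interval of $\mathcal{P}$ of type $\alpha$, are precisely the intervals marked $\gamma$ in $\mathcal{P}$. Thus the terms of $\overline{S}(f, \mathcal{P})$ and $\overline{S}(f, \mathcal{P}'')$ corresponding to intervals of type $\gamma$ are identical, hence cancel under subtraction of upper sums. Therefore, in the obvious notation,
$$\begin{aligned} \overline{S}(f, \mathcal{P}) - \overline{S}(f, \mathcal{P}'') &= \sum_{\alpha} M_j(f)\,\Delta x_j - \sum_{\beta} M_j''(f)\,\Delta x_j'' \\ &\le M\Big[\sum_{\alpha} \Delta x_j + \sum_{\beta} \Delta x_j''\Big] \\ &\le M\big(n\|\mathcal{P}\| + 2n\|\mathcal{P}''\|\big), \end{aligned}$$
the last inequality because there are at most $n$ intervals of type $\alpha$ and at most $2n$ intervals of type $\beta$. Since $\mathcal{P}''$ is a refinement of $\mathcal{P}'$ and $\mathcal{P}$,
$$\overline{S}(f, \mathcal{P}) - \overline{S}(f, \mathcal{P}') \le \overline{S}(f, \mathcal{P}) - \overline{S}(f, \mathcal{P}'') \le 3nM\|\mathcal{P}\|.$$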
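5.1.15 Theorem. For any bounded function $f$ on $[a, b]$,
$$\overline{\int_a^b} f = \lim_{\|\mathcal{P}\|\to 0} \overline{S}(f, \mathcal{P}) \quad\text{and}\quad \underline{\int_a^b} f = \lim_{\|\mathcal{P}\|\to 0} \underline{S}(f, \mathcal{P}). \tag{5.2}$$
Thus $f$ is integrable on $[a, b]$ iff the limits in (5.2) are equal, in which case
$$\int_a^b f = \lim_{\|\mathcal{P}\|\to 0} \overline{S}(f, \mathcal{P}) = \lim_{\|\mathcal{P}\|\to 0} \underline{S}(f, \mathcal{P}). \tag{5.3}$$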
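Proof. Given $\varepsilon > 0$, choose a partition $\mathcal{P}'$ such that
$$\overline{S}(f, \mathcal{P}') < \overline{\int_a^b} f + \varepsilon/2.$$
In the notation of 5.1.14, for any partition $\mathcal{P}$ with $\|\mathcal{P}\| < \delta'$,
$$\overline{S}(f, \mathcal{P}) \le \overline{S}(f, \mathcal{P}') + 3nM\|\mathcal{P}\| < \overline{\int_a^b} f + \varepsilon/2 + 3nM\|\mathcal{P}\|.$$
Hence if $\|\mathcal{P}\| < \min\{\delta', \varepsilon/6nM\}$, then
$$\overline{\int_a^b} f \le \overline{S}(f, \mathcal{P}) < \overline{\int_a^b} f + \varepsilon.$$
Since $\varepsilon$ was arbitrary, the first limit in (5.2) is established. The second follows from the first by considering $-f$ and using Exercise 5.1.3.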
Equation (5.3) represents the integral as a limit of upper and lower sums.
It is also possible to represent the integral as a limit of intermediate sums,
called Riemann sums.
5.1.16 Definition. Let $\mathcal{P} = \{x_0 = a < x_1 < \cdots < x_n = b\}$ be a partition of $[a, b]$ and let $\xi = (\xi_1, \ldots, \xi_n)$, where $\xi_j \in [x_{j-1}, x_j]$. The sum
$$S(f, \mathcal{P}, \xi) := \sum_{j=1}^{n} f(\xi_j)\,\Delta x_j$$
is called the Riemann sum of $f$ determined by $\mathcal{P}$ and $\xi$.
♦
Figure 5.5 illustrates a Riemann sum for a positive continuous function
f . In this case S(f, P, ξ) is the total area of the rectangles with heights f (ξj )
and bases ∆xj .
FIGURE 5.5: A Riemann sum.
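A Riemann sum is easy to compute once the partition and the intermediate points are fixed. The sketch below uses f(x) = x² on [0, 1], uniform partitions, and randomly chosen intermediate points; the helper names and the fixed random seed are illustrative assumptions. Because each f(ξj) lies between mj and Mj, every Riemann sum lies between the lower and upper sums over the same partition.

```python
import random

def riemann_sum(f, partition, tags):
    """Sum of f(tag_j) * (x_j - x_{j-1}) over the subintervals of the partition."""
    return sum(
        f(tag) * (right - left)
        for left, right, tag in zip(partition, partition[1:], tags)
    )

def random_tags(partition, rng):
    """Pick one intermediate point in each subinterval [x_{j-1}, x_j]."""
    return [rng.uniform(left, right) for left, right in zip(partition, partition[1:])]

rng = random.Random(0)  # fixed seed so repeated runs print the same values
square = lambda x: x * x

for n in (10, 100, 1000):
    partition = [j / n for j in range(n + 1)]
    print(n, riemann_sum(square, partition, random_tags(partition, rng)))
```

As n grows the printed values settle near 1/3 regardless of how the intermediate points are chosen, since the upper and lower sums over the uniform partition differ by exactly 1/n for this f.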
5.1.17 Definition. Let $\mathcal{P} = \{x_0 = a < x_1 < \cdots < x_n = b\}$ be a partition of $[a, b]$ and let $\xi = (\xi_1, \ldots, \xi_n)$, where $\xi_j \in [x_{j-1}, x_j]$. We write
$$L = \lim_{\|\mathcal{P}\|\to 0} S(f, \mathcal{P}, \xi)$$
if for each $\varepsilon > 0$ there exists $\delta > 0$ such that $|S(f, \mathcal{P}, \xi) - L| < \varepsilon$ for all partitions $\mathcal{P}$ with $\|\mathcal{P}\| < \delta$ and all choices of $\xi$. Similarly, we write
$$L = \lim_{\mathcal{P}} S(f, \mathcal{P}, \xi)$$
if for each $\varepsilon > 0$ there exists a partition $\mathcal{P}_\varepsilon$ such that $|S(f, \mathcal{P}, \xi) - L| < \varepsilon$ for all refinements $\mathcal{P}$ of $\mathcal{P}_\varepsilon$ and all choices of $\xi$.
♦
We may now give Riemann’s characterization of integrability.
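5.1.18 Theorem. The following statements are equivalent:
(a) $f \in \mathcal{R}_a^b$.
(b) $\lim_{\|\mathcal{P}\|\to 0} S(f, \mathcal{P}, \xi)$ exists in $\mathbb{R}$.
(c) $\lim_{\mathcal{P}} S(f, \mathcal{P}, \xi)$ exists in $\mathbb{R}$.
If these conditions hold, then
$$\int_a^b f = \lim_{\|\mathcal{P}\|\to 0} S(f, \mathcal{P}, \xi) = \lim_{\mathcal{P}} S(f, \mathcal{P}, \xi).$$
Proof. (a) ⇒ (b): Let $L = \int_a^b f$. For any partition $\mathcal{P}$ and any $\xi$, we have
$$\underline{S}(f, \mathcal{P}) - L \le S(f, \mathcal{P}, \xi) - L \le \overline{S}(f, \mathcal{P}) - L,$$
hence (b) follows from 5.1.15.
(b) ⇒ (c): Let $L := \lim_{\|\mathcal{P}\|\to 0} S(f, \mathcal{P}, \xi)$. Given $\varepsilon > 0$, choose $\delta > 0$ such that
$$|S(f, \mathcal{P}, \xi) - L| < \varepsilon \quad\text{for all partitions } \mathcal{P} \text{ with } \|\mathcal{P}\| < \delta \text{ and all } \xi. \tag{5.4}$$
Choose any partition $\mathcal{P}_\varepsilon$ with $\|\mathcal{P}_\varepsilon\| < \delta$. If $\mathcal{P}$ is any refinement of $\mathcal{P}_\varepsilon$, then $\|\mathcal{P}\| \le \|\mathcal{P}_\varepsilon\| < \delta$, hence (5.4) holds for $\mathcal{P}$.
(c) ⇒ (a): Let $L := \lim_{\mathcal{P}} S(f, \mathcal{P}, \xi)$. Given $\varepsilon > 0$, choose a partition $\mathcal{P}_\varepsilon$ such that
$$|S(f, \mathcal{P}, \xi) - L| < \varepsilon \quad\text{for all refinements } \mathcal{P} \text{ of } \mathcal{P}_\varepsilon \text{ and all } \xi. \tag{5.5}$$
For such a partition $\mathcal{P}$, by the approximation property of suprema there exists for each $j$ a sequence $\{\xi_{j,k}\}_{k=1}^{\infty}$ in $[x_{j-1}, x_j]$ such that $\lim_k f(\xi_{j,k}) = M_j(f)$. It follows that
$$\lim_k S(f, \mathcal{P}, \xi^k) = \overline{S}(f, \mathcal{P}), \quad\text{where } \xi^k = (\xi_{1,k}, \xi_{2,k}, \ldots, \xi_{n,k}).$$
From (5.5), $\overline{\int_a^b} f - L \le \overline{S}(f, \mathcal{P}) - L \le \varepsilon$. Since $\varepsilon$ was arbitrary, $\overline{\int_a^b} f \le L$. Similarly, $\underline{\int_a^b} f \ge L$. Therefore $\underline{\int_a^b} f = \overline{\int_a^b} f$.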
Exercises
1. Prove that if $k$ is a constant, then $\underline{\int_a^b} k = \overline{\int_a^b} k = k(b - a)$.
2. Let $a \le c < d \le b$. Define $f$ on $[a, b]$ by $f(x) = 1$ if $x \in [c, d]$ and $f(x) = 0$ otherwise. Show that $f \in \mathcal{R}_a^b$ and evaluate $\int_a^b f$.
3.S ⇓1 Prove that
(a) $\underline{S}(-f, \mathcal{P}) = -\overline{S}(f, \mathcal{P})$ and $\underline{\int_a^b} (-f) = -\overline{\int_a^b} f$.
(b) $f \in \mathcal{R}_a^b \Rightarrow -f \in \mathcal{R}_a^b$ and $\int_a^b (-f) = -\int_a^b f$.
4. ⇓2 Prove that a monotone function is integrable.
5.S Let $f \in \mathcal{R}_a^b$ and let $g : [a, b] \to \mathbb{R}$ be any function that differs from $f$ at finitely many points in $[a, b]$. Prove that $g \in \mathcal{R}_a^b$ and that $\int_a^b f = \int_a^b g$. Does the same result hold if $g$ differs from $f$ at countably many points?
6. Let $f \in \mathcal{R}_a^b$. Prove:
(a) If $\inf_{a\le x\le b} f(x) > 0$, then $1/f \in \mathcal{R}_a^b$.
(b) If $f(x) \ge 0$ for all $x \in [a, b]$, then $\sqrt{f} \in \mathcal{R}_a^b$.
(c)S $\sin(f) \in \mathcal{R}_a^b$.
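7.S Let $F(\mathcal{P})$ be a real-valued function of partitions $\mathcal{P}$ on an interval $[a, b]$. Write
$$L = \lim_{\mathcal{P}} F(\mathcal{P})$$
if, given $\varepsilon > 0$, there exists a partition $\mathcal{P}_\varepsilon$ such that $|F(\mathcal{P}) - L| < \varepsilon$ for all partitions $\mathcal{P}$ refining $\mathcal{P}_\varepsilon$.
(a) Show that the limit is linear, that is,
$$\lim_{\mathcal{P}} \big[\alpha F(\mathcal{P}) + \beta G(\mathcal{P})\big] = \alpha \lim_{\mathcal{P}} F(\mathcal{P}) + \beta \lim_{\mathcal{P}} G(\mathcal{P}),$$
provided the right side exists.
(b) Let $f$ be a bounded function on $[a, b]$. With this definition, show that
$$\underline{\int_a^b} f = \lim_{\mathcal{P}} \underline{S}(f, \mathcal{P}) \quad\text{and}\quad \overline{\int_a^b} f = \lim_{\mathcal{P}} \overline{S}(f, \mathcal{P}).$$
8. Let $f \in \mathcal{R}_0^1$ and set $g(x) = x^q$, where $q > 0$. Prove that $f \circ g \in \mathcal{R}_0^1$.
1 This exercise will be used in 5.2.2.
2 This exercise will be used in 5.9.8.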
5.2 Properties of the Integral
The following lemma will be useful in proving certain properties of integrals.
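5.2.1 Lemma. Let $f : [a, b] \to \mathbb{R}$ be bounded. Then there exists a sequence of partitions $\{\mathcal{P}_n\}$ of $[a, b]$ such that
$$\lim_{n\to\infty} \underline{S}(f, \mathcal{P}_n) = \underline{\int_a^b} f \quad\text{and}\quad \lim_{n\to\infty} \overline{S}(f, \mathcal{P}_n) = \overline{\int_a^b} f.$$
Moreover, the limits still hold if each $\mathcal{P}_n$ is replaced by a refinement.
Proof. By the approximation property of infima and suprema, for each $n$ there exist partitions $\mathcal{P}_n'$ and $\mathcal{P}_n''$ of $[a, b]$ such that
$$\underline{\int_a^b} f - 1/n < \underline{S}(f, \mathcal{P}_n') \le \underline{\int_a^b} f \quad\text{and}\quad \overline{\int_a^b} f \le \overline{S}(f, \mathcal{P}_n'') < \overline{\int_a^b} f + 1/n.$$
Since refinements decrease upper sums and increase lower sums, the inequalities still hold if $\mathcal{P}_n'$ and $\mathcal{P}_n''$ are replaced by their common refinement $\mathcal{P}_n$ or by any refinement of $\mathcal{P}_n$. Letting $n \to +\infty$ completes the proof.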
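5.2.2 Theorem. If $f, g \in \mathcal{R}_a^b$ and $\alpha, \beta \in \mathbb{R}$, then $\alpha f + \beta g \in \mathcal{R}_a^b$ and
$$\int_a^b (\alpha f + \beta g) = \alpha \int_a^b f + \beta \int_a^b g.$$
Proof. By 5.2.1, we may choose a sequence of partitions $\mathcal{P}_n$ such that
$$\lim_n \overline{S}(f, \mathcal{P}_n) = \int_a^b f \quad\text{and}\quad \lim_n \overline{S}(g, \mathcal{P}_n) = \int_a^b g.$$
(There exists one such sequence for $f$, another for $g$; the sequence of common refinements then works for both functions.) Letting $n \to \infty$ in
$$\overline{\int_a^b} (f + g) \le \overline{S}(f + g, \mathcal{P}_n) \le \overline{S}(f, \mathcal{P}_n) + \overline{S}(g, \mathcal{P}_n)$$
yields
$$\overline{\int_a^b} (f + g) \le \int_a^b f + \int_a^b g.$$
Similarly,
$$\underline{\int_a^b} (f + g) \ge \int_a^b f + \int_a^b g.$$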
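It follows that $f + g$ is integrable and
$$\int_a^b (f + g) = \int_a^b f + \int_a^b g.$$
It remains to prove that $\alpha f$ is integrable and that $\int_a^b \alpha f = \alpha \int_a^b f$. If $\alpha > 0$, then
$$\underline{S}(\alpha f, \mathcal{P}) = \alpha\,\underline{S}(f, \mathcal{P}) \quad\text{and}\quad \overline{S}(\alpha f, \mathcal{P}) = \alpha\,\overline{S}(f, \mathcal{P}).$$
Taking the supremum and infimum over $\mathcal{P}$ yields
$$\underline{\int_a^b} \alpha f = \alpha \int_a^b f = \overline{\int_a^b} \alpha f.$$
If $\alpha < 0$, then $-\alpha > 0$, hence
$$\int_a^b \alpha f = \int_a^b (-\alpha)(-f) = (-\alpha) \int_a^b (-f) = \alpha \int_a^b f,$$
the last equality by Exercise 5.1.3.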
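5.2.3 Proposition. If $f \in \mathcal{R}_a^b$ and $a \le c < d \le b$, then $f|_{[c,d]} \in \mathcal{R}_c^d$.
Proof. Given $\varepsilon > 0$, let $\mathcal{P}$ be a partition of $[a, b]$ with $\overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}) < \varepsilon$. We may assume that $c, d \in \mathcal{P}$, otherwise replace $\mathcal{P}$ by the refinement $\mathcal{P} \cup \{c, d\}$. If $\mathcal{Q} = \mathcal{P} \cap [c, d]$, then clearly
$$\overline{S}\big(f|_{[c,d]}, \mathcal{Q}\big) - \underline{S}\big(f|_{[c,d]}, \mathcal{Q}\big) \le \overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}) < \varepsilon,$$
hence $f|_{[c,d]} \in \mathcal{R}_c^d$.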
The following is a converse of 5.2.3.
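5.2.4 Theorem. Let $a < c < b$. If $f|_{[a,c]} \in \mathcal{R}_a^c$ and $f|_{[c,b]} \in \mathcal{R}_c^b$, then $f \in \mathcal{R}_a^b$ and
$$\int_a^b f = \int_a^c f + \int_c^b f.$$
Proof. By 5.2.1, we may choose sequences of partitions $\mathcal{P}_n'$ of $[a, c]$ and $\mathcal{P}_n''$ of $[c, b]$ such that
$$\lim_n \overline{S}\big(f|_{[a,c]}, \mathcal{P}_n'\big) = \int_a^c f \quad\text{and}\quad \lim_n \overline{S}\big(f|_{[c,b]}, \mathcal{P}_n''\big) = \int_c^b f.$$
Then $\mathcal{P}_n := \mathcal{P}_n' \cup \mathcal{P}_n''$ is a partition of $[a, b]$ and
$$\overline{\int_a^b} f \le \overline{S}(f, \mathcal{P}_n) = \overline{S}\big(f|_{[a,c]}, \mathcal{P}_n'\big) + \overline{S}\big(f|_{[c,b]}, \mathcal{P}_n''\big).$$
Letting $n \to \infty$, we obtain
$$\overline{\int_a^b} f \le \int_a^c f + \int_c^b f.$$
Replacing $f$ by $-f$ produces the reverse inequality for the lower integral of $f$, proving the theorem.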
5.2.5 Theorem. If $f, g \in \mathcal{R}_a^b$ and $f \le g$ on $[a, b]$, then
$$\int_a^b f \le \int_a^b g.$$
In particular, if $m \le f(x) \le M$ for all $x \in [a, b]$, then
$$m(b - a) \le \int_a^b f \le M(b - a).$$
Proof. Let $\mathcal{P}$ be a partition of $[a, b]$. By hypothesis, $M_j(f) \le M_j(g)$ for each $j$, hence $\overline{S}(f, \mathcal{P}) \le \overline{S}(g, \mathcal{P})$. Taking the infimum over $\mathcal{P}$ yields the first inequality. The second inequality follows from the first and Exercise 5.1.1.
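5.2.6 Theorem. If $f \in \mathcal{R}_a^b$, then $|f| \in \mathcal{R}_a^b$ and
$$\left|\int_a^b f\right| \le \int_a^b |f|.$$
Proof. By Exercise 1.4.5, for any partition $\mathcal{P}$ of $[a, b]$,
$$M_j(|f|) - m_j(|f|) \le M_j(f) - m_j(f).$$
Summing over $j$,
$$\overline{S}(|f|, \mathcal{P}) - \underline{S}(|f|, \mathcal{P}) \le \overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}).$$
Since the right side can be made arbitrarily small, $|f| \in \mathcal{R}_a^b$. Applying 5.2.5 to $\pm f \le |f|$ we obtain
$$\pm \int_a^b f \le \int_a^b |f|,$$
which gives the desired inequality.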
5.2.7 Theorem. If $f, g \in \mathcal{R}_a^b$, then $fg \in \mathcal{R}_a^b$.
Proof. Since $fg = \tfrac12\big[(f + g)^2 - f^2 - g^2\big]$, it suffices to prove that $f^2 \in \mathcal{R}_a^b$. To this end, let $\mathcal{P}$ be any partition of $[a, b]$ and let $|f| \le M$. Then
$$M_j(f^2) - m_j(f^2) = M_j^2(|f|) - m_j^2(|f|) \le 2M\big[M_j(|f|) - m_j(|f|)\big].$$
Summing over $j$,
$$\overline{S}(f^2, \mathcal{P}) - \underline{S}(f^2, \mathcal{P}) \le 2M\big[\overline{S}(|f|, \mathcal{P}) - \underline{S}(|f|, \mathcal{P})\big].$$
Since $|f| \in \mathcal{R}_a^b$, the right side of the last inequality may be made arbitrarily small. Therefore, $f^2 \in \mathcal{R}_a^b$.
Exercises
1.S Let $\{c_n\}$ be a convergent sequence in $[a, b]$ and let $f$ be a bounded function on $[a, b]$ with $f(x) = 0$ for all $x \notin \{c_n\}$. Prove that $f \in \mathcal{R}_a^b$ and find $\int_a^b f$.
2. Define $f$ on [0, 1] by $f(0) = 0$ and
$$f(x) = 2^{-n} \quad\text{if } 2^{-n-1} < x \le 2^{-n},\ n \ge 0.$$
Prove that $f \in \mathcal{R}_0^1$ and evaluate $\int_0^1 f$.
3. Prove or disprove: $|f| \in \mathcal{R}_a^b$ implies $f \in \mathcal{R}_a^b$.
4. A function $s$ on $[a, b]$ is called a step function if there exists a partition of $[a, b]$ such that $s$ is constant on the interior of each partition interval. Show that a step function is integrable. Prove that a bounded function $f$ is integrable on $[a, b]$ iff for each $\varepsilon > 0$ there exist step functions $s_\ell$ and $s_u$ such that $s_\ell \le f \le s_u$ and $\int_a^b (s_u - s_\ell) < \varepsilon$.
5.S Prove that if $f_j \in \mathcal{R}_a^b$, $1 \le j \le n$, then $\max\{f_1, \ldots, f_n\} \in \mathcal{R}_a^b$ and $\min\{f_1, \ldots, f_n\} \in \mathcal{R}_a^b$.
6.S Let $f$ be continuous and $f(x) < M$ for all $x \in [a, b]$. Prove that $\int_a^b f < M(b - a)$. (Compare with 5.2.5.)
7. Let $f \in \mathcal{R}_a^b$ be nonnegative. Prove that if $f$ is continuous at some point $x_0 \in [a, b]$ and $f(x_0) \ne 0$, then $\int_a^b f > 0$.
8. Let $f \in \mathcal{R}_a^b$ such that either
(a) $\int_a^b fg = 0$ for every continuous function $g$, or
(b) $\int_a^b fg = 0$ for every step function $g$.
Prove that $f$ is zero at each point of continuity of $f$.
9.S Let $f \in \mathcal{R}_a^b$ and for $x, y \in [a, b]$ define $F(x, y) = \int_x^y f$. Prove that $F(x, y)$ is continuous in $y$ for each $x$ and continuous in $x$ for each $y$.
10. Let $f$ be bounded on $[a, b]$ and integrable on $[c, b]$ for every $a < c < b$. Prove that the following statements are equivalent:
(a) $\displaystyle \lim_{x\to a^+} \int_x^b f$ exists in $\mathbb{R}$.
(b) $\displaystyle \lim_{n\to+\infty} \int_{a_n}^b f$ exists in $\mathbb{R}$ for some sequence $a_n \downarrow a$.
(c) $f \in \mathcal{R}_a^b$.
Conclude from Exercise 9 that if $f \in \mathcal{R}_a^b$, then the limit in (a) is $\int_a^b f$.
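11. Let $f$ be integrable on $[0, x]$ for all $x > 0$. Prove that
$$\liminf_{x\to+\infty} f(x) \le \liminf_{x\to+\infty} \frac{1}{x}\int_0^x f \le \limsup_{x\to+\infty} \frac{1}{x}\int_0^x f \le \limsup_{x\to+\infty} f(x).$$
Conclude that if $L := \lim_{x\to+\infty} f(x)$ exists in $\mathbb{R}$, then
$$\lim_{x\to+\infty} \frac{1}{x}\int_0^x f(t)\,dt = L.$$
12.S Let $f$ be continuous on $[a, b]$ and let $M = \sup_{a\le x\le b} |f(x)|$. Prove:
(a) For each $\varepsilon > 0$ there exists $\delta > 0$ such that
$$\delta(M - \varepsilon) \le \int_a^b |f(x)|\,dx \le M(b - a).$$
(b) $\displaystyle M = \lim_{p\to+\infty} \left(\int_a^b |f|^p\right)^{1/p}$.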
13. ⇓3 Let $f, g : [a, b] \to \mathbb{R}$ be continuous. Supply the details in the following outline of a proof of the Cauchy–Schwarz inequality.
$$\left(\int_a^b fg\right)^2 \le \int_a^b f^2 \int_a^b g^2.$$
(a) The inequality holds if $\int_a^b g^2 = 0$.
(b) For any real number $t$,
$$0 \le \int_a^b (f - tg)^2 = \int_a^b f^2 - 2t\int_a^b fg + t^2 \int_a^b g^2.$$
(c) Let $\displaystyle t = \left(\int_a^b fg\right)\left(\int_a^b g^2\right)^{-1}$ in (b).
5.3 Evaluation of the Integral
The theorems in this section describe standard methods for evaluating
integrals. The first of these expresses the integral of a function f in terms of a
primitive or antiderivative, that is, a function whose derivative is f . It also
shows that the process of integration is the inverse of that of differentiation.
3 This exercise will be used in 5.7.19.
5.3.1 Fundamental Theorem of Calculus. Let $f : [a, b] \to \mathbb{R}$ be continuous.
(a) The function $G(x) := \int_a^x f(t)\,dt$, $x \in [a, b]$, is a primitive of $f$.
(b) For any primitive $F$ of $f$,
$$\int_a^b f = F(x)\Big|_a^b := F(b) - F(a).$$
(c) If $f' \in \mathcal{R}_a^b$, then $\int_a^b f' = f(b) - f(a)$. In particular, $f(x) = f(a) + \int_a^x f'$.
Proof. (a) We assume that $a \le x < b$ and prove that
$$\lim_{h\to 0^+} \frac{G(x + h) - G(x)}{h} = f(x). \tag{5.6}$$
By 5.2.4 and 5.2.6, if $h > 0$ and $x + h < b$, then
$$\left|\frac{G(x + h) - G(x)}{h} - f(x)\right| = \left|\frac{1}{h}\int_x^{x+h} \big[f(t) - f(x)\big]\,dt\right| \le \frac{1}{h}\int_x^{x+h} |f(t) - f(x)|\,dt.$$
By continuity of $f$ at $x$, given $\varepsilon > 0$ we may choose $\delta > 0$ such that $|t - x| < \delta$ implies $|f(t) - f(x)| < \varepsilon$. Thus if $h < \delta$, then the term on the right in the above inequality is $\le \varepsilon$, proving (5.6).
(b) Let $F$ be any primitive of $f$. Then $F' = f = G'$, hence $F = G + c$ for some constant $c$. Thus from (a),
$$\int_a^b f = G(b) - G(a) = F(b) - F(a).$$
(c) For any partition $\mathcal{P}$, by the mean value theorem
$$f(x_j) - f(x_{j-1}) = f'(\xi_j)\,\Delta x_j \quad\text{for some } \xi_j \in [x_{j-1}, x_j],\ j = 1, \ldots, n.$$
For this choice of $\xi_j$,
$$S(f', \mathcal{P}, \xi) = \sum_{j=1}^{n} f'(\xi_j)\,\Delta x_j = \sum_{j=1}^{n} \big[f(x_j) - f(x_{j-1})\big] = f(b) - f(a).$$
Since we may choose $\mathcal{P}$ so that $S(f', \mathcal{P}, \xi)$ is arbitrarily near $\int_a^b f'$, (c) follows.
The general primitive of a continuous function $f$ is denoted by $\int f$ and is called the indefinite integral of $f$. (In this context, $\int_a^b f$ is called a definite integral.) For example, one writes $\int 3x^2\,dx = x^3 + c$, where $c$ is the so-called constant of integration. In general, since primitives of a function differ only by a constant, we write
$$\int f(x)\,dx = F(x) + c,$$
where $F$ is any particular primitive of $f$.
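Part (b) of the fundamental theorem can be sanity-checked numerically by comparing a Riemann sum with the difference of a primitive at the endpoints. The sketch below uses the integrand 3x² and primitive x³ from the example just given; the interval [0, 2], the choice of midpoints as intermediate points, and the helper name `midpoint_sum` are illustrative assumptions.

```python
def midpoint_sum(f, a, b, n):
    """Riemann sum over the uniform partition of [a, b] with n subintervals,
    taking the midpoint of each subinterval as the intermediate point."""
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

f = lambda x: 3 * x * x   # integrand
F = lambda x: x ** 3      # a primitive of f

a, b = 0.0, 2.0
approximation = midpoint_sum(f, a, b, 1000)
exact = F(b) - F(a)
print(approximation, exact)
```

The two printed numbers agree to several decimal places, and the agreement improves as the number of subintervals grows.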
5.3.2 Change of Variables Theorem. Let $\varphi : [a, b] \to \mathbb{R}$ be continuously differentiable with $\varphi'$ never zero and let $f$ be integrable on $[c, d] := \varphi([a, b])$. Then $(f \circ \varphi)|\varphi'| \in \mathcal{R}_a^b$ and
$$\int_a^b f(\varphi(x))\,|\varphi'(x)|\,dx = \int_c^d f(y)\,dy. \tag{5.7}$$
Proof. By the intermediate value theorem, we may assume that $\varphi'(x) > 0$ for all $x$, so $\varphi$ is strictly increasing, $c = \varphi(a)$, and $d = \varphi(b)$.
FIGURE 5.6: The partitions P x and P y .
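We show first that $f \circ \varphi \in \mathcal{R}_a^b$. For this we use the fact that $\varphi$ induces a
one-to-one correspondence between partitions $\mathcal{P}^x = \{x_0, \dots, x_n\}$ of $[a, b]$ and
partitions $\mathcal{P}^y = \{y_0, \dots, y_n\}$ of $[c, d]$, where $y_j = \varphi(x_j)$ ($x_j = \varphi^{-1}(y_j)$) (see
Figure 5.6). Since $\varphi([x_{j-1}, x_j]) = [y_{j-1}, y_j]$,
\[ M_j^x(f \circ \varphi) = \sup_{x_{j-1} \le x \le x_j} f(\varphi(x)) = \sup_{y_{j-1} \le y \le y_j} f(y) = M_j^y(f). \tag{5.8} \]
Moreover, by the mean value theorem, there exists $z_j \in [y_{j-1}, y_j]$ such that
\[ \Delta x_j = \varphi^{-1}(y_j) - \varphi^{-1}(y_{j-1}) = (\varphi^{-1})'(z_j)\Delta y_j \le C\Delta y_j, \tag{5.9} \]
where $C$ is a bound for $|(\varphi^{-1})'|$ on $[c, d]$. From (5.8) and (5.9),
\[ \overline{S}(f \circ \varphi, \mathcal{P}^x) \le C\,\overline{S}(f, \mathcal{P}^y). \]
The same inequality evidently holds for $-f$, hence
\[ -\underline{S}(f \circ \varphi, \mathcal{P}^x) \le -C\,\underline{S}(f, \mathcal{P}^y). \]
Adding these inequalities,
\[ \overline{S}(f \circ \varphi, \mathcal{P}^x) - \underline{S}(f \circ \varphi, \mathcal{P}^x) \le C\,[\overline{S}(f, \mathcal{P}^y) - \underline{S}(f, \mathcal{P}^y)]. \]
Since the right side may be made arbitrarily small, $f \circ \varphi \in \mathcal{R}_a^b$, hence also
$(f \circ \varphi)\varphi' \in \mathcal{R}_a^b$.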
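To prove (5.7), we argue as in the first part of the proof, but now compare
the Riemann sums $S((f \circ \varphi)\varphi', \mathcal{P}^x, \xi)$ and $S(f, \mathcal{P}^y, \zeta)$, where the intermediate
points in each case are taken to be left endpoints:
\[ \xi := (x_0, \dots, x_{n-1}), \qquad \zeta := (y_0, \dots, y_{n-1}) = (\varphi(x_0), \dots, \varphi(x_{n-1})), \]
so that $\xi_j = x_{j-1}$ and $\zeta_j = \varphi(\xi_j)$ for $j = 1, \dots, n$. Then
\[ S\big((f \circ \varphi)\varphi', \mathcal{P}^x, \xi\big) = \sum_{j=1}^n f(\zeta_j)\varphi'(\xi_j)\Delta x_j \]
and, by the mean value theorem,
\[ S(f, \mathcal{P}^y, \zeta) = \sum_{j=1}^n f(\zeta_j)\Delta y_j = \sum_{j=1}^n f(\zeta_j)\varphi'(t_j)\Delta x_j \]
for some $t_j \in [x_{j-1}, x_j]$. Subtracting these equations and using the triangle
inequality, we obtain
\[ \big| S\big((f \circ \varphi)\varphi', \mathcal{P}^x, \xi\big) - S(f, \mathcal{P}^y, \zeta) \big| \le \sum_{j=1}^n |f(\zeta_j)|\,|\varphi'(\xi_j) - \varphi'(t_j)|\,\Delta x_j \le M \sum_{j=1}^n |\varphi'(\xi_j) - \varphi'(t_j)|\,\Delta x_j, \]
where $M > 0$ is a bound for $|f|$ on $[c, d]$. By the uniform continuity of $\varphi'$ on $[a, b]$,
given $\varepsilon > 0$ there exists a $\delta > 0$ such that $|\varphi'(s) - \varphi'(t)| < \varepsilon/[M(b - a)]$ for all
$s, t$ with $|s - t| < \delta$. Hence if $\|\mathcal{P}^x\| < \delta$, then
\[ \big| S((f \circ \varphi)\varphi', \mathcal{P}^x, \xi) - S(f, \mathcal{P}^y, \zeta) \big| < \varepsilon. \]
Letting $\|\mathcal{P}^x\| \to 0$ and noting that then also $\|\mathcal{P}^y\| \to 0$ (because $\Delta y_j =
\varphi'(c_j)\Delta x_j \le B\|\mathcal{P}^x\|$, where $B$ is a bound for $|\varphi'|$), we see that
\[ \left| \int_a^b f(\varphi(x))\varphi'(x)\,dx - \int_c^d f(y)\,dy \right| \le \varepsilon. \]
Since $\varepsilon$ was arbitrary, the two integrals are equal, completing the proof.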
Remark. Whether $\varphi$ is increasing or decreasing, (5.7) may be written as
\[ \int_a^b f(\varphi(x))\varphi'(x)\,dx = \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy. \]
This formula has an easy proof if $f$ is continuous. Indeed, in this case $f$ has a
primitive $F$ on $[c, d]$, hence, by the chain rule, $F \circ \varphi$ is a primitive for $(f \circ \varphi)\varphi'$.
The desired formula now follows from the fundamental theorem of calculus:
\[ \int_a^b f(\varphi(x))\varphi'(x)\,dx = F(\varphi(b)) - F(\varphi(a)) = \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy. \]
Note that in this case it is not necessary to assume that $\varphi' \ne 0$.
♦
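As a numerical illustration of (5.7), the sketch below compares both sides for one concrete choice. The functions $\varphi(x) = x^2$ on $[1, 2]$ and $f = \cos$, and the helper `midpoint_integral`, are assumptions made for this example only.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

phi = lambda x: x**2      # continuously differentiable, phi' > 0 on [1, 2]
dphi = lambda x: 2 * x    # derivative of phi
f = math.cos              # continuous, hence integrable on [phi(1), phi(2)]
a, b = 1.0, 2.0

# Left side of (5.7): integral of f(phi(x)) |phi'(x)| over [a, b].
lhs = midpoint_integral(lambda x: f(phi(x)) * abs(dphi(x)), a, b)
# Right side of (5.7): integral of f over [c, d] = [phi(a), phi(b)].
rhs = midpoint_integral(f, phi(a), phi(b))

print(lhs, rhs, math.sin(4) - math.sin(1))
```

The three printed values should be close to one another; the last is the exact value obtained from the primitive $\sin$ of $\cos$.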
5.3.3 Integration by Parts Formula. Let $f$ and $g$ be differentiable on $[a, b]$
with $f', g' \in \mathcal{R}_a^b$. Then
\[ \int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b f'(x)g(x)\,dx. \tag{5.10} \]
Proof. Since $(fg)' = f'g + fg' \in \mathcal{R}_a^b$, 5.3.1(c) implies that
\[ f(x)g(x)\Big|_a^b = \int_a^b (fg)' = \int_a^b f'g + \int_a^b fg'. \]
5.3.4 Example. We show that
\[ \int_0^{\pi/2} \sin^k x\,dx =
\begin{cases}
\dfrac{(k-1)(k-3)\cdots 4\cdot 2}{k(k-2)\cdots 5\cdot 3}, & k \text{ odd},\\[2ex]
\dfrac{\pi}{2}\,\dfrac{(k-1)(k-3)\cdots 5\cdot 3}{k(k-2)\cdots 4\cdot 2}, & k \text{ even}.
\end{cases} \]
Let $I_k = \int_0^{\pi/2} \sin^k x\,dx$. Integrating by parts,
\[ I_k = \int_0^{\pi/2} \sin^{k-1} x \,\sin x\,dx = (k-1)\int_0^{\pi/2} \sin^{k-2} x \,\cos^2 x\,dx. \]
Since $\cos^2 x = 1 - \sin^2 x$, $I_k = (k-1)(I_{k-2} - I_k)$, hence
\[ I_k = \frac{k-1}{k}\, I_{k-2}. \]
Iterating, we obtain
\[ I_k = \frac{(k-1)(k-3)\cdots(k-2j+1)}{k(k-2)\cdots(k-2j+2)}\, I_{k-2j}. \]
If $k$ is odd, take $j = (k-1)/2$ so
\[ I_k = \frac{(k-1)(k-3)\cdots 4\cdot 2}{k(k-2)\cdots 5\cdot 3}\, I_1. \]
If $k$ is even, take $j = (k-2)/2$ so
\[ I_k = \frac{(k-1)(k-3)\cdots 5\cdot 3}{k(k-2)\cdots 6\cdot 4}\, I_2. \]
Since $I_1 = 1$ and $I_2 = \pi/4$, the formula follows.
♦
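The recursion $I_k = \frac{k-1}{k} I_{k-2}$ translates directly into a short computation. The sketch below is illustrative only: the function names are hypothetical, and the recursion is run down to $I_0 = \pi/2$ in the even case (equivalent to stopping at $I_2 = \pi/4$ as in the example).

```python
import math
from fractions import Fraction

def sine_power_integral(k):
    """Integral of sin^k over [0, pi/2] via I_k = ((k-1)/k) * I_{k-2}."""
    ratio = Fraction(1)
    j = k
    while j >= 2:
        ratio *= Fraction(j - 1, j)   # one step of the recursion
        j -= 2
    base = 1.0 if j == 1 else math.pi / 2   # I_1 = 1, I_0 = pi/2
    return float(ratio) * base

def midpoint_integral(f, a, b, n=20_000):
    """Midpoint Riemann sum, used here only as an independent check."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

for k in range(1, 7):
    numeric = midpoint_integral(lambda x: math.sin(x) ** k, 0.0, math.pi / 2)
    print(k, sine_power_integral(k), numeric)
```

Exact rational arithmetic is used for the product so that only the final multiplication by $\pi/2$ introduces rounding.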
If $f'$ and $g'$ are continuous, then (5.10) has the following analog for indefinite
integrals:
\[ \int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx. \tag{5.11} \]
Setting $h = g'$ and using the symbols $D$ for differentiation and $I$ for integration,
we may write (5.11) as
\[ I(fh) = f \cdot Ih - I(Df \cdot Ih). \]
By induction we obtain
\[ I(fh) = \sum_{k=1}^n (-1)^{k-1} D^{k-1}f \cdot I^k h + (-1)^n I\big(D^n f \cdot I^n h\big). \tag{5.12} \]
The fundamental theorem of calculus may then be used to calculate $\int_a^b fh$.
Formula (5.12) may be expressed in tabular form as shown in Table 5.1. For
each $k$, the entries in column $k$ are multiplied and the resulting products are
added. The exception is in column $n + 1$, where the product must be integrated
before adding. The process terminates if and when $D^n f = 0$.
TABLE 5.1: Table for evaluating $\int fh$ by parts.
k             1     2       3       ···   n             n+1
(−1)^{k−1}    +1    −1      +1      ···   (−1)^{n−1}    (−1)^n
D^{k−1} f     f     Df      D^2 f   ···   D^{n−1} f     D^n f
I^k h         Ih    I^2 h   I^3 h   ···   I^n h         I^n h
5.3.5 Example. Using Table 5.1 with $f(x) = (x + 1)^3$ and $h(x) = e^{5x}$, we
have
\[ \int (x + 1)^3 e^{5x}\,dx = e^{5x}\left[ \frac{(x + 1)^3}{5} - \frac{3(x + 1)^2}{5^2} + \frac{6(x + 1)}{5^3} - \frac{6}{5^4} \right] + c. \]
TABLE 5.2: Table for evaluating $\int (x + 1)^3 e^{5x}\,dx$ by parts.
k             1           2            3            4            5
(−1)^{k−1}    +1          −1           +1           −1           +1
D^{k−1} f     (x+1)^3     3(x+1)^2     6(x+1)       6            0
I^k h         e^{5x}/5    e^{5x}/5^2   e^{5x}/5^3   e^{5x}/5^4   e^{5x}/5^5
♦
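The tabular scheme is mechanical enough to automate when $f$ is a polynomial and $h(x) = e^{cx}$, since then $I^k h = e^{cx}/c^k$ and the table terminates. The following is a sketch under those assumptions; the function names and the coefficient-list representation of polynomials are choices made here, not notation from the text.

```python
import math

def poly_derivative(coeffs):
    """Derivative of a polynomial; coeffs[i] is the coefficient of x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    return sum(c * x**i for i, c in enumerate(coeffs))

def tabular_antiderivative(coeffs, c):
    """Primitive of p(x) * exp(c*x) built column by column as in Table 5.1."""
    columns = []
    p, k = list(coeffs), 1
    while p:  # stop once D^{k-1} p is the zero polynomial
        columns.append(((-1) ** (k - 1), p, c**k))  # sign, D^{k-1} p, c^k
        p = poly_derivative(p)
        k += 1

    def primitive(x):
        total = sum(sign * poly_eval(q, x) / power for sign, q, power in columns)
        return math.exp(c * x) * total

    return primitive

# (x + 1)^3 = 1 + 3x + 3x^2 + x^3, with h(x) = e^{5x}, as in Example 5.3.5.
F = tabular_antiderivative([1, 3, 3, 1], 5)
x, dx = 0.3, 1e-5
print((F(x + dx) - F(x - dx)) / (2 * dx), (x + 1) ** 3 * math.exp(5 * x))
```

The final line compares a difference quotient of the computed primitive with the integrand; the two numbers should agree to several digits.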
Exercises
1.S ⇓4 Let $f : \mathbb{R} \to \mathbb{R}$ be continuous and periodic with period $p > 0$, that is,
$f(x + p) = f(x)$ for all $x$. Prove that
\[ \int_0^p f(x + y)\,dx = \int_0^p f(x)\,dx \ \text{ for all } y \in \mathbb{R}. \]
2. Let $f : (a, b) \to \mathbb{R}$ have a uniformly continuous derivative. Prove that
$f' \in \mathcal{R}_a^b$ and
\[ \int_a^b f' = \lim_{\varepsilon \to 0^+} \big[ f(b - \varepsilon) - f(a + \varepsilon) \big]. \]
3. Verify the following inequalities:
(a)S $\displaystyle \frac{2}{\pi}\big(\sqrt{2} - 1\big) \le \int_0^1 \frac{\sin x\,dx}{\sqrt{1 + x^2}} \le \sqrt{2} - 1.$
(b) $\displaystyle \frac{1}{2^q (p + 1)} \le \int_0^1 \frac{x^p\,dx}{(1 + x^p)^q} \le \frac{2^{1-q} - 1}{(p + 1)(1 - q)}, \quad p, q > 0,\ q \ne 1.$
4. Establish the formula
\[ \int_0^1 (1 - x)^m x^n\,dx = \frac{m!}{(n + 1)(n + 2) \cdots (n + m + 1)}. \]
5. Let $n \in \mathbb{N}$. Evaluate
(a)S $\displaystyle \int_0^1 \exp(x^{1/n})\,dx.$
(b) $\displaystyle \int_1^e \ln^n x\,dx.$
6. Let $k \in \mathbb{N}$. Show that
\[ \int_0^{\pi/2} \cos^k x\,dx =
\begin{cases}
\dfrac{(k-1)(k-3)\cdots 4\cdot 2}{k(k-2)\cdots 5\cdot 3}, & k \text{ odd},\\[2ex]
\dfrac{\pi}{2}\,\dfrac{(k-1)(k-3)\cdots 5\cdot 3}{k(k-2)\cdots 4\cdot 2}, & k \text{ even}.
\end{cases} \]
7.S ⇓5 Let $k \in \mathbb{N}$. Show that
\[ \int_0^1 \frac{x^k}{\sqrt{1 - x^2}}\,dx =
\begin{cases}
\dfrac{(k-1)(k-3)\cdots 4\cdot 2}{k(k-2)\cdots 3\cdot 1} & \text{if } k \text{ is odd},\\[2ex]
\dfrac{(k-1)(k-3)\cdots 3\cdot 1}{k(k-2)\cdots 4\cdot 2}\,\dfrac{\pi}{2} & \text{if } k \text{ is even}.
\end{cases} \]
N.B. The integral is improper but converges by Exercise 5.7.7. For the
even case, use Exercise 6.
4 This exercise will be used in 13.6.4.
5 This exercise will be used in 13.4.2.
8. Let $f'$ be continuous and positive on $[a, b]$. Prove that
\[ \int_a^b f(x)\,dx + \int_{f(a)}^{f(b)} f^{-1}(y)\,dy = bf(b) - af(a). \]
Interpret geometrically for $f > 0$ and $a > 0$.
9.S (Young's inequality). Let $f$ be continuous and strictly increasing on
$[0, a]$ with $f(0) = 0$. Prove that
\[ \int_0^x f + \int_0^y f^{-1} = yf^{-1}(y) + \int_{f^{-1}(y)}^x f. \]
Deduce that
\[ \int_0^x f + \int_0^y f^{-1} \ge xy, \quad 0 \le x \le a,\ 0 \le y \le f(a). \]
10. Use Young's inequality to verify the following inequalities:
(a) $\sqrt{1 - y^2} + y \sin^{-1} y \ge xy + \cos x$, $0 \le x \le \pi/2$, $0 \le y \le 1$.
(b)S $x \ln x + e^y \ge xy + x$, $1 \le x \le 2$, $0 \le y \le \ln 2$.
11. Give an example of a discontinuous function that
(a) has a primitive,
(b) has no primitive.
12. Let $f$ and $g$ be continuously differentiable with $g > 0$. Prove that
\[ \int \frac{f(x)g'(x)}{g^2(x)}\,dx = \int \frac{f'(x)}{g(x)}\,dx - \frac{f(x)}{g(x)}. \]
13.S Let $f' \in \mathcal{R}_a^b$. Prove that
\[ \lim_n \int_a^b f(x) \sin(nx)\,dx = 0. \]
14. Let $f$ be continuous on $[0, +\infty)$ such that $\lim_{x \to +\infty} f(x)$ exists in $\mathbb{R}$ and
let $a > 0$. Find
\[ \lim_{n \to +\infty} \int_0^a f(nx)\,dx. \]
15. Let $h'$ be continuous and positive on $[a, b]$ and let $g'$ be continuous on
$[c, d] = [h(a), h(b)]$. Prove that
\[ \int_a^b g(h(x))\,dx = g(d)b - g(c)a - \int_c^d g'(t)h^{-1}(t)\,dt. \]
16. Let $f \in \mathcal{R}_{-a}^a$, $a > 0$. Show that
\[ \int_{-a}^a f =
\begin{cases}
0 & \text{if } f \text{ is an odd function},\\
2\int_0^a f & \text{if } f \text{ is an even function}.
\end{cases} \]
17.S Let $f : [a, b] \to \mathbb{R}$ be continuous and let $u, v$ be differentiable functions
with range contained in $[a, b]$. Prove that
\[ \frac{d}{dx} \int_{u(x)}^{v(x)} f = f(v(x))v'(x) - f(u(x))u'(x). \]
18. Let functions $a, b, c, d : [0, 1] \to [0, 1]$ have continuous derivatives and
let $f : [0, 1] \to \mathbb{R}$ be continuous. Suppose that
\[ \int_{a(x)}^{b(x)} f = \int_{c(x)}^{d(x)} f \ \text{ for all } x \in [0, 1]. \]
Prove that
\[ \int_{b(0)}^{b(1)} f + \int_{c(0)}^{c(1)} f = \int_{a(0)}^{a(1)} f + \int_{d(0)}^{d(1)} f. \]
19.S Let $f$ be continuous and $g$ differentiable with bounded derivative on
$[a, b]$. Evaluate
\[ \lim_{x \to a} \frac{g(x)}{x - a} \int_a^x f. \]
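20. Let $p > 0$, $q > 1$, and $m, k \in \mathbb{N}$ with $m > k$. Evaluate $\lim_{n \to +\infty} s_n$ if $s_n =$
(a)S $\displaystyle \sum_{k=1}^n \frac{k^p}{n^{p+1}}.$
(b) $\displaystyle \sum_{k=1}^n \frac{k^{q-1}}{n^q + k^q}.$
(c) $\displaystyle \left[ \frac{(mn)!}{n^{kn}\,[(m - k)n]!} \right]^{1/n}.$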
21.S Let $|f'| \le M$ on $[a, b]$. For $n \in \mathbb{N}$ set $h = (b - a)/n$ and $x_k = a + kh$,
$k = 0, 1, \dots, n - 1$. Prove that
\[ \left| \int_a^b f - h \sum_{k=1}^n f(x_{k-1}) \right| \le hM(b - a). \]
22. Let $f$ be continuous on $[0, 1]$. Prove that
\[ \int_0^1 \int_0^x f(t)\,dt\,dx = \int_0^1 (1 - x)f(x)\,dx. \]
23. Let $f, g : [0, 1] \to \mathbb{R}$ be continuously differentiable, $f$ monotone, and
$g(x) > g(0) = g(1)$ on $(0, 1)$. Prove that $\int_0^1 f g' = 0$ iff $f$ is constant.
*5.4 Stirling's Formula
Stirling’s formula gives an estimate for n! when n is large. The proof relies
on material from Section 4.3. We begin with the following lemma, which
provides the fundamental inequality needed to establish the formula.
5.4.1 Lemma. If $f$ is concave and differentiable on $(a, b)$, then
\[ \frac{f(u) + f(v)}{2} \le \frac{1}{v - u} \int_u^v f(t)\,dt \le f\!\left( \frac{u + v}{2} \right), \quad a < u < v < b. \]
Proof. By the concave versions of 4.3.6 and (4.3),
\[ f(u)\,\frac{v - t}{v - u} + f(v)\,\frac{t - u}{v - u} \le f(t) \le f'(x)(t - x) + f(x) \]
for all $a < u < v < b$ and all $x, t \in [u, v]$. Integrating with respect to $t$,
\[ (v - u)\,\frac{f(u) + f(v)}{2} \le \int_u^v f(t)\,dt \le f'(x)\,\frac{(v - x)^2 - (x - u)^2}{2} + f(x)(v - u). \]
Taking $x = (u + v)/2$ and dividing by $v - u$ produces the desired inequalities.
5.4.2 Stirling's Inequalities. For all $n$,
\[ e^{7/8} \le \frac{e^n n!}{n^n \sqrt{n}} \le e, \tag{5.13} \]
where the middle term is decreasing in $n$.
Proof. Taking $f(x) = \ln x$, $u = k \in \mathbb{N}$, and $v = k + 1$ in the lemma, we have
\[ \tfrac{1}{2} \ln(k^2 + k) = \tfrac{1}{2}\big[ \ln k + \ln(k + 1) \big] \le \int_k^{k+1} \ln t\,dt \le \ln\big(k + \tfrac{1}{2}\big). \]
Rearranging,
\[ 0 \le \int_k^{k+1} \ln t\,dt - \tfrac{1}{2} \ln(k^2 + k) \le \ln\big(k + \tfrac{1}{2}\big) - \tfrac{1}{2} \ln(k^2 + k). \tag{5.14} \]
Now observe that
\[ \sum_{k=1}^{n-1} \int_k^{k+1} \ln t\,dt = \int_1^n \ln t\,dt = n \ln n - n + 1, \]
\[ \sum_{k=1}^{n-1} \ln(k^2 + k) = \sum_{k=1}^{n-1} [\ln(k + 1) + \ln k] = 2 \sum_{k=2}^{n} \ln k - \ln n = 2 \ln n! - \ln n, \ \text{ and} \]
\[ \ln\big(k + \tfrac{1}{2}\big) - \tfrac{1}{2} \ln(k^2 + k) = \tfrac{1}{2} \ln\!\left( 1 + \frac{1}{4(k^2 + k)} \right) \le \frac{1}{8(k^2 + k)}, \]
where, for the last inequality, we used the fact that $\ln(1 + x) < x$ for $x > 0$,
which follows directly from the integral definition of $\ln(x + 1)$. Summing in
(5.14) and using the above inequalities, we obtain
\[ 0 \le \big(n + \tfrac{1}{2}\big) \ln n - n + 1 - \ln n! \le \sum_{k=1}^{n-1} \frac{1}{8(k^2 + k)} = \frac{1}{8} \sum_{k=1}^{n-1} \left( \frac{1}{k} - \frac{1}{k + 1} \right) \le \frac{1}{8}. \]
Note that the term $\big(n + \tfrac{1}{2}\big) \ln n - n + 1 - \ln n!$ is increasing in $n$ since it was
obtained as a sum of nonnegative terms in (5.14). Rearranging, we have
\[ \frac{7}{8} \le -\big(n + \tfrac{1}{2}\big) \ln n + n + \ln n! \le 1, \]
where the middle term is decreasing in $n$. Exponentiating yields the desired
inequalities.
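The inequalities (5.13) and the monotonicity claim can be observed numerically. The sketch below works with logarithms to avoid overflow; it relies on the standard-library function `math.lgamma` together with the identity $\ln n! = \ln \Gamma(n+1)$, and the name `log_beta` is an illustrative choice.

```python
import math

def log_beta(n):
    """Natural log of e^n * n! / (n^n * sqrt(n)), using lgamma(n + 1) = ln(n!)."""
    return n + math.lgamma(n + 1) - (n + 0.5) * math.log(n)

values = [log_beta(n) for n in range(1, 2001)]

# Taking logs in (5.13): 7/8 <= log_beta(n) <= 1, with log_beta decreasing.
print(values[0], values[9], values[-1])
print(0.5 * math.log(2 * math.pi))  # logarithm of sqrt(2*pi), for comparison
```

The printed values decrease from 1 toward roughly 0.9189, which is consistent with the bounds $7/8$ and $1$.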
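5.4.3 Stirling's Formula.
\[ \lim_n \frac{e^n n!}{n^n \sqrt{n}} = \sqrt{2\pi}. \]
Proof. By 5.4.2, the limit $L$ in the formula exists in $\mathbb{R}$. Set $I_n = \int_0^{\pi/2} \sin^n x\,dx$.
By 5.3.4,
\[ I_{2n+1} = \frac{(2n)(2n - 2) \cdots 4 \cdot 2}{(2n + 1)(2n - 1) \cdots 5 \cdot 3} \quad \text{and} \quad I_{2n} = \frac{\pi}{2}\, \frac{(2n - 1)(2n - 3) \cdots 5 \cdot 3}{2n(2n - 2) \cdots 4 \cdot 2}. \]
For $x \in [0, \pi/2]$ and $n \ge m$, $\sin^n x \le \sin^m x$, hence
\[ \frac{I_{2n+2}}{I_{2n}} \le \frac{I_{2n+1}}{I_{2n}} \le \frac{I_{2n}}{I_{2n}} = 1. \]
It follows that
\[ \frac{2n + 1}{2n + 2}\, \frac{\pi}{2} \le \frac{2^2 \cdot 4^2 \cdot 6^2 \cdots (2n - 2)^2 \cdot (2n)^2}{1^2 \cdot 3^2 \cdot 5^2 \cdots (2n - 1)^2 (2n + 1)} \le \frac{\pi}{2}, \]
from which we obtain Wallis's product
\[ \lim_n \frac{2^2 \cdot 4^2 \cdot 6^2 \cdots (2n - 2)^2 \cdot (2n)^2}{1^2 \cdot 3^2 \cdot 5^2 \cdots (2n - 1)^2 (2n + 1)} = \frac{\pi}{2}. \]
Denote the general term in Wallis's product by $\alpha_n$. Since
\[ 2 \cdot 4 \cdots (2n - 2) \cdot (2n) = 2^n n! \quad \text{and} \quad 3 \cdot 5 \cdots (2n - 1) = \frac{(2n)!}{2^n n!}, \]
we see that
\[ \sqrt{\alpha_n} = \frac{2^{2n} (n!)^2}{(2n)! \sqrt{2n + 1}}. \]
Now set $\beta_n = \dfrac{e^n n!}{n^n \sqrt{n}}$ and note that
\[ \frac{\beta_n^2}{\beta_{2n}} = \frac{e^{2n} (n!)^2}{n^{2n+1}} \cdot \frac{(2n)^{2n} \sqrt{2n}}{e^{2n} (2n)!} = \frac{(n!)^2\, 2^{2n} \sqrt{2}}{(2n)! \sqrt{n}}. \]
Dividing by $\sqrt{\alpha_n}$,
\[ \frac{\beta_n^2}{\beta_{2n} \sqrt{\alpha_n}} = \frac{(n!)^2\, 2^{2n} \sqrt{2}}{(2n)! \sqrt{n}} \cdot \frac{(2n)! \sqrt{2n + 1}}{2^{2n} (n!)^2} = \sqrt{2}\, \sqrt{2 + 1/n} \to 2. \]
Since $\sqrt{\alpha_n} \to \sqrt{\pi/2}$ and $\beta_n \to L$, we also have
\[ \lim_n \frac{\beta_n^2}{\beta_{2n} \sqrt{\alpha_n}} = L \sqrt{\frac{2}{\pi}}. \]
Therefore, $L \sqrt{2/\pi} = 2$, hence $L = \sqrt{2\pi}$.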
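The bounds that produce Wallis's product in the proof of 5.4.3 can be checked directly. The sketch below computes the general term $\alpha_n$ as a running product; the function name `wallis_term` is hypothetical, and the product is regrouped as $\prod_{j=1}^n (2j)^2 / ((2j-1)(2j+1))$, which equals the expression in the text.

```python
import math

def wallis_term(n):
    """alpha_n = (2^2 * 4^2 * ... * (2n)^2) / (1^2 * 3^2 * ... * (2n-1)^2 * (2n+1))."""
    alpha = 1.0
    for j in range(1, n + 1):
        alpha *= (2 * j) ** 2 / ((2 * j - 1) * (2 * j + 1))
    return alpha

for n in (1, 10, 1000):
    lower = (2 * n + 1) / (2 * n + 2) * math.pi / 2
    print(n, lower, wallis_term(n), math.pi / 2)
```

Each printed row should show the middle value squeezed between the lower bound and $\pi/2$. Convergence is slow, with the gap shrinking roughly like $1/n$.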
5.5 Integral Mean Value Theorems
The following theorem asserts that the average value of a continuous
function over an interval [a, b] is actually assumed by the function at some
intermediate point c.
5.5.1 First Mean Value Theorem for Integrals. If $f$ is continuous on
$[a, b]$, then there exists $c \in (a, b)$ such that
\[ \frac{1}{b - a} \int_a^b f = f(c). \]
Proof. Apply the mean value theorem for derivatives and the fundamental
theorem of calculus to the function $G(x) := \int_a^x f(t)\,dt$.
The next theorem is a weighted average generalization of 5.5.1.
5.5.2 Weighted Mean Value Theorem for Integrals. Let $f$ be continuous
on $[a, b]$ and let $g \in \mathcal{R}_a^b$. If $g$ does not change sign in $[a, b]$, then there exists
$c \in [a, b]$ such that
\[ \int_a^b fg = f(c) \int_a^b g. \tag{5.15} \]
Proof. We may assume that $g \ge 0$ on $[a, b]$, so $\int_a^b g \ge 0$. Suppose first that
$\int_a^b g = 0$. If $C$ is an upper bound for $|f|$ on $[a, b]$, then
\[ \left| \int_a^b fg \right| \le \int_a^b |f| g \le C \int_a^b g = 0, \]
hence both sides of (5.15) are zero. Now assume that $\int_a^b g > 0$. Let $m = f(x_m)$
and $M = f(x_M)$ denote the minimum and maximum values of $f$ on $[a, b]$.
Since $mg \le fg \le Mg$,
\[ m \int_a^b g \le \int_a^b fg \le M \int_a^b g, \]
hence
\[ m \le \frac{\int_a^b fg}{\int_a^b g} \le M. \]
An application of the intermediate value theorem completes the proof.
5.5.3 Second Mean Value Theorem for Integrals. Let $f$ be continuous
and $g$ differentiable and monotone on $[a, b]$ with $g' \in \mathcal{R}_a^b$. Then there exists
$c \in [a, b]$ such that
\[ \int_a^b fg = g(a) \int_a^c f + g(b) \int_c^b f. \]
Proof. Let $F(x) = \int_a^x f$. Integrating by parts,
\[ \int_a^b fg = \int_a^b F'g = F(b)g(b) - \int_a^b g'F. \]
Since $g$ is monotone, the sign of $g'$ does not change, hence, by 5.5.2, there
exists $c \in [a, b]$ such that
\[ \int_a^b g'F = F(c) \int_a^b g' = F(c)[g(b) - g(a)]. \]
Therefore,
\[ \int_a^b fg = F(b)g(b) - F(c)[g(b) - g(a)] = g(a)F(c) + g(b)[F(b) - F(c)], \]
which is the assertion of the theorem.
Remarks. (a) Because derivatives have the intermediate value property
(Exercise 4.2.25), the monotonicity requirement on $g$ will be satisfied if $g' \ne 0$
on $[a, b]$.
(b) The second mean value theorem for integrals holds under the less
restrictive hypotheses that $f$ is integrable and $g$ is monotone. A proof may be
found in [3].
♦
Exercises
1. Let $0 \le a < b$ and let $f$ be continuous on $[\sqrt{a}, \sqrt{b}]$. Prove that there
exists $c \in [a, b]$ such that
\[ \frac{1}{2} \int_a^b f(\sqrt{x})\,dx = \sqrt{a} \int_{\sqrt{a}}^{\sqrt{c}} f(x)\,dx + \sqrt{b} \int_{\sqrt{c}}^{\sqrt{b}} f(x)\,dx. \]
2. Let $0 < a < b$ and let $f$ be continuous on $[b^{-1}, a^{-1}]$. Prove that there
exists $c \in [a, b]$ such that
\[ \int_a^b f(1/x)\,dx = b^2 \int_{1/b}^{1/c} f(x)\,dx + a^2 \int_{1/c}^{1/a} f(x)\,dx. \]
3.S Let $f$ be continuous on $[0, 1]$. Prove that there exists $c \in [1/2, \sqrt{3}/2]$
such that
\[ \int_{\pi/6}^{\pi/3} f(\sin x)\,dx = \frac{2}{\sqrt{3}} \int_{1/2}^{c} f(x)\,dx + 2 \int_{c}^{\sqrt{3}/2} f(x)\,dx. \]
4. Let $f$ be continuous on $[0, 1]$. Prove that there exists $c \in [0, 1]$ such that
\[ \int_0^{\pi/4} f(\tan x)\,dx = \int_0^c f(x)\,dx + \frac{1}{2} \int_c^1 f(x)\,dx. \]
5. Let $f$ and $g$ be continuous on $[a, b]$. Show that there exists $c \in (a, b)$ such
that
\[ g(c) \int_a^b f = f(c) \int_a^b g. \]
6. Prove: If $f$ is continuous, $g \in \mathcal{R}_a^b$, and $m$ is a lower bound for $g$, then there
exist $c, d \in [a, b]$ such that
\[ \int_a^b fg = f(c) \int_a^b g + m(b - a)[f(d) - f(c)]. \]
7.S Prove the following variant of the second mean value theorem for
integrals: Let $f, g \in \mathcal{R}_a^b$ with $g \ge 0$. If $m \le f \le M$ on $[a, b]$, then there
exists $c \in [a, b]$ such that
\[ \int_a^b fg = m \int_a^c g + M \int_c^b g. \]
Hint. Consider $G(x) := m \int_a^x g + M \int_x^b g$.
8. Let $g$ have a nonnegative integrable derivative on $[0, 1]$ with $g(0) = 0$
and $g(1) = 1$. Show that there exists $c \in [0, 1]$ such that
\[ \int_0^1 x^n g(x)\,dx = \frac{1 - c^{n+1}}{n + 1}. \]
9.S Let $g$ have a nonnegative integrable derivative on $[0, \pi]$ with $g(0) = 0$
and $g(\pi) = 1$. Show that there exists $c \in [0, \pi]$ such that
\[ \int_0^\pi g(x) \sin x\,dx = \cos c + 1. \]
10. Let $g$ be twice differentiable on $[a, b]$ with $g'' < 0$ and $g'' \in \mathcal{R}_a^b$, and let
$f$ be continuous on $g([a, b])$. Show that if $g' \ge m > 0$ and $|f| \le M$, then
\[ \left| \int_a^b f' \circ g \right| \le \frac{2M}{m}. \]
Hint. Use the second mean value theorem for integrals.
*5.6 Estimation of the Integral
Integrals that cannot be evaluated exactly may be approximated by various
numerical methods. Of course, an integral may always be approximated by a
Riemann sum; however, unless the intermediate points of the subintervals are
chosen judiciously, a Riemann sum usually offers only a coarse approximation
of the integral. In this section we discuss three techniques, the trapezoidal rule,
the midpoint rule, and Simpson’s rule, that yield good numerical estimates of
an integral.
The approximation techniques are given in order of increasing precision.
For each of these, we use partitions of the form
\[ x_k = a + kh_n, \quad k = 0, 1, \dots, n, \quad \text{where } h_n := \frac{b - a}{n}. \tag{5.16} \]
The integral $\int_a^b f$ is then estimated by replacing $f$ on the interval $[x_k, x_{k+1}]$
by a simpler function $f_k$. The approximation is therefore
\[ \int_a^b f(x)\,dx \approx \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f_k(x)\,dx. \]
The error in the approximation is simply the difference between the left and
right sides. The main goal in the approximation schemes described below is
to obtain, for a given class of functions, the sharpest upper bound for the
magnitude of the error.
The reader may wish to compare the error bounds in the three approximation techniques described below with the error bound for the approximation
given by the Riemann sum
\[ R_n = \frac{b - a}{n} \big[ f(x_0) + f(x_1) + \cdots + f(x_{n-1}) \big]. \tag{5.17} \]
By Exercise 5.3.21, for functions $f$ with a bounded derivative one has in general
only the first order error bound
\[ \left| \int_a^b f - R_n \right| \le h_n (b - a) \|f'\|_\infty, \]
implying that a good estimate requires a large $n$. Here, for a bounded function
$g$ on $[a, b]$,
\[ \|g\|_\infty := \sup \{ |g(x)| : a \le x \le b \}. \]
Trapezoidal Rule
Let
\[ P_k := (x_k, f(x_k)) = (x_k, y_k), \quad k = 0, 1, \dots, n, \tag{5.18} \]
where the points $x_k$ are given in (5.16). The trapezoidal rule uses the line
segment from $P_k$ to $P_{k+1}$ to approximate $f$ on $[x_k, x_{k+1}]$, $k = 0, 1, \dots, n - 1$.
Thus the approximating function $f_k$ is given by
\[ f_k(x) = y_k + m_k (x - x_k), \quad x_k \le x \le x_{k+1}, \quad m_k := \frac{y_{k+1} - y_k}{x_{k+1} - x_k}. \]
A simple calculation shows that
\[ \int_{x_k}^{x_{k+1}} f_k = \frac{h_n}{2} (y_{k+1} + y_k). \]
The sum
\[ T_n := \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} f_k = \frac{h_n}{2} \big[ y_0 + 2y_1 + \cdots + 2y_{n-1} + y_n \big] \]
is then used to approximate $\int_a^b f$. If $f > 0$, $T_n$ may be realized as the sum of
areas of trapezoids. (See Figure 5.7.)
5.6.1 Trapezoidal Rule. If $f \in \mathcal{R}_a^b$, then $\lim_n T_n = \int_a^b f$. Moreover, if $f''$
exists and is continuous on $[a, b]$, then the following error estimate holds:
\[ \left| \int_a^b f - T_n \right| \le \frac{h_n^2}{12} (b - a) \|f''\|_\infty. \]
FIGURE 5.7: Trapezoidal rule approximation.
Proof. For the Riemann sum $R_n$ in (5.17),
\[ R_n - T_n = \frac{b - a}{2n} \big[ f(x_0) - f(x_n) \big] = \frac{b - a}{2n} \big[ f(a) - f(b) \big] \to 0, \]
hence $T_n = (T_n - R_n) + R_n \to \int_a^b f$.
To obtain the error estimate, consider the function
\[ g_k(x) := \frac{f(x) - f_k(x)}{(x - x_k)(x - x_{k+1})} = \frac{f(x) - y_k - m_k (x - x_k)}{(x - x_k)(x - x_{k+1})}, \]
which has singularities at $x_k$ and $x_{k+1}$. Since both the numerator and the
denominator vanish at these points, the singularities may be removed using
l'Hospital's rule. Therefore, $g_k(x)$ has a continuous extension to $[x_k, x_{k+1}]$.
Since $(x - x_k)(x - x_{k+1})$ does not change sign on $[x_k, x_{k+1}]$, by the weighted
mean value theorem for integrals (5.5.2) there exists a point $z_k \in [x_k, x_{k+1}]$
such that
\[ \int_{x_k}^{x_{k+1}} [f(x) - f_k(x)]\,dx = \int_{x_k}^{x_{k+1}} g_k(x)(x - x_k)(x - x_{k+1})\,dx = g_k(z_k) \int_{x_k}^{x_{k+1}} (x - x_k)(x - x_{k+1})\,dx = -g_k(z_k)\,\frac{h_n^3}{6}. \]
It follows that
\[ \int_a^b f(t)\,dt - T_n = \sum_{k=0}^{n-1} \int_{x_k}^{x_{k+1}} [f(x) - f_k(x)]\,dx = -\frac{h_n^3}{6} \sum_{k=0}^{n-1} g_k(z_k). \tag{5.19} \]
Now fix $x \in (x_k, x_{k+1})$ and define $\psi(z)$ on $[x_k, x_{k+1}]$ by
\[ \psi(z) = f(z) - f_k(z) - g_k(x)(z - x_k)(z - x_{k+1}). \]
Since $\psi$ has distinct zeros $x$, $x_k$, and $x_{k+1}$, Rolle's theorem applied twice shows
that $\psi''$ has a zero $v_k \in (x_k, x_{k+1})$. It follows that $f''(v_k) = 2g_k(x)$. Since $x$
was arbitrary,
\[ |g_k(x)| \le \tfrac{1}{2} \|f''\|_\infty \ \text{ for all } x \in [x_k, x_{k+1}]. \]
From this and (5.19) we see that
\[ \left| \int_a^b f(t)\,dt - T_n \right| \le \frac{n h_n^3}{12} \|f''\|_\infty = \frac{h_n^2}{12} (b - a) \|f''\|_\infty. \]
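The trapezoidal sum $T_n$ and the error estimate of 5.6.1 are easy to try out. The following sketch assumes the test integrand $f(x) = 1/x$ on $[1, 2]$, for which $f''(x) = 2/x^3$ and hence $\|f''\|_\infty = 2$; the function name `trapezoid` is an illustrative choice.

```python
import math

def trapezoid(f, a, b, n):
    """T_n = (h/2) * [y_0 + 2*y_1 + ... + 2*y_{n-1} + y_n] on a uniform partition."""
    h = (b - a) / n
    ys = [f(a + k * h) for k in range(n + 1)]
    return (h / 2) * (ys[0] + 2 * sum(ys[1:-1]) + ys[-1])

f = lambda x: 1 / x
a, b = 1.0, 2.0
sup_f2 = 2.0           # sup of |f''| = 2/x^3 on [1, 2]
exact = math.log(2)    # exact value of the integral

for n in (4, 8, 16):
    h = (b - a) / n
    error = exact - trapezoid(f, a, b, n)
    bound = h**2 / 12 * (b - a) * sup_f2
    print(n, error, bound)
```

Doubling $n$ should cut the error by roughly a factor of four, in line with the $h_n^2$ factor in the bound. The error is negative here because $1/x$ is convex, so each chord lies above the graph.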
Midpoint Rule
Let
\[ \bar{x}_k := \frac{x_k + x_{k+1}}{2} = a + \big(k + \tfrac{1}{2}\big) h_n, \quad k = 0, 1, \dots, n - 1, \]
where the points $x_k$ are given in (5.16). The midpoint rule uses the constant
function
\[ f_k(x) = f(\bar{x}_k), \quad x_k \le x \le x_{k+1}, \]
to approximate $f$ on $[x_k, x_{k+1}]$. This amounts to approximating $\int_a^b f$ by Riemann sums $M_n$, where the intermediate points are the midpoints of the
intervals:
\[ M_n = \frac{b - a}{n} \big[ f(\bar{x}_0) + f(\bar{x}_1) + \cdots + f(\bar{x}_{n-1}) \big]. \]
FIGURE 5.8: Midpoint rule approximation.
5.6.2 Midpoint Rule. If $f''$ exists and is continuous on $[a, b]$, then the
following error estimate holds:
\[ \left| \int_a^b f - M_n \right| \le \frac{h_n^2}{24} (b - a) \|f''\|_\infty. \]
Proof. The function
\[ g_k(x) = \frac{f(x) - f(\bar{x}_k) - f'(\bar{x}_k)(x - \bar{x}_k)}{(x - \bar{x}_k)^2} \]
has a double singularity at $\bar{x}_k$, which may be removed by applying l'Hospital's
rule twice and defining $g_k(\bar{x}_k)$ to be the resulting limit. Since
\[ f(x) - f(\bar{x}_k) - f'(\bar{x}_k)(x - \bar{x}_k) = g_k(x)(x - \bar{x}_k)^2 \]
and
\[ \int_{x_k}^{x_{k+1}} (x - \bar{x}_k)\,dx = 0, \]
we see that
\[ \int_{x_k}^{x_{k+1}} [f(x) - f(\bar{x}_k)]\,dx = \int_{x_k}^{x_{k+1}} g_k(x)(x - \bar{x}_k)^2\,dx. \]
Since $(x - \bar{x}_k)^2$ has constant sign on $[x_k, x_{k+1}]$, the weighted mean value
theorem for integrals implies that the integral on the right equals
\[ g_k(z_k) \int_{x_k}^{x_{k+1}} (x - \bar{x}_k)^2\,dx = g_k(z_k)\,\frac{h_n^3}{12} \]
for some point $z_k \in [x_k, x_{k+1}]$. Therefore,
\[ \int_{x_k}^{x_{k+1}} [f(x) - f(\bar{x}_k)]\,dx = g_k(z_k)\,\frac{h_n^3}{12}. \tag{5.20} \]
Now fix $x \in [x_k, \bar{x}_k) \cup (\bar{x}_k, x_{k+1}]$. By Taylor's theorem, there exists a point
$\xi_k \in [x_k, x_{k+1}]$ such that
\[ f(x) = f(\bar{x}_k) + f'(\bar{x}_k)(x - \bar{x}_k) + \frac{f''(\xi_k)}{2} (x - \bar{x}_k)^2. \]
Solving for $f''(\xi_k)$ we see that $f''(\xi_k) = 2g_k(x)$. Therefore, $|g_k(x)| \le \|f''\|_\infty / 2$
for all $x \in [x_k, x_{k+1}]$, hence from (5.20),
\[ -\frac{h_n^3}{24} \|f''\|_\infty \le \int_{x_k}^{x_{k+1}} [f(x) - f(\bar{x}_k)]\,dx \le \frac{h_n^3}{24} \|f''\|_\infty. \]
Summing, we obtain
\[ -\frac{n h_n^3}{24} \|f''\|_\infty \le \int_a^b f(x)\,dx - M_n \le \frac{n h_n^3}{24} \|f''\|_\infty, \]
which is the assertion of the theorem.
Note that the estimates in both the trapezoidal rule and the midpoint rule
are exact for all linear functions $f$, since then $f'' = 0$.
Simpson's Rule
Simpson's rule assumes $n = 2m$ in (5.16) and uses a parabola through each
triple of points
\[ (P_{k-1}, P_k, P_{k+1}), \quad k = 2j + 1,\ j = 0, \dots, m - 1, \quad P_k := (x_k, f(x_k)) = (x_k, y_k), \]
to approximate $f$.
FIGURE 5.9: Simpson's rule approximation.
To obtain the rule, observe that any polynomial $p(x)$ of
degree $\le 2$ may be written in the form
\[ p(x) = b_k (x - x_{k-1})(x - x_k) + c_k (x - x_{k-1}) + d_k, \tag{5.21} \]
where
\[ b_k = \frac{p(x_{k+1}) - 2p(x_k) + p(x_{k-1})}{2h_n^2}, \quad c_k = \frac{p(x_k) - p(x_{k-1})}{h_n}, \quad \text{and} \quad d_k = p(x_{k-1}). \]
It follows that the unique polynomial $p_k$ of degree $\le 2$ that passes through
the points $P_{k-1}$, $P_k$, and $P_{k+1}$ is obtained by choosing
\[ b_k = b_k(f) := \frac{f(x_{k+1}) - 2f(x_k) + f(x_{k-1})}{2h_n^2}, \quad c_k = c_k(f) := \frac{f(x_k) - f(x_{k-1})}{h_n}, \quad \text{and} \quad d_k = d_k(f) := f(x_{k-1}). \tag{5.22} \]
With this choice, one readily calculates
\[ S_{n,k} := \int_{x_{k-1}}^{x_{k+1}} p_k(x)\,dx = \frac{h_n}{3} [y_{k-1} + y_{k+1} + 4y_k], \quad k = 2j + 1,\ j = 0, \dots, m - 1, \]
which is taken as an approximation of $\int_{x_{k-1}}^{x_{k+1}} f$. Note that the approximation
is exact for all polynomials $f$ of degree $\le 2$, since such a polynomial may be
written in the form (5.21). Summing this result, we see that the integral of
the approximating function on $[a, b]$ is
\[ S_n := \frac{b - a}{3n} \big[ y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 2y_{n-2} + 4y_{n-1} + y_n \big]. \]
5.6.3 Simpson's Rule. If $f \in \mathcal{R}_a^b$, then $\lim_n S_n = \int_a^b f$. Moreover, if $f^{(4)}$
exists and is continuous on $[a, b]$, then the following error estimate holds:
\[ \left| \int_a^b f - S_n \right| \le \frac{h_n^4 (b - a) \|f^{(4)}\|_\infty}{180}. \]
Proof. Set
\[ R_n' := \big[ y_0 + y_2 + \cdots + y_{n-2} \big](2h_n) \quad \text{and} \quad R_n'' := \big[ y_1 + y_3 + \cdots + y_{n-1} \big](2h_n). \]
These are Riemann sums for $f$ on $[a, b]$ and
\[ 6S_n = 2R_n' + 4R_n'' + [f(b) - f(a)](2h_n). \]
It follows that $S_n \to \int_a^b f$.
To obtain the error estimate, let $f^{(4)}$ be continuous on $[a, b]$ and denote
the errors by
\[ E_{n,k} = \int_{x_{k-1}}^{x_{k+1}} f(x)\,dx - S_{n,k} \quad \text{and} \quad E_n = \sum_{j=0}^{m-1} E_{n,2j+1} = \int_a^b f(x)\,dx - S_n. \]
We show that there exists a point $\xi_k \in [x_{k-1}, x_{k+1}]$ such that
\[ E_{n,k} = -\frac{h_n^5 f^{(4)}(\xi_k)}{90}. \tag{5.23} \]
It will follow that
\[ |E_n| \le \frac{m h_n^5 \|f^{(4)}\|_\infty}{90} = \frac{h_n^4 (b - a) \|f^{(4)}\|_\infty}{180}, \]
proving the theorem.
To verify (5.23), fix $k$ and choose a point $x_k^* \in (x_{k-1}, x_k) \cup (x_k, x_{k+1})$.
For any function $g$, define a function $Lg$ on $[x_{k-1}, x_{k+1}]$ by
\[ (Lg)(x) = a_k(g)(x - x_{k-1})(x - x_k)(x - x_{k+1}) + b_k(g)(x - x_{k-1})(x - x_k) + c_k(g)(x - x_{k-1}) + d_k(g), \]
where $b_k(g)$, $c_k(g)$, and $d_k(g)$ are defined as in (5.22) and $a_k(g)$ is chosen so
that $(Lg)(x_k^*) = g(x_k^*)$. Then $Lg$ is the unique polynomial of degree $\le 3$ passing
through the four points
\[ (x_{k-1}, g(x_{k-1})), \quad (x_k, g(x_k)), \quad (x_{k+1}, g(x_{k+1})), \quad \text{and} \quad (x_k^*, g(x_k^*)). \]
Note that the coefficients in the definition of $L$ are linear functions of $g$, hence
$L$ itself is a linear function. Furthermore, $Lg = g$ for all polynomials $g$ of degree
$\le 3$. Since
\[ (Lf)(x) = a_k(f)(x - x_{k-1})(x - x_k)(x - x_{k+1}) + p_k(x) \]
and
\[ \int_{x_{k-1}}^{x_{k+1}} (x - x_{k-1})(x - x_k)(x - x_{k+1})\,dx = 0, \]
we see that
\[ \int_{x_{k-1}}^{x_{k+1}} Lf = \int_{x_{k-1}}^{x_{k+1}} p_k = S_{n,k}. \]
By Taylor's formula with integral remainder (Exercise 4.6.3), there exists a
polynomial $T_3(x)$ of degree $\le 3$ such that
\[ f(x) = T_3(x) + R_3(x), \quad \text{where } R_3(x) := \frac{1}{3!} \int_{x_{k-1}}^{x} (x - t)^3 f^{(4)}(t)\,dt. \]
The remainder may be written
\[ R_3(x) = \frac{1}{3!} \int_{x_{k-1}}^{x_{k+1}} q_t(x) f^{(4)}(t)\,dt, \quad \text{where } q_t(x) :=
\begin{cases}
(x - t)^3 & \text{if } t \le x,\\
0 & \text{if } t > x.
\end{cases} \]
Since
\[ Lf = LT_3 + LR_3 = T_3 + LR_3 = f - R_3 + LR_3, \]
we see that
\[ E_{n,k} = \int_{x_{k-1}}^{x_{k+1}} (f - Lf) = \int_{x_{k-1}}^{x_{k+1}} (R_3 - LR_3). \]
In the remaining calculations, for ease of notation we assume that
$[x_{k-1}, x_{k+1}] = [-h, h]$. By Fubini's theorem for continuous functions,
\[ \int_{-h}^{h} R_3(x)\,dx = \frac{1}{3!} \int_{-h}^{h} f^{(4)}(t) \int_{-h}^{h} q_t(x)\,dx\,dt = \frac{1}{4!} \int_{-h}^{h} f^{(4)}(t)(h - t)^4\,dt. \tag{5.24} \]
Also, because $L$ is linear,
\[ (LR_3)(x) = \frac{1}{3!} \int_{-h}^{h} f^{(4)}(t)(Lq_t)(x)\,dt.^{6} \]
Therefore, by Fubini's theorem,
\[ \int_{-h}^{h} (LR_3)(x)\,dx = \frac{1}{3!} \int_{-h}^{h} f^{(4)}(t) \int_{-h}^{h} (Lq_t)(x)\,dx\,dt. \tag{5.25} \]
6 This may be proved using the dominated convergence theorem. (See Exercise 11.3.??)
Now, by definition of $L$,
\[ (Lq_t)(x) = a_t (x + h)x(x - h) + b_t (x + h)x + c_t (x + h) + d_t, \]
where
\[ b_t = \frac{q_t(h) - 2q_t(0) + q_t(-h)}{2h^2}, \quad c_t = \frac{q_t(0) - q_t(-h)}{h}, \quad \text{and} \quad d_t = q_t(-h). \]
Since $q_t(-h) = 0$ and $q_t(h) = (h - t)^3$, $t \in [-h, h]$,
\[ \int_{-h}^{h} (Lq_t)(x)\,dx = \tfrac{2}{3} h^3 b_t + 2h^2 c_t = \tfrac{1}{3} h \big[ (h - t)^3 + 4q_t(0) \big]. \tag{5.26} \]
From (5.24), (5.25), and (5.26),
\[ \int_{-h}^{h} (f - Lf) = \int_{-h}^{h} (R_3 - LR_3) = \frac{1}{72} \int_{-h}^{h} f^{(4)}(t)\alpha(t)\,dt, \tag{5.27} \]
where
\[ \alpha(t) := 3(h - t)^4 - 4h \big[ (h - t)^3 + 4q_t(0) \big]. \]
Recalling the definition of $q_t(0)$, we see that
\[ \alpha(t) =
\begin{cases}
(t - h)^3 (3t + h) + 16ht^3 & \text{if } -h \le t \le 0,\\
(t - h)^3 (3t + h) & \text{if } 0 \le t \le h.
\end{cases} \tag{5.28} \]
Thus if $t \ge 0$,
\[ \alpha(-t) = (t + h)^3 (3t - h) - 16ht^3 \quad \text{and} \quad \alpha(t) = (t - h)^3 (3t + h). \tag{5.29} \]
The two polynomials in (5.29) have degree 4 and the same leading coefficient, so
their difference has degree $\le 3$. They are easily seen to be equal at the conveniently
chosen points $t = 0, \pm h, 2h$ and therefore must be identical. Thus $\alpha$ is an even
function of $t$ so (5.28) may be rewritten
\[ \alpha(t) =
\begin{cases}
(t + h)^3 (3t - h) & \text{if } -h \le t \le 0,\\
(t - h)^3 (3t + h) & \text{if } 0 \le t \le h.
\end{cases} \]
Taking derivatives, we see that $\alpha$ is decreasing on $[-h, 0]$ and increasing on
$[0, h]$. Since $\alpha(-h) = \alpha(h) = 0$, it follows that $\alpha \le 0$ on $[-h, h]$. By (5.27) and
the weighted mean value theorem for integrals, for some point $\xi \in [-h, h]$ we
have
\[ \int_{-h}^{h} (f - Lf) = \frac{f^{(4)}(\xi)}{72} \int_{-h}^{h} \alpha(t)\,dt = \frac{f^{(4)}(\xi)}{36} \int_{0}^{h} (t - h)^3 (3t + h)\,dt = -\frac{h^5 f^{(4)}(\xi)}{90}. \]
The same result holds for $E_{n,k}$, with the point $\xi$ depending on $k$. This completes
the proof of the theorem.
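Two features of Simpson's rule established above can be observed in a few lines: the error formula (5.23) involves $f^{(4)}$, so the rule is exact for cubics, and the error bound of 5.6.3 scales like $h_n^4$. The sketch below is illustrative; the function name `simpson`, the sample cubic, and the test integrand $1/x$ (with $\|f^{(4)}\|_\infty = 24$ on $[1, 2]$) are assumptions made for the example.

```python
import math

def simpson(f, a, b, n):
    """S_n = (h/3) * [y_0 + 4*y_1 + 2*y_2 + ... + 2*y_{n-2} + 4*y_{n-1} + y_n], n even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    h = (b - a) / n
    ys = [f(a + k * h) for k in range(n + 1)]
    odd = sum(ys[1:-1:2])    # y_1, y_3, ..., y_{n-1}
    even = sum(ys[2:-1:2])   # y_2, y_4, ..., y_{n-2}
    return (h / 3) * (ys[0] + 4 * odd + 2 * even + ys[-1])

# Exactness for a cubic: the fourth derivative vanishes, so the error is zero.
cubic = lambda x: x**3 - 2 * x + 1
print(simpson(cubic, 0.0, 2.0, 2))   # the exact integral is 2

# Error versus the bound h^4 (b - a) * sup|f''''| / 180 for f(x) = 1/x on [1, 2].
for n in (4, 8, 16):
    h = 1.0 / n
    error = math.log(2) - simpson(lambda x: 1 / x, 1.0, 2.0, n)
    print(n, error, h**4 * 1.0 * 24 / 180)
```

Doubling $n$ should reduce the error by a factor close to sixteen once $n$ is moderately large.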
Comparison of the Approximations
Table 5.3 below gives the errors $\int_1^2 x^{-1}\,dx - A_n$, rounded to 10 decimal
places, where $A_n$ is the approximation. The left point rule simply refers to
approximation by the Riemann sum $R_n$. The exact value of the integral, up to
10 decimal places, is $\ln 2 = .6931471805\ldots$
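The errors $\int_1^2 x^{-1}\,dx - A_n$ can be recomputed directly from the definitions of $R_n$, $T_n$, $M_n$, and $S_n$. The following is a sketch of such a computation; the function name `rule_errors` and the dictionary keys are choices made here for illustration.

```python
import math

def rule_errors(n, f=lambda x: 1 / x, a=1.0, b=2.0, exact=math.log(2)):
    """Return exact - A_n for the four rules on a uniform partition (n even)."""
    h = (b - a) / n
    y = [f(a + k * h) for k in range(n + 1)]        # values at partition points
    mid = [f(a + (k + 0.5) * h) for k in range(n)]  # values at midpoints
    approximations = {
        "left point": h * sum(y[:-1]),
        "trapezoidal": h / 2 * (y[0] + 2 * sum(y[1:-1]) + y[-1]),
        "midpoint": h * sum(mid),
        "simpson": h / 3 * (y[0] + 4 * sum(y[1:-1:2]) + 2 * sum(y[2:-1:2]) + y[-1]),
    }
    return {name: exact - value for name, value in approximations.items()}

for n in (4, 8):
    for name, error in rule_errors(n).items():
        print(f"n={n:<2} {name:<12} {error: .10f}")
```

Since $1/x$ is decreasing and convex on $[1, 2]$, the left point and trapezoidal sums overestimate the integral (negative errors) while the midpoint sum underestimates it (positive error).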
TABLE 5.3: A comparison of the methods.
Method              n = 4            n = 8
Left Point Rule     -.0663766290     -.0322246698
Trapezoidal Rule    -.0038766290     -.0009746698
Midpoint Rule        .0019272893      .0004866265
Simpson's Rule      -.0001067877     -.0000073501

5.7 Improper Integrals
In this section, the Riemann integral is extended in two ways: First, the
integrand is allowed to be unbounded and second, the integration interval can
be infinite.
5.7.1 Definition. A function $f$ is said to be locally integrable on an interval
$I$ if $f \in \mathcal{R}_c^d$ for every interval $[c, d]$ contained in $I$.
♦
For example, a continuous function is locally integrable on any interval.
5.7.2 Definition. Each expression in (a)–(c) below is called an improper
integral. The integral is said to converge if the limit exists in $\mathbb{R}$ and to diverge
otherwise. In the former case, $f$ is said to be improperly integrable on $I$.
(a) $\displaystyle \int_a^b f := \lim_{t \to b^-} \int_a^t f$, where $f$ is locally integrable on $[a, b)$.
(b) $\displaystyle \int_a^b f := \lim_{t \to a^+} \int_t^b f$, where $f$ is locally integrable on $(a, b]$.
(c) $\displaystyle \int_a^b f := \int_a^c f + \int_c^b f$, where $f$ is locally integrable on $(a, c) \cup (c, b)$.
♦
Note that the limits of integration in these definitions, where appropriate,
may be infinite.
It is easy to see that if $f$ is locally integrable on $(a, b]$, then $\int_a^b f$ converges
iff $\int_a^c f$ converges for some (every) $c \in (a, b)$. In this case,
\[ \int_a^b f = \int_a^c f + \int_c^b f. \]
The first integral on the right is improper while the second is a Riemann
integral. Moreover, if $f$ is also bounded and $a, b \in \mathbb{R}$, then, by Exercise 5.2.10,
the improper integral $\int_a^b f$ is simply the Riemann integral. Analogous remarks
apply to the other cases.
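5.7.3 Examples. (a) Let $p \in \mathbb{R}$. For $0 < s < t$,
\[ \int_s^t \frac{dx}{x^p} =
\begin{cases}
(1 - p)^{-1} \big( t^{1-p} - s^{1-p} \big) & \text{if } p \ne 1,\\
\ln t - \ln s & \text{if } p = 1.
\end{cases} \]
It follows that
\[ \int_1^\infty \frac{dx}{x^p} \ \text{ converges iff } p > 1 \quad \text{and} \quad \int_0^1 \frac{dx}{x^p} \ \text{ converges iff } p < 1. \]
(b) Let $r > 0$, $r \ne 1$. For $t > 1$,
\[ \int_1^t r^x\,dx = \frac{r^t - r}{\ln r}, \quad \text{hence} \quad \int_1^\infty r^x\,dx \ \text{ converges iff } r < 1. \]
(c) Since
\[ \int_s^t (1 + x^2)^{-1}\,dx = \tan^{-1} t - \tan^{-1} s, \]
\[ \int_0^\infty \frac{dx}{1 + x^2} = \int_{-\infty}^0 \frac{dx}{1 + x^2} = \frac{\pi}{2}, \quad \text{hence} \quad \int_{-\infty}^\infty \frac{dx}{1 + x^2} = \pi. \]
(d) $\displaystyle \int_{-1}^1 \frac{dx}{\sqrt{|x|}} = \int_{-1}^0 \frac{dx}{\sqrt{-x}} + \int_0^1 \frac{dx}{\sqrt{x}} = 2 \lim_{t \to 0^+} \int_t^1 \frac{dx}{\sqrt{x}} = 4.$
♦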
For ease of exposition, for the remainder of the section we consider only
integrals that are improper at the upper limit. Analogous discussions hold for
the other types of improper integrals.
The proof of the following theorem is left to the reader.
5.7.4 Theorem. Let $f$ and $g$ be locally integrable on $[a, b)$ and let $\alpha, \beta \in \mathbb{R}$.
If the improper integrals $\int_a^b f$ and $\int_a^b g$ converge, then so does $\int_a^b (\alpha f + \beta g)$,
and
\[ \int_a^b (\alpha f + \beta g) = \alpha \int_a^b f + \beta \int_a^b g. \]
In contrast to the Riemann integral, the product of improperly integrable
functions may not be improperly integrable. For example, $f(x) := 1/\sqrt{1 - x}$
is improperly integrable on the interval $[0, 1)$ but $f^2$ is not. The following
example illustrates the same phenomenon, but on an unbounded interval. It
is the first of several examples in this section that use the fact, proved in
Chapter 6, that a series of the form $\sum_{n=1}^\infty 1/n^p$ converges iff $p > 1$.
5.7.5 Example. Define $f$ on $[1, +\infty)$ by $f(x) = n$ if $n \le x < n + 1/n^{5/2}$,
$n = 1, 2, \dots$, and $f(x) = 0$ otherwise. Then
\[ \int_1^{n+1} f = \sum_{k=1}^n \frac{1}{k^{3/2}} \quad \text{and} \quad \int_1^{n+1} f^2 = \sum_{k=1}^n \frac{1}{k^{1/2}}, \]
hence $\int_1^\infty f$ converges, whereas $\int_1^\infty f^2$ diverges.
♦
We now have examples, on both bounded and unbounded intervals, of
nonnegative improperly integrable functions whose squares are not improperly
integrable. Conversely, there exist locally integrable nonnegative functions on
unbounded intervals, for example, $f(x) = 1/x$ on $[1, +\infty)$, such that $f^2$ is
improperly integrable but $f$ is not. However, for bounded intervals this is not
possible: If $f^2$ is improperly integrable on a bounded interval, then so is $|f|$.
(See Exercise 25.)
The remainder of this section describes various convergence tests for improper integrals. Many of these are analogs of convergence tests for infinite
series, discussed in Chapter 6.
5.7.6 Comparison Test for Integrals. Let $f$ and $g$ be locally integrable on
$[a, b)$ such that $0 \le f \le g$. If $\int_a^b g$ converges, then so does $\int_a^b f$.
Proof. Let $F(x) = \int_a^x f$ and $G(x) = \int_a^x g$, $a \le x < b$. Since $f$ and $g$ are
nonnegative, $F$ and $G$ are increasing, hence, by the monotone function theorem
(3.1.17),
\[ \int_a^b f = \lim_{x \to b^-} F(x) \quad \text{and} \quad \int_a^b g = \lim_{x \to b^-} G(x) \]
exist in $\overline{\mathbb{R}}$. Since $F \le G$, the conclusion follows.
5.7.7 Example. Let $f(x) = \dfrac{1 + \sin x}{\sqrt{x}\,(x + 1)^2}$, $x > 0$. By definition,
\[
\int_0^{\infty} f = \int_0^1 f + \int_1^{\infty} f,
\]
provided the integrals on the right converge. That this is indeed the case
follows from 5.7.3(a), 5.7.6, and the inequalities $f(x) \le 2/\sqrt{x}$ on $(0, 1]$ and
$f(x) \le 2/(x + 1)^2$ on $[1, +\infty)$.
♦
5.7.8 Example. Define the gamma function $\Gamma$ by
\[
\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt, \quad x > 0.
\]
To see that the integral converges for all $x > 0$, note that $t^{x-1} e^{-t} \le t^{x-1}$
for $t \in (0, 1]$, hence $\int_0^1 t^{x-1} e^{-t}\,dt$ converges by comparison with $\int_0^1 t^{x-1}\,dt$
(see 5.7.3(a)). Furthermore, by l’Hospital’s rule applied sufficiently many times,
\[
\lim_{t \to +\infty} t^{x+1} e^{-t} = 0,
\]
so there exists $t_0 > 1$ such that $t^{x+1} e^{-t} \le 1$, or $t^{x-1} e^{-t} \le t^{-2}$, for all $t \ge t_0$.
Therefore, $\int_1^{\infty} t^{x-1} e^{-t}\,dt$ converges by comparison with $\int_1^{\infty} t^{-2}\,dt$.
The gamma function has the following recursive property:
\[
\Gamma(x + 1) = x\,\Gamma(x).
\]
To see this, integrate $\Gamma(x + 1)$ by parts to obtain
\[
\int_a^b t^{x} e^{-t}\,dt = t^{x} e^{-t}\Big|_{t=b}^{t=a} + x \int_a^b t^{x-1} e^{-t}\,dt,
\]
and then let $a \to 0$ and $b \to +\infty$. In particular, for $n \in \mathbb{N}$,
\[
\Gamma(n + 1) = n\,\Gamma(n) = n(n - 1)\,\Gamma(n - 1) = \cdots = n(n - 1) \cdots 1 \cdot \Gamma(1).
\]
Since
\[
\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = 1,
\]
we see that $\Gamma(n + 1) = n!$. Thus $\Gamma(x)$ is a continuous (indeed, differentiable)
extension of the factorial function on $\mathbb{N}$.
♦
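The identities $\Gamma(x + 1) = x\Gamma(x)$ and $\Gamma(n + 1) = n!$ can be spot-checked numerically. The sketch below relies on Python's standard-library `math.gamma`; the function name `recursion_gap` is a hypothetical helper introduced here for illustration.

```python
import math

def recursion_gap(x):
    """Return |Gamma(x + 1) - x * Gamma(x)|, which should be near zero."""
    return abs(math.gamma(x + 1) - x * math.gamma(x))

# Gamma(n + 1) agrees with n! at small integers (up to rounding).
for n in range(1, 8):
    print(n, math.gamma(n + 1), math.factorial(n))

# The recursion also holds at non-integer arguments.
print(recursion_gap(0.5), recursion_gap(2.5))
```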
5.7.9 Limit Comparison Test for Integrals. Let $f$ and $g$ be locally integrable on $[a, b)$ with $f \ge 0$ and $g > 0$. If $L := \lim_{x \to b^-} f(x)/g(x)$ exists and
$0 < L < +\infty$, then $\int_a^b g$ converges iff $\int_a^b f$ converges.
Proof. Since $f, g \ge 0$, $\int_a^b f$ and $\int_a^b g$ exist in the extended real number system $\overline{\mathbb{R}}$. Choose $c \in (a, b)$ such that
$L/2 < f(x)/g(x) < 2L$ for all $x \in [c, b)$. For such $x$, $g(x) < 2f(x)/L$ and
$f(x) < 2L\,g(x)$. The assertion then follows from the inequalities
\[
\int_c^b g \le \frac{2}{L} \int_c^b f \quad\text{and}\quad \int_c^b f \le 2L \int_c^b g.
\]
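5.7.10 Example. Let $f(x) = \dfrac{\sqrt{2x^5 - x^2 + 1}}{x^4 + 3x + 5}$, $x \ge 1$. For $g(x) = x^{-3/2}$,
\[
\lim_{x \to +\infty} \frac{f(x)}{g(x)} = \lim_{x \to +\infty} \frac{\sqrt{2x^8 - x^5 + x^3}}{x^4 + 3x + 5} = \sqrt{2}.
\]
Since $\int_1^{\infty} g$ converges, so does $\int_1^{\infty} f$.
♦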
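As a numerical sanity check of the limit in Example 5.7.10, one can evaluate the ratio $f(x)/g(x)$ at increasingly large $x$. This is a sketch only; the function names `f`, `g`, and `ratio` simply mirror the notation of the example.

```python
import math

def f(x):
    return math.sqrt(2 * x**5 - x**2 + 1) / (x**4 + 3 * x + 5)

def g(x):
    return x ** -1.5

def ratio(x):
    return f(x) / g(x)

# The ratio should approach sqrt(2) as x grows.
for x in (10.0, 1e3, 1e6):
    print(x, ratio(x), math.sqrt(2))
```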
5.7.11 Root Test for Integrals. Let $f$ be locally integrable and nonnegative
on $[a, b)$, where $b > 0$, and suppose that $L := \lim_{x \to b^-} [f(x)]^{1/x}$ exists in $\mathbb{R}$.
Then $\int_a^b f$ converges if $L < 1$ and diverges if $L > 1$.
Proof. Suppose $L < 1$. Choose $r \in (L, 1)$ and $x_0 \in (a, b) \cap (0, b)$ such that
$[f(x)]^{1/x} < r$ for all $x \ge x_0$. For such $x$, $f(x) < r^x$, hence, by the comparison
test and 5.7.3(b), $\int_{x_0}^b f$ converges. Therefore, $\int_a^b f$ converges. A similar
argument shows that $\int_a^b f$ diverges if $L > 1$.
5.7.12 Example. For $p \in \mathbb{R}$ and $x \ge 1$, let
\[
f(x) = \left( \frac{2x + \cos x}{3x + \sin x} \right)^{px}.
\]
Since $\lim_{x \to +\infty} [f(x)]^{1/x} = (2/3)^p$, $\int_1^{+\infty} f$ converges iff $p > 0$.
♦
There are examples of convergent integrals and divergent integrals with
$L = 1$, so the root test is inconclusive in this case (see Exercise 3).
5.7.13 Definition. Let $f$ be locally integrable on $[a, b)$. The improper integral
$\int_a^b f$ is said to converge absolutely if $\int_a^b |f|$ converges. In this case $f$ is said to be
improperly absolutely integrable on $[a, b)$. If $\int_a^b f$ converges but not absolutely,
then the integral is said to converge conditionally.
♦
5.7.14 Proposition. If $f$ is improperly absolutely integrable on $[a, b)$, then
$\int_a^b f$ converges and
\[
\left| \int_a^b f \right| \le \int_a^b |f|.
\]
Proof. Set $g(x) := |f(x)| + f(x)$, so $0 \le g \le 2|f|$ on $[a, b)$. By the comparison
test, $\int_a^b g$ converges. Since $f = g - |f|$ is the difference of two improperly
integrable functions, $f$ is improperly integrable. The inequality follows on
letting $t \to b^-$ in the inequality $\left| \int_a^t f \right| \le \int_a^t |f|$.
5.7.15 Example. For $p > 0$, define
\[
f(x) = \frac{(-1)^{n+1}}{n^p}, \quad n \le x < n + 1, \quad n = 1, 2, \ldots.
\]
Then
\[
\int_1^{n+1} |f| = \sum_{k=1}^{n} \frac{1}{k^p}
\quad\text{and}\quad
\int_1^{n+1} f = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k^p}.
\]
The first sum has a finite limit iff $p > 1$, while the second sum has a finite
limit iff $p > 0$ (see Chapter 6). Therefore, $\int_1^{\infty} f$ converges absolutely iff $p > 1$
and conditionally iff $0 < p \le 1$.
♦
The following theorem is useful in establishing conditional convergence of
improper integrals.
5.7.16 Dirichlet’s Test for Integrals. Let $f$ be continuous and $g'$ improperly
absolutely integrable on $[a, b)$. If the function $F(t) := \int_a^t f$ is bounded on $[a, b)$
and $\lim_{x \to b^-} g(x) = 0$, then $\int_a^b f g$ converges.
Proof. Let $M$ be a bound for $|F|$ on $[a, b)$. Then $|F g'| \le M |g'|$, hence, by the
comparison test, $F g'$ is absolutely integrable on $[a, b)$. Integrating by parts
yields
\[
\int_a^t f g = F(t) g(t) - \int_a^t F g'.
\]
Since $\int_a^b F g'$ converges and $\lim_{t \to b^-} F(t) g(t) = 0$, $\int_a^b f g$ converges.
5.7.17 Corollary. Let $f$ be continuous and $g'$ locally integrable on $[a, b)$ with
$\lim_{x \to b^-} g(x) = 0$. If the function $F(t) := \int_a^t f$ is bounded on $[a, b)$ and if $g'$
has constant sign, then $\int_a^b f g$ converges.
Proof. By the fundamental theorem of calculus, $\int_a^t g' = g(t) - g(a)$, hence $g'$
is absolutely integrable on $[a, b)$ and Dirichlet’s test applies.
5.7.18 Example. Let $h(x) = x^{-p} \sin x$, $x \ge 1$, where $p > 0$. Taking $f(x) = \sin x$
and $g(x) = x^{-p}$ in 5.7.17 shows that $\int_1^{\infty} h$ converges. Since $|h(x)| \le 1/x^p$,
$h$ is improperly absolutely integrable on $[1, +\infty)$ if $p > 1$. If $0 < p \le 1$, the
sums on the right in the inequality
\[
\int_{\pi}^{n\pi} |h| = \sum_{k=2}^{n} \int_{(k-1)\pi}^{k\pi} |h|
> \sum_{k=2}^{n} (k\pi)^{-p} \int_{(k-1)\pi}^{k\pi} |\sin x|\,dx
= 2\pi^{-p} \sum_{k=2}^{n} k^{-p}
\]
are unbounded (see Example 6.2.5), hence $h$ is not improperly absolutely
integrable in this case.
♦
5.7.19 Cauchy–Schwarz Inequality for Improper Integrals. Let $f$ and
$g$ be continuous with $f^2$ and $g^2$ improperly integrable on $[a, b)$. Then $f g$ is
improperly absolutely integrable on $[a, b)$ and
\[
\left( \int_a^b |f g| \right)^2 \le \int_a^b f^2 \cdot \int_a^b g^2.
\]
Proof. By Exercise 5.2.13, for all $t \in [a, b)$
\[
\left( \int_a^t |f g| \right)^2 \le \int_a^t f^2 \cdot \int_a^t g^2.
\]
Now let $t \to b$.
Exercises
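1.S Determine all values of $p, q > 0$ for which $\displaystyle\int_0^1 \frac{dx}{(\sin x)^p (1 - x)^q}$ converges.
2. Let $p > 0$. Show that $\displaystyle\int_1^{\infty} x^{-px}\,dx$ converges and $\displaystyle\int_1^{\infty} x^{-p/x}\,dx$ diverges.
Show that the same behavior holds for $\displaystyle\int_0^1 x^{-px}\,dx$ and $\displaystyle\int_0^1 x^{-p/x}\,dx$.
3. Find examples for which $\lim_{x \to +\infty} [f(x)]^{1/x} = 1$ and
(a) $\displaystyle\int_1^{\infty} f$ converges.
(b) $\displaystyle\int_1^{\infty} f$ diverges.
4. Let $f$ and $g$ be positive and continuous on $[1, +\infty)$. Prove that if
$L := \displaystyle\lim_{x \to +\infty} \frac{f(x)}{g(x)}$ exists in $\mathbb{R}$, then $\displaystyle\lim_{x \to +\infty} \frac{\int_x^{\infty} f}{\int_x^{\infty} g} = L$.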
5.S Determine if the integrals converge or diverge:
Z 1
Z 1
Z 1√
sin x
sin x
sin x − x
dx.
(b)
dx.
(c)
dx.
(a)
3
2
x
x
x
0
0
0
Z ∞
Z ∞
Z ∞
(ln x)(sin x)
(sin x)(cos x−1 )
1
(d)
dx. (e)
dx. (f)
dx.
sin2
ln x
x
x
2
2
1
π/2
Z
6. Prove that
cos(secp x) dx converges for all p > 0.
0
1
7. Show that
Z
8.S Show that
Z
√
0
0
1
xp
dx converges iff p > −1.
1 − x2
sinp x
dx converges iff p < 1 + q.
xq
9. Find all values of p for which the integral converges:
Z ∞
Z 1
Z
(a) S
xp e−x dx.
(b)
xp e−x dx.
(c) S
1
(d)
(g)
(j)
Z
1
xp sin xp dx. (e)
0
Z π/2
0
Z π/2
0
0
1
Z
xp ln x dx.
(f)
0
sinp x dx.
(h) S
Z
π/2
x sinp x dx.
(i)
0
tanp x dx.
(k) S
Z
Z
1
sin xp dx.
0
∞
xp ln x dx.
1
Z π/2
(1 − sin x)p dx.
0
π/2
xp cos x dx. (l)
0
10. Find all values of p > 0 for which
Z
π/2
xp sin x dx.
0
Z
0
1
x−p sin ex dx converges absolutely.
11.S Prove that $\displaystyle\int_1^{\infty} \frac{x \sin x}{1 + x^2}\,dx$ converges conditionally.
12. Prove that $\displaystyle\int_1^{\infty} x^p \sin e^x\,dx$ converges for all $p$. For what values of $p$ does
the integral converge conditionally? (See 5.7.18.)
13.S Find all values of p, q > 0 for which the integral converges:
Z 1
Z 1
Z ∞
xp
dx
dx
√ p
.
(b)
dx.
(c)
.
(a)
p )q
2p )q
(1
−
x
(1
−
x
x
+ xq
0
1
0
Z 1
Z π/2
Z π/2
sinp x
1
dx
√ p
(d)
.
(e)
dx.
(f)
p q dx.
qx
q
cos
sin
x
x
+
x
0
0
0
14. Prove by induction that $\displaystyle\int_0^{\infty} x^n e^{-x}\,dx = n!$.
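15.S Given that $\displaystyle\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}$ (to be established in 11.5.3), show
that
\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^{2n} e^{-x^2/2}\,dx = (2n - 1)(2n - 3) \cdots 3 \cdot 1 = \frac{(2n)!}{n!\,2^n}.
\]
16. Given that $\displaystyle\int_0^{\infty} e^{-s^2}\,ds = \sqrt{\pi}/2$, show that
\[
\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}, \quad
\Gamma\!\left(\tfrac{3}{2}\right) = \frac{\sqrt{\pi}}{2}, \quad\text{and}\quad
\Gamma\!\left(\tfrac{5}{2}\right) = \frac{3\sqrt{\pi}}{4}.
\]
17. The formula $\Gamma(x) = x^{-1} \Gamma(x + 1)$ may be used to extend the gamma function
to non-integer values $x < 0$. Use this to find $\Gamma\!\left(-\tfrac{1}{2}\right)$ and $\Gamma\!\left(-\tfrac{3}{2}\right)$.
18. Prove that if $f$ is absolutely integrable on $[1, \infty)$, then
\[
\lim_{n \to \infty} \int_1^{\infty} f(x^n)\,dx = 0.
\]
19. (Log test for integrals). Let $f$ be locally integrable and positive on $[0, +\infty)$
such that
\[
L := \lim_{x \to \infty} \frac{-\ln f(x)}{\ln x}
\]
exists in $\mathbb{R}$. Prove that $\displaystyle\int_0^{\infty} f$ converges if $L > 1$ and diverges if $L < 1$.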
20. Use Exercise 19 to determine the convergence behavior of
Z ∞
Z ∞
− ln x
−√x
(a)
ln x
dx.
(b)
ln x
dx.
S
1
What does the root test reveal?
1
21. Prove that $L(a) := \displaystyle\lim_{t \to +\infty} \int_{1/t}^{t} \frac{\sin ax}{x}\,dx$ converges for all $a \in \mathbb{R}$ and that
$L(a) = L(1)$ for all $a > 0$.
22. Let $f$ be differentiable and nonzero on $[1, +\infty)$. If $\displaystyle\lim_{x \to +\infty} x f'(x)/f(x)$
exists in $\mathbb{R}$ and is less than $-1$, prove that $\int_1^{\infty} f$ converges.
23. Prove that if $\int_0^{\infty} f(x)\,dx$ converges, then $\lim_n \int_0^1 f(nx)\,dx = 0$.
24.S Prove that if $f \ge 0$ and $\int_1^{\infty} f$ converges, then $\displaystyle\int_1^{\infty} \frac{\sqrt{f(x)}}{x}\,dx$ converges.
25. Prove that if $[a, b)$ is finite and $f^2$ is improperly integrable on $[a, b)$, then
$|f|$ is improperly integrable on $[a, b)$.
26.S Let $f$ be continuous and $g$ locally integrable and positive on $[a, b)$.
Suppose that the function $G(x) := \int_a^x g$ is bounded on $[a, b)$ and that
$\lim_{x \to b^-} f(x) = 0$. Prove that $\int_a^b f g$ converges.
27. Let $f$ be continuous on $[a, b)$ such that $\int_a^b f$ converges. If $g'$ is locally
integrable and has constant sign on $[a, b)$, prove that $\int_a^b f g$ converges.
28.S Let $f$ be improperly integrable on $(-\infty, +\infty)$ and $c \in \mathbb{R}$. Prove that
\[
\int_{-\infty}^{+\infty} f(x + c)\,dx = \int_{-\infty}^{+\infty} f(x)\,dx.
\]
5.8
A Deeper Look at Riemann Integrability
In this section we characterize Riemann integrability of a function in terms
of the size of its set of discontinuities.
5.8.1 Definition. A set $A$ of real numbers is said to have (Lebesgue) measure
zero if for each $\varepsilon > 0$ there exists a finite or infinite sequence of intervals $I_n$
with total length $\sum_n |I_n| < \varepsilon$ such that the sequence covers $A$, that is, every
member of $A$ is contained in some $I_n$.
♦
Any countable set has measure zero. Indeed, if $A = \{a_1, a_2, \ldots\}$ and $\varepsilon > 0$,
then the intervals $I_n = (a_n - \varepsilon/2^{n+2},\, a_n + \varepsilon/2^{n+2})$ cover $A$ and have
total length $\sum_n \varepsilon/2^{n+1} = \varepsilon/2 < \varepsilon$. In particular, the set of rational numbers has measure zero.
An uncountable set of measure zero is constructed in Example 10.3.4.
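The total length of the covering intervals for a countable set can be computed exactly. The sketch below uses exact rational arithmetic; the helper name `cover_length` is introduced here only for illustration. It sums the lengths $2\varepsilon/2^{n+2}$ of the first `n_terms` intervals and shows that the total stays below $\varepsilon/2$.

```python
from fractions import Fraction

def cover_length(eps, n_terms):
    """Total length of I_1, ..., I_N, where |I_n| = 2 * eps / 2**(n + 2)."""
    return sum(2 * eps / 2 ** (n + 2) for n in range(1, n_terms + 1))

eps = Fraction(1, 100)
for n_terms in (1, 5, 50):
    total = cover_length(eps, n_terms)
    print(n_terms, total, total < eps / 2)
```

Each partial total equals $(\varepsilon/2)(1 - 2^{-N})$, so the full series has sum $\varepsilon/2$.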
The following result will be proved in Chapter 11.
5.8.2 Theorem. Let $f$ be bounded on $[a, b]$. Then $f \in \mathcal{R}_a^b$ iff its set of
discontinuities has measure zero.
Examples 5.1.11 and 5.1.12 are relevant here: The function in the first
example, shown to be integrable, has a countable set of discontinuities. The
function in the second example, shown not to be integrable, has [0, 1] as its set
of discontinuities, certainly not a set of measure zero.
Theorem 5.8.2 allows simple proofs of many of the properties discussed in
this chapter. For example, if f and g are integrable with sets of discontinuity
A and B, respectively, then f + g and f g have sets of discontinuity contained
in A ∪ B, a set of measure zero (Exercise 2), and hence are integrable.
Exercises
1. Show that if B has measure zero and A ⊆ B, then A has measure zero.
2.S Prove: If An has measure zero for every n ∈ N, then so does A1 ∪A2 ∪· · · .
3. Let A have measure zero. Prove that A + Q has measure zero.
4. Let f : [a, b] → [c, d] be integrable and g : [c, d] → R continuous. Prove
that g ◦ f is integrable.
5. A set A of real numbers has (Jordan) content zero if for each ε > 0 there
exist finitely many intervals of total length < ε that cover A. Show that
(a) a convergent sequence has content zero.
(b) [0, 1] ∩ Q does not have content zero.
6.S Prove that the function f in Exercise 3.3.10 is integrable on [a, b] and
find its integral.
*5.9
Functions of Bounded Variation
5.9.1 Definition. Let $\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_n = b\}$ be a partition of
$[a, b]$. For $f : [a, b] \to \mathbb{R}$ define
\[
V_{\mathcal{P}}(f) = \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})|.
\]
The total variation of $f$ on $[a, b]$ is the extended real number
\[
V_a^b(f) := \sup_{\mathcal{P}} V_{\mathcal{P}}(f).
\]
The function $f$ is said to have bounded variation on $[a, b]$ if $V_a^b(f) < +\infty$. The
set of all functions with bounded variation on $[a, b]$ is denoted by $\mathcal{BV}_a^b$.
♦
5.9.2 Proposition. Let $f : [a, b] \to \mathbb{R}$.
(a) If $f \in \mathcal{BV}_a^b$, then $f$ is bounded.
(b) If $f$ has a bounded derivative on $[a, b]$, then $f \in \mathcal{BV}_a^b$.
(c) If $f$ is monotone on $[a, b]$, then $V_a^b(f) = |f(b) - f(a)|$.
(d) If $g \in \mathcal{R}_a^b$ and $f(x) = \int_a^x g(t)\,dt$, then $V_a^b(f) \le (b - a) \sup_{[a,b]} |g|$.
(e) If $\mathcal{P}$ is a partition of $[a, b]$ and $\mathcal{Q}$ is a refinement of $\mathcal{P}$, then $V_{\mathcal{P}}(f) \le V_{\mathcal{Q}}(f)$.
(f) If $f, g \in \mathcal{BV}_a^b$ and $c \in \mathbb{R}$, then $f + g,\ cf,\ fg \in \mathcal{BV}_a^b$.
Proof. (a) Let $a < x < b$ and $\mathcal{P} = \{a, x, b\}$. Then
\[
\begin{aligned}
2|f(x)| &\le |f(x) - f(a)| + |f(x) - f(b)| + |f(a)| + |f(b)| \\
&= V_{\mathcal{P}}(f) + |f(a)| + |f(b)| \\
&\le V_a^b(f) + |f(a)| + |f(b)|.
\end{aligned}
\]
(b) Let $|f'| \le C$ on $[a, b]$. By the mean value theorem, given a partition $\mathcal{P}$,
there exists for each $j$ a point $t_j \in (x_{j-1}, x_j)$ such that
\[
V_{\mathcal{P}}(f) = \sum_{\mathcal{P}} |f(x_j) - f(x_{j-1})| = \sum_{\mathcal{P}} |f'(t_j)|(x_j - x_{j-1}) \le C(b - a).
\]
Therefore, $V_a^b(f) \le C(b - a)$.
(c) If $f$ is increasing, then
\[
\sum_{\mathcal{P}} |f(x_j) - f(x_{j-1})| = \sum_{\mathcal{P}} \big( f(x_j) - f(x_{j-1}) \big) = f(b) - f(a).
\]
The case of decreasing $f$ is similar.
(d) Let $M := \sup_{a \le t \le b} |g(t)|$. Then, for any partition $\mathcal{P}$,
\[
V_{\mathcal{P}}(f) \le \sum_{\mathcal{P}} \int_{x_{j-1}}^{x_j} |g(t)|\,dt \le M \sum_{\mathcal{P}} (x_j - x_{j-1}) = M(b - a).
\]
(e) Let $\mathcal{P} = \{a = x_0 < x_1 < \cdots < x_n = b\}$ and $\mathcal{P}' = \mathcal{P} \cup \{c\}$, where
$c \in [x_{i-1}, x_i]$. Then
\[
\begin{aligned}
V_{\mathcal{P}}(f) &= \sum_{j \ne i} |f(x_j) - f(x_{j-1})| + |f(x_i) - f(x_{i-1})| \\
&\le \sum_{j \ne i} |f(x_j) - f(x_{j-1})| + |f(x_i) - f(c)| + |f(c) - f(x_{i-1})| \\
&= V_{\mathcal{P}'}(f).
\end{aligned}
\]
Adding points successively yields (e).
(f) Let $|f|, |g| \le M$ on $[a, b]$. The inequality
\[
|(fg)(x_j) - (fg)(x_{j-1})| \le M |g(x_j) - g(x_{j-1})| + M |f(x_j) - f(x_{j-1})|
\]
shows that $fg \in \mathcal{BV}_a^b$. The proofs of the remaining parts of (f) are similar.
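5.9.3 Example. For $\alpha > 0$, define a continuous function $f_\alpha$ on $[0, 1]$ by
\[
f_\alpha(x) :=
\begin{cases}
x^\alpha \sin(1/x) & \text{if } 0 < x \le 1, \\
0 & \text{if } x = 0.
\end{cases}
\]
We show that if $\alpha \le 1$, then $f_\alpha$ does not have bounded variation on $[0, 1]$. Set
\[
a_k := \frac{1}{2k\pi + \pi/2} = \frac{2}{(4k + 1)\pi} \quad\text{and}\quad b_k := \frac{1}{2k\pi}
\]
and note that
\[
f_\alpha(b_k) = 0 \quad\text{and}\quad f_\alpha(a_k) = a_k^\alpha = \frac{c}{(4k + 1)^\alpha}, \quad\text{where } c := \frac{2^\alpha}{\pi^\alpha}.
\]
Since $b_{k+1} < a_k < b_k$, for sufficiently small $\varepsilon > 0$ we may form the partition
\[
\mathcal{P}_\varepsilon = \{\varepsilon < a_p < b_p < a_{p-1} < \cdots < a_k < b_k < \cdots < b_{q+1} < a_q < b_q < 1\}
\]
of $[\varepsilon, 1]$, where $p$ and $q$ are, respectively, the largest and smallest integers
satisfying $\varepsilon < a_p < b_q < 1$, equivalently,
\[
\frac{1}{2\pi} < q < p < \frac{2 - \pi\varepsilon}{4\pi\varepsilon}.
\]
From $f_\alpha(a_k) - f_\alpha(b_k) = \dfrac{c}{(4k + 1)^\alpha}$,
\[
V_0^1(f_\alpha) \ge V_\varepsilon^1(f_\alpha) \ge c \sum_{k=q}^{p} \frac{1}{(4k + 1)^\alpha}.
\]
Since $\varepsilon$ may be chosen arbitrarily small, the upper limit $p$ of the sum on the
right may be made arbitrarily large. Since the series $\sum_{k=1}^{\infty} (4k + 1)^{-\alpha}$ diverges,
$V_0^1(f_\alpha) = +\infty$.
♦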
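The growth of the partition sums in Example 5.9.3 can be observed numerically. The following sketch evaluates $V_{\mathcal{P}}(f_\alpha)$ on a partition built from the points $a_k$ and $b_k$; as a simplifying assumption it uses the partition $\{0 < a_p < b_p < \cdots < a_1 < b_1 < 1\}$ of $[0, 1]$ rather than the partition of $[\varepsilon, 1]$ in the text, and the function names are hypothetical.

```python
import math

def f_alpha(x, alpha):
    """x**alpha * sin(1/x) for x > 0, and 0 at x = 0."""
    return 0.0 if x == 0 else x ** alpha * math.sin(1.0 / x)

def variation_on_partition(alpha, p):
    """V_P(f_alpha) for P = {0 < a_p < b_p < ... < a_1 < b_1 < 1}."""
    points = [0.0, 1.0]
    for k in range(1, p + 1):
        points.append(1.0 / (2 * k * math.pi + math.pi / 2))  # a_k
        points.append(1.0 / (2 * k * math.pi))                # b_k
    points.sort()
    values = [f_alpha(x, alpha) for x in points]
    return sum(abs(v - u) for u, v in zip(values, values[1:]))

for p in (20, 200, 2000):
    print(p, variation_on_partition(0.5, p), variation_on_partition(2.0, p))
```

For $\alpha = 1/2$ the sums keep growing with $p$, while for $\alpha = 2$ they remain bounded, consistent with the dichotomy $\alpha \le 1$ versus $\alpha > 1$.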
5.9.4 Theorem. If $f' \in \mathcal{R}_a^b$, then $f \in \mathcal{BV}_a^b$ and
\[
V_a^b(f) = \int_a^b |f'(x)|\,dx. \tag{5.30}
\]
Proof. Let $\mathcal{P}$ be a partition of $[a, b]$. By the mean value theorem, there exists
$\xi_j \in [x_{j-1}, x_j]$ such that
\[
V_{\mathcal{P}}(f) = \sum_{\mathcal{P}} |f'(\xi_j)|\,|x_j - x_{j-1}| = S(|f'|, \mathcal{P}, \xi).
\]
By 5.1.18,
\[
\int_a^b |f'| = \lim_{\mathcal{P}} S(|f'|, \mathcal{P}, \xi) = \lim_{\mathcal{P}} V_{\mathcal{P}}(f). \tag{5.31}
\]
On the other hand, given $r < V_a^b(f)$, we may choose a partition $\mathcal{P}_r$ of $[a, b]$ such
that $r < V_{\mathcal{P}_r}(f) \le V_a^b(f)$. By 5.9.2(e), $r < V_{\mathcal{P}}(f) \le V_a^b(f)$ for all refinements
$\mathcal{P}$ of $\mathcal{P}_r$. Thus, by (5.31), $r \le \int_a^b |f'| \le V_a^b(f)$. Since $r$ was arbitrary, (5.30)
follows.
5.9.5 Corollary. If $f$ is continuous at $a$ and $f'$ is locally integrable on $(a, b]$,
then (5.30) holds, where the integral is improper. Thus $f \in \mathcal{BV}_a^b$ iff $\int_a^b |f'|$
converges.
Proof. By the theorem and the definition of improper integral, it suffices to
show that
\[
V_a^b(f) = \lim_{t \to a^+} V_t^b(f) = \sup_{a < t \le b} V_t^b(f).
\]
Clearly, we may assume that $V_a^b(f) > 0$. Let $0 < s < r < V_a^b(f)$ and choose a
partition $\mathcal{P}_r = \{x_0 = a < x_1 < \cdots < x_n = b\}$ such that $r < V_{\mathcal{P}_r}(f) \le V_a^b(f)$.
Next, choose $t \in (a, x_1)$ so that $|f(t) - f(a)| < r - s$. For such $t$, let $\mathcal{P}_t = \mathcal{P}_r \cup \{t\}$.
Then
\[
\begin{aligned}
V_t^b(f) &\ge |f(x_1) - f(t)| + \sum_{j=1}^{n-1} |f(x_{j+1}) - f(x_j)| \\
&= V_{\mathcal{P}_t}(f) - |f(t) - f(a)| \\
&> V_{\mathcal{P}_r}(f) - (r - s) > s.
\end{aligned}
\]
It follows that $\lim_{t \to a^+} V_t^b(f) \ge s$. Since $s$ was arbitrary, the assertion follows.
5.9.6 Example. We use 5.9.5 to show that the function $f_\alpha$ in 5.9.3 has
bounded variation on $[0, 1]$ if $\alpha > 1$. We have
\[
|f_\alpha'(x)| = |\alpha x^{\alpha-1} \sin(1/x) - x^{\alpha-2} \cos(1/x)| \le \alpha x^{\alpha-1} + x^{\alpha-2}.
\]
If $\alpha > 1$, the integral $\int_0^1 x^{\alpha-2}\,dx$ converges, hence $\int_0^1 |f_\alpha'|$ converges.
♦
5.9.7 Theorem. If $f \in \mathcal{BV}_a^b$, then there exist monotone increasing functions
$g$ and $h$ on $[a, b]$ such that $f = g - h$.
Proof. For $x \in [a, b]$, define $g(x) := V_a^x(f)$ and $h(x) := g(x) - f(x)$. Clearly,
$g$ is increasing. To see that $h$ is increasing, let $x < y$, let $\mathcal{P}_x$ be an arbitrary
partition of $[a, x]$, and let $\mathcal{P}_y = \mathcal{P}_x \cup \{y\}$. Since $V_{\mathcal{P}_y}(f) = V_{\mathcal{P}_x}(f) + |f(y) - f(x)|$,
\[
V_{\mathcal{P}_x}(f) + f(y) - f(x) \le V_{\mathcal{P}_y}(f) \le g(y).
\]
Taking suprema over all partitions $\mathcal{P}_x$ yields $g(x) + f(y) - f(x) \le g(y)$, that
is, $h(x) \le h(y)$.
From Exercise 5.1.4 we have
5.9.8 Corollary. $\mathcal{BV}_a^b \subseteq \mathcal{R}_a^b$.
*5.10
The Riemann–Stieltjes Integral
In this section we describe the main features of the Riemann-Stieltjes
integral, a generalization of the Riemann integral. These integrals have many
of the properties of Riemann integrals; however, as we shall see, there are some
striking differences.
Definition and General Properties
5.10.1 Definition. Let $f$ and $w$ be bounded, real-valued functions on an
interval $[a, b]$. If $\mathcal{P} = \{x_0 = a < x_1 < \cdots < x_n = b\}$ and $\xi_j \in [x_{j-1}, x_j]$, then
\[
S_w(f, \mathcal{P}, \xi) := \sum_{j=1}^{n} f(\xi_j)\,\Delta w_j, \quad \Delta w_j := w(x_j) - w(x_{j-1}), \quad \xi := (\xi_1, \ldots, \xi_n),
\]
is called a Riemann-Stieltjes sum of $f$ with respect to $w$. The function $f$ is
said to be Riemann-Stieltjes integrable with respect to $w$ if for some $I \in \mathbb{R}$ and
each $\varepsilon > 0$, there exists a partition $\mathcal{P}_\varepsilon$ such that
\[
|S_w(f, \mathcal{P}, \xi) - I| < \varepsilon \quad\text{for all refinements } \mathcal{P} \text{ of } \mathcal{P}_\varepsilon \text{ and all choices of } \xi.
\]
In this case $I$ is called the Riemann-Stieltjes integral with respect to $w$ and is
denoted by
\[
\int_a^b f\,dw = \int_a^b f(x)\,dw(x) = \lim_{\mathcal{P}} S_w(f, \mathcal{P}, \xi). \tag{5.32}
\]
The function $f$ is called the integrand and $w$ the integrator. The collection of
all functions that are Riemann-Stieltjes integrable with respect to $w$ is denoted
by $\mathcal{R}_a^b(w)$.
♦
It follows from 5.1.18 that, for the integrator $w(x) = x$, the Riemann-Stieltjes integral reduces to the Riemann integral.
It is clear that constant functions are Riemann-Stieltjes integrable. The
following example shows that, in contrast to the Riemann integral, if $f$ has a
simple discontinuity, then $\int_a^b f\,dw$ may not exist.
5.10.2 Example. Let $f : [0, 1] \to \mathbb{R}$ and define
\[
w(x) :=
\begin{cases}
0 & \text{if } 0 \le x < 1, \\
1 & \text{if } x = 1.
\end{cases}
\]
We show that $f \in \mathcal{R}_0^1(w)$ iff $f$ is continuous at 1.
Let $\mathcal{P} = \{x_0 = 0 < x_1 < \cdots < x_n = 1\}$ be any partition of $[0, 1]$. Then
\[
S_w(f, \mathcal{P}, \xi) = f(\xi_n)[w(1) - w(x_{n-1})] = f(\xi_n).
\]
Hence if $f \in \mathcal{R}_0^1(w)$ and $\xi$ is chosen so that first $\xi_n = 1$ and second $\xi_n < 1$, we
see that $f$ is continuous at 1 and $\int_0^1 f\,dw = f(1)$.
Conversely, if $f$ is not continuous at 1, then there exists a sequence $\{a_m\}$
and $r > 0$ such that $a_m \uparrow 1$ and $|f(a_m) - f(1)| \ge r$ for every $m$. Let $\mathcal{P}_m$ denote
the refinement $\mathcal{P} \cup \{a_m\}$ of $\mathcal{P}$, where $a_m \in (x_{n-1}, 1]$. If $\xi$ consists of the left
endpoints of the intervals of $\mathcal{P}_m$, then $S_w(f, \mathcal{P}_m, \xi) = f(a_m)$, hence
\[
|S_w(f, \mathcal{P}_m, \xi) - f(1)| = |f(a_m) - f(1)| \ge r.
\]
Since $\mathcal{P}$ was arbitrary, $f \notin \mathcal{R}_0^1(w)$.
♦
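5.10.3 Theorem. If $f, g \in \mathcal{R}_a^b(w)$ and $\alpha, \beta \in \mathbb{R}$, then $\alpha f + \beta g \in \mathcal{R}_a^b(w)$ and
\[
\int_a^b (\alpha f + \beta g)\,dw = \alpha \int_a^b f\,dw + \beta \int_a^b g\,dw.
\]
Proof. This follows from the identity
\[
S_w(\alpha f + \beta g, \mathcal{P}, \xi) = \alpha S_w(f, \mathcal{P}, \xi) + \beta S_w(g, \mathcal{P}, \xi)
\]
and the linearity of the limit in (5.32), as is readily established by a standard
argument.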
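5.10.4 Theorem. Let $w := \alpha u + \beta v$, where $\alpha, \beta \in \mathbb{R}$ and $u, v : [a, b] \to \mathbb{R}$ are
bounded. If $f \in \mathcal{R}_a^b(u) \cap \mathcal{R}_a^b(v)$, then $f \in \mathcal{R}_a^b(w)$ and
\[
\int_a^b f\,dw = \alpha \int_a^b f\,du + \beta \int_a^b f\,dv.
\]
Proof. This follows from
\[
S_w(f, \mathcal{P}, \xi) = \alpha S_u(f, \mathcal{P}, \xi) + \beta S_v(f, \mathcal{P}, \xi)
\]
and the linearity of the limit in (5.32).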
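5.10.5 Theorem. Let $a < c < b$. If $f|_{[a,c]} \in \mathcal{R}_a^c(w)$ and $f|_{[c,b]} \in \mathcal{R}_c^b(w)$, then
$f \in \mathcal{R}_a^b(w)$ and
\[
\int_a^b f\,dw = \int_a^c f\,dw + \int_c^b f\,dw.
\]
Proof. Given $\varepsilon > 0$, choose partitions $\mathcal{P}_\varepsilon'$ of $[a, c]$ and $\mathcal{P}_\varepsilon''$ of $[c, b]$ such that the
following hold:
\[
\left| \int_a^c f\,dw - S_w(f, \mathcal{P}', \xi') \right| < \varepsilon/2 \quad\text{for all refinements } \mathcal{P}' \text{ of } \mathcal{P}_\varepsilon' \text{ and all } \xi',
\]
\[
\left| \int_c^b f\,dw - S_w(f, \mathcal{P}'', \xi'') \right| < \varepsilon/2 \quad\text{for all refinements } \mathcal{P}'' \text{ of } \mathcal{P}_\varepsilon'' \text{ and all } \xi''.
\]
Then $\mathcal{P}_\varepsilon := \mathcal{P}_\varepsilon' \cup \mathcal{P}_\varepsilon''$ is a partition of $[a, b]$ containing $c$. Moreover, if $\mathcal{P}$ is a
refinement of $\mathcal{P}_\varepsilon$, then $\mathcal{P}' := \mathcal{P} \cap [a, c]$ and $\mathcal{P}'' := \mathcal{P} \cap [c, b]$ are refinements of
$\mathcal{P}_\varepsilon'$ and $\mathcal{P}_\varepsilon''$, respectively. From
\[
S_w(f, \mathcal{P}, \xi) = S_w(f, \mathcal{P}', \xi') + S_w(f, \mathcal{P}'', \xi'')
\]
and the above inequalities we see that
\[
\left| \int_a^c f\,dw + \int_c^b f\,dw - S_w(f, \mathcal{P}, \xi) \right| < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\]
This establishes the existence of $\int_a^b f\,dw$ as well as the desired equality.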
5.10.6 Example. Consider the floor function integrator $w(x) = \lfloor x \rfloor$. A slight
modification of the argument in 5.10.2 shows that $\int_0^n f(x)\,d\lfloor x \rfloor$ exists iff $f$ is
left continuous at the integers $1, 2, \ldots, n$, in which case $\int_{k-1}^{k} f(x)\,d\lfloor x \rfloor = f(k)$.
For such a function, 5.10.5 implies that
\[
\int_0^n f(x)\,d\lfloor x \rfloor = \sum_{k=1}^{n} \int_{k-1}^{k} f(x)\,d\lfloor x \rfloor = \sum_{k=1}^{n} f(k).
\]
♦
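The reduction of a Riemann-Stieltjes integral with floor integrator to a finite sum can be illustrated with a direct computation of Riemann-Stieltjes sums. The sketch below is an illustration under stated assumptions, not part of the text: it uses a uniform partition of $[0, n]$ that contains every integer, midpoint tags, and the continuous integrand $f(x) = x^2$; the helper names `rs_sum` and `floor_integrator_sum` are hypothetical.

```python
import math

def rs_sum(f, w, points, tags):
    """Riemann-Stieltjes sum: sum of f(tag_j) * (w(x_j) - w(x_{j-1}))."""
    return sum(
        f(t) * (w(right) - w(left))
        for left, right, t in zip(points, points[1:], tags)
    )

def floor_integrator_sum(f, n, m=1000):
    # Uniform partition of [0, n] with m subintervals per unit length, so
    # each integer 1, ..., n is a partition point; tags are midpoints.
    points = [i / m for i in range(n * m + 1)]
    tags = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return rs_sum(f, math.floor, points, tags)

approx = floor_integrator_sum(lambda x: x * x, 4)
exact = sum(k * k for k in range(1, 5))  # f(1) + f(2) + f(3) + f(4)
print(approx, exact)
```

Only the subintervals ending at an integer contribute, because $\lfloor x \rfloor$ is constant elsewhere; refining the partition pushes the tags in those subintervals toward the integers.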
The preceding example suggests that improper Riemann-Stieltjes integration could be used to provide a unified theory that includes both improper
Riemann integrals and infinite series. This is indeed possible; however, it turns
out that Lebesgue integration is a more efficient approach. Lebesgue theory
on Rn is developed in Chapter 11.
The following theorem reveals a remarkable symmetry between integrand
and integrator.
5.10.7 Integration by Parts Formula. If $f \in \mathcal{R}_a^b(w)$, then $w \in \mathcal{R}_a^b(f)$ and
\[
\int_a^b f\,dw + \int_a^b w\,df = f(b)w(b) - f(a)w(a).
\]
Proof. For any partition $\mathcal{P} = \{x_0 = a, x_1, \ldots, x_{n-1}, x_n = b\}$,
\[
f(b)w(b) - f(a)w(a) = \sum_{j=1}^{n} f(x_j)w(x_j) - \sum_{j=1}^{n} f(x_{j-1})w(x_{j-1})
\]
and
\[
S_f(w, \mathcal{P}, \xi) = \sum_{j=1}^{n} w(\xi_j)f(x_j) - \sum_{j=1}^{n} w(\xi_j)f(x_{j-1}).
\]
Subtracting we obtain
\[
\begin{aligned}
f(b)w(b) - f(a)w(a) - S_f(w, \mathcal{P}, \xi)
&= \sum_{j=1}^{n} f(x_{j-1})[w(\xi_j) - w(x_{j-1})] + \sum_{j=1}^{n} f(x_j)[w(x_j) - w(\xi_j)] \\
&= S_w(f, \mathcal{Q}, \zeta),
\end{aligned}
\]
where $\zeta = (a, x_1, x_1, x_2, x_2, \ldots, x_{n-1}, x_{n-1}, b)$ and $\mathcal{Q}$ is the refinement of $\mathcal{P}$
obtained by adding the coordinates of $\xi$ to $\mathcal{P}$.
FIGURE 5.10: The partition Q.
Therefore,
\[
\left| f(b)w(b) - f(a)w(a) - \int_a^b f\,dw - S_f(w, \mathcal{P}, \xi) \right| = \left| S_w(f, \mathcal{Q}, \zeta) - \int_a^b f\,dw \right|.
\]
Since $f \in \mathcal{R}_a^b(w)$, the right side may be made arbitrarily small. Therefore,
$\int_a^b w\,df$ exists and equals $f(b)w(b) - f(a)w(a) - \int_a^b f\,dw$.
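The next result shows that under certain general conditions the Riemann-Stieltjes integral reduces to a Riemann integral.
5.10.8 Theorem. Let $f \in \mathcal{R}_a^b(w)$. If $w$ is continuously differentiable, then
$f w' \in \mathcal{R}_a^b$ and
\[
\int_a^b f\,dw = \int_a^b f(x)w'(x)\,dx.
\]
Proof. For any partition $\mathcal{P}$ of $[a, b]$ and any $\xi$,
\[
S_w(f, \mathcal{P}, \xi) - S(f w', \mathcal{P}, \xi) = \sum_{j=1}^{n} f(\xi_j)\,\Delta w_j - \sum_{j=1}^{n} f(\xi_j)w'(\xi_j)\,\Delta x_j.
\]
By the mean value theorem, for each $j$ there exists $t_j \in (x_{j-1}, x_j)$ such that
\[
\Delta w_j = w(x_j) - w(x_{j-1}) = w'(t_j)\,\Delta x_j.
\]
Therefore,
\[
S_w(f, \mathcal{P}, \xi) - S(f w', \mathcal{P}, \xi) = \sum_{j=1}^{n} f(\xi_j)\big[ w'(t_j) - w'(\xi_j) \big]\Delta x_j. \tag{5.33}
\]
Let $|f| \le M$ on $[a, b]$. By uniform continuity of $w'$, given $\varepsilon > 0$, there exists a
$\delta > 0$ such that
\[
|w'(x) - w'(y)| < \frac{\varepsilon}{2M(b - a)} \quad\text{whenever } |x - y| < \delta. \tag{5.34}
\]
Let $\mathcal{P}_\varepsilon'$ be a partition of $[a, b]$ with $\|\mathcal{P}_\varepsilon'\| < \delta$. From (5.33) and (5.34),
\[
|S_w(f, \mathcal{P}, \xi) - S(f w', \mathcal{P}, \xi)| \le \frac{\varepsilon}{2(b - a)} \sum_{j=1}^{n} \Delta x_j = \frac{\varepsilon}{2} \tag{5.35}
\]
for all refinements $\mathcal{P}$ of $\mathcal{P}_\varepsilon'$ and all $\xi$. Next, choose a partition $\mathcal{P}_\varepsilon''$ such that
\[
\left| \int_a^b f\,dw - S_w(f, \mathcal{P}, \xi) \right| < \varepsilon/2 \quad\text{for all } \xi \text{ and all refinements } \mathcal{P} \text{ of } \mathcal{P}_\varepsilon''. \tag{5.36}
\]
If $\mathcal{P}$ is a refinement of $\mathcal{P}_\varepsilon' \cup \mathcal{P}_\varepsilon''$, then both (5.35) and (5.36) hold, hence, by
the triangle inequality,
\[
\left| \int_a^b f\,dw - S(f w', \mathcal{P}, \xi) \right| < \varepsilon.
\]
This shows that $f w' \in \mathcal{R}_a^b$ and establishes the equality.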
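A small numerical experiment illustrates Theorem 5.10.8. The sketch below compares a Riemann-Stieltjes sum of $f(x) = x$ against the integrator $w(x) = x^2$ on $[0, 1]$ with a Riemann sum of $f w' = 2x^2$; the choice of functions, the uniform partition, the midpoint tags, and the helper names are all assumptions made for illustration. Both sums should be close to $\int_0^1 2x^2\,dx = 2/3$.

```python
def rs_sum(f, w, points, tags):
    """Riemann-Stieltjes sum: sum of f(tag_j) * (w(x_j) - w(x_{j-1}))."""
    return sum(
        f(t) * (w(right) - w(left))
        for left, right, t in zip(points, points[1:], tags)
    )

def uniform_partition(a, b, n):
    """Return n + 1 equally spaced points of [a, b] and the n midpoints."""
    points = [a + (b - a) * i / n for i in range(n + 1)]
    tags = [(left + right) / 2 for left, right in zip(points, points[1:])]
    return points, tags

points, tags = uniform_partition(0.0, 1.0, 1000)
stieltjes = rs_sum(lambda x: x, lambda x: x * x, points, tags)   # S_w(f, P, xi)
riemann = rs_sum(lambda x: x * 2 * x, lambda x: x, points, tags)  # S(f w', P, xi)
print(stieltjes, riemann, 2 / 3)
```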
Monotone Increasing Integrators
If w : [a, b] → R is monotone increasing, then the Riemann-Stieltjes integral
may be characterized in terms of upper and lower sums, as in the Darboux
theory. This fact will lead to an important existence theorem for integrators of
bounded variation and continuous integrands.
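Let $f : [a, b] \to \mathbb{R}$ be bounded and let $\mathcal{P}$ be a partition of $[a, b]$. Define the
upper and lower Darboux–Stieltjes sums of $f$ with respect to $w$ by
\[
\overline{S}_w(f, \mathcal{P}) = \sum_{j=1}^{n} M_j\,\Delta w_j \quad\text{and}\quad \underline{S}_w(f, \mathcal{P}) = \sum_{j=1}^{n} m_j\,\Delta w_j,
\]
where
\[
M_j = M_j(f) := \sup_{x_{j-1} \le x \le x_j} f(x) \quad\text{and}\quad m_j = m_j(f) := \inf_{x_{j-1} \le x \le x_j} f(x).
\]
The upper and lower Darboux–Stieltjes integrals of $f$ with respect to $w$ are
defined, respectively, by
\[
\overline{\int_a^b} f\,dw := \inf_{\mathcal{P}} \overline{S}_w(f, \mathcal{P}) \quad\text{and}\quad \underline{\int_a^b} f\,dw := \sup_{\mathcal{P}} \underline{S}_w(f, \mathcal{P}).
\]
As in the Darboux theory, if $\mathcal{Q}$ is a refinement of $\mathcal{P}$ then, because $w$ is
increasing,
\[
\underline{S}_w(f, \mathcal{P}) \le \underline{S}_w(f, \mathcal{Q}) \le \underline{\int_a^b} f\,dw \le \overline{\int_a^b} f\,dw \le \overline{S}_w(f, \mathcal{Q}) \le \overline{S}_w(f, \mathcal{P}).
\]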
Here is the analog of 5.1.8 for Riemann–Stieltjes integrals.
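5.10.9 Theorem. The following statements are equivalent:
(a) $f \in \mathcal{R}_a^b(w)$.
(b) For each $\varepsilon > 0$, there exists a partition $\mathcal{P}_\varepsilon$ such that
\[
\overline{S}_w(f, \mathcal{P}_\varepsilon) - \underline{S}_w(f, \mathcal{P}_\varepsilon) < \varepsilon.
\]
(c) $\displaystyle \underline{\int_a^b} f\,dw = \overline{\int_a^b} f\,dw$.
If these conditions hold, then
\[
\int_a^b f\,dw = \underline{\int_a^b} f\,dw = \overline{\int_a^b} f\,dw.
\]
Proof. That (b) and (c) are equivalent is proved exactly as in 5.1.8.
Assume that (a) holds. Given $\varepsilon > 0$, choose a partition $\mathcal{P}_\varepsilon$ such that
\[
\left| \int_a^b f\,dw - S_w(f, \mathcal{P}, \xi) \right| < \varepsilon/3 \quad\text{for all refinements } \mathcal{P} \text{ of } \mathcal{P}_\varepsilon \text{ and all } \xi. \tag{5.37}
\]
For such a partition $\mathcal{P}$ and for each $j$, there exists a sequence $\{\xi_{j,k}\}_{k=1}^{\infty}$ in
$[x_{j-1}, x_j]$ such that $\lim_k f(\xi_{j,k}) = M_j(f)$. It follows that
\[
\lim_k S_w(f, \mathcal{P}, \xi^k) = \overline{S}_w(f, \mathcal{P}), \quad\text{where } \xi^k = (\xi_{1,k}, \ldots, \xi_{n,k}).
\]
From (5.37),
\[
\left| \int_a^b f\,dw - \overline{S}_w(f, \mathcal{P}) \right| \le \varepsilon/3.
\]
Similarly,
\[
\left| \int_a^b f\,dw - \underline{S}_w(f, \mathcal{P}) \right| \le \varepsilon/3.
\]
Part (b) now follows from the triangle inequality.
Now assume that (c) holds. Let $I$ denote the common value of the integrals
in (c). Given $\varepsilon > 0$, choose partitions $\mathcal{P}_\varepsilon'$ and $\mathcal{P}_\varepsilon''$ such that
\[
I - \varepsilon < \underline{S}_w(f, \mathcal{P}_\varepsilon') \quad\text{and}\quad \overline{S}_w(f, \mathcal{P}_\varepsilon'') < I + \varepsilon.
\]
The inequalities still hold if $\mathcal{P}_\varepsilon'$ and $\mathcal{P}_\varepsilon''$ are replaced by any refinement $\mathcal{P}$ of
$\mathcal{P}_\varepsilon := \mathcal{P}_\varepsilon' \cup \mathcal{P}_\varepsilon''$. Thus
\[
-\varepsilon < \underline{S}_w(f, \mathcal{P}) - I \le S_w(f, \mathcal{P}, \xi) - I \le \overline{S}_w(f, \mathcal{P}) - I < \varepsilon.
\]
This shows that $f \in \mathcal{R}_a^b(w)$ and $\int_a^b f\,dw = I$.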
Integrators of Bounded Variation
Recall that a function of bounded variation may be expressed as the
difference of two monotone increasing functions (5.9.7). This, together with
5.10.9, allows for a simple proof of the following existence theorem.
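5.10.10 Theorem. If $f : [a, b] \to \mathbb{R}$ is continuous and $w : [a, b] \to \mathbb{R}$ has
bounded variation, then $f \in \mathcal{R}_a^b(w)$.
Proof. By the remark preceding the theorem and by 5.10.4, we may assume
that $w$ is increasing. By uniform continuity of $f$, given $\varepsilon > 0$, there exists a
$\delta > 0$ such that
\[
|f(x) - f(y)| < \frac{\varepsilon}{w(b) - w(a) + 1} \quad\text{for all } x, y \text{ with } |x - y| < \delta.
\]
Let $\mathcal{P}_\varepsilon$ be a partition with $\|\mathcal{P}_\varepsilon\| < \delta$. For any refinement $\mathcal{P}$ of $\mathcal{P}_\varepsilon$, $\|\mathcal{P}\| < \delta$,
hence
\[
M_j(f) - m_j(f) \le \frac{\varepsilon}{w(b) - w(a) + 1}.
\]
Therefore,
\[
\overline{S}_w(f, \mathcal{P}) - \underline{S}_w(f, \mathcal{P}) = \sum_{j=1}^{n} \big[ M_j(f) - m_j(f) \big]\Delta w_j \le \varepsilon,
\]
which shows that $f \in \mathcal{R}_a^b(w)$.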
The conclusion of the theorem does not necessarily hold if w fails to have
bounded variation, even if w is continuous:
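5.10.11 Example. Let $f = w = f_{1/2}$, where $f_\alpha$ is defined as in Example 5.9.3.
We show that $\int_0^1 f\,dw$ does not exist. Referring to that example, let $\mathcal{P}_\varepsilon$ be the
partition
\[
\varepsilon < a_p < b_p < a_{p-1} < \cdots < b_{k+1} < a_k < b_k < \cdots < b_{q+1} < a_q < b_q < 1
\]
of $[\varepsilon, 1]$, and let $\xi$ consist of left endpoints of $\mathcal{P}_\varepsilon$. Then
\[
\begin{aligned}
S_w(f, \mathcal{P}_\varepsilon, \xi) = {} & f(\varepsilon)\big[ w(a_p) - w(\varepsilon) \big] + f(b_q)\big[ w(1) - w(b_q) \big] \\
& + \sum_{k=q}^{p} f(a_k)\big[ w(b_k) - w(a_k) \big] + \sum_{k=q}^{p-1} f(b_{k+1})\big[ w(a_k) - w(b_{k+1}) \big].
\end{aligned}
\]
Since $f_{1/2}(b_k) = 0$ and $f_{1/2}(a_k) = \sqrt{a_k}$,
\[
S_w(f, \mathcal{P}_\varepsilon, \xi) = f(\varepsilon)\big[ \sqrt{a_p} - w(\varepsilon) \big] - \sum_{k=q}^{p} a_k.
\]
Since the first term is bounded and the sums diverge as $\varepsilon \to 0$, $\lim_{\varepsilon \to 0} S_w(f, \mathcal{P}_\varepsilon, \xi) = -\infty$.
♦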
Chapter 6
Numerical Infinite Series
An infinite series is the limit of a sequence of expanding finite sums. The
terms of these sums may be real numbers or functions. In this chapter we
examine the convergence behavior of series of the former type; series whose
terms are functions are treated in the next chapter. In the first section, we
give examples of series that may be summed, that is, for which an explicit
numerical value may be calculated. The remaining sections describe various
tests for convergence of general series. Additional methods of summing series
may be found in Section 7.4.
6.1
Definition and Examples
6.1.1 Definition. Let $\{a_n\}$ be a sequence of real numbers. The various
symbols
\[
\sum_{n=1}^{\infty} a_n = \sum_{n} a_n = \sum a_n = a_1 + a_2 + \cdots + a_n + \cdots
\]
represent what is called an infinite series with $n$th term $a_n$ or, simply, a series.
The $n$th partial sum of the series is defined by
\[
s_n = \sum_{k=1}^{n} a_k.
\]
The series is said to converge if the sequence of partial sums converges, in
which case we write
\[
\sum a_n = \lim_n s_n
\]
and call $\sum a_n$ the sum of the series. If the sequence $\{s_n\}$ diverges, then the
series is said to diverge.
♦
6.1.2 Remark. A series may begin with an index other than 1. In this regard,
note that, because
\[
s_n = s_{m-1} + \sum_{k=m}^{n} a_k, \quad n \ge m > 1,
\]
the series $s := \sum_{n=1}^{\infty} a_n$ converges iff $\sum_{n=m}^{\infty} a_n$ converges. In this case the “tail
end” of the series tends to zero:
\[
\lim_{m \to +\infty} \sum_{n=m}^{\infty} a_n = \lim_{m \to +\infty} (s - s_{m-1}) = 0.
\]
♦
6.1.3 Example. Using the definition $e := \lim_{n \to \infty} (1 + 1/n)^n$ (see 2.2.4), we show
that
\[
e = \sum_{n=0}^{\infty} \frac{1}{n!}.
\]
First, since the partial sums $s_n := \sum_{k=0}^{n} 1/k!$ increase, the limit $s := \lim_n s_n$
exists in $\overline{\mathbb{R}}$. From the calculations in 2.2.4,
\[
(1 + 1/n)^n = 2 + \sum_{k=2}^{n} \frac{1}{k!} (1 - 1/n)(1 - 2/n) \cdots (1 - (k - 1)/n) \le s_n.
\]
Letting $n \to \infty$, we obtain $e \le s$. On the other hand, if $n > m$, then
\[
(1 + 1/n)^n > 2 + \sum_{k=2}^{m} \frac{1}{k!} (1 - 1/n)(1 - 2/n) \cdots (1 - (k - 1)/n).
\]
Letting $n \to \infty$, we see that $e \ge s_m$. Letting $m \to \infty$ yields $e \ge s$.
♦
6.1.4 Example. The geometric series $\sum_{n=0}^{\infty} a r^n$, where $a, r \in \mathbb{R}$ and $a \ne 0$,
converges iff $|r| < 1$, in which case
\[
\sum_{n=0}^{\infty} a r^n = \frac{a}{1 - r}.
\]
This follows from the calculation
\[
s_n = \sum_{k=0}^{n} a r^k = a\,\frac{1 - r^{n+1}}{1 - r}, \quad r \ne 1.
\]
♦
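The inequality $(1 + 1/n)^n \le s_n \le e$ used in Example 6.1.3 is easy to observe numerically. The following sketch (function names are illustrative, not from the text) computes the partial sums of $\sum 1/k!$ by updating each term from the previous one, and compares them with $(1 + 1/n)^n$ and with `math.e`.

```python
import math

def exp_partial_sum(n):
    """Return s_n = sum_{k=0}^{n} 1/k!, building each term from the last."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k          # term is now 1/k!
        total += term
    return total

def compound(n):
    """Return (1 + 1/n)**n."""
    return (1 + 1 / n) ** n

for n in (1, 5, 10, 50):
    print(n, compound(n), exp_partial_sum(n), math.e)
```

The series converges far faster than the defining limit: ten terms already agree with $e$ to about seven decimal places.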
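6.1.5 Example. For $m \in \mathbb{N}$,
\[
\sum_{n=1}^{\infty} \frac{1}{n(m + n)} = \frac{1}{m} \sum_{k=1}^{m} \frac{1}{k}. \tag{6.1}
\]
To see this, we use partial fractions: For $n > m$
\[
m s_n = \sum_{k=1}^{n} \frac{m}{k(m + k)} = \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k + m} \right) = \sum_{k=1}^{m} \frac{1}{k} - \sum_{k=n+1}^{n+m} \frac{1}{k}. \tag{6.2}
\]
The second sum on the extreme right in (6.2) is less than $m/(n + 1)$ and hence
tends to zero as $n \to \infty$.
♦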
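Identity (6.2) can be verified exactly with rational arithmetic. The sketch below uses Python's `fractions.Fraction`; the helper names `partial_sum`, `head`, and `tail` are introduced here only to mirror the three sums in (6.2).

```python
from fractions import Fraction

def partial_sum(m, n):
    """s_n = sum_{k=1}^{n} 1 / (k (m + k))."""
    return sum(Fraction(1, k * (m + k)) for k in range(1, n + 1))

def head(m):
    """sum_{k=1}^{m} 1/k, the limit of m * s_n."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

def tail(m, n):
    """sum_{k=n+1}^{n+m} 1/k, the part that telescopes away."""
    return sum(Fraction(1, k) for k in range(n + 1, n + m + 1))

m, n = 3, 50
print(m * partial_sum(m, n) == head(m) - tail(m, n))
print(float(tail(m, n)), m / (n + 1))
```

The first line prints `True` because the identity is exact, and the second shows the tail sitting below the bound $m/(n + 1)$.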
The series in (6.1) is an example of a telescopic series, the name referring
to the cancellations taking place in (6.2).
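6.1.6 Theorem. Let $\{a_n\}$ and $\{b_n\}$ be sequences and let $\alpha, \beta \in \mathbb{R}$. If $\sum a_n$
and $\sum b_n$ converge, then $\sum (\alpha a_n + \beta b_n)$ converges and
\[
\sum (\alpha a_n + \beta b_n) = \alpha \sum a_n + \beta \sum b_n. \tag{6.3}
\]
Proof. Let $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$. Then $\alpha s_n + \beta t_n$ is the $n$th partial
sum of the series $\sum (\alpha a_n + \beta b_n)$ and
\[
\lim_{n \to \infty} (\alpha s_n + \beta t_n) = \alpha \lim_{n \to \infty} s_n + \beta \lim_{n \to \infty} t_n,
\]
which is (6.3).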
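6.1.7 Example. By 6.1.6 and 6.1.4,
\[
\begin{aligned}
\sum_{n=1}^{\infty} \frac{2 \cdot 3^{n+1} + 3 \cdot 2^{n-1}}{6^n}
&= 6 \sum_{n=1}^{\infty} \frac{1}{2^n} + \frac{3}{2} \sum_{n=1}^{\infty} \frac{1}{3^n} \\
&= 6 \cdot \frac{1/2}{1 - 1/2} + \frac{3}{2} \cdot \frac{1/3}{1 - 1/3} \\
&= 6.75.
\end{aligned}
\]
♦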
The following result is a test for divergence. It implies that a series whose
$n$th term does not tend to zero must diverge. For example, the series $\sum \sin n$
and $\sum 2^{-1/n}$ diverge.
6.1.8 Proposition. If $\sum a_n$ converges, then $a_n \to 0$.
Proof. $a_n = s_n - s_{n-1} \to s - s = 0$.
The converse of 6.1.8 is false:
6.1.9 Example. The harmonic series $\sum_{n=1}^{\infty} 1/n$ diverges. Indeed, if $s_n$ is the
$n$th partial sum of the series, then for all $n$
\[
s_{2^n} - s_{2^{n-1}} = \frac{1}{2^{n-1} + 1} + \cdots + \frac{1}{2^{n-1} + 2^{n-1}} > \frac{2^{n-1}}{2^{n-1} + 2^{n-1}} = \frac{1}{2},
\]
hence $\{s_n\}$ is not a Cauchy sequence.
It is of interest to note that, while the sequence $s_n$ diverges, the sequence
$t_n := s_n - \ln n$ converges. To see this, observe first that
\[
\ln n = \int_1^n \frac{dx}{x} = \sum_{k=1}^{n-1} \int_k^{k+1} \frac{dx}{x} < \sum_{k=1}^{n} \int_k^{k+1} \frac{dx}{k} = \sum_{k=1}^{n} \frac{1}{k} = s_n,
\]
so $t_n > 0$. Furthermore,
\[
\ln(n + 1) - \ln n = \int_n^{n+1} \frac{dx}{x} > \int_n^{n+1} \frac{1}{n + 1}\,dx = \frac{1}{n + 1},
\]
hence
\[
t_n - t_{n+1} = \ln(n + 1) - \ln n + s_n - s_{n+1} = \ln(n + 1) - \ln n - \frac{1}{n + 1} > 0.
\]
Therefore, $\{t_n\}$ is bounded below and decreasing, hence converges.
The number
\[
\gamma := \lim_{n \to \infty} t_n = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln n \right)
\]
is known as Euler’s constant. Its value to eleven decimal places is
$0.57721566490\ldots$. As of this writing, it is not known whether $\gamma$ is irrational.
Note that since $s_n = t_n + \ln n$, the convergence of $\{t_n\}$ provides another proof
that the harmonic series diverges.
♦
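The monotone convergence of $t_n = s_n - \ln n$ to Euler's constant can be observed directly. The sketch below (the function name `t` mirrors the notation above; the reference value is the one quoted in the example) prints $t_n$ for a few values of $n$.

```python
import math

def t(n):
    """t_n = (1 + 1/2 + ... + 1/n) - ln n."""
    return sum(1 / k for k in range(1, n + 1)) - math.log(n)

GAMMA_APPROX = 0.57721566490  # value quoted in Example 6.1.9

for n in (10, 100, 10_000, 100_000):
    print(n, t(n), t(n) - GAMMA_APPROX)
```

The differences are positive and shrink roughly like $1/(2n)$, in line with $\{t_n\}$ being decreasing and bounded below.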
Exercises
1. Let m ∈ N. Sum the series
(a) S
(c) S
P∞
n=1
m2n+1
.
(m + 1)2n−1
n+1
2
+2
ln n+1
.
2
+1
an , where an =
(−1)n+1 m3n+1
.
(m + 1)3n−1
1
(d) p
√
√ .
n(n + 1)( n + 1 + n)
12
(f)
.
(n + 1)(n + 2)(n + 3)
(−1)n
(h)
.
(n + 1)(n + 3)(n + 5)
(b)
(−1)n
, m even.
n(n + m)
1
.
(g) S
(n + 1)(n + 3)(n + 5)
p
√
m n + n(n + 1)
S
p
(i) ln √
.
m n + 1 + n(n + 1)
1
(k)
.
√
√ √
(n + m) n + n n + m
(−1)n (n + m + 1)
(m)
.
(2n + 1)(2n + 4m + 3)
(e) S
n2 + 4n + 4
.
n2 + 4n + 3
18
.
(l)
n(n + 1)(n + 2)(n + 3)
(−1)n (2n + 2m + 1)
(n) S
.
n(n + 2m + 1)
(j)
ln
2. Let $0 < r < 1$ and $m \in \mathbb{N}$. Sum the series $\sum_{n=0}^{\infty} a_n$ if $a_n =$
(a)S $r^n \cos(n\pi/2)$.
(b) $(-1)^{\lfloor n/3 \rfloor} r^n$.
(c) $(-1)^{\lfloor n/m \rfloor} r^n$.
3. Given that $e = \sum_{n=0}^{\infty} 1/n!$ and $e^{-1} = \sum_{n=0}^{\infty} (-1)^n/n!$, find the value of the series $\sum_{n=0}^{\infty} a_n$ if $a_n =$
(a)S $\dfrac{(2n+3)^3}{n!}$. (b) $\dfrac{1}{(2n)!}$. (c)S $\dfrac{1}{(2n+1)!}$. (d) $\dfrac{n}{(2n+1)!}$. (e) $\dfrac{n}{(2n)!}$.
4. Let $p > 0$ and $s_n = \sum_{k=1}^{n} 1/k$. Prove that $s_n/n^p \to 0$, $s_n/\ln n \to 1$, and $s_n/\ln\ln n \to +\infty$.
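5. Let $\gamma$ denote Euler's constant (6.1.9). Prove that
(a)S $\displaystyle \sum_{k=1}^{n} \frac{1}{2k-1} - \ln\sqrt{n} \to \ln 2 + \frac{\gamma}{2}$.
(b) $\displaystyle \sum_{k=1}^{n} \frac{4k}{(2k-1)(2k+1)} - \ln n \to \ln 4 + \gamma - 1$.
(c) $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n(2n-1)(2n+1)} = \ln 4 - 1$.
6. Prove that $\sum a_n$ converges iff for each $\varepsilon > 0$ there exists an index N such that
\[ \Bigl| \sum_{k=n}^{n+p} a_k \Bigr| < \varepsilon \quad \text{for all } n \ge N \text{ and } p \ge 1. \]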
7. Suppose that $a_n$ tends monotonically to 0 and that $s := \sum_{n=1}^{\infty} a_n$ converges.
(a) Prove that $n a_n \to 0$.
(b) Let $p \in \mathbb{N}$. Show that $t := \sum_{n=1}^{\infty} n(a_n - a_{n+p})$ converges and express t in terms of s.
Suggestion. For (b), consider first the case p = 1.
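8.S Let $\sum a_n$ and $\sum b_n$ be convergent series with $b_n > 0$ for all n. Suppose that $L := \lim_n (a_n/b_n)$ exists in $\mathbb{R}$. Prove that
\[ \lim_n \frac{\sum_{k=n}^{\infty} a_k}{\sum_{k=n}^{\infty} b_k} = L. \]
Use this to calculate $\displaystyle \lim_n \Bigl( \sum_{k=n}^{\infty} \sin(3/k^2) \Bigr) \Bigl( \sum_{k=n}^{\infty} 1/k^2 \Bigr)^{-1}$.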
9. For a sequence $\{c_n\}$, define $\Delta c_n = c_{n+1} - c_n$. Prove the following discrete analog of l'Hospital's rule: Let $\{a_n\}$ and $\{b_n\}$ be sequences with $\{b_n\}$ strictly monotone. Suppose that either (a) $a_n \to 0$ and $b_n \to 0$, or (b) $b_n \to \pm\infty$. Then
\[ \lim_n \frac{a_n}{b_n} = \lim_n \frac{\Delta a_n}{\Delta b_n}, \]
provided that the limit on the right exists in $\mathbb{R}$.
10. Let $\{a_n\}$ and $\{b_n\}$ be sequences with $b_n > 0$ for all n and $\sum_n b_n = +\infty$. Set $A_n = \sum_{k=1}^{n} a_k$ and $B_n = \sum_{k=1}^{n} b_k$. Use Exercise 9 to prove that
\[ \lim_n \frac{A_n}{B_n} = \lim_n \frac{a_n}{b_n}, \]
provided that the limit on the right exists in $\mathbb{R}$. Use this to calculate the limits of
(a) $\displaystyle \frac{\sum_{k=1}^{n} \sin(1/k)}{\sum_{k=1}^{n} 1/k}$, (b) $\displaystyle \frac{\sum_{k=1}^{n} \ln k}{\sum_{k=1}^{n} k^p}$, (c) $\displaystyle \frac{\sum_{k=1}^{n} r^k}{\sum_{k=1}^{n} k^p}$,
where $r, p > 0$.
11. Let $\{b_n\}_{n=1}^{\infty}$ be a sequence obtained by rearranging finitely many terms of a sequence $\{a_n\}_{n=1}^{\infty}$. Show that $\sum b_n$ converges in $\mathbb{R}$ iff $\sum a_n$ converges in $\mathbb{R}$, in which case the series are equal.
12.S Let $\{b_k\}$ be a sequence obtained from a sequence $\{a_n\}$ by grouping, that is,
\[ b_k = a_{n_{k-1}+1} + a_{n_{k-1}+2} + \cdots + a_{n_k}, \quad k = 1, 2, \ldots, \]
where $\{n_k\}_k$ is a strictly increasing sequence of nonnegative integers and $n_0 = 0$. Show that if $\sum_n a_n$ converges in $\mathbb{R}$, then so does $\sum_k b_k$ and the series are equal. Show that the converse is true if $a_n \ge 0$ for all sufficiently large n. What if the terms $a_n$ change sign infinitely often?
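13. Let $\{a_n\}$ be decreasing and nonnegative. Prove that $\sum_{n=1}^{\infty} a_n$ converges iff $\sum_{k=0}^{\infty} 2^k a_{2^k}$ converges. Hint. Set
\[ s_n = \sum_{j=1}^{n} a_j \quad \text{and} \quad t_k = \sum_{j=0}^{k} 2^j a_{2^j}. \]
Show that $s_n \le t_k$ if $n \le 2^{k+1} - 1$ and $s_n \ge t_k/2$ if $n \ge 2^k$.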
14. (Decimal representation of real numbers). Prove that every real number $x \ge 0$ has a decimal representation
\[ x = b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots := \sum_{n=0}^{N} b_n 10^n + \sum_{n=1}^{\infty} a_n 10^{-n}, \]
where the digits $b_n$, $a_n$ are integers from 0 to 9.
Hint. By Exercise 1.5.16, it may be assumed that $x \in [0, 1)$. Prove by induction that for each n there exist $a_j \in \{0, 1, \ldots, 9\}$ and $x_n \in [0, 10^{-n})$ such that
\[ x = x_n + \sum_{j=1}^{n} a_j 10^{-j} = x_n + (.a_1 a_2 \cdots a_n). \]
15.S Call a decimal representation $b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots$ standard if no index n exists such that $a_k = 9$ for all $k \ge n$. Prove that every real number has a unique standard decimal representation.
16. A real number $x \ge 0$ is a repeating decimal if it has a decimal representation of the form
\[ x = b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots a_m \overline{a_{m+1} a_{m+2} \cdots a_{m+k}}, \]
where the upper bar indicates that the block repeats forever. (For example, $61/495 = .12323\cdots = .1\overline{23}$.) Prove that every repeating decimal is rational.
17. Prove the converse of Exercise 16, that is, every rational number p/q is a repeating decimal. Conclude that if $f : \mathbb{N} \to \mathbb{N}$ is strictly increasing, then $\sum 10^{-f(n)}$ is irrational.
Hint. By the division algorithm you may assume that $1 \le p < q$. Begin by showing that if $p/q = .a_1 a_2 \cdots$, then for each n
\[ \frac{p}{q} = .a_1 a_2 \cdots a_n + \frac{r_n}{10^n q}, \quad \text{where } r_n \in \{0, 1, \ldots, q-1\}, \]
and use this to show that $q a_n = 10 r_{n-1} - r_n$, where $r_0 := p$.
6.2 Series with Nonnegative Terms
There are a variety of tests for the convergence of series with nonnegative terms. The most basic of these is the following theorem.
6.2.1 Theorem. If $a_n \ge 0$ for all n, then the series $\sum a_n$ converges in $\mathbb{R}$ iff its partial sums are bounded.
Proof. Since the terms of the series are nonnegative, the sequence of partial
sums is increasing. The assertion therefore follows from the monotone sequence
theorem (2.2.2).
6.2.2 Remark. By 6.1.2, the theorem is still valid if the inequality an ≥ 0
holds only eventually, that is, for all n ≥ some m. Many of the results in this
chapter have similar extensions. Rather than make these explicit, we leave the
straightforward formulations to the reader.
♦
6.2.3 Example. Let $a_n, b_n \ge 0$ for all n and suppose that $\sum a_n$ and $\sum b_n$ converge. By the Cauchy–Schwarz inequality (1.6.3(e)),
\[ \sum_{k=1}^{n} \sqrt{a_k b_k} \le \Bigl( \sum_{k=1}^{n} a_k \Bigr)^{1/2} \Bigl( \sum_{k=1}^{n} b_k \Bigr)^{1/2}. \]
Since the sums on the right are bounded, so are the sums on the left. Therefore, $\sum \sqrt{a_n b_n}$ converges.
♦
The following test relates the convergence of a series to that of an improper
integral.
6.2.4 Integral Test. Let f be decreasing, positive, and locally integrable on the interval $[1, \infty)$. Then the series $\sum_{n=1}^{\infty} f(n)$ converges iff the improper integral $\int_1^{\infty} f$ converges. Moreover, if s denotes the sum of the series and $s_n$ its nth partial sum, then for every $n \in \mathbb{N}$
\[ 0 \le s - s_n \le \int_n^{\infty} f(x)\,dx. \tag{6.4} \]
Proof. For each $n \in \mathbb{N}$ let
\[ s_n = \sum_{k=1}^{n} f(k) \quad \text{and} \quad t_n = \int_1^n f. \]
For each $k \in \mathbb{N}$ and $x \in [k, k+1]$, $f(k+1) \le f(x) \le f(k)$, hence
\[ f(k+1) \le \int_k^{k+1} f \le f(k) \]
and so
\[ s_n - f(1) = \sum_{k=2}^{n} f(k) = \sum_{k=1}^{n-1} f(k+1) \le \sum_{k=1}^{n-1} \int_k^{k+1} f = t_n \le \sum_{k=1}^{n-1} f(k) = s_{n-1}. \]
Therefore, $\{s_n\}$ is bounded iff $\{t_n\}$ is bounded. The first assertion of the theorem now follows from 6.2.1.
Now observe that for $m > n$,
\[ 0 \le s_m - s_n = \sum_{k=n+1}^{m} f(k) = \sum_{k=n}^{m-1} f(k+1) \le \sum_{k=n}^{m-1} \int_k^{k+1} f = \int_n^m f. \]
Letting $m \to +\infty$ yields (6.4).
Inequality (6.4) allows one to estimate the error made by approximating s by a partial sum $s_n$.
6.2.5 Example. (p-series). By 5.7.3(a), $\int_1^{\infty} 1/x^p\,dx$ converges iff $p > 1$. Therefore, the same is true of the series $s := \sum_{n=1}^{\infty} 1/n^p$. Furthermore, if $p > 1$, then
\[ 0 \le s - s_n \le \int_n^{\infty} x^{-p}\,dx = \frac{1}{(p-1)n^{p-1}}. \]
Thus if the partial sum $s_n$ is to agree with s in, say, the first 10 decimal places, then n should be chosen so that $(p-1)n^{p-1} > 10^{10}$.
♦
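The error estimate in 6.2.5 can be tried out for p = 2. The sketch below assumes the known value $\sum 1/n^2 = \pi^2/6$, which is a fact not established in this section; the function names are illustrative. It compares the actual error $s - s_n$ with the bound $1/((p-1)n^{p-1})$.

```python
import math

def partial_sum(n: int, p: float) -> float:
    # s_n = sum_{k=1}^{n} 1/k^p
    return math.fsum(1.0 / k ** p for k in range(1, n + 1))

def tail_bound(n: int, p: float) -> float:
    # Right side of the estimate in 6.2.5: 1 / ((p - 1) n^(p - 1)), valid for p > 1
    return 1.0 / ((p - 1) * n ** (p - 1))

s = math.pi ** 2 / 6  # assumed closed form for p = 2 (not derived in the text)

# Triples (n, actual error, bound from (6.4)).
errors = [(n, s - partial_sum(n, 2), tail_bound(n, 2)) for n in (10, 100, 1000)]
for n, err, bound in errors:
    print(f"n = {n:5d}: s - s_n = {err:.8f} <= {bound:.8f}")
```

For p = 2 the bound is $1/n$, so about $10^{10}$ terms would be needed for ten-place agreement, which illustrates how slowly this series converges.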
6.2.6 Comparison Test. Let $0 \le a_n \le b_n$ for all n. If $\sum b_n$ converges, then so does $\sum a_n$.
Proof. The partial sums of $\sum b_n$ are bounded and dominate those of $\sum a_n$, hence the assertion follows from 6.2.1.
6.2.7 Limit Comparison Test. Let $a_n, b_n > 0$ for all n.
(a) If $\overline{r} := \limsup_n (a_n/b_n) < +\infty$ and $\sum b_n$ converges, then $\sum a_n$ converges.
(b) If $\underline{r} := \liminf_n (a_n/b_n) > 0$ and $\sum a_n$ converges, then $\sum b_n$ converges.
(c) If $r := \lim_n (a_n/b_n)$ exists and $r \in (0, +\infty)$, then $\sum b_n$ converges iff $\sum a_n$ converges.
Proof. For (a), let $r \in (\overline{r}, +\infty)$ and choose N so that $\sup_{n \ge N} a_n/b_n < r$. Then $a_n < b_n r$ for every $n \ge N$, hence the conclusion follows from the comparison test and 6.2.2. Part (b) follows similarly by choosing $r \in (0, \underline{r})$ and then N so that $\inf_{n \ge N} a_n/b_n > r$. Part (c) follows from (a) and (b).
6.2.8 Examples. (a) The series
\[ \sum_n \frac{2^n + n^3}{3^n + n^2} \]
converges by comparison with the convergent series $\sum_n (2/3)^n$, since
\[ \frac{2^n + n^3}{3^n + n^2}\,(2/3)^{-n} = \frac{1 + n^3/2^n}{1 + n^2/3^n} \to 1. \]
(b) The series
\[ \sum_n \frac{\sqrt{cn+d} - \sqrt{cn}}{n+1}, \quad c, d > 0, \]
converges by comparison with the convergent series $\sum_n n^{-3/2}$, since
\[ \frac{\sqrt{cn+d} - \sqrt{cn}}{n+1}\, n^{3/2} = \frac{d\,n^{3/2}}{(n+1)(\sqrt{cn+d} + \sqrt{cn})} \to \frac{d}{2\sqrt{c}}. \]
♦
6.2.9 Ratio Test. Let $a_n > 0$ for all n.
(a) If $\overline{r} := \limsup_n \dfrac{a_{n+1}}{a_n} < 1$, then $\sum a_n$ converges.
(b) If $\underline{r} := \liminf_n \dfrac{a_{n+1}}{a_n} > 1$, then $\sum a_n$ diverges.
Proof. (a) Let $r \in (\overline{r}, 1)$ and choose N so that $\sup_{n \ge N} a_{n+1}/a_n < r$. For $n > N$
\[ a_n < a_{n-1} r < a_{n-2} r^2 < \cdots < a_N r^{n-N}, \]
so $\sum a_n$ converges by the comparison test.
(b) If $\underline{r} > 1$ there exists N such that $\inf_{n \ge N} a_{n+1}/a_n > 1$. Therefore,
\[ a_n > a_{n-1} > a_{n-2} > \cdots > a_N > 0, \quad n > N, \]
so $a_n$ cannot converge to zero. Therefore, $\sum a_n$ diverges.
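6.2.10 Examples. (a) Let $a_n$ denote the general term of the series
\[ \sum_{n=1}^{\infty} \frac{8 \cdot 14 \cdot 20 \cdots (6n+2)}{6 \cdot 11 \cdot 16 \cdots (5n+1)}\, c^n, \]
where $c > 0$. Then
\[ \frac{a_{n+1}}{a_n} = \frac{6n+8}{5n+6}\, c \to \frac{6}{5}\, c, \]
hence the series converges if $c < 5/6$ and diverges if $c > 5/6$. If $c = 5/6$, then
\[ a_n = \frac{8 \cdot 14 \cdot 20 \cdots (6n+2)\, 5^n}{6 \cdot 11 \cdot 16 \cdots (5n+1)\, 6^n} = \frac{(1 + 1/3)(2 + 1/3) \cdots (n + 1/3)}{(1 + 1/5)(2 + 1/5) \cdots (n + 1/5)} > 1, \]
so the series diverges in this case as well.
(b) For the series
\[ \sum_{n=1}^{\infty} (n!)^p\, r^{n^2}, \quad r > 0,\ p \in \mathbb{R}, \]
the ratios are
\[ \frac{a_{n+1}}{a_n} = (n+1)^p\, r^{2n+1}, \]
hence the series converges if $r < 1$ and diverges if $r > 1$. (If $r = 1$, the series reduces to $\sum (n!)^p$, which converges iff $p < 0$.)
(c) For the series
\[ \sum_{n=2}^{\infty} \frac{2^n \ln^2 n}{n!}, \]
\[ \frac{a_{n+1}}{a_n} = \frac{2 \ln^2(n+1)}{(n+1) \ln^2 n} \le \frac{2 \ln^2(n+1)}{n+1} \to 0 \quad (n \ge 3), \]
hence the series converges.
♦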
6.2.11 Root Test. Let $a_n \ge 0$ for all n and set $\rho := \limsup_n a_n^{1/n}$.
(a) If $\rho < 1$, then $\sum a_n$ converges.
(b) If $\rho > 1$, then $\sum a_n$ diverges.
Proof. (a) Let $r \in (\rho, 1)$ and choose N such that $\sup_{n \ge N} a_n^{1/n} < r$. Then $a_n < r^n$ for all $n \ge N$, hence $\sum a_n$ converges by the comparison test.
(b) By 2.4.2, there exists a subsequence $a_{n_k}^{1/n_k} \to \rho$. Then, for all sufficiently large k, $a_{n_k} > 1$, hence the series diverges.
6.2.12 Example. For the series
\[ \sum_{n=1}^{\infty} \bigl( a + (-1)^n b \bigr)^n, \quad \text{where } a > b > 0, \]
$\limsup_n a_n^{1/n} = a + b$, hence the series converges if $a + b < 1$ and diverges if $a + b > 1$. If $a + b = 1$, $a_n \not\to 0$ so the series diverges in this case as well.
♦
6.2.13 Remark. No conclusion regarding the convergence of the series $\sum a_n$ in 6.2.9 and 6.2.11 can be inferred from the relations $\overline{r} \ge 1$, $\underline{r} \le 1$, or $\rho = 1$. Indeed, the series $\sum 1/n^2$ and $\sum 1/n$ satisfy $\overline{r} = \underline{r} = \rho = 1$, yet the first series converges while the second diverges.
♦
In Section 6.3, we consider more refined tests that can detect convergence
or divergence in cases where the ratio or root test fails. Here’s an example:
6.2.14 Example. Let $a_n$ denote the nth term of the series
\[ \sum_{n=1}^{\infty} \Bigl( \sqrt{an + b\sqrt{n}} - \sqrt{an} \Bigr)^n, \]
where $a, b > 0$. Then
\[ a_n^{1/n} = \sqrt{an + b\sqrt{n}} - \sqrt{an} = \frac{b\sqrt{n}}{\sqrt{an + b\sqrt{n}} + \sqrt{an}} \to \frac{b}{2\sqrt{a}}, \]
hence the series $\sum a_n$ converges if $b^2 < 4a$ and diverges if $b^2 > 4a$. If $b^2 = 4a$, the root test fails but the log test (6.3.4) shows that the series converges in this case. (Exercise 6.3.11.)
♦
6.2.15 Remark. By Exercise 2.4.12, if $a_n > 0$ for all n, then
\[ \liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n a_n^{1/n} \le \limsup_n a_n^{1/n} \le \limsup_n \frac{a_{n+1}}{a_n}. \]
This shows that if the ratio test determines convergence or divergence conclusively, then so does the root test. It also suggests that the root test may be effective when the ratio test fails.
♦
6.2.16 Example. Let $a_n = s^n \delta_{n-1} + t^n \delta_n$, where $\delta_n = \frac{1}{2}[1 + (-1)^n]$ and $0 < s < t < 1$. Then
\[ a_n = \begin{cases} s^n & \text{if } n \text{ is odd}, \\ t^n & \text{if } n \text{ is even}, \end{cases} \]
so the ratios $a_{n+1}/a_n$ are $s^{n+1}/t^n$ or $t^{n+1}/s^n$, and the roots $a_n^{1/n}$ are s or t, depending on the parity of n. Therefore, $\underline{r} = 0$, $\overline{r} = +\infty$ and $\rho = t$, which shows that the root test detects the convergence of $\sum a_n$ while the ratio test does not.
♦
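The behavior described in 6.2.16 shows up clearly in a small computation. The following sketch uses the illustrative values s = 0.3 and t = 0.6 (chosen here, not taken from the text) and lists the ratios and nth roots for the first 40 indices.

```python
def a_term(n: int, s: float, t: float) -> float:
    # a_n = s^n for odd n and t^n for even n, as in 6.2.16
    return s ** n if n % 2 == 1 else t ** n

s, t = 0.3, 0.6  # sample values with 0 < s < t < 1

ratios = [a_term(n + 1, s, t) / a_term(n, s, t) for n in range(1, 41)]
roots = [a_term(n, s, t) ** (1.0 / n) for n in range(1, 41)]

# The ratios oscillate between very small and very large values,
# while every n-th root is (up to rounding) either s or t.
print(min(ratios), max(ratios))
print(min(roots), max(roots))
```

The spread of the ratios reflects $\underline{r} = 0$ and $\overline{r} = +\infty$, whereas the largest root stays at t < 1, which is what the root test needs.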
Exercises
1.S Determine whether the series $\sum a_n$ converges or diverges, where $a_n =$
(a) $\dfrac{n!}{3 \cdot 5 \cdots (2n+1)}$. (b) $\dfrac{3 \cdot 5 \cdots (2n+1)}{(2n+1)!}$. (c) $\dfrac{3 \cdot 6 \cdots (3n)}{3 \cdot 5 \cdots (2n+1)}$.
(d) $\dfrac{4^n\, n!}{5 \cdot 8 \cdots (3n+2)}$. (e) $\dfrac{2 \cdot 4 \cdots (2n)}{4 \cdot 7 \cdots (3n+1)}$. (f) $\dfrac{4 \cdot 7 \cdots (3n+1)}{5 \cdot 9 \cdots (4n+1)}$.
2. Determine whether the series $\sum a_n$ converges or diverges, where $a_n =$
(a) S
(d) S
n3
.
2n
n!
.
nn
2
ln n
.
n1.1
1
(f) 1+1/n .
n
(b) (1 + r/n)n , r > 0. (c)
(e) (n1/n − r)n .
2n
1
.
(h)
.
n(ln n)(ln ln n)p
n!
√
rn
, r 6= ±1.
(k)
(j) S sin2 (1/ n).
1 − rn
n + sin n
n + ln n
(m) S 3
.
(n) r
.
n + sin n
n ln n
n!
3n n!
(q)
.
(p) S n .
n
(1.1)n3
1
1
(s) S ln n .
(t) ln n .
2
3
3
3n + 4n
.
(w) (1 − r/n)n , r > 0.
(v) S n
8 − 6n
(g) S
(i) sin2 (1/n).
1
, r 6= ±1.
(1 − rn )2
1
(o) r
.
n ln n
n
1 + an
(r)
, a, b > 0.
1 + bn
(l)
(u) rsin n , r > 0.
(x) 1/rln n , r > 0.
3. Let $a_n > 0$ for all n and suppose that $\sum a_n$ diverges. Prove that $\sum a_n b_n$ diverges for all sequences $\{b_n\}$ with $\liminf_n b_n > 0$.
4. Let $b_n \to p > 0$. Prove that $\sum_{n=1}^{\infty} n^{-b_n}$ converges if $p > 1$ and diverges if $p < 1$. Give an example of a sequence $\{b_n\}$ with $b_n > 1$ for all n and $b_n \downarrow 1$ such that $\sum n^{-b_n}$ diverges.
5.S Let $a_n > 0$ for all n. Prove that $\sum a_n$ converges iff $\sum n^{1/n} a_n$ converges.
6. Find all values of $a, b, p, q > 0$ for which $\sum_{n=1}^{\infty} a_n$ converges if $a_n =$
lnp n
1
.
(c)
.
q
q
n
n lnp n
−1
n
Y
(n + 1)p − np
qp
1/q
p
S
n/2
(d) (n + 1) − n . (e)
. (f) p n!
pj + 1 .
nq
j=1
n
a + np
1 + anp
1 + anp
(g) S
.
(h)
.
(i)
.
b + nq
1 + bnq
1 + bnq
(a) S
1
.
lnp n
(b)
7. Let $\{a_n\}$ be positive and decreasing. Prove that $\sum a_n$ converges iff $\sum a_{2n}$ converges.
8. Let $a_n > 0$ for all n. Prove or disprove: If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} b_n$ converges, where $b_n =$
(a) S a2n .
(b)
√
X
(c)
an .
(d) S min aj .
aj .
n≤j≤2n
n≤j≤n+m
(e) max aj . (f)
n≤j≤2n
(i)
n
X
an aj .
j=1
X
(g)
aj .
1
an
Y
aj . (k)
n<j≤n+m
9. Let an > 0. Prove that if
(h) S
aj .
n≤j≤2n
n≤j≤2n
(j)
Y
1
an
X
Y
aj .
1≤j≤n
aj . (l) S
n<j≤n+m
a 3/4
n
n
.
P
an converges, then (1 − cos an ) converges.
n
P
10. Let r > 0 and p > 3/r. Prove that
bn /(rp − 3) converges for all
sequences {bn } with bn → r.
11.S Let $a_n, b_n > 0$ and $a_{n+1}/a_n \le b_{n+1}/b_n$ for all n. Prove that if $\sum b_n$ converges, then so does $\sum a_n$.
12. Let $a_n > 0$. Show that $\sum a_n$ converges iff $\sum f(a_n)$ converges, where $f(x) =$
(a) $\sin x$. (b) $\tan x$. (c) $\sin^{-1} x$. (d) $\tan^{-1} x$.
(e) $\dfrac{x}{1 + ax}$. (f) $\ln(1 + x)$. (g) $e^x - 1$. (h) $x^3 + x^2 + x$.
13. Let $\{p_n\}$ be a sequence in $\mathbb{Z}^+$ and $\{a_n\}$ a sequence of positive reals.
(a) Prove that if $\sum_n a_n$ converges, then $\sum_n \sqrt{a_n a_{n+p_n}}$ converges, provided that either $\{p_n\}$ is bounded or $a_n$ is decreasing.
(b) Suppose $\{p_n\}$ is bounded and $a_n \downarrow 0$. Prove that if $\sum_n \sqrt{a_n a_{n+p_n}}$ converges, then $\sum_n a_n$ converges.
Does (b) hold if $\{a_n\}$ is not monotone or $\{p_n\}$ is not bounded?
14.S Let g be positive and differentiable on $[1, \infty)$ such that $\lim_{x\to\infty} g(x) = 0$, and let f be differentiable in a neighborhood of 0 such that $f(0) = 0$, $f(x) > 0$ for $x > 0$, $f'$ is continuous at 0, and $f'(0) > 0$. Prove that $\sum_{n=1}^{\infty} f(g(n))$ converges iff $\sum_{n=1}^{\infty} g(n)$ converges.
15. Let $f : \mathbb{R} \to [0, +\infty)$ be twice differentiable and $p > 0$. Prove:
(a)S If $p \le 1$ and $\sum f(1/n^p)$ converges, then $f(0) = f'(0) = 0$.
(b) If $p \ge 1$ and $f(0) = f'(0) = 0$, then $\sum f(1/n^p)$ converges.
16. Let $a_n \ge 0$ for all n and suppose that $\sum a_n$ converges. Prove that if $\alpha > 1/2$, then $\sum n^{-\alpha} \sqrt{a_n}$ converges. Give an example which shows that the assertion is false if $\alpha = 1/2$.
17.S This exercise shows that e is irrational. Assume, for a contradiction, that $e = m/n$, $m, n \in \mathbb{N}$. Let $s_n = \sum_{k=0}^{n} 1/k!$. Using the series representation $e = \sum_{k=0}^{\infty} 1/k!$, show that
(a) $n!(e - s_n) \in \mathbb{N}$.
(b) $n!(e - s_n) < \sum_{k=1}^{\infty} (n+1)^{-k} = 1/n$.
Conclude that e must be irrational.
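18. Let $s_n = \sum_{k=1}^{n} k^{-p}$, $0 < p < 1$. Show that $\{s_n - (1-p)^{-1} n^{1-p}\}$ converges. Conclude that
\[ \lim_{n\to+\infty} \frac{1}{n^q} \sum_{k=1}^{n} \frac{1}{k^p} = \begin{cases} 0 & \text{if } p + q > 1, \\[2pt] \dfrac{1}{1-p} & \text{if } p + q = 1, \\[6pt] +\infty & \text{if } p + q < 1. \end{cases} \]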
19. Let $s_n = \sum_{k=1}^{n} a_k$ and suppose that $\sum a_n^2 n^{-p} < +\infty$, where $a_n, p > 0$. Prove that $\lim_n s_n n^{-q} = 0$ for all $q > (p+1)/2$.
20. Let $\{a_n\}$, $\{b_n\}$, and $\{c_n\}$ be positive sequences such that $\sum c_n$ diverges, $b_n \to b \in (0, +\infty]$, and $a_n/a_{n+1} = 1 + b_n c_n$. Prove that $a_n \to 0$.
Hint. Let $r \in (0, b)$ and choose m so that $b_n > r$ for all $n \ge m$. Then $a_{m+k}/a_{m+k+1} > 1 + r c_{m+k}$ for all $k \ge 0$.
6.3 More Refined Convergence Tests
The tests in this section are frequently useful when the root and ratio tests
fail. The first is a generalization of the ratio test.
6.3.1 Kummer's Test. Let $a_n, b_n > 0$ for all n and set
\[ c_n := \frac{a_n}{a_{n+1}}\, b_n - b_{n+1}. \]
(a) If $\underline{c} := \liminf_n c_n > 0$, then $\sum_n a_n$ converges.
(b) If $\overline{c} := \limsup_n c_n < 0$ and $\sum_n b_n^{-1}$ diverges, then $\sum_n a_n$ diverges.
Proof. (a) Set $s_n = \sum_{k=1}^{n} a_k$ and let $r \in (0, \underline{c})$. Choose N so that $c_n \ge r$ for all $n \ge N$. Since $a_n b_n - a_{n+1} b_{n+1} = c_n a_{n+1}$, for all $m > N$ we have
\[ a_N b_N \ge a_N b_N - a_m b_m = \sum_{n=N}^{m-1} (a_n b_n - a_{n+1} b_{n+1}) \ge r \sum_{n=N}^{m-1} a_{n+1} = r(s_m - s_N), \]
hence $s_m \le s_N + a_N b_N / r$. The partial sums of $\sum a_n$ are therefore bounded so the series converges.
(b) If $\overline{c} < 0$, there exists an N such that
\[ a_k b_k - a_{k+1} b_{k+1} < 0 \quad \text{for all } k \ge N. \]
Then
\[ a_N b_N - a_n b_n = \sum_{k=N}^{n-1} (a_k b_k - a_{k+1} b_{k+1}) < 0, \]
so $a_n > (a_N b_N)/b_n$ for all $n > N$. Since $\sum 1/b_n$ diverges, $\sum a_n$ diverges by the comparison test.
A simple but important consequence of Kummer's test is
6.3.2 Raabe's Test. Let $a_n > 0$ for all n and set
\[ d_n := n \Bigl( \frac{a_n}{a_{n+1}} - 1 \Bigr). \]
(a) If $\underline{d} := \liminf_n d_n > 1$, then $\sum a_n$ converges.
(b) If $\overline{d} := \limsup_n d_n < 1$, then $\sum a_n$ diverges.
Proof. Take $b_n = n$ in Kummer's test, so
\[ c_n = \frac{a_n}{a_{n+1}}\, n - (n+1) = d_n - 1. \]
Then $\underline{c} = \underline{d} - 1$ and $\overline{c} = \overline{d} - 1$ and the assertions follow.
6.3.3 Example. We use Raabe's test to show that the series
\[ \sum_n \frac{1}{(n+m)!} \prod_{k=1}^{n} (k + a), \quad \text{where } a > 0 \text{ and } m \in \mathbb{N}, \]
converges iff $m > 1 + a$. Indeed, since
\[ n \Bigl( \frac{a_n}{a_{n+1}} - 1 \Bigr) = n \Bigl( \frac{n+m+1}{n+1+a} - 1 \Bigr) = \frac{n(m-a)}{n+1+a} \to m - a, \]
the series converges if $m - a > 1$ and diverges if $m - a < 1$. If $m - a = 1$, then the general term reduces to
\[ \frac{1}{(n+m)!} \prod_{k=1}^{n} (m - 1 + k) = \frac{m(m+1) \cdots (m+n-1)}{(n+m)!} = \frac{1}{(m+n)(m-1)!}, \]
hence the series diverges in this case as well. Note that the ratio test is inconclusive in this example since $a_{n+1}/a_n \to 1$.
♦
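The Raabe quantity in 6.3.3 can be computed directly from the terms of the series and compared with the closed form $n(m-a)/(n+1+a)$. The sketch below uses exact rational arithmetic with the sample values a = 1/2 and m = 3, which are chosen here for illustration; the function names are likewise hypothetical.

```python
from fractions import Fraction
from math import factorial

def general_term(n: int, a: Fraction, m: int) -> Fraction:
    # a_n = (1/(n+m)!) * prod_{k=1}^{n} (k + a), the term of the series in 6.3.3
    prod = Fraction(1)
    for k in range(1, n + 1):
        prod *= k + a
    return prod / factorial(n + m)

def raabe(n: int, a: Fraction, m: int) -> Fraction:
    # d_n = n (a_n / a_{n+1} - 1), computed from the terms themselves
    return n * (general_term(n, a, m) / general_term(n + 1, a, m) - 1)

a, m = Fraction(1, 2), 3   # sample parameters; here m - a = 5/2 > 1
for n in (10, 50, 500):
    print(n, float(raabe(n, a, m)))
```

The printed values increase toward m - a = 2.5, so with these parameters Raabe's test gives convergence, in agreement with the criterion m > 1 + a.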
The following test is sometimes useful when the root test fails.
6.3.4 Log Test. Let $a_n > 0$ for all n and set $c_n := \ln(a_n^{-1})/\ln n$.
(a) If $\underline{c} := \liminf_n c_n > 1$, then $\sum a_n$ converges.
(b) If $\overline{c} := \limsup_n c_n < 1$, then $\sum a_n$ diverges.
Proof. (a) Let $p \in (1, \underline{c})$. Then there exists N such that $c_n > p$ for all $n \ge N$. For such n, $\ln(a_n^{-1}) > \ln n^p$, hence $a_n < 1/n^p$. Since $p > 1$, $\sum a_n$ converges by the comparison test. The proof of (b) is similar.
6.3.5 Example. Let $a_n$ denote the general term of the series
\[ \sum_{n=1}^{\infty} \Bigl( \frac{a + n^p}{b + n^q} \Bigr)^n, \]
where $a, b, p, q > 0$. The root test shows that the series converges if $p < q$ and diverges if $p > q$. If $p = q$, the test is inconclusive, so we consider cases. If $a \ge b$, then $a_n \ge 1$ and the series diverges. If $a < b$, we use the log test: By l'Hospital's rule, the sequence
\[ c_n = \frac{-\ln a_n}{\ln n} = \frac{\ln(b + n^p) - \ln(a + n^p)}{(\ln n)/n} \]
has the same limit as
\[ \frac{\dfrac{p n^{p-1}}{b + n^p} - \dfrac{p n^{p-1}}{a + n^p}}{\dfrac{1 - \ln n}{n^2}} = \frac{p n^{p+1}}{1 - \ln n} \cdot \frac{(a + n^p) - (b + n^p)}{(a + n^p)(b + n^p)} = \frac{p(a - b)}{b/n^p + 1} \cdot \frac{n}{(1 - \ln n)(a + n^p)}. \]
The first quotient in the last expression tends to $p(a - b) < 0$. By l'Hospital's rule, the second quotient has the same limit as
\[ \frac{1}{(1 - \ln n)(p n^{p-1}) - (a + n^p)/n} = \frac{-n^{1-p}}{p(\ln n - 1) + (a/n^p + 1)}, \]
which converges to 0 if $p \ge 1$ and to $-\infty$ if $p < 1$. Thus if $p = q$ and $a < b$, then
\[ \lim_n c_n = \begin{cases} 0 & \text{if } p \ge 1, \\ +\infty & \text{if } p < 1, \end{cases} \]
hence $\sum a_n$ converges iff $p < 1$.
♦
Exercises
1. Show that the ratio test is a consequence of Kummer’s test.
2. Show that Raabe's test detects the convergence properties of the p-series $\sum 1/n^p$ for $p \ne 1$, whereas the ratio and root tests do not.
3.S Use Raabe's test to determine the convergence of $\sum a_n$ if $a_n =$
(a) $\displaystyle \prod_{k=1}^{n} \frac{3k-1}{3k+1}$. (b) $\displaystyle \frac{1}{2n} \prod_{k=1}^{n} \frac{2k-1}{2k}$. (c) $\displaystyle \frac{1}{3^n (n+1)!} \prod_{k=1}^{n} (3k+1)$.
Show that the ratio test is inconclusive in each case.
4. Let $a, b > 0$ and $m \in \mathbb{N}$. Use Raabe's test to show that the following series converges iff $b - a > m$:
\[ \sum_n \prod_{k=1}^{n} (mk + a) \Bigl[ \prod_{k=1}^{n} (mk + b) \Bigr]^{-1}. \]
5. Find all values of $p > 0$ for which the series converge:
(a)S $\displaystyle \sum_n \frac{p^n\, n!}{n^n}$. (b) $\displaystyle \sum_n \frac{p^n\, n!}{(p+1)(2p+1) \cdots (np+1)}$.
What does the ratio test reveal?
6.S Show that the series
\[ \sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdots (2n-1)}{(2+p)(4+p) \cdots (2n+p)} \]
converges iff $p > 1$.
7. Let $p \in \mathbb{N}$. Use Raabe's test to show that the series
\[ \sum \frac{(pn)!}{p^{pn} (n!)^p} \]
converges if $p > 3$ and diverges if $p < 3$. What does the ratio test tell us for these values of p?
8. Let a, b, c > 0 and m ∈ Z+ . Use Raabe’s test to show that the series
∞
n X
1 Y ak + b
nm
ak + c
n=1
k=1
converges iff c > (m + 1)a + b.
9. Let $b > 0$ and $m \in \mathbb{N}$. Use Raabe's test to show that
\[ \sum_{n=1}^{\infty} \Bigl( \prod_{k=1}^{n} \frac{kb}{kb+1} \Bigr)^{m} \]
converges if $m > b$ and diverges if $m < b$. What does the ratio test reveal? What happens if $m = b = 1$?
10. Let
P r > 0. Use the log test to determine the convergence behavior of
an if an =
1
1
1
(a)S rln ln n .
(b)
.
(c)
.
(d)S
.
r
ln
n
r
ln
ln
n
n
n
(ln n)rn
11. Let $a_n$ be as in 6.2.14. Use the log test to verify that $\sum a_n$ converges if $b^2 = 4a$.
12. Let $p, r > 0$. Use the log test to verify that $\sum r^{(n^p)}$ converges iff $r < 1$.
13.S Let $b_n \to b > 0$. Use the log test to verify that $\sum b_n^{-\ln n}$ converges if $b > e$ and diverges if $b < e$.
14. Use the log test to show that the series $\sum (1 - 1/n)^{(n^p)}$ converges iff $p > 1$.
15. Let $p > 0$ and $a \ne 0$. Use the log test to verify that $\sum (1 - a/n^p)^n$ diverges if $p \ge 1$, converges if $0 < p < 1$ and $a > 0$, and diverges if $0 < p < 1$ and $a < 0$. What does the root test reveal?
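16. Let $a, b, p, q > 0$. Determine the convergence behavior of $\sum a_n$ if $a_n =$
(a)S $\Bigl( \dfrac{a + n^p}{b + n^q} \Bigr)^{\ln n}$. (b) $\Bigl( \dfrac{a + n^p}{b + n^q} \Bigr)^{\ln\ln n}$. (c) $\Bigl( \dfrac{1 + a n^p}{1 + b n^q} \Bigr)^{\ln\ln n}$.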
17. Show that $\sum (\ln n)^{b_n}$ diverges if $\{b_n\}$ is bounded. What happens in the unbounded special cases (a) $b_n = -\ln n$ and (b) $b_n = -n^p$, $p > 0$? What does the root test reveal in (b)?
18.S (Loglog test) Let $a_n > 0$ for all n and set
\[ c_n = -\frac{\ln(n a_n)}{\ln\ln n}, \quad \underline{c} := \liminf_n c_n, \quad \text{and} \quad \overline{c} := \limsup_n c_n. \]
Prove that $\sum a_n$ converges if $\underline{c} > 1$ and diverges if $\overline{c} < 1$. Use the test to determine the convergence behavior of
\[ \sum_{n=2}^{\infty} \Bigl( \frac{1 + an}{1 + bn} \Bigr)^{\ln\ln n}, \quad a, b > 0. \]
19. Let $a, b > 0$. Use the log test to show that
\[ \sum_n \Bigl( \frac{1 + a n^p}{1 + b n^q} \Bigr)^{\ln n} \]
diverges if $p > q$; converges if $p < q$; and if $p = q$, then converges if $b/a > e$ and diverges if $b/a < e$. Use the loglog test to show that the series also diverges if $p = q$ and $b/a = e$.
20. Use Kummer's test to prove Gauss's test: Let $a_n > 0$ for all n and let $\{\alpha_n\}$ be a bounded sequence such that
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{r}{n} - \frac{\alpha_n}{n^s}, \]
where $r, s \in \mathbb{R}$, $s > 1$. Then $\sum a_n$ converges iff $r > 1$.
21.S Use Kummer’s test to prove Bertrand’s test: Let an > 0 for all n and
let {βn } be a sequence such that
βn
an
1
.
=1+ −
an+1
n n ln n
Then $\sum a_n$ converges if $\liminf_n \beta_n > 1$ and diverges if $\limsup_n \beta_n < 1$.
6.4 Absolute and Conditional Convergence
The convergence tests in Sections 6.2 and 6.3 apply only to series with
nonnegative terms. In this section we consider tests applicable to general series.
6.4.1 Definition. A series $\sum a_n$ is said to converge absolutely if $\sum |a_n|$ converges. A convergent series that does not converge absolutely is said to converge conditionally.
♦
6.4.2 Theorem. (a) If $\sum a_n$ converges absolutely, then the series
\[ \sum a_n, \quad \sum a_n^+, \quad \text{and} \quad \sum a_n^- \]
converge and
\[ \sum a_n = \sum a_n^+ - \sum a_n^-, \qquad \sum |a_n| = \sum a_n^+ + \sum a_n^-. \]
(b) If $\sum a_n$ converges conditionally, then $\sum a_n^+$ and $\sum a_n^-$ diverge.
Proof. (a) If $\sum |a_n|$ converges, then the inequalities
\[ 0 \le a_n^{\pm} = \tfrac{1}{2} \bigl( |a_n| \pm a_n \bigr) \le |a_n| \]
and the comparison test show that $\sum a_n^+$ and $\sum a_n^-$ converge. The remaining assertions in (a) follow from the identities $a_n = a_n^+ - a_n^-$ and $|a_n| = a_n^+ + a_n^-$.
(b) If $\sum a_n$ and $\sum a_n^-$ converge, then $\sum |a_n| = \sum a_n + 2 \sum a_n^-$ converges. The same conclusion holds if $\sum a_n$ and $\sum a_n^+$ converge. Hence if $\sum a_n$ converges conditionally, then neither $\sum a_n^+$ nor $\sum a_n^-$ can converge.
All series of the form $\sum_{n=1}^{\infty} (-1)^{n+1}/n^p$, $0 < p \le 1$, converge conditionally. This follows from the alternating series test given below. The following example is somewhat more interesting.
6.4.3 Example. We show that the series
\[ s := \sum_{n=2}^{\infty} \bigl[ (-1)^n n^p - 1 \bigr]^{-1} \]
converges conditionally iff $1/2 < p \le 1$ and absolutely iff $p > 1$.
To see this, note first that if $p < 0$, then the nth term of the series does not tend to zero, and if $p = 0$ the series is undefined. So assume $p > 0$. If $s_n$ denotes the nth partial sum of the series, then
\[ s_{2n+1} = \sum_{k=1}^{n} \Bigl[ \frac{1}{(2k)^p - 1} - \frac{1}{(2k+1)^p + 1} \Bigr] = \sum_{k=1}^{n} (\alpha_k + \beta_k), \tag{6.5} \]
where
\[ \alpha_k := \frac{(2k+1)^p - (2k)^p}{\bigl[ (2k)^p - 1 \bigr] \bigl[ (2k+1)^p + 1 \bigr]} \quad \text{and} \quad \beta_k := \frac{2}{\bigl[ (2k)^p - 1 \bigr] \bigl[ (2k+1)^p + 1 \bigr]}. \]
By the mean value theorem applied to $x^p$ on the interval $[2k, 2k+1]$,
\[ \alpha_k = \frac{p x_k^{p-1}}{\bigl[ (2k)^p - 1 \bigr] \bigl[ (2k+1)^p + 1 \bigr]}, \quad \text{for some } x_k \in (2k, 2k+1). \]
If $0 < p \le 1$, then
\[ \alpha_k \le \frac{p}{(2k)^{1-p} \bigl[ (2k)^p - 1 \bigr] \bigl[ (2k)^p + 1 \bigr]} = \frac{p}{(2k)^{1-p} \bigl[ (2k)^{2p} - 1 \bigr]} \le \frac{1}{k^{p+1}}, \]
the last inequality for sufficiently large k. Therefore, $\sum_k \alpha_k$ converges by comparison with $\sum_k 1/k^{p+1}$. Also, since
\[ \frac{\beta_k}{k^{-2p}} = \frac{2}{[2^p - k^{-p}][(2 + 1/k)^p + k^{-p}]} \to \frac{1}{2^{2p-1}}, \]
the limit comparison test shows that $\sum_k \beta_k$ converges iff $p > 1/2$. Therefore the partial sum (6.5) has a finite limit iff $p > 1/2$. Since $s_{2n+1} - s_{2n} \to 0$, the series s converges iff $p > 1/2$. Since $n^p - 1 \le |(-1)^n n^p - 1| \le n^p + 1$, s converges absolutely iff $p > 1$.
♦
The tests of Sections 6.2 and 6.3 for positive-term series may be used in conjunction with 6.4.2 to test series with terms of mixed sign. For example, the inequality $|n^{-2} \sin n| \le n^{-2}$, together with the comparison test, shows that the series $\sum n^{-2} \sin n$ converges absolutely and hence converges.
The remainder of the section describes tests that are useful for establishing
conditional convergence. They rely on the following discrete analog of the
integration-by-parts formula, due to Abel.
6.4.4 Summation by Parts. Let $\{a_n\}$, $\{b_n\}$, and $\{s_n\}$ be sequences such that $s_0 = 0$ and $s_k - s_{k-1} = a_k$, $k \ge 1$. Then, for $m > n \ge 1$,
\[ \sum_{k=n}^{m} a_k b_k = \sum_{k=n}^{m-1} s_k (b_k - b_{k+1}) + s_m b_m - s_{n-1} b_n. \]
Proof. Since $a_k = s_k - s_{k-1}$,
\[ \sum_{k=n}^{m} a_k b_k = \sum_{k=n}^{m} s_k b_k - \sum_{k=n}^{m} s_{k-1} b_k = \sum_{k=n}^{m} s_k b_k - \sum_{k=n-1}^{m-1} s_k b_{k+1}. \]
Combining the last two sums yields the desired formula.
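The summation-by-parts identity is purely algebraic, so it can be confirmed on arbitrary data. The following sketch evaluates both sides with exact rationals for sample sequences chosen here for illustration; the helper name `by_parts_sides` and the 1-indexed list layout are conventions of this sketch, not of the text.

```python
from fractions import Fraction
from itertools import accumulate

def by_parts_sides(a, b, n, m):
    """Return (left, right) of the identity in 6.4.4.

    a and b are 1-indexed lists (index 0 is an unused placeholder);
    requires m > n >= 1 and m < len(a).
    """
    # s[0] = 0 and s[k] = a_1 + ... + a_k, so that s[k] - s[k-1] = a[k]
    s = [Fraction(0)] + list(accumulate(a[1:]))
    left = sum(a[k] * b[k] for k in range(n, m + 1))
    right = (sum(s[k] * (b[k] - b[k + 1]) for k in range(n, m))
             + s[m] * b[m] - s[n - 1] * b[n])
    return left, right

N = 12
a = [None] + [Fraction((-1) ** k, k) for k in range(1, N + 1)]      # sample a_k
b = [None] + [Fraction(1, k * k + 1) for k in range(1, N + 1)]      # sample b_k
left, right = by_parts_sides(a, b, 3, 10)
print(left == right)
```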
6.4.5 Dirichlet's Test. Let $\{a_n\}$ and $\{b_n\}$ be sequences such that the following conditions hold:
(a) The partial sums of $\sum a_n$ are bounded.
(b) $b_n \to 0$, and
(c) The series $\sum |b_{n+1} - b_n|$ converges, which is the case, for example, if $\{b_n\}$ is monotone.
Then $\sum a_n b_n$ converges.
Proof. Let
\[ s_n := \sum_{k=1}^{n} a_k \quad \text{and} \quad t_n := \sum_{k=1}^{n} a_k b_k. \]
If $|s_n| \le M$ for every n, then, by 6.4.4,
\[ |t_m - t_{n-1}| = \Bigl| \sum_{k=n}^{m} a_k b_k \Bigr| \le M \sum_{k=n}^{m} |b_k - b_{k+1}| + M(|b_n| + |b_m|), \quad m \ge n > 1. \]
Since the right side of the inequality tends to 0 as $m, n \to \infty$, $\{t_n\}$ is a Cauchy sequence and hence converges. If $\{b_n\}$ is monotone, say decreasing, then
\[ \sum_{k=1}^{n} |b_{k+1} - b_k| = \sum_{k=1}^{n} (b_k - b_{k+1}) = b_1 - b_{n+1}, \]
which converges.
6.4.6 Example. We apply Dirichlet's test to the series $\sum_{n=1}^{\infty} b_n \sin(n\theta)$, where $\{b_n\}$ is monotone and $b_n \to 0$. To establish the boundedness of the sequence of partial sums $s_n := \sum_{k=1}^{n} \sin(k\theta)$, we use the identity
\[ 2 \sin(\theta/2) \sin(k\theta) = \cos\bigl( (k - 1/2)\theta \bigr) - \cos\bigl( (k + 1/2)\theta \bigr). \]
Summing,
\[ 2 \sin(\theta/2) \sum_{k=1}^{n} \sin(k\theta) = \cos(\theta/2) - \cos\bigl( (n + 1/2)\theta \bigr). \]
Thus if $\theta$ is not a multiple of $2\pi$, then $|s_n| \le |\sin(\theta/2)|^{-1}$. By 6.4.5, $\sum_{n=1}^{\infty} b_n \sin(n\theta)$ converges for all $\theta$ (if $\theta$ is a multiple of $2\pi$, every term is zero). Note that if, for example, $\theta = \pi/2$ and $b_n = 1/n$, then the convergence is conditional.
♦
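The bound $|s_n| \le |\sin(\theta/2)|^{-1}$ on the partial sums of $\sum \sin(k\theta)$ is the hypothesis that makes Dirichlet's test applicable, and it can be observed numerically. The sketch below is an illustration with sample angles chosen here; `max_abs_partial_sum` is a hypothetical helper name.

```python
import math

def max_abs_partial_sum(theta: float, N: int) -> float:
    # Largest |s_n| for n = 1..N, where s_n = sum_{k=1}^{n} sin(k * theta)
    s, worst = 0.0, 0.0
    for k in range(1, N + 1):
        s += math.sin(k * theta)
        worst = max(worst, abs(s))
    return worst

for theta in (0.1, 1.0, 2.5, math.pi / 2):
    bound = 1.0 / abs(math.sin(theta / 2))
    print(f"theta = {theta:.4f}: max |s_n| = {max_abs_partial_sum(theta, 5000):.6f}, bound = {bound:.6f}")
```

The observed maxima stay below the bound however many terms are taken, in contrast to the unbounded partial sums that occur for the harmonic series.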
6.4.7 Alternating Series Test. If $b_n \downarrow 0$, then $\sum_{n=1}^{\infty} (-1)^{n+1} b_n$ converges.
Proof. The partial sums of $\sum_{n=1}^{\infty} (-1)^{n+1}$ are clearly bounded, hence the assertion follows from 6.4.5.
6.4.8 Example. (Alternating Harmonic Series). By 6.4.7, the series $\sum_{n=1}^{\infty} (-1)^{n+1} n^{-1}$ converges. We show that its value is $\ln 2$. Let
\[ s_n = \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k} \quad \text{and} \quad t_n = \sum_{k=1}^{n} \frac{1}{k} - \ln n. \]
By 6.1.9, the sequence $\{t_n\}$ converges. Also, by Exercise 1.5.3,
\[ s_{2n} = \sum_{k=1}^{2n} \frac{(-1)^{k+1}}{k} = \sum_{k=n+1}^{2n} \frac{1}{k} = t_{2n} - t_n + \ln 2. \]
It follows that $s_{2n} \to \ln 2$. Since $s_{2n+1} - s_{2n} \to 0$, $s_n \to \ln 2$.
♦
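Both steps of 6.4.8 can be checked by machine: the identity $s_{2n} = \sum_{k=n+1}^{2n} 1/k$ holds exactly, and the partial sums approach $\ln 2$. The sketch below is illustrative; the function names `s` and `upper_block` are chosen here.

```python
import math
from fractions import Fraction

def s(n: int) -> Fraction:
    # n-th partial sum of the alternating harmonic series, as an exact rational
    return sum((Fraction((-1) ** (k + 1), k) for k in range(1, n + 1)), Fraction(0))

def upper_block(n: int) -> Fraction:
    # 1/(n+1) + 1/(n+2) + ... + 1/(2n)
    return sum((Fraction(1, k) for k in range(n + 1, 2 * n + 1)), Fraction(0))

print(s(20) == upper_block(10))               # the identity used in 6.4.8
print(float(s(2000)), math.log(2))            # partial sum versus ln 2
```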
The contrast between absolutely convergent and conditionally convergent
series is strikingly displayed in the context of rearrangements.
6.4.9 Definition. A rearrangement of a series $\sum_{n=1}^{\infty} a_n$ is a series $\sum_{k=1}^{\infty} a_{m_k}$, where $\{m_k\}$ is a sequence of positive integers that contains every positive integer exactly once.¹
♦
6.4.10 Theorem. If $\sum_{n=1}^{\infty} a_n$ converges absolutely to s, then any rearrangement $\sum_{k=1}^{\infty} a_{m_k}$ converges absolutely to s.
Proof. Assume first that $a_n \ge 0$ for all n. Let
\[ t_n = \sum_{k=1}^{n} a_{m_k} \quad \text{and} \quad s_n = \sum_{k=1}^{n} a_k. \]
For each N, choose K so large that the terms $a_k$, $1 \le k \le N$, are included among the terms $a_{m_k}$, $1 \le k \le K$. Then $s_N \le t_K \le \sum_{k=1}^{\infty} a_{m_k}$. Letting $N \to \infty$ shows that $\sum_n a_n \le \sum_k a_{m_k}$. Since $\sum_n a_n$ is a rearrangement of $\sum_k a_{m_k}$, the reverse inequality holds as well. The general case follows by considering $\sum a_n^+$ and $\sum a_n^-$ and using 6.4.2.
¹ In other words, $k \mapsto m_k$ is a one-to-one mapping of $\mathbb{N}$ onto itself.
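6.4.11 Example. Consider the series
\[ t := 1 - \frac{1}{2^p} - \frac{1}{4^p} + \frac{1}{3^p} - \frac{1}{6^p} - \frac{1}{8^p} + \frac{1}{5^p} - \frac{1}{10^p} - \frac{1}{12^p} + \cdots, \]
which is a rearrangement of the alternating series
\[ s := 1 - \frac{1}{2^p} + \frac{1}{3^p} - \frac{1}{4^p} + \frac{1}{5^p} - \frac{1}{6^p} + \frac{1}{7^p} - \frac{1}{8^p} + \frac{1}{9^p} - \frac{1}{10^p} + \cdots. \]
If $p > 1$, then both series converge absolutely and $t = s$. If $p = 1$, then the two series converge to different values. Indeed, if $s_n$ and $t_n$ denote the nth partial sums of s and t, respectively, then
\[ t_{3n} = \sum_{k=1}^{n} \Bigl( \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} \Bigr) = \frac{1}{2} \sum_{k=1}^{n} \Bigl( \frac{1}{2k-1} - \frac{1}{2k} \Bigr) = \frac{s_{2n}}{2} \to \frac{s}{2}. \]
Since
\[ t_{3n+1} = t_{3n} + \frac{1}{2n+1} \quad \text{and} \quad t_{3n+2} = t_{3n+1} - \frac{1}{4n+2}, \]
we see that $t_n \to s/2$.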
♦
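For p = 1 the relation $t_{3n} = s_{2n}/2$ in 6.4.11 is an exact identity between rational numbers, which the following sketch verifies; it also compares a partial sum of the rearranged series with $(\ln 2)/2$, using the value $s = \ln 2$ from 6.4.8. The function names are illustrative choices.

```python
import math
from fractions import Fraction

def s_partial(n: int) -> Fraction:
    # n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ...
    return sum((Fraction((-1) ** (k + 1), k) for k in range(1, n + 1)), Fraction(0))

def t_3n(n: int) -> Fraction:
    # Sum of the first n blocks 1/(2k-1) - 1/(4k-2) - 1/(4k) of the rearranged series
    return sum((Fraction(1, 2 * k - 1) - Fraction(1, 4 * k - 2) - Fraction(1, 4 * k)
                for k in range(1, n + 1)), Fraction(0))

print(t_3n(15) == s_partial(30) / 2)
print(float(t_3n(500)), math.log(2) / 2)
```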
The phenomenon illustrated in the last example holds generally, as shown
by the following remarkable result due to Riemann.
6.4.12 Theorem. If $s := \sum_{n=1}^{\infty} a_n$ converges conditionally, then, for any real number x, some rearrangement of s converges to x.
Proof. We may assume that $x \ge 0$. For $n \in \mathbb{N}$ let
\[ s_n := \sum_{j=1}^{n} a_j, \quad s_n^+ := \sum_{j=1}^{n} a_j^+, \quad s_n^- := \sum_{j=1}^{n} a_j^-, \quad \text{and} \quad s_0^+ := 0. \]
Since $s_n^+ \to +\infty$ (6.4.2), there exists a smallest integer $m_1$ such that $s_{m_1}^+ > x$. Since $x \ge 0$, $m_1 \ne 0$. Because $s_n^- \to +\infty$, there exists a smallest positive integer $n_1$ such that $s_{m_1}^+ - s_{n_1}^- < x$ and then a smallest integer $m_2$ such that $s_{m_2}^+ - s_{n_1}^- > x$. Obviously, $m_2 > m_1$. Continuing in this manner, we obtain strictly increasing sequences $\{m_k\}$ and $\{n_k\}$ with the following properties:
• $m_k$ is the smallest integer such that
\[ t_k := s_{m_k}^+ - s_{n_{k-1}}^- = (a_1^+ + \cdots + a_{m_k}^+) - (a_1^- + \cdots + a_{n_{k-1}}^-) > x, \]
• $n_k$ is the smallest integer such that
\[ r_k := s_{m_k}^+ - s_{n_k}^- = (a_1^+ + \cdots + a_{m_k}^+) - (a_1^- + \cdots + a_{n_k}^-) < x. \]
Now consider the series
\[ s' := a_1^+ + \cdots + a_{m_1}^+ - a_1^- - \cdots - a_{n_1}^- + a_{m_1+1}^+ + \cdots + a_{m_2}^+ - a_{n_1+1}^- - \cdots. \]
The terms of $s'$ are either $a_j$ or 0, and $s'$ contains each term of the series s exactly once. Thus $s'$ is a rearrangement of s. We show that $s' = x$.
By the minimality properties of the sequences $\{m_k\}$ and $\{n_k\}$,
\[ t_k - a_{m_k}^+ \le x < t_k \quad \text{and} \quad r_k < x \le r_k + a_{n_k}^-, \]
hence
\[ x - a_{n_k}^- \le r_k < x < t_k \le x + a_{m_k}^+. \]
Since $a_n \to 0$,
\[ \lim_k r_k = \lim_k t_k = x. \tag{6.6} \]
Now let $s'_k$ denote the kth partial sum of the series $s'$ and consider the partial sums
\[ r_1 = (a_1^+ + \cdots + a_{m_1}^+) - (a_1^- + \cdots + a_{n_1}^-), \]
\[ t_2 = (a_1^+ + \cdots + a_{m_2}^+) - (a_1^- + \cdots + a_{n_1}^-), \]
\[ r_2 = (a_1^+ + \cdots + a_{m_2}^+) - (a_1^- + \cdots + a_{n_2}^-). \]
If $m_1 + n_1 \le k \le m_2 + n_1$, then $s'_k$ includes the terms of $r_1$, additional terms from $a_{m_1+1}^+ + \cdots + a_{m_2}^+$, and no others, hence $r_1 \le s'_k \le t_2$. Similarly, if $m_2 + n_1 \le k \le m_2 + n_2$, then $s'_k$ includes the terms of $t_2$, additional terms from $-a_{n_1+1}^- - \cdots - a_{n_2}^-$, and no others, so $r_2 \le s'_k \le t_2$. In general, for $j \ge 1$,
\[ m_j + n_j \le k \le m_{j+1} + n_j \ \Rightarrow\ r_j \le s'_k \le t_{j+1} \quad \text{and} \]
\[ m_{j+1} + n_j \le k \le m_{j+1} + n_{j+1} \ \Rightarrow\ r_{j+1} \le s'_k \le t_{j+1}. \]
From (6.6), $s'_k \to x$.
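The construction in the proof is effectively an algorithm. The sketch below applies a simplified version of it to the alternating harmonic series: positive terms $1, 1/3, 1/5, \ldots$ are added while the running sum does not exceed the target x, and negative terms $-1/2, -1/4, \ldots$ are added otherwise. This per-term greedy rule is a variant chosen here for brevity rather than the exact block bookkeeping with $m_k$ and $n_k$; the function name and targets are likewise illustrative.

```python
import math
from itertools import count

def rearranged_partial_sums(x: float, n_terms: int) -> list:
    # Positive and negative terms of the alternating harmonic series, each used once, in order.
    pos = (1.0 / k for k in count(1, 2))    # 1, 1/3, 1/5, ...
    neg = (-1.0 / k for k in count(2, 2))   # -1/2, -1/4, ...
    total, sums = 0.0, []
    for _ in range(n_terms):
        # Overshoot x with positive terms, then undershoot with negative ones.
        total += next(pos) if total <= x else next(neg)
        sums.append(total)
    return sums

for target in (1.5, -0.5, math.log(2) / 2):
    sums = rearranged_partial_sums(target, 100000)
    print(f"target = {target:.6f}, partial sum after 100000 terms = {sums[-1]:.6f}")
```

As in the proof, the distance from the running sum to x is controlled by the size of the term used at the most recent crossing, and those terms tend to 0.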
Exercises
1. Suppose that $\sum a_n$ converges absolutely. Prove that $\sum a_n b_n$ converges absolutely for all sequences $\{b_n\}$ with $\limsup_{n\to\infty} |b_n| < +\infty$.
2.S Suppose the ratio test shows that $\sum a_n$ does not converge absolutely. Can $\sum a_n$ still converge conditionally?
3. For an alternating series $s = \sum_{n=1}^{\infty} (-1)^{n+1} b_n$, prove the inequality $|s - s_n| \le b_n$. This result is useful in estimating the error made by using $s_n$ to approximate s. For example, use the estimate to determine how large n should be so that the partial sum $s_n$ agrees with $s = \sum_{n=1}^{\infty} (-1)^{n+1}/n^4$ in nine decimal places.
4. Let $p > 0$. Determine whether the series $\sum a_n$ converges absolutely, conditionally, or diverges, where $a_n =$
(a)S $\dfrac{(-1)^n}{n^{1/n}}$.
(b)S $(-1)^n (n^{1/n} - 1)$.
(c)S $(-1)^n \sin(1/n^p)$.
(d) $(-1)^n \sin^{-1}(1/n^p)$.
(e) $(-1)^n \tan(1/n^p)$.
(f) $(-1)^n \tan^{-1}(1/n^p)$.
sin[(2n + 1)π/2]
.
ln n
√
√
n+1− n
.
(i) S (−1)n
np
3n
.
(k) (−1)n √
n3 + 2
(−1)n
, (p 6= 1).
(m) S n
p + (−1)n
(−1)n n!
(o)
.
3 · 5 · · · (2n + 1)
(−1)n en n!
.
(q)
5 · 8 · · · (3n + 2)
(g)
(h)
(j)
(l)
(n)
(p)
(−2)n
.
n!
(−1)n
.
n lnp (n + 1)
(n!)2
(−1)n pn
.
(2n)!
(−1)n
, (n ≥ 2).
np + (−1)n
(−1)n 3 · 6 · · · (3n)
.
3 · 5 · · · (2n + 1)
(r) (−1)n+1 n[(1)
n
−3]/2
.
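5. Suppose that $\{b_n\}$ is monotone and $b_n \to 0$. Use the identity
$$2 \sin(\theta/2) \cos(n\theta) = \sin\,(n + 1/2)\theta - \sin\,(n - 1/2)\theta$$
to verify that the series $\sum_{n=1}^{\infty} b_n \cos n\theta$ converges if $\theta/(2\pi) \notin \mathbb{Z}$.
6. Let $b_n \downarrow 0$ and $m \in \mathbb{N}$. Show that $\sum_{n=0}^{\infty} (-1)^{\lfloor n/m \rfloor} b_n$ converges.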
7. Let $m \in \mathbb{N}$. Show that
$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1} m}{n(n+m)} = \sum_{n=1}^{m} \frac{(-1)^{n+m+1}}{n} + \delta_m \ln 2,$$
where $\delta_m = 0$ or 2 according as $m$ is even or odd.
8. (Abel) Prove that if $\sum_{n=1}^{\infty} a_n$ converges and $\{b_n\}$ is bounded and monotone, then $\sum_{n=1}^{\infty} a_n b_n$ converges.
9.S Prove that
$$\sum_{n=2}^{\infty} \frac{(n - 1/2) \sin(n\theta)}{n^p + (-1)^n}$$
converges for all real $\theta$ iff $p > 1$.
10. Let $p > 1$. Express each of the series
$$\sum_{n=1}^{\infty} \frac{1}{(2n-1)^p} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{(-1)^n}{n^p}$$
in terms of $\sum_{n=1}^{\infty} \dfrac{1}{n^p}$.
11. Prove that if $\sum n a_n$ converges, then $\sum a_n$ converges and, moreover, $\sum |a_n|^p$ converges for every $p > 1$. What if $p = 1$?
12. Prove that if $\sum n^{-p} a_n$ converges, then $\sum n^{-q} a_n$ converges for all $q > p$.
13.S (a) Let $s_n = \sum_{k=1}^{n} a_k$, where $a_n \to 0$. Suppose there exists a positive integer $q$ such that $s_{nq} \to s \in \mathbb{R}$. Prove that $\sum a_n$ converges to $s$.
(b) Use (a) to sum the series
$$s := 1 + \frac12 + \frac13 - \frac14 - \frac15 - \frac16 + \frac17 + \frac18 + \frac19 - \cdots,$$
where sums of length three alternate signs. Generalize your result to alternating sums of length $p > 1$.
(c) Show that in contrast to (b), the following series diverges, where sums of lengths $p = 3$ and $q = 2$ alternate signs.
$$t := 1 + \frac12 + \frac13 - \frac14 - \frac15 + \frac16 + \frac17 + \frac18 - \frac19 - \cdots.$$
*6.5
Double Sequences and Series
A double sequence is a doubly indexed infinite array
$$\{a_{m,n}\} = \{a_{m,n}\}_{m,n=1}^{\infty}$$
of real numbers $a_{m,n}$.² Associated with each double sequence are the so-called iterated limits
$$\lim_m \lim_n a_{m,n} \quad\text{and}\quad \lim_n \lim_m a_{m,n}.$$
For the first iterated limit to exist, each inner limit bm := limn am,n , as well
as the outer limit limm bm , must exist. Similar remarks apply to the second
iterated limit. The following scheme illustrates the case when the iterated
limits exist and equal L.
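$$\begin{array}{cccccc}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} & \cdots & \to b_1 \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} & \cdots & \to b_2 \\
\vdots & \vdots & & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n} & \cdots & \to b_m \\
\downarrow & \downarrow & & \downarrow & & \downarrow \\
c_1 & c_2 & \cdots & c_n & \cdots & \to L
\end{array}$$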
In addition to iterated limits, a double sequence gives rise to a third type
of limit, frequently called a double limit to distinguish it from iterated limits.
² More precisely, a double sequence is a function $(m, n) \mapsto a_{m,n}$ from $\mathbb{N} \times \mathbb{N}$ to $\mathbb{R}$.
6.5.1 Definition. Let $L \in \mathbb{R}$. We write
$$L = \lim_{m,n} a_{m,n}$$
and say that $a_{m,n}$ converges to $L$ or has limit $L$ if for each $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $|a_{m,n} - L| < \varepsilon$ for all $n, m \ge N$. We also write
$$\lim_{m,n} a_{m,n} = +\infty \ (-\infty)$$
if for each $r \in \mathbb{R}$ there exists $N \in \mathbb{N}$ such that $a_{m,n} > r$ $(< r)$ for all $n, m \ge N$. ♦
Double limits have properties similar to limits of single sequences. For
example, double limit analogs of 2.1.3, 2.1.4, 2.1.5, and 2.1.11, are readily
formulated and proved.
It is easy to find examples of iterated limits that exist but are unequal; $a_{m,n} = (1 - 1/n)^m$ is one such. When this happens, the double limit cannot exist, as shown in 6.5.2 below. However, even if the iterated limits are equal, the double limit may fail to exist. This is the case for the sequence defined by
$$a_{m,n} = \begin{cases} 1 & \text{if } m = n, \text{ and} \\ 0 & \text{otherwise,} \end{cases}$$
which has zero iterated limits. Finally, the example
$$a_{m,n} = (-1)^{m+n} (1/m + 1/n)$$
shows that a double limit may exist even if both iterated limits fail to exist.
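The first example can be explored numerically. The sketch below is only an illustration, with large finite indices standing in for the limits; the helper name `a` and the particular index values are assumptions made for the demonstration.

```python
import math

def a(m, n):
    """The double sequence a_{m,n} = (1 - 1/n)^m."""
    return (1.0 - 1.0 / n) ** m

# Inner limit in n first (n huge, m fixed): values near 1 for each m.
row = [a(m, 10**9) for m in (1, 10, 100)]

# Inner limit in m first (m huge, n fixed): values near 0 for each n.
col = [a(10**6, n) for n in (2, 10, 100)]

# Along the diagonal m = n the values approach 1/e, a third value,
# which is consistent with the double limit failing to exist.
diag = a(10**6, 10**6)

print(row, col, diag, math.exp(-1))
```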
The following theorem gives the basic connection between double limits
and iterated limits.
6.5.2 Iterated Limit Theorem. Let {am,n } be a double sequence such that
limn am,n exists for each m and limm am,n exists for each n. If the double limit
limm,n am,n exists, then the iterated limits limm limn am,n and limn limm am,n
exist and equal the double limit.
Proof. Let L := limm,n am,n , bm := limn am,n , and cn := limm am,n . Given
ε > 0, choose N ∈ N such that
|am,n − L| < ε for all m, n ≥ N .
Letting n → +∞ yields |bm − L| ≤ ε for all m ≥ N . Therefore, bm → L.
Similarly, cn → L.
6.5.3 Definition. Given a double sequence $\{a_{m,n}\}$, form the partial sums
$$s_{m,n} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{j,k}, \quad m, n \in \mathbb{N}.$$
The double infinite series
$$\sum a_{m,n} = \sum_{m,n} a_{m,n} = \sum_{m,n=1}^{\infty} a_{m,n}$$
is said to converge to $s \in \mathbb{R}$ if $\{s_{m,n}\}$ converges to $s$ in the sense of 6.5.1. The series converges absolutely if $\sum |a_{m,n}|$ converges, and converges conditionally if $\sum a_{m,n}$ converges but not absolutely. ♦
As in the case of single series, an absolutely convergent double series converges (Exercise 7). Moreover, a double series with nonnegative terms converges absolutely iff the partial sums $\sum_{j=1}^{m} \sum_{k=1}^{n} a_{j,k}$ are bounded (Exercise 5).
The iterated limits
$$\lim_m \lim_n s_{m,n} = \lim_m \lim_n \sum_{j=1}^{m} \sum_{k=1}^{n} a_{j,k} = \sum_{j=1}^{\infty} \sum_{k=1}^{\infty} a_{j,k}$$
and
$$\lim_n \lim_m s_{m,n} = \lim_n \lim_m \sum_{k=1}^{n} \sum_{j=1}^{m} a_{j,k} = \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} a_{j,k}$$
are called iterated series. The following result, a special case of the Fubini–Tonelli theorem, establishes a connection between double and iterated series.
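6.5.4 Fubini–Tonelli Theorem for Series. A double series $\sum a_{m,n}$ is absolutely convergent iff one (hence both) of the following conditions hold:
$$\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{m,n}| < +\infty \quad\text{and}\quad \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{m,n}| < +\infty. \tag{6.7}$$
In this case,
$$\sum_{m,n} a_{m,n} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{m,n} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{m,n}. \tag{6.8}$$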
Proof. Set $s_{m,n} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{j,k}$ and $t_{m,n} = \sum_{j=1}^{m} \sum_{k=1}^{n} |a_{j,k}|$. The first assertion of the theorem is clear, since each condition in (6.7) implies that $T := \sup_{m,n} t_{m,n} < +\infty$, and conversely.
Now suppose that $\sum a_{m,n}$ is absolutely convergent. Let $s := \lim_{m,n} s_{m,n}$. For each $j$,
$$\sum_{k=1}^{n} |a_{j,k}| \le t_{j,n} \le T \quad\text{for all } n,$$
hence $\sum_{k=1}^{\infty} a_{j,k}$ converges. Set $r_m := \sum_{j=1}^{m} \sum_{k=1}^{\infty} a_{j,k}$. Given $\varepsilon > 0$, choose $N$ such that
$$\Bigl| \sum_{j=1}^{m} \sum_{k=1}^{n} a_{j,k} - s \Bigr| < \varepsilon \quad\text{for all } m, n \ge N.$$
Fixing $m \ge N$ and letting $n \to +\infty$ in this inequality yields $|r_m - s| \le \varepsilon$. This shows that $r_m \to s$, which is the first equality in (6.8). The proof of the second equality is similar.
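A small numerical check of (6.8) may help fix ideas. The sketch below assumes the absolutely convergent double series with terms $a_{m,n} = (-1/2)^m (1/3)^n$, $m, n \ge 1$, chosen only because its sum factors as $(-1/3)(1/2) = -1/6$; the truncation level `N` is likewise an illustrative choice, and truncated finite sums stand in for the infinite ones.

```python
def a(m, n):
    """Terms of an absolutely convergent double series (illustrative choice)."""
    return (-0.5) ** m * (1.0 / 3.0) ** n

N = 60  # truncation level; the neglected tails are far below 1e-12

# Sum over n inside, then over m (first iterated series in (6.8)).
rows_first = sum(sum(a(m, n) for n in range(1, N + 1)) for m in range(1, N + 1))

# Sum over m inside, then over n (second iterated series in (6.8)).
cols_first = sum(sum(a(m, n) for m in range(1, N + 1)) for n in range(1, N + 1))

print(rows_first, cols_first, -1.0 / 6.0)
```

Both orders of summation agree with the closed-form value up to rounding, as the theorem predicts for absolutely convergent double series.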
Exercises
1. Let α : N → N be strictly increasing. Show that if L := limm,n am,n
exists in R, then limm,n aα(n),n exists and equals L.
2. A double sequence {am,n } is said to be Cauchy if, given ε > 0, there
exists N ∈ N such that |am,n − am+p,n+q | < ε for all m, n ≥ N and all
p, q ≥ 0. Prove that {am,n } converges iff it is Cauchy. Hint. Show that
{an,n } converges.
3.S Determine the convergence behavior, double and iterated, of the following sequences, where $a, b > 0$:
(a) $\sin(m/n)$.
(b) $\dfrac{\ln(mn)}{n}$.
(c) $\dfrac{(-1)^m m}{m+n}$.
(d) $\dfrac{m-n}{m+n}$.
(e) $\dfrac{mn}{(m+n)^2}$.
(f) $\dfrac{mn}{m^2 + n^2}$.
(g) $\dfrac{1}{m^{1/n}}$.
(h) $\dfrac{n}{m + n^2}$.
(i) $\dfrac{n^3 m}{m^4 + n^4}$.
(j) $\dfrac{n + nm \sin(1/n)}{am + bn}$.
(k) $\dfrac{m^2 n}{an^2 + bm^4}$.
(l) $\dfrac{n^2 \sin(1/n)}{m+n}$.
4. Show that if $\sum a_{m,n}$ converges, then $\lim_{m,n} a_{m,n} = 0$.
5. Let $a_{m,n} \ge 0$ for all $m, n \in \mathbb{N}$. Prove that $\sum_{m,n} a_{m,n}$ converges iff $s := \sup_{m,n} s_{m,n} < +\infty$, in which case the series sums to $s$.
6. State and prove a comparison test for double series with nonnegative
terms.
7. Prove that an absolutely convergent double series converges.
8. For sequences $\{a_n\}$ and $\{b_n\}$, set $c_{m,n} = a_n b_m$. Prove that $c := \sum_{m,n} c_{m,n}$ converges absolutely iff $a := \sum_n a_n$ and $b := \sum_n b_n$ converge absolutely, in which case $c = ab$. Conclude that $\sum_{m,n} m^{-q} n^{-p}$ converges iff $p, q > 1$.
9.S Given a double sequence $\{a_{m,n}\}$ with $a_{m,n} \ge 0$, let $\{b_n\}$ be the sequence obtained by summing $a_{m,n}$ along the diagonals $j + k = n + 1$, that is, $b_n := \sum_{j=1}^{n} a_{j,n+1-j}$. Prove that $\sum a_{m,n}$ converges iff $\sum_n b_n$ converges, in which case the two series are equal.
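10. Use Exercise 9 to show that the double series
$$\text{(a)S } \sum_{m,n} \frac{1}{(m+n)^p}, \qquad \text{(b) } \sum_{m,n} \frac{1}{(m^2 + n^2)^{p/2}}, \quad\text{and}\quad \text{(c) } \sum_{m,n} \frac{1}{m^p + n^p}$$
converge iff $p > 2$. Show that for $p > 2$,
$$\sum_{m,n=1}^{\infty} \frac{1}{(m+n)^p} = \sum_{n=2}^{\infty} \frac{1}{n^{p-1}} - \sum_{n=2}^{\infty} \frac{1}{n^p}.$$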
11.S Prove that $\sum_{m,n} r^{mn}$ converges iff $|r| < 1$, in which case the iterated series $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} r^{mn}$ converges.
12.S Prove the root test for double series with nonnegative terms: Suppose that $L := \lim_{m,n} a_{m,n}^{1/mn}$ exists. Then $\sum_{m,n} a_{m,n}$ converges if $L < 1$ and diverges if $L > 1$.
13. Let $a_{m,n} = (-1)^m n^{-m-2}$. Prove that
$$\sum_{m \ge 0,\, n \ge 2} |a_{m,n}| = 1 \quad\text{and}\quad \sum_{m \ge 0,\, n \ge 2} a_{m,n} = 1/2.$$
Chapter 7
Sequences and Series of Functions
7.1
Convergence of Sequences of Functions
Unlike numerical sequences, sequences of functions have several modes of
convergence. In this chapter we consider the two most common types: pointwise
and uniform. Other types of convergence will be examined in Chapter 11.
7.1.1 Definition. Let S be a nonempty set. A sequence of real-valued functions fn on S is said to converge pointwise on S to a function f : S → R if
fn (x) → f (x) for each x ∈ S. We then write f = limn fn or fn → f (on S). ♦
The following theorem is an immediate consequence of 2.1.11 and 3.1.9.
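7.1.2 Theorem. Let $f_n \to f$ and $g_n \to g$ pointwise on $S$ and let $h$ be continuous such that $h \circ f_n$ and $h \circ f$ are defined on $S$. Then, for $\alpha, \beta \in \mathbb{R}$,
$$\alpha f_n + \beta g_n \to \alpha f + \beta g, \quad f_n g_n \to fg, \quad \frac{f_n}{g_n} \to \frac{f}{g} \ (\text{if } g \ne 0), \quad\text{and}\quad h \circ f_n \to h \circ f$$
pointwise on $S$.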
FIGURE 7.1: Uniform convergence of $f_n$ to $f$ (graph of $f_n$ lying between $f - \varepsilon$ and $f + \varepsilon$ over $S$).
The definition of pointwise convergence may be phrased as follows: For each $x \in S$ and $\varepsilon > 0$ there exists an index $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$. Here, the index $N$ usually depends on both $\varepsilon$ and $x$. Removing the dependence on $x$ results in the stronger property of uniform convergence:
7.1.3 Definition. A sequence of functions fn : S → R is said to converge
uniformly on S to a function f : S → R if, for each ε > 0, there exists N ∈ N
such that |fn (x) − f (x)| < ε for all n ≥ N and all x ∈ S. (See Figure 7.1.) ♦
Clearly, uniform convergence implies pointwise convergence. The examples
below show that the converse is not generally true. For these examples and for
the exercises at the end of the section, the following propositions are useful.
7.1.4 Proposition. Let fn , f : S → R. Suppose that there exists a sequence
{an } of positive real numbers such that an → 0 and |fn (x) − f (x)| ≤ an for
all x ∈ S and all n. Then fn converges uniformly to f on S.
Proof. One need only choose N in the definition of uniform convergence so
that an < ε for all n ≥ N .
7.1.5 Proposition. Let $f_n, f : S \to \mathbb{R}$. Then $f_n$ converges uniformly to $f$ on $S$ iff
$$\lim_n \bigl[ f_n(b_n) - f(b_n) \bigr] = 0$$
for any sequence $\{b_n\}$ in $S$.
Proof. If fn converges uniformly to f on S, choose N so that |fn (x)−f (x)| < ε
for all n ≥ N and all x ∈ S. For such n, |fn (bn ) − f (bn )| < ε.
Conversely, suppose fn does not converge uniformly to f on S. Then there
exists an ε > 0, and points bn ∈ S such that |fn (bn ) − f (bn )| ≥ ε for infinitely
many n. Thus the sequential condition fails.
7.1.6 Examples. (a) The sequence $\{x^n\}$ converges pointwise but not uniformly to zero on $(-1, 1)$. (Take $b_n = 1/2^{1/n}$ in 7.1.5.) The convergence is uniform on intervals $[-r, r]$, $0 < r < 1$, since on such an interval $|x^n| \le r^n$ and $r^n \to 0$.
(b) The sequence $\{n/x^n\}$ converges pointwise to zero on $(1, +\infty)$ but the convergence is not uniform there, as can be seen by taking $b_n = 2^{1/n}$ in 7.1.5. The convergence is uniform for $x \in [r, +\infty)$, $r > 1$, since then $|n/x^n| \le n/r^n \to 0$.
(c) The sequence $\{x^n e^{-nx}\}$ converges uniformly to zero on $[0, +\infty)$ since $x^n e^{-nx} \le e^{-n}$ for $x \ge 0$.
(d) The sequence $\{n^{-1} \sin nx\}$ converges uniformly to zero on $\mathbb{R}$ since $|n^{-1} \sin nx| \le 1/n$ for all $x$.
(e) The sequence $\{\sin(x/n)\}$ converges pointwise to zero on $\mathbb{R}$, but the convergence is not uniform, as can be seen, for example, by taking $b_n = \pi n/2$ in 7.1.5. The convergence is uniform on bounded intervals $[a, b]$ since on this interval $|\sin(x/n)| \le |x|/n \le \max\{|a|, |b|\}/n$.
♦
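Example (a) can be checked numerically. This is a rough sketch, assuming a finite grid as a stand-in for the supremum over an interval; the helper name `sup_abs_power` and the grid size are illustrative choices, not part of the text.

```python
def sup_abs_power(n, lo, hi, samples=10_001):
    """Approximate sup of |x^n| over [lo, hi] on an evenly spaced grid."""
    step = (hi - lo) / (samples - 1)
    return max(abs((lo + i * step) ** n) for i in range(samples))

for n in (10, 100, 1000):
    b_n = 0.5 ** (1.0 / n)  # the point b_n = 1/2^{1/n} from 7.1.6(a)
    # |f_n(b_n) - 0| stays at 1/2, so convergence is not uniform on (-1, 1),
    # while the sup over [-0.9, 0.9] equals 0.9^n and tends to zero.
    print(n, b_n ** n, sup_abs_power(n, -0.9, 0.9))
```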
There is an analog of 7.1.2 for uniform convergence; however, it is more
restrictive and requires the notion of uniform boundedness.
7.1.7 Definition. A sequence of functions fn is said to be uniformly bounded
on S with uniform bound M if |fn (x)| ≤ M for all x ∈ S and all n.
♦
7.1.8 Proposition. Let fn → f pointwise on a set S.
(a) If {fn } is uniformly bounded on S, then f is bounded on S.
(b) If each fn is bounded on S and fn → f uniformly on S, then {fn } is
uniformly bounded on S, hence f is bounded.
(c) If $f_n \to f$ uniformly on $S$ and $f$ is bounded, then $\{f_n\}_{n=N}^{\infty}$ is uniformly bounded for some $N$.
Proof. (a) This follows by letting n → +∞ in the inequality |fn (x)| ≤ M .
(b) Choose N such that
|fn (x) − f (x)| ≤ 1 for all n ≥ N and x ∈ S.
For such n and for all x ∈ S,
|fn (x)| ≤ |fn (x) − f (x)| + |f (x) − fN (x)| + |fN (x)| ≤ 2 + MN ,
where $M_N$ is a bound for $f_N$ on $S$. Since the functions $f_1, \ldots, f_{N-1}$ are bounded, $\{f_n\}_{n=1}^{\infty}$ is uniformly bounded.
(c) Let |f (x)| ≤ M for all x. Choose N such that |fn (x) − f (x)| ≤ 1 for all
n ≥ N and x ∈ S. For such n, |fn (x)| ≤ 1 + M for all x ∈ S.
The sequence $\{f_n\}$ on $(0, 1)$ defined by
$$f_n(x) = \begin{cases} n & \text{if } 0 < x < 1/n, \\ 0 & \text{otherwise} \end{cases}$$
shows that the first assertion in (b) may be false if the convergence is merely pointwise.
7.1.9 Theorem. Let $f_n \to f$ and $g_n \to g$ uniformly on $S$ and let $h$ be uniformly continuous such that $h \circ f$ and $h \circ f_n$ are defined on $S$. Then
(a) $\alpha f_n + \beta g_n \to \alpha f + \beta g$ uniformly on $S$, $\alpha, \beta \in \mathbb{R}$.
(b) $h \circ f_n \to h \circ f$ uniformly on $S$.
(c) $f_n g_n \to fg$ uniformly on $S$ if $\{f_n\}$ and $\{g_n\}$ are uniformly bounded on $S$.
(d) $\dfrac{1}{g_n} \to \dfrac{1}{g}$ uniformly on $S$ if $\Bigl\{\dfrac{1}{g_n}\Bigr\}$ is uniformly bounded on $S$.
Proof. The proof of (a) is left to the reader. To prove (b), choose δ > 0 such
that |h(u) − h(v)| < ε for all u, v with |u − v| < δ and choose N such that
|fn (x)−f (x)| < δ for all x ∈ S and n ≥ N . For such n, |h◦fn (x)−h◦f (x)| < ε.
For (c), let M > 0 be a common uniform bound for the sequences {|fn |}
and {|gn |} and let ε > 0. Choose N such that
|fn (x) − f (x)| < ε/2M and |gn (x) − g(x)| < ε/2M.
for all x ∈ S and n ≥ N . For such n and x,
|fn (x)gn (x) − f (x)g(x)| ≤ |fn (x)gn (x) − f (x)gn (x)| + |f (x)gn (x) − f (x)g(x)|
= |gn (x)| |fn (x) − f (x)| + |f (x)| |gn (x) − g(x)|
≤ M |fn (x) − f (x)| + M |gn (x) − g(x)| < ε.
For (d), let $1/|g_n(x)| \le M$ for all $n$ and $x$. Then the same inequality holds for $g$, and
$$\Bigl| \frac{1}{g_n(x)} - \frac{1}{g(x)} \Bigr| = \frac{|g_n(x) - g(x)|}{|g_n(x) g(x)|} \le M^2 |g_n(x) - g(x)|.$$
The hypothesis of uniform boundedness in parts (c) and (d) of the theorem
cannot be relaxed. (See Exercises 6 and 7.)
There are versions of the Cauchy criterion for pointwise and uniform
convergence of sequences of functions. For the pointwise version, consider a
sequence of functions fn on S such that limm,n |fn (x) − fm (x)| = 0 for each
x ∈ S. Then $\{f_n(x)\}_{n=1}^{\infty}$ is a Cauchy sequence of real numbers and hence
converges to a unique real number f (x). Thus fn → f on S. Here is the
analogous result for uniform convergence:
7.1.10 Uniform Cauchy Criterion. A sequence of functions fn converges
uniformly on a set S iff for each ε > 0 there exists an index N such that
|fn (x) − fm (x)| < ε for all x ∈ S and all m, n ≥ N .
(7.1)
Proof. If fn → f uniformly on S, then, given ε > 0, there exists an index N
such that |fn (x) − f (x)| < ε/2 for all x ∈ S and all n ≥ N . An application of
the triangle inequality yields (7.1).
Conversely, assume that the condition holds. Then, in particular,
limm,n |fn (x) − fm (x)| = 0 for every x ∈ S, hence, by the observation preceding the theorem, there exists a function f such that fn → f pointwise on
S. We claim that the convergence is in fact uniform. To see this, let ε > 0
and choose N as in (7.1). Letting m → +∞ in that inequality then yields
|fn (x) − f (x)| ≤ ε for all x ∈ S and all n ≥ N . This shows that fn → f
uniformly on S.
7.1.11 Definition. Let S be an arbitrary set and let fn : S → R. If the
sequence {fn (x)} is increasing (decreasing) for each x ∈ S and fn → f on S,
we write fn ↑ f (fn ↓ f ). In either case we say that {fn } is monotone.
♦
The following theorem gives general conditions under which pointwise
convergence implies uniform convergence.
7.1.12 Dini’s Theorem. Let f and fn be continuous on [a, b] for each n and
suppose that either fn ↓ f or fn ↑ f on [a, b]. Then fn → f uniformly.
Proof. We may assume that fn ↓ f . Let gn = fn − f , so gn ↓ 0. Suppose the
assertion of the theorem is false. Then there exists an ε > 0, a subsequence
{hn } of {gn }, and a sequence {xn } in [a, b] such that hn (xn ) ≥ ε for all n.
(Why?) By the Bolzano–Weierstrass theorem, there exists a subsequence {xnk }
converging to some x ∈ [a, b]. Since hn ↓, for any fixed n and all sufficiently
large k, hn (xnk ) ≥ hnk (xnk ), hence hn (xnk ) ≥ ε. Letting k → +∞ in the last
inequality yields hn (x) ≥ ε for all n, contradicting that hn (x) → 0.
The examples xn on [0, 1) and x−n on [2, +∞) show that Dini’s theorem is
false if the interval is not closed and bounded. The decreasing sequence defined by
$$f_n(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1, \\ 1 + n(1 - x) & \text{if } 1 \le x \le 1 + 1/n, \\ 0 & \text{if } 1 + 1/n \le x \le 2 \end{cases} \tag{7.2}$$
shows that continuity of the limit function in Dini's theorem is essential.
Exercises
1. Find the largest subset of $\mathbb{R}$ on which the given sequence converges pointwise, and determine the intervals on which the convergence is uniform.
(a) $x^n (1 - x)^n$.
(b)S $n^p x^n (1 - x)$.
(c) $e^{x/n}$.
(d)S $\dfrac{nx^2}{e^{nx^2}}$.
(e) $\dfrac{1}{1 + x^{2n} (1 - x)^2}$.
(f) $n^{1/2} \sin\Bigl(\dfrac{x}{n^{2/3}}\Bigr)$.
(g)S $\dfrac{\sqrt{n}\, x^2}{1 + nx^2}$.
(h) $\dfrac{nx^2}{1 + nx^2}$.
(i) $\Bigl(\dfrac{x}{2 + x}\Bigr)^n$.
(j)S $\dfrac{x^{2n}}{2 + x^{2n}}$.
(k) $\dfrac{1}{1 + |x|^n}$.
(l) $\dfrac{n \sin x^2}{1 + nx^2}$.
2. Describe the convergence behavior of the following sequences on $[0, 1]$:
(a)S $\dfrac{x}{nx + 1}$. (b)S $\dfrac{nx}{nx + 1}$. (c) $\dfrac{nx}{n^2 x + 1}$. (d) $\dfrac{1}{n^2 x^2 + 1}$.
3. Describe the convergence behavior of the sequences on $(0, 1)$:
(a) $\{x^{1/n}\}$.
(b) $\{x^{1 + 1/n}\}$.
(c) $\{x^{-1/n}\}$.
(d) $\{x^{1 - 1/n}\}$.
4. Show directly that the sequence defined in (7.2) does not converge
uniformly.
5. Let $p, q > 0$. Prove that the sequence of functions $\dfrac{x^p}{n + x^q}$ converges uniformly to zero on $[0, +\infty)$ iff $p < q$.
6.S Give an example of sequences {fn }, {gn } and functions f , g such that
fn → f and gn → g uniformly, and fn gn → f g pointwise but not
uniformly.
7. Give an example of a sequence {gn } and a function g such that gn → g
uniformly and 1/gn → 1/g pointwise but not uniformly.
8. Let −∞ < a < b ≤ +∞. Suppose that fn → f uniformly on [a, r]
for every r ∈ (a, b). Prove that fn → f uniformly on [a, b) iff for each
sequence {bn } with bn ↑ b, fn (bn ) − f (bn ) → 0. Use this to show that
fn (x) := x−n does not converge uniformly on [2, +∞).
9. Let fn be bounded for each n and let fn → f uniformly on a set S. Prove
that supS fn → supS f and inf S fn → inf S f .
10.S Let f be uniformly continuous on R and an → a. Set fn (x) = f (x + an ).
Show that {fn } converges uniformly on R.
11. Let fn be continuous on [a, b] for each n and let fn converge uniformly
on (a, b) ∩ Q. Prove that fn converges uniformly on [a, b].
12. Prove: If fn → f uniformly on each of the sets S1 , . . . , Sm , then fn → f
uniformly on S1 ∪ · · · ∪ Sm . Show that the corresponding statement for
a union of infinitely many sets is false.
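13.S For $x \in [0, 1]$ define
$$f_n(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \text{ and } x = k/m \text{ in reduced form with } m \le n, \\ 0 & \text{otherwise.} \end{cases}$$
Show that $\{f_n\}$ converges pointwise but not uniformly to the Dirichlet function.
14. Let $p \in \mathbb{N}$. For $x \in [0, 1]$ define
$$g_n(x) = \begin{cases} (m + 1/n)^p & \text{if } x \in \mathbb{Q},\ x = k/m \text{ in reduced form,} \\ 0 & \text{if } x \text{ is irrational.} \end{cases}$$
Show that $g_n$ converges uniformly on $[0, 1]$ iff $p = 1$.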
15. Let {fn } be uniformly bounded, let f, g be bounded on [0, 1], and suppose
that fn → f pointwise (uniformly) on [r, 1] for each 0 < r < 1. If g is
continuous at 0 and g(0) = 0, prove that fn g → f g pointwise (uniformly)
on [0, 1].
16. Let $\{f_n\}$ be uniformly bounded and $f_n \to f$ uniformly on $S$.
(a) Prove that $(f_1 + f_2 + \cdots + f_n)/n \to f$ uniformly on $S$.
(b)S Suppose for some $r > 0$ that $f_n(x) \ge r$ for all $n$ and all $x \in S$. Prove that $(f_1 f_2 \cdots f_n)^{1/n} \to f$ uniformly on $S$.
17.S Let $f_0$ be a bounded function on a set $S$ and $0 < r < 1$. Define a sequence $\{f_n\}$ recursively by
$$f_n(x) = \sin\bigl( r f_{n-1}(x) \bigr), \quad x \in S,\ n \ge 1.$$
Prove that $\{f_n\}$ converges uniformly on $S$. Show that a similar result holds if $S$ is an interval and $\sin x$ is replaced by any function $g$ such that $\sup_x |g'(x)| < 1/r$, where $r$ is any positive number.
18. Let $g$ and $h$ be positive and continuous on $[a, b]$ and define
$$f_n(x) := \frac{n g(x)}{1 + n^2 h(x)}.$$
Prove that the following convergence is uniform on $[a, b]$:
(a) $n \sin f_n \to \dfrac{g}{h}$. (b) $n (1 - \cos f_n) \to 0$. (c) $n^2 (1 - \cos f_n) \to \dfrac{g^2}{2h^2}$.
2h
7.2
Properties of the Limit Function
The theorems in this section give conditions under which the properties
of continuity, integrability, or differentiability of functions in a sequence are
passed along to the limit function. We shall see that pointwise convergence is
generally insufficient for this—the stronger property of uniform convergence is
needed.
The following theorem asserts that under suitable conditions two limit
processes may be interchanged. It is one of several such results to be found in
the text.
7.2.1 Interchange of Limits. Let fn → f uniformly on a subset E of R and
let a be an accumulation point of E such that Ln := lim{x→a, x∈E} fn (x) exists
in R for each n. Then L := limn Ln exists in R and lim{x→a, x∈E} f (x) = L.
In other words, the equality
$$\lim_n \lim_{\substack{x \to a \\ x \in E}} f_n(x) = \lim_{\substack{x \to a \\ x \in E}} \lim_n f_n(x)$$
holds provided that each inner limit exists in $\mathbb{R}$ and the convergence in the inner limit on the right is uniform.
Proof. Given ε > 0, for each n choose δn > 0 such that
|fn (x) − Ln | < ε/3 for all x ∈ E with |x − a| < δn .
Next, choose N ∈ N such that
|fn (x) − f (x)| < ε/6 for all x ∈ E and all n ≥ N .
For n, m ≥ N , choose x ∈ E such that |x − a| < min{δn , δm }. Then
|Ln − Lm | ≤ |Ln − fn (x)| + |fn (x) − fm (x)| + |Lm − fm (x)| < ε.
This shows that {Ln } is a Cauchy sequence and hence converges to some
L ∈ R. Let n ≥ N be sufficiently large so that |Ln − L| < ε/6. If x ∈ E and
|x − a| < δn , then
|f (x) − L| ≤ |f (x) − fn (x)| + |fn (x) − Ln | + |Ln − L| < ε/6 + ε/3 + ε/6 < ε.
Therefore, lim{x→a, x∈E} f (x) = L.
7.2.2 Corollary. If fn → f uniformly on an interval I and if each fn is
continuous at some a ∈ I, then f is continuous at a.
Proof. Take Ln = fn (a) in the theorem.
The corollary is false if the convergence is only pointwise. For example, the
sequence of continuous functions xn converges pointwise on [0, 1] to a function
that is discontinuous at x = 1.
7.2.3 Theorem. If $f_n \to f$ uniformly on $[a, b]$ and $f_n \in \mathcal{R}_a^b$ for all $n$, then $f \in \mathcal{R}_a^b$ and
$$\lim_n \int_a^b f_n(t)\, dt = \int_a^b f(t)\, dt. \tag{7.3}$$
n
a
a
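Proof. By 7.1.8, $f$ is bounded. By uniform convergence, given $\varepsilon > 0$, there exists an $N$ such that
$$f_n(x) - \frac{\varepsilon}{4(b - a)} < f(x) < f_n(x) + \frac{\varepsilon}{4(b - a)}$$
for all $x \in [a, b]$ and $n \ge N$. It follows that for fixed $n \ge N$ and any partition $\mathcal{P}$, the lower sums $\underline{S}$ and upper sums $\overline{S}$ satisfy
$$\underline{S}(f_n, \mathcal{P}) - \frac{\varepsilon}{4} \le \underline{S}(f, \mathcal{P}) \le \overline{S}(f, \mathcal{P}) \le \overline{S}(f_n, \mathcal{P}) + \frac{\varepsilon}{4},$$
hence
$$\overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}) \le \overline{S}(f_n, \mathcal{P}) - \underline{S}(f_n, \mathcal{P}) + \frac{\varepsilon}{2}.$$
Since $f_n$ is integrable, $\mathcal{P}$ may be chosen so that the right side of this inequality is less than $\varepsilon$. Therefore, $f$ is integrable.
Since $|f_n(t) - f(t)| < \varepsilon/4(b - a)$ for $n \ge N$ and all $t$,
$$\Bigl| \int_a^b f_n(t)\, dt - \int_a^b f(t)\, dt \Bigr| \le \int_a^b |f_n(t) - f(t)|\, dt \le \frac{\varepsilon}{4},$$
which shows that $\int_a^b f_n \to \int_a^b f$.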
The following examples show that the hypothesis of uniform convergence
in 7.2.3 cannot be relaxed.
7.2.4 Example. Define $f_n : [0, \pi] \to \mathbb{R}$ by
$$f_n(x) = \begin{cases} n \sin(nx) & \text{if } 0 \le x \le \pi/n, \\ 0 & \text{if } \pi/n \le x \le \pi. \end{cases}$$
FIGURE 7.2: Pointwise convergence insufficient.
Each $f_n$ is continuous and $\{f_n\}$ converges pointwise on $[0, \pi]$ to the zero function, yet $\int_0^{\pi} f_n = 2$ for all $n$.
♦
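The behavior in Example 7.2.4 is easy to observe numerically. The following sketch assumes a plain midpoint rule as the quadrature method; the function names `f` and `midpoint_integral` and the number of cells are illustrative choices.

```python
import math

def f(n, x):
    """The function f_n of Example 7.2.4 on [0, pi]."""
    return n * math.sin(n * x) if x <= math.pi / n else 0.0

def midpoint_integral(n, cells=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, pi]."""
    h = math.pi / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))

for n in (1, 5, 50):
    # The integral stays near 2 although f_n(x) is eventually 0 at each fixed x.
    print(n, midpoint_integral(n), f(n, 0.5))
```

At the fixed point $x = 0.5$ the value $f_n(0.5)$ is zero as soon as $\pi/n < 0.5$, while the area under each graph does not shrink.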
7.2.5 Example. Let $r_1, r_2, \ldots$ be an enumeration of the rationals in $[0, 1]$ and let
$$f_n(x) = \begin{cases} 1 & \text{if } x \in \{r_1, \ldots, r_n\}, \\ 0 & \text{otherwise.} \end{cases}$$
Then $f_n$ is integrable with zero integral and $f_n$ converges pointwise to the Dirichlet function, which is not Riemann integrable.
♦
In the two preceding examples, either the sequence was not uniformly
bounded or the limit function was not integrable. It will follow from results in
Chapter 11 that if {fn } is uniformly bounded, fn , f ∈ Rba , and fn → f merely
pointwise on [a, b], then (7.3) holds.
7.2.6 Theorem. Let $f_n$ be differentiable on $(a, b)$ for each $n$ and let $\{f_n'\}$ converge uniformly on $(a, b)$. If $\{f_n(x_0)\}$ converges for some $x_0 \in (a, b)$, then $\{f_n\}$ converges uniformly to a differentiable function $f$ on $(a, b)$ and $f_n' \to f'$ on $(a, b)$.
Proof. Given $\varepsilon > 0$, choose $N$ such that, for all $m, n \ge N$ and $x \in (a, b)$,
$$|f_n(x_0) - f_m(x_0)| < \frac{\varepsilon}{2} \quad\text{and}\quad |f_n'(x) - f_m'(x)| < \frac{\varepsilon}{2(b - a)}.$$
Fix $m, n \ge N$. By the mean value theorem applied to $f_n - f_m$, for each pair $x, y \in (a, b)$ there exists $\xi_{m,n} \in (a, b)$ such that
$$\bigl| [f_n(x) - f_m(x)] - [f_n(y) - f_m(y)] \bigr| = |f_n'(\xi_{m,n}) - f_m'(\xi_{m,n})|\, |x - y| \le \frac{\varepsilon |x - y|}{2(b - a)} \le \frac{\varepsilon}{2}. \tag{7.4}$$
In particular, for all $x \in (a, b)$,
$$|f_n(x) - f_m(x)| \le \bigl| [f_n(x) - f_m(x)] - [f_n(x_0) - f_m(x_0)] \bigr| + |f_n(x_0) - f_m(x_0)| < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
By the uniform Cauchy criterion, $\{f_n\}$ converges uniformly on $(a, b)$ to some function $f$. Also, from (7.4), for fixed $y$ and for all $x \ne y$,
$$\Bigl| \frac{f_n(x) - f_n(y)}{x - y} - \frac{f_m(x) - f_m(y)}{x - y} \Bigr| \le \frac{\varepsilon}{2(b - a)}.$$
Therefore, the sequence of functions $[f_n(x) - f_n(y)]/(x - y)$ converges uniformly in $x$ on the set $E_y := (a, y) \cup (y, b)$. Since $f_n$ converges to $f$,
$$\frac{f_n(x) - f_n(y)}{x - y} \to \frac{f(x) - f(y)}{x - y}$$
uniformly in $x$ on $E_y$. By 7.2.1 with $E = E_y$,
$$\lim_n f_n'(y) = \lim_n \lim_{x \to y} \frac{f_n(x) - f_n(y)}{x - y} = \lim_{x \to y} \frac{f(x) - f(y)}{x - y} = f'(y).$$
The sequence given by $f_n(x) = x^n/n$, $0 < x < 1$, shows that uniform convergence of a sequence of functions does not guarantee that the derivatives converge uniformly.
Exercises
1. Prove: If fn → f uniformly on an interval I and each fn is continuous at
a ∈ I, then, for any sequence {an } in I with an → a, limn fn (an ) = f (a).
2. Show that if fn → f uniformly on a subset E of R and each fn is
uniformly continuous on E, then f is uniformly continuous on E.
3. Prove that $(1 + x/n)^n \to e^x$ uniformly on any bounded interval of $\mathbb{R}$. Conclude that $\int_0^1 (1 + x/n)^n \to e - 1$.
4.S Show that $n^2 x e^{-nx} \to 0$ for all $x \ge 0$, yet $\int_0^1 n^2 x e^{-nx}\, dx \not\to 0$. Why does this not contradict 7.2.3?
5. Evaluate $\lim_n \int_0^1 f_n$ if $f_n(x) =$
1
x
n(ex/n − 1)
.
(b)
.
(c)
.
cos(x/n)
n sin(x/n)
x
√
n e−x/n − 1
ax(x + 1)n + 1
(d)S
. (e) arctan
, a > 0.
x
nx + 1
6.S Prove that $f_n(x) := \sqrt{n}/(1 + n^2 x^2)$ converges to 0 pointwise on $(0, +\infty)$, uniformly on $[r, +\infty)$ for every $r > 0$, but not uniformly on $(0, 1)$. Show that, nonetheless, $\int_0^1 f_n \to 0$.
(a)
7. Let $\{a_n\}$ be a positive, strictly increasing sequence. Prove that
$$\lim_n \int_0^1 \frac{a_n x}{1 + a_n x}\, dx = \int_0^1 \lim_n \frac{a_n x}{1 + a_n x}\, dx.$$
8. Let f and f 0 be positive and continuous on [a, b]. Define
p
2n f 0 (x)
nf 0 (x)
and gn (x) :=
.
fn (x) :=
1 + n2 f (x)
1 + n2 f (x)
Use Exercise 7.1.18 to find
(a)S $\lim_n \int_a^b n \sin f_n$. (b) $\lim_n \int_a^b n (1 - \cos f_n)$. (c) $\lim_n \int_a^b n (1 - \cos g_n)$.
9.S Show that if $f_n \to f$ uniformly on $[a, b]$ and $f_n$ is integrable for each $n$ then
$$\int_a^x f_n(t)\, dt \to \int_a^x f(t)\, dt$$
uniformly in $x$ on $[a, b]$.
10. Suppose that $f_n$ is improperly integrable on $[a, c)$, $f_n \to f$ uniformly on $[a, t]$ for all $t \in [a, c)$, and $|f_n| \le g$ on $[a, c)$ for all $n$, where $g$ is improperly integrable on $[a, c)$. Prove that $f$ is improperly integrable on $[a, c)$ and
$$\lim_n \int_a^c f_n = \int_a^c f.$$
11. Prove that if $f$ is continuous on $[0, 1]$, then
$$\lim_n \int_0^1 f(x^n)\, dx = f(0).$$
12. For each $n$, let $f_n$ be continuous on $[a, +\infty)$, $a > 0$, and suppose that $c_n := \lim_{x \to +\infty} f_n(x)$ exists in $\mathbb{R}$. Prove that if $f_n \to f$ uniformly on $[a, +\infty)$, then $\lim_n c_n$ and $\lim_{x \to +\infty} f(x)$ exist and are equal. Show also that
$$\lim_n \int_0^{1/a} f_n(x)\, dx = \int_0^{1/a} f(x)\, dx.$$
Hint. Let $g_n(x) = f_n(1/x)$, $0 < x \le 1/a$, and apply 7.2.1.
13. Let $f_n$ be as in 7.2.4 and define
$$g_n(x) = \int_0^x f_n(t)\, dt, \quad h_n(x) = x g_n(x), \quad 0 \le x \le \pi.$$
Show that
(a) $\{g_n\}$ converges pointwise and monotonically on $[0, \pi]$ but not uniformly.
(b) $\{h_n\}$ converges uniformly on $[0, \pi]$.
(c) $\{h_n'\}$ does not converge uniformly on $[0, \pi]$.
7.3
Convergence of Series of Functions
7.3.1 Definition. Let $\{f_n\}$ be a sequence of real-valued functions on a set $S$. For each $x \in S$ and $n \in \mathbb{N}$ form the $n$th partial sums
$$s_n(x) = \sum_{k=1}^{n} f_k(x) \quad\text{and}\quad t_n(x) = \sum_{k=1}^{n} |f_k(x)|.$$
The infinite series of functions $\sum_n f_n = \sum_{n=1}^{\infty} f_n$ is said to converge
• pointwise on $S$ if $\sum_n f_n(x)$ converges for each $x \in S$;
• absolutely pointwise on $S$ if $\sum_n f_n(x)$ converges absolutely for every $x \in S$;
• uniformly on $S$ if $\{s_n\}$ converges uniformly on $S$;
• absolutely uniformly on $S$ if $\{t_n\}$ converges uniformly on $S$.
♦
The methods of Chapter 6 may be applied at each x to test pointwise
convergence of a series of functions. For uniform convergence, additional tests
are required.
The following result is an immediate consequence of 7.1.9.
7.3.2 Theorem. Let $\sum_n f_n$ and $\sum_n g_n$ converge uniformly on a set $S$ and let $\alpha, \beta \in \mathbb{R}$. Then $\sum_n (\alpha f_n + \beta g_n)$ converges uniformly on $S$ and
$$\sum_n (\alpha f_n + \beta g_n) = \alpha \sum_n f_n + \beta \sum_n g_n.$$
The next theorem is a useful test for nonuniform convergence of a series.
The proof is immediate from the identity fn = sn − sn−1 .
7.3.3 Theorem. If $\sum_n f_n$ converges uniformly on a set $S$, then $f_n \to 0$ uniformly on $S$.
For example, the geometric series
$$\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}, \quad |x| < 1, \tag{7.5}$$
converges pointwise but not uniformly on $(-1, 1)$, since $x^n$ does not tend to zero uniformly on $(-1, 1)$. We show below that the series converges uniformly on all closed subintervals of $(-1, 1)$.
The comparison test for uniform convergence of a series of functions takes
the following form:
7.3.4 Uniform Comparison Test. If $|f_n(x)| \le g_n(x)$ for all $n$ and all $x \in S$ and if $\sum_n g_n$ converges uniformly on $S$, then $\sum_n f_n$ converges absolutely uniformly on $S$.
Proof. Since $\sum_{k=m}^{n} |f_k(x)| \le \sum_{k=m}^{n} g_k(x)$, the assertion follows from the uniform Cauchy criterion.
7.3.5 Corollary. If $\sum_n f_n$ converges absolutely uniformly on a set $S$, then $\sum_n f_n$ converges uniformly on $S$.
Proof. $0 \le f_n + |f_n| \le 2|f_n|$, hence, by 7.3.4, $\sum_n (f_n + |f_n|)$ converges uniformly on $S$ and therefore so must
$$\sum_n f_n = \sum_n (f_n + |f_n|) - \sum_n |f_n|.$$
7.3.6 Weierstrass M-test. If there exist positive constants $M_n$ such that $\sum_n M_n < +\infty$ and $|f_n| \le M_n$ on $S$ for all $n$, then $\sum_n f_n$ converges absolutely uniformly on $S$.
Proof. Take $g_n$ to be the constant function $M_n$ in 7.3.4.
For example, taking $M_n = r^n$, we see that the geometric series (7.5) converges uniformly in every interval $[-r, r]$, $0 < r < 1$.
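The estimate behind the M-test can be seen concretely for the geometric series. The sketch below assumes the values $r = 0.9$ and $N = 100$ and a finite grid in place of the supremum over $[-r, r]$; the names `partial_sum`, `sup_error`, and `tail_bound` are illustrative.

```python
def partial_sum(x, N):
    """Partial sum 1 + x + ... + x^N of the geometric series."""
    return sum(x ** n for n in range(N + 1))

def sup_error(r, N, samples=2001):
    """Largest error |1/(1-x) - partial_sum(x, N)| on a grid over [-r, r]."""
    xs = [-r + 2 * r * i / (samples - 1) for i in range(samples)]
    return max(abs(1.0 / (1.0 - x) - partial_sum(x, N)) for x in xs)

r, N = 0.9, 100
# Sum of the constants M_n = r^n over n > N; by the M-test argument it
# bounds the error uniformly in x on [-r, r].
tail_bound = r ** (N + 1) / (1.0 - r)
err = sup_error(r, N)
print(err, tail_bound)
```

The bound depends only on $r$ and $N$, not on $x$, which is exactly what uniform convergence on $[-r, r]$ requires.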
The next results are uniform convergence analogs of Dirichlet’s and Abel’s
tests for numerical series.
7.3.7 Theorem. If $\sum_n f_n$ converges uniformly on a set $S$ and if there exists a constant $M$ such that
$$|g_1(x)| + \sum_{n=1}^{\infty} |g_{n+1}(x) - g_n(x)| \le M \quad\text{for all } x \in S,$$
then $\sum_n f_n g_n$ converges uniformly on $S$.
Proof. Let $s_n = \sum_{k=1}^{n} f_k - \sum_{n=1}^{\infty} f_n$ and $t_n = \sum_{k=1}^{n} f_k g_k$. For each $n > 1$,
$$g_n = \sum_{k=1}^{n-1} (g_{k+1} - g_k) + g_1,$$
hence $|g_n| \le M$ on $S$. Given $\varepsilon > 0$, choose $N$ so that $|s_n(x)| < \varepsilon$ for all $n \ge N$ and $x \in S$. By 6.4.4, for $m > n > N$ and $x \in S$,
$$|t_m(x) - t_{n-1}(x)| \le \sum_{k=n}^{m} |s_k(x)|\, |g_k(x) - g_{k+1}(x)| + |s_m(x)|\, |g_m(x)| + |s_{n-1}(x)|\, |g_n(x)| \tag{7.6}$$
$$\le M\varepsilon + M\varepsilon + M\varepsilon = 3M\varepsilon.$$
Therefore, $\{t_n\}$ is uniformly Cauchy on $S$ and hence converges uniformly.
7.3.8 Theorem. If, on a set $S$, the partial sums of $\sum_n f_n$ are uniformly bounded, $\sum_n |g_{n+1} - g_n|$ converges uniformly, and $g_n \to 0$ uniformly, then $\sum_n f_n g_n$ converges uniformly on $S$.
Proof. Let $t_n$ be as in the proof of 7.3.7, $s_n := \sum_{k=1}^{n} f_k$, and let $M$ be a uniform bound for $\{s_n\}$ on $S$. Given $\varepsilon > 0$, choose $N$ such that
$$|g_n(x)| < \varepsilon \quad\text{and}\quad \sum_{k=n}^{m} |g_k(x) - g_{k+1}(x)| < \varepsilon, \quad m > n > N,\ x \in S. \tag{7.7}$$
Since (7.6) holds in the current setting, (7.7) implies that
$$|t_m(x) - t_{n-1}(x)| \le 3M\varepsilon, \quad m > n > N,\ x \in S.$$
Therefore, $\{t_n\}$ converges uniformly on $S$.
P
7.3.9 Corollary. If the partial sums of
P n fn are uniformly bounded and if
gn ↓ 0 or gn ↑ 0 uniformly on S, then n fn gn converges uniformly on S.
Proof. Assume that $\{g_n\}$ is decreasing. Then
$$\sum_{k=1}^{n} |g_{k+1} - g_k| = \sum_{k=1}^{n} (g_k - g_{k+1}) = g_1 - g_{n+1},$$
hence $\sum_{n=1}^{\infty} |g_{n+1} - g_n|$ converges uniformly.
7.3.10 Example. Let $g_n$ be continuous and $g_n \downarrow 0$ or $g_n \uparrow 0$ on $\mathbb{R}$. We apply the preceding corollary to the series
$$s(x) := \sum_n g_n(x) \sin nx$$
on closed bounded intervals $I$ not containing any integer multiple of $2\pi$. By Dini's theorem, $g_n \to 0$ uniformly on $I$. Also, by 6.4.6, $s(x)$ converges pointwise on $\mathbb{R}$. Moreover, if $x$ is not a multiple of $2\pi$, then
$$\left| \sum_{k=1}^{n} \sin(kx) \right| \le \frac{1}{|\sin(x/2)|}.$$
Since $\inf_I |\sin(x/2)| > 0$, the sums $\sum_{k=1}^{n} \sin(kx)$ are uniformly bounded on $I$. By 7.3.9, $s(x)$ converges uniformly on $I$. By 7.3.8, the same result holds if, instead of monotonicity of the sequence $\{g_n\}$, we require that $\sum_n |g_{n+1} - g_n|$ converges and $g_n \to 0$, both uniformly on $I$. Analogous results hold for series of the form $\sum_n g_n(x) \cos nx$. ♦
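The bound on the sine sums used in Example 7.3.10 is easy to test numerically. The following is a minimal sketch under the assumption that floating-point evaluation is accurate enough for the sample points chosen; the function names are hypothetical.

```python
import math


def sine_partial_sum(x, n):
    """Return sin(x) + sin(2x) + ... + sin(nx)."""
    return sum(math.sin(k * x) for k in range(1, n + 1))


def sine_sum_bound(x):
    """The bound 1/|sin(x/2)|, valid when x is not a multiple of 2*pi."""
    return 1.0 / abs(math.sin(x / 2.0))


if __name__ == "__main__":
    for x in (0.5, 1.0, 3.0, 6.0):
        largest = max(abs(sine_partial_sum(x, n)) for n in range(1, 201))
        print(x, largest, sine_sum_bound(x))
```

The bound deteriorates as $x$ approaches a multiple of $2\pi$, which is why the intervals $I$ in the example must stay away from those points.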
7.3.11 Uniform Alternating Series Test. If $g_n \downarrow 0$ or $g_n \uparrow 0$ uniformly on a set $S$, then $\sum_{n=1}^{\infty} (-1)^{n+1} g_n$ converges uniformly on $S$.
Proof. Take $f_n = (-1)^{n+1}$ in 7.3.9.
7.3.12 Example. Let $f$ be continuous on $\mathbb{R}$ and monotone in some neighborhood $\mathcal{N}$ of $0$ with $f(0) = 0$. If $a_n \downarrow 0$, then the series
$$\sum_{n=1}^{\infty} (-1)^{n+1} f(a_n x)$$
converges uniformly on any closed, bounded interval $I$.
We verify this for the case $I \subseteq [0, +\infty)$ and $f$ increasing. Choose $N$ so that $a_n x \in \mathcal{N}$ for all $n \ge N$ and $x \in I$. Then $f(a_n x) \downarrow 0$ on $I$, hence, by Dini's theorem and 7.3.11, $\sum_{n=1}^{\infty} (-1)^{n+1} f(a_n x)$ converges uniformly on $I$.
For example, taking $a_n = 1/n$ we see that the series
$$\sum_{n=1}^{\infty} (-1)^{n+1} \sin(x/n), \quad \sum_{n=1}^{\infty} (-1)^{n+1} n^{-1} x e^{x/n}, \quad \text{and} \quad \sum_{n=1}^{\infty} (-1)^{n+1} \left[1 - e^{-n^{-2} x^2}\right]$$
all converge uniformly on closed bounded intervals. ♦
The following theorem is an immediate consequence of 7.2.2, 7.2.3, and
7.2.6 applied to the sequence of partial sums of the series.
7.3.13 Theorem. Let $f_n : [a, b] \to \mathbb{R}$ and $s := \sum_n f_n$.
(a) If $s$ converges uniformly on $[a, b]$ and each $f_n$ is continuous, then $s$ is continuous.
(b) If $s$ converges uniformly on $[a, b]$ and $f_n \in \mathcal{R}_a^b$ for all $n$, then $s \in \mathcal{R}_a^b$ and $\int_a^b s = \sum_n \int_a^b f_n$.
(c) Let $f_n$ be differentiable on $(a, b)$ and suppose that the derived series $\sum_n f_n'$ converges uniformly on $(a, b)$ and that $\sum_n f_n(x_0)$ converges for some $x_0 \in (a, b)$. Then $s$ converges uniformly on $(a, b)$ and $s' = \sum_n f_n'$.
7.3.14 Example. (a) By 7.3.10, $\sum_n n^{-1} \sin(nx)$ converges uniformly on intervals $[a, b] \subseteq (0, 2\pi)$, hence
$$\int_a^b \sum_n n^{-1} \sin(nx)\, dx = \sum_n \int_a^b n^{-1} \sin(nx)\, dx = \sum_n \frac{\cos(na) - \cos(nb)}{n^2}.$$
On the other hand, the derived series $\sum_n \cos(nx)$ does not converge.
(b) Both $s(x) := \sum_n n^{-1} \sin(x/n)$ and its derived series $\sum_n n^{-2} \cos(x/n)$ converge uniformly on $\mathbb{R}$, hence the latter equals $s'(x)$. ♦
A closed form for a series $s := \sum_n f_n$ on a subset $E$ of $\mathbb{R}$ is a "standard function" that equals $s$ on $E$. Closed forms are typically combinations of rational, power, exponential, logarithmic, trigonometric, or inverse trigonometric functions.
7.3.15 Example. Since $1/(1 - x)$ is a closed form for the geometric series (7.5) on $(-1, 1)$, the function
$$\frac{2 + \sin x}{1 + \sin x} = \frac{1}{1 - \dfrac{1}{2 + \sin x}}$$
is a closed form for the series $\sum_{n=0}^{\infty} \left( \dfrac{1}{2 + \sin x} \right)^n$ on intervals $I$ not containing $(4n - 1)\pi/2$ or $-(4n + 1)\pi/2$, $n = 0, 1, 2, \ldots$. By the Weierstrass M-test, the series converges absolutely uniformly on closed subintervals of $I$, since on such a subinterval $0 < 1/(2 + \sin x) < 1/(1 + \varepsilon)$ for some $\varepsilon > 0$. ♦
Exercises
1. For the functions $f_n$ below, determine all subintervals of $[0, +\infty)$ on which $\sum_{n=0}^{\infty} f_n(x)$ converges pointwise or uniformly, where $p \in \mathbb{N}$.
(a)S $\dfrac{1}{1 + x^n}$. (b) $\dfrac{x^n}{1 + x^n}$. (c) $\dfrac{x}{n^2 + x}$.
(d)S $\dfrac{x}{1 + n^2 x}$. (e) $\left( \dfrac{x}{x - 2} \right)^n$. (f) $\dfrac{\sin(nx)}{1 + n^2 x^2}$.
(g)S $n^p e^{-nx}$. (h) $n^{-x}$. (i)S $\sin(x/n^p)$.
(j) $x^n (1 - x)^n$. (k) $\left( \dfrac{1 - x}{1 + x} \right)^n$. (l) $x^n e^{-nx}$.
2. Find the largest intervals of pointwise convergence and uniform convergence and a closed form for the series $\sum_{n=0}^{\infty} f_n(x)$, where $f_n(x) =$
(a) $\cos^n(\pi x/2)$, $x \in [0, 1]$. (b)S $\ln^n(1/x)$. (c) $\dfrac{(-1)^n}{e^{nx}}$. (d) $(x^2 \ln x)^n$.
3. Prove that if $\sum_n f_n$ converges absolutely uniformly on a set $S$, then $\sum_n f_n^+$ and $\sum_n f_n^-$ converge uniformly on $S$, where, for each $x \in S$, $f_n^+(x)$ and $f_n^-(x)$ are, respectively, the positive and negative parts of $f_n(x)$.
4.S Suppose that the numerical series $\sum_n a_n$ converges absolutely. Let
$$s(t) = \sum_n a_n \sin\big((2n + 1)t\big) \quad \text{and} \quad c(t) = \sum_n a_n \cos(nt).$$
Find series expansions for
$$\int_x^{\pi/2} s(t)\, dt \quad \text{and} \quad \int_0^x c(t)\, dt.$$
5. Let $p > 0$ and $s(x) = \sum_{n=1}^{\infty} \sin(x/n^p)$. Prove:
(a) If $p \le 1$, then $s(x)$ diverges for all $x \ne 0$.
(b) If $p > 1$, then $s(x)$ converges absolutely uniformly on bounded intervals (hence pointwise on $\mathbb{R}$) but not uniformly on $\mathbb{R}$.
6.S Let $p > 0$ and $s(x) = \sum_{n=1}^{\infty} [1 - \cos(x/n^p)]$. Prove:
(a) If $p \le 1/2$, then $s(x)$ diverges for all $x \ne 0$.
(b) If $p > 1/2$, then $s(x)$ converges absolutely uniformly on bounded intervals (hence pointwise on $\mathbb{R}$) but not uniformly on $\mathbb{R}$.
7. Let $f(x)$ be bounded on $[0, 1]$ and
$$t(x) := \sum_{n=0}^{\infty} x^n f(x), \quad x \in [0, 1].$$
(a) Prove that $t(x)$ converges pointwise on $[0, 1)$ and uniformly on $[0, r]$ for $0 < r < 1$.
(b) Prove that if $f(1) \ne 0$, then the convergence of $t(x)$ is not uniform on $[0, 1)$.
(c) Suppose that $L := \lim_{x \to 1^-} (1 - x)^{-1} f(x)$ exists. Prove that the convergence of $t(x)$ is uniform on $[0, 1)$ iff $L = 0$.
(d) Let $m \in \mathbb{N}$. Determine whether the convergence of $t(x)$ is uniform on $[0, 1)$ for $f(x) =$
(i) $(1 - x)^m$. (ii) $1 - x^m$. (iii) $1 - \sin(\pi x/2)$. (iv) $\cos(\pi x/2)$.
8. (Uniform limit comparison test). Let $f_n \ge 0$ and $g_n > 0$ on a set $S$ and let $f_n/g_n \to h$ uniformly on $S$, where $h : S \to \mathbb{R}$ satisfies
$$0 < \inf_S h \le \sup_S h < +\infty.$$
Prove that $\sum_n f_n$ converges uniformly on $S$ iff $\sum_n g_n$ converges uniformly on $S$.
9.S Suppose that $f'$ exists, is bounded on $I := (-r, r)$, and $f(0) = 0$. Prove
that the series
∞
X
1
x
s(x) :=
f
n
n+1
n=0
converges uniformly on $I$ and that $s'(0) = f'(0)$.
10. Suppose that $|f(x)| \le |x|$ on $I = (-r, r)$, $r > 0$. If $f$ is differentiable on $I$ and $f'$ is continuous at $0$, show that the series $s(x)$ in Exercise 9 converges uniformly on $I$ and that $|s'(0)| \le 1$.
11.S Let $f_n(x)$ be continuous and nonnegative on $[a, b]$. Prove that if $\sum_n f_n$ converges pointwise on $[a, b]$ to a continuous function, then the convergence is uniform.
12. Let $\{a_n\}$ be a sequence such that $\sum_{n=1}^{\infty} a_n^{-1}$ converges absolutely. Prove that $\sum_{n=1}^{\infty} |x - a_n|^{-1}$ converges uniformly on bounded intervals not containing any $a_n$.
13.S Suppose that $f_n$ is monotone on $[a, b]$ for each $n$. Prove that if $\sum_n f_n(a)$ and $\sum_n f_n(b)$ converge absolutely, then $\sum_n f_n \in \mathcal{R}_a^b$ and
$$\int_a^b \sum_n f_n = \sum_n \int_a^b f_n.$$
14. Let $\sum_n f_n$ converge uniformly on $S$ and let $\{g_n\}$ be a uniformly bounded sequence of functions on $S$ such that either $\{g_n\}$ is monotone increasing or monotone decreasing on $S$. Prove that $\sum_n f_n g_n$ converges uniformly on $S$.
15.S Suppose that the partial sums of $\sum_n f_n$ are uniformly bounded on $I = [a, b]$, $g_n$ is continuous for each $n$, and $g_n \downarrow 0$ or $g_n \uparrow 0$ on $I$. Prove that $\sum_n f_n g_n$ converges uniformly on $I$.
16. Suppose that $\sum_n f_n$ converges uniformly on $I = [a, b]$, $g_n$ is continuous for each $n$, and $g_n \downarrow g$ or $g_n \uparrow g$ on $I$, where $g$ is continuous. Prove that $\sum_n f_n g_n$ converges uniformly on $I$.
17. Suppose that $g_n$ is continuous on $I = [a, b]$ for each $n$, $\{g_n\}$ is monotone, and $s(x) := \sum_n (-1)^n g_n(x)$ converges for each $x \in I$. Prove that $s(x)$ is continuous on $I$.
18.S Let $g$ be continuous and nonnegative on $\mathbb{R}$. Prove that the series
$$s(x) := \sum_{n=1}^{\infty} (-1)^n \frac{g(x) + n}{n^2}$$
converges uniformly on bounded intervals, hence pointwise on $\mathbb{R}$, but does not converge absolutely for any $x$.
19. Let $g_n$ be continuous and $g_n \downarrow 0$ on $\mathbb{R}$. Show that if $[a, b]$ does not contain any odd multiple of $\pi$, then $\sum_n (-1)^n g_n(x) \cos nx$ converges uniformly on $[a, b]$.
7.4 Power Series
A power series in $x$ about $a$ is an infinite series of the form
$$s(x) = \sum_{n=0}^{\infty} c_n (x - a)^n, \quad a, c_n \in \mathbb{R}.$$
In the following four subsections we examine the properties of these important series. The first step is to determine the convergence set of a power series.
Radius of Convergence of a Power Series
7.4.1 Radius of Convergence Theorem. Given a power series $s(x) := \sum_{n=0}^{\infty} c_n (x - a)^n$, define the extended real number $R \in [0, +\infty]$ by
$$R = \rho^{-1}, \quad \text{where } \rho := \limsup_n |c_n|^{1/n}.$$
Then $s(x)$
(a) converges absolutely pointwise for $|x - a| < R$;
(b) converges absolutely uniformly for $|x - a| \le r < R$;
(c) diverges for $|x - a| > R$.
Proof. For the case $R = 0$ ($\rho = +\infty$), the theorem asserts that $s(x)$ diverges for all $x \ne a$. This is immediate from the root test. A similar application of the root test proves (c): If $|x - a| > R$, then
$$\limsup_n |c_n (x - a)^n|^{1/n} = \rho |x - a| > 1.$$
To prove (a) and (b), assume $R > 0$ ($\rho < +\infty$) and let $0 < r < s < R$. Then $\rho < 1/s$, so there exists an index $N$ such that $|c_n|^{1/n} < 1/s$ for all $n \ge N$. For such $n$ and for all $x$ with $|x - a| \le r$, $|c_n (x - a)^n| \le (r/s)^n$. Since $r/s < 1$, the series converges absolutely uniformly on $[a - r, a + r]$ by the Weierstrass M-test, which proves (b). Since $r$ is arbitrary, part (a) follows.
The number $R = 1/\rho$ is called the radius of convergence of the series. The set $I$ of all $x$ for which the series $\sum_{n=0}^{\infty} c_n (x - a)^n$ converges is called the interval of convergence. By 7.4.1, $I$ is one of the intervals
$$\{a\},\ (a - R, a + R),\ (a - R, a + R],\ [a - R, a + R),\ \text{or } [a - R, a + R].$$
The theorem gives no further information regarding $I$. The methods of Chapter 6 may be applied to determine convergence behavior at the endpoints $a \pm R$ if $R$ is finite.
The following characterization of $R$ is frequently useful.
7.4.2 Theorem. If $c_n > 0$ for all sufficiently large $n$, then
$$R = \lim_n \frac{|c_n|}{|c_{n+1}|},$$
provided the limit exists as an extended real number.
Proof. Let $L$ denote the limit and set $a_n = |c_n| > 0$ for all $n \ge N$. The assertion then follows from the inequalities
$$\frac{1}{L} = \liminf_n \frac{a_{n+1}}{a_n} \le \liminf_n a_n^{1/n} \le \rho = \limsup_n a_n^{1/n} \le \limsup_n \frac{a_{n+1}}{a_n} = \frac{1}{L}$$
(Exercise 2.4.12).
Here are some typical examples using 7.4.2, where $I$ is the convergence interval.
Examples.
(a) $\sum_{n=1}^{\infty} n^n x^n$, $I = \{0\}$.
(b) $\sum_{n=1}^{\infty} \dfrac{x^n}{n!}$, $I = (-\infty, +\infty)$.
(c) $\sum_{n=1}^{\infty} \dfrac{x^n}{\sqrt{n}}$, $I = [-1, 1)$, conditional convergence at $-1$.
(d) $\sum_{n=1}^{\infty} \dfrac{x^n}{n^2}$, $I = [-1, 1]$, absolute convergence at $\pm 1$. ♦
The following example is somewhat more interesting.
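The ratios $|c_n|/|c_{n+1}|$ from 7.4.2 can be tabulated for the four examples above. This is only a numerical sketch: the helper `ratio` and the dictionary `examples` are hypothetical names, and a finite table suggests but does not prove the value of the limit.

```python
from fractions import Fraction
import math


def ratio(c, n):
    """Return |c(n)| / |c(n+1)| as a float for a coefficient function c."""
    return float(abs(c(n)) / abs(c(n + 1)))


# Coefficients of examples (a)-(d); exact rationals are used where possible.
examples = {
    "a: n^n": lambda n: Fraction(n**n),
    "b: 1/n!": lambda n: Fraction(1, math.factorial(n)),
    "c: 1/sqrt(n)": lambda n: 1.0 / math.sqrt(n),
    "d: 1/n^2": lambda n: Fraction(1, n * n),
}

if __name__ == "__main__":
    for name, c in examples.items():
        print(name, [ratio(c, n) for n in (10, 100, 1000)])
```

The ratios tend to 0 in (a), grow without bound in (b), and approach 1 in (c) and (d), in agreement with $R = 0$, $R = +\infty$, and $R = 1$. The behavior at the endpoints in (c) and (d) is not visible from these ratios and needs the tests of Chapter 6.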
7.4.3 Example. The Fibonacci sequence $\{c_n\}$ is defined by
$$c_0 = c_1 = 1, \quad c_n = c_{n-1} + c_{n-2}, \quad n \ge 2.$$
The Fibonacci power series is the series $\sum_{n=0}^{\infty} c_n x^n$. We use 7.4.2 to show that the radius of convergence of the series is $(\sqrt{5} - 1)/2$.
Set $r_n = c_{n+1}/c_n$, $n \ge 0$. Note that the first few terms of the sequence $\{r_n\}$ are $1, 2, 3/2, 5/3, 8/5$ and that
$$r_n = \frac{c_n + c_{n-1}}{c_n} = 1 + \frac{1}{r_{n-1}}, \quad n \ge 2. \tag{7.8}$$
An induction argument then shows that
$$3/2 \le r_n \le 5/3, \quad n \ge 2. \tag{7.9}$$
Now, from (7.8),
$$r_n - r_m = \frac{1}{r_{n-1}} - \frac{1}{r_{m-1}} = \frac{r_{m-1} - r_{n-1}}{r_{m-1} r_{n-1}}. \tag{7.10}$$
In particular,
$$r_{2k+1} - r_{2k-1} = \frac{r_{2k-2} - r_{2k}}{r_{2k} r_{2k-2}} \quad \text{and} \quad r_{2k} - r_{2k-2} = \frac{r_{2k-3} - r_{2k-1}}{r_{2k-1} r_{2k-3}},$$
hence
$$r_{2k+1} - r_{2k-1} = \frac{r_{2k-1} - r_{2k-3}}{r_{2k} r_{2k-2} r_{2k-1} r_{2k-3}}.$$
Iterating, we obtain $r_{2k+1} - r_{2k-1} = (r_3 - r_1)/a_k$ for some $a_k > 0$. Since $r_3 - r_1 = 5/3 - 2 < 0$, the sequence $\{r_{2k+1}\}$ is decreasing. A similar argument shows that $\{r_{2k}\}$ is increasing. Both sequences are bounded, so they converge, say, $r_{2k+1} \to L$ and $r_{2k} \to M$. From (7.10),
$$|r_n - r_{n-1}| = \frac{|r_{n-1} - r_{n-2}|}{r_{n-1} r_{n-2}} = \cdots = \frac{|r_2 - r_1|}{b_n},$$
where $b_n$ is a product of $2n - 4$ terms, each of which is an $r_k$ with $k \ge 1$. Since $r_1 = 2$ and, by (7.9), $r_k \ge 3/2$ for $k \ge 2$, $b_n \to +\infty$, hence $r_n - r_{n-1} \to 0$. Therefore,
$$\frac{c_{n+1}}{c_n} = r_n \to L = M = 1/R,$$
where $R$ is the radius of convergence of the series. Taking limits in (7.8) shows that $1/R = 1 + R$, which has positive solution $R = (\sqrt{5} - 1)/2$. ♦
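The behavior of the ratios $r_n = c_{n+1}/c_n$ in Example 7.4.3 can be observed directly. The sketch below uses a hypothetical helper `fibonacci_ratios` and ordinary floating-point division, so it illustrates the limit rather than proving it.

```python
import math


def fibonacci_ratios(count):
    """Return [r_0, ..., r_{count-1}] with r_n = c_{n+1}/c_n and c_0 = c_1 = 1."""
    ratios = []
    c_prev, c_cur = 1, 1
    for _ in range(count):
        ratios.append(c_cur / c_prev)
        c_prev, c_cur = c_cur, c_prev + c_cur
    return ratios


if __name__ == "__main__":
    r = fibonacci_ratios(40)
    print(r[:5])                      # first few ratios r_0, ..., r_4
    print(1.0 / r[-1])                # estimate of the radius of convergence
    print((math.sqrt(5) - 1) / 2)     # the claimed value of R
```

The odd-indexed ratios decrease and the even-indexed ratios increase toward the common limit $1/R$, which matches the monotonicity argument based on (7.10).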
Since a power series converges uniformly on closed bounded subintervals
of (a − R, a + R), 7.3.13 implies that the series is continuous on the entire
interval. The following theorem extends continuity to the endpoints.
7.4.4 Abel's Continuity Theorem. Let $s(x) := \sum_{n=0}^{\infty} c_n (x - a)^n$ have radius of convergence $R$ with $0 < R < +\infty$. If $s(x)$ converges at $x = a + R$, then $s(x)$ converges uniformly on $[b, a + R]$ for any $b \in (a - R, a + R)$. In particular, $s$ is continuous on $(a - R, a + R]$.
Proof. The transformation $x = Ry + a$ produces a power series in $y = (x - a)/R$ that converges on $(-1, 1]$. Hence we may assume in the original series that $a = 0$ and $s(x)$ converges on $(-1, 1]$. It suffices then to show that $s(x)$ converges uniformly on $[0, 1]$.
Let $s_n(x) = \sum_{k=0}^{n} c_k x^k$, $0 \le x \le 1$. For $n > m > 1$, define
$$C_{m,n} = \sum_{k=m}^{n} c_k = s_n(1) - s_{m-1}(1).$$
By 6.4.4,
$$s_n(x) - s_{m-1}(x) = \sum_{k=m}^{n-1} C_{m,k} (x^k - x^{k+1}) + C_{m,n} x^n - C_{m-1,n} x^m.$$
Since $\sum_n c_n$ converges, given $\varepsilon > 0$, we may choose $N$ such that $|C_{m,n}| < \varepsilon/3$ for all $n > m \ge N$. Then for all $n > m \ge N$,
$$|s_n(x) - s_{m-1}(x)| \le \sum_{k=m}^{n-1} |C_{m,k}| (x^k - x^{k+1}) + |C_{m,n}| + |C_{m-1,n}| \le \frac{\varepsilon}{3} \sum_{k=m}^{n-1} (x^k - x^{k+1}) + \frac{2\varepsilon}{3}.$$
Since $\sum_{k=m}^{n-1} (x^k - x^{k+1}) = x^m - x^n \le 1$, the last expression is $\le \varepsilon$. This shows that $\{s_n\}$ is uniformly Cauchy on $[0, 1]$, hence converges uniformly.
The next result shows that a power series may be differentiated or integrated
term by term over the interior of the interval of convergence.
7.4.5 Theorem. Let $s(x) := \sum_{n=0}^{\infty} c_n (x - a)^n$ have radius of convergence $R > 0$. Then the derived series and the integrated series
$$D(x) := \sum_{n=1}^{\infty} n c_n (x - a)^{n-1} \quad \text{and} \quad I(x) := \sum_{n=0}^{\infty} \frac{c_n}{n + 1} (x - a)^{n+1}$$
have radius of convergence $R$. Moreover, $s(x)$ is differentiable on the interval $(a - R, a + R)$, and for $x \in (a - R, a + R)$
$$s'(x) = D(x) \quad \text{and} \quad \int_a^x s(t)\, dt = I(x).$$
Proof. Since $\lim_n n^{1/n} = \lim_n 1/(n + 1)^{1/n} = 1$,
$$\limsup_n |n c_n|^{1/n} = \limsup_n |c_n/(n + 1)|^{1/n} = \limsup_n |c_n|^{1/n}.$$
Therefore, the series $s(x)$, $D(x)$, and $I(x)$ have the same radius of convergence. Since the differentiation and integration take place on closed subintervals where the convergence of each of the three series is uniform, the remaining assertions follow from 7.3.13.
Representation of Functions by Power Series
A power series $s(x) = \sum_{n=0}^{\infty} c_n (x - a)^n$ is said to represent a function $f$ on an interval $I$ if $f = s$ on $I$. The largest interval for which the representation is valid is called the representation interval. Note that the representation interval may be smaller than the convergence interval. (See the examples below.)
Power series representations are unique. Indeed, if $f$ is represented by $\sum_{n=0}^{\infty} c_n (x - a)^n$ on $I_a := (a - r, a + r)$, $r > 0$, then, by 7.4.5, $f$ has derivatives of all orders on $I_a$, and repeated differentiation of the identity
$$f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n, \quad x \in I_a,$$
shows that $f^{(n)}(a) = c_n n!$. Therefore, if $f$ has a power series representation about $a$, then
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n, \quad x \in I_a.$$
The last series is called the Taylor series expansion of $f$ about $a$. For $a = 0$ it is called a Maclaurin series.
The following examples show how various power series representations may be obtained from the geometric series representation of $(1 - x)^{-1}$ given in (7.5).
7.4.6 Examples. (a) Differentiating (7.5) term by term and multiplying the result by $x$ yields the representation
$$\frac{x}{(1 - x)^2} = \sum_{n=1}^{\infty} n x^n, \quad |x| < 1. \tag{7.11}$$
(b) Replacing $x$ in (7.5) by $-t$ and integrating produces
$$\ln(x + 1) = \int_0^x \frac{1}{1 + t}\, dt = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n, \quad |x| < 1. \tag{7.12}$$
Since the series converges at $x = 1$, Abel's continuity theorem shows that
$$\ln 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n},$$
a result obtained in 6.4.8 by another method.
(c) Replacing $x$ in (7.5) by $-t^2$ and integrating produces
$$\arctan x = \int_0^x \frac{1}{1 + t^2}\, dt = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1}, \quad |x| < 1. \tag{7.13}$$
(d) For an example with $a \ne 0$, consider
$$\frac{3}{5 - 2x} = \frac{3}{3 - 2(x - 1)} = \frac{1}{1 - 2(x - 1)/3} = \sum_{n=0}^{\infty} \frac{2^n}{3^n} (x - 1)^n, \quad |x - 1| < \frac{3}{2}. \quad ♦$$
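The representation (7.12) and its endpoint value can be checked with partial sums. This is a rough numerical sketch with a hypothetical helper `log1p_series`; it shows fast convergence inside the interval and the slow convergence at $x = 1$ that Abel's theorem covers.

```python
import math


def log1p_series(x, terms):
    """Partial sum of sum_{n>=1} (-1)**(n+1) * x**n / n with the given number of terms."""
    return sum((-1) ** (n + 1) * x**n / n for n in range(1, terms + 1))


if __name__ == "__main__":
    print(log1p_series(0.5, 60), math.log(1.5))      # interior point
    print(log1p_series(1.0, 10000), math.log(2.0))   # endpoint x = 1
```

At $x = 1$ the error after $N$ terms is at most $1/(N + 1)$ by the alternating series estimate, so many more terms are needed than at interior points.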
The next example and the theorem thereafter show that differentiation can
be a powerful tool for finding a closed form for a power series.
7.4.7 Example. We show that
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad -\infty < x < +\infty. \tag{7.14}$$
Let $s(x)$ denote the series. By 7.4.2, the radius of convergence of $s$ is
$$\lim_n \frac{(n + 1)!}{n!} = \lim_n (n + 1) = +\infty,$$
so $s(x)$ converges for all $x$. Differentiating the series term by term yields $s'(x) = s(x)$. Now set $g(x) = e^{-x} s(x)$. Then $g'(x) = e^{-x} [s'(x) - s(x)] = 0$, hence $g$ is constant. Since $g(0) = 1$, $s(x) = e^x$. ♦
The following result is an extension of the binomial theorem. The coefficient $\binom{a}{n}$ in (7.15) is called a generalized binomial coefficient.
7.4.8 Binomial Series. For any $a \in \mathbb{R}$ and $|x| < 1$,
$$(1 + x)^a = \sum_{n=0}^{\infty} \binom{a}{n} x^n, \quad \binom{a}{n} := \frac{a(a - 1) \cdots (a - n + 1)}{n!}, \quad \binom{a}{0} := 1. \tag{7.15}$$
Proof. Let $s(a, x)$ denote the series in (7.15). A simple calculation shows that
$$\left| \binom{a}{n} \binom{a}{n+1}^{-1} \right| = \frac{n + 1}{|a - n|} \to 1.$$
Therefore, by 7.4.2, $s(a, x)$ converges for $|x| < 1$. For such $x$,
$$(1 + x) s(a - 1, x) = \sum_{n=0}^{\infty} \binom{a-1}{n} x^n + \sum_{n=0}^{\infty} \binom{a-1}{n} x^{n+1}$$
$$= 1 + \sum_{n=0}^{\infty} \left[ \binom{a-1}{n+1} + \binom{a-1}{n} \right] x^{n+1}$$
$$= 1 + \sum_{n=0}^{\infty} \binom{a}{n+1} x^{n+1}$$
$$= s(a, x), \tag{7.16}$$
where for the third equality we used the identity (Exercise 6)
$$\binom{a-1}{n} + \binom{a-1}{n+1} = \binom{a}{n+1}, \quad n \in \mathbb{Z}^+. \tag{7.17}$$
Now differentiate the series $s(a, x)$ term by term to obtain
$$s'(a, x) = \sum_{n=1}^{\infty} \binom{a}{n} n x^{n-1} = \sum_{n=0}^{\infty} \binom{a}{n+1} (n + 1) x^n = a \sum_{n=0}^{\infty} \binom{a-1}{n} x^n = a\, s(a - 1, x). \tag{7.18}$$
Set $g(x) = (1 + x)^{-a} s(a, x)$, $|x| < 1$. By (7.18) and (7.16),
$$g'(x) = -a(1 + x)^{-a-1} s(a, x) + a(1 + x)^{-a} s(a - 1, x)$$
$$= a(1 + x)^{-a-1} \big[ -s(a, x) + (1 + x) s(a - 1, x) \big]$$
$$= 0.$$
Therefore, $g(x) = g(0) = 1$, hence $s(a, x) = (1 + x)^a$, as claimed.
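As a sanity check on (7.15), partial sums of the binomial series can be compared with $(1 + x)^a$ for non-integer exponents. The helpers `generalized_binomial` and `binomial_series` below are hypothetical names, and the comparison is numerical only.

```python
def generalized_binomial(a, n):
    """Return a(a-1)...(a-n+1)/n!, built up one factor at a time."""
    coeff = 1.0
    for j in range(n):
        coeff *= (a - j) / (j + 1)
    return coeff


def binomial_series(a, x, terms):
    """Partial sum of sum_{n>=0} binom(a, n) * x**n with the given number of terms."""
    return sum(generalized_binomial(a, n) * x**n for n in range(terms))


if __name__ == "__main__":
    print(binomial_series(0.5, 0.3, 80), 1.3**0.5)
    print(binomial_series(-1.5, -0.4, 200), 0.6**-1.5)
```

For a nonnegative integer exponent the coefficients vanish for $n > a$, so the series reduces to the ordinary binomial theorem.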
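7.4.9 Example. Replacing $x$ in (7.15) by $-x$, we have
$$\frac{1}{\sqrt{1 - x}} = \sum_{n=0}^{\infty} \binom{-1/2}{n} (-1)^n x^n, \quad |x| < 1.$$
Since
$$\binom{-1/2}{n} = \frac{1}{n!} \left( -\frac{1}{2} \right) \left( -\frac{3}{2} \right) \cdots \left( -\frac{2n - 1}{2} \right) = \frac{(-1)^n\, 1 \cdot 3 \cdot 5 \cdots (2n - 1)}{n!\, 2^n}$$
$$= \frac{(-1)^n\, 1 \cdot 2 \cdot 3 \cdot 4 \cdots (2n - 1) \cdot 2n}{n!\, 2^n\, 2 \cdot 4 \cdots 2n} = \frac{(-1)^n (2n)!}{(n!)^2 4^n},$$
we see that
$$\frac{1}{\sqrt{1 - x}} = \sum_{n=0}^{\infty} \frac{(2n)!}{(n!)^2 4^n} x^n, \quad |x| < 1. \tag{7.19}$$
Replacing $x$ by $t^2$ and integrating term by term from $0$ to $x$ yields the Maclaurin series for $\arcsin x$:
$$\arcsin x = \sum_{n=0}^{\infty} \frac{(2n)!}{(n!)^2 (2n + 1) 4^n} x^{2n+1}, \quad |x| < 1. \tag{7.20}$$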
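The Maclaurin series (7.20) can be compared with a library arcsine. The sketch below is an illustration with a hypothetical helper `arcsin_series`; it updates the coefficient $(2n)!/((n!)^2 4^n)$ through the ratio $(2n + 1)/(2n + 2)$ of consecutive coefficients instead of computing factorials.

```python
import math


def arcsin_series(x, terms):
    """Partial sum of (7.20) with the given number of terms."""
    total = 0.0
    coeff = 1.0  # (2n)! / ((n!)^2 4^n) at n = 0
    for n in range(terms):
        total += coeff * x ** (2 * n + 1) / (2 * n + 1)
        coeff *= (2 * n + 1) / (2 * n + 2)  # move from index n to n + 1
    return total


if __name__ == "__main__":
    for x in (0.1, 0.5, 0.9):
        print(x, arcsin_series(x, 400), math.asin(x))
```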
7.4.10 Remark. If $a > 0$ and is not an integer, then the binomial series converges absolutely uniformly on $[-1, 1]$. Indeed, if $a_n = \left| \binom{a}{n} \right|$, then
$$\frac{a_n}{a_{n+1}} = \frac{|a(a - 1) \cdots (a - n + 1)|}{n!} \cdot \frac{(n + 1)!}{|a(a - 1) \cdots (a - n)|} = \frac{n + 1}{|a - n|},$$
hence, for sufficiently large $n$,
$$n \left( \frac{a_n}{a_{n+1}} - 1 \right) = n \left( \frac{n + 1}{n - a} - 1 \right) = \frac{n(1 + a)}{n - a} \to 1 + a > 1.$$
By Raabe's test (6.3.2) the series converges absolutely at $x = \pm 1$, hence, by Abel's continuity theorem (7.4.4), the series converges absolutely uniformly on the interval $[-1, 1]$. ♦
Multiplication of Power Series
7.4.11 Definition. The Cauchy product of the power series $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$ is the power series $\sum_{n=0}^{\infty} c_n x^n$, where
$$c_n = \sum_{k=0}^{n} a_k b_{n-k}. \quad ♦$$
Note that $\sum_n c_n x^n$ is precisely the series one obtains by formally carrying out the multiplication
$$(a_0 + a_1 x + a_2 x^2 + \cdots)(b_0 + b_1 x + b_2 x^2 + \cdots)$$
and collecting like powers. We show below that if the power series $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$ converge for $|x| < R$, then so does the Cauchy product. For this we need the following result due to Mertens.
7.4.12 Lemma. If the numerical series $A := \sum_{n=0}^{\infty} \alpha_n$ and $B := \sum_{n=0}^{\infty} \beta_n$ both converge, and if at least one of the series converges absolutely, then the Cauchy product
$$C := \sum_{n=0}^{\infty} \gamma_n, \quad \gamma_n = \sum_{k=0}^{n} \alpha_k \beta_{n-k},$$
converges and $C = AB$.
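Proof. Assume that $\sum_{n=0}^{\infty} \alpha_n$ converges absolutely. Let
$$A_n = \sum_{k=0}^{n} \alpha_k, \quad B_n = \sum_{k=0}^{n} \beta_k, \quad C_n = \sum_{k=0}^{n} \gamma_k, \quad \text{and} \quad A' = \sum_{n=0}^{\infty} |\alpha_n|.$$
Then
$$C_n = \alpha_0 \beta_0 + (\alpha_0 \beta_1 + \alpha_1 \beta_0) + \cdots + (\alpha_0 \beta_n + \alpha_1 \beta_{n-1} + \cdots + \alpha_n \beta_0)$$
$$= \alpha_0 B_n + \alpha_1 B_{n-1} + \cdots + \alpha_n B_0$$
$$= \alpha_0 (B_n - B + B) + \alpha_1 (B_{n-1} - B + B) + \cdots + \alpha_n (B_0 - B + B)$$
$$= \alpha_0 (B_n - B) + \alpha_1 (B_{n-1} - B) + \cdots + \alpha_n (B_0 - B) + A_n B.$$
Thus to show that $C_n \to AB$ it suffices to verify that
$$\alpha_0 (B_n - B) + \alpha_1 (B_{n-1} - B) + \cdots + \alpha_n (B_0 - B) \to 0.$$
Given $\varepsilon > 0$, choose $N$ such that
$$|B_n - B| < \varepsilon/(2A') \quad \text{for all } n > N.$$
Since $\alpha_n \to 0$, we may choose $N' > N$ so that for all $n > N'$
$$|\alpha_n (B_0 - B) + \alpha_{n-1} (B_1 - B) + \cdots + \alpha_{n-N} (B_N - B)| < \varepsilon/2.$$
For such $n$,
$$|\alpha_0 (B_n - B) + \alpha_1 (B_{n-1} - B) + \cdots + \alpha_n (B_0 - B)|$$
$$\le |\alpha_n (B_0 - B) + \alpha_{n-1} (B_1 - B) + \cdots + \alpha_{n-N} (B_N - B)|$$
$$\quad + |\alpha_{n-N-1}|\,|B_{N+1} - B| + |\alpha_{n-N-2}|\,|B_{N+2} - B| + \cdots + |\alpha_0|\,|B_n - B|$$
$$< \varepsilon/2 + \varepsilon/2 = \varepsilon.$$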
7.4.13 Cauchy Product Theorem. For each $x$, let
$$C(x) = \sum_{n=0}^{\infty} c_n x^n, \quad c_n := \sum_{k=0}^{n} a_k b_{n-k},$$
be the Cauchy product of the series $A(x) = \sum_{n=0}^{\infty} a_n x^n$ and $B(x) = \sum_{n=0}^{\infty} b_n x^n$. If $A(x)$ and $B(x)$ have radii of convergence $R_a$ and $R_b$, respectively, then $C(x)$ has radius of convergence $R_c \ge \min\{R_a, R_b\}$ and
$$C(x) = A(x) B(x), \quad |x| < \min\{R_a, R_b\}. \tag{7.21}$$
Moreover, if, say, $R_b < R_a$ and $B(R_b)$ converges, then $C(R_b)$ converges and $C(R_b) = A(R_b) B(R_b)$.
Proof. Assume that $R_b \le R_a$ and let $|x| < R_b$. By 7.4.12 applied to $\alpha_n = a_n x^n$ and $\beta_n = b_n x^n$, the series $C(x)$ converges, hence $R_c \ge |x|$ and (7.21) holds. Since $|x|$ was arbitrary, $R_c \ge R_b = \min\{R_a, R_b\}$. The last assertion of the theorem follows from 7.4.4 by letting $x \uparrow R_b$ in (7.21).
7.4.14 Example. By (7.5) and (7.14), for $|x| < 1$
$$\frac{e^x}{1 + x} = \sum_{n=0}^{\infty} c_n x^n, \quad \text{where } c_n = \sum_{k=0}^{n} \frac{(-1)^{n-k}}{k!} = (-1)^n \sum_{k=0}^{n} \frac{(-1)^k}{k!}. \quad ♦$$
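The coefficients in Example 7.4.14 can be reproduced by forming the Cauchy product of truncated coefficient lists. This is a minimal sketch: `cauchy_product` and `closed_form` are hypothetical helpers, and truncating both series at `N` terms is an assumption that leaves the first `N` product coefficients exact.

```python
import math


def cauchy_product(a, b):
    """Return c_n = sum_{k=0}^{n} a[k] * b[n-k] for n below min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


def closed_form(n):
    """The coefficient (-1)^n * sum_{k=0}^{n} (-1)^k / k! from Example 7.4.14."""
    return (-1) ** n * sum((-1) ** k / math.factorial(k) for k in range(n + 1))


N = 40
exp_coeffs = [1.0 / math.factorial(n) for n in range(N)]  # e^x, from (7.14)
geo_coeffs = [(-1.0) ** n for n in range(N)]              # 1/(1+x), from (7.5)
c = cauchy_product(exp_coeffs, geo_coeffs)

if __name__ == "__main__":
    x = 0.5
    print(sum(c[n] * x**n for n in range(N)), math.exp(x) / (1 + x))
```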
Remark. If $R_a = R_b$ and both $A(R_a)$ and $B(R_b)$ in 7.4.13 converge, it does not necessarily follow that $C(R_a)$ converges. Consider, for example, $A(x) = B(x) = \sum_{n=1}^{\infty} (-1)^n x^n/\sqrt{n}$, which has radius of convergence $1$ and converges conditionally at $x = 1$. The Cauchy product at $x = 1$ is $\sum_{n=1}^{\infty} c_n$, where
$$c_n = (-1)^n \sum_{k=1}^{n-1} \frac{1}{\sqrt{k(n - k)}}.$$
However, for odd $n$,
$$|c_n| = \sum_{k=1}^{n-1} \frac{1}{\sqrt{k(n - k)}} \ge \sum_{k=1}^{(n-1)/2} \frac{1}{\sqrt{k(n - k)}} \ge \sum_{k=1}^{(n-1)/2} \frac{1}{\sqrt{(n - 1)^2/2}} = \frac{\sqrt{2}}{2},$$
hence $\sum_n c_n$ diverges. ♦
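The lower bound $|c_n| \ge \sqrt{2}/2$ in the preceding remark can be observed numerically. The sketch below, with the hypothetical helper `cauchy_coeff_abs`, only evaluates the finite sums; the divergence itself follows from the fact that the terms do not tend to zero.

```python
import math


def cauchy_coeff_abs(n):
    """Return |c_n| = sum_{k=1}^{n-1} 1/sqrt(k(n-k)) for the series in the remark."""
    return sum(1.0 / math.sqrt(k * (n - k)) for k in range(1, n))


if __name__ == "__main__":
    for n in (3, 11, 101, 1001):
        print(n, cauchy_coeff_abs(n), math.sqrt(2) / 2)
```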
Analytic Functions
7.4.15 Definition. A function $f$ is said to be (real) analytic at a point $a$ if, for some $r > 0$, $f$ has derivatives of all orders on $(a - r, a + r)$ and is represented there by its Taylor series at $a$, that is,
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n, \quad |x - a| < r.$$
If $f$ is analytic at each point of a set $E$, then $f$ is said to be analytic on $E$. ♦
A function that has derivatives of all orders on an interval may not be
analytic there. This is the case for the function in Exercise 29 below. The
following theorem gives a necessary and sufficient condition for analyticity at
a point.
7.4.16 Taylor Series Representation. Let $f$ have derivatives of all orders on an open interval $I$ containing $a$. Then $f$ is analytic at $a$ iff there exist positive constants $M$ and $r$ such that
$$|f^{(k)}(x)| \le k!\, M^k \quad \text{for all } k \in \mathbb{N} \text{ and } x \in (a - r, a + r). \tag{7.22}$$
Proof. Assume condition (7.22) holds. To prove that $f$ is analytic at $a$ we use Taylor's theorem (Section 4.6), which asserts that for each $n \in \mathbb{N}$ and $x \in (a - r, a + r)$ there exists a number $c = c(n, x)$ between $x$ and $a$ such that $f(x) = T_n(x) + R_n(x)$, where
$$T_n(x) := \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!} (x - a)^k \quad \text{and} \quad R_n(x) := \frac{f^{(n)}(c)}{n!} (x - a)^n.$$
Shrinking $r$ if necessary, we may assume that $r \in (0, 1/M)$. Let $|x - a| < r$. By hypothesis,
$$|R_n(x)| \le M^n |x - a|^n \le (Mr)^n.$$
Since $Mr < 1$, $R_n(x) \to 0$, hence $T_n(x) \to f(x)$. Therefore, $f$ is analytic at $a$.
Conversely, let $f$ be analytic at $a$. Then there exist constants $r_1 \in (0, 1)$ and $c_n$ such that
$$f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n, \quad |x - a| \le r_1. \tag{7.23}$$
In particular, $|c_n r_1^n| \to 0$. Choose $M_1 > 1$ so that $|c_n r_1^n| < M_1$ for all $n$. Termwise differentiation of (7.23) yields
$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) c_n (x - a)^{n-k},$$
hence for $|x - a| \le r_1/2$,
$$|f^{(k)}(x)| \le \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) |c_n| (r_1/2)^{n-k} \le M_1 r_1^{-k} \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) (1/2)^{n-k}.$$
The last series is the $k$th derivative of the geometric series for $(1 - x)^{-1}$ evaluated at $1/2$ and therefore equals
$$\left. \frac{d^k}{dx^k} (1 - x)^{-1} \right|_{x=1/2} = k!\, (1 - 1/2)^{-k-1} = k!\, 2^{k+1}.$$
Thus
$$|f^{(k)}(x)| \le M_1 r_1^{-k} k!\, 2^{k+1}, \quad |x - a| \le r_1/2.$$
To obtain (7.22), take $r = r_1/2$ and choose $M > 4M_1/r_1$, so that $M^k > M_1 r_1^{-k} 2^{k+1}$ for all $k$.
7.4.17 Example. Let $f(x) = \sin x$. Then $f^{(2k)}(0) = 0$ and $f^{(2k+1)}(0) = (-1)^k$. Since the derivatives of $f$ are bounded, (7.22) holds for all $x$. Therefore,
$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n + 1)!} x^{2n+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \quad -\infty < x < +\infty.$$
Similarly,
$$\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} x^{2n} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \quad -\infty < x < +\infty. \quad ♦$$
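The Maclaurin series of Example 7.4.17 can be compared with library values. The sketch below uses hypothetical helpers `sin_taylor` and `cos_taylor`; it is a numerical illustration and says nothing about rounding behavior for large $|x|$.

```python
import math


def sin_taylor(x, terms):
    """Partial sum of the Maclaurin series of sin with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))


def cos_taylor(x, terms):
    """Partial sum of the Maclaurin series of cos with the given number of terms."""
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))


if __name__ == "__main__":
    for x in (0.5, 2.0, -3.0):
        print(x, sin_taylor(x, 20), math.sin(x), cos_taylor(x, 20), math.cos(x))
```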
It is clear from 7.3.2 and 7.4.13 that the sum and product of functions
analytic at a are analytic at a. In Exercise 33 the reader is asked to show that
the reciprocal of a nonzero analytic function is analytic. It follows that the
ratio of two analytic functions, if defined, is analytic.
The next result extends the property of analyticity to nearby points.
7.4.18 Theorem. If the series $f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n$ converges on $I := (a - r, a + r)$, then $f$ is analytic on $I$.
Proof. By considering $g(x) = f(x + a)$, we may suppose that $a = 0$. Let $|b| < r$, $0 < s < r - |b|$, and $|x - b| < s$. We show that $f$ has a power series expansion about $b$ on the interval $|x - b| < s$. Since $b$ is arbitrary, it will follow that $f$ is analytic on $I$.
By the binomial theorem applied to $\big( (x - b) + b \big)^n$,
$$f(x) = \sum_{n=0}^{\infty} a_n \sum_{k=0}^{n} \binom{n}{k} (x - b)^k b^{n-k} = \sum_{n=0}^{\infty} \sum_{k=0}^{\infty} a_n d_{k,n} (x - b)^k b^{n-k}, \tag{7.24}$$
where $d_{k,n} = \binom{n}{k}$ for $k = 0, 1, \ldots, n$ and $d_{k,n} = 0$ for $k > n$. Now,
$$\sum_{n=0}^{\infty} \sum_{k=0}^{\infty} |a_n (x - b)^k b^{n-k} d_{k,n}| = \sum_{n=0}^{\infty} |a_n| \sum_{k=0}^{n} \binom{n}{k} |x - b|^k |b|^{n-k} = \sum_{n=0}^{\infty} |a_n| (|x - b| + |b|)^n.$$
If $|x - b| < s$ then $|x - b| + |b| < s + |b| < r$ and the last series converges. Therefore, (7.24) converges uniformly for $|x - b| < s$. By 6.5.4, the order of summation may be interchanged, so
$$f(x) = \sum_{k=0}^{\infty} \sum_{n=0}^{\infty} a_n d_{k,n} (x - b)^k b^{n-k} = \sum_{k=0}^{\infty} \sum_{n=k}^{\infty} a_n \binom{n}{k} (x - b)^k b^{n-k} = \sum_{k=0}^{\infty} b_k (x - b)^k,$$
where
$$b_k := \sum_{n=k}^{\infty} a_n \binom{n}{k} b^{n-k}.$$
This shows that $f$ has a power series expansion about $b$ on $(b - s, b + s)$.
7.4.19 Theorem. Let $f$ be analytic on an open interval $I$ and let $f = 0$ on a subinterval $(a, b)$ of $I$. Then $f = 0$ on $I$.
Proof. Let $c \in I$, $c > b$, and define
$$A = \{ t \in (a, c) \mid f^{(n)} = 0 \text{ on } (a, t] \text{ for all } n \ge 0 \}.$$
Then $A \ne \emptyset$ and $t_0 := \sup A \le c$. Suppose, for a contradiction, that $t_0 < c$. Since $f$ is analytic at $t_0$, $f$ has a Taylor series representation about $t_0$ on $J := (t_0 - r, t_0 + r)$ for some $r > 0$. By continuity and the approximation property of suprema, $f^{(n)}(t_0) = 0$ for each $n$. It follows that $f$ is identically zero on $J$, contradicting the definition of $t_0$. Therefore, $t_0 = c$, hence $f = 0$ on $(a, c)$. Since $c$ was arbitrary, $f(x) = 0$ for all $x \in I$ with $x \ge a$. Similarly, $f(x) = 0$ for all $x \in I$ with $x \le b$.
The proof of the following corollary is left to the reader.
7.4.20 Corollary. Let $f$ and $g$ be analytic on the open intervals $I$ and $J$, respectively. If $I \cap J \ne \emptyset$ and $f = g$ on an open subinterval of $I \cap J$, then there exists an analytic function $h$ on $I \cup J$ such that $h|_I = f$ and $h|_J = g$.
The preceding corollary is known as analytic continuation, as it may be used to extend an analytic function to a larger interval.
Exercises
1. Find the interval of convergence of
P∞
n=1
fn (x), where fn (x) =
(−1)n n
23n n3 xn
n2 n!
(x − 1)n .
(b) √
(x − 2)n .
. (c)
n
2
(2n)!
n!
(1 + 2/n)n
(−1)n nxn
n!xn
(d) S
.
(e) n+2−1/n . (f)
(x + 1)n .
(n + 1) ln(n + 2)
3n
n
(1.5)(2.5) · · · (n + .5) n
1 n
2n + 5n 2n
(g) S [3 + (−1)n ]n sin
x . (i) S
x .
x . (h) n
n
n
3 +4
n!
(a) S
2. Use (7.5) to represent the following functions as power series about the
given point a. In each case, find the representation interval.
x3
x
x
(a)
, a = 0.
(b)S
, a = 0.
(c)
, a = 1.
2
(x + 1)
2 − 3x
3 + 2x
3. Use (7.12) to find power series representations for (a)S $x \ln x$, (b) $x^2 \ln x$ about the point $a = 1$.
4. Without using 7.4.16, find the Maclaurin series and representation interval
for the following functions.
2
1 + 2x
S
(a) ln
. (b) (1 + x2 ) arctan x. (c) x3 e−3x .
1 − 3x
√
ex − 1
sin x
cos x − 1
S
√ .
.
(e)
(d)
(f)
.
x
x2
x
1
(g) S sin x cos x.
(h) √
.
(i) sin(x + π/3).
9 − x2
5.S Use an identity and 7.4.9 to find the Maclaurin series for arccos x.
6. Verify the identity (7.17).
7. Without using 7.4.16, show that
(a) $\sin x = \sum_{n=0}^{\infty} a_n (x - a)^n$, $a_{2n} = \dfrac{(-1)^n \sin a}{(2n)!}$, $a_{2n+1} = \dfrac{(-1)^n \cos a}{(2n + 1)!}$.
(b) $\cos x = \sum_{n=0}^{\infty} b_n (x - a)^n$, $b_{2n} = \dfrac{(-1)^n \cos a}{(2n)!}$, $b_{2n+1} = \dfrac{(-1)^{n+1} \sin a}{(2n + 1)!}$.
8. Prove that
$$\sum_{n=0}^{\infty} \frac{4(-1)^n}{2n + 1} = \sum_{n=0}^{\infty} \frac{2(2n)!}{(n!)^2 (2n + 1) 4^n} = \pi.$$
9. Find a power series representation for
(a)
S
sin t − t
.
t3
√
(b)
cos t t.
10. Find a closed form for the series
(a) n2 xn .
Rx
P∞
n=0 (n
2
f (t) dt if f (t) =
(c)
P∞
cos t − 1
.
t
2
(d)
et − 1
.
t2
fn (x), |x| < 1, where fn (x) =
n=0
(b)S (−1)n (2n + 1)x2n+1 .
11.S Sum the series
0
(c)
xn+1
n2 xn
. (d)
.
(n + 1)(n + 2)
n+1
+ n + 1)3−n .
12. Use 7.4.13 to find a series representation and representation interval for
sin x
ln(1 − x)
e−x
arctan x
2
. (c)
(a)S
. (b) √
. (d)S ex sin x. (e)
.
2
1+x
1
−
x
x(1
+ x2 )
1−x
13. By calculating the Maclaurin series of the function $\sin^2 x$ in two ways, establish the identity
$$\frac{2^{2n+1}}{(2n + 2)!} = \sum_{k=0}^{n} \frac{1}{(2k + 1)!\,(2n - 2k + 1)!}.$$
14. By calculating the Maclaurin series of the function $\cos^2 x$ in two ways, establish the identity
$$\frac{2^{2n-1}}{(2n)!} = \sum_{k=0}^{n} \frac{1}{(2k)!\,(2n - 2k)!}.$$
15. By calculating the Maclaurin series of the function $(1 - x)^{-3/2}$ in two ways, establish the identity
$$\frac{(2n + 1)!}{(n!)^2 4^n} = \sum_{k=0}^{n} \frac{(2k)!}{(k!)^2 4^k}.$$
16.S Show that the Fibonacci power series $s(x)$ (7.4.3) has the closed form
$$(1 - x - x^2)^{-1}, \quad |x| < \left( \sqrt{5} - 1 \right)/2.$$
Conclude from Abel's continuity theorem (7.4.4) that $s(x)$ cannot converge at the endpoint $(\sqrt{5} - 1)/2$.
17. Let $a_n \to L \in \mathbb{R}$ and set $s(x) := \sum_{n=0}^{\infty} a_n x^n$, $|x| < 1$. For $m \in \mathbb{N}$, define
$$\varphi_m(x) := \sum_{k=0}^{2m-1} (-1)^k x^k.$$
Prove that $\lim_{x \to 1^-} \varphi_m(x) s(x) = mL$. Hint. Use Abel's continuity theorem.
18.S Use the method of 7.4.9 to establish the representation
$$\ln\left( \sqrt{1 + x^2} + x \right) = \sum_{n=0}^{\infty} \frac{(-1)^n (2n)!}{(2n + 1)(n!)^2 4^n} x^{2n+1}, \quad |x| < 1.$$
19. Let $R$ be the radius of convergence of $\sum_n c_n (x - a)^n$ and let $p \ge 0$. Prove:
(a) If $\liminf_n |c_n| n^p > 0$, then $R \le 1$.
(b) If $\limsup_n |c_n|/n^p < +\infty$, then $R \ge 1$.
20. Let $R_a$ and $R_b$ denote the radii of convergence of $A(x) := \sum_n a_n x^n$ and $B(x) := \sum_n b_n x^n$, respectively. Suppose that
$$\limsup_n (|a_n|/|b_n|) < +\infty.$$
Prove that $R_a \ge R_b$.
21.S Let $R_s$ and $R_t$ denote the radii of convergence of
$$s(x) := \sum_n c_n (x - a)^n \quad \text{and} \quad t(x) := \sum_n c_{n^2} (x - a)^n,$$
respectively. Prove:
(a) If $R_s > 1$, then $R_t = +\infty$.
(b) If $R_s \le 1$, then no conclusion is possible.
22. Let $R_s$ and $R_t$ denote the radii of convergence of
$$s(x) := \sum_n c_n (x - a)^n \quad \text{and} \quad t(x) := \sum_n c_n (x - a)^{n^2},$$
respectively. Prove:
(a)S If $0 < R_s < +\infty$, then $R_t = 1$.
(b) If $R_s = 0$, then $R_t \le 1$, and any value of $R_t \le 1$ is possible.
(c) If $R_s = +\infty$, then $R_t \ge 1$, and any value of $R_t \ge 1$ is possible.
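23. Suppose that the series
$$A := \sum_{n=0}^{\infty} a_n, \quad B := \sum_{n=0}^{\infty} b_n, \quad \text{and} \quad C := \sum_{n=0}^{\infty} c_n$$
converge, where $c_n = \sum_{k=0}^{n} a_k b_{n-k}$. Use 7.4.12 and 7.4.4 to prove that $AB = C$.
24. Prove that for any $a, b \in \mathbb{R}$ and $n \in \mathbb{N}$,
$$\binom{a}{0} \binom{b}{n} + \binom{a}{1} \binom{b}{n-1} + \cdots + \binom{a}{n} \binom{b}{0} = \binom{a + b}{n}.$$
25. Let $n \in \mathbb{Z}^+$. The Bessel function of order $n$ may be defined as the power series
$$J_n(x) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(n + k)!\, k!} \left( \frac{x}{2} \right)^{n+2k}.$$
Prove:
(a) The radius of convergence of $J_n(x)$ is $+\infty$.
(b) $J_n$ satisfies Bessel's differential equation $x^2 y'' + x y' + (x^2 - n^2) y = 0$.
(c) $\dfrac{d}{dx} \big[ x^n J_n(x) \big] = x^n J_{n-1}(x)$, $n \ge 1$.
26. Prove that
$$\lim_{x \to 1^-} \sum_{n=1}^{\infty} \frac{x^n}{n^p} \ \begin{cases} = +\infty & \text{if } p \le 1, \\ < +\infty & \text{if } p > 1. \end{cases}$$
27.S Let $\{c_n\}$ tend monotonically to $0$. Prove that $\sum_n c_n x^n$ is continuous on $[-1, 1)$.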
28. Let $f(x)$ be bounded on $[0, 1]$.
(a)S Prove that $t(x) := \sum_n n x^n f(x)$ converges pointwise on $[0, 1)$ and uniformly on $[0, r]$ for $0 < r < 1$.
(b) Suppose that $L := \lim_{x \to 1^-} (1 - x)^{-2} f(x)$ exists. Prove that the convergence of $t(x)$ in (a) is uniform on $[0, 1)$ iff $L = 0$. (Compare with Exercise 7.3.7.)
29. Show that the function
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0, \\ 0 & \text{otherwise} \end{cases}$$
is not analytic at $0$. (See Exercise 4.6.1.)
30.S Prove 7.4.20.
31. Prove: If $f(x)$ is analytic at $a$, then $f'(x)$ and $g(x) := \int_a^x f(t)\, dt$ are analytic at $a$.
32. Let $f$ be analytic at $a$ and let $\{a_n\}$ be a sequence of distinct real numbers such that $a_n \to a$ and $f(a_n) = 0$ for all $n$. Prove that $f$ is identically zero in a neighborhood of $a$. Hint. Assume that $a_n \uparrow a$ (how?). Construct, by induction, sequences $\{a_n^{(k)}\}_n$ such that $\lim_n a_n^{(k)} = a$ and $f^{(k)}(a_n^{(k)}) = 0$ for all $n$ and $k$.
33.S Let $f$ be analytic at $a$ and $f(a) \ne 0$. Carry out the following steps to show that $1/f$ is analytic at $a$.
(a) Assume that $f(a) = 1$ and that
$$f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n \ne 0, \quad |x - a| < r,$$
for some $r$. Define a series $g$ formally by
$$g(x) = \sum_{n=0}^{\infty} b_n (x - a)^n,$$
where the sequence $\{b_n\}$ is given recursively by
$$b_0 = 1 \quad \text{and} \quad b_n = -\sum_{k=1}^{n} a_k b_{n-k}, \quad n \ge 1.$$
Show that if $g(x)$ converges for $|x - a| < r_1$ for some $0 < r_1 < r$, then $f(x) g(x) = 1$ for $|x - a| < r_1$.
(b) Show that if $|a_n| \le M^n$ for all $n$, then $|b_n| \le (2M)^n$ for all $n$.
(c) Conclude that $g$ is analytic at $a$ and that $g = 1/f$.
Part II
Functions of Several
Variables
Chapter 8
Metric Spaces
The essential feature in the notion of limit of a function is the idea of nearness.
This is made precise by a distance function, which, in the case of limits on R,
is derived from the absolute value function. It turns out that there are many
other important mathematical structures equipped with a distance function
and therefore admitting a definition of limit. In this chapter, we examine the
general properties of these structures.
8.1
Definitions and Examples
8.1.1 Definition. A metric on a nonempty set X is a function d : X × X → R
such that, for all x, y, z ∈ X,
(a) d(x, y) ≥ 0 (nonnegativity),
(b) d(x, y) = 0 iff x = y (coincidence),
(c) d(x, y) = d(y, x) (symmetry), and
(d) d(x, y) ≤ d(x, z) + d(y, z) (triangle inequality).
The ordered pair (X, d) is called a metric space. A nonempty subset E of X
with the restricted metric $d|_{E\times E}$ is called a subspace of X and is denoted by (E, d). ♦
The real number system is a metric space under the usual metric d(x, y) =
|x − y|. The following example shows that any nonempty set may be given a
metric.
8.1.2 Example. (Discrete metric space). On a nonempty set X define
d(x, x) = 0 for all x ∈ X, and d(x, y) = 1 if x ≠ y. Then d is easily seen to be
a metric, called the discrete metric on X. For example, the triangle inequality
d(x, y) ≤ d(x, z) + d(y, z) holds because the left side of the inequality is at
most 1, and if it equals 1 then x ≠ y, so either x ≠ z or y ≠ z, implying that
the right side is at least 1.
♦
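The axioms of 8.1.1 can be tested by brute force on a finite sample of points. The following Python sketch is only an illustration: the helper names `discrete_metric` and `satisfies_metric_axioms` are hypothetical, and a passing check on a sample does not prove that a function is a metric, although a failing check does refute it.

```python
from itertools import product

def discrete_metric(x, y):
    """Discrete metric of 8.1.2: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def satisfies_metric_axioms(d, points):
    """Check axioms (a)-(d) of 8.1.1 for every triple drawn from a finite sample."""
    for x, y, z in product(points, repeat=3):
        if d(x, y) < 0:                    # (a) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):     # (b) coincidence
            return False
        if d(x, y) != d(y, x):             # (c) symmetry
            return False
        if d(x, y) > d(x, z) + d(y, z):    # (d) triangle inequality
            return False
    return True

# The discrete metric passes on any sample of distinct objects.
print(satisfies_metric_axioms(discrete_metric, ["a", "b", "c", 1, 2.5]))
# The squared difference fails: d(0, 2) = 4 > d(0, 1) + d(2, 1) = 2.
print(satisfies_metric_axioms(lambda x, y: (x - y) ** 2, [0, 1, 2]))
```

The second call shows how a single violated triple is enough to rule out a candidate metric.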
8.1.3 Definition. A subset E of a metric space X is said to be bounded if
for some x0 ∈ X and M > 0, d(x, x0 ) ≤ M for all x ∈ E.
♦
The point x0 in the preceding definition may be replaced by any other
point y0 ∈ X since for x ∈ E,
d(x, y0 ) ≤ d(x, x0 ) + d(x0 , y0 ) ≤ M + d(x0 , y0 ).
The notions of convergence and completeness readily carry over to general
metric spaces:
8.1.4 Definition. A sequence {xn } in a metric space (X, d) is said to converge
to a member x of X if limn d(xn , x) = 0. In this case we write xn → x or
limn xn = x. A cluster point of a sequence in X is the limit of a convergent
subsequence.
♦
The limit of a sequence {xn } in X, if it exists, must be unique. Indeed, if
xn → x and xn → y, then, by the triangle inequality,
0 ≤ d(x, y) ≤ d(x, xn ) + d(y, xn ) → 0,
hence d(x, y) = 0 and so x = y.
8.1.5 Definition. A sequence {xn } in a metric space (X, d) is said to be
Cauchy if limm,n d(xm , xn ) = 0. A metric space (X, d) is said to be complete if
every Cauchy sequence in X converges to a member of X. A subset E of X is
complete if it is complete as a subspace of X, that is, every Cauchy sequence
in E converges to a member of E.
♦
The real number system is complete under the usual metric. The subspace
Q of R is not complete: a sequence of rational numbers converging to an
irrational number is Cauchy but has no limit in Q. A discrete metric space is
complete, since every Cauchy sequence is eventually constant and therefore
trivially converges.
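The incompleteness of Q can be seen concretely with exact rational arithmetic. The sketch below is an illustration rather than part of the text: it uses the Newton iteration x_{n+1} = x_n/2 + 1/x_n (an assumed choice of sequence, with the hypothetical helper name `newton_sqrt2`), whose terms are all rational, get arbitrarily close to one another, and have squares approaching 2 without ever equaling 2.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x_{n+1} = x_n/2 + 1/x_n, starting at x_1 = 1."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps - 1):
        x = x / 2 + 1 / x          # stays in Q: Fraction arithmetic is exact
        terms.append(x)
    return terms

terms = newton_sqrt2(6)
for a, b in zip(terms, terms[1:]):
    # distance between consecutive terms, and how far b^2 is from 2
    print(float(abs(a - b)), float(abs(b * b - 2)))
```

Since no rational number has square 2, the only possible limit (in R) lies outside Q, so this Cauchy sequence does not converge in the subspace Q.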
8.1.6 Proposition. (a) Every Cauchy sequence is bounded.
(b) Every convergent sequence is Cauchy, hence bounded.
Proof. (a) If {xn } is Cauchy, choose an index N such that d(xm , xn ) < 1 for
all m, n ≥ N . Then, for all n ∈ N,
d(xN , xn ) < 1 + max{d(xN , x1 ), d(xN , x2 ), . . . , d(xN , xN −1 )}.
(b) If xn → x, then the inequality d(xm, xn) ≤ d(xm, x) + d(xn, x) implies that
{xn} is Cauchy.
The notions of pointwise convergence and uniform convergence of a sequence
of real-valued functions easily extend to general metric spaces:
8.1.7 Definition. Let S be a nonempty set and let (X, d) be a metric space.
A sequence of functions fn : S → X is said to converge pointwise to a function
f : S → X if fn(s) → f(s) for each s ∈ S. In this case we write f = limn fn or
fn → f (on S). The sequence converges uniformly to f on S if for each ε > 0
there exists N ∈ N such that d(fn(s), f(s)) < ε for all n ≥ N and s ∈ S. ♦
8.1.8 Definition. Let X be a vector space. A norm on X is a function ‖·‖
from X to R such that for all x, y ∈ X and t ∈ R
(a) ‖x‖ ≥ 0 (nonnegativity),
(b) ‖x‖ = 0 iff x = 0 (coincidence),
(c) ‖tx‖ = |t| ‖x‖ (absolute homogeneity),
(d) ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality).
The pair (X, ‖·‖) is then called a normed vector space.
♦
The proof of the following proposition is left to the reader.
8.1.9 Proposition. If (X, ‖·‖) is a normed vector space, then the function
d(x, y) := ‖x − y‖ is a metric on X.
From 1.6.4 and Exercise 1.6.4 we see that ‖·‖2, ‖·‖1, and ‖·‖∞ are
norms on R^n, hence, according to 8.1.9, give rise to metrics. We denote these,
respectively, by d2, d1, and d∞.
In Exercise 17 the reader is asked to show that R^n is complete in each of
these metrics. The metric d2 is called the Euclidean metric on R^n. The metric
d1 is the ℓ^1 metric on R^n and d∞ the max metric on R^n. Clearly, for n = 1,
all three metrics reduce to absolute value on R.
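For a concrete feel for the three metrics, the following Python sketch computes d1, d2, and d∞ for a pair of points in R^3. The function names `d1`, `d2`, and `d_inf` and the sample points are chosen here for illustration only. The final line checks the standard chain d∞ ≤ d2 ≤ d1 ≤ n·d∞ on this one example.

```python
import math

def d1(x, y):
    """l^1 metric: sum of coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def d2(x, y):
    """Euclidean metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    """Max metric: largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (1.0, -2.0, 3.0), (4.0, 2.0, 3.0)   # coordinate differences 3, 4, 0
print(d1(x, y), d2(x, y), d_inf(x, y))      # 7.0 5.0 4.0
print(d_inf(x, y) <= d2(x, y) <= d1(x, y) <= len(x) * d_inf(x, y))
```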
8.1.10 Example. Let S be a nonempty set and let B(S) denote the set of all
bounded real-valued functions on S. Then B(S) is a vector space under the
operations of addition f + g and scalar multiplication cf defined by
(f + g)(s) = f (s) + g(s) and (cf )(s) = cf (s), s ∈ S.
The supremum norm of f ∈ B(S) is defined by
‖f‖∞ = sup {|f(s)| : s ∈ S}.
It is easy to check that ‖·‖∞ is indeed a norm. For example, the triangle
inequality follows by taking the supremum over s ∈ S in the inequality
|f(s) + g(s)| ≤ |f(s)| + |g(s)| ≤ ‖f‖∞ + ‖g‖∞.
Note that convergence of a sequence of functions in B(S) is simply uniform
convergence on S. For this reason, ‖·‖∞ is also called the uniform norm.
The space B(S) is complete in the metric d∞(f, g) := ‖f − g‖∞ induced by
the norm. To see this, let {fn} be a Cauchy sequence in B(S) and let ε > 0.
Choose N such that d∞(fn, fm) < ε for all m, n ≥ N. For such m, n
|fn(s) − fm(s)| < ε for all s ∈ S, (8.1)
hence {fn(s)} is a Cauchy sequence in R for every s ∈ S. Since R is complete,
fn(s) → f(s) for some f(s) ∈ R. Fixing n ≥ N in (8.1) and letting m → +∞ yields
|fn(s) − f(s)| ≤ ε for all s ∈ S and all n ≥ N.
This shows that f is bounded and that fn → f in B(S).
♦
In the case S = N, B(S) may be identified with the set of all bounded
sequences and as such is denoted by ℓ^∞.
8.1.11 Example. Let ℓ^1 denote the set of all sequences a = {an} in R such
that $\sum_n |a_n| < +\infty$. Clearly, ℓ^1 is a vector subspace of ℓ^∞. It is easy
to check that $\|a\|_1 := \sum_n |a_n|$ defines a norm on ℓ^1. We show that
(ℓ^1, ‖·‖1) is complete in this norm.
Let $\{a_n := (a_{1,n}, a_{2,n}, \ldots)\}_{n=1}^{\infty}$ be a Cauchy sequence in ℓ^1, and let ε > 0.
Choose N so that
\[ \|a_n - a_m\|_1 = \sum_{k=1}^{\infty} |a_{k,n} - a_{k,m}| < \varepsilon \quad\text{for all } n, m \ge N. \tag{8.2} \]
Since $|a_{k,n} - a_{k,m}| \le \|a_n - a_m\|_1$, the sequence $\{a_{k,n}\}_n$ is Cauchy for each k,
hence converges. Let $a_k = \lim_n a_{k,n}$. Fix K ∈ N and n ≥ N. From (8.2),
\[ \sum_{k=1}^{K} |a_{k,n} - a_{k,m}| < \varepsilon \quad\text{for all } m \ge N. \]
Letting m → +∞, we obtain $\sum_{k=1}^{K} |a_{k,n} - a_k| \le \varepsilon$. Since K was arbitrary,
\[ \|a_n - a\|_1 = \sum_{k=1}^{\infty} |a_{k,n} - a_k| \le \varepsilon \quad\text{for all } n \ge N. \]
It follows that a ∈ ℓ^1 and an → a.
♦
8.1.12 Definition. Let (X, d) and (Y, ρ) be metric spaces. The product metric
d × ρ on X × Y is defined by
(d × ρ)((x, y), (a, b)) := d(x, a) + ρ(y, b), x, a ∈ X, y, b ∈ Y.
The pair (X × Y, d × ρ) is called the product of the metric spaces X and Y . ♦
In Exercise 13 the reader is asked to prove, among other things, that d × ρ
is indeed a metric and that a sequence {(xn , yn )} converges to (a, b) in X × Y
in this metric iff xn → a in X and yn → b in Y .
Exercises
1.S Determine whether d is a metric on R^2, where d((x1, x2), (y1, y2)) =
(a) $2|x_1 - y_1| + 3|x_2 - y_2|$.
(b) $|x_1^2 - y_1^2| + |x_2^2 - y_2^2|$.
(c) $|x_1^3 - y_1^3| + |x_2^3 - y_2^3|$.
(d) $|x_1 - x_2| + |y_1 - y_2|$.
(e) $\dfrac{|x_1 - y_1| + |x_2 - y_2|}{2 + |x_1 - y_1| + |x_2 - y_2|}$.
(f) $|e^{x_1} - e^{y_1}| + |e^{x_2} - e^{y_2}|$.
2. (p-adic metric). Let p be a fixed prime number. Define ρp(n, n) = 0, and
for m ≠ n ∈ Z define ρp(m, n) = 1/p^α, where α is the power of p in the
unique prime factorization of |m − n|. (For example, ρ2(42, 2) = 1/8,
ρ5(42, 2) = 1/5, and ρ3(42, 2) = 1.) Show that ρp is a metric on Z.
3.S (Hamming distance). Let A be a nonempty set and X := A^n. For
x = (x1, ..., xn), y = (y1, ..., yn) ∈ X define d(x, y) to be the number
of indices j for which xj ≠ yj. Show that d is a metric on X. (The
metric is named after Richard Hamming, who pioneered the field of error
correcting codes.)
4. Let X be as in Exercise 3. Define ρ(x, x) = 0, and for distinct points
x = (x1, ..., xn) and y = (y1, ..., yn) in X define ρ(x, y) = 2^{−j}, where j
is the smallest index for which xj ≠ yj. Show that ρ is a metric on X.
5.S Prove that a metric d satisfies
|d(x, y) − d(a, b)| ≤ d(x, a) + d(y, b).
Conclude that if xn → a and yn → b, then d(xn , yn ) → d(a, b).
6. Let X and Y be nonempty sets and let f : X → Y be one-to-one. Show
that if ρ is a metric on Y , then d(x, y) := ρ(f (x), f (y)) defines a metric
on X.
7. Prove that a finite union of bounded sets in a metric space is bounded.
8. Prove 8.1.9.
9. ⇓ Prove that a Cauchy sequence with a cluster point converges. (This
exercise will be used in 8.5.8.)
10.S Let E1, ..., Em be complete subspaces of (X, d). Prove that the finite
union E1 ∪ ··· ∪ Em is complete. Does the analogous assertion hold for a
countable union of complete subspaces?
11. Let X := [1, +∞) have the metric
d(x, y) = |x^{−1} − y^{−1}|
(see Exercise 6). Show that xn → x with respect to the usual metric on
X iff xn → x with respect to d. Is (X, d) complete?
12. Same as Exercise 11 but with the metric
ρ(x, y) = |x(1 + x^2)^{−1} − y(1 + y^2)^{−1}|.
13.S Let (X, d) and (Y, ρ) be metric spaces and let (Z, η) = (X × Y, d × ρ)
be the product space. Prove:
(a) η is a metric on Z.
(b) A sequence {(xn , yn )} is Cauchy in Z iff {xn } is Cauchy in X and
{yn } is Cauchy in Y .
(c) A sequence {(xn , yn )} converges to (x, y) in Z iff xn → x in X and
yn → y in Y .
(d) Z is complete iff X and Y are complete.
14. Metrics d, ρ on a set X are said to be metrically equivalent if there exist
positive constants a and b such that
d(x, y) ≤ a ρ(x, y) and ρ(x, y) ≤ b d(x, y) for all x, y ∈ X.
For example, by Exercise 1.6.6, the metrics d1 , d2 , and d∞ are metrically
equivalent. Suppose that d and ρ are metrically equivalent. Let {xn } be
a sequence in X and let x ∈ X. Prove the following:
(a) xn → x in (X, d) iff xn → x in (X, ρ).
(b) {xn } is Cauchy in (X, d) iff {xn } is Cauchy in (X, ρ).
(c) (X, d) is complete iff (X, ρ) is complete.
15.S Let d be a metric on a set X and a > 0. Define
ρ(x, y) := min{d(x, y), a}.
Prove:
(a) ρ is a metric on X.
(b) A sequence is Cauchy in (X, ρ) iff it is Cauchy in (X, d).
(c) A sequence converges in (X, ρ) iff it converges in (X, d).
(d) (X, ρ) is complete iff (X, d) is complete.
Are d and ρ metrically equivalent? Does σ(x, y) := max{d(x, y), a}
define a metric on X?
16. Let ρ1 and ρ2 be metrics on X. Prove that max{ρ1 , ρ2 } is a metric. Is
min{ρ1 , ρ2 } a metric?
17. Let x := (x1, ..., xn), xk := (x1,k, ..., xn,k) ∈ R^n, k = 1, 2, .... Prove:
(a) xk → x in (R^n, d2) iff xj,k → xj for j = 1, ..., n.
(b) {xk} is Cauchy in (R^n, d2) iff $\{x_{j,k}\}_{k=1}^{\infty}$ is Cauchy in R for each
j = 1, ..., n.
(c) R^n is complete in each of the metrics d1, d2, d∞. (Use Exercise 14.)
18.S Let d be a metric on a set X and define
\[ \rho(x,y) := \frac{d(x,y)}{1 + d(x,y)}. \]
Verify that (a)–(d) of Exercise 15 hold. Are d and ρ metrically equivalent?
19. Let ρ1 and ρ2 be metrics on a set X and let α, β > 0. Define
ρ(x, y) := αρ1 (x, y) + βρ2 (x, y).
Prove:
(a) ρ is a metric on X.
(b) A sequence {xn } converges to x in (X, ρ) iff it converges to x in both
(X, ρ1 ) and (X, ρ2 ).
20.S Let $\{d_k\}_{k=1}^{\infty}$ be a sequence of metrics on a set X. For x, y ∈ X define
\[ \rho_k(x,y) = \frac{d_k(x,y)}{1 + d_k(x,y)} \quad\text{and}\quad \rho(x,y) = \sum_{k=1}^{\infty} 2^{-k}\rho_k(x,y). \]
Prove:
(a) ρ is a metric on X. (See Exercise 18.)
(b) ρ(xn, x) → 0 iff dk(xn, x) → 0 for every k.
21. Let C(R) denote the set of continuous, real-valued functions on R. For
f, g ∈ C(R) define
\[ \rho(f,g) := \sum_{k=1}^{\infty} 2^{-k}\rho_k(f,g), \]
where
\[ d_k(f,g) = \sup_{-k \le x \le k} |f(x) - g(x)| \quad\text{and}\quad \rho_k(f,g) = \frac{d_k(f,g)}{1 + d_k(f,g)}. \]
Prove:
(a) ρ is a metric on C(R).
(b) fn → f in this metric iff fn → f uniformly on each bounded subset
of R.
(c) C(R) is complete in this metric.
22. For f ∈ C([a, b]) define $\|f\|_1 = \int_a^b |f|$. Show that ‖·‖1 is a norm on
C([a, b]) and that C([a, b]) is not complete in the metric induced by this
norm.
23.S Show that the sequence of functions
$f_n(x, y) = (1 + x^n)^{1/n} (1 + y^n)^{-1/n}$
converges uniformly to f(x, y) = x/y on [1, b] × [1, b] for any b > 1.
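As a numerical companion to the p-adic metric of Exercise 2 in this list, the following Python sketch evaluates ρp directly from the definition and reproduces the three sample values quoted there. The function name `p_adic_distance` is hypothetical, and the code assumes that p is prime and m, n are integers.

```python
from fractions import Fraction

def p_adic_distance(p, m, n):
    """rho_p(m, n) = 1/p**alpha, where p**alpha exactly divides |m - n|; 0 if m == n."""
    if m == n:
        return Fraction(0)
    diff, alpha = abs(m - n), 0
    while diff % p == 0:       # strip factors of p from |m - n|
        diff //= p
        alpha += 1
    return Fraction(1, p ** alpha)

# |42 - 2| = 40 = 2^3 * 5, so the exponents of 2, 5, 3 are 3, 1, 0.
print(p_adic_distance(2, 42, 2), p_adic_distance(5, 42, 2), p_adic_distance(3, 42, 2))
```

Exact fractions are used so that the values can be compared with 1/8, 1/5, and 1 without rounding.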
8.2
Open and Closed Sets
Throughout this section, (X, d) denotes an arbitrary metric space.
It is frequently useful to formulate assertions regarding a metric space X
in terms of certain subsets of X rather than the metric. The subsets of most
interest in this regard are described in the next two definitions.
8.2.1 Definition. Let x ∈ X and r > 0. The sets
Br (x) := {y ∈ X : d(x, y) < r} and Cr (x) := {y ∈ X : d(x, y) ≤ r}
are called, respectively, the open and closed balls with center x and radius r.
The set
Sr (x) := Cr (x) \ Br (x) = {y ∈ X : d(x, y) = r}
is called the sphere with center x and radius r. The ball Br (x) is also called a
neighborhood of x.
♦
The open (closed) balls in R with the usual metric are simply the bounded
open (closed) intervals. The spheres are the endpoints of these intervals. The
open (closed) balls in Euclidean space R2 are open (closed) disks and the
spheres are circles. The open and closed balls in a discrete metric space X are
the sets X and {x}; the spheres are X \ {x} and the empty set.
8.2.2 Definition. A subset U of X is said to be open if either U = ∅ or else
U has the following property:
For each x ∈ U there exists ε > 0 such that Bε (x) ⊆ U .
A subset of X is closed if its complement is open. The collection of all open
sets is called the (metric) topology of (X, d).
♦
In any metric space, X and ∅ are both open and closed. There are many
metric spaces for which these are the only subsets that are both open and
closed; Euclidean space Rn is an important example (see Section 8.7). The
sets Q and I are neither open nor closed in R since every open ball (= open
interval) contains members of both sets. A finite set F is always closed. Indeed,
if x ∈ F^c, then Br(x) ⊆ F^c, where r = min{d(x, y) : y ∈ F}, hence F^c is
open.
8.2.3 Proposition. An open ball is open, a closed ball is closed, and a sphere
is closed.
Proof. Let x ∈ Br(x0). We claim that Bε(x) ⊆ Br(x0), where ε := r − d(x, x0).
Indeed, if y ∈ Bε(x) then
d(y, x0) ≤ d(y, x) + d(x, x0) < ε + d(x, x0) = r,
hence y ∈ Br(x0) (Figure 8.1). Since x was arbitrary, Br(x0) is open. A similar
argument shows that Cr(x0)^c and Sr(x0)^c are open, hence Cr(x0) and Sr(x0)
are closed. (See Exercise 2.) That Sr(x0) is closed also follows from 8.2.6
below.
FIGURE 8.1: An open ball is open.
8.2.4 Theorem. Open sets in (X, d) have the following properties:
(a) If Ui is open for each i in an index set I, then $\bigcup_{i\in I} U_i$ is open.
(b) If V1 , . . . , Vn are open, then V1 ∩ · · · ∩ Vn is open.
Proof. (a) Let U denote the union. If x ∈ U , then x ∈ Ui for some i, hence
there exists r > 0 such that Br (x) ⊆ Ui ⊆ U . Therefore, U is open.
(b) Let V denote the intersection and let x ∈ V . For each j = 1, . . . , n
there exists rj > 0 such that Brj (x) ⊆ Vj . Then Br (x) ⊆ V , where r =
min{r1 , . . . , rn }. Therefore, V is open.
8.2.5 Corollary. A nonempty subset U is open iff it is the union of open
balls.
For example, in a discrete metric space, every subset is a union of open
balls {x} = B1 (x) and hence is open. It follows that every subset is also closed.
8.2.6 Corollary. Closed sets in (X, d) have the following properties:
(a) If Ci is closed for each i in an index set I, then $\bigcap_{i\in I} C_i$ is closed.
(b) If C1, ..., Cn are closed, then C1 ∪ ··· ∪ Cn is closed.
Proof. In (a), each $C_i^c$ is open, hence, using DeMorgan's law and 8.2.4,
\[ \Bigl(\bigcap_{i\in I} C_i\Bigr)^c = \bigcup_{i\in I} C_i^c \]
is open, that is, $\bigcap_{i\in I} C_i$ is closed. Part (b) is proved in a similar manner, using
DeMorgan's law for complements of finite unions.
8.2.7 Theorem. A subset C of X is closed iff C contains the limit of each
convergent sequence in C.
Proof. Assume that C is closed and let {xn} be a sequence in C with xn → x.
If x ∉ C, then, because C^c is open, there exists ε > 0 such that Bε(x) ∩ C = ∅.
But then xn is eventually in Bε(x) ⊆ C^c, impossible. Therefore, x ∈ C.
Now suppose C is not closed. Then C^c is not open, hence there exists
x ∈ C^c such that B_{1/n}(x) ⊈ C^c, that is, B_{1/n}(x) ∩ C ≠ ∅, for every n ∈ N.
Choosing a point xn in this intersection, we then obtain a sequence {xn} in C
that converges to a point not in C.
8.2.8 Corollary. Let (X, d) be a metric space and let Y be a subspace of X.
(a) If X is complete and Y is closed, then Y is complete.
(b) If Y is complete, then Y is closed.
Proof. (a) Let {yn } be a Cauchy sequence in Y . Since X is complete, there
exists x ∈ X such that yn → x. Since Y is closed, x ∈ Y . Therefore, Y is
complete.
(b) Let {yn } be a sequence in Y such that yn → x ∈ X. Then {yn } is
Cauchy and hence converges to some y ∈ Y . Since limits are unique, x = y.
Therefore, x ∈ Y , hence Y is closed.
8.2.9 Example. Let C([a, b]) denote the set of all continuous real-valued
functions on the interval [a, b]. Each such function is bounded, hence C([a, b])
is a vector subspace of B([a, b]) (8.1.10). Since the uniform limit of continuous
functions is continuous (7.2.2), C([a, b]) is closed in the uniform metric. Since
B([a, b]) is complete, 8.2.8(a) shows that C([a, b]) is complete.
♦
8.2.10 Example. The subspace D([a, b]) of C([a, b]) consisting of all differentiable functions is not complete in the uniform metric. To see this take
[a, b] = [0, 1] and define a sequence of continuous functions gn(x), n ≥ 2, on
[0, 1] such that gn = 1 on [0, 1/2], gn = 0 on [1/2 + 1/n, 1], and gn is linear on
[1/2, 1/2 + 1/n]. Also, define g(t) on [0, 1] by g = 1 on [0, 1/2] and g = 0 on
(1/2, 1]. (See Figure 8.2.)
FIGURE 8.2: The functions gn and g.
Now set
\[ f_n(x) = \int_0^x g_n(t)\,dt \quad\text{and}\quad f(x) = \int_0^x g(t)\,dt, \quad x \in [0, 1]. \]
Then fn ∈ D([0, 1]), f ∈ C([0, 1]), and
\[ |f_n(x) - f(x)| \le \int_0^1 |g_n - g| = \int_{1/2}^{1/2+1/n} g_n = \frac{1}{2n}. \]
Therefore, fn → f uniformly on [0, 1]. Since f is not differentiable at 1/2,
D([0, 1]) is not closed in C([0, 1]), hence, by 8.2.8(b), not complete.
♦
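The bound in Example 8.2.10 can be checked numerically. The sketch below uses closed forms for fn and f obtained by integrating gn and g by hand (for x > 1/2 the ramp gn contributes t − n t²/2, where t = min(x − 1/2, 1/n)); the function names and the grid size are assumptions made for illustration, and the grid maximum only approximates the supremum.

```python
def f_limit(x):
    """f(x) = integral of g over [0, x]: equals x on [0, 1/2], then stays at 1/2."""
    return min(x, 0.5)

def f_n(x, n):
    """f_n(x) = integral of g_n over [0, x], in closed form (n >= 2)."""
    if x <= 0.5:
        return x
    t = min(x - 0.5, 1.0 / n)        # portion of the ramp [1/2, 1/2 + 1/n] covered
    return 0.5 + t - n * t * t / 2   # integral of 1 - n*s for s in [0, t]

def sup_distance(n, grid=10_000):
    """Maximum of |f_n - f| over a uniform grid on [0, 1]."""
    return max(abs(f_n(k / grid, n) - f_limit(k / grid)) for k in range(grid + 1))

for n in (2, 10, 100):
    print(n, sup_distance(n), 1 / (2 * n))   # the two values should agree
```

The limit f has slope 1 to the left of 1/2 and slope 0 to the right, which is the corner that places f outside D([0, 1]).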
8.2.11 Definition. Let Y be a subset of X. A subset A ⊆ Y is said to be
relatively open (relatively closed ) in Y if A is open (closed) in the subspace
(Y, d) of (X, d).
♦
8.2.12 Theorem. Let A ⊆ Y ⊆ X. Then A is relatively open (relatively
closed) in Y iff A = Y ∩ B for some open (closed) subset B of X.
Proof. By definition, a nonempty open set A in the subspace Y is a union of
open balls in Y . The latter are of the form Y ∩ Br (y), where y ∈ Y and Br (y)
is an open ball of X. Therefore, A = Y ∩ B, where B is the corresponding
union of the open balls Br (y).
From the first paragraph, the closed sets of Y are of the form Y \ A = Y ∩ B^c,
where B is open in X. Since B^c is closed in X, the assertion regarding closed
sets follows.
8.2.13 Definition. Let X be a vector space and let a, b ∈ X . The line
segment from a to b is defined by
[a : b] = {(1 − t)a + tb : 0 ≤ t ≤ 1} .
A subset E of X is said to be convex if a, b ∈ E implies [a : b] ⊆ E.
♦
FIGURE 8.3: Convex and non-convex sets.
Recall that, by definition, the convex subsets of R are the intervals. The
reader may easily check that if D ⊆ R^p and E ⊆ R^q are convex, then D × E,
as a subset of R^{p+q}, is convex. In particular, Cartesian products I1 × ··· × In
of intervals Ij are convex in R^n. Other examples are given in Exercise 5.
Exercises
1.S Sketch B1 (0, 0) ⊆ R2 for the metrics d1 and d∞ derived from the norms
k · k1 and k · k∞ .
2. Prove that a closed ball is closed.
3.S Let x, y be distinct points in a metric space (X, d). Find the largest
number r such that Br (x) ∩ Br (y) = ∅.
4. Show that every open subset U of Rn is a countable union of open balls
as well as a countable union of bounded open n-dimensional intervals
(a1 , b1 ) × · · · × (an , bn ).
5.S Prove that open and closed balls in a normed vector space are convex.
Are spheres convex?
6. Show by example that arbitrary intersections of open sets may not be
open and that arbitrary unions of closed sets may not be closed.
7. Metrics d and ρ on a set X are said to be topologically equivalent if they
have the property that a sequence {xn } converges to x in (X, d) iff it
converges to x in (X, ρ).
(a) Prove that metrically equivalent metrics are topologically equivalent.
(See Exercise 8.1.14.)
(b) Prove that d and ρ are topologically equivalent iff (X, d) and (X, ρ)
have the same topologies, that is, the metrics produce the same open
sets.
(c) Are topologically equivalent metrics necessarily metrically equivalent?
8.S Prove that the metric ρ(x, y) = |ex − ey | on R is topologically equivalent
to the usual metric. Is R complete in this metric? Is ρ metrically equivalent
to the usual metric on R?
9. Let Y be a subspace of (X, d) with the property that for some r > 0,
d(x, y) ≥ r for all x, y ∈ Y with x ≠ y. Prove that Y is complete, hence
closed. Conclude that finite metric spaces, discrete metric spaces, and
the subspaces N and Z of R are complete.
10. Let xn → x0 in (X, d). Prove that the set C := {x0 , x1 , x2 , . . .} is closed
in X.
11. Let Y be open (closed) in (X, d). Prove that a subset U of Y is relatively
open (relatively closed) in Y iff it is open (closed) in X.
12.S Prove that the set
C := {f ∈ C([0, 1]) : f(x) = f(1 − x) for all x ∈ [0, 1]}
is closed in the supremum metric (8.1.10) but not in the metric of
Exercise 8.1.22.
13. Prove that the subspaces
\[ V := \Bigl\{ f \in B([0, +\infty)) : \lim_{x\to+\infty} f(x) \text{ exists in } \mathbb{R} \Bigr\} \quad\text{and}\quad W := \Bigl\{ f \in V : \lim_{x\to+\infty} f(x) = 0 \Bigr\} \]
are closed in the supremum metric.
8.3
Closure, Interior, and Boundary
Throughout this section, (X, d) denotes an arbitrary metric space.
8.3.1 Definition. Let E ⊆ X.
• The closure cl(E) = clX (E) of E in X is the intersection of all closed
subsets of X containing E.
• The interior int(E) = intX (E) of E is the union of all open subsets of
X contained in E.
• The boundary bd(E) = bdX (E) of E is the set cl(E) \ int(E).
♦
8.3.2 Examples. (a) Since every nonempty open set of R (with the usual
metric) contains rational and irrational points,
int(Q) = int(I) = ∅ and cl(Q) = cl(I) = R, hence bd(Q) = bd(I) = R.
For bounded intervals we have
cl((a, b)) = [a, b], int([a, b]) = (a, b), and bd((a, b)) = bd([a, b]) = {a, b}.
(b) In a discrete metric space a subset E is both open and closed, hence
cl(E) = int(E) = E and bd(E) = ∅.
♦
By 8.2.4 and 8.2.6, int(E) is open and cl(E) is closed, hence bd(E) is closed.
The following proposition asserts that int(E) is the largest open set contained
in E and cl(E) is the smallest closed set containing E.
8.3.3 Proposition. If U is open, C is closed, and U ⊆ E ⊆ C, then
U ⊆ int(E) ⊆ E ⊆ cl(E) ⊆ C.
Proof. Simply note that U is one of the open sets in the definition of int(E)
and that C is one of the closed sets in the definition of cl(E).
8.3.4 Corollary. Let E ⊆ X.
(a) E is open in X iff int(E) = E.
(b) int(int(E)) = int(E).
(c) E is closed in X iff cl(E) = E.
(d) cl(cl(E)) = cl(E).
Proof. If E is open, take U = E in the proposition. If E is closed, take C = E.
This proves (a) and (c). Parts (b) and (d) follow from these.
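8.3.5 Proposition. For any subset E of X,
(a) cl(E^c) = (int(E))^c,
(b) int(E^c) = (cl(E))^c,
(c) bd(E) = cl(E) ∩ cl(E^c) = bd(E^c).
Proof. For (a) we have
\[ \bigl(\operatorname{int}(E)\bigr)^c = \Bigl(\bigcup_{\substack{U \subseteq E \\ U \text{ open}}} U\Bigr)^c = \bigcap_{\substack{C \supseteq E^c \\ C \text{ closed}}} C = \operatorname{cl}(E^c). \]
Parts (b) and (c) follow from (a).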
8.3.6 Proposition. Let E ⊆ X. Then x ∈ cl(E) iff there exists a sequence
{an } in E such that an → x.
Proof. Let C be the set of all limits of convergent sequences in E, including
constant sequences, so E ⊆ C. We show that C = cl(E), which will establish
the proposition.
First, C is closed. If not, then C^c is not open, hence there exists y ∈ C^c
and for each n a point yn ∈ B_{1/n}(y) ∩ C. By definition of C, each yn is the
limit of a sequence in E, hence there exists an ∈ E such that d(yn, an) < 1/n.
By the triangle inequality, d(an, y) < 2/n, hence an → y. But then y ∈ C, a
contradiction. Therefore C must be closed.
It follows that cl(E) ⊆ C. Since cl(E) contains the limit of all convergent
sequences in E (8.2.7), C ⊆ cl(E). Therefore, C = cl(E).
8.3.7 Example. (Topologist's sine curve). Let
A = {(x, sin(1/x)) : 0 < x ≤ 2/π} and B = {0} × [−1, 1].
We show that cl(A) = A ∪ B.
For the inclusion A ∪ B ⊆ cl(A), note first that
\[ \Bigl\{ \sin\frac{1}{x} : \frac{2}{(4n+3)\pi} \le x \le \frac{2}{(4n+1)\pi} \Bigr\} = [-1, 1], \quad n \in \mathbb{Z}^+. \]
It follows from the intermediate value theorem that for each y ∈ [−1, 1] and
n ∈ N there exists xn ∈ R such that
\[ 0 < x_n \le \frac{2}{(4n+1)\pi} \quad\text{and}\quad \sin(1/x_n) = y. \]
Since (xn, y) ∈ A and (xn, y) → (0, y), (0, y) ∈ cl(A). Therefore, B ⊆ cl(A),
hence A ∪ B ⊆ cl(A).
The reverse inclusion will follow if we show that A ∪ B is closed. For this
we use 8.2.7. Let {(xn, yn)} be a sequence in A ∪ B with (xn, yn) → (x, y).
Case 1. There exists a subsequence {(xnk, ynk)} that lies in B. Then, since
B is closed, (x, y) ∈ B.
Case 2. {(xn, yn)} eventually lies in A, so 0 < xn ≤ 2/π and yn = sin(1/xn)
for all sufficiently large n. Then 0 ≤ x ≤ 2/π. If x = 0, then (x, y) ∈ B because
|y| = limn |yn| ≤ 1. If x > 0, then, by continuity of t ↦ sin(1/t) at x,
y = limn sin(1/xn) = sin(1/x), that is, (x, y) ∈ A.
In each case (x, y) ∈ A ∪ B, hence A ∪ B is closed.
♦
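The points xn in Example 8.3.7 are produced there by the intermediate value theorem. For illustration, one explicit choice (an assumption made here, not the construction in the text) is xn = 1/(arcsin y + 2πn), which satisfies sin(1/xn) = y and decreases to 0. The Python sketch below, with the hypothetical helper `preimage_points`, prints a few such points for a sample value of y.

```python
import math

def preimage_points(y, count):
    """Points x_n = 1/(asin(y) + 2*pi*n), n = 1..count, for which sin(1/x_n) = y."""
    return [1.0 / (math.asin(y) + 2 * math.pi * n) for n in range(1, count + 1)]

y = 0.3
for x in preimage_points(y, 5):
    # x shrinks toward 0 while sin(1/x) stays at y (up to rounding),
    # so the points (x, y) of A approach (0, y) in B.
    print(x, math.sin(1.0 / x))
```

Because arcsin y ≥ −π/2, each 1/xn is at least 3π/2, so every xn lies in (0, 2/π) and the point (xn, y) belongs to A.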
8.3.8 Definition. A subset E of X is said to be dense in X if cl(E) = X.
Equivalently, every x ∈ X is the limit of a sequence in E.
♦
By 8.3.2, Q and I are dense in R. The set of all points in R2 with rational
coordinates is dense in R2 . A discrete space has no proper dense subsets. In
Section 8.8 we show that the set of polynomials on [a, b] is dense in C([a, b]) in
the uniform norm.
8.3.9 Example. (Dirichlet). If ξ is irrational, then the set
E := {nξ + m : m ∈ Z, n ∈ N}
is dense in R. To verify this we show that for any x ∈ R and k ∈ N there exists
z ∈ E such that |z − x| < 1/k.
To this end, let
yj = jξ − ⌊jξ⌋, j = 1, ..., k + 1.
Because ξ is irrational, each yj is irrational and 0 < yj < 1, hence yj must be
in one of the intervals (0, 1/k), (1/k, 2/k), ..., ((k − 1)/k, 1). Since there are
only k intervals, one of these must contain yi and yj for some i ≠ j. (This is
an instance of the so-called pigeonhole principle.) By the irrationality of ξ,
yj ≠ yi. Hence one of the quantities ±(yj − yi), call it y, is in E and |y| < 1/k.
We consider two cases. If y > 0, choose m ∈ Z such that x + m > 0
and let n be the smallest integer such that ny > x + m. Then n ∈ N and
(n − 1)y ≤ x + m, hence z := ny − m ∈ E and
0 < z − x = ny − m − x ≤ y < 1/k.
On the other hand, if y < 0, choose m ∈ Z such that x + m < 0 and let n be
the smallest integer such that n(−y) > −(x + m), that is, ny < x + m. Again,
z := ny − m ∈ E, and in this case, since (n − 1)y ≥ x + m,
−1/k < y ≤ ny − m − x = z − x < 0.
In either case, |z − x| < 1/k, as required.
♦
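The argument of Example 8.3.9 is constructive, and its steps can be traced in code. The following Python sketch is a floating-point illustration under stated assumptions: the function name `approximate` is hypothetical, ξ is represented by a float (so the irrationality hypothesis holds only approximately), and k is kept small so that rounding does not matter.

```python
import math

def approximate(xi, x, k):
    """Return (n, m), n >= 1, with |n*xi + m - x| < 1/k (floating-point sketch)."""
    seen = {}                                 # interval index -> first j landing there
    i = j = None
    for j in range(1, k + 2):
        frac = j * xi - math.floor(j * xi)    # y_j, the fractional part of j*xi
        bucket = min(int(frac * k), k - 1)    # y_j lies in (bucket/k, (bucket+1)/k)
        if bucket in seen:                    # pigeonhole: y_i and y_j share an interval
            i = seen[bucket]
            break
        seen[bucket] = j
    n0 = j - i                                # positive coefficient of xi
    m0 = math.floor(i * xi) - math.floor(j * xi)
    y = n0 * xi + m0                          # y = y_j - y_i, so 0 < |y| < 1/k
    if y > 0:
        m = math.floor(-x) + 1                # ensures x + m > 0
    else:
        m = -math.floor(x) - 1                # ensures x + m < 0
    t = math.floor((x + m) / y) + 1           # smallest integer with t*|y| > |x + m|
    return t * n0, t * m0 - m                 # z = t*y - m = n*xi + (t*m0 - m)

xi, x, k = math.sqrt(2), 0.3, 10
n, m = approximate(xi, x, k)
print(n, m, abs(n * xi + m - x) < 1 / k)
```

The two branches correspond to the cases y > 0 and y < 0 of the example; in both, the multiple t·y first steps past x + m and therefore lands within |y| of it.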
8.3.10 Example. We show that the set S := {sin n : n ∈ N} is dense in the
interval [−1, 1]. Let x ∈ R and take ξ = 1/(2π) in the preceding example. Then
nk/(2π) + mk → x for some integer sequences {nk} and {mk} with nk > 0,
hence
sin nk = sin(2π(nk/(2π) + mk)) → sin(2πx).
Since x was arbitrary, every member of [−1, 1] is the limit of a sequence in S.
A similar argument shows that {cos n : n ∈ N} is dense in [−1, 1].
♦
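The density of {sin n : n ∈ N} can be observed experimentally by searching for an index n whose sine falls within a given tolerance of a target value. The sketch below is a brute-force illustration only; the function name `find_index`, the tolerance, and the search limit are assumptions made here, and the search says nothing about why such indices must exist.

```python
import math

def find_index(target, tol, limit=1_000_000):
    """Smallest n in 1..limit with |sin n - target| < tol, or None if none is found."""
    for n in range(1, limit + 1):
        if abs(math.sin(n) - target) < tol:
            return n
    return None

for target in (-1.0, -0.5, 0.0, 0.25, 1.0):
    n = find_index(target, 1e-3)
    if n is not None:
        print(target, n, math.sin(n))
```

Shrinking the tolerance generally pushes the first suitable index further out, in keeping with the fact that the example yields a sequence of indices rather than a single one.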
8.3.11 Definition. A metric space is said to be separable if it has a countable
dense subset.
♦
For example, Rn is separable (consider all points with rational coordinates).
An uncountable discrete space is not separable. The space C([a, b]) is separable
in the supremum norm (Exercise 19).
Exercises
1. Let (X, d) be a metric space and A, B ⊆ X. Prove the following:
(a)S cl(A ∪ B) = cl(A) ∪ cl(B).
(b) cl(A ∩ B) ⊆ cl(A) ∩ cl(B).
(c) int(A ∩ B) = int(A) ∩ int(B).
(d)S int(A ∪ B) ⊇ int(A) ∪ int(B).
(e) bd(A ∪ B) ⊆ bd(A) ∪ bd(B).
(f)S bd(cl(A)) ⊆ bd(A).
(g) bd(int(A)) ⊆ bd(A).
(h) cl(A) = A ∪ bd(A).
Show by examples that the inclusions may be strict.
2. Prove: bd(A ∩ B) ⊆ (A ∩ bd(B)) ∪ (B ∩ bd(A)) ∪ (bd(A) ∩ bd(B)). Show
that the inclusion may be strict.
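3. Find cl(A) \ A for A =
(a) {(1/n, 1/m) : m, n ∈ N}.
(b)S $\{(\cos t, \sin t, e^{-t}) : t > 0\}$.
(c) $\bigl\{\bigl(\cos t, \sin t, \frac{t}{1+|t|}\bigr) : t \in \mathbb{R}\bigr\}$.
(d) {(t cos t, t sin t, t) : t > 0}.
(e)S $\bigl\{\bigl(\frac{\cos t}{1+t}, \frac{\sin t}{1+t}\bigr) : t > 0\bigr\}$.
(f)S $\bigl\{\bigl(\frac{t\cos t}{1+t}, \frac{t\sin t}{1+t}\bigr) : t > 0\bigr\}$.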
1+t 1+t
4. An induction argument shows that parts (a) and (c) of Exercise 1 hold
for any finite number of sets. Show, by example, that the analogous
statements for infinitely many sets are false.
5. Prove that if cl(A) ∩ cl(B) = ∅, then int(A ∪ B) = int(A) ∪ int(B).
6. Let Y be a subspace of (X, d) and A ⊆ Y . Prove that
(a)S clY(A) = clX(A) ∩ Y.
(b) intX(A) ∩ Y ⊆ intY(A).
(c) bdY(A) ⊆ bdX(A).
Show by examples that the inclusions in (b) and (c) may be strict.
7. Let xn → x0 in X. Show that cl {x1 , x2 , . . .} = {x0 , x1 , x2 , . . .}.
8.S Let fn(x) = x^n, 0 ≤ x ≤ 1. Show that the set {f1, f2, ...} is closed in
(C([0, 1]), ‖·‖∞). Is it closed in (C([0, 1]), ‖·‖1)?
9. Let B = Br (x0 ) and C = Cr (x0 ). Prove that
(a)S B ⊆ int(C).
(b) cl(B) ⊆ C.
(c) bd(B) ⊆ C \ B.
Show, by example, that the inclusions may be strict.
10. Prove that in a normed vector space the inclusions in Exercise 9 are
equalities.
11. Prove that the set E = {(x, y) : x, y ∈ Q and x ≠ y} is neither open
nor closed and is dense in Euclidean space R^2.
12. Let x ∈ R, r ∈ Q, r ≠ 0. In each case, find the largest interval in which
the given set is dense.
(a) {sin(rn) : n ∈ N}.
(b)S {sin(x + n) : n ∈ N}.
(c) {sin n cos n : n ∈ N}.
(d) {tan^2 n : n ∈ N}.
13. Show that limn sin(πnx) does not exist for any irrational number x. Conclude that limn sin(nr) does not exist for any nonzero rational number r.
14. (a) Let E be dense in X and let F be a proper finite subset of E. Show
that E \ F is dense in X \ F. Is E \ F necessarily dense in X?
(b) Let X be a normed vector space with {a1 , a2 , . . .} dense in X . Show
that {an : n ≥ N } is dense in X for every N ∈ N. Conclude that
{sin n : n ≥ N } is dense in [−1, 1].
15. Show that lim inf n sin n = −1 and lim supn sin n = 1.
16.S Let Y be dense in X and U ⊆ X open. Show that U ∩ Y is dense in U .
What if U is not open?
17. Let X = R^n with the Euclidean metric and let Y ⊆ X have the property
of Exercise 8.2.9. Prove that Y^c is open and dense in X. Conclude that
N^c and Z^c are open and dense in R.
18. Show that in a separable space, every nonempty open set U is a countable
union of open balls.
19. Use the Weierstrass approximation theorem (8.8.5, below) to show that
(C([a, b]), ‖·‖∞) is separable.
20. (a)S Let {Ii : i ∈ I} be a family of open intervals in R with the property
that each pair has a nonempty intersection. Show that $\bigcup_{i\in I} I_i$ is an open
interval.
(b) Prove that every nonempty open set in R is a countable union of
disjoint open intervals.
8.4
Limits and Continuity
In this section, (X, d), (Y, ρ), and (Z, µ) denote arbitrary metric spaces.
8.4.1 Definition. Let E ⊆ X. A member a ∈ X is said to be an accumulation
point of E if E ∩ (Br(a) \ {a}) ≠ ∅ for each r > 0. A member of E that is not
an accumulation point is called an isolated point of E.
♦
It follows from the definition that a is an accumulation point of E iff there
exists a sequence of distinct points of E converging to a.
No subset of a discrete metric space has an accumulation point. The set
of functions $x \mapsto x^n$ in C([0, 1]), n ∈ N, has no accumulation points in the
uniform norm but the identically zero function is an accumulation point in the
norm $\|\cdot\|_1$.
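The contrast between the two norms can be checked numerically. The following is a minimal sketch, not a proof: the helper names `sup_distance` and `l1_norm_of_power` and the grid size are illustrative assumptions. It approximates the uniform distance between $x^n$ and $x^{2n}$, which stays near 1/4 however large n is, while the integral of $x^n$ over [0, 1] equals 1/(n + 1) and tends to 0.

```python
def sup_distance(n, m, grid_size=20001):
    """Approximate the sup over [0, 1] of |x**n - x**m| on a uniform grid."""
    step = 1.0 / (grid_size - 1)
    return max(abs((i * step) ** n - (i * step) ** m) for i in range(grid_size))


def l1_norm_of_power(n):
    """Exact value of the integral of x**n over [0, 1], i.e. the 1-norm of x**n."""
    return 1.0 / (n + 1)


# The uniform distance between x**n and x**(2n) does not shrink,
# whereas the 1-norm of x**n tends to zero.
for n in (1, 10, 100):
    print(n, sup_distance(n, 2 * n), l1_norm_of_power(n))
```

The grid only approximates the supremum from below, so the printed distances are lower bounds for the true uniform distances.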
8.4.2 Definition. Let E ⊆ X, f : E → Y, and let a ∈ X be either a member
of E or an accumulation point of E. If b ∈ Y, we write
\[ b = \lim_{x \to a,\ x \in E} f(x) \]
if for each ε > 0 there exists δ > 0 such that
\[ x \in E \text{ and } d(x, a) < \delta \text{ implies } \rho(f(x), b) < \varepsilon. \tag{8.3} \]
In the special case E = X \ {a}, we write simply $b = \lim_{x \to a} f(x)$.
♦
Note that condition (8.3) may be written
\[ f\bigl(E \cap B_\delta(a)\bigr) \subseteq B_\varepsilon(b). \]
This observation will be useful later in proving a global characterization of
continuity.
Many of the results in Chapter 3 on limits of functions on subsets of
R hold for real-valued functions defined on a metric space. These include
the theorems on limits of sums, products, and quotients of functions, the
comparison theorem, the squeeze principle, and the sequential characterization
of limit. The statements and proofs are essentially the same: simply replace
|x − y| by the metric d(x, y). For future reference, we explicitly state:
8.4.3 Sequential Characterization of Limit. Let a be an accumulation
point of E ⊆ X and let f : E → Y. Then $\lim_{x \to a,\ x \in E} f(x)$ exists and equals
b ∈ Y iff $f(a_n) \to b$ for all sequences $\{a_n\}$ in E with $a_n \to a$.
The following theorem gives sufficient conditions for a double limit to equal
an iterated limit.
8.4.4 Iterated Limit Theorem. Let X × Y have the product metric η := d × ρ,
and let a and b be accumulation points of X \ {a} and Y \ {b}, respectively. If
f : X × Y \ {(a, b)} → Z has the properties
(a) $g(x) := \lim_{y \to b} f(x, y)$ exists in Z for each x ∈ X, and
(b) $z := \lim_{(x,y) \to (a,b)} f(x, y)$ exists in Z,
then $\lim_{x \to a} g(x)$ exists and equals z.
Proof. Given ε > 0, by (b) choose δ > 0 such that
\[ \mu\bigl(f(x, y), z\bigr) < \varepsilon \text{ for all } (x, y) \in X \times Y \text{ with } 0 < \eta\bigl((x, y), (a, b)\bigr) < \delta. \]
Let 0 < d(x, a) < δ. Then, for all y sufficiently near b, $\eta\bigl((x, y), (a, b)\bigr) < \delta$,
hence
\[ \mu\bigl(g(x), z\bigr) \le \mu\bigl(g(x), f(x, y)\bigr) + \mu\bigl(f(x, y), z\bigr) < \mu\bigl(g(x), f(x, y)\bigr) + \varepsilon. \]
Letting y → b in this inequality, noting that f(x, y) → g(x), we obtain
$\mu\bigl(g(x), z\bigr) \le \varepsilon$. This shows that $\lim_{x \to a} g(x) = z$.
The theorem implies that
\[ \lim_{(x,y) \to (a,b)} f(x, y) = \lim_{x \to a} \lim_{y \to b} f(x, y) = \lim_{y \to b} \lim_{x \to a} f(x, y) \]
provided the limit on the left exists and inner limits on the right exist for each
x and y, respectively. The limits on the right are called iterated limits and the
limit on the left is sometimes called a double limit. In particular, if the iterated
limits exist and are unequal, then the double limit cannot exist.
In many cases, the iterated limit theorem (suitably modified) still holds if
f is defined on subsets E of X × Y more general than X × Y \ {(a, b)}. This
is the case in Examples (c) and (d) that follow.
8.4.5 Examples. In (a)–(e), X = Y = Z = R. Note that in this case the
product metric η is equivalent to the Euclidean metric on $\mathbb{R}^2$.
(a) Let E = (0, +∞) × (0, +∞). To calculate the limit
\[ \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{\sin(x + 2y)}{2x + y} \]
we write the function as
\[ \frac{\sin(x + 2y)}{x + 2y} \cdot \frac{x + 2y}{2x + y}. \]
As (x, y) → (0, 0) along E, the first factor tends to 1 but the second factor has
no limit. Indeed, along a path y = mx, m > 0, x > 0,
\[ \frac{x + 2y}{2x + y} = \frac{x + 2mx}{2x + mx} = \frac{1 + 2m}{2 + m}. \]
Therefore, the double limit does not exist. The iterated limits exist and are
unequal:
\[ \lim_{y \to 0^+} \lim_{x \to 0^+} f(x, y) = \lim_{y \to 0^+} \frac{\sin(2y)}{y} = 2, \qquad \lim_{x \to 0^+} \lim_{y \to 0^+} f(x, y) = \lim_{x \to 0^+} \frac{\sin x}{2x} = \frac{1}{2}. \]
(b) Let E be as in (a) and let p, q > 0. The limit
\[ L := \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{x^p + y^q}{x^2 + y^2} \]
exists iff p, q > 2 or p = q = 2. In the former case, L = 0 and in the
latter, L = 1. This is best seen by converting to polar coordinates x = r cos θ,
y = r sin θ, 0 < θ < π/2:
\[ L = \lim_{r \to 0^+} \bigl( r^{p-2} \cos^p \theta + r^{q-2} \sin^q \theta \bigr). \]
Both iterated limits exist iff p, q ≥ 2.
(c) Let E = {(x, y) : x > 0, y > 0, x ≠ y}. Then
\[ \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{x^p - y^p}{x - y} \]
exists iff p ≥ 1 and has zero limit if p > 1. Indeed, if 0 < x < y, then, by the
mean value theorem, there exists t ∈ (x, y) such that $x^p - y^p = p t^{p-1} (x - y)$,
hence
\[ p x^{p-1} < \frac{x^p - y^p}{x - y} < p y^{p-1}, \]
and the assertion follows from the squeeze principle.
Clearly, the iterated limits exist (hence equal the double limit) iff p ≥ 1.
(d) Let E be as in (c). Then
\[ \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{x^p + y^p}{y - x} \]
does not exist for any value of p. Indeed, along the path y = mx, m, x > 0,
m ≠ 1, the function has values
\[ \frac{x^p + (mx)^p}{mx - x} = \frac{x^{p-1} (1 + m^p)}{m - 1}, \]
so the limit cannot exist if p ≤ 1. Let p > 1 and set $\theta_r = m r^{p-1} + \pi/4$. Along
the path given by
\[ x = r \cos \theta_r = \frac{r}{\sqrt{2}} \bigl( \cos(m r^{p-1}) - \sin(m r^{p-1}) \bigr), \qquad y = r \sin \theta_r = \frac{r}{\sqrt{2}} \bigl( \cos(m r^{p-1}) + \sin(m r^{p-1}) \bigr), \]
where r ↓ 0, the function has values
\[ \frac{x^p + y^p}{y - x} = \frac{1}{\sqrt{2}\, m} \cdot \frac{m r^{p-1}}{\sin(m r^{p-1})} \bigl( \cos^p \theta_r + \sin^p \theta_r \bigr), \]
which tends to $2^{(1-p)/2}/m$ as r → 0.
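Neither of the iterated limits exists if p < 1. If p > 1, then clearly
\[ \lim_{y \to 0} \lim_{x \to 0} \frac{x^p + y^p}{y - x} = \lim_{x \to 0} \lim_{y \to 0} \frac{x^p + y^p}{y - x} = 0, \]
and if p = 1, then
\[ \lim_{y \to 0} \lim_{x \to 0} \frac{x^p + y^p}{y - x} = 1, \quad \text{while} \quad \lim_{x \to 0} \lim_{y \to 0} \frac{x^p + y^p}{y - x} = -1. \]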
(e) Let E = {(x, y) : x > 0, y > 0}. Then
\[ \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{x^p y}{x + y} \]
exists iff p > 0, in which case the limit is zero. Indeed, along the path y = mx
the function has values $m x^p/(1 + m)$, so the limit cannot exist if p ≤ 0. If
p > 0, one can introduce polar coordinates as in (b).
Both iterated limits exist iff p ≥ 0, but are unequal if p = 0.
♦
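The path computations in Examples (a) and (d) can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the function names (`f_a`, `along_line`, `f_d`, `along_path`) and the sample values of m, p, and r are hypothetical choices, and evaluating at one small point suggests a limit without proving it. Along y = mx the function of (a) should be close to (1 + 2m)/(2 + m), and along the path $\theta_r = m r^{p-1} + \pi/4$ the function of (d) should be close to $2^{(1-p)/2}/m$.

```python
import math


def f_a(x, y):
    """Function of Example (a): sin(x + 2y) / (2x + y) for x, y > 0."""
    return math.sin(x + 2 * y) / (2 * x + y)


def along_line(m, x=1e-6):
    """Value of f_a at a point of the line y = m*x close to the origin."""
    return f_a(x, m * x)


def f_d(x, y, p):
    """Function of Example (d): (x**p + y**p) / (y - x) for x, y > 0, x != y."""
    return (x ** p + y ** p) / (y - x)


def along_path(p, m, r=1e-4):
    """Value of f_d on the path theta_r = m*r**(p-1) + pi/4 (needs p > 1, m > 0)."""
    theta = m * r ** (p - 1) + math.pi / 4
    return f_d(r * math.cos(theta), r * math.sin(theta), p)


# Different slopes give different values, so the double limit in (a) cannot exist.
for m in (0.5, 1.0, 4.0):
    print(m, along_line(m), (1 + 2 * m) / (2 + m))

# Different m give different values along the curved paths in (d).
for m in (1.0, 2.0):
    print(m, along_path(2.0, m), 2 ** ((1 - 2.0) / 2) / m)
```

Because the value along each family of paths depends on the parameter m, neither double limit can exist, which matches the conclusions of the two examples.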
8.4.6 Definition. A function f : X → Y is said to be continuous at a point
a ∈ X if $\lim_{x \to a} f(x) = f(a)$. Also, f is said to be continuous on a set E ⊆ X
if f is continuous at each point of E. If E = X, then f is simply said to be
continuous. If f is continuous, one-to-one, and onto Y and if $f^{-1} : Y \to X$ is continuous,
then f is called a homeomorphism.
♦
From the sequential characterization of limit we have
8.4.7 Sequential Characterization of Continuity. Let f : X → Y and
a ∈ X. Then f is continuous at a iff $f(a_n) \to f(a)$ for all sequences $\{a_n\}$ in X
with $a_n \to a$.
The next theorem gives an important global characterization of continuity.
8.4.8 Theorem. Let f : X → Y . The following statements are equivalent:
(a) f is continuous.
(b) $f^{-1}(V)$ is open in X for each open subset V of Y.
(c) $f^{-1}(C)$ is closed in X for each closed subset C of Y.
Proof. That (b) and (c) are equivalent follows from the general set-theoretic
identity $f^{-1}(B^c) = \bigl(f^{-1}(B)\bigr)^c$.
(a) ⇒ (b): Let V ⊆ Y be open. If $x \in f^{-1}(V)$, then f(x) ∈ V so there
exists ε > 0 such that $B_\varepsilon(f(x)) \subseteq V$. By continuity there exists δ > 0 such
that $f(B_\delta(x)) \subseteq B_\varepsilon(f(x))$. Therefore, $f(B_\delta(x)) \subseteq V$, hence $B_\delta(x) \subseteq f^{-1}(V)$.
(b) ⇒ (a): Let x ∈ X and ε > 0. Since $U := f^{-1}\bigl(B_\varepsilon(f(x))\bigr)$ is open in X
and contains x, we may choose δ > 0 such that $B_\delta(x) \subseteq U$. Then
\[ f(B_\delta(x)) \subseteq f(U) \subseteq B_\varepsilon(f(x)), \]
which shows that f is continuous at x.
8.4.9 Definition. A function f : X → Y is said to be uniformly continuous
on a set E ⊆ X if, given ε > 0, there exists δ > 0 such that
ρ(f (u), f (v)) < ε for all u, v ∈ E with d(u, v) < δ.
♦
8.4.10 Example. The function
\[ f(x, y) = \frac{1}{2.1 + \sin x + \sin y} \]
is uniformly continuous on $\mathbb{R}^2$. Indeed, for all (x, y), (a, b) ∈ $\mathbb{R}^2$,
\begin{align*}
|f(x, y) - f(a, b)| &= \frac{|(\sin x + \sin y) - (\sin a + \sin b)|}{(2.1 + \sin x + \sin y)(2.1 + \sin a + \sin b)} \\
&\le \frac{|\sin x - \sin a| + |\sin y - \sin b|}{(2.1 + \sin x + \sin y)(2.1 + \sin a + \sin b)} \\
&\le 100 |\sin x - \sin a| + 100 |\sin y - \sin b| \\
&\le 100 (|x - a| + |y - b|) \\
&\le 200 \sqrt{(x - a)^2 + (y - b)^2}.
\end{align*}
♦
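The Lipschitz-type bound derived in this example can be sanity-checked by random sampling. The following is a rough sketch under illustrative assumptions (the sampling window, the number of trials, and the helper name `max_difference_quotient` are all hypothetical choices); a sample cannot prove the bound, it can only fail to contradict it.

```python
import math
import random


def f(x, y):
    """The function of Example 8.4.10."""
    return 1.0 / (2.1 + math.sin(x) + math.sin(y))


def max_difference_quotient(trials=20000, seed=0):
    """Largest sampled value of |f(p) - f(q)| / |p - q| over random nearby pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x, y = rng.uniform(-10, 10), rng.uniform(-10, 10)
        a, b = x + rng.uniform(-0.1, 0.1), y + rng.uniform(-0.1, 0.1)
        dist = math.hypot(x - a, y - b)
        if dist > 0:
            worst = max(worst, abs(f(x, y) - f(a, b)) / dist)
    return worst


# Every sampled quotient should stay below the constant 200 from the example.
print(max_difference_quotient())
```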
The proof of the following theorem is entirely analogous to that of 3.5.2.
The details are left to the reader.
8.4.11 Sequential Characterization of Uniform Continuity. A function
f : X → Y is uniformly continuous on E ⊆ X iff $\rho\bigl(f(u_n), f(v_n)\bigr) \to 0$ for all
sequences $\{u_n\}$ and $\{v_n\}$ in E with $d(u_n, v_n) \to 0$.
For example, every function on a discrete metric space is uniformly
continuous, since eventually $u_n = v_n$. The indefinite integral function
$F(f)(x) = \int_a^x f(t)\,dt$ on the space C([a, b]) is uniformly continuous with respect to the uniform norm, since $\|f_n - g_n\|_\infty \to 0 \Rightarrow \|F(f_n) - F(g_n)\|_\infty \to 0$.
The addition function $(x, y) \mapsto x + y$ is uniformly continuous on $\mathbb{R}^2$ since
$(x_n, y_n) - (a_n, b_n) \to (0, 0)$ clearly implies that $x_n + y_n - (a_n + b_n) \to 0$.
On the other hand, the multiplication function $(x, y) \mapsto xy$ is not uniformly
continuous on $\mathbb{R}^2$, since $(n + 1/n, n + 1/n) - (n, n) \to (0, 0)$ but $(n + 1/n)^2 - n^2 = 2 + 1/n^2 \to 2$.
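The failure of uniform continuity for multiplication can be seen in a few lines of arithmetic. This is a minimal sketch; the helper name `gap_and_product_difference` is an illustrative assumption. For the points $u_n = (n + 1/n, n + 1/n)$ and $v_n = (n, n)$ it returns the Euclidean distance between them and the difference of the products, which equals $2 + 1/n^2$.

```python
import math


def gap_and_product_difference(n):
    """Distance between u_n and v_n, and the difference of their coordinate products."""
    u = (n + 1.0 / n, n + 1.0 / n)
    v = (float(n), float(n))
    gap = math.hypot(u[0] - v[0], u[1] - v[1])
    product_difference = u[0] * u[1] - v[0] * v[1]
    return gap, product_difference


# The gap shrinks like sqrt(2)/n while the product difference stays near 2.
for n in (10, 100, 1000):
    print(n, gap_and_product_difference(n))
```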
Exercises
1. For each of the functions f (x) below, find lim{x→0, x∈E} f (x) and the
corresponding iterated limits or show that the limits fail to exist. In each
case take E to be the natural domain of the function.
(a)
(d)
(g)
(j)
(m)
(p)
y 2 + sin2 x
.
3x2 + 2y 2
sin x sin y
p
.
x2 + y 2
x2 y 2
x2 y
.
(c)
.
2
4
+ 2y
x + 7y 4
x4
1
(e) S 4
. (f) (x + y) sin 2
.
2
4
x − xy + y
x + y2
p
(1 + x2 )(1 + y 2 ) − 1
sin(3xy 2 + 2xy 3 )
xy 2 cos(xy)
S
. (h)
.
(i)
.
2
2
2
xy
x +y
x2 + y 2
3x + 2y
x2 + |y|2.1
1 − cos(xy)
S
.
(l)
.
.
(k)
sin x sin y
x2 + y 2
(x2 + y 2 )1/3
p
1 − cos |xy|
sin x ± sin y
x−y
. (n)
.
(o) S
.
|x|p
x−y
ln x − ln y
xy + yz + xz
x|y|1.1
3x2 + 2y 2 + z 2
p
. (r) S p
.
(q)
.
x2 + y 2 + z 2
sin2 x2 + y 2
x2 + y 2 + z 2
(b) S
5x2
2.S Let a > 0, p > 1. Evaluate the limit
\[ \lim_{\substack{(x,y) \to (0,0) \\ (x,y) \in E}} \frac{x^2 - 5y^2}{x^2 + 3y^2} \]
for the sets
(a) $E = \{(x, y) : |y| \le a|x|^p\}$
(b) $E = \{(x, y) : |y| < |x|\}$.
3.S Let f be continuously differentiable on (−π/2, π/2). Define g on the set
\[ E := \{(x, y) \in (-\pi/2, \pi/2)^2 : x \neq y\} \]
by
\[ g(x, y) = \frac{f(x) - f(y)}{\sin x - \sin y}. \]
Show that g has a continuous extension to $(-\pi/2, \pi/2)^2$.
4. Let f and g be continuously differentiable on some open interval (a, b)
and suppose that $g' \neq 0$. Define h on the set
\[ E := \{(x, y) \in (a, b)^2 : x \neq y\} \]
by
\[ h(x, y) = \frac{f^2(x) - f^2(y)}{g(x) - g(y)}. \]
Prove that h has a continuous extension to $(a, b)^2$.
5. Let f : X → Y. Prove that the following statements are equivalent:
(a) f is continuous.
(b) $f(\mathrm{cl}(A)) \subseteq \mathrm{cl}(f(A))$ for each subset A of X.
(c) $\mathrm{cl}(f^{-1}(B)) \subseteq f^{-1}(\mathrm{cl}(B))$ for each subset B of Y.
(d) $f^{-1}(\mathrm{int}(B)) \subseteq \mathrm{int}(f^{-1}(B))$ for each subset B of Y.
6.S Show that d : X × X → R is uniformly continuous with respect to the
product metric η := d × d on X × X.
7.S Let f : [0, a) → R and
\[ g(x, y) := f\bigl(\sqrt{x^2 + y^2}\bigr), \qquad \sqrt{x^2 + y^2} < a. \]
(a) Prove that g is uniformly continuous iff f is uniformly continuous.
(b) Use (a) to show that the functions
\[ \sqrt{x^2 + y^2}, \qquad \frac{1}{\sqrt{x^2 + y^2 + 1}}, \qquad \text{and} \qquad \sin\sqrt{x^2 + y^2} \]
are uniformly continuous on $\mathbb{R}^2$ but $\sin(x^2 + y^2)$ is not.
8.S Let f (x) be uniformly continuous on R. Prove that the function g(x, y) :=
f (αx + βy) is uniformly continuous on R2 . Give an example of a bounded
uniformly continuous function f on R such that the function h(x, y) :=
f (xy) is not uniformly continuous on R2 .
9. Show that the function
\[ f(x, y) = \frac{1}{1 - \sin x \sin y} \]
is uniformly continuous on the set
Er := [−π/2 + r, π/2 − r] × [−π/2 + r, π/2 − r]
for any 0 < r < π/2, but is not uniformly continuous on
E := (−π/2, π/2) × (−π/2, π/2).
10. Let f : (X, d) → (Y, ρ) and g : (Y, ρ) → (Z, µ) be (uniformly) continuous.
Prove that g ◦ f : (X, d) → (Z, µ) is (uniformly) continuous.
11.S Let $f : X \to \mathbb{R}^k$, say $f(x) = \bigl(f_1(x), \ldots, f_k(x)\bigr)$. Prove that f is (uniformly) continuous iff each $f_j$ is (uniformly) continuous.
12.S Let fn : (X, d) → (Y, ρ) converge uniformly to f on X. Prove that if
each fn is (uniformly) continuous, then f is (uniformly) continuous.
8.5 Compact Sets
Throughout this section, (X, d) and (Y, ρ) denote arbitrary metric spaces.
Compactness is one of the most important concepts in analysis. For example,
it allows the formulation of results such as the extreme value theorem and
the uniform continuity theorem in the context of general metric spaces. It is
also the key feature that distinguishes the finite dimensional space $\mathbb{R}^n$ from its
infinite dimensional counterparts $\ell^\infty$ and $\ell^1$.
8.5.1 Definition. Let E ⊆ X. A collection U = {Ui : i ∈ I} of subsets of X
is called a cover of E if E is contained in the union of the sets Ui . If each Ui
is open, then U is called an open cover of E. A cover U of E is said to have a
finite subcover if there exists a finite subset I0 of I such that {Ui : i ∈ I0 } is
a cover of E. If every open cover of E has a finite subcover, then E is said to
be compact.
♦
Finite subsets of a metric space are compact. In a discrete metric space,
these are the only compact sets. Indeed, if E is an infinite subset of a discrete
space, then $\{\{x\} : x \in E\}$ is an open cover of E with no finite subcover.
8.5.2 Proposition. A compact subset of a metric space is closed and bounded.
Proof. Let E be compact and let $a \in E^c$. For each x ∈ E let $U_x$ and $V_x$ denote
disjoint open balls with centers x and a, respectively (see Figure 8.4). Then
$\{U_x : x \in E\}$ is an open cover of E, hence there exists a finite subset $E_0$ of
E such that $\{U_x : x \in E_0\}$ covers E. Set $V = \bigcap_{x \in E_0} V_x$. Then V is an open
ball with center a, and since $V \cap U_x = \emptyset$ for each $x \in E_0$, $V \subseteq E^c$. Therefore
$E^c$ is open.
FIGURE 8.4: The neighborhoods $U_x$ and $V_x$.
To show that E is bounded, choose any x ∈ X and consider the open cover
{Bn (x) : n ∈ N} of E. Let F be a finite subset of N such that {Bn (x) : n ∈ F }
covers E. Then E ⊆ Bm (x), where m is the largest member of F .
The converse of 8.5.2 is false. For example, the set $\mathbb{Q} \cap [-\sqrt{2}, \sqrt{2}]$ is closed
and bounded in $\mathbb{Q}$ but not compact. Indeed, if $\{r_n\}$ is a sequence in $\mathbb{Q}$ with
$r_n \uparrow \sqrt{2}$, then $\{(-r_n, r_n) : n \in \mathbb{N}\}$ is an open cover of $\mathbb{Q} \cap [-\sqrt{2}, \sqrt{2}]$ with no
finite subcover. For another example, consider a discrete metric space. Here,
the entire metric space is closed and bounded but only finite sets are compact.
8.5.3 Proposition. A closed subset of a compact metric space is compact.
Proof. Let X be compact, E ⊆ X closed, and let $\mathcal{U} = \{U_i : i \in I\}$ be an open
cover of E. Then $\mathcal{U} \cup \{E^c\}$ is an open cover of X, hence there exists a finite
subset $I_0$ of I such that $X = E^c \cup \bigcup_{i \in I_0} U_i$. It follows that $E \subseteq \bigcup_{i \in I_0} U_i$.
Closely related to compactness is the notion of total boundedness.
8.5.4 Definition. Let E ⊆ X and ε > 0. An ε-net for E is a set F ⊆ X such
that {Bε (x) : x ∈ F } covers E. E is said to be totally bounded if for each
ε > 0 there exists a finite ε-net for E.
♦
An ε-net F for E has the property that every member of E is within ε of
a member of F . For example, Q is an ε-net for R, and Z is a 1-net for R.
The following proposition shows that the set F in the definition of total
boundedness may be taken to be a subset of E.
8.5.5 Proposition. If E has a finite ε-net F , then E has a finite 2ε-net
contained in E.
Proof. For each x ∈ F, apply the following procedure: If $E \cap B_\varepsilon(x) = \emptyset$, remove
x from F. Otherwise, choose any $a \in E \cap B_\varepsilon(x)$ and replace $B_\varepsilon(x)$ by $B_{2\varepsilon}(a)$
and x in F by a. Since $B_\varepsilon(x) \subseteq B_{2\varepsilon}(a)$, the revised set is a finite 2ε-net for E
contained in E.
FIGURE 8.5: A 2ε-net.
Since a finite union of open balls is bounded (Exercise 8.1.3), every totally
bounded set is bounded. The converse is false. For example, in a discrete space
all sets are bounded but no infinite set can be totally bounded.
Open and closed balls in C([0, 1]) with the supremum norm are bounded
but not totally bounded (Exercise 8). Contrast this with the following example:
8.5.6 Example. Every bounded subset E of $\mathbb{R}^n$ is totally bounded. To see this,
let ε > 0 and choose k ∈ N so large that $E \subseteq I := [-k\delta, k\delta]^n$, where $0 < \delta < 2\varepsilon/\sqrt{n}$.
Subdividing, we see that I is a finite union of sets of the form
\[ J := [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n], \quad \text{where } b_j - a_j = \delta. \]
(See Figure 8.6.) The largest diagonal in J has length $\sqrt{n}\,\delta < 2\varepsilon$, hence J may
be enclosed in an open ball with radius ε and center $c = (c_1, \ldots, c_n)$, where
$c_j = (a_j + b_j)/2$. The resulting collection of balls is a finite cover of E by balls of radius ε, so the centers form a finite ε-net for E. ♦
FIGURE 8.6: A bounded set in $\mathbb{R}^n$ is totally bounded.
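The grid construction of this example can be sketched in code for a finite sample of a bounded set. This is an illustration under stated assumptions: the names `grid_net` and `is_eps_net` are hypothetical, the side length is fixed at $\delta = \varepsilon/\sqrt{n}$ (one admissible choice with $0 < \delta < 2\varepsilon/\sqrt{n}$), and, as a small variation on the text, only the grid cubes that actually contain sample points are kept.

```python
import math
import random


def grid_net(points, eps):
    """Centers of the grid cubes of side delta = eps / sqrt(n) that contain a point."""
    n = len(points[0])
    delta = eps / math.sqrt(n)  # an assumed choice with 0 < delta < 2*eps/sqrt(n)
    centers = set()
    for p in points:
        cell = tuple(math.floor(c / delta) for c in p)
        centers.add(tuple((k + 0.5) * delta for k in cell))
    return sorted(centers)


def is_eps_net(centers, points, eps):
    """True if every point lies within eps of some center."""
    return all(any(math.dist(p, c) < eps for c in centers) for p in points)


# A random sample of the bounded set [-3, 3]^2, used only for illustration.
rng = random.Random(0)
sample = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(200)]
net = grid_net(sample, 0.5)
print(len(net), is_eps_net(net, sample, 0.5))
```

Each point lies within half a cube diagonal, $\sqrt{n}\,\delta/2 = \varepsilon/2$, of the center of its cube, which is why the check succeeds.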
8.5.7 Definition. A subset E of X is said to be sequentially compact if every
sequence in E has a cluster point in E.
♦
By the Bolzano–Weierstrass theorem, closed and bounded intervals in $\mathbb{R}$
are sequentially compact. The same is not true in $\mathbb{Q}$; for example, $\mathbb{Q} \cap [0, \sqrt{2}]$ is
not sequentially compact. In a discrete space, no infinite set can be sequentially
compact since sequences with distinct terms cannot converge.
8.5.8 Heine–Borel Theorem. The following statements are equivalent:
(a) X is compact.
(b) X is sequentially compact.
(c) X is complete and totally bounded.
Proof. (a) ⇒ (b): We prove the contrapositive ∼(b) ⇒ ∼(a). Let {an } be
a sequence in X with no cluster point. Then for each x ∈ X there must
exist an open ball B(x) with center x that contains only finitely many terms
of the sequence. This implies that the union of any finite subcollection of the open cover
$\{B(x) : x \in X\}$ of X contains only finitely many terms of the sequence and
hence cannot be all of X. Therefore, X is not compact.
(b) ⇒ (c): Let X be sequentially compact and let $\{a_n\}$ be a Cauchy sequence
in X. By hypothesis, $\{a_n\}$ has a convergent subsequence, say $a_{n_k} \to a \in X$.
By Exercise 8.1.9, $a_n \to a$. Therefore, X is complete.
Suppose that X is not totally bounded. Then there exists ε > 0 such that
no finite collection of open balls of radius ε covers X. Choose any $a_1 \in X$. Since
$B_\varepsilon(a_1)$ does not cover X, there exists $a_2 \in X \setminus B_\varepsilon(a_1)$. Since
$B_\varepsilon(a_1) \cup B_\varepsilon(a_2)$
does not cover X, there exists $a_3 \in X \setminus \bigl(B_\varepsilon(a_1) \cup B_\varepsilon(a_2)\bigr)$. Continuing in this
fashion, we construct a sequence $\{a_n\}$ in X such that
\[ a_n \in X \setminus \bigl(B_\varepsilon(a_1) \cup B_\varepsilon(a_2) \cup \cdots \cup B_\varepsilon(a_{n-1})\bigr). \]
It follows that $d(a_n, a_m) \ge \varepsilon$ for all m ≠ n. But then no subsequence of $\{a_n\}$
can converge. Therefore, X must be totally bounded.
(c) ⇒ (a): Assume that X is complete and totally bounded but not compact.
Then X has an open cover $\mathcal{U} = \{U_i : i \in I\}$ with no finite subcover. For
each k let $F_k$ be a finite set of points in X such that $\{B_{1/k}(x) : x \in F_k\}$
is a cover of X. Consider the case k = 1. If for each $x \in F_1$ the ball $B_1(x)$
could be covered by finitely many members of $\mathcal{U}$, then X itself would have
such a cover, contradicting our assumption. Thus there exists $x_1 \in F_1$ such
that $E_1 := B_1(x_1)$ cannot be covered by finitely many members of $\mathcal{U}$. Since
$\{B_{1/2}(x) : x \in F_2\}$ covers X, $\{E_1 \cap B_{1/2}(x) : x \in F_2\}$ covers $E_1$, so by
similar reasoning there exists $x_2 \in F_2$ such that $E_2 := E_1 \cap B_{1/2}(x_2)$ cannot
be covered by finitely many members of $\mathcal{U}$. In this manner we construct a
sequence of points $x_n$ in X and decreasing sets
\[ E_n = B_1(x_1) \cap B_{1/2}(x_2) \cap \cdots \cap B_{1/n}(x_n) = E_{n-1} \cap B_{1/n}(x_n) \tag{8.4} \]
that cannot be covered by finitely many members of $\mathcal{U}$. In particular, $E_n \neq \emptyset$.
Choose a point $y_n \in E_n$. If n > m, then $y_n \in E_m$, hence from (8.4)
\[ d(x_m, x_n) \le d(x_m, y_n) + d(y_n, x_n) < 1/m + 1/n. \]
It follows that $\{x_n\}$ is a Cauchy sequence. Since X is complete, $x_n \to x$ for
some x ∈ X. Choose i ∈ I such that $x \in U_i$. Since $U_i$ is open, there exists r > 0
such that $B_r(x) \subseteq U_i$. Next, choose n > 2/r such that $d(x_n, x) < r/2$. By the
triangle inequality, $B_{1/n}(x_n) \subseteq B_r(x)$. But then $E_n \subseteq U_i$, contradicting the
noncovering property of $E_n$. Therefore, X must be compact, completing the
proof.
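The selection procedure used in (b) ⇒ (c), where each new point is chosen outside the ε-balls around the earlier ones, can be run on a finite set. The sketch below is an illustration only: the name `greedy_separated` and the sample grid are hypothetical. On a finite (hence totally bounded) input the procedure stops, and the chosen points are pairwise at least ε apart while every input point lies within ε of a chosen one, so they form an ε-net contained in the set.

```python
import math


def greedy_separated(points, eps):
    """Scan the points, keeping each one that is at distance >= eps from all kept so far."""
    chosen = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen


# Illustrative input: a 21 x 21 grid in the unit square.
grid = [(i * 0.05, j * 0.05) for i in range(21) for j in range(21)]
net = greedy_separated(grid, 0.3)
print(len(net))
```

In a space that is not totally bounded the same procedure would never terminate for a suitable ε, which is exactly the sequence with no convergent subsequence built in the proof.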
8.5.9 Corollary. A subset of $\mathbb{R}^n$ is compact iff it is closed and bounded.
Proof. We have already seen that a compact set in a metric space is closed
and bounded. Conversely, let $C \subseteq \mathbb{R}^n$ be closed and bounded. Since $\mathbb{R}^n$ is
complete (Exercise 8.1.17), C is complete (8.2.8). Since C is bounded, it is
totally bounded (8.5.6). By the theorem, C is compact.
The validity of the preceding corollary ultimately rests on the finite
dimensionality of $\mathbb{R}^n$. For infinite dimensional normed spaces such as C([0, 1]), a
closed and bounded set need not be compact (Exercise 8). In the next section,
we characterize the compact subsets of spaces like C([0, 1]).
8.5.10 Theorem. If f : X → Y is continuous and X is compact, then f (X)
is compact.
Proof. Let $\{V_i : i \in I\}$ be an open cover of f(X) in Y. For each i ∈ I, set
$U_i = f^{-1}(V_i)$. Then $\{U_i : i \in I\}$ is an open cover of X, hence there exists a
finite subset $I_0$ of I such that $\{U_i : i \in I_0\}$ is a cover of X. It follows that
$\{V_i : i \in I_0\}$ is a finite cover of f(X).
8.5.11 Corollary. Let f : X → Y be continuous, one-to-one, and onto Y. If
X is compact then $f^{-1} : Y \to X$ is continuous, hence f is a homeomorphism.
Proof. Let $g = f^{-1}$ and let C be a closed subset of X. Then C is compact
(8.5.3), hence, by the theorem, $g^{-1}(C) = f(C)$ is compact and therefore closed
in Y (8.5.2). By 8.4.8, g is continuous.
Corollary 8.5.11 is false for noncompact X (Exercise 19).
8.5.12 Extreme Value Theorem. If f : X → R is continuous and X is
compact, then there exist points $x_m$ and $x_M$ in X such that
\[ f(x_m) \le f(x) \le f(x_M) \text{ for all } x \in X. \]
Proof. By 8.5.10 and 8.5.2, f (X) is closed and bounded in R and therefore
contains its supremum and infimum.
8.5.13 Theorem. If f : X → Y is continuous and X is compact, then f is
uniformly continuous.
Proof. Let ε > 0. By continuity, for each x ∈ X there exists $\gamma_x > 0$ such that
\[ f\bigl(B_{\gamma_x}(x)\bigr) \subseteq B_{\varepsilon/2}\bigl(f(x)\bigr). \tag{8.5} \]
Set $\delta_x = \gamma_x/2$. The collection $\{B_{\delta_x}(x) : x \in X\}$ is an open cover of X, hence
there exists a finite set F ⊆ X such that the collection $\{B_{\delta_x}(x) : x \in F\}$
covers X. Let $\delta := \min_{x \in F} \delta_x$ and let a, b ∈ X with d(a, b) < δ. Choose x ∈ F
such that $a \in B_{\delta_x}(x)$. Then
\[ d(x, a) < \delta_x < \gamma_x \quad \text{and} \quad d(x, b) \le d(a, b) + d(x, a) < \delta_x + \delta_x = \gamma_x, \]
so $a, b \in B_{\gamma_x}(x)$. By (8.5),
\[ \rho\bigl(f(a), f(b)\bigr) \le \rho\bigl(f(a), f(x)\bigr) + \rho\bigl(f(x), f(b)\bigr) < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
Therefore, f is uniformly continuous.
The following is a generalization of 3.5.9.
8.5.14 Corollary. Let X be compact, Y complete, E a dense subset of X,
and f : E → Y continuous. The following statements are equivalent:
(a) lim{x→a, x∈E} f (x) exists for each a ∈ X.
(b) f has a continuous extension to X; that is, there exists a continuous
function g : X → Y such that g|E = f .
(c) f is uniformly continuous on E.
Proof. (a) ⇒ (b): For each a ∈ X define $g(a) = \lim_{x \to a,\ x \in E} f(x)$. Since
f is continuous, $g|_E = f$. If g is not continuous at a ∈ X, then there exist
ε > 0 and a sequence $\{x_n\}$ in X such that $x_n \to a$ and $\rho\bigl(g(x_n), g(a)\bigr) \ge \varepsilon$
for all n. By definition of $g(x_n)$, for each n we may choose $a_n \in E$ such that
$d(x_n, a_n) < 1/n$ and $\rho\bigl(g(x_n), f(a_n)\bigr) < \varepsilon/2$. Then $a_n \to a$ but
\[ \rho\bigl(f(a_n), g(a)\bigr) \ge \rho\bigl(g(x_n), g(a)\bigr) - \rho\bigl(g(x_n), f(a_n)\bigr) > \varepsilon/2, \]
contradicting the definition of g(a). Therefore, g is continuous.
(b) ⇒ (c): By 8.5.13, g is uniformly continuous on X, hence f is uniformly
continuous on E.
(c) ⇒ (a): Let a ∈ X and let $\{x_n\}$ be a sequence in E such that $x_n \to a$.
Since f is uniformly continuous, $\{f(x_n)\}$ is Cauchy and therefore converges
to some b ∈ Y. If $\{y_n\}$ is another sequence in E such that $y_n \to a$, then
$d(y_n, x_n) \to 0$ so, by uniform continuity again, $\rho\bigl(f(y_n), f(x_n)\bigr) \to 0$, hence
$f(y_n) \to b$. By the sequential criterion for limits, $\lim_{x \to a,\ x \in E} f(x)$ exists and
equals b.
Exercises
1. Determine which of the following subsets of $\mathbb{R}^2$ are closed, bounded, or
compact.
(a)S $\{(x, y) : 2x^2 + y^2 + 6y \le 8x\}$.
(b)S $\{(x, y) : 3x^2 + 2y \le 6x\}$.
(c) $\{(x, y) : xy = 1\}$.
(d) $\{(x, y) : x^{1/3} + y^{1/3} = 1\}$.
(e) $\{(x, y) : x^{2/3} + y^{2/3} = 1\}$.
(f)S $\left\{ \left( \dfrac{x \cos x}{1 + x}, \dfrac{x \sin x}{1 + x} \right) : x \ge 0 \right\}$.
(g) $\{(e^{-x} \cos x, e^{-x} \sin x) : x \ge 0\}$.
(h)S $\{(x, y) : x^3/y + y^3/x > 0\}$.
2. Let {xn } be a convergent sequence in X with xn → x0 . Prove that the
set {x0 , x1 , x2 , . . .} is compact.
3.S Prove that a finite union of totally bounded (compact) sets is totally
bounded (compact).
4.S Prove that the intersection of an arbitrary family of compact subsets of
a metric space X is compact.
5. Prove that X × Y is compact in the product metric η := d × ρ iff X and
Y are compact.
6. Prove that the closure of a totally bounded subset of a metric space is
totally bounded.
7.S Prove that a subset E of a complete metric space X is totally bounded
iff every sequence in E has a cluster point in X.
8. Prove that in $(C([0, 1]), \|\cdot\|_\infty)$, the closed ball with radius 1 and center
the zero function is not compact.
9. Let $C_0([0, +\infty))$ be the vector subspace of $B([0, +\infty))$ consisting of all real-valued continuous functions f on [0, +∞) such that $\lim_{x \to +\infty} f(x) = 0$.
Prove that $C_0([0, +\infty))$ is closed in the uniform norm and that the closed
ball $C_1(0)$ in $C_0([0, +\infty))$ with radius 1 and center the zero function is
not compact and therefore is not totally bounded.
10. For n ∈ N, define fn ∈ B([0, +∞)) by fn = 1 on [n, n + 1] and zero
elsewhere. Prove that the set E := {f1 , f2 , . . .} is bounded but not totally
bounded in the sup metric.
11.S (Cantor’s intersection theorem). Let $C_1, C_2, \ldots$ be a sequence of
nonempty compact subsets of a metric space X such that $C_{n+1} \subseteq C_n$
for all n. Prove that $\bigcap_{n=1}^{\infty} C_n \neq \emptyset$.
12. A collection of subsets of a metric space X is said to have the finite intersection property if every finite subcollection has a nonempty intersection.
Prove that X is compact iff every collection of closed subsets of X with
the finite intersection property has a nonempty intersection.
13.S The diameter of a nonempty subset A of (X, d) is defined by
d(A) := sup {d(a, b) : a, b ∈ A} .
(a) Prove that if A is compact, then there exist points a, b ∈ A such that
d(A) = d(a, b).
(b) Give an example of a closed and bounded set A in a metric space
such that d(A) > d(a, b) for all a, b ∈ A.
14. ⇓3 The distance between nonempty subsets A and B of (X, d) is defined
as
d(A, B) := inf {d(a, b) : a ∈ A, b ∈ B} .
3 This exercise will be used in 8.7.2.
(a) Prove that if A and B are disjoint with A closed and B compact,
then d(A, B) > 0.
(b) Show by example that the conclusion in (a) is false if B is merely
closed.
(c) Show that if both sets are compact, then there exist a ∈ A and b ∈ B
such that d(A, B) = d(a, b).
15.S ⇓4 Let A be a nonempty subset of X and define d(A, ·) : X → R by
d(A, x) = d(A, {x}) (see Exercise 14). Prove the following:
(a) |d(A, x) − d(A, y)| ≤ d(x, y), hence d(A, ·) is uniformly continuous.
(b) d(A, x) = 0 iff x ∈ cl(A).
(c) If A and B are disjoint closed sets, then the function
\[ F_{AB}(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}, \quad x \in X, \]
is well-defined and continuous, $0 \le F_{AB} \le 1$ on X, and
\[ A = \{x : F_{AB}(x) = 0\}, \quad B = \{x : F_{AB}(x) = 1\}. \]
(d) If A and B are disjoint closed sets of X, then there exist disjoint
open sets U and V such that A ⊆ U and B ⊆ V . (U and V are then said
to separate A and B.)
16. Referring to 8.1.10, show that the set $\{f \in \ell^\infty : |f(n)| \le e^{-n}\}$ is
compact. Is $\{f \in B([1, +\infty)) : |f(x)| \le e^{-x}\}$ compact?
17. (Lebesgue’s number). Let X be compact and let U = {Ui : i ∈ I} be an
open cover of X. Prove that there exists a number r > 0 such that every
set with diameter < r (Exercise 13) is contained in some Ui .
18. (Dini’s Theorem). Let X be compact and let fn , g : X → R be continuous
such that either fn ↓ g or fn ↑ g on X. Prove that the convergence is
uniform. (See 7.1.12.)
19.S Let $f : [0, 2\pi) \to \mathbb{R}^2$ be defined by f(t) = (cos t, sin t). Show that f is
continuous, one-to-one, and maps [0, 2π) onto the circle $x^2 + y^2 = 1$ but
has a discontinuous inverse.
20. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by f(x, y) = x. Prove or disprove:
(a) If $E \subseteq \mathbb{R}^2$ is closed, then f(E) is closed.
(b) If $E \subseteq \mathbb{R}^2$ is open, then f(E) is open.
4 This exercise will be used in 11.2.17.
21.S Let A and B be compact subsets of R. Prove that the sets
AB := {ab : a ∈ A, b ∈ B} and A + B := {a + b : a ∈ A, b ∈ B}
are compact.
22. ⇓5 Let a sequence of continuous functions fn : (X, d) → (Y, ρ) converge
uniformly to f on X, let C ⊆ X be compact, and let U ⊆ Y be open.
Prove that if f (C) ⊆ U , then fn (C) ⊆ U for all sufficiently large n.
*8.6 The Arzelà–Ascoli Theorem
Throughout this section, (X, d) and (Y, ρ) denote arbitrary metric spaces
and C(X, Y ) denotes the set of all continuous functions from X to Y .
As noted in the previous section, closed and bounded subsets in infinite
dimensional spaces such as C([0, 1]) need not be compact. The additional
property of equicontinuity is needed to characterize compact subsets of such
spaces.
8.6.1 Definition. A family $\mathcal{F}$ of functions in C(X, Y) is said to be
• equicontinuous at a point a ∈ X if, for each ε > 0, there exists δ > 0
such that $\rho\bigl(f(x), f(a)\bigr) < \varepsilon$ for all x ∈ X with d(x, a) < δ and all $f \in \mathcal{F}$;
• equicontinuous on E ⊆ X if $\mathcal{F}$ is equicontinuous at each point of E;
• uniformly equicontinuous on E if, for each ε > 0, there exists δ > 0 such
that $\rho\bigl(f(x), f(y)\bigr) < \varepsilon$ for all $f \in \mathcal{F}$ and all x, y ∈ E with d(x, y) < δ. ♦
The distinguishing feature of equicontinuity is that, while δ may vary
with the point a, it is independent of the functions f ∈ F. With uniform
equicontinuity, δ is independent of both f and a.
8.6.2 Example. For each x, t ∈ R, define $f_t(x) = tx$. Let I = (c, d) be a
bounded interval and set M = max{|c|, |d|}. The inequality
\[ |f_t(x) - f_t(y)| = |t|\,|x - y| \le M |x - y|, \quad t \in I, \]
shows that the collection of functions $\{f_t : t \in I\}$ is uniformly equicontinuous
on R. On the other hand, the larger collection $\{f_t : t \in \mathbb{R}\}$ is not equicontinuous
at any a ∈ R. Indeed, no δ can be chosen so that |tx − ta| < 1 for all t ∈ R
and all x ∈ R with |x − a| < δ.
♦
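Both halves of Example 8.6.2 can be made concrete. The sketch below uses hypothetical helper names and is only an illustration: for the bounded parameter interval (c, d) the single choice δ = ε/M works for every $f_t$ at once, while for the full family any proposed δ is defeated by taking t large enough (here t = 4/δ, an arbitrary choice that makes the difference equal to 2).

```python
def delta_for_bounded_family(eps, c, d):
    """A delta valid for every f_t(x) = t*x with t in (c, d), namely eps / M."""
    big_m = max(abs(c), abs(d))  # positive because c < d
    return eps / big_m


def violating_parameter(a, delta):
    """Return (t, x) with |x - a| < delta but |t*x - t*a| >= 1 for f_t(x) = t*x."""
    x = a + delta / 2
    t = 4.0 / delta  # then |t*x - t*a| = t * delta / 2 = 2
    return t, x


print(delta_for_bounded_family(0.1, -2.0, 5.0))
t, x = violating_parameter(3.0, 1e-3)
print(t, x, abs(t * x - t * 3.0))
```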
5 This exercise will be used in 13.6.5.
A straightforward modification of the proof of 8.5.13 yields
8.6.3 Theorem. If X is compact and F is equicontinuous on X, then F is
uniformly equicontinuous.
8.6.4 Definition. A metric space is said to have the Bolzano–Weierstrass
property if every bounded sequence has a cluster point.
♦
A compact metric space and the space $\mathbb{R}^n$ have the Bolzano–Weierstrass
property, while infinite discrete metric spaces, the space $\mathbb{Q}$, and the infinite
dimensional space $(C([0, 1]), \|\cdot\|_\infty)$ do not.
8.6.5 Proposition. (a) A metric space with the Bolzano–Weierstrass property
is complete.
(b) A metric space has the Bolzano–Weierstrass property iff every closed and
bounded set is compact.
Proof. For (a), use the fact that a Cauchy sequence is bounded and apply
Exercise 8.1.9. Part (b) follows from 8.5.8.
The following lemma may be proved using familiar ideas such as those
found in 8.1.10. The details are left to the reader.
8.6.6 Lemma. Let (X, d) be compact and (Y, ρ) complete. For f, g ∈ C(X, Y)
define
\[ \sigma(f, g) = \sup_{x \in X} \rho\bigl(f(x), g(x)\bigr). \]
Then σ is a metric on C(X, Y), and C(X, Y) is complete in this metric.
8.6.7 Lemma. A compact metric space X has a countable dense subset of
the form $D = \bigcup_{k=1}^{\infty} F_k$, where $F_k$ is a finite (1/k)-net for X.
Proof. For each k ∈ N, the collection $\{B_{1/k}(x) : x \in X\}$ is an open cover of
X, hence has a finite subcover $\{B_{1/k}(x) : x \in F_k\}$. By definition of ε-net,
$\bigcup_{k=1}^{\infty} F_k$ is dense in X.
8.6.8 Arzelà–Ascoli Theorem. Let X be compact and let Y have the
Bolzano–Weierstrass property. Then a set $\mathcal{F}$ is compact in $(C(X, Y), \sigma)$ iff it
is closed, bounded, and equicontinuous.
Proof. Suppose $\mathcal{F}$ is compact in C(X, Y), hence closed and bounded. If $\mathcal{F}$ is
not equicontinuous at some a ∈ X, then there exists an ε > 0 and for every n
members $x_n$ of X and $f_n$ of $\mathcal{F}$ such that $d(x_n, a) < 1/n$ and
\[ \rho\bigl(f_n(x_n), f_n(a)\bigr) \ge \varepsilon. \tag{8.6} \]
By compactness of $\mathcal{F}$, we may assume that $\{f_n\}$ converges uniformly to some
$f \in \mathcal{F}$ (otherwise, take a subsequence). Since $x_n \to a$, the uniform convergence
of $\{f_n\}$ implies that $f_n(x_n) \to f(a)$ and $f_n(a) \to f(a)$, hence $\rho\bigl(f_n(x_n), f_n(a)\bigr) \to 0$. But this contradicts (8.6). Therefore, $\mathcal{F}$
is equicontinuous.
Conversely, assume that F is closed, bounded, and equicontinuous and let
{fn } be any sequence in F. We show that {fn } has a convergent subsequence.
The compactness of F will then follow from 8.5.8.
Let $F_k$ and $D = \{x_1, x_2, \ldots\}$ be as in 8.6.7. We show first that $\{f_n\}$ has
a subsequence that converges pointwise on D. For this we use the Bolzano–Weierstrass
property of Y and the following diagonalization argument: Because
$\{f_n\}$ is bounded, we may choose a subsequence $\{f_n^{(1)}\}$ of $\{f_n^{(0)} := f_n\}$ such
that the sequence $\{f_n^{(1)}(x_1)\}$ converges to some $y_1 \in Y$. We may then choose
a subsequence $\{f_n^{(2)}\}$ of $\{f_n^{(1)}\}$ such that $\{f_n^{(2)}(x_2)\}$ converges to some $y_2 \in Y$.
Continuing in this way, we obtain for each k a sequence $\{f_n^{(k)}\}$ such that
$\{f_n^{(k+1)}\}$ is a subsequence of $\{f_n^{(k)}\}$ and $\lim_n f_n^{(k)}(x_k) = y_k$. Now take the
diagonal sequence $\{g_n := f_n^{(n)}\}$, which is a subsequence of $\{f_n\}$ and for each
k, except for the first k − 1 terms, is a subsequence of $\{f_n^{(k)}\}$. It follows that
$\lim_n g_n(x_k) = y_k$ for each k. The scheme may be depicted as follows:
(1)
(1)
→ y1 at x1
(2)
(2)
→ y2 at x2
f1 , f2 , . . . , fn(1) , . . .
f1 , f2 , . . . , fn(2) , . . .
..
.
(n)
(n)
→ yn at xn
f1 , f2 , . . . , fn(n) , . . .
..
.
&
yk
at each xk
Having obtained a subsequence {gn } of {fn } that converges pointwise on
the dense set D, we now show that {gn } converges uniformly on X, which will
complete the proof.
By the uniform equicontinuity of {gn }, given ε > 0, we may choose δ > 0
such that
ρ(gn(x), gn(y)) < ε/3 for all n ∈ N and x, y ∈ X with d(x, y) < δ. (8.7)
Let k > 1/δ. Since {gn} converges pointwise on Fk and Fk is finite, we may choose Nk so that
ρ(gn(y), gm(y)) < ε/3 for all n, m ≥ Nk and all y ∈ Fk. (8.8)
Since Fk is a (1/k)-net and 1/k < δ, Fk is a δ-net; hence, given x ∈ X, there exists y ∈ Fk such that d(x, y) < δ. It follows from (8.7) and (8.8) that for m, n ≥ Nk,
ρ(gn(x), gm(x)) ≤ ρ(gn(x), gn(y)) + ρ(gn(y), gm(y)) + ρ(gm(y), gm(x))
< ε/3 + ε/3 + ε/3 = ε.
Since x was arbitrary, {gn } is a Cauchy sequence in C(X, Y ). Since C(X, Y ) is
complete, {gn } converges in C(X, Y ).
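The role of equicontinuity in 8.6.8 can be made tangible with a rough numerical experiment. The sketch below is an assumption-laden illustration, not part of the proof: the helper `modulus` only samples a grid, so it gives a lower estimate of the quantity sup{|f(x) − f(y)| : f in the family, |x − y| ≤ δ}. It compares the translates sin(x + t), whose derivatives are bounded by 1, with the functions sin(nx), which form a bounded family that is not equicontinuous.

```python
import math

def modulus(family, a, b, delta, samples=400):
    """Grid estimate of sup |f(x) - f(y)| over f in family and |x - y| <= delta."""
    pts = [a + (b - a) * i / samples for i in range(samples + 1)]
    step = (b - a) / samples
    reach = max(1, int(delta / step))      # number of grid steps within delta
    worst = 0.0
    for f in family:
        vals = [f(x) for x in pts]
        for i in range(len(pts)):
            for j in range(i + 1, min(i + reach, len(pts) - 1) + 1):
                worst = max(worst, abs(vals[i] - vals[j]))
    return worst

# Hypothetical test families on [0, pi]:
shifts = [lambda x, t=t: math.sin(x + t) for t in range(20)]    # uniformly equicontinuous
waves = [lambda x, n=n: math.sin(n * x) for n in range(1, 21)]  # oscillation grows with n

shift_mod = modulus(shifts, 0.0, math.pi, 0.1)
wave_mod = modulus(waves, 0.0, math.pi, 0.1)
print(shift_mod, wave_mod)
```

For the translates the estimate stays below δ by the mean value theorem, while for sin(nx) it stays large no matter how small δ is once n is allowed to grow.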
Remark. The proof of the sufficiency of the theorem did not require that
F be uniformly bounded. All that was used was the property of pointwise
boundedness, that is, {f (x) : f ∈ F} bounded in Y for each x ∈ X. Uniform
boundedness is then a consequence of equicontinuity.
♦
8.6.9 Example. Let X be compact. Then any convergent sequence of functions fn in C(X, R), say fn → f, is equicontinuous. This may be verified
directly, but a quick proof uses 8.6.8 applied to the set {f, f1, f2, . . .}, whose
compactness is readily established.
♦
Exercises
1. Let X × Y have the product metric η := d × ρ and let f : X → Y . The
graph of f is the set
G(f ) = {(x, y) : x ∈ X and y = f (x)}.
Prove that if f is continuous, then G(f ) is closed in X × Y . Conversely,
prove that if G(f ) is closed, f (X) is bounded, and Y has the Bolzano–
Weierstrass property, then f is continuous. Give an example of a real-valued discontinuous function on [0, 1] with a closed graph.
2. Let X have the Bolzano–Weierstrass property and let {xn } be a bounded
sequence in X with only finitely many cluster points y1 , . . . , yk . Prove
that the set C := {y1 , . . . , yk , x1 , x2 , . . .} is compact.
3.S Prove that a subset F of C(X, Y) is equicontinuous at a ∈ X iff for any sequences {fn} in F and {xn} in X with xn → a, ρ(fn(xn), fn(a)) → 0.
4. Prove that a subset F of C(X, Y) is uniformly equicontinuous on E ⊆ X iff for any sequences {fn} in F and {xn}, {an} in E with d(xn, an) → 0, ρ(fn(xn), fn(an)) → 0.
5. Prove that a finite set of uniformly continuous functions f : X → Y is
uniformly equicontinuous.
6. Prove that the uniform closure of a set F ⊆ C(X, Y ) of uniformly
equicontinuous functions is uniformly equicontinuous.
7.S Let c, p > 0 and define $f_n(x) = (nx)^{-p}$, x ≥ c. Show that the sequence
{fn } is uniformly equicontinuous.
8. Define fn (x) = ln(n + x). Show that the sequence {fn } is uniformly
equicontinuous on (0, +∞).
9.S Define fn (x) = sin(nx). Use Exercise 3 and Exercise 8.3.13 to show
that the sequence {fn } is not equicontinuous at any nonzero rational
number r.
10. Let M > 0 and define
RM := {f : f is locally integrable on [0, +∞) and ‖f‖∞ ≤ M}.
For f ∈ RM define
\[ F_f(x) = \int_0^x f, \quad x \ge 0. \]
Prove that the set F := {Ff : f ∈ RM } is uniformly equicontinuous on
[0, +∞).
11.S Let M > 0 and define
DM := {f : (a, b) → R : |f′(x)| ≤ M for all a < x < b}.
Show that DM is uniformly equicontinuous. Conclude that if g has
a bounded derivative on R, then the set of functions {gt : t ∈ R} is
uniformly equicontinuous on I, where gt (x) = g(t + x).
12. Let f : X × Y → R have the property that f (x, y) is continuous in y for
each fixed x and continuous in x for each fixed y. Define
F := {f ( · , y) : y ∈ Y } .
Prove:
(a) If F is equicontinuous, then f is continuous.
(b) If f is continuous and Y is compact, then F is equicontinuous.
13. Let X be compact. Show that a totally bounded subset of C(X, Y ) is
uniformly equicontinuous.
14.S Let {fi : i ∈ I} be a uniformly bounded subset of $\mathcal{R}_a^b$. Define
\[ F_i(x) := \int_a^x f_i(t)\,dt, \quad a \le x \le b. \]
Show that {Fi : i ∈ I} is a totally bounded subset of C([a, b]).
15. Let f(t, x, y) be continuous on [a, b]³ and define ft(x, y) = f(t, x, y).
Prove that the family {ft : t ∈ [a, b]} is uniformly equicontinuous on
[a, b]². Apply this to the function
\[ f(t, x, y) = \frac{1 + t \sin x}{2 + t \sin y} \quad \text{on } [0, 1]^3. \]
8.7
Connected Sets
Throughout this section, (X, d) and (Y, ρ) denote arbitrary metric spaces.
8.7.1 Definition. A pair (U, V ) of open sets in X is said to separate X if
X = U ∪ V, U ≠ ∅, V ≠ ∅, and U ∩ V = ∅.
The pair (U, V ) is then called a separation of X. The space X is said to be
connected if it has no separation, and disconnected otherwise. A subset E of
X is connected if it is connected as a subspace of X.
♦
It follows from the definition that if E is disconnected, then there exist
sets U , V open in X such that (E ∩ U, E ∩ V ) is a separation of E. The sets U
and V need not be disjoint in this definition; however the next theorem shows
that this useful state of affairs may always be achieved. In this case we shall
call (U, V ) a separation of E.
8.7.2 Theorem. A subset E of X is disconnected iff there exists a separation
(E ∩ U, E ∩ V ) of E such that U ∩ V = ∅.
FIGURE 8.7: A separation (U, V) of E.
Proof. The sufficiency is clear. For the necessity, assume that E is disconnected
and that (E ∩ U1 , E ∩ V1 ) is a separation of E. Here, U1 and V1 are open in
X but may not be disjoint. However, since E ∩ U1 and E ∩ V1 are disjoint,
clE (E ∩ U1 ) ∩ V1 = ∅. Indeed, if, to the contrary, x ∈ clE (E ∩ U1 ) ∩ V1 for
some x, then there would be a sequence {xn } in E ∩ U1 converging to x,
which would imply that eventually xn ∈ E ∩ V1 , impossible. Recalling that
clE (E ∩ U1 ) = E ∩ clX (U1 ), we now see that
v ∉ clX(U1) for each v ∈ E ∩ V1.
Similarly,
u ∉ clX(V1) for each u ∈ E ∩ U1.
By Exercise 8.5.14 it follows that for u ∈ E ∩ U1 and v ∈ E ∩ V1 the distances
r(u) := inf{d(u, x) : x ∈ clX (V1 )}
and s(v) := inf{d(v, x) : x ∈ clX (U1 )}
are positive. Define
\[ U = \bigcup_{u \in E \cap U_1} B_{r(u)/2}(u) \quad \text{and} \quad V = \bigcup_{v \in E \cap V_1} B_{s(v)/2}(v). \]
Clearly, U and V are open in X and contain E ∩ U1 and E ∩ V1 , respectively.
To prove that (U, V ) is a separation of E, it remains to show that U ∩ V = ∅.
Suppose to the contrary that there exists a point x ∈ U ∩ V. Then, by
the above,
d(x, u) < r(u)/2 for some u ∈ E ∩ U1 and d(x, v) < s(v)/2 for some v ∈ E ∩ V1.
Adding and using the triangle inequality we have
d(u, v) < r(u)/2 + s(v)/2.
On the other hand, by definition of r(u) and s(v),
d(u, v) ≥ r(u) and d(u, v) ≥ s(v),
hence
d(u, v) ≥ (r(u) + s(v))/2.
This contradiction shows that U ∩ V = ∅ and completes the proof of the
theorem.
In any metric space, the empty set and the singletons {x} are trivially
connected, but no other finite subsets are connected. In a discrete space the
only connected sets are the empty set and the singletons. The set Q is not
connected in R, since the open sets (−∞, √2) and (√2, +∞) separate Q.
8.7.3 Theorem. X is not connected iff there exists a continuous function
from X onto {0, 1}. Equivalently, X is connected iff every continuous function
from X into {0, 1} is constant.
Proof. Assume that X is not connected and let (U, V) separate X. Define
\[ g(x) = \begin{cases} 0 & \text{if } x \in U, \\ 1 & \text{if } x \in V. \end{cases} \]
Then g maps X onto {0, 1}. Let W be any open set in R. Then g⁻¹(W) is
one of the sets ∅, U, V, or X, each of which is open in X. Therefore, g is
continuous.
Conversely, if a continuous function g from X onto {0, 1} exists, then the
open sets g⁻¹((−1, 1/2)) and g⁻¹((1/2, 2)) separate X.
8.7.4 Corollary. The nonempty connected subsets of R are the intervals.
Proof. By the intermediate value theorem, there can be no continuous function
from an interval onto {0, 1}. Hence intervals must be connected.
Now let E be a nonempty subset of R that is not an interval. Choose
real numbers a < c < b with a, b ∈ E but c ∉ E. Then (−∞, c) and (c, +∞)
separate E, hence E is not connected.
The following is a generalization of the intermediate value theorem.
8.7.5 Corollary. If f : X → Y is continuous and X is connected, then f (X)
is connected.
Proof. Let g : f (X) → {0, 1} be continuous. Then g ◦ f : X → {0, 1} is
continuous and hence must be constant. It follows that g itself must be
constant.
8.7.6 Corollary. If A ⊆ X is connected and A ⊆ B ⊆ cl(A), then B is
connected. In particular, the closure of a connected set is connected.
Proof. Let g : B → {0, 1} be continuous. Then g|A is continuous, hence must
be constant. Since B ⊆ cl(A), g itself must be constant. Therefore, B is
connected.
The converse of 8.7.6 is false. For example, cl(Q) = R is connected but Q
is not.
8.7.7 Definition. A path in X from x to y is a continuous function ϕ from
an interval [a, b] to X such that ϕ(a) = x, the initial point of the path, and
ϕ(b) = y, the terminal point. X is said to be path connected if for each pair of
points x, y ∈ X there exists a path in X from x to y. A subset E of X is path
connected if it is path connected as a subspace of X.
♦
Note that if ϕ : [a, b] → X is a path from x to y, then
−ϕ(t) := ϕ(−t), −b ≤ t ≤ −a,
defines a path from y to x. Also, if ϑ : [c, d] → X is a path from y to z, then
the sum or concatenation ϕ + ϑ : [0, 2] → X of the paths ϕ and ϑ is a path
from x to z, where
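\[ (\varphi + \vartheta)(t) = \begin{cases} \varphi\big(a + (b - a)t\big) & \text{if } 0 \le t \le 1, \\ \vartheta\big(c + (d - c)(t - 1)\big) & \text{if } 1 \le t \le 2. \end{cases} \]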
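The reversal and concatenation of paths translate directly into code. The following is a minimal sketch in which paths are represented as Python functions together with their parameter intervals; the names `reverse_path` and `concatenate` and the two sample paths in R² are hypothetical choices made for illustration.

```python
def reverse_path(phi, a, b):
    """Return (psi, -b, -a), where psi(t) = phi(-t) for -b <= t <= -a."""
    return (lambda t: phi(-t)), -b, -a

def concatenate(phi, a, b, theta, c, d):
    """Return the path phi + theta on [0, 2], following the formula in the text."""
    def path(t):
        if 0 <= t <= 1:
            return phi(a + (b - a) * t)
        if 1 < t <= 2:
            return theta(c + (d - c) * (t - 1))
        raise ValueError("t must lie in [0, 2]")
    return path

# Illustrative paths in the plane (assumed for this example only):
phi = lambda t: (t, 0.0)            # from (0, 0) to (1, 0), parametrized on [0, 1]
theta = lambda t: (1.0, t - 3.0)    # from (1, 0) to (1, 2), parametrized on [3, 5]

both = concatenate(phi, 0, 1, theta, 3, 5)
psi, lo, hi = reverse_path(phi, 0, 1)
print(both(0), both(1), both(2), psi(lo), psi(hi))
```

The concatenation is well defined at t = 1 precisely because the terminal point of the first path equals the initial point of the second.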
A convex subset C of a normed vector space X is path connected. Indeed, if
x, y ∈ C, then the line segment
ϕ(t) := (1 − t)x + ty,
0 ≤ t ≤ 1,
joins x to y and lies in C. In particular, open and closed balls in X are path
connected.
8.7.8 Theorem. If X is path connected, then it is connected.
Proof. Let g : X → {0, 1} be a continuous function, let x, y ∈ X, and let
ϕ : [a, b] → X be a path from x to y. Then g ◦ ϕ : [a, b] → {0, 1} is continuous
and, because [a, b] is connected, must be constant. In particular,
g(x) = (g ◦ ϕ)(a) = (g ◦ ϕ)(b) = g(y).
Since x and y were arbitrary, g is constant.
8.7.9 Example. The subset B1(−1, 0) ∪ B1(1, 0) of R² is not connected, hence
not path connected. However, its closure C1(−1, 0) ∪ C1(1, 0) is path connected, as can be seen
from the figure, hence is connected.
FIGURE 8.8: C1(−1, 0) ∪ C1(1, 0) is path connected.
♦
8.7.10 Example. A sphere in Rⁿ, n > 1, is path connected, hence connected.
For example, consider the sphere
S = {x ∈ Rⁿ : ‖x‖₂ = 1}.
We show that there is a path from the point a = (1, 0, . . . , 0) to any point
b = (b1, b2, . . . , bn). It will then follow that any pair of points in S may be
joined by a path in S through a.
If b = (−1, 0, . . . , 0), then (cos t, sin t, 0, . . . , 0), 0 ≤ t ≤ π, is such a path.
Suppose b ≠ (−1, 0, . . . , 0). Then the line segment
ϕ(t) = (1 − t)a + tb = (1 − t + tb1, tb2, . . . , tbn), 0 ≤ t ≤ 1,
is never zero, hence $\|\varphi(t)\|_2^{-1}\,\varphi(t)$ is a path from a to b in S.
♦
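The normalized line segment of 8.7.10 is easy to evaluate numerically. The sketch below assumes b lies on the unit sphere and b ≠ −a; the function name `sphere_path` and the sample point b are illustrative assumptions.

```python
import math

def sphere_path(b):
    """Path on the unit sphere from a = (1, 0, ..., 0) to b, assuming b != -a."""
    n = len(b)
    a = (1.0,) + (0.0,) * (n - 1)
    def path(t):
        # Line segment (1 - t) a + t b, then division by its Euclidean norm.
        seg = [(1 - t) * ai + t * bi for ai, bi in zip(a, b)]
        norm = math.sqrt(sum(s * s for s in seg))
        return tuple(s / norm for s in seg)
    return path

b = (0.0, 0.6, 0.8)      # a sample point of the unit sphere in R^3
p = sphere_path(b)
for i in range(0, 11, 5):
    q = p(i / 10)
    print(q, math.sqrt(sum(c * c for c in q)))
```

The division is legitimate because the segment from a to b avoids the origin whenever b ≠ −a, as noted in the example.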
The converse of 8.7.8 is false, as the following example—the topologist’s
sine curve (8.3.7)—demonstrates.
8.7.11 Example. Let
A = {(x, sin(1/x)) : 0 < x < 2/π}, B = {0} × [−1, 1], and E = A ∪ B.
Since A is connected and E = cl(A), 8.7.6 shows that E is connected. However,
E is not path connected. Indeed, no point in A can be joined to a point in B
by a path in E. Suppose such a path existed, say ϕ : [a, b] → E, where
ϕ(t) = (x(t), y(t)), ϕ(a) ∈ A, and ϕ(b) ∈ B.
Let
S := {t ∈ [a, b] : ϕ([a, t]) ⊆ A}.
Since S is nonempty and bounded, c := sup S exists and c ∈ [a, b]. Note that
x(t) > 0 on S. If x(c) > 0, then c < b, hence, by continuity, x(s) is positive on
[a, c + δ] for some δ > 0, contradicting the definition of c. Therefore,
x(c) = 0
and x(t) > 0 on [a, c). This implies that ϕ(t) = (x(t), sin(1/x(t))) on [a, c) and
limt→c− x(t) = 0. By continuity, for each δ > 0 the set x([c − δ, c]) is an interval
of the form [0, d], d > 0. Therefore, y(t) = sin(1/x(t)) takes on all values in
[−1, 1] on each interval [c − δ, c), which implies that limt→c− y(t) cannot exist.
But this contradicts the continuity of ϕ at c.
♦
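The key step in 8.7.11 is that sin(1/x) takes both extreme values on every interval (0, d]. The following sketch exhibits explicit witnesses; the helper name `witnesses` and the particular choice of points are assumptions made for illustration.

```python
import math

def witnesses(d):
    """Return 0 < x_minus < x_plus <= d with sin(1/x_plus) = 1 and sin(1/x_minus) = -1."""
    m = math.ceil(1 / (2 * math.pi * d))       # guarantees (2m + 1/2) * pi >= 1/d
    x_plus = 1 / ((2 * m + 0.5) * math.pi)     # 1/x_plus = pi/2 + 2*m*pi
    x_minus = 1 / ((2 * m + 1.5) * math.pi)    # 1/x_minus = 3*pi/2 + 2*m*pi
    return x_minus, x_plus

for d in (0.1, 1e-3, 1e-6):
    xm, xp = witnesses(d)
    print(d, xp, math.sin(1 / xp), xm, math.sin(1 / xm))
```

Since such points exist for every d > 0, the second coordinate of a path approaching the segment B cannot have a limit, which is the contradiction used in the example.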
While there is no strict converse to 8.7.8, the next theorem provides a
partial converse.
8.7.12 Theorem. An open connected subset E of a normed vector space X
is path connected.
Proof. Fix a point x ∈ E and let U denote the set of all points u ∈ E for
which there exists a path in E from x to u. We claim that U is open. Let
u0 ∈ U and choose r > 0 such that Br(u0) ⊆ E. By definition of U, there
exists a path in E from x to u0. Since Br(u0) is convex, there exists a line
segment in Br(u0) from u0 to any point u ∈ Br(u0). The sum of these paths
is then a path in E from x to u. Therefore, Br(u0) ⊆ U, which shows that
U is open. A similar argument shows that V := E \ U is open. Since E is
connected and x ∈ U, V = ∅. Therefore, E = U.
FIGURE 8.9: E is path connected.
Exercises
1. Determine which sets are connected in R²:
(a) B1(−1, 0) ∪ {(0, 0)} ∪ B1(1, 0).
(b) R² \ {(1/m, 1/n) : m, n ∈ N}.
(c)S Q².
(d)S R² \ Q².
(e)S {(x, sin(1/x)) : x ≠ 0} ∪ {(0, a)}.
(f) R² \ G, where G is the graph of a bounded function f : [a, b] → R.
(g) R² \ G, where G is the graph of an equation F(x, y) = 0.
(h) {(x, y, z) : x² + y² − z² = 1}.
(i) {(x, y, z) : x² + y² − z² = −1}.
(j) {(x, y, z) : x² + y² − z² = 0, 0 < x² + y² ≤ 1}.
2. Prove that a metric space X is connected iff it has no proper nonempty
subset that is both open and closed.
3. Prove that X is connected iff it cannot be expressed as the union of
nonempty sets A and B such that
A ∩ clX (B) = clX (A) ∩ B = ∅.
Hint. Use 8.7.3 and the sequential characterization of continuity.
4. Prove that X × Y is connected in the product metric d × ρ iff X and Y
are connected.
5.S Let X be connected and f : X → R continuous. Suppose there exist
u, v ∈ X such that f (u)f (v) < 0. Show that the equation f (x) = 0 has
a solution.
6. Let X be connected and f : X → Y continuous. Suppose f has the
property that for each x ∈ X there exists ε > 0, possibly depending on
x, such that f is constant on Bε (x). Prove that f is constant on X.
7.S Let X be connected and let g, h : X → R be continuous such that
g(x) ≠ h(x) for all x ∈ X. Prove that g > h or h > g on X.
8. ⇓⁶ Let X be a normed vector space and u, v ∈ X. A polygonal path
P from u to v is a finite sequence of line segments Lk = [xk : xk+1],
k = 1, . . . , n − 1, where x1 = u and xn = v. The path P is non-overlapping if Lj ∩ Lk = ∅ unless j = k − 1, in which case Lj ∩ Lk = {xk}.
A subset E of a normed vector space X is polygonally connected if for
each pair of points u and v in E there exists a polygonal path from u
to v contained in E. For example, a convex set is polygonally connected.
Prove that every open connected subset E of X is polygonally connected.
Show also that it is always possible to choose P to be non-overlapping.
⁶ This exercise will be used in 12.2.10.
9.S Show that for n > 1 the complement of an open ball or a closed ball in
Rn is path connected, hence connected.
10. Suppose A ⊆ X is connected. By 8.7.6, cl(A) is connected. Prove or
disprove: (a) int(A) is connected, (b) bd(A) is connected.
11. The exterior ext(E) of a subset E of a metric space X is defined as the
interior of Eᶜ. Show that X = int(E) ∪ bd(E) ∪ ext(E). Conclude that X
is connected iff every subset of X with nonempty interior and nonempty
exterior also has a nonempty boundary.
12.S Let {An} be a finite or infinite sequence of connected subsets of X such
that An ∩ An+1 ≠ ∅ for each n. Prove that $\bigcup_n A_n$ is connected.
13. Let {Ai : i ∈ I} be a collection of nonempty connected sets and i0 ∈ I
such that Ai ∩ Ai0 ≠ ∅ for all i. Prove that $\bigcup_i A_i$ is connected.
14. Let {An} be an infinite sequence of compact connected subsets of X
such that An+1 ⊆ An. Prove that $\bigcap_n A_n$ is connected.
15. Let X = A1 ∪ · · · ∪ Ap and Y = B1 ∪ · · · ∪ Bq , p < q, where Aj and Bj
are connected, the Aj ’s are pairwise disjoint, and the Bj ’s are pairwise
disjoint and closed. Show that no continuous function f : X → Y can
map X onto Y .
16.S Prove that no one-to-one continuous function can map a closed line
segment L onto a circle C. Show, however, that there are continuous
functions that can do this.
17. Suppose closed line segments L1 , L2 , L3 in the plane meet at a single
endpoint P . Show that no one-to-one continuous function can map a
closed line segment L onto L1 ∪ L2 ∪ L3 . Show, however, that there are
continuous functions that can do this.
18. Let C1 and C2 be tangent circles in the plane. Show that no one-to-one
continuous function can map C1 ∪ C2 onto a circle C. Show, however,
that there are continuous functions that can do this.
19. Show that no one-to-one continuous function can map the set
E := {(x, y, z) : x² + y² = z², x² + y² ≤ 1}
onto a closed disk D. Show, however, that there are continuous functions
that can do this.
20.S Let X be a normed vector space and f : X → R continuous. Let
A := {x ∈ X : f (x) ≥ c} and B := {x ∈ X : f (x) = c}.
Prove that bd(A) ⊆ B and that the inclusion may be strict.
21. Let X be connected and have at least two points. Show that X is
uncountable. Hint. For all sufficiently small r > 0, $X \ne B_r(x) \cup C_r^c(x)$.
22.S Let U be an open subset of a normed vector space X and let x ∈ U . The
component of U containing x is the union Cx of all connected subsets of
U containing x.
(a) Prove that Cx is open and connected and that U is a union of pairwise
disjoint components.
(b) Show that the number of components is countable if X is a Euclidean
space Rⁿ.
23. Let (X, d) be complete, (Y, ρ) connected, c > 0, and let f : X → Y be a
continuous mapping such that f (X) is open and
ρ(f(u), f(v)) ≥ c d(u, v) for all u, v ∈ X.
Prove that Y is complete.
8.8
The Stone–Weierstrass Theorem
Let (X, d) be a compact metric space and let C(X) denote the space of
all continuous real-valued functions on X with the supremum norm $\|f\|_\infty = \sup_{x \in X} |f(x)|$. A member f of C(X) is said to be uniformly approximated by
members of a subset S of C(X) if f ∈ cl(S). This is equivalent to the existence
of a sequence {fn} in S converging uniformly to f on X.
Weierstrass’s approximation theorem asserts that any function in C([a, b])
may be uniformly approximated by polynomials. Stone’s generalization of
Weierstrass’s theorem replaces [a, b] by a compact metric space⁷ and the set of
polynomials by a more general class of functions.
The proof of Weierstrass’s theorem given below is due to Lebesgue. The
basic idea is to show that every continuous function may be uniformly approximated by piecewise linear functions and that these in turn may be uniformly
approximated by polynomials.
⁷ More generally, by a compact Hausdorff topological space.
8.8.1 Definition. Let a = x0 < x1 < . . . < xk = b. A function g on [a, b] is
said to be piecewise linear with vertices (xj , yj ) if, for j = 0, 1, . . . , k − 1,
\[ g(x) = y_j + m_j (x - x_j), \quad m_j = \frac{y_{j+1} - y_j}{x_{j+1} - x_j}, \quad x_j \le x \le x_{j+1}. \]
♦
Note that a piecewise linear function is necessarily continuous and that
its graph consists of a sequence of line segments joined at the vertices. (See
Figure 8.10.)
FIGURE 8.10: A piecewise linear function.
8.8.2 Lemma. Every continuous function f on [a, b] may be uniformly approximated by a piecewise linear function.
Proof. Given ε > 0, choose δ > 0 such that |f (x) − f (y)| < ε/2 whenever
|x − y| ≤ δ. Let x0 = a < x1 < · · · < xk = b be a partition of [a, b] with
mesh < δ and let g be as in 8.8.1 with yj = f (xj ). If xj ≤ x ≤ xj+1 , then
\[ |m_j|(x - x_j) = |f(x_{j+1}) - f(x_j)|\,\frac{x - x_j}{x_{j+1} - x_j} \le |f(x_{j+1}) - f(x_j)| < \varepsilon/2, \]
hence
\[ |f(x) - g(x)| \le |f(x) - f(x_j)| + |m_j|(x - x_j) < \varepsilon. \]
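The estimate in 8.8.2 can be observed numerically. The sketch below interpolates f = sin on [0, π] at equally spaced vertices and samples the error; the helper names `piecewise_linear` and `sup_error`, the test function, and the use of sampling (which only gives a lower estimate of the true supremum) are assumptions made for illustration.

```python
import bisect
import math

def piecewise_linear(xs, ys):
    """Return the piecewise linear function with vertices (xs[j], ys[j])."""
    def g(x):
        # Locate j with xs[j] <= x <= xs[j+1]; clamp to guard against rounding at the ends.
        j = bisect.bisect_right(xs, x) - 1
        j = min(max(j, 0), len(xs) - 2)
        m = (ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j])
        return ys[j] + m * (x - xs[j])
    return g

def sup_error(f, g, a, b, samples=2001):
    """Sampled estimate of sup |f - g| on [a, b]."""
    pts = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in pts)

f = math.sin
a, b = 0.0, math.pi
errors = []
for k in (4, 16, 64):
    xs = [a + (b - a) * j / k for j in range(k + 1)]
    g = piecewise_linear(xs, [f(x) for x in xs])
    errors.append(sup_error(f, g, a, b))
print(errors)
```

As the mesh of the partition shrinks, the sampled error decreases, in line with the lemma.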
8.8.3 Lemma. The function g in 8.8.1 may be written
\[ g(x) = y_0 + \sum_{j=0}^{k-1} c_j (x - x_j)^+, \quad a \le x \le b, \]
for suitably chosen constants cj .
Proof. For 0 ≤ j ≤ k − 1 and xj ≤ x ≤ xj+1, the desired equation reduces to
\[ y_j + m_j (x - x_j) = y_0 + \sum_{i=0}^{j} c_i (x - x_i) = y_0 - \sum_{i=0}^{j} c_i x_i + x \sum_{i=0}^{j} c_i. \]
This holds iff
\[ m_j = \sum_{i=0}^{j} c_i \quad \text{and} \quad y_j - m_j x_j = y_0 - \sum_{i=0}^{j} c_i x_i. \tag{8.9} \]
The first equation in (8.9) is satisfied by taking c0 = m0 and cj = mj − mj−1,
j ≥ 1. For this choice,
\[
\begin{aligned}
y_0 - \sum_{i=0}^{j} c_i x_i &= y_0 + \sum_{i=1}^{j} m_{i-1} x_i - \sum_{i=0}^{j} m_i x_i \\
&= y_0 - m_j x_j + \sum_{i=0}^{j-1} m_i (x_{i+1} - x_i) \\
&= y_0 - m_j x_j + \sum_{i=0}^{j-1} (y_{i+1} - y_i) \\
&= y_j - m_j x_j,
\end{aligned}
\]
which shows that the second equation in (8.9) is also satisfied.
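The choice of constants in the proof, c0 = m0 and cj = mj − mj−1, can be checked on a small example. The following sketch uses hypothetical function names (`hinge_coefficients`, `hinge_form`) and sample vertices chosen so that all arithmetic is exact.

```python
def hinge_coefficients(xs, ys):
    """Return c_0, ..., c_{k-1} with c_0 = m_0 and c_j = m_j - m_{j-1}."""
    slopes = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(len(xs) - 1)]
    return [slopes[0]] + [slopes[j] - slopes[j - 1] for j in range(1, len(slopes))]

def hinge_form(xs, ys):
    """Return g(x) = y_0 + sum_j c_j * (x - x_j)^+ for x in [xs[0], xs[-1]]."""
    cs = hinge_coefficients(xs, ys)
    # zip stops after the k coefficients, so the last vertex x_k is not used as a hinge.
    return lambda x: ys[0] + sum(c * max(x - xj, 0.0) for c, xj in zip(cs, xs))

xs = [0.0, 1.0, 2.0, 4.0]    # sample vertices (assumed for illustration)
ys = [1.0, 3.0, 2.0, 2.0]
g = hinge_form(xs, ys)
print(hinge_coefficients(xs, ys), [g(x) for x in xs], g(0.5), g(3.0))
```

Here the slopes are 2, −1, 0, so the coefficients are 2, −3, 1, and the hinge sum passes through every vertex.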
8.8.4 Lemma. The functions |x| and x⁺ may be uniformly approximated by
polynomials on any bounded interval I.
Proof. By 7.4.10, the binomial series
\[ \sum_{n=0}^{\infty} \binom{1/2}{n} (-t)^n \]
converges uniformly to $\sqrt{1 - t}$ on [−1, 1]. Setting t = 1 − x² we see that
\[ \sum_{n=0}^{\infty} \binom{1/2}{n} (x^2 - 1)^n \]
converges uniformly to $\sqrt{x^2} = |x|$ on [−1, 1]. Thus if sn(x) denotes the nth
partial sum of the last series and m is chosen so that I ⊆ [−m, m], then
Qn(x) := m sn(x/m) defines a sequence of polynomials converging uniformly
to |x| on I. Since x⁺ = ½(x + |x|), the polynomials Pn(x) := ½(x + Qn(x))
converge uniformly to x⁺ on I.
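The partial sums in the proof of 8.8.4 can be evaluated directly. The sketch below computes them with the standard recurrence for binomial coefficients and samples the error against |x| on [−1, 1]; the function name `abs_poly` and the sampling grid are illustrative assumptions.

```python
def abs_poly(x, n_terms):
    """Partial sum s_N(x) = sum_{n=0}^{N} C(1/2, n) * (x^2 - 1)^n with N = n_terms."""
    u = x * x - 1.0
    coeff, power, total = 1.0, 1.0, 0.0    # C(1/2, 0) = 1 and u^0 = 1
    for n in range(n_terms + 1):
        total += coeff * power
        coeff *= (0.5 - n) / (n + 1)       # C(1/2, n+1) = C(1/2, n) * (1/2 - n) / (n + 1)
        power *= u
    return total

pts = [-1 + 2 * i / 200 for i in range(201)]
errs = [max(abs(abs_poly(x, N) - abs(x)) for x in pts) for N in (10, 100, 1000)]
print(errs)
```

The convergence is uniform but slow: the largest error occurs at x = 0, where it decays roughly like the reciprocal of the square root of the number of terms.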
8.8.5 Weierstrass Approximation Theorem. The set of all polynomials
on [a, b] is dense in C([a, b]). That is, every member of C([a, b]) may be uniformly
approximated by polynomials.
Proof. Let f ∈ C([a, b]) and ε > 0. By 8.8.2, there exists a piecewise linear
function g on [a, b] such that ‖f − g‖∞ < ε/2. By 8.8.3 and 8.8.4, there exists
a polynomial P such that ‖P − g‖∞ < ε/2. Then, by the triangle inequality,
‖f − P‖∞ < ε.
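Because each step of Lebesgue's argument is explicit, the three lemmas can be chained into a small numerical experiment. The following is a sketch under stated assumptions: f = sin on [0, π], k = 8 equal subintervals, N = 500 terms of the binomial series, and m = b − a as the scaling constant; the names `s` and `weierstrass_poly` are hypothetical, and the error is only sampled on a grid.

```python
import math

def s(x, N):
    """Partial sum of sum_n C(1/2, n) (x^2 - 1)^n, approximating |x| on [-1, 1]."""
    u, coeff, power, total = x * x - 1.0, 1.0, 1.0, 0.0
    for n in range(N + 1):
        total += coeff * power
        coeff *= (0.5 - n) / (n + 1)
        power *= u
    return total

def weierstrass_poly(f, a, b, k, N):
    """Polynomial built as in 8.8.2 to 8.8.4: interpolate, write as hinges, replace each hinge."""
    xs = [a + (b - a) * j / k for j in range(k + 1)]
    ys = [f(x) for x in xs]
    slopes = [(ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j]) for j in range(k)]
    cs = [slopes[0]] + [slopes[j] - slopes[j - 1] for j in range(1, k)]
    m = b - a                               # x - x_j always lies in [-m, m]
    def P(x):
        # (x - x_j)^+ is replaced by (1/2) * ((x - x_j) + m * s((x - x_j) / m)).
        return ys[0] + sum(
            c * 0.5 * ((x - xj) + m * s((x - xj) / m, N)) for c, xj in zip(cs, xs)
        )
    return P

P = weierstrass_poly(math.sin, 0.0, math.pi, 8, 500)
err = max(abs(math.sin(x) - P(x)) for x in [math.pi * i / 100 for i in range(101)])
print(err)
```

The resulting polynomial has high degree for a modest accuracy, which reflects the slow convergence of the series for |x| rather than any defect of the argument.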
For the statement of the Stone–Weierstrass theorem, we need the following
definitions.
8.8.6 Definition. A collection A of real-valued functions on a set S is said
to be an algebra if A is closed under addition, multiplication, and scalar
multiplication; that is,
f, g ∈ A and α ∈ R ⇒ f + g, f g, αf ∈ A.
A is said to separate points of S if for each pair of distinct points s and t in S
there exists f ∈ A such that f(s) ≠ f(t).
♦
For example, the collection of all polynomials on [a, b] is an algebra that
separates points of [a, b].
8.8.7 Stone–Weierstrass Theorem. Let X be a compact metric space and
let A be an algebra in C(X) that contains the constant functions and separates
points of X. Then A is dense in C(X).
Proof. Set B := cl(A). The proof that B = C(X) consists of the following
sequence of steps.
I. B is an algebra in C(X).
⟦ If fn, gn ∈ A, fn → f, gn → g, and α ∈ R, then
(a) ‖αfn − αf‖∞ = |α| ‖fn − f‖∞ → 0,
(b) ‖(fn + gn) − (f + g)‖∞ ≤ ‖fn − f‖∞ + ‖gn − g‖∞ → 0, and
(c) ‖fn gn − f g‖∞ ≤ ‖fn gn − f gn‖∞ + ‖f gn − f g‖∞
≤ ‖gn‖∞ ‖fn − f‖∞ + ‖f‖∞ ‖gn − g‖∞
→ 0,
the convergence in (c) holding because {gn} is uniformly bounded. (Each
gn is bounded and gn converges uniformly to a bounded function.) Thus
B is closed under addition, multiplication, and scalar multiplication. ⟧
II. f ∈ B ⇒ |f | ∈ B.
⟦ Let M = ‖f‖∞. By 8.8.4 there exists a sequence of polynomials Pn(x)
converging uniformly to |x| on [−M, M]. It follows that Pn ◦ f converges
uniformly to |f| on X. Because B is an algebra containing the constants,
Pn ◦ f ∈ B. Indeed, if $P_n(x) = \sum_{j=0}^{k} a_j x^j$, then $P_n \circ f = \sum_{j=0}^{k} a_j f^j$.
Since B is closed, |f| ∈ B. ⟧
III. f1 , . . . , fk ∈ B ⇒ max{f1 , . . . , fk }, min{f1 , . . . , fk } ∈ B.
⟦ By induction, it suffices to consider the case k = 2. This follows from
step II and the identities
max{f1, f2} = ½(f1 + f2 + |f1 − f2|),
min{f1, f2} = ½(f1 + f2 − |f1 − f2|). ⟧
IV. Let f ∈ C(X). Then for each pair of distinct points x, y in X there exists
a function gxy ∈ A such that gxy (x) = f (x) and gxy (y) = f (y).
⟦ Choose a function h ∈ A such that h(x) ≠ h(y) (A separates points).
Define
\[ g_{xy}(z) = f(x) + \frac{f(x) - f(y)}{h(x) - h(y)}\,\big(h(z) - h(x)\big), \quad z \in X. \]
Because A contains the constant functions, gxy ∈ A. Clearly, gxy(x) =
f(x) and gxy(y) = f(y). ⟧
V. If f ∈ C(X), x ∈ X, and ε > 0, then there exists a function gx ∈ B such
that
gx (x) = f (x) and gx (z) < f (z) + ε for all z ∈ X.
⟦ By continuity, for each y ∈ X the set
Uy := {z ∈ X : gxy(z) < f(z) + ε}
is open in X, where gxy is the function in step IV. Moreover, Uy contains
both x and y. Since X is compact, there exist y1, . . . , yk ∈ X such that
X = Uy1 ∪ · · · ∪ Uyk.
Set gx := min{gxy1, . . . , gxyk}. Then gx clearly has the required properties
and, by step III, gx ∈ B. ⟧
VI. If f ∈ C(X) and ε > 0, then there exists a function g ∈ B such that
f (z) − ε < g(z) < f (z) + ε, for all z ∈ X.
⟦ By continuity, for each x ∈ X the set
Vx := {z ∈ X : gx(z) > f(z) − ε}
is open in X, where gx is the function in step V. Moreover, Vx clearly
contains x and
f(z) − ε < gx(z) < f(z) + ε, for all z ∈ Vx.
Since X is compact, there exist x1, . . . , xm ∈ X such that
X = Vx1 ∪ · · · ∪ Vxm.
Set g := max{gx1, . . . , gxm}. By step III, g ∈ B, and g clearly satisfies
the desired inequality. ⟧
To complete the proof of the theorem, observe that step VI asserts that
C(X) = cl(B). Since B is closed, C(X) = B.
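The two-point interpolant of step IV is simple enough to compute. The sketch below assumes X = [0, 1], takes h to be the identity function (which separates points), and uses f = exp as the target; the function name `two_point_interpolant` and these choices are illustrative only.

```python
import math

def two_point_interpolant(f, h, x, y):
    """Return g with g(x) = f(x) and g(y) = f(y), built from h and constants as in step IV."""
    fx, fy, hx, hy = f(x), f(y), h(x), h(y)
    if hx == hy:
        raise ValueError("h must separate x and y")
    slope = (fx - fy) / (hx - hy)
    return lambda z: fx + slope * (h(z) - hx)

f = math.exp
h = lambda z: z                  # assumed separating function on [0, 1]
g = two_point_interpolant(f, h, 0.0, 1.0)
print(g(0.0), g(1.0), g(0.5))
```

Only sums, scalar multiples, and constants are applied to h, so the interpolant stays inside any algebra that contains h and the constant functions.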
8.8.8 Example. A trigonometric polynomial is a function on R of the form
\[ T(x) = a_0 + \sum_{j=1}^{m} \big( a_j \cos(jx) + b_j \sin(jx) \big), \quad a_j, b_j \in \mathbb{R}. \]
The collection T ([a, b]) of all trigonometric polynomials on the interval [a, b]
clearly contains the constant functions and is closed under addition and scalar
multiplication. Since
sin jx sin kx = ½[cos(j − k)x − cos(j + k)x],
with similar identities holding for sin jx cos kx and cos jx cos kx, T ([a, b]) is
an algebra.
If 0 < b − a < 2π, then {cos x, sin x}, and hence T ([a, b]), separate points
of [a, b]. By the Stone–Weierstrass theorem, every member of C([a, b]) may be
uniformly approximated by trigonometric polynomials on [a, b].
If b − a = 2π, then T ([a, b]) no longer separates points of [a, b]. However,
in this case every member f of C([a, b]) with f (a) = f (b) may be uniformly
approximated by a trigonometric polynomial. We verify this for the interval
[0, 2π]. Let E denote the algebra of continuous functions f : [0, 2π] → R with
f(0) = f(2π), and let X denote the circle x² + y² = 1 with the Euclidean R²
metric. For each f ∈ E, define Ff : X → R by
Ff (cos t, sin t) = f (t),
0 ≤ t ≤ 2π.
It is straightforward to verify that Ff is continuous. For example, if
(cos tn , sin tn ) → (1, 0), then every convergent subsequence {tnk } converges
either to 0 or to 2π, hence
Ff(cos tnk, sin tnk) = f(tnk) → f(0) = f(2π) = Ff(1, 0).
The set
A := {FT : T ∈ T ([0, 2π])}
is easily seen to be an algebra that contains the constant functions. Moreover,
A separates points of X. Indeed, if x := (cos s, sin s) and y := (cos t, sin t)
with x ≠ y, then, say, cos s ≠ cos t, hence FT(x) ≠ FT(y), where T(x) = cos x.
Therefore, each Ff may be uniformly approximated on X by members of A.
It follows that each member of E may be uniformly approximated on [0, 2π]
by trigonometric polynomials.
♦
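The algebra property of the trigonometric polynomials rests on the standard product-to-sum identities, which can be spot-checked numerically. The sketch below is only a sanity check on a grid, with a hypothetical helper name `product_to_sum_gaps`; it is not a proof.

```python
import math

def product_to_sum_gaps(j, k, x):
    """Absolute gaps between each product and its standard sum form at the point x."""
    ss = math.sin(j * x) * math.sin(k * x) - 0.5 * (math.cos((j - k) * x) - math.cos((j + k) * x))
    sc = math.sin(j * x) * math.cos(k * x) - 0.5 * (math.sin((j - k) * x) + math.sin((j + k) * x))
    cc = math.cos(j * x) * math.cos(k * x) - 0.5 * (math.cos((j - k) * x) + math.cos((j + k) * x))
    return abs(ss), abs(sc), abs(cc)

worst = max(
    gap
    for j in range(1, 6)
    for k in range(1, 6)
    for i in range(101)
    for gap in product_to_sum_gaps(j, k, -math.pi + 2 * math.pi * i / 100)
)
print(worst)
```

Each product of two basis functions is thus a combination of at most two basis functions, so products of trigonometric polynomials are again trigonometric polynomials.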
Exercises
1. Give an example of a bounded continuous function that cannot be
approximated uniformly by polynomials on (0, 1).
2. Let f be continuous on [a, +∞) such that $\lim_{x \to +\infty} f^{(m)}(x) \ne 0$ for all
sufficiently large m ∈ N. Prove that f cannot be uniformly approximated
by polynomials on [a, +∞). Give an example of such a function.
3.S Let f ∈ C([a, b]) have the property that $\int_a^b x^n f(x)\,dx = 0$ for all n ∈ Z+.
Prove that f = 0 on [a, b]. Show that if a ≥ 0, then it is enough that the
given property holds for even integers n in Z+ .
4. Let f : [a, b] → R have continuous derivatives up to order k such that
\[ \int_a^b x^n f^{(k)}(x)\,dx = 0 \quad \text{for all } n \in \mathbb{Z}^+. \]
Prove that f is a polynomial.
5. Let f : [a, b] → R have continuous derivatives up to order k. Prove that
there exists a sequence of polynomials Pn such that $\lim_n P_n^{(j)} = f^{(j)}$
uniformly on [a, b] for j = 0, 1, . . . , k.
6.S Let X be compact and let A be an algebra in C(X) that contains the
constant functions and separates the points of X. Let x0 ∈ X and let
f ∈ C(X) satisfy f (x0 ) = 0. Prove that there exists a sequence fn ∈ A
converging uniformly to f such that fn (x0 ) = 0 for all n.
7. Show that there exists a sequence of polynomials Pn converging uniformly
to sin x on [0, π] such that Pn (0) = Pn (π) = 0 for all n.
8. Let f be an odd (even) continuous function on [−a, a], a > 0. Prove that
there is a sequence of odd (even) polynomials that converges uniformly
to f on [−a, a].
9.S Let f ∈ C([0, 2π]) have the properties f (0) = f (2π) and
\[ \int_0^{2\pi} f(x) \sin^m x \cos^n x \,dx = 0 \quad \text{for all } m, n \in \mathbb{Z}^+. \]
Prove that f is identically zero on [0, 2π].
10. Let f : R → R be continuous and periodic with period 2π. Prove that
there exists a sequence of trigonometric polynomials that converges
uniformly to f on R.
11.S Let f ∈ C([−π/2, π/2]) with f(0) = 0. Prove that f can be uniformly
approximated on [−π/2, π/2] by functions of the form $\sum_{j=1}^{m} b_j \sin(jx)$.
12. Let g be continuous and one-to-one on [a, b]. Prove that any function
in C([a, b]) may be uniformly approximated by functions of the form
$\sum_{j=0}^{m} a_j g^j$.
13. Prove the following version of the Stone–Weierstrass theorem: If V is a
linear subspace of C(X) that contains the constant functions, separates
points of X, and contains |f | for all f ∈ V, then V is dense in C(X).
14. Show that for any f ∈ C([0, 2π]) there exists a sequence of trigonometric
polynomials Tn such that $\int_0^{2\pi} |f - T_n| \to 0$.
15.S Let X and Y be compact metric spaces and let f (x, y) ∈ C(X × Y ) be
a continuous real-valued function on X × Y . Show that for every ε > 0
there exist g1 , . . . , gn ∈ C(X) and h1 , . . . , hn ∈ C(Y ) such that
\[ \Big| f(x, y) - \sum_{i=1}^{n} g_i(x) h_i(y) \Big| < \varepsilon \quad \text{for all } (x, y) \in X \times Y. \]
16. Let E0 denote the algebra of all continuous functions f : [a, b] → R such
that f(a) = f(b) = 0. If A0 is an algebra in E0 that separates points of
(a, b) show that A0 is dense in E0 in the uniform norm. Hint. Use ideas
of 8.8.8 by considering the algebra generated by A0 and the constant
functions.
17. Let C0 (R) denote the algebra of all continuous functions f on R such
that limt→±∞ f (t) = 0. Let B0 be an algebra in C0 (R) that separates
points of R. Show that B0 is dense in C0 (R) in the uniform norm. Hint.
Consider θ(t) = tan⁻¹[(t − π)/2], 0 < t < 2π and use Exercise 16.
*8.9
Baire’s Theorem
Let (X, d) be a metric space. The diameter d(E) of a nonempty subset E
of X is defined by
\[ d(E) = \sup_{x, y \in E} d(x, y). \]
8.9.1 Lemma. If X is complete, then the intersection C of any decreasing
sequence of nonempty closed sets Cn in X with d(Cn ) → 0 contains a single
point.
Proof. For each n choose a point xn ∈ Cn . If m > n, then xm ∈ Cn , hence
d(xm , xn ) ≤ d(Cn ). Since d(Cn ) → 0, {xn } is Cauchy. Let xn → x. Since
xn , xn+1 , . . . ∈ Cn and Cn is closed, x ∈ Cn for all n, that is, x ∈ C. Since
d(C) ≤ d(Cn ) → 0, C = {x}.
8.9.2 Baire Category Theorem. Let X be a complete metric space. Then
the following statements hold:
(a) If Un ⊆ X is open and dense in X for all n, then $G := \bigcap_n U_n$ is dense in
X.
(b) If Cn ⊆ X is closed and has empty interior for all n, then $F := \bigcup_n C_n$
has empty interior.
Proof. To prove (a), we show that B ∩ G ≠ ∅ for any open ball B. Since B ∩ U1 is
open and nonempty, C1 := Cr1(x1) ⊆ B ∩ U1 for some x1 ∈ X and 0 < r1 ≤ 1.
Since Br1(x1) ∩ U2 is open and nonempty, C2 := Cr2(x2) ⊆ Br1(x1) ∩ U2 for
some x2 ∈ X and 0 < r2 < 1/2. Continuing in this manner, we obtain a
decreasing sequence of closed balls Cn ⊆ B ∩ Un with diameters tending to
zero. By 8.9.1, ⋂n Cn contains a point x. Then x ∈ B ∩ Un for all n, hence
x ∈ B ∩ G.
Part (b) follows from (a). Indeed, suppose int(Cn) = ∅ for all n. Then
Un := Cn^c is dense in X, hence ⋂n Un is dense in X. It follows that the interior
of (⋂n Un)^c = F is empty.
We give three applications of Baire’s theorem. The first is known as the
principle of uniform boundedness.
8.9.3 Theorem. Let X and Y be complete normed vector spaces and let L be
a family of continuous linear transformations from X to Y such that
\[ \sup_{T \in L} \|Tx\| < \infty \quad \text{for each } x \in X. \]
Then there exists M > 0 such that ‖Tx‖ ≤ M‖x‖ for all x ∈ X and T ∈ L.
Proof. For each n, set
\[ C_n = \{x \in X : \|Tx\| \le n \text{ for all } T \in L\}. \]
By hypothesis, X = ⋃n Cn. By continuity of the transformations T, each Cn
is closed. Therefore, Baire’s theorem shows that int(Cn) ≠ ∅ for some n. Thus
there exist x0 and r > 0 such that ‖Ty‖ ≤ n for all T ∈ L and y ∈ X with
‖y − x0‖ ≤ r. If ‖x‖ ≤ r, then, taking y = x + x0, we have
\[ \|Tx\| \le \|Tx + Tx_0\| + \|Tx_0\| = \|Ty\| + \|Tx_0\| \le n + \|Tx_0\|. \]
It follows that for all x ≠ 0 and T ∈ L
\[ \Bigl\| T\Bigl(\frac{rx}{\|x\|}\Bigr) \Bigr\| \le n + \|Tx_0\|, \]
hence
\[ \|Tx\| \le r^{-1}\bigl(n + \|Tx_0\|\bigr)\,\|x\|. \]
Since ‖Tx0‖ ≤ n for every T ∈ L (take y = x0 above), the constant M := 2n/r
works for all T ∈ L.
The following corollary is one of the few instances in analysis (Dini’s
theorem being another) when pointwise convergence of a sequence of continuous
functions is sufficient to convey the property of continuity to the limit function.
8.9.4 Corollary. Let X and Y be complete normed vector spaces and let {Tn }
be a sequence of continuous linear transformations from X to Y converging
pointwise on X to a function T . Then T is linear and continuous.
Proof. Linearity of T is clear. For continuity, note that sup_n ‖Tn x‖ < +∞ for
each x ∈ X, hence, by the theorem, there exists M > 0 such that ‖Tn x‖ ≤
M‖x‖ for all n and x. Letting n → +∞ yields ‖Tx‖ ≤ M‖x‖, hence T is
continuous.
For the second application of Baire’s theorem, recall that there exist
functions f : R → R whose set of discontinuity points is precisely Q (3.3.3).
The obvious question raised by this fact is answered in the following theorem.
8.9.5 Theorem. There is no function f : R → R whose set of continuity
points is precisely Q.
Proof. For each n, let Un denote the union of all intervals (a, b) such that
|f(x) − f(y)| < 1/n for all x, y ∈ (a, b). Then Un is open and the set of
continuity points of f is precisely C := ⋂_{n=1}^{∞} Un. Suppose that C = Q. Then
each Un contains Q and hence is dense in R. Let {r1, r2, . . .} be an enumeration
of Q. Then the open sets Vm := R \ {rm} are also dense in R and have
intersection I. By Baire’s theorem, the collection of sets {Un, Vm : m, n ∈ N}
has a nonempty intersection. But this intersection is Q ∩ I = ∅. Therefore, C
cannot equal Q.
The last application of Baire’s theorem shows that there is a rich supply
of continuous, nowhere differentiable functions. For the proof we need the
following lemma.
8.9.6 Lemma. If g is piecewise linear on [a, b], then there exists M > 0 such
that
|g(x) − g(y)| ≤ M |x − y| for all x, y ∈ [a, b].
Proof. Let g be as in 8.8.1 and set M = maxj {|mj |}. If
xi ≤ x ≤ xi+1 ≤ xj ≤ y ≤ xj+1
then
|g(x) − g(y)| ≤ |g(x) − g(xi+1 )| + |g(xi+1 ) − g(xi+2 )| + · · · + |g(xj ) − g(y)|
≤ |mi |(xi+1 − x) + |mi+1 |(xi+2 − xi+1 ) + · · · + |mj |(y − xj )
≤ M (y − x).
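As a numerical illustration of 8.9.6, the following sketch builds a piecewise linear function from a list of nodes, takes M to be the largest absolute slope, and checks the Lipschitz inequality on a grid. The node data and the helper names `piecewise_linear` and `max_slope` are hypothetical choices made only for this example.

```python
from itertools import combinations

def piecewise_linear(nodes):
    """Return g interpolating linearly between (x, y) nodes with increasing x."""
    def g(x):
        for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        raise ValueError("x lies outside [a, b]")
    return g

def max_slope(nodes):
    """M = max_j |m_j|, the largest absolute slope among the linear pieces."""
    return max(abs((y1 - y0) / (x1 - x0))
               for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]))

# Hypothetical nodes on [0, 3]; the slopes of the pieces are 2, -6 and 1.
nodes = [(0.0, 0.0), (1.0, 2.0), (1.5, -1.0), (3.0, 0.5)]
g = piecewise_linear(nodes)
M = max_slope(nodes)

# Check |g(x) - g(y)| <= M |x - y| for all pairs of grid points.
grid = [3.0 * i / 200 for i in range(201)]
lipschitz_ok = all(abs(g(x) - g(y)) <= M * abs(x - y) + 1e-12
                   for x, y in combinations(grid, 2))
print(M, lipschitz_ok)
```

A grid check of this kind is only a sanity test; the proof above is what establishes the bound for all x and y.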
8.9.7 Theorem. The set of all continuous, nowhere differentiable functions
on an interval [a, b] is dense in C([a, b]) in the uniform norm.
Proof. For each n ∈ N and f ∈ C([a, b]) define
En (f ) = {x ∈ [a, b] : |f (y) − f (x)| ≤ n|x − y| for all y ∈ [a, b]}.
We break the proof into several steps:
I. ⋃_{n=1}^{∞} En(f) contains all points at which f is differentiable.
J Let x be such a point and choose δ > 0 such that
\[ \left| \frac{f(y) - f(x)}{y - x} - f'(x) \right| < 1 \quad \text{for all } y \in [a, b] \text{ with } 0 < |x - y| < \delta. \]
Then
\[ |f(y) - f(x)| \le \begin{cases} \bigl(1 + |f'(x)|\bigr)\,|y - x| & \text{if } |x - y| < \delta, \\ 2\|f\|_\infty \le 2\delta^{-1}\|f\|_\infty\,|y - x| & \text{if } |x - y| \ge \delta, \end{cases} \]
which shows that x ∈ En(f) for all n > 1 + |f′(x)| + 2δ^{−1}‖f‖∞. K
II. En := {f ∈ C([a, b]) : En(f) ≠ ∅} is closed in C([a, b]).
J Let {fk} be a sequence in En converging uniformly to f ∈ C([a, b]).
For each k, choose a point xk ∈ En(fk). We may assume that xk → x for
some x ∈ [a, b] (otherwise, take a subsequence). Then for all y ∈ [a, b],
\[ \begin{aligned} |f(y) - f(x)| &\le |f(y) - f_k(y)| + |f_k(y) - f_k(x_k)| + |f_k(x_k) - f_k(x)| + |f_k(x) - f(x)| \\ &\le 2\|f - f_k\|_\infty + n|y - x_k| + n|x_k - x|. \end{aligned} \]
Letting k → ∞ shows that |f(y) − f(x)| ≤ n|y − x|, that is, x ∈ En(f).
Therefore, f ∈ En. K
III. En^c is dense in C([a, b]).
J Let f ∈ C([a, b]) and ε > 0. We construct a function h ∈ Bε(f) ∩ En^c. By
8.8.2, there exists a piecewise linear function g such that ‖f − g‖∞ < ε/2.
By 8.9.6, there exists M > 0 such that
\[ |g(x) - g(y)| \le M|x - y| \quad \text{for all } x, y \in [a, b]. \]
Let r > 0 and let x0 = a < x1 < · · · < x2p = b be a partition of [a, b]
with mesh < r. Construct a “sawtooth” piecewise linear function hr with
vertices
(x0, 1), (x2, 1), . . . , (x2p, 1) and (x1, −1), (x3, −1), . . . , (x2p−1, −1),
and set h := g + εhr/2. Then
\[ \|h - f\|_\infty \le \|h - g\|_\infty + \|g - f\|_\infty = \frac{\varepsilon}{2}\|h_r\|_\infty + \|g - f\|_\infty < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
so h ∈ Bε(f). To show that h ∈ En^c, let x be an arbitrary member of
[a, b]. If hr(x) ≤ 0 (≥ 0) choose xj such that |x − xj| < r and hr(xj) = 1
(= −1) (see Figure 8.11). Then |hr(x) − hr(xj)| ≥ 1, hence
\[ \begin{aligned} |h(x) - h(x_j)| &\ge \frac{\varepsilon}{2}\,|h_r(x) - h_r(x_j)| - |g(x) - g(x_j)| \\ &\ge \frac{\varepsilon}{2} - M|x - x_j| \\ &\ge \Bigl(\frac{\varepsilon}{2r} - M\Bigr)|x - x_j|. \end{aligned} \]
If r is chosen so that ε/(2r) − M > n, then x ∉ En(h), hence h ∉ En. K
FIGURE 8.11: The sawtooth function hr.
To complete the proof, note that by step III and Baire’s theorem,
F := ⋂_{n=1}^{∞} En^c is dense in C([a, b]). Since f ∈ F implies that En(f) = ∅ for every
n, and since a point at which f is differentiable must lie in some En(f), no
member of F can be differentiable at any point of [a, b].
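The sawtooth perturbation in step III can be tried out numerically. The sketch below is an illustration under assumed data, not part of the proof: it takes g(x) = x on [0, 1] (so M = 1), ε = 0.1 and n = 5, uses an equally spaced partition with mesh 0.005 < r := 0.008 (so that ε/(2r) − M = 5.25 > n), and checks at sample points x that the vertex xj chosen as in the proof satisfies |h(x) − h(xj)| > n|x − xj|. The names `make_sawtooth` and `witness` are hypothetical.

```python
def make_sawtooth(a, b, p):
    """Sawtooth h_r on [a, b] with 2p equal pieces: +1 at x_0, x_2, ..., -1 at x_1, x_3, ..."""
    step = (b - a) / (2 * p)
    def h_r(x):
        s = (x - a) / step                  # position measured in mesh widths
        j = min(int(s), 2 * p - 1)          # index of the piece containing x
        left = 1.0 if j % 2 == 0 else -1.0  # value of h_r at the left vertex x_j
        return left * (1.0 - 2.0 * (s - j))
    return h_r, step

a, b = 0.0, 1.0
eps, n, M = 0.1, 5, 1.0                     # assumed data; g(x) = x has M = 1
g = lambda x: x
p = 100                                     # mesh 0.005 < r = 0.008
h_r, step = make_sawtooth(a, b, p)
h = lambda x: g(x) + eps * h_r(x) / 2

def witness(x):
    """Vertex of the piece containing x where h_r has the sign opposite to h_r(x)."""
    j = min(int((x - a) / step), 2 * p - 1)
    left, right = a + j * step, a + (j + 1) * step
    target = 1.0 if h_r(x) <= 0 else -1.0
    return left if abs(h_r(left) - target) < 1e-9 else right

samples = [i / 997 for i in range(998)]
escapes = all(abs(h(x) - h(witness(x))) > n * abs(x - witness(x))
              for x in samples)
print(escapes)
```

Every sampled x fails the condition defining En(h), which is consistent with h ∉ En for this choice of r.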
Exercises
1.S Prove the converse of 8.9.1: If the intersection of any decreasing sequence
of nonempty closed sets Cn in X with d(Cn ) → 0 contains a single point,
then X is complete.
2. Let Q have the usual metric. Find a decreasing sequence of closed sets
Cn in Q with d(Cn) → 0 and ⋂n Cn = ∅.
3.S Show that 8.9.2 does not hold in Q with the usual metric.
4. Let D = {x1 , x2 , . . .} be a proper subset of a complete metric space X.
Show that (a) and (b) of 8.9.2 hold for Y := X \ D. Conclude that the set
of irrationals I with the usual metric satisfies (a) and (b) of the theorem.
Chapter 9
Differentiation on Rn
For the remainder of the book, the Euclidean norm
‖ · ‖2 on the spaces Rn will be denoted simply by ‖ · ‖.
In this chapter we extend the ideas of Chapter 4 to vector-valued functions
of several variables. This will require some notions from linear algebra, a brief
review of which may be found in Appendix B.
9.1 Definition of the Derivative
To motivate the general definition of the derivative of a function on Rn ,
we begin with two important special cases.
Derivative of a Vector-Valued Function of a Real Variable
The definition of derivative in this case is a natural extension of the
definition of the derivative of a scalar-valued function:
9.1.1 Definition. Let I ⊆ R be an interval and a ∈ I. A function f : I → Rm
is said to be differentiable at a if the (vector) limit
\[ f'(a) := \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = \lim_{t \to a} \frac{f(t) - f(a)}{t - a} \]
exists in Rm. (The limit is one-sided if a is an endpoint of I.) The vector f′(a)
is called the derivative of f at a. If f is differentiable at each point in I, then
f is said to be differentiable on I and the resulting function f′ : I → Rm is
called the derivative of f on I.
♦
The function f may be viewed as a parametrization of a curve C in Rm.
The vector f′(a) is then called the tangent vector to C at the point f(a). If
the variable t is interpreted as time, then C may be viewed as the path of
a particle in Rm. In this context, f′(a) is called the velocity of the particle
and ‖f′(a)‖ the speed. The curve is said to be smooth if f′ is continuous and
nonzero on I. Parametrized curves will be examined in detail in Chapter 12.
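The velocity and speed just described can be approximated numerically. The following sketch uses a hypothetical helix f(t) = (cos t, sin t, t) in R3 and a central difference quotient applied to each component; the helper name `derivative` and the step size are assumptions made for the illustration.

```python
import math

def curve(t):
    """Hypothetical example curve: a helix in R^3."""
    return (math.cos(t), math.sin(t), t)

def derivative(f, a, h=1e-6):
    """Central-difference approximation of f'(a), computed component by component."""
    fp, fm = f(a + h), f(a - h)
    return tuple((p - m) / (2 * h) for p, m in zip(fp, fm))

a = 1.0
velocity = derivative(curve, a)                      # approximates f'(a)
speed = math.sqrt(sum(v * v for v in velocity))      # approximates ||f'(a)||
exact_velocity = (-math.sin(a), math.cos(a), 1.0)    # derivative computed by hand
print(velocity, speed)
```

For this curve the exact speed is √2 at every t, so the helix is smooth in the sense defined above.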
Note that the function f : I → Rm may be written f = (f1 , . . . , fm ), where
fj : I → R is the jth component function of f .
9.1.2 Proposition. Let I be an interval and f = (f1, . . . , fm) : I → Rm.
Then f is differentiable at a ∈ I iff each fj is differentiable at a, in which case
f′(a) = (f1′(a), . . . , fm′(a)). In particular, if f is differentiable at a, then f is
continuous at a.
Proof. The assertions follow directly from the inequalities
\[ \left| \frac{f_j(a + h) - f_j(a)}{h} - x_j \right|^2 \le \left\| \frac{f(a + h) - f(a)}{h} - (x_1, \dots, x_m) \right\|^2 \le \sum_{i=1}^{m} \left| \frac{f_i(a + h) - f_i(a)}{h} - x_i \right|^2. \]
The differential of f at a is the linear transformation dfa : R → Rm that
takes a real number h to the vector hf′(a):
dfa(h) = hf′(a), h ∈ R.
Definition 9.1.1 may then be rephrased as follows: f is differentiable at a iff
there exists a linear transformation T : R → Rm such that
\[ \lim_{h \to 0} \frac{f(a + h) - f(a) - Th}{|h|} = 0, \]
in which case T = dfa.
Derivative of a Real-Valued Function of Several Variables
The derivative of a scalar-valued function of n variables is defined as follows:
9.1.3 Definition. Let U ⊆ Rn be open and a ∈ U. Then f : U → R is said
to be differentiable at a if there exists a vector f′(a) in Rn such that
\[ \lim_{h \to 0} \frac{f(a + h) - f(a) - f'(a) \cdot h}{\|h\|} = 0. \tag{9.1} \]
The vector f′(a) is called the derivative of f at a. The differential of f at a is
the linear transformation dfa ∈ L(Rn, R) defined by
dfa(h) = f′(a) · h, h ∈ Rn.
♦
Now let
\[ e^j = (0, \dots, 0, \overset{j}{1}, 0, \dots, 0), \quad j = 1, \dots, n, \]
denote the standard basis vectors in Rn. If f′(a) exists, then, taking h = tej
in (9.1), we have
\[ \lim_{t \to 0} \frac{f(a + te^j) - f(a) - t f'(a) \cdot e^j}{t} = 0, \]
or, equivalently,
\[ \lim_{t \to 0} \frac{f(a + te^j) - f(a)}{t} = f'(a) \cdot e^j. \tag{9.2} \]
The expression on the right is just the jth component of f′(a). The limit on the
left is called the jth partial derivative of f at a and is denoted variously by
\[ \partial_j f = f_{x_j} = \frac{\partial f}{\partial x_j}. \]
We have proved the following result.
9.1.4 Proposition. If f is differentiable at a, then the partial derivatives
∂j f(a) of f exist at a and
\[ f'(a) = \bigl(\partial_1 f(a), \partial_2 f(a), \dots, \partial_n f(a)\bigr). \tag{9.3} \]
In particular, the derivative is unique.
The vector on the right in (9.3) is called the gradient of f at a and is
denoted by ∇f(a) or grad f(a). The linear transformation dfa ∈ L(Rn, R) may now
be written
\[ df_a(h) = \nabla f(a) \cdot h, \quad h \in \mathbb{R}^n. \tag{9.4} \]
For an alternate notation, let dxj : Rn → R be the linear function defined by
dxj (h) = hj , h = (h1 , . . . , hn ).
Then dfa may be expressed as
\[ df_a(h) = \sum_{j=1}^{n} \frac{\partial f(a)}{\partial x_j}\, dx_j(h). \]
If the partial derivatives of f exist at each point of U, we write simply
\[ df = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\, dx_j. \]
For example,
\[ d\sin(x^2 y) = 2xy\cos(x^2 y)\,dx + x^2\cos(x^2 y)\,dy. \]
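The coefficients of dx and dy in the example d sin(x²y) can be checked against difference quotients. The sketch below is only a numerical sanity check at one arbitrarily chosen point; the function names and the step size are assumptions.

```python
import math

def f(x, y):
    return math.sin(x * x * y)

def grad_f(x, y):
    """Coefficients of dx and dy in d sin(x^2 y), as computed in the text."""
    c = math.cos(x * x * y)
    return (2 * x * y * c, x * x * c)

def numeric_partials(func, x, y, t=1e-6):
    """Central-difference approximations of the two partial derivatives."""
    fx = (func(x + t, y) - func(x - t, y)) / (2 * t)
    fy = (func(x, y + t) - func(x, y - t)) / (2 * t)
    return (fx, fy)

point = (0.7, -1.3)                      # arbitrary test point
exact = grad_f(*point)
approx = numeric_partials(f, *point)
print(exact, approx)
```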
We show below that if f has continuous partial derivatives on U , then f is
differentiable on U . The continuity hypothesis cannot be removed: There are
functions f that are not differentiable on U but whose partial derivatives exist
throughout U . This is the case for the function in the following example.
9.1.5 Example. Let m ∈ N. The function
\[ f(x, y) = \begin{cases} \dfrac{x^m y}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{otherwise} \end{cases} \]
exhibits a variety of behavior depending on the values of m. The partial
derivatives of f are
\[ f_x(x, y) = \begin{cases} \dfrac{m x^{m+1} y + m x^{m-1} y^3 - 2x^{m+1} y}{(x^2 + y^2)^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{otherwise,} \end{cases} \]
\[ f_y(x, y) = \begin{cases} \dfrac{x^m (x^2 - y^2)}{(x^2 + y^2)^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{otherwise.} \end{cases} \]
If m = 1, f is not continuous at (0, 0), hence is not differentiable there (see
9.1.11, below). If m = 1 or 2, the partial derivatives exist at (0, 0) but are
not continuous there. If m = 2, the function is continuous at (0, 0), with zero
partial derivatives at (0, 0), but is not differentiable there since in this case
the limit
\[ \lim_{x \to 0} \frac{f(x) - f(0) - 0 \cdot x}{\|x\|} = \lim_{(x, y) \to (0, 0)} \frac{x^2 y}{(x^2 + y^2)^{3/2}} \]
fails to exist. If m ≥ 3, f has continuous partial derivatives and is differentiable
on R2.
♦
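The failure of the limit for m = 2 can be seen by evaluating the quotient along two different paths into the origin. The sketch below is an illustration only; the helper name `quotient` and the sample values of t are assumptions. Along the x-axis the quotient is identically 0, while along the diagonal y = x (with x > 0) it equals 1/(2√2).

```python
import math

def f(x, y, m=2):
    """The function of Example 9.1.5."""
    if (x, y) == (0.0, 0.0):
        return 0.0
    return x ** m * y / (x * x + y * y)

def quotient(x, y, m=2):
    """[f(x, y) - f(0, 0) - 0 . (x, y)] / ||(x, y)||, the ratio tested in (9.1)
    with the only possible candidate derivative, the zero vector."""
    return f(x, y, m) / math.hypot(x, y)

ts = (1e-1, 1e-3, 1e-5)
along_axis = [quotient(t, 0.0) for t in ts]        # path y = 0
along_diag = [quotient(t, t) for t in ts]          # path y = x
along_diag_m3 = [quotient(t, t, m=3) for t in ts]  # same path, m = 3
print(along_axis, along_diag, along_diag_m3)
```

For m = 3 the same quotient along the diagonal is t/(2√2), which tends to 0, in line with the differentiability claimed for m ≥ 3.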
The definition of the jth partial derivative of f at a may be written
explicitly as
\[ \partial_j f(a) = \lim_{h \to 0} \frac{f(a_1, \dots, a_j + h, \dots, a_n) - f(a_1, \dots, a_j, \dots, a_n)}{h}. \]
This is simply the derivative at aj of the one-variable function
t ↦ f(a1, . . . , aj−1, t, aj+1, . . . , an).
Thus to find the jth partial derivative of f (x1 , . . . , xj , . . . , xn ), one simply
differentiates f with respect to xj while holding the other variables fixed. It
follows that the standard formulas for derivatives of functions of one variable
hold for partial derivatives of functions of several variables. For example, the
product rule takes the form
∂j (f g)(a) = f (a)∂j g(a) + g(a)∂j f (a),
and the quotient rule becomes
\[ \partial_j \Bigl(\frac{f}{g}\Bigr)(a) = \frac{g(a)\,\partial_j f(a) - f(a)\,\partial_j g(a)}{g^2(a)}, \quad g(a) \ne 0. \]
Derivative of a Vector-Valued Function of Several Variables
We now consider the general case. The following definition includes the
two special cases discussed before.
9.1.6 Definition. Let U ⊆ Rn be open. A function f : U → Rm is said to be
differentiable at a ∈ U if there exists a linear transformation dfa : Rn → Rm,
called the differential of f at a, such that
\[ \lim_{h \to 0} \frac{f(a + h) - f(a) - df_a(h)}{\|h\|} = 0. \]
The m × n matrix [dfa] is called the derivative of f at a, or the Jacobian
matrix of f at a, and is denoted by f′(a).
♦
9.1.7 Example. If T ∈ L(Rn , Rn ), then, by the linearity of T ,
T (x + h) − T (x) − T h = 0 for all h.
It follows that dTx = T for all x. This is the n-dimensional version of the
familiar result that the derivative of the function x → tx is the constant t. ♦
9.1.8 Theorem. Let U ⊆ Rn be open, f = (f1, . . . , fm) : U → Rm, and
let a ∈ U. Then f is differentiable at a iff each function fi : U → R is
differentiable at a. In this case, ∂j fi(a) exists and equals dfa(ej) · ei, and
\[ df_a(h) = \bigl(\nabla f_1(a) \cdot h, \dots, \nabla f_m(a) \cdot h\bigr), \quad h \in \mathbb{R}^n. \tag{9.5} \]
In particular, if the differential exists, it is unique.
Proof. Let f be differentiable at a. For i = 1, . . . , m and j = 1, . . . , n, let
bij = dfa(ej) · ei, the ith component of dfa(ej) and the (i, j)th entry of the
matrix [dfa]. Then
dfa(h) = (b1 · h, . . . , bm · h), where bi := (bi1, . . . , bin).
Thus for each i,
|fi(a + h) − fi(a) − bi · h| ≤ ‖f(a + h) − f(a) − dfa(h)‖,
from which it follows that
\[ \lim_{h \to 0} \frac{f_i(a + h) - f_i(a) - b_i \cdot h}{\|h\|} = 0. \]
Therefore, the derivative of fi at a exists and equals bi. By 9.1.4, bi = ∇fi(a),
that is, bij = ∂j fi(a).
Conversely, suppose each fi is differentiable at a. Then ∇fi(a) exists and
by (9.4),
\[ \lim_{h \to 0} \frac{|f_i(a + h) - f_i(a) - \nabla f_i(a) \cdot h|}{\|h\|} = 0, \quad i = 1, \dots, m. \]
Let T(h) denote the right side of (9.5). Then T is linear and
\[ \frac{\|f(a + h) - f(a) - T(h)\|^2}{\|h\|^2} = \sum_{i=1}^{m} \frac{|f_i(a + h) - f_i(a) - \nabla f_i(a) \cdot h|^2}{\|h\|^2} \to 0 \]
as h → 0. Therefore, dfa exists and equals T.
By the theorem, the (i, j) entry of f′(a) is ∂j fi(a). The effect of dfa on a
vector h ∈ Rn may therefore be expressed in matrix form as
\[ f'(a)h^t = \begin{bmatrix} \partial_1 f_1(a) & \cdots & \partial_n f_1(a) \\ \vdots & \ddots & \vdots \\ \partial_1 f_m(a) & \cdots & \partial_n f_m(a) \end{bmatrix} \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix} = \begin{bmatrix} \nabla f_1(a) \cdot h \\ \vdots \\ \nabla f_m(a) \cdot h \end{bmatrix}, \]
where h^t denotes the transpose of the vector h. In the special case m = n, the
determinant of f′(a) is called the Jacobian of f at a and is denoted variously
by
\[ \det f'(a) = J_f(a) = \frac{\partial(f_1, \dots, f_n)}{\partial(x_1, \dots, x_n)}(a). \]
9.1.9 Example. The transformation (x, y, z) = (r cos θ, r sin θ, z) from cylindrical coordinates to rectangular coordinates in R3 has Jacobian
\[ \frac{\partial(x, y, z)}{\partial(r, \theta, z)} = \begin{vmatrix} \cos\theta & \sin\theta & 0 \\ -r\sin\theta & r\cos\theta & 0 \\ 0 & 0 & 1 \end{vmatrix} = r. \]
♦
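The value r for the cylindrical Jacobian can be confirmed numerically. The sketch below approximates the Jacobian matrix of the coordinate map by central differences and evaluates its determinant at one arbitrarily chosen point; `jacobian` and `det3` are hypothetical helper names introduced for this check.

```python
import math

def cyl(r, theta, z):
    """Cylindrical to rectangular coordinates."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def jacobian(func, p, t=1e-6):
    """Central-difference Jacobian matrix: entry [i][j] approximates d_j f_i(p)."""
    m = len(func(*p))
    J = [[0.0] * len(p) for _ in range(m)]
    for j in range(len(p)):
        up, dn = list(p), list(p)
        up[j] += t
        dn[j] -= t
        fu, fd = func(*up), func(*dn)
        for i in range(m):
            J[i][j] = (fu[i] - fd[i]) / (2 * t)
    return J

def det3(a):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

r, theta, z = 2.0, 0.8, -1.0            # arbitrary test point
J = jacobian(cyl, (r, theta, z))
print(det3(J))                          # should be close to r
```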
The following characterization of differentiability will be useful.
9.1.10 Theorem. Let f : U → Rm, where U ⊆ Rn is open. Then f is
differentiable at a ∈ U iff there exists T ∈ L(Rn, Rm) and, for sufficiently
small r, a function η : Br(0) → Rm such that
\[ f(a + h) = f(a) + Th + \|h\|\,\eta(h), \quad \text{and} \quad \lim_{h \to 0} \eta(h) = 0. \tag{9.6} \]
In this case, T = dfa.
Proof. Assume that f is differentiable at a. Choose r > 0 such that Br(a) ⊆ U
and define η : Br(0) → Rm by η(0) = 0 and
\[ \eta(h) = \frac{f(a + h) - f(a) - df_a(h)}{\|h\|} \quad \text{if } h \ne 0. \]
Then (9.6) holds with T = dfa.
Conversely, if (9.6) holds for some η and T, then
\[ \lim_{h \to 0} \frac{\|f(a + h) - f(a) - Th\|}{\|h\|} = \lim_{h \to 0} \|\eta(h)\| = 0, \]
hence f is differentiable at a with dfa = T.
9.1.11 Corollary. If f is differentiable at a, then f is continuous at a.
Proof. By (9.6) and the continuity of linear transformations,
\[ \lim_{h \to 0} \bigl[f(a + h) - f(a)\bigr] = \lim_{h \to 0} \|h\|\,\eta(h) + \lim_{h \to 0} df_a(h) = 0. \]
Exercises
1. Find the differential df for each of the functions f (x, y):
(a) S
(d)
(g)
x−y
.
x+y
x
cos .
y
(b)
sec (yex ).
(h) S exy .
ln(x2 + y 3 ).
(e) S sin (x2 y).
2
(c)
arctan (xy 2 ).
(f)
y
arcsin , 0 < y < x.
x
3x + 2y
tan
.
2x + 3y
(i)
2. Find f 0 (x) where f (x) =
xy
x2 − y 2
3
3
2 2
S
x
y
S
(a) x − y , x y . (b) e sin y, e sin x . (c)
,
.
x2 + y 2 x2 + y 2
(d) ln(x2 + y 2 + z 2 + 1), xyz .
(e) arctan(x − y), exy , x/y .
3. For each of the functions f (x, y) below, find all values of p, q ∈ N for
which on R2
(i) fx , fy exist,
(ii) fx , fy are continuous,
(iii) f 0 exists.
 p
 p q
q
 x + y if (x, y) 6= 0,
 x y
if (x, y) 6= 0,
S
2
2
x +y
x2 + y 2
(a)
(b)
0
0
otherwise.
otherwise.

(
xp sin 1 + y q if x 6= 0,
(x − y)p sin(x − y)−1 if x 6= y,
(c)
(d) S
x
y q
0
otherwise.
otherwise.
 p
 p q
q
x
+
y
x
y


if x 6= y,
if (x, y) 6= 0,
x−y
x−y
(e)
(f)
0
0
otherwise.
otherwise.
4. Find all values of p, q, s ∈ (0, +∞) for which on R2
(i) fx, fy exist,
(ii) fx, fy are continuous,
(iii) f′ exists,
where f(0, 0) = 0 and, for (x, y) ≠ (0, 0), f(x, y) =
(a)S \( |x|^p |y|^q \ln(x^2 + y^2) \).
(b) \( \dfrac{\sin (x^2 + y^2)^p}{(x^2 + y^2)^q} \).
(c) \( \dfrac{\tan (x^2 + y^2)^p}{(x^2 + y^2)^q} \).
(d)S \( \dfrac{\sin |x|^p |y|^q}{(x^2 + y^2)^s} \).
(e) \( \dfrac{\sin^{-1} |x|^p |y|^q}{(x^2 + y^2)^s} \).
5. Spherical coordinates (ρ, φ, θ) in R3 are defined by
x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ,
where ρ ≥ 0, 0 ≤ φ ≤ π, and 0 ≤ θ < 2π. Show that
\[ \frac{\partial(x, y, z)}{\partial(\rho, \varphi, \theta)} = \rho^2 \sin\varphi. \]
6. Let (u, v) = (sin f(x, y), cos f(x, y)). Find \( \dfrac{\partial(u, v)}{\partial(x, y)} \).
7. Let (u, v, w) = (y/z, z/x, x/y), where xyz ≠ 0. Find \( \dfrac{\partial(u, v, w)}{\partial(x, y, z)} \).
8.S Let
\[ f(x) = \sum_{i=1}^{n} x_i^{a_i} \quad \text{and} \quad g(x) = \prod_{i=1}^{n} x_i^{a_i}, \]
where xi, ai > 0 and Σi ai = 1. Find
(a) x · ∇f(x).
(b) x · ∇g(x).
9. Let f(x) be defined implicitly by the equation
\[ \frac{1}{f(x)} = \sum_{i=1}^{n} \frac{1}{x_i}. \]
Express ∇f(x) in terms of f.
10.S Let \( f(x) = \ln\bigl(\sum_{i=1}^{n} e^{x_i}\bigr) \). Express ∇f(x) in terms of f.
11. Let the equation αxn − x1 x2 · · · xn−1 = 0, α ≠ 0, define each of the
variables x1, . . . , xn−1 as a differentiable function of xn. Show that
\[ x_n^{\,n-2}\, \frac{\partial x_1}{\partial x_n} \frac{\partial x_2}{\partial x_n} \cdots \frac{\partial x_{n-1}}{\partial x_n} = \alpha. \]
12. Let x = (x1, . . . , xn). Find ∂i of
(a)S ‖x‖.
(b) 1/‖x‖.
(c)S xi/‖x‖.
(d) xi/‖x‖².
13. Let f : R → R be differentiable and p > 0. Show that for x ≠ 0,
\[ x \cdot \nabla \|x\|^p = p\|x\|^p \quad \text{and} \quad x \cdot \nabla f\bigl(\|x\|^p\bigr) = p\,f'\bigl(\|x\|^p\bigr)\,\|x\|^p. \]
9.2 Properties of the Differential
In this section we consider analogs of differentiation rules for single variable
functions. Deeper properties of the differential are taken up in later sections.
Linearity of the Differential
9.2.1 Theorem. Let U ⊆ Rn be open, let f, g : U → Rm be differentiable at
a ∈ U , and let α, β ∈ R. Then αf + βg is differentiable at a and
d(αf + βg)a = αdfa + βdga .
Proof. By 9.1.10, there exist functions η(h), µ(h), defined for h ∈ Rn with
sufficiently small norm, such that
\[ f(a + h) = f(a) + df_a(h) + \|h\|\,\eta(h), \quad \lim_{h \to 0} \eta(h) = 0, \quad \text{and} \]
\[ g(a + h) = g(a) + dg_a(h) + \|h\|\,\mu(h), \quad \lim_{h \to 0} \mu(h) = 0. \]
Then
\[ (\alpha f + \beta g)(a + h) = (\alpha f + \beta g)(a) + (\alpha\,df_a + \beta\,dg_a)(h) + \|h\|\,(\alpha\eta + \beta\mu)(h) \]
and
\[ \lim_{h \to 0} (\alpha\eta + \beta\mu)(h) = 0. \]
Another application of 9.1.10 completes the proof.
The Norm of a Linear Transformation
For additional properties of the differential, including product rules, we need
the notion of operator norm on the space L(Rn , Rm ) of linear transformations
from Rn to Rm .
9.2.2 Definition. Let T ∈ L(Rn, Rm). The operator norm of T is defined as
‖T‖ = sup{‖Tx‖ : x ∈ Rn, ‖x‖ = 1}.
♦
The following proposition justifies the use of the term “norm.”
9.2.3 Proposition. ‖T‖ defines a norm on L(Rn, Rm) such that
\[ \|Tx\| \le \|T\|\,\|x\| \quad \text{for all } x \in \mathbb{R}^n. \tag{9.7} \]
Moreover, if [aij]m×n is the matrix of T, then for all k, ℓ
\[ |a_{k\ell}| \le \|T\| \le \Bigl( \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \Bigr)^{1/2}. \tag{9.8} \]
Proof. Inequality (9.7) is clear if x = 0. If x ≠ 0, then ‖x‖^{−1}x has norm 1,
hence
\[ \|x\|^{-1}\|Tx\| = \bigl\|T(\|x\|^{-1}x)\bigr\| \le \|T\|. \]
To verify (9.8), let ai = (ai1, . . . , ain). Since Tx = (a1 · x, . . . , am · x), by
the Cauchy–Schwarz inequality,
\[ \|Tx\|^2 = \sum_{i=1}^{m} (a_i \cdot x)^2 \le \sum_{i=1}^{m} \|a_i\|^2 \|x\|^2 = \sum_{i,j} a_{ij}^2, \quad \|x\| = 1, \]
which verifies the second inequality in (9.8). The first inequality follows from
\[ |a_{k\ell}|^2 \le \sum_{i=1}^{m} |a_{i\ell}|^2 = \|Te^{\ell}\|^2 \le \|T\|^2. \]
To see that ‖T‖ defines a norm, note that homogeneity follows directly
from the definition, and the triangle inequality ‖T1 + T2‖ ≤ ‖T1‖ + ‖T2‖ is a
consequence of
\[ \|(T_1 + T_2)x\| \le \|T_1 x\| + \|T_2 x\| \le \|T_1\| + \|T_2\|, \quad \|x\| = 1. \]
The property of coincidence follows directly from (9.7).
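The two-sided estimate (9.8) can be observed on a concrete matrix. The sketch below estimates the operator norm by power iteration on AᵗA, a standard numerical technique that is not part of the text, and compares it with the largest entry and the square root of the sum of squared entries. The 2 × 3 matrix A and all helper names are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def operator_norm(A, iters=500):
    """Estimate sup{||Ax|| : ||x|| = 1} by power iteration on A^t A.
    Assumes the starting vector is not orthogonal to the dominant direction."""
    At = transpose(A)
    x = [1.0] * len(A[0])
    for _ in range(iters):
        y = matvec(At, matvec(A, x))
        size = norm(y)
        if size == 0.0:
            return 0.0
        x = [yi / size for yi in y]      # renormalize to a unit vector
    return norm(matvec(A, x))

A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]                   # hypothetical matrix of T : R^3 -> R^2
op = operator_norm(A)
max_entry = max(abs(a) for row in A for a in row)
root_sum_squares = math.sqrt(sum(a * a for row in A for a in row))
print(max_entry, op, root_sum_squares)
```

For this matrix the exact operator norm is the square root of (15 + √41)/2, the largest eigenvalue of AAᵗ, which lies strictly between the two bounds in (9.8).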
9.2.4 Corollary. A linear transformation T : Rn → Rm is uniformly continuous.
Proof. This follows from
‖Tx − Ty‖ = ‖T(x − y)‖ ≤ ‖T‖ ‖x − y‖,
using the linearity of T.
Since L(Rn, Rm) is a normed vector space, it is a metric space under the
distance function ρ(T1, T2) := ‖T1 − T2‖. Thus the methods of Chapter 8 apply.
In particular, we have the following consequence of 9.2.3.
9.2.5 Corollary. Let (X, d) be a metric space and let F be a function from
X to L(Rn , Rm ). For each x ∈ X, let [aij (x)]m×n denote the matrix of F (x).
Then F is (uniformly) continuous with respect to the metric ρ iff each function
aij (x) is (uniformly) continuous on X.
Proof. The matrix of F(x) − F(y) is [aij(x) − aij(y)]m×n, hence, by (9.8),
\[ |a_{k\ell}(x) - a_{k\ell}(y)|^2 \le \|F(x) - F(y)\|^2 \le \sum_{i,j} \bigl[a_{ij}(x) - a_{ij}(y)\bigr]^2. \]
The assertion follows.
Product Rules
We consider two product rules; additional product rules, as well as a
quotient rule, are given in the exercises.
9.2.6 Theorem (Scalar Product Rule). Let U be open in Rn and let f : U → Rm
and ψ : U → R be differentiable at a ∈ U. Then
\[ d(\psi f)_a(h) = \psi(a)\,df_a(h) + \bigl(\nabla\psi(a) \cdot h\bigr) f(a), \quad h \in \mathbb{R}^n. \tag{9.9} \]
Proof. By 9.1.10, there exist functions η(h) and µ(h), defined for h ∈ Rn with
sufficiently small norm, such that
\[ f(a + h) - f(a) - df_a(h) = \|h\|\,\eta(h), \quad \lim_{h \to 0} \eta(h) = 0, \]
\[ \psi(a + h) - \psi(a) - \nabla\psi(a) \cdot h = \|h\|\,\mu(h), \quad \lim_{h \to 0} \mu(h) = 0. \]
Let Th denote the right side of (9.9) and set
\[ \begin{aligned} \nu(h) &:= (\psi f)(a + h) - (\psi f)(a) - Th \\ &= \psi(a + h) f(a + h) - \psi(a) f(a) - \psi(a)\,df_a(h) - \bigl(\nabla\psi(a) \cdot h\bigr) f(a). \end{aligned} \]
Then T is linear and
\[ \begin{aligned} \nu(h) &= \psi(a + h)\bigl[f(a + h) - f(a) - df_a(h)\bigr] + \bigl[\psi(a + h) - \psi(a) - \nabla\psi(a) \cdot h\bigr] f(a) + \bigl[\psi(a + h) - \psi(a)\bigr]\,df_a(h) \\ &= \psi(a + h)\,\|h\|\,\eta(h) + \|h\|\,\mu(h) f(a) + \bigl[\psi(a + h) - \psi(a)\bigr]\,df_a(h). \end{aligned} \]
Since ‖dfa(h)‖ ≤ ‖dfa‖ ‖h‖,
\[ \frac{\|\nu(h)\|}{\|h\|} \le |\psi(a + h)|\,\|\eta(h)\| + |\mu(h)|\,\|f(a)\| + |\psi(a + h) - \psi(a)|\,\|df_a\|. \]
By continuity of ψ at a, the right side of the last inequality tends to zero as
h → 0, proving the theorem.
9.2.7 Theorem (Dot Product Rule). Let U be open in Rn and f, g : U → Rm
differentiable at a ∈ U. Then
\[ d(f \cdot g)_a(h) = f(a) \cdot dg_a(h) + g(a) \cdot df_a(h), \quad h \in \mathbb{R}^n. \tag{9.10} \]
Proof. Let η(h) and µ(h) be functions defined for sufficiently small ‖h‖ such
that
\[ f(a + h) - f(a) - df_a(h) = \|h\|\,\eta(h), \quad \lim_{h \to 0} \eta(h) = 0, \]
\[ g(a + h) - g(a) - dg_a(h) = \|h\|\,\mu(h), \quad \lim_{h \to 0} \mu(h) = 0. \]
Let Th denote the right side of (9.10) and define
\[ \begin{aligned} \nu(h) &:= (f \cdot g)(a + h) - (f \cdot g)(a) - Th, \quad h \in \mathbb{R}^n \\ &= f(a + h) \cdot g(a + h) - f(a) \cdot g(a) - f(a) \cdot dg_a(h) - g(a) \cdot df_a(h). \end{aligned} \]
Then T is linear and
\[ \begin{aligned} \nu(h) &= f(a + h) \cdot \bigl[g(a + h) - g(a) - dg_a(h)\bigr] + g(a) \cdot \bigl[f(a + h) - f(a) - df_a(h)\bigr] + \bigl[f(a + h) - f(a)\bigr] \cdot dg_a(h) \\ &= \|h\|\,f(a + h) \cdot \mu(h) + \|h\|\,g(a) \cdot \eta(h) + \bigl[df_a(h) + \|h\|\,\eta(h)\bigr] \cdot dg_a(h). \end{aligned} \]
By the Cauchy–Schwarz and operator norm inequalities,
\[ \frac{|\nu(h)|}{\|h\|} \le \|f(a + h)\|\,\|\mu(h)\| + \|g(a)\|\,\|\eta(h)\| + \|df_a\|\,\|dg_a\|\,\|h\| + \|\eta(h)\|\,\|dg_a\|\,\|h\|. \]
Since the right side of this inequality tends to zero as h → 0 so does the left,
completing the proof.
Continuity of the Differential
If U is an open subset of Rn and f : U → Rm is differentiable, then the
mapping x ↦ dfx is a function from U to L(Rn, Rm). Since L(Rn, Rm) is a
metric space in the operator norm, the notion of continuity of this mapping is
meaningful.
9.2.8 Definition. Let U ⊆ Rn be open. A function f : U → Rm is said to be
continuously differentiable on U if dfx exists and is continuous as a function
of x on U. In this case, f is also said to be of class C¹ on U. A function g is
continuously differentiable on a subset E of Rn if g is the restriction to E of a
continuously differentiable function f on an open set U ⊇ E.
♦
9.2.9 Theorem. Let f = (f1 , . . . , fm ) : U → Rm , where U ⊆ Rn is open.
Then f is continuously differentiable on U iff the partial derivatives ∂j fi ,
1 ≤ i ≤ m, 1 ≤ j ≤ n, exist and are continuous on U .
Proof. If f is continuously differentiable on U then, by 9.2.5, the matrix f′(x)
has continuous entries. By 9.1.8, these entries are the partial derivatives of the
components of f.
For the sufficiency, by 9.1.8 we may assume that m = 1, that is, f is real-valued. Suppose then that the partial derivatives ∂j f exist and are continuous
on U. Let a ∈ U and ε > 0. Choose r > 0 such that Br(a) ⊆ U and fix
h = (h1, . . . , hn) such that ‖h‖ < r. For 1 ≤ j ≤ n set
\[ g_j(t) := f\bigl(a + h^j(t)\bigr), \quad h^j(t) := (h_1, \dots, h_{j-1}, t h_j, 0, \dots, 0), \quad 0 \le t \le 1. \]
Then
\[ g_j(1) - g_j(0) = f\bigl(a + h^j(1)\bigr) - f\bigl(a + h^j(0)\bigr). \]
Also, by the mean value theorem and the chain rule, there exists tj ∈ (0, 1)
such that
\[ g_j(1) - g_j(0) = g_j'(t_j) = h_j\,\partial_j f\bigl(a + h^j(t_j)\bigr). \]
Therefore, since h^j(0) = h^{j−1}(1) for j ≥ 2, the sum telescopes and
\[ f(a + h) - f(a) = \sum_{j=1}^{n} \bigl[g_j(1) - g_j(0)\bigr] = \sum_{j=1}^{n} h_j\,\partial_j f\bigl(a + h^j(t_j)\bigr), \]
hence
\[ f(a + h) - f(a) - \nabla f(a) \cdot h = \sum_{j=1}^{n} \bigl[\partial_j f\bigl(a + h^j(t_j)\bigr) - \partial_j f(a)\bigr] h_j = \nu(h) \cdot h, \]
where
\[ \nu(h) := \sum_{j=1}^{n} \bigl[\partial_j f\bigl(a + h^j(t_j)\bigr) - \partial_j f(a)\bigr] e^j. \]
Since lim_{h→0} h^j(tj) = 0, the continuity of ∂j f at a implies that lim_{h→0} ν(h) =
0. Since |ν(h) · h| ≤ ‖ν(h)‖ ‖h‖,
\[ \frac{|f(a + h) - f(a) - \nabla f(a) \cdot h|}{\|h\|} \le \|\nu(h)\| \to 0, \]
completing the proof.
Exercises
1.S Prove that for T ∈ L(Rn, Rm),
‖T‖ = sup{‖Tx‖ : x ∈ Rn, ‖x‖ ≤ 1}.
2. Let T1 ∈ L(Rm, Rk) and T2 ∈ L(Rn, Rm). Prove that
‖T1T2‖ ≤ ‖T1‖ ‖T2‖.
(We use the standard notation T1T2 for composition of linear operators.)
3.S (Quotient rule) Let U, f, and ψ be as in 9.2.6. If ψ(a) ≠ 0, prove that
\[ d\Bigl(\frac{f}{\psi}\Bigr)_a(h) = \frac{\psi(a)\,df_a(h) - \bigl(\nabla\psi(a) \cdot h\bigr) f(a)}{\psi^2(a)}. \]
4. Find dgx(x), ‖x‖ ≠ 0, if g(x) =
(a)S ‖x‖x.
(b) ‖x‖^{−2}x.
(c) ‖x‖^{−1}x.
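5. The cross product of vectors a = (a1, a2, a3) and b = (b1, b2, b3) is defined
by
\[ a \times b = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} e^1 - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} e^2 + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} e^3. \]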
(See Exercise 1.6.9.) Let f : U → R3 and g : U → R3 , where U ⊆ Rn is
open. Define f × g on U by
(f × g)(x) = f (x) × g(x).
Prove that
d(f × g)a (h) = f (a) × dga (h) + dfa (h) × g(a).
6.S Let V ⊆ Rp and W ⊆ Rq be open, f : V → Rk , g : W → Rk , and
α, β ∈ R. Define F on V × W ⊆ Rp+q by
F (x, y) = αf (x) + βg(y),
x ∈ V,
y ∈ W.
If f is differentiable at a ∈ V and g is differentiable at b ∈ W , prove
that F is differentiable at c := (a, b) and
dFc (h, k) = αdfa (h) + βdgb (k), h ∈ Rp , k ∈ Rq .
7. Let V ⊆ Rp and W ⊆ Rq be open and f : V → Rk , g : W → Rk . Define
F on V × W ⊆ Rp+q by
F (x, y) = f (x) · g(y),
x ∈ V,
y ∈ W.
If f is differentiable at a ∈ V and g is differentiable at b ∈ W , prove
that F is differentiable at c := (a, b) and
dFc (h, k) = g(b) · dfa (h) + f (a) · dgb (k), h ∈ Rp , k ∈ Rq .
8. Formulate and prove the analog of Exercise 7 for cross products.
9. Let f : I → Rm be differentiable and ‖f‖ = 1 on an open interval I.
Prove that f(t) and f′(t) are perpendicular for all t, that is, f · f′ = 0
on I.
10.S Let f : [a, b] → Rm be differentiable and v ∉ A := f([a, b]). Referring to
Exercise 8.5.15 with d(x, y) = ‖x − y‖, show that
(a) d(A, v) = ‖f(t0) − v‖ for some t0 ∈ [a, b].
(b) (f(t0) − v) · f′(t0) = 0 if t0 ∈ (a, b).
11. A path ϕ : [a, b] → Rn is piecewise smooth if there exists a partition
a0 = a < a1 < · · · < an = b of [a, b] such that ϕ′ exists and is continuous
on each subinterval [aj−1, aj]. Let U ⊆ Rn be nonempty open and
connected. Show that if ε > 0, then any pair of points can be joined by
a piecewise smooth path ϕ in U such that sup_{aj−1 ≤ t ≤ aj} ‖ϕ′(t)‖ < ε for
each j.
9.3 Further Properties of the Differential
In this section we prove two important theorems, the first of which is an
n-dimensional version of the chain rule.
9.3.1 Chain Rule. Let U ⊆ Rn and V ⊆ Rm be open and f : U → Rm ,
g : V → Rk with f (U ) ⊆ V . If f is differentiable at a ∈ U and g is differentiable at b := f (a), then g ◦ f : U → Rk is differentiable at a and the
linear transformation d(g ◦ f )a : Rn → Rk is the composition of the linear
transformations dgb : Rm → Rk and dfa : Rn → Rm :
d(g ◦ f )a = dgb ◦ dfa .
Proof. Choose r, s > 0 such that Br(a) ⊆ U and Bs(b) ⊆ V. By 9.1.10 there
exist functions η : Br(0) → Rm and ν : Bs(0) → Rk such that
\[ f(a + h) = f(a) + df_a(h) + \|h\|\,\eta(h), \quad \lim_{h \to 0} \eta(h) = 0, \quad \text{and} \tag{9.11} \]
\[ g(b + k) = g(b) + dg_b(k) + \|k\|\,\nu(k), \quad \lim_{k \to 0} \nu(k) = 0. \tag{9.12} \]
Set
\[ k = f(a + h) - f(a) = df_a(h) + \|h\|\,\eta(h). \tag{9.13} \]
By the continuity of f at a, k ∈ Bs(0) for all sufficiently small ‖h‖. For such h
set
\[ \mu(h) = (g \circ f)(a + h) - (g \circ f)(a) - (dg_b \circ df_a)(h). \]
To complete the proof we show that
\[ \lim_{h \to 0} \frac{\mu(h)}{\|h\|} = 0. \tag{9.14} \]
From (9.11), (9.12), and (9.13),
\[ \begin{aligned} \mu(h) &= g(b + k) - g(b) - dg_b(k) + dg_b\bigl[k - df_a(h)\bigr] \\ &= \|k\|\,\nu(k) + \|h\|\,dg_b(\eta(h)), \end{aligned} \]
hence
\[ \frac{\|\mu(h)\|}{\|h\|} \le \frac{\|k\|}{\|h\|}\,\|\nu(k)\| + \|dg_b(\eta(h))\|. \]
Since
\[ \|k\| = \bigl\|df_a(h) + \|h\|\,\eta(h)\bigr\| \le \bigl(\|df_a\| + \|\eta(h)\|\bigr)\|h\|, \]
we have
\[ \frac{\|\mu(h)\|}{\|h\|} \le \bigl(\|df_a\| + \|\eta(h)\|\bigr)\|\nu(k)\| + \|dg_b(\eta(h))\|. \]
Since h → 0 implies k → 0, (9.14) follows.
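9.3.2 Remark. Let f be differentiable on U and g differentiable on V. Set
y = f(x) and z = (g ◦ f)(x) = g(y). Then the chain rule may be written in
matrix form as (g ◦ f)′(x) = g′(y)f′(x) or
\[ \begin{bmatrix} \dfrac{\partial z_1}{\partial x_1} & \cdots & \dfrac{\partial z_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial z_k}{\partial x_1} & \cdots & \dfrac{\partial z_k}{\partial x_n} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial z_1}{\partial y_1} & \cdots & \dfrac{\partial z_1}{\partial y_m} \\ \vdots & & \vdots \\ \dfrac{\partial z_k}{\partial y_1} & \cdots & \dfrac{\partial z_k}{\partial y_m} \end{bmatrix} \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix}. \]
From this we obtain the familiar formulas
\[ \frac{\partial z_\ell}{\partial x_j} = \sum_{i=1}^{m} \frac{\partial z_\ell}{\partial y_i} \frac{\partial y_i}{\partial x_j}, \quad j = 1, \dots, n, \quad \ell = 1, \dots, k. \]
♦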
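The matrix form of the chain rule can be tested numerically. In the sketch below, f : R2 → R3 and g : R3 → R2 are hypothetical smooth maps chosen only for illustration; the Jacobian matrices are approximated by central differences and the product g′(y)f′(x) is compared with the Jacobian of g ◦ f.

```python
import math

def f(x1, x2):                       # hypothetical f : R^2 -> R^3
    return (x1 * x2, x1 + x2 ** 2, math.sin(x1))

def g(y1, y2, y3):                   # hypothetical g : R^3 -> R^2
    return (y1 + y2 * y3, y1 * y3)

def jacobian(func, p, t=1e-6):
    """Central-difference Jacobian: entry [i][j] approximates dz_i/dx_j at p."""
    m = len(func(*p))
    J = [[0.0] * len(p) for _ in range(m)]
    for j in range(len(p)):
        up, dn = list(p), list(p)
        up[j] += t
        dn[j] -= t
        fu, fd = func(*up), func(*dn)
        for i in range(m):
            J[i][j] = (fu[i] - fd[i]) / (2 * t)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = (0.5, -1.2)                                    # arbitrary test point
lhs = jacobian(lambda *p: g(*f(*p)), x)            # (g o f)'(x), 2 x 2
rhs = matmul(jacobian(g, f(*x)), jacobian(f, x))   # g'(y) f'(x), y = f(x)
max_gap = max(abs(lhs[i][j] - rhs[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

The order of the factors matters: g′(y) is k × m and f′(x) is m × n, so only the product g′(y)f′(x) is defined in general.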
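9.3.3 Example. Let the partial derivatives of u = f(x, y) and v = g(x, y)
exist on R2. If x = r cos θ and y = r sin θ, we may use the chain rule to find fx,
fy, gx, and gy in terms of ur, vr, uθ, and vθ. Indeed, from 9.3.2,
\[ \begin{bmatrix} u_r & u_\theta \\ v_r & v_\theta \end{bmatrix} = \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix} \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}, \]
hence, for r ≠ 0,
\[ \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix} = \begin{bmatrix} u_r & u_\theta \\ v_r & v_\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}^{-1} = \frac{1}{r} \begin{bmatrix} u_r & u_\theta \\ v_r & v_\theta \end{bmatrix} \begin{bmatrix} r\cos\theta & r\sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}. \]
Thus, for example, fx = (cos θ)ur − r^{−1}(sin θ)uθ.
♦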
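The formula for fx in terms of ur and uθ (and the analogous one for fy read off from the same matrix product) can be checked numerically. The sketch below uses the hypothetical test function f(x, y) = x²y and central differences in r and θ; the helper name `partial` is an assumption.

```python
import math

def f(x, y):                         # hypothetical test function
    return x * x * y

def u(r, theta):                     # u(r, theta) = f(r cos(theta), r sin(theta))
    return f(r * math.cos(theta), r * math.sin(theta))

def partial(func, args, k, t=1e-6):
    """Central-difference partial derivative of func in its k-th argument."""
    up, dn = list(args), list(args)
    up[k] += t
    dn[k] -= t
    return (func(*up) - func(*dn)) / (2 * t)

r, theta = 1.7, 0.6                  # arbitrary test point with r != 0
x, y = r * math.cos(theta), r * math.sin(theta)
u_r = partial(u, (r, theta), 0)
u_theta = partial(u, (r, theta), 1)

fx_from_polar = math.cos(theta) * u_r - math.sin(theta) * u_theta / r
fy_from_polar = math.sin(theta) * u_r + math.cos(theta) * u_theta / r
fx_exact, fy_exact = 2 * x * y, x * x    # partial derivatives computed by hand
print(fx_from_polar, fx_exact, fy_from_polar, fy_exact)
```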
9.3.4 Remark. The chain rule may be used to suggest a definition of tangent
plane to a smooth surface. Let f : U → R be differentiable on the open subset
U of Rn and let c ∈ R. The set
S = {x ∈ U : f(x) = c and ∇f(x) ≠ 0}
is called a level surface of f in Rn. Let a ∈ S and let ϕ : (−r, r) → Rn be a
smooth path in S such that ϕ(0) = a. The existence of such paths may be
justified by the implicit function theorem, proved in the next section. Applying
the chain rule to the identity f(ϕ(t)) = c, we see that
0 = (f ◦ ϕ)′(0) = ∇f(a) · ϕ′(0).
Since ϕ′(0) is tangent to the curve at a, ∇f(a) is perpendicular to S at a. The
tangent hyperplane to S at a is then defined as the set of all points x ∈ Rn
such that x − a is perpendicular to ∇f(a), that is,
(x − a) · ∇f(a) = 0.
For example, the hyperplane tangent at a to the (n − 1)-dimensional sphere
{x ∈ Rn : ‖x‖² = 1} is the set of all x such that
\[ \sum_{i=1}^{n} 2a_i(x_i - a_i) = 0 \quad \text{or} \quad a \cdot x = 1. \]
The tangent hyperplane at a to a surface S may be seen as the best linear
approximation to S near a.
♦
The second main result of this section is an n-dimensional version of the
mean value theorem of Chapter 4. While such a theorem is not generally
available for vector-valued functions (Exercise 14), there is a version for scalar-valued functions. For its statement, we recall that the line segment in Rn from
a to b is defined by
[a : b] = {(1 − t)a + tb : 0 ≤ t ≤ 1} .
9.3.5 Mean Value Theorem. Let U ⊆ Rn be open and let f : U → R be
differentiable on U . For each pair of points a, b ∈ U with [a : b] ⊆ U there
exists c ∈ [a : b] such that
f (b) − f (a) = dfc (b − a) = ∇f (c) · (b − a).
Proof. Set ϕ(t) = (1 − t)a + tb, 0 ≤ t ≤ 1, and g = f ◦ ϕ. Since ϕ′(t) = b − a,
the chain rule and the one-variable mean value theorem imply that
\[
f(b) - f(a) = g(1) - g(0) = g'(t_0) = df_{\varphi(t_0)}(b - a)
\]
for some $t_0 \in (0, 1)$. Setting $c = \varphi(t_0)$ completes the proof.
We conclude this section with two applications of the mean value theorem.
9.3.6 Theorem. Let U ⊆ R^n be open and let f : U → R^m be continuously differentiable on U. Let C ⊆ U be compact and convex and define
$c := \sup_{z \in C} \|df_z\|$. Then c < +∞ and
\[
\|f(x) - f(y)\| \le c\,\|x - y\|, \qquad x, y \in C.
\]
Proof. Since $z \mapsto df_z$ is continuous and C is compact, c < +∞. Let x, y ∈ C
and u ∈ R^m. By 9.3.5 applied to the scalar function g := u · f, there exists a
point w ∈ [x : y] ⊆ C such that
\[
u \cdot \bigl(f(x) - f(y)\bigr) = g(x) - g(y) = dg_w(x - y) = u \cdot df_w(x - y).
\]
Taking u = f(x) − f(y) and using the Cauchy–Schwarz and the operator norm
inequalities, we have
\[
\|f(x) - f(y)\|^2 = \bigl(f(x) - f(y)\bigr) \cdot df_w(x - y) \le c\,\|f(x) - f(y)\|\,\|x - y\|.
\]
If f(x) = f(y) there is nothing to prove; otherwise, dividing by $\|f(x) - f(y)\|$ completes the proof.
9.3.7 Corollary. Let U ⊆ Rn be open and connected and let f : U → Rm be
differentiable on U . If dfx = 0 for all x ∈ U , then f is constant.
Proof. Let x ∈ U and choose r > 0 such that Cr (x) ⊆ U . Since Cr (x) is
compact and convex, 9.3.6 implies that
\[
\|f(x) - f(y)\| \le c\,\|x - y\|, \quad y \in C_r(x), \qquad c := \sup_{z \in C_r(x)} \|df_z\|.
\]
By hypothesis, c = 0, hence f (y) = f (x) for all y ∈ Cr (x). Thus f is constant
on any ball contained in U .
Now let a ∈ U and define
Ua = {x ∈ U : f(x) = f(a)} and Va = {x ∈ U : f(x) ≠ f(a)}.
By the first paragraph, if x ∈ Ua, then a ball with center x is contained in Ua.
Therefore, Ua is open. A similar argument shows that Va is open. Since U is
connected and Ua ≠ ∅, Ua = U, that is, f(x) = f(a) for all x ∈ U.
Exercises
1.S Let g, ϕ, ψ : R → R be differentiable and let f(x, y) = g(ϕ(x)ψ(y)).
Find ∇f(x, y) in terms of g, ϕ, and ψ.
2. Let ϕ : R → R and g : R^3 → R be differentiable and set f(x, y) :=
g(x, ϕ(x + 2y), ϕ(x − 3y)). Find $f_y$ in terms of g and ϕ.
3.S Let g : R^2 → R be differentiable, a, b ∈ R^n, and set f(x) = g(a · x, b · x).
Find ∇f.
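4. Let the partial derivatives of f : R^2 → R exist and let
z = f(x, y) = f(r cos θ, r sin θ).
Prove that
\[
\left(r\,\frac{\partial z}{\partial r}\right)^{2} + \left(\frac{\partial z}{\partial \theta}\right)^{2}
= r^{2}\left[\left(\frac{\partial z}{\partial x}\right)^{2} + \left(\frac{\partial z}{\partial y}\right)^{2}\right].
\]
5. Let F : R^n → R be differentiable and set f(x) = F(x, . . . , x). Prove that
f′(x) = (1, . . . , 1) · ∇F(x, . . . , x).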
6. Let f(x, y) be continuously differentiable. Prove that
\[
f(x, y) = \int_0^1 (x, y) \cdot \nabla f(tx, ty)\, t \, dt + \int_0^1 f(tx, ty)\, dt.
\]
7.S Let f : U → R^m be differentiable on an open set U ⊆ R^n. Find (T ◦ f)′(x)
for T ∈ L(R^m, R^k).
8. Let f : R^n → R be differentiable and $a = (a_1, \dots, a_n) \in \mathbb{R}^n$ with $a_n \ne 0$.
Prove that a · ∇f(x) = 0 for all x ∈ R^n iff there exists a differentiable
function g : R^{n−1} → R such that
\[
f(x_1, x_2, \dots, x_n) = g\bigl(x_1 - b_1 x_n,\; x_2 - b_2 x_n,\; \dots,\; x_{n-1} - b_{n-1} x_n\bigr),
\]
where $b_j = a_j / a_n$, 1 ≤ j ≤ n − 1.
9. Let U ⊆ R^n be open and f : U → R smooth. Let α, β : I → R^n
be smooth paths in U such that (∇f) ◦ α = α′, $\alpha(t_1) = \beta(t_1)$, and
$\|\alpha'(t_1)\| = \|\beta'(t_1)\| = 1$ for some $t_1 \in I$ (that is, α and β both have unit
speed at the intersection). Show that $(f \circ \alpha)'(t_1) \ge (f \circ \beta)'(t_1)$.
10. Let U ⊆ R^n be open and f : U → R differentiable at a ∈ U. If u ∈ R^n
with ‖u‖ = 1, define the directional derivative of f in the direction of u
by
\[
D_u f(a) = \lim_{t \to 0} \frac{f(a + tu) - f(a)}{t}.
\]
(a)S Show that if f is differentiable at a, then $D_u f(a)$ exists and equals
u · ∇f(a).
(b) Show that if $D_u f$ exists, then $D_{-u} f$ exists and $D_{-u} f = -D_u f$.
(c)S Define
\[
f(x, y) =
\begin{cases}
\dfrac{x y^{2}}{x^{2} + y^{4}} & \text{if } (x, y) \ne (0, 0), \\[2mm]
0 & \text{otherwise.}
\end{cases}
\]
Show that $D_u f(0, 0)$ exists for each u but f is not even continuous
at (0, 0).
(d) Find all unit vectors u such that $D_u (xy)^{1/3}$ exists at (0, 0).
(e) Find all unit vectors u such that $D_u |x + y|$ exists at $(x_0, -x_0)$.
(f) Find all unit vectors u such that $D_u (x + y)^{1/3}$ exists at (0, 0).
11. Let z = F (x, y), where x = x(u, v), y = y(u, v), z = z(u, v), and
the partial derivatives of these functions exist on R2 . Suppose that
$x_u y_v - y_u x_v \ne 0$. Find $z_x$ and $z_y$ in terms of $z_u$, $z_v$, $x_u$, $x_v$, $y_u$, and $y_v$.
12.S Let f and $f_x$ be continuous on [a, b] × [c, d]. Use the mean value theorem
to prove that
\[
\frac{d}{dx} \int_a^b f(t, x)\, dt = \int_a^b f_x(t, x)\, dt, \qquad c \le x \le d.
\]
13. Let f and $f_x$ be continuous on R^2 and u(x), v(x) differentiable on R.
Use Exercises 5 and 12 to prove that
\[
\frac{d}{dx} \int_{u(x)}^{v(x)} f(t, x)\, dt
= \int_{u(x)}^{v(x)} f_x(t, x)\, dt + f\bigl(v(x), x\bigr)\, v'(x) - f\bigl(u(x), x\bigr)\, u'(x).
\]
14. Show that the mean value theorem does not generally hold for vector-valued functions.
15.S A function f : R^n \ {0} → R is homogeneous of degree p > 0 if
$f(tx) = t^p f(x)$ for all t > 0 and all x ≠ 0. Prove that a differentiable
function f is homogeneous of degree p iff x · ∇f(x) = p f(x) for every
x ≠ 0.
16. Prove the following generalization of the Cauchy mean value theorem:
Let U ⊆ Rn be open and convex and let f, g : U → R be differentiable
on U . Then, for each pair of points a, b ∈ U , there exists c ∈ [a : b] such
that
\[
\bigl(f(b) - f(a)\bigr)\, \nabla g(c) \cdot (b - a) = \bigl(g(b) - g(a)\bigr)\, \nabla f(c) \cdot (b - a).
\]
17.S Let f : U → Rm be continuously differentiable on the open set U ⊆ Rn
and let C be a compact convex subset of U . Prove that
\[
\|f(x) - f(y) - df_y(x - y)\| \le \sup_{z \in C} \|df_z - df_y\| \, \|x - y\|, \qquad x, y \in C,
\]
and that the supremum is finite.
18. Let $f(x, y) = (x^2 - y^2,\, 2xy)$ and (a, b) ≠ (0, 0). Show that if the functions
ϕ, ψ : (−1, 1) → R^2 are differentiable and ϕ(0) = ψ(0) = (a, b), then
\[
\frac{(f \circ \varphi)'(0) \cdot (f \circ \psi)'(0)}{\|(f \circ \varphi)'(0)\| \, \|(f \circ \psi)'(0)\|}
= \frac{\varphi'(0) \cdot \psi'(0)}{\|\varphi'(0)\| \, \|\psi'(0)\|},
\]
that is, the angle between the curves ϕ and ψ at their intersection is
preserved under the transformation f .
9.4
Inverse Function Theorem
The one-dimensional inverse function theorem of Section 4.4 has the following n-dimensional extension.
9.4.1 Inverse Function Theorem. Let U ⊆ Rn be open and let f : U → Rn
be continuously differentiable on U. If $J_f(a) \ne 0$ for some a ∈ U, then there
exist open sets $U_a \subseteq U$ and $V_a = f(U_a)$ with $a \in U_a$ such that f is one-to-one
on $U_a$ and $f^{-1} : V_a \to U_a$ is continuously differentiable. Moreover,
\[
(df_x)^{-1} = d(f^{-1})_y, \qquad x \in U_a, \quad y := f(x). \tag{9.15}
\]
The conclusion of the theorem may be summarized by saying that f has a
continuously differentiable local inverse at a. Of course, since f need not be
one-to-one on U , f may not have a “global” inverse.
The proof of the theorem requires two lemmas. The first is of some independent interest.
9.4.2 Lemma (Contraction Mapping Principle). Let (X, d) be a complete
metric space and let ϕ : X → X be a continuous function such that, for some
0 ≤ c < 1,
d(ϕ(x), ϕ(y)) ≤ c d(x, y) for all x, y ∈ X.
Then there exists a unique point x ∈ X such that ϕ(x) = x.
Proof. Choose any point $x_0$ in X and define a sequence $\{x_n\}$ recursively by
$x_n = \varphi(x_{n-1})$, n ≥ 1. By hypothesis,
\[
d(x_{k+1}, x_k) \le c\, d(x_k, x_{k-1}) \le c^2 d(x_{k-1}, x_{k-2}) \le \cdots \le c^k d(x_1, x_0).
\]
Thus, by the triangle inequality, for m > n,
\[
d(x_n, x_m) \le \sum_{k=n}^{m-1} d(x_k, x_{k+1}) \le d(x_1, x_0) \sum_{k=n}^{\infty} c^k.
\]
Since c < 1, the series $\sum_{k=1}^{\infty} c^k$ converges, hence the sum on the right tends
to zero as n → ∞. It follows that $\{x_n\}$ is a Cauchy sequence and therefore
converges to some x ∈ X. Letting n → +∞ in the equation $x_n = \varphi(x_{n-1})$
and using the continuity of ϕ yields ϕ(x) = x. If also ϕ(y) = y, then d(x, y) = d(ϕ(x), ϕ(y)) ≤ c d(x, y),
which is possible only if x = y.
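The iteration used in the proof can be run directly. The sketch below is an illustration only: the map ϕ = cos on X = [0, 1] is an example chosen here (it maps [0, 1] into itself and satisfies |ϕ′| ≤ sin 1 < 1 there, so it is a contraction), and the stopping tolerance is an arbitrary assumption.

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=10_000):
    # Iterate x_n = phi(x_{n-1}) until successive terms are within tol.
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# cos is a contraction on [0, 1], so the iteration converges to its unique fixed point.
x_star, steps = fixed_point(math.cos, 0.5)
print(x_star, steps)
```

Starting from a different point of [0, 1] gives the same limit, in line with the uniqueness part of the lemma.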
9.4.3 Lemma. Let U ⊆ R^n be open and f : U → R^n continuously differentiable. If a ∈ U with $J_f(a) \ne 0$, then there exists r > 0 such that the linear
transformation $df_x$ is invertible for each $x \in B_r(a)$.
Proof. Since f′ is continuous, its entries are continuous, hence $J_f(x)$ is a
continuous function of x. Since $J_f(a) \ne 0$, there exists r > 0 such that
$J_f(x) \ne 0$ on $B_r(a) \subseteq U$. Since a linear transformation on R^n is invertible iff
the determinant of its matrix is not zero, $df_x$ is invertible for $x \in B_r(a)$.
Proof of the inverse function theorem.
By 9.4.3, there exists an r > 0 such that Cr (a) ⊆ U and dfx is invertible for
each x in an open set Wr containing Cr (a). Let T = dfa and define g = T −1 ◦f
on Wr . Then dga = T −1 ◦ dfa = In , the identity transformation on Rn . Now
apply 9.3.6 to the function g(x) − x on Cr (a). The constant c in that theorem
is
\[
\sup\{\|dg_z - dg_a\| : z \in C_r(a)\},
\]
which we can make less than 1/2 by taking r sufficiently small, using the
continuity of the function $z \mapsto dg_z$ at a. Thus
\[
\|g(x) - g(y) - (x - y)\| \le \tfrac{1}{2}\|x - y\|, \qquad x, y \in C_r(a). \tag{9.16}
\]
Since
\[
\|x - y\| - \|g(x) - g(y)\| \le \|g(x) - g(y) - (x - y)\|,
\]
we see from (9.16) that
\[
\tfrac{1}{2}\|x - y\| \le \|g(x) - g(y)\|, \qquad x, y \in B_r(a).
\]
In particular, g is one-to-one on $B_r(a)$.
Next, we use 9.4.2 to show that $g(B_r(a))$ is open. Let $c \in B_r(a)$, d = g(c),
and choose s > 0 so that $C_s(c) \subseteq B_r(a)$. We claim that
\[
B_{s/2}(d) \subseteq g\bigl(C_s(c)\bigr) \subseteq g\bigl(B_r(a)\bigr). \tag{9.17}
\]
The second inclusion is clear. For the first, let $u \in B_{s/2}(d)$. To show that
$u \in g(C_s(c))$, define
\[
\varphi(x) = x - g(x) + u, \qquad x \in C_s(c).
\]
Then
\begin{align*}
\|c - \varphi(x)\| &= \|g(x) - g(c) - (x - c) + d - u\| \\
&\le \|g(x) - g(c) - (x - c)\| + \|d - u\| \\
&\le \tfrac{1}{2}\|x - c\| + \|d - u\| && \text{by (9.16)} \\
&< s/2 + s/2 = s,
\end{align*}
so $\varphi(C_s(c)) \subseteq B_s(c)$. Moreover, using (9.16) again we have
\[
\|\varphi(x) - \varphi(y)\| \le \tfrac{1}{2}\|x - y\|, \qquad x, y \in C_s(c).
\]
By Lemma 9.4.2, ϕ(x) = x for some $x \in B_s(c)$, hence
\[
u = g(x) \in g\bigl(B_s(c)\bigr).
\]
Since u was arbitrary, (9.17) holds. Since $d \in g(B_r(a))$ was arbitrary, $g(B_r(a))$
is open.
Next, we show that $g^{-1} : g(B_r(a)) \to B_r(a)$ is differentiable
at b := g(a).
Since $b \in g(B_r(a))$ and $g(B_r(a))$ is open, $b + k \in g(B_r(a))$ for sufficiently
small ‖k‖, that is, for each such k,
b + k = g(a + h) for some ‖h‖ < r.
By (9.16),
\[
\|h\| - \|k\| \le \|h - k\| = \|g(a + h) - g(a) - h\| \le \tfrac{1}{2}\|h\|,
\]
hence $\|k\| \ge \tfrac{1}{2}\|h\|$. Since $g^{-1}(b + k) = a + h$ and $g^{-1}(b) = a$, recalling that
$dg_a = I_n$ we have
\[
\frac{\|g^{-1}(b + k) - g^{-1}(b) - I_n k\|}{\|k\|}
= \frac{\|h - k\|}{\|k\|}
\le 2\,\frac{\|g(a + h) - g(a) - dg_a(h)\|}{\|h\|}.
\]
Since k → 0 implies that h → 0, which in turn implies that the right side of
the above inequality tends to zero, we see that $g^{-1}$ is differentiable at b with
derivative $I_n$.
Now set Ua = Br (a) and Va = (T ◦ g)(Ua ). Since T is invertible, it is a
homeomorphism, hence Va is open. Moreover, since g is one-to-one on Ua and
maps Ua onto g(Ua ), f = T ◦ g is one-to-one on Ua and maps Ua onto Va .
Since f −1 = g −1 ◦ T −1 , the chain rule implies that f −1 is differentiable at
f (a) = T b.
Now observe that the entire above argument may be used at any point
x of Ua , since all that is needed is the invertibility of dfx . Therefore, f −1 is
differentiable on Va .
To verify (9.15), apply the chain rule to $f^{-1} \circ f = I_n$:
\[
d(f^{-1})_y \circ df_x = d(f^{-1} \circ f)_x = d(I_n)_x = I_n, \qquad y = f(x) \in V_a.
\]
9.4.4 Corollary. Let U ⊆ Rn be open and f : U → Rn continuously differentiable with Jf (x) 6= 0 for each x ∈ U . Then f is an open map, that is, if
E ⊆ U is open, then f(E) is open. In particular, f(U) is open.
Proof. In the notation of the theorem, f (E) is the union of the open sets
f (Ua ∩ E), a ∈ E.
Since continuous differentiability is a local property, we have
9.4.5 Global Inverse Function Theorem. Under the conditions of the
preceding corollary, if f is also one-to-one on U , then f −1 : f (U ) → U is
continuously differentiable.
9.4.6 Example. The function (x, y) = f (r, θ) = (r cos θ, r sin θ), r > 0, θ ∈ R,
has Jacobian r, hence is locally invertible at each point of its domain. Since the
function is not one-to-one, it has no global inverse. However, if the domain of
f is suitably restricted, say by requiring θ0 < θ < θ0 + 2π, then f is one-to-one
on the resulting open set Uθ0 := (0, +∞) × (θ0 , θ0 + 2π).
By 9.4.5, the restriction g of f to $U_{\theta_0}$ has a continuously differentiable
inverse
\[
\bigl(r(x, y), \theta(x, y)\bigr) = g^{-1}(x, y)
\]
on the open set $V_{\theta_0} = f(U_{\theta_0})$, obtained by removing the ray $(r, \theta_0)$, r ≥ 0, from
R^2. Clearly, $r(x, y) = \sqrt{x^2 + y^2}$. The function θ(x, y) is called the argument
of (x, y) (determined by $\theta_0$) and is denoted by $\arg_{\theta_0}(x, y)$. Thus
\[
g^{-1}(x, y) = \Bigl(\sqrt{x^2 + y^2},\; \arg_{\theta_0}(x, y)\Bigr) \quad \text{on } V_{\theta_0}.
\]
For example, if $\theta_0 = -\pi$, then $\arg_{\theta_0}(x, y) = \arctan(y/x)$ for x > 0.
♦
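The branch with $\theta_0 = -\pi$ and formula (9.15) can be illustrated numerically. The sketch below makes its own choices: it uses the standard library function `math.atan2`, whose values lie in (−π, π], as a stand-in for $\arg_{-\pi}$ away from the negative x-axis, and it approximates derivative matrices by central differences. The helper names are hypothetical.

```python
import math

def f(r, theta):
    # Polar coordinate map of Example 9.4.6.
    return (r * math.cos(theta), r * math.sin(theta))

def g_inv(x, y):
    # Branch of the inverse with theta0 = -pi (valid off the negative x-axis).
    return (math.hypot(x, y), math.atan2(y, x))

def jacobian(func, p, h=1e-6):
    # 2x2 derivative matrix of func at p, by central differences.
    rows = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(*plus), func(*minus)
        for i in range(2):
            rows[i][j] = (fp[i] - fm[i]) / (2 * h)
    return rows

def matmul(A, B):
    # Product of two 2x2 matrices.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 2.0, 2.5          # a point with x < 0, where arctan(y/x) alone would fail
x, y = f(r, theta)
product = matmul(jacobian(g_inv, (x, y)), jacobian(f, (r, theta)))
print(g_inv(x, y))
print(product)
```

The product of the two derivative matrices should be close to the identity matrix, as (9.15) predicts.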
FIGURE 9.1: The domain of $\arg_{\theta_0}$.
If a function f has a nonzero Jacobian on an open set U and if f is one-to-one on an open subset U0 of U, then the inverse of the restriction of f to
U0 is called a branch of f −1 (even though a global f −1 may not exist). In the
preceding example, g −1 is one of infinitely many branches of f −1 .
9.4.7 Example. The function $(x, y) = f(u, \theta) = (e^u \cos\theta,\, e^u \sin\theta)$, where
(u, θ) ∈ R^2, has Jacobian $e^{2u}$, hence is locally invertible at each point of R^2.
The set $U_{\theta_0} = \mathbb{R} \times (\theta_0, \theta_0 + 2\pi)$ is open, and f restricted to $U_{\theta_0}$ is one-to-one.
Therefore, the corresponding branch of $f^{-1}$ is continuously differentiable on
$f(U_{\theta_0})$, which is the set $V_{\theta_0}$ of 9.4.6. The inverse may be given explicitly by
\[
u = \ln\sqrt{x^2 + y^2}, \qquad \theta = \arg_{\theta_0}(x, y).
\]
♦
9.4.8 Example. Let $(u, v) = f(x, y) = (2x^2 - 3y^2,\; 3x^2 - 2y^2)$. The Jacobian
is nonzero on the open set U = {(x, y) : xy ≠ 0}. Solving the equations for $x^2$
and $y^2$ yields
\[
x^2 = \frac{3v - 2u}{5} \quad \text{and} \quad y^2 = \frac{2v - 3u}{5}.
\]
Restricting f to each of the open quadrants of R^2, we obtain four natural
branches of $f^{-1}$, each defined on the open set
\[
V := \{(u, v) : 3v > 2u \text{ and } 2v > 3u\} = \{(u, v) : v > \max\{2u/3,\, 3u/2\}\},
\]
and each of the form
\[
f^{-1}(u, v) = \left(\pm\sqrt{\frac{3v - 2u}{5}},\; \pm\sqrt{\frac{2v - 3u}{5}}\right), \qquad (u, v) \in V.
\]
For example, in the open second quadrant of the x, y plane, one chooses the
minus sign in the first coordinate and the plus sign in the second.
♦
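The branch for the second quadrant can be checked on a sample point. This is a small sketch for illustration: the function names, the sign parameters `sx`, `sy`, and the sample point are choices made here, not part of the text.

```python
import math

def f(x, y):
    # The map of Example 9.4.8.
    return (2 * x * x - 3 * y * y, 3 * x * x - 2 * y * y)

def f_inv(u, v, sx, sy):
    # Branch of the inverse; sx and sy (each +1 or -1) select the quadrant.
    return (sx * math.sqrt((3 * v - 2 * u) / 5),
            sy * math.sqrt((2 * v - 3 * u) / 5))

# A point in the open second quadrant: minus sign first, plus sign second.
x, y = -1.5, 0.75
u, v = f(x, y)
print(f_inv(u, v, -1, +1))
```

Choosing the other sign combinations returns the mirror images of the point in the remaining quadrants, which all have the same image (u, v).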
Exercises
1. Find the largest set at each point of which the inverse function theorem
guarantees a local C 1 inverse of f , where f (x) =
(a) S (x + y, xy).
(c)
(e) S
2
(b) S (sin x + cos y, cos x + sin y).
2
(d) (sin x + sin y, cos x − cos y).
1
√
S
, x, y > 0.
(f)
ln xy, 2
x + y2
x
y
(h)
,
.
1 + x2 + y 2 1 + x2 + y 2
ye−x , xe−y .
ye−2x , ye3x .
(g) (xy, x2 − y 2 ).
(i) S (x2 + y 2 , xy).
2
2
(k) ye−x , yex .
(j) S (xy 2 , x2 z, yz 2 ).
(l) (x/y, y/z, z/x), xyz 6= 0.
2. Find a local inverse of the function in the specified part below of Exercise 1
about the point (a, b) and find $df^{-1}_{(u,v)}$.
(i) S (a) , a > b > 0.
(ii)
(iv) (g) , a > b > 0. (v)
(e) , ab 6= 0.
S
(iii)
(i) , a, b > 0. (vi)
(f) , a > b > 0.
(k) , a, b > 0.
Show that for part (a) in Exercise 1, no inverse is possible on (0, +∞)2 .
3. Let $f(\rho, \varphi, \theta) = \bigl(x(\rho, \varphi, \theta),\, y(\rho, \varphi, \theta),\, z(\rho, \varphi, \theta)\bigr)$ be the spherical coordinate transformation of Exercise 9.1.5. Find an explicit formula for the
branch of $f^{-1}$ on the set {(ρ, φ, θ) : ρ > 0, 0 < φ < π, 0 < θ < π}.
4.S Let
\[
f(x, y) := \left(\frac{y}{x^2 + y^2},\; \frac{x}{x^2 + y^2}\right), \qquad (x, y) \ne (0, 0).
\]
Show that $f = f^{-1}$ and find $J_f$.
5. By considering the function
\[
f(x) =
\begin{cases}
x + x^2 \sin(1/x) & \text{if } x \ne 0, \\
0 & \text{otherwise,}
\end{cases}
\]
show that the hypothesis in the statement of the inverse function theorem
that df be continuous on U cannot be removed.
6. Let U ⊆ R^n be open and f : U → R^n of class $C^1$ such that, for some
c > 0, ‖f(x) − f(y)‖ ≥ c‖x − y‖ for all x, y ∈ U. Prove
that $df_x$ is invertible for each x ∈ U. Conclude that f : U → f(U) is a
homeomorphism.
9.5
Implicit Function Theorem
The implicit function theorem is one of the most important applications
of the inverse function theorem. The theorem gives conditions under which
an equation of the form F (x, y) = 0 may be solved locally for y in terms of
x. The resulting function is then said to be implicitly defined by the equation
F (x, y) = 0. The following simple example illustrates the basic idea.
9.5.1 Example. Let $F(x, y, z) = x^2 + y^2 + z^2 - 1$. Consider the problem of
finding all points (a, b, c) with F(a, b, c) = 0 such that the equation F(x, y, z) =
0 has a continuously differentiable solution z = z(x, y) satisfying z(a, b) = c.
The key fact here is that such a solution is possible if $F_z(a, b, c) = 2c \ne 0$.
Indeed, in this case $a^2 + b^2 = 1 - c^2 < 1$, hence $x^2 + y^2 < 1$ for all (x, y, z)
sufficiently near (a, b, c) that satisfy F(x, y, z) = 0. For such points the solution
\[
z(x, y) = \pm\sqrt{1 - x^2 - y^2}
\]
is continuously differentiable, and if the sign chosen is that of c, then z(x, y) is
the unique solution satisfying z(a, b) = c.
♦
Notation. For the statement and proof of the implicit function theorem we
use the following conventions: For points z ∈ Rn+m we write
\[
z = (x, y) = (x_1, \dots, x_n, y_1, \dots, y_m), \qquad x \in \mathbb{R}^n, \; y \in \mathbb{R}^m.
\]
For a differentiable function
\[
F(z) = F(x, y) = \bigl(F_1(x, y), \dots, F_m(x, y)\bigr),
\]
we denote by $F_y(x, y)$ the m × m matrix with (i, j)th entry
\[
\frac{\partial F_i}{\partial y_j}(x, y).
\]
♦
9.5.2 Implicit Function Theorem. Let U be an open subset of Rn+m ,
let F = (F1 , . . . , Fm ) : U → Rm be continuously differentiable, and let
F(a, b) = 0 for some (a, b) ∈ U. If
\[
\frac{\partial(F_1, \dots, F_m)}{\partial(y_1, \dots, y_m)} = \det F_y(a, b) \ne 0,
\]
then there is an open set $V_a \subseteq \mathbb{R}^n$ containing a and a unique continuously
differentiable mapping $f : V_a \to \mathbb{R}^m$ such that
\[
f(a) = b \quad \text{and} \quad F\bigl(x, f(x)\bigr) = 0 \text{ for every } x \in V_a.
\]
Proof. Define $G : U \to \mathbb{R}^{n+m}$ by
\[
G(x, y) = \bigl(x, F(x, y)\bigr) = \bigl(x, F_1(x, y), \dots, F_m(x, y)\bigr).
\]
Then G is continuously differentiable, and
\[
G'(x, y) =
\begin{bmatrix}
I_{n \times n} & O_{n \times m} \\
A & F_y
\end{bmatrix},
\]
where $I_{n \times n}$ is the n × n identity matrix, $O_{n \times m}$ is the n × m zero matrix, and A
is an m × n matrix of partial derivatives of the components of F with respect
to x. Therefore, $J_G = \det F_y$. Since $\det F_y(a, b) \ne 0$, by the inverse function
theorem there exists an open set W ⊆ U containing (a, b) and an open set
$V \subseteq \mathbb{R}^{n+m}$ containing G(a, b) = (a, 0) such that G(W) = V and
\[
H = (H_1, \dots, H_n, H_{n+1}, \dots, H_{n+m}) := G^{-1} : V \to W
\]
is continuously differentiable. Note that the identities H(G(x, y)) = (x, y)
and G(H(x, y)) = (x, y) imply, respectively, that
\[
\bigl(H_{n+1}(G(x, y)), \dots, H_{n+m}(G(x, y))\bigr) = y, \qquad (x, y) \in W \tag{9.18}
\]
and
\[
F\bigl(x, H_{n+1}(x, y), \dots, H_{n+m}(x, y)\bigr) = y, \qquad (x, y) \in V. \tag{9.19}
\]
Now let $V_a = \{x \in \mathbb{R}^n : (x, 0) \in V\}$. Then $V_a$ is open and contains a. Define
f on $V_a$ by
\[
f(x) = \bigl(H_{n+1}(x, 0), \dots, H_{n+m}(x, 0)\bigr).
\]
Then f is continuously differentiable, and since (a, 0) = G(a, b), (9.18) implies
that
\[
f(a) = \bigl(H_{n+1}(a, 0), \dots, H_{n+m}(a, 0)\bigr) = b.
\]
Furthermore, (9.19) implies that F (x, f (x)) = 0 on Va . This establishes the
existence of f .
To show uniqueness, assume that F (x, g(x)) = 0 for some function g :
Va → Rm . Then
G(x, f(x)) = (x, F(x, f(x))) = (x, F(x, g(x))) = G(x, g(x)).
Since G is one-to-one, f (x) = g(x).
9.5.3 Example. The point (x, y, u, v) = (−1, 1, 1, 1) is a solution of the system
\begin{align*}
F(x, y, u, v) &:= xu^2 + y = 0 \\
G(x, y, u, v) &:= xy^2 + u^2 v^2 = 0,
\end{align*}
and at that point
\[
\frac{\partial(F, G)}{\partial(u, v)} = 4xu^3 v \ne 0.
\]
By the implicit function theorem, there are $C^1$ functions u(x, y) and v(x, y)
defined on a ball $B_r(-1, 1)$ that satisfy the above system with u(−1, 1) =
v(−1, 1) = 1. If r < 1, then $(x, y) \in B_r(-1, 1)$ implies that x < 0 < y and we
have the explicit solution
\[
u = \sqrt{-y/x}, \qquad v = -x\sqrt{y}.
\]
♦
9.5.4 Remark. Let f = (f1 , . . . , fm ) be the function in the statement of the
implicit function theorem. Set y = f (x) and w = F (x, y) = F (z). Applying
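the chain rule to the identity F(x, f(x)) = 0 yields
\[
\frac{\partial w_i}{\partial x_j}
+ \frac{\partial w_i}{\partial y_1}\frac{\partial y_1}{\partial x_j}
+ \cdots
+ \frac{\partial w_i}{\partial y_m}\frac{\partial y_m}{\partial x_j} = 0,
\qquad i = 1, \dots, m, \quad j = 1, \dots, n.
\]
This may be written in matrix form as
\[
\begin{bmatrix}
\dfrac{\partial w_1}{\partial y_1} & \cdots & \dfrac{\partial w_1}{\partial y_m} \\
\vdots & & \vdots \\
\dfrac{\partial w_m}{\partial y_1} & \cdots & \dfrac{\partial w_m}{\partial y_m}
\end{bmatrix}
\begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n}
\end{bmatrix}
= -
\begin{bmatrix}
\dfrac{\partial w_1}{\partial x_1} & \cdots & \dfrac{\partial w_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial w_m}{\partial x_1} & \cdots & \dfrac{\partial w_m}{\partial x_n}
\end{bmatrix},
\]
or, in the above notation, as $F_y(z)\, f'(x) = -F_x(z)$. Therefore,
\[
f'(x) = -F_y(z)^{-1} F_x(z),
\]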
which shows that the partial derivatives of the solution f in the implicit function
theorem may be calculated by carrying out a matrix inversion. However, this
is practical only for small dimensions, and even in this case it is often easier to
apply the chain rule directly and then use Cramer’s rule. The next example
illustrates the latter approach.
♦
9.5.5 Example. Suppose $(x_0, y_0, u_0, v_0)$ satisfies the system
\[
F(x, y, u, v) = G(x, y, u, v) = 0, \tag{9.20}
\]
where F and G are $C^1$ in a neighborhood of $(x_0, y_0, u_0, v_0)$ and
\[
\frac{\partial(F, G)}{\partial(u, v)}(x_0, y_0, u_0, v_0) \ne 0.
\]
Then (9.20) has a $C^1$ solution u = u(x, y), v = v(x, y) near $(x_0, y_0)$ such that
$u_0 = u(x_0, y_0)$, $v_0 = v(x_0, y_0)$. Differentiating each equation in (9.20) with
respect to x and y, we obtain the two systems
\begin{align*}
F_u u_x + F_v v_x &= -F_x & F_u u_y + F_v v_y &= -F_y \\
G_u u_x + G_v v_x &= -G_x & G_u u_y + G_v v_y &= -G_y
\end{align*}
Cramer’s rule gives the following solutions near $(x_0, y_0, u_0, v_0)$:
\[
u_x = -\frac{\partial(F, G)/\partial(x, v)}{\partial(F, G)/\partial(u, v)}, \quad
v_x = -\frac{\partial(F, G)/\partial(u, x)}{\partial(F, G)/\partial(u, v)}, \quad
u_y = -\frac{\partial(F, G)/\partial(y, v)}{\partial(F, G)/\partial(u, v)}, \quad
v_y = -\frac{\partial(F, G)/\partial(u, y)}{\partial(F, G)/\partial(u, v)}.
\]
♦
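As an illustration of the Cramer's rule formulas, the sketch below applies them to the system of Example 9.5.3 and compares the result with central differences of that example's explicit solution. The function names and the step size are choices made here; the partial derivatives of F and G are written out by hand.

```python
import math

def explicit_solution(x, y):
    # Explicit solution of Example 9.5.3, valid for x < 0 < y.
    return math.sqrt(-y / x), -x * math.sqrt(y)

def det2(a11, a12, a21, a22):
    # Determinant of a 2x2 matrix.
    return a11 * a22 - a12 * a21

def ux_vx_by_cramer(x, y, u, v):
    # Partial derivatives of F = x*u**2 + y and G = x*y**2 + u**2 * v**2.
    Fx, Fu, Fv = u**2, 2 * x * u, 0.0
    Gx, Gu, Gv = y**2, 2 * u * v**2, 2 * u**2 * v
    d_uv = det2(Fu, Fv, Gu, Gv)   # d(F, G)/d(u, v)
    d_xv = det2(Fx, Fv, Gx, Gv)   # d(F, G)/d(x, v)
    d_ux = det2(Fu, Fx, Gu, Gx)   # d(F, G)/d(u, x)
    return -d_xv / d_uv, -d_ux / d_uv

def ux_vx_by_differences(x, y, h=1e-6):
    # Central differences of the explicit solution in the x direction.
    up, vp = explicit_solution(x + h, y)
    um, vm = explicit_solution(x - h, y)
    return (up - um) / (2 * h), (vp - vm) / (2 * h)

x, y = -1.0, 1.0
u, v = explicit_solution(x, y)
print(ux_vx_by_cramer(x, y, u, v))
print(ux_vx_by_differences(x, y))
```

Both approaches should give approximately the same pair $(u_x, v_x)$ at the point (−1, 1).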
Exercises
1.S What does the implicit function theorem tell us about solving the
equation $x + y^2 + e^{xy} = 1$ near (0, 0) for one of the variables in terms of
the other?
2. Suppose $(x_0, y_0, z_0)$ satisfies the equation F(x, y, z) = 0, where F is $C^1$
in a neighborhood of $(x_0, y_0, z_0)$ and $F_z(x_0, y_0, z_0) \ne 0$. By the implicit
function theorem, F(x, y, z) = 0 has a $C^1$ solution z = z(x, y) near
$(x_0, y_0)$ with $z_0 = z(x_0, y_0)$. Show that near $(x_0, y_0, z_0)$,
\[
z_x = -\frac{F_x}{F_z} \quad \text{and} \quad z_y = -\frac{F_y}{F_z}.
\]
3. Show that for each of the functions F below the equation F (x, y, z) = 0
has a local C 1 solution z = z(x, y) on some ball Br (a, b) such that
z(a, b) = c. Calculate zx in a neighborhood of (a, b, c).
(a) $\sin(xyz) + \cos(xyz) - 1$, (a, b, c) = (1, π, 0).
(b) $e^{xyz} + x + y + z - 1$, (a, b, c) = (0, 0, 0).
(c) $z\sin(x + y + z) - \pi\sqrt{3}/6$, (a, b, c) = (π/6, π/6, π/3).
(d) $xyz + \ln(x + y + z) - 1 - \ln 3$, (a, b, c) = (1, 1, 1).
(e) $x \ln z + y \ln x + z \ln y$, (a, b, c) = (1, 1, 1).
(f) $x \sin z + y \sin x + z \sin y - 3\pi/2$, (a, b, c) = (π/2, π/2, π/2).
(g) $z^{2n} + xz^{2n-1} + xy - 1$, n ∈ N, (a, b, c) = (1, 1, −1).
(h) $\cos(xyz) + \cos(xz) + \cos(yz)$, (a, b, c) = (0, 1, π/2).
4. Suppose $(x_0, y_0, z_0)$ satisfies the system F(x, y, z) = G(x, y, z) = 0, where
F and G are $C^1$ in a neighborhood of $(x_0, y_0, z_0)$ and
\[
\frac{\partial(F, G)}{\partial(x, y)}(x_0, y_0, z_0) \ne 0.
\]
By the implicit function theorem, the system has a $C^1$ solution (x, y) =
(x(z), y(z)) near $z_0$ with $(x_0, y_0) = (x(z_0), y(z_0))$. Show that near
$(x_0, y_0, z_0)$,
\[
x'(z) = -\frac{\partial(F, G)/\partial(z, y)}{\partial(F, G)/\partial(x, y)}
\quad \text{and} \quad
y'(z) = -\frac{\partial(F, G)/\partial(x, z)}{\partial(F, G)/\partial(x, y)}.
\]
5.S Show that each pair of variables in the system
\begin{align*}
\sin(x + z) + \ln(y + z) &= \sqrt{2}/2 \\
e^{xz} + \sin(\pi y + z) &= 1
\end{align*}
are $C^1$ functions of the other variable near (x, y, z) = (π/4, 1, 0). In the
case (x, y) = (x(z), y(z)), calculate x′(z) and y′(z) in a neighborhood of
(π/4, 1, 0).
6. Show that each pair of variables in the system
\begin{align*}
xy + yz + xz &= 11 \\
xyz + x + y &= 9
\end{align*}
are $C^1$ functions of the other variable near (x, y, z) = (1, 2, 3). In the
case (x, y) = (x(z), y(z)), calculate x′(z) and y′(z) in a neighborhood of
(1, 2, 3).
7. Show that each pair of the variables (u, v), (x, y), and (x, v) in the system
\begin{align*}
x^2 - y^2 + uv - v^2 &= 0 \\
x^2 + y^2 + uv + u^2 &= 4
\end{align*}
are C 1 functions of the remaining variables near (x, y, u, v) = (1, 1, 1, 1).
In the case u(x, y), v(x, y), calculate ux in a neighborhood of (1, 1).
8.S Show that the system
\begin{align*}
x - y + z + u^2 &= 2 \\
-x + 2z + u^3 &= 2 \\
-y + 3z + u^4 &= 3
\end{align*}
cannot be solved for x, y, and z in terms of u near the point (x, y, z, u) =
(1, 1, 1, 1), but for any other group of three variables a local C 1 solution
in terms of the fourth variable is possible.
9. Let f (x, y) be continuously differentiable with f (0, 0) = 0. Give conditions
on fx and fy such that each of the equations below has a C 1 solution
y = y(x) on some interval (−r, r) with y(0) = 0. Calculate y 0 (x) in each
case.
(a) f(2y, 2x − 3y) = 0. (b)S f(f(x, y), y) = 0. (c) f(f(x, y), f(x, y)) = 0.
10. Let f (x, y) be continuously differentiable with f (0, 0) = 0. Give conditions
on fx and fy under which each of the equations below has a C 1 solution
z = z(x, y) on some open ball Br (0, 0) with z(0, 0) = 0.
(a) f(2y + 3z, 3x − 2z) = 0.
(b) $f\bigl(f(x, -z),\; z\ln(e^2 + x + y)\bigr) = 0$.
(c) $f\bigl(e^{2z} f(x, 2z),\; f(y, \sin 3z)\bigr) = 0$.
(d) f(f(z, x), f(y, z)) = 0.
11. Let f (x, y) be continuously differentiable with f (0, 0) = 0. For each
system below, give conditions on fx and fy under which the system
has a C 1 solution x = g(z), y = h(z) on some interval (−r, r) with
g(0) = h(0) = 0.
(a)S f(f(x, y), f(z, y)) = 0, f(f(y, z), f(x, z)) = 0.
(b) f(f(z, z), f(x, y)) = 0, f(f(x, y), f(y, z)) = 0.
For each system, calculate g 0 (z).
12. Let f (x, y) be continuously differentiable with f (0, 0) = 0. What does
the implicit function theorem tell us about the possibility of solving the
system
f(f(u, x), f(v, y)) = f(f(y, u), f(x, v)) = 0
(a) for (x, y) in terms of (u, v) such that x(0, 0) = y(0, 0) = 0?
(b) for (u, v) in terms of (x, y) such that u(0, 0) = v(0, 0) = 0?
13.S Let f , g, and h be continuously differentiable and f (1) = g(1) = h(1) = 0.
Give conditions on f 0 , g 0 , and h0 so that the system
f (xu) + g(yu) + h(zu) = 0
f (xv) + g(yv) + h(zv) = 0
has a $C^1$ solution u = u(x, y, z), v = v(x, y, z) on some ball $B_r(1, 1, 1)$
such that u(1, 1, 1) = v(1, 1, 1) = 1. Calculate $u_x$.
14. Let D ⊆ R2 be compact and let F (x, y, z) be continuous on the set
E := D × [a, b] such that for each (x, y) ∈ D there exists a unique
z = z(x, y) ∈ [a, b] for which F(x, y, z(x, y)) = 0. Prove that z(x, y) is
continuous on D.
15.S Suppose the equation F (x1 , . . . , xn ) = 0 may be solved for each variable
xj in terms of the others. Show that under suitable conditions
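\[
\frac{\partial x_2}{\partial x_1}\,\frac{\partial x_3}{\partial x_2} \cdots \frac{\partial x_n}{\partial x_{n-1}}\,\frac{\partial x_1}{\partial x_n} = (-1)^n.
\]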
Verify this for each of the functions
(a) F (x1 , x2 , x3 ) = x1 x2 x3 − 1,
(b) F (x1 , x2 , x3 , x4 ) = x1 x2 x3 x4 − 1.
16. Let p(x, y) and q(x, y) be $C^1$ on an open set U containing (0, 0) such
that p(0, 0) = q(0, 0) = 0 and for (x, y) ∈ U \ {(0, 0)}
p(x, y) > 0 and −1 ≤ q(x, y) ≤ 1.
Let
\[
f(x, y, z) = z^3 + p(x, y)\,z + q(x, y), \qquad (x, y) \in U, \; z \in \mathbb{R}.
\]
Prove that there is a unique solution z = z(x, y) to f (x, y, z) = 0 on all
of U which is C 1 on U \ {(0, 0)} and satisfies z(0, 0) = 0.
9.6
Higher Order Partial Derivatives
Let f be a real-valued function defined on an open subset of R2 with first
partial derivatives fx and fy . The higher order partial derivatives are defined
inductively by
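\begin{align*}
f_{xx} &= \frac{\partial^2 f}{\partial x^2} := \frac{\partial}{\partial x}\frac{\partial f}{\partial x}, &
f_{yy} &= \frac{\partial^2 f}{\partial y^2} := \frac{\partial}{\partial y}\frac{\partial f}{\partial y}, \\
f_{xy} &= \frac{\partial^2 f}{\partial y\,\partial x} := \frac{\partial}{\partial y}\frac{\partial f}{\partial x}, &
f_{yx} &= \frac{\partial^2 f}{\partial x\,\partial y} := \frac{\partial}{\partial x}\frac{\partial f}{\partial y}, \\
f_{xxx} &= \frac{\partial^3 f}{\partial x^3} := \frac{\partial}{\partial x}\frac{\partial^2 f}{\partial x^2}, &
f_{xxy} &= \frac{\partial^3 f}{\partial y\,\partial x^2} := \frac{\partial}{\partial y}\frac{\partial^2 f}{\partial x^2}, \\
&\;\;\vdots & &\;\;\vdots
\end{align*}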
Analogous definitions are given for functions of n variables. For such a function
f , integers mi ∈ Z+ and a permutation (i1 , . . . , in ) of (1, . . . , n),
\[
\frac{\partial^m f}{\partial x_{i_1}^{m_1} \cdots \partial x_{i_n}^{m_n}}, \qquad m := m_1 + \cdots + m_n,
\]
is called a partial derivative of order m.
The following result will allow some simplifications in calculating higher
order partial derivatives.
9.6.1 Theorem. Let U ⊆ R2 be open and let f : U → R have continuous first
partial derivatives fx and fy on U . If fxy exists on U and is continuous at
(a, b) ∈ U , then fyx (a, b) exists and equals fxy (a, b).
Proof. Choose r > 0 such that (a − r, a + r) × (b − r, b + r) ⊆ U. For 0 < |h|, |k| < r,
define
\begin{align*}
\varphi_k(x) &= f(x, b + k) - f(x, b), & x &\in (a - r, a + r), \\
\psi_h(y) &= f(a + h, y) - f(a, y), & y &\in (b - r, b + r), \\
\Delta(h, k) &= \varphi_k(a + h) - \varphi_k(a) \\
&= \psi_h(b + k) - \psi_h(b) \\
&= f(a + h, b + k) - f(a, b + k) + f(a, b) - f(a + h, b).
\end{align*}
By the mean value theorem applied twice, there exist s, t ∈ (0, 1) such that
\begin{align*}
\Delta(h, k) &= \varphi_k'(a + sh)\,h \\
&= \bigl[f_x(a + sh, b + k) - f_x(a + sh, b)\bigr] h \\
&= f_{xy}(a + sh, b + tk)\,hk.
\end{align*}
By continuity of $f_{xy}$ at (a, b),
\[
\lim_{(h,k) \to (0,0)} \frac{\Delta(h, k)}{hk}
= \lim_{(h,k) \to (0,0)} f_{xy}(a + sh, b + tk) = f_{xy}(a, b).
\]
On the other hand, for each h,
\[
\lim_{k \to 0} \frac{\Delta(h, k)}{k}
= \lim_{k \to 0} \frac{\psi_h(b + k) - \psi_h(b)}{k}
= \psi_h'(b) = f_y(a + h, b) - f_y(a, b),
\]
so by the iterated limit theorem (8.4.4),
\[
\lim_{h \to 0} \frac{f_y(a + h, b) - f_y(a, b)}{h}
= \lim_{h \to 0} \lim_{k \to 0} \frac{\Delta(h, k)}{hk}
= \lim_{(h,k) \to (0,0)} \frac{\Delta(h, k)}{hk}.
\]
Therefore, $f_{yx}(a, b) = f_{xy}(a, b)$.
The following example shows that continuity of at least one of the second
partial derivatives in the theorem is essential.
9.6.2 Example. Let f(0, 0) = 0 and define
\[
f(x, y) = \frac{x^3 y - y^3 x}{x^2 + y^2} \quad \text{if } (x, y) \ne (0, 0).
\]
Then the first partial derivatives exist and are continuous on R^2, the second partial derivatives exist on R^2, but $f_{xy}(0, 0) \ne f_{yx}(0, 0)$. Indeed,
$f_x(0, 0) = 0$ and, for y ≠ 0,
\[
f_x(0, y) = \lim_{h \to 0} \frac{f(h, y) - f(0, y)}{h} = \lim_{h \to 0} \frac{h^2 y - y^3}{h^2 + y^2} = -y,
\]
so $f_x(0, y) = -y$ for all y, and similarly $f_y(x, 0) = x$. Therefore, $f_{xy}(0, 0) = -1$ and $f_{yx}(0, 0) = 1$.
♦
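The unequal mixed partials in Example 9.6.2 can be observed numerically. The following sketch uses nested central differences; the step sizes h and k are arbitrary choices made here (with h much smaller than k), so the output is only an approximation of the exact values −1 and 1.

```python
def f(x, y):
    # The function of Example 9.6.2.
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 * y - y**3 * x) / (x**2 + y**2)

def fx(x, y, h=1e-6):
    # Central difference approximation of f_x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-6):
    # Central difference approximation of f_y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def fxy_at_origin(k=1e-3):
    # f_xy(0, 0): derivative in y of f_x along the y-axis.
    return (fx(0.0, k) - fx(0.0, -k)) / (2 * k)

def fyx_at_origin(k=1e-3):
    # f_yx(0, 0): derivative in x of f_y along the x-axis.
    return (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)

print(fxy_at_origin(), fyx_at_origin())
```

The first value should be close to −1 and the second close to 1.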
Theorem 9.6.1 may be extended to functions f of n variables. Indeed, if
1 ≤ i < j ≤ n, then under suitable continuity conditions one has
\[
\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i},
\]
since the only “active” variables in this identity are xi and xj . Combining this
observation with an induction argument leads to the following result.
9.6.3 Corollary. Let f be a real-valued function defined on an open subset U of
R^n and let $m = m_1 + m_2 + \cdots + m_n$, $m_i \in \mathbb{Z}^+$. Then, for any permutation
$(i_1, \dots, i_n)$ of (1, . . . , n),
\[
\frac{\partial^m f}{\partial x_{i_1}^{m_{i_1}} \cdots \partial x_{i_n}^{m_{i_n}}}
= \frac{\partial^m f}{\partial x_1^{m_1} \cdots \partial x_n^{m_n}},
\]
provided that all partial derivatives of f up to order m are continuous on U.
9.6.4 Definition. Let r ∈ N. A real-valued function f on an open set U ⊆ Rn
is said to be of class C r on U (or simply C r on U ) if all partial derivatives
up to order r exist and are continuous on U . Also, f is of class C ∞ on U if
it is of class C r on U for every r ∈ N. A vector-valued function is C r if each
component function is C r . Continuous functions are said to be of class C 0 . A
function is of class C r on a set if it is the restriction of a C r function on a
larger open set.
♦
9.6.5 Remarks. (a) A function of class $C^{r+1}$ is of class $C^r$. The function
\[
f(x_1, \dots, x_n) =
\begin{cases}
x_1^{r+1} & \text{if } x_1 \ge 0, \\
0 & \text{otherwise}
\end{cases}
\]
is $C^r$ on R^n but not $C^{r+1}$.
(b) The standard rules of differentiation show that if f and g are real-valued
functions of class $C^r$, then so are αf, f + g, fg, and (wherever g ≠ 0) f/g. For example, if
f (x, y) and g(x, y) are of class C 2 , then
(f g)xx = fxx g + f gxx + 2fx gx ,
with similar formulas holding for (f g)xy and (f g)yy . Since the terms on the
right are continuous, f g is C 2 . In particular, polynomials and rational functions
of several variables are of class C ∞ .
(c) The composite f = g ◦ h of real-valued C r functions is again C r . This
follows from the chain rule: The matrix equation
\[
f'(x) = g'\bigl(h(x)\bigr)\, h'(x)
\]
shows that the entries of f′(x) are sums of products of $C^{r-1}$ functions, hence
f is $C^r$.
(d) If the function f in the statement of the inverse function theorem is C r
on U , then the local inverse of f is also C r . This is proved by induction on r
as follows. Assume that the assertion holds for r − 1, and let f be C r on U .
Then the entries of the matrix f 0 (x) are C r−1 , hence, near a, the entries of
\[
(f^{-1})'(y) = \bigl[f'\bigl(f^{-1}(y)\bigr)\bigr]^{-1}
\]
are $C^{r-1}$, as these are rational functions of the entries of $f' \circ f^{-1}$. Therefore,
$f^{-1}$ is $C^r$.
(e) If the function F in the statement of the implicit function theorem is C r ,
then the solution y = f (x) to the equation F (x, y) = 0 is C r . This follows
from (d) , since f is constructed using the inverse function theorem.
♦
The following example illustrates how the chain rule may be used to
calculate higher order partial derivatives of composite functions.
9.6.6 Example. Let u = f (x, y) be C 2 on R2 and let x = r cos θ, y = r sin θ.
Then
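\[
\begin{aligned}
u_r &= (\cos\theta)u_x + (\sin\theta)u_y,\\
u_\theta &= -(r\sin\theta)u_x + (r\cos\theta)u_y,\\
u_{rr} &= (\cos\theta)u_{xr} + (\sin\theta)u_{yr} = (\cos\theta)^2u_{xx} + (2\sin\theta\cos\theta)u_{xy} + (\sin\theta)^2u_{yy},\\
u_{\theta\theta} &= -(r\cos\theta)u_x - (r\sin\theta)u_{x\theta} - (r\sin\theta)u_y + (r\cos\theta)u_{y\theta}\\
&= (r\sin\theta)^2u_{xx} - (2r^2\sin\theta\cos\theta)u_{xy} + (r\cos\theta)^2u_{yy} - ru_r.
\end{aligned}
\]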
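Calculations like these are useful for changing coordinates in differential operators. For example, the above equations imply that
\[
\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}.
\tag{9.21}
\]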
The operator on the left is called the Laplacian. The equation expresses the
Laplacian in polar coordinates.
♦
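The polar form (9.21) of the Laplacian can be checked numerically. The following is a minimal sketch, not part of the original text: the test function `u` and the step size `h` are arbitrary choices made for illustration, and the polar derivatives are approximated by central differences.

```python
import math

def u(x, y):
    # Hypothetical smooth test function on R^2 (an assumption for illustration).
    return x**3 * y + math.exp(x) * math.sin(y)

def laplacian_cartesian(x, y):
    # Exact value: u_xx = 6xy + e^x sin y and u_yy = -e^x sin y, so u_xx + u_yy = 6xy.
    return 6.0 * x * y

def laplacian_polar(r, theta, h=1e-4):
    # Right side of (9.21): U_rr + U_r / r + U_theta_theta / r^2,
    # where U(r, theta) = u(r cos theta, r sin theta), via central differences.
    def U(r_, t_):
        return u(r_ * math.cos(t_), r_ * math.sin(t_))
    U_r = (U(r + h, theta) - U(r - h, theta)) / (2 * h)
    U_rr = (U(r + h, theta) - 2 * U(r, theta) + U(r - h, theta)) / h**2
    U_tt = (U(r, theta + h) - 2 * U(r, theta) + U(r, theta - h)) / h**2
    return U_rr + U_r / r + U_tt / r**2

r, theta = 1.3, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)
print(laplacian_cartesian(x, y), laplacian_polar(r, theta))
```

The two printed values should agree up to the finite-difference error, which depends on the chosen step `h`.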
Exercises
1. Let z = f (x, y) be C 2 on R2 . Show that the following equations hold for
the given functions x = x(r, t) and y = y(r, t).
(a) $z_{xx} + z_{yy} = \dfrac{z_{rr} + z_{tt}}{a^2 + b^2}$, $x = ar + bt$, $y = at - br$.
(b)S $xz_{xx} - z_{yy} = \dfrac{rz_{rr} - tz_{tt}}{t - r}$, $x = rt$, $y = r + t$.
(c) $z_{xx} + 4z_{yy} = \dfrac{z_{rr} + z_{tt}}{r^2 + t^2}$, $x = rt$, $y = r^2 - t^2$.
(d) $x^2z_{xx} + y^2z_{yy} = \tfrac12\,[r^2z_{rr} + z_{tt} - rz_r]$, $x = re^t$, $y = re^{-t}$.
(e)S $z_{xx} + z_{yy} = e^{-2r}[z_{rr} + z_{tt}]$, $x = e^r\sin t$, $y = e^r\cos t$.
(f)S $a^2x^2z_{xx} + b^2y^2z_{yy} = z_{rr} + z_{tt} - az_r - bz_t$, $x = e^{ar}$, $y = e^{bt}$.
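2. Let $z = f(x, y)$ be $C^2$ on $\mathbb{R}^2$, $x = ar + bs$, and $y = cr + ds$. Show that
\[
\begin{bmatrix} z_{rr}\\ z_{ss}\\ z_{rs} \end{bmatrix}
=
\begin{bmatrix} a^2 & c^2 & 2ac\\ b^2 & d^2 & 2bd\\ ab & cd & ad + bc \end{bmatrix}
\begin{bmatrix} z_{xx}\\ z_{yy}\\ z_{xy} \end{bmatrix}.
\]
In particular, if $x = r - s$ and $y = r + s$, show that
\[
\begin{bmatrix} z_{xx}\\ z_{yy}\\ z_{xy} \end{bmatrix}
= \frac14
\begin{bmatrix} 1 & 1 & -2\\ 1 & 1 & 2\\ 1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} z_{rr}\\ z_{ss}\\ z_{rs} \end{bmatrix}.
\]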
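3. Let $z = f(x, y)$, $x = g(r, s)$, $y = h(r, s)$ be $C^2$ on $\mathbb{R}^2$. Show that
\[
\frac{\partial^2z}{\partial r^2}
= \frac{\partial z}{\partial x}\frac{\partial^2x}{\partial r^2}
+ \frac{\partial z}{\partial y}\frac{\partial^2y}{\partial r^2}
+ \frac{\partial^2z}{\partial x^2}\Big(\frac{\partial x}{\partial r}\Big)^2
+ \frac{\partial^2z}{\partial y^2}\Big(\frac{\partial y}{\partial r}\Big)^2
+ 2\,\frac{\partial^2z}{\partial x\partial y}\frac{\partial x}{\partial r}\frac{\partial y}{\partial r}.
\]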
4.S Let F (x, y, z) be C 2 on an open set U and assume that the equation
F (x, y, z) = 0 defines z implicitly as a function of x and y. Express zxx
in terms of partial derivatives of F .
5.S Show that each of the following functions u = u(t, x) satisfies the one
dimensional heat equation ut = k 2 uxx .
(a) u = (a sin x + b cos x) exp(−k 2 t).
(b) u = t−1/2 exp (−x2 /4k 2 t).
6. Let f (x) and g(x) be twice differentiable.
(a) Show that the function u(t, x) = f (x − ct) + g(x + ct) satisfies the
one dimensional wave equation utt = c2 uxx .
(b) Show that the function $v(t, x) = \dfrac{1}{x}\,[f(x - ct) + g(x + ct)]$, $x > 0$,
1
c2
satisfies the equation vtt = c2 1 +
vxx + vx .
x
x
7.S (Spherical coordinate analog of (9.21)). Let w = f (x, y, z) be of class
C 2 on R3 , where
x = ρ sin φ cos θ, y = ρ sin φ sin θ, and z = ρ cos φ.
Show that
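\[
\frac{\partial^2w}{\partial x^2} + \frac{\partial^2w}{\partial y^2} + \frac{\partial^2w}{\partial z^2}
= \frac{\partial^2w}{\partial\rho^2} + \frac{2}{\rho}\frac{\partial w}{\partial\rho}
+ \frac{1}{\rho^2}\frac{\partial^2w}{\partial\varphi^2}
+ \frac{\cos\varphi}{\rho^2\sin\varphi}\frac{\partial w}{\partial\varphi}
+ \frac{1}{\rho^2\sin^2\varphi}\frac{\partial^2w}{\partial\theta^2}.
\]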
8. Show that if f (x, y) is C 2 and homogeneous of degree n ≥ 2 (Exercise 9.3.15), then
x2 fxx + 2xyfxy + y 2 fyy = n(n − 1)f (x, y).
9.S Let $g$ be $C^2$ on $(0, +\infty)$, $p \ne 0$, and $f(x) = g(\|x\|^p)$, $x \in \mathbb{R}^n \setminus \{0\}$.
Show that
\[
\frac1p\sum_{i=1}^n f_{x_ix_i} = (n + p - 2)\|x\|^{p-2}g'(\|x\|^p) + p\|x\|^{2(p-1)}g''(\|x\|^p)
\]
and
\[
\frac1p\sum_{i<j} f_{x_ix_j} = \Big[(p - 2)\|x\|^{p-4}g'(\|x\|^p) + p\|x\|^{2(p-2)}g''(\|x\|^p)\Big]\sum_{i<j}x_ix_j.
\]
10. Let r > 0. Show that the substitutions
\[
s = e^x,\quad t = T - \frac{2\tau}{\sigma^2},\quad\text{and}\quad v(t, s) = u(\tau, x)
\]
transform the partial differential equation
\[
v_t(t, s) + rsv_s(t, s) + \tfrac12\sigma^2s^2v_{ss}(t, s) - rv(t, s) = 0,\quad s > 0,\ 0 \le t \le T,
\]
into
\[
u_\tau(\tau, x) = (k - 1)u_x(\tau, x) + u_{xx}(\tau, x) - ku(\tau, x),\quad k := 2r/\sigma^2.
\]
The first equation arises in the Black–Scholes theory of option pricing.
The second is an example of a diffusion equation.
11. Show that the substitutions
\[
u(\tau, x) = e^{ax+b\tau}w(\tau, x),\quad a := \tfrac12(1 - k),\quad b := a(k - 1) + a^2 - k = -\tfrac14(k + 1)^2,
\]
reduce the diffusion equation in Exercise 10 to the heat equation
wτ (τ, x) = wxx (τ, x)
9.7 Higher Order Differentials and Taylor’s Theorem
Higher order differentials of a function f of several variables are analogs
of higher order derivatives of functions of a single variable. These may be
conveniently expressed in terms of higher order partial derivatives of f . An
important consequence is Taylor’s theorem in n-dimensions, which is used to
establish convergence of power series in several variables.
We begin by giving an alternate description of the space $L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$.
For a member $B$ of this space and each $h \in \mathbb{R}^n$, $Bh \in L(\mathbb{R}^n, \mathbb{R})$ has matrix
\[
\big[(Bh)e^1 \ \cdots\ (Bh)e^n\big],
\]
which we identify with the vector $\big((Bh)e^1, \dots, (Bh)e^n\big)$, so that $(Bh)k$ may
be written $(Bh)\cdot k$. Now define
\[
\tilde B(h, k) := (Bh)\cdot k,\quad h, k \in \mathbb{R}^n.
\]
Clearly, B̃ is linear in h for each fixed k and linear in k for each fixed h. Such
a function is called a bilinear functional on $\mathbb{R}^n$. Using the bilinearity, we have
\[
\tilde B(h, k) = \tilde B\Big(\sum_{i=1}^n h_ie^i,\ \sum_{j=1}^n k_je^j\Big) = \sum_{i=1}^n\sum_{j=1}^n B_{ij}h_ik_j,
\tag{9.22}
\]
where $B_{ij} := \tilde B(e^i, e^j) = (Be^i)\cdot e^j$. In matrix notation,
\[
\tilde B(h, k) = [h_1 \ \cdots\ h_n]\,\big[B_{ij}\big]\begin{bmatrix} k_1\\ \vdots\\ k_n \end{bmatrix}.
\]
Conversely, given any bilinear functional $\tilde B$ on $\mathbb{R}^n$, the equation $(Bh)k := \tilde B(h, k)$
defines a member $B$ of $L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$. Thus, identifying $B$ with $\tilde B$,
we see that $L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ may be viewed as the vector space of all bilinear
functionals on $\mathbb{R}^n$.
Now let U ⊆ Rn be open and let f : U → R be C 2 on U . Then df is
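a function on $U$ taking values in $L(\mathbb{R}^n, \mathbb{R})$. Identifying $df$ with the vector
function $\nabla f = (\partial_1f, \dots, \partial_nf)$, we define $d^2f_x \in L(\mathbb{R}^n, L(\mathbb{R}^n, \mathbb{R}))$ by
\[
d^2f_x = d(df)_x = d(\nabla f)_x,
\]
that is, by the above identification,
\[
d^2f_x(h, k) = \big[d(\nabla f)_x(h)\big]\cdot k,\quad x \in U,\ h, k \in \mathbb{R}^n.
\]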
The matrix of d(∇f )x has (i, j) entry ∂j ∂i f (x) = ∂i ∂j f (x), since f is C 2 .
Thus
\[
d^2f_x(h, k) = \sum_{i,j=1}^n \frac{\partial^2f(x)}{\partial x_i\partial x_j}\,h_ik_j,\quad h, k \in \mathbb{R}^n.
\]
The bilinear function d2 fx is called the second order differential of f at x.
For higher order differentials, we need the following generalization of a
bilinear functional:
9.7.1 Definition. An m-multilinear functional on $\mathbb{R}^n$ is a real-valued function
$M(h^1, \dots, h^m)$ of vectors $h^j = (h^j_1, \dots, h^j_n) \in \mathbb{R}^n$ that is linear in each variable
$h^j$ when the other variables are held fixed.
♦
Analogous to (9.22) we have
\[
M(h^1, \dots, h^m) = \sum_{j_1=1}^n\cdots\sum_{j_m=1}^n M_{j_1,\dots,j_m}\,h^1_{j_1}\cdots h^m_{j_m},
\tag{9.23}
\]
where $M_{j_1,\dots,j_m} := M(e^{j_1}, \dots, e^{j_m})$.
Now let f be C m on U , m ≥ 2. The mth order differential of f at x is
defined inductively by
\[
d^mf_x = d(d^{m-1}f)_x.
\]
As in the case m = 2, we may interpret $d^mf_x$ as the m-multilinear functional
\[
d^mf_x(h^1, \dots, h^m) = \sum_{j_1=1}^n\cdots\sum_{j_m=1}^n \frac{\partial^mf(x)}{\partial x_{j_1}\cdots\partial x_{j_m}}\,h^1_{j_1}\cdots h^m_{j_m}.
\]
The mth total differential Dm fx of f at x is then defined by
\[
D^mf_x(h) := d^mf_x(h, \dots, h) = \sum_{j_1=1}^n\cdots\sum_{j_m=1}^n \frac{\partial^mf(x)}{\partial x_{j_1}\cdots\partial x_{j_m}}\,h_{j_1}\cdots h_{j_m},\quad h := (h_1, \dots, h_n),
\tag{9.24}
\]
which is frequently written
\[
D^mf_x = \sum_{j_1,j_2,\dots,j_m=1}^n \frac{\partial^mf(x)}{\partial x_{j_1}\partial x_{j_2}\cdots\partial x_{j_m}}\,dx_{j_1}dx_{j_2}\cdots dx_{j_m},\quad dx_j(h) := h_j.
\]
By 9.6.3, each partial derivative in (9.24) may be expressed as
\[
\frac{\partial^mf}{\partial x_1^{m_1}\cdots\partial x_n^{m_n}},\quad m := m_1 + \cdots + m_n,\ m_j \in \mathbb{Z}^+.
\]
Similarly, the corresponding product of h’s in (9.24) may be written in the
form $h_1^{m_1}\cdots h_n^{m_n}$. For a fixed multi-index $(m_1, \dots, m_n) \in \mathbb{Z}^+\times\cdots\times\mathbb{Z}^+$, the
number of terms in (9.24) of the form
\[
\frac{\partial^mf}{\partial x_1^{m_1}\cdots\partial x_n^{m_n}}\,h_1^{m_1}\cdots h_n^{m_n}
\]
is given by the multinomial coefficient
\[
\binom{m}{m_1, m_2, \dots, m_n} = \frac{m!}{m_1!\,m_2!\cdots m_n!},
\tag{9.25}
\]
which is the number of distinct ways of arranging m objects, where m1 are
alike, m2 are alike, etc. With this notation, (9.24) may be written
\[
D^mf_x(h) = \sum\binom{m}{m_1, \dots, m_n}\frac{\partial^mf(x)}{\partial x_1^{m_1}\cdots\partial x_n^{m_n}}\,h_1^{m_1}\cdots h_n^{m_n},
\tag{9.26}
\]
or, in differential notation,
\[
D^mf_x = \sum\binom{m}{m_1, \dots, m_n}\frac{\partial^mf(x)}{\partial x_1^{m_1}\cdots\partial x_n^{m_n}}\,(dx_1)^{m_1}\cdots(dx_n)^{m_n},
\]
where the sums are taken over all multi-indices (m1 , . . . , mn ) ∈ Z+ × · · · × Z+
for which m1 + · · · + mn = m.
We may go a step further by appealing to the following generalization of
the binomial theorem.
9.7.2 Multinomial Theorem. Let $h_1, \dots, h_n \in \mathbb{R}$ and $m \in \mathbb{N}$. Then
\[
(h_1 + \cdots + h_n)^m = \sum\binom{m}{m_1, \dots, m_n}h_1^{m_1}\cdots h_n^{m_n},
\tag{9.27}
\]
where the summation is taken over all multi-indices (m1 , . . . , mn ) for which
m1 + · · · + mn = m.
Proof. The theorem may be proved by induction, but we give a combinatorial
argument instead. The left side of (9.27) expands into a sum of products of
the form x1 · · · xm , where each xi is one of the terms in the sum h1 + · · · + hn .
Each such product may be written uniquely as $h_1^{m_1}\cdots h_n^{m_n}$, where $m_j \ge 0$
and $m_1 + \cdots + m_n = m$. For each fixed $(m_1, \dots, m_n)$, the number of products
of this form is the number of ways $m_1$ factors in the product $x_1\cdots x_m$ may
be chosen to be $h_1$, $m_2$ factors may be chosen to be $h_2$, etc. This number is
precisely the multinomial coefficient (9.25).
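The counting argument in the proof can be replayed by brute force for small n and m. The sketch below is illustrative only; the function names `multinomial`, `count_products`, and `expand_power` are hypothetical helpers introduced here, not notation from the text.

```python
from itertools import product
from math import factorial

def multinomial(ms):
    # m! / (m_1! m_2! ... m_n!) as in (9.25); each division below is exact.
    c = factorial(sum(ms))
    for mj in ms:
        c //= factorial(mj)
    return c

def count_products(n, m):
    # Enumerate all n^m products x_1 ... x_m with each x_i chosen from h_1..h_n
    # (represented by the indices 0..n-1) and count them by exponent tuple.
    counts = {}
    for choice in product(range(n), repeat=m):
        key = tuple(choice.count(j) for j in range(n))
        counts[key] = counts.get(key, 0) + 1
    return counts

def expand_power(h, m):
    # Right side of (9.27) evaluated at the numbers h = (h_1, ..., h_n).
    total = 0
    for ms in count_products(len(h), m):
        term = multinomial(ms)
        for hj, mj in zip(h, ms):
            term *= hj**mj
        total += term
    return total

counts = count_products(3, 4)
print(all(multinomial(ms) == c for ms, c in counts.items()))
print(expand_power((2, 3, 5), 4) == (2 + 3 + 5)**4)
```

Each exponent tuple is reached exactly as many times as the multinomial coefficient predicts, which is the content of the combinatorial proof.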
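Now consider the operator $h_i\dfrac{\partial}{\partial x_i}$, which takes a $C^1$ function $f$ to the
function $h_i\dfrac{\partial f}{\partial x_i}$. If multiplication of such operators is defined as operator
composition, then the usual laws of algebra hold. For example, the operator
\[
\Big(h_1\frac{\partial}{\partial x_1} + h_2\frac{\partial}{\partial x_2}\Big)\Big(h_2\frac{\partial}{\partial x_2}\Big)
\]
applied to a $C^2$ function $f$ yields
\[
h_1h_2\frac{\partial^2f}{\partial x_1\partial x_2} + h_2^2\frac{\partial^2f}{\partial x_2^2}
= \Big(h_1h_2\frac{\partial^2}{\partial x_1\partial x_2} + h_2^2\frac{\partial^2}{\partial x_2^2}\Big)f,
\]
hence we may write
\[
\Big(h_1\frac{\partial}{\partial x_1} + h_2\frac{\partial}{\partial x_2}\Big)\Big(h_2\frac{\partial}{\partial x_2}\Big)
= h_1h_2\frac{\partial^2}{\partial x_1\partial x_2} + h_2^2\frac{\partial^2}{\partial x_2^2}.
\]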
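Similarly,
\[
\Big(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\Big)^2
= h^2\frac{\partial^2}{\partial x^2} + 2hk\frac{\partial^2}{\partial x\partial y} + k^2\frac{\partial^2}{\partial y^2}.
\]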
The last example suggests that the multinomial theorem is valid in this setting.
This is indeed the case (a similar proof works). It follows from (9.26) that the
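mth total differential may be written in operator form as
\[
D^mf_x(h) = \Big(h_1\frac{\partial}{\partial x_1} + h_2\frac{\partial}{\partial x_2} + \cdots + h_n\frac{\partial}{\partial x_n}\Big)^mf(x) = (h\cdot\nabla)^mf(x).
\]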
We may now state the n-dimensional version of Taylor’s theorem.
9.7.3 Taylor’s Theorem. Let U ⊆ Rn be open, m ∈ N, and let f : U → R
be C m+1 on U . Then for each pair of distinct points a, x ∈ U for which
[a : x] ⊆ U , there exists a point c ∈ [a : x] depending on x and a such that
\[
f(x) = \sum_{p=0}^m \frac{1}{p!}\,(h\cdot\nabla)^pf(a) + \frac{1}{(m+1)!}\,(h\cdot\nabla)^{m+1}f(c),\quad h := x - a.
\tag{9.28}
\]
Proof. The line segment [a : x] is described by
ϕ(t) := (1 − t)a + tx = a + th, 0 ≤ t ≤ 1.
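Since $U$ is open, there exists an $r > 0$ such that $\varphi\big((-r, 1 + r)\big) \subseteq U$. Let
$F = f\circ\varphi$. By the chain rule,
\[
F'(t) = \sum_{j=1}^n \frac{\partial f(\varphi(t))}{\partial x_j}\,h_j
\quad\text{and}\quad
\frac{d}{dt}\,\frac{\partial f(\varphi(t))}{\partial x_j} = \sum_{i=1}^n \frac{\partial^2f(\varphi(t))}{\partial x_i\partial x_j}\,h_i,
\]
hence
\[
F''(t) = \sum_{i,j=1}^n \frac{\partial^2f(\varphi(t))}{\partial x_i\partial x_j}\,h_ih_j.
\]
An induction argument shows that
\[
F^{(p)}(t) = \sum_{j_1,\dots,j_p=1}^n \frac{\partial^pf(\varphi(t))}{\partial x_{j_1}\cdots\partial x_{j_p}}\,h_{j_1}\cdots h_{j_p} = (h\cdot\nabla)^pf(\varphi(t)).
\]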
By Taylor’s theorem in one variable, there exists $t_0 \in (0, 1)$ such that
\[
f(x) = F(1) = \sum_{p=0}^m \frac{F^{(p)}(0)}{p!} + \frac{F^{(m+1)}(t_0)}{(m+1)!}.
\]
Setting $c = \varphi(t_0)$ completes the proof.
The summation in (9.28) is called an mth order Taylor polynomial about a
and is denoted by Tm (x, a). For example, the second order Taylor polynomial
of a C 2 function f (x1 , x2 ) is
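\[
f + h_1\frac{\partial f}{\partial x_1} + h_2\frac{\partial f}{\partial x_2}
+ \frac12h_1^2\frac{\partial^2f}{\partial x_1^2} + h_1h_2\frac{\partial^2f}{\partial x_1\partial x_2} + \frac12h_2^2\frac{\partial^2f}{\partial x_2^2},
\]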
where hj = xj − aj and the terms are evaluated at (a1 , a2 ). The last term in
(9.28) is called the remainder term and is denoted by Rm (x, a).
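As an illustration of the second order Taylor polynomial, the sketch below evaluates $T_2(x, a)$ for one specific function. The function $f(x_1, x_2) = e^{x_1}\cos x_2$, the point `a`, and the helper name `taylor2` are choices made here for illustration; the partial derivatives are entered by hand.

```python
import math

def f(x1, x2):
    # Hypothetical example function, smooth on R^2.
    return math.exp(x1) * math.cos(x2)

def taylor2(x, a):
    # Second order Taylor polynomial of f about a, evaluated at x.
    a1, a2 = a
    h1, h2 = x[0] - a1, x[1] - a2
    e, c, s = math.exp(a1), math.cos(a2), math.sin(a2)
    f0 = e * c                            # f(a)
    f1, f2 = e * c, -e * s                # first partials at a
    f11, f12, f22 = e * c, -e * s, -e * c  # second partials at a
    return (f0 + h1 * f1 + h2 * f2
            + 0.5 * h1**2 * f11 + h1 * h2 * f12 + 0.5 * h2**2 * f22)

a = (0.2, 0.5)
for t in (0.1, 0.05):
    x = (a[0] + t, a[1] + t)
    print(t, abs(f(*x) - taylor2(x, a)))
```

Halving the step should shrink the error by roughly a factor of eight, consistent with a remainder of third order in h.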
The following theorem gives a sufficient condition for a C ∞ function to be
expressed as a multi-variable Taylor’s series.
9.7.4 Taylor Series Representation. Let U ⊆ Rn be open and convex and
let f : U → R be C ∞ on U . Suppose that for some M < +∞
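\[
\left|\frac{\partial^pf(x)}{\partial x_1^{p_1}\partial x_2^{p_2}\cdots\partial x_n^{p_n}}\right| \le M
\]
for all $x \in U$, $p \in \mathbb{N}$, and all $p_j \in \mathbb{Z}^+$, where $p = p_1 + \cdots + p_n$. Then
\[
f(x) = \sum_{p=0}^\infty \frac{1}{p!}\,(h\cdot\nabla)^pf(a),\quad a, x \in U,\ h := x - a.
\tag{9.29}
\]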
Proof. By 9.7.3, the theorem will follow if we show that the remainder term
\[
R_m(x, a) = \frac{1}{(m+1)!}\,(h\cdot\nabla)^{m+1}f(c)
\]
tends to zero as m → ∞. By (9.26),
\[
|R_m(x, a)| \le \frac{M}{(m+1)!}\sum\binom{m+1}{m_1, \dots, m_n}|h_1|^{m_1}|h_2|^{m_2}\cdots|h_n|^{m_n},
\]
where the summation is taken over all multi-indices $(m_1, \dots, m_n)$ for which
$m_1 + \cdots + m_n = m + 1$. By the multinomial theorem, this sum is $\|h\|_1^{m+1}$, where
$\|h\|_1 := |h_1| + |h_2| + \cdots + |h_n|$. Therefore,
\[
|R_m(x, a)| \le \frac{M\,\|h\|_1^{m+1}}{(m+1)!},
\]
which implies $\lim_m R_m(x, a) = 0$.
The series on the right in (9.29) is called the Taylor series for f about a.
While the theorem may be applied directly, in many cases it is easier to make
use of single variable series. For example, from the series expansion for ex we
have
\[
e^{xy} = \sum_{n=0}^\infty \frac{x^ny^n}{n!}
\quad\text{and}\quad
e^{x+y} = e^xe^y = \sum_{n=0}^\infty\sum_{j=0}^n \frac{x^jy^{n-j}}{j!\,(n-j)!}.
\]
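The double series for $e^{x+y}$ above can be sanity-checked by summing finitely many terms. This is a minimal sketch; the helper name `exp_sum_partial` and the truncation level `N` are illustrative assumptions.

```python
import math

def exp_sum_partial(x, y, N):
    # Partial sum over n = 0..N of sum_{j=0}^{n} x^j y^(n-j) / (j! (n-j)!).
    total = 0.0
    for n in range(N + 1):
        for j in range(n + 1):
            total += x**j * y**(n - j) / (math.factorial(j) * math.factorial(n - j))
    return total

x, y = 0.5, -1.2
print(exp_sum_partial(x, y, 25), math.exp(x + y))
```

The inner sum over j equals $(x + y)^n/n!$ by the binomial theorem, so the partial sums are exactly those of the single variable exponential series at x + y.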
Exercises
1. Let f be of class C 3 . Write out explicitly
(a) D2 f (x, y).
(b)S D3 f (x, y).
(c) D2 f (x, y, z).
2.S Calculate $D^2f$ for the functions $f(x, y) =$
(a) $x^3y^2 + x^2y^3$. (b) $\dfrac{1}{x^2y}$. (c) $\sin(xy)$. (d) $e^{x^2+y^2}$. (e) $\ln(x^2 + y)$.
3.S Find Dm+n+1 (xm y n ).
4. Let f (x, y) be C n . Show that for 1 ≤ k ≤ n,
\[
\frac{\partial^k}{\partial t^k}f(tx, ty)\Big|_{t=1} = \big((x, y)\cdot\nabla\big)^kf(x, y).
\]
Conclude that if f is homogeneous of degree n (Exercise 9.3.15), then
\[
\big((x, y)\cdot\nabla\big)^kf(x, y) = n(n - 1)\cdots(n - k + 1)f(x, y).
\]
5. Write out explicitly
(a)S the first order Taylor polynomial for a C 1 function f (x1 , x2 , x3 ).
(b) the third order Taylor polynomial for a C 3 function f (x1 , x2 ),
6. A polynomial of degree m + n in two variables x and y is a function of
the form
\[
\sum_{i=0}^m\sum_{j=0}^n a_{ij}x^iy^j,\quad\text{where } a_{ij} \in \mathbb{R} \text{ and } a_{mn} \ne 0.
\]
Prove that f (x, y) is a polynomial of degree ≤ p on Br (a, b) iff
Dp+1 f (x, y) = 0 for all (x, y) ∈ Br (a, b).
7. Let P (x, y) be a polynomial in x, y. Prove that the polynomials P (x ± 1)
may be written as linear combinations of derivatives
\[
\frac{\partial^kP(x, y)}{\partial x^i\partial y^j},\quad k = i + j.
\]
8.S Let $\varphi(t)$ be of class $C^m$ on an interval $(-r, r)$ and let $f(x) = \varphi(b\cdot x)$,
where $b, x \in \mathbb{R}^n$. Show that the Taylor polynomial for $f$ of order $m$
about 0 is
\[
\sum_{p=0}^m \frac{\varphi^{(p)}(0)}{p!}\,(b\cdot x)^p.
\]
9. Let U ⊆ R2 be open and connected and let f be C ∞ on U such that for
each (x, y) ∈ U there exists r > 0 and p ∈ N depending on (x, y) such
that Dp f = 0 on Br (x, y). Prove that there exists a single p ∈ N such
that Dp f = 0 on U . Hint. Use Exercise 6.
10. Let U ⊆ Rn be open and let f be C p on U such that all partial derivatives
of f of order r < p vanish throughout U . Let C be a compact convex
subset of U . Prove that there exists c < +∞ such that
\[
\|f(x) - f(y)\| \le c\,\|x - y\|^p,\quad x, y \in C.
\]
11. Use the one variable Taylor series to find third order Taylor polynomials
with a = (0, 0) for the functions
(a)S $\sin(x + y)$. (b) $\cos\sqrt{xy}$. (c) $\ln(1 - x - y)^{-1}$.
(d)S $\arctan(x + y)$. (e) $e^{2x+3y}$. (f) $\dfrac{y}{1 + xy}$.
*9.8 Optimization
Throughout the section, f : U → R denotes a
C 1 function on an open subset U of Rn .
In this section we use differential theory to find the maximum and minimum
values of f on subsets E of U . The first step is to find all local extrema.
Local Extrema and Critical Points
9.8.1 Definition. Let a ∈ U . If f (a) is the maximum (minimum) value of f
on some ball in U with center a, then f is said to have a local maximum (local
minimum) at a. In either case, f is said to have a local extremum at a. ♦
The following theorem gives a necessary condition for the existence of a
local extremum.
9.8.2 Local Extremum Theorem. If f has a local extremum at a, then
dfa = 0.
Proof. For each j, the function $g(t) := f(a_1, \dots, a_{j-1}, t, a_{j+1}, \dots, a_n)$ has a local extremum at $t = a_j$, hence, by the single variable local extremum theorem
(4.2.2), $\partial_jf(a_1, \dots, a_n) = g'(a_j) = 0$.
9.8.3 Definition. A point a ∈ U is called a critical point of f if dfa = 0. A
critical point a is a local maximum (local minimum) point if f has a local
maximum (local minimum) at a. If a is neither a local maximum nor a local
minimum point, then a is called a saddle point.
♦
FIGURE 9.2: Saddle point.
By definition, a critical point a is a saddle point iff in each ball Br (a)
there exist points x and y such that f (x) < f (a) < f (y). This means that
the graph of f rises in some directions from a and falls in others. A familiar
example is f (x, y) = y 2 − x2 at (0, 0) (Figure 9.2).
Second Derivative Test
The following theorem gives sufficient conditions for a critical point of a
function f to be a local maximum point, a local minimum point, or a saddle
point. It may be seen as an extension of the second derivative test for functions
of one variable.
9.8.4 Second Derivative Test. Let f be C 2 on U and let a ∈ U be a critical
point of f .
(a) If $D^2f_a(h) > 0$ for all $h \ne 0$, then a is a local minimum point.
(b) If $D^2f_a(h) < 0$ for all $h \ne 0$, then a is a local maximum point.
(c) If D2 fa (h) > 0 for some h and D2 fa (k) < 0 for some k, then a is a
saddle point of f .
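Proof. Choose $r > 0$ such that $B_r(a) \subseteq U$. By (9.28) with $m = 1$ (the first order term vanishes because a is a critical point), for each $h$
with $\|h\| < r$ there exists $c \in [a : a + h]$ such that
\[
f(a + h) - f(a) = \tfrac12D^2f_c(h) = \tfrac12\big[D^2f_a(h) + \eta(h)\big],
\tag{9.30}
\]
where
\[
\eta(h) = D^2f_c(h) - D^2f_a(h) = \sum_{i,j=1}^n h_ih_j\left[\frac{\partial^2f(c)}{\partial x_i\partial x_j} - \frac{\partial^2f(a)}{\partial x_i\partial x_j}\right].
\]
Set
\[
\varepsilon(h) = \begin{cases} \|h\|^{-2}\eta(h) & \text{if } \|h\| \ne 0,\\ 0 & \text{if } \|h\| = 0. \end{cases}
\]
Since $|h_ih_j| \le \|h\|^2$,
\[
|\varepsilon(h)| \le \sum_{i,j=1}^n\left|\frac{\partial^2f(c)}{\partial x_i\partial x_j} - \frac{\partial^2f(a)}{\partial x_i\partial x_j}\right|.
\]
Since $f$ is $C^2$ and $c \to a$ as $h \to 0$, $\lim_{h\to0}\varepsilon(h) = 0$.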
With these preliminaries out of the way, assume that the hypothesis in
(a) holds. Since the function D2 fa (h) is continuous in h, it has a positive
minimum $m$ on the sphere $S_1(0)$ in $\mathbb{R}^n$. Thus
\[
D^2f_a(h) = \|h\|^2\,D^2f_a\Big(\frac{h}{\|h\|}\Big) \ge m\|h\|^2,\quad h \ne 0,
\]
so from (9.30)
\[
f(a + h) - f(a) \ge \tfrac12\big[m\|h\|^2 + \eta(h)\big] = \tfrac12\big[m + \varepsilon(h)\big]\|h\|^2.
\]
Since $m > 0$ and $\varepsilon(h) \to 0$, $f(a + h) - f(a) > 0$ for all $h \ne 0$ with sufficiently
small norm. This proves (a). Part (b) follows from (a) by considering $-f$.
To prove (c), suppose for some h, k that $D^2f_a(h) > 0$ and $D^2f_a(k) < 0$.
By (9.30),
\[
f(a + th) - f(a) = \frac{t^2}{2}\big[D^2f_a(h) + \|h\|^2\varepsilon(th)\big]
\]
for all $t > 0$ with $t\|h\| < r$. Since $\varepsilon(th) \to 0$ as $t \to 0$, $f(a + th) - f(a) > 0$ for all sufficiently small $t > 0$.
Similarly, $f(a + tk) - f(a) < 0$ for all sufficiently small $t > 0$.
9.8.5 Example. Let f (x, y, z) = x2 + y 2 + xy + 3x + sin2 z. The system
fx = 2x + y + 3 = 0, fy = x + 2y = 0, fz = sin(2z) = 0
has solutions an = (−2, 1, nπ/2), n ∈ Z. From
fxx = fyy = 2, fzz = 2 cos(2z), fxy = 1, and fxz = fyz = 0,
we have
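\[
\begin{aligned}
D^2f(h, k, \ell) &= \Big(h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y} + \ell\frac{\partial}{\partial z}\Big)^2f\\
&= h^2f_{xx} + k^2f_{yy} + \ell^2f_{zz} + 2(hkf_{xy} + h\ell f_{xz} + k\ell f_{yz})\\
&= 2\big[h^2 + k^2 + hk + \ell^2\cos(2z)\big].
\end{aligned}
\]
Therefore,
\[
D^2f_{a_n}(h, k, \ell) = \begin{cases} 2(h^2 + k^2 + hk + \ell^2) & \text{if } n \text{ is even},\\ 2(h^2 + k^2 + hk - \ell^2) & \text{if } n \text{ is odd}. \end{cases}
\]
Since $h^2 + k^2 + hk > 0$ for all $(h, k) \ne (0, 0)$, $a_n$ is a local minimum point if n is even and a
saddle point if n is odd.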
♦
The second derivative test gives no information if D2 fa = 0. For example,
the critical point (0, 0) of the function f (x, y) = xn + y 2 , n ≥ 3, is a saddle
point if n is odd and a local minimum point if n is even.
For n = 2, there is a simpler version of the second derivative test:
9.8.6 Corollary. Let U ⊆ R2 be open and let f : U → R be C 2 on U . For a
critical point (a, b) of f , set
\[
\Delta = \Delta(a, b) = \begin{vmatrix} f_{xx}(a, b) & f_{xy}(a, b)\\ f_{yx}(a, b) & f_{yy}(a, b) \end{vmatrix} = f_{xx}(a, b)f_{yy}(a, b) - f_{xy}^2(a, b).
\]
(a) If ∆ > 0 and fxx (a, b) > 0, then (a, b) is a local minimum point.
(b) If ∆ > 0 and fxx (a, b) < 0, then (a, b) is a local maximum point.
(c) If ∆ < 0, then (a, b) is a saddle point of f .
Proof. Let
α = fxx (a, b), β = fxy (a, b), and γ = fyy (a, b).
Then ∆ = αγ − β 2 and
D2 f(a,b) (h, k) = αh2 + 2βhk + γk 2 ,
h, k ∈ R.
(9.31)
If $\alpha \ne 0$, completing the square yields
\[
D^2f_{(a,b)}(h, k) = \alpha\Big(h + \frac{k\beta}{\alpha}\Big)^2 + \frac{k^2(\alpha\gamma - \beta^2)}{\alpha} = \alpha\Big(h + \frac{k\beta}{\alpha}\Big)^2 + \frac{k^2\Delta}{\alpha}.
\]
Thus if $\Delta > 0$, $\alpha > 0$, and $(h, k) \ne (0, 0)$, then $D^2f_{(a,b)}(h, k) > 0$, hence, by the
theorem, (a) holds. A similar argument proves (b).
Now suppose $\Delta < 0$. If $\alpha \ne 0$, then from (9.31)
\[
D^2f_{(a,b)}(1, 0) = \alpha \quad\text{and}\quad D^2f_{(a,b)}(-\beta\alpha^{-1}, 1) = \frac{\Delta}{\alpha},
\]
which have opposite signs. If $\gamma \ne 0$, then completing the square yields
\[
D^2f_{(a,b)}(h, k) = \gamma\Big(k + \frac{h\beta}{\gamma}\Big)^2 + \frac{h^2\Delta}{\gamma},
\]
and one may argue similarly. (This also shows that (a) and (b) hold with $f_{xx}$
in the statement replaced by $f_{yy}$.) Finally, if $\alpha = \gamma = 0$, then $\beta \ne 0$, and
(9.31) shows that, again, $D^2f_{(a,b)}(h, k)$ has positive and negative values. This
proves (c).
9.8.7 Example. Let f (x, y) = 3x2 y + 2xy 2 − 6xy. Since
fx (x, y) = 2y(3x + y − 3)
and fy (x, y) = x(3x + 4y − 6),
the critical points are (0, 0), (2, 0), (0, 3), and (2/3, 1).
TABLE 9.1: Values of ∆.
(a, b)      fxx(a, b)   fyy(a, b)   fxy(a, b)   ∆(a, b)
(0, 0)      0           0           −6          −36
(2, 0)      0           8           6           −36
(0, 3)      18          0           6           −36
(2/3, 1)    6           8/3         2           12
Table 9.1 shows that f has three saddle points and one local minimum
point.
♦
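The entries of Table 9.1 can be recomputed mechanically. The following sketch hard-codes the second partials of $f(x, y) = 3x^2y + 2xy^2 - 6xy$, namely $f_{xx} = 6y$, $f_{yy} = 4x$, $f_{xy} = 6x + 4y - 6$, and applies Corollary 9.8.6; the function names are hypothetical, and exact rational arithmetic is used only so that the point (2/3, 1) is handled without rounding.

```python
from fractions import Fraction

def second_partials(x, y):
    # f_xx, f_yy, f_xy for f(x, y) = 3x^2 y + 2x y^2 - 6xy.
    return 6 * y, 4 * x, 6 * x + 4 * y - 6

def classify(x, y):
    # Two-variable second derivative test (Corollary 9.8.6).
    fxx, fyy, fxy = second_partials(x, y)
    delta = fxx * fyy - fxy**2
    if delta > 0:
        kind = "local minimum" if fxx > 0 else "local maximum"
    elif delta < 0:
        kind = "saddle"
    else:
        kind = "inconclusive"
    return delta, kind

for point in [(0, 0), (2, 0), (0, 3), (Fraction(2, 3), 1)]:
    print(point, classify(*point))
```

The sketch reports three saddle points and one local minimum point, in agreement with the table.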
9.8.8 Example. Let
\[
f(x, y) = (cx^2 + y^2)e^{-x^2-y^2},\quad c \ne 0, 1.
\]
The system
\[
f_x = 2xe^{-x^2-y^2}(c - cx^2 - y^2) = 0,\quad f_y = 2ye^{-x^2-y^2}(1 - cx^2 - y^2) = 0
\]
has solutions (0, 0), (0, ±1), and (±1, 0). The second partial derivatives are
\[
\begin{aligned}
f_{xx} &= 2e^{-x^2-y^2}\big[c - 3cx^2 - y^2 + 2x^2(cx^2 + y^2 - c)\big],\\
f_{yy} &= 2e^{-x^2-y^2}\big[1 - cx^2 - 3y^2 + 2y^2(cx^2 + y^2 - 1)\big],\ \text{and}\\
f_{xy} &= 4xye^{-x^2-y^2}\big[cx^2 + y^2 - c - 1\big].
\end{aligned}
\]
TABLE 9.2: Values of ∆.
(a, b)     fxx(a, b)     fyy(a, b)     fxy(a, b)   ∆(a, b)
(0, 0)     2c            2             0           4c
(0, 1)     2(c − 1)/e    −4/e          0           8(1 − c)/e²
(0, −1)    2(c − 1)/e    −4/e          0           8(1 − c)/e²
(1, 0)     −4c/e         2(1 − c)/e    0           8c(c − 1)/e²
(−1, 0)    −4c/e         2(1 − c)/e    0           8c(c − 1)/e²
The values of ∆ at the critical points (a, b) are given in Table 9.2. Assigning
values to c produces a variety of local extreme points. For example, if c > 1,
then (0, ±1) are saddle points, (0, 0) is a local minimum point, and (±1, 0) are
local maximum points of f .
♦
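For a concrete value of c, the second partial derivatives of Example 9.8.8 can be evaluated directly and fed to the two-variable test. The sketch below takes c = 2 as an illustrative assumption; the helper names are hypothetical.

```python
import math

def hessian_entries(x, y, c):
    # f_xx, f_yy, f_xy for f(x, y) = (c x^2 + y^2) exp(-x^2 - y^2).
    E = math.exp(-x * x - y * y)
    q = c * x * x + y * y
    fxx = 2 * E * (c - 3 * c * x * x - y * y + 2 * x * x * (q - c))
    fyy = 2 * E * (1 - c * x * x - 3 * y * y + 2 * y * y * (q - 1))
    fxy = 4 * x * y * E * (q - c - 1)
    return fxx, fyy, fxy

def classify(x, y, c):
    # Two-variable second derivative test (Corollary 9.8.6).
    fxx, fyy, fxy = hessian_entries(x, y, c)
    delta = fxx * fyy - fxy**2
    if delta > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if delta < 0:
        return "saddle"
    return "inconclusive"

c = 2.0  # illustrative choice with c > 1
for point in [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]:
    print(point, classify(*point, c))
```

With c = 2 the sketch reports (0, 0) as a local minimum point, (0, ±1) as saddle points, and (±1, 0) as local maximum points, which matches the fact that f ≥ 0 vanishes only at the origin and takes the value c/e at (±1, 0).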
Global Extrema
We now turn to the problem described at the beginning of the section,
namely, to find the points in a subset E of U at which f has a maximum
or a minimum. Such points, called global extrema, will always exist if E is
closed and bounded. The following examples illustrate a common technique
for finding them.
9.8.9 Example. Let
\[
f(x, y) = 2x^3 - x^2 + 3y^2,\quad E := \{(x, y) : x^2 + y^2 \le 1\}.
\]
By 9.8.2, the extreme values of f occur at points on bd(E) or at critical points
of f in int(E). Solving the system
fx = 6x2 − 2x = 0, fy = 6y = 0
yields the critical points (0, 0) and (1/3, 0), which are candidates for extrema
in int(E). To find possible extrema on bd(E) we substitute 1 − x2 for y 2 in
the expression for f to obtain the function
F (x) = 2x3 − 4x2 + 3, −1 ≤ x ≤ 1.
Since the only zero of F 0 (x) in [−1, 1] is x = 0, single variable optimization
theory gives us the additional extrema candidates (0, ±1) and (±1, 0). Calculating the values of f at these six points shows that f (0, ±1) = 3 is the
maximum value of f on E and f (−1, 0) = −3 is the minimum.
♦
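The final step of Example 9.8.9, comparing the values of f at the six candidate points, is easy to reproduce. A minimal sketch, with variable names chosen here for illustration:

```python
def f(x, y):
    return 2 * x**3 - x**2 + 3 * y**2

# Interior critical points followed by the boundary candidates from F(x).
candidates = [(0.0, 0.0), (1 / 3, 0.0),
              (0.0, 1.0), (0.0, -1.0), (1.0, 0.0), (-1.0, 0.0)]

values = {p: f(*p) for p in candidates}
max_value = max(values.values())
min_value = min(values.values())
print(values)
print(max_value, min_value)
```

The largest value is attained at (0, ±1) and the smallest at (−1, 0), as stated in the example.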
9.8.10 Example. Let
\[
f(x, y, z) = (x - 1)^2 + (y - 2)^2 + z^2,\quad E := \{(x, y, z) : x^2 + y^2 + z^2 \le 6\}.
\]
The solution of the system fx = fy = fz = 0 is (1, 2, 0), at which f has minimum
value zero. The maximum of f must then occur on bd(E). Substituting the
expression 6 − x2 − y 2 for z 2 in the definition of f , we obtain the function
F (x, y) = (x − 1)2 + (y − 2)2 + 6 − x2 − y 2 = 11 − 2x − 4y, x2 + y 2 ≤ 6.
The system $F_x = F_y = 0$ has no solution, hence the extreme values of $F$ must
lie on the boundary $x^2 + y^2 = 6$. To find these values, let $x = \sqrt6\cos\theta$ and
$y = \sqrt6\sin\theta$, so
\[
F(x, y) = G(\theta) := 11 - 2\sqrt6\cos\theta - 4\sqrt6\sin\theta,\quad 0 \le \theta \le 2\pi.
\]
Applying single variable optimization techniques to $G$, we see that possible
extreme values of $F$ on $x^2 + y^2 = 6$ occur at points $(x, y)$ for which $\theta = 0$
or $\tan\theta = 2$, that is, $(x, y) = (\sqrt6, 0)$ and $\pm\big(\sqrt{6/5},\ 2\sqrt{6/5}\big)$. Calculating the
values of $F$ at these points shows that the maximum value of $f$ on $E$ is
\[
f\Big(-\sqrt{\tfrac65},\ -2\sqrt{\tfrac65},\ 0\Big) = 11 + 2\sqrt{30} \approx 22.
\]
♦
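The boundary computation in Example 9.8.10 can be cross-checked by sampling the function G on a fine grid of angles. This is only a numerical sanity check under the stated parametrization; the grid size is an arbitrary choice.

```python
import math

def f(x, y, z):
    return (x - 1)**2 + (y - 2)**2 + z**2

def G(theta):
    # F restricted to the circle x^2 + y^2 = 6.
    return 11 - 2 * math.sqrt(6) * math.cos(theta) - 4 * math.sqrt(6) * math.sin(theta)

s = math.sqrt(6 / 5)
claimed_max = f(-s, -2 * s, 0.0)  # value at the point named in the example

n_samples = 100_000
sampled_max = max(G(2 * math.pi * k / n_samples) for k in range(n_samples))
print(claimed_max, sampled_max)
```

Both numbers should be close to 21.95, consistent with the approximate value 22 quoted in the example.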
5
In the above examples, E was the closure of an open set whose boundary
is a smooth surface. In many important cases, however, E itself is a surface.
The surfaces we shall consider are of the form
E = {x ∈ U : g1 (x) = · · · = gm (x) = 0} ,
where U ⊆ Rn is open, m < n, and the functions gj are C 1 on U . The equations
gj (x) = 0 are then called constraints and E is the constraint set. If f (a) is the
maximum or minimum value of f on E, then f is said to have an extremum at
a subject to the constraints gj = 0.
9.8.11 Example. We find the points on the surface z 2 −x2 y = 1 closest to the
origin. This is equivalent to minimizing f (x, y, z) = x2 + y 2 + z 2 subject to the
constraint z 2 −x2 y−1 = 0. Since the surface is unbounded, it suffices to consider
that part of the surface inside a ball with center 0. To find the minimum, we
substitute z 2 = x2 y + 1 into f to obtain a function F (x, y) = x2 (1 + y) + y 2 + 1
defined on an open disk containing a point at which f is minimum. The critical
points of F , solutions of the system
Fx = 2x(1 + y) = 0, Fy = x2 + 2y = 0,
are $(0, 0)$ and $(\pm\sqrt2, -1)$. The last two are easily seen to be saddle points,
while (0, 0) is a local minimum point. Therefore, the minimum of f occurs at
(0, 0, ±1), hence the distance from the surface to the origin is 1.
♦
Lagrange Multipliers
In 9.8.11, it was possible to solve the constraint equation for one of the
variables in terms of the others, reducing the dimension by one, thereby
simplifying the problem. This is not always possible, but the implicit function
theorem may be used to solve the constraint equation locally. This is the
method used in the proof of the next theorem. For its statement, we use the
following notational conventions, similar to those used in the proof of the
implicit function theorem.
Notation. Let m < n and p := n − m. For points z ∈ Rn = Rm+p we write
z = (x, y) = (x1 , . . . xm , y1 , . . . yp ), x ∈ Rm , y ∈ Rp .
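If $G := (g_1, \dots, g_m) : U \to \mathbb{R}^m$, then $G(z)$ may be written as $G(x, y)$. If $G$ is
differentiable, we define
\[
G_x = \begin{bmatrix}
\dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_m}\\
\vdots & & \vdots\\
\dfrac{\partial g_m}{\partial x_1} & \cdots & \dfrac{\partial g_m}{\partial x_m}
\end{bmatrix}
\quad\text{and}\quad
G_y = \begin{bmatrix}
\dfrac{\partial g_1}{\partial y_1} & \cdots & \dfrac{\partial g_1}{\partial y_p}\\
\vdots & & \vdots\\
\dfrac{\partial g_m}{\partial y_1} & \cdots & \dfrac{\partial g_m}{\partial y_p}
\end{bmatrix}.
\]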
♦
9.8.12 Lagrange Multipliers. Let U ⊆ Rn be open and let f, gj : U → R,
j = 1, . . . , m < n be C 1 functions. Set G := (g1 , . . . , gm ). Suppose that f
has a global extremum at c = (a, b) ∈ U subject to the constraint G = 0. If
det Gx (c) 6= 0, then there exist constants λ1 , . . . , λm such that
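\[
\nabla f(c) = \sum_{i=1}^m \lambda_i\nabla g_i(c).
\tag{9.32}
\]
Proof. Equation (9.32) is the system
\[
\begin{aligned}
\partial_jf(c) &= \lambda_1\frac{\partial g_1}{\partial x_j} + \cdots + \lambda_m\frac{\partial g_m}{\partial x_j}, & j &= 1, \dots, m,\\
\partial_{j+m}f(c) &= \lambda_1\frac{\partial g_1}{\partial y_j} + \cdots + \lambda_m\frac{\partial g_m}{\partial y_j}, & j &= 1, \dots, p,
\end{aligned}
\]
which may be written in matrix form as
\[
[\lambda_1 \ \cdots\ \lambda_m]\,G_x(c) = [\partial_1f(c) \ \cdots\ \partial_mf(c)],
\tag{9.33}
\]
\[
[\lambda_1 \ \cdots\ \lambda_m]\,G_y(c) = [\partial_{m+1}f(c) \ \cdots\ \partial_nf(c)].
\tag{9.34}
\]
Equation (9.33) is satisfied by defining
\[
[\lambda_1 \ \cdots\ \lambda_m] := [\partial_1f(c) \ \cdots\ \partial_mf(c)]\,G_x^{-1}(c).
\tag{9.35}
\]
It remains to show that (9.34) is satisfied for this choice of $\lambda_1, \dots, \lambda_m$.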
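By the implicit function theorem applied to $G$, there is an open set $V_b \subseteq \mathbb{R}^p$
containing $b$ and a continuously differentiable mapping
\[
h = (h_1, \dots, h_m) : V_b \to \mathbb{R}^m
\]
such that
\[
h(b) = a \quad\text{and}\quad G\big(h(y), y\big) = 0 \ \text{for every } y \in V_b.
\]
Applying the chain rule to each component equation $g_i\big(h(y), y\big) = 0$ yields
\[
\frac{\partial g_i}{\partial x_1}\frac{\partial h_1}{\partial y_j} + \cdots + \frac{\partial g_i}{\partial x_m}\frac{\partial h_m}{\partial y_j} + \frac{\partial g_i}{\partial y_j} = 0,\quad i = 1, \dots, m,\ j = 1, \dots, p,
\]
which may be written in matrix form as
\[
\begin{bmatrix}
\dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_m}\\
\vdots & & \vdots\\
\dfrac{\partial g_m}{\partial x_1} & \cdots & \dfrac{\partial g_m}{\partial x_m}
\end{bmatrix}
\begin{bmatrix}
\dfrac{\partial h_1}{\partial y_1} & \cdots & \dfrac{\partial h_1}{\partial y_p}\\
\vdots & & \vdots\\
\dfrac{\partial h_m}{\partial y_1} & \cdots & \dfrac{\partial h_m}{\partial y_p}
\end{bmatrix}
= -
\begin{bmatrix}
\dfrac{\partial g_1}{\partial y_1} & \cdots & \dfrac{\partial g_1}{\partial y_p}\\
\vdots & & \vdots\\
\dfrac{\partial g_m}{\partial y_1} & \cdots & \dfrac{\partial g_m}{\partial y_p}
\end{bmatrix}
\]
or in the above notation as
\[
G_x(c)\,h'(b) = -G_y(c).
\]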
Multiplying the last equation on the left by $[\partial_1f(c) \ \cdots\ \partial_mf(c)]\,G_x^{-1}(c)$
and using (9.35), we obtain
\[
[\partial_1f(c) \ \cdots\ \partial_mf(c)]\,h'(b) = -[\lambda_1 \ \cdots\ \lambda_m]\,G_y(c).
\tag{9.36}
\]
Since $f\big(h(y), y\big)$ has a local extremum at $b$, its partial derivatives must vanish
there:
\[
\frac{\partial f(c)}{\partial x_1}\frac{\partial h_1(b)}{\partial y_j} + \cdots + \frac{\partial f(c)}{\partial x_m}\frac{\partial h_m(b)}{\partial y_j} + \frac{\partial f(c)}{\partial y_j} = 0,\quad j = 1, 2, \dots, p.
\]
In matrix form,
\[
[\partial_1f(c) \ \cdots\ \partial_mf(c)]\,h'(b) = -[\partial_{m+1}f(c) \ \cdots\ \partial_nf(c)].
\tag{9.37}
\]
Equation (9.34) now follows from (9.36) and (9.37).
9.8.13 Example. Let $c, x \in \mathbb{R}^n$, $c \ne 0$. We find the extreme values of
$f(x) := c\cdot x$ on the sphere $\|x\| = 1$, that is, subject to the constraint
$g(x) := \|x\|^2 - 1 = 0$. By Lagrange multipliers, the extreme values occur at
points $x$ for which $\nabla f(x) = \lambda\nabla g(x)$ for some $\lambda \in \mathbb{R}$. This leads to the system
$c_i = 2\lambda x_i$, $1 \le i \le n$. Squaring and adding yields $\|c\|^2 = 4\lambda^2\|x\|^2 = 4\lambda^2$,
hence $2\lambda = \pm\|c\|$ and $x = c/2\lambda = \pm c/\|c\|$. Therefore, the extreme values of $f$
are $f(\pm c/\|c\|) = \pm\|c\|$.
♦
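The conclusion of Example 9.8.13 can be illustrated numerically: the points ±c/‖c‖ satisfy the constraint and give the values ±‖c‖, and randomly sampled unit vectors never exceed that bound. The vector `c` below and the helper names are illustrative assumptions.

```python
import math
import random

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def extreme_points(c):
    # Candidates x = +c/||c|| and x = -c/||c|| from the Lagrange system c_i = 2 lambda x_i.
    n_c = norm(c)
    return [ci / n_c for ci in c], [-ci / n_c for ci in c]

def random_unit_vector(n, rng):
    # Normalized Gaussian sample; any nonzero vector would do after normalization.
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    n_v = norm(v)
    return [vi / n_v for vi in v]

c = [3.0, -4.0, 12.0]  # hypothetical data with norm 13
x_max, x_min = extreme_points(c)
print(dot(c, x_max), dot(c, x_min))

rng = random.Random(0)
print(max(dot(c, random_unit_vector(3, rng)) for _ in range(10_000)) <= norm(c))
```

The random check is not a proof; it merely shows the bound |c · x| ≤ ‖c‖ at work on the unit sphere.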
The last example has an important application to directional derivatives:
Let h be differentiable on Br (a). From Exercise 9.3.10, the directional derivative
Dx h(a) of h at a in the direction of a unit vector x is c · x, where c = ∇h(a).
Thus, by the example, $D_xh(a)$ is maximum when $x = c/\|c\|$, that is, when x
is in the direction of the gradient of h.
9.8.14 Example. Let x = (x1 , . . . , xn ), a = (a1 , . . . , an ), and c = (c1 , . . . , cn ),
where $x_j \ge 0$, $a_j > 0$, and $c_j > 0$. We find the maximum value of $f(x) =
x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}$ subject to the constraint $c\cdot x = 1$. Note that the conditions
xj ≥ 0 and cj > 0 imply that the constraint set is closed and bounded.
Set g(x) = c · x − 1. The maximum of f occurs at points x for which
∇f (x) = λ∇g(x) for some λ ∈ R. This leads to the equations
aj f (x) = λcj xj , j = 1, . . . , n.
(9.38)
Adding and using the constraint yields $af(x) = \lambda$, or $f(x) = \lambda/a$, where
$a = \sum_{j=1}^n a_j$. From (9.38), $a_j = ac_jx_j$, so the maximum occurs at the point
\[
\Big(\frac{a_1}{ac_1},\ \frac{a_2}{ac_2},\ \dots,\ \frac{a_n}{ac_n}\Big).
\]
In particular, if a1 = · · · = an = 1 and c1 = · · · = cn = 1/c, c > 0, then
f (x1 , x2 , . . . , xn ) = x1 x2 · · · xn has maximum f (c/n, . . . , c/n) = (c/n)n . Thus
x1 x2 · · · xn ≤ (c/n)n , or equivalently (x1 x2 · · · xn )1/n ≤ c/n for all xj > 0
satisfying x1 + · · · + xn = c. Since c is arbitrary, we obtain the classic result
\[
(x_1x_2\cdots x_n)^{1/n} \le \frac{x_1 + x_2 + \cdots + x_n}{n},\quad x_j \ge 0,
\]
which asserts that the geometric mean of nonnegative data does not exceed
the arithmetic mean.
♦
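The maximizer found in Example 9.8.14 can be tested against randomly generated feasible points. In the sketch below, the exponents `a`, the coefficients `c`, and the helper names are hypothetical choices for illustration; `math.prod` requires Python 3.8 or later.

```python
import math
import random

def f(x, a):
    # f(x) = x_1^{a_1} x_2^{a_2} ... x_n^{a_n}
    return math.prod(xj**aj for xj, aj in zip(x, a))

def maximizer(a, c):
    # The point (a_1/(a c_1), ..., a_n/(a c_n)) with a = a_1 + ... + a_n.
    total = sum(a)
    return [aj / (total * cj) for aj, cj in zip(a, c)]

def random_feasible_point(c, rng):
    # Positive weights summing to 1, rescaled so that c . x = 1.
    w = [rng.random() + 1e-9 for _ in c]
    s = sum(w)
    return [wj / (s * cj) for wj, cj in zip(w, c)]

a = [1.0, 2.0, 3.0]
c = [2.0, 0.5, 1.0]
x_star = maximizer(a, c)
print(x_star, sum(cj * xj for cj, xj in zip(c, x_star)), f(x_star, a))

rng = random.Random(0)
print(all(f(random_feasible_point(c, rng), a) <= f(x_star, a) for _ in range(10_000)))
```

With all exponents equal to 1 and all coefficients equal, the same comparison reduces to the arithmetic and geometric mean inequality stated at the end of the example.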
Exercises
1. In each case classify the critical point a := (π/2, π/2, π/2) of the function.
(a) (sin x)(sin y)(sin z).
(b) (sin x)(cos y)(cos z).
2.S Show that the function x2 + 2y 2 + 3z 2 − xy − yz − xz on R3 has minimum
value zero.
3. Find and classify the critical points of the following functions.
(a) S x3 + 2xy + 3x2 + y 2 .
(b) x3 + 3x2 y 2 − 6x2 − 12y 2 .
(c) x2 y 2 + 2/x + 2/y.
(d) S x4 + 2y 2 − 4xy.
(e) x−1 + y −1 + ln(x2 + y 2 ).
(f) S x−1 + y −1 + arctan(y/x).
(g) x3 − xy 2 + x2 − y 2 .
(h) x4 − 2x2 + 4y 3 − 12y.
(i) S xy − x2 y − xy 2 .
(j) x4 − 4x3 + 4x2 + y 2 .
4. Find the maximum and minimum values of each of the following functions
f on $\mathbb{R}^2 \setminus \{(0, 0)\}$.
(a)S $\dfrac{x + y}{\sqrt{x^2 + y^2}}$. (b) $\dfrac{x + \sqrt3\,y}{\sqrt{x^2 + y^2}}$. (c) $\dfrac{x + 2y}{\sqrt{x^2 + y^2}}$. (d) $\dfrac{x^2 + xy}{x^2 + y^2}$.
5. Show that the point (x, x2 ) on the curve y = x2 nearest the point (1, 2)
satisfies the equation 2x3 − 3x − 1 = 0.
In Exercises 6–9, use the method of 9.8.9 and 9.8.10.
6. Find the extreme values of the following functions on the disk $D := \{(x, y) : x^2 + y^2 \le 1\}$:
(a) 3x2 + 2y 2 − x. (b)S x2 + xy − x + y 2 . (c) cos(xy). (d)S sin(xy).
7.S Prove that the maximum of f (x, y) = x2 + ay 2 + (a − 1)y on the disk
$D := \{(x, y) : x^2 + y^2 \le 1\}$ occurs on bd(D).
8. Let $f(x, y) = x^2 + y^2 + axy$ on the disk $D := \{(x, y) : x^2 + y^2 \le 1\}$.
Prove that a maximum of f occurs on bd(D), and that a minimum of f
occurs on bd(D) iff |a| ≥ 2.
9. Show that the maximum (minimum) value of $f_n(x_1, \dots, x_n) = \sum_{i=1}^n x_i$
on $C_1(0)$ is $\sqrt n$ ($-\sqrt n$).
10.S Let f (x, y) = ax−1 + by −1 + xy, a, b > 0. Prove that f has a minimum
on (0, +∞) × (0, +∞) and that the minimum value is 3(ab)1/3 .
11.S Consider the data points $(x_i, y_i)$, $1 \le i \le n$, where $x_i \ne x_j$ for at least
one pair of points. The linear least squares fit is the line $y = mx + b$
with the property that the sum of squares of the vertical distances from
the data points to the line, namely, $\sum_{i=1}^n (y_i - mx_i - b)^2$, is minimum.
Show that
\[
m = \frac{x\cdot y - n\bar x\,\bar y}{\|x\|^2 - n\bar x^2}\quad\text{and}\quad b = \bar y - m\bar x,
\]
where
\[
x = (x_1, \dots, x_n),\quad y = (y_1, \dots, y_n),\quad \bar x := \frac1n\sum_{i=1}^n x_i,\quad\text{and}\quad \bar y := \frac1n\sum_{i=1}^n y_i.
\]
12. Let x, a, b, c ∈ Rn . Prove that f (x) := kx − ak2 + kx − bk2 + kx − ck2
has a minimum value and find the point at which it occurs.
In Exercises 13–28, use Lagrange multipliers.
13. Show that the maximum value of $f(x) = x_1x_2\cdots x_n$ on the set
$E := \{x : x_i \ge 0 \text{ and } \sum_{i=1}^n x_i \le 1\}$ is $n^{-n}$.
14. Find the maximum and minimum of 2x − 3y subject to the constraint
(x + 1)2 + (y − 1)2 = 1.
15.S Find the maximum and minimum of ax2 + 2bxy + y 2 subject to the
constraint $x^2 + y^2 = c^2$, where $abc \ne 0$ and $(a + 1)^2 + 4(a - b^2) \ge 0$.
16. Show that the point $(x, y, z)$ on the surface $x^2 + y^2 + z^2 = 1$ nearest the
point $(1, 2, 3)$ is $(1/\sqrt{14},\ 2/\sqrt{14},\ 3/\sqrt{14})$.
17.S Show that the point (x, y, z) on the surface z = x2 + y 2 nearest the
point (1, 2, 3) satisfies the equations
10x3 − 5x − 1 = 0, y = 2x, and z = x2 + y 2 = 5x2 .
18. Show that the point on the surface x2 + y 2 − z 2 = 1 nearest to the point
(1, 2, 3) is
x, 2x, 3x/(2x − 1)), where 20x4 − 20x3 − 8x2 + 4x − 1 = 0.
19.S Show that the point on the surface z 2 − x2 − y 2 = 1 nearest (1, 2, 3) is
x, 2x, 3x/(2x − 1)), where 20x4 − 20x3 − 4x + 1 = 0.
20. The intersection of the surfaces z = x2 + y 2 and x + y + z = 1 is an
ellipse lying above the xy plane. Find the highest and lowest points of
the ellipse.
21. Let a, b, c > 0. Show that the maximum and minimum values of the
function f (x, y, z) = ax + by + cz subject to the constraints x2 + z 2 = 1,
y 2 + z 2 = 1, x, y, z ≥ 0, are, respectively, √
the maximum and minimum of
the quantities c, a + b, and (a + b + cd)/ 1 + d2 , where d := c/(a + b).
22.S Find the maximum and minimum values of x + 2y + 3z subject to the
constraints x + y + z = 1 and x2 + y 2 + z 2 = 1.
23. Let a > 1/3. Show that the maximum value of xyz subject to the
constraints x + y + z = 1 and x2 + y 2 + z 2 = a is
r
1
3a − 1
2
3
xyz =
(1 − 3t + 2t ), where t =
.
27
2
24.S Let $x = (x_1, \dots, x_n)$, $a = (a_1, \dots, a_n) \ne 0$, and $b = (b_1, \dots, b_n)$. Show that the shortest distance from $b$ to the hyperplane $a \cdot x = c$ is $\sqrt{(c - a \cdot b)^2}\;\|a\|^{-1}$.
25. Let $p \ge 2$. Show that the largest distance from the surface $\sum_{i=1}^n |x_i|^p = 1$ to the origin is $n^{(p-2)/2p}$ and the smallest distance is 1.
26.S Find the distance from the point $a = (a_1, \dots, a_n)$ to the $(n - 1)$-dimensional sphere $\|x\| = 1$ in $\mathbb{R}^n$, where $a_j > 0$ and $\|a\| \ne 1$.
27. Let $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$, where $a_i, b_i > 0$.
(a)S Show that the minimum value of the function $a \cdot x$ subject to the constraint $\sum_{i=1}^n b_i/x_i = 1$, where $x_i > 0$, is $\bigl(\sum_{i=1}^n \sqrt{a_i b_i}\bigr)^2$.
(b) Show that the minimum value of $\sum_{i=1}^n a_i x_i^2$ subject to the constraint in (a) is $\bigl(\sum_{i=1}^n a_i^{1/3} b_i^{2/3}\bigr)^3$.
28. Let $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$, where $a_i, b_i > 0$ and $\sum_{i=1}^n b_i = 1$. Find the minimum value of $a \cdot x$ subject to the constraint $\prod_{j=1}^n x_j^{b_j} = 1$, where $x_i > 0$.
29. Let $U \subseteq \mathbb{R}^n$ be open and let $f : U \to \mathbb{R}$ be $C^2$ on $U$. Show that if f has a local maximum (minimum) at $a \in U$, then $D^2 f_a(h) \le 0$ ($\ge 0$) for all $h \in \mathbb{R}^n$.
30.S Prove the following generalization of Rolle’s theorem: Let $U \subseteq \mathbb{R}^n$ be bounded and open and let $f : U \to \mathbb{R}$ be differentiable on $U$, continuous on cl(U), and constant on bd(U). Then $f'(u) = 0$ for some $u \in U$.
31. Let $U \subseteq \mathbb{R}^n$ be open and $f : U \to \mathbb{R}^n$ be $C^1$ on $U$ such that $J_f \ne 0$ on $U$. Let $a \in U$ and let $C := C_r(a) \subseteq U$, $r > 0$. Prove that if $\sup_C \|f(x) - x\| < r/2$, then the equation $f(x) = a$ has a solution in C.
Chapter 10
Lebesgue Measure on Rn
The methods of Chapter 5 may be modified in a natural way to construct the
Riemann integral of a function of several variables. In Section 11.1, we briefly
describe how this is done. However, the main goal of the present chapter and
the next is to construct the more general Lebesgue integral. The choice to
develop the n-dimensional Lebesgue integral rather than the n-dimensional
Riemann integral is motivated by the fact that, as an analytical tool, the former
has several distinct advantages over the latter. For example, the Lebesgue
theory allows the interchange of limit and integral in more general settings.
Furthermore, the collection of Lebesgue integrable functions, which includes
unbounded functions on unbounded domains, is significantly larger than the
set of Riemann integrable functions. These advantages make the Lebesgue
theory better suited for applications based on, for example, probability theory
and, in particular, stochastic processes.
The key idea in Riemann integration on Rn is the partitioning of the
domain of the integrand f into n-dimensional subintervals. The Riemann
integral is then obtained as a limit of Riemann sums, that is, sums of function
values times the volumes of the subintervals. In Lebesgue integration, it is the
range of f rather than the domain that is partitioned into subintervals (see
Figure 10.7). This still produces a partition of the domain of f ; however, the
sets in this partition are generally more complicated than subintervals. The
Lebesgue integral is constructed by multiplying the measure of these sets by
function values, adding the results, and then taking limits. In this chapter we
construct the measure and in the next chapter we construct the integral. The
precise connection between the Riemann and Lebesgue integrals is made in
Section 11.4.
10.1
General Measure Theory
In this section we give a brief description of those aspects of measure theory
that will be needed to construct Lebesgue measure on Rn . For a comprehensive
treatment see, for example, [4].
Sigma Fields
10.1.1 Definition. A σ-field on a nonempty set S is a collection $\mathcal{F}$ of subsets of S such that
(a) $S, \emptyset \in \mathcal{F}$;
(b) $A \in \mathcal{F}$ implies $A^c \in \mathcal{F}$;
(c) $A_k \in \mathcal{F}$, $k \in \mathbb{N}$, implies $\bigcup_k A_k \in \mathcal{F}$.
♦
Part (c) of the definition says that $\mathcal{F}$ is closed under countable unions. By De Morgan’s law,
\[ \Bigl(\bigcup_k A_k\Bigr)^c = \bigcap_k A_k^c, \]
hence part (b) implies that $\mathcal{F}$ is also closed under countable intersections.
The collection of all subsets of S and the collection {∅, S} are simple
examples of σ-fields. The following examples are somewhat more interesting.
10.1.2 Example. If $\mathcal{A}$ is an arbitrary collection of subsets of S, then the σ-field generated by $\mathcal{A}$ is the intersection $\sigma(\mathcal{A})$ of all σ-fields containing $\mathcal{A}$. It is the smallest σ-field containing $\mathcal{A}$ in the sense that if $\mathcal{F}$ is a σ-field containing $\mathcal{A}$ then $\mathcal{F}$ contains $\sigma(\mathcal{A})$. In the special case where $\mathcal{A} = \{A_1, A_2, \dots\}$ is a countable partition of S, $\sigma(\mathcal{A})$ is simply the collection $\mathcal{F}$ of all unions of members of $\mathcal{A}$. Indeed, $\mathcal{F}$ is clearly closed under countable unions, and the calculation
\[ \Bigl(\bigcup_{k \in F} A_k\Bigr)^c = \bigcup_{k \in F^c} A_k, \quad F \subseteq \mathbb{N}, \]
shows that $\mathcal{F}$ is closed under complements. Thus, by minimality, $\sigma(\mathcal{A}) = \mathcal{F}$. ♦
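For a finite partition the description in 10.1.2 can be carried out by brute force. The sketch below is an illustration under assumptions: the set S, the partition, and the function name `sigma_field_from_partition` are hypothetical choices, and finiteness replaces countability so that every union can be listed.

```python
from itertools import combinations

def sigma_field_from_partition(blocks):
    """All unions of blocks of a finite partition, returned as a set of frozensets."""
    blocks = [frozenset(b) for b in blocks]
    field = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            # The union over an empty choice of blocks is the empty set.
            field.add(frozenset().union(*chosen))
    return field

S = frozenset(range(6))
partition = [{0, 1}, {2}, {3, 4, 5}]
F = sigma_field_from_partition(partition)

# Closure under complements mirrors the displayed calculation with F^c.
closed_under_complement = all(S - A in F for A in F)
closed_under_union = all(A | B in F for A in F for B in F)
print(len(F), closed_under_complement, closed_under_union)
```

A partition with three blocks yields 2^3 = 8 sets, one for each subset of the index set.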
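10.1.3 Example. If $\mathcal{F}$ is a σ-field on S and $E \subseteq S$, then the collection
\[ \mathcal{F}_E := \{A \cap E : A \in \mathcal{F}\} \]
is a σ-field of subsets of E. Moreover, $\mathcal{F}_E \subseteq \mathcal{F}$ iff $E \in \mathcal{F}$. (See Exercise 2.) ♦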
Measure on a Sigma Field
10.1.4 Definition. A measure on a σ-field $\mathcal{F}$ of subsets of S is a function $\mu : \mathcal{F} \to [0, +\infty]$ such that $\mu(\emptyset) = 0$ and µ has the additivity property
\[ \mu\Bigl(\bigcup_k A_k\Bigr) = \sum_k \mu(A_k) \]
for any finite or infinite sequence of pairwise disjoint sets $A_k \in \mathcal{F}$. The extended real number $\mu(A)$ is called the measure of A.
♦
10.1.5 Example. Let $\{p_k\}$ be a sequence of nonnegative real numbers. Define
\[ \mu(E) = \sum_{k \in E} p_k, \quad E \subseteq \mathbb{N}, \]
where the sum may be infinite. (By convention, the sum over the empty set is zero.) It is not difficult to show that µ is a measure on the σ-field of all subsets of $\mathbb{N}$.
In the special case $p_k = 1$ for all k, $\mu(E)$ counts the number of elements in E if E is a finite set, and $\mu(E) = +\infty$ otherwise. In this case, µ is called a counting measure.
♦
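The measure of 10.1.5 is easy to experiment with on finite sets. The following sketch assumes a truncation of $\mathbb{N}$ to the indices 1, ..., 10 and the weights $p_k = 2^{-k}$; both choices, and the name `make_measure`, are illustrative and not taken from the text.

```python
from fractions import Fraction

def make_measure(weights):
    """Return mu with mu(E) = sum of weights[k] over k in E (weights[k] = p_k >= 0)."""
    def mu(E):
        # The sum over the empty set is zero, matching the stated convention.
        return sum((weights[k] for k in E), Fraction(0))
    return mu

# Assumed weights p_k = 2^{-k} on the finite index set {1, ..., 10}.
weights = {k: Fraction(1, 2**k) for k in range(1, 11)}
mu = make_measure(weights)

A = {1, 3, 5}
B = {2, 4}                                   # disjoint from A
additive = mu(A | B) == mu(A) + mu(B)

# With p_k = 1 for all k the same construction gives a counting measure.
counting = make_measure({k: Fraction(1) for k in range(1, 11)})
print(mu(A), additive, counting({2, 7, 9}))
```

Only finite sets are handled here; for infinite E the sum becomes a series, possibly with value +∞.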
10.1.6 Proposition. Let µ be a measure on a σ-field $\mathcal{F}$ and $A_1, A_2, \dots \in \mathcal{F}$.
(a) If $A_1 \subseteq A_2$, then $\mu(A_1) \le \mu(A_2)$ (monotonicity).
(b) $\mu\bigl(\bigcup_k A_k\bigr) \le \sum_k \mu(A_k)$ (subadditivity).
(c) $\mu(A_1) + \mu(A_2) = \mu(A_1 \cup A_2) + \mu(A_1 \cap A_2)$ (inclusion-exclusion).
(d) If $A_k \uparrow A$, then $\mu(A_k) \uparrow \mu(A)$ (continuity from below).
(e) If $A_k \downarrow A$ and $\mu(A_1) < +\infty$, then $\mu(A_k) \downarrow \mu(A)$ (continuity from above).
Proof. (a) By additivity, $\mu(A_2) = \mu(A_2 \setminus A_1) + \mu(A_1) \ge \mu(A_1)$.
(b) Write
\[ \bigcup_k A_k = A_1 \cup (A_2 \cap A_1^c) \cup \cdots \cup (A_m \cap A_1^c \cap \cdots \cap A_{m-1}^c) \cup \cdots. \]
Since the sets in the union on the right are pairwise disjoint, by countable additivity and monotonicity
\[ \mu\Bigl(\bigcup_k A_k\Bigr) = \mu(A_1) + \sum_{m \ge 2} \mu\bigl(A_1^c \cap \cdots \cap A_{m-1}^c \cap A_m\bigr) \le \sum_{m \ge 1} \mu(A_m). \]
(c) Since $A_1 \cup A_2$ is the union of the pairwise disjoint sets $A_1 \cap A_2^c$, $A_1 \cap A_2$, and $A_2 \cap A_1^c$, additivity implies that
\[ \mu(A_1 \cup A_2) = \mu(A_1 \cap A_2^c) + \mu(A_1 \cap A_2) + \mu(A_2 \cap A_1^c). \]
Similarly,
\[ \mu(A_1) + \mu(A_2) = \mu(A_1 \cap A_2^c) + 2\mu(A_2 \cap A_1) + \mu(A_2 \cap A_1^c). \]
It follows that $\mu(A_1 \cup A_2) = +\infty$ iff $\mu(A_1) + \mu(A_2) = +\infty$, which proves (c) in the infinite case. In the finite case, simply subtract the above equations to get (c).
(d) This is clear if some Ak has infinite measure, so assume µ(Ak ) < +∞
for all k. Set $A_0 = \emptyset$ and $E_k = A_k \setminus A_{k-1}$. The sets $E_k$ are pairwise disjoint, $A = \bigcup_{k=1}^\infty E_k$, and $\mu(E_k) = \mu(A_k) - \mu(A_{k-1})$, hence by additivity
\[ \mu(A) = \sum_{k=1}^\infty \mu(E_k) = \lim_n \sum_{k=1}^n \bigl[\mu(A_k) - \mu(A_{k-1})\bigr] = \lim_n \mu(A_n). \]
(e) Note that $A_1 \setminus A_k \uparrow A_1 \setminus A$, hence, by (d),
\[ \mu(A_1) - \mu(A) = \mu(A_1 \setminus A) = \lim_k \mu(A_1 \setminus A_k) = \mu(A_1) - \lim_k \mu(A_k). \]
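The proof of part (b) rests on replacing an arbitrary sequence of sets by pairwise disjoint pieces with the same union. A minimal sketch of that step for finitely many finite sets follows; the sample sets and the names `disjointify` and `counting_measure` are assumptions made for illustration.

```python
def disjointify(sets):
    """Replace A_1, A_2, ... by E_m = A_m minus (A_1 ∪ ... ∪ A_{m-1})."""
    seen = set()
    pieces = []
    for A in sets:
        pieces.append(set(A) - seen)   # the part of A_m not already covered
        seen |= set(A)
    return pieces

def counting_measure(E):
    """Counting measure of a finite set."""
    return len(E)

sets = [{1, 2, 3}, {3, 4}, {1, 4, 5, 6}]
pieces = disjointify(sets)
union = set().union(*sets)

measure_of_union = counting_measure(union)
sum_over_pieces = sum(counting_measure(E) for E in pieces)   # additivity
sum_over_sets = sum(counting_measure(A) for A in sets)       # upper bound in (b)
print(pieces, measure_of_union, sum_over_pieces, sum_over_sets)
```

Here the pieces are {1, 2, 3}, {4}, and {5, 6}, so the measure of the union is 6, which does not exceed the sum 9 of the individual measures.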
Exercises
For the following exercises, F is a σ-field of
subsets of a set S and µ is a measure on F.
1.S Find an example which shows that the hypothesis µ(A1 ) < +∞ in
10.1.6(e) cannot be removed.
2. Verify that the collection FE in 10.1.3 is a σ-field.
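3.S Let $A, B \in \mathcal{F}$ with $\mu(B) = 0$. Show that $\mu(A \cup B) = \mu(A \setminus B) = \mu(A)$.
4. Let $A_k, B_k \in \mathcal{F}$ and let s denote the sum $\sum_k \mu(A_k \setminus B_k)$. Prove that
(a) $\mu\bigl(\bigcup_k A_k \setminus \bigcup_k B_k\bigr) \le s$.
(b) $\mu\bigl(\bigcap_k A_k \setminus \bigcap_k B_k\bigr) \le s$.
5.S (General inclusion-exclusion principle). Let $\mu(A_1 \cup \cdots \cup A_n) < +\infty$. Prove that for $n \ge 2$
\[ \mu(A_1 \cup \cdots \cup A_n) = \sum_{i=1}^n \mu(A_i) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} \mu(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} \mu(A_1 \cap \cdots \cap A_n). \]
Hint. Use induction on n. The case n = 2 is 10.1.6(c).
6. For a sequence of sets $A_k \in \mathcal{F}$, define
\[ \liminf_k A_k = \bigcup_{k=1}^\infty \bigcap_{j=k}^\infty A_j \quad\text{and}\quad \limsup_k A_k = \bigcap_{k=1}^\infty \bigcup_{j=k}^\infty A_j. \]
Prove the following:
(a) $\liminf_k A_k \subseteq \limsup_k A_k$.
(b) $\mu\bigl(\liminf_k A_k\bigr) \le \liminf_k \mu(A_k)$.
(c) $\mu\bigl(\limsup_k A_k\bigr) \ge \limsup_k \mu(A_k)$ if $\mu\bigl(\bigcup_k A_k\bigr) < +\infty$.
(d) $\mu\bigl(\limsup_k A_k\bigr) = 0$ if $\sum_k \mu(A_k) < +\infty$.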
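7.S Let $\{E_k\}$ be a sequence in $\mathcal{F}$, $m \in \mathbb{N}$, and let A denote the set of all $x \in S$ such that $x \in E_k$ for finitely many and at least m values of k. Prove that $A \in \mathcal{F}$ and
\[ \mu(A) \le \frac{1}{m} \sum_{k=1}^\infty \mu(A \cap E_k). \]
8. Let $\{E_k\}$ be a sequence in $\mathcal{F}$, $m \in \mathbb{N}$, and let B denote the set of all $x \in S$ such that $x \in E_k$ for at most m values of k. Prove that $B \in \mathcal{F}$ and
\[ \mu(B) \ge \frac{1}{m} \sum_{k=1}^\infty \mu(B \cap E_k). \]
9. Let $\mathcal{E}$ be a collection of pairwise disjoint members of $\mathcal{F}$ and let $A \in \mathcal{F}$. Show that $\mu(A \cap E) > 0$ for at most countably many members E of $\mathcal{E}$. Hint. Consider
\[ \mathcal{E}_m := \{E \in \mathcal{E} : \mu(A \cap E) \ge 1/m\}, \quad m \in \mathbb{N}. \]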
10.2
Lebesgue Outer Measure
10.2.1 Definition. An n-dimensional interval in $\mathbb{R}^n$ is a Cartesian product
\[ I = A_1 \times A_2 \times \cdots \times A_n, \]
where each $A_j$ is an interval in $\mathbb{R}$. If I is bounded, then the n-dimensional volume $|I|$ of I is defined by
\[ |I| := \prod_{j=1}^n |A_j| = \prod_{j=1}^n (b_j - a_j), \]
where $a_j \le b_j$ are the endpoints of $A_j$. If $A_j = [a_j, b_j)$ for each j, then I is said to be half-open. We denote by
• $\mathcal{I}$ the collection of all bounded intervals in $\mathbb{R}^n$,
• $\mathcal{H}$ the collection of all bounded half-open intervals in $\mathbb{R}^n$,
• $\mathcal{O}$ the collection of all bounded open intervals in $\mathbb{R}^n$,
• $\mathcal{C}$ the collection of all bounded closed intervals in $\mathbb{R}^n$.
♦
Note that each of the above collections is closed under the formation of
nonempty finite intersections.
10.2.2 Lemma. Let $I, I_1, \dots, I_m \in \mathcal{H}$.
(a) If $I_1, \dots, I_m$ are pairwise disjoint and $I = \bigcup_{j=1}^m I_j$, then $|I| = \sum_{j=1}^m |I_j|$.
(b) If $I \subseteq \bigcup_{j=1}^m I_j$, then $|I| \le \sum_{j=1}^m |I_j|$.
(c) If $I_1, \dots, I_m$ are pairwise disjoint and $I \supseteq \bigcup_{j=1}^m I_j$, then $|I| \ge \sum_{j=1}^m |I_j|$.
Proof. For ease of notation, we prove the lemma for the case n = 2, in which case the intervals are half-open rectangles. Let $I = [a, b) \times [c, d)$. We may assume in (b) that $I = \bigcup_{j=1}^m I_j$, otherwise we could replace $I_j$ by $I_j \cap I$. Thus in each case the rectangles $I_j$ are contained in I, hence the coordinates of their vertices form partitions
\[ \{x_0 := a \le x_1 \le \cdots \le x_p := b\} \quad\text{and}\quad \{y_0 := c \le y_1 \le \cdots \le y_q := d\} \]
of $[a, b]$ and $[c, d]$, respectively. These partitions generate a grid of subrectangles $R_{i,j} = [x_i, x_{i+1}) \times [y_j, y_{j+1})$ with union I such that each rectangle $I_k$ is a union of subrectangles $R_{i,j}$. Case (a) is depicted in Figure 10.1; the rectangles $I_k$ in the figure are shown with solid boundaries, and the dashed lines are the extensions of these boundaries.
FIGURE 10.1: Pairwise disjoint interval grid.
Since
\[ b - a = \sum_{i=0}^{p-1} (x_{i+1} - x_i) \quad\text{and}\quad d - c = \sum_{j=0}^{q-1} (y_{j+1} - y_j), \]
we have
\[ |I| = \sum_{i=0}^{p-1} (x_{i+1} - x_i) \sum_{j=0}^{q-1} (y_{j+1} - y_j) = \sum_{i=0}^{p-1} \sum_{j=0}^{q-1} |R_{i,j}|. \tag{10.1} \]
Similarly, $|I_k| = \sum_{\{(i,j) : R_{i,j} \subseteq I_k\}} |R_{i,j}|$, hence
\[ \sum_k |I_k| = \sum_k \sum_{\{(i,j) : R_{i,j} \subseteq I_k\}} |R_{i,j}|. \tag{10.2} \]
We now compare (10.1) and (10.2). In part (a), every $R_{i,j}$ is contained in exactly one $I_k$, hence $|I| = \sum_{k=1}^m |I_k|$. In (b), a rectangle $R_{i,j}$ could be contained in more than one $I_k$, so $|I| \le \sum_{k=1}^m |I_k|$. Finally, in (c) not every $R_{i,j}$ is necessarily contained in an $I_k$, hence $|I| \ge \sum_{k=1}^m |I_k|$.
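The grid argument in this proof can be traced on concrete rectangles. The sketch below is an illustration only: rectangles are encoded as tuples (a, b, c, d) standing for $[a, b) \times [c, d)$, and the sample rectangles and helper names (`area`, `grid_cells`, `contains`) are assumptions, not notation from the text.

```python
from itertools import product

def area(r):
    """Area of the half-open rectangle [a, b) x [c, d) given as (a, b, c, d)."""
    a, b, c, d = r
    return max(b - a, 0) * max(d - c, 0)

def grid_cells(I, rects):
    """Cells [x_i, x_{i+1}) x [y_j, y_{j+1}) cut out of I by the edges of the rects."""
    a, b, c, d = I
    xs = sorted({a, b} | {x for r in rects for x in r[:2] if a < x < b})
    ys = sorted({c, d} | {y for r in rects for y in r[2:] if c < y < d})
    return [(x0, x1, y0, y1)
            for (x0, x1), (y0, y1) in product(zip(xs, xs[1:]), zip(ys, ys[1:]))]

def contains(r, cell):
    """True if the grid cell lies inside the rectangle r."""
    return r[0] <= cell[0] and cell[1] <= r[1] and r[2] <= cell[2] and cell[3] <= r[3]

I = (0, 4, 0, 3)
disjoint_cover = [(0, 2, 0, 3), (2, 4, 0, 1), (2, 4, 1, 3)]   # case (a)
overlapping_cover = [(0, 3, 0, 3), (2, 4, 0, 3)]              # case (b)

cells_a = grid_cells(I, disjoint_cover)
hits_a = [sum(contains(r, cell) for r in disjoint_cover) for cell in cells_a]
cells_b = grid_cells(I, overlapping_cover)
hits_b = [sum(contains(r, cell) for r in overlapping_cover) for cell in cells_b]

print(area(I), sum(area(c) for c in cells_a), hits_a)
print(sum(area(r) for r in overlapping_cover), hits_b)
```

In case (a) each cell is counted exactly once, so the areas add up to |I| = 12; in case (b) the middle cell is counted twice, which is why the sum of the areas (15) can exceed |I|.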
10.2.3 Definition. The Lebesgue outer measure of a subset A of $\mathbb{R}^n$ is defined by
\[ \lambda^*(A) = \lambda_n^*(A) := \inf\Bigl\{\sum_j |I_j| : I_j \in \mathcal{I} \text{ and } \bigcup_j I_j \supseteq A\Bigr\}. \]
♦
10.2.4 Remark. The number of intervals $I_j$ covering A in the definition of $\lambda^*(A)$ may be finite or infinite. Of course, every bounded subset of $\mathbb{R}^n$ has a covering by finitely many $I_j$’s. By slightly adjusting endpoints, one may show that the value of $\lambda^*(A)$ is unchanged if $\mathcal{I}$ is replaced by $\mathcal{H}$, $\mathcal{O}$, or $\mathcal{C}$. (Exercise 1.)
♦
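To see how the infimum in 10.2.3 behaves, one can cover the points of a countable set by intervals whose lengths shrink geometrically. The sketch below does this in dimension one for finitely many rationals in [0, 1]; the enumeration, the value of ε, and the function names are illustrative assumptions. For a full enumeration the same construction gives total length at most ε.

```python
from fractions import Fraction

def cover_points(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by an open interval of length eps / 2^k."""
    cover = []
    for k, p in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)      # half of the length eps / 2^k
        cover.append((p - half, p + half))
    return cover

def rationals_in_unit_interval(max_denominator):
    """Distinct rationals p/q in [0, 1] with q <= max_denominator, in a fixed order."""
    seen, out = set(), []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out

points = rationals_in_unit_interval(12)
eps = Fraction(1, 100)
cover = cover_points(points, eps)
total_length = sum(b - a for a, b in cover)
print(len(points), total_length < eps)
```

The total length is ε(1 − 2^{-N}) for N points, so it stays below ε no matter how many points are covered.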
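10.2.5 Proposition. Lebesgue outer measure on $\mathbb{R}^n$ has the following properties:
(a) $0 \le \lambda^*(A) \le +\infty$.
(b) $\lambda^*(\emptyset) = 0$.
(c) $\lambda^*(I) = |I|$ for each $I \in \mathcal{I}$.
(d) If $A \subseteq B$, then $\lambda^*(A) \le \lambda^*(B)$ (monotonicity).
(e) $\lambda^*\bigl(\bigcup_k A_k\bigr) \le \sum_k \lambda^*(A_k)$ (subadditivity).
(f) If $I, J \in \mathcal{H}$ and $I \cap J = \emptyset$, then $\lambda^*(I \cup J) = |I| + |J|$.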
Proof. Parts (a) and (d) follow directly from the definition, and (b) follows
from the observation that ∅ may be covered by a single interval of arbitrarily
small volume.
FIGURE 10.2: The coverings $\{I_k\}$ and $\{J_k\}$.
To prove (c), note first that, because {I} is a covering, λ∗ (I) ≤ |I|. For
the reverse inequality, let ε > 0 and choose a closed bounded interval J ⊆ I
such that |J| > |I| − ε. Let {Ik } be any sequence of intervals covering I. By
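10.2.4, we may take $I_k \in \mathcal{O}$. Let $\{J_k\}$ be a sequence in $\mathcal{H}$ such that $I_k \subseteq J_k$ and $|J_k| < |I_k| + \varepsilon/2^k$ (Figure 10.2). Since J is compact, there exists an m such that
\[ J \subseteq I_1 \cup \cdots \cup I_m \subseteq J_1 \cup \cdots \cup J_m. \]
Therefore,
\[ |I| - \varepsilon < |J| \le |J_1| + \cdots + |J_m| \le \varepsilon + \sum_{k=1}^\infty |I_k|, \]
the second inequality by 10.2.2(b). Letting $\varepsilon \to 0$, we have $|I| \le \sum_{k=1}^\infty |I_k|$. Therefore, $|I| \le \lambda^*(I)$.
For (e), we may assume that $\lambda^*(A_k) < +\infty$ for all k. Let $\varepsilon > 0$ and for each k choose a sequence $\{I_{k,j}\}_{j=1}^\infty$ in $\mathcal{I}$ such that
\[ A_k \subseteq \bigcup_{j=1}^\infty I_{k,j} \quad\text{and}\quad \sum_{j=1}^\infty \lambda^*(I_{k,j}) \le \lambda^*(A_k) + \frac{\varepsilon}{2^k}. \]
Since the countable collection $\{I_{k,j} : k, j = 1, 2, \dots\}$ covers $\bigcup_{k=1}^\infty A_k$,
\[ \lambda^*\Bigl(\bigcup_{k=1}^\infty A_k\Bigr) \le \sum_{k=1}^\infty \sum_{j=1}^\infty \lambda^*(I_{k,j}) \le \sum_{k=1}^\infty \lambda^*(A_k) + \varepsilon. \]
Since ε was arbitrary, (e) follows.
For (f), let $I \cup J \subseteq \bigcup_{k=1}^m I_k$, where $I_k \in \mathcal{H}$. Since $I_k \supseteq (I_k \cap I) \cup (I_k \cap J)$, 10.2.2(c) shows that $|I_k| \ge |I_k \cap I| + |I_k \cap J|$. Therefore, by (c),
\[ \sum_{k=1}^m |I_k| \ge \sum_{k=1}^m |I_k \cap I| + \sum_{k=1}^m |I_k \cap J| \ge \lambda^*(I) + \lambda^*(J) = |I| + |J|. \]
Taking the infimum we have $\lambda^*(I \cup J) \ge |I| + |J|$. The reverse inequality follows from (e).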
Exercises
1.S Prove the assertions in 10.2.4. More generally prove the following: Let $\mathcal{J}$ be a collection of bounded intervals with the property that for each bounded interval I and each $\varepsilon > 0$ there exists $J \in \mathcal{J}$ containing I such that $|J| < |I| + \varepsilon$. For $A \subseteq \mathbb{R}^n$, define
\[ \alpha(A) := \inf\Bigl\{\sum_k |J_k| : J_k \in \mathcal{J} \text{ and } \bigcup_k J_k \supseteq A\Bigr\}. \]
Then $\lambda^*(A) = \alpha(A)$.
2. Prove that in the definition of $\lambda^*(A)$, $\mathcal{I}$ may be replaced by the collection $\mathcal{I}_r$ of all bounded intervals I whose coordinate intervals have rational endpoints.
3. Prove that in the definition of λ∗ (A), I may be replaced by the collection
U of all bounded open subsets of R and also by the collection K of all
compact sets.
4.S Show that Lebesgue outer measure is translation invariant, that is,
λ∗ (A + x) = λ∗ (A) for every A ⊆ Rn and x ∈ Rn ,
where A + x := {a + x : a ∈ A}.
5. Show that Lebesgue outer measure has the reflection property
λ∗ (−A) = λ∗ (A) for every A ⊆ Rn ,
where −A := {x : −x ∈ A}.
6. Show that Lebesgue outer measure has the dilation property
$\lambda^*(rA) = |r|^n \lambda^*(A)$ for every $A \subseteq \mathbb{R}^n$ and $r \in \mathbb{R}$,
where rA := {rx : x ∈ A}.
10.3
Lebesgue Measure
By subadditivity of outer measure,
\[ \lambda^*(C) \le \lambda^*(C \cap E) + \lambda^*(C \cap E^c) \]
for all subsets E and C of $\mathbb{R}^n$. The following definition singles out those sets E that also satisfy the reverse inequality for all sets C.
10.3.1 Definition. A subset E of $\mathbb{R}^n$ is said to be Lebesgue measurable if
\[ \lambda^*(C) \ge \lambda^*(C \cap E) + \lambda^*(C \cap E^c) \tag{10.3} \]
for all subsets C of $\mathbb{R}^n$. The collection of all Lebesgue measurable subsets of $\mathbb{R}^n$ is denoted by $\mathcal{M} = \mathcal{M}(\mathbb{R}^n)$. The restriction of $\lambda^*$ to $\mathcal{M}$ is called Lebesgue measure on $\mathbb{R}^n$ and is denoted by $\lambda = \lambda_n$. Any particular set C satisfying (10.3) is called a test set for E.
♦
If C is a test set for E, then $\lambda^*(C) = \lambda^*(C \cap E) + \lambda^*(C \cap E^c)$; the set E splits the outer measure of C.
10.3.2 Theorem. M is a sigma field containing all sets of outer measure
zero and λ is a measure on M.
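Proof. Clearly, $\emptyset, \mathbb{R}^n \in \mathcal{M}$, and since $E$ and $E^c$ appear symmetrically in (10.3), $E^c \in \mathcal{M}$ iff $E \in \mathcal{M}$. If $\lambda^*(E) = 0$, then, by monotonicity,
\[ \lambda^*(C \cap E) + \lambda^*(C \cap E^c) \le \lambda^*(E) + \lambda^*(C \cap E^c) = \lambda^*(C \cap E^c) \le \lambda^*(C), \]
hence $E \in \mathcal{M}$. Therefore, $\mathcal{M}$ contains all sets of Lebesgue outer measure 0.
It remains to show that, for a sequence $\{E_k\}$ in $\mathcal{M}$, $\bigcup_{k=1}^\infty E_k \in \mathcal{M}$ and furthermore that $\lambda^*\bigl(\bigcup_{k=1}^\infty E_k\bigr) = \sum_{k=1}^\infty \lambda^*(E_k)$ if the sets $E_k$ are pairwise disjoint. This is accomplished in the following four steps:
I. If $E, F \in \mathcal{M}$, then $E \cup F, E \cap F \in \mathcal{M}$.
◀ To show that $E \cup F \in \mathcal{M}$, take any set C as a test set for E and take $C \cap E^c$ as a test set for F to obtain
\[ \lambda^*(C) = \lambda^*(C \cap E) + \lambda^*(C \cap E^c) \quad\text{and}\quad \lambda^*(C \cap E^c) = \lambda^*(C \cap E^c \cap F) + \lambda^*(C \cap E^c \cap F^c). \]
Combining these and using subadditivity,
\[ \lambda^*(C) = \lambda^*(C \cap E) + \lambda^*(C \cap E^c \cap F) + \lambda^*(C \cap E^c \cap F^c) \ge \lambda^*\bigl((C \cap E) \cup (C \cap E^c \cap F)\bigr) + \lambda^*(C \cap E^c \cap F^c). \tag{10.4} \]
Since $(C \cap E) \cup (C \cap E^c \cap F) \supseteq C \cap (E \cup F)$, by monotonicity and (10.4),
\[ \lambda^*(C) \ge \lambda^*\bigl(C \cap (E \cup F)\bigr) + \lambda^*\bigl(C \cap E^c \cap F^c\bigr) = \lambda^*\bigl(C \cap (E \cup F)\bigr) + \lambda^*\bigl(C \cap (E \cup F)^c\bigr). \]
This shows that $E \cup F \in \mathcal{M}$. That $E \cap F \in \mathcal{M}$ follows from De Morgan’s law $E \cap F = (E^c \cup F^c)^c$. ▶
II. If $C \subseteq \mathbb{R}^n$ and $E, F \in \mathcal{M}$ with $E \cap F = \emptyset$, then
\[ \lambda^*\bigl(C \cap (E \cup F)\bigr) = \lambda^*(C \cap E) + \lambda^*(C \cap F). \]
◀ Use $C \cap (E \cup F)$ as a test set for E to obtain
\[ \lambda^*\bigl(C \cap (E \cup F)\bigr) = \lambda^*\bigl(C \cap (E \cup F) \cap E\bigr) + \lambda^*\bigl(C \cap (E \cup F) \cap E^c\bigr) = \lambda^*(C \cap E) + \lambda^*(C \cap F). \]
▶
III. If the sets $E_k$ are pairwise disjoint and $F := \bigcup_k E_k$, then $F \in \mathcal{M}$ and $\lambda(F) = \sum_k \lambda(E_k)$.
◀ Set $F_k = \bigcup_{j=1}^k E_j$ and let $C \subseteq \mathbb{R}^n$. By steps I and II and induction, $F_k \in \mathcal{M}$ and
\[ \lambda^*(C \cap F_k) = \sum_{j=1}^k \lambda^*(C \cap E_j). \]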
Thus, by monotonicity,
\[ \lambda^*(C) = \lambda^*(C \cap F_k) + \lambda^*(C \cap F_k^c) \ge \sum_{j=1}^k \lambda^*(C \cap E_j) + \lambda^*(C \cap F^c). \]
Since k was arbitrary, by subadditivity
\[ \lambda^*(C) \ge \sum_{j=1}^\infty \lambda^*(C \cap E_j) + \lambda^*(C \cap F^c) \ge \lambda^*(C \cap F) + \lambda^*(C \cap F^c) \ge \lambda^*(C). \]
The inequalities are therefore equalities, which shows that $F \in \mathcal{M}$. Taking C = F verifies the second assertion of III. ▶
IV. $\bigcup_{k=1}^\infty E_k \in \mathcal{M}$.
◀ Use I, III and $\bigcup_{k=1}^\infty E_k = E_1 \cup (E_2 \cap E_1^c) \cup (E_3 \cap E_1^c \cap E_2^c) \cup \cdots$. ▶
10.3.3 Definition. A set E is said to have (Lebesgue) measure zero if λ(E) = 0.
A property P (x) depending on points x ∈ Rn is said to hold almost everywhere
(a.e.) or for almost all x if the set of all x for which P (x) is false has measure
zero.
♦
For example, the Dirichlet function is zero a.e. More generally, if $E \in \mathcal{M}$ then $1_E = 0$ a.e. iff $\lambda(E) = 0$.
By subadditivity, a countable union of sets of measure zero has measure zero. Since a point has measure zero, it follows that every countable set has measure zero. In particular, $\mathbb{Q}^n$ has measure zero. The following is an example of an uncountable set with measure zero.
10.3.4 Example. (Cantor ternary set). Remove from $I_{0,1} := [0, 1]$ the “middle third” open interval $(1/3, 2/3)$, leaving closed intervals $I_{1,1}$ and $I_{1,2}$ with union $E_1$ and total length 2/3. Next, remove from each of $I_{1,1}$ and $I_{1,2}$ the middle third open interval, leaving closed intervals $I_{2,1}$, $I_{2,2}$, $I_{2,3}$, and $I_{2,4}$ with union $E_2$ and total length $4/9 = (2/3)^2$.
FIGURE 10.3: Middle thirds construction.
By induction, one obtains a decreasing sequence of closed sets $E_k = \bigcup_{j=1}^{2^k} I_{k,j}$ such that, by subadditivity, $\lambda^*(E_k) \le (2/3)^k$. If
E denotes the intersection of these sets, then E is closed and, by monotonicity, $\lambda^*(E) \le (2/3)^k$ for all k. Therefore, $\lambda^*(E) = 0$.
To show that E is uncountable, we use the fact that every real number $x \in [0, 1]$ has both ternary and binary representations
\[ x = .d_1 d_2 \ldots \text{ (ternary)} = \sum_{k=1}^\infty d_k 3^{-k}, \quad\text{where } d_k \in \{0, 1, 2\}, \]
\[ x = .e_1 e_2 \ldots \text{ (binary)} = \sum_{k=1}^\infty e_k 2^{-k}, \quad\text{where } e_k \in \{0, 1\}. \]
These are obvious analogs of the decimal representation of a real number (see Exercise 6.1.14). As with decimal representations, there is some ambiguity; for example, 1/3 = .1000 . . . = .0222 . . . (ternary). Now observe that if $d_k = 0$ or 2 for all k in the above ternary representation, then $x \in E$. For example,
\[ .00220\ldots \in I_{1,1} \cap I_{2,2} \cap I_{3,4} \cap I_{4,7} \cap \cdots \]
and
\[ .22202\ldots \in I_{1,2} \cap I_{2,4} \cap I_{3,7} \cap I_{4,14} \cap \cdots \]
(see Figure 10.3). In general, if $x \in I_{k-1,j}$, then $x \in I_{k,\,2j-1+d_k/2}$. Conversely, let $x \in E$. Since $x \in E_1$, we may choose $d_1 = 0$ or 2. Similarly, since $x \in E_2$, we may choose $d_2 = 0$ or 2, etc. Continuing in this manner, we see that every member of E has a (unique) ternary representation with digits 0 or 2.
FIGURE 10.4: $x \in I_{k-1,j} \Rightarrow x \in I_{k,\,2j-1+d_k/2}$.
Now define $\varphi : E \to [0, 1]$ by
\[ \varphi\bigl(.d_1 d_2 \ldots \text{ (ternary)}\bigr) = .e_1 e_2 \ldots \text{ (binary)}, \quad\text{where } d_k \in \{0, 2\} \text{ and } e_k = d_k/2. \]
The function φ is not one-to-one; for example,
\[ \varphi(.0222\ldots) = .0111\ldots = .1000\ldots = \varphi(.2000\ldots). \]
However, by removing from E the countable set of all numbers with ternary representations having a tail end of zeros, these being necessarily rational, we obtain a set F on which φ is one-to-one. Since $\varphi(F) = (0, 1)$, it follows that E is uncountable.
♦
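The first stages of the middle thirds construction, and the map φ on finite digit strings, can be computed exactly. This is a sketch under assumptions: the function names are hypothetical, only finitely many stages and digits are handled, and the intervals are listed left to right without reference to the double-index labels used in the text.

```python
from fractions import Fraction

def cantor_stage(k):
    """The 2^k closed intervals whose union is E_k, listed from left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))     # left closed third
            next_intervals.append((b - third, b))     # right closed third
        intervals = next_intervals
    return intervals

def ternary_value(digits):
    """Value of the finite ternary expansion .d_1 d_2 ... d_m."""
    return sum(Fraction(d, 3 ** k) for k, d in enumerate(digits, start=1))

def phi(digits):
    """Binary value with digits e_k = d_k / 2, for ternary digits d_k in {0, 2}."""
    return sum(Fraction(d // 2, 2 ** k) for k, d in enumerate(digits, start=1))

stage3 = cantor_stage(3)
total_length = sum(b - a for a, b in stage3)
x = ternary_value([0, 2, 2])                          # .022 in base three
x_in_stage3 = any(a <= x <= b for a, b in stage3)
print(len(stage3), total_length, x, x_in_stage3, phi([2, 0, 0]))
```

Stage 3 consists of 8 intervals of total length (2/3)^3 = 8/27, and a point whose ternary digits are all 0 or 2 stays inside every stage that is checked.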
We show in the next section that intervals, open sets, and closed sets are
Lebesgue measurable. It follows that countable unions and intersections of
these sets are also Lebesgue measurable. The reader may well ask if there are
any subsets of Rn that are not Lebesgue measurable. The answer is that there
are many, but their construction is surprisingly intricate. The following is an
example for the case n = 1.
10.3.5 Example. (A non-measurable set). Consider sets of the form $x + \mathbb{Q}$, $x \in \mathbb{R}$. We claim that if $(x + \mathbb{Q}) \cap (y + \mathbb{Q}) \ne \emptyset$, then $x + \mathbb{Q} = y + \mathbb{Q}$. To see this, choose $z \in (x + \mathbb{Q}) \cap (y + \mathbb{Q})$, say $z = x + r_1 = y + r_2$, $r_1, r_2 \in \mathbb{Q}$. Then, for any $r \in \mathbb{Q}$,
\[ x + r = y + r_2 - r_1 + r \in y + \mathbb{Q} \quad\text{and}\quad y + r = x + r_1 - r_2 + r \in x + \mathbb{Q}, \]
hence $x + \mathbb{Q} = y + \mathbb{Q}$. It follows that every real number is in exactly one of the sets $x + \mathbb{Q}$. Now form a set E by choosing exactly one number in each of the distinct sets $x + \mathbb{Q}$.¹ For each $x \in \mathbb{R}$, the set $E \cap (x + \mathbb{Q})$ has a single member, hence $x = y + r$ for unique $y \in E$ and $r \in \mathbb{Q}$. Thus $\mathbb{R}$ may be expressed as a disjoint union
\[ \mathbb{R} = \bigcup_{k=1}^\infty (r_k + E), \tag{10.5} \]
where $\{r_1, r_2, \dots\}$ is an enumeration of $\mathbb{Q}$.
Suppose, for a contradiction, that E is Lebesgue measurable. Then $\lambda(E) > 0$, otherwise, by (10.5), translation invariance (Exercise 1), and countable additivity, $\mathbb{R}$ would have measure zero. On the other hand, let I be an arbitrary bounded interval and set $J = \mathbb{Q} \cap (0, 1)$. Since I is measurable (Section 10.4, below), the set
\[ F := \bigcup_{r \in J} \bigl(r + E \cap I\bigr) \]
is measurable. Also, since I and J are bounded so is F. Thus, by countable additivity and translation invariance,
\[ +\infty > \lambda(F) = \sum_{r \in J} \lambda\bigl(r + E \cap I\bigr) = \sum_{r \in J} \lambda\bigl(E \cap I\bigr). \]
Since J is an infinite set, $\lambda(E \cap I) = 0$. But then
\[ \lambda(E) = \sum_{k=0}^\infty \lambda\bigl(E \cap [k, k + 1)\bigr) + \sum_{k=0}^\infty \lambda\bigl(E \cap [-k - 1, -k)\bigr) = 0. \]
This contradiction shows that E cannot be Lebesgue measurable.
♦
¹ The existence of E requires the axiom of choice, one of the axioms of Zermelo–Fraenkel set theory.
Exercises
1. ⇓² Show that $E \in \mathcal{M}$ and $x \in \mathbb{R}^n$ imply that $x + E \in \mathcal{M}$. Conclude from Exercise 10.2.4 that $\lambda(x + E) = \lambda(E)$.
2.S Show that $E \in \mathcal{M}$ implies that $-E \in \mathcal{M}$. Conclude from Exercise 10.2.5 that $\lambda(-E) = \lambda(E)$.
3. Show that $E \in \mathcal{M}$ and $r \ne 0$ imply that $rE \in \mathcal{M}$. Conclude from Exercise 10.2.6 that $\lambda(rE) = |r|^n \lambda(E)$.
4.S Show that for any ε > 0 there exists an open set D dense in Rn such
that λ(D) < ε.
5. Prove that if f and g are continuous real-valued functions on Rn which
are equal a.e., then f = g. Does the same result hold if only one of the
functions is continuous?
6. Let A be the subset of [0, 1] whose members are missing the digit three
in their decimal expansions. Prove that A is uncountable and λ(A) = 0.
10.4
Borel Sets
Recall that the σ-field generated by a collection A of sets is the intersection
of all σ-fields containing A (10.1.2). The following special case is of particular
importance.
10.4.1 Definition. The Borel σ-field B = B(Rn ) is the σ-field generated by
the open sets of Rn . A member of B is called a Borel set.
♦
10.4.2 Remark. Since open sets and closed sets are complements of one
another, B is also generated by the closed sets. Furthermore, since an open
set is a countable union of n-dimensional open intervals (Exercise 8.2.4), B is
also generated by O. Since every open interval is a countable union of closed
and bounded intervals and every closed interval is a countable intersection of
open intervals, B is also generated by C. Similar considerations show that B is
generated by H as well.
♦
10.4.3 Theorem. B(Rn ) ⊆ M(Rn ).
Proof. By 10.4.2, it suffices to show that H ⊆ M. Note first that if I, J ∈ H
then, using partitions as in the proof of 10.2.2, I \ J may be expressed (usually
in several ways) as a disjoint union of members of H. (See Figure 10.5.)
² This exercise will be used in 11.2.18.
FIGURE 10.5: $I \setminus J = I_1 \cup I_2 \cup I_3 \cup I_4 \cup I_5$.
Now let $I \in \mathcal{H}$, $C \subseteq \mathbb{R}^n$, and let $\{I_k\}$ be any sequence in $\mathcal{H}$ that covers C. We show that
\[ \lambda^*(C \cap I) + \lambda^*(C \cap I^c) \le \sum_k \lambda^*(I_k). \tag{10.6} \]
Taking the infimum over all such sequences $\{I_k\}$ produces the inequality $\lambda^*(C \cap I) + \lambda^*(C \cap I^c) \le \lambda^*(C)$, proving that $I \in \mathcal{M}$.
To verify (10.6), we may assume that $\sum_{k=1}^\infty \lambda^*(I_k) < +\infty$. For each k there exist, according to the observation at the beginning of the proof, intervals $J_{j,k} \in \mathcal{H}$ such that $I_k \setminus I = \bigcup_{j=1}^{m_k} J_{j,k}$ (disjoint union). Then
\[ I_k = (I_k \cap I) \cup (I_k \setminus I) = (I_k \cap I) \cup \bigcup_{j=1}^{m_k} J_{j,k} \quad\text{(disjoint union)}, \]
hence, by 10.2.5(f) and induction,
\[ \lambda^*(I_k) = \lambda^*(I_k \cap I) + \sum_{j=1}^{m_k} \lambda^*(J_{j,k}). \]
Since $\{I_k \cap I\}_k$ covers $C \cap I$ and $\{J_{j,k}\}_{j,k}$ covers $C \cap I^c$,
\[ \sum_k \lambda^*(I_k) = \sum_k \lambda^*(I_k \cap I) + \sum_k \sum_{j=1}^{m_k} \lambda^*(J_{j,k}) \ge \lambda^*(C \cap I) + \lambda^*(C \cap I^c). \]
It may be shown that the inclusion B ⊆ M is proper.3 The importance of
Borel sets is that they are closely linked to the topology of Rn and hence are
better suited for contexts involving continuous functions.
The remainder of the section demonstrates the precise connection between
B and M.
³ See, for example, [4].
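10.4.4 Lemma. For any bounded $E \in \mathcal{M}$, there exists a decreasing sequence of bounded open sets $U_k \supseteq E$ such that
\[ \lim_k \lambda(U_k) = \lim_k \lambda\bigl(\mathrm{cl}(U_k)\bigr) = \lambda(E). \]
Proof. By definition of $\lambda(E)$, for each k we may choose a sequence of open intervals $I_{j,k}$ with union $V_k$ containing E such that
\[ \lambda(E) \le \lambda(V_k) \le \lambda\bigl(\mathrm{cl}(V_k)\bigr) \le \sum_j |\mathrm{cl}(I_{j,k})| < \lambda(E) + 1/k. \]
The sequence of open sets $U_k := V_1 \cap \cdots \cap V_k$ is decreasing, contains E, and satisfies
\[ \lambda(E) \le \lambda(U_k) \le \lambda\bigl(\mathrm{cl}(U_k)\bigr) \le \lambda\bigl(\mathrm{cl}(V_k)\bigr) \le \lambda(E) + 1/k. \]
Letting $k \to +\infty$ proves the assertion.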
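10.4.5 Lemma. For any $E \in \mathcal{M}$, there exists an increasing sequence of compact sets $C_k \subseteq E$ such that $\lim_k \lambda(C_k) = \lambda(E)$.
Proof. Suppose first that E is bounded. Let I be a bounded open interval containing cl(E) and let $\varepsilon > 0$. Choose a sequence of open intervals $I_k$ with union $U \supseteq I \setminus E$ such that $\sum_{k=1}^\infty |I_k| < \lambda(I \setminus E) + \varepsilon$. Since I is open, we may assume that $I_k \subseteq I$ (otherwise, replace $I_k$ by $I_k \cap I$). Then $I \setminus E \subseteq U \subseteq I$ and
\[ \lambda(U) \le \lambda(I \setminus E) + \varepsilon = \lambda(I) - \lambda(E) + \varepsilon. \]
Set $K = I \setminus U$. Then $K \subseteq E \subseteq \mathrm{cl}(E) \subseteq I$, hence $K = \mathrm{cl}(E) \setminus U$. Therefore, K is compact and
\[ \lambda(K) = \lambda(I) - \lambda(U) \ge \lambda(I) - \bigl[\lambda(I) - \lambda(E) + \varepsilon\bigr] = \lambda(E) - \varepsilon. \]
FIGURE 10.6: $K = \mathrm{cl}(E) \setminus U$.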
Now let E ∈ M be arbitrary and let {Ek } be a sequence of bounded
measurable sets such that Ek ↑ E. By the first paragraph, for each k we
may choose a compact set Kk ⊆ Ek such that λ(Kk ) > λ(Ek ) − 1/k. These
conditions still hold if Kk is replaced by the compact set Ck = K1 ∪ · · · ∪ Kk .
The sequence {Ck } is increasing, contained in E, and λ(Ck ) → λ(E).
10.4.6 Lemma. If $E \in \mathcal{M}$ is bounded, then there exists an increasing sequence of compact sets $C_k$ and a decreasing sequence of bounded open sets $U_k$ such that
\[ C_k \subseteq E \subseteq U_k \quad\text{and}\quad \lim_k \lambda(U_k \setminus C_k) = 0. \]
Proof. If $C_k$ and $U_k$ are as in 10.4.4 and 10.4.5 with $U_k$ bounded, then
\[ \lambda(U_k \setminus C_k) = \lambda(U_k \setminus E) + \lambda(E \setminus C_k) \to 0. \]
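10.4.7 Theorem. If $E \in \mathcal{M}$, then there exist Borel sets F and G such that $F \subseteq E \subseteq G$ and $\lambda(G \setminus F) = 0$.
Proof. Suppose first that E is bounded. Set $F = \bigcup_{k=1}^\infty C_k$ and $G = \bigcap_{k=1}^\infty U_k$, where $C_k$ and $U_k$ are the sets in 10.4.6. Then $F \subseteq E \subseteq G$ and $G \setminus F \subseteq U_k \setminus C_k$ for all k, hence $\lambda(G \setminus F) \le \lambda(U_k \setminus C_k) \to 0$.
In the general case, there exists a sequence of bounded measurable sets $E_k \uparrow E$. By the first paragraph, there exist Borel sets $F_k$ and $G_k$ such that $F_k \subseteq E_k \subseteq G_k$ and $\lambda(G_k \setminus F_k) = 0$. Let
\[ F = \bigcup_{k=1}^\infty F_k \quad\text{and}\quad G = \bigcup_{k=1}^\infty G_k. \]
Then F and G are Borel sets, $F \subseteq E \subseteq G$, and $G \setminus F \subseteq \bigcup_{k=1}^\infty (G_k \setminus F_k)$. By countable subadditivity, $\lambda(G \setminus F) = 0$.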
10.4.8 Corollary. Every E ∈ M is the disjoint union of a Borel set and a
set of Lebesgue measure zero.
Proof. By the theorem, E = F ∪ (E \ F ), where F ∈ B and λ(E \ F ) = 0.
Exercises
1.S Let ε > 0. Construct an explicit compact subset C ⊆ E := [0, 1] ∩ I such
that λ(E \ C) < ε.
2. Show that the graph G := {(x, y) : y = f (x)} of a continuous function
f : R → R is a Borel set with two-dimensional Lebesgue measure zero.
3. Let E denote the Cantor set (10.3.4). Show that E + Q and E + E are
Borel sets and find their measures.
4.S Let B ∈ B(Rn ), y ∈ Rn , and r ∈ R. Prove that B + y :=
{x + y : x ∈ B}, rB := {rx : x ∈ B} and −B := {x : −x ∈ B} are
Borel sets.
10.5
Measurable Functions
In this section, F denotes a σ-field of subsets of a set S.
Definition and Basic Properties
10.5.1 Lemma. Let $f : S \to \overline{\mathbb{R}}$. The following statements are equivalent:
(a) $f^{-1}(\{+\infty\})$, $f^{-1}(\{-\infty\}) \in \mathcal{F}$, and $f^{-1}(U) \in \mathcal{F}$ for all open sets $U \subseteq \mathbb{R}$.
(b) $f^{-1}(\{+\infty\})$, $f^{-1}(\{-\infty\}) \in \mathcal{F}$, and $f^{-1}(F) \in \mathcal{F}$ for all closed sets $F \subseteq \mathbb{R}$.
(c) $f^{-1}(\{+\infty\})$, $f^{-1}(\{-\infty\}) \in \mathcal{F}$, and $f^{-1}(B) \in \mathcal{F}$ for all Borel sets $B \subseteq \mathbb{R}$.
(d) $\{x : f(x) \le t\} \in \mathcal{F}$ for all $t \in \mathbb{R}$.
(e) $\{x : f(x) < t\} \in \mathcal{F}$ for all $t \in \mathbb{R}$.
(f) $\{x : f(x) \ge t\} \in \mathcal{F}$ for all $t \in \mathbb{R}$.
(g) $\{x : f(x) > t\} \in \mathcal{F}$ for all $t \in \mathbb{R}$.
Proof. The equivalence of (a) and (b) follows from the general set theoretic relation $f^{-1}(A^c) = \bigl(f^{-1}(A)\bigr)^c$. Clearly, (c) implies (b). For the converse, denote by $\mathcal{G}$ the collection of all Borel subsets B of $\mathbb{R}$ such that $f^{-1}(B) \in \mathcal{F}$. Then $\mathcal{G}$ is a σ-field. If (b) holds, then $\mathcal{G}$ contains the closed sets, hence, by minimality, $\mathcal{G} = \mathcal{B}$. This proves (c) and hence shows that (a)–(c) are equivalent.
The implications (c) ⇒ (d) ⇒ (e) ⇒ (f) ⇒ (g) ⇒ (d) are proved using the following set relations:
(c) ⇒ (d): $\{x : f(x) \le t\} = f^{-1}(\{-\infty\}) \cup f^{-1}\bigl((-\infty, t]\bigr)$.
(d) ⇒ (e): $\{x : f(x) < t\} = \bigcup_{n=1}^\infty \{x : f(x) \le t - 1/n\}$.
(e) ⇒ (f): $\{x : f(x) \ge t\} = \{x : f(x) < t\}^c$.
(f) ⇒ (g): $\{x : f(x) > t\} = \bigcup_{n=1}^\infty \{x : f(x) \ge t + 1/n\}$.
(g) ⇒ (d): $\{x : f(x) \le t\} = \{x : f(x) > t\}^c$.
Thus (d)–(g) are equivalent and are implied by (a)–(c).
Now assume that (d)–(g) hold. Then the sets
\[ f^{-1}(+\infty) = \bigcap_{k=1}^\infty \{x : f(x) > k\}, \quad f^{-1}(-\infty) = \bigcap_{k=1}^\infty \{x : f(x) < -k\} \]
are members of $\mathcal{F}$, and for $-\infty < a < b < +\infty$,
\[ f^{-1}\bigl((a, b)\bigr) = \{x : f(x) > a\} \cap \{x : f(x) < b\} \in \mathcal{F}. \]
Since every open subset of R is a countable union of open intervals, (a) holds,
completing the proof.
10.5.2 Definition. A function $f : S \to \overline{\mathbb{R}}$ is said to be measurable with respect to $\mathcal{F}$, or simply $\mathcal{F}$-measurable, if any (hence all) of the conditions in Lemma 10.5.1 hold.
♦
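On a finite set, condition (d) of Lemma 10.5.1 can be tested directly. The sketch below assumes a σ-field generated by a finite partition, so that its members are exactly the unions of blocks; the set S, the partition, the functions f and g, and the helper names are all illustrative. Because S is finite, it suffices to test the thresholds t that are values of the function: for any other t the set $\{x : f(x) \le t\}$ is either empty or equals the set for the largest value not exceeding t.

```python
from itertools import combinations

def unions_of_blocks(blocks):
    """The sigma-field generated by a finite partition: all unions of its blocks."""
    blocks = [frozenset(b) for b in blocks]
    return {frozenset().union(*chosen)
            for r in range(len(blocks) + 1)
            for chosen in combinations(blocks, r)}

def is_measurable(f, S, field):
    """Check that {x : f(x) <= t} belongs to the field for every value t of f."""
    values = {f(x) for x in S}
    return all(frozenset(x for x in S if f(x) <= t) in field for t in values)

S = range(6)
field = unions_of_blocks([{0, 1}, {2}, {3, 4, 5}])

f = lambda x: [5, 5, 1, 7, 7, 7][x]   # constant on each block of the partition
g = lambda x: x                        # takes different values at 0 and 1

print(is_measurable(f, S, field), is_measurable(g, S, field))
```

In this setting a function passes the test exactly when it is constant on each block, which is why f is measurable and g is not.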
The following theorem shows that the collection of all measurable functions
is closed under the standard ways of combining functions. The functions $f^+$, $f^-$,
$\sup_k f_k$, $\inf_k f_k$, $\limsup_k f_k$, and $\liminf_k f_k$ in the statement of the theorem
are defined by
$$f^+(x) := \max\{f(x), 0\}, \qquad f^-(x) := \max\{-f(x), 0\},$$
$$(\sup_k f_k)(x) := \sup_k f_k(x), \qquad (\inf_k f_k)(x) := \inf_k f_k(x),$$
$$(\limsup_k f_k)(x) := \limsup_k f_k(x), \qquad (\liminf_k f_k)(x) := \liminf_k f_k(x).$$
10.5.3 Theorem. Let $f$, $g$, $f_k$ be measurable with respect to a σ-field $\mathcal{F}$ on
$S$. If $\alpha \in \mathbb{R}$ and $p > 0$, then $f + g$, $\alpha f$, $f^2$, $fg$, $|f|^p$, $f^+$, $f^-$, $\sup_k f_k$, $\inf_k f_k$,
$\limsup_k f_k$, and $\liminf_k f_k$ are measurable.
Proof. The proof is based on the following equalities. The details are left to
the reader.
• $\{x : (f + g)(x) < t\} = \bigcup_{r \in \mathbb{Q}} \{x : f(x) < r\} \cap \{x : g(x) < t - r\}$.
• $\{x : \alpha f(x) < t\} = \{x : f(x) < t/\alpha\}$ for $\alpha > 0$.
• $\{x : f^2(x) < t\} = \{x : -\sqrt{t} < f(x) < \sqrt{t}\}$ for $t > 0$.
• $fg = \tfrac{1}{2}[(f + g)^2 - f^2 - g^2]$.
• $\{x : |f|^p(x) < t\} = \{x : -t^{1/p} < f(x) < t^{1/p}\}$ for $t > 0$.
• $f^+ = \tfrac{1}{2}(|f| + f)$, $f^- = \tfrac{1}{2}(|f| - f)$.
• $\{x : \sup_k f_k(x) \le t\} = \bigcap_k \{x : f_k(x) \le t\}$.
• $\inf_k f_k = -\sup_k (-f_k)$.
• $\liminf_k f_k = \sup_k \inf_{j \ge k} f_j$; $\limsup_k f_k = -\liminf_k (-f_k)$.
10.5.4 Corollary. If fk : S → R is F-measurable for every k and if fk → f
on S, then f is F-measurable.
Simple Functions
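10.5.5 Definition. The indicator function of a set $A \subseteq S$ is the function $\mathbf{1}_A$
on $S$ defined by
$$\mathbf{1}_A(x) = \begin{cases} 1 & \text{if } x \in A, \text{ and} \\ 0 & \text{if } x \notin A. \end{cases}$$
♦
For example, the Dirichlet function may be expressed as $\mathbf{1}_{\mathbb{Q}}$.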
10.5.6 Definition. A function f : S → R with finite range is called a simple
function. The collection of all nonnegative F-measurable simple functions is
denoted by S+ (F).
♦
10.5.7 Remarks. (a) A linear combination of indicator functions is a simple
function. Conversely, a simple function $f$ may be expressed in many ways as a
linear combination of indicator functions. The most important of these is the
standard form
$$f = \sum_{j=1}^{m} a_j \mathbf{1}_{A_j}, \qquad A_j := \{x \in S : f(x) = a_j\}, \tag{10.7}$$
where $a_1, \ldots, a_m \in \mathbb{R}$ are the distinct values of $f$. Note that the sets $A_j$ form
a partition of $S$. By 10.5.3 and Exercise 8, $f$ is $\mathcal{F}$-measurable iff $A_j \in \mathcal{F}$ for
each $j$.
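(b) If $f_1, f_2 \in \mathcal{S}_+(\mathcal{F})$, $\alpha \ge 0$, and $p > 0$, then the functions
$$\alpha f_1, \quad f_1 + f_2, \quad f_1 f_2, \quad f_1^p, \quad \max\{f_1, f_2\}, \quad \min\{f_1, f_2\}$$
are nonnegative, measurable, and have finite ranges, hence are in $\mathcal{S}_+(\mathcal{F})$.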
♦
The following theorem shows that the collection S+ (F) generates all measurable functions. It is a key ingredient in the development of the Lebesgue
theory.
10.5.8 Theorem. For each nonnegative F-measurable function f on S, there
exists a sequence {fk } in S+ (F) such that fk ↑ f on S.
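Proof. Let $f_0 = 0$, and for each $k \in \mathbb{N}$ define
$$f_k = \sum_{j=1}^{k2^k} \frac{j-1}{2^k}\, \mathbf{1}_{A_{k,j}} + k\, \mathbf{1}_{A_k}, \quad \text{where } A_k = \{x : f(x) \ge k\} \text{ and}$$
$$A_{k,j} = \{x : (j-1)2^{-k} \le f(x) < j2^{-k}\}, \quad j = 1, 2, \ldots, k2^k.$$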
(See Figure 10.7.) We show that $f_k(x) \uparrow f(x)$ for each $x \in S$. This is
clear if $f(x) = +\infty$, since then $f_k(x) = k$ for all $k$. Suppose $f(x) \in \mathbb{R}$
and let $k \in \mathbb{N}$. If $f(x) \ge k + 1$, then $f_{k+1}(x) = k + 1 > k = f_k(x)$. If
$k \le f(x) < k + 1$, then $f_{k+1}(x) \ge k = f_k(x)$. Finally, suppose that $f(x) < k$.
Then $(j-1)2^{-k} \le f(x) < j2^{-k}$ for some $1 \le j \le k2^k$, hence
$$\frac{2j-2}{2^{k+1}} \le f(x) < \frac{2j-1}{2^{k+1}} \quad \text{or} \quad \frac{2j-1}{2^{k+1}} \le f(x) < \frac{2j}{2^{k+1}}.$$
(See Figure 10.8.) In either case,
$$f_{k+1}(x) \ge \frac{2j-2}{2^{k+1}} = \frac{j-1}{2^k} = f_k(x).$$
Thus $f_k \uparrow$ on $S$. Since $0 \le f(x) - f_k(x) < 2^{-k}$ for all sufficiently large $k$,
$f_k(x) \to f(x)$.
[FIGURE 10.7: The components of $f_k$.]
[FIGURE 10.8: The components of $f_{k+1}$.]
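The construction in the proof of 10.5.8 can be tried out numerically. The following Python sketch is an illustration only: the helper name `dyadic_approx` and the sample function are assumptions made for this example, not notation from the text. It computes the value $f_k(x)$ from the value $f(x)$ alone, which is all the definition of $f_k$ requires.

```python
import math

def dyadic_approx(value, k):
    """Value f_k(x) of the k-th simple function, given value = f(x) >= 0."""
    if value >= k:                 # x lies in A_k (this covers f(x) = +inf)
        return float(k)
    # Otherwise x lies in exactly one A_{k,j}, and f_k(x) = (j - 1) / 2^k,
    # the left endpoint of the dyadic interval containing f(x).
    return math.floor(value * 2**k) / 2**k

def sample_f(x):
    # hypothetical nonnegative function, chosen only for illustration
    return x * x

if __name__ == "__main__":
    x = 1.3
    for k in (1, 2, 4, 8, 16):
        print(k, dyadic_approx(sample_f(x), k), sample_f(x))
```

The printed values increase with $k$ and, once $k$ exceeds $f(x)$, lie within $2^{-k}$ of $f(x)$, in line with the last step of the proof.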
Lebesgue and Borel Measurable Functions
10.5.9 Definition. A function f : Rn → R is said to be Borel (Lebesgue)
measurable if f is measurable with respect to the σ-field B(Rn ) (M(Rn )). ♦
10.5.10 Proposition. If f is Lebesgue measurable and f = g a.e., then g is
Lebesgue measurable.
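Proof. Let $A = \{x : f(x) \ne g(x)\}$. By hypothesis, $A$ has Lebesgue measure
zero, hence $A^c \in \mathcal{M}$ and, since every subset of a set of Lebesgue measure zero is Lebesgue measurable, $\{x : g(x) < t\} \cap A \in \mathcal{M}$. Therefore,
$$\{x : g(x) < t\} = \big(\{x : f(x) < t\} \cap A^c\big) \cup \big(\{x : g(x) < t\} \cap A\big) \in \mathcal{M}.$$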
If f is Borel measurable and f = g a.e., then g need not be Borel measurable.
Indeed, there exist sets E ∈ M \ B with measure zero, hence 1E = 0 a.e. but
1E is not Borel measurable.4
Clearly, a Borel measurable function is Lebesgue measurable. The preceding
paragraph shows that the converse is false. However, we have
10.5.11 Proposition. If f : Rn → R is Lebesgue measurable, then there exists
a Borel measurable function g : Rn → R such that g = f a.e.
Proof. Consider first the case f = 1E , E ∈ M. By 10.4.8, E is the disjoint
union of a Borel set F and a set A of Lebesgue measure zero. Thus g := 1F is
Borel measurable and f = g + 1A = g a.e. The assertion therefore holds for
indicator functions.
If f is a simple function, then each term in the standard form of f is a.e.
equal to a Borel function. Therefore, the assertion holds for simple functions.
If f ≥ 0, then, by 10.5.8, there exists a sequence of nonnegative Lebesgue
measurable simple functions fk such that fk → f on Rn . By the previous
paragraph, for each k there exists a Borel measurable function gk such that
fk = gk a.e. Let
$$A_k := \{x : f_k(x) \ne g_k(x)\} \quad \text{and} \quad A := \bigcup_{k=1}^{\infty} A_k.$$
Then $A \in \mathcal{M}$, $\lambda(A) = 0$ and $f_k(x) = g_k(x)$ for all $x \in A^c$ and all $k$. Let $B$
denote the set of all $x$ such that the sequence $\{g_k(x)\}$ does not converge. Then
$B \subseteq A$ and, by 10.5.3, $B \in \mathcal{B}$. Let $g = \lim_k g_k \mathbf{1}_{B^c}$. Then $g$ is Borel measurable
and $\{x : g(x) \ne f(x)\} \subseteq A$, so $g = f$ a.e. Therefore, the assertion holds for
nonnegative f . The general case follows from the identity f = f + − f − .
Part (a) of 10.5.1 implies that a continuous function f : Rn → R is Borel
measurable. In a similar vein,
10.5.12 Proposition. If f : Rn → R is continuous except on a set E of
Lebesgue measure zero, then f is Lebesgue measurable.
Proof. Let U ⊆ R be open. Then
f −1 (U ) = A ∪ B, where A := f −1 (U ) ∩ E and B := f −1 (U ) ∩ E c .
Since A ⊆ E and λ(E) = 0, A ∈ M. Since f is continuous on E c , B is open
in E c , hence B = V ∩ E c for some open subset V of Rn . Therefore, B ∈ M,
so f −1 (U ) ∈ M. By 10.5.1, f is Lebesgue measurable.
4 See, for example, [4].
Proposition 10.5.12 implies that a function with at most countably many
discontinuities is Lebesgue measurable. An examination of the proof shows that
such a function is in fact Borel measurable. In particular, monotone functions
on R, hence also functions of bounded variation, are Borel measurable (see
3.3.6 and 5.9.7).
Note that a function that is continuous except on a set of measure zero is
not necessarily equal a.e. to a continuous function (Exercise 12). Conversely, a
function equal a.e. to a continuous function need not be continuous anywhere;
the Dirichlet function is an obvious example.
Exercises
In Exercises 1–8, F denotes a σ-field of subsets of a set S.
1.S Let $f : S \to \mathbb{R}$ have the property that $\mathbf{1}_{A_k} f$ is $\mathcal{F}$-measurable for every $k$,
where $A_k \in \mathcal{F}$ and $\bigcup_k A_k = S$. Prove that $f$ is $\mathcal{F}$-measurable.
2. Prove that if f : S → R is F-measurable and never zero, then 1/f is
F-measurable.
3. Let f : S → R have the property that {x : f (x) < r} ∈ F for all r ∈ Q.
Prove that f is F-measurable.
4. Let f : S → R be F-measurable and let g : R → R be continuous. Show
that g ◦ f is F-measurable.
5. Let g, h : S → R be F-measurable functions. Prove that the following
sets are F-measurable:
(a)S {x ∈ S : g(x) > h(x)},
(b) {x ∈ S : g(x) ≥ h(x)},
(c) {x ∈ S : g(x) = h(x)},
(d)S {x ∈ S : g(x)h(x) = 1}.
6. Let $\{f_k : S \to \mathbb{R}\}$ be a sequence of $\mathcal{F}$-measurable functions. Prove that
the set $\{x \in S : \lim_k f_k(x) \text{ exists in } \mathbb{R}\}$ is $\mathcal{F}$-measurable.
7. Let f : S → R have range consisting of the distinct values ak , k ∈ N.
Show that f is F-measurable iff {x ∈ S : f (x) = ak } ∈ F for every k.
8.S Let E ⊆ S. Prove that 1E is F-measurable iff E ∈ F.
9. Let A, B, and C be subsets of S. Prove:
(a) 1AB = 1A 1B .
(b) 1A∪B = 1A + 1B − 1A 1B .
(c) 1Ac = 1 − 1A
(d) 1A ≤ 1B iff A ⊆ B.
10.S Define the symmetric difference A∆B of sets A and B by
A∆B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B).
Prove that 1A∆B = |1A − 1B |.
11. Let $A_k \subseteq S$ and set $B = \liminf_k A_k$ and $C = \limsup_k A_k$ (see Exercise 10.1.6). Prove that
(a) 1B = lim inf k 1Ak .
(b) 1C = lim supk 1Ak .
12. Prove that 1[0,1] is not equal a.e. to a continuous function on R.
13. Let $f : \mathbb{R} \to \mathbb{R}$. Prove that if $f'$ exists on $\mathbb{R}$, then $f'$ is Borel measurable.
14.S Let $f(x) = \lfloor x^{-1} \rfloor^{-1}$, $0 < x \le 1$. Show that $f$ is Borel measurable on
$(0, 1]$.
15. Let $f(x) = 1 + r(\lfloor x^{-1} \rfloor)$, $0 < x \le 1$, where $r(k)$ denotes the remainder
on division of an integer $k$ by 3. Show that $f$ is Borel measurable.
16. Define $f : [0, 1] \to \mathbb{R}$ by $f(x) = 0$ if $x$ is rational and $f(x) = 1/\sqrt{d}$ if $x$
is irrational, where $d$ is the first nonzero digit in the decimal expansion
of $x$. Prove that $f$ is Borel measurable.
17.S Prove that if the function f in 10.5.8 is bounded, then the convergence
of the sequence is uniform.
18. Let $f : \mathbb{R}^2 \to \mathbb{R}$ have the property that $f(x, y)$ is continuous in $x$
for each $y$ and Borel measurable in $y$ for each $x$. Let $g : \mathbb{R} \to \mathbb{R}$ be
Borel measurable. Prove that the function $h(y) := f(g(y), y)$ is Borel
measurable. Hint. Start with indicator functions $g$.
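19.S ⇓5 Let $f = (f_1, \ldots, f_m) : \mathbb{R}^n \to \mathbb{R}^m$, where each $f_j : \mathbb{R}^n \to \mathbb{R}$ is Borel
measurable. Prove:
(a) $\mathcal{F} := \{B \in \mathcal{B}(\mathbb{R}^m) : f^{-1}(B) \in \mathcal{B}(\mathbb{R}^n)\}$ is a σ-field.
(b) $\mathcal{F} = \mathcal{B}(\mathbb{R}^m)$, that is, $f^{-1}(B) \in \mathcal{B}(\mathbb{R}^n)$ for every $B \in \mathcal{B}(\mathbb{R}^m)$.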
(c) If F : Rm → R is Borel measurable, then the function g := F ◦ f is
Borel measurable.
20. (a) Show that B × R ∈ B(R2 ) for all B ∈ B(R).
(b) Let f : R → R be Borel measurable and define g : R2 → R by
g(x, y) = f (x). Show that g is Borel measurable.
21. Let 0 ∈ A ⊆ Rn . Define the “radius function” fA : Rn → R by
fA (x) := sup {t ≥ 0 : tx ∈ A} ,
x ∈ Rn .
(a) Let 0 ∈ Ak for all k and Ak ↑ A. Show that fAk ↑ fA .
(b) Show that if A is open, then fA is positive and Borel measurable.
(c) Use (b) to show that if A is compact, then fA is Borel measurable.
(d) Conclude from 10.4.5 that fA is Borel measurable for any Borel set
A containing 0.
5 This exercise will be used in 11.5.4.
Chapter 11
Lebesgue Integration on Rn
In this chapter we use the measure theory developed in Chapter 10 to construct the Lebesgue integral of a measurable function of several variables. For
comparison purposes, we begin with a brief description of the Riemann integral
on compact subintervals of Rn .
11.1 Riemann Integration on R^n
The n-dimensional Riemann integral is constructed in essentially the same
way as the one-dimensional integral: Let f be a bounded real-valued function
on an n-dimensional interval
[a, b] := [a1 , b1 ] × [a2 , b2 ] × · · · × [an , bn ],
where a := (a1 , . . . , an ) and b := (b1 , . . . , bn ).
[FIGURE 11.1: Partition of [a, b] × [c, d].]
For each j, let Pj be a partition of the coordinate interval [aj , bj ]. The collection
of all Cartesian products of the resulting coordinate subintervals produces a
partition P of [a, b] consisting of n-dimensional subintervals I = I1 ×I2 ×· · ·×In
with volume ∆VI := |I1 | |I2 | · · · |In | (see Figure 11.1). The lower and upper
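sums of $f$ over $\mathcal{P}$ are defined by
$$\underline{S}(f, \mathcal{P}) = \sum_{I \in \mathcal{P}} m_I\, \Delta V_I, \qquad m_I := \inf_{x \in I} f(x), \quad \text{and}$$
$$\overline{S}(f, \mathcal{P}) = \sum_{I \in \mathcal{P}} M_I\, \Delta V_I, \qquad M_I := \sup_{x \in I} f(x).$$
The lower and upper integrals on $[a, b]$ are defined by
$$\underline{\int_a^b} f := \sup_{\mathcal{P}} \underline{S}(f, \mathcal{P}) \quad \text{and} \quad \overline{\int_a^b} f := \inf_{\mathcal{P}} \overline{S}(f, \mathcal{P}),$$
where the supremum and infimum are taken over all partitions $\mathcal{P}$ of $[a, b]$. If
the two integrals are equal, then $f$ is said to be Riemann–Darboux integrable
on $[a, b]$. The common value of these integrals is then denoted by $\int_a^b f$. As in
the one-variable case,
$$\int_a^b f = \lim_{\|\mathcal{P}\| \to 0} S(f, \mathcal{P}, \{\xi_I\}_I),$$
where $\|\mathcal{P}\| = \max_j \|\mathcal{P}_j\|$ and $S(f, \mathcal{P}, \{\xi_I\}_I)$ is the Riemann sum
$$S(f, \mathcal{P}, \{\xi_I\}_I) := \sum_{I} f(\xi_I)\, \Delta V_I, \qquad \xi_I \in I, \ I \in \mathcal{P}.$$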
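To make the lower and upper sums concrete, here is a minimal Python sketch under a simplifying assumption that is not part of the text: the integrand is taken to be increasing in each coordinate, so that the infimum and supremum over a subinterval are attained at its lower and upper corners. The names `darboux_sums` and `product_xy`, and the choice of a uniform partition, are illustrative only.

```python
from fractions import Fraction
from itertools import product

def darboux_sums(f, a, b, n):
    """Lower and upper sums of f on [a, b] for the uniform partition with n
    subintervals per coordinate. ASSUMPTION: f is increasing in each
    coordinate, so inf and sup over a subinterval sit at its corners."""
    dim = len(a)
    widths = [(Fraction(b[j]) - Fraction(a[j])) / n for j in range(dim)]
    volume = Fraction(1)
    for w in widths:
        volume *= w                      # Delta V_I, identical for every I here
    lower = Fraction(0)
    upper = Fraction(0)
    for idx in product(range(n), repeat=dim):
        lo = [Fraction(a[j]) + idx[j] * widths[j] for j in range(dim)]
        hi = [lo[j] + widths[j] for j in range(dim)]
        lower += f(*lo) * volume         # m_I * Delta V_I
        upper += f(*hi) * volume         # M_I * Delta V_I
    return lower, upper

def product_xy(x, y):
    # hypothetical integrand, increasing in x and in y on [0, 1] x [0, 1]
    return x * y

if __name__ == "__main__":
    for n in (2, 10, 50):
        lo, hi = darboux_sums(product_xy, (0, 0), (1, 1), n)
        print(n, float(lo), float(hi))
```

For this integrand the two sums bracket the value 1/4 and approach it as the partition is refined, which is the behavior the definition of Riemann–Darboux integrability describes.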
The n-dimensional Riemann integral has properties analogous to those of
the one-dimensional integral. Moreover, as is shown in Section 11.5, if $f$ is
continuous, then $\int_a^b f$ may be expressed as an iterated integral
$$\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\, dx_n \cdots dx_1,$$
effectively reducing the theory to the one-dimensional case. Integrals over
regions bounded by “nice” surfaces may be similarly evaluated.
11.2 The Lebesgue Integral
The Lebesgue integral on Rn is defined first for nonnegative Lebesgue
measurable simple functions and is then extended to a larger class of functions,
including all nonnegative Lebesgue measurable functions. The identity f =
f + − f − is then used to define the integral for general measurable functions.
The Integral of a Simple Function
11.2.1 Definition. Let $f \in \mathcal{S}_+(\mathcal{M})$ have standard form
$$f = \sum_{j=1}^{m} a_j \mathbf{1}_{A_j}, \qquad A_j := \{x : f(x) = a_j\},$$
where $\{A_1, \ldots, A_m\}$ is a (measurable) partition of $\mathbb{R}^n$. The Lebesgue integral
of $f$ on $\mathbb{R}^n$ is defined by
$$\int f\, d\lambda := \sum_{j=1}^{m} a_j \lambda(A_j).$$
♦
Note that the above sum may contain a term of the form 0 · (+∞). While
this expression was heretofore undefined, it is now necessary to make the
definition
0 · (+∞) := 0.
In particular, the integral of the identically zero function is 0 · λ(Rn ) = 0.
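The convention $0 \cdot (+\infty) := 0$ has to be imposed explicitly in most numerical settings as well. The short Python sketch below is an illustration under stated assumptions: a simple function in standard form is represented by a hypothetical list of pairs $(a_j, \lambda(A_j))$, and the function name `integrate_simple` is not from the text.

```python
import math

def integrate_simple(terms):
    """Integral of a nonnegative simple function given as a list of pairs
    (a_j, measure of A_j), with measures allowed to be +inf."""
    total = 0.0
    for value, measure in terms:
        if value == 0 or measure == 0:
            continue                  # convention: 0 * (+inf) := 0
        total += value * measure
    return total

if __name__ == "__main__":
    # The identically zero function on R^n: one term, 0 * lambda(R^n).
    print(integrate_simple([(0, math.inf)]))
    # Floating-point arithmetic alone would not give this answer:
    print(0 * math.inf)               # nan
    # Hypothetical simple function with values 2, 3 on sets of measure 1/2, 1/4.
    print(integrate_simple([(2, 0.5), (3, 0.25), (0, math.inf)]))
```

Skipping the zero terms is the design choice that encodes the convention; without it, IEEE arithmetic would return `nan` for the zero function.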
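11.2.2 Lemma. If $f, g \in \mathcal{S}_+(\mathcal{M})$ and $\alpha \ge 0$, then
(a) $\int \alpha f\, d\lambda = \alpha \int f\, d\lambda$;
(b) $\int (f + g)\, d\lambda = \int f\, d\lambda + \int g\, d\lambda$;
(c) $\int f\, d\lambda \le \int g\, d\lambda$ if $f \le g$ a.e.;
(d) $\int f\, d\lambda = \int g\, d\lambda$ if $f = g$ a.e.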
Proof. Part (a) is immediate from the definition, and (d) follows from (c). To
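prove (b) and (c), let $f$ and $g$ have standard representations
$$f = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i} \quad \text{and} \quad g = \sum_{j=1}^{k} b_j \mathbf{1}_{B_j},$$
so
$$\int f\, d\lambda = \sum_{i=1}^{m} a_i \lambda(A_i) \quad \text{and} \quad \int g\, d\lambda = \sum_{j=1}^{k} b_j \lambda(B_j).$$
Since $\mathbb{R}^n = \bigcup_{i=1}^{m} A_i = \bigcup_{j=1}^{k} B_j$ and the unions are disjoint,
$$\lambda(A_i) = \sum_{j=1}^{k} \lambda(A_i \cap B_j) \quad \text{and} \quad \lambda(B_j) = \sum_{i=1}^{m} \lambda(A_i \cap B_j),$$
hence
$$\int f\, d\lambda = \sum_{i=1}^{m} \sum_{j=1}^{k} a_i \lambda(A_i \cap B_j) \quad \text{and} \quad \int g\, d\lambda = \sum_{j=1}^{k} \sum_{i=1}^{m} b_j \lambda(A_i \cap B_j). \tag{11.1}$$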
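Now let $c_1, \ldots, c_p$ be the distinct values of $f + g$, and set
$$C_\ell = \{x : (f + g)(x) = c_\ell\}, \qquad \ell = 1, \ldots, p.$$
Then
$$f + g = \sum_{\ell=1}^{p} c_\ell \mathbf{1}_{C_\ell} \quad \text{and} \quad C_\ell = \bigcup_{\{(i,j)\, :\, a_i + b_j = c_\ell\}} A_i \cap B_j \quad \text{(disjoint)},$$
so
$$\int (f + g)\, d\lambda = \sum_{\ell=1}^{p} c_\ell \lambda(C_\ell) = \sum_{\ell=1}^{p} c_\ell \sum_{\{(i,j)\, :\, a_i + b_j = c_\ell\}} \lambda(A_i \cap B_j) = \sum_{i=1}^{m} \sum_{j=1}^{k} (a_i + b_j) \lambda(A_i \cap B_j).$$
By (11.1), the last sum is $\int f\, d\lambda + \int g\, d\lambda$, proving (b).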
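For (c), suppose $f \le g$ a.e. and let $E = \{x : f(x) \le g(x)\}$. Then $\lambda(E^c) = 0$
and $a_i \le b_j$ for all $i, j$ for which $A_i \cap B_j \cap E \ne \emptyset$. From
$$\lambda(A_i \cap B_j) = \lambda(A_i \cap B_j \cap E) + \lambda(A_i \cap B_j \cap E^c) = \lambda(A_i \cap B_j \cap E)$$
and (11.1), we have
$$\int f\, d\lambda = \sum_{i=1}^{m} \sum_{j=1}^{k} a_i \lambda(A_i \cap B_j \cap E) \le \sum_{j=1}^{k} \sum_{i=1}^{m} b_j \lambda(A_i \cap B_j \cap E) = \int g\, d\lambda.$$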
The Integral of a Measurable Function
11.2.3 Definition. Let $f : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue measurable. If $f \ge 0$, define
$$\int f\, d\lambda = \int f(x)\, d\lambda(x) := \sup \Big\{ \int f_s\, d\lambda : f_s \le f,\ f_s \in \mathcal{S}_+(\mathcal{M}) \Big\}. \tag{11.2}$$
In general, define the Lebesgue integral on $\mathbb{R}^n$ by
$$\int f\, d\lambda := \int f^+\, d\lambda - \int f^-\, d\lambda,$$
provided at least one of the terms on the right is finite. For $E \in \mathcal{M}$ define the
Lebesgue integral on $E$ by
$$\int_E f\, d\lambda := \int f \cdot \mathbf{1}_E\, d\lambda$$
whenever the right side is defined. If both $\int_E f^+\, d\lambda$ and $\int_E f^-\, d\lambda$ are finite,
then $f$ is said to be (Lebesgue) integrable on $E$. The collection of all integrable
functions on $E$ is denoted by $L^1(E)$. Finally, $f$ is said to be integrable if it is
integrable on $\mathbb{R}^n$.
♦
Note that from the definition, $f \ge 0 \Rightarrow \int f\, d\lambda \ge 0$. More generally,
11.2.4 Proposition. If $f, g : \mathbb{R}^n \to \mathbb{R}$ are Lebesgue measurable, $f \le g$ a.e.,
and $\int f\, d\lambda$, $\int g\, d\lambda$ are defined, then $\int f\, d\lambda \le \int g\, d\lambda$. In particular, if $f \ge 0$
and $g$ is integrable, then $f$ is integrable.
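Proof. Assume first that $f, g \ge 0$. Let $f_s \in \mathcal{S}_+(\mathcal{M})$ with $f_s \le f$ and set
$g_s := \mathbf{1}_E f_s$, where $E := \{x : f(x) \le g(x)\}$. Then $g_s \in \mathcal{S}_+(\mathcal{M})$, $f_s = g_s$ a.e.,
and $g_s \le \mathbf{1}_E f \le \mathbf{1}_E g \le g$. By 11.2.2, $\int f_s\, d\lambda = \int g_s\, d\lambda \le \int g\, d\lambda$. Since $f_s$ was
arbitrary, $\int f\, d\lambda \le \int g\, d\lambda$.
In the general case, $f^+ \le g^+$ and $f^- \ge g^-$ a.e., hence, by the first part of
the proof,
$$\int f\, d\lambda = \int f^+\, d\lambda - \int f^-\, d\lambda \le \int g^+\, d\lambda - \int g^-\, d\lambda = \int g\, d\lambda.$$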
11.2.5 Corollary. If $f : \mathbb{R}^n \to \mathbb{R}$ is integrable and $f = g$ a.e., then $g$ is
integrable and $\int f\, d\lambda = \int g\, d\lambda$.
Proof. By 10.5.10, $g$ is Lebesgue measurable. Moreover, $f^+ = g^+$ a.e., so by
11.2.4, $\int f^+\, d\lambda = \int g^+\, d\lambda$. Similarly, $\int f^-\, d\lambda = \int g^-\, d\lambda$.
11.2.6 Proposition. If f : Rn → R is integrable, then f is finite a.e.
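Proof. Suppose first that $f \ge 0$. Let
$$A = \{x : f(x) = +\infty\} \quad \text{and} \quad A_k = \{x : f(x) \ge k\}.$$
Since $f \ge f \mathbf{1}_{A_k} \ge k \mathbf{1}_{A_k} \ge k \mathbf{1}_A$,
$$0 \le \lambda(A) \le \frac{1}{k} \int f\, d\lambda < +\infty.$$
Letting $k \to +\infty$ shows that $\lambda(A) = 0$.
In the general case, apply the result of the first paragraph to $f^+$ and $f^-$
to obtain
$$\lambda\{x : f^+(x) = +\infty\} = \lambda\{x : f^-(x) = +\infty\} = 0,$$
hence $\lambda\{x : |f(x)| = +\infty\} = 0$.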
11.2.7 Proposition. Let $f : \mathbb{R}^n \to [0, +\infty]$ be Lebesgue measurable. Then
$\int f\, d\lambda = 0$ iff $f = 0$ a.e.
Proof. The sufficiency follows from 11.2.5. For the necessity, suppose $\int f\, d\lambda = 0$
and let
$$B = \{x : f(x) > 0\} \quad \text{and} \quad B_k = \{x : f(x) > 1/k\}.$$
Then $B = \bigcup_{k=1}^{\infty} B_k$ and $f \ge f \mathbf{1}_{B_k} \ge k^{-1} \mathbf{1}_{B_k}$ so
$$0 \le \lambda(B_k) \le k \int f\, d\lambda = 0.$$
Therefore, $\lambda(B_k) = 0$. By countable subadditivity, $\lambda(B) = 0$.
By 11.2.4, $f \ge 0$ implies that $\int_A f\, d\lambda \ge 0$ for all $A \in \mathcal{M}$. The following is
a converse:
11.2.8 Proposition. Let $f : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue measurable and suppose
that $\int_A f\, d\lambda$ is defined for all $A \in \mathcal{M}$.
(a) If $\int_A f\, d\lambda \ge 0$ for all $A \in \mathcal{M}$, then $f \ge 0$ a.e.
(b) If $\int_A f\, d\lambda = 0$ for all $A \in \mathcal{M}$, then $f = 0$ a.e.
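Proof. Part (b) follows from part (a). To prove (a), let
$$A_k = \{x : f(x) \le -k^{-1}\} \quad \text{and} \quad A = \{x : f(x) < 0\}.$$
Then $f \mathbf{1}_{A_k} \le -k^{-1} \mathbf{1}_{A_k}$ or $\mathbf{1}_{A_k} \le -k f \mathbf{1}_{A_k}$, hence, since $\int_{A_k} f\, d\lambda \ge 0$,
$$0 \le \lambda(A_k) \le -k \int_{A_k} f\, d\lambda \le 0.$$
Therefore, $\lambda(A_k) = 0$. Since $A = \bigcup_{k=1}^{\infty} A_k$, $\lambda(A) = 0$.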
11.2.9 Remark. The above properties of integrals on Rn also hold for integrals
on E ∈ M. For example, if f is integrable on E, then f is finite a.e. on E:
simply replace f in 11.2.6 by f · 1E . This observation applies to most of the
results that follow. We shall usually refrain from making this explicit, but the
reader is invited to formulate and verify such generalizations.
♦
Linearity of the Integral
The following lemma is a special case of the monotone convergence theorem
proved in the next section.
11.2.10 Lemma (Beppo Levi). If $\{f_k\}$ is a sequence of nonnegative Lebesgue
measurable functions such that $f_k \uparrow f$ on $\mathbb{R}^n$, then
$$\int f\, d\lambda = \lim_k \int f_k\, d\lambda.$$
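Proof. By 10.5.3, $f$ is Lebesgue measurable, hence $\int f\, d\lambda$ is defined. It follows
from $0 \le f_k \le f_{k+1} \le f$ and 11.2.4 that
$$\int f_k\, d\lambda \le \int f_{k+1}\, d\lambda \le \int f\, d\lambda.$$
Therefore, $L := \lim_k \int f_k\, d\lambda$ exists in $\overline{\mathbb{R}}$ and $L \le \int f\, d\lambda$. For the reverse
inequality, it suffices to show that $\int g\, d\lambda \le L$ for any $g \in \mathcal{S}_+(\mathcal{M})$ with $g \le f$.
Let $0 < r < 1$ and set $E_k = \{x : f_k(x) \ge r g(x)\}$. Since the sequence $\{f_k\}$ is
increasing, $E_k \subseteq E_{k+1}$. Since $f_k(x) \ge r g(x)$ for all large $k$, $E_k \uparrow \mathbb{R}^n$. If $g$ has
standard form $\sum_{j=1}^{m} a_j \mathbf{1}_{A_j}$, then
$$f_k \ge f_k \mathbf{1}_{E_k} \ge r \sum_{j=1}^{m} a_j \mathbf{1}_{E_k \cap A_j},$$
hence
$$\int f_k\, d\lambda \ge r \sum_{j=1}^{m} a_j \lambda(E_k \cap A_j).$$
Letting $k \to +\infty$, noting that $E_k \cap A_j \uparrow_k A_j$, we then obtain
$$L \ge r \sum_{j=1}^{m} a_j \lambda(A_j) = r \int g\, d\lambda.$$
Letting $r \uparrow 1$ yields $L \ge \int g\, d\lambda$, as required.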
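As an illustration of the lemma (an example that is not taken from the text), consider $f_k(x) = \min\{k, x^{-1/2}\}$ on $(0, 1]$ and $f_k = 0$ elsewhere. These functions increase to $f(x) = x^{-1/2}$ on $(0, 1]$, and elementary calculus gives $\int f_k\, d\lambda = 2 - 1/k$, assuming the agreement of Riemann and Lebesgue integrals for such functions that the text treats in Section 11.4. The Python sketch below evaluates these integrals exactly; the function name `integral_fk` is a hypothetical helper.

```python
from fractions import Fraction

def integral_fk(k):
    """Exact integral over (0, 1] of f_k(x) = min(k, x**-0.5), for k >= 1.
    f_k equals k on (0, 1/k^2] and x^{-1/2} on (1/k^2, 1]."""
    flat_part = k * Fraction(1, k * k)      # k * lambda((0, 1/k^2]) = 1/k
    tail_part = 2 - 2 * Fraction(1, k)      # integral of x^{-1/2} over (1/k^2, 1]
    return flat_part + tail_part

if __name__ == "__main__":
    for k in (1, 2, 10, 100, 1000):
        print(k, integral_fk(k), float(integral_fk(k)))
```

The values increase to 2, the integral of the limit function, which is the conclusion of the lemma in this particular case.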
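11.2.11 Theorem. If $f, g : \mathbb{R}^n \to [0, +\infty]$ are Lebesgue measurable, then
$$\int (\alpha f + \beta g)\, d\lambda = \alpha \int f\, d\lambda + \beta \int g\, d\lambda, \qquad \alpha, \beta \in \mathbb{R}^+.$$
In particular, if $f$ and $g$ are integrable then so is $\alpha f + \beta g$.
Proof. By 10.5.8, there exist sequences $\{f_k\}$ and $\{g_k\}$ in $\mathcal{S}_+(\mathcal{M})$ such that
$f_k \uparrow f$ and $g_k \uparrow g$. Then $\alpha f_k + \beta g_k \uparrow \alpha f + \beta g$ and, by 11.2.10 and 11.2.2,
$$\int (\alpha f + \beta g)\, d\lambda = \lim_k \int (\alpha f_k + \beta g_k)\, d\lambda = \alpha \lim_k \int f_k\, d\lambda + \beta \lim_k \int g_k\, d\lambda = \alpha \int f\, d\lambda + \beta \int g\, d\lambda.$$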
11.2.12 Corollary. Let f, g : Rn → R be Lebesgue measurable.
(a) f is integrable iff |f | is integrable.
(b) If f is integrable and |g| ≤ |f |, then g is integrable.
(c) If f and g are integrable, then f + g is integrable.
(d) If f is integrable and E ∈ M, then f is integrable on E.
Proof. (a) If $f$ is integrable then, by definition, both $f^+$ and $f^-$ are integrable,
hence, by the theorem, $|f| = f^+ + f^-$ is integrable. Conversely, if $|f|$ is
integrable, then the inequalities $0 \le \int f^{\pm}\, d\lambda \le \int |f|\, d\lambda$ show that both $f^+$
and $f^-$ are integrable, hence $f$ is integrable.
(b) By (a), |f | is integrable. The inequality |g| ≤ |f | then implies that |g| is
integrable. By (a) again, g is integrable.
(c) If f and g are integrable, then so are |f | and |g|. The inequality |f + g| ≤
|f | + |g| then shows that |f + g| is integrable. By (a), f + g is integrable.
(d) This follows from (b) since |f 1E | ≤ |f |.
The following theorem complements 11.2.11.
11.2.13 Theorem. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue measurable with $g$ integrable, and let $c \in \mathbb{R}$. Then the following hold:
(a) $cg$ is integrable and $\int cg\, d\lambda = c \int g\, d\lambda$.
(b) If $f$ is integrable, then $f + g$ is integrable and
$$\int (f + g)\, d\lambda = \int f\, d\lambda + \int g\, d\lambda. \tag{11.3}$$
(c) If $\int f\, d\lambda$ is defined, then $\int (f + g)\, d\lambda$ is defined and (11.3) holds.¹
Proof. (a) If $c \ge 0$, then $(cg)^+ = cg^+$ and $(cg)^- = cg^-$, hence, by 11.2.11, the
functions $(cg)^{\pm}$ are integrable and
$$\int cg\, d\lambda = \int (cg)^+\, d\lambda - \int (cg)^-\, d\lambda = c \int g^+\, d\lambda - c \int g^-\, d\lambda = c \int g\, d\lambda.$$
Next, observe that $(-g)^+ = g^-$ and $(-g)^- = g^+$ so
$$\int (-g)\, d\lambda = \int (-g)^+\, d\lambda - \int (-g)^-\, d\lambda = \int g^-\, d\lambda - \int g^+\, d\lambda = -\int g\, d\lambda.$$
Therefore, if $c < 0$,
$$\int cg\, d\lambda = \int (-c)(-g)\, d\lambda = -c \int (-g)\, d\lambda = c \int g\, d\lambda.$$
(b) By 11.2.12, $f + g$ is integrable. By 11.2.6, there exists a set $A$ of Lebesgue
measure zero such that $f(x), g(x) \in \mathbb{R}$ for $x \in A^c$. Then on the set $A^c$,
$(f + g)^+ - (f + g)^- = f + g = f^+ - f^- + g^+ - g^-$, hence
$$(f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+.$$
1 To avoid undefined expressions such as ∞ − ∞ in the integrand f + g in (b) and (c), it
must be assumed that g is finite-valued. This is no real loss of generality since g is integrable,
hence finite-valued a.e. (11.2.6).
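By 11.2.11 and 11.2.5,
$$\int (f + g)^+\, d\lambda + \int f^-\, d\lambda + \int g^-\, d\lambda = \int (f + g)^-\, d\lambda + \int f^+\, d\lambda + \int g^+\, d\lambda.$$
Since the integrals in this equation are finite, rearranging yields
$$\begin{aligned} \int (f + g)\, d\lambda &= \int (f + g)^+\, d\lambda - \int (f + g)^-\, d\lambda \\ &= \int f^+\, d\lambda - \int f^-\, d\lambda + \int g^+\, d\lambda - \int g^-\, d\lambda \\ &= \int f\, d\lambda + \int g\, d\lambda. \end{aligned}$$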
(c) The cases to be considered are
(i) $\int f^-\, d\lambda < +\infty$ and $\int f^+\, d\lambda = +\infty$;
(ii) $\int f^+\, d\lambda < +\infty$ and $\int f^-\, d\lambda = +\infty$.
Suppose that (i) holds. We may assume that both $f^-$ and $g$ are finite-valued.
Since
$$\int (f + g)^-\, d\lambda \le \int (f^- + g^-)\, d\lambda < +\infty,$$
$\int (f + g)\, d\lambda$ is defined. If $\int (f + g)^+\, d\lambda < +\infty$, then $f + g$ would be integrable,
hence, by part (b), so would $f^+ = (f + g) + f^- - g$, contrary to our assumption.
Therefore,
$$\int (f + g)\, d\lambda = +\infty = \int f\, d\lambda + \int g\, d\lambda.$$
Case (ii) is similar (or apply Case (i) to $-f$).
11.2.14 Corollary. If $f$ is integrable, then
$$\Big| \int f\, d\lambda \Big| \le \int |f|\, d\lambda.$$
Proof. Since $\pm f \le |f|$,
$$\pm \int f\, d\lambda = \int \pm f\, d\lambda \le \int |f|\, d\lambda.$$
Approximation of Integrable Functions
11.2.15 Definition. For $E \in \mathcal{M}$ and $f \in L^1(E)$ define the $L^1$ seminorm of $f$
by
$$\|f\|_1 := \int_E |f|\, d\lambda.$$
11.2.16 Theorem. $L^1(E)$ is a linear space and $\|\cdot\|_1$ has all the properties
of a norm except the coincidence property.
Proof. That $L^1(E)$ is a linear space follows from 11.2.13. Coincidence may fail
since $\|f\|_1 = 0$ only implies that $f = 0$ a.e. (Consider the Dirichlet function.)
The other properties of a norm are easily established.
11.2.17 Theorem. Let $f \in L^1(\mathbb{R}^n)$ and $\varepsilon > 0$. Then there exists a simple
function $g$ and a continuous function $h$, each vanishing outside a bounded
interval, such that $\|f - g\|_1 < \varepsilon$ and $\|f - h\|_1 < \varepsilon$.
Proof. By considering positive and negative parts, we may assume that $f \ge 0$.
By definition of $\int f\, d\lambda$, there exists $f_s \in \mathcal{S}_+(\mathcal{M})$ with $f_s \le f$ such that
$$\|f - f_s\|_1 = \int f\, d\lambda - \int f_s\, d\lambda < \varepsilon/4.$$
Let $f_s = \sum_{i=1}^{m} a_i \mathbf{1}_{A_i}$, where $a_i > 0$. Since
$$\sum_{i=1}^{m} a_i \lambda(A_i) = \int f_s\, d\lambda \le \int f\, d\lambda < +\infty,$$
$\lambda(A_i) < +\infty$ for each $i$. Let $M = \max_i a_i$. By 10.1.6(d), there exists a bounded
interval $I$ such that
$$\lambda(A_i) - \lambda(I \cap A_i) < \varepsilon/(4Mm), \qquad i = 1, \ldots, m.$$
Set $B_i := A_i \cap I$ and $g := \sum_{i=1}^{m} a_i \mathbf{1}_{B_i}$. Then
$$\|g - f_s\|_1 = \sum_{i=1}^{m} a_i \big( \lambda(A_i) - \lambda(B_i) \big) < \varepsilon/4,$$
hence
$$\|f - g\|_1 \le \|f - f_s\|_1 + \|f_s - g\|_1 < \varepsilon/2.$$
To obtain $h$, for each $i$ choose a compact set $C_i$ and a bounded open set $U_i$
such that $C_i \subseteq B_i \subseteq U_i$ and $\lambda(U_i \setminus C_i) < \varepsilon/(4mM)$ (10.4.6). By Exercise 8.5.15,
there exists a continuous function $h_i : \mathbb{R}^n \to [0, 1]$ such that $h_i = 1$ on $C_i$ and
$h_i = 0$ on $U_i^c$. Since $h_i - \mathbf{1}_{B_i} = 0$ on $C_i \cup U_i^c = (U_i \setminus C_i)^c$,
$$\|\mathbf{1}_{B_i} - h_i\|_1 = \int_{U_i \setminus C_i} |\mathbf{1}_{B_i} - h_i|\, d\lambda \le 2\lambda(U_i \setminus C_i) < \varepsilon/(2mM).$$
The function $h := \sum_{i=1}^{m} a_i h_i$ is continuous and by the triangle inequality
$\|g - h\|_1 < \varepsilon/2$. Therefore, $\|f - h\|_1 < \varepsilon$, completing the proof.
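The simplest instance of the theorem is the approximation of an indicator function by a continuous function. The Python sketch below is an illustration under assumptions not in the text: it takes $f = \mathbf{1}_{[0,1]}$ on $\mathbb{R}$, uses a piecewise linear function (here called `ramp`, a hypothetical name) that equals 1 on $[0, 1]$ and vanishes outside $(-\delta, 1 + \delta)$, and estimates the $L^1$ distance with a midpoint rule. The exact distance is $\delta$, the combined area of the two triangular ramps.

```python
def indicator(x):
    """Indicator function of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def ramp(x, delta):
    """Continuous piecewise linear function: 1 on [0, 1],
    0 outside (-delta, 1 + delta), linear in between."""
    if 0.0 <= x <= 1.0:
        return 1.0
    if -delta < x < 0.0:
        return 1.0 + x / delta
    if 1.0 < x < 1.0 + delta:
        return 1.0 - (x - 1.0) / delta
    return 0.0

def l1_distance(delta, n=100_000):
    """Midpoint-rule estimate of the integral of |1_[0,1] - ramp|
    over [-delta, 1 + delta], outside of which both functions vanish."""
    a, b = -delta, 1.0 + delta
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * step
        total += abs(indicator(x) - ramp(x, delta))
    return total * step

if __name__ == "__main__":
    for delta in (0.5, 0.1, 0.01):
        print(delta, l1_distance(delta))
```

Choosing $\delta < \varepsilon$ therefore gives a continuous function within $\varepsilon$ of the indicator in the $L^1$ seminorm, mirroring the role of the sets $C_i \subseteq B_i \subseteq U_i$ in the proof.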
Translation Invariance of the Integral
11.2.18 Theorem. If $f : \mathbb{R}^n \to \mathbb{R}$ is Lebesgue measurable and $y \in \mathbb{R}^n$, then
$$\int f(x + y)\, dx = \int f(x)\, dx \tag{11.4}$$
in the sense that if one side is defined, then so is the other and the integrals
are then equal.
Proof. If $E \in \mathcal{M}$, then $E - y \in \mathcal{M}$ and $\lambda(E - y) = \lambda(E)$ (Exercise 10.3.1),
hence
$$\int \mathbf{1}_E(y + x)\, dx = \int \mathbf{1}_{E - y}\, d\lambda = \lambda(E - y) = \int \mathbf{1}_E\, d\lambda.$$
Therefore, (11.4) holds for indicator functions.
For a function $h$, define $h_y(x) := h(y + x)$. Let $f \ge 0$ and let $g \in \mathcal{S}_+(\mathcal{M})$
with $g \le f$. Then $g_y \le f_y$ and, by the first paragraph and linearity, $\int g = \int g_y$,
hence $\int g \le \int f_y$. Taking the supremum over $g$ yields $\int f \le \int f_y$. Replacing $y$
by $-y$ and $f$ by $f_y$ in this inequality produces the reverse inequality. Therefore,
(11.4) holds for $f \ge 0$. The general case follows from this and the identities
$(f^{\pm})_y = (f_y)^{\pm}$.
Exercises
1. Let $f$ and $g$ be integrable. Prove:
(a) If $\int_E f\, d\lambda \le \int_E g\, d\lambda$ for all $E \in \mathcal{M}$, then $f \le g$ a.e.
(b) If $\int_E f\, d\lambda = \int_E g\, d\lambda$ for all $E \in \mathcal{M}$, then $f = g$ a.e.
2. Let $f(x) = 1 + r(\lfloor x^{-1} \rfloor)$ for $0 < x \le 1$, where $r(k)$ is the remainder on
division of the positive integer $k$ by 3. (Cf. Exercise 10.5.15.) Show that
$$\int_{(0,1]} f\, d\lambda = \frac{3}{2} + \sum_{k=1}^{\infty} \left[ \frac{1}{3k(3k+1)} + \frac{2}{(3k+1)(3k+2)} + \frac{3}{(3k+2)(3k+3)} \right].$$
3.S Define $f : [0, 1] \to \mathbb{R}$ by $f(x) = 0$ if $x$ is rational, and $f(x) = d^2$ if $x$
is irrational, where $d$ is the first nonzero digit in the decimal expansion
of $x$. (See Exercise 10.5.16.) Show that $\int_{[0,1]} f\, d\lambda = 95/3$.
4. (a) Prove the following mean value theorem for integrals: Let $f$ be
continuous on a compact connected set $K \subseteq \mathbb{R}^n$. Then there exists
$x_K \in K$ such that
$$\int_K f\, d\lambda = f(x_K) \lambda(K).$$
(b) Let $f$ be continuous on $C_1(x_0)$. Prove that
$$\lim_{r \to 0} \frac{1}{\lambda(C_r(x_0))} \int_{C_r(x_0)} f\, d\lambda = f(x_0).$$
5.S Let $f$ be Lebesgue measurable on $\mathbb{R}$ and let $m \le f \le M$ on $E \in \mathcal{M}(\mathbb{R})$.
(a) Prove that if $g$ is integrable on $E$, then there exists $a \in [m, M]$ such
that
$$\int_E f |g|\, d\lambda = a \int_E |g|\, d\lambda.$$
(b) Show that part (a) may be false if $|g|$ is replaced by $g$.
(c) Use (a) to show that at each point $x$ where $f$ is continuous,
$$\lim_{y \to x} \frac{1}{y - x} \left[ \int_{[a,y]} f\, d\lambda - \int_{[a,x]} f\, d\lambda \right] = f(x).$$
6. (Cauchy–Schwarz inequality) Let $f$ and $g$ be Lebesgue measurable on
$\mathbb{R}^n$. Prove that
$$\left( \int |fg|\, d\lambda \right)^2 \le \int f^2\, d\lambda \cdot \int g^2\, d\lambda.$$
(See 5.7.19.)
7.S Prove that if $f$ is integrable on $[0, 1]$ and $\varepsilon > 0$, then there exists a
polynomial $P$ on $[0, 1]$ such that $\int_0^1 |f - P|\, d\lambda < \varepsilon$.
8. (Absolute continuity of the integral). Let $f \ge 0$ be integrable on $\mathbb{R}^n$.
Prove that for each $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$\int_E f\, d\lambda < \varepsilon \quad \text{for all } E \in \mathcal{M}(\mathbb{R}^n) \text{ with } \lambda(E) < \delta.$$
Conclude that if $\{E_k\}$ is a sequence in $\mathcal{M}(\mathbb{R}^n)$ with $\lambda(E_k) \to 0$, then
$\int_{E_k} f\, d\lambda \to 0$. Hint. Begin with simple functions.
9.S Let $f$ be integrable. Prove that $\lim_k \int_{[k,k+1]} f\, d\lambda = 0$. (A quick proof
uses the dominated convergence theorem. For now, give a proof starting
with simple functions.)
10. Suppose $f : I = [0, 1] \to [-1, 1]$ is integrable. Prove that
$$\int_I f^2\, d\lambda \le \varepsilon^2 + \lambda\{x : |f(x)| > \varepsilon\} \quad \text{for every } \varepsilon > 0.$$
11.S Let $f : I = [0, 1] \to \mathbb{R}$ be integrable. Prove that
$$\int_I f^2\, d\lambda \ge \varepsilon^2 \lambda\{x \in I : |f(x)| > \varepsilon\} \quad \text{for every } \varepsilon > 0.$$
12. Prove: If $f$ is integrable and $f < 1$ a.e. on $I$, then $\int_I f < 1$.
13. Let $f$ be Lebesgue integrable on $E \in \mathcal{M}$ with $0 < \lambda(E) < +\infty$ and
$\int_E f\, d\lambda \ge \lambda(E)$. Prove that $\lambda\{x \in E : f(x) \ge 1\} > 0$.
14. Let $f$ be integrable on $\mathbb{R}^n$. Show that for each $r > 0$ the function
$f_r(x) := f(rx)$ is integrable and $\int f_r\, d\lambda = r^{-n} \int f\, d\lambda$.
15.S Let $f : [0, 1] \to \mathbb{R}$ be a bounded Lebesgue measurable function such
that $\int_{[0,1]} x^{2k} f(x)\, d\lambda(x) = 0$ for all $k \in \mathbb{Z}^+$. Prove that $f = 0$ a.e.
16. Let $f$ be Lebesgue integrable and $g$, $g'$ bounded and continuous on $\mathbb{R}$.
Carry out the following steps to show that
$$\lim_k \int f(x) g'(kx)\, d\lambda(x) = 0. \tag{11.5}$$
(a) Prove (11.5) for $f = \mathbf{1}_{[a,b]}$. (Use the fact that the Riemann and
Lebesgue integrals of a continuous function on a closed bounded
interval are equal (Section 11.4).)
(b) Use (a) to show that (11.5) holds for $f = \mathbf{1}_U$, where $U$ is bounded
and open.
(c) Use (b) and 10.4.7 to show that (11.5) holds for $f = \mathbf{1}_E$, where
$E \in \mathcal{M}$ is bounded.
(d) Use (c) and 11.2.17 to complete the proof.
If $g'(x) = \sin x$ or $\cos x$, then (11.5) is known as the Riemann–Lebesgue
lemma.
11.3 Convergence Theorems
In this section we state and prove three pointwise convergence theorems
for the Lebesgue integral. The first of these is a generalization of 11.2.10.
11.3.1 Monotone Convergence Theorem. Let $f_k : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue
measurable with $f_k \uparrow f$ on $\mathbb{R}^n$ and let $\int f_1^-\, d\lambda < +\infty$. Then
$$\int f\, d\lambda = \lim_k \int f_k\, d\lambda. \tag{11.6}$$
Proof. From $0 \le f^- \le f_k^- \le f_1^-$ we have $\int f_k^-\, d\lambda < +\infty$ and $\int f^-\, d\lambda < +\infty$,
hence the integrals in the assertion of the theorem are defined. If $\int f_1^+\, d\lambda = +\infty$,
then from $f_1^+ \le f_k^+ \le f^+$ we see that each side of (11.6) is $+\infty$. If
$\int f_1^+\, d\lambda < +\infty$, then $f_1$ is integrable and we may apply 11.2.10 to $f_k - f_1$
($\ge 0$) to obtain
$$\int f_k\, d\lambda = \int (f_k - f_1)\, d\lambda + \int f_1\, d\lambda \to \int (f - f_1)\, d\lambda + \int f_1\, d\lambda = \int f\, d\lambda.$$
11.3.2 Remark. Equation (11.6) is still true if the inequalities fk ≤ fk+1 ≤ f
and the convergence fk ↑ f hold only almost everywhere. To see this, let A
denote the set on which $f_k \le f_{k+1}$ for all $k$ and $f_k \uparrow f$. Set $\tilde{f}_k = f_k \mathbf{1}_A$ and
$\tilde{f} = f \mathbf{1}_A$. Then (11.6) holds for the new functions. Since $\lambda(A^c) = 0$, 11.2.5
shows that the equation holds for the original functions. Analogous remarks
apply to the other convergence theorems in this section.
♦
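11.3.3 Corollary. If $g_k$ is Lebesgue measurable and nonnegative for every $k$,
then
$$\int \sum_{k=1}^{\infty} g_k\, d\lambda = \sum_{k=1}^{\infty} \int g_k\, d\lambda.$$
Proof. Let $f_k = \sum_{j=1}^{k} g_j$ and $f = \sum_{j=1}^{\infty} g_j$. Then $0 \le f_k \uparrow f$ on $\mathbb{R}^n$, hence, by
the theorem and linearity,
$$\int f\, d\lambda = \lim_k \int f_k\, d\lambda = \lim_k \sum_{j=1}^{k} \int g_j\, d\lambda.$$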
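11.3.4 Corollary. Let $f \ge 0$ be Lebesgue measurable. Define a function $\mu$ on
$\mathcal{M}(\mathbb{R}^n)$ by
$$\mu(E) := \int_E f\, d\lambda, \qquad E \in \mathcal{M}(\mathbb{R}^n).$$
Then $\mu$ is a measure on $\mathcal{M}(\mathbb{R}^n)$.
Proof. For countable additivity, apply 11.3.3 to $g_k = f \mathbf{1}_{E_k}$, where $\{E_k\}$ is a sequence of pairwise disjoint sets in $\mathcal{M}(\mathbb{R}^n)$.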
11.3.5 Fatou’s Lemma. If $f_k$ is nonnegative and Lebesgue measurable for
every $k$, then
$$\int \liminf_k f_k\, d\lambda \le \liminf_k \int f_k\, d\lambda. \tag{11.7}$$
Proof. Let $g_k = \inf_{j \ge k} f_j$ and $g = \liminf_k f_k$. Then $g_k \le f_k$, $g_k \uparrow g$, and
$g_k$ and $g$ are Lebesgue measurable (10.5.3). By the monotone convergence
theorem,
$$\int \liminf_k f_k\, d\lambda = \int g\, d\lambda = \lim_k \int g_k\, d\lambda = \liminf_k \int g_k\, d\lambda \le \liminf_k \int f_k\, d\lambda.$$
The inequality in (11.7) may be strict. For example, if $f_k = k \mathbf{1}_{[0,1/k]}$, then
the left side is zero while the right side is one.
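The strict inequality in this example can be checked directly. The Python sketch below is only an illustration; the helper names `f` and `integral_f` are hypothetical, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def f(k, x):
    """f_k(x) = k * 1_[0, 1/k](x)."""
    return k if 0 <= x <= Fraction(1, k) else 0

def integral_f(k):
    """Integral of f_k: the height k times the length 1/k of [0, 1/k]."""
    return k * Fraction(1, k)

if __name__ == "__main__":
    x = Fraction(1, 7)
    # For fixed x > 0 the values f_k(x) vanish as soon as 1/k < x.
    print([f(k, x) for k in range(5, 11)])
    # Every integral equals 1, so the lim inf of the integrals is 1.
    print([integral_f(k) for k in range(1, 6)])
```

For each fixed $x > 0$ the sequence $f_k(x)$ is eventually zero, so $\liminf_k f_k = 0$ except at the single point $x = 0$, while every $\int f_k\, d\lambda$ equals 1.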
11.3.6 Dominated Convergence Theorem. Let $g : \mathbb{R}^n \to [0, +\infty]$ be
integrable and let $\{f_k : \mathbb{R}^n \to \mathbb{R}\}$ be a sequence of Lebesgue measurable functions
such that $|f_k| \le g$ for all $k$. If $f_k \to f$ on $\mathbb{R}^n$, then $f$ is integrable and
$\int f_k\, d\lambda \to \int f\, d\lambda$.
Proof. Since $|f| \le g$, $f_k$ and $f$ are integrable (11.2.12). Fatou’s lemma applied
to $g \pm f_k$ ($\ge 0$) shows that
$$\int g\, d\lambda + \int f\, d\lambda \le \liminf_k \int (g + f_k)\, d\lambda = \int g\, d\lambda + \liminf_k \int f_k\, d\lambda$$
and
$$\int g\, d\lambda - \int f\, d\lambda \le \liminf_k \int (g - f_k)\, d\lambda = \int g\, d\lambda - \limsup_k \int f_k\, d\lambda.$$
Subtracting $\int g\, d\lambda$ in each inequality yields
$$\int f\, d\lambda \le \liminf_k \int f_k\, d\lambda \le \limsup_k \int f_k\, d\lambda \le \int f\, d\lambda.$$
The following example illustrates that care must be taken when applying
the dominated convergence theorem.
11.3.7 Example. Let $p > 0$ and define
$$f_k(x) := \frac{k}{1 + k^2 x^{2p}}, \quad 0 < x \le 1, \quad \text{and} \quad I_k := \int_{(0,1]} f_k\, d\lambda, \quad k \in \mathbb{N}.$$
Clearly, $f_k \to 0$ for all $p > 0$. We show that $\lim_k I_k = 0$ iff $0 < p < 1$. By
Section 11.4, below, the integrals are Riemann, hence, making the substitution
$t = kx^p$ and setting $q = p^{-1} - 1$, we obtain
$$I_k = \int_0^1 \frac{k}{1 + k^2 x^{2p}}\, dx = \frac{1}{p k^q} \int_0^k \frac{t^q}{1 + t^2}\, dt = \int g_k\, d\lambda,$$
where
$$g_k(t) = \frac{1}{p k^q}\, \frac{t^q}{1 + t^2}\, \mathbf{1}_{[0,k]}(t).$$
If $p = 1$, then $q = 0$ and $I_k = \arctan k \to \pi/2$. If $0 < p < 1$, then $g_k \to 0$ and
$g_k(t) \le p^{-1}(1 + t^2)^{-1}$ for all $t \ge 0$ and all $k$, so $I_k \to 0$ by the dominated
convergence theorem. Finally, if $p > 1$, then $-1 < q < 0$ and
$$I_k \ge \frac{1}{p k^q} \int_0^1 \frac{1}{1 + t^2}\, dt \to +\infty.$$
♦
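The three regimes in Example 11.3.7 can be observed numerically. The following Python sketch is an illustration only: it approximates $I_k$ with a composite Simpson rule (a standard quadrature rule, not a method used in the text), and the names `simpson` and `integral_Ik` are hypothetical. For $p = 1$ the closed form $\arctan k$ from the example is available for comparison, and for $p = 1/2$ direct integration gives $I_k = k^{-1}\ln(1 + k^2)$.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def integral_Ik(k, p, n=200_000):
    """Numerical approximation of I_k, the integral of k / (1 + k^2 x^(2p)) over (0, 1]."""
    return simpson(lambda x: k / (1 + k * k * x ** (2 * p)), 0.0, 1.0, n)

if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, [round(integral_Ik(k, p), 4) for k in (1, 10, 100)])
```

In the output the values for $p = 1/2$ decrease toward 0, those for $p = 1$ approach $\pi/2$, and those for $p = 2$ keep growing, even though $f_k \to 0$ pointwise in all three cases.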
The following theorem gives general conditions under which one may
“differentiate under the integral sign.”
11.3.8 Theorem. Let $f(x, y)$ be Lebesgue measurable on $I := (a, b) \times (c, d)$ such that for each $y$ in $(c, d)$ the function $f(\cdot, y)$ is Lebesgue integrable on $(a, b)$ and the derivative $f_y$ exists on $I$. If there exists an integrable function $g$ on $(a, b)$ such that $|f_y(x, y)| \le g(x)$ for all $(x, y) \in I$, then
$$\frac{d}{dy}\int_{(a,b)} f(x, y)\,d\lambda(x) = \int_{(a,b)} \frac{\partial f}{\partial y}(x, y)\,d\lambda(x).$$
Proof. We prove the right-hand derivative version. Let $y \in (c, d)$ and $y_k \downarrow y$. Set
$$G(y) = \int_{(a,b)} f(x, y)\,d\lambda(x) \quad\text{and}\quad g_k(x) = \frac{f(x, y_k) - f(x, y)}{y_k - y}.$$
By the mean value theorem, $g_k(x) = f_y(x, t_k)$ for some $t_k \in (y, y_k)$, hence $|g_k| \le g$. Since $g_k(x) \to f_y(x, y)$, the dominated convergence theorem implies that
$$\frac{G(y_k) - G(y)}{y_k - y} = \int_{(a,b)} g_k(x)\,d\lambda(x) \to \int_{(a,b)} f_y(x, y)\,d\lambda(x).$$
Since $\{y_k\}$ was arbitrary, $G_r'(y)$ exists and equals $\int_{(a,b)} f_y(x, y)\,d\lambda(x)$.
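As a numerical illustration of the theorem, the following sketch takes the sample integrand $f(x, y) = \sin(xy)$ on $(0, 1) \times (0, 2)$, for which $|f_y(x, y)| = |x\cos(xy)| \le 1$. The choice of integrand, the midpoint quadrature, and all names are assumptions made for this example.

```python
import math

def midpoint(func, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of func over (a, b)."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def G(y):
    """G(y) = integral over (0, 1) of sin(x*y) dx."""
    return midpoint(lambda x: math.sin(x * y), 0.0, 1.0)

def integral_of_fy(y):
    """Integral over (0, 1) of the partial derivative x*cos(x*y)."""
    return midpoint(lambda x: x * math.cos(x * y), 0.0, 1.0)

y = 1.0
target = integral_of_fy(y)
# Right-hand difference quotients along y_k = y + 2^(-j), which decreases to y.
quotients = [(G(y + 2.0 ** -j) - G(y)) / 2.0 ** -j for j in range(1, 11)]
```

The quotients approach `target`, whose exact value here is $\sin 1 - (1 - \cos 1)$.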
Exercises
1.S Prove the following:
(a) $\displaystyle \lim_k \int_{[0,\pi]} k \sin^k x\,(1 - \sin x)\,d\lambda = 0.$
(b) $\displaystyle \lim_k \int_{[0,+\infty)} \frac{k \sin x^{3/2}}{1 + k^2x^2}\,d\lambda(x) = 0.$
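2. Let $f : \mathbb{R}^n \to (0, +\infty)$ be integrable. Prove that
(a) $\displaystyle \int k \ln(1 + k^{-1}f)\,d\lambda \to \int f\,d\lambda.$
(b)S $\displaystyle \int k \ln(1 + k^{-2}f)\,d\lambda \to 0.$
(c)S $\displaystyle \int k \sin(k^{-1}f)\,d\lambda \to \int f\,d\lambda.$
(d) $\displaystyle \int_E f^{1/k}\,d\lambda \to \lambda(E)$, $E \in \mathcal{M}$.
3.S Let $f, g : \mathbb{R}^n \to (0, +\infty)$ be Lebesgue measurable with $g$ integrable. Prove:
$$\int g\,(1 + k^{-1}f)^k \exp(-f)\,d\lambda \to \int g\,d\lambda.$$
4. Let $f : \mathbb{R}^n \to [1, +\infty)$ be Lebesgue measurable and $g : \mathbb{R}^n \to [0, +\infty)$ integrable. Prove that
$$\int k^2 g \exp(-kf)\,d\lambda \to 0.$$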
5.S Let $f$ be integrable on $(0, \infty)$. Show that for each $t \in \mathbb{R}$ the function $f(x)\sin(tx)/x$ is integrable on $(0, \infty)$ and prove that the integral $\int_{(0,\infty)} f(x)x^{-1}\sin(tx)\,d\lambda(x)$ is continuous in $t$.
6. Prove that the derivative of the gamma function (5.7.8) is
$$\Gamma'(x) = \int_0^\infty t^{x-1}e^{-t}\ln t\,dt, \quad x > 0.$$
(Use the fact, proved in the next section, that the improper Riemann
integral and the Lebesgue integral of a nonnegative continuous function
are equal.)
7. Let $f : [0, +\infty) \to \mathbb{R}$ be bounded and Lebesgue measurable and suppose that $\lim_{x\to+\infty} f(x) = r$. Show that
$$\lim_k \int_{[0,a]} f(kx)\,d\lambda(x) = ar \quad\text{for every } a > 0.$$
Hint. Use Exercise 11.2.14.
8.S Let $f : \mathbb{R}^n \to \mathbb{R}$ be Lebesgue measurable and have countable range $\{a_1, a_2, \ldots\}$. Set $A_k = \{x \in \mathbb{R}^n : f(x) = a_k\}$. Prove that $f$ is integrable iff the series $\sum_{k=1}^\infty a_k\lambda(A_k)$ converges absolutely, in which case the value of the series is $\int f\,d\lambda$.
9. Let $p > 1$ and $f(x) := \lfloor x^{-1} \rfloor^{-p}$, $0 < x < 1$. Find $\int_{(0,1)} f\,d\lambda$.
10. Let $f_k$, $f$ be integrable and $E_k, E \in \mathcal{M}(\mathbb{R}^n)$ such that
$$\lim_k \|f_k - f\|_1 = 0 \quad\text{and}\quad \lim_k \lambda(E_k \,\Delta\, E) = 0$$
(see Exercise 10.5.10). Prove that
$$\lim_k \int_{E_k} f_k\,d\lambda = \int_E f\,d\lambda.$$
11. Let $f : \mathbb{R}^n \to \mathbb{R}$ be integrable and $\varepsilon > 0$.
(a) Prove that the set $A = \{x : |f(x)| \ge \varepsilon\}$ has finite measure.
(b) Show that there exists $B \in \mathcal{M}$ with $\lambda(B) < +\infty$ such that
$$\left|\int f\,d\lambda - \int_B f\,d\lambda\right| < \varepsilon.$$
12.S Let $\{f_k\}$ be a sequence of integrable functions on $\mathbb{R}^n$ such that $\sum_{k=1}^\infty \|f_k\|_1 < +\infty$. Prove that $\lim_k f_k(x) = 0$ a.e.
13. Let $f$ be Lebesgue integrable on $\mathbb{R}$ and $p > 0$. Prove that the series $\sum_{k=1}^\infty k^{-p}f(kx)$ converges absolutely a.e. on $\mathbb{R}$.
14. Let $T : C[a, b] \to C[a, b]$ be linear and continuous in the $L^1$ norm. If $f : [a, b] \times [c, d] \to \mathbb{R}$ is continuous, prove that
$$T\int_c^d f(\cdot, x)\,dx = \int_c^d Tf(\cdot, x)\,dx,$$
where the integrals may be taken to be Riemann.
15.S Prove the following extension of Fatou’s lemma: If $f_k$, $g$ are Lebesgue integrable on $\mathbb{R}^n$ and $f_k \ge g$ for all $k$, then
$$\int \liminf_k f_k\,d\lambda \le \liminf_k \int f_k\,d\lambda.$$
16. Let $g$ be integrable on $\mathbb{R}^n$ and let $\{f_k\}$ be a sequence of Lebesgue measurable functions on $\mathbb{R}^n$ such that $|f_k| \le g$. Show that
$$\int \liminf_k f_k \le \liminf_k \int f_k \le \limsup_k \int f_k \le \int \limsup_k f_k.$$
17. Let $f$, $f_k$ be nonnegative Lebesgue integrable functions on $\mathbb{R}^n$ such that $f_k \to f$. Prove that
$$\int f_k\,d\lambda \to \int f\,d\lambda \quad\text{iff}\quad \|f_k - f\|_1 \to 0.$$
Hint. For the necessity, note that $(f_k - f)^- \le f$.
18. Let $f$, $f_k$ be integrable on $\mathbb{R}^n$ with $f_k \to f$. Prove that
$$\|f_k - f\|_1 \to 0 \quad\text{iff}\quad \int |f_k|\,d\lambda \to \int |f|\,d\lambda.$$
Hint. For the sufficiency use Fatou’s lemma.
19.S Let $f$ be Lebesgue integrable on $\mathbb{R}$ such that $\int_{[a,b]} f\,d\lambda = 0$ for all intervals $[a, b]$. Prove that $f = 0$ a.e. Hint. Use 11.3.4, 10.4.4, and Exercise 11.2.8.
20. Let $f : \mathbb{R}^2 \to \mathbb{R}$ have the property that $f(x, y)$ is Lebesgue measurable in $y$ for each $x$ and continuous in $x$ for each $y$. Suppose there exists an integrable function $g : \mathbb{R} \to \mathbb{R}$ such that $|f(x, y)| \le g(y)$ for all $x$ and $y$. Prove that the function
$$F(x) := \int f(x, y)\,d\lambda(y)$$
is continuous.
21. Let $f \ge 0$ be integrable on $[1, +\infty)$. Prove that $\sum_{k=1}^\infty f(x + k)$ is integrable on $[0, 1]$. Conclude that the series converges a.e. on $[1, +\infty)$.
22. Let $f$ be Lebesgue measurable on $I = [0, 1]$ and set
$$A_k = \{x \in I : |f(x)| \ge k\}.$$
Prove:
(a) $f$ is integrable on $I$ iff $\sum_{k=0}^\infty \lambda(A_k)$ converges.
(b) If $f$ is integrable on $I = [0, 1]$ then $\lim_k k\lambda(A_k) = 0$.
11.4 Connections with Riemann Integration
Throughout the section, f denotes an arbitrary bounded
real-valued function on a closed and bounded interval [a, b].
In this section we show that f is Riemann integrable if and only if its set
of discontinuities has Lebesgue measure zero. The first step is to show that
the upper and lower integrals of f may be expressed as integrals of Borel
measurable functions.
By 5.2.1 there exists a sequence of partitions $\{P_k\}$ of $[a, b]$ such that $P_{k+1}$ is a refinement of $P_k$, $\|P_k\| \to 0$, and
$$\underline{\int_a^b} f = \lim_k \underline{S}(f, P_k), \qquad \overline{\int_a^b} f = \lim_k \overline{S}(f, P_k).$$
Define
$$h_k = \sum_j m_j \mathbf{1}_{[x_{j-1},x_j]} \quad\text{and}\quad g_k = \sum_j M_j \mathbf{1}_{[x_{j-1},x_j]},$$
where
$$m_j := \inf_{x_{j-1}\le x\le x_j} f(x), \qquad M_j := \sup_{x_{j-1}\le x\le x_j} f(x),$$
and the intervals $[x_{j-1}, x_j]$ are those generated by the partition $P_k$. Then $g_k$ and $h_k$ are Borel measurable simple functions and
$$\underline{S}(f, P_k) = \int_{[a,b]} h_k\,d\lambda, \qquad \overline{S}(f, P_k) = \int_{[a,b]} g_k\,d\lambda.$$
Moreover,
$$h_1 \le h_2 \le \cdots \le f \le \cdots \le g_2 \le g_1,$$
hence $h(x) := \lim_k h_k(x)$ and $g(x) := \lim_k g_k(x)$ exist in $\mathbb{R}$ for each $x \in [a, b]$, $h \le f \le g$, and $h$ and $g$ are Borel measurable. If $M$ is a bound for $|f|$, then $|h_k|, |g_k| \le M$ a.e., hence by the dominated convergence theorem
$$\underline{\int_a^b} f = \lim_k \underline{S}(f, P_k) = \lim_k \int_{[a,b]} h_k\,d\lambda = \int_{[a,b]} h\,d\lambda \tag{11.8}$$
and
$$\overline{\int_a^b} f = \lim_k \overline{S}(f, P_k) = \lim_k \int_{[a,b]} g_k\,d\lambda = \int_{[a,b]} g\,d\lambda. \tag{11.9}$$
11.4.1 Lemma. $f \in \mathcal{R}_a^b$ iff $g = h$ a.e. In this case, $f$ is Lebesgue measurable and $\int_a^b f = \int_{[a,b]} f\,d\lambda$.
Proof. From (11.8) and (11.9), $f \in \mathcal{R}_a^b$ iff $\int_{[a,b]} (g - h)\,d\lambda = 0$, which, by 11.2.7, is equivalent to $g = h$ a.e. If this holds, then $h = f = g$ a.e. so $f$ is Lebesgue measurable and $\int_a^b f = \int_{[a,b]} f\,d\lambda$ by (11.8) and (11.9).
11.4.2 Lemma. Suppose that x ∈ [a, b] is not a member of any of the partitions
Pk . Then f is continuous at x iff h(x) = g(x).
Proof. Suppose f is continuous at x. Given ε > 0, choose δ > 0 such that
y ∈ [a, b] and |x − y| < δ implies |f (x) − f (y)| < ε. Choose N so that ‖Pk‖ < δ
for all k ≥ N and fix k ≥ N . Since x is in some subinterval (xj−1 , xj ) of Pk ,
f (x) − ε < f (y) < f (x) + ε for all y ∈ [xj−1 , xj ],
hence
f (x) − ε ≤ hk (x) = mj ≤ Mj = gk (x) ≤ f (x) + ε.
Letting k → +∞ yields
f (x) − ε ≤ h(x) ≤ g(x) ≤ f (x) + ε,
and since ε was arbitrary, g(x) = h(x).
Conversely, let g(x) = h(x). Given ε > 0, choose k such that
|gk (x) − g(x)| < ε and |hk (x) − h(x)| < ε.
Suppose that x is in the open subinterval (xi−1 , xi ) of Pk . Choose δ > 0 so
that (x − δ, x + δ) ⊆ (xi−1 , xi ). Then for all y ∈ (x − δ, x + δ),
h(x) − ε ≤ hk (x) ≤ f (y) ≤ gk (x) ≤ g(x) + ε = h(x) + ε,
which implies that |f (x) − f (y)| < 2ε. Therefore, f is continuous at x.
Here is the main result of the section.
11.4.3 Theorem. Let $f : [a, b] \to \mathbb{R}$ be bounded. Then $f \in \mathcal{R}_a^b$ iff the set $D$ of discontinuities of $f$ has Lebesgue measure zero. In this case, $f$ is Lebesgue measurable and
$$\int_a^b f(x)\,dx = \int_{[a,b]} f\,d\lambda.$$
Proof. Let $A$ denote the union of the partitions $P_k$ and set $B = \{x : g(x) \ne h(x)\}$. By 11.4.2,
$$B \cap A^c \subseteq D \subseteq A \cup B.$$
Since $A$ is countable, $\lambda(A) = 0$, hence $\lambda(B \cap A^c) = \lambda(A \cup B) = \lambda(B)$. It follows that $\lambda(B) = \lambda(D)$. Thus, by 11.4.1, $f \in \mathcal{R}_a^b$ iff $\lambda(D) = 0$.
11.4.4 Example. Let $A := (0, 1) \setminus E$, where $E$ is the Cantor ternary set (10.3.4). Since $A$ is open, the function $f(x) = \mathbf{1}_A(x)\sin(\pi x)$ is continuous on $A$. Since $\lambda(E) = 0$, $f$ is both Riemann and Lebesgue integrable on $[0, 1]$ and
$$\int_0^1 f(x)\,dx = \int_0^1 \sin(\pi x)\,dx = \frac{2}{\pi}. \qquad ♦$$
11.4.5 Remark. Theorem 11.4.3 readily extends to n-dimensional Riemann
integrals; the statement and proof are essentially the same. Note that in this
case, a Riemann integrable function f may be discontinuous on m-dimensional
hyperplanes, m < n, as these have Lebesgue measure zero (see 11.6.9).
♦
Here is the connection between improper integrals and Lebesgue integrals.
11.4.6 Corollary. Let g be locally Riemann integrable on [a, b) (where b could
be infinite). Then g is Lebesgue measurable on [a, b). Moreover:
(a) If g ≥ 0, then g is improperly integrable on [a, b) iff g is Lebesgue integrable
on [a, b), in which case
$$\int_a^b g = \int_{[a,b)} g\,d\lambda. \tag{11.10}$$
(b) If g is Lebesgue integrable on [a, b), then g is improperly integrable on [a, b)
and (11.10) holds.
(c) If g is improperly integrable on [a, b), then g need not be Lebesgue integrable
on [a, b).
Proof. (a) Let $b_k \uparrow b$ and let $D$ denote the set of discontinuities of $g$ on $[a, b)$. Since $g$ is Riemann integrable on $[a, b_k]$, $\lambda([a, b_k] \cap D) = 0$. By the theorem, $\mathbf{1}_{[a,b_k]}\,g$ is Lebesgue measurable for every $k$ and
$$\int_a^{b_k} g = \int_{[a,b_k]} g\,d\lambda = \int \mathbf{1}_{[a,b_k]}\,g\,d\lambda.$$
Taking limits we see that $g$ is Lebesgue measurable and, by the monotone convergence theorem, (11.10) holds.
(b) If $g$ is Lebesgue integrable on $[a, b)$, then, by (a), $g^+$ and $g^-$ are improperly integrable on $[a, b)$, hence (b) holds.
(c) The function
g(x) = x−1 sin x
is improperly integrable but not absolutely improperly integrable on [1, +∞)
(5.7.18). Since a Lebesgue integrable function is absolutely integrable, g cannot
be Lebesgue integrable on [1, +∞).
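Part (c) can be observed numerically. The following sketch compares truncated integrals of $x^{-1}\sin x$ and of its absolute value over $[1, N]$; the midpoint quadrature, the step counts, and the function names are assumptions made for this illustration.

```python
import math

def midpoint(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def signed(N):
    """Approximate integral of sin(x)/x over [1, N]."""
    return midpoint(lambda x: math.sin(x) / x, 1.0, N, 200 * int(N))

def absolute(N):
    """Approximate integral of |sin(x)|/x over [1, N]."""
    return midpoint(lambda x: abs(math.sin(x)) / x, 1.0, N, 200 * int(N))

for N in (10.0, 100.0, 1000.0):
    print(N, round(signed(N), 4), round(absolute(N), 4))
```

The signed integrals settle down as $N$ grows, while the integrals of the absolute value keep growing, roughly like a multiple of $\ln N$.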
11.5 Iterated Integrals
For the remainder of the text we also use the notation $dx$ for $d\lambda(x)$ and $\int_a^b f(x)\,dx$ for $\int_{[a,b]} f(x)\,d\lambda(x)$, etc.
In this section we state and prove a result that gives general conditions
under which the Lebesgue integral of a function on Rn may be expressed as
an iterated integral, a useful tool for evaluating integrals.
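11.5.1 Fubini–Tonelli Theorem. Let $f$ be Borel measurable on $\mathbb{R}^n$ and let $p, q \in \mathbb{N}$ with $p + q = n$.
(a) If $f \ge 0$, then the functions
$$\int_{\mathbb{R}^q} f(x, z)\,dz \quad\text{and}\quad \int_{\mathbb{R}^p} f(z, y)\,dz$$
are Borel measurable in $x \in \mathbb{R}^p$ and $y \in \mathbb{R}^q$, respectively, and
$$\int_{\mathbb{R}^p}\int_{\mathbb{R}^q} f(x, z)\,dz\,dx = \int_{\mathbb{R}^q}\int_{\mathbb{R}^p} f(z, y)\,dz\,dy. \tag{11.11}$$
(b) If either of the iterated integrals
$$\int_{\mathbb{R}^p}\int_{\mathbb{R}^q} |f(x, z)|\,dz\,dx \quad\text{or}\quad \int_{\mathbb{R}^q}\int_{\mathbb{R}^p} |f(z, y)|\,dz\,dy$$
is finite, then both are finite, $f$ is integrable, and (11.11) holds.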
By induction we have
11.5.2 Corollary. Let $f$ be Borel measurable on $\mathbb{R}^n$ such that
$$\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty |f(x_1, \ldots, x_n)|\,dx_{i_1}\cdots dx_{i_n} < +\infty$$
for some permutation $(i_1, \ldots, i_n)$ of $(1, \ldots, n)$. Then $f$ is integrable and
$$\int_{\mathbb{R}^n} f\,d\lambda = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f(x_1, \ldots, x_n)\,dx_{j_1}\cdots dx_{j_n}$$
for every permutation $(j_1, \ldots, j_n)$ of $(1, \ldots, n)$.
11.5.3 Example. We prove the Gaussian density formula
$$\int_{-\infty}^\infty \varphi(t)\,dt = 1, \quad\text{where}\quad \varphi(t) := \frac{e^{-t^2/2}}{\sqrt{2\pi}}. \tag{11.12}$$
By 11.4.6, the integral may be interpreted either as a Lebesgue integral or as an improper Riemann integral. The function $\varphi$ is called the standard normal (or Gaussian) density. It plays an important role in probability and statistics. For example, $\sigma^{-1}\int_a^b \varphi\big((x - \mu)/\sigma\big)\,dx$ is the probability that randomly chosen data from a normally distributed population with mean $\mu$ and standard deviation $\sigma$ lies between $a$ and $b$, and $\sigma^{-1}\int_{-\infty}^\infty \varphi\big((x - \mu)/\sigma\big)\,x\,dx$ is the average of the data.
To verify (11.12) note that because the integrand is an even function, the left side is $2\int_0^\infty \varphi(t)\,dt$. By a change of variable,
$$2\int_0^\infty \varphi(t)\,dt = \frac{2}{\sqrt{2\pi}}\int_0^\infty e^{-t^2/2}\,dt = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-t^2}\,dt.$$
Thus it suffices to show that
$$\frac{2}{\sqrt{\pi}}\int_0^\infty e^{-t^2}\,dt = 1.$$
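Let $I$ denote the integral on the left. Then
$$\begin{aligned}
I^2 &= \int_0^\infty e^{-y^2}\int_0^\infty e^{-t^2}\,dt\,dy \\
&= \int_0^\infty e^{-y^2}\int_0^\infty y\,e^{-x^2y^2}\,dx\,dy, && \text{by the substitution } t = xy \\
&= \int_0^\infty\int_0^\infty y\,e^{-y^2(1+x^2)}\,dy\,dx, && \text{by 11.5.1} \\
&= \frac{1}{2}\int_0^\infty (1 + x^2)^{-1}\int_0^\infty e^{-u}\,du\,dx, && \text{by the substitution } u = y^2(1 + x^2) \\
&= \frac{1}{2}\arctan x\,\Big|_0^\infty && \text{because } \textstyle\int_0^\infty e^{-u}\,du = 1 \\
&= \frac{\pi}{4}. && ♦
\end{aligned}$$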
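The two identities of this example, $\int_{-\infty}^\infty \varphi = 1$ and $I^2 = \pi/4$, can be sanity-checked numerically. The sketch below truncates the integrals to $|t| \le 10$, where the neglected tails are far below the tolerance; the truncation point and the midpoint rule are choices made here.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def midpoint(func, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

total = midpoint(phi, -10.0, 10.0)                  # approximates the left side of (11.12)
I = midpoint(lambda t: math.exp(-t * t), 0.0, 10.0)  # approximates the integral of exp(-t^2) over [0, oo)
print(total, I * I, math.pi / 4)
```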
11.5.4 Example. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be Borel measurable and integrable. By Exercise 10.5.19, the function $F(x, y) := f(x - y)g(y)$ is Borel measurable in $(x, y)$. By the Fubini–Tonelli theorem and translation invariance of the integral,
$$\int_{\mathbb{R}^n \times \mathbb{R}^n} |F(x, y)|\,d\lambda(x, y) = \int_{\mathbb{R}^n} |g(y)| \int_{\mathbb{R}^n} |f(x - y)|\,dx\,dy = \|g\|_1\|f\|_1 < +\infty.$$
Therefore, $F$ is integrable, hence the function
$$(f * g)(x) := \int_{\mathbb{R}^n} f(x - y)g(y)\,dy,$$
called the convolution of $f$ and $g$, is finite a.e. and integrable on $\mathbb{R}^n$. Convolutions are useful in calculating the probability distribution of a sum of independent random variables. ♦
11.5.5 Example. (Volume of a simplex). Let $a > 0$ and let $e_j$, $1 \le j \le n$, be the standard basis in $\mathbb{R}^n$. Define the $n$-dimensional simplex in $\mathbb{R}^n$ by
$$S(a, n) = \Big\{x : \sum_{j=1}^n x_j \le a \text{ and } x_j \ge 0\Big\}.$$
FIGURE 11.2: Three-dimensional simplex.
We use the Fubini–Tonelli theorem and induction to show that
$$\lambda_n\big(S(a, n)\big) = \frac{a^n}{n!}.$$
The formula holds for $n = 1$ since $S(a, 1) = [0, a]$. Assume the formula holds for $n - 1$ and all $a > 0$. Then
$$\begin{aligned}
\lambda_n\big(S(a, n)\big) &= \int \mathbf{1}_{S(a,n)}(x_1, \ldots, x_n)\,d(x_1, \ldots, x_n) \\
&= \int_{[0,a]}\int \mathbf{1}_{S(a-x_n,\,n-1)}(x_1, \ldots, x_{n-1})\,d(x_1, \ldots, x_{n-1})\,dx_n \\
&= \frac{1}{(n-1)!}\int_{[0,a]} (a - x_n)^{n-1}\,dx_n.
\end{aligned}$$
The last integral evaluates to $a^n/n$, completing the proof. ♦
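The formula $a^n/n!$ can be compared with a Monte Carlo estimate. The sketch below samples uniformly from the cube $[0, a]^n$ and counts the points that land in $S(a, n)$; the function name, sample size, and seed are arbitrary choices made for this illustration.

```python
import math
import random

def simplex_volume_mc(a, n, samples=200_000, seed=0):
    """Monte Carlo estimate of the volume of S(a, n) from uniform samples in [0, a]^n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        point = [rng.uniform(0.0, a) for _ in range(n)]
        if sum(point) <= a:          # coordinates are already nonnegative
            hits += 1
    return (hits / samples) * a ** n  # fraction of hits times the cube volume

estimate = simplex_volume_mc(2.0, 3)
exact = 2.0 ** 3 / math.factorial(3)
print(estimate, exact)
```

The estimate is random, so it only agrees with the exact value up to sampling error of order `samples ** -0.5`.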
11.5.6 Example. Let $C_r^n(x)$ denote the closed ball in $\mathbb{R}^n$ with center $x$ and radius $r$. We show that $\lambda\big(C_r^n(x)\big) = r^n\alpha_n$, where
$$\alpha_n = \begin{cases}
\dfrac{(2\pi)^{n/2}}{n(n-2)\cdots 4\cdot 2} & \text{if } n \text{ is even,} \\[2ex]
\dfrac{2(2\pi)^{(n-1)/2}}{n(n-2)\cdots 3\cdot 1} & \text{if } n \text{ is odd.}
\end{cases}$$
For ease of notation we write $C_r^n$ for $C_r^n(0)$ and denote by $\mathbf{1}_r$ the indicator function of $C_r^n$. By the translation and dilation properties of Lebesgue measure, $\lambda\big(C_r^n(x)\big) = r^n\lambda\big(C_1^n\big)$, hence it suffices to establish the formula for $r = 1$ and $x = 0$.
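If $n = 1$, then $C_1^n = [-1, 1]$ and $\alpha_n = 2$, so the formula holds in this case. By a simple integration, $\lambda\big(C_1^2\big) = \pi$, hence the formula holds for $n = 2$ as well. Now assume that $n > 2$. From
$$\begin{aligned}
C_1^n &= \big\{(x_1, \ldots, x_n) : x_1^2 + \cdots + x_n^2 \le 1\big\} \\
&= \big\{(x_1, \ldots, x_n) : x_3^2 + \cdots + x_n^2 \le 1 - x_1^2 - x_2^2,\ (x_1, x_2) \in C_1^2\big\}
\end{aligned}$$
we have
$$\mathbf{1}_1(x_1, \ldots, x_n) = \mathbf{1}_{\sqrt{1 - x_1^2 - x_2^2}}(x_3, \ldots, x_n)\,\mathbf{1}_1(x_1, x_2),$$
hence, by the Fubini–Tonelli theorem,
$$\lambda\big(C_1^n\big) = \int_{\mathbb{R}^2} \mathbf{1}_1(x_1, x_2)\int_{\mathbb{R}^{n-2}} \mathbf{1}_{\sqrt{1 - x_1^2 - x_2^2}}(x_3, \ldots, x_n)\,d\lambda(x_3, \ldots, x_n)\,dx_1\,dx_2.$$
The inner integral is
$$\lambda\Big(C^{n-2}_{\sqrt{1 - x_1^2 - x_2^2}}\Big) = (1 - x_1^2 - x_2^2)^{(n-2)/2}\,\lambda\big(C_1^{n-2}\big),$$
hence, changing to polar coordinates,²
$$\begin{aligned}
\lambda\big(C_1^n\big) &= \lambda\big(C_1^{n-2}\big)\int_{x_1^2 + x_2^2 \le 1} (1 - x_1^2 - x_2^2)^{(n-2)/2}\,dx_1\,dx_2 \\
&= \lambda\big(C_1^{n-2}\big)\int_0^{2\pi}\int_0^1 (1 - r^2)^{(n-2)/2}\,r\,dr\,d\theta \\
&= \frac{2\pi}{n}\,\lambda\big(C_1^{n-2}\big).
\end{aligned}$$
Iterating, we obtain
$$\lambda\big(C_1^n\big) = \frac{2\pi}{n}\,\lambda\big(C_1^{n-2}\big) = \frac{(2\pi)^2}{n(n-2)}\,\lambda\big(C_1^{n-4}\big) = \cdots = \frac{(2\pi)^{m-1}}{n(n-2)\cdots(n - 2(m-2))}\,\lambda\big(C_1^{n-2(m-1)}\big).$$
Thus
$$\lambda\big(C_1^{2m}\big) = \frac{(2\pi)^{m-1}}{2m(2m-2)\cdots 4}\,\lambda\big(C_1^2\big) = \frac{(2\pi)^m}{2m(2m-2)\cdots 2}$$
and
$$\lambda\big(C_1^{2m-1}\big) = \frac{(2\pi)^{m-1}}{(2m-1)(2m-3)\cdots 3}\,\lambda\big(C_1^1\big) = \frac{2(2\pi)^{m-1}}{(2m-1)(2m-3)\cdots 3}. \qquad ♦$$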
2 The general change of variables theorem for Lebesgue integrals is proved in the next
section.
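The recursion $\lambda(C_1^n) = (2\pi/n)\,\lambda(C_1^{n-2})$ from the example is easy to evaluate. The sketch below compares it with the closed form $\pi^{n/2}/\Gamma(n/2 + 1)$, a standard expression for the volume of the unit ball that is not derived in this section and is used here only as an independent check; the function names are illustrative.

```python
import math

def alpha(n):
    """Volume of the closed unit ball in R^n via alpha_n = (2*pi/n) * alpha_(n-2)."""
    if n == 1:
        return 2.0
    if n == 2:
        return math.pi
    return 2.0 * math.pi / n * alpha(n - 2)

def alpha_gamma(n):
    """Standard closed form pi^(n/2) / Gamma(n/2 + 1), used only for comparison."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

for n in range(1, 8):
    print(n, alpha(n), alpha_gamma(n))
```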
Proof of the Fubini–Tonelli theorem.
We show first that part (b) of the theorem is a consequence of part (a).
Indeed, if one of the iterated integrals in (b) is finite, then, by part (a) applied to $|f|$, so is the other and $f$ is integrable. Applying part (a) to $f^{\pm}$, we see that (11.11) holds.
Next, observe that if part (a) of the theorem holds for indicator functions
then, by linearity of the integrals, it holds for nonnegative simple functions. By
10.5.8 and the monotone convergence theorem, (a) holds for all nonnegative
Borel measurable functions.
It remains then to prove (a) for indicator functions. The proof consists
of several lemmas, the first of which is a special case of a theorem due to
E.B. Dynkin.
11.5.7 Lemma. Let F denote the intersection of all collections G of subsets
of Rn with the following properties:
(a) If A, B ∈ G and A ⊆ B, then B \ A ∈ G.
(b) If Ak ∈ G and Ak ↑ A, then A ∈ G.
(c) G contains every bounded interval.
Then F is a σ-field containing B(Rn ).
Proof. It is easy to see that F itself has properties (a)–(c). Moreover, from (b)
and (c), F contains every interval. In particular, Rn ∈ F.
We show first that F is closed under finite intersections. To see this, fix
A ∈ F and define
FA := {B ∈ F : A ∩ B ∈ F} .
One easily checks that FA has properties (a) and (b). Furthermore, if A is an
interval, then FA has property (c) so by minimality F ⊆ FA . This shows that
if B ∈ F, then A ∩ B ∈ F for all intervals A; in other words, FB contains all
intervals. Thus FB has properties (a)–(c). By minimality, F ⊆ FB , that is,
A, B ∈ F ⇒ A ∩ B ∈ F. By induction, F is closed under finite intersections.
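Now observe that property (a), together with the fact that $\mathbb{R}^n \in \mathcal{F}$, implies that $\mathcal{F}$ is closed under complements. Thus if $\{E_k\}$ is a sequence in $\mathcal{F}$, then, by the result of the preceding paragraph,
$$A_k := \bigcup_{j=1}^k E_j = \Big(\bigcap_{j=1}^k E_j^c\Big)^c \in \mathcal{F}.$$
By (b), $\bigcup_{k=1}^\infty E_k = \bigcup_{k=1}^\infty A_k \in \mathcal{F}$. This shows that $\mathcal{F}$ is a σ-field. Since $\mathcal{F}$ contains all intervals, it must contain $\mathcal{B}(\mathbb{R}^n)$.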
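11.5.8 Lemma. Let $p, q \in \mathbb{N}$ with $p + q = n$. If $A \in \mathcal{B}(\mathbb{R}^p)$ and $B \in \mathcal{B}(\mathbb{R}^q)$, then
$$A \times B \in \mathcal{B}(\mathbb{R}^n) \quad\text{and}\quad \lambda(A \times B) = \lambda(A)\lambda(B). \tag{11.13}$$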
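Proof. For fixed bounded intervals $I \subseteq \mathbb{R}^p$ and $J \subseteq \mathbb{R}^q$, define
$$\mathcal{G}_{I,J} = \big\{B \in \mathcal{B}(\mathbb{R}^q) : I \times (B \cap J) \in \mathcal{B}(\mathbb{R}^n) \ \&\ \lambda\big(I \times (B \cap J)\big) = \lambda(I)\lambda(B \cap J)\big\}.$$
We show that $\mathcal{G}_{I,J}$ has properties (a)–(c) of 11.5.7. Clearly, (c) holds. If $B \in \mathcal{G}_{I,J}$, then
$$I \times (B^c \cap J) = (I \times J) \setminus \big(I \times (B \cap J)\big) \in \mathcal{B}(\mathbb{R}^n)$$
and
$$\begin{aligned}
\lambda\big(I \times (B^c \cap J)\big) &= \lambda(I \times J) - \lambda\big(I \times (B \cap J)\big) \\
&= \lambda(I)\big(\lambda(J) - \lambda(B \cap J)\big) \\
&= \lambda(I)\lambda(J \cap B^c),
\end{aligned}$$
hence $B^c \in \mathcal{G}_{I,J}$. Therefore, $\mathcal{G}_{I,J}$ is closed under complements. Now let $B_k \in \mathcal{G}_{I,J}$ and $B_k \uparrow B$. Then
$$I \times (J \cap B) = \bigcup_{k=1}^\infty I \times (J \cap B_k) \in \mathcal{B}(\mathbb{R}^n)$$
and, by 10.1.6,
$$\lambda\big(I \times (J \cap B)\big) = \lim_k \lambda\big(I \times (J \cap B_k)\big) = \lambda(I)\lim_k \lambda(J \cap B_k) = \lambda(I)\lambda(J \cap B),$$
which shows that $B \in \mathcal{G}_{I,J}$. Therefore, $\mathcal{G}_{I,J}$ has properties (a)–(c) of 11.5.7, so $\mathcal{B}(\mathbb{R}^q) = \mathcal{G}_{I,J}$. We have shown that for all bounded intervals $I \subseteq \mathbb{R}^p$, $J \subseteq \mathbb{R}^q$ and all $B \in \mathcal{B}(\mathbb{R}^q)$,
$$I \times (B \cap J) \in \mathcal{B}(\mathbb{R}^n) \quad\text{and}\quad \lambda\big(I \times (B \cap J)\big) = \lambda(I)\lambda(B \cap J).$$
Taking a sequence of bounded intervals $J_k \uparrow \mathbb{R}^q$, we see that
$$I \times B \in \mathcal{B}(\mathbb{R}^n) \quad\text{and}\quad \lambda(I \times B) = \lambda(I)\lambda(B). \tag{11.14}$$
Now fix $B \in \mathcal{B}(\mathbb{R}^q)$ and let $I \subseteq \mathbb{R}^p$ be a bounded interval. Define
$$\mathcal{H}_{B,I} = \big\{A \in \mathcal{B}(\mathbb{R}^p) : (A \cap I) \times B \in \mathcal{B}(\mathbb{R}^n) \ \&\ \lambda\big((A \cap I) \times B\big) = \lambda(A \cap I)\lambda(B)\big\}.$$
By (11.14), $\mathcal{H}_{B,I}$ contains all intervals. Arguing as above, we see that $\mathcal{H}_{B,I} = \mathcal{B}(\mathbb{R}^p)$. Thus for all $A \in \mathcal{B}(\mathbb{R}^p)$, $B \in \mathcal{B}(\mathbb{R}^q)$, and all bounded intervals $I \subseteq \mathbb{R}^p$,
$$(A \cap I) \times B \in \mathcal{B}(\mathbb{R}^n) \quad\text{and}\quad \lambda\big((A \cap I) \times B\big) = \lambda(A \cap I)\lambda(B).$$
Taking a sequence of bounded intervals $I_k \uparrow \mathbb{R}^p$ in the last equation yields (11.13).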
The following lemma asserts that part (a) of the Fubini–Tonelli theorem
holds for indicator functions of Borel sets and hence completes the proof of
the theorem.
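11.5.9 Lemma. Let $p, q \in \mathbb{N}$ with $p + q = n$ and let $C \in \mathcal{B}(\mathbb{R}^n)$. Then
$$\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz \quad\text{and}\quad \int_{\mathbb{R}^p} \mathbf{1}_C(z, y)\,dz$$
are Borel measurable functions of $x \in \mathbb{R}^p$ and $y \in \mathbb{R}^q$, respectively, and
$$\lambda(C) = \int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz\,dx = \int_{\mathbb{R}^q}\int_{\mathbb{R}^p} \mathbf{1}_C(z, y)\,dz\,dy.$$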
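Proof. Let $\mathcal{G}$ denote the collection of all $C \in \mathcal{B}(\mathbb{R}^n)$ for which the assertions of the lemma hold. We show that $\mathcal{G} = \mathcal{B}(\mathbb{R}^n)$. The first step is to show that $\mathcal{G}$ has properties (b) and (c) of 11.5.7.
For property (b), let $C_k \in \mathcal{G}$ and $C_k \uparrow C$. Then $\mathbf{1}_{C_k}(x, z) \uparrow \mathbf{1}_C(x, z)$, hence, by the monotone convergence theorem,
$$\int_{\mathbb{R}^q} \mathbf{1}_{C_k}(x, z)\,dz \uparrow \int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz, \quad x \in \mathbb{R}^p.$$
Thus $\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz$ is Borel measurable in $x$. Applying the monotone convergence theorem again, we see that
$$\lambda(C) = \lim_k \lambda(C_k) = \lim_k \int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_{C_k}(x, z)\,dz\,dx = \int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz\,dx,$$
and similarly for the other iterated integral. Therefore, $\mathcal{G}$ has property (b).
For property (c), let $A \in \mathcal{B}(\mathbb{R}^p)$, $B \in \mathcal{B}(\mathbb{R}^q)$, and $C = A \times B$. Then
$$\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz = \int_{\mathbb{R}^q} \mathbf{1}_A(x)\mathbf{1}_B(z)\,dz = \mathbf{1}_A(x)\lambda(B),$$
which is Borel measurable in $x$ and, together with 11.5.8, implies that
$$\int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_C(x, z)\,dz\,dx = \lambda(A)\lambda(B) = \lambda(C).$$
Similar assertions hold for the other iterated integral. Thus, $\mathcal{G}$ contains Cartesian products of Borel sets and, in particular, all intervals.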
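Now let $I$ be a bounded interval in $\mathbb{R}^n$ and let $\mathcal{G}_I = \{B \in \mathcal{B}(\mathbb{R}^n) : B \cap I \in \mathcal{G}\}$. Since $\mathcal{G}$ has properties (b) and (c) of 11.5.7, so does $\mathcal{G}_I$. We claim that $\mathcal{G}_I$ also has property (a). To see this, let $C, D \in \mathcal{G}_I$ with $C \subseteq D$ and let $E = D \setminus C$. Since $\mathbf{1}_{E \cap I} = \mathbf{1}_{D \cap I} - \mathbf{1}_{C \cap I}$,
$$\int_{\mathbb{R}^q} \mathbf{1}_{E \cap I}(x, z)\,dz = \int_{\mathbb{R}^q} \mathbf{1}_{D \cap I}(x, z)\,dz - \int_{\mathbb{R}^q} \mathbf{1}_{C \cap I}(x, z)\,dz,$$
which, because $C \cap I$ and $D \cap I \in \mathcal{G}$, is Borel measurable in $x$ and implies that
$$\begin{aligned}
\int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_{E \cap I}(x, z)\,dz\,dx
&= \int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_{D \cap I}(x, z)\,dz\,dx - \int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_{C \cap I}(x, z)\,dz\,dx \\
&= \lambda(D \cap I) - \lambda(C \cap I) \\
&= \lambda(E \cap I).
\end{aligned}$$
Here we have used the fact that, because $I$ is bounded, the calculations take place in $\mathbb{R}$, hence subtraction is legitimate. The other iterated integral is treated similarly. Therefore $E \in \mathcal{G}_I$, as required.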
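Since $\mathcal{G}_I$ has properties (a)–(c) of 11.5.7, $\mathcal{G}_I$ contains all Borel sets. This means that for any $C \in \mathcal{B}(\mathbb{R}^n)$ and bounded interval $I \subseteq \mathbb{R}^n$, the functions
$$\int_{\mathbb{R}^q} \mathbf{1}_{C \cap I}(x, z)\,dz \quad\text{and}\quad \int_{\mathbb{R}^p} \mathbf{1}_{C \cap I}(z, y)\,dz$$
are Borel measurable in $x$ and $y$, respectively, and
$$\int_{\mathbb{R}^p}\int_{\mathbb{R}^q} \mathbf{1}_{C \cap I}(x, z)\,dz\,dx = \lambda(C \cap I) = \int_{\mathbb{R}^q}\int_{\mathbb{R}^p} \mathbf{1}_{C \cap I}(z, y)\,dz\,dy.$$
Taking an increasing sequence of bounded intervals $I$ tending to $\mathbb{R}^n$ and using the monotone convergence theorem shows that $C \in \mathcal{G}$. Therefore, $\mathcal{G} = \mathcal{B}(\mathbb{R}^n)$, as required.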
Exercises
1.S Prove that $\lambda_n\{(x_1, \ldots, x_n) : x_j \in \mathbb{Q} \text{ for some } j\} = 0$.
2. Evaluate $\int_{[0,+\infty)^n} f$, where
(a)S $f(x) = x_1 \cdots x_n\,e^{-\|x\|^2}$.
(b) $f(x) = x_1 \cdots x_n\,(1 + \|x\|^2)^{-n-1}$.
3. (Cavalieri’s principle). For $E \in \mathcal{M}(\mathbb{R}^n)$ and $t \in \mathbb{R}$, define
$$E_t := \big\{x = (x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1} : (x, t) \in E\big\}.$$
Suppose that $E_t \in \mathcal{M}(\mathbb{R}^{n-1})$ for all $t \in [a, b]$. Prove that
$$\lambda_n\big[E \cap \big(\mathbb{R}^{n-1} \times [a, b]\big)\big] = \int_a^b \lambda_{n-1}(E_t)\,dt.$$
Thus the “volume” of the portion of $E$ between the hyperplanes $x_n = a$ and $x_n = b$ is the integral from $a$ to $b$ of the “cross-sectional areas” $\lambda_{n-1}(E_t)$.
4. Let $f$ and $g$ be Riemann integrable on $[0, 1]$. Prove that
$$\int_0^1\int_0^x g(x - y)f(y)\,dy\,dx = \int_0^1\int_0^{1-y} g(x)f(y)\,dx\,dy = \int_0^1\int_0^{1-x} g(x)f(y)\,dy\,dx.$$
5.S Evaluate
$$\int_{0 \le x \le x_1 \le \cdots \le x_m \le 1} x\,d\lambda(x, x_1, \ldots, x_m).$$
6. Show that
$$\int_0^1\int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dy\,dx = -\int_0^1\int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx\,dy = \frac{\pi}{4}.$$
Why does this not contradict the Fubini–Tonelli theorem?
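7.S Let $f$ be integrable on $(0, 1)$, $p > 0$, and define
$$g(x) = \int_{[x^{1/p},\,1)} t^{-p}f(t)\,dt, \quad 0 < x < 1.$$
Prove that $g$ is integrable on $(0, 1)$ and that
$$\int_{(0,1)} g\,d\lambda = \int_{(0,1)} f\,d\lambda.$$
8. Let $f$ be continuous on $[-1, 1]$. Show that
(a) $\displaystyle \int_0^{2\pi}\int_0^1 f'(r\cos\theta)\,r\cos^2\theta\,dr\,d\theta = \int_0^{2\pi} f(\cos\theta)\cos\theta\,d\theta - \int_0^{2\pi}\int_0^{\cos\theta} f(x)\,dx\,d\theta.$
(b) $\displaystyle \int_0^{2\pi}\int_0^1 f'(r\cos\theta)\,r\sin^2\theta\,dr\,d\theta = \int_0^{2\pi}\int_0^{\cos\theta} f(x)\,dx\,d\theta.$
(c) $\displaystyle \int_0^{2\pi}\int_0^1 f'(r\cos\theta)\,r\,dr\,d\theta = \int_0^{2\pi} f(\cos\theta)\cos\theta\,d\theta.$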
9. Let $a, b > 0$. Use the Fubini–Tonelli theorem, the dominated convergence theorem, and the identity $1/x = \int_0^\infty e^{-xt}\,dt$, $x > 0$, to prove that
(a)S $\displaystyle \int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$
(b) $\displaystyle \int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\,dx = \ln b - \ln a.$
10. Show that $\displaystyle \varphi * \varphi(x) = \frac{1}{\sqrt{2}}\,\varphi\Big(\frac{x}{\sqrt{2}}\Big)$.
11. Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be Borel measurable and integrable. Prove:
(a)S $f * g = g * f$.
(b) If $f$ and $g$ are continuous, then
$$\frac{d}{dz}\int_{A_z} f(x)g(y)\,dx\,dy = f * g(z), \quad\text{where } A_z = \{(x, y) : x + y \le z\}.$$
12. Let $f : [0, 1] \to (0, +\infty]$ be Lebesgue measurable. Use the Fubini–Tonelli theorem to prove that
$$\int_{[0,1]} f\,d\lambda \int_{[0,1]} 1/f\,d\lambda \ge 1.$$
(A simpler but less interesting proof uses the Cauchy–Schwarz inequality.)
13. Let $f$ and $g$ be positive Lebesgue measurable functions on $[0, 1]$ such that $fg \ge 1$. Use the preceding exercise to prove that
$$\int_{[0,1]} f\,d\lambda \int_{[0,1]} g\,d\lambda \ge 1.$$
(The Cauchy–Schwarz inequality may be used here as well.)
14.S Let $f$ and $g$ be Lebesgue integrable on $[a, b]$ and for $x \in [a, b]$ let
$$F(x) = F(a) + \int_{[a,x]} f(t)\,d\lambda(t) \quad\text{and}\quad G(x) = G(a) + \int_{[a,x]} g(t)\,d\lambda(t),$$
where $F(a)$ and $G(a)$ are arbitrary. Prove that
$$\int_{[a,b]} F(x)g(x)\,d\lambda(x) + \int_{[a,b]} G(x)f(x)\,d\lambda(x) = F(b)G(b) - F(a)G(a).$$
15. (a) Verify that the function
$$\kappa(t, x) = \frac{1}{2\sqrt{\pi t}}\,e^{-x^2/4t} = \frac{1}{\sqrt{2t}}\,\varphi\Big(\frac{x}{\sqrt{2t}}\Big)$$
is a solution of the heat equation
$$w_t(t, x) = w_{xx}(t, x), \quad x \in \mathbb{R},\ t > 0.$$
(b) Let $w_0(x)$ be integrable on $\mathbb{R}$ and define
$$w(t, x) = \int_{-\infty}^\infty w_0(y)\,\kappa(t, x - y)\,dy,$$
the convolution of $\kappa$ with $w_0$. Show that $w(t, x)$ satisfies the heat equation.
(c) Verify that
$$w(t, x) = \int_{-\infty}^\infty w_0\big(x + z\sqrt{2t}\big)\,\varphi(z)\,dz.$$
(d) Use (c) and the dominated convergence theorem to show that if $w_0$ is continuous and satisfies $|w_0(x)| \le ae^{b|x|}$ for some positive constants $a$, $b$ and for all $x$, then $\lim_{t\to 0^+} w(t, x) = w_0(x)$. Conclude that the solution $w(t, x)$ may be continuously extended to $[0, +\infty) \times \mathbb{R}$ and consequently satisfies the boundary condition $w(0, x) = w_0(x)$.
16. For a Borel measurable function $f : \mathbb{R} \to [0, +\infty)$, define
$$A := \{(x, y) : 0 \le y \le f(x)\} \quad\text{and}\quad A_y := \{x : f(x) > y\}, \quad y \in \mathbb{R}.$$
Prove:
(a) $A \in \mathcal{B}(\mathbb{R}^2)$.
(b) The function $y \mapsto \lambda(A_y)$ is Borel measurable and
$$\int f(x)\,d\lambda(x) = \int_{(0,+\infty)} \lambda(A_y)\,d\lambda(y) = \lambda(A).$$
(c) Part (b) holds if $A$ and $A_y$ are replaced, respectively, by
$$B = \{(x, y) : 0 \le y < f(x)\} \quad\text{and}\quad B_y = \{x : f(x) \ge y\}.$$
(d) $\lambda\{(x, y) : f(x) = y\} = 0$. (The graph of a Borel function has measure zero.)

11.6 Change of Variables
In Chapter 5 we proved that if $\varphi : [a, b] \to \mathbb{R}$ is continuously differentiable with everywhere nonzero derivative and if $f$ is Riemann integrable on $[c, d] := \varphi([a, b])$, then
$$\int_c^d f(y)\,dy = \int_a^b f(\varphi(x))\,|\varphi'(x)|\,dx.$$
In this section we prove the following n-dimensional version of this result.
11.6.1 Change of Variables Theorem. Let $U$ and $V$ be open subsets of $\mathbb{R}^n$ and let $\varphi : U \to V$ be $C^1$ on $U$ with $C^1$ inverse $\varphi^{-1} : V \to U$. If $f$ is Lebesgue measurable on $V$ and either $f \ge 0$ or $f$ is integrable, then
$$\int_V f(y)\,dy = \int_U (f \circ \varphi)(x)\,|J_\varphi(x)|\,dx, \tag{11.15}$$
where $J_\varphi$ is the Jacobian of $\varphi$ on $U$.
11.6.2 Example. Spherical coordinates $(r, \theta_1, \theta_2, \ldots, \theta_{n-1})$ in $\mathbb{R}^n$ are defined by the transformation formulas
$$\begin{aligned}
x_1 &= r\cos\theta_1 \\
x_2 &= r\sin\theta_1\cos\theta_2 \\
x_3 &= r\sin\theta_1\sin\theta_2\cos\theta_3 \\
&\ \,\vdots \\
x_{n-1} &= r\sin\theta_1\sin\theta_2\cdots\sin\theta_{n-2}\cos\theta_{n-1} \\
x_n &= r\sin\theta_1\sin\theta_2\cdots\sin\theta_{n-2}\sin\theta_{n-1},
\end{aligned}$$
where
$$r > 0, \quad 0 < \theta_j < \pi,\ j = 1, \ldots, n-2, \quad\text{and}\quad 0 < \theta_{n-1} < 2\pi.$$
Note that $\sin\theta_j > 0$ for $j \le n - 2$ and $\sum_{j=1}^n x_j^2 = r^2$. Let
$$U := (0, +\infty) \times (0, \pi)^{n-2} \times (0, 2\pi) \quad\text{and}\quad V := \mathbb{R}^n \setminus \big(\mathbb{R}^{n-2} \times [0, +\infty) \times \{0\}\big)$$
and define $\varphi$ on $U$ by
$$\varphi(r, \theta_1, \ldots, \theta_{n-1}) = (x_1, \ldots, x_n),$$
where the $x_j$ are as above. Clearly $U$ and $V$ are open and $\varphi$ is $C^\infty$ on $U$. We claim that $\varphi$ maps $U$ onto $V$ and has a $C^\infty$ inverse on $U$.
The inclusion $\varphi(U) \subseteq V$ is established as follows: If $(r, \theta_1, \ldots, \theta_{n-1}) \in U$ and $(x_1, \ldots, x_n) = \varphi(r, \theta_1, \ldots, \theta_{n-1}) \notin V$, then $x_{n-1} \ge 0$ and $x_n = 0$. But the latter implies that $\theta_{n-1} = \pi$, which gives the contradiction $x_{n-1} < 0$.
For the reverse inclusion, we show that for each $(x_1, \ldots, x_n) \in V$ there exists a unique solution $(r, \theta_1, \theta_2, \ldots, \theta_{n-1})$ to the above system. Clearly, $r$ and $\theta_1$ have the unique solutions
$$r = \Big(\sum_{j=1}^n x_j^2\Big)^{1/2} \quad\text{and}\quad \theta_1 = \arccos(x_1/r).$$
In particular, the system has a unique solution if $n = 2$. Now set
$$y_j = x_j/(r\sin\theta_1), \quad 2 \le j \le n.$$
By induction, we may assume that the reduced system
$$\begin{aligned}
y_2 &= \cos\theta_2 \\
y_3 &= \sin\theta_2\cos\theta_3 \\
&\ \,\vdots \\
y_{n-1} &= \sin\theta_2\cdots\sin\theta_{n-2}\cos\theta_{n-1} \\
y_n &= \sin\theta_2\cdots\sin\theta_{n-2}\sin\theta_{n-1}
\end{aligned}$$
has a unique solution $(\theta_2, \ldots, \theta_{n-1})$. Then the original system has the unique solution $(r, \theta_1, \ldots, \theta_{n-1})$. Therefore, $\varphi$ is one-to-one and $\varphi(U) = V$.
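By standard properties of determinants and a reduction argument,
$$J_\varphi(r, \theta_1, \theta_2, \ldots, \theta_{n-1}) = r^{n-1}\sin^{n-2}\theta_1\sin^{n-3}\theta_2\cdots\sin^2\theta_{n-3}\sin\theta_{n-2}.$$
Since $J_\varphi > 0$ on $U$, the inverse function theorem implies that $\varphi$ has a global $C^\infty$ inverse on $U$. Hence, by the change of variables theorem, if $f$ is Lebesgue measurable on $\mathbb{R}^n$ and either $f \ge 0$ or $f$ is integrable, then
$$\int_V f\,d\lambda = \int_U (f \circ \varphi)J_\varphi\,d\lambda.$$
Since $V$ differs from $\mathbb{R}^n$ by a set of measure zero, we may write the last equation as
$$\begin{aligned}
\int_{-\infty}^\infty \cdots \int_{-\infty}^\infty & f(x_1, \ldots, x_n)\,dx_1\cdots dx_n \\
&= \int_0^\infty\int_0^\pi \cdots \int_0^\pi\int_0^{2\pi} f\big(r\cos\theta_1,\ r\sin\theta_1\cos\theta_2,\ \ldots,\ r\sin\theta_1\cdots\sin\theta_{n-1}\big) \\
&\qquad\quad \cdot\ r^{n-1}\sin^{n-2}\theta_1\sin^{n-3}\theta_2\cdots\sin^2\theta_{n-3}\sin\theta_{n-2}\,d\theta_{n-1}\,d\theta_{n-2}\cdots d\theta_1\,dr.
\end{aligned} \tag{11.16}$$
In particular, taking $f$ to be the indicator function of $C_1^n(0)$ and using 11.5.6, we see that the left side of (11.16) is $\lambda\big(C_1^n(0)\big) = \alpha_n$ and the right side is
$$\begin{aligned}
\int_0^1\int_0^\pi \cdots \int_0^\pi\int_0^{2\pi} & r^{n-1}\sin^{n-2}\theta_1\cdots\sin^2\theta_{n-3}\sin\theta_{n-2}\,d\theta_{n-1}\,d\theta_{n-2}\cdots d\theta_1\,dr \\
&= \frac{2\pi}{n}\int_0^\pi \cdots \int_0^\pi \sin^{n-2}\theta_1\cdots\sin^2\theta_{n-3}\sin\theta_{n-2}\,d\theta_{n-2}\cdots d\theta_1.
\end{aligned}$$
In particular,
$$\int_0^\pi \cdots \int_0^\pi \sin^{n-2}\theta_1\sin^{n-3}\theta_2\cdots\sin^2\theta_{n-3}\sin\theta_{n-2}\,d\theta_{n-2}\cdots d\theta_1 = \frac{n\alpha_n}{2\pi}. \qquad ♦$$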
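The final identity of the example can be checked numerically for small $n$. Since the integrand is a product of functions of single angles, the iterated integral equals the product of the one-dimensional integrals $\int_0^\pi \sin^m\theta\,d\theta$ for $m = 1, \ldots, n-2$. The sketch below uses a midpoint rule and the recursion for $\alpha_n$ from 11.5.6; all names are illustrative.

```python
import math

def sine_power_integral(m, steps=20_000):
    """Midpoint-rule approximation of the integral of sin(theta)^m over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.sin((i + 0.5) * h) ** m for i in range(steps))

def angular_integral(n):
    """Product of the integrals of sin^m over [0, pi] for m = 1, ..., n-2."""
    result = 1.0
    for m in range(1, n - 1):
        result *= sine_power_integral(m)
    return result

def alpha(n):
    """Volume of the unit ball in R^n via alpha_n = (2*pi/n) * alpha_(n-2)."""
    if n == 1:
        return 2.0
    if n == 2:
        return math.pi
    return 2.0 * math.pi / n * alpha(n - 2)

for n in range(3, 8):
    print(n, angular_integral(n), n * alpha(n) / (2.0 * math.pi))
```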
Proof of the change of variables theorem.
Before we begin the proof proper, we make some reductions. First, by considering $f^+$ and $f^-$, we need only prove the case $f \ge 0$. Second, since a Lebesgue measurable function is equal a.e. to a Borel measurable function, we may assume that $f$ is Borel measurable. Note that in this case $f \circ \varphi$ is also Borel measurable. To prove (11.15) it then suffices to verify that
$$\int_V f\,d\lambda \le \int_U (f \circ \varphi)\,|J_\varphi|\,d\lambda \tag{11.17}$$
for all Borel measurable functions $f : V \to [0, +\infty]$. Indeed, if (11.17) holds for all $f$ and $\varphi$, then, switching the roles of $U$ and $V$, it must also be the case that
$$\int_U g\,d\lambda \le \int_V (g \circ \varphi^{-1})\,|J_{\varphi^{-1}}|\,d\lambda$$
for all Borel measurable $g : U \to [0, +\infty]$. Taking $g = (f \circ \varphi)|J_\varphi|$ and recalling that $J_\varphi J_{\varphi^{-1}} = 1$, we obtain the reverse of inequality (11.17). Finally, by considering simple functions and using linearity, 10.5.8, and the monotone convergence theorem, it suffices to prove (11.17) for indicator functions $f = \mathbf{1}_B$, where $B \in \mathcal{B}(\mathbb{R}^n)$ and $B \subseteq V$. Equation (11.17) then reduces to
$$\lambda(B) \le \int_{\varphi^{-1}(B)} |J_\varphi|\,d\lambda, \quad B \subseteq V,\ B \in \mathcal{B}(\mathbb{R}^n),$$
or, equivalently (taking $B = \varphi(E)$),
$$\lambda\big(\varphi(E)\big) \le \int_E |J_\varphi|\,d\lambda, \quad E \subseteq U,\ E \in \mathcal{B}(\mathbb{R}^n). \tag{11.18}$$
The proof of (11.18) is a sequence of lemmas, the first of which treats the
case of a linear change of variable.
11.6.3 Lemma. If $T \in L(\mathbb{R}^n, \mathbb{R}^n)$ is nonsingular, then
$$\lambda(T(E)) = |\det T|\,\lambda(E), \quad E \in \mathcal{B}(\mathbb{R}^n). \tag{11.19}$$
Proof. Since $T$ is nonsingular, $T(E) \in \mathcal{B}(\mathbb{R}^n)$ so the left side of (11.19) is defined. Furthermore, if (11.19) holds for $T_1$ and $T_2$, then it holds for $T_1T_2$:
$$\lambda\big(T_1T_2(E)\big) = |\det T_1|\,\lambda\big(T_2(E)\big) = |\det T_1|\,|\det T_2|\,\lambda(E) = |\det(T_1T_2)|\,\lambda(E).$$
Now observe that a nonsingular linear transformation T may be expressed as
a product of elementary linear transformations, that is, linear transformations
whose matrices are obtained from the identity matrix by one of the following
operations:
(a) Interchange of two rows.
(b) Multiplication of a row by a nonzero constant.
(c) Addition of one row to another.
This is simply the assertion that a matrix may be put into reduced row echelon
form by a sequence of elementary row operations. (See Appendix B.) We
claim that (11.19) holds for elementary linear transformations T and bounded
intervals E = I1 × . . . × In .
In case (a), det T = −1 and T (E) is the interval obtained from E by
interchanging a pair of intervals Ii and Ij , hence (11.19) holds in this case.
In (b), T (E) is the interval obtained from E by multiplying one of the
coordinate intervals by a nonzero constant a, hence λ(T (E)) = |a|λ(E). Since
| det T | = |a|, (11.19) holds in this case as well.
For case (c), assume for definiteness that the matrix of T is the result of
adding row two of the identity matrix to row one, so
T (x1 , x2 , x3 , . . . , xn ) = (x1 + x2 , x2 , x3 , . . . , xn ).
Then $\det T = 1$ and
$$\lambda\big(T(E)\big) = \int \mathbf{1}_{T(E)}(x)\,dx = \int \mathbf{1}_E(x_1 - x_2, x_2, \ldots, x_n)\,dx.$$
By the Fubini–Tonelli theorem and translation invariance, the last integral evaluates to
$$\begin{aligned}
\int\!\!\int \cdots \int & \mathbf{1}_{I_1}(x_1 - x_2)\,\mathbf{1}_{I_2}(x_2)\cdots\mathbf{1}_{I_n}(x_n)\,dx_n\cdots dx_2\,dx_1 \\
&= |I_n|\cdots|I_3| \int \mathbf{1}_{I_2}(x_2)\int \mathbf{1}_{I_1}(x_1 - x_2)\,dx_1\,dx_2 \\
&= |I_n|\cdots|I_3|\,|I_2|\,|I_1| \\
&= \lambda(E).
\end{aligned}$$
Therefore, (11.19) holds in case (c).
It now follows that (11.19) holds for all nonsingular T and all intervals
E. To verify (11.19) for all Borel sets E, we use 11.5.7. For a fixed bounded
interval I, let GI denote the collection of all E ∈ B(Rn ) for which
λ(T (E ∩ I)) = | det T |λ(E ∩ I).
(11.20)
By the first part of the proof, GI contains all intervals. Let A, B ∈ GI with
A ⊆ B, and set C = A ∩ I and D = B ∩ I. Then (B \ A) ∩ I = D \ C and
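\[ \lambda\bigl(T(D \setminus C)\bigr) = \lambda\bigl(T(D)\bigr) - \lambda\bigl(T(C)\bigr) = |\det T|\bigl(\lambda(D) - \lambda(C)\bigr) = |\det T|\,\lambda(D \setminus C), \]
hence B \ A ∈ GI . (The subtraction is legitimate because C and D are bounded, so all the measures involved are finite.) Now let Ak ∈ GI , Ak ↑ A. Letting k → +∞ in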
λ(T (Ak ∩ I)) = | det T |λ(Ak ∩ I)
shows that A ∈ GI . Therefore, GI satisfies (a)–(c) of 11.5.7, hence (11.20)
holds for every E ∈ B(Rn ). Taking a sequence of bounded intervals Ik ↑ Rn in
(11.20) yields (11.19).
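The identity (11.19) can be checked by hand in the plane. The following is a minimal sketch, not part of the original proof: it applies one transformation of each elementary type (a), (b), (c), and their product, to the rectangle E = [0, 2) × [0, 3) and compares the area of the image (computed with the shoelace formula) with |det T| λ(E). The helper names `apply`, `det2`, `matmul2`, `polygon_area`, and `image_area` are hypothetical and introduced only for this illustration.

```python
def apply(T, p):
    """Apply the 2x2 matrix T (given as a list of rows) to the point p."""
    (a, b), (c, d) = T
    x, y = p
    return (a * x + b * y, c * x + d * y)


def det2(T):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = T
    return a * d - b * c


def matmul2(S, T):
    """Matrix product S T, i.e. the transformation 'first T, then S'."""
    return [[sum(S[i][k] * T[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def image_area(T, vertices):
    """Area of T(E), where E is the polygon with the given vertices."""
    return polygon_area([apply(T, v) for v in vertices])


# E = [0, 2) x [0, 3); its boundary has measure zero, so lambda(E) = 6.
E = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0), (0.0, 3.0)]

swap = [[0.0, 1.0], [1.0, 0.0]]    # operation (a): det = -1
scale = [[5.0, 0.0], [0.0, 1.0]]   # operation (b): det = 5
shear = [[1.0, 1.0], [0.0, 1.0]]   # operation (c): det = 1
product = matmul2(shear, matmul2(scale, swap))

for T in (swap, scale, shear, product):
    # The two printed numbers should agree: |det T| * lambda(E) and lambda(T(E)).
    print(abs(det2(T)) * polygon_area(E), image_area(T, E))
```

The check covers only polygons in the plane; the general statement for Borel sets is what the argument with 11.5.7 above provides.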
FIGURE 11.3: Concentric cube and ball (the cube $Q_r(y)$, with half-edge r/2, inside the ball $B_{r\sqrt{n}/2}(y)$).
For the remaining lemmas, the following terminology and notation will be
useful. The cube with center y ∈ Rn and edge r > 0 is the semi-closed interval
Q = Qr (y) := {x ∈ Rn : yj − r/2 ≤ xj < yj + r/2, j = 1, . . . , n} .
Note that $|Q| = r^n$ and the diameter of Q is $r\sqrt{n}$. Thus
\[ B_{r/2}(y) \subseteq Q_r(y) \subseteq B_{r\sqrt{n}/2}(y). \]
A paving of a subset A of Rn is a finite collection Qr of pairwise disjoint cubes
with edge r that covers A. Two pavings Qr = {Qr (xj ) : 1 ≤ j ≤ m} and
Qs = {Qs (xj ) : 1 ≤ j ≤ m} with the same centers are said to be concentric.
Any bounded set A has a paving Qr with arbitrarily small r. Indeed, if
A ⊆ [a, b)n , one need only subdivide [a, b) into subintervals of size (b − a)/k
for sufficiently large k and form Cartesian products of these subintervals.
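The subdivision just described is easy to carry out explicitly. The sketch below is an illustration under stated assumptions, not the book's construction verbatim: it builds the k^n semi-closed cubes that subdivide [a, b)^n and counts how many of them contain a given point, which should be exactly one because the cubes are pairwise disjoint and cover [a, b)^n. The function names `paving`, `in_cube`, and `cubes_containing` are hypothetical.

```python
import itertools


def paving(a, b, n, k):
    """Centers and common edge r of the k**n semi-closed cubes that subdivide [a, b)^n."""
    r = (b - a) / k
    ticks = [a + (i + 0.5) * r for i in range(k)]
    centers = list(itertools.product(ticks, repeat=n))
    return centers, r


def in_cube(x, center, r):
    """True if x lies in Q_r(center) = {x : c_j - r/2 <= x_j < c_j + r/2 for all j}."""
    return all(c - r / 2 <= xj < c + r / 2 for xj, c in zip(x, center))


def cubes_containing(x, centers, r):
    """Number of cubes of the paving that contain the point x."""
    return sum(in_cube(x, c, r) for c in centers)


centers, r = paving(0.0, 1.0, n=2, k=4)
print(len(centers), r)                              # number of cubes and their edge
print(cubes_containing((0.3, 0.7), centers, r))     # an interior point
print(cubes_containing((0.25, 0.5), centers, r))    # a point on shared faces
```

Using semi-closed cubes is what makes points on shared faces belong to exactly one cube; with closed cubes the second count would be larger than one.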
11.6.4 Lemma. Let K ⊆ U be compact.
(a) For each sufficiently small δ > 0, there exists a compact set Kδ with
K ⊆ Kδ ⊆ U .
(b) For each r < δ, there exists a paving Qr of K contained in Kδ .
Proof. For subsets A, B ⊆ Rn , denote by d(A, B) the distance between A
and B:
\[ d(A, B) = \inf\{\|a - b\| : a \in A,\ b \in B\}. \]
Since K is compact and $U^c$ is closed, $\delta_0 := d(U^c, K) > 0$. For $0 < \delta < \delta_0/\sqrt{n}$, let
\[ K_\delta = \{x : d(x, K) \le \delta\sqrt{n}\}. \]
Then $K_\delta$ is compact and $K \subseteq K_\delta \subseteq U$. Let Q be a cube with edge r. If $x \in Q \cap K$ and $y \in Q \cap K_\delta^c$, then
\[ \delta\sqrt{n} < d(y, K) \le \|x - y\| \le r\sqrt{n}. \]
Therefore, if r < δ and Q ∩ K ≠ ∅, then $Q \cap K_\delta^c = \emptyset$, that is, $Q \subseteq K_\delta$. Since K is bounded, there exists a paving $\mathcal{Q}_r$ of K. Removing those members of $\mathcal{Q}_r$ that do not meet K produces a paving of K contained in $K_\delta$.
FIGURE 11.4: The paving $\mathcal{Q}_r$.
11.6.5 Corollary.
Let ψ : U → Rn be C 1 on U and let E ⊆ U with λ(E) = 0.
Then $\lambda(\psi(E)) = 0$.
Proof. Suppose first that E is bounded. Let V ⊇ E be open with compact
closure contained in U and set
\[ c := \sup_{z \in \operatorname{cl}(V)} \|\psi'(z)\|. \]
By continuity of $\psi'$ and compactness of cl(V ), c < +∞. Given ε > 0, let
W ⊇ E be open with compact closure K = cl(W ) ⊆ V such that λ(K) < ε/2.
This is possible by 10.4.4, since λ(E) = 0.
Now let Kδ be as in 11.6.4. Since Kδ ↓ K as δ ↓ 0, we may take δ sufficiently
small so that λ(Kδ ) < ε. According to the lemma, we may choose a paving
Qr = {Q1 , . . . , Qk } of K contained in Kδ with r < ε. It follows that
\[ k r^n = \sum_{j=1}^{k} \lambda(Q_j) = \lambda\Bigl(\bigcup_{j=1}^{k} Q_j\Bigr) < \varepsilon. \tag{11.21} \]
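Let $x_j$ denote the center of $Q_j$. Since $Q_j$ is convex, 9.3.6 implies that
\[ \|\psi(x) - \psi(x_j)\| \le c\,\|x - x_j\| \le c\,r\sqrt{n}, \quad x \in Q_j. \]
Therefore,
\[ \psi(Q_j) \subseteq B_{cr\sqrt{n}}\bigl(\psi(x_j)\bigr) \subseteq Q_{2cr\sqrt{n}}\bigl(\psi(x_j)\bigr) \]
and so
\[ \lambda\bigl(\psi(Q_j)\bigr) \le (2cr\sqrt{n})^n. \]
Since the sets $\psi(Q_j)$ cover $\psi(K)$,
\[ \lambda\bigl(\psi(E)\bigr) \le \lambda\bigl(\psi(K)\bigr) \le k\,(2cr\sqrt{n})^n \le (2c\sqrt{n})^n\,\varepsilon, \]
the last inequality by (11.21). Since ε was arbitrary, $\lambda(\psi(E)) = 0$. This proves the assertion of the corollary for bounded E. In the unbounded case, take a sequence of bounded Borel sets $E_k \uparrow E$ and apply the bounded case to each $E_k$.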
11.6.6 Lemma. Let ψ be $C^1$ on U, Q a cube contained in U, and let $I_n$ denote the identity transformation on $\mathbb{R}^n$. If $\|d\psi_x - I_n\| \le c$ for all x ∈ Q, then $\lambda(\psi(Q)) \le [(1 + c)n]^n\,\lambda(Q)$.
Proof. Let $\tilde\psi(x) = \psi(x) - x$. Then $d\tilde\psi_x = d\psi_x - I_n$. By 9.3.6,
\[ \|\tilde\psi(x) - \tilde\psi(y)\| \le c\,\|x - y\| \quad\text{for all } x, y \in Q. \]
Thus, if Q has center $x_0$ and edge r, then for all x ∈ Q
\[ \|\psi(x) - \psi(x_0)\| \le \|\tilde\psi(x) - \tilde\psi(x_0)\| + \|x - x_0\| \le (c + 1)\|x - x_0\| \le (c + 1)\sqrt{n}\,r/2, \]
that is, ψ(Q) is contained in the closed ball C with center $\psi(x_0)$ and radius $\sqrt{n}(c + 1)r/2$. Since C is contained in the cube with center $\psi(x_0)$ and edge $(c + 1)nr$,
\[ \lambda\bigl(\psi(Q)\bigr) \le [(c + 1)nr]^n = [(c + 1)n]^n\,\lambda(Q). \]
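11.6.7 Lemma. Let ψ : U → Rn be C 1 on U and let K ⊆ U be compact. Then, for each ε > 0, there exists δ > 0, a compact set $K_\delta$ with $K \subseteq K_\delta \subseteq U$, and concentric pavings $\mathcal{Q}_r$, $\mathcal{Q}_{nr}$ of K contained in $K_\delta$ with arbitrarily small r such that for any $Q_r(y) \in \mathcal{Q}_r$,
\[ \lambda\bigl(\varphi(Q_r(y))\bigr) \le (1 + \varepsilon)^n\,|J_\varphi(y)|\,\lambda\bigl(Q_{nr}(y)\bigr). \tag{11.22} \]
Moreover, δ may be chosen so that
\[ \int_{K_\delta} |J_\varphi(x)|\,dx < \int_K |J_\varphi(x)|\,dx + \varepsilon. \tag{11.23} \]
Proof. Let $M = \sup\{\|(d\varphi_y)^{-1}\| : y \in K_\delta\}$, where $K_\delta$ is chosen as in 11.6.4. For x, y ∈ U define
\[ \psi^y(x) = (d\varphi_y)^{-1}\bigl(\varphi(x) - \varphi(y)\bigr) = (d\varphi_y)^{-1}\bigl(\varphi(x)\bigr) - (d\varphi_y)^{-1}\bigl(\varphi(y)\bigr). \]
Since $(d\varphi_y)^{-1}$ is linear, by the chain rule
\[ d(\psi^y)_x = (d\varphi_y)^{-1} \circ d\varphi_x. \]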
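Thus for all x ∈ U, $y \in K_\delta$, and $z \in \mathbb{R}^n$,
\[ \|d(\psi^y)_x(z) - z\| = \bigl\|(d\varphi_y)^{-1}\bigl(d\varphi_x(z) - d\varphi_y(z)\bigr)\bigr\| \le M\,\|d\varphi_x - d\varphi_y\|\,\|z\|. \]
Therefore, by definition of the operator norm,
\[ \|d(\psi^y)_x - I_n\| \le M\,\|d\varphi_x - d\varphi_y\|. \tag{11.24} \]
Now, by the uniform continuity of dϕ on $K_\delta$ there exists $0 < \delta_1 < \delta$ such that $\|d\varphi_x - d\varphi_y\| \le \varepsilon/M$ for all $x, y \in K_\delta$ with $\|x - y\| < \delta_1\sqrt{n}$. Let $r < \delta_1/n$ and let $\mathcal{Q}_r$, $\mathcal{Q}_{nr}$ be concentric pavings of K contained in $K_\delta$. If $x \in Q := Q_r(y) \in \mathcal{Q}_r$, then $\|x - y\| < r\sqrt{n} < \delta_1\sqrt{n}$, hence, from (11.24), $\|d(\psi^y)_x - I_n\| \le \varepsilon$. By 11.6.6,
\[ \lambda\bigl(\psi^y(Q)\bigr) \le [(1 + \varepsilon)n]^n\,\lambda(Q) = (1 + \varepsilon)^n\,\lambda\bigl(Q_{nr}(y)\bigr). \tag{11.25} \]
On the other hand, since $\psi^y(Q) = (d\varphi_y)^{-1}\bigl(\varphi(Q)\bigr) - (d\varphi_y)^{-1}\bigl(\varphi(y)\bigr)$, by translation invariance and 11.6.3,
\[ \lambda\bigl(\psi^y(Q)\bigr) = \lambda\bigl((d\varphi_y)^{-1}(\varphi(Q))\bigr) = |J_\varphi(y)|^{-1}\,\lambda\bigl(\varphi(Q)\bigr). \tag{11.26} \]
Inequality (11.22) now follows from (11.25) and (11.26).
For (11.23), note that since $K_{1/k} \downarrow K$ and $\mu(A) := \int_A |J_\varphi|\,d\lambda$ is a measure on the Borel sets (11.3.4), $\mu(K_{1/k}) \downarrow \mu(K)$. Thus there exists k such that $\mu(K_{1/k}) < \mu(K) + \varepsilon$. Taking δ < 1/k completes the proof.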
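11.6.8 Lemma. If K ⊆ U is compact, then
\[ \lambda\bigl(\varphi(K)\bigr) \le \int_K |J_\varphi(y)|\,dy. \]
Proof. Let ε > 0 and choose δ > 0 as in 11.6.7. By uniform continuity of $J_\varphi$ on $K_\delta$, there exists $\delta_1 < \delta$ such that
\[ |J_\varphi(x) - J_\varphi(y)| < \varepsilon \quad\text{for all } x, y \in K_\delta \text{ with } \|x - y\| < \delta_1. \]
Choose pavings $\mathcal{Q}_r = \{Q_r(y)\}_y$ and $\mathcal{Q}_{nr} = \{Q_{nr}(y)\}_y$ as in 11.6.7. Then for $x \in Q_{nr}(y)$
\[ |J_\varphi(y)| \le |J_\varphi(x) - J_\varphi(y)| + |J_\varphi(x)| < \varepsilon + |J_\varphi(x)|, \]
hence, by (11.22),
\[ (1 + \varepsilon)^{-n}\,\lambda\bigl(\varphi(Q_r(y))\bigr) \le |J_\varphi(y)|\,\lambda\bigl(Q_{nr}(y)\bigr) \le \int_{Q_{nr}(y)} \bigl(|J_\varphi(x)| + \varepsilon\bigr)\,dx, \]
so
\[
\begin{aligned}
(1 + \varepsilon)^{-n}\,\lambda\bigl(\varphi(K)\bigr) &\le \sum_y (1 + \varepsilon)^{-n}\,\lambda\bigl(\varphi(Q_r(y))\bigr) \\
&\le \int_{K_\delta} \bigl(|J_\varphi(x)| + \varepsilon\bigr)\,dx \\
&\le \int_K |J_\varphi(x)|\,dx + \varepsilon\bigl(1 + \lambda(K_\delta)\bigr) \qquad\text{by (11.23)}.
\end{aligned}
\]
Letting ε → 0 verifies the lemma.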
Now use 10.4.5 to obtain an increasing sequence of compact sets $K_k \subseteq E$ such that $\lambda(K_k) \uparrow \lambda(E)$. Then $\lambda(\varphi(K_k)) \uparrow \lambda(\varphi(E))$ and, by 11.6.8,
\[ \lambda\bigl(\varphi(K_k)\bigr) \le \int_{K_k} |J_\varphi(y)|\,dy \le \int_E |J_\varphi(y)|\,dy. \]
Letting k → +∞ yields (11.18), completing the proof of the change of variables theorem.
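As a numerical sanity check of the relation between λ(ϕ(E)) and the integral of |J_ϕ| over E, the following sketch uses the polar coordinate map ϕ(r, θ) = (r cos θ, r sin θ) on E = (0, 1) × (0, 2π). This example is chosen here for illustration and is not taken from the text; for it, J_ϕ(r, θ) = r and ϕ(E) is the open unit disk minus a segment of measure zero, so both quantities should be close to π. The function names and the grid sizes are assumptions of the sketch.

```python
import math


def integral_of_abs_jacobian(n_r=200, n_theta=200):
    """Midpoint-rule value of the integral of |J(r, theta)| = r over E = (0, 1) x (0, 2*pi)."""
    hr = 1.0 / n_r
    ht = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for _ in range(n_theta):
            # The Jacobian determinant of (r cos t, r sin t) is r, independent of t.
            total += abs(r) * hr * ht
    return total


def area_of_image(n=400):
    """Approximate lambda(phi(E)) by counting midpoints of an n x n grid on [-1, 1]^2
    that fall in the open unit disk."""
    h = 2.0 / n
    count = 0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:
                count += 1
    return count * h * h


print(integral_of_abs_jacobian(), area_of_image(), math.pi)
```

The two computations are independent of each other: one integrates the Jacobian over the parameter domain, the other measures the image directly.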
11.6.9 Remark. If V is a linear subspace of $\mathbb{R}^n$ of dimension m < n, then $\lambda_n(V) = 0$. To see this, let $v_1, \ldots, v_m, \ldots, v_n$ be an orthonormal basis for $\mathbb{R}^n$, where the first m vectors form a basis for V. (This is always possible by the Gram–Schmidt process.) Define $T_V \in L(\mathbb{R}^n, \mathbb{R}^n)$ such that $T_V(v_j) = e_j$, 1 ≤ j ≤ n. Then $T_V$ is an orthogonal transformation and $T_V(V) = \mathbb{R}^m \times \{0\}$. By 11.6.3
\[ \lambda_n(V) = |\det(T_V)|\,\lambda_n(\mathbb{R}^m \times \{0\}) = 0, \]
as claimed. This also shows that (11.19) holds for singular transformations T, since then both sides of that equation are zero.
While the n-dimensional volume of a subset E of V is zero, E may still
have positive m-dimensional measure. This is defined as
\[ \lambda_V(E) := \lambda_m\bigl(T_V(E)\bigr) \quad\text{for } E \in T_V^{-1}\bigl(\mathcal{B}(\mathbb{R}^m)\bigr). \]
From a geometric point of view, this is a reasonable definition, since an
orthogonal transformation is either a rotation or a rotation combined with
a reflection and therefore does not change volumes or areas. To see that the
definition does not depend on the particular choice of the orthonormal basis,
let w1 , . . . , wn be another orthonormal basis for Rn whose first m members
form a basis for V and let T̃V ∈ L(Rn , Rn ) satisfy T̃V (wj ) = ej , 1 ≤ j ≤ n. Set
$T = \tilde T_V T_V^{-1}$. Then, by (11.19),
\[ \lambda_m\bigl(\tilde T_V(E)\bigr) = \lambda_m\bigl(T\,T_V(E)\bigr) = |\det T|\,\lambda_m\bigl(T_V(E)\bigr) = \lambda_m\bigl(T_V(E)\bigr), \]
the last equality because T is orthogonal and hence has determinant ±1.
♦
Exercises
1. Define the n-dimensional ellipsoid
\[ E = \Bigl\{(x_1, \ldots, x_n) : \Bigl(\frac{x_1}{a_1}\Bigr)^2 + \cdots + \Bigl(\frac{x_n}{a_n}\Bigr)^2 \le 1\Bigr\}, \]
where $a_j > 0$. Prove that $\lambda_n(E) = a_1 \cdots a_n\,\lambda_n\bigl(C_1(0)\bigr)$.
2. Show that the volume of the solid with surface $\sqrt{|x|} + \sqrt{|y|} + \sqrt{|z|} = 1$ is given by
\[ 64 \int_0^1 \int_0^{1-u} \int_0^{1-u-v} uvw\,dw\,dv\,du. \]
3.S ⇓ Let h be Lebesgue integrable on [0, +∞). Use 11.6.2 to prove that
\[ \int_{\mathbb{R}^n} h(\|x\|)\,dx = n\alpha_n \int_0^\infty h(r)\,r^{n-1}\,dr. \]
(This exercise will be used in 13.2.5 and 13.4.2.)
4. Use Exercise 3 to show that for n ≥ 2
(a) $\int_{\mathbb{R}^n} \exp(-\|x\|)\,dx = n!\,\alpha_n$.
(b) $\int_{\mathbb{R}^n} \exp(-\|x\|^2)\,dx = \pi^{n/2}$.
5.S A hole of radius R ∈ (0, 1) is drilled in the (n + 1)-dimensional ball $C_1^{n+1}(0)$ from the north pole (0, 0, . . . , 1) to the south pole (0, 0, . . . , −1). Use Exercise 3 to show that the amount removed from the ball is
\[ n\alpha_n\Bigl(R\sqrt{1 - R^2} - \arcsin\sqrt{1 - R^2} + \pi/2\Bigr). \]
6.S (Theorem of Pappus) Let $E \in \mathcal{M}(\mathbb{R}^n)$ be bounded with positive n-dimensional Lebesgue measure such that $x_n > 0$ for all $x = (x_1, \ldots, x_n) \in E$. Define
\[ E_r = \{(x_1, \ldots, x_{n-1}, x_n\cos\theta, x_n\sin\theta) : x \in E,\ 0 < \theta < 2\pi\}. \]
Prove that
\[ \lambda_{n+1}(E_r) = 2\pi\,\bar{x}_n\,\lambda_n(E), \]
where
\[ \bar{x}_n := \frac{1}{\lambda_n(E)} \int_E x_n\,d\lambda_n(x_1, \ldots, x_n) \]
is the nth coordinate of the centroid $\bar{x}$ of E. Thus if n = 2, then $E_r$ is the rotation of E about the $x_1$-axis, and the theorem of Pappus asserts that the volume of $E_r$ is equal to the area of E times the distance the centroid of E travels around the $x_1$-axis.
FIGURE 11.5: Theorem of Pappus.
Chapter 12
Curves and Surfaces in Rn
12.1 Parameterized Curves
A parameterized curve in Rn is a continuous function ϕ : I → Rn , where I
is an interval in R. We shall usually refer to ϕ as simply a curve. The range
ϕ(I) of ϕ is called the trace of ϕ and is denoted by trace(ϕ). The curve is
said to lie in a set E ⊆ Rn if trace(ϕ) ⊆ E. The curve is called simple if ϕ is
one-to-one. If I = [a, b], the point ϕ(a) is the initial point of the curve and
ϕ(b) the terminal point. The curve ϕ is then said to be closed if ϕ(a) = ϕ(b),
and simple closed if it is closed and ϕ is one-to-one on (a, b), that is, the curve
intersects itself only at the initial and terminal points. For example, the curve
(cos(2kπt), sin(2kπt)), t ∈ [0, 1], k ∈ N, is a simple closed curve iff k = 1; its
trace is the circle $x^2 + y^2 = 1$.
FIGURE 12.1: Curves in $\mathbb{R}^2$ (a simple curve, a non-simple curve, and a simple closed curve).
A curve ϕ : I → Rn is said to be of class $C^r$ if ϕ is $C^r$ on an open interval containing I. A $C^1$ curve ϕ is smooth if $\varphi'(t) \ne 0$ for all t ∈ I. For example, on [−1, 1] the curve $\varphi(t) = (t, t^2)$ is smooth but the curve $\psi(t) = (t^3, t^6)$, which has the same trace as ϕ, is not.
A curve ϕ : [a, b] → Rn is said to be piecewise smooth if, for some partition $a = a_0 < a_1 < \cdots < a_m = b$, ϕ is smooth on each interval $[a_{j-1}, a_j]$. This implies that $\varphi'$ is uniformly continuous on each interval of smoothness $(a_{j-1}, a_j)$ and has right-hand and left-hand limits at the left and right endpoints, respectively. Thus a piecewise smooth curve may be viewed as a concatenation (sum) of smooth curves, as shown in Figure 12.2. Note that at junctions that are corners there are two tangent vectors, and at junctions that are cusps there is one. A point on a smooth portion of the curve will be called a smooth point. A piecewise smooth curve therefore consists of smooth points and finitely many corner or cusp points.
FIGURE 12.2: A piecewise smooth curve with tangent vectors (showing corners, a cusp, and a smooth point).
A reparametrization of a curve ϕ : I → Rn is a curve ψ = ϕ ◦ α : J → Rn ,
where α : J → I is continuous, strictly increasing, and α(J) = I (hence
trace(ψ) = trace(ϕ)). If ϕ is smooth, then α is required to be smooth with
positive Jacobian. If ψ is a reparametrization of ϕ, then ϕ and ψ are said to
be equivalent. For example, the smooth curve $(t, t^2, t^3)$ (t > 0) is equivalent to the curve $(e^t, e^{2t}, e^{3t})$ (t ∈ R).
A curve ϕ : I → Rn has a positive direction, namely, the direction that ϕ(t)
moves as t increases. An equivalent curve ψ = ϕ ◦ α has the same direction
since α is strictly increasing. The curve −ϕ, defined by
(−ϕ)(t) := ϕ(−t), −t ∈ I,
has the opposite (negative) direction. If ϕ is piecewise smooth, then the positive direction is given by the tangent vectors $\varphi'(t)$, defined at smooth points. At corners and cusps the tangent vectors are right- and left-hand limits. The set
of tangent vectors to a curve is called the tangent vector field (defined more
precisely later).
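12.1.1 Proposition. Let $\varphi_j : [a_j, b_j] \to \mathbb{R}^n$, j = 1, . . . , k, be piecewise $C^1$ curves such that $\varphi_j(b_j) = \varphi_{j+1}(a_{j+1})$, j = 1, . . . , k − 1. Then there exists a piecewise $C^1$ curve $\varphi : [0, 1] \to \mathbb{R}^n$, denoted by
\[ \varphi = \varphi_1 + \varphi_2 + \cdots + \varphi_k \]
and called the sum of the curves $\varphi_j$, such that $\varphi|_{[(j-1)/k,\,j/k]}$ is equivalent to $\varphi_j$.
Proof. Define $\alpha_j : [(j - 1)/k, j/k] \to [a_j, b_j]$ by
\[ \alpha_j(t) = b_j + (b_j - a_j)(kt - j), \quad (j - 1)/k \le t \le j/k, \]
and $\varphi : [0, 1] \to \mathbb{R}^n$ by $\varphi = \varphi_j \circ \alpha_j$ on [(j − 1)/k, j/k]. Each $\alpha_j$ is affine and strictly increasing with $\alpha_j((j-1)/k) = a_j$ and $\alpha_j(j/k) = b_j$, and the condition $\varphi_j(b_j) = \varphi_{j+1}(a_{j+1})$ ensures that ϕ is well defined at the points j/k.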
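The construction in the proof translates directly into code. The sketch below is a minimal illustration, with the hypothetical name `concatenate` and an example of two line segments chosen here: it builds the sum of curves on [0, 1] using the affine maps α_j(t) = b_j + (b_j − a_j)(kt − j) from the proof.

```python
def concatenate(curves):
    """Sum of curves in the sense of 12.1.1.

    `curves` is a list of triples (phi_j, a_j, b_j), where phi_j is a function on
    [a_j, b_j] and phi_j(b_j) == phi_{j+1}(a_{j+1}).  Returns phi on [0, 1].
    """
    k = len(curves)

    def phi(t):
        # 1-based index j of a subinterval [(j-1)/k, j/k] containing t
        j = min(int(t * k) + 1, k)
        phi_j, a_j, b_j = curves[j - 1]
        alpha = b_j + (b_j - a_j) * (k * t - j)   # alpha_j(t) from the proof
        return phi_j(alpha)

    return phi


# Two segments forming an "L": (0,0) -> (2,0) on [0, 2], then (2,0) -> (2,1) on [5, 6].
phi1 = lambda t: (t, 0.0)
phi2 = lambda t: (2.0, t - 5.0)
phi = concatenate([(phi1, 0.0, 2.0), (phi2, 5.0, 6.0)])
print(phi(0.0), phi(0.25), phi(0.5), phi(1.0))
```

At a junction t = j/k the code evaluates the later piece; by the matching condition on the endpoints, the earlier piece gives the same point.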
Exercises
1.S Prove that the notion of equivalent smooth curves is an equivalence
relation.
2. Show that if ϕ is smooth and ψ = ϕ ◦ α is an equivalent curve, then
\[ \frac{\varphi'(\alpha(t))}{\|\varphi'(\alpha(t))\|} = \frac{\psi'(t)}{\|\psi'(t)\|}. \]
Thus the unit tangent vector field is invariant under a reparametrization.
3.S Sketch the trace of the curve $\varphi(t) = (t^2, t^3 - t)$ on the interval [−2, 2]. Find all points on the trace where there are two tangent vectors and express these vectors in terms of the standard basis.
4. Find the tangent vector field of the given curve ϕ on the interval [0, 2π].
Sketch the trace and find all points on the trace at which there are two
tangent vectors. Express these vectors in terms of the standard basis.
(a)S $\varphi(t) = (\sin t, \cos(2t))$.
(b) $\varphi(t) = (\cos t, \sin(2t))$.
(c) $\varphi(t) = (\cos t, \cos(2t))$.
(d) $\varphi(t) = (\sin t, \sin(2t))$.
5. In (a)–(d) below, find a smooth simple curve or a smooth simple closed
curve ϕ : I → C with trace C.
(a) C is the intersection of the elliptic cylinder $x^2/a^2 + y^2/b^2 = 1$ and the plane x + y + z = 1.
(b)S C is the intersection of the elliptic cylinder $x^2/a^2 + y^2/b^2 = 1$ and the surface z = 2xy.
(c) C is the intersection in the first octant of the paraboloid $z = x^2 + y^2$ and the plane x + y + z = 1.
(d) C is the intersection in the first octant of the cone z = x2 + y 2 and
the plane x + y + z = 1.
6.S Let ϕ : [a, b] → Rn be a C 1 curve with the property that for some
x ∈ Rn , ϕ(t) = x for infinitely many t ∈ [a, b]. Prove that ϕ is not
smooth.
7. Let f be $C^1$ on an open set U and let ϕ be a $C^1$ curve in U. Suppose that $\varphi'(t) = \nabla f(\varphi(t))$ for all t > a and that the limit $x := \lim_{t\to+\infty} \varphi(t)$ exists in U. Prove that ∇f(x) = 0.
Hint. Assume ∇f(x) ≠ 0. Let g = f ◦ ϕ and show that $g'(t) > \|\nabla f(x)\|^2/2$ for all sufficiently large t.
12.2 Integration on Curves
Rectifiable Curves
Let ϕ : I → Rn be a parameterized curve. Assume first that I = [a, b]. For
a partition P = {t0 = a < t1 < · · · < tk−1 < tk = b} of [a, b] define
\[ L_P(\varphi) = \sum_{j=1}^{k} \|\varphi(t_j) - \varphi(t_{j-1})\|, \]
which is the length of the inscribed polygonal line with segments joining the points $\varphi(t_{j-1})$ and $\varphi(t_j)$.
FIGURE 12.3: Inscribed polygonal line.
The (arc) length of ϕ is defined as
\[ \operatorname{length}(\varphi) := \sup_P L_P(\varphi), \]
where the supremum is taken over all partitions P of [a, b]. If length(ϕ) < +∞,
then ϕ is said to be rectifiable. Note that if ψ = ϕ ◦ α is equivalent to ϕ,
then length(ψ) = length(ϕ), since α : [c, d] → [a, b] induces a one-to-one
correspondence between partitions of [c, d] and [a, b].
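To see the definition at work, the following sketch computes the inscribed polygonal lengths L_P(ϕ) for the unit circle ϕ(t) = (cos t, sin t) on [0, 2π] over uniform partitions. The circle and the names `polygonal_length` and `uniform_partition` are choices made here for illustration. Refining the partition should not decrease L_P, and the values should approach the circumference 2π from below.

```python
import math


def polygonal_length(phi, partition):
    """L_P(phi): sum of ||phi(t_j) - phi(t_{j-1})|| over consecutive partition points."""
    return sum(math.dist(phi(t1), phi(t0))
               for t0, t1 in zip(partition, partition[1:]))


def uniform_partition(a, b, k):
    """Partition of [a, b] into k subintervals of equal length."""
    return [a + (b - a) * j / k for j in range(k + 1)]


circle = lambda t: (math.cos(t), math.sin(t))

for k in (4, 8, 16, 1000):
    P = uniform_partition(0.0, 2 * math.pi, k)
    print(k, polygonal_length(circle, P))
print("2*pi =", 2 * math.pi)
```

For k = 4 the polygon is the inscribed square; each doubling of k refines the previous partition, so the printed values increase.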
If I = [a, b) (where b could be infinite), define
\[ \operatorname{length}(\varphi) := \sup_{a<t<b} \operatorname{length}\bigl(\varphi|_{[a,t]}\bigr). \]
A similar definition is given for intervals I = (a, b]. For an open interval I = (a, b), define
\[ \operatorname{length}(\varphi) := \operatorname{length}\bigl(\varphi|_{(a,c]}\bigr) + \operatorname{length}\bigl(\varphi|_{[c,b)}\bigr), \]
where a < c < b. By 12.2.3 below, the expression on the right does not depend on the intermediate point c.
12.2.1 Example. Let α > 0. The curve
\[ \varphi(t) = \bigl(x(t), y(t)\bigr) = \bigl(t, t^\alpha \sin(1/t)\bigr), \quad 0 < t \le b, \qquad \varphi(0) = 0, \]
is rectifiable iff α > 1. This follows from the inequalities
\[ \sum_{j=1}^{k} |y(t_j) - y(t_{j-1})| \le \sum_{j=1}^{k} \|\varphi(t_j) - \varphi(t_{j-1})\| \le 2(b - a) + 2\sum_{j=1}^{k} |y(t_j) - y(t_{j-1})| \]
and 5.9.3.
♦
We prove in 12.2.4 below that piecewise C 1 curves on [a, b] are rectifiable.
For this, we require two lemmas. The proof of the first is similar to that of the
corresponding result for lower Darboux sums and is left as an exercise.
12.2.2 Lemma. Let ϕ : [a, b] → Rn be a curve and let P and Q be partitions
of [a, b]. If P is a refinement of Q, then LQ (ϕ) ≤ LP (ϕ).
12.2.3 Lemma. Let ϕ : [a, b] → Rn be a curve and c ∈ (a, b). Then
\[ \operatorname{length}(\varphi) = \operatorname{length}\bigl(\varphi|_{[a,c]}\bigr) + \operatorname{length}\bigl(\varphi|_{[c,b]}\bigr). \]
In particular, ϕ is rectifiable iff ϕ|[a,c] and ϕ|[c,b] are rectifiable.
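Proof. Let $P'$ and $P''$ be partitions of [a, c] and [c, b], respectively, and set $P = P' \cup P''$. Then P is a partition of [a, b] and
\[ \operatorname{length}(\varphi) \ge L_P(\varphi) = L_{P'}\bigl(\varphi|_{[a,c]}\bigr) + L_{P''}\bigl(\varphi|_{[c,b]}\bigr). \]
Taking suprema over $P'$ and then $P''$ yields
\[ \operatorname{length}(\varphi) \ge \operatorname{length}\bigl(\varphi|_{[a,c]}\bigr) + \operatorname{length}\bigl(\varphi|_{[c,b]}\bigr). \]
For the reverse inequality, let $P = \{t_0 = a < t_1 < \cdots < t_k = b\}$ be a partition of [a, b] and suppose $c \in (t_{i-1}, t_i]$. If $P' = \{t_0 = a < t_1 < \cdots < t_{i-1} < c\}$ and $P'' = \{c \le t_i < \cdots < t_k = b\}$, then an application of the triangle inequality shows that
\[ L_P(\varphi) \le L_{P'}\bigl(\varphi|_{[a,c]}\bigr) + L_{P''}\bigl(\varphi|_{[c,b]}\bigr) \le \operatorname{length}\bigl(\varphi|_{[a,c]}\bigr) + \operatorname{length}\bigl(\varphi|_{[c,b]}\bigr). \]
Since P was arbitrary,
\[ \operatorname{length}(\varphi) \le \operatorname{length}\bigl(\varphi|_{[a,c]}\bigr) + \operatorname{length}\bigl(\varphi|_{[c,b]}\bigr). \]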
12.2.4 Theorem. Let ϕ : [a, b] → Rn be piecewise C 1 . Then ϕ is rectifiable
and
\[ \operatorname{length}(\varphi) = \sum_{j=1}^{m} \int_{a_{j-1}}^{a_j} \|\varphi'(t)\|\,dt, \]
where ϕ is $C^1$ on the intervals $[a_{j-1}, a_j]$, $a = a_0 < a_1 < \cdots < a_m = b$.
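Proof. By 12.2.3 we may assume that $\varphi = (\varphi_1, \ldots, \varphi_n)$ is $C^1$ on [a, b]. Given ε > 0, choose δ > 0 so that
\[ \Bigl|\int_a^b \|\varphi'(t)\|\,dt - \sum_{k=1}^{m} \|\varphi'(t_k)\|\,\Delta t_k\Bigr| < \varepsilon, \quad \Delta t_k := t_k - t_{k-1}, \tag{12.1} \]
for all partitions $P = \{t_0 = a < t_1 < \cdots < t_{m-1} < t_m = b\}$ with $\|P\| < \delta$. For such a partition P, choose $s_{j,k} \in (t_{k-1}, t_k)$ such that
\[ \varphi_j(t_k) - \varphi_j(t_{k-1}) = \varphi_j'(s_{j,k})\,\Delta t_k, \quad k = 1, \ldots, m,\ j = 1, \ldots, n. \]
Then
\[ L_P(\varphi) = \sum_{k=1}^{m} \|\varphi(t_k) - \varphi(t_{k-1})\| = \sum_{k=1}^{m} \Bigl(\sum_{j=1}^{n} |\varphi_j'(s_{j,k})|^2\Bigr)^{1/2} \Delta t_k, \]
hence
\[ L_P(\varphi) - \sum_{k=1}^{m} \|\varphi'(t_k)\|\,\Delta t_k = \sum_{k=1}^{m} \Bigl\{\Bigl(\sum_{j=1}^{n} |\varphi_j'(s_{j,k})|^2\Bigr)^{1/2} - \Bigl(\sum_{j=1}^{n} |\varphi_j'(t_k)|^2\Bigr)^{1/2}\Bigr\} \Delta t_k. \]
Taking a smaller δ if necessary, we may assume that the absolute value of the term in braces is less than ε/(b − a). This is possible by the uniform continuity of $\varphi'$. It follows that
\[ \Bigl|L_P(\varphi) - \sum_{k=1}^{m} \|\varphi'(t_k)\|\,\Delta t_k\Bigr| < \varepsilon. \tag{12.2} \]
From (12.1) and (12.2) we now have
\[ \int_a^b \|\varphi'(t)\|\,dt - 2\varepsilon < L_P(\varphi) < \int_a^b \|\varphi'(t)\|\,dt + 2\varepsilon \tag{12.3} \]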
for all P with $\|P\| < \delta$. Since $L_P(\varphi) \le \operatorname{length}(\varphi)$ and ε was arbitrary, the first inequality in (12.3) implies that
\[ \int_a^b \|\varphi'(t)\|\,dt \le \operatorname{length}(\varphi). \]
For the reverse inequality, let Q be any partition of [a, b]. Refine Q to obtain a partition P with $\|P\| < \delta$. Then, from 12.2.2 and the second inequality in (12.3),
\[ L_Q(\varphi) \le L_P(\varphi) < \int_a^b \|\varphi'(t)\|\,dt + 2\varepsilon. \]
Since Q and ε are arbitrary, $\operatorname{length}(\varphi) \le \int_a^b \|\varphi'(t)\|\,dt$.
The proof of the following corollary is left to the reader.
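12.2.5 Corollary. If ϕ : [a, b) → Rn is $C^1$, then length(ϕ) is the improper integral $\int_a^b \|\varphi'(t)\|\,dt$.
12.2.6 Example. Let $\varphi(t) = (e^{-t}\cos t,\ e^{-t}\sin t)$, where 0 ≤ t < +∞. Then $\varphi'(t) = -e^{-t}(\cos t + \sin t,\ \sin t - \cos t)$, so $\|\varphi'(t)\| = \sqrt{2}\,e^{-t}$, hence $\operatorname{length}(\varphi) = \sqrt{2}\int_0^\infty e^{-t}\,dt = \sqrt{2}$.
♦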
Line Integrals
Let ϕ : [a, b] → Rn be a C 1 curve with trace C and let f : C → R be
continuous. The line integral of f over ϕ is defined by
\[ \int_\varphi f\,ds = \int_C f\,ds = \int_a^b f\bigl(\varphi(t)\bigr)\,\|\varphi'(t)\|\,dt. \]
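Note that if ψ = ϕ ◦ α is an equivalent parametrization, where α : [c, d] → [a, b] is $C^1$, then, by the chain rule and the change of variables theorem,
\[
\begin{aligned}
\int_c^d f\bigl(\psi(t)\bigr)\,\|\psi'(t)\|\,dt &= \int_c^d f\bigl(\varphi(\alpha(t))\bigr)\,\|\varphi'(\alpha(t))\|\,\alpha'(t)\,dt \\
&= \int_a^b f\bigl(\varphi(u)\bigr)\,\|\varphi'(u)\|\,du.
\end{aligned}
\]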
The value of a line integral is therefore independent of the choice of parametrization.
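The independence of parametrization can be observed numerically. The following sketch is an illustration with choices made here, not an example from the text: it integrates f(x, y) = xy over the quarter of the unit circle in the first quadrant, once with ϕ(t) = (cos t, sin t) on [0, π/2] and once with ψ = ϕ ◦ α, where α(s) = (π/4)(s + s²) maps [0, 1] onto [0, π/2] with positive derivative. The name `line_integral` and the midpoint rule are assumptions of the sketch.

```python
import math


def line_integral(f, phi, dphi, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of f(phi(t)) * ||phi'(t)||."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        dx, dy = dphi(t)
        total += f(*phi(t)) * math.hypot(dx, dy) * h
    return total


f = lambda x, y: x * y

# phi: quarter of the unit circle, 0 <= t <= pi/2.
phi = lambda t: (math.cos(t), math.sin(t))
dphi = lambda t: (-math.sin(t), math.cos(t))

# alpha: [0, 1] -> [0, pi/2], C^1 with alpha' > 0; psi = phi o alpha.
alpha = lambda s: (math.pi / 4) * (s + s * s)
dalpha = lambda s: (math.pi / 4) * (1 + 2 * s)
psi = lambda s: phi(alpha(s))
dpsi = lambda s: tuple(c * dalpha(s) for c in dphi(alpha(s)))   # chain rule

I_phi = line_integral(f, phi, dphi, 0.0, math.pi / 2)
I_psi = line_integral(f, psi, dpsi, 0.0, 1.0)
print(I_phi, I_psi)   # both approximate the exact value 1/2
```

The exact value is the integral of cos t sin t over [0, π/2], which equals 1/2.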
If ϕ : [a, b] → Rn is piecewise $C^1$, then the line integral is defined as
\[ \int_\varphi f\,ds = \sum_j \int_{\varphi_j} f\,ds, \]
where $\varphi_j$ is the restriction of ϕ to $[a_j, a_{j+1}]$ and ϕ is $C^1$ on $[a_j, a_{j+1}]$. If ϕ : I → Rn is $C^1$, where I is an arbitrary interval, then the line integral is defined as an improper integral, as in the case of arc length.
12.2.7 Remark. Theorem 12.2.4 shows that arc length is the line integral of the constant function 1. Using techniques similar to those found in the proof of that theorem, one may show that if ϕ is $C^1$, then $\int_\varphi f$ is the limit of sums of the form
\[ \sum_{j=1}^{k} (f \circ \varphi)(t_j^*)\,\|\varphi(t_j) - \varphi(t_{j-1})\|, \quad t_j^* \in (t_{j-1}, t_j), \]
as $\max_j \|\varphi(t_j) - \varphi(t_{j-1})\| \to 0$. This interpretation is useful in applications. For example, if f(x) is the mass per unit length at the point x of a wire C, then $(f \circ \varphi)(t_j^*)\,\|\varphi(t_j) - \varphi(t_{j-1})\|$ is approximately the mass of a small piece of the wire. Summing and taking the limit gives the mass of the wire as the line integral $\int_C f\,ds$.
♦
Vector Fields
12.2.8 Definition. A vector field on a set E ⊆ Rn is a function
\[ \vec F = (f_1, \ldots, f_n) : E \to \mathbb{R}^n. \]
The vector field is said to be of class $C^r$ if each $f_j$ is $C^r$.
♦
Geometrically, a vector field assigns to each point of E a unique vector in $\mathbb{R}^n$, as illustrated in Figure 12.4.
FIGURE 12.4: Vector field on E.
If ϕ is a simple smooth curve and x = ϕ(t), then
\[ \vec v_\varphi(x) := \varphi'(t) \quad\text{and}\quad \vec T_\varphi(x) := \frac{\varphi'(t)}{\|\varphi'(t)\|} \]
denote, respectively, the tangent vector field and unit tangent vector field along ϕ. If ϕ denotes the position of a particle at time t, then the tangent vector field is called the velocity vector field of the particle.
Vector fields that describe forces, such as gravitation or electromagnetism,
are called force fields. Line integrals may then be used to calculate the work
done by the force in moving a particle along a curve. Specifically, suppose the
particle moves along a simple smooth curve ϕ : [a, b] → R3 under the action
of a continuous force field $\vec F = (f_1, f_2, f_3)$ on C := trace(ϕ). The work $\Delta_j W$ done by the force in moving the particle from a point $x_j = \varphi(t_j)$ on C to a nearby point $x_{j+1} = \varphi(t_{j+1})$ is approximately the component of the force in the direction of the tangent to the curve at $x_j$ multiplied by the distance the particle travels:
\[ \Delta_j W \approx \vec F(x_j) \cdot \vec T_\varphi(x_j)\,\|x_j - x_{j+1}\|. \]
The total work W done by the force is then approximately $\sum_j \Delta_j W$. Since $\vec F$ is continuous, the approximation gets better by taking smaller intervals. It is therefore reasonable to define the total work done by the force in moving the particle along the curve as the limit of $\sum_j \Delta_j W$ as $\max_j \|x_j - x_{j+1}\| \to 0$. By 12.2.7, we are therefore led to the definition
\[ W := \int_\varphi \vec F \cdot \vec T\,ds = \int_a^b \vec F\bigl(\varphi(t)\bigr) \cdot \vec T_\varphi\bigl(\varphi(t)\bigr)\,\|\varphi'(t)\|\,dt. \]
Since $\vec T_\varphi(\varphi(t)) = \|\varphi'(t)\|^{-1}\varphi'(t)$, we see that
\[ W = \int_a^b \vec F\bigl(\varphi(t)\bigr) \cdot \varphi'(t)\,dt = \int_a^b \Bigl[f_1(x)\frac{dx_1}{dt} + f_2(x)\frac{dx_2}{dt} + f_3(x)\frac{dx_3}{dt}\Bigr]\,dt, \]
where x = ϕ(t). The last integral is frequently written
\[ \int_\varphi f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3. \]
The integrand is called a (differential) 1-form on C in $\mathbb{R}^3$.
Differential 1-Forms in Rn
Let fj be defined on a set S ⊆ Rn . The symbol
ω := f1 dx1 + · · · + fn dxn
is called a (differential) 1-form on S. The form is said to be C r on S if each
fj is C r on S, where r ∈ N ∪ {+∞}. Given another 1-form
η = g1 dx1 + · · · + gn dxn
on S and a, b ∈ R, the 1-form aω + bη on S is defined by
aω + bη := (af1 + bg1 ) dx1 + · · · + (afn + bgn ) dxn .
If $\vec H = (h_1, \ldots, h_n)$ is a vector field on S, we define the inner product $\omega \cdot \vec H$ of ω and $\vec H$ on S by
\[ \omega \cdot \vec H(x) := \sum_{j=1}^{n} f_j(x)\,h_j(x), \quad x \in S. \]
The integral of a continuous (that is, $C^0$) 1-form ω over a $C^1$ curve ϕ : [a, b] → S is defined as
\[ \int_\varphi \omega = \int_a^b \bigl[f_1\bigl(\varphi(t)\bigr)\varphi_1'(t) + \cdots + f_n\bigl(\varphi(t)\bigr)\varphi_n'(t)\bigr]\,dt = \int_a^b \vec F\bigl(\varphi(t)\bigr) \cdot \varphi'(t)\,dt, \]
where $\vec F := (f_1, \ldots, f_n)$. If ϕ is only piecewise $C^1$, then $\int_\varphi \omega$ is defined to be the sum of the integrals over the intervals on which ϕ is $C^1$.
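The following properties of the integral are easily established:
• $\int_\varphi (a\omega + b\eta) = a\int_\varphi \omega + b\int_\varphi \eta$,
• $\int_{-\varphi} \omega = -\int_\varphi \omega$, and
• $\int_{\varphi_1 + \cdots + \varphi_k} \omega = \sum_{j=1}^{k} \int_{\varphi_j} \omega$.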
A continuous 1-form $\omega = f_1\,dx_1 + \cdots + f_n\,dx_n$ on an open set U ⊆ Rn is said to be exact if there exists a $C^1$ function f on U such that $f_j = \partial_j f$ on U for each j. We then write
\[ \omega = df = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\,dx_j. \]
The following proposition shows that the integral of an exact form over a
curve depends only on f and the endpoints of the curve.
12.2.9 Proposition. If ϕ : [a, b] → U is piecewise $C^1$, then
\[ \int_\varphi df = f\bigl(\varphi(b)\bigr) - f\bigl(\varphi(a)\bigr). \]
Proof. If ϕ is $C^1$, then, by the chain rule and the fundamental theorem of calculus,
\[ \int_\varphi df = \int_a^b \sum_{j=1}^{n} (\partial_j f)\bigl(\varphi(t)\bigr)\,\varphi_j'(t)\,dt = \int_a^b (f \circ \varphi)'(t)\,dt = f\bigl(\varphi(b)\bigr) - f\bigl(\varphi(a)\bigr). \]
If ϕ is only piecewise $C^1$, subdivide the interval [a, b] into intervals on which ϕ is $C^1$, apply the above result to each subinterval, and sum the results; the intermediate terms cancel.
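A quick numerical check of 12.2.9: the sketch below approximates the integral of df along a curve by the midpoint rule and compares it with f(ϕ(b)) − f(ϕ(a)). The function f(x, y) = x²y + sin y, the curve ϕ(t) = (cos t, t) on [0, 2], and the helper name `integrate_form` are all choices made here for illustration.

```python
import math


def integrate_form(F, phi, dphi, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of F(phi(t)) . phi'(t)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(Fi * di for Fi, di in zip(F(*phi(t)), dphi(t))) * h
    return total


f = lambda x, y: x * x * y + math.sin(y)
grad_f = lambda x, y: (2 * x * y, x * x + math.cos(y))   # coefficients of df

phi = lambda t: (math.cos(t), t)          # a C^1 curve on [0, 2]
dphi = lambda t: (-math.sin(t), 1.0)

lhs = integrate_form(grad_f, phi, dphi, 0.0, 2.0)   # integral of df over phi
rhs = f(*phi(2.0)) - f(*phi(0.0))                   # f(phi(b)) - f(phi(a))
print(lhs, rhs)
```

Replacing ϕ by any other curve with the same endpoints leaves `rhs` unchanged, which is the content of the proposition.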
12.2.10 Theorem. Let U ⊆ Rn be open and connected and let ω be a continuous 1-form on U . The following statements are equivalent:
(a) ω is exact.
(b) $\int_\varphi \omega = 0$ for every closed piecewise $C^1$ curve ϕ in U.
(c) $\int_\phi \omega = \int_\psi \omega$ for every pair of piecewise $C^1$ curves φ, ψ : [a, b] → Rn in U with φ(a) = ψ(a) and φ(b) = ψ(b).
Proof. That (a) implies (b) follows from 12.2.9.
For (b) implies (c), define a closed, piecewise smooth curve ϕ : [a, b + 1] → Rn by
\[ \varphi(t) = \psi(t),\ a \le t \le b, \qquad \varphi(t) = \phi\bigl(b + (b - t)(b - a)\bigr),\ b \le t \le b + 1. \]
(See Figure 12.5.) Then $\varphi|_{[b,b+1]}$ is equivalent to −φ, hence if (b) holds,
\[ 0 = \int_\varphi \omega = \int_\psi \omega - \int_\phi \omega, \]
proving (c).
FIGURE 12.5: ϕ = ψ − φ.
FIGURE 12.6: $\varphi_{x+te_j} = \varphi_x + \psi_t$.
Now assume that (c) holds and let $\omega = \sum_{j=1}^{n} f_j\,dx_j$. To establish (a), we construct a function f on U such that $\partial_j f = f_j$. Choose any point a ∈ U. By Exercise 8.7.8, for each x ∈ U there exists a piecewise $C^1$ curve $\varphi_x$ in U with initial point a and terminal point x. Define $f(x) = \int_{\varphi_x} \omega$. By (c), f(x) is independent of the path and hence is well-defined. Fix j, let t > 0, and denote by $\psi_t$ the line segment $x + ue_j$, 0 ≤ u ≤ t. Then $\psi_t$ lies in U for sufficiently small t > 0, and by continuity of $f_j$,
\[ \frac{1}{t}\bigl[f(x + te_j) - f(x)\bigr] = \frac{1}{t}\int_{\psi_t} \omega = \frac{1}{t}\int_0^t f_j(x + ue_j)\,du \to f_j(x) \]
as $t \to 0^+$. A similar argument works for the case $t \to 0^-$. Therefore, $\partial_j f(x) = f_j(x)$, as required. Since the $f_j$ are continuous, f is $C^1$ on U.
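Criterion (b) gives a practical test for non-exactness. In the sketch below, which uses examples chosen here rather than taken from the text, the form y dx + x dy = d(xy) integrates to zero around the unit circle, while −y dx + x dy integrates to 2π, so by the theorem the latter is not exact on $\mathbb{R}^2$. The helper name `integrate_form` is hypothetical.

```python
import math


def integrate_form(F, phi, dphi, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of F(phi(t)) . phi'(t)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(Fi * di for Fi, di in zip(F(*phi(t)), dphi(t))) * h
    return total


# The unit circle, a closed C^1 curve on [0, 2*pi].
circle = lambda t: (math.cos(t), math.sin(t))
dcircle = lambda t: (-math.sin(t), math.cos(t))

exact_form = lambda x, y: (y, x)        # y dx + x dy = d(xy)
rotation_form = lambda x, y: (-y, x)    # -y dx + x dy

I_exact = integrate_form(exact_form, circle, dcircle, 0.0, 2 * math.pi)
I_rot = integrate_form(rotation_form, circle, dcircle, 0.0, 2 * math.pi)
print(I_exact, I_rot, 2 * math.pi)
```

A single closed curve with nonzero integral is enough to rule out exactness; a zero integral over one curve, by contrast, proves nothing on its own.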
Exercises
1.S Determine which of the following curves are rectifiable.
(a) $\varphi(t) = (t, t^{-p})$, 0 < t ≤ 1, where p > 0.
(b) $\varphi(t) = (e^{-t}, e^{-t^2}, e^{-t^3})$, 0 ≤ t < +∞.
(c) $\varphi(t) = (t^{-1}, e^{-t})$, t ≥ 1.
(d) $\varphi(t) = (t^{-1}, e^{-t})$, 0 < t ≤ 1.
2. Evaluate $\int_\varphi f$ for
(a) $\varphi(t) = (t^3/3, t^4/4)$, 1 ≤ t ≤ 2, f(x, y) = x/y.
(b)S $\varphi(t) = (t, \sin(2t), \cos(2t))$, 0 ≤ t ≤ π/4, f(x, y, z) = xz.
(c) $\varphi(t) = (t, t^2/2, t^3/3)$, 0 ≤ t ≤ 1, f(x, y, z) = x + 6z.
(d) $\varphi(t) = (\sin t, \sqrt{2}\cos t, \sin t)$, 0 ≤ t ≤ π/2, f(x, y, z) = xyz.
3. Set up, but do not evaluate, the integral that gives the circumference of the ellipse $x^2/a^2 + y^2/b^2 = 1$. (Your answer should involve $\sin^2 t$.)
4. In each case below, find a smooth simple curve or a smooth simple closed
curve with trace C. Use the parametrization to find an integral that
gives the length of the curve. (Do not evaluate the integral.)
(a) $C = \{(x, y) : x^3 - 7y^2 = 1,\ 1 < x < 2,\ y > 0\}$.
(b)S $C = \{(x, y) : 9(x - 1)^2 + 4(y - 2)^2 = 36\}$.
(c) $C = \{(x, y) : x^2 - y^2 = 4,\ x > 2,\ 0 < x + y < a\}$.
5. Let ϕ(x) = (x, g(x)), a ≤ x ≤ b, where g is continuously differentiable, and let f(x, y) be continuous on the graph of g. Show that
\[ \int_\varphi f = \int_a^b f\bigl(x, g(x)\bigr)\sqrt{1 + [g'(x)]^2}\,dx. \]
Use this to find
(a) $\int_\varphi f$ if $g(x) = (2/5)x^{5/2}$ and $f(x, y) = x^2$, 0 ≤ x ≤ 1.
(b)S the length of the graph of the equation $x^{2/3} + y^{2/3} = 1$.
(c) the length of the graph of the function $g(x) = \dfrac{x^p}{2p} + \dfrac{x^{2-p}}{2(p - 2)}$, where 0 < a ≤ x ≤ b and p > 2.
6. Prove 12.2.5.
7.S Let a smooth curve ϕ : [a, b] → R2 be described in polar coordinates by
\[ \varphi(t) = \bigl(r(t)\cos\theta(t),\ r(t)\sin\theta(t)\bigr), \quad r(t) \ge 0. \]
Show that
\[ \operatorname{length}(\varphi) = \int_a^b \sqrt{[r(t)\theta'(t)]^2 + [r'(t)]^2}\,dt. \]
8. Let $\vec F = (F_1, F_2, F_3)$ be a force field in $\mathbb{R}^3$ that moves a particle of mass m along a smooth curve ϕ : [a, b] → R3. The kinetic energy of the particle at time t is defined as $\tfrac12 m\|\varphi'(t)\|^2$. Use Newton’s second law $\vec F = m\varphi''$ to show that the work done by the force in moving the particle from ϕ(a) to ϕ(b) is the change in kinetic energy
\[ \tfrac12 m\|\varphi'(b)\|^2 - \tfrac12 m\|\varphi'(a)\|^2. \]
9. A force field $\vec F$ in $\mathbb{R}^3$ is said to be conservative if there exists a function P(x, y, z) such that $\vec F = -\nabla P$. P(x, y, z) is called the potential energy of an object at the point (x, y, z).
(a)S Show that the work done by a conservative force in moving the object along a curve ϕ from ϕ(a) to ϕ(b) is
\[ P\bigl(\varphi(a)\bigr) - P\bigl(\varphi(b)\bigr). \]
(b) Deduce the Law of Conservation of Energy
\[ P\bigl(\varphi(b)\bigr) + \tfrac12 m\|\varphi'(b)\|^2 = P\bigl(\varphi(a)\bigr) + \tfrac12 m\|\varphi'(a)\|^2, \]
that is, the sum of the potential and kinetic energies is constant.
(c) Find a potential function for the gravitational force field
\[ \vec F(x) = -mMG\,\|x\|^{-3}x, \]
where M is the mass of the earth (concentrated at the origin, the center of the earth), m is the mass of the particle at point x, and G is the gravitational constant.
10. For a smooth curve ϕ : [a, b] → Rn, define the arc length function s = s(t) by
\[ s(t) = \int_a^t \|\varphi'(\tau)\|\,d\tau, \quad a \le t \le b. \]
Show that s has a smooth inverse t = t(s), 0 ≤ s ≤ ℓ := length(ϕ). The curve ψ(s) = ϕ(t(s)) is called a reparametrization of ϕ by arc length. Show that, for a continuous vector field $\vec F$ on trace(ϕ) = trace(ψ),
\[ \int_\varphi \vec F \cdot \vec T_\varphi = \int_0^\ell \vec F\bigl(\psi(s)\bigr) \cdot \psi'(s)\,ds. \]
11. Let $P = \{a = t_0 < t_1 < \cdots < t_k = b\}$ be a partition of [a, b]. For f : [a, b] → R, define
\[ V_P(f) = \sum_{j=1}^{k} |f(t_j) - f(t_{j-1})|. \]
Then f is said to have bounded variation on the interval [a, b] if $\sup_P V_P(f) < +\infty$. (Section 5.9.) Show that a curve $\varphi = (\varphi_1, \ldots, \varphi_n) : [a, b] \to \mathbb{R}^n$ is rectifiable iff each component function $\varphi_i$ has bounded variation on [a, b].
12.3 Parameterized Surfaces
12.3.1 Definition. Let 1 ≤ m ≤ n. A smooth parameterized m-surface in $\mathbb{R}^n$ is a $C^1$ function $\varphi = (\varphi_1, \ldots, \varphi_n) : U \to \mathbb{R}^n$, where $U \subseteq \mathbb{R}^m$ is open and the derivative $\varphi'(u)$ has rank m at each point u ∈ U. A reparametrization of ϕ is a smooth parameterized m-surface $\psi = \varphi \circ \alpha : V \to \mathbb{R}^n$, where $V \subseteq \mathbb{R}^m$ is open and α : V → U is $C^1$ with $C^1$ inverse $\alpha^{-1} : U \to V$ such that $J_\alpha > 0$ on V. In this case, ϕ and ψ are said to be equivalent.
♦
We shall usually drop the qualifier “smooth” when referring to parameterized surfaces. Note that the parameter set U is itself a parameterized m-surface in $\mathbb{R}^m$: here, we take ϕ to be the identity map ι : U → U.
Tangent Spaces of a Parameterized Surface
Let ϕ : U → Rn be a parameterized m-surface and u ∈ U. For small |t| the line segment $u + te_j$ is contained in U and is mapped by ϕ onto a curve in S := ϕ(U) with tangent vector
\[ \frac{d}{dt}\Big|_{t=0} \varphi(u + te_j) = d\varphi_u(e_j) = \Bigl(\frac{\partial\varphi_1}{\partial u_j}(u), \ldots, \frac{\partial\varphi_n}{\partial u_j}(u)\Bigr) =: \partial_j\varphi(u), \]
where $e_1, \ldots, e_m$ are the standard basis vectors in $\mathbb{R}^m$. Note that $\partial_j\varphi(u)$ is just the jth column of $\varphi'(u)$. Since $\varphi'(u)$ has rank m, the vectors $d\varphi_u(e_j)$ are linearly independent and hence form a basis for an m-dimensional subspace $T_{\varphi(u)}$ of $\mathbb{R}^n$, called the tangent space of ϕ at u. Thus $d\varphi_u$ is a linear isomorphism from $\mathbb{R}^m$ onto $T_{\varphi(u)}$ mapping the frame $(e_1, \ldots, e_m)$ onto the frame $(\partial_1\varphi(u), \ldots, \partial_m\varphi(u))$. (A frame in a finite dimensional vector space is simply an ordered basis; see Appendix B.) Note that ϕ is not assumed to be one-to-one, and ϕ(u) = ϕ(v) does not necessarily imply that $T_{\varphi(u)} = T_{\varphi(v)}$. (See Figure 12.7.)
FIGURE 12.7: Tangent spaces at p = ϕ(u) = ϕ(v).
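The tangent vectors ∂_jϕ(u) can be approximated directly from their definition as derivatives along the coordinate segments u + te_j. The sketch below is an illustration with choices made here: the surface ϕ(u, v) = (u, v, u² + v²), a central difference quotient, and the hypothetical helper names `partial` and `cross`. For a 2-surface in $\mathbb{R}^3$, the rank condition amounts to the cross product of the two tangent vectors being nonzero.

```python
def partial(phi, u, j, h=1e-6):
    """Central-difference approximation of d_j phi(u) = d/dt phi(u + t e_j) at t = 0."""
    up = list(u)
    um = list(u)
    up[j] += h
    um[j] -= h
    return tuple((p - q) / (2 * h) for p, q in zip(phi(*up), phi(*um)))


def cross(a, b):
    """Cross product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


phi = lambda u, v: (u, v, u * u + v * v)   # a parameterized 2-surface in R^3

u = (1.0, -0.5)
d1 = partial(phi, u, 0)    # approximates (1, 0, 2u) at the chosen point
d2 = partial(phi, u, 1)    # approximates (0, 1, 2v) at the chosen point
normal = cross(d1, d2)     # nonzero exactly when d1, d2 are linearly independent
print(d1, d2, normal)
```

The two approximate vectors span (up to rounding) the tangent space of ϕ at the chosen parameter point.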
Orientation of a Parameterized m-Surface
Tangent spaces may be used to assign an orientation to a parameterized
m-surface, a notion that will be needed later to construct the integral of a
differential form on a surface. First, we define orientation for the space Rm .
Two frames $(v^1, \ldots, v^m)$ and $(w^1, \ldots, w^m)$ in Rm are said to be orientation
equivalent if the determinants of the matrices
\[
[\,v^1 \ \cdots \ v^m\,] \quad\text{and}\quad [\,w^1 \ \cdots \ w^m\,]
\]
(where $v^j$ and $w^j$ are written as column vectors) have the same sign. Orientation
equivalence is easily seen to be an equivalence relation. The collection of frames
of Rm is therefore partitioned into two classes, one that contains (e1 , . . . , em )
and the other containing (−e1 , e2 , . . . , em ). An orientation is assigned to Rm by
designating one of these equivalence classes to be positive and the other negative.
Any frame in the former class is then said to have positive orientation, while
a frame in the latter class is said to have negative orientation. For example,
if m = 3 and (v 1 , v 2 , v 3 ) has positive orientation, then so does (v 2 , v 3 , v 1 ),
while (v 2 , v 1 , v 3 ) has negative orientation. By convention, the standard or
positive orientation of Rm is the orientation obtained by designating the frame
(e1 , . . . , em ) to be positive. For example, in the standard orientation, the sign
of the frame (em , e1 , . . . , em−1 ) is $(-1)^{m-1}$. We shall always assume that the
spaces Rm have the standard orientation.
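The sign of a frame can be checked mechanically by computing a determinant. The following Python sketch is only an illustration: the helper names `det`, `frame_sign`, and `cyclic_frame` are assumptions introduced here, not notation from the text. It uses exact rational arithmetic so that the sign is not affected by rounding.

```python
from fractions import Fraction


def det(rows):
    """Determinant via Gaussian elimination in exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def frame_sign(frame):
    """Sign of a frame (a list of m vectors in R^m) in the standard orientation."""
    # A matrix and its transpose have the same determinant, so the vectors
    # may be used as rows instead of columns.
    d = det(frame)
    if d == 0:
        raise ValueError("vectors are linearly dependent, so this is not a frame")
    return 1 if d > 0 else -1


def cyclic_frame(m):
    """The frame (e_m, e_1, ..., e_{m-1}) of R^m."""
    e = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    return [e[-1]] + e[:-1]
```

For m = 3, a cyclic shift of a frame keeps its sign and a transposition of two vectors reverses it, in agreement with the example above.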
A parameterized m-surface ϕ : U → Rn is said to be orientable if, whenever
ϕ(u) = ϕ(v),
• Tϕ(u) = Tϕ(v) and
• the matrix of the linear transformation
\[
T_{uv} = (d\varphi_v)^{-1} \circ d\varphi_u : \mathbb{R}^m \to \mathbb{R}^m \tag{12.4}
\]
has positive determinant.
Frames $(\xi^1, \ldots, \xi^m)$ and $(\zeta^1, \ldots, \zeta^m)$ in Tϕ(u) are then declared to be orientation equivalent if the frames
\[
d\varphi_u^{-1}(\xi^1, \ldots, \xi^m) := \big(d\varphi_u^{-1}(\xi^1), \ldots, d\varphi_u^{-1}(\xi^m)\big)
\]
and
\[
d\varphi_u^{-1}(\zeta^1, \ldots, \zeta^m) := \big(d\varphi_u^{-1}(\zeta^1), \ldots, d\varphi_u^{-1}(\zeta^m)\big)
\]
are orientation equivalent in Rm . Since
\[
T_{uv} \circ (d\varphi_u)^{-1}(\xi^1, \ldots, \xi^m) = (d\varphi_v)^{-1}(\xi^1, \ldots, \xi^m)
\]
and det Tuv > 0, the frames $d\varphi_u^{-1}(\xi^1, \ldots, \xi^m)$ and $d\varphi_v^{-1}(\xi^1, \ldots, \xi^m)$ have the
same sign, hence the notion of orientation equivalence in the common tangent
space Tϕ(u) = Tϕ(v) is well-defined. As with the vector space Rm , orientation
equivalence on Tϕ(u) is an equivalence relation with two equivalence classes,
one containing dϕu (e1 , . . . , em ), the other containing dϕu (−e1 , e2 , . . . , em ). The
positive (negative) orientation of ϕ is obtained by designating the equivalence
class containing dϕu (e1 , . . . , em ) to be positive (negative) for every u ∈ U .
We define the sign of ϕ by
\[
\operatorname{sign}(\varphi) =
\begin{cases}
+1 & \text{if } \varphi \text{ is positively oriented},\\
-1 & \text{if } \varphi \text{ is negatively oriented}.
\end{cases}
\]
Obviously, if ϕ is one-to-one, then it is orientable. For example, a simple
smooth curve ϕ : I → Rn is orientable, and since
\[
d\varphi_t(e_1) = \varphi'(t) = \lim_{\Delta t \to 0^+} \frac{\varphi(t + \Delta t) - \varphi(t)}{\Delta t},
\]
the positive orientation is the one for which the tangent vector dϕt (e1 ) is in
the direction of increasing t. By contrast, the curve in Figure 12.7 is not
orientable.
12.3.2 Example. Let a1 , . . . , am be linearly independent vectors in Rn and
let b ∈ Rn . Define the m-dimensional parameterized affine space ϕ : Rm → Rn by
\[
\varphi(u) = \varphi(u_1, \ldots, u_m) = b + \sum_{i=1}^{m} u_i a_i .
\]
FIGURE 12.8: Affine space.
Since ϕ is one-to-one, it is orientable. Since ∂i ϕ = ai , the tangent space at
each point is the subspace of Rn with frame (a1 , . . . , am ).
♦
12.3.3 Example. The Cartesian product of circles
\[
\varphi(\theta_1, \ldots, \theta_m) = (r_1\cos\theta_1,\ r_1\sin\theta_1,\ \ldots,\ r_m\cos\theta_m,\ r_m\sin\theta_m), \quad r_i > 0,
\]
is a parameterized m-surface in R2m . Orientability follows from the periodicity
of the sine and cosine functions.
♦
Orientation of a Parameterized (n − 1)-Surface
For m = n−1, the notion of orientability may be formulated more concretely
in terms of a normal vector field.
12.3.4 Lemma. Let ϕ : U → Rn be a parameterized (n − 1)-surface. Define
∂ϕ⊥ : U → Rn by
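\[
\partial\varphi^\perp := \sum_{i=1}^{n} (-1)^{i+n}\, \frac{\partial(\varphi_1, \ldots, \widehat{\varphi_i}, \ldots, \varphi_n)}{\partial(u_1, \ldots, u_{n-1})}\, e^i,
\]
where the hat indicates that ϕi is omitted in the calculation, and let
\[
A := \begin{bmatrix} \partial_1\varphi(u) \\ \vdots \\ \partial_{n-1}\varphi(u) \\ \partial\varphi^\perp(u) \end{bmatrix}_{n \times n} .
\]
Then ∂ϕ⊥ (u) is perpendicular to the tangent space Tϕ(u) , and
\[
|A| = \|\partial\varphi^\perp(u)\|^2 = \det\big(\varphi'(u)^t \varphi'(u)\big) > 0. \tag{12.5}
\]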
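Proof. Let m = n − 1. For each j, the determinant
\[
D_j(u) := \begin{vmatrix}
\partial_j\varphi_1(u) & \cdots & \partial_j\varphi_n(u) \\
\partial_1\varphi_1(u) & \cdots & \partial_1\varphi_n(u) \\
\vdots & & \vdots \\
\partial_m\varphi_1(u) & \cdots & \partial_m\varphi_n(u)
\end{vmatrix}
\]
has two identical rows and hence is zero. Expanding Dj (u) along the first row
and multiplying by $(-1)^m$ yields
\[
0 = (-1)^m D_j(u) = (-1)^m \sum_{i=1}^{n} (-1)^{i+1}\, \partial_j\varphi_i(u)\, \frac{\partial(\varphi_1, \ldots, \widehat{\varphi_i}, \ldots, \varphi_n)}{\partial(u_1, \ldots, u_{n-1})}(u) = \partial_j\varphi(u) \cdot \partial\varphi^\perp(u).
\]
Therefore, ∂j ϕ(u) · ∂ϕ⊥ (u) = 0 for each j, so ∂ϕ⊥ (u) is perpendicular to Tϕ(u) .
To prove the first equality in (12.5), expand |A| along the last row to obtain
\[
|A| = \sum_{i=1}^{n} \bigg( \frac{\partial(\varphi_1, \ldots, \widehat{\varphi_i}, \ldots, \varphi_n)}{\partial(u_1, \ldots, u_m)}(u) \bigg)^{2} = \|\partial\varphi^\perp(u)\|^2 > 0,
\]
the inequality being strict because ϕ′(u) has rank m.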
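For the second equality in (12.5), using what has already been established
we calculate
\[
\|\partial\varphi^\perp(u)\|^4 = |A|^2 = |AA^t| =
\begin{vmatrix}
\partial_1\varphi(u)\cdot\partial_1\varphi(u) & \cdots & \partial_1\varphi(u)\cdot\partial_m\varphi(u) & 0 \\
\vdots & & \vdots & \vdots \\
\partial_m\varphi(u)\cdot\partial_1\varphi(u) & \cdots & \partial_m\varphi(u)\cdot\partial_m\varphi(u) & 0 \\
0 & \cdots & 0 & \|\partial\varphi^\perp(u)\|^2
\end{vmatrix}
= \|\partial\varphi^\perp(u)\|^2 \det\big(\varphi'(u)^t \varphi'(u)\big).
\]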
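The identities in Lemma 12.3.4 are easy to test on a concrete Jacobian matrix. The sketch below is an illustration under stated assumptions: the function names are invented here, and the sample matrix `JAC` is the Jacobian at u = (1, 2, 3) of the hypothetical map ϕ(u) = (u1, u2, u3, u1 u2 + u3²), chosen only because its entries are integers.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def tangent_vectors(jac):
    """The columns of the n x (n-1) Jacobian, i.e. the vectors d_j(phi)(u)."""
    return [list(col) for col in zip(*jac)]


def normal_field(jac):
    """The vector of signed minors from Lemma 12.3.4 (rows of jac are the components of phi)."""
    n = len(jac)
    # i is 0-based here, so the index used in the text is i + 1.
    return [
        (-1) ** ((i + 1) + n) * det([row for k, row in enumerate(jac) if k != i])
        for i in range(n)
    ]


def gram_det(jac):
    """det(phi'(u)^t phi'(u)), computed from the dot products of the columns."""
    cols = tangent_vectors(jac)
    return det([[dot(c1, c2) for c2 in cols] for c1 in cols])


# Assumed sample: Jacobian of (u1, u2, u3, u1*u2 + u3**2) at u = (1, 2, 3).
JAC = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [2, 1, 6]]
NORMAL = normal_field(JAC)
```

With integer entries all three quantities in (12.5) can be compared exactly: the determinant of the matrix whose rows are the tangent vectors followed by `NORMAL`, the squared length of `NORMAL`, and `gram_det(JAC)`.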
12.3.5 Corollary. The frame (dϕu (e1 ), . . . , dϕu (en−1 ), ∂ϕ⊥ (u)) is positively
oriented in Rn .
12.3.6 Theorem. Let ϕ : U → Rn be a parameterized (n − 1)-surface. The
following statements are equivalent:
(a) ϕ is orientable.
(b) ϕ(u) = ϕ(v) ⇒ ∂ϕ⊥ (u) = c∂ϕ⊥ (v) for some c > 0.
(c) There exists a function $\vec N_\varphi : \varphi(U) \to \mathbb{R}^n$ (necessarily unique) such that
\[
\vec N_\varphi\big(\varphi(u)\big) = \|\partial\varphi^\perp(u)\|^{-1}\, \partial\varphi^\perp(u)
= \frac{1}{\sqrt{\det\big(\varphi'(u)^t \varphi'(u)\big)}} \sum_{i=1}^{n} (-1)^{i+n}\, \frac{\partial(\varphi_1, \ldots, \widehat{\varphi_i}, \ldots, \varphi_n)}{\partial(u_1, \ldots, u_{n-1})}\, e^i . \tag{12.6}
\]
Proof. For u ∈ U , let Tu : Rn → Rn denote the unique linear isomorphism
such that
\[
T_u(e_j) = d\varphi_u(e_j),\ 1 \le j \le n-1, \quad\text{and}\quad T_u(e_n) = \partial\varphi^\perp(u).
\]
By Lemma 12.3.4, $\det T_u = \|\partial\varphi^\perp(u)\|^2 > 0$. Suppose that ϕ(u) = ϕ(v). Since
∂ϕ⊥ (u) ⊥ Tϕ(u) and ∂ϕ⊥ (v) ⊥ Tϕ(v) ,
\[
T_{\varphi(v)} = T_{\varphi(u)} \quad\text{iff}\quad \partial\varphi^\perp(u) = c\,\partial\varphi^\perp(v) \ \text{for some } c \ne 0.
\]
In this case, by (12.4),
\[
(T_v^{-1} T_u)(e_j) = (d\varphi_v)^{-1}\big(d\varphi_u(e_j)\big) = T_{uv}(e_j),\ 1 \le j \le n-1, \quad\text{and}\quad
(T_v^{-1} T_u)(e_n) = T_v^{-1}\big(c\,\partial\varphi^\perp(v)\big) = c\,e_n .
\]
Thus the matrix of $T_v^{-1} T_u$ has columns Tuv (e1 ), . . ., Tuv (en−1 ), cen . It follows
that
\[
0 < \det T_u / \det T_v = \det(T_v^{-1} T_u) = c \det T_{uv} . \tag{12.7}
\]
With these preliminaries out of the way, assume that $\vec N_\varphi$ exists and let
ϕ(u) = ϕ(v). Then
\[
\|\partial\varphi^\perp(u)\|^{-1}\partial\varphi^\perp(u) = \vec N_\varphi\big(\varphi(u)\big) = \vec N_\varphi\big(\varphi(v)\big) = \|\partial\varphi^\perp(v)\|^{-1}\partial\varphi^\perp(v),
\]
hence ∂ϕ⊥ (u) = c∂ϕ⊥ (v) for some c > 0. By the first paragraph, Tϕ(u) = Tϕ(v)
and det Tuv > 0. Therefore, ϕ is orientable.
Conversely, assume that ϕ is orientable and let ϕ(u) = ϕ(v). Then Tϕ(u) =
Tϕ(v) , hence ∂ϕ⊥ (u) = c∂ϕ⊥ (v) for some c ≠ 0. Since det Tuv > 0, c > 0 by
(12.7). Therefore,
\[
\|\partial\varphi^\perp(u)\|^{-1}\partial\varphi^\perp(u) = \|\partial\varphi^\perp(v)\|^{-1}\partial\varphi^\perp(v),
\]
so $\vec N_\varphi$ may be unambiguously defined by (12.6).
12.3.7 Special Cases.
(a) n = 2: Then $\partial\varphi^\perp = (-\varphi_2', \varphi_1')$, the inward normal. (Figure 12.9.)
FIGURE 12.9: The inward unit normal.
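(b) n = 3: Then
\[
\partial\varphi^\perp = \left(
\begin{vmatrix} \partial_1\varphi_2 & \partial_1\varphi_3 \\ \partial_2\varphi_2 & \partial_2\varphi_3 \end{vmatrix},\
-\begin{vmatrix} \partial_1\varphi_1 & \partial_1\varphi_3 \\ \partial_2\varphi_1 & \partial_2\varphi_3 \end{vmatrix},\
\begin{vmatrix} \partial_1\varphi_1 & \partial_1\varphi_2 \\ \partial_2\varphi_1 & \partial_2\varphi_2 \end{vmatrix}
\right) = \partial_1\varphi \times \partial_2\varphi,
\]
the familiar cross product of ∂1 ϕ and ∂2 ϕ.
FIGURE 12.10: Normal vector to S at p.
Thus the positively oriented frame $(d\varphi_u(e_1), d\varphi_u(e_2), \vec N_\varphi(p))$ is a right-handed
system, as shown in Figure 12.10.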
(c) Let U ⊆ Rn−1 be open and let g : U → R be C 1 . Define
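\[
\varphi(u_1, \ldots, u_{n-1}) = \big(u_1, \ldots, u_{n-1},\ g(u_1, \ldots, u_{n-1})\big).
\]
Then ϕ(U ) is the graph of g. Since ϕ is one-to-one, it is orientable. Also,
\[
\partial_j\varphi = (0, \ldots, 0, \underset{j}{1}, 0, \ldots, 0, \partial_j g) \ \perp\ (-\partial_1 g, \ldots, -\partial_j g, \ldots, -\partial_{n-1} g, 1)
\]
and, by elementary row operations,
\[
\begin{vmatrix}
1 & \cdots & 0 & \partial_1 g \\
0 & \cdots & 0 & \partial_2 g \\
\vdots & & \vdots & \vdots \\
0 & \cdots & 1 & \partial_{n-1} g \\
-\partial_1 g & \cdots & -\partial_{n-1} g & 1
\end{vmatrix}
= (\partial_1 g)^2 + \cdots + (\partial_{n-1} g)^2 + 1 .
\]
Since this is positive, by uniqueness,
\[
\vec N_\varphi \circ \varphi = \frac{(-\partial_1 g, \ldots, -\partial_{n-1} g, 1)}{\sqrt{(\partial_1 g)^2 + \cdots + (\partial_{n-1} g)^2 + 1}} = \frac{(-\nabla g, 1)}{\sqrt{\|\nabla g\|^2 + 1}} .
\]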
♦
12.3.8 Example. Let r > 0 and define
ϕ(θ1 , θ2 ) = (r sin θ1 cos θ2 , r sin θ1 sin θ2 , r cos θ1 ), θ1 ∈ (0, π), θ2 ∈ (0, 2π).
The image of ϕ is the sphere in R3 with radius r and center (0, 0, 0), with
the great semicircle (r sin θ1 , 0, r cos θ1 ), 0 ≤ θ1 ≤ π (that is, θ2 = 0), joining the poles (0, 0, ±r)
missing. Since
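\[
\partial_1\varphi(\theta_1, \theta_2) = r(\cos\theta_1\cos\theta_2,\ \cos\theta_1\sin\theta_2,\ -\sin\theta_1) \quad\text{and}\quad
\partial_2\varphi(\theta_1, \theta_2) = r(-\sin\theta_1\sin\theta_2,\ \sin\theta_1\cos\theta_2,\ 0),
\]
by 12.3.7
\[
\partial\varphi^\perp(\theta_1, \theta_2) = \partial_1\varphi(\theta_1, \theta_2) \times \partial_2\varphi(\theta_1, \theta_2)
= r\sin\theta_1\,(r\sin\theta_1\cos\theta_2,\ r\sin\theta_1\sin\theta_2,\ r\cos\theta_1)
= (r\sin\theta_1)\,\varphi(\theta_1, \theta_2).
\]
Therefore,
\[
(\vec N_\varphi \circ \varphi)(\theta_1, \theta_2) = \frac{\varphi(\theta_1, \theta_2)}{\|\varphi(\theta_1, \theta_2)\|} = r^{-1}\varphi(\theta_1, \theta_2),
\]
that is,
\[
\vec N_\varphi(p) = \frac{p}{\|p\|}, \quad p \in S.
\]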
♦
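The computation in Example 12.3.8 can be spot-checked numerically. The following Python sketch is only a check under assumptions: the function names are chosen here for illustration, and the partial derivatives are the ones stated in the example, evaluated in floating point.

```python
import math


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def phi(r, t1, t2):
    """Spherical parametrization with polar angle t1 and azimuth t2."""
    return (
        r * math.sin(t1) * math.cos(t2),
        r * math.sin(t1) * math.sin(t2),
        r * math.cos(t1),
    )


def d1_phi(r, t1, t2):
    return (
        r * math.cos(t1) * math.cos(t2),
        r * math.cos(t1) * math.sin(t2),
        -r * math.sin(t1),
    )


def d2_phi(r, t1, t2):
    return (
        -r * math.sin(t1) * math.sin(t2),
        r * math.sin(t1) * math.cos(t2),
        0.0,
    )


def unit_normal(r, t1, t2):
    """Normalized cross product of the two tangent vectors."""
    n = cross(d1_phi(r, t1, t2), d2_phi(r, t1, t2))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)
```

At any parameter point with 0 < t1 < π, `unit_normal(r, t1, t2)` should agree with `phi(r, t1, t2)` divided by r, up to rounding.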
FIGURE 12.11: Surface of revolution.
12.3.9 Example. Let I be an open interval and ψ : I → R2 a smooth curve
with ψ2 (t) > 0 for t ∈ I. The parameterized surface of revolution in R3 is
defined by
\[
\varphi(t, \theta) = \big(\psi_1(t),\ \psi_2(t)\cos\theta,\ \psi_2(t)\sin\theta\big), \quad t \in I,\ \theta \in \mathbb{R}.
\]
From 12.3.7 and the calculations
\[
\partial_1\varphi(t, \theta) = \big(\psi_1'(t),\ \psi_2'(t)\cos\theta,\ \psi_2'(t)\sin\theta\big), \qquad
\partial_2\varphi(t, \theta) = \big(0,\ -\psi_2(t)\sin\theta,\ \psi_2(t)\cos\theta\big),
\]
we have
\[
\partial\varphi^\perp(t, \theta) = \psi_2(t)\big(\psi_2'(t),\ -\psi_1'(t)\cos\theta,\ -\psi_1'(t)\sin\theta\big) \tag{12.8}
\]
and
\[
\partial\psi^\perp(t) = \big(-\psi_2'(t),\ \psi_1'(t)\big).
\]
Now suppose that ψ is orientable. We claim that ϕ is then orientable. To
see this, suppose that ϕ(t1 , θ1 ) = ϕ(t2 , θ2 ). Then ψ1 (t1 ) = ψ1 (t2 ), and because
ψ2 (t) > 0, ψ2 (t1 ) = ψ2 (t2 ) and hence θ2 = θ1 + 2kπ. By orientability of ψ,
\[
\big(-\psi_2'(t_2),\ \psi_1'(t_2)\big) = \partial\psi^\perp(t_2) = c\,\partial\psi^\perp(t_1) = c\big(-\psi_2'(t_1),\ \psi_1'(t_1)\big)
\]
for some c > 0. It follows from (12.8) that
\[
\partial\varphi^\perp(t_2, \theta_2) = \psi_2(t_2)\big(\psi_2'(t_2),\ -\psi_1'(t_2)\cos\theta_2,\ -\psi_1'(t_2)\sin\theta_2\big)
= c\,\psi_2(t_1)\big(\psi_2'(t_1),\ -\psi_1'(t_1)\cos\theta_1,\ -\psi_1'(t_1)\sin\theta_1\big)
= c\,\partial\varphi^\perp(t_1, \theta_1),
\]
which shows that ϕ is orientable. Moreover, from (12.8),
\[
\|\partial\varphi^\perp(t, \theta)\| = \psi_2(t)\,\|\psi'(t)\|,
\]
hence
\[
\vec N_\varphi\big(\varphi(t, \theta)\big) = \frac{\partial\varphi^\perp(t, \theta)}{\|\partial\varphi^\perp(t, \theta)\|} = \|\psi'(t)\|^{-1}\big(\psi_2'(t),\ -\psi_1'(t)\cos\theta,\ -\psi_1'(t)\sin\theta\big),
\]
which is the rotation of the unit normal vector $-\vec N_\psi$ about the x1 axis.
For the special case ψ(x) = (x, f (x)),
\[
\varphi(x, \theta) = \big(x,\ f(x)\cos\theta,\ f(x)\sin\theta\big)
\]
and
\[
\vec N_\varphi\big(\varphi(x, \theta)\big) = \big([f'(x)]^2 + 1\big)^{-1/2}\big(f'(x),\ -\cos\theta,\ -\sin\theta\big).
\]
A point (x, y, z) on the surface S = ϕ(U ) and not on the graph of f may be
written uniquely as
\[
\big(x,\ f(x)\cos\theta(y, z),\ f(x)\sin\theta(y, z)\big),
\]
where 0 < θ(y, z) < 2π is the (continuous) argument of (y, z) determined by
θ0 = 0 (see 9.4.6). Therefore,
\[
\vec N_\varphi(x, y, z) = \big([f'(x)]^2 + 1\big)^{-1/2}\big(f'(x),\ -\cos\theta(y, z),\ -\sin\theta(y, z)\big),
\]
which is continuous on S by the periodicity of sine and cosine.
♦
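Formula (12.8) and the norm identity that follows it can be compared against a direct cross product. The sketch below is illustrative only: it assumes the sample profile curve ψ(t) = (t, 1 + t²), which is not taken from the text, and the function names are invented here.

```python
import math


def psi(t):
    """Assumed sample profile curve; its second component is always positive."""
    return (t, 1.0 + t * t)


def dpsi(t):
    """Derivative of the sample profile curve."""
    return (1.0, 2.0 * t)


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def d1_phi(t, theta):
    p1, p2 = dpsi(t)
    return (p1, p2 * math.cos(theta), p2 * math.sin(theta))


def d2_phi(t, theta):
    s2 = psi(t)[1]
    return (0.0, -s2 * math.sin(theta), s2 * math.cos(theta))


def normal_from_12_8(t, theta):
    """Right-hand side of (12.8) for the sample profile curve."""
    p1, p2 = dpsi(t)
    s2 = psi(t)[1]
    return (s2 * p2, -s2 * p1 * math.cos(theta), -s2 * p1 * math.sin(theta))


def norm(v):
    return math.sqrt(sum(c * c for c in v))
```

For any t and θ, `cross(d1_phi(t, theta), d2_phi(t, theta))` should match `normal_from_12_8(t, theta)`, and its length should equal ψ2(t)‖ψ′(t)‖.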
12.3.10 Example. The parameterized Möbius strip is defined by
\[
\varphi(t, \theta) = \Big(\big(2 + t\cos\tfrac12\theta\big)\cos\theta,\ \big(2 + t\cos\tfrac12\theta\big)\sin\theta,\ t\sin\tfrac12\theta\Big),
\]
where −1 < t < 1 and θ ∈ R. The surface may be concretely realized by taking
one end of a long strip of paper, giving it a half-twist, and gluing it to the
other end.
FIGURE 12.12: Möbius strip.
The Möbius strip is not orientable. Indeed, ϕ(0, 0) = ϕ(0, 2π), but since
\[
\partial_1\varphi(0, 0) = -\partial_1\varphi(0, 2\pi) = (1, 0, 0) \quad\text{and}\quad \partial_2\varphi(0, 0) = \partial_2\varphi(0, 2\pi) = (0, 2, 0),
\]
we see that
\[
\partial_1\varphi(0, 0) \times \partial_2\varphi(0, 0) = (0, 0, 2) = -\partial_1\varphi(0, 2\pi) \times \partial_2\varphi(0, 2\pi).
\]
Therefore, $\vec N_\varphi$ cannot exist.
♦
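The sign flip in Example 12.3.10 can be observed numerically. The sketch below is a rough check rather than a proof: it approximates the partial derivatives by central differences (an assumption made here for simplicity, with an arbitrarily chosen step size) and compares the unit normals at the two parameter points (0, 0) and (0, 2π), which map to the same point of the strip.

```python
import math


def phi(t, theta):
    """The Möbius strip parametrization of Example 12.3.10."""
    radius = 2.0 + t * math.cos(theta / 2)
    return (radius * math.cos(theta), radius * math.sin(theta), t * math.sin(theta / 2))


def partial(t, theta, index, h=1e-6):
    """Central-difference estimate of the partial derivative of phi in variable 1 (t) or 2 (theta)."""
    if index == 1:
        hi, lo = phi(t + h, theta), phi(t - h, theta)
    else:
        hi, lo = phi(t, theta + h), phi(t, theta - h)
    return tuple((a - b) / (2 * h) for a, b in zip(hi, lo))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def unit_normal(t, theta):
    """Normalized cross product of the two approximate tangent vectors."""
    n = cross(partial(t, theta, 1), partial(t, theta, 2))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)
```

Evaluating `unit_normal(0.0, 0.0)` and `unit_normal(0.0, 2 * math.pi)` gives two opposite unit vectors along the x3 axis, so no single normal can be assigned at the common image point.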
Exercises
1. Assuming that R3 has the standard orientation, find the sign of the
frames
(a)S (e1 + e2 , e2 + e3 , e3 + e1 ).
(b) (−e1 + e2 + e3 , e1 − e2 + e3 , e1 + e2 − e3 ).
2. Show that the frames
(e1 + e2 + e3 , 2e1 + e2 + 3e3 ) and (e1 + 3e2 − e3 , e1 + 4e2 − 2e3 )
in R3 span the same subspace but have opposite orientations.
3. Let ϕ : U → Rn be a parameterized m-surface and ψ = ϕ ◦ α : V → Rn
a reparametrization of ϕ. Show that ϕ is orientable iff ψ is orientable.
4. Let ϕ : U → Rn be an orientable parameterized (n − 1)-surface and let
ψ = ϕ ◦ α : V → Rn be a reparametrization of ϕ. Find ∂ψ ⊥ in terms of
∂ϕ⊥ . Use the result to show that Nψ = Nϕ on S := ϕ(U ) = ψ(V ).
5.S Use 12.3.9 to find $\vec N_\varphi(x, y, z)$ for the torus
\[
\varphi(\phi, \theta) = \big(a\cos\phi,\ (b + a\sin\phi)\cos\theta,\ (b + a\sin\phi)\sin\theta\big), \quad 0 < \theta, \phi < 2\pi,
\]
where 0 < a < b.
6. Find $\vec N_\varphi(x, y, z)$ for the following orientable 2-surfaces in R3 :
(a)S ϕ(t, θ) = (t cos θ, t sin θ, t), t > 0, θ ∈ R.
(b) ϕ(t, θ) = (sinh t, cosh t cos θ, cosh t sin θ), t, θ ∈ R (hyperboloid of
one sheet).
(c) ϕ(t, θ) = (cosh t, sinh t cos θ, sinh t sin θ), t, θ ∈ R (one sheet of a
hyperboloid of two sheets).
(d) ϕ(t, θ) = (t cos θ, t sin θ, θ), t > 0, θ ∈ R (helicoid).
(e) ϕ(t, θ) = (t cos θ, t sin θ, θ2 ), t > 0, θ > 0.
(f)S ϕ(t, s) = (1 − s)(a cos t, a sin t, 0) + s(b cos t, b sin t, 1), 0 < s < 1,
where 0 < a < b.
7.S Let V ⊆ Rn−2 be open and let ψ : V → Rn−1 be an (n−2)-parameterized
surface in Rn−1 . Define the cylinder ϕ over ψ by
\[
\varphi(v, s) = \big(\psi(v), s\big), \quad v \in V,\ s \in (a, b).
\]
Show that
(a) ϕ is a parameterized (n − 1)-surface in Rn .
(b) $\partial\varphi^\perp(u_1, \ldots, u_{n-1}) = \big(\partial\psi^\perp(u_1, \ldots, u_{n-2}),\ 0\big)$.
(c) ϕ is orientable iff ψ is orientable, in which case
\[
\vec N_\varphi(x_1, \ldots, x_n) = \big(\vec N_\psi(x_1, \ldots, x_{n-1}),\ 0\big).
\]
8. Let V ⊆ Rn−2 be open and let ψ : V → Rn−1 be an (n−2)-parameterized
surface in Rn−1 . Define the cone over ψ by
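\[
\varphi(v, s) = \big((1 - s)\psi(v),\ s\big), \quad v \in V,\ 0 < s < 1.
\]
Show that
(a) ϕ is a parameterized (n − 1)-surface in Rn .
(b) $\partial\varphi^\perp(v, s) = \big((1 - s)^{n-2}\,\partial\psi^\perp(v),\ D(v, s)\big)$, where
\[
D(v, s) = \begin{bmatrix}
(1 - s)a_{1,1} & \cdots & (1 - s)a_{1,n-2} & -\psi_1(v) \\
(1 - s)a_{2,1} & \cdots & (1 - s)a_{2,n-2} & -\psi_2(v) \\
\vdots & & \vdots & \vdots \\
(1 - s)a_{n-1,1} & \cdots & (1 - s)a_{n-1,n-2} & -\psi_{n-1}(v)
\end{bmatrix}
\]
and $[a_{i,j}]_{(n-1) \times (n-2)} = \psi'(v)$.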
12.4 m-Dimensional Surfaces
Let 1 ≤ m < n and let V ⊆ Rn be open. Suppose that the function
F = (F1 , . . . , Fn−m ) : V → Rn−m
is C 1 on V such that the (n − m) × n matrix F 0 (x) has rank n − m at each
point x ∈ V . A set of the form
S = {x ∈ V : F (x) = c} ,
where c ∈ Rn−m , is called an m-dimensional level surface of F or simply an
m-surface in Rn . By replacing F by F − c, we may (and hereafter shall) take
c = 0.
Local Parametrization of an m-Surface
The following theorem shows that an m-surface may be “patched together”
from a collection of one-to-one parameterized m-surfaces. This will be an
important tool in the development of a theory of integration on m-surfaces.
12.4.1 Theorem. Let S = {x ∈ V : F (x) = 0} be an m-surface in Rn .
(a) For each a ∈ S there exist open sets Ua ⊆ Rm and Va ⊆ Rn with a ∈ Va ,
and a one-to-one parameterized m-surface ϕa from Ua onto Sa := S ∩ Va .
(b) Each $\varphi_a^{-1}$ is the restriction to Sa of a C 1 map on Va .
(c) If Sa ∩ Sb ≠ ∅, then the mapping
\[
\varphi_{ab} := \varphi_b^{-1} \circ \varphi_a : \varphi_a^{-1}(S_a \cap S_b) \to \varphi_b^{-1}(S_a \cap S_b)
\]
is C 1 with inverse ϕba .
(d) The mappings ϕa may be chosen so that 0 ∈ Ua and ϕa (0) = a.
Proof. If (a)–(c) of the theorem hold and a = ϕa (u0 ), then (d) may be achieved
by replacing Ua by Ua − u0 and ϕa by ϕa (u + u0 ), u ∈ Ua − u0 .
We prove (a)–(c) first for the case m = n − 1, that is, for F real-valued,
and then outline the proof for the general case.
Since F ′(a) has rank 1, ∂i F (a) ≠ 0 for some index i (which typically depends
on a). Define a C 1 map Ga : V → Rn by
\[
G_a(x_1, \ldots, x_n) = \big(x_1, \ldots, x_{i-1},\ F(x_1, \ldots, x_n),\ x_{i+1}, \ldots, x_n\big).
\]
Thus Ga simply replaces the ith coordinate of its argument x by F (x). Note
that $G_a'(x)$ is the identity matrix with row i replaced by ∇F (x). A standard
row reduction shows that $J_{G_a}(a) = \partial_i F(a)$. Since this is nonzero, by the
inverse function theorem there exist open sets Va ⊆ V and Wa = Ga (Va ) in
Rn with a ∈ Va such that Ga is one-to-one on Va and $G_a^{-1} : W_a \to V_a$ is C 1 .
Taking smaller Wa and Va if necessary, we may suppose that
\[
W_a = (\alpha_1, \beta_1) \times \cdots \times (\alpha_n, \beta_n).
\]
Note that 0 ∈ (αi , βi ), since (a1 , . . . , ai−1 , 0, ai+1 , . . . , an ) = Ga (a) ∈ Wa .
Now let (u1 , . . . , un ) ∈ Wa and set $(v_1, \ldots, v_n) = G_a^{-1}(u_1, \ldots, u_n)$. Then
\[
(u_1, \ldots, u_i, \ldots, u_n) = G_a(v_1, \ldots, v_n)
= \big(v_1, \ldots, v_{i-1},\ F(v_1, \ldots, v_n),\ v_{i+1}, \ldots, v_n\big)
= \big(u_1, \ldots, u_{i-1},\ (F \circ G_a^{-1})(u_1, \ldots, u_n),\ u_{i+1}, \ldots, u_n\big),
\]
hence
\[
(F \circ G_a^{-1})(u_1, \ldots, u_n) = u_i \tag{12.9}
\]
and, in particular,
\[
(F \circ G_a^{-1})(u_1, \ldots, u_{i-1}, 0, u_{i+1}, \ldots, u_n) = 0. \tag{12.10}
\]
Now set
\[
U_a := (\alpha_1, \beta_1) \times \cdots \times (\alpha_{i-1}, \beta_{i-1}) \times (\alpha_{i+1}, \beta_{i+1}) \times \cdots \times (\alpha_n, \beta_n)
\]
and define ϕa : Ua → Rn by
\[
\varphi_a(u_1, \ldots, u_{n-1}) = G_a^{-1}(u_1, \ldots, u_{i-1}, 0, u_i, \ldots, u_{n-1}).
\]
By (12.10), F (ϕa (u1 , . . . , un−1 )) = 0, hence ϕa (Ua ) ⊆ Sa . Conversely, if
$(v_1, \ldots, v_n) = G_a^{-1}(u_1, \ldots, u_n) \in S_a$, then by (12.9)
\[
u_i = (F \circ G_a^{-1})(u_1, \ldots, u_n) = F(v_1, \ldots, v_n) = 0,
\]
so that
\[
(v_1, \ldots, v_n) = G_a^{-1}(u_1, \ldots, u_{i-1}, 0, u_{i+1}, \ldots, u_n) = \varphi_a(u_1, \ldots, u_{i-1}, u_{i+1}, \ldots, u_n).
\]
Therefore, ϕa (Ua ) = Sa .
FIGURE 12.13: The mapping $G_a^{-1}$.
Now define the injection mapping ιa : Ua → Wa and the projection mapping
πa : Va → Rn−1 , respectively, by
\[
\iota_a(u_1, \ldots, u_{n-1}) = (u_1, \ldots, u_{i-1}, 0, u_i, \ldots, u_{n-1}) \quad\text{and}\quad
\pi_a(v_1, \ldots, v_n) = (v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n).
\]
Then πa ◦ ιa : Ua → Ua is the identity function and $\varphi_a = G_a^{-1} \circ \iota_a$. Since $G_a^{-1}$
has rank n and ιa has rank n − 1, ϕa has rank n − 1. Also, if v = ϕa (u), then
\[
(\pi_a \circ G_a)(v) = (\pi_a \circ G_a \circ \varphi_a)(u) = (\pi_a \circ \iota_a)(u) = u = \varphi_a^{-1}(v),
\]
which shows that $\varphi_a^{-1} : S_a \to U_a$ is the restriction to Sa of the C 1 function
πa ◦ Ga : Va → Ua .
Now let b ∈ S and Sa ∩ Sb ≠ ∅. Then $G_b \circ G_a^{-1}$ maps the open set
Ga (Va ∩ Vb ) onto the open set Gb (Va ∩ Vb ). Also, in the preceding notation,
$\varphi_a = G_a^{-1} \circ \iota_a$ on Ua and $\varphi_b^{-1} = \pi_b \circ G_b$ on Sb , hence
\[
\varphi_b^{-1} \circ \varphi_a = \pi_b \circ G_b \circ G_a^{-1} \circ \iota_a ,
\]
which maps the open set $\varphi_a^{-1}(S_b \cap S_a) \subseteq U_a$ onto the open set $\varphi_b^{-1}(S_b \cap S_a) \subseteq U_b$
and is C 1 with C 1 inverse $\varphi_a^{-1} \circ \varphi_b$. This verifies the theorem for the case
m = n − 1.
In the general case, there exist indices i1 < · · · < ik in {1, . . . , n} such that
\[
\frac{\partial(F_1, \ldots, F_k)}{\partial(x_{i_1}, \ldots, x_{i_k})}(a) \ne 0,
\]
where k := n − m. Let $i_1' < i_2' < \cdots < i_m'$ denote the complementary indices.
(In the above case, these were the indices 1, . . . , i − 1, i + 1, . . . , n.) Define
Ga (x1 , . . . , xn ) to be the n-tuple (x1 , . . . , xn ), with the coordinates $x_{i_1}, \ldots, x_{i_k}$
replaced by F1 (x), . . ., Fk (x). Then $J_{G_a}(a) \ne 0$, so the sets Va and Wa may
be obtained as before. Define
\[
U_a = (\alpha_{i_1'}, \beta_{i_1'}) \times \cdots \times (\alpha_{i_m'}, \beta_{i_m'}) \subseteq \mathbb{R}^m
\]
and the injection mapping ιa : Ua → Wa by
\[
\iota_a(u_1, u_2, \ldots, u_m) = (v_1, v_2, \ldots, v_n),
\]
where $v_{i_j} = 0$, 1 ≤ j ≤ k, and $v_{i_j'} = u_j$, 1 ≤ j ≤ m. Thus ιa places zeros in
the coordinate positions i1 < · · · < ik and fills the complementary positions
by u1 , . . . , um . Finally, define the projection mapping πa : Va → Rm by
\[
\pi_a(v_1, \ldots, v_n) = (v_{i_1'}, \ldots, v_{i_m'}).
\]
The proof then proceeds as before.
FIGURE 12.14: Transition mappings.
The functions ϕa : Ua → Sa in the theorem are called local parametrizations
of S, and the C 1 functions ϕab are called transition mappings. The sets Sa
are called surface elements. A collection of local parameterizations of S whose
surface elements cover S is called an atlas for S. Note that if F is C r then,
as an examination of the proof reveals, the local parameterizations and the
transition maps are C r as well.
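The construction in the proof of 12.4.1 can be traced by hand for the unit circle. The following Python sketch is an illustration under assumptions made here: F(x, y) = x² + y² − 1, the base point a = (0, 1), and the index i = 2 (since the second partial derivative of F at a is nonzero). The function names mirror the symbols of the proof but are otherwise invented.

```python
import math


def F(x, y):
    """Defining function of the unit circle (assumed example)."""
    return x * x + y * y - 1.0


def G(x, y):
    """G_a: replace the 2nd coordinate by F(x, y)."""
    return (x, F(x, y))


def G_inv(u1, u2):
    """Inverse of G_a on the upper half plane y > 0, valid when 1 + u2 - u1**2 > 0."""
    return (u1, math.sqrt(1.0 + u2 - u1 * u1))


def iota(u):
    """Injection: insert 0 in coordinate position i = 2."""
    return (u, 0.0)


def proj(v):
    """Projection: delete coordinate i = 2."""
    return v[0]


def phi(u):
    """Local parametrization phi_a = G_a^{-1} composed with iota_a."""
    return G_inv(*iota(u))


def phi_inv(v):
    """phi_a^{-1} as the restriction of proj composed with G_a."""
    return proj(G(*v))
```

Here `phi(u)` reduces to the familiar upper semicircle (u, √(1 − u²)), `phi(0.0)` returns the base point, and `phi_inv` is the restriction to the circle of a map that is defined and smooth on an open set of the plane, as in part (b) of the theorem.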
12.4.2 Example. Consider the (n − 1)-sphere S := {y ∈ Rn : kyk = 1} with
north and south poles p := (0, . . . , 0, 1) and q := (0, . . . , 0, −1). Let the points
y = (y1 , . . . , yn ) and x = (x1 , . . . , xn−1 ) be related as in Figure 12.15.
FIGURE 12.15: Stereographic projection from p.
Then for some t,
\[
(x_1, \ldots, x_{n-1}, -1) = (x, 0) - p = t(y - p) = \big(ty_1, \ldots, ty_{n-1},\ t(y_n - 1)\big),
\]
hence
\[
(x_1, \ldots, x_{n-1}) = \frac{1}{1 - y_n}\,(y_1, \ldots, y_{n-1}), \quad -1 \le y_n < 1.
\]
The mapping
\[
x = \varphi^{-1}(y) = \frac{1}{1 - y_n}\,(y_1, \ldots, y_{n-1}), \quad y_n < 1,
\]
from S \ {p} onto Rn−1 , is called the stereographic projection from p onto
the equatorial hyperplane xn = 0. One readily checks that the inverse of this
mapping is given by
\[
y = \varphi(x) = \frac{1}{1 + \|x\|^2}\,\big(2x_1, \ldots, 2x_{n-1},\ \|x\|^2 - 1\big), \quad x \in \mathbb{R}^{n-1}.
\]
Similarly, the stereographic projection from q is given by
\[
x = \tilde\varphi^{-1}(y) = \frac{1}{1 + y_n}\,(y_1, \ldots, y_{n-1}), \quad y_n > -1,
\]
with inverse
\[
y = \tilde\varphi(x) = \frac{1}{1 + \|x\|^2}\,\big(2x_1, \ldots, 2x_{n-1},\ 1 - \|x\|^2\big).
\]
The set {ϕ, ϕ̃} is an atlas for S. The transition mapping from Rn−1 \ {0} to
Rn−1 \ {0} is the self-inverse mapping
\[
(\varphi^{-1} \circ \tilde\varphi)(x) = \frac{x}{\|x\|^2}.
\]
♦
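The formulas of Example 12.4.2 lend themselves to a quick numerical sanity check. The sketch below is illustrative: the function names are chosen here, points are represented as tuples of floats, and the dimension is whatever the length of the input tuple dictates.

```python
import math


def phi(x):
    """Inverse stereographic projection from the north pole p."""
    s = sum(c * c for c in x)
    return tuple(2 * c / (1 + s) for c in x) + ((s - 1) / (1 + s),)


def phi_inv(y):
    """Stereographic projection from p, defined for y_n < 1."""
    return tuple(c / (1 - y[-1]) for c in y[:-1])


def phi_tilde(x):
    """Inverse stereographic projection from the south pole q."""
    s = sum(c * c for c in x)
    return tuple(2 * c / (1 + s) for c in x) + ((1 - s) / (1 + s),)


def phi_tilde_inv(y):
    """Stereographic projection from q, defined for y_n > -1."""
    return tuple(c / (1 + y[-1]) for c in y[:-1])


def transition(x):
    """Transition mapping: phi_inv applied after phi_tilde, for x != 0."""
    return phi_inv(phi_tilde(x))
```

For a nonzero x, `transition(x)` should equal x divided by its squared length, and applying `transition` twice should return x, reflecting that the map is self-inverse.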
Tangent Space of an m-Surface
The local parameterizations ϕa of an m-surface S = {x : F (x) = 0} may
be used to construct a tangent space at each point a ∈ S. Let
ϕa (u) = ϕb (v) ∈ Sa ∩ Sb .
Then v := ϕab (u) and, by the chain rule applied to ϕa = ϕb ◦ ϕab ,
d(ϕa )u = d(ϕb )v ◦ d(ϕab )u .
Since d(ϕab )u : Rm → Rm is an isomorphism, the vectors dj := d(ϕab )u (ej )
form a basis of Rm . Therefore, we have the mapping of Rm -frames
\[
d(\varphi_a)_u(e_1, \ldots, e_m) = d(\varphi_b)_v(d_1, \ldots, d_m), \tag{12.11}
\]
which shows that Tϕb (v) = Tϕa (u) and hence makes the following definition
meaningful.
12.4.3 Definition. The tangent space Tx to S at a point x ∈ S is defined as
Tϕa (u) , where ϕa is any local parametrization of S with ϕa (u) = x.
♦
The next proposition gives an intrinsic characterization of tangent space.
12.4.4 Proposition. For x ∈ S let Λx denote the set of all vectors in Rn of
the form α0 (0), where α : (−r, r) → S is a C 1 curve with α(0) = x. Then
\[
T_x = \Lambda_x = \{z \in \mathbb{R}^n : dF_x(z) = 0\} = \bigcap_{i=1}^{n-m} \{z \in \mathbb{R}^n : \nabla F_i(x) \cdot z = 0\}.
\]
Proof. Let ϕ be a local parametrization of S with ϕ(u) = x. A member of Tx
is of the form $z = \sum_{i=1}^{m} a_i\, d\varphi_u(e_i)$. For small |t|, the curve
\[
\alpha(t) = \varphi\Big(u + t\sum_{i=1}^{m} a_i e_i\Big)
\]
lies in S, α(0) = x, and, by the chain rule,
\[
\alpha'(0) = d\varphi_u\Big(\sum_{i=1}^{m} a_i e_i\Big) = \sum_{i=1}^{m} a_i\, d\varphi_u(e_i) = z.
\]
Therefore, z ∈ Λx . On the other hand, if α′(0) ∈ Λx , then differentiating the
identity (F ◦ α)(t) = 0 at t = 0 yields dFx (α′(0)) = 0. We have shown that
\[
T_x \subseteq \Lambda_x \subseteq \{z : dF_x(z) = 0\}.
\]
Since the subspaces Tx and {z : dFx (z) = 0} both have dimension m, the three sets must be equal.
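The chain of inclusions in the proof can be watched in a small example. The sketch below assumes the saddle surface F(x) = x1 x2 − x3 with the global parametrization ϕ(u) = (u1, u2, u1 u2); both are chosen here for illustration and do not come from the text. It approximates the velocity of the curve α(t) = ϕ(u + ta) by a central difference and checks that it is the expected combination of tangent vectors and is annihilated by ∇F.

```python
def phi(u):
    """Assumed parametrization of the surface x1*x2 - x3 = 0."""
    return (u[0], u[1], u[0] * u[1])


def grad_F(x):
    """Gradient of the assumed defining function F(x) = x1*x2 - x3."""
    return (x[1], x[0], -1.0)


def tangent_basis(u):
    """The two partial derivative vectors of phi at u."""
    return [(1.0, 0.0, u[1]), (0.0, 1.0, u[0])]


def curve_velocity(u, a, h=1e-6):
    """Central-difference estimate of alpha'(0) for alpha(t) = phi(u + t*a)."""
    fwd = phi((u[0] + h * a[0], u[1] + h * a[1]))
    bwd = phi((u[0] - h * a[0], u[1] - h * a[1]))
    return tuple((p - q) / (2 * h) for p, q in zip(fwd, bwd))


def dot(v, w):
    return sum(p * q for p, q in zip(v, w))
```

For instance, with u = (0.7, −1.2) and a = (2, 3), the estimated velocity agrees with 2 times the first tangent vector plus 3 times the second, and its dot product with `grad_F(phi(u))` vanishes up to rounding.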
12.4.5 Remark. The proposition shows that if S1 is an m1 -surface, S2 is
an m2 -surface, and ψ : S1 → S2 is C 1 , then for x ∈ S1 and y = ψ(x) the
function dψx maps Tx into Ty . Indeed, if v ∈ Tx , then there exists a smooth
curve α1 : (−1, 1) → S1 with α1 (0) = x and α1′(0) = v. Then α2 := ψ ◦ α1 is
a smooth curve in S2 and dψx (v) = (ψ ◦ α1 )′(0) = α2′(0) ∈ Ty .
♦
FIGURE 12.16: The mapping dψx : Tx → Ty .
Orientation of an m-Surface
Let S be an m-surface with local parameterizations ϕa : Ua → Rn . Since
ϕa is one-to-one, it is orientable. Suppose the parameterizations have the same
orientation, that is, sign(ϕa ) = sign(ϕb ) for all a and b. If u ∈ Ua , v ∈ Ub ,
and ϕa (u) = ϕb (v), then (12.11) shows that the orientation of Tϕb (v) agrees
with that of Tϕa (u) iff Jϕab (u) > 0. Thus if Jϕab > 0 whenever Sa ∩ Sb ≠ ∅,
then S may be given a well-defined orientation via the orientations of the
local parameterizations. In this case, S is said to be orientable. The positive
orientation is obtained if each local parametrization is positively oriented.
Orientation of an (n − 1)-Surface
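Orientability of an (n − 1)-surface may be characterized in terms of the
normal vector fields $\vec N_{\varphi_a}$. For this we need the following lemma, which relates
$\vec N_{\varphi_a}$ and $\vec N_{\varphi_b}$ on overlapping surface elements.
12.4.6 Lemma. Let x := ϕa (u) = ϕb (v) ∈ Sa ∩ Sb , where u ∈ Ua , v ∈ Ub .
Then
\[
\vec N_{\varphi_a}(x) = |J_{\varphi_{ab}}(u)|^{-1} J_{\varphi_{ab}}(u)\, \vec N_{\varphi_b}(x) = \operatorname{sign}\big(J_{\varphi_{ab}}(u)\big)\, \vec N_{\varphi_b}(x).
\]
Proof. Since ϕa = ϕb ◦ ϕab and v = ϕab (u), the chain rule implies that
\[
\varphi_a'(u) = \varphi_b'(v)\,\varphi_{ab}'(u)
\]
and
\[
\frac{\partial(\varphi_{a,1}, \ldots, \widehat{\varphi_{a,i}}, \ldots, \varphi_{a,n})}{\partial(u_1, \ldots, u_{n-1})}(u)
= \frac{\partial(\varphi_{b,1}, \ldots, \widehat{\varphi_{b,i}}, \ldots, \varphi_{b,n})}{\partial(v_1, \ldots, v_{n-1})}(v)\, J_{\varphi_{ab}}(u).
\]
From the first equation,
\[
\sqrt{\det\big(\varphi_a'(u)^t \varphi_a'(u)\big)} = \sqrt{\det\big(\varphi_{ab}'(u)^t \varphi_b'(v)^t \varphi_b'(v) \varphi_{ab}'(u)\big)}
= |J_{\varphi_{ab}}(u)|\, \sqrt{\det\big(\varphi_b'(v)^t \varphi_b'(v)\big)} .
\]
The assertion now follows by recalling that
\[
\vec N_{\varphi_a}(x) = \frac{1}{\sqrt{\det\big(\varphi_a'(u)^t \varphi_a'(u)\big)}} \sum_{i=1}^{n} (-1)^{i+n}\, \frac{\partial(\varphi_{a,1}, \ldots, \widehat{\varphi_{a,i}}, \ldots, \varphi_{a,n})}{\partial(u_1, \ldots, u_{n-1})}(u)\, e^i
\]
and
\[
\vec N_{\varphi_b}(x) = \frac{1}{\sqrt{\det\big(\varphi_b'(v)^t \varphi_b'(v)\big)}} \sum_{i=1}^{n} (-1)^{i+n}\, \frac{\partial(\varphi_{b,1}, \ldots, \widehat{\varphi_{b,i}}, \ldots, \varphi_{b,n})}{\partial(v_1, \ldots, v_{n-1})}(v)\, e^i .
\]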
12.4.7 Theorem. An (n − 1)-surface S is orientable iff there exists a continuous vector field $\vec N$ on S such that
\[
\vec N\big|_{S_a} = \vec N_{\varphi_a} \ \text{ for each } a \in S. \tag{12.12}
\]
Proof. If S is orientable, then Jϕab > 0, hence, by 12.4.6, $\vec N_{\varphi_a} = \vec N_{\varphi_b}$ on
Sa ∩ Sb . Therefore, (12.12) defines $\vec N$ unambiguously. Since $\vec N_{\varphi_a}$ is easily seen
to be continuous on Sa and Sa is relatively open in S, $\vec N$ is continuous on S.
Conversely, assume there exists a continuous vector field $\vec N$ on S that
satisfies (12.12). If x = ϕa (u) ∈ Sa ∩ Sb , then
\[
\vec N_{\varphi_a}(x) = \vec N(x) = \vec N_{\varphi_b}(x),
\]
hence, by 12.4.6, Jϕab (u) > 0. Therefore, S is orientable.
Let S be orientable with positive orientation. Then, by definition, the frame
(d(ϕa )u (e1 ), . . . , d(ϕa )u (en−1 )) in Ta is designated as positive (sign(ϕa ) > 0)
for each a ∈ S. Since the frame
\[
\big(d(\varphi_a)_u(e_1), \ldots, d(\varphi_a)_u(e_{n-1}),\ \vec N(a)\big)
\]
in Rn is positive (12.3.5), we say in this case that S is oriented by $\vec N$. The
notion of orientation by $-\vec N$ is defined analogously. For example, the sphere
S = {(x1 , . . . , xn ) : ‖x‖ = r} is locally parameterized by the mappings ϕ and
ϕ̃ of 12.3.8. The positive orientation is given by the unit normal vector field
$\vec N(p) = \|p\|^{-1} p$, called the outward unit normal.
12.4.8 Corollary. If S = {x : F (x) = 0} is connected, then S is orientable
and
\[
\vec N = \|\nabla F\|^{-1}\nabla F \quad\text{or}\quad \vec N = -\|\nabla F\|^{-1}\nabla F.
\]
Proof. Since ∇F (x) is perpendicular to S at x, the uniqueness of $\vec N$ implies
that
\[
\vec N(x) = s(x)\, \frac{\nabla F(x)}{\|\nabla F(x)\|}, \quad x \in S,
\]
where s(x) = ±1 is constant on each surface element. Since the surface elements
are open in S, s(x) is continuous. Since S is connected, s(x) must be constant
on S.
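Which of the two signs in 12.4.8 occurs depends on the choice of F. The following sketch illustrates this with an assumed example: the paraboloid written as the zero set of F(x) = x1² + x2² − x3 and parameterized as the graph ϕ(u) = (u1, u2, u1² + u2²). Neither the example nor the function names come from the text.

```python
import math


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def phi(u):
    """Graph parametrization of the assumed paraboloid."""
    return (u[0], u[1], u[0] ** 2 + u[1] ** 2)


def param_normal(u):
    """Unit normal obtained from the cross product of the two tangent vectors."""
    d1 = (1.0, 0.0, 2.0 * u[0])
    d2 = (0.0, 1.0, 2.0 * u[1])
    return normalize(cross(d1, d2))


def grad_F(x):
    """Gradient of the assumed defining function F(x) = x1**2 + x2**2 - x3."""
    return (2.0 * x[0], 2.0 * x[1], -1.0)


def s(u):
    """The factor s(x) of the proof, computed as a dot product of unit vectors."""
    n = param_normal(u)
    g = normalize(grad_F(phi(u)))
    return sum(a * b for a, b in zip(n, g))
```

For this choice of F the value of `s(u)` is −1 at every point, so the normal determined by the parametrization is the negative of the normalized gradient; replacing F by −F would give +1.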
(n − 1)-Surfaces-with-Boundary
To discuss surfaces-with-boundary, we shall need the following notation:
\[
\mathbb{R}^{n-1}_+ := \{y \in \mathbb{R}^{n-1} : y_{n-1} > 0\}, \qquad
\mathbb{H}^{n-1} := \{y \in \mathbb{R}^{n-1} : y_{n-1} \ge 0\}, \qquad
\partial\mathbb{H}^{n-1} := \{y \in \mathbb{R}^{n-1} : y_{n-1} = 0\}.
\]
12.4.9 Definition. An (n − 1)-surface-with-boundary is a subset of Rn of the
form
S = {x ∈ W : F (x) = 0 and gi (x) ≥ 0, i = 1, . . . , k} ,
where W ⊆ Rn is open and F : W → R and gi : W → R are C 1 and satisfy
the following conditions:
(a) ∇F (x) ≠ 0 for all x ∈ S.
(b) The sets Bi := {x ∈ S : gi (x) = 0} are pairwise disjoint.
(c) For each i and x ∈ Bi , the vectors ∇F (x) and ∇gi (x) are linearly independent.
The set $\partial S := \bigcup_{i=1}^{k} B_i$ is called the boundary of S and S \ ∂S is the interior. ♦
FIGURE 12.17: Cylinder-with-boundary: $x_1^2 + x_2^2 = 1$, 0 ≤ x3 ≤ 1. (Here $F(x) := x_1^2 + x_2^2 - 1$, with boundary components B1 : g1 (x) := x3 = 0 and B2 : g2 (x) := 1 − x3 = 0.)
If V denotes the open set {x ∈ W : gi (x) > 0, i = 1, . . . , k}, then
S \ ∂S = {x ∈ V : F (x) = 0} .
Therefore, condition (a) implies that the interior of S is an (n − 1)-surface.
Conditions (b) and (c) assert that the boundary of S is made up of disjoint
(n − 2)-surfaces. Indeed, if Fi := (F, gi ), then Bi = {x ∈ W : Fi (x) = 0} and
\[
F_i' = \begin{bmatrix} \nabla F \\ \nabla g_i \end{bmatrix}
\]
has rank 2. Also, because the (n − 2)-surfaces Bi are pairwise disjoint, a local
parametrization of Bi may be chosen to be disjoint from a local parametrization
of Bj .
The following theorem shows that, as in the case of an (n − 1)-surface,
an (n − 1)-surface-with-boundary may be described by a collection of local
parameterizations.
12.4.10 Theorem. Let S be an (n − 1)-surface-with-boundary.
(a) If a ∈ S \ ∂S, then there exists a local parametrization ϕa : Ua → Rn of
S \ ∂S at a with ϕa (0) = a.
(b) If a ∈ ∂S, then there exists an open set Ũa ⊆ Rn−1 and a one-to-one
parameterized (n − 1)-surface ϕ̃a : Ũa → Rn with ϕ̃a (0) = a such that
if $U_a := \tilde U_a \cap \mathbb{H}^{n-1}$ and $\varphi_a := \tilde\varphi_a\big|_{U_a}$, then
(i) $\varphi_a(U_a)$ is open in S,
(ii) $\varphi_a(U_a \cap \mathbb{R}^{n-1}_+)$ is open in S \ ∂S, and
(iii) $\varphi_a(U_a \cap \partial\mathbb{H}^{n-1})$ is open in ∂S.
FIGURE 12.18: Surface element $S_a = \tilde\varphi_a(\tilde U_a \cap \mathbb{H}^{n-1})$.
Proof. Part (a) follows from 12.4.1, since S \ ∂S is an (n − 1)-surface without
boundary.
For part (b), we may assume without loss of generality that ∂S =
{x ∈ S : g(x) = 0}. Choose a local parametrization ψa : Wa → Rn of
S′ := {x ∈ W : F (x) = 0} such that ψa (0) = a. Since ψa has rank n − 1 and
g has rank 1, ∂i (g ◦ ψa )(0) ≠ 0 for some i. Define Ha : Wa → Rn−1 by
\[
H_a(w_1, \ldots, w_{n-1}) = \big(w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_{n-1},\ (g \circ \psi_a)(w_1, \ldots, w_{n-1})\big).
\]
Then Ha has rank n − 1 at 0, hence, by the inverse function theorem, there
exist open sets W̃a ⊆ Wa and Ũa = Ha (W̃a ) in Rn−1 with 0 ∈ W̃a such that
Ha is one-to-one on W̃a and $H_a^{-1} : \tilde U_a \to \tilde W_a$ is C 1 . Set
\[
\tilde\varphi_a = \psi_a \circ H_a^{-1} : \tilde U_a \to S'.
\]
If u = Ha (w) ∈ Ũa , then $g \circ \psi_a(w) = g \circ \psi_a \circ H_a^{-1}(u) = g \circ \tilde\varphi_a(u)$, hence, by
definition of Ha ,
\[
(u_1, \ldots, u_{n-1}) = \big(w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_{n-1},\ (g \circ \tilde\varphi_a)(u)\big).
\]
Therefore, un−1 = g ◦ ϕ̃a (u), so
\[
g \circ \tilde\varphi_a(u) > 0 \ \text{ iff } \ u \in \tilde U_a \cap \mathbb{R}^{n-1}_+ \quad\text{and}\quad g \circ \tilde\varphi_a(u) = 0 \ \text{ iff } \ u \in \tilde U_a \cap \partial\mathbb{H}^{n-1}.
\]
It follows that
\[
\tilde\varphi_a\big(\tilde U_a \cap \mathbb{R}^{n-1}_+\big) = (S \setminus \partial S) \cap \psi_a(\tilde W_a) \quad\text{and}\quad \tilde\varphi_a\big(\tilde U_a \cap \partial\mathbb{H}^{n-1}\big) = \partial S \cap \psi_a(\tilde W_a).
\]
Since ψa (W̃a ) is open in S′ and S′ ⊇ S, (i)–(iii) follow.
Oriented (n − 1)-Surfaces-with-Boundary
As in the non-boundary case, orientation of an (n−1)-surface-with-boundary
S may be defined in terms of local parameterizations. By 12.4.4, the (n − 1)-dimensional tangent space at a ∈ S is
\[
T_a^S = \{z \in \mathbb{R}^n : z \cdot \nabla F(a) = 0\}.
\]
The new feature here is that if a ∈ ∂S, say a ∈ Bi , then there is also an
(n − 2)-dimensional tangent space to ∂S at a, namely,
\[
T_a^{\partial S} = \{z \in \mathbb{R}^n : z \cdot \nabla F(a) = z \cdot \nabla g_i(a) = 0\}.
\]
The connection between $T_a^S$ and $T_a^{\partial S}$ is described as follows: Let ϕa be a
local parametrization of S as described in part (b) of 12.4.10, where ϕa (0) =
a. Since ϕ̃a (Ua ∩ ∂Hn−1 ) ⊆ ∂S and (e1 , . . . , en−2 ) is a frame for ∂Hn−1 ,
d(ϕ̃a )0 (e1 , . . . , en−2 ) is a frame for $T_a^{\partial S}$. Since the vector d(ϕ̃a )0 (−en−1 ) is
not in the subspace $T_a^{\partial S}$,
\[
d(\tilde\varphi_a)_0(-e_{n-1}, e_1, \ldots, e_{n-2}) \tag{12.13}
\]
is a frame for $T_a^S$. The induced orientation of ∂S is obtained by declaring the
frame d(ϕ̃a )0 (e1 , . . . , en−2 ) of $T_a^{\partial S}$ to have the sign of the frame (12.13). If S
is positively oriented, then this sign is $(-1)^{n-1}$.
FIGURE 12.19: Induced orientation of $T_a^{\partial S}$.
Figure 12.19 depicts the case n = 3. Here, S is oriented by the normal
$\vec N$ (pointing outward). Therefore, by definition, the frame d(ϕa )0 (e1 , e2 ) is
positive in $T_a^S$, hence so is the frame d(ϕ̃a )0 (−e2 , e1 ). Thus, again by definition,
the frame d(ϕ̃a )0 (e1 ) of $T_a^{\partial S}$ is positive in the induced orientation. Note that
because the frame $(d(\varphi_a)_0(e_1), d(\varphi_a)_0(e_2), \vec N_\varphi)$ in R3 is positive (12.3.5), so
is the frame $(d(\varphi_a)_0(-e_2), d(\varphi_a)_0(e_1), \vec N_\varphi)$. The latter therefore forms a right-handed
system in R3 . Thus if d(ϕ̃a )0 (−e2 ) points upward, then d(ϕ̃a )0 (e1 )
must point in the direction shown. Therefore, the induced orientation of ∂S
is the one for which the surface S is on the left when ∂S is traversed in the
direction of the tangent vectors d(ϕ̃a )0 (e1 ).
Exercises
1. Let 0 < a < b. Show that the mapping
\[
\varphi(\phi, \theta) = \big(a\cos\phi,\ (b + a\sin\phi)\cos\theta,\ (b + a\sin\phi)\sin\theta\big), \quad 0 < \theta, \phi < 2\pi,
\]
is a local parametrization of the torus $x^2 + \big(\sqrt{y^2 + z^2} - b\big)^2 = a^2$ with
two circles missing.
2. Let U = {x ∈ Rn−1 : ‖x‖ < 1} and define a local parametrization
ψ : U → S n−1 = {y ∈ Rn : ‖y‖ = 1} by
\[
\psi(x) = \big(x,\ \sqrt{1 - \|x\|^2}\,\big), \quad x \in U.
\]
Give a geometric description of ψ. Referring to 12.4.2, find the transition
mapping ϕ̃−1 ◦ ψ.
3.S Consider the stereographic projection $\varphi_1^{-1}(y) = x$ from p onto the
hyperplane xn = −1 shown in Figure 12.20, where y = (y1 , . . . , yn ) and
x = (x1 , . . . , xn−1 ). Calculate $\varphi_1(x)$ and $\varphi_1^{-1}(y)$ and find the transition
mapping $\varphi^{-1} \circ \varphi_1$, where ϕ is the mapping of 12.4.2.
FIGURE 12.20: Stereographic projection $\varphi_1^{-1}(y)$ from p.
4. Replace the sphere in 12.4.2 by the elliptic paraboloid
\[
S = \Big\{(y_1, y_2, y_3) : y_3 = \Big(\frac{y_1}{a_1}\Big)^2 + \Big(\frac{y_2}{a_2}\Big)^2,\ y_3 < 1\Big\}
\]
(with p = (0, 0, 1)) and find the corresponding maps ϕ and ϕ−1 .
5.S Repeat Exercise 4 using the elliptic cone
\[
S = \Big\{(y_1, y_2, y_3) : y_3^2 = \Big(\frac{y_1}{a_1}\Big)^2 + \Big(\frac{y_2}{a_2}\Big)^2,\ 0 < y_3 < 1\Big\}.
\]
6. Repeat Exercise 4 using the ellipsoid
\[
S = \Big\{(y_1, y_2, y_3) : \Big(\frac{y_1}{a_1}\Big)^2 + \Big(\frac{y_2}{a_2}\Big)^2 + y_3^2 = 1\Big\}.
\]
a2
7. Find the equation of the tangent plane Ta at a = (1, 1, 1) for each of the
following surfaces:
(a)S $x_1^2 + 2x_2^2 + 3x_3^2 = 6$. (b) $x_1^2 + x_2^2 - 2x_3^2 = 0$. (c) $x_1^2 - x_2^2 + x_3 = 1$.
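8. An n × n matrix $A$ is said to be orthogonal if $A^tA$ is the identity matrix.
Identifying a 2 × 2 matrix $\begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}$ with the point $(x_1, x_2, x_3, x_4)$, show
that the collection of all 2 × 2 orthogonal matrices is a 1-surface $S$
in $\mathbb{R}^4$. Characterize the matrices in the tangent space to $S$ at each of the
following points:
\[ \text{(a) } \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \quad \text{(b) } \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}. \quad \text{(c) } \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \quad \text{(d) } \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}. \]
The matrices in the tangent space at the point in part (a) are the so-called
2 × 2 skew-symmetric matrices.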
9. Referring to 12.4.2, let $y \in S$ and set $T := d(\phi^{-1})_y : T_y \to \mathbb{R}^{n-1}$.
(a)S Prove that $\|T(v)\| = \dfrac{1}{1 - y_n}\,\|v\|$ for all $v \in T_y$.
(b) Use (a), the bilinearity of $v \cdot w$ and $T(v) \cdot T(w)$, and the identity
\[ 2v \cdot w = \|v + w\|^2 - \|v\|^2 - \|w\|^2 \]
to prove that
\[ \frac{v \cdot w}{\|v\|\,\|w\|} = \frac{T(v) \cdot T(w)}{\|T(v)\|\,\|T(w)\|}, \quad v, w \in T_y. \]
Thus, by 12.4.4, the stereographic projection preserves the angle at the
intersection of a pair of simple smooth curves on S.
10. Let each of the following 2-surfaces-with-boundary be positively oriented.
Find parametrizations of the boundary curves that are compatible with
the induced orientation on the boundary.
(a) $S = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 = 1,\ 0 \le x_3 \le 2 - x_2\}$.
(b) $S = \{(x_1, x_2, x_3) : x_3 = x_1^2 + x_2^2,\ 0 \le x_3 \le 1 - x_1 - x_2\}$.
(c) $S = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 + x_3^2 = 4,\ -2 \le x_3 \le 3 - x_1 - x_2\}$.
Hint. For (c) the boundary is a circle on the plane x1 + x2 + x3 = 3.
Translate and rotate that plane into the plane x3 = 0, find a parametric
equation of the rotated circle with center 0, then reverse the procedure to find the parametrization of the original circle with appropriate
orientation.
11. Let S = {x : F (x) = 0} be an oriented 2-surface in R3 , where F is C 2 .
(a)S The tangent bundle of $S$ is the set
\[ TS = \bigcup_{x \in S} \{x\} \times T_x. \]
Show that
\[ TS = \{(x, v) \in \mathbb{R}^6 : F(x) = 0 \text{ and } v \cdot \nabla F(x) = 0\} \]
and that $TS$ is a 4-surface in $\mathbb{R}^6$.
(b) The sphere bundle of S is the subset
TS1 := {(x, v) ∈ TS : kvk = 1} .
Show that TS1 is a 3-surface in R6 .
(c) Let $S = \{x \in \mathbb{R}^3 : \|x\|^2 = 3\}$. Show that the tangent space to the
sphere bundle TS1 at the point $(1, 1, 1, 1/\sqrt{6}, 1/\sqrt{6}, -2/\sqrt{6})$ consists of
all vectors $w \in \mathbb{R}^6$ satisfying the system
\[ \begin{aligned} w_1 + w_2 + w_3 &= 0, \\ -\tfrac{3}{\sqrt{6}}\,w_3 + w_4 + w_5 + w_6 &= 0, \\ w_4 + w_5 - 2w_6 &= 0. \end{aligned} \]
Chapter 13
Integration on Surfaces
Throughout the chapter m and n are
fixed positive integers with 1 ≤ m ≤ n.
In this chapter we construct the integral of a differential m-form on an
m-surface in Rn , a generalization of the line integral of a 1-form on a curve.
This will provide the necessary context for the divergence theorem and the
theorems of Green and Stokes, far-reaching generalizations of the fundamental
theorem of calculus.
13.1 Differential Forms
Alternating Multilinear Functionals
An m-multilinear functional on $\mathbb{R}^n$ is a real-valued function
\[ M(a^1, \dots, a^m), \quad a^1, \dots, a^m \in \mathbb{R}^n, \]
that is linear in each variable $a^i$ separately. (See Section 9.7.) Such a function
is said to be alternating if interchanging two vectors changes the sign of $M$:
\[ M(a^1, \dots, a^i, \dots, a^j, \dots, a^m) = -M(a^1, \dots, a^j, \dots, a^i, \dots, a^m). \]
Thus if $a^i = a^j$ for some $i \ne j$, then $M(a^1, \dots, a^m) = 0$. Note that a linear combination of
alternating m-multilinear functionals is an alternating m-multilinear functional.
A permutation of $(1, \dots, m)$ is a one-to-one function $\sigma$ mapping $\{1, \dots, m\}$
onto itself, frequently denoted by $(i_1, \dots, i_m)$, where $i_k = \sigma(k)$. The sign $(-1)^\sigma$
of $\sigma$ is positive (negative) if an even (odd) number of adjacent interchanges
are required to transform $(i_1, \dots, i_m)$ back to $(1, \dots, m)$ (see Appendix B). It
follows that if $M$ is an alternating m-multilinear functional, then
\[ M(a^{\sigma(1)}, \dots, a^{\sigma(m)}) = (-1)^\sigma M(a^1, \dots, a^m). \]
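The sign rule can be checked numerically. The following is a minimal sketch, not part of the text: the helper names `sign`, `leibniz_det` and `permutes_with_sign` are hypothetical, permutations are written 0-based, and the determinant (computed by the Leibniz formula) merely serves as a convenient sample of an alternating functional.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of (0, ..., m-1): +1 if even, -1 if odd.

    The parity of the inversion count equals the parity of the number of
    adjacent interchanges needed to sort the tuple.
    """
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1


def leibniz_det(vectors):
    """Determinant of m vectors in R^m, used as a sample alternating functional M."""
    m = len(vectors)
    total = 0
    for perm in permutations(range(m)):
        term = sign(perm)
        for k in range(m):
            term *= vectors[k][perm[k]]
        total += term
    return total


def permutes_with_sign(M, vectors):
    """Check M(a^sigma(1), ..., a^sigma(m)) == sign(sigma) * M(a^1, ..., a^m) for all sigma."""
    base = M(vectors)
    return all(
        M([vectors[s] for s in sigma]) == sign(sigma) * base
        for sigma in permutations(range(len(vectors)))
    )
```

Integer inputs keep the comparison exact; with floating-point vectors a tolerance would be needed.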
An important example is the determinant of an n × n matrix, which is
multilinear and alternating on its rows as well as its columns. To build on this,
we introduce the following notation. Define
Jm = {j := (j1 , . . . , jm ) : 1 ≤ jk ≤ n} , and
Im = {i := (i1 , . . . , im ) : 1 ≤ i1 < i2 < · · · < im ≤ n} .
Thus Jm is the set of all m-tuples of (possibly repeated) indices in {1, . . . , n}
and Im the set of all strictly increasing m-tuples in Jm . In particular, In =
{(1, . . . , n)}.
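Now let $A$ be an n × m matrix with columns $a^1, \dots, a^m \in \mathbb{R}^n$ and $B$ an
m × n matrix with rows $b_1, \dots, b_m \in \mathbb{R}^n$. For any member $j = (j_1, \dots, j_m)$ of
$J_m$ define $A_j$ to be the m × m matrix whose rth row is row $j_r$ of $A$ and define
$B^j$ to be the m × m matrix whose cth column is column $j_c$ of $B$, that is,
\[ A_j = \bigl[a^1 \cdots a^m\bigr]_j = \begin{bmatrix} a^1_1 & a^2_1 & \cdots & a^m_1 \\ a^1_2 & a^2_2 & \cdots & a^m_2 \\ \vdots & \vdots & & \vdots \\ a^1_n & a^2_n & \cdots & a^m_n \end{bmatrix}_j = \begin{bmatrix} a^1_{j_1} & a^2_{j_1} & \cdots & a^m_{j_1} \\ a^1_{j_2} & a^2_{j_2} & \cdots & a^m_{j_2} \\ \vdots & \vdots & & \vdots \\ a^1_{j_m} & a^2_{j_m} & \cdots & a^m_{j_m} \end{bmatrix} \]
and
\[ B^j = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}^j = \begin{bmatrix} b^1_1 & b^2_1 & \cdots & b^n_1 \\ b^1_2 & b^2_2 & \cdots & b^n_2 \\ \vdots & \vdots & & \vdots \\ b^1_m & b^2_m & \cdots & b^n_m \end{bmatrix}^j = \begin{bmatrix} b^{j_1}_1 & b^{j_2}_1 & \cdots & b^{j_m}_1 \\ b^{j_1}_2 & b^{j_2}_2 & \cdots & b^{j_m}_2 \\ \vdots & \vdots & & \vdots \\ b^{j_1}_m & b^{j_2}_m & \cdots & b^{j_m}_m \end{bmatrix}. \]
Thus $j$ selects rows from $A$ and columns from $B$. For example, for n = 4 and
m = 3,
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix}_{(4,2,1)} = \begin{bmatrix} 10 & 11 & 12 \\ 4 & 5 & 6 \\ 1 & 2 & 3 \end{bmatrix} \]
and
\[ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix}^{(4,4,1)} = \begin{bmatrix} 4 & 4 & 1 \\ 8 & 8 & 5 \\ 12 & 12 & 9 \end{bmatrix}. \]
Finally, define the alternating m-multilinear functional $dx_j = dx_{j_1, \dots, j_m}$ on $\mathbb{R}^n$
by
\[ dx_j(a^1, \dots, a^m) = \det\bigl[a^1 \cdots a^m\bigr]_j. \]
Note that if m = 1, the definition reduces to $dx_j(a) = a_j$, as defined in
Section 9.7.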
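To make the row-selection definition concrete, here is a small sketch that evaluates $dx_j(a^1, \dots, a^m)$ directly. The function names `det` and `dx` are illustrative assumptions, indices in `j` are 1-based as in the text, and the determinant is computed by cofactor expansion, which is adequate only for small m.

```python
def det(mat):
    """Determinant of a square matrix (list of row lists) by cofactor expansion."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for c, entry in enumerate(mat[0]):
        minor = [row[:c] + row[c + 1:] for row in mat[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def dx(j, *vectors):
    """Evaluate dx_j(a^1, ..., a^m) for a 1-based index tuple j.

    The r-th row of the m x m matrix is row j_r of A = [a^1 ... a^m],
    that is, the j_r-th components of the vectors a^1, ..., a^m.
    """
    if len(j) != len(vectors):
        raise ValueError("need exactly one vector per index")
    rows = [[a[jr - 1] for a in vectors] for jr in j]
    return det(rows)
```

For instance, with the standard basis vectors of $\mathbb{R}^4$, `dx((1, 3), e1, e3)` evaluates the 2-form $dx_{1,3}$ on the pair $(e^1, e^3)$, and a repeated index such as `(1, 1)` always yields zero.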
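13.1.1 Lemma. If $i = (i_1, \dots, i_m)$ and $j = (j_1, \dots, j_m) \in I_m$, then
\[ dx_i(e^{j_1}, \dots, e^{j_m}) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise,} \end{cases} \]
where $e^1, \dots, e^n$ are the standard basis vectors in $\mathbb{R}^n$.
Proof. By definition,
\[ dx_i(e^{j_1}, \dots, e^{j_m}) = \det \begin{bmatrix} e^{j_1}_1 & \cdots & e^{j_m}_1 \\ \vdots & & \vdots \\ e^{j_1}_n & \cdots & e^{j_m}_n \end{bmatrix}_i = \begin{vmatrix} e^{j_1}_{i_1} & \cdots & e^{j_m}_{i_1} \\ \vdots & & \vdots \\ e^{j_1}_{i_m} & \cdots & e^{j_m}_{i_m} \end{vmatrix}, \]
where $e^j_i = 1$ if $i = j$ and 0 otherwise. If $j_1 < i_1$, then $j_1 < i_\ell$ for every $\ell$, hence
the first column is zero and the determinant is zero. Similarly, if $j_1 > i_1$, then
the first row is zero and, again, the determinant is zero. If $j_1 = i_1$, then the
determinant reduces to
\[ \begin{vmatrix} e^{j_2}_{i_2} & \cdots & e^{j_m}_{i_2} \\ \vdots & & \vdots \\ e^{j_2}_{i_m} & \cdots & e^{j_m}_{i_m} \end{vmatrix}, \]
and an induction argument completes the proof.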
13.1.2 Lemma. Let $M$ and $M'$ be alternating m-multilinear functionals on
$\mathbb{R}^n$. If
\[ M(e^{i_1}, \dots, e^{i_m}) = M'(e^{i_1}, \dots, e^{i_m}) \tag{13.1} \]
for all $(i_1, \dots, i_m) \in I_m$, then $M = M'$.
Proof. For $j = 1, \dots, m$, let $a^j = (a^j_1, \dots, a^j_n) = \sum_{i=1}^n a^j_i e^i$. By multilinearity,
\[ M(a^1, \dots, a^m) = M\Bigl(\sum_{i=1}^n a^1_i e^i, \dots, \sum_{i=1}^n a^m_i e^i\Bigr) = \sum_{i_1=1}^n \cdots \sum_{i_m=1}^n a^1_{i_1} \cdots a^m_{i_m}\, M(e^{i_1}, \dots, e^{i_m}), \]
with the analogous equality holding for $M'$. It therefore suffices to show that
\[ M(e^{i_1}, \dots, e^{i_m}) = M'(e^{i_1}, \dots, e^{i_m}) \]
for every $(i_1, \dots, i_m) \in J_m$.
This is clear if two of the indices $i_k$ are equal, since then both sides are zero.
If the indices are distinct, then, by permuting the vectors $e^{i_1}, \dots, e^{i_m}$ and
attaching the appropriate signs, the indices may be brought into increasing
order, and the desired equality then follows from the hypothesis.
13.1.3 Theorem. If $M$ is an alternating m-multilinear functional on $\mathbb{R}^n$,
then
\[ M = \sum_{(i_1, \dots, i_m) \in I_m} M(e^{i_1}, \dots, e^{i_m})\, dx_{i_1, \dots, i_m}. \]
Proof. Let $M'$ denote the alternating m-multilinear functional on the right. If
$(j_1, \dots, j_m) \in I_m$, then
\[ M'(e^{j_1}, \dots, e^{j_m}) = \sum_{(i_1, \dots, i_m) \in I_m} M(e^{i_1}, \dots, e^{i_m})\, dx_{i_1, \dots, i_m}(e^{j_1}, \dots, e^{j_m}) = M(e^{j_1}, \dots, e^{j_m}), \]
the second equality from 13.1.1. By 13.1.2, $M = M'$.
The following application of 13.1.3 will be needed later in connection with
integration on surfaces.
13.1.4 Binet–Cauchy Product. Let $C$ be an m × n matrix and $D$ an n × m
matrix. Then
\[ \det(CD) = \sum_{i \in I_m} \det C^i \det D_i. \]
Proof. Let $c_1, \dots, c_m \in \mathbb{R}^n$ denote the rows of $C$ and $d^1, \dots, d^m \in \mathbb{R}^n$ the
columns of $D$, the latter considered as variables. Define
\[ M(d^1, \dots, d^m) = \det(CD) = \det\bigl[c_i \cdot d^j\bigr]_{m \times m}. \]
Then $M$ is an alternating m-multilinear form and, by 13.1.3,
\[ M(d^1, \dots, d^m) = \sum_{(i_1, \dots, i_m) \in I_m} M(e^{i_1}, \dots, e^{i_m})\, dx_{i_1, \dots, i_m}(d^1, \dots, d^m). \]
Since $M(e^{i_1}, \dots, e^{i_m}) = \det C^i$ and $dx_i(d^1, \dots, d^m) = \det D_i$, the conclusion
follows.
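The identity is easy to test on small integer matrices. The sketch below is illustrative only: `det`, `matmul` and `binet_cauchy_sum` are hypothetical helper names, matrices are lists of rows, and index tuples are 0-based.

```python
from itertools import combinations


def det(mat):
    """Determinant of a square matrix (list of row lists) by cofactor expansion."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for c, entry in enumerate(mat[0]):
        minor = [row[:c] + row[c + 1:] for row in mat[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def matmul(C, D):
    """Product of an m x n matrix C and an n x m matrix D."""
    return [
        [sum(C[r][k] * D[k][c] for k in range(len(D))) for c in range(len(D[0]))]
        for r in range(len(C))
    ]


def binet_cauchy_sum(C, D):
    """Sum over increasing index tuples i of det(C^i) * det(D_i)."""
    m, n = len(C), len(D)
    total = 0
    for i in combinations(range(n), m):
        C_i = [[row[k] for k in i] for row in C]  # columns i of C
        D_i = [list(D[k]) for k in i]             # rows i of D
        total += det(C_i) * det(D_i)
    return total
```

With `C = [[1, 2, 3], [4, 5, 6]]` and `D = [[1, 0], [2, 1], [0, 3]]`, both `det(matmul(C, D))` and `binet_cauchy_sum(C, D)` can be compared directly.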
13.1.5 Corollary. If $C$ and $D$ are n × n matrices, then
\[ \det(CD) = (\det C)(\det D). \]
13.1.6 Corollary. If $A$ is an n × m matrix, then
\[ \det(A^tA) = \sum_{i \in I_m} [\det(A_i)]^2. \]
Proof. Take $C = A^t$ and $D = A$ in the theorem and note that $C^i = (A_i)^t$, so
\[ \det C^i \det D_i = \det (A_i)^t \det A_i = [\det A_i]^2. \]
From 13.1.6, we have
13.1.7 Corollary. Let $A$ be an n × m matrix. Then $A$ has rank m iff
$\det(A^tA) \ne 0$.
Definition of a Differential Form
A differential m-form on a set S ⊆ Rn is a function ω that assigns to each
x ∈ S an alternating m-multilinear functional ωx on Rn . We shall usually drop
the qualifier “differential” when referring to forms. The integer m is called the
degree of the form. A 0-form is simply a real-valued function on S.
By 13.1.3, if $\omega$ is an m-form, then for each $i \in I_m$ there exists a unique
function $g_i$ on $S$ such that
\[ \omega_x = \sum_{i \in I_m} g_i(x)\, dx_i, \quad x \in S. \]
Conversely, if $f_j$ is a real-valued function on $S$ for each $j \in J_m$, then
\[ \omega_x := \sum_{j \in J_m} f_j(x)\, dx_j, \quad x \in S, \tag{13.2} \]
defines an m-form on S. If each fj is of class C r on S (that is, on an open set
containing S), then ω is called a differential form of class C r or simply a C r
form, where r ∈ Z+ ∪ {+∞}.
The Algebra of Differential Forms
For $a \in \mathbb{R}$ and m-forms
\[ \omega = \sum_{j \in J_m} f_j\, dx_j \quad \text{and} \quad \eta = \sum_{j \in J_m} g_j\, dx_j \]
on $S$, define m-forms $a\omega$ and $\omega + \eta$ on $S$ by
\[ a\omega := \sum_{j \in J_m} a f_j\, dx_j \quad \text{and} \quad \omega + \eta := \sum_{j \in J_m} (f_j + g_j)\, dx_j. \]
The collection of m-forms on $S$ is easily seen to be a vector space under these
operations.
It is also possible to multiply forms. For this, the notation
\[ dx_{j_1, \dots, j_m} = dx_{j_1} \wedge \cdots \wedge dx_{j_m} \tag{13.3} \]
will be useful. The right side may be interpreted as a product of differentials,
called a wedge product and made precise below. Because $dx_{j_1, \dots, j_m}(a^1, \dots, a^m)$
is a determinant, interchanging a pair of differentials in (13.3) changes the sign
of the product. Furthermore, if there are duplicate indices, then the product is
zero. Thus we have the “rules”
\[ dx_j \wedge dx_i = -dx_i \wedge dx_j \quad \text{and} \quad dx_i \wedge dx_i = 0. \tag{13.4} \]
Using these rules, one can reduce any m-form to its unique canonical representation
\[ \omega = \sum_{(i_1, \dots, i_m) \in I_m} g_{i_1, \dots, i_m}\, dx_{i_1} \wedge \cdots \wedge dx_{i_m}. \]
For example, the 3-form in R4
ω = f dx2 ∧ dx1 ∧ dx2 + g dx3 ∧ dx2 ∧ dx1 + h dx2 ∧ dx4 ∧ dx1
has canonical representation
ω = −g dx1 ∧ dx2 ∧ dx3 + h dx1 ∧ dx2 ∧ dx4 .
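The reduction to canonical form is mechanical, as the following sketch suggests. It is an illustration under assumptions rather than the text's own procedure: a form is modelled as a list of `(coefficient, index_tuple)` pairs with numeric coefficients standing in for the functions f, g, h, and the name `canonical` is hypothetical.

```python
def canonical(terms):
    """Reduce a list of (coefficient, index tuple) terms to canonical form.

    Returns a dict mapping strictly increasing index tuples to coefficients.
    A term with a repeated index vanishes; otherwise sorting the indices
    contributes the sign of the sorting permutation (rules (13.4)).
    """
    result = {}
    for coeff, idx in terms:
        if len(set(idx)) < len(idx):
            continue  # dx_i ^ dx_i = 0
        inversions = sum(
            1
            for a in range(len(idx))
            for b in range(a + 1, len(idx))
            if idx[a] > idx[b]
        )
        sgn = -1 if inversions % 2 else 1
        key = tuple(sorted(idx))
        result[key] = result.get(key, 0) + sgn * coeff
    return {k: v for k, v in result.items() if v != 0}
```

Taking f = 2, g = 3, h = 5 in the example above, `canonical([(2, (2, 1, 2)), (3, (3, 2, 1)), (5, (2, 4, 1))])` reproduces the coefficients $-g$ and $h$ on $dx_1 \wedge dx_2 \wedge dx_3$ and $dx_1 \wedge dx_2 \wedge dx_4$.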
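13.1.8 Definition. Let 1 ≤ p, q ≤ n. The wedge product or exterior product
of the forms
\[ \omega = \sum_{(j_1, \dots, j_p) \in J_p} f_{j_1, \dots, j_p}\, dx_{j_1} \wedge \cdots \wedge dx_{j_p} \quad \text{and} \quad \eta = \sum_{(k_1, \dots, k_q) \in J_q} g_{k_1, \dots, k_q}\, dx_{k_1} \wedge \cdots \wedge dx_{k_q} \]
is the form
\[ \omega \wedge \eta := \sum_{\substack{(j_1, \dots, j_p) \in J_p \\ (k_1, \dots, k_q) \in J_q}} f_{j_1, \dots, j_p}\, g_{k_1, \dots, k_q}\, dx_{j_1} \wedge \cdots \wedge dx_{j_p} \wedge dx_{k_1} \wedge \cdots \wedge dx_{k_q}. \tag{13.5} \]
If $f$ is a 0-form on $S$, then the p-form $f\omega = f \wedge \omega$ is defined by
\[ f \wedge \omega := \sum_{(j_1, \dots, j_p) \in J_p} f f_{j_1, \dots, j_p}\, dx_{j_1} \wedge \cdots \wedge dx_{j_p}. \]
♦
Note that the right side of (13.5) may be obtained by formally multiplying
the sums defining $\omega$ and $\eta$, where the product of forms $dx_{j_1} \wedge \cdots \wedge dx_{j_p}$ and
$dx_{k_1} \wedge \cdots \wedge dx_{k_q}$ is defined as
\[ dx_{j_1} \wedge \cdots \wedge dx_{j_p} \wedge dx_{k_1} \wedge \cdots \wedge dx_{k_q}. \]
The rules in (13.4) may then be used to obtain the canonical representation of
$\omega \wedge \eta$. The resulting form has degree ≤ n, in compliance with our definition.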
13.1.9 Example. In R4 ,
(a)
(f1 dx1 + f2 dx2 + f3 dx3 + f4 dx4 ) ∧ (g1 dx1 + g2 dx2 )
= (f1 g2 − f2 g1 ) dx1 ∧ dx2 − f3 g1 dx1 ∧ dx3 − f3 g2 dx2 ∧ dx3
− f4 g2 dx2 ∧ dx4 − f4 g1 dx1 ∧ dx4 .
(b)
(f1 dx1 + f2 dx2 + f3 dx3 + f4 dx4 ) ∧ (h1 dx1 ∧ dx3 + h2 dx2 ∧ dx4 )
= f1 h2 dx1 ∧ dx2 ∧ dx4 − f2 h1 dx1 ∧ dx2 ∧ dx3
− f3 h2 dx2 ∧ dx3 ∧ dx4 + f4 h1 dx1 ∧ dx3 ∧ dx4 .
♦
It must still be shown that the definition of $\omega \wedge \eta$ in (13.5) is independent
of the particular representations of $\omega$ and $\eta$. To see this, apply the rules in
(13.4), first on the indices $j_1, \dots, j_p$ and then on the indices $k_1, \dots, k_q$, to reduce the right
side of (13.5) to
\[ \sum_{\substack{(i_1, \dots, i_p) \in I_p \\ (i'_1, \dots, i'_q) \in I_q}} \tilde f_{i_1, \dots, i_p}\, \tilde g_{i'_1, \dots, i'_q}\, dx_{i_1} \wedge \cdots \wedge dx_{i_p} \wedge dx_{i'_1} \wedge \cdots \wedge dx_{i'_q}, \]
where
\[ \sum_{(i_1, \dots, i_p) \in I_p} \tilde f_{i_1, \dots, i_p}\, dx_{i_1} \wedge \cdots \wedge dx_{i_p} \quad \text{and} \quad \sum_{(i'_1, \dots, i'_q) \in I_q} \tilde g_{i'_1, \dots, i'_q}\, dx_{i'_1} \wedge \cdots \wedge dx_{i'_q} \]
are the canonical representations of $\omega$ and $\eta$. Since the latter are unique, every
version of $\omega \wedge \eta$ may be reduced to the same form, hence $\omega \wedge \eta$ is well-defined.
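13.1.10 Proposition. Let $\omega$ be a p-form, $\eta$ a q-form, and $\nu$ an r-form, where
1 ≤ p, q, r ≤ n. Then
(a) $\omega \wedge \eta$ is linear in each variable separately;
(b) $(\omega \wedge \eta) \wedge \nu = \omega \wedge (\eta \wedge \nu)$;
(c) $\eta \wedge \omega = (-1)^{pq}\, \omega \wedge \eta$.
Proof. The straightforward proofs of (a) and (b) are left to the reader. For the
proof of (c), let $\omega$ and $\eta$ be as in 13.1.8. Then
\[ \begin{aligned} \eta \wedge \omega &= \sum_{\substack{(k_1, \dots, k_q) \in J_q \\ (j_1, \dots, j_p) \in J_p}} g_{k_1, \dots, k_q} f_{j_1, \dots, j_p}\, dx_{k_1} \wedge \cdots \wedge dx_{k_q} \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_p} \\ &= \sum_{\substack{(k_1, \dots, k_q) \in J_q \\ (j_1, \dots, j_p) \in J_p}} g_{k_1, \dots, k_q} f_{j_1, \dots, j_p} (-1)^{pq}\, dx_{j_1} \wedge \cdots \wedge dx_{j_p} \wedge dx_{k_1} \wedge \cdots \wedge dx_{k_q} \\ &= (-1)^{pq}\, \omega \wedge \eta, \end{aligned} \]
the last equality because pq adjacent interchanges are required.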
13.1.11 Proposition. Let $a^j = \sum_{i=1}^n a^j_i e^i$, $j = 1, \dots, n$. Then
\[ \Bigl(\sum_{i=1}^n a^1_i\, dx_i\Bigr) \wedge \cdots \wedge \Bigl(\sum_{i=1}^n a^n_i\, dx_i\Bigr) = \det\bigl[a^1 \cdots a^n\bigr]\, dx_1 \wedge \cdots \wedge dx_n. \]
Proof. By properties of the wedge product, the left side of the equation is
\[ \sum_{i_1=1}^n \cdots \sum_{i_n=1}^n a^1_{i_1} \cdots a^n_{i_n}\, dx_{i_1} \wedge \cdots \wedge dx_{i_n} = \sum_{i_1, \dots, i_n \text{ distinct}} a^1_{i_1} \cdots a^n_{i_n}\, dx_{i_1} \wedge \cdots \wedge dx_{i_n}. \]
If $\sigma = (i_1, \dots, i_n)$, then $dx_{i_1} \wedge \cdots \wedge dx_{i_n} = (-1)^\sigma dx_1 \wedge \cdots \wedge dx_n$, and the
assertion follows from the definition of determinant.
The proposition provides an alternate method for evaluating determinants.
13.1.12 Example. Let
\[ A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \\ 3 & -2 & 1 \end{bmatrix}. \]
By wedge product rules applied to the forms constructed from the columns,
\[ \begin{aligned} (1\, dx_1 + 2\, dx_2 + 3\, dx_3) &\wedge (3\, dx_1 + 4\, dx_2 - 2\, dx_3) \wedge (5\, dx_1 + 6\, dx_2 + 1\, dx_3) \\ &= (-2\, dx_{1,2} - 11\, dx_{1,3} - 16\, dx_{2,3}) \wedge (5\, dx_1 + 6\, dx_2 + dx_3) \\ &= -2\, dx_{1,2,3} + 66\, dx_{1,2,3} - 80\, dx_{1,2,3} \\ &= -16\, dx_{1,2,3}, \end{aligned} \]
hence $\det(A) = -16$.
♦
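The computation in 13.1.12 can be replayed mechanically. The sketch below assumes constant coefficients and models a form as a dict from index tuples to numbers; the names `canonical` and `wedge` are hypothetical and the representation is only one of several reasonable choices.

```python
def canonical(terms):
    """Collect (coefficient, index tuple) terms into {increasing tuple: coefficient}."""
    result = {}
    for coeff, idx in terms:
        if len(set(idx)) < len(idx):
            continue  # repeated differential gives zero
        inversions = sum(
            1
            for a in range(len(idx))
            for b in range(a + 1, len(idx))
            if idx[a] > idx[b]
        )
        key = tuple(sorted(idx))
        result[key] = result.get(key, 0) + (-1 if inversions % 2 else 1) * coeff
    return {k: v for k, v in result.items() if v != 0}


def wedge(omega, eta):
    """Wedge product of two constant-coefficient forms, as in (13.5)."""
    terms = [(f * g, j + k) for j, f in omega.items() for k, g in eta.items()]
    return canonical(terms)


# 1-forms built from the columns of the matrix A of Example 13.1.12
col1 = {(1,): 1, (2,): 2, (3,): 3}
col2 = {(1,): 3, (2,): 4, (3,): -2}
col3 = {(1,): 5, (2,): 6, (3,): 1}
```

Here `wedge(col1, col2)` gives the intermediate 2-form of the example, and `wedge(wedge(col1, col2), col3)` leaves a single coefficient on $dx_{1,2,3}$, which by 13.1.11 is $\det A$.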
The Differential of a Form
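13.1.13 Definition. The differential of a 0-form $f$ of class $C^1$ on $S \subseteq \mathbb{R}^n$ is
its differential as a $C^1$ function, namely, the 1-form
\[ df = \sum_{j=1}^n (\partial_j f)\, dx_j. \]
The differential of an m-form
\[ \omega = \sum_{(j_1, \dots, j_m) \in J_m} f_{j_1, \dots, j_m}\, dx_{j_1} \wedge \cdots \wedge dx_{j_m} \]
of class $C^1$ on $S$ is the (m + 1)-form $d\omega$ defined by
\[ \begin{aligned} d\omega &= \sum_{(j_1, \dots, j_m) \in J_m} (df_{j_1, \dots, j_m}) \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_m} \\ &= \sum_{(j_1, \dots, j_m) \in J_m} \sum_{j=1}^n (\partial_j f_{j_1, \dots, j_m})\, dx_j \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_m}. \end{aligned} \tag{13.6} \]
♦
Note that if m = n, then $d\omega = 0$, since in the last expression every $dx_j$ is a
$dx_{j_i}$ for some $i$.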
As in the case of wedge products, it must be verified that the definition of
$d\omega$ does not depend on the particular representation of $\omega$. For this we use the
rules in (13.4) to express $\omega$ canonically as
\[ \omega = \sum_{(i_1, \dots, i_m) \in I_m} g_{i_1, \dots, i_m}\, dx_{i_1} \wedge \cdots \wedge dx_{i_m}. \]
Here, each $g_{i_1, \dots, i_m}$ is a linear combination of the functions $f_{j_1, \dots, j_m}$ produced by
combining these functions during the reduction process. Applying the same
sequence of operations to the sum on the right in (13.6) results in
\[ \sum_{(i_1, \dots, i_m) \in I_m} \eta_{i_1, \dots, i_m} \wedge dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_m}, \]
where $\eta_{i_1, \dots, i_m}$ is precisely the same linear combination of the forms $df_{j_1, \dots, j_m}$.
Since the differential is linear on 0-forms, $\eta_{i_1, \dots, i_m} = dg_{i_1, \dots, i_m}$. Therefore, all
versions of $d\omega$ may be reduced to the same form and hence are equal.
For the next example, we introduce the following notation and terminology
from classical vector analysis.
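13.1.14 Definition. The curl of a $C^1$ vector field $\vec F = (f_1, f_2, f_3)$ on an open
subset of $\mathbb{R}^3$ is the vector
\[ \operatorname{curl} \vec F = (\partial_2 f_3 - \partial_3 f_2)\, e^1 + (\partial_3 f_1 - \partial_1 f_3)\, e^2 + (\partial_1 f_2 - \partial_2 f_1)\, e^3. \]
The divergence of a $C^1$ vector field $\vec F = (f_1, \dots, f_n)$ on an open subset of $\mathbb{R}^n$
is defined by
\[ \operatorname{div} \vec F = \sum_{i=1}^n \partial_i f_i. \]
If $\omega = \sum_{j=1}^n f_j\, dx_j$ we define $\operatorname{div} \omega = \operatorname{div} \vec F$.
♦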
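13.1.15 Example. In $\mathbb{R}^3$,
(a)
\[ \begin{aligned} d(f_1\, dx_1 + f_2\, dx_2 + f_3\, dx_3) &= (\partial_1 f_1\, dx_1 + \partial_2 f_1\, dx_2 + \partial_3 f_1\, dx_3) \wedge dx_1 \\ &\quad + (\partial_1 f_2\, dx_1 + \partial_2 f_2\, dx_2 + \partial_3 f_2\, dx_3) \wedge dx_2 \\ &\quad + (\partial_1 f_3\, dx_1 + \partial_2 f_3\, dx_2 + \partial_3 f_3\, dx_3) \wedge dx_3 \\ &= (\partial_2 f_3 - \partial_3 f_2)\, dx_{2,3} + (\partial_3 f_1 - \partial_1 f_3)\, dx_{3,1} + (\partial_1 f_2 - \partial_2 f_1)\, dx_{1,2} \\ &= (e^1 \cdot \operatorname{curl} \vec F)\, dx_{2,3} + (e^2 \cdot \operatorname{curl} \vec F)\, dx_{3,1} + (e^3 \cdot \operatorname{curl} \vec F)\, dx_{1,2}. \end{aligned} \]
(b)
\[ \begin{aligned} d(f_3\, dx_1 \wedge dx_2 + f_1\, dx_2 \wedge dx_3 + f_2\, dx_3 \wedge dx_1) &= (\partial_1 f_3\, dx_1 + \partial_2 f_3\, dx_2 + \partial_3 f_3\, dx_3) \wedge dx_1 \wedge dx_2 \\ &\quad + (\partial_1 f_1\, dx_1 + \partial_2 f_1\, dx_2 + \partial_3 f_1\, dx_3) \wedge dx_2 \wedge dx_3 \\ &\quad + (\partial_1 f_2\, dx_1 + \partial_2 f_2\, dx_2 + \partial_3 f_2\, dx_3) \wedge dx_3 \wedge dx_1 \\ &= (\partial_1 f_1 + \partial_2 f_2 + \partial_3 f_3)\, dx_1 \wedge dx_2 \wedge dx_3 \\ &= (\operatorname{div} \vec F)\, dx_1 \wedge dx_2 \wedge dx_3. \end{aligned} \]
♦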
13.1.16 Theorem. Let $f$ be a 0-form, let $\omega$ and $\eta$ be p-forms, and let $\nu$ be a
q-form, all of class $C^1$ on $S \subseteq \mathbb{R}^n$. Then
(a) $d(a\omega + b\eta) = a\, d\omega + b\, d\eta$, $a, b \in \mathbb{R}$;
(b) $d^2\omega := d(d\omega) = 0$;
(c) $d(\omega \wedge \nu) = (d\omega) \wedge \nu + (-1)^p\, \omega \wedge (d\nu)$;
(d) $d(f\nu) = (df) \wedge \nu + f\, d\nu$.
Proof. Part (a) is clear from the definition of addition and scalar multiplication
of m-forms and the linearity of the differential operator on 0-forms.
For (b), it suffices by linearity to prove that
\[ d\bigl[(df) \wedge dx_{j_1} \wedge dx_{j_2} \wedge \cdots \wedge dx_{j_p}\bigr] = 0. \]
The left side of this equation is
\[ d\Bigl[\sum_{k=1}^n (\partial_k f)\, dx_k \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_p}\Bigr] = \Bigl[\sum_{j=1}^n \sum_{k=1}^n \partial_j \partial_k f\, dx_j \wedge dx_k\Bigr] \wedge dx_{j_1} \wedge \cdots \wedge dx_{j_p}. \]
Since $dx_k \wedge dx_j = -dx_j \wedge dx_k$ and $\partial_j \partial_k f = \partial_k \partial_j f$, the terms in the square
brackets on the right cancel pairwise, producing zero, as required.
To prove (c), let
\[ \omega = \sum_{j \in J_p} f_j\, dx_j \quad \text{and} \quad \nu = \sum_{k \in J_q} g_k\, dx_k. \]
By the product rule for differentials of 0-forms,
\[ \begin{aligned} d(\omega \wedge \nu) &= \sum_{j \in J_p,\, k \in J_q} d(f_j g_k) \wedge dx_j \wedge dx_k \\ &= \sum_{j \in J_p,\, k \in J_q} g_k\, (df_j) \wedge dx_j \wedge dx_k + \sum_{j \in J_p,\, k \in J_q} f_j\, (dg_k) \wedge dx_j \wedge dx_k \\ &= (d\omega) \wedge \nu + (-1)^p\, \omega \wedge (d\nu), \end{aligned} \]
the last equality because p adjacent interchanges are needed to place the form
$dg_k$ in the second sum to the immediate left of $dx_k$.
Part (d) follows from (c) with p = 0.
The Pullback of a Form
Throughout this subsection, U ⊆ Rm and W ⊆ Rn
are open and ϕ : U → W is a C 1 map.
13.1.17 Definition. The pullback by $\phi$ of a $C^1$ function (0-form) $f$ on $W$ is
the 0-form $\phi^*(f)$ on $U$ defined by
\[ \phi^*(f)(u) := f(\phi(u)), \quad u \in U. \]
The pullback by $\phi$ of the 1-form $dx_j$ on $W$ is the 1-form $\phi^*(dx_j)$ on $U$ defined
by
\[ \phi^*(dx_j) := \sum_{i=1}^m \frac{\partial \phi_j}{\partial u_i}\, du_i = d\phi_j, \quad j = 1, \dots, n. \]
The pullback by $\phi$ of the $C^1$ p-form
\[ \omega = \sum_{(j_1, \dots, j_p) \in J_p} f_{j_1, \dots, j_p}\, dx_{j_1} \wedge \cdots \wedge dx_{j_p} \]
on $W$ is the $C^1$ p-form $\phi^*\omega$ on $U$ defined by
\[ \phi^*\omega := \sum_{(j_1, \dots, j_p) \in J_p} \phi^*(f_{j_1, \dots, j_p})\, \phi^*(dx_{j_1}) \wedge \cdots \wedge \phi^*(dx_{j_p}). \]
♦
Arguments similar to those used earlier show that the definition of ϕ∗ ω is
independent of the representation of ω.
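13.1.18 Example. Let $\phi = (\phi_1, \phi_2, \phi_3) : \mathbb{R}^2 \to \mathbb{R}^3$ be $C^1$. Then
(a)
\[ \begin{aligned} \phi^*(f\, dx_1 \wedge dx_2) &= \phi^*(f)\, \phi^*(dx_1) \wedge \phi^*(dx_2) \\ &= (f \circ \phi) \Bigl(\frac{\partial \phi_1}{\partial u_1}\, du_1 + \frac{\partial \phi_1}{\partial u_2}\, du_2\Bigr) \wedge \Bigl(\frac{\partial \phi_2}{\partial u_1}\, du_1 + \frac{\partial \phi_2}{\partial u_2}\, du_2\Bigr) \\ &= (f \circ \phi) \Bigl(\frac{\partial \phi_1}{\partial u_1} \frac{\partial \phi_2}{\partial u_2} - \frac{\partial \phi_2}{\partial u_1} \frac{\partial \phi_1}{\partial u_2}\Bigr)\, du_1 \wedge du_2. \end{aligned} \]
(b)
\[ \begin{aligned} \phi^*(f_1\, dx_1 + f_2\, dx_2 + f_3\, dx_3) &= \phi^*(f_1)\, \phi^*(dx_1) + \phi^*(f_2)\, \phi^*(dx_2) + \phi^*(f_3)\, \phi^*(dx_3) \\ &= (f_1 \circ \phi) \Bigl(\frac{\partial \phi_1}{\partial u_1}\, du_1 + \frac{\partial \phi_1}{\partial u_2}\, du_2\Bigr) + (f_2 \circ \phi) \Bigl(\frac{\partial \phi_2}{\partial u_1}\, du_1 + \frac{\partial \phi_2}{\partial u_2}\, du_2\Bigr) \\ &\quad + (f_3 \circ \phi) \Bigl(\frac{\partial \phi_3}{\partial u_1}\, du_1 + \frac{\partial \phi_3}{\partial u_2}\, du_2\Bigr) \\ &= \Bigl[(f_1 \circ \phi) \frac{\partial \phi_1}{\partial u_1} + (f_2 \circ \phi) \frac{\partial \phi_2}{\partial u_1} + (f_3 \circ \phi) \frac{\partial \phi_3}{\partial u_1}\Bigr]\, du_1 \\ &\quad + \Bigl[(f_1 \circ \phi) \frac{\partial \phi_1}{\partial u_2} + (f_2 \circ \phi) \frac{\partial \phi_2}{\partial u_2} + (f_3 \circ \phi) \frac{\partial \phi_3}{\partial u_2}\Bigr]\, du_2. \end{aligned} \]
♦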
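Part (a) lends itself to a numerical check. The following sketch approximates the Jacobian by central differences and evaluates the coefficient of $du_1 \wedge du_2$ in $\phi^*(f\, dx_1 \wedge dx_2)$; the names `jacobian` and `pullback_dx1_dx2`, the step size `h`, and the sample map are all assumptions made for illustration.

```python
import math


def jacobian(phi, u, h=1e-6):
    """Central-difference Jacobian of phi at u; J[r][i] approximates d(phi_{r+1})/d(u_{i+1})."""
    n = len(phi(u))
    J = [[0.0] * len(u) for _ in range(n)]
    for i in range(len(u)):
        up, um = list(u), list(u)
        up[i] += h
        um[i] -= h
        fp, fm = phi(up), phi(um)
        for r in range(n):
            J[r][i] = (fp[r] - fm[r]) / (2 * h)
    return J


def pullback_dx1_dx2(f, phi, u):
    """Coefficient of du1 ^ du2 in phi^*(f dx1 ^ dx2) at u, for phi: R^2 -> R^3."""
    J = jacobian(phi, u)
    minor = J[0][0] * J[1][1] - J[1][0] * J[0][1]
    return f(phi(u)) * minor


def sample_phi(u):
    # Illustrative map (u1 cos u2, u1 sin u2, u1^2); not taken from the text.
    return (u[0] * math.cos(u[1]), u[0] * math.sin(u[1]), u[0] ** 2)
```

For `sample_phi` and $f(x) = x_3$ the exact coefficient is $u_1^2 \cdot u_1 = u_1^3$, so the value returned at $u = (2, 0.7)$ should be close to 8.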
13.1.19 Theorem. If ω and η are C 1 p-forms and ν is a C 1 q-form, then
(a) ϕ∗ (aω + bη) = aϕ∗ (ω) + bϕ∗ (η), a, b ∈ R;
(b) ϕ∗ (ω ∧ ν) = ϕ∗ (ω) ∧ ϕ∗ (ν);
(c) ϕ∗ (dω) = dϕ∗ (ω);
(d) (ϕ∗ ω)u (a1 , . . . , ap ) = ωϕ(u) (dϕu (a1 ), . . . , dϕu (ap )).
Proof. Part (a) follows directly from the definition of pullback. Part (b) is easily
established for ω = f dxi1 ∧ · · · ∧ dxip and ν = g dxj1 ∧ · · · ∧ dxjq ; bilinearity
of the wedge product and linearity of ϕ∗ then imply that (b) holds generally.
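For (c) it suffices, by linearity of the differential and pullback, to verify
that
\[ \phi^*\bigl(d(f\, dx_{j_1} \wedge \cdots \wedge dx_{j_p})\bigr) = d\phi^*(f\, dx_{j_1} \wedge \cdots \wedge dx_{j_p}), \]
that is,
\[ \sum_{j=1}^n \bigl[(\partial_j f) \circ \phi\bigr]\, \phi^*(dx_j) \wedge \phi^*(dx_{j_1}) \wedge \cdots \wedge \phi^*(dx_{j_p}) = \Bigl[\sum_{i=1}^m \partial_i (f \circ \phi)\, du_i\Bigr] \wedge \phi^*(dx_{j_1}) \wedge \cdots \wedge \phi^*(dx_{j_p}). \tag{13.7} \]
By the chain rule,
\[ \partial_i (f \circ \phi) = \sum_{j=1}^n \bigl[(\partial_j f) \circ \phi\bigr] \frac{\partial \phi_j}{\partial u_i}, \]
hence the right side of (13.7) is
\[ \sum_{j=1}^n \bigl[(\partial_j f) \circ \phi\bigr] \Bigl(\sum_{i=1}^m \frac{\partial \phi_j}{\partial u_i}\, du_i\Bigr) \wedge \phi^*(dx_{j_1}) \wedge \cdots \wedge \phi^*(dx_{j_p}). \]
Recalling the definition of $\phi^*(dx_j)$, we see that the last expression is precisely
the left side of (13.7).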
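To prove (d), let $\omega$ have canonical representation
\[ \omega = \sum_{(i_1, \dots, i_p) \in I_p} f_{i_1, \dots, i_p}\, dx_{i_1} \wedge \cdots \wedge dx_{i_p}. \]
By 13.1.2, it suffices to show that
\[ (\phi^*\omega)_u(e_{\ell_1}, \dots, e_{\ell_p}) = \omega_{\phi(u)}\bigl(d\phi_u(e_{\ell_1}), \dots, d\phi_u(e_{\ell_p})\bigr) \]
for any $(\ell_1, \dots, \ell_p) \in I_p$. The left side of this equation is
\[ \sum_{i \in I_p} \phi^*(f_i)(u)\, \bigl[\phi^*(dx_{i_1}) \wedge \cdots \wedge \phi^*(dx_{i_p})\bigr](e_{\ell_1}, \dots, e_{\ell_p}) \]
and the right side is
\[ \sum_{i \in I_p} f_i(\phi(u))\, \bigl(dx_{i_1} \wedge \cdots \wedge dx_{i_p}\bigr)\bigl(d\phi_u(e_{\ell_1}), \dots, d\phi_u(e_{\ell_p})\bigr). \]
Hence it suffices to prove that
\[ \bigl[\phi^*(dx_{i_1}) \wedge \cdots \wedge \phi^*(dx_{i_p})\bigr](e_{\ell_1}, \dots, e_{\ell_p}) = \bigl(dx_{i_1} \wedge \cdots \wedge dx_{i_p}\bigr)\bigl(d\phi_u(e_{\ell_1}), \dots, d\phi_u(e_{\ell_p})\bigr). \tag{13.8} \]
By multilinearity,
\[ \phi^*(dx_{i_1}) \wedge \cdots \wedge \phi^*(dx_{i_p}) = \sum_{(j_1, \dots, j_p) \in J_p} \frac{\partial \phi_{i_1}}{\partial u_{j_1}} \cdots \frac{\partial \phi_{i_p}}{\partial u_{j_p}}\, du_{j_1} \wedge \cdots \wedge du_{j_p}. \]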
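Now, $du_{j_1} \wedge \cdots \wedge du_{j_p}(e_{\ell_1}, \dots, e_{\ell_p}) \ne 0$ only if the p-tuple $(j_1, \dots, j_p)$ is a
permutation of $(\ell_1, \dots, \ell_p)$. For each such p-tuple define a permutation $\sigma$ of
$(1, \dots, p)$ such that $\ell_k = j_{\sigma(k)}$. Then
\[ du_{j_1} \wedge \cdots \wedge du_{j_p}(e_{\ell_1}, \dots, e_{\ell_p}) = (-1)^\sigma\, du_{\ell_1} \wedge \cdots \wedge du_{\ell_p}(e_{\ell_1}, \dots, e_{\ell_p}) = (-1)^\sigma \]
and
\[ \frac{\partial \phi_{i_1}}{\partial u_{j_1}} \cdots \frac{\partial \phi_{i_p}}{\partial u_{j_p}} = \frac{\partial \phi_{i_1}}{\partial u_{\ell_{\tau(1)}}} \cdots \frac{\partial \phi_{i_p}}{\partial u_{\ell_{\tau(p)}}} = \frac{\partial \phi_{i_{\sigma(1)}}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{i_{\sigma(p)}}}{\partial u_{\ell_p}}, \]
where $\tau = \sigma^{-1}$. Thus the left side of (13.8) is
\[ \bigl[\phi^*(dx_{i_1}) \wedge \cdots \wedge \phi^*(dx_{i_p})\bigr](e_{\ell_1}, \dots, e_{\ell_p}) = \sum_\sigma (-1)^\sigma \frac{\partial \phi_{i_{\sigma(1)}}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{i_{\sigma(p)}}}{\partial u_{\ell_p}}, \tag{13.9} \]
where the sum is taken over all permutations $\sigma$ of $(1, \dots, p)$.
On the other hand, since
\[ d\phi_u(e_{\ell_j}) = \partial_{\ell_j} \phi(u) = \sum_{i=1}^n \frac{\partial \phi_i(u)}{\partial u_{\ell_j}}\, e^i, \]
the right side of (13.8) is
\[ \bigl(dx_{i_1} \wedge \cdots \wedge dx_{i_p}\bigr)\Bigl(\sum_{j=1}^n \frac{\partial \phi_j}{\partial u_{\ell_1}}\, e^j, \dots, \sum_{j=1}^n \frac{\partial \phi_j}{\partial u_{\ell_p}}\, e^j\Bigr) = \sum_{(j_1, \dots, j_p) \in J_p} \frac{\partial \phi_{j_1}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{j_p}}{\partial u_{\ell_p}}\, dx_{i_1} \wedge \cdots \wedge dx_{i_p}(e^{j_1}, \dots, e^{j_p}). \tag{13.10} \]
As above, $dx_{i_1} \wedge \cdots \wedge dx_{i_p}(e^{j_1}, \dots, e^{j_p}) \ne 0$ only if the p-tuple $(j_1, \dots, j_p)$ is
a permutation of $(i_1, \dots, i_p)$. For each such p-tuple, define a permutation $\sigma$ of
$(1, \dots, p)$ such that $j_k = i_{\sigma(k)}$. Then
\[ dx_{i_1} \wedge \cdots \wedge dx_{i_p}(e^{j_1}, \dots, e^{j_p}) = dx_{i_1} \wedge \cdots \wedge dx_{i_p}\bigl(e^{i_{\sigma(1)}}, \dots, e^{i_{\sigma(p)}}\bigr) = (-1)^\sigma \]
and
\[ \frac{\partial \phi_{j_1}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{j_p}}{\partial u_{\ell_p}} = \frac{\partial \phi_{i_{\sigma(1)}}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{i_{\sigma(p)}}}{\partial u_{\ell_p}}, \]
so (13.10) reduces to
\[ \sum_\sigma (-1)^\sigma \frac{\partial \phi_{i_{\sigma(1)}}}{\partial u_{\ell_1}} \cdots \frac{\partial \phi_{i_{\sigma(p)}}}{\partial u_{\ell_p}}, \]
where the sum is taken over all permutations of $(1, \dots, p)$. As this is precisely
(13.9), the proof is complete.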
Exercises
1. Let $T_j \in L(\mathbb{R}^n, \mathbb{R})$, $j = 1, \dots, m$. Which of the following functions is
multilinear on $\mathbb{R}^n$?
(a) $M(x^1, \dots, x^m) := \sum_{i=1}^m T_i(x^i)$. (b) $M(x^1, \dots, x^m) := \prod_{i=1}^m T_i(x^i)$.
2. For fixed c = (c1 , c2 ), d = (d1 , d2 ) ∈ R2 define
M (x, y) := (c · x)(d · y) − (c · y)(d · x), x, y ∈ R2 .
(a) Show that M is an alternating multilinear functional on R2 .
(b) Express M in terms of differentials, as in 13.1.3
3.S Let M (a1 , . . . , am ) be a multilinear functional on Rn with the property
that M (a1 , . . . , am ) = 0 whenever two of the vectors aj are equal. Prove
that M is alternating.
4. Let M be an alternating m-multilinear functional on Rn . Show that if
the vectors a1 , . . . , am are linearly dependent, then M (a1 , . . . , am ) = 0.
5. Let $M(a^1, \dots, a^m)$ be an m-multilinear functional on $\mathbb{R}^n$. Define
\[ \operatorname{Alt}(M)(a^1, \dots, a^m) = \frac{1}{m!} \sum_\sigma (-1)^\sigma M\bigl(a^{\sigma(1)}, \dots, a^{\sigma(m)}\bigr), \]
where the sum is taken over all permutations $\sigma$ of $(1, \dots, m)$. Show
that $\operatorname{Alt}(M)$ is an alternating m-multilinear functional on $\mathbb{R}^n$ and that
$\operatorname{Alt}(M) = M$ iff $M$ is alternating.
6. Prove that the vector space of m-forms on $S$ has dimension $\binom{n}{m}$.
7. Find the canonical representation of the following forms in R3 :
(a)S (f1 dx1 + f2 dx2 + f3 dx3 ) ∧ (g1 dx1 + g2 dx2 + g3 dx3 ).
(b) (f1 dx1 + f2 dx2 + f3 dx3 ) ∧ (g1 dx1 + g2 dx2 + g3 dx3 )
∧(h1 dx1 + h2 dx2 + h3 dx3 ).
8. Find the canonical representation of the following forms in R5 :
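(a) $(-dx_1 + dx_2 + dx_3) \wedge (dx_1 - 2dx_2 + 3dx_3)$.
(b)S $(dx_1 + dx_2) \wedge (dx_1 - dx_3) \wedge (dx_2 + 2dx_3)$.
(c) $dx_1 \wedge (dx_1 \wedge dx_3 + 3dx_5 \wedge dx_4)$.
(d) $(dx_1 \wedge dx_2 + dx_1 \wedge dx_3) \wedge (dx_4 \wedge dx_3 + dx_2 \wedge dx_5) \wedge (dx_3 \wedge dx_1 + dx_4 \wedge dx_1)$.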
9. Find the canonical representation of the following forms in Rn :
(a)S dx2 ∧ dx4 ∧ · · · ∧ dx2k ∧ dx1 ∧ dx3 ∧ · · · ∧ dx2k−1 , 2k ≤ n.
(b) dx1 ∧ dx5 ∧ · · · ∧ dx4k−3 ∧ dx3 ∧ dx7 ∧ · · · ∧ dx4k−1
∧ dx2 ∧ dx6 ∧ · · · ∧ dx4k−2 ∧ dx4 ∧ dx8 ∧ · · · ∧ dx4k , 4k ≤ n.
10. Show that if ω is an m-form and m is odd, then ω ∧ ω = 0. Find an
example of a 2-form ω in R4 such that ω ∧ ω 6= 0.
11. Use the method of 13.1.12 to verify the determinants
1
(a) −1
1
−1
1
−1
0 2
3
1
1 = -4. (b)S 2 −1 0 = 9. (c) 0
1 −1
0
2 1
−1
12. Show directly that in Rn , d f ( dx1 ∧ · · · ∧ dxn ) = 0.
1 2
−1 1 = -6.
1 0
13. Let $f : \mathbb{R} \to \mathbb{R}$ be $C^1$ and define $g_j(x) = f(x_j)$. Find the canonical
representation of
(a)S $d\Bigl(\sum_{j=1}^n g_j\, dx_j\Bigr)$. (b) $d\Bigl(\sum_{j=1}^n g_{n-j+1}\, dx_j\Bigr)$.
14. Find d(f dg), where f is C 1 and g is C 2 on W .
15.S A form η on W is exact if η = dω for some form ω on W . Prove that if
η is exact and dν = 0, then η ∧ ν is exact.
16. Let $f$ and $\omega := \sum_{i=1}^n f_i\, dx_i$ be $C^1$ on an open set $W \subseteq \mathbb{R}^n$. Show that if
$d(f\omega) = 0$, then $f\omega \wedge d\omega = (df) \wedge \omega \wedge \omega$.
17.S Let $U \subseteq \mathbb{R}^k$, $V \subseteq \mathbb{R}^\ell$, and $W \subseteq \mathbb{R}^n$ be open and let $\phi : U \to V$ and
ψ : V → W be C 1 . If ω is an m-form on W , prove that (ψ ◦ ϕ)∗ ω =
ϕ∗ (ψ ∗ ω). Hint. Use 13.1.19(d).
18. Let U, W ⊆ Rn be open and let ϕ : U → W and f : W → Rn be C 1 .
Show that ϕ∗ (dx1 ∧ · · · ∧ dxn ) = det(ϕ0 ) du1 ∧ · · · ∧ dun .
19.S Let F = (f1 , f2 , f3 ) be C 1 on R3 and homogeneous of degree k ∈ N. (See
Exercise 9.3.15.) Let ω = f1 dx1 + f2 dx2 + f3 dx3 . Show that if dω = 0,
then ω = df where f (x) = (k + 1)−1 F (x) · x.
13.2 Integrals on Parameterized Surfaces
Recall that the length of a parameterized curve $C$ in $\mathbb{R}^n$ is, by definition,
a limit of lengths of inscribed polygonal lines. The proof of 12.2.4 shows
that if the curve $C$ is $C^1$, then its length may also be approximated by
tangent line segments. This idea may be extended to higher dimensions, using
tangent parallelepipeds to approximate surface area. This leads ultimately to
the definition of the integral of a function or a form on a surface.
Area of a Parallelepiped
13.2.1 Definition. The parallelepiped spanned by vectors $a^1, \dots, a^m \in \mathbb{R}^n$ is
the set
\[ P = P(a^1, \dots, a^m) := \Bigl\{\sum_{i=1}^m t_i a^i : 0 \le t_i \le 1\Bigr\}. \]
The volume vol(P) of $P$ is its n-dimensional Lebesgue measure.
♦
For m = n, there is a simple formula for the volume:
13.2.2 Lemma. $\operatorname{vol}(P) = \bigl|\det\bigl[a^1 \cdots a^n\bigr]\bigr|$.
Proof. Denote by $T \in L(\mathbb{R}^n, \mathbb{R}^n)$ the linear mapping with matrix $A := \bigl[a^1 \cdots a^n\bigr]$.
Since $T(e^j) = a^j$, a typical member of $P := P(a^1, \dots, a^n)$ may
be expressed as
\[ \sum_{i=1}^n t_i a^i = T\Bigl(\sum_{i=1}^n t_i e^i\Bigr) = T(t_1, \dots, t_n) \in T([0, 1]^n). \]
By 11.6.3 and 11.6.9, $\lambda^n(P) = \lambda^n(T([0, 1]^n)) = |\det A|\, \lambda^n([0, 1]^n) = |\det A|$.
If m < n, then $\lambda^n(P) = 0$ but $P$ may still have positive m-dimensional
Lebesgue measure, as defined in 11.6.9. Specifically, let $V$ denote the linear
span of the vectors $a^1, \dots, a^m$ and choose an orthonormal basis $v^1, \dots, v^n$
of $\mathbb{R}^n$ such that $v^1, \dots, v^m$ is a basis for $V$. Define $T \in L(V, \mathbb{R}^m)$ so that
$T(v^j) = e^j$, 1 ≤ j ≤ m. Thus $T$ “rotates” and/or “reflects” $V$ onto $\mathbb{R}^m \times \{0\}$.
The area of $P$ is then defined by
\[ \operatorname{area} P(a^1, \dots, a^m) = \lambda^m\bigl(T(P(a^1, \dots, a^m))\bigr). \]
A concrete value for this area is given in the following theorem.
13.2.3 Theorem. Let m < n, $a^1, \dots, a^m \in \mathbb{R}^n$, and $A = \bigl[a^1 \cdots a^m\bigr]$. Then
\[ \operatorname{area} P(a^1, \dots, a^m) = \sqrt{\det(A^tA)} = \Bigl(\sum_{i \in I_m} \bigl[\det A_i\bigr]^2\Bigr)^{1/2}. \]
Proof. Set $b^j = T(a^j)$ and $B = \bigl[b^1 \cdots b^m\bigr]$. By linearity of $T$,
\[ T\bigl(P(a^1, \dots, a^m)\bigr) = P(b^1, \dots, b^m) \subseteq \mathbb{R}^m, \]
hence, by 13.2.2,
\[ \operatorname{area} P(a^1, \dots, a^m) = \lambda^m\bigl(P(b^1, \dots, b^m)\bigr) = |\det B|. \]
Now, the (i, j)th entry of $B^tB$ is $b^i \cdot b^j$, and because $T$ preserves inner products
this is the same as $a^i \cdot a^j$. Therefore, $B^tB = A^tA$, hence
\[ |\det B| = \sqrt{(\det B^t)(\det B)} = \sqrt{\det(B^tB)} = \sqrt{\det(A^tA)}. \]
This proves the first equality in the theorem. The second equality is from
13.1.6.
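Both expressions in 13.2.3 are straightforward to compute for small examples. The sketch below is illustrative only; `area_gram` and `area_minors` are hypothetical names, vectors are tuples, and the determinant uses cofactor expansion.

```python
import math
from itertools import combinations


def det(mat):
    """Determinant of a square matrix (list of row lists) by cofactor expansion."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for c, entry in enumerate(mat[0]):
        minor = [row[:c] + row[c + 1:] for row in mat[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def area_gram(vectors):
    """sqrt(det(A^t A)), where the columns of A are the given vectors."""
    m = len(vectors)
    gram = [
        [sum(x * y for x, y in zip(vectors[r], vectors[c])) for c in range(m)]
        for r in range(m)
    ]
    return math.sqrt(det(gram))


def area_minors(vectors):
    """Square root of the sum of squared m x m minors det(A_i), i increasing."""
    m, n = len(vectors), len(vectors[0])
    total = 0
    for i in combinations(range(n), m):
        A_i = [[a[r] for a in vectors] for r in i]  # rows i of A
        total += det(A_i) ** 2
    return math.sqrt(total)
```

For two vectors in $\mathbb{R}^3$ both functions return the length of the cross product; for example, $(1, 0, 0)$ and $(1, 2, 2)$ span a parallelogram of area $\sqrt{8}$.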
Area of a Parameterized Surface
Let ϕ : U → Rn be a parameterized m-surface in Rn with image S and
let u = (u1 , . . . , um ) ∈ U and a = ϕ(u) ∈ S. Choose a small m-dimensional
interval
Q = [u1 , u1 + ∆u1 ] × · · · × [um , um + ∆um ] ⊆ U, ∆uj > 0.
As noted in Chapter 12, the line segments u + tej in U map onto curves in S
with tangent vectors
dϕu (ej ) = ∂j ϕ(u),
1 ≤ j ≤ m,
at $\phi(u)$. The matrix with columns $\partial_j \phi(u)$ is $\phi'(u)$, the Jacobian matrix of $\phi$ at
$u$. By 13.2.3, the parallelepiped spanned by the vectors $\Delta u_j\, \partial_j \phi(u)$ therefore
has area
\[ \sqrt{\det\bigl(\phi'(u)^t \phi'(u)\bigr)}\; \Delta u_1 \Delta u_2 \cdots \Delta u_m, \]
which is taken as an approximation of the area of the surface element $\phi(Q)$.
FIGURE 13.1: Parallelogram approximation to $\phi(Q)$.
Partitioning $U$ into a grid $\mathcal{Q}$ of intervals $Q$ and summing these expressions,
we obtain the Riemann sums
\[ \sum_{Q} \sqrt{\det\bigl(\phi'(u)^t \phi'(u)\bigr)}\; \Delta u_1 \Delta u_2 \cdots \Delta u_m. \]
It is reasonable then to define the area of $S$ as the limit of these sums as the
diameters of the intervals $Q$ tend to zero, that is,
\[ \operatorname{area}(\phi) := \int_U \sqrt{\det\bigl(\phi'(u)^t \phi'(u)\bigr)}\, du. \tag{13.11} \]
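The Riemann sums behind (13.11) can be evaluated directly for a 2-surface in $\mathbb{R}^3$. The sketch below is an illustration under assumptions: it uses a midpoint grid, the m = 2 identity $\det(\phi'^t\phi') = \|\partial_1\phi\|^2\|\partial_2\phi\|^2 - (\partial_1\phi \cdot \partial_2\phi)^2$, and hand-coded partial derivatives of a torus parametrization with hypothetical radii `a` and `b`; none of the function names come from the text.

```python
import math


def gram_sqrt(d1, d2):
    """sqrt(det(phi'^t phi')) for m = 2, from the two tangent vectors d1, d2."""
    e = sum(x * x for x in d1)
    g = sum(y * y for y in d2)
    c = sum(x * y for x, y in zip(d1, d2))
    return math.sqrt(max(e * g - c * c, 0.0))


def surface_area(d1, d2, u_range, v_range, steps=200):
    """Midpoint Riemann sum approximating (13.11) on a rectangle of parameters."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for i in range(steps):
        u = u0 + (i + 0.5) * du
        for k in range(steps):
            v = v0 + (k + 0.5) * dv
            total += gram_sqrt(d1(u, v), d2(u, v)) * du * dv
    return total


def torus_partials(a, b):
    """Partial derivatives of (a cos p, (b + a sin p) cos t, (b + a sin p) sin t)."""
    def d_p(p, t):
        return (-a * math.sin(p), a * math.cos(p) * math.cos(t), a * math.cos(p) * math.sin(t))

    def d_t(p, t):
        return (0.0, -(b + a * math.sin(p)) * math.sin(t), (b + a * math.sin(p)) * math.cos(t))

    return d_p, d_t
```

With `a = 1`, `b = 2` and both parameters running over $(0, 2\pi)$, the sum should approach the classical torus area $4\pi^2 ab$.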
Integral of a Function on a Parameterized Surface
Let $f$ be a continuous, real-valued function on $S = \phi(U)$. Motivated by
(13.11) we define the surface integral of $f$ over $\phi$ by
\[ \int_\phi f\, dS = \int_U (f \circ \phi)(u) \sqrt{\det\bigl(\phi'(u)^t \phi'(u)\bigr)}\, du \tag{13.12} \]
whenever the right side exists. In particular,
\[ \operatorname{area}(S) = \int_\phi 1\, dS. \]
The integral on the right in (13.12) may be interpreted as a Lebesgue integral
or (if ϕ has compact support) as a Riemann integral. In the latter case, it is a
limit of Riemann sums
\[ \sum_{Q} (f \circ \phi)(u) \sqrt{\det\bigl(\phi'(u)^t \phi'(u)\bigr)}\; \Delta u_1 \cdots \Delta u_m. \tag{13.13} \]
This interpretation has important physical applications. For example, if $f$
is the density in mass per unit area of a curved sheet $S$ in $\mathbb{R}^3$, then each term of (13.13)
approximates the mass of the surface element
\[ \phi\Bigl(\Bigl\{u + \sum_j t_j e_j : 0 \le t_j \le \Delta u_j\Bigr\}\Bigr), \]
hence $\int_\phi f\, dS$ gives the mass of $S$. For another example, let $f(x)$ denote the
temperature of the sheet at point $x \in S$. Then $[\operatorname{area}(S)]^{-1} \int_\phi f\, dS$ gives the
average temperature of the sheet.
To evaluate (13.12), it is useful to note that since $\phi' = \bigl[\partial_1 \phi \cdots \partial_m \phi\bigr]$,
by 13.1.6
\[ \det\bigl(\phi'(u)^t \phi'(u)\bigr) = \sum_{(i_1, \dots, i_m) \in I_m} \Bigl[\frac{\partial(\phi_{i_1}, \dots, \phi_{i_m})}{\partial(u_1, \dots, u_m)}(u)\Bigr]^2. \tag{13.14} \]
The following instances of (13.12) are of particular interest.
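13.2.4 Special Cases.
(a) m = 1: Then $\det\bigl(\phi'(u)^t \phi'(u)\bigr) = \|\phi'(u)\|^2$, hence
\[ \int_\phi f\, dS = \int_U (f \circ \phi)(u)\, \|\phi'(u)\|\, du = \int_\phi f\, ds, \]
which is the line integral of Section 12.2.
(b) m = 2: In this case
\[ \det\bigl(\phi'(u)^t \phi'(u)\bigr) = \det\Bigl(\begin{bmatrix} (\partial_1 \phi)^t \\ (\partial_2 \phi)^t \end{bmatrix} \begin{bmatrix} \partial_1 \phi & \partial_2 \phi \end{bmatrix}\Bigr) = \begin{vmatrix} \partial_1 \phi \cdot \partial_1 \phi & \partial_1 \phi \cdot \partial_2 \phi \\ \partial_1 \phi \cdot \partial_2 \phi & \partial_2 \phi \cdot \partial_2 \phi \end{vmatrix}, \]
hence
\[ \int_\phi f\, dS = \int_U (f \circ \phi) \sqrt{\|\partial_1 \phi\|^2 \|\partial_2 \phi\|^2 - (\partial_1 \phi \cdot \partial_2 \phi)^2}\, du. \]
(c) m = n − 1: Here
\[ \det\bigl(\phi'(u)^t \phi'(u)\bigr) = \sum_{i=1}^n \Bigl[\frac{\partial(\phi_1, \dots, \widehat{\phi_i}, \dots, \phi_n)}{\partial(u_1, \dots, u_{n-1})}(u)\Bigr]^2 = \|\partial \phi^\perp(u)\|^2, \]
hence
\[ \int_\phi f\, dS = \int_U (f \circ \phi)(u)\, \|\partial \phi^\perp(u)\|\, du. \]
(d) $\phi(u_1, \dots, u_{n-1}) = \bigl(u_1, \dots, u_{n-1}, g(u_1, \dots, u_{n-1})\bigr)$ (the graph of $g$):
Let $i = (1, \dots, i - 1, i + 1, \dots, n)$. Then
\[ \frac{\partial(\phi_1, \dots, \widehat{\phi_i}, \dots, \phi_n)}{\partial(u_1, \dots, u_{n-1})} = \det \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \\ \partial_1 g & \partial_2 g & \cdots & \partial_{n-1} g \end{bmatrix}_i = \begin{cases} (-1)^{n-1+i}\, \partial_i g, & i < n, \\ 1, & i = n, \end{cases} \]
hence
\[ \det\bigl(\phi'(u)^t \phi'(u)\bigr) = 1 + \|\nabla g(u)\|^2 \tag{13.15} \]
and
\[ \int_\phi f\, dS = \int_U (f \circ \phi)(u) \sqrt{1 + \|\nabla g(u)\|^2}\, du. \]
♦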
13.2.5 Example. Let $S$ be the following portion of an n-dimensional cone:
\[ S = \Bigl\{(x_1, \dots, x_{n+1}) : x_{n+1}^2 = \sum_{i=1}^n x_i^2,\ 0 < x_{n+1} < 1\Bigr\}. \]
Then $S$ is parameterized by
\[ \phi(x) = \bigl(x, g(x)\bigr), \quad g(x) := \|x\|, \quad x := (x_1, \dots, x_n), \]
where $\nabla g(x) = x/\|x\|$. If $f$ is of the form $f(x) = h(\|x\|)$, then, by Exercise 11.6.3,
\[ \int_\phi f\, dS = \sqrt{2}\, n\, \alpha_n \int_0^1 h(r)\, r^{n-1}\, dr. \]
In particular, taking h = 1,
\[ \operatorname{area}(S) = \sqrt{2}\, \alpha_n. \]
♦
The following result will be needed later to construct the integral of a
function on a general m-surface. It asserts that the integral over a parameterized
surface ϕ is invariant under a change of parameter and hence may be viewed
as a construct intrinsic to the image of ϕ.
13.2.6 Proposition. Let $U$ and $V$ be open subsets of $\mathbb{R}^m$, $\alpha : V \to U$ a $C^1$ function with $C^1$ inverse, and $\varphi : U \to \mathbb{R}^n$ a parameterized $m$-surface. Then $\psi := \varphi\circ\alpha$ is a parameterized $m$-surface and $\int_\varphi f\,dS = \int_\psi f\,dS$.
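Proof. By the chain rule, $\psi'(v) = \varphi'(u)\alpha'(v)$, where $u = \alpha(v)$, hence
\[
\det\big(\psi'(v)^t\psi'(v)\big) = \det\big(\alpha'(v)^t\varphi'(u)^t\varphi'(u)\alpha'(v)\big) = J_\alpha(v)^2\,\det\big(\varphi'(u)^t\varphi'(u)\big).
\]
Therefore, by the change of variables theorem,
\begin{align*}
\int_\psi f\,dS &= \int_V (f\circ\psi)(v)\sqrt{\det\big(\psi'(v)^t\psi'(v)\big)}\;dv\\
&= \int_V (f\circ\varphi)(\alpha(v))\sqrt{\det\big(\varphi'(\alpha(v))^t\varphi'(\alpha(v))\big)}\;|J_\alpha(v)|\,dv\\
&= \int_U (f\circ\varphi)(u)\sqrt{\det\big(\varphi'(u)^t\varphi'(u)\big)}\;du\\
&= \int_\varphi f\,dS.
\end{align*}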
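The invariance in 13.2.6 can be illustrated numerically in the case $m = 1$. The sketch below assumes the upper unit semicircle $\varphi(u) = (\cos u, \sin u)$ on $(0,\pi)$, the reparametrization $\alpha(v) = v^2$ on $(0,\sqrt{\pi})$, and $f(x,y) = y$; all of these choices and names are illustrative, not taken from the text.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def phi(u):
    """Assumed curve: the upper unit semicircle, 0 < u < pi."""
    return (math.cos(u), math.sin(u))

def phi_speed(u):
    """||phi'(u)||, which is identically 1 for the unit circle."""
    return 1.0

def alpha(v):
    """Assumed change of parameter, mapping (0, sqrt(pi)) onto (0, pi)."""
    return v * v

def alpha_prime(v):
    return 2.0 * v

def f(x, y):
    return y

def integrand_phi(u):
    return f(*phi(u)) * phi_speed(u)

def integrand_psi(v):
    # psi = phi o alpha, so ||psi'(v)|| = ||phi'(alpha(v))|| * |alpha'(v)|
    return f(*phi(alpha(v))) * phi_speed(alpha(v)) * abs(alpha_prime(v))

over_phi = midpoint(integrand_phi, 0.0, math.pi)
over_psi = midpoint(integrand_psi, 0.0, math.sqrt(math.pi))
```

Both `over_phi` and `over_psi` approximate $\int_0^\pi \sin u\,du = 2$, as the proposition predicts.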
13.2.7 Remark. The material in this section holds, in particular, for a local parametrization of an $m$-surface as well as a local parametrization of an $(n-1)$-surface-with-boundary. In the latter case, the domain of the parametrization at a boundary point is an open set in $\mathbb{H}^{n-1}$.
♦
Integration of a Form on a Parameterized m-Surface
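13.2.8 Definition. Let $\varphi : U \to \mathbb{R}^n$ be a parameterized orientable $m$-surface in $\mathbb{R}^n$ and let
\[
\omega = \sum_{(j_1,\dots,j_m)\in J_m} f_{j_1,\dots,j_m}\,dx_{j_1}\wedge\cdots\wedge dx_{j_m}
\]
be a continuous $m$-form on $S := \varphi(U)$. The integral of $\omega$ over $\varphi$ is defined by
\[
\int_\varphi \omega = \int_S \omega = \mathrm{sign}(\varphi)\int_U \omega_{\varphi(u)}\big(d\varphi_u(e_1),\dots,d\varphi_u(e_m)\big)\,du.
\]
♦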
The inclusion of $\mathrm{sign}(\varphi)$ corresponds to the familiar convention
\[
\int_b^a f(t)\,dt = -\int_a^b f(t)\,dt
\]
for Riemann integrals, which reflects the fact that the process of Riemann integration respects the natural orientation (ordering) of the interval $[a, b]$.
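Recalling that $d\varphi_u(e_j) = \partial_j\varphi(u)$ and
\[
\big(dx_{j_1}\wedge\cdots\wedge dx_{j_m}\big)\big(\partial_1\varphi(u),\dots,\partial_m\varphi(u)\big) = \frac{\partial(\varphi_{j_1},\dots,\varphi_{j_m})}{\partial(u_1,\dots,u_m)}(u),
\]
we obtain the formula
\[
\int_\varphi \omega = \mathrm{sign}(\varphi)\int_U \sum_{(j_1,\dots,j_m)\in J_m} (f_{j_1,\dots,j_m}\circ\varphi)\,\frac{\partial(\varphi_{j_1},\dots,\varphi_{j_m})}{\partial(u_1,\dots,u_m)}\,du. \tag{13.16}
\]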
The following instances of (13.16) are of particular importance.
13.2.9 Special Cases. Let ϕ be positively oriented.
(a) $m = 1$:
\[
\int_\varphi \sum_{i=1}^{n} f_i\,dx_i = \sum_{i=1}^{n}\int_I f_i\big(\varphi(t)\big)\,\varphi_i'(t)\,dt,
\]
which is the integral of Section 12.2.
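(b) $m = n-1$:
\[
\int_\varphi \sum_{i=1}^{n} f_i\,dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n
= \int_U \sum_{i=1}^{n} (f_i\circ\varphi)\,\frac{\partial(\varphi_1,\dots,\widehat{\varphi_i},\dots,\varphi_n)}{\partial(u_1,\dots,u_{n-1})}\,du.
\]
In particular, for the graph $\varphi(u_1,\dots,u_{n-1}) = \big(u_1,\dots,u_{n-1},\,g(u_1,\dots,u_{n-1})\big)$, we have from (13.15)
\[
\int_\varphi \sum_{i=1}^{n} f_i\,dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n
= \int_U \Big[f_n\circ\varphi + \sum_{i=1}^{n-1}(-1)^{n-1+i}(f_i\circ\varphi)\,\partial_i g\Big]\,du.
\]
(c) $m = 2$, $n = 3$: Let $D_{ij}(u) := \dfrac{\partial(\varphi_i,\varphi_j)}{\partial(u_1,u_2)}$. Then
\[
\int_\varphi f_1\,dx_2\wedge dx_3 + f_2\,dx_1\wedge dx_3 + f_3\,dx_1\wedge dx_2
= \int_U \big[(f_1\circ\varphi)(u)D_{23}(u) + (f_2\circ\varphi)(u)D_{13}(u) + (f_3\circ\varphi)(u)D_{12}(u)\big]\,du.
\]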
(d) $m$-form on parameterized surface $\iota : U \to U$:
\[
\int_\iota \sum_{(j_1,\dots,j_m)\in J_m} g_{j_1,\dots,j_m}\,du_{j_1}\wedge\cdots\wedge du_{j_m} = \int_U \sum_{j\in J_m} g_j(u)\,du.
\]
♦
13.2.10 Notation. For the integral on the left in (d) we write $\int_U$ instead of $\int_\iota$. In particular,
\[
\int_\iota g\,du_{j_1}\wedge\cdots\wedge du_{j_m} = \int_U g(u)\,du.
\]
♦
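13.2.11 Example. Let $S$ be the following portion of a paraboloid:
\[
S = \big\{(x_1,x_2,x_3) : x_1 = x_2^2 + x_3^2,\ 0 < x_1 < 1\big\}.
\]
For purposes of integration, we may consider $S$ to be the image of the parameterized 2-surface
\[
\varphi(t,\theta) = \big(t,\ \sqrt{t}\cos\theta,\ \sqrt{t}\sin\theta\big),\quad 0 < t < 1,\ 0 < \theta < 2\pi,
\]
since there are no contributions to an integral on the set where $\theta = 0$. By 13.2.9(c),
\begin{align*}
\int_S x_2^2x_3\,dx_1\wedge dx_2 &= \int_0^1\!\!\int_0^{2\pi}\big[t^{3/2}\cos^2\theta\sin\theta\big]\,\frac{\partial(\varphi_1,\varphi_2)}{\partial(t,\theta)}\,d\theta\,dt\\
&= -\int_0^1\!\!\int_0^{2\pi} t^2\cos^2\theta\sin^2\theta\,d\theta\,dt\\
&= -\frac{\pi}{12}.
\end{align*}
♦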
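The value $-\pi/12$ in Example 13.2.11 can be confirmed by a direct numerical evaluation of the double integral. The following is a rough sketch using a midpoint rule; the helper names and the grid size are assumptions made for illustration.

```python
import math

def midpoint_2d(f, a, b, c, d, n=400):
    """Midpoint-rule approximation of the integral of f over [a,b] x [c,d]."""
    hx = (b - a) / n
    hy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        for j in range(n):
            y = c + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

def integrand(t, theta):
    # f3 = x2^2 * x3 evaluated on phi(t, theta) = (t, sqrt(t) cos, sqrt(t) sin)
    f3 = t ** 1.5 * math.cos(theta) ** 2 * math.sin(theta)
    # Jacobian d(phi1, phi2)/d(t, theta) = -sqrt(t) * sin(theta)
    jacobian = -math.sqrt(t) * math.sin(theta)
    return f3 * jacobian

approx = midpoint_2d(integrand, 0.0, 1.0, 0.0, 2.0 * math.pi)
exact = -math.pi / 12.0
```

The Jacobian is computed by hand from the parametrization, so the check covers both the determinant and the final integration.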
The following proposition, the analog of 13.2.6 for differential forms, shows
that the definition of integral of a form is invariant under reparametrizations.
13.2.12 Proposition. Let U, V be open connected subsets of Rm , α : V → U
a C 1 function with C 1 inverse and positive Jacobian, and ϕ : U → Rn a
parameterized orientable m-surface. If ω is a continuous m-form on ϕ(U ),
then
\[
\int_\varphi \omega = \int_{\varphi\circ\alpha} \omega.
\]
Proof. Note first that sign(Jα ) is constant since α is C 1 and V is connected.
Let ψ = ϕ ◦ α. By the chain rule and the change of variables theorem,
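\begin{align*}
\int_V (f_{j_1,\dots,j_m}\circ\psi)\,\frac{\partial(\psi_{j_1},\dots,\psi_{j_m})}{\partial(v_1,\dots,v_m)}\,dv
&= \int_V (f_{j_1,\dots,j_m}\circ\varphi\circ\alpha)(v)\,\frac{\partial(\varphi_{j_1},\dots,\varphi_{j_m})}{\partial(u_1,\dots,u_m)}\big(\alpha(v)\big)\,J_\alpha(v)\,dv\\
&= \int_U (f_{j_1,\dots,j_m}\circ\varphi)\,\frac{\partial(\varphi_{j_1},\dots,\varphi_{j_m})}{\partial(u_1,\dots,u_m)}\,du.
\end{align*}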
The conclusion now follows from (13.16) and linearity of the integral.
The final result of this section expresses $\int_\varphi \omega$ as an integral of a form on $U$. It will be needed in the proof of Stokes’s theorem.
13.2.13 Theorem. Let U ⊆ Rm be open and let ϕ : U → Rn be an oriented
parameterized surface. If ω is a C 1 m-form on ϕ(U ), then
\[
\int_\varphi \omega = \mathrm{sign}(\varphi)\int_U \varphi^*\omega.
\]
Proof. By (d) of 13.1.19, if $\iota : U \to U$ denotes the identity map then
\[
\omega_{\varphi(u)}\big(d\varphi_u(e_1),\dots,d\varphi_u(e_m)\big) = (\varphi^*\omega)_u(e_1,\dots,e_m) = (\varphi^*\omega)_u\big(d\iota_u(e_1),\dots,d\iota_u(e_m)\big).
\]
The result now follows directly from the definition of the integral of a form (13.2.8) and 13.2.10.
Exercises
1. Find the area of the following 2-surfaces in R3 .
(a) ϕ(t, θ) = (t cos θ, t sin θ, t), t ∈ (0, 1), θ ∈ (0, 2π).
(b)S ϕ(t, θ) = (t cos θ, t sin θ, θ), 0 < t < 1, 0 < θ < 2π.
(c) $\varphi(\theta,s) = (1-s)\,(a\cos\theta,\ a\sin\theta,\ 0) + s\,(b\cos\theta,\ b\sin\theta,\ 1)$, $0 < s < 1$, $0 < \theta < 2\pi$, $0 < a < b$.
2. Let a1 , . . . , am ∈ Rn be linearly independent and let b ∈ Rn . Define
ϕ : Rm → Rn by
\[
\varphi(u_1,\dots,u_m) = b + \sum_{i=1}^{m} u_i a_i.
\]
(See 12.3.2.) For a continuous function $f$ on $\mathbb{R}^n$, prove that
\[
\int_\varphi f = \sqrt{\det(A^tA)}\int_{\mathbb{R}^m} (f\circ\varphi)(u)\,du,
\]
where $A = [\,a_1 \ \cdots \ a_m\,]_{n\times m}$.
3. Let ϕ be as in Exercise 2. Show that
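\[
\int_\varphi \sum_{\mathbf{i}\in I_m} f_{\mathbf{i}}\,dx_{\mathbf{i}} = \sum_{\mathbf{i}\in I_m}\det(A_{\mathbf{i}})\int_U f_{\mathbf{i}}\circ\varphi\,du.
\]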
4.S Show that the area of the Cartesian product of circles
$\varphi(\theta_1,\dots,\theta_m) = (r_1\cos\theta_1,\ r_1\sin\theta_1,\ \dots,\ r_m\cos\theta_m,\ r_m\sin\theta_m)$, $r_i > 0$,
is (2πr1 )(2πr2 ) · · · (2πrm ).
5. Let ϕ be the product of two circles:
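\[
\varphi(\theta_1,\theta_2) = (r_1\cos\theta_1,\ r_1\sin\theta_1,\ r_2\cos\theta_2,\ r_2\sin\theta_2),\quad r_i > 0,
\]
and let
\[
\omega = f_{12}\,dx_1\wedge dx_2 + f_{13}\,dx_1\wedge dx_3 + f_{14}\,dx_1\wedge dx_4 + f_{23}\,dx_2\wedge dx_3 + f_{24}\,dx_2\wedge dx_4 + f_{34}\,dx_3\wedge dx_4.
\]
Show that
\[
\int_\varphi \omega = r_1r_2\int_0^{2\pi}\!\!\int_0^{2\pi}\big[(f_{13}\circ\varphi)\sin\theta_1\sin\theta_2 - (f_{14}\circ\varphi)\sin\theta_1\cos\theta_2 - (f_{23}\circ\varphi)\cos\theta_1\sin\theta_2 + (f_{24}\circ\varphi)\cos\theta_1\cos\theta_2\big]\,d\theta_1\,d\theta_2.
\]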
6.S (Area of an n-dimensional simplex in Rn+1 ). Use Example 11.5.5 to find
the surface area of
\[
S = \Big\{(x_1,\dots,x_{n+1}) : \sum_{j=1}^{n+1} x_j = 1 \text{ and } x_j \ge 0\Big\}.
\]
FIGURE 13.2: Two dimensional simplex S in R3 .
7.S Let U ⊆ Rn−2 be open and let ψ : U → Rn−1 be a parameterized
(n − 2)-surface in Rn−1 . Let ϕ : U × [0, h] → Rn be the cylinder
$\varphi(u,s) = \big(\psi(u),\ s\big)$, $u \in U$, $0 \le s \le h$.
Show that area(ϕ) = h · area(ψ).
8.S Let ϕ be the cylinder of Exercise 7 for n = 3 and h = 1. Show that
\[
\int_\varphi f_1\,dx_2\wedge dx_3 + f_2\,dx_1\wedge dx_3 + f_3\,dx_1\wedge dx_2 = \int_0^1 \big[(f_1\circ\varphi)\psi_1' + (f_2\circ\varphi)\psi_2'\big]\,dt.
\]
9. Let ψ : [a, b] → R2 be a C 1 curve in R2 and let ϕ : [a, b] × (0, h) → R3
be the cone
\[
\varphi(t,s) = \big((1 - s/h)\psi(t),\ s\big),\quad a \le t \le b,\ 0 < s < h.
\]
Show that the area of $\varphi$ is
\[
\frac{h}{2}\int_a^b \sqrt{\psi_1'(t)^2 + \psi_2'(t)^2 + h^{-2}\big[\psi_1(t)\psi_2'(t) - \psi_2(t)\psi_1'(t)\big]^2}\;dt.
\]
Use this to show that the surface area of a right circular cone with radius $r$ and axis length $h$ is $\pi r\sqrt{r^2 + h^2}$.
10. Let ϕ be the cone of Exercise 9 with h = 1. Show that
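\[
\int_\varphi f_1\,dx_2\wedge dx_3 + f_2\,dx_1\wedge dx_3 + f_3\,dx_1\wedge dx_2
= \frac{1}{2}\int_0^1 \Big\{(f_1\circ\varphi)\psi_1' + (f_2\circ\varphi)\psi_2' + \big[\psi_1(t)\psi_2'(t) - \psi_2(t)\psi_1'(t)\big]\Big\}\,dt.
\]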
11.S Let $\psi : [a,b] \to \mathbb{R}^2$ be a parameterized $C^1$ curve with $\psi_2(t) > 0$ for all $t$. Define
\[
\varphi(t,\theta) = \big(\psi_1(t),\ \psi_2(t)\cos\theta,\ \psi_2(t)\sin\theta\big),\quad t \in I,\ \theta \in (0,2\pi),
\]
which is the parameterized surface of revolution of 12.3.9. Show that
\[
\mathrm{area}(\varphi) = 2\pi\int_a^b \psi_2(t)\,\|\psi'(t)\|\,dt = (2\pi\bar{y})\,\mathrm{length}(\psi), \tag{13.17}
\]
where $(x,y) = \psi$ and $\bar{y} := \dfrac{1}{\mathrm{length}(\psi)}\displaystyle\int_\psi y\,ds$, the $y$-coordinate of the centroid of $\psi$. Use the first part of (13.17) to find the surface area of the torus
\[
\varphi(t,\theta) = \big(a\cos t,\ (b + a\sin t)\cos\theta,\ (b + a\sin t)\sin\theta\big),\quad 0 < \theta, t < 2\pi,
\]
where $0 < a < b$. Show also that the area of the cone in Exercise 9 may be found from (13.17).
12. Let ϕ be the parameterized surface of revolution in Exercise 11 and let
ω := f1 dx2 ∧ dx3 + f2 dx1 ∧ dx3 + f3 dx1 ∧ dx2 . Show that
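\begin{align*}
\int_\varphi \omega &= \int_a^b\!\!\int_0^{2\pi} (f_1\circ\varphi)\,\psi_2(t)\psi_2'(t)\,d\theta\,dt + \int_a^b\!\!\int_0^{2\pi} (f_2\circ\varphi)\,\psi_1'(t)\psi_2(t)\cos\theta\,d\theta\,dt\\
&\quad - \int_a^b\!\!\int_0^{2\pi} (f_3\circ\varphi)\,\psi_1'(t)\psi_2(t)\sin\theta\,d\theta\,dt.
\end{align*}
Show also that if $\psi(t) = (t, g(t))$ (the graph of $g$), then this reduces to
\[
\int_a^b\!\!\int_0^{2\pi} g(t)\big[(f_1\circ\varphi)\,g'(t) + (f_2\circ\varphi)\cos\theta - (f_3\circ\varphi)\sin\theta\big]\,d\theta\,dt.
\]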
13. Use Exercise 12 to evaluate
\[
\text{(a)S}\ \int_S x_1x_3\,dx_1\wedge dx_2,\qquad \text{(b)}\ \int_S x_2x_3\,dx_1\wedge dx_2,\qquad \text{(c)}\ \int_S x_1^2x_2^2\,dx_2\wedge dx_3,
\]
where $S$ is the cone
\[
S = \big\{(x_1,x_2,x_3) : x_1^2 = x_2^2 + x_3^2,\ 0 < x_1 < 1\big\}.
\]
14. Repeat Exercise 13 using the portion of the hyperboloid
\[
S = \big\{(x_1,x_2,x_3) : x_1^2 - x_2^2 - x_3^2 = 1,\ 1 < x_1 < \sqrt{2}\big\}.
\]
15. Let $S$ be the torus given by $x_1^2 + \big(\sqrt{x_2^2 + x_3^2} - b\big)^2 = a^2$, where $0 < a < b$. Use Exercise 12 to evaluate
\[
\text{(a)}\ \int_S x_2\,dx_2\wedge dx_3.\qquad \text{(b)S}\ \int_S x_1\,dx_2\wedge dx_3.\qquad \text{(c)}\ \int_S x_2\,dx_1\wedge dx_3.
\]
13.3
Partitions of Unity
The theorem proved in this section will be used to extend the definition of
the integral to functions and forms on m-surfaces. It will also be needed later
in the proofs of Stokes’s theorem and the divergence theorem.
13.3.1 Definition. The support of a continuous function ψ : Rn → R is
defined by
\[
\mathrm{supp}(\psi) = \mathrm{cl}\,\{x : \psi(x) \ne 0\}.
\]
♦
Thus, by definition of closure, supp(ψ) is the smallest closed set outside of
which ψ is zero.
13.3.2 Partition of Unity. Let K be a compact subset of Rn and let
{Ui : i ∈ I} be an open cover of K. Then there exists a finite subcover
$\{U_1,\dots,U_p\}$ of $K$ and $C^\infty$ functions $\chi_i : \mathbb{R}^n \to [0,+\infty)$, $i = 1,\dots,p$, such that $\mathrm{supp}(\chi_i)$ is compact and contained in $U_i$ and $\sum_{i=1}^{p}\chi_i = 1$ on $K$.
FIGURE 13.3: A partition of unity subordinate to U1 and U2 .
The functions χi are said to form a partition of unity subordinate to the
open sets Ui . They are typically used to patch together local data to form a
global construct such as a surface integral, or to reduce a global problem to a
local one, as in the case of the proof of Stokes’s theorem.
The proof of 13.3.2 requires several lemmas which are of intrinsic interest.
13.3.3 Lemma. Let a < b. Then there exists a C ∞ function h : R → [0, +∞)
such that $h > 0$ on $(a,b)$, and $h = 0$ on $(a,b)^c$.
Proof. Define $h$ by
\[
h(x) = \begin{cases}\exp\big((x-a)^{-1}(x-b)^{-1}\big) & \text{if } a < x < b,\\ 0 & \text{otherwise.}\end{cases}
\]
Clearly, $h^{(m)} = 0$ on $[a,b]^c$ for all $m \ge 0$. Moreover, if $x \in (a,b)$, then $h^{(m)}(x)$ is a sum of terms of the form
\[
\frac{\pm h(x)}{(x-a)^p(x-b)^q},\quad p, q \in \mathbb{Z}^+.
\]
Since the exponent $(x-a)^{-1}(x-b)^{-1}$ is negative on $(a,b)$, by l’Hospital’s rule,
\[
\lim_{x\to a^+}\frac{h(x)}{(x-a)^p(x-b)^q} = 0.
\]
Therefore, $\lim_{x\to a} h^{(m)}(x) = 0$, and an induction argument then shows that $h^{(m)}(a) = 0$ for all $m$. A similar argument holds at the point $b$. Thus $h$ is $C^\infty$ on $\mathbb{R}$.
FIGURE 13.4: The functions h and g.
13.3.4 Lemma. Let a < b. Then there exists a C ∞ function g : R → R such
that 0 ≤ g ≤ 1, g = 0 on (−∞, a], and g = 1 on [b, +∞).
Proof. Let h be the function in 13.3.3. Then
\[
g(x) := \Big[\int_a^b h\Big]^{-1}\int_a^x h
\]
has the required properties.
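The constructions in 13.3.3 and 13.3.4 are concrete enough to evaluate numerically. The sketch below is only an illustration: the default endpoints $a = 0$, $b = 1$, the helper `integral_of_h`, and the midpoint quadrature are assumptions, and the integral defining $g$ is approximated rather than computed exactly.

```python
import math

def h(x, a=0.0, b=1.0):
    """The bump of 13.3.3: positive on (a, b), zero elsewhere."""
    if a < x < b:
        return math.exp(1.0 / ((x - a) * (x - b)))
    return 0.0

def integral_of_h(lo, hi, a, b, n=4000):
    """Midpoint-rule approximation of the integral of h over [lo, hi]."""
    if hi <= lo:
        return 0.0
    step = (hi - lo) / n
    return step * sum(h(lo + (k + 0.5) * step, a, b) for k in range(n))

def g(x, a=0.0, b=1.0):
    """The smooth step of 13.3.4: 0 on (-inf, a], 1 on [b, +inf)."""
    x = min(max(x, a), b)  # h vanishes outside (a, b), so clamping is harmless
    return integral_of_h(a, x, a, b) / integral_of_h(a, b, a, b)
```

For instance, `g(-1.0)` returns 0, `g(2.0)` returns 1, and by the symmetry of `h` about the midpoint of the interval, `g(0.5)` is approximately one half.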
13.3.5 Lemma. Let I = (a1 , b1 ) × · · · × (an , bn ). Then there exists a C ∞
function f : Rn → R such that f > 0 on I and f = 0 on I c .
Proof. For each j, let hj : R → [0, +∞) be a C ∞ function such that hj > 0
on (aj , bj ) and hj = 0 on (aj , bj )c . The function
f (x1 , . . . , xn ) := h1 (x1 ) · · · hn (xn )
then satisfies the requirements.
For the next lemma we define the open cube with center x ∈ Rn and edge
2r by
{y ∈ Rn : xj − r < yj < xj + r, j = 1, . . . , n} .
13.3.6 Lemma. Let K ⊆ U ⊆ Rn , where K is compact and U is open. Then
there exists a C ∞ function ψ : Rn → [0, 1] such that supp(ψ) ⊆ U and ψ = 1
on K.
Proof. For each x ∈ K, let Vx be an open cube with center x and edge 2r
such that cl Vx ⊆ U and let Wx ⊆ Vx denote the concentric open cube with
center x and edge r. Since K is compact, there exist finitely many cubes Wx
whose union contains K. Denote these cubes by W1 , . . . , Wm and denote the
corresponding cubes $V_x$ by $V_1,\dots,V_m$. (See Figure 13.5.) By 13.3.5, for each $i$ there exists a $C^\infty$ function $f_i : \mathbb{R}^n \to \mathbb{R}$ such that $f_i > 0$ on $W_i$ and $f_i = 0$ on $W_i^c$. Set
\[
f = \sum_{i=1}^{m} f_i,\quad V = \bigcup_{i=1}^{m} V_i,\quad\text{and}\quad W = \bigcup_{i=1}^{m} W_i.
\]
Then $f$ is nonnegative and $C^\infty$ on $\mathbb{R}^n$, $f > 0$ on $W \supseteq K$, and $\mathrm{supp}(f) \subseteq \mathrm{cl}(V) \subseteq U$. Now let $a = \min_{x\in K} f(x)$. Since $a > 0$, there exists a $C^\infty$ function $g : \mathbb{R} \to [0,1]$ such that $g = 0$ on $(-\infty, 0]$ and $g = 1$ on $[a, +\infty)$ (13.3.4). The function $\psi := g\circ f$ then has the required properties.
FIGURE 13.5: The cubes Wi and Vi .
Proof of the partition of unity theorem.
For each x ∈ K, let i(x) be an index such
that x ∈ Ui(x) . Choose a bounded
open set Vx containing x such that cl Vx ⊆ Ui(x) . Since K is compact, finitely
many of the sets Vx cover K. Denote these by V1 , . . . Vp and denote the
corresponding sets Ui(x) by U1 , . . . , Up . Since Vi ⊆ Ki := cl(Vi ) ⊆ Ui , by 13.3.6
there exists a C ∞ function ψi : Rn → [0, 1] such that ψi = 1 on Ki and
supp(ψi ) ⊆ Ui . Now set
χ1 = ψ1 and χi = (1 − ψ1 )(1 − ψ2 ) · · · (1 − ψi−1 )ψi , i > 1.
Then χi is C ∞ , 0 ≤ χi ≤ 1, and supp(χi ) ⊆ supp(ψi ) ⊆ Ui . Finally, let
ηi = (1 − ψ1 )(1 − ψ2 ) · · · (1 − ψi ).
For $i > 1$,
\[
\eta_{i-1} - \eta_i = (1-\psi_1)(1-\psi_2)\cdots(1-\psi_{i-1})\big[1 - (1-\psi_i)\big] = \chi_i,
\]
hence
\[
\sum_{i=1}^{p}\chi_i = \chi_1 + \sum_{i=2}^{p}(\eta_{i-1} - \eta_i) = \chi_1 + \eta_1 - \eta_p = 1 - \eta_p.
\]
Since $K \subseteq \bigcup_i V_i \subseteq \bigcup_i K_i$ and $\psi_i = 1$ on $K_i$, $\eta_p = 0$ on $K$, hence $\sum_{i=1}^{p}\chi_i = 1$ on $K$, completing the proof.
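The telescoping step of the proof can be illustrated at a single point $x$. The sketch below takes the values $\psi_1(x),\dots,\psi_p(x)$ as given numbers in $[0,1]$ and forms the corresponding $\chi_i(x)$; the function name and the sample values are hypothetical.

```python
def partition_from_cutoffs(psi_values):
    """Return ([chi_1(x), ..., chi_p(x)], eta_p(x)) from the values psi_i(x).

    chi_1 = psi_1 and chi_i = (1 - psi_1) ... (1 - psi_{i-1}) psi_i for i > 1,
    while eta_p = (1 - psi_1) ... (1 - psi_p).
    """
    chi = []
    eta = 1.0  # running product (1 - psi_1) ... (1 - psi_{i-1})
    for psi in psi_values:
        chi.append(eta * psi)
        eta *= 1.0 - psi
    return chi, eta

# At a point of K some psi_i equals 1, so eta_p vanishes and the chi_i sum to 1.
chi_on_K, eta_on_K = partition_from_cutoffs([0.25, 1.0, 0.5])

# Away from K the sum is 1 - eta_p, which may be smaller than 1.
chi_off_K, eta_off_K = partition_from_cutoffs([0.5, 0.5])
```

The single pass mirrors the identity $\sum_i \chi_i = 1 - \eta_p$ established above.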
13.4
Integration on Compact m-Surfaces
In this section we define the integrals of a function and a form on a compact
m-surface
S = {x ∈ V : F (x) = 0} ,
where V ⊆ Rn is open, F : V → Rn−m is C 1 , and F 0 (x) has rank n − m for
all x ∈ V .
To set the stage, let {(Ua , ϕa ) : a ∈ S} be an atlas for S. By the partition
of unity theorem, there exist finitely many charts $(U_i,\varphi_i) := (U_{a_i},\varphi_{a_i})$ and $C^1$ functions $\chi_i : \mathbb{R}^n \to \mathbb{R}$ such that the sets $S_i := \varphi_i(U_i)$ cover $S$, $\mathrm{supp}(\chi_i) \subseteq S_i$, and $\sum_i \chi_i = 1$ on $S$.
Integral of a Function
The (surface) integral of a continuous function f on S is defined by
\[
\int_S f\,dS = \sum_i \int_{\varphi_i}\chi_i f = \sum_i \int_{U_i}\big((\chi_i f)\circ\varphi_i\big)(u)\sqrt{\det\big(\varphi_i'(u)^t\varphi_i'(u)\big)}\;du.
\]
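To see that the integral is independent of the system $\{(U_i,\varphi_i,\chi_i)\}_i$ and hence is well-defined, consider another such system $\{(\tilde U_j,\tilde\varphi_j,\tilde\chi_j)\}_j$. Since
\[
\chi_i = \sum_j \chi_i\tilde\chi_j \quad\text{and}\quad \tilde\chi_j = \sum_i \tilde\chi_j\chi_i \quad\text{on } S,
\]
we see that
\[
\sum_i \int_{\varphi_i} f\chi_i = \sum_{i,j}\int_{\varphi_i} f\chi_i\tilde\chi_j \quad\text{and}\quad \sum_j \int_{\tilde\varphi_j} f\tilde\chi_j = \sum_{i,j}\int_{\tilde\varphi_j} f\tilde\chi_j\chi_i.
\]
Set
\[
\alpha_{ij} = \tilde\varphi_j^{-1}\circ\varphi_i : \varphi_i^{-1}(S_i\cap\tilde S_j) \to \tilde\varphi_j^{-1}(S_i\cap\tilde S_j).
\]
Since $f\chi_i\tilde\chi_j = 0$ outside $S_i\cap\tilde S_j$ and $\varphi_i = \tilde\varphi_j\circ\alpha_{ij}$ on $\varphi_i^{-1}(S_i\cap\tilde S_j)$, by 13.2.6 $\int_{\varphi_i} f\chi_i\tilde\chi_j = \int_{\tilde\varphi_j} f\chi_i\tilde\chi_j$. Therefore,
\[
\sum_i \int_{\varphi_i} f\chi_i = \sum_j \int_{\tilde\varphi_j} f\tilde\chi_j,
\]
as required.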
The definition of the integral is extended to a finite union S of compact
m-surfaces S1 , . . . , Sp by defining
\[
\int_S f\,dS = \sum_i \int_{S_i} f\,dS.
\]
13.4.1 Definition. The area of S is defined as
\[
\mathrm{area}(S) = \int_S 1\,dS.
\]
♦
13.4.2 Example. In 11.5.6 we found that the volume of the closed ball
$C_r^n(0) = \{x \in \mathbb{R}^n : \|x\| \le r\}$ is $r^n\alpha_n$, where
\[
\alpha_n = \begin{cases}\dfrac{2(2\pi)^{(n-1)/2}}{n(n-2)\cdots 3\cdot 1} & \text{if $n$ is odd,}\\[2ex] \dfrac{(2\pi)^{n/2}}{n(n-2)\cdots 4\cdot 2} & \text{if $n$ is even.}\end{cases}
\]
We now show that for the sphere $S := S_r^{n-1}(0) = \{x \in \mathbb{R}^n : \|x\| = r\}$,
\[
\mathrm{area}(S) = n r^{n-1}\alpha_n = \frac{n}{r}\,\lambda^n\big(C_r^n(0)\big). \tag{13.18}
\]
To this end, note that the upper hemisphere $H^u$ of $S$ is the graph of the function
\[
g(x_1,\dots,x_{n-1}) = \sqrt{r^2 - (x_1^2 + \cdots + x_{n-1}^2)} = \sqrt{r^2 - \|x\|^2},\quad \|x\| \le r.
\]
Let $0 < t < 1$ and consider the part $H_t^u$ of the hemisphere for which $\|x\| < rt$. Since
\[
1 + \|\nabla g(x)\|^2 = 1 + \frac{\|x\|^2}{r^2 - \|x\|^2} = \frac{r^2}{r^2 - \|x\|^2},
\]
by 13.2.4(d)
\[
\mathrm{area}(H_t^u) = r\int_{\|x\|<rt}\big(r^2 - \|x\|^2\big)^{-1/2}\,dx = r(n-1)\alpha_{n-1}\int_0^{rt}\frac{s^{n-2}}{\sqrt{r^2 - s^2}}\,ds,
\]
where the second equality comes from Exercise 11.6.3. The substitution $s = xr$ produces
\[
\mathrm{area}(H_t^u) = (n-1)r^{n-1}\alpha_{n-1}\int_0^t \frac{x^{n-2}}{\sqrt{1 - x^2}}\,dx.
\]
The lower hemisphere counterpart $H_t^\ell$ has the same area. Since
\[
\mathbf{1}_S = \lim_{t\to 1^-}\mathbf{1}_{H_t^u} + \lim_{t\to 1^-}\mathbf{1}_{H_t^\ell},
\]
it follows from Exercise 6 below that
\[
\mathrm{area}(S) = 2\lim_{t\to 1^-}\mathrm{area}(H_t^u) = 2(n-1)r^{n-1}\alpha_{n-1}\int_0^1 \frac{x^{n-2}}{\sqrt{1 - x^2}}\,dx. \tag{13.19}
\]
By Exercise 5.3.7,
\[
\int_0^1 \frac{x^{n-2}}{\sqrt{1 - x^2}}\,dx = \begin{cases}\dfrac{(n-3)(n-5)\cdots 4\cdot 2}{(n-2)(n-4)\cdots 3\cdot 1} & \text{if $n$ is odd,}\\[2ex] \dfrac{\pi}{2}\,\dfrac{(n-3)(n-5)\cdots 3\cdot 1}{(n-2)(n-4)\cdots 4\cdot 2} & \text{if $n$ is even.}\end{cases} \tag{13.20}
\]
Now use (13.19) and (13.20) to obtain (13.18).
♦
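The agreement between (13.18) and (13.19) can be checked numerically for small $n$. The sketch below assumes the standard Gamma-function expression $\alpha_n = \pi^{n/2}/\Gamma(n/2+1)$ for the volume of the unit ball, which is a known equivalent of the product formula above, and evaluates the integral in (13.19) after the substitution $x = \sin\theta$; the function names are illustrative.

```python
import math

def alpha(n):
    """Volume of the unit ball in R^n, via the standard Gamma-function form."""
    return math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)

def wallis_integral(n, steps=20000):
    """Integral of x^(n-2)/sqrt(1-x^2) over (0, 1).

    After x = sin(theta) this becomes the integral of sin(theta)^(n-2)
    over (0, pi/2), which the midpoint rule handles without a singularity.
    """
    h = (math.pi / 2.0) / steps
    return h * sum(math.sin((k + 0.5) * h) ** (n - 2) for k in range(steps))

def area_via_13_19(n, r):
    """Right side of (13.19)."""
    return 2.0 * (n - 1) * r ** (n - 1) * alpha(n - 1) * wallis_integral(n)

def area_via_13_18(n, r):
    """Right side of (13.18): n r^(n-1) alpha_n."""
    return n * r ** (n - 1) * alpha(n)

radius = 1.5
ratios = [area_via_13_19(n, radius) / area_via_13_18(n, radius)
          for n in range(2, 7)]
```

Each entry of `ratios` should be close to 1; for $n = 3$ and $r = 1$ the formula (13.18) gives the familiar value $4\pi$.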
Integral of an m-Form
The definition of the (surface) integral of an m-form on a (positively)
oriented compact m-surface S is analogous to the case of a function on S:
\[
\int_S \omega = \sum_i \int_{\varphi_i}\chi_i\omega = \sum_i \int_{U_i}(\chi_i\omega)_{\varphi_i(u)}\big(\partial_1\varphi_i(u),\dots,\partial_m\varphi_i(u)\big)\,du.
\]
The argument that the integral is well-defined proceeds as above.
As in the case of functions, the integral of a form on a finite union $S$ of oriented compact $m$-surfaces $S_1,\dots,S_p$ is defined by
\[
\int_S \omega = \sum_i \int_{S_i}\omega.
\]
Exercises
For these exercises, declare a function f : S → R to be
Borel measurable if f ◦ ϕa : Ua → R is Borel measurable
on Ua for each local parametrization ϕa of S.
1. Show that the collection of Borel measurable functions on S is closed
under the operations described in 10.5.3.
2. Define $\int_S f\,dS$ for nonnegative Borel measurable functions $f$ on $S$.
3. Call a Borel measurable function $f$ integrable if $\int_S f^+\,dS$ and $\int_S f^-\,dS$ are finite. In this case, define
\[
\int_S f\,dS = \int_S f^+\,dS - \int_S f^-\,dS.
\]
Prove that the resulting integral is linear and $\Big|\displaystyle\int_S f\,dS\Big| \le \displaystyle\int_S |f|\,dS$.
4. Formulate and prove versions of the monotone convergence theorem,
Fatou’s lemma, and the dominated convergence theorem for Borel measurable functions on S.
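5.S Define the σ-field $\mathcal{B}(S) := \{S \cap B : B \in \mathcal{B}(\mathbb{R}^n)\}$. Show that $\mathbf{1}_E$ is Borel measurable on $S$ for every $E \in \mathcal{B}(S)$.
6. Show that
\[
\mu_S(E) := \int_S \mathbf{1}_E\,dS,\quad E \in \mathcal{B}(S),
\]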
defines a measure on B(S). (µS is called surface measure on S. It provides
a way of calculating the area of Borel subsets of S.)
7.S Use Exercise 6 to verify the first equality in (13.19).
13.5
The Fundamental Theorems of Calculus
The fundamental theorem of single variable calculus expresses the integral
of a continuous function on an interval [a, b] as a function of the boundary
{a, b}. The theorem tacitly assumes that integration occurs from “left to right,”
which is to say that the interval [a, b] is positively oriented. The fundamental
theorem of calculus may then be stated as follows: For any primitive F of
a continuous function f on [a, b], the integral of f is F (b) − F (a) if [a, b] is
positively oriented and F (a)−F (b) if [a, b] is negatively oriented. The theorems
proved in this section are higher dimensional versions of this formulation.
Stokes’s Theorem
While the proof of the following theorem is based on many of the intricate
constructs developed earlier, the conclusion of the theorem is remarkably easy
to state.
13.5.1 Stokes’s Theorem. Let $S$ be a compact oriented $(n-1)$-surface-with-boundary in $\mathbb{R}^n$ and let $\omega$ be a $C^1$ $(n-2)$-form on $S$. If $\partial S$ has the induced orientation, then
\[
\int_{\partial S}\omega = \int_S d\omega.
\]
Proof. We prove the theorem first under the assumption that $\omega$ has compact support contained in $\varphi(I)$, where $\varphi : U \to \mathbb{R}^n$ is a local parametrization of $S$ and $I$ is a bounded $(n-1)$-dimensional interval with $\mathrm{cl}(I) \subseteq U$. For definiteness, we assume that $\mathrm{sign}(\varphi) = 1$, that is, the frame $\big(d\varphi_u(e_1),\dots,d\varphi_u(e_{n-1})\big)$ is positive in $T_{\varphi(u)}$. Thus $\varphi$ is oriented by $\vec{N}_\varphi$.
Suppose first that $U$ is open in $\mathbb{R}^{n-1}$ and $\varphi(U) \subseteq S\setminus\partial S$. We may then take $I = (a_1,b_1)\times\cdots\times(a_{n-1},b_{n-1})$. Since $\omega = 0$ on $\partial S$, $\int_{\partial S}\omega = 0$. Set $\eta := \varphi^*(\omega)$. By 13.1.19, $d\eta = \varphi^*(d\omega)$, hence, by 13.2.13,
\[
\int_S d\omega = \int_U d\eta.
\]
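Since $\eta$ is an $(n-2)$-form on $U \subseteq \mathbb{R}^{n-1}$, it may be expressed as
\[
\eta = \sum_{i=1}^{n-1}(-1)^{i+1} f_i\,du_1\wedge\cdots\wedge\widehat{du_i}\wedge\cdots\wedge du_{n-1},
\]
where $f_i$ is $C^1$ on $U$ and $\mathrm{supp}(f_i) \subseteq I$. Then
\begin{align*}
\int_U d\eta &= \int_U \sum_{i=1}^{n-1}(-1)^{i+1}\Big[\sum_{j=1}^{n-1}\frac{\partial f_i}{\partial u_j}\,du_j\Big]\wedge du_1\wedge\cdots\wedge\widehat{du_i}\wedge\cdots\wedge du_{n-1}\\
&= \int_U \sum_{i=1}^{n-1}(-1)^{i+1}\frac{\partial f_i}{\partial u_i}\,du_i\wedge du_1\wedge\cdots\wedge\widehat{du_i}\wedge\cdots\wedge du_{n-1}\\
&= \sum_{i=1}^{n-1}\int_U (\partial_i f_i)\,du_1\wedge\cdots\wedge du_{n-1}. \tag{13.21}
\end{align*}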
Since $f_i = 0$ outside $I$, by the Fubini–Tonelli theorem, recalling 13.2.10, we have
\[
\int_U (\partial_i f_i)\,du_1\wedge\cdots\wedge du_{n-1} = \int_{a_1}^{b_1}\!\!\cdots\int_{a_{n-1}}^{b_{n-1}}\int_{a_i}^{b_i}(\partial_i f_i)\,du_i\,du_{n-1}\cdots\widehat{du_i}\cdots du_1.
\]
By the fundamental theorem of calculus, the innermost integral evaluates to
\[
f_i(u_1,\dots,b_i,\dots,u_{n-1}) - f_i(u_1,\dots,a_i,\dots,u_{n-1}) = 0 - 0 = 0.
\]
Therefore,
\[
\int_S d\omega = 0 = \int_{\partial S}\omega.
\]
Now suppose that $\varphi(U)\cap\partial S \ne \emptyset$. Then $U$ is an open subset in $\mathbb{H}^{n-1}$ that intersects $\partial\mathbb{H}^{n-1}$ and $I$ is of the form $(a_1,b_1)\times\cdots\times(a_{n-2},b_{n-2})\times[0,b_{n-1})$. The above argument works for every term in the sum (13.21) except the last. It is still the case that $f_{n-1}(u_1,\dots,u_{n-2},b_{n-1}) = 0$, but now there is no guarantee that $f_{n-1}(u_1,\dots,u_{n-2},0) = 0$. Thus all we may conclude is that
\[
\int_S d\omega = \int_U d\eta = -\int_{a_1}^{b_1}\!\!\cdots\int_{a_{n-2}}^{b_{n-2}} f_{n-1}(u_1,\dots,u_{n-2},0)\,du_{n-2}\cdots du_1. \tag{13.22}
\]
Set $V = U\cap\partial\mathbb{H}^{n-1}$. By definition of induced orientation, $\mathrm{sign}(\varphi|_V) = (-1)^{n-1}$, hence, by 13.2.13 again,
\[
\int_{\partial S}\omega = (-1)^{n-1}\int_V \eta = \sum_{i=1}^{n-1}(-1)^{n+i}\int_V f_i\,du_1\wedge\cdots\wedge\widehat{du_i}\wedge\cdots\wedge du_{n-1}.
\]
Since $u_{n-1} = 0$ on $V$, $du_{n-1}$ must be zero on $V$. Therefore, the first $n-2$ terms in the above sum vanish and we are left with
\[
\int_{\partial S}\omega = -\int_V f_{n-1}\,du_1\wedge\cdots\wedge du_{n-2},
\]
which is (13.22). This verifies the theorem if $\omega$ has compact support.
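In the general case, let $\{\varphi_i : U_i \to S : i = 1,\dots,m\}$ be an atlas of local parameterizations of $S$ and let $\{\chi_i : i = 1,\dots,m\}$ be a partition of unity subordinate to the open sets $U_i$. By the first part of the proof,
\[
\int_{\partial S}\chi_i\omega = \int_S d(\chi_i\omega),\quad i = 1,\dots,m.
\]
By the product rule, $d(\chi_i\omega) = \chi_i\,d\omega + (d\chi_i)\wedge\omega$. Since $\sum_i \chi_i = 1$,
\[
\sum_i d(\chi_i\omega) = \Big[\sum_i \chi_i\Big]d\omega + d\Big[\sum_i \chi_i\Big]\wedge\omega = d\omega.
\]
Therefore,
\[
\int_{\partial S}\omega = \sum_i \int_{\partial S}\chi_i\omega = \sum_i \int_S d(\chi_i\omega) = \int_S d\omega.
\]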
From the first part of the proof we have
13.5.2 Corollary. If S is a compact oriented (n−1)-surface-without-boundary
and if ω is an (n − 2)-form on S, then
\[
\int_S d\omega = 0.
\]
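13.5.3 Remark. (a) For the case $n = 3$, Stokes’s formula takes the form
\[
\int_{\partial S}\big[f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3\big] = \int_S d\big[f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3\big]. \tag{13.23}
\]
The left side of (13.23) may be written
\[
\int_{\partial S}\vec{F}\cdot\vec{dr},\quad\text{where } \vec{F} := (f_1,f_2,f_3) \text{ and } \vec{dr} := (dx_1,dx_2,dx_3).
\]
Let $(g_1,g_2,g_3) := \mathrm{curl}\,\vec{F}$. By 13.1.15(a),
\[
d\big(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3\big) = g_1\,dx_2\wedge dx_3 + g_2\,dx_3\wedge dx_1 + g_3\,dx_1\wedge dx_2.
\]
For a local parametrization $\varphi$,
\begin{align*}
\int_\varphi \big[g_1\,dx_2\wedge dx_3 &+ g_2\,dx_3\wedge dx_1 + g_3\,dx_1\wedge dx_2\big]\\
&= \int_U \Big[(g_1\circ\varphi)\frac{\partial(\varphi_2,\varphi_3)}{\partial(u_1,u_2)} + (g_2\circ\varphi)\frac{\partial(\varphi_3,\varphi_1)}{\partial(u_1,u_2)} + (g_3\circ\varphi)\frac{\partial(\varphi_1,\varphi_2)}{\partial(u_1,u_2)}\Big]\,du\\
&= \int_U (\mathrm{curl}\,\vec{F}\circ\varphi)\cdot(\vec{N}_\varphi\circ\varphi)\sqrt{\det\big(\varphi'(u)^t\varphi'(u)\big)}\;du\\
&= \int_{\varphi(U)}\mathrm{curl}\,\vec{F}\cdot\vec{N}_\varphi\,dS.
\end{align*}
Using a partition of unity we see that the right side of (13.23) may then be written $\int_S \mathrm{curl}\,\vec{F}\cdot\vec{N}\,dS$. Thus we obtain the classical version of Stokes’s formula
\[
\int_{\partial S}\vec{F}\cdot\vec{dr} = \int_S \mathrm{curl}\,\vec{F}\cdot\vec{N}\,dS. \tag{13.24}
\]
(b) If $S$ is the graph of a function $g(x,y)$ on a set $D \subseteq \mathbb{R}^2$, then $S$ may be parameterized by $\varphi(x,y) = (x, y, g(x,y))$. Hence, if $S$ is oriented by the upward normal
\[
\frac{(-\nabla g,\ 1)}{\sqrt{\|\nabla g\|^2 + 1}}
\]
(see (12.3.7)(c)), then (13.24) becomes
\[
\int_{\partial S}\vec{F}\cdot\vec{dr} = \int_D \mathrm{curl}\,\vec{F}\cdot(-g_x, -g_y, 1)\,dx\,dy.
\]
♦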
D
13.5.4 Example. We verify (13.24) for
\[
\vec{F}(x_1,x_2,x_3) = (x_2x_3,\ x_1,\ x_2)
\]
on the cylinder-with-boundary
\[
S := \big\{(x_1,x_2,x_3) : x_1^2 + x_2^2 = 1,\ 0 \le x_3 \le 1\big\}
\]
(Figure 12.17) oriented by the outward normal $\vec{N}(x_1,x_2,x_3) = (x_1,x_2,0)$. The bottom boundary may be parameterized by $\varphi_1(t) = (\cos t, \sin t, 0)$ and the top by $\varphi_2(t) = (\cos t, -\sin t, 1)$, $0 \le t \le 2\pi$. Therefore,
\begin{align*}
\int_{\partial S}\vec{F}\cdot\vec{dr} &= \int_0^{2\pi}\big[-f_1(\cos t,\sin t,0)\sin t + f_2(\cos t,\sin t,0)\cos t\big]\,dt\\
&\quad + \int_0^{2\pi}\big[-f_1(\cos t,-\sin t,1)\sin t - f_2(\cos t,-\sin t,1)\cos t\big]\,dt\\
&= \int_0^{2\pi}\big[\cos^2 t + \sin^2 t - \cos^2 t\big]\,dt = \pi.
\end{align*}
To find $\int_S \mathrm{curl}\,\vec{F}\cdot\vec{N}\,dS$ we parameterize $S$ by $\varphi(t,x_3) = (\cos t, \sin t, x_3)$. Then
\[
\det\big(\varphi'(t,x_3)^t\varphi'(t,x_3)\big) = \det\left(\begin{bmatrix}-\sin t & \cos t & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}-\sin t & 0\\ \cos t & 0\\ 0 & 1\end{bmatrix}\right) = 1,
\]
and since $\mathrm{curl}\,\vec{F} = (1,\ x_2,\ 1 - x_3)$,
\[
\int_S \mathrm{curl}\,\vec{F}\cdot\vec{N}\,dS = \int_0^{2\pi}(\cos t + \sin^2 t)\,dt = \pi.
\]
♦
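Both sides of (13.24) in Example 13.5.4 can also be approximated numerically. The following sketch uses the parametrizations and the curl stated in the example together with a midpoint rule; the helper names are hypothetical.

```python
import math

def F(x1, x2, x3):
    """The vector field of Example 13.5.4."""
    return (x2 * x3, x1, x2)

def curl_F(x1, x2, x3):
    """curl F = (1, x2, 1 - x3), as computed in the example."""
    return (1.0, x2, 1.0 - x3)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def line_integral(curve, velocity, n=2000):
    """Midpoint-rule approximation of the integral of F . dr over 0 <= t <= 2 pi."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += dot(F(*curve(t)), velocity(t))
    return total * h

def surface_integral(n=200):
    """Integral of curl F . N over phi(t, s) = (cos t, sin t, s).

    The area element sqrt(det(phi'^t phi')) equals 1 for this cylinder.
    """
    ht = 2.0 * math.pi / n
    hs = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * ht
        normal = (math.cos(t), math.sin(t), 0.0)
        for j in range(n):
            s = (j + 0.5) * hs
            total += dot(curl_F(math.cos(t), math.sin(t), s), normal)
    return total * ht * hs

bottom = line_integral(lambda t: (math.cos(t), math.sin(t), 0.0),
                       lambda t: (-math.sin(t), math.cos(t), 0.0))
top = line_integral(lambda t: (math.cos(t), -math.sin(t), 1.0),
                    lambda t: (-math.sin(t), -math.cos(t), 0.0))
boundary_side = bottom + top
surface_side = surface_integral()
```

Both `boundary_side` and `surface_side` approximate $\pi$, in agreement with the computation above.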
Divergence Theorem
The divergence theorem is a variation of Stokes’s theorem. In the latter
theorem, integration occurs on an (n − 1)-dimensional surface in Rn with
(n − 2)-dimensional boundary. In the former, the domain of integration is a
regular region, which may be viewed as an n-dimensional surface in Rn with
(n − 1)-dimensional boundary.
13.5.5 Definition. A bounded open subset E of Rn is called a regular region
if, for each point a ∈ bd(E), there exists a neighborhood Ua of a and a C 1
function $F_a : U_a \to \mathbb{R}$ with $\nabla F_a \ne 0$ such that
(i) Sa := bd(E) ∩ Ua = {x : Fa (x) = 0},
(ii) E ∩ Ua = {x : Fa (x) < 0}, and
(iii) cl(E)c ∩ Ua = {x : Fa (x) > 0}.
♦
Note that Sa is an (n − 1)-surface in Rn and hence has a local parametrization at each x ∈ Sa . Let
~na = k∇Fa k−1 ∇Fa ,
and let x ∈ Sa . For sufficiently small |t|, h(t) := x + t~na (x) ∈ Ua . Since
(Fa ◦ h)0 (0) = ∇Fa (x) · ~na (x) = k∇Fa (x)k > 0,
(Fa ◦ h) is strictly increasing. Because (Fa ◦ h)(0) = 0 we therefore have
\[
F_a\big(x + t\,\vec{n}_a(x)\big)\ \begin{cases}< 0 & \text{if } t < 0,\\ > 0 & \text{if } t > 0.\end{cases}
\]
It follows from (ii) and (iii) that the normal vector t~na (x) to Sa at x points
into E if t < 0 and away from E (that is, toward E c ) if t > 0. The exterior
unit normal vector on bd(E) is then defined by
~n(x) = ~na (x), x ∈ Sa .
Uniqueness and continuity of ~na shows that ~n is well-defined and continuous
on bd(E). (See Figure 13.6.)
FIGURE 13.6: Regular region E.
13.5.6 Example. The n-dimensional annulus
E = {x ∈ Rn : r1 < kxk < r2 }
is a regular region in Rn . Here, bd(E) has the components
Si = {x ∈ Rn : kxk = ri } , i = 1, 2.
The conditions of regularity are met by defining
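\[
F_a(x) = \begin{cases} r_1 - \|x\| & \text{on } \{x \in \mathbb{R}^n : \|x\| < (r_1 + r_2)/2\} \text{ if } a \in S_1,\\ \|x\| - r_2 & \text{on } \{x \in \mathbb{R}^n : \|x\| > (r_1 + r_2)/2\} \text{ if } a \in S_2.\end{cases}
\]
Figure 13.7 depicts the case $n = 2$.
♦
FIGURE 13.7: Annulus in R2 with exterior normal.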
13.5.7 Divergence Theorem. If E is a regular region in Rn and ω is a C 1
1-form on cl(E), then
\[
\int_{\mathrm{bd}(E)}\omega\cdot\vec{n}\,dS = \int_E \mathrm{div}\,\omega_x\,dx. \tag{13.25}
\]
Proof. The proof uses ideas similar to those used in the proof of Stokes’s
theorem. By hypothesis, ω is C 1 on an open set containing cl(E), which we
may assume also contains the sets Ua in 13.5.5. Since cl(E) is compact, by using
a partition of unity as in the proof of Stokes’s theorem, we may assume that
for any a = (a1 , . . . , an ) ∈ cl(E) and the neighborhoods W of a constructed
in the proof,
\[
K := \bigcup_{i=1}^{n}\mathrm{supp}(f_i) \subseteq W.
\]
Suppose first that $a \in E$. Choose an $n$-dimensional interval $W$ containing $a$ such that $\mathrm{cl}(W) \subseteq E$. If $K \subseteq W$, then $\omega = 0$ on $W^c \supseteq \mathrm{bd}(E)$, hence
\[
\int_{\mathrm{bd}(E)}\omega\cdot\vec{n}\,dS = 0 \quad\text{and}\quad \int_E \mathrm{div}\,\omega_x\,dx = \int_W \sum_{i=1}^{n}\partial_i f_i(x)\,dx = 0,
\]
the last equality by the Fubini–Tonelli theorem and the fundamental theorem of calculus. Therefore, (13.25) holds in this case.
FIGURE 13.8: The case a ∈ E.
Now let a ∈ bd(E) and let Ua and Fa be as in 13.5.5. We may assume
that the components ai of a and ni (a) of ~n(a) are positive, otherwise apply a
rotation and translation; the change of variables theorem implies that (13.25)
is invariant under such transformations. (See Exercise 11 below for a special
case of this.) We show that for each i = 1, . . . , n, there exists a neighborhood
Wi of a such that if K ⊆ Wi then
\[
\int_S f_i n_i\,dS = \int_E \partial_i f_i(x)\,dx. \tag{13.26}
\]
For notational simplicity, we do this for the case i = n.
Since $\partial_n F_a(a) \ne 0$, by the implicit function theorem there exists a neighborhood $V$ of $(a_1,\dots,a_{n-1})$, an open interval $I$ containing $a_n$, and a $C^1$ function $g : V \to \mathbb{R}$ such that $V\times I \subseteq U_a$, $a_n = g(a_1,\dots,a_{n-1})$, and
\[
F_a\big(x_1,\dots,x_{n-1},\,g(x_1,\dots,x_{n-1})\big) = 0 \quad\text{on } V.
\]
By continuity, we may choose V and I sufficiently small so that
g(x1 , . . . , xn−1 ) > 0 and ∂n Fa (x) > 0 for all x ∈ V × I.
Now let $x = (x_1,\dots,x_n) \in (V\times I)\cap E$. Since $F_a(x)$ is a strictly increasing function of $x_n \in I$ when the other coordinates are fixed and since $F_a(x) < 0$, it must be the case that $0 < x_n < g(x_1,\dots,x_{n-1})$. Thus
\begin{align*}
(V\times I)\cap E &= \{x \in V\times I : 0 < x_n < g(x_1,\dots,x_{n-1})\}\quad\text{and}\\
(V\times I)\cap S_a &= \{x \in V\times I : x_n = g(x_1,\dots,x_{n-1})\}.
\end{align*}
FIGURE 13.9: The case a ∈ bd(E).
(See Figure 13.9.) Note that the function ϕ defined by
ϕ(v) := (v, g(v)), v = (v1 , . . . , vn−1 ) ∈ V,
is a local parametrization of Sa with unit normal
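\[
\big(1 + \|\nabla g\|^2\big)^{-1/2}\,\big(-\nabla g,\ 1\big).
\]
Since this points outward it coincides with $\vec{n}$. In particular, the $n$th component of $\vec{n}$ is
\[
n_n = \big(1 + \|\nabla g\|^2\big)^{-1/2}
\]
on $(V\times I)\cap S_a$. Therefore, if $K \subseteq V\times I$ then, by 13.2.4(d),
\[
\int_{S_a} f_n n_n\,dS = \int_{(V\times I)\cap S_a}\frac{f_n}{\sqrt{1 + \|\nabla g\|^2}}\,dS = \int_V (f_n\circ\varphi)(v)\,dv. \tag{13.27}
\]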
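On the other hand, since $f_n = \partial_n f_n = 0$ outside $K$, by the Fubini–Tonelli theorem,
\begin{align*}
\int_E \partial_n f_n(x)\,dx &= \int_{(V\times I)\cap E}\partial_n f_n(x)\,dx\\
&= \int_V\int_0^{g(v_1,\dots,v_{n-1})}(\partial_n f_n)(v_1,\dots,v_{n-1},x_n)\,dx_n\,dv_1\cdots dv_{n-1}\\
&= \int_V (f_n\circ\varphi)(v)\,dv, \tag{13.28}
\end{align*}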
the last equality by the fundamental theorem of calculus. Setting Wn = V × I
and comparing (13.27) and (13.28), we see that (13.26) holds for i = n. A
similar proof works for i < n. Thus if K ⊆ W1 ∩ · · · ∩ Wn , then (13.26) holds
for all i. Summing from 1 to n we obtain (13.25).
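As a sanity check of (13.25), the two sides can be compared numerically on the annulus of Example 13.5.6 with $n = 2$. The sketch below assumes the illustrative field $(f_1,f_2) = (x^3, y^3)$ and radii $r_1 = 1$, $r_2 = 2$; the exterior normal is $x/\|x\|$ on the outer circle and $-x/\|x\|$ on the inner one.

```python
import math

R1, R2 = 1.0, 2.0  # assumed inner and outer radii of the annulus

def field(x, y):
    """Illustrative choice (f1, f2) = (x^3, y^3)."""
    return (x ** 3, y ** 3)

def divergence(x, y):
    """div of the field above: 3x^2 + 3y^2."""
    return 3.0 * x * x + 3.0 * y * y

def flux_through_circle(radius, sign, n=2000):
    """Integral of (f . n) ds over the circle of the given radius.

    sign = +1 uses the normal x/|x|; sign = -1 uses -x/|x|.
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        fx, fy = field(radius * c, radius * s)
        total += sign * (fx * c + fy * s) * radius  # ds = radius dt
    return total * h

def integral_of_divergence(n=400):
    """Integral of div over the annulus, in polar coordinates (dx = r dr dt)."""
    hr = (R2 - R1) / n
    ht = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = R1 + (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += divergence(r * math.cos(t), r * math.sin(t)) * r
    return total * hr * ht

boundary_side = flux_through_circle(R2, +1) + flux_through_circle(R1, -1)
interior_side = integral_of_divergence()
exact = 1.5 * math.pi * (R2 ** 4 - R1 ** 4)
```

Both quantities approximate $\tfrac{3\pi}{2}(r_2^4 - r_1^4)$, the value obtained by integrating $3r^2$ over the annulus in polar coordinates.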
Connection with Stokes’s Theorem
Let E be a regular region in Rn whose boundary is a finite union of compact
connected (n − 1)-surfaces of the form S = {x : F (x) = 0}, where F : U → R
is a C 1 function with ∇F 6= 0 such that Ua = U and Fa = F for all a ∈ S. A
ball or annulus in Rn are simple examples. By 12.4.8, S is oriented and, for
each local parametrization ϕ : V → S,
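\[
\vec{n}(\varphi(v)) = \frac{\pm 1}{\sqrt{\det\big(\varphi'(v)^t\varphi'(v)\big)}}\sum_{i=1}^{n}(-1)^{i-1}\,\frac{\partial(\varphi_1,\dots,\widehat{\varphi_i},\dots,\varphi_n)}{\partial(v_1,\dots,v_{n-1})}(v)\,e_i,
\]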
where the sign is chosen to be the same for all v. Let each S have the orientation
for which the sign is (+). We shall call the resulting orientation of bd(E) positive.
In this setting we have the following consequence of the divergence theorem.
13.5.8 Theorem. Let E be as described above and let
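\[
\omega = \sum_{i=1}^{n} f_i\,dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n
\]
be an $(n-1)$-form on $\mathrm{cl}(E)$. If $\mathrm{bd}(E)$ is positively oriented, then
\[
\int_{\mathrm{bd}(E)}\omega = \int_E d\omega.
\]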
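Proof. Recalling the additive definition of $\int_{\mathrm{bd}(E)}\omega$, we may assume that $\mathrm{bd}(E)$ consists of a single compact connected $(n-1)$-surface $S$. Let
\[
\eta := \sum_{i=1}^{n}(-1)^{i-1} f_i\,dx_i.
\]
By the above,
\[
(\vec{n}\cdot\eta)\circ\varphi(v) = \frac{1}{\sqrt{\det\big(\varphi'(v)^t\varphi'(v)\big)}}\sum_{i=1}^{n}(f_i\circ\varphi)(v)\,\frac{\partial(\varphi_1,\dots,\widehat{\varphi_i},\dots,\varphi_n)}{\partial(v_1,\dots,v_{n-1})}(v),
\]
hence
\[
\int_\varphi \vec{n}\cdot\eta\,dS = \sum_{i=1}^{n}\int_V (f_i\circ\varphi)\,\frac{\partial(\varphi_1,\dots,\widehat{\varphi_i},\dots,\varphi_n)}{\partial(v_1,\dots,v_{n-1})}\,dv = \int_\varphi \omega.
\]
Using a partition of unity we obtain
\[
\int_S \omega = \int_S \vec{n}\cdot\eta\,dS. \tag{13.29}
\]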
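On the other hand,
\begin{align*}
d\omega &= \sum_{i=1}^{n}\sum_{j=1}^{n}(\partial_j f_i)\,dx_j\wedge dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n\\
&= \sum_{i=1}^{n}(-1)^{i-1}\,\partial_i f_i\,dx_1\wedge\cdots\wedge dx_n\\
&= \mathrm{div}\,\eta\ dx_1\wedge\cdots\wedge dx_n,
\end{align*}
hence, recalling 13.2.10,
\[
\int_E d\omega = \int_E \mathrm{div}\,\eta_x\,dx_1\wedge\cdots\wedge dx_n = \int_E \mathrm{div}\,\eta_x\,dx. \tag{13.30}
\]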
The conclusion now follows from (13.29), (13.30), and the divergence theorem.
13.5.9 Remark. The divergence theorem has an interesting application to
fluid dynamics. Consider an incompressible fluid moving in space. Let ρ(x, t)
denote the density of the fluid in mass per unit volume at time t and point x,
and let ~v (x, t) denote its velocity. If ~n is normal to a small surface element of
area ∆S, then (ρ~v · ~n)(∆S)(∆t) is approximately the mass of the fluid flowing
across that surface element during a small time period ∆t. The rate of flow is
then (ρ~v · ~n)∆S. Adding these quantities and taking limits, we see that the
rate of flow of the fluid across a surface S in the direction of the normal is
given by the integral
\[
\int_S \rho\vec{v}\cdot\vec{n}\,dS.
\]
Now let E be a regular region with smooth boundary S. Applying the
foregoing to a ball Bε in E with boundary Sε , center y, and outer normal ~n,
we see that the integral
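\[
\int_{S_\varepsilon}\rho\vec{v}\cdot\vec{n}\,dS
\]
represents the rate of flow of the fluid out of the ball, that is, the negative of the rate of change of fluid in the ball. Since the amount of fluid in the ball at time $t$ is $\int_{B_\varepsilon}\rho(x,t)\,dx$,
\[
\frac{d}{dt}\int_{B_\varepsilon}\rho(x,t)\,dx = -\int_{S_\varepsilon}\rho\vec{v}\cdot\vec{n}\,dS = -\int_{B_\varepsilon}\mathrm{div}(\rho\vec{v})\,dx,
\]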
the last equality by the divergence theorem. Differentiating under the integral
sign and dividing by vol(Bε ), we obtain
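\[
\frac{1}{\mathrm{vol}(B_\varepsilon)}\int_{B_\varepsilon}\partial_t\rho(x,t)\,dx = -\frac{1}{\mathrm{vol}(B_\varepsilon)}\int_{B_\varepsilon}\mathrm{div}(\rho\vec{v})\,dx.
\]
Letting $\varepsilon \to 0$, we obtain
\[
\partial_t\rho(y,t) = -\mathrm{div}\big(\rho(y,t)\,\vec{v}(y,t)\big).
\]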
In particular, if ρ is constant in time, then div (ρ~v ) is zero throughout E, hence
Z
Z
~
ρ~v · n dS =
div (ρ~v ) dx = 0,
S
E
that is, the amount of fluid flowing out of E equals the amount flowing in. ♦
Green’s Theorem
Let E be a regular region in R^2 with boundary the union of finitely many
smooth simple pairwise disjoint curves C = ϕ(I). The boundary bd(E) is said
to be positively oriented if rotating the unit tangent vector $\vec T$, which is in
the direction of $(\varphi_1', \varphi_2')$, 90 degrees clockwise produces the exterior
normal $\vec n$ on C, which is then in the direction of $(\varphi_2', -\varphi_1')$.
The region is then to the left as the boundary is traced in the direction of the
tangent vector field on each curve C.
FIGURE 13.10: Regular region E in R^2.
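Now let ω = Q dx − P dy. Then
\[ (\omega\cdot\vec n)\circ\varphi = (Q\circ\varphi,\, -P\circ\varphi)\cdot(\varphi_2', -\varphi_1')\,\|\varphi'\|^{-1} = \bigl[(P\circ\varphi)\varphi_1' + (Q\circ\varphi)\varphi_2'\bigr]\,\|\varphi'\|^{-1}, \]
hence
\[ \int_C \omega\cdot\vec n\, ds = \int_C (P\,dx + Q\,dy). \]
Summing over the curves C, we have
\[ \int_{\mathrm{bd}(E)} \omega\cdot\vec n\, ds = \int_{\mathrm{bd}(E)} (P\,dx + Q\,dy). \]
Since
\[ \operatorname{div}\omega = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}, \]
we obtain the following important special case of the divergence theorem.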
13.5.10 Green’s Theorem. Let E be a region in R^2, as described above. If
P, Q are C^1 functions on an open set containing E, then
\[ \int_{\mathrm{bd}(E)} (P\,dx + Q\,dy) = \iint_E \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy. \tag{13.31} \]
13.5.11 Corollary. The area of E is given by
\[ \operatorname{area}(E) = \frac{1}{2}\int_{\mathrm{bd}(E)} (x\,dy - y\,dx). \]
Proof. Apply Green’s theorem to P(x, y) = −y/2, Q(x, y) = x/2, noting that
$Q_x - P_y = 1$.
13.5.12 Example. The ellipse $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ has parametrization x = a cos t,
y = b sin t, 0 ≤ t ≤ 2π. Therefore, the area inside the ellipse is
\[ \frac{1}{2}\int_0^{2\pi} ab(\cos^2 t + \sin^2 t)\,dt = \pi ab. \]
♦
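The computation in 13.5.12 can be checked numerically. The following is only a sketch: the function name `area_by_line_integral`, the midpoint rule, and the semi-axes a = 3, b = 2 are illustrative assumptions, not part of the text. It approximates (1/2)∫(x dy − y dx) along the parametrized ellipse and compares the result with πab.

```python
import math

def area_by_line_integral(x, y, dx, dy, t0, t1, steps=20000):
    """Approximate (1/2) * integral of (x dy - y dx) over a parametrized
    closed curve, using the midpoint rule on the parameter interval."""
    h = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h

# Illustrative semi-axes (an assumption; any a, b > 0 would do).
a, b = 3.0, 2.0
area = area_by_line_integral(
    lambda t: a * math.cos(t),   # x(t)
    lambda t: b * math.sin(t),   # y(t)
    lambda t: -a * math.sin(t),  # x'(t)
    lambda t: b * math.cos(t),   # y'(t)
    0.0, 2.0 * math.pi)

print(area, math.pi * a * b)
```

Because the integrand reduces to the constant ab, the midpoint rule is exact here up to rounding; for other curves it gives only an approximation.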
The Piecewise Smooth Case
Both Stokes’s theorem and the divergence theorem may be extended to
more general surfaces called piecewise smooth. In the case n = 3, these are
finite unions of smooth surfaces S1 , . . ., Sk that fit together so that
• no three surfaces meet in more than a single point, and
• the common boundary of two of these surfaces consists of finitely many
disjoint piecewise smooth simple closed curves.
FIGURE 13.11: Piecewise smooth surfaces.
(See Figure 13.11.) If S is such a surface, then the surface integral $\int_S f\,dS$ is
defined as the sum $\sum_{j=1}^{k}\int_{S_j} f\,dS$. The integral of a form on S has an analogous
definition. These definitions are reasonable since, by cancelations, the common
boundary of a pair of surfaces contributes nothing to the integral. We illustrate
the basic idea with the simple example of a cube. Removing a face of the cube
results in a surface-with-boundary Q, which we orient by the outward normal.
If Stokes’s theorem is applied to each of the five faces and the results are added,
the integrals along the boundaries cancel and one is left with Stokes’s theorem
for Q:
\[ \int_{\partial Q} \vec F\cdot d\vec r = \int_Q \operatorname{curl}\vec F\cdot\vec N\, dS. \]
FIGURE 13.12: Oriented cube without bottom face.
Similarly, Green’s theorem extends to regions in R2 whose boundaries
are only piecewise smooth. This, of course, leads to extended versions of its
corollaries. Here’s an application of the extended version of 13.5.11:
13.5.13 Example. Let ∂S be a closed polygon consisting of m line segments
\[ L_i := [(a_i, b_i) : (a_{i+1}, b_{i+1})], \quad i = 1, 2, \dots, m, \]
where $(a_{m+1}, b_{m+1}) = (a_1, b_1)$ and the vertices are in counterclockwise order.
(See Figure 13.13.)
FIGURE 13.13: Closed polygon.
Then $L_i$ has the parametrization
\[ x = (1-t)a_i + t a_{i+1}, \quad y = (1-t)b_i + t b_{i+1}, \quad 0 \le t \le 1, \]
hence
\begin{align*}
\int_{L_i} (x\,dy - y\,dx) &= (b_{i+1} - b_i)\int_0^1 \bigl[(1-t)a_i + t a_{i+1}\bigr]\,dt - (a_{i+1} - a_i)\int_0^1 \bigl[(1-t)b_i + t b_{i+1}\bigr]\,dt \\
&= a_i b_{i+1} - a_{i+1} b_i .
\end{align*}
Therefore,
\[ \operatorname{area}(S) = \frac{1}{2}\sum_{i=1}^{m} (a_i b_{i+1} - a_{i+1} b_i). \]
♦
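The polygon area formula of 13.5.13 translates directly into a few lines of code. This is a minimal sketch; the function name `polygon_area` and the two sample polygons are illustrative assumptions. Vertices are listed in counterclockwise order, and the index wraps around so that vertex m + 1 is vertex 1.

```python
def polygon_area(vertices):
    """Area of a closed polygon from (1/2) * sum(a_i*b_{i+1} - a_{i+1}*b_i),
    with vertices (a_i, b_i) given in counterclockwise order."""
    m = len(vertices)
    total = 0.0
    for i in range(m):
        a_i, b_i = vertices[i]
        a_next, b_next = vertices[(i + 1) % m]  # wraps: vertex m+1 is vertex 1
        total += a_i * b_next - a_next * b_i
    return 0.5 * total

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (0, 3)]

print(polygon_area(unit_square))  # 1.0
print(polygon_area(triangle))     # 6.0
```

Listing the vertices in clockwise order reverses the orientation of the boundary and changes the sign of the result.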
Exercises
1.S Verify directly the following version of Stokes’s theorem
\[ \int_{\partial S} [f\,dx + g\,dy + h\,dz] = \int_S \bigl[(h_y - g_z)\,dy\wedge dz + (h_x - f_z)\,dx\wedge dz + (g_x - f_y)\,dx\wedge dy\bigr], \]
where S is the cylinder $\{(x,y,z) : x^2 + y^2 = 1,\ 0 \le z \le 1\}$.
2. For (x, y) ≠ (0, 0) define
\[ P(x,y) = \frac{-y}{x^2+y^2} \quad\text{and}\quad Q(x,y) = \frac{x}{x^2+y^2}. \]
Show that
(a) $Q_x = P_y$.
(b) $\int_{\varphi_r} P\,dx + Q\,dy = 2\pi$, where $\varphi_r(t) = (r\cos t, r\sin t)$, 0 ≤ t ≤ 2π.
(c) $\int_{\psi} P\,dx + Q\,dy = 2\pi$, where ψ is any piecewise smooth, counterclockwise
oriented, simple closed curve enclosing (0, 0).
(d) \[ \int_0^{2\pi} \frac{\cos^{2m}t\,\sin^{2m}t}{a^2\cos^{4m+2}t + b^2\sin^{4m+2}t}\,dt = \frac{2\pi}{(2m+1)ab}, \quad m\in\mathbb{Z}^+,\ a, b > 0. \]
3. Let 0 < r < R and let $S = \{(x,y) : r^2 \le x^2 + y^2 \le R^2\}$. Verify Green’s
theorem on S for
(a)S $P(x,y) = \dfrac{-y}{\sqrt{x^2+y^2}}$, $Q(x,y) = \dfrac{x}{\sqrt{x^2+y^2}}$.
(b) $P(x,y) = \dfrac{y}{\sqrt{x^2+y^2}}$, $Q(x,y) = \dfrac{x}{\sqrt{x^2+y^2}}$.
(c) $P(x,y) = \dfrac{x}{x^2+y^2}$, $Q(x,y) = \dfrac{-y}{x^2+y^2}$.
4. Use Green’s theorem to evaluate the following integrals, where the curves
C have counterclockwise orientation.
(a) $\int_C \sin(x-y)\,dx + \sin(x+y)\,dy$, C = bd([0, π/2] × [0, π/2]).
(b) $\int_C e^{-xy}\,dx + e^{xy}\,dy$, C = bd([0, 1] × [0, 1]).
(c) $\int_C \cos(xy)\,dx + \sin(xy)\,dy$, C = bd([0, 1] × [0, 1]).
(d)S $\int_C f(x)\,dx + g(y)\,dy$, where f and g are C^1 and C is simple, closed,
and piecewise C^1.
5.S Use 13.5.11 to show that the area enclosed by the “elliptical astroid”
\[ \left(\frac{x^2}{a^2}\right)^{1/(2m+1)} + \left(\frac{y^2}{b^2}\right)^{1/(2m+1)} = 1, \quad a > 0,\ b > 0,\ m \in \mathbb{Z}^+, \]
is given by
Z
π/2
β
cos2m t + sin2m t) dt =
0
βπ (2m − 1)(2m − 3) · · · 5 · 3
,
2
2m(2m − 2) · · · 4 · 2
where $\beta := 4^{-m}ab\,(m + \tfrac{1}{2})$. (See 5.3.4.)
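6. Let E be a regular region in R^n and let f and g be C^2 on cl(E). Prove
Green’s formulas:
(a) $\displaystyle\int_{\mathrm{bd}(E)} f\,\nabla g\cdot\vec n\, dS = \int_E \bigl(\nabla f\cdot\nabla g + f\,\nabla^2 g\bigr)\,dx.$
(b) $\displaystyle\int_{\mathrm{bd}(E)} (f\,\nabla g - g\,\nabla f)\cdot\vec n\, dS = \int_E \bigl(f\,\nabla^2 g - g\,\nabla^2 f\bigr)\,dx,$
where $\nabla^2 f := \displaystyle\sum_{i=1}^{n}\frac{\partial^2 f}{\partial x_i^2}$, the Laplacian of f.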
7. A C^2 function f is said to be harmonic on a set S ⊆ R^n if $\nabla^2 f = 0$ on an
open set containing S.
(a) Show that if f is harmonic on the ball $C_r(0)$, then $\int_{S_r(0)} \nabla f\cdot\vec n\, dS = 0$.
(b) Show that if f and g are harmonic on the region cl(E) of 13.5.6 and
$\vec n_t = \|x\|^{-1}x$ on $S_t := S_t(0)$, then
\[ \int_{S_1} \nabla f\cdot\vec n_1\, dS = \int_{S_2} \nabla f\cdot\vec n_2\, dS \]
and
\[ \int_{S_1} (g\,\nabla f - f\,\nabla g)\cdot\vec n_1\, dS = \int_{S_2} (g\,\nabla f - f\,\nabla g)\cdot\vec n_2\, dS. \]
8. Let E ⊆ R^n be a regular region and let f be harmonic on cl(E) (Exercise 7). Show that
\[ \int_E \|\nabla f\|^2\,dx = \int_{\mathrm{bd}(E)} f\,\nabla f\cdot\vec n\, dS, \]
where $\vec n$ is the outer normal. Deduce that if f = 0 on bd(E) and E is
connected, then f = 0 on E.
9.S Let E ⊆ R^n be a regular region and let f and g be harmonic on cl(E)
(Exercise 7). Show that
\[ \int_{\mathrm{bd}(E)} (f\,\nabla g + g\,\nabla f)\cdot\vec n\, dS = 2\int_E \nabla f\cdot\nabla g\,dx, \]
where $\vec n$ is the outer normal.
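10. Let n > 2. For t > 0, let $C_t = C_t(0)$, $S_t = S_t(0)$, and $\vec n_t(x) = \|x\|^{-1}x$,
the outer normal to $S_t$. Suppose f is harmonic on $C_r$ (Exercise 7). Prove
the average value property of harmonic functions
\[ f(0) = \frac{1}{\operatorname{area}(S_r)}\int_{S_r} f\,dS \]
by verifying (a)–(f) for 0 < t ≤ r. (Refer to 13.4.2.)
(a) The function $g(x) := \|x\|^{2-n}$, x ≠ 0, is harmonic.
(b) $\displaystyle\int_{S_t} f\,\nabla g\cdot\vec n_t\, dS = \frac{2-n}{t^{n-1}}\int_{S_t} f\,dS.$
(c) $\displaystyle\int_{S_t} g\,\nabla f\cdot\vec n_t\, dS = 0.$
(d) $\displaystyle\frac{1}{t^{n-1}}\int_{S_t} f\,dS = \frac{1}{r^{n-1}}\int_{S_r} f\,dS.$
(e) $\displaystyle\frac{1}{\operatorname{area}(S_r)}\int_{S_r} f\,dS = \frac{1}{\operatorname{area}(S_t)}\int_{S_t} f\,dS.$
(f) $\displaystyle\lim_{t\to 0}\frac{1}{\operatorname{area}(S_t)}\int_{S_t} f\,dS = f(0).$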
11. Let E be a region as in the statement of Green’s theorem. For the
functions ψ in (a) and (b) below, prove that if the conclusion of Green’s
theorem holds for ψ(E), then it holds for E. (This is a special case of
the statement in the proof of the divergence theorem that the region E may be
rotated and translated without loss of generality.)
(a) ψ is the translation $\psi(x,y) = (x + x_0,\ y + y_0)$.
(b) ψ is the rotation $\psi(x,y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta)$.
FIGURE 13.14: Surfaces S1 and S2 with common boundary C, panels (a) and (b).
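12.S Orient the surfaces S1 and S2 in (a) and (b) of Figure 13.14 by their
outer normals $\vec n$. Show that in
(a), $\displaystyle\int_{S_1\cup S_2} \operatorname{curl}\vec F\cdot\vec n\, dS = 0$; (b), $\displaystyle\int_{S_1} \operatorname{curl}\vec F\cdot\vec n\, dS = \int_{S_2} \operatorname{curl}\vec F\cdot\vec n\, dS$.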
13. Let a ∈ R^n, n > 2, and define an (n − 1)-form ω on R^{n+1} \ {a} by
\[ \omega_x = \|x - a\|^{-n}\sum_{i=1}^{n} (-1)^{i-1}(x_i - a_i)\, dx_1\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n . \]
Show that dω = 0. Conclude that if S is a compact, oriented n-surface-with-boundary
in R^{n+1} and a ∉ S, then $\int_{\partial S}\omega = 0$.
14.S Use the divergence theorem and 11.5.6 to show that the area of the
sphere $S_r(0)$ is $n r^{n-1}\alpha_n$, derived by another method in 13.4.2.
15. Let E ⊆ R^n be a regular region and a ∈ E. Define f on R^n \ {a} by
$f(x) = \|x - a\|^{2-n}$. Show that div ∇f = 0. Conclude that if $C_r(a) \subseteq E$,
then
\[ \int_{\mathrm{bd}(E)} (\nabla f)\cdot\vec n\, dS = \int_{S_r(a)} (\nabla f)\cdot\vec n\, dS = (2-n)\,n\,\alpha_n, \]
where $\vec n$ denotes the outer normals.
*13.6 Closed Forms in R^n
13.6.1 Definition. A C 1 m-form ω on an open subset W of Rn is said to be
closed if d ω = 0. The form ω is exact if there exists a C 2 (m − 1)-form η on
W such that d η = ω.
♦
By 13.1.16(b), an exact form is closed. The converse is false (see Exercise 13.5.2). However, there is a general class of regions on which every closed
m-form is exact. We consider first the case m = 1.
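The failure of the converse can be seen numerically for the 1-form P dx + Q dy of Exercise 13.5.2, with P = −y/(x² + y²) and Q = x/(x² + y²). The sketch below is illustrative only: the function names, the finite-difference step, and the midpoint rule are assumptions made here. It checks that Q_x − P_y is approximately zero at a sample point (so the form is closed away from the origin), yet the integral around a circle centered at the origin is approximately 2π rather than 0, so the form cannot be exact on the punctured plane.

```python
import math

def P(x, y):
    return -y / (x * x + y * y)

def Q(x, y):
    return x / (x * x + y * y)

def closedness_defect(x, y, h=1e-5):
    """Central-difference estimate of Q_x - P_y at the point (x, y)."""
    q_x = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    p_y = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return q_x - p_y

def circle_integral(r, steps=10000):
    """Midpoint-rule approximation of the integral of P dx + Q dy over the
    counterclockwise circle of radius r centered at the origin."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x, y = r * math.cos(t), r * math.sin(t)
        dx, dy = -r * math.sin(t), r * math.cos(t)
        total += P(x, y) * dx + Q(x, y) * dy
    return total * h

print(closedness_defect(0.7, -1.3))  # approximately 0
print(circle_integral(2.0))          # approximately 2*pi
```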
Closed 1-Forms on Simply Connected Regions
13.6.2 Definition. An open connected subset U of R^n is said to be simply
connected if for each closed C^2 curve ϕ : [a, b] → R^n in U there exists a C^2
function Φ : [a, b] × [0, 1] → U such that for all s ∈ [0, 1] and t ∈ [a, b],
Φ(t, 1) = ϕ(t), Φ(t, 0) = ϕ(a) = ϕ(b), and Φ(a, s) = Φ(b, s).
♦
The function Φ is called a (C^2) homotopy between ϕ and the point p := ϕ(a) =
ϕ(b).
FIGURE 13.15: Curves contracting to p must pass through q.
Note that, for each s ∈ [0, 1], Φ(·, s) is a closed C^2 curve in U such that
Φ(·, 1) = ϕ and Φ(·, 0) is a single point p. Thus a simply connected region U
has the property that every closed curve in U may be contracted smoothly to a
point while remaining in U (see Figure 13.15). In R2 this means that there are
no “holes” in U . In higher dimensions a simply connected set may have holes.
For example, Rn \ C1 (0) is simply connected if n ≥ 3. However, the holes may
not be too large: the set R3 \ L, where L is a line, is not simply connected.
To prove that every closed 1-form of class C 2 on a simply connected set is
exact, we follow [5].
13.6.3 Lemma. Let ω be a closed 1-form on a simply connected subset U of
R^n. Then $\int_\varphi \omega = 0$ for each closed C^2 curve ϕ in U.
Proof. Let $\omega = \sum_{j=1}^{n} f_j\,dx_j$ and let Φ : [a, b] × [0, 1] → U be a homotopy as
in 13.6.2. By hypothesis,
\[ 0 = d\omega = \sum_{j=1}^{n}\sum_{i=1}^{n} \partial_i f_j\, dx_i\wedge dx_j = \sum_{1\le i<j\le n} (\partial_i f_j - \partial_j f_i)\,dx_i\wedge dx_j, \]
hence
\[ \partial_i f_j = \partial_j f_i \ \text{ for all } i \text{ and } j. \tag{13.32} \]
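Define C^1 functions P(t, s) and Q(t, s) on S := [a, b] × [0, 1] by
\[ P = \sum_{j=1}^{n} (f_j\circ\Phi)\,\partial_s\Phi_j \quad\text{and}\quad Q = \sum_{i=1}^{n} (f_i\circ\Phi)\,\partial_t\Phi_i, \]
where we have suppressed the variable (t, s). By the chain and product rules,
\[ \partial_t P = \sum_{j=1}^{n} \Bigl\{ (\partial_s\Phi_j)\,[(\nabla f_j)\circ\Phi]\cdot(\partial_t\Phi) + (f_j\circ\Phi)(\partial_{ts}\Phi_j) \Bigr\} \quad\text{and} \]
\[ \partial_s Q = \sum_{i=1}^{n} \Bigl\{ (\partial_t\Phi_i)\,[(\nabla f_i)\circ\Phi]\cdot(\partial_s\Phi) + (f_i\circ\Phi)(\partial_{st}\Phi_i) \Bigr\}. \]
Since Φ is C^2, $\partial_{ts}\Phi_j = \partial_{st}\Phi_j$, hence
\begin{align*}
\partial_t P - \partial_s Q &= \sum_{j=1}^{n} (\partial_s\Phi_j)\,[(\nabla f_j)\circ\Phi]\cdot(\partial_t\Phi) - \sum_{i=1}^{n} (\partial_t\Phi_i)\,[(\nabla f_i)\circ\Phi]\cdot(\partial_s\Phi) \\
&= \sum_{i,j} (\partial_s\Phi_j)\,[(\partial_i f_j)\circ\Phi]\,(\partial_t\Phi_i) - \sum_{i,j} (\partial_t\Phi_i)\,[(\partial_j f_i)\circ\Phi]\,(\partial_s\Phi_j).
\end{align*}
By (13.32), $\partial_t P - \partial_s Q = 0$, hence, by Green’s theorem,
\[ \int_{\partial S} (P\,ds + Q\,dt) = 0. \]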
Now, the positively oriented boundary of S consists of the parameterized
line segments
\[ \psi_1(t) = (t, 0),\ a \le t \le b; \qquad \psi_2(s) = (b, s),\ 0 \le s \le 1; \]
\[ \psi_3(t) = (-t, 1),\ -b \le t \le -a; \qquad \psi_4(s) = (a, -s),\ -1 \le s \le 0. \]
FIGURE 13.16: Boundary parametrization.
(See Figure 13.16.) From the calculations
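\[ \int_{\psi_1} (P\,ds + Q\,dt) = \int_a^b Q(t,0)\,dt, \qquad \int_{\psi_2} (P\,ds + Q\,dt) = \int_0^1 P(b,s)\,ds, \]
\[ \int_{\psi_3} (P\,ds + Q\,dt) = -\int_a^b Q(t,1)\,dt, \qquad \int_{\psi_4} (P\,ds + Q\,dt) = -\int_0^1 P(a,s)\,ds, \]
we have
\begin{align*}
0 = \int_{\partial S} (P\,ds + Q\,dt) &= \int_0^1 \bigl[P(b,s) - P(a,s)\bigr]\,ds + \int_a^b \bigl[Q(t,0) - Q(t,1)\bigr]\,dt \\
&= \int_0^1 \sum_{j=1}^{n} \Bigl[ f_j\bigl(\Phi(b,s)\bigr)\,\partial_s\Phi_j(b,s) - f_j\bigl(\Phi(a,s)\bigr)\,\partial_s\Phi_j(a,s) \Bigr]\,ds \\
&\quad + \int_a^b \sum_{j=1}^{n} \Bigl[ f_j\bigl(\varphi(a)\bigr)\,\partial_t\Phi_j(t,0) - f_j\bigl(\varphi(t)\bigr)\,\partial_t\Phi_j(t,1) \Bigr]\,dt.
\end{align*}
Since Φ(b, s) = Φ(a, s), the first integral is zero, hence so is the second. Since
$\partial_t\Phi_j(t,0) = 0$ and $\partial_t\Phi_j(t,1) = \varphi_j'(t)$, we see from the second integral that
\[ \int_\varphi \omega = \int_a^b \sum_{j=1}^{n} f_j\bigl(\varphi(t)\bigr)\,\varphi_j'(t)\,dt = 0. \]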
13.6.4 Lemma. Let ϕ : [0, 1] → R^n be a piecewise C^1 curve such that
ϕ(0) = ϕ(1). Then there exists a sequence of C^∞ functions $\varphi_k : [0,1] \to \mathbb{R}^n$
with the following properties:
(a) $\varphi_k(0) = \varphi_k(1) = \varphi(0)$ for all k.
(b) $\lim_k \varphi_k'(t) = \varphi'(t)$ at each continuity point t of ϕ′.
(c) $\lim_k \varphi_k = \varphi$ uniformly on [0, 1].
(d) The sequences $\{\varphi_k\}$ and $\{\varphi_k'\}$ are uniformly bounded on [0, 1].
Proof. By considering components, we may assume that n = 1, that is, ϕ is
real-valued. Extend ϕ′ periodically with period 1 to R. Let M be a bound for
|ϕ′| on R. By 13.3.3 there exists a C^∞ function h : R → R^+ such that h > 0
on (−1, 1) and h = 0 on $(-1,1)^c$. Multiplying h by a positive constant, we
may assume that $\int_{\mathbb{R}} h = 1$. Let $h_k(x) = k\,h(kx)$, k = 1, 2, . . . . Then $h_k \ge 0$,
$h_k(x) = 0$ for |x| ≥ 1/k, and $\int_{\mathbb{R}} h_k = 1$. Define a C^∞ function $g_k$ on R by
\[ g_k(x) = \int_{-\infty}^{\infty} \varphi'(y)\,h_k(x - y)\,dy = \int_{-1/k}^{1/k} \varphi'(x + y)\,h_k(y)\,dy. \]
The sequence $\{g_k\}$ is uniformly bounded since
\[ |g_k(x)| \le \int_{-\infty}^{\infty} |\varphi'(x + y)|\,h_k(y)\,dy \le M\int_{-\infty}^{\infty} h_k(y)\,dy = M. \]
By periodicity,
\[ \int_0^1 \varphi'(x + y)\,dx = \int_0^1 \varphi'(x)\,dx = \varphi(1) - \varphi(0) = 0 \]
(Exercise 5.3.1), hence, by Fubini’s theorem,
\[ \int_0^1 g_k(x)\,dx = \int_{-\infty}^{\infty} h_k(y)\int_0^1 \varphi'(x + y)\,dx\,dy = 0. \]
Now define $\varphi_k$ on R by
\[ \varphi_k(x) = \varphi(0) + \int_0^x g_k(y)\,dy. \]
Then (a) and (d) hold and (b) follows from
\[ \varphi_k'(x) - \varphi'(x) = g_k(x) - \varphi'(x) = \int_{-1/k}^{1/k} \bigl[\varphi'(x + y) - \varphi'(x)\bigr]\,h_k(y)\,dy, \]
which tends to 0 at continuity points x as k → +∞.
Finally, (c) follows from (b), the inequality
\[ |\varphi_k(t) - \varphi(t)| \le \int_0^t |\varphi_k'(x) - \varphi'(x)|\,dx \le \int_0^1 |\varphi_k'(x) - \varphi'(x)|\,dx, \]
and Lebesgue’s dominated convergence theorem, noting that the set of discontinuity points of ϕ′ is finite and hence has measure zero.
13.6.5 Theorem. Let ω be a closed 1-form on a simply connected subset U
of R^n. Then ω is exact.
Proof. By 12.2.10 it suffices to show that $\int_\varphi \omega = 0$ for every piecewise C^1
closed curve ϕ : [0, 1] → U. Let $\{\varphi_k\}$ be as in 13.6.4. Since $\varphi_k \to \varphi$ uniformly
on [0, 1] and ϕ([0, 1]) ⊆ U, it follows that $\varphi_k([0,1]) \subseteq U$ for all sufficiently large
k (Exercise 8.5.22). For such k, $\int_{\varphi_k}\omega = 0$ by 13.6.3. By (b) and (c) of 13.6.4,
Lebesgue’s dominated convergence theorem, and the definition of integral of a
form (13.16), $\int_{\varphi_k}\omega \to \int_\varphi \omega$. Therefore, $\int_\varphi \omega = 0$, as required.
Closed m-Forms on Star-Shaped Regions
13.6.6 Definition. A subset W of Rn is said to be star-shaped with respect
to y ∈ W if the line segment from y to any point x ∈ W lies in W :
y + t(x − y) ∈ W, 0 ≤ t ≤ 1.
♦
For example, a convex set is star-shaped with respect to every one of its
points. In Figure 13.17, W is star-shaped with respect to y but not z, and V
is not star-shaped with respect to any of its points.
FIGURE 13.17: Star-shaped and non-star-shaped regions.
13.6.7 Poincaré’s Lemma. Let W ⊆ Rn be open and star-shaped with respect
to some y ∈ W . If ω is a closed C 1 m-form on W , where 1 ≤ m ≤ n, then ω
is exact.
Proof. Define a function ψ : [0, 1] × W → W by ψ(t, x) = y + t(x − y). For
an r-form
\[ \eta = \sum_{j\in J_r} g_j\,dx_j \]
on W, define the (r − 1)-form $\tilde\eta$ on W by
\[ \tilde\eta_x = \sum_{j\in J_r} \left( \int_0^1 t^{r-1}(g_j\circ\psi)(t,x)\,dt \right) \eta_j, \quad\text{where} \]
\[ \eta_j := \sum_{i=1}^{r} (-1)^{i-1}(x_{j_i} - y_{j_i})\, dx_{j_1}\wedge\cdots\wedge\widehat{dx_{j_i}}\wedge\cdots\wedge dx_{j_r}, \quad j = (j_1, \dots, j_r). \]
A standard argument shows that the definition of $\tilde\eta$ is independent of the
choice of representation of η. In particular, by putting η in canonical form we
see that η = 0 ⇒ $\tilde\eta$ = 0. Furthermore,
\[ d\eta_j = \sum_{i=1}^{r} (-1)^{i-1}\, d(x_{j_i} - y_{j_i})\wedge dx_{j_1}\wedge\cdots\wedge\widehat{dx_{j_i}}\wedge\cdots\wedge dx_{j_r} = r\,dx_j . \]
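Now let
\[ \omega = \sum_{j\in J_m} f_j\,dx_j . \]
Then
\[ \tilde\omega = \sum_{j\in J_m} \left( \int_0^1 t^{m-1}(f_j\circ\psi)(t,x)\,dt \right) \omega_j, \]
and, by 13.1.16(d) (suppressing the variables (t, x) in $f_j\circ\psi(t,x)$),
\[ d\tilde\omega = \sum_{j\in J_m} \left\{ d\left( \int_0^1 t^{m-1} f_j\circ\psi\,dt \right)\wedge\omega_j + \left( \int_0^1 t^{m-1} f_j\circ\psi\,dt \right) d\omega_j \right\}. \]
Differentiating under the integral sign, applying the chain rule, and noting
that $\psi_x = tI_n$, we have
\[ d\left( \int_0^1 t^{m-1}(f_j\circ\psi)\,dt \right) = \sum_{i=1}^{n} \left( \int_0^1 t^{m}\,(\partial_i f_j)\circ\psi\,dt \right) dx_i . \]
Therefore, using $d\omega_j = m\,dx_j$,
\[ d\tilde\omega = \sum_{j\in J_m} \left\{ \sum_{i=1}^{n} \left( \int_0^1 t^{m}\,(\partial_i f_j)\circ\psi\,dt \right) dx_i\wedge\omega_j + m\left( \int_0^1 t^{m-1} f_j\circ\psi\,dt \right) dx_j \right\}. \tag{13.33} \]
On the other hand,
\[ d\omega = \sum_{j\in J_m} \left( \sum_{i=1}^{n} \partial_i f_j\,dx_i \right)\wedge dx_j = \sum_{j\in J_m}\sum_{i=1}^{n} \partial_i f_j\, dx_i\wedge dx_j, \]
hence, since dω = 0,
\[ \sum_{j\in J_m}\sum_{i=1}^{n} \left( \int_0^1 t^{m}\,(\partial_i f_j)\circ\psi(t,x)\,dt \right) (d\omega)_{(i,j)} = \widetilde{d\omega} = 0. \]
By the above definition,
\begin{align*}
(d\omega)_{(i,j)} &= \sum_{\ell=1}^{m} (-1)^{\ell}(x_{j_\ell} - y_{j_\ell})\, dx_i\wedge dx_{j_1}\wedge\cdots\wedge\widehat{dx_{j_\ell}}\wedge\cdots\wedge dx_{j_m} + (x_i - y_i)\,dx_{j_1}\wedge\cdots\wedge dx_{j_m} \\
&= -\,dx_i\wedge\omega_j + (x_i - y_i)\,dx_j,
\end{align*}
hence
\[ \sum_{j\in J_m}\sum_{i=1}^{n} \left( \int_0^1 t^{m}\,(\partial_i f_j)\circ\psi\,dt \right) \bigl[ -\,dx_i\wedge\omega_j + (x_i - y_i)\,dx_j \bigr] = 0. \tag{13.34} \]
Adding (13.33) and (13.34), we obtain
\[ d\tilde\omega = \sum_{j\in J_m} \left\{ m\int_0^1 t^{m-1}(f_j\circ\psi)\,dt + \sum_{i=1}^{n} (x_i - y_i)\int_0^1 t^{m}\,(\partial_i f_j)\circ\psi\,dt \right\} dx_j . \]
The term in braces is simply
\[ \int_0^1 \frac{d}{dt}\bigl[t^{m} f_j\circ\psi\bigr]\,dt = t^{m} f_j\circ\psi\,\Big|_0^1 = f_j . \]
Therefore, $d\tilde\omega = \omega$, which shows that ω is exact.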
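For m = 1 the construction in the proof reduces to an explicit potential: if ω = Σ f_j dx_j, then the 0-form built in the proof is the function x ↦ Σ_j (x_j − y_j) ∫_0^1 f_j(y + t(x − y)) dt. The sketch below evaluates this integral numerically. The function name `potential`, the midpoint rule, and the sample closed form 2xy dx + (x² + 3y²) dy on R² (star-shaped with respect to the origin) are assumptions chosen for illustration; for this sample the formula should reproduce x²y + y³.

```python
def potential(f, x, base, steps=2000):
    """Evaluate sum_j (x_j - base_j) * integral_0^1 f_j(base + t(x - base)) dt
    by the midpoint rule. `f` maps a point (tuple) to the tuple of
    coefficients (f_1, ..., f_n) of a closed 1-form."""
    n = len(x)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        point = tuple(base[i] + t * (x[i] - base[i]) for i in range(n))
        coeffs = f(point)
        total += sum(coeffs[i] * (x[i] - base[i]) for i in range(n))
    return total * h

# Sample closed 1-form (an assumption): 2xy dx + (x^2 + 3y^2) dy.
def sample_form(p):
    x, y = p
    return (2 * x * y, x * x + 3 * y * y)

value = potential(sample_form, (1.5, 2.0), (0.0, 0.0))
print(value)  # close to 1.5**2 * 2 + 2**3 = 12.5
```

A different base point changes the result only by an additive constant, which is consistent with the potential being determined up to a constant on a connected set.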
From Poincaré’s lemma we obtain the following results from classical vector
analysis, where, in keeping with the spirit, we write grad f for ∇f .
13.6.8 Corollary. Let W be an open star-shaped subset of R^3 and let
\[ \vec F(x,y,z) = \bigl(P(x,y,z),\ Q(x,y,z),\ R(x,y,z)\bigr) \]
be a C^1 vector field on W. Then
(a) $\operatorname{curl}\vec F = 0$ iff $\vec F = \operatorname{grad} f$ for some C^2 function f : W → R.
(b) $\operatorname{div}\vec F = 0$ iff $\vec F = \operatorname{curl}\vec G$ for some C^2 vector field $\vec G$ on W.
Proof. (a) If $\vec F = \operatorname{grad} f = (f_x, f_y, f_z)$, then
\[ \operatorname{curl}\vec F = (f_{zy} - f_{yz},\ f_{xz} - f_{zx},\ f_{yx} - f_{xy}), \]
which is zero because f is C^2. Conversely, assume that $\operatorname{curl}\vec F = 0$, that is,
\[ R_y - Q_z = P_z - R_x = Q_x - P_y = 0. \]
Let ω = P dx + Q dy + R dz. Then
\begin{align*}
d\omega &= (P_y\,dy + P_z\,dz)\wedge dx + (Q_x\,dx + Q_z\,dz)\wedge dy + (R_x\,dx + R_y\,dy)\wedge dz \\
&= (Q_x - P_y)\,dx\wedge dy + (R_x - P_z)\,dx\wedge dz + (R_y - Q_z)\,dy\wedge dz = 0,
\end{align*}
so ω is closed. By Poincaré’s lemma, there exists a 0-form f of class C^2 on W
such that df = ω, that is, $\operatorname{grad} f = \vec F$.
(b) If $\vec F = \operatorname{curl}\vec G$, where $\vec G = (f, g, h)$, then
\[ P = h_y - g_z, \quad Q = f_z - h_x, \quad\text{and}\quad R = g_x - f_y, \]
hence, if $\vec G$ is C^2,
\[ \operatorname{div}\vec F = P_x + Q_y + R_z = (h_{yx} - g_{zx}) + (f_{zy} - h_{xy}) + (g_{xz} - f_{yz}) = 0. \]
Conversely, assume $\operatorname{div}\vec F = 0$ and let
\[ \omega = R\,dx\wedge dy + P\,dy\wedge dz + Q\,dz\wedge dx. \]
Then $d\omega = \operatorname{div}\vec F\; dx\wedge dy\wedge dz$, hence ω is closed. By Poincaré’s lemma,
\[ \omega = d(f\,dx + g\,dy + h\,dz) = (g_x - f_y)\,dx\wedge dy + (h_x - f_z)\,dx\wedge dz + (h_y - g_z)\,dy\wedge dz \]
for some C^2 functions f, g, h on W. Therefore,
\[ P = h_y - g_z, \quad Q = f_z - h_x, \quad R = g_x - f_y, \]
that is, $\vec F = \operatorname{curl}(f, g, h)$.
Part III
Appendices
Appendix A
Set Theory
In this appendix we give an overview of those aspects of elementary set theory
that are used throughout the book. For details the reader may wish to consult
[2, 8].
Notation for a Set
A set is simply a collection of objects, each of which is called a member or
element of the set. Sets are usually denoted by capital letters, and members of
a set by small letters. If x is a member of the set A, we write x ∈ A; otherwise,
we write x ∉ A. The empty set, denoted by ∅, is the set with no members.
A concrete set may be described either by listing its elements or by set-builder notation. The latter notation is of the form {x : P (x)}, which is read
“the set of all x such that P (x),” where P (x) is a well-defined property that x
must possess in order to belong to the set. For example, the set A of all odd
positive integers may be described as
A = {1, 3, 5, . . .} = {n : n = 2m − 1 for some positive integer m}.
A set A is a subset of a set B, written A ⊆ B, if every member of A is a
member of B. If A ⊆ B and A ≠ B, then A is called a proper subset of B.
The empty set is a subset of every set and a proper subset of every nonempty
set. Sets A and B are said to be equal, written A = B, if each is a subset of
the other. If all sets under consideration are subsets of the set S, then S is
called a universal set (of discourse).
Set Operations
Let S be a universal set. The basic set operations are
A ∪ B = {x : x ∈ A or x ∈ B}, union of A and B;
A ∩ B = {x : x ∈ A and x ∈ B}, intersection of A and B;
A × B = {(x, y) : x ∈ A and y ∈ B}, Cartesian product of A and B;
A^c = {x : x ∈ S and x ∉ A}, complement of A in S;
A \ B = {x : x ∈ A and x ∉ B}, difference of A and B.
More generally, if {Ai : i ∈ I} is an arbitrary collection of sets indexed by a
set I, then the union and intersection of the collection are defined, respectively,
by
\[ \bigcup_{i\in I} A_i = \{x : x \in A_i \text{ for some } i \in I\}, \]
\[ \bigcap_{i\in I} A_i = \{x : x \in A_i \text{ for every } i \in I\}. \]
If the index set is {1, 2, . . . , n} or {1, 2, . . . , n, . . .}, we use the alternate notation
\[ \bigcup_{j=1}^{n} A_j = A_1\cup A_2\cup\cdots\cup A_n, \qquad \bigcap_{j=1}^{n} A_j = A_1\cap A_2\cap\cdots\cap A_n \]
and
\[ \bigcup_{j=1}^{\infty} A_j = A_1\cup A_2\cup\cdots, \qquad \bigcap_{j=1}^{\infty} A_j = A_1\cap A_2\cap\cdots \]
A sequence of sets An is said to be increasing if A1 ⊆ A2 ⊆ · · · , in which
case we write An ↑. Similarly, the sequence is decreasing if A1 ⊇ A2 ⊇ · · · ,
written An ↓. In the first case we also write An ↑ A, where A = A1 ∪ A2 ∪ · · · ,
and in the second An ↓ A, where A = A1 ∩ A2 ∩ · · · .
For finitely many sets we extend the definition of Cartesian product by
\[ \prod_{j=1}^{n} A_j = A_1\times\cdots\times A_n = \{(a_1, \dots, a_n) : a_j \in A_j,\ j = 1, \dots, n\}, \]
where $(a_1, \dots, a_n)$ is an (ordered) n-tuple. Also, we write
\[ A^n = \underbrace{A\times A\times\cdots\times A}_{n}. \]
In particular, for an interval [a, b] and the set of all real numbers R,
\[ [a,b]^n = \underbrace{[a,b]\times\cdots\times[a,b]}_{n} \quad\text{and}\quad \mathbb{R}^n = \underbrace{\mathbb{R}\times\cdots\times\mathbb{R}}_{n}. \]
The following propositions summarize the basic properties of set operations
that will be needed in the text. As with many set equalities, they may be
established directly by showing that an arbitrary member of the left side of an
equation is a member of the right side, and vice versa.
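Proposition. If {Ai : i ∈ I} is a collection of subsets of a set S, then
(a) $\Bigl(\bigcup_{i\in I} A_i\Bigr)^c = \bigcap_{i\in I} A_i^c$.
(b) $A \cup \bigcap_{i\in I} A_i = \bigcap_{i\in I} (A\cup A_i)$.
(c) $\Bigl(\bigcap_{i\in I} A_i\Bigr)^c = \bigcup_{i\in I} A_i^c$.
(d) $A \cap \bigcup_{i\in I} A_i = \bigcup_{i\in I} (A\cap A_i)$.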
Parts (a) and (c) of the above proposition are known as DeMorgan’s laws.
Parts (b) and (d) are called distributive laws.
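The four identities can be spot-checked on a small finite example with Python's built-in `set` type. This is only an illustration, not a proof: the universal set, the family of sets, and the set A below are arbitrary choices made for the sketch.

```python
# Universal set S, an indexed family {A_i}, and a further set A (all illustrative).
S = set(range(10))
family = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7}]
A = {1, 3, 8}

def complement(X):
    """Complement of X in the universal set S."""
    return S - X

union_all = set().union(*family)
inter_all = set.intersection(*family)

# (a) and (c): DeMorgan's laws.
law_a = complement(union_all) == set.intersection(*[complement(X) for X in family])
law_c = complement(inter_all) == set().union(*[complement(X) for X in family])

# (b) and (d): distributive laws.
law_b = (A | inter_all) == set.intersection(*[A | X for X in family])
law_d = (A & union_all) == set().union(*[A & X for X in family])

print(law_a, law_b, law_c, law_d)  # True True True True
```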
Proposition. The Cartesian product of sets has the following properties:
(a) A × (A1 ∪ · · · ∪ An) = (A × A1) ∪ · · · ∪ (A × An).
(b) A × (A1 ∩ · · · ∩ An) = (A × A1) ∩ · · · ∩ (A × An).
(c) (A1 ∩ · · · ∩ An) × (B1 ∩ · · · ∩ Bn) = (A1 × B1) ∩ · · · ∩ (An × Bn).
Partitions and Equivalence Relations
A collection of sets is pairwise disjoint if A ∩ B = ∅ for each pair of distinct
members A and B in the collection. A partition of a set S is a collection of
nonempty pairwise disjoint sets whose union is S.
An equivalence relation on a set S is a subset R of S × S with the following
properties:
• (reflexivity) xRx for every x ∈ S;
• (symmetry) xRy ⇒ yRx;
• (transitivity) xRy and yRz ⇒ xRz.
Here, as is customary, we have written xRy for (x, y) ∈ R.
There is an important duality regarding partitions and equivalence relations:
If R is an equivalence relation on S, then the collection of sets of the form
[x] := {y ∈ S : xRy},
called an equivalence class of the relation, is a partition of S. Conversely, given
a partition of S, define
xRy iff x and y are in the same partition member.
Then R is an equivalence relation on S whose equivalence classes are precisely
the members of the partition.
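The passage from an equivalence relation to its partition can be sketched in code. The helper name `equivalence_classes` and the sample relation (congruence modulo 3 on {0, . . . , 9}) are assumptions for illustration. The sketch relies on the relation really being an equivalence relation, since each element is compared with only one representative per class.

```python
def equivalence_classes(elements, related):
    """Group `elements` into the classes [x] of the equivalence relation
    `related`; transitivity and symmetry justify testing one representative."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})  # x starts a new class
    return classes

# Sample relation (an assumption): x R y iff x - y is divisible by 3.
classes = equivalence_classes(range(10), lambda x, y: (x - y) % 3 == 0)
print(classes)  # [{0, 3, 6, 9}, {1, 4, 7}, {2, 5, 8}]
```

The resulting classes are nonempty, pairwise disjoint, and have union {0, . . . , 9}, so they form a partition in the sense defined above.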
Functions
Let A and B be nonempty sets. A function or mapping from A to B is a
rule f that assigns to each member x of A a unique member f (x) of B. We
then write f : A → B. The set A is called the domain of f . The alternate
notation x ↦ f(x) : A → B is also used. If A0 ⊆ A and B0 ⊆ B, then
\[ f(A_0) = \{f(x) : x \in A_0\} \quad\text{and}\quad f^{-1}(B_0) = \{x \in A : f(x) \in B_0\} \]
are called, respectively, the image of A0 and the pre-image of B0 under f. The
set f(A) is called the range of f. A function f : A → B is said to be onto B if
f(A) = B, and one-to-one if x1 ≠ x2 implies f(x1) ≠ f(x2).
Proposition. Let f : A → B be a function, {Ai : i ∈ I} a collection of subsets
of A, and {Bj : j ∈ J} a collection of subsets of B. Then
(a) $f^{-1}\Bigl(\bigcup_{j\in J} B_j\Bigr) = \bigcup_{j\in J} f^{-1}(B_j)$.
(b) $f^{-1}\Bigl(\bigcap_{j\in J} B_j\Bigr) = \bigcap_{j\in J} f^{-1}(B_j)$.
(c) $f\Bigl(\bigcup_{i\in I} A_i\Bigr) = \bigcup_{i\in I} f(A_i)$.
(d) $f\Bigl(\bigcap_{i\in I} A_i\Bigr) \subseteq \bigcap_{i\in I} f(A_i)$, where equality holds if f is one-to-one.
(e) $f^{-1}(B_j^c) = \bigl(f^{-1}(B_j)\bigr)^c$.
(f) $f(A_i^c) \subseteq \bigl(f(A_i)\bigr)^c$ if f is one-to-one, where equality holds if f is also onto B.
(g) $f\bigl(f^{-1}(B_j)\bigr) \subseteq B_j$, where equality holds if f is onto B.
(h) $A_i \subseteq f^{-1}\bigl(f(A_i)\bigr)$, where equality holds if f is one-to-one.
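A small finite example shows why the inclusions for images can be strict when f is not one-to-one, while pre-images behave well. The helper names `image` and `preimage`, the domain, and the choice f(x) = x² are illustrative assumptions.

```python
def image(f, subset):
    """The image f(subset) = {f(x) : x in subset}."""
    return {f(x) for x in subset}

def preimage(f, domain, subset):
    """The pre-image {x in domain : f(x) in subset}."""
    return {x for x in domain if f(x) in subset}

domain = {-2, -1, 0, 1, 2}

def f(x):
    return x * x  # not one-to-one on `domain`

A1, A2 = {-2, -1}, {1, 2}

# Image of an intersection can be strictly smaller than the intersection of images.
lhs = image(f, A1 & A2)            # image of the empty set
rhs = image(f, A1) & image(f, A2)  # {1, 4}

# A set can be strictly smaller than the pre-image of its image.
back = preimage(f, domain, image(f, A1))  # {-2, -1, 1, 2}

# Pre-images commute with unions.
B1, B2 = {0, 1}, {4}
pre_union = preimage(f, domain, B1 | B2)
union_pre = preimage(f, domain, B1) | preimage(f, domain, B2)

print(lhs, rhs, back, pre_union == union_pre)
```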
If f : A → B and g : C → D are functions with B ⊆ C, then the
composition of g and f is the function g ◦ f : A → D defined by
\[ (g\circ f)(x) = g\bigl(f(x)\bigr), \quad x \in A. \]
If D0 ⊆ D, then
\[ (g\circ f)^{-1}(D_0) = f^{-1}\bigl(g^{-1}(D_0)\bigr). \]
If f : A → B is one-to-one and onto B, then the inverse $f^{-1} : B \to A$ is
defined by the rule $x = f^{-1}(y)$ iff y = f(x). One then has the identities
\[ (f^{-1}\circ f)(x) = x \quad\text{and}\quad (f\circ f^{-1})(y) = y, \quad x \in A,\ y \in B. \]
Thus $f^{-1}\circ f$ and $f\circ f^{-1}$ are the identity functions on A and B, respectively.
Cardinality
Two sets A and B are said to have the same cardinality if there exists a
one-to-one function from A onto B. A set A is finite if either A is the empty set
or A has the same cardinality as {1, 2, . . . , n} for some positive integer n. In
the latter case, the members of A may be labeled with the numbers 1, 2, . . . , n,
so A may be written {a1 , a2 , . . . , an }. A set A is countably infinite if it has the
same cardinality as the set of natural numbers. In this case the members of A
may be labeled with the positive integers 1, 2, 3, . . . A set is countable if it is
either finite or countably infinite; otherwise it is said to be uncountable. The
set of all integers is countably infinite, as is the set of rational numbers. The
set R of all real numbers is uncountable, as is any (nondegenerate) interval of
real numbers.
Appendix B
Linear Algebra
This appendix contains a brief review of the main ideas of linear algebra that
will be needed in Part II of the text. For details and proofs the reader is
referred to [9].
Vector Spaces. Bases
A vector space is a set V containing at least one member 0, called the
zero vector, together with two operations u + v and au (u, v ∈ V, a ∈ R),
called vector addition and scalar multiplication, respectively, such that for all
u, v, w ∈ V and a, b ∈ R the following axioms hold:
• Associativity of addition: (u + v) + w = u + (v + w).
• Commutativity of addition: u + v = v + u.
• Additive identity: v + 0 = v.
• Existence of additive inverse: u + (−u) = 0.
• Associativity of scalar multiplication: (ab)u = a(bu).
• Scalar distributivity: a(u + v) = au + av.
• Vector distributivity: (a + b)u = au + bu.
• Scalar multiplicative identity: 1u = u.
A subset W of V containing the zero vector and closed under the operations
of vector addition and scalar multiplication is called a subspace of V. The set
W is then a vector space under the operations it inherits from V. A linear
combination of vectors $v_1, \dots, v_n \in V$ is an expression of the form
\[ c_1 v_1 + \cdots + c_n v_n, \quad c_j \in \mathbb{R}. \]
The set of all linear combinations of $v_1, \dots, v_n$ is called the linear span of
$v_1, \dots, v_n$ or the subspace spanned by $v_1, \dots, v_n$. If this subspace is all
of V, the vectors $v_1, \dots, v_n$ are said to span V.
Vectors $v_1, \dots, v_n \in V$ are linearly independent if an equation of the form
\[ c_1 v_1 + \cdots + c_n v_n = 0 \]
can hold only if $c_1 = \cdots = c_n = 0$. A basis for V is a finite set of linearly
independent vectors that span V. It follows that each member of V is uniquely
expressible as a linear combination of the basis vectors. A vector space that has
a basis is said to be finite dimensional; otherwise it is infinite dimensional. All
bases in a finite dimensional vector space V have the same number of vectors.
This number is called the dimension of the vector space and is denoted by
dim V. A frame for a finite dimensional vector space is an ordered basis.
If V is finite dimensional, then every set of linearly independent vectors
may be extended to a basis, and every finite set of vectors that span V may
be reduced to a basis.
An important example of a finite dimensional vector space is Euclidean
space Rn (Section 1.6). The standard basis in Rn consists of the n vectors
e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 0, 1).
An example of an infinite dimensional vector space is the set of all Riemann
integrable functions on [a, b] with the operations of pointwise addition and
scalar multiplication.
A basis {w1, . . . , wm} for a subspace W of R^n is orthonormal if
\[ w_i\cdot w_j = \begin{cases} 0 & \text{if } i \ne j, \\ 1 & \text{if } i = j, \end{cases} \]
where (·) is the usual inner (= dot) product on R^n. For example, the standard
basis is orthonormal. Every subspace of R^n has an orthonormal basis.
Linear Transformations
Let U and V be vector spaces. A linear transformation from U to V is a
function T : U → V with the properties
T (u + v) = T u + T v and T (cu) = cT u, u, v ∈ U, c ∈ R.
Here, we have used the convention for linear transformations of dropping the
parentheses in the notation T (u) when there is no danger of ambiguity. The
collection of all linear transformations from U to V is denoted by L(U , V). It
is a vector space under the operations T1 + T2 and cT defined by
(T1 + T2 )(u) = T1 u + T2 u, (cT )u = c(T u),
u ∈ U , c ∈ R.
If T ∈ L(U, V) and S ∈ L(V, W), then ST := S ◦ T is a member of L(U, W).
Also, the subspace N (T ) := T −1 ({0}) of U is called the nullspace of T . The
range of T , which is a subspace of V, is denoted by R(T ). If U and R(T ) are
finite dimensional, then
dim N (T ) + dim R(T ) = dim U.
If T ∈ L(U, V) is one-to-one and onto V, then $T^{-1} \in L(V, U)$. In this case
T is said to be invertible. If U and V are finite dimensional with dim U = dim V,
then T is invertible iff N(T) = {0} iff R(T) = V. In this case T maps a frame
$(u_1, \dots, u_n)$ in U onto a frame $(v_1, \dots, v_n)$ in V, where $v_j = T u_j$. We
indicate this by writing
\[ T(u_1, \dots, u_n) = (v_1, \dots, v_n). \]
Matrices
An m × n matrix is a rectangular array of real numbers with m rows and
n columns. It is written variously as
\[ A = [a_i^j]_{m\times n} = \begin{bmatrix} a_1^1 & a_1^2 & \cdots & a_1^n \\ a_2^1 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \cdots & \vdots \\ a_m^1 & a_m^2 & \cdots & a_m^n \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix} = \begin{bmatrix} a^1 & a^2 & \cdots & a^n \end{bmatrix}, \]
where $a_i = (a_i^1, \dots, a_i^n)$ is the ith row of A and $a^j = (a_1^j, \dots, a_m^j)$ is the jth
column of A (written, of course, as a column). The number $a_i^j$ located in row i
and column j of the matrix is also written $a_{ij}$ and is called the (i, j)th entry
of A.
For a ∈ R and matrices $A = [a_i^j]_{m\times n}$, $B = [b_i^j]_{m\times n}$, and $C = [c_i^j]_{n\times p}$, the
sum A + B, scalar multiple aA, and product AC are defined, respectively, by
\[ A + B = [x_i^j],\ x_i^j := a_i^j + b_i^j, \qquad aA = [y_i^j],\ y_i^j := a\,a_i^j, \qquad AC = [z_i^j],\ z_i^j := \sum_{k=1}^{n} a_i^k c_k^j . \]
The product AC may also be written as
\[ \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}_{m\times n} \begin{bmatrix} c^1 & c^2 & \cdots & c^p \end{bmatrix}_{n\times p} = [a_i\cdot c^j]_{m\times p} . \]
The m × n matrix $O_{m\times n}$ with all entries equal to 0 is called a zero matrix.
It has the property that $A + O_{m\times n} = A$ for all m × n matrices A. The collection
of m × n matrices is a vector space under the operations A + B and aA and
with zero $O_{m\times n}$.
The transpose of an m × n matrix A is the n × m matrix $A^t := [x_i^j]$, where
$x_i^j = a_j^i$. For example
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^t = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}. \]
The transpose operation has the following properties:
\[ (A + B)^t = A^t + B^t, \quad (aA)^t = aA^t, \quad (AC)^t = C^t A^t . \]
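The product and transpose rules can be tried out with matrices stored as lists of rows. This is a minimal sketch: the helper names `transpose` and `matmul` and the sample matrix C are assumptions, while A is the 2 × 3 matrix of the transpose example.

```python
def transpose(A):
    """Rows of A become the columns of the result."""
    return [list(col) for col in zip(*A)]

def matmul(A, C):
    """Product of an m x n matrix A and an n x p matrix C:
    entry (i, j) is the dot product of row i of A with column j of C."""
    n, p = len(C), len(C[0])
    return [[sum(A[i][k] * C[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]
C = [[1, 0],
     [2, 1],
     [0, 3]]  # illustrative 3 x 2 matrix

print(transpose(A))                    # [[1, 4], [2, 5], [3, 6]]
print(matmul(A, C))                    # [[5, 11], [14, 23]]
print(transpose(matmul(A, C)) ==
      matmul(transpose(C), transpose(A)))  # True: (AC)^t = C^t A^t
```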
For each n, the matrix
\[ I_n := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]
is called the nth order identity matrix. It has the property that
AIn = A and In B = B
for all m × n matrices A and all n × p matrices B.
An n×n matrix A is said to be nonsingular if there exists a matrix, denoted
by A−1 and called the inverse of A, such that
AA−1 = A−1 A = In .
The inverse operation has the property
(AB)−1 = B −1 A−1
for all nonsingular n × n matrices A and B.
An m × n matrix A is said to be in reduced row echelon form if the following
conditions hold:
• Any nonzero row has its first nonzero entry equal to 1. This entry is then called
the leading entry of the row.
• If rows i and k are nonzero and i < k, then the leading entry of row i is
to the left of the leading entry of row k.
• Entries above and below a leading entry are zero.
• Any zero row is below all nonzero rows.
For example, the following matrix is in reduced row echelon form:
\[
\begin{bmatrix} 0 & 1 & 0 & 3 & 0 \\ 0 & 0 & 1 & 7 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} .
\]
For a given $n$, $I_n$ is the only $n \times n$ matrix in reduced row echelon form without any zero rows.
An elementary row operation on an m × n matrix A is one of the following:
• Interchange a pair of rows.
• Multiply a row by a nonzero scalar.
• Add to one row a scalar multiple of another.
Linear Algebra
513
An elementary matrix is a matrix obtained from the identity matrix by an
elementary row operation. Each elementary row operation on A may be achieved
by multiplying A on the left by a suitable elementary matrix. For example,
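the multiplication
\[
\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
=
\begin{bmatrix} 4 & 5 & 6 \\ 1 & 2 & 3 \\ 7 & 8 & 9 \end{bmatrix}
\]
switches the first and second rows of $A$, and the multiplication
\[
\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
=
\begin{bmatrix} 1 & 2 & 3 \\ 6 & 9 & 12 \\ 7 & 8 & 9 \end{bmatrix}
\]
adds twice row one to row two. Using elementary operations, one may transform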
any m × n matrix A into reduced row echelon form R. It follows that there
exists a sequence of elementary matrices Ej such that
R = Ep Ep−1 · · · E1 A.
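The effect of left multiplication by an elementary matrix can be reproduced directly. This is a small sketch in plain Python; the names `matmul`, `E_swap`, and `E_add` are illustrative assumptions.

```python
def matmul(E, A):
    # Entry (i, j) is the dot product of row i of E with column j of A.
    return [[sum(e * a for e, a in zip(row, col)) for col in zip(*A)]
            for row in E]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Identity with rows 1 and 2 interchanged.
E_swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
# Identity with twice row 1 added to row 2.
E_add = [[1, 0, 0], [2, 1, 0], [0, 0, 1]]

print(matmul(E_swap, A))  # rows 1 and 2 of A are interchanged
print(matmul(E_add, A))   # row 2 becomes row 2 + 2 * row 1
```

Composing several such multiplications from right to left corresponds to the product $E_p E_{p-1} \cdots E_1 A$.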
The row rank (column rank) of a matrix A is the maximum number of
linearly independent rows (columns) of A. The row rank of a matrix is always
equal to the column rank. (This is clear for the reduced row echelon form.)
The rank of a matrix is its row (= column) rank.
The Matrix of a Linear Transformation
Let $T \in L(\mathbb{R}^n, \mathbb{R}^m)$. The matrix of $T$ is defined by
\[
[T] = \begin{bmatrix} Te_1 & Te_2 & \cdots & Te_n \end{bmatrix}
\]
(where $Te_j$ is written as a column). If $Te_j = (a_1^j, a_2^j, \dots, a_m^j)$ and $x = (x_1, \dots, x_n) = \sum_{j=1}^n x_j e_j$, then, by linearity of $T$,
\[
T(x_1, x_2, \dots, x_n) = \sum_{j=1}^n x_j Te_j = \sum_{j=1}^n (a_1^j x_j, a_2^j x_j, \dots, a_m^j x_j)
= \Big( \sum_{j=1}^n a_1^j x_j,\ \sum_{j=1}^n a_2^j x_j,\ \dots,\ \sum_{j=1}^n a_m^j x_j \Big),
\]
which may be written in column matrix form as
\[
[T]x^t = \begin{bmatrix} a_1^1 & a_1^2 & \cdots & a_1^n \\ a_2^1 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \cdots & \vdots \\ a_m^1 & a_m^2 & \cdots & a_m^n \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} .
\]
Note that $a_i^j$ may be expressed as $(Te_j) \cdot e_i$.
The operations of addition, scalar multiplication, and composition of linear transformations correspond to addition, scalar multiplication, and multiplication of matrices in the following way: If $T, T' \in L(\mathbb{R}^n, \mathbb{R}^m)$ and $S \in L(\mathbb{R}^m, \mathbb{R}^p)$, then
\[
[T + T'] = [T] + [T'], \qquad [tT] = t[T], \qquad [ST] = [S][T].
\]
In particular, if $T \in L(\mathbb{R}^n, \mathbb{R}^n)$, then $T$ is invertible iff $[T]$ is nonsingular.
An $n \times n$ matrix $A$ is orthogonal if $AA^t = I_n$, that is, if $A^t = A^{-1}$. In this case $\det A = \pm 1$, although the converse fails. (See below.) A linear transformation $T \in L(\mathbb{R}^n, \mathbb{R}^n)$ is said to be orthogonal if $[T]$ is orthogonal.
Determinants
A permutation of the $n$-tuple $(1, \dots, n)$ is a one-to-one function $\sigma$ mapping $\{1, \dots, n\}$ onto itself. It is frequently denoted by $(i_1, \dots, i_n)$, where $i_k = \sigma(k)$. The permutation is said to be even or odd according as an even or odd number of adjacent interchanges are required to transform $(i_1, \dots, i_n)$ to $(1, \dots, n)$ (or vice versa). For example, $(3, 2, 1)$ is odd and $(4, 3, 2, 1)$ is even. The sign of a permutation $\sigma$ is defined by
\[
(-1)^{\sigma} = \begin{cases} 1 & \text{if } \sigma \text{ is even,} \\ -1 & \text{if } \sigma \text{ is odd.} \end{cases}
\]
We then have
\[
(-1)^{\sigma\tau} = (-1)^{\sigma}(-1)^{\tau} \quad\text{and}\quad (-1)^{\sigma^{-1}} = (-1)^{\sigma},
\]
where, as is customary, $\tau\sigma$ stands for $\tau \circ \sigma$.
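The determinant of an $n \times n$ matrix $A = [a_i^j]$ is defined by
\[
\det A = \begin{vmatrix} a_1^1 & a_1^2 & \cdots & a_1^n \\ a_2^1 & a_2^2 & \cdots & a_2^n \\ \vdots & \vdots & \cdots & \vdots \\ a_n^1 & a_n^2 & \cdots & a_n^n \end{vmatrix}
:= \sum_{\sigma} (-1)^{\sigma} a_1^{\sigma(1)} \cdots a_n^{\sigma(n)},
\]
where the sum is taken over all permutations $\sigma$ of $(1, \dots, n)$. For example,
\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc,
\]
since $(-1)^{(1,2)} = 1$ and $(-1)^{(2,1)} = -1$.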
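The permutation-sum definition translates almost literally into code. The sketch below is illustrative only: the function names are assumptions, permutations are 0-indexed, and the sign is computed from the inversion count, whose parity agrees with the parity of the number of adjacent interchanges. The sum has $n!$ terms, so this is practical only for small $n$.

```python
from itertools import permutations

def sign(perm):
    # Parity of the number of inversions equals the parity of the number
    # of adjacent interchanges needed to sort the permutation.
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    # Sum over all permutations sigma of sign(sigma) * prod_i A[i][sigma(i)].
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total

print(sign((2, 1, 0)), sign((3, 2, 1, 0)))  # (3,2,1) is odd, (4,3,2,1) is even
print(det([[1, 2], [3, 4]]))                # ad - bc
```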
If T ∈ L(Rn , Rn ) we denote the determinant of the matrix of T by det T
rather than by the more cumbersome det[T ].
The following theorem summarizes the main properties of determinants.
Parts (a)–(f) follow directly from the above definition; part (g) is proved in
Chapter 13.
Theorem. Let $A = [a^1 \cdots a^n]$ be an $n \times n$ matrix and $t \in \mathbb{R}$. Then
(a) $\det[a^1 \cdots ta^j \cdots a^n] = t \det[a^1 \cdots a^j \cdots a^n]$.
(b) $\det[a^1 \cdots a^j + b \cdots a^n] = \det[a^1 \cdots a^j \cdots a^n] + \det[a^1 \cdots b \cdots a^n]$.
(c) Interchanging two rows of $A$ changes the sign of the determinant.
(d) If $A$ has a pair of duplicate rows, then $\det A = 0$.
(e) Adding a multiple of one row to another does not change the value of the determinant.
(f) $\det A^t = \det A$. Thus any “row property” has a corresponding “column property.”
(g) If $B$ is an $n \times n$ matrix, then $\det(AB) = (\det A)(\det B)$.
The following theorem is frequently useful in evaluating determinants.
Theorem. Let $A = [a_{ij}]$ be an $n \times n$ matrix, and for each $(i, j)$, let $A_{ij}$ denote the matrix obtained by removing row $i$ and column $j$ from $A$. Then for each fixed $i$ and $j$,
\[
\det A = \sum_{k=1}^{n} (-1)^{i+k} a_{ik} \det A_{ik} = \sum_{k=1}^{n} (-1)^{k+j} a_{kj} \det A_{kj} .
\]
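The first equality is called expansion along row $i$ and the second expansion along column $j$. For example, expanding along row 1,
\[
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} .
\]
The formula $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ may then be used to complete the evaluation.
For another example, consider
\[
\begin{vmatrix} I_p & C_{p\times q} \\ O_{q\times p} & D_{q\times q} \end{vmatrix} = \det D,
\]
obtained by successive expansion along the first column.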
The preceding theorem may be used to prove the following result.
Theorem. Let $A = [a_{ij}]$ be an $n \times n$ matrix. Then $A^{-1}$ exists iff $\det A \neq 0$. In this case the $(i, j)$ entry of $A^{-1}$ is
\[
(-1)^{i+j} \frac{\det A_{ji}}{\det A} .
\]
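The last theorem may be used to prove Cramer’s Rule: Consider a system of $n$ equations in $n$ unknowns, written in matrix form as $Ax = b$ or explicitly as
\[
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \cdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} .
\]
If $A$ is nonsingular, then the solution to the system is
\[
x_j = \frac{1}{\det A}
\begin{vmatrix}
a_{11} & \cdots & a_{1,j-1} & b_1 & a_{1,j+1} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2,j-1} & b_2 & a_{2,j+1} & \cdots & a_{2n} \\
\vdots & \cdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
a_{n1} & \cdots & a_{n,j-1} & b_n & a_{n,j+1} & \cdots & a_{nn}
\end{vmatrix} .
\]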
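Cramer’s Rule can be sketched in a few lines. The example below is an illustration under stated assumptions: the function names are hypothetical, the determinant is computed by expansion along the first row, the entries are integers, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction

def det(A):
    # Cofactor expansion along the first row.
    if len(A) == 1:
        return A[0][0]
    total = 0
    for k, entry in enumerate(A[0]):
        minor = [row[:k] + row[k + 1:] for row in A[1:]]
        total += (-1) ** k * entry * det(minor)
    return total

def cramer(A, b):
    # x_j = det(A with column j replaced by b) / det(A).
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular; Cramer's Rule does not apply")
    solution = []
    for j in range(len(A)):
        Aj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(A)]
        solution.append(Fraction(det(Aj), d))
    return solution

# Sample system (assumption): 2x + y = 3, x + 3y = 5.
print(cramer([[2, 1], [1, 3]], [3, 5]))
```

For large systems, elimination is far cheaper than evaluating $n + 1$ determinants; the rule is mainly of theoretical interest.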
Appendix C
Solutions to Selected Problems
Section 1.2
1. (b) $(ab) + (-a)b = [a + (-a)]b = 0 \cdot b = 0$, so uniqueness of the additive inverse implies $-(ab) = (-a)b$. A similar argument works for the second equality.
(d) By (b), $(-1)a = 1(-a) = -a$.
(f) Using commutativity and associativity of multiplication and the distributive law and 1.2.1(i),
\[
a/b + c/d = ab^{-1}(dd^{-1}) + cd^{-1}(bb^{-1}) = ad(b^{-1}d^{-1}) + bc(b^{-1}d^{-1}) = ad(bd)^{-1} + bc(bd)^{-1} = (ad + bc)/(bd).
\]
3. If s := r/x ∈ Q, then, by Exercise 2, x = r/s ∈ Q, a contradiction.
Therefore, r/x ∈ I. The remaining parts have similar proofs.
5. The left side of (a) is $\dfrac{n-1}{n}\,\dfrac{n-2}{n} \cdots \dfrac{1}{n} = \dfrac{n!}{n^n}$. For (b),
\[
(2n)! = [2n(2n-2)(2n-4) \cdots 4 \cdot 2]\,[(2n-1)(2n-3) \cdots 3 \cdot 1]
= 2^n [n(n-1)(n-2) \cdots 2 \cdot 1]\,[(2n-1)(2n-3) \cdots 3 \cdot 1].
\]
8. $f(k) = k^3 - (k-1)^3 = 3k^2 - 3k + 1$.
Section 1.3
1. (c) Follows from a/b − c/d = (ad − bc)/bd.
4. If 0 < x < y, then multiplying the inequality by 1/(xy) and using (d) of
1.3.2 shows that 1/y < 1/x. If x < y < 0, then 0 < −y < −x, hence, by
the first part, 1/(−x) < 1/(−y) so 1/x > 1/y.
6. (a) By Exercise 1.2.4, $y^n - x^n = (y - x) \sum_{j=1}^{n} y^{n-j} x^{j-1}$. Each term of the sum is positive and less than $y^{n-j} y^{j-1} = y^{n-1}$. Since there are $n$ terms, part (a) follows.
8. a = ta + (1 − t)a < tb + (1 − t)b = b.
10. If a > b, then x := a − b > 0 and a > b + x, contradicting the hypothesis.
13. (b) $0 \le (x - y)^2 + (y - z)^2 + (z - x)^2 = 2(x^2 + y^2 + z^2) - 2(xy + yz + xz)$.
14. Expand $(x - a)^2 \ge 0$ and divide by $x$.
18. If a ≤ x ≤ b, then x ≤ |b| and −x ≤ −a ≤ |a|, hence |x| ≤ max{|a|, |b|}.
21. Assume without loss of generality that S1 = S \{a1 , . . . , ak }, so min S1 =
ak+1 . Each of the remaining sets Sj contains at least one of a1 , . . . , ak ,
hence min Sj ≤ ak < ak+1 , verifying the assertion.
Section 1.4
2. (a) sup = 12, inf = −12.
(b) sup = 1, inf = −1.
3. (c) sup = 10/3, inf = 3;
(d) sup = $\frac{1}{2} + \frac{\sqrt{2}}{4}$, inf = $\frac{1}{2} - \frac{\sqrt{2}}{4}$;
(e) sup = +∞, inf = −∞.
(h) sup = 3, inf = 0;
(i) sup = $\frac{3 + \sqrt{5}}{2}$, inf = −∞;
(m) sup = 4/3, inf = −1.
5. Let x, y ∈ A. Then ±(x−y) ≤ sup A−inf A, hence |x−y| ≤ sup A−inf A.
Since |x|−|y| ≤ |x−y|, |x|−|y| ≤ sup A−inf A so |x| ≤ sup A−inf A+|y|.
Since x was arbitrary, we have sup |A| ≤ sup A − inf A + |y|, hence
sup |A| − sup A + inf A ≤ |y|. Since y was arbitrary, it follows that
sup |A| − sup A + inf A ≤ inf |A|.
6. (b) Since x > 0, xa ≤ x sup A for all a ∈ A, hence sup (xA) ≤ x sup A.
Replacing x by 1/x proves the inequality in the other direction. The
infimum case is similar.
9. Let $a < b$ and choose a rational $r$ in $(a - \sqrt{2},\, b - \sqrt{2})$. Then $r + \sqrt{2}$ is irrational and in $(a, b)$.
12. (b) If $n := \lfloor x \rfloor = -\lfloor -x \rfloor$, then $x - 1 < n \le x$ and $x \le n < x + 1$. This is possible only if $x = n$. The converse is trivial.
(c) By definition $-x - 1 < \lfloor -x \rfloor \le -x$.
14. Let $x := (b^m)^{1/n}$ and $y := (b^{1/n})^m$. By definition, $x$ is the unique positive solution of $x^n = b^m$. Since $y^n = [(b^{1/n})^m]^n = [(b^{1/n})^n]^m = b^m$, $x = y$.
17. Let $\ell \le x \le u$ for all $x \in A$. By the Archimedean principle, there exist positive integers $m$ and $n$ such that $-m < \ell \le u < n$. Set $N = \max\{m, n\}$.
20. For any $a \in \mathbb{N}$, if $r := \sqrt{n + a} + \sqrt{n} \in \mathbb{Q}$, then squaring both sides of $\sqrt{n + a} = r - \sqrt{n}$ shows that $\sqrt{n} \in \mathbb{Q}$ and hence that $n = j^2$ for some $j \in \mathbb{N}$ (1.4.11). Then $\sqrt{n + a} \in \mathbb{Q}$, hence $n + a = k^2$ for some $k \in \mathbb{N}$. Therefore, $a = k^2 - j^2 = (k - j)(k + j)$. If $a = 11$, then $k - j = 1$ and $j + k = 11$, so $j = 5$ and $n = 25$. If $a = 21$, then either $k - j = 1$ and $j + k = 21$ or $k - j = 3$ and $j + k = 7$. The first choice leads to $j = 10$ and $n = 100$ and the second to $j = 2$ and $n = 4$.
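The values of $n$ found above can be confirmed by brute force. This sketch relies on the reduction in the solution (the sum of the two roots is rational exactly when both $n$ and $n + a$ are perfect squares); the function names and the search limit are assumptions made for illustration.

```python
from math import isqrt

def is_perfect_square(m):
    r = isqrt(m)
    return r * r == m

def rational_sum_values(a, limit=10_000):
    # All n in 1..limit for which both n and n + a are perfect squares.
    return [n for n in range(1, limit + 1)
            if is_perfect_square(n) and is_perfect_square(n + a)]

print(rational_sum_values(11))
print(rational_sum_values(21))
```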
Section 1.5
3. Let $f(n)$ denote the sum on the left side of the equation and $g(n)$ the sum on the right. Then $f(1) = 1/2 = g(1)$. Now let $n \ge 1$. Then
\[
f(n + 1) - f(n) = \sum_{k=1}^{2n+2} \frac{(-1)^{k+1}}{k} - \sum_{k=1}^{2n} \frac{(-1)^{k+1}}{k} = \frac{1}{2n + 1} - \frac{1}{2n + 2},
\]
\[
g(n + 1) - g(n) = \sum_{k=n+2}^{2n+2} \frac{1}{k} - \sum_{k=n+1}^{2n} \frac{1}{k} = \frac{1}{2n + 2} + \frac{1}{2n + 1} - \frac{1}{n + 1} .
\]
Since the right sides are equal, $f(n) = g(n) \Rightarrow f(n + 1) = g(n + 1)$.
5. $\frac{25}{3} n^3 - \frac{15}{2} n^2 + \frac{1}{6} n$.
6. (b) $\displaystyle \sum_{k=1}^{500} (4k^2 - 1) = 4 \cdot \frac{500 \cdot 501 \cdot 1001}{6} - 500 = 167{,}166{,}500$.
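The closed-form value in 6(b) can be compared against a direct summation. This is a minimal sketch; the variable names are illustrative.

```python
# Direct summation of 4k^2 - 1 for k = 1, ..., 500.
direct = sum(4 * k * k - 1 for k in range(1, 501))

# Closed form: 4 * n(n + 1)(2n + 1)/6 - n with n = 500.
n = 500
closed = 4 * (n * (n + 1) * (2 * n + 1) // 6) - n

print(direct, closed)
```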
7. For n ≥ 1, let Q(n) be the statement P (n − 1 + n0 ). Then Q(1) = P (n0 )
is true. Assume Q(n) = P (n − 1 + n0 ) is true. Then Q(n + 1) = P (n + n0 )
is true. By mathematical induction, Q(n) = P (n − 1 + n0 ) is true for all
n ≥ 1, that is, P (n) is true for every n ≥ n0 .
8. In each case, let f (n) be the left side of the inequality and g(n) the right
side, and let P (n) : f (n) < g(n). Let n0 be the base value of n for which
P (n) is true. It is straightforward to check that f (n0 ) < g(n0 ). Assume
P (n) holds for some n ≥ n0 , so that f (n)/g(n) < 1. Then
(a) $\dfrac{f(n + 1)}{g(n + 1)} = \dfrac{2n + 3}{2^{n+1}} = \dfrac{f(n)}{2g(n)} + \dfrac{1}{2^n} < 1$.
(e) $\dfrac{f(n + 1)}{g(n + 1)} = \dfrac{2^{n+1}(n + 1)!}{(n + 1)^{n+1}} = \dfrac{f(n)}{g(n)} \cdot \dfrac{2}{(1 + 1/n)^n} < 1$.
9. Check that 6 < ln(6!). For the induction step, use (n + 1)! = (n + 1)n!.
13. Let $g_n$ denote the expression on the right in the assertion. One checks directly that $g_0 = g_1 = 1$. Let $n \ge 2$ and assume that $f_j = g_j$ for all $2 \le j \le n$. Then
\[
\begin{aligned}
g_{n+1} - f_{n+1} &= g_{n+1} - f_n - f_{n-1} = g_{n+1} - g_n - g_{n-1} \\
&= \frac{1}{\sqrt{5}} \big( a^{n+2} - a^{n+1} - a^n \big) + \frac{1}{\sqrt{5}} \big( b^{n+2} - b^{n+1} - b^n \big) \\
&= \frac{a^n}{\sqrt{5}} (a^2 - a - 1) + \frac{b^n}{\sqrt{5}} (b^2 - b - 1) = 0.
\end{aligned}
\]
15. The set of all nonnegative integers of the form $m - qn$, $q \in \mathbb{Z}$, is nonempty (Archimedean principle), hence has a smallest member $r = m - qn$ (well ordering principle). If $r \ge n$, then $0 \le r - n = m - (q + 1)n < r$, contradicting the minimal property of $r$. Therefore, $m = qn + r$ has the required form. If also $m = q'n + r'$, $q' \in \mathbb{Z}$, and $r' \in \{0, \dots, n - 1\}$, then $|q - q'|n = |r - r'| < n$, hence $q' = q$ and $r' = r$.
Section 1.6
1. $x = c - \dfrac{d \cdot e - (b \cdot c)(b \cdot d)}{1 - (a \cdot b)(b \cdot d)}\, a, \qquad y = e - \dfrac{b \cdot c - (a \cdot b)(d \cdot e)}{1 - (a \cdot b)(b \cdot d)}\, d.$
2. (c) By the triangle inequality,
\[
\|x\|_2 = \|x - y + y\|_2 \le \|x - y\|_2 + \|y\|_2 ,
\]
hence $\|x\|_2 - \|y\|_2 \le \|x - y\|_2$. Similarly, $\|y\|_2 - \|x\|_2 \le \|x - y\|_2$.
3. By 1.6.3, $\displaystyle \|x_1 + x_2 + \cdots + x_k\|_2^2 = \sum_{i,j=1}^{k} x_i \cdot x_j = \sum_{j=1}^{k} x_j \cdot x_j$.
7. The hypotheses imply that
\[
\sum_{j=1}^{n} x_j^2 = \sum_{j=1}^{n} y_j^2 = 1 \quad\text{and}\quad \sum_{j=1}^{n} (x_j + y_j)^2 = 4.
\]
It follows that $\sum_{j=1}^{n} x_j y_j = 1$ and $\sum_{j=1}^{n} (x_j - y_j)^2 = 0$. The same does not hold for $\|\cdot\|_\infty$ (take $x = (-1, 1)$ and $y = (1, 1)$) or for $\|\cdot\|_1$ (take $x = (1, 0)$ and $y = (0, 1)$).
Section 2.1
1. (a) $a_n = [a + b + (-1)^n (b - a)]/2$.
3. (b) If $n \ge 6$, $|(2n^2 - n)/(n^2 + 3) - 2| = |n + 6|/(n^2 + 3) \le 2n/n^2 = 2/n$. Therefore, choose $N \ge \max\{6, 2/\varepsilon\}$.
(e) $|(2 + 1/n)^3 - 8| = [(2 + 1/n)^2 + 2(2 + 1/n) + 4]/n \le 19/n$, so choose any integer $N > 19/\varepsilon$.
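The estimate in 3(b) can be spot-checked with exact arithmetic. In this sketch $N$ is taken at least as large as both 6 and $2/\varepsilon$, since both conditions are used in the estimate; the function names and the number of terms tested are assumptions made for illustration.

```python
from fractions import Fraction
from math import ceil

def a(n):
    # The sequence from 3(b): (2n^2 - n) / (n^2 + 3).
    return Fraction(2 * n * n - n, n * n + 3)

def check(eps, span=2000):
    # N must satisfy both n >= 6 (for the bound) and 2/n <= eps.
    N = max(6, ceil(2 / eps))
    return all(abs(a(n) - 2) < eps for n in range(N, N + span))

print(check(Fraction(1, 100)))
print(check(Fraction(1, 3)))
```

A finite check is only a sanity test of the bound, not a substitute for the inequality itself.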
5. Let r = pq −1 , p, q ∈ Z, q > 0. For all n ≥ q, n!r ∈ Z, hence sin(n!rπ) = 0.
7. Let A = {x1 , . . . , xp } and Aj = {n : an = xj }. One of these sets, say A1 ,
must have infinitely many members. Since |x1 − a| ≤ |x1 − an | + |an − a|
and an → a, letting n → +∞ through A1 shows that x1 = a. We may
therefore choose ε > 0 so that I := (a − ε, a + ε) contains no xj for j ≥ 2.
Let N ∈ N such that an ∈ I for all n ≥ N . For such n, an = a.
8. (a) bn = (3an + 2bn − 3an )/2 → (c − 3a)/2.
9. (a) 2. (d) $b/(2\sqrt{a})$. (g) $-ka^{k-1}$. (k) 1/2.
11. Use $-r \le a_n - b_n \le r$ and 2.1.4.
14. (a) Suppose first that $r > 1$. Set $h_n = r^{1/n} - 1$. By the binomial theorem, $r = (1 + h_n)^n > nh_n$, hence, by the squeeze principle, $h_n \to 0$. If $r < 1$, consider $1/r$.
17. $a_n < ra_{n-1} < r^2 a_{n-2} < \cdots < r^{n-1} a_1 \to 0$. For the example, take $a_n = 2^{1/n}$.
19. Choose $N$ such that $a_n - a < \varepsilon$ for all $n \ge N$. For such $n$,
\[
0 \le \min\{a_1, \dots, a_n\} - a \le a_n - a < \varepsilon.
\]
Therefore, $\min\{a_1, \dots, a_n\} \to a$. The converse is false: consider $a_n = 1 + (-1)^n$.
22. Suppose that $c \le f(x) - x \le d$ for all $x$, so $c + jx \le f(jx) \le d + jx$. Summing and using Exercise 1.5.4,
\[
nc + xn(n + 1)/2 \le \sum_{j=1}^{n} f(jx) \le nd + xn(n + 1)/2,
\]
hence
\[
c/n + x(1 + 1/n)/2 \le (1/n^2) \sum_{j=1}^{n} f(jx) \le d/n + x(1 + 1/n)/2.
\]
Letting $n \to +\infty$, we obtain (a). Part (b) is proved similarly.
Section 2.2
1. Since
\[
\frac{a^{1/n}}{a^{1/(n+1)}} = a^{1/n(n+1)} < 1 < b^{1/n(n+1)} = \frac{b^{1/n}}{b^{1/(n+1)}},
\]
$a^{1/n}$ is increasing and $b^{1/n}$ is decreasing. Each tends to 1 by Exercise 2.1.14.
3. By results of Section 2.1,
\[
a_n = a(1/n + nb)^{-1} \to 0 \quad\text{and}\quad na_n = a(1/n^2 + b)^{-1} \to ab^{-1}.
\]
The condition $a_{n+1} < a_n$ is equivalent to $(n^2 + n)b > 1$, which holds eventually. The condition $(n + 1)a_{n+1} > na_n$ is equivalent to the inequality $(n + 1)^2 > n^2$.
7. Let $f(x) = 1 + \dfrac{1}{2 + (1 + x)^{-1}} = \dfrac{3x + 4}{2x + 3}$. Then $f : [1, 2] \to [1, 2]$, $f$ is increasing and $f(a_m) = a_{m+2}$. Since $a_1, a_2 \in [1, 2]$, $a_n \in [1, 2]$ for all $n$. Since $a_1 = 1$, $a_2 = 3/2$, $a_3 = 7/5$ and $a_4 = 17/12$, the inequalities
\[
a_{2n+2} < a_{2n} \quad\text{and}\quad a_{2n+1} > a_{2n-1}
\]
hold for $n = 1$. Assume they hold for $n = k$. Then
\[
a_{2k+4} = f(a_{2k+2}) < f(a_{2k}) = a_{2k+2}
\]
and
\[
a_{2k+3} = f(a_{2k+1}) > f(a_{2k-1}) = a_{2k+1},
\]
hence the inequalities hold for $n = k + 1$.
Since the sequences $\{a_{2n}\}$ and $\{a_{2n+1}\}$ are bounded and monotone, the monotone convergence theorem implies that $a_{2n} \to a$ and $a_{2n+1} \to b$ for some $a, b \in \mathbb{R}$. Letting $n \to +\infty$ in $f(a_{2n}) = a_{2n+2}$ gives $f(a) = a$. Therefore, $a = \sqrt{2}$. Similarly, $b = \sqrt{2}$.
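The interlacing behaviour of the even and odd terms can be observed with exact fractions. The exercise statement is not reproduced here, so the one-step recurrence $a_{n+1} = 1 + 1/(1 + a_n)$ used below is an assumption, inferred from the listed values $1, 3/2, 7/5, 17/12$ and from the fact that applying it twice gives $f$; the names `step` and `terms` are illustrative.

```python
from fractions import Fraction

def step(x):
    # Assumed one-step recurrence; applying it twice gives
    # f(x) = 1 + 1/(2 + 1/(1 + x)) = (3x + 4)/(2x + 3).
    return 1 + 1 / (1 + x)

terms = [Fraction(1)]                 # a_1 = 1
for _ in range(9):
    terms.append(step(terms[-1]))     # a_2, ..., a_10

odd_terms = terms[0::2]               # a_1, a_3, a_5, ...
even_terms = terms[1::2]              # a_2, a_4, a_6, ...
print([str(t) for t in terms[:4]])
print([float(t) for t in terms])
```

The odd-indexed terms increase and the even-indexed terms decrease, with both subsequences squeezing toward $\sqrt{2}$.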
9. For $x > 0$, $x^2 + r \ge 2x\sqrt{r}$, hence $(x + r/x)/2 \ge \sqrt{r}$. Therefore, $a_n \ge \sqrt{r}$. For $x \ge \sqrt{r}$, $x^2 + r \le 2x^2$, hence $(x + r/x)/2 \le x$. Therefore, $a_n \ge a_{n+1}$. By the monotone convergence theorem, $a_n \to a$ for some $a \ge \sqrt{r}$. Letting $n \to +\infty$ in $a_n = (a_{n-1} + r/a_{n-1})/2$ yields $a = (a + r/a)/2$, which has positive solution $a = \sqrt{r}$.
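The two inequalities in this argument can be watched in action. The sketch below iterates $a_n = (a_{n-1} + r/a_{n-1})/2$ with exact fractions so that the monotonicity check is not disturbed by rounding; the function name, the starting value 3, and the choice $r = 2$ are assumptions made for illustration.

```python
from fractions import Fraction

def babylonian(r, x0, steps):
    # Iterate x -> (x + r/x)/2 exactly, returning all iterates.
    xs = [Fraction(x0)]
    for _ in range(steps):
        x = xs[-1]
        xs.append((x + Fraction(r) / x) / 2)
    return xs

xs = babylonian(2, 3, 6)
for x in xs:
    print(float(x))
```

After the first step every iterate satisfies $a_n^2 \ge r$, and the iterates decrease toward $\sqrt{r}$.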
Section 2.3
1. (a) 0, ±3/8.
2/k
3. (d) an = 1 +
(c) ±4, ±6, ±12, ±14.
2n+k −k
1
1
1+
→ e.
2n + k
2n + k
5. If {an } lies in the set {x1 , . . . , xn }, then one of the sets {n : an = xj } must
have infinitely many members and a subsequence may be constructed
from these.
8. Given $\varepsilon > 0$, choose $N$ so that $\sum_{n=N}^{\infty} |a_{n+k} - a_n| < \varepsilon$. For $m > n \ge N$,
\[
|a_{mk} - a_{nk}| \le |a_{mk} - a_{(m-1)k}| + \cdots + |a_{(n+1)k} - a_{nk}| < \varepsilon.
\]
Therefore, $\{a_{nk}\}_{n=1}^{\infty}$ is Cauchy.
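10. Clearly $a_n \to 0$ implies $b_n \to 0$. For the converse, suppose $a_n \not\to 0$. Choose $\varepsilon > 0$ and a subsequence such that $a_{n_k} \ge \varepsilon > 0$ for all $k$. Then
\[
1 = b_{n_k} \left( \frac{1}{a_{n_k}^{q}} + \frac{1}{a_{n_k}^{q-p}} \right) \le b_{n_k} \left( \frac{1}{\varepsilon^{q}} + \frac{1}{\varepsilon^{q-p}} \right),
\]
hence $b_n \not\to 0$. If $0 < q < p$, then the sufficiency is false: Take $a_n = n$, $q = 1/2$ and $p = 1$. Then $b_n = \sqrt{n}/(n + 1) \to 0$ but $a_n \to +\infty$.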
Section 2.4
1. (a) lim inf = −5/3, lim sup = 5/3.
(c) lim inf = −14, lim sup = 14.
(h) lim inf = −∞, lim sup = +∞.
3. Follows from Exercise 1.4.6.
5. Follows from {ank : k ≥ n} ⊆ {ak : k ≥ n}.
7. 0 < b − ε < bn < b + ε ⇒ an (b − ε) < an bn < an (b + ε)
⇒ (b − ε) lim supn→∞ an ≤ lim supn→∞ an bn ≤ (b + ε) lim supn→∞ an .
Now let ε → 0.
10. Choose $r$ so that $\liminf_{n\to\infty} b_n > r > 0$. Then, given $\varepsilon > 0$, there exists $N$ such that $a_n > a/2$ and $b_n > r$, and
\[
c_n := (b_n - 3a_n)(b_n + 2a_n) = b_n^2 - a_n b_n - 6a_n^2 < \varepsilon
\]
for every $n > N$. Then $b_n - 3a_n = c_n/(b_n + 2a_n) < \varepsilon/(r + a)$, so $\limsup_{n\to\infty} b_n \le 3a$.
12. Suppose that $\liminf_n a_n^{1/n} < \liminf_n \dfrac{a_{n+1}}{a_n}$. Choose $r$ strictly between these numbers and then choose $N$ such that $a_n/a_{n-1} > r$ for all $n > N$. For such $n$,
\[
a_n > a_{n-1} r > a_{n-2} r^2 > \cdots > a_N r^{n-N},
\]
hence $\liminf_n a_n^{1/n} \ge \liminf_n \big( a_N^{1/n} r^{1-N/n} \big) = r$, a contradiction. To evaluate $\lim_n n/(n!)^{1/n}$ take $a_n = n^n/n!$ and calculate
\[
\frac{a_{n+1}}{a_n} = \left( \frac{n+1}{n} \right)^{n} \to e.
\]
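The limit $n/(n!)^{1/n} \to e$ can be observed numerically. This sketch uses `math.lgamma`, which returns $\ln \Gamma(n + 1) = \ln n!$, to avoid forming huge factorials; the function name `ratio` and the sample values of $n$ are illustrative assumptions.

```python
import math

def ratio(n):
    # n / (n!)^(1/n), with ln(n!) obtained from lgamma(n + 1).
    return n / math.exp(math.lgamma(n + 1) / n)

for n in (10, 1000, 10**6):
    print(n, ratio(n))
print(math.e)
```

Convergence is slow (the relative error behaves roughly like $\ln(2\pi n)/(2n)$ by Stirling's formula), which is why large $n$ is needed to see several correct digits.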
Section 3.1
1. Let $x_1 < \cdots < x_n$ denote the points of $E$ and let
\[
\delta = \frac{1}{2} \min\{x_j - x_i : 1 \le i < j \le n\}.
\]
Then for each $j$, $(x_j - \delta, x_j + \delta) \cap E = \{x_j\}$.
4. Let $\varepsilon, M > 0$.
(b) The limit is 1. If $|x - 1| < 1$, then $x > 0$, hence
\[
\left| \frac{x + 3}{3x + 1} - 1 \right| = \frac{2|x - 1|}{3x + 1} < 2|x - 1|.
\]
Therefore, choose $\delta = \min\{1, \varepsilon/2\}$.
(d) The limit is $+\infty$: $x < -\sqrt{M} - 1 \Rightarrow -x > \sqrt{M}$ and $-x - 1 > \sqrt{M} \Rightarrow x^2 + x = (-x)(-x - 1) > M$.
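6. (a) 2/3.
7. (b) −1/2. (d) +∞.
(f) $\dfrac{\sqrt{b + x} - \sqrt{b - x}}{\sqrt{c + x} - \sqrt{c - x}} = \dfrac{\sqrt{c + x} + \sqrt{c - x}}{\sqrt{b + x} + \sqrt{b - x}} \to \sqrt{\dfrac{c}{b}}$.
(g) 9/25.
(h) $(a\sqrt{d})/(c\sqrt{b})$.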
9. The limit exists at $a$ iff $\lim_{x\to a,\, x\in\mathbb{Q}} f(x) = \lim_{x\to a,\, x\in\mathbb{I}} f(x)$. By continuity of polynomials, this is equivalent to $4a^2 + 2a - 11 = 3a^2 + a - 5$. Thus $a = -3, 2$.
11. (a) $a$. (e) $(c\sqrt{a})/(2\sqrt{b})$.
13. Proof for the case f increasing and L := limn f (an ) ∈ R: Given ε > 0,
choose N such that L − ε < f (an ) < L + ε for all n ≥ N . Let x > aN and
let n be the least integer > N such that x < an . Then an−1 ≤ x < an so
L − ε < f (an−1 ) ≤ f (x) ≤ f (an ) < L + ε.
Section 3.2
1. (a) −1, 1.
(c) −2/3, 2/3.
(e) −1, 1.
(i) −3, 1.
3. lim sup case: Assume a ∈ R. Set
L = lim sup{x→a, x∈E} f (x) and Lj = lim sup{x→a, x∈Ej } f (x), j = 1, 2.
By 3.2.1, there exists a sequence an ∈ E1 such that f (an ) → L1 . Since
an ∈ E, by the same theorem, L1 ≤ L. Similarly, L2 ≤ L. Now let bn ∈ E
such that f (bn ) → L. Then one of the sets, say E1 , contains infinitely
many terms of the sequence. Therefore, L ≤ L1 , hence L = max{L1 , L2 }.
4. Let g(x) = 1/f (x). Then g(r) = 1/f (r).
Section 3.3
1. By continuity, f (2) = limx→2− (mx + 3) = limx→2+ (3x2 + 7), that is,
2m + 3 = 19. Therefore, m = 8.
4. This follows from
lim{x→a, x∈Q} d(x)g(x) = g(a) and lim{x→a, x∈I} d(x)g(x) = 0.
8. The identity implies that f (nx) = nf (x), n ∈ N. Also, f (0) + f (0) = f (0)
so f (0) = 0. Since f (−x) + f (x) = f (0), we see that f (−x) = −f (x),
hence f (nx) = nf (x) for all n ∈ Z. Let m, n ∈ N. Then f (x) =
f (nx/n) = nf (x/n). Replacing x by xm gives mf (x) = f (mx) =
nf (mx/n). Thus, f (tx) = tf (x) for all x ∈ R and t ∈ Q. Since f is
continuous at zero and f (x − y) = f (x) + f (−y) = f (x) − f (y), f is
continuous on R. Thus, f (tx) = tf (x) for all x, t ∈ R. Setting x = 1
gives the desired result.
9. (c) Let $a \in \mathbb{R}$ and $\varepsilon > 0$. Choose $N$ so that $\sum_{n>N} 2^{-n} < \varepsilon$ and then choose $\delta > 0$ so that $(a, a + \delta)$ contains none of the numbers $c_1, c_2, \dots, c_N$. If $a < x < a + \delta$, then
\[
0 \le f(x) - f(a) = \sum_{n:\, a < c_n \le x} 2^{-n} \le \sum_{n>N} 2^{-n} < \varepsilon.
\]
Therefore, $f$ is right continuous at $a$.
If $a \notin \{c_n\}$, then we may choose $\delta$ so that $(a - \delta, a]$ contains none of the numbers $c_1, c_2, \dots, c_N$. If $a - \delta < x < a$, then, as before,
\[
0 \le f(a) - f(x) = \sum_{n:\, x < c_n \le a} 2^{-n} < \varepsilon.
\]
Therefore, $f$ is left continuous at $a$.
If $a = c_k$ and $a - \delta < x < a$, then
\[
f(c_k) - f(x) = \sum_{n:\, x < c_n \le c_k} 2^{-n} \ge 2^{-k}
\]
so $f$ is not left continuous at $c_k$.
11. (a) Let xn → x in [0, 1] and let ε > 0. By hypothesis, for each n we may
choose tn ∈ (0, 1) \ {x} such that |tn − xn | < 1/n and |f (tn ) − g(xn )| < ε.
Then tn → x, hence f (tn ) → g(x). From
|g(xn ) − g(x)| ≤ |g(xn ) − f (tn )| + |f (tn ) − g(x)|
we then have
\[
\limsup_{n} |g(x_n) - g(x)| \le \varepsilon.
\]
Since ε was arbitrary, g(xn ) → g(x).
(b) Note that $f$ is continuous at $x$ iff $f(x) = g(x)$. Let $\varepsilon > 0$. We show that the set $D_\varepsilon := \{x \in [0, 1] : |f(x) - g(x)| \ge \varepsilon\}$ is finite. The desired conclusion will follow on observing that the set of discontinuities of $f$ is precisely $\bigcup_{n=1}^{\infty} D_{1/n}$.
Suppose $D_\varepsilon$ is infinite. Then there exists a sequence $\{x_n\}$ of distinct terms such that $|f(x_n) - g(x_n)| \ge \varepsilon$ for all $n$. By the Bolzano–Weierstrass theorem, $\{x_n\}$ has a convergent subsequence, say $x_{n_k} \to x$. Because the terms of $\{x_n\}$ are distinct, $x_{n_k} \neq x$ for all large $k$, hence $f(x_{n_k}) \to g(x)$. Also, by continuity, $g(x_{n_k}) \to g(x)$. But this contradicts the inequality $|f(x_{n_k}) - g(x_{n_k})| \ge \varepsilon$.
Section 3.4
2. Let x0 > 0 and choose r > x0 such that f (x) < f (x0 ) for all x with
|x| > r. Then the maximum M of f on [−r, r] is ≥ f (x0 ), hence M must
be the maximum of f on R.
4. (e) Suppose $f$ is not upper semicontinuous at $x_0$. Choose $r$ such that $f(x_0) < r < \limsup_{x\to x_0} f(x)$ and then $i$ such that $f_i(x_0) < r$. For each $\delta > 0$, $r < \sup_{0<|x-x_0|<\delta} f(x)$, hence there exists $x_\delta$ with $0 < |x_\delta - x_0| < \delta$ such that $f(x_\delta) > r$. Thus we have
\[
f(x_0) \le f_i(x_0) < r < f(x_\delta) \le f_i(x_\delta) \le \sup_{0<|x-x_0|<\delta} f_i(x).
\]
Letting $\delta \to 0$ produces the contradiction
\[
f_i(x_0) < r \le \limsup_{x\to x_0} f_i(x) \le f_i(x_0).
\]
If $f_n(x) := x^n$ on $[0, 1]$, then $f = \inf_n f_n$ is discontinuous at $x = 1$.
5. (b) Define
\[
f(x) = \begin{cases} (1 - 2x)(1 - d(x)) & \text{if } 0 \le x \le 1/2, \\ (2x - 1)d(x) & \text{if } 1/2 < x \le 1, \end{cases}
\]
where $d(x)$ is the Dirichlet function on $[0, 1]$.
7. (c) $f(x) = \tan x - x$. Since $\lim_{x\to(n+1/2)\pi^-} \tan x = +\infty$, choose $b \in (n\pi, (n + 1/2)\pi)$ such that $f(n\pi) = -n\pi < 0 < f(b)$.
(f) Let $f(x)$ denote the left side minus the right side of the equation. Then
\[
\lim_{x\to 0^+} f(x) = +\infty, \qquad \lim_{x\to \pi/2^-} f(x) = -\infty,
\]
hence there exist $0 < a < b < \pi/2$ such that $f(a) > 0 > f(b)$.
9. Set g(x) = f (x) − x. Then g(b) ≤ 0 ≤ g(a) and the result follows from
the intermediate value theorem.
Section 3.5
1. Take f (x) = 1/x and g(x) = x2 on (0, 1).
3. (a) Use the inequality
\[
\frac{|ax^2 - ay^2|}{\sqrt{ax^2 + b} + \sqrt{ay^2 + b}} \le \frac{\sqrt{a}\,|x - y|(|x| + |y|)}{\sqrt{x^2 + b/a} + \sqrt{y^2 + b/a}} \le \sqrt{a}\,|x - y|.
\]
7. Given ε > 0, choose δ > 0 such that |f (x) − f (y)| < ε for all x, y ∈ E
with |x − y| < δ. Then choose N such that |an − am | < δ for all m, n ≥ N .
For such m, n, |f (an ) − f (am )| < ε.
9. The inequality |x| − |y| ≤ |x − y| shows that |x| is uniformly continuous.
The given functions are therefore compositions of uniformly continuous
functions.
13. If $0 < p \le 1$, then
\[
\lim_{x\to+\infty} \frac{\sin x}{x^p} = 0, \quad\text{and}\quad \lim_{x\to 0^+} \frac{\sin x}{x^p} = \lim_{x\to 0^+} x^{1-p}\, \frac{\sin x}{x} = 0 \text{ or } 1.
\]
Therefore, $(\sin x)/x^p$ has a uniformly continuous extension to $[0, +\infty)$. If $p > 1$, $(\sin x)/x^p$ is continuous on $(0, +\infty)$ but has no continuous extension to $[0, +\infty)$.
15. Since f may be extended continuously to [a, b], it is bounded. The
examples f (x) = x on (0, +∞) and f (x) = 1/x on (0, 1) show that the
assumptions cannot be relaxed.
18. f (x) has unequal one-sided limits at 0 while those of g(x) are equal.
Hence 0 is a removable discontinuity of g but not of f .
Section 4.1
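1. If $f$ denotes the given function, then $\dfrac{f(x + h) - f(x)}{h}$ equals
(b) $\dfrac{2}{\sqrt{2x + 2h + 1} + \sqrt{2x + 1}} \to \dfrac{1}{\sqrt{2x + 1}}$.
(d) $\dfrac{-3}{\sqrt{3x + 3h + 2}\,\sqrt{3x + 2}\,\big(\sqrt{3x + 2} + \sqrt{3x + 3h + 2}\big)} \to \dfrac{-3}{2(3x + 2)^{3/2}}$.
3. (a) $\dfrac{3(5x + 7)^{1/3}}{5(3x + 2)^{4/5}} + \dfrac{5(3x + 2)^{1/5}}{3(5x + 7)^{2/3}}$.
(c) $\dfrac{4x}{(x^2 + 1)^2} \cos \dfrac{x^2 - 1}{x^2 + 1}$.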
4. (b) −
1
y
−
y cos xy 2
2x
7. f is continuous at 1 iff 2a + b = 1. For such a and b, f is differentiable
iff a + b = 3. Therefore, a = −2 and b = 5.
11. (b) The difference quotient is
\[
\frac{f(a + h^2) - f(a)}{h^2}\, h + \frac{f(a - h) - f(a)}{-h} \to f'(a).
\]
14. For all $h \neq 0$, $[f(x + h) - f(x)]/h \ge 0$, hence $f'(x) \ge 0$.
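16. Clear for $n = 2$. Suppose the assertion holds for $n \ge 2$. Then
\[
\begin{aligned}
D^{n+1}(fg) &= \sum_{k=0}^{n} \binom{n}{k} D\big[ (D^k f)(D^{n-k} g) \big] \\
&= \sum_{k=0}^{n} \binom{n}{k} \big[ (D^{k+1} f)(D^{n-k} g) + (D^k f)(D^{n+1-k} g) \big] \\
&= \sum_{k=1}^{n} \left[ \binom{n}{k-1} + \binom{n}{k} \right] (D^k f)(D^{n+1-k} g) + g D^{n+1} f + f D^{n+1} g \\
&= \sum_{k=0}^{n+1} \binom{n+1}{k} (D^k f)(D^{n+1-k} g).
\end{aligned}
\]
18. (c) $(f' \circ g)g'' + (f'' \circ g)(g')^2$.
19. (a) $\dfrac{(-1)^n n!}{x^{n+1}}$.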
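21. If $x \neq 0$, $f'(x) = x^{m+n-1} \left[ n \cos x^n + m\, \dfrac{\sin x^n}{x^n} \right]$. Also,
\[
f'(0) = \lim_{x\to 0} \frac{\sin x^n}{x^n}\, x^{n+m-1}
\begin{cases} \text{does not exist} & \text{if } n + m < 1, \\ = 0 & \text{if } n + m > 1, \\ = 1 & \text{if } n + m = 1. \end{cases}
\]
Therefore, $f'$ is continuous at 0 if $n + m \ge 1$.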
23. For the second order determinant use the expansion
\[
\begin{vmatrix} f_1 & f_2 \\ g_1 & g_2 \end{vmatrix} = f_1 g_2 - f_2 g_1 .
\]
For the third order determinant, expand along a row or column and use the formula for the second order case. The same idea may be applied to $n$th order determinants.
Section 4.2
1. Set $f(x) = \cos x - \sqrt{x} + 1$. Since $f(0) > 0 > f(\pi/2)$, $f$ has at least one zero in $(0, \pi/2)$, by the intermediate value theorem. Since $f' < 0$ on $(0, \pi/2)$, $f$ is strictly decreasing so the zero is unique.
3. Since $f'(x) = 4x(x - 1)(x - 2) < 0$ on $(1, 2)$, $f$ has exactly one zero in the interval $(1, 2)$ iff $f(1)\ (= 1 + c)$ and $f(2)\ (= c)$ have opposite signs, that is, iff $c < 0 < c + 1$, or $-1 < c < 0$.
7. The assertion is clear if $n = 0$. Suppose it holds for all polynomials with degree $\le n$. Let $P(x)$ have degree $n + 1$ and suppose that the equation $\sin(ax) = P(x)$ has more than $n + 2$ solutions. Then $f(x) := \sin(ax) - P(x)$ has more than $n + 2$ zeros, hence, by Rolle’s theorem, $f''(x) = -a^2 \sin(ax) - P''(x)$ has more than $n$ zeros. But this means that $\sin(ax) = -P''(x)/a^2$ has more than $n$ solutions, contradicting the induction hypothesis.
9. By the Cauchy mean value theorem,
\[
|f(x) - f(y)|\,|g'(c)| = |g(x) - g(y)|\,|f'(c)| \le |g(x) - g(y)|\,|g'(c)|.
\]
11. The derivative of $x^{-1} \sin x$ is negative since $\tan x > x$, $0 < x < \pi/2$.
17. Let $c_1 < \cdots < c_m$ be the distinct zeros of $P'$. By the intermediate value theorem, $P'$ has a constant sign on $(c_j, c_{j+1})$. Therefore, $P(x)$ is strictly monotone on these intervals.
19. Let $|f'| \le c < r$. Then $g'(x) = r + f'(x) \ge r - c > 0$, so $g$ is strictly increasing, hence one-to-one. By the mean value theorem, $|f(x) - f(0)| \le c|x|$ or $f(0) - c|x| \le f(x) \le f(0) + c|x|$. Therefore,
\[
f(0) + rx - c|x| \le g(x) \le f(0) + rx + c|x|.
\]
Thus $x > 0 \Rightarrow g(x) \ge f(0) + (r - c)x \Rightarrow \lim_{x\to+\infty} g(x) = +\infty$, and $x < 0 \Rightarrow g(x) \le f(0) + (r - c)x \Rightarrow \lim_{x\to-\infty} g(x) = -\infty$. By the intermediate value theorem, $g(\mathbb{R}) = \mathbb{R}$.
22. $g'(0) = 0$, hence $f'(0) > 0$. Since $f(\pm 1/n\pi) = \pm 1/n\pi$ for all $n \in \mathbb{N}$, $f$ is not monotone on any neighborhood of 0.
25. Let $a, b \in I$ with $a < b$ and suppose that $f'(a) < y_0 < f'(b)$, so $g'(a) < 0 < g'(b)$. Then $[g(x) - g(a)]/(x - a) < 0$ for $x \in (a, a + \delta)$, so the minimum of $g$ cannot occur at $a$. Similarly, the minimum of $g$ cannot occur at $b$. Thus, by the local extremum theorem, $g'(x_0) = 0$, that is, $f'(x_0) = y_0$, for some $x_0 \in (a, b)$.
28. Set $q(x, y) = [f(x) - f(y)]/(x - y)$, $x \neq y$. If $f$ is uniformly differentiable on $I$, then
\[
|f'(x) - f'(y)| \le |f'(x) - q(x, y)| + |f'(y) - q(x, y)|
\]
shows that $f'$ is uniformly continuous. Conversely, assume that $f'$ is uniformly continuous. By the mean value theorem, for each $x < y$ there exists a $z \in (x, y)$ such that
\[
|q(x, y) - f'(y)| = |f'(z) - f'(y)|.
\]
It follows that $f$ is uniformly differentiable.
31. If such a function exists, then
\[
\lim_{x\to y} \frac{f(x) - f(y)}{x - y} = \varphi(y, y)
\]
so $f'(y) = \varphi(y, y)$, which is continuous in $y$.
Conversely, assume $f$ is continuously differentiable on an open interval $I$ and define
\[
\varphi(x, y) = \begin{cases} \dfrac{f(x) - f(y)}{x - y} & \text{if } x \neq y, \\ f'(x) & \text{if } x = y. \end{cases}
\]
Clearly $\varphi$ is continuous on $\{(x, y) \in I \times I : x \neq y\}$. By the mean value theorem, $\varphi(x, y) = f'(\xi_{xy})$ for $x \neq y$, where $\xi_{xy}$ is between $x$ and $y$. The continuity of $\varphi$ on $I \times I$ now follows from the continuity of $f'$.
Section 4.4
1. (b) $(2 - 3x)/(2x - 3)$, $x \neq 3/2$. (f) $\cos^{-1} \dfrac{3x - 2}{1 - x}$, $1/2 < x < 3/4$.
5. (b) Fix $y > 0$ and let $f(x) = \ln(xy) - \ln x - \ln y$. Since $f'(x) = 0$, $f(x) = f(1) = 0$ for all $x > 0$.
7. (b) $a^{x+y} = \exp((x + y) \ln a) = \exp(x \ln a) \exp(y \ln a) = a^x a^y$.
9. $x^a = \exp(a \ln x)$, hence $(x^a)' = \exp(a \ln x)(a/x) = ax^{a-1}$.
13. The derivative of the left side of (c) is
\[
\frac{2}{x^2 + 1} - \frac{4x}{(x^2 + 1)^2 \sqrt{1 - y^2}}, \qquad y := \frac{x^2 - 1}{x^2 + 1},
\]
which reduces to 0. Therefore, the left side is constant.
14. Set $c = f'(0)$. Since
\[
\frac{f(x + h) - f(x)}{h} = f(x)\, \frac{f(h) - 1}{h},
\]
$f'(x)$ exists and equals $cf(x)$. Therefore, $e^{-cx} f(x)$ has zero derivative, hence $e^{-cx} f(x) = f(0)$.
18. $(f^{-1})''(x) = -\dfrac{f''\big(f^{-1}(x)\big)}{\big[f'\big(f^{-1}(x)\big)\big]^3}$.
Section 4.5
1. (a) $p - q$. (d) −1. (g) −2. (j) 0. (m) −∞. (p) 0. (s) 1 if $p > 1$, +∞ if $p \le 1$. (v) 1.
2. (c) f (0) = limx→0+ f (x) = 5/3.
3. (a) $\ln a_n = n^{-1} \ln \sin(1/n)$ is of the form $\frac{-\infty}{+\infty}$, hence has the same limit as
\[
\frac{-1}{n^2}\, \frac{\cos(1/n)}{\sin(1/n)} = -\frac{1}{n}\, \frac{\cos(1/n)}{n \sin(1/n)} \to 0.
\]
Therefore, $a_n \to 1$.
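A quick numerical look supports the limit. The sketch assumes, from the formula for $\ln a_n$ above, that $a_n = (\sin(1/n))^{1/n}$; the function name and sample indices are illustrative.

```python
import math

def a(n):
    # a_n = (sin(1/n))^(1/n), so that ln a_n = (1/n) * ln sin(1/n).
    return math.sin(1 / n) ** (1 / n)

for n in (10, 1000, 10**6):
    print(n, a(n), math.log(a(n)))
```

The logarithms shrink toward 0 roughly like $-(\ln n)/n$, consistent with $a_n \to 1$ from below.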
6. By logarithmic differentiation,
\[
f'(x) = \left( 1 + \frac{1}{x} \right) x^{1/x} - \frac{\ln x}{x}\, x^{1/x}.
\]
By l’Hospital’s rule, $x^{1/x} \to 1$, hence $\lim_{x\to+\infty} f'(x) = 1$. Applying the mean value theorem to $f$ on each of the intervals $[n, n + 1]$ shows that $f(n + 1) - f(n) \to 1$.
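9. Let $L := \lim_{x\to+\infty} f'(x)/g'(x)$. By l’Hospital’s rule, $\lim_{x\to+\infty} g(x)/f(x)$ exists and equals $1/L$. Another application of l’Hospital’s rule yields
\[
\lim_{x\to+\infty} \frac{\ln f(x)}{\ln g(x)} = \lim_{x\to+\infty} \frac{f'(x)}{g'(x)}\, \frac{g(x)}{f(x)} = 1.
\]
10. (a) By l’Hospital’s rule, the quotient has the same limit as
\[
\frac{\alpha\beta}{2}\, \frac{f'(a + \alpha h) - f'(a + \beta h)}{h}
= \frac{\alpha\beta}{2} \left[ \alpha\, \frac{f'(a + \alpha h) - f'(a)}{\alpha h} - \beta\, \frac{f'(a + \beta h) - f'(a)}{\beta h} \right],
\]
which tends to $\alpha\beta(\alpha - \beta) f''(a)/2$.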
12. Apply l’Hospital’s rule $n$ times to $f(x)/x^{-n}$ to obtain
\[
\lim_{x\to 0^+} x^n f(x) = \lim_{x\to 0^+} \frac{(-1)^n f^{(n)}(x)}{n(n + 1) \cdots (2n - 1)\, x^{-2n}} = \lim_{x\to 0^+} a x^{2n} f^{(n)}(x),
\]
where $a = \dfrac{(-1)^n (n - 1)!}{(2n - 1)!}$. Therefore, $\lim_{x\to 0^+} x^n f(x)$ exists and equals $aL$.
16. By l’Hospital’s rule,
\[
\lim_{x\to+\infty} \frac{f(g(x))}{g(x)} = \lim_{x\to+\infty} \frac{f'(g(x))\, g'(x)}{g'(x)} = L.
\]
For examples, take $f(x) = \sqrt{x}$, $\ln x$, or $x + 1/x$, and $g(x) = x^n$, $e^x$, or $\ln x$.
18. By l’Hospital’s rule,
\[
\lim_{x\to+\infty} f(x) = \lim_{x\to+\infty} \frac{x f(x)}{x} = \lim_{x\to+\infty} \big[ x f'(x) + f(x) \big] = \lim_{x\to+\infty} x f'(x) + \lim_{x\to+\infty} f(x).
\]
For the second part, consider $f(x) = \ln x$.
Section 4.6
2. Apply Taylor’s theorem to the function between the inequalities to produce the number $c \in (0, x)$ in the remainder term:
(b) $f^{(k)}(x) = (-1)^k e^{-x}$, hence
\[
e^{-x} = \sum_{k=0}^{2n-1} \frac{(-1)^k}{k!}\, x^k + \frac{e^{-c}}{(2n)!}\, x^{2n}, \quad\text{where } e^{-c} \in (0, 1).
\]
3. Let $\displaystyle I_n := \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\, dt = -\frac{f^{(n)}(a)}{n!}\, (x - a)^n + I_{n-1}$. Iterating,
\[
I_n = -\sum_{k=1}^{n} \frac{f^{(k)}(a)}{k!}\, (x - a)^k + I_0 = -T_n(x, a) + f(x).
\]
5. By Taylor’s theorem,
\[
b_k = \frac{P^{(k)}(b)}{k!} = \frac{1}{k!} \sum_{j=0}^{n-k} (j + 1)(j + 2) \cdots (j + k)(b - a)^j a_{k+j} .
\]
Section 4.7
1. (a) −1.52137970. (d) −1.42360584. (g) 1.220924381.
2. (a) 0.87672621. (c) 1.55714559.
4. 7.937253933.
Section 5.1
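3. Since $M_j(-f) = -m_j(f)$, $\overline{S}(-f, \mathcal{P}) = -\underline{S}(f, \mathcal{P})$, hence
\[
\overline{\int_a^b} (-f) = \inf_{\mathcal{P}} \overline{S}(-f, \mathcal{P}) = \inf_{\mathcal{P}} \big( -\underline{S}(f, \mathcal{P}) \big) = -\sup_{\mathcal{P}} \underline{S}(f, \mathcal{P}) = -\underline{\int_a^b} f.
\]
Replacing $f$ by $-f$ shows that $\underline{\int_a^b} (-f) = -\overline{\int_a^b} f$.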
5. Since $g$ may be obtained from $f$ by changing one point at a time, we may assume that $f = g$ except at a single point $c \in (a, b)$. Let $\varepsilon > 0$ and let $M$ be a bound for both $|f|$ and $|g|$. The point $c$ is in at most two intervals of any partition $\mathcal{P}$, and each of these has width $\le \|\mathcal{P}\|$. Since $f = g$ on the remaining intervals,
\[
|\overline{S}(f, \mathcal{P}) - \overline{S}(g, \mathcal{P})| \le 2M\|\mathcal{P}\|.
\]
It follows from 5.1.15 that $\overline{\int_a^b} f = \overline{\int_a^b} g$. Similarly $\underline{\int_a^b} f = \underline{\int_a^b} g$.
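6. (c) Let $g = \sin f$, $\varepsilon > 0$, and let $\mathcal{P}$ be any partition of $[a, b]$ such that $\overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}) < \varepsilon$. For fixed $j$, choose sequences $a_n, b_n \in [x_{j-1}, x_j]$ such that $g(a_n) \to M_j(g)$ and $g(b_n) \to m_j(g)$. Then
\[
g(a_n) - g(b_n) \le |f(a_n) - f(b_n)| \le M_j(f) - m_j(f),
\]
hence $M_j(g) - m_j(g) \le M_j(f) - m_j(f)$. Therefore,
\[
\overline{S}(g, \mathcal{P}) - \underline{S}(g, \mathcal{P}) \le \overline{S}(f, \mathcal{P}) - \underline{S}(f, \mathcal{P}) < \varepsilon.
\]
7. (a) Let $L = \lim_{\mathcal{P}} F(\mathcal{P})$ and $M = \lim_{\mathcal{P}} G(\mathcal{P})$. Given $\varepsilon > 0$, choose $\mathcal{P}'_\varepsilon$ and $\mathcal{P}''_\varepsilon$ such that $|F(\mathcal{P}) - L| < \eta$ for all partitions $\mathcal{P}$ refining $\mathcal{P}'_\varepsilon$ and $|G(\mathcal{P}) - M| < \eta$ for all partitions $\mathcal{P}$ refining $\mathcal{P}''_\varepsilon$, where $\eta = \varepsilon/(2|\alpha| + 2|\beta| + 2)$. Let $\mathcal{P}_\varepsilon$ denote the common refinement of $\mathcal{P}'_\varepsilon$ and $\mathcal{P}''_\varepsilon$. Then both inequalities hold for any partition $\mathcal{P}$ refining $\mathcal{P}_\varepsilon$, hence
\[
|(\alpha F(\mathcal{P}) + \beta G(\mathcal{P})) - (\alpha L + \beta M)| \le |\alpha|\,|F(\mathcal{P}) - L| + |\beta|\,|G(\mathcal{P}) - M| < \varepsilon.
\]
(b) Given $\varepsilon > 0$, choose $\mathcal{P}_\varepsilon$ such that $\underline{\int_a^b} f - \varepsilon < \underline{S}(f, \mathcal{P}_\varepsilon) \le \underline{\int_a^b} f$. The inequality still holds if $\mathcal{P}_\varepsilon$ is replaced by a refinement. Therefore, $\underline{\int_a^b} f = \lim_{\mathcal{P}} \underline{S}(f, \mathcal{P})$.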
Section 5.2
1. Assume cn → c ∈ (a, b). Choose δ > 0 so that a < c − δ < c + δ < b
and choose N so that cn ∈ (c − δ, c + δ) for all n > N . Since f has only
finitely many discontinuities on [a, c − δ] ∪ [c + δ, b], f is integrable on
these intervals and the integrals are zero. Thus, given ε > 0, there exist
partitions P1 of [a, c − δ] and P2 of [c + δ, b] such that
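\[ \overline{S}(f, P_1) - \underline{S}(f, P_1) < \varepsilon/3 \quad \text{and} \quad \overline{S}(f, P_2) - \underline{S}(f, P_2) < \varepsilon/3. \]
Define a partition $P$ on $[a, b]$ by $P = P_1 \cup P_2$ and let $|f| \le M$ on $[a, b]$. If $\delta < \varepsilon/(6M)$, then
\[ \overline{S}(f, P) - \underline{S}(f, P) \le \overline{S}(f, P_1) - \underline{S}(f, P_1) + \overline{S}(f, P_2) - \underline{S}(f, P_2) + 2M\delta < \varepsilon. \]
Therefore, $f \in \mathcal{R}_a^b$. Moreover,
\[ \int_a^b f = \int_a^{c-\delta} f + \int_{c-\delta}^{c+\delta} f + \int_{c+\delta}^{b} f = \int_{c-\delta}^{c+\delta} f, \]
hence
\[ \Bigl|\int_a^b f\Bigr| \le \int_{c-\delta}^{c+\delta} |f| \le 2M\delta. \]
Since $\delta$ may be made arbitrarily small, $\int_a^b f = 0$.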
5. Set $M_n = \max\{f_1, \dots, f_n\}$. Then $M_2 = (f_1 + f_2 + |f_1 - f_2|)/2 \in \mathcal{R}_a^b$.
Since Mn = max{Mn−1 , fn }, the general result follows by induction. A
similar argument holds for min.
6. Choose $x_0$ such that $f(x_0) = \sup_{a \le x \le b} f(x)$. Then $\int_a^b f \le f(x_0)(b - a) < M(b - a)$.
9. Let |f | ≤ M on [a, b]. Then |F (x, y) − F (x, y0 )| ≤ M (y − y0 ), hence
limy→y0 F (x, y) = F (x, y0 ).
12. (a) By the approximation property, choose x0 such that |f (x0 )| > M − ε.
By continuity, we may take x0 ∈ (a, b) and we may choose δ > 0 such
that |f (x)| > M − ε for all x ∈ (x0 − δ, x0 + δ). Then
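\[ M(b - a) \ge \int_a^b |f| \ge \int_{x_0 - \delta/2}^{x_0 + \delta/2} |f| \ge \delta(M - \varepsilon). \]
(b) By (a), $|f(x)|^p > (M - \varepsilon)^p$ on $(x_0 - \delta, x_0 + \delta)$, hence, as in (a),
\[ \delta(M - \varepsilon)^p \le \int_a^b |f|^p \le M^p (b - a). \]
Therefore,
\[ \delta^{1/p}(M - \varepsilon) \le \Bigl(\int_a^b |f|^p\Bigr)^{1/p} \le M(b - a)^{1/p}, \]
hence
\[ M - \varepsilon \le \liminf_{p \to +\infty} \Bigl(\int_a^b |f|^p\Bigr)^{1/p} \le \limsup_{p \to +\infty} \Bigl(\int_a^b |f|^p\Bigr)^{1/p} \le M. \]
Since $\varepsilon$ was arbitrary,
\[ \liminf_{p \to +\infty} \Bigl(\int_a^b |f|^p\Bigr)^{1/p} = \limsup_{p \to +\infty} \Bigl(\int_a^b |f|^p\Bigr)^{1/p} = M. \]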
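The limit in 12(b) of Section 5.2 can be observed numerically. The following is only an illustrative sketch: the choice f = sin on [0, π] (so that M = 1), the helper name `p_norm`, and the midpoint rule are assumptions made for the example and are not taken from the text.

```python
import math

def p_norm(f, a, b, p, steps=20_000):
    """Midpoint-rule approximation of (integral over [a, b] of |f|^p)^(1/p)."""
    h = (b - a) / steps
    total = sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(steps))
    return (h * total) ** (1.0 / p)

# Assumed example: f = sin on [0, pi], whose supremum M equals 1.
for p in (1, 10, 100, 1000):
    print(p, p_norm(math.sin, 0.0, math.pi, p))
```

The printed values should increase toward M = 1 as p grows, in line with the squeeze between δ^{1/p}(M − ε) and M(b − a)^{1/p}.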
Section 5.3
1. By a change of variables and periodicity,
\[ \begin{aligned}
\int_0^p f(x + y)\,dx &= \int_y^{p+y} f(x)\,dx = \int_y^p f(x)\,dx + \int_p^{p+y} f(x)\,dx \\
&= \int_y^p f(x)\,dx + \int_p^{p+y} f(x - p)\,dx \\
&= \int_y^p f(x)\,dx + \int_0^y f(x)\,dx = \int_0^p f(x)\,dx.
\end{aligned} \]
3. (a) On $[0, 1]$, $2x/\pi \le \sin x \le x$. Since $\int_0^1 \frac{x}{\sqrt{x^2 + 1}}\,dx = \sqrt{2} - 1$, the inequalities follow.
5. (a) Substituting $y = x^{1/n}$ and integrating by parts $n - 1$ times yields
\[ \int_0^1 \exp\bigl(x^{1/n}\bigr)\,dx = n\int_0^1 y^{n-1} e^y\,dy = F(1) - F(0), \]
where
\[ F(y) = (-1)^{n+1}\, n!\, e^y \sum_{j=0}^{n-1} \frac{(-1)^j}{j!}\,y^j. \]
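The closed form in 5(a) can be sanity-checked numerically. This is a minimal sketch: the function names and the midpoint quadrature are illustrative assumptions, and only the formula for F comes from the solution above.

```python
import math

def midpoint_integral(func, a, b, steps=100_000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def F(y, n):
    """F(y) = (-1)^(n+1) n! e^y sum_{j=0}^{n-1} (-1)^j y^j / j!."""
    partial = sum((-1) ** j * y ** j / math.factorial(j) for j in range(n))
    return (-1) ** (n + 1) * math.factorial(n) * math.exp(y) * partial

def closed_form(n):
    return F(1.0, n) - F(0.0, n)

def numeric(n):
    # Direct quadrature of exp(x^(1/n)) on [0, 1].
    return midpoint_integral(lambda x: math.exp(x ** (1.0 / n)), 0.0, 1.0)
```

For n = 2 the closed form gives F(1) − F(0) = 0 − (−2) = 2, which agrees with 2∫₀¹ y eʸ dy = 2.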
7. Let $I$ denote the integral. Successive integration by parts yields
\[ I = \frac{(k - 1)(k - 3)\cdots(k - 2j + 1)}{1 \cdot 3 \cdots (2j - 1)}\,I_j, \qquad I_j := \int_0^1 x^{k-2j}(1 - x^2)^{j-1/2}\,dx. \]
If $k$ is odd, take $j = (k - 1)/2$ so
\[ I = \frac{(k - 1)(k - 3)\cdots 4 \cdot 2}{1 \cdot 3 \cdots (k - 2)}\,I_j, \qquad I_j = \int_0^1 x(1 - x^2)^{(k-2)/2}\,dx = k^{-1}. \]
If $k$ is even, take $j = k/2$ so
\[ I = \frac{(k - 1)(k - 3)\cdots 3 \cdot 1}{3 \cdot 5 \cdots (k - 1)}\,I_j = I_j = \int_0^1 (1 - x^2)^{(k-1)/2}\,dx. \]
By trig. substitution and Exercise 6,
\[ I_j = \int_0^{\pi/2} \cos^k\theta\,d\theta = \frac{\pi}{2}\,\frac{(k - 1)(k - 3)\cdots 3 \cdot 1}{k(k - 2)\cdots 4 \cdot 2}. \]
9. Substituting $s = f(t)$ and integrating by parts yields
\[ \int_0^y f^{-1}(s)\,ds = \int_0^{f^{-1}(y)} t f'(t)\,dt = y f^{-1}(y) - \int_0^{f^{-1}(y)} f, \]
hence
\[ \int_0^x f + \int_0^y f^{-1} = y f^{-1}(y) + \int_0^x f - \int_0^{f^{-1}(y)} f = y f^{-1}(y) + \int_{f^{-1}(y)}^x f. \]
If $f^{-1}(y) \le x$, then $f(t) \ge y$ for all $t \in [f^{-1}(y), x]$, hence
\[ \int_{f^{-1}(y)}^x f(t)\,dt + y f^{-1}(y) \ge \int_{f^{-1}(y)}^x y\,dt + y f^{-1}(y) = xy. \]
On the other hand, if $f^{-1}(y) \ge x$ then $f(t) \le y$ for all $t \in [x, f^{-1}(y)]$, hence
\[ \int_{f^{-1}(y)}^x f(t)\,dt + y f^{-1}(y) \ge -\int_x^{f^{-1}(y)} y\,dt + y f^{-1}(y) = xy. \]
10. (b) Take $f(t) = \ln(t + 1)$, $0 \le x \le 1$, and $0 \le y \le \ln 2$ in Young’s inequality to obtain
\[ (x + 1)\ln(x + 1) - x + e^y - y - 1 = \int_0^x \ln(t + 1)\,dt + \int_0^y (e^s - 1)\,ds \ge xy. \]
Replace $x + 1$ by $x$, $1 \le x \le 2$.
13. Integrate by parts to obtain
\[ \int_a^b f(x)\sin(nx)\,dx = -\frac{f(x)\cos(nx)}{n}\Big|_a^b + \frac{1}{n}\int_a^b f'(x)\cos(nx)\,dx. \]
17. If $F$ is a primitive of $f$, then $\int_{u(x)}^{v(x)} f = F(v(x)) - F(u(x))$. Now use the chain rule.
19. By l’Hospital’s rule,
\[ \lim_{x \to a} \frac{g(x)}{x - a}\int_a^x f = \lim_{x \to a} \Bigl[ g(x)f(x) + g'(x)\int_a^x f \Bigr] = g(a)f(a). \]
20. (a) $s_n$ is a Riemann sum for $\int_0^1 x^p\,dx$, hence $\lim_{n \to +\infty} s_n = 1/(p + 1)$.
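21. By the mean value theorem,
\[ |f(x) - f(x_{k-1})| \le M|x - x_{k-1}| \le M(x_k - x_{k-1}) = Mh, \quad x \in [x_{k-1}, x_k], \]
hence
\[ \Bigl| \int_a^b f - \sum_{k=1}^n f(x_{k-1})h \Bigr| = \Bigl| \sum_{k=1}^n \int_{x_{k-1}}^{x_k} \bigl[ f(x) - f(x_{k-1}) \bigr]\,dx \Bigr| \le M n h^2. \]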
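The bound M n h² of Exercise 21 in Section 5.3 can be illustrated with a left-endpoint sum. The sketch below assumes the sample function f = exp on [0, 1], for which |f′| ≤ M = e and the exact integral is e − 1; the function names are hypothetical.

```python
import math

def left_riemann_sum(f, a, b, n):
    """Left-endpoint sum with n equal subintervals of width h = (b - a)/n."""
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))

def error_and_bound(n):
    """Return (actual error, M*n*h^2) for f = exp on [0, 1] (assumed example)."""
    a, b = 0.0, 1.0
    h = (b - a) / n
    exact = math.e - 1.0      # integral of exp over [0, 1]
    M = math.e                # bound for |f'| = exp on [0, 1]
    error = abs(exact - left_riemann_sum(math.exp, a, b, n))
    return error, M * n * h * h
```

Since n h = b − a, the bound is M(b − a)h, so the error of the left-endpoint rule decreases at least linearly in h.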
Section 5.5
3. The substitution $t = \sin x$ yields
\[ \int_{\pi/6}^{\pi/3} f(\sin x)\,dx = \int_{1/2}^{\sqrt{3}/2} \frac{f(t)}{\sqrt{1 - t^2}}\,dt. \]
Now apply 5.5.3 with $g(t) = (1 - t^2)^{-1/2}$.
7. $G(b) \le \int_a^b fg \le G(a)$. Now apply the intermediate value theorem to $G$.
9. Apply 5.5.3 to obtain $c \in [0, 1]$ such that
\[ \int_0^\pi g(x)\sin x\,dx = g(0)\int_0^c \sin x\,dx + g(1)\int_c^\pi \sin x\,dx = \cos c + 1. \]
Section 5.7
1. Let $f(x)$ denote the integrand and $0 < \varepsilon < 1$. Then $\int_0^1 f = \int_0^\varepsilon f + \int_\varepsilon^1 f$. On $(0, \varepsilon]$, $2x/\pi \le \sin x \le x$ and $1 - \varepsilon \le 1 - x < 1$, hence
\[ \frac{1}{|x|^p} \le f(x) \le \frac{(\pi/2)^p}{(1 - \varepsilon)^q |x|^p}. \]
On $[\varepsilon, 1)$,
\[ \frac{1}{(1 - x)^q \sin^p 1} \le f(x) \le \frac{1}{(1 - x)^q \sin^p \varepsilon}. \]
Therefore, $\int_0^1 f$ converges iff $p, q < 1$.
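5. Only (b) and (d) diverge.
8. Choose $r > 0$ so that $1/2 < \bigl(\frac{\sin x}{x}\bigr)^p < 1$ for $0 < x < r$. Then, for $0 < \varepsilon < r$,
\[ \frac{1}{2}\int_0^\varepsilon x^{p-q}\,dx \le \int_0^\varepsilon \frac{\sin^p x}{x^q}\,dx \le \int_0^\varepsilon x^{p-q}\,dx. \]
Now apply 5.7.3(a).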
9. (a) all p. (c) all p. (h) p > −2. (k) p > −1.
11. Let $g(x) = x(1 + x^2)^{-1}$, $h(x) = \sin x$ and $f := gh$. Then $\bigl|\int_1^x h\bigr|$ is bounded and $g' < 0$ so, by 5.7.17, $f$ is improperly integrable on $[1, +\infty)$. For every $n$,
\[ \int_0^\infty |f|\,dx \ge \sum_{j=2}^n \int_{(j-1)\pi}^{j\pi} \frac{x|\sin x|}{1 + x^2}\,dx \ge \sum_{j=2}^n \int_{(j-1)\pi}^{j\pi} \frac{\pi(j - 1)|\sin x|}{1 + \pi^2 j^2}\,dx = M\sum_{j=2}^n \frac{j - 1}{1 + \pi^2 j^2}, \]
where $M$ is a positive constant. The sums in the last equality are unbounded, hence $f$ is not improperly absolutely integrable in this case.
13. (a) Converges for all p > 0 if 0 < q < 1; diverges for all p > 0 if q ≥ 1.
(b) Converges for all p > 0 if 0 < q < 1; diverges for all p > 0 if q ≥ 1.
(c) Converges if p > 2 or q > 2 and diverges otherwise.
(d) Converges if p < 2 or q < 2 and diverges otherwise.
(e) Converges iff q < 1.
(f) Converges iff pq < 1.
15. Integrate by parts:
\[ I_n := \int_{-\infty}^{\infty} x^{2n} e^{-x^2/2}\,dx = (2n - 1)\int_{-\infty}^{\infty} x^{2n-2} e^{-x^2/2}\,dx = (2n - 1)I_{n-1}. \]
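The recurrence I_n = (2n − 1)I_{n−1} can be checked numerically. In this sketch the truncation of the real line to [−12, 12], the trapezoid rule, and the function name are assumptions chosen for illustration; the tails beyond the truncation are negligible for small n.

```python
import math

def gaussian_moment(n, half_width=12.0, steps=48_000):
    """Trapezoid approximation of the integral of x^(2n) e^(-x^2/2) over R,
    truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** (2 * n) * math.exp(-x * x / 2.0)
    return h * total

moments = [gaussian_moment(n) for n in range(5)]
for n in range(1, 5):
    # Each ratio I_n / I_{n-1} should be close to 2n - 1.
    print(n, moments[n] / moments[n - 1])
```

Together with I_0 = √(2π), the recurrence gives I_n = (2n − 1)(2n − 3)···3·1·√(2π).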
20. Both integrals converge. The root test is inconclusive.
24. By the Cauchy–Schwarz inequality,
\[ \int_1^\infty \frac{\sqrt{f(x)}}{x}\,dx \le \Bigl(\int_1^\infty f(x)\,dx\Bigr)^{1/2} \Bigl(\int_1^\infty \frac{dx}{x^2}\Bigr)^{1/2} < +\infty. \]
26. Let $F(x) = \int_a^x fg$, $a \le x < b$, and let $b_n \uparrow b$. By the weighted mean value theorem,
\[ F(b_m) - F(b_n) = f(c_{m,n})[G(b_m) - G(b_n)] \]
for some $c_{m,n}$ between $b_m$ and $b_n$. Since $G$ is bounded and $f(c_{m,n}) \to 0$, $\{F(b_n)\}$ is a Cauchy sequence and hence converges. Since $\{b_n\}$ was arbitrary, $\int_a^b fg$ converges.
28. Let $F(t) = \int_0^t f\,dx$. Then
\[ \int_0^t f(x + c)\,dx = \int_c^{c+t} f(x)\,dx = \int_c^t f(x)\,dx + F(t + c) - F(t), \]
hence
\[ \int_0^\infty f(x + c)\,dx = \int_c^\infty f(x)\,dx. \]
Similarly,
\[ \int_{-\infty}^0 f(x + c)\,dx = \int_{-\infty}^c f(x)\,dx. \]
Section 5.8
2. Given $\varepsilon > 0$, let $A_n$ be covered by intervals $I_{n,k}$, $k = 1, 2, \ldots$, with total length $< \varepsilon/2^n$. Then the union is covered by the intervals $I_{n,k}$, $n, k = 1, 2, \ldots$, with total length $< \varepsilon$.
6. The discontinuity set is countable, hence the integral exists. Since all
lower sums are zero, the integral must be zero.
Section 6.1
1. (a)
m3 (m + 1)
.
2m + 1
m
(c) ln(3/2).
(e)
1 X (−1)k
.
m
k
k=1
(i) − ln(m + 1).
(n)
m
X
(−1)k
k
k=1
2. (a) $\dfrac{23}{480}$. (g) $\dfrac{1}{1 + r^2}$.
3. (a) 193e. (c) (e − 1/e)/2.
5. Let $s_n = \sum_{k=1}^n \frac{1}{k}$, $u_n = \sum_{k=1}^n \frac{1}{2k - 1}$, and $v_n = \sum_{k=1}^n \frac{4}{(2k - 1)(2k + 1)}$.
(a) $s_{2n} = s_n/2 + u_n$, hence, by 6.1.9,
\[ u_n - \tfrac{1}{2}\ln n = [s_{2n} - \ln(2n)] - \tfrac{1}{2}[s_n - \ln n] + \ln 2 \to \tfrac{1}{2}\gamma + \ln 2. \]
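The limit in 5(a) of Section 6.1 can be observed numerically. This sketch hard-codes a double-precision value of Euler's constant γ, since the standard `math` module does not provide one; the function name is an illustrative assumption.

```python
import math

EULER_GAMMA = 0.5772156649015329  # double-precision value of Euler's constant

def odd_harmonic_gap(n):
    """u_n - (1/2) ln n, where u_n = sum_{k=1}^n 1/(2k - 1)."""
    u_n = math.fsum(1.0 / (2 * k - 1) for k in range(1, n + 1))
    return u_n - 0.5 * math.log(n)

limit = 0.5 * EULER_GAMMA + math.log(2.0)
for n in (10, 100, 10_000):
    print(n, odd_harmonic_gap(n), limit)
```

The gap to the limit shrinks rapidly because the leading error terms of $s_{2n} - \ln(2n)$ and $\tfrac{1}{2}[s_n - \ln n]$ cancel.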
8. Given ε > 0, choose N such that L − ε < ak /bk < L + ε for all k ≥ N .
Multiplying by bk and summing,
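\[ (L - \varepsilon)\sum_{k=n}^m b_k < \sum_{k=n}^m a_k < (L + \varepsilon)\sum_{k=n}^m b_k, \quad m > n \ge N. \]
Letting $m \to +\infty$ and dividing,
\[ L - \varepsilon < \frac{\sum_{k=n}^\infty a_k}{\sum_{k=n}^\infty b_k} < L + \varepsilon, \quad n \ge N. \]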
12. Let $s_n$ and $t_n$ denote the $n$th partial sums of $\sum a_n$ and $\sum b_n$, respectively. Then $t_k = s_{n_k}$ so $\{t_k\}$ is a subsequence of $\{s_n\}$. Therefore, if $\sum_n a_n$ converges, so does $\sum_k b_k$. If the terms $a_n$ are nonnegative, then, for each $n$, $s_n \le t_k$ for $k \ge n$, hence if $\sum_k b_k$ converges, then so does $\sum_n a_n$. The series $\sum_{n=0}^\infty (-1)^n$ shows that the latter assertion fails in general.
15. By summing a geometric series, a real number x with representation
$b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots a_n 999 \cdots$, where $a_n \ne 9$, may be written as
\[ b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots a_n + 10^{-n} = b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots a_{n-1} a'_n, \]
where $a'_n := a_n + 1$. Therefore, a real number has at least one standard representation.
Suppose that $b_N b_{N-1} \cdots b_0 . a_1 a_2 \cdots = c_M c_{M-1} \cdots c_0 . d_1 d_2 \cdots$ are standard representations. Then
\[ |b_N b_{N-1} \cdots b_0 - c_M c_{M-1} \cdots c_0| = |(.d_1 d_2 \cdots) - (.a_1 a_2 \cdots)| \le \sum_{j=1}^\infty \frac{|d_j - a_j|}{10^j}. \]
Since the representations are standard, |dj − aj | cannot eventually equal
9, hence the right side is < 1. Therefore, since the left side is an integer,
it must be zero. It follows from Exercise 1.5.16 that M = N and bj = cj ,
0 ≤ j ≤ N . Then a1 .a2 a3 · · · = d1 .d2 d3 · · · , hence a1 = d1 . An induction
argument shows that an = dn for all n.
Section 6.2
1. By the ratio test, (a), (b), (e), and (f) converge; (c) and (d) diverge.
2. (a) Converges by ratio test.
(d) Converges by ratio test.
(g) Converges by integral test iff p > 1.
(j) Diverges by limit comparison with $\sum 1/n$.
(m) Converges by limit comparison with $\sum 1/n^2$.
(p) Diverges by ratio test.
(s) Diverges since $2^{\ln n} = n^p$, $p = \ln 2 < 1$.
(v) Converges by limit comparison with $\sum 1/2^n$.
5. For all sufficiently large $n$, $a_n < a_n n^{1/n} < 2a_n$.
6. (a) Converges iff p > 1. (e) Converges iff q > p. (g) Converges iff q > 1 + p.
8. (a) Since $a_n \to 0$, $a_n^2 < a_n$ for all large $n$. Therefore, $\sum b_n$ converges by comparison test.
(d) Converges by comparison test: $b_n \le a_n$.
(h) Converges: For $n$ sufficiently large, say $n \ge N$, $a_n < 1$, hence $b_n = M a_N \cdots a_n < M a_n$, where $M = a_1 \cdots a_{N-1}$.
(l) Converges by the Cauchy–Schwarz inequality.
11. The inequality implies that {an /bn } is a decreasing sequence and hence
converges to L < +∞. Now use the comparison test.
14. Since $\lim_{x \to \infty} f(g(x)) = \lim_{x \to \infty} g(x) = 0$, l’Hospital’s rule implies that $\lim_{x \to \infty} f(g(x))/g(x) = \lim_{x \to \infty} f'(g(x)) = f'(0)$. Now apply the limit comparison test to $\sum_n f(g(n))$ and $\sum_n g(n)$.
15. (a) If $\sum f(1/n^p)$ converges, then $f(0) = \lim_n f(1/n^p) = 0$. Suppose $f'(0) \ne 0$. Then, by l’Hospital’s rule, $\lim_{x \to 0} \frac{f(x^p)}{x^{2p}} = \infty$. Therefore, eventually $f(1/n^p) > 1/n^{2p}$ so the series diverges by the comparison test.
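17. (a) $n!(e - s_n) = m(n - 1)! - \sum_{k=1}^n \frac{n!}{k!} \in \mathbb{N}$.
(b)
\[ \begin{aligned}
e - s_n &= \sum_{k=1}^\infty \frac{1}{(n + k)!} = \frac{1}{(n + 1)!}\Bigl[1 + \frac{1}{n + 2} + \frac{1}{(n + 2)(n + 3)} + \cdots\Bigr] \\
&< \frac{1}{(n + 1)!}\Bigl[1 + \frac{1}{n + 1} + \frac{1}{(n + 1)^2} + \cdots\Bigr] = \frac{1}{(n + 1)!}\,\frac{n + 1}{n},
\end{aligned} \]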
hence n!(e − sn ) < 1/n. By (a) and (b), n!(e − sn ) is a positive integer
< 1/n, which is impossible.
Section 6.3
3. (a) and (c) diverge: dn → 2/3; (b) converges: dn → 3/2.
5. (a) By ratio test: series converges if p < e and diverges if p > e. If p = e,
series diverges by Raabe’s test since then dn → −1/2.
6. Ratio test fails. Raabe: dn → (1 + p)/2, hence converges if p > 1 and
diverges if p < 1. Also diverges if p = 1, since then an = 1/(2n + 1).
10. (a) Diverges.
(d) Converges iff r > 1.
13. − ln an / ln n → ln b.
16. (a) Converges iff q > p.
18. Let $c > 1$ and choose $r \in (1, c)$. Then, for sufficiently large $n$, $c_n > r$, hence
\[ \ln a_n^{-1} = \ln n + c_n \ln\ln n > \ln n + r\ln\ln n = \ln\bigl[n(\ln n)^r\bigr] \]
and therefore $a_n < \dfrac{1}{n(\ln n)^r}$. Since $\int_2^\infty \frac{dx}{x(\ln x)^r} < +\infty$, the integral and comparison tests complete the proof in this case. The case $c < 1$ is similar.
The given series diverges.
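21. Take $b_n = n\ln n$ in Kummer’s test. Then
\[ c_n = \Bigl(1 + \frac{1}{n} + \frac{\beta_n}{n\ln n}\Bigr) n\ln n - (n + 1)\ln(n + 1) = (n + 1)\ln\frac{n}{n + 1} + \beta_n. \]
Since the first term on the right side tends to $-1$, $\liminf_{n \to \infty} \beta_n > 1$ implies $\liminf_{n \to \infty} c_n > 0$, and $\limsup_{n \to \infty} \beta_n < 1$ implies $\liminf_{n \to \infty} c_n < 0$.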
Section 6.4
2. Choose $r > 1$ and $N \in \mathbb{N}$ such that $|a_{n+1}|/|a_n| > r$ for all $n \ge N$. Then $|a_{N+k}| > r^k |a_N|$ for all $k$, hence $a_n \not\to 0$. Therefore, the series diverges.
4. (a) Diverges.
(b) Converges conditionally.
(c) Converges absolutely if p > 1, conditionally if p ≤ 1.
(i) Converges absolutely if p > 1/2, conditionally if p ≤ 1/2.
(m) Converges absolutely if p > 1, diverges if p < 1.
9. Let $b_n = \dfrac{n - 1/2}{n^p + (-1)^n}$. If $p \le 1$, then $b_n \sin n\theta$ need not tend to zero (see Example 8.3.10). For $p > 1$, it suffices by Dirichlet’s test to show that $\sum |b_{n+1} - b_n| < +\infty$. This follows by limit comparison with $\sum 1/n^p$.
13. (a) For n ∈ N, n = qmn + rn , where rn , mn ∈ N and 0 ≤ rn ≤ q − 1.
Since sn − sqmn is a sum of terms of the form aqmn +j , j = 1, . . . q − 1,
each of which → 0, sn − sqmn → 0. Therefore, sn → s.
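(b) For $n \in \mathbb{N}$,
\[ \begin{aligned}
s_{6n} &= \Bigl(1 + \frac{1}{2} + \frac{1}{3}\Bigr) - \Bigl(\frac{1}{4} + \frac{1}{5} + \frac{1}{6}\Bigr) + \Bigl(\frac{1}{7} + \frac{1}{8} + \frac{1}{9}\Bigr) - \Bigl(\frac{1}{10} + \frac{1}{11} + \frac{1}{12}\Bigr) \\
&\quad + \cdots + \Bigl(\frac{1}{6n - 5} + \frac{1}{6n - 4} + \frac{1}{6n - 3}\Bigr) - \Bigl(\frac{1}{6n - 2} + \frac{1}{6n - 1} + \frac{1}{6n}\Bigr) \\
&= \Bigl(1 - \frac{1}{4}\Bigr) + \Bigl(\frac{1}{2} - \frac{1}{5}\Bigr) + \Bigl(\frac{1}{3} - \frac{1}{6}\Bigr) + \cdots + \Bigl(\frac{1}{6n - 3} - \frac{1}{6n}\Bigr) \\
&= \frac{3}{1 \cdot 4} + \frac{3}{2 \cdot 5} + \frac{3}{3 \cdot 6} + \cdots + \frac{3}{(6n - 3)6n} = 3\sum_{k=1}^{6n-3} \frac{1}{k(k + 3)}.
\end{aligned} \]
The last expression converges to $(1 + 1/2 + 1/3) = 11/6$ by 6.1.5 with $m = 3$. By part (a), $s = 11/6$.
(c) Let $t_n$ be the $n$th partial sum of the series. Then
\[ \begin{aligned}
t_{5n} &= 1 + \frac{1}{2} + \frac{1}{3} - \frac{1}{4} - \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} - \frac{1}{9} - \frac{1}{10} + \cdots - \frac{1}{5n} \\
&\ge \frac{1}{3} + \frac{1}{3} + \frac{1}{3} - \frac{1}{3} - \frac{1}{3} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} - \frac{1}{8} - \frac{1}{8} + \cdots \\
&= \frac{1}{3} + \frac{1}{8} + \frac{1}{13} + \cdots + \frac{1}{5n - 2}.
\end{aligned} \]
Thus $t_{5n} \to +\infty$, so the series diverges.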
Section 6.5
3. (a), (b), (c): Double limit does not exist; only one iterated limit exists.
(d), (g), (l): Iterated limits exist and are unequal. Hence double limit
does not exist.
(e), (h): Iterated limits exist and are equal. Double limit exists.
(f), (i), (k): Iterated limits exist and are equal. Double limit does not
exist.
(j) If a = b, iterated limits exist and are equal, double limit exists. If
a ≠ b, iterated limits exist and are unequal.
9. Let $s_{m,n} = \sum_{j=1}^m \sum_{k=1}^n a_{j,k}$ and $s_n = \sum_{k=1}^n b_k$. Then for $m \ge n$,
\[ s_n \le s_{n,n} \le s_{m,n} \le s_{m,m} \le s_{2m-1}, \]
hence the result follows from the squeeze principle.
10. (b) Let
\[ b_n = \sum_{j=1}^n a_{j,n+1-j} = \sum_{j=1}^n \frac{1}{[j^2 + (n + 1 - j)^2]^{p/2}} \]
and let $s_n = \sum_{k=1}^n b_k$. The minimum of $x^2 + (n + 1 - x)^2$ on $[1, n]$ occurs at $x = (n + 1)/2$ and the maximum at $x = 1$ and $x = n$, hence
\[ (n + 1)^2/2 \le j^2 + (n + 1 - j)^2 \le n^2 + 1, \quad 1 \le j \le n, \]
and therefore
\[ \frac{n}{(n^2 + 1)^{p/2}} \le b_n \le \frac{2^{p/2}\, n}{(n + 1)^p}, \]
so the double series converges iff $p > 2$.
11. If $|r| \ge 1$, then $a_{m,n} \not\to 0$, hence the double series diverges. Let $|r| < 1$ and set $c_m = |r|^m/(1 - |r|^m)$. Choose $M$ such that $|r|^m < 1/2$ for $m > M$. Then
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty |r|^{mn} = \sum_{m=1}^M c_m + \sum_{m=M+1}^\infty c_m \le \sum_{m=1}^M c_m + 2\sum_{m=1}^\infty |r|^m < +\infty. \]
Therefore, the iterated series, and hence the double series, converges
absolutely.
12. Let $L < 1$. Choose $r \in (L, 1)$ and then $N$ such that $a_{m,n}^{1/mn} < r$ for all $m, n \ge N$. For such $m, n$, $a_{m,n} < r^{mn}$, hence $\sum a_{m,n}$ converges by Exercises 6 and 11. If $L > 1$, choose $r \in (1, L)$ and then $N$ such that $a_{m,n}^{1/mn} > r$ for all $m, n \ge N$. For such $m, n$, $a_{m,n} > r^{mn} > 1$, hence $a_{m,n} \not\to 0$, so the series diverges.
Section 7.1
1. (b) Pointwise to 0 on (−1, 1] for all p ≥ 0, uniformly on intervals [a, 1]
for a > −1 and p < 1. Uniformly on [−1, 1] if p < 0.
(d) Pointwise to 0 on R, uniformly on |x| ≥ a > 0.
(g) Uniformly to 0 on R.
(j) Pointwise on R, uniformly on the sets |x| ≥ r > 1 and |x| ≤ s < 1.
2. (a) Pointwise but not uniformly.
(b) Uniformly.
6. For example, $f_n(x) = x + 1/n$, $f(x) = g_n(x) = g(x) = x$ on $[1, +\infty)$.
10. Given ε > 0, choose δ > 0 such that |f (x) − f (y)| < ε for all x, y ∈ R
with |x − y| < δ. Then choose N such that |an − a| < δ for all n ≥ N .
For such n and for all x,
|fn (x) − f (x + a)| = |f (x + an ) − f (x + a)| < ε.
13. If x ∈ Q has reduced form x = k/m, then fn (x) = 1 for all n ≥ m.
Therefore, fn converges pointwise to the Dirichlet function d(x). Suppose
the convergence were uniform on [0, 1]. Then we could find n such that
|fn (x) − d(x)| < 1 for all x ∈ [0, 1]. In particular, |fn (1/m) − 1| < 1 for
all m > n, which is impossible since fn (1/m) = 0.
17. Let M > |f0 (x)| + 1 for all x ∈ S. Then
\[ \begin{aligned}
|f_{n+1}(x) - f_n(x)| &= |\sin(r f_n(x)) - \sin(r f_{n-1}(x))| \\
&\le r|f_n(x) - f_{n-1}(x)| \le \cdots \le r^n |f_1(x) - f_0(x)| \\
&\le M r^n.
\end{aligned} \]
Since r < 1, {fn } is uniformly Cauchy. Therefore, fn → some f , uniformly
on S. The generalization is proved in a similar manner, using the mean
value theorem.
Section 7.2
4. Let $x > 0$. By l’Hospital’s rule, $n^2 x e^{-nx}$ has the same limit as $2n e^{-nx}$, namely, 0. The convergence is not uniform on $(0, 1)$, however, as may be seen by taking $b_n = 1/n$ in 7.1.5. An integration by parts shows that $\int_0^1 n^2 x e^{-nx}\,dx = 1 - e^{-n}(1 + n) \to 1$.
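Both claims in Exercise 4 of Section 7.2 can be checked numerically: the integrals tend to 1 even though the pointwise limit is 0, and the values at x = 1/n grow like n/e, which rules out uniform convergence. The function names and the midpoint rule in this sketch are illustrative assumptions.

```python
import math

def f_n(n, x):
    """The sequence f_n(x) = n^2 x e^(-n x)."""
    return n * n * x * math.exp(-n * x)

def integral_closed_form(n):
    """Value 1 - e^(-n)(1 + n) obtained by integration by parts."""
    return 1.0 - math.exp(-n) * (1 + n)

def integral_midpoint(n, steps=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f_n(n, (i + 0.5) * h) for i in range(steps))

for n in (1, 5, 20):
    print(n, integral_midpoint(n), integral_closed_form(n), f_n(n, 1.0 / n))
```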
5. (d) Let $L := \lim_n \int_0^1 f_n$. By the mean value theorem, $e^{-x/n} - 1 = (-x/n)e^{-\xi/n}$, hence
\[ \Bigl| \frac{\sqrt{n}\,\bigl(e^{-x/n} - 1\bigr)}{x} \Bigr| \le \frac{e^{-\xi/n}}{\sqrt{n}} \le \frac{1}{\sqrt{n}}, \]
so $\sqrt{n}\,\bigl(e^{-x/n} - 1\bigr)/x$ converges uniformly to zero. Therefore, $L = 0$.
6. If $x \ge r > 0$, then $f_n(x) = \sqrt{n}/(1 + n^2 x^2) \le \sqrt{n}/(1 + n^2 r^2)$, hence $f_n \to 0$ uniformly on $[r, +\infty)$. The convergence is not uniform on $(0, 1)$, as can be seen by taking $b_n = 1/n$ in 7.1.5. A substitution shows that $\int_0^1 f_n = n^{-1/2}\arctan n \to 0$.
8. (a) $n \sin f_n \to f'/f$ uniformly ⇒ $\int_a^b n \sin f_n \to \ln f(b) - \ln f(a)$.
9. This follows from the inequality
\[ \Bigl| \int_a^x f_n(t)\,dt - \int_a^x f(t)\,dt \Bigr| \le \int_a^x |f_n(t) - f(t)|\,dt \le \int_a^b |f_n - f|. \]
Section 7.3
1. (a) Pointwise on (1, +∞), uniformly on [r, +∞), r > 1.
(d) Uniformly on [0, +∞).
(g) Pointwise on (0, +∞), uniformly on [r, +∞), r > 0.
(i) If p > 1, pointwise on [0, +∞), uniformly on [0, r];
If p = 1, converges only at x = 0.
2. (b) $s(x) = \dfrac{1}{1 + \ln x}$. Pointwise on (1/e, e), uniformly on closed subintervals.
4. Both $s(x)$ and $c(x)$ converge uniformly on $\mathbb{R}$ by the M-test. Therefore, term by term integration is justified so
\[ \int_x^{\pi/2} s(t)\,dt = \sum_n \frac{a_n}{2n + 1}\cos\bigl((2n + 1)x\bigr), \qquad \int_0^x c(t)\,dt = \sum_n \frac{a_n}{n}\sin(nx). \]
6. (a) Let $p \le 1/2$ and $x \ne 0$. By l’Hospital’s rule, $\dfrac{1 - \cos(x/n^p)}{n^{-1}}$ has the same limit as $n \to +\infty$ as
\[ \frac{-p x n^{-p-1}\sin(x/n^p)}{-1/n^2} = p x^2 n^{1-2p}\,\frac{\sin(x/n^p)}{x/n^p}. \]
Since this limit is positive, (a) follows from the limit comparison test.
(b) Since cosine is an even function, to show uniform convergence on
intervals [a, b] we may assume a = 0. By the mean value theorem, for
each n ∈ N and x ∈ [0, b] there exists xn ∈ [0, b] such that
\[ |1 - \cos(x/n^p)| = (x/n^p)|\sin(x_n/n^p)| \le b^2/n^{2p}. \]
Therefore, uniform convergence on [0, b] follows from the M -test. Since
1−cos(x/np ) does not converge uniformly to 0 on any unbounded interval,
s(x) does not converge uniformly on R.
9. Let $|f'| \le M$ on $I$. By the mean value theorem, for each $x \in I$ and $n \in \mathbb{N}$ there exists $\xi$ between $x/(n + 1)$ and 0 such that
\[ \Bigl| \frac{1}{n} f\Bigl(\frac{x}{n + 1}\Bigr) \Bigr| = \frac{|x f'(\xi)|}{n(n + 1)} \le \frac{rM}{n(n + 1)}. \]
Therefore, $s(x)$ converges uniformly on $I$ by the Weierstrass M-test. Since $f'$ is bounded, the derived series
\[ s'(x) = \sum_{n=1}^\infty \frac{1}{n(n + 1)}\, f'\Bigl(\frac{x}{n + 1}\Bigr) \]
converges uniformly on $I$ and $s'(0) = f'(0)$.
11. Since fn ≥ 0, the partial sums of the series increase, so the conclusion
follows from Dini’s theorem (7.1.12).
13. For $x \in [a, b]$, either $f_n(a) \le f_n(x) \le f_n(b)$ or $f_n(b) \le f_n(x) \le f_n(a)$, hence
\[ |f_n(x)| \le M_n := \max\{|f_n(a)|, |f_n(b)|\} \le |f_n(a)| + |f_n(b)|. \]
Since $\sum M_n < +\infty$, $s = \sum_n f_n$ converges uniformly on $[a, b]$. Since each $f_n \in \mathcal{R}([a, b])$, $s \in \mathcal{R}([a, b])$ and $\int_a^b s = \sum \int_a^b f_n$.
15. By Dini’s theorem, the convergence of {gn } is uniform. Therefore, the
result follows from 7.3.9.
18. Since $g$ is continuous and $n^{-2}[g + n] \downarrow 0$, the convergence is uniform on closed bounded intervals $I$. By 7.3.9, $s(x)$ converges uniformly on $I$. The convergence is not absolute for any $x$ (compare with $\sum_n 1/n$).
Section 7.4
1. (a) (−1, 3). (d) (−1, 1]. (g) (−1/4, 1/4). (i) (−1, 1).
2. (b) $\displaystyle\sum_{n=3}^\infty \frac{3^{n-3}}{2^{n-2}}\,x^n$, $-2/3 < x < 2/3$.
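3. (a) Replace $x$ by $x - 1$ in (7.12), where $|x - 1| < 1$, to obtain
\[ \begin{aligned}
x\ln x &= (x - 1)\ln x + \ln x \\
&= \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n}(x - 1)^{n+1} + \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n}(x - 1)^n \\
&= \sum_{n=2}^\infty \frac{(-1)^n}{n - 1}(x - 1)^n + \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n}(x - 1)^n \\
&= (x - 1) + \sum_{n=2}^\infty \Bigl[\frac{(-1)^n}{n - 1} + \frac{(-1)^{n+1}}{n}\Bigr](x - 1)^n \\
&= (x - 1) + \sum_{n=2}^\infty \frac{(-1)^n}{n(n - 1)}(x - 1)^n.
\end{aligned} \]
4. (a) $\displaystyle\sum_{n=1}^\infty \frac{(-1)^{n+1} 2^n + 3^n}{n}\,x^n$, $|x| < 1/3$.
(g) $\displaystyle\sum_{n=0}^\infty \frac{(-1)^n 4^n}{(2n + 1)!}\,x^{2n+1}$, $x \in \mathbb{R}$.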
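The expansion of x ln x obtained in 3(a) of Section 7.4 can be compared with direct evaluation inside the interval |x − 1| < 1. This is a minimal sketch; the function name and the truncation at 200 terms are assumptions made for illustration.

```python
import math

def x_log_x_series(x, terms=200):
    """Partial sum of (x - 1) + sum_{n>=2} (-1)^n (x - 1)^n / (n (n - 1))."""
    t = x - 1.0
    tail = sum((-1) ** n * t ** n / (n * (n - 1)) for n in range(2, terms + 1))
    return t + tail

for x in (0.5, 0.9, 1.5, 1.9):
    print(x, x_log_x_series(x), x * math.log(x))
```

Convergence slows as |x − 1| approaches 1, so more terms are needed near the endpoints of the interval.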
5. Use arccos x = π/2 − arcsin x and (7.20).
9. (a) $\displaystyle\sum_{n=1}^\infty (-1)^n \frac{x^{2n-1}}{(2n - 1)(2n + 1)!}$.
(e) $\displaystyle\sum_{n=0}^\infty \frac{(-1)^n}{(2n + 1)!}\,x^n$, $x > 0$.
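10. (b) $\dfrac{x(1 - x^2)}{(1 + x^2)^2}$.
11. 27/4.
12. (a) $\displaystyle\sum_{n=1}^\infty (-1)^{n+1} c_n x^n$, $|x| < 1$, $c_n := \displaystyle\sum_{k=1}^n \frac{(-1)^k}{k}$.
(d) $\displaystyle\sum_{n=0}^\infty c_n x^{2n+1}$, $x \in \mathbb{R}$, $c_n := \displaystyle\sum_{k=0}^n \frac{(-1)^k}{(2k + 1)!(n - k)!}$.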
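16. For $|x| < (\sqrt{5} - 1)/2$,
\[ \begin{aligned}
(1 - x - x^2)s(x) &= \sum_{n=0}^\infty c_n x^n - \sum_{n=0}^\infty c_n x^{n+1} - \sum_{n=0}^\infty c_n x^{n+2} \\
&= c_0 + c_1 x - c_0 x + \sum_{n=2}^\infty (c_n - c_{n-1} - c_{n-2})x^n \\
&= 1.
\end{aligned} \]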
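The cancellation in Exercise 16 can be seen by multiplying a truncated series by 1 − x − x². This sketch assumes, consistently with the displayed computation, that c₀ = c₁ = 1 and cₙ = cₙ₋₁ + cₙ₋₂; the function names are hypothetical.

```python
def fibonacci_like(count):
    """First `count` coefficients with c_0 = c_1 = 1, c_n = c_{n-1} + c_{n-2}."""
    c = [1, 1]
    while len(c) < count:
        c.append(c[-1] + c[-2])
    return c[:count]

def times_one_minus_x_minus_x2(coeffs):
    """Coefficients of (1 - x - x^2) * sum_n coeffs[n] x^n."""
    out = [0] * (len(coeffs) + 2)
    for n, c in enumerate(coeffs):
        out[n] += c
        out[n + 1] -= c
        out[n + 2] -= c
    return out

coeffs = fibonacci_like(20)
product = times_one_minus_x_minus_x2(coeffs)
# Below the truncation degree, every coefficient except the constant term cancels.
print(product[:20])
```

Only the two highest-degree coefficients of the product are nonzero besides the constant term; they come from truncating the series and vanish in the limit for |x| < (√5 − 1)/2.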
18. Replace $x$ by $-t^2$ in (7.19) to obtain
\[ \frac{1}{\sqrt{1 + t^2}} = \sum_{n=0}^\infty \frac{(-1)^n (2n)!}{(n!)^2 4^n}\,t^{2n}, \quad |t| < 1. \]
Integrating from 0 to $x$ yields the desired representation.
21. (a) Choose $r$ such that $R_s^{-1} = \limsup_n |c_n|^{1/n} < r < 1$. Then
\[ |c_{n^2}|^{1/n} = \bigl(|c_{n^2}|^{1/n^2}\bigr)^n < r^n \to 0, \]
hence $R_t = +\infty$.
(b) If $c_n = (1 + a/n^p)^n$, $p > 0$, then $R_s = 1$ and
\[ R_t = \lim_n \Bigl(1 + \frac{a}{n^{2p}}\Bigr)^{-n} = \begin{cases} e^{-a} & \text{if } p = 1/2, \\ 0 & \text{if } p < 1/2 \text{ and } a > 0, \\ +\infty & \text{if } p < 1/2 \text{ and } a < 0, \\ 1 & \text{if } p > 1/2. \end{cases} \]
22. (a) If $0 < R_s < +\infty$, choose $N$ such that $|c_n|^{1/n} < 2R_s^{-1}$ for all $n \ge N$. For such $n$, $|c_n|^{1/n^2} < (2R_s^{-1})^{1/n} \to 1$, hence $R_t \ge 1$. Similarly, $|c_n|^{1/n^2} > (R_s^{-1}/2)^{1/n}$ for infinitely many $n$, hence $R_t \le 1$.
27. By the alternating series test, $\sum_{n=0}^\infty c_n x^n$ converges at $x = -1$, hence the result follows from Abel’s continuity theorem.
28. (a) Let $s_n(x) = \sum_{k=1}^n k x^k$ and $s(x) = \dfrac{x}{(1 - x)^2}$, $x \in [0, 1)$. By 7.4.6 and the boundedness of $f$, $s_n(x) \to s(x)$ uniformly on $[0, r]$, $0 < r < 1$.
30. Define $h$ on $I \cup J$ by
\[ h(x) = \begin{cases} f(x) & \text{if } x \in I, \\ g(x) & \text{if } x \in J. \end{cases} \]
By 7.4.19, $f = g$ on $I \cap J$, hence $h$ is well-defined and analytic on $I \cup J$.
33. (a) By 7.4.13, if the series $g(x)$ converges for $|x - a| < r_1$, then $f(x)g(x) = \sum c_n (x - a)^n$, where
\[ c_0 = a_0 b_0 = a_0 = 1 \quad \text{and} \quad c_n = \sum_{k=0}^n a_k b_{n-k} = a_0 b_n - b_n = 0, \quad n \ge 1. \]
Therefore, $f(x)g(x) = 1$ for $|x - a| < r_1$.
(b) Suppose $|a_n| \le M^n$ for all $n$. If $|b_j| \le (2M)^j$ for $1 \le j \le n - 1$, then
\[ |b_n| \le \sum_{k=1}^n |a_k||b_{n-k}| \le \sum_{k=1}^n 2^{n-k} M^k M^{n-k} < (2M)^n. \]
By induction, $|b_n| \le (2M)^n$ for all $n$.
(c) By 7.4.16, there exists a constant M > 0 such that |an | ≤ M n for all
n, hence (b) holds. By 7.4.16, g is analytic at a.
Section 8.1
1. Only (b) and (d) are not metrics.
3. Symmetry and coincidence are clear. To verify the triangle inequality
d(x, y) ≤ d(x, z) + d(y, z) simply note that if $x_j \ne y_j$ then either $x_j \ne z_j$
or $y_j \ne z_j$ so that every index $j$ contributing to d(x, y) also contributes
to d(x, z) + d(y, z).
5. By the triangle inequality,
d(x, y) ≤ d(x, a) + d(a, y) ≤ d(x, a) + d(a, b) + d(b, y),
hence d(x, y) − d(a, b) ≤ d(x, a) + d(b, y). Similarly d(a, b) − d(x, y) ≤
d(x, a) + d(b, y).
10. Let {xn } be a Cauchy sequence in E. Some Ej must contain a subsequence
of {xn }, and since Ej is complete, the subsequence converges to a member
of Ej . By Exercise 9, {xn } converges.
The assertion is false for infinitely many sets. For example, let {r1 , r2 , . . .} be
an enumeration of the rationals, and take En = {rn } (or {r1 , . . . , rn }).
13. The proof of (a) is straightforward. For the necessity in (b), let $\{(x_n, y_n)\}$ be Cauchy in $Z$. Since $d(x_n, x_m) \le \eta\bigl((x_n, y_n), (x_m, y_m)\bigr)$, $\{x_n\}$ is Cauchy
in X. Similarly, {yn } is Cauchy in Y . The converse is clear. Part (c) is
proved in a similar manner, and (d) follows from (b) and (c).
15. Part (a) is straightforward. For example, if ρ(x, y) = 0, then ρ(x, y) =
d(x, y), hence x = y. Parts (b) and (c) follow from the observation that
ρ(x, y) = d(x, y) if either term is less than a. Part (d) follows from (b)
and (c).
The metrics need not be metrically equivalent: Take d to be the usual
metric on R.
The function σ does not define a metric on X since σ(x, x) = a > 0.
18. (a) The triangle inequality follows from the observation that the function
t(1 + t)−1 is increasing on [0, +∞). The remaining properties of a metric
are easily established. Parts (b) and (c) follow from the definition of ρ
and the equation
\[ d(x, y) = \frac{\rho(x, y)}{1 - \rho(x, y)}, \]
noting that $\rho < 1$.
The metrics |x−y| and |x−y|/(1+|x−y|) are not metrically equivalent.
20. By Exercise 18, each ρk is a metric on X. It follows easily that ρ is a metric
on $X$. For (b), suppose $\rho(x_n, x) \to 0$. Since $\rho_k \le 2^k \rho$, $\rho_k(x_n, x) \to 0$. By Exercise 18, $d_k(x_n, x) \to 0$. Conversely, suppose $d_k(x_n, x) \to 0$, hence $\rho_k(x_n, x) \to 0$, for each $k$. Given $\varepsilon > 0$, choose $M \in \mathbb{N}$ such that $\sum_{n > M} 2^{-n} < \varepsilon/2$ and choose $N > M$ so that
ρ1 (xn , x) + ρ2 (xn , x) + · · · + ρM (xn , x) < ε/2
for all n ≥ N . For such n, ρ(xn , x) < ε.
23. For x, y ∈ [1, b],
\[ \begin{aligned}
|f_n(x, y) - f(x, y)| &= \frac{\bigl| y(1 + x^n)^{1/n} - x(1 + y^n)^{1/n} \bigr|}{y(1 + y^n)^{1/n}} \\
&= \frac{\bigl| (1 + x^n)^{1/n} - x(1 + y^{-n})^{1/n} \bigr|}{(1 + y^n)^{1/n}} \\
&\le \frac{|(1 + x^n)^{1/n} - x| + |x - x(1 + y^{-n})^{1/n}|}{(1 + y^n)^{1/n}} \\
&\le x\bigl[(1 + x^{-n})^{1/n} - 1\bigr] + x\bigl[(1 + y^{-n})^{1/n} - 1\bigr] \\
&\le b\bigl[(2^{1/n} - 1) + (2^{1/n} - 1)\bigr] \to 0.
\end{aligned} \]
Section 8.2
1. [FIGURE C.1: Open balls for Exercise 1, showing $B_1^{d_1}(0, 0)$ and $B_1^{d_\infty}(0, 0)$.]
3. r = d(x, y)/2.
5. If x, y ∈ Br (a) and 0 < t < 1, then
\[ \begin{aligned}
\|tx + (1 - t)y - a\| &= \|t(x - a) + (1 - t)(y - a)\| \\
&\le \|t(x - a)\| + \|(1 - t)(y - a)\| \\
&< tr + (1 - t)r = r.
\end{aligned} \]
In general, spheres are not convex. (Consider (R2 , d2 ).)
8. By Exercise 8.1.6, ρ is a metric. Since ex is a continuous function on R
with continuous inverse, ρ(xn , x) → 0 iff |xn − x| → 0. Therefore, ρ is
topologically equivalent to the usual metric of R. (R, ρ) is not complete
in this metric. For example, $\{-n\}_{n=1}^\infty$ is a Cauchy sequence in (R, ρ)
with no limit. Therefore, ρ cannot be metrically equivalent to the usual
metric of R.
12. Let {fn } be a sequence in C converging uniformly to f . Then fn (x) =
fn (1 − x) for all n and x. Taking limits yields f (x) = f (1 − x) for all
x. To see that C is not closed in the metric of Exercise 8.1.22, define
fn ∈ C by fn (1/2) = 1, fn (x) = 0 if x ∈ [0, 1/2 − 1/n] ∪ [1/2 + 1/n, 1]
and linear on [1/2 − 1/n, 1/2 + 1/n].
Section 8.3
1. (a) cl(A) ∪ cl(B) is closed and ⊇ A ∪ B, so cl(A) ∪ cl(B) ⊇ cl(A ∪ B).
Similarly, cl(A ∪ B) ⊇ cl(A) and cl(A ∪ B) ⊇ cl(B).
(d) int(A)∪int(B) is open and ⊆ A∪B, hence int(A)∪int(B) ⊆ int(A∪B).
The example A = (0, 1], B = (1, 2) in R produces strict inclusion.
(f) bd(cl(A)) = cl(cl(A)) \ int(cl(A)) ⊆ cl(A) \ int(A) = bd(A). The
example A = Q in R produces strict inclusion.
3. (b) $\{(x, y, 0) : x^2 + y^2 = 1\}$.
(e) {(1, 0), (0, 0)}.
(f) The circle $\{(x, y) : x^2 + y^2 = 1\}$ together with the point (0, 0).
6. (a) By 8.3.6, y ∈ clY (A) iff for any sequence {an } in A with an → y,
y ∈ A. The same characterization can be given for y ∈ clX (A) ∩ Y .
8. The sequence $\{f_n\}$ has no cluster points in $\bigl(C([0, 1]), \|\cdot\|_\infty\bigr)$, hence the set $\{f_1, f_2, \ldots\}$ is closed. The identically zero function is a cluster point of the sequence in $\bigl(C([0, 1]), \|\cdot\|_1\bigr)$, hence the set is not closed in this space.
9. (a) B is open and B ⊆ C, hence B ⊆ int(C). The example B1 (x) = {x}
and C1 (x) = X in a nontrivial discrete space gives strict inclusion.
12. (b) By 8.3.9, for any $y \in \mathbb{R}$ there exist integers $n_k > 0$, $m_k$ such that $n_k/(2\pi) + m_k \to y - x/(2\pi)$, hence
\[ \sin(n_k + x) = \sin\bigl(2\pi\bigl[(n_k + x)/(2\pi) + m_k\bigr]\bigr) \to \sin(2\pi y). \]
Therefore, the set is dense in [−1, 1].
16. Let u ∈ U and choose ε > 0 such that Bε (u) ⊆ U . Since Y is dense in
X, Bε (u) ∩ U ∩ Y = Bε (u) ∩ Y ≠ ∅.
If U is not open, then the assertion may not hold. For example, take
X = [0, 1], Y = (0, 1], and U = {0}.
20. (a) Let $u, v \in I := \bigcup_{i \in \mathcal{I}} I_i$ and $t \in (0, 1)$. Then $u \in I_i$ and $v \in I_j$ for some $i, j \in \mathcal{I}$. Since $I_i \cap I_j \ne \emptyset$, $I_i \cup I_j$ is an interval. Therefore,
tu + (1 − t)v ∈ Ii ∪ Ij ⊆ I, hence I is an interval. Since each Ii is open, I
is open.
Section 8.4
1. (b), (k) (o), (r) Limit and iterated limits are 0.
(e) Limit does not exist. One iterated limit is 0, the other is 1.
(i) Limit and iterated limits exist and = 1/2.
2. (a) The limit is 1 since
\[ \Bigl| \frac{x^2 - 5y^2}{x^2 + 3y^2} - 1 \Bigr| = \frac{8y^2}{x^2 + 3y^2} < \frac{8y^2}{(|y|/a)^{2/p} + 3y^2} \le 8a^{2/p}|y|^{2(1 - 1/p)} \to 0. \]
(b) The limit does not exist, as may be seen by converting to polar
coordinates.
3. By the Cauchy mean value theorem, for each $x \ne y$ there exists a number $\theta = \theta(x, y)$ between $x$ and $y$ such that
\[ g(x, y) = \frac{f'(\theta)}{\cos\theta}. \]
Since $\lim_{y \to x} \theta(x, y) = x$, define $g(x, x) = f'(x)/\cos x$.
6. This follows from Exercise 8.1.5.
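7. Given $\varepsilon > 0$, choose $\delta > 0$ such that $|f(x) - f(a)| < \varepsilon$ for all $x, a$ with $|x - a| < \delta$. Let $\sqrt{(x - a)^2 + (y - b)^2} < \delta/2$. Then
\[ \begin{aligned}
\Bigl| \sqrt{x^2 + y^2} - \sqrt{a^2 + b^2} \Bigr| &= \frac{|x^2 + y^2 - a^2 - b^2|}{\sqrt{x^2 + y^2} + \sqrt{a^2 + b^2}} \\
&\le \frac{|x - a|(|x| + |a|) + |y - b|(|y| + |b|)}{\sqrt{x^2 + y^2} + \sqrt{a^2 + b^2}} \\
&\le |x - a| + |y - b| \\
&\le 2\sqrt{(x - a)^2 + (y - b)^2} < \delta,
\end{aligned} \]
hence $|g(x, y) - g(a, b)| < \varepsilon$.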
8. For a proof using the sequential criterion for uniform continuity, let
$x_n - a_n,\ y_n - b_n \to 0$. Then $\alpha x_n + \beta y_n - (\alpha a_n + \beta b_n) \to 0$, hence
\[ g(x_n, y_n) - g(a_n, b_n) = f(\alpha x_n + \beta y_n) - f(\alpha a_n + \beta b_n) \to 0. \]
The functions $xy$ and $\sin(xy)$ are not uniformly continuous on $\mathbb{R}^2$. (For the former take $x_n = y_n = n + 1/n$ and $a_n = b_n = n$. For the latter take $x_n = y_n = \sqrt{2\pi}\,[n + 1/(3n)]$ and $a_n = b_n = \sqrt{2\pi}\,n$.)
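The two counterexamples in Exercise 8 of Section 8.4 can be explored numerically: the points (xₙ, yₙ) and (aₙ, bₙ) get arbitrarily close while the function values stay apart. The function names in this sketch are illustrative assumptions.

```python
import math

def product_gap(n):
    """For xy: x_n = y_n = n + 1/n and a_n = b_n = n.
    Returns (x_n - a_n, x_n*y_n - a_n*b_n)."""
    x = n + 1.0 / n
    a = float(n)
    return x - a, x * x - a * a

def sine_gap(n):
    """For sin(xy): x_n = y_n = sqrt(2 pi)(n + 1/(3n)) and a_n = b_n = sqrt(2 pi) n.
    Returns (x_n - a_n, sin(x_n*y_n) - sin(a_n*b_n))."""
    root = math.sqrt(2.0 * math.pi)
    x = root * (n + 1.0 / (3.0 * n))
    a = root * n
    return x - a, math.sin(x * x) - math.sin(a * a)

for n in (10, 100, 1000):
    print(n, product_gap(n), sine_gap(n))
```

In the first case the difference of values equals 2 + 1/n², and in the second it tends to sin(4π/3) = −√3/2, so neither tends to 0.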
11. This follows from the inequalities
\[ |f_j(x) - f_j(a)| \le \|f(x) - f(a)\| \le \sum_{j=1}^n |f_j(x) - f_j(a)|. \]
12. We prove the uniform continuity part. Given ε > 0, choose a fixed n
such that ρ(fn (x), f (x)) < ε/3 for all x ∈ X. Then choose δ > 0 such
that ρ(fn (x), fn (a)) < ε/3 for all x, a ∈ X with d(x, a) < δ. The triangle
inequality then shows that ρ(f (x), f (a)) < ε for all x, a ∈ X with
d(x, a) < δ.
Section 8.5
1. (a) compact.
(b) closed, not bounded.
(f) bounded, not closed.
(h) neither bounded nor closed.
3. Compact case: Let {Ui : i ∈ I} be an open cover of C := C1 ∪ · · · ∪ Ck ,
where each Cj is compact. For each j there exists a finite set Ij ⊆ I
such that {Ui : i ∈ Ij } covers Cj . If I0 is the union of the Ij , then
{Ui : i ∈ I0 } is a finite subcover of C.
4. Such an intersection is closed and contained in a compact set and is
therefore compact.
7. If E is totally bounded, then cl(E) is totally bounded. Since X is complete,
cl(E) is complete. Therefore, by 8.5.8, cl(E) is sequentially compact. In
particular, every sequence in E has a cluster point in X.
Conversely, assume every sequence in E has a subsequence that
converges in X. Let {yn } be a sequence in cl(E). For each n, choose
xn ∈ E such that d(xn , yn ) < 1/n. By hypothesis, a subsequence xnk
converges to some x ∈ X, hence ynk → x. Therefore, cl(E) is sequentially
compact hence totally bounded.
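11. Suppose $\bigcap_{n=1}^\infty C_n = \emptyset$. Then $\bigcup_{n=1}^\infty C_n^c = X$, hence $\{C_n^c : n \in \mathbb{N}\}$ is an open cover of $X$ and therefore also of $C_1$. Choose $n \in \mathbb{N}$ such that $C_1 \subseteq C_1^c \cup \cdots \cup C_n^c$. Taking complements, $C_n = C_1 \cap \cdots \cap C_n \subseteq C_1^c \subseteq C_n^c$, which is impossible.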
13. By the approximation property of suprema, there exist sequences {an }
and {bn } in A such that d(an , bn ) → d(A). Since A is compact, there
exists a subsequence $\{a'_n\}$ of $\{a_n\}$ converging to some $a \in A$. Similarly, there exists a subsequence $\{b''_n\}$ of the corresponding subsequence $\{b'_n\}$ that converges to some $b \in A$. It follows that $d(a, b) = \lim_n d(a''_n, b''_n) = d(A)$.
For the example, take $A = \{f_n\}$ in $C([0, 1])$ with the sup metric, where $f_n(x) = x^n$. Then $d(A) = 1 > d(f_n, f_m)$ for all $m, n$.
15. (a) For any a ∈ A, d(A, x) ≤ d(a, x) ≤ d(a, y) + d(y, x), hence d(A, x) −
d(y, x) ≤ d(a, y). Taking the infimum over a yields d(A, x) − d(y, x) ≤
d(A, y) or d(A, x) − d(A, y) ≤ d(y, x). Interchanging x and y yields (a).
(b) If $x \notin \mathrm{cl}(A)$ there exists $r > 0$ such that $B_r(x) \cap \mathrm{cl}(A) = \emptyset$. Then $d(a, x) \ge r$ for all $a \in A$, hence $d(A, x) > 0$. Conversely, assume $x \in \mathrm{cl}(A)$ and let $a_n \in A$ with $a_n \to x$. Since $d(A, x) \le d(a_n, x) \to 0$, $d(A, x) = 0$.
(c) By (b), the denominator of $F_{AB}(x)$ is positive, hence $F_{AB}$ is well-defined. Continuity follows from (a), and clearly $0 \le F_{AB} \le 1$. The last assertions follow from (b).
(d) U = {x ∈ X : FAB (x) < 1/2}, V = {x ∈ X : FAB (x) > 1/2}.
19. Let $x_n := f(1/n)$ and $y_n := f(2\pi - 1/n)$. Then $\lim_n x_n = \lim_n y_n = (1, 0)$
but $f^{-1}(x_n) = 1/n \to 0$ and $f^{-1}(y_n) = 2\pi - 1/n \to 2\pi$.
21. Each set is a continuous image of the compact set A × B.
Section 8.6
3. Suppose that F is equicontinuous at a. Given ε > 0, choose δ > 0 such
that ρ(f(x), f(a)) < ε for all x ∈ X with d(x, a) < δ and all f ∈ F.
Given sequences $\{f_n\}$ in F and $\{x_n\}$ in E with $x_n \to a$, choose N such
that $d(x_n, a) < \delta$ for all n ≥ N. For such n, $\rho(f_n(x_n), f_n(a)) < \varepsilon$.
Conversely, suppose that F is not equicontinuous at a. Then there
exist ε > 0 and members $x_n$ of E and $f_n$ of F such that $d(x_n, a) < 1/n$
but $\rho(f_n(x_n), f_n(a)) \ge \varepsilon$. Therefore, the sequential condition does not
hold.
7. Let x > a ≥ c. By the mean value theorem applied to the function
$f(z) = z^{-p}$ on (na, nx),
\[ \left| \frac{1}{(nx)^p} - \frac{1}{(na)^p} \right| = \frac{pn|x - a|}{y_n^{p+1}} \quad \text{for some } y_n \in (na, nx). \]
Since $y_n^{p+1} \ge (nc)^{p+1} \ge n c^{p+1}$, $|(nx)^{-p} - (na)^{-p}| \le p|x - a|c^{-(p+1)}$, which
shows equicontinuity.
9. Take xn = a + π/n in Exercise 3. Then xn → a but
sin(nxn ) − sin(na) = −2 sin(na),
which has no limit if a is a nonzero rational number.
11. By the mean value theorem, |f (x) − f (y)| ≤ M |x − y|.
14. Let $\|f_i\|_\infty \le M$ for all i. Then $|F_i(x) - F_i(y)| \le M|x - y|$, hence F is
uniformly equicontinuous on [a, b]. It follows that the uniform closure G
of F in C([a, b]) is uniformly equicontinuous on [a, b] (Exercise 6). Since
G is also closed and bounded, it is compact (Arzelà–Ascoli Theorem),
hence totally bounded.
Section 8.7
1. (c) not connected.
(d) path connected, hence connected.
(e) connected iff −1 ≤ a ≤ 1.
5. Then f (u) and f (v) have opposite signs, say f (u) < 0 < f (v). Since the
range of f is connected, it contains the interval (f (u), f (v)).
7. Let $f = (g, h) : X \to \mathbb{R}^2$ and $L := \{(x, x) : x \in \mathbb{R}\}$. Then L separates $\mathbb{R}^2$
into two open half-planes $H_1$ and $H_2$. Choose any $x_0 \in X$ and suppose
$f(x_0) \in H_1$. Then $E := f^{-1}(H_1^c) = f^{-1}(H_2)$ is both open and closed.
Since X is connected, E = ∅. Therefore, $f(X) \subseteq H_1$.
9. Consider the case $B := B_1(0)$. Any point in $B^c$ may be connected to the
sphere $S := S_2(0)$ by a radial line segment. Since S is path connected
(8.7.10), $B^c$ is path connected.
12. Denote the union by A. Let f : A → {0, 1} be continuous. Since $A_n$ is
connected, $f(A_n)$ is a single point. Since $A_n \cap A_{n+1} \ne \emptyset$, an induction
argument shows that f is constant.
16. Suppose that f : L → C is such a function. Then $f^{-1} : C \to L$ is
continuous (8.5.11). Remove a point p from the interior of L. Then $f^{-1}$
maps the connected set C \ f(p) onto the disconnected set L \ p.
The function f(t) = (cos t, sin t) maps [0, 2π] continuously onto the
circle $x^2 + y^2 = 1$.
20. Let x ∈ bd(A) and ε > 0. Then there exist $u, v \in B_\varepsilon(x)$ such that
f(u) ≥ c and f(v) < c. Since $B_\varepsilon(x)$ is convex, it is connected, hence
$f(B_\varepsilon(x))$ is an interval and so must contain c. Taking ε = 1/n, we may
construct a sequence $x_n \to x$ with $f(x_n) = c$ for each n. Therefore,
f(x) = c. This shows that bd(A) ⊆ B.
The example $f(x) = x^2$ on R with c = 0 shows that the inclusion may
be strict.
22. (a) $C_x$ is connected by Exercise 13. Let $u \in C_x$ and choose ε > 0 such
that $B_\varepsilon(u) \subseteq U$. Since $B_\varepsilon(u)$ is connected, $B_\varepsilon(u) \cup C_x$ is connected,
hence $B_\varepsilon(u) \subseteq C_x$. Therefore, $C_x$ is open. If $C_x \cap C_y \ne \emptyset$, then $C_x \cup C_y$
is connected, hence $C_x = C_y$. Therefore, U is a union of pairwise disjoint
components.
(b) Choose a point with rational coordinates in each component in (a).
Since these points form a countable set, the union is countable.
Section 8.8
3. Choose a sequence of polynomials $P_n$ converging uniformly to f on
[a, b]. By hypothesis, $\int_a^b f P_n = 0$ for all n, hence $\int_a^b f^2 = 0$. Since f is
continuous, f = 0. If a ≥ 0, then the polynomials with even powers form
a separating algebra, hence the result follows as before.
6. By the Stone–Weierstrass theorem, there exists a sequence of functions
$g_n$ in A converging uniformly to f. Set $f_n = g_n - g_n(x_0)$. Then $f_n \in A$
and $g_n(x_0) \to 0$, hence
\[ \|f_n - f\|_\infty \le \|f_n - g_n\|_\infty + \|g_n - f\|_\infty = |g_n(x_0)| + \|g_n - f\|_\infty \to 0. \]
9. By 8.8.8, there exists a sequence $\{T_n\}$ of trigonometric polynomials
converging uniformly to f on [0, 2π]. For any j, sin(jx) and cos(jx)
are linear combinations of products $\sin^m x \cos^n x$, hence, by hypothesis,
$\int_0^{2\pi} f(x) T_n(x)\,dx = 0$ for all n. Therefore, $\int_0^{2\pi} f^2 = 0$ so f = 0.
11. The set of all functions of the form $T(x) := b_0 + \sum_{j=1}^m b_j \sin(jx)$ on
[−π/2, π/2] is an algebra A containing the constant functions. Since sin x
separates points, so does A. Therefore, given ε > 0, $\|f - T\|_\infty < \varepsilon/2$ for
some T. Since f(0) = 0, $|b_0| < \varepsilon/2$. Therefore, $\|f - (T - b_0)\|_\infty < \varepsilon$.
15. The functions $\sum_{i=1}^n g_i(x) h_i(y)$ form an algebra and separate points of
X × Y.
Section 8.9
1. Assume that X has the decreasing sequence property and let $\{x_n\}$ be a
Cauchy sequence in X. Take $C_n = \mathrm{cl}\{x_n, x_{n+1}, \dots\}$. Then $C_n$ is closed,
$C_{n+1} \subseteq C_n$ and $d(C_n) \to 0$ (because $\{x_n\}$ is Cauchy). By assumption,
there exists x ∈ X such that $x \in C_n$ for all n. It follows that some
subsequence of $\{x_n\}$ converges to x. Therefore $x_n \to x$ (Exercise 8.1.9).
3. Let $\{r_1, r_2, \dots\}$ be an enumeration of Q. Then $U_n := \{r_n, r_{n+1}, \dots\}$ is
open and dense in Q but $\bigcap_n U_n = \emptyset$.
Section 9.1
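1. (a) $\dfrac{2y\,dx - 2x\,dy}{(x + y)^2}$.
(e) $\cos(x^2 y)(2xy\,dx + x^2\,dy)$.
(h) $e^{xy^2}(y^2\,dx + 2xy\,dy)$.
2. (b) $\begin{pmatrix} e^x \sin y & e^x \cos y \\ e^y \cos x & e^y \sin x \end{pmatrix}$.
(c) $\dfrac{1}{(x^2 + y^2)^2} \begin{pmatrix} y^3 - x^2 y & x^3 - xy^2 \\ 4xy^2 & -4y^3 \end{pmatrix}$.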
3. Let ∆ = {(x, x) : x ∈ R}.
(a) Differentiable on $\mathbb{R}^2$ iff p, q > 3, in which case partials are continuous.
(d) Differentiable on $\mathbb{R}^2$ iff p, q > 1. Partials are continuous iff p > 2.
4. (a) Differentiable and partials continuous iff p + q > 1.
(d) Differentiable and partials continuous iff p + q > 2s + 1.
8. x · ∇f (x) = a · f (x), x · ∇g(x) = g(x).
10. $e^{-f(x)}(e^{x_1}, e^{x_2}, \dots, e^{x_n})$.
12. (a) $\dfrac{x_i}{\|x\|}$.
(c) $\dfrac{\|x\|^2 - x_i^2}{\|x\|^3}$.
Section 9.2
1. Let α denote the right side of the inequality. Clearly, ‖T‖ ≤ α. If ‖x‖ ≤ 1,
then ‖Tx‖ ≤ ‖T‖‖x‖ ≤ ‖T‖, hence α ≤ ‖T‖.
3. Since $\nabla(\psi^{-1}) = -\psi^{-2}\nabla\psi$, the assertion follows from the scalar product
rule.
4. (a) Let f(x) = x and ψ(x) = ‖x‖ in the product rule (9.2.6). Since $df_x$
is the identity transformation and ∇ψ(x) = x/‖x‖,
\[ dg_x(h) = \|x\|h + \|x\|^{-1}(x \cdot h)x. \]
Therefore,
\[ dg_x(x) = \|x\|x + \|x\|^{-1}(x \cdot x)x = 2\|x\|x. \]
6. Let η(h), µ(k) be such that
\[ f(a+h) = f(a) + df_a(h) + \|h\|\eta(h), \qquad g(b+k) = g(b) + dg_b(k) + \|k\|\mu(k) \]
for all $h \in \mathbb{R}^p$, $k \in \mathbb{R}^q$ with ‖h‖, ‖k‖ sufficiently small, and
\[ \lim_{h \to 0} \eta(h) = \lim_{k \to 0} \mu(k) = 0. \]
Let $T(h, k) = \alpha\,df_a(h) + \beta\,dg_b(k)$. Then T is linear in (h, k) and
\[ \varepsilon(h, k) := F(a+h, b+k) - F(a, b) - T(h, k) = \alpha\|h\|\eta(h) + \beta\|k\|\mu(k). \]
Since $\|(h, k)\| = \sqrt{\|h\|^2 + \|k\|^2} \ge \|h\|, \|k\|$,
\[ \frac{\|\varepsilon(h, k)\|}{\|(h, k)\|} \le \frac{|\alpha|\|h\|\|\eta(h)\| + |\beta|\|k\|\|\mu(k)\|}{\|(h, k)\|} \le |\alpha|\|\eta(h)\| + |\beta|\|\mu(k)\|. \]
10. Part (a) follows from Exercise 8.5.14. For (b), set $g(t) = \|f(t) - v\|^2$.
Then $g'(t) = 2(f(t) - v) \cdot f'(t)$, and since $g(t_0)$ is the minimum value of
g, $g'(t_0) = 0$.
Section 9.3
1. $g'(\varphi(x)\psi(y))\,\big(\varphi'(x)\psi(y),\ \varphi(x)\psi'(y)\big)$.
3. $g_x(a \cdot x, b \cdot x)\,a + g_y(a \cdot x, b \cdot x)\,b$.
7. $T f'(x)$.
10. (a) Let g(t) = f(a + tu). By definition, $D_u f(a) = g'(0)$. On the other
hand, by the chain rule, $g'(t) = u \cdot \nabla f(a + tu)$. Setting t = 0 yields (a).
(c) $\lim_{t \to 0} \dfrac{f(tu) - f(0)}{t} = \lim_{t \to 0} \dfrac{ab^2}{a^2 + b^4 t^2}$ exists for all u = (a, b). f is not
continuous at (0, 0), since f → 0 along y = 0 but f = 1/2 along $y = \sqrt{x}$,
x > 0.
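The behavior in 10(c) can be illustrated numerically. The sketch below assumes $f(x, y) = xy^2/(x^2 + y^4)$ with f(0, 0) = 0, which is the form consistent with the difference quotient above but is not restated in this solution. The direction u = (1, 2) and the sample points are arbitrary choices.

```python
import math

def f(x, y):
    """Assumed form f(x, y) = x*y**2 / (x**2 + y**4), with f(0, 0) = 0."""
    if x == 0 and y == 0:
        return 0.0
    return x * y**2 / (x**2 + y**4)

def directional_quotient(a, b, t):
    """Difference quotient (f(t*u) - f(0)) / t for the direction u = (a, b)."""
    return (f(t * a, t * b) - f(0.0, 0.0)) / t

a, b = 1.0, 2.0
quotients = [directional_quotient(a, b, 10.0**-k) for k in range(1, 7)]
limit = b**2 / a  # value of a*b**2 / (a**2 + b**4 * t**2) at t = 0, for a != 0

# Along y = sqrt(x) the function is constantly 1/2, so f has no limit 0 at the origin.
along_curve = [f(x, math.sqrt(x)) for x in (1e-2, 1e-4, 1e-8)]
```

The quotients settle near `limit`, while the values along the curve stay near 1/2 however close the points are to the origin.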
12. Let $F(x) = \int_a^b f(t, x)\,dt$. By the mean value theorem,
\[ \frac{F(x+h) - F(x)}{h} = \int_a^b f_x(t, x + rh)\,dt, \quad \text{for some } r = r(t, x, h) \in (0, 1). \]
Since $f_x$ is uniformly continuous, $f_x(t, x + rh) \to f_x(t, x)$ uniformly in t
on [a, b] as h → 0. Therefore, $F'(x) = \int_a^b f_x(t, x)\,dt$.
15. Let $\varphi(t) = t^{-p} f(tx)$. By the product rule and the chain rule,
\[ \varphi'(t) = \frac{-p}{t^{p+1}} f(tx) + \frac{1}{t^p} \nabla f(tx) \cdot x. \]
If f is homogeneous of degree p, then ϕ is a constant function, hence
\[ \frac{p}{t^{p+1}} f(tx) = \frac{1}{t^p} \nabla f(tx) \cdot x. \]
Setting t = 1 produces the desired identity. On the other hand, if the
identity holds, then tx · ∇f(tx) = pf(tx) for all t and x, hence $\varphi'(t) = 0$.
Therefore, ϕ(t) = ϕ(1), which shows that f is homogeneous of degree p.
17. Fix y ∈ C and define g on U by g(x) = f (x) − f (y) − dfy (x). Then
g(x) − g(y) = f (x) − f (y) − dfy (x − y) and dgz = dfz − dfy (9.1.7),
hence the result follows from 9.3.6 applied to g.
Section 9.4
1. (a) {(x, y) : x ≠ y}.
(b) {(x, y) : x + y ≠ (2n + 1)π/2, n ∈ Z}.
(e) {(x, y) : xy ≠ 0}.
(f) {(x, y) : x, y > 0, y ≠ x}.
(i) {(x, y) : y ≠ ±x}.
(j) {(x, y, z) : xyz ≠ 0}.
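2. (i) $x = \tfrac12\left(u + \sqrt{u^2 - 4v}\right)$, $y = \tfrac12\left(u - \sqrt{u^2 + 4v}\right)$.
(v) $x = \tfrac{1}{\sqrt2}\left(u - \sqrt{u^2 - 4v^2}\right)^{1/2}$, $y = \tfrac{1}{\sqrt2}\left(u + \sqrt{u^2 - 4v^2}\right)^{1/2}$.
4. Set $u = x(x^2 + y^2)^{-1}$ and $v = y(x^2 + y^2)^{-1}$. Square and add. $J_f = -1$.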
Section 9.5
1. Let $f(x, y) = x + y^2 + e^{xy} - 1$. Then $f_x(0, 0) = 1$ and $f_y(0, 0) = 0$, so the
implicit function theorem guarantees a local solution x = x(y) but says
nothing about a solution y = y(x).
5. Let
\[ F(x, y, z) = \sin(x + z) + \ln(y + z) - \sqrt{2}/2 \quad\text{and}\quad G(x, y, z) = e^{xz} + \sin(\pi y + z) - 1. \]
Then, at (π/4, 1, 0), F = G = 0 and
\[ \frac{\partial(F, G)}{\partial(x, y)}\,\frac{\partial(F, G)}{\partial(y, z)}\,\frac{\partial(F, G)}{\partial(x, z)} \ne 0. \]
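The claims in Exercise 5 can be spot-checked numerically. In the sketch below the partial derivatives are worked out by hand from the formulas for F and G (they are not given in the solution), and the variable names are illustrative.

```python
import math

x, y, z = math.pi / 4, 1.0, 0.0

F = math.sin(x + z) + math.log(y + z) - math.sqrt(2) / 2
G = math.exp(x * z) + math.sin(math.pi * y + z) - 1

# Partial derivatives at the point, computed by hand from F and G.
Fx = math.cos(x + z)
Fy = 1 / (y + z)
Fz = math.cos(x + z) + 1 / (y + z)
Gx = z * math.exp(x * z)
Gy = math.pi * math.cos(math.pi * y + z)
Gz = x * math.exp(x * z) + math.cos(math.pi * y + z)

# The three 2x2 Jacobian determinants appearing in the product.
jac_xy = Fx * Gy - Fy * Gx
jac_yz = Fy * Gz - Fz * Gy
jac_xz = Fx * Gz - Fz * Gx
print(F, G, jac_xy, jac_yz, jac_xz)
```

All three determinants come out nonzero, consistent with the product being nonzero.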
8. Let $F = x - y + z + u^2 - 2$, $G = -x + 2z + u^3 - 2$, $H = -y + 3z + u^4 - 3$.
Then, at (1, 1, 1, 1), F = G = H = 0 and
\[ \frac{\partial(F, G, H)}{\partial(x, y, u)}\,\frac{\partial(F, G, H)}{\partial(y, z, u)}\,\frac{\partial(F, G, H)}{\partial(x, z, u)} \ne 0. \]
9. (b) Let $a := f_x(0, 0)$ and $b := f_y(0, 0)$. The condition is b(a + 1) ≠ 0. The
derivative is
\[ \frac{-f_x(x, y)\,f_x(f(x, y), y)}{f_y(x, y)\,f_x(f(x, y), y) + f_y(f(x, y), y)}. \]
11. (a) The condition is $a(a^3 - ab^2 - b^3) \ne 0$ where $a := f_x(0, 0)$, $b := f_y(0, 0)$.
13. $f'(1) + g'(1) + h'(1) \ne 0$.
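15. Let $y = F(x_1, \dots, x_n)$. If $x_1$ is a function of $x_2, \dots, x_n$, then, assuming
the necessary differentiability,
\[ 0 = \frac{\partial y}{\partial x_n} = F_{x_1} \frac{\partial x_1}{\partial x_n} + F_{x_n}, \quad\text{hence}\quad \frac{\partial x_1}{\partial x_n} = -\frac{F_{x_n}}{F_{x_1}}. \]
In this manner we obtain
\[ \frac{\partial x_2}{\partial x_1} \frac{\partial x_3}{\partial x_2} \cdots \frac{\partial x_n}{\partial x_{n-1}} \frac{\partial x_1}{\partial x_n} = (-1)^n \frac{F_{x_1}}{F_{x_2}} \frac{F_{x_2}}{F_{x_3}} \cdots \frac{F_{x_{n-1}}}{F_{x_n}} \frac{F_{x_n}}{F_{x_1}} = (-1)^n. \]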
Section 9.6
1. (b) $z_{rr} = t^2 z_{xx} + 2t z_{xy} + z_{yy}$, $z_{tt} = r^2 z_{xx} + 2r z_{xy} + z_{yy}$.
(e) $z_{rr} = (e^{2r}\sin^2 t) z_{xx} + (e^{2r}\cos^2 t) z_{yy} + (2e^{2r}\sin t\cos t) z_{xy} + (e^r \sin t) z_x + (e^r \cos t) z_y$,
$z_{tt} = (e^{2r}\cos^2 t) z_{xx} + (e^{2r}\sin^2 t) z_{yy} - (2e^{2r}\sin t\cos t) z_{xy} - (e^r \sin t) z_x - (e^r \cos t) z_y$.
(f) $z_r = ax z_x$, $z_t = by z_y$, $z_{rr} = a^2 x^2 z_{xx} + a^2 x z_x$, $z_{tt} = b^2 y^2 z_{yy} + b^2 y z_y$.
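4. $F_x + z_x F_z = 0$, hence
\[ 0 = F_{xx} + 2z_x F_{xz} + z_x^2 F_{zz} + z_{xx} F_z = F_{xx} - 2\frac{F_x}{F_z} F_{xz} + \frac{F_x^2}{F_z^2} F_{zz} + z_{xx} F_z \]
and so
\[ z_{xx} = -\frac{1}{F_z} F_{xx} + 2\frac{F_x}{F_z^2} F_{xz} - \frac{F_x^2}{F_z^3} F_{zz}. \]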
5. (a) $u_t = -k^2 u$, $u_{xx} = -u$.
(b) By logarithmic differentiation,
\[ u_t = \left(\frac{x^2}{4k^2 t^2} - \frac{1}{2t}\right) u, \qquad u_{xx} = \left(\frac{x^2}{4k^4 t^2} - \frac{2}{4k^2 t}\right) u. \]
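7. The second order partial derivatives are
\begin{align*}
w_{\rho\rho} &= (\sin\phi\cos\theta)^2 w_{xx} + (\sin\phi\sin\theta)^2 w_{yy} + (\cos\phi)^2 w_{zz} \\
&\quad + (2\sin\phi)\left[(\sin\phi\sin\theta\cos\theta) w_{xy} + (\cos\phi\cos\theta) w_{xz} + (\cos\phi\sin\theta) w_{yz}\right], \\
w_{\theta\theta} &= (\rho\sin\phi)^2\left[(\sin^2\theta) w_{xx} + (\cos^2\theta) w_{yy} - 2(\sin\theta\cos\theta) w_{xy}\right] \\
&\quad - (\rho\sin\phi)\left[(\cos\theta) w_x + (\sin\theta) w_y\right], \\
w_{\phi\phi} &= \rho^2\left[(\cos\phi\cos\theta)^2 w_{xx} + (\cos\phi\sin\theta)^2 w_{yy} + (\sin\phi)^2 w_{zz}\right] \\
&\quad + 2\rho^2\left[(\cos^2\phi\sin\theta\cos\theta) w_{xy} - (\cos\phi\sin\phi\cos\theta) w_{xz} - (\cos\phi\sin\phi\sin\theta) w_{yz}\right] \\
&\quad - \rho\left[(\sin\phi\cos\theta) w_x + (\sin\phi\sin\theta) w_y + (\cos\phi) w_z\right].
\end{align*}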
9. $f_{x_i} = p x_i \|x\|^{p-2} g'(\|x\|^p)$, hence
\[ f_{x_i x_i} = p\left(\|x\|^{p-2} + (p-2) x_i^2 \|x\|^{p-4}\right) g'(\|x\|^p) + p^2 x_i^2 \|x\|^{2(p-2)} g''(\|x\|^p), \]
\[ f_{x_i x_j} = p x_i x_j \left[(p-2)\|x\|^{p-4} g'(\|x\|^p) + p\|x\|^{2(p-2)} g''(\|x\|^p)\right] \quad (i \ne j). \]
Section 9.7
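1. (b) $\dfrac{\partial^3 f}{\partial x^3}(dx)^3 + 3\dfrac{\partial^3 f}{\partial x^2 \partial y}(dx)^2\,dy + 3\dfrac{\partial^3 f}{\partial x\,\partial y^2}\,dx\,(dy)^2 + \dfrac{\partial^3 f}{\partial y^3}(dy)^3$.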
2. (a) $2y^2(3x + y)(dx)^2 + 12xy(x + y)\,dx\,dy + 2x^2(x + 3y)(dy)^2$.
(b) $\dfrac{6}{x^4 y}(dx)^2 + \dfrac{4}{x^3 y^2}\,dx\,dy + \dfrac{2}{x^2 y^3}(dy)^2$.
(c) $-y^2\sin(xy)(dx)^2 + 2\left[\cos(xy) - xy\sin(xy)\right] dx\,dy - x^2\sin(xy)(dy)^2$.
(d) $2f(x, y)\left[(2x^2 + 1)(dx)^2 + 4xy\,dx\,dy + (2y^2 + 1)(dy)^2\right]$.
(e) $\dfrac{1}{(x^2 + y)^2}\left[2(y - x^2)(dx)^2 - 4x\,dx\,dy - (dy)^2\right]$.
3. zero.
5. (a) $f + h_1\dfrac{\partial f}{\partial x_1} + h_2\dfrac{\partial f}{\partial x_2} + h_3\dfrac{\partial f}{\partial x_3}$. The terms are evaluated at a.
8. By induction,
\[ \frac{\partial^p f(x)}{\partial x_1^{p_1} \partial x_2^{p_2} \cdots \partial x_n^{p_n}} = b_1^{p_1} b_2^{p_2} \cdots b_n^{p_n} \varphi^{(p)}(b \cdot x), \]
hence
\[ (x \cdot \nabla)^p f(0) = \varphi^{(p)}(0) \sum \binom{p}{p_1, p_2, \dots, p_n} (b_1 x_1)^{p_1} (b_2 x_2)^{p_2} \cdots (b_n x_n)^{p_n} = \varphi^{(p)}(0)(b \cdot x)^p, \]
where the second equality follows from the multinomial theorem.
11. (a) $x + y - \tfrac16(x + y)^3$.
(d) $x + y - \tfrac13(x + y)^3$.
Section 9.8
2. $x^2 + 2y^2 + 3z^2 - xy - yz - xz = \tfrac12\left[(x - y)^2 + (y - z)^2 + (x - z)^2\right] + y^2 + 2z^2 \ge 0$.
3. (a) (0, 0): local min; (−4/3, 4/3): saddle.
(d) (1, 1), (−1, −1): local max; (0, 0): saddle.
(f) (2, −2): saddle.
(i) (1/3, 1/3): local max; (0, 0), (0, 1), (1, 0): saddle.
4. (a) Use polar coordinates to optimize the resulting single variable function
g(θ) = cos θ + sin θ, $g'(\theta) = -\sin\theta + \cos\theta$, 0 ≤ θ ≤ 2π. The critical points
of g occur at values of θ that satisfy $\sin\theta = \cos\theta = \pm\sqrt2/2$. At these
values, $g(\theta) = \pm\sqrt2$. Also, g(0) = g(2π) = 1. Therefore, the maximum
and minimum values of f are $\pm\sqrt2$.
6. (b) The only critical point is (2/3, −1/3). On bd(D), $f = x^2 - x + 2$,
−1 ≤ x ≤ 1, which has critical point $(\pm1/\sqrt2, \pm1/\sqrt2)$. Checking the
values of f at these points and at (±1, 0) shows that the maximum of f is
$f(-1, 0) = f(-1/\sqrt2, -1/\sqrt2) = 2$ and the minimum is f(2/3, −1/3) =
−1/3.
(d) The only critical point is (0, 0). On bd(D), $f = \pm\sin\left(x\sqrt{1 - x^2}\right)$,
−1 ≤ x ≤ 1, which has critical points $(\pm1/\sqrt2, \pm1/\sqrt2)$. Checking the
values of f at these points and at (±1, 0) shows that the extreme values
of f are ± sin(1/2).
10. Since
\[ \lim_{(x,y) \to (0^+, 0^+)} f(x, y) = \lim_{(x,y) \to (+\infty, +\infty)} f(x, y) = +\infty, \]
f has a minimum on (0, c) × (0, d) for suitable c, d > 0, and the
minimum must occur at a critical point. The unique critical point is
$(a^{2/3} b^{-1/3}, a^{-1/3} b^{2/3})$, which gives the minimum $3(ab)^{1/3}$.
11. Let $f(m, b) = \sum_{i=1}^n (y_i - m x_i - b)^2$. Since not all x coordinates are
the same, m must be bounded. Since the data is bounded, b must be
bounded. Therefore, the minimum exists and must occur at the unique
critical point (m, b) of f, which is determined by the system
\[ \sum_{i=1}^n (y_i - m x_i - b)(-x_i) = \sum_{i=1}^n (y_i - m x_i - b)(-1) = 0. \]
It follows that $x \cdot y - m\|x\|^2 - nb\bar{x} = m\bar{x} - \bar{y} + b = 0$, where $\bar{x}$ and $\bar{y}$ denote the averages of the $x_i$ and $y_i$.
15. Let $f(x, y) = ax^2 + 2bxy + y^2$ and $g(x, y) = x^2 + y^2 - c^2$. The equation
∇f = λ∇g yields ax + by = λx and bx + y = λy. Multiplying the first
equation by x and the second by y and then adding yields
\[ f(x, y) = \lambda(x^2 + y^2) = \lambda c^2. \]
Since the system (a − λ)x + by = bx + (1 − λ)y = 0 has a nontrivial
solution iff the determinant of the coefficient matrix is zero, we obtain
$\lambda^2 - (a + 1)\lambda + a - b^2 = 0$. Solving for λ we see that the maximum and
minimum values of f on the circle are
\[ \lambda c^2 = \left(a + 1 \pm \sqrt{(a + 1)^2 - 4(a - b^2)}\right)(c^2/2). \]
17. We minimize $f(x, y, z) := (x - 1)^2 + (y - 2)^2 + (z - 3)^2$ subject to the
constraint $g(x, y, z) := x^2 + y^2 - z = 0$. From ∇f = λ∇g we have
x − 1 = λx, y − 2 = λy, z − 3 = −λ/2,
from which it follows that y = 2x and z = 3 − (x − 1)/(2x). From $z = x^2 + y^2$
we then have $3 - (x - 1)/(2x) = 5x^2$, or $10x^3 - 5x - 1 = 0$.
19. We minimize $f(x, y, z) := (x - 1)^2 + (y - 2)^2 + (z - 3)^2$ subject to the
constraint $g(x, y, z) := z^2 - x^2 - y^2 - 1 = 0$. From ∇f = λ∇g we have
\[ x = \frac{1}{1 + \lambda}, \quad y = \frac{2}{1 + \lambda}, \quad z = \frac{3}{1 - \lambda}, \]
hence y = 2x and z = 3x/(2x − 1). Substituting into $z^2 - x^2 - y^2 = 1$
yields the desired polynomial.
22. Let f(x, y, z) = x + 2y + 3z, $g_1(x, y, z) = x + y + z - 1$, and $g_2(x, y, z) = x^2 + y^2 + z^2 - 1$. From $\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$,
\[ 1 = \lambda_1 + 2\lambda_2 x, \quad 2 = \lambda_1 + 2\lambda_2 y, \quad 3 = \lambda_1 + 2\lambda_2 z. \]
Subtracting yields $1 = 2\lambda_2(y - x) = 2\lambda_2(z - y)$ so y − x = z − y or
x − 2y + z = 0. Combining this with the constraint x + y + z = 1 yields
y = 1/3 and z = 2/3 − x. From the constraint $x^2 + y^2 + z^2 = 1$ we
obtain $x^2 - 2x/3 - 2/9 = 0$ so $x = (1 \pm \sqrt3)/3$. The maximum value of
f (≈ 3.154701) occurs when $x = (1 - \sqrt3)/3$, the minimum (≈ 0.845299)
when $x = (1 + \sqrt3)/3$.
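The two candidate points in Exercise 22 can be checked directly. The sketch below is only a numerical sanity check (the names are illustrative). It substitutes $x = (1 \pm \sqrt3)/3$, y = 1/3, z = 2/3 − x into both constraints and into f.

```python
import math

def f(x, y, z):
    return x + 2 * y + 3 * z

candidates = []
for sign in (+1, -1):
    x = (1 + sign * math.sqrt(3)) / 3
    y = 1 / 3
    z = 2 / 3 - x
    # Both constraints should hold up to rounding error.
    assert abs(x + y + z - 1) < 1e-12
    assert abs(x * x + y * y + z * z - 1) < 1e-12
    candidates.append(f(x, y, z))

f_max, f_min = max(candidates), min(candidates)
print(f_max, f_min)
```

On these points f reduces to 8/3 − 2x, so the two extreme values are $2 \pm 2/\sqrt3$.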
24. We minimize $f(x) := \sum_{i=1}^n (x_i - b_i)^2$ subject to the constraint g(x) :=
a · x − c = 0. From ∇f = λ∇g we have $(x_j - b_j) = \lambda a_j/2$, hence
\[ (x_j - b_j)^2 = \frac{\lambda^2 a_j^2}{4} = \frac{\lambda(a_j x_j - a_j b_j)}{2}, \quad 1 \le j \le n. \]
Adding and using the constraint,
\[ f(x) = \frac{\lambda^2}{4}\|a\|^2 \quad\text{and}\quad f(x) = \frac{\lambda}{2}(a \cdot x - a \cdot b) = \frac{\lambda}{2}(c - a \cdot b). \]
Therefore, $\lambda = 2(c - a \cdot b)\|a\|^{-2}$, which gives the desired conclusion.
26. Let $f(x) = \|x - a\|^2$ and $g(x) = \|x\|^2 - 1$. The equation ∇f = λ∇g leads
to the system $x_j - a_j = \lambda x_j$, or $x_j(1 - \lambda) = a_j$, j = 1, . . . , n. Therefore,
$x_j = a_j/(1 - \lambda)$, so by the constraint $\|a\|^2 = \sum_{j=1}^n a_j^2 = (1 - \lambda)^2$, hence
x = ±a/‖a‖. The distance to the sphere is then the smaller of
\[ \left\| \pm a\|a\|^{-1} - a \right\| = \left| 1 \pm \|a\|^{-1} \right| \|a\|, \]
namely $\left| 1 - \|a\|^{-1} \right| \|a\|$.
27. (a) Let f(x) = a · x and $g(x) = \sum_{i=1}^n b_i/x_i - 1$. From ∇f = λ∇g we
have $a_i = -\lambda b_i/x_i^2$, hence
\[ x_i = \mu\sqrt{b_i/a_i} \quad\text{and}\quad b_i x_i^{-1} = \sqrt{a_i b_i}/\mu, \qquad \mu := \sqrt{-\lambda}. \]
The constraint implies that $\mu = \sum_{i=1}^n \sqrt{a_i b_i}$. Since $a_i x_i = \mu\sqrt{a_i b_i}$, the
minimum is $\left(\sum_{i=1}^n \sqrt{a_i b_i}\right)^2$.
That the value is indeed the minimum may be argued as follows. If x
is any point satisfying the constraint, then
\[ f(x) = a_1 x_1 + a_2 x_2 + \cdots + a_{n-1} x_{n-1} + \frac{a_n b_n}{1 - \sum_{i=1}^{n-1} b_i/x_i}, \]
where $x_i > b_i$. Thus |f| → +∞ as the variables $x_1, x_2, \dots, x_{n-1}$ become
large or as $\sum_{i=1}^{n-1} b_i/x_i$ nears 1. Therefore, the minimum occurs in the
interior of a compact set, hence at the point obtained above.
30. Since cl(U ) is compact, there exist points u, v ∈ cl(U ) such that
f (u) ≤ f (x) ≤ f (v) for all x ∈ cl(U ).
If f (u) = f (v), then f is a constant function and the result follows. If
f (u) < f (v), then one of the points, say u, must lie in U . By 9.8.2,
$f'(u) = 0$.
Section 10.1
1. Let µ be as in 10.1.5 with pk = 1/k, or let µ be as in ??, and take
Ak = {k, k + 1, . . .}.
3. By the inclusion-exclusion principle and additivity,
µ(A ∪ B) = µ(A) + µ(B) − µ(A ∩ B) = µ(A), and
µ(A) = µ(A \ B) + µ(A ∩ B) = µ(A \ B).
5. Let $B = A_1 \cup \dots \cup A_n$. By 10.1.6(c),
\[ \mu(A_1 \cup \dots \cup A_{n+1}) = \mu(B \cup A_{n+1}) = \mu(B) + \mu(A_{n+1}) - \mu(B \cap A_{n+1}). \]
By the induction hypothesis,
\[ \mu(B) + \mu(A_{n+1}) = \sum_{i=1}^{n+1} \mu(A_i) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j) + \dots + (-1)^{n-1} \mu(A_1 \cap \dots \cap A_n) \]
and
\[ \mu(B \cap A_{n+1}) = \sum_{i=1}^{n} \mu(A_i \cap A_{n+1}) - \sum_{1 \le i < j \le n} \mu(A_i \cap A_j \cap A_{n+1}) + \dots + (-1)^{n-1} \mu(A_1 \cap \dots \cap A_n \cap A_{n+1}). \]
Subtracting produces the desired formula.
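The inclusion-exclusion formula proved in Exercise 5 can be tested on a small example. The sketch below uses counting measure on finite sets as an illustrative special case. The sets and the function name are arbitrary choices, not taken from the text.

```python
from itertools import combinations

def inclusion_exclusion(sets):
    """Alternating sum of the sizes of all k-fold intersections, k = 1..n."""
    total = 0
    for k in range(1, len(sets) + 1):
        sign = (-1) ** (k - 1)
        for group in combinations(sets, k):
            total += sign * len(set.intersection(*group))
    return total

sets = [{1, 2, 3, 4}, {3, 4, 5}, {1, 5, 6, 7}, {2, 4, 6, 8}]
union_size = len(set.union(*sets))
print(inclusion_exclusion(sets), union_size)
```

For counting measure, µ(A) is the number of elements of A, so the alternating sum should equal the size of the union.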
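7. For any finite subset S of N with at least m members, define
\[ E_S = \bigcap_{j \in S} E_j \cap \bigcap_{j \in S^c} E_j^c. \]
There are countably many such sets, they are pairwise disjoint, and
\[ \bigcup_{k=1}^{\infty} E_k \supseteq \bigcup_S E_S = A. \]
Moreover, $E_k \cap E_S \ne \emptyset$ iff k ∈ S, in which case $E_k \supseteq E_S$. Therefore, by
additivity,
\[ \sum_{k=1}^{\infty} \mu(E_k \cap A) = \sum_{k=1}^{\infty} \sum_S \mu(E_k \cap E_S) = \sum_S \sum_{k=1}^{\infty} \mu(E_S \cap E_k) = \sum_S \sum_{k \in S} \mu(E_S \cap E_k) \ge \sum_S m\,\mu(E_S) = m\,\mu(A). \]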
Section 10.2
1. Clearly, $\lambda^*(A) \le \alpha(A)$. To show the reverse inequality, let $\{I_k\}$ be a
sequence in I that covers A. Given ε > 0 choose $J_k \in J$ containing
$I_k$ such that $|J_k| < |I_k| + \varepsilon/2^k$. Then $\{J_k\}$ covers A and $\sum_k |I_k| \ge \sum_k |J_k| - \varepsilon \ge \alpha(A) - \varepsilon$.
Since ε was arbitrary, $\sum_k |I_k| \ge \alpha(A)$. Therefore,
$\lambda^*(A) \ge \alpha(A)$.
4. Let $\{I_k\}$ be any sequence of intervals that covers A. Then the intervals
$x + I_k$ cover x + A, hence
\[ \lambda^*(x + A) \le \sum_{k=1}^{\infty} |x + I_k| = \sum_{k=1}^{\infty} |I_k|. \]
Since $\{I_k\}$ was arbitrary, $\lambda^*(x + A) \le \lambda^*(A)$. Since A = (A + x) − x,
the reverse inequality also holds.
Section 10.3
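2. For any $C \subseteq \mathbb{R}^n$,
\[ C \cap (-E) = -[(-C) \cap E] \quad\text{and}\quad C \cap (-E)^c = -[(-C) \cap E^c], \]
hence, using D := −C as a test set for E,
\[ \lambda^*\big(C \cap (-E)\big) + \lambda^*\big(C \cap (-E)^c\big) = \lambda^*(D \cap E) + \lambda^*(D \cap E^c) = \lambda^*(D). \]
By Exercise 10.2.5, $\lambda^*(D) = \lambda^*(C)$. Therefore, −E ∈ M and λ(−E) =
λ(E).
4. Let $D = \bigcup_{n=1}^{\infty} (r_n - \varepsilon/2^{n+1}, r_n + \varepsilon/2^{n+1})$, where $\{r_1, r_2, \dots\}$ is an enumeration of Q.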
Section 10.4
1. Let $\{r_1, r_2, \dots\}$ be an enumeration of the rationals in [0, 1] and set
$A = \bigcup_{k=1}^{\infty} (r_k - \varepsilon/2^{k+1}, r_k + \varepsilon/2^{k+1})$ and $C = [0, 1] \cap A^c$. Then C is
compact and λ(A) < ε, hence
λ(E \ C) = λ([0, 1] \ C) = λ([0, 1] ∩ A) < ε.
4. Let F denote the collection of all Borel sets B such that rB is a Borel
set. Then F is a σ-field containing all intervals, hence F = B. A similar
argument works for the other sets.
Section 10.5
1. For t ∈ R,
\[ \{x : f(x) < t\} = \bigcup_{k=1}^{\infty} \{x \in A_k : f(x) < t\} = \bigcup_{k=1}^{\infty} \{x : (f 1_{A_k})(x) < t\} \cap A_k \in F. \]
5. (a) $\{x : g(x) < h(x)\} = \bigcup_{r \in \mathbb{Q}} \{x : g(x) < r < h(x)\} \in F$.
(d) Since gh is measurable the assertion follows from
{x ∈ S : g(x)h(x) ≤ 1} ∩ {x ∈ S : g(x)h(x) ≥ 1} .
8. If $1_E$ is measurable, then $E = \{x : 1_E(x) > 0\} \in F$. Conversely, if
E ∈ F and t ∈ R, then
\[ \{x : 1_E(x) \le t\} = \begin{cases} \emptyset & \text{if } t < 0, \\ E^c & \text{if } 0 \le t < 1, \text{ and} \\ S & \text{if } t \ge 1. \end{cases} \]
In each case, $\{x : 1_E(x) \le t\} \in F$, hence $1_E$ is measurable.
10. $1_{A \Delta B}(x) = 1$ iff $1_A(x) - 1_B(x) = 1$ or $1_B(x) - 1_A(x) = 1$ iff x ∈ A \ B
or x ∈ B \ A.
14. The range of f is {1/k : k ∈ N}. Since f (x) = 1/k iff 1/x − 1 < k ≤ 1/x
iff 1/(k + 1) < x ≤ 1/k, the assertion follows from Exercise 7.
17. Let ε > 0 and choose N ∈ N such that $2^N > 1/\varepsilon$ and f ≤ N on S. Let
k > N, so 0 ≤ f ≤ k. Then, in the notation of the proof of 10.5.8,
\[ f_k = \sum_{j=1}^{k 2^k} \frac{j - 1}{2^k}\, 1_{A_{k,j}}, \quad\text{where}\quad A_{k,j} = \left\{x \in S : (j - 1) 2^{-k} \le f(x) < j 2^{-k}\right\}, \quad j = 1, 2, \dots, k 2^k. \]
For any x ∈ S there exists $j \in \{1, 2, \dots, k 2^k\}$ such that $x \in A_{k,j}$, hence
\[ 0 \le f(x) - f_k(x) = f(x) - (j - 1) 2^{-k} \le 1/2^k < \varepsilon. \]
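The dyadic approximation in Exercise 17 can be tried on a concrete bounded function. The sketch below is illustrative only: the sample function `f`, the grid on [0, 1], and the helper `dyadic_approx` are assumptions made for this example, not part of the text.

```python
import math

def dyadic_approx(value, k):
    """Value of f_k at a point where f equals `value`, assuming 0 <= value < k."""
    j = math.floor(value * 2**k) + 1   # index with (j - 1)/2**k <= value < j/2**k
    return (j - 1) / 2**k

def f(x):
    # A sample nonnegative function on [0, 1], bounded above by 1.75 < 2.
    return 3 * x * (1 - x) + math.sin(5 * x) ** 2

k = 10
errors = [f(i / 1000) - dyadic_approx(f(i / 1000), k) for i in range(1001)]
print(max(errors), 2**-k)
```

Every error lies in $[0, 2^{-k})$, matching the uniform estimate in the solution.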
19. (a) That F is a σ-field follows from properties of preimages.
(b) Since $f^{-1}(I_1 \times \dots \times I_m) = \bigcap_{j=1}^m f_j^{-1}(I_j)$, F contains all intervals,
hence, by minimality, $F = B(\mathbb{R}^m)$.
(c) If A ∈ B(R) and $B := F^{-1}(A)$, then $B \in B(\mathbb{R}^m)$, hence, from (b),
$g^{-1}(A) = f^{-1}(B) \in B(\mathbb{R}^n)$.
Section 11.2
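3. Since $f^{-1}\{d^2\} = \bigcup_{k=1}^{\infty} \left[d/10^k, (d + 1)/10^k\right) \cap I$,
\[ \int_{[0,1]} f\,d\lambda = \sum_{d=1}^{9} d^2\, \lambda\left(f^{-1}\{d^2\}\right) = \frac19 \sum_{d=1}^{9} d^2. \]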
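The factor 1/9 in Exercise 3 comes from summing the interval lengths $10^{-k}$. The sketch below does this in exact rational arithmetic. The helper name is illustrative, and the code assumes that each level set has the same measure as the union of the intervals.

```python
from fractions import Fraction

def level_set_measure(terms):
    """Partial sum of the lengths 10**-k, k = 1..terms, of the intervals for one digit d."""
    return sum(Fraction(1, 10**k) for k in range(1, terms + 1))

partial = level_set_measure(30)          # differs from 1/9 by exactly 1/(9 * 10**30)
integral = Fraction(1, 9) * sum(d * d for d in range(1, 10))
print(partial, integral)
```

With $\sum_{d=1}^{9} d^2 = 285$, the integral evaluates to 285/9 = 95/3.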
9
d=1
5. (a) If $\int_E |g|\,d\lambda = 0$, then g = 0 a.e., hence both integrals in (a) are
zero. Suppose $\int_E |g|\,d\lambda \ne 0$. Since $m\int_E |g| \le \int_E f|g| \le M\int_E |g|$ on E,
$a := \left(\int_E |g|\,d\lambda\right)^{-1} \int_E f|g|\,d\lambda$ satisfies the requirement.
(b) For example, take E = (−1, 1) and $f = g = 1_{(-1,0)} - 1_{(0,1)}$, so
\[ \int_E fg = \int_E \left(1_{(-1,0)} + 1_{(0,1)}\right) = 2, \qquad \int_E g = 0. \]
(c) Given ε > 0, choose δ > 0 such that −ε < f(t) − f(x) < ε for all
t ∈ (x − δ, x + δ). If y ∈ (x, x + δ), then, by (a),
\[ \int_{[a,y]} f\,d\lambda - \int_{[a,x]} f\,d\lambda - f(x)(y - x) = \int [f(t) - f(x)]\,1_{[x,y]}(t)\,dt = a_y \int 1_{[x,y]}\,d\lambda = a_y(y - x), \]
where $|a_y| \le \varepsilon$. Dividing by y − x proves the assertion for right-hand
limits.
7. Use 11.2.17 and the Stone–Weierstrass theorem.
9. By the approximation property of integrals, given ε > 0, there exists a
simple function g such that $\|f - g\|_1 < \varepsilon$ and g = 0 outside an interval
[a, b]. If k > b, then $\int_{[k,k+1]} g\,d\lambda = 0$, hence
\[ \left| \int_{[k,k+1]} f\,d\lambda \right| = \left| \int_{[k,k+1]} (f - g)\,d\lambda \right| \le \int_{[k,k+1]} |f - g|\,d\lambda < \varepsilon. \]
11. Let A = {x ∈ I : |f(x)| > ε}. Then
\[ \int_I f^2\,d\lambda \ge \int_A f^2\,d\lambda \ge \varepsilon^2 \lambda(A). \]
15. By linearity of the integral, $\int_0^1 P(x) f(x)\,dx = 0$ for every polynomial
P(x) whose terms have even powers. Let g : [0, 1] → R be continuous. Since $x^2$
separates points of [0, 1], by the Stone–Weierstrass theorem there exists
a sequence of such polynomials $P_n$ such that $\|g - P_n\|_\infty \to 0$. Then
\[ \left| \int_0^1 fg \right| = \left| \int_0^1 f(g - P_n) \right| \le \|g - P_n\|_\infty \int_0^1 |f| \to 0, \]
so $\int_0^1 gf = 0$. Since the continuous functions are dense in $L^1[0, 1]$, there
exists a sequence of continuous functions $g_n$ such that $\int_0^1 |f - g_n| \to 0$.
Then
\[ \int_0^1 f^2 = \int_0^1 f(f - g_n) \le \|f\|_\infty \int_0^1 |f - g_n| \to 0. \]
Therefore, $\int_0^1 f^2 = 0$, hence f = 0 a.e.
Section 11.3
1. In each case the integrand $f_k \to 0$. Moreover, in (a) the functions
are bounded, and in (b) $|f_k(x)| \le 1/x^2$ on [1, +∞), and
$|f_k(x)| \le k x^{3/2}/(1 + k^2 x^2) \le 1/\sqrt{x}$ on [0, 1]. Thus in each case the dominated
convergence theorem may be used.
2. (b) $k \ln(1 + f/k^2) \le f/k$.
(c) $k \sin(f/k) = f\,\dfrac{\sin(f/k)}{f/k} \to f$ and $k \sin(f/k) \le f$.
In each case apply the dominated convergence theorem.
3. $g(1 + f/k)^n e^{-f} \uparrow g$.
5. |f (x) sin(tx)/x| ≤ |tf (x)|, hence is integrable in x for each t. If tk → t,
then {tk } is bounded hence |tk f (x)| ≤ C|f (x)|. Now apply the dominated
convergence theorem.
8. Assume first that f ≥ 0. Since $\mathbb{R}^n = \bigcup_{k=1}^{\infty} A_k$, 11.3.4 implies that
\[ \int f\,d\lambda = \sum_{k=1}^{\infty} \int_{A_k} f\,d\lambda = \sum_{k=1}^{\infty} a_k \lambda(A_k). \]
The general result follows by considering $f^{\pm}$.
12. By 11.2.6, the series $\sum_{k=1}^{\infty} |f_k(x)|$ converges a.e.
15. By Fatou's lemma, $\int \liminf (f_k - g)\,d\lambda \le \liminf \int (f_k - g)\,d\lambda$.
19. Let $\mu^+(E) = \int_E f^+\,d\lambda$ and $\mu^-(E) = \int_E f^-\,d\lambda$, $E \in M(\mathbb{R}^n)$. Then $\mu^{\pm}$
are measures that agree on compact intervals. Since every open interval
is an increasing union of compact intervals, by continuity from below,
the measures $\mu^{\pm}$ agree on open intervals. Since every open set is a
countable disjoint union of open intervals, they agree on open sets. By
10.4.4, for any bounded E ∈ M there exists a decreasing sequence of
bounded open sets $U_k \supseteq E$ such that $\lambda(U_k) \to \lambda(E)$. By Exercise 11.2.8,
$\mu^{\pm}(U_k) \to \mu^{\pm}(E)$. Therefore, $\mu^{\pm}$ agree on bounded Lebesgue measurable
sets. In particular, if $A_k = \{x \in [-k, k] : f(x) > 0\}$, then $\int_{A_k} f\,d\lambda = 0$.
Since $1_{A_k} f \ge 0$, it follows that $1_{A_k} f = 0$ a.e., hence $\lambda(A_k) = 0$.
Since $A_k \uparrow \{x \in \mathbb{R} : f(x) > 0\}$, $\lambda\{x \in \mathbb{R} : f(x) > 0\} = 0$. A similar
argument shows that $\lambda\{x \in \mathbb{R} : f(x) < 0\} = 0$.
Section 11.5
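1. The given set is the union of the sets
\[ E_{j,k} = [-k, k] \times \dots \times \big([-k, k] \cap \mathbb{Q}\big) \times \dots \times [-k, k], \quad k \in \mathbb{N},\ 1 \le j \le n, \]
with the factor [−k, k] ∩ Q in the j-th position, over all k and j, and $\lambda(E_{j,k}) = (2k)^{n-1} \lambda(\mathbb{Q} \cap [-k, k]) = 0$.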
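2. (a) By the Fubini–Tonelli theorem and a substitution,
\begin{align*}
\int_I f\,d\lambda &= \int_0^{\infty} \cdots \int_0^{\infty} x_1 \cdots x_n e^{-\|x\|^2}\,dx_1 \cdots dx_n \\
&= \left( \int_0^{\infty} \cdots \int_0^{\infty} x_2 \cdots x_n e^{-(x_2^2 + \cdots + x_n^2)}\,dx_2 \cdots dx_n \right) \int_0^{\infty} x_1 e^{-x_1^2}\,dx_1 \\
&= \frac12 \int_0^{\infty} \cdots \int_0^{\infty} x_2 \cdots x_n e^{-(x_2^2 + \cdots + x_n^2)}\,dx_2 \cdots dx_n \\
&= \cdots = \frac{1}{2^n}.
\end{align*}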
5. By the Fubini–Tonelli theorem, the integral, denote it by I, equals
\[ \int_{0 \le x \le x_1 \le \dots \le x_{m-1} \le 1} \int_{x_{m-1}}^{1} x\,dx_m\,d\lambda(x, x_1, x_2, \dots, x_{m-1}). \]
Performing the inner integration, we have
\[ I = \int_{0 \le x \le x_1 \le \dots \le x_{m-1} \le 1} (1 - x_{m-1})\,x\,d\lambda(x, x_1, x_2, \dots, x_{m-1}). \]
Integrating with respect to $x_{m-1}$,
\[ I = \frac12 \int_{0 \le x \le x_1 \le \dots \le x_{m-2} \le 1} (1 - x_{m-2})^2\,x\,d\lambda(x, x_1, x_2, \dots, x_{m-2}). \]
Continuing we obtain
\[ I = \frac{1}{m!} \int_0^1 (1 - x)^m x\,dx = \frac{1}{(m + 2)!}, \]
the last equality by 5.3.4.
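The final identity in Exercise 5 can be confirmed in exact arithmetic for small m. The sketch below expands $(1 - x)^m$ by the binomial theorem and integrates term by term. The function name is illustrative, and the check covers only the listed values of m.

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(m):
    """Exact integral of (1 - x)**m * x over [0, 1], via the binomial expansion."""
    return sum(Fraction((-1) ** j * comb(m, j), j + 2) for j in range(m + 1))

# I = (1/m!) * integral, which should equal 1/(m + 2)!.
checks = {m: beta_integral(m) / factorial(m) for m in range(1, 8)}
for m, value in checks.items():
    print(m, value)
```

For m = 1 this gives (1/2 − 1/3)/1! = 1/6 = 1/3!, in agreement with the formula.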
7. The function
\[ F(t, x) := t^{-p} f(t)\,1_{[x^{1/p}, 1]}(t) = t^{-p} f(t)\,1_{(0, t^p]}(x) \]
is Borel measurable on (0, 1) × (0, 1) and is integrable since
\begin{align*}
\int_{(0,1)} \int_{(0,1)} |F(t, x)|\,d\lambda(t, x) &= \int_{(0,1)} t^{-p} |f(t)| \int_{(0,1)} 1_{(0, t^p]}(x)\,dx\,dt \\
&= \int_{(0,1)} t^{-p} |f(t)|\,t^p\,dt \\
&= \int_{(0,1)} |f|\,d\lambda < +\infty.
\end{align*}
Repeating the calculation with F(t, x) instead of |F(t, x)|,
\[ \int_{(0,1)} g(x)\,dx = \int_{(0,1)} \int_{(0,1)} F(t, x)\,d\lambda(t, x) = \int_{(0,1)} f\,d\lambda. \]
9. (a) Let $I_c(t) = \int_0^c e^{-xt} \sin x\,dx$. Using the given identity, we have
\[ \int_0^c \frac{\sin x}{x}\,dx = \int_0^c \int_0^{\infty} e^{-xt} \sin x\,dt\,dx = \int_0^{\infty} I_c(t)\,dt, \tag{†} \]
the second equality by the Fubini–Tonelli theorem, valid since
\[ \int_0^c \int_0^{\infty} e^{-xt} |\sin x|\,dt\,dx = \int_0^c \frac{|\sin x|}{x}\,dx < +\infty. \]
Integrating by parts twice yields
\[ I_c(t) = \frac{1 - e^{-ct}(\cos c + t \sin c)}{1 + t^2} \]
so $\lim_{c \to +\infty} I_c(t) = (1 + t^2)^{-1}$. Moreover, if c ≥ 1 then
\[ |I_c(t)| \le \frac{1 + e^{-ct}(1 + t)}{1 + t^2} \le \frac{2}{1 + t^2}. \]
Since the last function is integrable on [0, +∞), (†) and Lebesgue's
dominated convergence theorem imply that
\[ \int_0^{\infty} \frac{\sin x}{x}\,dx = \lim_{c \to +\infty} \int_0^{\infty} I_c(t)\,dt = \int_0^{\infty} \frac{1}{1 + t^2}\,dt = \frac{\pi}{2}. \]
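The closed form for $I_c(t)$ obtained by integrating by parts in Exercise 9(a) can be compared with a direct numerical integration. This is only a sketch: the midpoint rule, the step count, and the sample pairs (c, t) are illustrative choices.

```python
import math

def I_closed(c, t):
    """Closed form (1 - exp(-c*t) * (cos c + t sin c)) / (1 + t**2)."""
    return (1 - math.exp(-c * t) * (math.cos(c) + t * math.sin(c))) / (1 + t * t)

def I_numeric(c, t, steps=20_000):
    """Composite midpoint rule for the integral of exp(-x*t) * sin(x) over [0, c]."""
    h = c / steps
    return h * sum(
        math.exp(-(i + 0.5) * h * t) * math.sin((i + 0.5) * h) for i in range(steps)
    )

pairs = [(1.0, 0.5), (3.0, 2.0), (10.0, 0.1)]
gaps = [abs(I_closed(c, t) - I_numeric(c, t)) for c, t in pairs]
print(gaps)
```

For large c and fixed t > 0 the closed form is numerically indistinguishable from $1/(1 + t^2)$, the pointwise limit used in the dominated convergence step.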
11. (a) By translation invariance and symmetry,
\[ f * g(x) = \int_{\mathbb{R}^n} f(x - y) g(y)\,dy = \int_{\mathbb{R}^n} f(-y) g(x + y)\,dy = \int_{\mathbb{R}^n} f(y) g(x - y)\,dy = g * f(x). \]
14. Let α = G(b) − G(a) and β = F(b) − F(a). By the Fubini–Tonelli
theorem,
\begin{align*}
\int_{[a,b]} F(x) g(x)\,dx - \alpha F(a) &= \int_{[a,b]} [F(x) - F(a)]\,g(x)\,dx \\
&= \int_{[a,b]} \int_{[a,b]} 1_{[a,x]}(t) f(t) g(x)\,dt\,dx \\
&= \int_{[a,b]} \int_{[a,b]} 1_{[t,b]}(x) f(t) g(x)\,dx\,dt,
\end{align*}
and by a notation switch,
\begin{align*}
\int_{[a,b]} G(x) f(x)\,dx - \beta G(a) &= \int_{[a,b]} [G(x) - G(a)]\,f(x)\,dx \\
&= \int_{[a,b]} \int_{[a,b]} 1_{[a,x]}(t) g(t) f(x)\,dt\,dx \\
&= \int_{[a,b]} \int_{[a,b]} 1_{[a,t]}(x) g(x) f(t)\,dx\,dt.
\end{align*}
Adding, we have
\begin{align*}
\int_{[a,b]} F(x) g(x)\,dx - \alpha F(a) + \int_{[a,b]} G(x) f(x)\,dx - \beta G(a)
&= \int_{[a,b]} \int_{[a,b]} \left[1_{[a,t]}(x) + 1_{[t,b]}(x)\right] f(t) g(x)\,dx\,dt \\
&= \int_{[a,b]} \int_{[a,b]} f(t) g(x)\,dx\,dt = \alpha\beta,
\end{align*}
which implies the desired conclusion.
Section 11.6
3. Take f (x) = h(||x||) in (11.16).
5. Let $x = (x_1, \dots, x_n)$. The hole H may be described by
\[ H = \left\{ (x, x_{n+1}) : -\sqrt{1 - \|x\|^2} \le x_{n+1} \le \sqrt{1 - \|x\|^2},\ \|x\| \le R \right\}. \]
By Exercise 3,
\[ \lambda(H) = 2 \int_{\|x\| \le R} \sqrt{1 - \|x\|^2}\,d\lambda(x) = 2n\alpha_n \int_0^R \sqrt{1 - r^2}\,dr = n\alpha_n \left[ R\sqrt{1 - R^2} - \arcsin\sqrt{1 - R^2} + \pi/2 \right]. \]
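6. Define a $C^{\infty}$ map $\varphi : \mathbb{R}^n \times (0, 2\pi) \to \mathbb{R}^{n+1}$ by
\[ \varphi(x_1, \dots, x_n, \theta) = (x_1, \dots, x_{n-1}, x_n \cos\theta, x_n \sin\theta). \]
The condition $x_n > 0$ implies that ϕ is one-to-one. Since $E_r = \varphi\big(E \times (0, 2\pi)\big)$ and
\[ J_\varphi = \begin{vmatrix} I_{n-1} & 0 & 0 \\ 0 & \cos\theta & -x_n \sin\theta \\ 0 & \sin\theta & x_n \cos\theta \end{vmatrix} = x_n, \]
the change of variables theorem implies that
\[ \lambda_{n+1}(E_r) = \int_{\varphi(E \times (0, 2\pi))} 1\,d\lambda_{n+1} = \int_0^{2\pi} \int_E x_n\,d\lambda_n(x_1, \dots, x_n)\,d\theta = 2\pi \int_E x_n\,d\lambda_n(x_1, \dots, x_n). \]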
Section 12.1
1. If ψ = ϕ ◦ α and φ = ψ ◦ β, where α and β are $C^1$ with $C^1$ inverses, then
φ = ϕ ◦ γ, where γ = α ◦ β is $C^1$ with $C^1$ inverse $\beta^{-1} \circ \alpha^{-1}$.
3. There are two unit tangent vectors at the point where t = ±1, namely,
$(\pm e_1 + e_2)/\sqrt2$.
4. (a) The trace is the parabola $(x, 1 - 2x^2)$. The tangent vector field is
$(\cos t) e_1 - 2(\sin(2t)) e_2$.
5. (b) x = a cos t, y = b sin t, z = ab sin(2t), 0 ≤ t ≤ 2π.
6. If ϕ(t) = x for infinitely many t, then, by the Bolzano–Weierstrass
theorem, there would exist a point t ∈ [a, b] and a sequence $t_n \to t$ with
$t_n \ne t$ for all n such that $\varphi(t_n) = x$ for all n. Then
\[ \varphi'(t) = \lim_n \frac{\varphi(t_n) - \varphi(t)}{t_n - t} = \lim_n \frac{0 - 0}{t_n - t} = 0, \]
hence ϕ is not smooth.
Section 12.2
1. Only (b) and (c) are rectifiable.
2. (b) $-\sqrt5/4$.
4. (b) x = 1 + 2 cos t, y = 2 + 3 sin t, 0 ≤ t ≤ 2π.
length = $\int_0^{2\pi} \sqrt{4\sin^2 t + 9\cos^2 t}\,dt$.
5. (b) 6.
7. $x' = r'\cos\theta - r\theta'\sin\theta$, $y' = r'\sin\theta + r\theta'\cos\theta$, $(x')^2 + (y')^2 = (r\theta')^2 + (r')^2$.
9. (a) $W = -\int_a^b \nabla P(\varphi(t)) \cdot \varphi'(t)\,dt = \int_b^a \dfrac{d}{dt}(P \circ \varphi)(t)\,dt$.
Section 12.3
1. (a) If $T(e_1, e_2, e_3) = (e_1 + e_2, e_2 + e_3, e_3 + e_1)$, then
\[ [T] = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \]
which has positive determinant. Therefore, the sign of the frame is
positive.
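5. Take ψ(φ) = (a cos φ, b + a sin φ), 0 < φ < 2π. By 12.3.9,
\[ \vec{N}_\varphi\big(\varphi(\phi, \theta)\big) = a(\cos\phi, \sin\phi\cos\theta, \sin\phi\sin\theta). \]
Setting x = a cos φ, y = (b + a sin φ) cos θ and z = (b + a sin φ) sin θ, we
have
\[ a\sin\phi = \sqrt{y^2 + z^2} - b, \quad \cos\theta = \frac{y}{\sqrt{y^2 + z^2}}, \quad \sin\theta = \frac{z}{\sqrt{y^2 + z^2}}, \]
hence
\[ \vec{N}_\varphi(x, y, z) = x e^1 + \frac{y\left(\sqrt{y^2 + z^2} - b\right)}{\sqrt{y^2 + z^2}}\,e^2 + \frac{z\left(\sqrt{y^2 + z^2} - b\right)}{\sqrt{y^2 + z^2}}\,e^3. \]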
y2 + z2
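6. Let (x, y, z) = ϕ(t, θ) and use $\partial\varphi^{\perp} = (x_t, y_t, z_t) \times (x_\theta, y_\theta, z_\theta)$.
(a) $\partial\varphi^{\perp}(t, \theta) = t(-\cos\theta, -\sin\theta, 1)$, so
\[ \vec{N}_\varphi(t\cos\theta, t\sin\theta, t) = \frac{1}{\sqrt2}(-\cos\theta, -\sin\theta, 1). \]
Therefore, $\vec{N}_\varphi(x, y, z) = \dfrac{1}{\sqrt2}\left(\dfrac{-x}{z}, \dfrac{-y}{z}, 1\right)$.
(f) Let f(s) = a + s(b − a). Then ϕ(t, s) = (f(s) cos t, f(s) sin t, s) and
$\partial\varphi^{\perp}(t, s) = f(s)(\cos t, \sin t, a - b)$. Therefore,
\[ \vec{N}_\varphi(x, y, z) = \frac{1}{\sqrt{1 + (a - b)^2}}(\cos t, \sin t, a - b), \]
where x = f(s) cos t, y = f(s) sin t, and z = s, so
\[ \vec{N}_\varphi(x, y, z) = \frac{1}{\sqrt{1 + (a - b)^2}}\left(\frac{x}{f(z)}, \frac{y}{f(z)}, a - b\right). \]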
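The normal vector in part (a) of Exercise 6 can be checked by computing the cross product numerically. The sketch below assumes the cone parameterization ϕ(t, θ) = (t cos θ, t sin θ, t), which is the form suggested by the argument of $\vec{N}_\varphi$ in the solution. The helper names and the sample point are illustrative.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def cone_normal(t, theta):
    """Cross product of the partials of phi(t, theta) = (t cos(theta), t sin(theta), t)."""
    phi_t = (math.cos(theta), math.sin(theta), 1.0)
    phi_theta = (-t * math.sin(theta), t * math.cos(theta), 0.0)
    return cross(phi_t, phi_theta)

t, theta = 1.7, 0.6
normal = cone_normal(t, theta)
expected = (-t * math.cos(theta), -t * math.sin(theta), t)
length = math.sqrt(sum(c * c for c in normal))
print(normal, expected, length)
```

The length of the cross product is $t\sqrt2$, so dividing by it gives the unit normal $(1/\sqrt2)(-\cos\theta, -\sin\theta, 1)$.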
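7. If v ∈ V and u = (v, s), then
\[ \det\big(\varphi'(u)^t \varphi'(u)\big) = \det\big(\psi'(v)^t \psi'(v)\big) \ne 0, \]
hence ϕ is a parameterized (n − 2)-surface. Moreover, since the determinant
$\dfrac{\partial(\varphi_1, \dots, \varphi_{n-1})}{\partial(u_1, \dots, u_{n-1})}$ has a column of zeros, setting u = (v, s), we
have
\begin{align*}
\partial\varphi^{\perp}(u) &= \sum_{i=1}^{n} (-1)^{i+n} \frac{\partial(\varphi_1, \dots, \widehat{\varphi_i}, \dots, \varphi_n)}{\partial(u_1, \dots, u_{n-1})}(v, u_{n-1})\,e_i \\
&= \sum_{i=1}^{n-1} (-1)^{i+n} \frac{\partial(\psi_1, \dots, \widehat{\psi_i}, \dots, \psi_{n-1})}{\partial(u_1, \dots, u_{n-1})}(v)\,e_i \\
&= \big(\partial\varphi^{\perp}(v), 0\big),
\end{align*}
proving (b). Part (c) follows from (b) and 12.3.6.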
Section 12.4
3. The transition mapping is $\varphi^{-1} \circ \varphi_1(x) = \tfrac12 x$, as may be seen from
\[ x = \varphi_1^{-1}(y) = 2(1 - y_n)^{-1}(y_1, \dots, y_{n-1}), \quad y_n < 1, \]
hence
\[ \varphi_1(x) = (4 + \|x\|^2)^{-1}\left(4x_1, \dots, 4x_{n-1}, \|x\|^2 - 4\right). \]
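5. For $(y_1, y_2, y_3)$ on the cone,
\[ (x_1, x_2) = \varphi^{-1}(y_1, y_2, y_3) = \frac{1}{1 - y_3}(y_1, y_2), \quad 0 < y_3 < 1, \]
hence $\varphi(x_1, x_2) = \big((1 - y_3)x_1, (1 - y_3)x_2, y_3\big)$, where
\[ y_3 = \frac{\alpha}{1 + \alpha}, \qquad \alpha = \left(\frac{x_1}{a_1}\right)^2 + \left(\frac{x_2}{a_2}\right)^2. \]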
1+α
a1
a2
7. (a) {v : v1 + 2v2 + 3v3 = 0}.
9. The transpose of Tv is
\[ \frac{1}{1 - y_n} \begin{pmatrix} 1 & 0 & \cdots & 0 & \dfrac{y_1}{1 - y_n} \\ 0 & 1 & \cdots & 0 & \dfrac{y_2}{1 - y_n} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & \dfrac{y_{n-1}}{1 - y_n} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \frac{1}{1 - y_n} \begin{pmatrix} v_1 + \dfrac{y_1 v_n}{1 - y_n} \\ v_2 + \dfrac{y_2 v_n}{1 - y_n} \\ \vdots \\ v_{n-1} + \dfrac{y_{n-1} v_n}{1 - y_n} \end{pmatrix}. \]
For part (a), use the identities
\[ \sum_{j=1}^{n-1} y_j^2 = 1 - y_n^2 \quad\text{and}\quad \sum_{j=1}^{n-1} v_j y_j = -v_n y_n, \]
a consequence of ‖y‖ = 1 and y · v = 0.
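11. (a) The conditions are
\[ F(x) = 0 \quad\text{and}\quad G(x, v) := v_1 \partial_1 F(x) + v_2 \partial_2 F(x) + v_3 \partial_3 F(x) = 0. \]
The Jacobian matrix of (F, G) (suppressing x and v) is
\[ \begin{pmatrix} \partial_1 F & \partial_2 F & \partial_3 F & 0 & 0 & 0 \\ \partial_1 G & \partial_2 G & \partial_3 G & \partial_1 F & \partial_2 F & \partial_3 F \end{pmatrix}, \]
which has rank 2 since for each x, $\partial_i F(x) \ne 0$ for some i.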
Section 13.1
3. Fix ai for i ≥ 3 and set B(a, b) := M (a, b, a3 , . . . , am ). Then
B(a + b, a + b) = B(a, a) + B(b, a) + B(a, b) + B(b, b),
hence B(b, a) + B(a, b) = 0. This shows that M changes sign if the first
two arguments are interchanged. The other pairs are treated similarly.
7. (a) $(f_1 g_2 - g_1 f_2)\,dx_{1,2} + (f_1 g_3 - g_1 f_3)\,dx_{1,3} + (f_2 g_3 - g_2 f_3)\,dx_{2,3}$.
8. (b) $-dx_{1,2,3}$.
9. (a) $(-1)^{k(k+1)/2}\,dx_1 \wedge \cdots \wedge dx_{2k}$.
11. (b) $(-dx_1 + 2\,dx_2) \wedge (-dx_2 + 2\,dx_3) \wedge (2\,dx_1 + dx_3) = 9\,dx_{1,2,3}$.
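The coefficient in 11(b) can be checked with the standard fact that a wedge product of three 1-forms on $\mathbb{R}^3$ equals the determinant of their coefficient matrix times $dx_{1,2,3}$. The sketch below is only a verification aid; `det3` is an illustrative helper.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows hold the coefficients of dx1, dx2, dx3 in each factor of the wedge product.
coefficients = [(-1, 2, 0),   # -dx1 + 2 dx2
                (0, -1, 2),   # -dx2 + 2 dx3
                (2, 0, 1)]    # 2 dx1 + dx3
print(det3(coefficients))  # prints 9
```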
13. (a) $\displaystyle \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} \frac{\partial g_j}{\partial x_i}\,dx_i\Bigr) \wedge dx_j = \sum_{j=1}^{n} f'(x_j)\,dx_j \wedge dx_j = 0$.
15. By 13.1.16, $d(\omega \wedge \nu) = (d\omega) \wedge \nu + (-1)^p\,\omega \wedge (d\nu) = \eta \wedge \nu$.
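17. By 13.1.19(d),
\begin{align*}
[\varphi^*(\psi^*\omega)](a_1, \dots, a_m) &= (\psi^*\omega)_{\varphi(u)}\bigl(d\varphi_u(a_1), \dots, d\varphi_u(a_m)\bigr) \\
&= \omega_{\psi(\varphi(u))}\bigl((d\psi)_{\varphi(u)}(d\varphi_u(a_1)), \dots, (d\psi)_{\varphi(u)}(d\varphi_u(a_m))\bigr) \\
&= \omega_{\psi\circ\varphi(u)}\bigl(d(\psi\circ\varphi)_u(a_1), \dots, d(\psi\circ\varphi)_u(a_m)\bigr),
\end{align*}
the last equality by the chain rule.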
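19. By Exercise 9.3.15, $x \cdot \nabla f_j(x) = k f_j(x)$. Since $d\omega = 0$, $\partial_1 f_2 = \partial_2 f_1$, $\partial_3 f_1 = \partial_1 f_3$ and $\partial_2 f_3 = \partial_3 f_2$ (13.1.15). Therefore,
\begin{align*}
\partial_1 f &= \frac{1}{k+1}\,(x_1\partial_1 f_1 + x_2\partial_1 f_2 + x_3\partial_1 f_3 + f_1) \\
&= \frac{1}{k+1}\,(x_1\partial_1 f_1 + x_2\partial_2 f_1 + x_3\partial_3 f_1 + f_1) \\
&= f_1.
\end{align*}
Similarly, $\partial_2 f = f_2$ and $\partial_3 f = f_3$.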
Section 13.2
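1. (b) Since $\partial\varphi^\perp(t, \theta) = (\sin\theta, -\cos\theta, t)$, $\|\partial\varphi^\perp(t, \theta)\|^2 = 1 + t^2$, hence
\[ \mathrm{area}(\varphi) = 2\pi \int_0^1 \sqrt{1 + t^2}\,dt = \pi\bigl(\sqrt 2 + \ln(1 + \sqrt 2)\bigr). \]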
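The closed form for this area can be compared against a numerical quadrature. The following is a minimal sketch using a composite Simpson rule; the helper `simpson` and the number of subintervals are illustrative choices, not part of the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# area = 2*pi * integral_0^1 sqrt(1 + t^2) dt
numeric = 2 * math.pi * simpson(lambda t: math.sqrt(1 + t * t), 0.0, 1.0)
closed = math.pi * (math.sqrt(2) + math.log(1 + math.sqrt(2)))
print(numeric, closed)
```

The two values should agree to many decimal places, since the integrand is smooth on $[0, 1]$.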
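4. For the case $m = 2$:
\[
\varphi'(\theta_1, \theta_2) =
\begin{pmatrix}
-r_1\sin\theta_1 & 0 \\
r_1\cos\theta_1 & 0 \\
0 & -r_2\sin\theta_2 \\
0 & r_2\cos\theta_2
\end{pmatrix},
\]
hence $\det\bigl[\varphi'(\theta_1, \theta_2)^t \varphi'(\theta_1, \theta_2)\bigr] = (r_1 r_2)^2$. Therefore,
\[ \mathrm{area}(\varphi) = \int_0^{2\pi}\!\!\int_0^{2\pi} r_1 r_2\,d\theta_1\,d\theta_2 = (2\pi r_1)(2\pi r_2). \]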
6. In the notation of Example 11.5.5, the surface is the graph of
\[ g(x_1, \dots, x_n) := 1 - \sum_{j=1}^{n} x_j, \quad (x_1, \dots, x_n) \in S(1, n). \]
Therefore, the surface area is
\[ \int_{S(1,n)} \sqrt{1 + \|\nabla g\|^2}\,d\lambda^n = \sqrt{1 + n}\;\lambda^n\bigl(S(1, n)\bigr) = \frac{\sqrt{n+1}}{n!}. \]
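As an independent check of the value $\sqrt{n+1}/n!$, the same area can be computed from a Gram determinant. The sketch assumes that $S(1, n)$ is the standard simplex $\{x_j \ge 0,\ \sum x_j \le 1\}$, so that the graph is the simplex with vertices $e_1, \dots, e_{n+1}$ in $\mathbb{R}^{n+1}$; the helpers `det` and `simplex_area` are illustrative.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            return 0.0
        if p != k:
            a[k], a[p] = a[p], a[k]
            d = -d
        d *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return d

def simplex_area(n):
    """n-dimensional area of the simplex with vertices e_1, ..., e_(n+1)."""
    # Edge vectors v_i = e_i - e_(n+1) have Gram matrix G_ij = 1 + delta_ij,
    # and the area of an n-simplex is sqrt(det G) / n!.
    gram = [[2.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    return math.sqrt(det(gram)) / math.factorial(n)

for n in range(1, 6):
    print(n, simplex_area(n), math.sqrt(n + 1) / math.factorial(n))
```

For $n = 2$ this is the triangle with vertices $e_1, e_2, e_3$, whose area is $\sqrt 3/2$.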
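7. Let $u_{n-1} = s$. Since
\[ \varphi(u_1, \dots, u_{n-1}) = \bigl(\psi_1(u_1, \dots, u_{n-2}), \dots, \psi_{n-1}(u_1, \dots, u_{n-2}), u_{n-1}\bigr), \]
\[
\frac{\partial(\varphi_1, \dots, \widehat{\varphi_i}, \dots, \varphi_n)}{\partial(u_1, \dots, u_{n-1})} =
\begin{cases}
\dfrac{\partial(\psi_1, \dots, \widehat{\psi_i}, \dots, \psi_{n-1})}{\partial(u_1, \dots, u_{n-2})} & \text{if } i \le n - 1, \\[2ex]
0 & \text{if } i = n.
\end{cases}
\]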
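Therefore, $\det\bigl[\varphi'(u)^t \varphi'(u)\bigr] = \det\bigl[\psi'(u)^t \psi'(u)\bigr]$, so
\[ \mathrm{area}(\varphi) = \int_0^h\!\!\int_U \sqrt{\det\bigl[\psi'(u)^t \psi'(u)\bigr]}\;du\,ds = h \cdot \mathrm{area}(\psi). \]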
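8. From $\varphi(t, s) = (\psi_1(t), \psi_2(t), s)$, we have
\[
\frac{\partial(\varphi_2, \varphi_3)}{\partial(t, s)} = \begin{vmatrix} \psi_2'(t) & 0 \\ 0 & 1 \end{vmatrix}, \quad
\frac{\partial(\varphi_1, \varphi_3)}{\partial(t, s)} = \begin{vmatrix} \psi_1'(t) & 0 \\ 0 & 1 \end{vmatrix}, \quad
\frac{\partial(\varphi_1, \varphi_2)}{\partial(t, s)} = \begin{vmatrix} \psi_1'(t) & 0 \\ \psi_2'(t) & 0 \end{vmatrix}.
\]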
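11. By 12.3.7(b), $\partial\varphi^\perp(t, \theta) = \psi_2(t)\bigl(\psi_2'(t), -\psi_1'(t)\cos\theta, -\psi_1'(t)\sin\theta\bigr)$, hence
\[ \|\partial\varphi^\perp(t, \theta)\| = \psi_2(t)\,\|\psi'(t)\|. \]
For the torus, take $\psi(t) = (a\cos t, b + a\sin t)$, $0 < t < 2\pi$. Then
\[ \mathrm{area}(\varphi) = 2\pi \int_0^{2\pi} \psi_2(t)\,\|\psi'(t)\|\,dt = 2\pi a \int_0^{2\pi} (b + a\sin t)\,dt = 4\pi^2 ab. \]
For the cone, take $\psi(t) = (t, rt/h)$, $0 \le t \le h$, so
\[ \mathrm{area}(\varphi) = 2\pi \int_0^h (rt/h)\sqrt{(r/h)^2 + 1}\,dt = \pi r\sqrt{r^2 + h^2}. \]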
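Both area formulas follow from $\mathrm{area}(\varphi) = 2\pi\int \psi_2(t)\,\|\psi'(t)\|\,dt$, and this can be confirmed numerically. The sketch below uses a midpoint rule; the function `area_of_revolution` and the sample values of $a$, $b$, $r$, $h$ are illustrative assumptions.

```python
import math

def area_of_revolution(psi, dpsi, t0, t1, n=2000):
    """Midpoint-rule value of 2*pi * integral of psi_2(t) * |psi'(t)| over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        d1, d2 = dpsi(t)
        total += psi(t)[1] * math.hypot(d1, d2)
    return 2 * math.pi * total * h

# Torus: psi(t) = (a cos t, b + a sin t), 0 < t < 2*pi.
a, b = 1.0, 3.0
torus = area_of_revolution(
    lambda t: (a * math.cos(t), b + a * math.sin(t)),
    lambda t: (-a * math.sin(t), a * math.cos(t)),
    0.0, 2 * math.pi)

# Cone: psi(t) = (t, r t / h), 0 <= t <= h.
r, h = 2.0, 5.0
cone = area_of_revolution(
    lambda t: (t, r * t / h),
    lambda t: (1.0, r / h),
    0.0, h)

print(torus, 4 * math.pi ** 2 * a * b)
print(cone, math.pi * r * math.sqrt(r * r + h * h))
```

The midpoint rule is a reasonable choice here because the torus integrand is a trigonometric polynomial over a full period and the cone integrand is linear in $t$.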
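13. (a) Take $g(t) = t$, $0 < t < 1$, so $(x_1, x_2, x_3) := \varphi(t, \theta) = (t, t\cos\theta, t\sin\theta)$ and
\[ \int_\varphi \omega = \int_0^1\!\!\int_0^{2\pi} t\,\bigl[(f_1\circ\varphi) + (f_2\circ\varphi)\cos\theta - (f_3\circ\varphi)\sin\theta\bigr]\,d\theta\,dt. \]
Since $f_1 = f_2 = 0$ and $f_3 = x_1 x_3 = t^2\sin\theta$,
\[ \int_\varphi \omega = -\int_0^1\!\!\int_0^{2\pi} t^3\sin^2\theta\,d\theta\,dt = -\frac{\pi}{4}. \]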
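The value $-\pi/4$ can be reproduced by integrating the pulled-back integrand directly. The sketch below applies a two-dimensional midpoint rule to $-t\,(f_3\circ\varphi)\sin\theta$ with $f_3 = x_1 x_3$; the grid size and function names are illustrative.

```python
import math

def pulled_back_integrand(t, theta):
    """t * [f1 + f2 cos(theta) - f3 sin(theta)] along phi, with f1 = f2 = 0, f3 = x1 * x3."""
    x1, x3 = t, t * math.sin(theta)
    f3 = x1 * x3
    return -t * f3 * math.sin(theta)

def double_midpoint(f, n=400):
    """Midpoint rule over 0 < t < 1, 0 < theta < 2*pi with an n-by-n grid."""
    ht, hth = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * ht
        for j in range(n):
            total += f(t, (j + 0.5) * hth)
    return total * ht * hth

value = double_midpoint(pulled_back_integrand)
print(value, -math.pi / 4)
```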
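15. (b) A local parametrization of $S$ is
\[ (x_1, x_2, x_3) := \varphi(t, \theta) = \bigl(\psi_1(t), \psi_2(t)\cos\theta, \psi_2(t)\sin\theta\bigr), \quad 0 < \theta, t < 2\pi, \]
where
\[ \psi_1(t) = a\cos t \quad\text{and}\quad \psi_2(t) = b + a\sin t. \]
Therefore, by Exercise 12,
\begin{align*}
\int_\varphi f_1\,dx_2 \wedge dx_3 + f_2\,dx_1 \wedge dx_3 + f_3\,dx_1 \wedge dx_2
&= a\int_0^{2\pi}\!\!\int_0^{2\pi} (f_1\circ\varphi)(b + a\sin t)\cos t\,d\theta\,dt \\
&\quad - a\int_0^{2\pi}\!\!\int_0^{2\pi} (f_2\circ\varphi)(b + a\sin t)\sin t\cos\theta\,d\theta\,dt \\
&\quad - a\int_0^{2\pi}\!\!\int_0^{2\pi} (f_3\circ\varphi)(b + a\sin t)\sin t\sin\theta\,d\theta\,dt.
\end{align*}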
Since $f_2 = f_3 = 0$ and $f_1 = x_1 = a\cos t$,
\[ \int_\varphi \omega = a^2 \int_0^{2\pi}\!\!\int_0^{2\pi} (b + a\sin t)\cos^2 t\,d\theta\,dt = 2a^2 b\pi^2. \]
Section 13.4
4. For example, let $f_n$ and $g$ be Borel measurable and integrable on $S$ with $|f_n| \le g$ for all $n$. If $f_n \to f$ on $S$, then $\int_S f_n\,dS \to \int_S f\,dS$. This follows directly from the dominated convergence theorem applied to $f_n \circ \varphi_a$.
7. This follows from Exercise 6 and $H^u_{1/n} \cup H^\ell_{1/n} \uparrow S$.
Section 13.5
1. The parameterizations are
• S : (cos t, sin t, z),
• bottom boundary: (cos t, sin t, 0),
• top boundary: (cos t, − sin t, 1),
where $0 \le t \le 2\pi$ and $0 \le z \le 1$. The left side of Stokes's formula then reduces to
\begin{align*}
&-\int_0^{2\pi} \bigl[f(\cos t, \sin t, 0) + f(\cos t, -\sin t, 1)\bigr]\sin t\,dt \\
&\quad + \int_0^{2\pi} \bigl[g(\cos t, \sin t, 0) - g(\cos t, -\sin t, 1)\bigr]\cos t\,dt \tag{$\dagger$}
\end{align*}
and the right side to
\begin{align*}
&\int_0^1\!\!\int_0^{2\pi} \bigl[(h_y - g_z)(\cos t, \sin t, z)\cos t + (f_z - h_x)(\cos t, \sin t, z)\sin t\bigr]\,dt\,dz \\
&\quad = \int_0^{2\pi}\!\!\int_0^1 \bigl[-g_z(\cos t, \sin t, z)\cos t + f_z(\cos t, \sin t, z)\sin t\bigr]\,dz\,dt,
\end{align*}
where we have used
\[ h_y(\cos t, \sin t, z)\cos t - h_x(\cos t, \sin t, z)\sin t = \frac{d}{dt}\,h(\cos t, \sin t, z). \]
Make the substitution $t = 2\pi - s$ in ($\dagger$) to complete the argument.
3. (a) Both sides equal $2\pi(R - r)$.
4. (d) 0.
5. Use the parametrization $x = a\cos^{2m+1} t$, $y = b\sin^{2m+1} t$, $0 \le t \le 2\pi$, to first obtain
\[ \frac12 \int_\varphi (x\,dy - y\,dx) = ab\bigl(m + \tfrac12\bigr) \int_0^{2\pi} \cos^{2m} t\,\sin^{2m} t\,dt. \]
Then use $\sin(2t) = 2\sin t\cos t$ and 5.3.4.
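The reduction of the line integral can be checked numerically for particular values. The sketch below evaluates both sides with a midpoint rule; the sample values $a = 2$, $b = 3$, $m = 1$ and the function names are illustrative assumptions. For $m = 1$ both sides should equal $3\pi ab/8$.

```python
import math

def green_area(a, b, m, n=4000):
    """Midpoint value of (1/2) * integral of (x dy - y dx) for x = a cos^(2m+1) t, y = b sin^(2m+1) t."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        x, y = a * c ** (2 * m + 1), b * s ** (2 * m + 1)
        dx = -a * (2 * m + 1) * c ** (2 * m) * s   # dx/dt
        dy = b * (2 * m + 1) * s ** (2 * m) * c    # dy/dt
        total += 0.5 * (x * dy - y * dx)
    return total * h

def reduced_integral(a, b, m, n=4000):
    """Midpoint value of a*b*(m + 1/2) * integral of cos^(2m) t * sin^(2m) t over [0, 2*pi]."""
    h = 2 * math.pi / n
    total = sum((math.cos((i + 0.5) * h) * math.sin((i + 0.5) * h)) ** (2 * m)
                for i in range(n))
    return a * b * (m + 0.5) * total * h

lhs = green_area(2.0, 3.0, 1)
rhs = reduced_integral(2.0, 3.0, 1)
print(lhs, rhs, 3 * math.pi * 2.0 * 3.0 / 8)
```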
9. Apply the divergence theorem using d(f g) = f dg + g df and div d(f g) =
f ∇2 g + 2∇f · ∇g + g∇2 f .
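12. In (a), the induced orientation of $C$ from $S_1$ is the opposite of the induced orientation from $S_2$. By Stokes's theorem,
\[ \int_{S_1} \operatorname{curl}\vec F \cdot \vec n\,dS = \int_C \vec F \cdot d\vec r = -\int_{S_2} \operatorname{curl}\vec F \cdot \vec n\,dS, \]
hence
\[ \int_S \operatorname{curl}\vec F \cdot \vec n\,dS = \int_{S_1} \operatorname{curl}\vec F \cdot \vec n\,dS + \int_{S_2} \operatorname{curl}\vec F \cdot \vec n\,dS = 0. \]
In (b), the induced orientations of $C$ from $S_1$ and $S_2$ are the same, so
\[ \int_{S_1} \operatorname{curl}\vec F \cdot \vec n\,dS = \int_C \vec F \cdot d\vec r = \int_{S_2} \operatorname{curl}\vec F \cdot \vec n\,dS. \]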
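14. Let $\vec F(x) = x$, $x \in \mathbb{R}^n$. On $S := S_r(0)$, $\vec n = x/r$, hence
\[ \int_S \vec F \cdot \vec n\,dS = \int_S r\,dS = r \cdot \mathrm{area}(S). \]
Since $\operatorname{div}\vec F = n$,
\[ \int_{\|x\| < r} \operatorname{div}\vec F\,dx = \int_{\|x\| < r} n\,dx = n r^n \alpha_n. \]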
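Equating the two integrals gives $r \cdot \mathrm{area}(S) = n r^n \alpha_n$. The sketch below checks this relation against the standard gamma-function formulas for the volume of the unit ball and the surface area of a sphere; reading $\alpha_n$ as the volume of the unit ball in $\mathbb{R}^n$ is an assumption based on the displayed formula, and the function names are illustrative.

```python
import math

def unit_ball_volume(n):
    """alpha_n = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def sphere_area(n, r):
    """(n-1)-dimensional area of the sphere of radius r in R^n: 2 pi^(n/2) r^(n-1) / Gamma(n/2)."""
    return 2 * math.pi ** (n / 2) * r ** (n - 1) / math.gamma(n / 2)

r = 1.7
for n in range(2, 8):
    flux = r * sphere_area(n, r)                       # integral of F . n over S
    divergence = n * r ** n * unit_ball_volume(n)      # integral of div F over the ball
    print(n, flux, divergence)
```

For $n = 3$ both columns reduce to $4\pi r^3$.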