Flavors of Geometry
MSRI Publications
Volume 31, 1997

An Elementary Introduction to Modern Convex Geometry

KEITH BALL
Contents

Preface 1
Lecture 1. Basic Notions 2
Lecture 2. Spherical Sections of the Cube 8
Lecture 3. Fritz John's Theorem 13
Lecture 4. Volume Ratios and Spherical Sections of the Octahedron 19
Lecture 5. The Brunn–Minkowski Inequality and Its Extensions 25
Lecture 6. Convolutions and Volume Ratios: The Reverse Isoperimetric Problem 32
Lecture 7. The Central Limit Theorem and Large Deviation Inequalities 37
Lecture 8. Concentration of Measure in Geometry 41
Lecture 9. Dvoretzky's Theorem 47
Acknowledgements 53
References 53
Index 55
Preface

These notes are based, somewhat loosely, on three series of lectures given by myself, J. Lindenstrauss and G. Schechtman, during the Introductory Workshop in Convex Geometry held at the Mathematical Sciences Research Institute in Berkeley, early in 1996. A fourth series was given by B. Bollobás, on rapid mixing and random volume algorithms; it appears elsewhere in this book. The material discussed in these notes is not, for the most part, very new, but the presentation has been strongly influenced by recent developments: among other things, it has been possible to simplify many of the arguments in the light of later discoveries. Instead of giving a comprehensive description of the state of the art, I have tried to describe two or three of the more important ideas that have shaped the modern view of convex geometry, and to make them as accessible as possible to a broad audience. In most places, I have adopted an informal style that I hope retains some of the spontaneity of the original lectures. Needless to say, my fellow lecturers cannot be held responsible for any shortcomings of this presentation.

I should mention that there are large areas of research that fall under the very general name of convex geometry, but that will barely be touched upon in these notes. The most obvious such area is the classical or “Brunn–Minkowski” theory, which is well covered in [Schneider 1993]. Another noticeable omission is the combinatorial theory of polytopes: a standard reference here is [Brøndsted 1983].
Lecture 1. Basic Notions

The topic of these notes is convex geometry. The objects of study are convex bodies: compact, convex subsets of Euclidean spaces, that have nonempty interior. Convex sets occur naturally in many areas of mathematics: linear programming, probability theory, functional analysis, partial differential equations, information theory, and the geometry of numbers, to name a few.

Although convexity is a simple property to formulate, convex bodies possess a surprisingly rich structure. There are several themes running through these notes: perhaps the most obvious one can be summed up in the sentence: “All convex bodies behave a bit like Euclidean balls.” Before we look at some ways in which this is true, it is a good idea to point out ways in which it definitely is not. This lecture will be devoted to the introduction of a few basic examples that we need to keep at the backs of our minds, and one or two well-known principles.

The only notational conventions that are worth specifying at this point are the following. We will use $|\cdot|$ to denote the standard Euclidean norm on $\mathbb{R}^n$. For a body $K$, $\mathrm{vol}(K)$ will mean the volume measure of the appropriate dimension.
The most fundamental principle in convexity is the Hahn–Banach separation theorem, which guarantees that each convex body is an intersection of half-spaces, and that at each point of the boundary of a convex body, there is at least one supporting hyperplane. More generally, if $K$ and $L$ are disjoint, compact, convex subsets of $\mathbb{R}^n$, then there is a linear functional $\phi : \mathbb{R}^n \to \mathbb{R}$ for which $\phi(x) < \phi(y)$ whenever $x \in K$ and $y \in L$.
The simplest example of a convex body in $\mathbb{R}^n$ is the cube, $[-1,1]^n$. This does not look much like the Euclidean ball. The largest ball inside the cube has radius 1, while the smallest ball containing it has radius $\sqrt{n}$, since the corners of the cube are this far from the origin. So, as the dimension grows, the cube resembles a ball less and less.
The second example to which we shall refer is the $n$-dimensional regular solid simplex: the convex hull of $n+1$ equally spaced points. For this body, the ratio of the radii of inscribed and circumscribed balls is $n$: even worse than for the cube. The two-dimensional case is shown in Figure 1. In Lecture 3 we shall see that these ratios are extremal in a certain well-defined sense.

Figure 1. Inscribed and circumscribed spheres for an $n$-simplex.
Solid simplices are particular examples of cones. By a cone in $\mathbb{R}^n$ we just mean the convex hull of a single point and some convex body of dimension $n-1$ (Figure 2). In $\mathbb{R}^n$, the volume of a cone of height $h$ over a base of $(n-1)$-dimensional volume $B$ is $Bh/n$.
The third example, which we shall investigate more closely in Lecture 4, is the $n$-dimensional “octahedron”, or cross-polytope: the convex hull of the $2n$ points $(\pm 1, 0, 0, \ldots, 0)$, $(0, \pm 1, 0, \ldots, 0)$, \ldots, $(0, 0, \ldots, 0, \pm 1)$. Since this is the unit ball of the $\ell_1$ norm on $\mathbb{R}^n$, we shall denote it $B_1^n$. The circumscribing sphere of $B_1^n$ has radius 1, the inscribed sphere has radius $1/\sqrt{n}$; so, as for the cube, the ratio is $\sqrt{n}$: see Figure 3, left.
$B_1^n$ is made up of $2^n$ pieces similar to the piece whose points have nonnegative coordinates, illustrated in Figure 3, right. This piece is a cone of height 1 over a base, which is the analogous piece in $\mathbb{R}^{n-1}$. By induction, its volume is
$$\frac{1}{n} \cdot \frac{1}{n-1} \cdots \frac{1}{2} \cdot 1 = \frac{1}{n!},$$
and hence the volume of $B_1^n$ is $2^n/n!$.

Figure 2. A cone.
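This volume is easy to sanity-check numerically. The sketch below (an illustrative Python fragment; the function names are my own) estimates the volume of $B_1^n$ by Monte Carlo sampling in the surrounding cube and compares it with $2^n/n!$:

```python
import math
import random

def l1_ball_volume_mc(n, samples=200_000, seed=0):
    """Monte Carlo estimate of vol(B_1^n): sample the cube [-1,1]^n
    and count the points whose l_1 norm is at most 1."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(samples)
        if sum(abs(rng.uniform(-1, 1)) for _ in range(n)) <= 1
    )
    return (2 ** n) * hits / samples  # cube volume times hit fraction

def l1_ball_volume_exact(n):
    """The induction in the text: vol(B_1^n) = 2^n / n!."""
    return 2 ** n / math.factorial(n)
```

For small $n$ the two agree to within sampling error; the estimate degrades quickly as $n$ grows, since the cross-polytope occupies a vanishing fraction of the cube.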
Figure 3. The cross-polytope (left), with vertex $(1,0,\ldots,0)$ and face centre $(\frac{1}{n},\ldots,\frac{1}{n})$ marked, and one orthant thereof (right), with vertices $(1,0,0)$, $(0,1,0)$, $(0,0,1)$.
The final example is the Euclidean ball itself,
$$B_2^n = \Bigl\{ x \in \mathbb{R}^n : \sum_1^n x_i^2 \le 1 \Bigr\}.$$
We shall need to know the volume of the ball: call it $v_n$. We can calculate the surface “area” of $B_2^n$ very easily in terms of $v_n$: the argument goes back to the ancients. We think of the ball as being built of thin cones of height 1: see Figure 4, left. Since the volume of each of these cones is $1/n$ times its base area, the surface of the ball has area $nv_n$. The sphere of radius 1, which is the surface of the ball, we shall denote $S^{n-1}$.
To calculate $v_n$, we use integration in spherical polar coordinates. To specify a point $x$ we use two coordinates: $r$, its distance from 0, and $\theta$, a point on the sphere, which specifies the direction of $x$. The point $\theta$ plays the role of $n-1$ real coordinates. Clearly, in this representation, $x = r\theta$: see Figure 4, right.

Figure 4. Computing the volume of the Euclidean ball.

We can write the integral of a function on $\mathbb{R}^n$ as
$$\int_{\mathbb{R}^n} f = \int_{r=0}^{\infty} \Bigl( \int_{S^{n-1}} f(r\theta)\, “d\theta” \Bigr)\, r^{n-1}\, dr. \tag{1.1}$$
The factor $r^{n-1}$ appears because the sphere of radius $r$ has area $r^{n-1}$ times that of $S^{n-1}$. The notation “$d\theta$” stands for “area” measure on the sphere: its total mass is the surface area $nv_n$. The most important feature of this measure is its rotational invariance: if $A$ is a subset of the sphere and $U$ is an orthogonal transformation of $\mathbb{R}^n$, then $UA$ has the same measure as $A$. Throughout these lectures we shall normalise integrals like that in (1.1) by pulling out the factor $nv_n$, and write
$$\int_{\mathbb{R}^n} f = nv_n \int_0^{\infty} \int_{S^{n-1}} f(r\theta)\, r^{n-1}\, d\sigma(\theta)\, dr,$$
where $\sigma = \sigma_{n-1}$ is the rotation-invariant measure on $S^{n-1}$ of total mass 1. To find $v_n$, we integrate the function
$$x \mapsto \exp\Bigl( -\tfrac{1}{2} \sum_1^n x_i^2 \Bigr)$$
both ways. This function is at once invariant under rotations and a product of functions depending upon separate coordinates; this is what makes the method work. The integral is
$$\int_{\mathbb{R}^n} f = \int_{\mathbb{R}^n} \prod_1^n e^{-x_i^2/2}\, dx = \prod_1^n \int_{-\infty}^{\infty} e^{-x_i^2/2}\, dx_i = \bigl(\sqrt{2\pi}\bigr)^n.$$
But this equals
$$nv_n \int_0^{\infty} \int_{S^{n-1}} e^{-r^2/2}\, r^{n-1}\, d\sigma\, dr = nv_n \int_0^{\infty} e^{-r^2/2}\, r^{n-1}\, dr = v_n\, 2^{n/2}\, \Gamma\Bigl(\frac{n}{2} + 1\Bigr).$$
Hence
$$v_n = \frac{\pi^{n/2}}{\Gamma\bigl(\frac{n}{2}+1\bigr)}.$$
This is extremely small if $n$ is large. From Stirling's formula we know that
$$\Gamma\Bigl(\frac{n}{2}+1\Bigr) \sim \sqrt{2\pi}\, e^{-n/2} \Bigl(\frac{n}{2}\Bigr)^{(n+1)/2},$$
so that $v_n$ is roughly
$$\biggl( \sqrt{\frac{2\pi e}{n}} \biggr)^{n}.$$
To put it another way, the Euclidean ball of volume 1 has radius about
$$\sqrt{\frac{n}{2\pi e}},$$
which is pretty big.
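Both the exact formula and the asymptotic radius can be checked directly in a few lines of Python (an illustrative sketch; the function names are mine):

```python
import math

def ball_volume(n):
    """Volume of the unit Euclidean ball: v_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def unit_volume_radius(n):
    """Radius of the Euclidean ball in R^n whose volume is 1."""
    return ball_volume(n) ** (-1 / n)
```

For $n=2$ and $n=3$ this reproduces $\pi$ and $4\pi/3$, and for large $n$ the radius of the volume-1 ball tracks $\sqrt{n/(2\pi e)}$ closely. (Note that `math.gamma` overflows for arguments much beyond 170, so this direct check only works for moderate $n$.)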
Figure 5. Comparing the volume of a ball with that of its central slice. The ball of volume 1 has radius $r = v_n^{-1/n}$.
This rather surprising property of the ball in high-dimensional spaces is perhaps the first hint that our intuition might lead us astray. The next hint is provided by an answer to the following rather vague question: how is the mass of the ball distributed? To begin with, let's estimate the $(n-1)$-dimensional volume of a slice through the centre of the ball of volume 1. The ball has radius
$$r = v_n^{-1/n}$$
(Figure 5). The slice is an $(n-1)$-dimensional ball of this radius, so its volume is
$$v_{n-1} r^{n-1} = v_{n-1} \Bigl( \frac{1}{v_n} \Bigr)^{(n-1)/n}.$$
By Stirling's formula again, we find that the slice has volume about $\sqrt{e}$ when $n$ is large. What are the $(n-1)$-dimensional volumes of parallel slices? The slice at distance $x$ from the centre is an $(n-1)$-dimensional ball whose radius is $\sqrt{r^2 - x^2}$ (whereas the central slice had radius $r$), so the volume of the smaller slice is about
$$\sqrt{e}\, \biggl( \frac{\sqrt{r^2 - x^2}}{r} \biggr)^{n-1} = \sqrt{e}\, \Bigl( 1 - \frac{x^2}{r^2} \Bigr)^{(n-1)/2}.$$
Since $r$ is roughly $\sqrt{n/(2\pi e)}$, this is about
$$\sqrt{e}\, \Bigl( 1 - \frac{2\pi e x^2}{n} \Bigr)^{(n-1)/2} \approx \sqrt{e}\, \exp(-\pi e x^2).$$
Thus, if we project the mass distribution of the ball of volume 1 onto a single direction, we get a distribution that is approximately Gaussian (normal) with variance $1/(2\pi e)$. What is remarkable about this is that the variance does not depend upon $n$. Despite the fact that the radius of the ball of volume 1 grows like $\sqrt{n/(2\pi e)}$, almost all of this volume stays within a slab of fixed width: for example, about 96% of the volume lies in the slab
$$\bigl\{ x \in \mathbb{R}^n : -\tfrac{1}{2} \le x_1 \le \tfrac{1}{2} \bigr\}.$$
See Figure 6.
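Assuming the slice formula above, the 96% figure can be checked by integrating the exact slice volumes numerically (an illustrative Python sketch; the function names and tolerances are mine):

```python
import math

def ball_volume(n):
    """v_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def slab_fraction(n, half_width=0.5, steps=20_000):
    """Fraction of the volume-1 ball in R^n lying in {|x_1| <= half_width},
    found by integrating the (n-1)-dimensional slice volumes (midpoint rule)."""
    r = ball_volume(n) ** (-1 / n)      # radius of the ball of volume 1
    v = ball_volume(n - 1)
    h = half_width / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h               # slice at distance x from the centre
        total += v * (r * r - x * x) ** ((n - 1) / 2) * h
    return 2 * total                    # the slab is symmetric in x_1
```

For $n = 120$ (the largest ball drawn in Figure 6) the slab of half-width $\frac12$ indeed captures about 96% of the volume, and the central slice volume $v_{n-1} v_n^{-(n-1)/n}$ is close to $\sqrt{e}$.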
Figure 6. Balls in various dimensions ($n = 2$, $n = 16$, $n = 120$), and the slab that contains about 96% of each of them.
So the volume of the ball concentrates close to any subspace of dimension $n-1$. This would seem to suggest that the volume concentrates near the centre of the ball, where the subspaces all meet. But, on the contrary, it is easy to see that, if $n$ is large, most of the volume of the ball lies near its surface. In objects of high dimension, measure tends to concentrate in places that our low-dimensional intuition considers small. A considerable extension of this curious phenomenon will be exploited in Lectures 8 and 9.
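The statement about the surface is elementary: the shrunken ball $(1-t)B_2^n$ has volume $(1-t)^n$ times that of the ball, which is tiny for large $n$. As a one-line check (illustrative Python):

```python
def core_fraction(n, t):
    """Fraction of the ball's volume lying in the shrunken ball (1-t) * B_2^n:
    scaling by (1-t) scales n-dimensional volume by (1-t)^n."""
    return (1 - t) ** n
```

For $n = 100$, less than 1% of the volume lies deeper than 5% of the radius below the surface, whereas in the plane almost all of it does.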
To finish this lecture, let's write down a formula for the volume of a general body in spherical polar coordinates. Let $K$ be such a body with 0 in its interior, and for each direction $\theta \in S^{n-1}$ let $r(\theta)$ be the radius of $K$ in this direction. Then the volume of $K$ is
$$nv_n \int_{S^{n-1}} \int_0^{r(\theta)} s^{n-1}\, ds\, d\sigma = v_n \int_{S^{n-1}} r(\theta)^n\, d\sigma(\theta).$$
This tells us a bit about particular bodies. For example, if $K$ is the cube $[-1,1]^n$, whose volume is $2^n$, the radius satisfies
$$\int_{S^{n-1}} r(\theta)^n\, d\sigma = \frac{2^n}{v_n} \approx \biggl( \sqrt{\frac{2n}{\pi e}} \biggr)^{n}.$$
So the “average” radius of the cube is about
$$\sqrt{\frac{2n}{\pi e}}.$$
This indicates that the volume of the cube tends to lie in its corners, where the radius is close to $\sqrt{n}$, not in the middle of its facets, where the radius is close to 1. In Lecture 4 we shall see that the reverse happens for $B_1^n$ and that this has a surprising consequence.
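The “average” radius claim can be checked against the exact quantity $(2^n/v_n)^{1/n}$ (an illustrative Python sketch; the names are mine):

```python
import math

def ball_volume(n):
    """v_n = pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def cube_average_radius(n):
    """The n-th power mean of r(theta) over the sphere for the cube [-1,1]^n.
    By the polar volume formula this is exactly (2^n / v_n)^(1/n)."""
    return (2 ** n / ball_volume(n)) ** (1 / n)
```

For moderate dimensions the exact value already sits within a few percent of the asymptotic $\sqrt{2n/(\pi e)}$, and well above 1 (the facet distance) though below $\sqrt{n}$ (the corner distance).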
If $K$ is (centrally) symmetric, that is, if $-x \in K$ whenever $x \in K$, then $K$ is the unit ball of some norm $\|\cdot\|_K$ on $\mathbb{R}^n$:
$$K = \{ x : \|x\|_K \le 1 \}.$$
This was already mentioned for the octahedron, which is the unit ball of the $\ell_1$ norm
$$\|x\| = \sum_1^n |x_i|.$$
The norm and radius are easily seen to be related by
$$r(\theta) = \frac{1}{\|\theta\|}, \quad \text{for } \theta \in S^{n-1},$$
since $r(\theta)$ is the largest number $r$ for which $r\theta \in K$. Thus, for a general symmetric body $K$ with associated norm $\|\cdot\|$, we have this formula for the volume:
$$\mathrm{vol}(K) = v_n \int_{S^{n-1}} \|\theta\|^{-n}\, d\sigma(\theta).$$
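This volume formula lends itself to a Monte Carlo check: sample uniform directions $\theta$, average $\|\theta\|^{-n}$, and multiply by $v_n$. The sketch below (illustrative Python; the names are mine) does this for the octahedron $B_1^3$, whose volume we computed to be $2^3/3! = 4/3$:

```python
import math
import random

def l1_ball_volume_polar(n, samples=100_000, seed=1):
    """Monte Carlo evaluation of vol(K) = v_n * E[ ||theta||^(-n) ] for
    K = B_1^n, sampling uniform directions via normalised Gaussian vectors."""
    rng = random.Random(seed)
    v_n = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    total = 0.0
    for _ in range(samples):
        g = [rng.gauss(0, 1) for _ in range(n)]
        norm2 = math.sqrt(sum(x * x for x in g))
        theta = [x / norm2 for x in g]       # uniform point on S^{n-1}
        l1 = sum(abs(x) for x in theta)      # the l_1 norm of theta
        total += l1 ** (-n)                  # r(theta)^n = ||theta||^(-n)
    return v_n * total / samples
```

Sampling a Gaussian vector and normalising it is the standard way to get the rotation-invariant measure $\sigma$ on the sphere, precisely because the Gaussian density is invariant under rotations.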
Lecture 2. Spherical Sections of the Cube

In the first lecture it was explained that the cube is rather unlike a Euclidean ball in $\mathbb{R}^n$: the cube $[-1,1]^n$ includes a ball of radius 1 and no more, and is included in a ball of radius $\sqrt{n}$ and no less. The cube is a bad approximation to the Euclidean ball. In this lecture we shall take this point a bit further. A body like the cube, which is bounded by a finite number of flat facets, is called a polytope. Among symmetric polytopes, the cube has the fewest possible facets, namely $2n$. The question we shall address here is this:

If $K$ is a polytope in $\mathbb{R}^n$ with $m$ facets, how well can $K$ approximate the Euclidean ball?
Let's begin by clarifying the notion of approximation. To simplify matters we shall only consider symmetric bodies. By analogy with the remarks above, we could define the distance between two convex bodies $K$ and $L$ to be the smallest $d$ for which there is a scaled copy of $L$ inside $K$ and another copy of $L$, $d$ times as large, containing $K$. However, for most purposes, it will be more convenient to have an affine-invariant notion of distance: for example we want to regard all parallelograms as the same. Therefore:
Definition. The distance $d(K, L)$ between symmetric convex bodies $K$ and $L$ is the least positive $d$ for which there is a linear image $\tilde{L}$ of $L$ such that $\tilde{L} \subset K \subset d\tilde{L}$. (See Figure 7.)

Note that this distance is multiplicative, not additive: in order to get a metric (on the set of linear equivalence classes of symmetric convex bodies) we would need to take $\log d$ instead of $d$. In particular, if $K$ and $L$ are identical then $d(K, L) = 1$.
Figure 7. Defining the distance between $K$ and $L$: $\tilde{L} \subset K \subset d\tilde{L}$.
Our observations of the last lecture show that the distance between the cube and the Euclidean ball in $\mathbb{R}^n$ is at most $\sqrt{n}$. It is intuitively clear that it really is $\sqrt{n}$, i.e., that we cannot find a linear image of the ball that sandwiches the cube any better than the obvious one. A formal proof will be immediate after the next lecture.

The main result of this lecture will imply that, if a polytope is to have small distance from the Euclidean ball, it must have very many facets: exponentially many in the dimension $n$.
Theorem 2.1. Let $K$ be a (symmetric) polytope in $\mathbb{R}^n$ with $d(K, B_2^n) = d$. Then $K$ has at least $e^{n/(2d^2)}$ facets. On the other hand, for each $n$, there is a polytope with $4^n$ facets whose distance from the ball is at most 2.

The arguments in this lecture, including the result just stated, go back to the early days of packing and covering problems. A classical reference for the subject is [Rogers 1964].
Before we embark upon a proof of Theorem 2.1, let's look at a reformulation that will motivate several more sophisticated results later on. A symmetric convex body in $\mathbb{R}^n$ with $m$ pairs of facets can be realised as an $n$-dimensional slice (through the centre) of the cube in $\mathbb{R}^m$. This is because such a body is the intersection of $m$ slabs in $\mathbb{R}^n$, each of the form $\{x : |\langle x, v \rangle| \le 1\}$, for some nonzero vector $v$ in $\mathbb{R}^n$. This is shown in Figure 8.
Thus $K$ is the set $\{x : |\langle x, v_i \rangle| \le 1 \text{ for } 1 \le i \le m\}$, for some sequence $(v_i)_1^m$ of vectors in $\mathbb{R}^n$. The linear map
$$T : x \mapsto \bigl( \langle x, v_1 \rangle, \ldots, \langle x, v_m \rangle \bigr)$$
embeds $\mathbb{R}^n$ as a subspace $H$ of $\mathbb{R}^m$, and the intersection of $H$ with the cube $[-1,1]^m$ is the set of points $y$ in $H$ for which $|y_i| \le 1$ for each coordinate $i$. So this intersection is the image of $K$ under $T$. Conversely, any $n$-dimensional slice of $[-1,1]^m$ is a body with at most $m$ pairs of faces. Thus, the result we are aiming at can be formulated as follows:

The cube in $\mathbb{R}^m$ has almost spherical sections whose dimension $n$ is roughly $\log m$ and not more.
Figure 8. Any symmetric polytope is a section of a cube.

In Lecture 9 we shall see that all symmetric $m$-dimensional convex bodies have almost spherical sections of dimension about $\log m$. As one might expect, this is a great deal more difficult to prove for general bodies than just for the cube.
For the proof of Theorem 2.1, let's forget the symmetry assumption again and just ask for a polytope
$$K = \{ x : \langle x, v_i \rangle \le 1 \text{ for } 1 \le i \le m \}$$
with $m$ facets for which
$$B_2^n \subset K \subset dB_2^n.$$
What do these inclusions say about the vectors $(v_i)$? The first implies that each $v_i$ has length at most 1, since, if not, $v_i/|v_i|$ would be a vector in $B_2^n$ but not in $K$. The second inclusion says that if $x$ does not belong to $dB_2^n$ then $x$ does not belong to $K$: that is, if $|x| > d$, there is an $i$ for which $\langle x, v_i \rangle > 1$. This is equivalent to the assertion that for every unit vector $\theta$ there is an $i$ for which
$$\langle \theta, v_i \rangle \ge \frac{1}{d}.$$
Thus our problem is to look for as few vectors as possible, $v_1, v_2, \ldots, v_m$, of length at most 1, with the property that for every unit vector $\theta$ there is some $v_i$ with $\langle \theta, v_i \rangle \ge 1/d$. It is clear that we cannot do better than to look for vectors of length exactly 1: that is, that we may assume that all facets of our polytope touch the ball. Henceforth we shall discuss only such vectors.
For a fixed unit vector $v$ and $\varepsilon \in [0, 1)$, the set $C(\varepsilon, v)$ of $\theta \in S^{n-1}$ for which $\langle \theta, v \rangle \ge \varepsilon$ is called a spherical cap (or just a cap); when we want to be precise, we will call it the $\varepsilon$-cap about $v$. (Note that $\varepsilon$ does not refer to the radius!) See Figure 9, left.
We want every $\theta \in S^{n-1}$ to belong to at least one of the $(1/d)$-caps determined by the $(v_i)$. So our task is to estimate the number of caps of a given size needed to cover the entire sphere. The principal tool for doing this will be upper and lower estimates for the area of a spherical cap. As in the last lecture, we shall measure this area as a proportion of the sphere: that is, we shall use $\sigma_{n-1}$ as our measure. Clearly, if we show that each cap occupies only a small proportion of the sphere, we can conclude that we need plenty of caps to cover the sphere. What is slightly more surprising is that once we have shown that spherical caps are not too small, we will also be able to deduce that we can cover the sphere without using too many.

Figure 9. Left: the $\varepsilon$-cap $C(\varepsilon, v)$ about $v$. Right: the cap of radius $r$ about $v$.
In order to state the estimates for the areas of caps, it will sometimes be convenient to measure the size of a cap in terms of its radius, instead of using the $\varepsilon$ measure. The cap of radius $r$ about $v$ is
$$\bigl\{ \theta \in S^{n-1} : |\theta - v| \le r \bigr\},$$
as illustrated in Figure 9, right. (In defining the radius of a cap in this way we are implicitly adopting a particular metric on the sphere: the metric induced by the usual Euclidean norm on $\mathbb{R}^n$.) The two estimates we shall use are given in the following lemmas.
Lemma 2.2 (Upper bound for spherical caps). For $0 \le \varepsilon < 1$, the cap $C(\varepsilon, u)$ on $S^{n-1}$ has measure at most $e^{-n\varepsilon^2/2}$.
Lemma 2.3 (Lower bound for spherical caps). For $0 \le r \le 2$, a cap of radius $r$ on $S^{n-1}$ has measure at least $\frac{1}{2}(r/2)^{n-1}$.
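Both lemmas can be sanity-checked against the exact cap measure, which (a standard fact, not proved in the text) is $\sigma(C(\varepsilon,v)) = (A_{n-2}/A_{n-1}) \int_0^{\arccos \varepsilon} \sin^{n-2} t \, dt$, where $A_{k-1} = k v_k$ is the surface area of $S^{k-1}$. A cap of radius $r$ is the $\varepsilon$-cap with $\varepsilon = 1 - r^2/2$, since $|\theta - v|^2 = 2 - 2\langle\theta, v\rangle$. An illustrative Python sketch (the names are mine):

```python
import math

def cap_measure(n, eps, steps=20_000):
    """Normalised measure of the eps-cap on S^{n-1}:
    sigma = (A_{n-2} / A_{n-1}) * integral_0^{arccos(eps)} sin^{n-2}(t) dt,
    computed with the midpoint rule."""
    def sphere_area(k):               # surface area of S^{k-1} = k * v_k
        return k * math.pi ** (k / 2) / math.gamma(k / 2 + 1)
    ratio = sphere_area(n - 1) / sphere_area(n)
    a = math.acos(eps)
    h = a / steps
    integral = sum(math.sin((i + 0.5) * h) ** (n - 2) * h for i in range(steps))
    return ratio * integral
```

With this in hand one can confirm that the measure never exceeds $e^{-n\varepsilon^2/2}$ and never falls below $\frac12 (r/2)^{n-1}$ in the stated ranges, and that the hemisphere ($\varepsilon = 0$) has measure exactly $\frac12$.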
We can now prove Theorem 2.1.

Proof. Lemma 2.2 implies the first assertion of Theorem 2.1 immediately. If $K$ is a polytope in $\mathbb{R}^n$ with $m$ facets and if $B_2^n \subset K \subset dB_2^n$, we can find $m$ caps $C(\frac{1}{d}, v_i)$ covering $S^{n-1}$. Each covers at most $e^{-n/(2d^2)}$ of the sphere, so
$$m \ge \exp\Bigl( \frac{n}{2d^2} \Bigr).$$
To get the second assertion of Theorem 2.1 from Lemma 2.3 we need a little more argument. It suffices to find $m = 4^n$ points $v_1, v_2, \ldots, v_m$ on the sphere so that the caps of radius 1 centred at these points cover the sphere: see Figure 10. Such a set of points is called a 1-net for the sphere: every point of the sphere is within distance 1 of some $v_i$. Now, if we choose a set of points on the sphere any two of which are at least distance 1 apart, this set cannot have too many points. (Such a set is called 1-separated.) The reason is that the caps of radius $\frac{1}{2}$ about these points will be disjoint, so the sum of their areas will be at most 1. A cap of radius $\frac{1}{2}$ has area at least $\frac{1}{4^n}$, so the number $m$ of these caps satisfies $m \le 4^n$.

Figure 10. The $\frac{1}{2}$-cap has radius 1.
This does the job, because a maximal 1-separated set is automatically a 1-net: if we can't add any further points that are at least distance 1 from all the points we have got, it can only be because every point of the sphere is within distance 1 of at least one of our chosen points. So the sphere has a 1-net consisting of only $4^n$ points, which is what we wanted to show.
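The maximality argument is easy to simulate. The sketch below (illustrative Python, on $S^2$; the construction and tolerances are mine) greedily builds a 1-separated set from a dense cloud of random candidate points. By maximality over the candidates it is, up to the candidate spacing, a 1-net, and its size is far below the bound $4^3 = 64$:

```python
import math
import random

def greedy_separated_net(n=3, candidates=20_000, seed=2):
    """Greedily build a 1-separated set on S^{n-1} from random candidates;
    every candidate not added is within distance 1 of the chosen set."""
    rng = random.Random(seed)
    def sphere_point():
        g = [rng.gauss(0, 1) for _ in range(n)]
        s = math.sqrt(sum(x * x for x in g))
        return tuple(x / s for x in g)
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    net = []
    for _ in range(candidates):
        p = sphere_point()
        if all(dist(p, v) >= 1 for v in net):
            net.append(p)
    return net
```

With 20,000 candidates the net covers fresh random points up to a small slack (taken below as distance 1.1) accounting for the gaps between candidates.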
Lemmas 2.2 and 2.3 are routine calculations that can be done in many ways. We leave Lemma 2.3 to the dedicated reader. Lemma 2.2, which will be quoted throughout Lectures 8 and 9, is closely related to the Gaussian decay of the volume of the ball described in the last lecture. At least for smallish $\varepsilon$ (which is the interesting range) it can be proved as follows.
Proof. The proportion of $S^{n-1}$ belonging to the cap $C(\varepsilon, u)$ equals the proportion of the solid ball that lies in the “spherical cone” illustrated in Figure 11. As is also illustrated, this spherical cone is contained in a ball of radius $\sqrt{1 - \varepsilon^2}$ (if $\varepsilon \le 1/\sqrt{2}$), so the ratio of its volume to that of the ball is at most
$$\bigl( 1 - \varepsilon^2 \bigr)^{n/2} \le e^{-n\varepsilon^2/2}.$$

Figure 11. Estimating the area of a cap.
In Lecture 8 we shall quote the upper estimate for areas of caps repeatedly. We shall in fact be using yet a third way to measure caps that differs very slightly from the $C(\varepsilon, u)$ description. The reader can easily check that the preceding argument yields the same estimate $e^{-n\varepsilon^2/2}$ for this other description.
Lecture 3. Fritz John's Theorem

In the first lecture we saw that the cube and the cross-polytope lie at distance at most $\sqrt{n}$ from the Euclidean ball in $\mathbb{R}^n$, and that for the simplex, the distance is at most $n$. It is intuitively clear that these estimates cannot be improved. In this lecture we shall describe a strong sense in which this is as bad as things get. The theorem we shall describe was proved by Fritz John [1948].
John considered ellipsoids inside convex bodies. If $(e_j)_1^n$ is an orthonormal basis of $\mathbb{R}^n$ and $(\alpha_j)$ are positive numbers, the ellipsoid
$$\Bigl\{ x : \sum_1^n \frac{\langle x, e_j \rangle^2}{\alpha_j^2} \le 1 \Bigr\}$$
has volume $v_n \prod \alpha_j$. John showed that each convex body contains a unique ellipsoid of largest volume and, more importantly, he characterised it. He showed that if $K$ is a symmetric convex body in $\mathbb{R}^n$ and $\mathcal{E}$ is its maximal ellipsoid then
$$K \subset \sqrt{n}\, \mathcal{E}.$$
Hence, after an affine transformation (one taking $\mathcal{E}$ to $B_2^n$) we can arrange that
$$B_2^n \subset K \subset \sqrt{n}\, B_2^n.$$
A nonsymmetric $K$ may require $nB_2^n$, like the simplex, rather than $\sqrt{n}\, B_2^n$.
John's characterisation of the maximal ellipsoid is best expressed after an affine transformation that takes the maximal ellipsoid to $B_2^n$. The theorem states that $B_2^n$ is the maximal ellipsoid in $K$ if a certain condition holds: roughly, that there be plenty of points of contact between the boundary of $K$ and that of the ball. See Figure 12.
Theorem 3.1 (John's Theorem). Each convex body $K$ contains a unique ellipsoid of maximal volume. This ellipsoid is $B_2^n$ if and only if the following conditions are satisfied: $B_2^n \subset K$ and (for some $m$) there are Euclidean unit vectors $(u_i)_1^m$ on the boundary of $K$ and positive numbers $(c_i)_1^m$ satisfying
$$\sum c_i u_i = 0 \tag{3.1}$$
and
$$\sum c_i \langle x, u_i \rangle^2 = |x|^2 \quad \text{for each } x \in \mathbb{R}^n. \tag{3.2}$$
Figure 12. The maximal ellipsoid touches the boundary at many points.

According to the theorem the points at which the sphere touches $\partial K$ can be given a mass distribution whose centre of mass is the origin and whose inertia tensor is the identity matrix. Let's see where these conditions come from. The first condition, (3.1), guarantees that the $(u_i)$ do not all lie “on one side of the sphere”. If they did, we could move the ball away from these contact points and blow it up a bit to obtain a larger ball in $K$. See Figure 13.
The second condition, (3.2), shows that the $(u_i)$ behave rather like an orthonormal basis in that we can resolve the Euclidean norm as a (weighted) sum of squares of inner products. Condition (3.2) is equivalent to the statement that
$$x = \sum c_i \langle x, u_i \rangle\, u_i \quad \text{for all } x.$$
This guarantees that the $(u_i)$ do not all lie close to a proper subspace of $\mathbb{R}^n$. If they did, we could shrink the ball a little in this subspace and expand it in an orthogonal direction, to obtain a larger ellipsoid inside $K$. See Figure 14.
Condition (3.2) is often written in matrix (or operator) notation as
$$\sum c_i\, u_i \otimes u_i = I_n, \tag{3.3}$$
where $I_n$ is the identity map on $\mathbb{R}^n$ and, for any unit vector $u$, $u \otimes u$ is the rank-one orthogonal projection onto the span of $u$, that is, the map $x \mapsto \langle x, u \rangle u$. The trace of such an orthogonal projection is 1. By equating the traces of the matrices in the preceding equation, we obtain
$$\sum c_i = n.$$

Figure 13. An ellipsoid where all contacts are on one side can grow.

Figure 14. An ellipsoid (solid circle) whose contact points are all near one plane can grow.
In the case of a symmetric convex body, condition (3.1) is redundant, since we can take any sequence $(u_i)$ of contact points satisfying condition (3.2) and replace each $u_i$ by the pair $\pm u_i$, each with half the weight of the original.
Let's look at a few concrete examples. The simplest is the cube. For this body the maximal ellipsoid is $B_2^n$, as one would expect. The contact points are the standard basis vectors $(e_1, e_2, \ldots, e_n)$ of $\mathbb{R}^n$ and their negatives, and they satisfy
$$\sum_1^n e_i \otimes e_i = I_n.$$
That is, one can take all the weights $c_i$ equal to 1 in (3.2). See Figure 15.
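The cube computation can be verified mechanically. The sketch below (illustrative Python; the helper names are mine) forms $\sum c_i\, u_i \otimes u_i$ for the signed contact points $\pm e_i$, with weight $\frac12$ on each of the $2n$ vectors as in the symmetrisation remark above, so that (3.1), (3.2) and $\sum c_i = n$ all hold:

```python
def outer(u, v):
    """Rank-one matrix u (x) v as a list of rows."""
    return [[a * b for b in v] for a in u]

def johns_condition(us, cs):
    """Return sum_i c_i u_i (x) u_i; John's condition (3.2) says this
    should be the identity matrix when the u_i are contact points."""
    n = len(us[0])
    M = [[0.0] * n for _ in range(n)]
    for u, c in zip(us, cs):
        P = outer(u, u)
        for i in range(n):
            for j in range(n):
                M[i][j] += c * P[i][j]
    return M
```

The same helper lets one experiment with other contact configurations, e.g. three unit vectors at angles $120^\circ$ apart in the plane with equal weights $\frac23$.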
The simplest nonsymmetric example is the simplex. In general, there is no natural way to place a regular simplex in $\mathbb{R}^n$, so there is no natural description of the contact points. Usually the best way to talk about an $n$-dimensional simplex is to realise it in $\mathbb{R}^{n+1}$: for example as the convex hull of the standard basis vectors in $\mathbb{R}^{n+1}$. We leave it as an exercise for the reader to come up with a nice description of the contact points.

Figure 15. The maximal ellipsoid for the cube.

Figure 16. $K$ is contained in the convex body determined by the hyperplanes tangent to the maximal ellipsoid at the contact points.
One may get a bit more of a feel for the second condition in John's Theorem by interpreting it as a rigidity condition. A sequence of unit vectors $(u_i)$ satisfying the condition (for some sequence $(c_i)$) has the property that if $T$ is a linear map of determinant 1, not all the images $Tu_i$ can have Euclidean norm less than 1.
John's characterisation immediately implies the inclusion mentioned earlier: if $K$ is symmetric and $\mathcal{E}$ is its maximal ellipsoid then $K \subset \sqrt{n}\, \mathcal{E}$. To check this we may assume $\mathcal{E} = B_2^n$. At each contact point $u_i$, the convex bodies $B_2^n$ and $K$ have the same supporting hyperplane. For $B_2^n$, the supporting hyperplane at any point $u$ is perpendicular to $u$. Thus if $x \in K$ we have $\langle x, u_i \rangle \le 1$ for each $i$, and we conclude that $K$ is a subset of the convex body
$$C = \{ x \in \mathbb{R}^n : \langle x, u_i \rangle \le 1 \text{ for } 1 \le i \le m \}. \tag{3.4}$$
An example of this is shown in Figure 16.
In the symmetric case, the same argument shows that for each $x \in K$ we have $|\langle x, u_i \rangle| \le 1$ for each $i$. Hence, for $x \in K$,
$$|x|^2 = \sum c_i \langle x, u_i \rangle^2 \le \sum c_i = n.$$
So $|x| \le \sqrt{n}$, which is exactly the statement $K \subset \sqrt{n}\, B_2^n$. We leave as a slightly trickier exercise the estimate $|x| \le n$ in the nonsymmetric case.
Proof of John's Theorem. The proof is in two parts, the harder of which is to show that if $B_2^n$ is an ellipsoid of largest volume, then we can find an appropriate system of weights on the contact points. The easier part is to show that if such a system of weights exists, then $B_2^n$ is the unique ellipsoid of maximal volume. We shall describe the proof only in the symmetric case, since the added complications in the general case add little to the ideas.
We begin with the easier part. Suppose there are unit vectors $(u_i)$ in $\partial K$ and numbers $(c_i)$ satisfying
$$\sum c_i\, u_i \otimes u_i = I_n.$$
Let
$$\mathcal{E} = \Bigl\{ x : \sum_1^n \frac{\langle x, e_j \rangle^2}{\alpha_j^2} \le 1 \Bigr\}$$
be an ellipsoid in $K$, for some orthonormal basis $(e_j)$ and positive $\alpha_j$. We want to show that
$$\prod_1^n \alpha_j \le 1,$$
and that the product is equal to 1 only if $\alpha_j = 1$ for all $j$.
Since $\mathcal{E} \subset K$ we have that for each $i$ the hyperplane $\{x : \langle x, u_i \rangle = 1\}$ does not cut $\mathcal{E}$. This implies that each $u_i$ belongs to the polar ellipsoid
$$\Bigl\{ y : \sum_1^n \alpha_j^2 \langle y, e_j \rangle^2 \le 1 \Bigr\}.$$
(The reader unfamiliar with duality is invited to check this.) So, for each $i$,
$$\sum_{j=1}^n \alpha_j^2 \langle u_i, e_j \rangle^2 \le 1.$$
Hence
$$\sum_i c_i \sum_j \alpha_j^2 \langle u_i, e_j \rangle^2 \le \sum c_i = n.$$
But the left side of the inequality is just $\sum_j \alpha_j^2$, because, by condition (3.2), we have
$$\sum_i c_i \langle u_i, e_j \rangle^2 = |e_j|^2 = 1$$
for each $j$.
Finally, the fact that the geometric mean does not exceed the arithmetic mean (the AM/GM inequality) implies that
$$\Bigl( \prod \alpha_j^2 \Bigr)^{1/n} \le \frac{1}{n} \sum \alpha_j^2 \le 1,$$
and there is equality in the first of these inequalities only if all $\alpha_j$ are equal to 1.
We now move to the harder direction. Suppose $B_2^n$ is an ellipsoid of largest volume in $K$. We want to show that there is a sequence of contact points $(u_i)$ and positive weights $(c_i)$ with
$$\frac{1}{n} I_n = \frac{1}{n} \sum c_i\, u_i \otimes u_i.$$
We already know that, if this is possible, we must have
$$\sum \frac{c_i}{n} = 1.$$
So our aim is to show that the matrix $I_n/n$ can be written as a convex combination of (a finite number of) matrices of the form $u \otimes u$, where each $u$ is a contact point. Since the space of matrices is finite-dimensional, the problem is simply to show that $I_n/n$ belongs to the convex hull of the set of all such rank-one matrices,
$$T = \{ u \otimes u : u \text{ is a contact point} \}.$$
We shall aim to get a contradiction by showing that if $I_n/n$ is not in the convex hull of $T$, we can perturb the unit ball slightly to get a new ellipsoid in $K$ of larger volume than the unit ball.
Suppose that $I_n/n$ is not in the convex hull of $T$. Apply the separation theorem in the space of matrices to get a linear functional $\phi$ (on this space) with the property that
$$\phi\Bigl( \frac{I_n}{n} \Bigr) < \phi(u \otimes u) \tag{3.5}$$
for each contact point $u$. Observe that $\phi$ can be represented by an $n \times n$ matrix $H = (h_{jk})$, so that, for any matrix $A = (a_{jk})$,
$$\phi(A) = \sum_{jk} h_{jk} a_{jk}.$$
Since all the matrices $u \otimes u$ and $I_n/n$ are symmetric, we may assume the same for $H$. Moreover, since these matrices all have the same trace, namely 1, the inequality $\phi(I_n/n) < \phi(u \otimes u)$ will remain unchanged if we add a constant to each diagonal entry of $H$. So we may assume that the trace of $H$ is 0: but this says precisely that $\phi(I_n) = 0$.
Hence, unless the identity has the representation we want, we have found a symmetric matrix $H$ with zero trace for which
$$\sum_{jk} h_{jk} (u \otimes u)_{jk} > 0$$
for every contact point $u$. We shall use this $H$ to build a bigger ellipsoid inside $K$.
Now, for each vector $u$, the expression
$$\sum_{jk} h_{jk} (u \otimes u)_{jk}$$
is just the number $u^T H u$. For sufficiently small $\delta > 0$, the set
$$\mathcal{E}_\delta = \bigl\{ x \in \mathbb{R}^n : x^T (I_n + \delta H)\, x \le 1 \bigr\}$$
is an ellipsoid, and as $\delta$ tends to 0 these ellipsoids approach $B_2^n$. If $u$ is one of the original contact points, then
$$u^T (I_n + \delta H)\, u = 1 + \delta\, u^T H u > 1,$$
so $u$ does not belong to $\mathcal{E}_\delta$. Since the boundary of $K$ is compact (and the function $x \mapsto x^T H x$ is continuous), $\mathcal{E}_\delta$ will not contain any other point of $\partial K$ as long as $\delta$ is sufficiently small. Thus, for such $\delta$, the ellipsoid $\mathcal{E}_\delta$ is strictly inside $K$, and some slightly expanded ellipsoid is inside $K$.
It remains to check that each $E_\delta$ has volume at least that of $B_2^n$. If we denote by $(\mu_j)$ the eigenvalues of the symmetric matrix $I_n + \delta H$, the volume of $E_\delta$ is $v_n \prod \mu_j^{-1/2}$, so the problem is to show that, for each $\delta$, we have $\prod \mu_j \le 1$. What we know is that $\sum \mu_j$ is the trace of $I_n + \delta H$, which is $n$, since the trace of $H$ is 0. So the AM/GM inequality again gives
$$\left(\prod \mu_j\right)^{1/n} \le \frac{1}{n} \sum \mu_j \le 1,$$
as required.
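The AM/GM step can be watched numerically. The sketch below is my own illustration, not from the text: after diagonalising, a traceless symmetric $H$ may as well be taken diagonal, so the eigenvalues of $I_n + \delta H$ are $1 + \delta h_j$ with $\sum h_j = 0$, and their product cannot exceed 1.

```python
import math
import random

random.seed(0)
n = 8

# Entries of a random traceless diagonal H: these are its eigenvalues.
h = [random.uniform(-1.0, 1.0) for _ in range(n)]
mean = sum(h) / n
h = [x - mean for x in h]              # now trace(H) = 0

delta = 0.1
mu = [1 + delta * x for x in h]        # eigenvalues of I_n + delta*H

assert abs(sum(mu) - n) < 1e-9         # trace unchanged: the mu_j sum to n
prod = math.prod(mu)
print(prod)                            # AM/GM forces this to be at most 1
assert prod <= 1.0
```

Since $\operatorname{vol}(E_\delta) = v_n \prod \mu_j^{-1/2}$, a product below 1 means the perturbed ellipsoid is at least as large as $B_2^n$.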
There is an analogue of John's Theorem that characterises the unique ellipsoid of minimal volume containing a given convex body. (The characterisation is almost identical, guaranteeing a resolution of the identity in terms of contact points of the body and the Euclidean sphere.) This minimal volume ellipsoid theorem can be deduced directly from John's Theorem by duality. It follows that, for example, the ellipsoid of minimal volume containing the cube $[-1,1]^n$ is the obvious one: the ball of radius $\sqrt{n}$. It has been mentioned several times without proof that the distance of the cube from the Euclidean ball in $\mathbb{R}^n$ is exactly $\sqrt{n}$. We can now see this easily: the ellipsoid of minimal volume outside the cube has volume $\sqrt{n}^{\,n}$ times that of the ellipsoid of maximal volume inside the cube. So we cannot sandwich the cube between homothetic ellipsoids unless the outer one is at least $\sqrt{n}$ times the inner one.
We shall be using John's Theorem several times in the remaining lectures. At this point it is worth mentioning important extensions of the result. We can view John's Theorem as a description of those linear maps from Euclidean space to a normed space (whose unit ball is $K$) that have largest determinant, subject to the condition that they have norm at most 1: that is, that they map the Euclidean ball into $K$. There are many other norms that can be put on the space of linear maps. John's Theorem is the starting point for a general theory that builds ellipsoids related to convex bodies by maximising determinants subject to other constraints on linear maps. This theory played a crucial role in the development of convex geometry over the last 15 years. This development is described in detail in [Tomczak-Jaegermann 1988].
Lecture 4. Volume Ratios and Spherical Sections of the Octahedron
In the second lecture we saw that the $n$-dimensional cube has almost spherical sections of dimension about $\log n$ but not more. In this lecture we will examine the corresponding question for the $n$-dimensional cross-polytope $B_1^n$. In itself, this body is as far from the Euclidean ball as is the cube in $\mathbb{R}^n$: its distance from the ball, in the sense described in Lecture 2, is $\sqrt{n}$. Remarkably, however, it has almost spherical sections whose dimension is about $\frac{1}{2}n$. We shall deduce this from what is perhaps an even more surprising statement concerning intersections of bodies. Recall that $B_1^n$ contains the Euclidean ball of radius $\frac{1}{\sqrt{n}}$. If $U$ is an orthogonal transformation of $\mathbb{R}^n$ then $UB_1^n$ also contains this ball and hence so does the intersection $B_1^n \cap UB_1^n$. But, whereas $B_1^n$ does not lie in any Euclidean ball of radius less than 1, we have the following theorem [Kašin 1977]:
Theorem 4.1. For each $n$, there is an orthogonal transformation $U$ for which the intersection $B_1^n \cap UB_1^n$ is contained in the Euclidean ball of radius $32/\sqrt{n}$ (and contains the ball of radius $1/\sqrt{n}$).
(The constant 32 can easily be improved: the important point is that it is independent of the dimension $n$.) The theorem states that the intersection of just two copies of the $n$-dimensional octahedron may be approximately spherical. Notice that if we tried to approximate the Euclidean ball by intersecting rotated copies of the cube, we would need exponentially many in the dimension, because the cube has only $2n$ facets and our approximation needs exponentially many facets. In contrast, the octahedron has a much larger number of facets, $2^n$: but of course we need to do a lot more than just count facets in order to prove Theorem 4.1. Before going any further we should perhaps remark that the cube has a property that corresponds to Theorem 4.1. If $Q$ is the cube and $U$ is the same orthogonal transformation as in the theorem, the convex hull
$$\operatorname{conv}(Q \cup UQ)$$
is at distance at most 32 from the Euclidean ball.
In spirit, the argument we shall use to establish Theorem 4.1 is Kašin's original one but, following [Szarek 1978], we isolate the main ingredient and we organise the proof along the lines of [Pisier 1989]. Some motivation may be helpful. The ellipsoid of maximal volume inside $B_1^n$ is the Euclidean ball of radius $\frac{1}{\sqrt{n}}$. (See Figure 3.) There are $2^n$ points of contact between this ball and the boundary of $B_1^n$: namely, the points of the form
$$\left(\pm\frac{1}{n}, \pm\frac{1}{n}, \ldots, \pm\frac{1}{n}\right),$$
one in the middle of each facet of $B_1^n$. The vertices,
$$(\pm 1, 0, 0, \ldots, 0), \ldots, (0, 0, \ldots, 0, \pm 1),$$
are the points of $B_1^n$ furthest from the origin. We are looking for a rotation $UB_1^n$ whose facets chop off the spikes of $B_1^n$ (or vice versa). So we want to know that the points of $B_1^n$ at distance about $1/\sqrt{n}$ from the origin are fairly typical, while those at distance 1 are atypical.
For a unit vector $\theta \in S^{n-1}$, let $r(\theta)$ be the radius of $B_1^n$ in the direction $\theta$:
$$r(\theta) = \frac{1}{\|\theta\|_1} = \left(\sum_1^n |\theta_i|\right)^{-1}.$$
In the first lecture it was explained that the volume of $B_1^n$ can be written
$$v_n \int_{S^{n-1}} r(\theta)^n \, d\sigma$$
and that it is equal to $2^n/n!$. Hence
$$\int_{S^{n-1}} r(\theta)^n \, d\sigma = \frac{2^n}{n!\, v_n} \le \left(\frac{2}{\sqrt{n}}\right)^n.$$
Since the average of $r(\theta)^n$ is at most $(2/\sqrt{n})^n$, the value of $r(\theta)$ cannot often be much more than $2/\sqrt{n}$. This feature of $B_1^n$ is captured in the following definition of Szarek.
Definition. Let $K$ be a convex body in $\mathbb{R}^n$. The volume ratio of $K$ is
$$\operatorname{vr}(K) = \left(\frac{\operatorname{vol}(K)}{\operatorname{vol}(\mathcal{E})}\right)^{1/n},$$
where $\mathcal{E}$ is the ellipsoid of maximal volume in $K$.
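Both volume ratios discussed below can be evaluated from closed formulas: $\operatorname{vol}(B_1^n) = 2^n/n!$, $v_n = \pi^{n/2}/\Gamma(\frac{n}{2}+1)$, and the maximal ellipsoids are the balls of radius $1/\sqrt{n}$ (cross-polytope) and 1 (cube $[-1,1]^n$). A numerical sketch of my own, working in logarithms to avoid overflow:

```python
import math

def log_vn(n):
    # log of the volume of the Euclidean unit ball in R^n
    return (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)

def vr_cross(n):
    # vol(B_1^n) = 2^n / n!; its maximal ellipsoid is the ball of radius 1/sqrt(n)
    log_vol = n * math.log(2) - math.lgamma(n + 1)
    log_ell = log_vn(n) - (n / 2) * math.log(n)
    return math.exp((log_vol - log_ell) / n)

def vr_cube(n):
    # vol([-1,1]^n) = 2^n; its maximal ellipsoid is the unit ball
    return math.exp((n * math.log(2) - log_vn(n)) / n)

for n in (2, 10, 100, 1000):
    print(n, round(vr_cross(n), 4), round(vr_cube(n), 4))
```

The cross-polytope values increase towards $\sqrt{2e/\pi} \approx 1.32$, comfortably below 2, while the cube values grow like $\sqrt{n}/2$.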
The preceding discussion shows that $\operatorname{vr}(B_1^n) \le 2$ for all $n$. Contrast this with the cube in $\mathbb{R}^n$, whose volume ratio is about $\sqrt{n}/2$. The only property of $B_1^n$ that we shall use to prove Kašin's Theorem is that its volume ratio is at most 2. For convenience, we scale everything up by a factor of $\sqrt{n}$ and prove the following.
Theorem 4.2. Let $K$ be a symmetric convex body in $\mathbb{R}^n$ that contains the Euclidean unit ball $B_2^n$ and for which
$$\left(\frac{\operatorname{vol}(K)}{\operatorname{vol}(B_2^n)}\right)^{1/n} = R.$$
Then there is an orthogonal transformation $U$ of $\mathbb{R}^n$ for which
$$K \cap UK \subset 8R^2 B_2^n.$$
Proof. It is convenient to work with the norm on $\mathbb{R}^n$ whose unit ball is $K$. Let $\|\cdot\|$ denote this norm and $|\cdot|$ the standard Euclidean norm. Notice that, since $B_2^n \subset K$, we have $\|x\| \le |x|$ for all $x \in \mathbb{R}^n$.

The radius of the body $K \cap UK$ in a given direction is the minimum of the radii of $K$ and $UK$ in that direction. So the norm corresponding to the body $K \cap UK$ is the maximum of the norms corresponding to $K$ and $UK$. We need to find an orthogonal transformation $U$ with the property that
$$\max\left(\|U\theta\|, \|\theta\|\right) \ge \frac{1}{8R^2}$$
for every $\theta \in S^{n-1}$. Since the maximum of two numbers is at least their average, it will suffice to find $U$ with
$$\frac{\|U\theta\| + \|\theta\|}{2} \ge \frac{1}{8R^2} \quad \text{for all } \theta.$$
For each $x \in \mathbb{R}^n$ write $N(x)$ for the average $\frac{1}{2}(\|Ux\| + \|x\|)$. One sees immediately that $N$ is a norm (that is, it satisfies the triangle inequality) and that $N(x) \le |x|$ for every $x$, since $U$ preserves the Euclidean norm. We shall show in a moment that there is a $U$ for which
$$\int_{S^{n-1}} \frac{1}{N(\theta)^{2n}} \, d\sigma \le R^{2n}. \tag{4.1}$$
This says that $N(\theta)$ is large on average: we want to deduce that it is large everywhere.

Let $\phi$ be a point of the sphere and write $N(\phi) = t$, for $0 < t \le 1$. The crucial point will be that, if $\theta$ is close to $\phi$, then $N(\theta)$ cannot be much more than $t$. To be precise, if $|\theta - \phi| \le t$ then
$$N(\theta) \le N(\phi) + N(\theta - \phi) \le t + |\theta - \phi| \le 2t.$$
Hence, $N(\theta)$ is at most $2t$ for every $\theta$ in a spherical cap of radius $t$ about $\phi$. From Lemma 2.3 we know that this spherical cap has measure at least
$$\frac{1}{2}\left(\frac{t}{2}\right)^{n-1} \ge \left(\frac{t}{2}\right)^n.$$
So $1/N(\theta)^{2n}$ is at least $1/(2t)^{2n}$ on a set of measure at least $(t/2)^n$. Therefore
$$\int_{S^{n-1}} \frac{1}{N(\theta)^{2n}} \, d\sigma \ge \frac{1}{(2t)^{2n}} \left(\frac{t}{2}\right)^n = \frac{1}{2^{3n} t^n}.$$
By (4.1), the integral is at most $R^{2n}$, so $t \ge 1/(8R^2)$. Thus our arbitrary point $\phi$ satisfies
$$N(\phi) \ge \frac{1}{8R^2}.$$
It remains to find $U$ satisfying (4.1). Now, for any $\theta$, we have
$$N(\theta)^2 = \left(\frac{\|U\theta\| + \|\theta\|}{2}\right)^2 \ge \|U\theta\| \, \|\theta\|,$$
so it will suffice to find a $U$ for which
$$\int_{S^{n-1}} \frac{1}{\|U\theta\|^n \, \|\theta\|^n} \, d\sigma \le R^{2n}. \tag{4.2}$$
The hypothesis on the volume of $K$ can be written in terms of the norm as
$$\int_{S^{n-1}} \frac{1}{\|\theta\|^n} \, d\sigma = R^n.$$
The group of orthogonal transformations carries an invariant probability measure. This means that we can average a function over the group in a natural way. In particular, if $f$ is a function on the sphere and $\theta$ is some point on the sphere, the average over orthogonal $U$ of the value $f(U\theta)$ is just the average of $f$ on the sphere: averaging over $U$ mimics averaging over the sphere:
$$\operatorname{ave}_U f(U\theta) = \int_{S^{n-1}} f(\phi) \, d\sigma(\phi).$$
Hence,
$$\begin{aligned}
\operatorname{ave}_U \int_{S^{n-1}} \frac{1}{\|U\theta\|^n} \cdot \frac{1}{\|\theta\|^n} \, d\sigma(\theta)
&= \int_{S^{n-1}} \left(\operatorname{ave}_U \frac{1}{\|U\theta\|^n}\right) \frac{1}{\|\theta\|^n} \, d\sigma(\theta) \\
&= \int_{S^{n-1}} \left(\int_{S^{n-1}} \frac{1}{\|\phi\|^n} \, d\sigma(\phi)\right) \frac{1}{\|\theta\|^n} \, d\sigma(\theta) \\
&= \left(\int_{S^{n-1}} \frac{1}{\|\theta\|^n} \, d\sigma(\theta)\right)^2 = R^{2n}.
\end{aligned}$$
Since the average over all $U$ of the integral is at most $R^{2n}$, there is at least one $U$ for which the integral is at most $R^{2n}$. This is exactly inequality (4.2).
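The averaging argument suggests that a typical orthogonal $U$ already works. Here is a Monte Carlo sketch of my own (numpy, the dimension and the sample sizes are arbitrary choices): take $K = \sqrt{n}\, B_1^n$, whose norm is $\|x\|_K = \|x\|_1/\sqrt{n}$ and for which $R \le 2$, and check the conclusion $\frac{1}{2}(\|U\theta\|_K + \|\theta\|_K) \ge \frac{1}{8R^2} = \frac{1}{32}$ on random directions and on the spike directions $e_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64

# Random orthogonal U: the Q-factor of a Gaussian matrix is Haar-distributed.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))

def avg_norm(x):
    # (||Ux||_K + ||x||_K) / 2 for K = sqrt(n) B_1^n, at the unit vector x/|x|
    x = x / np.linalg.norm(x)
    return (np.abs(U @ x).sum() + np.abs(x).sum()) / (2 * np.sqrt(n))

# Spike directions e_i: here ||e_i||_K = 1/sqrt(n) is tiny, so U must do the work.
spikes = min(avg_norm(e) for e in np.eye(n))
rand = min(avg_norm(v) for v in rng.standard_normal((2000, n)))
print(spikes, rand)
assert spikes > 1 / 32 and rand > 1 / 32
```

In fact both minima come out an order of magnitude above $1/32$: the $\ell_1$ norm of a random unit vector concentrates near $\sqrt{2n/\pi}$.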
The choice of $U$ in the preceding proof is a random one. The proof does not in any way tell us how to find an explicit $U$ for which the integral is small. In the case of a general body $K$, this is hardly surprising, since we are assuming nothing about how the volume of $K$ is distributed. But, in view of the earlier remarks about facets of $B_1^n$ chopping off spikes of $UB_1^n$, it is tempting to think that for the particular body $B_1^n$ we might be able to write down an appropriate $U$ explicitly. In two dimensions the best choice of $U$ is obvious: we rotate the diamond through $45^\circ$ and after intersection we have a regular octagon, as shown in Figure 17.
The most natural way to try to generalise this to higher dimensions is to look for a $U$ such that each vertex of $UB_1^n$ is exactly aligned through the centre of a facet of $B_1^n$: that is, for each standard basis vector $e_i$ of $\mathbb{R}^n$, $Ue_i$ is a multiple of one of the vectors $(\pm\frac{1}{n}, \ldots, \pm\frac{1}{n})$. (The multiple is $\sqrt{n}$, since $Ue_i$ has length 1.) Thus we are looking for an $n \times n$ orthogonal matrix $U$ each of whose entries is $\pm 1/\sqrt{n}$. Such matrices, apart from the factor $\sqrt{n}$, are called Hadamard matrices.

Figure 17. The best choice for $U$ in two dimensions is a $45^\circ$ rotation.
In what dimensions do they exist? In dimensions 1 and 2 there are the obvious
$$(1) \quad \text{and} \quad \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{pmatrix}.$$
It is not too hard to show that in larger dimensions a Hadamard matrix cannot exist unless the dimension is a multiple of 4. It is an open problem to determine whether they exist in all dimensions that are multiples of 4. They are known to exist, for example, if the dimension is a power of 2: these examples are known as the Walsh matrices.
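The Walsh matrices are produced by Sylvester's doubling construction: if $H$ is a $\pm 1$ Hadamard matrix of order $n$, then $\left(\begin{smallmatrix} H & H \\ H & -H \end{smallmatrix}\right)$ is one of order $2n$. A short sketch (the construction is standard; the code is mine):

```python
def walsh(k):
    """Return the 2**k x 2**k Hadamard matrix built by Sylvester doubling."""
    H = [[1]]
    for _ in range(k):
        top = [row + row for row in H]                   # [H  H]
        bottom = [row + [-x for x in row] for row in H]  # [H -H]
        H = top + bottom
    return H

H = walsh(3)
n = len(H)                       # order 8
for i in range(n):
    for j in range(n):
        dot = sum(H[i][t] * H[j][t] for t in range(n))
        assert dot == (n if i == j else 0)   # distinct rows are orthogonal
print("order", n, "verified: H/sqrt(n) is orthogonal with entries +-1/sqrt(n)")
```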
In spite of this, it seems extremely unlikely that one might prove Kašin's Theorem using Hadamard matrices. The Walsh matrices certainly do not give anything smaller than $n^{-1/4}$; pretty miserable compared with $n^{-1/2}$. There are some good reasons, related to Ramsey theory, for believing that one cannot expect to find genuinely explicit matrices of any kind that would give the right estimates.
Let's return to the question with which we opened the lecture and see how Theorem 4.1 yields almost spherical sections of octahedra. We shall show that, for each $n$, the $2n$-dimensional octahedron has an $n$-dimensional slice which is within distance 32 of the ($n$-dimensional) Euclidean ball. By applying the argument of the theorem to $B_1^n$, we obtain an $n \times n$ orthogonal matrix $U$ such that
$$\|Ux\|_1 + \|x\|_1 \ge \frac{\sqrt{n}}{16}\,|x|$$
for every $x \in \mathbb{R}^n$, where $\|\cdot\|_1$ denotes the $\ell_1$ norm. Now consider the map $T: \mathbb{R}^n \to \mathbb{R}^{2n}$ with matrix
$$\begin{pmatrix} U \\ I \end{pmatrix}.$$
For each $x \in \mathbb{R}^n$, the norm of $Tx$ in $\ell_1^{2n}$ is
$$\|Tx\|_1 = \|Ux\|_1 + \|x\|_1 \ge \frac{\sqrt{n}}{16}\,|x|.$$
On the other hand, the Euclidean norm of $Tx$ is
$$|Tx| = \sqrt{|Ux|^2 + |x|^2} = \sqrt{2}\,|x|.$$
So, if $y$ belongs to the image $T\mathbb{R}^n$, then, setting $y = Tx$,
$$\|y\|_1 \ge \frac{\sqrt{n}}{16}\,|x| = \frac{\sqrt{n}}{16\sqrt{2}}\,|y| = \frac{\sqrt{2n}}{32}\,|y|.$$
By the Cauchy–Schwarz inequality, we have $\|y\|_1 \le \sqrt{2n}\,|y|$, so the slice of $B_1^{2n}$ by the subspace $T\mathbb{R}^n$ has distance at most 32 from $B_2^n$, as we wished to show.
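The slice can be exhibited numerically. The theorem only guarantees that a good $U$ exists; in the sketch below a random orthogonal $U$ stands in for it (an assumption that works in practice), and the ratio $\|y\|_1/|y|$ over sampled points of the slice $T\mathbb{R}^n$ varies by far less than the guaranteed factor 32.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 64
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
T = np.vstack([U, np.eye(n)])           # the map x -> (Ux, x) into R^{2n}

ratios = []
for x in rng.standard_normal((2000, n)):
    y = T @ x
    # |Tx| = sqrt(|Ux|^2 + |x|^2) = sqrt(2) |x| exactly
    assert abs(np.linalg.norm(y) - np.sqrt(2) * np.linalg.norm(x)) < 1e-8
    ratios.append(np.abs(y).sum() / np.linalg.norm(y))

distance = max(ratios) / min(ratios)    # sampled eccentricity of the slice
print(distance)
assert distance < 32
```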
A good deal of work has been done on embedding of other subspaces of $L_1$ into $\ell_1$-spaces of low dimension, and more generally subspaces of $L_p$ into low-dimensional $\ell_p$, for $1 < p < 2$. The techniques used come from probability theory: $p$-stable random variables, bootstrapping of probabilities and deviation estimates. We shall be looking at applications of the latter in Lectures 8 and 9. The major references are [Johnson and Schechtman 1982; Bourgain et al. 1989; Talagrand 1990].
Volume ratios have been studied in considerable depth. They have been found to be closely related to the so-called cotype-2 property of normed spaces: this relationship is dealt with comprehensively in [Pisier 1989]. In particular, Bourgain and Milman [1987] showed that a bound for the cotype-2 constant of a space implies a bound for the volume ratio of its unit ball. This demonstrated, among other things, that there is a uniform bound for the volume ratios of slices of octahedra of all dimensions. A sharp version of this result was proved in [Ball 1991]: namely, that for each $n$, $B_1^n$ has largest volume ratio among the balls of $n$-dimensional subspaces of $L_1$. The proof uses techniques that will be discussed in Lecture 6.
This is a good point to mention a result of Milman [1985] that looks superficially like the results of this lecture but lies a good deal deeper. We remarked that while we can almost get a sphere by intersecting two copies of $B_1^n$, this is very far from possible with two cubes. Conversely, we can get an almost spherical convex hull of two cubes but not of two copies of $B_1^n$. The QS-Theorem (an abbreviation for "quotient of a subspace") states that if we combine the two operations, intersection and convex hull, we can get a sphere no matter what body we start with.
Theorem 4.3 (QS-Theorem). There is a constant $M$ (independent of everything) such that, for any symmetric convex body $K$ of any dimension, there are linear maps $Q$ and $S$ and an ellipsoid $\mathcal{E}$ with the following property: if
$$\tilde{K} = \operatorname{conv}(K \cup QK),$$
then
$$\mathcal{E} \subset \tilde{K} \cap S\tilde{K} \subset M\mathcal{E}.$$
Lecture 5. The Brunn–Minkowski Inequality and Its Extensions
In this lecture we shall introduce one of the most fundamental principles in convex geometry: the Brunn–Minkowski inequality. In order to motivate it, let's begin with a simple observation concerning convex sets in the plane. Let $K \subset \mathbb{R}^2$ be such a set and consider its slices by a family of parallel lines, for example those parallel to the $y$-axis. If the line $x = r$ meets $K$, call the length of the slice $v(r)$. The graph of $v$ is obtained by shaking $K$ down onto the $x$-axis like a deck of cards (of different lengths). This is shown in Figure 18. It is easy to see that the function $v$ is concave on its support. Towards the end of the last century, Brunn investigated what happens if we try something similar in higher dimensions.
Figure 19 shows an example in three dimensions. The central, hexagonal, slice
has larger volume than the triangular slices at the ends: each triangular slice
can be decomposed into four smaller triangles, while the hexagon is a union of
six such triangles. So our first guess might be that the slice area is a concave function, just as slice length was concave for sets in the plane. That this is not always so can be seen by considering slices of a cone, parallel to its base: see Figure 20.

Figure 18. Shaking down a convex body.
Since the area of a slice varies as the square of its distance from the cone's vertex, the area function obtained looks like a piece of the curve $y = x^2$, which is certainly not concave. However, it is reasonable to guess that the cone is an extremal example, since it is "only just" a convex body: its curved surface is "made up of straight lines". For this body, the square root of the slice function just manages to be concave on its support (since its graph is a line segment). So our second guess might be that for a convex body in $\mathbb{R}^3$, a slice-area function has a square root that is concave on its support. This was proved by Brunn using an elegant symmetrisation method. His argument works just as well in higher dimensions to give the following result for the $(n-1)$-dimensional volumes of slices of a body in $\mathbb{R}^n$.
Figure 19.
A polyhedron in three dimensions. The faces at the right and left
are parallel.
Figure 20. The area of a cone's section increases with $x^2$.
Theorem 5.1 (Brunn). Let $K$ be a convex body in $\mathbb{R}^n$, let $u$ be a unit vector in $\mathbb{R}^n$, and for each $r$ let $H_r$ be the hyperplane $\{x \in \mathbb{R}^n : \langle x, u \rangle = r\}$. Then the function
$$r \mapsto \operatorname{vol}(K \cap H_r)^{1/(n-1)}$$
is concave on its support.

One consequence of this is that if $K$ is centrally symmetric, the largest slice perpendicular to a given direction is the central slice (since an even concave function is largest at 0). This is the situation in Figure 19.
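The theorem is easy to watch in a concrete case of my own choosing. For $K = B_1^3$ and $u = e_3$, the slice at height $t$ is $(1-|t|)B_1^2$, of area $2(1-|t|)^2$; Brunn says the square root $\sqrt{2}\,(1-|t|)$ is concave, while the area itself, like the cone example, is not. A finite-difference check:

```python
ts = [i / 100 for i in range(-99, 100)]          # heights in (-1, 1)
area = [2 * (1 - abs(t)) ** 2 for t in ts]       # slice areas of B_1^3
root = [a ** 0.5 for a in area]                  # should be concave (n - 1 = 2)

def second_diffs(vals):
    # discrete second derivative: nonpositive everywhere means concave
    return [vals[i - 1] - 2 * vals[i] + vals[i + 1] for i in range(1, len(vals) - 1)]

assert all(d <= 1e-12 for d in second_diffs(root))   # concave everywhere
assert any(d > 1e-12 for d in second_diffs(area))    # raw area is not concave
print("sqrt(slice area) is concave; the slice area itself is not")
```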
Brunn's Theorem was turned from an elegant curiosity into a powerful tool by Minkowski. His reformulation works in the following way. Consider three parallel slices of a convex body in $\mathbb{R}^n$ at positions $r$, $s$ and $t$, where $s = (1-\lambda)r + \lambda t$ for some $\lambda \in (0,1)$. This is shown in Figure 21.
Call the slices $A_r$, $A_s$, and $A_t$ and think of them as subsets of $\mathbb{R}^{n-1}$. If $x \in A_r$ and $y \in A_t$, the point $(1-\lambda)x + \lambda y$ belongs to $A_s$: to see this, join the points $(r, x)$ and $(t, y)$ in $\mathbb{R}^n$ and observe that the resulting line segment crosses $A_s$ at $(s, (1-\lambda)x + \lambda y)$. So $A_s$ includes a new set
$$(1-\lambda)A_r + \lambda A_t := \{(1-\lambda)x + \lambda y : x \in A_r,\ y \in A_t\}.$$

Figure 21. The section $A_s$ contains the weighted average of $A_r$ and $A_t$.
(This way of using the addition in $\mathbb{R}^n$ to define an addition of sets is called Minkowski addition.) Brunn's Theorem says that the volumes of the three sets $A_r$, $A_s$, and $A_t$ in $\mathbb{R}^{n-1}$ satisfy
$$\operatorname{vol}(A_s)^{1/(n-1)} \ge (1-\lambda)\operatorname{vol}(A_r)^{1/(n-1)} + \lambda\operatorname{vol}(A_t)^{1/(n-1)}.$$
The Brunn–Minkowski inequality makes explicit the fact that all we really know about $A_s$ is that it includes the Minkowski combination of $A_r$ and $A_t$. Since we have now eliminated the role of the ambient space $\mathbb{R}^n$, it is natural to rewrite the inequality with $n$ in place of $n-1$.
Theorem 5.2 (Brunn–Minkowski inequality). If $A$ and $B$ are nonempty compact subsets of $\mathbb{R}^n$ then
$$\operatorname{vol}((1-\lambda)A + \lambda B)^{1/n} \ge (1-\lambda)\operatorname{vol}(A)^{1/n} + \lambda\operatorname{vol}(B)^{1/n}.$$
(The hypothesis that $A$ and $B$ be nonempty corresponds in Brunn's Theorem to the restriction of a function to its support.) It should be remarked that the inequality is stated for general compact sets, whereas the early proofs gave the result only for convex sets. The first complete proof for the general case seems to be in [Lyusternik 1935].
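For axis-parallel boxes the statement can be verified directly, since the Minkowski combination of boxes is again a box: $(1-\lambda)\prod_i[0,a_i] + \lambda\prod_i[0,b_i] = \prod_i[0,(1-\lambda)a_i + \lambda b_i]$. A quick random test of my own:

```python
import math
import random

random.seed(3)
n = 6
for _ in range(1000):
    lam = random.random()
    a = [random.uniform(0.1, 5.0) for _ in range(n)]
    b = [random.uniform(0.1, 5.0) for _ in range(n)]
    # side lengths of the Minkowski combination (1-lam)A + lam B
    c = [(1 - lam) * ai + lam * bi for ai, bi in zip(a, b)]
    lhs = math.prod(c) ** (1 / n)        # vol((1-lam)A + lam B)^(1/n)
    rhs = (1 - lam) * math.prod(a) ** (1 / n) + lam * math.prod(b) ** (1 / n)
    assert lhs >= rhs - 1e-12
print("Brunn-Minkowski holds on 1000 random pairs of boxes")
```

For boxes the inequality reduces to the superadditivity of the geometric mean, itself a consequence of AM/GM.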
To get a feel for the advantages of Minkowski's formulation, let's see how it implies the classical isoperimetric inequality in $\mathbb{R}^n$.

Theorem 5.3 (Isoperimetric inequality). Among bodies of a given volume, Euclidean balls have least surface area.
Proof. Let $C$ be a compact set in $\mathbb{R}^n$ whose volume is equal to that of $B_2^n$, the Euclidean ball of radius 1. The surface "area" of $C$ can be written
$$\operatorname{vol}(\partial C) = \lim_{\varepsilon \to 0} \frac{\operatorname{vol}(C + \varepsilon B_2^n) - \operatorname{vol}(C)}{\varepsilon},$$
as shown in Figure 22. By the Brunn–Minkowski inequality,
$$\operatorname{vol}(C + \varepsilon B_2^n)^{1/n} \ge \operatorname{vol}(C)^{1/n} + \varepsilon\operatorname{vol}(B_2^n)^{1/n}.$$

Figure 22. Expressing the area as a limit of volume increments.
Hence
$$\operatorname{vol}(C + \varepsilon B_2^n) \ge \left(\operatorname{vol}(C)^{1/n} + \varepsilon\operatorname{vol}(B_2^n)^{1/n}\right)^n \ge \operatorname{vol}(C) + n\varepsilon\,\operatorname{vol}(C)^{(n-1)/n}\operatorname{vol}(B_2^n)^{1/n}.$$
So
$$\operatorname{vol}(\partial C) \ge n\,\operatorname{vol}(C)^{(n-1)/n}\operatorname{vol}(B_2^n)^{1/n}.$$
Since $C$ and $B_2^n$ have the same volume, this shows that $\operatorname{vol}(\partial C) \ge n\operatorname{vol}(B_2^n)$, and the latter equals $\operatorname{vol}(\partial B_2^n)$, as we saw in Lecture 1.
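As a sanity check on the final bound, one can compare a cube and a ball of equal volume: both surface areas have closed forms, and the cube should never beat the ball (a small stdlib computation of my own):

```python
import math

def vol_ball(n):
    # volume of the Euclidean unit ball in R^n
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

for n in range(2, 12):
    vn = vol_ball(n)
    ball_surface = n * vn                     # vol of the boundary of B_2^n
    side = vn ** (1 / n)                      # cube with the same volume
    cube_surface = 2 * n * side ** (n - 1)
    bound = n * vn ** ((n - 1) / n) * vn ** (1 / n)   # the lower bound above
    assert abs(bound - ball_surface) < 1e-9   # equality exactly for the ball
    assert cube_surface >= ball_surface       # the cube does no better
print("isoperimetric bound is tight for the ball; the cube never beats it")
```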
This relationship between the Brunn–Minkowski inequality and the isoperimetric
inequality will be explored in a more general context in Lecture 8.
The Brunn–Minkowski inequality has an alternative version that is formally weaker. The AM/GM inequality shows that, for $\lambda$ in $(0,1)$,
$$(1-\lambda)\operatorname{vol}(A)^{1/n} + \lambda\operatorname{vol}(B)^{1/n} \ge \operatorname{vol}(A)^{(1-\lambda)/n}\operatorname{vol}(B)^{\lambda/n}.$$
So the Brunn–Minkowski inequality implies that, for compact sets $A$ and $B$ and $\lambda \in (0,1)$,
$$\operatorname{vol}((1-\lambda)A + \lambda B) \ge \operatorname{vol}(A)^{1-\lambda}\operatorname{vol}(B)^{\lambda}. \tag{5.1}$$
Although this multiplicative Brunn–Minkowski inequality is weaker than the Brunn–Minkowski inequality for particular $A$, $B$, and $\lambda$, if one knows (5.1) for all $A$, $B$, and $\lambda$ one can easily deduce the Brunn–Minkowski inequality for all $A$, $B$, and $\lambda$. This deduction will be left for the reader.
Inequality (5.1) has certain advantages over the Brunn–Minkowski inequality.

(i) We no longer need to stipulate that $A$ and $B$ be nonempty, which makes the inequality easier to use.
(ii) The dimension $n$ has disappeared.
(iii) As we shall see, the multiplicative inequality lends itself to a particularly simple proof because it has a generalisation from sets to functions.

Before we describe the functional Brunn–Minkowski inequality let's just remark that the multiplicative Brunn–Minkowski inequality can be reinterpreted back in the setting of Brunn's Theorem: if $r \mapsto v(r)$ is a function obtained by scanning a convex body with parallel hyperplanes, then $\log v$ is a concave function (with the usual convention regarding $-\infty$).
In order to move toward a functional generalisation of the multiplicative Brunn–Minkowski inequality let's reinterpret inequality (5.1) in terms of the characteristic functions of the sets involved. Let $f$, $g$, and $m$ denote the characteristic functions of $A$, $B$, and $(1-\lambda)A + \lambda B$ respectively; so, for example, $f(x) = 1$ if $x \in A$ and 0 otherwise. The volumes of $A$, $B$, and $(1-\lambda)A + \lambda B$ are the integrals $\int_{\mathbb{R}^n} f$, $\int_{\mathbb{R}^n} g$, and $\int_{\mathbb{R}^n} m$. The Brunn–Minkowski inequality says that
$$\int m \ge \left(\int f\right)^{1-\lambda}\left(\int g\right)^{\lambda}.$$
But what is the relationship between $f$, $g$, and $m$ that guarantees its truth? If $f(x) = 1$ and $g(y) = 1$ then $x \in A$ and $y \in B$, so
$$(1-\lambda)x + \lambda y \in (1-\lambda)A + \lambda B,$$
and hence $m((1-\lambda)x + \lambda y) = 1$. This certainly ensures that
$$m((1-\lambda)x + \lambda y) \ge f(x)^{1-\lambda} g(y)^{\lambda} \quad \text{for any } x \text{ and } y \text{ in } \mathbb{R}^n.$$
This inequality for the three functions at least has a homogeneity that matches the desired inequality for the integrals.
In a series of papers, Prékopa and Leindler proved that this homogeneity is enough.

Theorem 5.4 (The Prékopa–Leindler inequality). If $f$, $g$ and $m$ are nonnegative measurable functions on $\mathbb{R}^n$, $\lambda \in (0,1)$ and for all $x$ and $y$ in $\mathbb{R}^n$,
$$m((1-\lambda)x + \lambda y) \ge f(x)^{1-\lambda} g(y)^{\lambda}, \tag{5.2}$$
then
$$\int m \ge \left(\int f\right)^{1-\lambda}\left(\int g\right)^{\lambda}.$$
It is perhaps helpful to notice that the Prékopa–Leindler inequality looks like Hölder's inequality, backwards. If $f$ and $g$ were given and we set $m(z) = f(z)^{1-\lambda} g(z)^{\lambda}$ (for each $z$), then Hölder's inequality says that
$$\int m \le \left(\int f\right)^{1-\lambda}\left(\int g\right)^{\lambda}.$$
(Hölder's inequality is often written with $1/p$ instead of $1-\lambda$, $1/q$ instead of $\lambda$, and $f$, $g$ replaced by $F^p$, $G^q$.) The difference between Prékopa–Leindler and Hölder is that, in the former, the value $m(z)$ may be much larger, since it is a supremum over many pairs $(x, y)$ satisfying $z = (1-\lambda)x + \lambda y$ rather than just the pair $(z, z)$.
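A discrete one-dimensional experiment makes the contrast with Hölder concrete. The densities, grid, and $\lambda = \frac12$ below are my own choices: define $m$ at each grid point $z$ as the supremum of $\sqrt{f(x)g(y)}$ over grid pairs with $x + y = 2z$. Hölder's product $f(z)^{1/2}g(z)^{1/2}$ uses only one such pair, so $\int m$ lands above the geometric mean of the integrals while Hölder's value lands below it.

```python
import math

N, L = 401, 6.0                           # grid of N points on [-L, L]
xs = [-L + 2 * L * i / (N - 1) for i in range(N)]
f = [math.exp(-x * x) for x in xs]        # e^{-x^2}
g = [math.exp(-3 * x * x) for x in xs]    # e^{-3x^2}

# m(z) = sup over grid pairs x + y = 2z of sqrt(f(x) g(y))   (lambda = 1/2)
m = []
for k in range(N):
    best = 0.0
    for i in range(max(0, 2 * k - N + 1), min(N, 2 * k + 1)):
        best = max(best, math.sqrt(f[i] * g[2 * k - i]))
    m.append(best)

int_m, int_f, int_g = sum(m), sum(f), sum(g)        # common grid step cancels
holder = sum(math.sqrt(a * b) for a, b in zip(f, g))
assert holder <= math.sqrt(int_f * int_g) <= int_m  # Hoelder below, P-L above
print(int_m / math.sqrt(int_f * int_g))             # strictly bigger than 1
```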
Though it generalises the Brunn–Minkowski inequality, the Prékopa–Leindler inequality is a good deal simpler to prove, once properly formulated. The argument we shall use seems to have appeared first in [Brascamp and Lieb 1976b]. The crucial point is that the passage from sets to functions allows us to prove the inequality by induction on the dimension, using only the one-dimensional case. We pay the small price of having to do a bit extra for this case.
Proof of the Prékopa–Leindler inequality. We start by checking the one-dimensional Brunn–Minkowski inequality. Suppose $A$ and $B$ are nonempty measurable subsets of the line. Using $|\cdot|$ to denote length, we want to show that
$$|(1-\lambda)A + \lambda B| \ge (1-\lambda)|A| + \lambda|B|.$$
We may assume that $A$ and $B$ are compact and we may shift them so that the right-hand end of $A$ and the left-hand end of $B$ are both at 0. The set $(1-\lambda)A + \lambda B$ now includes the essentially disjoint sets $(1-\lambda)A$ and $\lambda B$, so its length is at least the sum of the lengths of these sets.
Now suppose we have nonnegative integrable functions $f$, $g$, and $m$ on the line, satisfying condition (5.2). We may assume that $f$ and $g$ are bounded. Since the inequality to be proved has the same homogeneity as the hypothesis (5.2), we may also assume that $f$ and $g$ are normalised so that $\sup f = \sup g = 1$. By Fubini's Theorem, we can write the integrals of $f$ and $g$ as integrals of the lengths of their level sets:
$$\int f(x)\,dx = \int_0^1 |(f \ge t)|\,dt,$$
and similarly for $g$. If $f(x) \ge t$ and $g(y) \ge t$ then $m((1-\lambda)x + \lambda y) \ge t$. So we have the inclusion
$$(m \ge t) \supset (1-\lambda)(f \ge t) + \lambda(g \ge t).$$
For $0 \le t < 1$ the sets on the right are nonempty, so the one-dimensional Brunn–Minkowski inequality shows that
$$|(m \ge t)| \ge (1-\lambda)|(f \ge t)| + \lambda|(g \ge t)|.$$
Integrating this inequality from 0 to 1 we get
$$\int m \ge (1-\lambda)\int f + \lambda\int g,$$
and the latter expression is at least
$$\left(\int f\right)^{1-\lambda}\left(\int g\right)^{\lambda}$$
by the AM/GM inequality. This does the one-dimensional case.
The induction that takes us into higher dimensions is quite straightforward, so we shall just sketch the argument for sets in $\mathbb{R}^n$, rather than functions. Suppose $A$ and $B$ are two such sets and, for convenience, write
$$C = (1-\lambda)A + \lambda B.$$
Choose a unit vector $u$ and, as before, let $H_r$ be the hyperplane
$$\{x \in \mathbb{R}^n : \langle x, u \rangle = r\}$$
perpendicular to $u$ at "position" $r$. Let $A_r$ denote the slice $A \cap H_r$ and similarly for $B$ and $C$, and regard these as subsets of $\mathbb{R}^{n-1}$. If $r$ and $t$ are real numbers, and if $s = (1-\lambda)r + \lambda t$, the slice $C_s$ includes $(1-\lambda)A_r + \lambda B_t$. (This is reminiscent of the earlier argument relating Brunn's Theorem to Minkowski's reformulation.)
By the inductive hypothesis in $\mathbb{R}^{n-1}$,
$$\operatorname{vol}(C_s) \ge \operatorname{vol}(A_r)^{1-\lambda}\operatorname{vol}(B_t)^{\lambda}.$$
Let $f$, $g$, and $m$ be the functions on the line given by
$$f(x) = \operatorname{vol}(A_x), \quad g(x) = \operatorname{vol}(B_x), \quad m(x) = \operatorname{vol}(C_x).$$
Then, for $r$, $s$, and $t$ as above,
$$m(s) \ge f(r)^{1-\lambda} g(t)^{\lambda}.$$
By the one-dimensional Prékopa–Leindler inequality,
$$\int m \ge \left(\int f\right)^{1-\lambda}\left(\int g\right)^{\lambda}.$$
But this is exactly the statement $\operatorname{vol}(C) \ge \operatorname{vol}(A)^{1-\lambda}\operatorname{vol}(B)^{\lambda}$, so the inductive step is complete.
The proof illustrates clearly why the Prékopa–Leindler inequality makes things go smoothly. Although we only carried out the induction for sets, we required the one-dimensional result for the functions we get by scanning sets in $\mathbb{R}^n$.
To close this lecture we remark that the Brunn–Minkowski inequality has numerous extensions and variations, not only in convex geometry, but in combinatorics and information theory as well. One of the most surprising and delightful is a theorem of Busemann [1949].
Theorem 5.5 (Busemann). Let $K$ be a symmetric convex body in $\mathbb{R}^n$, and for each unit vector $u$ let $r(u)$ be the volume of the slice of $K$ by the subspace orthogonal to $u$. Then the body whose radius in each direction $u$ is $r(u)$ is itself convex.
The Brunn–Minkowski inequality is the starting point for a highly developed
classical theory of convex geometry. We shall barely touch upon the theory in
these notes. A comprehensive reference is the recent book [Schneider 1993].
Lecture 6. Convolutions and Volume Ratios: The Reverse Isoperimetric Problem
In the last lecture we saw how to deduce the classical isoperimetric inequality in $\mathbb{R}^n$ from the Brunn–Minkowski inequality. In this lecture we will answer the
reverse question. This has to be phrased a bit carefully, since there is no upper
limit to the surface area of a body of given volume, even if we restrict attention
to convex bodies. (Consider a very thin pancake.) For this reason it is natural to
consider affine equivalence classes of convex bodies, and the question becomes:
given a convex body, how small can we make its surface area by applying an
affine (or linear) transformation that preserves volume? The answer is provided
by the following theorem from [Ball 1991].
Theorem 6.1. Let $K$ be a convex body and $T$ a regular solid simplex in $\mathbb{R}^n$. Then there is an affine image of $K$ whose volume is the same as that of $T$ and whose surface area is no larger than that of $T$.
Thus, modulo affine transformations, simplices have the largest surface area among convex bodies of a given volume. If $K$ is assumed to be centrally symmetric then the estimate can be strengthened: the cube is extremal among symmetric bodies. A detailed proof of Theorem 6.1 would be too long for these notes. We shall instead describe how the symmetric case is proved, since this is considerably easier but illustrates the most important ideas.

Theorem 6.1 and the symmetric analogue are both deduced from volume-ratio estimates. In the latter case the statement is that among symmetric convex bodies, the cube has largest volume ratio. Let's see why this solves the reverse isoperimetric problem. If $Q$ is any cube, the surface area and volume of $Q$ are related by
$$\operatorname{vol}(\partial Q) = 2n\operatorname{vol}(Q)^{(n-1)/n}.$$
We wish to show that any other convex body $K$ has an affine image $\tilde{K}$ for which
$$\operatorname{vol}(\partial\tilde{K}) \le 2n\operatorname{vol}(\tilde{K})^{(n-1)/n}.$$
Choose $\tilde{K}$ so that its maximal volume ellipsoid is $B_2^n$, the Euclidean ball of radius 1. The volume of $\tilde{K}$ is then at most $2^n$, since this is the volume of the cube whose maximal ellipsoid is $B_2^n$. As in the previous lecture,
$$\operatorname{vol}(\partial\tilde{K}) = \lim_{\varepsilon \to 0} \frac{\operatorname{vol}(\tilde{K} + \varepsilon B_2^n) - \operatorname{vol}(\tilde{K})}{\varepsilon}.$$
Since $\tilde{K} \supset B_2^n$, the second expression is at most
$$\lim_{\varepsilon \to 0} \frac{\operatorname{vol}(\tilde{K} + \varepsilon\tilde{K}) - \operatorname{vol}(\tilde{K})}{\varepsilon} = \operatorname{vol}(\tilde{K})\lim_{\varepsilon \to 0}\frac{(1+\varepsilon)^n - 1}{\varepsilon} = n\operatorname{vol}(\tilde{K}) = n\operatorname{vol}(\tilde{K})^{1/n}\operatorname{vol}(\tilde{K})^{(n-1)/n} \le 2n\operatorname{vol}(\tilde{K})^{(n-1)/n},$$
which is exactly what we wanted.
The rest of this lecture will thus be devoted to explaining the proof of the
volume-ratio estimate:
Theorem 6.2. Among symmetric convex bodies the cube has largest volume ratio.
As one might expect, the proof of Theorem 6.2 makes use of John's Theorem from Lecture 3. The problem is to show that, if $K$ is a convex body whose maximal ellipsoid is $B_2^n$, then $\operatorname{vol}(K) \le 2^n$. As we saw, it is a consequence of John's theorem that if $B_2^n$ is the maximal ellipsoid in $K$, there is a sequence $(u_i)$ of unit vectors and a sequence $(c_i)$ of positive numbers for which
$$\sum c_i\, u_i \otimes u_i = I_n$$
and for which
$$K \subset C := \{x : |\langle x, u_i \rangle| \le 1 \text{ for } 1 \le i \le m\}.$$
We shall show that this $C$ has volume at most $2^n$. The principal tool will be a sharp inequality for norms of generalised convolutions. Before stating this let's explain some standard terms from harmonic analysis.

If $f$ and $g: \mathbb{R} \to \mathbb{R}$ are bounded, integrable functions, we define the convolution $f * g$ of $f$ and $g$ by
$$f * g(x) = \int_{\mathbb{R}} f(y)\, g(x-y)\, dy.$$
Convolutions crop up in many areas of mathematics and physics, and a good deal is known about how they behave. One of the most fundamental inequalities for convolutions is Young's inequality: if $f \in L_p$, $g \in L_q$, and
$$\frac{1}{p} + \frac{1}{q} = 1 + \frac{1}{s},$$
then
$$\|f * g\|_s \le \|f\|_p \|g\|_q.$$
(Here
k ¡ k
p
means the
L
p
norm on
R
, and so on.) Once we have Youngâs in-
equality, we can give a meaning to convolutions of functions that are not both
integrable and bounded, provided that they lie in the correct
L
p
spaces. Youngâs
inequality holds for convolution on any locally compact group, for example the
circle. On compact groups it is sharp: there is equality for constant functions.
But on
R
, where constant functions are not integrable, the inequality can be
improved (for most values of
p
and
q
). It was shown by Beckner [1975] and
Brascamp and Lieb [1976a] that the correct constant in Youngâs inequality is
attained if
f
and
g
are appropriate Gaussian densities: that is, for some positive
a
and
b
,
f
(
t
) =
e
â
at
2
and
g
(
t
) =
e
â
bt
2
. (The appropriate choices of
a
and
b
and
the value of the best constant for each
p
and
q
will not be stated here. Later we
shall see that they can be avoided.)
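It is easy to see Young's inequality in action numerically. The following Python sketch (not part of the original lectures; all grid parameters are chosen here for illustration) discretises two Gaussian functions and checks $\|f * g\|_s \le \|f\|_p \|g\|_q$ for $p = q = 4/3$, $s = 2$:

```python
import math

# Not from the text: a discretised check of Young's inequality
# ||f * g||_s <= ||f||_p ||g||_q when 1/p + 1/q = 1 + 1/s.
# Here p = q = 4/3, so s = 2, and f, g are Gaussian densities,
# the extremal case identified by Beckner and Brascamp-Lieb.
h, L = 0.05, 8.0
xs = [i * h for i in range(int(-L / h), int(L / h) + 1)]

f = lambda t: math.exp(-t * t)
g = lambda t: math.exp(-t * t)

def lp_norm(vals, p):
    # Riemann-sum approximation to the L_p norm on the grid
    return (sum(abs(v) ** p for v in vals) * h) ** (1.0 / p)

conv = [sum(f(y) * g(x - y) for y in xs) * h for x in xs]

p = q = 4.0 / 3.0
s = 2.0
lhs = lp_norm(conv, s)
rhs = lp_norm([f(x) for x in xs], p) * lp_norm([g(x) for x in xs], q)
print(lhs <= rhs)  # Young's inequality holds on the grid
```

Even for this extremal Gaussian pair the two sides differ by a constant factor, which is exactly the improvement over the constant 1 mentioned above.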
How are convolutions related to convex bodies? To answer this question we need to rewrite Young's inequality slightly. If $1/r + 1/s = 1$, the $L_s$ norm $\| f * g \|_s$ can be realised as
$$\int_{\mathbb{R}} (f * g)(x)\, h(x)$$
for some function $h$ with $\| h \|_r = 1$. So the inequality says that, if $1/p + 1/q + 1/r = 2$, then
$$\int\!\!\int f(y)\, g(x-y)\, h(x)\, dy\, dx \le \| f \|_p \| g \|_q \| h \|_r.$$
We may rewrite the inequality again with $h(-x)$ in place of $h(x)$, since this doesn't affect $\| h \|_r$:
$$\int\!\!\int f(y)\, g(x-y)\, h(-x)\, dy\, dx \le \| f \|_p \| g \|_q \| h \|_r. \tag{6.1}$$
This can be written in a more symmetric form via the map from $\mathbb{R}^2$ into $\mathbb{R}^3$ that takes $(x, y)$ to $(y, x-y, -x) =: (u, v, w)$. The range of this map is the subspace
$$H = \{ (u, v, w) : u + v + w = 0 \}.$$
Apart from a factor coming from the Jacobian of this map, the integral can be written
$$\int_H f(u)\, g(v)\, h(w),$$
where the integral is with respect to two-dimensional measure on the subspace $H$. So Young's inequality and its sharp forms estimate the integral of a product function on $\mathbb{R}^3$ over a subspace. What is the simplest product function? If $f$, $g$, and $h$ are each the characteristic function of the interval $[-1, 1]$, the function $F$ given by
$$F(u, v, w) = f(u)\, g(v)\, h(w)$$
is the characteristic function of the cube $[-1,1]^3 \subset \mathbb{R}^3$. The integral of $F$ over a subspace of $\mathbb{R}^3$ is thus the area of a slice of the cube: the area of a certain convex body. So there is some hope that we might use a convolution inequality to estimate volumes.
Brascamp and Lieb proved rather more than the sharp form of Young's inequality stated earlier. They considered not just two-dimensional subspaces of $\mathbb{R}^3$ but $n$-dimensional subspaces of $\mathbb{R}^m$. It will be more convenient to state their result using expressions analogous to those in (6.1) rather than using integrals over subspaces. Notice that the integral
$$\int\!\!\int f(y)\, g(x-y)\, h(-x)\, dy\, dx$$
can be written
$$\int_{\mathbb{R}^2} f\bigl(\langle x, v_1 \rangle\bigr)\, g\bigl(\langle x, v_2 \rangle\bigr)\, h\bigl(\langle x, v_3 \rangle\bigr)\, dx,$$
where $v_1 = (0, 1)$, $v_2 = (1, -1)$ and $v_3 = (-1, 0)$ are vectors in $\mathbb{R}^2$. The theorem of Brascamp and Lieb is the following.
Theorem 6.3. If $(v_i)_1^m$ are vectors in $\mathbb{R}^n$ and $(p_i)_1^m$ are positive numbers satisfying
$$\sum_1^m \frac{1}{p_i} = n,$$
and if $(f_i)_1^m$ are nonnegative measurable functions on the line, then
$$\frac{\displaystyle\int_{\mathbb{R}^n} \prod_1^m f_i\bigl(\langle x, v_i \rangle\bigr)}{\displaystyle\prod_1^m \| f_i \|_{p_i}}$$
is "maximised" when the $(f_i)$ are appropriate Gaussian densities: $f_i(t) = e^{-a_i t^2}$, where the $a_i$ depend upon $m$, $n$, the $p_i$, and the $v_i$.
The word maximised is in quotation marks since there are degenerate cases for which the maximum is not attained. The value of the maximum is not easily computed since the $a_i$ are the solutions of nonlinear equations in the $p_i$ and $v_i$. This apparently unpleasant problem evaporates in the context of convex geometry: the inequality has a normalised form, introduced in [Ball 1990], which fits perfectly with John's Theorem.
Theorem 6.4. If $(u_i)_1^m$ are unit vectors in $\mathbb{R}^n$ and $(c_i)_1^m$ are positive numbers for which
$$\sum_1^m c_i \, u_i \otimes u_i = I_n,$$
and if $(f_i)_1^m$ are nonnegative measurable functions, then
$$\int_{\mathbb{R}^n} \prod f_i\bigl(\langle x, u_i \rangle\bigr)^{c_i} \le \prod \left( \int f_i \right)^{c_i}.$$
In this reformulation of the theorem, the $c_i$ play the role of $1/p_i$: the Fritz John condition ensures that $\sum c_i = n$ as required, and miraculously guarantees that the correct constant in the inequality is 1 (as written). The functions $f_i$ have been replaced by $f_i^{c_i}$, since this ensures that equality occurs if the $f_i$ are identical Gaussian densities. It may be helpful to see why this is so. If $f_i(t) = e^{-t^2}$ for all $i$, then
$$\prod f_i\bigl(\langle x, u_i \rangle\bigr)^{c_i} = \exp\Bigl( -\sum c_i \langle x, u_i \rangle^2 \Bigr) = e^{-|x|^2} = \prod_1^n e^{-x_i^2},$$
so the integral is
$$\left( \int e^{-t^2} \right)^n = \prod \left( \int e^{-t^2} \right)^{c_i} = \prod \left( \int f_i \right)^{c_i}.$$
Armed with Theorem 6.4, let's now prove Theorem 6.2.

Proof of the volume-ratio estimate. Recall that our aim is to show that, for $u_i$ and $c_i$ as usual, the body
$$C = \{ x : |\langle x, u_i \rangle| \le 1 \text{ for } 1 \le i \le m \}$$
has volume at most $2^n$. For each $i$ let $f_i$ be the characteristic function of the interval $[-1, 1]$ in $\mathbb{R}$. Then the function
$$x \mapsto \prod f_i\bigl(\langle x, u_i \rangle\bigr)^{c_i}$$
is exactly the characteristic function of $C$. Integrating and applying Theorem 6.4 we have
$$\mathrm{vol}(C) \le \prod \left( \int f_i \right)^{c_i} = \prod 2^{c_i} = 2^n.$$
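The conclusion is concrete enough to check by simulation. The following Python sketch (not from the text; the decomposition and all parameters are chosen here for illustration) builds a John decomposition of $I_2$ from three equally spaced unit vectors in $\mathbb{R}^2$, where $C$ is a regular hexagon, and estimates its area by Monte Carlo:

```python
import math, random

random.seed(0)
# Not from the text: a Monte Carlo check of vol(C) <= 2^n in R^2.
# Three unit vectors at angles 0, 120, 240 degrees with weights
# c_i = 2/3 form a John decomposition: sum c_i u_i (x) u_i = I_2.
us = [(math.cos(a), math.sin(a)) for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
cs = [2.0 / 3.0] * 3

# verify the decomposition of the identity entrywise
s00 = sum(c * u[0] * u[0] for c, u in zip(cs, us))
s01 = sum(c * u[0] * u[1] for c, u in zip(cs, us))
s11 = sum(c * u[1] * u[1] for c, u in zip(cs, us))

def in_C(x, y):
    # C = {x : |<x, u_i>| <= 1 for all i}, here a regular hexagon
    return all(abs(x * u[0] + y * u[1]) <= 1 for u in us)

N, hits = 200000, 0
for _ in range(N):
    x, y = random.uniform(-2, 2), random.uniform(-2, 2)
    hits += in_C(x, y)
area = 16.0 * hits / N  # sampling box [-2,2]^2 has area 16
print(area < 4)  # hexagon area 2*sqrt(3) ~ 3.46, below 2^2 = 4
```

Note that $\sum c_i = 2 = n$, as the Fritz John condition requires, and the estimated area $2\sqrt 3$ is comfortably below the bound $2^n = 4$ (the cube itself being the extremal case).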
The theorems of Brascamp and Lieb and Beckner have been greatly extended over the last twenty years. The main result in Beckner's paper solved the old problem of determining the norm of the Fourier transform between $L_p$ spaces. There are many classical inequalities in harmonic analysis for which the best constants are now known. The paper [Lieb 1990] contains some of the most up-to-date discoveries and gives a survey of the history of these developments. The methods described here have many other applications to convex geometry. There is also a reverse form of the Brascamp–Lieb inequality appropriate for analysing, for example, the ratio of the volume of a body to that of the minimal ellipsoid containing it.
Lecture 7. The Central Limit Theorem and Large Deviation Inequalities
The material in this short lecture is not really convex geometry, but is intended
to provide a context for what follows. For the sake of readers who may not be
familiar with probability theory, we also include a few words about independent
random variables.
To begin with, a probability measure $\mu$ on a set $\Omega$ is just a measure of total mass $\mu(\Omega) = 1$. Real-valued functions on $\Omega$ are called random variables and the integral of such a function $X : \Omega \to \mathbb{R}$, its mean, is written $EX$ and called the expectation of $X$. The variance of $X$ is $E(X - EX)^2$. It is customary to suppress the reference to $\Omega$ when writing the measures of sets defined by random variables. Thus
$$\mu\bigl(\{ \omega \in \Omega : X(\omega) < 1 \}\bigr)$$
is written $\mu(X < 1)$: the probability that $X$ is less than 1.
Two crucial, and closely related, ideas distinguish probability theory from general measure theory. The first is independence. Two random variables $X$ and $Y$ are said to be independent if, for any functions $f$ and $g$,
$$E f(X)\, g(Y) = E f(X)\, E g(Y).$$
Independence can always be viewed in a canonical way. Let $(\Omega, \mu)$ be a product space $(\Omega_1 \times \Omega_2, \mu_1 \otimes \mu_2)$, where $\mu_1$ and $\mu_2$ are probabilities. Suppose $X$ and $Y$ are random variables on $\Omega$ for which the value $X(\omega_1, \omega_2)$ depends only upon $\omega_1$ while $Y(\omega_1, \omega_2)$ depends only upon $\omega_2$. Then any integral (that converges appropriately)
$$E f(X)\, g(Y) = \int f(X(s))\, g(Y(t))\, d\mu_1 \otimes \mu_2(s, t)$$
Figure 23. Independence and product spaces.
can be written as the product of integrals
$$\int f(X(s))\, d\mu_1(s) \int g(Y(t))\, d\mu_2(t) = E f(X)\, E g(Y)$$
by Fubini's Theorem. Putting it another way, on each line $\{ (s_0, t) : t \in \Omega_2 \}$, $X$ is fixed, while $Y$ exhibits its full range of behaviour in the correct proportions. This is illustrated in Figure 23.
In a similar way, a sequence $X_1, X_2, \ldots, X_n$ of independent random variables arises if each variable is defined on the product space $\Omega_1 \times \Omega_2 \times \cdots \times \Omega_n$ and $X_i$ depends only upon the $i$-th coordinate.
The second crucial idea, which we will not discuss in any depth, is the use of many different $\sigma$-fields on the same space. The simplest example has already been touched upon. The product space $\Omega_1 \times \Omega_2$ carries two $\sigma$-fields, much smaller than the product field, which it inherits from $\Omega_1$ and $\Omega_2$ respectively. If $\mathcal{F}_1$ and $\mathcal{F}_2$ are the $\sigma$-fields on $\Omega_1$ and $\Omega_2$, the sets of the form $A \times \Omega_2 \subset \Omega_1 \times \Omega_2$ for $A \in \mathcal{F}_1$ form a $\sigma$-field on $\Omega_1 \times \Omega_2$; let's call it $\tilde{\mathcal{F}}_1$. Similarly,
$$\tilde{\mathcal{F}}_2 = \{ \Omega_1 \times B : B \in \mathcal{F}_2 \}.$$
"Typical" members of these $\sigma$-fields are shown in Figure 24.
Figure 24. Members of the "small" $\sigma$-fields $\tilde{\mathcal{F}}_1$ and $\tilde{\mathcal{F}}_2$ on $\Omega_1 \times \Omega_2$.
One of the most beautiful and significant principles in mathematics is the central limit theorem: any random quantity that arises as the sum of many small independent contributions is distributed very much like a Gaussian random variable. The most familiar example is coin tossing. We use a coin whose decoration is a bit austere: it has $+1$ on one side and $-1$ on the other. Let $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ be the outcomes of $n$ independent tosses. Thus the $\varepsilon_i$ are independent random variables, each of which takes the values $+1$ and $-1$ each with probability $\frac{1}{2}$. (Such random variables are said to have a Bernoulli distribution.) Then the normalised sum
$$S_n = \frac{1}{\sqrt{n}} \sum_1^n \varepsilon_i$$
belongs to an interval $I$ of the real line with probability very close to
$$\frac{1}{\sqrt{2\pi}} \int_I e^{-t^2/2}\, dt.$$
The normalisation $1/\sqrt{n}$ ensures that the variance of $S_n$ is 1: so there is some hope that the $S_n$ will all be similarly distributed.
The standard proof of the central limit theorem shows that much more is true. Any sum of the form
$$\sum_1^n a_i \varepsilon_i$$
with real coefficients $a_i$ will have a roughly Gaussian distribution as long as each $a_i$ is fairly small compared with $\sqrt{\sum a_i^2}$. Some such smallness condition is clearly needed since if
$$a_1 = 1 \quad \text{and} \quad a_2 = a_3 = \cdots = a_n = 0,$$
the sum is just $\varepsilon_1$, which is not much like a Gaussian. However, in many instances, what one really wants is not that the sum is distributed like a Gaussian, but merely that the sum cannot be large (or far from average) much more often than an appropriate Gaussian variable. The example above clearly satisfies such a condition: $\varepsilon_1$ never deviates from its mean, 0, by more than 1.
The following inequality provides a deviation estimate for any sequence of coefficients. In keeping with the custom among functional analysts, I shall refer to the inequality as Bernstein's inequality. (It is not related to the Bernstein inequality for polynomials on the circle.) However, probabilists know the result as Hoeffding's inequality, and the earliest reference known to me is [Hoeffding 1963]. A stronger and more general result goes by the name of the Azuma–Hoeffding inequality; see [Williams 1991], for example.
Theorem 7.1 (Bernstein's inequality). If $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent Bernoulli random variables and if $a_1, a_2, \ldots, a_n$ satisfy $\sum a_i^2 = 1$, then for each positive $t$ we have
$$\mathrm{Prob}\left( \Bigl| \sum_{i=1}^n a_i \varepsilon_i \Bigr| > t \right) \le 2 e^{-t^2/2}.$$
This estimate compares well with the probability of finding a standard Gaussian outside the interval $[-t, t]$,
$$\frac{2}{\sqrt{2\pi}} \int_t^{\infty} e^{-s^2/2}\, ds.$$
The method by which Bernstein's inequality is proved has become an industry standard.
Proof. We start by showing that, for each real $\lambda$,
$$E e^{\lambda \sum a_i \varepsilon_i} \le e^{\lambda^2/2}. \tag{7.1}$$
The idea will then be that $\sum a_i \varepsilon_i$ cannot be large too often, since, whenever it is large, its exponential is enormous.

To prove (7.1), we write
$$E e^{\lambda \sum a_i \varepsilon_i} = E \prod_1^n e^{\lambda a_i \varepsilon_i}$$
and use independence to deduce that this equals
$$\prod_1^n E e^{\lambda a_i \varepsilon_i}.$$
For each $i$ the expectation is
$$E e^{\lambda a_i \varepsilon_i} = \frac{e^{\lambda a_i} + e^{-\lambda a_i}}{2} = \cosh \lambda a_i.$$
Now, $\cosh x \le e^{x^2/2}$ for any real $x$, so, for each $i$,
$$E e^{\lambda a_i \varepsilon_i} \le e^{\lambda^2 a_i^2/2}.$$
Hence
$$E e^{\lambda \sum a_i \varepsilon_i} \le \prod_1^n e^{\lambda^2 a_i^2/2} = e^{\lambda^2/2},$$
since $\sum a_i^2 = 1$.
To pass from (7.1) to a probability estimate, we use the inequality variously known as Markov's or Chebyshev's inequality: if $X$ is a nonnegative random variable and $R$ is positive, then
$$R \,\mathrm{Prob}(X \ge R) \le EX$$
(because the integral includes a bit where a function whose value is at least $R$ is integrated over a set of measure $\mathrm{Prob}(X \ge R)$).
Suppose $t \ge 0$. Whenever $\sum a_i \varepsilon_i \ge t$, we will have $e^{t \sum a_i \varepsilon_i} \ge e^{t^2}$. Hence
$$e^{t^2} \,\mathrm{Prob}\left( \sum a_i \varepsilon_i \ge t \right) \le E e^{t \sum a_i \varepsilon_i} \le e^{t^2/2}$$
by (7.1). So
$$\mathrm{Prob}\left( \sum a_i \varepsilon_i \ge t \right) \le e^{-t^2/2},$$
and in a similar way we get
$$\mathrm{Prob}\left( \sum a_i \varepsilon_i \le -t \right) \le e^{-t^2/2}.$$
Putting these inequalities together we get
$$\mathrm{Prob}\left( \Bigl| \sum a_i \varepsilon_i \Bigr| \ge t \right) \le 2 e^{-t^2/2}.$$
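A quick simulation (not part of the original lectures; the choice $n = 50$ and equal coefficients is mine) shows how much room the bound leaves:

```python
import math, random

random.seed(0)
# Not from the text: empirical tail of sum a_i eps_i against the
# Bernstein/Hoeffding bound 2 e^{-t^2/2}, with a_i = 1/sqrt(n).
n, trials, t = 50, 40000, 2.0
a = 1.0 / math.sqrt(n)  # so that sum a_i^2 = 1
count = 0
for _ in range(trials):
    s = sum(a * random.choice((-1, 1)) for _ in range(n))
    if abs(s) >= t:
        count += 1
empirical = count / trials
bound = 2 * math.exp(-t * t / 2)
print(empirical <= bound)  # the bound holds with plenty of room
```

The empirical tail is close to the Gaussian tail probability, a few percent, while the bound $2e^{-2} \approx 0.27$ is much larger: Bernstein's inequality is not sharp, but it is dimension-free and holds for every $n$ and every choice of coefficients.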
In the next lecture we shall see that deviation estimates that look like Bernstein's inequality hold for a wide class of functions in several geometric settings. For the moment let's just remark that an estimate similar to Bernstein's inequality,
$$\mathrm{Prob}\left( \Bigl| \sum a_i X_i \Bigr| \ge t \right) \le 2 e^{-6 t^2}, \tag{7.2}$$
holds for $\sum a_i^2 = 1$, if the $\pm 1$ valued random variables $\varepsilon_i$ are replaced by independent random variables $X_i$ each of which is uniformly distributed on the interval $\bigl[ -\frac{1}{2}, \frac{1}{2} \bigr]$. This already has a more geometric flavour, since for these $(X_i)$ the vector $(X_1, X_2, \ldots, X_n)$ is distributed according to Lebesgue measure on the cube $\bigl[ -\frac{1}{2}, \frac{1}{2} \bigr]^n \subset \mathbb{R}^n$. If $\sum a_i^2 = 1$, then $\sum a_i X_i$ is the distance of the point $(X_1, X_2, \ldots, X_n)$ from the subspace of $\mathbb{R}^n$ orthogonal to $(a_1, a_2, \ldots, a_n)$. So (7.2) says that most of the mass of the cube lies close to any subspace of $\mathbb{R}^n$, which is reminiscent of the situation for the Euclidean ball described in Lecture 1.
Lecture 8. Concentration of Measure in Geometry
The aim of this lecture is to describe geometric analogues of Bernstein's deviation inequality. These geometric deviation estimates are closely related to isoperimetric inequalities. The phenomenon of which they form a part was introduced into the field by V. Milman: its development, especially by Milman himself, led to a new, probabilistic, understanding of the structure of convex bodies in high dimensions. The phenomenon was aptly named the concentration of measure.
We explained in Lecture 5 how the Brunn–Minkowski inequality implies the classical isoperimetric inequality in $\mathbb{R}^n$: among bodies of a given volume, the Euclidean balls have least surface area. There are many other situations where isoperimetric inequalities are known; two of them will be described below. First let's recall that the argument from the Brunn–Minkowski inequality shows more than the isoperimetric inequality.
Let $A$ be a compact subset of $\mathbb{R}^n$. For each point $x$ of $\mathbb{R}^n$, let $d(x, A)$ be the distance from $x$ to $A$:
$$d(x, A) = \min \{ |x - y| : y \in A \}.$$
Figure 25. An $\varepsilon$-neighbourhood.
For each positive $\varepsilon$, the Minkowski sum $A + \varepsilon B_2^n$ is exactly the set of points whose distance from $A$ is at most $\varepsilon$. Let's denote such an $\varepsilon$-neighbourhood $A_\varepsilon$; see Figure 25.

The Brunn–Minkowski inequality shows that, if $B$ is a Euclidean ball of the same volume as $A$, we have
$$\mathrm{vol}(A_\varepsilon) \ge \mathrm{vol}(B_\varepsilon)$$
for any $\varepsilon > 0$.
This formulation of the isoperimetric inequality makes much clearer the fact that it relates the measure and the metric on $\mathbb{R}^n$. If we blow up a set in $\mathbb{R}^n$ using the metric, we increase the measure by at least as much as we would for a ball.
This idea of comparing the volumes of a set and its neighbourhoods makes sense in any space that has both a measure and a metric, regardless of whether there is an analogue of Minkowski addition. For any metric space $(\Omega, d)$ equipped with a Borel measure $\mu$, and any positive $\alpha$ and $\varepsilon$, it makes sense to ask: For which sets $A$ of measure $\alpha$ do the blow-ups $A_\varepsilon$ have smallest measure? This general isoperimetric problem has been solved in a variety of different situations. We shall consider two closely related geometric examples. In each case the measure $\mu$ will be a probability measure: as we shall see, in this case, isoperimetric inequalities may have totally unexpected consequences.
In the first example, $\Omega$ will be the sphere $S^{n-1}$ in $\mathbb{R}^n$, equipped with either the geodesic distance or, more simply, the Euclidean distance inherited from $\mathbb{R}^n$ as shown in Figure 26. (This is also called the chordal metric; it was used in Lecture 2 when we discussed spherical caps of given radii.) The measure will be $\sigma = \sigma_{n-1}$, the rotation-invariant probability on $S^{n-1}$. The solutions of the isoperimetric problem on the sphere are known exactly: they are spherical caps (Figure 26, right) or, equivalently, they are balls in the metric on $S^{n-1}$. Thus, if a subset $A$ of the sphere has the same measure as a cap of radius $r$, its neighbourhood $A_\varepsilon$ has measure at least that of a cap of radius $r + \varepsilon$.
Figure 26. The Euclidean metric on the sphere. A spherical cap (right) is a ball for this metric.

This statement is a good deal more difficult to prove than the classical isoperimetric inequality on $\mathbb{R}^n$: it was discovered by P. Lévy, quite some time after the isoperimetric inequality in $\mathbb{R}^n$. At first sight, the statement looks innocuous
enough (despite its difficulty): but it has a startling consequence. Suppose $\alpha = \frac{1}{2}$, so that $A$ has the measure of a hemisphere $H$. Then, for each positive $\varepsilon$, the set $A_\varepsilon$ has measure at least that of the set $H_\varepsilon$, illustrated in Figure 27. The complement of $H_\varepsilon$ is a spherical cap that, as we saw in Lecture 2, has measure about $e^{-n\varepsilon^2/2}$. Hence $\sigma(A_\varepsilon) \ge 1 - e^{-n\varepsilon^2/2}$, so almost the entire sphere lies within distance $\varepsilon$ of $A$, even though there may be points rather far from $A$. The measure and the metric on the sphere "don't match": the mass of $\sigma$ concentrates very close to any set of measure $\frac{1}{2}$. This is clearly related to the situation described in Lecture 1, in which we found most of the mass of the ball concentrated near each hyperplane: but now the phenomenon occurs for any set of measure $\frac{1}{2}$.
Figure 27. An $\varepsilon$-neighbourhood of a hemisphere.

The phenomenon just described becomes even more striking when reinterpreted in terms of Lipschitz functions. Suppose $f : S^{n-1} \to \mathbb{R}$ is a function on the sphere that is 1-Lipschitz: that is, for any pair of points $\theta$ and $\phi$ on the sphere,
$$| f(\theta) - f(\phi) | \le | \theta - \phi |.$$
There is at least one number $M$, the median of $f$, for which both the sets $(f \le M)$ and $(f \ge M)$ have measure at least $\frac{1}{2}$. If a point $x$ has distance at most $\varepsilon$ from $(f \le M)$, then (since $f$ is 1-Lipschitz)
$$f(x) \le M + \varepsilon.$$
By the isoperimetric inequality all but a tiny fraction of the points on the sphere have this property:
$$\sigma( f > M + \varepsilon ) \le e^{-n\varepsilon^2/2}.$$
Similarly, $f$ is larger than $M - \varepsilon$ on all but a fraction of the sphere. Putting these statements together we get
$$\sigma( | f - M | > \varepsilon ) \le 2 e^{-n\varepsilon^2/2}.$$
So, although $f$ may vary by as much as 2 between a point of the sphere and its opposite, the function is nearly equal to $M$ on almost the entire sphere: $f$ is practically constant.
In the case of the sphere we thus have the following pair of properties.

(i) If $A \subset \Omega$ with $\mu(A) = \frac{1}{2}$ then
$$\mu(A_\varepsilon) \ge 1 - e^{-n\varepsilon^2/2}.$$
(ii) If $f : \Omega \to \mathbb{R}$ is 1-Lipschitz there is a number $M$ for which
$$\mu( | f - M | > \varepsilon ) \le 2 e^{-n\varepsilon^2/2}.$$
Each of these statements may be called an approximate isoperimetric inequality. We have seen how the second can be deduced from the first. The reverse implication also holds (apart from the precise constants involved). (To see why, apply the second property to the function given by $f(x) = d(x, A)$.)

In many applications, exact solutions of the isoperimetric problem are not as important as deviation estimates of the kind we are discussing. In some cases where the exact solutions are known, the two properties above are a good deal easier to prove than the solutions: and in a great many situations, an exact isoperimetric inequality is not known, but the two properties are. The formal similarity between property (ii) and Bernstein's inequality of the last lecture is readily apparent. There are ways to make this similarity much more than merely formal: there are deviation inequalities that have implications for Lipschitz functions and imply Bernstein's inequality, but we shall not go into them here.
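Property (ii) can be watched numerically. The Python sketch below (not from the text; the dimension and thresholds are chosen here for illustration) tests it for the simplest 1-Lipschitz function on the sphere, the first coordinate $f(x) = x_1$, whose median is 0:

```python
import math, random

random.seed(0)
# Not from the text: concentration of the 1-Lipschitz function
# f(x) = x_1 on S^{n-1}.  Its median is 0, and property (ii) predicts
# sigma(|x_1| > eps) <= 2 e^{-n eps^2 / 2}.  Uniform points on the
# sphere are generated by normalising Gaussian vectors.
n, trials, eps = 200, 10000, 0.2
count = 0
for _ in range(trials):
    g = [random.gauss(0, 1) for _ in range(n)]
    r = math.sqrt(sum(v * v for v in g))
    if abs(g[0] / r) > eps:
        count += 1
empirical = count / trials
bound = 2 * math.exp(-n * eps * eps / 2)
print(empirical <= bound)
```

This is, of course, just the observation from Lecture 1 that most of the mass of the sphere lies near the hyperplane $x_1 = 0$; the force of (ii) is that the same estimate holds for every 1-Lipschitz $f$.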
In our second example, the space $\Omega$ will be $\mathbb{R}^n$ equipped with the ordinary Euclidean distance. The measure will be the standard Gaussian probability measure on $\mathbb{R}^n$ with density
$$\gamma(x) = (2\pi)^{-n/2} e^{-|x|^2/2}.$$
The solutions of the isoperimetric problem in Gauss space were found by Borell [1975]. They are half-spaces. So, in particular, if $A \subset \mathbb{R}^n$ and $\mu(A) = \frac{1}{2}$, then $\mu(A_\varepsilon)$ is at least as large as $\mu(H_\varepsilon)$, where $H$ is the half-space $\{ x \in \mathbb{R}^n : x_1 \le 0 \}$ and so $H_\varepsilon = \{ x : x_1 \le \varepsilon \}$: see Figure 28.
Figure 28. An $\varepsilon$-neighbourhood of a half-space.
The complement of $H_\varepsilon$ has measure
$$\frac{1}{\sqrt{2\pi}} \int_\varepsilon^{\infty} e^{-t^2/2}\, dt \le e^{-\varepsilon^2/2}.$$
Hence,
$$\mu(A_\varepsilon) \ge 1 - e^{-\varepsilon^2/2}.$$
Since $n$ does not appear in the exponent, this looks much weaker than the statement for the sphere, but we shall see that the two are more or less equivalent.
Borell proved his inequality by using the isoperimetric inequality on the
sphere. A more direct proof of a deviation estimate like the one just derived
was found by Maurey and Pisier, and their argument gives a slightly stronger,
Sobolev-type inequality [Pisier 1989, Chapter 4]. We too shall aim directly for
a deviation estimate, but a little background to the proof may be useful.
There was an enormous growth in understanding of approximate isoperimetric inequalities during the late 1980s, associated most especially with the name of Talagrand. The reader whose interest has been piqued should certainly look at Talagrand's gorgeous articles [1988; 1991a], in which he describes an approach to deviation inequalities in product spaces that involves astonishingly few structural hypotheses. In a somewhat different vein (but prompted by his earlier work), Talagrand [1991b] also found a general principle, strengthening the approximate isoperimetric inequality in Gauss space. A simplification of this argument was found by Maurey [1991]. The upshot is that a deviation inequality for Gauss space can be proved with an extremely short argument that fits naturally into these notes.
Theorem 8.1 (Approximate isoperimetric inequality for Gauss space). Let $A \subset \mathbb{R}^n$ be measurable and let $\mu$ be the standard Gaussian measure on $\mathbb{R}^n$. Then
$$\int e^{d(x,A)^2/4}\, d\mu \le \frac{1}{\mu(A)}.$$
Consequently, if $\mu(A) = \frac{1}{2}$,
$$\mu(A_\varepsilon) \ge 1 - 2 e^{-\varepsilon^2/4}.$$
Proof. We shall deduce the first assertion directly from the Prékopa–Leindler inequality (with $\lambda = \frac{1}{2}$) of Lecture 5. To this end, define functions $f$, $g$, and $m$ on $\mathbb{R}^n$, as follows:
$$f(x) = e^{d(x,A)^2/4} \gamma(x), \qquad g(x) = \chi_A(x)\, \gamma(x), \qquad m(x) = \gamma(x),$$
where $\gamma$ is the Gaussian density. The assertion to be proved is that
$$\left( \int e^{d(x,A)^2/4}\, d\mu \right) \mu(A) \le 1,$$
which translates directly into the inequality
$$\int_{\mathbb{R}^n} f \int_{\mathbb{R}^n} g \le \left( \int_{\mathbb{R}^n} m \right)^2.$$
By the Prékopa–Leindler inequality it is enough to check that, for any $x$ and $y$ in $\mathbb{R}^n$,
$$f(x)\, g(y) \le m\left( \frac{x+y}{2} \right)^2.$$
It suffices to check this for $y \in A$, since otherwise $g(y) = 0$. But, in this case, $d(x, A) \le |x - y|$. Hence
$$(2\pi)^n f(x)\, g(y) = e^{d(x,A)^2/4} e^{-|x|^2/2} e^{-|y|^2/2} \le \exp\left( \frac{|x-y|^2}{4} - \frac{|x|^2}{2} - \frac{|y|^2}{2} \right) = \exp\left( -\frac{|x+y|^2}{4} \right) = \exp\left( -\left| \frac{x+y}{2} \right|^2 \right) = (2\pi)^n m\left( \frac{x+y}{2} \right)^2,$$
which is what we need.
To deduce the second assertion from the first, we use Markov's inequality, very much as in the proof of Bernstein's inequality of the last lecture. If $\mu(A) = \frac{1}{2}$, then
$$\int e^{d(x,A)^2/4}\, d\mu \le 2.$$
The integral is at least
$$e^{\varepsilon^2/4}\, \mu\bigl( d(x, A) \ge \varepsilon \bigr).$$
So
$$\mu\bigl( d(x, A) \ge \varepsilon \bigr) \le 2 e^{-\varepsilon^2/4},$$
and the assertion follows.
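As a sanity check (not from the text; the parameters are chosen here), one can test the deviation form of Theorem 8.1 on the half-space of Figure 28, for which $d(x, A)$ depends on the first coordinate alone, so a one-dimensional simulation suffices:

```python
import math, random

random.seed(0)
# Not from the text: the deviation form of Theorem 8.1 for the
# half-space A = {x : x_1 <= 0}, where mu(A) = 1/2 and
# d(x, A) = max(x_1, 0) depends on the first coordinate alone.
trials, eps = 200000, 2.0
count = 0
for _ in range(trials):
    if random.gauss(0, 1) >= eps:  # i.e. d(x, A) >= eps
        count += 1
empirical = count / trials
bound = 2 * math.exp(-eps * eps / 4)
print(empirical <= bound)
```

For the half-space the true tail decays like $e^{-\varepsilon^2/2}$, so the bound $2e^{-\varepsilon^2/4}$ holds with a large margin, consistent with the remark below that the exponent $\varepsilon^2/4$ is not sharp.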
It was mentioned earlier that the Gaussian deviation estimate above is essentially equivalent to the concentration of measure on $S^{n-1}$. This equivalence depends upon the fact that the Gaussian measure in $\mathbb{R}^n$ is concentrated in a spherical shell of thickness approximately 1, and radius approximately $\sqrt{n}$. (Recall that the Euclidean ball of volume 1 has radius approximately $\sqrt{n}$.) This concentration is easily checked by direct computation using integration in spherical polars: but the inequality we just proved will do the job instead. There is a Euclidean ball of some radius $R$ whose Gaussian measure is $\frac{1}{2}$. According to the theorem above, Gaussian measure concentrates near the boundary of this ball. It is not hard to check that $R$ is about $\sqrt{n}$. This makes it quite easy to show that the deviation estimate for Gaussian measure guarantees a deviation estimate on the sphere of radius $\sqrt{n}$ with a decay rate of about $e^{-\varepsilon^2/4}$. If everything is scaled down by a factor of $\sqrt{n}$, onto the sphere of radius 1, we get a deviation estimate that decays like $e^{-n\varepsilon^2/4}$ and $n$ now appears in the exponent. The details are left to the reader.
The reader will probably have noticed that these estimates for Gauss space and the sphere are not quite as strong as those advertised earlier, because in each case the exponent is $\varepsilon^2/4$ instead of $\varepsilon^2/2$. In some applications, the sharper results are important, but for our purposes the difference will be irrelevant. It was pointed out to me by Talagrand that one can get as close as one wishes to the correct exponent $\varepsilon^2/2$ by using the Prékopa–Leindler inequality with $\lambda$ close to 1 instead of $\frac{1}{2}$ and applying it to slightly different $f$ and $g$. For the purposes of the next lecture we shall assume an estimate of $e^{-\varepsilon^2/2}$, even though we proved a weaker estimate.
Lecture 9. Dvoretzky's Theorem
Although this is the ninth lecture, its subject, Dvoretzky's Theorem, was really the start of the modern theory of convex geometry in high dimensions. The phrase "Dvoretzky's Theorem" has become a generic term for statements to the effect that high-dimensional bodies have almost ellipsoidal slices. Dvoretzky's original proof shows that any symmetric convex body in $\mathbb{R}^n$ has almost ellipsoidal sections of dimension about $\sqrt{\log n}$. A few years after the publication of Dvoretzky's work, Milman [Milman 1971] found a very different proof, based upon the concentration of measure, which gave slices of dimension $\log n$. As we saw in Lecture 2 this is the best one can get in general. Milman's argument gives the following.
Theorem 9.1. There is a positive number $c$ such that, for every $\varepsilon > 0$ and every natural number $n$, every symmetric convex body of dimension $n$ has a slice of dimension
$$k \ge \frac{c \varepsilon^2}{\log(1 + \varepsilon^{-1})} \log n$$
that is within distance $1 + \varepsilon$ of the $k$-dimensional Euclidean ball.
There have been many other proofs of similar results over the years. A particularly elegant one [Gordon 1985] gives the estimate $k \ge c \varepsilon^2 \log n$ (removing the logarithmic factor in $\varepsilon^{-1}$), and this estimate is essentially best possible. We chose to describe Milman's proof because it is conceptually easier to motivate and because the concentration of measure has many other uses. A few years ago, Schechtman found a way to eliminate the log factor within this approach, but we shall not introduce this subtlety here. We shall also not make any effort to be precise about the dependence upon $\varepsilon$.
With the material of Lecture 8 at our disposal, the plan of proof of Theorem 9.1 is easy to describe. We start with a symmetric convex body and we consider a linear image $K$ whose maximal volume ellipsoid is the Euclidean ball. For this $K$ we will try to find almost spherical sections, rather than merely ellipsoidal ones. Let $\| \cdot \|$ be the norm on $\mathbb{R}^n$ whose unit ball is $K$. We are looking for a $k$-dimensional space $H$ with the property that the function
$$\theta \mapsto \| \theta \|$$
is almost constant on the Euclidean sphere of $H$, $H \cap S^{n-1}$. Since $K$ contains $B_2^n$, we have $\| x \| \le | x |$ for all $x \in \mathbb{R}^n$, so for any $\theta$ and $\phi$ in $S^{n-1}$,
$$\bigl|\, \| \theta \| - \| \phi \| \,\bigr| \le \| \theta - \phi \| \le | \theta - \phi |.$$
Thus $\| \cdot \|$ is a Lipschitz function on the sphere in $\mathbb{R}^n$ (indeed on all of $\mathbb{R}^n$). (We used the same idea in Lecture 4.) From Lecture 8 we conclude that the value of $\| \theta \|$ is almost constant on a very large proportion of $S^{n-1}$: it is almost equal to its average
$$M = \int_{S^{n-1}} \| \theta \|\, d\sigma$$
on most of $S^{n-1}$.
We now choose our $k$-dimensional subspace at random. (The exact way to do this will be described below.) We can view this as a random embedding
$$T : \mathbb{R}^k \to \mathbb{R}^n.$$
For any particular unit vector $\psi \in \mathbb{R}^k$, there is a very high probability that its image $T\psi$ will have norm $\| T\psi \|$ close to $M$. This means that even if we select quite a number of vectors $\psi_1, \psi_2, \ldots, \psi_m$ in $S^{k-1}$ we can guarantee that there will be some choice of $T$ for which all the norms $\| T\psi_i \|$ will be close to $M$. We will thus have managed to pin down the radius of our slice in many different directions. If we are careful to distribute these directions well over the sphere in $\mathbb{R}^k$, we may hope that the radius will be almost constant on the entire sphere.
For these purposes, "well distributed" will mean that all points of the sphere in $\mathbb{R}^k$ are close to one of our chosen directions. As in Lecture 2 we say that a set $\{ \psi_1, \psi_2, \ldots, \psi_m \}$ in $S^{k-1}$ is a $\delta$-net for the sphere if every point of $S^{k-1}$ is within (Euclidean) distance $\delta$ of at least one $\psi_i$. The arguments in Lecture 2 show that $S^{k-1}$ has a $\delta$-net with no more than
$$m = \left( \frac{4}{\delta} \right)^k$$
elements. The following lemma states that, indeed, pinning down the norm on a very fine net pins it down everywhere.
Lemma 9.2. Let $\| \cdot \|$ be a norm on $\mathbb{R}^k$ and suppose that for each point $\psi$ of some $\delta$-net on $S^{k-1}$, we have
$$M(1 - \gamma) \le \| \psi \| \le M(1 + \gamma)$$
for some $\gamma > 0$. Then, for every $\theta \in S^{k-1}$,
$$\frac{M(1 - \gamma - 2\delta)}{1 - \delta} \le \| \theta \| \le \frac{M(1 + \gamma)}{1 - \delta}.$$
Proof. Clearly the value of $M$ plays no real role here so assume it is 1. We start with the upper bound. Let $C$ be the maximum possible ratio $\| x \| / | x |$ for nonzero $x$ and let $\theta$ be a point of $S^{k-1}$ with $\| \theta \| = C$. Choose $\psi$ in the $\delta$-net with $| \theta - \psi | \le \delta$. Then $\| \theta - \psi \| \le C | \theta - \psi | \le C\delta$, so
$$C = \| \theta \| \le \| \psi \| + \| \theta - \psi \| \le (1 + \gamma) + C\delta.$$
Hence
$$C \le \frac{1 + \gamma}{1 - \delta}.$$
To get the lower bound, pick some $\theta$ in the sphere and some $\psi$ in the $\delta$-net with $| \psi - \theta | \le \delta$. Then
$$(1 - \gamma) \le \| \psi \| \le \| \theta \| + \| \psi - \theta \| \le \| \theta \| + \frac{1 + \gamma}{1 - \delta}\, | \psi - \theta | \le \| \theta \| + \frac{(1 + \gamma)\delta}{1 - \delta}.$$
Hence
$$\| \theta \| \ge 1 - \gamma - \frac{\delta(1 + \gamma)}{1 - \delta} = \frac{1 - \gamma - 2\delta}{1 - \delta}.$$
According to the lemma, our approach will give us a slice that is within distance
$$\frac{1 + \gamma}{1 - \gamma - 2\delta}$$
of the Euclidean ball (provided we satisfy the hypotheses), and this distance can be made as close as we wish to 1 if $\gamma$ and $\delta$ are small enough.
We are now in a position to prove the basic estimate.
Theorem 9.3. Let $K$ be a symmetric convex body in $\mathbb{R}^n$ whose ellipsoid of maximal volume is $B_2^n$ and put
$$M = \int_{S^{n-1}} \| \theta \|\, d\sigma$$
as above. Then $K$ has almost spherical slices whose dimension is of the order of $nM^2$.
Proof. Choose $\gamma$ and $\delta$ small enough to give the desired accuracy, in accordance with the lemma.

Since the function $\theta \mapsto \| \theta \|$ is Lipschitz (with constant 1) on $S^{n-1}$, we know from Lecture 8 that, for any $t \ge 0$,
$$\sigma\bigl( \bigl|\, \| \theta \| - M \,\bigr| > t \bigr) \le 2 e^{-nt^2/2}.$$
In particular,
$$\sigma\bigl( \bigl|\, \| \theta \| - M \,\bigr| > M\gamma \bigr) \le 2 e^{-nM^2\gamma^2/2}.$$
So
$$M(1 - \gamma) \le \| \theta \| \le M(1 + \gamma)$$
on all but a proportion $2 e^{-nM^2\gamma^2/2}$ of the sphere.
Let $A$ be a $\delta$-net on the sphere in $\mathbb{R}^k$ with at most $(4/\delta)^k$ elements. Choose
a random embedding of $\mathbb{R}^k$ in $\mathbb{R}^n$: more precisely, fix a particular copy of $\mathbb{R}^k$ in
$\mathbb{R}^n$ and consider its images under orthogonal transformations $U$ of $\mathbb{R}^n$ as
a random subspace with respect to the invariant probability on the group of
orthogonal transformations. For each fixed $\phi$ in the sphere of $\mathbb{R}^k$, its images
$U\phi$ are uniformly distributed on the sphere in $\mathbb{R}^n$. So for each $\phi \in A$, the
inequality
$$M(1-\gamma) \le \|U\phi\| \le M(1+\gamma)$$
holds for $U$ outside a set of measure at most $2e^{-nM^2\gamma^2/2}$. So there will be at
least one $U$ for which this inequality holds for all $\phi$ in $A$, as long as the sum of
the probabilities of the bad sets is at most 1. This is guaranteed if
$$\left(\frac{4}{\delta}\right)^k 2e^{-nM^2\gamma^2/2} < 1.$$
This inequality is satisfied by $k$ of the order of
$$\frac{nM^2\gamma^2}{2\log(4/\delta)}.$$
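To get a feel for the numbers, here is a back-of-envelope computation, not from the original text, solving the counting condition for $k$. The particular values of $M$, $\gamma$ and $\delta$ below are arbitrary illustrative choices; the point is that once $M$ is bounded below by a constant, the admissible dimension grows linearly in $n$.

```python
import math

def max_dimension(n, M, gamma=0.5, delta=0.1):
    """Largest k with (4/delta)**k * 2*exp(-n*M**2*gamma**2/2) < 1
    (take logarithms and solve for k)."""
    bound = (n * M**2 * gamma**2 / 2 - math.log(2)) / math.log(4 / delta)
    return max(0, math.floor(bound))

# For sqrt(n)*B_1^n the computation below in the text gives M > sqrt(2/pi);
# we use the conservative value M = 0.5.  The admissible k grows like n:
for n in [10**3, 10**4, 10**5]:
    k = max_dimension(n, M=0.5)
    print(n, k, round(k / n, 5))
```

The printed ratios $k/n$ settle down to the constant $M^2\gamma^2 / (2\log(4/\delta))$, up to the floor.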
Theorem 9.3 guarantees the existence of spherical slices of $K$ of large dimension,
provided the average
$$M = \int_{S^{n-1}} \|\theta\|\, d\sigma$$
is not too small. Notice that we certainly have $M \le 1$, since $\|x\| \le |x|$ for all $x$.
In order to get Theorem 9.1 from Theorem 9.3 we need to get a lower estimate
for $M$ of the order of $\sqrt{\log n}/\sqrt{n}$.
This is where we must use the fact that $B_2^n$ is the maximal volume ellipsoid in
$K$. We saw in Lecture 3 that in this situation $K \subset \sqrt{n}\,B_2^n$, so $\|x\| \ge |x|/\sqrt{n}$ for
all $x$, and hence
$$M \ge \frac{1}{\sqrt{n}}.$$
But this estimate is useless, since it would not give slices of dimension bigger
than 1. It is vital that we use the more detailed information provided by John's
Theorem.
Before we explain how this works, let's look at our favourite examples. For
specific norms it is usually much easier to compute the mean $M$ by writing it as
an integral with respect to Gaussian measure on $\mathbb{R}^n$. As in Lecture 8, let $\mu$ be
the standard Gaussian measure on $\mathbb{R}^n$, with density
$$(2\pi)^{-n/2} e^{-|x|^2/2}.$$
By using polar coordinates we can write
$$\int_{S^{n-1}} \|\theta\|\, d\sigma = \frac{\Gamma(n/2)}{\sqrt{2}\,\Gamma((n+1)/2)} \int_{\mathbb{R}^n} \|x\|\, d\mu(x) > \frac{1}{\sqrt{n}} \int_{\mathbb{R}^n} \|x\|\, d\mu(x).$$
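The inequality between the $\Gamma$-ratio and $1/\sqrt{n}$ can be checked numerically; the snippet below, purely illustrative and not from the text, uses the log-Gamma function to avoid overflow for large $n$.

```python
import math

# The prefactor Gamma(n/2) / (sqrt(2) * Gamma((n+1)/2)), computed via log-Gamma
def gamma_ratio(n):
    return math.exp(math.lgamma(n / 2) - math.lgamma((n + 1) / 2)) / math.sqrt(2)

for n in [2, 10, 100, 10000]:
    r = gamma_ratio(n)
    assert r > 1 / math.sqrt(n)            # the inequality used in the text
    print(n, round(r * math.sqrt(n), 4))   # r * sqrt(n) decreases toward 1
```

So the factor lost by replacing the $\Gamma$-ratio with $1/\sqrt{n}$ is negligible for large $n$.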
The simplest norm for which to calculate is the $\ell_1$ norm. Since the body we
consider is supposed to have $B_2^n$ as its maximal ellipsoid, we must use $\sqrt{n}\,B_1^n$, for
which the corresponding norm is
$$\|x\| = \frac{1}{\sqrt{n}} \sum_1^n |x_i|.$$
Since the integral of this sum is just $n$ times the integral of any one coordinate,
it is easy to check that
$$\frac{1}{\sqrt{n}} \int_{\mathbb{R}^n} \|x\|\, d\mu(x) = \sqrt{\frac{2}{\pi}}.$$
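The constant $\sqrt{2/\pi}$ is just the mean of $|g|$ for a standard Gaussian variable $g$. A quick Monte Carlo check, purely illustrative and not part of the text:

```python
import math, random

random.seed(0)
n, trials = 100, 20000

# ||x|| = (1/sqrt(n)) * sum |x_i|, the norm of the scaled cross-polytope
def norm(x):
    return sum(abs(t) for t in x) / math.sqrt(n)

# Monte Carlo estimate of (1/sqrt(n)) * integral of ||x|| dmu(x)
total = 0.0
for _ in range(trials):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    total += norm(x) / math.sqrt(n)
estimate = total / trials

print(round(estimate, 3), round(math.sqrt(2 / math.pi), 3))  # both close to 0.798
```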
So for the scaled copies of $B_1^n$, we have $M$ bounded below by a fixed number, and
Theorem 9.3 guarantees almost spherical sections of dimension proportional to
$n$. This was first proved, using exactly the method described here, in [Figiel et al.
1977], which had a tremendous influence on subsequent developments. Notice
that this result and Kašin's Theorem from Lecture 4 are very much in the same
spirit, but neither implies the other. The method used here does not achieve
dimensions as high as $n/2$, even if we are prepared to allow quite a large distance
from the Euclidean ball. On the other hand, the volume-ratio argument does
not give sections that are very close to Euclidean: the volume ratio is the closest
one gets this way. Some time after Kašin's article appeared, the gap between
these results was completely filled by Garnaev and Gluskin [Garnaev and Gluskin
1984]. An analogous gap in the general setting of Theorem 9.1, namely that the
existing proofs could not give a dimension larger than some fixed multiple of
$\log n$, was recently filled by Milman and Schechtman.
What about the cube? This body has $B_2^n$ as its maximal ellipsoid, so our job
is to estimate
$$\frac{1}{\sqrt{n}} \int_{\mathbb{R}^n} \max_i |x_i|\, d\mu(x).$$
At first sight this looks much more complicated than the calculation for $B_1^n$,
since we cannot simplify the integral of a maximum. But, instead of estimating
the mean of the function $\max_i |x_i|$, we can estimate its median (and from Lecture
8 we know that they are not far apart). So let $R$ be the number for which
$$\mu\Bigl(\max_i |x_i| \le R\Bigr) = \mu\Bigl(\max_i |x_i| \ge R\Bigr) = \frac{1}{2}.$$
From the second identity we get
$$\frac{1}{\sqrt{n}} \int_{\mathbb{R}^n} \max_i |x_i|\, d\mu(x) \ge \frac{R}{2\sqrt{n}}.$$
We estimate $R$ from the first identity. It says that the cube $[-R, R]^n$ has Gaussian
measure $\frac{1}{2}$. But the cube is a "product", so
$$\mu\bigl([-R, R]^n\bigr) = \left(\frac{1}{\sqrt{2\pi}} \int_{-R}^{R} e^{-t^2/2}\, dt\right)^n.$$
In order for this to be equal to $\frac{1}{2}$ we need the expression
$$\frac{1}{\sqrt{2\pi}} \int_{-R}^{R} e^{-t^2/2}\, dt$$
to be about $1 - (\log 2)/n$. Since the expression approaches 1 roughly like
$1 - e^{-R^2/2}$, we get an estimate for $R$ of the order of $\sqrt{\log n}$. From Theorem 9.3 we then
recover the simple result of Lecture 2 that the cube has almost spherical sections
of dimension about $\log n$.
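A small simulation, again only an illustration and not from the text, shows the median of $\max_i |x_i|$ growing like $\sqrt{\log n}$; the normalisation $\sqrt{2\log n}$ used below is the standard sharp rate, taken here merely as a yardstick.

```python
import math, random

random.seed(1)

def median_max_coordinate(n, trials=2001):
    """Monte Carlo estimate of the median R of max_i |x_i| for a standard
    Gaussian vector x in R^n."""
    maxima = sorted(max(abs(random.gauss(0.0, 1.0)) for _ in range(n))
                    for _ in range(trials))
    return maxima[trials // 2]

medians = {}
for n in [10, 100, 1000]:
    R = medians[n] = median_max_coordinate(n)
    # R grows like sqrt(2 log n); the printed ratio drifts toward 1
    print(n, round(R, 2), round(R / math.sqrt(2 * math.log(n)), 2))
```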
There are many other bodies and classes of bodies for which $M$ can be efficiently
estimated. For example, the correct order of the largest dimension of
Euclidean slices of the $\ell_p^n$ balls was also computed in the paper [Figiel et al. 1977]
mentioned earlier.
We would like to know that for a general body with maximal ellipsoid $B_2^n$ we
have
$$\int_{\mathbb{R}^n} \|x\|\, d\mu(x) \ge (\mathrm{constant})\sqrt{\log n} \tag{9.1}$$
just as we do for the cube. The usual proof of this goes via the Dvoretzky–Rogers
Lemma, which can be proved using John's Theorem. This is done for example in
[Pisier 1989]. Roughly speaking, the Dvoretzky–Rogers Lemma builds something
like a cube around $K$, at least in a subspace of dimension about $n/2$, to which we
then apply the result for the cube. However, I cannot resist mentioning that the
methods of Lecture 6, involving sharp convolution inequalities, can be used to
show that among all symmetric bodies $K$ with maximal ellipsoid $B_2^n$, the cube
is precisely the one for which the integral in (9.1) is smallest. This is proved by
showing that, for each $r$, the Gaussian measure of $rK$ is at most that of $[-r, r]^n$.
The details are left to the reader.
This last lecture has described work that dates back to the seventies. Although
some of the material in earlier lectures is more recent (and some is much older),
I have really only scratched the surface of what has been done in the last twenty
years. The book of Pisier to which I have referred several times gives a more
comprehensive account of many of the developments. I hope that readers of
these notes may feel motivated to discover more.
Acknowledgements
I would like to thank Silvio Levy for his help in the preparation of these notes,
and one of the workshop participants, John Mount, for proofreading the notes
and suggesting several improvements. Finally, a very big thank you to my wife
Sachiko Kusukawa for her great patience and constant love.
References

[Ball 1990] K. M. Ball, "Volumes of sections of cubes and related problems", pp. 251–260 in Geometric aspects of functional analysis (Israel Seminar, 1987–1988), edited by J. Lindenstrauss and V. D. Milman, Lecture Notes in Math. 1376, Springer, 1990.

[Ball 1991] K. M. Ball, "Volume ratios and a reverse isoperimetric inequality", J. London Math. Soc. 44 (1991), 351–359.

[Beckner 1975] W. Beckner, "Inequalities in Fourier analysis", Ann. of Math. 102 (1975), 159–182.

[Borell 1975] C. Borell, "The Brunn–Minkowski inequality in Gauss space", Invent. Math. 30 (1975), 205–216.

[Bourgain and Milman 1987] J. Bourgain and V. Milman, "New volume ratio properties for convex symmetric bodies in $\mathbb{R}^n$", Invent. Math. 88 (1987), 319–340.

[Bourgain et al. 1989] J. Bourgain, J. Lindenstrauss, and V. Milman, "Approximation of zonoids by zonotopes", Acta Math. 162 (1989), 73–141.

[Brascamp and Lieb 1976a] H. J. Brascamp and E. H. Lieb, "Best constants in Young's inequality, its converse and its generalization to more than three functions", Advances in Math. 20 (1976), 151–173.

[Brascamp and Lieb 1976b] H. J. Brascamp and E. H. Lieb, "On extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation", J. Funct. Anal. 22 (1976), 366–389.

[Brøndsted 1983] A. Brøndsted, An introduction to convex polytopes, Graduate Texts in Math. 90, Springer, New York, 1983.

[Busemann 1949] H. Busemann, "A theorem on convex bodies of the Brunn–Minkowski type", Proc. Nat. Acad. Sci. USA 35 (1949), 27–31.

[Figiel et al. 1977] T. Figiel, J. Lindenstrauss, and V. Milman, "The dimension of almost spherical sections of convex bodies", Acta Math. 139 (1977), 53–94.

[Garnaev and Gluskin 1984] A. Garnaev and E. Gluskin, "The widths of a Euclidean ball", Dokl. A. N. USSR 277 (1984), 1048–1052. In Russian.

[Gordon 1985] Y. Gordon, "Some inequalities for Gaussian processes and applications", Israel J. Math. 50 (1985), 265–289.

[Hoeffding 1963] W. Hoeffding, "Probability inequalities for sums of bounded random variables", J. Amer. Statist. Assoc. 58 (1963), 13–30.

[John 1948] F. John, "Extremum problems with inequalities as subsidiary conditions", pp. 187–204 in Studies and essays presented to R. Courant on his 60th birthday (Jan. 8, 1948), Interscience, New York, 1948.

[Johnson and Schechtman 1982] W. B. Johnson and G. Schechtman, "Embedding $\ell_p^m$ into $\ell_1^n$", Acta Math. 149 (1982), 71–85.

[Kašin 1977] B. S. Kašin, "The widths of certain finite-dimensional sets and classes of smooth functions", Izv. Akad. Nauk SSSR Ser. Mat. 41:2 (1977), 334–351, 478. In Russian.

[Lieb 1990] E. H. Lieb, "Gaussian kernels have only Gaussian maximizers", Invent. Math. 102 (1990), 179–208.

[Lyusternik 1935] L. A. Lyusternik, "Die Brunn–Minkowskische Ungleichung für beliebige messbare Mengen", C. R. Acad. Sci. URSS 8 (1935), 55–58.

[Maurey 1991] B. Maurey, "Some deviation inequalities", Geom. Funct. Anal. 1:2 (1991), 188–197.

[Milman 1971] V. Milman, "A new proof of A. Dvoretzky's theorem on cross-sections of convex bodies", Funkcional. Anal. i Priložen 5 (1971), 28–37. In Russian.

[Milman 1985] V. Milman, "Almost Euclidean quotient spaces of subspaces of finite dimensional normed spaces", Proc. Amer. Math. Soc. 94 (1985), 445–449.

[Pisier 1989] G. Pisier, The volume of convex bodies and Banach space geometry, Tracts in Math. 94, Cambridge U. Press, Cambridge, 1989.

[Rogers 1964] C. A. Rogers, Packing and covering, Cambridge U. Press, Cambridge, 1964.

[Schneider 1993] R. Schneider, Convex bodies: the Brunn–Minkowski theory, Encyclopedia of Math. and its Applications 44, Cambridge U. Press, 1993.

[Szarek 1978] S. J. Szarek, "On Kashin's almost Euclidean orthogonal decomposition of $\ell_1^n$", Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 26 (1978), 691–694.

[Talagrand 1988] M. Talagrand, "An isoperimetric inequality on the cube and the Khintchine–Kahane inequalities", Proc. Amer. Math. Soc. 104 (1988), 905–909.

[Talagrand 1990] M. Talagrand, "Embedding subspaces of $L_1$ into $\ell_1^N$", Proc. Amer. Math. Soc. 108 (1990), 363–369.

[Talagrand 1991a] M. Talagrand, "A new isoperimetric inequality", Geom. Funct. Anal. 1:2 (1991), 211–223.

[Talagrand 1991b] M. Talagrand, "A new isoperimetric inequality and the concentration of measure phenomenon", pp. 94–124 in Geometric aspects of functional analysis (Israel Seminar, 1989–1990), edited by J. Lindenstrauss and V. D. Milman, Lecture Notes in Math. 1469, Springer, 1991.

[Tomczak-Jaegermann 1988] N. Tomczak-Jaegermann, Banach–Mazur distances and finite-dimensional operator ideals, Pitman Monographs and Surveys in Pure and Applied Math. 38, Longman, Harlow, 1988.

[Williams 1991] D. Williams, Probability with martingales, Cambridge Mathematical Textbooks, Cambridge University Press, Cambridge, 1991.
Index

affine transformation, 8, 13, 32
almost, see approximate
AM/GM inequality, 17, 19, 29, 31
approximate
  ball, 8, 20
  intersection, 20
  section, 8, 9, 19, 24, 49
  ellipsoid, see approximate ball
  isoperimetric inequality, 44, 45
area, see also volume
arithmetic mean, see AM/GM inequality
average radius, 7
Azuma, Kazuoki, 39
Azuma–Hoeffding inequality, 39
$B_1^n$, 3
$B_2^n$, 4
ball, see also approximate ball
  and cube, 2, 8, 9, 19
  and octahedron, 3
  and simplex, 2
  central section, 6
  circumscribed, 2, 3, 8
  distribution of volume, 6
  Euclidean, 4
  for arbitrary norm, 8
  inscribed, 2, 3, 8
  near polytope, 9
  section, 6
  similarities to convex body, 2
  volume, 4
Ball, Keith M., 25, 33, 36
Beckner, William, 34, 37
Bernoulli distribution, 39
Bernstein's inequality, 39, 41, 44, 46
Bernstein, Sergei N., 39
Bollobás, Béla, 1
bootstrapping of probabilities, 24
Borell, Christer, 44
bound, see inequality
Bourgain, Jean, 25
Brascamp, Herm Jan, 30, 34, 35, 37
Brascamp–Lieb reverse inequality, 37
Brunn's Theorem, 26, 28, 29
Brunn, Hermann, 25, 26
Brunn–Minkowski
  inequality, 25, 28, 32, 41
    functional, 29
    multiplicative, 29
  theory, 2
Brøndsted, Arne, 2
Busemann's Theorem, 32
Busemann, Herbert, 32
cap, 10, 22
  radius, 11
  volume, 10, 11
Cauchy–Schwarz inequality, 24
central
  limit theorem, 37, 39
  section, 6
centrally symmetric, see symmetric
characteristic function, 29
Chebyshev's inequality, 40
chordal metric, 42
circumscribed ball, 2, 8
classical convex geometry, 2
coin toss, 39
combinatorial theory of polytopes, 2
combinatorics, 32
concave function, 25, 27, 29
concentration, see also distribution
  of measure, 7, 41, 47
cone, 3, 26
  spherical, 12
  volume, 3
contact point, 13
convex
  body
    definition, 2
    ellipsoids inside, 13
    similarities to ball, 2
    symmetric, 8
  hull, 2, 3
convolution
  generalised, 34
  inequalities, 52
cotype-2 property, 25
cross-polytope, 3, 8
  and ball, 13
  volume, 3
    ratio, 19, 21
cube, 2, 7, 33, 52, 53
  and ball, 8, 9, 13, 19
  distribution of volume, 41
  maximal ellipsoid, 15
  sections of, 8
  volume, 7
    distribution, 7
    ratio, 19, 21
deck of cards, 25
density, see distribution
determinant maximization, 19
deviation estimate, 24, 39
  geometric, 41
  in Gauss space, 45
differential equations, 2
dimension-independent constant, 6, 20, 25, 47
distance
  between convex bodies, 8
distribution, see also concentration, see also volume distribution
  Gaussian, 6, 12
  normal, 6
  of volume, 12
duality, 17, 19
Dvoretzky's Theorem, 47
Dvoretzky, Aryeh, 47, 52
Dvoretzky–Rogers Lemma, 52
ellipsoid, 13
  maximal, see maximal ellipsoid
  minimal, 37
  polar, 17
Euclidean
  metric, 11
    on sphere, 42
  norm, 2, 14
expectation, 37
Figiel, Tadeusz, 51, 52
Fourier transform, 37
Fubini's Theorem, 31, 38
functional, 2, 18
  analysis, 2
  Brunn–Minkowski inequality, 29
Garnaev, Andreĭ, 51
Gaussian
  distribution, 6, 12, 34, 36, 39, 44, 46, 47, 51, 53
  measure, see Gaussian distribution
generalized convolution, 34
geometric
  mean, see AM/GM inequality
geometry
  of numbers, 2
Gluskin, E. D., 51
Gordon, Yehoram, 48
Hölder inequality, 30
Hadamard matrix, 24
Hahn–Banach separation theorem, 2
harmonic
  analysis, 34, 37
Hoeffding, Wassily, 39
homogeneity, 30
identity
  resolution of, 14, 19
independence from dimension, see dimension-independent constant
independent variables, 37
inequality
  AM/GM, 17, 19, 29, 31
  Bernstein's, 39, 41, 44, 46
  Brunn–Minkowski, 25, 28, 32, 41
    functional, 29
    multiplicative, 29
  Cauchy–Schwarz, 24
  Chebyshev's, 40
  convolution, 52
  deviation, see deviation estimate
  for cap volume, 11
  for norms of convolutions, 34
  Hölder, 30
  isoperimetric, 28, 32, 41, 44, 45
  Markov's, 40, 46
  Prékopa–Leindler, 30, 46, 47
  Young's, 34
inertia tensor, 14
information theory, 2, 32
inscribed ball, 2, 8
intersection
  of convex bodies, 20
invariant measure on $O(n)$, 22, 50
isoperimetric
  inequality, 28, 32, 41
    approximate, 44, 45
    in Gauss space, 45
    on sphere, 42
  problem
    in Gauss space, 44
John's Theorem, 13, 33, 36, 51, 52
  extensions, 19
  proof of, 16
John, Fritz, 13
Johnson, William B., 25
Kašin's Theorem, 20, 51
Kašin, B. S., 20
Kusukawa, Sachiko, 53
$L_p$ norm, 34
$\ell_1$ norm, 3, 8, 24, 51
Lévy, Paul, 42
large deviation inequalities, see deviation estimate
Leindler, László, 30
Levy, Silvio, 53
Lieb, Elliott H., 30, 34, 35, 37
Lindenstrauss, Joram, 1, 25, 51, 52
linear
  programming, 2
Lipschitz function, 43, 50
Lyusternik, Lazar A., 28
Markov's inequality, 40, 46
mass, see also volume
  distribution on contact points, 14
Maurey, Bernard, 45
maximal
  ellipsoid, 13, 21, 34, 48, 49, 51, 53
    for cube, 15
mean, see AM/GM inequality
metric
  and measure, 42
  on space of convex bodies, 8
  on sphere, 11
Milman, Vitali D., 25, 41, 47, 51, 52
minimal ellipsoid, 19, 37
Minkowski
  addition, 28, 42
Minkowski, Hermann, 27
Mount, John, 53
MSRI, 1
multiplicative
  Brunn–Minkowski inequality, 29
$N(x)$, 22
net, 11, 48
nonsymmetric convex body, 13, 15
norm
  $\ell_1$, 3, 8, 24, 51
  and radius, 8
  Euclidean, 2, 14, 21
  of convolution, 34
  with given ball, 8, 21, 48
normal distribution, 6
normalisation of integral, 5, 39
octahedron, see cross-polytope
omissions, 2
orthant, 3
orthogonal
  group, 22, 50
  projection, 14
orthonormal basis, 13, 14
partial differential equations, 2
Pisier, Gilles, 20, 45, 52
polar
  coordinates, 4, 7, 47, 51
  ellipsoid, 17
polytope, 8
  as section of cube, 9
  combinatorial theory of ~s, 2
  near ball, 9
Prékopa, András, 30
Prékopa–Leindler inequality, 30, 46, 47
probability
  bootstrapping of, 24
  measure, 37, 42
  that . . . , 37
  theory, 2, 24, 37
projection
  orthogonal, 14
QS-Theorem, 25
quotient of a subspace, 25
$r(\theta)$, 7
radius
  and norm, 8
  and volume for ball, 5
  average, 7
  in a direction, 7, 32
  of body in a direction, 21
Ramsey theory, 24
random
  subspace, 48, 50
  variable, 37
ratio
  between volumes, see volume ratio
resolution of the identity, 14, 19
reverse
  Brascamp–Lieb inequality, 37
rigidity condition, 16
Rogers, Claude Ambrose, 9, 52
$S^{n-1}$, 4
Schechtman, Gideon, 1, 25, 48, 51
Schneider, Rolf, 2, 32
section
  ellipsoidal, see approximate ball
  length of, 25
  of ball, 6
  of cube, 8, 9
  parallel, 27
  spherical, see approximate ball
sections
  almost spherical, 50
1-separated set, 12
separation
  theorem, 18
shaking down a set, 25
$\sigma$, 5
$\sigma$-field, 38
simplex, 15, 33
  regular, 2
slice, see section
sphere, see also ball
  cover by caps, 10
  measure on, 5
  volume, 5
spherical
  cap, see cap
  cone, 12
  coordinates, see polar coordinates
  section, see approximate ball
Stirling's formula, 5, 6
supporting hyperplane, 2, 16
symmetric body, 8, 15, 25, 33, 47, 49
Szarek, Stanisław Jerzy, 20, 21
Talagrand, Michel, 25, 45
tensor product, 14
Tomczak-Jaegermann, Nicole, 19
vol, 2
volume
  distribution
    of ball, 6
    of cube, 7, 41
  of ball, 4
  of cone, 3
  of cross-polytope, 3
  of ellipsoid, 13
  of symmetric body, 8
  ratio, 19, 21, 25, 33, 36, 51
vr, 21
Walsh matrix, 24
weighted average, 14, 27
Williams, David, 39
Young's inequality, 34
Keith Ball
Department of Mathematics
University College
University of London
London
United Kingdom
kmb@math.ucl.ac.uk