
The Green-Tao Theorem on arithmetic progressions within the primes
Thomas Bloom
November 7, 2010
Contents
1 Introduction
1.1 Heuristics
1.2 Arithmetic Progressions
1.3 Structure of the Proof
2 Arithmetic Progressions
2.1 How to count arithmetic progressions
2.2 Szemerédi Revisited
2.3 Pseudorandomness
3 Uniformity Norms and the Generalised von Neumann Theorem
3.1 The Gowers Uniformity Norm
3.2 The Generalised von Neumann Theorem
3.3 Dual Norms
4 Decomposition Theorem
4.1 The Green-Tao Proof
4.2 The Gowers-Hahn-Banach Proof
4.3 The Relative Szemerédi Theorem
5 Progressions in the Primes
5.1 Counting the Primes and the W-trick
5.2 Pseudorandom Majorant
5.3 The Green-Tao Theorem
6 Further Results
6.1 Extensions of the Green-Tao Theorem
6.2 Asymptotics for P(k, N)
6.3 Explicit Bounds
A Proof of the Decomposition Theorem
B Estimates for $\Lambda_R$
B.1 Euler Product for independent linear forms
B.2 Euler product for simple linear forms
B.3 Pseudorandomness of $\nu$
C Fourier transform
D The GI and MN Conjectures
Standard Notation
The following notation and definitions are standard, and will be used without comment throughout this dissertation.
A k-term arithmetic progression (or just a k-progression) is a set of the form $\{a + nb : 0 \le n \le k-1\}$ for some $a, b \in \mathbb{N}$. We exclude the degenerate case where $b = 0$.
We say that $f = O(g)$ if there exists a constant $C$ such that for all sufficiently large $x$, $|f(x)| \le C g(x)$.
$f = o(1)$ if $\lim_{x \to \infty} f(x) = 0$.
$f \sim g$ if $f = (1 + o(1))g$.
$[n] := \{1, 2, \dots, n-1, n\}$.
$\phi(n) := \#\{m \in [n] : (m, n) = 1\}$.
\[ \mu(n) := \begin{cases} 1 & \text{if } n = 1; \\ (-1)^k & \text{if $n$ is square-free and has $k$ prime divisors;} \\ 0 & \text{otherwise.} \end{cases} \]
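As a quick sanity check on the last two definitions, the following Python sketch computes $\phi(n)$ and $\mu(n)$ directly from the definitions above. The function names `euler_phi` and `mobius` are our own choices for illustration and are not notation used elsewhere in this dissertation.

```python
from math import gcd


def euler_phi(n: int) -> int:
    """phi(n) = #{m in [n] : gcd(m, n) = 1}, straight from the definition."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def mobius(n: int) -> int:
    """mu(n): 1 if n = 1, (-1)^k if n is square-free with k prime divisors, else 0."""
    k = 0  # number of distinct prime divisors found so far
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n, so n is not square-free
                return 0
            k += 1
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        k += 1
    return -1 if k % 2 else 1


print([euler_phi(n) for n in range(1, 11)])
print([mobius(n) for n in range(1, 11)])
```

Both functions favour transparency over speed; for large $n$ one would factorise once and read both values off the factorisation.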
Specific notation and conventions
We shall often be looking at the arithmetic mean of a function over a given set. For convenience, we denote this using the expectation notation,
\[ \mathbb{E}(f(x) : x \in X) = \mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x). \]
Also, since we shall be trying to count k-term progressions of primes, it is convenient to
introduce the following function:
P(k, N) := # of k-term arithmetic progressions of primes in [N].
In many cases, we shall be taking an average over a hypercube; that is, points of the form $(\omega_1, \dots, \omega_n)$ where $\omega_i = 0$ or $1$ for all $1 \le i \le n$. We denote the n-dimensional hypercube by $C_n := \{0, 1\}^n$. We denote the hypercube with the origin removed by $C_n^* := \{0, 1\}^n \setminus \{(0, \dots, 0)\}$. Where we are dealing with such points, of the form $\omega = (\omega_1, \dots, \omega_n)$ and $h = (h_1, \dots, h_n)$, we define the scalar product to be
\[ \omega \cdot h := \omega_1 h_1 + \cdots + \omega_n h_n. \]
We will be working mainly over the ring of integers modulo N, denoted by $\mathbb{Z}_N := \mathbb{Z}/N\mathbb{Z}$, and since we are letting $N \to \infty$ we may always assume that N is a prime, and so $\mathbb{Z}_N$ is a field. We shall often be considering the space of functions $f : \mathbb{Z}_N \to \mathbb{R}$, which we denote by $\mathbb{R}^N$. We give this the inner product
\[ \langle f, g \rangle := \mathbb{E}_{x \in \mathbb{Z}_N}(f(x)g(x)), \]
and where convenient the $L^p$ norms,
\[ \|f\|_p := \left( \mathbb{E}_{x \in \mathbb{Z}_N} |f(x)|^p \right)^{1/p} \]
for $1 \le p < \infty$, and
\[ \|f\|_\infty := \sup_{x \in \mathbb{Z}_N} |f(x)|. \]
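To make the normalisation concrete, here is a minimal Python sketch of the expectation, inner product and norms just defined. It assumes a function on $\mathbb{Z}_N$ is stored as a list of length N whose entry at index x is f(x); that representation and all the function names are our own illustrative choices.

```python
def expectation(values):
    """Arithmetic mean of a finite collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


def inner_product(f, g):
    """<f, g> = E_x f(x) g(x), with the average taken over Z_N."""
    return expectation(a * b for a, b in zip(f, g))


def lp_norm(f, p):
    """||f||_p = (E_x |f(x)|^p)^(1/p) for 1 <= p < infinity."""
    return expectation(abs(a) ** p for a in f) ** (1 / p)


def sup_norm(f):
    """||f||_infinity = max_x |f(x)|."""
    return max(abs(a) for a in f)


f = [1, -2, 2]  # a function on Z_3
print(inner_product(f, f), lp_norm(f, 1), lp_norm(f, 2), sup_norm(f))
```

Because these norms are averages rather than sums, they are increasing in p, so the printed values satisfy $\|f\|_1 \le \|f\|_2 \le \|f\|_\infty$.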
We shall denote dependence on constants by subscripts. For example, $O_{k,\delta}$ implies that the constant implicit in the O notation is dependent on the constants $k$ and $\delta$. Similarly, $o_k$ implies that the rate of decay is dependent on $k$. Since in most cases there is only one variable, I shall omit these subscripts for clarity, only making clear which variables the constants depend on when this is important or not clear from context.
In almost all of this dissertation, the only variable is N, and hence, for instance, f = o(1)
implies f tends to zero as $N \to \infty$. The only other variable parameter is R :=

N, and
hence we may still take the o(1) errors to be decaying as $N \to \infty$.
Whenever we use the variable p, we are ranging over primes. For example, $\sum_{p \le x} p$ denotes the sum of all primes less than or equal to x.
Acknowledgements
I would like to thank my supervisor for his encouragement, advice and a careful reading of the first draft. I would also like to thank several of my fellow undergraduates for careful proofreading and comments on clarity and structure.
Chapter 1
Introduction
Small arithmetic progressions within the primes are easy to come by. It is trivial to find progressions with one or two terms, and 3, 5, 7 gives a 3-term arithmetic progression. A moment's thought yields the 5-term arithmetic progression 5, 11, 17, 23, 29. The problem quickly becomes a lot more difficult: the first 6-term arithmetic progression is 7, 37, 67, 97, 127, 157 and the current record holder has 26 terms (see footnote 1):
43142746595714191 + 5283234035979900n for n = 0, . . . , 25.
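The progressions quoted above are easy to check by machine. The sketch below uses a deterministic Miller-Rabin primality test, which is a standard technique and not one used in this dissertation; the helper names `is_prime` and `progression` are hypothetical.

```python
def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin; the first 12 primes as bases suffice for n < 3.3 * 10**24."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def progression(a: int, b: int, k: int) -> list:
    """The k terms a, a + b, ..., a + (k-1)b."""
    return [a + n * b for n in range(k)]


examples = {
    "3-term": progression(3, 2, 3),
    "5-term": progression(5, 6, 5),
    "6-term": progression(7, 30, 6),
    "26-term": progression(43142746595714191, 5283234035979900, 26),
}
for name, terms in examples.items():
    print(name, all(is_prime(t) for t in terms))
```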
A natural problem to pose is whether we can find such progressions within the primes for any given length. It is easy to prove that there can be no infinite arithmetic progression within the primes, so the best we can hope for is the following recent theorem of Ben Green and Terence Tao:
Theorem 1.1 (Green-Tao, 2008 [10]). There are arbitrarily long arithmetic progressions
within the primes.
In fact, they prove something much stronger, and give an increasing function of N as a lower bound for how many such progressions are in the first N integers. It follows that there are in fact infinitely many arithmetic progressions of primes of any finite length. In this dissertation we describe a proof of this theorem, motivating and making clear the key steps and insights needed as we go along. It is a synthesis of methods and ideas from [10], [7] and [3], including some simplifications and new expository remarks. It is the first explicit description of the entire proof which includes the simplifications made since [10].
In this introductory section we present the background to the problem, including heuristics, related conjectures, and previous partial results which the Green-Tao theorem builds upon. We conclude by giving an overview of the structure of the proof. Chapters 2 to 5 focus on the different components of the proof, which are then brought together to prove (a stronger form of) Theorem 1.1 at the end of Chapter 5. Chapter 6 gives some extensions and related results which have since been obtained.
Footnote 1: Found by Benoît Perichon using the PrimeGrid software by Geoff Reynolds and Jarosław Wróblewski, April 2010.
1.1 Heuristics
The prime number theorem states that if $\pi(x)$ is the number of primes less than or equal to $x$, then
\[ \pi(x) \sim \frac{x}{\log x}. \]
In probabilistic terms, this suggests that if we select an integer from [1, x] uniformly at random, then it is prime with probability roughly $1/\log x$. This model fails as a method of proving statements about the primes: if the primes were truly random then we would expect roughly the same number of even and odd primes. The failure is because the model only considers the density of the primes, and not their arithmetical properties. This model is surprisingly useful, however, in formulating conjectures about how we expect the primes to behave. By including some information about their arithmetic properties (which often only changes the original conjecture by a constant) these heuristics can be converted into proven theorems.
For example, consider the following problem: how many primes less than or equal to $x$ are congruent to $a$ modulo $b$? If $a$ and $b$ have a common divisor greater than 1, the answer is trivially either 0 or 1, so suppose $a$ and $b$ are coprime. Let us select an integer below $x$ uniformly at random. The probability that it is prime should be roughly $1/\log x$ and, given that it is prime, the probability that it is congruent to $a$ modulo $b$ should be $1/\phi(b)$, since there are $\phi(b)$ coprime congruence classes modulo $b$. Thus the probability that it meets both these conditions, assuming the coprime classes are equally likely, should be $1/(\phi(b) \log x)$ and this leads us to conjecture that
\[ \pi_{a,b}(x) \sim \frac{x}{\phi(b) \log x} \]
where $\pi_{a,b}(x)$ is the number of primes less than or equal to $x$ which are congruent to $a$ modulo $b$. This conjecture turns out to be correct, and is known as the Prime Number Theorem for Arithmetic Progressions.
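The heuristic can be tested numerically. The following sketch counts primes up to x in each coprime residue class modulo b and compares the counts with $x/(\phi(b) \log x)$. The choices x = 100000 and b = 7, and the helper name `primes_up_to`, are arbitrary illustrative assumptions.

```python
from math import gcd, log


def primes_up_to(x):
    """All primes <= x by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]


x, b = 100_000, 7
primes = primes_up_to(x)
phi_b = sum(1 for m in range(1, b + 1) if gcd(m, b) == 1)
prediction = x / (phi_b * log(x))

# Number of primes <= x in each residue class a (mod b) with gcd(a, b) = 1.
counts = {
    a: sum(1 for p in primes if p % b == a)
    for a in range(1, b)
    if gcd(a, b) == 1
}
for a, count in counts.items():
    print(a, count, round(count / prediction, 3))
```

The classes should receive nearly equal shares of the primes. The ratio to the prediction is still noticeably above 1 at this height, because convergence in the prime number theorem is slow.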
Encouraged by the success of the probabilistic model in counting primes within a given arithmetic progression, let us try it with our problem of counting arithmetic progressions within the primes. We now use the expectation notation, together with the fact that if $A$ is an event then $\mathbb{E}(\#\text{ of times } A \text{ occurs}) = \mathbb{P}(A)$:
\begin{align*}
\mathbb{E}(\# a, b \text{ such that } a, a+b, \dots, a+(k-1)b \le N \text{ are all prime})
&= \mathbb{P}(a, a+b, \dots, a+(k-1)b \text{ are all prime}) \\
&\approx \mathbb{P}(a \text{ is prime}) \cdots \mathbb{P}(a+(k-1)b \text{ is prime}) \\
&\approx \frac{1}{\log^k N}
\end{align*}
where we have made the further assumption that the events of being prime are roughly independent. Furthermore, since there are (within a constant factor of) $N^2$ arithmetic progressions of length $k$ in $[1, N]$, this leads us to conjecture that
\[ P(k, N) \approx \frac{N^2}{\log^k N}. \]
In particular, since the right hand side is unbounded as N tends to infinity, this would imply that there are infinitely many k-term progressions within the primes. In fact, we shall prove something very similar to this asymptotic, giving a lower bound which only differs from the above heuristic by a constant factor. That is, we prove the following theorem.
Theorem 1.2 (Green-Tao). For any $k \ge 3$ there exists a constant $c_k > 0$ such that, for all sufficiently large $N$,
\[ P(k, N) \ge c_k \frac{N^2}{\log^k N}. \]
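For small N the quantity P(k, N) can be computed by brute force and set against the heuristic $N^2/\log^k N$. The sketch below does this for k = 3. It counts pairs (a, b) with b at least 1, matching the definition of a k-progression in the notation section; the function names are our own, and nothing at such small N says anything about the constant $c_k$.

```python
from math import log


def prime_flags(n):
    """flags[m] == 1 exactly when m is prime, for 0 <= m <= n."""
    flags = bytearray([1]) * (n + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return flags


def count_prime_progressions(k, n):
    """P(k, n): pairs (a, b), b >= 1, with a, a+b, ..., a+(k-1)b all prime and <= n."""
    is_p = prime_flags(n)
    count = 0
    for a in range(2, n + 1):
        if not is_p[a]:
            continue
        b = 1
        while a + (k - 1) * b <= n:
            if all(is_p[a + j * b] for j in range(1, k)):
                count += 1
            b += 1
    return count


for n in (250, 500, 1000, 2000):
    count = count_prime_progressions(3, n)
    print(n, count, round(count * log(n) ** 3 / n ** 2, 3))
```

The last column is the ratio of P(3, N) to the heuristic; a lower bound of the shape in Theorem 1.2 says that this ratio stays bounded away from zero for large N.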
It is also possible to give an upper bound of this form, so P(k, N) is within a constant factor of the heuristic answer. The correct asymptotic seems to be similar: the heuristic above multiplied by some absolute constant, which reflects the arithmetical information about the primes which the probabilistic model does not include.
Conjecture 1.1. For any $k \ge 3$ there exists a constant $C_k > 0$ depending only on $k$ such that
\[ P(k, N) \sim C_k \frac{N^2}{\log^k N}. \]
This has been verified for $3 \le k \le 6$; see Chapter 6 for details. In fact, this is only a special case of a more general conjecture obtained by Hardy and Littlewood using similar heuristics. A linear form is a function in $d$ variables of the shape
\[ a_0 + a_1 n_1 + a_2 n_2 + \cdots + a_d n_d \]
where $a_i \in \mathbb{Q}$.
Conjecture 1.2 (Hardy-Littlewood Prime Tuples Conjecture [14]). Let $\psi_1, \dots, \psi_k$ be linear forms in $d$ variables. Then for some constant $C$ dependent only on the linear forms,
\[ \#\{n = (n_1, \dots, n_d) \in [0, N]^d : \psi_1(n), \dots, \psi_k(n) \text{ prime}\} \sim C \frac{N^d}{\log^k N}. \]
An affirmative answer to this conjecture would not only give an asymptotic for P(k, N), but also settle the long-standing Twin Primes Conjecture and Goldbach Conjecture. See [7] for more details. The Hardy-Littlewood conjecture is still unproven, although by extending the methods used in [10], Green and Tao have proven it for a significant class of linear forms in [7].
1.2 Arithmetic Progressions
The advances which led to the Green-Tao theorem are more about the structure of arithmetic progressions than about the nature of the primes themselves. As Ben Green puts it in [5], "they lie not in our understanding of the primes but rather in what we can say about arithmetic progressions."
The problem of finding arithmetic progressions within sets began with a 1936 paper [1] of Erdős and Turán, in which they introduce the function $r_k(n)$, defined to be the size of the largest subset of $\{1, \dots, n\}$ with no k-term arithmetic progression. The problem of locating k-term arithmetic progressions within sufficiently large sets is equivalent to showing that $r_k(n)$ grows sufficiently slowly. In particular, they conjectured that $r_k(n) = o(n)$ for any $k$. If we define the density of a set of integers $A$ to be $\liminf_{N \to \infty} \frac{|A \cap \{1, \dots, N\}|}{N}$, then this is equivalent to the statement that any set of integers with positive density contains a k-term progression for any k. In fact, Erdős later made the stronger conjecture that
Conjecture 1.3 (Erdős). If
\[ \sum_{n \in A} \frac{1}{n} = \infty \]
then A contains arbitrarily long arithmetic progressions.
This is essentially equivalent to showing that $r_k(n) = O\left(\frac{n}{\log n}\right)$ for any $k$. Note that this is stronger than $r_k(n) = o(n)$, since it gives an explicit upper bound on the rate of decay. The Green-Tao theorem would be an immediate corollary of this conjecture, since $\sum_p \frac{1}{p}$ diverges.
The weaker conjecture that $r_k(n) = o(n)$ was proven for the first non-trivial case $k = 3$ by Roth in 1956, giving the following theorem.
Theorem 1.3 (Roth, 1956 [18]). Any subset of $\mathbb{N}$ with positive density contains a 3-term arithmetic progression.
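To get a feel for the function $r_k(n)$ behind these statements, the sketch below computes it by exhaustive search for very small n. This is only an illustration with hypothetical function names; the search is exponential in n and says nothing about the asymptotic behaviour that the theorems concern.

```python
from itertools import combinations


def contains_progression(members, k, n):
    """True if the set `members` of integers in {1, ..., n} contains a k-term progression (k >= 2)."""
    return any(
        all(a + j * b in members for j in range(k))
        for a in members
        for b in range(1, (n - a) // (k - 1) + 1)
    )


def r(k, n):
    """Size of the largest subset of {1, ..., n} with no k-term arithmetic progression."""
    for size in range(n, 0, -1):
        for subset in combinations(range(1, n + 1), size):
            if not contains_progression(set(subset), k, n):
                return size
    return 0


print([r(3, n) for n in range(1, 13)])
```

For instance, {1, 2, 4, 5} is a largest subset of {1, ..., 5} with no 3-term progression, so $r_3(5) = 4$.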
The case k = 4 was proven by Szemerédi in 1969, and in 1975 he managed to extend this to arbitrary k using complicated combinatorial arguments.
Theorem 1.4 (Szemerédi, 1975 [19]). Any subset of $\mathbb{N}$ with positive density contains arbitrarily long arithmetic progressions.
In fact, we can strengthen Szemerédi's theorem to show that we can find not just one, but infinitely many progressions of any finite length using a simple combinatorial argument first noted by Varnavides.
Corollary 1.1 (Varnavides, 1959 [24]). Let $A \subseteq \mathbb{N}$ and $k \ge 2$. If there exists $\delta > 0$ such that $|A \cap \{1, \dots, N\}| \ge \delta N$ for all sufficiently large N, then there exists a constant $c_{k,\delta} > 0$ such that, for sufficiently large N, there are at least $c_{k,\delta} N^2$ k-term arithmetic progressions in $A \cap \{1, \dots, N\}$.
The prime number theorem implies the primes have density zero, and hence Szemerédi's theorem cannot be applied directly. It will, however, be invoked at a crucial step in proving the Green-Tao theorem.
Szemerédi's theorem has been reproven several times since 1975 using methods from a surprisingly diverse array of mathematical fields: first by Furstenberg in 1977 using ergodic theory and then twice by Gowers, using Fourier analysis and hypergraph regularity. More recently, in 2009 an alternative combinatorial proof was found by the internet polymath project. For a detailed survey of the different proofs see [21]. It was insights gained from studying the common features of these different proofs which led to the methods used by Green and Tao.
The first progress on the problem of progressions within the primes came from van der Corput in 1939, who settled the result for k = 3 using the circle method.
Theorem 1.5 (van der Corput, 1939 [23]). There are infinitely many arithmetic progressions consisting of three primes.
After this theorem, although significant results were obtained for sets of positive density as outlined above, no further results were obtained for the primes until 1981 when Heath-Brown showed, using methods similar to van der Corput's, the following partial result for k = 4.
Theorem 1.6 (Heath-Brown, 1981 [15]). There are infinitely many arithmetic progressions consisting of three primes and one almost-prime (that is, a number with at most two prime factors, counted with multiplicity).
1.3 Structure of the Proof
The outline of the proof given here is new, although of course all the ideas are implicit in the original approach of Green and Tao. In [10], however, they did not make the transference principle explicit; it is discussed in more detail in, for example, [3].
The fundamental insight behind Green and Tao's work was that, heuristically, a large random subset of the integers is very similar to the integers themselves: conclusions which hold for the latter should hold for the former. In other words, there should be a kind of transference principle which would allow results which hold for the integers to hold for sufficiently random subsets.
Let us call such subsets pseudorandom sets. Applying the transference principle to Szemerédi's theorem, we may hope the following to hold.
Conjecture 1.4 (Relative Szemerédi Theorem). Let $X \subseteq \mathbb{N}$ be sufficiently pseudorandom. Then any subset of $X$ with positive density inside $X$ has arbitrarily long arithmetic progressions.
In particular, although the primes have zero density within $\mathbb{N}$, we may hope to find some pseudorandom set $X \subseteq \mathbb{N}$ in which the primes have positive density, and deduce that the primes contain arbitrarily long arithmetic progressions. In [10], Green and Tao do exactly this, and their proof can be divided into two distinct parts.
The first is proving the Relative Szemerédi theorem: that is, showing that the kind of structure reflected in Szemerédi's theorem is amenable to the transference principle mentioned above. This is accomplished using the machinery of Gowers uniformity norms, first introduced by Gowers to prove Szemerédi's theorem. In particular, these norms induce a notion of distance between subsets of $\mathbb{N}$ with the following properties.
The first is that this distance preserves the structure of containing arithmetic progressions, in that sets which are close have roughly the same number of arithmetic progressions. This is formalised as the Generalised von Neumann theorem. The mathematics needed here relies on the Gowers uniformity norms, and is similar to the kind of regularity arguments used in the hypergraph proof of Szemerédi's theorem by Gowers. This, along with a discussion of the uniformity norms, is the subject of Chapter 3.
The second is the transference principle mentioned above: a set which is dense inside some pseudorandom subset of the integers is close to a set which is dense within the integers. This is formalised as the Decomposition theorem. In the original proof, Green and Tao used finitary ergodic theory to prove this, inspired by Furstenberg's proof of Szemerédi's theorem using ergodic theory. In this dissertation, we present a simpler proof discovered by Gowers using functional analysis. This is the subject of Chapter 4.
The third component is Szemerédi's theorem, which shows that the larger set obtained from the decomposition contains many arithmetic progressions. We then invoke the Generalised von Neumann theorem to show that the original set also contains many arithmetic progressions. This finishes the proof of the Relative Szemerédi theorem.
With this in place, the second part of the proof of the Green-Tao theorem is showing that the hypothesis of the Relative Szemerédi theorem is met: the primes sit inside a pseudorandom set with positive density. For this, we will use a weighted version of the almost-primes: numbers with few prime factors. We will discuss this part of the proof in Chapter 5. Showing that the almost-primes are sufficiently pseudorandom uses techniques from traditional analytic number theory, and Green and Tao were able to use arguments and results already established by Goldston and Yıldırım in their work on small gaps between the primes [2]. This part of the argument has since been simplified; we have incorporated these simplifications into Chapter 5.
Chapter 2
Arithmetic Progressions
In this chapter we discuss arithmetic progressions and Szemerédi's theorem in more detail, and formulate the precise definition of pseudorandomness which we shall require. The exposition in this chapter is new, but the ideas it discusses are all present in [10] and earlier work.
2.1 How to count arithmetic progressions
In many problems dealing with the existence of certain structures in the natural numbers, it is easier to try to solve the apparently more difficult problem of counting how many such structures we may expect to find in any finite subset of the natural numbers. It is then sufficient to show that this count is not zero.
Another simplification that can be made is to consider functions instead of sets. We may pass from considering a subset $A \subseteq \mathbb{N}$ to its characteristic function $1_A : \mathbb{N} \to \{0, 1\}$ using the following important observation:
\[ 1_A(x) 1_A(x + r) \cdots 1_A(x + (k-1)r) = \begin{cases} 1 & \text{if } x, x + r, \dots, x + (k-1)r \in A; \\ 0 & \text{otherwise.} \end{cases} \]
Hence we may count all arithmetic progressions within $A \subseteq \mathbb{Z}_N$ by the sum
\[ \sum_{x, r \in \mathbb{Z}_N} 1_A(x) 1_A(x + r) \cdots 1_A(x + (k-1)r). \]
Note that we have switched from considering arithmetic progressions within $\{1, \dots, N\}$ to those within $\mathbb{Z}_N$. This is to avoid the problem that, for instance, $x, r \in \{1, \dots, N\}$ does not guarantee that $x + (k-1)r \in \{1, \dots, N\}$. We discuss this issue further below.
Statements about the existence of arithmetic progressions then reduce to the above sum being non-zero. In fact, whenever the sum is bounded below by a constant, it is also bounded below by a constant multiple of $N^2$ for sufficiently large N, thanks to simple arguments such as were used by Varnavides to prove Corollary 1.1. Hence we in fact consider the above sum weighted by $\frac{1}{N^2}$. This leads us to the expectation notation, and motivates us to make the following definition:
Definition 2.1. Let $f_0, \dots, f_{k-1} : \mathbb{Z}_N \to \mathbb{R}$. The normalised count of k-term arithmetic progressions in $f_0, \dots, f_{k-1}$ is defined by
\[ \Phi_k(f_0, \dots, f_{k-1}) := \mathbb{E}(f_0(x) f_1(x + r) \cdots f_{k-1}(x + (k-1)r) : x, r \in \mathbb{Z}_N). \]
The normalised count of k-term arithmetic progressions in $f$ is
\[ \Phi_k(f) := \Phi_k(f, \dots, f). \]
Remark 2.1. The use of $\Phi_k$ was not present in [10], although they repeatedly use the expectation it is shorthand for. Conventionally this expectation is denoted by $\Lambda$, but in this context this would create confusion with the von Mangoldt function for the primes.
The discussion above shows that, for $A \subseteq \mathbb{Z}_N$, there are $N^2 \Phi_k(1_A)$ many k-term arithmetic progressions in A.
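The relationship between the normalised count and the raw number of progressions can be checked on a small example. In the sketch below a function on $\mathbb{Z}_N$ is stored as a list of length N; the names `normalised_count` and `indicator`, and the choice A = {1, 2, 3} in $\mathbb{Z}_7$, are illustrative assumptions. Exact rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product
from math import prod


def normalised_count(fs, N):
    """Average of f_0(x) f_1(x+r) ... f_{k-1}(x+(k-1)r) over all x, r in Z_N."""
    total = sum(
        prod(f[(x + j * r) % N] for j, f in enumerate(fs))
        for x, r in product(range(N), repeat=2)
    )
    return Fraction(total, N * N)


def indicator(A, N):
    """Characteristic function of A as a list indexed by Z_N."""
    return [1 if x in A else 0 for x in range(N)]


N, k = 7, 3
A = {1, 2, 3}
count = normalised_count([indicator(A, N)] * k, N)

# Direct count of pairs (x, r) whose progression x, x+r, x+2r (mod N) lies in A.
direct = sum(
    all((x + j * r) % N in A for j in range(k))
    for x, r in product(range(N), repeat=2)
)
print(count, N * N * count, direct)
```

Here the five pairs are the three degenerate ones with r = 0 together with (x, r) = (1, 1) and (3, 6), the latter being the same progression traversed backwards.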
There are two important things to observe about the way we are counting arithmetic progressions. The first is that we include the degenerate case when $r = 0$. This will not be a problem, as such degenerate cases will contribute at most $1/N$ to $\Phi_k$, which we shall only be estimating up to $o(1)$ errors.
The second potential problem is the wraparound issue noted above: we are counting arithmetic progressions in $\mathbb{Z}_N$, rather than $\{1, \dots, N\}$. For instance, when N = 5, we would include 1, 4, 2 as a 3-term arithmetic progression. This will happen if and only if, for some $1 \le i \le k-1$, the term $a + ir$ is larger than $N$, for then it would have to wrap around $\mathbb{Z}_N$. One crude way to avoid this, which will be sufficient, is to restrict $a$ and $r$ to being less than $N/k$. In the case of the primes, we will ensure this by incorporating some small factor in the definition of our counting function $f$ to ensure such wraparound arithmetic progressions are not counted.
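The wraparound phenomenon and the crude fix are easy to see in code. In this sketch the elements of $\mathbb{Z}_N$ are represented by 1, ..., N, to match the text; the function names and the test values N = 101, k = 5 are our own illustrative choices.

```python
def progression_in_ZN(a, r, k, N):
    """Terms a, a+r, ..., a+(k-1)r reduced to representatives in {1, ..., N}."""
    return [(a + i * r - 1) % N + 1 for i in range(k)]


def wraps_around(a, r, k, N):
    """True if some integer term a + i*r with 1 <= i <= k-1 exceeds N."""
    return any(a + i * r > N for i in range(1, k))


# The example from the text: N = 5, a = 1, r = 3 gives 1, 4, 7 = 2 (mod 5).
print(progression_in_ZN(1, 3, 3, 5), wraps_around(1, 3, 3, 5))

# Restricting a and r to be less than N/k rules out wraparound,
# since then a + (k-1)r < N/k + (k-1)N/k = N.
N, k = 101, 5
safe = all(
    not wraps_around(a, r, k, N)
    for a in range(1, N + 1) if a * k < N
    for r in range(1, N + 1) if r * k < N
)
print(safe)
```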
2.2 Szemerédi Revisited
By considering characteristic functions instead of sets, using a Varnavides argument, and using the wraparound trick mentioned above to pass from considering $\{1, \dots, N\}$ to $\mathbb{Z}_N$, we may rewrite Theorem 1.4 as follows:
Theorem 2.1. For any $k \ge 1$ and $\delta > 0$ there exists a constant $c_{k,\delta} > 0$ such that the following holds. Let $f : \mathbb{Z}_N \to \{0, 1\}$ such that, for sufficiently large N, the density $\mathbb{E} f \ge \delta$. Then for sufficiently large N,
\[ \Phi_k(f) \ge c_{k,\delta}. \]
An important consequence of the approach of Gowers was the realisation that the function f need not be discrete, but it is sufficient that it be bounded above by 1. This leads us to the following formulation of Szemerédi's theorem:
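Theorem 2.2. For any $k \ge 1$ and $\delta > 0$ there exists a constant $c_{k,\delta} > 0$ such that the following holds. Let $f : \mathbb{Z}_N \to \mathbb{R}$ such that $0 \le f(x) \le 1$ for all $x \in \mathbb{Z}_N$ and, for sufficiently large N, the density $\mathbb{E} f \ge \delta$. Then for sufficiently large N,
\[ \Phi_k(f) \ge c_{k,\delta}. \]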
As explained in the introduction, this theorem solves the problem adequately for sufficiently dense sets of integers, but cannot be applied to the primes, since they have zero density. In particular, if we let $1_P$ be the characteristic function of the primes, then the prime number theorem implies that $\mathbb{E} 1_P \approx \frac{1}{\log N} \to 0$ as $N \to \infty$. Thus the $\mathbb{E} f \ge \delta > 0$ hypothesis of Theorem 2.2 is not satisfied.
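The decay of the density of the primes can be observed directly. The sketch below computes the proportion of primes in {1, ..., N} for a few values of N, and also that proportion multiplied by log N, which the prime number theorem predicts should tend to 1. The function name `prime_density` and the sample values of N are illustrative assumptions.

```python
from math import log


def prime_density(N):
    """E(1_P) over {1, ..., N}: the proportion of integers up to N that are prime."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    return sum(sieve) / N


rows = [(N, prime_density(N)) for N in (10**3, 10**4, 10**5, 10**6)]
for N, density in rows:
    print(N, density, round(density * log(N), 3))
```

The density falls steadily, so no fixed positive lower bound on it can hold for all large N, while the rescaled column stays close to 1.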
We can avoid this and increase the density of our prime counting function by weighting it with a log N factor: that is, we instead consider the function $f := \log N \cdot 1_P$. This now satisfies the density hypothesis, but is no longer bounded above by 1. The Relative Szemerédi theorem allows us to weaken this restriction, requiring only that it is bounded above by some sufficiently pseudorandom function. In particular, as an analogue to Theorem 2.2, we have the following precise version of Conjecture 1.4.
Conjecture 2.1 (Relative Szemerédi Theorem). Let $\nu : \mathbb{Z}_N \to \mathbb{R}$ be k-pseudorandom, and let $f : \mathbb{Z}_N \to \mathbb{R}$ such that $0 \le f \le \nu$. If there exists a constant $\delta > 0$ such that, for sufficiently large N, the density $\mathbb{E} f \ge \delta$, then there exists a constant $c'_{k,\delta} > 0$ such that, for sufficiently large N,
\[ \Phi_k(f) \ge c'_{k,\delta}. \]
To prove this theorem, we use the strategy outlined in the introduction. It will be proven
in Chapter 4 as Theorem 4.4.
2.3 Pseudorandomness
This section gives some original remarks on the notion of pseudorandomness from [10], and presents a clearer classification of the pseudorandomness condition into two components.
We first explain what kind of pseudorandomness we will need. Recall that $\nu$ is a function from $\mathbb{Z}_N$ to $\mathbb{R}$, and should be thought of as a weighted indicator function for a subset of the integers which is sufficiently pseudorandom for a transference principle to hold. The precise conditions given below are determined by what we require in the technical theorems to come later, but they reflect the following principles:
1. $\nu$ should behave like the constant function 1, since the pseudorandom set we transfer to should behave like the integers.
2. The events $\nu(a)$ and $\nu(b)$ should be independent, for distinct $a$ and $b$, since the probability that distinct elements belong to the pseudorandom set is independent.
We divide the definition of pseudorandom below into two parts, which correspond to the two components of the Relative Szemerédi theorem: the Generalised von Neumann theorem and the Decomposition theorem. In [10], Green and Tao also divide up the definition into two parts, though they do it differently: into a linear forms condition and a correlation condition. In that form, however, it is not clear that there is a distinction between the types of pseudorandomness which the two components require.
The first is the pseudorandomness required to prove the Generalised von Neumann theorem.
Definition 2.2 (Linear Pseudorandomness). We say that $\nu$ is linearly k-pseudorandom if whenever we have a system of $m \le k 2^{k-1}$ linear forms in $t \le 3k - 4$ variables,
\[ \psi_i(x) := \sum_{j=1}^{t} L_{ij} x_j \quad \text{where } 1 \le i \le m, \]
such that none of the t-tuples $(L_{ij})_{1 \le j \le t} \in \mathbb{Q}^t$ are zero, none is a rational multiple of another, and moreover for each $i, j$ the coefficient $L_{ij} \in \mathbb{Q}$ has numerator and denominator bounded by $k$ in absolute value, then
\[ \mathbb{E}_{x \in \mathbb{Z}_N^t}\bigl(\nu(\psi_1(x)) \cdots \nu(\psi_m(x))\bigr) = 1 + o_k(1). \]
Remark 2.2. All the linear forms in this condition are assumed to be homogeneous, that is, having zero constant term, so $\psi(0) = 0$. In particular, we have the measure condition,
\[ \mathbb{E}(\nu) = 1 + o(1), \]
which agrees with our first principle. Linear pseudorandomness should be viewed as a kind of independence between $\nu(\psi_1), \dots, \nu(\psi_m)$, in accordance with our second principle. This is a very strong condition, since it gives us a great deal of control over a large class of linear forms, in particular the $k$ linear forms in 2 variables which give a k-term arithmetic progression:
\[ \psi_1(x_1, x_2) := x_1, \quad \psi_2(x_1, x_2) := x_1 + x_2, \quad \dots, \quad \psi_k(x_1, x_2) := x_1 + (k-1)x_2. \]
From the linear pseudorandomness condition with these linear forms we get
\[ \Phi_k(\nu) = 1 + o(1), \]
and a lot of arithmetic progressions counted by our pseudorandom function. The power of the transference principle is that we don't lose too many of these when passing to suitable $f \le \nu$.
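The two consequences just derived, mean close to 1 and normalised progression count close to 1, can be observed on a toy weight. The sketch below takes a random subset S of $\mathbb{Z}_N$ and rescales its indicator to have mean exactly 1. This is only a stand-in chosen for illustration: it is not the pseudorandom function constructed later in this dissertation, and the seed, the density 1/2 and the names are arbitrary assumptions.

```python
import random


def normalised_count(f, k):
    """Average of f(x) f(x+r) ... f(x+(k-1)r) over all x, r in Z_N, where N = len(f)."""
    N = len(f)
    total = 0.0
    for x in range(N):
        for r in range(N):
            term = 1.0
            for j in range(k):
                term *= f[(x + j * r) % N]
            total += term
    return total / (N * N)


rng = random.Random(2010)  # fixed seed so that the run is reproducible
N, k = 499, 3              # N is prime, as assumed throughout
S = [rng.random() < 0.5 for _ in range(N)]
density = sum(S) / N

# Toy weight: the indicator of S rescaled to have mean exactly 1.
nu = [1.0 / density if in_S else 0.0 for in_S in S]

mean_nu = sum(nu) / N
count_nu = normalised_count(nu, k)
print(round(mean_nu, 6), round(count_nu, 3))
```

A genuinely random set behaves this way with high probability; the difficulty in the application to the primes lies in exhibiting a deterministic weight with the same behaviour.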
The next condition is required for the Decomposition theorem to hold.
Definition 2.3 (Simple Pseudorandomness). We say that $\nu$ is simply k-pseudorandom if whenever we have $m \le 2^{k-1}$ simple linear forms $\psi_i$ in $t \le k$ variables, that is, ones of the shape
\[ \psi_i(x) := \sum_{j=1}^{t} \epsilon_{ij} x_j + b_i \]
where $\epsilon_{ij} \in \{0, 1\}$, such that the affine parts $\epsilon_i = (\epsilon_{i1}, \dots, \epsilon_{it})$ are not zero or rational multiples of each other, then
\[ \mathbb{E}\bigl(\nu(\psi_1(x)) \cdots \nu(\psi_m(x))\bigr) = 1 + o(1). \]
Furthermore, there exists a weight function $\tau_m : \mathbb{Z}_N \to \mathbb{R}^+$ such that $\mathbb{E}(\tau_m^q) = O_{m,q}(1)$ for all $1 \le q < \infty$ and for all $h_1, \dots, h_m \in \mathbb{Z}_N$ we have the upper bound
\[ \mathbb{E}_{x \in \mathbb{Z}_N}\bigl(\nu(x + h_1) \cdots \nu(x + h_m)\bigr) \le \sum_{1 \le i < j \le m} \tau_m(h_i - h_j). \]
Remark 2.3. We cannot apply the first part to give an asymptotic for the second part, since the affine parts of the linear forms are all the same. We also observe that we cannot control these expressions using linear pseudorandomness, since the forms are non-homogeneous.
We now make the following umbrella definition, which is required for the Relative Szemerédi theorem.
Definition 2.4 (Pseudorandomness). We say that $\nu$ is k-pseudorandom if it is both linearly k-pseudorandom and simply k-pseudorandom.
The constant function 1 is the easiest example of a pseudorandom function. In fact, it is also an important one, since the space of pseudorandom functions is star-shaped around 1, as the following easily verified lemma shows.
Lemma 2.1. If $\nu$ is linearly pseudorandom, then so is $\lambda \nu + (1 - \lambda)$ for any $\lambda \in (0, 1)$; similarly for simple pseudorandomness.
This lemma will be important in several places, since it will allow us to pass from bounds of the form $f \le \nu + 1$ to ones of form $f \le \nu$ losing only a constant factor, but preserving pseudorandomness.
Remark 2.4. It is believed that these conditions are stronger than necessary. Weakening the strength of the pseudorandomness necessary (in particular the simple pseudorandomness required for the Decomposition theorem) is one goal of current research in this area.
Chapter 3
Uniformity Norms and the
Generalised von Neumann Theorem
In this chapter we introduce the Gowers uniformity norm, which will play a central role in the proof. We also prove the first component of the Relative Szemerédi theorem, the Generalised von Neumann theorem. Once again, the substantial ideas in this chapter are all present in [10], though the exposition in the first section contains some new ideas.
3.1 The Gowers Uniformity Norm
Recall that our strategy for proving the Relative Szemerédi theorem is to show that a set dense in a pseudorandom set is close in some sense to one dense in the natural numbers. In terms of the functional approach in the previous chapter, we seek some metric $d$ on $\mathbb{R}^N$ such that:
1. If $d(f, g)$ is small then $f$ and $g$ count a similar number of k-term arithmetic progressions, and
2. If $f \le \nu$ for some pseudorandom $\nu$ then there exists a bounded $g$ such that $d(f, g)$ is small.
The easiest way to obtain a metric is to induce it from some norm on the space. We now give the definition of the required norm as in [10]. First, however, we give some original remarks to help motivate the definition.
We seek to decompose a function $f \le \nu$, where $\nu$ is a pseudorandom function, as $f = g + h$ where $g$ is bounded and $h$ is small, in the sense that $\Phi_k(g + h)$ is well approximated by $\Phi_k(g)$. We now need to specify what is meant by small.
Our initial approach might be to use $\Phi_k$, the normalised count of arithmetic progressions, directly: that is, we hope to achieve a decomposition where $\Phi_k(h)$ is small. There are two problems with this approach.
The first is that we hope to approximate $\Phi_k(g + h)$ by $\Phi_k(g)$, and so we need that
\begin{align*}
\Phi_k(g + h) &= \Phi_k(g) + \sum_{\emptyset \ne I \subseteq [k]} \Phi_k(f_1, \dots, f_k) \\
&= \Phi_k(g) + \text{negligible terms,}
\end{align*}
where $f_i = h$ if $i \in I$ and $f_i = g$ otherwise. In other words, we need not only $\Phi_k(h)$ small, but also $\Phi_k(f_1, \dots, f_k)$ small whenever some $f_i = h$. It would be sufficient to prove an inequality such as
\[ \Phi_k(f_1, \dots, f_k) = O_k\Bigl( \inf_{1 \le i \le k} \Phi_k(f_i) \Bigr). \]
This cannot hold, however, as shown by the following counterexample. Define $f_1(n) = 1$ if
$n = 0$, and $f_1(n) = 0$ otherwise, and let $f_i \equiv 1$ for all $i \ge 2$. Then
$\inf_i \Lambda_k(f_i) = \Lambda_k(f_1) = 1/N^2$, whereas $\Lambda_k(f_1, \ldots, f_k) = 1/N$
for all $N$.
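The counterexample can be checked by brute force for small $N$. The following is only a sketch: it assumes the convention $\Lambda_k(f_1, \ldots, f_k) = \mathbb{E}(f_1(x) f_2(x + r) \cdots f_k(x + (k-1)r) : x, r \in \mathbb{Z}_N)$, and the helper names `ap_count` and `counterexample` are ours, not notation from the text.

```python
from fractions import Fraction
from itertools import product


def ap_count(fs, N):
    """Lambda_k(f_1, ..., f_k) = E( prod_i f_i(x + (i-1) r) : x, r in Z_N )."""
    total = Fraction(0)
    for x, r in product(range(N), repeat=2):
        term = Fraction(1)
        for i, f in enumerate(fs):  # i = 0, ..., k-1
            term *= f[(x + i * r) % N]
        total += term
    return total / N**2


def counterexample(N, k):
    """Return (Lambda_k(f_1, ..., f_k), min_i Lambda_k(f_i)) for the example above."""
    delta = [1 if n == 0 else 0 for n in range(N)]  # f_1: indicator of 0
    ones = [1] * N                                   # f_i = 1 for i >= 2
    mixed = ap_count([delta] + [ones] * (k - 1), N)
    smallest = min(ap_count([f] * k, N) for f in (delta, ones))
    return mixed, smallest


mixed, smallest = counterexample(N=7, k=3)
```

Here `mixed` is $1/N$ while `smallest` is $1/N^2$, so no bound of the form $O_k(\inf_i \Lambda_k(f_i))$ can hold uniformly in $N$.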
The second problem with using $\Lambda_k$ directly is that it is not a norm on $\mathbb{R}^N$,
for the trivial reason that it can be negative. We may avoid this by taking the absolute value,
but although it is easily verified that $|\Lambda_k|$ is a seminorm, it is not a norm. For
example, take $N = 3$, and $f : \mathbb{Z}_3 \to \mathbb{R}$ defined by $f(0) = 0$, $f(1) = 1$
and $f(2) = -1$. A simple calculation shows $\Lambda_k(f) = 0$ for $k = 3$ (and indeed for any
odd $k$), although $f \neq 0$. This is a problem for the existence of a decomposition, since
the analytic machinery we hope to use to find such a decomposition relies on $h$ being small
in some norm on $\mathbb{R}^N$.
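The calculation in this example is short enough to run directly. The sketch below assumes the convention $\Lambda_k(f) = \mathbb{E}(f(x) f(x + r) \cdots f(x + (k-1)r) : x, r \in \mathbb{Z}_N)$ and the values $f(0) = 0$, $f(1) = 1$, $f(2) = -1$; the function name `lambda_k` is illustrative.

```python
from itertools import product


def lambda_k(f, k):
    """Normalised count of k-term progressions weighted by f on Z_N, N = len(f)."""
    N = len(f)
    total = 0
    for x, r in product(range(N), repeat=2):
        term = 1
        for i in range(k):
            term *= f[(x + i * r) % N]
        total += term
    return total / N**2


f = [0, 1, -1]           # a non-zero function on Z_3
lambda_3 = lambda_k(f, 3)  # vanishes although f is not identically zero
lambda_4 = lambda_k(f, 4)  # for even k the r = 0 terms no longer cancel
```

For $r \neq 0$ every progression in $\mathbb{Z}_3$ passes through $0$, where $f$ vanishes, and for $r = 0$ the terms $f(x)^3$ cancel in pairs.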
Hence we should not demand that $h$ be small in terms of $\Lambda_k$, but rather in some other
norm on $\mathbb{R}^N$. To discover what this should look like, focus on the first of the
problems above: bounding $\Lambda_k(f_1, \ldots, f_k)$. The most common tool in bounding
expectations is the Cauchy-Schwarz inequality,
\[
\mathbb{E}(XY)^2 \le \mathbb{E}(X^2)\, \mathbb{E}(Y^2).
\]
We need to bound $\Lambda_k$ in terms of something involving only $h$, and so we must remove
$k - 1$ functions. For concreteness, let us temporarily fix $k = 3$. After applying the
Cauchy-Schwarz inequality twice, we may bound the $\Lambda_3(f_1, f_2, f_3)$ term,
\[
\mathbb{E}\big( f_1(x)\, f_2(x + r)\, f_3(x + r + r) : x, r \in \mathbb{Z}_N \big),
\]
with a product where each term has the shape
\[
\mathbb{E}\big( f(x)\, f(x + h_1)\, f(x + h_2)\, f(x + h_1 + h_2) : x, h_1, h_2 \in \mathbb{Z}_N \big).
\]
This is similar to the shape of $\Lambda_3$, but with the sum $r + r$ replaced with $h_1 + h_2$
for independent variables $h_1, h_2$. This suggests that if $h$ is small with respect to this
expectation, then we can use the Cauchy-Schwarz inequality to show that
$\Lambda_3(f_1, f_2, f_3)$ is small whenever some $f_i = h$, and hence that
$\Lambda_3(g + h) \approx \Lambda_3(g)$.
Motivated by this, we make the following definition.
Definition 3.1. For any $d \ge 1$, the Gowers $d$-uniformity norm of a function
$f : \mathbb{Z}_N \to \mathbb{R}$ is defined as
\[
\|f\|_{U^d} := \mathbb{E}\Big( \prod_{\omega \in C_d} f(x + \omega \cdot h) : x \in \mathbb{Z}_N,\ h \in \mathbb{Z}_N^d \Big)^{1/2^d}.
\]
Note that, in the $k = 3$ case, $\|f\|_{U^2}^4$ is exactly the expectation we obtained above. In
general, we will use the $U^{k-1}$ norm to deal with progressions of length $k$.
It is easy to verify that $\|\cdot\|_{U^d}$ is a seminorm for $d \ge 1$. That it is also a
genuine norm when $d \ge 2$ follows from the easily verified fact that
\[
\|f\|_{U^2} = \|\hat{f}\|_4,
\]
where $\hat{f}$ is the Fourier transform of $f$, and the less obvious monotonicity property
\[
\|f\|_{U^{d-1}} \le \|f\|_{U^d}.
\]
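Both facts can be tested numerically on a small example. The sketch below makes two assumptions that are not spelled out in this section: that $C_d$ is the discrete cube $\{0,1\}^d$, and that the Fourier transform is normalised as $\hat{f}(\xi) = \mathbb{E}(f(x) e^{-2\pi i x \xi / N} : x \in \mathbb{Z}_N)$ with $\|\hat{f}\|_4 = (\sum_\xi |\hat{f}(\xi)|^4)^{1/4}$. The function names are illustrative.

```python
import cmath
from itertools import product


def gowers_norm(f, d):
    """||f||_{U^d} for real f on Z_N, by brute force over x in Z_N and h in Z_N^d."""
    N = len(f)
    total = 0.0
    for x in range(N):
        for h in product(range(N), repeat=d):
            term = 1.0
            for omega in product((0, 1), repeat=d):  # omega ranges over {0,1}^d
                shift = sum(w * hi for w, hi in zip(omega, h))
                term *= f[(x + shift) % N]
            total += term
    mean = total / N ** (d + 1)
    # The exact mean is non-negative; max() only guards against rounding error.
    return max(mean, 0.0) ** (1.0 / 2 ** d)


def fourier_l4(f):
    """l^4 norm of the normalised Fourier coefficients of f."""
    N = len(f)
    coeffs = [
        sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / N) for x in range(N)) / N
        for xi in range(N)
    ]
    return sum(abs(c) ** 4 for c in coeffs) ** 0.25


f = [1.0, -2.0, 0.5, 3.0, -1.0]
u1, u2, u3 = (gowers_norm(f, d) for d in (1, 2, 3))
```

With these conventions `u2` agrees with `fourier_l4(f)` up to rounding, and `u1 <= u2 <= u3`.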
Recalling that we are seeking an inequality of the shape
\[
\#\text{ of } k\text{-progressions counted by } f \ll \|f\|_{U^{k-1}},
\]
the monotonicity property agrees with the trivial observation that any $(k + 1)$-progression
truncated gives a $k$-progression (and hence if the count of $(k + 1)$-progressions is small,
then so is the count of $k$-progressions).
With this definition in place, we can restate our strategy for proving the Relative
Szemerédi theorem. We need to show that the $U^{k-1}$ norm has the following properties.
1. If $\|h\|_{U^{k-1}}$ is small then (for suitable $g$) $\Lambda_k(g) \approx \Lambda_k(g + h)$.
2. If $f \le \nu$ then $f = g + h$ where $g$ is bounded and $\|h\|_{U^{k-1}}$ is small.
The first is the Generalised von Neumann Theorem, which occupies the next section. The
second is the crucial Decomposition Theorem, which we discuss in the next chapter.
We need to control the count of arithmetic progressions over functions with small Gowers
uniformity norm. Since we shall often be referring to functions with small Gowers uniformity
norms, it is convenient to make the following definition.
Definition 3.2. We say that $f$ is $\varepsilon$-uniform if $\|f\|_{U^d} \le \varepsilon$, and
more generally say that $f$ is uniform if $\|f\|_{U^d}$ is small.
For a more in-depth discussion of the Gowers uniformity norms, including a proof of the
monotonicity property mentioned above, see (for example) Appendix B of [7].
3.2 The Generalised von Neumann Theorem
We come now to the first component needed to prove the Relative Szemerédi theorem. A
specialised form of this theorem, when $\nu \equiv 1$, was first used by Gowers in his proof of
Szemerédi's theorem. The fact that it could be generalised to linearly pseudorandom $\nu$ was
first noticed by Green and Tao in [10]; indeed, the linearly pseudorandom condition which
we require $\nu$ to satisfy was chosen with the proof of this theorem in mind.
The proof is long and technical, and can be found in [10]. The idea is to repeatedly apply
the Cauchy-Schwarz inequality as outlined above until we are at a stage where we can apply
the pseudorandom condition.
Theorem 3.1 (Generalised von Neumann Theorem). Let $\nu$ be linearly pseudorandom, and let
$f_0, \ldots, f_{k-1}$ obey the bounds $|f_i(x)| \le \nu(x)$ for all $x$. Then
\[
\Lambda_k(f_0, \ldots, f_{k-1}) = O_k\Big( \inf_{0 \le i \le k-1} \|f_i\|_{U^{k-1}} \Big) + o_k(1).
\]
Remark 3.1. Using Theorem 2.1, and rescaling the $f_i$ where necessary, we may in fact
weaken the conditions to $|f_i| \le \nu + 2$. This fact will be needed when we apply this to
prove the Relative Szemerédi theorem, since we will need to apply it to $h = f - g$ where
$0 \le f \le \nu$ and $0 \le g \le 2$.
Proof. Omitted. See [10], Section 3.
In particular, if $h$ is uniform, then $\Lambda_k(g + h)$ is approximately $\Lambda_k(g)$. This
is the key step in the proof of the Relative Szemerédi theorem, so we formulate it precisely
as follows.
Corollary 3.1. If $f = g + h$ where $|g|, |h| \le \nu$ for some linearly pseudorandom $\nu$,
and $h$ is $\varepsilon$-uniform, then
\[
\Lambda_k(f) = \Lambda_k(g) + O_k(\varepsilon) + o(1).
\]
Proof. Expanding out the expectation notation, we see that
\[
\Lambda_k(g + h) = \Lambda_k(g) + \sum_{\emptyset \neq I \subseteq [k]} \Lambda_k(f_1, \ldots, f_k),
\]
where $f_i = h$ if $i \in I$ and $f_i = g$ otherwise. We then apply Theorem 3.1 and the
condition that $h$ is $\varepsilon$-uniform to show that each of the terms in the sum is
bounded by $O_k(\varepsilon) + o_k(1)$.
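The expansion used in this proof is just multilinearity of $\Lambda_k$, which can be confirmed exactly on a small example. This is a sketch with illustrative names (`ap_count`, `expansion_terms`), with subsets $I$ encoded as 0/1 tuples and positions indexed from $0$; the test functions `g` and `h` are arbitrary.

```python
from fractions import Fraction
from itertools import product


def ap_count(fs):
    """Lambda_k(f_1, ..., f_k) over Z_N, where N is the common length of the f_i."""
    N = len(fs[0])
    total = Fraction(0)
    for x, r in product(range(N), repeat=2):
        term = Fraction(1)
        for i, f in enumerate(fs):
            term *= f[(x + i * r) % N]
        total += term
    return total / N**2


def expansion_terms(g, h, k):
    """Map each subset I (as a 0/1 tuple) to Lambda_k with f_i = h for i in I, else g."""
    return {
        I: ap_count([h if bit else g for bit in I])
        for I in product((0, 1), repeat=k)
    }


g = [1, 0, 2, 1, 3]
h = [1, -1, 0, 2, -2]
f = [a + b for a, b in zip(g, h)]
terms = expansion_terms(g, h, k=3)
```

The entry for the empty set, `terms[(0, 0, 0)]`, is $\Lambda_3(g)$, and the sum of all $2^3$ entries equals $\Lambda_3(g + h)$.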
3.3 Dual Norms
This section closely follows the first part of section 6 in [10], although due to the usefulness
of dual norms in the new approach to the Decomposition theorem in the next chapter, we
define the dual norm in generality.
In general, whenever we have a norm $\|\cdot\|$ on $\mathbb{R}^N$ we may define the dual norm
as follows:
\[
\|f\|^* := \sup\{ |\langle f, g \rangle| : \|g\| \le 1 \}.
\]
It is easy to check that this defines a seminorm on $\mathbb{R}^N$, and for the norms we shall
be dealing with it will also be a norm. The use of this definition lies in the inequality
\[
\langle f, g \rangle \le \|f\|\, \|g\|^*.
\]
In particular, whenever $\|g\|^*$ is small, and $g$ correlates with $f$ to a large degree, then
$\|f\|$ must be large. That is, smallness of the dual norm prevents the norm of related
functions from being small. In the case of the $U^d$ norms, we say that $g$ is anti-uniform if
it has small dual $U^d$ norm, and so anti-uniformity is an obstruction to uniformity.
Closely linked to the introduction of dual norms, we also introduce the concept of dual
functions, at least with respect to the $U^d$ norms. In the following definition, and throughout
the rest of this dissertation, we shall fix $d = k - 1$, recalling that $k$ is to be taken as a
fixed quantity. The dual function of $f$ is defined as
\[
\mathcal{D}f(x) := \mathbb{E}\Big( \prod_{\omega \in C_{k-1} \setminus \{0\}} f(x + \omega \cdot h) : h \in \mathbb{Z}_N^{k-1} \Big).
\]
The use of this lies in the following lemma. This will be useful later, when we shall apply it
to deduce that sufficiently uniform functions do not correlate much with their dual functions.
Lemma 3.1.
\[
\langle f, \mathcal{D}f \rangle = \|f\|_{U^{k-1}}^{2^{k-1}}.
\]
Proof. Expand out both sides using their definitions.
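The identity in Lemma 3.1 can be checked exactly for small $N$. The sketch below assumes the normalised inner product $\langle f, g \rangle = \mathbb{E}(f(x) g(x) : x \in \mathbb{Z}_N)$ and takes the cube to be $\{0,1\}^d$ with $d = k - 1$; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def cube_product(f, x, h, skip_zero):
    """prod over omega in {0,1}^d of f(x + omega . h), optionally omitting omega = 0."""
    N = len(f)
    term = Fraction(1)
    for omega in product((0, 1), repeat=len(h)):
        if skip_zero and not any(omega):
            continue
        term *= f[(x + sum(w * hi for w, hi in zip(omega, h))) % N]
    return term


def dual_function(f, d):
    """Df(x) = E( prod_{omega != 0} f(x + omega . h) : h in Z_N^d )."""
    N = len(f)
    return [
        sum(cube_product(f, x, h, True) for h in product(range(N), repeat=d)) / N**d
        for x in range(N)
    ]


def gowers_power(f, d):
    """||f||_{U^d}^(2^d) = E( prod_omega f(x + omega . h) : x in Z_N, h in Z_N^d )."""
    N = len(f)
    total = sum(
        cube_product(f, x, h, False)
        for x in range(N)
        for h in product(range(N), repeat=d)
    )
    return total / N ** (d + 1)


def inner(f, g):
    """Normalised inner product E( f(x) g(x) : x in Z_N )."""
    return Fraction(1, len(f)) * sum(a * b for a, b in zip(f, g))


f = [1, 2, 0, -1, 3]
lhs = inner(f, dual_function(f, 2))  # <f, Df> with d = k - 1 = 2
rhs = gowers_power(f, 2)             # ||f||_{U^2}^4
```

Multiplying $\mathcal{D}f(x)$ by $f(x)$ restores the missing $\omega = 0$ factor, which is why the two sides agree term by term.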
Chapter 4
Decomposition Theorem
The goal of this chapter is to prove the following.
Theorem 4.1 (Decomposition Theorem). Let $\nu$ be simply pseudorandom, and $\varepsilon$ some
parameter such that $1 > \varepsilon > 0$.
Suppose $N$ is sufficiently large, depending on $\varepsilon$. Then for every function
$0 \le f \le \nu$ we can decompose it as $f = g + h$ where $0 \le g \le 2$ and $h$ is
$\varepsilon$-uniform.
This is the final, and most crucial, part of the proof of the Relative Szemerédi theorem,
and hence of the entire Green-Tao theorem. It is presented as a decomposition, which allows
us to decompose $f$ into a bounded part (to which we may apply Szemerédi's theorem) and
a uniform part, whose contribution is negligible by the Generalised von Neumann theorem.
It is, however, better viewed as a transference theorem: it allows us to transfer properties
of the integers to pseudorandom subsets of the integers. In this case, the desired property is
that dense subsets contain arbitrarily long arithmetic progressions. The relationship between
decomposition theorems and transference theorems holds in quite general terms, and is
discussed in detail in [3].
4.1 The Green-Tao Proof
The original proof used by Green and Tao in [10] is quite different to the one presented below,
and relies on a finitary ergodic theory inspired by Furstenberg's proof of Szemerédi's theorem.
We briefly sketch their approach here before presenting the simpler proof by Gowers.
Their proof constructs the decomposition in stages. They begin by looking at the
decomposition $f = \mathbb{E}(f) + (f - \mathbb{E}(f))$. It follows from the pseudorandomness
of $\nu$ that $\mathbb{E}(f)$ is bounded, so the remaining problem is to show that
$f - \mathbb{E}(f)$ is sufficiently uniform.
Of course, there is no guarantee that it will be. Instead, they use the machinery of
conditional expectations over $\sigma$-algebras to increase the uniformity as follows. By using
dual functions as obstructions to uniformity, if $f - \mathbb{E}(f)$ is not sufficiently uniform,
sets can be added to create an expanded $\sigma$-algebra $\mathcal{B}$. These new sets are chosen
so that the conditional expectation $\mathbb{E}(f \mid \mathcal{B})$ absorbs the impact of the
dual functions which were obstructing the uniformity. In particular, the difference
$f - \mathbb{E}(f \mid \mathcal{B})$ lacks these obstructions, and is more uniform. Furthermore,
it follows from the pseudorandomness of $\nu$ and the fact that $f \le \nu$ that
$\mathbb{E}(f \mid \mathcal{B})$ remains bounded.
They continue in this fashion, keeping $\mathbb{E}(f \mid \mathcal{B})$ bounded at each stage,
while increasing the uniformity of $f - \mathbb{E}(f \mid \mathcal{B})$. There is no a priori
guarantee, however, that this process will terminate; that is, while the approximations are
becoming more uniform at each stage, they may never become sufficiently uniform.
Green and Tao show that this process must terminate using an energy increment argument
used in several approaches to Szemerédi's theorem. This argument uses the fact that at
each stage in their construction, the pseudorandomness of $\nu$ ensures that
$\mathbb{E}(f \mid \mathcal{B})$ remains bounded. The energy, that is, the $L^2$-norm, of
$\mathbb{E}(f \mid \mathcal{B})$ increases at each stage, but since it is bounded above, there
must be a stage at which the energy cannot increase, and hence no further approximations can
be made and the process must terminate.
If the process has terminated, however, it must mean that the approximation at that stage
was sufficiently uniform, and so the decomposition at this stage meets our requirements.
4.2 The Gowers-Hahn-Banach Proof
The simpler proof outlined in this section takes a very different approach. Rather than
constructing a decomposition explicitly, it uses the Hahn-Banach theorem to derive a
contradiction if no decomposition exists. This approach was independently discovered by Gowers
[3] and Reingold, Trevisan, Tulsiani and Vadhan [17].
The proof we give here follows the outline given in [3]. Some parts of the argument have
been simplified, since we do not require the generality given by Gowers, and the presentation
of the argument given here is new.
We begin by stating the version of the Hahn-Banach theorem$^1$ that we will use.
Theorem 4.2 (Hahn-Banach theorem). Let $K_1$ and $K_2$ be closed convex subsets of
$\mathbb{R}^N$, each containing $0$, and suppose that $f \in \mathbb{R}^N$ cannot be written as
a convex combination $c_1 f_1 + c_2 f_2$ with $f_i \in K_i$. Then there exists
$\phi \in \mathbb{R}^N$ such that $\langle f, \phi \rangle > 1$ and
$\langle g, \phi \rangle \le 1$ for every $g \in K_1 \cup K_2$.
With this theorem available to us, the strategy should be fairly obvious. Recall that we
need a decomposition $f = g + h$ where $g$ is bounded and $h$ is uniform. We suppose that
no such decomposition exists and use Theorem 4.2 to derive a contradiction. Roughly, this
will be as follows: $\langle f, \phi \rangle$ will be large, but $\langle \nu, \phi \rangle$ will
be small, contradicting the fact that $f \le \nu$. We hope to say that
$\langle \nu, \phi \rangle$ is small since it is the sum of $\langle 1, \phi \rangle$, which is
bounded, and $\langle \nu - 1, \phi \rangle$, which is $o(1)$ since $\nu - 1$ is uniform and
$\phi$ is anti-uniform. There are, however, significant technical difficulties to be overcome
before we can put this into action. First, we need the following simple consequence of
pseudorandomness.
Lemma 4.1 (Uniformity of $\nu - 1$). If $\nu : \mathbb{Z}_N \to \mathbb{R}$ is simply
$k$-pseudorandom, then
\[
\|\nu - 1\|_{U^{k-1}} = o(1).
\]
Proof sketch. Expand out the definitions and use the binomial theorem.
$^1$ This is quite different from the Hahn-Banach theorem as it is usually stated; for a
derivation of the version stated, see [3].
Now let us try to prove Theorem 4.1 using only this Lemma and the Hahn-Banach theorem.
In terms of the latter, we have two closed convex subsets of $\mathbb{R}^N$: positive functions
bounded by 2 and functions which are $\varepsilon$-uniform. If the decomposition does not hold,
then by Theorem 4.2 we can find some function $\phi$ such that
1. $\langle f, \phi \rangle > 1$,
2. $\langle g, \phi \rangle \le 1$ for every $g$ such that $0 \le g \le 2$, and
3. $\langle h, \phi \rangle \le 1$ for every $h$ such that $\|h\|_{U^{k-1}} \le \varepsilon$.
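In particular, by setting $g$ to be the function $g(x) = 2$ whenever $\phi(x) \ge 0$ and
$g(x) = 0$ otherwise, we can suppose that $\langle 1, \phi_+ \rangle \le \frac{1}{2}$, where
$\phi_+$ is the positive part of $\phi$ defined by $\phi_+(x) := \phi(x)$ when $\phi(x) \ge 0$
and $0$ otherwise. We have the following chain of inequalities:
\[
1 < \langle f, \phi \rangle \le \langle f, \phi_+ \rangle \le \langle \nu, \phi_+ \rangle
= \langle 1, \phi_+ \rangle + \langle \nu - 1, \phi_+ \rangle
\le \frac{1}{2} + \|\phi_+\|^*_{U^{k-1}}\, \|\nu - 1\|_{U^{k-1}}.
\]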
Using Lemma 4.1, to obtain a contradiction for $N$ sufficiently large it suffices to show that
$\phi_+$ is anti-uniform. The problem is that condition 3 only gives us a bound for
$\|\phi\|^*_{U^{k-1}}$, and this is not strong enough. The difficulty lies in passing from
$\phi$ to $\phi_+$, which is necessary since we can only deduce from $f \le \nu$ that
$\langle f, \phi \rangle \le \langle \nu, \phi \rangle$ if $\phi$ is non-negative; if some
stronger version of Theorem 4.2 were available that guaranteed $\phi \ge 0$ then the simple
argument given above would be sufficient. In particular, instead of simple pseudorandomness,
all we would need is the weaker condition $\|\nu - 1\|_{U^{k-1}} = o(1)$.
To fix this argument, we will show that $\phi_+$ can be approximated by a function that is
anti-uniform. This is technically messy, and we leave the details to Appendix A. It is in
proving this approximation, however, that the majority of the simple pseudorandomness
condition is required. It gives the following theorem.
Theorem 4.3 (Approximation with an anti-uniform function). Condition (3) above implies
that there exists a function $\psi$ such that $\|\phi_+ - \psi\|_\infty \le 1/8$ and
$\|\psi\|^*_{U^{k-1}} \le A$ for some $A$ depending only on $\varepsilon$.
Using this theorem, we may adapt the chain of inequalities above to use this approximation
to $\phi_+$, and obtain the inequality
\[
1 < \langle f, \phi \rangle \le \frac{3}{4} + o(1) + A\, \|\nu - 1\|_{U^{k-1}}.
\]
Since $A$ is fixed and $\|\nu - 1\|_{U^{k-1}}$ is $o(1)$, we have a contradiction for $N$ large
enough, which proves Theorem 4.1. Once again, the details are technical and left to Appendix A.
4.3 The Relative Szemeredi Theorem
We now have all we need to prove the main component of the Green-Tao theorem. The proof
below fills in the sketch given in [10], making some minor changes since our Decomposition
theorem is different to the form in which it is given there.
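Theorem 4.4 (Relative Szemerédi Theorem). Let $k \ge 3$ and $\delta > 0$, and let
$\nu : \mathbb{Z}_N \to \mathbb{R}^+$ be $k$-pseudorandom. Suppose
$f : \mathbb{Z}_N \to \mathbb{R}$ satisfies $0 \le f(x) \le \nu(x)$ for all
$x \in \mathbb{Z}_N$ and $\mathbb{E} f \ge \delta$. Then for all sufficiently large $N$,
\[
\Lambda_k(f) \ge \frac{c_{k,\delta/3}}{2},
\]
where $c_{k,\delta/3} > 0$ is the constant appearing in Theorem 2.2.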
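Proof. Let $0 < \varepsilon < 1$ be some parameter to be chosen later, and let $f = g + h$ be
the decomposition given by Theorem 4.1. Hence we have $0 \le g \le 2$ and $h$ is
$\varepsilon$-uniform. We would like to apply Szemerédi's theorem to the function $g$; however,
it is bounded above by 2 rather than 1, and its density is bounded below by a function of
$\varepsilon$, which we need to be independent of our constant $\varepsilon$ to be able to later
take it sufficiently small. Hence we instead consider the function
$(g + \varepsilon)/(2 + \varepsilon)$. We now have
\[
\mathbb{E}\Big( \frac{g + \varepsilon}{2 + \varepsilon} \Big)
= \frac{\mathbb{E}(f) - \mathbb{E}(h) + \varepsilon}{2 + \varepsilon}
\ge \frac{\delta}{2 + \varepsilon} > \frac{\delta}{3},
\]
since $|\mathbb{E}(h)| = \|h\|_{U^1} \le \|h\|_{U^{k-1}} \le \varepsilon$. Furthermore, we have
\[
0 \le \frac{g + \varepsilon}{2 + \varepsilon} \le 1.
\]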
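Hence for the function $(g + \varepsilon)/(2 + \varepsilon)$, the conditions of Szemerédi's
theorem, Theorem 2.2, are met and we may apply it to obtain the lower bound, for $N$
sufficiently large (depending only on $k$ and $\delta$),
\[
\Lambda_k(g + \varepsilon) \ge \Lambda_k\Big( \frac{g + \varepsilon}{2 + \varepsilon} \Big) \ge c_{k,\delta/3}
\]
for some constant $c$ dependent only on $k$ and $\delta$. Since $g \le 2$ and
$\varepsilon < 2$, putting our upper bounds into the definition of $\Lambda_k$ gives us
\[
\Lambda_k(f_0, \ldots, f_{k-1}) \le 2^{k-1} \varepsilon = O_k(\varepsilon)
\]
whenever $f_j = \varepsilon$ or $g$ for $0 \le j \le k - 1$, and at least one $f_i$ is equal to
$\varepsilon$. Hence
\[
\Lambda_k(g) \ge c_{k,\delta/3} - O_k(\varepsilon).
\]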
On the other hand, $\|h\|_{U^{k-1}} \le \varepsilon$, and $|g|, |h| \le f + g \le \nu + 2$.
Hence by applying Corollary 3.1 (and Remark 3.1) we see that
\[
\Lambda_k(f) = \Lambda_k(g) + O_k(\varepsilon) + o_k(1).
\]
In particular,
\[
\Lambda_k(f) \ge c_{k,\delta/3} - O_k(\varepsilon) - o_k(1).
\]
By taking $\varepsilon$ small enough (depending on $k$ and $\delta$) we can ensure that the
$O_k(\varepsilon)$ term is less than $c_{k,\delta/3}/4$, and by taking $N$ sufficiently large,
we can also ensure that the $o_k(1)$ term is less than $c_{k,\delta/3}/4$. Hence, for $N$
sufficiently large,
\[
\Lambda_k(f) \ge \frac{c_{k,\delta/3}}{2}
\]
as required.
Weak Pseudorandomness
This section outlines a slightly different approach, the existence of which is hinted at by a
remark in [10].
If we can assume that $\delta$ is dependent on $k$, then we can prove a Relative Szemerédi
theorem from conditions which are strictly weaker than the pseudorandomness conditions
we have been using so far. This will be important in our application to the primes, when we
shall only be able to prove these weaker conditions.
Let $\eta_k$ be some sufficiently small constant depending only on $k$. Then we define weak
linear pseudorandomness and weak simple pseudorandomness by changing the asymptotics
and upper bounds required by adding an $O(\eta_k)$ factor. This affects our argument as follows.
The Generalised von Neumann theorem is altered by a factor of $O_k(\eta_k)$, using an almost
identical proof.
The Decomposition theorem requires the condition that $\varepsilon$ is sufficiently small
depending on $\eta_k$. This is because a factor of $\|\nu - 1\|_{U^{k-1}}$ is no longer $o(1)$,
but is instead $O(\eta_k) + o(1)$.
By using these modified theorems in the proof of our Relative Szemerédi theorem above,
we arrive at a lower bound of the form
\[
\Lambda_k(f) \ge c_{k,\delta/3} - O_k(\varepsilon) - O_k(\eta_k) + o(1).
\]
If $\delta$ is dependent on $k$, then as long as we take $\eta_k$ sufficiently small, we may
still arrive at the required lower bound. This gives us the following alternative Relative
Szemerédi theorem, which we shall use in our application to the primes.
Theorem 4.5 (Alternative Relative Szemerédi Theorem). Let $k \ge 1$ and let $\nu$ be a weak
$k$-pseudorandom function. Suppose $f : \mathbb{Z}_N \to \mathbb{R}$ satisfies
$0 \le f(x) \le \nu(x)$ for all $x \in \mathbb{Z}_N$ and $\mathbb{E} f \ge \frac{1}{10k}$
(say). Then for all sufficiently large $N$,
\[
\Lambda_k(f) \ge \frac{c_k}{2},
\]
where $c_k = c_{k,1/30k} > 0$ is the constant appearing in Theorem 2.2.
Chapter 5
Progressions in the Primes
In this chapter we apply the Relative Szemerédi theorem to the problem of locating arithmetic
progressions within the prime numbers. The exposition here is original, and presents a
combined and simplified version of ideas used in [10] and [7].
To apply Theorem 4.5 to the primes, we require two things:
1. A suitable function $f : \mathbb{Z}_N \to \mathbb{R}^+$ which counts only primes and with
$\mathbb{E}(f) \ge \delta_k$ for some $\delta_k > 0$ dependent only on $k$, and
2. A (weakly) $k$-pseudorandom function $\nu$ such that $f \le \nu$.
5.1 Counting the Primes and the W-trick
The first obvious candidate for a function which counts only primes is the prime indicator
function, $1_P(n)$, which is defined to be 1 if $n$ is prime and 0 otherwise. This suffers from
the same difficulty which prevented us from applying the regular Szemerédi theorem to the
primes: the primes do not have positive density, since it follows from the prime number
theorem that $\mathbb{E}(1_P) = O\big(\frac{1}{\log N}\big) = o(1)$.
There is, however, a standard method of avoiding this issue: instead of counting the
primes with weight 1, we count them with weight $\log p$, using the von Mangoldt function$^1$
\[
\Lambda(n) := \begin{cases} \log n & \text{if } n \text{ is prime, and} \\ 0 & \text{otherwise.} \end{cases}
\]
It is a simple consequence of the prime number theorem that $\mathbb{E}(\Lambda) = 1 + o(1)$,
and hence is certainly positive for sufficiently large $N$. Crucially, however, it is not bounded
above; this is the reason that we require the Relative Szemerédi theorem.
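The contrast between the two weightings is easy to see numerically. The sketch below compares the average of the prime indicator with the average of the prime-only von Mangoldt function over $1, \ldots, N$; the helper names are illustrative, and averaging over $[1, N]$ rather than $\mathbb{Z}_N$ is an assumption that does not affect the comparison.

```python
import math


def prime_sieve(limit):
    """Sieve of Eratosthenes: is_prime[n] is True exactly when n is prime."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime


def densities(N):
    """Return (E(1_P), E(Lambda)) with both averages taken over 1, ..., N."""
    is_prime = prime_sieve(N)
    indicator = sum(is_prime[1:]) / N
    weighted = sum(math.log(n) for n in range(2, N + 1) if is_prime[n]) / N
    return indicator, weighted


indicator_density, weighted_density = densities(100_000)
```

At $N = 10^5$ the indicator density is already below $0.1$ and keeps shrinking like $1/\log N$, while the weighted density stays close to 1.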
We may hope to use $\Lambda$ to count the primes, but there are two complications, the first
easy to deal with and the second more troublesome.
$^1$ Normally the von Mangoldt function is also taken to count prime powers, and the
contribution from these is shown to be negligible. For this application, however, including the
prime powers would not make the argument any easier, so we have just excluded them from the
definition.
The first is the wraparound problem noted in Chapter 2: since we are working in
$\mathbb{Z}_N$, we need to ensure that our arithmetic progressions are still progressions within
$[1, N]$. For this we need $a + (k - 1)b \le N$, which can be forced by only counting arithmetic
progressions in $\mathbb{Z}_N$ with $a, b \le N/k$. Again, for this to hold it is sufficient if
the progression is contained entirely within $[\varepsilon_k N, 2\varepsilon_k N]$ for any
$\varepsilon_k < 1/k$. To ensure this, we shall modify $\Lambda$ to be zero on $n$ outside this
interval.
The second problem is that the primes do not behave as randomly as we need them to.
Recall that we need to show that the primes are inside some pseudorandom set with positive
density. Pseudorandom sets behave similarly to random sets; in particular, they should
have uniform distribution across all congruence classes as $N$ tends to infinity. Since the
primes have positive density within this set, they should also reflect this, and be roughly
uniformly distributed across all congruence classes.
The primes are obviously not distributed this uniformly, however; for instance, there is
only one prime congruent to 0 modulo 2. It appears then that our primes in fact do not sit
inside a pseudorandom set with the required positive density. To avoid this difficulty, Green
and Tao introduce the W-trick and restrict their attention to primes in certain restricted
congruence classes.
The key observation is that how pseudorandom we need our set to be depends only
on $k$, which is fixed. Hence the majorant $\nu$ need not be fully random, but only random
enough dependent on $k$. In particular, it does not need to be uniformly distributed across
all congruence classes, but only those small enough to be detected by the linear forms in the
definition of $k$-pseudorandom. Hence the primes which are inside this majorant should be
roughly uniformly distributed in these small congruence classes.
If we only look at integers belonging to one of these small congruence classes, both in the
majorant and in the primes, this avoids the difficulty. The primes do not need to be
uniformly distributed across all small congruence classes, since we know that our pseudorandom
majorant only includes one of them.
To be precise, let $w$ be some sufficiently large integer, depending only on $k$, and then
define
\[
W := \prod_{p \le w} p
\]
to be the product of all primes below $w$. We restrict ourselves to looking at the congruence
class $n \equiv 1$ modulo $W$.$^2$ Hence we look at $\Lambda$ not over the interval
$1, \ldots, N$, but rather on the set $W + 1, 2W + 1, \ldots, NW + 1$, and set it to be zero
everywhere else.
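A small experiment illustrates what the W-trick buys. The sketch below computes $W$ for a given $w$ and compares how the primes themselves, and the indices $n$ with $Wn + 1$ prime, fall into residue classes modulo a small $q$. The function names and the choices $w = 5$, $q \in \{2, 5\}$ and the search range are illustrative assumptions, not values from the text.

```python
import math


def primes_up_to(limit):
    """Return the list of primes <= limit using a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def w_trick_modulus(w):
    """W = product of all primes p <= w."""
    return math.prod(primes_up_to(w))


def prime_residue_counts(limit, q):
    """Count primes p <= limit in each residue class p mod q."""
    counts = [0] * q
    for p in primes_up_to(limit):
        counts[p % q] += 1
    return counts


def shifted_residue_counts(w, n_max, q):
    """Count n in [1, n_max] with W*n + 1 prime, grouped by n mod q."""
    W = w_trick_modulus(w)
    prime_set = set(primes_up_to(W * n_max + 1))
    counts = [0] * q
    for n in range(1, n_max + 1):
        if W * n + 1 in prime_set:
            counts[n % q] += 1
    return counts


W = w_trick_modulus(5)                      # 2 * 3 * 5
raw = prime_residue_counts(60_001, 2)        # primes mod 2: class 0 holds only p = 2
shifted = shifted_residue_counts(5, 2000, 5)  # n mod 5 for primes of the form 30n + 1
```

The raw primes essentially miss the class $0$ modulo $2$, whereas every class of $n$ modulo $5$ is populated once we pass to primes of the form $Wn + 1$.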
Restricting ourselves in this way reduces the primes we are counting by a factor polynomial
in $W$, and hence our eventual lower bound for the count of progressions in the primes will
be off by some polynomial factor in $W$. Since $W$ is dependent only on $k$, however, we may
incorporate this factor into our constant $c_k$.
In technical terms, we use the W-trick in verifying pseudorandomness to be able to deduce
that for $p > w$ the linear forms we are estimating over are independent over $\mathbb{Z}_p$ as
well as over $\mathbb{Q}$. This enables us to bound certain local factor estimates with a
crucial $p^{-2}$, and we can obtain the estimates we require by taking $w$ sufficiently large.
For more details, see Appendix B.
$^2$ We may in fact choose any $b$ coprime to $W$, and choosing this using the pigeonhole
principle avoids the need for the prime number theorem. We follow Green and Tao in choosing 1
for simplicity here.
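Including both the wraparound factor and the W-trick in the definition of $\Lambda$ gives the
required prime counting function.
Definition 5.1 (Prime Counting Function).
\[
\tilde{\Lambda}(n) := \begin{cases}
\dfrac{\phi(W)\, \alpha_k}{2W} \log(Wn + 1) & \text{if } n \in [\varepsilon_k N, 2\varepsilon_k N] \text{ and } Wn + 1 \text{ is prime, and} \\
0 & \text{otherwise,}
\end{cases}
\]
for some $\varepsilon_k$ and $\alpha_k$ sufficiently small depending only on $k$.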
The factor $\phi(W)/W$ here is required largely to provide a lower bound for the density
which is independent of $W$. This is to avoid circularity, since how large we need to take $w$
partially depends on the density of $\tilde{\Lambda}$. The factor of $\alpha_k/2$ is to ensure
that it is majorized by the pseudorandom function we shall construct in the next section.
To show that $\tilde{\Lambda}$ is suitable for use in the Relative Szemerédi theorem, it remains
to show that its density is bounded below. We require the following classical theorem of
analytic number theory.
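Theorem 5.1 (Prime Number Theorem for Arithmetic Progressions). For coprime $a$ and $b$, as
$x \to \infty$,
\[
\sum_{\substack{p \le x \\ p \equiv a \ (\mathrm{mod}\ b)}} \log p = \frac{x}{\phi(b)} (1 + o(1)).
\]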
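The asymptotic can be sanity-checked at a modest height. The sketch below compares the weighted prime sum in a residue class with the main term $x/\phi(b)$; the function names and the choice $x = 10^5$, $a = 1$, $b = 30$ are illustrative, and a finite computation of this kind says nothing about the size of the $o(1)$ term in general.

```python
import math


def euler_phi(b):
    """Euler's totient: the number of 1 <= m <= b coprime to b."""
    return sum(1 for m in range(1, b + 1) if math.gcd(m, b) == 1)


def theta_in_class(x, a, b):
    """Sum of log p over primes p <= x with p congruent to a modulo b."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(x) + 1):
        if is_prime[p]:
            for multiple in range(p * p, x + 1, p):
                is_prime[multiple] = False
    return sum(math.log(p) for p in range(2, x + 1) if is_prime[p] and p % b == a % b)


x, a, b = 100_000, 1, 30
ratio = theta_in_class(x, a, b) / (x / euler_phi(b))
```

A `ratio` near 1 is what the theorem predicts; with $b = W$ and $a = 1$ this is the form in which the estimate is applied below.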
We use this theorem to show that our prime counting function $\tilde{\Lambda}$ satisfies the
density requirement:
Lemma 5.1. For sufficiently large $N$,
\[
\mathbb{E}(\tilde{\Lambda}(x) : x \in \mathbb{Z}_N) \ge \frac{\varepsilon_k \alpha_k}{4}.
\]
Remark 5.1. The constants $\varepsilon_k$ and $\alpha_k$ are those in the definition of the
prime counting function, above, and the pseudorandom majorant, below. Exact values could be
computed, but the important thing is that they are dependent only on $k$, and hence so is the
lower bound for the density given in this lemma.
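Proof. We simply expand out the expectation notation and apply Theorem 5.1 as follows:
\begin{align*}
\mathbb{E}(\tilde{\Lambda}(x) : x \in \mathbb{Z}_N)
&= \frac{\phi(W)\, \alpha_k}{2WN} \sum_{\substack{\varepsilon_k N \le n \le 2\varepsilon_k N \\ Wn + 1 \text{ is prime}}} \log(Wn + 1) \\
&= \frac{\phi(W)\, \alpha_k}{2WN} \sum_{\substack{\varepsilon_k WN \le p \le 2\varepsilon_k WN \\ p \equiv 1 \ (\mathrm{mod}\ W)}} \log p + o(1) \\
&= \frac{\phi(W)\, \alpha_k}{2WN} \Big( \frac{2\varepsilon_k WN}{\phi(W)} - \frac{\varepsilon_k WN}{\phi(W)} \Big) (1 + o(1)) \\
&= \frac{\varepsilon_k \alpha_k}{2} (1 + o(1)).
\end{align*}
In particular, for sufficiently large $N$, we can assume that $1 + o(1) \ge \frac{1}{2}$, which
gives us the result.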
5.2 Pseudorandom Majorant
Now let us consider the construction of the pseudorandom majorant for $\tilde{\Lambda}$. Again,
a natural candidate is just $\tilde{\Lambda}$ itself. The problem is that verifying
pseudorandomness for $\tilde{\Lambda}$ is extremely difficult: the linear forms condition, for
instance, is comparable in difficulty to the prime tuples Conjecture 1.2, and hence harder than
both the Twin Primes Conjecture and the Goldbach Conjecture.
Instead, we use an idea from sieve theory. Traditional sieve theory methods have proven
extremely successful in obtaining estimates and asymptotics for the almost-primes (numbers
with few prime divisors), but its methods cannot be refined to include the primes themselves.
Fortunately, however, all we require now is a majorant for the weighted primes, and the
weighted almost-primes will do the job.
Recall that we counted the weighted primes using a restricted form of the $\Lambda$ function.
Hence, we first consider the elementary identity
\[
\Lambda(n) = \sum_{d \mid n} \mu(d) \log\Big( \frac{n}{d} \Big).
\]
We need to adjust this to also count numbers with few prime factors. Note that if $n$ has
many prime factors, then most of its divisors will be small in relation to $n$. In particular,
by truncating the sum in the identity above, only summing over divisors less than some
parameter $R$, we obtain a function that is approximately $\Lambda$ for numbers with many prime
factors. Hence this modified function only differs from $\Lambda$ on the almost-primes.
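The elementary identity can be verified directly for small $n$. One caveat, which the sketch makes explicit: the divisor sum reproduces the classical von Mangoldt function, which equals $\log p$ at every prime power $p^a$, whereas the definition adopted in this chapter keeps only the primes. The function names are illustrative.

```python
import math


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:               # one remaining prime factor
        result = -result
    return result


def divisor_sum(n):
    """Right-hand side of the identity: sum over d | n of mu(d) * log(n / d)."""
    return sum(mobius(d) * math.log(n / d) for d in range(1, n + 1) if n % d == 0)


def classical_von_mangoldt(n):
    """log p if n is a power of the prime p (exponent >= 1), and 0 otherwise."""
    for p in range(2, n + 1):
        if n % p == 0:      # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0


max_error = max(
    abs(divisor_sum(n) - classical_von_mangoldt(n)) for n in range(1, 200)
)
```

Here `max_error` is at the level of floating-point rounding, confirming the identity for $n < 200$.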
Motivated by this, we introduce the truncated divisor sum
\[
\Lambda_R(n) := \sum_{\substack{d \mid n \\ d \le R}} \mu(d) \log\Big( \frac{R}{d} \Big)
= \sum_{d \mid n} \mu(d) \log_+\Big( \frac{R}{d} \Big).
\]
If $p$ is a prime sufficiently large with respect to $R$ then the only term counted in the sum
above will be $d = 1$, and so $\Lambda_R(p) = \log R$. Of course, $\Lambda_R$ will count more
than just the primes but, as noted above, it will effectively only count almost-primes. Hence
$\Lambda_R$ can be viewed as a weighted indicator function for the almost-primes, and so just as
we used $\Lambda$ modified by the W-trick to count the primes, we shall use $\Lambda_R$ modified
by the W-trick as a pseudorandom majorant. One small obstacle to overcome is that we require our
pseudorandom majorant to be positive, whereas $\Lambda_R$ can take on negative values. We
circumvent this by the simple trick of squaring the function to guarantee positive values.
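Two features of the truncated divisor sum are visible on small inputs: a prime larger than $R$ only sees the divisor $d = 1$, and composite inputs can produce negative values. The sketch below assumes an integer cut-off $R$; the function names and the sample values $R = 10$, $R = 6$ are illustrative.

```python
import math


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def truncated_divisor_sum(n, R):
    """Lambda_R(n) = sum over d | n with d <= R of mu(d) * log(R / d), for integer R."""
    return sum(
        mobius(d) * math.log(R / d)
        for d in range(1, min(n, R) + 1)
        if n % d == 0
    )


at_large_prime = truncated_divisor_sum(101, 10)  # only d = 1 contributes
at_composite = truncated_divisor_sum(30, 6)      # divisors 1, 2, 3, 5, 6 contribute
```

For the prime $101 > R = 10$ the value is $\log 10$, while $\Lambda_6(30) = \log(5/6) < 0$, which is why the majorant is built from the square.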
By including the almost primes, obtaining estimates for $\Lambda_R$ is significantly easier than
for $\Lambda$, and ideas from sieve theory can be applied. Green and Tao were fortunate here, in
that Goldston and Yıldırım had already considered the truncated divisor sum in their work on
small gaps between primes [2], and had effectively proven the linear forms estimate required.
It was this proof which was used in the original paper [10].
A significant simplification for this part of the proof was discovered later by Green and
Tao, and is outlined in [7] and [20]. In the original proof, to be able to provide the required
estimates for sums of $\Lambda_R^2$, the $\log_+(R/d)$ factor in the definition of $\Lambda_R$
was replaced by a contour integral using the identity
\[
\log_+ x = \frac{1}{2\pi i} \int_\Gamma \frac{x^z}{z^2} \, dz
\]
for a vertical line $\Gamma$. This enabled them to replace the sum with a contour integral
involving the Riemann zeta function $\zeta$, to which classical information about the zeta
function could be applied. In particular, they required the existence of a certain region to the
left of the line $\Re(z) = 1$ in which $\zeta$ is free of zeroes.
The new idea was as follows. They first replace the $\log_+(R/d)$ factor with a smooth
approximation,
\[
\chi\Big( \frac{\log d}{\log R} \Big),
\]
where $\chi$ is some smooth, bounded function with compact support. Note that $\log_+(R/d)$
corresponds to taking $\chi(x) = \log R\,(1 - x)$ for $x \in (0, 1)$ and $\chi(x) = 0$ otherwise.
Then, instead of replacing that with a contour integral, they could replace it with its Fourier
transform. The crucial fact here was that they could truncate the integral and consider it
over a bounded interval at the cost of $o(1)$ errors since (as $\chi$ is smooth) the Fourier
transform decays very rapidly.
In the remainder of the argument then, instead of having to estimate functions involving
$\zeta(z)$ with $|z|$ unbounded, they had only to consider $z$ such that $z = 1 + o(1)$. In this
case, the only fact about $\zeta$ required is the existence of a simple pole at $z = 1$. These
ideas are explained in detail in Appendix B.
To be able to use this simpler argument, we define a modified form of the truncated
divisor sum as follows.
Definition 5.2 (Modified Truncated Divisor Sum).
\[
\tilde{\Lambda}_R(n) := \sum_{d \mid n} \mu(d)\, \chi\Big( \frac{\log d}{\log R} \Big),
\]
where $\chi$ is some smooth, bounded function with compact support such that $\chi(0) = 1$,
$\chi(x) = 0$ for $|x| \ge 1$ and
\[
\int_0^1 |\chi'(x)|^2 \, dx = 1.
\]
These conditions on $\chi$ are required for technical reasons, but $\chi(\log d / \log R)$
should be viewed as a smooth approximation to $\log_+(R/d)$ (after normalising by $\log R$), and
so $\tilde{\Lambda}_R$ is a smooth approximation to $\Lambda_R$.
We require one final adjustment before the definition of the pseudorandom majorant. In
estimating the linear forms and correlation expectations required to demonstrate
pseudorandomness, we get the hoped-for $1 + o(1)$ term, multiplied by a factor of
$\frac{W}{\phi(W) \log R}$. To compensate for this, we scale our pseudorandom majorant to remove
this factor. Hence we arrive at our final definition as follows.
Definition 5.3 (Pseudorandom Majorant for the Primes). We define
$\nu : \mathbb{Z}_N \to \mathbb{R}^+$ by
\[
\nu(n) := \begin{cases}
\dfrac{\phi(W) \log R}{W}\, \tilde{\Lambda}_R(Wn + 1)^2 & \text{if } n \in [\varepsilon_k N, 2\varepsilon_k N], \text{ and} \\
1 & \text{otherwise,}
\end{cases}
\]
for some sufficiently small $\varepsilon_k$ depending only on $k$, and where $R := N^{\alpha_k}$
for some sufficiently small $\alpha_k$ depending only on $k$.
Remark 5.2. This differs from the $\nu$ used in [10] by using the smoothed out
$\tilde{\Lambda}_R$ for reasons outlined above.
As outlined in the previous section, the W-trick is required so $\nu$ only considers numbers
congruent to 1 modulo $W$. The two-part definition of $\nu$ is needed due to difficulties in
passing between $[1, N]$ and $\mathbb{Z}_N$. Namely, we prove correlation estimates for
$\tilde{\Lambda}_R$ over the interval $[1, N]$, and to be able to apply these to the function
$\nu$ (which is instead defined over $\mathbb{Z}_N$), we must truncate to some small interval.
The $\alpha_k$ determines how small the cut-off parameter $R$ is, and hence how "almost" the
almost-primes we count are.
We first verify that it is a majorant for our prime counting function $\tilde{\Lambda}$. The
factor of $\frac{\phi(W)\, \alpha_k}{2W}$ in the definition of $\tilde{\Lambda}$ is required
partially to make this proof work.
Lemma 5.2. For $N$ sufficiently large depending on $k$, $\nu(n) \ge \tilde{\Lambda}(n)$ for all
$n \in \mathbb{Z}_N$.
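Proof. Since we have squared the $\tilde{\Lambda}_R$ in the definition of $\nu$, it is clear
that $\nu(n) \ge 0$ for all $n \in \mathbb{Z}_N$. The claim follows immediately unless $Wn + 1$
is a prime and $n \in [\varepsilon_k N, 2\varepsilon_k N]$ (for otherwise
$\tilde{\Lambda}(n) = 0$), so let us suppose we are in this case.
By taking $N$ sufficiently large we may also suppose that
$Wn + 1 \ge \varepsilon_k WN + 1 > N^{\alpha_k} = R$, since $W$ is dependent only on $k$.
Therefore $\tilde{\Lambda}_R(Wn + 1) = 1$, and so
\[
\nu(n) = \frac{\phi(W)}{W} \log R = \frac{\phi(W)\, \alpha_k}{W} \log N
\ge \frac{\phi(W)\, \alpha_k}{W} \log\Big( \frac{n}{2\varepsilon_k} \Big).
\]
For $n$ sufficiently large (which we may force by taking $N$ sufficiently large)
\[
\frac{n}{2\varepsilon_k} \ge \sqrt{Wn + 1},
\]
and so
\[
\nu(n) \ge \frac{\phi(W)\, \alpha_k}{W} \log\big( \sqrt{Wn + 1} \big) = \tilde{\Lambda}(n).
\]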
All that remains now is to verify that $\nu$ is weakly $k$-pseudorandom, for which we must
provide an asymptotic and an upper bound for $\nu$ taken over linear forms. The proofs of
these are long and technical, so have been postponed to Appendix B. The arguments there
are a simpler case of those given in [7], though we have made some simplifications since we
do not require the generality proven there.
Theorem 5.2. $\nu$ is weakly $k$-pseudorandom.
Proof. See Appendix B.
Remark 5.3. We only obtain the weak pseudorandomness condition discussed at the end of Chapter 4, which is sufficient for these purposes since Lemma 5.1 gives a lower bound for the density depending only on $k$. To obtain the strong linearly pseudorandom condition we must take $w$ as an increasing function of $N$ (as is done in [10]), which would prevent us from obtaining the uniform lower bound in Theorem 5.3.
We have constructed a (weakly) pseudorandom majorant for our prime counting function $\tilde{\Lambda}$, and can apply the Relative Szemerédi theorem to finally complete the proof of the Green-Tao theorem.
5.3 The Green-Tao Theorem
We now have all that we need to prove the main result, that the primes contain infinitely many arbitrarily long arithmetic progressions. In fact, we can deduce a stronger result, giving an explicit lower bound. The proof below follows the outline given in [10], making it explicit how to obtain the lower bound they mention as a remark.
Theorem 5.3 (The Green-Tao Theorem). For any $k \geq 3$ there exists some constant $c'_k > 0$ depending only on $k$ such that, for $N$ sufficiently large,
\[ P(k, N) \geq c'_k \frac{N^2}{\log^k N}. \]
Proof. Choose M such that N = 2
k
WM + 1|. By Lemma 5.1, for suciently large N,
E
nZ
M
(

)

k
4
k
.
Furthermore, by Lemma 5.2 and Theorem 5.2, 0

where is weakly k-pseudorandom.
Hence by Theorem 4.5, the alternative Relative Szemeredi Theorem, there exists some con-
stant c
k
> 0 such that, for suciently large M,

k
(

)
c
k
2
.
If we dene 1
P
(n) = 1 if Wn + 1 is prime and n [
k
M, 2
k
M] and 0 otherwise, then

(n) =
(W)
2
k
W
log(Wn + 1)1
P
(n)
(W)
2
k
W
log N1
P
(n).
Let $a, a+b, \dots, a+(k-1)b$ be an arithmetic progression of such $n$. Since $a+ib \in [\epsilon_k M, 2\epsilon_k M]$ for all $0 \leq i \leq k-1$, and since we may take $\epsilon_k < 1/k$, this will be an arithmetic progression in $[1, M]$, not just $\mathbb{Z}_M$. Furthermore, $Wa+1, Wa+1+Wb, \dots, Wa+1+(k-1)Wb$ will be an arithmetic progression of primes in $[1, N]$, thanks to our initial choice of $M$.
Furthermore, since the degenerate case $b = 0$ can contribute at most $\frac{1}{M}$ to the expectation, we see that, for $N$ sufficiently large,
P(k, N) M
2
_

k
(1
P
)
1
M
_
M
2
_
2
k
W
(W) log N
_
k
c
k
4

_
N
4
k
W
_
2
log
k
N
_
4W
(W)
_
k
c
k
4
c

k
N
2
log
k
N
for some constant $c'_k > 0$, as long as we take $W$ sufficiently large depending on $k$.
Corollary 5.1. For any $k$, the primes contain infinitely many $k$-term arithmetic progressions.
It is crucial in the above proof that $W$ was not dependent on $N$. If we take $W$ as some increasing function of $N$, as Green and Tao do, then although it makes certain parts of the argument simpler (we do not need to appeal to weak pseudorandomness), we cannot get a lower bound of the form we have stated. It is, however, strong enough to deduce this corollary.
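The growth rate in Theorem 5.3 can be explored numerically for small parameters. The following is a minimal sketch, assuming that P(k, N) counts pairs (a, b) with b >= 1 such that a, a+b, ..., a+(k-1)b are all primes in [1, N]; the function names are illustrative and not taken from the text.

```python
from math import log

def prime_sieve(limit):
    """Return a list where entry n is True exactly when n is prime (0 <= n <= limit, limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return is_prime

def count_prime_progressions(k, n_max):
    """Count k-term progressions of primes in [1, n_max] with positive common difference."""
    is_prime = prime_sieve(n_max)
    count = 0
    for a in range(2, n_max + 1):
        if not is_prime[a]:
            continue
        b = 1
        while a + (k - 1) * b <= n_max:
            # a is prime already; check the remaining k-1 terms.
            if all(is_prime[a + i * b] for i in range(1, k)):
                count += 1
            b += 1
    return count

if __name__ == "__main__":
    # Compare the count with the scale N^2 / log^k N appearing in Theorem 5.3.
    for n_max in (500, 1000, 2000):
        count = count_prime_progressions(3, n_max)
        print(n_max, count, count / (n_max ** 2 / log(n_max) ** 3))
```

Brute force of this kind only illustrates the statement for tiny N; it says nothing about the constant c'_k produced by the proof.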
Chapter 6
Further Results
6.1 Extensions of the Green-Tao Theorem
Although the conclusion of Theorem 5.3 was stated for primes, we may in fact use an almost identical argument for subsets with positive density within the primes (since if $A \subset B$ with positive density and $B \subset C$ with positive density then $A \subset C$ with positive density). In particular, since every prime congruent to 1 modulo 4 is the sum of two squares, Green and Tao obtain the following previously unknown result:
Theorem 6.1. There are arbitrarily long arithmetic progressions where every term is the sum of two squares.
A natural question to ask is whether the same methods can prove the stronger result that the primes contain arbitrarily long polynomial progressions. This was shown by Tao and Ziegler by similar methods in 2008, appealing this time to a Polynomial Szemerédi theorem proven by Bergelson and Leibman in 1996. They show the following:
Theorem 6.2 (Tao-Ziegler [22]). Given any $k$ polynomials with integer coefficients $P_1, \dots, P_k$ such that $P_1(0) = \cdots = P_k(0) = 0$, the primes contain infinitely many progressions of the form
\[ x + P_1(m), \dots, x + P_k(m) \quad \text{with } m > 0. \]
Generalising the other way, we can talk about prime elements in any ring, so it is natural to ask whether Green-Tao results can be obtained in these alternative settings. Tao has extended the method to deal with the Gaussian primes (those in the ring $\mathbb{Z}[i]$):
Theorem 6.3 (Tao [21]). The Gaussian primes contain infinitely many instances of any constellation, that is, sets of the form
\[ a + v_0 b, \dots, a + v_{k-1} b \text{ all prime, with } a \in \mathbb{Z}[i] \text{ and } b \in \mathbb{Z} \]
for any fixed $k$ and Gaussian integers $v_0, \dots, v_{k-1}$.
Hoàng Le has also proven a version for the ring of polynomials over a finite field:
Theorem 6.4 (Hoàng Le [16]). Let $\mathbb{F}_q$ be the finite field with $q$ elements. Then for any $k$ we may find polynomials $f, g \in \mathbb{F}_q[t]$ with $g \neq 0$ such that for any polynomial $P \in \mathbb{F}_q[t]$ with degree less than $k$, the polynomial
\[ f + Pg \]
is irreducible.
For all these extensions, the previous remarks apply so that they are also valid for sets of positive density within the primes.
6.2 Asymptotics for P(k, N)
Recent advances by Green, Tao and Ziegler appear to have established the conjectured asymptotic for all $k$. We will now briefly outline the status of these results.
In [7] they conditionally established the Hardy-Littlewood prime tuples conjecture, Conjecture 1.2, for linear forms which are not rational multiples of each other. This result was conditional on two partially resolved conjectures: the inverse Gowers-norm conjecture, GI($s$), and the Möbius-nilsequence conjecture, MN($s$). In particular, they prove the following:
Theorem 6.5 (Green-Tao, conditional). If the MN($k-2$) and GI($k-2$) conjectures hold, then
\[ P(k, N) \sim C_k \frac{N^2}{\log^k N} \]
where the constant is defined by
\[ C_k := \prod_{p<k} \frac{p^{k-2}}{(p-1)^{k-1}} \prod_{p \geq k} \frac{p^{k-2}(p-k+1)}{(p-1)^{k-1}}. \]
The GI($k-2$) conjecture roughly says that if a bounded function has large Gowers $U^{k-1}$ norm then it correlates with a structured type of function known as a $(k-2)$-step nilsequence. Hence if a function is not Gowers uniform then we can deduce a lot about its structure.
The MN($k-2$) conjecture roughly says that the $(k-2)$-step nilsequence obtained from this does not correlate well with the Möbius function $\mu$.
Putting these together, the idea behind the proof (ignoring significant technical difficulties) is similar to the one used in the Green-Tao theorem: we measure the primes using some variant on the von Mangoldt function, call it $\Lambda$. If the Gowers norm of $\Lambda - 1$ is large, then by the inverse Gowers-norm conjecture it correlates well with a nilsequence. Due to the close connection between the Möbius function and $\Lambda$, however, this leads to a contradiction to the Möbius-nilsequence conjecture. Hence we can decompose $\Lambda$ into a bounded part and a uniform part, show that the uniform part is negligible and use some Szemerédi-type theorem to give an asymptotic for the bounded part.
Hence to give asymptotics for arithmetic progressions within the primes it suffices to prove these conjectures. GI(1) and MN(1) are classical and can be deduced from the circle methods used by Hardy, Littlewood and Vinogradov. Green and Tao have proven GI(2) in [9] and MN(2) in [11], giving the following unconditional asymptotics for the cases $k = 3, 4$.
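Theorem 6.6 (Green-Tao).
\[ P(3, N) \sim C_3 \frac{N^2}{\log^3 N}, \quad \text{and} \quad P(4, N) \sim C_4 \frac{N^2}{\log^4 N}; \]
where the constants are defined by
\[ C_3 := 2 \prod_{p \geq 3} \frac{p(p-2)}{(p-1)^2} \approx 1.3203, \quad \text{and} \quad C_4 := \frac{3}{4} \prod_{p \geq 5} \frac{p^2(p-3)}{(p-1)^3} \approx 0.4764. \]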
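The numerical values quoted for $C_3$ and $C_4$ can be reproduced by truncating the Euler products. The sketch below assumes a cut-off of $10^5$ for the primes, which is an arbitrary choice; the omitted factors are of the form $1 - O(1/p^2)$, so the truncation error is small.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a simple sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n in range(limit + 1) if is_prime[n]]

def truncated_constants(limit=100_000):
    """Approximate C_3 and C_4 from Theorem 6.6 using primes up to `limit`."""
    c3 = 2.0   # prefactor of the product over p >= 3
    c4 = 0.75  # prefactor 3/4 of the product over p >= 5
    for p in primes_up_to(limit):
        if p >= 3:
            c3 *= p * (p - 2) / (p - 1) ** 2
        if p >= 5:
            c4 *= p ** 2 * (p - 3) / (p - 1) ** 3
    return c3, c4

if __name__ == "__main__":
    print(truncated_constants())
```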
The Möbius-nilsequence conjecture was established in all cases in [8], hence Theorem 6.5 is only dependent on the inverse Gowers-norm conjecture. Green, Tao and Ziegler have further established the GI(3) conjecture in [12], so we have the following.
Theorem 6.7 (Green-Tao-Ziegler).
\[ P(5, N) \sim C_5 \frac{N^2}{\log^5 N}, \]
where the constant is
\[ C_5 := \frac{27}{16} \prod_{p \geq 5} \frac{p^3(p-4)}{(p-1)^4} \approx 0.5189. \]
Recently, Green, Tao and Ziegler have extended the strategy employed in [12] to all cases in [13], finally removing the conditional status of Theorem 6.5 and giving an asymptotic for prime progressions of any length.
6.3 Explicit Bounds
We finally mention the related question of where the first $k$-term prime progression can be found. Let $[1, N_k]$ be the smallest interval containing a $k$-term prime progression, say $a, a + b, \dots, a + (k-1)b$.
Let us first consider lower bounds for $N_k$. It is easily verified that $a \geq k$ and that every prime less than $k$ must divide $b$, which gives the lower bound
\[ N_k \geq k + (k-1) \prod_{p<k} p. \]
(The inequality cannot be strict in general: for $k = 3$ the progression 3, 5, 7 attains it.) It follows from the prime number theorem that $\prod_{p<n} p = e^{(1+o(1))n}$, which gives the asymptotic lower bound
\[ N_k > e^{(1+o_k(1))k}. \]
Now we turn to the harder problem of an upper bound for $N_k$. It is possible to compute one by keeping careful track of the constants and rates of decay in the $o(1)$ errors in the proof of the Green-Tao theorem above. Since the proof invokes Szemerédi's theorem, the bounds obtained are heavily dependent on the optimal bound for Szemerédi's theorem. The best obtained so far is from Gowers' proof using Fourier analysis in [4]. For a set of density $\delta$ this proof gives an upper bound for the first occurrence of a $k$-term arithmetic progression of:
\[ 2^{2^{\delta^{-2^{2^{k+9}}}}}. \]
To calculate the best possible bound for the Green-Tao theorem from the given proof seems a herculean task. Green and Tao have, however, in a brief note [6] estimated the constants and errors in their original proof, obtaining the following (non-optimal) upper bound, where $c$ is some absolute constant:
\[ N_k < c\,2^{2^{2^{2^{2^{2^{2^{100k}}}}}}}. \]
Clearly, the gap between the lower and upper bounds is quite large. It has been conjectured that there exist $k$-term prime progressions with step equal to (the smallest possible) $\prod_{p \leq k} p$ for all $k$, and so the asymptotic lower bound gives the correct order of $N_k$. This has been verified up to $k = 21$ by computer.
Appendix A
Proof of the Decomposition Theorem
In this appendix we give a proof of Theorem 4.3 and show how Theorem 4.1 follows from it. The ideas here are those in [3], although we present the argument in a different way and take advantage of the narrowness of our goal to make some simplifications.
First, we make the following definition.
Definition A.1 (Basic anti-uniform functions). Let $\nu : \mathbb{Z}_N \to \mathbb{R}$ be simply pseudorandom. A function $g : \mathbb{Z}_N \to \mathbb{R}$ is a basic anti-uniform function if $g = Tf$ for some $0 \leq f \leq \nu$.
The following two lemmas are the only parts where we use the fact that is simply
pseudorandom. They give us bounds on anti-uniformity which we shall use to construct the
anti-uniform in Theorem 4.3.
Lemma A.1 (Basic anti-uniform functions are bounded). If 0 f for some simply
pseudorandom then |Tf|
L
= O
k
(1).
Sketch proof. We in fact show that |Tf|

2
2
k1
+ o(1), by writing out the denition of
Tf and applying the simple pseudorandomness condition.
Lemma A.2 (Basic anti-uniform products are anti-uniform). Let 0 f
1
, . . . , f
m
for
some simply pseudorandom measure . Then
|Tf
1
Tf
m
|

U
k1
= O
m
(1).
Remark A.1. This is the part of the proof which requires the bulk of the simple pseudo-
randomness condition. If a way to avoid this lemma were found, then the Decomposition
theorem could be proven with only the hypothesis that | 1|
U
d is suciently small for
some large d.
Sketch proof. Recalling the denition of the dual norm, we need to show that for all f R
N
such that |f|
U
k1 1, we have the bound
f,
m

j=1
Tf
j
) = O
m
(1).
After applications of the Cauchy-Schwarz and H older inequalities, together with the fact
that f , it is sucient to show that
E
_
_
E
_
_

C
k1
(y + h) : y Z
N
_
_
m
: h Z
k1
N
_
_
= O
m
(1).
Using the simply pseudorandom condition and the triangle inequality, we may bound this
expression by E(
m
) = O
m
(1) for some weight function , and we are done.
The idea now is to construct , our anti-uniform approximation to
+
, as follows. We
show that is a small linear combination of basic anti-uniform functions, and then use this
fact to construct a polynomial in which is anti-uniform and approximates
+
. For the rst
task, we dene the following norm.
Denition A.2. We dene the basic norm by
||
B
:= inf
_
n

i=1
[
i
[ : =
n

i=1

i
Tf
i
, 0 f
1
, . . . , f
n

_
,
and if ||
B
then we say that is -basic.
Remark A.2. It is a simple exercise to verify that the expression above does in fact give
a norm on R
N
. In this denition, and for the rest of this proof, will be a xed simply
pseudorandom function.
The | |
B
norm measures how well we can approximate by a linear combination of basic
anti-uniform functions, which are well-behaved by the above lemmas. It is easy to deduce
the following analogues for Lemma A.1 and Lemma A.2.
Lemma A.3 (Basic functions are bounded). If is -basic then ||

= O().
Lemma A.4 (Basic powers are anti-uniform). If is -basic then, for any integer m,
|
m
|

U
k1
= O
m
(
m
).
The following lemma constructs an anti-uniform approximation to , provided that it is
suciently basic.
Lemma A.5 (Approximation with an anti-uniform polynomial). There exists a polynomial
P such that if 1 > > 0, then for every -basic function ,
1. |P
+
|


1
8
2. |P|

U
k1
A
for some A dependent only on .
Proof. By Lemma A.3 we know that for every -basic , we have ||

C < C for some


C independent of and . Now choose some polynomial P such that [P(x) x
+
[
1
8
on
[C, C], and hence |P
+
|


1
8
. Note that P is independent of both and .
Furthermore, by Lemma A.4, for any integer m and any -basic , we have |
m
|

U
k1
=
O
m
(
m
) . If we denote the polynomial P by a
n
x
n
+ + a
0
then by the triangle inequality
|P|

U
k1
[a
n
[|
n
|

U
k1
+ +[a
0
[
= O([a
n
[
n
+ +[a
0
[)
and we denote this last quantity by A, noting that it is dependent only on .
Finally, we show that condition (3) implies that is suciently basic to be able to
construct such an anti-uniform approximation.
Lemma A.6 (Lack of correlation with uniform functions implies basic). If h, ) 1 for
every -uniform 0 h then is
2
k1
-basic.
Sketch proof. Use the fact that |Th|
B
1 and Lemma 3.1 to deduce that if |h|

B

2
k1
then h is -uniform. The result follows using the fact that the dual of a dual norm is the
original norm.
Putting these together, we get the following precise form of Theorem 4.3.
Theorem A.1. If h, ) 1 for every -uniform function h then there exists a polynomial
P(x) such that |P
+
|
1
8
and |P|

U
k1
A for some A dependent only on .
Proof. Combine Lemmas A.6 and A.5.
We can now complete our original strategy for proving Theorem 4.1, as shown below.
Proof of Theorem 4.1. Suppose no such decomposition exists. Then by the Hahn-Banach
theorem there exists a such that
1. f, ) > 1,
2. g, ) 1 for every g such that 0 g 2, and
3. h, ) 1 for every -uniform h.
Condition (2) implies that 1,
+
)
1
2
, and by Theorem A.1, condition (3) implies there
exists a polynomial P such that |P
+
|


1
8
and |P|

U
k1
A for some A dependent
only on . Using this, we may obtain the bound
,
+
) = 1,
+
) +1, P
+
) + 1, P) +,
+
P)

1
2
+
1
8
+ A| 1|
U
k1 +
1
8
(1 + o(1))
=
3
4
+ o(1)
since, by Lemma 4.1, | 1|
U
k1 = o(1). We have also used the fact that , and hence A, is
xed for the duration of this proof. Since f , and f and
+
are both strictly positive, we
can deduce that f,
+
) ,
+
). Appealing to condition (1) above, we have the following
inequalities:
1 < f, ) f,
+
) ,
+
)
3
4
+ o(1).
Hence have a contradiction for N suciently large, and so the required decomposition must
exist.
Appendix B
Estimates for $\Lambda_R$
In this appendix we outline the number theoretical arguments needed to show that the function $\nu$ constructed in Chapter 5 is pseudorandom. The proofs given here are a synthesis of those in Section 10 of [10] and Appendix D of [7].
Theorem B.1 (Linear Forms Estimate). Suppose we have $m$ linear forms in $t$ variables $x = (x_1, \dots, x_t)$, each of the form $\psi_i(x) = \sum_{j=1}^{t} L_{ij} x_j + b_i$ with integer coefficients bounded by $|L_{ij}| \leq \left(\frac{w}{2}\right)^{1/4}$ and with the $t$-tuples $(L_{ij})_{j=1}^{t}$ never identically zero and none a rational multiple of another. Let $B \subset \mathbb{N}^t$ be a product of $t$ intervals, each of length at least $R^{10m}$. Define the modified linear forms $\theta_i(x) = W\psi_i(x) + 1$. Then
\[ \mathbb{E}\left(\Lambda_R(\theta_1(x))^2 \cdots \Lambda_R(\theta_m(x))^2 : x \in B\right) = e^{O_m(1/w)} (1 + o(1)) \left(\frac{W}{\phi(W)\log R}\right)^m. \]
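Theorem B.2 (Correlation Estimate). Suppose we have $m$ linear forms of the form $\psi_i(x) = x + h_i$ for distinct $h_i$ with $|h_i| \leq N^2$, and let $\theta_i$ and $B$ be as above. Then
\[ \mathbb{E}\left(\Lambda_R(\theta_1(x))^2 \cdots \Lambda_R(\theta_m(x))^2 : x \in B\right) = e^{O_m(1/w)} (1 + o(1)) \left(\frac{W}{\phi(W)\log R}\right)^m \prod_{p \mid \Delta} \left(1 + O(p^{-1/2})\right) \]
where $\Delta$ denotes the integer
\[ \Delta = \prod_{1 \leq i < j \leq m} |h_i - h_j|. \]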
Remark B.1. Note that the linear forms in the second theorem are not covered by the first, since all the $(L_{ij})$ are identically 1. We will combine the proofs of both theorems below, since they are identical except for the different evaluation of the Euler product $\prod_p F_p$. These evaluations are given in separate sections below.
Sketch proof. For both theorems we must estimate the same expectation, so let us denote
this by E
,R
. Expanding out the denitions we get
E
,R
=

a,bN
m
_
m

i=1
(a
i
)(b
i
)
_
log a
i
log R
_

_
log b
i
log R
_
_
E
_
m

i=1
1
a
i
,b
i
|
i
(x)
: x B
_
where we write a = (a
1
, . . . , a
m
) and b = (b
1
, . . . , b
m
). Since (x) = 0 for x 1 we have
removed the restriction that a, b R. Furthermore, if we let D be the least common multiple
of a
1
, . . . , b
m
, then we may replace the presence of B in the expectation above with Z
t
D
, with
only (assuming
k
in the denition of R is suciently small) a o(1) error, which can be
included in the right hand side.
We shall denote the expectation factor as
a,b
, and expand it as an Euler product using
the Chinese Remainder Theorem:
1

a,b
:= E
_
m

i=1
1
a
i
,b
i
|
i
(x)
: x Z
t
D
_
=

p
E
_
_
_
_

j such that
p|a
j
or p|b
j
1
p|
j
(x)
: x Z
t
p
_
_
_
_
=

a,b
(p).
Note that this is the only factor inuenced by the choice of linear forms. Hence we can write
E
,R
=

a,bN
m
_
m

i=1
(a
i
)(b
i
)
_
log a
i
log R
_

_
log b
i
log R
_
_

a,b
.
Let be the inverse Fourier transform of e
x
(x), so that
(x) =
_
R
(t)e
ix(1+t)
dt.
Since e
x
(x) is smooth with compact support, is also smooth and decays rapidly for
any A > 0, [(t)[ = O
A
((1 + t)
A
).
2
By restricting the range of integration to I :=
[log
1/2
R, log
1/2
R], we see that (for any c R and A > 0)

_
log c
log R
_
=
_
I
c

1+it
log R
(t)dt + O
A
(c
1
log R
log
A
R).
To simplify notation, let t

:=
1+it
log R
. Since
_
log c
log R
_
= O(c
1/ log R
), the above gives us
m

j=1

_
log a
j
log R
_

_
log b
j
log R
_
=
_
I

_
I
m

j=1
(x
j
)(y
j
)
a
x

j
j
b
y

j
j
dx
j
dy
j
+ O
A
(log
A
R
m

j=1
(a
j
b
j
)
1/ log R
).
It can be shown
3
that the error term contributes O
A
(log
O(1)A
R) to E
,R
, which is o(1) for
A large enough. Hence we have
E
,R
=
_
I

_
I
_
_

a,bN
m

a,b
(p)
m

j=1
(a
j
)(b
j
)
a
x

j
j
b
y

j
j
_
_
m

j=1
(x
j
)(y
j
)dx
j
dy
j
+ o(1).
1
In the product we should properly have p|D, but this restriction can be dropped for the multiplicand is 1 otherwise.
2
See Appendix C for details.
3
See [7], p. 69
By unique factorisation and the presence of , we may factor the rst term as an Euler
product:

a,bN
m

a,b
(p)
m

j=1
(a
j
)(b
j
)
a
x

j
j
b
y

j
j
=

p
_
_

a,b{1,p}
m

a,b
(p)
m

j=1
(a
j
)(b
j
)
a
x

j
j
b
y

j
j
_
_
:=

p
E
p
.
and hence it remains to evaluate
E
,R
=
_
I

_
I

p
E
p
m

j=1
(x
j
)(y
j
)dx
j
dy
j
+ o(1).
Applying Lemma B.2, we get for some modied Euler factors F
p
dened below
E
,R
= (1 + o(1)) log
m
R
_
I

_
I

p
F
p
m

j=1
_
(1 + ix
j
)(1 + iy
j
)
2 + i(x
j
+ y
j
)
(x
j
)(y
j
)
_
dx
j
dy
j
= (1 + o(1))

p
F
p
log
m
R
__
I
_
I
(1 + ix)(1 + iy)
2 + i(x + y)
(x)(y)dxdy
_
m
(providing that we obtain an estimate for

p
F
p
independent of x

j
, y

j
)
= (1 + o(1))

p
F
p
log
m
R
__
R
_
R
(1 + ix)(1 + iy)
2 + i(x + y)
(x)(y)dxdy + o(1)
_
m
= (1 + o(1))

p
F
p
log
m
R.
We could replace the limits of integration by R at the cost of o(1) factors thanks to the rapid
convergence of , and the nal equalities are a consequence of Lemma B.1. Only now does
the proof for our two theorems diverge, in the estimation of

p
F
p
. I have divided these up
into the sections below. Simply plug in Corollaries B.1 and B.2 respectively to obtain the
two theorems.
Lemma B.1 (Sieve Factor Calculation).
_
R
_
R
_
(1 + ix)(1 + iy)
2 + i(x + y)
(x)(y)
_
dxdy =
_

0

(t)dt = 1.
Proof sketch. For the rst equality, evaluate the integral using the observation that
1
2 + i(x + y)
=
_

0
e
(1+ix)t
e
(1+iy)t
dt
to separate the variables, and recalling that (x) is the inverse Fourier transform of e
x
(x).
The second equality follows from our original choice of .
It remains to give estimates for the Euler factors. Note that by explicitly considering all
the possible a, b 1, p
m
we can rewrite the Euler factors E
p
in the more convenient form
E
p
=

a,b{1,p}
m

a,b
(p)
m

j=1
(a
j
)(b
j
)
a
x

j
j
b
y

j
j
=

I,J[m]
(1)
|I|+|J|

IJ
(p)
p
P
jI
x

j
+
P
jJ
y

j
where

X
(p) := E(

jX
1
p|
j
(x)
[ x Z
t
p
).
It will also be convenient to dene the altered Euler factors
E

p
:=
m

j=1
(p
1+x

j
1)(p
1+y

j
1)
p(p
1+x

j
+y

j
1)
.
and then, as promised, we dene F
p
:=
E
p
E

p
. These factors are much easier to evaluate than
E
p
directly, and the following lemma allows us to pass between them in the proof above.
Lemma B.2 (Euler Product Evaluation).

p
E
p
=

p
F
p
1 + o(1)
log
m
R
m

j=1
_
(1 + ix
j
)(1 + iy
j
)
2 + i(x
j
+ y
j
)
_
.
Proof sketch. This is a simple consequence of the denition of E

p
and Lemma B.3 below.
Note that since, for instance, x
j
[log
1/2
R, log
1/2
R] and x

j
:=
1+ix
j
log R
we have that 1+x

j
=
1 + o(1) and so the lemma is applicable.
Lemma B.3 (Zeta Function Estimate). When $\Re(s) > 1$ and $s = 1 + o(1)$,
\[ \prod_p \left(1 - \frac{1}{p^s}\right) = (1 + o(1))(s - 1). \]
B.1 Euler Product for independent linear forms
All that remains is to provide a suitable estimate for

p
F
p
. The strategy in this section
and the next is the same, and runs as follows. We rst obtain bounds on the local factor
estimates
X
(p) for p w (it is here that the W-trick is applied). Secondly, we use these
bounds to estimate E
p
in terms of E

p
for each prime p w. Finally, we use this to estimate

p
E
p
in terms of

p
E

p
, and since F
p
= E
p
/E

p
, these give us an estimate for

p
F
p
.
Lemma B.4 (Local Factor Estimate). For p w

(p) = 1

X
(p) =
1
p
whenever [X[ = 1

X
(p)
1
p
2
whenever [X[ 2
Proof. When X = we are simply taking the expectation of the empty product, which is 1.
When [X[ = 1 then, for some j,

X
(p) = E(1
p|
j
(x)
[ x Z
t
p
) =
#x Z
t
p
:
j
(x) 0 (mod p)
p
t
=
1
p
since
j
: Z
t
p
Z
p
is a uniform covering. For the nal claim, note that it suces to prove it
for the case [X[ = 2, so let us suppose X = j, k. Write the linear forms as

j
(x) =
t

i=1
a
i
b
i
x
i
+ l
j
and
k
(x) =
t

i=0
c
i
d
i
x
i
+ l
k
.
Suppose that the pure linear forms W(
j
b
j
) and W(
k
b
k
) are multiplies of each
other mod p, so that for some and every 1 i t we have
W
a
i
b
i
W
c
i
d
i
(mod p).
Since p W, we may rearrange this to give
a
1
d
1
b
1
c
1

a
2
d
2
b
2
c
2

a
t
d
t
b
t
c
t
(mod p)
and hence, for any 1 i t,
a
1
d
1
b
i
c
i
b
1
c
1
a
i
d
i
(mod p).
In other words,
p[ [a
1
d
1
b
i
c
i
b
1
c
1
a
i
d
i
[ [a
1
d
1
b
i
c
i
[ +[b
1
c
1
a
i
d
i
[ < w p
where for the inequalities we have used the bounds [a
i
[, [b
i
[, [c
i
[, [d
i
[ <
_
w
2
_
1/4
. Hence we have
equality not only in Z
p
but also in Z, and so
a
i
b
i
=
_
a
1
d
1
b
1
c
1
_
c
i
d
i
.
This contradicts our hypothesis that the pure linear forms are not rational multiples of one
another. Hence the pure linear forms are also independent over Z
p
.
Let Z be the set of solutions to
j
(x)
k
(x) 0 (mod p). Since W(
j
l
j
) and
W(
k
l
k
) are not multiples of each other modulo p, it follows that Z is contained in the
intersection of two skew ane subspaces of Z
t
p
, and hence has cardinality at most p
t2
. By
denition,
X
(p) =
|Z|
p
t
, and we are done.
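The three cases of Lemma B.4 can be checked by brute force for a small prime. The sketch below is an illustration under stated assumptions: the forms, the prime $p = 7$ and the helper name are hypothetical choices, and the forms are taken directly modulo $p$ rather than in the shape $W\psi + 1$ used in the text.

```python
from fractions import Fraction
from itertools import product

def zero_fraction(forms, p, t):
    """Proportion of x in (Z/pZ)^t at which every given form vanishes mod p.

    Each form is a pair (coefficients, constant) representing
    sum(coefficients[j] * x[j]) + constant.
    """
    hits = 0
    for x in product(range(p), repeat=t):
        if all((sum(c * xi for c, xi in zip(coeffs, x)) + const) % p == 0
               for coeffs, const in forms):
            hits += 1
    return Fraction(hits, p ** t)

if __name__ == "__main__":
    p, t = 7, 3
    form_a = ((1, 2, 3), 1)
    form_b = ((1, 0, 1), 1)   # pure part independent of form_a mod 7
    print(zero_fraction([], p, t))                # empty product
    print(zero_fraction([form_a], p, t))          # a single form
    print(zero_fraction([form_a, form_b], p, t))  # two independent forms
```

For two forms whose pure linear parts are independent modulo $p$ the common zero set is an affine subspace of codimension 2, which is why the proportion is at most $1/p^2$.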
Lemma B.5 (Euler Factor Estimate). For p w
E
p
=
_
1 + O
m
_
1
p
2
__
E

p
Proof. We divide up the sum in the denition of E
p
into the cases I = J = , [I[ [J[ = 1
and [I[ [J[ 2 and apply Lemma B.8 to get
E
p
:=

I,J[m]
(1)
|I|+|J|

IJ
(p)
p
P
jI
x

j
+
P
jJ
y

j
=

(p)
m

j=1
_
1
p
1+x

j
+
1
p
1+y

1
p
1+x

j
+y

j
_
+

I,J[m]
|I||J|2
O
m
(1/p
2
)
p
P
jI
x

j
+
P
jJ
y

j
= 1
m

j=1
_
p
x

j
+ p
y

j
1
p
1+x

j
+y

j
_
+ O
m
(1/p
2
).
Hence we need to show that
E
p
E

p
=
1

m
j=1
_
p
x

j
+p
y

j
1
p
1+x

j
+y

j
_

m
j=1
(p
1+x

j
1)(p
1+y

j
1)
p(p
1+x

j
+y

j
1)
+ O
m
_
1
p
2
_
= 1 + O
m
_
1
p
2
_
,
which follows from Taylor expansion.
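Lemma B.6.
\[ \prod_{p \geq w} \left(1 + O_m\left(\frac{1}{p^2}\right)\right) = e^{O_m(1/w)}. \]
Proof. We use the inequality $|1 + a| \leq e^{|a|}$ to see that
\[ \left|\prod_{p \geq w} \left(1 + O_m\left(\frac{1}{p^2}\right)\right)\right| \leq e^{O_m\left(\sum_{p \geq w} \frac{1}{p^2}\right)} \leq e^{O_m\left(\sum_{n \geq w} \frac{1}{n^2}\right)}. \]
Since $x^{-2}$ is a decreasing function, we may also bound the sum above by an integral, to get
\[ \sum_{n \geq w} \frac{1}{n^2} \leq \frac{1}{w} + \int_w^\infty \frac{1}{x^2}\,dx = \frac{2}{w}. \]
Combining these two inequalities gives us the result.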
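The tail estimate $\sum_{n \geq w} n^{-2} \leq 2/w$ used in this proof is easy to confirm numerically. The sketch below is illustrative only: the cut-off is an arbitrary assumption, and the part of the series beyond the cut-off is bounded by the integral $\int_{M}^{\infty} x^{-2}\,dx = 1/M$, so the returned value is a genuine upper bound for the full sum.

```python
def tail_sum_upper_bound(w, cutoff=100_000):
    """Upper bound for the sum of 1/n^2 over n >= w (requires 1 <= w <= cutoff)."""
    partial = sum(1.0 / (n * n) for n in range(w, cutoff + 1))
    # Terms with n > cutoff contribute at most the integral of 1/x^2 from cutoff to infinity.
    return partial + 1.0 / cutoff

if __name__ == "__main__":
    for w in (1, 2, 5, 10, 100):
        print(w, tail_sum_upper_bound(w), 2.0 / w)
```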
Lemma B.7 (Euler Product Simplication).

p
E
p
= e
O
m(
1
w
)
__
W
(W)
_
m
+ o(1)
_

p
E

p
Proof. We divide the product into two parts, p < w and p w, and evaluate each separately.
Applying rst Lemma B.5 and then Lemma B.6, we get

pw
E
p
=

pw
_
1 + O
m
_
1
p
2
__

pw
E

p
= e
O
m(
1
w
)

pw
E

p
.
Since E
p
= 1 for p < w, so in particular

p<w
E
p
= 1, it remains to show that
_
W
(W)
_
m
+ o(1) =

p<w
E
1
p
.
Noticing that (since W =

p<w
p and is multiplicative)
W
(W)
=

p<w
p
(p)
=

p<w
p
p 1
,
it suces in turn to show that for all p < w
_
p
p 1
_
m
+ o(1) = E
1
p
.
After some algebraic manipulation, we see that
E
1
p
= p
m
m

j=1
p
1+x

j
+y

j
1
(p
1+x

j
1)(p
1+y

j
1)
.
Furthermore, since x

j
:=
1+ix
j
log R
where x
j
R, p
x

j
= 1 + o(1) and similarly for the other
two cases. Hence we get
E
1
p
= p
m
m

j=1
(1 + o(1))p 1
((1 + o(1))p 1)
2
= p
m
(p 1 + o(1))
m
=
_
p
p 1
_
m
+ o(1)
which concludes the proof.
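The identity $W/\phi(W) = \prod_{p<w} p/(p-1)$ used in this proof, with $W = \prod_{p<w} p$, can be verified exactly for small $w$. The following is a minimal sketch with illustrative function names; $\phi(W)$ is computed naively by counting residues coprime to $W$, which is only practical for small $w$.

```python
from fractions import Fraction
from math import gcd

def primes_below(w):
    """Return the primes strictly less than w by trial division."""
    return [n for n in range(2, w)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]

def w_trick_ratio(w):
    """Return (W, phi(W), W/phi(W), product of p/(p-1)) for W the product of primes below w."""
    primes = primes_below(w)
    W = 1
    for p in primes:
        W *= p
    phi_W = sum(1 for a in range(1, W + 1) if gcd(a, W) == 1)
    direct = Fraction(W, phi_W)
    euler = Fraction(1)
    for p in primes:
        euler *= Fraction(p, p - 1)
    return W, phi_W, direct, euler

if __name__ == "__main__":
    for w in (3, 6, 8, 12):
        print(w, w_trick_ratio(w))
```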
Corollary B.1 (Euler Product Estimate).

p
F
p
= e
O
m(
1
w
)
(1 + o(1))
_
W
(W)
_
m
.
B.2 Euler product for simple linear forms
Lemma B.8 (Local Factor Estimate). For p w

X
(p) =
1
p
whenever [X[ 2 and p[,

X
(p) = 0 whenever [X[ 2 and p ,

X
(p) =
1
p
whenever [X[ = 1, and

(p) = 1.
Proof sketch. The nal two claims proceed exactly as in the previous section. For the rst
two claims, note that if [X[ 2 then
X
(p) is equal to 1/p if all the residue classes h
i
(mod p) are equal, and 0 otherwise.
Lemma B.9 (Euler Factor Estimate).
E
p
=
_
1 + O
_
1
p
2
__
E

p
whenever p , and
E
p
=
_
1 + O
_
1

p
__
E

p
whenever p [ .
Proof. For the rst claim, we apply Lemma B.8 and argue as in the proof of Lemma B.5.
For the second, note that if p[ then, using the denition of E
p
and Lemma B.8,
E
p
= 1 +
1
p

I,J[m]
IJ=
(1)
|I|+|J|
p
P
jI
x

j
+
P
jJ
y

j
= 1
1
p
+
1
p
_
_

I[m]
(1)
|I|
p
P
jI
x

j
_
_
_
_

J[m]
(1)
|J|
p
P
jJ
y

j
_
_
= 1
1
p
+
1
p
m

j=1
_
1
1
p
x

j
__
1
1
p
y

j
_
= 1 + O
_
1

p
_
.
Similarly, one can show that E

p
= 1 + O
_
1

p
_
, and we are done.
Lemma B.10 (Euler Product Simplication).

p
E
p
= e
O
m(
1
w
)
__
W
(W)
_
m
+ o(1)
_

p|
(1 + O(p
1/2
))

p
E

p
.
Proof. As in the proof of Lemma B.7, we know that

p<w
E
p
=
__
W
(W)
_
m
+ o(1)
_

p<w
E

p
and also that

pw
p
E
p
= e
O
m(
1
w
)

pw
p
E

p
.
An application of the above Euler factor estimate gives

pw
p|
E
p
=

p|
(1 + O(p
1/2
))

pw
p|
E

p
and combining these three products gives us the required result.
Corollary B.2 (Euler Product Estimate).

p
F
p
= e
O
m(
1
w
)
(1 + o(1))
_
W
(W)
_
m

p|
(1 + O(p
1/2
)).
B.3 Pseudorandomness of $\nu$
This section proves the linear and simple pseudorandomness conditions. The arguments in
this section are exactly those in [10], Section 9, and are included here for completeness.
Theorem B.3 (Weak Linear Pseudorandomness Condition). If we have m k 2
k1
ho-
mogenous linear forms
i
in t 3k 4 variables with rational coecients bounded by k in
both numerator and denominator, and none equal to zero or a rational multiple of another,
then
E((
1
(x)) . . . (
m
(x)) : x Z
t
N
) = 1 +
m
+ o(1),
where [
m
[
k
for some
k
depending only on k.
Proof. First we clear denominators and assume that all the linear forms have integer coe-
cients, at the cost of increasing the bound on the coecients to (k+1)!. Taking w suciently
large, we can assume that (k + 1)! <
_
w
2
and so we can apply Theorem B.1 to these linear
forms.
We must rst chop up the range of summation into boxes before we can apply Theo-
rem B.1, to deal with the two-part denition of . Let Q =

N, and divide Z
t
p
into Q
t
roughly equal sized boxes, B
u
1
,...,u
t
= B
u
.
B
u
= x Z
t
N
: x
j
[u
j
Q|, (u
j
+ 1)Q|)
Call u Z
t
Q
nice if every linear form takes the box B
u
entirely inside or outside of the
interval [
k
N, 2
k
N]. Note that by denition of Q and the upper bound on m, N/Q > R
5m
,
so we may apply Theorem B.1 to obtain
E((
1
(x)) (
m
(x)) [ x B
u
1
,...,u
t
) = e
O
m(
1
w
)
(1 + o(1))
since we can replace each factor by either 1 or
(W) log R
W

2
R
(
i
(x)).
So the nice boxes have already been dealt with, and give us the answer were looking for.
Next we show that most boxes are nice more precisely, that the proportion of non-nice
boxes is at most O(1/Q).
Suppose u is not nice; then there exists some linear form and x, y B
u
such that
(x) [
k
N, 2
k
N] but (y) , [
k
N, 2
k
N]. Suppose (x) =

t
j=1
L
j
x
j
+ b. Then
(x), (y) =
t

j=1
L
j
Qu
j
| + b + O(Q).
Hence
Either
k
N or 2
k
N =
t

j=1
L
j
Qu
j
| + b + O(Q),
and so
t

j=1
L
j
u
j
=
k
Q +
b
Q
+ O(1)(modQ).
But since (L
j
) is non-zero, the number of t-tuples u which satisfy this is at most O(Q
t1
), and
hence the proportion of non-nice boxes with respect to is O(1/Q). But since the number
of linear forms is bounded also, the total proportion of non-nice boxes is also O(1/Q).
When u is not nice, we can bound by the trivial bound 1 +
(W) log R
W

2
R
(
i
(x)). Multi-
plying out and applying Theorem B.1 again, we get
E((
1
(x)) (
m
(x)) [ x B
u
) = e
O
m(
1
w
)
(O(1) + o(1)).
Putting it all together
LHS = E(E((
1
(x)) (
m
(x)) [ x B
u
) [ u Z
t
Q
) + o(1)
= e
O
m(
1
w
)
(1 + O(1/Q) + o(1)) = 1 +
m
+ o(1)
where
m
can be taken suciently small by taking w suciently large.
Theorem B.4 (Weak Simple Pseudorandomness Condition). Whenever we have m 2
k1
simple linear forms
i
in t k variables, then
E((
1
(x)) (
m
(x)) = 1 +
m
+ o(1).
Furthermore, there exists a weight function
m
: Z
N
R
+
such that E(
q
) = O
m,q
(1) for all
1 q < and for all h
1
, . . . , h
m
Z
N
we have the upper bound
E
xZ
N
((x + h
1
) (x + h
m
)) (1 +
m
)

1i<jm
(h
i
h
j
).
In both cases, [
m
[
k
for some
k
suciently small depending on k.
Proof. The rst part is proven exactly as for the previous theorem. For the second, we
construct our weight function in the next lemma, with the additional requirement that
(0) := exp(Cmlog N/ log log N).
Note this preserves the bounds E(
q
) = O
m,q
(1) for all q, since the weight at 0 contributes
at most o
m,q
(1).
First suppose at least two of the h
i
are equal. We may bound the left hand side crudely
by ||
m

. Standard estimates give us


||

exp(C log N/ log log N)


and so
||
m

(0)

1i<jm
(h
i
h
j
),
which is the required bound.
Now suppose that all h
i
are distinct. By Theorem B.2,
E((x + h
1
) (x + h
m
) : x Z
N
) = e
O
m(
1
w
)
(1 + o
m
(1))

p|
(1 + O(p
1/2
))
(1 +
m
)(1 + o
m
(1))

1i<jm
(h
i
h
j
)
and by choosing w suciently we may ensure that
m
is as small as required. Furthermore,
by adjusting the function by a constant factor depending only on m (and hence k), we can
absorb the o
m
(1) error into the sum, which gives the required result.
Lemma B.11 (Construction of the weight function). For any m 1 there is a weight
function
m
: Z R
+
such that for all distinct h
1
, . . . , h
m
we have

p|
_
1 + O
m
_
1

p
__

1i<jm
(h
i
h
j
),
where
:=

1i<jm
[h
i
h
j
[.
Furthermore, for any 0 < q < ,
E(
q
(n) : 0 < [n[ N) = O
m,q
(1).
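Proof. We take
\[
\tau_m(n) := O_m(1) \prod_{p \mid n} \Big(1 + \frac{1}{\sqrt{p}}\Big)^{O_m(1)}
\]
for all $n \neq 0$. We note that by the arithmetic mean-geometric mean inequality,
\[
\prod_{p \mid \Delta} \Big(1 + O_m\Big(\frac{1}{\sqrt{p}}\Big)\Big)
\leq \prod_{1 \leq i < j \leq m} \Bigg(\prod_{p \mid h_i - h_j} \Big(1 + \frac{1}{\sqrt{p}}\Big)\Bigg)^{O_m(1)}
\leq O_m(1) \sum_{1 \leq i < j \leq m} \Bigg(\prod_{p \mid h_i - h_j} \Big(1 + \frac{1}{\sqrt{p}}\Big)\Bigg)^{O_m(1)}
= \sum_{1 \leq i < j \leq m} \tau(h_i - h_j).
\]
Hence it remains to show that
\[
\mathbb{E}\Bigg(\prod_{p \mid n} \Big(1 + \frac{1}{\sqrt{p}}\Big)^{O_m(q)} : 0 < |n| \leq N\Bigg) = O_{m,q}(1)
\]
for all $0 < q < \infty$. Since $\big(1 + \frac{1}{\sqrt{p}}\big)^{O_m(q)}$ is bounded by $1 + \frac{1}{p^{1/4}}$ for all but $O_{m,q}(1)$ many primes $p$, we have
\[
\mathbb{E}\Bigg(\prod_{p \mid n} \Big(1 + \frac{1}{\sqrt{p}}\Big)^{O_m(q)} : 0 < |n| \leq N\Bigg)
\leq O_{m,q}(1)\, \mathbb{E}\Bigg(\prod_{p \mid n} \Big(1 + \frac{1}{p^{1/4}}\Big) : 0 < n \leq N\Bigg).
\]
We now use the fact that
\[
\prod_{p \mid n} \Big(1 + \frac{1}{p^{1/4}}\Big) \leq \sum_{d \mid n} \frac{1}{d^{1/4}}
\]
to get that
\[
\mathbb{E}\Bigg(\prod_{p \mid n} \Big(1 + \frac{1}{\sqrt{p}}\Big)^{O_m(q)} : 0 < |n| \leq N\Bigg)
\leq O_{m,q}(1)\, \frac{1}{2N} \sum_{1 \leq |n| \leq N} \sum_{d \mid n} \frac{1}{d^{1/4}}
\leq O_{m,q}(1)\, \frac{1}{2N} \sum_{d=1}^{N} \frac{N}{d^{5/4}}
= O_{m,q}(1)
\]
and we are done.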
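The last two steps of this proof can be checked numerically on a small range. The sketch below is only an illustration, not part of the argument: the function names (`prime_divisors`, `divisors`, `euler_factor_product`, `divisor_sum`, `average_product`, `tail_bound`) are hypothetical helpers introduced here, and the range of $n$ is an arbitrary choice. It verifies that $\prod_{p \mid n}(1 + p^{-1/4}) \leq \sum_{d \mid n} d^{-1/4}$ and that the average over $1 \leq n \leq N$ stays below $\sum_{d \leq N} d^{-5/4}$, which is bounded independently of $N$.

```python
from math import isqrt

def prime_divisors(n):
    """Distinct prime divisors of n >= 1, found by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def divisors(n):
    """All positive divisors of n >= 1 in increasing order."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))

def euler_factor_product(n):
    """Product over primes p | n of (1 + p^(-1/4))."""
    result = 1.0
    for p in prime_divisors(n):
        result *= 1.0 + p ** -0.25
    return result

def divisor_sum(n):
    """Sum over all d | n of d^(-1/4)."""
    return sum(d ** -0.25 for d in divisors(n))

def average_product(N):
    """Average of euler_factor_product(n) over 1 <= n <= N."""
    return sum(euler_factor_product(n) for n in range(1, N + 1)) / N

def tail_bound(N):
    """Partial sum of d^(-5/4) for 1 <= d <= N; this converges as N grows."""
    return sum(d ** -1.25 for d in range(1, N + 1))

if __name__ == "__main__":
    for N in (100, 1000, 2000):
        print(N, average_product(N), tail_bound(N))
```

Expanding the product gives a sum over the squarefree divisors of $n$ only, which is why the pointwise inequality holds, with equality exactly when $n$ is squarefree.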
Appendix C
Fourier transform
In this appendix we prove a standard fact about rapid decay of the Fourier transform required
in the proof of Theorems B.1 and B.2. The proof here is taken from [25].
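Lemma C.1. If $f$ is a bounded function with compact support, then $\hat{f}$ is also bounded. In fact, we have
\[
\|\hat{f}\|_\infty \leq \|f\|_1.
\]
Proof. For any $t \in \mathbb{R}$,
\[
|\hat{f}(t)| := \Big|\int f(x) e^{-ixt}\, dx\Big| \leq \int |f(x)|\, dx =: \|f\|_1 < \infty.
\]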
Theorem C.1. Suppose $f$ is $C^N$ with compact support and $f^{(n)} \in L^1$ for all $0 \leq n \leq N$. Then
\[
\widehat{f^{(n)}}(t) = (it)^n \hat{f}(t)
\]
when $0 \leq n \leq N$, and furthermore
\[
|\hat{f}(t)| = O\big((1 + |t|)^{-N}\big).
\]
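Proof. We use induction on $N$. For $N = 1$, by integration by parts we have
\[
\widehat{f'}(t) = \int f'(x) e^{-ixt}\, dx = it \int e^{-ixt} f(x)\, dx = it \hat{f}(t),
\]
where the boundary terms vanish because $f$ has compact support. The inductive step follows easily.

It follows from Lemma C.1 that $\widehat{f^{(n)}}$ is bounded, and hence the first part of the theorem implies that $t^n \hat{f}$ is bounded if $n \leq N$, say $|t^n \hat{f}| \leq D$. Note that we can take this bound to be uniform over all $n$, since there are only finitely many $n$ to be considered. From the binomial theorem it follows that for some constant $C > 0$
\[
C(1 + |t|)^N \leq \sum_{n=0}^{N} |t^n|.
\]
And hence, summing the $N + 1$ bounds $|t^n \hat{f}| \leq D$,
\[
|\hat{f}| \sum_{n=0}^{N} |t^n| \leq D(N + 1),
\]
so
\[
|\hat{f}| \leq \frac{D(N + 1)}{\sum_{n=0}^{N} |t^n|} \leq \frac{D(N + 1)}{C(1 + |t|)^N} = O\big((1 + |t|)^{-N}\big).
\]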
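As a numerical illustration of Lemma C.1 and of the case $N = 1$ of Theorem C.1, the sketch below approximates the Fourier transform by a midpoint rule. Everything in it is an assumption made for the example: the test function $f(x) = (1 - x^2)^2$ on $[-1, 1]$ (which is $C^1$ with compact support), the helper names `fourier_transform`, `f` and `f_prime`, and the step count. It uses the same convention $\hat{f}(t) = \int f(x) e^{-ixt}\,dx$ as above.

```python
import cmath

def fourier_transform(func, t, a=-1.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i x t) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += func(x) * cmath.exp(-1j * x * t)
    return total * h

def f(x):
    """Test function (1 - x^2)^2 on [-1, 1], zero elsewhere; it is C^1 with compact support."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def f_prime(x):
    """Derivative of f, also supported on [-1, 1]."""
    return -4.0 * x * (1.0 - x * x) if abs(x) < 1.0 else 0.0

# Exact L^1 norm of f: the integral of (1 - x^2)^2 over [-1, 1] is 16/15.
L1_NORM_F = 16.0 / 15.0

if __name__ == "__main__":
    for t in (0.5, 3.0, 10.0):
        lhs = fourier_transform(f_prime, t)      # transform of f'
        rhs = 1j * t * fourier_transform(f, t)   # (it) times the transform of f
        print(t, abs(lhs - rhs), abs(fourier_transform(f, t)), L1_NORM_F)
```

The printed difference between the two sides should be at the level of the quadrature error, and $|\hat{f}(t)|$ should never exceed $\|f\|_1 = 16/15$.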
Corollary C.1. If $f$ is smooth with compact support then $\hat{f}(t) = O_A\big((1 + |t|)^{-A}\big)$ for any $A > 0$.
Appendix D
The GI and MN Conjectures
In this appendix we give a formal statement of the conjectures mentioned in Chapter Six.
These statements are taken from [7], which contains an in-depth discussion of these conjectures and the results surrounding them.
Definition D.1 (Nilpotent). Let $G$ be a connected, simply connected Lie group with central series $G_0 \supseteq G_1 \supseteq G_2 \supseteq \dots$ (that is, $G_0 = G_1 = G$ and $G_{i+1} = [G, G_i]$ for $i \geq 1$). We say that $G$ is $s$-step nilpotent if $G_{s+1} = \{1\}$.
Definition D.2 (Nilmanifold). Let $G$ be an $s$-step nilpotent group, and $\Gamma \leq G$ a discrete, cocompact subgroup. Then the quotient $G/\Gamma$ is an $s$-step nilmanifold.
Definition D.3 (Nilsequence). An $s$-step nilsequence is a sequence of the form $(F(g^n x))_{n \in \mathbb{N}}$ where $g \in G$, $x \in G/\Gamma$ and $F : G/\Gamma \to \mathbb{R}$ is a continuous function for some $s$-step nilmanifold $G/\Gamma$.
Conjecture D.1 (Inverse Gowers norm conjecture for $s$). Suppose that $0 < \delta \leq 1$. Then there exists a finite collection $\mathcal{M}_{s,\delta}$ of $s$-step nilmanifolds with the following property. Given any $N$ and 1-bounded function $f$ on $[N]$ such that
\[
\|f\|_{U^{s+1}[N]} \geq \delta
\]
there is a nilmanifold in $\mathcal{M}_{s,\delta}$ and a 1-bounded $s$-step nilsequence $(F(g^n x))$ on it with a bounded Lipschitz constant (i.e. a bound dependent only on $s$ and $\delta$, not $N$) such that
\[
|\mathbb{E}_{n \in [N]} f(n) F(g^n x)| \gg_{s,\delta} 1.
\]
Let us see what this gives us in the case $s = 1$, i.e. for the Gowers $U^2$ norm. A group is 1-step nilpotent if and only if it is Abelian. In this case, one can in fact take $G = \mathbb{R}$ and $\Gamma = \mathbb{Z}$, and $\mathcal{M}_{1,\delta}$ is just the singleton set $\{\mathbb{R}/\mathbb{Z}\}$, independent of $\delta$. This case of the conjecture is easy to prove: if $f$ has a large $U^2$ norm then it is easy to show that it correlates with a linear character, i.e. a function of the form $e^{2\pi i \xi n / N}$, and this is a 1-step nilsequence on $\mathbb{R}/\mathbb{Z}$ taking $F$ to be the identity, $x = 1$ and $g = e^{2\pi i \xi / N}$.
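The "easy to show" step can be illustrated in the simplified setting of the cyclic group $\mathbb{Z}_N$ (an assumption of this sketch; the conjecture above is stated on the interval $[N]$). With $\hat{f}(r) := \mathbb{E}_n f(n) e^{-2\pi i r n / N}$ one has the standard identity $\|f\|_{U^2}^4 = \sum_r |\hat{f}(r)|^4$, and since $\sum_r |\hat{f}(r)|^2 \leq 1$ for 1-bounded $f$, some frequency $r$ satisfies $|\hat{f}(r)| \geq \|f\|_{U^2}^2$. The helper names below (`linear_character`, `u2_norm_fourth_power`, `fourier_coefficient`, `largest_fourier_coefficient`) are hypothetical and exist only for this example.

```python
import cmath

def linear_character(xi, N):
    """The function n -> exp(2 pi i xi n / N) on Z_N, as a list of complex values."""
    return [cmath.exp(2j * cmath.pi * xi * n / N) for n in range(N)]

def u2_norm_fourth_power(f):
    """Fourth power of the U^2 norm on Z_N, computed directly from the definition
    E_{x,a,b} f(x) conj(f(x+a)) conj(f(x+b)) f(x+a+b)."""
    N = len(f)
    total = 0j
    for x in range(N):
        for a in range(N):
            for b in range(N):
                total += (f[x] * f[(x + a) % N].conjugate()
                          * f[(x + b) % N].conjugate() * f[(x + a + b) % N])
    return (total / N ** 3).real

def fourier_coefficient(f, r):
    """Normalised Fourier coefficient E_n f(n) exp(-2 pi i r n / N)."""
    N = len(f)
    return sum(f[n] * cmath.exp(-2j * cmath.pi * r * n / N) for n in range(N)) / N

def largest_fourier_coefficient(f):
    """Return (r, |f_hat(r)|) for a frequency r maximising |f_hat(r)|."""
    sizes = [(abs(fourier_coefficient(f, r)), r) for r in range(len(f))]
    size, r = max(sizes)
    return r, size

if __name__ == "__main__":
    N = 16
    # A 1-bounded function built from two linear characters.
    g = [0.5 * a + 0.5 * b
         for a, b in zip(linear_character(3, N), linear_character(5, N))]
    u2_fourth = u2_norm_fourth_power(g)
    r, size = largest_fourier_coefficient(g)
    print(u2_fourth, r, size, size >= u2_fourth ** 0.5)
```

A single linear character has $U^2$ norm exactly 1 and correlates perfectly with itself; the mixture `g` shows the general inequality with room to spare.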
Conjecture D.2 (Möbius Nilsequence Conjecture). Let $G/\Gamma$ be an $s$-step nilmanifold and $(F(g^n x))$ a bounded $s$-step nilsequence. Then for any $A > 0$
\[
|\mathbb{E}_{n \in [N]} \mu(n) F(g^n x)| \ll \log^{-A} N,
\]
where the implicit bound is dependent on $A$, $s$, the nilmanifold and the Lipschitz constant of the nilsequence (but not, importantly, on the nilsequence itself, nor on $g$ or $x$).
Bibliography
[1] P. Erdős and P. Turán, On some sequences of integers, J. London Math. Soc. 11 (1936), 261–264.
[2] D. Goldston and C. Y. Yıldırım, Small gaps between primes, I, preprint available at arXiv:math/0504336.
[3] W. T. Gowers, Decompositions, Approximate Structure, Transference, and the Hahn-Banach theorem, preprint available at arXiv:0811.3103.
[4] W. T. Gowers, A new proof of Szemerédi's theorem, GAFA 11 (2001), 465–588.
[5] Ben Green, Long arithmetic progressions of primes, Analytic Number Theory: a tribute to Gauss and Dirichlet (Tschinkel and Duke, eds.), Clay Mathematics Proceedings, 2007, pp. 149–168.
[6] Ben Green and Terence Tao, A bound for progressions of length k in the primes, available at http://www.math.ucla.edu/~tao/preprints/Expository/quantitative_AP.dvi.
[7] Ben Green and Terence Tao, Linear equations in primes, to appear in Annals of Math., preprint available at arXiv:math/0606088.
[8] Ben Green and Terence Tao, The Möbius function is strongly orthogonal to nilsequences, preprint available at arXiv:0807.1736.
[9] Ben Green and Terence Tao, An inverse theorem for the Gowers $U^3(G)$-norm, with applications, Proc. Edinburgh Math. Soc. 51 (2008), no. 1, 71–153.
[10] Ben Green and Terence Tao, The primes contain arbitrarily long arithmetic progressions, Annals of Math. 167 (2008), 481–547.
[11] Ben Green and Terence Tao, Quadratic uniformity of the Möbius function, Annales de l'Institut Fourier (Grenoble) 58 (2008), no. 6, 1863–1935.
[12] Ben Green, Terence Tao, and Tamar Ziegler, An inverse theorem for the Gowers $U^4$-norm, submitted to the Glasg. Math. J., preprint available at arXiv:0911.5681.
[13] Ben Green, Terence Tao, and Tamar Ziegler, An inverse theorem for the Gowers $U^{s+1}[N]$-norm, preprint available at arXiv:1009.3998.
[14] G. H. Hardy and J. E. Littlewood, Some problems of partitio numerorum III: On the expression of a number as a sum of primes, Acta Math. 44 (1923), 1–70.
[15] D. R. Heath-Brown, Three primes and an almost-prime in arithmetic progression, J. London Math. Soc. 23 (1981), 396–414.
[16] Thái Hoàng Lê, Green-Tao theorem in function fields, preprint available at arXiv:0908.2642.
[17] Omer Reingold, Luca Trevisan, Madhur Tulsiani, and Salil Vadhan, New proofs of the Green-Tao-Ziegler dense model theorem: An exposition, preprint available at arXiv:0806.0381.
[18] K. F. Roth, On certain sets of integers, J. London Math. Soc. 28 (1953), 242–252.
[19] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. (1975), 299–345.
[20] Terence Tao, A remark on Goldston-Yıldırım correlation estimates, available at http://www.math.ucla.edu/~tao/preprints/Expository/gy-corr.dvi.
[21] Terence Tao, What is good mathematics?, Bull. Amer. Math. Soc. 44 (2007), 623–634.
[22] Terence Tao and Tamar Ziegler, The primes contain arbitrarily long polynomial progressions, Acta Math. 201 (2008), 213–305.
[23] J. G. van der Corput, Über Summen von Primzahlen und Primzahlquadraten, Math. Ann. 116 (1939), 1–50.
[24] P. Varnavides, On certain sets of positive density, J. London Math. Soc. 34 (1959), 358–360.
[25] Thomas Wolff, Lectures in harmonic analysis, available online at http://www.math.ubc.ca/~ilaba/wolff/.
