Proof of Kepler’s Laws from first principles

Oskar John Hollinsworth

December 26, 2014

Johannes Kepler famously devised his three laws of planetary motion, which describe situations like our own solar
system, in which almost all of the mass is located in a central body (the Sun) and thus we can assume that the only
prevailing force is that of the Sun’s gravity on the orbiting planets. Kepler’s laws state:
1. The orbit of a planet is an ellipse with the Sun at one of the two foci.
2. A line segment connecting a planet and the Sun ‘sweeps out’ equal areas in equal times.
3. The square of the orbital period of a planet is proportional to the cube of the semi-major axis of its orbit ($T^2 \propto R^3$).
The first law means that if you imagine the Sun and a second point P as two fixed pegs, then a loop of string of a
certain length, pulled taut around these pegs by the planet, traces out the orbit of the planet (using the loop like a compass).
The second law means that planets move fastest when closest to the Sun and slowest when furthest away. The third law
states that if you place a planet twice as far away, the time taken to complete an orbit will scale by a factor of $\sqrt{2^3} = 2\sqrt{2}$.
The setting for these laws is in fact much more general than the force of gravity alone. They concern
the motion of particles under the action of a ‘central force’, i.e. one that is a function only of the particle’s distance
to the centre and whose line of action is along the line connecting the centre to the particle. The second law holds for every
such force, while the first and third also rely on the inverse-square form of gravity. It is using this language, and
describing this much more general type of system, that we will now prove all three of these laws using only Newton’s Laws
and some serious calculus.

1 Prerequisites

1.1 The scalar equations of motion in a polar coordinate system

Our objective is to be able to write equations of motion, i.e. $\sum \mathbf{F} = m\mathbf{a} = m\ddot{\mathbf{r}}$, for different arrays of forces, thus we must find
a nice expression for $\ddot{\mathbf{r}}$ in terms of scalars.
Let $\hat{r}$ be the radius unit vector, or more specifically, a vector of magnitude one unit in the direction of increasing
radius, or in other words, outwards from the centre. Resolving into its horizontal and vertical components,
\[ \hat{r} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}. \]

Let $\hat{\theta}$ be the angle unit vector, that is, a vector of magnitude one unit in the direction of increasing $\theta$ (the angle the position
vector makes with the x-axis). By resolving or simply noting that it is perpendicular to $\hat{r}$, we can see that
\[ \hat{\theta} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}. \]
Interestingly, this is also the derivative of r̂ with respect to θ. Therefore, using the chain rule:

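\[
\begin{aligned}
\frac{d\hat{r}}{dt} &= \frac{d}{dt}\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \\
&= \dot{\theta}\begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \\
&= \dot{\theta}\hat{\theta}
\end{aligned}
\]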

We can obtain a similar result for θ̂:

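\[
\begin{aligned}
\frac{d\hat{\theta}}{dt} &= \frac{d}{dt}\begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \\
&= \dot{\theta}\begin{pmatrix} -\cos\theta \\ -\sin\theta \end{pmatrix} \\
&= -\dot{\theta}\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \\
&= -\dot{\theta}\hat{r}
\end{aligned}
\]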

We can rewrite $\mathbf{r}$ (a general position vector) as $\mathbf{r} = r\hat{r}$. Now we are finally ready to find $\ddot{\mathbf{r}}$!
Differentiating with respect to time using the product rule and our previous results:

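\[
\begin{aligned}
\dot{\mathbf{r}} &= \dot{r}\hat{r} + r\frac{d\hat{r}}{dt} \\
&= \dot{r}\hat{r} + r\dot{\theta}\hat{\theta} \\
\ddot{\mathbf{r}} &= \ddot{r}\hat{r} + \dot{r}\frac{d\hat{r}}{dt} + r\frac{d}{dt}(\dot{\theta}\hat{\theta}) + \dot{r}\dot{\theta}\hat{\theta} \\
&= \ddot{r}\hat{r} + \dot{r}\dot{\theta}\hat{\theta} + r\left(\dot{\theta}\frac{d\hat{\theta}}{dt} + \ddot{\theta}\hat{\theta}\right) + \dot{r}\dot{\theta}\hat{\theta} \\
&= \ddot{r}\hat{r} + 2\dot{r}\dot{\theta}\hat{\theta} + r(-\dot{\theta}^2\hat{r} + \ddot{\theta}\hat{\theta}) \\
&= (\ddot{r} - r\dot{\theta}^2)\hat{r} + (r\ddot{\theta} + 2\dot{r}\dot{\theta})\hat{\theta}
\end{aligned}
\]
Recognising a product rule:
\[ \ddot{\mathbf{r}} = (\ddot{r} - r\dot{\theta}^2)\hat{r} + \frac{1}{r}\frac{d}{dt}(r^2\dot{\theta})\hat{\theta} \]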
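
The component formula above can be sanity-checked numerically. The sketch below is not part of the derivation: it picks an arbitrary test trajectory (the functions `position` and `polar_acceleration` and the particular choices of r(t) and θ(t) are assumptions made for illustration) and compares a finite-difference Cartesian acceleration with the polar expression.

```python
import math

def position(t):
    """Hypothetical test trajectory: r(t) = 2 + sin t, theta(t) = t^2/2 + t."""
    r = 2.0 + math.sin(t)
    theta = 0.5 * t * t + t
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_acceleration(t, dt=1e-4):
    """Approximate the acceleration with a second central difference."""
    x0, y0 = position(t - dt)
    x1, y1 = position(t)
    x2, y2 = position(t + dt)
    return (x0 - 2 * x1 + x2) / dt**2, (y0 - 2 * y1 + y2) / dt**2

def polar_acceleration(t):
    """Evaluate (r'' - r theta'^2) r_hat + (r theta'' + 2 r' theta') theta_hat."""
    r, r_dot, r_ddot = 2.0 + math.sin(t), math.cos(t), -math.sin(t)
    theta, theta_dot, theta_ddot = 0.5 * t * t + t, t + 1.0, 1.0
    a_r = r_ddot - r * theta_dot**2
    a_theta = r * theta_ddot + 2 * r_dot * theta_dot
    # Resolve r_hat = (cos, sin) and theta_hat = (-sin, cos) into x and y.
    ax = a_r * math.cos(theta) - a_theta * math.sin(theta)
    ay = a_r * math.sin(theta) + a_theta * math.cos(theta)
    return ax, ay

print(cartesian_acceleration(0.7))
print(polar_acceleration(0.7))
```

The two printed pairs should agree to within the finite-difference error, which is what the derivation predicts for any smooth r(t) and θ(t).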

1.2 The scalar equations of motion for a central force

A central force is one which can be written in the form F (r)r̂, i.e. it is a function purely of distance to a centre and the line
of action is along the line connecting the centre to the particle. Therefore the vector equation of motion is:

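\[ m\ddot{\mathbf{r}} = F(r)\hat{r} \]

Using our previous result this becomes:

\[ m\left((\ddot{r} - r\dot{\theta}^2)\hat{r} + \frac{1}{r}\frac{d}{dt}(r^2\dot{\theta})\hat{\theta}\right) = F(r)\hat{r} \]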
Equating coefficients of the components in each direction we arrive at two scalar equations of motion:

\[ m(\ddot{r} - r\dot{\theta}^2) = F(r) \]

\[ \frac{m}{r}\frac{d}{dt}(r^2\dot{\theta}) = 0 \]

These can be written more nicely by noting that m = 0 is a very boring case that we are clearly not interested in:

\[ F(r) = m(\ddot{r} - r\dot{\theta}^2) \tag{1.1} \]

\[ \frac{d}{dt}(r^2\dot{\theta}) = 0 \tag{1.2} \]
Integrating (1.2) we get merely a constant of integration on the right hand side, written h in this case:

\[ r^2\dot{\theta} = h \]

This is a useful thing to know to be constant, for instance, it will allow us to simplify the first equation such that it does not
refer to θ:
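\[ F(r) = m\left(\ddot{r} - \frac{1}{r^3}(r^2\dot{\theta})^2\right) \tag{1.3} \]
\[ \phantom{F(r)} = m\left(\ddot{r} - \frac{h^2}{r^3}\right) \tag{1.4} \]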

1.3 The scalar equations of path for a central force

It turns out to be very helpful to also have a result relating r and θ directly, without having to consider everything as
functions of time. Therefore, we shall now relate $u = \frac{1}{r}$ and θ, completely removing t from our equations to reach a very
handy result. We shall find an equation for $\ddot{r}$ which we can then substitute into (1.4).

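\[ \dot{r} = \frac{d}{dt}\left(\frac{1}{u}\right) \]
Using a double chain rule:
\[
\begin{aligned}
\dot{r} &= \frac{d}{du}\left(\frac{1}{u}\right)\frac{du}{d\theta}\frac{d\theta}{dt} \\
&= -\frac{1}{u^2}\frac{du}{d\theta}\dot{\theta}
\end{aligned}
\]
Recalling that $h = r^2\dot{\theta}$ we can write $\dot{\theta} = hu^2$:
\[ \dot{r} = -h\frac{du}{d\theta} \]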

Now we just need to differentiate once more:

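\[
\begin{aligned}
\ddot{r} &= \frac{d}{dt}(\dot{r}) \\
&= -\frac{d}{dt}\left(h\frac{du}{d\theta}\right)
\end{aligned}
\]
Using the chain rule:
\[ \ddot{r} = -h\dot{\theta}\frac{d^2u}{d\theta^2} \]
Writing $\dot{\theta} = hu^2$:
\[ \ddot{r} = -h^2u^2\frac{d^2u}{d\theta^2} \]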

Now we are ready to substitute this into (1.4):

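\[
\begin{aligned}
F(r) &= m\left(\left(-h^2u^2\frac{d^2u}{d\theta^2}\right) - \frac{h^2}{r^3}\right) \\
-\frac{1}{m}F\left(\frac{1}{u}\right) &= h^2u^2\frac{d^2u}{d\theta^2} + h^2u^3 \\
-\frac{1}{mh^2u^2}F\left(\frac{1}{u}\right) &= \frac{d^2u}{d\theta^2} + u
\end{aligned}
\]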

This gives us a highly useful equation of path:

\[ \frac{d^2u}{d\theta^2} + u = -\frac{F\left(\frac{1}{u}\right)}{mh^2u^2} \tag{1.5} \]
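
As an illustration of how the path equation can be used, the sketch below integrates it numerically in θ with a classical fourth-order Runge-Kutta scheme. The function names, the step count and the sample force (an inverse-square attraction of strength k, chosen only as an example) are assumptions for illustration and do not come from the derivation above.

```python
import math

def path_rhs(u, force, m, h):
    """d2u/dtheta2 from the path equation: -u - F(1/u) / (m h^2 u^2)."""
    return -u - force(1.0 / u) / (m * h * h * u * u)

def integrate_path(u0, du0, theta_end, force, m, h, steps=2000):
    """Advance (u, w = du/dtheta) from theta = 0 to theta_end with RK4."""
    dth = theta_end / steps
    u, w = u0, du0
    for _ in range(steps):
        k1u, k1w = w, path_rhs(u, force, m, h)
        k2u, k2w = w + 0.5 * dth * k1w, path_rhs(u + 0.5 * dth * k1u, force, m, h)
        k3u, k3w = w + 0.5 * dth * k2w, path_rhs(u + 0.5 * dth * k2u, force, m, h)
        k4u, k4w = w + dth * k3w, path_rhs(u + dth * k3u, force, m, h)
        u += dth * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        w += dth * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return u, w

# Sample values (assumed): unit mass, attraction F(r) = -k / r^2 with k = 1.
m, k, h, C = 1.0, 1.0, 1.2, 0.3
u_end, _ = integrate_path(k / (m * h * h) + C, 0.0, 2.0, lambda r: -k / (r * r), m, h)
print(u_end, k / (m * h * h) + C * math.cos(2.0))
```

For this sample force the right-hand side reduces to a constant minus u, so the numerical value of u at θ = 2 can be compared with the cosine solution of that linear equation; other choices of `force` can be passed in the same way.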

1.4 The Polar Equation of a conic with a focus at the origin

For the purposes of this proof, we shall assume that we are talking about an ellipse (0 ≤ e < 1) and the positive focus, but
the proof is essentially identical in the other cases. The focus is a distance ae from the centre of the shape, so with our origin
at the focus, the directrix is now ae closer than normal (which is $x = \frac{a}{e}$), so it lies at $x = \frac{a}{e} - ae$.
The distance of a point on the conic to the focus (which is now the origin) is e times the distance of the point to the directrix.

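\[
\begin{aligned}
r &= e\left(\frac{a}{e} - ae - x\right) \\
&= a - ae^2 - er\cos\theta
\end{aligned}
\]
Letting $p = a(1 - e^2)$:
\[
\begin{aligned}
r &= p - er\cos\theta \\
p &= r(1 + e\cos\theta)
\end{aligned}
\]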

Thus we have the equation:

\[ pu = 1 + e\cos\theta \tag{1.6} \]

For a hyperbola, take $p = a(e^2 - 1)$ and e > 1. For a parabola, take e = 1 and p = 2a. Otherwise, all the conics have the
same form.
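
A quick numerical check ties the polar equation back to the two-peg picture of an ellipse. The sketch below assumes the elliptical case with the origin at one focus and the nearest point of the orbit on the positive x-axis, so that the second focus sits at (−2ae, 0); the function names and sample values are illustrative only.

```python
import math

def conic_radius(theta, a, e):
    """Radius from p u = 1 + e cos(theta), with p = a (1 - e^2) and u = 1/r."""
    p = a * (1.0 - e * e)
    return p / (1.0 + e * math.cos(theta))

def focal_distance_sum(theta, a, e):
    """Distance to the focus at the origin plus distance to the other focus."""
    r = conic_radius(theta, a, e)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return r + math.hypot(x + 2.0 * a * e, y)

a, e = 1.5, 0.6  # sample semi-major axis and eccentricity (assumed)
for theta in (0.0, 1.0, 2.5, math.pi, 4.0):
    print(theta, focal_distance_sum(theta, a, e), 2.0 * a)
```

Every angle should give a sum equal to 2a up to rounding, which is the constant-string-length property of an ellipse.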

1.5 Torques

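A torque, or moment, is a turning force. Following the analogy with linear motion, angular momentum changes due to an
external torque much as linear momentum changes due to an external force. Writing the angular momentum about the origin as $\mathbf{L} = \mathbf{r} \times \mathbf{p}$:

\[
\begin{aligned}
\boldsymbol{\tau} &= \mathbf{r} \times \mathbf{F} \\
\mathbf{F} &= \frac{d\mathbf{p}}{dt} \\
\mathbf{p} &= m\mathbf{v} \\
\frac{d\mathbf{L}}{dt} &= \mathbf{r} \times \frac{d\mathbf{p}}{dt} + \frac{d\mathbf{r}}{dt} \times \mathbf{p} \\
&= \mathbf{r} \times \mathbf{F} + \mathbf{v} \times \mathbf{p}
\end{aligned}
\]
But $\mathbf{p}$ and $\mathbf{v}$ are parallel so their cross product is 0, leaving $\frac{d\mathbf{L}}{dt} = \mathbf{r} \times \mathbf{F} = \boldsymbol{\tau}$.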

2 The Proofs

2.1 The First Law

Theorem 1. Planetary Orbits are elliptical with the Sun at one of the foci

Proof. Newton’s Law of gravitation states:

\[ \mathbf{F} = -\frac{GMm}{r^2}\hat{r} \tag{2.1} \]
This is clearly a central force where $F(r) = -\frac{GMm}{r^2}$ so we can substitute (2.1) into (1.5).
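\[
\begin{aligned}
\frac{d^2u}{d\theta^2} + u &= -\frac{F\left(\frac{1}{u}\right)}{mh^2u^2} \\
&= \frac{GMmu^2}{mh^2u^2} \\
&= \frac{GM}{h^2}
\end{aligned}
\]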
This gives us the following second-order linear differential equation:

\[ \frac{d^2u}{d\theta^2} + u = \frac{GM}{h^2} \tag{2.2} \]
The auxiliary quadratic equation is $x^2 + 1 = 0$, which is solved by $x = \pm i$, so the complementary function is $A\cos\theta + B\sin\theta$. A
particular solution is simply $\frac{GM}{h^2}$, giving the general solution:

\[ u = \frac{GM}{h^2} + A\cos\theta + B\sin\theta \]
If we just choose a C and ϕ such that A = C cos ϕ and B = C sin ϕ then we can rewrite this as:
\[ u = \frac{GM}{h^2} + C\cos(\theta - \phi) \]
We need to be able to easily compare this to the polar equation for a conic so let us rewrite it:
\[
\begin{aligned}
u &= \frac{GM}{h^2} + C\cos(\theta - \phi) \\
\frac{h^2u}{GM} &= 1 + \frac{h^2C}{GM}\cos(\theta - \phi)
\end{aligned}
\]
Recall from (1.6):
\[ pu = 1 + e\cos\theta \]
Thus we do indeed have a conic with $p = \frac{h^2}{GM}$, $e = \frac{h^2C}{GM}$, an angle ϕ between the semi-major axis and the x-axis and one
of the foci at the origin. However, it could be any of the three conic sections. Looking at the equation for e, we can see that
it will tend to be an ellipse when h and C are small but M is large (G is a tiny constant). This will occur when the initial
velocity and angular velocity are small but the mass of the central body is large, which is exactly what we have in our solar
system. The path will be more parabolic when the mass of the central body is smaller and the velocities are larger, like objects on
Earth tend to be.
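
The result can also be checked against a direct simulation of Newton’s second law. The sketch below is only an illustration under stated assumptions: units with GM = 1, a launch from the x-axis with a purely tangential velocity (so that ϕ = 0 and p and e follow from the initial radius and speed), and a hand-written fourth-order Runge-Kutta step. None of these names or values come from the proof itself.

```python
import math

GM = 1.0  # assumed units in which GM = 1

def deriv(state):
    """Time derivative of (x, y, vx, vy) under inverse-square attraction."""
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -GM * x / r3, -GM * y / r3)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, dt / 2))
    k3 = deriv(shifted(k2, dt / 2))
    k4 = deriv(shifted(k3, dt))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def max_conic_residual(r0=1.0, v0=1.2, dt=1e-3, steps=15000):
    """Largest deviation of the simulated orbit from r (1 + e cos theta) = p."""
    state = (r0, 0.0, 0.0, v0)   # start on the x-axis, moving tangentially
    h = r0 * v0                  # h = r^2 * theta_dot at the start
    p = h * h / GM
    e = p / r0 - 1.0             # from p = r0 (1 + e cos 0)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        x, y = state[0], state[1]
        # r (1 + e cos theta) equals r + e x because x = r cos theta.
        worst = max(worst, abs(math.hypot(x, y) + e * x - p))
    return e, worst

print(max_conic_residual())
```

With these sample values e lies between 0 and 1, and the residual should stay tiny over roughly one full orbit, consistent with the path being an ellipse with a focus at the origin.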

2.2 The Second Law

Theorem 2. $\frac{dA}{dt}$, where A is the area swept out by the position vector of the particle, is constant
Proof. Recall the scalar equations of motion for a body moving under the action of a central force:

\[ F(r) = m(\ddot{r} - r\dot{\theta}^2) \tag{2.3} \]

\[ \frac{d}{dt}(r^2\dot{\theta}) = 0 \tag{2.4} \]
Integrating (2.4) we get merely a constant of integration, written h in this case:

\[ r^2\dot{\theta} = h \]

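Now, note that a small area swept out, δA, can be approximated as the area of a sector of a circle, so $\delta A \approx \frac{1}{2}r^2\delta\theta$. Dividing
by δt and then taking the limit:

\[
\begin{aligned}
\frac{dA}{dt} &= \frac{1}{2}r^2\frac{d\theta}{dt} \\
&= \frac{1}{2}r^2\dot{\theta} \\
&= \frac{h}{2} \quad \text{where } h \text{ is constant}
\end{aligned}
\]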
Alternatively, as the force acts along the line of the position vector, there can be no torque on the particle, thus the angular
momentum $L = mr^2\dot{\theta}$ must be constant and by the same reasoning as above:
\[ \frac{dA}{dt} = \frac{L}{2m} \quad \text{where } L \text{ and } m \text{ are obviously constant} \]
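
Because the second law only needs the force to be central, it can be illustrated with a force that is not gravity at all. The sketch below assumes a spring-like attraction (radial acceleration −r) purely as an example, steps the motion with a velocity-Verlet scheme, and adds up the thin triangles swept out in equal time windows; the function name and all numerical values are illustrative.

```python
import math

def swept_areas(radial_accel, x, y, vx, vy, dt=1e-3, steps_per_window=500, windows=6):
    """Area swept by the position vector in each of several equal time windows."""
    areas = []
    for _ in range(windows):
        area = 0.0
        for _ in range(steps_per_window):
            r = math.hypot(x, y)
            f = radial_accel(r)          # acceleration along r_hat
            vx += 0.5 * dt * f * x / r   # half kick
            vy += 0.5 * dt * f * y / r
            xn, yn = x + dt * vx, y + dt * vy   # drift
            area += 0.5 * (x * yn - xn * y)     # thin triangle from old to new position
            x, y = xn, yn
            r = math.hypot(x, y)
            f = radial_accel(r)
            vx += 0.5 * dt * f * x / r   # second half kick
            vy += 0.5 * dt * f * y / r
        areas.append(area)
    return areas

# Sample start (assumed): position (1, 0), velocity (0.3, 0.8), so h = 0.8.
print(swept_areas(lambda r: -r, 1.0, 0.0, 0.3, 0.8))
```

Each window lasts 0.5 time units, so every entry should be close to h/2 × 0.5 = 0.2, whatever central force is passed in as `radial_accel`.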

2.3 The Third Law
Theorem 3. $T^2 = \frac{4\pi^2a^3}{GM}$
Proof. Let us consider the elliptical path of an orbiting body which we found is given by the equation:
\[ u = \frac{GM}{h^2} + C\cos(\theta - \phi) \]
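We can find the Orbital Period of this motion using:

\[ T = \frac{2\pi}{\dot{\theta}} \]
but as $\dot{\theta}$ is a function of time this becomes:
\[ T = \int_0^{2\pi} \frac{d\theta}{\dot{\theta}} \]
We can find $\dot{\theta}$ using $\dot{\theta} = hu^2$:
\[
\begin{aligned}
T &= \frac{1}{h}\int_0^{2\pi} \frac{d\theta}{\left(\frac{GM}{h^2} + C\cos(\theta - \phi)\right)^2} \\
&= h^3\int_0^{2\pi} \frac{d\theta}{\left(GM + h^2C\cos(\theta - \phi)\right)^2}
\end{aligned}
\]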

However, this integral is awkward to evaluate directly, so we need to find a more cunning method of proof, which will tell us something
about the particular solutions we would get for different values of the unknowns.
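
For any particular set of values the integral can at least be evaluated numerically. The sketch below does this with a simple midpoint rule and compares the answer with the period formula stated in Theorem 3, using the relations $p = h^2/GM$, $e = h^2C/GM$ and $p = a(1 - e^2)$ from earlier sections. The function names and the sample values of GM, h and C are assumptions for illustration.

```python
import math

def period_by_quadrature(GM, h, C, phi=0.0, n=4000):
    """Midpoint-rule estimate of h^3 * integral of dtheta / (GM + h^2 C cos(theta - phi))^2."""
    dth = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dth
        total += dth / (GM + h * h * C * math.cos(theta - phi)) ** 2
    return h**3 * total

def period_by_third_law(GM, h, C):
    """Period from T^2 = 4 pi^2 a^3 / (GM), with a recovered from p and e."""
    p = h * h / GM
    e = h * h * C / GM
    a = p / (1.0 - e * e)
    return 2.0 * math.pi * math.sqrt(a**3 / GM)

GM, h, C = 1.0, 1.2, 0.3  # sample values (assumed), giving 0 < e < 1
print(period_by_quadrature(GM, h, C), period_by_third_law(GM, h, C))
```

The two numbers should agree closely for any choice that keeps the orbit elliptical, which is a useful cross-check on the algebraic proof that follows.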
We can also find the orbital period using:

\[ T = \frac{A}{\dot{A}} \]

which is essentially just $\text{time} = \frac{\text{distance}}{\text{speed}}$ for 2 dimensional distance. The area of an ellipse we can prove is πab thus: stretch
an ellipse by a factor of $\frac{b}{a}$ in the x-direction, forming a circle. This new shape obviously has an area $\pi b^2$, but as we only
changed the x-values of the shape, any rectangle drawn within it has changed in area by simply a factor of $\frac{b}{a}$. Therefore, the
original area must have been $\frac{a}{b} \times \pi b^2 = \pi ab$.
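We know that $\dot{A} = \frac{h}{2}$ so we can find the period:

\[
\begin{aligned}
T &= \frac{2\pi ab}{h} \\
T^2 &= \frac{4\pi^2a^2b^2}{h^2}
\end{aligned}
\]
We previously found that $h^2 = GMp$ where $p = a(1 - e^2)$:
\[
\begin{aligned}
T^2 &= \frac{4\pi^2a^2b^2}{GMp} \\
&= \frac{4\pi^2a^2b^2}{GMa(1 - e^2)}
\end{aligned}
\]
Note that $b^2 = a^2(1 - e^2)$ so $(1 - e^2) = \frac{b^2}{a^2}$:
\[
\begin{aligned}
T^2 &= \frac{4\pi^2a^2b^2}{GMa\left(\frac{b^2}{a^2}\right)} \\
&= \frac{4\pi^2a^3}{GM}
\end{aligned}
\]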
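
As a closing illustration, the third law can be compared with the planets themselves. The sketch below uses rounded textbook values for the semi-major axes (in astronomical units) and periods (in years) of five planets; these figures are not taken from this document and are only approximate. In these units Earth has a = 1 and T = 1, so the law predicts $T^2/a^3 = 1$ for every planet.

```python
# Rounded textbook values (assumed): semi-major axis in AU, period in years.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
}

def kepler_ratio(a, period):
    """T^2 / a^3, which the third law says is the same for every planet."""
    return period**2 / a**3

for name, (a, period) in PLANETS.items():
    print(name, kepler_ratio(a, period))
```

The ratios should all come out close to 1, with the small differences reflecting the rounding of the input values.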

