
Physics 301 10-Sep-2004 1-1

Introduction

In this course we will cover selected topics in thermodynamics and statistical mechan-
ics. Since we only have twelve weeks, the selection is necessarily limited. You will probably
need to take a graduate course in thermal physics or do studying on your own in order to
gain a thorough knowledge of the subject.

Classical (or maybe “conventional” is better) thermodynamics is an approach to thermal
physics “from the large.” Statistical mechanics approaches the subject “from the
small.” In thermodynamics, one is concerned with things like the pressure, temperature,
volume, composition, etc., of various systems. These are macroscopic quantities and in
many cases can be directly observed or felt by our senses. Relations between these quan-
tities can be derived without knowing about the microscopic properties of the system.

Statistical mechanics takes explicit account of the fact that all systems are made
of large numbers of atoms or molecules (or other particles). The macroscopic properties
(pressure, volume, etc.) of the system are found as averages over the microscopic properties
(positions, momenta, etc.) of the particles in the system.

In this course we will tend to focus more on the statistical mechanics rather than
the thermodynamics approach. I believe this carries over better to modern subjects like
condensed matter physics. In any case, it surely reflects my personal preference!

Some History (mostly taken from Reif)

As it turns out, thermodynamics developed some time before statistical mechanics.


The fact that heat is a form of energy was becoming apparent in the late 1700’s and early
1800’s with Joule pretty much establishing the equivalence in the 1840’s. The second law
of thermodynamics was recognized by Carnot in the 1820’s. Thermodynamics continued
to be developed in the second half of the 19th century by, among others, Clausius, Kelvin
and Gibbs.

Statistical mechanics was developed in the late 19th and early 20th centuries by Clau-
sius, Maxwell, Boltzmann, and Gibbs.

I find all of this rather amazing because at the time of the initial development of
thermodynamics, the principle of energy conservation hadn’t been firmly established. Sta-
tistical mechanics was developed when the existence of atoms and molecules was still being
debated. The fact that macroscopic properties of systems can be understood in terms of
the microscopic properties of atoms and molecules helped convince folks of the reality of
atoms and molecules.

Copyright © 2004, Princeton University Physics Department, Edward J. Groth
Physics 301 10-Sep-2004 1-2

Still more amazing is the fact that the foundations of statistical mechanics were de-
veloped before quantum mechanics. Incorporating quantum mechanics did make some
changes, especially in the counting of states, but the basic approach and ideas of statisti-
cal mechanics remained valid. I suspect that this is a reflection of both the strength and
weakness of statistical methods. By averaging over many molecules you derive results that
are independent of the detailed properties of individual molecules. The flip side is that
you can’t learn very much about these details with statistical methods.

Some Thermodynamic Concepts

From mechanics, we’re familiar with concepts such as volume, energy, pressure (force
per unit area), mass, etc. Two new quantities that appear in thermodynamics are tem-
perature (T ) and entropy (S).

We will find that temperature is related to the amount of energy in a system. Higher
temperature means greater internal energy (usually). When two systems are placed in
contact, energy in the form of heat flows from the higher temperature system to the lower
temperature system. When the energy stops flowing the systems are in thermal equilibrium
with each other and we say they are at the same temperature. It turns out if two systems
are in thermal equilibrium with a third system, they are also in thermal equilibrium with
each other. (This is sometimes called the zeroth law of thermodynamics.) So the concept
of temperature is well defined. It’s even more well defined than that as we will see later in
the course.

Two systems can exchange energy by macroscopic processes, such as compression or
expansion, or by microscopic processes. It is the microscopic process that is called heat
transfer. Consider a collision among billiard balls. We think of this as a macroscopic
process and we can determine the energy transfer involved by making measurements of
a few macroscopic parameters such as the masses and velocity components. If we scale
down by roughly 24 orders of magnitude, we consider a collision between molecules, a
microscopic process. A very large number of collisions occur in any macroscopic time
interval. A typical molecule in the atmosphere undergoes ∼ 10^10 collisions per second. All
these collisions result in the exchange of energy and it is the net macroscopic transfer of
energy resulting from all the microscopic energy transfers that we call heat.

Recall that the first law of thermodynamics is


dU = dQ + dW ,
where dU is the change of (internal) energy of a system, dQ is energy added to the system
via a heat transfer, and dW is energy added by doing work on the system.

Aside: you will often see the heat and work written as d̄Q and d̄W . This is a reminder
that these quantities are not perfect differentials, just small changes. A system (in equi-

Physics 301 10-Sep-2004 1-3

librium) has a well defined internal energy U (P, V, . . .) which can be differentiated with
respect to P , V , . . ., but there is no such thing as the heat or work content of a system.
The heat and work refer to energy transfers during a change to the system.

So the first law really boils down to a statement of energy conservation. You can
change the energy of a system by adding energy microscopically (dQ) or macroscopically
(dW ).

While we’re at it, the second law of thermodynamics can be stated in many ways,
but one way (without worrying too much about rigor) is: it’s impossible to turn heat
completely into work with no other change. So for example, if you build a heat engine
(like a power plant) you can’t turn all the heat you get (from burning coal) completely
into electrical energy. You must dump some waste heat. From this law, one can derive the
existence of entropy and the fact that it must always increase. (Or you can define entropy,
and state the second law in terms of the increase in entropy).

Entropy

Earlier, we mentioned that temperature is related to internal energy. So, a picture
we might carry around is that as the temperature goes up, the velocities of the random
motions of the molecules increase, they tumble faster, they vibrate with greater amplitude,
etc. What kind of picture can we carry around for entropy? Well that’s harder, but as the
course goes along we should develop such a picture.

To start, we might recall that the change in entropy of a system is the heat added to
the system divided by the temperature of the system (all this is for a reversible process,
etc.):
dS = dQ/T .
If heat dQ > 0 is added to system 1, then −dQ must be added to system 2, so the total
entropy change is dS = dQ/T1 − dQ/T2 . To ensure that entropy increases, we need T1 < T2 ;
the first system is cooler than the second system. The
molecules in the first system speed up and the molecules in the second system slow down.
After the heat is transfered (in a direction which makes entropy increase) the distribution
of molecular speeds in the two systems is more nearly the same. The probability that a
fast molecule is from system 1 has increased while the probability that a fast molecule
is from system 2 has decreased. Similarly, the probability that a slow molecule is from
system 2 has increased and the probability a slow molecule is from system 1 has decreased.
In other words, as a result of the increase of entropy, the odds have become more even. So
increasing entropy corresponds to a leveling of the probabilities.

Higher entropy means more uniform probability for the possible states of the system
consistent with whatever constraints might exist (such as a fixed total energy of the sys-
tem). So entropy is related to the number of accessible states of the system and we will

Physics 301 10-Sep-2004 1-4

find that maximizing the entropy is equivalent to assuming that each accessible state is
equally likely.

The first law of thermodynamics can be written as

dU = dQ + dW = T dS − p dV or dS = dU/T + p dV /T ,

where we’ve assumed that the number of particles in the system is constant and the work
done on the system results from pressure acting while the volume changes. Suppose the
system is an ideal gas. Then the energy depends only on temperature

dU = nCV dT ,

where n is the number of moles and CV is the molar specific heat at constant volume which
we take to be constant. The equation of state is

pV = nRT or p/T = nR/V ,

where R is the gas constant. We plug these into the first law and obtain
dS = nCV dT /T + nR dV /V ,

which can be integrated to give

Sf − Si = nCV log(Tf /Ti ) + nR log(Vf /Vi ) .
So, we have an expression for the entropy difference between any two states of an ideal
gas. But how can we relate this to what’s going on at the microscopic level? (Note, unless
otherwise stated, by log, I mean a natural logarithm, loge .)
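
As a quick sanity check of this formula, here is a small Python sketch; the numbers are illustrative choices for this example (one mole of a monatomic gas, in SI units rather than the cgs units used elsewhere in the notes):

    import math

    # Entropy change of an ideal gas between two states: doubling both T and V.
    n   = 1.0               # moles
    R   = 8.314             # J / (mol K), gas constant
    C_V = 1.5 * R           # monatomic ideal gas assumed for this example
    Ti, Tf = 300.0, 600.0   # K
    Vi, Vf = 0.0224, 0.0448 # m^3

    dS = n * C_V * math.log(Tf / Ti) + n * R * math.log(Vf / Vi)
    print(dS)   # about 14.4 J/K; doubling T and V adds (C_V + R) ln 2 per mole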

First, let’s make a distinction between the macroscopic state and the microscopic
state. The macroscopic state is completely specified (at equilibrium!) by a small number of
parameters such as p, V , n, etc. Classically, the microscopic state requires the specification
of the position and velocity of each particle

r1 , v1 , r2 , v2 , . . . , rN , vN ,

where N is the number of particles. N is usually a huge number, comparable to Avogadro’s
number, the number of particles in a mole, N0 = 6.02 × 10^23 . Since there is such a large
ratio of microscopic to macroscopic parameters, it must be that many microscopic states
may produce a given macroscopic state.

How many microscopic states are there? Why do we want to know? The idea is that
the macroscopic state which is generated by the most microscopic states is the most likely.
Suppose we say that
S ∝ log g ,

Physics 301 10-Sep-2004 1-5

where g is the number of microstates corresponding to the macrostate.

This definition has the desirable property that if we have two non-interacting systems
with states g1 and g2 , and we bring them together, the entropy is additive.

S = S1 + S2 .

Since the systems are non-interacting, bringing the systems together does not change the
states available to either system, and any microstate of system 1 may be combined with
any microstate of system 2 to yield a microstate of the combined system. This means that
there are a total of g1 · g2 states altogether. By defining the entropy with a logarithm, we
ensure that it’s additive (at least in this case!).

So let’s count states. At first sight, you might think there are an infinite number of
states because r and v are continuous variables. Well, perhaps if you change them only
slightly, you don’t really get a new state.

Example: Ideal Gas Entropy

Consider one mole of ideal gas at STP. Its volume is V = 22.4 L = 2 × 10^4 cm^3 and it
contains N0 = 6 × 10^23 molecules. How big is a molecule? Answer: about 1 Å = 10^−8 cm.
A molecular volume is Vm ≈ 10^−24 cm^3 . Imagine dividing our total volume V into cells the
size of a molecule. There are M = V /Vm = 2 × 10^28 cells. Let’s specify the micro-position
state by stating which cells have molecules in them. That is, we are going to specify the
positions of the molecules to a molecular diameter. How many states are there? Pick a cell
for the first molecule. This can be done in M ways. Pick a cell for the second molecule.
This can be done in M − 1 ≈ M ways. For the third molecule, there are M − 2 ≈ M
ways. Continue to the N th molecule for which there are M − N ≈ M ways to pick a cell.
Altogether there are about
g ≈ M^N ≈ (10^28)^(10^24) ,

ways to distribute the molecules in the cells. The fact that we get M^N rather than a
binomial coefficient depends on the fact that M ≈ 10^28 ≫ N ≈ 10^24 . Also, we should
probably divide by N ! to account for permutations of the molecules in the cells (since we
can’t distinguish one molecule from another), but leaving this out won’t hurt anything at
this point.

As an example, consider a two dimensional gas containing N = 10 molecules and
M = 100 cells. The figure shows a couple of the possible position microstates of this gas.

Physics 301 10-Sep-2004 1-6

There are
M !/(N ! (M − N )!) = 1.7 × 10^13

distinct states. Our approximation gives 10^20 states; the difference is mostly due to ignoring
the 10! in the denominator.
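
Here is a short Python sketch that reproduces these counts for the toy two-dimensional gas (M and N as above):

    from math import comb

    M, N = 100, 10          # cells and molecules in the two-dimensional toy gas
    exact  = comb(M, N)     # distinct ways to occupy N of the M cells
    approx = M**N           # the M^N estimate used above
    print(exact)            # 17310309456440, about 1.7e13
    print(approx / exact)   # about 5.8e6, mostly accounted for by 10! = 3628800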

Knowing the number of states, we have

S ∝ N log M = N log(V /Vm ) = N log V − N log Vm .

The N log Vm term is a constant for a given amount of gas and disappears in any calculation
of the change in entropy, Sf − Si . Similarly, the N ! correction would also disappear. So
a lot of the (really awful?) approximations we made just don’t matter because things like
the size of a molecule drop out as long as we only consider entropy differences.

The N log V term is the volume term in the ideal gas entropy. By considering the
microstates in velocity, we would obtain the temperature term (and we will later in the
term!).

Physics 301 10-Sep-2004 1-7

What Those Large Numbers Mean

The key aspect of all this is the large number of states! Suppose we have a gas in
equilibrium in a container of volume 2V . Why doesn’t the gas, by chance, wind up in
one-half the container with volume V ? How many states are there in each case?
g1 = (V /Vm )^N ,    g2 = (2V /Vm )^N .

And,

g2 /g1 = 2^N = 2^(Avogadro’s number) = 2^(6×10^23) = 10^(2×10^23) ,

which is a 1 followed by 2 × 10^23 zeros.
Such a state might be legal, but it’s extremely!!! unlikely. The fact that a system in
equilibrium has the maximum possible entropy is nothing more than the fact that the
normal equilibrium state has so many more ways to occur than an obviously weird state,
that the weird state just never occurs.
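
A short Python sketch makes the size of this number concrete (N here is just Avogadro's number, as above):

    import math

    # Number of decimal digits in g2/g1 = 2^N for N = Avogadro's number.
    N = 6.02e23
    print(N * math.log10(2))   # about 1.8e23, i.e. a 1 followed by roughly 2e23 zeros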

Quantum Mechanics and Counting States

You might be thinking that it’s pretty flaky to assert that we need only specify a
molecular position to a molecular diameter. We’ve shown that as long as it’s small, the
resolution has no effect on our calculation of changes in the entropy, so this is OK for
classical mechanics.

If we consider quantum mechanics, then we find that systems are in definite states.
There are many ways to see this. An example is to consider a particle in a box and fit the
wave functions in.

Another way is to consider the uncertainty principle,

∆px ∆x ≥ h̄/2 .

If the state of the system is specified by a point in the x px diagram (phase space), then
one can’t tell the difference between states which are as close or closer than the above. So
we can divide up this phase space into cells of area h̄/2 and we can specify a state by saying
which cells are occupied and which are not.

Physics 301 10-Sep-2004 1-8

As a numerical example, consider air (N2 ) at room temperature. mN2 = 28 mp =
28 × 1.7 × 10^−24 g = 4.8 × 10^−23 g. A typical kinetic energy is mv^2 /2 = 3kT /2 with
T = 300 K and k = 1.38 × 10^−16 erg/K, then E ∼ 6 × 10^−14 erg, v ∼ 5.1 × 10^4 cm/s,
p ∼ 2.4 × 10^−18 g cm/s. The molecular size is about r ∼ 1 Å = 10^−8 cm, so

p r = 2.4 × 10^−26 g cm^2 /s > h̄ = 1 × 10^−27 erg s .

Thus, at room temperature, one can specify the momentum of a molecule to a rea-
sonable fraction of a typical momentum and the position to about the molecular size and
still be consistent with quantum mechanics and the uncertainty principle. That is, room
temperature air is classical, but not wildly separated from the quantum domain. If we
consider lower temperatures or higher densities, electrons in metals, etc. quantum effects
will be more important.
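
Here is the same estimate as a Python sketch in cgs units; the inputs simply mirror the rough numbers quoted above:

    import math

    # Rough classical-vs-quantum check for N2 at room temperature (cgs units).
    k    = 1.38e-16        # erg/K
    T    = 300.0           # K
    m    = 28 * 1.7e-24    # g, N2 mass
    E    = 1.5 * k * T     # typical kinetic energy
    v    = math.sqrt(2.0 * E / m)
    p    = m * v
    r    = 1.0e-8          # cm, roughly a molecular size
    hbar = 1.05e-27        # erg s

    print(E, v, p)         # ~6e-14 erg, ~5e4 cm/s, ~2.4e-18 g cm/s
    print(p * r / hbar)    # ~20 or so: classical, but not far from the quantum domain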

The ideal gas at STP is a “low occupancy” system. That is, the probability that any
particular state is occupied is extremely small. This means that the most likely number
of occupants of a particular state is zero, one occurs very rarely, and we just don’t need
to worry about two at all. This is the classical limit and corresponds to the Boltzmann
distribution.

If we have higher occupancy systems (denser and/or colder), then states occupied
by two or more particles can become likely. At this point quantum mechanics enters.
There are two kinds of particles: integer spin particles called bosons (such as photons or
other particles that we associate with waves) and half-integer spin particles called fermions
(protons, electrons, particles that we associate with matter). An arbitrary number of
bosons can be placed in a single quantum state. This leads to Bose-Einstein statistics and
the Bose distribution. At most one fermion can be placed in a quantum state. This leads
to Fermi-Dirac statistics and the Fermi distribution.

A lot of what we do this term will be learning about and applying the Boltzmann,
Bose, and Fermi distributions!

Physics 301 13-Sep-2004 2-1

Reading

This week, you should read the first two chapters of K&K.

Entropy and the Number of States

As we discussed last time, in the statistical view, entropy is related to the number
of “microstates” of a system. In particular, the entropy is the log of the number of
states that are accessible to the system when it has specified macroscopic parameters
(its “macrostate”).

The fact that entropy always increases is just a reflection of the fact that a system
adjusts its macroscopic parameters, within the allowed constraints, so as to maximize the
number of accessible states and hence the entropy.

So, a large part of statistical mechanics has to do with counting states and another
large part has to do with deriving interesting results from these simple ideas.

Why is the Number of States Maximized?

Good question. We are going to take this is an axiom or postulate. We will not
attempt to prove it. However, we can give some plausibility arguments.

First, remember that we are typically dealing with something like Avogadro’s number
of particles, N0 = 6.02 × 10^23 . As we discussed last time, this makes the probability
distributions very sharp. Or put another way, improbable events are very improbable.

The other thing that happens with a large number of particles has to do with the
randomness of the interactions. Molecules in a gas are in continual motion and collide with
each other (we will see later in the term, how often). During these collisions, molecules
exchange energy, momentum, angular momentum, etc. The situation in a liquid is similar,
one of the differences between a liquid and gas has to do with the distance a molecule
travels between collisions: in a gas, a molecule typically travels many molecular diameters;
in a liquid, the distance between collisions is of the order of a molecular diameter. In a
solid, molecules tend to be confined to specific locations, but they oscillate around these
locations and exchange energy, momentum, etc. with their neighbors.

OK, molecules are undergoing collisions and interactions all the time. As a result, the
distribution of molecular positions and speeds is randomized. If you pick a molecule and
ask things like where is it located, how fast is it going, etc., the answers can only be given
in terms of probabilities and these answers will be the same no matter which molecule you

Physics 301 13-Sep-2004 2-2

pick. (Provided you pick the same kind of molecule - you’ll probably get different answers
for an N2 molecule and an Ar atom, but you’ll get the same answers for two N2 molecules.)

Sticky point: suppose we assume that the world is described by classical mechanics.
Also suppose we know the interactions between molecules in some isolated system. Suppose
we also know all ∼ N0 positions ri and momenta pi (and whatever else we might need to
know to specify the system, perhaps the angular momenta of the molecules, etc.). Then in
principle, the equations of motion can be solved and the solution tells us the exact state
of the system for all future times. That is, there is nothing random about it! How do we
reconcile this with the probabilistic view espoused in the preceding paragraphs?

So far as I know, there are reasonable practical answers to this question, but there are
no good philosophical answers. The practical answers have to do with the fact that one
can’t really write down and solve the equations of motion for ∼ N0 particles. But we can in
principle! A somewhat better answer is that we can only know the initial conditions with
some precision, not infinite precision. As we evolve the equations of motion forward, the
initial uncertainties grow and eventually dominate the evolution. This is one of the basic
concepts of chaos which has received a lot of attention in recent years: small changes in
the initial conditions can lead to large changes in the final result. (Have you ever wished
you could get a 10 day or 30 day weather forecast? Why do they stop with the 5 day
forecast?)

Of course, the fact that we cannot measure infinitely precisely the initial conditions nor
solve such a large number of equations does not mean (still assuming classical mechanics)
that it couldn’t be done in principle. (This is the philosophical side coming again!) So
perhaps there is still nothing random going on. At this point one might notice that it’s
impossible to make a totally isolated system, so one expects (small) random perturbations
from outside the system. These will disturb the evolution of the system and have essentially
the same effect as uncertainties in the initial conditions. But, perhaps one just needs to
include a larger system!

If we recognize that quantum mechanics is required, then we notice that quantum
mechanics is an inherently probabilistic theory. Also, I’m sure you’ve seen or will see in
your QM course that in general, uncertainties tend to grow with time (the spreading out
of a wave packet is a typical example). On the other hand, the system must be described
by a wave function (depending on ∼ N0 variables), whose evolution is determined by
Schroedinger’s equation . . ..

As you can see this kind of discussion can go on forever.

So, as said before, we are going to postulate that a system is equally likely to be in
any state that is consistent with the constraints (macroscopic parameters) applied to the
system.

Physics 301 13-Sep-2004 2-3

As it happens, there is a recent Physics Today article on exactly this subject: trying to
go from the reversibility of classical mechanics to the irreversibility of statistical mechanics.
It’s by G. M. Zaslavsky and is called, “Chaotic Dynamics and the Origin of Statistical
Laws,” 1999, vol. 52, no. 8, pt. 1, p. 39. I think you can read this article and get a feel for
the problem even if some of it goes over your head (as some of it goes over my head).

Aside—Entropy and Information

In recent times, there has been considerable interest in the information content of
data streams and what manipulating (computing with) those data streams does to the
information content. It is found that concepts in information theory are very similar
to concepts in thermodynamics. One way out of the “in principle” problems associated
with classical entropy is to consider two sources of entropy: a physical entropy and an
information or algorithmic entropy. This goes something like the following: if we had
some gas and we had the knowledge of each molecule’s position and momentum, then the
physical entropy would be zero (there’s nothing random about the positions and momenta),
but the algorithmic entropy of our list of positions and momenta would be large (and
equal to the physical entropy of a similar gas whose positions and momenta we hadn’t
determined). What is algorithmic entropy? Essentially, the logarithm of the number of
steps in the algorithm required to reproduce the list.

In 1998, Toby Marriage wrote a JP on this topic. You can find it at

http://physics.princeton.edu/www/jh/juniors fall98.html .

One of our criteria for junior papers is that other juniors should be able to understand the
paper; so I think you might get something out of this paper as well!

Physics 301 13-Sep-2004 2-4

Macroscopic Parameters

We will be most concerned with systems in equilibrium. Such a system can usually
be described by a small number of macroscopic parameters. For example, consider a gas.
If the density of the gas is low enough, it can be described quite well by the ideal gas law
when it’s in equilibrium:
pV = N kT = nRT ,
where p is the pressure, V is the volume, N is the number of molecules, n is the number of
moles, k = 1.38 × 10^−16 erg K^−1 is Boltzmann’s constant or the gas constant per molecule,
R = 8.31 × 10^7 erg mole^−1 K^−1 = N0 k is the gas constant per mole, and T is the absolute
temperature.

Notice that some parameters depend on how much gas we have and some don’t. For
example, if we replicate our original system, so we have twice as much, then V , N , U
(internal energy), and S (entropy) all double; p, and T stay the same. We are ignoring the
contribution of any surface interactions which we expect to be very small. Can you think
why? Parameters which depend on the size of the system are called extensive parameters.
Parameters that are independent of the size of the system are called intensive parameters.

Note that the gas law is not the whole story. If more than one kind of molecule is in
the gas, we need to specify the numbers of each kind: N1 , N2 , . . .. Also, the gas law does
not say anything about the energy of the gas or its entropy. The gas law is an equation of
state, but it needs to be supplemented by other relations in order that we know everything
there is to know about the gas (macroscopically, that is!). For systems more complicated
than a gas, other parameters may be needed.

Another thing to notice is that not all parameters may be specified independently. For
example, having specified N , T , and V , the pressure is determined. Thus there is a certain
minimum number of parameters which specify the system. Any property of the system
must be a function of these parameters. Furthermore, we can often change variables and
use a different set of parameters. For a single component ideal gas, we might have

p = p(N, V, T ), U = U (N, V, T ), S = S(N, V, T ) .

We might imagine solving for T in terms of N , V , and U , and we can write

p = p(N, V, U ), T = T (N, V, U ), S = S(N, V, U ) .

Anything that depends only on the equilibrium state of the system can be expressed as
a function of the parameters chosen. Which parameters are to be used depends on the
particular situation under discussion. For example, if the volume of a system is under our
control, we would likely use that as one of the independent parameters. On the other hand,
many processes occur at constant pressure (with the volume adjusting to what it needs to

Physics 301 13-Sep-2004 2-5

be). In this case, using p rather than V as the independent parameter will probably be
more convenient.

The Temperature

As we remarked, the entropy is the logarithm of the number of microstates accessible
to a system. The number of states must be a function of the same macroscopic parameters
that determine the macrostate of the system. Let’s consider a system described by its
internal energy U , its volume V , and the number of each kind of constituent particle Na ,
Nb , . . .. For the moment, we ignore the possibility of reactions which can change particles
of one kind into another kind. This means that our expressions will have the same form for
Na , Nb , etc., so we’ll just assume a single kind of particle for the time being and assume
we have N of them. Then the number of microstates is

g = g(U, V, N ) .

If we have two systems, that we prevent from interacting, then the number of mi-
crostates of the combined system is

g(U, V, N, U1, V1 , N1 ) = g1 (U1 , V1 , N1 )g2 (U2 , V2 , N2 ) ,

with
U = U1 + U2 , V = V1 + V2 , N = N1 + N2 .
This is straightforward. Any microstate in system 1 can be paired with any microstate in
system 2, so the total number of microstates is just the product of the number for each
system. Also, we have specified the macrostate in terms of extensive parameters, so we
can write the parameters of the combined system as the sum of those for the individual
systems as well as one set of the individual system parameters.

Following K&K, the dimensionless entropy is just

σ(U, V, N, U1 , V1 , N1 ) = log g(U, V, N, U1 , V1 , N1 ) = log g1 g2
= log g1 + log g2 = σ1 (U1 , V1 , N1 ) + σ2 (U2 , V2 , N2 ) .

So far, we haven’t really done anything. We’ve just written down some definitions
twice. We have prevented the two systems from interacting, so nothing exciting can hap-
pen. Now let’s suppose we allow the systems to exchange energy. In other words, we allow
U1 and U2 to vary, but any change in U1 has a compensating change in U2 so that U is
constant. In addition, we prevent changes in volume and numbers of particles, so that V1 ,
V2 , N1 , and N2 remain constant.

Physics 301 13-Sep-2004 2-6

We’re placing the systems in thermal contact, but preventing changes in volume or
particle number. We know what will happen: energy flows from the hotter system to the
cooler system until they come to thermal equilibrium at the same temperature. We know
this from our intuitive understanding of the second law: heat flows from a hot object to a
cold object.

But, what about our postulate that a system maximizes the number of accessible
microstates? In this case, it means that the system adjusts U1 and U2 to maximize the
entropy. So,
 
(∂σ/∂U1 )V,N = 0                                         since σ is maximized
    = (∂σ1 /∂U1 )V,N + (∂σ2 /∂U1 )V,N
    = (∂σ1 /∂U1 )V,N + (∂σ2 /∂U2 )V,N (∂U2 /∂U1 )
    = (∂σ1 /∂U1 )V,N − (∂σ2 /∂U2 )V,N                    since ∆U1 = −∆U2 .

This means

(∂σ1 /∂U1 )V,N = (∂σ2 /∂U2 )V,N ,

after equilibrium has been established.
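
As an illustration of this condition, here is a Python sketch with made-up entropy functions (σ ∼ C log U is an assumption for this toy model, not a result from the notes); the total entropy is largest exactly where the two slopes ∂σ/∂U match:

    import numpy as np

    # Two systems share a fixed total energy U; scan U1 and find the maximum of
    # the total entropy sigma1(U1) + sigma2(U - U1).
    C1, C2, U = 3.0, 1.0, 4.0
    U1 = np.linspace(0.01, U - 0.01, 100001)
    sigma_total = C1 * np.log(U1) + C2 * np.log(U - U1)
    U1_star = U1[np.argmax(sigma_total)]
    print(U1_star)   # ~3.0, i.e. U1 = U*C1/(C1+C2), where C1/U1 = C2/(U - U1)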

So at equilibrium, the rate of change of entropy with respect to energy is the same for
the two systems. If we started out with the two systems and we allowed them to exchange
energy and nothing happened, then we know that ∂σ/∂U was already the same. If system
1 and system 2 are in equilibrium with respect to energy exchange and we allow system
1 to exchange energy with a third system and nothing happens, then ∂σ3 /∂U3 must also
have the same value and nothing will happen if systems 2 and 3 are allowed to exchange
energy. Thus, ∂σ/∂U has properties very similar to those we ascribe to temperature. In
fact, we can define the temperature as:
 
1/τ = (∂σ/∂U )V,N .

This makes τ an intensive quantity (it’s the ratio of two extensive quantities), and it makes
the energy flow in the “correct” direction.

This can be seen as follows: if the two systems are not in equilibrium when we allow
energy to flow, then the entropy of the combined systems must increase: The increase in

Physics 301 13-Sep-2004 2-7

entropy after a very small amount of energy has been transferred is

δσ = δσ1 + δσ2
   = (1/τ1 ) δU1 + (1/τ2 ) δU2
   = (1/τ1 − 1/τ2 ) δU1 > 0 .

So if τ1 < τ2 , δU1 > 0, which means energy flows from the high τ system to the low τ
system.

Finally, if you remember your elementary thermodynamics, recall that dU = T dS −
p dV , which agrees with this definition of temperature.

Units: from our definitions σ is dimensionless and τ has the dimensions of energy. You
recall that temperature T has the dimensions of Kelvins and entropy S has the dimensions
of ergs per Kelvin. As it turns out,
S = kσ ,
τ = kT .
Boltzmann’s constant is really just a scale factor which converts conventional units to the
fundamental units we’ve defined above.

It’s often said that we measure temperature in Kelvins or degrees Celsius or Fahren-
heit because the measurement of temperature was established before the development of
thermodynamics which in turn took place before the connection to energy was fully ap-
preciated. What would you think if you tuned in to the weather channel and found out
that the high tomorrow was expected to be 4.14 × 10^−14 erg or 0.0259 eV??? (If I did the
arithmetic correctly, this is ∼ 80◦ F.)
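
For the curious, the arithmetic as a Python sketch (the eV value of Boltzmann's constant is an added input, not from the notes):

    # Tomorrow's high in fundamental (energy) units.
    k_erg = 1.38e-16                          # erg/K
    k_eV  = 8.62e-5                           # eV/K
    T = (80.0 - 32.0) * 5.0 / 9.0 + 273.15    # 80 F in Kelvin, ~300 K
    print(k_erg * T)                          # ~4.1e-14 erg
    print(k_eV * T)                           # ~0.026 eV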

Actually, to measure a temperature, we need a thermometer. Thermometers make use
of physical properties which vary with temperature. (That’s obvious I suppose!) The trick
is to calibrate the thermometers so you get an accurate measure of the thermodynamic
temperature, τ /k. A recent Physics Today article discusses some of the difficulties in
defining a good practical scale for τ /k < 1 Kelvin. (Soulen, Jr., R. J., and Fogle, W. E.,
1997, Physics Today, vol. 50, no. 8, p. 36, “Temperature Scales Below 1 Kelvin.”)

One other thing to point out here: You’ve no doubt noticed the V, N subscripts. When
you read a thermodynamics text, you’ll often find the statement that this a reminder that
V and N are being held fixed in taking the indicated partial derivative. Well, this is true,
but since we have a partial derivative, which already means hold everything else fixed, why
do we need an extra reminder? Answer: since there are so many choices of independent
variables, these subscripts are really a reminder of the set of independent variables in use.

Physics 301 13-Sep-2004 2-8

Note that we can add energy to a gas keeping the volume and number of particles fixed. In
this case the pressure and temperature rise. Alternatively, we can keep the pressure and
number of particles fixed. In this case the volume and temperature increase. Furthermore,
   
(∂σ/∂U )V,N ≠ (∂σ/∂U )p,N .

When it’s obvious from the context which set of independent variables are in use, I will
probably be lazy and omit the subscripts.

Physics 301 15-Sep-2004 3-1

Pressure

Last lecture, we considered two systems with entropy as a function of internal energy,
volume and number of particles,

σ(U, V, N, U1, V1 , N1 ) = σ1 (U1 , V1 , N1 ) + σ2 (U2 , V2 , N2 ) .

We allowed them to exchange internal energy (that is, they were placed in thermal con-
tact), and by requiring that the entropy be a maximum, we were able to show that the
temperature is

1/τ = (∂σ/∂U )V,N .

Suppose we continue to consider our two systems, and ask what happens if we allow
them to exchange volume as well as energy? (We’re placing them in mechanical as well as
thermal contact.) Again, the total entropy must be a maximum with respect to exchanges
of energy and exchanges of volume. Working through similar mathematics, we find an
expression for the change in total entropy and insist that it be zero (so the entropy is
maximum) at equilibrium,

0 = δσ
  = (∂σ1 /∂U1 ) δU1 + (∂σ2 /∂U2 ) δU2 + (∂σ1 /∂V1 ) δV1 + (∂σ2 /∂V2 ) δV2
  = (∂σ1 /∂U1 − ∂σ2 /∂U2 ) δU1 + (∂σ1 /∂V1 − ∂σ2 /∂V2 ) δV1 ,

from which we infer that at equilibrium,

∂σ1 /∂U1 = ∂σ2 /∂U2 ,

which we already knew, and


∂σ1 /∂V1 = ∂σ2 /∂V2 .
This last equation is new, and it must have something to do with the pressure. Why?
Because, once the temperatures are the same, two systems exchange volume only if one
system can “push harder” and expand while the other contracts. We define the pressure:
 
p = τ (∂σ/∂V )U,N .

We will see later that this definition agrees with the conventional definition of pressure as
force per unit area.

Physics 301 15-Sep-2004 3-2

Chemical Potential

Well, there’s one variable left, guess what we’re going to do now! Suppose we allow
the two systems to exchange particles as well as energy and volume. Again, we want to
maximize the entropy with respect to changes in all the independent variables and this
leads to,

0 = δσ
  = (∂σ1 /∂U1 ) δU1 + (∂σ2 /∂U2 ) δU2 + (∂σ1 /∂V1 ) δV1 + (∂σ2 /∂V2 ) δV2 + (∂σ1 /∂N1 ) δN1 + (∂σ2 /∂N2 ) δN2
  = (∂σ1 /∂U1 − ∂σ2 /∂U2 ) δU1 + (∂σ1 /∂V1 − ∂σ2 /∂V2 ) δV1 + (∂σ1 /∂N1 − ∂σ2 /∂N2 ) δN1 .

So, when the systems can exchange particles as well as energy and volume,

∂σ1 /∂N1 = ∂σ2 /∂N2 .

The fact that these derivatives must be equal in equilibrium allows us to define yet another
quantity, µ, the chemical potential
 
µ = −τ (∂σ/∂N )U,V .

If two systems are allowed to exchange particles and the chemical potentials are unequal,
there will be a net flow of particles until the chemical potentials are equal. Like temperature
and pressure, chemical potential is an intensive quantity. Unlike temperature and pressure,
you probably have not come across chemical potential in your elementary thermodynamics.
You can think of it very much like a potential energy per particle. Systems with high
chemical potential want to send particles to a system with low potential energy per particle.
Note that we can write a change in the entropy of a system, specified in terms of U , V ,
and N as
dσ = (1/τ ) dU + (p/τ ) dV − (µ/τ ) dN ,
or rearranging,
dU = τ dσ − p dV + µ dN .
This is the conservation of energy (first law of thermodynamics) written for a system
which can absorb energy in the form of heat, which can do mechanical pV work, and which
can change its energy by changing the number of particles.

Physics 301 15-Sep-2004 3-3

First Derivatives versus Second Derivatives

You will notice that the quantities defined by first derivatives are not material specific.
For example, whether it’s nitrogen gas or a block of steel, the rate of change of energy
with entropy (at constant volume and particle number) is the temperature.

We’ll eventually define Helmholtz and Gibbs free energies and enthalpy (different
independent variables) and it will always be the case that the first derivatives of these
quantities produce other quantities that are not material specific.

To get to material specific quantities, one must go to second derivatives. For example,
suppose we have a block of steel and nitrogen gas inside a container that is thermally
insulating, fixed in volume, and impermeable. Then at equilibrium,
   
(∂Usteel /∂Vsteel )S,N = (∂Unitrogen /∂Vnitrogen )S,N = −p .

If we make a change to the volume of the container, we might be interested in

∂p/∂V = −∂^2 U/∂V^2 .
This quantity is related to the compressibility of the material. Nitrogen gas is much more
compressible than steel and most of the volume change will be taken up by the gas, not
the steel. In other words, the material specific quantity (second derivative) is different for
the two materials.

Physics 301 15-Sep-2004 3-4

Probability

Here, we will introduce some basic concepts of probability. To start with, one imagines
some experiment or other process in which several possible outcomes may occur. The
possible outcomes are known, but not definite. For example, tossing a die leads to one
of the 6 numbers 1, 2, 3, 4, 5, 6 turning up, but which number will occur is not known
in advance. Presumably, a set of elementary outcomes can be defined and all possible
outcomes can be specified by saying which elementary outcomes must occur. For example,
the tossing of the die resulting in an even number would be made up of the elementary
events: the toss is 2 or the toss is 4 or the toss is 6. A set of elementary events is such
that one and only one event can occur in any repetition of the experiment. For example,
the events (1) the toss results in a prime number and (2) the toss gives an even number
could not both be part of a set of elementary events, because if the number 2 comes up,
both events have occurred!

One imagines that a very large number of tosses of the die take place. Furthermore,
in each toss, an attempt is made to ensure that there is no memory of the previous toss.
(This is another way of saying successive tosses are independent.) Then the probability
of an event is just the fraction of times it occurs in this large set of experiments, that
is, ne /N , where ne is the number of times event e occurs and N is the total number of
experiments. In principle, we should take the limit as the number of trials goes to ∞.
From this definition it is easy to see that the probabilities of a set of elementary events
must satisfy
pi ≥ 0 ,

and

Σ_i pi = 1 ,

where pi is the probability of event i and i is an index that ranges over the possible
elementary events.

The above definition is intuitive, but gives the sense of a process occurring in time.
That is, we throw the same die over and over again and keep track of what happens.
Instead, we can imagine a very large number of dice. Each die has been prepared, as nearly
as possible, to be identical. Each die is shaken (randomized) and tossed independently.
Again, the probability of an event is the fraction of the total number of trials in which
the event occurs. This collection of identically prepared systems and identically performed
trials is called an ensemble and averages that we calculate with this construction are called
ensemble averages.

You are probably thinking that for the die, the probabilities of each of the six elemen-
tary events 1, 2, 3, 4, 5, 6 must be 1/6. Well, they could be, but it’s not necessary! You’ve
heard of loaded dice, right? All that’s really necessary is that each pi be non-negative
and that their sum be 1. Probability theory itself makes no statement about the values of

Physics 301 15-Sep-2004 3-5

the probabilities. The values must come from somewhere else. In general, we just assign
probabilities to the elementary events. Often we will appeal to symmetry or other argu-
ments to assign the probabilities. For example, since a die is a symmetric cube, no face
can be distinguished (mechanically) from any other face and we can plausibly argue that
the probabilities should be equal.

Aside: well, the dots have to be painted on, so the die isn’t perfectly symmetric.
Presumably, differences in the amount and pattern of paint on each face make a negligible
difference in the mechanical properties of the die (such as center of mass, moment of inertia, etc.) so
it’s a very good approximation to regard the die as symmetric. However, some dice have
rather large indentations for each dot. I’ve occasionally wondered if this might make a
detectable difference in the probabilities.

In our discussion of the entropy, we postulated that a system is equally likely to be
in any microscopic state consistent with the constraints. This amounts to assigning the
probabilities and is basically an appeal to symmetry in the same way that assigning equal
probabilities to each face of a die is an appeal to symmetry!

Averages

Assuming there is some numeric value associated with each elementary event, we can
calculate its average value just by adding up all the values and dividing by the total number
of trials—exactly what you think of as an average. So, if event i produces the value yi ,
then its average value is

⟨y⟩ = (1/N )(y1 + y1 + · · · + y1 + y2 + y2 + · · · + y2 + · · ·)    (y1 appears n1 times, y2 appears n2 times, . . .)
    = (1/N )(n1 y1 + n2 y2 + · · ·)
    = Σ_i yi pi .

Quantities like y, whose value varies across an ensemble, are called random variables.

After the average, we will often be most interested in the variance (often called the
square of the standard deviation.) This is just the average value of the square of the
deviation from the average.


var(y) = σy^2 = ⟨(y − ⟨y⟩)^2 ⟩ ,

where σy is the standard deviation in y, not the entropy! The standard deviation is a
measure of the spread about the average.
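
As a concrete example, here is a Python sketch computing the mean and variance of a single die roll from the probabilities; a fair die is assumed here, but any set of non-negative pi summing to 1 would work the same way:

    # Mean and variance of one die roll, computed directly from the probabilities.
    values = [1, 2, 3, 4, 5, 6]
    probs  = [1.0 / 6.0] * 6
    mean = sum(y * p for y, p in zip(values, probs))
    var  = sum((y - mean) ** 2 * p for y, p in zip(values, probs))
    print(mean, var)   # 3.5 and ~2.92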

Physics 301 15-Sep-2004 3-6

Probabilities for Continuous Variables.

Rather than giving one of a finite (or infinite) number of discrete outcomes, an ex-
periment might result in the measurement of a random variable which is continuously
distributed over some finite (or infinite) range. In this case we deal with a probability
density rather than discrete probabilities. For example, we might make a measurement of
a continuous variable x. Then the probability that the measurement falls in a small range
dx around the value x is

Prob(x < result < x + dx) = p(x) dx ,

where p(x) is the probability density. Just as for discrete probabilities, the probability
density must satisfy
p(x) ≥ 0 ,

and

∫_(allowed range of x) p(x) dx = 1 .

We can simply define p(x) = 0 when x is outside the allowed range, so the normalization becomes

∫_{−∞}^{+∞} p(x) dx = 1 .

The average of any function of x, y(x) is defined by


⟨y⟩ = ∫_{−∞}^{+∞} y(x) p(x) dx .

Physics 301 15-Sep-2004 3-7

The Binomial Distribution

As an example of working with probabilities, we consider the binomial distribution.


We have N trials or N copies of similar systems. Each trial or system has two possible
outcomes or states. We can call these heads or tails (if the experiment is tossing a coin),
spin up or spin down (for spin 1/2 systems), etc. We suppose that each trial or system is
independent and we suppose the probability of heads in one trial or spin up in one system
is p and the probability of tails or spin down is 1 − p = q. (Let’s just call these up and
down, I’m getting tired of all these words!)

To completely specify the state of the system, we would have to say which of the N
systems are up and which are down. Since there are 2 states for each of the N systems,
the total number of states is 2N . The probability that a particular state occurs depends on
the number of ups and downs in that state. In particular, the probability of a particular
state with n up spins and N − n down spins is

Prob(single state with n up spins) = p^n q^(N−n) .

Usually, we are not interested in a single state with n up spins, but we are interested in all
the states that have n up spins. We need to know how many there are. There is 1 state
with no up spins. There are N different ways we have exactly one of the N spins up and
N − 1 down. There are N (N − 1)/2 ways to have two spins up. In general, there are C(N, n) = N !/(n! (N − n)!)
different states with n up spins. These states are distinct, so the probability of getting any
state with n up spins is just the sum of the probabilities of the individual states. So
 
Prob(any state with n up spins) = C(N, n) p^n q^(N−n) .
Note that
1 = (p + q)^N = Σ_{n=0}^{N} C(N, n) p^n q^(N−n) ,
and the probabilities are properly normalized.

To illustrate a trick for computing average values, suppose that when there are n up
spins, a measurement of the variable y produces n. What are the mean and variance of y?
To calculate the mean, we want to perform the sum,
⟨y⟩ = Σ_{n=0}^{N} n C(N, n) p^n q^(N−n) .

Consider the binomial expansion


(p + q)^N = Σ_{n=0}^{N} C(N, n) p^n q^(N−n) ,

Physics 301 15-Sep-2004 3-8

and observe that if we treat (for the moment) p and q as independent mathematical
variables and we differentiate both sides of this expression with respect to p (keeping q
fixed), we get
N (p + q)^(N−1) = Σ_{n=0}^{N} n C(N, n) p^(n−1) q^(N−n) .

The RHS is almost what we want—it’s missing one power of p. No problem, just multiply
by p,
N p (p + q)^(N−1) = Σ_{n=0}^{N} n C(N, n) p^n q^(N−n) .

This is true for any (positive) values of p and q. Now specialize to the case where p+q = 1.
Then
N p = Σ_{n=0}^{N} n C(N, n) p^n q^(N−n) = ⟨y⟩ .

A similar calculation gives


var(y) = N pq .
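
Both results are easy to check numerically; here is a Python sketch with illustrative values of N and p (not from the notes):

    from math import comb

    # Direct check of <n> = Np and var(n) = Npq for a binomial distribution.
    N, p = 50, 0.3
    q = 1.0 - p
    prob = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]
    mean = sum(n * pr for n, pr in enumerate(prob))
    var  = sum((n - mean)**2 * pr for n, pr in enumerate(prob))
    print(mean, N * p)       # both 15.0
    print(var,  N * p * q)   # both 10.5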

The fractional spread about the mean is proportional to N^−1/2 . This is typical;
as the number of particles grows, the fractional deviations from the mean of physical
quantities decrease in proportion to N^−1/2 . So with ∼ N0 particles, fractional
fluctuations in physical quantities are ∼ 10^−12 . This is extremely small. Even though the
macroscopic parameters in statistical mechanics are random variables, their fluctuations
are so small that they can usually be ignored. We speak of the energy of a system and
write down a single value, even though the energy of a system in thermal contact with a
heat bath is properly a random variable which fluctuates continuously.

Physics 301 17-Sep-2004 4-1

Example—A Spin System

In the last lecture, we discussed the binomial distribution. Now, I would like to add
a little physical content by considering a spin system. Actually this will be a model for a
paramagnetic material. This system is also a trivial subcase of the Ising model.

We’ll consider a large number, N , of identical spin 1/2 systems. As you know, if
you pick an axis, and measure the component of angular momentum of a spin 1/2 system
along that axis, you can get only two answers: +h̄/2 and −h̄/2. If there’s charge involved,
then there’s a magnetic moment, m, parallel or antiparallel to the angular momentum. If
there’s a magnetic field, B, then this defines an axis and the energy −m · B of the spin
system in the magnetic field can be either −mB if the magnetic moment is parallel to the
field or +mB if the magnetic moment is anti-parallel to the field. To save some writing,
let E = mB > 0 so the energy of an individual system is ±E.

In this model, we are considering only the energies of the magnetic dipoles in an
external magnetic field. We are ignoring all other interactions and sources of energy. For
example, we are ignoring magnetic interactions between the individual systems, which
means we are dealing with a paramagnetic material, not a ferromagnetic material. Also,
we are ignoring diamagnetic effects—effects caused by induced magnetic moments when
the field is established. Generally, if there is a permanent dipole moment m, paramagnetic
effects dominate diamagnetic effects.

Of course, there must be some interactions of our magnets with each other or with
the outside world or there would be no way for them to change their energies and come to
equilibrium. What we’re assuming is that these interactions are there, but just so small
that we don’t need to count them when we add up the energy. (Of course the smaller they
are, the longer it will take for equilibrium to be established. . .)

Our goal here is to work out expressions for the energy, entropy, temperature, in terms
of the number of parallel and antiparallel magnetic moments.

If there is no magnetic field, then there is nothing to pick out any direction, and
we expect that any given magnetic moment or spin is equally likely to be parallel or
antiparallel to any direction we pick. So the probability of parallel should be the same
as the probability of antiparallel should be 1/2: p = 1 − p = q = 1/2. If we turn on
the magnetic field, we expect that more magnets will line up parallel to the field than
antiparallel (p > q) so that the entire system has a lower total energy than it would have
with equal numbers of magnets parallel and antiparallel.

If we didn’t know anything about thermal effects, we’d say that all the magnets should
align with the field in order to get the lowest total energy. But we do know something
about thermal effects. What we know is that these magnets are exchanging energy with
each other and the rest of the world, so a magnet that is parallel to the field, having energy

Physics 301 17-Sep-2004 4-2

−E, might receive energy +2E and align antiparallel to the field with energy +E. It will
stay antiparallel until it can give up the energy 2E to a different magnet or to the outside
world. The strengths of the interactions determine how rapidly equilibrium is approached
(a subject we will skip for the time being), but the temperature sets an energy scale and
determines how likely it is that chunks of energy of size 2E are available.

So suppose that n of the magnets are parallel to the field and N − n are antiparallel.
K&K define the “spin excess”, as the number parallel minus the number antiparallel,
2s = n − (N − n) = 2n − N or n = s + N/2. The energy of the entire system is then

U (n) = −nE + (N − n)E = −(2n − N )E = −2sE .

The entropy is the log of the number of ways our system can have this amount of energy
and this is just the binomial coefficient.
 
σ(n) = log C(N, n) = log [N !/((N/2 + s)! (N/2 − s)!)] .

To put this in the context of our previous discussion of entropy and energy, note that
we talked about determining the entropy as a function of energy, volume, and number of
particles. In this case, the volume doesn’t enter and we’re not changing the number of
particles (or systems) N . At the moment, we are not writing the entropy as an explicit
function of the energy. Instead, the two equations above are parametric equations for the
entropy and energy.

To find the temperature, we need ∂σ/∂U . In our formulation, the entropy and energy
are functions of a discrete variable, not a continuous variable. No problem! We’ll just send
one magnet from parallel to anti-parallel. This will make a change in energy, ∆U , and a
change in entropy, ∆σ and we simply take the ratio as the approximation to the partial
derivative. So,

∆U = U (n − 1) − U (n) = 2E ,
∆σ = σ(n − 1) − σ(n)
   = log C(N, n − 1) − log C(N, n)
   = log [N !/((n − 1)! (N − n + 1)!) × n! (N − n)!/N !]
   = log [n/(N − n + 1)]
   = log [n/(N − n)]                  (the 1 can’t matter if N − n ∼ N0 )
   = log [(N/2 + s)/(N/2 − s)] ,

Physics 301 17-Sep-2004 4-3

where the last line expresses the result in terms of the spin excess. Throwing away the 1
is OK, provided we are not at zero temperature where n = N .

The temperature is then

τ = ∆U/∆σ = 2E / log[(N/2 + s)/(N/2 − s)] .

At this point it’s convenient to solve for s. We have

(N/2 + s)/(N/2 − s) = e^(2E/τ ) ,

and with a little algebra


2s/N = tanh(E/τ ) .
The plot shows this function—fractional spin excess versus E/τ . To the left, thermal
energy dominates magnetic energy and the net alignment is small. To the right, magnetic
energy dominates thermal energy and the alignment is large. Just what we expected!
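
A few representative values of this function, as a Python sketch:

    import math

    # Fractional spin excess 2s/N = tanh(E/tau) for a few values of E/tau = mB/kT.
    for x in (0.1, 1.0, 3.0):
        print(x, math.tanh(x))   # ~0.10, ~0.76, ~0.995: weak to nearly complete alignment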

Suppose the situation is such that E/τ is large. Then the magnets are all aligned. Now
turn off the magnetic field, leaving the magnets aligned. What happens? The system is
no longer in equilibrium. It absorbs energy and entropy from its surroundings, cooling the
surroundings. This technique is actually used in low temperature experiments. It’s called
adiabatic demagnetization. Demagnetization refers to removing the external magnetic field
and adiabatic refers to doing it gently enough to leave the magnets aligned.

Physics 301 17-Sep-2004 4-4

The Boltzmann Factor

An additional comment on probabilities: When the spin excess is 2s, the probabilities
of parallel or antiparallel alignment are:
p = 1/2 + s/N ,    q = 1/2 − s/N .
The ratio of the probabilities is
q/p = (1 − 2s/N )/(1 + 2s/N ) = e^(−2E/τ ) .
This is a general result. The relative probability that a system is in two states with an
energy difference ∆E is just
(Probability of high energy state)/(Probability of low energy state) = e^(−∆E/τ ) = e^(−∆E/kT ) .
This is called the Boltzmann factor. As we’ve already mentioned, this says that energies
≲ kT are “easy” to come by, while energies > kT are hard to come by! The temperature
sets the scale of the relevant energies.
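
As a consistency check between the spin-excess result and the Boltzmann factor, here is a small Python sketch (the value of E/τ is arbitrary):

    import math

    # With 2s/N = tanh(E/tau), the ratio q/p should equal exp(-2E/tau).
    x = 0.7                                   # E/tau
    excess = math.tanh(x)                     # 2s/N
    print((1.0 - excess) / (1.0 + excess))    # q/p
    print(math.exp(-2.0 * x))                 # same number, ~0.247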

The Gaussian Distribution

We’ve discussed two discrete probability distributions, the binomial distribution and
(in the homework) the Poisson distribution. As an example of a continuous distribution,
we’ll consider the Gaussian (or normal) distribution. It is a function of one continuous
variable and occurs throughout the sciences.

The reason the Gaussian distribution is so prevalent is that under very general con-
ditions, the distribution of a random variable which is the sum of a large number of
independent, identically distributed random variables, approaches the Gaussian distribu-
tion as the number of random variables in the sum goes to infinity. This result is called
the central limit theorem and is proven in probability courses.

The distribution depends on two parameters, the mean, µ, (not the chemical poten-
tial!) and the standard deviation, σ (not the entropy!). The probability density is
$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}\,.$$
You should be able to show that
$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = 1\,,$$

$$\langle x\rangle = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,x\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \mu\,,$$
$$\mathrm{var}(x) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}}\,(x-\mu)^2\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \sigma^2\,.$$

Appendix A of K&K might be useful if you have trouble with these integrals. One can
always recenter so that x is measured from µ and rescale so that x is measured in units of
σ. Then the density takes the dimensionless form,

$$p(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,.$$

Sometimes you might need to integrate this density over a finite (rather than infinite)
range. Two related functions are of interest, the error function
$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt = 2\int_0^{\sqrt{2}\,z}\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx\,,$$

and the complementary error function


$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}}\int_z^\infty e^{-t^2}\,dt = 2\int_{\sqrt{2}\,z}^\infty\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx\,,$$
where the first expression (involving t) is the typical definition, and the second (obtained
by changing variables t = x/√2) rewrites the definition in terms of the Gaussian probability
density. Note that erf(0) = 0, erf(∞) = 1, and erf(z) + erfc(z) = 1.

The Gaussian density is just the “bell” curve, peaked in the middle, with small tails.
The error function gives the probability associated with a range in x at the middle of
the curve, while the complementary error function gives probabilities associated with the
tails of the distribution. In general, you have to look these up in tables, or have a fancy
calculator that can generate them. As an example, you might hear someone at a research
talk say, “I’ve obtained a marginal two-sigma result.” What this means is that the signal
that was detected was only 2σ larger than no signal. A noise effect this large or larger will
happen with probability
$$\int_2^\infty \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{2}{\sqrt{2}}\right) = 0.023\,.$$

That is, more than 2 percent of the time, noise will give a 2σ result just by chance. This
is why 2σ is marginal.
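
If you have Python handy, the standard library's math.erfc (the complementary error function) reproduces this number directly:

    # Sketch: probability of a noise fluctuation at or above 2 sigma.
    import math

    p = 0.5 * math.erfc(2 / math.sqrt(2))   # P(x > 2 sigma) for a unit Gaussian
    print(f"P(x > 2 sigma) = {p:.4f}")       # about 0.023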

We’re straying a bit from thermal physics, so let’s get back on track. One of the
reasons for bringing up a Gaussian distribution is that many other distributions approach

a Gaussian distribution when large numbers are involved. (The central limit theorem
might have something to do with this!) For example, the binomial distribution. When
the numbers are large, we can replace the discrete distribution in n with a continuous
distribution. The advantage is that it is often easier to work with a continuous function.

In particular, the probability of a spin excess, s, is

$$p_s = \frac{N!}{(N/2+s)!\,(N/2-s)!}\;p^{N/2+s}\,q^{N/2-s}\,.$$

We need to do something with the factorials. In K&K, Appendix A, Stirling's approximation is derived. For large N,
$$N! \sim \sqrt{2\pi N}\,N^N e^{-N}\,.$$
With this, we have
$$\begin{aligned}
p_s &\sim \sqrt{\frac{2\pi N}{2\pi(N/2+s)\;2\pi(N/2-s)}}\;\frac{N^N}{(N/2+s)^{(N/2+s)}\,(N/2-s)^{(N/2-s)}}\;p^{N/2+s}\,q^{N/2-s}\\
 &= \sqrt{\frac{1}{2\pi N\,(1/2+s/N)(1/2-s/N)}}\;\frac{p^{N/2+s}\,q^{N/2-s}}{(1/2+s/N)^{(N/2+s)}\,(1/2-s/N)^{(N/2-s)}}\\
 &= \sqrt{\frac{1}{2\pi N\,pq}}\,\sqrt{\frac{pq}{(1/2+s/N)(1/2-s/N)}}\;\frac{p^{N/2+s}\,q^{N/2-s}}{(1/2+s/N)^{(N/2+s)}\,(1/2-s/N)^{(N/2-s)}}\\
 &= \sqrt{\frac{1}{2\pi N\,pq}}\left(\frac{p}{1/2+s/N}\right)^{(N/2+s+1/2)}\left(\frac{q}{1/2-s/N}\right)^{(N/2-s+1/2)}\,.
\end{aligned}$$

Recall that the variance of the binomial distribution is N pq, so things are starting to look
promising. Also, we are working under the assumption that we are dealing with large
numbers. This means that s cannot be close to ±N/2. If it were, then we would have a
small number of aligned, or a small number of anti-aligned magnets. So, in the exponents
in the last line, N/2 ± s is a large number and we can ignore the 1/2. Then
$$p_s = \sqrt{\frac{1}{2\pi N\,pq}}\left(\frac{p}{1/2+s/N}\right)^{(N/2+s)}\left(\frac{q}{1/2-s/N}\right)^{(N/2-s)}\,.$$

This is a sharply peaked function. We expect the peak to be centered at s = s0 = ⟨s⟩ = ⟨n⟩ − N/2 = Np − N/2 = N(p − 1/2). We want to expand this function about its maximum. Actually, it will be easier to locate the peak and expand the function if we work with its logarithm,
$$\log p_s = A + \left(\frac{N}{2}+s\right)\left[\log p - \log\left(\frac{1}{2}+\frac{s}{N}\right)\right] + \left(\frac{N}{2}-s\right)\left[\log q - \log\left(\frac{1}{2}-\frac{s}{N}\right)\right]\,,$$

where
$$A = \frac{1}{2}\log\frac{1}{2\pi N\,pq}\,.$$
To locate the maximum of this function, we take the derivative and set it to 0,
$$\frac{d\log p_s}{ds} = \log p - \log\left(\frac{1}{2}+\frac{s}{N}\right) - 1 - \log q + \log\left(\frac{1}{2}-\frac{s}{N}\right) + 1\,.$$

We note that this expression is 0 when s/N = p − 1/2, just as we expected. So this is
the point about which we’ll expand the logarithm. The next term in a Taylor expansion
requires the second derivative

$$\begin{aligned}
\frac{d^2\log p_s}{ds^2} &= -\frac{1}{N/2+s} - \frac{1}{N/2-s}\\
 &= -\frac{1}{Np} - \frac{1}{Nq} = -\frac{1}{Npq}\,,
\end{aligned}$$
where, in the last line, we substituted the value of s at the maximum. We can expand the
logarithm as
$$\log p_s = A - \frac{1}{2}\,\frac{(s-s_0)^2}{Npq} + \cdots$$
where s0 = N(p − 1/2) is the value of s at the maximum. Finally, we let σ² = Npq, exponentiate the logarithm, and obtain
$$p(s) \sim \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(s-s_0)^2/2\sigma^2}\,,$$
where the notation has been changed to indicate a continuous variable rather than a dis-
crete variable. You might worry about this last step. In particular, we have a discrete
probability that we just converted into a probability density. In fact, p(s) ds is the probability that the variable is in the range s → s + ds. In the discrete case, the spacing between values of s is unity, so we require
$$p(s)\,\bigl[(s+1) - s\bigr] = p_s\,,$$

which leads to p(s) = ps . Had there been a different spacing there would be a different
factor relating the discrete and continuous expressions.

All this was a lot of work to demonstrate in some detail that for large N (and not too
large s), the binomial distribution describing our paramagnetic system goes over to the
Gaussian distribution. Of course, expanding the logarithm to second order guarantees a
Gaussian!

In practice, you would not go to all this trouble to do the conversion. The way
you would actually do the conversion is to notice that large numbers are involved, so the

distribution must be Gaussian. Then all you need to know are the mean and variance
which you calculate from the binomial distribution or however you can. Then you just
write down the Gaussian distribution with the correct mean and variance.
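
A short sketch of this recipe (illustrative values of N and p, not needed for anything that follows) compares the exact binomial probabilities with the Gaussian that has the same mean and variance:

    # Sketch: binomial distribution vs. its Gaussian approximation.
    import math

    N, p = 100, 0.5
    q = 1 - p
    mean, var = N * p, N * p * q

    def binom(n):
        return math.comb(N, n) * p**n * q**(N - n)

    def gauss(n):
        return math.exp(-(n - mean)**2 / (2 * var)) / math.sqrt(2 * math.pi * var)

    for n in [40, 45, 50, 55, 60]:
        print(f"n = {n}: binomial = {binom(n):.5f}   Gaussian = {gauss(n):.5f}")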

Returning to our paramagnetic system, we found earlier that the mean value of the
spin excess is
$$s_0 = \frac{N}{2}\,\tanh\frac{E}{\tau}\,.$$
We can use the Gaussian approximation provided s is not too large compared to N/2, which means E ≲ τ. In this case, a little algebra shows that the variance is
$$\sigma^2 = Npq = N\left(\frac{1}{2}\,\mathrm{sech}\,\frac{E}{\tau}\right)^{2}\,.$$

For given E/τ, the actual s fluctuates about the mean s0 with a spread proportional to √N and a fractional spread proportional to 1/√N. A typical system has N ∼ N0, so the fractional spread is of order 10⁻¹² and the actual s is always very close to s0.

While we’re at it, it’s also interesting to apply Stirling’s approximation to calculate
the entropy of our paramagnetic system. Recalling Stirling’s approximation for large N ,

$$N! \sim \sqrt{2\pi N}\,N^N e^{-N}\,.$$

Taking the logarithm, we have

$$\log N! \sim \tfrac{1}{2}\log 2\pi + \tfrac{1}{2}\log N + N\log N - N\,.$$
The first two terms can be ignored in comparison with the last two, so

log N ! ∼ N log N − N .

Suppose our spin system has s0 ≈ 0. Then the entropy is

$$\begin{aligned}
\sigma &\approx \log\frac{N!}{(N/2)!\,(N/2)!}\\
 &\sim N\log N - N - 2\bigl[(N/2)\log(N/2) - (N/2)\bigr]\\
 &= N\log N - N\log(N/2)\\
 &= N\log 2\\
 &= 4.2\times10^{23}\ \text{(fundamental units)}\\
 &= 5.8\times10^{7}\ \mathrm{erg\,K^{-1}}\ \text{(conventional units)}\,,
\end{aligned}$$

where the last two lines assume one mole of magnets.
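
These two numbers are easy to check (Python sketch; the values of N0 and Boltzmann's constant are the standard ones):

    # Sketch: entropy of one mole of magnets with s0 ~ 0.
    import math

    N_A = 6.022e23       # Avogadro's number
    k_B = 1.381e-16      # Boltzmann's constant in erg/K

    sigma = N_A * math.log(2)              # entropy in fundamental units
    S = k_B * sigma                        # conventional entropy
    print(f"sigma = {sigma:.2e} (fundamental units)")     # ~4.2e23
    print(f"S     = {S:.2e} erg/K (conventional units)")  # ~5.8e7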

Reading

K&K, Chapter 3. Also, for a little culture, there is a handout which is a one page
article from Science, 1997, G. Bertsch, vol. 277, p. 1619. It describes melting in clusters
consisting of 139 atoms! So 1/√N ≈ 10%, quite a bit larger than the one part in a trillion
we’ve been talking about when there is a mole of particles! One might expect things to be
less well defined, and sure enough, these clusters seem to melt over an appreciable range
in temperature (rather than a single T ). This may be a case where statistical mechanics
is just on, or just past the edge!

The Boltzmann Factor

Let’s imagine we have an object which is isolated from the rest of the world, so its
energy, entropy, and so on are constant. Furthermore, it has come to equilibrium, so all
parts of the object are at the same temperature, pressure, and so on. We imagine dividing
this object into two pieces: a small piece, called the system; and a large piece called the
heat bath or heat reservoir (or just bath or reservoir). The system is supposed to be
sufficiently small that it’s useful to think of it as being in a single (quantum) state. A
system might be a single atom or molecule, but it could be a larger entity if such an entity
can be described as being in a single state. The remainder of the object, the bath, is
supposed to be very large and might consist of a very large number of similar atoms or
molecules. (In principle, we should let N → ∞, where N is the number of molecules in
the bath.) The system and the bath interact so that the system is in a particular state
only with some probability. We are going to calculate this probability. We want to speak
of the system as having energy E. This means that the interaction between the system
and the bath must be weak in order that we can ascribe a definite energy to the system.

Consider two states of the system, 1 and 2, with energies E1 and E2 . When the system
is in state 1, the bath has energy U − E1 , where U is the total (fixed) energy of the bath
plus system. The number of states that correspond to this energy in the system is

$$g(U - E_1)\times 1 = \exp\bigl[\sigma(U - E_1)\bigr]\times 1\,,$$
where the first factor is the number of states in the bath and the second factor is the
number of states (just 1) of the system. To have the factor 1 here is the reason that we
insisted the system be in a single state. Similarly, the number of states in the case that
the system has energy E2 is

g(U − E2 ) × 1 = exp σ(U − E2 ) × 1 .

Our fundamental assumption is that each state (of the system plus bath together)
that is compatible with the constraints is equally probable. So, the ratio of the probability

that the system is in state 2 to the probability that it is in state 1 is

$$\frac{P(\text{state 2})}{P(\text{state 1})} = \frac{g(U-E_2)}{g(U-E_1)} = e^{\sigma(U-E_2)-\sigma(U-E_1)}\,.$$

Now, E is an energy on the scale of the energy of a single molecule, while U is an energy typical of a mole. So we can expand σ in a Taylor series around E = 0,

$$\sigma(U-E) = \sigma(U) - E\,\frac{\partial\sigma}{\partial U} + \cdots = \sigma(U) - \frac{E}{\tau} + \cdots.$$
Note that the first term on the right is ∼ N times larger than the second term, and we
expect that the first omitted term will be another factor of ∼ 1/N smaller. Inserting this
expansion into our expression for the probability ratio, we have

$$\frac{P(\text{state 2})}{P(\text{state 1})} = e^{\sigma(U) - E_2/\tau - \sigma(U) + E_1/\tau} = e^{-(E_2-E_1)/\tau}\,.$$

The probability ratio is an exponential in the energy difference divided by the temperature.
As noted when we discussed the model paramagnetic system, this is called the Boltzmann
factor, and we’ve just shown that this is a general result.

It may seem we got something for nothing: we made a few definitions, did some
mathematical hocus-pocus and voila, out pops the Boltzmann factor. A key ingredient
is our postulate that all states which satisfy the constraints are equally probable. The
number of states goes up with increasing energy U . The rate of increase (in the logarithm)
is measured by the temperature, τ . When the system is in a state with energy E, it
necessarily received that energy from the heat bath. The more energy the heat bath gives
to the system, the fewer states it “has left.” This is what makes higher energy states of
the system less probable.

In principle, we should consider an ensemble of identically prepared heat baths and


systems. An ensemble in which the probability follows the Boltzmann form (∝ exp(−E/τ ))
is called a canonical ensemble.

Systems with Several Forms of Energy

A single system in a canonical ensemble might have several ways to store energy. For
example, if the system is a gas molecule, it has energy associated with translation of the
center of mass, rotation about the center of mass, and various forms of internal energy such
as vibration or other electronic excitations and interactions among spins. If the energies
can be assigned independently, then the probabilities are independent. For example, if we
want to know the probability that a molecule is in a particular rotational state, with energy
Erot , we can think of the bath as including the translational and internal motions of the
molecule under consideration as well as all the motions of all the other molecules. Similarly,
if we consider the translational motion, with energy Etran , all other forms of energy in this
molecule and all other molecules may constitute the heat bath. The probabilities are

P (Etran ) ∝ e−Etran /τ ,
P (Erot ) ∝ e−Erot /τ ,
··· .

These probabilities are independent, so the probability that our molecule has a total energy
Etot = Etran + Erot + · · · is just the product,

P (Etot ) ∝ e−(Etran + Erot + · · ·)/τ = e−Etot /τ .

Caveat 1: this is the probability for a single state of the system having the specified
total energy. If the system has several different ways of distributing Etot among its sub-
systems, then each way has the same probability (our assumption that each state has the
same probability) and the probability of being in any state with that Etot is just the sum.
Caveat 2: it may not always be true that the energies can be distributed independently.
For example, the moment of inertia, and hence the rotational energy levels, may depend
on the amplitude of vibration.

In (older) thermo texts, you will often see expressions like

$$\frac{P(\text{energy 2})}{P(\text{energy 1})} = \frac{g_2}{g_1}\,e^{-(E_2-E_1)/\tau}\,,$$

where g1 and g2 are called statistical weights. The statistical weight of states with energy
E is just the number of distinct states with that energy. This expression agrees with the
statement above that the probabilities of the independent states add. Note the difference
between the probability of a single state with energy E versus the probability of any state
with energy E.

Diversion: the Maxwell Velocity Distribution

Consider a low density gas. Generally, the forces between molecules are only large
when the molecules are within a few molecular diameters of each other. The mean separa-
tion is many molecular diameters. We expect that a molecule in a low density gas satisfies
our definition of a system which is weakly interacting with its heat bath (all the other
molecules in the gas).

We consider only the translational energy of a molecule. The probability that a molecule is in a particular state with energy E = mv²/2 is just
$$P(E) \propto e^{-E/\tau} = e^{-mv^2/2\tau}\,,$$

where m is the mass and v is the velocity of the molecule. Now we have a problem. This expression applies to a particular state. Usually we think of the energy of translation
and the velocity as continuous variables. What constitutes a particular state? Let’s post-
pone the counting of states for a bit and see how far we can get with some reasonable
assumptions.

First of all, if we want to treat the energy and velocity as continuous variables, we
should be talking about a probability density, where the probability that the energy lies in
the range E → E + dE is given by p(E) dE. This will be the probability of getting E no
matter how many different states have this E, and this probability must be proportional
to the Boltzmann factor and to the number of states that have this E. In other words,

p(E) dE = Ce−E/τ n(E) dE ,


where C is a constant whose value is adjusted to ensure that ∫ p(E) dE = 1, and n(E),
usually called the density of states, is the number of distinct states per unit energy with
energy E. That is, we’ve put our ignorance of the number of states into this function!

We note that the Boltzmann factor, written in terms of velocities, is

$$P(E) \propto e^{-mv^2/2\tau} = e^{-m(v_x^2+v_y^2+v_z^2)/2\tau}\,.$$

This strongly suggests that each component of the velocity is normally distributed. We
might expect this on other grounds as well. For example, if we pick a molecule and measure
the x component of its velocity, we note that vx is a random variable and its value is
determined by the previous history of collisions the molecule has undergone. These are
the kinds of conditions that give rise to Gaussian distributions although proving it in this
case might be painful. We expect that hvx i = 0. Otherwise, our sample of gas would have
a non-zero velocity in the x direction and we are considering a sample at rest (of course!).
Also, the mean square value of vx contributes to the energy and we would expect this to be

related to the temperature as the Boltzmann distribution suggests. Finally (among things
that we expect), there is no reason to prefer one direction over another, so the probability
distributions for vx , vy , and vz should be the same. In fact, the probability of any velocity
component should be the same which means that the joint probability should depend only
on the magnitude and not on the direction of the velocity.

Putting all this together, we guess that the joint probability that the x, y, and z
components of the velocity are in the ranges vx → vx + dvx , vy → vy + dvy , and vz →
vz + dvz , is

$$\begin{aligned}
p(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z
 &= \sqrt{\frac{1}{2\pi\tau/m}}\,e^{-mv_x^2/2\tau}\,dv_x \times \sqrt{\frac{1}{2\pi\tau/m}}\,e^{-mv_y^2/2\tau}\,dv_y \times \sqrt{\frac{1}{2\pi\tau/m}}\,e^{-mv_z^2/2\tau}\,dv_z\,,\\
 &= \left(\frac{1}{2\pi\tau/m}\right)^{3/2} e^{-mv^2/2\tau}\,dv_x\,dv_y\,dv_z\,.
\end{aligned}$$

This is called the Maxwell velocity distribution. Now let’s change variables from Cartesian
components to spherical polar components, v, θ, φ where

vx = v sin θ cos φ ,
vy = v sin θ sin φ ,
vz = v cos θ .

This is just like changing coordinates from x, y, z to r, θ, φ. The result is
$$p(v,\theta,\phi)\,dv\,d\theta\,d\phi = \left(\frac{1}{2\pi\tau/m}\right)^{3/2} e^{-mv^2/2\tau}\,v^2\sin\theta\,dv\,d\theta\,d\phi\,.$$

Just as we expected, the probability is independent of direction, so we can integrate over direction (θ and φ) to obtain the probability density for the speed in any direction. The integral over directions gives 4π and we have
$$p(v)\,dv = 4\pi\left(\frac{1}{2\pi\tau/m}\right)^{3/2} e^{-mv^2/2\tau}\,v^2\,dv\,.$$

Let's change variables one more time, back to the energy E = mv²/2,
$$p(E)\,dE = 2\pi\left(\frac{1}{\pi\tau}\right)^{3/2} e^{-E/\tau}\,E^{1/2}\,dE\,.$$
Comparing this with our earlier expression, we conclude that the number of states per unit energy, n(E), must be proportional to √E.

To recap: knowing that the probability of a state with energy E is proportional to exp(−E/τ), we argued that it was reasonable to expect the velocity components to be normally distributed. We used symmetry arguments to specify the form of the distribution and we used the probability distribution in energy to set the variance. We made some changes in variables to convert our guessed probability density in velocity to a probability density in energy, from which we concluded that the number of kinetic energy states per unit energy is ∝ E^{1/2}.
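
A quick numerical check of this chain of reasoning (a Python sketch with m = τ = 1, an arbitrary choice of units): sample each velocity component from a Gaussian with variance τ/m and verify that the average translational energy comes out close to 3τ/2.

    # Sketch: sample Maxwell velocities and check <E_tran> = 3*tau/2.
    import random

    m, tau = 1.0, 1.0
    sigma_v = (tau / m) ** 0.5        # each component is Gaussian with variance tau/m
    nsamples = 200_000

    total = 0.0
    for _ in range(nsamples):
        vx, vy, vz = (random.gauss(0.0, sigma_v) for _ in range(3))
        total += 0.5 * m * (vx*vx + vy*vy + vz*vz)

    print(f"<E_tran> = {total / nsamples:.4f}   (expect 3*tau/2 = {1.5*tau})")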

Aside—the Gamma Function

I kind of side-stepped the gamma function last week when we were using Stirling’s
approximation. But I think I should at least introduce the function as it’s necessary for
some of the integrals you might need to do. For example, to verify that p(E) above is
properly normalized, you will have to integrate over the energy and this is most naturally
written as a gamma function. Define the gamma function by an integral:
$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt\,.$$

Using integration by parts,
$$\begin{aligned}
\Gamma(z) &= \int_0^\infty t^{z-1} e^{-t}\,dt\,,\\
 &= \Bigl[-t^{z-1} e^{-t}\Bigr]_0^\infty + (z-1)\int_0^\infty t^{z-2} e^{-t}\,dt \qquad (z\ge 1)\,,\\
 &= (z-1)\int_0^\infty t^{z-2} e^{-t}\,dt\,,\\
 &= (z-1)\,\Gamma(z-1)\,.
\end{aligned}$$

The gamma function satisfies a recursion relation. It is straightforward to show that


Γ(1) = 1 and with the recursion relation, one has

n! = Γ(n + 1) .

This relation can be used to extend the definition of factorial to non-integers!

The next most interesting arguments of the gamma function after the integers are half
integers. Using the recursion relation, these can be found if one knows

$$\Gamma(1/2) = \sqrt{\pi}\,.$$

Can you show this?
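
If you'd rather check numerically than analytically, Python's math.gamma and math.factorial make both statements easy to verify:

    # Sketch: check n! = Gamma(n+1) and Gamma(1/2) = sqrt(pi).
    import math

    for n in range(1, 6):
        print(n, math.factorial(n), math.gamma(n + 1))   # these should agree

    print(math.gamma(0.5), math.sqrt(math.pi))           # both ~1.7725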

For completeness, I should mention the incomplete gamma functions. (I couldn’t


resist, sorry!) These arise when the range of integration does not include the entire interval
[0, ∞).

The Partition Function

We’ve had a diversion and an aside, so maybe it’s time to get back to thermal physics.
Consider the function
$$Z(\tau) = \sum_s e^{-E_s/\tau}\,,$$
where the sum extends over all states s. Among other things, this function normalizes the probabilities,
$$P(E_s) = \frac{e^{-E_s/\tau}}{Z(\tau)}\,;\qquad \sum_s P(E_s) = 1\,.$$

Z is called the partition function. We’ve written it as a function of temperature, but it’s
also a function of the energies of all the states which might be functions of macroscopic
parameters of the system.

The average energy of a member of the ensemble is
$$\langle E\rangle = \sum_s E_s\,P(E_s)\,.$$

Consider the derivative of Z with respect to temperature (the energies of the states, Es ,
do not depend on temperature),

$$\begin{aligned}
\frac{\partial Z}{\partial\tau} &= \frac{\partial}{\partial\tau}\sum_s e^{-E_s/\tau}\,,\\
 &= \sum_s e^{-E_s/\tau}\,\frac{\partial}{\partial\tau}\left(\frac{-E_s}{\tau}\right)\,,\\
 &= \sum_s e^{-E_s/\tau}\,\frac{E_s}{\tau^2}\,,\\
 &= \frac{1}{\tau^2}\sum_s E_s\,e^{-E_s/\tau}\,,\\
 &= \frac{Z}{\tau^2}\,\frac{1}{Z}\sum_s E_s\,e^{-E_s/\tau}\,,\\
 &= \frac{Z}{\tau^2}\,\langle E\rangle\,,
\end{aligned}$$

from which we deduce that
$$\langle E\rangle = \tau^2\,\frac{\partial\log Z}{\partial\tau}\,.$$

Our discussion followed K&K except that we use E rather than ε and K&K appear
to change the definition of U in midstream from the energy of heat bath to the average
energy of a state of a single system.
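
A small numerical sketch may make the formula more concrete. For an arbitrary, made-up set of energy levels, the direct ensemble average Σ Es P(Es) and τ² ∂ log Z/∂τ (estimated here by a finite difference) agree:

    # Sketch: <E> computed two ways for a made-up level structure.
    import math

    levels = [0.0, 1.0, 1.0, 2.5]     # illustrative energies (with a degenerate pair)

    def Z(tau):
        return sum(math.exp(-E / tau) for E in levels)

    def avg_E(tau):
        return sum(E * math.exp(-E / tau) for E in levels) / Z(tau)

    tau, h = 1.3, 1e-5
    derivative = (math.log(Z(tau + h)) - math.log(Z(tau - h))) / (2 * h)
    print(avg_E(tau), tau**2 * derivative)    # the two numbers should agree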

Entropy and Probabilities

We’ve been using the idea that the entropy is the logarithm of the number of states
accessible to the system. We’ve also said that each state is equally likely. At this point, I’d
like to make the connection between entropy and probability. This allows one to construct
an expression for the entropy of a system that isn’t in equilibrium. It should also improve
our intuition about the entropy and the partition function.

We might expect that an expression for entropy can be written in terms of the prob-
abilities that a system is in a particular state. If there are g states, and the probabilities
are p1 , p2 , . . . , pg , then we would like to write

σ = σ(p1 , p2 , . . . , pg ) .

One of the things that will guide us in our selection of the function is that the entropy
should be additive (i.e., an extensive parameter). If we have two non-interacting systems
with total numbers of states g1 and g2 , entropies σ1 and σ2 , and probabilities p1i and p2j
(the first index is the system, the second index is the state), we can also think of it as a
single system with g = g1 g2 states, σ = σ1 + σ2 and, since any state in system 1 can be
combined with any state in 2, the probability of a state in the combined system must be
pij = p1i p2j .

Since the probabilities multiply, while the entropies add, we might expect that the
entropy should involve the log of the probability. The first guess might be
$$\sigma_1 = -\sum_i \log p_{1i} \qquad\text{wrong}\,.$$

Since p1i ≤ 1, the minus sign is inserted to make the entropy positive. Why doesn’t this
expression work? There are several reasons. First, suppose one has a totally isolated
system. Then only states with the exact energy of the system are allowed. Disallowed
states have p1i = 0 and this will lead to problems with the logarithm. In addition, with
the above expression, the entropy is not additive. To fix up the problem with p1i = 0, we
might try multiplying by p1i since in the limit x → 0, x log x → 0. Does this make the
entropy additive? Consider
$$\begin{aligned}
\sigma &= -\sum_{i,j} p_{1i}\,p_{2j}\,\log p_{1i}\,p_{2j}\,,\\
 &= -\sum_{i,j} p_{1i}\,p_{2j}\,\log p_{1i} - \sum_{i,j} p_{1i}\,p_{2j}\,\log p_{2j}\,,\\
 &= -\sum_i p_{1i}\,\log p_{1i} - \sum_j p_{2j}\,\log p_{2j}\,,\\
 &= \sigma_1 + \sigma_2\,.
\end{aligned}$$

We used the fact that $\sum_i p_{1i} = \sum_j p_{2j} = 1$. We adopt the following expression for the entropy in terms of the probabilities:
$$\sigma = -\sum_i p_i \log p_i\,,$$

where we can include or omit states with probability 0 without affecting the value of the
entropy.
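
A two-line check of the additivity argument (the probabilities below are arbitrary illustrative choices):

    # Sketch: -sum p log p adds for independent systems.
    import math

    def entropy(probs):
        return -sum(p * math.log(p) for p in probs if p > 0)

    p1 = [0.5, 0.3, 0.2]
    p2 = [0.6, 0.4]
    joint = [a * b for a in p1 for b in p2]   # p_ij = p1_i * p2_j

    print(entropy(p1) + entropy(p2), entropy(joint))   # should be equal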

What set of probabilities maximizes the entropy? The answer depends on the condi-
tions under which we seek a maximum. Suppose we are dealing with a completely isolated
system. Then a state can have non-zero probability only if it has the required energy (and
any other conserved quantities). So let’s limit our sum to allowed states. (Here, we’re
doing this for convenience, not because our expression might blow up!) The other thing
we know is that the probabilities of the allowed states sum to 1. The problem we want to
solve is maximizing the entropy under the constraint that the probabilities sum to 1. How
do we maximize with a constraint? Lagrange multipliers! So we seek to maximize
$$X(p) = \sigma(p) + \lambda\left(1 - \sum_i p_i\right) = -\sum_i p_i\log p_i + \lambda\left(1 - \sum_i p_i\right)\,.$$

We set the derivative of X with respect to pi to zero,

$$0 = \frac{\partial X}{\partial p_i} = -\log p_i - 1 - \lambda\,.$$
This gives
$$p_i = e^{-(\lambda+1)}\,,$$
so the probabilities of all allowed states are the same when the entropy is a maximum. We
also set the derivative of X with respect to λ to 0 which recovers the condition that the
probabilities sum to 1. Solving for λ, we find λ = log g − 1. (g is the number of allowed
states and the number of terms in the sum.) Finally,
$$\sigma = -\sum_i \frac{1}{g}\log\frac{1}{g} = -\log\frac{1}{g} = \log g\,,$$

as we had before.

Now suppose we consider a system which is not isolated, but is in equilibrium thermal
contact with a heat bath so that the average value of its internal energy is U . Again, we
sum only over allowed states. This time states with energies other than U are allowed,

provided the average turns out to be U . We want to find the probabilities that maximize
the entropy under the constraints that the probabilities sum to 1 and average energy is U .
We find the maximum of
$$X(p) = -\sum_i p_i\log p_i + \lambda_1\left(1 - \sum_i p_i\right) + \lambda_2\left(U - \sum_i p_i E_i\right)\,,$$

where Ei is the energy of state i. We want

$$0 = \frac{\partial X}{\partial p_i} = -\log p_i - 1 - \lambda_1 - \lambda_2 E_i\,.$$
It follows that
$$p_i = e^{-1-\lambda_1-\lambda_2 E_i}\,.$$
You are asked to show in the homework that λ2 = 1/τ . So, the probabilities wind up with
a Boltzmann factor!

Consider an ensemble of systems. The case in which the energy of each system is
identical and equal probabilities are assigned is known as the micro-canonical ensemble.
The case in which the energies vary and the probabilities are assigned with Boltzmann
factors is known as the canonical ensemble.

Heat Capacity

In general, the amount of energy added to a system in the form of heat, dQ, and the
rise in temperature dτ resulting from this addition of heat are proportional,

dQ = C dτ ,

where C is the "constant" of proportionality. Why is "constant" in quotes? Answer: the bad news is that it can depend on just about everything. The good news is that over a small range of temperature it doesn't vary too much, so it can be treated as a constant.

One of the things it obviously depends on is the amount of material in the system. To
remove this dependence, one often divides by something related to the amount of material
and then speaks of the specific heat. For example, dividing by the mass of the system gives
the heat capacity per unit mass, c = C/m. Of course, this is only useful if one is dealing
with a homogeneous material. That is you might speak of the specific heat of aluminum
and the specific heat of water, but for boiling water in an aluminum pan you would be
concerned with the heat capacity (which you could calculate from the masses of the pan
and the water and the specific heats from the Handbook of Chemistry and Physics). In the
case of gasses, the amount of material is usually measured in moles and the heat capacity

is divided by the number of moles to give molar specific heat or molar heat capacity. This
is usually a number of order the gas constant. In statistical physics, we often speak of the
heat capacity per molecule. This is usually a number of order Boltzmann’s constant.

All the above is mainly bookkeeping. Of more significance is the fact that heat ca-
pacities can depend on the manner in which the heat is added to a system. Heat added
to a system whose volume is kept constant will result in a different temperature rise than
the same amount of heat added to an identical system whose pressure is kept constant.
The heat capacities in these cases are often written CV , constant volume, or Cp , constant
pressure. Of course, one can think of more complicated way in which heat may be added,
and these can lead to additional definitions of heat capacity.

After a while, you might get the idea that we are overly fixated on heat capacities.
Measurements of heat capacities, aside from their engineering utility, are also a good way
to test theories of atomic and molecular structure and interactions. A theory will predict
the energy levels of a molecular system, and statistical mechanics relates the microscopic
energy levels to macroscopic quantities. Heat added and temperature rise are relatively
easy macroscopic quantities to measure, so these can provide tests of microscopic theories.

For the moment, we will consider the addition of heat at constant volume and particle
number. Then the internal energy changes only because of the addition of the heat. (No
change due to work done, no change due to adding particles.) Then,
 
$$C_V = \frac{dQ}{d\tau} = \frac{\tau\,d\sigma}{d\tau} = \tau\left(\frac{\partial\sigma}{\partial\tau}\right)_V = \left(\frac{\partial U}{\partial\tau}\right)_V\,,$$

where we can use dU for τ dσ since we are considering a constant volume (as indicated
by the subscript) and constant particle number process. Changing to partial derivatives is
just a clean up of the mathematical notation.

It should be noted that in our fundamental units, heat capacities are dimensionless.
However, in conventional units the dimensions are energy per Kelvin.

As an example, we can apply the formalism of the partition function to the param-
agnetic spin system we discussed in lecture 4. Recall, we worked that system out starting
from the entropy. The Boltzmann factor showed up when we considered the numbers of
aligned and anti-aligned magnets. Now we are going to use the partition function (the
sum of Boltzmann factors) to work out the energy from the temperature. An individual
magnetic dipole has energy −E when aligned with the field and +E when anti-aligned.
(This is a slight difference in the definition of E used in the discussion of the partition
function, but it’s the same as we used in the discussion of the paramagnetic system.) Then

Z(τ ) = e+E/τ + e−E/τ = 2 cosh(E/τ ) .

We use the earlier expression to get the average energy per magnet and multiply by N to
get the total energy in the system,

$$\begin{aligned}
U &= N\tau^2\,\frac{\partial\log Z}{\partial\tau}\,,\\
 &= N\tau^2\,\frac{\sinh E/\tau}{\cosh E/\tau}\,\frac{\partial}{\partial\tau}\left(\frac{E}{\tau}\right)\,,\\
 &= N\tau^2\,\tanh(E/\tau)\,\frac{-E}{\tau^2}\,,\\
 &= -NE\,\tanh(E/\tau)\,,
\end{aligned}$$

as before! The heat capacity is ∂U/∂τ , so


CV = −N E tanh E/τ ,
∂τ
∂ E
= −N E sech2 E/τ ,
∂τ τ
 2
E E
= +N sech .
τ τ

At high and low temperatures the heat capacity goes to zero because of the τ −2 dependence
and the exponential dependence, respectively. When the temperature is low, it’s very hard
for the bath to get together enough energy to flip a magnet, so increasing the temperature
of the bath has little effect on the total energy of the system. When the temperature is
high, the magnets are 50% aligned and 50% anti-aligned and the system cannot be made
more random by adding energy.
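
Here is a short sketch that evaluates CV/N = (E/τ)² sech²(E/τ) over a range of temperatures, showing the low- and high-temperature limits and the peak in between (the grid of τ/E values is arbitrary):

    # Sketch: heat capacity per magnet for the two-level (paramagnetic) system.
    import math

    def cv_per_magnet(E_over_tau):
        return E_over_tau**2 / math.cosh(E_over_tau)**2    # sech = 1/cosh

    for t in [0.1, 0.3, 0.5, 0.8, 1.0, 2.0, 5.0, 10.0]:    # t = tau/E
        print(f"tau/E = {t:5.1f}   C_V/N = {cv_per_magnet(1.0/t):.4f}")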

Reversible Processes

We’ve glossed over the fact that adding heat is a process, not a state of the system. So
exactly what happens when we add heat depends on just how we do it. In particular, we
consider reversible processes. For such a process, the “driving force” is infinitesimally small,
the process occurs infinitely slowly, and an infinitesimal change in the driving force can
make the process run equally well in the reverse direction. While the process is underway,
the state of the system (and heat bath) is not an equilibrium state. Because things are
happening so slowly, we can imagine that the system is arbitrarily close to an equilibrium
state. A reversible process can be thought of as a continuous sequence of equilibrium states
carrying the initial state of the system to its final state.

In the case of heat transfer, the driving force is the temperature difference. To obtain
an approximation to a reversible process the temperature difference between the system
and the source of heat must be very small. The process must be carried out slowly so the
system has time to get close to equilibrium (for the heat energy to be carried to all parts
of the system). Furthermore, the system must not lose any energy by unwanted thermal
conduction through the boundaries. (In actual practice, this means you want to perform
the heat transfer as quickly as possible, before an appreciable amount can leak away!) In
any case, a reversible process is a convenient idealization much like (and related to) the
idealization of frictionless mechanical systems.

Food for thought: Do you suppose there’s a connection between the “direction of the
flow of time” and the increase of entropy in irreversible processes?

Pressure

Recall in lectures 2 and 3, we maximized the entropy with respect to changes in energy,
volume, and particle number and made some definitions and came up with

dU = τ dσ − p dV + µ dN ,

with
$$\frac{1}{\tau} = \left(\frac{\partial\sigma}{\partial U}\right)_{V,N}\,,\qquad \frac{p}{\tau} = \left(\frac{\partial\sigma}{\partial V}\right)_{U,N}\,,\qquad -\frac{\mu}{\tau} = \left(\frac{\partial\sigma}{\partial N}\right)_{U,V}\,.$$

Observe that if σ and N are constant, then the pressure is


 
$$p = -\left(\frac{\partial U}{\partial V}\right)_{\sigma,N}\,.$$

What does it mean to keep σ and N constant? N is easy, we just keep the same number
of particles. σ requires a bit more thought. If the entropy doesn’t change, the number of
states doesn’t change. If we perform a volume change at constant entropy, we are changing
the volume without changing the microstates of the system. In other words, the energy
change produced by a change in volume at constant entropy is purely a mechanical energy
change, not a thermal energy change. We know mechanics: the work done (and energy
supplied) when the volume is increased by dV is just −p dV , where p is the ordinary
mechanical pressure, force per unit area. This argument supports our identification of
 
$$p = \tau\left(\frac{\partial\sigma}{\partial V}\right)_{U,N}\,,$$

as the conventional mechanical pressure.

Pressure in a Low Density Gas, I

Suppose we have some gas in equilibrium at low density. By low density, we mean
that the molecules spend most of the time well separated from each other so that they are
weakly interacting. Occasionally, there are collisions between molecules and these serve
to randomize the velocities and maintain thermal equilibrium. To the extent that we can
treat molecules as point masses and ignore their interactions, we have a model for an ideal
gas.

We are going to relate the pressure to the motions of the gas molecules. To start
with, consider a wall of the container holding the gas. The wall is perpendicular to the x
direction. There is a force on this wall because gas molecules are bouncing off it. If the
mass of a molecule is m, and the x component of its velocity is vx , then the change in
momentum experienced by this molecule if it makes a perfectly elastic, perfectly reflecting
collision from the wall is ∆Px = −2mvx , and of course, the wall experiences an equal and
opposite change in momentum.

A perfectly elastic and perfectly reflecting collision is one in which vx changes sign
and vy and vz are unchanged. In other words, the angle of reflection equals the angle of
incidence and the kinetic energy is unchanged by the collision. Do we really think that all
collisions between molecules and walls are like this? Of course not. A gas molecule doesn’t
collide with a “wall,” it collides with one, or a small number, of molecules that are part
of the wall, and this collision is just as “randomizing” as collisions between gas molecules.
On the average, there can be no net change in vy or vz or the gas would start moving in
the y or z directions. In the same way, there can be no net change in the energy caused by
collisions with the wall or the gas would heat up or cool down, contrary to the assumption
of thermal equilibrium. So as a convenience, which is consistent with the average behavior,
we treat the collisions as perfectly elastic and reflecting.

The change in momentum of the wall in one collision is 2mvx. Consider a time interval ∆t and an area of the wall
∆A, and consider molecules with velocities in the range vx →
vx + dvx , vy → vy + dvy , vz → vz + dvz . If, at the beginning
of the time interval, such a molecule is headed in the positive
x direction and contained within the parallelepiped indicated
schematically in the figure, it will collide with the wall during
the time interval. The change in momentum caused by such
molecules is
 
$$\delta P_x = \left[\frac{N}{V}\right]\bigl[v_x\,\Delta t\,\Delta A\bigr]\bigl[p(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z\bigr]\bigl[2mv_x\bigr]\,,$$
where the first factor is the number of molecules per unit vol-
ume, the second factor is the volume, the third gives the frac-
tion of molecules which have the specified velocity, and the last

is the change in momentum per collision. Note that p(vx , vy , vz ) is the probability density
for the velocities and is most likely the Maxwell velocity distribution, but all we require
is that it be independent of direction. We find the total change in momentum during the
time interval by adding up the contribution of all molecules
$$\Delta P_x = \int_0^{+\infty}\!dv_x\int_{-\infty}^{+\infty}\!dv_y\int_{-\infty}^{+\infty}\!dv_z\;\frac{N}{V}\,p(v_x,v_y,v_z)\,2mv_x^2\,\Delta t\,\Delta A\,,$$

where the integral over vx includes only vx > 0 since we want the molecules that are about
to collide with the wall, not those which have just collided. Pressure is the rate of change
of momentum per unit area, so
$$p = \frac{\Delta P_x}{\Delta t\,\Delta A} = \frac{N}{V}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} p(v_x,v_y,v_z)\,mv_x^2\,dv_x\,dv_y\,dv_z = \frac{N}{V}\,m\langle v_x^2\rangle\,.$$

Since the distribution is independent of direction, we dropped the factor of two and extended the range of integration to vx < 0. Also, since the distribution is isotropic, we have m⟨vx²⟩ = m⟨vy²⟩ = m⟨vz²⟩ = (1/3) m⟨v²⟩ = (2/3)⟨Etran⟩, where ⟨Etran⟩ is the average translational kinetic energy per molecule. Finally,
$$pV = \frac{2}{3}\,N\langle E_{\rm tran}\rangle = \frac{2}{3}\,U_{\rm tran}\,,$$
where Utran is the translational kinetic energy of all the gas molecules. If, in fact, the prob-
ability density for the velocities is the Maxwell density, then hEtran i = 3τ /2 (homework!)
and
pV = N τ = N kT = nRT ,
where n is the number of moles and R is the gas constant.

Pressure in a Low Density Gas, II

Look inside a gas, and consider molecules with velocity components in dvx dvy dvz ≡ d³v at (vx, vy, vz) = v; these molecules have an x momentum density (momentum per unit volume) of
$$\frac{\delta P_x}{\delta V} = \frac{N}{V}\,m v_x\,p(\mathbf{v})\,d^3v\,.$$
All this momentum is carried in the x direction at speed vx . Note that positive momentum
is carried in the positive direction while negative momentum is carried in the negative
direction; both contribute to a momentum flux in the x direction. In fact, the flux of x
momentum in the x direction (momentum per unit area perpendicular to x per unit time)
is
$$\frac{\delta P_x}{\delta A\,\delta t} = \frac{N}{V}\,m v_x^2\,p(\mathbf{v})\,d^3v\,.$$
To get the total flux of momentum, we integrate over all velocities and come up with the
same thing we had before. Momentum per unit area per second which is force per unit
area which is pressure is
$$p = \frac{N}{V}\int d^3v\;m v_x^2\,p(\mathbf{v}) = \frac{N}{V}\,m\langle v_x^2\rangle\,.$$

So, why did we bother with this? For one thing, we don’t have to introduce a wall to talk
about pressure. Pressure exists throughout a fluid. Secondly, it’s a first introduction to
calculation of transport phenomena.

In the preceding we considered the transport of x momentum in the x direction. Of course, y momentum is transferred in the y direction and z momentum in the z direction.
These are usually numerically equal to the flux of x momentum in the x direction and we
have an isotropic pressure. One can also transport x momentum in the y and z directions,
y momentum in the x and z directions and z momentum in the x and y directions. For
the simple gas we’ve been considering, these fluxes are zero (can you see why?). However,
in more complicated situations, they might not be zero; they correspond to viscous forces.
In general, we need a nine component object to specify the transport of momentum (a
vector) in any of three directions. This is a second rank tensor, usually called the stress
tensor.

Summary: For an ideal gas, we’ve found an expression relating pressure, volume and
translational kinetic energy. We related the energy to the temperature using the Maxwell
velocity distribution, which was motivated by the Boltzmann factor. However, in writing
down the Maxwell distribution, we “finessed” the issue of counting the states, so we haven’t
really derived the ideal gas law.

States of a Particle in a Box.

In order to count states, we will use quantum mechanics to ensure that we have
discrete states and energy levels. Let’s consider a single particle which is confined to a
cubical box of dimensions L × L × L. You might think that this is artificial and wonder
how the physics could depend on the size of a box a particle is in? It is artificial and it’s
a trick to make the math easier. Once a box is big enough, the physics doesn’t depend on
the size of the box, and the physics we deduce must not depend in any critical way on the
box size when we take the limit of a very big box. (Of course, the volume of a system is
one of the extensive parameters that describes the system and it’s OK for the volume to
enter in a manner like it does in the ideal gas law!) In what follows, we’ll ignore rotational
and internal energy of the particles and drop the “tran” subscript.

As you probably know, particles are described by wave functions


in quantum mechanics. The de Broglie relation between wavelength
(λ) and momentum (P ) is P = h/λ, where h is Planck’s constant.
The wave function for a particle in a box must be zero at the walls of
the box (otherwise the particle might be found right at the wall). In
one dimension, suitable wave functions are ψ(x) ∝ sin(nx πx/L) where
0 ≤ x ≤ L and nx is an integer. This amounts to fitting an integer
number of half wavelengths into the box. (If you recall the Bohr model
of the atom, the idea there is to fit an integral number of wavelengths in
the electron’s orbit.) The momentum is Px = ±nx h/2L = ±nx h̄π/L.
The ± sign on the momentum indicates that the wave function is
a standing wave that is a superposition of travelling waves moving in
both directions. The first three wave functions are shown in the figure. In three dimensions,
we have
ψ(x, y, z) ∝ sin(nx πx/L) sin(ny πy/L) sin(nz πz/L) ,
which corresponds to fitting standing waves in all three directions in the box. The mo-
mentum is
$$\mathbf{P} = \frac{\pi\hbar}{L}\,(\pm n_x, \pm n_y, \pm n_z)\,.$$
The energy is
$$E = \frac{P^2}{2m} = \frac{\pi^2\hbar^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right)\,.$$

Now we are getting close to being able to count states. Consider a three dimensional
coordinate system with axes nx , ny , nz (regarded as continuous variables). There is a
state at each lattice point (nx , ny , nz integers) in the octant where all are non-negative.
Because h̄ is so small, and also because we usually deal with a large number of particles,
we will be concerned with energies where the quantum numbers (the n’s) are large. How

many states are there with energy < E? Answer:


$$N(<E) = \frac{1}{8}\,\frac{4\pi}{3}\left(\sqrt{\frac{E}{\pi^2\hbar^2/2mL^2}}\right)^{3}\,.$$

This is just the volume of an octant of a sphere with radius given by the square root above.
It’s the number of states because each state (lattice point) occupies unit volume. For a
large number of states, we don’t care about the small errors made at the surface of the
sphere. The number of states with energy less than E is the integral of the density of
states, n(E),
$$N(<E) = \int_0^E n(E')\,dE'\,,$$
where E′ is the dummy variable of integration. Differentiate both sides with respect to E using the previous result for N(<E),
$$n(E) = 2\pi\sqrt{E}\left(\frac{\sqrt{2m}\,L}{2\pi\hbar}\right)^{3}\,.$$

Recall that when we discussed the Maxwell distribution, we concluded that the density of states had to be proportional to √E in order to give the Maxwell distribution. Sure enough, that's what we get. All the other factors get absorbed into the overall normalization constant.

It will be instructive to work out a numerical value for the number of states for
a typical case. So let’s suppose that E = 3kT /2 where T = 273 K and the volume
L3 = 22 × 103 cm3 . That is, we consider an energy and volume corresponding to the
average energy and molar volume of a gas at standard temperature and pressure. For m
we’ll use a mass corresponding to an N2 molecule. The result is about 4 × 1030 states,
more than a million times Avogadro’s number.
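
The arithmetic is easy to reproduce (Python sketch; SI values for ħ, k, and the N2 mass, with the molar volume quoted above):

    # Sketch: number of single-particle states with energy < 3kT/2 for N2 at STP.
    import math

    hbar = 1.055e-34          # J s
    k_B  = 1.381e-23          # J/K
    m    = 28 * 1.66e-27      # mass of an N2 molecule, kg
    T    = 273.0
    E    = 1.5 * k_B * T
    V    = 22e-3              # m^3 (the 22 x 10^3 cm^3 used above)
    L2   = V ** (2.0 / 3.0)   # L^2 for the cubical box

    # N(<E) = (1/8)(4 pi/3) [E / (pi^2 hbar^2 / 2 m L^2)]^(3/2)
    N_states = (1/8) * (4*math.pi/3) * (E / (math.pi**2 * hbar**2 / (2*m*L2)))**1.5
    print(f"N(<E) ~ {N_states:.1e}")    # about 4e30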

We have been working out the states for one particle in a box. If we have more
than one particle in the box, and they are non-interacting, then the same set of states is
available to each particle. If the particles are weakly interacting, then these states are a
first approximation to the actual states and we will usually just ignore the interactions
when it comes to counting states. With this approximation, and with a mole of particles
in the box, we’ve found that less than one in a million of the available states are occupied.

Partition Function for a Single Particle in a Box

We can use the same states we’ve just discussed to evaluate the partition function for
a single particle in a box. We have
 
$$Z(\tau) = \sum_{n_x,n_y,n_z} \exp\left[-\frac{\pi^2\hbar^2}{2mL^2\tau}\left(n_x^2+n_y^2+n_z^2\right)\right]\,.$$

We will make a negligibly small error by converting the sums to integrals,
$$\begin{aligned}
Z(\tau) &= \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty dn_x\,dn_y\,dn_z\,\exp\left[-\frac{\pi^2\hbar^2}{2mL^2\tau}\left(n_x^2+n_y^2+n_z^2\right)\right]\,,\\
 &= \left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} L^3 \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty dx\,dy\,dz\,\exp(-x^2-y^2-z^2) \qquad\text{(rescaling variables)}\,,\\
 &= \frac{1}{8}\,4\pi\left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} L^3 \int_0^\infty dr\,r^2 e^{-r^2} \qquad\text{(spherical coordinates, integrated over angles)}\,,\\
 &= \frac{1}{8}\,4\pi\left(\frac{\sqrt{2m\tau}}{\pi\hbar}\right)^{3} V\,\frac{1}{2}\,\Gamma\!\left(\frac{3}{2}\right)\,,\\
 &= \frac{V}{(2\pi\hbar^2/m\tau)^{3/2}} \qquad (\Gamma(3/2) = \sqrt{\pi}/2)\,,\\
 &= n_Q V\,.
\end{aligned}$$

The volume of the system is V = L3 and the quantity that occurs in the last line, nQ
has the dimensions of an inverse volume or a number per unit volume. mτ is the square
of a typical momentum. So h̄2 /mτ ∼ λ2 , and the volume associated with nQ is roughly
a cube of the de Broglie wavelength. This is roughly the smallest volume in which you
can confine the particle (given its energy) and still be consistent with the uncertainty
principle. K&K call nQ the quantum concentration. A concentration is just a number per
unit volume, and nQ can be thought of as the concentration that separates the classical
(lower concentrations) and quantum (higher concentrations) domains. For a typical gas
at STP, the actual concentration n = N/V is much less than the quantum concentration
(by the same factor as the ratio of the number of states to the number of molecules we
calculated earlier), so the gas can be treated classically.

Partition Function for N Particles in a Box

If we have N non-interacting particles in our box, all with the same mass, then (see
the homework) the partition function for the composite system is just the product of the
partition functions for the individual systems,
ZN (τ ) = Z1 (τ )N wrong!
where ZN is the N -particle partition function and Z1 is the 1-particle partition function
calculated in the previous section.

Why is this wrong? Recall that the partition function is the sum of Boltzmann
factors over all the states of the composite system. Writing ZN as a product includes
terms corresponding to molecule A with energy Ea and molecule B with energy Eb and
vice versa: molecule A with Eb and molecule B with Ea . However, these are not different
composite states if molecules A and B are indistinguishable! The product overcounts the
composite states. Any given Boltzmann factor appears in the sum roughly N ! times more
than it should because there are roughly N ! permutations of the molecules among the
single particle states that give the same composite state.

Why “roughly?” Answer, if there are two or more particles in the same single particle
state, then the correction for indistinguishability (what a word!) is not required. However,
we’ve already seen that for typical low density gasses, less than one in a million single
particle states will be occupied, so it’s quite safe to ignore occupancies greater than 1. (If
this becomes a bad approximation, other quantum effects enter as well, so we need to do
something different, anyway!)

To correct the product, we just divide by N!,
$$Z_N(\tau) = \frac{1}{N!}\,Z_1^N = \frac{1}{N!}\,(n_Q V)^N\,.$$

To find the energy, we use
$$\begin{aligned}
\langle U\rangle &= \tau^2\,\frac{\partial}{\partial\tau}\log Z_N\,,\\
 &= \tau^2\,\frac{\partial}{\partial\tau}\left(-\log N! + N\log n_Q + N\log V\right)\,,\\
 &= \tau^2\,\frac{\partial}{\partial\tau}\left(N\log n_Q\right) \qquad\text{(derivatives of } \log N! \text{ and } \log V \text{ give 0)}\,,\\
 &= \tau^2\,\frac{\partial}{\partial\tau}\left[N\log\left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2}\right]\,,\\
 &= \frac{3}{2}\,N\tau\,,
\end{aligned}$$

which expresses the energy of an ideal gas in terms of the temperature. We’ve obtained this
result before, using the Maxwell distribution. Note that our correction for overcounting of
the microstates does not appear in the result.

In lecture 6, we noted that


 
$$p = -\left(\frac{\partial U}{\partial V}\right)_{\sigma,N}\,,$$

and we remarked that keeping the entropy constant while changing the volume of a system
means keeping the probability of each microstate constant. The average energy is
$$\langle E\rangle = \sum_s E_s\,P(E_s)\,,$$

and keeping the probabilities of the microstates constant means that P (Es ) doesn’t change.
Thus, changing the volume at constant entropy changes the energy through changes in
energies of the individual states. For each single particle state,

1
Es ∝ 2
∝ V −2/3 ,
L
which means
dU dEs 2 dV
= =− ,
U Es 3 V
all at constant σ. Then the pressure is
 
∂U 2 U
p=− = .
∂V σ,N 3 V

Again, this is a result we’ve seen before.

Helmholtz Free Energy

Recall the expression for the conservation of energy,

dU = τ dσ − p dV ,

where we have omitted the chemical potential term, since we won’t be contemplating
changing the number of particles at the moment.

If we have a system whose temperature is fixed by thermal contact with a heat bath,
it is convenient to eliminate the differential in the entropy in favor of a differential in the
temperature. For this we use a Legendre transformation—exactly the same kind of trans-
formation used in classical mechanics to go from the Lagrangian, a function of coordinates
and velocities, to the Hamiltonian, a function of coordinates and momenta.

Define the Helmholtz free energy by

F = U − τσ .

Not all authors use the symbol F for this quantity—I believe some use A and there may
be others. In any case,

dF = dU − τ dσ − σ dτ = τ dσ − p dV − τ dσ − σ dτ = −σ dτ − p dV .

If a system is placed in contact with a heat bath and its volume is fixed, then its free
energy is an extremum. As it turns out, the extremum is a minimum. To show this, we
show that when the entropy of the system plus reservoir is a maximum (so equilibrium is
established), the free energy is a minimum.

σ = σr (U − Us ) + σs (Us ) ,
= σr (U ) − Us (∂σr /∂U ) + · · · + σs (Us ) ,
= σr (U ) − Us /τ + σs (Us ) ,
= σr (U ) − (Us − τ σs (Us )) /τ ,
= σr (U ) − Fs /τ .

In the above, the subscripts r and s refer to the reservoir and system and U is the fixed
total energy shared by the reservoir and system. Note that unlike our derivation of the
Boltzmann factor, the system here need not be so small that it can be considered to be
in a single state—it can be a macroscopic composite system. However, it should be much
smaller than the reservoir so that Us ≪ U . Also, the partial derivatives above occur at
fixed volume and particle number. Since σr (U ) is just a number, and τ is fixed, maximizing
σ requires minimizing F .

From dF = −σ dτ − p dV , we see
   
$$\sigma = -\left(\frac{\partial F}{\partial\tau}\right)_V \qquad\text{and}\qquad p = -\left(\frac{\partial F}{\partial V}\right)_\tau\,.$$

Substituting F = U − τ σ in the right equation above,


   
$$p = -\left(\frac{\partial U}{\partial V}\right)_\tau + \tau\left(\frac{\partial\sigma}{\partial V}\right)_\tau\,.$$

This shows that at fixed temperature, if a system can lower its energy by expanding, then
it generates a “force,” pressure, that will create an expansion. This is probably intuitive,
since we are used to the idea that the equilibrium state is a minimum energy state. If the
system can increase its entropy (at fixed temperature) by expanding, this too, generates a
“force” to create an expansion.

Note that
$$\left(\frac{\partial\sigma}{\partial V}\right)_\tau = -\frac{\partial^2 F}{\partial V\,\partial\tau} = -\frac{\partial^2 F}{\partial\tau\,\partial V} = \left(\frac{\partial p}{\partial\tau}\right)_V\,.$$

The outer equality in this line is called a Maxwell relation. These occur often in thermody-
namics and result from the fact that many thermodynamic parameters are first derivatives
of the same thermodynamic “potential,” such as the free energy in this case.

The Free Energy and the Partition Function

Consider F = U − τ σ and σ = −(∂F/∂τ )V . Putting these together, we have

$$\begin{aligned}
U &= F + \tau\sigma\,,\\
 &= F - \tau\left(\frac{\partial F}{\partial\tau}\right)_V\,,\\
 &= -\tau^2\,\frac{\partial(F/\tau)}{\partial\tau}\,.
\end{aligned}$$
Recall the expression for energy in terms of the partition function

$$U = \tau^2\,\frac{\partial\log Z}{\partial\tau}\,.$$
Comparing with the above, we see

$$\frac{F}{\tau} = -\log Z + C\,,$$

where C is a constant independent of τ . In fact, the constant must be zero in order to give
the correct entropy as τ → 0. If τ is sufficiently small, only the lowest energy (E0 ) state
enters the partition function. If it occurs g0 different ways, then log Z → log g0 − E0 /τ
and σ = −∂F/∂τ → ∂(τ log g0 − E0 − τ C)/∂τ = log g0 − C. So C = 0 in order that the
entropy have the correct zero point. Then

F = −τ log Z or Z = e−F/τ .

Remembering that the Boltzmann factor is normalized by the partition function to yield
a probability, we have
$$P(E_s) = \frac{e^{-E_s/\tau}}{Z} = e^{(F-E_s)/\tau}\,.$$

Just for fun, let’s apply some of these results using the partition function for the ideal
gas we derived earlier.

$$\begin{aligned}
F &= -\tau\log Z\,,\\
 &= -\tau\log\bigl((n_Q V)^N/N!\bigr)\,,\\
 &= -\tau\left(N\log n_Q + N\log V - N\log N + N\right) \qquad\text{(Stirling's approx.)}\,,\\
 &= -\tau N\log\left[(m\tau/2\pi\hbar^2)^{3/2}\,(V/N)\right] - \tau N\,.
\end{aligned}$$

With p = −∂F/∂V, we have p = τN/V, the ideal gas law again. For the entropy, σ = −∂F/∂τ,

$$\begin{aligned}
\sigma &= N\log(n_Q V/N) + (3/2)N + N\,,\\
 &= N\left[\log\frac{n_Q}{n} + \frac{5}{2}\right]\,,
\end{aligned}$$

where n = N/V is the concentration. This is called the Sackur-Tetrode formula. Note that if one considers the change in entropy between two states of an ideal gas,
$$\sigma_f - \sigma_i = \frac{3}{2}\,N\log\frac{\tau_f}{\tau_i} + N\log\frac{V_f}{V_i}\,,$$

a classical result which doesn’t contain Planck’s constant. However, to set the zero point
and get an “absolute” entropy, Planck’s constant does enter since it determines the spac-
ing between states and their total number. The overcounting correction does not make
any difference in the pressure above, but it does enter the entropy—as might have been
expected. A final note is that these expressions for an ideal gas do not apply in the limit
τ → 0. (Why?)
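
As a final numerical aside (a Python sketch using standard SI constants; this is only the translational entropy, so it will undershoot a measured entropy that also includes rotation), the Sackur-Tetrode formula for N2 at STP gives roughly 150 J mol⁻¹ K⁻¹:

    # Sketch: translational Sackur-Tetrode entropy for N2 at 273 K, 22.4 L/mol.
    import math

    hbar = 1.055e-34
    k_B  = 1.381e-23
    N_A  = 6.022e23
    R    = 8.314              # J / (mol K)
    m    = 28 * 1.66e-27
    T    = 273.0
    V    = 22.4e-3            # molar volume, m^3

    n_Q = (m * k_B * T / (2 * math.pi * hbar**2)) ** 1.5
    n   = N_A / V
    sigma_per_particle = math.log(n_Q / n) + 2.5
    print(f"sigma/N = {sigma_per_particle:.2f}")
    print(f"S = {R * sigma_per_particle:.1f} J/(mol K)  (translational part only)")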

Reading

This week, we’ll concentrate on the material in K&K chapter 4. This might be called
the thermodynamics of oscillators.

Classical Statistical Mechanics

Recall that statistical mechanics was developed before quantum mechanics. In our
discussions, we’ve made use of the fact that quantum mechanics allows us to speak of
discrete states (sometimes we have to put our system in a box of volume V ), so it makes
sense to talk about the number of states available to a system, to define the entropy as
the logarithm of the number of states, and to speak of maximizing the number of states
(entropy) available to the system. If one didn’t know about quantum mechanics and didn’t
know about discrete states, how would one do statistical mechanics?

Answer: in classical statistical mechanics, the phase space volume plays the role of
the number of states. We’ve mentioned phase space briefly. Here’s a slightly more detailed
description. In classical mechanics, one has the Lagrangian, L(q, q̇, t) which is a function
of generalized coordinates q, their velocities, q̇, and possibly the time, t. The equations of
motion are Lagrange’s equations
 
    d/dt (∂L/∂q̇) − ∂L/∂q = 0 .

Note that q might be a single variable or it might stand for a vector of coordinates. In
the latter case, there is one equation of motion for each coordinate. The Hamiltonian is
defined by a Legendre transformation,

H(q, p, t) = pq̇ − L(q, q̇, t) ,

where
    p = ∂L/∂q̇ ,
is called the momentum conjugate to q (or the canonical momentum). The equations of
motion become (assuming neither L nor H is an explicit function of time)

    q̇ = ∂H/∂p ,     ṗ = −∂H/∂q ,
so that each second order equation of motion has been replaced by a pair of first order
equations of motion.
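
As a small illustration (not part of the notes), the pair of first order equations can be
integrated numerically to trace a phase-space trajectory. The sketch below assumes an
arbitrary example Hamiltonian, H = p²/2m + kq²/2, with made-up unit parameters, and uses a
simple symplectic update.

    import math

    # Example Hamiltonian H = p^2/2m + k q^2/2 (an arbitrary illustrative choice).
    m, k = 1.0, 1.0
    def dH_dp(p): return p / m       # qdot =  dH/dp
    def dH_dq(q): return k * q       # pdot = -dH/dq

    q, p = 1.0, 0.0                  # initial point in phase space
    dt, nsteps = 0.01, 700

    for step in range(nsteps + 1):
        if step % 100 == 0:
            E = p * p / (2 * m) + 0.5 * k * q * q
            print("t=%5.2f   q=%+.3f   p=%+.3f   E=%.4f" % (step * dt, q, p, E))
        # symplectic Euler update: advance p with the old q, then q with the new p
        p -= dH_dq(q) * dt
        q += dH_dp(p) * dt

The printed points trace out an ellipse in the q-p plane; the (approximately conserved)
energy labels which ellipse the system stays on.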

If p and q are given for a particle at some initial time, then the time development of p
and q are determined by the equations of motion. If we consider a single pair of conjugate


coordinates, p and q (i.e., a one-dimensional system), and we consider a space with axes q
and p, then a point in this space represents a state of the system. The equations of motion
determine a trajectory (or orbit) in this space that the system follows. The q-p space is
called phase space. If we consider a 3-dimensional particle, then three coordinates and
three momenta are required to describe the particle. Phase space becomes 6-dimensional
and is a challenge to draw. If we consider N 3-dimensional particles, then phase space
becomes 6N -dimensional. Or, one might draw N trajectories in a 6-dimensional space.

As an example of a phase space that we might actually be able to draw, consider two
1-dimensional particles moving along a common line. Suppose they are essentially free
particles. The phase space coordinates are q1 , p1 , q2 , and p2 . (Subscripts refer to particles
1 and 2.) The figure shows an attempt at drawing a trajectory in the 4-dimensional phase

space. Since we have free particles, p1 and p2 are constants and q1 and q2 are linear
functions of time, for example, q1 = p1 t/m1 . The figure shows a trajectory for q1 and for
q2 . As shown, q1 has a positive momentum, so its trajectory is from left to right, while
q2 has a negative momentum, so its trajectory is from right to left. Each point on the
trajectory of q1 corresponds to exactly one point on the trajectory of q2 —the points are
labeled by time and points at the same time are corresponding points. If we could draw
in four dimensions, there would be a single line representing both particles and we would
not have to point out this correspondence.

Note that at some time, both particles are at the same physical place in space and
simply pass through each other as we’ve drawn the trajectories above. Instead of passing
through each other, suppose they have a collision and “bounce backwards.” This might be
represented by the diagram shown in the next figure. This has been drawn assuming equal


masses, equal and opposite momenta, and an elastic collision. I’m sure you can work out
diagrams for other cases.

Suppose we are considering a low density gas (again!). We certainly would not want
to try to draw a phase space for all the particles in the gas and we certainly wouldn’t
want to try to draw all the trajectories including collisions. In the two particle case we’ve
been considering, suppose we blinked while the collision occurred. What would we see?
The answer (for a suitable blink) is shown in the next figure. We’d see particles 1 and 2

moving along as free particles before we blinked and again after we blinked, but while we
blinked, they changed their momenta. We’ve already mentioned that in a low density gas,
the molecules travel several molecular diameters between collisions while collisions occur
only when molecules are within a few molecular diameters of each other. One way to treat
a low density gas is to treat the molecules as free particles and to try to add in something
to account for the collisions. By looking at the drawing of the collision (where we blinked),
we can see that one way is to say that the particles follow phase space trajectories for a
free particle, except every now and then a trajectory ends and reappears—at random—
somewhere else. The disappearance and reappearance of phase space trajectories does not
really happen; it’s an approximate way to treat collisions.

All this is motivation for the idea that collisions randomize the distribution of particles
in phase space. Of course the randomization must be consistent with whatever constraints
are placed on the system (such as fixed total energy, etc.) In general, if a system is
in thermal contact with another system, we would expect that the exchanges of energy,


required for thermal equilibrium, would result in randomization of the phase space location.

The classical statistical mechanics analog of our postulate that all accessible states
are equally probable is the postulate that all accessible regions of phase space are equally
probable. In other words, a point in phase space plays the role of a state. The leveling
of the probabilities is of course accomplished by the collisions and energy transfers we’ve
just been discussing. It shouldn’t be too hard to convince yourself that any concept we’ve
discussed that doesn’t explicitly require Planck’s constant can just as easily be done with
classical statistical mechanics as with quantum statistical mechanics. Even in cases where
we used h̄, if there is a reasonable mapping of quantum states to phase space volume, the
classical treatment will give the same results as the quantum treatment (but of course,
lacking an h̄).

As an example, suppose we consider a single free particle in a box of volume V in


thermal contact with a reservoir at temperature τ . Our derivation of the Boltzmann factor
did not depend on quantum mechanics, so the probability of finding this particle in a state
with energy E is exp(−E/τ ), just as before. The partition function is no longer a sum
over states, but an integral over phase space volume,
    Z_C = ∫∫∫_{−L/2}^{+L/2} dx dy dz ∫∫∫_{−∞}^{+∞} dp_x dp_y dp_z exp[−(p_x² + p_y² + p_z²)/2mτ] ,

where ZC stands for the classical partition function, and the volume is taken to be a cube
of side L for convenience. The integrals over the coordinates give V and each integral over
a momentum gives √(2πmτ). The result is

    Z_C = V (2πmτ)^(3/2) .
Recall our previous result for the free particle partition function,
    Z_Q = V (2πmτ)^(3/2) (1/h)³ ,
where the subscript Q indicates the “quantum” partition function. Note that the expres-
sion includes h, not h̄. So, in this case, the classical and quantum partition functions are
the same except for a factor of h−3 . Mostly, we use the logarithm of the partition function.
This means that many results that we derive from the partition function will not depend
on whether we use ZC or ZQ . For example, the energy is τ 2 ∂(log Z)/∂τ , so the h3 factor
has no effect on the energy. An important exception is the entropy. The entropy is missing
an additive constant. This has no effect on relative entropy, but it does matter for absolute
entropy. (How would you measure absolute entropy?) By comparing the two expressions
one sees that for each pair of conjugate phase space coordinates, such as x and px , one
should assign the volume h to a single state. Using classical considerations, we can (at
least for a low density gas) reproduce the quantum results simply by using
    (dx dp_x/h) (dy dp_y/h) (dz dp_z/h) ,

as the appropriate volume in phase space. This works in general provided the average
occupancy is very low.

We see that the Maxwell velocity distribution falls out of the classical approach, and the
factor of h⁻³, even if included, would be absorbed into the normalization factor for the
probability density.

A Classical Harmonic Oscillator

Now suppose we have a one dimensional harmonic oscillator, whose Hamiltonian is

    H = p²/2m + kq²/2 = (1/2m)(p² + m²ω²q²) ,

where ω² = k/m and ω is the natural frequency of the oscillator. Suppose this oscillator is in
thermal equilibrium at temperature τ . What is the mean value of its energy? One way
we can work this out is to take the Boltzmann factor as the probability density in phase
space. So

    P(E) dq dp = C exp[−(p² + m²ω²q²)/2mτ] dq dp ,
where uppercase P is used for the probability density to distinguish it from momentum. The
normalization constant, C, is set by requiring that the integral of the probability density
over phase space be unity. The position and momentum integrals can be done separately
and lead to normalization factors mω/√(2πmτ) for the position coordinate and 1/√(2πmτ)
for the momentum coordinate. To get the average energy of this oscillator, we have
    ⟨E⟩ = C ∫dq ∫dp (1/2m)(p² + m²ω²q²) exp[−(p² + m²ω²q²)/2mτ] ,

        = [mω/√(2πmτ)] ∫dq (m²ω²q²/2m) e^(−m²ω²q²/2mτ) + [1/√(2πmτ)] ∫dp (p²/2m) e^(−p²/2mτ) ,

        = (τ/2)(1/√2π) ∫dx x² e^(−x²/2) + (τ/2)(1/√2π) ∫dy y² e^(−y²/2) ,

        = τ/2 + τ/2 = τ .

(All the integrals run from −∞ to +∞.)
2 2
Of course, we could obtain the same result by calculating the partition function and going
from there. Note that the harmonic oscillator has two ways to store energy: as kinetic
energy or as potential energy. Each of these can be considered a degree of freedom and
each stores, on the average, τ /2 = kT /2. This is an example related to equipartition of
energy discussed by K&K in chapter 3.
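
The same average can be obtained by brute-force sampling. The sketch below (an illustration,
not from the notes; the unit values of m, ω, and τ are arbitrary assumptions) draws q and p
from the Gaussian probability density above and checks that each degree of freedom carries τ/2.

    import math, random

    m, omega, tau = 1.0, 1.0, 1.0    # arbitrary example values
    N = 200000
    random.seed(0)

    kin = pot = 0.0
    for _ in range(N):
        # P(p) ~ exp(-p^2/2m tau)  and  P(q) ~ exp(-m w^2 q^2/2 tau)
        p = random.gauss(0.0, math.sqrt(m * tau))
        q = random.gauss(0.0, math.sqrt(tau / (m * omega**2)))
        kin += p * p / (2.0 * m)
        pot += 0.5 * m * omega**2 * q * q

    print("<kinetic>   =", kin / N)        # should be close to tau/2
    print("<potential> =", pot / N)        # should be close to tau/2
    print("<E>         =", (kin + pot) / N, "  expected:", tau)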


Classical Cavity Radiation

We’re familiar with the idea that hot objects radiate energy. There are the expressions
“red hot” and “white hot” denoting very hot objects. The color comes from the appear-
ance of the objects and is the color of the electromagnetic energy radiated by the object.
Allowing two objects to exchange radiation is a way to place them in thermal contact. For
ordinary temperatures this may not be a very efficient method of heat exchange compared
to conduction or convection, but at high temperatures (in stars, for example) it can become
the dominant method of energy transfer. Also, when working at cryogenic temperatures,
one needs to shield the experiment from direct exposure to room temperature radiation
because this can be an important heat load on the cold apparatus.

How can we make a perfect absorber of radiant energy? If we could, what would it
look like? If it absorbed all the radiation that hit it, then nothing would be reflected back,
so we couldn’t see anything and it would appear black. A perfect absorber is called a
blackbody. We could make a perfect absorber by making a large cavity with a small hole
connecting the cavity to the outside world. Then, as seen from the outside, any radiation
hitting the hole, passes through the hole and bounces around inside the cavity until it is
absorbed. By making the cavity sufficiently big and the hole sufficiently small, we can
make the chances of the radiation coming back out the hole before it’s absorbed as small
as we like. (Of course, when the wavelength of the radiation is comparable to or larger
than the size of the hole, then we have to worry about diffraction...)

A cavity containing a hole must also radiate energy. If not, it would heat up to arbi-
trarily high temperatures (and of course, this violates the second law of thermodynamics
by transferring heat from a cold object to a hot object with no other change). So when a
cavity is in thermal equilibrium with its surroundings, it must radiate energy out through
the hole at the same rate that energy is absorbed through the hole. A hole has no proper-
ties, so the radiated spectrum (energy at each frequency or wavelength or color) can only
depend on the temperature. This spectrum is called the blackbody (or thermal or cavity)
radiation spectrum. A real physical object which is a perfect absorber must radiate the
same spectrum. We can place a physically black object into thermal contact with a cavity
radiator. In order to avoid violating the second law, the energy absorbed must equal the
energy radiated. If we consider a filter which is perfectly transparent in some frequency
range and perfectly reflecting outside this range and we insert this filter between the two
objects, then we conclude that the perfect absorber and the cavity radiator must radiate
the same spectrum (the same amount of energy at each frequency). Finally, real absorbers
are not perfect. If in equilibrium, a fraction a of the incident radiation is absorbed, with
the rest being reflected, then it must be the case that it emits the fraction a of the ideal
blackbody radiation; otherwise we can arrange to violate the second law. By using
our filter again, we conclude that if it absorbs the fraction a(ω) at frequency ω, it must
radiate the same fraction e(ω) = a(ω) of the ideal blackbody radiation spectrum. Jargon:
a is called the absorptivity and e is called the emissivity.


The upshot of all this is that there is a universal radiation spectrum that depends
only on temperature and is called the blackbody, thermal or cavity radiation spectrum.
Let us try to calculate this spectrum.

The spectrum is produced by electromagnetic fields inside the cavity. These fields
contain energy and they are in equilibrium with the walls of the cavity at temperature τ .
To make life simple, let’s suppose our cavity is a cube of side L. You may recall from your
studies of electromagnetism that the fields in the cavity can be divided into modes with
each mode oscillating like a harmonic oscillator. Electromagnetic energy oscillates back
and forth between the electric field (like the position coordinate in a standard harmonic
oscillator) and the magnetic field (like the momentum coordinate).

Each mode can store energy independently, so each mode contributes a harmonic os-
cillator term to the Hamiltonian of the cavity. Different modes have different frequencies
and this is where the spectrum comes from. So we are getting close: we’ve already calcu-
lated the average energy in a harmonic oscillator; all we have to do now is enumerate the
modes and their frequencies and we’ll have the calculation of the blackbody spectrum.

As you may know, the electric field for a given mode in a perfectly conducting cavity
has components of the form

Ez = E0 sin(ωt) sin(nx πx/L) sin(ny πy/L) cos(nz πz/L) ,

where E0 is the amplitude of the mode (an electric field, not an energy!), ω is the frequency
of oscillation, and nx , ny , and nz are integers. The cavity is assumed to run from 0 to L in
each coordinate. The sine terms ensure that electric field component parallel to a perfectly
conducting wall is zero at the wall. (Electric field is always perpendicular to the surface
of a perfect conductor). The cosine term ensures that the magnetic field (related to the
E-field by Maxwell’s equations) has no perpendicular component at the wall. The integers
nx , ny , nz are related to the number of half wavelengths that fit in the cavity. Maxwell’s
equations tell us that
    ω² = (π²c²/L²)(n_x² + n_y² + n_z²) .
We can think of a mode as a wave bouncing around inside the cavity and when it has
bounced around once, it must be in phase with the original wave in order to have resonance
and a mode. This is another way of seeing how the integers arise. We also know that any
electromagnetic wave in a vacuum has two polarizations. So for each set of positive integers
there are two independent oscillators. And each oscillator has average energy τ according
to our earlier calculation.

To calculate the spectrum, we need to know how many sets of integers correspond to
a given range in ω. We start by considering the number that correspond to a frequency
less than a particular ω. If we consider a three dimensional space with axes nx , ny , and


n_z, then the number is the number of lattice points in this space in the positive octant
with √(n_x² + n_y² + n_z²) < r = ωL/πc. This is just the volume of 1/8 of a sphere of radius r,

    N(< ω) = 2 · (1/8)(4π/3)(ωL/πc)³ ,

where the factor of 2 accounts for the two polarizations. This should be reminding you
very strongly of what we did when counting the states of a particle in a box. The number
of oscillators in the frequency range ω → ω + dω is found by differentiating the above, and

    n(ω) dω = (V ω²/π²c³) dω ,

where V has been inserted in place of L3 .

Now, each oscillator has average energy τ , so the energy per unit frequency in the
cavity is
    (dU/dω) dω = (V ω²τ/π²c³) dω ,
and the total energy in the cavity is found by integrating over all frequencies,
    U = ∫₀^∞ (V ω²τ/π²c³) dω = ∞ !!!

This says there is an infinite energy in the cavity and this can’t be right! Also the energy
per unit frequency result says that the energy is concentrated towards the high frequencies
in proportion to ω 2 . This is called the ultraviolet catastrophe. It says that if you made a
cavity and put a small hole in it to let the radiation out, you’d be instantly incinerated by
the flux of X-rays and gamma rays! Of course, we’re all still here, so this doesn’t happen.
Where did we go wrong? This is the same question physicists were asking themselves in
the latter part of the nineteenth and the early part of the twentieth century.

The answer is, we didn’t go wrong, at least as far as classical physics is concerned. Ev-
erything we did leading up to infinite energy density in a cavity is perfectly legal according
to classical physics. It is one of the many contradictions that arose around the turn of the
century that led to the development of quantum mechanics. One of the things to note is
that cavity radiation could be measured and at low frequencies it gave results in agreement
with what we’ve just derived. That is, the spectral density (1/V ) dU/dω is proportional
to τ and to ω 2 . For higher frequencies the measured result falls far below our calculation.
The ω 2 region is called the Rayleigh-Jeans part of the spectrum. What’s needed to cure
our calculation is a way to keep the high frequency modes from being excited. We shall see
that it is the discreteness of the energy levels provided by quantum mechanics that keeps
these modes quiescent.

Physics 301 29-Sep-2004 9-1

A Quantum Harmonic Oscillator

The quantum harmonic oscillator (the only kind there is, really) has energy levels
given by
En = (n + 1/2)h̄ω ,
where n ≥ 0 is an integer and the E0 = h̄ω/2 represents zero point fluctuations in the
ground state. We are going to shift the origin slightly and take the energy to be

En = nh̄ω .

That is, we are going to ignore zero point energies. The actual justification for this is a
little problematic, but basically, it represents an unavailable energy, so we just leave it out
of the accounting.

The partition function is then



    Z = Σ_{n=0}^{∞} e^(−nh̄ω/τ) = 1/(1 − exp(−h̄ω/τ)) = exp(h̄ω/τ)/(exp(h̄ω/τ) − 1) .

(This is just the geometric series Σ xⁿ with x = exp(−h̄ω/τ).) We calculate the average
energy of the oscillator

    ⟨E⟩ = τ² ∂(log Z)/∂τ = h̄ω/(exp(h̄ω/τ) − 1) .

It’s instructive to consider two limiting cases. First, consider the case h̄ω ≪ τ. That
is, the energy level spacing of the oscillator is much less than the typical thermal energy.
In this case, the denominator becomes

    e^(h̄ω/τ) − 1 ≈ 1 + h̄ω/τ + ··· − 1 = h̄ω/τ .
If we plug this into the expression for the average energy, we get

    ⟨E⟩ → τ ,     (h̄ω ≪ τ) ,

just as we found for the classical case. On the other hand, if h̄ω/τ ≫ 1, then the exponential
in the denominator is large compared to unity and the average energy becomes

    ⟨E⟩ → h̄ω e^(−h̄ω/τ) ,     (h̄ω ≫ τ) .

In other words, the average energy is “exponentially killed off” for high energies. Recall
that we needed a way to keep the high energy modes quiescent in order to solve our cavity
radiation problem!
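
A quick numerical comparison (a sketch, not part of the notes) of the quantum average energy
with the classical value τ shows both limits at once; the only parameter is x = h̄ω/τ and the
sample values below are arbitrary choices.

    import math

    # <E>/tau for the quantum oscillator is x/(e^x - 1) with x = hbar*omega/tau;
    # the classical (equipartition) value is 1.
    for x in (0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0):
        quantum = x / math.expm1(x)
        print("hbar w / tau = %5.2f   <E>/tau (quantum) = %.5f   (classical) = 1" % (x, quantum))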


Quantum Cavity Radiation

Now that we have given a treatment of the quantum harmonic oscillator, we can return
to the cavity radiation problem. Note that our counting of states is basically the counting
of electromagnetic modes. These came out quantized because we were considering standing
electromagnetic waves. That is, classical considerations gave us quantized frequencies and
quantized modes. With quantum mechanics, we identify each mode as a quantum oscillator
and realize that the energies (and amplitudes) of each mode are quantized as well.

In addition, with quantum mechanics, we know that particles have wave properties and
vice-versa and that quantum mechanics associates an energy with a frequency according
to E = h̄ω. So given that we have a mode of frequency ω, it is populated by particles
with energy h̄ω. We can get the mode classically by considering standing waves of the
electromagnetic field and we can get it quantum mechanically by considering a particle
in a box. Either way we get quantized frequencies. With quantum mechanics we also
find that the occupants of the modes are particles with energies h̄ω, so we get quantized
energies at each quantized frequency. When you take a course in quantum field theory,
you will learn about second quantization which is what we’ve just been talking about!

The particles associated with the electromagnetic field are called photons. They are massless
and travel at the speed of light. They carry energy E = hν = h̄ω and momentum p =
h/λ = hν/c = h̄ω/c, where ω and λ are the frequency and wavelength of the corresponding
wave. Note that the frequency in Hertz is ν = ω/2π.

When h̄ω ≪ τ , so the thermal energy is much larger than the photon energy, we have
    ⟨E⟩/h̄ω → τ/h̄ω ≫ 1 ,     (h̄ω ≪ τ) .
The average energy divided by the energy per photon is the average number of photons
in the mode or the average occupancy. We see that in the limit of low energy modes,
each mode has many photons. When quantum numbers are large, we expect quantum
mechanics to go over to classical mechanics and sure enough this is the limit where the
classical treatment gives a reasonable answer. At the other extreme, when the photon
energy is high compared to the thermal energy, we have
    ⟨E⟩/h̄ω → e^(−h̄ω/τ) ≪ 1 ,     (h̄ω ≫ τ) .
In this limit, the average occupancy is much less than 1. This means that the mode is
quiescent (as needed) and also that quantum effects should be dominant. In particular,
the heat bath, whose typical energies are ∼ τ has a hard time getting together an energy
much larger than τ all at once so as to excite a high energy mode.

Perhaps a bit of clarification is needed here. When discussing the ideal gas, consisting
of atoms or molecules, we said that a low occupancy gas was classical and a high occupancy


gas needed to be treated with quantum mechanics—apparently just the opposite of what
was said in the previous paragraph! When a large number of particles are in the same
state, they can be treated as a classical field. Thus at low photon energies, with many
photons in the same mode, we can speak of the electromagnetic field of the mode. At high
photon energies, where the occupancy is low, the behavior is like that of a classical particle
but a quantized field.

Let’s calculate the cavity radiation spectrum. The only change we need to make from
our previous treatment is to substitute the quantum oscillator average energy in place of
the classical result. The number of modes per unit frequency is the same whether we count
the modes classically or quantum mechanically. However, since the average energy now
depends on frequency, we must include it in the integral when we attempt to find the total
energy. The energy per unit frequency is

    (dU/dω) dω = (V ω²/π²c³) [h̄ω/(exp(h̄ω/τ) − 1)] dω ,

and the total energy in the cavity is,


    U = ∫₀^∞ dω (V ω²/π²c³) h̄ω/(exp(h̄ω/τ) − 1) .

It is convenient to divide the energy per unit frequency by the volume and consider the
spectral density uω , where

    u_ω = (1/V) dU/dω = h̄ω³ / [π²c³ (exp(h̄ω/τ) − 1)] .

This is called the Planck radiation law. It is simply the energy per unit volume per unit
frequency at frequency ω inside a cavity at temperature τ . For convenience, let x = h̄ω/τ .
Then x is dimensionless and we have

    u_ω = [τ³/(π²h̄²c³)] x³/(eˣ − 1) .
The shape of the spectrum is given by the second factor above which is plotted in the figure.


Changing the temperature shifts the curve to higher frequencies (in proportion to τ ) and
multiplies the curve by τ 3 (and constants). At low energy the spectrum is proportional to
ω 2 in agreement with the classical result. At high energy there is an exponential cut-off.
The exponential behavior on the high energy side of the curve is known as Wien’s law.

To find the total energy per unit volume we have


    u = ∫₀^∞ dω u_ω ,
      = ∫₀^∞ dω h̄ω³ / [π²c³ (exp(h̄ω/τ) − 1)] ,
      = [τ⁴/(π²h̄³c³)] ∫₀^∞ x³ dx/(eˣ − 1) ,
      = [τ⁴/(π²h̄³c³)] (π⁴/15)     (looking up the integral) ,
      = [π²/(15h̄³c³)] τ⁴ .

The fact that radiation density is proportional to τ 4 is called the Stefan-Boltzmann law.
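
As a numerical check (a sketch, not from the notes; the example temperature of 5800 K, roughly
the solar surface value used later, is an arbitrary choice), one can verify the integral
∫₀^∞ x³ dx/(eˣ − 1) = π⁴/15 by quadrature and then evaluate the energy density.

    import math

    def integrand(x):
        return x**3 / math.expm1(x)      # x^3/(e^x - 1); behaves like x^2 near 0

    # midpoint rule; the integrand is negligible beyond x ~ 50
    N, xmax = 200000, 50.0
    dx = xmax / N
    I = sum(integrand((i + 0.5) * dx) for i in range(N)) * dx
    print("numerical integral = %.6f    pi^4/15 = %.6f" % (I, math.pi**4 / 15))

    hbar, c, k_B = 1.054571817e-34, 2.99792458e8, 1.380649e-23    # SI units
    T = 5800.0                                                    # example temperature, K
    tau = k_B * T
    u = math.pi**2 / (15.0 * hbar**3 * c**3) * tau**4
    print("u(T = %.0f K) = %.3f J/m^3" % (T, u))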


We can also calculate the entropy of radiation. We have

    U = V u = [π²/(15h̄³c³)] V τ⁴ .

We know that τ dσ = dU when the volume is constant, so

    dσ = (1/τ) [4π²/(15h̄³c³)] V τ³ dτ = [4π²/(15h̄³c³)] V τ² dτ .

We integrate this relation setting the integration constant to 0 (why?) and obtain

    σ = [4π²/(45h̄³c³)] V τ³ .

It is sometimes useful to think of blackbody radiation as a gas of photons. Some of


the homework problems explore this point of view as well as other interesting facts about
the blackbody radiation law.

One application of the blackbody radiation law has to do with the cosmic microwave
background radiation which is believed to be thermal radiation left over from the hot big
bang which started our universe. Due to the expansion of the universe, it has cooled
down. This radiation has been measured very precisely by the FIRAS instrument on
the COBE satellite and is shown in the accompanying figure which was put together by


Lyman Page mostly from data collected by Reach, et al., 1995, Astrophysical Journal,
451, 188. The dashed curve is the theoretical curve and the solid curve represents the
measurements where the error bars are smaller than the width of the curve! Other curves
on the plot represent deviations in the curve due to our motion through the background
radiation (dipole), irregularities due to fluctuations that eventually gave rise to galaxies and
physicists (anisotropy) and sources of interfering foreground emission. The temperature
is 2.728 ± 0.002 K where the error (one standard deviation) is all systematic and reflects
how well the experimenters could calibrate their thermometer and subtract the foreground
sources.


More on Blackbody Radiation

Before moving on to other topics, we’ll clean up a few loose ends having to do with
blackbody radiation.

In the homework you are asked to show that the pressure is given by

    p = π²τ⁴/(45h̄³c³) ,

from which one obtains

    pV = (1/3) U ,
for a photon gas. This is to be compared with

    pV = (2/3) U ,
appropriate for a monatomic ideal gas.

In an adiabatic (isentropic, i.e., constant entropy) expansion, an ideal gas obeys the relation

    pV^γ = Constant ,
where γ = Cp /CV is the ratio of heat capacity at constant pressure to heat capacity at
constant volume. For a monatomic ideal gas, γ = 5/3. For complicated gas molecules with
many internal degrees of freedom, γ → 1. A monatomic gas is “stiffer” than a polyatomic
gas in the sense that the pressure in a monatomic gas rises faster for a given amount of
compression. What are the heat capacities of a photon gas? Since

    U = [π²/(15h̄³c³)] V τ⁴ ,

    C_V = (∂U/∂τ)_V = [4π²/(15h̄³c³)] V τ³ .
How about the heat capacity at constant pressure? We can’t do that! The pressure
depends only on the temperature, so we can’t change the temperature without changing
the pressure. We can imagine adding some heat energy to a photon gas. In order to keep
the pressure constant, we must let the gas expand while we add the energy. So, we can
certainly add heat at constant pressure; it just means the temperature is constant as well,
so I suppose the heat capacity at constant pressure is formally infinite!

If one recalls the derivation of the adiabatic law for an ideal gas, it’s more or less
an accident that the exponent turns out to be the ratio of heat capacities. This, plus
the fact that we can’t calculate a constant pressure heat capacity is probably a good sign


that we should calculate the adiabatic relation for photon gas directly. We already know
σ ∝ V τ³ ∝ V p^(3/4), so for an adiabatic process with a photon gas,

    pV^(4/3) = Constant ,
and a photon gas is “softer” than an ideal monatomic gas, but “stiffer” than polyatomic
gases. Note that γ = 4/3 mainly depends on the fact that photons are massless. Consider
a gas composed of particles of energy E and momentum P = E/c, where c is the speed
of light. Suppose that particles travel at speed c and that their directions of motion are
isotropically distributed. Then if the energy density is u = nE, where n is the number
of particles per unit volume, the pressure is u/3. This can be found by the same kind of
argument suggested in the homework problem. The same result holds if the particles have
a distribution in energy provided they satisfy P = E/c and v = c. This will be the case
for ordinary matter particles if they are moving at relativistic speeds. A relativistic gas is
“softer” than a similar non-relativistic gas!

On problem 3 of the homework you are asked to determine the power per unit area
radiated by the surface of a blackbody or, equivalently, a small hole in a cavity. The result
is (c/4)u where the speed of light accounts for the speed at which energy is transported by
the photons and the factor of 1/4 accounts for the efficiency with which the energy gets
through the hole. The flux is
    J = π²τ⁴/(60h̄³c²) = [π²k⁴/(60h̄³c²)] T⁴ = σ_B T⁴ ,

where the Stefan-Boltzmann constant is

    σ_B = π²k⁴/(60h̄³c²) = 5.6687 × 10⁻⁵ erg cm⁻² s⁻¹ K⁻⁴ .

We saw that the Planck curve involved the function x3 /(exp(x) − 1) with x = h̄ω/τ .
Let’s find the value of x for which this curve is a maximum. We have
    0 = (d/dx) [x³/(eˣ − 1)] ,
      = 3x²/(eˣ − 1) − x³eˣ/(eˣ − 1)² ,
      = [(3x² − x³)eˣ − 3x²]/(eˣ − 1)² ,

or

    0 = (x − 3)eˣ + 3 .

This transcendental equation must be solved numerically. The result is x = 2.82144. At
maximum,

    2.82 = h̄ω_max/τ = h ν_max/(k T) ,

or
    T/ν_max = h/(2.82 k) = 0.017 Kelvin/Gigahertz .
So the above establishes a relation between the temperature and the frequency of the
maximum energy density per unit frequency.

You will often see uν which is the energy density per unit Hertz rather than the energy
density per unit radians per second. This is related to uω by the appropriate number of
2π’s. You will also see uλ which is the energy density per unit wavelength. This is found
from
uλ |dλ| = uω |dω| .
This says that the energy density within a range of wavelengths should be same as the
energy density within the corresponding range of frequencies. The absolute value signs are
there because we only care about the widths of the ranges, not the signs of the ranges. We
use ω = 2πc/λ and dω = (2πc/λ2 )|dλ|,


    u_λ = u_ω (2πc/λ²) ,
        = [h̄(2πc/λ)³ / (π²c³ (exp(2πh̄c/λτ) − 1))] (2πc/λ²) ,
        = 8πhc / [λ⁵ (exp(hc/λτ) − 1)] ,
        = [8πτ⁵/(h⁴c⁴)] x⁵/(eˣ − 1) ,
where x = hc/λτ . At long wavelengths, uλ → 8πτ /λ4 , and at short wavelengths uλ is
exponentially cut off. The maximum of uλ occurs at a wavelength given by the solution of

    (x − 5)eˣ + 5 = 0 .

The solution is x = 4.96511 . . .. From this, we have


    λ_max T = hc/(4.97 k) = 0.290 cm K .
This is known as Wien’s displacement law. It simply says that the wavelength of the
maximum in the spectrum and the temperature are inversely related. In this form, the
constant is easy to remember. It’s just 3 mm Kelvin. (Note that the wavelength of the
maximum in the frequency spectrum and the wavelength of the maximum in the wavelength
spectrum differ by a factor of about 1.6. This is just a reflection of the fact that wavelength
and frequency are inversely related.)
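
Both transcendental equations are easy to solve numerically. The sketch below (an
illustration, not part of the notes) finds the nonzero roots by bisection and reproduces the
two constants quoted above.

    import math

    def peak_root(a):
        """Nonzero root of (x - a) e^x + a = 0, found by bisection."""
        f = lambda x: (x - a) * math.exp(x) + a
        lo, hi = 1.0, 10.0              # brackets the nonzero root for a = 3 or a = 5
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)

    print("frequency  peak: x =", peak_root(3.0))   # ~ 2.82144
    print("wavelength peak: x =", peak_root(5.0))   # ~ 4.96511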

Let’s apply some of these formulae to the sun. First, the peak of the spectrum is in
about the middle of the visible band (do you think this is a coincidence or do you suppose


there’s a reason for it?), at about λmax ≈ 5000 Å = 5 × 10−5 cm. Using the displacement
law (and assuming the sun radiates as a blackbody), we find Tsun ≈ 5800 K. The luminosity
of the sun is L = 3.8 × 1033 erg s−1 . The radius of the sun is r = 7.0 × 1010 cm. The flux
emitted by the sun is J = L/4πr 2 = 6.2 × 1010 erg cm−2 s−1 . This is about 60 Megawatts
per square meter! We equate this to σB T 4 and find Tsun ≈ 5700 K, very close to what we
estimated from the displacement law.
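
The arithmetic in the last paragraph is easy to reproduce. Here is a minimal sketch (not part
of the notes) using the same numbers for the sun, in CGS units as in the text.

    import math

    h, c, k = 6.62607e-27, 2.99792458e10, 1.380649e-16   # erg s, cm/s, erg/K
    sigma_B = 5.6687e-5                                   # erg cm^-2 s^-1 K^-4

    # Wien displacement estimate from the peak wavelength (~5000 Angstroms).
    lam_max = 5.0e-5                                      # cm
    T_wien = h * c / (4.96511 * k * lam_max)
    print("T from Wien displacement: %.0f K" % T_wien)

    # Stefan-Boltzmann estimate from the solar luminosity and radius.
    L, r = 3.8e33, 7.0e10                                 # erg/s, cm
    J = L / (4.0 * math.pi * r**2)
    T_flux = (J / sigma_B) ** 0.25
    print("flux J = %.2e erg cm^-2 s^-1,  T from J = sigma_B T^4: %.0f K" % (J, T_flux))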

Problem 17 in chapter 4 of K&K points out that the entropy of a single mode of
thermal radiation depends only on the average number of photons in the mode. Let’s see
if we can work this out. We will use

    σ = (∂(τ log Z)/∂τ)_V ,

where Z is the partition function and the requirement of constant volume is satisfied by
holding the frequency of the mode constant. We’ve already worked out the partition
function for a single mode
    Z = 1/(1 − e^(−h̄ω/τ)) .
The average occupancy (number of photons) in the mode is

    n = 1/(e^(h̄ω/τ) − 1) ,

from which we find


    (n + 1)/n = e^(h̄ω/τ) ,     or     h̄ω/τ = log[(n + 1)/n] .
Now let’s do the derivatives to get the entropy


    σ = (∂/∂τ)(τ log Z) ,
      = log Z − τ (∂/∂τ) log(1 − e^(−h̄ω/τ)) ,
      = log[1/(1 − e^(−h̄ω/τ))] + (h̄ω/τ) e^(−h̄ω/τ)/(1 − e^(−h̄ω/τ)) ,
      = log[1/(1 − n/(n+1))] + log[(n+1)/n] · [n/(n+1)]/[1 − n/(n+1)] ,
      = log(n + 1) + n log[(n+1)/n] ,
      = (n + 1) log(n + 1) − n log n ,


which is the form given in K&K. This is another way of making the point that the ex-
pansion of the universe does not change the entropy of the background radiation. The
expansion redshifts each photon—stretches out its wavelength—in proportion to the ex-
pansion factor, but it does not change the number of photons that have the redshifted
wavelength—the number of photons in the mode. So, the entropy doesn’t change. (This
assumes that the photons don’t interact with each other or with the matter. Once the
universe is cool enough (≤ 4000 K) that the hydrogen is no longer ionized, then the inter-
actions are very small.)
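
A quick numerical check of the occupancy form of the entropy (a sketch, not from the notes;
the unit value of h̄ω is an arbitrary choice): differentiate τ log Z numerically and compare
with (n + 1) log(n + 1) − n log n.

    import math

    hw = 1.0                                   # hbar*omega in arbitrary units

    def tau_logZ(tau):
        return -tau * math.log(1.0 - math.exp(-hw / tau))

    for tau in (0.2, 1.0, 5.0):
        d = 1.0e-6
        sigma_deriv = (tau_logZ(tau + d) - tau_logZ(tau - d)) / (2.0 * d)
        n = 1.0 / math.expm1(hw / tau)
        sigma_occ = (n + 1.0) * math.log(n + 1.0) - n * math.log(n)
        print("tau = %4.1f   sigma from derivative = %.6f   from occupancy formula = %.6f"
              % (tau, sigma_deriv, sigma_occ))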

Physics 301 01-Oct-2004 10-1

Johnson Noise

This is another application of the thermal equilibrium of electromagnetic modes. Consider
an ideal transmission line, like a long piece of lossless coaxial cable. Suppose its
length is L and suppose it is shorted out at each end. Then any wave that travels along
the line is reflected at each end and to have an appreciable amplitude, the length of the
line must contain an integral number of half wavelengths. In other words this is just a
one-dimensional cavity of length L. There are modes of the electromagnetic fields, most
conveniently represented by the potential difference between the inner and outer conduc-
tors. Vn = Vn,0 sin(ωt) sin(nπx/L), where n is any positive integer. Vn and Vn,0 represent
the potential difference and the amplitude of the potential difference. The fields must
satisfy Maxwell’s equations, so ω = nπc/L. Actually, if the coax is filled with a dielectric,
the speed of propagation can be different from c, let’s assume it’s filled with vacuum. If
this line is in thermal equilibrium at temperature τ , each mode acts like an oscillator and
has average energy h̄ω/(eh̄ω/τ − 1). Let’s consider the low frequency limit so the average
energy in each mode is just τ . The number of modes per integer n is just 1. Then the
number of modes per unit frequency is

    n(ω) dω = (L/πc) dω .
The energy per unit length per unit frequency is then
    u_ω = τ/(πc) ,
at low frequencies.

As you may know, all transmission lines have a characteristic impedance, R. If a
resistor R is connected across the end of the line, then a wave traveling down the line is
completely absorbed by the resistor. So, let’s take a resistor, in equilibrium at temperature
τ , and connect it to the end of the line. Since the resistor and the line are at the same
temperature, they are already in thermal equilibrium and no net energy transfer takes
place. Each mode in the line is a standing wave composed equally of traveling waves
headed in both directions. The waves traveling towards the resistor will be completely
absorbed by the resistor. This means that the resistor must emit waves with equal power
in order that there be no net transfer of energy. The energy in the frequency band dω per
unit length headed towards the resistor is τ dω/2πc. This is traveling at speed c, so the
power incident on the resistor is τ dω/2π, which is also the power emitted by the resistor.

What we’ve established so far is that the line feeds power τ dω/2π into the resistor
and vice-versa. This means that a voltage must appear across the resistor. This will
be a fluctuating voltage with mean 0 since it’s a random thermal voltage. However, its
mean square value will not be zero. Let’s see if we can calculate this. As an equivalent
circuit, we have a resistor R, a voltage generator (the thermally induced voltage source),


a filter (to limit the frequencies to dω), and another resistor of resistance R representing
the transmission line. Then the current is I = V /2R. The average power delivered to the
resistor is then ⟨I²⟩R = ⟨V²⟩/4R = τ dω/2π. In the lab, one measures frequencies in Hertz
rather than radians per second, ν = ω/2π. Finally

    ⟨V²⟩ = 4Rτ dν .

This relates the mean square noise voltage which appears across a resistor to the temper-
ature, resistance, and bandwidth (dν). Of course, this voltage results from fluctuations
in the motions of electrons inside the resistor, but we calculated it by considering electro-
magnetic modes in a one-dimensional cavity, a much simpler system! This thermal noise
voltage is called Johnson noise.
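
To get a feel for the size of the effect, here is a one-line estimate (a sketch with made-up
but typical lab values, not from the notes) of the RMS Johnson noise voltage.

    import math

    k_B = 1.380649e-23      # J/K
    R   = 1.0e6             # ohm                  (example value)
    T   = 300.0             # K                    (example value)
    bw  = 1.0e4             # Hz bandwidth d(nu)   (example value)

    # <V^2> = 4 R tau d(nu) = 4 k_B T R * bandwidth
    V_rms = math.sqrt(4.0 * k_B * T * R * bw)
    print("V_rms = %.1f microvolts" % (V_rms * 1.0e6))

A megohm resistor at room temperature in a 10 kHz bandwidth gives an RMS noise voltage of
order ten microvolts, which is why low-level measurements care about both resistance and
bandwidth.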

Debye Theory of Lattice Vibrations

A little thought will show that sound waves in a solid are not all that different from
electromagnetic waves in a cavity. Further thought will show that there are some important
differences that we must take into account.

The theory of lattice vibrations that we’ll discuss below applies to the ion lattice in
a conductor. In addition, one needs to account for the thermal effects of the conduction
electrons which behave in many respects like a gas. We’ll consider the electron gas later
in the term. For now, we imagine that we’re dealing with an insulator.

We will treat crystalline solids. This is mainly for conceptual convenience, but also
because we want reasonably well defined vibrational modes.

As a model, suppose the atoms in a solid are arranged in a regular cubic lattice. Each
atom vibrates around its equilibrium position. The equilibrium and the characteristics
of the vibrations are determined by interactions with the neighboring atoms. We can
imagine that each atom is connected to its six nearest neighbors by springs. At first
sight, this seems silly. But, the equilibrium position is determined by a minimum in
the potential energy, and the potential energy almost surely increases quadratically with
displacement from equilibrium. This gives a linear restoring force which is exactly what
happens with a spring. So our solid is a large number of coupled oscillators. In general,
the motion of a system of coupled oscillators is very complex. You probably know from
your classical mechanics course, that the motion of a system of coupled oscillators can be
resolved into a superposition of normal modes with the motion of each mode being simple
harmonic in time. So, we can describe the motion with the N vectors ri which represent
the displacement of each atom from its equilibrium position, or we can describe the motion
with 3N normal mode amplitudes. For those of you that know about Fourier transforms,
the normal modes are just the Fourier transforms of the position coordinates.


These normal modes represent elastic vibrations of our solid. They are standing elastic
waves, or standing sound waves. In this respect, they are similar to the cavity modes we
discussed earlier. There are two main differences. First, there are three polarizations:
there are two transversely polarized waves (as we had in the electromagnetic case) and
one longitudinally polarized wave (absent in the electromagnetic case). Second, there is a
limit to the number of modes. If our solid contains N atoms, there are 3N modes. In the
electromagnetic case, there is no upper limit to the frequency of a mode. High frequency
modes with h̄ω ≫ τ are not excited, but they are there. In the elastic case, frequencies
which are high enough that the wavelength is shorter than twice the distance between
atoms do not exist.

For simplicity, we are going to assume that the velocity of sound is isotropic and is
the same for both transverse and longitudinal waves. Also, we’ll assume that the elastic
properties of the solid are independent of the amplitude of the vibrations (at least for the
amplitudes we’ll be dealing with).

We’ll carry over as much stuff from the electromagnetic case as we can. A typical
mode will look something like
    displacement component = A sin(ωt) sin(n_x πx/L) sin(n_y πy/L) sin(n_z πz/L) ,
where the sine factors might be cosines depending on the mode, A represents an amplitude,
and for convenience, the solid is a cube of side L. The frequency and mode numbers are
related by the speed of sound, v,

    ω² = (π²v²/L²)(n_x² + n_y² + n_z²) .
If the solid contains N atoms, the distance between atoms is L/N 1/3 . The wavelength
must be longer than twice this distance. More precisely
    2L/n_x > 2L/N^(1/3) ,
with similar relations for ny and nz . In other words, the mode numbers nx , ny , and nz are
integers within the cube N 1/3 × N 1/3 × N 1/3 . This lower limit on the wavelength (upper
limit on the frequency) is an example of the Nyquist limit discussed later in these notes.

The number of modes per unit frequency is just as it was for the electromagnetic case
except that we must multiply by 3/2 to account for the three polarizations instead of two,

    n(ω) dω = (3V ω²/2π²v³) dω .
This works for frequencies low enough that the corresponding n’s are within the cube.
It’s messy to deal with this cubical boundary to the mode number space. Instead, let’s


approximate the upper boundary as the surface of a sphere which gives the same number
of modes. In other words, there will be an upper limit to the frequency, called ωD , such
that

    3N = ∫₀^ω_D n(ω) dω = (3V/2π²v³) ∫₀^ω_D ω² dω = (V/2π²v³) ω_D³ ,

which gives

    ω_D = (6π²N/V)^(1/3) v .

Each mode acts like a harmonic oscillator and its energy is an integer times h̄ω. The
quanta of sound are called phonons. A solid contains a thermally excited phonon gas. The
average energies of these oscillators are just as they were in the electromagnetic case. We
find the total energy by adding up the energies in all the modes,
    U = (3V/2π²v³) ∫₀^ω_D ω² h̄ω/(exp(h̄ω/τ) − 1) dω ,
      = [3V/(2π²h̄³v³)] τ⁴ ∫₀^x_D x³ dx/(eˣ − 1) ,

where

    x_D = h̄ω_D/τ = (6π²N/V)^(1/3) h̄v/τ = kθ/(kT) = θ/T ,
where θ is called the Debye temperature and is given by
    θ = (6π²N/V)^(1/3) h̄v/k .

The Debye temperature is not a temperature you can change by adding or removing
heat from a solid! Instead, it’s a characteristic of a given solid. The way to think of it is
that a vibration with phonon energy equal to kθ is the highest frequency vibration that
can exist within the solid. Otherwise the wavelength would be too short. (The weird factor
of 6π 2 occurs because we replaced a cube with a sphere!) Typical Debye temperatures are
a few hundred Kelvin.

The limit of integration depends on the temperature, so in general, we can’t look up


the integral. Instead, we have to numerically integrate and produce a table for different
values of xD = θ/T . Such a table is given in K&K.

There are two limiting cases where we can do the integral. The first case is very low
temperature (T ≪ θ). In this case xD is very large and we can replace xD by ∞. Then
the integral is π 4 /15 and we have
    U = [π²V/(10h̄³v³)] τ⁴ = [3π⁴N/(5k³θ³)] τ⁴ .


The heat capacity at constant volume is then


    C_V = (12π⁴/5) N k (T/θ)³ .

So a prediction of this theory is that at low temperatures, the heat capacities of solids
should be proportional to T 3 . This is borne out by experiment!

The other limit we can consider is very high temperature (T ≫ θ). In this case, we
expect all modes are excited to an average energy τ , so the total should be U = 3N τ . Is
this what we get? At very high temperatures, xD ≪ 1, so we can expand the exponential
in the denominator of the integrand,
    U = [3V/(2π²h̄³v³)] τ⁴ ∫₀^x_D x³ dx/(eˣ − 1) ,
      = [3V/(2π²h̄³v³)] τ⁴ ∫₀^x_D x² dx ,
      = [V/(2π²h̄³v³)] τ⁴ x_D³ ,
      = [V/(2π²h̄³v³)] τ⁴ (h̄³v³/τ³)(6π²N/V) ,
      = 3N τ ,

as expected. Actually, we picked ωD so this result would occur “by construction.” The
heat capacity goes to 3N k in this limit.

In one of the homework problems you are asked to come up with a better approxima-
tion in the limit of small xD .
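
In between the two limits the integral has to be done numerically. The sketch below (an
illustration, not from the notes; the Debye temperature of 300 K is an arbitrary example
value) evaluates U from the integral above and gets C_V by a numerical temperature
derivative, in units of 3Nk.

    import math

    def debye_integral(xD, n=20000):
        """Midpoint-rule evaluation of integral_0^xD x^3/(e^x - 1) dx."""
        dx = xD / n
        return sum(((i + 0.5) * dx) ** 3 / math.expm1((i + 0.5) * dx)
                   for i in range(n)) * dx

    def U_over_3Nk(T, theta):
        # U = 9 N tau (T/theta)^3 * integral, which follows from the expressions above
        # using 3N = V omega_D^3 / 2 pi^2 v^3; dividing by 3Nk leaves an energy in Kelvin.
        return 3.0 * T * (T / theta) ** 3 * debye_integral(theta / T)

    theta = 300.0                       # example Debye temperature, K
    for T in (5.0, 30.0, 100.0, 300.0, 1000.0):
        dT = 1.0e-3 * T
        cv = (U_over_3Nk(T + dT, theta) - U_over_3Nk(T - dT, theta)) / (2.0 * dT)
        print("T/theta = %6.3f    C_V / 3Nk = %.4f" % (T / theta, cv))

At the low-temperature end the output follows the T³ law, and at the high end it saturates
at 1, the Dulong and Petit value.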


The Nyquist Frequency

We imagine that we have a function of time that we sample periodically, every T
seconds. Then the Nyquist frequency is the frequency corresponding to a period of two
samples, ω_N = 2π/2T = π/T. Consider a sine wave at some frequency ω,

y(t) = sin(ωt + φ) .

Since we are sampling, we don’t have a continuous function, but a discrete set of values:

ym = sin(ωmT + φ) .

Suppose the frequency is larger than the Nyquist frequency. Then we can write it as
an even integer times the Nyquist frequency plus a frequency less than the Nyquist
frequency:
ω = 2nωN + Ω = 2πn/T + Ω ,
where −ωN ≤ Ω ≤ +ωN . Then

ym = sin(2πnm + ΩmT + φ) = sin(ΩmT + φ) .

In other words, when we sample a sine wave periodically, waves with frequencies greater
than the Nyquist frequency look exactly the same as waves with frequencies less than the
Nyquist frequency. This is illustrated in the figure. The arrows along the bottom indicate

the times at which the signal is sampled. A signal at the Nyquist frequency would have
one cycle every two sample intervals. The high frequency wave has 3.7 cycles every two


sample intervals or a phase change of 3.7π = (4 − 0.3)π every sample. We can’t tell how
many multiples of 2π go by between samples, so the high frequency wave looks exactly
like a low frequency wave with −0.3 cycles per two samples. The points show the value of
the signal (either wave) at each sampling interval. Of course the application to the Debye
theory of lattice vibrations is that the Nyquist spatial frequency is the highest frequency
a periodic lattice can support.
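
The aliasing statement is easy to verify numerically. The sketch below (an illustration, not
part of the notes) samples the 3.7 ω_N wave from the figure and the corresponding −0.3 ω_N
wave and shows that the samples agree exactly; the phase is an arbitrary choice.

    import math

    T = 1.0                       # sample interval (arbitrary units)
    omega_N = math.pi / T         # Nyquist frequency
    phi = 0.3                     # arbitrary phase

    omega_high  = 3.7 * omega_N
    omega_alias = omega_high - 2 * 2 * omega_N   # subtract 2n*omega_N with n = 2: -0.3*omega_N

    for msample in range(6):
        y_high  = math.sin(omega_high  * msample * T + phi)
        y_alias = math.sin(omega_alias * msample * T + phi)
        print("m=%d   high-frequency sample %+.6f   aliased sample %+.6f"
              % (msample, y_high, y_alias))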

Physics 301 04-Oct-2004 11-1

Reading

This week we’ll work on chemical potential and the Gibbs distribution which is covered
in K&K chapter 5.

Parting Shot on Oscillators

Before we get to the main part of this week’s material, let’s have a quick recap on
oscillators.

If we have an oscillator of frequency ω, its energy levels are spaced by h̄ω. If this
oscillator is in thermal equilibrium at temperature τ , then if h̄ω < τ , its average energy is
τ . If h̄ω > τ , its average energy is exponentially “killed off” and it’s not too gross of an
approximation to say that it’s 0. This happens because of the quantized states. Energy
can only be exchanged with the oscillator in units of h̄ω and when this is larger than a
thermal energy, the heat reservoir almost never gets enough energy together to excite the
oscillator.

We know the energy of the oscillator, and all we have to do is count up how many
oscillators there are in order to find the total energy of the system.
For blackbody radiation, the number of modes is proportional to ω³ (∫ ω² dω), and all
these modes are excited up to the maximum ω where h̄ω = τ . So the energy in blackbody
radiation is proportional to τ 4 .

In the case of lattice vibrations, we again have a number of modes proportional to ω 3 ,


but there are a finite number, so if we run out of modes before we reach the maximum
ω of an excited oscillator, then every mode has energy τ and the total energy is 3N τ
(where 3N is the number of modes). If we don’t run out of modes before we reach the
maximum ω of an excited oscillator, then the situation is just like that with blackbody
radiation and the energy is proportional to τ 4 . These two cases correspond to high and
low temperatures and give heat capacities which are constant or proportional to τ 3 for
high and low temperatures. By the way, the fact that molar heat capacities of solids are
usually 3R at room temperature (R is the gas constant) is called the law of Dulong and
Petit.

Finally, when we considered Johnson noise in a resistor, we made use of a one-dimensional
cavity, where the number of modes is proportional to ω (∫ dω), and we considered
the low temperature limit and found the energy proportional to τ dω.

Basically, we know the energy of the modes and we count the modes. All the factors
of π, h̄, etc., come out as a result of the proper bookkeeping when we do the counting.


Integrals Related to Planck’s Law

Judging by experience with previous classes, some (maybe many) of you are wondering
just how one goes about doing the integral
    ∫₀^∞ x² dx/(exp(x) − 1) .

The first thing to note is that doing integrals is an art, not a science. You’ve probably
learned a number of techniques for doing integrals. However, there is never a guarantee
that an arbitrary expression can be integrated in closed form, or even as a useful series.
Some expressions you just have to integrate numerically!

Let’s see what we can do about


    I_n = ∫₀^∞ xⁿ dx/(exp(x) − 1) ,

where n need not be an integer, but I think we’ll need n > 0. The first thing to do is to
try and look it up! I like Dwight, Tables of Integrals and other Mathematical Data, 4th
edition, 1964, MacMillan. (Actually, I bought mine when I was an undergraduate in the
late 60’s. It seems that they were coming out with a new edition every 10 years, so maybe
it’s up to the seventh edition by now!) Anyway, in my edition of Dwight, there is entry
860.39: Z ∞ p−1  
x dx Γ(p) 1 1 Γ(p)
ax
= p 1 + p + p + · · · = p ζ(p) ,
0 e −1 a 2 3 a
and this is basically the integral we’re trying to do.

We’ve talked about the gamma function, Γ(z), see lecture 5. The Riemann Zeta
function is

    ζ(s) = Σ_{k=1}^{∞} 1/k^s ,

where Re(s) > 1. The function can be defined for other values of s, but this series
requires Re(s) > 1. A good reference book for special functions is Abramowitz and Stegun,
Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,
US Government Printing Office. If you look up the zeta function in this handbook, you’ll
find lots of cool stuff. For example,
    ζ(s) = Π_{primes p} 1/(1 − 1/p^s) .

The zeta function establishes a connection between number theory and function theory!
Other things you’ll find in A&S are various integral representations, representations in


terms of Bernoulli and Euler polynomials (if you don’t know what these are, they’re also
discussed in A&S), special values, and tables of values. For example ζ(0) = −1/2, ζ(1) =
∞, ζ(2) = π 2 /6, and ζ(4) = π 4 /90. ζ(3), needed for the integral at the top of the page,
does not have a simple value. Instead, we find it in a table, ζ(3) = 1.2020569 . . ..

Can we write In in the form suggested by Dwight?


    I_n = ∫₀^∞ xⁿ dx/(exp(x) − 1) ,
        = ∫₀^∞ xⁿ e^(−x) dx/(1 − e^(−x)) ,
        = ∫₀^∞ xⁿ e^(−x) Σ_{m=0}^{∞} e^(−mx) dx ,
        = ∫₀^∞ Σ_{m=0}^{∞} xⁿ e^(−(m+1)x) dx ,
        = Σ_{m=0}^{∞} ∫₀^∞ xⁿ e^(−(m+1)x) dx ,
        = Σ_{m=0}^{∞} [1/(m+1)^(n+1)] ∫₀^∞ ((m+1)x)ⁿ e^(−(m+1)x) d((m+1)x) ,
        = Σ_{m=0}^{∞} [1/(m+1)^(n+1)] ∫₀^∞ yⁿ e^(−y) dy ,
        = Γ(n+1) Σ_{m=0}^{∞} 1/(m+1)^(n+1) ,
        = Γ(n+1) Σ_{m=1}^{∞} 1/m^(n+1)     (m now starts at 1) ,
        = Γ(n+1) ζ(n+1) ,

in agreement with Dwight.

If you need to numerically evaluate ζ(s), you can just start summing the series. Suppose
you’ve summed the inverse powers from 1 to M − 1. You should be able to show
(make some sketches) that ∫_M^∞ dx/x^s = 1/[(s − 1)M^(s−1)] is less than the remainder of the
sum and ∫_M^∞ dx/(x − 1)^s = 1/[(s − 1)(M − 1)^(s−1)] is greater than the remainder of the
sum. You can use the average of these two integrals as an estimate of the remainder of the
sum and half their difference as a bound on the numerical error. (Actually the error will
be quite a bit smaller!). As an example, consider ζ(2) = π 2 /6 = 1.64493 . . .. The sum of
the first 10 terms, 1 + 1/4 + 1/9 + 1/16 + · · ·+ 1/100 = 1.5497677 . . .. The two integrals are
just 1/11 = 0.09090909 . . . and 1/10 = 0.1. Their average is 0.09545454 . . . and half their


difference is 0.004545 . . ., so numerically we can be pretty sure that the value is within
0.00455 of 1.64522. In fact, we actually miss by only 0.00028!
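
The tail estimate just described is only a couple of lines of code. Here is a sketch (not from
the notes) that applies it to ζ(2), ζ(3), and ζ(4) with M − 1 = 10 terms.

    import math

    def zeta_estimate(s, M=11):
        """Sum 1/k^s for k = 1..M-1, then add the average of the two bracketing
        integrals as an estimate of the remainder; also return the error bound."""
        partial = sum(1.0 / k**s for k in range(1, M))
        lower = 1.0 / ((s - 1) * M ** (s - 1))          # underestimates the tail
        upper = 1.0 / ((s - 1) * (M - 1) ** (s - 1))    # overestimates the tail
        return partial + 0.5 * (lower + upper), 0.5 * (upper - lower)

    for s, reference in ((2, math.pi**2 / 6), (3, 1.2020569), (4, math.pi**4 / 90)):
        est, bound = zeta_estimate(s)
        print("zeta(%d) ~ %.6f +- %.6f    (reference value %.6f)" % (s, est, bound, reference))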

The Chemical Potential

Recall in lectures 2 and 3 we discussed two systems in thermal (microscopic exchange
of energy), volume (macroscopic exchange of energy), and diffusive (exchange of particles)
equilibrium. By requiring that the entropy be a maximum, we found that

    dσ = (1/τ) dU + (p/τ) dV − (µ/τ) dN ,
where µ is the chemical potential and N is the number of particles.

In other words,

    µ = −τ (∂σ/∂N)_{U,V} .

We can also rewrite the differential relation above in the form

dU = τ dσ − p dV + µ dN ,

from which we deduce

    µ = (∂U/∂N)_{σ,V} .

Adding a particle to a system changes its energy by µ.

Of course, the entropy is not a completely natural variable to work with as a dependent
variable. To get around this, we use the Helmholtz free energy which we’ve previously
defined as a function of temperature and volume. We now extend the definition to include
particle number. In particular, consider the free energies of two systems in contact with
a reservoir at temperature τ . We allow these two systems to exchange particles until
equilibrium is established. The free energy of the combined system is

F = F1 + F2 ,

where the subscripts refer to the individual systems. The free energy will be a minimum
at constant temperature and volume. The change in free energy due to particle exchange
is
$$dF = dF_1 + dF_2 = \left(\frac{\partial F_1}{\partial N_1}\right)_{\tau,V} dN_1 + \left(\frac{\partial F_2}{\partial N_2}\right)_{\tau,V} dN_2 .$$


We want dF = 0 at minimum and, since the total number of particles is constant, we have
$dN_1 = -dN_2$, which means that
$$\left(\frac{\partial F_1}{\partial N_1}\right)_{\tau,V} = \left(\frac{\partial F_2}{\partial N_2}\right)_{\tau,V} = \mu .$$

This constitutes yet another definition of the chemical potential. Is it the same chemical
potential we’ve already defined? Yes, provided the free energy continues to be defined by

F = U − τσ .

Then when the particle number changes, we have

dF = dU − τ dσ − σ dτ ,
= τ dσ − p dV + µ dN − τ dσ − σ dτ ,
= −σ dτ − p dV + µ dN ,

and it’s the same chemical potential according to either definition.

By the way we defined the chemical potential, it must be the same for two systems in
diffusive and thermal contact, once they’ve reached equilibrium. What if the two systems
in diffusive contact do not have equal values of the chemical potential? Since

dF = dF1 + dF2 = µ1 dN1 + µ2 dN2 ,

there will be a flow of particles in order to minimize F . If µ1 > µ2 , then dN1 < 0 and
dN2 = −dN1 > 0, so particles flow from the system with the higher chemical potential to
the system with the lower chemical potential.

To summarize,
$$\mu = -\tau\left(\frac{\partial\sigma}{\partial N}\right)_{U,V} = \left(\frac{\partial U}{\partial N}\right)_{\sigma,V} = \left(\frac{\partial F}{\partial N}\right)_{\tau,V} .$$


Getting a Feel for the Chemical Potential

The chemical potential is another one of those thermodynamic quantities that seems
to appear by magic. In order to gain intuition about the chemical potential, you will
probably have to see it in action and work with it for a while.

To start this process, note that adding a particle to a system requires that the energy
of the system be changed by µ while the entropy and volume are kept constant. Better
yet, the free energy changes by µ while the temperature and volume are kept constant.

Why might adding a particle to a system change the system’s energy? There are
at least two reasons. There might be macroscopic fields around (such as gravitational or
electromagnetic fields) in which the particle has an ordinary potential energy (mgh or eΦ
for example). In addition when a particle is added to a system at temperature τ , it must
acquire a thermal energy which depends on τ and other parameters of the system. In other
words the change in energy upon adding a particle can be due to both macroscopic fields
and microscopic thermal effects.

The distinction made in K&K between the external, internal and total chemical po-
tentials is just a division into the macroscopic, microscopic, and total contributions to the
energy upon adding a particle.

Let’s find the chemical potential of the classical, ideal, monatomic gas. Recall in
lecture 7, we found,
$$U = \frac{3}{2}\,N\tau , \qquad
F = -N\tau\left(\log\frac{n_Q}{n} + 1\right) , \qquad
\sigma = N\left(\log\frac{n_Q}{n} + \frac{5}{2}\right) ,$$
$$n_Q = \left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2} , \qquad n = \frac{N}{V} .$$

Of the thermodynamic potentials U , F , and σ above, only F is expressed in terms


of its natural independent variables τ , V , and N . Let’s find µ by differentiating F with
respect to $N$ while keeping $\tau$ and $V$ constant.
$$\begin{aligned}
\mu = \left(\frac{\partial F}{\partial N}\right)_{\tau,V}
&= \frac{\partial}{\partial N}\left[-N\tau\left(\log\frac{n_Q V}{N} + 1\right)\right] , \\
&= -\tau\left(\log\frac{n_Q V}{N} + 1\right) + \tau , \\
&= -\tau\log\frac{n_Q V}{N} , \\
&= \tau\log\frac{n}{n_Q} .
\end{aligned}$$

Note that for a typical gas, the concentration n is quite small compared to the quantum
concentration, $n_Q$, perhaps a part in $10^6$ to a part in $10^5$. So we expect $\mu = -14\tau$ to $-11\tau$.
If µ gets close to zero, then the concentration is approaching the quantum concentration
and the classical treatment is no longer valid.
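As a quick numerical check, here is a small Python sketch evaluating $n$, $n_Q$, and $\mu/\tau$ for helium at 300 K and one atmosphere; the constants are rounded and the function name is just illustrative.

```python
# Sketch: n, n_Q, and mu for helium at T = 300 K and p = 1 atm, in SI units.
import math

hbar = 1.0546e-34   # J s
k_B  = 1.3807e-23   # J / K
amu  = 1.6605e-27   # kg

def mu_over_tau(mass_amu, T, p):
    m   = mass_amu * amu
    tau = k_B * T                                       # tau = k_B T, in joules
    n   = p / tau                                       # ideal gas law, n = p / tau
    n_Q = (m * tau / (2.0 * math.pi * hbar**2))**1.5    # quantum concentration
    return n / n_Q, math.log(n / n_Q)

ratio, mu_tau = mu_over_tau(4.0, 300.0, 1.013e5)        # helium
print(f"n/n_Q = {ratio:.1e}, mu = {mu_tau:.1f} tau")    # ~3e-6, mu ~ -12.7 tau
```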

Suppose we wanted to calculate the chemical potential by differentiating the entropy.


We express the entropy in terms of its natural independent variables $U$, $V$, and $N$,
$$\begin{aligned}
\sigma &= N\left(\log\frac{n_Q}{n} + \frac{5}{2}\right) , \\
&= N\left[\log\left(\left(\frac{m\tau}{2\pi\hbar^2}\right)^{3/2}\frac{V}{N}\right) + \frac{5}{2}\right] , \qquad\text{substitute } \tau = 2U/3N , \\
&= N\left[\log\left(\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right) + \frac{5}{2}\right] .
\end{aligned}$$

Now differentiate with respect to $N$, multiply by $-\tau$, and replace $U$ with $3N\tau/2$:
$$\begin{aligned}
\mu &= -\tau\left(\frac{\partial\sigma}{\partial N}\right)_{U,V} , \\
&= -\tau\,\frac{\partial}{\partial N}\left[N\left(\log\left(\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right) + \frac{5}{2}\right)\right] , \\
&= -\tau\left[\log\left(\left(\frac{mU}{3\pi\hbar^2}\right)^{3/2}\frac{V}{N^{5/2}}\right) + \frac{5}{2} - \frac{5}{2}\right] , \\
&= -\tau\log\frac{n_Q V}{N} , \\
&= \tau\log\frac{n}{n_Q} ,
\end{aligned}$$

the same as we obtained before.

Since we’re having so much fun playing with the mathematics, suppose we wanted to
find the chemical potential by differentiating the energy. This should be done at constant
entropy and volume. The Sackur-Tetrode formula for the entropy is fairly messy to solve
for the temperature, so we may not want to rewrite the energy in closed form as a function
of the entropy, volume and number of particles. Instead, we can write the energy as a


differential involving dN and dτ and find the relation between these differentials which
makes the entropy change vanish. This will be left as a homework problem. Needless to
say, one gets the same expression for the chemical potential that we’ve already derived.

As an example of a macroscopic contribution to the chemical potential, consider our


atmosphere which exists in the gravitational field of the Earth. Suppose the atmosphere
consists of a single kind of atom of mass m, is isothermal, and is in equilibrium. Then the
chemical potential must be the same everywhere in the gas. At height h above the zero
level for the gravitational potential, there is a contribution mgh from the gravitational
field. This means
$$\mu = \text{constant} = \tau\log\frac{n}{n_Q} + mgh ,$$
or
$$n(h) = n(0)\,e^{-mgh/\tau} ,$$
so the concentration decreases exponentially with altitude. With the ideal gas law, $p/\tau =$
N/V = n, so the pressure also decreases exponentially with altitude. We can write

p(h) = p(0)e−h/h0 ,

where $h_0 = \tau/mg = kT/mg = RT/Mg$ is called the scale height of the atmosphere.


R is the gas constant, and M is the molar mass. If we take T = 300 K, M = 28 g
(appropriate for nitrogen molecules), and g = 980 cm s^{-2}, we find h_0 ≈ 9 km. There are
several problems with this simple model. First, the atmosphere is stirred by winds, so it is
not in equilibrium. However, this is important only in the first few miles above sea level.
At higher altitudes, the atmosphere is approximately isothermal, but at a considerably
colder temperature, about 230 K according to the plot in K&K. Also, each molecular species should
have a slightly different scale height, with the lighter molecules having a larger scale height
(and being more likely to completely evaporate from the Earth).
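The scale-height arithmetic is easy to package; here is a short sketch using the isothermal formula $h_0 = RT/Mg$ above, with a few illustrative species.

```python
# Sketch: isothermal scale heights h0 = R T / (M g) for a few species, SI units.
def scale_height(molar_mass_kg, T=300.0, g=9.8, R=8.314):
    return R * T / (molar_mass_kg * g)                  # meters

for name, M in [("N2", 0.028), ("O2", 0.032), ("He", 0.004)]:
    print(f"{name}: h0 ~ {scale_height(M) / 1000.0:.0f} km")
# N2 gives ~9 km, as quoted above; helium's scale height is about seven times larger.
```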


The Gibbs Factor

In lecture 5, we divided up a large object into a system and a heat reservoir and we
considered what happens when the system and reservoir exchange energy. This led us to
the Boltzmann factor and the partition function. Now let's consider what happens if
we divide up the large object into the system and a reservoir, except this time we allow
the exchange of particles as well as energy. The total energy is U0 and the total number
of particles is N0 (which is not Avogadro’s number during this discussion!). What is
the probability that the system is in the single state 1 with N1 particles and energy E1
compared to the probability that it’s in the single state 2 with N2 particles and energy
E2 ? The ratio is
$$\frac{P(N_1,E_1)}{P(N_2,E_2)} = \frac{g(N_0 - N_1,\,U_0 - E_1)\times 1}{g(N_0 - N_2,\,U_0 - E_2)\times 1} ,$$
where g(NR , UR ) is the number of states available to the reservoir when it contains NR
particles and has energy UR . We are just applying our postulate that the probability is
proportional to the number of available states. We have

$$\begin{aligned}
P(N,E) &\propto g(N_0 - N,\,U_0 - E) , \\
&\propto e^{\sigma(N_0 - N,\,U_0 - E)} , \\
&\propto e^{\sigma(N_0,U_0) - N\frac{\partial\sigma}{\partial N} - E\frac{\partial\sigma}{\partial U}} , \\
&\propto e^{\sigma(N_0,U_0) + N\mu/\tau - E/\tau} , \\
&\propto e^{\sigma(N_0,U_0)}\, e^{N\mu/\tau - E/\tau} , \\
&\propto e^{N\mu/\tau - E/\tau} ,
\end{aligned}$$

where we dropped the first factor since it's a constant for any given reservoir. The probability
$$P(N,E) \propto e^{(N\mu - E)/\tau}$$
is called the Gibbs factor. A probability distribution described by Gibbs factors is called
the grand canonical distribution.

Consider the sum
$$Z = \sum_{\text{all } N,\,E} e^{(N\mu - E)/\tau} ,$$
where the sum is over all numbers of particles N , and for each N , all possible states of
N particles with energies E. Or, one can sum over all energies first, and then over all
numbers of particles with that energy. Basically it’s a sum over all possible states which
are parameterized by number of particles and energy. Z is called the Gibbs sum, the grand
sum, or the grand partition function. K&K use a kind of cursive Z symbol for this sum.
I don’t seem to have that in my TEX fonts, so we’ll use Z.


The grand partition function is used to normalize the probabilities:


$$P(N,E) = \frac{1}{Z}\,e^{(N\mu - E)/\tau} .$$
With the normalized probability distribution, we can compute mean values. For example,
the mean number of particles in the system is
$$\begin{aligned}
\langle N\rangle &= \frac{1}{Z}\sum_{\text{all } N,\,E} N\,e^{(N\mu - E)/\tau} , \\
&= \tau\,\frac{1}{Z}\sum_{\text{all } N,\,E} \frac{N}{\tau}\,e^{(N\mu - E)/\tau} , \\
&= \tau\,\frac{1}{Z}\,\frac{\partial Z}{\partial\mu} , \\
&= \tau\,\frac{\partial\log Z}{\partial\mu} .
\end{aligned}$$
The mean energy is slightly more complicated. If we differentiate Z with respect to 1/τ ,
it will pull down a term with the energy in it, but we’ll also get the number of particles
again.
$$\begin{aligned}
\langle N\mu - E\rangle &= \frac{1}{Z}\sum_{\text{all } N,\,E} (N\mu - E)\,e^{(N\mu - E)/\tau} , \\
&= \frac{1}{Z}\,\frac{\partial Z}{\partial(1/\tau)} , \\
&= \frac{\partial\log Z}{\partial(1/\tau)} , \\
\langle E\rangle &= \langle N\mu\rangle - \frac{\partial\log Z}{\partial(1/\tau)} , \\
&= \mu\langle N\rangle - \frac{\partial\log Z}{\partial(1/\tau)} , \\
&= \mu\tau\,\frac{\partial\log Z}{\partial\mu} - \frac{\partial\log Z}{\partial(1/\tau)} , \\
&= \mu\tau\,\frac{\partial\log Z}{\partial\mu} + \tau^2\,\frac{\partial\log Z}{\partial\tau} .
\end{aligned}$$

K&K define the activity as
$$\lambda = e^{\mu/\tau} .$$
The grand partition function can be rewritten in terms of the activity as
$$Z = \sum_{\text{all } N,\,E} \lambda^N e^{-E/\tau} ,$$


from which it follows that the average number of particles is
$$\langle N\rangle = \lambda\,\frac{\partial\log Z}{\partial\lambda} .$$

Example: Binding of N Molecules

This example is related to the myoglobin example discussed in K&K and also to K&K,
chapter 5, problem 14. The example system is a hemoglobin molecule which can bind zero
to four oxygen molecules. A hemoglobin molecule is similar to four myoglobin molecules,
each of which can bind zero or one oxygen molecule.

We will work out an expression for the average number of molecules as a function of
the partial pressure of oxygen in the atmosphere. We will assume that each successive
molecule binds with the same energy (relative to infinite separation), ǫ < 0. This is not quite
right as successive oxygen molecules are bound more tightly than the first oxygen molecule.
Also, we will start by assuming that 0 to M molecules may be bound, and specialize to the
case of four molecules later. Finally, we will assume that there is only one state in which
N molecules are bound. This corresponds to assuming that the molecules are bound to
hemoglobin in a definite order. We will let the activity be λ = exp(µ/τ ). Then the grand
partition function is
$$\begin{aligned}
Z &= 1 + \lambda e^{-\epsilon/\tau} + \left(\lambda e^{-\epsilon/\tau}\right)^2 + \cdots + \left(\lambda e^{-\epsilon/\tau}\right)^M , \\
&= \frac{1 - \left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{1 - \lambda e^{-\epsilon/\tau}} , \\
\log Z &= \log\left(1 - \left(\lambda e^{-\epsilon/\tau}\right)^{M+1}\right) - \log\left(1 - \lambda e^{-\epsilon/\tau}\right) .
\end{aligned}$$

Now we differentiate with respect to λ in order to find the average number of bound molecules,
$$\begin{aligned}
\langle N\rangle &= \lambda\,\frac{\partial\log Z}{\partial\lambda} , \\
&= -\frac{(M+1)\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{1 - \left(\lambda e^{-\epsilon/\tau}\right)^{M+1}} + \frac{\lambda e^{-\epsilon/\tau}}{1 - \lambda e^{-\epsilon/\tau}} , \\
&= \lambda e^{-\epsilon/\tau}\;\frac{1 - (M+1)\left(\lambda e^{-\epsilon/\tau}\right)^{M} + M\left(\lambda e^{-\epsilon/\tau}\right)^{M+1}}{\left(1 - \left(\lambda e^{-\epsilon/\tau}\right)^{M+1}\right)\left(1 - \lambda e^{-\epsilon/\tau}\right)} .
\end{aligned}$$


In the case that M = 1, which corresponds to myoglobin which can bind one molecule,
the expression becomes
$$\langle N\rangle = \frac{\lambda e^{-\epsilon/\tau}}{1 + \lambda e^{-\epsilon/\tau}} .$$
Now, λ = exp(µ/τ ), and in order for our system to be in equilibrium with atmospheric
oxygen it must have the same chemical potential as atmospheric oxygen. This means that
λ = n/nQ = p/τ nQ , where p is the partial pressure of oxygen, τ is the temperature of
atmospheric oxygen (presumably room temperature), and nQ is the quantum concentration
evaluated at temperature τ and for the mass of an O2 molecule. So λ can be evaluated
numerically for any desired partial pressure of oxygen. If we look at the curve for myoglobin
in figure 5.12 of K&K, it appears that the average number of bound molecules is about 1/2
when the partial pressure of oxygen is about 5 mm of Hg. (One atmosphere is 760 mm of
Hg and oxygen is about 20% of the atmosphere, so the maximum partial pressure is roughly
150 mm of Hg.) Let $\lambda_{1/2}$ be the activity when the number of bound molecules is 1/2. From
our expression above, we see that this means λ1/2 exp(−ǫ/τ ) = 1 or exp(−ǫ/τ ) = 1/λ1/2 .
Let’s plug this into the formula for hemoglobin (M = 4) and express the result as a fraction
of the maximum number of bound molecules. The result is

$$f = \frac{\langle N\rangle}{4} = \frac{x}{4}\,\frac{1 - 5x^4 + 4x^5}{(1 - x^5)(1 - x)} = \frac{x}{4}\,\frac{1 + 2x + 3x^2 + 4x^3}{1 + x + x^2 + x^3 + x^4} ,$$

where $x = \lambda/\lambda_{1/2}$. This curve is shown in the figure. Binding more molecules, all with the
same binding energy, causes a sharper transition from “empty” to “full!”
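Here is a short Python sketch comparing the two saturation curves; the function names are just illustrative labels for the $M=1$ and $M=4$ formulas above.

```python
# Sketch: fractional saturation versus x = lambda/lambda_{1/2} for one binding
# site (M = 1) and for four equal-energy sites (M = 4).
def f_single(x):                       # the M = 1 (myoglobin) curve
    return x / (1.0 + x)

def f_four(x):                         # the M = 4 (hemoglobin model) curve above
    num = x * (1 + 2*x + 3*x**2 + 4*x**3)
    den = 4 * (1 + x + x**2 + x**3 + x**4)
    return num / den

for x in [0.25, 0.5, 1.0, 2.0, 4.0]:
    print(f"x = {x:4.2f}:  M=1 -> {f_single(x):.3f}   M=4 -> {f_four(x):.3f}")
# Both curves pass through 1/2 at x = 1, but the M = 4 curve rises more steeply:
# the sharper "empty" to "full" transition described above.
```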


More on the Chemical Potential—Energy to Add a Particle

This section is based on the discussion in K&K in pages 250–252.

As you recall, the chemical potential is the amount of energy required to add one par-
ticle to a system. Also, we learned that the chemical potential for an ideal monatomic gas
is µ = τ log(n/nQ), which is about −14τ to −11τ for a typical gas under typical conditions.
It appears that if we add one more particle to a gas, we’re not required to spend energy,
but we get back some energy! This must be wrong, but what’s the explanation?

The answer has to do with where the particle came from. There’s also an energy
involved in removing the particle from its original location before we add it to the gas.

Suppose we have two containers of the same gas at the same temperature, τ . Suppose
the chemical potentials are different with µ2 > µ1 . Then the concentrations must be
different, or equivalently, the pressures are different with p2 > p1 . If we remove a molecule
from container 1 and add it to container 2, we receive energy µ1 from container 1 but must
give energy µ2 to container 2. The total amount of energy that must be supplied by an
external agent to move this molecule is µ2 − µ1 > 0. What does this turn out to be?
$$\Delta E = \mu_2 - \mu_1 = \tau\log\frac{n_2}{n_Q} - \tau\log\frac{n_1}{n_Q} = \tau\log\frac{n_2}{n_1} .$$
Now suppose we have N molecules of a gas at temperature τ and we isothermally com-
press it from volume V1 down to volume V2 or, equivalently, from concentration n1 up to
concentration n2 . How much mechanical work is required?
$$\begin{aligned}
\Delta W &= -\int_{V_1}^{V_2} p\,dV , \\
&= -N\tau\int_{V_1}^{V_2}\frac{dV}{V} , \\
&= -N\tau\log\frac{V_2}{V_1} , \\
&= N\tau\log\frac{V_1}{V_2} , \\
&= N\tau\log\frac{N/V_2}{N/V_1} , \\
&= N\tau\log\frac{n_2}{n_1} .
\end{aligned}$$
So the energy per molecule required to isothermally change the concentration from n1 to
n2 is just the energy required to move one molecule from a gas at concentration n1 to a
gas at concentration n2 .


In fact, we could imagine doing the following: Isothermally compress the gas in con-
tainer 1 from concentration n1 to concentration n2 . This requires spending an energy
N τ log(n2 /n1 ). Move the molecule from container 1 to container 2. This requires no en-
ergy since the concentrations and the chemical potentials are now the same. Expand the
gas in container 1 back to concentration n1 . This recovers an energy (N − 1)τ log(n2 /n1 ),
so the net expenditure of energy is τ log(n2 /n1 ) = µ2 − µ1 . Recall that the internal en-
ergy of an ideal monatomic gas depends only on its temperature (U = 3N τ /2). Before
and after we moved the molecule from container 1 to container 2, the temperature of all
the gas was τ , so the internal energy of the gas did not change! Where did the energy
τ log(n2 /n1 ) go??? Hints: has the free energy of the combined systems changed? What
about the entropy?

Example: Chemical Potential and Batteries

Surprise: chemical potential might actually have something to do with chemistry!


An example has to do with batteries—or better, voltaic cells. K&K have a discussion of
the lead acid battery used in cars on pages 129–131. However, I’ve been told that this
discussion is not quite right. In particular see Saslow, W., 1996, PRL, 76, 4849.

By the way, did you know that Princeton subscribes to many of the on-line journals?
This means if you access the web from a Princeton address, you’ll be allowed to read the
journals on-line. In particular, you can find Physical Review and Physical Review Letters
on-line and the article cited above can be downloaded and printed out.

Rather than discuss the lead acid battery, let’s look at a simpler (I hope) system:
the Daniell cell. This is discussed by the same author in 1999, AJP, 67, 574. (True
confession: I have not read the article in the American Journal of Physics, but rather, the
preprint that used to be on the author’s web site. But the TAMU physics web site has
been revamped and I can’t find the preprint anymore!) The Daniell cell is also discussed
in chemistry textbooks, such as the one I used many years ago, Pauling, L., 1964, College
Chemistry, (San Francisco:Freeman), p. 354. The following discussion is based on both of
these sources.

The figure shows a schematic of the cell. It has a solution of zinc sulfate (ZnSO4 )
surrounding a zinc electrode and a copper sulfate (CuSO4 ) solution surrounding a copper
electrode. The two solutions are in contact. The zinc electrode is the negative electrode
or cathode and the copper electrode is the positive electrode or anode.

Chemical reactions occur at the electrodes. At the copper electrode, the reaction is
Cu++ + 2e− → Cu .
The copper ion was in solution and the electrons come from the electrode. The neutral
copper atom “plates out” on the copper electrode.


At the zinc electrode, the reaction is


Zn → Zn++ + 2e− .
Zinc atoms in the electrode go into solution as zinc ions and leave behind two electrons on
the cathode.

If a wire is connected between the two


electrodes, the electrons left behind by the
zinc can travel through the external circuit
to the copper electrode where they join up
with the copper ions to plate out the copper
atoms. (Of course, electrons go in one end
and different electrons come out the other
end. . ..) Charge is transfered inside the cell,
through the electrolyte, by sulfate ions. That
is, one can think of CuSO4 dissociating into
Cu++ and SO4−− at the positive electrode,
the Cu++ plates out leaving behind a spare
sulfate ion which diffuses over to the negative
electrode to join up with a zinc ion and form
ZnSO4 . (Of course, sulfate ions don’t go all
the way across the electrolyte; ions go in one
end and different ions come out the other end. . ..) Essentially all the current in the
electrolyte is carried by the ions and none by electrons.

If we actually have a complete circuit, current will flow until one of the consumables
is exhausted. If all the copper is plated out of solution or if the zinc electrode is completely
dissolved, that will be the end of the cell. When operated in this mode, the cell converts
chemical potential energy into electrical energy.

Our methods apply to equilibrium situations, so we’ll discuss the situation when
there is no current flowing in the external circuit and the system has reached equilibrium.
(Actually, a non-uniform distribution of electrolytes is also not an equilibrium situation, so
we are really assuming that the time for the electrolytes to diffuse is long compared to the
time for the reactions at the electrodes to complete.) As zinc goes into solution and copper
plates out, the electrodes acquire charges and electric potentials. When the potentials
are large enough the reactions stop. When the reactions stop, the chemical potentials of
the atoms/ions must be the same whether they are in solution or on the electrodes.

Let Va , Vs , and Vc be the electric potentials (voltages) of the anode, solutions, and
cathode. Note that we assume the electrolytes (solutions) are equipotentials. If not, there
would be a current flow until a uniform potential is established. The voltage of the cell
(e.g., measured by a voltmeter placed across the anode and cathode) is
Vcell = Va − Vc = (Va − Vs ) + (Vs − Vc ) .


Consider a zinc ion in the cathode. When equilibrium has been established, the chemical
potential of the zinc in the cathode must be the same as that of zinc in solution. The
chemical potential is made of two parts: the internal chemical potential and the potential
energy of the ion in the macroscopic electric potential of the cathode or the solution:

µci (Zn++ ) + 2eVc = µsi (Zn++ ) + 2eVs ,

or
µci (Zn++ ) − µsi (Zn++ ) = 2e(Vs − Vc ) ,
where e > 0 represents the magnitude of the charge on an electron and µci and µsi represent
the internal chemical potentials in the cathode and the solution. Note that I have shown
the zinc as zinc ions on the cathode as well as in solution. This is mainly for clarity and
can be justified by noting that the conduction electrons in a metal are not localized to
any particular atom. The difference of internal chemical potentials is determined by the
chemical reaction. It is customary to divide this by the magnitude of the electric charge
and the number of charges involved and tabulate as a potential difference. So, for example,
my 1962 edition of the Handbook of Chemistry and Physics has a table titled “Potentials
of Electrochemical Reactions at 25◦C” in which one finds +0.7628 V listed for the reaction
Zn → Zn++ + 2e− . This means that Vs − Vc is about 0.76 V.

At the anode, with no current flowing, we have

µai (Cu++ ) − µsi (Cu++ ) = −2e(Va − Vs ) .

The Handbook lists the electric potential of the reaction Cu → Cu++ + 2e− as −0.3460 V.
Thus Va − Vs = 0.35 V and the open circuit cell potential is Vcell = 1.11 V.

Comments: the potentials associated with reactions that occur at the cathode or
anode are called half cell potentials. If this reminds you of redox reactions in chemistry,
it should! The Handbook contains a table titled “Electromotive Force and Composition
of Voltaic Cells” which gives the composition and voltage of selected cells. The half cell
voltages are determined by defining a standard half cell (a platinum electrode over which
hydrogen ions are bubbled) as a standard with zero half cell potential. Then all other half
cells are measured relative to the standard. Recall: only potential energy differences are
important! Finally, by now you should be getting a feel for why it’s called the chemical
potential!


Example: Magnetic Particles in a Magnetic Field

Recall the paramagnetic spin system we discussed in lecture 4. In this system, there
are magnets with orientations parallel or antiparallel to a magnetic field. In the parallel
orientation, the energy is −mB = −E, where m is the magnetic moment and B is the
magnetic field. In the antiparallel orientation the energy is +mB = +E. In lecture 4,
we worked out the relative numbers of parallel and antiparallel magnets and found that it
depended on the ratio of thermal to magnetic energies.

Following the discussion in K&K, pages 127–129, suppose that we have the same kind
of system, but in addition, the magnetic particles are free to move, so the aligned magnets
will be attracted to regions of high field strength while the antiparallel magnets will be
repelled from regions of high field strength. Of course, in the regions of high field, one
would expect to find a greater fraction aligned even if the particles couldn’t move. . . Let n↑
be the concentration of parallel and n↓ be the concentration of antiparallel systems. Just
as with an ideal gas, we expect that microscopic or internal contribution to the chemical
potential should depend on the concentration,
$$\mu_{\uparrow,\mathrm{int}} = \tau\log\frac{n_\uparrow}{n_Q} \qquad\text{and}\qquad \mu_{\downarrow,\mathrm{int}} = \tau\log\frac{n_\downarrow}{n_Q} .$$

We assume that we can treat the parallel and antiparallel magnets as distinct kinds of
“particles.” To the internal chemical potential must be added the external potential due to
the energy in the magnetic field,
$$\mu_\uparrow = \tau\log\frac{n_\uparrow}{n_Q} - mB , \qquad
\mu_\downarrow = \tau\log\frac{n_\downarrow}{n_Q} + mB .$$

Now, the parallel and antiparallel magnets are in thermal equilibrium with each other and
can be changed into one another. That is, one can remove a particle from the parallel
group and add it to the antiparallel group and vice-versa. When the system has come to
equilibrium, at temperature τ , the free energy must be stationary with respect to changes
in the particle numbers which means the chemical potentials of the two kinds of particles
must be the same. Furthermore, we are allowing the particles to diffuse to regions of higher
or lower field strength, and the chemical potential must be independent of field strength.
So,
µ↑ = µ↓ = Constant.
This relation together with the previous equations are easily solved to yield

1 1
n↑ (B) = n(0)e+mB/τ and n↓ (B) = n(0)e−mB/τ ,
2 2


where we’re explicitly showing that the concentrations depend on B and n(0) is the com-
bined concentration where B = 0. The combined concentration as a function of B is
 
$$n(B) = n_\uparrow(B) + n_\downarrow(B) = n(0)\cosh(mB/\tau) = n(0)\left(1 + \frac{m^2B^2}{2\tau^2} + \cdots\right) .$$
These relations show both effects we mentioned earlier. The higher the field strength,
the greater the fraction of aligned magnets (as we already knew from lecture 4) and the
greater the concentration of magnets. The magnetic particles diffuse to regions of high
field strength.

In figure 5.6, K&K show a plot of chemical potential versus concentration for sev-
eral different field strengths. In problem 5 of chapter 5, we are asked for what value of
m/τ was this figure drawn. The key datum to extract from the plot is that at a given
chemical potential, the concentration increases by two orders of magnitude as B is in-
creased from 0 to 20 kG. We can plug this directly into the previous expression to get
m/τ = 5.30/(20000 G) = 0.000265 G^{-1}. Note that we had to use the cosh form of the
expression, not the series, because mB/τ > 1. Problem 5.5 also asks how many Bohr
magnetons must be contained in each particle. A Bohr magneton (roughly the magnetic
moment of an electron) is µB = eh̄/2mc where e and m are the charge and mass of an
electron. µB = 0.927 × 10^{-20} erg G^{-1}. Doing the arithmetic, we obtain about 1200 mag-
netons. The particles must contain 1200 paramagnetic molecules with a spin of h̄/2 and
a magnetic moment of µB . They could also contain a more or less arbitrary number of
non-magnetic molecules.
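The arithmetic can be packaged in a few lines of Python; this sketch assumes room temperature (T = 300 K) for τ, which the problem seems to intend, and works in CGS units.

```python
# Sketch: the arithmetic for K&K problem 5.5, CGS units, assuming T = 300 K.
import math

k_B  = 1.381e-16    # erg / K
mu_B = 0.927e-20    # erg / G  (Bohr magneton)
B    = 20000.0      # G

mB_over_tau = math.acosh(100.0)       # cosh(mB/tau) = 100 from the two-decade increase
m_over_tau  = mB_over_tau / B         # ~2.65e-4 per gauss
m = m_over_tau * k_B * 300.0          # magnetic moment of one particle, erg/G
print(f"mB/tau = {mB_over_tau:.2f}, m/tau = {m_over_tau:.2e} 1/G")
print(f"m = {m:.2e} erg/G, about {m / mu_B:.0f} Bohr magnetons")
```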

Example: Impurity Ionization

In pages 143–144, K&K discuss an impurity atom in a semiconductor. The atom may
lose a valence electron and become ionized. The energy required to remove an electron
from the donor atom is I. The model for this impurity atom is a three state system:
the ionized state has energy 0 and no electron is present. There are two bound states,
both have energy −I and both have one electron present. One has the electron with spin
up along some axis and the other has the electron with spin down. The grand partition
function is
$$Z = 1 + e^{(\mu + I)/\tau} + e^{(\mu + I)/\tau} ,$$
where the first term comes from the ionized state and the second and third terms account
for the spin up and spin down bound states. The average number of (bound) electrons
and the average energy are

$$\begin{aligned}
\langle N\rangle &= \frac{e^{(\mu+I)/\tau} + e^{(\mu+I)/\tau}}{1 + e^{(\mu+I)/\tau} + e^{(\mu+I)/\tau}} = \frac{2e^{(\mu+I)/\tau}}{1 + 2e^{(\mu+I)/\tau}} , \\
\langle E\rangle &= \frac{-Ie^{(\mu+I)/\tau} - Ie^{(\mu+I)/\tau}}{1 + e^{(\mu+I)/\tau} + e^{(\mu+I)/\tau}} = \frac{-2Ie^{(\mu+I)/\tau}}{1 + 2e^{(\mu+I)/\tau}} .
\end{aligned}$$


The probability that the impurity atom is ionized is
$$P(N = 0) = \frac{1}{1 + 2e^{(\mu + I)/\tau}} .$$

If we don’t know the value of µ, we can’t actually calculate any of these averages or this
probability. What sets the value of µ? Answer: µ is determined by the electron distribution
in the rest of the semiconductor. (A subject we’ll get to in a few weeks!) Although we don’t
know µ at this point, we’re used to the idea that µ increases with increasing concentration.
In the above expressions we see that increasing µ increases the mean number of particles
in the system, decreases the mean energy (energy goes down for a bound particle), and
decreases the probability of being ionized. All this is reasonable and might have been
expected. The higher the concentration of electrons in the semiconductor, the harder it is
for the atom to give an extra electron to the semiconductor and become ionized!
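A tiny sketch of how the ionization probability and mean occupancy vary with $(\mu + I)/\tau$, using the formulas above (the sample values of the exponent are arbitrary):

```python
# Sketch: mean occupancy and ionization probability of the donor as a
# function of (mu + I)/tau, from the three-term grand partition function.
import math

def p_ionized(y):                     # y = (mu + I)/tau
    return 1.0 / (1.0 + 2.0 * math.exp(y))

def n_bound(y):
    return 2.0 * math.exp(y) / (1.0 + 2.0 * math.exp(y))

for y in [-4.0, -2.0, 0.0, 2.0, 4.0]:
    print(f"(mu+I)/tau = {y:+.0f}:  P(ionized) = {p_ionized(y):.3f}  <N> = {n_bound(y):.3f}")
# Raising mu drives <N> toward 1 and the ionization probability toward 0,
# as argued above.
```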

Example: K&K, Chapter 5, Problem 6

In this problem we are asked to work with a 3 state system. The states are: (1) no
particle, energy is 0; (2) one particle, energy is still 0; (3) one particle, energy is ǫ, so
a particle can be absent, present with zero energy, or present with energy ǫ. The grand
partition function is
Z = 1 + λ + λe−ǫ/τ ,
where λ = exp(µ/τ ). The three terms in this sum correspond to the three states enumerated
above. The thermal average occupancy is just the average number of particles in the system
and is
$$\langle N\rangle = \frac{1}{Z}\left(0\cdot 1 + 1\cdot\lambda + 1\cdot\lambda e^{-\epsilon/\tau}\right) = \frac{\lambda + \lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$
Of course, this result can also be obtained using hN i = λ(∂/∂λ) log Z. Just as in the
previous example, increasing µ (λ) makes it harder for the system to give the particle
to the reservoir (which determines µ) and the system is more likely to contain a bound
particle. The thermal average occupancy of the state with energy ǫ is

$$\langle N(E=\epsilon)\rangle = \frac{\lambda e^{-\epsilon/\tau}}{Z} = \frac{\lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$

Here we see that in the limit of very large µ (λ) the system always contains a particle, and
the relative probability that the particle is in the high energy state is just the Boltzmann
factor, exp(−ǫ/τ ). The average energy is

$$\langle E\rangle = \frac{\epsilon\lambda e^{-\epsilon/\tau}}{Z} = \frac{\epsilon\lambda e^{-\epsilon/\tau}}{1 + \lambda + \lambda e^{-\epsilon/\tau}} .$$


Finally, we are asked to calculate the grand partition function in the event that a
particle can exist in both the zero energy state and the state with energy ǫ simultaneously.
In other words, there is a fourth state of the system; it contains two particles and has
energy ǫ. We have

$$Z = 1 + \lambda + \lambda e^{-\epsilon/\tau} + \lambda^2 e^{-\epsilon/\tau} = (1 + \lambda)\,(1 + \lambda e^{-\epsilon/\tau}) .$$

In this case, Z can be factored. K&K point out that this means that the system can
be treated as two independent systems. This is an example of a general rule that for
independent (but weakly interacting) systems, the grand partition function is the product
of the grand partition functions for each independent system, just as the partition function
of independent systems is a product of the individual partition functions (Homework 2,
problem 4).

Fermi-Dirac and Bose-Einstein Distributions

When we considered a low density gas in lecture 7, we considered the single particle
states of a particle confined to a box. To treat more than one particle, we imagined that
the particles were weakly interacting, so we could, to some level of approximation, treat
each particle as though it occupied a single particle state. In the limit of no interactions
between particles, this would be exact (but it might be hard to achieve thermal equilib-
rium!). For typical gases at room temperature and atmospheric pressure we found that the
concentration was very low, so that the chance that any single particle state was occupied
was very small, maybe one part in a million. We just didn’t have to worry about the
chances of finding two particles in a state.

Now we want to consider the distribution when there’s a good chance of finding single
particle states occupied. We are going to assume that we have non-interacting particles in
which each particle in the system can be said to be in a single particle state. There are
two kinds of particles, fermions, which have half integer spins (spin angular momentum
is a half integer times h̄), and bosons, which have integral spins. Fermions obey the Pauli
exclusion principle: at most one particle may occupy a single state. On the other hand, an
unlimited number of bosons may be placed in any given state. Since we have independent
single particle states, the grand partition function for all the states is the product of the
grand partition function for the individual states. So to start with, let’s calculate the grand
partition function for an individual state of energy ǫ.

In the case of fermions, there are two possibilities: no particle present with energy 0
and one particle present with energy ǫ. Then

$$Z = 1 + e^{(\mu - \epsilon)/\tau} .$$


The average number of particles in this state of energy ǫ is denoted by $f(\epsilon)$:
$$f(\epsilon) = \langle N\rangle = \frac{e^{(\mu - \epsilon)/\tau}}{1 + e^{(\mu - \epsilon)/\tau}} = \frac{1}{e^{(\epsilon - \mu)/\tau} + 1} .$$

This is called the Fermi-Dirac distribution and fermions are said to obey Fermi-Dirac
statistics. If ǫ = µ, then the average number of particles in the state is 1/2. If ǫ < µ, the
average occupancy is bigger than 1/2 and approaches 1 as ǫ → −∞. If ǫ > µ, the average
occupancy is less than 1/2 and approaches 0 as ǫ → +∞. This distribution starts at 1 at
very low energies, winds up at 0 at very high energies and makes the transition from 0 to
1 in the neighborhood of µ. The temperature controls the width of the transition. At very

low temperatures the transition is sharp. At high temperatures, the transition is gradual.

For bosons, the possibilities are no particles present with energy 0, 1 particle present
with energy ǫ, 2 particles present with energy 2ǫ, 3 particles present with energy 3ǫ, and
so on. The grand partition function is

1
Z = 1 + e(µ − ǫ)/τ + e2(µ − ǫ)/τ + e3(µ − ǫ)/τ + · · · = .
1 − e(µ − ǫ)/τ

Note that µ < ǫ if the sum is to converge. The average occupancy is
$$f(\epsilon) = \tau\,\frac{\partial\log Z}{\partial\mu} = \frac{e^{(\mu - \epsilon)/\tau}}{1 - e^{(\mu - \epsilon)/\tau}} = \frac{1}{e^{(\epsilon - \mu)/\tau} - 1} .$$


This is called the Bose-Einstein distribution and bosons obey Bose-Einstein statistics.
Again, note that µ < ǫ if the distribution function is to make sense. In fact, if we have
weakly interacting particles occupying states of several different energies, they all have
the same chemical potential which must therefore be less than the lowest energy of any
available state. In other words
$$\mu < \epsilon_{\mathrm{minimum}} .$$
The minimum energy is often set to zero, and then µ < 0, but the real constraint is just
that µ be lower than any accessible energy. The Bose-Einstein distribution diverges as
ǫ → µ. As ǫ → +∞ the distribution goes exponentially to zero. The average occupancy is
1 when ǫ − µ = τ log 2. At lower energies there is more than one particle in the state and

at higher energies there is less than one particle in the state.

The Bose-Einstein distribution, with µ = 0, is exactly the occupancy we came up


with for photons in blackbody radiation. Photons have spin 1, so they are bosons and
obey Bose-Einstein statistics. There is no lower limit on the wavelength, so the lowest
conceivable energy is arbitrarily close to zero which means µ ≤ 0.


At large energies both the Fermi-Dirac and Bose-Einstein distributions become

$$f(\epsilon \to +\infty) \to e^{(\mu - \epsilon)/\tau} .$$

In this limit the average occupancy is small and quantum effects are negligible; this is the

classical limit. The classical distribution is called the Maxwell-Boltzmann distribution.
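For comparison, here is a minimal sketch tabulating the three occupancy functions as a function of $(\epsilon - \mu)/\tau$; the sample values are arbitrary.

```python
# Sketch: the three occupancy functions versus (eps - mu)/tau.
import math

def f_fd(x):   # Fermi-Dirac
    return 1.0 / (math.exp(x) + 1.0)

def f_be(x):   # Bose-Einstein (needs x > 0)
    return 1.0 / (math.exp(x) - 1.0)

def f_mb(x):   # Maxwell-Boltzmann (classical limit)
    return math.exp(-x)

for x in [0.1, 0.5, 1.0, 2.0, 4.0]:
    print(f"(eps-mu)/tau = {x:3.1f}:  FD = {f_fd(x):.4f}  BE = {f_be(x):.4f}  MB = {f_mb(x):.4f}")
# All three agree when (eps - mu)/tau >> 1; near zero the BE occupancy diverges
# while the FD occupancy approaches 1/2.
```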


Reading

K&K chapter 6 and the first half of chapter 7 (the Fermi gas).

The Ideal Gas Again

Using the grand partition function, we’ve discussed the Fermi-Dirac and Bose-Einstein
distributions and their classical—low occupancy—limit, the Maxwell-Boltzmann distribu-
tion.

In lecture 7, we considered an ideal gas starting from the partition function. We


considered the states of a single particle in a box and we used the Boltzmann factor and
the number of such states to calculate the partition function for a single particle in a box.
Then we said the partition function for N weakly interacting particles is the product of N
single particle partition functions divided by N !,

$$Z_N(\tau) = \frac{1}{N!}\,Z_1^N = \frac{1}{N!}\,(n_Q V)^N .$$
We introduced the factor of N ! to account for the permutations of the N particles among
the single particle states forming the overall composite state of the system.

The introduction of this N ! factor was something of a “fast one!” We gave a plausible
argument for it, but without a formalism that includes the particle number, it’s hard to
do more. Now that we have the grand partition function we can reconsider the problem.

In addition to cleaning up this detail, we also want to consider how to account for
the internal states of the molecules in a gas, the heat and work, etc., required for various
processes with an ideal gas, and also we want to consider the absolute entropy and see how
the Sackur-Tetrode formula relates to experiment.


The N Particle Problem

The factor of N ! in the ideal gas partition function was apparently controversial in
the early days of statistical mechanics. In fact, in Schroedinger’s book on the subject, he
has a chapter called “The N Particle Problem.”

I think this is kind of amusing, so let’s see what the N particle problem really is.

The free energy with the N ! term is

$$\begin{aligned}
F_C &= -\tau\log Z , \\
&= -\tau\log\bigl((n_Q V)^N/N!\bigr) , \\
&= -\tau\,(N\log n_Q + N\log V - N\log N + N) , \\
&= -\tau N\log\Bigl[(m\tau/2\pi\hbar^2)^{3/2}\,(V/N)\Bigr] - \tau N ,
\end{aligned}$$

where the subscript C denotes the “correct” free energy. Without the N ! term, the “in-
correct” free energy is
$$\begin{aligned}
F_I &= -\tau\log Z , \\
&= -\tau\log(n_Q V)^N , \\
&= -\tau\,(N\log n_Q + N\log V) , \\
&= -\tau N\log\Bigl[(m\tau/2\pi\hbar^2)^{3/2}\,V\Bigr] .
\end{aligned}$$

For the entropy, σ = −∂F/∂τ ,

$$\sigma_C = N\log(n_Q V/N) + (3/2)N + N = N\left(\log\frac{n_Q V}{N} + \frac{5}{2}\right) ,$$
and
$$\sigma_I = N\log(n_Q V) + (3/2)N = N\left(\log(n_Q V) + \frac{3}{2}\right) .$$
With a given amount of gas, the change in entropy between an initial and final state is
given correctly by either formula,

$$\sigma_{Cf} - \sigma_{Ci} = \sigma_{If} - \sigma_{Ii} = \frac{3}{2}\,N\log\frac{\tau_f}{\tau_i} + N\log\frac{V_f}{V_i} .$$
But what happens when we change the amount of gas?

Note that N and V are both extensive quantities; the concentration, n = N/V , is an
intensive quantity. σC is proportional to an extensive quantity. On the other hand, σI


contains an extensive quantity times the logarithm of an extensive quantity. This means
that the (incorrect) entropy is not proportional to the amount of gas we have!

For example, suppose we have two volumes V , each containing N molecules of (the
same kind of) gas at temperature τ . Then each has σI = N (log(nQ V ) + 3/2), for a total of
2N (log(nQ V )+3/2). We can imagine a volume 2V divided in half by a removable partition.
We start with the partition in place and the entropy as above. We remove the partition.
Now we have 2N molecules in a volume 2V . The entropy becomes σI = 2N (log(2nQ V ) +
3/2) which exceeds the entropy with the partition in place by ∆σI = 2N log 2! But did
anything really change upon removing the partition? What kind of measurements could
we make on the gas in either volume to detect whether the partition were in place or not???

Note that the total σC is the same before and after the partition is removed.

We might consider the same experiment but performed with two different kinds of
molecules, A and B. We start with N molecules of type A on one side of the partition and
N molecules of type B on the other side of the partition. Before the partition is removed,
we have
$$\sigma_C = N\left(\log\frac{n_{QA} V}{N} + \frac{5}{2}\right) + N\left(\log\frac{n_{QB} V}{N} + \frac{5}{2}\right) ,$$
N 2 N 2
where the two kinds of molecules may have different masses and so might have different
quantum concentrations. Now we remove the partition. This time we have to wait for
equilibrium to be established. We assume that no chemical reactions occur—we are only
waiting for the molecules to diffuse so that they are uniformly mixed. Once equilibrium
has been established, each molecule occupies single particle states in a volume 2V and the
entropy is
$$\sigma_C = N\left(\log\frac{2n_{QA} V}{N} + \frac{5}{2}\right) + N\left(\log\frac{2n_{QB} V}{N} + \frac{5}{2}\right) ,$$
which is 2N log 2 greater than the initial entropy. This increase is called the entropy of
mixing.

In the experiment with the same gas on both sides of the partition, the incorrect
expression for entropy gave an increase which turns out to be the same as the entropy
of mixing if we start out with two different gases. This emphasizes the point that the
incorrect expression results from over counting the states by treating the molecules as
distinguishable.

In the mixing experiment, we can make measurements that tell us whether the par-
tition has been removed. If we sample the gas in one of the volumes and find all the
molecules are type A, then we’re pretty sure that the partition hasn’t been removed! If we
find a mixture of type A and type B, then we’re pretty sure that it has been removed.

Finally, note that if we go back to the case of the same gas, and reinsert the partition,


then σI decreases by 2N log 2. This is a violation of the second law!

The Ideal Gas From the Grand Partition Function

An ideal gas is the low occupancy limit of non-interacting particles. In this limit, both
the Fermi-Dirac and Bose-Einstein distributions become the Maxwell-Boltzmann distribu-
tion which is
$$f(\epsilon) = e^{(\mu - \epsilon)/\tau} ,$$
where f (ǫ) ≪ 1 is the average occupancy of a state with energy ǫ. The chemical potential,
µ, is found by requiring that the gas have the correct number of molecules,
$$\begin{aligned}
N &= \sum_{\text{all states}} e^{(\mu - \epsilon)/\tau} , \\
&= e^{\mu/\tau}\sum_{\text{all states}} e^{-\epsilon/\tau} , \\
&= e^{\mu/\tau}\,Z_1 , \\
&= e^{\mu/\tau}\,n_Q V ,
\end{aligned}$$

where $Z_1$ is the single particle partition function we discussed earlier. Then
$$\mu = \tau\log\frac{n}{n_Q} ,$$
as we found earlier.

The free energy satisfies


 
$$\left(\frac{\partial F}{\partial N}\right)_{\tau,V} = \mu(N,\tau,V) ,$$

so
$$\begin{aligned}
F &= \int_0^N \mu(N',\tau,V)\,dN' , \\
&= \tau\int_0^N \log\frac{N'}{n_Q V}\,dN' , \\
&= \tau\int_0^N \bigl(\log N' - \log(n_Q V)\bigr)\,dN' , \\
&= \tau\Bigl[\,N'\log N' - N' - N'\log(n_Q V)\,\Bigr]_0^N , \\
&= N\tau\left(\log\frac{n}{n_Q} - 1\right) .
\end{aligned}$$


Of course, this is in agreement with what we had before. The N ! factor that we previously
inserted by hand, comes about naturally with this method. (It is responsible for the N in
the concentration in the logarithm and the −1 within the parentheses.)

As a reminder, the pressure is found from the free energy by


 
$$p = -\left(\frac{\partial F}{\partial V}\right)_{\tau,N} ,$$

which gives the ideal gas equation of state
$$p = \frac{N\tau}{V} .$$
The entropy is found by differentiating with respect to the temperature,
 
$$\sigma = -\left(\frac{\partial F}{\partial\tau}\right)_{V,N} ,$$

which gives the Sackur-Tetrode expression,


 
$$\sigma = N\left(\log\frac{n_Q}{n} + \frac{5}{2}\right) .$$

The internal energy is most easily found from

$$U = F + \tau\sigma = \frac{3}{2}\,N\tau .$$
The energy of an ideal gas depends only on the number of particles and the temperature.
Since
dU = τ dσ − p dV + µ dN ,
the change in energy at constant volume and particle number is just τ dσ. Then the heat
capacity at constant volume is
 
$$C_V = \tau\left(\frac{\partial\sigma}{\partial\tau}\right)_{V,N} ,$$

which for the case of an ideal gas is

$$C_V = \frac{3}{2}\,N = \frac{3}{2}\,Nk ,$$
where the last expression gives the heat capacity in conventional units. The molar specific
heat at constant volume is (3/2)N0 k = (3/2)R where R is the gas constant. The heat


capacity at constant pressure can be found by requiring that dV and dτ be such that p
doesn’t change.      
∂σ ∂U ∂V
Cp = τ = +p .
∂τ p,N ∂τ p,N ∂τ p,N
Since U depends only on N and τ ,
 
$$\left(\frac{\partial U}{\partial\tau}\right)_{p,N} = C_V .$$

With the ideal gas equation of state, V = (N/p)τ ,


 
$$p\left(\frac{\partial V}{\partial\tau}\right)_{p,N} = N ,$$

so
Cp = CV + N , or Cp = CV + N k (in conventional units) .
For the molar heat capacities, we have

Cp = CV + R ,

and for the ideal monatomic gas, these are

$$C_V = \frac{3}{2}\,R \qquad\text{and}\qquad C_p = \frac{5}{2}\,R .$$
The ratio of specific heats is usually denoted by γ, which for an ideal monatomic gas is

$$\gamma = \frac{C_p}{C_V} = \frac{5}{3} .$$


Internal Degrees of Freedom

There are several corrections we might make to our treatment of the ideal gas. If
we go to high occupancies, our treatment using the Maxwell-Boltzmann distribution is
inappropriate and we should start from the Fermi-Dirac or Bose-Einstein distribution
directly.

We have ignored the interactions between molecules. This is a good approximation


for low density gases, but not so good for higher densities (but these higher densities can
still be low enough that the MB distribution applies). We will discuss an approximate
treatment of interactions in a few weeks when we discuss phase transitions.

Finally, we have ignored any internal structure of the molecules. We will remedy this
omission now. We imagine that each molecule contains several internal states with energies
ǫint . Note that int is understood to be an index over the internal states. There may be
states with the same energy and states with differing energies. In our non-interacting
model, the external energy is just the kinetic energy due to the translation motion of the
center of mass, ǫcm . Again, cm is to be understood as an index which ranges over all states
of motion of the cm. Although we are considering internal energies, we are not considering
ionization or dissociation. When a molecule changes its internal state, we assume the
number of particles does not change.

Let’s consider the grand partition function for a single state of center of mass motion.
That is, we’re going to consider the grand partition function for single particle states—with
internal degrees of freedom—in a box. The energy of the particle is ǫcm + ǫint . Then the
grand partition function is

$$\begin{aligned}
Z &= 1 + e^{(\mu - \epsilon_{cm} - \epsilon_{int,1})/\tau} + e^{(\mu - \epsilon_{cm} - \epsilon_{int,2})/\tau} + \cdots
  + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{cm})/\tau}\sum_{int} e^{-\epsilon_{int}/\tau}
  + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int} + \text{two particle terms} + \text{three particle terms} + \cdots , \\
&= 1 + e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int} + e^{2(\mu - \epsilon_{cm})/\tau}\,\frac{Z_{int}^2}{2!} + e^{3(\mu - \epsilon_{cm})/\tau}\,\frac{Z_{int}^3}{3!} + \cdots ,
\end{aligned}$$
where Zint is the partition function for the internal states. The above expression is strictly
correct only for bosons. For fermions, we would need to be sure that the multiple particle
terms have all particles in different states which means that the internal partition functions
do not factor as shown above.

However, we really don’t need to worry about this because we’re going to go to the
classical limit where the occupancy is very small. This means we can truncate the sum


above after the second term,

$$Z = 1 + e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int} .$$


The mean occupancy of the center of mass state, whatever the internal state, is

$$f(\epsilon_{cm}) = \frac{e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int}}{1 + e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int}} \approx e^{(\mu - \epsilon_{cm})/\tau}\,Z_{int} ,$$

which is just the Maxwell-Boltzmann distribution with an extra factor of the internal
partition function, Zint .

Now we should modify our previous expressions to allow for this extra factor of Zint .
Recall that we chose the chemical potential to get the correct number of particles. In
that calculation, exp(µ/τ ) must be replaced by exp(µ/τ )Zint, and everything else will go
through as before. Then our new expression for µ is
$$\mu = \tau\log\frac{n}{n_Q Z_{int}} = \tau\left(\log\frac{n}{n_Q} - \log Z_{int}\right) .$$
The free energy becomes
$$F = F_{cm} + F_{int} = N\tau\left(\log\frac{n}{n_Q Z_{int}} - 1\right) ,$$
where Fcm is our previous expression for the free energy due to the center of mass motion
of molecules with no internal degrees of freedom, and
Fint = −N τ log Zint ,
is the free energy of the internal states alone. The expression for the pressure is unchanged
since in the normal situation, the partition function of the internal states does not depend
on the volume. (Is this really true? How do we get liquids and solids? Under what
conditions might it be a good approximation?) The expression for the entropy becomes
σ = σcm + σint ,
where σcm is our previous expression for the entropy of an ideal gas, the Sackur-Tetrode
expression, and
$$\sigma_{int} = -\left(\frac{\partial F_{int}}{\partial\tau}\right)_{V,N} = \left(\frac{\partial(N\tau\log Z_{int})}{\partial\tau}\right)_{V,N} = N\log Z_{int} + N\tau\left(\frac{\partial\log Z_{int}}{\partial\tau}\right)_{V,N} .$$

The energy, U , and therefore the heat capacities, receive a contribution from the internal
states. The extra energy is
$$U_{int} = F_{int} + \tau\sigma_{int} = F_{int} - \tau\left(\frac{\partial F_{int}}{\partial\tau}\right)_{V,N} = -\tau^2\,\frac{\partial}{\partial\tau}\left(\frac{F_{int}}{\tau}\right) .$$


Examples of Zint

We have been discussing how to modify our treatment to allow for the internal states
and energies of molecules in, for example, an ideal gas. To make further progress, we need
to consider some specific examples of internal structure that can give rise to Zint . Suppose
the molecules are single atoms but these atoms have a spin quantum number S. Then
there are 2S + 1 internal states that correspond to the 2S + 1 projections of the spin along
an arbitrary axis. In the absence of a magnetic field, all these states have the same energy
which we take as ǫint = 0. Then Zint = 2S + 1 and

Fint = −N τ log(2S + 1) ,

σint = N log(2S + 1) ,
Uint = 0 ,
so the entropy is increased over that of a simple ideal gas, but the energy doesn’t change.
The increase in entropy is easy to understand. What’s happening is that each atom has
2S + 1 times as many states available as a simple atom with no internal structure. The
entropy, the logarithm of the number of states, increases by log(2S + 1) per atom.

That was a fairly trivial example. Here’s another one: Suppose that each molecule
has one internal state with energy ǫ1 . Then Zint = exp(−ǫ1 /τ ) and

Fint = −N τ log Zint = +N ǫ1 ,

σint = 0 ,
Uint = N ǫ1 ,
and
∆µ = −τ log Zint = +ǫ1 .
In this example, we didn’t change the entropy (each molecule has just one state), but we
added ǫ1 to the energy of each molecule. This change in energy shows up in the chemical
potential as a per molecule change and it shows up in the free energy and energy as N
times the per molecule change. This example is basically a small test of the self-consistency
of the formalism!
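Both examples can be checked numerically. The sketch below computes the per-molecule Fint, Uint, and σint directly from a list of internal level energies; the levels and the temperature used here are arbitrary illustrative inputs.

```python
# Sketch: per-molecule F_int, U_int, and sigma_int from a list of internal
# level energies (epsilons and tau in the same, arbitrary energy units).
import math

def internal_thermo(levels, tau):
    boltz = [math.exp(-e / tau) for e in levels]
    Z_int = sum(boltz)
    u = sum(e * b for e, b in zip(levels, boltz)) / Z_int   # U_int / N = <epsilon>
    f = -tau * math.log(Z_int)                              # F_int / N
    s = math.log(Z_int) + u / tau                           # sigma_int / N
    return f, u, s

print(internal_thermo([0.0, 0.0], tau=1.0))   # spin 1/2 example: (-log 2, 0, log 2)
print(internal_thermo([1.5], tau=1.0))        # single level at eps1 = 1.5: (1.5, 1.5, 0)
```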

More realistic examples include the rotational and vibrational states of the molecules.
Single atoms have neither rotational nor vibrational modes (they do have electronic exci-
tations!). A linear molecule (any diatomic molecule and some symmetric molecules such
as CO2 , but not H2 O) has two rotational degrees of freedom. Non-linear molecules have
three rotational degrees of freedom. Diatomic molecules have one degree of vibrational
freedom. More complicated molecules have more degrees of vibrational freedom. If the
molecule has M atoms, 3M coordinates are required to specify the locations of all the


atoms. The molecule thus has 3M degrees of freedom. Three of these are used in specify-
ing the location of the center of mass. Two or three are used for the rotational degrees of
freedom. The remainder are vibrational degrees of freedom.

You might be uncomfortable with zero or two degrees of rotational freedom for point
or linear molecules. To make this plausible, recall that an atom consists of a nucleus
surrounded by an electron cloud. The electrons are in states with angular momentum,
and to change the angular momentum one or more electrons must be placed in an excited
electronic state. This is possible, but if there is an appreciable thermal excitation of such
states, the atom has a fair chance of being ionized. If the atom is part of a molecule, that
molecule has probably been dissociated as molecular binding energies are usually small
compared to atomic binding energies. The upshot of all this is that such excitations are
not important unless the temperature is high enough that molecules are dissociating and
atoms are ionizing!

Rotational energy is the square of the angular momentum divided by twice the moment
of inertia. (Ignoring things like the fact that inertia is a tensor!) Since angular momentum
is quantized, so is rotational energy. This means that at high temperatures, we expect
an average energy of τ /2 per rotational degree of freedom, but at low temperatures we
expect that the rotational modes are “exponentially frozen out.” In this case, they do not
contribute to the partition function, the energy, or the entropy. The spacing between the
rotational energy levels sets the scale for low and high temperatures.

Similarly, for each vibrational degree of freedom, we expect that the corresponding
normal mode of oscillation can be treated as a harmonic oscillator and that at high tem-
peratures there will be an average energy of τ per vibrational degree of freedom (τ /2 in
kinetic energy and τ /2 in potential energy). At low temperatures the vibrational modes
are exponentially frozen out and do not contribute to the internal partition function, the
energy or the entropy. h̄ times the frequency of vibration sets the scale for low and high
temperatures.

As an example, consider a diatomic gas. At low temperatures, the energy will be


3N τ /2, the entropy will be given by the Sackur-Tetrode expression and the molar heat
capacities will be CV = 3R/2 and Cp = 5R/2 with γ = 5/3. As the temperature is raised
the rotational modes are excited and the energy becomes 5N τ /2 with molar specific heats
of CV = 5R/2 and Cp = 7R/2 and γ = 7/5. If the temperature is raised still higher the
vibrational modes can be excited and U = 7N τ /2, CV = 7R/2, Cp = 9R/2, and γ = 9/7.


Ideal Gas Processes

We will consider various processes involving a fixed amount of an ideal gas. We


will assume that the heat capacities are independent of temperature for these processes.
(In other words, the temperature changes will not be large enough to thaw or freeze the
rotational or vibrational modes.) We will want to know the work done, heat added, the
change in energy and the change in entropy of the system.

Note that work and heat depend on the process, while energy and entropy changes
depend only on the initial and final states. For the most part we will consider reversible
processes.

Consider a constant volume process. About the only thing one can do is add heat! In
this case, pf /pi = Tf /Ti .

$$\begin{aligned}
Q &= nC_V(T_f - T_i) , \\
W &= 0 , \\
\Delta U &= nC_V(T_f - T_i) , \\
\Delta S &= \int_{T_i}^{T_f} nC_V\,\frac{dT}{T} = nC_V\log\frac{T_f}{T_i} ,
\end{aligned}$$

where Q is the heat added to the gas, W is the work done on the gas, n is the number of
moles, CV and Cp are the molar heat capacities in conventional units and T and S are the
temperature and entropy in conventional units.

Consider a constant pressure (isobaric) process. In this case, if heat is added, the gas
will expand and $V_f/V_i = T_f/T_i$.
$$\begin{aligned}
Q &= nC_p(T_f - T_i) , \\
W &= -\int_{V_i}^{V_f} p\,dV = -nR(T_f - T_i) , \\
\Delta U &= nC_V(T_f - T_i) , \\
\Delta S &= \int_{T_i}^{T_f} nC_p\,\frac{dT}{T} = nC_p\log\frac{T_f}{T_i} = nC_V\log\frac{T_f}{T_i} + nR\log\frac{V_f}{V_i} .
\end{aligned}$$

Consider a constant temperature (isothermal) process. Heat is added and the gas
expands to maintain a constant temperature. The pressure and volume satisfy pf /pi =


$(V_f/V_i)^{-1}$.
$$\begin{aligned}
W &= -\int_{V_i}^{V_f} p\,dV = -nRT\int_{V_i}^{V_f}\frac{dV}{V} = -nRT\log\frac{V_f}{V_i} , \\
\Delta U &= 0 , \\
Q &= -W = nRT\log\frac{V_f}{V_i} , \\
\Delta S &= \int_{V_i}^{V_f} nR\,\frac{dV}{V} = nR\log\frac{V_f}{V_i} .
\end{aligned}$$

Consider a constant entropy process. This is often called an adiabatic process. How-
ever, adiabatic is also taken to mean that no heat is transferred. Since it is possible to
change the entropy without heat transfer, the term isentropic can be used to explicitly
mean that the entropy is constant. It is left as an exercise to show that in an isentropic
process with an ideal gas, pV^γ = constant. Then

W = −∫ p dV = −pi Vi^γ ∫ dV /V^γ = (pi Vi^γ /(γ − 1)) (1/Vf^(γ−1) − 1/Vi^(γ−1))
  = −(nRTi /(γ − 1)) (1 − (Vi /Vf )^(γ−1)) ,
Q = 0 ,
∆U = W = −(nRTi /(γ − 1)) (1 − (Vi /Vf )^(γ−1)) ,
∆S = 0 .
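
A minimal numerical sketch of the isothermal and isentropic cases (the moles, temperature, and volumes below are made-up illustrative values, and the gas is taken to be monatomic so γ = 5/3):

    # Compare the reversible work done on a monatomic ideal gas in an
    # isothermal and an isentropic expansion from Vi to Vf.
    from math import log

    R, gamma = 8.314, 5.0 / 3.0                 # SI units, monatomic gas
    n, Ti, Vi, Vf = 1.0, 300.0, 1.0e-3, 2.0e-3  # mol, K, m^3

    W_isothermal = -n * R * Ti * log(Vf / Vi)
    W_isentropic = -n * R * Ti / (gamma - 1) * (1 - (Vi / Vf) ** (gamma - 1))

    print("isothermal: W =", round(W_isothermal, 1), "J  (Delta U = 0)")
    print("isentropic: W =", round(W_isentropic, 1), "J  (= Delta U, since Q = 0)")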

Finally, let’s consider an irreversible process. Suppose a gas is allowed to expand from
a volume Vi into a vacuum until its volume is Vf . This is called a free expansion. No
work is done to the gas and no heat is added, so the energy and temperature don’t change.
The initial and final states are the same as in the reversible isothermal expansion, so the
entropy change is the same as for that case,

Q=0,
W =0,
∆U = 0 ,
Vf
∆S = nR log .
Vi

This is an adiabatic, but not isentropic, process. Note that if a gas is not ideal, then it may
be that the energy depends on volume (or rather, concentration) as well as temperature.


Such a deviation from the ideal gas law can be uncovered by measuring the temperature
of a gas before and after a free expansion.

The Gibbs Paradox Revisited

Last lecture’s treatment of the N particle problem might have engendered some un-
easiness; especially the example with two volumes of gas that were allowed to mix. Recall
that we had two volumes, V , each containing N molecules of gas at temperature τ , and
separated by a partition. We considered two cases: the same gas on both sides of the
partition and different gases, A and B, on the two sides of the partition. When the parti-
tion is removed in the same gas case “nothing happens,” the system is still in equilibrium
and the entropy doesn’t change—according to the correct expression which included the
N ! over counting correction in the partition function. When the partition is removed in
the different gas case, we must wait a while for equilibrium to be established and once
this happens, we find that the entropy has gone up by 2N log 2. This is called the entropy
of mixing. The incorrect expression for the entropy (omitting the N ! over counting cor-
rection in the partition function) gives the same entropy of mixing in both cases. This
manifestation of the N particle problem is called the “Gibbs paradox.”

The fact that we have to wait for equilibrium to be established means that the mixing
of the two different gases is a non-reversible process. Entropy always increases in non-
reversible processes! On the other hand, removing the partition between identical gases
is a reversible process (in the limit of a negligible mass, frictionless partition. . .). In a
reversible process, total entropy (system plus surroundings) does not increase, and there is
obviously no entropy change in the surroundings when the partition is removed from the
identical gases.

The question of measuring the entropy has come up several times. There is no such
thing as an entropy meter that one can attach to a system and out pops a reading of the
entropy! Changes of the entropy can be measured. Recall that dσ = d̄Q/τ for a reversible
process. So if we can take a system from one state to another via a (close approximation
of a) reversible process and measure the heat flow and temperature, we can deduce the
entropy difference between the two states. To measure the absolute entropy of a state, we
must start from a state whose absolute entropy we know. This is a state at τ = 0! We will
see how this goes in the comparison of the Sackur-Tetrode expression with experimental
results.

Aside: the fact that we can only measure changes in entropy should not be that
bothersome. At the macroscopic level, entropy is defined by an integral (∫ dQ/τ ). There
is always the question of the constant of integration. A similar problem occurs with
potential energy. It is potential energy differences that are important to the dynamics and
only differences can be measured. For example, consider a mass m on the surface of the
Earth. If we take the zero of gravitational potential energy to be at infinite separation of


the mass and the Earth, then the potential energy of the mass-Earth system is −GM m/R
when the mass sits on the surface of the Earth at distance R from the center. I’m sure we’re
all happy with this, right? But, there is no potential energy meter that you can attach to
the mass and out pops a potential energy value. Instead, −GM m/R is a calculated value
much like the entropy of mixing is a calculated value. What can we actually measure in
the gravitational case? We can measure the force required R to change the height (distance
from the center of the Earth) of the mass and so measure F · dr. That is, we can measure
the change in gravitational potential energy between two states. Of course, we have to
be careful that there is no friction, that F = −mg, that there is negligible acceleration
of the mass, etc. In other words, we have to approximate a reversible process! Reversible
processes aren’t just for thermodynamics! I suspect they’re easier to visualize in other
branches of physics, so they don’t cause as much comment and concern. What about
measuring the “absolute” gravitational potential? This requires measuring the changes in
potential energy between the state whose potential we know (infinite separation) and the
state whose potential we want to know (mass on the surface of the Earth). I suppose if we
had enough money, we might get NASA to help with this project! Of course, by calculation
we can relate the “absolute” gravitational potential to other quantities, for example, the
escape velocity, that can be more easily measured. This is one of the arts of theoretical
physics.

Back to our mixing example: Can we think of a reversible process which would allow
us to mix the gases? Then we could calculate the entropy change by keeping track of the
heat added and the temperature at which it was added. Also, if we knew of such a process,
we could use it—in reverse!—to separate the gases again. The process I have in mind uses
semi-permeable membranes. We need two of them: one that passes molecules of type A
freely but is impermeable to molecules of type B, and a second which does the opposite.
It passes type B molecules and blocks type A molecules. We can call these “A-pass”
and “B-pass” membranes for short. Do such membranes actually exist? Semi-permeable
membranes certainly exist. The molecules that are passed may not pass through “freely,”
but if we move the membrane slowly enough, it should be possible to make the friction
due to the molecules passing through the small holes in the membrane negligibly small.
The possibility of finding the desired membranes depends on the molecules in question and
almost certainly is not possible for most pairs of molecules. However, the fact that semi-
permeable membranes do exist for some molecules would seem to make this a reasonable
thought experiment (if not an actually realizable experiment).


The figure is a schematic of our mixing apparatus. The volume 2V is in equi-
librium with a thermal bath at tempera-
ture τ . At the center of the volume and
dividing it into two sections of volume V
are the two membranes. The A-pass mem-
brane confines the N type B molecules to
the right volume and the B-pass mem-
brane confines the N type A molecules
to the left volume. The next two figures
show the situations when the gases are
partially and fully mixed. The membranes
are something like pistons and are moved
by mechanical devices that aren’t shown.
These devices are actually quite impor-
tant as each membrane receives a net force
from the gases. The mechanical devices
are used to counteract this force and do work on the gases as the membranes are moved
slowly and reversibly through the volume.

What is the force on a membrane due


to the gases? Consider the A-pass mem-
brane. A molecules pass freely through
this membrane so there is no interaction
of this membrane with the A molecules.
The B molecules are blocked by this mem-
brane, so the B molecules are bouncing off
the membrane as though it were a solid
surface. Since there are B molecules on
the right side of this membrane and not on
the left, there is a pressure (only from the
B molecules) which is just N τ /VB where
VB is the volume occupied by B molecules
to the right of the A-pass membrane. The
net force from this pressure points to the
left. So as we move the membrane to the
left, the B molecules do work on it. This
comes from the energy, UB of the B molecules. The B molecule system would cool, ex-
cept it is in contact with the heat bath, so heat flows from the bath to make up for the
work done on the membrane and keep τ and hence UB constant. The same thing happens
with the B-pass membrane and the A gas. It is the heat transferred (reversibly) from the
reservoir to each gas that increases the entropy of the gas.


By now, you’re convinced that the A


molecules can be treated as a gas occu-
pying the volume to the left of the B-
pass membrane without worrying what’s
going on with the B molecules and vice-
versa. We’ve arranged this “by construc-
tion.” Our model for an ideal gas is based
on non-interacting molecules (well, weakly
interacting, but only enough to maintain
thermal equilibrium). We’ve also made
the membranes so they interact only with
A or B molecules but not both. So the A
molecules interact strongly with the walls
of the container and the B-pass membrane
and interact weakly (→ 0) with every-
thing else including other A molecules. So
when the B pass membrane is moved all
the way to the right, the A molecules undergo an isothermal expansion from V to 2V . We
apply our ideal gas results for an isothermal expansion and find
∆σA = ∫ dQ/τ = ∫ p dV /τ = N ∫ dV /V = N log(2V /V ) = N log 2 ,

where the integrals run from V to 2V .

Of course, a similar result applies to the B molecules when the A pass membrane is moved
all the way to the left. The total change of entropy in this process is

∆σ = 2N log 2 ,

which is what we had obtained before by applying the Sackur-Tetrode expression to the
initial and final states of the irreversible process.
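
The same bookkeeping is easy to check numerically; here is a minimal sketch (one mole of each gas is assumed just to have concrete numbers):

    # Entropy of mixing for the two-membrane thought experiment: each gas
    # expands isothermally and reversibly from V to 2V, contributing N log 2.
    from math import log

    N = 6.022e23                  # molecules of each gas (one mole, assumed)
    k = 1.380649e-23              # J/K
    delta_sigma = 2 * N * log(2)  # dimensionless (fundamental) entropy change
    print("delta sigma =", delta_sigma)
    print("delta S     =", k * delta_sigma, "J/K, i.e. 2 R log 2 =", 2 * 8.314 * log(2), "J/K")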

Aside: suppose we have several ideal gases occupying the same volume. pi = Ni τ /V
is called the partial pressure of gas i and is the pressure that the same Ni molecules of the
gas would have if they occupied the volume by themselves (no other gases present). Then
the total pressure is the sum of the partial pressures: p = Σi pi . This is called Dalton's
law. It falls out of our ideal gas treatment “by construction.” Since the gases are non-
interacting, the presence of other gases cannot affect the rate of momentum transfer by a
given gas! So, Dalton’s law seems trivial, but it probably helped point the way towards a
non-interacting model as a good first approximation.

Physics 301 15-Oct-2004 16-1

The Sackur-Tetrode Entropy and Experiment

In this section we’ll be quoting some numbers found in K&K which are quoted from
the literature.

You may recall that I've several times asked how one would measure absolute entropy.
I suspect that I pretty much gave it away (if you hadn't figured it out already) in the last
section. The answer is that you have to measure heat transfers from a state of known absolute
entropy to the desired state so that you can calculate ∫ dQ/τ . What is a state of known
entropy? Answer, at absolute 0, one expects the entropy to be very small and we can take
it to be 0. Actually, there is the third law of thermodynamics (not as famous as the first
two!) which says that the entropy should go to a constant as τ → 0.

At absolute 0, a reasonable system will be in its ground state. In fact the ground state
might not be a single state. For example if we consider a “perfect crystal,” its ground
state is clearly unique. But real crystals have imperfections. Suppose a crystal is missing
a single atom from its lattice. If there are N atoms in the crystal there are presumably N
different sites from which the atom could be missing so the entropy is log N . Also, there’s
presumably an energy cost for having a missing atom, so the crystal is not really in its
ground state. But this might be as close as we can get with a real crystal. The point is
that the energy and the entropy are both very small in this situation and very little error
is made by assuming that σ(0) = 0. (Compare log N with N log(nQ /n) when N ∼ 1023 !)

In fact, a bigger problem is getting to very low temperatures. In practice, one gets as
low as one can and then extrapolates to τ = 0 using a Debye law (assuming an insulating
solid). So to measure the entropy of a monatomic R ideal gas such as neon, one makes heat
capacity measurements and does the integral C(τ ) dτ /τ . The heat capacity measure-
ments go to as low a τ as needed to get a reliable extrapolation to 0 with the Debye law.
According to K&K, the calculation goes like this: solid neon melts at 24.55 K. At the
melting point, its entropy (by extrapolation and numerical integration) is
Smelting − S0 = 14.29 J mol−1 K−1 .
To melt the solid (which occurs at a constant temperature) requires 335 J mol−1 so the
entropy required to melt is
∆Smelt = 13.65 J mol−1 K−1 .
Again, a numerical integration is required to find the entropy change as the liquid neon is
taken from the freezing point to the boiling point at 27.2 K. This is
Sboiling − Sfreezing = 3.85 J mol−1 K−1 .
Finally, 1761 J mol−1 is required to boil the neon at the boiling point, and
∆Sboil = 64.74 J mol−1 K−1 .

Now we have a gas to which we can apply the Sackur-Tetrode expression. Assuming
S0 = 0, the total is

Svapor = ∆Sboil + (Sboiling − Sfreezing ) + ∆Smelt + (Smelting − S0 ) = 96.40 J mol−1 K−1 ,

σ = 6.98 × 1024 /mol ,


where I have quoted the sum from K&K which differs slightly from the sum you get by
adding up the four numbers presumably because there is some round-off in the input
numbers. (For example, using a periodic table on the web, I find the melting and boiling
points of neon are 24.56 K and 27.07 K.) According to K&K, the Sackur-Tetrode value
for neon at the boiling point is

SSackur−Tetrode = 96.45 J mol−1 K−1 ,
which is in very good agreement with the observed value.

When I plug into the Sackur-Tetrode expression I actually get,

SSackur−Tetrode = 96.47 J mol−1 K−1 ,
still in very good agreement with the observed value. Why did I get a slightly different
value than that quoted in K&K? I used

S = R [ log( (mkT /2πh̄^2 )^(3/2) /(p/kT ) ) + 5/2 ] .

Everything can be looked up, but I’m using p/kT instead of N0 /V , which assumes the
ideal gas law is valid. However, this expression is being applied right at the boiling point,
so it’s not clear that the ideal gas law should work all that well.
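
For what it's worth, here is a sketch of that evaluation in SI units (the constants are standard values, and n = p/kT is used for the concentration as described above):

    # Sackur-Tetrode entropy of neon vapor at its boiling point.
    from math import pi, log

    k, hbar, NA, R = 1.380649e-23, 1.054571817e-34, 6.02214076e23, 8.314
    m = 20.18e-3 / NA        # mass of a neon atom, kg
    T = 27.2                 # boiling point used above, K
    p = 101325.0             # 1 atm, Pa

    n_Q = (m * k * T / (2 * pi * hbar**2)) ** 1.5   # quantum concentration, 1/m^3
    n   = p / (k * T)                               # concentration, ideal gas law
    S   = R * (log(n_Q / n) + 2.5)                  # per mole
    print("S =", round(S, 2), "J/(mol K)")          # about 96.5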

Some other things to note. (1) If we had left out the N ! over counting correction, we
would have to add

R(log N0 − 1) = 447 J mol−1 K−1 ,
to the above. This swamps any slight problems with deviations from the ideal gas law or
inaccuracies in the numerical integrations! (2) The Sackur-Tetrode expression includes h̄
which means it depends on quantum mechanics. So this is an example where measurements
of the entropy pointed towards quantum mechanics. Of course, h̄ occurs inside a logarithm,
so it might not have been so easy to spot!


The Ideal Fermi Gas

Consider a metal like sodium or copper (or the other metals in the same columns in the
periodic table). These metals have one valence electron—an electron which can be easily
removed from the atom, so these atoms often form chemical bonds as positively charged
ions. In the solid metal, the valence electrons aren’t bound to the atoms. How do we know
this? Because the metals are good conductors of electricity. If the electrons were bound to
the atoms they would be insulators. Of course, there are interactions between the electrons
and the ions and between the electrons and other electrons. But, as a first approximation
we can treat all the valence electrons as forming a gas of free (non-interacting) particles
confined to the metal.

Let’s do a little numerology. First, let’s calculate the quantum concentration for an
electron at room temperature,

nQ = (me kT /2πh̄^2 )^(3/2) ,
   = [ (9.108 × 10−28 g)(1.380 × 10−16 erg K−1 )(300 K) / 2π (1.054 × 10−27 erg s)^2 ]^(3/2) ,
   = 1.26 × 1019 cm−3 ,
   = (1/43 Å)^3 .

In other words, the density of electrons is equal to the room temperature quantum con-
centration if there is one electron every 43 Å. Now consider copper. It has a density of
8.90 g cm−3 and an atomic mass of 63.54 amu. So the number density of copper atoms is
nCu = 8.44 × 1022 cm−3 = (1/2.3 Å)^3 .

The number of electrons in the electron gas (assuming one per copper atom) exceeds the
quantum concentration by a factor of 6700. For copper the actual concentration and the
quantum concentration are equal at a temperature of about 100,000 K (assuming we could
get solid copper that hot!).
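
The same numerology in SI units, as a minimal sketch (constants are standard values; one conduction electron per copper atom is assumed, as in the text):

    # Quantum concentration of an electron at 300 K versus the electron
    # concentration in copper.
    from math import pi

    k, hbar, me, NA = 1.380649e-23, 1.054571817e-34, 9.109e-31, 6.022e23

    n_Q  = (me * k * 300.0 / (2 * pi * hbar**2)) ** 1.5   # 1/m^3
    n_Cu = 8.90e3 / (63.54e-3 / NA)                       # density / mass per atom

    print("n_Q(300 K) =", n_Q * 1e-6, "cm^-3")            # ~1.26e19
    print("n_Cu       =", n_Cu * 1e-6, "cm^-3")           # ~8.44e22
    print("ratio      =", n_Cu / n_Q)                     # ~6700
    print("T at which n_Q = n_Cu:", 300.0 * (n_Cu / n_Q) ** (2.0 / 3.0), "K")  # ~1e5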

The upshot of all this is that we are definitely not in the classical domain when dealing
with an electron gas in metals under normal conditions. We will have to use the Fermi-
Dirac distribution function. Low energy states are almost certain to be filled. When this
is true, the system is said to be degenerate. Furthermore, the electron gas is “cold” in the
sense that thermal energies are small compared to the energies required to confine them
to high densities. (This is most easily seen from an uncertainty principle argument.)


So as a first approximation, we can use the Fermi-Dirac distribution at zero temper-
ature. This means we will ignore thermal energies altogether. At zero temperature, the
Fermi-Dirac distribution becomes,

f (ǫ) = 1/(e^((ǫ−µ)/τ ) + 1) → 1 for ǫ < µ ; 0 for ǫ > µ .

Imagine a chunk of copper in which all the valence electrons have been removed (it would
have a rather large electric charge . . .). Add back one valence electron remembering that
the temperature is 0. This electron goes into the lowest available state. Add another
electron, it goes into the state with the next lowest energy. Actually it’s the same center
of mass state and the same energy, but the second electron has its spin pointing in the
opposite direction from the first. The third electron goes in the state with the next lowest
energy. And so on. What we are doing is filling up states (with no gaps) until we run out
of valence electrons. Since we have the lowest possible energy, this configuration must be
the ground state (which is the state the system should be in at 0 temperature!).

We must choose the chemical potential so that our metal has the correct number
of valence electrons. To do this, we need to know the number of states. Since we are
considering free electrons, we are dealing with single particle states in a box. This is the
same calculation we’ve done before. The number of states with position vector in the
element d3 x and momentum vector in the element d3 p is

dn(x, p) = 2 d^3x d^3p /(8π^3 h̄^3 ) ,
where the factor of 2 arises because there are two spin states for each center of mass state.
When the number of states is used in an integral, the integral over d3 x just leads to the
volume of the box, V . The element d3 p = p2 dp dΩ and the solid angle may be integrated
over to give 4πp2 dp. Finally, the independent variable may be converted from momentum
to energy with p^2 /2m = ǫ, and we have

dn(ǫ) = (V /2π^2 )(2m/h̄^2 )^(3/2) √ǫ dǫ .
It's customary to write this as the density of states per unit energy

D(ǫ) dǫ = (V /2π^2 )(2m/h̄^2 )^(3/2) √ǫ dǫ .

Now we're ready to calculate µ. At zero temperature, the occupancy is 1 up to µ and
0 above µ, so the total number of electrons is

N = ∫0^µ D(ǫ) dǫ .


Before we do this integral, a bit of jargon. The energy of the highest filled state at zero
temperature is called the Fermi energy, ǫF . So µ(τ = 0) = ǫF . Then
N = ∫0^ǫF (V /2π^2 )(2m/h̄^2 )^(3/2) √ǫ dǫ = (V /3π^2 )(2m/h̄^2 )^(3/2) ǫF^(3/2) ,
or
ǫF = (h̄^2 /2m)(3π^2 n)^(2/3) ,
where n = N/V is the concentration. One also speaks of the Fermi temperature defined by
τF = kTF = ǫF . This is not actually the temperature of anything, but is a measure of the
energy that separates degenerate from non-degenerate behavior. The Fermi temperature
is a few tens of thousands of Kelvins for most metals, so the electron gas in typical metals
is cold. Having determined the Fermi energy, we can determine the total energy of the
gas. We just add up the energies of all the occupied states
U0 = ∫0^ǫF ǫ (V /2π^2 )(2m/h̄^2 )^(3/2) √ǫ dǫ = (V /5π^2 )(2m/h̄^2 )^(3/2) ǫF^(5/2) = (V /5π^2 )(3π^2 n)^(5/3) (h̄^2 /2m) = (3/5) N ǫF ,

where the subscript on U indicates the ground state energy.

In the ground state, the average energy of an electron is 3/5 the Fermi energy. Also
note that the concentration contains V −1 which means U0 ∝ V −2/3 which means that as
the system expands, the energy goes down which means it must be exerting a pressure on
its container. This is called degeneracy pressure. In fact,

p = −(∂U0 /∂V )σ,N = (2/3) U0 /V ,

so

pV = (2/3) U0 ,
just as for an ideal gas. Note that the derivative is to be taken at constant entropy. We
are dealing with the ground state, so the entropy is constant at 0.

As a point of interest, using the concentration for copper that we calculated earlier,
we find
ǫF = 7.02 eV and TF = 81,500 K ,
and the electron gas in copper really is “cold” all the way up to the point where copper
melts! (1358 K)
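
These two numbers follow directly from the formula for ǫF ; a minimal sketch (SI units, standard constants, concentration from the copper numerology above):

    # Fermi energy and Fermi temperature of the conduction electrons in copper.
    from math import pi

    hbar, me, k, eV = 1.054571817e-34, 9.109e-31, 1.380649e-23, 1.602e-19
    n = 8.44e28                       # electrons per m^3 (8.44e22 cm^-3)

    eF = hbar**2 / (2 * me) * (3 * pi**2 * n) ** (2.0 / 3.0)
    print("epsilon_F =", eF / eV, "eV")    # ~7.0 eV
    print("T_F       =", eF / k, "K")      # ~81,500 K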


Heat Capacity of a Cold Fermi Gas

In the preceding section we considered the Fermi gas in its ground state. This does
not allow us to consider adding heat to the gas because the gas would no longer be in the
ground state. To calculate the heat capacity, we need to expand our treatment a bit.

What we need to do is calculate the energy of the gas as a function of temperature. We


will calculate the difference between the ground state energy and the energy at temperature
τ and we will make use of the fact that the gas is cold.

With a cold gas, all the action occurs within τ of µ. That is the occupancy goes from
1 to 0 over a range of a few τ centered at µ. Since we have a cold gas, this is a relatively
narrow range.

The difference in energy between the gas at temperature τ and the gas in the ground
state is

∆U (τ ) = ∫0^∞ D(ǫ)f (ǫ) ǫ dǫ − U0 ,
       = ∫0^∞ D(ǫ)f (ǫ)(ǫ − ǫF ) dǫ + ∫0^∞ D(ǫ)f (ǫ) ǫF dǫ − U0 ,
       = ∫0^∞ D(ǫ)f (ǫ)(ǫ − ǫF ) dǫ + N ǫF − U0 .

Now, we differentiate with respect to τ to get the heat capacity.

CV = ∂∆U /∂τ ,
   = ∫0^∞ D(ǫ) (df (ǫ)/dτ ) (ǫ − ǫF ) dǫ .

So far, everything is "exact"; now we start making approximations. At τ = 0, the distribu-
tion is a step function, so its derivative is a delta function (a very sharply peaked function
in the neighborhood of the step). This means that the main contribution to the integral
occurs (even if τ is not 0) when ǫ is very close to µ. So we will ignore the variation in the
density of states, evaluate it at µ and take it out of the integral. What about µ? At τ = 0,
µ = ǫF . As τ increases, µ decreases, but when τ ≪ ǫF , the change in µ is negligibly small
(plot some curves!), so we will take µ = ǫF . Then we have
CV = D(ǫF ) ∫0^∞ (df (ǫ)/dτ ) (ǫ − ǫF ) dǫ .

Now,

df /dτ = (d/dτ ) [ 1/(e^((ǫ−ǫF )/τ ) + 1) ] = ((ǫ − ǫF )/τ^2 ) e^((ǫ−ǫF )/τ ) / (e^((ǫ−ǫF )/τ ) + 1)^2 .


At this point, we change variables to x = (ǫ − ǫF )/τ and we have

CV = τ D(ǫF ) ∫−∞^+∞ x^2 e^x dx /(e^x + 1)^2 ,

where the actual lower limit of integration, −ǫF /τ , has been replaced by −∞ since all the
contribution to the integral is in the neighborhood of x = 0. It turns out that the integral
is π 2 /3, so
CV = (π^2 /3) D(ǫF ) τ ,
a surprisingly simple result! If we plug in the expression for the density of states, we have

D(ǫF ) = 3N /2ǫF ,

and
CV = (π^2 /2) N τ /ǫF = (π^2 /2) N τ /τF .

In conventional units,

CV = (π^2 /2) N k T /TF .
Some comments. This is proportional to T which means that the energy of the electron
gas is U0 + constant · τ 2 . Can we see how this happens? In going from 0 to τ , we are
exciting electrons in the energy range from ǫF −τ → ǫF by giving them a thermal energy of
roughly τ . The number of such electrons is roughly N τ /ǫF , so the added energy is roughly
N τ 2 /ǫF . The Fermi gas heat capacity is quite a bit smaller than that of a classical ideal gas
with the same energy and pressure. This is because only the fraction τ /τF of the electrons
are excited out of the ground state. At low temperatures, heat capacities of metals have
a linear term due to the electrons and a cubic term due to the lattice. Some experimental
data may be found in K&K.
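
A minimal numerical check of the integral quoted above and of the size of the electronic heat capacity for copper at room temperature (SciPy is assumed to be available; T_F is the value computed earlier):

    # Verify that the integral of x^2 e^x/(e^x+1)^2 over all x equals pi^2/3,
    # then evaluate the electronic molar heat capacity of copper at 300 K.
    from math import pi, exp
    from scipy.integrate import quad

    integrand = lambda x: x**2 * exp(x) / (exp(x) + 1) ** 2
    val, _ = quad(integrand, -50.0, 50.0)     # effectively minus to plus infinity
    print("integral =", val, "  pi^2/3 =", pi**2 / 3)

    R, T, T_F = 8.314, 300.0, 8.15e4
    C_el = pi**2 / 2 * R * T / T_F            # per mole of conduction electrons
    print("C_electronic(300 K) =", C_el, "J/(mol K)")   # ~0.15 J/(mol K)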

Physics 301 1-Nov-2004 17-1

Reading

Finish K&K chapter 7 and start on chapter 8. Also, I’m passing out several Physics
Today articles. The first is by Graham P. Collins, August, 1995, vol. 48, no. 8, p. 17,
“Gaseous Bose-Einstein Condensate Finally Observed.” This describes the research leading
up to the first observation of a BE condensate that's not a superfluid or superconductor. The
second is by Barbara Goss Levi, March, 1997, vol. 50, no. 3, p. 17, “Bose Condensates are
Coherent Inside and Outside an Atom Trap,” describing the first “atom laser” which was
based on a BE condensate. The third is also by Levi, October, 1998, vol. 51, no. 10, p. 17,
“At Long Last, a Bose-Einstein Condensate is Formed in Hydrogen,” describing even more
progress on BE condensates.

In addition, there is a recent Science report on an atomic Fermi Gas, DeMarco, B.,
and Jin, D. S., September 10, 1999, vol. 285, p. 1703, “Onset of Fermi Degeneracy in a
Trapped Atomic Gas.”

Bose condensates and Fermi degeneracy are current hot topics in Condensed Matter
Research. Searching the preprint server or “Googling” with appropriate keywords is bound
to turn up many more articles.

More on Fermi Gases

So far, we’ve considered the zero temperature Fermi gas and done an approximate
treatment of the low temperature heat capacity of Fermi gases. The zero temperature
Fermi gas was straightforward. We simply said that all states, starting from the lowest
energy state, are filled until we run out of particles. The energy at which this happens
is called the Fermi energy and is the same as the chemical potential at 0 temperature,
ǫF = µ(τ = 0). Basically, all we had to do was determine the density of states, a problem
we’ve dealt with before.

Working on the low temperature heat capacity required an approximate calculation
of the energy versus temperature for a cold Fermi gas. In this calculation we assumed
that the density of states near the Fermi energy is constant and this allows one to pull
the density of states out of the integral and also to set the chemical potential to its 0
temperature value.

These approximations work quite well for the electron gas in metals at room tempera-
ture because the Fermi temperature for these electrons is typically several tens of thousands
of Kelvins.

To calculate the energy, etc., at arbitrary temperatures, one must numerically inte-
grate the Fermi-Dirac distribution times the density of states to obtain the number of
particles. Then the chemical potential is varied until the desired number of particles is


obtained. Knowing the chemical potential, one can integrate the density of states times
the Fermi-Dirac distribution times the energy to get the energy at a given temperature.
All of this requires numerical integration or approximate techniques.
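
Here is a rough sketch of that procedure (SciPy assumed available; energies are measured in units of ǫF and the density of states is normalized per particle, so the numbers are generic rather than for any particular metal):

    # Find mu(tau) by requiring the integral of D(e) f(e) to give the right
    # particle number, then integrate again for the energy.
    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import brentq

    def number(mu, tau):
        f = lambda e: 1.5 * np.sqrt(e) / (np.exp((e - mu) / tau) + 1.0)
        return quad(f, 0.0, mu + 40.0 * tau)[0]

    def energy(mu, tau):
        f = lambda e: 1.5 * e * np.sqrt(e) / (np.exp((e - mu) / tau) + 1.0)
        return quad(f, 0.0, mu + 40.0 * tau)[0]

    # D(e) = 1.5 sqrt(e) integrates to 1 up to e = 1, so epsilon_F = 1 here.
    for tau in (0.05, 0.2, 0.5):      # temperatures in units of epsilon_F
        mu = brentq(lambda m: number(m, tau) - 1.0, 0.0, 2.0)
        print(f"tau/eF = {tau}: mu/eF = {mu:.4f}, U/(N eF) = {energy(mu, tau):.4f}")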

Figure 7.9 and tables 7.2 and 7.3 of K&K demonstrate that the low temperature heat
capacities (low enough that the Debye lattice vibrations are accurately following a T 3 heat
capacity) have a component proportional to the temperature and list the proportionality
constants for various metals. One thing you will notice is that the proportionality constants
agree with the calculations to only ∼ 30% and up to a factor of 2 in at least one case. This
is most likely due to the fact that the electrons are not really a non-interacting gas. Also,
there are effects due to the crystal structure such as energy bands and gaps.

Other Fermi Gases

In addition to the conduction electron gas in metals, Fermi gases occur in other
situations.

In heavy elements, the number of electrons per atom becomes large enough that a
statistical treatment is a reasonable approximation. This kind of treatment is called the
Thomas-Fermi (or sometimes the Fermi-Thomas) model of the atom.

Also in heavy elements, the number of nucleons (neutrons and protons) in the nucleus
is large and, again, a statistical treatment is a reasonable approximation. The radius of a
nucleus is
R ≈ (1.3 × 10−13 cm) · A1/3 ,
where A is the number of nucleons. The coefficient in this relationship can vary by a
tenth or so depending on just how one measures the size—scattering by charged particles,
scattering by neutrons, effects on atomic structure, etc. Aside: the unit of length 10−13 cm
which is one femtometer is called the Fermi in nuclear physics. The volume of a nucleus is


V = (4π/3) R^3 = (4π/3)(2.2 × 10−39 A cm3 ) = 9.2 × 10−39 A cm3 ,

and the number density or concentration is

nnuc = A/V = 1.1 × 1038 cm−3 .

The nuclear density (with this number of significant digits, the mass difference between
neutrons and protons is negligible) is

ρnuc = 1.8 × 1014 g cm−3 .


Basically, all nuclei have the same density. Of course, this is not quite true. Nuclei have a
shell structure and “full shell” nuclei are more tightly bound than partially full shell nuclei.
Also, the very lightest nuclei show some deviations. Nevertheless, the density variations
aren’t large and it’s reasonable to speak of the nuclear density.

The neutron to proton ratio in nuclei is about 1 : 1 for light nuclei up to about 1.5 : 1
for heavier nuclei. Assuming the latter value, then it is the neutrons whose Fermi energy
is important.

ǫF = (h̄^2 /2mn )(3π^2 · 0.6 nnuc )^(2/3) = 5.2 × 10−5 erg = 32 MeV .
This is a little larger than K&K’s number because it’s computed for a nucleus with 40% pro-
tons and 60% neutrons, instead of equal numbers. Since the average kinetic energy in a
Fermi gas is 3ǫF /5, the average kinetic energy is about 19 MeV in a heavy nucleus. The
experimentally determined binding energy per nucleon is about 8 MeV. This varies some-
what, especially for light nuclei; it reaches a peak at 56 Fe. To the extent that the binding
energy per nucleon and the kinetic energy per nucleon are constant, the potential energy
per nucleon is also constant. This reflects the fact that the nuclear force is the short range
strong force and nuclei only “see” their nearest neighbors. The strong force is about the
same between neutrons and protons, between protons and protons and between neutrons
and neutrons. But, the protons have a long range electromagnetic interaction. As the
number of particles goes up the “anti-binding” energy of the protons goes up faster than
the number of protons (can you figure out the exponent?) so the equilibrium shifts to
favor neutrons in spite of the fact that they are slightly more massive than protons.

The Fermi temperature for neutrons in a heavy nucleus is

TF = ǫF /k = 3.8 × 1011 K ,

so nuclei (which are usually in their ground state) are very cold!
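
A minimal sketch of that estimate in CGS units (the neutron fraction 0.6 and the concentration are the values used above):

    # Neutron Fermi energy and Fermi temperature in a heavy nucleus.
    from math import pi

    hbar, m_n, k = 1.054571817e-27, 1.6749e-24, 1.380649e-16   # erg s, g, erg/K
    MeV = 1.602e-6                                             # erg

    n_neutron = 0.6 * 1.1e38        # neutrons per cm^3
    eF = hbar**2 / (2 * m_n) * (3 * pi**2 * n_neutron) ** (2.0 / 3.0)
    print("epsilon_F =", eF, "erg =", eF / MeV, "MeV")   # ~32 MeV
    print("T_F       =", eF / k, "K")                    # ~3.8e11 K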

In a star like the Sun, gravity is balanced by the pressure of a very hot, but classical,
ideal gas. The Sun has a mass about 300,000 times that of the Earth and a radius about
100 times that of the Earth, so the average density of the Sun is somewhat less than that
of the Earth (it’s about the density of water!). The temperature varies from about 20
million Kelvins at the center to about 6000 K at the surface. So it’s completely gaseous
and the electrons are non-degenerate throughout. Since the sun is radiating, it is cooling.
Energy is supplied by nuclear reactions in the Sun’s core.

A typical white dwarf star has about the mass of the Sun but the radius of the Earth.
It’s the degeneracy pressure of the electrons that balances gravity in a white dwarf. White
dwarves shine by cooling. There are no nuclear reactions in the core, so after they cool
enough, they become invisible. White dwarves are discussed in K&K, so let’s move on to
neutron stars which are not discussed in K&K.


Neutron Stars

In a neutron star it’s the degeneracy pressure of the neutrons that balances gravity.
A typical neutron star has a mass like the Sun (M⊙ = 2 × 1033 g) but a radius smaller
than New Jersey, let’s say R ≈ 10 km. Let’s assume that the mass in a neutron star is
uniformly distributed. What’s the density?

ρ = M/V = 4.8 × 1014 g cm−3 ,

about three times nuclear density. (Of course, the density in a star is not uniform and it
may exceed 10 times nuclear density in the center, but we’re just trying to do a back of
the envelope calculation here.) In terms of the concentration of neutrons, this corresponds
to
n0 = 2.9 × 1038 cm−3 .
The Fermi energy for these neutrons is

ǫF,0 ≈ 86 MeV ,

and the Fermi temperature is


TF = 10^12 K .
Neutron stars are nowhere near this hot. Otherwise they would be very strong sources of
gamma rays. Instead they are thought to have temperatures of millions of degrees and
radiate X-rays. I believe there are some observations which indicate this. Also, due to
having a magnetic field and rotating, they can radiate electromagnetic energy and are
observed as pulsars. In any case, the neutrons in a neutron star are cold!

An interesting question is why is a neutron star made of neutrons? (Well, if it weren't,
we probably wouldn't call it a neutron star, but besides that?) In particular, what's
wrong with the following? Let the star be made of protons and electrons, each with the
concentration we’ve just calculated. Then the star is electrically neutral because there
is a sea of positively charged protons and a sea of negatively charged electrons. But the
protons have a slightly lower mass than the neutrons and this is true even if one adds in
the mass of the electron, so this configuration would seem to be energetically favored over
the neutron configuration. In fact, a free neutron decays according to

n → p + e− + ν̄e ,

where ν̄e is the electron anti-neutrino. The half life is about 15 minutes. Neutrinos are
massless (or very nearly so) and for our purposes we can ignore them. That is, we can
assume that the neutrons are in equilibrium with the protons and electrons. If we need to
change a neutron into a proton and electron, the above reaction will do it. If we need to
change a proton and electron into a neutron, there is

p + e− → n + νe .


What would be the Fermi energies of the protons and electrons in our hypothetical
star? The Fermi energy for the protons would be very nearly the same as that for the
neutrons above (because the concentration would be the same and the mass is nearly the
same). On the other hand, the Fermi energy of the electrons would be larger by the ratio
of the neutron mass to the electron mass, a factor of 1838, so the electron Fermi energy
would be about 160,000 MeV, enough to make about 170 nucleons! Remember that the
chemical potential (the Fermi energy since all the gases are cold) is the energy required to
add a particle to a system. If neutrons are in equilibrium with protons and electrons, then
the chemical potential of the neutrons equals the chemical potential of the protons plus
the chemical potential of the electrons minus the energy difference between a neutron and
a proton plus electron. In other words

ǫF,n = ǫF,p + ǫF,e − (mn − mp − me )c2 .

Denote the concentrations of the neutrons, protons, and electrons by nn , np , and ne . Then

np = ne ,

for charge neutrality and


np + nn = n ,
where n is the concentration of the nucleons, which is not changed by the reactions above.
To simplify the notation a bit, let
x = np /n = ne /n ,   1 − x = nn /n .
Each of the Fermi energies can be written in terms of the concentrations

ǫF,n = (h̄^2 /2mn )(3π^2 nn )^(2/3) = ǫF,0 (n/n0 )^(2/3) (1 − x)^(2/3) ,
ǫF,p = (h̄^2 /2mp )(3π^2 np )^(2/3) = ǫF,0 (n/n0 )^(2/3) (mn /mp ) x^(2/3) ,
ǫF,e = (h̄^2 /2me )(3π^2 ne )^(2/3) = ǫF,0 (n/n0 )^(2/3) (mn /me ) x^(2/3) ,

where
ǫF,0 = (h̄^2 /2mn )(3π^2 n0 )^(2/3) ,
is the Fermi energy for a pure neutron gas at the concentration n0 we calculated previously.
We plug these energies into the energy equation to obtain

ǫF,0 (n/n0 )^(2/3) (1 − x)^(2/3) = ǫF,0 (n/n0 )^(2/3) (mn /mp + mn /me ) x^(2/3) − E ,


where E = (mn − mp − me )c2 = 0.783 MeV is the mass energy excess of a neutron over a
proton and electron. If we rearrange slightly, we obtain

(1 − x)^(2/3) = (mn /mp + mn /me ) x^(2/3) − (E /ǫF,0 )(n0 /n)^(2/3) ,

or

(1 − x)^(2/3) = (1.0014 + 1838.7) x^(2/3) − (0.783/86)(n0 /n)^(2/3) ,

or

x^(2/3) = 0.000544 [ (1 − x)^(2/3) + 0.0091 (n0 /n)^(2/3) ] .
If n is in the neighborhood of n0 , then x is small, we can ignore x on the right hand side,
and we finally obtain
x ≈ 1.3 × 10−5 .
At higher concentrations x will get slightly smaller and at lower concentrations x will grow
slowly. The concentrations of neutrons, protons, and electrons are equal (x = 0.5) when

n = 2.2 × 10−8 n0 = 6.4 × 1030 cm−3 .

Such low concentrations will be attained only very near the surface of the neutron star.
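
The cubic-root equation above is easy to solve numerically; here is a sketch (SciPy assumed; non-relativistic electrons as in the text, so the answer is only illustrative):

    # Proton fraction x versus nucleon concentration n (in units of n0).
    from scipy.optimize import brentq

    def proton_fraction(n_over_n0, eF0=86.0, E=0.783):     # energies in MeV
        ratio = (1.0 / n_over_n0) ** (2.0 / 3.0)
        g = lambda x: ((1.0014 + 1838.7) * x ** (2.0 / 3.0)
                       - (E / eF0) * ratio - (1.0 - x) ** (2.0 / 3.0))
        return brentq(g, 1e-12, 0.999999)

    for n in (1.0, 0.1, 2.2e-8):
        print(f"n/n0 = {n:g}: x = {proton_fraction(n):.3g}")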
Caveats: (1) The Fermi energy of the electrons works out to be about 87 MeV, so the
electrons are extremely relativistic, so we really shouldn’t be using our non-relativistic
formula for the electron Fermi energy. One of this week’s homework problems gives you a
chance to modify the treatment to allow for relativistic electrons. (2) With an electron and
proton instead of a neutron, the pressure changes, so the equilibrium condition that we
set up is not quite right. Nevertheless, this calculation gives the flavor of what’s involved
and points to the correct conclusion: For most of its volume a neutron star is almost pure
neutrons!

Of course, we can turn the earlier question around: how is it that nuclei have any
protons??? Haven’t we just shown that at nuclear densities, the nucleons must exist as
neutrons, not protons???

Physics 301 3-Nov-2004 18-1

Bose-Einstein Gases

An amazing thing happens if we consider a gas of non-interacting bosons. For suf-
ficiently low temperatures, essentially all the particles are in the same state (the ground
state). This Bose-Einstein condensation can occur even when the temperature is high
enough that one would naively expect that higher energy states should be well populated.
In addition, properties of the gas change when it is in this state, so something like a phase
transition occurs.

Note that photons obey Bose statistics so they constitute a non-interacting gas of
bosons. We’ve already calculated their distribution (which has µ = 0). It’s just the
Planck distribution and this distribution does not have a Bose-Einstein condensation. The
difference between photons and the situation we’re about to discuss is that there is no
fixed number of photons. If a photon gas is cooled, the number of photons per unit volume
decreases. This is related to the fact that photons are massless. It’s possible to create or
destroy a photon of arbitrarily small energy. The gases we’ll be considering will contain a
fixed number of matter particles. One can’t create or destroy these bosons without doing
something about the rest mass energy (and perhaps other conserved quantum numbers)!

So let the gas contain N particles. The Bose-Einstein distribution is

f (ǫ) = 1/(e^((ǫ−µ)/τ ) − 1) ,

and the sum of this distribution function over all states must add up to N . For convenience,
we adjust the energy scale so that the lowest energy state has ǫ = 0.

When τ → 0, all the particles must be in the ground state,

lim(τ →0) 1/(e^(−µ/τ ) − 1) = N ,
lim(τ →0) (e^(−µ/τ ) − 1) = 1/N ,
lim(τ →0) e^(−µ/τ ) = 1 + 1/N ,
lim(τ →0) (1 − µ/τ + · · ·) = 1 + 1/N ,
lim(τ →0) (−µ/τ ) = 1/N ,
lim(τ →0) µ = −τ /N .

Recall that µ must be lower than any accessible energy for the Bose-Einstein distribution
and here we have µ < 0 in agreement with this constraint, although it converges to 0 as


τ → 0, but this is to be expected as all the particles must pile up in the ground state when
τ → 0. It’s instructive to evaluate µ for a mole of particles at a temperature of 1 K. The
result is
µ(1 K) = −2.3 × 10−40 erg .
If we consider a mole of 4 He and treat it as an ideal gas with p = 1 atm and T = 1 K, then
its volume would be V = 82 cm3 . This would be equivalent to a cube of side L = 4.3 cm.
Recall that the energies of single particle states in a cube are

ǫ(nx , ny , nz ) = (π^2 h̄^2 /2mL^2 )(nx^2 + ny^2 + nz^2 ) .
The ground state has nx = ny = nz = 1 and in the first excited state one of these quantum
numbers is 2. Using the L we just calculated and the mass of 4 He, we find

ǫ(1, 1, 1) = 1.34 × 10−31 erg , ǫ(2, 1, 1) = 2.68 × 10−31 erg .

Actually, the ground state energy is supposed to be adjusted to 0, so we need to subtract
ǫ(1, 1, 1) from all energies in the problem. Then the ground state energy is 0 and the first
excited state energy is

ǫ1 = 1.34 × 10−31 erg = 5.8 × 108 |µ| ,

at T = 1 K. The key point is that even though the energy of the first excited state is
incredibly small, and you might think such a small energy can have nothing to do with any
macroscopic properties of a system, this energy (or more properly, the difference in energy
between the ground state and the first excited state) is almost nine orders of magnitude
bigger than µ (at the temperature and density we’re considering). Under these conditions,
what is the population of the first excited state?
N1 = 1/(e^((ǫ1 − µ)/τ ) − 1) ,
   = 1/(e^(−5.8 × 108 µ/τ ) − 1) ,
   = 1/(1 − 5.8 × 108 µ/τ + · · · − 1) ,
   = 1/(−5.8 × 108 µ/τ ) ,
   = 1/(5.8 × 108 /N ) ,
   = N /(5.8 × 108 ) ,
so the occupancy of the first excited state is almost 9 orders of magnitude smaller than the
occupancy of the ground state. Essentially all the particles are in the ground state even
though kT is much larger than the excitation energy!
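
A minimal sketch of this numerical example (CGS units; the box size comes from the 82 cm^3 molar volume quoted above, and small differences from the numbers in the text just reflect rounded constants):

    # One mole of 4He as an ideal gas at T = 1 K: mu, the first excited
    # state energy, and the relative occupancy of that state.
    from math import pi

    k, hbar, NA = 1.380649e-16, 1.054571817e-27, 6.022e23   # erg/K, erg s
    m = 4.0026 / NA                                         # g
    T, N = 1.0, NA
    L = 82.0 ** (1.0 / 3.0)                                 # cm

    mu = -k * T / N
    eps = lambda nx, ny, nz: pi**2 * hbar**2 / (2 * m * L**2) * (nx**2 + ny**2 + nz**2)
    eps1 = eps(2, 1, 1) - eps(1, 1, 1)     # ground state energy shifted to zero

    print("mu        =", mu, "erg")        # ~ -2.3e-40
    print("eps1      =", eps1, "erg")      # ~ 1.3e-31
    print("eps1/|mu| =", eps1 / abs(mu))   # ~ 6e8
    print("N1/N0     ~", abs(mu) / eps1)   # ~ 2e-9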


Now we want to do a proper sum of the occupancy over the energy states. We might
try to write

N = ∫0^∞ f (ǫ)D(ǫ) dǫ ,

where

D(ǫ) = (V /4π^2 )(2m/h̄^2 )^(3/2) √ǫ ,
is the same density of states we used for the Fermi-gas except there’s a factor of two missing
because we’re assuming a spin 0 boson gas. (If the spin were different from 0, we would
include a factor 2S + 1 to account for the multiplicity of the spin states.) The expression
above has the problem that it fails to count the particles in the ground state. We have had
this problem in previous calculations but it never mattered because there were only a few
(2 or less) in the ground state and ignoring these particles makes absolutely no difference
to any quantity involving the other ∼ 1023 particles.

However, we are expecting to find many, and in some cases, most of the particles in
the ground state. It would not be a good idea to ignore them in the sum! So we write the
sum as

N = N0 + ∫0^∞ f (ǫ)D(ǫ) dǫ ,

where the first term is the number of particles in the ground state and the second term
accounts for all particles in excited states. This term still makes an error in the low
energy excited states (since we’re integrating rather than summing), but when these states
contain a lot of particles, the ground state contains orders of magnitude more, so errors
in the occupancies of these states are of no concern. In the case that these states don’t
contain many particles, it means that the occupancies of all states are small, and again we
make no appreciable error if we miss on the occupancies of a few of the low energy excited
states.

So, the number of particles in the ground state is

N0 = 1/(e^(−µ/τ ) − 1) ,

and the number of particles in excited states is

Ne = ∫0^∞ f (ǫ)D(ǫ) dǫ ,
   = ∫0^∞ [1/(e^((ǫ−µ)/τ ) − 1)] (V /4π^2 )(2m/h̄^2 )^(3/2) √ǫ dǫ ,
   = (V /4π^2 )(2m/h̄^2 )^(3/2) ∫0^∞ √ǫ dǫ /(e^((ǫ−µ)/τ ) − 1) ,
   = (V /4π^2 )(2m/h̄^2 )^(3/2) ∫0^∞ √ǫ dǫ /(e^(ǫ/τ ) − 1)   (since |µ|/τ ≪ ǫ/τ ) ,
   = (V /4π^2 )(2m/h̄^2 )^(3/2) τ^(3/2) ∫0^∞ √x dx /(e^x − 1)   (x = ǫ/τ ) ,
   = (V /4π^2 )(2m/h̄^2 )^(3/2) τ^(3/2) Γ(3/2) ζ(3/2) ,
   = 1.306 √π (V /4π^2 )(2m/h̄^2 )^(3/2) τ^(3/2) ,
   = 2.612 V (mτ /2πh̄^2 )^(3/2) ,
   = 2.612 V nQ ,

where nQ is the quantum concentration again.

The major approximation we made in the above calculation was ignoring the chemical
potential. As long as there are an appreciable number of particles in the ground state,
then |µ| must be much smaller than the energy of any excited state and this is a good
approximation. With the numerical example we worked out before, |µ| will be closer to
0 than to the first excited state energy provided the ground state contains about 1015 or
more particles which means the excited states must contain about 6×1023 −1015 = 6×1023
particles. In other words, our approximation for Ne above should be valid all the way to
the point where Ne = N . This means that Ne ∝ τ^(3/2) . We define the proportionality
constant by defining the Einstein condensation temperature, τE , such that

Ne = N (τ /τE )^(3/2) ,

so

τE = (2πh̄^2 /m)(N /2.612V )^(2/3) ,

and we expect the expression for Ne should be valid from τ = 0 up to τ = τE . Then the
number in the condensate is

N0 = N [ 1 − (τ /τE )^(3/2) ] .

Numerically, the Einstein temperature is


TE = 115/(Vm^(2/3) m) ,
where TE is in Kelvins, Vm is the molar volume in cm3 and m is the molar weight in
grams. For liquid 4 He, with a molar volume of 27.6 cm3 , this gives TE = 3.1 K. There


is actually a transition in liquid helium at about 2.17 K. Below this temperature, liquid
4 He develops a superfluid phase. This phase is most likely a Bose-Einstein condensation,
but it is more complicated than the simple theory we have worked out because there are
interatomic forces between the helium atoms. We know this because there must be forces
that are responsible for the condensation of helium gas to liquid helium at T = 4.2 K and
one atmosphere.
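
A minimal sketch of that estimate (SI units, standard constants, molar volume as quoted above):

    # Einstein condensation temperature for liquid 4He treated as an ideal
    # Bose gas with molar volume 27.6 cm^3.
    from math import pi

    k, hbar, NA = 1.380649e-23, 1.054571817e-34, 6.022e23
    m = 4.0026e-3 / NA              # kg
    n = NA / 27.6e-6                # atoms per m^3

    T_E = 2 * pi * hbar**2 / (m * k) * (n / 2.612) ** (2.0 / 3.0)
    print("T_E =", T_E, "K")        # ~3.1 K, versus the observed 2.17 K transition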

If you read the articles referenced at the beginning of these notes, you’ll see that a
major problem faced by the experimenters in creating BE condensates in other systems is
getting the atoms cold enough and dense enough to actually form the condensate. In the
case of helium, the attractive interactions help to get the density high enough to form the
condensate at more accessible temperatures!

Superfluid Helium

As mentioned, the transition of 4 He at 2.17 K at 1 atm is believed to be the conden-
sation of most of the helium atoms into the ground state—a Bose-Einstein condensation.
That this does not occur at the calculated temperature of 3.1 K is believed to be due to
the fact that there are interactions among helium atoms so that helium cannot really be
described as a non-interacting boson gas! Above the transition temperature, helium is
referred to as He I and below the transition, it's called He II.

K&K present several reasons why thinking of liquid helium as a non-interacting gas
is not totally off the wall. You should read them and also study the phase diagrams for
both 4 He and 3 He (K&K figures 7.14 and 7.15).

The fact that something happens at 2.17 K is shown by the heat capacity versus tem-
perature curve (figure 7.12 in K&K) which is similar to the heat capacity curve for a phase
transition and also not all that different from the curve you’re going to calculate for the
Bose-Einstein condensate in the homework problem (despite what the textbook problem
actually says about a marked difference in the curves!). In addition to the heat capacity,
the mechanical properties of 4 He are markedly different below the transition temperature.
The liquid helium becomes a superfluid which means it flows without viscosity (that is,
friction).

The existence of a Bose-Einstein condensate does not necessarily imply the existence
of superfluidity.

To understand this we need to examine the mechanics of friction on a microscopic
level. Without worrying about the details, friction must be caused by molecular collisions
which transfer energy from the average motion (represented by the bulk velocity) to the
microscopic motion (internal energy).


If our Bose-Einstein condensate were really formed from a gas of non-interacting
particles, then it would be possible to excite any molecule in the condensate out of the
ground state and into the first excited state simply by providing the requisite energy (and
momentum). Previously, we calculated that under typical conditions, the energy difference
between the ground state and the first excited state was about 10−31 erg, an incredibly
small amount of energy that would be very easy to provide given that thermal energies are
about 10−16 erg.

In order to have superfluid behavior, it must be that there are interactions among
the molecules such that it’s not possible to excite just one molecule out of the condensate
and into the first excited state. A better way to say this is that due to the molecular
interactions, the single particle states are not discrete energy states of the superfluid.
We need to consider the normal modes of the fluid—the longitudinal oscillations or the
sound waves. In particular, we can consider travelling sound waves of wave vector k
and frequency ω. (Rather than the standing waves which carry no net momentum.) A
travelling wave carries energy in units of h̄ω and momentum in units h̄k. The number of
units is determined by the number of phonons or quasiparticles in the wave.

Now imagine an object of mass M moving though a stationary superfluid with velocity
Vi . In order for there to be a force on the object, there must be a momentum transfer
to the superfluid. In order to do this, the object must create excitations in the fluid
which contain momentum (the quasiparticles in the travelling waves we just discussed).
Of course, if quasiparticles already exist, the object could “collide” with a quasiparticle
and scatter it to a new state of energy and momentum. (This can also be viewed as the
absorption of one quasiparticle and the emission of another.) We will assume that there
are not very many existing quasiparticles and consider only the creation (emission) of a
quasiparticle.

So, let’s consider this emission process. Before the event, the object has velocity Vi
and afterwards it has velocity Vf . We must conserve both energy and momentum,

(1/2)M Vi^2 = (1/2)M Vf^2 + h̄ω ,

and

M Vi = M Vf + h̄k .

We can go through some algebra with the goal of solving for Vi · k. The momentum
equation can be rewritten as

M Vi − h̄k = M Vf ,

squared and divided by 2M ,

(1/2)M Vi^2 − h̄Vi · k + h̄^2 k^2 /2M = (1/2)M Vf^2 .


Subtract from the energy equation to get


h̄Vi · k = h̄ω + h̄^2 k^2 /2M ,

or

Vi · (k/k) = h̄ω/h̄k + h̄k/2M .
What we really want to do is place a lower limit on the magnitude of Vi . This means we
can drop the term containing M on the right hand side. The smallest value will occur
when Vi is parallel to k/k, the unit vector in the k direction. This corresponds to emission
of the quasiparticle in the forward direction. Thus
Vi > h̄ω/h̄k .
I’ve left the h̄’s there in order to emphasize that the right hand side is the ratio of the
energy to the momentum of an excitation.

Suppose the excitations are single particle states with momentum h̄k and energy
h̄2 k 2 /2m. This is the travelling wave analog to the standing wave particle in a box states
we’ve discussed many times. Then the right hand side becomes h̄k/2m which goes to
zero as k → 0. Thus, an object moving with an arbitrarily small velocity can produce
an excitation and feel a drag force—there is no superfluid in this case. (Note: k must be
bigger than ∼ 1/L, where L is the size of the box containing the superfluid, but as we’ve
already seen the energies corresponding to this k are tiny compared to thermal energies.)

Suppose the excitations are sound waves (as we’ve been assuming) and the phase
velocity is independent of k. Then
Vi > ω/k = vs ,
where vs is the phase velocity of sound in the fluid. This means that if an object flows
through the fluid at less than the velocity of sound, the flow is without drag! That is, the
fluid is a superfluid.

In fact, the vs is not independent of k and what sets the limit is the minimum phase
velocity of any excitation that can be created by the object moving through the fluid.
Figure 7.17 in K&K shows that this minimum is about 5000 cm s−1 for the low lying
excitations in 4 He. K&K point out that Helium ions have been observed to travel without
drag through He II at speeds up to about 5000 cm s−1 !
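
The criterion min(ω/k) is easy to play with numerically; a minimal sketch with two toy dispersion relations (the helium mass and the rough sound speed are assumed values, not taken from K&K's figure):

    # Landau critical velocity: minimum of omega(k)/k for free-particle
    # excitations versus phonons with a constant sound speed.
    import numpy as np

    hbar, m, vs = 1.054571817e-27, 6.65e-24, 2.4e4   # erg s, g, cm/s (assumed)
    k = np.linspace(1e4, 1e8, 10000)                  # wavenumbers, 1/cm

    phase_free  = (hbar * k**2 / (2 * m)) / k         # -> 0 as k -> 0: no superflow
    phase_sound = (vs * k) / k                        # constant: threshold = vs

    print("free particles: min(omega/k) =", phase_free.min(), "cm/s")
    print("phonons       : min(omega/k) =", phase_sound.min(), "cm/s")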

Comment 1: It appears that we’ve shown that any fluid should be a superfluid as
long as we don’t move things through it faster than its speed of sound. In our derivation,
we made an assumption that doesn’t apply in most cases. Can you figure out which
assumption it was?


Comment 2: As a general rule, superfluidity or superconductivity requires the conden-
sation of many particles into a single (ground) state and a threshold for creating excitations.
The minimum velocity required to create an excitation is the threshold for non-viscous flow.

Physics 301 5-Nov-2004 19-1

Heat and Work

Now we want to discuss the material covered in chapter 8 of K&K. This material might
be considered to have a more classical thermodynamics rather than statistical mechanics
flavor. We’ve already discussed a lot of this material in bits and pieces throughout the
term, so we will try to focus on the material not yet covered and just hit the highlights of
the remaining material.

Heat and work occur during processes. They are energy transfers. Work is an energy
transfer by macroscopic means and heat is an energy transfer by microscopic means. We’ve
discussed reversible processes several times and we’ll assume reversible processes unless
we explicitly state otherwise. When work is done to or by a system, the macroscopic
parameters of the system are changed—for example changing the volume causes p dV
work to be performed. Performing work changes the energy, U , of a system. But work
does not change the entropy. Heat transfer changes the entropy as well as the energy:

dU = d̄Q = τ dσ .

A very important activity in any modern society is the conversion of heat to work.
This is why we have power plants and engines, etc. Basically all forms of mechanical or
electrical energy that we use involve heat to work conversion. Not all of them involve fossil
fuels, and in some cases it may be hard to see where the heat enters. For example, what
about hydro-electric power? This is the storage of water behind a dam and then releasing
the gravitational potential energy of the water to run an electric generator. Where is the
heat supplied? Heat is supplied in the form of sunlight which keeps the weather going
which provides water in the atmosphere to make rain to fill the lake behind the dam. Of
course, the economics of this process are quite different from the economics of an oil fired
electrical generating plant.

It was the steam engine (conversion of heat, obtained by burning coal, into work)
that allowed the industrial revolution to proceed. The desire to make better steam engines
produced thermodynamics!

With an irreversible process you can turn work completely into heat. Actually, this
statement is not well defined. What we really mean to say is that with an irreversible
process we can use work to increase the internal energy of a system and leave that system
in a final configuration that would be exactly the same as if we had reversibly heated the
system. For example, consider a viscous fluid in an insulating container. Immersed in
the fluid is a paddle wheel which is connected by a string running over various pulleys
and whatever to a weight. The weight is allowed to fall under the influence of gravity.
Because the fluid is so viscous, the weight drops at a constant slow speed. Once the weight
reaches the end of its travel, we wait for the fluid to stop sloshing and the temperature and
pressure in the fluid to become uniform. Thus essentially all of the mechanical gravitational

Physics 301 5-Nov-2004 19-2

potential energy is converted to internal energy of the fluid. We can take the fluid from
the same initial state to the same final state by heating slowly (reversibly!) until we have
the same temperature rise.

There is no known way to convert (reversibly or non-reversibly) heat (more properly, internal energy, U ) entirely into work with no other change. This is one of the ways of
stating the second law of thermodynamics.

It is certainly possible to convert heat into work. (I’m getting tired of trying to say it
exactly correctly, so I’ll just use the vernacular and you know what I mean, right?) The
constraints are that you can’t convert all of it to work or there must be some permanent
change in the system or both. For example, suppose we reversibly add heat to an ideal gas
while we keep the volume constant. Then we insulate the gas and allow it to reversibly
expand until its temperature is the same as when we started. Then the internal energy
of the gas is the same as when we started, so we have completely converted the heat into
work, but the system is not the same as when we started. The gas now occupies a bigger
volume and has a lower pressure.

The problem is that when we reversibly add heat to a system we add internal energy
dU = d̄Q and we also add entropy dσ = d̄Q/τ , but when we use the system to perform work,
we remove only the energy dU = d̄W and leave the entropy! If we want to continue using
the system to convert heat to work, we have to remove the entropy as well as the energy,
so there is no accumulation of entropy. The only way to remove entropy (reversibly) is to
remove heat. We want to remove less heat than we added (so we have some energy left
over for work) so we must remove the heat at a lower temperature than it was added in
order to transfer the same amount of entropy.

To make this a little more quantitative, consider some time interval (perhaps a com-
plete cycle of a cyclic engine) during which heat Qh is transferred into the system at
temperature τh , heat −Ql is transferred into the system at temperature τl , and energy −W
in the form of work is transferred into the system. (So heat Ql > 0 leaves the system and
work W > 0 is performed on the outside world.) At the end of this time interval we want
the system to be in the same state it was when we started. This means
∆U = 0 = Qh − Ql − W ,
and
∆σ = 0 = Qh /τh − Ql /τl .
We find
Qh /Ql = τh /τl ,
and
ηC = W/Qh = 1 − τl /τh .

Physics 301 5-Nov-2004 19-3

The ratio of the heat input and output is the same as the ratio of the temperatures of
the input and output reservoirs. The energy conversion efficiency or just efficiency, η is
defined as the work output over the heat input, W/Qinput . For the ideal engine we’ve
been considering, the efficiency is ηC , the Carnot efficiency, and is the upper limit to the
efficiency of any real (i.e. non-reversible) engine operating between temperature extremes
τh and τl . Carnot might be called the father of thermodynamics. He worked in the early
1800’s and understood the second law. This was before heat was recognized as a form of
energy!
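
To put a number on this, here is a minimal Python sketch; the 800 K and 300 K reservoir temperatures are illustrative values, not taken from these notes.

    # Carnot efficiency for assumed reservoir temperatures: an 800 K boiler
    # dumping waste heat at 300 K (roughly room temperature).
    tau_h = 800.0                 # high temperature reservoir, K
    tau_l = 300.0                 # low temperature reservoir, K
    eta_C = 1.0 - tau_l / tau_h
    print(eta_C)                  # 0.625; any real engine does worse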

Of course, this definition of efficiency is motivated by the fact that if you’re an electric
power company, you can charge your customers based on W but you have to pay your
suppliers based on Qh and you want to maximize profits!

We live on the surface of the Earth and any engine must dump its waste heat, Ql ,
at what amounts to room temperature, about 300 K. This is roughly the equilibrium
temperature of the surface of the Earth and is set by radiation equilibrium between the
Sun and Earth and between the Earth and space (T = 3 K). See problem 5 in chapter 4 of
K&K. Aside: are you surprised that room temperature and the surface temperature of the
Earth are about the same? Anyway, the waste heat goes into the environment and usually
generates thermal pollution. There may come a time when a cost is associated with Ql .
In this case it’s still desirable to maximize η, because that minimizes Ql .

Because waste heat must be dumped at room temperature, improving the Carnot
efficiency requires increasing the high temperature, τh . But this is not so easy to do,
especially in an economically viable power plant that's supposed to last for many years.

Comment 1: no real engine is reversible, so all real engines operating between tem-
perature extremes τh and τl will have an efficiency less than the Carnot efficiency.

Comment 2: many practical engines are designed in such a way that heat is exchanged
at intermediate temperatures as well as the extremes. Such engines, even if perfectly
reversible, frictionless, etc. have an efficiency less than the Carnot efficiency. However, if
one assumes a perfect, reversible engine operating according to the specified design, one
can calculate an efficiency (lower than the Carnot efficiency) which is the upper limit that
can be achieved by that engine design.

A reversible refrigerator uses work supplied from the outside to remove heat from
a low temperature reservoir and deposit that heat plus the work used as heat in a high
temperature reservoir. Such a refrigerator is basically the reversible engine just discussed,
but run backwards! The signs of Qh , Ql , and W all change but anything involving their
ratios remains the same. (Note if you wanted to “derive” refrigerators you could start
from the same idea we used with the engine—entropy is only transferred when there's a
heat transfer and entropy must not be allowed to accumulate in the refrigerator.) With
refrigerators, one uses a coefficient of performance, this is defined as the ratio of the heat

Physics 301 5-Nov-2004 19-4

removed from the low temperature reservoir to the work required. This is

γ = Ql /W ,
and for a reversible refrigerator operating between temperatures τl and τh , the Carnot
coefficient of performance is
γC = τl /(τh − τl ) ,
and this is an upper limit to the performance of any refrigerator operating between the
same temperature extremes.

Aside: If you go to a department store and look at air conditioners, you will find
something called an energy efficiency rating (EER) which is basically the coefficient of
performance. But, I believe these are given in BTU per hour per watt. That is they have
dimensions instead of being dimensionless! To convert to a dimensionless number you must
multiply by
(1055 Joules / 1 BTU) · (1 Hour / 3600 Seconds) = 0.29 J Hour / (BTU s) .
A typical EER you’ll find on an air conditioner is roughly 10, so the “real” γ is about 3!
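
A one-line check of that conversion (a sketch; the EER of 10 is just the typical value mentioned above):

    # Convert an air conditioner EER (BTU per hour per watt) into the
    # dimensionless coefficient of performance gamma.
    eer = 10.0                      # typical value from a spec sheet
    gamma = eer * 1055.0 / 3600.0   # multiply by (J/BTU) and divide by (s/hour)
    print(gamma)                    # about 2.9, i.e. gamma is roughly 3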

Note that all reversible engines operating between the same two temperature reservoirs
must have the same efficiency. Similarly, all reversible refrigerators operating between the
same two temperature reservoirs must have the same coefficients of performance.

To see this, suppose that one has two reversible engines operating between the same
two temperature reservoirs but they have different efficiencies. Run the high efficiency
engine for some time, taking heat Qh from the high temperature reservoir, producing
work W , and dumping heat Ql = Qh − W in the low temperature reservoir. Now run
the other engine in reverse (it’s reversible!) as a refrigerator removing heat Ql from the
low temperature reservoir, so the low temperature reservoir is exactly the same as when
we started. To do this, the refrigerator is supplied with work W ′ and it dumps heat
Q′h = Ql + W ′ to the high temperature reservoir. Since this is the less efficient of the two
reversible engines, W ′ < W . So the net effect of running the two engines is to extract heat
from the high temperature reservoir and turn it completely into work. This is a violation
of the second law of thermodynamics, and it does not happen. Therefore all reversible
engines operating between the same two reservoirs must have the same efficiency.

Similar kinds of arguments can be used to show that all reversible refrigerators op-
erating between the same two reservoirs must have the same coefficients of performance,
that reversible engines are more efficient than irreversible engines, and that reversible
refrigerators have higher coefficients of performance than irreversible refrigerators.

Physics 301 5-Nov-2004 19-5

The Carnot Cycle

We’ve mentioned the Carnot efficiency and we’ve talked about heat engines, but how
would one make a heat engine that (were it reversible) would actually have the Carnot
efficiency? Simple, make an engine that uses the Carnot cycle.

The Carnot cycle is most conveniently plotted on a temperature-entropy diagram. We plot the entropy of
the “working substance” in an engine on the horizontal
axis and the temperature of the working substance on
the vertical axis. The working substance might be an
ideal gas. It’s whatever component actually receives the
heat and undergoes changes in its entropy and internal
energy and performs work on the outside world. There
are four steps in a Carnot cycle. In step ab, the tem-
perature is constant at τh while the entropy is increased
from σ1 to σ2 . This is the step in which the system is in
contact with the high temperature reservoir and heat Qh = τh (σ2 −σ1 ) is added to the sys-
tem. If the system is an ideal gas, then it must expand to keep the temperature constant,
so it does work on the outside world. In step bc, the temperature is lowered at constant
entropy. No heat is exchanged, the gas expands and does more work on the outside world.
In step cd, entropy is removed at constant temperature by placing the system in contact
with the low temperature reservoir. The heat removed is Ql = τl (σ2 − σ1 ). In this step,
the gas is compressed in order to maintain constant temperature, so the outside world does
work on the gas. In step da, the system is returned to the starting temperature, τh , by
isentropic compression, so more work is done on the system. The hatched area within the
path followed by the system is the total heat added to the system in one cycle of operation,
Q = Qh − Ql = ∮ τ dσ .

Since the system returns to its starting configuration (same U ) this is also the work done
in one cycle.

Whatever the working substance, a Carnot cycle always looks like a rectangle on a τ σ diagram. (It has two
isothermal segments and two isentropic segments.) Given a
working substance we can plot the Carnot cycle on a pV di-
agram. The figure shows the Carnot cycle for a monatomic
ideal gas. The vertices abcd in this diagram are the same
as the vertices abcd in the τ σ diagram. So paths ab and
cd are the constant temperature paths with τ = τh and
τ = τl . Along these paths pV = const. Paths bc and da
are the constant entropy paths with σ = σ2 and σ = σ1 .

Physics 301 5-Nov-2004 19-6

Along these paths pV^(5/3) = const. The work done on the outside world in one cycle is the hatched area within the path, W = ∮ p dV .

The arrows on the paths indicate clockwise traversal. In this direction, the Carnot
cycle is a Carnot engine producing work and waste heat from high temperature input heat.
If the cycle is run in reverse—counterclockwise—one has a Carnot refrigerator using work
to move heat to a higher temperature reservoir.

As an example of a non-Carnot cycle, suppose we have a monatomic ideal gas which is the working sub-
stance of a reversible engine and it follows the rectangu-
lar path on the pV diagram shown in the figure. Along
da, heat is added at constant volume V1 . On ab, heat is
added at constant pressure. p2 , and work is performed.
On bc heat is removed at constant volume, V2 , and on
cd, heat is removed at constant pressure, p1 , while the
outside world does some work on the system. As before
the total work done on the outside world is the area
within the path and in this case,

W = (p2 − p1 )(V2 − V1 ) .

The heat added is


Qin = (3/2) N (τa − τd ) + (5/2) N (τb − τa ) ,
    = (3/2) (p2 V1 − p1 V1 ) + (5/2) (p2 V2 − p2 V1 ) ,
    = (5/2) p2 V2 − p2 V1 − (3/2) p1 V1 .
The actual efficiency of this reversible engine is
η = W/Qin = (p2 V2 − p2 V1 − p1 V2 + p1 V1 ) / ((5/2) p2 V2 − p2 V1 − (3/2) p1 V1 ) ,

while the Carnot efficiency is


ηC = 1 − p1 V1 /(p2 V2 ) .
These formulae aren’t all that illuminating so let’s consider a numerical example: suppose
p2 = 2p1 and V2 = 2V1 . Then the temperature at the hottest (upper right) vertex is 4
times the temperature at the lowest (lower left) vertex, so the Carnot efficiency is
ηC = 3/4 .
Physics 301 5-Nov-2004 19-7

The actual efficiency is


η = 2/13 .
If you’re actually trying to build a heat engine that operates on this cycle, then as you
improve the engine by reducing friction, heat losses, etc., you will approach an efficiency
of 2/13 and this should be your goal, not the Carnot efficiency.
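
Here is a minimal numerical check of the example just worked out (arbitrary units, with p1 = V1 = 1):

    # Rectangular cycle on the pV diagram for a monatomic ideal gas,
    # with p2 = 2*p1 and V2 = 2*V1.
    p1, V1 = 1.0, 1.0
    p2, V2 = 2.0 * p1, 2.0 * V1
    W = (p2 - p1) * (V2 - V1)                        # area enclosed by the cycle
    Q_in = 2.5 * p2 * V2 - p2 * V1 - 1.5 * p1 * V1   # heat added along da and ab
    print(W / Q_in)                                  # 0.1538... = 2/13
    print(1.0 - (p1 * V1) / (p2 * V2))               # Carnot limit, 0.75 = 3/4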

If you want to approach the Carnot efficiency, you must redesign the cycle to be more
like a Carnot cycle. In the cycle shown, the extreme temperatures are reached only at the
upper right and lower left vertices. Most heat transfers are at less extreme temperatures
and this is why the actual efficiency is so much less than the Carnot efficiency.

Other Thermodynamic Functions

We have concentrated on the internal energy, U , in the preceding discussion. If we


consider a constant temperature process, then the work done on the system is the change
in the Helmholtz free energy. This is because at constant temperature, d(τ σ) = τ dσ, so

d̄W = dU − d(τ σ) = dF (constant temperature) .

Many processes occur at constant pressure, such as all processes open to the atmo-
sphere. If a process occurs at constant pressure, then we are letting the system adjust its
volume “as necessary,” so we cannot really use or supply any p dV work performed by or
on the system. The p dV work “just happens.” If the system can perform work in other
ways, then we can divide the work into the p dV work and other work,

d̄W = d̄Wother + d̄WpV ,

and
d̄Wother = dU − d̄WpV − d̄Q ,
= dU + p dV − d̄Q ,
= dU + d(pV ) − d̄Q ,
= dH − d̄Q (constant pressure),
where
H = U + pV ,
is called the enthalpy. In any constant pressure process the heat added plus the non-p dV
work done is the change in enthalpy. In particular, if there is no non-p dV work done,
the change in enthalpy is just the heat added. Note that d̄Wother is what K&K call the
“effective” work in a constant pressure process.

Physics 301 5-Nov-2004 19-8

In the event that we have a reversible process that occurs at constant temperature
and constant pressure, the Gibbs free energy is useful. This is defined as
G = F + pV = H − τ σ = U + pV − τ σ .
It should be clear that
d̄Wother = dG (constant temperature and pressure) .
Also, a system which is allowed to come to equilibrium at constant temperature and
pressure will come to equilibrium at a minimum of the Gibbs free energy.

As an example of the use of the Gibbs free energy, consider a system (cell) of two
noninteracting electrodes in an electrolyte consisting of sulfuric acid dissolved in water.
The sulfuric acid becomes two hydrogen ions and one sulfate ion,
H2 SO4 ↔ 2H+ + SO4−− .
When current is forced through the system in the direction to supply electrons to the
cathode, the reaction at the cathode is
2H+ + 2e− → H2 ,
and the reaction at the anode is
SO4−− + H2 O → H2 SO4 + (1/2) O2 + 2e− ,
and the net reaction is
H2 O → H2 + (1/2) O2 .
If the current is passed through the cell slowly and the cell is open to the atmosphere and
kept at constant temperature, then the process occurs at constant τ and p. The “other”
work is electrical work.
Wother = G(H2 O) − G(H2 ) − (1/2) G(O2 ) ,
where the Gibbs free energies can be looked up in tables and it is found that the difference
is
∆G = −237,000 J mol−1 .
The other work done is electrical work equal to the charge times the voltage. Since we
have two electrons transferred per molecule (2N0 per mole),
Wother = −2eN0 V0 ,
or
V0 = −∆G/(2N0 e) = 1.229 Volts ,
where N0 is Avogadro’s number and e is the charge on an electron. V0 is the voltage
that is established with no current flowing. At higher voltage, current flows and the cell
liberates hydrogen and oxygen (electrolysis). If these gases are kept in the cell and allowed
to run the reactions in reverse, one obtains a fuel cell in which the reaction of hydrogen
and oxygen to form water generates electricity “directly.”
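
A quick numerical check of that voltage (a sketch; the ∆G is the tabulated value quoted above and the constants are the standard ones):

    # Open-circuit voltage of the water electrolysis / fuel cell reaction.
    delta_G = -237.0e3             # J per mole of H2O, tabulated value
    N0 = 6.022e23                  # Avogadro's number
    e = 1.602e-19                  # electron charge, C
    V0 = -delta_G / (2 * N0 * e)   # two electrons transferred per molecule
    print(V0)                      # about 1.229 V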

Physics 301 8-Nov-2004 20-1

Reading

K&K chapter 9 and start on chapter 10. Also, some of the material we’ll be discussing
this week is taken from Mandl, chapter 11.

Gibbs Free Energy

As we discussed last time, the Gibbs free energy is obtained from the energy via two
Legendre transformations to change the independent variables from entropy to temperature
and from volume to pressure,
dU (σ, V, N ) = +τ dσ − p dV + µ dN .
With F = U − τ σ:   dF (τ, V, N ) = −σ dτ − p dV + µ dN .
With H = U + pV :   dH(σ, p, N ) = +τ dσ + V dp + µ dN .
With G = F + pV = H − τ σ = U − τ σ + pV :
dG(τ, p, N ) = −σ dτ + V dp + µ dN .
There are a couple of general points to make here. First of all, if the system has other ways
of storing energy, those ways should be included in all these thermodynamic functions. For
example, if the system is magnetic and is in a magnetic field, then there will have to be
an integral of the magnetization (magnetic dipole moment per unit volume) times the
magnetic field times the volume element to account for the magnetic energy. The second
point is that if the system contains several different kinds of particles, then µ dN is replaced by Σi µi dNi , where the index i runs over the particle types. (We will be doing this shortly!)
The above way of writing the energy, the Helmholtz free energy, F , the enthalpy, H, and
the Gibbs free energy, G are really just shorthand for what might actually have to be
included.

As remarked earlier, the Gibbs free energy is particularly useful for situations in which
the system is in contact with a thermal reservoir which keeps the temperature constant,
dτ = 0, and a pressure reservoir which keeps the pressure constant, dp = 0. Then if the
number of particles doesn’t change, the Gibbs free energy is an extremum dG = 0, and in
fact, it must be a minimum (because the entropy enters with a minus sign!).

Another thing to note is that τ , p, and µ are intensive parameters while σ, V , N , and
G itself are extensive parameters. This means that for fixed temperature and pressure,
G must be proportional to the number of particles. Or, G = N f (τ, p) where f is some
function of the temperature and pressure. If we differentiate with respect to N , we have
 
(∂G/∂N )τ,p = f (τ, p) .

Physics 301 8-Nov-2004 20-2

If we compare this with the earlier expression for dG, we see that f (τ, p) = µ(τ, p). In
other words, the chemical potential of a single component system depends only on the
temperature and pressure. Furthermore,

G(τ, p, N ) = N µ(τ, p) .

What happens when there is more than one kind of particle in the system? In this
case, we can show that
G(τ, p, N1 , N2 , . . .) = Σi Ni µi .

We must have for any λ,

G(τ, p, λN1 , λN2 , . . .) = λG(τ, p, N1, N2 , . . .) ,

as this just expresses the fact that G and the Ni are extensive parameters. Now, set
xi = λNi and differentiate with respect to λ,
Σi (∂G/∂xi )(∂xi /∂λ) = G(τ, p, N1 , N2 , . . .) .

Note that ∂xi /∂λ = Ni and when λ → 1, then xi → Ni , and ∂G/∂Ni = µi , so


G(τ, p, N1 , N2 , . . .) = Σi Ni µi ,

but it is not necessarily true that µi depends only on τ and p.

As an example, we can write down the Gibbs free energy for a classical ideal gas. We
worked out the Helmholtz free energy in lecture 14. For a single component ideal gas, it is
F = N τ (log(n/(nQ Zint )) − 1) ,
so
G = N τ (log(n/(nQ Zint )) − 1) + pV ,
  = N τ (log((N/V )/(nQ Zint )) − 1) + N τ ,
  = N τ log(p/(τ nQ Zint )) ,
where we used the ideal gas law to replace N τ with pV and N/V with p/τ . Of course,
we could also have obtained this result with our expression for µ that we worked out in
lecture 14! Note that, as advertised, µ is a function only of p and τ .

Physics 301 8-Nov-2004 20-3

If we have a multicomponent ideal gas, the situation is slightly more complicated. Starting from the Helmholtz free energy again, we have
G = Σi Ni τ (log(ni /(ni,Q Zi,int )) − 1) + pV ,
  = Σi Ni τ log((Ni /V )/(ni,Q Zi,int )) − Σi Ni τ + pV ,
  = Σi Ni τ log((Ni /N )(N/V )/(ni,Q Zi,int )) − Σi Ni τ + pV ,
  = Σi Ni τ log(xi p/(τ ni,Q Zi,int )) − N τ + pV ,
  = Σi Ni τ log(xi p/(τ ni,Q Zi,int )) ,

where xi is the fractional concentration of molecules of type i, xi = Ni /N = ni /n. Also, xi p = pi , the partial pressure of molecules of type i. The quantum concentrations are given a molecular subscript since they depend on the masses of the molecules as well as the temperature. The Gibbs free energy is of the form Σi Ni µi , and the chemical potentials depend on pressure, temperature, and the intensive parameters xi .

The derivation of G in the above paragraph hides an important issue in the internal
partition functions, Zi,int . This is the fact that all energies in the system must be measured
from the same zero point. In particular, if we have molecules that can undergo chemical
reactions (which is where we’re headed), then we might have a reaction like

A+B↔C.

If C is stable, then the reaction of A and B to produce C gives up some binding energy ǫb ,
so the ground state energy for ZC,int is lower than zero by ǫb . In other words, the internal
energy states of the molecules are

A: 0, ǫA,1 , ǫA,2 , ǫA,3 , ... ,


B: 0, ǫB,1 , ǫB,2 , ǫB,3 , ... ,
C: − ǫb , −ǫb + ǫC,1 , −ǫb + ǫC,2 , −ǫb + ǫC,3 , ... .

When we compute the internal partition function for molecule C we need to include −ǫb
as part of the energy in every term in the sum. This extra energy will factor out and we
will have
ZC,int = e+ǫb /τ Z0,C,int ,
where Z0,C,int is the usual partition function with the ground state at 0. Since the logarithm
of the partition function occurs in the chemical potential, the net effect is to add −ǫb to µC

Physics 301 8-Nov-2004 20-4

and −NC ǫb to the Gibbs free energy. The message is that energies must be measured on
a common scale. We will sometimes assume the internal partition functions are calculated
with the internal ground state energy set to 0 and explicitly add any binding energies to the
chemical potentials. Other times, we will assume that all binding energies are incorporated
into the internal partition functions!

Chemical Equilibrium

Suppose we have a chemical reaction which takes place at constant temperature and
pressure. Then, we know that the Gibbs free energy is a minimum. But in addition to
this condition, we also have the constraint imposed by the reaction. In particular, we can
write any chemical reaction as

ν1 A1 + ν2 A2 + ν3 A3 + · · · + νl Al = 0 ,

where Ai stands for a particular compound and νi denotes the relative amount of that
compound which occurs in the reaction. For example, the formation of water from hydrogen
and oxygen is typically written,

2H2 + O2 → 2H2 O .

This becomes
2H2 + O2 − 2H2 O = 0 ,
with
A1 = H2 , A2 = O2 , A3 = H2 O ,
ν1 = 2 , ν2 = 1 , ν3 = −2 .
If the reaction occurs, the change in numbers of molecules is described by νi ,

dNi = νi dR ,

where dR is the number of times the reaction occurs in the direction that makes the left
hand side. Then the change in the Gibbs free energy is
dG = Σi µi dNi = Σi µi νi dR = (Σi µi νi ) dR .

This must be an extremum which means that there is no change in G if the reaction or
the inverse reaction occurs (dR = ±1), so
Σi µi νi = 0 ,

Physics 301 8-Nov-2004 20-5

when a reaction occurs at constant temperature and pressure.

Note 1: the expression we’ve just derived also holds if the temperature and volume
are held fixed. This is most easily seen by noting that when the temperature and pressure are held fixed, the reaction proceeds until Σi µi νi = 0 at which point the system has
some particular volume determined by the total amount of reactants, the pressure and
temperature. If we start with the temperature fixed and some particular volume, the
reaction proceeds to equilibrium at which point the system has some pressure. Now imagine
that one had started with this pressure, and allowed the reaction to proceed at constant
pressure. Assuming there are not multiple minima in G, the reaction will wind up at the
same place and have the same volume!

Note 2: the expression we’ve just derived holds for a single chemical reaction. If there
are several reactions going on but the net reaction can be reduced to a single reaction,
the above holds. For example, if the reaction is catalyzed by another molecule via an
intermediate step, the reaction rate might differ with and without the catalyst, but the
equilibrium will be the same.

Note 3: the νi are fixed. It is the chemical potentials which adjust to satisfy the
equilibrium condition. Other constraints may need to be satisfied as well. For example,
in the water reaction above, the equilibrium condition provides one equation for the three
unknown chemical potentials. Two other conditions might be the total amount of hydrogen
and the total amount of oxygen.

Note 4: if there is more than one reaction, there may be several equations similar to Σi µi νi = 0 which must be satisfied at equilibrium. As an example, consider

N2 + O2 ↔ 2NO .
The equilibrium condition, Σi µi νi = 0, can be written

µN2 + µO2 = 2µNO .

In other words, we just substitute the appropriate chemical potentials for the chemicals in
the reaction equation. If we also have

2N ↔ N2 , 2O ↔ O2 , N + O ↔ NO ,

then we also have the additional relations among the chemical potentials (at equilibrium),

2µN = µN2 , 2µO = µO2 , µN + µO = µNO .

Note that there are five kinds of molecules. There must be a total amount of nitrogen
and a total amount of oxygen (two conditions), and there are four conditions of chemical
equilibrium. There are six conditions for five chemical potentials. However, the four

Physics 301 8-Nov-2004 20-6

equilibrium conditions are not all independent. For example, the last one can be derived
from the previous three.

The Law of Mass Action

We’ve seen that forPchemical equilibrium, the chemical potentials adjust to satisfy the
equilibrium condition, i µi νi = 0. Among other things, the chemical potentials depend
on the concentrations of the molecules. To bring this out, we’ll consider the case that all
molecules participating in a reaction can be treated as an ideal classical gas. (This, of
course, works for low density gases, but also for low concentration solutes.) Then
µi = τ log(ni /(ni,Q Zi,int )) = τ log ni − τ log(ni,Q Zi,int ) = τ log ni − τ log ci ,

where
ci = ni,Q Zi,int ,
and ci depends on the characteristics of molecule i through its mass in the quantum
concentration and its internal states in the partition function, but otherwise ci depends
only on the temperature. Note that in this expression, we’re assuming that any binding
energies are included in the internal partition function.

The equilibrium condition can be written


Σi νi log ni = Σi νi log ci ,
Σi log ni^νi = Σi log ci^νi ,
log Πi ni^νi = log Πi ci^νi ,
Πi ni^νi = Πi ci^νi ,
Πi ni^νi = K(τ ) .

The last line is known as the law of mass action. The quantity K(τ ) is known as the
equilibrium constant and is not a constant but depends on temperature. In terms of
molecular properties, it’s given by
K(τ ) = Πi ci^νi = Πi (ni,Q Zi,int )^νi .

Note that at a given temperature, measurement of all the concentrations allows one to
determine the equilibrium constant at that temperature. For complicated situations it is

Physics 301 8-Nov-2004 20-7

easier to determine the constant experimentally than to calculate it from the molecular
properties!

Application: pH

Water can undergo the reaction

H2 O ↔ H+ + OH− .

In water at room temperature a very small fraction of the water molecules are dissociated
into hydrogen and hydroxyl ions. The equilibrium concentrations satisfy

[H+ ][OH− ] = 10^−14 mol2 l−2 .

The notation [whatever] denotes the concentration of whatever. This is almost in the form
of the law of mass action. We need to divide by the concentration of H2 O to place it in
the proper form. However, the concentration of water in water is about 56 mol/l and it
doesn’t change very much, so we can treat it as a constant, and then the law of mass action
takes the form of the above equation. Note that in pure water, the concentrations must
be equal, so
[H+ ] = [OH− ] = 10^−7 mol l−1 .
The pH of a solution is defined as

pH = − log10 [H+ ] .

The pH of pure water is 7. If an acid, such as HCl, is dissolved in water, the increased availability of H+ shifts the equilibrium to increase [H+ ] and decrease [OH− ], but the product stays constant. When [H+ ] goes up, the pH goes down. Similarly, adding a base, such as NaOH, increases the concentration of [OH− ] and increases the pH.
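
As a small numerical illustration of how the ion product ties the two concentrations together (a sketch; the 10^−3 mol/l acid concentration is an arbitrary example):

    import math

    # Ion product of water at room temperature, [H+][OH-] in (mol/l)^2.
    Kw = 1.0e-14

    # Pure water: [H+] = [OH-].
    H = math.sqrt(Kw)
    print(-math.log10(H))           # pH = 7

    # Add enough strong acid to bring [H+] to 1e-3 mol/l; [OH-] adjusts
    # so that the product stays at Kw.
    H = 1.0e-3
    print(-math.log10(H), Kw / H)   # pH = 3, [OH-] = 1e-11 mol/l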

Physics 301 8-Nov-2004 20-8

Other Ways of Expressing the Law of Mass Action

We have written the law of mass action in terms of the particle concentrations, ni =
Ni /V . The partial pressure of component i is pi = Ni τ /V , or ni = pi /τ . If we substitute
these forms in the law of mass action and rearrange slightly, we have
Πi pi^νi = (Πi τ^νi ) K(τ ) = τ^(Σi νi) K(τ ) = Kp (τ ) ,

where the equilibrium constant is now called Kp (τ ), depends only on temperature, and is
the product of K(τ ) and the appropriate power of the temperature.

We can also write the law of mass action in terms of the fractional particle concentra-
tions, xi = Ni /N = pi /p, introduced earlier. We simply divide each partial pressure above
by p (or each concentration by the overall concentration n = N/V ) and we have
Πi xi^νi = (τ /p)^(Σi νi) K(τ ) = Kx (τ, p) ,

where the equilibrium constant is the product of K(τ ) and the appropriate power of τ /p.
In this case, the equilibrium constant is a function of pressure as well as temperature.

Physics 301 10-Nov-2004 21-1

The Direction of a Reaction

Suppose we have a reaction such as

A+B↔C,

which has come to equilibrium at some temperature τ . Now we raise the temperature.
Does the equilibrium shift to the left (more A and B) or to the right (more C)?

The heat of reaction at constant pressure, Qp , is the heat that must be supplied to the
system if the reaction goes from left to right. If Qp > 0, heat is absorbed and the reaction
is called endothermic. If Qp < 0, heat is released and the reaction is called exothermic.

For a reaction at constant pressure, the heat is the change in the enthalpy of the
system, Qp = ∆H. We have
H = G + τσ ,
and
σ = −(∂G/∂τ )p,Ni ,
so
H = G − τ (∂G/∂τ )p,Ni = −τ^2 (∂(G/τ )/∂τ )p,Ni .

What we actually want to do is to change the temperature slightly. Then the system is
no longer in equilibrium and the reaction (in the forward or reverse direction) will have
to occur in order to restore equilibrium. When the reaction occurs from left to right, the
change in particle number is ∆Ni = −νi and the change in G is
∆G = −Σi µi νi .

If this is 0, we have the equilibrium condition (but we’ve taken it out of equilibrium by
changing the temperature). The change in H is
Qp = ∆H = −τ^2 (∂(∆G/τ )/∂τ )p,Ni = +τ^2 (∂(Σi µi νi /τ )/∂τ )p,Ni .

The chemical potential is
µi = τ log(xi p/(τ ni,Q Zi,int )) .

Physics 301 10-Nov-2004 21-2

We substitute into our expression for Qp and obtain,

Qp = τ^2 (∂/∂τ ) Σi (νi log(xi p) − νi log(τ ni,Q Zi,int )) ,
   = −τ^2 (∂/∂τ ) Σi νi log(τ ni,Q Zi,int ) ,
   = −τ^2 (∂/∂τ ) Σi log(τ ni,Q Zi,int )^νi ,
   = −τ^2 (∂/∂τ ) log Πi (τ ni,Q Zi,int )^νi ,
   = −τ^2 (∂/∂τ ) log Kp (τ ) .
We’ve related the heat of reaction to the equilibrium constant! This is called van’t Hoff’s
equation. A note on signs: I’ve assumed that the νi are positive on the left hand side of
the reaction and negative for the right hand side of the reaction. Mandl (who provides the
basis for this section) assumes the opposite, so we wind up with our equilibrium constants
being inverses of each other and opposite signs in the van’t Hoff equation.

In any case, our law of mass action has the concentrations of the left hand side
reactants in the numerator and the right hand side reactants in the denominator. So an
increase in the equilibrium constant means the reaction goes to the left and a decrease
means the reaction goes to the right. We see that if Qp is positive (we have to add heat
to go from left to right, an endothermic reaction), then our equilibrium constant decreases
with temperature. This means increasing the temperature moves the reaction to the right.
Rule of thumb: increasing the temperature causes the reaction to go towards whatever
direction it can absorb energy. We’ve just shown that increasing the temperature drives
an endothermic reaction to the right. It will drive an exothermic reaction to the left.
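
A tiny numerical illustration of this sign convention (a sketch; the constant heat of reaction Q0 and the toy form of Kp are made up for the purpose):

    # Check Qp = -tau^2 d(log Kp)/dtau for a model Kp(tau) = exp(Q0/tau).
    Q0 = 2.0                              # heat of reaction, in units of tau

    def log_Kp(tau):
        return Q0 / tau                   # left hand side reactants in the numerator

    tau, dtau = 1.0, 1.0e-6
    dlogK = (log_Kp(tau + dtau) - log_Kp(tau - dtau)) / (2.0 * dtau)
    print(-tau**2 * dlogK)                # recovers Q0; and Kp falls as tau rises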

Physics 301 10-Nov-2004 21-3

Application: the Saha Equation

This section is related to K&K, chapter 9, problem 2. Consider the ionization of atomic hydrogen,
p+ + e− ↔ H .
Ionizing hydrogen from its ground state requires an energy of 13.6 eV, and as the above
reaction is written, it’s exothermic from left to right. If we are considering low density
gases, we can treat them as classical ideal gases and apply our law of mass action:

[p+ ][e− ]/[H] = (np,Q Zp,int )(ne,Q Ze,int )/(nH,Q ZH,int exp(I/τ )) ,

where the partition function for the hydrogen atom is to be computed with the ground state
at the zero of energy, as we’ve taken explicit account of the binding energy I = 13.6 eV.
This (or more properly, some of the forms we will derive below) is called the Saha equation.

Some of the factors in the equilibrium constant are easy to calculate and others are
hard to calculate! Let’s do the easy ones. First of all, the mass of a proton and the mass of
a hydrogen atom are almost the same, so the quantum concentrations of the proton and the
hydrogen are almost the same and we can cancel them out. The quantum concentration
of the electron is
ne,Q = (me τ /(2π h̄^2))^(3/2) .
The internal partition functions for the electron and proton are both just 2, since each has
spin 1/2. This leaves us with the internal partition function of the hydrogen atom. This is
complicated. First of all, the electron and proton each have two spin states, so whatever
else is going on there is a factor of four due to the spins.

Aside: in fact the spins can combine with the orbital angular momentum to give a
total angular momentum. In the ground state, the orbital angular momentum is zero
and the spins can be parallel to give a total angular momentum of 1h̄ with 3 states or
anti-parallel to give a total angular momentum of 0 with 1 state. The parallel states are
slightly higher in energy than the anti-parallel state. Transitions between these states are
called hyperfine transitions and result in the 21 cm line which is radiated and absorbed
by neutral hydrogen throughout our galaxy and others. In any case, the energy difference
between these states is small enough to be ignored in computing the internal partition
function for the purposes of the Saha equation.

When all is said and done, we have


[p+ ][e− ]/[H] = 4 (me τ /(2π h̄^2))^(3/2) (1/ZH,int ) e^(−I/τ ) ,

Physics 301 10-Nov-2004 21-4

where the factor of four accounts for the two spin states of the proton and the two spin
states of the electron (there is a factor of four in the hydrogen partition function as well).
If the temperature is small compared to the binding energy of hydrogen (which means it’s
small compared to the difference between the first excited state and the ground state),
then we might as well approximate the partition function as 4. This gives,
[p+ ][e− ]/[H] ≈ (me τ /(2π h̄^2))^(3/2) e^(−I/τ ) .

If we have only hydrogen and ionized hydrogen, [p+ ] = [e− ] and


[e− ] ≈ ([H])^(1/2) (me τ /(2π h̄^2))^(3/4) e^(−I/2τ ) .

Some points to note: the fact that the exponential has −I/2τ indicates that this is a
mass action effect, not a Boltzmann factor effect. If there is another source of electrons (for
example, heavier elements whose outer electrons are loosely bound), the reaction would
shift to favor more hydrogen and fewer protons. The Saha equation applies to gases in
space or stars as well as donor atoms in semi-conductors (modified for the appropriate
physical characteristics of the atom and the medium).
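
To see how sharply the ionization turns on with temperature, here is a rough evaluation of the last (approximate) form in cgs units; the neutral hydrogen concentration is an assumed, illustrative value, and the estimate stops making sense once the fraction approaches one since it ignores depletion of the neutrals.

    import math

    k = 1.381e-16            # Boltzmann constant, erg/K
    hbar = 1.055e-27         # erg s
    m_e = 9.109e-28          # electron mass, g
    I = 13.6 * 1.602e-12     # hydrogen ionization energy, erg

    def ionized_fraction(n_H, T):
        # [e-]/[H] ~ sqrt(n_Q/[H]) exp(-I/2 tau), from the approximate form above
        tau = k * T
        n_Q = (m_e * tau / (2.0 * math.pi * hbar**2)) ** 1.5
        return math.sqrt(n_Q / n_H) * math.exp(-I / (2.0 * tau))

    n_H = 1.0e17             # assumed neutral hydrogen concentration, cm^-3
    for T in (5000.0, 10000.0, 12000.0):
        print(T, ionized_fraction(n_H, T))   # roughly 1e-5, 0.06, 0.25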

In fact, we can do a little more with the Saha equation. Let’s consider an atom which
has several electrons, and ask about the ionization equilibrium between the ions that have
been ionized i times and those that have been ionized i + 1 times,

ni+1 [e− ]/ni = (ni+1,Q Zi+1,int )(ne,Q Ze,int )/(ni,Q Zi,int exp(Ii+1,i /τ )) ,

where ni+1 and ni are the concentrations of the two ions, ni+1,Q and ni,Q are the quantum
concentrations of the two ions which are essentially the same, so we cancel them out,
Zi+1,int and Zi,int are the internal partition functions of the two ions, and Ii+1,i is the
difference in binding energy between the two ions. That is, Ii+1,i is the energy required to
remove an electron from ion i and produce ion i + 1. Now, each ion will have some internal
structure and energy levels. We let ǫi+1,j be the energy (relative to 0 for the ion ground
state) of the j th state of ion i + 1. This state has multiplicity gi+1,j . (If there is more
than one state at a given energy we say that energy is degenerate and the multiplicity
is the number of such states. Sometimes the multiplicity is called the degeneracy or the
statistical weight.) Similarly, ǫi,k and gi,k are the energy and multiplicity of the k th state
of ion i. The fraction of ions i + 1 which are in state j is given by a Boltzmann factor,

ni+1,j /ni+1 = gi+1,j e^(−ǫi+1,j /τ ) /Zi+1,int .

Physics 301 10-Nov-2004 21-5

If we substitute this expression into the Saha equation, and also substitute the quantum
concentration of the electrons and the internal partition function of the electrons (2), we
get
ni+1,j [e− ]/ni,k = (2 gi+1,j /gi,k ) (me τ /(2π h̄^2))^(3/2) e^(−(Ii+1,i + ǫi+1,j − ǫi,k )/τ ) .
This form of the Saha equation connects the concentration of ions in various energy levels
to the electron concentration and the temperature. Note that we managed to get rid of the
internal partition functions. Of course, now we have a relation connecting concentrations
of states of a given energy level rather than concentrations of ions of a given ionization.

We can apply the above expression to hydrogen (again!). There are only two ionization
states. We let i = 0 and k = 0, so ni,k is the concentration of hydrogen atoms in the ground
state (which has multiplicity g0,0 = 4 and energy ǫ0,0 = 0). The ionized state is just a proton which has a multiplicity of 2, and no excited states. So
[p+ ][e− ]/n0,0 = (me τ /(2π h̄^2))^(3/2) e^(−I/τ ) ,

which is essentially the same equation we had before except that now it includes only
hydrogen atoms in the ground state and it is “exact.”

Phase Transitions

Phase transitions occur throughout physics. We are all familiar with melting ice and
boiling water. But other kinds of phase transitions occur as well. Some solids, when heated
through certain temperatures, change their crystal structure. For example, sulfur can exist
in monoclinic or rhombic forms.

When iron is cooled below the Curie point, it spontaneously magnetizes. The Curie
point of iron is Tc = 1043 K. A typical chunk of iron has no net magnetization because it
magnetizes in small domains with the direction of the magnetic field oriented at random.
The magnetization, even in the small domains, disappears above the Curie temperature.

The transition between the normal and superfluid states of 4 He is a phase transition
as are the transitions between normal and superconducting states in superconductors.

You’ve probably heard about the “symmetry breaking phase transitions” that might
have occurred in the very early universe, as the universe cooled from its extremely hot
“initial” state. Such transitions “broke” the symmetry of the fundamental forces causing
there to be different couplings for the strong, weak, electromagnetic, and gravitational
force. The latent heat released in such a transition might have driven the universe into a
state of very rapid expansion (inflation).

Physics 301 10-Nov-2004 21-6

The spontaneous magnetization of iron as it’s cooled below the Curie temperature is
an example of a symmetry breaking transition. Above the Curie point, the atomic magnets
(spins) are oriented at random (by thermal fluctuations). So any direction is the same as
any other direction and there is rotational symmetry. Below the Curie point (and within
a single domain) all the atomic magnets are lined up, so a single direction is picked out
and the rotational symmetry is broken.

This is not an exhaustive list of phase transitions! Even so, we will not have time to
discuss all these kinds of phase transitions. We will start with something “simple” like the
liquid to gas transition.

Phase Diagrams

Suppose we do some very simple experiments.


We place pure water inside a container which keeps
the amount of water constant and doesn’t allow any
other kinds of molecules to enter. The container is
in contact with adjustable temperature and pressure
reservoirs. We dial in a temperature and a pressure,
wait for equilibrium to be established, and then see
what we have. For most pressures and temperatures
we will find that the water is all solid (ice), all liquid,
or all vapor (steam). For some temperatures and
pressures we will find mixtures of solid and vapor, or
solid and liquid, or liquid and vapor. The figure shows a schematic plot of a phase diagram
for water. I didn’t put any numbers on the axes—which is why it’s schematic. (Also, there
are several kinds of ice which we’re ignoring!) K&K give a diagram, but it doesn’t have
any resolution at the triple point.

Note that the first figure (which we'll talk about some more in a minute) is something like a map: it
says here we have vapor, there we have solid, etc.
The second figure is a schematic of a pV diagram
showing an isotherm. For an ideal gas, we would
have a hyperbola. For the isotherm as shown, we
have pure liquid on the branch to the left of point
a, pure vapor to the right of point b and along the
segment from a to b we have a mixture of liquid and
vapor. If we move along this isotherm from left to
right, we are essentially moving down a vertical line
in the pτ diagram. To the left of point a we are moving to lower pressures, with liquid
water. From a to b we are stuck at the line in the pτ diagram that divides the liquid from
the vapor region, and to the right of b we are moving down in the vapor region. So the

Physics 301 10-Nov-2004 21-7

entire transition from all liquid to all vapor which is a to b in the pV diagram happens
in a single point in the pτ diagram. At this point, the water has a fixed temperature
and pressure, and what adjusts to match the volume is the relative amounts of liquid and
vapor.

Now, at each location in the pτ diagram, we fix the temperature and pressure and let
the system come to equilibrium. The equilibrium condition is that the Gibbs free energy
is minimized. Ignoring for the moment the fact that the water can be a solid, the Gibbs
free energy is
G(p, τ, Nl , Nv ) = Nl µl (p, τ ) + Nv µv (p, τ ) ,
where the subscripts l and v refer to the liquid and vapor and we’ve made use of the
fact that for a single component substance the chemical potential can be written as a
function of p and τ only. There are several ways we might minimize G. First of all,
if µl (p, τ ) < µv (p, τ ), then we minimize G by setting Nl = N and Nv = 0 where N is
the total number of water molecules. In other words, the system is entirely liquid. If
µv (p, τ ) < µl (p, τ ), we minimize the free energy by making the system entirely vapor.
Finally, if µl (p, τ ) = µv (p, τ ), we can’t change the free energy by changing the amount of
vapor and liquid, so we can have a mixture with the exact amounts of liquid and vapor
determined by other constraints (such as the volume to be occupied).

So, what we’ve just shown is that where liquid and vapor coexist in equilibrium, we
must have
µl (p, τ ) = µv (p, τ ) ,
which is exactly the same condition we would have come up with had we considered the
“reaction”
H2 Oliquid ↔ H2 Ovapor .
This is a relation between p and τ and it describes a curve on the pτ diagram. It’s called
the vapor pressure curve.

With similar arguments, we deduce that solid and vapor coexist along the curve defined
by
µs (p, τ ) = µv (p, τ ) ,
which is called the sublimation curve, and solid and liquid coexist along the curve

µs (p, τ ) = µl (p, τ ) ,

which is the melting curve.

If we have all three chemical potentials equal simultaneously,

µs (p, τ ) = µl (p, τ ) = µv (p, τ ) ,

Physics 301 10-Nov-2004 21-8

we have two conditions on p and τ and this defines a point. This unique (for each substance)
point where solid, liquid, and vapor all coexist is called the triple point. For water,

Tt = 273.16 K , pt = 4.58 mm Hg .

Actually, this is now used to define the Kelvin scale.

If a substance has more than three phases, it can have more than one triple point.
For example, the two crystalline phases of sulfur give it four phases, and it has three triple
points.

The vapor pressure curve eventually ends at a point called the critical point. At this
point, one can’t tell the difference between the liquid phase and the vapor phase. We’ll
say more about this later, but for now, consider that as you go up in temperature, you
get sufficiently violent motions that binding to neighboring molecules (a liquid) becomes
a negligible contribution to the energy. As one goes up in temperature, the heat of va-
porization decreases. At the critical point it is zero. The critical point for water occurs
at
Tc = 647.30 K , pc = 219.1 atm .

Another way to think of the phase diagram and the coexistence curves is to imagine
a 3D plot. Pressure and temperature are measured in a horizontal plane, while µ(p, τ ) is
plotted as height above the plane. This defines a surface. In fact we have several surfaces,
one for µs , µl , and µv . We take the overall surface to be the lowest of all the surfaces—
remember we’re trying to minimize G. Where µv is the lowest, we have pure vapor, etc.
Where two surfaces intersect, we have a coexistence curve.

Of course, the phase diagram corresponds to equilibrium. It is possible to have liquid in the vapor region (superheated), or solid region (supercooled), etc., but these situations
are unstable and the system will try to get to equilibrium. Whether this happens rapidly
or slowly depends on the details of the particular situation.

Physics 301 12-Nov-2004 22-1

First Order and Second Order Phase Transitions

In the phase diagram we’ve been discussing, as we cross a coexistence curve, G is


continuous, but its slope changes discontinuously. This is true whether we cross the curve
by changing temperature or by changing pressure. This means that the entropy and volume
have step discontinuities. Recall,

dG = −σ dτ + V dp + µ dN ,

so
σ = −(∂G/∂τ )p,N ,   V = +(∂G/∂p)τ,N ,   µ = +(∂G/∂N )p,τ .

The situation is sketched in the left pair of plots in the figure which shows the change
in entropy resulting from the phase transition. Such a transition is called a first order transition—the first derivatives of G have discontinuities.

Second order transitions have discontinuities in the second derivatives. So things like the entropy and volume are continuous, but their slopes change suddenly. This is illustrated in the righthand pair of plots.

Since there is a discontinuous change in the entropy in a first order transition, heat
must be added, and
∆σ = L/τ ,
where L is the heat required for the system to go from completely liquid to completely
vapor at temperature τ . This is called the heat of vaporization (or sometimes the latent

Physics 301 12-Nov-2004 22-2

heat of vaporization). Similarly, there are heats of melting (fusion) and sublimation. In a
first order transition, the heat capacities dQ/dτ are δ-functions!

The Clausius-Clapeyron Equation

Now we are going to return to a first order transition, like the liquid–vapor transition
in water and see if we can say something about the functional form of the coexistence
curve. The vapor pressure curve is given by

µl (p, τ ) = µv (p, τ ) .

Let’s move a short distance along the curve in which τ changes by dτ and p changes by
dp. As we move along the curve, the chemical potentials change as well. If we remain on
the curve, the change in both chemical potentials must be the same. We have

dµl (p, τ ) = dµv (p, τ ) ,


       
(∂µl /∂p)τ dp + (∂µl /∂τ )p dτ = (∂µv /∂p)τ dp + (∂µv /∂τ )p dτ ,
(∂µv /∂p)τ dp − (∂µl /∂p)τ dp = −(∂µv /∂τ )p dτ + (∂µl /∂τ )p dτ ,
dp/dτ = [−(∂µv /∂τ )p + (∂µl /∂τ )p ] / [(∂µv /∂p)τ − (∂µl /∂p)τ ] .

Now, what are all these partial derivatives?


   
−(∂µv /∂τ )p = −(∂(Gv /Nv )/∂τ )p,Nv ,
            = −(1/Nv )(∂Gv /∂τ )p,Nv ,
            = +(1/Nv ) σv = σv /Nv = sv ,
where sv is the entropy per particle in the vapor phase. The other partial derivative in
the numerator gives the entropy per particle in the liquid phase, sl , while the partials in
the denominator give the volumes per particle in the vapor and liquid phases, vv and vl ,
respectively. Altogether, we have
dp/dτ = (sv − sl )/(vv − vl ) .
This is called the Clausius-Clapeyron equation.

Physics 301 12-Nov-2004 22-3

Some comments on this equation are in order. First of all, dp/dτ is the slope of
the vapor pressure curve (it has nothing directly to do with the equation of state of the
substance). Secondly, we have the entropy per particle and the volume per particle. Since
we have a ratio, the equation remains true if we use the entropy per mole and volume per
mole, or the entropy per gram and volume per gram, etc. In words, the equation says
the slope of the vapor pressure curve is the ratio of the change in specific entropy to the
change in specific volume between the vapor and liquid phases. A similar equation applies
to each coexistence curve. We just need to put in the right quantities. For example, the
melting curve would have
dp/dτ = (sl − ss )/(vl − vs ) .
A final comment is that the specific entropies and volumes are to be evaluated at the
temperature and pressure at the point on the coexistence curve for which the slope is
desired.

The Clausius-Clapeyron equation is often written in other forms. In particular, the change in entropy can be immediately related to the latent heat:
dp/dτ = ℓ/(τ ∆v) ,
where ℓ is the specific latent heat and ∆v is the change in specific volume. We can apply
this to the melting of ice and the change in melting temperature with pressure. We start
with what happens at 1 atm and 0◦ C = 273.15 K. The specific latent heat of fusion is
3.35 × 109 ergs g−1 . The specific volumes of ice and liquid water are

vs = 1.09070 cm3 g−1 , vl = 1.00013 cm3 g−1 .

Remember, water expands as it freezes! The result is

dp/dT = −1.35 × 10^8 dyne cm−2 K−1 = −134 atm K−1 .
The slope is negative! This accounts for the fact that the melting curve of water leaves the
triple point headed up and slightly to the left. Most materials (which expand on melting!)
have a melting curve which leaves the triple point headed up and slightly to the right. That
is, a large positive slope instead of a large negative slope. This unusual property of water is
often said to be the reason why we can have figure skating and ice hockey and why glaciers
can flow. As a glacier meets up with an obstruction, the pressure at the point of contact
with the obstruction increases until the ice melts and the liquid water can flow around the
obstruction and refreeze on the other side. Similarly, ice skates can melt ice and the liquid
water helps lubricate the skate. This sounds good, but the numbers don’t work out. For
example, one needs about 10 meters of ice to generate a pressure of one atmosphere; to get
a 10 degree change in melting temperature, we would need a 13 km thick glacier. A 50 kg
skater on a pair of 20 cm by 0.1 mm (very sharp) skates would produce about a 1 degree

Physics 301 12-Nov-2004 22-4

change in melting temperature. Although this effect may play a role, it is probable that
surface effects are more important. For example, a water molecule on the surface forms
bonds with fewer neighbors than a molecule in the interior of the solid. Also, it may be
attracted to the material in contact with the surface making it easier to “melt.”
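
For the record, here is the skater estimate worked out numerically (a sketch; it assumes the skater's weight is spread over both blades):

    # Pressure under the blades of the 50 kg skater and the resulting shift
    # of the melting point, using the -134 atm/K slope found above.
    g = 980.0                           # cm/s^2
    mass = 50.0e3                       # g
    area = 2.0 * 20.0 * 0.01            # cm^2: two 20 cm x 0.1 mm blades
    p_atm = mass * g / area / 1.013e6   # pressure in atmospheres
    print(p_atm, p_atm / (-134.0))      # about 120 atm, shift of about -0.9 K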

Now let’s look at what happens at the normal boiling point of water and the liquid–
vapor transition. This occurs at 1 atm and 100◦ C = 373.15 K. The latent heat of va-
porization is 2.257 × 10^10 ergs g−1 and the specific volumes of the liquid and the vapor are
vl = 1.043 cm3 g−1 , vv = 1673 cm3 g−1 .
This gives
dp/dT = 3.62 × 10^4 dyne cm−2 K−1 = 0.036 atm K−1 .
On Mauna Kea in Hawaii at an altitude of about 14,000 ft, the pressure is about 60% of
sea level pressure. That is, the pressure has decreased by 0.4 atm. Using the slope we just
calculated, we find that the boiling point of water decreases by 11◦ C to 89◦ C.
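
These numbers are easy to check (a short sketch using the values quoted above):

    # Slope of the vapor pressure curve at the normal boiling point of water
    # and the resulting boiling point depression on Mauna Kea (cgs units).
    ell = 2.257e10                  # latent heat of vaporization, erg/g
    T = 373.15                      # K
    v_l, v_v = 1.043, 1673.0        # specific volumes, cm^3/g
    dpdT = ell / (T * (v_v - v_l))  # dyne cm^-2 K^-1
    dpdT_atm = dpdT / 1.013e6       # atm/K
    print(dpdT_atm)                 # about 0.036
    print(-0.4 / dpdT_atm)          # a 0.4 atm drop lowers the boiling point ~11 K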

Up to this point, we haven’t made any approximations in dealing with the Clausius-
Clapeyron equation. When we deal with the vapor pressure curve, we can usually neglect
the volume of the liquid compared to the gas (as we’ve just seen). Also, we can use the
ideal gas law to get volume in terms of the pressure and temperature (of course, if the
substance is making the transition between liquid and gas, the ideal gas law may not apply
all that well!). Recall, we need the specific volume, so if we have the latent heat per unit
mass, then we can write v = RT /pM , where R is the gas constant (per mole) and M is
the molecular weight (mass per mole). Then we have
dp/dT = ℓ/(T (RT /pM )) = p M ℓ/(RT^2) .
This is yet another form of the Clausius-Clapeyron equation. Note that if we evaluate the
slope at the normal boiling point with this expression, we get
dp/dT = 3.54 × 10^4 dyne cm−2 K−1 = 0.035 atm K−1 ,
within 3% of what we had with the exact expression.

We can rewrite the approximate form of the Clausius-Clapeyron equation as

dp/p = (Mℓ/R) (dT/T²) .
If we now assume that ℓ does not depend on temperature or pressure, we can integrate
this expression:

log p = −Mℓ/RT + constant ,
or
p = p0 e^(−Mℓ/RT) .
What this says is that a semi-log plot of the vapor pressure against 1/T should be a straight
line. K&K figure 10.3 shows that this is not such a bad approximation. Of course, we know
that the latent heat isn't constant and that it goes to zero at the critical point. Part of the
reason the plot in K&K doesn't look all that bad is that the scale is very coarse and covers
8 orders of magnitude in pressure. Even though the latent heat isn't constant, treating it as
constant is a good approximation over a small range of the curve, and over such a range the
pressure is well approximated by an exponential in 1/τ. (One can think of the integration
constant, p0, as changing from one small range to the next.)
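
As an illustration (a sketch, not from the notes), pin the integration constant so the curve
passes through 1 atm at 373.15 K and evaluate nearby temperatures; with ℓ treated as constant
this gives roughly 0.48, 0.70, 1.00 and 1.41 atm at 80, 90, 100 and 110 °C.

# Vapor pressure from p = p0 exp(-M*ell/(R*T)), pinned to 1 atm at the normal boiling point.
import math

M, R, ell = 18.0, 8.314e7, 2.257e10   # g/mol, erg mol^-1 K^-1, erg/g
T_ref = 373.15                        # K, where p = 1 atm

def p_vap(T):
    """Vapor pressure in atm, assuming a temperature-independent latent heat."""
    return math.exp(-(M * ell / R) * (1.0 / T - 1.0 / T_ref))

for T in (353.15, 363.15, 373.15, 383.15):
    print(T, round(p_vap(T), 2))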

The van der Waals Equation of State

If we want to have an atomic model for a liquid–vapor transition, we will need to model
the gas as something more than non-interacting point particles. A reasonably successful
approach models the gas molecules as having an attractive force for separations larger than
some distance (roughly the equilibrium separation in the solid). The attractive force gets
weaker and goes to zero as the separation is increased. If the molecules get too close, a
strong repulsive force arises. We can approximate this force by thinking of the molecules as
“hard spheres” which can get as close as twice their radii, but no closer. K&K, figure 10.7
shows a schematic of the potential energy curve of the interaction between two molecules.
(Remember the force is the negative of the slope of this curve.) Fortunately, we don’t need
to know the details of this curve. It’s only described to this extent in order to motivate
the van der Waals equation of state.

The van der Waals equation of state is

(p + aN²/V²)(V − bN) = N τ ,

where a and b are constants that depend on the gas molecules. b is related to the hard
sphere repulsion and a is related to the longer range attraction. This equation of state
can be obtained by starting with the Helmholtz free energy of an ideal gas and making
corrections to account for these effects.

The ideal gas free energy is

F = −N τ (log(nQ /n) + 1) .

If each molecule occupies a volume b then the effective volume available is V − bN , so the
concentration should be replaced by N/(V −bN ). Of course, this is not entirely legitimate,
but if each molecule has a volume b, then you would expect the pressure to diverge if the
density reaches 1 molecule per volume b. This is exactly what this correction provides.

Since there is an attractive force between the molecules, there is a net negative con-
tribution to the energy produced by every pair of molecules. We will evaluate this in an
approximate way. Suppose φ(r) is the potential energy between two molecules separated by
r. The potential energy of one molecule due to its interactions with all the other molecules
is

u = ∫_{rmin}^{∞} n(r) φ(r) dV ,

where rmin corresponds to the minimum distance set by b and n(r) is the concentration at
distance r from the given molecule. The simplest thing we can do is to assume n(r) = n =
const. This is called the mean field approximation. We assume that each molecule moves
in the average field of all the other molecules and does not affect the density of the other
molecules. Of course, since there is an attractive force, the concentration of molecules
around any given molecule will be higher than it is at a randomly chosen point. That
is, the molecular positions are correlated. The mean field approximation ignores these
correlations. So, we have

u = n ∫_{rmin}^{∞} φ(r) dV = −2na ,

which is really just the definition of a. The factor of two is included for computational
convenience. Due to its interactions with all the other molecules, a given molecule has, on
the average, a change to its energy of −2na. There are N molecules, so the total change
in energy due to the attractive part of the van der Waals interaction is

∆U = −2a N²/V .
However, this double counts the interaction energy since each molecule is counted twice:
once while contributing to the mean field and once while being acted upon by the mean
field. So we need to divide by a factor of two (this is why we included the 2 to start
with!). So

∆U = ∆F = −a N²/V .
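
As a concrete (purely illustrative) example of how a is fixed by the intermolecular potential,
suppose the attractive tail has the Lennard-Jones-like form φ(r) = −ε(σ/r)⁶ for r > rmin; the
parameters ε, σ and rmin are made up for the example, not taken from the notes. A short
sympy sketch then evaluates a = −(1/2) ∫ φ(r) dV:

# Evaluate a = -(1/2) * integral of phi(r) over the volume outside r_min,
# for a hypothetical attractive tail phi(r) = -eps*(sigma/r)**6.
import sympy as sp

r, rmin, eps, sigma = sp.symbols('r r_min epsilon sigma', positive=True)
phi = -eps * (sigma / r)**6

a = -sp.Rational(1, 2) * sp.integrate(phi * 4 * sp.pi * r**2, (r, rmin, sp.oo))
print(a)    # -> 2*pi*epsilon*sigma**6/(3*r_min**3)
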
Our final approximate expression for the free energy of a van der Waals gas is

F = −N τ [log(nQ (V − bN)/N) + 1] − a N²/V .
We differentiate with respect to the volume to get the pressure,

p = −(∂F/∂V)_{τ,N} = N τ/(V − bN) − a N²/V² ,

which can be rearranged to

(p + aN²/V²)(V − bN) = N τ .
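
A short symbolic check (not in the original notes) that differentiating this free energy
really does give the van der Waals pressure:

# Verify p = -(dF/dV) at fixed tau and N for the van der Waals free energy above.
import sympy as sp

N, V, tau, a, b, nQ = sp.symbols('N V tau a b n_Q', positive=True)
F = -N * tau * (sp.log(nQ * (V - b * N) / N) + 1) - a * N**2 / V

p = -sp.diff(F, V)
print(sp.simplify(p - (N * tau / (V - b * N) - a * N**2 / V**2)))   # -> 0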

We can put the van der Waals equation of state into dimensionless form if we define

pc = a/(27b²) ,    Vc = 3bN ,    τc = 8a/(27b) .

Then

(p/pc + 3/(V/Vc)²)(V/Vc − 1/3) = (8/3)(τ/τc) .
This equation is plotted for several values of τ in the figure. For large τ, it approaches
the ideal gas equation of state, but for small τ there are large deviations from the ideal
gas equation of state. We will explore these deviations and see what they have to do with
phase transitions next time!
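
Since the figure isn't reproduced here, the following sketch (not part of the notes) does two
things: it checks symbolically that pc, Vc and τc as defined above are the point where ∂p/∂V
and ∂²p/∂V² both vanish, and it tabulates the reduced isotherms for a few values of τ/τc so
the low-temperature wiggle is visible.

# 1) Locate the critical point of p = N*tau/(V - b*N) - a*N**2/V**2 symbolically.
import sympy as sp

N, V, tau, a, b = sp.symbols('N V tau a b', positive=True)
p = N * tau / (V - b * N) - a * N**2 / V**2

tau_flat = sp.solve(sp.diff(p, V), tau)[0]                  # tau that makes dp/dV = 0
Vc = sp.solve(sp.diff(p, V, 2).subs(tau, tau_flat), V)[0]   # also require d2p/dV2 = 0
tauc = sp.simplify(tau_flat.subs(V, Vc))
pc = sp.simplify(p.subs({V: Vc, tau: tauc}))
print(Vc, tauc, pc)          # -> 3*N*b, 8*a/(27*b), a/(27*b**2)

# 2) Reduced isotherms: (p^ + 3/v^2)(v - 1/3) = 8 t / 3 with p^ = p/pc, v = V/Vc, t = tau/tauc.
def p_hat(v, t):
    return 8.0 * t / (3.0 * (v - 1.0 / 3.0)) - 3.0 / v**2

for t in (0.85, 0.95, 1.0, 1.1):
    print(t, [round(p_hat(v, t), 2) for v in (0.5, 0.8, 1.0, 2.0, 4.0)])
# For t < 1 the isotherms are non-monotonic in v -- the seed of the liquid-vapor transition.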
