
Chapter Nine

RAY OPTICS AND OPTICAL INSTRUMENTS
9.1 INTRODUCTION
Nature has endowed the human eye (retina) with the sensitivity to detect
electromagnetic waves within a small range of the electromagnetic
spectrum. Electromagnetic radiation belonging to this region of the
spectrum (wavelength of about 400 nm to 750 nm) is called light. It is
mainly through light and the sense of vision that we know and interpret
the world around us.
There are two things that we can intuitively mention about light from
common experience. First, that it travels with enormous speed and second,
that it travels in a straight line. It took some time for people to realise that
the speed of light is finite and measurable. Its presently accepted value
in vacuum is $c = 2.99792458 \times 10^{8}$ m s$^{-1}$. For many purposes, it suffices
to take $c = 3 \times 10^{8}$ m s$^{-1}$. The speed of light in vacuum is the highest
speed attainable in nature.
The intuitive notion that light travels in a straight line seems to
contradict what we have learnt in Chapter 8, that light is an
electromagnetic wave of wavelength belonging to the visible part of the
spectrum. How do we reconcile the two facts? The answer is that the
wavelength of light is very small compared to the size of ordinary objects
that we encounter commonly (generally of the order of a few cm or larger).
In this situation, as you will learn in Chapter 10, a light wave can be
considered to travel from one point to another, along a straight line joining

Physics
them. The path is called a ray of light, and a bundle of such rays
constitutes a beam of light.
In this chapter, we consider the phenomena of reflection, refraction
and dispersion of light, using the ray picture of light. Using the basic
laws of reflection and refraction, we shall study the image formation by
plane and spherical reflecting and refracting surfaces. We then go on to
describe the construction and working of some important optical
instruments, including the human eye.

PARTICLE MODEL OF LIGHT

Newton's fundamental contributions to mathematics, mechanics, and gravitation often blind
us to his deep experimental and theoretical study of light. He made pioneering contributions
in the field of optics. He further developed the corpuscular model of light proposed by
Descartes. It presumes that light energy is concentrated in tiny particles called corpuscles.
He further assumed that corpuscles of light were massless elastic particles. With his
understanding of mechanics, he could come up with a simple model of reflection and
refraction. It is a common observation that a ball bouncing from a smooth plane surface
obeys the laws of reflection. When this is an elastic collision, the magnitude of the velocity
remains the same. As the surface is smooth, there is no force acting parallel to the surface,
so the component of momentum in this direction also remains the same. Only the component
perpendicular to the surface, i.e., the normal component of the momentum, gets reversed
in reflection. Newton argued that smooth surfaces like mirrors reflect the corpuscles in a
similar manner.
In order to explain the phenomena of refraction, Newton postulated that the speed of
the corpuscles was greater in water or glass than in air. However, later on it was discovered
that the speed of light is less in water or glass than in air.
In the field of optics, Newton the experimenter was greater than Newton the theorist.
He himself observed many phenomena that were difficult to understand in terms of
the particle nature of light, for example, the colours seen in a thin film of oil on water.
Partial reflection of light is another such example. Anyone who looks
into the water in a pond sees an image of their face in it, but also sees the bottom of the pond.
Newton argued that some of the corpuscles, which fall on the water, get reflected and some
get transmitted. But what property could distinguish these two kinds of corpuscles? Newton
had to postulate some kind of unpredictable, chance phenomenon, which decided whether
an individual corpuscle would be reflected or not. In explaining other phenomena, however,
the corpuscles were presumed to behave as if they are identical. Such a dilemma does not
occur in the wave picture of light. An incoming wave can be divided into two weaker waves
at the boundary between air and water.

9.2 REFLECTION OF LIGHT BY SPHERICAL MIRRORS

We are familiar with the laws of reflection. The angle of reflection (i.e., the
angle between reflected ray and the normal to the reflecting surface or
the mirror) equals the angle of incidence (angle between incident ray and
the normal). Also that the incident ray, reflected ray and the normal to
the reflecting surface at the point of incidence lie in the same plane
(Fig. 9.1). These laws are valid at each point on any reflecting surface
whether plane or curved. However, we shall restrict our discussion to the
special case of curved surfaces, that is, spherical surfaces. The normal in
this case is to be taken as normal to the tangent
to the surface at the point of incidence. That is, the
normal is along the radius, the line joining the
centre of curvature of the mirror to the point of
incidence.
We have already studied that the geometric
centre of a spherical mirror is called its pole while
that of a spherical lens is called its optical centre.
The line joining the pole and the centre of curvature
of the spherical mirror is known as the principal
axis. In the case of spherical lenses, the principal
axis is the line joining the optical centre with its
principal focus as you will see later.

FIGURE 9.1 The incident ray, reflected ray and the normal to the reflecting surface lie in the same plane.

9.2.1 Sign convention


To derive the relevant formulae for reflection by spherical mirrors and
refraction by spherical lenses, we must first adopt a sign convention for
measuring distances. In this book, we shall follow the Cartesian sign
convention. According to this
convention, all distances are measured
from the pole of the mirror or the optical
centre of the lens. The distances
measured in the same direction as the
incident light are taken as positive and
those measured in the direction
opposite to the direction of incident
light are taken as negative (Fig. 9.2).
The heights measured upwards with
respect to x-axis and normal to the
principal axis (x-axis) of the mirror/
lens are taken as positive (Fig. 9.2). The
heights measured downwards are taken as negative.

FIGURE 9.2 The Cartesian Sign Convention.

With a commonly accepted convention, it turns out that a single formula
for spherical mirrors and a single formula for spherical lenses can handle
all different cases.

9.2.2 Focal length of spherical mirrors


Figure 9.3 shows what happens when a parallel beam of light is incident
on (a) a concave mirror, and (b) a convex mirror. We assume that the rays
are paraxial, i.e., they are incident at points close to the pole P of the mirror
and make small angles with the principal axis. The reflected rays converge
at a point F on the principal axis of a concave mirror [Fig. 9.3(a)].
For a convex mirror, the reflected rays appear to diverge from a point F
on its principal axis [Fig. 9.3(b)]. The point F is called the principal focus
of the mirror. If the parallel paraxial beam of light were incident, making
some angle with the principal axis, the reflected rays would converge (or
appear to diverge) from a point in a plane through F normal to the principal
axis. This is called the focal plane of the mirror [Fig. 9.3(c)].


FIGURE 9.3 Focus of a concave and convex mirror.

The distance between the focus F and the pole P of the mirror is called
the focal length of the mirror, denoted by f. We now show that f = R/2,
where R is the radius of curvature of the mirror. The geometry
of reflection of an incident ray is shown in Fig. 9.4.
Let C be the centre of curvature of the mirror. Consider a
ray parallel to the principal axis striking the mirror at M. Then
CM will be perpendicular to the mirror at M. Let $\theta$ be the angle
of incidence, and MD be the perpendicular from M on the
principal axis. Then,
$\angle \mathrm{MCP} = \theta$ and $\angle \mathrm{MFP} = 2\theta$
Now,
$\tan\theta = \dfrac{\mathrm{MD}}{\mathrm{CD}}$ and $\tan 2\theta = \dfrac{\mathrm{MD}}{\mathrm{FD}}$ (9.1)
For small $\theta$, which is true for paraxial rays, $\tan\theta \approx \theta$,
$\tan 2\theta \approx 2\theta$. Therefore, Eq. (9.1) gives
$\dfrac{\mathrm{MD}}{\mathrm{FD}} = 2\,\dfrac{\mathrm{MD}}{\mathrm{CD}}$
or, $\mathrm{FD} = \dfrac{\mathrm{CD}}{2}$ (9.2)
Now, for small $\theta$, the point D is very close to the point P.
Therefore, FD = f and CD = R. Equation (9.2) then gives
f = R/2 (9.3)

FIGURE 9.4 Geometry of reflection of an incident ray on (a) concave spherical mirror, and (b) convex spherical mirror.
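The paraxial result f = R/2 can be checked numerically. The sketch below is an illustration, not part of the original derivation: it uses the fact that triangle CFM is isosceles (the angles at C and at M both equal the angle of incidence), so that CF = R / (2 cos θ) exactly. The function name and the value R = 15 cm are assumptions chosen for the example.

```python
import math

def focal_distance_from_pole(R, theta):
    """Distance PF at which a ray parallel to the axis crosses it after reflection.

    R     : radius of curvature of the concave mirror (magnitude)
    theta : angle of incidence in radians (equal to angle MCP)
    Triangle CFM is isosceles (CF = FM), so CF = R / (2 cos theta) and PF = R - CF.
    """
    return R - R / (2.0 * math.cos(theta))

R = 15.0  # cm, illustrative value
for degrees in (1, 5, 10, 20, 30):
    pf = focal_distance_from_pole(R, math.radians(degrees))
    print(f"theta = {degrees:2d} deg  ->  PF = {pf:.4f} cm  (R/2 = {R / 2:.4f} cm)")
```

For small angles PF stays very close to R/2, while for larger angles the crossing point moves towards the pole, which is why the discussion is restricted to paraxial rays.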

9.2.3 The mirror equation



If rays emanating from a point actually meet at another point after
reflection and/or refraction, that point is called the image of the first
point. The image is real if the rays actually converge to the point; it is
virtual if the rays do not actually meet but appear
to diverge from the point when produced
backwards. An image is thus a point-to-point
correspondence with the object established
through reflection and/or refraction.
In principle, we can take any two rays
emanating from a point on an object, trace their
paths, find their point of intersection and thus,
obtain the image of the point due to reflection at a
spherical mirror. In practice, however, it is
convenient to choose any two of the following rays:
(i) The ray from the point which is parallel to the
principal axis. The reflected ray goes through
the focus of the mirror.
(ii) The ray passing through the centre of
curvature of a concave mirror or appearing to pass through it for a
convex mirror. The reflected ray simply retraces the path.
(iii) The ray passing through (or directed towards) the focus of the concave
mirror or appearing to pass through (or directed towards) the focus
of a convex mirror. The reflected ray is parallel to the principal axis.
(iv) The ray incident at any angle at the pole. The reflected ray follows
laws of reflection.

FIGURE 9.5 Ray diagram for image formation by a concave mirror.

Figure 9.5 shows the ray diagram considering three rays. It shows
the image A′B′ (in this case, real) of an object AB formed by a concave
mirror. It does not mean that only three rays emanate from the point A.
An infinite number of rays emanate from any source, in all directions.
Thus, point A′ is the image point of A if every ray originating at point A and
falling on the concave mirror after reflection passes through the point A′.
We now derive the mirror equation or the relation between the object
distance (u), image distance (v) and the focal length ( f ).
From Fig. 9.5, the two right-angled triangles A′B′F and MPF are
similar. (For paraxial rays, MP can be considered to be a straight line
perpendicular to CP.) Therefore,
$\dfrac{\mathrm{B'A'}}{\mathrm{PM}} = \dfrac{\mathrm{B'F}}{\mathrm{FP}}$
or $\dfrac{\mathrm{B'A'}}{\mathrm{BA}} = \dfrac{\mathrm{B'F}}{\mathrm{FP}}$ ($\because$ PM = AB) (9.4)
Since $\angle$APB = $\angle$A′PB′, the right angled triangles A′B′P and ABP are
also similar. Therefore,
$\dfrac{\mathrm{B'A'}}{\mathrm{BA}} = \dfrac{\mathrm{B'P}}{\mathrm{BP}}$ (9.5)
Comparing Eqs. (9.4) and (9.5), we get
$\dfrac{\mathrm{B'F}}{\mathrm{FP}} = \dfrac{\mathrm{B'P} - \mathrm{FP}}{\mathrm{FP}} = \dfrac{\mathrm{B'P}}{\mathrm{BP}}$ (9.6)
Equation (9.6) is a relation involving magnitude of distances. We now
apply the sign convention. We note that light travels from the object to
the mirror MPN. Hence this is taken as the positive direction. To reach
the object AB, image A′B′ as well as the focus F from the pole P, we have
to travel opposite to the direction of incident light. Hence, all the three
will have negative signs. Thus,
B′P = $-v$, FP = $-f$, BP = $-u$
Using these in Eq. (9.6), we get
$\dfrac{-v + f}{-f} = \dfrac{-v}{-u}$
or $\dfrac{v - f}{f} = \dfrac{v}{u}$
Dividing both sides by $v$ and rearranging,
$\dfrac{1}{v} + \dfrac{1}{u} = \dfrac{1}{f}$ (9.7)
This relation is known as the mirror equation.
The size of the image relative to the size of the object is another
important quantity to consider. We define linear magnification (m) as the
ratio of the height of the image ($h'$) to the height of the object ($h$):
$m = \dfrac{h'}{h}$ (9.8)
$h$ and $h'$ will be taken positive or negative in accordance with the accepted
sign convention. In triangles A′B′P and ABP, we have,
$\dfrac{\mathrm{B'A'}}{\mathrm{BA}} = \dfrac{\mathrm{B'P}}{\mathrm{BP}}$
With the sign convention, this becomes
$\dfrac{-h'}{h} = \dfrac{-v}{-u}$
so that
$m = \dfrac{h'}{h} = -\dfrac{v}{u}$ (9.9)
We have derived here the mirror equation, Eq. (9.7), and the
magnification formula, Eq. (9.9), for the case of real, inverted image formed
by a concave mirror. With the proper use of sign convention, these are,
in fact, valid for all the cases of reflection by a spherical mirror (concave
or convex) whether the image formed is real or virtual. Figure 9.6 shows
the ray diagrams for virtual image formed by a concave and convex mirror.
You should verify that Eqs. (9.7) and (9.9) are valid for these cases
as well.

FIGURE 9.6 Image formation by (a) a concave mirror with object between
P and F, and (b) a convex mirror.

Example 9.1 Suppose that the lower half of the concave mirror's
reflecting surface in Fig. 9.5 is covered with an opaque (non-reflective)
material. What effect will this have on the image of an object placed
in front of the mirror?
Solution You may think that the image will now show only half of the
object, but taking the laws of reflection to be true for all points of the
remaining part of the mirror, the image will be that of the whole object.
However, as the area of the reflecting surface has been reduced, the
intensity of the image will be lower (in this case, half).
Example 9.2 A mobile phone lies along the principal axis of a concave
mirror, as shown in Fig. 9.7. Show by suitable diagram, the formation
of its image. Explain why the magnification is not uniform. Will the
distortion of image depend on the location of the phone with respect
to the mirror?


FIGURE 9.7

Solution
The ray diagram for the formation of the image of the phone is shown
in Fig. 9.7. The image of the part which is on the plane perpendicular
to the principal axis will be on the same plane. It will be of the same size,
i.e., B′C = BC. You can work out for yourself why the image is distorted.
Example 9.3 An object is placed at (i) 10 cm, (ii) 5 cm in front of a
concave mirror of radius of curvature 15 cm. Find the position, nature,
and magnification of the image in each case.
Solution
The focal length $f = -15/2$ cm $= -7.5$ cm
(i) The object distance $u = -10$ cm. Then Eq. (9.7) gives
$\dfrac{1}{v} + \dfrac{1}{-10} = \dfrac{1}{-7.5}$
or $v = \dfrac{10 \times 7.5}{-2.5} = -30$ cm
The image is 30 cm from the mirror on the same side as the object.
Also, magnification $m = -\dfrac{v}{u} = -\dfrac{(-30)}{(-10)} = -3$
The image is magnified, real and inverted.
(ii) The object distance $u = -5$ cm. Then from Eq. (9.7),
$\dfrac{1}{v} + \dfrac{1}{-5} = \dfrac{1}{-7.5}$
or $v = \dfrac{5 \times 7.5}{(7.5 - 5)} = 15$ cm
This image is formed at 15 cm behind the mirror. It is a virtual image.
Magnification $m = -\dfrac{v}{u} = -\dfrac{15}{(-5)} = 3$
The image is magnified, virtual and erect.
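The arithmetic of Example 9.3 can be reproduced with a short script. This is a minimal sketch, assuming the Cartesian sign convention of Section 9.2.1; the function names are illustrative and not part of the text.

```python
def image_distance(u, f):
    """Solve the mirror equation 1/v + 1/u = 1/f, Eq. (9.7), for v."""
    if u == f:
        raise ValueError("object at the focus: the image forms at infinity")
    return u * f / (u - f)

def magnification(u, v):
    """Linear magnification m = -v/u, Eq. (9.9)."""
    return -v / u

f = -15.0 / 2  # concave mirror with radius of curvature 15 cm, so f = -7.5 cm
for u in (-10.0, -5.0):
    v = image_distance(u, f)
    m = magnification(u, v)
    # For a mirror and a real object, v < 0 means the image is in front (real).
    nature = "real, inverted" if v < 0 else "virtual, erect"
    print(f"u = {u} cm: v = {v:.1f} cm, m = {m:.1f} ({nature})")
```

The two cases give v = -30 cm with m = -3 and v = 15 cm with m = 3, in agreement with the worked solution.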

Example 9.4 Suppose while sitting in a parked car, you notice a
jogger approaching towards you in the side view mirror of R = 2 m. If
the jogger is running at a speed of 5 m s$^{-1}$, how fast does the image of the
jogger appear to move when the jogger is (a) 39 m, (b) 29 m, (c) 19 m,
and (d) 9 m away?
Solution
From the mirror equation, Eq. (9.7), we get
$v = \dfrac{fu}{u - f}$
For convex mirror, since R = 2 m, f = 1 m. Then
for $u = -39$ m, $v = \dfrac{(-39) \times 1}{-39 - 1} = \dfrac{39}{40}$ m
Since the jogger moves at a constant speed of 5 m s$^{-1}$, after 1 s the
position of the image $v$ (for $u = -39 + 5 = -34$) is (34/35) m.
The shift in the position of image in 1 s is
$\dfrac{39}{40} - \dfrac{34}{35} = \dfrac{1365 - 1360}{1400} = \dfrac{5}{1400} = \dfrac{1}{280}$ m
Therefore, the average speed of the image when the jogger is between
39 m and 34 m from the mirror, is (1/280) m s$^{-1}$.
Similarly, it can be seen that for $u = -29$ m, $-19$ m and $-9$ m, the
speed with which the image appears to move is
$\dfrac{1}{150}$ m s$^{-1}$, $\dfrac{1}{60}$ m s$^{-1}$ and $\dfrac{1}{10}$ m s$^{-1}$, respectively.
Although the jogger has been moving with a constant speed, the speed
of his/her image appears to increase substantially as he/she moves
closer to the mirror. This phenomenon can be noticed by any person
sitting in a stationary car or a bus. In case of moving vehicles, a
similar phenomenon could be observed if the vehicle in the rear is
moving closer with a constant speed.
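The four image speeds quoted in Example 9.4 can be checked with exact rational arithmetic. The sketch below is illustrative (the helper names are not from the text) and averages over a 1 s interval, as the worked solution does.

```python
from fractions import Fraction

def image_position(u, f):
    """v = f u / (u - f), from the mirror equation, Eq. (9.7)."""
    return f * u / (u - f)

def average_image_speed(u_start, speed, f, dt=1):
    """Average image speed over dt seconds for an object approaching at `speed`."""
    u_end = u_start + speed * dt  # object distance becomes less negative
    shift = image_position(u_start, f) - image_position(u_end, f)
    return abs(shift) / dt

f = Fraction(1)  # convex mirror with R = 2 m, so f = +1 m
for distance in (39, 29, 19, 9):
    s = average_image_speed(Fraction(-distance), Fraction(5), f)
    print(f"jogger at {distance} m: average image speed = {s} m/s")
```

Using `Fraction` keeps the results as 1/280, 1/150, 1/60 and 1/10 rather than rounded decimals, which makes the comparison with the text direct.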

9.3 REFRACTION

When a beam of light encounters another transparent medium, a part of
light gets reflected back into the first medium while the rest enters the
other. A ray of light represents a beam. The direction of propagation of
an obliquely incident ray of light that enters the other medium, changes
at the interface of the two media. This
phenomenon is called refraction of light. Snell
experimentally obtained the following laws of
refraction:
(i) The incident ray, the refracted ray and the
normal to the interface at the point of
incidence, all lie in the same plane.
(ii) The ratio of the sine of the angle of incidence
to the sine of angle of refraction is constant.
Remember that the angles of incidence ($i$) and
refraction ($r$) are the angles that the incident
and its refracted ray make with the normal,
respectively. We have
$\dfrac{\sin i}{\sin r} = n_{21}$ (9.10)
where $n_{21}$ is a constant, called the refractive index of the second medium
with respect to the first medium. Equation (9.10) is the well-known Snell's
law of refraction. We note that $n_{21}$ is a characteristic of the pair of media
(and also depends on the wavelength of light), but is independent of the
angle of incidence.

FIGURE 9.8 Refraction and reflection of light.
From Eq. (9.10), if $n_{21} > 1$, $r < i$, i.e., the refracted ray bends towards
the normal. In such a case medium 2 is said to be optically denser (or
denser, in short) than medium 1. On the other hand, if $n_{21} < 1$, $r > i$, the
refracted ray bends away from the normal. This is the case when incident
ray in a denser medium refracts into a rarer medium.
Note: Optical density should not be confused with mass density,
which is mass per unit volume. It is possible that mass density of
an optically denser medium may be less than that of an optically
rarer medium (optical density is the ratio of the speed of light in
two media). Turpentine and water are an example: the mass density of
turpentine is less than that of water but its optical density is higher.
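Snell's law, Eq. (9.10), is easy to evaluate numerically. The following is a minimal sketch; the function name is illustrative, and the value 1.33 for water with respect to air is the one listed later in Table 9.1.

```python
import math

def refraction_angle(i_deg, n21):
    """Angle of refraction in degrees from sin i / sin r = n21, Eq. (9.10).

    Returns None when sin r would exceed 1, i.e. no refracted ray exists.
    """
    sin_r = math.sin(math.radians(i_deg)) / n21
    if sin_r > 1.0:
        return None
    return math.degrees(math.asin(sin_r))

n_water = 1.33  # refractive index of water with respect to air
for i in (0, 15, 30, 45, 60):
    r = refraction_angle(i, n_water)
    print(f"air -> water: i = {i:2d} deg, r = {r:.2f} deg")
```

With n21 > 1 every printed r is smaller than i (bending towards the normal); calling the function with n21 = 1/1.33 reverses the roles of the media and gives r > i.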
If $n_{21}$ is the refractive index of medium 2
with respect to medium 1 and $n_{12}$ the refractive
index of medium 1 with respect to medium 2,
then it should be clear that
$n_{12} = \dfrac{1}{n_{21}}$ (9.11)
It also follows that if $n_{32}$ is the refractive
index of medium 3 with respect to medium 2
then $n_{32} = n_{31} \times n_{12}$, where $n_{31}$ is the refractive
index of medium 3 with respect to medium 1.

FIGURE 9.9 Lateral shift of a ray refracted through a parallel-sided slab.

Some elementary results based on the laws
of refraction follow immediately. For a
rectangular slab, refraction takes place at two
interfaces (air-glass and glass-air). It is easily seen from Fig. 9.9 that
$r_2 = i_1$, i.e., the emergent ray is parallel to the incident ray; there is no

deviation, but it does suffer lateral displacement/shift
with respect to the incident ray. Another familiar
observation is that the bottom of a tank filled with
water appears to be raised (Fig. 9.10). For viewing
near the normal direction, it can be shown that the
apparent depth ($h_1$) is real depth ($h_2$) divided by
the refractive index of the medium (water).

FIGURE 9.10 Apparent depth for (a) normal, and (b) oblique viewing.
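Both results can be explored numerically. The sketch below is illustrative and rests on two assumptions that are not spelled out in the text: the lateral shift is computed from the standard geometric expression t sin(i - r) / cos r for a slab of thickness t, and the apparent depth is taken as the depth at which the emergent ray, produced backwards, meets the vertical line through the object.

```python
import math

def lateral_shift(thickness, i_deg, n):
    """Sideways displacement of a ray crossing a parallel-sided slab of index n."""
    i = math.radians(i_deg)
    r = math.asin(math.sin(i) / n)  # refraction at the first face
    return thickness * math.sin(i - r) / math.cos(r)

def apparent_depth(real_depth, n, view_deg):
    """Apparent depth for a viewing angle view_deg in air, measured from the normal."""
    if view_deg == 0:
        return real_depth / n  # limiting value for normal viewing
    a = math.radians(view_deg)
    w = math.asin(math.sin(a) / n)  # angle of the same ray inside the water
    return real_depth * math.tan(w) / math.tan(a)

print(f"shift in a 1 cm glass slab (n = 1.5) at 30 deg: {lateral_shift(1.0, 30, 1.5):.3f} cm")
for angle in (0, 1, 10, 30):
    print(f"view angle {angle:2d} deg: apparent depth of 1 m of water = "
          f"{apparent_depth(1.0, 1.33, angle):.4f} m")
```

Near the normal the apparent depth stays close to h2 / n (about 0.75 m for 1 m of water), and it decreases as the viewing direction becomes more oblique.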
The refraction of light through the atmosphere
is responsible for many interesting phenomena. For
example, the sun is visible a little before the actual
sunrise and until a little after the actual sunset
due to refraction of light through the atmosphere
(Fig. 9.11). By actual sunrise we mean the actual
crossing of the horizon by the sun. Figure 9.11
shows the actual and apparent positions of the sun
with respect to the horizon. The figure is highly
exaggerated to show the effect. The refractive index
of air with respect to vacuum is 1.00029. Due to
this, the apparent shift in the direction of the sun
is by about half a degree and the corresponding
time difference between actual sunset and apparent
sunset is about 2 minutes (see Example 9.5). The
apparent flattening (oval shape) of the sun at sunset
and sunrise is also due to the same phenomenon.

FIGURE 9.11 Advance sunrise and delayed sunset due to atmospheric refraction.

Example 9.5 The earth takes 24 h to rotate once about its axis. How
much time does the sun take to shift by 1° when viewed from
the earth?
Solution
Time taken for 360° shift = 24 h
Time taken for 1° shift = 24/360 h = 4 min.

THE DROWNING CHILD, LIFEGUARD AND SNELL'S LAW

Consider a rectangular swimming pool PQSR; see figure here. A lifeguard sitting at G
outside the pool notices a child drowning at a point C. The guard wants to reach the
child in the shortest possible time. Let SR be the
side of the pool between G and C. Should he/she
take a straight line path GAC between G and C or
GBC in which the path BC in water would be the
shortest, or some other path GXC? The guard knows
that his/her running speed $v_1$ on ground is higher
than his/her swimming speed $v_2$.
Suppose the guard enters water at X. Let GX = $l_1$
and XC = $l_2$. Then the time taken to reach from G to
C would be
$t = \dfrac{l_1}{v_1} + \dfrac{l_2}{v_2}$
To make this time minimum, one has to
differentiate it (with respect to the coordinate of X) and find the point X when t is a
minimum. On doing all this algebra (which we skip here), we find that the guard should
enter water at a point where Snell's law is satisfied. To understand this, draw a
perpendicular LM to side SR at X. Let $\angle$GXM = $i$ and $\angle$CXL = $r$. Then it can be seen that t
is minimum when
$\dfrac{\sin i}{\sin r} = \dfrac{v_1}{v_2}$
In the case of light $v_1/v_2$, the ratio of the velocity of light in vacuum to that in the
medium, is the refractive index n of the medium.
In short, whether it is a wave or a particle or a human being, whenever two media
and two velocities are involved, one must follow Snell's law if one wants to take the
shortest time.
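The algebra skipped above can be replaced by a numerical check. The sketch below is an illustration under assumed values: the pool edge SR is taken as the x-axis, G = (0, a) lies on land, C = (d, -b) lies in the water, and the distances and speeds are invented for the example. It finds the entry point X by a simple ternary search and compares sin i / sin r with v1 / v2.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time to run from G to X = (x, 0) and then swim from X to C."""
    l1 = math.hypot(x, a)      # GX, covered at running speed v1
    l2 = math.hypot(d - x, b)  # XC, covered at swimming speed v2
    return l1 / v1 + l2 / v2

def best_entry_point(a, b, d, v1, v2, iterations=200):
    """Ternary search for the x in [0, d] that minimises the travel time.

    t(x) is a sum of two convex functions, so it has a single minimum.
    """
    lo, hi = 0.0, d
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def sine_ratio(x, a, b, d):
    """sin i / sin r at entry point x, both angles measured from the normal LM."""
    sin_i = x / math.hypot(x, a)
    sin_r = (d - x) / math.hypot(d - x, b)
    return sin_i / sin_r

a, b, d = 10.0, 10.0, 20.0  # metres (illustrative values)
v1, v2 = 6.0, 2.0           # m/s (illustrative values)
x_best = best_entry_point(a, b, d, v1, v2)
print(f"enter the water at x = {x_best:.3f} m")
print(f"sin i / sin r = {sine_ratio(x_best, a, b, d):.4f}, v1 / v2 = {v1 / v2:.4f}")
```

The least-time entry point lies beyond the straight-line crossing, so the guard runs a longer distance on land and swims a shorter one, and the ratio of sines matches the ratio of speeds.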

9.4 TOTAL INTERNAL REFLECTION


When light travels from an optically denser medium to a rarer medium
at the interface, it is partly reflected back into the same medium and
partly refracted to the second medium. This reflection is called the internal
reflection.
When a ray of light enters from a denser medium to a rarer medium,
it bends away from the normal, for example, the ray AO$_1$B in Fig. 9.12.
The incident ray AO$_1$ is partially reflected (O$_1$C) and partially transmitted
(O$_1$B) or refracted, the angle of refraction ($r$) being larger than the angle of
incidence ($i$). As the angle of incidence increases, so does the angle of
refraction, till for the ray AO$_3$, the angle of refraction is $\pi/2$. The refracted
ray is bent so much away from the normal that it grazes the surface at
the interface between the two media. This is shown by the ray AO$_3$D in
Fig. 9.12. If the angle of incidence is increased still further (e.g., the ray
AO$_4$), refraction is not possible, and the incident ray is totally reflected.

This is called total internal reflection. When
light gets reflected by a surface, normally
some fraction of it gets transmitted. The
reflected ray, therefore, is always less intense
than the incident ray, howsoever smooth the
reflecting surface may be. In total internal
reflection, on the other hand, no
transmission of light takes place.
The angle of incidence corresponding to
an angle of refraction 90°, say $\angle$AO$_3$N, is
called the critical angle ($i_c$) for the given pair
of media. We see from Snell's law [Eq. (9.10)]
that if the relative refractive index is less
than one then, since the maximum value
of $\sin r$ is unity, there is an upper limit
to the value of $\sin i$ for which the law can be satisfied, that is, $i = i_c$
such that
$\sin i_c = n_{21}$ (9.12)
For values of $i$ larger than $i_c$, Snell's law of refraction cannot be
satisfied, and hence no refraction is possible.
The refractive index of denser medium 1 with respect to rarer medium
2 will be $n_{12} = 1/\sin i_c$. Some typical critical angles are listed in Table 9.1.

FIGURE 9.12 Refraction and internal reflection of rays from a point A in the denser medium (water) incident at different angles at the interface with a rarer medium (air).

TABLE 9.1 CRITICAL ANGLE OF SOME TRANSPARENT MEDIA WITH RESPECT TO AIR

Substance medium     Refractive index     Critical angle
Water                1.33                 48.75°
Crown glass          1.52                 41.14°
Dense flint glass    1.62                 37.31°
Diamond              2.42                 24.41°
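The entries of Table 9.1 follow from the relation between the refractive index and the critical angle, sin i_c = 1/n for a medium of index n in contact with air. The sketch below is illustrative; the function name is not from the text.

```python
import math

def critical_angle_deg(n):
    """Critical angle in degrees for light going from a medium of index n into air."""
    if n <= 1.0:
        raise ValueError("total internal reflection needs n > 1 relative to air")
    return math.degrees(math.asin(1.0 / n))

media = (("Water", 1.33), ("Crown glass", 1.52),
         ("Dense flint glass", 1.62), ("Diamond", 2.42))
for name, n in media:
    print(f"{name:18s} n = {n:.2f}  i_c = {critical_angle_deg(n):.2f} deg")
```

The computed values for water, crown glass and diamond agree with the table to two decimal places. For n = 1.62 the same formula gives about 38.1°, slightly above the tabulated 37.31°, so that row should be read with some caution.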

A demonstration for total internal reflection


All optical phenomena can be demonstrated very easily with the use of a
laser torch or pointer, which is easily available nowadays. Take a glass
beaker with clear water in it. Stir the water a few times with a piece of
soap, so that it becomes a little turbid. Take a laser pointer and shine its
beam through the turbid water. You will find that the path of the beam
inside the water shines brightly.
Shine the beam from below the beaker such that it strikes at the
upper water surface at the other end. Do you find that it undergoes partial
reflection (which is seen as a spot on the table below) and partial refraction
[which comes out in the air and is seen as a spot on the roof; Fig. 9.13(a)]?
Now direct the laser beam from one side of the beaker such that it strikes
the upper surface of water more obliquely [Fig. 9.13(b)]. Adjust the
direction of laser beam until you find the angle for which the refraction
above the water surface is totally absent and the beam is totally reflected
back to water. This is total internal reflection at its simplest.
Pour this water in a long test tube and shine the laser light from top,
as shown in Fig. 9.13(c). Adjust the direction of the laser beam such that
it is totally internally reflected every time it strikes the walls of the tube.
This is similar to what happens in optical fibres.
Take care not to look into the laser beam directly and not to point it
at anybody's face.

9.4.1 Total internal reflection in nature and its technological applications
(i) Mirage: On hot summer days, the air near the ground becomes hotter
than the air at higher levels. The refractive index of air increases with
its density. Hotter air is less dense, and has smaller refractive index
than the cooler air. If the air currents are small, that is, the air is still,
the optical density at different layers of air increases with height. As a
result, light from a tall object such as a tree, passes through a medium
whose refractive index decreases towards the ground. Thus, a ray of
light from such an object successively bends away from the normal
and undergoes total internal reflection, if the angle of incidence for
the air near the ground exceeds the critical angle. This is shown in
Fig. 9.14(b). To a distant observer, the light appears to be coming
from somewhere below the ground. The observer naturally assumes
that light is being reflected from the ground, say, by a pool of water
near the tall object. Such inverted images of distant tall objects cause
an optical illusion to the observer. This phenomenon is called mirage.
This type of mirage is especially common in hot deserts. Some of you
might have noticed that while moving in a bus or a car during a hot
summer day, a distant patch of road, especially on a highway, appears
to be wet. But, you do not find any evidence of wetness when you
reach that spot. This is also due to mirage.

FIGURE 9.13 Observing total internal reflection in water with a laser beam (refraction due to glass of beaker neglected being very thin).

FIGURE 9.14 (a) A tree is seen by an observer at its place when the air above the ground is
at uniform temperature, (b) When the layers of air close to the ground have varying
temperature with hottest layers near the ground, light from a distant tree may
undergo total internal reflection, and the apparent image of the tree may create
an illusion to the observer that the tree is near a pool of water.

(ii) Diamond: Diamonds are known for their
spectacular brilliance. Their brilliance
is mainly due to the total internal
reflection of light inside them. The critical
angle for diamond-air interface (≅ 24.4°)
is very small, therefore once light enters
a diamond, it is very likely to undergo
total internal reflection inside it.
Diamonds found in nature rarely exhibit
the brilliance for which they are known.
It is the technical skill of a diamond
cutter which makes diamonds
sparkle so brilliantly. By cutting the
diamond suitably, multiple total
internal reflections can be made
to occur.
(iii) Prism: Prisms designed to bend light by
90° or by 180° make use of total internal
reflection [Fig. 9.15(a) and (b)]. Such a
prism is also used to invert images
without changing their size [Fig. 9.15(c)].
In the first two cases, the critical angle $i_c$ for the material of the prism
must be less than 45°. We see from Table 9.1 that this is true for both
crown glass and dense flint glass.

FIGURE 9.15 Prisms designed to bend rays by 90° and 180° or to invert image without changing its size make use of total internal reflection.
(iv) Optical fibres: Nowadays optical fibres are extensively used for
transmitting audio and video signals through long distances. Optical
fibres too make use of the phenomenon of total internal reflection.
Optical fibres are fabricated with high quality composite glass/quartz
fibres. Each fibre consists of a core and cladding. The refractive index
of the material of the core is higher than that of the cladding.
When a signal in the form of light is
directed at one end of the fibre at a suitable
angle, it undergoes repeated total internal
reflections along the length of the fibre and
finally comes out at the other end (Fig. 9.16).
Since light undergoes total internal reflection
at each stage, there is no appreciable loss in
the intensity of the light signal. Optical fibres
are fabricated such that light reflected at one
side of inner surface strikes the other at an
angle larger than the critical angle. Even if the
fibre is bent, light can easily travel along its
length. Thus, an optical fibre can be used to act as an optical pipe.

FIGURE 9.16 Light undergoes successive total internal reflections as it moves through an optical fibre.

A bundle of optical fibres can be put to several uses. Optical fibres
are extensively used for transmitting and receiving electrical signals which
are converted to light by suitable transducers. Obviously, optical fibres
can also be used for transmission of optical signals. For example, these
are used as a light pipe to facilitate visual examination of internal organs
like esophagus, stomach and intestines. You might have seen a commonly
available decorative lamp with fine plastic fibres with their free ends
forming a fountain like structure. The other end of the fibres is fixed over
an electric lamp. When the lamp is switched on, the light travels from the
bottom of each fibre and appears at the tip of its free end as a dot of light.
The fibres in such decorative lamps are optical fibres.
The main requirement in fabricating optical fibres is that there should
be very little absorption of light as it travels for long distances inside
them. This has been achieved by purification and special preparation of
materials such as quartz. In silica glass fibres, it is possible to transmit
more than 95% of the light over a fibre length of 1 km. (Compare with
what you expect for a block of ordinary window glass 1 km thick.)
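The 95% figure can be turned into numbers for longer fibres. The sketch below makes an assumption that the text does not state: that each kilometre of fibre transmits the same fraction of the light entering it (exponential attenuation). The function names are illustrative.

```python
import math

def transmitted_fraction(length_km, fraction_per_km=0.95):
    """Fraction of light left after length_km, assuming exponential attenuation."""
    return fraction_per_km ** length_km

def attenuation_db_per_km(fraction_per_km=0.95):
    """The same loss expressed in decibels per kilometre."""
    return -10.0 * math.log10(fraction_per_km)

for length in (1, 10, 50):
    print(f"{length:3d} km: {100 * transmitted_fraction(length):.1f}% transmitted")
print(f"attenuation = {attenuation_db_per_km():.3f} dB/km")
```

Under this assumption roughly 60% of the light survives 10 km of fibre, which gives a sense of how transparent the purified material has to be.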

9.5 REFRACTION AT SPHERICAL SURFACES AND BY LENSES
We have so far considered refraction at a plane interface. We shall now
consider refraction at a spherical interface between two transparent media.
An infinitesimal part of a spherical surface can be regarded as planar
and the same laws of refraction can be applied at every point on the
surface. Just as for reflection by a spherical mirror, the normal at the
point of incidence is perpendicular to the tangent plane to the spherical
surface at that point and, therefore, passes through its centre of curvature.
We first consider refraction by a single spherical surface and follow it by
thin lenses. A thin lens is a transparent optical medium bounded by two
surfaces; at least one of which should be spherical. Applying the formula
for image formation by a single spherical surface successively at the two
surfaces of a lens, we shall obtain the lens maker's formula and then the
lens formula.

9.5.1 Refraction at a spherical surface


Figure 9.17 shows the geometry of formation of image I of an object O on
the principal axis of a spherical surface with centre of curvature C, and
radius of curvature R. The rays are incident from a medium of refractive
index n1 to another of refractive index n2. As before, we take the aperture
(or the lateral size) of the surface to be small
compared to other distances involved, so that small
angle approximation can be made. In particular,
NM will be taken to be nearly equal to the length of
the perpendicular from the point N on the principal
axis. We have, for small angles,
\[
\tan \angle NOM = \frac{MN}{OM}, \qquad
\tan \angle NCM = \frac{MN}{MC}, \qquad
\tan \angle NIM = \frac{MN}{MI}
\]

FIGURE 9.17 Refraction at a spherical surface separating two media.

LIGHT SOURCES AND PHOTOMETRY
It is known that a body above absolute zero temperature emits electromagnetic radiation.
The wavelength region in which the body emits the radiation depends on its absolute
temperature. Radiation emitted by a hot body, for example, a tungsten filament lamp
at a temperature of 2850 K, is only partly visible and lies mostly in the infrared (or heat) region.
As the temperature of the body increases, the radiation it emits shifts towards the visible region. The
sun with temperature of about 5500 K emits radiation whose energy versus wavelength
graph peaks approximately at 550 nm corresponding to green light and is almost in the
middle of the visible region. The energy versus wavelength distribution graph for a given
body peaks at some wavelength, which is inversely proportional to the absolute
temperature of that body.
The measurement of light as perceived by human eye is called photometry. Photometry
is measurement of a physiological phenomenon, being the stimulus of light as received
by the human eye, transmitted by the optic nerves and analysed by the brain. The main
physical quantities in photometry are (i) the luminous intensity of the source,
(ii) the luminous flux or flow of light from the source, and (iii) illuminance of the surface.
The SI unit of luminous intensity (I) is candela (cd). The candela is the luminous intensity,
in a given direction, of a source that emits monochromatic radiation of frequency
540 × 10¹² Hz and that has a radiant intensity in that direction of 1/683 watt per steradian.
If a light source emits one candela of luminous intensity into a solid angle of one steradian,
the total luminous flux emitted into that solid angle is one lumen (lm). A standard
100 watt incandescent light bulb emits approximately 1700 lumens.
In photometry, the only parameter that can be measured directly is illuminance. It
is defined as luminous flux incident per unit area on a surface (lm/m² or lux). Most light
meters measure this quantity. The illuminance E, produced by a source of luminous
intensity I, is given by E = I/r², where r is the normal distance of the surface from the
source. A quantity named luminance (L) is used to characterise the brightness of emitting
or reflecting flat surfaces. Its unit is cd/m² (sometimes called nit in industry). A good
LCD computer monitor has a brightness of about 250 nits.
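
The statement in the box that the peak wavelength is inversely proportional to the absolute temperature can be tried out numerically. The short Python sketch below is only illustrative: the function name `scaled_peak_wavelength_nm` is our own, and it simply rescales the approximate solar values quoted in the box (5500 K, 550 nm) instead of using a measured constant.

```python
def scaled_peak_wavelength_nm(temp_k, ref_temp_k=5500.0, ref_peak_nm=550.0):
    """Peak wavelength, assuming lambda_peak is inversely proportional to T.

    The reference point (5500 K, 550 nm) is the approximate solar value
    quoted in the text; it is an assumption, not a precise constant.
    """
    return ref_peak_nm * ref_temp_k / temp_k


# Tungsten filament at 2850 K, as in the box above
filament_peak_nm = scaled_peak_wavelength_nm(2850.0)
print(f"Estimated peak for a 2850 K filament: {filament_peak_nm:.0f} nm")
```

Under these assumptions the estimate comes out a little above 1000 nm, well beyond the red end of the visible range, which is consistent with the remark that the filament radiates mostly in the infrared.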

Now, for ΔNOC, i is the exterior angle. Therefore, i = ∠NOM + ∠NCM

\[
i = \frac{MN}{OM} + \frac{MN}{MC} \tag{9.13}
\]

Similarly,
r = ∠NCM − ∠NIM
i.e.,
\[
r = \frac{MN}{MC} - \frac{MN}{MI} \tag{9.14}
\]

Now, by Snell's law
\[
n_1 \sin i = n_2 \sin r
\]
or for small angles
\[
n_1 i = n_2 r
\]
Substituting i and r from Eqs. (9.13) and (9.14), we get
\[
\frac{n_1}{OM} + \frac{n_2}{MI} = \frac{n_2 - n_1}{MC} \tag{9.15}
\]


Here, OM, MI and MC represent magnitudes of distances. Applying the
Cartesian sign convention,
OM = −u, MI = +v, MC = +R
Substituting these in Eq. (9.15), we get
\[
\frac{n_2}{v} - \frac{n_1}{u} = \frac{n_2 - n_1}{R} \tag{9.16}
\]
Equation (9.16) gives us a relation between object and image distance
in terms of refractive index of the medium and the radius of
curvature of the curved spherical surface. It holds for any curved
spherical surface.
Example 9.6 Light from a point source in air falls on a spherical
glass surface (n = 1.5 and radius of curvature = 20 cm). The distance
of the light source from the glass surface is 100 cm. At what position
is the image formed?
Solution
We use the relation given by Eq. (9.16). Here
u = −100 cm, v = ?, R = +20 cm, n1 = 1, and n2 = 1.5.
We then have
\[
\frac{1.5}{v} + \frac{1}{100} = \frac{0.5}{20}
\]
or v = +100 cm
The image is formed at a distance of 100 cm from the glass surface,
in the direction of incident light.
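
The single-surface relation, Eq. (9.16), is easy to evaluate numerically. The following Python sketch is one possible way to do so; the function name `image_distance_spherical` is hypothetical, and all distances follow the Cartesian sign convention used in the text.

```python
import math


def image_distance_spherical(n1, n2, u, R):
    """Solve n2/v - n1/u = (n2 - n1)/R for v (Cartesian sign convention)."""
    rhs = (n2 - n1) / R + n1 / u
    if rhs == 0:
        return math.inf  # refracted rays travel parallel to the axis
    return n2 / rhs


# Data of Example 9.6: air to glass, u = -100 cm, R = +20 cm
v = image_distance_spherical(n1=1.0, n2=1.5, u=-100.0, R=20.0)
print(f"v = {v:+.1f} cm")
```

With the data of Example 9.6 the sketch gives an image distance of about +100 cm, in agreement with the worked solution.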

9.5.2 Refraction by a lens


Figure 9.18(a) shows the geometry of image formation by a double convex
lens. The image formation can be seen in terms of two steps:
(i) The first refracting surface forms the image I1 of the object O
[Fig. 9.18(b)]. (ii) The image I1 acts as a virtual object for the second surface
that forms the image at I [Fig. 9.18(c)]. Applying Eq. (9.15) to the first
interface ABC, we get
\[
\frac{n_1}{OB} + \frac{n_2}{BI_1} = \frac{n_2 - n_1}{BC_1} \tag{9.17}
\]

A similar procedure applied to the second interface* ADC gives,
\[
-\frac{n_2}{DI_1} + \frac{n_1}{DI} = \frac{n_2 - n_1}{DC_2} \tag{9.18}
\]

* Note that now the refractive index of the medium on the right side of ADC is n1
while on its left it is n2. Further, DI1 is negative as the distance is measured
against the direction of incident light.

For a thin lens, BI1 = DI1. Adding Eqs. (9.17) and (9.18), we get
\[
\frac{n_1}{OB} + \frac{n_1}{DI} = (n_2 - n_1)\left(\frac{1}{BC_1} + \frac{1}{DC_2}\right) \tag{9.19}
\]

Suppose the object is at infinity, i.e., OB → ∞ and DI = f. Then Eq. (9.19) gives
\[
\frac{n_1}{f} = (n_2 - n_1)\left(\frac{1}{BC_1} + \frac{1}{DC_2}\right) \tag{9.20}
\]

The point where the image of an object placed at infinity is formed is called the
focus F of the lens, and the distance f gives its focal length. A lens has two
foci, F and F′, on either side of it (Fig. 9.19). By the sign convention,
BC1 = +R1,
DC2 = −R2
So Eq. (9.20) can be written as
\[
\frac{1}{f} = (n_{21} - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right),
\qquad n_{21} = \frac{n_2}{n_1} \tag{9.21}
\]

FIGURE 9.18 (a) The position of object, and the image formed by a double convex lens,
(b) Refraction at the first spherical surface and
(c) Refraction at the second spherical surface.

Equation (9.21) is known as the lens maker's formula. It is useful to design
lenses of desired focal length using surfaces of suitable radii of curvature.
Note that the formula is true for a concave lens also. In that case R1 is
negative, R2 positive and therefore, f is negative.
From Eqs. (9.19) and (9.20), we get
\[
\frac{n_1}{OB} + \frac{n_1}{DI} = \frac{n_1}{f} \tag{9.22}
\]

Again, in the thin lens approximation, B and D are both close to the
optical centre of the lens. Applying the sign convention,
BO = −u, DI = +v, we get
\[
\frac{1}{v} - \frac{1}{u} = \frac{1}{f} \tag{9.23}
\]

Equation (9.23) is the familiar thin lens formula. Though we derived
it for a real image formed by a convex lens, the formula is valid for both
convex as well as concave lenses and for both real and virtual images.
It is worth mentioning that the two foci, F and F′, of a double convex
or concave lens are equidistant from the optical centre. The focus on the
side of the (original) source of light is called the first focal point, whereas
the other is called the second focal point.
To find the image of an object by a lens, we can, in principle, take any
two rays emanating from a point on an object; trace their paths using
the laws of refraction and find the point where the refracted rays meet
(or appear to meet). In practice, however, it is convenient to choose any
two of the following rays:
(i) A ray emanating from the object parallel to the principal axis of the
lens after refraction passes through the second principal focus F′ (in a
convex lens) or appears to diverge (in a concave lens) from the first
principal focus F.
(ii) A ray of light, passing through the optical centre of the lens, emerges
without any deviation after refraction.
(iii) A ray of light passing through the first principal focus (for a convex
lens) or appearing to meet at it (for a concave lens) emerges parallel to
the principal axis after refraction.
Figures 9.19(a) and (b) illustrate these rules for a convex and a concave
lens, respectively. You should practice drawing similar ray diagrams for
different positions of the object with respect to the lens and also verify
that the lens formula, Eq. (9.23), holds good for all cases.

FIGURE 9.19 Tracing rays through (a) convex lens (b) concave lens.

Here again it must be remembered that each point on an object gives out
infinite number of rays. All these rays will pass through the same image
point after refraction at the lens.
Magnification (m) produced by a lens is defined, like that for a mirror,
as the ratio of the size of the image to that of the object. Proceeding in the
same way as for spherical mirrors, it is easily seen that for a lens
\[
m = \frac{h'}{h} = \frac{v}{u} \tag{9.24}
\]
When we apply the sign convention, we see that, for erect (and virtual)
image formed by a convex or concave lens, m is positive, while for an
inverted (and real) image, m is negative.
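
The thin lens formula, Eq. (9.23), and the magnification, Eq. (9.24), can be checked together with a few lines of Python. This is a minimal sketch: the function name `thin_lens_image` and the two object positions are our own illustrative choices, not values taken from the text.

```python
def thin_lens_image(u, f):
    """Return (v, m) using 1/v - 1/u = 1/f and m = v/u (Cartesian signs)."""
    inv_v = 1.0 / f + 1.0 / u
    if inv_v == 0:
        raise ValueError("object at the focus: image is formed at infinity")
    v = 1.0 / inv_v
    return v, v / u


# Convex lens (f = +10 cm), object 30 cm in front of it: real, inverted image
v_real, m_real = thin_lens_image(u=-30.0, f=10.0)

# Same lens, object inside the focal length (5 cm away): virtual, erect image
v_virt, m_virt = thin_lens_image(u=-5.0, f=10.0)

print(v_real, m_real)
print(v_virt, m_virt)
```

In the first case v is positive and m is negative (real, inverted image); in the second v is negative and m is positive (virtual, erect image), as the sign rule stated above requires.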

Example 9.7 A magician during a show makes a glass lens with
n = 1.47 disappear in a trough of liquid. What is the refractive index
of the liquid? Could the liquid be water?
Solution
The refractive index of the liquid must be equal to 1.47 in order to
make the lens disappear. This means n1 = n2. This gives 1/f = 0 or
f → ∞. The lens in the liquid will act like a plane sheet of glass. No,
the liquid is not water. It could be glycerine.

9.5.3 Power of a lens


Power of a lens is a measure of the convergence or divergence which a
lens introduces in the light falling on it. Clearly, a lens of shorter focal
length bends the incident light more, while converging it in case of a
convex lens and diverging it in case of a concave lens. The power P of a
lens is defined as the tangent of the angle by which it converges or
diverges a beam of light falling at unit distance from the optical centre
(Fig. 9.20).
\[
\tan\delta = \frac{h}{f}; \quad \text{if } h = 1, \quad \tan\delta = \frac{1}{f}
\quad \text{or} \quad \delta = \frac{1}{f} \ \text{for small value of } \delta.
\]
Thus,
\[
P = \frac{1}{f} \tag{9.25}
\]

FIGURE 9.20 Power of a lens.


The SI unit for power of a lens is dioptre (D): 1 D = 1 m⁻¹. The power of
a lens of focal length of 1 metre is one dioptre. Power of a lens is positive
for a converging lens and negative for a diverging lens. Thus, when an
optician prescribes a corrective lens of power +2.5 D, the required lens is
a convex lens of focal length +40 cm. A lens of power of −4.0 D means a
concave lens of focal length −25 cm.
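
The conversion between power in dioptres and focal length in centimetres is a frequent source of slips, so a small helper may be useful. The sketch below is illustrative; the function names are our own.

```python
def power_dioptre(f_cm):
    """Power in dioptres of a lens whose focal length is given in cm."""
    return 100.0 / f_cm


def focal_length_cm(power_d):
    """Focal length in cm of a lens whose power is given in dioptres."""
    return 100.0 / power_d


print(focal_length_cm(2.5))    # prescription of +2.5 D
print(focal_length_cm(-4.0))   # prescription of -4.0 D
```

The two calls reproduce the focal lengths of +40 cm and −25 cm quoted above.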
Example 9.8 (i) If f = 0.5 m for a glass lens, what is the power of the
lens? (ii) The radii of curvature of the faces of a double convex lens
are 10 cm and 15 cm. Its focal length is 12 cm. What is the refractive
index of glass? (iii) A convex lens has 20 cm focal length in air. What
is focal length in water? (Refractive index of air-water = 1.33, refractive
index for air-glass = 1.5.)
Solution
(i) Power = +2 dioptre.
(ii) Here, we have f = +12 cm, R1 = +10 cm, R2 = −15 cm.
Refractive index of air is taken as unity.
We use the lens maker's formula, Eq. (9.21). The sign convention has to
be applied for f, R1 and R2.
Substituting the values, we have
\[
\frac{1}{12} = (n - 1)\left(\frac{1}{10} - \frac{1}{-15}\right)
\]
This gives n = 1.5.
(iii) For a glass lens in air, n2 = 1.5, n1 = 1, f = +20 cm. Hence, the lens
maker's formula gives
\[
\frac{1}{20} = 0.5\left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
For the same glass lens in water, n2 = 1.5, n1 = 1.33. Therefore,
\[
\frac{1.33}{f} = (1.5 - 1.33)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{9.26}
\]
Combining these two equations, we find f = +78.2 cm.
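
The lens maker's formula in the form n1/f = (n2 − n1)(1/R1 − 1/R2) can be evaluated directly for a lens immersed in any medium. The Python sketch below is a hedged illustration: the function names `lens_focal_length` and `focal_length_in_medium` are hypothetical, and the radii used for the "disappearing" lens of Example 9.7 are arbitrary placeholders since that example gives none.

```python
import math


def lens_focal_length(n_lens, n_medium, R1, R2):
    """Focal length from n1/f = (n2 - n1)(1/R1 - 1/R2), Cartesian signs."""
    strength = (n_lens - n_medium) * (1.0 / R1 - 1.0 / R2)
    if strength == 0:
        return math.inf  # the lens behaves like a plane sheet
    return n_medium / strength


def focal_length_in_medium(f_air, n_lens, n_medium):
    """Rescale a focal length measured in air (n = 1) to another medium."""
    if n_lens == n_medium:
        return math.inf
    return f_air * n_medium * (n_lens - 1.0) / (n_lens - n_medium)


f_part_ii = lens_focal_length(1.5, 1.0, R1=10.0, R2=-15.0)   # Example 9.8 (ii)
f_part_iii = focal_length_in_medium(20.0, 1.5, 1.33)         # Example 9.8 (iii)
f_hidden = lens_focal_length(1.47, 1.47, R1=10.0, R2=-15.0)  # Example 9.7
print(f_part_ii, f_part_iii, f_hidden)
```

The first two results agree with parts (ii) and (iii) of Example 9.8 (about 12 cm and 78.2 cm), and the third returns an infinite focal length, which is the condition for the lens of Example 9.7 to disappear.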

9.5.4 Combination of thin lenses in contact
Consider two lenses A and B of focal length f1 and f2 placed in contact
with each other. Let the object be placed at a point O beyond the focus of
the first lens A (Fig. 9.21). The first lens produces an image at I1. Since
image I1 is real, it serves as a virtual object for the second lens B,
producing the final image at I. It must, however, be borne in mind that
formation of image by the first lens is presumed only to facilitate
determination of the position of the final image. In fact, the direction of
rays emerging from the first lens gets modified in accordance with the
angle at which they strike the second lens. Since the lenses are thin, we
assume the optical centres of the lenses to be coincident. Let this central
point be denoted by P.

FIGURE 9.21 Image formation by a combination of two thin lenses in contact.

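For the image formed by the first lens A, we get
\[
\frac{1}{v_1} - \frac{1}{u} = \frac{1}{f_1} \tag{9.27}
\]
For the image formed by the second lens B, we get
\[
\frac{1}{v} - \frac{1}{v_1} = \frac{1}{f_2} \tag{9.28}
\]
Adding Eqs. (9.27) and (9.28), we get
\[
\frac{1}{v} - \frac{1}{u} = \frac{1}{f_1} + \frac{1}{f_2} \tag{9.29}
\]
If the two lens-system is regarded as equivalent to a single lens of
focal length f, we have
\[
\frac{1}{v} - \frac{1}{u} = \frac{1}{f}
\]
so that we get
\[
\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} \tag{9.30}
\]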

The derivation is valid for any number of thin lenses in contact. If
several thin lenses of focal length f1, f2, f3,... are in contact, the effective
focal length of their combination is given by
\[
\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} + \frac{1}{f_3} + \ldots \tag{9.31}
\]
In terms of power, Eq. (9.31) can be written as
\[
P = P_1 + P_2 + P_3 + \ldots \tag{9.32}
\]

where P is the net power of the lens combination. Note that the sum in
Eq. (9.32) is an algebraic sum of individual powers, so some of the terms
on the right side may be positive (for convex lenses) and some negative
(for concave lenses). Combination of lenses helps to obtain diverging or
converging lenses of desired magnification. It also enhances sharpness
of the image. Since the image formed by the first lens becomes the object
for the second, Eq. (9.24) implies that the total magnification m of the
combination is a product of magnification (m1, m2, m3,...) of individual
lenses
\[
m = m_1 m_2 m_3 \ldots \tag{9.33}
\]
Such a system of combination of lenses is commonly used in designing
lenses for cameras, microscopes, telescopes and other optical instruments.
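
The rule that powers of thin lenses in contact add algebraically can be illustrated with a short calculation. In the Python sketch below the function name `combined_focal_length` and the two focal lengths (+20 cm convex, −25 cm concave) are assumptions chosen only for illustration.

```python
import math


def combined_focal_length(*focal_lengths):
    """Effective focal length of thin lenses in contact: 1/f = sum of 1/f_k."""
    total = sum(1.0 / f for f in focal_lengths)
    if total == 0:
        return math.inf  # the powers cancel exactly
    return 1.0 / total


# A convex lens of +20 cm in contact with a concave lens of -25 cm
f_pair = combined_focal_length(20.0, -25.0)
print(f"effective focal length = {f_pair:.1f} cm")
```

Here the powers are +5 D and −4 D, so the pair behaves like a single converging lens of power +1 D, that is, of focal length about 100 cm.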
Example 9.9 Find the position of the image formed by the lens
combination given in the Fig. 9.22.

FIGURE 9.22

Solution Image formed by the first lens
\[
\frac{1}{v_1} - \frac{1}{u_1} = \frac{1}{f_1}
\]
\[
\frac{1}{v_1} - \frac{1}{-30} = \frac{1}{10}
\]
or v1 = 15 cm
The image formed by the first lens serves as the object for the second.
This is at a distance of (15 − 5) cm = 10 cm to the right of the second
lens. Though the image is real, it serves as a virtual object for the
second lens, which means that the rays appear to come from it for
the second lens.
\[
\frac{1}{v_2} - \frac{1}{10} = \frac{1}{-10}
\]
or v2 = ∞
The virtual image is formed at an infinite distance to the left of the
second lens. This acts as an object for the third lens.
\[
\frac{1}{v_3} - \frac{1}{u_3} = \frac{1}{f_3}
\]
or
\[
\frac{1}{v_3} = \frac{1}{\infty} + \frac{1}{30}
\]
or v3 = 30 cm
The final image is formed 30 cm to the right of the third lens.
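
The step-by-step procedure of Example 9.9, in which the image of one lens becomes the object of the next, can be written as a small loop. The Python sketch below is illustrative only. The focal lengths, the object distance and the 5 cm gap are read from the solution above; the 10 cm gap after the second lens is an assumption, because Fig. 9.22 is not reproduced here, and it does not affect the result since the intermediate image lies at infinity.

```python
import math


def lens_image(u, f, tol=1e-12):
    """Image distance from 1/v - 1/u = 1/f; math.inf if rays emerge parallel."""
    inv_u = 0.0 if math.isinf(u) else 1.0 / u
    inv_v = 1.0 / f + inv_u
    return math.inf if abs(inv_v) < tol else 1.0 / inv_v


def trace(u_first, lenses):
    """lenses is a list of (focal_length, gap_to_next_lens), lengths in cm."""
    u, images = u_first, []
    for f, gap in lenses:
        v = lens_image(u, f)
        images.append(v)
        u = v - gap  # the same image, measured from the next lens
    return images


# (f, gap): +10 cm then 5 cm, -10 cm then 10 cm (assumed), +30 cm (last lens)
images = trace(-30.0, [(10.0, 5.0), (-10.0, 10.0), (30.0, 0.0)])
print(images)
```

The three entries correspond to v1, v2 and v3 of the worked solution: about 15 cm, infinity, and about 30 cm.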

9.6 REFRACTION THROUGH A PRISM


Figure 9.23 shows the passage of light through a triangular prism ABC.
The angles of incidence and refraction at the first face AB are i and r1,
while the angle of incidence (from glass to air) at the second face AC is r2
and the angle of refraction or emergence e. The angle between the
emergent ray RS and the direction of the incident ray PQ is called the
angle of deviation, δ.
In the quadrilateral AQNR, two of the angles (at the vertices Q and R) are
right angles. Therefore, the sum of the other angles of the quadrilateral
is 180°.
\[
\angle A + \angle QNR = 180^\circ
\]
From the triangle QNR,
\[
r_1 + r_2 + \angle QNR = 180^\circ
\]
Comparing these two equations, we get
\[
r_1 + r_2 = A \tag{9.34}
\]
The total deviation δ is the sum of deviations at the two faces,
\[
\delta = (i - r_1) + (e - r_2)
\]
that is,
\[
\delta = i + e - A \tag{9.35}
\]

FIGURE 9.23 A ray of light passing through a triangular glass prism.

Thus, the angle of deviation depends on the angle of incidence. A plot
between the angle of deviation and angle of incidence is shown in
Fig. 9.24. You can see that, in general, any given value of δ, except for
i = e, corresponds to two values of i and hence of e. This, in fact, is expected
from the symmetry of i and e in Eq. (9.35), i.e., δ remains the same if i
and e are interchanged. Physically, this is related to the fact that the
path of ray in Fig. 9.23 can be traced back, resulting in the same angle of
deviation. At the minimum deviation Dm, the refracted ray inside the
prism becomes parallel to its base. We have
δ = Dm, i = e which implies r1 = r2.
Equation (9.34) gives
\[
2r = A \quad \text{or} \quad r = \frac{A}{2} \tag{9.36}
\]
In the same way, Eq. (9.35) gives
\[
D_m = 2i - A, \quad \text{or} \quad i = \frac{A + D_m}{2} \tag{9.37}
\]
The refractive index of the prism is
\[
n_{21} = \frac{n_2}{n_1} = \frac{\sin[(A + D_m)/2]}{\sin[A/2]} \tag{9.38}
\]

FIGURE 9.24 Plot of angle of deviation (δ) versus angle of incidence (i) for a
triangular prism.


The angles A and Dm can be measured experimentally. Equation (9.38)
thus provides a method of determining refractive index of the material of
the prism.
For a small angle prism, i.e., a thin prism, Dm is also very small, and
we get
\[
n_{21} = \frac{\sin[(A + D_m)/2]}{\sin[A/2]} \simeq \frac{(A + D_m)/2}{A/2}
\]
\[
D_m = (n_{21} - 1)A
\]
It implies that thin prisms do not deviate light much.
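
Equation (9.38) and the behaviour of the deviation curve of Fig. 9.24 can be explored numerically. The Python sketch below is a hedged illustration: the function names are our own, and the prism angle of 60° with a minimum deviation of 40° is an assumed example, not data from the text. The function `deviation` traces the ray with Snell's law at both faces, using r1 + r2 = A and δ = i + e − A.

```python
import math


def index_from_min_deviation(A_deg, Dm_deg):
    """Refractive index from Eq. (9.38): n = sin[(A + Dm)/2] / sin[A/2]."""
    A, Dm = math.radians(A_deg), math.radians(Dm_deg)
    return math.sin((A + Dm) / 2) / math.sin(A / 2)


def deviation(i_deg, A_deg, n):
    """Deviation delta = i + e - A (degrees) for a prism in air."""
    A = math.radians(A_deg)
    r1 = math.asin(math.sin(math.radians(i_deg)) / n)  # first face
    r2 = A - r1                                        # from r1 + r2 = A
    sin_e = n * math.sin(r2)                           # second face
    if abs(sin_e) > 1:
        raise ValueError("total internal reflection at the second face")
    return i_deg + math.degrees(math.asin(sin_e)) - A_deg


n = index_from_min_deviation(60.0, 40.0)  # assumed measurements
for i in (40.0, 50.0, 60.0):
    print(i, round(deviation(i, 60.0, n), 2))
```

For these assumed values the index is about 1.53, the deviation at i = (A + Dm)/2 = 50° returns the minimum value of 40°, and the deviation is larger on either side of it.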

9.7 DISPERSION BY A PRISM

It has been known for a long time that when a narrow beam of sunlight,
usually called white light, is incident on a glass prism, the emergent
light is seen to be consisting of several colours. There is actually a
continuous variation of colour, but broadly, the different component
colours that appear in sequence are: violet, indigo, blue, green, yellow,
orange and red (given by the acronym VIBGYOR). The red light bends the
least, while the violet light bends the most (Fig. 9.25).
The phenomenon of splitting of light into its component colours is known
as dispersion. The pattern of colour components of light is called the
spectrum of light. The word spectrum is now used in a much more general
sense: we discussed in Chapter 8 the electromagnetic spectrum over the
large range of wavelengths, from γ-rays to radio waves, of which the
spectrum of light (visible spectrum) is only a small part.

FIGURE 9.25 Dispersion of sunlight or white light on passing through a glass
prism. The relative deviation of different colours shown is highly exaggerated.

Though the reason for appearance of spectrum is now common knowledge,
it was a matter of much debate in the history of physics. Does the prism
itself create colour in some way or does it only separate the colours
already present in white light?
In a classic experiment known for its simplicity but great significance,
Isaac Newton settled the issue once and for all. He put another similar prism,
but in an inverted position, and let the emergent beam from the first
prism fall on the second prism (Fig. 9.26). The resulting emergent beam
was found to be white light. The explanation was clear: the first prism
splits the white light into its component colours, while the inverted prism
recombines them to give white light. Thus, white light itself consists of
light of different colours, which are separated by the prism.
It must be understood here that a ray of light, as defined mathematically,
does not exist. An actual ray is really a beam of many rays of light.
Each ray splits into component colours when it enters the glass prism.
When those coloured rays come out on the other side, they again produce a
white beam.

FIGURE 9.26 Schematic diagram of Newton's classic experiment on
dispersion of white light.

We now know that colour is associated with wavelength of light. In the
visible spectrum, red light is at the long wavelength end (~700 nm) while
the violet light is at the short wavelength end (~400 nm). Dispersion takes
place because the refractive index of medium for different wavelengths
(colours) is different. For example, the bending of red component of white
light is least while it is most for the violet. Equivalently, red light travels
faster than violet light in a glass prism. Table 9.2 gives the refractive
indices for different wavelengths for crown glass and flint glass. Thick
lenses can be regarded as made of many prisms; therefore, thick lenses
show chromatic aberration due to dispersion of light.

TABLE 9.2 REFRACTIVE INDICES FOR DIFFERENT WAVELENGTHS

Colour    Wavelength (nm)    Crown glass    Flint glass
Violet    396.9              1.533          1.663
Blue      486.1              1.523          1.639
Yellow    589.3              1.517          1.627
Red       656.3              1.515          1.622
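
The entries of Table 9.2 can be used to check two statements made above: that red light travels faster than violet light in glass, and that violet is deviated more than red. The Python sketch below is illustrative only. It uses the crown glass column, the approximate value c = 3 × 10⁸ m/s, and the thin prism result Dm = (n − 1)A; the prism angle of 5° is an assumed value.

```python
C = 3.0e8  # speed of light in vacuum in m/s (approximate)

# Crown glass refractive indices from Table 9.2
CROWN = {"violet": 1.533, "blue": 1.523, "yellow": 1.517, "red": 1.515}


def speed_in_medium(n):
    """Speed of light in a medium of refractive index n."""
    return C / n


def thin_prism_deviation(n, A_deg):
    """Deviation (degrees) by a thin prism of angle A: Dm = (n - 1) A."""
    return (n - 1.0) * A_deg


A_DEG = 5.0  # assumed thin prism angle
spread = thin_prism_deviation(CROWN["violet"], A_DEG) - thin_prism_deviation(
    CROWN["red"], A_DEG
)
print(speed_in_medium(CROWN["red"]), speed_in_medium(CROWN["violet"]))
print(f"angular spread between violet and red: {spread:.2f} degrees")
```

For this assumed prism the spread between violet and red is only about 0.09°, which shows why the relative deviation of colours in Fig. 9.25 has to be drawn highly exaggerated.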

The variation of refractive index with wavelength may be more
pronounced in some media than in others. In vacuum, of course, the
speed of light is independent of wavelength. Thus, vacuum (or air
approximately) is a non-dispersive medium in which all colours travel
with the same speed. This also follows from the fact that sunlight reaches
us in the form of white light and not as its components. On the other
hand, glass is a dispersive medium.

9.8 SOME NATURAL PHENOMENA DUE TO SUNLIGHT
The interplay of light with things around us gives rise to several beautiful
phenomena. The spectacle of colour that we see around us all the time is
possible only due to sunlight. The blue of the sky, white clouds, the red
hue at sunrise and sunset, the rainbow, the brilliant colours of some
pearls, shells, and wings of birds, are just a few of the natural wonders
we are used to. We describe some of them here from the point of view
of physics.

9.8.1 The rainbow
The rainbow is an example of the dispersion of sunlight by the water
drops in the atmosphere. This is a phenomenon due to combined effect
of dispersion, refraction and reflection of sunlight by spherical water
droplets of rain. The conditions for observing a rainbow are that the sun
should be shining in one part of the sky (say near western horizon) while
it is raining in the opposite part of the sky (say eastern horizon).
An observer can therefore see a rainbow only when his back is towards
the sun.
In order to understand the formation of rainbows, consider
Fig. 9.27(a). Sunlight is first refracted as it enters a raindrop, which
causes the different wavelengths (colours) of white light to separate.
Longer wavelengths of light (red) are bent the least while the shorter
wavelengths (violet) are bent the most. Next, these component rays strike
the inner surface of the water drop and get internally reflected if the angle
between the refracted ray and normal to the drop surface is greater than
the critical angle (48°, in this case). The reflected light is refracted again
as it comes out of the drop as shown in the figure. It is found that the
violet light emerges at an angle of 40° relative to the incoming sunlight
and red light emerges at an angle of 42°. For other colours, angles lie in
between these two values.

Formation of rainbows:
http://www.eo.ucar.edu/rainbows
http://www.atoptics.co.uk/bows.htm

FIGURE 9.27 Rainbow: (a) The sun rays incident on a water drop get refracted twice
and reflected internally by a drop; (b) Enlarged view of internal reflection and
refraction of a ray of light inside a drop forming the primary rainbow; and
(c) secondary rainbow is formed by rays undergoing internal reflection twice
inside the drop.

Figure 9.27(b) explains the formation of primary rainbow. We see
that red light from drop 1 and violet light from drop 2 reach the observer's
eye. The violet from drop 1 and red light from drop 2 are directed at level
above or below the observer. Thus the observer sees a rainbow with
red colour on the top and violet on the bottom. Thus, the primary
rainbow is a result of a three-step process, that is, refraction, reflection
and refraction.
When light rays undergo two internal reflections inside a raindrop,
instead of one as in the primary rainbow, a secondary rainbow is formed
as shown in Fig. 9.27(c). It is due to a four-step process. The intensity of
light is reduced at the second reflection and hence the secondary rainbow
is fainter than the primary rainbow. Further, the order of the colours is
reversed in it as is clear from Fig. 9.27(c).

9.8.2 Scattering of light


As sunlight travels through the earth's atmosphere, it gets scattered
(changes its direction) by the atmospheric particles. Light of shorter
wavelengths is scattered much more than light of longer wavelengths.
(The amount of scattering is inversely proportional to the fourth power
of the wavelength. This is known as Rayleigh scattering.) Hence, the bluish
colour predominates in a clear sky, since blue has a shorter wavelength
than red and is scattered much more strongly. In fact, violet
gets scattered even more than blue, having a shorter wavelength.
But since our eyes are more sensitive to blue than violet, we see the
sky blue.
Large particles like dust and water droplets present in the atmosphere
behave differently. The relevant quantity here is the relative size of the
wavelength of light λ, and the scatterer (of typical size, say, a). For
a << λ, one has Rayleigh scattering which is proportional to 1/λ⁴.
For a >> λ, i.e., large scattering objects (for example, raindrops, large
dust or ice particles) this is not true; all wavelengths are scattered nearly
equally. Thus, clouds which have droplets of water with a >> λ are
generally white.

FIGURE 9.28 Sunlight travels through a longer distance in the atmosphere at
sunset and sunrise.

At sunset or sunrise, the sun's rays have to pass through a larger
distance in the atmosphere (Fig. 9.28). Most of the blue and other shorter
wavelengths are removed by scattering. The least scattered light (red)
reaches our eyes; therefore, the sun looks reddish. This explains the
reddish appearance of the sun and full moon near the horizon.
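
The strength of the 1/λ⁴ dependence is easier to appreciate with numbers. The Python sketch below is a simple illustration; the function name `rayleigh_ratio` is our own, and the wavelengths of about 400 nm (violet) and 700 nm (red) are the approximate end points of the visible spectrum quoted earlier in this chapter.

```python
def rayleigh_ratio(short_nm, long_nm):
    """How many times more strongly the shorter wavelength is scattered,
    assuming Rayleigh scattering proportional to 1/lambda^4."""
    return (long_nm / short_nm) ** 4


ratio = rayleigh_ratio(400.0, 700.0)
print(f"400 nm light is scattered about {ratio:.1f} times more than 700 nm light")
```

Under the Rayleigh assumption (scatterers much smaller than the wavelength), light at the violet end is scattered roughly nine times more strongly than light at the red end.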

9.9 OPTICAL INSTRUMENTS


A number of optical devices and instruments have been designed utilising
reflecting and refracting properties of mirrors, lenses and prisms.
Periscope, kaleidoscope, binoculars, telescopes, microscopes are some
examples of optical devices and instruments that are in common use.
Our eye is, of course, one of the most important optical devices that nature
has endowed us with. Starting with the eye, we then go on to describe
the principles of working of the microscope and the telescope.

9.9.1 The eye


Figure 9.29 (a) shows the eye. Light enters the eye through a curved
front surface, the cornea. It passes through the pupil which is the central
hole in the iris. The size of the pupil can change under control of muscles.
The light is further focussed by the eye lens on the retina. The retina is a
film of nerve fibres covering the curved back surface of the eye. The retina
contains rods and cones which sense light intensity and colour,
respectively, and transmit electrical signals via the optic nerve to the brain
which finally processes this information. The shape (curvature) and
therefore the focal length of the lens can be modified somewhat by the
ciliary muscles. For example, when the muscle is relaxed, the focal length
is about 2.5 cm and objects at infinity are in sharp focus on the retina.
When the object is brought closer to the eye, in order to maintain the
same image-lens distance (≅ 2.5 cm), the focal length of the eye lens
becomes shorter by the action of the ciliary muscles. This property of the
eye is called accommodation. If the object is too close to the eye, the lens
cannot curve enough to focus the image on to the retina, and the image
is blurred. The closest distance for which the lens can focus light on the
retina is called the least distance of distinct vision, or the near point.
The standard value for normal vision is taken as 25 cm. (Often the near
point is given the symbol D.) This distance increases with age, because
of the decreasing effectiveness of the ciliary muscle and the loss of
flexibility of the lens. The near point may be as close as about 7 to 8 cm
in a child ten years of age, and may increase to as much as 200 cm at 60
years of age. Thus, if an elderly person tries to read a book at about 25 cm
from the eye, the image appears blurred. This condition (defect of the eye)
is called presbyopia. It is corrected by using a converging lens for reading.
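
Accommodation can be given a rough numerical feel by treating the eye as a single thin lens with a fixed image distance of 2.5 cm. This is a simplifying assumption made only for this sketch (the real eye is a compound system with the cornea doing much of the focusing), and the function name `eye_focal_length_cm` is our own.

```python
def eye_focal_length_cm(object_distance_cm, image_distance_cm=2.5):
    """Focal length needed to focus an object on the retina, using the
    thin lens formula 1/f = 1/v - 1/u with u negative (object in front)."""
    u = -object_distance_cm
    v = image_distance_cm
    return 1.0 / (1.0 / v - 1.0 / u)


f_near = eye_focal_length_cm(25.0)   # object at the normal near point
f_far = eye_focal_length_cm(1.0e9)   # very distant object
print(round(f_near, 2), round(f_far, 2))
```

In this simplified model the focal length has to shorten from about 2.5 cm for a distant object to about 2.27 cm for an object at 25 cm.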
Thus, our eyes are marvellous organs that have the capability to
interpret incoming electromagnetic waves as images through a complex
process. These are our greatest assets and we must take proper care to
protect them. Imagine the world without a pair of functional eyes. Yet
many amongst us bravely face this challenge by effectively overcoming
their limitations to lead a normal life. They deserve our appreciation for
their courage and conviction.
In spite of all precautions and proactive action, our eyes may develop
some defects due to various reasons. We shall restrict our discussion to
some common optical defects of the eye. For example, the light from a
distant object arriving at the eye-lens may get converged at a point in
front of the retina. This type of defect is called nearsightedness or myopia.
This means that the eye is producing too much convergence in the incident
beam. To compensate for this, we interpose a concave lens between the eye
and the object, with the diverging effect desired to get the image focussed
on the retina [Fig. 9.29(b)].

FIGURE 9.29 (a) The structure of the eye; (b) shortsighted or myopic eye and its correction;
(c) farsighted or hypermetropic eye and its correction; and (d) astigmatic eye and its correction.

Similarly, if the eye-lens focusses the incoming light at a point behind
the retina, a convergent lens is needed to compensate for the defect in vision.
This defect is called farsightedness or hypermetropia [Fig. 9.29(c)].
Another common defect of vision is called astigmatism. This occurs
when the cornea is not spherical in shape. For example, the cornea could
have a larger curvature in the vertical plane than in the horizontal plane
or vice-versa. If a person with such a defect in eye-lens looks at a wire
mesh or a grid of lines, focussing in either the vertical or the horizontal
plane may not be as sharp as in the other plane. Astigmatism results in
lines in one direction being well focussed while those in a perpendicular
direction may appear distorted [Fig. 9.29(d)]. Astigmatism can be
corrected by using a cylindrical lens of desired radius of curvature with
an appropriately directed axis. This defect can occur along with myopia
or hypermetropia.
Example 9.10 What focal length should the reading spectacles have
for a person for whom the least distance of distinct vision is 50 cm?
Solution The distance of normal vision is 25 cm. So if a book is at
u = −25 cm, its image should be formed at v = −50 cm. Therefore, the
desired focal length is given by
\[
\frac{1}{f} = \frac{1}{v} - \frac{1}{u}
\]
or
\[
\frac{1}{f} = \frac{1}{-50} - \frac{1}{-25} = \frac{1}{50}
\]
or f = +50 cm (convex lens).


Example 9.11
(a) The far point of a myopic person is 80 cm in front of the eye. What
is the power of the lens required to enable him to see very distant
objects clearly?
(b) In what way does the corrective lens help the above person? Does
the lens magnify very distant objects? Explain carefully.
(c) The above person prefers to remove his spectacles while reading
a book. Explain why?
Solution
(a) Solving as in the previous example, we find that the person should
use a concave lens of focal length = −80 cm, i.e., of power = −1.25
dioptres.
(b) No. The concave lens, in fact, reduces the size of the object, but
the angle subtended by the distant object at the eye is the same
as the angle subtended by the image (at the far point) at the eye.
The eye is able to see distant objects not because the corrective
lens magnifies the object, but because it brings the object (i.e., it
produces virtual image of the object) at the far point of the eye
which then can be focussed by the eye-lens on the retina.
(c) The myopic person may have a normal near point, i.e., about
25 cm (or even less). In order to read a book with the spectacles,
such a person must keep the book at a distance greater than
25 cm so that the image of the book by the concave lens is produced
not closer than 25 cm. The angular size of the book (or its image)
at the greater distance is evidently less than the angular size
when the book is placed at 25 cm and no spectacles are needed.
Hence, the person prefers to remove the spectacles while reading.

Example 9.12 (a) The near point of a hypermetropic person is 75 cm
from the eye. What is the power of the lens required to enable the
person to read clearly a book held at 25 cm from the eye? (b) In what
way does the corrective lens help the above person? Does the lens
magnify objects held near the eye? (c) The above person prefers to
remove the spectacles while looking at the sky. Explain why?
Solution
(a) u = −25 cm, v = −75 cm
1/f = 1/25 − 1/75, i.e., f = 37.5 cm.
The corrective lens needs to have a converging power of +2.67
dioptres.
(b) The corrective lens produces a virtual image (at 75 cm) of an object
at 25 cm. The angular size of this image is the same as that of the
object. In this sense the lens does not magnify the object but merely
brings the object to the near point of the hypermetropic eye, which
then gets focussed on the retina. However, the angular size is
greater than that of the same object at the near point (75 cm)
viewed without the spectacles.
(c) A hypermetropic eye may have normal far point i.e., it may have
enough converging power to focus parallel rays from infinity on
the retina of the shortened eyeball. Wearing spectacles of converging
lenses (used for near vision) will amount to more converging power
than needed for parallel rays. Hence the person prefers not to use
the spectacles for far objects.
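
The three spectacle examples above all reduce to the thin-lens formula 1/f = 1/v − 1/u with Cartesian signs. The following is a minimal sketch of that arithmetic; the function names are illustrative and not part of the text, and distances are assumed to be in centimetres.

```python
import math

def lens_focal_length_cm(u_cm, v_cm):
    """Thin-lens formula 1/f = 1/v - 1/u, Cartesian sign convention (cm)."""
    return 1.0 / (1.0 / v_cm - 1.0 / u_cm)

def power_dioptre(f_cm):
    """Power in dioptres is 1/f with f in metres."""
    return 100.0 / f_cm

# Example 9.10: book at 25 cm, virtual image wanted at 50 cm.
f_910 = lens_focal_length_cm(u_cm=-25.0, v_cm=-50.0)

# Example 9.11: very distant object, virtual image wanted at the far point (80 cm).
f_911 = lens_focal_length_cm(u_cm=-math.inf, v_cm=-80.0)

# Example 9.12: book at 25 cm, virtual image wanted at the near point (75 cm).
f_912 = lens_focal_length_cm(u_cm=-25.0, v_cm=-75.0)

for label, f in [("9.10", f_910), ("9.11", f_911), ("9.12", f_912)]:
    print(f"Example {label}: f = {f:+.1f} cm, P = {power_dioptre(f):+.2f} D")
```

A positive focal length indicates a converging (convex) lens and a negative one a diverging (concave) lens, which matches the sign of the power quoted in each solution.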

9.9.2 The microscope
A simple magnifier or microscope is a converging lens of small focal length
(Fig. 9.30). In order to use such a lens as a microscope, the lens is held
near the object, one focal length away or less, and
the eye is positioned close to the lens on the other
side. The idea is to get an erect, magnified and
virtual image of the object at a distance so that it
can be viewed comfortably, i.e., at 25 cm or more.
If the object is at a distance f, the image is at
infinity. However, if the object is at a distance
slightly less than the focal length of the lens, the
image is virtual and closer than infinity. Although
the closest comfortable distance for viewing the
image is when it is at the near point (distance
D ≅ 25 cm), it causes some strain on the eye.
Therefore, the image formed at infinity is often
considered most suitable for viewing by the relaxed
eye. We show both cases, the first in Fig. 9.30(a),
and the second in Fig. 9.30(b) and (c).
The linear magnification m, for the image
formed at the near point D, by a simple microscope
can be obtained by using the relation
$m = \frac{v}{u} = v\left(\frac{1}{v} - \frac{1}{f}\right) = 1 - \frac{v}{f}$
Now according to our sign convention, v is
negative, and is equal in magnitude to D. Thus,
the magnification is
$m = 1 + \frac{D}{f}$    (9.39)

Since D is about 25 cm, to have a magnification of
six, one needs a convex lens of focal length,
f = 5 cm.
Note that m = h′/h where h is the size of the
object and h′ the size of the image. This is also the
ratio of the angle subtended by the image
to that subtended by the object, if placed at D for
comfortable viewing. (Note that this is not the angle
actually subtended by the object at the eye, which
is h/u.) What a single-lens simple magnifier
achieves is that it allows the object to be brought closer to the eye than D.

FIGURE 9.30 A simple microscope; (a) the magnifying lens is located such that the
image is at the near point, (b) the angle subtended by the object, is the same as
that at the near point, and (c) the object near the focal point of the lens; the image
is far off but closer than infinity.

We will now find the magnification when the image is at infinity. In
this case we will have to obtain the angular magnification. Suppose
the object has a height h. The maximum angle it can subtend, and be
clearly visible (without a lens), is when it is at the near point, i.e., a distance
D. The angle subtended is then given by
$\tan\theta_o = \frac{h}{D} \approx \theta_o$    (9.40)

We now find the angle subtended at the eye by the image when the
object is at u. From the relations
$\frac{h'}{h} = m = \frac{v}{u}$
we have the angle subtended by the image
$\tan\theta_i = \frac{h'}{-v} = \frac{h}{-v}\cdot\frac{v}{u} = \frac{h}{-u} \approx \theta_i$
The angle subtended by the object, when it is at u = −f, is
$\theta_i = \frac{h}{f}$    (9.41)
as is clear from Fig. 9.30(c). The angular magnification is, therefore
$m = \frac{\theta_i}{\theta_o} = \frac{D}{f}$    (9.42)

This is one less than the magnification when the image is at the near
point, Eq. (9.39), but the viewing is more comfortable and the difference
in magnification is usually small. In subsequent discussions of optical
instruments (microscope and telescope) we shall assume the image to be
at infinity.
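
The two magnification results for the simple microscope, Eq. (9.39) for the image at the near point and Eq. (9.42) for the image at infinity, can be compared numerically. This is a small sketch only; the function names are hypothetical and D = 25 cm is taken from the text.

```python
D_CM = 25.0  # least distance of distinct vision used in the text

def magnification_near_point(f_cm, d_cm=D_CM):
    """Eq. (9.39): m = 1 + D/f, final image at the near point."""
    return 1.0 + d_cm / f_cm

def magnification_at_infinity(f_cm, d_cm=D_CM):
    """Eq. (9.42): m = D/f, final image at infinity (relaxed eye)."""
    return d_cm / f_cm

f_cm = 5.0  # the focal length quoted in the text for a magnification of six
m_near = magnification_near_point(f_cm)
m_inf = magnification_at_infinity(f_cm)
print(m_near, m_inf, m_near - m_inf)
```

For any focal length the two values differ by exactly one, which is why the difference matters less as the magnification grows.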
A simple microscope has a limited maximum magnification (≤ 9) for
realistic focal lengths. For much larger magnifications, one uses two
lenses, one compounding the effect of the other. This is known as a
compound microscope. A schematic diagram of
a compound microscope is shown in Fig. 9.31.
The lens nearest the object, called the objective,
forms a real, inverted, magnified image of the
object. This serves as the object for the second
lens, the eyepiece, which functions essentially
like a simple microscope or magnifier and produces
the final image, which is enlarged and virtual.
The first inverted image is thus near (at or
within) the focal plane of the eyepiece, at a
distance appropriate for final image formation
at infinity, or a little closer for image formation
at the near point. Clearly, the final image is
inverted with respect to the original object.
FIGURE 9.31 Ray diagram for the formation of image by a compound microscope.

We now obtain the magnification due to a
compound microscope. The ray diagram of
Fig. 9.31 shows that the (linear) magnification
due to the objective, namely h′/h, equals
$m_O = \frac{h'}{h} = \frac{L}{f_o}$    (9.43)
where we have used the result
$\tan\beta = \frac{h}{f_o} = \frac{h'}{L}$
Here h′ is the size of the first image, the object size being h and fo
being the focal length of the objective. The first image is formed near the
focal point of the eyepiece. The distance L, i.e., the distance between the
second focal point of the objective and the first focal point of the eyepiece
(focal length fe), is called the tube length of the compound microscope.
As the first inverted image is near the focal point of the eyepiece, we
use the result from the discussion above for the simple microscope to
obtain the (angular) magnification me due to it [Eq. (9.39)]. When the
final image is formed at the near point, it is
$m_e = 1 + \frac{D}{f_e}$    [9.44(a)]
When the final image is formed at infinity, the angular magnification
due to the eyepiece [Eq. (9.42)] is
$m_e = \frac{D}{f_e}$    [9.44(b)]

Thus, the total magnification [according to Eq. (9.33)], when the
image is formed at infinity, is
$m = m_o m_e = \frac{L}{f_o}\cdot\frac{D}{f_e}$    (9.45)
Clearly, to achieve a large magnification of a small object (hence the
name microscope), the objective and eyepiece should have small focal
lengths. In practice, it is difficult to make the focal length much smaller
than 1 cm. Also large lenses are required to make L large.
For example, with an objective with fo = 1.0 cm, and an eyepiece with
focal length fe = 2.0 cm, and a tube length of 20 cm, the magnification is
$m = m_o m_e = \frac{L}{f_o}\cdot\frac{D}{f_e} = \frac{20}{1}\times\frac{25}{2} = 250$
Various other factors such as illumination of the object, contribute to
the quality and visibility of the image. In modern microscopes, multicomponent lenses are used for both the objective and the eyepiece to
improve image quality by minimising various optical aberrations (defects)
in lenses.

The world's largest optical telescopes: http://astro.nineplanets.org/bigeyes.html
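
The compound-microscope result can be checked with a short calculation. The sketch below assumes the approximations used in the text (objective magnification L/fo, eyepiece magnification D/fe or 1 + D/fe); the function name and its keyword argument are illustrative only.

```python
D_CM = 25.0  # least distance of distinct vision, as in the text

def compound_magnification(tube_length_cm, f_objective_cm, f_eyepiece_cm,
                           image_at_infinity=True, d_cm=D_CM):
    """Total magnification m = m_o * m_e of a compound microscope."""
    m_o = tube_length_cm / f_objective_cm          # Eq. (9.43)
    if image_at_infinity:
        m_e = d_cm / f_eyepiece_cm                 # Eq. 9.44(b)
    else:
        m_e = 1.0 + d_cm / f_eyepiece_cm           # Eq. 9.44(a)
    return m_o * m_e

# Values from the worked example: fo = 1.0 cm, fe = 2.0 cm, L = 20 cm.
m_relaxed = compound_magnification(20.0, 1.0, 2.0)
m_near = compound_magnification(20.0, 1.0, 2.0, image_at_infinity=False)
print(m_relaxed, m_near)
```

With the final image at infinity this reproduces the value 250 from the text; moving the final image to the near point raises the eyepiece contribution by one.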

9.9.3 Telescope
The telescope is used to provide angular magnification of distant objects
(Fig. 9.32). It also has an objective and an eyepiece. But here, the objective
has a large focal length and a much larger aperture than the eyepiece.
Light from a distant object enters the objective and a real image is formed
in the tube at its second focal point. The eyepiece magnifies this image
producing a final inverted image. The magnifying power m is the ratio of
the angle β subtended at the eye by the final image to the angle α which
the object subtends at the lens or the eye. Hence
$m \approx \frac{\beta}{\alpha} \approx \frac{h}{f_e}\cdot\frac{f_o}{h} = \frac{f_o}{f_e}$    (9.46)
In this case, the length of the telescope tube is fo + fe.

Terrestrial telescopes have, in
addition, a pair of inverting lenses to
make the final image erect. Refracting
telescopes can be used both for
terrestrial and astronomical
observations. For example, consider
a telescope whose objective has a focal
length of 100 cm and the eyepiece a
focal length of 1 cm. The magnifying
power of this telescope is
m = 100/1 = 100.
Let us consider a pair of stars of
actual separation 1′ (one minute of
arc). The stars appear as though they
are separated by an angle of 100 × 1′
= 100′ = 1.67°.

FIGURE 9.32 A refracting telescope.

The main considerations with an astronomical telescope are its light
gathering power and its resolution or resolving power. The former clearly
depends on the area of the objective. With larger diameters, fainter objects
can be observed. The resolving power, or the ability to observe two objects
distinctly, which are in very nearly the same direction, also depends on
the diameter of the objective. So, the desirable aim in optical telescopes
is to make them with objective of large diameter. The largest lens objective
in use has a diameter of 40 inch (~1.02 m). It is at the Yerkes Observatory
in Wisconsin, USA. Such big lenses tend to be very heavy and therefore,
difficult to make and support by their edges. Further, it is rather difficult
and expensive to make such large sized lenses which form images that
are free from any kind of chromatic aberration and distortions.
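
The telescope relations discussed above (magnifying power fo/fe, tube length fo + fe, and light gathering that scales with the area of the objective) can be sketched as follows. The function names are illustrative, and the 10 inch comparison aperture is a hypothetical value chosen only to show the scaling.

```python
def magnifying_power(f_objective_cm, f_eyepiece_cm):
    """Eq. (9.46): m = fo / fe."""
    return f_objective_cm / f_eyepiece_cm

def tube_length_cm(f_objective_cm, f_eyepiece_cm):
    """Length of the telescope tube, fo + fe."""
    return f_objective_cm + f_eyepiece_cm

def apparent_separation_deg(true_separation_arcmin, m):
    """Apparent angular separation in degrees (60 arcminutes per degree)."""
    return true_separation_arcmin * m / 60.0

def light_gathering_ratio(diameter_a, diameter_b):
    """Ratio of collecting areas of two circular objectives (same units)."""
    return (diameter_a / diameter_b) ** 2

m = magnifying_power(100.0, 1.0)           # the example in the text
length = tube_length_cm(100.0, 1.0)
separation = apparent_separation_deg(1.0, m)
ratio = light_gathering_ratio(40.0, 10.0)  # 40 inch lens vs hypothetical 10 inch lens
print(m, length, round(separation, 2), ratio)
```

Doubling the diameter of the objective quadruples the collected light, which is the quantitative reason large apertures are sought.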
For these reasons, modern telescopes use a concave mirror rather
than a lens for the objective. Telescopes with mirror objectives are called
reflecting telescopes. They have several advantages. First, there is no
chromatic aberration in a mirror. Second, if a parabolic reflecting surface
is chosen, spherical aberration is also removed. Mechanical support is
much less of a problem since a mirror weighs much less than a lens of
equivalent optical quality, and can be
supported over its entire back surface, not
just over its rim. One obvious problem with a
reflecting telescope is that the objective mirror
focusses light inside the telescope tube. One
must have an eyepiece and the observer right
there, obstructing some light (depending on
the size of the observer cage). This is what is
done in the very large 200 inch (~5.08 m)
diameter Mt. Palomar telescope, California.
The viewer sits near the focal point of the
mirror, in a small cage. Another solution to
the problem is to deflect the light being
focussed by another mirror. One such
arrangement using a convex secondary mirror to focus the incident light,
which now passes through a hole in the objective primary mirror, is shown
in Fig. 9.33. This is known as a Cassegrain telescope, after its inventor.
It has the advantages of a large focal length in a short telescope. The
largest telescope in India is in Kavalur, Tamil Nadu. It is a 2.34 m diameter
reflecting telescope (Cassegrain). It was ground, polished, set up, and is
being used by the Indian Institute of Astrophysics, Bangalore. The largest
reflecting telescopes in the world are the pair of Keck telescopes in Hawaii,
USA, with a reflector of 10 metre in diameter.

FIGURE 9.33 Schematic diagram of a reflecting telescope (Cassegrain).

SUMMARY
1. Reflection is governed by the equation ∠i = ∠r′ and refraction by
Snell's law, sin i/sin r = n, where the incident ray, reflected ray, refracted
ray and normal lie in the same plane. Angles of incidence, reflection
and refraction are i, r′ and r, respectively.
2. The critical angle of incidence ic for a ray incident from a denser to rarer
medium, is that angle for which the angle of refraction is 90°. For
i > ic, total internal reflection occurs. Multiple internal reflections in
diamond (ic ≅ 24.4°), totally reflecting prisms and mirage, are some
examples of total internal reflection. Optical fibres consist of glass
fibres coated with a thin layer of material of lower refractive index.
Light incident at an angle at one end comes out at the other, after
multiple internal reflections, even if the fibre is bent.
3. Cartesian sign convention: Distances measured in the same direction
as the incident light are positive; those measured in the opposite
direction are negative. All distances are measured from the pole/optic
centre of the mirror/lens on the principal axis. The heights measured
upwards above x-axis and normal to the principal axis of the mirror/
lens are taken as positive. The heights measured downwards are taken
as negative.
4. Mirror equation:
$\frac{1}{v} + \frac{1}{u} = \frac{1}{f}$
where u and v are object and image distances, respectively and f is the
focal length of the mirror. f is (approximately) half the radius of
curvature R. f is negative for concave mirror; f is positive for a convex
mirror.
5. For a prism of the angle A, of refractive index n2 placed in a medium
of refractive index n1,
$n_{21} = \frac{n_2}{n_1} = \frac{\sin\left[(A + D_m)/2\right]}{\sin(A/2)}$
where Dm is the angle of minimum deviation.
6. For refraction through a spherical interface (from medium 1 to 2 of
refractive index n1 and n2, respectively)
$\frac{n_2}{v} - \frac{n_1}{u} = \frac{n_2 - n_1}{R}$
Thin lens formula
$\frac{1}{v} - \frac{1}{u} = \frac{1}{f}$
Lens maker's formula
$\frac{1}{f} = \frac{n_2 - n_1}{n_1}\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$
R1 and R2 are the radii of curvature of the lens surfaces. f is positive
for a converging lens; f is negative for a diverging lens. The power of a
lens P = 1/f.
The SI unit for power of a lens is dioptre (D): 1 D = 1 m⁻¹.
If several thin lenses of focal length f1, f2, f3, ... are in contact, the
effective focal length of their combination, is given by
$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} + \frac{1}{f_3} + \cdots$
The total power of a combination of several lenses is
P = P1 + P2 + P3 + ...
7. Dispersion is the splitting of light into its constituent colours.
8. The Eye: The eye has a convex lens of focal length about 2.5 cm. This
focal length can be varied somewhat so that the image is always formed
on the retina. This ability of the eye is called accommodation. In a
defective eye, if the image is focussed before the retina (myopia), a
diverging corrective lens is needed; if the image is focussed beyond the
retina (hypermetropia), a converging corrective lens is needed.
Astigmatism is corrected by using cylindrical lenses.
9. Magnifying power m of a simple microscope is given by m = 1 + (D/f),
where D = 25 cm is the least distance of distinct vision and f is the
focal length of the convex lens. If the image is at infinity, m = D/f. For
a compound microscope, the magnifying power is given by
m = me × mo where me = 1 + (D/fe), is the magnification due to the
eyepiece and mo is the magnification produced by the objective.
Approximately,
$m = \frac{L}{f_o} \times \frac{D}{f_e}$
where fo and fe are the focal lengths of the objective and eyepiece,
respectively, and L is the distance between their focal points.
10. Magnifying power m of a telescope is the ratio of the angle β subtended
at the eye by the image to the angle α subtended at the eye by the
object.
$m = \frac{\beta}{\alpha} = \frac{f_o}{f_e}$
where f0 and fe are the focal lengths of the objective and eyepiece,
respectively.
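
A few of the summary relations can be turned into a quick numerical check. This is a sketch under stated assumptions: the function names are illustrative, the refractive index 2.42 for diamond is an assumed value not quoted in this summary, and the lens data are hypothetical.

```python
import math

def critical_angle_deg(n_denser, n_rarer=1.0):
    """Critical angle: sin(ic) = n_rarer / n_denser (refraction angle of 90 degrees)."""
    return math.degrees(math.asin(n_rarer / n_denser))

def lens_maker_focal_length(n2, n1, r1, r2):
    """Lens maker's formula: 1/f = ((n2 - n1)/n1) * (1/R1 - 1/R2)."""
    return 1.0 / (((n2 - n1) / n1) * (1.0 / r1 - 1.0 / r2))

def combined_focal_length(focal_lengths):
    """Thin lenses in contact: 1/f = 1/f1 + 1/f2 + ..."""
    return 1.0 / sum(1.0 / f for f in focal_lengths)

# Assumed refractive index of diamond (2.42); gives the ic of about 24.4 degrees quoted above.
ic_diamond = critical_angle_deg(2.42)

# Hypothetical symmetric biconvex lens in air: n = 1.5, R1 = +20 cm, R2 = -20 cm.
f_lens = lens_maker_focal_length(1.5, 1.0, 20.0, -20.0)

# Hypothetical pair in contact: a +50 cm lens with a -25 cm lens.
f_pair = combined_focal_length([50.0, -25.0])

print(round(ic_diamond, 1), round(f_lens, 1), round(f_pair, 1))
```

The signs follow the Cartesian convention of item 3, so a negative combined focal length indicates a diverging combination.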

POINTS TO PONDER
1. The laws of reflection and refraction are true for all surfaces and
pairs of media at the point of the incidence.
2. The real image of an object placed between f and 2f from a convex lens
can be seen on a screen placed at the image location. If the screen is
removed, is the image still there? This question puzzles many, because
it is difficult to reconcile ourselves with an image suspended in air
without a screen. But the image does exist. Rays from a given point
on the object are converging to an image point in space and diverging
away. The screen simply diffuses these rays, some of which reach our
eye and we see the image. This can be seen by the images formed in air
during a laser show.
3. Image formation needs regular reflection/refraction. In principle, all
rays from a given point should reach the same image point. This is
why you do not see your image by an irregular reflecting object, say
the page of a book.
4. Thick lenses give coloured images due to dispersion. The variety in
colour of objects we see around us is due to the constituent colours of
the light incident on them. A monochromatic light may produce an
entirely different perception about the colours on an object as seen in
white light.
5. For a simple microscope, the angular size of the object equals the
angular size of the image. Yet it offers magnification because we can
keep the small object much closer to the eye than 25 cm and hence
have it subtend a large angle. The image is at 25 cm which we can see.
Without the microscope, you would need to keep the small object at
25 cm which would subtend a very small angle.

EXERCISES
9.1 A small candle, 2.5 cm in size is placed at 27 cm in front of a concave
mirror of radius of curvature 36 cm. At what distance from the mirror
should a screen be placed in order to obtain a sharp image? Describe
the nature and size of the image. If the candle is moved closer to the
mirror, how would the screen have to be moved?
9.2 A 4.5 cm needle is placed 12 cm away from a convex mirror of focal
length 15 cm. Give the location of the image and the magnification.
Describe what happens as the needle is moved farther from the mirror.
9.3 A tank is filled with water to a height of 12.5 cm. The apparent
depth of a needle lying at the bottom of the tank is measured by a
microscope to be 9.4 cm. What is the refractive index of water? If
water is replaced by a liquid of refractive index 1.63 up to the same
height, by what distance would the microscope have to be moved to
focus on the needle again?
9.4 Figures 9.34(a) and (b) show refraction of a ray in air incident at 60°
with the normal to a glass-air and water-air interface, respectively.
Predict the angle of refraction in glass when the angle of incidence
in water is 45° with the normal to a water-glass interface [Fig. 9.34(c)].

FIGURE 9.34

9.5 A small bulb is placed at the bottom of a tank containing water to a
depth of 80 cm. What is the area of the surface of water through
which light from the bulb can emerge out? Refractive index of water
is 1.33. (Consider the bulb to be a point source.)
9.6 A prism is made of glass of unknown refractive index. A parallel
beam of light is incident on a face of the prism. The angle of minimum
deviation is measured to be 40°. What is the refractive index of the
material of the prism? The refracting angle of the prism is 60°. If
the prism is placed in water (refractive index 1.33), predict the new
angle of minimum deviation of a parallel beam of light.
9.7 Double-convex lenses are to be manufactured from a glass of
refractive index 1.55, with both faces of the same radius of
curvature. What is the radius of curvature required if the focal length
is to be 20 cm?
9.8 A beam of light converges at a point P. Now a lens is placed in the
path of the convergent beam 12 cm from P. At what point does the
beam converge if the lens is (a) a convex lens of focal length 20 cm,
and (b) a concave lens of focal length 16 cm?
9.9 An object of size 3.0 cm is placed 14 cm in front of a concave lens of
focal length 21 cm. Describe the image produced by the lens. What
happens if the object is moved further away from the lens?
9.10 What is the focal length of a convex lens of focal length 30 cm in
contact with a concave lens of focal length 20 cm? Is the system a
converging or a diverging lens? Ignore thickness of the lenses.
9.11 A compound microscope consists of an objective lens of focal length
2.0 cm and an eyepiece of focal length 6.25 cm separated by a
distance of 15 cm. How far from the objective should an object be
placed in order to obtain the final image at (a) the least distance of
distinct vision (25 cm), and (b) at infinity? What is the magnifying
power of the microscope in each case?
9.12 A person with a normal near point (25 cm) using a compound
microscope with objective of focal length 8.0 mm and an eyepiece of
focal length 2.5 cm can bring an object placed at 9.0 mm from the
objective in sharp focus. What is the separation between the two
lenses? Calculate the magnifying power of the microscope.
9.13 A small telescope has an objective lens of focal length 144 cm and
an eyepiece of focal length 6.0 cm. What is the magnifying power of
the telescope? What is the separation between the objective and
the eyepiece?
9.14 (a) A giant refracting telescope at an observatory has an objective
lens of focal length 15 m. If an eyepiece of focal length 1.0 cm is
used, what is the angular magnification of the telescope?
(b) If this telescope is used to view the moon, what is the diameter
of the image of the moon formed by the objective lens? The
diameter of the moon is 3.48 × 10^6 m, and the radius of lunar
orbit is 3.8 × 10^8 m.
9.15 Use the mirror equation to deduce that:
(a) an object placed between f and 2f of a concave mirror produces
a real image beyond 2f.
(b) a convex mirror always produces a virtual image independent
of the location of the object.
(c) the virtual image produced by a convex mirror is always
diminished in size and is located between the focus and
the pole.
(d) an object placed between the pole and focus of a concave mirror
produces a virtual and enlarged image.
[Note: This exercise helps you deduce algebraically properties of
images that one obtains from explicit ray diagrams.]
9.16 A small pin fixed on a table top is viewed from above from a distance
of 50 cm. By what distance would the pin appear to be raised if it is
viewed from the same point through a 15 cm thick glass slab held
parallel to the table? Refractive index of glass = 1.5. Does the answer
depend on the location of the slab?
9.17 (a) Figure 9.35 shows a cross-section of a light pipe made of a
glass fibre of refractive index 1.68. The outer covering of the
pipe is made of a material of refractive index 1.44. What is the
range of the angles of the incident rays with the axis of the pipe
for which total reflections inside the pipe take place, as shown
in the figure?
(b) What is the answer if there is no outer covering of the pipe?

FIGURE 9.35

9.18 Answer the following questions:
(a) You have learnt that plane and convex mirrors produce virtual
images of objects. Can they produce real images under some
circumstances? Explain.
(b) A virtual image, we always say, cannot be caught on a screen.
Yet when we see a virtual image, we are obviously bringing it
on to the screen (i.e., the retina) of our eye. Is there a
contradiction?
(c) A diver under water, looks obliquely at a fisherman standing on
the bank of a lake. Would the fisherman look taller or shorter to
the diver than what he actually is?
(d) Does the apparent depth of a tank of water change if viewed
obliquely? If so, does the apparent depth increase or decrease?
(e) The refractive index of diamond is much greater than that of
ordinary glass. Is this fact of some use to a diamond cutter?
9.19 The image of a small electric bulb fixed on the wall of a room is to be
obtained on the opposite wall 3 m away by means of a large convex
lens. What is the maximum possible focal length of the lens required
for the purpose?
9.20 A screen is placed 90 cm from an object. The image of the object on
the screen is formed by a convex lens at two different locations
separated by 20 cm. Determine the focal length of the lens.
9.21 (a) Determine the effective focal length of the combination of the
two lenses in Exercise 9.10, if they are placed 8.0 cm apart with
their principal axes coincident. Does the answer depend on
which side of the combination a beam of parallel light is incident?
Is the notion of effective focal length of this system useful at all?
(b) An object 1.5 cm in size is placed on the side of the convex lens
in the arrangement (a) above. The distance between the object
and the convex lens is 40 cm. Determine the magnification
produced by the two-lens system, and the size of the image.
9.22 At what angle should a ray of light be incident on the face of a prism
of refracting angle 60° so that it just suffers total internal reflection
at the other face? The refractive index of the material of the prism is
1.524.
9.23 You are given prisms made of crown glass and flint glass with a
wide variety of angles. Suggest a combination of prisms which will
(a) deviate a pencil of white light without much dispersion,
(b) disperse (and displace) a pencil of white light without much
deviation.
9.24 For a normal eye, the far point is at infinity and the near point of
distinct vision is about 25 cm in front of the eye. The cornea of the
eye provides a converging power of about 40 dioptres, and the least
converging power of the eye-lens behind the cornea is about 20
dioptres. From this rough data estimate the range of accommodation
(i.e., the range of converging power of the eye-lens) of a normal eye.
9.25 Does short-sightedness (myopia) or long-sightedness (hypermetropia)
imply necessarily that the eye has partially lost its ability
of accommodation? If not, what might cause these defects of vision?
9.26 A myopic person has been using spectacles of power −1.0 dioptre
for distant vision. During old age he also needs to use separate
reading glass of power +2.0 dioptres. Explain what may have
happened.
9.27 A person looking at a person wearing a shirt with a pattern
comprising vertical and horizontal lines is able to see the vertical
lines more distinctly than the horizontal ones. What is this defect
due to? How is such a defect of vision corrected?
9.28 A man with normal near point (25 cm) reads a book with small print
using a magnifying glass: a thin convex lens of focal length 5 cm.
(a) What is the closest and the farthest distance at which he should
keep the lens from the page so that he can read the book when
viewing through the magnifying glass?
(b) What is the maximum and the minimum angular magnification
(magnifying power) possible using the above simple microscope?
9.29 A card sheet divided into squares each of size 1 mm² is being viewed
at a distance of 9 cm through a magnifying glass (a converging lens
of focal length 9 cm) held close to the eye.
(a) What is the magnification produced by the lens? How much is
the area of each square in the virtual image?
(b) What is the angular magnification (magnifying power) of the
lens?
(c) Is the magnification in (a) equal to the magnifying power in (b)?
Explain.
9.30 (a) At what distance should the lens be held from the figure in
Exercise 9.29 in order to view the squares distinctly with the
maximum possible magnifying power?
(b) What is the magnification in this case?
(c) Is the magnification equal to the magnifying power in this case?
Explain.
9.31 What should be the distance between the object in Exercise 9.30
and the magnifying glass if the virtual image of each square in the
figure is to have an area of 6.25 mm²? Would you be able to see the
squares distinctly with your eyes very close to the magnifier?
[Note: Exercises 9.29 to 9.31 will help you clearly understand the
difference between magnification in absolute size and the angular
magnification (or magnifying power) of an instrument.]
9.32 Answer the following questions:
(a) The angle subtended at the eye by an object is equal to the
angle subtended at the eye by the virtual image produced by a
magnifying glass. In what sense then does a magnifying glass
provide angular magnification?
(b) In viewing through a magnifying glass, one usually positions
one's eyes very close to the lens. Does angular magnification
change if the eye is moved back?
(c) Magnifying power of a simple microscope is inversely proportional
to the focal length of the lens. What then stops us from using a
convex lens of smaller and smaller focal length and achieving
greater and greater magnifying power?
(d) Why must both the objective and the eyepiece of a compound
microscope have short focal lengths?
(e) When viewing through a compound microscope, our eyes should
be positioned not on the eyepiece but a short distance away
from it for best viewing. Why? How much should be that short
distance between the eye and eyepiece?
9.33 An angular magnification (magnifying power) of 30X is desired using
an objective of focal length 1.25 cm and an eyepiece of focal length
5 cm. How will you set up the compound microscope?
9.34 A small telescope has an objective lens of focal length 140 cm and
an eyepiece of focal length 5.0 cm. What is the magnifying power of
the telescope for viewing distant objects when
(a) the telescope is in normal adjustment (i.e., when the final image
is at infinity)?
(b) the final image is formed at the least distance of distinct vision
(25 cm)?
9.35 (a) For the telescope described in Exercise 9.34 (a), what is the
separation between the objective lens and the eyepiece?
(b) If this telescope is used to view a 100 m tall tower 3 km away,
what is the height of the image of the tower formed by the objective
lens?
(c) What is the height of the final image of the tower if it is formed at
25 cm?
9.36 A Cassegrain telescope uses two mirrors as shown in Fig. 9.33. Such
a telescope is built with the mirrors 20 mm apart. If the radius of
curvature of the large mirror is 220 mm and the small mirror is
140 mm, where will the final image of an object at infinity be?
9.37 Light incident normally on a plane mirror attached to a galvanometer
coil retraces backwards as shown in Fig. 9.36. A current in the coil
produces a deflection of 3.5° of the mirror. What is the displacement
of the reflected spot of light on a screen placed 1.5 m away?

FIGURE 9.36

9.38 Figure 9.37 shows an equiconvex lens (of refractive index 1.50) in
contact with a liquid layer on top of a plane mirror. A small needle
with its tip on the principal axis is moved along the axis until its
inverted image is found at the position of the needle. The distance of
the needle from the lens is measured to be 45.0 cm. The liquid is
removed and the experiment is repeated. The new distance is
measured to be 30.0 cm. What is the refractive index of the liquid?

FIGURE 9.37


Wave Optics

Chapter Ten

WAVE OPTICS

10.1 INTRODUCTION

In 1637 Descartes gave the corpuscular model of light and derived Snell's
law. It explained the laws of reflection and refraction of light at an interface.
The corpuscular model predicted that if the ray of light (on refraction)
bends towards the normal then the speed of light would be greater in the
second medium. This corpuscular model of light was further developed
by Isaac Newton in his famous book entitled OPTICKS and because of
the tremendous popularity of this book, the corpuscular model is very
often attributed to Newton.
In 1678, the Dutch physicist Christiaan Huygens put forward the
wave theory of light; it is this wave model of light that we will discuss in
this chapter. As we will see, the wave model could satisfactorily explain
the phenomena of reflection and refraction; however, it predicted that on
refraction if the wave bends towards the normal then the speed of light
would be less in the second medium. This is in contradiction to the
prediction made by using the corpuscular model of light. It was much
later confirmed by experiments where it was shown that the speed of
light in water is less than the speed in air confirming the prediction of the
wave model; Foucault carried out this experiment in 1850.
The wave theory was not readily accepted primarily because of
Newton's authority and also because light could travel through vacuum
and it was felt that a wave would always require a medium to propagate
from one point to the other. However, when Thomas Young performed
his famous interference experiment in 1801, it was firmly established
that light is indeed a wave phenomenon. The wavelength of visible
light was measured and found to be extremely small; for example, the
wavelength of yellow light is about 0.6 µm. Because of the smallness
of the wavelength of visible light (in comparison to the dimensions of
typical mirrors and lenses), light can be assumed to approximately
travel in straight lines. This is the field of geometrical optics, which we
had discussed in the previous chapter. Indeed, the branch of optics in
which one completely neglects the finiteness of the wavelength is called
geometrical optics and a ray is defined as the path of energy
propagation in the limit of wavelength tending to zero.
After the interference experiment of Young in 1801, for the next 40
years or so, many experiments were carried out involving the
interference and diffraction of light waves; these experiments could only
be satisfactorily explained by assuming a wave model of light. Thus,
around the middle of the nineteenth century, the wave theory seemed
to be very well established. The only major difficulty was that since it
was thought that a wave required a medium for its propagation, how
could light waves propagate through vacuum? This was explained
when Maxwell put forward his famous electromagnetic theory of light.
Maxwell had developed a set of equations describing the laws of
electricity and magnetism and using these equations he derived what
is known as the wave equation from which he predicted the existence
of electromagnetic waves*. From the wave equation, Maxwell could
calculate the speed of electromagnetic waves in free space and he found
that the theoretical value was very close to the measured value of speed
of light. From this, he propounded that light must be an
electromagnetic wave. Thus, according to Maxwell, light waves are
associated with changing electric and magnetic fields; changing electric
field produces a time and space varying magnetic field and a changing
magnetic field produces a time and space varying electric field. The
changing electric and magnetic fields result in the propagation of
electromagnetic waves (or light waves) even in vacuum.
In this chapter we will first discuss the original formulation of the
Huygens principle and derive the laws of reflection and refraction. In
Sections 10.4 and 10.5, we will discuss the phenomenon of interference
which is based on the principle of superposition. In Section 10.6 we
will discuss the phenomenon of diffraction which is based on the
Huygens-Fresnel principle. Finally in Section 10.7 we will discuss the
phenomenon of polarisation which is based on the fact that the light
waves are transverse electromagnetic waves.


* Maxwell had predicted the existence of electromagnetic waves around 1865; it
was much later (around 1890) that Heinrich Hertz produced radio waves in the
laboratory. J.C. Bose and G. Marconi made practical applications of the Hertzian
waves.

DOES LIGHT TRAVEL IN A STRAIGHT LINE?

Light travels in a straight line in Class VI; it does not do so in Class XII and beyond! Surprised,
aren't you?
In school, you are shown an experiment in which you take three cardboards with
pinholes in them, place a candle on one side and look from the other side. If the flame of the
candle and the three pinholes are in a straight line, you can see the candle. Even if one of
them is displaced a little, you cannot see the candle. This proves, so your teacher says,
that light travels in a straight line.
In the present book, there are two consecutive chapters, one on ray optics and the other
on wave optics. Ray optics is based on rectilinear propagation of light, and deals with
mirrors, lenses, reflection, refraction, etc. Then you come to the chapter on wave optics,
and you are told that light travels as a wave, that it can bend around objects, it can diffract
and interfere, etc.
In the optical region, light has a wavelength of about half a micrometre. If it encounters an
obstacle of about this size, it can bend around it and can be seen on the other side. Thus a
micrometre size obstacle will not be able to stop a light ray. If the obstacle is much larger,
however, light will not be able to bend to that extent, and will not be seen on the other side.
This is a property of a wave in general, and can be seen in sound waves too. The sound
wave of our speech has a wavelength of about 50 cm to 1 m. If it meets an obstacle of the
size of a few metres, it bends around it and reaches points behind the obstacle. But when it
comes across a larger obstacle of a few hundred metres, such as a hillock, most of it is
reflected and is heard as an echo.
Then what about the primary school experiment? What happens there is that when we
move any cardboard, the displacement is of the order of a few millimetres, which is much
larger than the wavelength of light. Hence the candle cannot be seen. If we are able to move
one of the cardboards by a micrometre or less, light will be able to diffract, and the candle
will still be seen.
One could add to the first sentence in this box: "It learns how to bend as it grows up!"

10.2 HUYGENS PRINCIPLE

We first define a wavefront. When we drop a small stone on a calm
pool of water, waves spread out from the point of impact. Every point on
the surface starts oscillating with time. At any instant, a photograph of
the surface would show circular rings on which the disturbance is
maximum. Clearly, all points on such a circle are oscillating in phase
because they are at the same distance from the source. Such a locus of
points, which oscillate in phase, is called a wavefront; thus a wavefront
is defined as a surface of constant phase. The speed with which the
wavefront moves outwards from the source is called the speed of the
wave. The energy of the wave travels in a direction perpendicular to the
wavefront.
If we have a point source emitting waves uniformly in all directions,
then the locus of points which have the same amplitude and vibrate in
the same phase are spheres and we have what is known as a spherical
wave as shown in Fig. 10.1(a). At a large distance from the source, a
small portion of the sphere can be considered as a plane and we have
what is known as a plane wave [Fig. 10.1(b)].

FIGURE 10.1 (a) A diverging spherical wave emanating from a point source. The wavefronts are spherical. (b) At a large distance from the source, a small portion of the spherical wave can be approximated by a plane wave.

Now, if we know the shape of the wavefront at t = 0, then Huygens
principle allows us to determine the shape of the wavefront at a later
time τ. Thus, Huygens principle is essentially a geometrical construction,
which given the shape of the wavefront at any time allows us to determine
the shape of the wavefront at a later time. Let us consider a diverging
wave and let F1F2 represent a portion of the spherical wavefront at t = 0
(Fig. 10.2). Now, according to Huygens principle, each point of the
wavefront is the source of a secondary disturbance and the wavelets
emanating from these points spread out in all directions with the speed
of the wave. These wavelets emanating from the wavefront are usually
referred to as secondary wavelets and if we draw a common tangent
to all these spheres, we obtain the new position of the wavefront at a
later time.

FIGURE 10.2 F1F2 represents the spherical wavefront (with O as
centre) at t = 0. The envelope of the secondary wavelets
emanating from F1F2 produces the forward moving wavefront G1G2.
The backwave D1D2 does not exist.

Thus, if we wish to determine the shape of the wavefront at t = τ, we
draw spheres of radius vτ from each point on the spherical wavefront
where v represents the speed of the waves in the medium. If we now draw
a common tangent to all these spheres, we obtain the new position of the
wavefront at t = τ. The new wavefront shown as G1G2 in Fig. 10.2 is again
spherical with point O as the centre.
The above model has one shortcoming: we also have a backwave which
is shown as D1D2 in Fig. 10.2. Huygens argued that the amplitude of the
secondary wavelets is maximum in the forward direction and zero in the
backward direction; by making this ad hoc assumption, Huygens could
explain the absence of the backwave. However, this ad hoc assumption is
not satisfactory and the absence of the backwave is really justified from
more rigorous wave theory.

FIGURE 10.3 Huygens geometrical construction for a plane wave propagating to the right. F1F2 is the plane wavefront at t = 0 and G1G2 is the wavefront at a later time τ. The lines A1A2, B1B2 etc., are normal to both F1F2 and G1G2 and represent rays.

In a similar manner, we can use Huygens principle to determine the
shape of the wavefront for a plane wave propagating through a medium
(Fig. 10.3).
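The construction for the spherical wavefront can be checked numerically. The following is only an illustrative sketch: the function name `envelope_radius` and the sample values (r = 2.0, v = 1.0, τ = 0.5, in arbitrary units) are assumptions made for this example and do not come from the text. It works in a two-dimensional cross-section, places secondary sources on a circle of radius r about O, gives each a wavelet of radius vτ, and finds along several directions from O the farthest point reached by any wavelet. If the construction is right, the forward envelope should lie at a distance r + vτ from O.

```python
import math

def envelope_radius(r, v, tau, n_sources=720, n_directions=36):
    """Distance from O of the forward envelope of the secondary wavelets.

    Secondary sources sit on a circle of radius r centred at O (a 2D
    cross-section of the spherical wavefront). Each emits a wavelet of
    radius v*tau. Along each direction from O we look for the farthest
    point that lies on any wavelet.
    """
    rho = v * tau  # radius of every secondary wavelet
    radii = []
    for k in range(n_directions):
        theta = 2 * math.pi * k / n_directions
        farthest = 0.0
        for j in range(n_sources):
            alpha = 2 * math.pi * j / n_sources
            # Position of the secondary source on the original wavefront
            sx, sy = r * math.cos(alpha), r * math.sin(alpha)
            # A point on the ray from O is t*(cos theta, sin theta).
            # It lies on the wavelet when t^2 - 2*b*t + r^2 = rho^2.
            b = sx * math.cos(theta) + sy * math.sin(theta)
            disc = b * b - (r * r - rho * rho)
            if disc >= 0:
                farthest = max(farthest, b + math.sqrt(disc))
        radii.append(farthest)
    return radii

# Assumed sample values, arbitrary units
radii = envelope_radius(r=2.0, v=1.0, tau=0.5)
```

Every entry of `radii` should come out very close to r + vτ, which is the statement that the new wavefront G1G2 is again a sphere centred at O.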


10.3 REFRACTION AND REFLECTION OF PLANE WAVES USING HUYGENS PRINCIPLE
We will now use Huygens principle to derive the laws of
refraction. Let PP′ represent the surface separating medium
1 and medium 2, as shown in Fig. 10.4. Let v1 and v2
represent the speed of light in medium 1 and medium 2,
respectively. We assume a plane wavefront AB propagating
in the direction A′A incident on the interface at an angle i
as shown in the figure. Let τ be the time taken by the
wavefront to travel the distance BC. Thus,
$$BC = v_1 \tau$$

FIGURE 10.4 A plane wave AB is incident at an angle i
on the surface PP′ separating medium 1 and medium 2.
The plane wave undergoes refraction and CE represents
the refracted wavefront. The figure corresponds to v2 < v1
so that the refracted wave bends towards the normal.

CHRISTIAAN HUYGENS (1629-1695)
Christiaan Huygens (1629-1695) Dutch physicist, astronomer,
mathematician and the founder of the wave theory of light. His book,
Treatise on light, makes fascinating reading even today. He brilliantly
explained the double refraction shown by the mineral calcite in this
work in addition to reflection and refraction. He was the first to
analyse circular and simple harmonic motion and designed and built
improved clocks and telescopes. He discovered the true geometry of
Saturn's rings.


10.3.1 Refraction of a plane wave

In order to determine the shape of the refracted wavefront, we draw a
sphere of radius v2τ from the point A in the second medium (the speed of
the wave in the second medium is v2). Let CE represent a tangent plane
drawn from the point C on to the sphere. Then, AE = v2τ and CE would
represent the refracted wavefront. If we now consider the triangles ABC
and AEC, we readily obtain
$$\sin i = \frac{BC}{AC} = \frac{v_1 \tau}{AC} \tag{10.1}$$
and
$$\sin r = \frac{AE}{AC} = \frac{v_2 \tau}{AC} \tag{10.2}$$
where i and r are the angles of incidence and refraction, respectively.

Thus we obtain
$$\frac{\sin i}{\sin r} = \frac{v_1}{v_2} \tag{10.3}$$
From the above equation, we get the important result that if r < i (i.e.,
if the ray bends toward the normal), the speed of the light wave in the
second medium (v2) will be less than the speed of the light wave in the
first medium (v1). This prediction is opposite to the prediction from the
corpuscular model of light and as later experiments showed, the prediction
of the wave theory is correct. Now, if c represents the speed of light in
vacuum, then,
$$n_1 = \frac{c}{v_1} \tag{10.4}$$
and
$$n_2 = \frac{c}{v_2} \tag{10.5}$$
are known as the refractive indices of medium 1 and medium 2,
respectively. In terms of the refractive indices, Eq. (10.3) can be
written as
$$n_1 \sin i = n_2 \sin r \tag{10.6}$$
This is Snell's law of refraction. Further, if λ1 and λ2 denote the
wavelengths of light in medium 1 and medium 2, respectively and if the
distance BC is equal to λ1 then the distance AE will be equal to λ2 (because
if the crest from B has reached C in time τ, then the crest from A should
have also reached E in time τ); thus,
$$\frac{\lambda_1}{\lambda_2} = \frac{BC}{AE} = \frac{v_1}{v_2}$$
or
$$\frac{v_1}{\lambda_1} = \frac{v_2}{\lambda_2} \tag{10.7}$$
The above equation implies that when a wave gets refracted into a
denser medium (v1 > v2) the wavelength and the speed of propagation
decrease but the frequency ν (= v/λ) remains the same.

Demonstration of interference, diffraction, refraction, resonance and Doppler effect: http://www.falstad.com/ripple/
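The relations between the angles, speeds and wavelengths derived above can be tried out on numbers. The sketch below is only an illustration under stated assumptions: the helper names `refract` and `wavelength_in_medium`, the rounded value of c, and the refractive indices 1.00 (air) and 1.33 (water) are chosen for this example and are not taken from the text.

```python
import math

C = 3.0e8  # speed of light in vacuum in m/s (rounded value)

def refract(i_deg, v1, v2):
    """Angle of refraction in degrees from sin i / sin r = v1 / v2.

    Returns None when sin r would exceed 1, i.e. when there is no
    refracted wave.
    """
    sin_r = math.sin(math.radians(i_deg)) * v2 / v1
    if sin_r > 1.0:
        return None
    return math.degrees(math.asin(sin_r))

def wavelength_in_medium(lambda1, v1, v2):
    """Wavelength in medium 2 from lambda1 / lambda2 = v1 / v2."""
    return lambda1 * v2 / v1

# Assumed illustrative media: air (n = 1.00) and water (n = 1.33)
v_air = C / 1.00
v_water = C / 1.33

r = refract(30.0, v_air, v_water)                      # bends towards the normal
lam_water = wavelength_in_medium(589e-9, v_air, v_water)

# The frequency v / lambda is the same on both sides of the interface
freq_air = v_air / 589e-9
freq_water = v_water / lam_water
```

With these assumed values the angle of refraction comes out smaller than the angle of incidence, the wavelength in water is shorter than in air, and the two frequencies agree, as Eq. (10.7) requires.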

10.3.2 Refraction at a rarer medium

We now consider refraction of a plane wave at a rarer medium, i.e.,
v2 > v1. Proceeding in an exactly similar manner we can construct a
refracted wavefront as shown in Fig. 10.5. The angle of refraction
will now be greater than the angle of incidence; however, we will still have
n1 sin i = n2 sin r. We define an angle ic by the following equation
$$\sin i_c = \frac{n_2}{n_1} \tag{10.8}$$
Thus, if i = ic then sin r = 1 and r = 90°. Obviously, for i > ic, there
cannot be any refracted wave. The angle ic is known as the critical angle and
for all angles of incidence greater than the critical angle, we will not have
any refracted wave and the wave will undergo what is known as total
internal reflection. The phenomenon of total internal reflection and its
applications were discussed in Section 9.4.
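As a small numerical illustration of Eq. (10.8), the sketch below computes the critical angle for a pair of media. The function name `critical_angle_deg` and the indices 1.33 (water) and 1.00 (air) are assumptions made for this example only.

```python
import math

def critical_angle_deg(n1, n2):
    """Critical angle in degrees from sin i_c = n2 / n1.

    Defined only when the wave goes from a denser medium (n1)
    to a rarer medium (n2), i.e. when n1 > n2.
    """
    if n2 >= n1:
        raise ValueError("no critical angle: medium 2 is not rarer than medium 1")
    return math.degrees(math.asin(n2 / n1))

# Assumed illustrative indices: water (1.33) to air (1.00)
ic = critical_angle_deg(1.33, 1.00)
```

For these assumed indices the critical angle is a little under 49°; for any larger angle of incidence there is no refracted wave.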

FIGURE 10.5 Refraction of a plane wave incident on a rarer medium for
which v2 > v1. The plane wave bends away from the normal.

10.3.3 Reflection of a plane wave by a plane surface

We next consider a plane wave AB incident at an angle i on a reflecting
surface MN. If v represents the speed of the wave in the medium and if τ
represents the time taken by the wavefront to advance from the point B
to C then the distance
$$BC = v\tau$$
In order to construct the reflected wavefront we draw a sphere of
radius vτ from the point A as shown in Fig. 10.6. Let CE represent the
tangent plane drawn from the point C to this sphere. Obviously
$$AE = BC = v\tau$$

FIGURE 10.6 Reflection of a plane wave AB by the reflecting surface MN.
AB and CE represent incident and reflected wavefronts.

If we now consider the triangles EAC and BAC we will find that they
are congruent and therefore, the angles i and r (as shown in Fig. 10.6)
would be equal. This is the law of reflection.
Once we have the laws of reflection and refraction, the behaviour of
prisms, lenses, and mirrors can be understood. These phenomena were
discussed in detail in Chapter 9 on the basis of rectilinear propagation of
light. Here we just describe the behaviour of the wavefronts as they
undergo reflection or refraction. In Fig. 10.7(a) we consider a plane wave
passing through a thin prism. Clearly, since the speed of light waves is
less in glass, the lower portion of the incoming wavefront (which travels
through the greatest thickness of glass) will get delayed resulting in a tilt
in the emerging wavefront as shown in the figure. In Fig. 10.7(b) we
consider a plane wave incident on a thin convex lens; the central part of
the incident plane wave traverses the thickest portion of the lens and is
delayed the most. The emerging wavefront has a depression at the centre
and therefore the wavefront becomes spherical and converges to the point
F which is known as the focus. In Fig. 10.7(c) a plane wave is incident on
a concave mirror and on reflection we have a spherical wave converging
to the focal point F. In a similar manner, we can understand refraction
and reflection by concave lenses and convex mirrors.

FIGURE 10.7 Refraction of a plane wave by (a) a thin prism, (b) a convex lens. (c) Reflection of a
plane wave by a concave mirror.

From the above discussion it follows that the total time taken from a
point on the object to the corresponding point on the image is the same
measured along any ray. For example, when a convex lens focusses light
to form a real image, the ray going through the centre traverses
a shorter path, but because of the slower speed in glass, the time taken
is the same as for rays travelling near the edge of the lens.

10.3.4 The Doppler effect

We should mention here that one should be careful in constructing the
wavefronts if the source (or the observer) is moving. For example, if there
is no medium and the source moves away from the observer, then later
wavefronts have to travel a greater distance to reach the observer and
hence take a longer time. The time taken between the arrival of two
successive wavefronts is hence longer at the observer than it is at the
source. Thus, when the source moves away from the observer the
frequency as measured by the observer will be smaller. This is known as
the Doppler effect. Astronomers call the increase in wavelength due to
the Doppler effect red shift since a wavelength in the middle of the visible
region of the spectrum moves towards the red end of the spectrum. When
waves are received from a source moving towards the observer, there is
an apparent decrease in wavelength; this is referred to as blue shift.

You have already encountered Doppler effect for sound waves in
Chapter 15 of Class XI textbook. For velocities small compared to the
speed of light, we can use the same formulae which we use for sound
waves. The fractional change in frequency Δν/ν is given by −v_radial/c, where
v_radial is the component of the source velocity along the line joining the
observer to the source relative to the observer; v_radial is considered positive
when the source moves away from the observer. Thus, the Doppler shift
can be expressed as:
$$\frac{\Delta\nu}{\nu} = -\frac{v_{\text{radial}}}{c} \tag{10.9}$$

The formula given above is valid only when the speed of the source is
small compared to that of light. A more accurate formula for the Doppler
effect which is valid even when the speeds are close to that of light, requires
the use of Einstein's special theory of relativity. The Doppler effect for
light is very important in astronomy. It is the basis for the measurements
of the radial velocities of distant galaxies.
Example 10.1 What speed should a galaxy move with respect
to us so that the sodium line at 589.0 nm is observed
at 589.6 nm?
Solution Since νλ = c,
$$\frac{\Delta\nu}{\nu} = -\frac{\Delta\lambda}{\lambda}$$
(for small changes in ν and λ). For
Δλ = 589.6 − 589.0 = +0.6 nm
we get [using Eq. (10.9)]
$$\frac{\Delta\nu}{\nu} = -\frac{\Delta\lambda}{\lambda} = -\frac{v_{\text{radial}}}{c}$$
or,
$$v_{\text{radial}} \cong +c\left(\frac{0.6}{589.0}\right) = +3.06 \times 10^{5}\ \text{m s}^{-1} = 306\ \text{km/s}$$
Therefore, the galaxy is moving away from us.

Example 10.2
(a) When monochromatic light is incident on a surface separating
two media, the reflected and refracted light both have the same
frequency as the incident frequency. Explain why.
(b) When light travels from a rarer to a denser medium, the speed
decreases. Does the reduction in speed imply a reduction in the
energy carried by the light wave?
(c) In the wave picture of light, intensity of light is determined by the
square of the amplitude of the wave. What determines the intensity
of light in the photon picture of light?
Solution
(a) Reflection and refraction arise through interaction of incident light
with the atomic constituents of matter. Atoms may be viewed as
oscillators, which take up the frequency of the external agency (light)
causing forced oscillations. The frequency of light emitted by a charged
oscillator equals its frequency of oscillation. Thus, the frequency of
scattered light equals the frequency of incident light.
(b) No. Energy carried by a wave depends on the amplitude of the
wave, not on the speed of wave propagation.
(c) For a given frequency, intensity of light in the photon picture is
determined by the number of photons crossing a unit area per
unit time.
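The arithmetic of Example 10.1 can be reproduced with a few lines of code. This is only a sketch of the low-speed formula of Eq. (10.9); the function name `radial_velocity` and the rounded value c = 3 × 10⁸ m/s are assumptions of this example.

```python
C = 3.0e8  # speed of light in vacuum in m/s (rounded value)

def radial_velocity(lambda_emitted_nm, lambda_observed_nm, c=C):
    """Radial velocity from delta_lambda / lambda = v_radial / c.

    Valid only when the speed is small compared to c. A positive
    result means the source is moving away from the observer.
    """
    delta_lambda = lambda_observed_nm - lambda_emitted_nm
    return c * delta_lambda / lambda_emitted_nm

# Sodium line emitted at 589.0 nm and observed at 589.6 nm
v = radial_velocity(589.0, 589.6)
```

The result is positive and close to the 3.06 × 10⁵ m/s obtained in the example, so the galaxy is receding. Swapping the two wavelengths gives a negative value, which corresponds to a blue shift.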

10.4 COHERENT AND INCOHERENT ADDITION OF WAVES

In this section we will discuss the interference pattern produced by
the superposition of two waves. You may recall that we had discussed
the superposition principle in Chapter 15 of your Class XI textbook.
Indeed the entire field of interference is based on the superposition
principle according to which at a particular point in the medium, the
resultant displacement produced by a number of waves is the vector
sum of the displacements produced by each of the waves.
Consider two needles S1 and S2 moving periodically up and down
in an identical fashion in a trough of water [Fig. 10.8(a)]. They produce
two water waves, and at a particular point, the phase difference between
the displacements produced by each of the waves does not change
with time; when this happens the two sources are said to be coherent.
Figure 10.8(b) shows the position of crests (solid circles) and troughs
(dashed circles) at a given instant of time. Consider a point P for which

$$S_1P = S_2P$$

FIGURE 10.8 (a) Two needles oscillating in phase in water represent two coherent sources. (b) The pattern of displacement of water molecules at an instant on the surface of water showing nodal N (no displacement) and antinodal A (maximum displacement) lines.


Since the distances S1P and S2P are equal, waves from S1 and S2
will take the same time to travel to the point P and waves that emanate
from S1 and S2 in phase will also arrive, at the point P, in phase.
Thus, if the displacement produced by the source S1 at the point P
is given by
$$y_1 = a \cos \omega t$$
then, the displacement produced by the source S2 (at the point P) will
also be given by
$$y_2 = a \cos \omega t$$
Thus, the resultant displacement at P would be given by
$$y = y_1 + y_2 = 2a \cos \omega t$$
Since the intensity is proportional to the square of the
amplitude, the resultant intensity will be given by
$$I = 4 I_0$$
where I0 represents the intensity produced by each one of the individual
sources; I0 is proportional to a². In fact at any point on the perpendicular
bisector of S1S2, the intensity will be 4I0. The two sources are said to
interfere constructively and we have what is referred to as constructive
interference. We next consider a point Q [Fig. 10.9(a)]
for which
$$S_2Q - S_1Q = 2\lambda$$


The waves emanating from S1 will arrive exactly two cycles earlier
than the waves from S2 and will again be in phase [Fig. 10.9(a)]. Thus, if
the displacement produced by S1 is given by
$$y_1 = a \cos \omega t$$
then the displacement produced by S2 will be given by
$$y_2 = a \cos (\omega t - 4\pi) = a \cos \omega t$$
where we have used the fact that a path difference of 2λ corresponds to a
phase difference of 4π. The two displacements are once again in phase
and the intensity will again be 4I0 giving rise to constructive interference.
In the above analysis we have assumed that the distances S1Q and S2Q
are much greater than d (which represents the distance between S1 and
S2) so that although S1Q and S2Q are not equal, the amplitudes of the
displacement produced by each wave are very nearly the same.
We next consider a point R [Fig. 10.9(b)] for which
$$S_2R - S_1R = -2.5\lambda$$
The waves emanating from S1 will arrive exactly two and a half cycles
later than the waves from S2 [Fig. 10.9(b)]. Thus if the displacement
produced by S1 is given by
$$y_1 = a \cos \omega t$$
then the displacement produced by S2 will be given by
$$y_2 = a \cos (\omega t + 5\pi) = -a \cos \omega t$$
where we have used the fact that a path difference of 2.5λ corresponds to
a phase difference of 5π. The two displacements are now out of phase
and will cancel out to give zero intensity. This is
referred to as destructive interference.

FIGURE 10.9 (a) Constructive interference at a point Q for which the path difference is 2λ. (b) Destructive interference at a point R for which the path difference is 2.5λ.

FIGURE 10.10 Locus of points for which S1P − S2P is equal to zero, ±λ, ±2λ, ±3λ.

To summarise: If we have two coherent sources S1 and S2 vibrating
in phase, then for an arbitrary point P whenever the path difference,
$$S_1P \sim S_2P = n\lambda \quad (n = 0, 1, 2, 3, \ldots) \tag{10.10}$$
we will have constructive interference and the resultant intensity will be
4I0; the sign ~ between S1P and S2P represents the difference between
S1P and S2P. On the other hand, if the point P is such that the path
difference,
$$S_1P \sim S_2P = \left(n + \tfrac{1}{2}\right)\lambda \quad (n = 0, 1, 2, 3, \ldots) \tag{10.11}$$
we will have destructive interference and the resultant intensity will be
zero. Now, for any other arbitrary point G (Fig. 10.10) let the phase
difference between the two displacements be φ. Thus, if the displacement
produced by S1 is given by
$$y_1 = a \cos \omega t$$

then, the displacement produced by S2 would be
$$y_2 = a \cos (\omega t + \phi)$$
and the resultant displacement will be given by
$$y = y_1 + y_2 = a [\cos \omega t + \cos (\omega t + \phi)] = 2a \cos (\phi/2) \cos (\omega t + \phi/2)$$
The amplitude of the resultant displacement is 2a cos (φ/2) and
therefore the intensity at that point will be
$$I = 4 I_0 \cos^2 (\phi/2) \tag{10.12}$$
If φ = 0, ±2π, ±4π, … which corresponds to the condition given by
Eq. (10.10) we will have constructive interference leading to maximum
intensity. On the other hand, if φ = ±π, ±3π, ±5π, … [which corresponds to
the condition given by Eq. (10.11)] we will have destructive interference
leading to zero intensity.

Ripple Tank experiments on wave interference: http://www.colorado.edu/physics/2000/applets/fourier.html
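Equation (10.12), together with the relation between path difference and phase difference used above (a path difference of λ corresponds to a phase difference of 2π), can be evaluated directly. The sketch below is illustrative only; the function name `intensity` and the choice of measuring lengths in units of the wavelength are assumptions of this example.

```python
import math

def intensity(path_difference, wavelength, i0=1.0):
    """Resultant intensity I = 4 I0 cos^2(phi/2) for two coherent sources.

    The phase difference is phi = 2*pi*path_difference/wavelength.
    """
    phi = 2 * math.pi * path_difference / wavelength
    return 4 * i0 * math.cos(phi / 2) ** 2

# Lengths measured in units of the wavelength (wavelength = 1)
at_P = intensity(0.0, 1.0)    # equal paths
at_Q = intensity(2.0, 1.0)    # path difference 2 wavelengths
at_R = intensity(2.5, 1.0)    # path difference 2.5 wavelengths
```

With I0 = 1 the first two values are essentially 4 (constructive interference) and the third is essentially zero (destructive interference), matching the points P, Q and R discussed in the text.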
Now if the two sources are coherent (i.e., if the two needles are going
up and down regularly) then the phase difference at any point will not
change with time and we will have a stable interference pattern; i.e., the
positions of maxima and minima will not change with time. However, if
the two needles do not maintain a constant phase difference, then the
interference pattern will also change with time and, if the phase difference
changes very rapidly with time, the positions of maxima and minima will
also vary rapidly with time and we will see a time-averaged intensity
distribution. When this happens, we will observe an average intensity
that will be given by
$$\langle I \rangle = 4 I_0 \langle \cos^2 (\phi/2) \rangle \tag{10.13}$$
where angular brackets represent time averaging. Indeed it is shown in
Section 7.2 that if φ(t) varies randomly with time, the time-averaged
quantity ⟨cos² (φ/2)⟩ will be 1/2. This is also intuitively obvious because
the function cos² (φ/2) will randomly vary between 0 and 1 and the
average value will be 1/2. The resultant intensity will be given by
$$I = 2 I_0 \tag{10.14}$$
at all points.
When the phase difference between the two vibrating sources changes
rapidly with time, we say that the two sources are incoherent and when
this happens the intensities just add up. This is indeed what happens
when two separate light sources illuminate a wall.
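The statement that the intensities of incoherent sources simply add can be illustrated by averaging Eq. (10.12) over a randomly varying phase difference. The sketch below is a rough Monte Carlo illustration, not a derivation: the function name `average_intensity`, the sample count, the fixed seed and the assumption that the phase is uniformly distributed between 0 and 2π are all choices made for this example.

```python
import math
import random

def average_intensity(n_samples=200_000, i0=1.0, seed=0):
    """Average of 4 I0 cos^2(phi/2) over random phase differences.

    The phase difference phi is drawn uniformly from [0, 2*pi), which
    is an assumed model of two sources with no fixed phase relation.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_samples):
        phi = rng.uniform(0.0, 2 * math.pi)
        total += 4 * i0 * math.cos(phi / 2) ** 2
    return total / n_samples

avg = average_intensity()
```

With I0 = 1 the average comes out close to 2, that is 2I0, rather than the 4I0 or zero found for coherent sources.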

10.5 INTERFERENCE OF LIGHT WAVES AND YOUNG'S EXPERIMENT

We will now discuss interference using light waves. If we use two sodium
lamps illuminating two pinholes (Fig. 10.11) we will not observe any
interference fringes. This is because of the fact that the light wave emitted
from an ordinary source (like a sodium lamp) undergoes abrupt phase
changes in times of the order of 10⁻¹⁰ seconds. Thus
the light waves coming out from two independent
sources of light will not have any fixed phase
relationship and would be incoherent; when this
happens, as discussed in the previous section, the
intensities on the screen will add up.

FIGURE 10.11 If two sodium lamps illuminate two pinholes S1 and S2, the intensities will add up and no interference fringes will be observed on the screen.

The British physicist Thomas Young used an
ingenious technique to "lock" the phases of the waves
emanating from S1 and S2. He made two pinholes S1
and S2 (very close to each other) on an opaque screen
[Fig. 10.12(a)]. These were illuminated by another
pinhole S that was, in turn, lit by a bright source. Light
waves spread out from S and fall on both S1 and S2.
S1 and S2 then behave like two coherent sources
because light waves coming out from S1 and S2 are derived from the
same original source and any abrupt phase change in S will manifest in
exactly similar phase changes in the light coming out from S1 and S2.
Thus, the two sources S1 and S2 will be locked in phase; i.e., they will be
coherent like the two vibrating needles in our water wave example
[Fig. 10.8(a)].

FIGURE 10.12 Young's arrangement to produce interference pattern.

Thus spherical waves emanating from S1 and S2 will produce
interference fringes on the screen GG′, as shown in Fig. 10.12(b). The
positions of maximum and minimum intensities can be calculated by
using the analysis given in Section 10.4 where we had shown that for an
arbitrary point P on the line GG′ [Fig. 10.12(b)] to correspond to a
maximum, we must have
$$S_2P - S_1P = n\lambda; \quad n = 0, 1, 2, \ldots \tag{10.15}$$
Now,
$$(S_2P)^2 - (S_1P)^2 = \left[D^2 + \left(x + \frac{d}{2}\right)^2\right] - \left[D^2 + \left(x - \frac{d}{2}\right)^2\right] = 2xd$$

where S1S2 = d and OP = x. Thus
$$S_2P - S_1P = \frac{2xd}{S_2P + S_1P} \tag{10.16}$$
If x, d << D then negligible error will be introduced if
S2P + S1P (in the denominator) is replaced by 2D. For
example, for d = 0.1 cm, D = 100 cm, OP = 1 cm (which
correspond to typical values for an interference
experiment using light waves), we have
$$S_2P + S_1P = \left[(100)^2 + (1.05)^2\right]^{1/2} + \left[(100)^2 + (0.95)^2\right]^{1/2} \approx 200.01\ \text{cm}$$
Thus if we replace S2P + S1P by 2D, the error involved is
about 0.005%. In this approximation, Eq. (10.16)
becomes
$$S_2P - S_1P \approx \frac{xd}{D} \tag{10.17}$$

THOMAS YOUNG (1773-1829)
Thomas Young (1773-1829) English physicist, physician and
Egyptologist. Young worked on a wide variety of scientific problems,
ranging from the structure of the eye and the mechanism of vision to
the decipherment of the Rosetta stone. He revived the wave theory of
light and recognised that interference phenomena provide proof of the
wave properties of light.

Hence we will have constructive interference resulting in
a bright region when
$$x = x_n = \frac{n\lambda D}{d}; \quad n = 0, \pm 1, \pm 2, \ldots \tag{10.18}$$
On the other hand, we will have a dark region near
$$x = x_n = \left(n + \tfrac{1}{2}\right)\frac{\lambda D}{d}; \quad n = 0, \pm 1, \pm 2, \ldots \tag{10.19}$$
Thus dark and bright bands appear on the screen, as shown in
Fig. 10.13. Such bands are called fringes. Equations (10.18) and (10.19)
show that dark and bright fringes are equally spaced and the distance
between two consecutive bright (or two consecutive dark) fringes is given by
$$\beta = x_{n+1} - x_n$$
or
$$\beta = \frac{\lambda D}{d} \tag{10.20}$$
which is the expression for the fringe width. Obviously, the central point
O (in Fig. 10.12) will be bright because S1O = S2O and it will correspond
to n = 0. If we consider the line perpendicular to the plane of the paper
and passing through O [i.e., along the y-axis] then all points on this line
will be equidistant from S1 and S2 and we will have a bright central fringe
which is a straight line as shown in Fig. 10.13. In order to determine the
shape of the interference pattern on the screen we note that a particular
fringe would correspond to the locus of points with a constant value of
S2P − S1P. Whenever this constant is an integral multiple of λ, the fringe
will be bright and whenever it is an odd integral multiple of λ/2 it will be
a dark fringe. Now, the locus of the point P lying in the x-y plane such
that S2P − S1P (= Δ) is a constant, is a hyperbola. Thus the fringe pattern
will strictly be a hyperbola; however, if the distance D is very large compared
to the fringe width, the fringes will be very nearly straight lines as shown
in Fig. 10.13.
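The approximation leading to Eq. (10.17) and the fringe formulas (10.18) to (10.20) can be checked on the numbers quoted in the text. The sketch below is only an illustration: the function names are hypothetical, the geometry d = 0.1 cm, D = 100 cm, x = 1 cm is the example used above, and the wavelength 5 × 10⁻⁵ cm is borrowed from the caption of Fig. 10.13 as an assumed typical value.

```python
import math

def exact_path_difference(x, d, D):
    """S2P - S1P computed directly from the geometry of Fig. 10.12."""
    s2p = math.sqrt(D**2 + (x + d / 2) ** 2)
    s1p = math.sqrt(D**2 + (x - d / 2) ** 2)
    return s2p - s1p

def bright_fringe_position(n, wavelength, d, D):
    """Position of the n-th bright fringe, x_n = n * lambda * D / d."""
    return n * wavelength * D / d

def fringe_width(wavelength, d, D):
    """Fringe width beta = lambda * D / d."""
    return wavelength * D / d

# Geometry from the text, all lengths in cm
d, D, x = 0.1, 100.0, 1.0
exact = exact_path_difference(x, d, D)
approx = x * d / D                       # Eq. (10.17)
relative_error = abs(exact - approx) / approx

# Assumed wavelength in cm (the value used in the Fig. 10.13 caption)
wavelength = 5e-5
beta = fringe_width(wavelength, d, D)
spacing = (bright_fringe_position(3, wavelength, d, D)
           - bright_fringe_position(2, wavelength, d, D))
```

For these values the relative error of the approximation is of the order of 0.005%, in line with the estimate in the text, and the spacing between consecutive bright fringes equals the fringe width β, independent of n.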


FIGURE 10.13 Computer generated fringe pattern produced by two point sources S1 and S2 on the
screen GG′ (Fig. 10.12); (a) and (b) correspond to d = 0.005 mm and 0.025 mm, respectively (both
figures correspond to D = 5 cm and λ = 5 × 10⁻⁵ cm.) (Adapted from OPTICS by A. Ghatak, Tata
McGraw Hill Publishing Co. Ltd., New Delhi, 2000.)

In the double-slit experiment shown in Fig. 10.12, we have taken the
source hole S on the perpendicular bisector of the two slits, which is
shown as the line SO. What happens if the source S is slightly away from
the perpendicular bisector? Consider that the source is moved to some
new point S′ and suppose that Q is the mid-point of S1 and S2. If the
angle S′QS is φ, then the central bright fringe occurs at an angle −φ, on
the other side. Thus, if the source S is on the perpendicular bisector,
then the central fringe occurs at O, also on the perpendicular bisector. If
S is shifted by an angle φ to point S′, then the central fringe appears at a
point O′ at an angle −φ, which means that it is shifted by the same angle
on the other side of the bisector. This also means that the source S′, the
mid-point Q and the point O′ of the central fringe are in a straight line.
We end this section by quoting from the Nobel lecture of Dennis Gabor*:

The wave nature of light was demonstrated convincingly for the
first time in 1801 by Thomas Young by a wonderfully simple
experiment. He let a ray of sunlight into a dark room, placed a
dark screen in front of it, pierced with two small pinholes, and
beyond this, at some distance, a white screen. He then saw two
darkish lines at both sides of a bright line, which gave him
sufficient encouragement to repeat the experiment, this time with
spirit flame as light source, with a little salt in it to produce the
bright yellow sodium light. This time he saw a number of dark
lines, regularly spaced; the first clear proof that light added to
light can produce darkness. This phenomenon is called
interference. Thomas Young had expected it because he believed
in the wave theory of light.

* Dennis Gabor received the 1971 Nobel Prize in Physics for discovering the
principles of holography.

FIGURE 10.14 Photograph and the graph of the intensity
distribution in Young's double-slit experiment.

Interactive animation of Young's experiment: http://vsg.quasihome.com/interfer.html

We should mention here that the fringes are straight lines although
S1 and S2 are point sources. If we had slits instead of the point sources
(Fig. 10.14), each pair of points would have produced straight line fringes
resulting in straight line fringes with increased intensities.

Example 10.3 Two slits are made one millimetre apart and the screen
is placed one metre away. What is the fringe separation when blue-green
light of wavelength 500 nm is used?

Solution Fringe spacing $= \frac{D\lambda}{d} = \frac{1 \times 5 \times 10^{-7}}{1 \times 10^{-3}}$ m

$= 5 \times 10^{-4}$ m = 0.5 mm
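
The arithmetic of Example 10.3 can be checked with a short sketch of Eq. (10.20). The function name `fringe_width` and its argument names are illustrative choices, not notation from the text; all lengths are assumed to be in metres.

```python
def fringe_width(wavelength_m, screen_distance_m, slit_separation_m):
    """Fringe width beta = lambda * D / d (Eq. 10.20), all lengths in metres."""
    return wavelength_m * screen_distance_m / slit_separation_m


# Values of Example 10.3: lambda = 500 nm, D = 1 m, d = 1 mm
beta = fringe_width(500e-9, 1.0, 1e-3)
print(f"fringe width = {beta * 1e3:.2f} mm")
```

Because the width is proportional to λ and D and inversely proportional to d, doubling the slit separation in the call halves the result.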


Example 10.4 What is the effect on the interference fringes in a
Young's double-slit experiment due to each of the following operations:
(a) the screen is moved away from the plane of the slits;
(b) the (monochromatic) source is replaced by another
(monochromatic) source of shorter wavelength;
(c) the separation between the two slits is increased;
(d) the source slit is moved closer to the double-slit plane;
(e) the width of the source slit is increased;
(f) the monochromatic source is replaced by a source of white
light?
(In each operation, take all parameters, other than the one specified,
to remain unchanged.)


Solution
(a) Angular separation of the fringes remains constant (= λ/d). The
actual separation of the fringes increases in proportion to the
distance of the screen from the plane of the two slits.
(b) The separation of the fringes (and also angular separation)
decreases. See, however, the condition mentioned in (d) below.
(c) The separation of the fringes (and also angular separation)
decreases. See, however, the condition mentioned in (d) below.
(d) Let s be the size of the source and S its distance from the plane of
the two slits. For interference fringes to be seen, the condition
s/S < λ/d should be satisfied; otherwise, interference patterns
produced by different parts of the source overlap and no fringes
are seen. Thus, as S decreases (i.e., the source slit is brought
closer), the interference pattern gets less and less sharp, and
when the source is brought too close for this condition to be valid,
the fringes disappear. Till this happens, the fringe separation
remains fixed.
(e) Same as in (d). As the source slit width increases, fringe pattern
gets less and less sharp. When the source slit is so wide that the
condition s/S < λ/d is not satisfied, the interference pattern
disappears.
(f) The interference patterns due to different component colours of
white light overlap (incoherently). The central bright fringes for
different colours are at the same position. Therefore, the central
fringe is white. For a point P for which S2P − S1P = λb/2, where λb
(≈ 4000 Å) represents the wavelength for the blue colour, the blue
component will be absent and the fringe will appear red in colour.
Slightly farther away where S2Q − S1Q = λb = λr/2 where λr (≈ 8000 Å)
is the wavelength for the red colour, the fringe will be predominantly
blue.
Thus, the fringe closest on either side of the central white fringe
is red and the farthest will appear blue. After a few fringes, no
clear fringe pattern is seen.
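
Parts (a) to (e) of this solution can be explored numerically. The sketch below is only an illustration: the function names and all baseline values (wavelength, distances, source-slit width) are assumptions chosen for the demonstration and are not given in the example.

```python
def angular_separation(wavelength, d):
    """Angular fringe separation lambda / d (radians)."""
    return wavelength / d


def linear_separation(wavelength, D, d):
    """Fringe separation on the screen, lambda * D / d."""
    return wavelength * D / d


def fringes_visible(s, S, wavelength, d):
    """Condition quoted in part (d): fringes are seen only if s/S < lambda/d."""
    return s / S < wavelength / d


# Assumed baseline (metres): 500 nm light, D = 1 m, d = 1 mm,
# source slit of width s = 0.1 mm at distance S = 0.5 m from the slits.
lam, D, d, s, S = 500e-9, 1.0, 1e-3, 0.1e-3, 0.5

print("(a) screen moved to 2D:", linear_separation(lam, 2 * D, d),
      "angular:", angular_separation(lam, d))
print("(b) shorter wavelength:", linear_separation(400e-9, D, d))
print("(c) slit separation doubled:", linear_separation(lam, D, 2 * d))
print("(d) source at S = 0.1 m, fringes visible?",
      fringes_visible(s, 0.1, lam, d))
print("(e) source slit widened to 0.5 mm, fringes visible?",
      fringes_visible(0.5e-3, S, lam, d))
```

With these assumed numbers the baseline satisfies s/S < λ/d, while moving the source closer or widening the source slit violates it, in line with parts (d) and (e).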


10.6 DIFFRACTION

If we look clearly at the shadow cast by an opaque object, close to the
region of geometrical shadow, there are alternate dark and bright regions
just like in interference. This happens due to the phenomenon of
diffraction. Diffraction is a general characteristic exhibited by all types of
waves, be it sound waves, light waves, water waves or matter waves. Since
the wavelength of light is much smaller than the dimensions of most
obstacles, we do not encounter diffraction effects of light in everyday
observations. However, the resolution of our eye or of optical
instruments such as telescopes or microscopes is limited by the
phenomenon of diffraction. Indeed, the colours that you see when a CD is
viewed are due to diffraction effects. We will now discuss the phenomenon
of diffraction.


10.6.1 The single slit

In the discussion of Young's experiment, we stated that a single narrow
slit acts as a new source from which light spreads out. Even before Young,
early experimenters including Newton had noticed that light spreads
out from narrow holes and slits. It seems to turn around corners and
enter regions where we would expect a shadow. These effects, known as
diffraction, can only be properly understood using wave ideas. After all,
you are hardly surprised to hear sound waves from someone talking
around a corner!
When the double slit in Young's experiment is replaced by a single
narrow slit (illuminated by a monochromatic source), a broad pattern
with a central bright region is seen. On both sides, there are alternate
dark and bright regions, the intensity becoming weaker away from the
centre (Fig. 10.16). To understand this, go to Fig. 10.15, which shows a
parallel beam of light falling normally on a single slit LN of width a. The
diffracted light goes on to meet a screen. The midpoint of the slit is M.
A straight line through M perpendicular to the slit plane meets the
screen at C. We want the intensity at any point P on the screen. As before,
straight lines joining P to the different points L, M, N, etc., can be treated as
parallel, making an angle θ with the normal MC.
The basic idea is to divide the slit into much smaller parts, and add
their contributions at P with the proper phase differences. We are treating
different parts of the wavefront at the slit as secondary sources. Because
the incoming wavefront is parallel to the plane of the slit, these sources
are in phase.
The path difference NP − LP between the two edges of the slit can be
calculated exactly as for Young's experiment. From Fig. 10.15,

$NP - LP = NQ = a \sin\theta \approx a\theta$   (10.21)

Similarly, if two points M1 and M2 in the slit plane are separated by y, the
path difference M2P − M1P ≈ yθ. We now have to sum up equal, coherent
contributions from a large number of sources, each with a different phase.
This calculation was made by Fresnel using integral calculus, so we omit
it here. The main features of the diffraction pattern can be understood by
simple arguments.
At the central point C on the screen, the angle θ is zero. All path
differences are zero and hence all the parts of the slit contribute in phase.
This gives maximum intensity at C. Experimental observation shown in
Fig. 10.16 indicates that the intensity has a central maximum at θ = 0
and other secondary maxima at θ ≈ (n + 1/2) λ/a, and has minima
(zero intensity) at θ ≈ nλ/a, n = ±1, ±2, ±3, .... It is easy to see why it
has minima at these values of angle. Consider first the angle θ where the
path difference aθ is λ. Then,

$\theta \approx \lambda/a$   (10.22)

FIGURE 10.15 The geometry of path differences for diffraction by a single slit.

Now, divide the slit into two equal halves LM and MN each of size a/2.
For every point M1 in LM, there is a point M2 in MN such that
M1M2 = a/2. The path difference between M1 and M2 at P = M2P − M1P
= θa/2 = λ/2 for the angle chosen. This means that the contributions
from M1 and M2 are 180° out of phase and cancel in the direction
θ = λ/a. Contributions from the two halves of the slit LM and MN,
therefore, cancel each other. Equation (10.22) gives the angle at which
the intensity falls to zero. One can similarly show that the intensity is
zero for θ = nλ/a, with n being any integer (except zero!). Notice that the
angular size of the central maximum increases when the slit width a
decreases.
It is also easy to see why there are maxima at θ = (n + 1/2) λ/a and
why they go on becoming weaker and weaker with increasing n. Consider
an angle θ = 3λ/2a which is midway between two of the dark fringes.
Divide the slit into three equal parts. If we take the first two thirds of the
slit, the path difference between the two ends would be

$\frac{2}{3} a \times \theta = \frac{2a}{3} \times \frac{3\lambda}{2a} = \lambda$   (10.23)

The first two-thirds of the slit can therefore be divided
into two halves which have a λ/2 path difference. The
contributions of these two halves cancel in the same manner
as described earlier. Only the remaining one-third of the
slit contributes to the intensity at a point between the two
minima. Clearly, this will be much weaker than the central
maximum (where the entire slit contributes in phase). One
can similarly show that there are maxima at (n + 1/2) λ/a
with n = 2, 3, etc. These become weaker with increasing n,
since only one-fifth, one-seventh, etc., of the slit contributes
in these cases. The photograph and intensity pattern
corresponding to it are shown in Fig. 10.16.
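
The positions of the minima and the weakening of the secondary maxima can be checked against the standard small-angle (Fraunhofer) single-slit result, I/I0 = (sin β / β)² with β = πaθ/λ. This formula is the outcome of the integration that the text omits, so it is brought in here as an outside assumption rather than derived. The function name and the choice of measuring θ in units of λ/a are illustrative.

```python
import math


def relative_intensity(u):
    """I/I0 for a single slit at small angle theta = u * (lambda / a).

    Uses the standard Fraunhofer result (sin(b)/b)**2 with b = pi * u,
    which is assumed here and not derived in the text.
    """
    if u == 0:
        return 1.0  # central maximum: every part of the slit is in phase
    b = math.pi * u
    return (math.sin(b) / b) ** 2


for u in (0, 1, 1.5, 2, 2.5):
    print(f"theta = {u} lambda/a -> I/I0 = {relative_intensity(u):.4f}")
```

The intensity vanishes (to rounding error) at θ = λ/a and 2λ/a, and at θ = 3λ/2a and 5λ/2a it is roughly 4.5% and 1.6% of the central value, so the secondary maxima do fall off rapidly.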
There has been prolonged discussion about the difference
between interference and diffraction among scientists since
the discovery of these phenomena. In this context, it is
interesting to note what Richard Feynman* has said in his famous
Feynman Lectures on Physics:

No one has ever been able to define the difference between
interference and diffraction satisfactorily. It is just a question
of usage, and there is no specific, important physical difference
between them. The best we can do, roughly speaking, is to
say that when there are only a few sources, say two interfering
sources, then the result is usually called interference, but if there
is a large number of them, it seems that the word diffraction is
more often used.

FIGURE 10.16 Intensity distribution and photograph of
fringes due to diffraction at single slit.

In the double-slit experiment, we must note that the pattern on the
screen is actually a superposition of single-slit diffraction from each slit
or hole, and the double-slit interference pattern. This is shown in
Fig. 10.17. It shows a broader diffraction peak in which there appear
several fringes of smaller width due to double-slit interference. The
number of interference fringes occurring in the broad diffraction peak
depends on the ratio d/a, that is the ratio of the distance between the
two slits to the width of a slit. In the limit of a becoming very small, the
diffraction pattern will become very flat and we will observe the two-slit
interference pattern [see Fig. 10.13(b)].

FIGURE 10.17 The actual double-slit interference pattern.
The envelope shows the single slit diffraction.

Interactive animation on single slit diffraction pattern: http://www.phys.hawaii.edu/~teb/optics/java/slitdiffr/

Example 10.5 In Example 10.3, what should the width of each slit be
to obtain 10 maxima of the double slit pattern within the central
maximum of the single slit pattern?

Solution We want $a\theta = \lambda$, i.e. $\theta = \lambda/a$, and

$10\,\frac{\lambda}{d} = 2\,\frac{\lambda}{a}$, so that $a = \frac{d}{5} = 0.2$ mm

Notice that the wavelength of light and distance of the screen do not
enter in the calculation of a.
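
The condition used in Example 10.5 can be written as a small helper. It is a sketch under the same reasoning as the solution: the central diffraction maximum has angular width 2λ/a, and it must hold N fringes of angular spacing λ/d. The function name is an illustrative choice.

```python
def slit_width_for_maxima(slit_separation, n_maxima):
    """Slit width a such that n_maxima double-slit fringes fit in the
    central single-slit maximum: n * (lambda / d) = 2 * (lambda / a)."""
    return 2 * slit_separation / n_maxima


# Example 10.5: d = 1 mm, 10 maxima
a = slit_width_for_maxima(1e-3, 10)
print(f"a = {a * 1e3:.2f} mm")
```

The wavelength cancels from both sides of the condition, which is why it does not appear among the arguments.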

* Richard Feynman was one of the recipients of the 1965 Nobel Prize in Physics
for his fundamental work in quantum electrodynamics.

In the double-slit interference experiment of Fig. 10.12, what happens
if we close one slit? You will see that it now amounts to a single slit. But
you will have to take care of some shift in the pattern. We now have a
source at S, and only one hole (or slit) S1 or S2. This will produce a single-slit
diffraction pattern on the screen. The centre of the central bright fringe
will appear at a point which lies on the straight line SS1 or SS2, as the
case may be.
We now compare and contrast the interference pattern with that seen
for a coherently illuminated single slit (usually called the single slit
diffraction pattern).
(i) The interference pattern has a number of equally spaced bright and
dark bands. The diffraction pattern has a central bright maximum
which is twice as wide as the other maxima. The intensity falls as we
go to successive maxima away from the centre, on either side.
(ii) We calculate the interference pattern by superposing two waves
originating from the two narrow slits. The diffraction pattern is a
superposition of a continuous family of waves originating from each
point on a single slit.
(iii) For a single slit of width a, the first null of the diffraction pattern
occurs at an angle of λ/a. At the same angle of λ/a, we get a maximum
(not a null) for two narrow slits separated by a distance a.
One must understand that both d and a have to be quite small, to be
able to observe good interference and diffraction patterns. For example,
the separation d between the two slits must be of the order of a millimetre
or so. The width a of each slit must be even smaller, of the order of 0.1 or
0.2 mm.
In our discussion of Young's experiment and the single-slit diffraction,
we have assumed that the screen on which the fringes are formed is at a
large distance. The two or more paths from the slits to the screen were
treated as parallel. This situation also occurs when we place a converging
lens after the slits and place the screen at the focus. Parallel paths from
the slit are combined at a single point on the screen. Note that the lens
does not introduce any extra path differences in a parallel beam. This
arrangement is often used since it gives more intensity than placing the
screen far away. If f is the focal length of the lens, then we can easily work
out the size of the central bright maximum. In terms of angles, the
separation of the central maximum from the first null of the diffraction
pattern is λ/a. Hence, the size on the screen will be f λ/a.
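
As a rough numerical illustration of this last result, the sketch below evaluates f λ/a. The focal length, wavelength and slit width are assumed values picked only for the demonstration; none of them comes from the text.

```python
def central_maximum_half_width(focal_length, wavelength, slit_width):
    """Distance on the focal-plane screen from the centre of the pattern
    to the first null: f * lambda / a (small-angle approximation)."""
    return focal_length * wavelength / slit_width


# Assumed values: f = 50 cm, lambda = 500 nm, a = 0.1 mm (all in metres)
size = central_maximum_half_width(0.5, 500e-9, 0.1e-3)
print(f"centre to first null = {size * 1e3:.2f} mm")
```

A narrower slit or a longer focal length spreads the central maximum over a larger distance on the screen.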

10.6.2 Seeing the single slit diffraction pattern


It is surprisingly easy to see the single-slit diffraction pattern for oneself.
The equipment needed can be found in most homes: two razor blades
and one clear glass electric bulb, preferably with a straight filament. One
has to hold the two blades so that the edges are parallel and have a
narrow slit in between. This is easily done with the thumb and forefingers
(Fig. 10.18).
Keep the slit parallel to the filament, right in front of the eye. Use
spectacles if you normally do. With slight adjustment of the width of the
slit and the parallelism of the edges, the pattern should be seen with its
bright and dark bands. Since the position of all the bands (except the
central one) depends on wavelength, they will show some colours. Using
a filter for red or blue will make the fringes clearer. With both filters
available, the wider fringes for red compared to blue can be seen.

FIGURE 10.18 Holding two blades to form a single slit. A bulb filament
viewed through this shows clear diffraction bands.


In this experiment, the filament plays the role of the first slit S in
Fig. 10.16. The lens of the eye focuses the pattern on the screen (the
retina of the eye).
With some effort, one can cut a double slit in an aluminium foil with
a blade. The bulb filament can be viewed as before to repeat Young's
experiment. In daytime, there is another suitable bright source subtending
a small angle at the eye. This is the reflection of the Sun in any shiny
convex surface (e.g., a cycle bell). Do not try direct sunlight: it can damage
the eye and will not give fringes anyway as the Sun subtends an angle
of (1/2)°.
In interference and diffraction, light energy is redistributed. If it
reduces in one region, producing a dark fringe, it increases in another
region, producing a bright fringe. There is no gain or loss of energy,
which is consistent with the principle of conservation of energy.

10.6.3 Resolving power of optical instruments

In Chapter 9 we discussed telescopes. The angular resolution
of the telescope is determined by the objective of the telescope. Stars
which are not resolved in the image produced by the objective cannot be
resolved by any further magnification produced by the eyepiece. The
primary purpose of the eyepiece is to provide magnification of the image
produced by the objective.
Consider a parallel beam of light falling on a convex lens. If the lens is
well corrected for aberrations, then geometrical optics tells us that the
beam will get focused to a point. However, because of diffraction, the
beam instead of getting focused to a point gets focused to a spot of finite
area. In this case the effects due to diffraction can be taken into account
by considering a plane wave incident on a circular aperture followed by
a convex lens (Fig. 10.19). The analysis of the corresponding diffraction
pattern is quite involved; however, in principle, it is similar to the analysis
carried out to obtain the single-slit diffraction pattern. Taking into account
the effects due to diffraction, the pattern on the focal plane would consist
of a central bright region surrounded by concentric dark and bright rings
(Fig. 10.19). A detailed analysis shows that the radius of the central bright
region is approximately given by

$r_0 \approx \frac{1.22\,\lambda f}{2a} = \frac{0.61\,\lambda f}{a}$   (10.24)

FIGURE 10.19 A parallel beam of light is incident on a convex lens.
Because of diffraction effects, the beam gets focused to a
spot of radius ≈ 0.61 λf/a.

where f is the focal length of the lens and 2a is the diameter of the circular
aperture or the diameter of the lens, whichever is smaller. Typically if
λ ≈ 0.5 μm, f ≈ 20 cm and a ≈ 5 cm
we have
r0 ≈ 1.2 μm
Although the size of the spot is very small, it plays an important role
in determining the limit of resolution of optical instruments like a telescope
or a microscope. For the two stars to be just resolved

$f\,\Delta\theta \approx r_0 \approx \frac{0.61\,\lambda f}{a}$

implying

$\Delta\theta \approx \frac{0.61\,\lambda}{a}$   (10.25)

Thus Δθ will be small if the diameter of the objective is large. This
implies that the telescope will have better resolving power if a is large. It
is for this reason that for better resolution, a telescope must have a large
diameter objective.

Example 10.6 Assume that light of wavelength 6000 Å is coming from
a star. What is the limit of resolution of a telescope whose objective
has a diameter of 100 inch?

Solution A 100 inch telescope implies that 2a = 100 inch
= 254 cm. Thus if
λ ≈ 6000 Å = 6 × 10⁻⁵ cm
then

$\Delta\theta \approx \frac{0.61 \times 6 \times 10^{-5}}{127} \approx 2.9 \times 10^{-7}$ radians
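
Both numerical estimates in this section, the spot radius from Eq. (10.24) and the limit of resolution from Eq. (10.25), can be reproduced with the sketch below. The function names are illustrative, and SI units (metres) are assumed throughout, so the centimetre values of the text are converted first.

```python
def spot_radius(wavelength, focal_length, aperture_radius):
    """Radius of the central bright region, r0 = 0.61 * lambda * f / a."""
    return 0.61 * wavelength * focal_length / aperture_radius


def limit_of_resolution(wavelength, aperture_radius):
    """Smallest resolvable angle, delta_theta = 0.61 * lambda / a (radians)."""
    return 0.61 * wavelength / aperture_radius


# Typical values quoted in the text: lambda = 0.5 um, f = 20 cm, a = 5 cm
r0 = spot_radius(0.5e-6, 0.20, 0.05)
print(f"r0 = {r0 * 1e6:.2f} micrometres")

# Example 10.6: lambda = 6000 angstrom, 2a = 100 inch = 2.54 m
dtheta = limit_of_resolution(6000e-10, 2.54 / 2)
print(f"delta theta = {dtheta:.2e} rad")
```

Since a appears in the denominator of both expressions, a larger objective gives a smaller spot and a finer limit of resolution.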

We can apply a similar argument to the objective lens of a microscope.
In this case, the object is placed slightly beyond f, so that a real image is
formed at a distance v [Fig. 10.20]. The magnification (ratio of
image size to object size) is given by m ≈ v/f. It can be seen from
Fig. 10.20 that

$D/f \approx 2 \tan\beta$   (10.26)

where 2β is the angle subtended by the diameter of the objective lens at
the focus of the microscope.

FIGURE 10.20 Real image formed by the objective lens of the microscope.

DETERMINE THE RESOLVING POWER OF YOUR EYE

You can estimate the resolving power of your eye with a simple experiment. Make
black stripes of equal width separated by white stripes; see figure here. All the black
stripes should be of equal width, while the width of the intermediate white stripes should
increase as you go from the left to the right. For example, let all black stripes have a width
of 5 mm. Let the width of the first two white stripes be 0.5 mm each, the next two white
stripes be 1 mm each, the next two 1.5 mm each, etc. Paste this pattern on a wall in a
room or laboratory, at the height of your eye.

Now watch the pattern, preferably with one eye. By moving away or closer to the wall,
find the position where you can just see some two black stripes as separate stripes. All
the black stripes to the left of this stripe would merge into one another and would not be
distinguishable. On the other hand, the black stripes to the right of this would be more
and more clearly visible. Note the width d of the white stripe which separates the two
regions, and measure the distance D of the wall from your eye. Then d/D is the resolution
of your eye.
You have watched specks of dust floating in air in a sunbeam entering through your
window. Find the distance (of a speck) which you can clearly see and distinguish from a
neighbouring speck. Knowing the resolution of your eye and the distance of the speck,
estimate the size of the speck of dust.

When the separation between two points in a microscopic specimen
is comparable to the wavelength λ of the light, the diffraction effects
become important. The image of a point object will again be a diffraction
pattern whose size in the image plane will be

$v\theta = v\left(\frac{1.22\,\lambda}{D}\right)$   (10.27)

Two objects whose images are closer than this distance will not be
resolved, they will be seen as one. The corresponding minimum
separation, dmin, in the object plane is given by

$d_{\min} = \left[v\left(\frac{1.22\,\lambda}{D}\right)\right]\Big/ m = \frac{1.22\,\lambda}{D}\cdot\frac{v}{m}$

or, since m ≈ v/f,

$d_{\min} = \frac{1.22\,f\lambda}{D}$   (10.28)

Now, combining Eqs. (10.26) and (10.28), we get

$d_{\min} = \frac{1.22\,\lambda}{2\tan\beta} \approx \frac{1.22\,\lambda}{2\sin\beta}$   (10.29)

If the medium between the object and the objective lens is not air but
a medium of refractive index n, Eq. (10.29) gets modified to

$d_{\min} = \frac{1.22\,\lambda}{2 n \sin\beta}$   (10.30)

The product n sin β is called the numerical aperture and is sometimes
marked on the objective.
The resolving power of the microscope is given by the reciprocal of
the minimum separation of two points seen as distinct. It can be seen
from Eq. (10.30) that the resolving power can be increased by choosing a
medium of higher refractive index. Usually an oil having a refractive index
close to that of the objective glass is used. Such an arrangement is called
an oil immersion objective. Notice that it is not possible to make sin β
larger than unity. Thus, we see that the resolving power of a microscope
is basically determined by the wavelength of the light used.
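
The effect of an immersion medium in Eq. (10.30) can be illustrated with a short sketch. The wavelength (550 nm), half-angle (60°) and oil refractive index (1.5) are assumed values chosen for the demonstration, not data from the text, and the function name is illustrative.

```python
import math


def min_separation(wavelength, half_angle_rad, n=1.0):
    """d_min = 1.22 * lambda / (2 * n * sin(beta)), Eq. (10.30).

    n * sin(beta) is the numerical aperture of the objective.
    """
    return 1.22 * wavelength / (2 * n * math.sin(half_angle_rad))


beta = math.radians(60)                        # assumed half-angle
d_air = min_separation(550e-9, beta)           # air, n = 1
d_oil = min_separation(550e-9, beta, n=1.5)    # assumed oil, n = 1.5

print(f"d_min in air: {d_air * 1e9:.0f} nm")
print(f"d_min in oil: {d_oil * 1e9:.0f} nm")
```

For a fixed wavelength and angle, the minimum separation shrinks by the factor n, so the resolving power (its reciprocal) grows by the same factor.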
There is a likelihood of confusion between resolution and
magnification, and similarly between the role of a telescope and a
microscope to deal with these parameters. A telescope produces images
of far objects nearer to our eye. Therefore objects which are not resolved
at far distance, can be resolved by looking at them through a telescope.
A microscope, on the other hand, magnifies objects (which are near to
us) and produces their larger image. We may be looking at two stars or
two satellites of a far-away planet, or we may be looking at different
regions of a living cell. In this context, it is good to remember that a
telescope resolves whereas a microscope magnifies.

10.6.4 The validity of ray optics

An aperture (i.e., slit or hole) of size a illuminated by a parallel beam
sends diffracted light into an angle of approximately λ/a. This is the
angular size of the bright central maximum. In travelling a distance z,
the diffracted beam therefore acquires a width zλ/a due to diffraction. It
is interesting to ask at what value of z the spreading due to diffraction
becomes comparable to the size a of the aperture. We thus approximately
equate zλ/a with a. This gives the distance beyond which divergence of
the beam of width a becomes significant. Therefore,

$z \approx \frac{a^2}{\lambda}$   (10.31)

We define a quantity zF called the Fresnel distance by the following
equation

$z_F \approx a^2/\lambda$

Equation (10.31) shows that for distances much smaller than zF, the
spreading due to diffraction is small compared to the size of the beam.
It becomes comparable when the distance is approximately zF. For
distances much greater than zF, the spreading due to diffraction
dominates over that due to ray optics (i.e., the size a of the aperture).
Equation (10.31) also shows that ray optics is valid in the limit of
wavelength tending to zero.


Example 10.7 For what distance is ray optics a good approximation
when the aperture is 3 mm wide and the wavelength is 500 nm?

Solution $z_F = \frac{a^2}{\lambda} = \frac{(3 \times 10^{-3})^2}{5 \times 10^{-7}} = 18$ m

This example shows that even with a small aperture, diffraction
spreading can be neglected for rays many metres in length. Thus, ray
optics is valid in many common situations.
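
The Fresnel distance of Example 10.7 can be verified with a one-line function. The function name is an illustrative choice and lengths are assumed to be in metres.

```python
def fresnel_distance(aperture_size, wavelength):
    """Fresnel distance z_F = a**2 / lambda (Eq. 10.31)."""
    return aperture_size ** 2 / wavelength


# Example 10.7: a = 3 mm, lambda = 500 nm
z_f = fresnel_distance(3e-3, 500e-9)
print(f"z_F = {z_f:.1f} m")
```

Because z_F grows as the square of the aperture size, halving the aperture to 1.5 mm would cut the distance over which ray optics holds to a quarter of this value.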

10.7 POLARISATION


Consider holding a long string that is held horizontally, the other end of
which is assumed to be fixed. If we move the end of the string up and
down in a periodic manner, we will generate a wave propagating in the
+x direction (Fig. 10.21). Such a wave could be described by the following
equation


$y(x,t) = a \sin(kx - \omega t)$   (10.32)

where a and ω (= 2πν) represent the amplitude and the angular frequency
of the wave, respectively; further,

$\lambda = \frac{2\pi}{k}$   (10.33)

represents the wavelength associated with the wave. We had discussed
propagation of such waves in Chapter 15 of Class XI textbook. Since the
displacement (which is along the y direction) is at right angles to the
direction of propagation of the wave, we have what is known as a
transverse wave. Also, since the displacement is in the y direction, it is
often referred to as a y-polarised wave. Since each point on the string
moves on a straight line, the wave is also referred to as a linearly polarised
wave. Further, the string always remains confined to the x-y plane and
therefore it is also referred to as a plane polarised wave.
In a similar manner we can consider the vibration of the string in the
x-z plane generating a z-polarised wave whose displacement will be given
by

$z(x,t) = a \sin(kx - \omega t)$   (10.34)

FIGURE 10.21 (a) The curves represent the displacement of a string at
t = 0 and at t = Δt, respectively when a sinusoidal wave is propagating
in the +x-direction. (b) The curve represents the time variation
of the displacement at x = 0 when a sinusoidal wave is propagating
in the +x-direction. At x = Δx, the time variation of the
displacement will be slightly displaced to the right.

It should be mentioned that the linearly polarised waves [described
by Eqs. (10.32) and (10.34)] are all transverse waves; i.e., the
displacement of each point of the string is always at right angles to the
direction of propagation of the wave. Finally, if the plane of vibration of
the string is changed randomly in very short intervals of time, then we
have what is known as an unpolarised wave. Thus, for an unpolarised
wave the displacement will be randomly changing with time though it
will always be perpendicular to the direction of propagation.


Light waves are transverse in nature; i.e., the electric field associated
with a propagating light wave is always at right angles to the direction of
propagation of the wave. This can be easily demonstrated using a simple
polaroid. You must have seen thin plastic like sheets, which are called
polaroids. A polaroid consists of long chain molecules aligned in a
particular direction. The electric vectors (associated with the propagating
light wave) along the direction of the aligned molecules get absorbed.
Thus, if an unpolarised light wave is incident on such a polaroid then
the light wave will get linearly polarised with the electric vector oscillating
along a direction perpendicular to the aligned molecules; this direction
is known as the pass-axis of the polaroid.
Thus, if the light from an ordinary source (like a sodium lamp) passes
through a polaroid sheet P1, it is observed that its intensity is reduced by
half. Rotating P1 has no effect on the transmitted beam and transmitted
intensity remains constant. Now, let an identical piece of polaroid P2 be
placed before P1. As expected, the light from the lamp is reduced in
intensity on passing through P2 alone. But now rotating P1 has a dramatic
effect on the light coming from P2. In one position, the intensity transmitted
by P2 followed by P1 is nearly zero. When turned by 90° from this position,
P1 transmits nearly the full intensity emerging from P2 (Fig. 10.22).

FIGURE 10.22 (a) Passage of light through two polaroids P2 and P1. The
transmitted fraction falls from 1 to 0 as the angle between them varies
from 0° to 90°. Notice that the light seen through a single polaroid
P1 does not vary with angle. (b) Behaviour of the electric vector
when light passes through two polaroids. The transmitted
polarisation is the component parallel to the polaroid axis.
The double arrows show the oscillations of the electric vector.

The above experiment can be easily understood by assuming that
light passing through the polaroid P2 gets polarised along the pass-axis
of P2. If the pass-axis of P2 makes an angle θ with the pass-axis of P1,
then when the polarised beam passes through the polaroid P1, the
component E cos θ (along the pass-axis of P1) will pass through P1.
Thus, as we rotate the polaroid P1 (or P2), the intensity will vary as:

$I = I_0 \cos^2\theta$   (10.35)

where I0 is the intensity of the polarised light after passing through
P2. This is known as Malus' law. The above discussion shows that the
intensity coming out of a single polaroid is half of the incident intensity.
By putting a second polaroid, the intensity can be further controlled
from 50% to zero of the incident intensity by adjusting the angle between
the pass-axes of two polaroids.
Polaroids can be used to control the intensity, in sunglasses,
windowpanes, etc. Polaroids are also used in photographic cameras and
3D movie cameras.

Example 10.8 Discuss the intensity of transmitted light when a
polaroid sheet is rotated between two crossed polaroids.

Solution Let I0 be the intensity of polarised light after passing through
the first polariser P1. Then the intensity of light after passing through
second polariser P2 will be

$I = I_0 \cos^2\theta$,

where θ is the angle between pass axes of P1 and P2. Since P1 and P3
are crossed the angle between the pass axes of P2 and P3 will be
(π/2 − θ). Hence the intensity of light emerging from P3 will be

$I = I_0 \cos^2\theta \, \cos^2\left(\frac{\pi}{2} - \theta\right) = I_0 \cos^2\theta \, \sin^2\theta = \frac{I_0}{4} \sin^2 2\theta$

Therefore, the transmitted intensity will be maximum when θ = π/4.
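
The result of Example 10.8 can be checked by applying Malus' law twice. The sketch below takes I0 = 1 (an arbitrary normalisation) and scans the angle in one-degree steps; the function names are illustrative.

```python
import math


def malus(i0, theta):
    """Malus' law: intensity passed by a polaroid whose pass-axis makes
    an angle theta (radians) with the incoming polarisation."""
    return i0 * math.cos(theta) ** 2


def through_crossed_pair(i0, theta):
    """Intensity after P2 and then P3, where P3 is crossed with P1 and
    P2 makes an angle theta with P1 (labels as in Example 10.8)."""
    after_p2 = malus(i0, theta)
    return malus(after_p2, math.pi / 2 - theta)


# Scan theta from 0 to 90 degrees and find where the output peaks
angles_deg = range(0, 91)
best_deg = max(angles_deg,
               key=lambda k: through_crossed_pair(1.0, math.radians(k)))
print("maximum at", best_deg, "degrees, I/I0 =",
      round(through_crossed_pair(1.0, math.radians(best_deg)), 4))
```

The scan peaks at 45°, where a quarter of I0 emerges, and the output drops to zero when the middle sheet is aligned with either of the crossed polaroids.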

10.7.1 Polarisation by scattering

The light from a clear blue portion of the sky shows a rise and fall of
intensity when viewed through a polaroid which is rotated. This is nothing
but sunlight, which has changed its direction (having been scattered) on
encountering the molecules of the earths atmosphere. As Fig. 10.23(a)
shows, the incident sunlight is unpolarised. The dots stand for polarisation
perpendicular to the plane of the figure. The double arrows show
polarisation in the plane of the figure. (There is no phase relation between
these two in unpolarised light). Under the influence of the electric field of
the incident wave the electrons in the molecules acquire components of
motion in both these directions. We have drawn an observer looking at
90° to the direction of the sun. Clearly, charges accelerating parallel to
the double arrows do not radiate energy towards this observer since their
acceleration has no transverse component. The radiation scattered by
the molecule is therefore represented by dots. It is polarised
perpendicular to the plane of the figure. This explains the polarisation of
scattered light from the sky.


FIGURE 10.23 (a) Polarisation of the blue scattered light from the sky.
The incident sunlight is unpolarised (dots and arrows). A typical
molecule is shown. It scatters light by 90°, polarised normal to
the plane of the paper (dots only). (b) Polarisation of light
reflected from a transparent medium at the Brewster angle
(reflected ray perpendicular to refracted ray).

The scattering of light by molecules was intensively investigated by
C.V. Raman and his collaborators in Kolkata in the 1920s. Raman was
awarded the Nobel Prize for Physics in 1930 for this work.

A SPECIAL CASE OF TOTAL TRANSMISSION

When light is incident on an interface of two media, it is observed that some part of it
gets reflected and some part gets transmitted. Consider a related question: Is it possible
that under some conditions a monochromatic beam of light incident on a surface
(which is normally reflective) gets completely transmitted with no reflection? To your
surprise, the answer is yes.

Let us try a simple experiment and check what happens. Arrange a laser, a good
polariser, a prism and screen as shown in the figure here.

Let the light emitted by the laser source pass through the polariser and be incident
on the surface of the prism at Brewster's angle of incidence $i_B$. Now rotate the
polariser carefully and you will observe that for a specific alignment of the polariser, the
light incident on the prism is completely transmitted and no light is reflected from the
surface of the prism. The reflected spot will completely vanish.

10.7.2 Polarisation by reflection

Figure 10.23(b) shows light reflected from a transparent medium, say,
water. As before, the dots and arrows indicate that both polarisations are
present in the incident and refracted waves. We have drawn a situation
in which the reflected wave travels at right angles to the refracted wave.
The oscillating electrons in the water produce the reflected wave. These
electrons move in the two directions transverse to the wave in the
medium, i.e., the refracted wave. The arrows are parallel to the direction
of the reflected wave. Motion in this direction does not contribute to the
reflected wave. As the figure shows, the reflected light is therefore linearly
polarised perpendicular to the plane of the figure (represented by dots).
This can be checked by looking at the reflected light through an analyser.
The transmitted intensity will be zero when the axis of the analyser is in
the plane of the figure, i.e., the plane of incidence.
When unpolarised light is incident on the boundary between two
transparent media, the reflected light is polarised with its electric vector
perpendicular to the plane of incidence when the refracted and reflected
rays make a right angle with each other. Thus we have seen that when
reflected wave is perpendicular to the refracted wave, the reflected wave
is a totally polarised wave. The angle of incidence in this case is called
Brewster's angle and is denoted by $i_B$. We can see that $i_B$ is related to
the refractive index $\mu$ of the denser medium. Since we have
$i_B + r = \pi/2$, we get from Snell's law

$\mu = \frac{\sin i_B}{\sin r} = \frac{\sin i_B}{\sin(\pi/2 - i_B)} = \frac{\sin i_B}{\cos i_B} = \tan i_B$   (10.36)

This is known as Brewster's law.


Example 10.9 Unpolarised light is incident on a plane glass surface.
What should be the angle of incidence so that the reflected and
refracted rays are perpendicular to each other?
Solution For $i + r$ to be equal to $\pi/2$, we should have
$\tan i_B = \mu = 1.5$. This gives $i_B \approx 57^\circ$ (more precisely,
56.3°). This is Brewster's angle for the air to glass interface.
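The arithmetic of Example 10.9 can be reproduced with a few lines of Python. This is only a sketch: the function name `brewster_angle_deg` is an assumption, and the value 1.5 is the refractive index of glass used in the example.

```python
import math

def brewster_angle_deg(mu):
    """Brewster's law: tan(i_B) = mu, so i_B = arctan(mu)."""
    return math.degrees(math.atan(mu))

i_b = brewster_angle_deg(1.5)   # angle of incidence in degrees
r = 90.0 - i_b                  # refracted ray is perpendicular to reflected ray

# Consistency check with Snell's law: sin(i_B) / sin(r) should equal mu.
ratio = math.sin(math.radians(i_b)) / math.sin(math.radians(r))
print(round(i_b, 1), round(r, 1), round(ratio, 3))
```

The check confirms that at this angle of incidence the angle of refraction is the complement of $i_B$, as the derivation of Brewster's law assumes.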


For simplicity, we have discussed scattering of light by 90°, and
reflection at the Brewster angle. In this special situation, one of the two
perpendicular components of the electric field is zero. At other angles,
both components are present but one is stronger than the other. There is
no stable phase relationship between the two perpendicular components
since these are derived from two perpendicular components of an
unpolarised beam. When such light is viewed through a rotating analyser,
one sees a maximum and a minimum of intensity but not complete
darkness. This kind of light is called partially polarised.
Let us try to understand the situation. When an unpolarised beam of
light is incident at Brewster's angle on an interface of two media, only
the part of the light with its electric field vector perpendicular to the
plane of incidence will be reflected. Now, if we use a good polariser to
completely remove all the light with its electric vector perpendicular to
the plane of incidence, and let the remaining light be incident on the
surface of the prism at Brewster's angle, we will observe no reflection
and there will be total transmission of light.
We began this chapter by pointing out that there are some phenomena
which can be explained only by the wave theory. In order to develop a
proper understanding, we first described how some phenomena like
reflection and refraction, which were studied on the basis of Ray Optics
in Chapter 9, can also be understood on the basis of Wave Optics. Then
we described Young's double slit experiment, which was a turning point
in the study of optics. Finally, we described some associated points such
as diffraction, resolution, polarisation, and validity of ray optics. In the
next chapter, you will see how new experiments led to new theories at
the turn of the century around 1900 A.D.

SUMMARY

1. Huygens' principle tells us that each point on a wavefront is a source
   of secondary waves, which add up to give the wavefront at a later time.

2. Huygens' construction tells us that the new wavefront is the forward
   envelope of the secondary waves. When the speed of light is
   independent of direction, the secondary waves are spherical. The rays
   are then perpendicular to both the wavefronts and the time of travel
   is the same measured along any ray. This principle leads to the well
   known laws of reflection and refraction.

3. The principle of superposition of waves applies whenever two or more
   sources of light illuminate the same point. When we consider the
   intensity of light due to these sources at the given point, there is an
   interference term in addition to the sum of the individual intensities.
   But this term is important only if it has a non-zero average, which
   occurs only if the sources have the same frequency and a stable
   phase difference.

4. Young's double slit of separation d gives equally spaced fringes of
   angular separation $\lambda/d$. The source, mid-point of the slits, and
   central bright fringe lie in a straight line. An extended source will
   destroy the fringes if it subtends angle more than $\lambda/d$ at the slits.

5. A single slit of width a gives a diffraction pattern with a central
   maximum. The intensity falls to zero at angles of $\pm\lambda/a$,
   $\pm 2\lambda/a$, etc., with successively weaker secondary maxima in
   between. Diffraction limits the angular resolution of a telescope to
   $\lambda/D$ where D is the diameter. Two stars closer than this give
   strongly overlapping images. Similarly, a microscope objective
   subtending angle $2\beta$ at the focus, in a medium of refractive index n,
   will just separate two objects spaced at a distance
   $\lambda/(2n \sin\beta)$, which is the resolution limit of a
   microscope. Diffraction determines the limitations of the concept of
   light rays. A beam of width a travels a distance $a^2/\lambda$, called the
   Fresnel distance, before it starts to spread out due to diffraction.

6. Natural light, e.g., from the sun, is unpolarised. This means the electric
   vector takes all possible directions in the transverse plane, rapidly
   and randomly, during a measurement. A polaroid transmits only one
   component (parallel to a special axis). The resulting light is called
   linearly polarised or plane polarised. When this kind of light is viewed
   through a second polaroid whose axis turns through $2\pi$, two maxima
   and minima of intensity are seen. Polarised light can also be produced
   by reflection at a special angle (called the Brewster angle) and by
   scattering through $\pi/2$ in the earth's atmosphere.

POINTS TO PONDER

1. Waves from a point source spread out in all directions, while light was
   seen to travel along narrow rays. It required the insight and experiment
   of Huygens, Young and Fresnel to understand how a wave theory could
   explain all aspects of the behaviour of light.

2. The crucial new feature of waves is interference of amplitudes from
   different sources which can be both constructive and destructive, as
   shown in Young's experiment.

3. Even a wave falling on a single slit should be regarded as a large
   number of sources which interfere constructively in the forward
   direction ($\theta = 0$), and destructively in other directions.

4. Diffraction phenomena define the limits of ray optics. The limit of the
   ability of microscopes and telescopes to distinguish very close objects
   is set by the wavelength of light.

5. Most interference and diffraction effects exist even for longitudinal
   waves like sound in air. But polarisation phenomena are special to
   transverse waves like light waves.


EXERCISES
10.1 Monochromatic light of wavelength 589 nm is incident from air on a
water surface. What are the wavelength, frequency and speed of
(a) reflected, and (b) refracted light? Refractive index of water is
1.33.

10.2 What is the shape of the wavefront in each of the following cases:
(a) Light diverging from a point source.
(b) Light emerging out of a convex lens when a point source is placed
at its focus.
(c) The portion of the wavefront of light from a distant star intercepted
by the Earth.

10.3 (a) The refractive index of glass is 1.5. What is the speed of light in
glass? (Speed of light in vacuum is $3.0 \times 10^8$ m s$^{-1}$)
(b) Is the speed of light in glass independent of the colour of light? If
not, which of the two colours red and violet travels slower in a
glass prism?

10.4 In a Young's double-slit experiment, the slits are separated by
0.28 mm and the screen is placed 1.4 m away. The distance between
the central bright fringe and the fourth bright fringe is measured
to be 1.2 cm. Determine the wavelength of light used in the
experiment.

10.5 In Young's double-slit experiment using monochromatic light of
wavelength $\lambda$, the intensity of light at a point on the screen where
path difference is $\lambda$, is K units. What is the intensity of light at a
point where path difference is $\lambda/3$?

10.6 A beam of light consisting of two wavelengths, 650 nm and 520 nm,
is used to obtain interference fringes in a Young's double-slit
experiment.
(a) Find the distance of the third bright fringe on the screen from
the central maximum for wavelength 650 nm.
(b) What is the least distance from the central maximum where the
bright fringes due to both the wavelengths coincide?

10.7 In a double-slit experiment the angular width of a fringe is found to
be 0.2° on a screen placed 1 m away. The wavelength of light used is
600 nm. What will be the angular width of the fringe if the entire
experimental apparatus is immersed in water? Take refractive index
of water to be 4/3.

10.8 What is the Brewster angle for air to glass transition? (Refractive
index of glass = 1.5.)

10.9 Light of wavelength 5000 Å falls on a plane reflecting surface. What
are the wavelength and frequency of the reflected light? For what
angle of incidence is the reflected ray normal to the incident ray?

10.10 Estimate the distance for which ray optics is a good approximation
for an aperture of 4 mm and wavelength 400 nm.

ADDITIONAL EXERCISES


10.11 The 6563 Å H$\alpha$ line emitted by hydrogen in a star is found to be
red-shifted by 15 Å. Estimate the speed with which the star is receding
from the Earth.

10.12 Explain how Corpuscular theory predicts the speed of light in a
medium, say, water, to be greater than the speed of light in vacuum.
Is the prediction confirmed by experimental determination of the
speed of light in water? If not, which alternative picture of light is
consistent with experiment?

10.13 You have learnt in the text how Huygens' principle leads to the laws
of reflection and refraction. Use the same principle to deduce directly
that a point object placed in front of a plane mirror produces a
virtual image whose distance from the mirror is equal to the object
distance from the mirror.

10.14 Let us list some of the factors, which could possibly influence the
speed of wave propagation:
(i) nature of the source.
(ii) direction of propagation.
(iii) motion of the source and/or observer.
(iv) wavelength.
(v) intensity of the wave.
On which of these factors, if any, does
(a) the speed of light in vacuum,
(b) the speed of light in a medium (say, glass or water),
depend?

10.15 For sound waves, the Doppler formula for frequency shift differs
slightly between the two situations: (i) source at rest; observer
moving, and (ii) source moving; observer at rest. The exact Doppler
formulas for the case of light waves in vacuum are, however, strictly
identical for these situations. Explain why this should be so. Would
you expect the formulas to be strictly identical for the two situations
in case of light travelling in a medium?

10.16 In double-slit experiment using light of wavelength 600 nm, the
angular width of a fringe formed on a distant screen is 0.1°. What is
the spacing between the two slits?

10.17 Answer the following questions:
(a) In a single slit diffraction experiment, the width of the slit is
made double the original width. How does this affect the size
and intensity of the central diffraction band?
(b) In what way is diffraction from each slit related to the
interference pattern in a double-slit experiment?
(c) When a tiny circular obstacle is placed in the path of light from
a distant source, a bright spot is seen at the centre of the shadow
of the obstacle. Explain why.
(d) Two students are separated by a 7 m partition wall in a room
10 m high. If both light and sound waves can bend around
obstacles, how is it that the students are unable to see each
other even though they can converse easily?
(e) Ray optics is based on the assumption that light travels in a
straight line. Diffraction effects (observed when light propagates
through small apertures/slits or around small obstacles)
disprove this assumption. Yet the ray optics assumption is so
commonly used in understanding location and several other
properties of images in optical instruments. What is the
justification?

10.18 Two towers on top of two hills are 40 km apart. The line joining
them passes 50 m above a hill halfway between the towers. What is
the longest wavelength of radio waves, which can be sent between
the towers without appreciable diffraction effects?

10.19 A parallel beam of light of wavelength 500 nm falls on a narrow slit
and the resulting diffraction pattern is observed on a screen 1 m
away. It is observed that the first minimum is at a distance of 2.5
mm from the centre of the screen. Find the width of the slit.

10.20 Answer the following questions:
(a) When a low flying aircraft passes overhead, we sometimes notice
a slight shaking of the picture on our TV screen. Suggest a
possible explanation.
(b) As you have learnt in the text, the principle of linear
superposition of wave displacement is basic to understanding
intensity distributions in diffraction and interference patterns.
What is the justification of this principle?

10.21 In deriving the single slit diffraction pattern, it was stated that the
intensity is zero at angles of $n\lambda/a$. Justify this by suitably dividing
the slit to bring out the cancellation.


Chapter Eleven


DUAL NATURE OF
RADIATION AND
MATTER

11.1 INTRODUCTION

Maxwell's equations of electromagnetism and Hertz's experiments on
the generation and detection of electromagnetic waves in 1887 strongly
established the wave nature of light. Around the same period, at the end
of the 19th century, experimental investigations on conduction of electricity
(electric discharge) through gases at low pressure in a discharge tube led
to many historic discoveries. The discovery of X-rays by Roentgen in 1895,
and of the electron by J. J. Thomson in 1897, were important milestones in
the understanding of atomic structure. It was found that at sufficiently
low pressure of about 0.001 mm of mercury column, a discharge took
place between the two electrodes on applying the electric field to the gas
in the discharge tube. A fluorescent glow appeared on the glass opposite
to cathode. The colour of glow of the glass depended on the type of glass,
it being yellowish-green for soda glass. The cause of this fluorescence
was attributed to the radiation which appeared to be coming from the
cathode. These cathode rays were discovered, in 1870, by William
Crookes who later, in 1879, suggested that these rays consisted of streams
of fast moving negatively charged particles. The British physicist
J. J. Thomson (1856-1940) confirmed this hypothesis. By applying
mutually perpendicular electric and magnetic fields across the discharge
tube, J. J. Thomson was the first to determine experimentally the speed
and the specific charge [charge to mass ratio (e/m)] of the cathode ray
particles. They were found to travel with speeds ranging from about 0.1
to 0.2 times the speed of light ($3 \times 10^8$ m/s). The presently accepted
value of e/m is $1.76 \times 10^{11}$ C/kg. Further, the value of e/m was found to be
independent of the nature of the material/metal used as the cathode
(emitter), or the gas introduced in the discharge tube. This observation
suggested the universality of the cathode ray particles.
Around the same time, in 1887, it was found that certain metals, when
irradiated by ultraviolet light, emitted negatively charged particles having
small speeds. Also, certain metals when heated to a high temperature were
found to emit negatively charged particles. The value of e/m of these particles
was found to be the same as that for cathode ray particles. These
observations thus established that all these particles, although produced
under different conditions, were identical in nature. J. J. Thomson, in 1897,
named these particles electrons, and suggested that they were
fundamental, universal constituents of matter. For his epoch-making
discovery of the electron, through his theoretical and experimental
investigations on conduction of electricity by gases, he was awarded the
Nobel Prize in Physics in 1906. In 1913, the American physicist R. A.
Millikan (1868-1953) performed the pioneering oil-drop experiment for
the precise measurement of the charge on an electron. He found that the
charge on an oil-droplet was always an integral multiple of an elementary
charge, $1.602 \times 10^{-19}$ C. Millikan's experiment established that electric
charge is quantised. From the values of charge (e) and specific charge
(e/m), the mass (m) of the electron could be determined.
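The last step can be made concrete with the two values quoted above. The following Python sketch simply divides the elementary charge by the specific charge; the variable names are chosen here for illustration.

```python
# Values quoted in the text.
E_CHARGE = 1.602e-19         # elementary charge e, in coulomb
SPECIFIC_CHARGE = 1.76e11    # e/m of the electron, in C/kg

# m = e / (e/m)
electron_mass = E_CHARGE / SPECIFIC_CHARGE
print(f"{electron_mass:.2e} kg")
```

With these inputs the mass comes out close to $9.1 \times 10^{-31}$ kg.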

11.2 ELECTRON EMISSION


We know that metals have free electrons (negatively charged particles) that
are responsible for their conductivity. However, the free electrons cannot
normally escape out of the metal surface. If an electron attempts to come
out of the metal, the metal surface acquires a positive charge and pulls the
electron back to the metal. The free electron is thus held inside the metal
surface by the attractive forces of the ions. Consequently, the electron can
come out of the metal surface only if it has got sufficient energy to overcome
the attractive pull. A certain minimum amount of energy is required to be
given to an electron to pull it out from the surface of the metal. This
minimum energy required by an electron to escape from the metal surface
is called the work function of the metal. It is generally denoted by $\phi_0$ and
measured in eV (electron volt). One electron volt is the energy gained by an
electron when it has been accelerated by a potential difference of 1 volt, so
that 1 eV = $1.602 \times 10^{-19}$ J.
This unit of energy is commonly used in atomic and nuclear physics.
The work function ($\phi_0$) depends on the properties of the metal and the
nature of its surface. The values of work function of some metals are
given in Table 11.1. These values are approximate as they are very
sensitive to surface impurities.
Note from Table 11.1 that the work function of platinum is the highest
($\phi_0$ = 5.65 eV) while it is the lowest ($\phi_0$ = 2.14 eV) for caesium.
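To see what these work functions mean in SI units, the sketch below converts the two values quoted above (caesium and platinum) from electron volts to joules using 1 eV = $1.602 \times 10^{-19}$ J. The dictionary and variable names are assumptions made for this illustration.

```python
EV_TO_JOULE = 1.602e-19   # 1 eV expressed in joules, as given in the text

# Work functions in eV, as quoted in the text for caesium and platinum.
work_function_ev = {"Cs": 2.14, "Pt": 5.65}

# Convert each work function to joules.
work_function_j = {metal: phi * EV_TO_JOULE
                   for metal, phi in work_function_ev.items()}

for metal, phi_j in work_function_j.items():
    print(metal, f"{phi_j:.3e} J")
```

The energies are of the order of $10^{-19}$ J, which is why the electron volt is the more convenient unit at the atomic scale.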
The minimum energy required for the electron emission from the metal
surface can be supplied to the free electrons by any one of the following
physical processes:

TABLE 11.1 WORK FUNCTIONS OF SOME METALS

Metal   Work function (eV)      Metal   Work function (eV)
Cs      2.14                    Al      4.28
K       2.30                    Hg      4.49
Na      2.75                    Cu      4.65
Ca      3.20                    Ag      4.70
Mo      4.17                    Ni      5.15
Pb      4.25                    Pt      5.65

(i) Thermionic emission: By suitably heating, sufficient thermal energy
can be imparted to the free electrons to enable them to come out of the
metal.
(ii) Field emission: By applying a very strong electric field (of the order of
$10^8$ V m$^{-1}$) to a metal, electrons can be pulled out of the metal, as in a
spark plug.
(iii) Photo-electric emission: When light of suitable frequency illuminates
a metal surface, electrons are emitted from the metal surface. These
photo(light)-generated electrons are called photoelectrons.

11.3 PHOTOELECTRIC EFFECT

11.3.1 Hertz's observations

The phenomenon of photoelectric emission was discovered in 1887 by
Heinrich Hertz (1857-1894), during his electromagnetic wave experiments.
In his experimental investigation on the production of electromagnetic
waves by means of a spark discharge, Hertz observed that high voltage
sparks across the detector loop were enhanced when the emitter plate
was illuminated by ultraviolet light from an arc lamp.
Light shining on the metal surface somehow facilitated the escape of
free, charged particles which we now know as electrons. When light falls
on a metal surface, some electrons near the surface absorb enough energy
from the incident radiation to overcome the attraction of the positive ions
in the material of the surface. After gaining sufficient energy from the
incident light, the electrons escape from the surface of the metal into the
surrounding space.

11.3.2 Hallwachs' and Lenard's observations


Wilhelm Hallwachs and Philipp Lenard investigated the phenomenon of
photoelectric emission in detail during 1886-1902.
Lenard (1862-1947) observed that when ultraviolet radiations were
allowed to fall on the emitter plate of an evacuated glass tube enclosing
two electrodes (metal plates), current flows in the circuit (Fig. 11.1). As
soon as the ultraviolet radiations were stopped, the current flow also
stopped. These observations indicate that when ultraviolet radiations fall
on the emitter plate C, electrons are ejected from it which are attracted
towards the positive, collector plate A by the electric field. The electrons
flow through the evacuated glass tube, resulting in the current flow. Thus,
light falling on the surface of the emitter causes current in the external
circuit. Hallwachs and Lenard studied how this photo current varied with
collector plate potential, and with frequency and intensity of incident light.
Hallwachs, in 1888, undertook the study further and connected a
negatively charged zinc plate to an electroscope. He observed that the
zinc plate lost its charge when it was illuminated by ultraviolet light.
Further, the uncharged zinc plate became positively charged when it was
irradiated by ultraviolet light. Positive charge on a positively charged
zinc plate was found to be further enhanced when it was illuminated by
ultraviolet light. From these observations he concluded that negatively
charged particles were emitted from the zinc plate under the action of
ultraviolet light.
After the discovery of the electron in 1897, it became evident that the
incident light causes electrons to be emitted from the emitter plate. Due
to their negative charge, the emitted electrons are pushed towards the
collector plate by the electric field. Hallwachs and Lenard also observed
that when ultraviolet light fell on the emitter plate, no electrons were
emitted at all when the frequency of the incident light was smaller than a
certain minimum value, called the threshold frequency. This minimum
frequency depends on the nature of the material of the emitter plate.
It was found that certain metals like zinc, cadmium, magnesium, etc.,
responded only to ultraviolet light, having short wavelength, to cause
electron emission from the surface. However, some alkali metals such as
lithium, sodium, potassium, caesium and rubidium were sensitive
even to visible light. All these photosensitive substances emit electrons
when they are illuminated by light. After the discovery of electrons, these
electrons were termed as photoelectrons. The phenomenon is called
photoelectric effect.

Simulate experiments on photoelectric effect:
http://www.kcvs.ca/site/projects/physics.html

11.4 EXPERIMENTAL STUDY OF PHOTOELECTRIC EFFECT

Figure 11.1 depicts a schematic view of the arrangement used for the
experimental study of the photoelectric effect. It consists of an evacuated
glass/quartz tube having a photosensitive plate C and another metal
plate A. Monochromatic light from the source S of sufficiently short
wavelength passes through the window W and falls on the photosensitive
plate C (emitter). A transparent quartz window is sealed on to the glass
tube, which permits ultraviolet radiation to pass through it and irradiate
the photosensitive plate C. The electrons are emitted by the plate C and
are collected by the plate A (collector), by the electric field created by the
battery. The battery maintains the potential difference between the plates
C and A, that can be varied. The polarity of the plates C and A can be
reversed by a commutator. Thus, the plate A can be maintained at a desired
positive or negative potential with respect to emitter C. When the collector
plate A is positive with respect to the emitter plate C, the electrons are
attracted to it. The emission of electrons causes flow of
electric current in the circuit. The potential difference
between the emitter and collector plates is measured by
a voltmeter (V) whereas the resulting photo current
flowing in the circuit is measured by a microammeter
($\mu$A). The photoelectric current can be increased or
decreased by varying the potential of collector plate A
with respect to the emitter plate C. The intensity and
frequency of the incident light can be varied, as can the
potential difference V between the emitter C and the
collector A.
We can use the experimental arrangement of
Fig. 11.1 to study the variation of photocurrent with
(a) intensity of radiation, (b) frequency of incident
radiation, (c) the potential difference between the
plates A and C, and (d) the nature of the material
of plate C. Light of different frequencies can be used
by putting appropriate coloured filter or coloured
glass in the path of light falling on the emitter C. The intensity
of light is varied by changing the distance of the light source
from the emitter.

FIGURE 11.1 Experimental arrangement for study of
photoelectric effect.

11.4.1 Effect of intensity of light on photocurrent

FIGURE 11.2 Variation of photoelectric current with
intensity of light.

The collector A is maintained at a positive potential with
respect to emitter C so that electrons ejected from C are
attracted towards collector A. Keeping the frequency of the
incident radiation and the accelerating potential fixed, the
intensity of light is varied and the resulting photoelectric
current is measured each time. It is found that the
photocurrent increases linearly with intensity of incident light
as shown graphically in Fig. 11.2. The photocurrent is directly
proportional to the number of photoelectrons emitted per
second. This implies that the number of photoelectrons
emitted per second is directly proportional to the intensity
of incident radiation.
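The link between photocurrent and the number of photoelectrons can be put in numbers: a current I corresponds to I/e electrons reaching the collector each second. The sketch below assumes saturation, so that every emitted electron is collected. The function name and the two current values (1 and 2 microampere) are hypothetical, chosen only to illustrate the proportionality.

```python
E_CHARGE = 1.602e-19   # elementary charge in coulomb

def electrons_per_second(photocurrent_amperes):
    """Number of electrons collected per second for a given current."""
    return photocurrent_amperes / E_CHARGE

# Hypothetical photocurrents: doubling the light intensity is taken to
# double the current, as the linear graph of Fig. 11.2 indicates.
n1 = electrons_per_second(1.0e-6)
n2 = electrons_per_second(2.0e-6)
print(f"{n1:.3e}", f"{n2:.3e}", n2 / n1)
```

Even a photocurrent of one microampere corresponds to several million million electrons emitted per second.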

11.4.2 Effect of potential on photoelectric current

We first keep the plate A at some positive accelerating potential with respect
to the plate C and illuminate the plate C with light of fixed frequency $\nu$
and fixed intensity $I_1$. We next vary the positive potential of plate A gradually
and measure the resulting photocurrent each time. It is found that the
photoelectric current increases with increase in accelerating (positive)
potential. At some stage, for a certain positive potential of plate A, all the
emitted electrons are collected by the plate A and the photoelectric current
becomes maximum or saturates. If we increase the accelerating potential
of plate A further, the photocurrent does not increase. This maximum
value of the photoelectric current is called saturation current. Saturation
current corresponds to the case when all the photoelectrons emitted by
the emitter plate C reach the collector plate A.
We now apply a negative (retarding) potential to the plate A with respect
to the plate C and make it increasingly negative gradually. When the
polarity is reversed, the electrons are repelled and only the most energetic
electrons are able to reach the collector A. The photocurrent is found to
decrease rapidly until it drops to zero at a certain sharply defined, critical
value of the negative potential $V_0$ on the plate A. For a particular
frequency of incident radiation, the minimum negative (retarding) potential
$V_0$ given to the plate A for which the photocurrent stops or becomes zero
is called the cut-off or stopping potential.

FIGURE 11.3 Variation of photocurrent with collector plate potential for
different intensity of incident radiation.

The interpretation of the observation in terms of photoelectrons is
straightforward. All the photoelectrons emitted from the metal do not have
the same energy. Photoelectric current is zero when the stopping potential
is sufficient to repel even the most energetic photoelectrons, with the
maximum kinetic energy ($K_{max}$), so that

$K_{max} = e V_0$   (11.1)
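Equation (11.1) turns a measured stopping potential directly into the maximum kinetic energy of the photoelectrons. The Python sketch below works through one case. The stopping potential of 1.5 V is a hypothetical reading, and the electron mass used to estimate the corresponding speed is the standard value, assumed here rather than taken from this passage.

```python
import math

E_CHARGE = 1.602e-19       # elementary charge in coulomb
ELECTRON_MASS = 9.11e-31   # kg (standard value, assumed for this sketch)

v_stop = 1.5               # hypothetical stopping potential in volts

# Eq. (11.1): K_max = e * V0. In electron volts this is numerically V0.
k_max = E_CHARGE * v_stop                      # joules
# Non-relativistic speed of the fastest photoelectrons: K = m v^2 / 2.
v_max = math.sqrt(2.0 * k_max / ELECTRON_MASS)

print(f"K_max = {k_max:.3e} J = {v_stop} eV")
print(f"v_max = {v_max:.2e} m/s")
```

Because the speed is far below the speed of light, the non-relativistic expression for kinetic energy is adequate here.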

We can now repeat this experiment with incident radiation of the same
frequency but of higher intensity $I_2$ and $I_3$ ($I_3 > I_2 > I_1$). We note that the
saturation currents are now found to be at higher values. This shows
that more electrons are being emitted per second, proportional to the
intensity of incident radiation. But the stopping potential remains the
same as that for the incident radiation of intensity I1, as shown graphically
in Fig. 11.3. Thus, for a given frequency of the incident radiation, the
stopping potential is independent of its intensity. In other words, the
maximum kinetic energy of photoelectrons depends on the light source
and the emitter plate material, but is independent of intensity of incident
radiation.

11.4.3 Effect of frequency of incident radiation on stopping potential

We now study the relation between the
frequency of the incident radiation and the
stopping potential V0. We suitably adjust the
same intensity of light radiation at various
frequencies and study the variation of
photocurrent with collector plate potential. The
resulting variation is shown in Fig. 11.4. We
obtain different values of stopping potential but
the same value of the saturation current for
incident radiation of different frequencies. The
energy of the emitted electrons depends on the
frequency of the incident radiations. The
stopping potential is more negative for higher
frequencies of incident radiation. Note from Fig. 11.4 that the stopping
potentials are in the order V03 > V02 > V01 if the frequencies are in the
order ν3 > ν2 > ν1. This implies that the greater the frequency of incident
light, the greater is the maximum kinetic energy of the photoelectrons.
Consequently, we need a greater retarding potential to stop them
completely. If we plot a graph between the frequency of incident radiation
and the corresponding stopping potential for different metals, we get a
straight line, as shown in Fig. 11.5.

FIGURE 11.4 Variation of photoelectric current with collector plate
potential for different frequencies of incident radiation.

FIGURE 11.5 Variation of stopping potential V0 with frequency ν of incident
radiation for a given photosensitive material.

The graph shows that
(i) the stopping potential V0 varies linearly with the frequency of
incident radiation for a given photosensitive material.
(ii) there exists a certain minimum cut-off frequency ν0 for which the
stopping potential is zero.
These observations have two implications:
(i) The maximum kinetic energy of the photoelectrons varies linearly
with the frequency of incident radiation, but is independent of its
intensity.
(ii) For a frequency of incident radiation lower than the cut-off
frequency ν0, no photoelectric emission is possible even if the
intensity is large.
This minimum cut-off frequency ν0 is called the threshold frequency.
It is different for different metals.
Different photosensitive materials respond differently to light. Selenium
is more sensitive than zinc or copper. The same photosensitive substance
gives different response to light of different wavelengths. For example,
ultraviolet light gives rise to photoelectric effect in copper while green or
red light does not.
Note that in all the above experiments, it is found that, if frequency of
the incident radiation exceeds the threshold frequency, the photoelectric
emission starts instantaneously without any apparent time lag, even if
the incident radiation is very dim. It is now known that emission starts in
a time of the order of 10⁻⁹ s or less.
We now summarise the experimental features and observations
described in this section.
(i) For a given photosensitive material and frequency of incident radiation
(above the threshold frequency), the photoelectric current is directly
proportional to the intensity of incident light (Fig. 11.2).
(ii) For a given photosensitive material and frequency of incident radiation,
saturation current is found to be proportional to the intensity of
incident radiation whereas the stopping potential is independent of
its intensity (Fig. 11.3).
(iii) For a given photosensitive material, there exists a certain minimum
cut-off frequency of the incident radiation, called the threshold
frequency, below which no emission of photoelectrons takes place,
no matter how intense the incident light is. Above the threshold
frequency, the stopping potential or equivalently the maximum kinetic
energy of the emitted photoelectrons increases linearly with the
frequency of the incident radiation, but is independent of its intensity
(Fig. 11.5).
(iv) The photoelectric emission is an instantaneous process without any
apparent time lag (10⁻⁹ s or less), even when the incident radiation is
made exceedingly dim.

11.5 PHOTOELECTRIC EFFECT AND WAVE THEORY OF LIGHT

The wave nature of light was well established by the end of the nineteenth
century. The phenomena of interference, diffraction and polarisation were
explained in a natural and satisfactory way by the wave picture of light.
According to this picture, light is an electromagnetic wave consisting of
electric and magnetic fields with continuous distribution of energy over
the region of space over which the wave is extended. Let us now see if this
wave picture of light can explain the observations on photoelectric
emission given in the previous section.
According to the wave picture of light, the free electrons at the surface
of the metal (over which the beam of radiation falls) absorb the radiant
energy continuously. The greater the intensity of radiation, the greater are
the amplitude of electric and magnetic fields. Consequently, the greater
the intensity, the greater should be the energy absorbed by each electron.
In this picture, the maximum kinetic energy of the photoelectrons on the
surface is then expected to increase with increase in intensity. Also, no
matter what the frequency of radiation is, a sufficiently intense beam of
radiation (over sufficient time) should be able to impart enough energy to
the electrons, so that they exceed the minimum energy needed to escape
from the metal surface. A threshold frequency, therefore, should not exist.
These expectations of the wave theory directly contradict observations (i),
(ii) and (iii) given at the end of sub-section 11.4.3.
Further, we should note that in the wave picture, the absorption of
energy by electron takes place continuously over the entire
wavefront of the radiation. Since a large number of electrons absorb energy,
the energy absorbed per electron per unit time turns out to be small.
Explicit calculations estimate that it can take hours or more for a single
electron to pick up sufficient energy to overcome the work function and
come out of the metal. This conclusion is again in striking contrast to
observation (iv) that the photoelectric emission is instantaneous. In short,
the wave picture is unable to explain the most basic features of
photoelectric emission.

11.6 EINSTEIN'S PHOTOELECTRIC EQUATION: ENERGY QUANTUM OF RADIATION
ALBERT EINSTEIN (1879-1955)
Einstein, one of the greatest physicists of all time, was born in Ulm,
Germany. In 1905, he published three path-breaking papers. In the first
paper, he introduced the notion of light quanta (now called photons) and
used it to explain the features of photoelectric effect. In the second
paper, he developed a theory of Brownian motion, confirmed experimentally
a few years later, which provided convincing evidence of the atomic picture
of matter. The third paper gave birth to the special theory of relativity.
In 1916, he published the general theory of relativity. Some of Einstein's
most significant later contributions are: the notion of stimulated emission
introduced in an alternative derivation of Planck's blackbody radiation
law, static model of the universe which started modern cosmology, quantum
statistics of a gas of massive bosons, and a critical analysis of the
foundations of quantum mechanics. In 1921, he was awarded the Nobel Prize
in physics for his contribution to theoretical physics and the
photoelectric effect.

In 1905, Albert Einstein (1879-1955) proposed a radically new picture
of electromagnetic radiation to explain photoelectric effect. In this picture,
photoelectric emission does not take place by continuous absorption of
energy from radiation. Radiation energy is built up of discrete units, the
so-called quanta of energy of radiation. Each quantum of radiant energy
has energy hν, where h is Planck's constant and ν the frequency of light.
In photoelectric effect, an electron absorbs a quantum of energy (hν) of
radiation. If this quantum of energy absorbed exceeds the minimum
energy needed for the electron to escape from the metal surface (work
function φ0), the electron is emitted with maximum kinetic energy

$K_{\max} = h\nu - \phi_0$    (11.2)

More tightly bound electrons will emerge with kinetic
energies less than the maximum value. Note that the
intensity of light of a given frequency is determined by
the number of photons incident per second. Increasing
the intensity will increase the number of emitted electrons
per second. However, the maximum kinetic energy of the
emitted photoelectrons is determined by the energy of each
photon.
Equation (11.2) is known as Einstein's photoelectric
equation. We now see how this equation accounts, in a
simple and elegant manner, for all the observations on
photoelectric effect given at the end of sub-section 11.4.3.
According to Eq. (11.2), Kmax depends linearly on ν,
and is independent of intensity of radiation, in
agreement with observation. This has happened
because in Einstein's picture, photoelectric effect arises
from the absorption of a single quantum of radiation
by a single electron. The intensity of radiation (that is
proportional to the number of energy quanta per unit
area per unit time) is irrelevant to this basic process.
Since Kmax must be non-negative, Eq. (11.2) implies
that photoelectric emission is possible only if
$h\nu > \phi_0$
or $\nu > \nu_0$, where

$\nu_0 = \dfrac{\phi_0}{h}$    (11.3)

Equation (11.3) shows that the greater the work
function φ0, the higher the minimum or threshold
frequency ν0 needed to emit photoelectrons. Thus,
there exists a threshold frequency ν0 (= φ0/h) for the
metal surface, below which no photoelectric emission
is possible, no matter how intense the incident
radiation may be or how long it falls on the surface.
In this picture, intensity of radiation as noted above,
is proportional to the number of energy quanta per
unit area per unit time. The greater the number of
energy quanta available, the greater is the number of
electrons absorbing the energy quanta and greater,
therefore, is the number of electrons coming out of
the metal (for ν > ν0). This explains why, for ν > ν0,
photoelectric current is proportional to intensity.
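
The relations in Eqs. (11.2) and (11.3) can be checked numerically. The short Python sketch below is only an illustration: the function names are hypothetical, the constants are the rounded values used in this chapter's examples, and the work function of 2.14 eV is the caesium value quoted in Example 11.2. Note that intensity does not appear anywhere in the calculation of Kmax.

```python
# Rounded constants, as used in the worked examples of this chapter.
H = 6.63e-34        # Planck's constant, J s
E_CHARGE = 1.6e-19  # elementary charge, C (numerically, joules per eV)


def threshold_frequency(work_function_ev):
    """Threshold frequency nu0 = phi0 / h (Eq. 11.3), in Hz."""
    return work_function_ev * E_CHARGE / H


def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Kmax = h*nu - phi0 (Eq. 11.2), in eV.

    Returns None when the photon energy is below the work function,
    i.e. when no photoelectric emission is possible.
    """
    k_max = H * frequency_hz / E_CHARGE - work_function_ev
    return k_max if k_max >= 0 else None


# Illustrative values: caesium (phi0 = 2.14 eV) and two trial frequencies.
nu0 = threshold_frequency(2.14)
above = max_kinetic_energy_ev(6.0e14, 2.14)   # above threshold
below = max_kinetic_energy_ev(4.0e14, 2.14)   # below threshold

print(f"threshold frequency = {nu0:.3e} Hz")
print(f"Kmax at 6.0e14 Hz   = {above:.3f} eV")
print(f"Kmax at 4.0e14 Hz   = {below}")
```

Making the light brighter would change only the number of photons arriving per second, which is why the sketch needs no intensity argument.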

In Einstein's picture, the basic elementary process involved in
photoelectric effect is the absorption of a light quantum by an electron.
This process is instantaneous. Thus, whatever may be the intensity
i.e., the number of quanta of radiation per unit area per unit time,
photoelectric emission is instantaneous. Low intensity does not mean
delay in emission, since the basic elementary process is the same.
Intensity only determines how many electrons are able to participate
in the elementary process (absorption of a light quantum by a single
electron) and, therefore, the photoelectric current.
Using Eq. (11.1), the photoelectric equation, Eq. (11.2), can be
written as

$e V_0 = h\nu - \phi_0$; for $\nu \geq \nu_0$

or $V_0 = \left(\dfrac{h}{e}\right)\nu - \dfrac{\phi_0}{e}$    (11.4)

This is an important result. It predicts that the V0 versus ν curve is a
straight line with slope = (h/e), independent of the nature of the material.
During 1906-1916, Millikan performed a series of experiments on
photoelectric effect, aimed at disproving Einstein's photoelectric equation.
He measured the slope of the straight line obtained for sodium, similar to
that shown in Fig. 11.5. Using the known value of e, he determined the
value of Planck's constant h. This value was close to the value of Planck's
constant (= 6.626 × 10⁻³⁴ J s) determined in an entirely different context.
In this way, in 1916, Millikan proved the validity of Einstein's photoelectric
equation, instead of disproving it.
The successful explanation of photoelectric effect using the hypothesis
of light quanta and the experimental determination of values of h and φ0,
in agreement with values obtained from other experiments, led to the
acceptance of Einstein's picture of photoelectric effect. Millikan verified
photoelectric equation with great precision, for a number of alkali metals
over a wide range of radiation frequencies.
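
The idea of reading h off the slope of the V0 versus ν line can be sketched in a few lines of Python. This is a hedged illustration, not Millikan's procedure or data: the stopping potentials are synthetic values generated from Eq. (11.4), the work function of 2.75 eV is the sodium value quoted in Example 11.3, and the helper names are hypothetical.

```python
H = 6.63e-34        # Planck's constant used to generate the synthetic data, J s
E_CHARGE = 1.6e-19  # elementary charge, C


def stopping_potential(frequency_hz, work_function_ev):
    """V0 = (h/e)*nu - phi0/e (Eq. 11.4), in volts."""
    return H * frequency_hz / E_CHARGE - work_function_ev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic "measurements" (assumption): frequencies above the sodium threshold.
frequencies = [7.0e14, 8.0e14, 9.0e14, 1.0e15, 1.1e15]
potentials = [stopping_potential(f, 2.75) for f in frequencies]

slope, intercept = fit_line(frequencies, potentials)
h_estimate = slope * E_CHARGE   # slope = h/e, so h = slope * e
phi0_estimate = -intercept      # intercept = -phi0/e, so phi0 in eV = -intercept

print(f"h from slope      = {h_estimate:.3e} J s")
print(f"phi0 from intercept = {phi0_estimate:.2f} eV")
```

Because the data are generated from the same equation, the fit simply returns the inputs; with real measurements the scatter of the points would set the uncertainty in h.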

11.7 PARTICLE NATURE OF LIGHT: THE PHOTON

Photoelectric effect thus gave evidence to the strange fact that light in
interaction with matter behaved as if it was made of quanta or packets of
energy, each of energy hν.
Is the light quantum of energy to be associated with a particle? Einstein
arrived at the important result, that the light quantum can also be
associated with momentum (hν/c). A definite value of energy as well as
momentum is a strong sign that the light quantum can be associated
with a particle. This particle was later named photon. The particle-like
behaviour of light was further confirmed, in 1924, by the experiment of
A.H. Compton (1892-1962) on scattering of X-rays from electrons. In
1921, Einstein was awarded the Nobel Prize in Physics for his contribution
to theoretical physics and the photoelectric effect. In 1923, Millikan was
awarded the Nobel Prize in physics for his work on the elementary
charge of electricity and on the photoelectric effect.
We can summarise the photon picture of electromagnetic radiation
as follows:


(i) In interaction of radiation with matter, radiation behaves as if it is
made up of particles called photons.
(ii) Each photon has energy E (= hν) and momentum p (= hν/c), and
speed c, the speed of light.
(iii) All photons of light of a particular frequency ν, or wavelength λ, have
the same energy E (= hν = hc/λ) and momentum p (= hν/c = h/λ),
whatever the intensity of radiation may be. By increasing the intensity
of light of given wavelength, there is only an increase in the number of
photons per second crossing a given area, with each photon having
the same energy. Thus, photon energy is independent of intensity of
radiation.
(iv) Photons are electrically neutral and are not deflected by electric and
magnetic fields.
(v) In a photon-particle collision (such as photon-electron collision), the
total energy and total momentum are conserved. However, the number
of photons may not be conserved in a collision. The photon may be
absorbed or a new photon may be created.

Example 11.1 Monochromatic light of frequency 6.0 × 10¹⁴ Hz is
produced by a laser. The power emitted is 2.0 × 10⁻³ W. (a) What is the
energy of a photon in the light beam? (b) How many photons per second,
on an average, are emitted by the source?

Solution
(a) Each photon has an energy
$E = h\nu = (6.63 \times 10^{-34}\ \mathrm{J\,s})\,(6.0 \times 10^{14}\ \mathrm{Hz}) = 3.98 \times 10^{-19}\ \mathrm{J}$
(b) If N is the number of photons emitted by the source per second,
the power P transmitted in the beam equals N times the energy
per photon E, so that P = N E. Then
$N = \dfrac{P}{E} = \dfrac{2.0 \times 10^{-3}\ \mathrm{W}}{3.98 \times 10^{-19}\ \mathrm{J}} = 5.0 \times 10^{15}$ photons per second.
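
The arithmetic of Example 11.1 can be reproduced with a minimal Python sketch. The function names are hypothetical and the value of h is the rounded one used in the example.

```python
H = 6.63e-34  # Planck's constant, J s (rounded value used in the example)


def photon_energy(frequency_hz):
    """Energy of one photon, E = h*nu, in joules."""
    return H * frequency_hz


def photon_rate(power_w, frequency_hz):
    """Photons emitted per second, N = P / E."""
    return power_w / photon_energy(frequency_hz)


energy = photon_energy(6.0e14)
rate = photon_rate(2.0e-3, 6.0e14)

print(f"E = {energy:.2e} J")
print(f"N = {rate:.1e} photons per second")
```

Doubling the power at fixed frequency doubles N but leaves E unchanged, which is statement (iii) of the photon picture above.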

Example 11.2 The work function of caesium is 2.14 eV. Find (a) the
threshold frequency for caesium, and (b) the wavelength of the incident
light if the photocurrent is brought to zero by a stopping potential of
0.60 V.

Solution
(a) For the cut-off or threshold frequency, the energy hν0 of the incident
radiation must be equal to work function φ0, so that
$\nu_0 = \dfrac{\phi_0}{h} = \dfrac{2.14\ \mathrm{eV}}{6.63 \times 10^{-34}\ \mathrm{J\,s}} = \dfrac{2.14 \times 1.6 \times 10^{-19}\ \mathrm{J}}{6.63 \times 10^{-34}\ \mathrm{J\,s}} = 5.16 \times 10^{14}\ \mathrm{Hz}$
Thus, for frequencies less than this threshold frequency, no
photoelectrons are ejected.
(b) Photocurrent reduces to zero when the maximum kinetic energy of
the emitted photoelectrons equals the potential energy e V0 due to the
retarding potential V0. Einstein's photoelectric equation is
$e V_0 = h\nu - \phi_0 = \dfrac{hc}{\lambda} - \phi_0$
or, $\lambda = \dfrac{hc}{e V_0 + \phi_0}$
$= \dfrac{(6.63 \times 10^{-34}\ \mathrm{J\,s})\,(3 \times 10^{8}\ \mathrm{m/s})}{0.60\ \mathrm{eV} + 2.14\ \mathrm{eV}} = \dfrac{19.89 \times 10^{-26}\ \mathrm{J\,m}}{2.74\ \mathrm{eV}}$
$= \dfrac{19.89 \times 10^{-26}\ \mathrm{J\,m}}{2.74 \times 1.6 \times 10^{-19}\ \mathrm{J}} = 454\ \mathrm{nm}$

Example 11.3 The wavelength of light in the visible region is about
390 nm for violet colour, about 550 nm (average wavelength) for
yellow-green colour and about 760 nm for red colour.
(a) What are the energies of photons in (eV) at the (i) violet end, (ii)
average wavelength, yellow-green colour, and (iii) red end of the
visible spectrum? (Take h = 6.63 × 10⁻³⁴ J s and 1 eV = 1.6 × 10⁻¹⁹ J.)
(b) From which of the photosensitive materials with work functions
listed in Table 11.1 and using the results of (i), (ii) and (iii) of (a),
can you build a photoelectric device that operates with visible
light?

Solution
(a) Energy of the incident photon, E = hν = hc/λ
$E = \dfrac{(6.63 \times 10^{-34}\ \mathrm{J\,s})\,(3 \times 10^{8}\ \mathrm{m/s})}{\lambda} = \dfrac{1.989 \times 10^{-25}\ \mathrm{J\,m}}{\lambda}$
(i) For violet light, λ1 = 390 nm (lower wavelength end)
Incident photon energy,
$E_1 = \dfrac{1.989 \times 10^{-25}\ \mathrm{J\,m}}{390 \times 10^{-9}\ \mathrm{m}} = 5.10 \times 10^{-19}\ \mathrm{J} = \dfrac{5.10 \times 10^{-19}\ \mathrm{J}}{1.6 \times 10^{-19}\ \mathrm{J/eV}} = 3.19\ \mathrm{eV}$
(ii) For yellow-green light, λ2 = 550 nm (average wavelength)
Incident photon energy,
$E_2 = \dfrac{1.989 \times 10^{-25}\ \mathrm{J\,m}}{550 \times 10^{-9}\ \mathrm{m}} = 3.62 \times 10^{-19}\ \mathrm{J} = 2.26\ \mathrm{eV}$
(iii) For red light, λ3 = 760 nm (higher wavelength end)
Incident photon energy,
$E_3 = \dfrac{1.989 \times 10^{-25}\ \mathrm{J\,m}}{760 \times 10^{-9}\ \mathrm{m}} = 2.62 \times 10^{-19}\ \mathrm{J} = 1.64\ \mathrm{eV}$
(b) For a photoelectric device to operate, we require incident light energy
E to be equal to or greater than the work function φ0 of the material.
Thus, the photoelectric device will operate with violet light (with
E = 3.19 eV) for the photosensitive materials Na (with φ0 = 2.75 eV), K (with
φ0 = 2.30 eV) and Cs (with φ0 = 2.14 eV). It will also operate with
yellow-green light (with E = 2.26 eV) for Cs (with φ0 = 2.14 eV) only.
However, it will not operate with red light (with E = 1.64 eV) for any
of these photosensitive materials.
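
The comparison made in Example 11.3 lends itself to a small lookup. The Python sketch below is an illustration only: it uses the three work functions quoted in the solution (Na, K, Cs), the rounded constants given in the problem statement, and hypothetical function names.

```python
H = 6.63e-34   # Planck's constant, J s
C = 3.0e8      # speed of light, m/s
EV = 1.6e-19   # joules per electron volt

# Work functions in eV, as quoted in the solution of Example 11.3.
WORK_FUNCTIONS_EV = {"Na": 2.75, "K": 2.30, "Cs": 2.14}


def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c/lambda, converted to eV."""
    return H * C / (wavelength_nm * 1e-9) / EV


def responsive_materials(wavelength_nm):
    """Materials whose work function does not exceed the photon energy."""
    energy = photon_energy_ev(wavelength_nm)
    return [name for name, phi0 in WORK_FUNCTIONS_EV.items() if energy >= phi0]


for label, wavelength in [("violet", 390), ("yellow-green", 550), ("red", 760)]:
    print(label, f"{photon_energy_ev(wavelength):.2f} eV",
          responsive_materials(wavelength))
```

The test is made photon by photon, so the answer does not depend on how bright the light is.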

11.8 WAVE NATURE OF MATTER

The dual (wave-particle) nature of light (electromagnetic radiation, in
general) comes out clearly from what we have learnt in this and the
preceding chapters. The wave nature of light shows up in the phenomena
of interference, diffraction and polarisation. On the other hand, in
photoelectric effect and Compton effect which involve energy and
momentum transfer, radiation behaves as if it is made up of a bunch of
particles, the photons. Whether a particle or wave description is best
suited for understanding an experiment depends on the nature of the
experiment. For example, in the familiar phenomenon of seeing an object
by our eye, both descriptions are important. The gathering and focussing
mechanism of light by the eye-lens is well described in the wave picture.
But its absorption by the rods and cones (of the retina) requires the photon
picture of light.
A natural question arises: If radiation has a dual (wave-particle) nature,
might not the particles of nature (the electrons, protons, etc.) also exhibit
wave-like character? In 1924, the French physicist Louis Victor de Broglie
(pronounced as de Broy) (1892-1987) put forward the bold hypothesis
that moving particles of matter should display wave-like properties under
suitable conditions. He reasoned that nature was symmetrical and that
the two basic physical entities, matter and energy, must have symmetrical
character. If radiation shows dual aspects, so should matter. De Broglie
proposed that the wavelength λ associated with a particle of momentum
p is given as

$\lambda = \dfrac{h}{p} = \dfrac{h}{m v}$    (11.5)

where m is the mass of the particle and v its speed. Equation (11.5) is
known as the de Broglie relation and the wavelength λ of the matter
wave is called de Broglie wavelength. The dual aspect of matter is evident
in the de Broglie relation. On the left hand side of Eq. (11.5), λ is the
attribute of a wave while on the right hand side the momentum p is a
typical attribute of a particle. Planck's constant h relates the two
attributes.
Equation (11.5) for a material particle is basically a hypothesis whose
validity can be tested only by experiment. However, it is interesting to see
that it is satisfied also by a photon. For a photon, as we have seen,

$p = h\nu / c$    (11.6)

Therefore,

$\dfrac{h}{p} = \dfrac{c}{\nu} = \lambda$    (11.7)

That is, the de Broglie wavelength of a photon given by Eq. (11.5) equals
the wavelength of electromagnetic radiation of which the photon is a
quantum of energy and momentum.
PHOTOCELL

A photocell is a technological application of the photoelectric effect. It is a device whose
electrical properties are affected by light. It is also sometimes called an electric eye. A photocell
consists of a semi-cylindrical photo-sensitive metal plate C (emitter) and a wire loop A
(collector) supported in an evacuated glass or quartz bulb. It is connected to the external
circuit having a high-tension battery B and microammeter (μA) as shown in the Figure.
Sometimes, instead of the plate C, a thin layer of photosensitive material is pasted on the
inside of the bulb. A part of the bulb is left clean for the light to enter it.

Figure: A photocell.

When light of suitable wavelength falls on the emitter C, photoelectrons are emitted. These
photoelectrons are drawn to the collector A. Photocurrent of the order of a few microampere
can be normally obtained from a photocell.
A photocell converts a change in intensity of illumination into a change in photocurrent. This
current can be used to operate control systems and in light measuring devices. A photocell of
lead sulphide sensitive to infrared radiation is used in electronic ignition circuits.
In scientific work, photocells are used whenever it is necessary to measure the intensity
of light. Light meters in photographic cameras make use of photocells to measure the intensity
of incident light. The photocells, inserted in the door light electric circuit, are used as automatic
door openers. A person approaching a doorway may interrupt a light beam which is incident on
a photocell. The abrupt change in photocurrent may be used to start a motor which opens the
door or rings an alarm. They are used in the
control of a counting device which records every interruption of the light beam caused by a
person or object passing across the beam. So photocells help count the persons entering an
auditorium, provided they enter the hall one by one. They are used for detection of traffic
law defaulters: an alarm may be sounded whenever a beam of (invisible) radiation is
intercepted.
In burglar alarm, (invisible) ultraviolet light is continuously made to fall on a photocell
installed at the doorway. A person entering the door interrupts the beam falling on the
photocell. The abrupt change in photocurrent is used to start an electric bell ringing. In fire
alarm, a number of photocells are installed at suitable places in a building. In the event of
breaking out of fire, light radiations fall upon the photocell. This completes the electric
circuit through an electric bell or a siren which starts operating as a warning signal.
Photocells are used in the reproduction of sound in motion pictures and in the television
camera for scanning and telecasting scenes. They are used in industries for detecting minor
flaws or holes in metal sheets.

Clearly, from Eq. (11.5), λ is smaller for a heavier particle (large m) or
more energetic particle (large v). For example, the de Broglie wavelength
of a ball of mass 0.12 kg moving with a speed of 20 m s⁻¹ is easily
calculated:

$p = m v = 0.12\ \mathrm{kg} \times 20\ \mathrm{m\,s^{-1}} = 2.40\ \mathrm{kg\,m\,s^{-1}}$

$\lambda = \dfrac{h}{p} = \dfrac{6.63 \times 10^{-34}\ \mathrm{J\,s}}{2.40\ \mathrm{kg\,m\,s^{-1}}} = 2.76 \times 10^{-34}\ \mathrm{m}$

This wavelength is so small that it is beyond any
measurement. This is the reason why macroscopic objects
in our daily life do not show wave-like properties. On the
other hand, in the sub-atomic domain, the wave character
of particles is significant and measurable.
LOUIS VICTOR DE BROGLIE (1892-1987)
French physicist who put forth the revolutionary idea of the wave nature of
matter. This idea was developed by Erwin Schrödinger into a full-fledged
theory of quantum mechanics commonly known as wave mechanics. In 1929, he
was awarded the Nobel Prize in Physics for his discovery of the wave nature
of electrons.

Consider an electron (mass m, charge e) accelerated
from rest through a potential V. The kinetic energy K
of the electron equals the work done (eV) on it by the
electric field:

$K = e V$    (11.8)

Now, $K = \dfrac{1}{2} m v^2 = \dfrac{p^2}{2m}$, so that

$p = \sqrt{2 m K} = \sqrt{2 m e V}$    (11.9)

The de Broglie wavelength λ of the electron is then

$\lambda = \dfrac{h}{p} = \dfrac{h}{\sqrt{2 m K}} = \dfrac{h}{\sqrt{2 m e V}}$    (11.10)

Substituting the numerical values of h, m, e, we get

$\lambda = \dfrac{1.227}{\sqrt{V}}\ \mathrm{nm}$    (11.11)

where V is the magnitude of accelerating potential in
volts. For a 120 V accelerating potential, Eq. (11.11) gives
λ = 0.112 nm. This wavelength is of the same order as
the spacing between the atomic planes in crystals. This
suggests that matter waves associated with an electron could be verified
by crystal diffraction experiments analogous to X-ray diffraction. We
describe the experimental verification of the de Broglie hypothesis in the
next section. In 1929, de Broglie was awarded the Nobel Prize in Physics
for his discovery of the wave nature of electrons.
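
As a rough numerical check of Eqs. (11.10) and (11.11), the following Python sketch evaluates the de Broglie wavelength of an accelerated electron in two ways: from h/√(2meV) directly, and from the shortcut 1.227/√V nm. The function names are hypothetical and the constants are rounded values, so the two results agree only to about three significant figures.

```python
import math

H = 6.63e-34        # Planck's constant, J s (rounded)
M_E = 9.11e-31      # electron mass, kg (rounded)
E_CHARGE = 1.6e-19  # elementary charge, C (rounded)


def electron_wavelength_nm(voltage):
    """de Broglie wavelength from Eq. (11.10), lambda = h / sqrt(2*m*e*V), in nm."""
    momentum = math.sqrt(2 * M_E * E_CHARGE * voltage)
    return H / momentum * 1e9


def electron_wavelength_shortcut_nm(voltage):
    """de Broglie wavelength from Eq. (11.11), lambda = 1.227 / sqrt(V), in nm."""
    return 1.227 / math.sqrt(voltage)


for volts in (54, 100, 120):
    full = electron_wavelength_nm(volts)
    short = electron_wavelength_shortcut_nm(volts)
    print(f"V = {volts:>3} V: {full:.4f} nm (Eq. 11.10), {short:.4f} nm (Eq. 11.11)")
```

The expressions are non-relativistic, so the sketch is meant only for accelerating potentials of the modest size discussed in this section.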
The matter-wave picture elegantly incorporated Heisenberg's
uncertainty principle. According to the principle, it is not possible to
measure both the position and momentum of an electron (or any other
particle) at the same time exactly. There is always some uncertainty (Δx)
in the specification of position and some uncertainty (Δp) in the
specification of momentum. The product of Δx and Δp is of the order of ħ*
(with ħ = h/2π), i.e.,

$\Delta x \, \Delta p \approx \hbar$    (11.12)

Equation (11.12) allows the possibility that Δx is zero; but then Δp
must be infinite in order that the product is non-zero. Similarly, if Δp is
zero, Δx must be infinite. Ordinarily, both Δx and Δp are non-zero such
that their product is of the order of ħ.
Now, if an electron has a definite momentum p (i.e., Δp = 0), by the de
Broglie relation, it has a definite wavelength λ. A wave of definite (single)
wavelength extends all over space. By Born's
probability interpretation this means that the
electron is not localised in any finite region of
space. That is, its position uncertainty is infinite
(Δx → ∞), which is consistent with the
uncertainty principle.

* A more rigorous treatment gives Δx Δp ≥ ħ/2.
In general, the matter wave associated with
the electron is not extended all over space. It is
a wave packet extending over some finite region
of space. In that case x is not infinite but has
some finite value depending on the extension
of the wave packet. Also, you must appreciate
that a wave packet of finite extension does not
have a single wavelength. It is built up of
wavelengths spread around some central
wavelength.
By de Broglie's relation, then, the
momentum of the electron will also have a
spread, an uncertainty Δp. This is as expected
from the uncertainty principle. It can be shown
that the wave packet description together with
de Broglie relation and Born's probability
interpretation reproduce Heisenberg's
uncertainty principle exactly.
In Chapter 12, the de Broglie relation will
be seen to justify Bohr's postulate on
quantisation of angular momentum of electron
in an atom.
Figure 11.6 shows a schematic diagram of
(a) a localised wave packet, and (b) an extended
wave with fixed wavelength.

FIGURE 11.6 (a) The wave packet description of
an electron. The wave packet corresponds to a
spread of wavelength around some central
wavelength (and hence by de Broglie relation,
a spread in momentum). Consequently, it is
associated with an uncertainty in position
(Δx) and an uncertainty in momentum (Δp).
(b) The matter wave corresponding to a
definite momentum of an electron
extends all over space. In this case,
Δp = 0 and Δx → ∞.

Example 11.4 What is the de Broglie wavelength associated with (a) an
electron moving with a speed of 5.4 × 10⁶ m/s, and (b) a ball of mass 150 g
travelling at 30.0 m/s?

Solution
(a) For the electron:
Mass m = 9.11 × 10⁻³¹ kg, speed v = 5.4 × 10⁶ m/s. Then, momentum
p = m v = 9.11 × 10⁻³¹ (kg) × 5.4 × 10⁶ (m/s)
p = 4.92 × 10⁻²⁴ kg m/s
de Broglie wavelength, λ = h/p
$\lambda = \dfrac{6.63 \times 10^{-34}\ \mathrm{J\,s}}{4.92 \times 10^{-24}\ \mathrm{kg\,m/s}} = 0.135\ \mathrm{nm}$
(b) For the ball:
Mass m = 0.150 kg, speed v = 30.0 m/s.
Then momentum p = m v = 0.150 (kg) × 30.0 (m/s)
p = 4.50 kg m/s
de Broglie wavelength, λ = h/p
$\lambda = \dfrac{6.63 \times 10^{-34}\ \mathrm{J\,s}}{4.50\ \mathrm{kg\,m/s}} = 1.47 \times 10^{-34}\ \mathrm{m}$
The de Broglie wavelength of the electron is comparable with X-ray
wavelengths. However, for the ball it is about 10⁻¹⁹ times the size of
the proton, quite beyond experimental measurement.
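
The contrast between the electron and the ball in Example 11.4 can be reproduced with a minimal Python sketch of the de Broglie relation λ = h/(mv). The function name is hypothetical, and the constants are the rounded values used in the example.

```python
H = 6.63e-34  # Planck's constant, J s (rounded value used in the example)


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """de Broglie wavelength lambda = h / (m*v), in metres (non-relativistic)."""
    return H / (mass_kg * speed_m_per_s)


electron = de_broglie_wavelength(9.11e-31, 5.4e6)  # electron of Example 11.4(a)
ball = de_broglie_wavelength(0.150, 30.0)          # ball of Example 11.4(b)

print(f"electron: {electron * 1e9:.3f} nm")
print(f"ball:     {ball:.2e} m")
print(f"ratio electron/ball: {electron / ball:.1e}")
```

The ratio of the two wavelengths is simply the inverse ratio of the momenta, which is why everyday objects show no measurable wave behaviour.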

Example 11.5 An electron, an α-particle, and a proton have the same
kinetic energy. Which of these particles has the shortest de Broglie
wavelength?

Solution
For a particle, de Broglie wavelength, λ = h/p
Kinetic energy, K = p²/2m
Then, $\lambda = h / \sqrt{2 m K}$
For the same kinetic energy K, the de Broglie wavelength associated
with a particle is inversely proportional to the square root of its
mass. A proton ($^{1}_{1}\mathrm{H}$) is 1836 times as massive as an electron, and
an α-particle ($^{4}_{2}\mathrm{He}$) is four times as massive as a proton.
Hence, the α-particle has the shortest de Broglie wavelength.

PROBABILITY INTERPRETATION TO MATTER WAVES

It is worth pausing here to reflect on just what a matter wave associated with a particle,
say, an electron, means. Actually, a truly satisfactory physical understanding of the
dual nature of matter and radiation has not emerged so far. The great founders of
quantum mechanics (Niels Bohr, Albert Einstein, and many others) struggled with this
and related concepts for long. Still the deep physical interpretation of quantum
mechanics continues to be an area of active research. Despite this, the concept of
matter wave has been mathematically introduced in modern quantum mechanics with
great success. An important milestone in this connection was when Max Born
(1882-1970) suggested a probability interpretation to the matter wave amplitude. According
to this, the intensity (square of the amplitude) of the matter wave at a point determines
the probability density of the particle at that point. Probability density means probability
per unit volume. Thus, if A is the amplitude of the wave at a point, |A|² ΔV is the
probability of the particle being found in a small volume ΔV around that point. Thus,
if the intensity of matter wave is large in a certain region, there is a greater probability
of the particle being found there than where the intensity is small.

Example 11.6 A particle is moving three times as fast as an electron.
The ratio of the de Broglie wavelength of the particle to that of the
electron is 1.813 × 10⁻⁴. Calculate the particle's mass and identify the
particle.

Solution
de Broglie wavelength of a moving particle, having mass m and
velocity v:

$\lambda = \dfrac{h}{p} = \dfrac{h}{m v}$

Mass, m = h/(λ v)
For an electron, mass m_e = h/(λ_e v_e)
Now, we have v/v_e = 3 and
λ/λ_e = 1.813 × 10⁻⁴
Then, mass of the particle,

$m = m_e \left(\dfrac{\lambda_e}{\lambda}\right) \left(\dfrac{v_e}{v}\right)$

m = (9.11 × 10⁻³¹ kg) × (1/3) × (1/(1.813 × 10⁻⁴))
m = 1.675 × 10⁻²⁷ kg.
Thus, the particle, with this mass could be a proton or a neutron.

Example 11.7 What is the de Broglie wavelength associated with an
electron, accelerated through a potential difference of 100 volts?

Solution Accelerating potential V = 100 V. The de Broglie wavelength λ is

$\lambda = \dfrac{h}{p} = \dfrac{1.227}{\sqrt{V}}\ \mathrm{nm}$

$\lambda = \dfrac{1.227}{\sqrt{100}}\ \mathrm{nm} = 0.123\ \mathrm{nm}$

The de Broglie wavelength associated with an electron in this case is of
the order of X-ray wavelengths.

11.9 DAVISSON AND GERMER EXPERIMENT

The wave nature of electrons was first experimentally verified by C.J.
Davisson and L.H. Germer in 1927 and independently by G.P. Thomson,
in 1928, who observed diffraction effects with beams of electrons scattered
by crystals. Davisson and Thomson shared the Nobel Prize in 1937 for their
experimental discovery of diffraction of electrons by crystals.

FIGURE 11.7 Davisson-Germer electron diffraction arrangement.

The experimental arrangement used by Davisson and Germer is schematically
shown in Fig. 11.7. It consists of an electron gun which comprises a
tungsten filament F, coated with barium oxide and heated by a low voltage
power supply (L.T. or battery). Electrons emitted by the filament are
accelerated to a desired velocity
by applying suitable potential/voltage from a high voltage power supply
(H.T. or battery). They are made to pass through a cylinder with fine
holes along its axis, producing a fine collimated beam. The beam is made
to fall on the surface of a nickel crystal. The electrons are scattered in all
directions by the atoms of the crystal. The intensity of the electron beam,
scattered in a given direction, is measured by the electron detector
(collector). The detector can be moved on a circular scale and is connected
to a sensitive galvanometer, which records the current. The deflection of
the galvanometer is proportional to the intensity of the electron beam
entering the collector. The apparatus is enclosed in an evacuated chamber.
By moving the detector on the circular scale at different positions, the
intensity of the scattered electron beam is measured for different values
of angle of scattering $\theta$, which is the angle between the incident and the
scattered electron beams. The variation of the intensity (I) of the scattered
electrons with the angle of scattering $\theta$ is obtained for different accelerating
voltages.
The experiment was performed by varying the accelerating voltage
from 44 V to 68 V. It was noticed that a strong peak appeared in the
intensity (I) of the scattered electrons for an accelerating voltage of 54 V at
a scattering angle $\theta = 50^\circ$.
The appearance of the peak in a particular direction is due to the
constructive interference of electrons scattered from different layers of the
regularly spaced atoms of the crystals. From the electron diffraction
measurements, the wavelength of matter waves was found to be
0.165 nm.
The de Broglie wavelength associated with electrons, using
Eq. (11.11), for V = 54 V is given by

$\lambda = h/p = \frac{1.227}{\sqrt{V}}$ nm

$\lambda = \frac{1.227}{\sqrt{54}}$ nm = 0.167 nm

Development of electron microscope: http://www.nobelprize.org/nobel_prizes/physics/laureates/1986/presentation-speech.html

Thus, there is an excellent agreement between the theoretical value
and the experimentally obtained value of de Broglie wavelength. The
Davisson-Germer experiment thus strikingly confirms the wave nature of electrons
and the de Broglie relation. More recently, in 1989, the wave nature of a
beam of electrons was experimentally demonstrated in a double-slit
experiment, similar to that used for the wave nature of light. Also, in an
experiment in 1994, interference fringes were obtained with the beams of
iodine molecules, which are about a million times more massive than
electrons.
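The comparison between the measured wavelength (0.165 nm) and the de Broglie prediction can be sketched numerically. The snippet assumes Eq. (11.11) has the form $\lambda = 1.227/\sqrt{V}$ nm, as used in the calculation for 54 V; the function and variable names are illustrative.

```python
import math


def de_broglie_nm(volts):
    """Electron de Broglie wavelength in nm, lambda = 1.227 / sqrt(V)."""
    return 1.227 / math.sqrt(volts)


MEASURED_NM = 0.165  # value quoted from the diffraction measurement

# Wavelengths across the voltage range scanned in the experiment.
for volts in (44, 54, 68):
    print(f"V = {volts:2d} V  ->  lambda = {de_broglie_nm(volts):.3f} nm")

predicted = de_broglie_nm(54)
relative_gap = abs(predicted - MEASURED_NM) / MEASURED_NM
print(f"relative difference at 54 V: {relative_gap:.1%}")
```

The difference at 54 V comes out at roughly one per cent, which is the sense in which the agreement is called excellent.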
The de Broglie hypothesis has been basic to the development of modern
quantum mechanics. It has also led to the field of electron optics. The
wave properties of electrons have been utilised in the design of electron
microscope which is a great improvement, with higher resolution, over
the optical microscope.

SUMMARY
1. The minimum energy needed by an electron to come out from a metal
surface is called the work function of the metal. Energy (greater than
the work function $\phi_0$) required for electron emission from the metal
surface can be supplied by suitably heating or applying strong electric
field or irradiating it by light of suitable frequency.

2. Photoelectric effect is the phenomenon of emission of electrons by metals
when illuminated by light of suitable frequency. Certain metals respond
to ultraviolet light while others are sensitive even to the visible light.
Photoelectric effect involves conversion of light energy into electrical
energy. It follows the law of conservation of energy. The photoelectric
emission is an instantaneous process and possesses certain special
features.

3. Photoelectric current depends on (i) the intensity of incident light, (ii)
the potential difference applied between the two electrodes, and (iii)
the nature of the emitter material.

4. The stopping potential ($V_0$) depends on (i) the frequency of incident
light, and (ii) the nature of the emitter material. For a given frequency
of incident light, it is independent of its intensity. The stopping potential
is directly related to the maximum kinetic energy of electrons emitted:
$e V_0 = (1/2)\, m v_{\max}^2 = K_{\max}$.

5. Below a certain frequency (threshold frequency) $\nu_0$, characteristic of
the metal, no photoelectric emission takes place, no matter how large
the intensity may be.

6. The classical wave theory could not explain the main features of
photoelectric effect. Its picture of continuous absorption of energy from
radiation could not explain the independence of $K_{\max}$ on intensity, the
existence of $\nu_0$ and the instantaneous nature of the process. Einstein
explained these features on the basis of photon picture of light.
According to this, light is composed of discrete packets of energy called
quanta or photons. Each photon carries an energy E (= $h\nu$) and
momentum p (= $h/\lambda$), which depend on the frequency ($\nu$) of incident
light and not on its intensity. Photoelectric emission from the metal
surface occurs due to absorption of a photon by an electron.

7. Einstein's photoelectric equation is in accordance with the energy
conservation law as applied to the photon absorption by an electron in
the metal. The maximum kinetic energy $(1/2)\, m v_{\max}^2$ is equal to
the photon energy ($h\nu$) minus the work function $\phi_0$ (= $h\nu_0$) of the
target metal:

$\frac{1}{2} m v_{\max}^2 = V_0 e = h\nu - \phi_0 = h(\nu - \nu_0)$

This photoelectric equation explains all the features of the photoelectric
effect. Millikan's first precise measurements confirmed Einstein's
photoelectric equation and yielded an accurate value of Planck's
constant h. This led to the acceptance of particle or photon description
(nature) of electromagnetic radiation, introduced by Einstein.

8. Radiation has dual nature: wave and particle. The nature of experiment
determines whether a wave or particle description is best suited for
understanding the experimental result. Reasoning that radiation and
matter should be symmetrical in nature, Louis Victor de Broglie
attributed a wave-like character to matter (material particles). The waves
associated with the moving material particles are called matter waves
or de Broglie waves.
9. The de Broglie wavelength ($\lambda$) associated with a moving particle is related
to its momentum p as: $\lambda = h/p$. The dualism of matter is inherent in the
de Broglie relation which contains a wave concept ($\lambda$) and a particle concept
(p). The de Broglie wavelength is independent of the charge and nature of
the material particle. It is significantly measurable (of the order of the
atomic-planes spacing in crystals) only in case of sub-atomic particles
like electrons, protons, etc. (due to smallness of their masses and hence,
momenta). However, it is indeed very small, quite beyond measurement,
in case of macroscopic objects, commonly encountered in everyday life.

10. Electron diffraction experiments by Davisson and Germer, and by G. P.
Thomson, as well as many later experiments, have verified and confirmed
the wave-nature of electrons. The de Broglie hypothesis of matter waves
supports Bohr's concept of stationary orbits.
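Einstein's photoelectric equation from point 7, $K_{\max} = h\nu - \phi_0 = eV_0$, can be sketched as a small function. The numbers used below (work function 2.14 eV, frequency $6 \times 10^{14}$ Hz) are borrowed from Exercise 11.2; the function name, the second test frequency and the rounded constants are assumptions made for illustration.

```python
# Rounded standard constants (assumed values).
H = 6.626e-34         # Planck's constant, J s
E_CHARGE = 1.602e-19  # elementary charge, C


def photoelectric(frequency_hz, work_function_ev):
    """Return (K_max in eV, stopping potential in V) from K_max = h*nu - phi_0.

    Returns None when the photon energy is below the work function,
    i.e. the frequency is below the threshold and no emission occurs.
    """
    photon_ev = H * frequency_hz / E_CHARGE
    k_max_ev = photon_ev - work_function_ev
    if k_max_ev < 0:
        return None
    # e * V0 = K_max, so V0 in volts is numerically K_max in eV.
    return k_max_ev, k_max_ev


above = photoelectric(6e14, 2.14)   # values from Exercise 11.2
below = photoelectric(4e14, 2.14)   # hypothetical sub-threshold frequency
print(above, below)
```

Note that intensity does not appear anywhere in the function, which mirrors points 4 and 5: it affects the photocurrent, not $K_{\max}$ or the threshold.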

Physical quantity      Symbol      Dimensions                  Unit     Remarks

Planck's constant      $h$         [ML$^2$T$^{-1}$]            J s      $E = h\nu$
Stopping potential     $V_0$       [ML$^2$T$^{-3}$A$^{-1}$]    V        $eV_0 = K_{\max}$
Work function          $\phi_0$    [ML$^2$T$^{-2}$]            J; eV    $K_{\max} = E - \phi_0$
Threshold frequency    $\nu_0$     [T$^{-1}$]                  Hz       $\nu_0 = \phi_0/h$
de Broglie wavelength  $\lambda$   [L]                         m        $\lambda = h/p$
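The claim in summary point 9, that $\lambda = h/p$ is measurable for electrons but far beyond measurement for everyday objects, can be illustrated with a short calculation. The electron is taken at the 54 V used in the Davisson-Germer experiment; the ball (0.15 kg at 30 m/s) and the rounded constants are purely illustrative assumptions.

```python
import math

# Rounded standard constants (assumed values).
H = 6.626e-34         # Planck's constant, J s
M_E = 9.11e-31        # electron mass, kg
E_CHARGE = 1.602e-19  # elementary charge, C


def de_broglie_m(mass_kg, speed_m_s):
    """de Broglie wavelength lambda = h / (m v), in metres (non-relativistic)."""
    return H / (mass_kg * speed_m_s)


# Electron accelerated from rest through 54 V: v = sqrt(2 e V / m).
v_electron = math.sqrt(2 * E_CHARGE * 54 / M_E)
lam_electron = de_broglie_m(M_E, v_electron)

# Hypothetical macroscopic object: a 0.15 kg ball moving at 30 m/s.
lam_ball = de_broglie_m(0.15, 30.0)

print(f"electron: {lam_electron:.2e} m, ball: {lam_ball:.2e} m")
```

The electron wavelength is of the order of $10^{-10}$ m, comparable to atomic-plane spacings in crystals, while the ball's wavelength is smaller by more than twenty orders of magnitude.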

POINTS TO PONDER

1. Free electrons in a metal are free in the sense that they move inside the
metal in a constant potential (this is only an approximation). They are
not free to move out of the metal. They need additional energy to get
out of the metal.

2. Free electrons in a metal do not all have the same energy. Like molecules
in a gas jar, the electrons have a certain energy distribution at a given
temperature. This distribution is different from the usual Maxwell's
distribution that you have learnt in the study of kinetic theory of gases.
You will learn about it in later courses, but the difference has to do
with the fact that electrons obey Pauli's exclusion principle.

3. Because of the energy distribution of free electrons in a metal, the
energy required by an electron to come out of the metal is different for
different electrons. Electrons with higher energy require less additional
energy to come out of the metal than those with lower energies. Work
function is the least energy required by an electron to come out of the
metal.

4. Observations on photoelectric effect imply that in the event of
matter-light interaction, absorption of energy takes place in discrete units of $h\nu$.
This is not quite the same as saying that light consists of particles,
each of energy $h\nu$.

5. Observations on the stopping potential (its independence of intensity
and dependence on frequency) are the crucial discriminator between
the wave-picture and photon-picture of photoelectric effect.

6. The wavelength of a matter wave given by $\lambda = \frac{h}{p}$ has physical
significance; its phase velocity $v_p$ has no physical significance. However,
the group velocity of the matter wave is physically meaningful and
equals the velocity of the particle.
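Point 6 can be illustrated numerically for a free, non-relativistic particle. The sketch assumes the dispersion relation $\omega = E/\hbar$ with $E = p^2/2m$ plus an arbitrary constant energy offset, and $k = p/\hbar$; these relations and all names below are assumptions for illustration, not taken from the text. The group velocity $d\omega/dk$ is estimated by a central difference.

```python
# Rounded standard constants (assumed values).
HBAR = 1.055e-34  # reduced Planck's constant, J s
M_E = 9.11e-31    # electron mass, kg
C = 3.0e8         # speed of light, m/s


def omega(k, energy_offset=0.0):
    """Angular frequency for E = (hbar k)^2 / 2m + energy_offset."""
    return HBAR * k**2 / (2 * M_E) + energy_offset / HBAR


def velocities(k, energy_offset=0.0):
    """Return (phase velocity omega/k, group velocity d(omega)/dk)."""
    dk = 1e-3 * k
    v_group = (omega(k + dk, energy_offset) - omega(k - dk, energy_offset)) / (2 * dk)
    v_phase = omega(k, energy_offset) / k
    return v_phase, v_group


v_particle = 1.0e6                # hypothetical electron speed, m/s
k = M_E * v_particle / HBAR       # wave number from p = hbar k

v_phase, v_group = velocities(k)
# Same particle, but with the zero of energy shifted by the rest energy m c^2.
v_phase_shifted, v_group_shifted = velocities(k, energy_offset=M_E * C**2)

print(v_phase, v_group)
print(v_phase_shifted, v_group_shifted)
```

In this sketch the group velocity equals the particle velocity whichever zero of energy is chosen, while the phase velocity changes with that choice. This is one way to see why only the group velocity is tied to something observable.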

EXERCISES

11.1 Find the
(a) maximum frequency, and
(b) minimum wavelength of X-rays produced by 30 kV electrons.
11.2 The work function of caesium metal is 2.14 eV. When light of
frequency $6 \times 10^{14}$ Hz is incident on the metal surface, photoemission
of electrons occurs. What is the
(a) maximum kinetic energy of the emitted electrons,
(b) stopping potential, and
(c) maximum speed of the emitted photoelectrons?

11.3 The photoelectric cut-off voltage in a certain experiment is 1.5 V.
What is the maximum kinetic energy of photoelectrons emitted?

11.4 Monochromatic light of wavelength 632.8 nm is produced by a
helium-neon laser. The power emitted is 9.42 mW.
(a) Find the energy and momentum of each photon in the light beam,
(b) How many photons per second, on the average, arrive at a target
irradiated by this beam? (Assume the beam to have uniform
cross-section which is less than the target area), and
(c) How fast does a hydrogen atom have to travel in order to have
the same momentum as that of the photon?
11.5 The energy flux of sunlight reaching the surface of the earth is
$1.388 \times 10^{3}$ W/m$^2$. How many photons (nearly) per square metre are
incident on the Earth per second? Assume that the photons in the
sunlight have an average wavelength of 550 nm.
11.6 In an experiment on photoelectric effect, the slope of the cut-off
voltage versus frequency of incident light is found to be $4.12 \times 10^{-15}$ V s.
Calculate the value of Planck's constant.
11.7 A 100 W sodium lamp radiates energy uniformly in all directions.
The lamp is located at the centre of a large sphere that absorbs all
the sodium light which is incident on it. The wavelength of the
sodium light is 589 nm. (a) What is the energy per photon associated
with the sodium light? (b) At what rate are the photons delivered to
the sphere?
11.8 The threshold frequency for a certain metal is $3.3 \times 10^{14}$ Hz. If light
of frequency $8.2 \times 10^{14}$ Hz is incident on the metal, predict the
cut-off voltage for the photoelectric emission.
11.9 The work function for a certain metal is 4.2 eV. Will this metal give
photoelectric emission for incident radiation of wavelength 330 nm?
11.10 Light of frequency $7.21 \times 10^{14}$ Hz is incident on a metal surface.
Electrons with a maximum speed of $6.0 \times 10^{5}$ m/s are ejected from
the surface. What is the threshold frequency for photoemission of
electrons?
11.11 Light of wavelength 488 nm is produced by an argon laser which is
used in the photoelectric effect. When light from this spectral line is
incident on the emitter, the stopping (cut-off) potential of
photoelectrons is 0.38 V. Find the work function of the material
from which the emitter is made.
11.12 Calculate the
(a) momentum, and
(b) de Broglie wavelength of the electrons accelerated through a
potential difference of 56 V.
11.13 What is the
(a) momentum,
(b) speed, and
(c) de Broglie wavelength of an electron with kinetic energy of
120 eV.
11.14 The wavelength of light from the spectral emission line of sodium is
589 nm. Find the kinetic energy at which
(a) an electron, and
(b) a neutron, would have the same de Broglie wavelength.

11.15 What is the de Broglie wavelength of
(a) a bullet of mass 0.040 kg travelling at the speed of 1.0 km/s,
(b) a ball of mass 0.060 kg moving at a speed of 1.0 m/s, and
(c) a dust particle of mass $1.0 \times 10^{-9}$ kg drifting with a speed of
2.2 m/s?

11.16 An electron and a photon each have a wavelength of 1.00 nm. Find
(a) their momenta,
(b) the energy of the photon, and
(c) the kinetic energy of electron.

11.17 (a) For what kinetic energy of a neutron will the associated de Broglie
wavelength be $1.40 \times 10^{-10}$ m?
(b) Also find the de Broglie wavelength of a neutron, in thermal
equilibrium with matter, having an average kinetic energy of
(3/2) k T at 300 K.
11.18 Show that the wavelength of electromagnetic radiation is equal to
the de Broglie wavelength of its quantum (photon).
11.19 What is the de Broglie wavelength of a nitrogen molecule in air at
300 K? Assume that the molecule is moving with the root-mean-square
speed of molecules at this temperature. (Atomic mass of
nitrogen = 14.0076 u)


ADDITIONAL EXERCISES


11.20 (a) Estimate the speed with which electrons emitted from a heated
emitter of an evacuated tube impinge on the collector maintained
at a potential difference of 500 V with respect to the emitter.
Ignore the small initial speeds of the electrons. The
specific charge of the electron, i.e., its e/m is given to be
$1.76 \times 10^{11}$ C kg$^{-1}$.
(b) Use the same formula you employ in (a) to obtain electron speed
for a collector potential of 10 MV. Do you see what is wrong? In
what way is the formula to be modified?
11.21 (a) A monoenergetic electron beam with electron speed of
$5.20 \times 10^{6}$ m s$^{-1}$ is subject to a magnetic field of $1.30 \times 10^{-4}$ T
normal to the beam velocity. What is the radius of the circle traced
by the beam, given e/m for electron equals $1.76 \times 10^{11}$ C kg$^{-1}$?
(b) Is the formula you employ in (a) valid for calculating radius of
the path of a 20 MeV electron beam? If not, in what way is it
modified?
[Note: Exercises 11.20(b) and 11.21(b) take you to relativistic
mechanics which is beyond the scope of this book. They have been
inserted here simply to emphasise the point that the formulas you
use in part (a) of the exercises are not valid at very high speeds or
energies. See answers at the end to know what 'very high speed or
energy' means.]
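The point of the note can be seen numerically. The sketch below compares the non-relativistic speed $v = \sqrt{2(e/m)V}$ with the standard relativistic result obtained from $(\gamma - 1) m c^2 = eV$. The relativistic formula is not derived in this book and is included here only as an assumption for comparison; the value of e/m is the one given in Exercise 11.20.

```python
import math

E_OVER_M = 1.76e11  # specific charge of the electron, C/kg (from Exercise 11.20)
C = 3.0e8           # speed of light in vacuum, m/s


def classical_speed(volts):
    """Non-relativistic speed from (1/2) m v^2 = e V."""
    return math.sqrt(2 * E_OVER_M * volts)


def relativistic_speed(volts):
    """Speed from (gamma - 1) m c^2 = e V (standard result, assumed here)."""
    gamma = 1 + E_OVER_M * volts / C**2
    return C * math.sqrt(1 - 1 / gamma**2)


for volts in (500.0, 1.0e7):
    v_cl = classical_speed(volts)
    v_rel = relativistic_speed(volts)
    print(f"V = {volts:.0e} V: classical {v_cl:.3e} m/s, relativistic {v_rel:.3e} m/s")
```

At 500 V the two speeds agree closely, whereas at 10 MV the non-relativistic formula gives a speed greater than c, which signals that it can no longer be used.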
11.22 An electron gun with its collector at a potential of 100 V fires out
electrons in a spherical bulb containing hydrogen gas at low
pressure ($\sim 10^{-2}$ mm of Hg). A magnetic field of $2.83 \times 10^{-4}$ T curves
the path of the electrons in a circular orbit of radius 12.0 cm. (The
path can be viewed because the gas ions in the path focus the beam
by attracting electrons, and emitting light by electron capture; this
method is known as the 'fine beam tube' method.) Determine
e/m from the data.
11.23 (a) An X-ray tube produces a continuous spectrum of radiation with
its short wavelength end at 0.45 Å. What is the maximum energy
of a photon in the radiation?
(b) From your answer to (a), guess what order of accelerating voltage
(for electrons) is required in such a tube?
11.24 In an accelerator experiment on high-energy collisions of electrons
with positrons, a certain event is interpreted as annihilation of an
electron-positron pair of total energy 10.2 BeV into two $\gamma$-rays of
equal energy. What is the wavelength associated with each $\gamma$-ray?
(1 BeV = $10^{9}$ eV)
11.25 Estimating the following two numbers should be interesting. The
first number will tell you why radio engineers do not need to worry
much about photons! The second number tells you why our eye can
never 'count photons', even in barely detectable light.
(a) The number of photons emitted per second by a Medium wave
transmitter of 10 kW power, emitting radiowaves of wavelength
500 m.
(b) The number of photons entering the pupil of our eye per second
corresponding to the minimum intensity of white light that we
humans can perceive ($\sim 10^{-10}$ W m$^{-2}$). Take the area of the pupil
to be about 0.4 cm$^2$, and the average frequency of white light to
be about $6 \times 10^{14}$ Hz.

11.26 Ultraviolet light of wavelength 2271 Å from a 100 W mercury source
irradiates a photo-cell made of molybdenum metal. If the stopping
potential is 1.3 V, estimate the work function of the metal. How
would the photo-cell respond to a high intensity ($\sim 10^{5}$ W m$^{-2}$) red
light of wavelength 6328 Å produced by a He-Ne laser?
11.27 Monochromatic radiation of wavelength 640.2 nm (1 nm = $10^{-9}$ m)
from a neon lamp irradiates photosensitive material made of caesium
on tungsten. The stopping voltage is measured to be 0.54 V. The
source is replaced by an iron source and its 427.2 nm line irradiates
the same photo-cell. Predict the new stopping voltage.
11.28 A mercury lamp is a convenient source for studying frequency
dependence of photoelectric emission, since it gives a number of
spectral lines ranging from the UV to the red end of the visible
spectrum. In our experiment with rubidium photo-cell, the following
lines from a mercury source were used:
$\lambda_1$ = 3650 Å, $\lambda_2$ = 4047 Å, $\lambda_3$ = 4358 Å, $\lambda_4$ = 5461 Å, $\lambda_5$ = 6907 Å.
The stopping voltages, respectively, were measured to be:
$V_{01}$ = 1.28 V, $V_{02}$ = 0.95 V, $V_{03}$ = 0.74 V, $V_{04}$ = 0.16 V, $V_{05}$ = 0 V
Determine the value of Planck's constant h, the threshold frequency
and work function for the material.
[Note: You will notice that to get h from the data, you will need to
know e (which you can take to be $1.6 \times 10^{-19}$ C). Experiments of this
kind on Na, Li, K, etc. were performed by Millikan, who, using his
own value of e (from the oil-drop experiment) confirmed Einstein's
photoelectric equation and at the same time gave an independent
estimate of the value of h.]
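The procedure described in the note, reading h off the slope of stopping voltage against frequency, can be sketched as a least-squares straight-line fit of $V_0 = (h/e)\nu - \phi_0/e$. This is one possible approach and not necessarily the one intended in the answers. The fifth line ($V_{05}$ = 0 V) is left out of the fit on the assumption that it lies below threshold, and c is taken as $3 \times 10^8$ m/s.

```python
C = 3.0e8          # speed of light, m/s (rounded)
E_CHARGE = 1.6e-19  # elementary charge, C (value suggested in the note)

# Data from Exercise 11.28; the fifth line (V0 = 0 V) is assumed to be
# below threshold and is therefore excluded from the fit.
wavelengths_angstrom = [3650, 4047, 4358, 5461]
stopping_volts = [1.28, 0.95, 0.74, 0.16]

freqs = [C / (w * 1e-10) for w in wavelengths_angstrom]

# Ordinary least-squares fit of V0 = slope * nu + intercept.
n = len(freqs)
mean_f = sum(freqs) / n
mean_v = sum(stopping_volts) / n
s_xy = sum((f - mean_f) * (v - mean_v) for f, v in zip(freqs, stopping_volts))
s_xx = sum((f - mean_f) ** 2 for f in freqs)
slope = s_xy / s_xx
intercept = mean_v - slope * mean_f

h_estimate = E_CHARGE * slope          # h = e * slope
threshold_hz = -intercept / slope      # frequency at which V0 = 0
work_function_ev = -intercept          # phi_0 / e, numerically in eV

print(f"h ~ {h_estimate:.2e} J s, nu_0 ~ {threshold_hz:.2e} Hz, "
      f"phi_0 ~ {work_function_ev:.2f} eV")
```

As a consistency check, the frequency of the excluded 6907 Å line can be compared with the fitted threshold frequency.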
11.29 The work function for the following metals is given:
Na: 2.75 eV; K: 2.30 eV; Mo: 4.17 eV; Ni: 5.15 eV. Which of these
metals will not give photoelectric emission for a radiation of
wavelength 3300 Å from a He-Cd laser placed 1 m away from the
photocell? What happens if the laser is brought nearer and placed
50 cm away?
11.30 Light of intensity $10^{-5}$ W m$^{-2}$ falls on a sodium photo-cell of surface
area 2 cm$^2$. Assuming that the top 5 layers of sodium absorb the
incident energy, estimate time required for photoelectric emission
in the wave-picture of radiation. The work function for the metal is
given to be about 2 eV. What is the implication of your answer?
11.31 Crystal diffraction experiments can be performed using X-rays, or
electrons accelerated through appropriate voltage. Which probe has
greater energy? (For quantitative comparison, take the wavelength
of the probe equal to 1 Å, which is of the order of inter-atomic spacing
in the lattice) ($m_e = 9.11 \times 10^{-31}$ kg).
11.32 (a) Obtain the de Broglie wavelength of a neutron of kinetic energy
150 eV. As you have seen in Exercise 11.31, an electron beam of
this energy is suitable for crystal diffraction experiments. Would
a neutron beam of the same energy be equally suitable? Explain.
($m_n = 1.675 \times 10^{-27}$ kg)
(b) Obtain the de Broglie wavelength associated with thermal
neutrons at room temperature (27 °C). Hence explain why a fast
neutron beam needs to be thermalised with the environment
before it can be used for neutron diffraction experiments.
11.33 An electron microscope uses electrons accelerated by a voltage of
50 kV. Determine the de Broglie wavelength associated with the
electrons. If other factors (such as numerical aperture, etc.) are
taken to be roughly the same, how does the resolving power of an
electron microscope compare with that of an optical microscope
which uses yellow light?
11.34 The wavelength of a probe is roughly a measure of the size of a
structure that it can probe in some detail. The quark structure
of protons and neutrons appears at the minute length-scale of
$10^{-15}$ m or less. This structure was first probed in early 1970s using
high energy electron beams produced by a linear accelerator at
Stanford, USA. Guess what might have been the order of energy of
these electron beams. (Rest mass energy of electron = 0.511 MeV.)
11.35 Find the typical de Broglie wavelength associated with a He atom in
helium gas at room temperature (27 °C) and 1 atm pressure; and
compare it with the mean separation between two atoms under these
conditions.
11.36 Compute the typical de Broglie wavelength of an electron in a metal
at 27 °C and compare it with the mean separation between two
electrons in a metal which is given to be about $2 \times 10^{-10}$ m.

[Note: Exercises 11.35 and 11.36 reveal that while the wave-packets
associated with gaseous molecules under ordinary conditions are
non-overlapping, the electron wave-packets in a metal strongly
overlap with one another. This suggests that whereas molecules in
an ordinary gas can be distinguished apart, electrons in a metal
cannot be distinguished apart from one another. This
indistinguishability has many fundamental implications which you
will explore in more advanced Physics courses.]
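The contrast drawn in the note can be made quantitative with a rough estimate. The sketch assumes a 'typical' momentum from an average kinetic energy of (3/2)kT, as in Exercise 11.17(b), and takes the mean separation of gas atoms as $(kT/P)^{1/3}$ from the ideal gas equation. The helium atomic mass (4.0026 u) and the rounded constants are standard values assumed here, not given in the text.

```python
import math

# Rounded standard constants (assumed values).
H = 6.626e-34      # Planck's constant, J s
K_B = 1.381e-23    # Boltzmann constant, J/K
U = 1.6605e-27     # atomic mass unit, kg
M_E = 9.11e-31     # electron mass, kg

T = 300.0          # 27 degrees Celsius, in kelvin
P = 1.013e5        # 1 atm, in Pa


def thermal_wavelength(mass_kg, temperature=T):
    """lambda = h / sqrt(3 m k T), from (1/2) m v^2 = (3/2) k T."""
    return H / math.sqrt(3 * mass_kg * K_B * temperature)


# Helium gas: volume per atom is kT/P, so the mean separation is its cube root.
lam_he = thermal_wavelength(4.0026 * U)
sep_he = (K_B * T / P) ** (1 / 3)

# Electrons in a metal: separation as given in Exercise 11.36.
lam_e = thermal_wavelength(M_E)
sep_e = 2e-10

print(f"He:       lambda = {lam_he:.2e} m, separation = {sep_he:.2e} m")
print(f"electron: lambda = {lam_e:.2e} m, separation = {sep_e:.2e} m")
```

For helium the wavelength is much smaller than the separation, so the wave-packets do not overlap; for electrons in a metal the wavelength is much larger than the separation, so they overlap strongly.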
11.37 Answer the following questions:
(a) Quarks inside protons and neutrons are thought to carry
fractional charges [(+2/3)e ; (-1/3)e]. Why do they not show up
in Millikan's oil-drop experiment?
(b) What is so special about the combination e/m ? Why do we not
simply talk of e and m separately?
(c) Why should gases be insulators at ordinary pressures and start
conducting at very low pressures ?
(d) Every metal has a definite work function. Why do all
photoelectrons not come out with the same energy if incident
radiation is monochromatic? Why is there an energy distribution
of photoelectrons ?
(e) The energy and momentum of an electron are related to the
frequency and wavelength of the associated matter wave by the
relations:

$E = h\nu, \quad p = \frac{h}{\lambda}$

But while the value of $\lambda$ is physically significant, the value of $\nu$
(and therefore, the value of the phase speed $\nu\lambda$) has no physical
significance. Why?

APPENDIX

11.1 The history of wave-particle flip-flop


What is light? This question has haunted mankind for a long time. But systematic experiments were done by
scientists since the dawn of the scientific and industrial era, about four centuries ago. Around the same time,
theoretical models about what light is made of were developed. While building a model in any branch of
science, it is essential to see that it is able to explain all the experimental observations existing at that time.
It is therefore appropriate to summarize some observations about light that were known in the seventeenth
century.
The properties of light known at that time included (a) rectilinear propagation of light, (b) reflection from
plane and curved surfaces, (c) refraction at the boundary of two media, (d) dispersion into various colours, (e)
high speed. Appropriate laws were formulated for the first four phenomena. For example, Snell formulated his
laws of refraction in 1621. Several scientists right from the days of Galileo had tried to measure the speed of
light. But they had not been able to do so. They had only concluded that it was higher than the limit of their
measurement.
Two models of light were also proposed in the seventeenth century. Descartes, in the early decades of the seventeenth
century, proposed that light consists of particles, while Huygens, around 1650-60, proposed that light consists
of waves. Descartes' proposal was merely a philosophical model, devoid of any experiments or scientific
arguments. Newton soon after, around 1660-70, extended Descartes' particle model, known as corpuscular
theory, built it up as a scientific theory, and explained various known properties with it. These models, light
as waves and as particles, in a sense, are quite opposite of each other. But both models could explain all the
known properties of light. There was nothing to choose between them.
The history of the development of these models over the next few centuries is interesting. Bartholinus, in
1669, discovered double refraction of light in some crystals, and Huygens, in 1678, was quick to explain it on
the basis of his wave theory of light. In spite of this, for over one hundred years, Newton's particle model was
firmly believed and preferred over the wave model. This was partly because of its simplicity and partly because
of Newton's influence on contemporary physics.
Then in 1801, Young performed his double-slit experiment and observed interference fringes. This
phenomenon could be explained only by wave theory. It was realized that diffraction was also another
phenomenon which could be explained only by wave theory. In fact, it was a natural consequence of Huygens'
idea of secondary wavelets emanating from every point in the path of light. These experiments could not be
explained by assuming that light consists of particles. Another phenomenon, polarisation, was discovered
around 1810, and this too could be naturally explained by the wave theory. Thus wave theory of Huygens
came to the forefront and Newton's particle theory went into the background. This situation again continued
for almost a century.
Better experiments were performed in the nineteenth century to determine the speed of light. With more
accurate experiments, a value of $3 \times 10^{8}$ m/s for speed of light in vacuum was arrived at. Around 1860, Maxwell
proposed his equations of electromagnetism and it was realized that all electromagnetic phenomena known at
that time could be explained by Maxwell's four equations. Soon Maxwell showed that electric and magnetic
fields could propagate through empty space (vacuum) in the form of electromagnetic waves. He calculated the
speed of these waves and arrived at a theoretical value of $2.998 \times 10^{8}$ m/s. The close agreement of this value
with the experimental value suggested that light consists of electromagnetic waves. In 1887 Hertz demonstrated
the generation and detection of such waves. This established the wave theory of light on a firm footing. We
might say that while eighteenth century belonged to the particle model, the nineteenth century belonged to
the wave model of light.
Vast amounts of experiments were done during the period 1850-1900 on heat and related phenomena, an
altogether different area of physics. Theories and models like kinetic theory and thermodynamics were developed
which quite successfully explained the various phenomena, except one.


Every body at any temperature emits radiation of all wavelengths. It also absorbs radiation falling on it.
A body which absorbs all the radiation falling on it is called a black body. It is an ideal concept in physics, like
concepts of a point mass or uniform motion. A graph of the intensity of radiation emitted by a body versus
wavelength is called the black body spectrum. No theory in those days could explain the complete black body
spectrum!
In 1900, Planck hit upon a novel idea. If we assume, he said, that radiation is emitted in packets of energy
instead of continuously as in a wave, then we can explain the black body spectrum. Planck himself regarded
these quanta, or packets, as a property of emission and absorption, rather than that of light. He derived a
formula which agreed with the entire spectrum. This was a confusing mixture of wave and particle pictures:
radiation is emitted as a particle, it travels as a wave, and is again absorbed as a particle! Moreover, this put
physicists in a dilemma. Should we again accept the particle picture of light just to explain one phenomenon?
Then what happens to the phenomena of interference and diffraction which cannot be explained by the
particle model?
But soon in 1905, Einstein explained the photoelectric effect by assuming the particle picture of light.
In 1907, Debye explained the low temperature specific heats of solids by using the particle picture for lattice
vibrations in a crystalline solid. Both these phenomena belonging to widely diverse areas of physics could be
explained only by the particle model and not by the wave model. In 1923, Compton's x-ray scattering experiments
from atoms also went in favour of the particle picture. This increased the dilemma further.
Thus by 1923, physicists were faced with the following situation. (a) There were some phenomena like rectilinear
propagation, reflection, refraction, which could be explained by either particle model or by wave model. (b)
There were some phenomena such as diffraction and interference which could be explained only by the wave
model but not by the particle model. (c) There were some phenomena such as black body radiation, photoelectric
effect, and Compton scattering which could be explained only by the particle model but not by the wave model.
Somebody in those days aptly remarked that light behaves as a particle on Mondays, Wednesdays and Fridays,
and as a wave on Tuesdays, Thursdays and Saturdays, and we don't talk of light on Sundays!
In 1924, de Broglie proposed his theory of wave-particle duality in which he said that not only photons
of light but also particles of matter such as electrons and atoms possess a dual character, sometimes
behaving like a particle and sometimes as a wave. He gave a formula connecting their mass, velocity, momentum
(particle characteristics), with their wavelength and frequency (wave characteristics)! In 1927 Thomson, and
Davisson and Germer, in separate experiments, showed that electrons did behave like waves with a wavelength
which agreed with that given by de Broglie's formula. Their experiment was on diffraction of electrons through
crystalline solids, in which the regular arrangement of atoms acted like a grating. Very soon, diffraction
experiments with other particles such as neutrons and protons were performed and these too conformed with
de Broglie's formula. This confirmed wave-particle duality as an established principle of physics. Here was a
principle, physicists thought, which explained all the phenomena mentioned above not only for light but also
for the so-called particles.
But there was no basic theoretical foundation for wave-particle duality. De Broglie's proposal was
merely a qualitative argument based on symmetry of nature. Wave-particle duality was at best a principle, not
an outcome of a sound fundamental theory. It is true that all experiments whatever agreed with de Broglie's
formula. But physics does not work that way. On the one hand, it needs experimental confirmation, while on
the other hand, it also needs sound theoretical basis for the models proposed. This was developed over the
next two decades. Dirac developed his theory of radiation in about 1928, and Heisenberg and Pauli gave it a
firm footing by 1930. Tomonaga, Schwinger, and Feynman, in the late 1940s, produced further refinements and
cleared the theory of inconsistencies which were noticed. All these theories mainly put wave-particle duality
on a theoretical footing.
Although the story continues, it grows more and more complex and beyond the scope of this note. But
we have here the essential structure of what happened, and let us be satisfied with it at the moment. Now it
is regarded as a natural consequence of present theories of physics that electromagnetic radiation as well as
particles of matter exhibit both wave and particle properties in different experiments, and sometimes even in
the different parts of the same experiment.


Chapter Twelve

ATOMS

12.1 INTRODUCTION


By the nineteenth century, enough evidence had accumulated in favour of the
atomic hypothesis of matter. In 1897, the experiments on electric discharge
through gases carried out by the English physicist J. J. Thomson (1856–1940)
revealed that atoms of different elements contain negatively charged
constituents (electrons) that are identical for all atoms. However, atoms as a
whole are electrically neutral. Therefore, an atom must also contain some
positive charge to neutralise the negative charge of the electrons. But what
is the arrangement of the positive charge and the electrons inside the atom?
In other words, what is the structure of an atom?
The first model of atom was proposed by J. J. Thomson in 1898.
According to this model, the positive charge of the atom is uniformly
distributed throughout the volume of the atom and the negatively charged
electrons are embedded in it like seeds in a watermelon. This model was
picturesquely called the plum pudding model of the atom. However,
subsequent studies on atoms, as described in this chapter, showed that
the distribution of the electrons and positive charges is very different
from that proposed in this model.
We know that condensed matter (solids and liquids) and dense gases at
all temperatures emit electromagnetic radiation in which a continuous
distribution of several wavelengths is present, though with different
intensities. This radiation is considered to be due to oscillations of atoms
and molecules, governed by the interaction of each atom or
molecule with its neighbours. In contrast, light emitted from
rarefied gases heated in a flame, or excited electrically in a
glow tube such as the familiar neon sign or mercury vapour
light has only certain discrete wavelengths. The spectrum
appears as a series of bright lines. In such gases, the
average spacing between atoms is large. Hence, the
radiation emitted can be considered due to individual atoms
rather than because of interactions between atoms or
molecules.
In the early nineteenth century it was also established
that each element is associated with a characteristic
spectrum of radiation, for example, hydrogen always gives
a set of lines with fixed relative position between the lines.
This fact suggested an intimate relationship between the
internal structure of an atom and the spectrum of
radiation emitted by it. In 1885, Johann Jakob Balmer
(1825–1898) obtained a simple empirical formula which
gave the wavelengths of a group of lines emitted by atomic
hydrogen. Since hydrogen is the simplest of the elements
known, we shall consider its spectrum in detail in this
chapter.
Ernest Rutherford (1871–1937), a former research
student of J. J. Thomson, was engaged in experiments on
α-particles emitted by some radioactive elements. In 1906,
he proposed a classic experiment of scattering of these
α-particles by atoms to investigate the atomic structure.
This experiment was later performed around 1911 by Hans
Geiger (1882–1945) and Ernest Marsden (1889–1970, who
was a 20-year-old student and had not yet earned his
bachelor's degree). The details are discussed in Section
12.2. The explanation of the results led to the birth of
Rutherford's planetary model of atom (also called the
nuclear model of the atom). According to this the entire
positive charge and most of the mass of the atom is concentrated in a small
volume called the nucleus with electrons revolving around the nucleus just
as planets revolve around the sun.

ERNEST RUTHERFORD (1871–1937)
Ernest Rutherford (1871–1937) British physicist who did pioneering work on
radioactive radiation. He discovered alpha-rays and beta-rays. Along with
Frederick Soddy, he created the modern theory of radioactivity. He studied
the emanation of thorium and discovered a new noble gas, an isotope of radon,
now known as thoron. By scattering alpha-rays from the metal foils, he
discovered the atomic nucleus and proposed the planetary model of the
atom. He also estimated the approximate size of the nucleus.

Rutherford's nuclear model was a major step towards how we see
the atom today. However, it could not explain why atoms emit light of
only discrete wavelengths. How could an atom as simple as hydrogen,
consisting of a single electron and a single proton, emit a complex
spectrum of specific wavelengths? In the classical picture of an atom, the
electron revolves round the nucleus much like the way a planet revolves
round the sun. However, we shall see that there are some serious
difficulties in accepting such a model.

12.2 ALPHA-PARTICLE SCATTERING AND RUTHERFORD'S NUCLEAR MODEL OF ATOM

At the suggestion of Ernest Rutherford, in 1911, H. Geiger and E. Marsden
performed some experiments. In one of their experiments, as shown in
Fig. 12.1, they directed a beam of 5.5 MeV α-particles emitted from a
$^{214}_{83}\text{Bi}$ radioactive source at a thin metal foil made of gold.
Figure 12.2 shows a schematic diagram of this experiment.
Alpha-particles emitted by a $^{214}_{83}\text{Bi}$ radioactive source were
collimated into a narrow beam by their passage through lead bricks. The beam
was allowed to fall on a thin foil of gold of thickness $2.1 \times 10^{-7}$ m.
The scattered alpha-particles were observed through a rotatable detector
consisting of zinc sulphide screen and a microscope. The scattered
alpha-particles on striking the screen produced brief light flashes
or scintillations. These flashes may be viewed through a microscope and the
distribution of the number of scattered particles may be studied as a function
of angle of scattering.

FIGURE 12.1 Geiger-Marsden scattering experiment. The entire apparatus is
placed in a vacuum chamber (not shown in this figure).

FIGURE 12.2 Schematic arrangement of the Geiger-Marsden experiment.

A typical graph of the total number of α-particles scattered at different
angles, in a given interval of time, is shown in Fig. 12.3. The dots in this
figure represent the data points and the solid curve is the theoretical
prediction based on the assumption that the target atom has a small,
dense, positively charged nucleus. Many of the α-particles pass through
the foil. It means that they do not suffer any collisions. Only about 0.14%
of the incident α-particles scatter by more than 1°; and about 1 in 8000
deflect by more than 90°. Rutherford argued that, to deflect the α-particle
backwards, it must experience a large repulsive force. This force could
be provided if the greater part of the mass of the atom and its positive charge
were concentrated tightly at its centre. Then the incoming α-particle could get
very close to the positive charge without penetrating it, and such a close
encounter would result in a large deflection. This agreement supported
the hypothesis of the nuclear atom. This is why Rutherford is credited with the
discovery of the nucleus.

FIGURE 12.3 Experimental data points (shown by dots) on scattering of
α-particles by a thin foil at different angles obtained by Geiger and Marsden
using the setup shown in Figs. 12.1 and 12.2. Rutherford's nuclear model
predicts the solid curve which is seen to be in good agreement with experiment.

In Rutherford's nuclear model of the atom, the entire positive charge and
most of the mass of the atom are concentrated in the nucleus with the
electrons some distance away. The electrons would be moving in orbits
about the nucleus just as the planets do around the sun. Rutherford's
experiments suggested the size of the nucleus to be about $10^{-15}$ m to
$10^{-14}$ m. From kinetic theory, the size of an atom was known to be
$10^{-10}$ m, about 10,000 to 100,000 times larger
than the size of the nucleus (see Chapter 11, Section 11.6 in Class XI
Physics textbook). Thus, the electrons would seem to be at a distance
from the nucleus of about 10,000 to 100,000 times the size of the nucleus
itself. Thus, most of an atom is empty space. With the atom being largely
empty space, it is easy to see why most α-particles go right through a
thin metal foil. However, when an α-particle happens to come near a nucleus,
the intense electric field there scatters it through a large angle. The atomic
electrons, being so light, do not appreciably affect the α-particles.
The scattering data shown in Fig. 12.3 can be analysed by employing
Rutherford's nuclear model of the atom. As the gold foil is very thin, it
can be assumed that α-particles will suffer not more than one scattering
during their passage through it. Therefore, computation of the trajectory
of an alpha-particle scattered by a single nucleus is enough. Alpha-particles
are nuclei of helium atoms and, therefore, carry two units, 2e,
of positive charge and have the mass of the helium atom. The charge of
the gold nucleus is Ze, where Z is the atomic number of the atom; for
gold Z = 79. Since the nucleus of gold is about 50 times heavier than an
α-particle, it is reasonable to assume that it remains stationary
throughout the scattering process. Under these assumptions, the
trajectory of an alpha-particle can be computed employing Newton's
second law of motion and Coulomb's law for the electrostatic
force of repulsion between the alpha-particle and the positively
charged nucleus.

The magnitude of this force is
$$F = \frac{1}{4\pi\varepsilon_0}\,\frac{(2e)(Ze)}{r^2} \tag{12.1}$$
where r is the distance between the α-particle and the nucleus. The force
is directed along the line joining the α-particle and the nucleus. The
magnitude and direction of the force on an α-particle continuously
change as it approaches the nucleus and recedes away from it.

12.2.1 Alpha-particle trajectory

The trajectory traced by an α-particle depends on the impact parameter,
b of collision. The impact parameter is the perpendicular distance of the
initial velocity vector of the α-particle from the centre of the nucleus (Fig.
12.4). A given beam of α-particles has a distribution of impact parameters b,
so that the beam is scattered in various directions with different
probabilities (Fig. 12.4). (In a beam, all particles have nearly the same
kinetic energy.) It is seen that an α-particle close to the nucleus (small
impact parameter) suffers large scattering. In case of head-on collision, the
impact parameter is minimum and the α-particle rebounds back
($\theta \approx \pi$). For a large impact parameter, the α-particle goes
nearly undeviated and has a small deflection ($\theta \approx 0$).

FIGURE 12.4 Trajectory of α-particles in the coulomb field of a target
nucleus. The impact parameter, b and scattering angle $\theta$ are also depicted.

The fact that only a small fraction of the number of incident particles
rebound back indicates that the number of α-particles undergoing head-on
collision is small. This, in turn, implies that the mass of the atom is
concentrated in a small volume. Rutherford scattering therefore, is a powerful
way to determine an upper limit to the size of the nucleus.

Example 12.1 In Rutherford's nuclear model of the atom, the
nucleus (radius about $10^{-15}$ m) is analogous to the sun about which
the electrons move in orbit (radius about $10^{-10}$ m) like the earth orbits
around the sun. If the dimensions of the solar system had the same
proportions as those of the atom, would the earth be closer to or
farther away from the sun than it actually is? The radius of earth's
orbit is about $1.5 \times 10^{11}$ m. The radius of sun is taken as $7 \times 10^{8}$ m.
Solution The ratio of the radius of electron's orbit to the radius of
nucleus is $(10^{-10}\ \text{m})/(10^{-15}\ \text{m}) = 10^{5}$, that is, the radius of the electron's
orbit is $10^{5}$ times larger than the radius of nucleus. If the radius of
the earth's orbit around the sun were $10^{5}$ times larger than the radius
of the sun, the radius of the earth's orbit would be $10^{5} \times 7 \times 10^{8}$ m =
$7 \times 10^{13}$ m. This is more than 100 times greater than the actual orbital
radius of earth. Thus, the earth would be much farther away from
the sun.
It implies that an atom contains a much greater fraction of empty
space than our solar system does.
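The scaling argument of Example 12.1 can be checked with a few lines of arithmetic. The sketch below is only an illustration; the variable names are chosen here and the numbers are the ones quoted in the example.

```python
# Sizes quoted in Example 12.1, all in metres.
r_nucleus = 1e-15         # radius of the nucleus
r_electron_orbit = 1e-10  # radius of the electron's orbit
r_sun = 7e8               # radius of the sun
r_earth_orbit = 1.5e11    # actual radius of the earth's orbit

# Ratio of orbit radius to central-body radius inside the atom.
atom_ratio = r_electron_orbit / r_nucleus

# Radius the earth's orbit would have if the solar system shared that ratio.
scaled_earth_orbit = atom_ratio * r_sun

print(f"orbit/nucleus ratio in the atom: {atom_ratio:.1e}")
print(f"scaled earth orbit radius: {scaled_earth_orbit:.1e} m")
print(f"scaled / actual orbit radius: {scaled_earth_orbit / r_earth_orbit:.0f}")
```

The last ratio comes out at several hundred, which is the sense in which the example says "more than 100 times greater".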

Example 12.2 In a Geiger-Marsden experiment, what is the distance
of closest approach to the nucleus of a 7.7 MeV α-particle before it
comes momentarily to rest and reverses its direction?
Solution The key idea here is that throughout the scattering process,
the total mechanical energy of the system consisting of an α-particle
and a gold nucleus is conserved. The system's initial mechanical
energy is $E_i$, before the particle and nucleus interact, and it is equal
to its mechanical energy $E_f$ when the α-particle momentarily stops.
The initial energy $E_i$ is just the kinetic energy K of the incoming
α-particle. The final energy $E_f$ is just the electric potential energy U
of the system. The potential energy U can be calculated from
Eq. (12.1).
Let d be the centre-to-centre distance between the α-particle and
the gold nucleus when the α-particle is at its stopping point. Then
we can write the conservation of energy $E_i = E_f$ as
$$K = \frac{1}{4\pi\varepsilon_0}\,\frac{(2e)(Ze)}{d} = \frac{2Ze^2}{4\pi\varepsilon_0 d}$$
Thus the distance of closest approach d is given by
$$d = \frac{2Ze^2}{4\pi\varepsilon_0 K}$$
The maximum kinetic energy found in α-particles of natural origin is
7.7 MeV or $1.2 \times 10^{-12}$ J. Since $1/4\pi\varepsilon_0 = 9.0 \times 10^{9}$ N m²/C²,
with $e = 1.6 \times 10^{-19}$ C, we have
$$d = \frac{(2)(9.0 \times 10^{9}\ \text{N m}^2/\text{C}^2)(1.6 \times 10^{-19}\ \text{C})^2\, Z}{1.2 \times 10^{-12}\ \text{J}} = 3.84 \times 10^{-16}\, Z\ \text{m}$$
The atomic number of foil material gold is Z = 79, so that
d (Au) = $3.0 \times 10^{-14}$ m = 30 fm. (1 fm (i.e. fermi) = $10^{-15}$ m.)
The radius of gold nucleus is, therefore, less than $3.0 \times 10^{-14}$ m. This
is not in very good agreement with the observed result as the actual
radius of gold nucleus is 6 fm. The cause of discrepancy is that the
distance of closest approach is considerably larger than the sum of
the radii of the gold nucleus and the α-particle. Thus, the α-particle
reverses its motion without ever actually touching the gold nucleus.

Simulate Rutherford scattering experiment
http://www-outreach.phy.cam.ac.uk/camphy/nucleus/nucleus6_1.htm
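The energy-conservation result of Example 12.2, d = 2Ze²/(4πε₀K), is easy to evaluate numerically. The following is a minimal sketch using the rounded constants of the text; the function name `closest_approach` and the conversion factor `MEV_TO_J` are introduced here for illustration.

```python
COULOMB_K = 9.0e9   # 1/(4*pi*eps0) in N m^2/C^2, rounded value used in the text
E_CHARGE = 1.6e-19  # elementary charge in C
MEV_TO_J = 1.6e-13  # 1 MeV expressed in joules (1e6 * 1.6e-19)

def closest_approach(kinetic_energy_mev, z_target):
    """Head-on distance of closest approach d = 2 Z e^2 / (4 pi eps0 K), in metres."""
    kinetic_energy_j = kinetic_energy_mev * MEV_TO_J
    return 2 * COULOMB_K * z_target * E_CHARGE**2 / kinetic_energy_j

d_gold = closest_approach(7.7, 79)
print(f"d(Au) = {d_gold:.2e} m")
```

Because the code keeps K = 7.7 MeV unrounded instead of using 1.2 × 10⁻¹² J, the result is slightly smaller than the value in the example, but it still rounds to about 30 fm.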

12.2.2 Electron orbits

The Rutherford nuclear model of the atom, which involves classical
concepts, pictures the atom as an electrically neutral sphere consisting
of a very small, massive and positively charged nucleus at the centre
surrounded by the revolving electrons in their respective dynamically
stable orbits. The electrostatic force of attraction, $F_e$, between the revolving
electrons and the nucleus provides the requisite centripetal force ($F_c$) to
keep them in their orbits. Thus, for a dynamically stable orbit in a
hydrogen atom
$$F_e = F_c$$
$$\frac{1}{4\pi\varepsilon_0}\,\frac{e^2}{r^2} = \frac{mv^2}{r} \tag{12.2}$$
Thus the relation between the orbit radius and the electron
velocity is
$$r = \frac{e^2}{4\pi\varepsilon_0 m v^2} \tag{12.3}$$
The kinetic energy (K) and electrostatic potential energy (U) of the electron
in hydrogen atom are
$$K = \frac{1}{2}mv^2 = \frac{e^2}{8\pi\varepsilon_0 r} \quad \text{and} \quad U = -\frac{e^2}{4\pi\varepsilon_0 r}$$
(The negative sign in U signifies that the electrostatic force is in the $-r$
direction.) Thus the total energy E of the electron in a hydrogen atom is
$$E = K + U = \frac{e^2}{8\pi\varepsilon_0 r} - \frac{e^2}{4\pi\varepsilon_0 r} = -\frac{e^2}{8\pi\varepsilon_0 r} \tag{12.4}$$
The total energy of the electron is negative. This implies the fact that
the electron is bound to the nucleus. If E were positive, an electron would
not follow a closed orbit around the nucleus.
Example 12.3 It is found experimentally that 13.6 eV energy is
required to separate a hydrogen atom into a proton and an electron.
Compute the orbital radius and the velocity of the electron in a
hydrogen atom.
Solution Total energy of the electron in hydrogen atom is $-13.6$ eV =
$-13.6 \times 1.6 \times 10^{-19}$ J = $-2.2 \times 10^{-18}$ J. Thus from Eq. (12.4), we have
$$E = -\frac{e^2}{8\pi\varepsilon_0 r} = -2.2 \times 10^{-18}\ \text{J}$$
This gives the orbital radius
$$r = -\frac{e^2}{8\pi\varepsilon_0 E} = -\frac{(9 \times 10^{9}\ \text{N m}^2/\text{C}^2)(1.6 \times 10^{-19}\ \text{C})^2}{(2)(-2.2 \times 10^{-18}\ \text{J})} = 5.3 \times 10^{-11}\ \text{m}.$$
The velocity of the revolving electron can be computed from Eq. (12.3)
with $m = 9.1 \times 10^{-31}$ kg,
$$v = \frac{e}{\sqrt{4\pi\varepsilon_0 m r}} = 2.2 \times 10^{6}\ \text{m/s}.$$
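The two steps of Example 12.3 can be reproduced numerically. This is a sketch only, assuming the rounded constants quoted in the text; the helper names `orbit_radius` and `orbit_speed` are made up for this illustration.

```python
import math

COULOMB_K = 9.0e9     # 1/(4*pi*eps0) in N m^2/C^2, rounded value used in the text
E_CHARGE = 1.6e-19    # elementary charge in C
M_ELECTRON = 9.1e-31  # electron mass in kg

def orbit_radius(total_energy_j):
    """Eq. (12.4) solved for r: r = -e^2 / (8 pi eps0 E). E must be negative."""
    return -COULOMB_K * E_CHARGE**2 / (2 * total_energy_j)

def orbit_speed(radius_m):
    """Eq. (12.3) solved for v: v = e / sqrt(4 pi eps0 m r)."""
    return E_CHARGE * math.sqrt(COULOMB_K / (M_ELECTRON * radius_m))

energy_j = -13.6 * E_CHARGE  # -13.6 eV converted to joules
r = orbit_radius(energy_j)
v = orbit_speed(r)
print(f"r = {r:.2e} m, v = {v:.2e} m/s")
```

Both values agree with the example to the two significant figures it quotes.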

12.3 ATOMIC SPECTRA

As mentioned in Section 12.1, each element has a characteristic spectrum
of radiation, which it emits. When an atomic gas or vapour is excited at
low pressure, usually by passing an electric current through it, the emitted
radiation has a spectrum which contains certain specific wavelengths
only. A spectrum of this kind is termed as emission line spectrum and it
consists of bright lines on a dark background. The spectrum emitted by atomic
hydrogen is shown in Fig. 12.5. Study of emission line spectra of a material can
therefore serve as a type of fingerprint for identification of the gas. When
white light passes through a gas and we analyse the transmitted light
using a spectrometer we find some dark lines in the spectrum. These dark lines
correspond precisely to those wavelengths which were found in the
emission line spectrum of the gas. This is called the absorption spectrum
of the material of the gas.

FIGURE 12.5 Emission lines in the spectrum of hydrogen.

12.3.1 Spectral series

We might expect that the frequencies of the light emitted by a particular
element would exhibit some regular pattern. Hydrogen is the simplest
atom and therefore, has the simplest spectrum. In the observed spectrum,
however, at first sight, there does not seem to be
any semblance of order or regularity in spectral
lines. But the spacing between lines within certain
sets of the hydrogen spectrum decreases in a
regular way (Fig. 12.5). Each of these sets is called
a spectral series. In 1885, the first such series was
observed by a Swiss school teacher Johann Jakob
Balmer (1825–1898) in the visible region of the
hydrogen spectrum. This series is called Balmer
series (Fig. 12.6). The line with the longest
wavelength, 656.3 nm in the red is called Hα; the
next line with wavelength 486.1 nm in the blue-green
is called Hβ, the third line 434.1 nm in the
violet is called Hγ; and so on. As the wavelength
decreases, the lines appear closer together and are weaker in intensity.

FIGURE 12.6 Balmer series in the emission spectrum of hydrogen.

Balmer found a simple empirical formula for the observed wavelengths
$$\frac{1}{\lambda} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \tag{12.5}$$
where $\lambda$ is the wavelength, R is a constant called the Rydberg constant,
and n may have integral values 3, 4, 5, etc. The value of R is $1.097 \times 10^{7}$ m$^{-1}$.
This equation is also called Balmer formula.
Taking n = 3 in Eq. (12.5), one obtains the wavelength of the Hα line:
$$\frac{1}{\lambda} = 1.097 \times 10^{7}\left(\frac{1}{2^2} - \frac{1}{3^2}\right)\ \text{m}^{-1} = 1.524 \times 10^{6}\ \text{m}^{-1}$$
i.e., $\lambda$ = 656.3 nm
For n = 4, one obtains the wavelength of Hβ line, etc. For n = ∞, one obtains
the limit of the series, at $\lambda$ = 364.6 nm. This is the shortest wavelength in
the Balmer series. Beyond this limit, no further distinct lines appear,
instead only a faint continuous spectrum is seen.
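The Balmer formula, Eq. (12.5), can be evaluated directly for the first few lines and for the series limit. The snippet below is a small sketch; the function name `balmer_wavelength_nm` is chosen here and R is taken as the value quoted in the text.

```python
RYDBERG = 1.097e7  # Rydberg constant R in m^-1, as quoted in the text

def balmer_wavelength_nm(n):
    """Wavelength from Eq. (12.5), 1/lambda = R (1/2^2 - 1/n^2), returned in nm."""
    if n < 3:
        raise ValueError("the Balmer series needs n = 3, 4, 5, ...")
    inverse_wavelength = RYDBERG * (1 / 2**2 - 1 / n**2)  # in m^-1
    return 1e9 / inverse_wavelength

for n in (3, 4, 5):
    print(f"n = {n}: {balmer_wavelength_nm(n):.1f} nm")

# Series limit (n -> infinity), where 1/lambda = R/4.
series_limit_nm = 1e9 * 4 / RYDBERG
print(f"series limit: {series_limit_nm:.1f} nm")
```

With this rounded value of R the computed lines match the quoted 656.3 nm, 486.1 nm and 434.1 nm to within about 0.1 nm.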
Other series of spectra for hydrogen were subsequently discovered.
These are known, after their discoverers, as Lyman, Paschen, Brackett,
and Pfund series. These are represented by the formulae:
Lyman series:
$$\frac{1}{\lambda} = R\left(\frac{1}{1^2} - \frac{1}{n^2}\right) \qquad n = 2, 3, 4, \ldots \tag{12.6}$$
Paschen series:
$$\frac{1}{\lambda} = R\left(\frac{1}{3^2} - \frac{1}{n^2}\right) \qquad n = 4, 5, 6, \ldots \tag{12.7}$$
Brackett series:
$$\frac{1}{\lambda} = R\left(\frac{1}{4^2} - \frac{1}{n^2}\right) \qquad n = 5, 6, 7, \ldots \tag{12.8}$$
Pfund series:
$$\frac{1}{\lambda} = R\left(\frac{1}{5^2} - \frac{1}{n^2}\right) \qquad n = 6, 7, 8, \ldots \tag{12.9}$$
The Lyman series is in the ultraviolet, and the Paschen and Brackett
series are in the infrared region.
The Balmer formula Eq. (12.5) may be written in terms of frequency
of the light, recalling that
$$c = \nu\lambda \quad \text{or} \quad \frac{1}{\lambda} = \frac{\nu}{c}$$
Thus, Eq. (12.5) becomes
$$\nu = Rc\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \tag{12.10}$$
There are only a few elements (hydrogen, singly ionised helium, and
doubly ionised lithium) whose spectra can be represented by simple
formula like Eqs. (12.5) – (12.9).
Equations (12.5) – (12.9) are useful as they give the wavelengths that
hydrogen atoms radiate or absorb. However, these results are empirical
and do not give any reasoning why only certain frequencies are observed
in the hydrogen spectrum.
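Equations (12.5) to (12.9) share one form, differing only in the fixed lower integer. A short sketch, under the assumption that this common form is wrapped in one helper (the names `SERIES` and `series_wavelength_nm` are invented here), lists the longest wavelength and the series limit of each series:

```python
import math

RYDBERG = 1.097e7  # Rydberg constant R in m^-1, as quoted in the text

# Fixed lower integer of each series, from Eqs. (12.5)-(12.9).
SERIES = {"Lyman": 1, "Balmer": 2, "Paschen": 3, "Brackett": 4, "Pfund": 5}

def series_wavelength_nm(n_lower, n_upper):
    """1/lambda = R (1/n_lower^2 - 1/n_upper^2); n_upper may be math.inf."""
    if n_upper <= n_lower:
        raise ValueError("n_upper must be larger than n_lower")
    inverse_wavelength = RYDBERG * (1 / n_lower**2 - 1 / n_upper**2)
    return 1e9 / inverse_wavelength

for name, n_lower in SERIES.items():
    longest = series_wavelength_nm(n_lower, n_lower + 1)  # first line of the series
    limit = series_wavelength_nm(n_lower, math.inf)       # shortest wavelength
    print(f"{name:9s} longest = {longest:8.1f} nm, limit = {limit:7.1f} nm")
```

Every Lyman wavelength falls well below the visible range (about 400 nm to 750 nm), while even the Paschen series limit lies above it, consistent with the ultraviolet and infrared assignments in the text.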

12.4 BOHR MODEL OF THE HYDROGEN ATOM

The model of the atom proposed by Rutherford assumes that the atom,
consisting of a central nucleus and revolving electron, is stable much like the
sun-planet system which the model imitates. However, there are some
fundamental differences between the two situations. While the planetary
system is held by gravitational force, the nucleus and electron, being
charged objects, interact by Coulomb's law of force. We know that an
object which moves in a circle is being constantly
accelerated, the acceleration being centripetal in nature.
According to classical electromagnetic theory, an
accelerating charged particle emits radiation in the form
of electromagnetic waves. The energy of an accelerating
electron should therefore, continuously decrease. The
electron would spiral inward and eventually fall into the
nucleus (Fig. 12.7). Thus, such an atom cannot be stable.
Further, according to the classical electromagnetic theory,
the frequency of the electromagnetic waves emitted by
the revolving electrons is equal to the frequency of
revolution. As the electrons spiral inwards, their angular
velocities and hence their frequencies would change
continuously, and so will the frequency of the light
emitted. Thus, they would emit a continuous spectrum,
in contradiction to the line spectrum actually observed.
Clearly Rutherford's model tells only a part of the story
implying that the classical ideas are not sufficient to
explain the atomic structure.

NIELS HENRIK DAVID BOHR (1885–1962)
Niels Henrik David Bohr (1885–1962) Danish physicist who explained the
spectrum of hydrogen atom based on quantum ideas. He gave a theory of nuclear
fission based on the liquid-drop model of nucleus. Bohr contributed to the
clarification of conceptual problems in quantum mechanics, in particular by
proposing the complementarity principle.

FIGURE 12.7 An accelerated atomic electron must spiral into the
nucleus as it loses energy.
Example 12.4 According to the classical electromagnetic theory,
calculate the initial frequency of the light emitted by the electron
revolving around a proton in hydrogen atom.
Solution From Example 12.3 we know that velocity of electron moving
around a proton in hydrogen atom in an orbit of radius $5.3 \times 10^{-11}$ m
is $2.2 \times 10^{6}$ m/s. Thus, the frequency of the electron moving around
the proton is
$$\nu = \frac{v}{2\pi r} = \frac{2.2 \times 10^{6}\ \text{m s}^{-1}}{2\pi\,(5.3 \times 10^{-11}\ \text{m})} \approx 6.6 \times 10^{15}\ \text{Hz}.$$
According to the classical electromagnetic theory we know that the
frequency of the electromagnetic waves emitted by the revolving
electrons is equal to the frequency of its revolution around the nucleus.
Thus the initial frequency of the light emitted is $6.6 \times 10^{15}$ Hz.
423

Physics
It was Niels Bohr (1885–1962) who made certain modifications in
this model by adding the ideas of the newly developing quantum
hypothesis. Niels Bohr studied in Rutherford's laboratory for several
months in 1912 and he was convinced about the validity of Rutherford's
nuclear model. Faced with the dilemma as discussed above, Bohr, in
1913, concluded that in spite of the success of electromagnetic theory in
explaining large-scale phenomena, it could not be applied to the processes
at the atomic scale. It became clear that a fairly radical departure from
the established principles of classical mechanics and electromagnetism
would be needed to understand the structure of atoms and the relation
of atomic structure to atomic spectra. Bohr combined classical and early
quantum concepts and gave his theory in the form of three postulates.
These are:
(i) Bohr's first postulate was that an electron in an atom could revolve
in certain stable orbits without the emission of radiant energy,
contrary to the predictions of electromagnetic theory. According to
this postulate, each atom has certain definite stable states in which it
can exist, and each possible state has definite total energy. These are
called the stationary states of the atom.
(ii) Bohr's second postulate defines these stable orbits. This postulate
states that the electron revolves around the nucleus only in those
orbits for which the angular momentum is some integral multiple of
$h/2\pi$ where h is the Planck's constant (= $6.6 \times 10^{-34}$ J s). Thus the
angular momentum (L) of the orbiting electron is quantised. That is
$$L = \frac{nh}{2\pi} \tag{12.11}$$
(iii) Bohr's third postulate incorporated into atomic theory the early
quantum concepts that had been developed by Planck and Einstein.
It states that an electron might make a transition from one of its
specified non-radiating orbits to another of lower energy. When it
does so, a photon is emitted having energy equal to the energy
difference between the initial and final states. The frequency of the
emitted photon is then given by
$$h\nu = E_i - E_f \tag{12.12}$$
where $E_i$ and $E_f$ are the energies of the initial and final states and $E_i > E_f$.
For a hydrogen atom, Eq. (12.4) gives the expression to determine
the energies of different energy states. But then this equation requires
the radius r of the electron orbit. To calculate r, Bohr's second postulate
about the angular momentum of the electron (the quantisation
condition) is used. The angular momentum L is given by
$$L = mvr$$
Bohr's second postulate of quantisation [Eq. (12.11)] says that the
allowed values of angular momentum are integral multiples of $h/2\pi$.
$$L_n = m v_n r_n = \frac{nh}{2\pi} \tag{12.13}$$
where n is an integer, $r_n$ is the radius of nth possible orbit and $v_n$ is the
speed of moving electron in the nth orbit. The allowed orbits are numbered
1, 2, 3 ..., according to the values of n, which is called the principal
quantum number of the orbit.
From Eq. (12.3), the relation between $v_n$ and $r_n$ is
$$v_n = \frac{e}{\sqrt{4\pi\varepsilon_0 m r_n}}$$
Combining it with Eq. (12.13), we get the following expressions for $v_n$
and $r_n$,
$$v_n = \frac{1}{n}\,\frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{(h/2\pi)} \tag{12.14}$$
and
$$r_n = \left(\frac{n^2}{m}\right)\left(\frac{h}{2\pi}\right)^2 \frac{4\pi\varepsilon_0}{e^2} \tag{12.15}$$
Eq. (12.14) depicts that the orbital speed in the nth orbit falls by a factor
of n. Using Eq. (12.15), the size of the innermost orbit (n = 1) can be
obtained as
$$r_1 = \frac{h^2\varepsilon_0}{\pi m e^2}$$
This is called the Bohr radius, represented by the symbol $a_0$. Thus,
$$a_0 = \frac{h^2\varepsilon_0}{\pi m e^2} \tag{12.16}$$
Substitution of values of h, m, $\varepsilon_0$ and e gives $a_0 = 5.29 \times 10^{-11}$ m. From
Eq. (12.15), it can also be seen that the radii of the orbits increase as $n^2$.
The total energy of the electron in the stationary states of the hydrogen
atom can be obtained by substituting the value of orbital radius in Eq.
(12.4) as
$$E_n = -\left(\frac{e^2}{8\pi\varepsilon_0}\right)\left(\frac{m}{n^2}\right)\left(\frac{2\pi}{h}\right)^2\left(\frac{e^2}{4\pi\varepsilon_0}\right)$$
$$\text{or} \quad E_n = -\frac{me^4}{8n^2\varepsilon_0^2 h^2} \tag{12.17}$$
Substituting values, Eq. (12.17) yields
$$E_n = -\frac{2.18 \times 10^{-18}}{n^2}\ \text{J} \tag{12.18}$$
Atomic energies are often expressed in electron volts (eV) rather than
joules. Since 1 eV = $1.6 \times 10^{-19}$ J, Eq. (12.18) can be rewritten as
$$E_n = -\frac{13.6}{n^2}\ \text{eV} \tag{12.19}$$
The negative sign of the total energy of an electron moving in an orbit
means that the electron is bound with the nucleus. Energy will thus be
required to remove the electron from the hydrogen atom to a distance
infinitely far away from its nucleus (or proton in hydrogen atom).
The derivation of Eqs. (12.17) – (12.19) involves the assumption that
the electronic orbits are circular, though orbits under inverse square
force are, in general, elliptical. (Planets move in elliptical orbits under the
inverse square gravitational force of the sun.) However, it was shown by
the German physicist Arnold Sommerfeld (1868–1951) that, when the
restriction of circular orbit is relaxed, these equations continue to hold
even for elliptic orbits.
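The closed-form results of the Bohr model, Eqs. (12.14) to (12.17), can be checked numerically. The sketch below assumes standard four-figure values for h, m, e and ε₀, which the text does not list explicitly; the function names are likewise chosen here for illustration.

```python
import math

# Assumed four-figure constants (not quoted in the text).
H = 6.626e-34         # Planck constant, J s
M_E = 9.109e-31       # electron mass, kg
E_CHARGE = 1.602e-19  # elementary charge, C
EPS0 = 8.854e-12      # permittivity of free space, F/m

def orbit_radius(n):
    """r_n = n^2 h^2 eps0 / (pi m e^2), from Eqs. (12.15) and (12.16)."""
    return n**2 * H**2 * EPS0 / (math.pi * M_E * E_CHARGE**2)

def orbit_speed(n):
    """v_n = e^2 / (2 eps0 h n), an equivalent form of Eq. (12.14)."""
    return E_CHARGE**2 / (2 * EPS0 * H * n)

def energy_joule(n):
    """E_n = -m e^4 / (8 n^2 eps0^2 h^2), Eq. (12.17)."""
    return -M_E * E_CHARGE**4 / (8 * n**2 * EPS0**2 * H**2)

for n in (1, 2, 3):
    print(f"n = {n}: r = {orbit_radius(n):.3e} m, "
          f"v = {orbit_speed(n):.3e} m/s, "
          f"E = {energy_joule(n) / E_CHARGE:.2f} eV")
```

For n = 1 this reproduces the Bohr radius of about 5.29 × 10⁻¹¹ m and an energy close to −13.6 eV, and it makes the n² growth of the radii and the 1/n fall of the speeds easy to see.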

ORBIT VS STATE (ORBITAL PICTURE) OF ELECTRON IN ATOM


We are introduced to the Bohr Model of atom one time or the other in the course of
physics. This model has its place in the history of quantum mechanics and particularly
in explaining the structure of an atom. It has become a milestone since Bohr introduced
the revolutionary idea of definite energy orbits for the electrons, contrary to the classical
picture requiring an accelerating particle to radiate. Bohr also introduced the idea of
quantisation of angular momentum of electrons moving in definite orbits. Thus it was a
semi-classical picture of the structure of atom.
Now with the development of quantum mechanics, we have a better understanding
of the structure of atom. Solutions of the Schrödinger wave equation assign a wave-like
description to the electrons bound in an atom due to attractive forces of the protons.
An orbit of the electron in the Bohr model is the circular path of motion of an electron
around the nucleus. But according to quantum mechanics, we cannot associate a definite
path with the motion of the electrons in an atom. We can only talk about the probability
of finding an electron in a certain region of space around the nucleus. This probability
can be inferred from the one-electron wave function called the orbital. This function
depends only on the coordinates of the electron.
It is therefore essential that we understand the subtle differences that exist in the two
models:

Bohr model is valid for only one-electron atoms/ions; an energy value, assigned to
each orbit, depends on the principal quantum number n in this model. We know
that energy associated with a stationary state of an electron depends on n only, for
one-electron atoms/ions. For a multi-electron atom/ion, this is not true.

The solution of the Schrödinger wave equation, obtained for hydrogen-like atoms/
ions, called the wave function, gives information about the probability of finding an
electron in various regions around the nucleus. This orbital has no resemblance
whatsoever with the orbit defined for an electron in the Bohr model.

Example 12.5 A 10 kg satellite circles earth once every 2 h in an
orbit having a radius of 8000 km. Assuming that Bohr's angular
momentum postulate applies to satellites just as it does to an electron
in the hydrogen atom, find the quantum number of the orbit of the
satellite.
Solution
From Eq. (12.13), we have
$$m v_n r_n = \frac{nh}{2\pi}$$
Here m = 10 kg and $r_n = 8 \times 10^{6}$ m. We have the time period T of the
circling satellite as 2 h. That is T = 7200 s.
Thus the velocity $v_n = 2\pi r_n/T$.
The quantum number of the orbit of satellite
$$n = \frac{(2\pi r_n)^2\, m}{T\, h}$$
Substituting the values,
$$n = \frac{(2\pi \times 8 \times 10^{6}\ \text{m})^2 \times 10\ \text{kg}}{7200\ \text{s} \times 6.64 \times 10^{-34}\ \text{J s}} = 5.3 \times 10^{45}$$
Note that the quantum number for the satellite motion is extremely
large! In fact for such large quantum numbers the results of
quantisation conditions tend to those of classical physics.
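The estimate in Example 12.5 follows from m v r = nh/2π with v = 2πr/T. As a rough sketch (the function name `bohr_quantum_number` is made up here, and h defaults to the value used in the example):

```python
import math

def bohr_quantum_number(mass_kg, radius_m, period_s, h=6.64e-34):
    """n = (2 pi r)^2 m / (T h), from m v r = n h / (2 pi) with v = 2 pi r / T."""
    return (2 * math.pi * radius_m) ** 2 * mass_kg / (period_s * h)

# 10 kg satellite, orbit radius 8000 km, period 2 h = 7200 s.
n_satellite = bohr_quantum_number(mass_kg=10, radius_m=8e6, period_s=7200)
print(f"n = {n_satellite:.1e}")
```

The result is of order 10⁴⁵, so neighbouring allowed orbits are far too close together to distinguish, which is why quantisation goes unnoticed for macroscopic bodies.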

12.4.1 Energy levels


The energy of an atom is the least (largest negative value)
when its electron is revolving in an orbit closest to the
nucleus i.e., the one for which n = 1. For n = 2, 3, ... the
absolute value of the energy E is smaller, hence the energy
is progressively larger in the outer orbits. The lowest state
of the atom, called the ground state, is that of the lowest
energy, with the electron revolving in the orbit of smallest
radius, the Bohr radius, a 0. The energy of this state (n = 1),
E1 is 13.6 eV. Therefore, the minimum energy required to
free the electron from the ground state of the hydrogen atom
is 13.6 eV. It is called the ionisation energy of the hydrogen
atom. This prediction of the Bohrs model is in excellent
agreement with the experimental value of ionisation energy.
At room temperature, most of the hydrogen atoms are
in ground state. When a hydrogen atom receives energy
by processes such as electron collisions, the atom may
acquire sufficient energy to raise the electron to higher
energy states. The atom is then said to be in an excited
state. From Eq. (12.19), for n = 2; the energy E2 is
3.40 eV. It means that the energy required to excite an
electron in hydrogen atom to its first excited state, is an
energy equal to E2 E1 = 3.40 eV (13.6) eV = 10.2 eV.
Similarly, E3 = 1.51 eV and E3 E1 = 12.09 eV, or to excite
the hydrogen atom from its ground state (n = 1) to second
excited state (n = 3), 12.09 eV energy is required, and so
on. From these excited states the electron can then fall back
to a state of lower energy, emitting a photon in the process.
Thus, as the excitation of hydrogen atom increases (that is
as n increases) the value of minimum energy required to
free the electron from the excited atom decreases.
The energy level diagram* for the stationary states of a
hydrogen atom, computed from Eq. (12.19), is given in

FIGURE 12.8 The energy level diagram for the hydrogen atom.
The electron in a hydrogen atom at room temperature spends
most of its time in the ground state. To ionise a hydrogen
atom, 13.6 eV of energy must be supplied to free the electron
from the ground state. (The horizontal lines specify the
presence of allowed energy states.)

* An electron can have any total energy above E = 0 eV. In such situations the
electron is free. Thus there is a continuum of energy states above E = 0 eV, as
shown in Fig. 12.8.

Fig. 12.8. The principal quantum number n labels the stationary
states in the ascending order of energy. In this diagram, the highest
energy state corresponds to n = ∞ in Eq. (12.19) and has an energy
of 0 eV. This is the energy of the atom when the electron is
completely removed (r = ∞) from the nucleus and is at rest. Observe how
the energies of the excited states come closer and closer together as n
increases.

FRANCK-HERTZ EXPERIMENT

The existence of discrete energy levels in an atom was directly verified in 1914 by James
Franck and Gustav Hertz. They studied the spectrum of mercury vapour when electrons
having different kinetic energies passed through the vapour. The electron energy was
varied by subjecting the electrons to electric fields of varying strength. The electrons
collide with the mercury atoms and can transfer energy to the mercury atoms. This can
only happen when the energy of the electron is higher than the energy difference between
an energy level of Hg occupied by an electron and a higher unoccupied level (see Figure).
For instance, the difference between an occupied energy level of Hg and a higher
unoccupied level is 4.9 eV. If an electron having an energy of 4.9 eV or more passes
through mercury, an electron in a mercury atom can absorb energy from the bombarding
electron and get excited to the higher level [Fig. (a)]. The colliding electron's kinetic energy
would reduce by this amount.

The excited electron would subsequently fall back to the ground state by emission of
radiation [Fig. (b)]. The wavelength of emitted radiation is:

\[
\lambda = \frac{hc}{E} = \frac{6.625 \times 10^{-34} \times 3 \times 10^{8}}{4.9 \times 1.6 \times 10^{-19}}\ \text{m} = 253\ \text{nm}
\]

By direct measurement, Franck and Hertz found that the emission spectrum of
mercury has a line corresponding to this wavelength. For this experimental verification
of Bohr's basic ideas of discrete energy levels in atoms and the process of photon emission,
Franck and Hertz were awarded the Nobel Prize in 1925.
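
The wavelength estimate above can be reproduced in a few lines of Python. This is only a sketch using the same rounded constants as the calculation in the box; the function name is an illustrative assumption.

```python
# Wavelength of the photon emitted when a mercury atom de-excites by 4.9 eV,
# using the rounded constants that appear in the calculation above.

PLANCK_H = 6.625e-34      # J s (rounded value used in the text)
SPEED_OF_LIGHT = 3e8      # m/s (rounded)
ELECTRON_VOLT = 1.6e-19   # joule per eV (rounded)


def photon_wavelength_nm(energy_ev: float) -> float:
    """Return the wavelength in nm of a photon with the given energy in eV."""
    energy_joule = energy_ev * ELECTRON_VOLT
    return PLANCK_H * SPEED_OF_LIGHT / energy_joule * 1e9


print(f"{photon_wavelength_nm(4.9):.1f} nm")
```

The result lies between 253 nm and 254 nm, in the ultraviolet, consistent with the value quoted above.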

12.5 THE LINE SPECTRA OF THE HYDROGEN ATOM

According to the third postulate of Bohr's model, when an atom makes a
transition from the higher energy state with quantum number ni to the
lower energy state with quantum number nf (nf < ni), the difference of
energy is carried away by a photon of frequency νif such that

\[ h\nu_{if} = E_{n_i} - E_{n_f} \tag{12.20} \]

Using Eq. (12.16), for Enf and Eni, we get


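\[ h\nu_{if} = \frac{m e^{4}}{8 \varepsilon_0^{2} h^{2}} \left( \frac{1}{n_f^{2}} - \frac{1}{n_i^{2}} \right) \tag{12.21} \]

or

\[ \nu_{if} = \frac{m e^{4}}{8 \varepsilon_0^{2} h^{3}} \left( \frac{1}{n_f^{2}} - \frac{1}{n_i^{2}} \right) \tag{12.22} \]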

Equation (12.21) is the Rydberg formula for the spectrum of the
hydrogen atom. In this relation, if we take nf = 2 and ni = 3, 4, 5, ..., it
reduces to a form similar to Eq. (12.10) for the Balmer series. The Rydberg
constant R is readily identified to be

\[ R = \frac{m e^{4}}{8 \varepsilon_0^{2} h^{3} c} \tag{12.23} \]

If we insert the values of various constants in Eq. (12.23), we get

R = 1.03 × 10⁷ m⁻¹

This is a value very close to the value (1.097 × 10⁷ m⁻¹) obtained from the
empirical Balmer formula. This agreement between the theoretical and
experimental values of the Rydberg constant provided a direct and
striking confirmation of Bohr's model.
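
Equation (12.23) can be evaluated numerically as a check. The sketch below assumes standard tabulated (CODATA-style) values for the constants, which are not given in this section. With these values the expression comes out near 1.097 × 10⁷ m⁻¹, closer to the empirical figure than the rounded 1.03 × 10⁷ m⁻¹ quoted above. The helper `balmer_wavelength_nm` is an illustrative addition that applies the Rydberg formula with nf = 2.

```python
# Numerical evaluation of R = m e^4 / (8 eps0^2 h^3 c), Eq. (12.23).
# The constants are standard tabulated values (an assumption; the text
# does not list them).

ELECTRON_MASS = 9.1093837e-31           # kg
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
PLANCK_H = 6.62607015e-34               # J s
SPEED_OF_LIGHT = 2.99792458e8           # m/s


def rydberg_constant() -> float:
    """Return R in m^-1 from Eq. (12.23)."""
    numerator = ELECTRON_MASS * ELEMENTARY_CHARGE**4
    denominator = 8 * VACUUM_PERMITTIVITY**2 * PLANCK_H**3 * SPEED_OF_LIGHT
    return numerator / denominator


def balmer_wavelength_nm(n_i: int) -> float:
    """Wavelength (nm) of the Balmer line for the transition n_i -> 2."""
    if n_i < 3:
        raise ValueError("Balmer lines need n_i >= 3")
    inverse_wavelength = rydberg_constant() * (1 / 2**2 - 1 / n_i**2)
    return 1e9 / inverse_wavelength


print(f"R = {rydberg_constant():.4e} m^-1")
print(f"First Balmer line: {balmer_wavelength_nm(3):.1f} nm")
```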
Since both nf and ni are integers, this immediately shows that in
transitions between different atomic levels, light is radiated in various
discrete frequencies. For the hydrogen spectrum, the Balmer formula
corresponds to nf = 2 and ni = 3, 4, 5, etc. The results of Bohr's model
suggested the presence of other series spectra for the hydrogen atom: those
corresponding to transitions resulting from nf = 1 and ni = 2, 3, etc.; nf = 3
and ni = 4, 5, etc., and so on. Such series were identified in the course of
spectroscopic investigations and are known as the Lyman, Balmer,
Paschen, Brackett, and Pfund series. The electronic transitions
corresponding to these series are shown in Fig. 12.9.
FIGURE 12.9 Line spectra originate in transitions between energy levels.

The various lines in the atomic spectra are produced when electrons
jump from a higher energy state to a lower energy state and photons are
emitted. These spectral lines are called emission lines. But when an atom
absorbs a photon that has precisely
the same energy needed by the electron in a lower energy state to make
transitions to a higher energy state, the process is called absorption.
Thus if photons with a continuous range of frequencies pass through a
rarefied gas and then are analysed with a spectrometer, a series of dark
spectral absorption lines appear in the continuous spectrum. The dark
lines indicate the frequencies that have been absorbed by the atoms of
the gas.
The explanation of the hydrogen atom spectrum provided by Bohr's
model was a brilliant achievement, which greatly stimulated progress
towards the modern quantum theory. In 1922, Bohr was awarded the Nobel
Prize in Physics.
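Example 12.6 Using the Rydberg formula, calculate the wavelengths
of the first four spectral lines in the Lyman series of the hydrogen
spectrum.
Solution The Rydberg formula is

\[ \frac{hc}{\lambda_{if}} = \frac{m e^{4}}{8 \varepsilon_0^{2} h^{2}} \left( \frac{1}{n_f^{2}} - \frac{1}{n_i^{2}} \right) \]

The wavelengths of the first four lines in the Lyman series correspond
to transitions from ni = 2, 3, 4, 5 to nf = 1. We know that

\[ \frac{m e^{4}}{8 \varepsilon_0^{2} h^{2}} = 13.6\ \text{eV} = 21.76 \times 10^{-19}\ \text{J} \]

Therefore,

\[ \lambda_{i1} = \frac{hc}{21.76 \times 10^{-19} \left( \dfrac{1}{1} - \dfrac{1}{n_i^{2}} \right)}\ \text{m}
= \frac{6.625 \times 10^{-34} \times 3 \times 10^{8} \times n_i^{2}}{21.76 \times 10^{-19} \times (n_i^{2} - 1)}\ \text{m}
= \frac{0.9134\, n_i^{2}}{(n_i^{2} - 1)} \times 10^{-7}\ \text{m}
= \frac{913.4\, n_i^{2}}{(n_i^{2} - 1)}\ \text{Å} \]

Substituting ni = 2, 3, 4, 5, we get λ21 = 1218 Å, λ31 = 1028 Å, λ41 = 974.3 Å,
and λ51 = 951.4 Å.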
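
The substitution in Example 12.6 is easy to repeat in code. The following Python sketch uses the rounded prefactor 913.4 Å from the example. The optional `n_f` argument generalises it to other series and is an illustrative addition, not part of the worked example.

```python
# Wavelengths from lambda = 913.4 / (1/n_f^2 - 1/n_i^2) angstrom.
# For n_f = 1 this is the Lyman-series result of Example 12.6,
# lambda = 913.4 * n_i^2 / (n_i^2 - 1) angstrom.

SERIES_PREFACTOR_ANGSTROM = 913.4  # hc / (13.6 eV), rounded as in the example


def series_wavelength_angstrom(n_i: int, n_f: int = 1) -> float:
    """Wavelength (angstrom) emitted in the hydrogen transition n_i -> n_f."""
    if n_i <= n_f:
        raise ValueError("n_i must exceed n_f for emission")
    return SERIES_PREFACTOR_ANGSTROM / (1 / n_f**2 - 1 / n_i**2)


for n_i in (2, 3, 4, 5):
    print(n_i, f"{series_wavelength_angstrom(n_i):.1f}")
```

The printed values agree with those of the example to within the rounding used there.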

12.6 DE BROGLIE'S EXPLANATION OF BOHR'S SECOND POSTULATE OF QUANTISATION

Of all the postulates Bohr made in his model of the atom, perhaps the
most puzzling is his second postulate. It states that the angular
momentum of the electron orbiting around the nucleus is quantised (that
is, Ln = nh/2π; n = 1, 2, 3, ...). Why should the angular momentum have
only those values that are integral multiples of h/2π? The French physicist
Louis de Broglie explained this puzzle in 1923, ten years after Bohr
proposed his model.
We studied, in Chapter 11, de Broglie's hypothesis that
material particles, such as electrons, also have a wave nature. C. J. Davisson
and L. H. Germer later experimentally verified the wave nature of electrons
in 1927. Louis de Broglie argued that the electron in its
circular orbit, as proposed by Bohr, must be seen as a particle
wave. In analogy to waves travelling on a string, particle waves
too can lead to standing waves under resonant conditions.
From Chapter 15 of Class XI Physics textbook, we know that
when a string is plucked, a vast number of wavelengths are
excited. However only those wavelengths survive which have
nodes at the ends and form the standing wave in the string. It
means that in a string, standing waves are formed when the
total distance travelled by a wave down the string and back is
one wavelength, two wavelengths, or any integral number of
wavelengths. Waves with other wavelengths interfere with
themselves upon reflection and their amplitudes quickly drop
to zero. For an electron moving in the nth circular orbit of radius
rn, the total distance is the circumference of the orbit, 2πrn.
Thus

\[ 2\pi r_n = n\lambda, \quad n = 1, 2, 3, \ldots \tag{12.24} \]

FIGURE 12.10 A standing wave is shown on a circular orbit where four
de Broglie wavelengths fit into the circumference of the orbit.

Figure 12.10 illustrates a standing particle wave on a
circular orbit for n = 4, i.e., 2πrn = 4λ, where λ is the de Broglie
wavelength of the electron moving in the nth orbit. From Chapter
11, we have λ = h/p, where p is the magnitude of the electron's
momentum. If the speed of the electron is much less than the speed of
light, the momentum is mvn. Thus, λ = h/mvn. From Eq. (12.24), we have

\[ 2\pi r_n = \frac{nh}{m v_n} \quad \text{or} \quad m v_n r_n = \frac{nh}{2\pi} \]

This is the quantum condition proposed by Bohr for the angular
momentum of the electron [Eq. (12.13)]. In Section 12.5, we saw that
this equation is the basis of explaining the discrete orbits and energy
levels in hydrogen atom. Thus de Broglie's hypothesis provided an
explanation for Bohr's second postulate for the quantisation of angular
momentum of the orbiting electron. The quantised electron orbits and
energy states are due to the wave nature of the electron and only resonant
standing waves can persist.
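
The standing-wave condition can be checked numerically. The sketch below assumes the standard Bohr-model expressions for the orbit radius, rn = n²h²ε0/(πme²), and the orbital speed, vn = e²/(2ε0hn), together with tabulated values of the constants; neither the expressions nor the constants are stated in this section. It then counts how many de Broglie wavelengths fit into the circumference of the nth orbit.

```python
import math

# Tabulated constants (assumed values, not given in this section).
ELECTRON_MASS = 9.1093837e-31           # kg
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
PLANCK_H = 6.62607015e-34               # J s


def orbit_radius(n: int) -> float:
    """Bohr radius of the n-th orbit: r_n = n^2 h^2 eps0 / (pi m e^2)."""
    return (n**2 * PLANCK_H**2 * VACUUM_PERMITTIVITY
            / (math.pi * ELECTRON_MASS * ELEMENTARY_CHARGE**2))


def orbit_speed(n: int) -> float:
    """Electron speed in the n-th orbit: v_n = e^2 / (2 eps0 h n)."""
    return ELEMENTARY_CHARGE**2 / (2 * VACUUM_PERMITTIVITY * PLANCK_H * n)


def wavelengths_per_orbit(n: int) -> float:
    """Circumference of the n-th orbit divided by the de Broglie wavelength."""
    de_broglie_wavelength = PLANCK_H / (ELECTRON_MASS * orbit_speed(n))
    return 2 * math.pi * orbit_radius(n) / de_broglie_wavelength


for n in (1, 2, 3, 4):
    print(n, round(wavelengths_per_orbit(n), 6))
```

Up to floating-point rounding, the ratio equals n for every orbit, so exactly n wavelengths fit, as Eq. (12.24) requires.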
Bohr's model, involving classical trajectory picture (planet-like electron
orbiting the nucleus), correctly predicts the gross features of the
hydrogenic atoms*, in particular, the frequencies of the radiation emitted
or selectively absorbed. This model however has many limitations.
Some are:
(i) The Bohr model is applicable to hydrogenic atoms. It cannot be
extended even to mere two-electron atoms such as helium. The analysis
of atoms with more than one electron was attempted on the lines of
Bohr's model for hydrogenic atoms but did not meet with any success.
The difficulty lies in the fact that each electron interacts not only with the
positively charged nucleus but also with all other electrons.

* Hydrogenic atoms are the atoms consisting of a nucleus with positive charge
+Ze and a single electron, where Z is the proton number. Examples are hydrogen
atom, singly ionised helium, doubly ionised lithium, and so forth. In these
atoms more complex electron-electron interactions are nonexistent.

The formulation of the Bohr model involves electrical force between
positively charged nucleus and electron. It does not include the
electrical forces between electrons which necessarily appear in
multi-electron atoms.
(ii) While Bohr's model correctly predicts the frequencies of the light
emitted by hydrogenic atoms, the model is unable to explain the
relative intensities of the frequencies in the spectrum. In the emission
spectrum of hydrogen, some of the visible frequencies have weak
intensity, others strong. Why? Experimental observations show that
some transitions are more favoured than others. Bohr's model is
unable to account for the intensity variations.
Bohr's model presents an elegant picture of an atom but cannot be
generalised to complex atoms. For complex atoms we have to use a new
and radical theory based on Quantum Mechanics, which provides a more
complete picture of the atomic structure.

LASER LIGHT

Imagine a crowded market place or a railway platform with people entering a gate and
going towards all directions. Their footsteps are random and there is no phase correlation
between them. On the other hand, think of a large number of soldiers in a regulated march.
Their footsteps are very well correlated. See figure here.
This is similar to the difference between light emitted by
an ordinary source like a candle or a bulb and that emitted
by a laser. The acronym LASER stands for Light Amplification
by Stimulated Emission of Radiation. Since its development
in 1960, it has entered into all areas of science and technology.
It has found applications in physics, chemistry, biology,
medicine, surgery, engineering, etc. There are low power
lasers, with a power of 0.5 mW, called pencil lasers, which
serve as pointers. There are also lasers of different power,
suitable for delicate surgery of eye or glands in the stomach.
Finally, there are lasers which can cut or weld steel.
Light is emitted from a source in the form of packets of
waves. Light coming out from an ordinary source contains a mixture of many wavelengths.
There is also no phase relation between the various waves. Therefore, such light, even if it is
passed through an aperture, spreads very fast and the beam size increases rapidly with
distance. In the case of laser light, the wavelength of each packet is almost the same. Also
the average length of the packet of waves is much larger. This means that there is better
phase correlation over a longer duration of time. This results in reducing the divergence of
a laser beam substantially.
If there are N atoms in a source, each emitting light with intensity I, then the total
intensity produced by an ordinary source is proportional to NI, whereas in a laser source,
it is proportional to N²I. Considering that N is very large, we see that the light from a laser
can be much stronger than that from an ordinary source.
When astronauts of the Apollo missions visited the moon, they placed a mirror on its
surface, facing the earth. Then scientists on the earth sent a strong laser beam, which was
reflected by the mirror on the moon and received back on the earth. The size of the reflected
laser beam and the time taken for the round trip were measured. This allowed a very
accurate determination of (a) the extremely small divergence of a laser beam and (b) the
distance of the moon from the earth.

SUMMARY
1. Atom, as a whole, is electrically neutral and therefore contains equal
amount of positive and negative charges.
2. In Thomson's model, an atom is a spherical cloud of positive charges
with electrons embedded in it.
3. In Rutherford's model, most of the mass of the atom and all its positive
charge are concentrated in a tiny nucleus (typically about one
ten-thousandth the size of an atom), and the electrons revolve around it.
4. Rutherford's nuclear model has two main difficulties in explaining the
structure of atom: (a) It predicts that atoms are unstable because the
accelerated electrons revolving around the nucleus must spiral into
the nucleus. This contradicts the stability of matter. (b) It cannot
explain the characteristic line spectra of atoms of different elements.
5. Atoms of each element are stable and emit characteristic spectrum.
The spectrum consists of a set of isolated parallel lines termed as line
spectrum. It provides useful information about the atomic structure.
6. The atomic hydrogen emits a line spectrum consisting of various series.
The frequency of any line in a series can be expressed as a difference
of two terms:

Lyman series:
\[ \nu = Rc \left( \frac{1}{1^{2}} - \frac{1}{n^{2}} \right); \quad n = 2, 3, 4, \ldots \]
Balmer series:
\[ \nu = Rc \left( \frac{1}{2^{2}} - \frac{1}{n^{2}} \right); \quad n = 3, 4, 5, \ldots \]
Paschen series:
\[ \nu = Rc \left( \frac{1}{3^{2}} - \frac{1}{n^{2}} \right); \quad n = 4, 5, 6, \ldots \]
Brackett series:
\[ \nu = Rc \left( \frac{1}{4^{2}} - \frac{1}{n^{2}} \right); \quad n = 5, 6, 7, \ldots \]
Pfund series:
\[ \nu = Rc \left( \frac{1}{5^{2}} - \frac{1}{n^{2}} \right); \quad n = 6, 7, 8, \ldots \]

7. To explain the line spectra emitted by atoms, as well as the stability
of atoms, Niels Bohr proposed a model for hydrogenic (single electron)
atoms. He introduced three postulates and laid the foundations of
quantum mechanics:
(a) In a hydrogen atom, an electron revolves in certain stable orbits
(called stationary orbits) without the emission of radiant energy.
(b) The stationary orbits are those for which the angular momentum
is some integral multiple of h/2π. (Bohr's quantisation condition.)
That is L = nh/2π, where n is an integer called a quantum number.
(c) The third postulate states that an electron might make a transition
from one of its specified non-radiating orbits to another of lower
energy. When it does so, a photon is emitted having energy equal
to the energy difference between the initial and final states. The
frequency (ν) of the emitted photon is then given by
hν = Ei − Ef
An atom absorbs radiation of the same frequency the atom emits,
in which case the electron is transferred to an orbit with a higher
value of n.
Ei + hν = Ef
8. As a result of the quantisation condition of angular momentum, the
electron orbits the nucleus at only specific radii. For a hydrogen atom
it is given by

\[ r_n = \left( \frac{n^{2}}{m} \right) \left( \frac{h}{2\pi} \right)^{2} \frac{4\pi\varepsilon_0}{e^{2}} \]

The total energy is also quantised:

\[ E_n = -\frac{m e^{4}}{8 n^{2} \varepsilon_0^{2} h^{2}} = -\frac{13.6\ \text{eV}}{n^{2}} \]

The n = 1 state is called the ground state. In the hydrogen atom the ground
state energy is −13.6 eV. Higher values of n correspond to excited
states (n > 1). Atoms are excited to these higher states by collisions
with other atoms or electrons or by absorption of a photon of the right
frequency.
9. de Broglie's hypothesis that electrons have a wavelength λ = h/mv gave
an explanation for Bohr's quantised orbits by bringing in the wave-particle
duality. The orbits correspond to circular standing waves in
which the circumference of the orbit equals a whole number of
wavelengths.
10. Bohr's model is applicable only to hydrogenic (single electron) atoms.
It cannot be extended to even two-electron atoms such as helium.
This model is also unable to explain the relative intensities of the
frequencies emitted even by hydrogenic atoms.

POINTS TO PONDER
1. Both Thomson's as well as Rutherford's models constitute an
unstable system. Thomson's model is unstable electrostatically, while
Rutherford's model is unstable because of electromagnetic radiation
of orbiting electrons.
2. What made Bohr quantise angular momentum (second postulate) and
not some other quantity? Note, h has dimensions of angular
momentum, and for circular orbits, angular momentum is a very
relevant quantity. The second postulate is then so natural!
3. The orbital picture in Bohr's model of the hydrogen atom was
inconsistent with the uncertainty principle. It was replaced by modern
quantum mechanics in which Bohr's orbits are regions where the
electron may be found with large probability.
4. Unlike the situation in the solar system, where planet-planet
gravitational forces are very small as compared to the gravitational
force of the sun on each planet (because the mass of the sun is so
much greater than the mass of any of the planets), the electron-electron
electric force interaction is comparable in magnitude to the
electron-nucleus electrical force, because the charges and distances are of the
same order of magnitude. This is the reason why Bohr's model
with its planet-like electron is not applicable to many-electron atoms.
5. Bohr laid the foundation of the quantum theory by postulating specific
orbits in which electrons do not radiate. Bohr's model includes only
one quantum number n. The new theory called quantum mechanics
supports Bohr's postulate. However in quantum mechanics (more
generally accepted), a given energy level may not correspond to just
one quantum state. For example, a state is characterised by four
quantum numbers (n, l, m, and s), but for a pure Coulomb potential
(as in hydrogen atom) the energy depends only on n.
6. In the Bohr model, contrary to ordinary classical expectation, the
frequency of revolution of an electron in its orbit is not connected to
the frequency of the spectral line. The latter is the difference between two
orbital energies divided by h. For transitions between large quantum
numbers (n to n − 1, n very large), however, the two coincide as expected.
7. Bohr's semiclassical model, based on some aspects of classical physics
and some aspects of modern physics, also does not provide a true
picture of the simplest hydrogenic atoms. The true picture is a quantum
mechanical affair which differs from the Bohr model in a number of
fundamental ways. But then if the Bohr model is not strictly correct,
why do we bother about it? The reasons which make Bohr's model
still useful are:
(i) The model is based on just three postulates but accounts for almost
all the general features of the hydrogen spectrum.
(ii) The model incorporates many of the concepts we have learnt in
classical physics.
(iii) The model demonstrates how a theoretical physicist occasionally
must quite literally ignore certain problems of approach in hopes
of being able to make some predictions. If the predictions of the
theory or model agree with experiment, a theoretician then must
somehow hope to explain away or rationalise the problems that
were ignored along the way.

EXERCISES
12.1 Choose the correct alternative from the clues given at the end of
each statement:
(a) The size of the atom in Thomson's model is .......... the atomic
size in Rutherford's model. (much greater than/no different
from/much less than.)
(b) In the ground state of .......... electrons are in stable equilibrium,
while in .......... electrons always experience a net force.
(Thomson's model/ Rutherford's model.)
(c) A classical atom based on .......... is doomed to collapse.
(Thomson's model/ Rutherford's model.)
(d) An atom has a nearly continuous mass distribution in a ..........
but has a highly non-uniform mass distribution in ..........
(Thomson's model/ Rutherford's model.)
(e) The positively charged part of the atom possesses most of the
mass in .......... (Rutherford's model/both the models.)
12.2 Suppose you are given a chance to repeat the alpha-particle
scattering experiment using a thin sheet of solid hydrogen in place
of the gold foil. (Hydrogen is a solid at temperatures below 14 K.)
What results do you expect?

12.3 What is the shortest wavelength present in the Paschen series of
spectral lines?
12.4 A difference of 2.3 eV separates two energy levels in an atom. What
is the frequency of radiation emitted when the atom makes a
transition from the upper level to the lower level?
12.5 The ground state energy of hydrogen atom is −13.6 eV. What are the
kinetic and potential energies of the electron in this state?
12.6 A hydrogen atom initially in the ground level absorbs a photon,
which excites it to the n = 4 level. Determine the wavelength and
frequency of the photon.
12.7 (a) Using Bohr's model, calculate the speed of the electron in a
hydrogen atom in the n = 1, 2, and 3 levels. (b) Calculate the orbital
period in each of these levels.
12.8 The radius of the innermost electron orbit of a hydrogen atom is
5.3 × 10⁻¹¹ m. What are the radii of the n = 2 and n = 3 orbits?
12.9 A 12.5 eV electron beam is used to bombard gaseous hydrogen at
room temperature. What series of wavelengths will be emitted?
12.10 In accordance with Bohr's model, find the quantum number
that characterises the earth's revolution around the sun in an orbit
of radius 1.5 × 10¹¹ m with orbital speed 3 × 10⁴ m/s. (Mass of earth
= 6.0 × 10²⁴ kg.)

ADDITIONAL EXERCISES

12.11 Answer the following questions, which help you understand the
difference between Thomson's model and Rutherford's model better.
(a) Is the average angle of deflection of α-particles by a thin gold foil
predicted by Thomson's model much less, about the same, or
much greater than that predicted by Rutherford's model?
(b) Is the probability of backward scattering (i.e., scattering of
α-particles at angles greater than 90°) predicted by Thomson's
model much less, about the same, or much greater than that
predicted by Rutherford's model?
(c) Keeping other factors fixed, it is found experimentally that for
small thickness t, the number of α-particles scattered at
moderate angles is proportional to t. What clue does this linear
dependence on t provide?
(d) In which model is it completely wrong to ignore multiple
scattering for the calculation of average angle of scattering of
α-particles by a thin foil?
12.12 The gravitational attraction between electron and proton in a
hydrogen atom is weaker than the coulomb attraction by a factor of
about 10⁻⁴⁰. An alternative way of looking at this fact is to estimate
the radius of the first Bohr orbit of a hydrogen atom if the electron
and proton were bound by gravitational attraction. You will find the
answer interesting.
12.13 Obtain an expression for the frequency of radiation emitted when a
hydrogen atom de-excites from level n to level (n − 1). For large n,
show that this frequency equals the classical frequency of revolution
of the electron in the orbit.

12.14 Classically, an electron can be in any orbit around the nucleus of
an atom. Then what determines the typical atomic size? Why is an
atom not, say, thousand times bigger than its typical size? The
question had greatly puzzled Bohr before he arrived at his famous
model of the atom that you have learnt in the text. To simulate what
he might well have done before his discovery, let us play as follows
with the basic constants of nature and see if we can get a quantity
with the dimensions of length that is roughly equal to the known
size of an atom (~ 10⁻¹⁰ m).
(a) Construct a quantity with the dimensions of length from the
fundamental constants e, me, and c. Determine its numerical
value.
(b) You will find that the length obtained in (a) is many orders of
magnitude smaller than the atomic dimensions. Further, it
involves c. But energies of atoms are mostly in non-relativistic
domain where c is not expected to play any role. This is what
may have suggested Bohr to discard c and look for something
else to get the right atomic size. Now, Planck's constant h
had already made its appearance elsewhere. Bohr's great insight
lay in recognising that h, me, and e will yield the right atomic
size. Construct a quantity with the dimension of length from h,
me, and e and confirm that its numerical value has indeed the
correct order of magnitude.
12.15 The total energy of an electron in the first excited state of the
hydrogen atom is about −3.4 eV.
(a) What is the kinetic energy of the electron in this state?
(b) What is the potential energy of the electron in this state?
(c) Which of the answers above would change if the choice of the
zero of potential energy is changed?
12.16 If Bohr's quantisation postulate (angular momentum = nh/2π) is a
basic law of nature, it should be equally valid for the case of planetary
motion also. Why then do we never speak of quantisation of orbits
of planets around the sun?
12.17 Obtain the first Bohr radius and the ground state energy of a
muonic hydrogen atom [i.e., an atom in which a negatively charged
muon (μ⁻) of mass about 207me orbits around a proton].
Chapter Thirteen

NUCLEI

13.1 INTRODUCTION

In the previous chapter, we have learnt that in every atom, the positive
charge and mass are densely concentrated at the centre of the atom
forming its nucleus. The overall dimensions of a nucleus are much smaller
than those of an atom. Experiments on scattering of α-particles
demonstrated that the radius of a nucleus was smaller than the radius
of an atom by a factor of about 10⁴. This means the volume of a nucleus
is about 10⁻¹² times the volume of the atom. In other words, an atom is
almost empty. If an atom is enlarged to the size of a classroom, the nucleus
would be of the size of a pinhead. Nevertheless, the nucleus contains most
(more than 99.9%) of the mass of an atom.
Does the nucleus have a structure, just as the atom does? If so, what
are the constituents of the nucleus? How are these held together? In this
chapter, we shall look for answers to such questions. We shall discuss
various properties of nuclei such as their size, mass and stability, and
also associated nuclear phenomena such as radioactivity, fission and fusion.

13.2 ATOMIC MASSES AND COMPOSITION OF NUCLEUS

The mass of an atom is very small, compared to a kilogram; for example,
the mass of a carbon atom, ¹²C, is 1.992647 × 10⁻²⁶ kg. Kilogram is not
a very convenient unit to measure such small quantities. Therefore, a
different mass unit is used for expressing atomic masses. This unit is the
atomic mass unit (u), defined as 1/12th of the mass of the carbon (¹²C)
atom. According to this definition

\[ 1\ \text{u} = \frac{\text{mass of one } ^{12}\text{C atom}}{12} = \frac{1.992647 \times 10^{-26}\ \text{kg}}{12} = 1.660539 \times 10^{-27}\ \text{kg} \tag{13.1} \]

The atomic masses of various elements expressed in atomic mass
unit (u) are close to being integral multiples of the mass of a hydrogen
atom. There are, however, many striking exceptions to this rule. For
example, the atomic mass of chlorine atom is 35.46 u.
Accurate measurement of atomic masses is carried out with a mass
spectrometer. The measurement of atomic masses reveals the existence
of different types of atoms of the same element, which exhibit the same
chemical properties, but differ in mass. Such atomic species of the same
element differing in mass are called isotopes. (In Greek, isotope means
the same place, i.e. they occur in the same place in the periodic table of
elements.) It was found that practically every element consists of a mixture
of several isotopes. The relative abundance of different isotopes differs
from element to element. Chlorine, for example, has two isotopes having
masses 34.98 u and 36.98 u, which are nearly integral multiples of the
mass of a hydrogen atom. The relative abundances of these isotopes are
75.4 and 24.6 per cent, respectively. Thus, the average mass of a chlorine
atom is obtained by the weighted average of the masses of the two
isotopes, which works out to be

\[ \frac{75.4 \times 34.98 + 24.6 \times 36.98}{100} = 35.47\ \text{u} \]

which agrees with the atomic mass of chlorine.
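
The weighted average above generalises to any mixture of isotopes. A minimal Python sketch, with the function name and data layout chosen purely for illustration, is:

```python
# Abundance-weighted average atomic mass, as used for chlorine above.
# Each isotope is given as (mass in u, relative abundance in per cent).

def average_atomic_mass(isotopes):
    """Return the abundance-weighted mean mass (u) of a list of isotopes."""
    total_percent = sum(percent for _, percent in isotopes)
    if total_percent <= 0:
        raise ValueError("abundances must sum to a positive value")
    weighted_sum = sum(mass * percent for mass, percent in isotopes)
    return weighted_sum / total_percent


# Masses and abundances of the two chlorine isotopes quoted in the text.
CHLORINE = [(34.98, 75.4), (36.98, 24.6)]

print(f"{average_atomic_mass(CHLORINE):.2f} u")
```

Dividing by the summed abundances rather than by a fixed 100 keeps the function usable when the quoted percentages do not add up exactly to 100 because of rounding.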
Even the lightest element, hydrogen, has three isotopes having masses
1.0078 u, 2.0141 u, and 3.0160 u. The nucleus of the lightest atom of
hydrogen, which has a relative abundance of 99.985%, is called the
proton. The mass of a proton is

\[ m_p = 1.00727\ \text{u} = 1.67262 \times 10^{-27}\ \text{kg} \tag{13.2} \]

This is equal to the mass of the hydrogen atom (= 1.00783 u), minus
the mass of a single electron (me = 0.00055 u). The other two isotopes of
hydrogen are called deuterium and tritium. Tritium nuclei, being
unstable, do not occur naturally and are produced artificially in
laboratories.
The positive charge in the nucleus is that of the protons. A proton
carries one unit of fundamental charge and is stable. It was earlier thought
that the nucleus may contain electrons, but this was ruled out later using
arguments based on quantum theory. All the electrons of an atom are
outside the nucleus. We know that the number of these electrons outside
the nucleus of the atom is Z, the atomic number. The total charge of the
atomic electrons is thus (−Ze), and since the atom is neutral, the charge
of the nucleus is (+Ze). The number of protons in the nucleus of the atom
is, therefore, exactly Z, the atomic number.
Discovery of Neutron

Since the nuclei of deuterium and tritium are isotopes of hydrogen, they
must contain only one proton each. But the masses of the nuclei of
hydrogen, deuterium and tritium are in the ratio of 1:2:3. Therefore, the
nuclei of deuterium and tritium must contain, in addition to a proton,
some neutral matter. The amount of neutral matter present in the nuclei
of these isotopes, expressed in units of mass of a proton, is approximately
equal to one and two, respectively. This fact indicates that the nuclei of
atoms contain, in addition to protons, neutral matter in multiples of a
basic unit. This hypothesis was verified in 1932 by James Chadwick
who observed emission of neutral radiation when beryllium nuclei were
bombarded with alpha-particles. (α-particles are helium nuclei, to be
discussed in a later section.) It was found that this neutral radiation
could knock out protons from light nuclei such as those of helium, carbon
and nitrogen. The only neutral radiation known at that time was photons
(electromagnetic radiation). Application of the principles of conservation
of energy and momentum showed that if the neutral radiation consisted
of photons, the energy of photons would have to be much higher than is
available from the bombardment of beryllium nuclei with α-particles.
The clue to this puzzle, which Chadwick satisfactorily solved, was to
assume that the neutral radiation consists of a new type of neutral
particles called neutrons. From conservation of energy and momentum,
he was able to determine the mass of new particle as very nearly the
same as mass of proton.
The mass of a neutron is now known to a high degree of accuracy. It is

\[ m_n = 1.00866\ \text{u} = 1.6749 \times 10^{-27}\ \text{kg} \tag{13.3} \]

Chadwick was awarded the 1935 Nobel Prize in Physics for his
discovery of the neutron.
A free neutron, unlike a free proton, is unstable. It decays into a
proton, an electron and an antineutrino (another elementary particle), and
has a mean life of about 1000 s. It is, however, stable inside the nucleus.
The composition of a nucleus can now be described using the following
terms and symbols:

Z - atomic number = number of protons [13.4(a)]
N - neutron number = number of neutrons [13.4(b)]
A - mass number = Z + N = total number of protons and neutrons [13.4(c)]

One also uses the term nucleon for a proton or a neutron. Thus the
number of nucleons in an atom is its mass number A.
Nuclear species or nuclides are shown by the notation $^{A}_{Z}\mathrm{X}$ where X is
the chemical symbol of the species. For example, the nucleus of gold is
denoted by $^{197}_{79}\mathrm{Au}$. It contains 197 nucleons, of which 79 are protons
and the remaining 118 are neutrons.

The composition of isotopes of an element can now be readily
explained. The nuclei of isotopes of a given element contain the same
number of protons, but differ from each other in their number of neutrons.
Deuterium, $^{2}_{1}\mathrm{H}$, which is an isotope of hydrogen, contains one proton
and one neutron. Its other isotope tritium, $^{3}_{1}\mathrm{H}$, contains one proton and
two neutrons. The element gold has 32 isotopes, ranging from A = 173 to
A = 204. We have already mentioned that chemical properties of elements
depend on their electronic structure. As the atoms of isotopes have
identical electronic structure, they have identical chemical behaviour and
are placed in the same location in the periodic table.
All nuclides with the same mass number A are called isobars. For
example, the nuclides $^{3}_{1}\mathrm{H}$ and $^{3}_{2}\mathrm{He}$ are isobars. Nuclides with the same
neutron number N but different atomic number Z, for example $^{198}_{80}\mathrm{Hg}$
and $^{197}_{79}\mathrm{Au}$, are called isotones.

13.3 SIZE OF THE NUCLEUS

As we have seen in Chapter 12, Rutherford was the pioneer who
postulated and established the existence of the atomic nucleus. At
Rutherford's suggestion, Geiger and Marsden performed their classic
experiment on the scattering of α-particles from thin gold foils. Their
experiments revealed that the distance of closest approach to a gold
nucleus of an α-particle of kinetic energy 5.5 MeV is about $4.0 \times 10^{-14}$ m.
The scattering of α-particles by the gold sheet could be understood by
Rutherford by assuming that the Coulomb repulsive force was solely
responsible for scattering. Since the positive charge is confined to the
nucleus, the actual size of the nucleus has to be less than $4.0 \times 10^{-14}$ m.
If we use α-particles of higher energies than 5.5 MeV, the distance of
closest approach to the gold nucleus will be smaller and at some point
the scattering will begin to be affected by the short range nuclear forces,
and differ from Rutherford's calculations. Rutherford's calculations are
based on pure Coulomb repulsion between the positive charges of the
α-particle and the gold nucleus. From the distance at which deviations set
in, nuclear sizes can be inferred.
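
The quoted distance of closest approach can be checked with a short calculation. The sketch below assumes a head-on collision in which all of the kinetic energy K is converted into Coulomb potential energy, K = (1/4πε0)(2e)(Ze)/d, and uses the rounded constant e²/4πε0 ≈ 1.44 MeV fm. The function name is illustrative only.

```python
# Distance of closest approach for a head-on alpha-nucleus collision.
# Assumption: pure Coulomb repulsion, with the whole kinetic energy K
# converted to potential energy: K = (e^2 / (4*pi*eps0)) * z * Z / d.

E2_OVER_4PI_EPS0 = 1.44  # e^2/(4*pi*eps0) in MeV fm (rounded value)


def closest_approach_fm(z_target, kinetic_energy_mev, z_projectile=2):
    """Return the head-on distance of closest approach in femtometres."""
    return z_projectile * z_target * E2_OVER_4PI_EPS0 / kinetic_energy_mev


# Gold (Z = 79) bombarded by 5.5 MeV alpha-particles.
d_gold_fm = closest_approach_fm(79, 5.5)
print(f"d = {d_gold_fm:.1f} fm = {d_gold_fm * 1e-15:.2e} m")
```

The result is of the order of $4 \times 10^{-14}$ m, consistent with the value quoted above.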
By performing scattering experiments in which fast electrons, instead
of α-particles, are projectiles that bombard targets made up of various
elements, the sizes of nuclei of various elements have been accurately
measured.
It has been found that a nucleus of mass number A has a radius
$R = R_0 A^{1/3}$    (13.5)
where $R_0 = 1.2 \times 10^{-15}$ m. This means the volume of the nucleus, which
is proportional to $R^3$, is proportional to A. Thus the density of the nucleus is
a constant, independent of A, for all nuclei. Different nuclei are like
drops of liquid of constant density. The density of nuclear matter is
approximately $2.3 \times 10^{17}\ \mathrm{kg\,m^{-3}}$. This density is very large compared to
ordinary matter, say water, which is $10^{3}\ \mathrm{kg\,m^{-3}}$. This is understandable,
as we have already seen that most of the atom is empty. Ordinary matter
consisting of atoms has a large amount of empty space.
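
That the density does not depend on A can be verified numerically. The sketch below uses Eq. (13.5) with the simplifying assumption that the nuclear mass is A atomic mass units (1 u = 1.6605 × 10⁻²⁷ kg); the choice of mass numbers is arbitrary.

```python
import math

R0 = 1.2e-15       # m, from Eq. (13.5)
U_KG = 1.6605e-27  # kg per atomic mass unit


def nuclear_density(mass_number):
    """Approximate density in kg/m^3, taking the nuclear mass as A u."""
    radius = R0 * mass_number ** (1 / 3)
    volume = (4 / 3) * math.pi * radius ** 3
    return mass_number * U_KG / volume


# Light, medium and heavy nuclei give the same density.
densities = {a: nuclear_density(a) for a in (4, 56, 197, 238)}
for a, rho in densities.items():
    print(f"A = {a:3d}: density = {rho:.2e} kg/m^3")
```

Because the volume scales as A, the factor A cancels and every nucleus gives roughly $2.3 \times 10^{17}\ \mathrm{kg\,m^{-3}}$.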

Example 13.1 Given the mass of iron nucleus as 55.85 u and A = 56,
find the nuclear density.
Solution
$m_{\mathrm{Fe}} = 55.85\ \mathrm{u} = 9.27 \times 10^{-26}\ \mathrm{kg}$
$\text{Nuclear density} = \dfrac{\text{mass}}{\text{volume}} = \dfrac{9.27 \times 10^{-26}}{(4\pi/3)(1.2 \times 10^{-15})^{3}} \times \dfrac{1}{56} = 2.29 \times 10^{17}\ \mathrm{kg\,m^{-3}}$
The density of matter in neutron stars (an astrophysical object) is
comparable to this density. This shows that matter in these objects
has been compressed to such an extent that they resemble a big nucleus.
13.4 MASS-ENERGY AND NUCLEAR BINDING ENERGY

13.4.1 Mass Energy

Einstein showed from his theory of special relativity that it is necessary
to treat mass as another form of energy. Before the advent of this theory
of special relativity it was presumed that mass and energy were conserved
separately in a reaction. However, Einstein showed that mass is another
form of energy and one can convert mass-energy into other forms of
energy, say kinetic energy, and vice-versa.
Einstein gave the famous mass-energy equivalence relation
$E = mc^{2}$    (13.6)
Here E is the energy equivalent of mass m, and c is the speed of light in
vacuum, approximately equal to $3 \times 10^{8}\ \mathrm{m\,s^{-1}}$.

Example 13.2 Calculate the energy equivalent of 1 g of substance.
Solution
Energy, $E = 10^{-3} \times (3 \times 10^{8})^{2}$ J
$E = 10^{-3} \times 9 \times 10^{16} = 9 \times 10^{13}$ J
Thus, if one gram of matter is converted to energy, there is a release
of an enormous amount of energy.

Experimental verification of Einstein's mass-energy relation has
been achieved in the study of nuclear reactions amongst nucleons, nuclei,
electrons and other more recently discovered particles. In a reaction the
conservation law of energy states that the initial energy and the final
energy are equal provided the energy associated with mass is also
included. This concept is important in understanding nuclear masses
and the interaction of nuclei with one another. They form the subject
matter of the next few sections.

13.4.2 Nuclear binding energy


In Section 13.2 we have seen that the nucleus is made up of neutrons
and protons. Therefore it may be expected that the mass of the nucleus
is equal to the total mass of its individual protons and neutrons. However,
the nuclear mass M is found to be always less than this. For example, let
us consider $^{16}_{8}\mathrm{O}$, a nucleus which has 8 neutrons and 8 protons. We
have
Mass of 8 neutrons = 8 × 1.00866 u
Mass of 8 protons = 8 × 1.00727 u
Mass of 8 electrons = 8 × 0.00055 u
Therefore the expected mass of the $^{16}_{8}\mathrm{O}$ nucleus
= 8 × 2.01593 u = 16.12744 u.
The atomic mass of $^{16}_{8}\mathrm{O}$ found from mass spectroscopy experiments
is seen to be 15.99493 u. Subtracting the mass of 8 electrons (8 × 0.00055 u)
from this, we get the experimental mass of the $^{16}_{8}\mathrm{O}$ nucleus to be 15.99053 u.
Thus, we find that the mass of the $^{16}_{8}\mathrm{O}$ nucleus is less than the total
mass of its constituents by 0.13691 u. The difference in mass of a nucleus
and its constituents, ΔM, is called the mass defect, and is given by
$\Delta M = [Z m_p + (A - Z) m_n] - M$    (13.7)

What is the meaning of the mass defect? It is here that Einstein's
equivalence of mass and energy plays a role. Since the mass of the oxygen
nucleus is less than the sum of the masses of its constituents (8 protons
and 8 neutrons, in the unbound state), the equivalent energy of the oxygen
nucleus is less than that of the sum of the equivalent energies of its
constituents. If one wants to break the oxygen nucleus into 8 protons
and 8 neutrons, this extra energy $\Delta M\, c^{2}$ has to be supplied. This energy
required, $E_b$, is related to the mass defect by
$E_b = \Delta M\, c^{2}$    (13.8)

Example 13.3 Find the energy equivalent of one atomic mass unit,
first in Joules and then in MeV. Using this, express the mass defect
of $^{16}_{8}\mathrm{O}$ in MeV/$c^{2}$.
Solution
$1\ \mathrm{u} = 1.6605 \times 10^{-27}$ kg
To convert it into energy units, we multiply it by $c^{2}$ and find that
energy equivalent $= 1.6605 \times 10^{-27} \times (2.9979 \times 10^{8})^{2}\ \mathrm{kg\,m^{2}/s^{2}}$
$= 1.4924 \times 10^{-10}$ J
$= \dfrac{1.4924 \times 10^{-10}}{1.602 \times 10^{-19}}$ eV
$= 0.9315 \times 10^{9}$ eV
$= 931.5$ MeV
or, $1\ \mathrm{u} = 931.5\ \mathrm{MeV}/c^{2}$
For $^{16}_{8}\mathrm{O}$, $\Delta M = 0.13691\ \mathrm{u} = 0.13691 \times 931.5\ \mathrm{MeV}/c^{2} = 127.5\ \mathrm{MeV}/c^{2}$
The energy needed to separate $^{16}_{8}\mathrm{O}$ into its constituents is thus
127.5 MeV.

If a certain number of neutrons and protons are brought together to
form a nucleus of a certain charge and mass, an energy Eb will be released
in the process. The energy Eb is called the binding energy of the nucleus.
If we separate a nucleus into its nucleons, we would have to supply a
total energy equal to Eb to those particles. Although we cannot tear
apart a nucleus in this way, the nuclear binding energy is still a convenient
measure of how well a nucleus is held together. A more useful measure
of the binding between the constituents of the nucleus is the binding
energy per nucleon, Ebn, which is the ratio of the binding energy Eb of a
nucleus to the number of the nucleons, A, in that nucleus:
$E_{bn} = E_b / A$    (13.9)
We can think of binding energy per nucleon as the average energy
per nucleon needed to separate a nucleus into its individual nucleons.
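
As an illustration, the mass defect, binding energy and binding energy per nucleon of $^{16}_{8}\mathrm{O}$ can be computed directly from the masses quoted in this section. The sketch below follows Eqs. (13.7) to (13.9); it assumes the nuclear mass is the atomic mass minus the mass of Z electrons (electron binding energies are neglected), and the function name is illustrative.

```python
M_P = 1.00727     # proton mass, u
M_N = 1.00866     # neutron mass, u
M_E = 0.00055     # electron mass, u
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV


def binding_energy_mev(z, a, atomic_mass_u):
    """Binding energy E_b = (mass defect) * c^2, Eqs. (13.7) and (13.8)."""
    nuclear_mass = atomic_mass_u - z * M_E                    # strip the electrons
    mass_defect = z * M_P + (a - z) * M_N - nuclear_mass      # Eq. (13.7)
    return mass_defect * U_TO_MEV                             # Eq. (13.8)


eb_o16 = binding_energy_mev(8, 16, 15.99493)
ebn_o16 = eb_o16 / 16                                          # Eq. (13.9)
print(f"E_b  = {eb_o16:.1f} MeV")
print(f"E_bn = {ebn_o16:.2f} MeV per nucleon")
```

The binding energy per nucleon comes out close to 8 MeV, which is typical of the flat middle region of the binding energy curve.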
Figure 13.1 is a plot of the binding energy per nucleon Ebn
versus the mass number A for a large number of nuclei. We notice
the following main features of the plot:
(i) the binding energy per nucleon, Ebn, is practically constant, i.e.
practically independent of the atomic number for nuclei of middle
mass number (30 < A < 170). The curve has a maximum of
about 8.75 MeV for A = 56 and has a value of 7.6 MeV for A = 238.
(ii) Ebn is lower for both light nuclei (A < 30) and heavy
nuclei (A > 170).

FIGURE 13.1 The binding energy per nucleon as a function of mass number.

We can draw some conclusions from these two observations:
(i) The force is attractive and sufficiently strong to produce a binding
energy of a few MeV per nucleon.
(ii) The constancy of the binding energy in the range 30 < A < 170 is a
consequence of the fact that the nuclear force is short-ranged. Consider
a particular nucleon inside a sufficiently large nucleus. It will be under
the influence of only some of its neighbours, which come within the
range of the nuclear force. If any other nucleon is at a distance more
than the range of the nuclear force from the particular nucleon it will
have no influence on the binding energy of the nucleon under
consideration. If a nucleon can have a maximum of p neighbours
within the range of nuclear force, its binding energy would be
proportional to p. Let the binding energy of the nucleon be pk, where
k is a constant having the dimensions of energy. If we increase A by
adding nucleons they will not change the binding energy of a nucleon
inside. Since most of the nucleons in a large nucleus reside inside it
and not on the surface, the change in binding energy per nucleon
would be small. The binding energy per nucleon is a constant and is
approximately equal to pk. The property that a given nucleon
influences only nucleons close to it is also referred to as the saturation
property of the nuclear force.
(iii) A very heavy nucleus, say A = 240, has lower binding energy per
nucleon compared to that of a nucleus with A = 120. Thus if a
nucleus A = 240 breaks into two A = 120 nuclei, nucleons get more
tightly bound. This implies energy would be released in the process.
It has very important implications for energy production through
fission, to be discussed later in Section 13.7.1.
(iv) Consider two very light nuclei (A ≤ 10) joining to form a heavier
nucleus. The binding energy per nucleon of the fused heavier nucleus
is more than the binding energy per nucleon of the lighter nuclei.
This means that the final system is more tightly bound than the initial
system. Again energy would be released in such a process of
fusion. This is the energy source of the Sun, to be discussed later in
Section 13.7.3.

13.5 NUCLEAR FORCE


The force that determines the motion of atomic electrons is the familiar
Coulomb force. In Section 13.4, we have seen that for average mass
nuclei the binding energy per nucleon is approximately 8 MeV, which is
much larger than the binding energy in atoms. Therefore, to bind a
nucleus together there must be a strong attractive force of a totally
different kind. It must be strong enough to overcome the repulsion
between the (positively charged) protons and to bind both protons and
neutrons into the tiny nuclear volume. We have already seen
that the constancy of binding energy per nucleon can be
understood in terms of its short-range. Many features of the
nuclear binding force are summarised below. These are
obtained from a variety of experiments carried out during 1930
to 1950.
(i) The nuclear force is much stronger than the Coulomb force
acting between charges or the gravitational forces between
masses. The nuclear binding force has to dominate over
the Coulomb repulsive force between protons inside the
nucleus. This happens only because the nuclear force is
much stronger than the coulomb force. The gravitational
force is much weaker than even Coulomb force.
(ii) The nuclear force between two nucleons falls rapidly to
zero as their distance is more than a few femtometres. This
leads to saturation of forces in a medium or a large-sized
nucleus, which is the reason for the constancy of the
binding energy per nucleon.
A rough plot of the potential energy between two nucleons
as a function of distance is shown in Fig. 13.2. The
potential energy is a minimum at a distance r0 of about
0.8 fm. This means that the force is attractive for distances larger
than 0.8 fm and repulsive if they are separated by distances less
than 0.8 fm.

FIGURE 13.2 Potential energy of a pair of nucleons as a function of their
separation. For a separation greater than r0, the force is attractive and
for separations less than r0, the force is strongly repulsive.

(iii) The nuclear force between neutron-neutron, proton-neutron and
proton-proton is approximately the same. The nuclear force does not
depend on the electric charge.
Unlike Coulomb's law or Newton's law of gravitation, there is no
simple mathematical form of the nuclear force.

13.6 RADIOACTIVITY

A. H. Becquerel discovered radioactivity in 1896 purely by accident. While
studying the fluorescence and phosphorescence of compounds irradiated
with visible light, Becquerel observed an interesting phenomenon. After
illuminating some pieces of uranium-potassium sulphate with visible
light, he wrapped them in black paper and separated the package from a
photographic plate by a piece of silver. When, after several hours of
exposure, the photographic plate was developed, it showed blackening
due to something that must have been emitted by the compound and
was able to penetrate both the black paper and the silver.
Experiments performed subsequently showed that radioactivity was
a nuclear phenomenon in which an unstable nucleus undergoes a decay.
This is referred to as radioactive decay. Three types of radioactive decay
occur in nature:
(i) α-decay in which a helium nucleus $^{4}_{2}\mathrm{He}$ is emitted;
(ii) β-decay in which electrons or positrons (particles with the same mass
as electrons, but with a charge exactly opposite to that of the electron)
are emitted;
(iii) γ-decay in which high energy (hundreds of keV or more) photons are
emitted.
Each of these decays will be considered in subsequent sub-sections.

13.6.1 Law of radioactive decay

In any radioactive sample, which undergoes α, β or γ-decay, it is found
that the number of nuclei undergoing the decay per unit time is
proportional to the total number of nuclei in the sample. If N is the
number of nuclei in the sample and ΔN undergo decay in time Δt then
$\dfrac{\Delta N}{\Delta t} \propto N$
or, $\Delta N / \Delta t = \lambda N$,    (13.10)
where λ is called the radioactive decay constant or disintegration constant.
The change in the number of nuclei in the sample* is dN = −ΔN in
time Δt. Thus the rate of change of N is (in the limit Δt → 0)
$\dfrac{dN}{dt} = -\lambda N$
or, $\dfrac{dN}{N} = -\lambda\, dt$

* ΔN is the number of nuclei that decay, and hence is always positive. dN is the
change in N, which may have either sign. Here it is negative, because out of the
original N nuclei, ΔN have decayed, leaving (N − ΔN) nuclei.

Now, integrating both sides of the above equation, we get,
$\displaystyle\int_{N_0}^{N} \frac{dN}{N} = -\lambda \int_{t_0}^{t} dt$    (13.11)
or, $\ln N - \ln N_0 = -\lambda (t - t_0)$    (13.12)
Here $N_0$ is the number of radioactive nuclei in the sample at some
arbitrary time $t_0$ and N is the number of radioactive nuclei at any
subsequent time t. Setting $t_0 = 0$ and rearranging Eq. (13.12) gives us
$\ln \dfrac{N}{N_0} = -\lambda t$    (13.13)
which gives
$N(t) = N_0\, e^{-\lambda t}$    (13.14)

Note, for example, the light bulbs follow no such exponential decay law.
If we test 1000 bulbs for their life (time span before they burn out or
fuse), we expect that they will decay (that is, burn out) at more or less
the same time. The decay of radionuclides follows quite a different law,
the law of radioactive decay represented by Eq. (13.14).
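
The link between the rate equation dN = −λN dt and the exponential law can also be seen numerically. The sketch below is only an illustration: it applies the rate equation over many small time steps and compares the result with Eq. (13.14). The values of N0, λ and t are arbitrary choices, not data from the text.

```python
import math


def remaining_nuclei(n0, decay_constant, t):
    """Closed-form decay law N(t) = N0 * exp(-lambda * t), Eq. (13.14)."""
    return n0 * math.exp(-decay_constant * t)


def remaining_nuclei_stepwise(n0, decay_constant, t, steps=100_000):
    """Apply dN = -lambda * N * dt repeatedly over small time steps."""
    dt = t / steps
    n = n0
    for _ in range(steps):
        n -= decay_constant * n * dt
    return n


N0 = 1000.0   # initial number of nuclei (illustrative)
LAMBDA = 0.1  # decay constant in s^-1 (illustrative)
exact = remaining_nuclei(N0, LAMBDA, 10.0)
approx = remaining_nuclei_stepwise(N0, LAMBDA, 10.0)
print(f"exponential law: {exact:.3f}, step-by-step: {approx:.3f}")
```

As the time step shrinks, the step-by-step result approaches the exponential law.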
The total decay rate R of a sample is the number of nuclei
disintegrating per unit time. Suppose in a time interval dt, the decay
count measured is ΔN. Then dN = −ΔN.
The positive quantity R is then defined as
$R = -\dfrac{dN}{dt}$
Differentiating Eq. (13.14), we get
$R = \lambda N_0\, e^{-\lambda t}$
or, $R = R_0\, e^{-\lambda t}$    (13.15)
This is equivalent to the law of radioactive decay,
since you can integrate Eq. (13.15) to get back Eq.
(13.14). Clearly, $R_0 = \lambda N_0$ is the decay rate at t = 0. The
decay rate R at a certain time t and the number of
undecayed nuclei N at the same time are related by
$R = \lambda N$    (13.16)

FIGURE 13.3 Exponential decay of a radioactive species. After a lapse of
T1/2, population of the given species drops by a factor of 2.

The decay rate of a sample, rather than the number of radioactive
nuclei, is a more direct experimentally measurable quantity and is given
a specific name: activity. The SI unit for activity is becquerel, named
after the discoverer of radioactivity, Henri Becquerel.

1 becquerel is simply equal to 1 disintegration or decay per second.
There is also another unit named curie that is widely used and is related
to the SI unit as:
$1\ \text{curie} = 1\ \mathrm{Ci} = 3.7 \times 10^{10}$ decays per second $= 3.7 \times 10^{10}$ Bq

MARIE SKLODOWSKA CURIE (1867-1934)
Born in Poland. She is recognised both as a physicist and as a chemist.
The discovery of radioactivity by Henri Becquerel in 1896 inspired
Marie and her husband Pierre Curie in their researches and analyses
which led to the isolation of radium and polonium elements. She was the
first person to be awarded two Nobel Prizes: for Physics in 1903 and for
Chemistry in 1911.

Different radionuclides differ greatly in their rate of
decay. A common way to characterize this feature is
through the notion of half-life. Half-life of a radionuclide
(denoted by $T_{1/2}$) is the time it takes for a sample that has
initially, say, $N_0$ radionuclei to reduce to $N_0/2$. Putting
$N = N_0/2$ and $t = T_{1/2}$ in Eq. (13.14), we get
$T_{1/2} = \dfrac{\ln 2}{\lambda} = \dfrac{0.693}{\lambda}$    (13.17)
Clearly if $N_0$ reduces to half its value in time $T_{1/2}$, $R_0$
will also reduce to half its value in the same time according
to Eq. (13.16).
Another related measure is the average or mean life
τ. This again can be obtained from Eq. (13.14). The
number of nuclei which decay in the time interval t to t + Δt
is $R(t)\,\Delta t\ (= \lambda N_0 e^{-\lambda t} \Delta t)$. Each of them has lived for time
t. Thus the total life of all these nuclei would be $t\, \lambda N_0 e^{-\lambda t} \Delta t$.
It is clear that some nuclei may live for a short time
while others may live longer. Therefore to obtain the mean
life, we have to sum (or integrate) this expression over all
times from 0 to ∞, and divide by the total number $N_0$ of
nuclei at t = 0. Thus,
$\tau = \dfrac{\lambda N_0 \int_0^\infty t\, e^{-\lambda t}\, dt}{N_0} = \lambda \displaystyle\int_0^\infty t\, e^{-\lambda t}\, dt$
One can show by performing this integral that
$\tau = 1/\lambda$
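
For readers who wish to check the mean-life result, the integral can be evaluated by parts. The steps below are a standard calculation shown as a sketch; they use only the symbols already defined (λ for the decay constant, τ for the mean life).

```latex
\begin{aligned}
\tau &= \lambda \int_0^\infty t\, e^{-\lambda t}\, dt \\
     &= \lambda \left( \left[ -\frac{t}{\lambda}\, e^{-\lambda t} \right]_0^\infty
        + \frac{1}{\lambda} \int_0^\infty e^{-\lambda t}\, dt \right) \\
     &= \lambda \left( 0 + \frac{1}{\lambda} \cdot \frac{1}{\lambda} \right)
      = \frac{1}{\lambda}
\end{aligned}
```

The boundary term vanishes at both limits because the exponential falls faster than t grows.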

We summarise these results with the following:
$T_{1/2} = \dfrac{\ln 2}{\lambda} = \tau \ln 2$    (13.18)
Radioactive elements (e.g., tritium, plutonium) which are short-lived,
i.e., have half-lives much less than the age of the universe (∼15 billion
years), have obviously decayed long ago and are not found in nature.
They can, however, be produced artificially in nuclear reactions.


Example 13.4 The half-life of $^{238}_{92}\mathrm{U}$ undergoing α-decay is $4.5 \times 10^{9}$
years. What is the activity of 1 g sample of $^{238}_{92}\mathrm{U}$?
Solution
$T_{1/2} = 4.5 \times 10^{9}$ y
$= 4.5 \times 10^{9}\ \mathrm{y} \times 3.16 \times 10^{7}$ s/y
$= 1.42 \times 10^{17}$ s
One kmol of any isotope contains Avogadro's number of atoms, and
so 1 g of $^{238}_{92}\mathrm{U}$ contains
$\dfrac{10^{-3}}{238}\ \mathrm{kmol} \times 6.025 \times 10^{26}$ atoms/kmol
$= 25.3 \times 10^{20}$ atoms.
The decay rate R is
$R = \lambda N = \dfrac{0.693}{T_{1/2}}\, N = \dfrac{0.693 \times 25.3 \times 10^{20}}{1.42 \times 10^{17}}\ \mathrm{s^{-1}}$
$= 1.23 \times 10^{4}\ \mathrm{s^{-1}}$
$= 1.23 \times 10^{4}$ Bq
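
The same calculation can be written as a small reusable function. The sketch below assumes R = λN with λ = ln 2 / T1/2, the conversion 3.16 × 10⁷ s per year used in the example, and the currently accepted Avogadro constant (6.022 × 10²³ per mole, slightly different from the rounded value in the worked solution). The function name is illustrative.

```python
import math

AVOGADRO = 6.022e23        # atoms per mole
SECONDS_PER_YEAR = 3.16e7  # conversion used in Example 13.4


def activity_bq(mass_g, molar_mass_g, half_life_years):
    """Activity R = lambda * N of a pure sample, in becquerel."""
    n_atoms = mass_g / molar_mass_g * AVOGADRO
    decay_constant = math.log(2) / (half_life_years * SECONDS_PER_YEAR)
    return decay_constant * n_atoms


# 1 g of uranium-238 with a half-life of 4.5e9 years.
activity_u238 = activity_bq(1.0, 238.0, 4.5e9)
print(f"Activity of 1 g of U-238: {activity_u238:.3e} Bq")
```

Despite the very long half-life, one gram still gives roughly twelve thousand decays every second because it contains so many nuclei.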

Example 13.5 Tritium has a half-life of 12.5 y undergoing beta decay.
What fraction of a sample of pure tritium will remain undecayed
after 25 y?
Solution
By definition of half-life, half of the initial sample will remain
undecayed after 12.5 y. In the next 12.5 y, one-half of these nuclei
would have decayed. Hence, one fourth of the sample of the initial
pure tritium will remain undecayed.

13.6.2 Alpha decay

A well-known example of alpha decay is the decay of uranium $^{238}_{92}\mathrm{U}$ to
thorium $^{234}_{90}\mathrm{Th}$ with the emission of a helium nucleus $^{4}_{2}\mathrm{He}$
$^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He}$    (α-decay)    (13.19)
In α-decay, the mass number of the product nucleus (daughter
nucleus) is four less than that of the decaying nucleus (parent nucleus),
while the atomic number decreases by two. In general, α-decay of a parent
nucleus $^{A}_{Z}\mathrm{X}$ results in a daughter nucleus $^{A-4}_{Z-2}\mathrm{Y}$
$^{A}_{Z}\mathrm{X} \rightarrow {}^{A-4}_{Z-2}\mathrm{Y} + {}^{4}_{2}\mathrm{He}$    (13.20)
From Einstein's mass-energy equivalence relation [Eq. (13.6)] and
energy conservation, it is clear that this spontaneous decay is possible
only when the total mass of the decay products is less than the mass of
the initial nucleus. This difference in mass appears as kinetic energy of
the products. By referring to a table of nuclear masses, one can check
that the total mass of $^{234}_{90}\mathrm{Th}$ and $^{4}_{2}\mathrm{He}$ is indeed less than that of $^{238}_{92}\mathrm{U}$.
The disintegration energy or the Q-value of a nuclear reaction is the
difference between the initial mass energy and the total mass energy of
the decay products. For α-decay
$Q = (m_X - m_Y - m_{\mathrm{He}})\, c^{2}$    (13.21)
Q is also the net kinetic energy gained in the process or, if the initial
nucleus X is at rest, the kinetic energy of the products. Clearly, Q > 0 for
exothermic processes such as α-decay.

Example 13.6 We are given the following atomic masses:
$^{238}_{92}\mathrm{U}$ = 238.05079 u        $^{4}_{2}\mathrm{He}$ = 4.00260 u
$^{234}_{90}\mathrm{Th}$ = 234.04363 u       $^{1}_{1}\mathrm{H}$ = 1.00783 u
$^{237}_{91}\mathrm{Pa}$ = 237.05121 u
Here the symbol Pa is for the element protactinium (Z = 91).
(a) Calculate the energy released during the alpha decay of $^{238}_{92}\mathrm{U}$.
(b) Show that $^{238}_{92}\mathrm{U}$ cannot spontaneously emit a proton.
Solution
(a) The alpha decay of $^{238}_{92}\mathrm{U}$ is given by Eq. (13.20). The energy released
in this process is given by
$Q = (M_{\mathrm{U}} - M_{\mathrm{Th}} - M_{\mathrm{He}})\, c^{2}$
Substituting the atomic masses as given in the data, we find
$Q = (238.05079 - 234.04363 - 4.00260)\ \mathrm{u} \times c^{2}$
$= (0.00456\ \mathrm{u})\, c^{2}$
$= (0.00456\ \mathrm{u})(931.5\ \mathrm{MeV/u})$
$= 4.25$ MeV.
(b) If $^{238}_{92}\mathrm{U}$ spontaneously emits a proton, the decay process would be
$^{238}_{92}\mathrm{U} \rightarrow {}^{237}_{91}\mathrm{Pa} + {}^{1}_{1}\mathrm{H}$
The Q for this process to happen is
$Q = (M_{\mathrm{U}} - M_{\mathrm{Pa}} - M_{\mathrm{H}})\, c^{2}$
$= (238.05079 - 237.05121 - 1.00783)\ \mathrm{u} \times c^{2}$
$= (-0.00825\ \mathrm{u})\, c^{2}$
$= -(0.00825\ \mathrm{u})(931.5\ \mathrm{MeV/u})$
$= -7.68$ MeV
Thus, the Q of the process is negative and therefore it cannot proceed
spontaneously. We will have to supply an energy of 7.68 MeV to a
$^{238}_{92}\mathrm{U}$ nucleus to make it emit a proton.
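
The Q-value test for whether a decay can proceed spontaneously is easy to automate. The following sketch uses only the atomic masses given in Example 13.6 and the conversion 1 u = 931.5 MeV/c²; the dictionary keys and the function name are illustrative labels.

```python
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV

# Atomic masses in u, as given in Example 13.6.
MASSES_U = {
    "U238": 238.05079,
    "Th234": 234.04363,
    "He4": 4.00260,
    "H1": 1.00783,
    "Pa237": 237.05121,
}


def q_value_mev(parent, products):
    """Q = (mass of parent - total mass of products) * c^2, in MeV."""
    mass_difference = MASSES_U[parent] - sum(MASSES_U[p] for p in products)
    return mass_difference * U_TO_MEV


q_alpha = q_value_mev("U238", ["Th234", "He4"])   # alpha decay
q_proton = q_value_mev("U238", ["Pa237", "H1"])   # hypothetical proton emission
for label, q in (("alpha decay", q_alpha), ("proton emission", q_proton)):
    verdict = "can occur spontaneously" if q > 0 else "cannot occur spontaneously"
    print(f"{label}: Q = {q:+.2f} MeV, {verdict}")
```

A positive Q means the products are lighter than the parent, so the decay is energetically allowed; a negative Q means energy must be supplied.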

13.6.3 Beta decay

In beta decay, a nucleus spontaneously emits an electron (β⁻ decay) or a
positron (β⁺ decay). A common example of β⁻ decay is
$^{32}_{15}\mathrm{P} \rightarrow {}^{32}_{16}\mathrm{S} + e^{-} + \bar{\nu}$    (13.22)
and that of β⁺ decay is
$^{22}_{11}\mathrm{Na} \rightarrow {}^{22}_{10}\mathrm{Ne} + e^{+} + \nu$    (13.23)
The decays are governed by Eqs. (13.14) and (13.15), so that one
can never predict which nucleus will undergo decay, but one can
characterize the decay by a half-life T1/2. For example, T1/2 for the decays
above is respectively 14.3 d and 2.6 y. The emission of an electron in β⁻ decay
is accompanied by the emission of an antineutrino ($\bar{\nu}$); in β⁺ decay, instead,
a neutrino (ν) is generated. Neutrinos are neutral particles with very small
(possibly, even zero) mass compared to electrons. They have only weak
interaction with other particles. They are, therefore, very difficult to detect,
since they can penetrate a large quantity of matter (even the earth) without any
interaction.

In both β⁻ and β⁺ decay, the mass number A remains unchanged. In
β⁻ decay, the atomic number Z of the nucleus goes up by 1, while in β⁺
decay Z goes down by 1. The basic nuclear process underlying β⁻ decay
is the conversion of neutron to proton
$n \rightarrow p + e^{-} + \bar{\nu}$    (13.24)
while for β⁺ decay, it is the conversion of proton into neutron
$p \rightarrow n + e^{+} + \nu$    (13.25)
Note that while a free neutron decays to proton, the decay of proton to
neutron [Eq. (13.25)] is possible only inside the nucleus, since proton
has smaller mass than neutron.
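
The mass argument can be made quantitative with the particle masses quoted earlier in this chapter. The sketch below assumes the neutrino mass is negligible and uses 1 u = 931.5 MeV/c²; the variable names are illustrative.

```python
M_N = 1.00866     # neutron mass, u
M_P = 1.00727     # proton mass, u
M_E = 0.00055     # electron (or positron) mass, u
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV

# Free neutron decay: n -> p + e- + antineutrino (neutrino mass neglected).
q_neutron_decay = (M_N - M_P - M_E) * U_TO_MEV

# Hypothetical decay of a free proton: p -> n + e+ + neutrino.
q_free_proton_decay = (M_P - M_N - M_E) * U_TO_MEV

print(f"Q (free neutron decay) = {q_neutron_decay:+.2f} MeV")
print(f"Q (free proton decay)  = {q_free_proton_decay:+.2f} MeV")
```

The positive Q for neutron decay and the negative Q for the proton case express the same statement as above: a free proton cannot convert into a neutron unless energy is supplied, as can happen inside a nucleus.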

13.6.4 Gamma decay

Like an atom, a nucleus also has discrete energy levels - the ground
state and excited states. The scale of energy is, however, very different.
Atomic energy level spacings are of the order of eV, while the difference in
nuclear energy levels is of the order of MeV. When a
nucleus in an excited state spontaneously decays
to its ground state (or to a lower energy state), a
photon is emitted with energy equal to the difference
in the two energy levels of the nucleus. This is the
so-called gamma decay. The energy (MeV)
corresponds to radiation of extremely short
wavelength, shorter than the hard X-ray region.
Typically, a gamma ray is emitted when an α or β
decay results in a daughter nucleus in an excited
state. This then returns to the ground state by a
single photon transition or successive transitions
involving more than one photon. A familiar example
is the successive emission of gamma rays of
energies 1.17 MeV and 1.33 MeV from the
deexcitation of $^{60}_{28}\mathrm{Ni}$ nuclei formed from β⁻ decay
of $^{60}_{27}\mathrm{Co}$.

FIGURE 13.4 β-decay of $^{60}_{27}\mathrm{Co}$ nucleus followed by emission of two γ rays
from deexcitation of the daughter nucleus $^{60}_{28}\mathrm{Ni}$.

13.7 NUCLEAR ENERGY

The curve of binding energy per nucleon Ebn, given in Fig. 13.1, has
a long flat middle region between A = 30 and A = 170. In this region
the binding energy per nucleon is nearly constant (8.0 MeV). For
the lighter nuclei region, A < 30, and for the heavier nuclei region,
A > 170, the binding energy per nucleon is less than 8.0 MeV, as we
have noted earlier. Now, the greater the binding energy, the less is the
total mass of a bound system, such as a nucleus. Consequently, if nuclei
with less total binding energy transform to nuclei with greater binding
energy, there will be a net energy release. This is what happens when a
heavy nucleus decays into two or more intermediate mass fragments
(fission) or when light nuclei fuse into a heavier nucleus (fusion).
Exothermic chemical reactions underlie conventional energy sources
such as coal or petroleum. Here the energies involved are in the range of
electron volts. On the other hand, in a nuclear reaction, the energy release
is of the order of MeV. Thus for the same quantity of matter, nuclear
sources produce a million times more energy than a chemical source.
Fission of 1 kg of uranium, for example, generates $10^{14}$ J of energy;
compare it with burning of 1 kg of coal that gives $10^{7}$ J.

13.7.1 Fission

New possibilities emerge when we go beyond natural radioactive decays
and study nuclear reactions by bombarding nuclei with other nuclear
particles such as proton, neutron, α-particle, etc.
A most important neutron-induced nuclear reaction is fission. An
example of fission is when a uranium isotope $^{235}_{92}\mathrm{U}$ bombarded with a
neutron breaks into two intermediate mass nuclear fragments
$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{144}_{56}\mathrm{Ba} + {}^{89}_{36}\mathrm{Kr} + 3\,{}^{1}_{0}n$    (13.26)
The same reaction can produce other pairs of intermediate mass
fragments
$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{133}_{51}\mathrm{Sb} + {}^{99}_{41}\mathrm{Nb} + 4\,{}^{1}_{0}n$    (13.27)
Or, as another example,
$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{140}_{54}\mathrm{Xe} + {}^{94}_{38}\mathrm{Sr} + 2\,{}^{1}_{0}n$    (13.28)
The fragment products are radioactive nuclei; they emit β particles in
succession to achieve stable end products.
The energy released (the Q value) in the fission reaction of nuclei like
uranium is of the order of 200 MeV per fissioning nucleus. This is
estimated as follows:
Let us take a nucleus with A = 240 breaking into two fragments each
of A = 120. Then
Ebn for A = 240 nucleus is about 7.6 MeV,
Ebn for the two A = 120 fragment nuclei is about 8.5 MeV.
Gain in binding energy per nucleon is about 0.9 MeV.
Hence the total gain in binding energy is 240 × 0.9 or 216 MeV.
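
The estimate above, and the figure of about $10^{14}$ J per kilogram of uranium quoted at the start of Section 13.7, can be reproduced with a few lines. This is a rough sketch that assumes every nucleus in 1 kg of $^{235}_{92}\mathrm{U}$ undergoes fission and releases 200 MeV; the constants are standard rounded values and the function name is illustrative.

```python
AVOGADRO = 6.022e23   # nuclei per mole
MEV_TO_J = 1.602e-13  # joules per MeV


def fission_energy_joules(mass_kg, mass_number, energy_per_fission_mev):
    """Energy released if every nucleus in the sample undergoes fission."""
    n_nuclei = mass_kg * 1000.0 / mass_number * AVOGADRO
    return n_nuclei * energy_per_fission_mev * MEV_TO_J


# Binding-energy estimate for A = 240 splitting into two A = 120 fragments.
gain_per_fission_mev = 240 * (8.5 - 7.6)

# Energy from complete fission of 1 kg of uranium-235 at 200 MeV per fission.
energy_per_kg = fission_energy_joules(1.0, 235, 200.0)

print(f"Gain per fission  = {gain_per_fission_mev:.0f} MeV")
print(f"Energy from 1 kg  = {energy_per_kg:.1e} J")
```

The result is of the order of $10^{14}$ J, about ten million times the $10^{7}$ J obtained from burning 1 kg of coal.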

The disintegration energy in fission events first appears as the kinetic
energy of the fragments and neutrons. Eventually it is transferred to the
surrounding matter appearing as heat. The source of energy in nuclear
reactors, which produce electricity, is nuclear fission. The enormous
energy released in an atom bomb comes from uncontrolled nuclear fission.
We discuss in the next section how a nuclear reactor functions.

13.7.2 Nuclear reactor


Notice one fact of great importance in the fission reactions given in Eqs.
(13.26) to (13.28). There is a release of extra neutron (s) in the fission
process. Averagely, 2 neutrons are released per fission of uranium
nucleus. It is a fraction since in some fission events 2 neutrons are

Nuclei
I NDIAS ATOMIC

ENERGY PROGRAMME

no

tt
o N
be C
E
re R
pu T
bl
is
he
d

The atomic energy programme in India was launched around the time of independence
under the leadership of Homi J. Bhabha (1909-1966). An early historic achievement
was the design and construction of the first nuclear reactor in India (named Apsara)
which went critical on August 4, 1956. It used enriched uranium as fuel and water as
moderator. Following this was another notable landmark: the construction of CIRUS
(Canada India Research U.S.) reactor in 1960. This 40 MW reactor used natural uranium
as fuel and heavy water as moderator. Apsara and CIRUS spurred research in a wide
range of areas of basic and applied nuclear science. An important milestone in the first
two decades of the programme was the indigenous design and construction of the
plutonium plant at Trombay, which ushered in the technology of fuel reprocessing
(separating useful fissile and fertile nuclear materials from the spent fuel of a reactor) in
India. Research reactors that have been subsequently commissioned include ZERLINA,
PURNIMA (I, II and III), DHRUVA and KAMINI. KAMINI is the countrys first large research
reactor that uses U-233 as fuel. As the name suggests, the primary objective of a research
reactor is not generation of power but to provide a facility for research on different aspects
of nuclear science and technology. Research reactors are also an excellent source for
production of a variety of radioactive isotopes that find application in diverse fields:
industry, medicine and agriculture.
The main objectives of the Indian Atomic Energy programme are to provide safe and
reliable electric power for the countrys social and economic progress and to be selfreliant in all aspects of nuclear technology. Exploration of atomic minerals in India
undertaken since the early fifties has indicated that India has limited reserves of uranium,
but fairly abundant reserves of thorium. Accordingly, our country has adopted a threestage strategy of nuclear power generation. The first stage involves the use of natural
uranium as a fuel, with heavy water as moderator. The Plutonium-239 obtained from
reprocessing of the discharged fuel from the reactors then serves as a fuel for the second
stage the fast breeder reactors. They are so called because they use fast neutrons for
sustaining the chain reaction (hence no moderator is needed) and, besides generating
power, also breed more fissile species (plutonium) than they consume. The third stage,
most significant in the long term, involves using fast breeder reactors to produce fissile
Uranium-233 from Thorium-232 and to build power reactors based on them.
India is currently well into the second stage of the programme and considerable
work has also been done on the third the thorium utilisation stage. The country
has mastered the complex technologies of mineral exploration and mining, fuel
fabrication, heavy water production, reactor design, construction and operation, fuel
reprocessing, etc. Pressurised Heavy Water Reactors (PHWRs) built at different sites in
the country mark the accomplishment of the first stage of the programme. India is now
more than self-sufficient in heavy water production. Elaborate safety measures both in
the design and operation of reactors, as also adhering to stringent standards of
radiological protection are the hallmark of the Indian Atomic Energy Programme.

produced, in some 3, etc. The extra neutrons in turn can initiate fission
processes, producing still more neutrons, and so on. This leads to the
possibility of a chain reaction, as was first suggested by Enrico Fermi. If
the chain reaction is controlled suitably, we can get a steady energy
output. This is what happens in a nuclear reactor. If the chain reaction is
uncontrolled, it leads to explosive energy output, as in a nuclear bomb.

Nuclear power plants in India: http://www.npcil.nic.in/main/AllProjectOperationDisplay.aspx

There is, however, a hurdle in sustaining a chain reaction, as described
here. It is known experimentally that slow neutrons (thermal neutrons)
are much more likely to cause fission in $^{235}_{92}\text{U}$ than fast neutrons. Also,
fast neutrons liberated in fission would escape instead of causing another
fission reaction.
The average energy of a neutron produced in fission of $^{235}_{92}\text{U}$ is 2 MeV.
These neutrons unless slowed down will escape from the reactor without
interacting with the uranium nuclei, unless a very large amount of
fissionable material is used for sustaining the chain reaction. What one
needs to do is to slow down the fast neutrons by elastic scattering with
light nuclei. In fact, Chadwick's experiments showed that in an elastic
collision with hydrogen the neutron almost comes to rest and the proton
carries away the energy. This is the same situation as when a marble hits
head-on an identical marble at rest. Therefore, in reactors, light nuclei
called moderators are provided along with the fissionable nuclei for slowing
down fast neutrons. The moderators commonly used are water, heavy
water (D$_2$O) and graphite. The Apsara reactor at the Bhabha Atomic
Research Centre (BARC), Mumbai, uses water as moderator. The other
Indian reactors, which are used for power production, use heavy water
as moderator.
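The marble analogy can be made quantitative. The sketch below is an illustration, not part of the original text: it uses the standard result for a head-on elastic collision with a target at rest, in which the fraction of kinetic energy transferred is 4A/(1 + A)^2, where A is the ratio of the target mass to the neutron mass. Taking A equal to the mass number is an approximation, and the function name `fraction_transferred` is hypothetical.

```python
def fraction_transferred(mass_ratio):
    """Fraction of the neutron's kinetic energy given to a nucleus at rest
    in a head-on elastic collision (mass_ratio = nucleus mass / neutron mass)."""
    return 4.0 * mass_ratio / (1.0 + mass_ratio) ** 2


# Mass ratios approximated by the mass number A (an assumption of this sketch).
targets = {"hydrogen": 1, "deuterium": 2, "carbon": 12, "uranium": 238}

for name, a in targets.items():
    print(f"{name:10s} A = {a:3d}  energy fraction lost = {fraction_transferred(a):.3f}")
```

A light nucleus takes away a large share of the neutron's energy in one collision, while a heavy nucleus such as uranium takes less than 2%. This is why light nuclei are chosen as moderators.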
Because of the use of moderator, it is possible that the ratio, K, of
number of fissions produced by a given generation of neutrons to the
number of fissions of the preceding generation may be greater than one.
This ratio is called the multiplication factor; it is the measure of the growth
rate of the neutrons in the reactor. For K = 1, the operation of the reactor
is said to be critical, which is what we wish it to be for steady power
operation. If K becomes greater than one, the reaction rate and the reactor
power increase exponentially. Unless the factor K is brought down very
close to unity, the reactor will become supercritical and can even explode.
The explosion of the Chernobyl reactor in Ukraine in 1986 is a sad
reminder that accidents in a nuclear reactor can be catastrophic.
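The effect of the multiplication factor can be illustrated with a few lines of code. This is a minimal sketch, assuming that each generation simply multiplies the number of fissions by K, so that after n generations the number is N0 K^n. The function name `fissions_after` and the choice of 100 generations are hypothetical.

```python
def fissions_after(generations, k, initial=1.0):
    """Number of fissions in a generation, assuming each generation
    multiplies the previous count by the factor k."""
    return initial * k ** generations


for k in (0.98, 1.00, 1.02):
    count = fissions_after(100, k)
    print(f"K = {k:.2f}: relative fission rate after 100 generations = {count:.3f}")
```

Even a 2% excess over unity multiplies the reaction rate about sevenfold in 100 generations, while a 2% deficit lets the reaction die away. This is why K must be held very close to one.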
The reaction rate is controlled through control-rods made out of
neutron-absorbing material such as cadmium. In addition to control rods,
reactors are provided with safety rods which, when required, can be
inserted into the reactor and K can be reduced rapidly to less than unity.
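The more abundant isotope $^{238}_{92}\text{U}$ in naturally occurring uranium is
non-fissionable. When it captures a neutron, it produces the highly
radioactive plutonium through these reactions

$^{238}_{92}\text{U} + n \rightarrow {}^{239}_{92}\text{U} \rightarrow {}^{239}_{93}\text{Np} + e^- + \bar{\nu}$

$^{239}_{93}\text{Np} \rightarrow {}^{239}_{94}\text{Pu} + e^- + \bar{\nu}$    (13.29)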

Plutonium undergoes fission with slow neutrons.

Figure 13.5 shows the schematic diagram of a nuclear reactor based
on thermal neutron fission. The core of the reactor is the site of nuclear
fission. It contains the fuel elements in suitably fabricated form. The fuel
may be, say, enriched uranium (i.e., one that has greater abundance of
$^{235}_{92}\text{U}$ than naturally occurring uranium). The core contains a moderator
to slow down the neutrons. The core is surrounded by a reflector to reduce
leakage. The energy (heat) released in fission is continuously removed by
a suitable coolant. A containment vessel prevents the escape of radioactive
fission products. The whole assembly is shielded to check harmful
radiation from coming out. The reactor can be shut down by means of
rods (made of, for example, cadmium) that have high absorption of
neutrons. The coolant transfers heat to a working fluid which in turn
may produce steam. The steam drives turbines and generates electricity.
Like any power reactor, nuclear reactors generate considerable waste
products. But nuclear wastes need special care for treatment since they
are radioactive and hazardous. Elaborate safety measures, both for reactor
operation as well as handling and reprocessing the spent fuel, are
required. These safety measures are a distinguishing feature of the Indian
Atomic Energy programme. An appropriate plan is being evolved to study
the possibility of converting radioactive waste into less active and short-lived material.

FIGURE 13.5 Schematic diagram of a nuclear reactor based on thermal neutron fission.

A simplified online simulation of a nuclear reactor: http://esa21.kennesaw.edu/activities/nukeenergy/nuke.htm

13.7.3 Nuclear fusion energy generation in stars

When two light nuclei fuse to form a larger nucleus, energy is released,
since the larger nucleus is more tightly bound, as seen from the binding
energy curve in Fig.13.1. Some examples of such energy liberating nuclear
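fusion reactions are:

$^1_1\text{H} + {}^1_1\text{H} \rightarrow {}^2_1\text{H} + e^+ + \nu + 0.42\ \text{MeV}$    [13.29(a)]

$^2_1\text{H} + {}^2_1\text{H} \rightarrow {}^3_2\text{He} + n + 3.27\ \text{MeV}$    [13.29(b)]

$^2_1\text{H} + {}^2_1\text{H} \rightarrow {}^3_1\text{H} + {}^1_1\text{H} + 4.03\ \text{MeV}$    [13.29(c)]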

In the first reaction, two protons combine to form a deuteron and
a positron with a release of 0.42 MeV energy. In reaction [13.29(b)], two
deuterons combine to form the light isotope of helium. In reaction
[13.29(c)], two deuterons combine to form a triton and a proton. For
fusion to take place, the two nuclei must come close enough so that
attractive short-range nuclear force is able to affect them. However,
since they are both positively charged particles, they experience coulomb
repulsion. They, therefore, must have enough energy to overcome this
coulomb barrier. The height of the barrier depends on the charges and
radii of the two interacting nuclei. It can be shown, for example, that
the barrier height for two protons is ~ 400 keV, and is higher for nuclei
with higher charges. We can estimate the temperature at which two
protons in a proton gas would (on average) have enough energy to
overcome the coulomb barrier:
$(3/2)kT = K \simeq 400$ keV, which gives $T \sim 3 \times 10^9$ K.
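The temperature estimate above can be checked numerically. The following is a minimal sketch that solves (3/2)kT = K for T, using the rounded constants quoted in the exercises of this chapter (k = 1.381 × 10^-23 J/K, 1 eV = 1.6 × 10^-19 J). The function name `temperature_for_mean_energy` is hypothetical.

```python
K_BOLTZMANN = 1.381e-23   # Boltzmann constant in J/K (rounded value)
EV_TO_JOULE = 1.6e-19     # one electron volt in joules (rounded value)


def temperature_for_mean_energy(energy_kev):
    """Temperature at which the mean kinetic energy (3/2)kT equals energy_kev."""
    energy_joule = energy_kev * 1e3 * EV_TO_JOULE
    return 2.0 * energy_joule / (3.0 * K_BOLTZMANN)


t_barrier = temperature_for_mean_energy(400.0)
print(f"T for a 400 keV barrier is about {t_barrier:.1e} K")
```

The result is close to 3 × 10^9 K, in agreement with the estimate in the text.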
When fusion is achieved by raising the temperature of the system so
that particles have enough kinetic energy to overcome the coulomb
repulsive behaviour, it is called thermonuclear fusion.
Thermonuclear fusion is the source of energy output in the interior
of stars. The interior of the sun has a temperature of $1.5 \times 10^7$ K, which
is considerably less than the estimated temperature required for fusion
of particles of average energy. Clearly, fusion in the sun involves protons
whose energies are much above the average energy.
The fusion reaction in the sun is a multi-step process in which the
hydrogen is burned into helium. Thus, the fuel in the sun is the hydrogen
in its core. The proton-proton (p, p) cycle by which this occurs is
represented by the following sets of reactions:
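$^1_1\text{H} + {}^1_1\text{H} \rightarrow {}^2_1\text{H} + e^+ + \nu + 0.42\ \text{MeV}$    (i)

$e^+ + e^- \rightarrow \gamma + \gamma + 1.02\ \text{MeV}$    (ii)

$^2_1\text{H} + {}^1_1\text{H} \rightarrow {}^3_2\text{He} + \gamma + 5.49\ \text{MeV}$    (iii)

$^3_2\text{He} + {}^3_2\text{He} \rightarrow {}^4_2\text{He} + {}^1_1\text{H} + {}^1_1\text{H} + 12.86\ \text{MeV}$    (iv)    (13.30)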

For the fourth reaction to occur, the first three reactions must occur
twice, in which case two light helium nuclei unite to form ordinary helium
nucleus. If we consider the combination 2(i) + 2(ii) + 2(iii) +(iv), the net
effect is

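$4\,{}^1_1\text{H} + 2e^- \rightarrow {}^4_2\text{He} + 2\nu + 6\gamma + 26.7\ \text{MeV}$

or $(4\,{}^1_1\text{H} + 4e^-) \rightarrow ({}^4_2\text{He} + 2e^-) + 2\nu + 6\gamma + 26.7\ \text{MeV}$    (13.31)

Thus, four hydrogen atoms combine to form a $^4_2\text{He}$ atom with a
release of 26.7 MeV of energy.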
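The figure of 26.7 MeV follows from adding the energies of the individual steps in the combination 2(i) + 2(ii) + 2(iii) + (iv). A short sketch of this bookkeeping is given below; the dictionary layout and variable names are illustrative choices only.

```python
# Energy released in each step of the proton-proton cycle (MeV) and the
# number of times the step occurs in the combination 2(i) + 2(ii) + 2(iii) + (iv).
steps = {
    "i": (0.42, 2),
    "ii": (1.02, 2),
    "iii": (5.49, 2),
    "iv": (12.86, 1),
}

total_mev = sum(energy * count for energy, count in steps.values())
print(f"Net energy released per helium nucleus formed: {total_mev:.2f} MeV")
```

The sum is 26.72 MeV, which rounds to the 26.7 MeV quoted in Eq. (13.31).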

Helium is not the only element that can be synthesized in the interior
of a star. As the hydrogen in the core gets depleted and becomes helium,
the core starts to cool. The star begins to collapse under its own gravity
which increases the temperature of the core. If this temperature increases
to about $10^8$ K, fusion takes place again, this time of helium nuclei into
carbon. This kind of process can generate through fusion higher and
higher mass number elements. But elements more massive than those
near the peak of the binding energy curve in Fig. 13.1 cannot be so
produced.
The age of the sun is about $5 \times 10^9$ y and it is estimated that there is
enough hydrogen in the sun to keep it going for another 5 billion years.
After that, the hydrogen burning will stop and the sun will begin to cool
and will start to collapse under gravity, which will raise the core
temperature. The outer envelope of the sun will expand, turning it into
the so-called red giant.

NUCLEAR HOLOCAUST

In a single uranium fission about 0.9 × 235 MeV (≈ 200 MeV) of energy is liberated. If
each nucleus of about 50 kg of $^{235}$U undergoes fission the amount of energy involved is
about $4 \times 10^{15}$ J. This energy is equivalent to about a million tons of TNT, enough for a
superexplosion. Uncontrolled release of large nuclear energy is called an atomic explosion.
On August 6, 1945 an atomic device was used in warfare for the first time. The US
dropped an atom bomb on Hiroshima, Japan. The explosion was equivalent to 20,000
tons of TNT. Instantly the radioactive products devastated 10 sq km of the city which
had 3,43,000 inhabitants. Of this number 66,000 were killed and 69,000 were injured;
more than 67% of the city's structures were destroyed.
High temperature conditions for fusion reactions can be created by exploding a fission
bomb. Super-explosions equivalent to 10 megatons of explosive power of TNT were tested
in 1954. Such bombs which involve fusion of isotopes of hydrogen, deuterium and tritium
are called hydrogen bombs. It is estimated that a nuclear arsenal sufficient to destroy
every form of life on this planet several times over is in position to be triggered by the
press of a button. Such a nuclear holocaust will not only destroy the life that exists now
but its radioactive fallout will make this planet unfit for life for all times. Scenarios based
on theoretical calculations predict a long nuclear winter, as the radioactive waste will
hang like a cloud in the earth's atmosphere and will absorb the sun's radiation.
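The energy quoted in the box for the complete fission of 50 kg of uranium-235 can be checked with a short calculation. This is a sketch under stated assumptions: 200 MeV per fission, a molar mass of 235 g, and the rounded constants given with the exercises. The function name `fission_energy_joules` is hypothetical.

```python
AVOGADRO = 6.023e23      # nuclei per mole (rounded value)
MEV_TO_JOULE = 1.6e-13   # one MeV in joules (rounded value)


def fission_energy_joules(mass_kg, molar_mass_g=235.0, mev_per_fission=200.0):
    """Energy released if every nucleus in mass_kg of fuel undergoes fission."""
    number_of_nuclei = mass_kg * 1e3 / molar_mass_g * AVOGADRO
    return number_of_nuclei * mev_per_fission * MEV_TO_JOULE


energy = fission_energy_joules(50.0)
print(f"Energy from complete fission of 50 kg of U-235: {energy:.1e} J")
```

The result is about 4 × 10^15 J.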

13.7.4 Controlled thermonuclear fusion

The natural thermonuclear fusion process in a star is replicated in a
thermonuclear fusion device. In controlled fusion reactors, the aim is to
generate steady power by heating the nuclear fuel to a temperature in the
range of $10^8$ K. At these temperatures, the fuel is a mixture of positive
ions and electrons (plasma). The challenge is to confine this plasma, since
no container can stand such a high temperature. Several countries
around the world including India are developing techniques in this
connection. If successful, fusion reactors will hopefully supply almost
unlimited power to humanity.

Example 13.7 Answer the following questions:


(a) Are the equations of nuclear reactions (such as those given in
Section 13.7) balanced in the sense a chemical equation (e.g.,
2H$_2$ + O$_2$ → 2H$_2$O) is? If not, in what sense are they balanced on
both sides?
(b) If both the number of protons and the number of neutrons are
conserved in each nuclear reaction, in what way is mass converted
into energy (or vice-versa) in a nuclear reaction?
(c) A general impression exists that mass-energy interconversion
takes place only in nuclear reaction and never in chemical
reaction. This is, strictly speaking, incorrect. Explain.
Solution

(a) A chemical equation is balanced in the sense that the number of
atoms of each element is the same on both sides of the equation.
A chemical reaction merely alters the original combinations of
atoms. In a nuclear reaction, elements may be transmuted. Thus,
the number of atoms of each element is not necessarily conserved
in a nuclear reaction. However, the number of protons and the
number of neutrons are both separately conserved in a nuclear
reaction. [Actually, even this is not strictly true in the realm of
very high energies; what is strictly conserved is the total charge
and total baryon number. We need not pursue this matter here.]
In nuclear reactions (e.g., Eq. 13.26), the number of protons and
the number of neutrons are the same on the two sides of the equation.

(b) We know that the binding energy of a nucleus gives a negative
contribution to the mass of the nucleus (mass defect). Now, since
proton number and neutron number are conserved in a nuclear
reaction, the total rest mass of neutrons and protons is the same
on either side of a reaction. But the total binding energy of nuclei
on the left side need not be the same as that on the right hand
side. The difference in these binding energies appears as energy
released or absorbed in a nuclear reaction. Since binding energy
contributes to mass, we say that the difference in the total mass
of nuclei on the two sides gets converted into energy or vice-versa.
It is in this sense that a nuclear reaction is an example of mass-energy interconversion.

(c) From the point of view of mass-energy interconversion, a chemical
reaction is similar to a nuclear reaction in principle. The energy
released or absorbed in a chemical reaction can be traced to the
difference in chemical (not nuclear) binding energies of atoms and
molecules on the two sides of a reaction. Since, strictly speaking,
chemical binding energy also gives a negative contribution (mass
defect) to the total mass of an atom or molecule, we can equally
well say that the difference in the total mass of atoms or molecules,
on the two sides of the chemical reaction gets converted into energy
or vice-versa. However, the mass defects involved in a chemical
reaction are almost a million times smaller than those in a nuclear
reaction. This is the reason for the general impression (which is
incorrect) that mass-energy interconversion does not take place
in a chemical reaction.

SUMMARY
1. An atom has a nucleus. The nucleus is positively charged. The radius
of the nucleus is smaller than the radius of an atom by a factor of
$10^4$. More than 99.9% mass of the atom is concentrated in the nucleus.

2. On the atomic scale, mass is measured in atomic mass units (u). By
definition, 1 atomic mass unit (1 u) is 1/12th mass of one atom of $^{12}$C;
1 u = $1.660563 \times 10^{-27}$ kg.

3. A nucleus contains a neutral particle called neutron. Its mass is almost
the same as that of proton.
The atomic number Z is the number of protons in the atomic nucleus
of an element. The mass number A is the total number of protons and
neutrons in the atomic nucleus; A = Z + N. Here N denotes the number
of neutrons in the nucleus.

4. A nuclear species or a nuclide is represented as $^A_Z\text{X}$, where X is the
chemical symbol of the species.
Nuclides with the same atomic number Z, but different neutron number
N are called isotopes. Nuclides with the same A are isobars and those
with the same N are isotones.
Most elements are mixtures of two or more isotopes. The atomic mass
of an element is a weighted average of the masses of its isotopes. The
weights are the relative abundances of the isotopes.

5. A nucleus can be considered to be spherical in shape and assigned a
radius. Electron scattering experiments allow determination of the
nuclear radius; it is found that radii of nuclei fit the formula
$R = R_0 A^{1/3}$,
where $R_0$ = a constant = 1.2 fm. This implies that the nuclear density
is independent of A. It is of the order of $10^{17}$ kg/m$^3$.

6. Neutrons and protons are bound in a nucleus by the short-range strong
nuclear force. The nuclear force does not distinguish between neutron
and proton.

7. The nuclear mass M is always less than the total mass, $\Sigma m$, of its
constituents. The difference in mass of a nucleus and its constituents
is called the mass defect,
$\Delta M = (Z m_p + (A - Z) m_n) - M$
Using Einstein's mass energy relation, we express this mass difference
in terms of energy as
$\Delta E_b = \Delta M c^2$

8. The energy $\Delta E_b$ represents the binding energy of the nucleus. In the
mass number range A = 30 to 170, the binding energy per nucleon is
nearly constant, about 8 MeV/nucleon.
Energies associated with nuclear processes are about a million times
larger than those of chemical processes.

9. The Q-value of a nuclear process is
Q = final kinetic energy − initial kinetic energy.
Due to conservation of mass-energy, this is also
Q = (sum of initial masses − sum of final masses)$c^2$

10. Radioactivity is the phenomenon in which nuclei of a given species
transform by giving out α or β or γ rays; α-rays are helium nuclei;
β-rays are electrons. γ-rays are electromagnetic radiation of wavelengths
shorter than X-rays.

11. Law of radioactive decay: $N(t) = N(0)\, e^{-\lambda t}$
where λ is the decay constant or disintegration constant.
The half-life $T_{1/2}$ of a radionuclide is the time in which N has been
reduced to one-half of its initial value. The mean life τ is the time at
which N has been reduced to $e^{-1}$ of its initial value.
$T_{1/2} = \dfrac{\ln 2}{\lambda} = \tau \ln 2$

12. Energy is released when less tightly bound nuclei are transmuted into
more tightly bound nuclei. In fission, a heavy nucleus like $^{235}_{92}\text{U}$ breaks
into two smaller fragments, e.g.,
$^{235}_{92}\text{U} + {}^1_0 n \rightarrow {}^{133}_{51}\text{Sb} + {}^{99}_{41}\text{Nb} + 4\,{}^1_0 n$

13. The fact that more neutrons are produced in fission than are consumed
gives the possibility of a chain reaction with each neutron that is
produced triggering another fission. The chain reaction is uncontrolled
and rapid in a nuclear bomb explosion. It is controlled and steady in
a nuclear reactor. In a reactor, the value of the neutron multiplication
factor k is maintained at 1.
14. In fusion, lighter nuclei combine to form a larger nucleus. Fusion of
hydrogen nuclei into helium nuclei is the source of energy of all stars
including our sun.
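The summary formulas for the nuclear radius, the binding energy and radioactive decay can be tried out numerically. The sketch below is illustrative only: the function names are hypothetical, the constants are the rounded values quoted in this chapter, and atomic masses are used for the binding energy (with the hydrogen-atom mass in place of the proton mass, so that the electron masses cancel).

```python
import math

U_IN_KG = 1.660563e-27   # 1 u in kg
U_IN_MEV = 931.5         # energy equivalent of 1 u in MeV
R0_METRE = 1.2e-15       # R0 = 1.2 fm
M_HYDROGEN = 1.007825    # mass of the hydrogen atom in u
M_NEUTRON = 1.008665     # mass of the neutron in u


def nuclear_radius(a):
    """R = R0 * A^(1/3), in metres."""
    return R0_METRE * a ** (1.0 / 3.0)


def nuclear_density(a):
    """Mass (taken as A u) divided by the volume of a sphere of radius R."""
    volume = (4.0 / 3.0) * math.pi * nuclear_radius(a) ** 3
    return a * U_IN_KG / volume


def binding_energy_mev(z, a, atomic_mass_u):
    """Binding energy from the mass defect, using atomic masses."""
    mass_defect = z * M_HYDROGEN + (a - z) * M_NEUTRON - atomic_mass_u
    return mass_defect * U_IN_MEV


def remaining_fraction(time, half_life):
    """N(t)/N(0) = exp(-lambda t), with lambda = ln 2 / T_half."""
    decay_constant = math.log(2.0) / half_life
    return math.exp(-decay_constant * time)


print(f"density for A = 4:   {nuclear_density(4):.2e} kg/m^3")
print(f"density for A = 208: {nuclear_density(208):.2e} kg/m^3")
print(f"binding energy of He-4: {binding_energy_mev(2, 4, 4.002603):.2f} MeV")
print(f"fraction left after 3 half-lives: {remaining_fraction(3.0, 1.0):.3f}")
```

The density comes out the same for a light and a heavy nucleus, of the order of 10^17 kg/m^3, because A cancels between the mass and the volume.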

Physical Quantity | Symbol | Dimensions | Units | Remarks
Atomic mass unit | | [M] | u | Unit of mass for expressing atomic or nuclear masses. One atomic mass unit equals 1/12th of the mass of $^{12}$C atom.
Disintegration or decay constant | λ | [T$^{-1}$] | s$^{-1}$ |
Half-life | $T_{1/2}$ | [T] | s | Time taken for the decay of one-half of the initial number of nuclei present in a radioactive sample.
Mean life | τ | [T] | s | Time at which number of nuclei has been reduced to $e^{-1}$ of its initial value.
Activity of a radioactive sample | R | [T$^{-1}$] | Bq | Measure of the activity of a radioactive source.

POINTS TO PONDER
1. The density of nuclear matter is independent of the size of the nucleus.
The mass density of the atom does not follow this rule.

2. The radius of a nucleus determined by electron scattering is found to
be slightly different from that determined by alpha-particle scattering.
This is because electron scattering senses the charge distribution of
the nucleus, whereas alpha and similar particles sense the nuclear
matter.
3. After Einstein showed the equivalence of mass and energy, $E = mc^2$,
we cannot any longer speak of separate laws of conservation of mass
and conservation of energy, but we have to speak of a unified law of
conservation of mass and energy. The most convincing evidence that
this principle operates in nature comes from nuclear physics. It is
central to our understanding of nuclear energy and harnessing it as a
source of power. Using the principle, Q of a nuclear process (decay or
reaction) can be expressed also in terms of initial and final masses.

4. The nature of the binding energy (per nucleon) curve shows that
exothermic nuclear reactions are possible, when two light nuclei fuse
or when a heavy nucleus undergoes fission into nuclei with intermediate
mass.

5. For fusion, the light nuclei must have sufficient initial energy to
overcome the coulomb potential barrier. That is why fusion requires
very high temperatures.

6. Although the binding energy (per nucleon) curve is smooth and slowly
varying, it shows peaks at nuclides like $^4$He, $^{16}$O etc. This is considered
as evidence of atom-like shell structure in nuclei.

7. Electron and positron are a particle-antiparticle pair. They are
identical in mass; their charges are equal in magnitude and opposite.
(It is found that when an electron and a positron come together, they
annihilate each other giving energy in the form of gamma-ray photons.)

8. In β⁻-decay (electron emission), the particle emitted along with electron
is anti-neutrino ($\bar{\nu}$). On the other hand, the particle emitted in β⁺-decay
(positron emission) is neutrino (ν). Neutrino and anti-neutrino
are a particle-antiparticle pair. There are antiparticles associated
with every particle. What should be antiproton which is the
antiparticle of the proton?

9. A free neutron is unstable ($n \rightarrow p + e^- + \bar{\nu}$). But a similar free proton
decay is not possible, since a proton is (slightly) lighter than a neutron.

10. Gamma emission usually follows alpha or beta emission. A nucleus
in an excited (higher) state goes to a lower state by emitting a gamma
photon. A nucleus may be left in an excited state after alpha or beta
emission. Successive emission of gamma rays from the same nucleus
(as in case of $^{60}$Ni, Fig. 13.4) is a clear proof that nuclei also have
discrete energy levels as do the atoms.

11. Radioactivity is an indication of the instability of nuclei. Stability
requires the ratio of neutron to proton to be around 1:1 for light
nuclei. This ratio increases to about 3:2 for heavy nuclei. (More
neutrons are required to overcome the effect of repulsion among the
protons.) Nuclei which are away from the stability ratio, i.e., nuclei
which have an excess of neutrons or protons are unstable. In fact,
only about 10% of known isotopes (of all elements) are stable. Others
have been either artificially produced in the laboratory by bombarding
α, p, d, n or other particles on targets of stable nuclear species or
identified in astronomical observations of matter in the universe.

EXERCISES
You may find the following data useful in solving the exercises:
e = $1.6 \times 10^{-19}$ C
N = $6.023 \times 10^{23}$ per mole
$1/(4\pi\varepsilon_0) = 9 \times 10^9$ N m$^2$/C$^2$
k = $1.381 \times 10^{-23}$ J K$^{-1}$
1 MeV = $1.6 \times 10^{-13}$ J
1 year = $3.154 \times 10^7$ s
1 u = 931.5 MeV/$c^2$
$m_H$ = 1.007825 u
$m_n$ = 1.008665 u
m($^4_2$He) = 4.002603 u
$m_e$ = 0.000548 u


13.1 (a) Two stable isotopes of lithium $^6_3$Li and $^7_3$Li have respective
abundances of 7.5% and 92.5%. These isotopes have masses
6.01512 u and 7.01600 u, respectively. Find the atomic mass
of lithium.
(b) Boron has two stable isotopes, $^{10}_5$B and $^{11}_5$B. Their respective
masses are 10.01294 u and 11.00931 u, and the atomic mass of
boron is 10.811 u. Find the abundances of $^{10}_5$B and $^{11}_5$B.

13.2 The three stable isotopes of neon: $^{20}_{10}$Ne, $^{21}_{10}$Ne and $^{22}_{10}$Ne have
respective abundances of 90.51%, 0.27% and 9.22%. The atomic
masses of the three isotopes are 19.99 u, 20.99 u and 21.99 u,
respectively. Obtain the average atomic mass of neon.

13.3 Obtain the binding energy (in MeV) of a nitrogen nucleus ($^{14}_7$N),
given m($^{14}_7$N) = 14.00307 u

13.4 Obtain the binding energy of the nuclei $^{56}_{26}$Fe and $^{209}_{83}$Bi in units of
MeV from the following data:
m($^{56}_{26}$Fe) = 55.934939 u    m($^{209}_{83}$Bi) = 208.980388 u

13.5 A given coin has a mass of 3.0 g. Calculate the nuclear energy that
would be required to separate all the neutrons and protons from
each other. For simplicity assume that the coin is entirely made of
$^{63}_{29}$Cu atoms (of mass 62.92960 u).

13.6 Write nuclear reaction equations for
(i) α-decay of $^{226}_{88}$Ra    (ii) α-decay of $^{242}_{94}$Pu
(iii) β⁻-decay of $^{32}_{15}$P    (iv) β⁻-decay of $^{210}_{83}$Bi
(v) β⁺-decay of $^{11}_6$C    (vi) β⁺-decay of $^{97}_{43}$Tc
(vii) Electron capture of $^{120}_{54}$Xe

13.7 A radioactive isotope has a half-life of T years. How long will it take
the activity to reduce to (a) 3.125%, (b) 1% of its original value?

13.8 The normal activity of living carbon-containing matter is found to
be about 15 decays per minute for every gram of carbon. This activity
arises from the small proportion of radioactive $^{14}_6$C present with the
stable carbon isotope $^{12}_6$C. When the organism is dead, its interaction
with the atmosphere (which maintains the above equilibrium activity)
ceases and its activity begins to drop. From the known half-life (5730
years) of $^{14}_6$C, and the measured activity, the age of the specimen
can be approximately estimated. This is the principle of $^{14}_6$C dating
used in archaeology. Suppose a specimen from Mohenjodaro gives
an activity of 9 decays per minute per gram of carbon. Estimate the
approximate age of the Indus-Valley civilisation.
13.9 Obtain the amount of $^{60}_{27}$Co necessary to provide a radioactive source
of 8.0 mCi strength. The half-life of $^{60}_{27}$Co is 5.3 years.

13.10 The half-life of $^{90}_{38}$Sr is 28 years. What is the disintegration rate of
15 mg of this isotope?

13.11 Obtain approximately the ratio of the nuclear radii of the gold isotope
$^{197}_{79}$Au and the silver isotope $^{107}_{47}$Ag.

13.12 Find the Q-value and the kinetic energy of the emitted α-particle in
the α-decay of (a) $^{226}_{88}$Ra and (b) $^{220}_{86}$Rn.
Given m($^{226}_{88}$Ra) = 226.02540 u, m($^{222}_{86}$Rn) = 222.01750 u,
m($^{220}_{86}$Rn) = 220.01137 u, m($^{216}_{84}$Po) = 216.00189 u.

13.13 The radionuclide $^{11}_6$C decays according to
$^{11}_6\text{C} \rightarrow {}^{11}_5\text{B} + e^+ + \nu$ : $T_{1/2}$ = 20.3 min
The maximum energy of the emitted positron is 0.960 MeV.
Given the mass values:
m($^{11}_6$C) = 11.011434 u and m($^{11}_5$B) = 11.009305 u,
calculate Q and compare it with the maximum energy of the positron
emitted.

13.14 The nucleus $^{23}_{10}$Ne decays by β⁻ emission. Write down the β-decay
equation and determine the maximum kinetic energy of the
electrons emitted. Given that:
m($^{23}_{10}$Ne) = 22.994466 u
m($^{23}_{11}$Na) = 22.989770 u.

13.15 The Q value of a nuclear reaction A + b → C + d is defined by
Q = [$m_A + m_b - m_C - m_d$]$c^2$
where the masses refer to the respective nuclei. Determine from the
given data the Q-value of the following reactions and state whether
the reactions are exothermic or endothermic.
(i) $^1_1\text{H} + {}^3_1\text{H} \rightarrow {}^2_1\text{H} + {}^2_1\text{H}$
(ii) $^{12}_6\text{C} + {}^{12}_6\text{C} \rightarrow {}^{20}_{10}\text{Ne} + {}^4_2\text{He}$
Atomic masses are given to be
m($^2_1$H) = 2.014102 u
m($^3_1$H) = 3.016049 u
m($^{12}_6$C) = 12.000000 u
m($^{20}_{10}$Ne) = 19.992439 u

13.16 Suppose, we think of fission of a $^{56}_{26}$Fe nucleus into two equal
fragments, $^{28}_{13}$Al. Is the fission energetically possible? Argue by
working out Q of the process. Given m($^{56}_{26}$Fe) = 55.93494 u and
m($^{28}_{13}$Al) = 27.98191 u.
13.17 The fission properties of $^{239}_{94}$Pu are very similar to those of $^{235}_{92}$U. The
average energy released per fission is 180 MeV. How much energy,
in MeV, is released if all the atoms in 1 kg of pure $^{239}_{94}$Pu undergo
fission?
13.18 A 1000 MW fission reactor consumes half of its fuel in 5.00 y. How
much $^{235}_{92}$U did it contain initially? Assume that the reactor operates
80% of the time, that all the energy generated arises from the fission
of $^{235}_{92}$U and that this nuclide is consumed only by the fission process.
13.19 How long can an electric lamp of 100 W be kept glowing by fusion of
2.0 kg of deuterium? Take the fusion reaction as
$^2_1\text{H} + {}^2_1\text{H} \rightarrow {}^3_2\text{He} + n + 3.27\ \text{MeV}$

13.20 Calculate the height of the potential barrier for a head on collision
of two deuterons. (Hint: The height of the potential barrier is given
by the Coulomb repulsion between the two deuterons when they
just touch each other. Assume that they can be taken as hard
spheres of radius 2.0 fm.)
13.21 From the relation $R = R_0 A^{1/3}$, where $R_0$ is a constant and A is the
mass number of a nucleus, show that the nuclear matter density is
nearly constant (i.e. independent of A).
13.22 For the β⁺ (positron) emission from a nucleus, there is another
competing process known as electron capture (electron from an inner
orbit, say, the K-shell, is captured by the nucleus and a neutrino is
emitted).
$e^- + {}^A_Z\text{X} \rightarrow {}^A_{Z-1}\text{Y} + \nu$
Show that if β⁺ emission is energetically allowed, electron capture
is necessarily allowed but not vice versa.

ADDITIONAL EXERCISES

13.23 In a periodic table the average atomic mass of magnesium is given
as 24.312 u. The average value is based on their relative natural
abundance on earth. The three isotopes and their masses are $^{24}_{12}$Mg
(23.98504 u), $^{25}_{12}$Mg (24.98584 u) and $^{26}_{12}$Mg (25.98259 u). The natural
abundance of $^{24}_{12}$Mg is 78.99% by mass. Calculate the abundances
of other two isotopes.
13.24 The neutron separation energy is defined as the energy required to
remove a neutron from the nucleus. Obtain the neutron separation
energies of the nuclei $^{41}_{20}$Ca and $^{27}_{13}$Al from the following data:
m($^{40}_{20}$Ca) = 39.962591 u
m($^{41}_{20}$Ca) = 40.962278 u
m($^{26}_{13}$Al) = 25.986895 u
m($^{27}_{13}$Al) = 26.981541 u
13.25 A source contains two phosphorus radionuclides $^{32}_{15}$P ($T_{1/2}$ = 14.3 d)
and $^{33}_{15}$P ($T_{1/2}$ = 25.3 d). Initially, 10% of the decays come from $^{33}_{15}$P.
How long must one wait until 90% do so?
13.26 Under certain circumstances, a nucleus can decay by emitting a
particle more massive than an α-particle. Consider the following
decay processes:
$^{223}_{88}\text{Ra} \rightarrow {}^{209}_{82}\text{Pb} + {}^{14}_6\text{C}$
$^{223}_{88}\text{Ra} \rightarrow {}^{219}_{86}\text{Rn} + {}^4_2\text{He}$
Calculate the Q-values for these decays and determine that both
are energetically allowed.

13.27 Consider the fission of $^{238}_{92}$U by fast neutrons. In one fission event,
no neutrons are emitted and the final end products, after the beta
decay of the primary fragments, are $^{140}_{58}$Ce and $^{99}_{44}$Ru. Calculate Q
for this fission process. The relevant atomic and particle masses
are
m($^{238}_{92}$U) = 238.05079 u
m($^{140}_{58}$Ce) = 139.90543 u
m($^{99}_{44}$Ru) = 98.90594 u

13.28 Consider the D-T reaction (deuterium-tritium fusion)
$^2_1\text{H} + {}^3_1\text{H} \rightarrow {}^4_2\text{He} + n$
(a) Calculate the energy released in MeV in this reaction from the
data:
m($^2_1$H) = 2.014102 u
m($^3_1$H) = 3.016049 u
(b) Consider the radius of both deuterium and tritium to be
approximately 2.0 fm. What is the kinetic energy needed to
overcome the coulomb repulsion between the two nuclei? To what
temperature must the gas be heated to initiate the reaction?
(Hint: Kinetic energy required for one fusion event = average
thermal kinetic energy available with the interacting particles
= 2(3kT/2); k = Boltzmann's constant, T = absolute temperature.)
13.29 Obtain the maximum kinetic energy of β-particles, and the radiation
frequencies of γ decays in the decay scheme shown in Fig. 13.6. You
are given that
m($^{198}$Au) = 197.968233 u
m($^{198}$Hg) = 197.966760 u

FIGURE 13.6

Physics

no

tt
o N
be C
E
re R
pu T
bl
is
he
d

13.30 Calculate and compare the energy released by a) fusion of 1.0 kg of


hydrogen deep within Sun and b) the fission of 1.0 kg of 235U in a
fission r eactor.
13.31 Suppose India had a target of producing by 2020 AD, 200,000 MW
of electric power, ten percent of which was to be obtained from nuclear
power plants. Suppose we are given that, on an average, the efficiency
of utilization (i.e. conversion to electric energy) of thermal energy
produced in a reactor was 25%. How much amount of fissionable
uranium would our country need per year by 2020? Take the heat
energy per fission of 235U to be about 200MeV.


Chapter Fourteen


SEMICONDUCTOR
ELECTRONICS:
MATERIALS, DEVICES
AND SIMPLE CIRCUITS


14.1 INTRODUCTION

Devices in which a controlled flow of electrons can be obtained are the
basic building blocks of all the electronic circuits. Before the discovery of
the transistor in 1948, such devices were mostly vacuum tubes (also called
valves) like the vacuum diode which has two electrodes, viz., anode (often
called plate) and cathode; triode which has three electrodes: cathode,
plate and grid; tetrode and pentode (respectively with 4 and 5 electrodes).
In a vacuum tube, the electrons are supplied by a heated cathode and
the controlled flow of these electrons in vacuum is obtained by varying
the voltage between its different electrodes. Vacuum is required in the
inter-electrode space; otherwise the moving electrons may lose their
energy on collision with the air molecules in their path. In these devices
the electrons can flow only from the cathode to the anode (i.e., only in one
direction). Therefore, such devices are generally referred to as valves.
These vacuum tube devices are bulky, consume high power, operate
generally at high voltages (~100 V) and have limited life and low reliability.
The seed of the development of modern solid-state semiconductor
electronics goes back to the 1930s when it was realised that some solid-state
semiconductors and their junctions offer the possibility of controlling
the number and the direction of flow of charge carriers through them.
Simple excitations like light, heat or small applied voltage can change
the number of mobile charges in a semiconductor. Note that the supply
and flow of charge carriers in the semiconductor devices are within the
solid itself, while in the earlier vacuum tubes/valves, the mobile electrons
were obtained from a heated cathode and they were made to flow in an
evacuated space or vacuum. No external heating or large evacuated space
is required by the semiconductor devices. They are small in size, consume
low power, operate at low voltages and have long life and high reliability.
Even the Cathode Ray Tubes (CRT) used in television and computer
monitors which work on the principle of vacuum tubes are being replaced
by Liquid Crystal Display (LCD) monitors with supporting solid state
electronics. Long before the full implications of semiconductor devices
were formally understood, a naturally occurring crystal of galena (lead
sulphide, PbS) with a metal point contact attached to it was used as
a detector of radio waves.
In the following sections, we will introduce the basic concepts of
semiconductor physics and discuss some semiconductor devices like
junction diodes (a 2-electrode device) and bipolar junction transistor (a
3-electrode device). A few circuits illustrating their applications will also
be described.


14.2 CLASSIFICATION OF METALS, INSULATORS AND SEMICONDUCTORS


On the basis of conductivity


On the basis of the relative values of electrical conductivity ($\sigma$) or resistivity
($\rho = 1/\sigma$), the solids are broadly classified as:
(i) Metals: They possess very low resistivity (or high conductivity).
$\rho \sim 10^{-2}$ to $10^{-8}\ \Omega\,\mathrm{m}$
$\sigma \sim 10^{2}$ to $10^{8}\ \mathrm{S\,m^{-1}}$
(ii) Semiconductors: They have resistivity or conductivity intermediate
to metals and insulators.
$\rho \sim 10^{-5}$ to $10^{6}\ \Omega\,\mathrm{m}$
$\sigma \sim 10^{5}$ to $10^{-6}\ \mathrm{S\,m^{-1}}$
(iii) Insulators: They have high resistivity (or low conductivity).
$\rho \sim 10^{11}$ to $10^{19}\ \Omega\,\mathrm{m}$
$\sigma \sim 10^{-11}$ to $10^{-19}\ \mathrm{S\,m^{-1}}$
The values of $\rho$ and $\sigma$ given above are indicative of magnitude and
could well go outside these ranges.
are not the only criteria for distinguishing metals, insulators and
semiconductors from each other. There are some other differences, which
will become clear as we go along in this chapter.
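
The indicative ranges above can be written down as a small lookup, which also makes it visible that the ranges for metals and semiconductors overlap, so resistivity alone does not always decide the class. This is a minimal sketch: the resistivity ranges are the indicative ones quoted in the text, while the function names and the sample resistivities are illustrative assumptions.

```python
# Indicative resistivity ranges in ohm metre, as quoted in the classification above.
RESISTIVITY_RANGES = {
    "metal": (1e-8, 1e-2),
    "semiconductor": (1e-5, 1e6),
    "insulator": (1e11, 1e19),
}


def candidate_classes(resistivity_ohm_m: float) -> list[str]:
    """Return every class whose indicative range contains the given resistivity."""
    return [
        name
        for name, (low, high) in RESISTIVITY_RANGES.items()
        if low <= resistivity_ohm_m <= high
    ]


def conductivity(resistivity_ohm_m: float) -> float:
    """Conductivity in S/m is the reciprocal of resistivity in ohm metre."""
    return 1.0 / resistivity_ohm_m


# Sample values chosen only for illustration.
for rho in (1.7e-8, 1e-3, 2.3e3, 1e14):
    print(rho, conductivity(rho), candidate_classes(rho))
```

A value such as 1e-3 ohm metre falls in both the metal and the semiconductor range, which is one reason the text notes that resistivity is not the only criterion.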
Our interest in this chapter is in the study of semiconductors which
could be:
(i) Elemental semiconductors: Si and Ge
(ii) Compound semiconductors: Examples are:
Inorganic: CdS, GaAs, CdSe, InP, etc.
Organic: anthracene, doped phthalocyanines, etc.
Organic polymers: polypyrrole, polyaniline, polythiophene, etc.
Most of the currently available semiconductor devices are based on
elemental semiconductors Si or Ge and compound inorganic
semiconductors. However, after 1990, a few semiconductor devices using
organic semiconductors and semiconducting polymers have been
developed, signalling the birth of a futuristic technology of polymer-electronics
and molecular-electronics. In this chapter, we will restrict
ourselves to the study of inorganic semiconductors, particularly
elemental semiconductors Si and Ge. The general concepts introduced
here for discussing the elemental semiconductors, by-and-large, apply
to most of the compound semiconductors as well.
On the basis of energy bands


According to the Bohr atomic model, in an isolated atom the energy of
any of its electrons is decided by the orbit in which it revolves. But when
the atoms come together to form a solid they are close to each other. So
the outer orbits of electrons from neighbouring atoms would come very
close or could even overlap. This would make the nature of electron motion
in a solid very different from that in an isolated atom.
Inside the crystal each electron has a unique position and no two
electrons see exactly the same pattern of surrounding charges. Because
of this, each electron will have a different energy level. These different
energy levels with continuous energy variation form what are called
energy bands. The energy band which includes the energy levels of the
valence electrons is called the valence band. The energy band above the
valence band is called the conduction band. With no external energy, all
the valence electrons will reside in the valence band. If the lowest level in
the conduction band happens to be lower than the highest level of the
valence band, the electrons from the valence band can easily move into
the conduction band. Normally the conduction band is empty. But when
it overlaps the valence band, electrons can move freely into it. This is
the case with metallic conductors.
If there is some gap between the conduction band and the valence
band, electrons in the valence band all remain bound and no free electrons
are available in the conduction band. This makes the material an
insulator. But some of the electrons from the valence band may gain
external energy to cross the gap between the conduction band and the
valence band. Then these electrons will move into the conduction band.
At the same time they will create vacant energy levels in the valence band
where other valence electrons can move. Thus the process creates the
possibility of conduction due to electrons in conduction band as well as
due to vacancies in the valence band.
Let us consider what happens in the case of Si or Ge crystal containing
N atoms. For Si, the outermost orbit is the third orbit (n = 3), while for Ge
it is the fourth orbit (n = 4). The number of electrons in the outermost
orbit is 4 (2s and 2p electrons). Hence, the total number of outer electrons
in the crystal is 4N. The maximum possible number of electrons in the
outer orbit is 8 (2s + 6p electrons). So, for the 4N valence electrons there
are 8N available energy states. These 8N discrete energy levels can either
form a continuous band or they may be grouped in different bands
depending upon the distance between the atoms in the crystal (see box
on Band Theory of Solids).
At the distance between the atoms in the crystal lattices of Si and Ge,
the energy band of these 8N states is split apart into two which are
separated by an energy gap Eg (Fig. 14.1). The lower band which is
completely occupied by the 4N valence electrons at temperature of absolute
zero is the valence band. The other band consisting of 4N energy states,
called the conduction band, is completely empty at absolute zero.

BAND THEORY OF SOLIDS

Consider that the Si or Ge crystal contains N atoms. Electrons of each atom will have
discrete energies in different orbits. The electron energy will be the same if all the atoms are
isolated, i.e., separated from each other by a large distance. However, in a crystal, the atoms
are close to each other (2 to 3 Å) and therefore the electrons interact with each other and
also with the neighbouring atomic cores. The overlap (or interaction) will be more felt by the
electrons in the outermost orbit while the inner orbit or core electron energies may remain
unaffected. Therefore, for understanding electron energies in Si or Ge crystal, we
need to consider the changes in the energies of the electrons in the outermost orbit only.
For Si, the outermost orbit is the third orbit (n = 3), while for Ge it is the fourth orbit
(n = 4). The number of electrons in the outermost orbit is 4 (2s and 2p electrons). Hence,
the total number of outer electrons in the crystal is 4N. The maximum possible number of
outer electrons in the orbit is 8 (2s + 6p electrons). So, out of the 4N electrons, 2N electrons
are in the 2N s-states (orbital quantum number l = 0) and 2N electrons are in the available
6N p-states. Obviously, some p-electron states are empty as shown in the extreme right of
Figure. This is the case of well separated or isolated atoms [region A of Figure].
Suppose these atoms start coming nearer to each other to form a solid. The energies
of these electrons in the outermost orbit may change (both increase and decrease) due to
the interaction between the electrons of different atoms. The 6N states for l = 1, which
originally had identical energies in the isolated atoms, spread out and form an energy
band [region B in Figure]. Similarly, the 2N states for l = 0, having identical energies in
the isolated atoms, split into a second band (carefully see the region B of Figure) separated
from the first one by an energy gap.
At still smaller spacing, however, there comes a region in which the bands merge with
each other. The lowest energy state that is a split from the upper atomic level appears to
drop below the upper state that has come from the lower atomic level. In this region (region
C in Figure), no energy gap exists where the upper and lower energy states get mixed.
Finally, if the distance between the atoms further decreases, the energy bands again
split apart and are separated by an energy gap Eg (region D in Figure). The total number
of available energy states 8N has been re-apportioned between the two bands (4N states
each in the lower and upper energy bands). Here the significant point is that there are
exactly as many states in the lower band (4N) as there are available valence electrons
from the atoms (4N ).
Therefore, this band (called the valence band) is completely filled while the upper
band is completely empty. The upper band is called the conduction band.

The lowest energy level in the conduction band is shown as EC and the
highest energy level in the valence band is shown as EV. Above EC and below EV
there are a large number of closely spaced energy levels, as shown in Fig. 14.1.

FIGURE 14.1 The energy band positions in a semiconductor at 0 K. The upper band, called the
conduction band, consists of infinitely large number of closely spaced energy states. The lower band,
called the valence band, consists of closely spaced completely filled energy states.

The gap between the top of the valence band and bottom of the conduction band
is called the energy band gap (Energy gap Eg). It may be large, small, or zero,
depending upon the material. These different situations are depicted in Fig. 14.2
and discussed below:
Case I: This refers to a situation, as shown in Fig. 14.2(a). One can have a
metal either when the conduction band is partially filled and the valence band
is partially empty or when the conduction and valence bands overlap. When there
is overlap, electrons from the valence band can easily move into the conduction band.
This situation makes a large number of electrons available for electrical conduction.
When the valence band is partially empty, electrons from its lower level can move to
higher level making conduction possible. Therefore, the resistance of such materials
is low or the conductivity is high.

FIGURE 14.2 Difference between energy bands of (a) metals,
(b) insulators and (c) semiconductors.

Case II: In this case, as shown in Fig. 14.2(b), a large band gap Eg exists
(Eg > 3 eV). There are no electrons in the conduction band, and therefore
no electrical conduction is possible. Note that the energy gap is so large
that electrons cannot be excited from the valence band to the conduction
band by thermal excitation. This is the case of insulators.
Case III: This situation is shown in Fig. 14.2(c). Here a finite but small
band gap (Eg < 3 eV) exists. Because of the small band gap, at room
temperature some electrons from valence band can acquire enough
energy to cross the energy gap and enter the conduction band. These
electrons (though small in numbers) can move in the conduction band.
Hence, the resistance of semiconductors is not as high as that of the
insulators.
In this section we have made a broad classification of metals,
insulators and semiconductors. In the section which follows you will
learn the conduction process in semiconductors.

14.3 INTRINSIC SEMICONDUCTOR


We shall take the most common case of Ge and Si whose lattice structure
is shown in Fig. 14.3. These structures are called the diamond-like
structures. Each atom is surrounded by four nearest neighbours. We
know that Si and Ge have four valence electrons. In its crystalline
structure, every Si or Ge atom tends to share one of its four valence
electrons with each of its four nearest neighbour atoms, and also to take
share of one electron from each such neighbour. These shared electron
pairs are referred to as forming a covalent bond or simply a valence
bond. The two shared electrons can be assumed to shuttle back-and-forth
between the associated atoms holding them together strongly.
Figure 14.4 schematically shows the 2-dimensional representation of Si
or Ge structure shown in Fig. 14.3 which overemphasises the covalent
bond. It shows an idealised picture in which no bonds are broken (all
bonds are intact). Such a situation arises at low
temperatures. As the temperature increases, more
thermal energy becomes available to these electrons
and some of these electrons may break away
(becoming free electrons contributing to conduction).
The thermal energy effectively ionises only a few atoms
in the crystalline lattice and creates a vacancy in the
bond as shown in Fig. 14.5(a). The neighbourhood
from which the free electron (with charge −q) has come
out is left with a vacancy with an effective charge (+q). This
vacancy with the effective positive electronic charge is
called a hole. The hole behaves as an apparent free
particle with effective positive charge.
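
The lattice spacings quoted with Fig. 14.3 (3.56, 5.43 and 5.66 Å for C, Si and Ge) can be converted into the number of atoms per unit volume. The sketch below assumes, as a standard fact not stated in the text, that the conventional cubic cell of the diamond structure contains 8 atoms; the function and variable names are illustrative. For Si the result comes out close to the figure of 5 × 10^28 atoms per cubic metre used in Example 14.2.

```python
ANGSTROM = 1.0e-10  # metres

# Lattice spacings a quoted with Fig. 14.3, in angstrom.
LATTICE_SPACING = {"C": 3.56, "Si": 5.43, "Ge": 5.66}

# Assumption (not stated in the text): the conventional cubic cell of the
# diamond structure contains 8 atoms.
ATOMS_PER_CELL = 8


def atoms_per_cubic_metre(a_angstrom: float) -> float:
    """Number density of atoms for a cubic cell of edge a holding 8 atoms."""
    cell_volume = (a_angstrom * ANGSTROM) ** 3
    return ATOMS_PER_CELL / cell_volume


densities = {el: atoms_per_cubic_metre(a) for el, a in LATTICE_SPACING.items()}
for element, density in densities.items():
    print(f"{element}: {density:.2e} atoms per cubic metre")
```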
FIGURE 14.3 Three-dimensional diamond-like crystal structure for Carbon,
Silicon or Germanium with respective lattice spacing a equal
to 3.56, 5.43 and 5.66 Å.

In intrinsic semiconductors, the number of free electrons, ne, is equal to
the number of holes, nh. That is
$n_e = n_h = n_i$    (14.1)
where ni is called the intrinsic carrier concentration.
Semiconductors possess the unique property in
which, apart from electrons, the holes also move.

Suppose there is a hole at site 1 as shown in
Fig. 14.5(a). The movement of holes can be
visualised as shown in Fig. 14.5(b). An electron
from the covalent bond at site 2 may jump to
the vacant site 1 (hole). Thus, after such a jump,
the hole is at site 2 and the site 1 has now an
electron. Therefore, apparently, the hole has
moved from site 1 to site 2. Note that the electron
originally set free [Fig. 14.5(a)] is not involved
in this process of hole motion. The free electron
moves completely independently as a conduction
electron and gives rise to an electron current, Ie,
under an applied electric field. Remember that
the motion of a hole is only a convenient way of
describing the actual motion of bound electrons,
whenever there is an empty bond anywhere in
the crystal. Under the action of an electric field,
these holes move towards negative potential
giving the hole current, Ih. The total current, I, is
thus the sum of the electron current Ie and the
hole current Ih:

FIGURE 14.4 Schematic two-dimensional
representation of Si or Ge structure showing
covalent bonds at low temperature
(all bonds intact). +4 symbol
indicates inner cores of Si or Ge.


$I = I_e + I_h$    (14.2)
It may be noted that apart from the process of generation of conduction
electrons and holes, a simultaneous process of recombination occurs in
which the electrons recombine with the holes. At equilibrium, the rate of
generation is equal to the rate of recombination of charge carriers. The
recombination occurs due to an electron colliding with a hole.

FIGURE 14.5 (a) Schematic model of generation of hole at site 1 and conduction electron
due to thermal energy at moderate temperatures. (b) Simplified representation of
possible thermal motion of a hole. The electron from the lower left hand covalent bond
(site 2) goes to the earlier hole site 1, leaving a hole at its site indicating an
apparent movement of the hole from site 1 to site 2.


An intrinsic semiconductor
will behave like an insulator at
T = 0 K as shown in Fig. 14.6(a).
It is the thermal energy at
higher temperatures (T > 0 K),
which excites some electrons
from the valence band to the
conduction band. These
thermally excited electrons at
T > 0 K, partially occupy the
conduction band. Therefore,
the energy-band diagram of an
intrinsic semiconductor will be
as shown in Fig. 14.6(b). Here,
some electrons are shown in
the conduction band. These
have come from the valence
band leaving equal number of
holes there.

FIGURE 14.6 (a) An intrinsic semiconductor at T = 0 K
behaves like an insulator. (b) At T > 0 K, four thermally generated
electron-hole pairs. The filled circles (●) represent electrons
and empty circles (○) represent holes.

Example 14.1 C, Si and Ge have the same lattice structure. Why is C
an insulator while Si and Ge are intrinsic semiconductors?

Solution The 4 bonding electrons of C, Si or Ge lie, respectively, in
the second, third and fourth orbit. Hence, energy required to take
out an electron from these atoms (i.e., ionisation energy Eg) will be
least for Ge, followed by Si and highest for C. Hence, the number of free
electrons for conduction in Ge and Si is significant but negligibly
small for C.


14.4 EXTRINSIC SEMICONDUCTOR


The conductivity of an intrinsic semiconductor depends on its
temperature, but at room temperature its conductivity is very low. As
such, no important electronic devices can be developed using these
semiconductors. Hence there is a necessity of improving their
conductivity. This can be done by making use of impurities.
When a small amount, say, a few parts per million (ppm), of a suitable
impurity is added to the pure semiconductor, the conductivity of the
semiconductor is increased manifold. Such materials are known as
extrinsic semiconductors or impurity semiconductors. The deliberate
addition of a desirable impurity is called doping and the impurity atoms
are called dopants. Such a material is also called a doped semiconductor.
The dopant has to be such that it does not distort the original pure
semiconductor lattice. It occupies only a very few of the original
semiconductor atom sites in the crystal. A necessary condition to attain
this is that the sizes of the dopant and the semiconductor atoms should
be nearly the same.
There are two types of dopants used in doping the tetravalent Si
or Ge:
(i) Pentavalent (valency 5); like Arsenic (As), Antimony (Sb), Phosphorus
(P), etc.
(ii) Trivalent (valency 3); like Indium (In),
Boron (B), Aluminium (Al), etc.
We shall now discuss how the doping
changes the number of charge carriers (and
hence the conductivity) of semiconductors.
Si or Ge belongs to the fourth group in the
Periodic table and, therefore, we choose the
dopant element from nearby fifth or third
group, expecting and taking care that the
size of the dopant atom is nearly the same as
that of Si or Ge. Interestingly, the pentavalent
and trivalent dopants in Si or Ge give two
entirely different types of semiconductors as
discussed below.
(i) n-type semiconductor


FIGURE 14.7 (a) Pentavalent donor atom (As, Sb, P, etc.) doped for tetravalent Si or Ge
giving n-type semiconductor, and (b) Commonly used schematic representation of n-type
material which shows only the fixed cores of the substituent donors with one additional
effective positive charge and its associated extra electron.

Suppose we dope Si or Ge with a pentavalent element as shown in Fig. 14.7. When an atom
of +5 valency element occupies the position of an atom in the crystal lattice of Si, four of
its electrons bond with the four silicon neighbours while the fifth remains very
weakly bound to its parent atom. This is because the four electrons participating in
bonding are seen as part of the effective core of the atom by the fifth electron. As a result
the ionisation energy required to set this electron free is very small and even at room
temperature it will be free to move in the lattice of the semiconductor. For example, the
energy required is ~ 0.01 eV for germanium, and 0.05 eV for silicon, to separate this
electron from its atom. This is in contrast to the energy required to jump
the forbidden band (about 0.72 eV for germanium and about 1.1 eV for
silicon) at room temperature in the intrinsic semiconductor. Thus, the
pentavalent dopant is donating one extra electron for conduction and
hence is known as donor impurity. The number of electrons made
available for conduction by dopant atoms depends strongly upon the
doping level and is independent of any increase in ambient temperature.
On the other hand, the number of free electrons (with an equal number
of holes) generated by Si atoms, increases weakly with temperature.
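
To see why the donor electron is free at room temperature while band-to-band excitation is rare, the energies quoted above can be compared with the thermal energy kT. The following is a rough sketch: the donor ionisation energies and band gaps are the values given in the text, whereas the choice of 300 K for room temperature and the value of Boltzmann's constant are assumptions supplied here.

```python
K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV per kelvin

ROOM_TEMPERATURE = 300.0  # kelvin, an assumed value for "room temperature"

# Energies quoted in the text, in eV.
DONOR_IONISATION = {"Ge": 0.01, "Si": 0.05}
BAND_GAP = {"Ge": 0.72, "Si": 1.1}

thermal_energy = K_BOLTZMANN_EV * ROOM_TEMPERATURE  # kT, about 0.026 eV

ratios = {}
for element in ("Ge", "Si"):
    # Express both energies in units of kT for a direct comparison.
    ratios[element] = (
        DONOR_IONISATION[element] / thermal_energy,
        BAND_GAP[element] / thermal_energy,
    )
    donor_ratio, gap_ratio = ratios[element]
    print(f"{element}: donor level = {donor_ratio:.1f} kT, band gap = {gap_ratio:.0f} kT")
```

The donor levels are of the order of kT, whereas the band gaps are tens of kT, which is consistent with the statement that donors are easily ionised while intrinsic generation stays small.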
In a doped semiconductor the total number of conduction electrons
ne is due to the electrons contributed by donors and those generated
intrinsically, while the total number of holes nh is only due to the holes
from the intrinsic source. But the rate of recombination of holes would
increase due to the increase in the number of electrons. As a result, the
number of holes would get reduced further.
Thus, with proper level of doping the number of conduction electrons
can be made much larger than the number of holes. Hence in an extrinsic
semiconductor doped with pentavalent impurity, electrons
become the majority carriers and holes the minority carriers.
These semiconductors are, therefore, known as n-type
semiconductors. For n-type semiconductors, we have,
$n_e \gg n_h$    (14.3)


(ii) p-type semiconductor


FIGURE 14.8 (a) Trivalent acceptor atom (In, Al, B etc.) doped in tetravalent Si or Ge
lattice giving p-type semiconductor. (b) Commonly used schematic representation of
p-type material which shows only the fixed core of the substituent acceptor with
one effective additional negative charge and its associated hole.


This is obtained when Si or Ge is doped with a trivalent impurity
like Al, B, In, etc. The dopant has one valence electron less than
Si or Ge and, therefore, this atom can form covalent bonds with
neighbouring three Si atoms but does not have any electron to
offer to the fourth Si atom. So the bond between the fourth
neighbour and the trivalent atom has a vacancy or hole as shown
in Fig. 14.8. Since the neighbouring Si atom in the lattice wants
an electron in place of a hole, an electron in the outer orbit of
an atom in the neighbourhood may jump to fill this vacancy,
leaving a vacancy or hole at its own site. Thus the hole is
available for conduction. Note that the trivalent foreign atom
becomes effectively negatively charged when it shares fourth
electron with neighbouring Si atom. Therefore, the dopant atom
of p-type material can be treated as core of one negative charge
along with its associated hole as shown in Fig. 14.8(b). It is
obvious that one acceptor atom gives one hole. These holes are
in addition to the intrinsically generated holes while the source
of conduction electrons is only intrinsic generation. Thus, for
such a material, the holes are the majority carriers and electrons
are minority carriers. Therefore, extrinsic semiconductors doped
with trivalent impurity are called p-type semiconductors. For
p-type semiconductors, the recombination process will reduce
the number (ni) of intrinsically generated electrons to ne. We
have, for p-type semiconductors
$n_h \gg n_e$    (14.4)

Note that the crystal maintains an overall charge neutrality
as the charge of additional charge carriers is just equal and
opposite to that of the ionised cores in the lattice.
In extrinsic semiconductors, because of the abundance of
majority current carriers, the minority carriers produced
thermally have more chance of meeting majority carriers and
thus getting destroyed. Hence, the dopant, by adding a large number of
current carriers of one type, which become the majority carriers, indirectly
helps to reduce the intrinsic concentration of minority carriers.
The semiconductor's energy band structure is affected by doping. In
the case of extrinsic semiconductors, additional energy states due to donor
impurities (ED ) and acceptor impurities (EA ) also exist. In the energy band
diagram of n-type Si semiconductor, the donor energy level ED is slightly
below the bottom EC of the conduction band and electrons from this level
move into the conduction band with very small supply of energy. At room
temperature, most of the donor atoms get ionised but very few (~$10^{-12}$)
atoms of Si get ionised. So the conduction band will have most electrons
coming from the donor impurities, as shown in Fig. 14.9(a). Similarly,
for p-type semiconductor, the acceptor energy level EA is slightly above
the top EV of the valence band as shown in Fig. 14.9(b). With very small
supply of energy an electron from the valence band can jump to the level
EA and ionise the acceptor negatively. (Alternately, we can also say that
with very small supply of energy the hole from level EA sinks down into
the valence band. Electrons rise up and holes fall down when they gain
external energy.) At room temperature, most of the acceptor atoms get
ionised leaving holes in the valence band. Thus at room temperature the
density of holes in the valence band is predominantly due to impurity in
the extrinsic semiconductor. The electron and hole concentration in a
semiconductor in thermal equilibrium is given by
$n_e n_h = n_i^2$    (14.5)

Though the above description is grossly approximate and
hypothetical, it helps in understanding the difference between metals,
insulators and semiconductors (extrinsic and intrinsic) in a simple
manner. The difference in the resistivity of C, Si and Ge depends upon
the energy gap between their conduction and valence bands. For C
(diamond), Si and Ge, the energy gaps are 5.4 eV, 1.1 eV and 0.7 eV,
respectively. Sn also is a group IV element but it is a metal because the
energy gap in its case is 0 eV.
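
The effect of the energy gap on the number of thermally excited carriers can be illustrated numerically. The sketch below uses the energy gaps quoted above for C, Si and Ge and assumes, as a commonly used approximation that is not derived in this chapter, that the fraction of thermally generated carriers scales as exp(−Eg/2kT). The temperature of 300 K and the value of k are also assumptions supplied here.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5  # Boltzmann constant in eV per kelvin
TEMPERATURE = 300.0           # kelvin, assumed room temperature

# Energy gaps quoted in the text, in eV.
ENERGY_GAP = {"C (diamond)": 5.4, "Si": 1.1, "Ge": 0.7}


def excitation_factor(gap_ev: float, temperature: float = TEMPERATURE) -> float:
    """Assumed scaling exp(-Eg / 2kT) for thermally generated carriers."""
    return math.exp(-gap_ev / (2.0 * K_BOLTZMANN_EV * temperature))


factors = {name: excitation_factor(gap) for name, gap in ENERGY_GAP.items()}
for name, factor in factors.items():
    print(f"{name}: exp(-Eg/2kT) ~ {factor:.1e}")
```

The factor for diamond is smaller than that for Si or Ge by dozens of orders of magnitude, in line with diamond behaving as an insulator.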

FIGURE 14.9 Energy bands of (a) n-type semiconductor at T > 0 K, (b) p-type
semiconductor at T > 0 K.

Example 14.2 Suppose a pure Si crystal has $5 \times 10^{28}$ atoms m$^{-3}$. It is
doped by 1 ppm concentration of pentavalent As. Calculate the
number of electrons and holes. Given that $n_i = 1.5 \times 10^{16}$ m$^{-3}$.

Solution Note that thermally generated electrons ($n_i \sim 10^{16}$ m$^{-3}$) are
negligibly small as compared to those produced by doping.
Therefore, $n_e \approx N_D$. A doping level of 1 ppm gives
$N_D = 10^{-6} \times 5 \times 10^{28} = 5 \times 10^{22}$ m$^{-3}$.
Since $n_e n_h = n_i^2$, the number of holes
$n_h = (2.25 \times 10^{32})/(5 \times 10^{22})$
$\sim 4.5 \times 10^{9}$ m$^{-3}$
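
The arithmetic of Example 14.2 can be checked with a few lines of code. This is a minimal sketch under the same assumptions as the worked solution: every donor atom is taken to be ionised and the donor density is taken to be much larger than ni, so that ne is approximately ND. The function name is illustrative.

```python
ATOM_DENSITY = 5.0e28        # Si atoms per cubic metre (from the example)
DOPING_FRACTION = 1.0e-6     # 1 ppm of pentavalent As
N_INTRINSIC = 1.5e16         # intrinsic carrier concentration, per cubic metre


def carrier_concentrations(atom_density, doping_fraction, n_i):
    """Return (n_e, n_h) for an n-type sample.

    Assumes every donor is ionised and that the donor density is much
    larger than n_i, so n_e is taken equal to N_D.
    """
    n_donor = atom_density * doping_fraction
    n_e = n_donor                 # n_e ~ N_D
    n_h = n_i ** 2 / n_e          # from n_e * n_h = n_i^2, Eq. (14.5)
    return n_e, n_h


n_e, n_h = carrier_concentrations(ATOM_DENSITY, DOPING_FRACTION, N_INTRINSIC)
print(f"n_e ~ {n_e:.1e} m^-3, n_h ~ {n_h:.1e} m^-3")
```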

14.5 p-n JUNCTION

http://hyperphysics.phy-astr.gsu.edu/hbase/solids/pnjun.html


A p-n junction is the basic building block of many semiconductor devices
like diodes, transistors, etc. A clear understanding of the junction behaviour
is important to analyse the working of other semiconductor devices.
We will now try to understand how a junction is formed and how the
junction behaves under the influence of external applied voltage (also
called bias).

14.5.1 p-n junction formation


Formation and working of p-n junction diode

Consider a thin p-type silicon (p-Si) semiconductor wafer. By adding
precisely a small quantity of pentavalent impurity, part of the p-Si wafer
can be converted into n-Si. There are several processes by which a
semiconductor can be formed. The wafer now contains p-region and
n-region and a metallurgical junction between p-, and n- region.
Two important processes occur during the formation of a p-n junction:
diffusion and drift. We know that in an n-type semiconductor, the
concentration of electrons (number of electrons per unit volume) is more
compared to the concentration of holes. Similarly, in a p-type
semiconductor, the concentration of holes is more than the concentration
of electrons. During the formation of p-n junction, and due to the
concentration gradient across p-, and n- sides, holes diffuse from p-side
to n-side (p → n) and electrons diffuse from n-side to p-side (n → p). This
motion of charge carriers gives rise to diffusion current across the junction.
When an electron diffuses from n → p, it leaves behind an ionised
donor on n-side. This ionised donor (positive charge) is immobile as it is
bonded to the surrounding atoms. As the electrons continue to diffuse
from n → p, a layer of positive charge (or positive space-charge region) on
n-side of the junction is developed.
Similarly, when a hole diffuses from p → n due to the concentration
gradient, it leaves behind an ionised acceptor (negative charge) which is
immobile. As the holes continue to diffuse, a layer of negative charge (or
negative space-charge region) on the p-side of the junction is developed.
This space-charge region on either side of the junction together is known
as depletion region as the electrons and holes taking part in the initial
movement across the junction depleted the region of its
free charges (Fig. 14.10). The thickness of depletion region
is of the order of one-tenth of a micrometre. Due to the
positive space-charge region on n-side of the junction and
negative space charge region on p-side of the junction,
an electric field directed from positive charge towards
negative charge develops. Due to this field, an electron on
p-side of the junction moves to n-side and a hole on n-side
of the junction moves to p-side. The motion of charge
carriers due to the electric field is called drift. Thus a
drift current, which is opposite in direction to the diffusion
current (Fig. 14.10) starts.

FIGURE 14.10 p-n junction formation process.


Initially, diffusion current is large and drift current is small.
As the diffusion process continues, the space-charge regions
on either side of the junction extend, thus increasing the electric
field strength and hence drift current. This process continues
until the diffusion current equals the drift current. Thus a p-n
junction is formed. In a p-n junction under equilibrium there
is no net current.
The loss of electrons from the n-region and the gain of
electrons by the p-region cause a difference of potential across
the junction of the two regions. The polarity of this potential is
such as to oppose further flow of carriers so that a condition of
equilibrium exists. Figure 14.11 shows the p-n junction at
equilibrium and the potential across the junction. The
n-material has lost electrons, and the p-material has acquired
electrons. The n-material is thus positive relative to the
p-material. Since this potential tends to prevent the movement of
electrons from the n-region into the p-region, it is often called a
barrier potential.

FIGURE 14.11 (a) Diode under equilibrium (V = 0), (b) Barrier potential under no bias.

Example 14.3 Can we take one slab of p-type semiconductor and
physically join it to another n-type semiconductor to get p-n junction?

Solution No! Any slab, howsoever flat, will have roughness much
larger than the inter-atomic crystal spacing (~2 to 3 Å) and hence
continuous contact at the atomic level will not be possible. The junction
will behave as a discontinuity for the flowing charge carriers.

14.6 SEMICONDUCTOR DIODE

A semiconductor diode [Fig. 14.12(a)] is basically a
p-n junction with metallic contacts provided at the
ends for the application of an external voltage. It is a
two-terminal device. A p-n junction diode is
symbolically represented as shown in Fig. 14.12(b).
The direction of arrow indicates the conventional
direction of current (when the diode is under forward
bias). The equilibrium barrier potential can be altered
by applying an external voltage V across the diode.
The situation of p-n junction diode under equilibrium
(without bias) is shown in Fig. 14.11(a) and (b).

FIGURE 14.12 (a) Semiconductor diode, (b) Symbol for p-n junction diode.

14.6.1 p-n junction diode under forward bias

When an external voltage V is applied across a semiconductor diode such
that p-side is connected to the positive terminal of the battery and n-side
to the negative terminal [Fig. 14.13(a)], it is said to be forward biased.
The applied voltage mostly drops across the depletion region and the
voltage drop across the p-side and n-side of the junction is negligible.
(This is because the resistance of the depletion region, a region where
there are no free charges, is very high compared to the resistance of n-side
and p-side.) The direction of the applied voltage (V) is opposite to the


built-in potential V0. As a result, the depletion layer width
decreases and the barrier height is reduced [Fig. 14.13(b)]. The
effective barrier height under forward bias is (V0 − V).
If the applied voltage is small, the barrier potential will be
reduced only slightly below the equilibrium value, and only a
small number of carriers in the material (those that happen to
be in the uppermost energy levels) will possess enough energy
to cross the junction. So the current will be small. If we increase
the applied voltage significantly, the barrier height will be reduced
and a larger number of carriers will have the required energy. Thus
the current increases.
Due to the applied voltage, electrons from n-side cross the
depletion region and reach p-side (where they are minority
carriers). Similarly, holes from p-side cross the junction and reach
the n-side (where they are minority carriers). This process under
forward bias is known as minority carrier injection. At the
junction boundary, on each side, the minority carrier
concentration increases significantly compared to the locations
far from the junction.
Due to this concentration gradient, the injected electrons on
p-side diffuse from the junction edge of p-side to the other end
of p-side. Likewise, the injected holes on n-side diffuse from the
junction edge of n-side to the other end of n-side
(Fig. 14.14). This motion of charged carriers on either side
gives rise to current. The total diode forward current is the sum
of hole diffusion current and conventional current due to
electron diffusion. The magnitude of this current is usually
in mA.

FIGURE 14.13 (a) p-n junction diode under forward bias, (b) Barrier potential (1) without battery, (2) Low battery voltage, and (3) High voltage battery.

14.6.2 p-n junction diode under reverse bias

FIGURE 14.14 Forward bias minority carrier injection.

When an external voltage (V) is applied across the diode such
that n-side is positive and p-side is negative, it is said to be
reverse biased [Fig. 14.15(a)]. The applied voltage mostly
drops across the depletion region. The direction of applied voltage is same
as the direction of barrier potential. As a result, the barrier height increases
and the depletion region widens due to the change in the electric field.
The effective barrier height under reverse bias is (V0 + V) [Fig. 14.15(b)].
This suppresses the flow of electrons from n → p and holes from p → n.
Thus, diffusion current decreases enormously compared to the diode
under forward bias.
The electric field direction of the junction is such that if electrons on
p-side or holes on n-side in their random motion come close to the
junction, they will be swept to their majority zone. This drift of carriers
gives rise to current. The drift current is of the order of a few µA. This is
quite low because it is due to the motion of carriers from their minority
side to their majority side across the junction. The drift current is also
there under forward bias but it is negligible (µA) when compared with
current due to injected carriers which is usually in mA.
The diode reverse current is not very much dependent on the applied
voltage. Even a small voltage is sufficient to sweep the minority carriers
from one side of the junction to the other side of the junction. The current


is not limited by the magnitude of the applied voltage but is
limited due to the concentration of the minority carrier on either
side of the junction.
The current under reverse bias is essentially voltage
independent up to a critical reverse bias voltage, known as
breakdown voltage (Vbr). When V = Vbr, the diode reverse current
increases sharply. Even a slight increase in the bias voltage causes
large change in the current. If the reverse current is not limited by
an external circuit below the rated value (specified by the
manufacturer) the p-n junction will get destroyed. Once it exceeds
the rated value, the diode gets destroyed due to overheating. This
can happen even for the diode under forward bias, if the forward
current exceeds the rated value.
FIGURE 14.15 (a) Diode under reverse bias, (b) Barrier potential under reverse bias.

The circuit arrangement for studying the V-I characteristics
of a diode (i.e., the variation of current as a function of applied
voltage) is shown in Fig. 14.16(a) and (b). The battery is connected
to the diode through a potentiometer (or rheostat) so that the
applied voltage to the diode can be changed. For different values
of voltages, the value of the current is noted. A graph between V
and I is obtained as in Fig. 14.16(c). Note that in forward bias
measurement, we use a milliammeter since the expected current is large
(as explained in the earlier section) while a microammeter is used in reverse
bias to measure the current. You can see in Fig. 14.16(c) that in forward

FIGURE 14.16 Experimental circuit arrangement for studying V-I characteristics of a p-n junction diode (a) in forward bias, (b) in reverse bias. (c) Typical V-I characteristics of a silicon diode.


bias, the current first increases very slowly, almost negligibly, till the
voltage across the diode crosses a certain value. After the characteristic
voltage, the diode current increases significantly (exponentially), even for
a very small increase in the diode bias voltage. This voltage is called the
threshold voltage or cut-in voltage (~0.2V for germanium diode and
~0.7 V for silicon diode).
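
The sharp rise of forward current beyond the cut-in voltage can be illustrated numerically. The sketch below assumes the ideal-diode (Shockley) relation I = I_S [exp(V / (N V_T)) − 1], a standard idealisation that is not derived in this chapter. The saturation current I_S, ideality factor N and thermal voltage V_T are illustrative assumptions, not values read from Fig. 14.16.

```python
import math

# Illustrative assumptions (not taken from Fig. 14.16):
I_S = 1e-14    # reverse saturation current, in A
N = 1.0        # ideality factor
V_T = 0.02585  # thermal voltage kT/q near 300 K, in V


def diode_current(v):
    """Ideal-diode (Shockley) current in amperes for a bias v in volts."""
    return I_S * (math.exp(v / (N * V_T)) - 1.0)


if __name__ == "__main__":
    for v in (-1.0, 0.0, 0.3, 0.5, 0.6, 0.7):
        print(f"V = {v:5.2f} V  ->  I = {diode_current(v):.3e} A")
```

Under these assumptions the current is negligible at low forward bias, grows by orders of magnitude over a few tenths of a volt, and settles at about −I_S in reverse bias, which mirrors the qualitative shape described above.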
For the diode in reverse bias, the current is very small (~µA) and almost
remains constant with change in bias. It is called reverse saturation
current. However, for special cases, at very high reverse bias (breakdown
voltage), the current suddenly increases. This special action of the diode
is discussed later in Section 14.8. General purpose diodes are not
used beyond the reverse saturation current region.
The above discussion shows that the p-n junction diode primarily
allows the flow of current only in one direction (forward bias). The forward
bias resistance is low as compared to the reverse bias resistance. This
property is used for rectification of ac voltages as discussed in the next
section. For diodes, we define a quantity called dynamic resistance as
the ratio of small change in voltage ΔV to a small change in current ΔI:

$$ r_d = \frac{\Delta V}{\Delta I} \tag{14.6} $$

FIGURE 14.17

Example 14.4 The V-I characteristic of a silicon diode is shown in
the Fig. 14.17. Calculate the resistance of the diode at (a) ID = 15 mA
and (b) VD = −10 V.

Solution Considering the diode characteristics as a straight line
between I = 10 mA to I = 20 mA passing through the origin, we can
calculate the resistance using Ohm's law.
(a) From the curve, at I = 20 mA, V = 0.8 V; at I = 10 mA, V = 0.7 V
rfb = ΔV/ΔI = 0.1 V/10 mA = 10 Ω
(b) From the curve at V = −10 V, I = −1 µA,
Therefore,
rrb = 10 V/1 µA = $1.0 \times 10^{7}$ Ω
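
The arithmetic of Example 14.4 can be checked with a short sketch. The function names are hypothetical helpers introduced here for illustration. The data points are those quoted in the solution, with the reverse-bias point taken as V = −10 V, I = −1 µA.

```python
def dynamic_resistance(v1, i1, v2, i2):
    """Return r_d = delta V / delta I in ohms from two points (volts, amperes)."""
    return (v2 - v1) / (i2 - i1)


def static_resistance(v, i):
    """Return V / I in ohms for a single operating point."""
    return v / i


# Points quoted in the solution of Example 14.4.
r_fb = dynamic_resistance(0.7, 10e-3, 0.8, 20e-3)  # forward bias
r_rb = static_resistance(-10.0, -1e-6)             # reverse bias

if __name__ == "__main__":
    print(f"r_fb = {r_fb:.1f} ohm, r_rb = {r_rb:.1e} ohm")
```

The two results differ by about six orders of magnitude, which is the quantitative content of the statement that the forward bias resistance is low compared to the reverse bias resistance.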


14.7 APPLICATION OF JUNCTION DIODE AS A RECTIFIER


From the V-I characteristic of a junction diode we see that it allows current
to pass only when it is forward biased. So if an alternating voltage is
applied across a diode the current flows only in that part of the cycle
when the diode is forward biased. This property
is used to rectify alternating voltages and the
circuit used for this purpose is called a rectifier.
If an alternating voltage is applied across a
diode in series with a load, a pulsating voltage will
appear across the load only during the half cycles
of the ac input during which the diode is forward
biased. Such a rectifier circuit, as shown in
Fig. 14.18, is called a half-wave rectifier. The
secondary of a transformer supplies the desired
ac voltage across terminals A and B. When the
voltage at A is positive, the diode is forward biased
and it conducts. When A is negative, the diode is
reverse-biased and it does not conduct. The reverse
saturation current of a diode is negligible and can
be considered equal to zero for practical purposes.
(The reverse breakdown voltage of the diode must
be sufficiently higher than the peak ac voltage at
the secondary of the transformer to protect the
diode from reverse breakdown.)

FIGURE 14.18 (a) Half-wave rectifier circuit, (b) Input ac voltage and output voltage waveforms from the rectifier circuit.

Therefore, in the positive half-cycle of ac there
is a current through the load resistor RL and we
get an output voltage, as shown in Fig. 14.18(b),
whereas there is no current in the negative half-cycle.
In the next positive half-cycle, again we get
the output voltage. Thus, the output voltage, though still varying, is
restricted to only one direction and is said to be rectified. Since the
rectified output of this circuit is only for half of the input ac wave it is
called a half-wave rectifier.
The circuit using two diodes, shown in Fig. 14.19(a), gives output
rectified voltage corresponding to both the positive as well as negative
half of the ac cycle. Hence, it is known as full-wave rectifier. Here the
p-side of the two diodes are connected to the ends of the secondary of the
transformer. The n-side of the diodes are connected together and the
output is taken between this common point of diodes and the midpoint
of the secondary of the transformer. So for a full-wave rectifier the
secondary of the transformer is provided with a centre tapping and so it
is called centre-tap transformer. As can be seen from Fig. 14.19(c) the
voltage rectified by each diode is only half the total secondary voltage.
Each diode rectifies only for half the cycle, but the two do so for alternate
cycles. Thus, the output between their common terminals and the centre-tap
of the transformer becomes a full-wave rectifier output. (Note that
there is another circuit of full wave rectifier which does not need a centre-tap
transformer but needs four diodes.) Suppose the input voltage to A


with respect to the centre tap at any instant
is positive. It is clear that, at that instant,
voltage at B being out of phase will be
negative as shown in Fig. 14.19(b). So, diode
D1 gets forward biased and conducts (while
D2 being reverse biased is not conducting).
Hence, during this positive half cycle we get
an output current (and an output voltage
across the load resistor RL) as shown in
Fig. 14.19(c). In the course of the ac cycle
when the voltage at A becomes negative with
respect to centre tap, the voltage at B would
be positive. In this part of the cycle diode
D1 would not conduct but diode D2 would,
giving an output current and output
voltage (across RL) during the negative half
cycle of the input ac. Thus, we get output
voltage during both the positive as well as
the negative half of the cycle. Obviously,
this is a more efficient circuit for getting
rectified voltage or current than the half-wave
rectifier.
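
The difference between the two rectifiers can be made concrete with a minimal sketch. It assumes ideal diodes (no cut-in voltage and zero reverse current, as the text allows for practical purposes) and a sinusoidal input of unit peak. The function names are hypothetical, and v_in stands for the voltage at A with respect to the lower end of the secondary (half-wave) or the centre tap (full-wave).

```python
import math


def half_wave(v_in):
    """Ideal half-wave rectifier: output follows only the positive half-cycle."""
    return max(v_in, 0.0)


def full_wave(v_in):
    """Ideal centre-tap full-wave rectifier: D1 conducts for v_in > 0, D2 for v_in < 0."""
    return abs(v_in)


def average(rectifier, peak=1.0, samples=10000):
    """Average output over one full cycle of a sine input with the given peak."""
    total = 0.0
    for k in range(samples):
        total += rectifier(peak * math.sin(2 * math.pi * k / samples))
    return total / samples


if __name__ == "__main__":
    print(f"half-wave average = {average(half_wave):.4f} x peak")
    print(f"full-wave average = {average(full_wave):.4f} x peak")
```

With these idealisations the full-wave output averages twice the half-wave output for the same peak voltage, which is one way of quantifying why the full-wave circuit is called more efficient.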
The rectified voltage is in the form of
pulses of the shape of half sinusoids.
Though it is unidirectional it does not have
a steady value. To get steady dc output
from the pulsating voltage normally a
capacitor is connected across the output
terminals (parallel to the load RL). One can
also use an inductor in series with RL for
the same purpose. Since these additional
circuits appear to filter out the ac ripple
and give a pure dc voltage, they are
called filters.

FIGURE 14.19 (a) A Full-wave rectifier circuit; (b) Input wave forms given to the diode D1 at A and to the diode D2 at B; (c) Output waveform across the load RL connected in the full-wave rectifier circuit.

Now we shall discuss the role of
capacitor in filtering. When the voltage
across the capacitor is rising, it gets
charged. If there is no external load, it remains charged to the peak voltage
of the rectified output. When there is a load, it gets discharged through
the load and the voltage across it begins to fall. In the next half-cycle of
rectified output it again gets charged to the peak value (Fig. 14.20). The
rate of fall of the voltage across the capacitor depends inversely upon the
product of the capacitance C and the effective resistance RL used in the circuit;
this product is called the time constant. To make the time constant large, the value of
C should be large. So capacitor input filters use large capacitors. The
output voltage obtained by using capacitor input filter is nearer to the
peak voltage of the rectified voltage. This type of filter is most widely
used in power supplies.
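
The effect of the time constant on the ripple can be sketched with a simple simulation. It assumes ideal diodes, so that the capacitor follows the full-wave rectified voltage whenever that exceeds its own voltage and otherwise discharges through the load with time constant RL C. The load resistance, peak voltage, supply frequency and capacitor values are illustrative assumptions, not values from Fig. 14.20.

```python
import math


def ripple(c_farad, r_load=1000.0, v_peak=10.0, freq=50.0,
           cycles=5, steps_per_cycle=2000):
    """Peak-to-peak ripple (V) across the capacitor during the last input cycle."""
    dt = 1.0 / (freq * steps_per_cycle)
    decay = math.exp(-dt / (r_load * c_farad))  # discharge factor per time step
    v_c = 0.0
    lo, hi = v_peak, 0.0
    total = cycles * steps_per_cycle
    for k in range(total):
        v_src = v_peak * abs(math.sin(2 * math.pi * freq * k * dt))
        # Charge to the source when it is higher, otherwise discharge via the load.
        v_c = max(v_src, v_c * decay)
        if k >= total - steps_per_cycle:  # record only the last cycle
            lo, hi = min(lo, v_c), max(hi, v_c)
    return hi - lo


if __name__ == "__main__":
    for c in (10e-6, 100e-6, 1000e-6):
        print(f"C = {c * 1e6:6.0f} uF  ->  ripple = {ripple(c):.3f} V")
```

Under these assumptions a larger capacitor gives a smaller ripple and an output closer to the peak of the rectified voltage, in line with the statement that capacitor input filters use large capacitors.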


FIGURE 14.20 (a) A full-wave rectifier with capacitor filter, (b) Input and output
voltage of rectifier in (a).

14.8 SPECIAL PURPOSE p-n JUNCTION DIODES

In this section, we shall discuss some devices which are basically junction
diodes but are developed for different applications.

14.8.1 Zener diode

It is a special purpose semiconductor diode, named after its inventor
C. Zener. It is designed to operate under reverse bias in the breakdown
region and used as a voltage regulator. The symbol for Zener diode is
shown in Fig. 14.21(a).
Zener diode is fabricated by heavily doping both p- and
n-sides of the junction. Due to this, depletion region formed
is very thin (<$10^{-6}$ m) and the electric field of the junction is
extremely high (~$5 \times 10^{6}$ V/m) even for a small reverse bias
voltage of about 5 V. The I-V characteristics of a Zener diode are
shown in Fig. 14.21(b). It is seen that when the applied reverse
bias voltage (V) reaches the breakdown voltage (Vz) of the Zener
diode, there is a large change in the current. Note that after
the breakdown voltage Vz, a large change in the current can
be produced by almost insignificant change in the reverse bias
voltage. In other words, Zener voltage remains constant, even
though current through the Zener diode varies over a wide
range. This property of the Zener diode is used for regulating
supply voltages so that they are constant.

FIGURE 14.21 Zener diode, (a) symbol, (b) I-V characteristics.

Let us understand how reverse current suddenly increases
at the breakdown voltage. We know that reverse current is
due to the flow of electrons (minority carriers) from p → n and
holes from n → p. As the reverse bias voltage is increased, the
electric field at the junction becomes significant. When the
reverse bias voltage V = Vz, then the electric field strength is
high enough to pull valence electrons from the host atoms on
the p-side which are accelerated to n-side. These electrons
account for high current observed at the breakdown. The
emission of electrons from the host atoms due to the high
electric field is known as internal field emission or field
ionisation. The electric field required for field ionisation is of
the order of $10^{6}$ V/m.
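
The field strength quoted for the Zener junction follows from a one-line estimate. The sketch below assumes a uniform field across the depletion region, which is a simplification, and uses the figures given in the text: a reverse bias of about 5 V across a width of about 10⁻⁶ m. Since the text gives the width only as an upper bound, the result is a lower estimate of the field.

```python
def junction_field(v_reverse, depletion_width):
    """Average electric field (V/m) for a voltage (V) across a region of given width (m)."""
    return v_reverse / depletion_width


# Figures quoted in the text for a Zener diode.
E = junction_field(5.0, 1e-6)
FIELD_IONISATION = 1e6  # V/m, order of magnitude quoted for field ionisation

if __name__ == "__main__":
    print(f"E = {E:.1e} V/m")
    print("Exceeds field-ionisation scale:", E > FIELD_IONISATION)
```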

Zener diode as a voltage regulator

We know that when the ac input voltage of a rectifier fluctuates, its rectified
output also fluctuates. To get a constant dc voltage from the dc
unregulated output of a rectifier, we use a Zener diode. The circuit diagram
of a voltage regulator using a Zener diode is shown in Fig. 14.22.
The unregulated dc voltage (filtered output of a
rectifier) is connected to the Zener diode through a series
resistance Rs such that the Zener diode is reverse biased.
If the input voltage increases, the current through Rs
and Zener diode also increases. This increases the
voltage drop across Rs without any change in the
voltage across the Zener diode. This is because in the
breakdown region, Zener voltage remains constant even
though the current through the Zener diode changes.
Similarly, if the input voltage decreases, the current
through Rs and Zener diode also decreases. The voltage
drop across Rs decreases without any change in the
voltage across the Zener diode. Thus any increase/decrease
in the input voltage results in increase/decrease
of the voltage drop across Rs without any
change in voltage across the Zener diode. Thus the Zener diode acts as a
voltage regulator. We have to select the Zener diode according to the
required output voltage and accordingly the series resistance Rs.

FIGURE 14.22 Zener diode as DC voltage regulator.

Example 14.5 In a Zener regulated power supply a Zener diode with
VZ = 6.0 V is used for regulation. The load current is to be 4.0 mA and
the unregulated input is 10.0 V. What should be the value of series
resistor RS?
Solution
The value of RS should be such that the current through the Zener
diode is much larger than the load current. This is to have good load
regulation. Choose Zener current as five times the load current, i.e.,
IZ = 20 mA. The total current through RS is, therefore, 24 mA. The
voltage drop across RS is 10.0 − 6.0 = 4.0 V. This gives
RS = 4.0 V/($24 \times 10^{-3}$ A) = 167 Ω. The nearest value of carbon resistor
is 150 Ω. So, a series resistor of 150 Ω is appropriate. Note that slight
variation in the value of the resistor does not matter; what is important
is that the current IZ should be sufficiently larger than IL.
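
The choice of series resistor in Example 14.5 can be reproduced with a short sketch. The function name and the `zener_factor` parameter are hypothetical, introduced here only to express the rule of thumb used in the solution (Zener current taken as five times the load current).

```python
def series_resistor(v_in, v_z, i_load, zener_factor=5.0):
    """Series resistance (ohm) such that I_Z = zener_factor * I_L flows in the Zener."""
    i_z = zener_factor * i_load
    i_total = i_z + i_load  # both currents pass through the series resistor
    return (v_in - v_z) / i_total


# Values from Example 14.5.
V_IN, V_Z, I_LOAD = 10.0, 6.0, 4.0e-3
r_s = series_resistor(V_IN, V_Z, I_LOAD)

# Zener current that results from the 150 ohm resistor chosen in the example.
i_z_150 = (V_IN - V_Z) / 150.0 - I_LOAD

if __name__ == "__main__":
    print(f"R_S = {r_s:.0f} ohm")
    print(f"I_Z with 150 ohm = {i_z_150 * 1e3:.1f} mA")
```

With the 150 Ω resistor the Zener current comes out slightly above the 20 mA target, so the condition that IZ be sufficiently larger than IL is still met.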

14.8.2 Optoelectronic junction devices

We have seen so far, how a semiconductor diode behaves under applied
electrical inputs. In this section, we learn about semiconductor diodes in
which carriers are generated by photons (photo-excitation). All these
devices are called optoelectronic devices. We shall study the functioning
of the following optoelectronic devices:
(i) Photodiodes used for detecting optical signal (photodetectors).
(ii) Light emitting diodes (LED) which convert electrical energy into light.
(iii) Photovoltaic devices which convert optical radiation into electricity
(solar cells).

(i) Photodiode

A photodiode is again a special purpose p-n
junction diode fabricated with a transparent
window to allow light to fall on the diode. It is
operated under reverse bias. When the photodiode
is illuminated with light (photons) with energy (hν)
greater than the energy gap (Eg) of the
semiconductor, then electron-hole pairs are
generated due to the absorption of photons. The
diode is fabricated such that the generation of
e-h pairs takes place in or near the depletion region
of the diode. Due to electric field of the junction,
electrons and holes are separated before they
recombine. The direction of the electric field is such
that electrons reach n-side and holes reach p-side.
Electrons are collected on n-side and holes are
collected on p-side giving rise to an emf. When an
external load is connected, current flows. The
magnitude of the photocurrent depends on the
intensity of incident light (photocurrent is
proportional to incident light intensity).
It is easier to observe the change in the current
with change in the light intensity, if a reverse bias
is applied. Thus photodiode can be used as a
photodetector to detect optical signals. The circuit
diagram used for the measurement of I-V
characteristics of a photodiode is shown in
Fig. 14.23(a) and a typical I-V characteristics in
Fig. 14.23(b).

FIGURE 14.23 (a) An illuminated photodiode under reverse bias, (b) I-V characteristics of a photodiode for different illumination intensity I4 > I3 > I2 > I1.

Example 14.6 The current in the forward bias is known to be more
(~mA) than the current in the reverse bias (~µA). What is the reason
then to operate the photodiodes in reverse bias?

Solution Consider the case of an n-type semiconductor. Obviously,
the majority carrier density (n) is considerably larger than the
minority hole density p (i.e., n >> p). On illumination, let the excess
electrons and holes generated be Δn and Δp, respectively:
n′ = n + Δn
p′ = p + Δp
Here n′ and p′ are the electron and hole concentrations* at any
particular illumination and n and p are carrier concentrations when
there is no illumination. Remember Δn = Δp and n >> p. Hence, the
fractional change in the majority carriers (i.e., Δn/n) would be much
less than that in the minority carriers (i.e., Δp/p). In general, we can
state that the fractional change due to the photo-effects on the
minority carrier dominated reverse bias current is more easily
measurable than the fractional change in the forward bias current.
Hence, photodiodes are preferably used in the reverse bias condition
for measuring light intensity.

* Note that, to create an e-h pair, we spend some energy (photoexcitation, thermal
excitation, etc.). Therefore when an electron and hole recombine the energy is
released in the form of light (radiative recombination) or heat (non-radiative
recombination). It depends on semiconductor and the method of fabrication of
the p-n junction. For the fabrication of LEDs, semiconductors like GaAs, GaAs-GaP
are used in which radiative recombination dominates.
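
The fractional-change argument of Example 14.6 can be illustrated with numbers. The carrier densities below are hypothetical values for an n-type sample, chosen only to satisfy n >> p, and the function name is introduced for illustration.

```python
def fractional_changes(n, p, delta):
    """Return (delta/n, delta/p) when illumination adds `delta` electrons and holes."""
    return delta / n, delta / p


# Hypothetical densities per cubic centimetre (assumed, not from the text):
N_MAJORITY = 1e16   # electrons in the n-type sample
P_MINORITY = 1e4    # holes in the same sample
EXCESS = 1e10       # photo-generated pairs, so that delta n = delta p

majority_change, minority_change = fractional_changes(N_MAJORITY, P_MINORITY, EXCESS)

if __name__ == "__main__":
    print(f"fractional change, majority carriers: {majority_change:.1e}")
    print(f"fractional change, minority carriers: {minority_change:.1e}")
```

With these assumed values the same number of photo-generated pairs changes the majority density by about one part in a million but multiplies the minority density many times over, which is why the minority-carrier-dominated reverse current is the more sensitive measure of light intensity.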


(ii) Light emitting diode

It is a heavily doped p-n junction which under forward bias emits
spontaneous radiation. The diode is encapsulated with a transparent
cover so that emitted light can come out.
When the diode is forward biased, electrons are sent from n → p (where
they are minority carriers) and holes are sent from p → n (where they are
minority carriers). At the junction boundary the concentration of minority
carriers increases compared to the equilibrium concentration (i.e., when
there is no bias). Thus at the junction boundary on either side of the
junction, excess minority carriers are there which recombine with majority
carriers near the junction. On recombination, the energy is released in
the form of photons. Photons with energy equal to or slightly less than
the band gap are emitted. When the forward current of the diode is small,
the intensity of light emitted is small. As the forward current increases,
intensity of light increases and reaches a maximum. Further increase in
the forward current results in decrease of light intensity. LEDs are biased
such that the light emitting efficiency is maximum.
The V-I characteristics of an LED are similar to those of a Si junction
diode. But the threshold voltages are much higher and slightly different
for each colour. The reverse breakdown voltages of LEDs are very low,
typically around 5 V. So care should be taken that high reverse voltages
do not appear across them.
LEDs that can emit red, yellow, orange, green and blue light are
commercially available. The semiconductor used for fabrication of visible
LEDs must at least have a band gap of 1.8 eV (spectral range of visible
light is from about 0.4 µm to 0.7 µm, i.e., from about 3 eV to 1.8 eV). The
compound semiconductor Gallium Arsenide Phosphide (GaAs$_{1-x}$P$_{x}$) is
used for making LEDs of different colours. GaAs$_{0.6}$P$_{0.4}$ (Eg ~ 1.9 eV) is
used for red LED. GaAs (Eg ~ 1.4 eV) is used for making infrared LED.
These LEDs find extensive use in remote controls, burglar alarm systems,
optical communication, etc. Extensive research is being done for
developing white LEDs which can replace incandescent lamps.
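
The link between band gap and emitted colour rests on the photon relation E = hν = hc/λ. The sketch below converts between photon energy and wavelength using standard values of the constants. The function names are hypothetical, and the band gaps are those quoted above.

```python
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light in vacuum, m/s
EV = 1.602176634e-19  # joules per electronvolt


def wavelength_nm(energy_ev):
    """Wavelength in nm of a photon with the given energy in eV."""
    return H * C / (energy_ev * EV) * 1e9


def photon_energy_ev(wavelength_in_nm):
    """Energy in eV of a photon with the given wavelength in nm."""
    return H * C / (wavelength_in_nm * 1e-9) / EV


if __name__ == "__main__":
    print(f"Visible range: {photon_energy_ev(700):.2f} eV to {photon_energy_ev(400):.2f} eV")
    print(f"Eg = 1.9 eV -> {wavelength_nm(1.9):.0f} nm")
    print(f"Eg = 1.4 eV -> {wavelength_nm(1.4):.0f} nm")
```

A band gap of about 1.9 eV corresponds to a wavelength in the red part of the visible range, while 1.4 eV lies beyond 0.7 µm in the infrared, consistent with the uses described for the two materials.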
LEDs have the following advantages over conventional incandescent
low power lamps:
(i) Low operational voltage and less power.
(ii) Fast action and no warm-up time required.
(iii) The bandwidth of emitted light is 100 Å to 500 Å, or in other words it
is nearly (but not exactly) monochromatic.
(iv) Long life and ruggedness.
(v) Fast on-off switching capability.

(iii) Solar cell

A solar cell is basically a p-n junction which
generates emf when solar radiation falls on the
p-n junction. It works on the same principle
(photovoltaic effect) as the photodiode, except that
no external bias is applied and the junction area
is kept much larger for solar radiation to be
incident because we are interested in more power.
A simple p-n junction solar cell is shown in
Fig. 14.24.

FIGURE 14.24 (a) Typical p-n junction solar cell; (b) Cross-sectional view.

A p-Si wafer of about 300 µm is taken over
which a thin layer (~0.3 µm) of n-Si is grown on
one side by diffusion process. The other side of
p-Si is coated with a metal (back contact). On the
top of n-Si layer, metal finger electrode (or metallic
grid) is deposited. This acts as a front contact. The
metallic grid occupies only a very small fraction
of the cell area (<15%) so that light can be incident
on the cell from the top.
The generation of emf by a solar cell, when light falls on it, is due to
the following three basic processes: generation, separation and collection:
(i) generation of e-h pairs due to light (with hν > Eg)
close to the junction; (ii) separation of electrons and
holes due to electric field of the depletion region.
Electrons are swept to n-side and holes to p-side;
(iii) the electrons reaching the n-side are collected by
the front contact and holes reaching p-side are collected
by the back contact. Thus p-side becomes positive and
n-side becomes negative giving rise to photovoltage.
When an external load is connected as shown in
the Fig. 14.25(a) a photocurrent IL flows through the
load. A typical I-V characteristics of a solar cell is shown
in the Fig. 14.25(b).
Note that the I-V characteristics of solar cell is
drawn in the fourth quadrant of the coordinate axes.
This is because a solar cell does not draw current but
supplies the same to the load.

FIGURE 14.25 (a) A typical illuminated p-n junction solar cell; (b) I-V characteristics of a solar cell.

Semiconductors with band gap close to 1.5 eV are
ideal materials for solar cell fabrication. Solar cells are
made with semiconductors like Si (Eg = 1.1 eV), GaAs
(Eg = 1.43 eV), CdTe (Eg = 1.45 eV), CuInSe2 (Eg = 1.04
eV), etc. The important criteria for the selection of a
material for solar cell fabrication are (i) band gap (~1.0
to 1.8 eV), (ii) high optical absorption (~$10^{4}$ cm$^{-1}$), (iii)
electrical conductivity, (iv) availability of the raw
material, and (v) cost. Note that sunlight is not always
required for a solar cell. Any light with photon energies
greater than the bandgap will do. Solar cells are used
to power electronic devices in satellites and space
vehicles and also as power supply to some calculators. Production of
low-cost photovoltaic cells for large-scale solar energy is a topic
for research.

Example 14.7 Why are Si and GaAs preferred materials for
solar cells?

Solution The solar radiation spectrum received by us is shown in
Fig. 14.26.

FIGURE 14.26

The maximum is near 1.5 eV. For photo-excitation, hν > Eg. Hence,
semiconductor with band gap ~1.5 eV or lower is likely to give better
solar conversion efficiency. Silicon has Eg ~ 1.1 eV while for GaAs it is
~1.43 eV. In fact, GaAs is better (in spite of its higher band gap) than
Si because of its relatively higher absorption coefficient. If we choose
materials like CdS or CdSe (Eg ~ 2.4 eV), we can use only the high
energy component of the solar energy for photo-conversion and a
significant part of energy will be of no use.
The question arises: why do we not use material like PbS (Eg ~ 0.4 eV)
which satisfies the condition hν > Eg for the maximum of the
solar radiation spectrum? If we do so, most of the solar radiation will be
absorbed on the top-layer of solar cell and will not reach in or near
the depletion region. For effective electron-hole separation, due to
the junction field, we want the photo-generation to occur in the
junction region only.
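
The selection argument of Example 14.7 can be summarised as two simple checks per material: whether photons near the spectral maximum (about 1.5 eV) satisfy hν > Eg, and whether the band gap falls in the preferred range of roughly 1.0 to 1.8 eV quoted earlier. The sketch below uses the band gaps given in this section; the function names are hypothetical, and the checks deliberately ignore absorption coefficient, cost and the other criteria listed in the text.

```python
# Band gaps in eV as quoted in this section.
BAND_GAP_EV = {
    "Si": 1.1,
    "GaAs": 1.43,
    "CdTe": 1.45,
    "CuInSe2": 1.04,
    "CdS": 2.4,
    "PbS": 0.4,
}


def absorbs(material, photon_ev):
    """True if a photon of the given energy can create an e-h pair (h nu > Eg)."""
    return photon_ev > BAND_GAP_EV[material]


def in_preferred_range(material, low=1.0, high=1.8):
    """True if the band gap lies in the range quoted for solar cell materials."""
    return low <= BAND_GAP_EV[material] <= high


if __name__ == "__main__":
    for name in BAND_GAP_EV:
        print(f"{name:8s} absorbs 1.5 eV: {absorbs(name, 1.5)!s:5s} "
              f"preferred range: {in_preferred_range(name)}")
```

Si and GaAs pass both checks. CdS fails the first for photons near the maximum, and PbS passes the first but falls outside the preferred range, for the reason given in the solution.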

14.9 JUNCTION TRANSISTOR

The credit of inventing the transistor in the year 1947 goes to J. Bardeen
and W.H. Brattain of Bell Telephone Laboratories, U.S.A. That transistor
was a point-contact transistor. The first junction transistor consisting of
two back-to-back p-n junctions was invented by William Shockley
in 1951.
As long as only the junction transistor was known, it was known
simply as transistor. But over the years new types of transistors were
invented and to differentiate it from the new ones it is now called the
Bipolar Junction Transistor (BJT). Even now, often the word transistor

is used to mean BJT when there is no confusion. Since our study is
limited to only BJT, we shall use the word transistor for BJT without
any ambiguity.

14.9.1 Transistor: structure and action

A transistor has three doped regions forming two p-n junctions
between them. Obviously, there are two types of transistors, as shown
in Fig. 14.27.
(i) n-p-n transistor: Here two segments of n-type semiconductor
(emitter and collector) are separated by a segment of p-type
semiconductor (base).
(ii) p-n-p transistor: Here two segments of p-type semiconductor
(termed as emitter and collector) are separated by a segment of
n-type semiconductor (termed as base).
The schematic representations of an n-p-n and a p-n-p
configuration are shown in Fig. 14.27(a). All the three segments of a
transistor have different thickness and their doping levels are also
different. In the schematic symbols used for representing p-n-p and
n-p-n transistors [Fig. 14.27(b)] the arrowhead shows the direction of
conventional current in the transistor. A brief description of the three
segments of a transistor is given below:
Emitter: This is the segment on one side of the transistor shown in
Fig. 14.27(a). It is of moderate size and heavily doped. It supplies
a large number of majority carriers for the current flow through
the transistor.
Base: This is the central segment. It is very thin and lightly doped.
Collector: This segment collects a major portion of the majority
carriers supplied by the emitter. The collector side is moderately
doped and larger in size as compared to the emitter.
We have seen earlier in the case of a p-n junction, that there is a
formation of depletion region across the junction. In case of a transistor
depletion regions are formed at the emitter-base junction and the base-collector
junction. For understanding the action of a transistor, we
have to consider the nature of depletion regions formed at these
junctions. The charge carriers move across different regions of the
transistor when proper voltages are applied across its terminals.
The biasing of the transistor is done differently for different uses.
The transistor can be used in two distinct ways. Basically, it was
invented to function as an amplifier, a device which produces an enlarged
copy of a signal. But later its use as a switch acquired equal
importance. We shall study both these functions and the ways the
transistor is biased to achieve these mutually exclusive functions.
First we shall see what gives the transistor its amplifying capabilities.
The transistor works as an amplifier, with its emitter-base junction
forward biased and the base-collector junction reverse biased. This
situation is shown in Fig. 14.28, where VCC and VEE are used for creating
the respective biasing. When the transistor is biased in this way it is
said to be in active state. We represent the voltage between emitter and
base as VEB and that between the collector and the base as VCB. In

FIGURE 14.27 (a) Schematic representations of an n-p-n transistor and a p-n-p transistor, and (b) symbols for n-p-n and p-n-p transistors.


Fig. 14.28, base is a common terminal for the two
power supplies whose other terminals are
connected to emitter and collector, respectively. So
the two power supplies are represented as VEE, and
VCC, respectively. In circuits, where emitter is the
common terminal, the power supply between the
base and the emitter is represented as VBB and that
between collector and emitter as VCC.
Let us see now the paths of current carriers in
the transistor with emitter-base junction forward
biased and base-collector junction reverse biased.
The heavily doped emitter has a high concentration
of majority carriers, which will be holes in a p-n-p
transistor and electrons in an n-p-n transistor.
These majority carriers enter the base region in
large numbers. The base is thin and lightly doped.
So the majority carriers there would be few. In a
p-n-p transistor the majority carriers in the base
are electrons since base is of n-type semiconductor.
The large number of holes entering the base from
the emitter swamps the small number of electrons
there. As the base-collector junction is reverse-biased, these holes, which appear as minority
carriers at the junction, can easily cross the
junction and enter the collector. The holes in the
base could move either towards the base terminal
to combine with the electrons entering from outside
or cross the junction to enter into the collector and
reach the collector terminal. The base is made thin
so that most of the holes find themselves near
the reverse-biased base-collector junction and so
cross the junction instead of moving to the base
terminal.

FIGURE 14.28 Bias voltage applied on: (a) p-n-p transistor and (b) n-p-n transistor.

It is interesting to note that due to forward
bias a large current enters the emitter-base
junction, but most of it is diverted to adjacent reverse-biased base-collector
junction and the current coming out of the base becomes a very small
fraction of the current that entered the junction. If we represent the hole
current and the electron current crossing the forward biased junction by
Ih and Ie respectively then the total current in a forward biased diode is
the sum Ih + Ie. We see that the emitter current IE = Ih + Ie but the base
current IB << Ih + Ie, because a major part of IE goes to collector instead of
coming out of the base terminal. The base current is thus a small fraction
of the emitter current.
The current entering into the emitter from outside is equal to the
emitter current IE. Similarly the current emerging from the base terminal
is IB and that from collector terminal is IC. It is obvious from the above
description and also from a straightforward application of Kirchhoff's
law to Fig. 14.28(a) that the emitter current is the sum of collector current
and base current:

$$I_E = I_C + I_B \tag{14.7}$$


We also see that IC ≈ IE.
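As a quick numerical illustration of Eq. (14.7), the sketch below assumes hypothetical values of the emitter and base currents (they are not taken from the text) and shows that the collector current is then nearly equal to the emitter current.

```python
def collector_current(i_e, i_b):
    """Return IC from Eq. (14.7), IE = IC + IB (any consistent unit)."""
    return i_e - i_b

# Hypothetical currents in mA, chosen only for illustration.
i_e_mA = 5.00
i_b_mA = 0.05

i_c_mA = collector_current(i_e_mA, i_b_mA)
ratio = i_c_mA / i_e_mA  # close to 1 because IB is a small fraction of IE

print(f"IC = {i_c_mA:.2f} mA, IC/IE = {ratio:.2f}")
```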


Our description of the direction of motion of the holes is identical
with the direction of the conventional current. But the direction of motion
of electrons is just opposite to that of the current. Thus in a p-n-p
transistor the current enters from emitter into base whereas in an n-p-n
transistor it enters from the base into the emitter. The arrowhead in the
emitter shows the direction of the conventional current.
The description about the paths followed by the majority and minority
carriers in an n-p-n transistor is exactly the same as that for the p-n-p transistor.
But the current paths are exactly opposite, as shown in Fig. 14.28. In
Fig. 14.28(b) the electrons are the majority carriers supplied by the
n-type emitter region. They cross the thin p-base region and are able to
reach the collector to give the collector current, IC . From the above
description we can conclude that in the active state of the transistor the
emitter-base junction acts as a low resistance while the base-collector
junction acts as a high resistance.

14.9.2 Basic transistor circuit configurations and transistor characteristics
In a transistor, only three terminals are available, viz., Emitter (E), Base
(B) and Collector (C). Therefore, in a circuit the input/output connections
have to be such that one of these (E, B or C) is common to both the input
and the output. Accordingly, the transistor can be connected in either of
the following three configurations:
Common Emitter (CE), Common Base (CB), Common Collector (CC)
The transistor is most widely used in the CE configuration and we
shall restrict our discussion to only this configuration. Since more
commonly used transistors are n-p-n Si transistors, we shall confine
our discussion to such transistors only. With p-n-p transistors the
polarities of the external power supplies are to be inverted.
Common emitter transistor characteristics


When a transistor is used in CE configuration, the input
is between the base and the emitter and the output is
between the collector and the emitter. The variation of
the base current IB with the base-emitter voltage VBE is
called the input characteristic. Similarly, the variation
of the collector current IC with the collector-emitter
voltage VCE is called the output characteristic. You will
see that the output characteristics are controlled by
the input characteristics. This implies that the collector
current changes with the base current.
The input and the output characteristics of an
n-p-n transistor can be studied by using the circuit
shown in Fig. 14.29.
To study the input characteristics of the transistor
in CE configuration, a curve is plotted between the base
current IB against the base-emitter voltage VBE. The

FIGURE 14.29 Circuit arrangement for studying the input and output characteristics of n-p-n transistor in CE configuration.


collector-emitter voltage VCE is kept fixed while
studying the dependence of IB on VBE. We are
interested in obtaining the input characteristic
when the transistor is in active state. So the
collector-emitter voltage VCE is kept large
enough to make the base-collector junction
reverse biased. Since VCE = VCB + VBE and for Si
transistor VBE is 0.6 to 0.7 V, VCE must be
sufficiently larger than 0.7 V. Since the
transistor is operated as an amplifier over large
range of VCE, the reverse bias across the base-collector
junction is high most of the time.
Therefore, the input characteristics may be
obtained for VCE somewhere in the range of 3 V
to 20 V. Since the increase in VCE appears as
increase in VCB, its effect on IB is negligible. As
a consequence, input characteristics for various
values of VCE will give almost identical curves.
Hence, it is enough to determine only one input
characteristic. The input characteristic of a
transistor is as shown in Fig. 14.30(a).
The output characteristic is obtained by
observing the variation of IC as VCE is varied
keeping IB constant. It is obvious that if VBE is
increased by a small amount, both hole current
from the emitter region and the electron current
from the base region will increase. As a
consequence both IB and IC will increase
proportionately. This shows that when IB
increases IC also increases. The plot of IC versus
VCE for one fixed value of IB gives one
output characteristic. So there will be different output characteristics
corresponding to different values of IB as shown in Fig. 14.30(b).

FIGURE 14.30 (a) Typical input characteristics, and (b) Typical output characteristics.

The linear segments of both the input and output characteristics can
be used to calculate some important ac parameters of transistors as
shown below.
(i) Input resistance (ri): This is defined as the ratio of change in base-emitter
voltage (ΔVBE) to the resulting change in base current (ΔIB) at
constant collector-emitter voltage (VCE). This is dynamic (ac) resistance
and as can be seen from the input characteristic, its value varies with
the operating current in the transistor:


$$r_i = \left(\frac{\Delta V_{BE}}{\Delta I_B}\right)_{V_{CE}} \tag{14.8}$$

The value of ri can be anything from a few hundreds to a few thousand
ohms.
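Eq. (14.8) can be applied to two nearby points read off an input characteristic at fixed VCE. The sketch below is only an illustration: the two readings are hypothetical and the helper name `dynamic_resistance` is an assumption introduced here.

```python
def dynamic_resistance(v1, i1, v2, i2):
    """Slope ΔV/ΔI between two nearby points on a characteristic, in ohms."""
    return (v2 - v1) / (i2 - i1)

# Hypothetical readings at fixed VCE: VBE in volts, IB in amperes.
v_be_1, i_b_1 = 0.65, 20e-6
v_be_2, i_b_2 = 0.67, 30e-6

r_i = dynamic_resistance(v_be_1, i_b_1, v_be_2, i_b_2)
print(f"ri is about {r_i:.0f} ohm")
```

With these assumed readings the result lies in the range of a few thousand ohms quoted above.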

(ii) Output resistance (ro): This is defined as the ratio of change in
collector-emitter voltage (ΔVCE) to the change in collector current (ΔIC)
at a constant base current IB.

$$r_o = \left(\frac{\Delta V_{CE}}{\Delta I_C}\right)_{I_B} \tag{14.9}$$

The output characteristics show that initially for very small values of
VCE , IC increases almost linearly. This happens because the base-collector
junction is not reverse biased and the transistor is not in active state. In
fact, the transistor is in the saturation state and the current is controlled
by the supply voltage VCC (=VCE) in this part of the characteristic. When
VCE is more than that required to reverse bias the base-collector junction,
IC increases very little with VCE. The reciprocal of the slope of the linear
part of the output characteristic gives the values of ro. The output
resistance of the transistor is mainly controlled by the bias of the base-collector
junction. The high magnitude of the output resistance (of the
order of 100 kΩ) is due to the reverse-biased state of this diode. This
also explains why the resistance at the initial part of the characteristic,
when the transistor is in saturation state, is very low.
(iii) Current amplification factor (β): This is defined as the ratio of
the change in collector current to the change in base current at a
constant collector-emitter voltage (VCE) when the transistor is in
active state.

$$\beta_{ac} = \left(\frac{\Delta I_C}{\Delta I_B}\right)_{V_{CE}} \tag{14.10}$$

This is also known as small signal current gain and its value is very
large.
If we simply find the ratio of IC and IB we get what is called βdc of the
transistor. Hence,

$$\beta_{dc} = \frac{I_C}{I_B} \tag{14.11}$$

Since IC increases with IB almost linearly and IC = 0 when IB = 0, the values
of both βdc and βac are nearly equal. So, for most calculations βdc can be
used. Both βac and βdc vary with VCE and IB (or IC) slightly.

Example 14.8 From the output characteristics shown in Fig. 14.30(b),
calculate the values of βac and βdc of the transistor when VCE is
10 V and IC = 4.0 mA.

Solution

$$\beta_{ac} = \left(\frac{\Delta I_C}{\Delta I_B}\right)_{V_{CE}}, \qquad \beta_{dc} = \frac{I_C}{I_B}$$

For determining βac and βdc at the stated values of VCE and IC one can
proceed as follows. Consider any two characteristics for two values
of IB which lie above and below the given value of IC. Here IC = 4.0 mA.
(Choose characteristics for IB = 30 and 20 µA.) At VCE = 10 V we read
the two values of IC from the graph. Then
ΔIB = (30 – 20) µA = 10 µA, ΔIC = (4.5 – 3.0) mA = 1.5 mA
Therefore, βac = 1.5 mA / 10 µA = 150
For determining βdc, either estimate the value of IB corresponding to
IC = 4.0 mA at VCE = 10 V or calculate the two values of βdc for the two
characteristics chosen and find their mean.
Therefore, for IC = 4.5 mA and IB = 30 µA,
βdc = 4.5 mA / 30 µA = 150
and for IC = 3.0 mA and IB = 20 µA
βdc = 3.0 mA / 20 µA = 150
Hence, βdc = (150 + 150)/2 = 150
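The arithmetic of Example 14.8 can be checked with a short script. This is a minimal sketch that uses the two graph readings quoted in the solution; the function names are introduced here for illustration only.

```python
def beta_ac(i_c1, i_b1, i_c2, i_b2):
    """Small-signal current gain, Eq. (14.10): ΔIC / ΔIB at fixed VCE."""
    return (i_c2 - i_c1) / (i_b2 - i_b1)

def beta_dc(i_c, i_b):
    """DC current gain, Eq. (14.11): IC / IB."""
    return i_c / i_b

# Readings at VCE = 10 V quoted in the solution (currents in amperes).
i_b_low, i_c_low = 20e-6, 3.0e-3
i_b_high, i_c_high = 30e-6, 4.5e-3

b_ac = beta_ac(i_c_low, i_b_low, i_c_high, i_b_high)
# Mean of the two dc gains, as done in the worked solution.
b_dc = (beta_dc(i_c_low, i_b_low) + beta_dc(i_c_high, i_b_high)) / 2

print(f"beta_ac = {b_ac:.0f}, beta_dc = {b_dc:.0f}")
```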

14.9.3 Transistor as a device


The transistor can be used in different device applications depending on the
configuration used (namely CB, CC and CE), the biasing of the E-B and
B-C junctions and the operation region, namely cutoff, active region and
saturation. As mentioned earlier we have confined ourselves to the CE
configuration and will be concentrating on the biasing and the operation
region to understand the working of a device.
When the transistor is used in the cutoff
or saturation state it acts as a switch. On
the other hand for using the transistor as
an amplifier, it has to operate in the active
region.
(i) Transistor as a switch
We shall try to understand the operation of
the transistor as a switch by analysing the
behaviour of the base-biased transistor in
CE configuration as shown in Fig. 14.31(a).
Applying Kirchhoff's voltage rule to the
input and output sides of this circuit, we
get

$$V_{BB} = I_B R_B + V_{BE} \tag{14.12}$$

and

$$V_{CE} = V_{CC} - I_C R_C \tag{14.13}$$

We shall treat VBB as the dc input
voltage Vi and VCE as the dc output voltage
Vo. So, we have

$$V_i = I_B R_B + V_{BE} \quad \text{and} \quad V_o = V_{CC} - I_C R_C$$

Let us see how Vo changes as Vi
increases from zero onwards. In the case
of Si transistor, as long as input Vi is less
than 0.6 V, the transistor will be in cut off
state and current IC will be zero.
Hence Vo = VCC

FIGURE 14.31 (a) Base-biased transistor in CE configuration, (b) Transfer characteristic.


When Vi becomes greater than 0.6 V the transistor is in active state with
some current IC in the output path and the output Vo decreases as the
term ICRC increases. With increase of Vi, IC increases almost linearly
and so Vo decreases linearly till its value becomes less than
about 1.0 V.
Beyond this, the change becomes non-linear and the transistor goes into
saturation state. With further increase in Vi the output voltage is found to
decrease further towards zero though it may never become zero. If we plot
the Vo vs Vi curve [also called the transfer characteristic of the base-biased
transistor, Fig. 14.31(b)], we see that between cut off state and active state
and also between active state and saturation state there are regions of
non-linearity showing that the transitions from cutoff state to active state
and from active state to saturation state are not sharply defined.
Let us see now how the transistor is operated as a switch. As long as
Vi is low and unable to forward-bias the transistor, Vo is high (at VCC ). If
Vi is high enough to drive the transistor into saturation, then Vo is low,
very near to zero. When the transistor is not conducting it is said to be
switched off and when it is driven into saturation it is said to be switched
on. This shows that if we define low and high states as below and above
certain voltage levels corresponding to cutoff and saturation of the
transistor, then we can say that a low input switches the transistor off
and a high input switches it on. Alternatively, we can say that a low
input to the transistor gives a high output and a high input gives a low
output. The switching circuits are designed in such a way that the
transistor does not remain in active state.
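The switching behaviour described above can be sketched with an idealised, piecewise-linear model of the transfer characteristic. This is only a rough illustration under stated assumptions: VBE is held fixed at 0.6 V once the transistor conducts, IC = β·IB in the active region, the transitions are taken as sharp (the real curve is rounded, as noted in the text), and all component values and the function name are hypothetical.

```python
def transfer_output(v_i, v_cc=5.0, r_b=100e3, r_c=1e3, beta=250.0,
                    v_be_on=0.6, v_ce_sat=0.0):
    """Idealised Vo for a base-biased CE stage (all values are assumptions)."""
    if v_i <= v_be_on:
        # Cutoff: no collector current, so the output sits at VCC.
        return v_cc
    # Input loop, Vi = IB*RB + VBE, with VBE held at v_be_on.
    i_b = (v_i - v_be_on) / r_b
    # Output loop, Vo = VCC - IC*RC, with IC = beta * IB.
    v_o = v_cc - beta * i_b * r_c
    # Saturation: the output cannot fall below the assumed saturation level.
    return max(v_o, v_ce_sat)

for v_i in (0.3, 1.6, 5.0):
    print(f"Vi = {v_i:.1f} V -> Vo = {transfer_output(v_i):.2f} V")
```

A low input leaves the output high (switched off) and a sufficiently high input drives the output to its low level (switched on), which is the inverting behaviour the text describes.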
(ii) Transistor as an amplifier

For using the transistor as an amplifier we will use the active region of
the Vo versus Vi curve. The slope of the linear part of the curve represents
the rate of change of the output with the input. It is negative because the
output is VCC – ICRC and not ICRC. That is why as input voltage of the CE
amplifier increases its output voltage decreases and the output is said to
be out of phase with the input. If we consider ΔVo and ΔVi as small
changes in the output and input voltages then ΔVo/ΔVi is called the small
signal voltage gain AV of the amplifier.
If the VBB voltage has a fixed value corresponding to the mid point of
the active region, the circuit will behave as a CE amplifier with voltage
gain ΔVo/ΔVi. We can express the voltage gain AV in terms of the resistors
in the circuit and the current gain of the transistor as follows.
We have, Vo = VCC – ICRC
Therefore, ΔVo = 0 – RC ΔIC
Similarly, from Vi = IBRB + VBE
ΔVi = RB ΔIB + ΔVBE
But ΔVBE is negligibly small in comparison to ΔIBRB in this circuit.
So, the voltage gain of this CE amplifier (Fig. 14.32) is given by

$$A_V = -\frac{R_C\,\Delta I_C}{R_B\,\Delta I_B} = -\beta_{ac}\,\frac{R_C}{R_B} \tag{14.14}$$

where βac is equal to ΔIC/ΔIB from Eq. (14.10). Thus the linear
portion of the active region of the transistor can be exploited for the use
in amplifiers. Transistor as an amplifier (CE configuration) is discussed
in detail in the next section.
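Eq. (14.14) is easy to evaluate numerically. The sketch below assumes hypothetical values for βac, RC and RB (they are not given at this point in the text) and neglects ΔVBE, as the derivation does.

```python
def base_biased_voltage_gain(beta_ac, r_c, r_b):
    """Eq. (14.14): AV = -beta_ac * RC / RB (ΔVBE neglected)."""
    return -beta_ac * r_c / r_b

# Hypothetical values, chosen only for illustration.
beta_ac = 250.0
r_c = 1e3      # ohm
r_b = 100e3    # ohm

a_v = base_biased_voltage_gain(beta_ac, r_c, r_b)
print(f"AV = {a_v:.1f}")  # negative sign: output is out of phase with input
```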

14.9.4 Transistor as an Amplifier (CE-Configuration)


To operate the transistor as an amplifier it is necessary to fix its operating
point somewhere in the middle of its active region. If we fix the value of
VBB corresponding to a point in the middle of the linear part of the transfer
curve then the dc base current IB would be constant and corresponding
collector current IC will also be constant. The dc voltage VCE = VCC – ICRC
would also remain constant. The operating values of VCE and IB determine
the operating point of the amplifier.
If a small sinusoidal voltage with amplitude vs is superposed on
the dc base bias by connecting the source of that signal in series with the
VBB supply, then the base current will have sinusoidal variations
superimposed on the value of IB. As a consequence the collector current
also will have sinusoidal variations
superimposed on the value of IC, producing
in turn corresponding change in the value
of Vo. We can measure the ac variations
across the input and output terminals by
blocking the dc voltages by large capacitors.
In the description of the amplifier given
above we have not considered any ac signal.
In general, amplifiers are used to amplify
alternating signals. Now let us superimpose
an ac input signal vi (to be amplified) on the
bias VBB (dc) as shown in Fig. 14.32. The
output is taken between the collector and
the ground.

FIGURE 14.32 A simple circuit of a CE-transistor amplifier.

The working of an amplifier can be easily understood, if we first
assume that vi = 0. Then applying Kirchhoff's law to the output loop,
we get

$$V_{CC} = V_{CE} + I_C R_L \tag{14.15}$$

Likewise, the input loop gives

$$V_{BB} = V_{BE} + I_B R_B \tag{14.16}$$

When vi is not zero, we get

$$V_{BB} + v_i = V_{BE} + I_B R_B + \Delta I_B (R_B + r_i)$$

The change in VBE can be related to the input resistance ri [see
Eq. (14.8)] and the change in IB. Hence

$$v_i = \Delta I_B (R_B + r_i) = r\,\Delta I_B$$

where r = RB + ri. The change in IB causes a change in IC. We define a parameter βac,
which is similar to the βdc defined in Eq. (14.11), as

$$\beta_{ac} = \frac{\Delta I_C}{\Delta I_B} = \frac{i_c}{i_b} \tag{14.17}$$

which is also known as the ac current gain Ai. Usually βac is close to βdc
in the linear region of the output characteristics.
The change in IC due to a change in IB causes a change in VCE and the
voltage drop across the resistor RL because VCC is fixed.
These changes can be given by Eq. (14.15) as

$$\Delta V_{CC} = \Delta V_{CE} + R_L \Delta I_C = 0$$

or

$$\Delta V_{CE} = -R_L \Delta I_C$$

The change in VCE is the output voltage vo. From Eq. (14.10), we get

$$v_o = \Delta V_{CE} = -\beta_{ac} R_L \Delta I_B$$

The voltage gain of the amplifier is

$$A_v = \frac{v_o}{v_i} = \frac{\Delta V_{CE}}{r\,\Delta I_B} = -\frac{\beta_{ac} R_L}{r} \tag{14.18}$$

The negative sign represents that the output voltage is opposite in phase
to the input voltage.
From the discussion of the transistor characteristics you have seen
that there is a current gain βac in the CE configuration. Here we have also
seen the voltage gain Av. Therefore the power gain Ap can be expressed
as the product of the current gain and voltage gain. Mathematically

$$A_p = \beta_{ac} \times A_v \tag{14.19}$$

Since βac and the magnitude of Av are greater than 1, we get ac power gain. However it
should be realised that transistor is not a power generating device. The
energy for the higher ac power at the output is supplied by the battery.
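The gains of Eqs. (14.18) and (14.19) can be evaluated as follows. This is a sketch under assumed, purely illustrative component values; the power gain is reported as a magnitude, since the sign of Av only indicates the phase reversal.

```python
def ce_voltage_gain(beta_ac, r_l, r_b, r_i):
    """Eq. (14.18): Av = -beta_ac * RL / r, taking r = RB + ri."""
    r = r_b + r_i
    return -beta_ac * r_l / r

def power_gain_magnitude(beta_ac, a_v):
    """Magnitude of Eq. (14.19): |Ap| = beta_ac * |Av|."""
    return beta_ac * abs(a_v)

# Hypothetical values, chosen only for illustration.
beta_ac = 100.0
r_l = 2e3   # ohm
r_b = 1e3   # ohm
r_i = 1e3   # ohm

a_v = ce_voltage_gain(beta_ac, r_l, r_b, r_i)
a_p = power_gain_magnitude(beta_ac, a_v)
print(f"Av = {a_v:.0f}, |Ap| = {a_p:.0f}")
```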
Example 14.9 In Fig. 14.31(a), the VBB supply can be varied from 0 V
to 5.0 V. The Si transistor has βdc = 250 and RB = 100 kΩ, RC = 1 kΩ,
VCC = 5.0 V. Assume that when the transistor is saturated, VCE = 0 V
and VBE = 0.8 V. Calculate (a) the minimum base current, for which
the transistor will reach saturation. Hence, (b) determine Vi when
the transistor is switched on. (c) Find the ranges of Vi for which the
transistor is switched off and switched on.

Solution
Given at saturation VCE = 0 V, VBE = 0.8 V
VCE = VCC – ICRC
IC = VCC/RC = 5.0 V/1.0 kΩ = 5.0 mA
Therefore IB = IC/β = 5.0 mA/250 = 20 µA
The input voltage at which the transistor will go into saturation is
given by
VIH = VBB = IBRB + VBE
= 20 µA × 100 kΩ + 0.8 V = 2.8 V
The value of input voltage below which the transistor remains cutoff
is given by
VIL = 0.6 V, VIH = 2.8 V
Between 0.0 V and 0.6 V, the transistor will be in the switched off
state. Between 2.8 V and 5.0 V, it will be in switched on state.
Note that the transistor is in active state when IB varies from 0.0 µA
to 20 µA. In this range, IC = βIB is valid. In the saturation range,
IC ≤ βIB.
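The numbers in Example 14.9 can be reproduced with a few lines of code. The sketch uses only the values given in the example; the function names are introduced here for illustration.

```python
def saturation_base_current(v_cc, r_c, beta_dc, v_ce_sat=0.0):
    """Minimum IB for saturation: IC(sat) = (VCC - VCE(sat)) / RC, IB = IC / beta."""
    i_c_sat = (v_cc - v_ce_sat) / r_c
    return i_c_sat / beta_dc

def input_high_threshold(i_b_sat, r_b, v_be_sat):
    """Input voltage that just saturates the transistor: VIH = IB*RB + VBE."""
    return i_b_sat * r_b + v_be_sat

# Values given in Example 14.9.
i_b_sat = saturation_base_current(v_cc=5.0, r_c=1.0e3, beta_dc=250.0)
v_ih = input_high_threshold(i_b_sat, r_b=100e3, v_be_sat=0.8)
v_il = 0.6  # below this the Si transistor stays in cutoff

print(f"IB(sat) = {i_b_sat * 1e6:.0f} uA, VIH = {v_ih:.1f} V, VIL = {v_il:.1f} V")
```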


Example 14.10 For a CE transistor amplifier, the audio signal voltage
across the collector resistance of 2.0 kΩ is 2.0 V. Suppose the current
amplification factor of the transistor is 100. What should be the value
of RB in series with VBB supply of 2.0 V if the dc base current has to be
10 times the signal current? Also calculate the dc drop across the
collector resistance. (Refer to Fig. 14.33).
Solution The output ac voltage is 2.0 V. So, the ac collector current
iC = 2.0/2000 = 1.0 mA. The signal current through the base is,
therefore given by iB = iC/β = 1.0 mA/100 = 0.010 mA. The dc base
current has to be 10 × 0.010 = 0.10 mA.
From Eq. (14.16), RB = (VBB – VBE)/IB. Assuming VBE = 0.6 V,
RB = (2.0 – 0.6)/0.10 = 14 kΩ.
The dc collector current IC = 100 × 0.10 = 10 mA.
The dc drop across the collector resistance is therefore ICRC = 10 mA × 2.0 kΩ = 20 V.
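The steps of Example 14.10 can be laid out as a short calculation. The sketch uses the values given in the example, with VBE = 0.6 V as assumed in the solution; the final line applies Ohm's law to the collector resistance to obtain the dc drop asked for in the question.

```python
# Given values from Example 14.10.
v_out_ac = 2.0    # V, ac signal voltage across the collector resistance
r_c = 2.0e3       # ohm
beta = 100.0
v_bb = 2.0        # V
v_be = 0.6        # V, assumed in the solution

i_c_ac = v_out_ac / r_c          # ac collector current
i_b_ac = i_c_ac / beta           # signal current through the base
i_b_dc = 10 * i_b_ac             # dc base current is 10 times the signal current
r_b = (v_bb - v_be) / i_b_dc     # from Eq. (14.16)
i_c_dc = beta * i_b_dc           # dc collector current
v_rc_dc = i_c_dc * r_c           # dc drop across the collector resistance

print(f"RB = {r_b / 1e3:.0f} kohm, IC(dc) = {i_c_dc * 1e3:.0f} mA, "
      f"dc drop = {v_rc_dc:.0f} V")
```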

14.9.5 Feedback amplifier and transistor oscillator


In an amplifier, we have seen that a sinusoidal input is given which appears
as an amplified signal in the output. This means that an external input is
necessary to sustain ac signal in the
output for an amplifier. In an oscillator, we
get ac output without any external input
signal. In other words, the output in an
oscillator is self-sustained. To attain this,
an amplifier is taken. A portion of the
output power is returned back (feedback)
to the input in phase with the starting
power (this process is termed positive
feedback) as shown in Fig. 14.33(a). The
feedback can be achieved by inductive
coupling (through mutual inductance) or
LC or RC networks. Different types of
oscillators essentially use different methods
of coupling the output to the input
(feedback network), apart from the resonant
circuit for obtaining oscillation at a
particular frequency. For understanding
the oscillator action, we consider the circuit
shown in Fig. 14.33(b) in which the
feedback is accomplished by inductive
coupling from one coil winding (T1) to
another coil winding (T2). Note that the coils
T2 and T1 are wound on the same core and
hence are inductively coupled through their
mutual inductance. As in an amplifier, the
base-emitter junction is forward biased
while the base-collector junction is reverse
biased. Detailed biasing circuits actually
used have been omitted for simplicity.

FIGURE 14.33 (a) Principle of a transistor amplifier with positive feedback working as an oscillator and (b) Tuned collector oscillator, (c) Rise and fall (or built up) of current Ic and Ie due to the inductive coupling.

Let us try to understand how oscillations
are built. Suppose switch S1 is put on to


apply proper bias for the first time. Obviously, a surge of collector current
flows in the transistor. This current flows through the coil T2 whose
terminals are numbered 3 and 4 [Fig. 14.33(b)]. This current does not
reach full amplitude instantaneously but increases from X to Y, as shown
in Fig. 14.33(c)(i). The inductive coupling between coil T2 and coil T1
now causes a current to flow in the emitter circuit (note that this actually
is the feedback from output to input). As a result of this positive feedback,
this current (in T1; emitter current) also increases from X to Y [Fig.
14.33(c)(ii)]. The current in T2 (collector current) connected in the collector
circuit acquires the value Y when the transistor becomes saturated. This
means that maximum collector current is flowing and can increase no
further. Since there is no further change in collector current, the magnetic
field around T2 ceases to grow. As soon as the field becomes static, there
will be no further feedback from T2 to T1. Without continued feedback,
the emitter current begins to fall. Consequently, collector current decreases
from Y towards Z [Fig. 14.33(c)(i)]. However, a decrease of collector current
causes the magnetic field to decay around the coil T2 . Thus, T1 is now
seeing a decaying field in T2 (opposite from what it saw when the field was
growing at the initial start operation). This causes a further decrease in
the emitter current till it reaches Z when the transistor is cut-off. This
means that both IE and IC cease to flow. Therefore, the transistor has
reverted back to its original state (when the power was first switched on).
The whole process now repeats itself. That is, the transistor is driven to
saturation, then to cut-off, and then back to saturation. The time for
change from saturation to cut-off and back is determined by the constants
of the tank circuit or tuned circuit (inductance L of coil T2 and C connected
in parallel to it). The resonance frequency (ν) of this tuned circuit
determines the frequency at which the oscillator will oscillate.

$$\nu = \frac{1}{2\pi\sqrt{LC}} \tag{14.20}$$
In the circuit of Fig. 14.33(b), the tank or tuned circuit is connected
in the collector side. Hence, it is known as tuned collector oscillator. If the
tuned circuit is on the base side, it will be known as tuned base oscillator.
There are many other types of tank circuits (say RC) or feedback circuits
giving different types of oscillators like Colpitts oscillator, Hartley
oscillator, RC-oscillator.
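Eq. (14.20) gives the oscillation frequency directly from the tank-circuit constants. The sketch below evaluates it for hypothetical values of L and C, which are assumptions made only for illustration.

```python
import math

def resonant_frequency(l_henry, c_farad):
    """Eq. (14.20): resonance frequency 1 / (2*pi*sqrt(L*C)), in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

# Hypothetical tank circuit: L = 1 mH, C = 100 nF.
f_osc = resonant_frequency(1e-3, 100e-9)
print(f"Oscillation frequency is about {f_osc / 1e3:.1f} kHz")
```

Since the frequency varies as the inverse square root of LC, quadrupling either L or C halves the oscillation frequency.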

14.10 DIGITAL ELECTRONICS AND LOGIC GATES

In electronics circuits like amplifiers and oscillators, introduced to you in
earlier sections, the signal (current or voltage) has been in the form of
continuous, time-varying voltage or current. Such signals are called
continuous or analogue signals. A typical analogue signal is shown in
Fig. 14.34(a). Fig. 14.34(b) shows a pulse waveform in which only
discrete values of voltages are possible. It is convenient to use binary
numbers to represent such signals. A binary number has only two digits
0 (say, 0V) and 1 (say, 5V). In digital electronics we use only these two
levels of voltage as shown in Fig. 14.34(b). Such signals are called Digital
Signals. In digital circuits only two values (represented by 0 or 1) of the
input and output voltage are permissible.


This section is intended to provide the first step in our understanding
of digital electronics. We shall restrict our study to some basic building
blocks of digital electronics (called Logic Gates) which process the digital
signals in a specific manner. Logic gates are used in calculators, digital
watches, computers, robots, industrial control systems, and in
telecommunications.
A light switch in your house can be used as an example of a digital
circuit. The light is either ON or OFF depending on the switch position.
When the light is ON, the output value is 1. When the light is OFF the
output value is 0. The inputs are the position of the light switch. The
switch is placed either in the ON or OFF position to activate the light.

FIGURE 14.34 (a) Analogue signal, (b) Digital signal.

14.10.1 Logic gates

FIGURE 14.35 (a) Logic symbol, (b) Truth table of NOT gate.

A gate is a digital circuit that follows a certain logical relationship
between the input and output voltages. Therefore, they are generally
known as logic gates; they are called gates because they control the flow of
information. The five common logic gates used are NOT, AND, OR,
NAND, NOR. Each logic gate is indicated by a symbol and its function
is defined by a truth table that shows all the possible input logic level
combinations with their respective output logic levels. Truth tables
help understand the behaviour of logic gates. These logic gates can
be realised using semiconductor devices.
(i) NOT gate

This is the most basic gate, with one input and one output. It produces
a 1 output if the input is 0 and vice-versa. That is, it produces an
inverted version of the input at its output. This is why it is also known
as an inverter. The commonly used symbol together with the truth
table for this gate is given in Fig. 14.35.
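The behaviour of the NOT gate described above can be tabulated with a one-line function. This is a minimal sketch in which logic levels are represented by the integers 0 and 1; the function name is an assumption introduced here.

```python
def not_gate(a):
    """Return the inverted logic level: 0 -> 1 and 1 -> 0."""
    return 1 - a

# Build the truth table by trying every possible input.
truth_table = [(a, not_gate(a)) for a in (0, 1)]

for a, y in truth_table:
    print(f"A = {a}  ->  Y = {y}")
```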

(ii) OR Gate
An OR gate has two or more inputs with one output. The logic symbol
and truth table are shown in Fig. 14.36. The output Y is 1 when either
input A or input B or both are 1s, that is, if any of the inputs is high, the
output is high.

FIGURE 14.36 (a) Logic symbol (b) Truth table of OR gate.

Apart from carrying out the above mathematical logic operation, this
gate can be used for modifying the pulse waveform as explained in the
following example.
Example 14.11 Justify the output waveform (Y) of the OR gate for
the following inputs A and B given in Fig. 14.37.
Solution Note the following:
At t < t1; A = 0, B = 0; Hence Y = 0
For t1 to t2; A = 1, B = 0; Hence Y = 1
For t2 to t3; A = 1, B = 1; Hence Y = 1
For t3 to t4; A = 0, B = 1; Hence Y = 1
For t4 to t5; A = 0, B = 0; Hence Y = 0
For t5 to t6; A = 1, B = 0; Hence Y = 1
For t > t6; A = 0, B = 1; Hence Y = 1
Therefore the waveform Y will be as shown in Fig. 14.37.
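The interval-by-interval reasoning of Example 14.11 can be reproduced by applying the OR operation to the input levels listed in the solution. This is a sketch in which each waveform is represented as a list of logic levels, one per time interval.

```python
def or_gate(a, b):
    """Output is 1 if either input (or both) is 1."""
    return a | b

intervals = ["t < t1", "t1 to t2", "t2 to t3", "t3 to t4",
             "t4 to t5", "t5 to t6", "t > t6"]
# Input levels in each interval, as listed in the solution.
a_levels = [0, 1, 1, 0, 0, 1, 0]
b_levels = [0, 0, 1, 1, 0, 0, 1]

y_levels = [or_gate(a, b) for a, b in zip(a_levels, b_levels)]

for label, a, b, y in zip(intervals, a_levels, b_levels, y_levels):
    print(f"{label:9s}: A = {a}, B = {b}, Y = {y}")
```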


(iii) AND Gate
An AND gate has two or more inputs and one output. The output Y of
AND gate is 1 only when input A and input B are both 1. The logic
symbol and truth table for this gate are given in Fig. 14.38.

FIGURE 14.37

FIGURE 14.38 (a) Logic symbol, (b) Truth table of AND gate.

Example 14.12 Take A and B input waveforms similar to that in
Example 14.11. Sketch the output waveform obtained from AND gate.
Solution
For t ≤ t1; A = 0, B = 0; Hence Y = 0
For t1 to t2; A = 1, B = 0; Hence Y = 0
For t2 to t3; A = 1, B = 1; Hence Y = 1
For t3 to t4; A = 0, B = 1; Hence Y = 0
For t4 to t5; A = 0, B = 0; Hence Y = 0
For t5 to t6; A = 1, B = 0; Hence Y = 0
For t > t6; A = 0, B = 1; Hence Y = 0
Based on the above, the output waveform for AND gate can be drawn
as given below.

FIGURE 14.39

(iv) NAND Gate

This is an AND gate followed by a NOT gate. If inputs A and B are both
1, the output Y is not 1. The gate gets its name from this NOT AND
behaviour. Figure 14.40 shows the symbol and truth table of NAND gate.
NAND gates are also called Universal Gates since by using these
gates you can realise other basic gates like OR, AND and NOT (Exercises
14.16 and 14.17).

FIGURE 14.40 (a) Logic symbol, (b) Truth table of NAND gate.

Example 14.13 Sketch the output Y from a NAND gate having inputs
A and B given below:
Solution
For t < t1; A = 1, B = 1; Hence Y = 0
For t1 to t2; A = 0, B = 0; Hence Y = 1
For t2 to t3; A = 0, B = 1; Hence Y = 1
For t3 to t4; A = 1, B = 0; Hence Y = 1
For t4 to t5; A = 1, B = 1; Hence Y = 0
For t5 to t6; A = 0, B = 0; Hence Y = 1
For t > t6; A = 0, B = 1; Hence Y = 1

FIGURE 14.41

(v) NOR Gate


It has two or more inputs and one output. A NOT operation applied
after an OR gate gives a NOT-OR gate (or simply NOR gate). Its output Y is
1 only when both inputs A and B are 0, i.e., neither one input nor the
other is 1. The symbol and truth table for the NOR gate are given in
Fig. 14.42.

FIGURE 14.42 (a) Logic symbol, (b) Truth table of NOR gate.

NOR gates are considered as universal gates because you can obtain
all the gates like AND, OR, NOT by using only NOR gates (Exercises 14.18
and 14.19).
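The gate definitions and worked examples above can be checked with a short program. The following is a minimal sketch in Python, not part of the original text: the function names and the lists A_WAVE and B_WAVE (the input levels of Examples 14.11 and 14.12 over the seven time intervals) are illustrative assumptions. It also shows one way in which NOT, AND and OR may be built from NAND gates only.

```python
from itertools import product

# Basic gates acting on logic levels 0 and 1.
def NOT(a):
    return 1 - a

def OR(a, b):
    return a | b

def AND(a, b):
    return a & b

def NAND(a, b):
    return NOT(AND(a, b))

def NOR(a, b):
    return NOT(OR(a, b))

# One possible realisation of NOT, AND and OR using NAND gates only.
def not_from_nand(a):
    return NAND(a, a)

def and_from_nand(a, b):
    return not_from_nand(NAND(a, b))

def or_from_nand(a, b):
    return NAND(not_from_nand(a), not_from_nand(b))

def truth_table(gate):
    """Return rows (A, B, Y) for a two-input gate."""
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]

def waveform(gate, a_wave, b_wave):
    """Apply a gate interval by interval to two input waveforms."""
    return [gate(a, b) for a, b in zip(a_wave, b_wave)]

# Input levels of Examples 14.11 and 14.12 for the intervals
# t < t1, t1-t2, t2-t3, t3-t4, t4-t5, t5-t6, t > t6 (assumed encoding).
A_WAVE = [0, 1, 1, 0, 0, 1, 0]
B_WAVE = [0, 0, 1, 1, 0, 0, 1]

if __name__ == "__main__":
    for name, gate in [("OR", OR), ("AND", AND), ("NAND", NAND), ("NOR", NOR)]:
        print(name, truth_table(gate))
    print("OR output :", waveform(OR, A_WAVE, B_WAVE))
    print("AND output:", waveform(AND, A_WAVE, B_WAVE))
```

Comparing truth_table(or_from_nand) with truth_table(OR), and similarly for AND and NOT, gives a quick check of the statement that NAND is a universal gate.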

14.11 INTEGRATED CIRCUITS

The conventional method of making circuits is to choose components
like diodes, transistors, R, L, C etc., and connect them by soldering wires
in the desired manner. In spite of the miniaturisation introduced by the
discovery of transistors, such circuits were still bulky. Apart from this,
such circuits were less reliable and less shock proof. The concept of
fabricating an entire circuit (consisting of many passive components like
R and C and active devices like diode and transistor) on a small single
block (or chip) of a semiconductor has revolutionised the electronics
technology. Such a circuit is known as Integrated Circuit (IC). The most
widely used technology is the Monolithic Integrated Circuit. The word

monolithic is a combination of two Greek words: monos
means single and lithos means stone. This, in effect,
means that the entire circuit is formed on a single
silicon crystal (or chip). The chip dimensions are as
small as 1 mm × 1 mm or it could even be smaller. Figure
14.43 shows a chip in its protective plastic case, partly
removed to reveal the connections coming out from the
chip to the pins that enable it to make external
connections.

FIGURE 14.43 The casing and connection of a chip.

Depending on nature of input signals, ICs can be
grouped in two categories: (a) linear or analogue ICs
and (b) digital ICs. The linear ICs process analogue
signals which change smoothly and continuously over a range of values
between a maximum and a minimum. The output is more or less directly
proportional to the input, i.e., it varies linearly with the input. One of the
most useful linear ICs is the operational amplifier.
The digital ICs process signals that have only two values. They
contain circuits such as logic gates. Depending upon the level of
integration (i.e., the number of circuit components or logic gates), the ICs
are termed as Small Scale Integration, SSI (logic gates < 10); Medium
Scale Integration, MSI (logic gates < 100); Large Scale Integration, LSI
(logic gates < 1000); and Very Large Scale Integration, VLSI (logic gates >
1000). The technology of fabrication is very involved but large scale
industrial production has made them very inexpensive.

FASTER AND SMALLER: THE FUTURE OF COMPUTER TECHNOLOGY

The Integrated Chip (IC) is at the heart of all computer systems. In fact ICs are found in
almost all electrical devices like cars, televisions, CD players, cell phones etc. The
miniaturisation that made the modern personal computer possible could never have
happened without the IC. ICs are electronic devices that contain many transistors, resistors,
capacitors and connecting wires, all in one package. You must have heard of the
microprocessor. The microprocessor is an IC that processes all information in a computer,
like keeping track of what keys are pressed, running programmes, games etc. The IC was
first invented by Jack Kilby at Texas Instruments in 1958 and he was awarded the Nobel Prize
for this in 2000. ICs are produced on a piece of semiconductor crystal (or chip) by a process
called photolithography. Thus, the entire Information Technology (IT) industry hinges on
semiconductors. Over the years, the complexity of ICs has increased while the size of their
features continued to shrink. In the past five decades, a dramatic miniaturisation in
computer technology has made modern day computers faster and smaller. In the 1970s,
Gordon Moore, co-founder of INTEL, pointed out that the memory capacity of a chip (IC)
approximately doubled every one and a half years. This is popularly known as Moore's
law. The number of transistors per chip has risen exponentially and each year computers
are becoming more powerful, yet cheaper than the year before. It is estimated from current
trends that the computers available in 2020 will operate at 40 GHz (40,000 MHz) and
would be much smaller, more efficient and less expensive than present day computers.
The explosive growth in the semiconductor industry and computer technology is best
expressed by a famous quote from Gordon Moore: "If the auto industry advanced as rapidly
as the semiconductor industry, a Rolls Royce would get half a million miles per gallon, and
it would be cheaper to throw it away than to park it."

SUMMARY
1. Semiconductors are the basic materials used in the present solid state
electronic devices like diode, transistor, ICs, etc.
2. Lattice structure and the atomic structure of constituent elements
decide whether a particular material will be insulator, metal or
semiconductor.
3. Metals have low resistivity ($10^{-2}$ to $10^{-8}$ Ω m), insulators have very high
resistivity (> $10^{8}$ Ω m), while semiconductors have intermediate values
of resistivity.
4. Semiconductors are elemental (Si, Ge) as well as compound (GaAs,
CdS, etc.).
5. Pure semiconductors are called intrinsic semiconductors. The presence
of charge carriers (electrons and holes) is an intrinsic property of the
material and these are obtained as a result of thermal excitation. The
number of electrons ($n_e$) is equal to the number of holes ($n_h$) in intrinsic
semiconductors. Holes are essentially electron vacancies with an effective
positive charge.
6. The number of charge carriers can be changed by doping of a suitable
impurity in pure semiconductors. Such semiconductors are known as
extrinsic semiconductors. These are of two types (n-type and p-type).
7. In n-type semiconductors, $n_e \gg n_h$ while in p-type semiconductors $n_h \gg n_e$.
8. n-type semiconducting Si or Ge is obtained by doping with pentavalent
atoms (donors) like As, Sb, P, etc., while p-type Si or Ge can be obtained
by doping with trivalent atoms (acceptors) like B, Al, In etc.
9. $n_e n_h = n_i^2$ in all cases. Further, the material possesses an overall charge
neutrality.
10. There are two distinct bands of energies (called valence band and
conduction band) in which the electrons in a material lie. Valence
band energies are low as compared to conduction band energies. All
energy levels in the valence band are filled while energy levels in the
conduction band may be fully empty or partially filled. The electrons in
the conduction band are free to move in a solid and are responsible for
the conductivity. The extent of conductivity depends upon the energy
gap ($E_g$) between the top of valence band ($E_V$) and the bottom of the
conduction band ($E_C$). The electrons from valence band can be excited by
heat, light or electrical energy to the conduction band and thus, produce
a change in the current flowing in a semiconductor.
11. For insulators $E_g$ > 3 eV, for semiconductors $E_g$ is 0.2 eV to 3 eV, while
for metals $E_g \approx 0$.
12. p-n junction is the key to all semiconductor devices. When such a
junction is made, a depletion layer is formed consisting of immobile
ion-cores devoid of their electrons or holes. This is responsible for a
junction potential barrier.
13. By changing the external applied voltage, junction barriers can be
changed. In forward bias (n-side is connected to negative terminal of the
battery and p-side is connected to the positive), the barrier is decreased
while the barrier increases in reverse bias. Hence, in a p-n junction diode
the forward bias current is large (mA) while the reverse bias current is
very small (μA).
14. Diodes can be used for rectifying an ac voltage (restricting the ac voltage
to one direction). With the help of a capacitor or a suitable filter, a dc
voltage can be obtained.
15. There are some special purpose diodes.

16. Zener diode is one such special purpose diode. In reverse bias, after a
certain voltage, the current suddenly increases (breakdown voltage) in
a Zener diode. This property has been used to obtain voltage regulation.
17. p-n junctions have also been used to obtain many photonic or
optoelectronic devices where one of the participating entities is a photon:
(a) Photodiodes in which photon excitation results in a change of reverse
saturation current which helps us to measure light intensity; (b) Solar
cells which convert photon energy into electricity; (c) Light Emitting
Diode and Diode Laser in which electron excitation by a bias voltage
results in the generation of light.
18. Transistor is an n-p-n or p-n-p junction device. The central block
(thin and lightly doped) is called Base while the other electrodes are
Emitter and Collector. The emitter-base junction is forward biased
while the collector-base junction is reverse biased.
19. The transistors can be connected in such a manner that either C or E
or B is common to both the input and output. This gives the three
configurations in which a transistor is used: Common Emitter (CE),
Common Collector (CC) and Common Base (CB). The plot between IC
and VCE for fixed IB is called output characteristics while the plot between
IB and VBE with fixed VCE is called input characteristics. The important
transistor parameters for CE-configuration are:

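input resistance, $r_i = \left(\frac{\Delta V_{BE}}{\Delta I_B}\right)_{V_{CE}}$

output resistance, $r_o = \left(\frac{\Delta V_{CE}}{\Delta I_C}\right)_{I_B}$

current amplification factor, $\beta = \left(\frac{\Delta I_C}{\Delta I_B}\right)_{V_{CE}}$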

20. Transistor can be used as an amplifier and oscillator. In fact, an
oscillator can also be considered as a self-sustained amplifier in which
a part of output is fed back to the input in the same phase (positive
feedback). The voltage gain of a transistor amplifier in common emitter
configuration is: $A_v = \frac{v_o}{v_i} = \beta \frac{R_C}{R_B}$, where $R_C$ and $R_B$ are respectively
the resistances in collector and base sides of the circuit.


21. When the transistor is used in the cutoff or saturation state, it acts as
a switch.
22. There are some special circuits which handle the digital data consisting
of 0 and 1 levels. This forms the subject of Digital Electronics.
23. The important digital circuits performing special logic operations are
called logic gates. These are: OR, AND, NOT, NAND, and NOR gates.
24. In modern day circuits, many logic gates or circuits are integrated in
one single chip. These are known as Integrated Circuits (IC).

POINTS TO PONDER
1. The energy bands ($E_C$ or $E_V$) in the semiconductors are space delocalised
which means that these are not located in any specific place inside the
solid. The energies are the overall averages. When you see a picture in
which $E_C$ or $E_V$ are drawn as straight lines, then they should be
respectively taken simply as the bottom of conduction band energy levels
and top of valence band energy levels.
2. In elemental semiconductors (Si or Ge), the n-type or p-type
semiconductors are obtained by introducing dopants as defects. In
compound semiconductors, the change in relative stoichiometric ratio
can also change the type of semiconductor. For example, in ideal GaAs
the ratio of Ga:As is 1:1 but in Ga-rich or As-rich GaAs it could
respectively be Ga1.1 As0.9 or Ga0.9 As1.1. In general, the presence of
defects controls the properties of semiconductors in many ways.
3. In transistors, the base region is both narrow and lightly doped,
otherwise the electrons or holes coming from the input side (say, emitter
in CE-configuration) will not be able to reach the collector.
4. We have described an oscillator as a positive feedback amplifier. For
stable oscillations, the voltage feedback ($V_{fb}$) from the output voltage
($V_o$) should be such that after amplification (A) it should again become
$V_o$. If a fraction β′ is fed back, then $V_{fb} = V_o \beta'$ and after amplification
its value $A(V_o \beta')$ should be equal to $V_o$. This means that the criterion for
stable oscillations to be sustained is $A\beta' = 1$. This is known as
Barkhausen's criterion.
5. In an oscillator, the feedback is in the same phase (positive feedback).
If the feedback voltage is in opposite phase (negative feedback), the
gain is less than 1 and it can never work as an oscillator. It will be an
amplifier with reduced gain. However, the negative feedback also reduces
noise and distortion in an amplifier, which is an advantageous feature.

EXERCISES

14.1 In an n-type silicon, which of the following statements is true:
(a) Electrons are majority carriers and trivalent atoms are the
dopants.
(b) Electrons are minority carriers and pentavalent atoms are the
dopants.
(c) Holes are minority carriers and pentavalent atoms are the
dopants.
(d) Holes are majority carriers and trivalent atoms are the dopants.
14.2 Which of the statements given in Exercise 14.1 is true for p-type
semiconductors?
14.3 Carbon, silicon and germanium have four valence electrons each.
These are characterised by valence and conduction bands separated
by energy band gap respectively equal to $(E_g)_C$, $(E_g)_{Si}$ and $(E_g)_{Ge}$. Which
of the following statements is true?
(a) $(E_g)_{Si} < (E_g)_{Ge} < (E_g)_C$
(b) $(E_g)_C < (E_g)_{Ge} > (E_g)_{Si}$
(c) $(E_g)_C > (E_g)_{Si} > (E_g)_{Ge}$
(d) $(E_g)_C = (E_g)_{Si} = (E_g)_{Ge}$
14.4 In an unbiased p-n junction, holes diffuse from the p-region to
n-region because
(a) free electrons in the n-region attract them.
(b) they move across the junction by the potential difference.
(c) hole concentration in p-region is more as compared to n-region.
(d) All the above.
14.5 When a forward bias is applied to a p-n junction, it
(a) raises the potential barrier.
(b) reduces the majority carrier current to zero.
(c) lowers the potential barrier.
(d) None of the above.
14.6 For transistor action, which of the following statements are correct:
(a) Base, emitter and collector regions should have similar size and
doping concentrations.
(b) The base region must be very thin and lightly doped.
(c) The emitter junction is forward biased and collector junction is
reverse biased.
(d) Both the emitter junction as well as the collector junction are
forward biased.
14.7 For a transistor amplifier, the voltage gain
(a) remains constant for all frequencies.
(b) is high at high and low frequencies and constant in the middle
frequency range.
(c) is low at high and low frequencies and constant at mid
frequencies.
(d) None of the above.
14.8 In half-wave rectification, what is the output frequency if the input
frequency is 50 Hz? What is the output frequency of a full-wave rectifier
for the same input frequency?
14.9 For a CE-transistor amplifier, the audio signal voltage across the
collector resistance of 2 kΩ is 2 V. Suppose the current amplification
factor of the transistor is 100, find the input signal voltage and base
current, if the base resistance is 1 kΩ.
14.10 Two amplifiers are connected one after the other in series (cascaded).
The first amplifier has a voltage gain of 10 and the second has a
voltage gain of 20. If the input signal is 0.01 volt, calculate the output
ac signal.
14.11 A p-n photodiode is fabricated from a semiconductor with band gap
of 2.8 eV. Can it detect a wavelength of 6000 nm?

ADDITIONAL EXERCISES

14.12 The number of silicon atoms per m$^3$ is $5 \times 10^{28}$. This is doped
simultaneously with $5 \times 10^{22}$ atoms per m$^3$ of Arsenic and $5 \times 10^{20}$
atoms per m$^3$ of Indium. Calculate the number of electrons and holes.
Given that $n_i = 1.5 \times 10^{16}$ m$^{-3}$. Is the material n-type or p-type?
14.13 In an intrinsic semiconductor the energy gap $E_g$ is 1.2 eV. Its hole
mobility is much smaller than electron mobility and independent of
temperature. What is the ratio between conductivity at 600 K and
that at 300 K? Assume that the temperature dependence of intrinsic
carrier concentration $n_i$ is given by

$n_i = n_0 \exp\left(-\frac{E_g}{2 k_B T}\right)$

where $n_0$ is a constant.

14.14 In a p-n junction diode, the current I can be expressed as

$I = I_0 \left[\exp\left(\frac{eV}{2 k_B T}\right) - 1\right]$

where $I_0$ is called the reverse saturation current, V is the voltage
across the diode and is positive for forward bias and negative for
reverse bias, and I is the current through the diode, $k_B$ is the
Boltzmann constant ($8.6 \times 10^{-5}$ eV/K) and T is the absolute
temperature. If for a given diode $I_0 = 5 \times 10^{-12}$ A and T = 300 K, then
(a) What will be the forward current at a forward voltage of 0.6 V?
(b) What will be the increase in the current if the voltage across the
diode is increased to 0.7 V?
(c) What is the dynamic resistance?
(d) What will be the current if reverse bias voltage changes from 1 V
to 2 V?
14.15 You are given the two circuits as shown in Fig. 14.44. Show that
circuit (a) acts as OR gate while the circuit (b) acts as AND gate.

FIGURE 14.44

14.16 Write the truth table for a NAND gate connected as given in
Fig. 14.45.

FIGURE 14.45

Hence identify the exact logic operation carried out by this circuit.

14.17 You are given two circuits as shown in Fig. 14.46, which consist
of NAND gates. Identify the logic operation carried out by the two
circuits.

FIGURE 14.46

14.18 Write the truth table for circuit given in Fig. 14.47 below consisting
of NOR gates and identify the logic operation (OR, AND, NOT) which
this circuit is performing.

FIGURE 14.47

(Hint: A = 0, B = 1 then A and B inputs of second NOR gate will be 0
and hence Y = 1. Similarly work out the values of Y for other
combinations of A and B. Compare with the truth table of OR, AND,
NOT gates and find the correct one.)

14.19 Write the truth table for the circuits given in Fig. 14.48 consisting of
NOR gates only. Identify the logic operations (OR, AND, NOT) performed
by the two circuits.

FIGURE 14.48

SUPPLEMENTARY MATERIAL

CHAPTER 9

9.6.5 Poisson's Ratio

Careful observations with the Young's modulus experiment (explained in
section 9.6.2), show that there is also a slight reduction in the
cross-section (or in the diameter) of the wire. The strain perpendicular
to the applied force is called lateral strain. Siméon Poisson pointed out
that within the elastic limit, lateral strain is directly proportional to
the longitudinal strain. The ratio of the lateral strain to the
longitudinal strain in a stretched wire is called Poisson's ratio. If the
original diameter of the wire is d and the contraction of the diameter
under stress is Δd, the lateral strain is Δd/d. If the original length of
the wire is L and the elongation under stress is ΔL, the longitudinal
strain is ΔL/L. Poisson's ratio is then (Δd/d)/(ΔL/L) or (Δd/ΔL) × (L/d).
Poisson's ratio is a ratio of two strains; it is a pure number and has no
dimensions or units. Its value depends only on the nature of material.
For steels the value is between 0.28 and 0.30, and for aluminium alloys
it is about 0.33.

9.6.6 Elastic Potential Energy in a Stretched Wire

When a wire is put under a tensile stress, work is done against the
inter-atomic forces. This work is stored in the wire in the form of
elastic potential energy. When a wire of original length L and area of
cross-section A is subjected to a deforming force F along the length of
the wire, let the length of the wire be elongated by l. Then from
Eq. (9.8), we have F = YA × (l/L). Here Y is the Young's modulus of the
material of the wire. Now for a further elongation of infinitesimal small
length dl, work done dW is F × dl or YAl dl/L. Therefore, the amount of
work done (W) in increasing the length of the wire from L to L + l, that
is from l = 0 to l = l is

$W = \int_0^l \frac{YAl}{L}\,dl = \frac{YA}{2} \times \frac{l^2}{L}$

$W = \frac{1}{2} \times Y \times \left(\frac{l}{L}\right)^2 \times AL$

= 1/2 × Young's modulus × strain$^2$ × volume of the wire

= 1/2 × stress × strain × volume of the wire

This work is stored in the wire in the form of elastic potential energy
(U). Therefore the elastic potential energy per unit volume of the
wire (u) is

$u = \frac{1}{2} \times \text{stress} \times \text{strain}$    (A1)
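The result above can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function names and the sample values (a steel-like wire with Y = 2.0 × 10^11 N m^-2, A = 1 mm^2, L = 2 m and l = 1 mm) are assumptions chosen only for illustration. It sums F dl in small steps and compares the total with 1/2 × stress × strain × volume.

```python
import math

def work_by_summation(Y, A, L, l, steps=1000):
    """Approximate W = integral of (Y*A*x/L) dx from 0 to l (midpoint rule)."""
    dl = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dl          # elongation at the middle of this step
        force = Y * A * x / L       # F = Y A (x / L)
        total += force * dl         # dW = F dl
    return total

def elastic_energy(Y, A, L, l):
    """W = 1/2 x stress x strain x volume of the wire."""
    strain = l / L
    stress = Y * strain
    volume = A * L
    return 0.5 * stress * strain * volume

# Assumed sample values (not from the text).
Y = 2.0e11      # Young's modulus in N m^-2
A = 1.0e-6      # area of cross-section in m^2
L = 2.0         # original length in m
l = 1.0e-3      # elongation in m

if __name__ == "__main__":
    print("Work by summation :", work_by_summation(Y, A, L, l), "J")
    print("1/2 stress strain V:", elastic_energy(Y, A, L, l), "J")
    print("Energy per volume  :", elastic_energy(Y, A, L, l) / (A * L), "J m^-3")
```

Because the force grows linearly with elongation, the midpoint sum agrees with the closed-form expression to within rounding error.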

CHAPTER 10

CRITICAL VELOCITY

(To be inserted on Page 260, Chapter 10, Physics, Class XI, Vol. 2
textbook before last paragraph of second column.)

The maximum velocity of a fluid in a tube for which the flow remains
streamlined is called its critical velocity. From Eq. 10.21, it is
$v_c = R_e \eta / (\rho d)$.

CHAPTER 11

11.9.4 Blackbody Radiation

We have so far not mentioned the wavelength content of thermal
radiation. The important thing about thermal radiation at any
temperature is that it is not of one (or a few) wavelength(s) but has a
continuous spectrum from the small to the long wavelengths. The energy
content of radiation, however, varies for different wavelengths. Figure
A1 gives the experimental curves for radiation energy per unit area per
unit wavelength emitted by a blackbody versus wavelength for different
temperatures.

Fig. A1: Energy emitted versus wavelength for a blackbody at different
temperatures

Notice that the wavelength $\lambda_m$ for which energy is maximum decreases
with increasing temperature. The relation between $\lambda_m$ and T is given by
what is known as Wien's Displacement Law:

$\lambda_m T = \text{constant}$    (A1)

The value of the constant (Wien's constant) is $2.9 \times 10^{-3}$ m K. This
law explains why the colour of a piece of iron heated in a hot flame
first becomes dull red, then reddish yellow and finally white hot.
Wien's law is useful for estimating the surface temperatures of
celestial bodies like the moon, sun and other stars. Light from the moon
is found to have a maximum intensity near the wavelength 14 μm. By
Wien's law, the surface of the moon is estimated to have a temperature
of 200 K. Solar radiation has a maximum at $\lambda_m$ = 4753 Å. This
corresponds to T = 6060 K. Remember, this is the temperature of the
surface of the sun, not its interior.

The most significant feature of the blackbody radiation curves in
Fig. A1 is that they are universal. They depend only on the temperature
and not on the size, shape or material of the blackbody. Attempts to
explain blackbody radiation theoretically, at the beginning of the
twentieth century, spurred the quantum revolution in physics, as you
will learn in later courses.

Energy can be transferred by radiation over large distances, without a
medium (i.e., in vacuum). The total electromagnetic energy radiated by a
body at absolute temperature T is proportional to its size, its ability
to radiate (called emissivity) and most importantly to its temperature.
For a body which is a perfect radiator, the energy emitted per unit
time (H) is given by

$H = A \sigma T^4$    (A2)

where A is the area and T is the absolute temperature of the body. This
relation obtained experimentally by Stefan and later proved
theoretically by Boltzmann is known as Stefan-Boltzmann law and the
constant σ is called Stefan-Boltzmann constant. Its value in SI units is
$5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$. Most bodies emit only a fraction of the rate
given by Eq. (A2). A substance like lamp black comes close to the limit.
One, therefore, defines a dimensionless fraction e called emissivity and
writes,

$H = A e \sigma T^4$    (A3)

Here e = 1 for a perfect radiator. For a tungsten lamp, for example, e is
about 0.4. Thus, a tungsten lamp at a temperature of 3000 K and a surface
area of 0.3 cm$^2$ radiates at the rate
$H = 0.3 \times 10^{-4} \times 0.4 \times 5.67 \times 10^{-8} \times (3000)^4 \approx 55$ W,
that is, roughly 60 W.

A body at temperature T, with surroundings at temperature $T_s$, emits as
well as receives energy. For a perfect radiator, the net rate of loss of
radiant energy is

$H = \sigma A (T^4 - T_s^4)$

For a body with emissivity e, the relation modifies to

$H = e \sigma A (T^4 - T_s^4)$    (A4)

As an example, let us estimate the heat radiated by our bodies. Suppose
the surface area of a person's body is about 1.9 m$^2$ and the room
temperature is 22 °C. The internal body temperature, as we know, is about
37 °C. The skin temperature may be 28 °C (say). The emissivity of the skin
is about 0.97 for the relevant region of electromagnetic radiation. The
rate of heat loss is:

$H = 5.67 \times 10^{-8} \times 1.9 \times 0.97 \times \{(301)^4 - (295)^4\} = 66.4$ W

which is more than half the rate of energy production by the body at
rest (120 W). To prevent this heat loss effectively (better than ordinary
clothing), modern arctic clothing has an additional thin shiny metallic
layer next to the skin, which reflects the body's radiation.

11.9.5 Greenhouse Effect

The earth's surface is a source of thermal radiation as it absorbs energy
received from the sun. The wavelength of this radiation lies in the long
wavelength (infrared) region. But a large portion of this radiation is
absorbed by greenhouse gases, namely, carbon dioxide (CO2); methane
(CH4); nitrous oxide (N2O); chlorofluorocarbon (CFxClx); and tropospheric
ozone (O3). This heats up the atmosphere which, in turn, gives more
energy to the earth, resulting in a warmer surface. This increases the
intensity of radiation from the surface. The cycle of processes described
above is repeated until no radiation is available for absorption. The net
result is heating up of the earth's surface and atmosphere. This is known
as the Greenhouse effect. Without the Greenhouse effect, the temperature
of the earth would have been −18 °C.

Concentration of greenhouse gases has increased due to human activities,
making the earth warmer. According to an estimate, the average
temperature of the earth has increased by 0.3 to 0.6 °C, since the
beginning of this century, because of this enhancement. By the middle of
the next century, the earth's global temperature may be 1 to 3 °C higher
than today. This global warming may cause problems for human life, plants
and animals. Because of global warming, ice caps are melting faster, sea
level is rising, and weather patterns are changing. Many coastal cities
are at the risk of getting submerged. The enhanced Greenhouse effect may
also result in expansion of deserts. All over the world, efforts are
being made to minimise the effect of global warming.
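The numerical estimates of Section 11.9.4 (Wien's displacement law and the Stefan-Boltzmann law) can be reproduced with a few lines of code. The following is a minimal sketch in Python, not part of the original text: the function names are illustrative assumptions, while the constants and the input values are the ones quoted in the section.

```python
SIGMA = 5.67e-8     # Stefan-Boltzmann constant in W m^-2 K^-4 (value quoted in the text)
WIEN_B = 2.9e-3     # Wien's constant in m K (value quoted in the text)

def wien_temperature(lambda_max):
    """Temperature (K) of a blackbody whose emission peaks at lambda_max (m)."""
    return WIEN_B / lambda_max

def radiated_power(area, temperature, emissivity=1.0):
    """Rate of emission H = e * sigma * A * T^4, in watts."""
    return emissivity * SIGMA * area * temperature ** 4

def net_radiative_loss(area, temperature, surroundings, emissivity=1.0):
    """Net rate of loss H = e * sigma * A * (T^4 - Ts^4), in watts."""
    return emissivity * SIGMA * area * (temperature ** 4 - surroundings ** 4)

if __name__ == "__main__":
    # Moon: peak near 14 micrometres; Sun: peak near 4753 angstrom.
    print("Moon surface :", round(wien_temperature(14e-6)), "K")
    print("Sun surface  :", round(wien_temperature(4753e-10)), "K")
    # Tungsten lamp: area 0.3 cm^2, T = 3000 K, e = 0.4.
    print("Tungsten lamp:", round(radiated_power(0.3e-4, 3000.0, 0.4), 1), "W")
    # Human body: area 1.9 m^2, skin at 28 C (301 K), room at 22 C (295 K), e = 0.97.
    print("Human body   :", round(net_radiative_loss(1.9, 301.0, 295.0, 0.97), 1), "W")
```

With the rounded constant 2.9 × 10^-3 m K the solar estimate comes out slightly above the 6060 K quoted in the text, and the lamp comes out near 55 W; such differences only reflect rounding of the constants and of the final figures.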

CHAPTER SIX

WORK, ENERGY AND POWER

6.1 Introduction
6.2 Notions of work and kinetic energy: The work-energy theorem
6.3 Work
6.4 Kinetic energy
6.5 Work done by a variable force
6.6 The work-energy theorem for a variable force
6.7 The concept of potential energy
6.8 The conservation of mechanical energy
6.9 The potential energy of a spring
6.10 Various forms of energy: the law of conservation of energy
6.11 Power
6.12 Collisions
Summary
Points to ponder
Exercises
Additional exercises
Appendix 6.1

6.1 INTRODUCTION

The terms 'work', 'energy' and 'power' are frequently used
in everyday language. A farmer ploughing the field, a
construction worker carrying bricks, a student studying for
a competitive examination, an artist painting a beautiful
landscape, all are said to be working. In physics, however,
the word 'Work' covers a definite and precise meaning.
Somebody who has the capacity to work for 14-16 hours a
day is said to have a large stamina or energy. We admire a
long distance runner for her stamina or energy. Energy is
thus our capacity to do work. In Physics too, the term energy
is related to work in this sense, but as said above the term
work itself is defined much more precisely. The word power
is used in everyday life with different shades of meaning. In
karate or boxing we talk of powerful punches. These are
delivered at a great speed. This shade of meaning is close to
the meaning of the word power used in physics. We shall
find that there is at best a loose correlation between the
physical definitions and the physiological pictures these
terms generate in our minds. The aim of this chapter is to
develop an understanding of these three physical quantities.
Before we proceed to this task, we need to develop a
mathematical prerequisite, namely the scalar product of two
vectors.
6.1.1 The Scalar Product
We have learnt about vectors and their use in Chapter 4.
Physical quantities like displacement, velocity, acceleration,
force etc. are vectors. We have also learnt how vectors are
added or subtracted. We now need to know how vectors are
multiplied. There are two ways of multiplying vectors which
we shall come across : one way known as the scalar product
gives a scalar from two vectors and the other known as the
vector product produces a new vector from two vectors. We
shall look at the vector product in Chapter 7. Here we take
up the scalar product of two vectors. The scalar product or
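dot product of any two vectors A and B, denoted as A.B (read
A dot B) is defined as

$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta$    (6.1a)

where θ is the angle between the two vectors as
shown in Fig. 6.1(a). Since A, B and cos θ are
scalars, the dot product of A and B is a scalar
quantity. Each vector, A and B, has a direction
but their scalar product does not have a
direction.
From Eq. (6.1a), we have

$\mathbf{A}\cdot\mathbf{B} = A(B\cos\theta) = B(A\cos\theta)$

Geometrically, B cos θ is the projection of B onto
A in Fig. 6.1(b) and A cos θ is the projection of A
onto B in Fig. 6.1(c). So, A.B is the product of
the magnitude of A and the component of B along
A. Alternatively, it is the product of the
magnitude of B and the component of A along B.
Equation (6.1a) shows that the scalar product
follows the commutative law:

$\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$

Scalar product obeys the distributive law:

$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C}$

Further,

$\mathbf{A}\cdot(\lambda\mathbf{B}) = \lambda(\mathbf{A}\cdot\mathbf{B})$

where λ is a real number.
The proofs of the above equations are left to
you as an exercise.
For unit vectors $\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}}$ we have

$\hat{\mathbf{i}}\cdot\hat{\mathbf{i}} = \hat{\mathbf{j}}\cdot\hat{\mathbf{j}} = \hat{\mathbf{k}}\cdot\hat{\mathbf{k}} = 1$

$\hat{\mathbf{i}}\cdot\hat{\mathbf{j}} = \hat{\mathbf{j}}\cdot\hat{\mathbf{k}} = \hat{\mathbf{k}}\cdot\hat{\mathbf{i}} = 0$

Given two vectors

$\mathbf{A} = A_x\hat{\mathbf{i}} + A_y\hat{\mathbf{j}} + A_z\hat{\mathbf{k}}$

$\mathbf{B} = B_x\hat{\mathbf{i}} + B_y\hat{\mathbf{j}} + B_z\hat{\mathbf{k}}$

their scalar product is

$\mathbf{A}\cdot\mathbf{B} = (A_x\hat{\mathbf{i}} + A_y\hat{\mathbf{j}} + A_z\hat{\mathbf{k}})\cdot(B_x\hat{\mathbf{i}} + B_y\hat{\mathbf{j}} + B_z\hat{\mathbf{k}}) = A_xB_x + A_yB_y + A_zB_z$    (6.1b)

From the definition of scalar product and
(Eq. 6.1b) we have:
(i) $\mathbf{A}\cdot\mathbf{A} = A_xA_x + A_yA_y + A_zA_z$
Or, $A^2 = A_x^2 + A_y^2 + A_z^2$    (6.1c)
since $\mathbf{A}\cdot\mathbf{A} = |\mathbf{A}||\mathbf{A}|\cos 0 = A^2$.
(ii) $\mathbf{A}\cdot\mathbf{B} = 0$, if A and B are perpendicular.

Example 6.1 Find the angle between force
$\mathbf{F} = (3\hat{\mathbf{i}} + 4\hat{\mathbf{j}} - 5\hat{\mathbf{k}})$ unit and displacement
$\mathbf{d} = (5\hat{\mathbf{i}} + 4\hat{\mathbf{j}} + 3\hat{\mathbf{k}})$ unit. Also find the
projection of F on d.

Answer $\mathbf{F}\cdot\mathbf{d} = F_xd_x + F_yd_y + F_zd_z$
= 3 (5) + 4 (4) + (−5) (3)
= 16 unit
Hence $\mathbf{F}\cdot\mathbf{d} = Fd\cos\theta$ = 16 unit
Now $\mathbf{F}\cdot\mathbf{F} = F^2 = F_x^2 + F_y^2 + F_z^2$
= 9 + 16 + 25
= 50 unit
and $\mathbf{d}\cdot\mathbf{d} = d^2 = d_x^2 + d_y^2 + d_z^2$
= 25 + 16 + 9
= 50 unit
∴ $\cos\theta = \frac{16}{\sqrt{50}\sqrt{50}} = \frac{16}{50} = 0.32$,
$\theta = \cos^{-1} 0.32$

Fig. 6.1 (a) The scalar product of two vectors A and B is a scalar: A.B = A B cos θ. (b) B cos θ is the projection
of B onto A. (c) A cos θ is the projection of A onto B.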
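The arithmetic of Example 6.1 can be verified with a short program. The following is a minimal sketch in Python, not part of the original text: the helper names dot, magnitude, angle_between and projection are illustrative assumptions. It evaluates the scalar product from components as in Eq. (6.1b), obtains the angle from Eq. (6.1a), and also computes the projection of F on d, F cos θ = F.d/d, which the example asks for.

```python
import math

def dot(u, v):
    """Scalar product from components: u.v = ux*vx + uy*vy + uz*vz."""
    return sum(a * b for a, b in zip(u, v))

def magnitude(u):
    """Magnitude of a vector, using u.u = |u|^2."""
    return math.sqrt(dot(u, u))

def angle_between(u, v):
    """Angle (radians) between u and v, from u.v = |u||v| cos(theta)."""
    return math.acos(dot(u, v) / (magnitude(u) * magnitude(v)))

def projection(u, v):
    """Projection of u on v, i.e. |u| cos(theta) = u.v / |v|."""
    return dot(u, v) / magnitude(v)

# Force and displacement of Example 6.1 (components along i, j, k).
F = (3, 4, -5)
d = (5, 4, 3)

if __name__ == "__main__":
    print("F.d        =", dot(F, d))
    print("cos(theta) =", dot(F, d) / (magnitude(F) * magnitude(d)))
    print("theta      =", math.degrees(angle_between(F, d)), "degrees")
    print("Projection of F on d =", projection(F, d), "unit")
```

The angle comes out a little over 71 degrees, consistent with cos θ = 0.32.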

2015-16(20/01/2015)


PHYSICS

6.2 NOTIONS OF WORK AND KINETIC ENERGY: THE WORK-ENERGY THEOREM
The following relation for rectilinear motion under constant acceleration a has been encountered in Chapter 3,
$v^2 - u^2 = 2as$
where u and v are the initial and final speeds and s the distance traversed. Multiplying both sides by m/2, we have
$\frac{1}{2}mv^2 - \frac{1}{2}mu^2 = mas = Fs$    (6.2a)
where the last step follows from Newton's Second Law. We can generalise Eq. (6.2a) to three dimensions by employing vectors
$v^2 - u^2 = 2\,\mathbf{a}\cdot\mathbf{d}$
Once again multiplying both sides by m/2, we obtain
$\frac{1}{2}mv^2 - \frac{1}{2}mu^2 = m\,\mathbf{a}\cdot\mathbf{d} = \mathbf{F}\cdot\mathbf{d}$    (6.2b)
The above equation provides a motivation for the definitions of work and kinetic energy. The left side of the equation is the difference in the quantity 'half the mass times the square of the speed' from its initial value to its final value. We call each of these quantities the 'kinetic energy', denoted by K. The right side is a product of the displacement and the component of the force along the displacement. This quantity is called 'work' and is denoted by W. Eq. (6.2b) is then
$K_f - K_i = W$    (6.3)
where $K_i$ and $K_f$ are respectively the initial and final kinetic energies of the object. Work refers to the force and the displacement over which it acts. Work is done by a force on the body over a certain displacement.
Equation (6.2) is also a special case of the work-energy (WE) theorem: The change in kinetic energy of a particle is equal to the work done on it by the net force. We shall generalise the above derivation to a varying force in a later section.

Example 6.2 It is well known that a raindrop falls under the influence of the downward gravitational force and the opposing resistive force. The latter is known to be proportional to the speed of the drop but is otherwise undetermined. Consider a drop of mass 1.00 g falling from a height 1.00 km. It hits the ground with a speed of 50.0 m s$^{-1}$. (a) What is the work done by the gravitational force? (b) What is the work done by the unknown resistive force?

Answer (a) The change in kinetic energy of the drop is
$\Delta K = \frac{1}{2}mv^2 - 0 = \frac{1}{2} \times 10^{-3} \times 50 \times 50 = 1.25$ J
where we have assumed that the drop is initially at rest.
Assuming that g is a constant with a value 10 m/s$^2$, the work done by the gravitational force is,
$W_g = mgh = 10^{-3} \times 10 \times 10^{3} = 10.0$ J
(b) From the work-energy theorem
$\Delta K = W_g + W_r$
where $W_r$ is the work done by the resistive force on the raindrop. Thus
$W_r = \Delta K - W_g = 1.25 - 10 = -8.75$ J
is negative.

6.3 WORK

As seen earlier, work is related to force and the


displacement over which it acts. Consider a
constant force F acting on an object of mass m.
The object undergoes a displacement d in the
positive x-direction as shown in Fig. 6.2.

Fig. 6.2 An object undergoes a displacement d under the influence of the force F.

The work done by the force is defined to be the product of component of the force in the direction of the displacement and the magnitude of this displacement. Thus
$W = (F\cos\theta)\,d = \mathbf{F}\cdot\mathbf{d}$    (6.4)
We see that if there is no displacement, there is no work done even if the force is large. Thus, when you push hard against a rigid brick wall, the force you exert on the wall does no work. Yet your muscles are alternately contracting and relaxing and internal energy is being used up and you do get tired. Thus, the meaning of work in physics is different from its usage in everyday language.
No work is done if:
(i) the displacement is zero as seen in the example above. A weightlifter holding a 150 kg mass steadily on his shoulder for 30 s does no work on the load during this time.
(ii) the force is zero. A block moving on a smooth horizontal table is not acted upon by a horizontal force (since there is no friction), but may undergo a large displacement.
(iii) the force and displacement are mutually perpendicular. This is so since, for $\theta = \pi/2$ rad (= 90°), $\cos(\pi/2) = 0$. For the block moving on a smooth horizontal table, the gravitational force mg does no work since it acts at right angles to the displacement. If we assume that the moon's orbit around the earth is perfectly circular then the earth's gravitational force does no work. The moon's instantaneous displacement is tangential while the earth's force is radially inwards and $\theta = \pi/2$.
Work can be both positive and negative. If $\theta$ is between 0° and 90°, $\cos\theta$ in Eq. (6.4) is positive. If $\theta$ is between 90° and 180°, $\cos\theta$ is negative. In many examples the frictional force opposes displacement and $\theta$ = 180°. Then the work done by friction is negative ($\cos 180° = -1$).
From Eq. (6.4) it is clear that work and energy have the same dimensions, [ML$^2$T$^{-2}$]. The SI unit of these is joule (J), named after the famous British physicist James Prescott Joule (1811-1869). Since work and energy are so widely used as physical concepts, alternative units abound and some of these are listed in Table 6.1.

Table 6.1 Alternative Units of Work/Energy in J

Example 6.3 A cyclist comes to a skidding stop in 10 m. During this process, the force on the cycle due to the road is 200 N and is directly opposed to the motion. (a) How much work does the road do on the cycle? (b) How much work does the cycle do on the road?

Answer Work done on the cycle by the road is the work done by the stopping (frictional) force on the cycle due to the road.
(a) The stopping force and the displacement make an angle of 180° ($\pi$ rad) with each other. Thus, work done by the road,
$W_r = Fd\cos\theta = 200 \times 10 \times \cos\pi = -2000$ J
It is this negative work that brings the cycle to a halt in accordance with WE theorem.
(b) From Newton's Third Law an equal and opposite force acts on the road due to the cycle. Its magnitude is 200 N. However, the road undergoes no displacement. Thus, work done by cycle on the road is zero.

The lesson of Example 6.3 is that though the force on a body A exerted by the body B is always equal and opposite to that on B by A (Newton's Third Law); the work done on A by B is not necessarily equal and opposite to the work done on B by A.
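As a quick numerical illustration of Eq. (6.4), the following Python sketch evaluates $W = Fd\cos\theta$ for the two parts of Example 6.3 and for a perpendicular force. The function name `work_done` and the 50 N, 3 m values in the last case are assumptions made only for this illustration.

```python
import math

def work_done(force, displacement, angle_rad):
    """Work W = F d cos(theta) for a constant force, Eq. (6.4)."""
    return force * displacement * math.cos(angle_rad)

# Example 6.3 (a): 200 N stopping force opposed to a 10 m displacement
w_road_on_cycle = work_done(200.0, 10.0, math.pi)

# Example 6.3 (b): the road feels 200 N but undergoes no displacement
w_cycle_on_road = work_done(200.0, 0.0, math.pi)

# Hypothetical perpendicular case: force at right angles to the displacement
w_perpendicular = work_done(50.0, 3.0, math.pi / 2)

print(w_road_on_cycle, w_cycle_on_road, w_perpendicular)
```

The third value is zero only up to floating-point rounding, since `math.cos(math.pi / 2)` is a very small number rather than exactly zero.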
6.4 KINETIC ENERGY
As noted earlier, if an object of mass m has
velocity v, its kinetic energy K is
$K = \frac{1}{2}\,m\,\mathbf{v}\cdot\mathbf{v} = \frac{1}{2}mv^2$    (6.5)

Kinetic energy is a scalar quantity. The kinetic energy of an object is a measure of the work an object can do by the virtue of its motion. This notion has been intuitively known for a long time. The kinetic energy of a fast flowing stream has been used to grind corn. Sailing ships employ the kinetic energy of the wind. Table 6.2 lists the kinetic energies for various objects.

Table 6.2 Typical kinetic energies (K)

Example 6.4 In a ballistics demonstration a police officer fires a bullet of mass 50.0 g with speed 200 m s$^{-1}$ (see Table 6.2) on soft plywood of thickness 2.00 cm. The bullet emerges with only 10% of its initial kinetic energy. What is the emergent speed of the bullet?

Answer The initial kinetic energy of the bullet is $mv^2/2 = 1000$ J. It has a final kinetic energy of $0.1 \times 1000 = 100$ J. If $v_f$ is the emergent speed of the bullet,
$\frac{1}{2}mv_f^2 = 100$ J
$v_f = \sqrt{\dfrac{2 \times 100\ \text{J}}{0.05\ \text{kg}}} = 63.2$ m s$^{-1}$
The speed is reduced by approximately 68% (not 90%).

6.5 WORK DONE BY A VARIABLE FORCE
A constant force is rare. It is the variable force, which is more commonly encountered. Fig. 6.3 is a plot of a varying force in one dimension.
If the displacement $\Delta x$ is small, we can take the force F(x) as approximately constant and the work done is then
$\Delta W = F(x)\,\Delta x$
This is illustrated in Fig. 6.3(a). Adding successive rectangular areas in Fig. 6.3(a) we get the total work done as
$W \cong \sum_{x_i}^{x_f} F(x)\,\Delta x$    (6.6)
where the summation is from the initial position $x_i$ to the final position $x_f$.
If the displacements are allowed to approach zero, then the number of terms in the sum increases without limit, but the sum approaches a definite value equal to the area under the curve in Fig. 6.3(b). Then the work done is
$W = \lim_{\Delta x \to 0} \sum_{x_i}^{x_f} F(x)\,\Delta x = \int_{x_i}^{x_f} F(x)\,dx$    (6.7)
where 'lim' stands for the limit of the sum when $\Delta x$ tends to zero. Thus, for a varying force the work done can be expressed as a definite integral of force over displacement (see also Appendix 3.1).
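The limiting process of Eqs. (6.6) and (6.7) can be illustrated numerically. The sketch below sums rectangles $F(x)\,\Delta x$ for an increasing number of strips. The force law $F(x) = 3x^2$ and the interval from 0 to 2 m are hypothetical choices, picked only because the exact integral ($x_f^3 - x_i^3 = 8$ J) is easy to compare against.

```python
def work_riemann(force, x_i, x_f, n):
    """Approximate W by the sum of n rectangles F(x) * dx, as in Eq. (6.6)."""
    dx = (x_f - x_i) / n
    return sum(force(x_i + k * dx) * dx for k in range(n))

def force(x):
    """Hypothetical position-dependent force in newtons (illustration only)."""
    return 3.0 * x ** 2

x_i, x_f = 0.0, 2.0
exact_work = x_f ** 3 - x_i ** 3  # integral of 3 x^2 dx, Eq. (6.7)

# The sum approaches the exact value as the strips get narrower
approximations = {n: work_riemann(force, x_i, x_f, n) for n in (10, 100, 1000, 10000)}
for n, w in approximations.items():
    print(n, w, abs(w - exact_work))
```

For this smooth force the error of the rectangle sum shrinks roughly in proportion to the strip width, which is the numerical counterpart of the statement that the sum approaches the area under the curve.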

Fig. 6.3 (a) The shaded rectangle represents the work done by the varying force F(x), over the small displacement $\Delta x$, $\Delta W = F(x)\,\Delta x$. (b) adding the areas of all the rectangles we find that for $\Delta x \to 0$, the area under the curve is exactly equal to the work done by F(x).

Example 6.5 A woman pushes a trunk on a railway platform which has a rough surface. She applies a force of 100 N over a distance of 10 m. Thereafter, she gets progressively tired and her applied force reduces linearly with distance to 50 N. The total distance through which the trunk has been moved is 20 m. Plot the force applied by the woman and the frictional force, which is 50 N versus displacement. Calculate the work done by the two forces over 20 m.

Answer

Fig. 6.4 Plot of the force F applied by the woman and the opposing frictional force f versus displacement.

The plot of the applied force is shown in Fig. 6.4. At x = 20 m, F = 50 N ($\neq 0$). We are given that the frictional force f is |f| = 50 N. It opposes motion and acts in a direction opposite to F. It is therefore, shown on the negative side of the force axis.
The work done by the woman is
$W_F \to$ area of the rectangle ABCD + area of the trapezium CEID
$W_F = 100 \times 10 + \frac{1}{2}(100 + 50) \times 10 = 1000 + 750 = 1750$ J
The work done by the frictional force is
$W_f \to$ area of the rectangle AGHI
$W_f = (-50) \times 20 = -1000$ J
The area on the negative side of the force axis has a negative sign.

6.6 THE WORK-ENERGY THEOREM FOR A VARIABLE FORCE
We are now familiar with the concepts of work and kinetic energy to prove the work-energy theorem for a variable force. We confine ourselves to one dimension. The time rate of change of kinetic energy is
$\dfrac{dK}{dt} = \dfrac{d}{dt}\left(\dfrac{1}{2}mv^2\right) = m\dfrac{dv}{dt}\,v = Fv$ (from Newton's Second Law) $= F\dfrac{dx}{dt}$
Thus
$dK = F\,dx$
Integrating from the initial position ($x_i$) to final position ($x_f$), we have
$\int_{K_i}^{K_f} dK = \int_{x_i}^{x_f} F\,dx$
where, $K_i$ and $K_f$ are the initial and final kinetic energies corresponding to $x_i$ and $x_f$,
or
$K_f - K_i = \int_{x_i}^{x_f} F\,dx$    (6.8a)
From Eq. (6.7), it follows that
$K_f - K_i = W$    (6.8b)
Thus, the WE theorem is proved for a variable force.
While the WE theorem is useful in a variety of problems, it does not, in general, incorporate the complete dynamical information of Newton's second law. It is an integral form of Newton's second law. Newton's second law is a relation between acceleration and force at any instant of time. Work-energy theorem involves an integral over an interval of time. In this sense, the temporal (time) information contained in the statement of Newton's second law is 'integrated over' and is not available explicitly. Another observation is that Newton's second law for two or three dimensions is in vector form whereas the work-energy theorem is in scalar form. In the scalar form, information with respect to directions contained in Newton's second law is not present.
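The statement $K_f - K_i = W$ for a variable force can be checked numerically by stepping Newton's second law forward in time and accumulating $F\,dx$ along the way. The sketch below is only an illustration under stated assumptions: the force law $F = -kx$, the values of `m`, `k`, the starting point, and the use of a velocity-Verlet time step are all choices made here, not taken from the text.

```python
def simulate(force, m, x0, v0, dt, steps):
    """Integrate m dv/dt = F(x) with velocity-Verlet steps.

    Returns the final position, final velocity and the accumulated work,
    where the work is summed as (average force over the step) * (step in x).
    """
    x, v = x0, v0
    a = force(x) / m
    work = 0.0
    for _ in range(steps):
        x_new = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x_new) / m
        v_new = v + 0.5 * (a + a_new) * dt
        work += 0.5 * (force(x) + force(x_new)) * (x_new - x)
        x, v, a = x_new, v_new, a_new
    return x, v, work

# Hypothetical system: a position-dependent force F = -k x acting on mass m
m, k = 2.0, 8.0
x0, v0 = 0.5, 0.0
x_f, v_f, W = simulate(lambda x: -k * x, m, x0, v0, dt=1e-4, steps=5000)

delta_K = 0.5 * m * v_f ** 2 - 0.5 * m * v0 ** 2
print(delta_K, W)  # the two numbers agree to within the integration error
```

Note that the time step enters only through the simulation; the final comparison involves just the end-point speeds and the accumulated work, mirroring the remark that the temporal information is integrated over.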
Example 6.6 A block of mass m = 1 kg, moving on a horizontal surface with speed $v_i$ = 2 m s$^{-1}$ enters a rough patch ranging from x = 0.10 m to x = 2.01 m. The retarding force $F_r$ on the block in this range is inversely proportional to x over this range,
$F_r = \dfrac{-k}{x}$ for 0.1 < x < 2.01 m
$= 0$ for x < 0.1 m and x > 2.01 m
where k = 0.5 J. What is the final kinetic energy and speed $v_f$ of the block as it crosses this patch?

Answer From Eq. (6.8a)
$K_f = K_i + \int_{0.1}^{2.01} \dfrac{(-k)}{x}\,dx$
$= \frac{1}{2}mv_i^2 - k\ln(x)\Big|_{0.1}^{2.01}$
$= \frac{1}{2}mv_i^2 - k\ln(2.01/0.1)$
$= 2 - 0.5\ln(20.1)$
$= 2 - 1.5 = 0.5$ J
$v_f = \sqrt{2K_f/m} = 1$ m s$^{-1}$
Here, note that ln is a symbol for the natural logarithm to the base e and not the logarithm to the base 10 [$\ln X = \log_e X = 2.303\,\log_{10} X$].

6.7 THE CONCEPT OF POTENTIAL ENERGY
The word potential suggests possibility or capacity for action. The term potential energy brings to one's mind 'stored' energy. A stretched bow-string possesses potential energy. When it is released, the arrow flies off at a great speed. The earth's crust is not uniform, but has discontinuities and dislocations that are called fault lines. These fault lines in the earth's crust are like 'compressed springs'. They possess a large amount of potential energy. An earthquake results when these fault lines readjust. Thus, potential energy is the 'stored energy' by virtue of the position or configuration of a body. The body left to itself releases this stored energy in the form of kinetic energy. Let us make our notion of potential energy more concrete.
The gravitational force on a ball of mass m is mg. g may be treated as a constant near the earth surface. By 'near' we imply that the height h of the ball above the earth's surface is very small compared to the earth's radius $R_E$ ($h \ll R_E$) so that we can ignore the variation of g near the earth's surface*. In what follows we have taken the upward direction to be positive. Let us raise the ball up to a height h. The work done by the external agency against the gravitational force is mgh. This work gets stored as potential energy. Gravitational potential energy of an object, as a function of the height h, is denoted by V(h) and it is the negative of work done by the gravitational force in raising the object to that height.
$V(h) = mgh$
If h is taken as a variable, it is easily seen that the gravitational force F equals the negative of the derivative of V(h) with respect to h. Thus,
$F = -\dfrac{d}{dh}V(h) = -mg$
The negative sign indicates that the gravitational force is downward. When released, the ball comes down with an increasing speed. Just before it hits the ground, its speed is given by the kinematic relation,
$v^2 = 2gh$
This equation can be written as
$\frac{1}{2}mv^2 = mgh$
which shows that the gravitational potential energy of the object at height h, when the object is released, manifests itself as kinetic energy of the object on reaching the ground.

* The variation of g with height is discussed in Chapter 8 on Gravitation.

Physically, the notion of potential energy is applicable only to the class of forces where work done against the force gets 'stored up' as energy. When external constraints are removed, it manifests itself as kinetic energy. Mathematically, (for simplicity, in one dimension) the potential energy V(x) is defined if the force F(x) can be written as
$F(x) = -\dfrac{dV}{dx}$
This implies that
$\int_{x_i}^{x_f} F(x)\,dx = -\int_{V_i}^{V_f} dV = V_i - V_f$
The work done by a conservative force such as gravity depends on the initial and final positions only. In the previous chapter we have worked on examples dealing with inclined planes. If an object of mass m is released from rest, from the top of a smooth (frictionless) inclined plane of height h, its speed at the bottom is $\sqrt{2gh}$ irrespective of the angle of inclination. Thus, at the bottom of the inclined plane it acquires a kinetic energy, mgh. If the work done or the kinetic energy did depend on other factors such as the velocity or the particular path taken by the object, the force would be called non-conservative.
The dimensions of potential energy are [ML$^2$T$^{-2}$] and the unit is joule (J), the same as kinetic energy or work. To reiterate, the change in potential energy, for a conservative force, $\Delta V$ is equal to the negative of the work done by the force
$\Delta V = -F(x)\,\Delta x$    (6.9)
In the example of the falling ball considered in this section we saw how potential energy was converted to kinetic energy. This hints at an important principle of conservation in mechanics, which we now proceed to examine.

6.8 THE CONSERVATION OF MECHANICAL ENERGY
For simplicity we demonstrate this important principle for one-dimensional motion. Suppose that a body undergoes displacement $\Delta x$ under the action of a conservative force F. Then from the WE theorem we have,
$\Delta K = F(x)\,\Delta x$
If the force is conservative, the potential energy function V(x) can be defined such that
$-\Delta V = F(x)\,\Delta x$
The above equations imply that
$\Delta K + \Delta V = 0$
$\Delta(K + V) = 0$    (6.10)
which means that K + V, the sum of the kinetic and potential energies of the body is a constant. Over the whole path, $x_i$ to $x_f$, this means that
$K_i + V(x_i) = K_f + V(x_f)$    (6.11)
The quantity K + V(x), is called the total mechanical energy of the system. Individually the kinetic energy K and the potential energy V(x) may vary from point to point, but the sum is a constant. The aptness of the term 'conservative force' is now clear.
Let us consider some of the definitions of a conservative force.
• A force F(x) is conservative if it can be derived from a scalar quantity V(x) by the relation given by Eq. (6.9). The three-dimensional generalisation requires the use of a vector derivative, which is outside the scope of this book.
• The work done by the conservative force depends only on the end points. This can be seen from the relation,
$W = K_f - K_i = V(x_i) - V(x_f)$
which depends on the end points.
• A third definition states that the work done by this force in a closed path is zero. This is once again apparent from Eq. (6.11) since $x_i = x_f$.
Thus, the principle of conservation of total mechanical energy can be stated as
The total mechanical energy of a system is conserved if the forces, doing work on it, are conservative.
The above discussion can be made more concrete by considering the example of the gravitational force once again and that of the spring force in the next section. Fig. 6.5 depicts a ball of mass m being dropped from a cliff of height H.

Fig. 6.5 The conversion of potential energy to kinetic energy for a ball of mass m dropped from a height H.

The total mechanical energies $E_0$, $E_h$, and $E_H$ of the ball at the indicated heights zero (ground level), h and H, are
$E_H = mgH$    (6.11 a)
$E_h = mgh + \frac{1}{2}mv_h^2$    (6.11 b)
$E_0 = (1/2)\,mv_f^2$    (6.11 c)
The constant force is a special case of a spatially dependent force F(x). Hence, the mechanical energy is conserved. Thus
$E_H = E_0$
or, $mgH = \frac{1}{2}mv_f^2$
$v_f = \sqrt{2gH}$
a result that was obtained in section 3.7 for a freely falling body.
Further,
$E_H = E_h$
which implies,
$v_h^2 = 2g(H - h)$    (6.11 d)
and is a familiar result from kinematics.
At the height H, the energy is purely potential. It is partially converted to kinetic at height h and is fully kinetic at ground level. This illustrates the conservation of mechanical energy.

Example 6.7 A bob of mass m is suspended by a light string of length L. It is imparted a horizontal velocity $v_0$ at the lowest point A such that it completes a semi-circular trajectory in the vertical plane with the string becoming slack only on reaching the topmost point, C. This is shown in Fig. 6.6. Obtain an expression for (i) $v_0$; (ii) the speeds at points B and C; (iii) the ratio of the kinetic energies ($K_B/K_C$) at B and C. Comment on the nature of the trajectory of the bob after it reaches the point C.

Fig. 6.6

Answer (i) There are two external forces on the bob: gravity and the tension (T) in the string. The latter does no work since the displacement of the bob is always normal to the string. The potential energy of the bob is thus associated with the gravitational force only. The total mechanical energy E of the system is conserved. We take the potential energy of the system to be zero at the lowest point A. Thus, at A:
$E = \frac{1}{2}mv_0^2$    (6.12)
$T_A - mg = \dfrac{mv_0^2}{L}$    [Newton's Second Law]
where $T_A$ is the tension in the string at A. At the highest point C, the string slackens, as the tension in the string ($T_C$) becomes zero. Thus, at C
$E = \frac{1}{2}mv_C^2 + 2mgL$    (6.13)
$mg = \dfrac{mv_C^2}{L}$    [Newton's Second Law]    (6.14)
where $v_C$ is the speed at C. From Eqs. (6.13) and (6.14)
$E = \frac{5}{2}mgL$
Equating this to the energy at A
$\frac{5}{2}mgL = \frac{m}{2}v_0^2$
or, $v_0 = \sqrt{5gL}$
(ii) It is clear from Eq. (6.14)
$v_C = \sqrt{gL}$
At B, the energy is
$E = \frac{1}{2}mv_B^2 + mgL$
Equating this to the energy at A and employing the result from (i), namely $v_0^2 = 5gL$,
$\frac{1}{2}mv_B^2 + mgL = \frac{1}{2}mv_0^2 = \frac{5}{2}mgL$
$\therefore v_B = \sqrt{3gL}$

(iii) The ratio of the kinetic energies at B and C is:
$\dfrac{K_B}{K_C} = \dfrac{\frac{1}{2}mv_B^2}{\frac{1}{2}mv_C^2} = \dfrac{3}{1}$
At point C, the string becomes slack and the velocity of the bob is horizontal and to the left. If the connecting string is cut at this instant, the bob will execute a projectile motion with horizontal projection akin to a rock kicked horizontally from the edge of a cliff. Otherwise the bob will continue on its circular path and complete the revolution.

6.9 THE POTENTIAL ENERGY OF A SPRING
The spring force is an example of a variable force which is conservative. Fig. 6.7 shows a block attached to a spring and resting on a smooth horizontal surface. The other end of the spring is attached to a rigid wall. The spring is light and may be treated as massless. In an ideal spring, the spring force $F_s$ is proportional to x where x is the displacement of the block from the equilibrium position. The displacement could be either positive [Fig. 6.7(b)] or negative [Fig. 6.7(c)]. This force law for the spring is called Hooke's law and is mathematically stated as
$F_s = -kx$
The constant k is called the spring constant. Its unit is N m$^{-1}$. The spring is said to be stiff if k is large and soft if k is small.
Suppose that we pull the block outwards as in Fig. 6.7(b). If the extension is $x_m$, the work done by the spring force is
$W_s = \int_0^{x_m} F_s\,dx = -\int_0^{x_m} kx\,dx = -\dfrac{kx_m^2}{2}$    (6.15)
This expression may also be obtained by considering the area of the triangle as in Fig. 6.7(d). Note that the work done by the external pulling force F is positive since it overcomes the spring force.
$W = +\dfrac{kx_m^2}{2}$    (6.16)

Fig. 6.7 Illustration of the spring force with a block attached to the free end of the spring. (a) The spring force $F_s$ is zero when the displacement x from the equilibrium position is zero. (b) For the stretched spring x > 0 and $F_s$ < 0. (c) For the compressed spring x < 0 and $F_s$ > 0. (d) The plot of $F_s$ versus x. The area of the shaded triangle represents the work done by the spring force. Due to the opposing signs of $F_s$ and x, this work done is negative, $W_s = -kx_m^2/2$.

The same is true when the spring is compressed with a displacement $x_c$ (< 0). The spring force does work $W_s = -kx_c^2/2$ while the external force F does work $+kx_c^2/2$. If the block is moved from an initial displacement $x_i$ to a final displacement $x_f$, the work done by the spring force $W_s$ is
$W_s = -\int_{x_i}^{x_f} kx\,dx = \dfrac{kx_i^2}{2} - \dfrac{kx_f^2}{2}$    (6.17)
Thus the work done by the spring force depends only on the end points. Specifically, if the block is pulled from $x_i$ and allowed to return to $x_i$;
$W_s = -\int_{x_i}^{x_i} kx\,dx = \dfrac{kx_i^2}{2} - \dfrac{kx_i^2}{2} = 0$    (6.18)
The work done by the spring force in a cyclic process is zero. We have explicitly demonstrated that the spring force (i) is position dependent only as first stated by Hooke, ($F_s = -kx$); (ii) does work which only depends on the initial and final positions, e.g. Eq. (6.17). Thus, the spring force is a conservative force.
We define the potential energy V(x) of the spring to be zero when block and spring system is in the equilibrium position. For an extension (or compression) x the above analysis suggests that
$V(x) = \dfrac{kx^2}{2}$    (6.19)
You may easily verify that $-dV/dx = -kx$, the spring force. If the block of mass m in Fig. 6.7 is extended to $x_m$ and released from rest, then its total mechanical energy at any arbitrary point x, where x lies between $-x_m$ and $+x_m$, will be given by
$\frac{1}{2}kx_m^2 = \frac{1}{2}kx^2 + \frac{1}{2}mv^2$
where we have invoked the conservation of mechanical energy. This suggests that the speed and the kinetic energy will be maximum at the equilibrium position, x = 0, i.e.,
$\frac{1}{2}mv_m^2 = \frac{1}{2}kx_m^2$
where $v_m$ is the maximum speed,
or $v_m = \sqrt{\dfrac{k}{m}}\,x_m$
Note that k/m has the dimensions of [T$^{-2}$] and our equation is dimensionally correct. The kinetic energy gets converted to potential energy and vice versa, however, the total mechanical energy remains constant. This is graphically depicted in Fig. 6.8.

Fig. 6.8 Parabolic plots of the potential energy V and kinetic energy K of a block attached to a spring obeying Hooke's law. The two plots are complementary, one decreasing as the other increases. The total mechanical energy E = K + V remains constant.

Example 6.8 To simulate car accidents, auto manufacturers study the collisions of moving cars with mounted springs of different spring constants. Consider a typical simulation with a car of mass 1000 kg moving with a speed 18.0 km/h on a smooth road and colliding with a horizontally mounted spring of spring constant $6.25 \times 10^3$ N m$^{-1}$. What is the maximum compression of the spring?

Answer At maximum compression the kinetic energy of the car is converted entirely into the potential energy of the spring.
The kinetic energy of the moving car is
$K = \frac{1}{2}mv^2 = \frac{1}{2} \times 10^3 \times 5 \times 5$
$K = 1.25 \times 10^4$ J
where we have converted 18 km h$^{-1}$ to 5 m s$^{-1}$ [It is useful to remember that 36 km h$^{-1}$ = 10 m s$^{-1}$]. At maximum compression $x_m$, the potential energy V of the spring is equal to the kinetic energy K of the moving car from the principle of conservation of mechanical energy.
$V = \frac{1}{2}kx_m^2 = 1.25 \times 10^4$ J
We obtain
$x_m = 2.00$ m
We note that we have idealised the situation. The spring is considered to be massless. The surface has been considered to possess negligible friction.
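The numbers in Example 6.8 follow directly from equating $\frac{1}{2}mv^2$ to $\frac{1}{2}kx_m^2$. A minimal Python sketch of that calculation is given below; the variable names are illustrative only.

```python
import math

# Data from Example 6.8
mass = 1000.0              # kg
speed_kmh = 18.0           # km/h
spring_constant = 6.25e3   # N/m

# Convert km/h to m/s (36 km/h corresponds to 10 m/s)
speed = speed_kmh * 1000.0 / 3600.0

# Kinetic energy of the car before it touches the spring
kinetic_energy = 0.5 * mass * speed ** 2

# At maximum compression all of it is stored as (1/2) k x_m^2
x_max = math.sqrt(2.0 * kinetic_energy / spring_constant)

print(speed, kinetic_energy, x_max)
```

The same idealisations as in the text apply: a massless spring and a frictionless surface.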

We conclude this section by making a few remarks on conservative forces.
(i) Information on time is absent from the above
discussions. In the example considered
above, we can calculate the compression, but
not the time over which the compression
occurs. A solution of Newton's Second Law
for this system is required for temporal
information.
(ii) Not all forces are conservative. Friction, for
example, is a non-conservative force. The
principle of conservation of energy will have
to be modified in this case. This is illustrated
in Example 6.9.
(iii) The zero of the potential energy is arbitrary.
It is set according to convenience. For the
spring force we took V(x) = 0, at x = 0, i.e. the
unstretched spring had zero potential
energy. For the constant gravitational force
mg, we took V = 0 on the earth's surface. In
a later chapter we shall see that for the force
due to the universal law of gravitation, the
zero is best defined at an infinite distance
from the gravitational source. However, once
the zero of the potential energy is fixed in a
given discussion, it must be consistently
adhered to throughout the discussion. You
cannot change horses in midstream !
Example 6.9 Consider Example 6.8 taking the coefficient of friction, $\mu$, to be 0.5 and calculate the maximum compression of the spring.

Answer In presence of friction, both the spring force and the frictional force act so as to oppose the compression of the spring as shown in Fig. 6.9.

Fig. 6.9 The forces acting on the car.

We invoke the work-energy theorem, rather than the conservation of mechanical energy.
The change in kinetic energy is
$\Delta K = K_f - K_i = 0 - \frac{1}{2}mv^2$
The work done by the net force is
$W = -\frac{1}{2}kx_m^2 - \mu mg\,x_m$
Equating we have
$\frac{1}{2}mv^2 = \frac{1}{2}kx_m^2 + \mu mg\,x_m$
Now $\mu mg = 0.5 \times 10^3 \times 10 = 5 \times 10^3$ N (taking g = 10.0 m s$^{-2}$). After rearranging the above equation we obtain the following quadratic equation in the unknown $x_m$.
$kx_m^2 + 2\mu mg\,x_m - mv^2 = 0$
$x_m = \dfrac{-\mu mg + \left[\mu^2m^2g^2 + mkv^2\right]^{1/2}}{k}$
where we take the positive square root since $x_m$ is positive. Putting in numerical values we obtain
$x_m = 1.35$ m
which, as expected, is less than the result in Example 6.8.
If the two forces on the body consist of a conservative force $F_c$ and a non-conservative force $F_{nc}$, the conservation of mechanical energy formula will have to be modified. By the WE theorem
$(F_c + F_{nc})\,\Delta x = \Delta K$
But
$F_c\,\Delta x = -\Delta V$
Hence,
$\Delta(K + V) = F_{nc}\,\Delta x$
$\Delta E = F_{nc}\,\Delta x$
where E is the total mechanical energy. Over the path this assumes the form
$E_f - E_i = W_{nc}$
where $W_{nc}$ is the total work done by the non-conservative forces over the path. Note that unlike the conservative force, $W_{nc}$ depends on the particular path i to f.
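The quadratic of Example 6.9 can be solved numerically as a check on the quoted value of $x_m$. In the sketch below the function name `max_compression` is an illustrative choice, and g = 10 m s$^{-2}$ is used as in the worked answer.

```python
import math

def max_compression(m, v, k, mu, g=10.0):
    """Positive root of k x^2 + 2 mu m g x - m v^2 = 0 (Example 6.9)."""
    friction = mu * m * g
    return (-friction + math.sqrt(friction ** 2 + m * k * v ** 2)) / k

# Data of Examples 6.8 and 6.9
m, v, k, mu = 1000.0, 5.0, 6.25e3, 0.5

x_with_friction = max_compression(m, v, k, mu)
x_without_friction = max_compression(m, v, k, 0.0)  # recovers Example 6.8

# Energy bookkeeping: E_f - E_i should equal the work done by friction
E_i = 0.5 * m * v ** 2
E_f = 0.5 * k * x_with_friction ** 2
W_nc = -mu * m * 10.0 * x_with_friction

print(x_with_friction, x_without_friction, E_f - E_i, W_nc)
```

Setting the coefficient of friction to zero reduces the formula to the frictionless result, and the last two printed numbers illustrate $E_f - E_i = W_{nc}$ for this path.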

6.10 VARIOUS FORMS OF ENERGY : THE LAW OF CONSERVATION OF ENERGY
In the previous section we have discussed
mechanical energy. We have seen that it can be
classified into two distinct categories : one based
on motion, namely kinetic energy; the other on
configuration (position), namely potential energy.
Energy comes in many forms which transform
into one another in ways which may not often
be clear to us.
6.10.1 Heat
We have seen that the frictional force is not a
conservative force. However, work is associated
with the force of friction, Example 6.5. A block of
mass m sliding on a rough horizontal surface
with speed $v_0$ comes to a halt over a distance $x_0$.
The work done by the force of kinetic friction f
over $x_0$ is $-f x_0$. By the work-energy theorem
$mv_0^2/2 = f x_0$. If we confine our scope to
mechanics, we would say that the kinetic energy
of the block is lost due to the frictional force.
On examination of the block and the table we
would detect a slight increase in their
temperatures. The work done by friction is not
lost, but is transferred as heat energy. This
raises the internal energy of the block and the
table. In winter, in order to feel warm, we
generate heat by vigorously rubbing our palms
together. We shall see later that the internal
energy is associated with the ceaseless, often
random, motion of molecules. A quantitative idea
of the transfer of heat energy is obtained by
noting that 1 kg of water releases about 42000 J
of energy when it cools by 10 °C.
6.10.2 Chemical Energy
One of the greatest technical achievements of
humankind occurred when we discovered how
to ignite and control fire. We learnt to rub two
flint stones together (mechanical energy), got
them to heat up and to ignite a heap of dry leaves
(chemical energy), which then provided
sustained warmth. A matchstick ignites into a
bright flame when struck against a specially
prepared chemical surface. The lighted
matchstick, when applied to a firecracker,
results in a spectacular display of sound and
light.

Chemical energy arises from the fact that the molecules participating in the chemical reaction
have different binding energies. A stable chemical
compound has less energy than the separated parts.
A chemical reaction is basically a rearrangement
of atoms. If the total energy of the reactants is more
than the products of the reaction, heat is released
and the reaction is said to be an exothermic
reaction. If the reverse is true, heat is absorbed and
the reaction is endothermic. Coal consists of
carbon and a kilogram of it when burnt releases
about $3 \times 10^7$ J of energy.
Chemical energy is associated with the forces
that give rise to the stability of substances. These
forces bind atoms into molecules, molecules into
polymeric chains, etc. The chemical energy
arising from the combustion of coal, cooking gas,
wood and petroleum is indispensable to our daily
existence.
6.10.3 Electrical Energy
The flow of electrical current causes bulbs to
glow, fans to rotate and bells to ring. There are
laws governing the attraction and repulsion of
charges and currents, which we shall learn
later. Energy is associated with an electric
current. An urban Indian household consumes
about 200 J of energy per second on an average.
6.10.4 The Equivalence of Mass and Energy
Till the end of the nineteenth century, physicists
believed that in every physical and chemical
process, the mass of an isolated system is
conserved. Matter might change its phase, e.g.
glacial ice could melt into a gushing stream, but
matter is neither created nor destroyed; Albert
Einstein (1879-1955) however, showed that mass
and energy are equivalent and are related by
the relation
$E = mc^2$    (6.20)
where c, the speed of light in vacuum is
approximately $3 \times 10^8$ m s$^{-1}$. Thus, a staggering
amount of energy is associated with a mere
kilogram of matter
$E = 1 \times (3 \times 10^8)^2$ J $= 9 \times 10^{16}$ J.
This is equivalent to the annual electrical output
of a large (3000 MW) power generating station.
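The comparison with a power station can be made explicit with a few lines of Python. The sketch assumes a station running continuously at 3000 MW for a 365-day year; that operating assumption and the names used are introduced here only for illustration.

```python
SPEED_OF_LIGHT = 3e8  # m/s, the rounded value used in the text

def rest_energy(mass_kg):
    """Energy equivalent of a mass, E = m c^2, Eq. (6.20)."""
    return mass_kg * SPEED_OF_LIGHT ** 2

energy_one_kg = rest_energy(1.0)  # joules

# Assumed: a 3000 MW station running without interruption for one year
seconds_per_year = 365 * 24 * 3600
station_annual_output = 3000e6 * seconds_per_year  # joules

ratio = energy_one_kg / station_annual_output
print(energy_one_kg, station_annual_output, ratio)
```

The ratio comes out close to one, which is the sense in which the two energies are described as equivalent.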
6.10.5 Nuclear Energy
The most destructive weapons made by man, the
fission and fusion bombs, are manifestations of
the above equivalence of mass and energy [Eq.
(6.20)]. On the other hand the explanation of the
life-nourishing energy output of the sun is also
based on the above equation. In this case
effectively four light hydrogen nuclei fuse to form
a helium nucleus whose mass is less than the
sum of the masses of the reactants. This mass
difference, called the mass defect $\Delta m$, is the
source of energy $(\Delta m)c^2$. In fission, a heavy
nucleus like uranium $^{235}_{92}\mathrm{U}$ is split by a neutron
into lighter nuclei. Once again the final mass is
less than the initial mass and the mass difference
translates into energy, which can be tapped to
provide electrical energy as in nuclear power
plants (controlled nuclear fission) or can be
employed in making nuclear weapons
(uncontrolled nuclear fission). Strictly, the energy
$\Delta E$ released in a chemical reaction can also be
related to the mass defect $\Delta m = \Delta E/c^2$. However,
for a chemical reaction, this mass defect is much
smaller than for a nuclear reaction. Table 6.3
lists the total energies for a variety of events and
phenomena.

2015-16(20/01/2015)

WORK, ENERGY AND POWER

Table 6.3 Approximate energy associated with various phenomena

Example 6.10 Examine Tables 6.1-6.3
and express (a) the energy required to
break one bond in DNA in eV; (b) the
kinetic energy of an air molecule ($10^{-21}$ J)
in eV; (c) the daily intake of a human adult
in kilocalories.

Answer (a) Energy required to break one bond
of DNA is
$\dfrac{10^{-20}\ \mathrm{J}}{1.6 \times 10^{-19}\ \mathrm{J/eV}} \simeq 0.06$ eV
Note 0.1 eV = 100 meV (100 millielectron volt).
(b) The kinetic energy of an air molecule is
$\dfrac{10^{-21}\ \mathrm{J}}{1.6 \times 10^{-19}\ \mathrm{J/eV}} \simeq 0.0062$ eV
This is the same as 6.2 meV.
(c) The average human consumption in a day is
$\dfrac{10^{7}\ \mathrm{J}}{4.2 \times 10^{3}\ \mathrm{J/kcal}} \simeq 2400$ kcal
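The three conversions in this example follow the same pattern, so they can be reproduced with two small helpers. This is a minimal sketch: the helper names are illustrative, and the conversion factors are the rounded values used in the answer above.

```python
# Unit conversions of Example 6.10, using the rounded factors from the text.
EV_IN_JOULES = 1.6e-19    # 1 eV expressed in joules
KCAL_IN_JOULES = 4.2e3    # 1 kcal expressed in joules


def joules_to_ev(energy_j):
    """Convert an energy in joules to electron volts."""
    return energy_j / EV_IN_JOULES


def joules_to_kcal(energy_j):
    """Convert an energy in joules to kilocalories."""
    return energy_j / KCAL_IN_JOULES


dna_bond_ev = joules_to_ev(1e-20)        # (a) one bond in DNA
air_molecule_ev = joules_to_ev(1e-21)    # (b) kinetic energy of an air molecule
daily_intake_kcal = joules_to_kcal(1e7)  # (c) daily intake of a human adult

print(f"(a) {dna_bond_ev:.3f} eV")
print(f"(b) {air_molecule_ev * 1000:.2f} meV")
print(f"(c) {daily_intake_kcal:.0f} kcal")
```

The unrounded quotient in (c) is a little under 2400 kcal; the text quotes it to two significant figures.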


We point out a common misconception created


by newspapers and magazines. They mention
food values in calories and urge us to restrict
diet intake to below 2400 calories. What they
should be saying is kilocalories (kcal) and not
calories. A person consuming 2400 calories a
day will soon starve to death! 1 food calorie is
1 kcal.

6.10.6 The Principle of Conservation of Energy
We have seen that the total mechanical energy
of the system is conserved if the forces doing work
on it are conservative. If some of the forces
involved are non-conservative, part of the
mechanical energy may get transformed into
other forms such as heat, light and sound.
However, the total energy of an isolated system
does not change, as long as one accounts for all
forms of energy. Energy may be transformed from
one form to another but the total energy of an
isolated system remains constant. Energy can
neither be created, nor destroyed.
Since the universe as a whole may be viewed
as an isolated system, the total energy of the
universe is constant. If one part of the universe
loses energy, another part must gain an equal
amount of energy.
The principle of conservation of energy cannot
be proved. However, no violation of this principle
has been observed. The concept of conservation
and transformation of energy into various forms
links together various branches of physics,
chemistry and life sciences. It provides a
unifying, enduring element in our scientific
pursuits. From an engineering point of view all
electronic, communication and mechanical
devices rely on some forms of energy
transformation.
6.11 POWER
Often it is interesting to know not only the work
done on an object, but also the rate at which
this work is done. We say a person is physically
fit if he not only climbs four floors of a building
but climbs them fast. Power is defined as the
time rate at which work is done or energy is
transferred.
The average power of a force is defined as the
ratio of the work, W, to the total time t taken
$P_{av} = \dfrac{W}{t}$
The instantaneous power is defined as the
limiting value of the average power as time
interval approaches zero,
$P = \dfrac{dW}{dt}$    (6.21)
The work dW done by a force F for a displacement
dr is $dW = \mathbf{F} \cdot d\mathbf{r}$. The instantaneous power can
also be expressed as
$P = \mathbf{F} \cdot \dfrac{d\mathbf{r}}{dt} = \mathbf{F} \cdot \mathbf{v}$    (6.22)
where v is the instantaneous velocity when the
force is F.
Power, like work and energy, is a scalar
quantity. Its dimensions are $[\mathrm{M L^{2} T^{-3}}]$. In the SI,
its unit is called a watt (W). The watt is 1 J s$^{-1}$.
The unit of power is named after James Watt,
one of the innovators of the steam engine in the
eighteenth century.
There is another unit of power, namely the
horse-power (hp)
1 hp = 746 W
This unit is still used to describe the output of
automobiles, motorbikes, etc.
We encounter the unit watt when we buy
electrical goods such as bulbs, heaters and
refrigerators. A 100 watt bulb which is on for 10
hours uses 1 kilowatt hour (kWh) of energy.
100 (watt) $\times$ 10 (hour)
= 1000 watt hour
= 1 kilowatt hour (kWh)
= $10^{3}$ (W) $\times$ 3600 (s)
= $3.6 \times 10^{6}$ J
Our electricity bills carry the energy
consumption in units of kWh. Note that kWh is
a unit of energy and not of power.
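Since the kilowatt hour is an energy unit, converting it to joules is a single multiplication. The snippet below is a small sketch of the bulb calculation above; the function names are illustrative only.

```python
# Energy used by an appliance, in kWh and in joules.
def energy_used_kwh(power_watts, hours):
    """Energy in kilowatt hours for a device of given power running for given hours."""
    return power_watts * hours / 1000


def kwh_to_joules(kwh):
    """1 kWh = 1000 W x 3600 s = 3.6e6 J."""
    return kwh * 1000 * 3600


bulb_kwh = energy_used_kwh(100, 10)     # 100 W bulb switched on for 10 hours
bulb_joules = kwh_to_joules(bulb_kwh)
print(f"{bulb_kwh} kWh = {bulb_joules:.1e} J")
```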
Example 6.11 An elevator which can carry a
maximum load of 1800 kg (elevator +
passengers) is moving up with a constant
speed of 2 m s$^{-1}$. The frictional force opposing
the motion is 4000 N. Determine the
minimum power delivered by the motor to
the elevator in watts as well as in horse
power.


Answer The downward force on the elevator is
$F = m g + F_f = (1800 \times 10) + 4000 = 22000$ N
The motor must supply enough power to balance
this force. Hence,
$P = \mathbf{F} \cdot \mathbf{v} = 22000 \times 2 = 44000$ W $= 59$ hp
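The elevator calculation can be written as a short function of the load, the friction and the speed. This is a sketch under the same assumptions as the answer: g is taken as 10 m s$^{-2}$ (the value implied by the product 1800 × 10), and the function name is hypothetical.

```python
# Minimum motor power for an elevator rising at constant speed (Example 6.11).
G = 10.0             # acceleration due to gravity, m/s^2 (rounded value implied by the answer)
HP_IN_WATTS = 746.0  # 1 horse-power in watts


def elevator_power(mass_kg, friction_n, speed_ms, g=G):
    """Power needed to balance weight plus friction at constant upward speed."""
    force = mass_kg * g + friction_n   # total force the motor must overcome, in newtons
    return force * speed_ms            # P = F v, since F and v are parallel


power_w = elevator_power(1800, 4000, 2)
power_hp = power_w / HP_IN_WATTS
print(f"P = {power_w:.0f} W = {power_hp:.0f} hp")
```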

6.12 COLLISIONS
In physics we study motion (change in position).
At the same time, we try to discover physical
quantities, which do not change in a physical
process. The laws of momentum and energy
conservation are typical examples. In this
section we shall apply these laws to a commonly
encountered phenomenon, namely collisions.
Several games such as billiards, marbles or
carrom involve collisions. We shall study the
collision of two masses in an idealised form.
Consider two masses $m_1$ and $m_2$. The particle
$m_1$ is moving with speed $v_{1i}$, the subscript 'i'
implying initial. We can consider $m_2$ to be at rest.
No loss of generality is involved in making such
a selection. In this situation the mass $m_1$
collides with the stationary mass $m_2$ and this
is depicted in Fig. 6.10.

The masses $m_1$ and $m_2$ fly off in different
directions. We shall see that there are
relationships, which connect the masses, the
velocities and the angles.

Fig. 6.10 Collision of mass $m_1$, with a stationary mass $m_2$.

6.12.1 Elastic and Inelastic Collisions
In all collisions the total linear momentum is
conserved; the initial momentum of the system
is equal to the final momentum of the system.
One can argue this as follows. When two objects
collide, the mutual impulsive forces acting over
the collision time $\Delta t$ cause a change in their
respective momenta :
$\Delta \mathbf{p}_1 = \mathbf{F}_{12}\, \Delta t$
$\Delta \mathbf{p}_2 = \mathbf{F}_{21}\, \Delta t$
where $\mathbf{F}_{12}$ is the force exerted on the first particle
by the second particle. $\mathbf{F}_{21}$ is likewise the force
exerted on the second particle by the first particle.
Now from Newton's third law, $\mathbf{F}_{12} = -\mathbf{F}_{21}$. This
implies
$\Delta \mathbf{p}_1 + \Delta \mathbf{p}_2 = 0$
The above conclusion is true even though the
forces vary in a complex fashion during the
collision time $\Delta t$. Since the third law is true at
every instant, the total impulse on the first object
is equal and opposite to that on the second.
On the other hand, the total kinetic energy of
the system is not necessarily conserved. The
impact and deformation during collision may
generate heat and sound. Part of the initial kinetic
energy is transformed into other forms of energy.
A useful way to visualise the deformation during
collision is in terms of a 'compressed spring'. If
the 'spring' connecting the two masses regains
its original shape without loss in energy, then
the initial kinetic energy is equal to the final
kinetic energy but the kinetic energy during the
collision time $\Delta t$ is not constant. Such a collision
is called an elastic collision. On the other hand
the deformation may not be relieved and the two
bodies could move together after the collision. A
collision in which the two particles move together
after the collision is called a completely inelastic
collision. The intermediate case where the
deformation is partly relieved and some of the
initial kinetic energy is lost is more common and
is appropriately called an inelastic collision.
6.12.2 Collisions in One Dimension
Consider first a completely inelastic collision
in one dimension. Then, in Fig. 6.10,
$\theta_1 = \theta_2 = 0$
$m_1 v_{1i} = (m_1 + m_2) v_f$ (momentum conservation)
$v_f = \dfrac{m_1}{m_1 + m_2}\, v_{1i}$    (6.23)
The loss in kinetic energy on collision is
$\Delta K = \dfrac{1}{2} m_1 v_{1i}^2 - \dfrac{1}{2} (m_1 + m_2) v_f^2$
$\quad = \dfrac{1}{2} m_1 v_{1i}^2 - \dfrac{1}{2} \dfrac{m_1^2}{m_1 + m_2}\, v_{1i}^2$    [using Eq. (6.23)]
$\quad = \dfrac{1}{2} m_1 v_{1i}^2 \left[ 1 - \dfrac{m_1}{m_1 + m_2} \right]$


An experiment on head-on collision


In performing an experiment on collision on a horizontal surface, we face three difficulties.
One, there will be friction and bodies will not travel with uniform velocities. Two, if two bodies
of different sizes collide on a table, it would be difficult to arrange them for a head-on collision
unless their centres of mass are at the same height above the surface. Three, it will be fairly
difficult to measure velocities of the two bodies just before and just after collision.
By performing this experiment in a vertical direction, all the three difficulties vanish. Take
two balls, one of which is heavier (basketball/football/volleyball) and the other lighter (tennis
ball/rubber ball/table tennis ball). First take only the heavier ball and drop it vertically from
some height, say 1 m. Note the height to which it rises. This gives the velocities near the floor or ground,
just before and just after the bounce (by using $v^2 = 2gh$). Hence you
will get the coefficient of restitution.
Now take the big ball and a small ball and hold them in your
hands one over the other, with the heavier ball below the lighter
one, as shown here. Drop them together, taking care that they remain
together while falling, and see what happens. You will find that the
heavier ball rises less than when it was dropped alone, while the
lighter one shoots up to about 3 m. With practice, you will be able to
hold the ball properly so that the lighter ball rises vertically up and
does not fly sideways. This is head-on collision.
You can try to find the best combination of balls which gives you
the best effect. You can measure the masses on a standard balance.
We leave it to you to think how you can determine the initial and
final velocities of the balls.
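For the single-ball part of this experiment, the coefficient of restitution follows directly from the drop height and the rebound height through $v^2 = 2gh$. The sketch below illustrates that step only. The rebound height of 0.64 m is a made-up sample reading, not a measured value, and the function names are hypothetical.

```python
import math

G = 9.8  # acceleration due to gravity, m/s^2 (assumed value)


def speed_from_height(height_m):
    """Speed gained in falling through (or needed to rise to) a height, from v^2 = 2 g h."""
    return math.sqrt(2 * G * height_m)


def coefficient_of_restitution(drop_height_m, rebound_height_m):
    """Ratio of the speed just after the bounce to the speed just before it."""
    return speed_from_height(rebound_height_m) / speed_from_height(drop_height_m)


# Hypothetical reading: dropped from 1 m, rises back to 0.64 m.
e = coefficient_of_restitution(1.0, 0.64)
print(f"coefficient of restitution = {e:.2f}")
```

Because g cancels in the ratio, the result depends only on the square root of the ratio of the two heights.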

The loss in kinetic energy simplifies to
$\Delta K = \dfrac{1}{2} \dfrac{m_1 m_2}{m_1 + m_2}\, v_{1i}^2$
which is a positive quantity as expected.
Consider next an elastic collision. Using the
above nomenclature with $\theta_1 = \theta_2 = 0$, the
momentum and kinetic energy conservation
equations are
$m_1 v_{1i} = m_1 v_{1f} + m_2 v_{2f}$    (6.24)
$m_1 v_{1i}^2 = m_1 v_{1f}^2 + m_2 v_{2f}^2$    (6.25)
From Eqs. (6.24) and (6.25) it follows that,
$m_1 v_{1i} (v_{2f} - v_{1i}) = m_1 v_{1f} (v_{2f} - v_{1f})$
or,
$v_{2f} (v_{1i} - v_{1f}) = v_{1i}^2 - v_{1f}^2 = (v_{1i} - v_{1f})(v_{1i} + v_{1f})$
Hence,
$v_{2f} = v_{1i} + v_{1f}$    (6.26)
Substituting this in Eq. (6.24), we obtain
$v_{1f} = \dfrac{(m_1 - m_2)}{m_1 + m_2}\, v_{1i}$    (6.27)
and
$v_{2f} = \dfrac{2 m_1 v_{1i}}{m_1 + m_2}$    (6.28)
Thus, the 'unknowns' {$v_{1f}$, $v_{2f}$} are obtained in
terms of the 'knowns' {$m_1$, $m_2$, $v_{1i}$}. Special cases
of our analysis are interesting.
Case I : If the two masses are equal
$v_{1f} = 0$
$v_{2f} = v_{1i}$
The first mass comes to rest and pushes off the
second mass with its initial speed on collision.
Case II : If one mass dominates, e.g. $m_2 \gg m_1$
$v_{1f} \simeq -v_{1i}$
$v_{2f} \simeq 0$
The heavier mass is undisturbed while the
lighter mass reverses its velocity.
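The one-dimensional results, Eq. (6.23) for the completely inelastic case and Eqs. (6.27) and (6.28) for the elastic case, are easy to explore numerically. The following is a minimal sketch with hypothetical function names. The mass ratios in the loop are sample values chosen to show the two special cases and two intermediate ones.

```python
def inelastic_1d(m1, m2, v1i):
    """Completely inelastic collision with m2 at rest.

    Returns the common final velocity, Eq. (6.23), and the kinetic energy lost.
    """
    vf = m1 * v1i / (m1 + m2)
    loss = 0.5 * m1 * m2 / (m1 + m2) * v1i ** 2
    return vf, loss


def elastic_1d(m1, m2, v1i):
    """Elastic head-on collision with m2 at rest, Eqs. (6.27) and (6.28)."""
    v1f = (m1 - m2) / (m1 + m2) * v1i
    v2f = 2 * m1 * v1i / (m1 + m2)
    return v1f, v2f


def kinetic_energy(m, v):
    return 0.5 * m * v ** 2


# Sample runs: equal masses (Case I), two intermediate ratios, and m2 >> m1 (Case II).
m1, v1i = 1.0, 1.0
for m2 in (1.0, 2.0, 12.0, 1.0e6):
    v1f, v2f = elastic_1d(m1, m2, v1i)
    transferred = kinetic_energy(m2, v2f) / kinetic_energy(m1, v1i)
    print(f"m2/m1 = {m2:>9.0f}: v1f = {v1f:+.3f}, v2f = {v2f:.3f}, "
          f"fraction of KE given to m2 = {transferred:.3f}")
```

For equal masses the whole kinetic energy is handed over, while for a very heavy target almost none is, in line with Cases I and II above.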

Example 6.12 Slowing down of neutrons:
In a nuclear reactor a neutron of high
speed (typically $10^{7}$ m s$^{-1}$) must be slowed
to $10^{3}$ m s$^{-1}$ so that it can have a high
probability of interacting with isotope $^{235}_{92}\mathrm{U}$
and causing it to fission. Show that a
neutron can lose most of its kinetic energy
in an elastic collision with a light nucleus
like deuterium or carbon which has a mass
of only a few times the neutron mass. The
material making up the light nuclei, usually
heavy water ($\mathrm{D_2O}$) or graphite, is called a
moderator.

Answer The initial kinetic energy of the neutron
is
$K_{1i} = \dfrac{1}{2} m_1 v_{1i}^2$
while its final kinetic energy from Eq. (6.27)
$K_{1f} = \dfrac{1}{2} m_1 v_{1f}^2 = \dfrac{1}{2} m_1 \left( \dfrac{m_1 - m_2}{m_1 + m_2} \right)^2 v_{1i}^2$
The fraction of the kinetic energy retained by the neutron is
$f_1 = \dfrac{K_{1f}}{K_{1i}} = \left( \dfrac{m_1 - m_2}{m_1 + m_2} \right)^2$
while the fractional kinetic energy gained by the
moderating nuclei $K_{2f}/K_{1i}$ is
$f_2 = 1 - f_1$ (elastic collision)
$\quad = \dfrac{4 m_1 m_2}{(m_1 + m_2)^2}$
One can also verify this result by substituting
from Eq. (6.28).
For deuterium $m_2 = 2m_1$ and we obtain
$f_1 = 1/9$ while $f_2 = 8/9$. Almost 90% of the
neutron's energy is transferred to deuterium. For
carbon $f_1 = 71.6\%$ and $f_2 = 28.4\%$. In practice,
however, this number is smaller since head-on
collisions are rare.

If the initial velocities and final velocities of
both the bodies are along the same straight line,
then it is called a one-dimensional collision, or
head-on collision. In the case of small spherical
bodies, this is possible if the direction of travel
of body 1 passes through the centre of body 2
which is at rest. In general, the collision is two-dimensional, where the initial velocities and the
final velocities lie in a plane.

6.12.3 Collisions in Two Dimensions
Fig. 6.10 also depicts the collision of a moving
mass $m_1$ with the stationary mass $m_2$. Linear
momentum is conserved in such a collision.
Since momentum is a vector this implies three
equations for the three directions {x, y, z}.
Consider the plane determined by the final
velocity directions of $m_1$ and $m_2$ and choose it to
be the x-y plane. The conservation of the
z-component of the linear momentum implies
that the entire collision is in the x-y plane. The
x- and y-component equations are
$m_1 v_{1i} = m_1 v_{1f} \cos\theta_1 + m_2 v_{2f} \cos\theta_2$    (6.29)
$0 = m_1 v_{1f} \sin\theta_1 - m_2 v_{2f} \sin\theta_2$    (6.30)
One knows {$m_1$, $m_2$, $v_{1i}$} in most situations. There
are thus four unknowns {$v_{1f}$, $v_{2f}$, $\theta_1$ and $\theta_2$}, and
only two equations. If $\theta_1 = \theta_2 = 0$, we regain
Eq. (6.24) for one dimensional collision.
If, further the collision is elastic,
$\dfrac{1}{2} m_1 v_{1i}^2 = \dfrac{1}{2} m_1 v_{1f}^2 + \dfrac{1}{2} m_2 v_{2f}^2$    (6.31)
We obtain an additional equation. That still
leaves us one equation short. At least one of
the four unknowns, say $\theta_1$, must be made known
for the problem to be solvable. For example, $\theta_1$
can be determined by moving a detector in an
angular fashion from the x to the y axis. Given
{$m_1$, $m_2$, $v_{1i}$, $\theta_1$} we can determine {$v_{1f}$, $v_{2f}$, $\theta_2$}
from Eqs. (6.29)-(6.31).
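Once one angle is known, the two momentum equations and the energy equation of a two-dimensional elastic collision can be solved in closed form. The sketch below is one way to do this and is not taken from the text. Eliminating $v_{2f}$ and $\theta_2$ gives a quadratic in $v_{1f}$, and the code takes its positive root. It assumes $m_1 \le m_2$, the target initially at rest, and (for equal masses) a scattering angle strictly between 0° and 90°. The function name is hypothetical.

```python
import math


def elastic_2d(m1, m2, v1i, theta1_deg):
    """Solve the x- and y-momentum equations plus kinetic energy conservation.

    Given m1, m2, v1i and the scattering angle theta1 of m1 (in degrees),
    return (v1f, v2f, theta2_deg). Assumes m2 is initially at rest and m1 <= m2.
    """
    t1 = math.radians(theta1_deg)
    # Eliminating v2f and theta2 gives
    #   (m1 + m2) v1f^2 - 2 m1 v1i cos(theta1) v1f + (m1 - m2) v1i^2 = 0;
    # the positive root is the physical one when m1 <= m2.
    root = math.sqrt(m2 ** 2 - (m1 * math.sin(t1)) ** 2)
    v1f = v1i * (m1 * math.cos(t1) + root) / (m1 + m2)
    # Momentum carried away by m2, component by component.
    px = m1 * (v1i - v1f * math.cos(t1))
    py = m1 * v1f * math.sin(t1)
    v2f = math.hypot(px, py) / m2
    theta2_deg = math.degrees(math.atan2(py, px))
    return v1f, v2f, theta2_deg


# Sample run: equal masses, first mass deflected through 53 degrees.
v1f, v2f, theta2 = elastic_2d(1.0, 1.0, 1.0, 53.0)
print(f"v1f = {v1f:.3f}, v2f = {v2f:.3f}, theta2 = {theta2:.1f} degrees")
```

For equal masses the two final directions come out 90° apart, so a deflection of 53° for the first mass corresponds to 37° for the second.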

Example 6.13 Consider the collision
depicted in Fig. 6.10 to be between two
billiard balls with equal masses $m_1 = m_2$.
The first ball is called the cue while the
second ball is called the target. The
billiard player wants to 'sink' the target
ball in a corner pocket, which is at an
angle $\theta_2 = 37°$. Assume that the collision
is elastic and that friction and rotational
motion are not important. Obtain $\theta_1$.

Answer From momentum conservation, since
the masses are equal
$\mathbf{v}_{1i} = \mathbf{v}_{1f} + \mathbf{v}_{2f}$
or
$v_{1i}^2 = (\mathbf{v}_{1f} + \mathbf{v}_{2f}) \cdot (\mathbf{v}_{1f} + \mathbf{v}_{2f})$
$\quad = v_{1f}^2 + v_{2f}^2 + 2\, \mathbf{v}_{1f} \cdot \mathbf{v}_{2f}$
$\quad = v_{1f}^2 + v_{2f}^2 + 2 v_{1f} v_{2f} \cos(\theta_1 + 37°)$    (6.32)
Since the collision is elastic and $m_1 = m_2$ it follows
from conservation of kinetic energy that
$v_{1i}^2 = v_{1f}^2 + v_{2f}^2$    (6.33)
Comparing Eqs. (6.32) and (6.33), we get
$\cos(\theta_1 + 37°) = 0$
or
$\theta_1 + 37° = 90°$
Thus, $\theta_1 = 53°$
This proves the following result : when two equal
masses undergo a glancing elastic collision with
one of them at rest, after the collision, they will
move at right angles to each other.

The matter simplifies greatly if we consider


spherical masses with smooth surfaces, and
assume that collision takes place only when the
bodies touch each other. This is what happens
in the games of marbles, carrom and billiards.
In our everyday world, collisions take place only
when two bodies touch each other. But consider
a comet coming from far distances to the sun, or
an alpha particle coming towards a nucleus and
going away in some direction. Here we have to
deal with forces involving action at a distance.
Such an event is called scattering. The velocities
and directions in which the two particles go away
depend on their initial velocities as well as the
type of interaction between them, their masses,
shapes and sizes.

SUMMARY
1. The work-energy theorem states that the change in kinetic energy of a body is the work
done by the net force on the body.
$K_f - K_i = W_{net}$
2. A force is conservative if (i) work done by it on an object is path independent and
depends only on the end points {$x_i$, $x_f$}, or (ii) the work done by the force is zero for an
arbitrary closed path taken by the object such that it returns to its initial position.
3. For a conservative force in one dimension, we may define a potential energy function V(x)
such that
$F(x) = -\dfrac{dV(x)}{dx}$
or
$V_i - V_f = \int_{x_i}^{x_f} F(x)\, dx$
4. The principle of conservation of mechanical energy states that the total mechanical
energy of a body remains constant if the only forces that act on the body are conservative.
5. The gravitational potential energy of a particle of mass m at a height x above the earth's
surface is
$V(x) = m g x$
where the variation of g with height is ignored.
6. The elastic potential energy of a spring of force constant k and extension x is
$V(x) = \dfrac{1}{2} k x^2$
7. The scalar or dot product of two vectors A and B is written as A.B and is a scalar
quantity given by : $\mathbf{A} \cdot \mathbf{B} = AB \cos\theta$, where $\theta$ is the angle between A and B. It can be
positive, negative or zero depending upon the value of $\theta$. The scalar product of two
vectors can be interpreted as the product of magnitude of one vector and component
of the other vector along the first vector. For unit vectors :
$\hat{i} \cdot \hat{i} = \hat{j} \cdot \hat{j} = \hat{k} \cdot \hat{k} = 1$ and $\hat{i} \cdot \hat{j} = \hat{j} \cdot \hat{k} = \hat{k} \cdot \hat{i} = 0$
Scalar products obey the commutative and the distributive laws.


POINTS TO PONDER
1. The phrase 'calculate the work done' is incomplete. We should refer (or imply
clearly by context) to the work done by a specific force or a group of forces on a
given body over a certain displacement.
2. Work done is a scalar quantity. It can be positive or negative unlike mass and
kinetic energy which are positive scalar quantities. The work done by the friction
or viscous force on a moving body is negative.
3. For two bodies, the sum of the mutual forces exerted between them is zero from
Newton's Third Law,
$\mathbf{F}_{12} + \mathbf{F}_{21} = 0$
But the sum of the work done by the two forces need not always cancel, i.e.
$W_{12} + W_{21} \neq 0$
However, it may sometimes be true.
4. The work done by a force can be calculated sometimes even if the exact nature of
the force is not known. This is clear from Example 6.2 where the WE theorem is
used in such a situation.
5. The WE theorem is not independent of Newton's Second Law. The WE theorem
may be viewed as a scalar form of the Second Law. The principle of conservation
of mechanical energy may be viewed as a consequence of the WE theorem for
conservative forces.
6. The WE theorem holds in all inertial frames. It can also be extended to non-inertial
frames provided we include the pseudoforces in the calculation of the
net force acting on the body under consideration.
7. The potential energy of a body subjected to a conservative force is always
undetermined up to a constant. For example, the point where the potential
energy is zero is a matter of choice. For the gravitational potential energy mgh,
the zero of the potential energy is chosen to be the ground. For the spring
potential energy $kx^2/2$, the zero of the potential energy is the equilibrium position
of the oscillating mass.
8. Not every force encountered in mechanics has an associated potential
energy. For example, work done by friction over a closed path is not zero and no
potential energy can be associated with friction.
9. During a collision : (a) the total linear momentum is conserved at each instant of
the collision ; (b) the kinetic energy conservation (even if the collision is elastic)
applies after the collision is over and does not hold at every instant of the collision.
In fact the two colliding objects are deformed and may be momentarily at rest
with respect to each other.


EXERCISES
6.1 The sign of work done by a force on a body is important to understand. State carefully
if the following quantities are positive or negative:
(a) work done by a man in lifting a bucket out of a well by means of a rope tied to the
bucket.
(b) work done by gravitational force in the above case,
(c) work done by friction on a body
sliding down an inclined plane,
(d) work done by an applied force on
a body moving on a rough
horizontal plane with uniform
velocity,
(e) work done by the resistive force of
air on a vibrating pendulum in
bringing it to rest.
6.2 A body of mass 2 kg initially at rest
moves under the action of an applied
horizontal force of 7 N on a table with
coefficient of kinetic friction = 0.1.
Compute the
(a) work done by the applied force in
10 s,
(b) work done by friction in 10 s,
(c) work done by the net force on the
body in 10 s,
(d) change in kinetic energy of the
body in 10 s,
and interpret your results.
6.3 Given in Fig. 6.11 are examples of some
potential energy functions in one
dimension. The total energy of the
particle is indicated by a cross on the
ordinate axis. In each case, specify the
regions, if any, in which the particle
cannot be found for the given energy.
Also, indicate the minimum total
energy the particle must have in each
case. Think of simple physical contexts
for which these potential energy shapes
are relevant.

Fig. 6.11


6.4 The potential energy function for a
particle executing linear simple
harmonic motion is given by $V(x) = kx^2/2$,
where k is the force constant
of the oscillator. For k = 0.5 N m$^{-1}$,
the graph of V(x) versus x is shown
in Fig. 6.12. Show that a particle of
total energy 1 J moving under this
potential must 'turn back' when it
reaches $x = \pm 2$ m.

Fig. 6.12

6.5 Answer the following :
(a) The casing of a rocket in flight
burns up due to friction. At
whose expense is the heat
energy required for burning
obtained? The rocket or the
atmosphere?
(b) Comets move around the sun
in highly elliptical orbits. The
gravitational force on the
comet due to the sun is not
normal to the comet's velocity
in general. Yet the work done by the gravitational force over every complete orbit
of the comet is zero. Why ?
(c) An artificial satellite orbiting the earth in very thin atmosphere loses its energy
gradually due to dissipation against atmospheric resistance, however small. Why
then does its speed increase progressively as it comes closer and closer to the earth ?
(d) In Fig. 6.13(i) the man walks 2 m carrying a mass of 15 kg on his hands. In Fig.
6.13(ii), he walks the same distance pulling the rope behind him. The rope goes
over a pulley, and a mass of 15 kg hangs at its other end. In which case is the work
done greater ?

Fig. 6.13
6.6 Underline the correct alternative :
(a) When a conservative force does positive work on a body, the potential energy of
the body increases/decreases/remains unaltered.
(b) Work done by a body against friction always results in a loss of its kinetic/potential
energy.
(c) The rate of change of total momentum of a many-particle system is proportional
to the external force/sum of the internal forces on the system.
(d) In an inelastic collision of two bodies, the quantities which do not change after
the collision are the total kinetic energy/total linear momentum/total energy of
the system of two bodies.
6.7 State if each of the following statements is true or false. Give reasons for your answer.
(a) In an elastic collision of two bodies, the momentum and energy of each body is
conserved.
(b) Total energy of a system is always conserved, no matter what internal and external
forces on the body are present.
(c) Work done in the motion of a body over a closed loop is zero for every force in
nature.
(d) In an inelastic collision, the final kinetic energy is always less than the initial
kinetic energy of the system.
6.8 Answer carefully, with reasons :
(a) In an elastic collision of two billiard balls, is the total kinetic energy conserved
during the short time of collision of the balls (i.e. when they are in contact) ?
(b) Is the total linear momentum conserved during the short time of an elastic collision
of two balls ?


(c) What are the answers to (a) and (b) for an inelastic collision ?
(d) If the potential energy of two billiard balls depends only on the separation distance
between their centres, is the collision elastic or inelastic ? (Note, we are talking
here of potential energy corresponding to the force during collision, not gravitational
potential energy).
6.9 A body is initially at rest. It undergoes one-dimensional motion with constant
acceleration. The power delivered to it at time t is proportional to
(i) $t^{1/2}$
(ii) $t$
(iii) $t^{3/2}$
(iv) $t^{2}$
6.10 A body is moving unidirectionally under the influence of a source of constant power.
Its displacement in time t is proportional to
(i) $t^{1/2}$
(ii) $t$
(iii) $t^{3/2}$
(iv) $t^{2}$
6.11 A body constrained to move along the z-axis of a coordinate system is subject to a
constant force F given by
$\mathbf{F} = -\hat{i} + 2\hat{j} + 3\hat{k}$ N
where $\hat{i}$, $\hat{j}$, $\hat{k}$ are unit vectors along the x-, y- and z-axis of the system respectively.
What is the work done by this force in moving the body a distance of 4 m along the
z-axis ?
6.12 An electron and a proton are detected in a cosmic ray experiment, the first with kinetic
energy 10 keV, and the second with 100 keV. Which is faster, the electron or the
proton ? Obtain the ratio of their speeds. (electron mass = $9.11 \times 10^{-31}$ kg, proton mass
= $1.67 \times 10^{-27}$ kg, 1 eV = $1.60 \times 10^{-19}$ J).
6.13 A rain drop of radius 2 mm falls from a height of 500 m above the ground. It falls with
decreasing acceleration (due to viscous resistance of the air) until at half its original
height, it attains its maximum (terminal) speed, and moves with uniform speed
thereafter. What is the work done by the gravitational force on the drop in the first
and second half of its journey ? What is the work done by the resistive force in the
entire journey if its speed on reaching the ground is 10 m s$^{-1}$ ?
6.14 A molecule in a gas container hits a horizontal wall with speed 200 m s$^{-1}$ and angle 30°
with the normal, and rebounds with the same speed. Is momentum conserved in the
collision ? Is the collision elastic or inelastic ?
6.15 A pump on the ground floor of a building can pump up water to fill a tank of volume 30 m$^{3}$
in 15 min. If the tank is 40 m above the ground, and the efficiency of the pump is 30%,
how much electric power is consumed by the pump ?
6.16 Two identical ball bearings in contact with each other and resting on a frictionless
table are hit head-on by another ball bearing of the same mass moving initially with a
speed V. If the collision is elastic, which of the following (Fig. 6.14) is a possible result
after collision ?

Fig. 6.14


6.17 The bob A of a pendulum released from 30° to the
vertical hits another bob B of the same mass at rest
on a table as shown in Fig. 6.15. How high does
the bob A rise after the collision ? Neglect the size of
the bobs and assume the collision to be elastic.

Fig. 6.15

6.18 The bob of a pendulum is released from a horizontal
position. If the length of the pendulum is 1.5 m,
what is the speed with which the bob arrives at the
lowermost point, given that it dissipated 5% of its
initial energy against air resistance ?
6.19 A trolley of mass 300 kg carrying a sandbag of 25 kg
is moving uniformly with a speed of 27 km/h on a
frictionless track. After a while, sand starts leaking
out of a hole on the floor of the trolley at the rate of
0.05 kg s$^{-1}$. What is the speed of the trolley after the entire sand bag is empty ?
6.20 A body of mass 0.5 kg travels in a straight line with velocity $v = a x^{3/2}$ where $a = 5$ m$^{-1/2}$ s$^{-1}$.
What is the work done by the net force during its displacement from x = 0 to
x = 2 m ?
6.21 The blades of a windmill sweep out a circle of area A. (a) If the wind flows at a
velocity v perpendicular to the circle, what is the mass of the air passing through it
in time t ? (b) What is the kinetic energy of the air ? (c) Assume that the windmill
converts 25% of the wind's energy into electrical energy, and that A = 30 m$^{2}$, v = 36
km/h and the density of air is 1.2 kg m$^{-3}$. What is the electrical power produced ?
6.22 A person trying to lose weight (dieter) lifts a 10 kg mass, one thousand times, to a
height of 0.5 m each time. Assume that the potential energy lost each time she
lowers the mass is dissipated. (a) How much work does she do against the gravitational
force ? (b) Fat supplies $3.8 \times 10^{7}$ J of energy per kilogram which is converted to
mechanical energy with a 20% efficiency rate. How much fat will the dieter use up?
6.23 A family uses 8 kW of power. (a) Direct solar energy is incident on the horizontal
surface at an average rate of 200 W per square meter. If 20% of this energy can be
converted to useful electrical energy, how large an area is needed to supply 8 kW?
(b) Compare this area to that of the roof of a typical house.
Additional Exercises
6.24 A bullet of mass 0.012 kg and horizontal speed 70 m s$^{-1}$ strikes a block of wood of
mass 0.4 kg and instantly comes to rest with respect to the block. The block is
suspended from the ceiling by means of thin wires. Calculate the height to which
the block rises. Also, estimate the amount of heat produced in the block.
6.25 Two inclined frictionless tracks, one gradual and the other steep meet at A from
where two stones are allowed to slide down from rest, one on each track (Fig. 6.16).
Will the stones reach the bottom at the same time ? Will they reach there with the
same speed ? Explain. Given $\theta_1 = 30°$, $\theta_2 = 60°$, and h = 10 m, what are the speeds and
times taken by the two stones ?

Fig. 6.16


6.26 A 1 kg block situated on a rough incline is connected to a spring of spring constant 100
N m$^{-1}$ as shown in Fig. 6.17. The block is released from rest with the spring in the
unstretched position. The block moves 10 cm down the incline before coming to rest.
Find the coefficient of friction between the block and the incline. Assume that the
spring has a negligible mass and the pulley is frictionless.

Fig. 6.17
6.27 A bolt of mass 0.3 kg falls from the ceiling of an elevator moving down with a uniform
speed of 7 m s$^{-1}$. It hits the floor of the elevator (length of the elevator = 3 m) and does
not rebound. What is the heat produced by the impact ? Would your answer be different
if the elevator were stationary ?
6.28 A trolley of mass 200 kg moves with a uniform speed of 36 km/h on a frictionless track.
A child of mass 20 kg runs on the trolley from one end to the other (10 m away) with a
speed of 4 m s$^{-1}$ relative to the trolley in a direction opposite to its motion, and
jumps out of the trolley. What is the final speed of the trolley ? How much has the
trolley moved from the time the child begins to run ?
6.29 Which of the following potential energy curves in Fig. 6.18 cannot possibly describe the
elastic collision of two billiard balls ? Here r is the distance between centres of the balls.

Fig. 6.18
6.30 Consider the decay of a free neutron at rest : $n \rightarrow p + e^{-}$
Show that the two-body decay of this type must necessarily give an electron of fixed
energy and, therefore, cannot account for the observed continuous energy distribution
in the $\beta$-decay of a neutron or a nucleus (Fig. 6.19).

Fig. 6.19
[Note: The simple result of this exercise was one among the several arguments advanced by W.
Pauli to predict the existence of a third particle in the decay products of $\beta$-decay. This
particle is known as neutrino. We now know that it is a particle of intrinsic spin ½ (like
$e^{-}$, p or n), but is neutral, and either massless or having an extremely small mass
(compared to the mass of electron) and which interacts very weakly with matter. The
correct decay process of neutron is : $n \rightarrow p + e^{-} + \bar{\nu}$ ]


APPENDIX 6.1 : POWER CONSUMPTION IN WALKING


The table below lists the approximate power expended by an adult human of mass 60 kg.
Table 6.4 Approximate power consumption
Mechanical work must not be confused with the everyday usage
of the term work. A woman standing with a very heavy load on
her head may get very tired. But no mechanical work is involved.
That is not to say that mechanical work cannot be estimated in
ordinary human activity.
Consider a person walking with constant speed v0 . The mechanical work he does may be estimated simply
with the help of the work-energy theorem. Assume :
(a) The major work done in walking is due to the acceleration and deceleration of the legs with each stride
(See Fig. 6.20).
(b) Neglect air resistance.
(c) Neglect the small work done in lifting the legs against gravity.
(d) Neglect the swinging of hands etc. as is common in walking.
As we can see in Fig. 6.20, in each stride the leg is brought from rest to a speed, approximately equal to the
speed of walking, and then brought to rest again.

Fig. 6.20 An illustration of a single stride in walking. While the first leg is maximally off the ground, the second leg
is on the ground and vice-versa.
The work done by one leg in each stride is $m_l v_0^2$ by the work-energy theorem. Here $m_l$ is the mass of the leg.
Note that $m_l v_0^2/2$ of energy is expended by one set of leg muscles to bring the foot from rest to speed $v_0$, while an
additional $m_l v_0^2/2$ is expended by a complementary set of leg muscles to bring the foot to rest from speed $v_0$.
Hence the work done by both legs in one stride is (study Fig. 6.20 carefully)

$W_s = 2 m_l v_0^2$   (6.34)

Assuming $m_l$ = 10 kg and slow running of a nine-minute mile, which translates to 3 m s⁻¹ in SI units, we obtain
$W_s$ = 180 J / stride
If we take a stride to be 2 m long, the person covers 1.5 strides per second at his speed of 3 m s⁻¹. Thus the
power expended is

$P = 180\ \dfrac{\text{J}}{\text{stride}} \times 1.5\ \dfrac{\text{stride}}{\text{second}} = 270\ \text{W}$
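The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in Eq. (6.34); the function names `stride_work` and `walking_power` are illustrative and not part of the text.

```python
def stride_work(leg_mass_kg, speed_m_s):
    """Work done by both legs in one stride, W_s = 2 * m_l * v_0**2 (Eq. 6.34)."""
    return 2.0 * leg_mass_kg * speed_m_s ** 2


def walking_power(leg_mass_kg, speed_m_s, stride_length_m):
    """Power = work per stride times strides per second."""
    strides_per_second = speed_m_s / stride_length_m
    return stride_work(leg_mass_kg, speed_m_s) * strides_per_second


# Values used in the appendix: m_l = 10 kg, v_0 = 3 m/s, stride length 2 m.
work_per_stride = stride_work(10.0, 3.0)
power = walking_power(10.0, 3.0, 2.0)
print(work_per_stride, "J per stride")
print(power, "W")
```

Because the power scales as the cube of the speed in this model (work per stride goes as $v_0^2$ and the stride rate as $v_0$), small changes in the assumed speed change the estimate considerably.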

We must bear in mind that this is a lower estimate since several avenues of power loss (e.g. swinging of hands,
air resistance etc.) have been ignored. The interesting point is that we did not worry about the forces involved.
The forces, mainly friction and those exerted on the leg by the muscles of the rest of the body, are hard to
estimate. Static friction does no work and we bypassed the impossible task of estimating the work done by the
muscles by taking recourse to the work-energy theorem. We can also see the advantage of a wheel. The wheel
permits smooth locomotion without the continual starting and stopping in mammalian locomotion.


CHAPTER ELEVEN

THERMAL PROPERTIES OF MATTER

11.1 Introduction
11.2 Temperature and heat
11.3 Measurement of temperature
11.4 Ideal-gas equation and absolute temperature
11.5 Thermal expansion
11.6 Specific heat capacity
11.7 Calorimetry
11.8 Change of state
11.9 Heat transfer
11.10 Newton's law of cooling
Summary
Points to ponder
Exercises

11.1 INTRODUCTION

We all have common-sense notions of heat and temperature.
Temperature is a measure of hotness of a body. A kettle
with boiling water is hotter than a box containing ice. In
physics, we need to define the notions of heat, temperature,
etc., more carefully. In this chapter, you will learn what heat
is and how it is measured, and study the various processes by
which heat flows from one body to another. Along the way,
you will find out why blacksmiths heat the iron ring before
fitting it on the rim of a wooden wheel of a bullock cart and
why the wind at the beach often reverses direction after the
sun goes down. You will also learn what happens when water
boils or freezes, and why its temperature does not change during
these processes even though a great deal of heat is flowing
into or out of it.
11.2 TEMPERATURE AND HEAT


We can begin studying thermal properties of matter with
definitions of temperature and heat. Temperature is a relative
measure, or indication, of hotness or coldness. A hot utensil
is said to have a high temperature, and an ice cube to have a
low temperature. An object that has a higher temperature
than another object is said to be hotter. Note that hot and
cold are relative terms, like tall and short. We can perceive
temperature by touch. However, this temperature sense is
somewhat unreliable and its range is too limited to be useful
for scientific purposes.
We know from experience that a glass of ice-cold water left
on a table on a hot summer day eventually warms up whereas
a cup of hot tea on the same table cools down. It means that
when the temperature of a body, ice-cold water or hot tea in
this case, and its surrounding medium are different, heat
transfer takes place between the system and the surrounding
medium, until the body and the surrounding medium are at
the same temperature. We also know that in the case of glass
tumbler of ice cold water, heat flows from the environment to

the glass tumbler, whereas in the case of hot
tea, it flows from the cup of hot tea to the
environment. So, we can say that heat is the
form of energy transferred between two (or
more) systems or a system and its
surroundings by virtue of temperature
difference. The SI unit of heat energy
transferred is the joule (J), while the SI unit
of temperature is the kelvin (K); the degree Celsius (°C) is a
commonly used unit of temperature. When an
object is heated, many changes may take place.
Its temperature may rise, it may expand or
change state. We will study the effect of heat on
different bodies in later sections.

11.3 MEASUREMENT OF TEMPERATURE

A measure of temperature is obtained using a
thermometer. Many physical properties of
materials change sufficiently with temperature
to be used as the basis for constructing
thermometers. The commonly used property is
the variation of the volume of a liquid with
temperature. An example is the common
liquid-in-glass thermometer, with
which you are familiar. Mercury and alcohol are
the liquids used in most liquid-in-glass
thermometers.
Thermometers are calibrated so that a
numerical value may be assigned to a given
temperature. For the definition of any standard
scale, two fixed reference points are needed.
Since all substances change dimensions with
temperature, an absolute reference for
expansion is not available. However, the
necessary fixed points may be correlated to
physical phenomena that always occur at the
same temperature. The ice point and the steam
point of water are two convenient fixed points
and are known as the freezing and boiling points.
These two points are the temperatures at which
pure water freezes and boils under standard
pressure. The two familiar temperature scales
are the Fahrenheit temperature scale and the
Celsius temperature scale. The ice point and
steam point have values 32 °F and 212 °F
respectively, on the Fahrenheit scale and 0 °C
and 100 °C on the Celsius scale. On the
Fahrenheit scale, there are 180 equal intervals
between the two reference points, and on the Celsius
scale, there are 100.

Fig. 11.1 A plot of Fahrenheit temperature ($t_F$) versus
Celsius temperature ($t_C$).

A relationship for converting between the two
scales may be obtained from a graph of
Fahrenheit temperature ($t_F$) versus Celsius
temperature ($t_C$), which is a straight line (Fig. 11.1)
whose equation is

$\dfrac{t_F - 32}{180} = \dfrac{t_C}{100}$   (11.1)
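As a quick check of Eq. (11.1), the conversion can be written as a pair of small Python functions. This is a minimal sketch; the function names are illustrative and not taken from the text.

```python
def celsius_to_fahrenheit(t_c):
    """Solve Eq. (11.1) for t_F: t_F = 32 + 180 * t_C / 100."""
    return 32.0 + 180.0 * t_c / 100.0


def fahrenheit_to_celsius(t_f):
    """Solve Eq. (11.1) for t_C: t_C = 100 * (t_F - 32) / 180."""
    return 100.0 * (t_f - 32.0) / 180.0


# The two fixed points quoted in the text: ice point and steam point.
print(celsius_to_fahrenheit(0.0))    # ice point on the Fahrenheit scale
print(celsius_to_fahrenheit(100.0))  # steam point on the Fahrenheit scale
print(fahrenheit_to_celsius(212.0))  # steam point back on the Celsius scale
```

The two fixed points map onto each other as expected, which is exactly the condition used to draw the straight line of Fig. 11.1.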

11.4 IDEAL-GAS EQUATION AND ABSOLUTE TEMPERATURE

Liquid-in-glass thermometers show different
readings for temperatures other than the fixed
points because of differing expansion properties.
A thermometer that uses a gas, however, gives
the same readings regardless of which gas is
used. Experiments show that all gases at low
densities exhibit the same expansion behaviour. The
variables that describe the behaviour of a given
quantity (mass) of gas are pressure, volume, and
temperature ($P$, $V$, and $T$), where $T = t + 273.15$ and
$t$ is the temperature in °C. When temperature
is held constant, the pressure and volume of a
quantity of gas are related as $PV$ = constant.
This relationship is known as Boyle's law, after
Robert Boyle (1627–1691), the English chemist
who discovered it. When the pressure is held
constant, the volume of a quantity of the gas is
related to the temperature as $V/T$ = constant.
This relationship is known as Charles' law, after
the French scientist Jacques Charles (1747–1823).
Low density gases obey these laws, which
may be combined into a single relationship.

Notice that since $PV$ = constant and $V/T$ =
constant for a given quantity of gas, $PV/T$
should also be a constant. This relationship is
known as the ideal gas law. It can be written in a
more general form that applies not just to a given
quantity of a single gas but to any quantity of
any dilute gas and is known as the ideal-gas
equation:

$\dfrac{PV}{T} = \mu R$
or $PV = \mu RT$   (11.2)

where $\mu$ is the number of moles in the sample
of gas and $R$ is called the universal gas constant:
$R$ = 8.31 J mol⁻¹ K⁻¹
In Eq. 11.2, we have learnt that the product of pressure
and volume is directly proportional to
temperature: $PV \propto T$. This relationship allows a
gas to be used to measure temperature in a
constant-volume gas thermometer. Holding the
volume of a gas constant gives $P \propto T$. Thus,
with a constant-volume gas thermometer,
temperature is read in terms of pressure. A plot
of pressure versus temperature gives a straight
line in this case, as shown in Fig. 11.2.

Fig. 11.2 Pressure versus temperature of a low
density gas kept at constant volume.

However, measurements on real gases deviate
from the values predicted by the ideal gas law
at low temperature. But the relationship is linear
over a large temperature range, and it looks as
though the pressure might reach zero with
decreasing temperature if the gas continued to
be a gas. The absolute minimum temperature
for an ideal gas is therefore inferred by
extrapolating the straight line to the axis, as in
Fig. 11.3. This temperature is found to be
−273.15 °C and is designated as absolute zero.

Fig. 11.3 A plot of pressure versus temperature and
extrapolation of lines for low density gases
indicates the same absolute zero
temperature.

Absolute zero is the foundation of the Kelvin
temperature scale, or absolute temperature scale,
named after the British scientist Lord Kelvin.
On this scale, −273.15 °C is taken as the zero
point, that is 0 K (Fig. 11.4).

Fig. 11.4 Comparison of the Kelvin, Celsius and
Fahrenheit temperature scales.

The size of the unit for Kelvin temperature is
the same as the Celsius degree, so temperatures on these
scales are related by
$T = t_C + 273.15$   (11.3)
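The idea behind the constant-volume gas thermometer can be illustrated numerically. The sketch below assumes an ideal gas obeying $PV = \mu RT$ with the value of $R$ quoted in the text; the sample size (1 mol in 0.0224 m³) and all function names are illustrative assumptions. It computes the pressure at the ice and steam points and extrapolates the straight line to zero pressure.

```python
R = 8.31  # universal gas constant in J mol^-1 K^-1, as quoted in the text


def celsius_to_kelvin(t_c):
    """Eq. (11.3): T = t_C + 273.15."""
    return t_c + 273.15


def pressure(moles, volume_m3, t_c):
    """Ideal-gas pressure P = mu * R * T / V, with T in kelvin."""
    return moles * R * celsius_to_kelvin(t_c) / volume_m3


def extrapolate_to_zero_pressure(t1, p1, t2, p2):
    """Fit a straight line through two (t, P) points and return t where P = 0."""
    slope = (p2 - p1) / (t2 - t1)
    return t1 - p1 / slope


# Assumed sample: 1 mol of gas held at a constant volume of 0.0224 m^3.
p_ice = pressure(1.0, 0.0224, 0.0)
p_steam = pressure(1.0, 0.0224, 100.0)
t_zero = extrapolate_to_zero_pressure(0.0, p_ice, 100.0, p_steam)
print(p_ice, p_steam)  # pressures in Pa at the ice and steam points
print(t_zero)          # extrapolated Celsius temperature at zero pressure
```

For an ideal gas the extrapolated temperature does not depend on the amount of gas or the volume chosen, which is why lines for different low-density gases in Fig. 11.3 meet at the same point on the temperature axis.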

11.5 THERMAL EXPANSION

You may have observed that sometimes sealed
bottles with metallic lids are so tightly screwed
that one has to put the lid in hot water for
some time to open it. This allows the
metallic cover to expand, thereby loosening it so that it
unscrews easily. In the case of liquids, you may have
observed that mercury in a thermometer rises
when the thermometer is put in slightly warm
water. If we take out the thermometer from the

warm water, the level of mercury falls again.
Similarly, in the case of gases, a balloon partially
inflated in a cool room may expand to full size
when placed in warm water. On the other hand,
a fully inflated balloon when immersed in cold
water would start shrinking due to contraction
of the air inside.
It is our common experience that most
substances expand on heating and contract on
cooling. A change in the temperature of a body
causes a change in its dimensions. The increase
in the dimensions of a body due to the increase
in its temperature is called thermal expansion.
The expansion in length is called linear
expansion. The expansion in area is called area
expansion. The expansion in volume is called
volume expansion (Fig. 11.5).

Fig. 11.5 Thermal expansion: (a) linear expansion, $\Delta l/l = \alpha_l \Delta T$;
(b) area expansion, $\Delta A/A = 2\alpha_l \Delta T$; (c) volume expansion, $\Delta V/V = 3\alpha_l \Delta T$.

If the substance is in the form of a long rod,
then for a small change in temperature, $\Delta T$, the
fractional change in length, $\Delta l/l$, is directly
proportional to $\Delta T$.

$\dfrac{\Delta l}{l} = \alpha_l \Delta T$   (11.4)

where $\alpha_l$ is known as the coefficient of linear
expansion and is characteristic of the material
of the rod. In Table 11.1 are given typical average
values of the coefficient of linear expansion for
some materials in the temperature range 0 °C
to 100 °C. From this Table, compare the value
of $\alpha_l$ for glass and copper. We find that copper
expands about five times more than glass for
the same rise in temperature. Normally, metals
expand more and have relatively high values
of $\alpha_l$.

Table 11.1 Values of coefficient of linear expansion for some materials

Materials        $\alpha_l$ (10⁻⁵ K⁻¹)
Aluminium        2.5
Brass            1.8
Iron             1.2
Copper           1.7
Silver           1.9
Gold             1.4
Glass (pyrex)    0.32
Lead             0.29

Similarly, we consider the fractional change
in volume, $\Delta V/V$, of a substance for temperature
change $\Delta T$ and define the coefficient of volume
expansion, $\alpha_V$, as

$\alpha_V = \left(\dfrac{\Delta V}{V}\right)\dfrac{1}{\Delta T}$   (11.5)

Here $\alpha_V$ is also a characteristic of the
substance but is not strictly a constant. It
depends in general on temperature (Fig. 11.6). It
is seen that $\alpha_V$ becomes constant only at a high
temperature.

Fig. 11.6 Coefficient of volume expansion of copper
as a function of temperature.

Table 11.2 gives the values of the coefficient of
volume expansion of some common substances
in the temperature range 0–100 °C. You can
see that thermal expansion of these substances
(solids and liquids) is rather small, with

materials like pyrex glass and invar (a special
iron-nickel alloy) having particularly low values
of $\alpha_V$. From this Table we find that the value of
$\alpha_V$ for alcohol (ethyl) is more than that for mercury, so alcohol
expands more than mercury for the same rise
in temperature.

Table 11.2 Values of coefficient of volume expansion for some substances

Materials          $\alpha_V$ (K⁻¹)
Aluminium          7 × 10⁻⁵
Brass              6 × 10⁻⁵
Iron               3.55 × 10⁻⁵
Paraffin           58.8 × 10⁻⁵
Glass (ordinary)   2.5 × 10⁻⁵
Glass (pyrex)      1 × 10⁻⁵
Hard rubber        2.4 × 10⁻⁴
Invar              2 × 10⁻⁶
Mercury            18.2 × 10⁻⁵
Water              20.7 × 10⁻⁵
Alcohol (ethyl)    110 × 10⁻⁵

Water exhibits an anomalous behaviour; it
contracts on heating between 0 °C and 4 °C.
The volume of a given amount of water decreases
as it is cooled from room temperature, until its
temperature reaches 4 °C [Fig. 11.7(a)]. Below
4 °C, the volume increases, and therefore the
density decreases [Fig. 11.7(b)].

Fig. 11.7 Thermal expansion of water. Panels (a) and (b) are plotted against temperature (°C).

This means that water has a maximum
density at 4 °C. This property has an important
environmental effect: bodies of water, such as
lakes and ponds, freeze at the top first. As a
lake cools toward 4 °C, water near the surface
loses energy to the atmosphere, becomes denser,
and sinks; the warmer, less dense water near
the bottom rises. However, once the colder water
on top reaches a temperature below 4 °C, it
becomes less dense and remains at the surface,
where it freezes. If water did not have this
property, lakes and ponds would freeze from the
bottom up, which would destroy much of their
animal and plant life.
Gases at ordinary temperature expand more
than solids and liquids. For liquids, the
coefficient of volume expansion is relatively
independent of the temperature. However, for
gases it is dependent on temperature. For an
ideal gas, the coefficient of volume expansion at
constant pressure can be found from the ideal
gas equation:
$PV = \mu RT$
At constant pressure
$P\Delta V = \mu R \Delta T$

$\dfrac{\Delta V}{V} = \dfrac{\Delta T}{T}$

i.e. $\alpha_V = \dfrac{1}{T}$ for an ideal gas   (11.6)

At 0 °C, $\alpha_V$ = 3.7 × 10⁻³ K⁻¹, which is much
larger than that for solids and liquids.
Equation (11.6) shows the temperature
dependence of $\alpha_V$; it decreases with increasing
temperature. For a gas at room temperature and
constant pressure $\alpha_V$ is about 3300 × 10⁻⁶ K⁻¹, as

much as order(s) of magnitude larger than the
coefficient of volume expansion of typical liquids.
There is a simple relation between the
coefficient of volume expansion ($\alpha_V$) and the
coefficient of linear expansion ($\alpha_l$). Imagine a
cube of length $l$ that expands equally in all
directions when its temperature increases by
$\Delta T$. We have
$\Delta l = \alpha_l\, l\, \Delta T$
so, $\Delta V = (l + \Delta l)^3 - l^3 \simeq 3l^2 \Delta l$   (11.7)
In equation (11.7), terms in $(\Delta l)^2$ and $(\Delta l)^3$ have
been neglected since $\Delta l$ is small compared to $l$.
So

$\Delta V = \dfrac{3V \Delta l}{l} = 3V \alpha_l \Delta T$   (11.8)

which gives

$\alpha_V = 3\alpha_l$   (11.9)

What happens if the thermal
expansion of a rod is prevented by fixing its ends rigidly?
Clearly, the rod acquires a compressive strain
due to the external forces provided by the rigid
support at the ends. The corresponding stress
set up in the rod is called thermal stress. For
example, consider a steel rail of length 5 m and
area of cross section 40 cm² that is prevented
from expanding while the temperature rises by
10 °C. The coefficient of linear expansion of steel
is $\alpha_{l(\text{steel})}$ = 1.2 × 10⁻⁵ K⁻¹. Thus, the compressive
strain is

$\dfrac{\Delta l}{l} = \alpha_{l(\text{steel})} \Delta T = 1.2 \times 10^{-5} \times 10 = 1.2 \times 10^{-4}$.

Young's modulus of steel is $Y_{\text{steel}}$ = 2 × 10¹¹ N m⁻².
Therefore, the thermal stress developed is

$\dfrac{\Delta F}{A} = Y_{\text{steel}} \left(\dfrac{\Delta l}{l}\right) = 2.4 \times 10^{7}\ \text{N m}^{-2}$,

which corresponds to an external force of

$\Delta F = A\, Y_{\text{steel}} \left(\dfrac{\Delta l}{l}\right) = 2.4 \times 10^{7} \times 40 \times 10^{-4} \simeq 10^{5}\ \text{N}$.

If two such steel rails, fixed at their outer ends,
are in contact at their inner ends, a force of this
magnitude can easily bend the rails.

Example 11.1 Show that the coefficient
of area expansion, $(\Delta A/A)/\Delta T$, of a
rectangular sheet of the solid is twice its
linear expansivity, $\alpha_l$.

Answer

Fig. 11.8 A rectangular sheet of sides $a$ and $b$, with the added areas
$\Delta A_1 = a(\Delta b)$, $\Delta A_2 = b(\Delta a)$ and $\Delta A_3 = (\Delta a)(\Delta b)$.

Consider a rectangular sheet of the solid
material of length $a$ and breadth $b$ (Fig. 11.8).
When the temperature increases by $\Delta T$, $a$
increases by $\Delta a = \alpha_l a \Delta T$ and $b$ increases by
$\Delta b = \alpha_l b \Delta T$. From Fig. 11.8, the increase in area is
$\Delta A = \Delta A_1 + \Delta A_2 + \Delta A_3$
$\Delta A = a\,\Delta b + b\,\Delta a + (\Delta a)(\Delta b)$
$= a\,\alpha_l b\,\Delta T + b\,\alpha_l a\,\Delta T + (\alpha_l)^2 ab\,(\Delta T)^2$
$= \alpha_l ab\,\Delta T\,(2 + \alpha_l \Delta T) = \alpha_l A\,\Delta T\,(2 + \alpha_l \Delta T)$
Since $\alpha_l \simeq 10^{-5}$ K⁻¹, from Table 11.1, the
product $\alpha_l \Delta T$ is small in comparison with 2
for ordinary temperature changes and may be neglected.
Hence,

$\left(\dfrac{\Delta A}{A}\right)\dfrac{1}{\Delta T} \simeq 2\alpha_l$

Example 11.2 A blacksmith fixes an iron ring
on the rim of the wooden wheel of a bullock
cart. The diameters of the rim and the iron
ring are 5.243 m and 5.231 m respectively
at 27 °C. To what temperature should the
ring be heated so as to fit the rim of the
wheel?

Answer

Given, $T_1$ = 27 °C
$L_{T1}$ = 5.231 m
$L_{T2}$ = 5.243 m
So,
$L_{T2} = L_{T1}\,[1 + \alpha_l (T_2 - T_1)]$
5.243 m = 5.231 m [1 + 1.20 × 10⁻⁵ K⁻¹ ($T_2$ − 27 °C)]
or $T_2$ = 218 °C.
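The arithmetic of Example 11.2 and of the steel-rail illustration can be checked with a short Python sketch. It assumes only the linear relations used in the text, $L_{T2} = L_{T1}[1 + \alpha_l (T_2 - T_1)]$ and stress = $Y \alpha_l \Delta T$; the function names are illustrative.

```python
def temperature_to_fit(l_initial, l_final, alpha_l, t_initial):
    """Solve L_T2 = L_T1 * (1 + alpha_l * (T2 - T1)) for T2."""
    return t_initial + (l_final / l_initial - 1.0) / alpha_l


def thermal_stress(young_modulus, alpha_l, delta_t):
    """Stress in a rod whose expansion is fully prevented: Y * alpha_l * delta_T."""
    return young_modulus * alpha_l * delta_t


# Example 11.2: iron ring of diameter 5.231 m to fit a rim of 5.243 m, starting at 27 deg C.
t2 = temperature_to_fit(5.231, 5.243, 1.20e-5, 27.0)
print(round(t2), "deg C")

# Steel rail: Y = 2e11 N m^-2, alpha_l = 1.2e-5 K^-1, temperature rise 10 K, area 40 cm^2.
stress = thermal_stress(2e11, 1.2e-5, 10.0)
force = stress * 40e-4
print(stress, "N m^-2")
print(force, "N")
```

Note that the length of the rail does not enter the stress: only the strain $\alpha_l \Delta T$ matters, so a longer rail develops the same force for the same cross section.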

11.6 SPECIFIC HEAT CAPACITY

Take some water in a vessel and start heating it
on a burner. Soon you will notice that bubbles
begin to move upward. As the temperature is
raised, the motion of water particles increases
till it becomes turbulent as water starts boiling.
What are the factors on which the quantity of
heat required to raise the temperature of a
substance depends? In order to answer this
question, in the first step, heat a given quantity
of water to raise its temperature by, say, 20 °C
and note the time taken. Again take the same
amount of water and raise its temperature by
40 °C using the same source of heat. Note the
time taken by using a stopwatch. You will find
it takes about twice the time and, therefore,
double the quantity of heat is required to raise
the temperature of the same amount of water by twice as much.
In the second step, now suppose you take
double the amount of water and heat it, using
the same heating arrangement, to raise the
temperature by 20 °C. You will find the time
taken is again twice that required in the first
step.
In the third step, in place of water, now heat
the same quantity of some oil, say mustard oil,
and raise the temperature again by 20 °C. Now
note the time by the same stopwatch. You will
find the time taken will be shorter and, therefore,
the quantity of heat required would be less than
that required by the same amount of water for
the same rise in temperature.
The above observations show that the quantity
of heat required to warm a given substance
depends on its mass, $m$, the change in
temperature, $\Delta T$, and the nature of the substance.
The change in temperature of a substance, when
a given quantity of heat is absorbed or rejected
by it, is characterised by a quantity called the
heat capacity of that substance. We define heat
capacity, $S$, of a substance as

$S = \dfrac{\Delta Q}{\Delta T}$   (11.10)

where $\Delta Q$ is the amount of heat supplied to
the substance to change its temperature from $T$
to $T + \Delta T$.
You have observed that if an equal amount of
heat is added to equal masses of different
substances, the resulting temperature changes
will not be the same. It implies that every
substance has a unique value for the amount of
heat absorbed or rejected to change the
temperature of unit mass of it by one unit. This
quantity is referred to as the specific heat
capacity of the substance.
If $\Delta Q$ stands for the amount of heat absorbed
or rejected by a substance of mass $m$ when it
undergoes a temperature change $\Delta T$, then the
specific heat capacity of that substance is given
by

$s = \dfrac{S}{m} = \dfrac{1}{m}\dfrac{\Delta Q}{\Delta T}$   (11.11)

The specific heat capacity is the property of
the substance which determines the change in
the temperature of the substance (undergoing
no phase change) when a given quantity of heat
is absorbed (or rejected) by it. It is defined as
the amount of heat per unit mass absorbed or
rejected by the substance to change its
temperature by one unit. It depends on the
nature of the substance and its temperature.
The SI unit of specific heat capacity is J kg⁻¹ K⁻¹.
If the amount of substance is specified in
terms of moles $\mu$, instead of mass $m$ in kg, we
can define heat capacity per mole of the
substance by

$C = \dfrac{S}{\mu} = \dfrac{1}{\mu}\dfrac{\Delta Q}{\Delta T}$   (11.12)

where $C$ is known as the molar specific heat
capacity of the substance. Like $s$, $C$ also
depends on the nature of the substance and its
temperature. The SI unit of molar specific heat
capacity is J mol⁻¹ K⁻¹.
However, in connection with the specific heat
capacity of gases, additional conditions may be
needed to define $C$. In this case, heat transfer
can be achieved by keeping either pressure or
volume constant. If the gas is held under
constant pressure during the heat transfer, then
it is called the molar specific heat capacity at
constant pressure and is denoted by $C_p$. On
the other hand, if the volume of the gas is
kept constant during the heat transfer, then the
corresponding molar specific heat capacity is
called the molar specific heat capacity at constant
volume and is denoted by $C_v$. For details see
Chapter 12. Table 11.3 lists the measured specific
heat capacity of some substances at atmospheric
pressure and ordinary temperature, while Table
11.4 lists molar specific heat capacities of some
gases. From Table 11.3 you can note that water

has the highest specific heat capacity compared
to other substances. For this reason water is
used as a coolant in automobile radiators as well
as a heater in hot water bags. Owing to its high
specific heat capacity, water warms up much
more slowly than the land during summer and
consequently wind from the sea has a cooling
effect. Now, you can tell why in desert areas,
the earth's surface warms up quickly during the
day and cools quickly at night.

Table 11.3 Specific heat capacity of some substances at room temperature and atmospheric pressure

Substance    Specific heat capacity (J kg⁻¹ K⁻¹)
Aluminium    900.0
Carbon       506.5
Copper       386.4
Lead         127.7
Silver       236.1
Tungsten     134.4
Water        4186.0
Ice          2060
Glass        840
Iron         450
Kerosene     2118
Edible oil   1965
Mercury      140

Table 11.4 Molar specific heat capacities of some gases

Gas    $C_p$ (J mol⁻¹ K⁻¹)    $C_v$ (J mol⁻¹ K⁻¹)
He     20.8                   12.5
H₂     28.8                   20.4
N₂     29.1                   20.8
O₂     29.4                   21.1
CO₂    37.0                   28.5

11.7 CALORIMETRY

A system is said to be isolated if no exchange or
transfer of heat occurs between the system and
its surroundings. When different parts of an
isolated system are at different temperatures, a
quantity of heat transfers from the part at higher
temperature to the part at lower temperature.
The heat lost by the part at higher temperature
is equal to the heat gained by the part at lower
temperature.
Calorimetry means measurement of heat.
When a body at higher temperature is brought
in contact with another body at lower
temperature, the heat lost by the hot body is
equal to the heat gained by the colder body,
provided no heat is allowed to escape to the
surroundings. A device in which heat
measurement can be made is called a
calorimeter. It consists of a metallic vessel and
a stirrer of the same material, like copper or
aluminium. The vessel is kept inside a wooden
jacket which contains heat insulating materials
like glass wool. The outer jacket acts as a
heat shield and reduces the heat loss from the
inner vessel. There is an opening in the outer
jacket through which a mercury thermometer
can be inserted into the calorimeter. The
following example provides a method by which
the specific heat capacity of a given solid can be
determined by using the principle that heat gained
is equal to heat lost.

Example 11.3 A sphere of aluminium of
0.047 kg is placed for sufficient time in a
vessel containing boiling water, so that the
sphere is at 100 °C. It is then immediately
transferred to a 0.14 kg copper calorimeter
containing 0.25 kg of water at 20 °C. The
temperature of water rises and attains a
steady state at 23 °C. Calculate the specific
heat capacity of aluminium.

Answer In solving this example we shall use
the fact that at a steady state, heat given by the
aluminium sphere will be equal to the heat
absorbed by the water and calorimeter.
Mass of aluminium sphere ($m_1$) = 0.047 kg
Initial temp. of aluminium sphere = 100 °C
Final temp. = 23 °C
Change in temp. ($\Delta T$) = (100 °C − 23 °C) = 77 °C
Let the specific heat capacity of aluminium be $s_{Al}$.

The amount of heat lost by the aluminium
sphere = $m_1 s_{Al} \Delta T$ = 0.047 kg × $s_{Al}$ × 77 °C
Mass of water ($m_2$) = 0.25 kg
Mass of calorimeter ($m_3$) = 0.14 kg
Initial temp. of water and calorimeter = 20 °C
Final temp. of the mixture = 23 °C
Change in temp. ($\Delta T_2$) = 23 °C − 20 °C = 3 °C
Specific heat capacity of water ($s_w$)
= 4.18 × 10³ J kg⁻¹ K⁻¹
Specific heat capacity of copper calorimeter ($s_{cu}$)
= 0.386 × 10³ J kg⁻¹ K⁻¹
The amount of heat gained by water and
calorimeter = $m_2 s_w \Delta T_2 + m_3 s_{cu} \Delta T_2$
= $(m_2 s_w + m_3 s_{cu})(\Delta T_2)$
= (0.25 kg × 4.18 × 10³ J kg⁻¹ K⁻¹ + 0.14 kg ×
0.386 × 10³ J kg⁻¹ K⁻¹) (23 °C − 20 °C)
In the steady state, heat lost by the aluminium
sphere = heat gained by water + heat gained by
calorimeter.
So, 0.047 kg × $s_{Al}$ × 77 °C
= (0.25 kg × 4.18 × 10³ J kg⁻¹ K⁻¹ + 0.14 kg ×
0.386 × 10³ J kg⁻¹ K⁻¹)(3 °C)
$s_{Al}$ = 0.911 kJ kg⁻¹ K⁻¹

11.8 CHANGE OF STATE

Matter normally exists in three states: solid,
liquid, and gas. A transition from one of these
states to another is called a change of state. Two
common changes of states are solid to liquid
and liquid to gas (and vice versa). These changes
can occur when the exchange of heat takes place
between the substance and its surroundings.
To study the change of state on heating or
cooling, let us perform the following activity.
Take some cubes of ice in a beaker. Note the
temperature of ice (0 °C). Start heating it slowly
on a constant heat source. Note the temperature
after every minute. Continuously stir the
mixture of water and ice. Draw a graph between
temperature and time (Fig. 11.9). You will
observe no change in the temperature so long
as there is ice in the beaker. In the above process,
the temperature of the system does not change
even though heat is being continuously supplied.
The heat supplied is being utilised in changing
the state from solid (ice) to liquid (water).

Fig. 11.9 A plot of temperature versus time showing
the changes in the state of ice on heating
(not to scale).

The change of state from solid to liquid is
called melting (or fusion) and from liquid to solid is called
freezing. It is observed that the temperature
remains constant until the entire amount of the
solid substance melts. That is, both the solid
and liquid states of the substance coexist in
thermal equilibrium during the change of
state from solid to liquid. The temperature
at which the solid and the liquid states of the
substance are in thermal equilibrium with each
other is called its melting point. It is
characteristic of the substance. It also depends
on pressure. The melting point of a substance
at standard atmospheric pressure is called its
normal melting point. Let us do the following
activity to understand the process of melting
of ice.
Take a slab of ice. Take a metallic wire and
fix two blocks, say 5 kg each, at its ends. Put
the wire over the slab as shown in Fig. 11.10.
You will observe that the wire passes through
the ice slab. This happens due to the fact that
just below the wire, ice melts at a lower
temperature due to the increase in pressure. When
the wire has passed, water above the wire freezes
again. Thus the wire passes through the slab
and the slab does not split. This phenomenon
of refreezing is called regelation. Skating is
possible on snow due to the formation of water
below the skates. Water is formed due to the
increase of pressure and it acts as a
lubricant.
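Returning briefly to the calorimetry of Example 11.3, the heat balance (heat lost by the sphere = heat gained by water and calorimeter) can be verified numerically. The sketch below is a minimal check under the same assumption of no heat loss to the surroundings; the function and parameter names are illustrative.

```python
def specific_heat_of_sample(m_sample, dt_sample,
                            m_water, s_water,
                            m_calorimeter, s_calorimeter, dt_bath):
    """Solve m_sample * s * dt_sample = (m_w * s_w + m_c * s_c) * dt_bath for s."""
    heat_gained = (m_water * s_water + m_calorimeter * s_calorimeter) * dt_bath
    return heat_gained / (m_sample * dt_sample)


# Data of Example 11.3 (SI units; temperature differences in K or deg C are equal).
s_al = specific_heat_of_sample(
    m_sample=0.047, dt_sample=100.0 - 23.0,
    m_water=0.25, s_water=4.18e3,
    m_calorimeter=0.14, s_calorimeter=0.386e3,
    dt_bath=23.0 - 20.0,
)
print(round(s_al), "J kg^-1 K^-1")
```

The result is close to the tabulated value for aluminium in Table 11.3, which suggests that neglecting heat loss is a reasonable approximation for this experiment.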

Fig. 11.10

After the whole of ice gets converted into water
and as we continue further heating, we shall
see that the temperature begins to rise. The
temperature keeps on rising till it reaches nearly
100 °C, when it again becomes steady. The heat
supplied is now being utilised to change water
from liquid state to vapour or gaseous state.

Triple Point

The temperature of a substance remains constant during its change of state (phase change).
A graph between the temperature T and the pressure P of the substance is called a phase
diagram or P–T diagram. The following figure shows the phase diagrams of water and CO₂.
Such a phase diagram divides the P–T plane into a solid region, a vapour region and a
liquid region. The regions are separated by curves such as the sublimation curve (BO), the fusion
curve (AO) and the vaporisation curve (CO). Points on the sublimation curve BO represent
states in which the solid and vapour phases coexist. Points on the fusion curve AO represent
states in which the solid and liquid phases coexist. Points on the vaporisation curve CO represent
states in which the liquid and vapour phases coexist. The temperature and pressure at which
the fusion curve, the vaporisation curve and the sublimation curve meet and all the three
phases of a substance coexist is called the triple point of the substance. For example, the
triple point of water is represented by the temperature 273.16 K and pressure 6.11 × 10² Pa.

Pressure-temperature phase diagrams for (a) water and (b) CO₂ (not to scale).

The change of state from liquid to vapour (or
gas) is called vaporisation. It is observed that
the temperature remains constant until the
entire amount of the liquid is converted into
vapour. That is, both the liquid and vapour states
of the substance coexist in thermal equilibrium
during the change of state from liquid to vapour.
The temperature at which the liquid and the
vapour states of the substance coexist is called
its boiling point. Let us do the following activity
to understand the process of boiling of water.
Take a round-bottom flask, more than half
filled with water. Keep it over a burner and fix a

thermometer and steam outlet through the cork
of the flask (Fig. 11.11). As water gets heated in
the flask, note first that the air, which was
dissolved in the water, will come out as small
bubbles. Later, bubbles of steam will form at
the bottom but as they rise to the cooler water
near the top, they condense and disappear.
Finally, as the temperature of the entire mass
of the water reaches 100 °C, bubbles of steam
reach the surface and boiling is said to occur.
The steam in the flask may not be visible but as
it comes out of the flask, it condenses as tiny
droplets of water, giving a foggy appearance.

Fig. 11.11 Boiling process.

cork. Keep the flask turned upside down on the
stand. Pour ice-cold water on the flask. Water
vapours in the flask condense, reducing the
pressure on the water surface inside the flask.
Water begins to boil again, now at a lower
temperature. Thus boiling point decreases with
decrease in pressure.
This explains why cooking is difficult on hills.
At high altitudes, atmospheric pressure is lower,
reducing the boiling point of water as compared
to that at sea level. On the other hand, boiling
point is increased inside a pressure cooker by
increasing the pressure. Hence cooking is faster.
The boiling point of a substance at standard
atmospheric pressure is called its normal
boiling point.
However, not all substances pass through
the three states: solid-liquid-gas. There are
certain substances which normally pass from
the solid to the vapour state directly and vice
versa. The change from solid state to vapour
state without passing through the liquid state
is called sublimation, and the substance is said
to sublime. Dry ice (solid CO₂) sublimes, and so does
iodine. During the sublimation process both the
solid and vapour states of a substance coexist
in thermal equilibrium.

11.8.1 Latent Heat

If now the steam outlet is closed for a few


seconds to increase the pressure in the flask,
you will notice that boiling stops. More heat
would be required to raise the temperature
(depending on the increase in pressure) before
boiling begins again. Thus boiling point increases
with increase in pressure.
Let us now remove the burner. Allow water to
cool to about 80 C. Remove the thermometer
and steam outlet. Close the flask with the airtight

In Section 11.8, we have learnt that certain


amount of heat energy is transferred between a
substance and its surroundings when it
undergoes a change of state. The amount of heat
per unit mass transferred during change of state
of the substance is called latent heat of the
substance for the process. For example, if heat
is added to a given quantity of ice at 10 C, the
temperature of ice increases until it reaches its
melting point (0 C). At this temperature, the
addition of more heat does not increase the
temperature but causes the ice to melt, or
changes its state. Once the entire ice melts,
adding more heat will cause the temperature of
the water to rise. A similar situation
occurs during liquid gas change of state at the
boiling point. Adding more heat to boiling water
causes vaporisation, without increase in
temperature.

The heat required during a change of state
depends upon the heat of transformation and
the mass of the substance undergoing a change
of state. Thus, if mass m of a substance
undergoes a change from one state to the other,
then the quantity of heat required is given by
Q = m L
or L = Q/m                                   (11.13)
where L is known as latent heat and is a
characteristic of the substance. Its SI unit is
J kg⁻¹. The value of L also depends on the
pressure. Its value is usually quoted at standard
atmospheric pressure. The latent heat for a
solid-liquid state change is called the latent heat of
fusion (Lf), and that for a liquid-gas state change
is called the latent heat of vaporisation (Lv).
These are often referred to as the heat of fusion
and the heat of vaporisation. A plot of
temperature versus heat energy for a quantity
of water is shown in Fig. 11.12. The latent heats
of some substances, and their melting and boiling
points, are given in Table 11.5.

Table 11.5 Temperatures of the change of state and latent heats for various substances at
1 atm pressure

Substance       Melting      Lf              Boiling      Lv
                point (°C)   (10⁵ J kg⁻¹)    point (°C)   (10⁵ J kg⁻¹)
Ethyl alcohol   −114         1.0             78           8.5
Gold            1063         0.645           2660         15.8
Lead            328          0.25            1744         8.67
Mercury         −39          0.12            357          2.7
Nitrogen        −210         0.26            −196         2.0
Oxygen          −219         0.14            −183         2.1
Water           0            3.33            100          22.6

Note that when heat is added (or removed)
during a change of state, the temperature
remains constant. Note in Fig. 11.12 that the
slopes of the phase lines are not all the same,
which indicates that specific heats of the various
states are not equal. For water, the latent heats of
fusion and vaporisation are Lf = 3.33 × 10⁵ J kg⁻¹
and Lv = 22.6 × 10⁵ J kg⁻¹ respectively. That is,
3.33 × 10⁵ J of heat is needed to melt 1 kg of
ice at 0 °C, and 22.6 × 10⁵ J of heat is needed
to convert 1 kg of water to steam at 100 °C. So,
steam at 100 °C carries 22.6 × 10⁵ J kg⁻¹ more
heat than water at 100 °C. This is why burns
from steam are usually more serious than those
from boiling water.

Fig. 11.12 Temperature versus heat for water at
1 atm pressure (not to scale).
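The relation Q = mL of Eq. (11.13) can be tried out numerically. The short Python sketch below is only an illustration: it uses the latent heats of water from Table 11.5 and the specific heat of water quoted in the worked examples (4186 J kg⁻¹ K⁻¹), while the function names and the choice of 1 kg of ice are assumptions made for the example.

```python
# Latent heats of water at 1 atm, from Table 11.5 (J/kg).
L_FUSION = 3.33e5
L_VAPORISATION = 22.6e5
# Specific heat of liquid water (J/kg/K), the value used in the worked examples.
S_WATER = 4186.0


def latent_heat_energy(mass_kg, latent_heat):
    """Heat for a change of state at constant temperature, Q = m L (Eq. 11.13)."""
    return mass_kg * latent_heat


def ice_to_steam_heat(mass_kg):
    """Heat to take ice at 0 °C to steam at 100 °C, stage by stage (in joules)."""
    melt = latent_heat_energy(mass_kg, L_FUSION)        # ice -> water at 0 °C
    warm = mass_kg * S_WATER * (100.0 - 0.0)            # water from 0 °C to 100 °C
    boil = latent_heat_energy(mass_kg, L_VAPORISATION)  # water -> steam at 100 °C
    return {"melt": melt, "warm": warm, "boil": boil, "total": melt + warm + boil}


# Illustrative choice: 1 kg of ice.
stages = ice_to_steam_heat(1.0)
for name, joules in stages.items():
    print(f"{name:>5}: {joules:.3e} J")
```

The vaporisation stage dominates the total, which is consistent with the remark that steam at 100 °C carries much more heat than water at 100 °C.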

Example 11.4 When 0.15 kg of ice at 0 °C is
mixed with 0.30 kg of water at 50 °C in a
container, the resulting temperature is
6.7 °C. Calculate the heat of fusion of ice.
(swater = 4186 J kg⁻¹ K⁻¹)

Answer
Heat lost by water = m sw (θi − θf)w
= (0.30 kg) (4186 J kg⁻¹ K⁻¹) (50.0 °C − 6.7 °C)
= 54376.14 J
Heat required to melt ice = mI Lf = (0.15 kg) Lf
Heat required to raise temperature of ice
water to final temperature = mI sw (θf − θi)I
= (0.15 kg) (4186 J kg⁻¹ K⁻¹) (6.7 °C − 0 °C)
= 4206.93 J
Heat lost = heat gained
54376.14 J = (0.15 kg) Lf + 4206.93 J
Lf = 3.34 × 10⁵ J kg⁻¹.
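The heat balance used in Example 11.4 can be checked with a few lines of Python. This is a minimal sketch, assuming no heat is exchanged with the container or the surroundings; the function name and argument names are chosen here for illustration only.

```python
def heat_of_fusion_from_mixture(m_ice, m_water, t_water, t_final, s_water=4186.0):
    """Solve 'heat lost = heat gained' for L_f, with the ice initially at 0 °C.

    Masses in kg, temperatures in °C, s_water in J/kg/K; returns L_f in J/kg.
    """
    heat_lost = m_water * s_water * (t_water - t_final)    # given up by the warm water
    heat_to_warm_melt = m_ice * s_water * (t_final - 0.0)  # warms the melted ice
    return (heat_lost - heat_to_warm_melt) / m_ice


# Data of Example 11.4.
L_f = heat_of_fusion_from_mixture(m_ice=0.15, m_water=0.30, t_water=50.0, t_final=6.7)
print(f"L_f = {L_f:.3g} J/kg")
```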

Example 11.5 Calculate the heat required
to convert 3 kg of ice at −12 °C kept in a
calorimeter to steam at 100 °C at
atmospheric pressure. Given specific heat
capacity of ice = 2100 J kg⁻¹ K⁻¹, specific
heat capacity of water = 4186 J kg⁻¹ K⁻¹,
latent heat of fusion of ice = 3.35 × 10⁵
J kg⁻¹ and latent heat of steam = 2.256 × 10⁶
J kg⁻¹.

Answer We have
Mass of the ice, m = 3 kg
specific heat capacity of ice, sice = 2100 J kg⁻¹ K⁻¹
specific heat capacity of water, swater = 4186 J kg⁻¹ K⁻¹
latent heat of fusion of ice, Lf ice = 3.35 × 10⁵ J kg⁻¹
latent heat of steam, Lsteam = 2.256 × 10⁶ J kg⁻¹

Now,
Q  = heat required to convert 3 kg of
     ice at −12 °C to steam at 100 °C,
Q1 = heat required to convert ice at
     −12 °C to ice at 0 °C
   = m sice ΔT1 = (3 kg) (2100 J kg⁻¹ K⁻¹) [0 − (−12)] °C = 75600 J
Q2 = heat required to melt ice at
     0 °C to water at 0 °C
   = m Lf ice = (3 kg) (3.35 × 10⁵ J kg⁻¹)
   = 1005000 J
Q3 = heat required to convert water
     at 0 °C to water at 100 °C
   = m sw ΔT2 = (3 kg) (4186 J kg⁻¹ K⁻¹) (100 °C)
   = 1255800 J
Q4 = heat required to convert water
     at 100 °C to steam at 100 °C
   = m Lsteam = (3 kg) (2.256 × 10⁶ J kg⁻¹)
   = 6768000 J
So, Q = Q1 + Q2 + Q3 + Q4
   = 75600 J + 1005000 J + 1255800 J + 6768000 J
   = 9.1 × 10⁶ J

11.9 HEAT TRANSFER

We have seen that heat is energy transfer from
one system to another or from one part of a
system to another part, arising due to
temperature difference. What are the different
ways by which this energy transfer takes place?
There are three distinct modes of heat transfer :
conduction, convection and radiation
(Fig. 11.13).

Fig. 11.13 Heating by conduction, convection and
radiation.

11.9.1 Conduction

Conduction is the mechanism of transfer of heat
between two adjacent parts of a body because
of their temperature difference. Suppose one end
of a metallic rod is put in a flame. The other end
of the rod will soon be so hot that you cannot
hold it with your bare hands. Here heat transfer
takes place by conduction from the hot end of
the rod through its different parts to the other
end. Gases are poor thermal conductors while
liquids have conductivities intermediate between
solids and gases.
Heat conduction may be described
quantitatively as the time rate of heat flow in a
material for a given temperature difference.
Consider a metallic bar of length L and uniform
cross section A with its two ends maintained at
different temperatures. This can be done, for
example, by putting the ends in thermal contact
with large reservoirs at temperatures, say, TC and
TD respectively (Fig. 11.14). Let us assume the
ideal condition that the sides of the bar are fully
insulated so that no heat is exchanged between
the sides and the surroundings.
After some time, a steady state is reached; the
temperature of the bar decreases uniformly with
distance from TC to TD; (TC > TD). The reservoir at
C supplies heat at a constant rate, which
transfers through the bar and is given out at
the same rate to the reservoir at D. It is found
experimentally that in this steady state, the rate
of flow of heat (or heat current) H is proportional
to the temperature difference (TC − TD) and the
area of cross section A and is inversely
proportional to the length L :

$$H = KA\,\frac{T_C - T_D}{L} \qquad (11.14)$$

The constant of proportionality K is called the
thermal conductivity of the material. The
greater the value of K for a material, the more
rapidly will it conduct heat. The SI unit of K is
J s⁻¹ m⁻¹ K⁻¹ or W m⁻¹ K⁻¹. The thermal
conductivities of various substances are listed
in Table 11.6. These values vary slightly with
temperature, but can be considered to be
constant over a normal temperature range.

Fig. 11.14 Steady state heat flow by conduction in
a bar with its two ends maintained at
temperatures TC and TD; (TC > TD).

Compare the relatively large thermal
conductivities of the good thermal conductors,
the metals, with the relatively small thermal
conductivities of some good thermal insulators,
such as wood and glass wool. You may have
noticed that some cooking pots have copper
coating on the bottom. Being a good conductor
of heat, copper promotes the distribution of heat
over the bottom of a pot for uniform cooking.
Plastic foams, on the other hand, are good
insulators, mainly because they contain pockets
of air. Recall that gases are poor conductors,
and note the low thermal conductivity of air in
Table 11.6. Heat retention and transfer are
important in many other applications. Houses
with concrete roofs get very hot during
summer days, because the thermal conductivity of
concrete (though much smaller than that of a
metal) is still not small enough. Therefore, people
usually prefer to put a layer of earth or foam
insulation on the ceiling so that heat transfer is
reduced and the room stays cooler. In some
situations, heat transfer is critical. In a nuclear
reactor, for example, elaborate heat transfer
systems need to be installed so that the
enormous energy produced by nuclear fission
in the core is carried away sufficiently fast, thus
preventing the core from overheating.

Table 11.6 Thermal conductivities of some
materials

Materials          Thermal conductivity
                   (J s⁻¹ m⁻¹ K⁻¹)

Metals
Silver             406
Copper             385
Aluminium          205
Brass              109
Steel              50.2
Lead               34.7
Mercury            8.3

Non-metals
Insulating brick   0.15
Concrete           0.8
Body fat           0.20
Felt               0.04
Glass              0.8
Ice                1.6
Glass wool         0.04
Wood               0.12
Water              0.8

Gases
Air                0.024
Argon              0.016
Hydrogen           0.14
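To get a feel for Eq. (11.14), the heat current H = KA (TC − TD)/L can be evaluated for a few materials from Table 11.6. The sketch below is illustrative only: the slab geometry (1 m² face, 0.10 m thick) and the 20 K temperature difference are hypothetical values chosen for the comparison, and the function name is an assumption.

```python
# Thermal conductivities in W m^-1 K^-1, taken from Table 11.6.
CONDUCTIVITY = {"copper": 385.0, "concrete": 0.8, "glass wool": 0.04}


def heat_current(k, area_m2, length_m, t_hot, t_cold):
    """Steady-state heat current H = K A (T_hot - T_cold) / L, in watts."""
    return k * area_m2 * (t_hot - t_cold) / length_m


# Hypothetical slab: 1 m^2 face, 0.10 m thick, 20 K across it.
for material, k in CONDUCTIVITY.items():
    h = heat_current(k, area_m2=1.0, length_m=0.10, t_hot=40.0, t_cold=20.0)
    print(f"{material:>10}: H = {h:.1f} W")
```

For the same geometry the heat current scales directly with K, so the glass wool slab passes several thousand times less heat than the copper one.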

Example 11.6 What is the temperature of
the steel-copper junction in the steady
state of the system shown in Fig. 11.15?
Length of the steel rod = 15.0 cm, length
of the copper rod = 10.0 cm, temperature
of the furnace = 300 °C, temperature of
the other end = 0 °C. The area of cross
section of the steel rod is twice that of the
copper rod. (Thermal conductivity of steel
= 50.2 J s⁻¹ m⁻¹ K⁻¹; and of copper
= 385 J s⁻¹ m⁻¹ K⁻¹).

Fig. 11.15

Answer The insulating material around the
rods reduces heat loss from the sides of the rods.
Therefore, heat flows only along the length of
the rods. Consider any cross section of the rod.
In the steady state, heat flowing into the element
must equal the heat flowing out of it; otherwise
there would be a net gain or loss of heat by the
element and its temperature would not be
steady. Thus in the steady state, rate of heat
flowing across a cross section of the rod is the
same at every point along the length of the
combined steel-copper rod. Let T be the
temperature of the steel-copper junction in the
steady state. Then,

$$\frac{K_1 A_1 (300 - T)}{L_1} = \frac{K_2 A_2 (T - 0)}{L_2}$$

where 1 and 2 refer to the steel and copper rod
respectively. For A1 = 2 A2, L1 = 15.0 cm,
L2 = 10.0 cm, K1 = 50.2 J s⁻¹ m⁻¹ K⁻¹, K2 = 385 J
s⁻¹ m⁻¹ K⁻¹, we have

$$\frac{50.2 \times 2\,(300 - T)}{15} = \frac{385\,T}{10}$$

which gives T = 44.4 °C

Example 11.7 An iron bar (L1 = 0.1 m, A1
= 0.02 m², K1 = 79 W m⁻¹ K⁻¹) and a
brass bar (L2 = 0.1 m, A2 = 0.02 m²,
K2 = 109 W m⁻¹ K⁻¹) are soldered end to end
as shown in Fig. 11.16. The free ends of
the iron bar and brass bar are maintained
at 373 K and 273 K respectively. Obtain
expressions for and hence compute (i) the
temperature of the junction of the two bars,
(ii) the equivalent thermal conductivity of
the compound bar, and (iii) the heat
current through the compound bar.

Fig. 11.16

Answer
Given, L1 = L2 = L = 0.1 m, A1 = A2 = A = 0.02 m²,
K1 = 79 W m⁻¹ K⁻¹, K2 = 109 W m⁻¹ K⁻¹,
T1 = 373 K, and T2 = 273 K.
Under steady state condition, the heat
current (H1) through iron bar is equal to the
heat current (H2) through brass bar.
So, H = H1 = H2

$$\frac{K_1 A_1 (T_1 - T_0)}{L_1} = \frac{K_2 A_2 (T_0 - T_2)}{L_2}$$

For A1 = A2 = A and L1 = L2 = L, this equation
leads to
K1 (T1 − T0) = K2 (T0 − T2)
Thus the junction temperature T0 of the two
bars is

$$T_0 = \frac{K_1 T_1 + K_2 T_2}{K_1 + K_2}$$

Using this equation, the heat current H through
either bar is

$$H = \frac{K_1 A (T_1 - T_0)}{L} = \frac{K_2 A (T_0 - T_2)}{L} = \left(\frac{K_1 K_2}{K_1 + K_2}\right) \frac{A (T_1 - T_2)}{L} = \frac{A (T_1 - T_2)}{L \left(\dfrac{1}{K_1} + \dfrac{1}{K_2}\right)}$$

Using these equations, the heat current H′
through the compound bar of length L1 + L2 = 2L
and the equivalent thermal conductivity K′, of
the compound bar are given by

$$H' = \frac{K' A (T_1 - T_2)}{2L} = H, \qquad K' = \frac{2 K_1 K_2}{K_1 + K_2}$$

(i)
$$T_0 = \frac{K_1 T_1 + K_2 T_2}{K_1 + K_2} = \frac{(79\ \mathrm{W\,m^{-1}K^{-1}})(373\ \mathrm{K}) + (109\ \mathrm{W\,m^{-1}K^{-1}})(273\ \mathrm{K})}{79\ \mathrm{W\,m^{-1}K^{-1}} + 109\ \mathrm{W\,m^{-1}K^{-1}}} = 315\ \mathrm{K}$$

(ii)
$$K' = \frac{2 K_1 K_2}{K_1 + K_2} = \frac{2\,(79\ \mathrm{W\,m^{-1}K^{-1}})(109\ \mathrm{W\,m^{-1}K^{-1}})}{79\ \mathrm{W\,m^{-1}K^{-1}} + 109\ \mathrm{W\,m^{-1}K^{-1}}} = 91.6\ \mathrm{W\,m^{-1}K^{-1}}$$

(iii)
$$H' = H = \frac{K' A (T_1 - T_2)}{2L} = \frac{(91.6\ \mathrm{W\,m^{-1}K^{-1}})(0.02\ \mathrm{m^2})(373\ \mathrm{K} - 273\ \mathrm{K})}{2 \times (0.1\ \mathrm{m})} = 916.1\ \mathrm{W}$$
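The steady-state condition used in Examples 11.6 and 11.7 (the same heat current through both bars) can be sketched in Python as below. The helper names and the idea of a "conductance" KA/L per bar are introduced here for illustration and are not part of the text; in Example 11.6 only the ratio of the areas is given, so the areas and lengths are entered in arbitrary but consistent units.

```python
def junction_temperature(k1, a1, l1, t1, k2, a2, l2, t2):
    """Junction temperature of two bars joined end to end, from H1 = H2."""
    g1 = k1 * a1 / l1  # conductance of bar 1, K A / L
    g2 = k2 * a2 / l2  # conductance of bar 2
    return (g1 * t1 + g2 * t2) / (g1 + g2)


def series_heat_current(k1, a1, l1, t1, k2, a2, l2, t2):
    """Heat current through the compound bar; SI inputs give watts."""
    g1 = k1 * a1 / l1
    g2 = k2 * a2 / l2
    return (t1 - t2) / (1.0 / g1 + 1.0 / g2)


# Example 11.6: steel (A1 = 2 A2, 15 cm) and copper (10 cm); only ratios matter here.
T = junction_temperature(50.2, 2.0, 15.0, 300.0, 385.0, 1.0, 10.0, 0.0)
print(f"steel-copper junction: {T:.1f} °C")

# Example 11.7: iron and brass bars in SI units.
T0 = junction_temperature(79.0, 0.02, 0.1, 373.0, 109.0, 0.02, 0.1, 273.0)
H = series_heat_current(79.0, 0.02, 0.1, 373.0, 109.0, 0.02, 0.1, 273.0)
print(f"iron-brass junction: {T0:.0f} K, heat current: {H:.1f} W")
```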
11.9.2 Convection


Convection is a mode of heat transfer by actual
motion of matter. It is possible only in fluids.
Convection can be natural or forced. In natural
convection, gravity plays an important part.
When a fluid is heated from below, the hot part
expands and, therefore, becomes less dense.
Because of buoyancy, it rises and the upper
colder part replaces it. This again gets heated,
rises up and is replaced by the colder part of
the fluid. The process goes on. This mode of
heat transfer is evidently different from
conduction. Convection involves bulk transport
of different parts of the fluid. In forced
convection, material is forced to move by a pump
or by some other physical means. The common
examples of forced convection systems are
forced-air heating systems in homes, the human
circulatory system, and the cooling system of
an automobile engine. In the human body, the
heart acts as the pump that circulates blood
through different parts of the body, transferring
heat by forced convection and maintaining it at
a uniform temperature.
Natural convection is responsible for many
familiar phenomena. During the day, the ground
heats up more quickly than large bodies of water

do. This occurs both because the water has a
greater specific heat and because mixing
currents disperse the absorbed heat throughout
the great volume of water. The air in contact
with the warm ground is heated by conduction.
It expands, becoming less dense than the
surrounding cooler air. As a result, the warm
air rises (air currents) and other air moves
(winds) to fill the space, creating a sea breeze
near a large body of water. Cooler air descends,
and a thermal convection cycle is set up, which
transfers heat away from the land. At night,
the ground loses its heat more quickly, and the
water surface is warmer than the land. As a
result, the cycle is reversed (Fig. 11.17).
The other example of natural convection is
the steady surface wind on the earth blowing
in from north-east towards the equator, the so
called trade wind. A reasonable explanation is
as follows : the equatorial and polar regions of
the earth receive unequal solar heat. Air at the
earth's surface near the equator is hot while
the air in the upper atmosphere of the poles is
cool. In the absence of any other factor, a
convection current would be set up, with the
air at the equatorial surface rising and moving
out towards the poles, descending and
streaming in towards the equator. The rotation
of the earth, however, modifies this convection
current. Because of this, air close to the equator
has an eastward speed of 1600 km/h, while it
is zero close to the poles. As a result, the air
descends not at the poles but at 30° N (North)
latitude and returns to the equator. This is
called trade wind.

Fig. 11.17 Convection cycles.

11.9.3 Radiation

Conduction and convection require some
material as a transport medium. These modes
of heat transfer cannot operate between bodies
separated by a distance in vacuum. But the
earth does receive heat from the sun across a
huge distance and we quickly feel the warmth
of the fire nearby even though air conducts
poorly and before convection can set in. The
third mechanism for heat transfer needs no
medium; it is called radiation and the energy
so radiated by electromagnetic waves is called
radiant energy. In an electromagnetic wave
electric and magnetic fields oscillate in space
and time. Like any wave, electromagnetic waves
can have different wavelengths and can travel
in vacuum with the same speed, namely the
speed of light i.e., 3 × 10⁸ m s⁻¹. You will learn
these matters in more detail later, but you now
know why heat transfer by radiation does not
need any medium and why it is so fast. This is
how heat is transferred to the earth from the
sun through empty space. All bodies emit
radiant energy, whether they are solid, liquid
or gases. The electromagnetic radiation emitted
by a body by virtue of its temperature, like the
radiation by a red hot iron or light from a
filament lamp, is called thermal radiation.
When this thermal radiation falls on other
bodies, it is partly reflected and partly absorbed.
The amount of heat that a body can absorb by
radiation depends on the colour of the body.
We find that black bodies absorb and emit
radiant energy better than bodies of lighter
colours. This fact finds many applications in
our daily life. We wear white or light coloured
clothes in summer so that they absorb the least
heat from the sun. However, during winter, we
use dark coloured clothes which absorb heat
from the sun and keep our body warm. The
bottoms of the utensils for cooking food are
blackened so that they absorb maximum heat
from the fire and give it to the vegetables to be
cooked.
Similarly, a Dewar flask or thermos bottle is
a device to minimise heat transfer between the
contents of the bottle and outside. It consists
of a double-walled glass vessel with the inner
and outer walls coated with silver. Radiation
from the inner wall is reflected back into the
contents of the bottle. The outer wall similarly
reflects back any incoming radiation. The space
between the walls is evacuated to reduce
conduction and convection losses and the flask
is supported on an insulator like cork. The
device is, therefore, useful for preventing hot
contents (like milk) from getting cold, or
alternatively to store cold contents (like ice).

11.10 NEWTON'S LAW OF COOLING

We all know that hot water or milk when left on
a table begins to cool gradually. Ultimately it
attains the temperature of the surroundings.
To study how a given body can cool on
exchanging heat with its surroundings, let us
perform the following activity.
Take some water, say 300 ml, in a
calorimeter with a stirrer and cover it with a
two-holed lid. Fix a thermometer through a
hole in the lid and make sure that the bulb of
the thermometer is immersed in the water. Note
the reading of the thermometer. This reading
T1 is the temperature of the surroundings.
Heat the water kept in the calorimeter till it
attains a temperature, say, 40 °C above room
temperature (i.e., temperature of the
surroundings). Then stop heating the water
by removing the heat source. Start the
stopwatch and note the reading of the
thermometer after fixed intervals of time, say
after every one minute, stirring gently with
the stirrer. Continue to note the temperature
(T2) of water till it attains a temperature about
5 °C above that of the surroundings. Then plot
a graph by taking each value of temperature
ΔT = T2 − T1 along the y-axis and the corresponding
value of t along the x-axis (Fig. 11.18).

Fig. 11.18 Curve showing cooling of hot water
with time.

From the graph you will infer how the cooling
of hot water depends on the difference of its
temperature from that of the surroundings. You
will also notice that initially the rate of cooling
is higher and decreases as the temperature of
the body falls.
The above activity shows that a hot body loses
heat to its surroundings in the form of heat
radiation. The rate of loss of heat depends on
the difference in temperature between the body
and its surroundings. Newton was the first to
study, in a systematic manner, the relation
between the heat lost by a body in a given
enclosure and its temperature.
According to Newton's law of cooling, the rate
of loss of heat, −dQ/dt of the body is directly
proportional to the difference of temperature
ΔT = (T2 − T1) of the body and the surroundings.
The law holds good only for small difference of
temperature. Also, the loss of heat by radiation
depends upon the nature of the surface of the
body and the area of the exposed surface. We
can write

$$-\frac{dQ}{dt} = k\,(T_2 - T_1) \qquad (11.15)$$

where k is a positive constant depending upon
the area and nature of the surface of the body.
Suppose a body of mass m and specific heat
capacity s is at temperature T2. Let T1 be the
temperature of the surroundings. If the
temperature falls by a small amount dT2 in time
dt, then the amount of heat lost is
dQ = ms dT2
Rate of loss of heat is given by

$$\frac{dQ}{dt} = ms\,\frac{dT_2}{dt} \qquad (11.16)$$

From Eqs. (11.15) and (11.16) we have

$$-ms\,\frac{dT_2}{dt} = k\,(T_2 - T_1)$$

$$\frac{dT_2}{T_2 - T_1} = -\frac{k}{ms}\,dt = -K\,dt \qquad (11.17)$$

where K = k/ms
On integrating,

$$\log_e (T_2 - T_1) = -Kt + c \qquad (11.18)$$

$$\text{or}\quad T_2 = T_1 + C' e^{-Kt}; \quad \text{where } C' = e^{c} \qquad (11.19)$$

Equation (11.19) enables you to calculate the
time of cooling of a body through a particular
range of temperature.
For small temperature differences, the rate
of cooling, due to conduction, convection, and
radiation combined, is proportional to the
difference in temperature. It is a valid
approximation in the transfer of heat from a
radiator to a room, the loss of heat through the
wall of a room, or the cooling of a cup of tea on
the table.

Fig. 11.19 Verification of Newton's Law of cooling.

Newton's law of cooling can be verified with
the help of the experimental set-up shown in
Fig. 11.19(a). The set-up consists of a double
walled vessel (V) containing water in between
the two walls. A copper calorimeter (C)
containing hot water is placed inside the double
walled vessel. Two thermometers through the
corks are used to note the temperatures T2 of
water in the calorimeter and T1 of the water in
between the double walls respectively.
Temperature of hot water in the calorimeter is
noted after equal intervals of time. A graph is
plotted between log e (T2 − T1) and time (t). The
nature of the graph is observed to be a straight
line having a negative slope as shown in Fig.
11.19(b). This is in support of Eq. (11.18).
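The straight-line form of Eq. (11.18) suggests a simple way to estimate K from cooling readings: fit a line to log e (T2 − T1) against t and take the negative of the slope. The Python sketch below does this with an ordinary least-squares fit. The readings are hypothetical, generated from Eq. (11.19) with assumed values T1 = 25 °C, C′ = 40 °C and K = 0.05 per minute, so the fit should simply return those values; real readings would scatter about the line.

```python
import math


def fit_cooling_constant(times, temps, t_surroundings):
    """Least-squares line through ln(T2 - T1) versus t; returns (K, C')."""
    ys = [math.log(temp - t_surroundings) for temp in temps]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx                      # equals -K in Eq. (11.18)
    intercept = mean_y - slope * mean_t    # equals c in Eq. (11.18)
    return -slope, math.exp(intercept)


# Hypothetical readings (assumed values, not measured data).
T1, C_TRUE, K_TRUE = 25.0, 40.0, 0.05
times = [float(t) for t in range(0, 31, 5)]                 # minutes
temps = [T1 + C_TRUE * math.exp(-K_TRUE * t) for t in times]  # °C

K, C = fit_cooling_constant(times, temps, T1)
print(f"K = {K:.4f} per minute, C' = {C:.2f} °C")
```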

Example 11.8 A pan filled with hot food
cools from 94 °C to 86 °C in 2 minutes when
the room temperature is at 20 °C. How long
will it take to cool from 71 °C to 69 °C?

Answer The average temperature of 94 °C and
86 °C is 90 °C, which is 70 °C above the room
temperature. Under these conditions the pan
cools 8 °C in 2 minutes.
Using Eq. (11.17), we have

$$\frac{\text{Change in temperature}}{\text{Time}} = K\,\Delta T$$

$$\frac{8\ ^\circ\mathrm{C}}{2\ \mathrm{min}} = K\,(70\ ^\circ\mathrm{C})$$

The average of 69 °C and 71 °C is 70 °C, which
is 50 °C above room temperature. K is the same
for this situation as for the original.

$$\frac{2\ ^\circ\mathrm{C}}{\text{Time}} = K\,(50\ ^\circ\mathrm{C})$$

When we divide above two equations, we
have

$$\frac{8\ ^\circ\mathrm{C} / 2\ \mathrm{min}}{2\ ^\circ\mathrm{C} / \text{Time}} = \frac{K\,(70\ ^\circ\mathrm{C})}{K\,(50\ ^\circ\mathrm{C})}$$

Time = 0.7 min
= 42 s
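The averaging used in Example 11.8 is an approximation to Eq. (11.17). As a rough check, the Python sketch below repeats the calculation and compares it with the time obtained from the integrated form, Eq. (11.19). The function names are illustrative, and the comparison is an added cross-check rather than part of the worked answer.

```python
import math


def cooling_time_average_method(t_start, t_end, t_room, k):
    """Approximate time from (temperature change)/time = K * (mean excess temperature)."""
    mean_excess = 0.5 * (t_start + t_end) - t_room
    return (t_start - t_end) / (k * mean_excess)


def cooling_time_exact(t_start, t_end, t_room, k):
    """Time from Eq. (11.19): t = ln[(T_start - T1) / (T_end - T1)] / K."""
    return math.log((t_start - t_room) / (t_end - t_room)) / k


T_ROOM = 20.0  # °C

# K from the first observation (94 °C to 86 °C in 2 minutes), by each method.
k_avg = (94.0 - 86.0) / (2.0 * (90.0 - T_ROOM))               # per minute
k_exact = math.log((94.0 - T_ROOM) / (86.0 - T_ROOM)) / 2.0   # per minute

t_avg = cooling_time_average_method(71.0, 69.0, T_ROOM, k_avg)
t_exact = cooling_time_exact(71.0, 69.0, T_ROOM, k_exact)
print(f"average method: {t_avg * 60:.0f} s, integrated form: {t_exact * 60:.0f} s")
```

For temperature intervals this small the two estimates agree to within a fraction of a second, which is why the averaging shortcut is acceptable here.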

SUMMARY

1. Heat is a form of energy that flows between a body and its surrounding medium by
   virtue of temperature difference between them. The degree of hotness of the body is
   quantitatively represented by temperature.

2. A temperature-measuring device (thermometer) makes use of some measurable property
   (called thermometric property) that changes with temperature. Different thermometers
   lead to different temperature scales. To construct a temperature scale, two fixed points
   are chosen and assigned some arbitrary values of temperature. The two numbers fix
   the origin of the scale and the size of its unit.

3. The Celsius temperature (tC) and the Fahrenheit temperature (tF) are related by

   tF = (9/5) tC + 32

4. The ideal gas equation connecting pressure (P), volume (V) and absolute temperature
   (T) is :

   PV = μRT

   where μ is the number of moles and R is the universal gas constant.

5. In the absolute temperature scale, the zero of the scale is the absolute zero of temperature,
   the temperature where every substance in nature has the least possible molecular
   activity. The Kelvin absolute temperature scale (T) has the same unit size as the Celsius
   scale (TC), but differs in the origin :

   TC = T − 273.15

6. The coefficient of linear expansion (αl) and volume expansion (αv) are defined by the
   relations :

   $$\frac{\Delta l}{l} = \alpha_l\,\Delta T$$

   $$\frac{\Delta V}{V} = \alpha_v\,\Delta T$$

   where Δl and ΔV denote the change in length l and volume V for a change of temperature
   ΔT. The relation between them is :

   αv = 3 αl

7. The specific heat capacity of a substance is defined by

   $$s = \frac{1}{m}\,\frac{\Delta Q}{\Delta T}$$

   where m is the mass of the substance and ΔQ is the heat required to change its
   temperature by ΔT. The molar specific heat capacity of a substance is defined by

   $$C = \frac{1}{\mu}\,\frac{\Delta Q}{\Delta T}$$

   where μ is the number of moles of the substance.

8. The latent heat of fusion (Lf) is the heat per unit mass required to change a substance
   from solid into liquid at the same temperature and pressure. The latent heat of
   vaporisation (Lv) is the heat per unit mass required to change a substance from liquid
   to the vapour state without change in the temperature and pressure.

9. The three modes of heat transfer are conduction, convection and radiation.

10. In conduction, heat is transferred between neighbouring parts of a body through
    molecular collisions, without any flow of matter. For a bar of length L and uniform
    cross section A with its ends maintained at temperatures TC and TD, the rate of flow of
    heat H is :

    $$H = KA\,\frac{T_C - T_D}{L}$$

    where K is the thermal conductivity of the material of the bar.

11. Newton's Law of Cooling says that the rate of cooling of a body is proportional to the
    excess temperature of the body over the surroundings :

    $$\frac{dQ}{dt} = -k\,(T_2 - T_1)$$

    where T1 is the temperature of the surrounding medium and T2 is the temperature of
    the body.
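The scale relations in items 3 and 5 of the summary can be combined into small conversion helpers. The Python sketch below is illustrative only; the function names are assumptions, and the triple point of water (273.16 K) is used as the sample input.

```python
def kelvin_to_celsius(t_kelvin):
    """Celsius temperature from the Kelvin temperature: tC = T - 273.15."""
    return t_kelvin - 273.15


def celsius_to_fahrenheit(t_celsius):
    """Fahrenheit temperature from the Celsius temperature: tF = (9/5) tC + 32."""
    return 9.0 / 5.0 * t_celsius + 32.0


# Sample input: the triple point of water, 273.16 K.
t_c = kelvin_to_celsius(273.16)
t_f = celsius_to_fahrenheit(t_c)
print(f"273.16 K = {t_c:.2f} °C = {t_f:.3f} °F")
```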

POINTS TO PONDER

1. The relation connecting Kelvin temperature (T) and the Celsius temperature tc
   T = tc + 273.15
   and the assignment T = 273.16 K for the triple point of water are exact relations (by
   choice). With this choice, the Celsius temperature of the melting point of water and
   boiling point of water (both at 1 atm pressure) are very close to, but not exactly equal
   to 0 °C and 100 °C respectively. In the original Celsius scale, these latter fixed points
   were exactly at 0 °C and 100 °C (by choice), but now the triple point of water is the
   preferred choice for the fixed point, because it has a unique temperature.

2. A liquid in equilibrium with vapour has the same pressure and temperature throughout
   the system; the two phases in equilibrium differ in their molar volume (i.e. density).
   This is true for a system with any number of phases in equilibrium.

3. Heat transfer always involves temperature difference between two systems or two parts
   of the same system. Any energy transfer that does not involve temperature difference
   in some way is not heat.

4. Convection involves flow of matter within a fluid due to unequal temperatures of its
   parts. A hot bar placed under a running tap loses heat by conduction between the
   surface of the bar and water and not by convection within water.

EXERCISES

11.1 The triple points of neon and carbon dioxide are 24.57 K and 216.55 K respectively.
Express these temperatures on the Celsius and Fahrenheit scales.

11.2 Two absolute scales A and B have triple points of water defined to be 200 A and
350 B. What is the relation between TA and TB ?

11.3 The electrical resistance in ohms of a certain thermometer varies with temperature
according to the approximate law :
R = Ro [1 + α (T − To)]
The resistance is 101.6 Ω at the triple-point of water 273.16 K, and 165.5 Ω at the
normal melting point of lead (600.5 K). What is the temperature when the resistance
is 123.4 Ω ?

11.4 Answer the following :
(a) The triple-point of water is a standard fixed point in modern thermometry.
    Why ? What is wrong in taking the melting point of ice and the boiling point
    of water as standard fixed points (as was originally done in the Celsius scale) ?
(b) There were two fixed points in the original Celsius scale as mentioned above
    which were assigned the number 0 °C and 100 °C respectively. On the absolute
    scale, one of the fixed points is the triple-point of water, which on the Kelvin
    absolute scale is assigned the number 273.16 K. What is the other fixed point
    on this (Kelvin) scale ?
(c) The absolute temperature (Kelvin scale) T is related to the temperature tc on
    the Celsius scale by
    tc = T − 273.15
    Why do we have 273.15 in this relation, and not 273.16 ?
(d) What is the temperature of the triple-point of water on an absolute scale
    whose unit interval size is equal to that of the Fahrenheit scale ?

THERMAL PROPERTIES OF MATTER

295

11.5 Two ideal gas thermometers A and B use oxygen and hydrogen respectively. The
following observations are made :

Temperature                        Pressure thermometer A    Pressure thermometer B
Triple-point of water              1.250 × 10⁵ Pa            0.200 × 10⁵ Pa
Normal melting point of sulphur    1.797 × 10⁵ Pa            0.287 × 10⁵ Pa

(a) What is the absolute temperature of normal melting point of sulphur as read
by thermometers A and B ?

(b) What do you think is the reason behind the slight difference in answers of
thermometers A and B ? (The thermometers are not faulty). What further
procedure is needed in the experiment to reduce the discrepancy between the
two readings ?

11.6 A steel tape 1 m long is correctly calibrated for a temperature of 27.0 °C. The
length of a steel rod measured by this tape is found to be 63.0 cm on a hot day
when the temperature is 45.0 °C. What is the actual length of the steel rod on that
day ? What is the length of the same steel rod on a day when the temperature is
27.0 °C ? Coefficient of linear expansion of steel = 1.20 × 10⁻⁵ K⁻¹.

11.7 A large steel wheel is to be fitted on to a shaft of the same material. At 27 °C, the
outer diameter of the shaft is 8.70 cm and the diameter of the central hole in the
wheel is 8.69 cm. The shaft is cooled using dry ice. At what temperature of the
shaft does the wheel slip on the shaft? Assume coefficient of linear expansion of
the steel to be constant over the required temperature range :
αsteel = 1.20 × 10⁻⁵ K⁻¹.

11.8 A hole is drilled in a copper sheet. The diameter of the hole is 4.24 cm at 27.0 °C.
What is the change in the diameter of the hole when the sheet is heated to 227 °C?
Coefficient of linear expansion of copper = 1.70 × 10⁻⁵ K⁻¹.

11.9 A brass wire 1.8 m long at 27 °C is held taut with little tension between two rigid
supports. If the wire is cooled to a temperature of −39 °C, what is the tension
developed in the wire, if its diameter is 2.0 mm ? Co-efficient of linear expansion
of brass = 2.0 × 10⁻⁵ K⁻¹; Young's modulus of brass = 0.91 × 10¹¹ Pa.

11.10 A brass rod of length 50 cm and diameter 3.0 mm is joined to a steel rod of the
same length and diameter. What is the change in length of the combined rod at
250 °C, if the original lengths are at 40.0 °C? Is there a 'thermal stress' developed
at the junction ? The ends of the rod are free to expand (Co-efficient of linear
expansion of brass = 2.0 × 10⁻⁵ K⁻¹, steel = 1.2 × 10⁻⁵ K⁻¹).

11.11 The coefficient of volume expansion of glycerin is 49 × 10⁻⁵ K⁻¹. What is the fractional
change in its density for a 30 °C rise in temperature ?

11.12 A 10 kW drilling machine is used to drill a bore in a small aluminium block of mass
8.0 kg. How much is the rise in temperature of the block in 2.5 minutes, assuming
50% of power is used up in heating the machine itself or lost to the surroundings?
Specific heat of aluminium = 0.91 J g⁻¹ K⁻¹.

11.13 A copper block of mass 2.5 kg is heated in a furnace to a temperature of 500 °C
and then placed on a large ice block. What is the maximum amount of ice that
can melt? (Specific heat of copper = 0.39 J g⁻¹ K⁻¹; heat of fusion of water
= 335 J g⁻¹).

11.14 In an experiment on the specific heat of a metal, a 0.20 kg block of the metal at
150 °C is dropped in a copper calorimeter (of water equivalent 0.025 kg) containing
150 cm³ of water at 27 °C. The final temperature is 40 °C. Compute the specific
heat of the metal. If heat losses to the surroundings are not negligible, is your
answer greater or smaller than the actual value for specific heat of the metal ?

11.15 Given below are observations on molar specific heats at room temperature of some
common gases.

Gas                Molar specific heat (Cv) (cal mol⁻¹ K⁻¹)
Hydrogen           4.87
Nitrogen           4.97
Oxygen             5.02
Nitric oxide       4.99
Carbon monoxide    5.01
Chlorine           6.17

The measured molar specific heats of these gases are markedly different from
those for monatomic gases. Typically, molar specific heat of a monatomic gas is
2.92 cal/mol K. Explain this difference. What can you infer from the somewhat
larger (than the rest) value for chlorine ?

11.16 Answer the following questions based on the P-T phase diagram of carbon dioxide:

(a) At what temperature and pressure can the solid, liquid and vapour phases of
CO₂ co-exist in equilibrium ?

(b) What is the effect of decrease of pressure on the fusion and boiling point of
CO₂ ?

(c) What are the critical temperature and pressure for CO₂ ? What is their
significance ?

(d) Is CO₂ solid, liquid or gas at (a) −70 °C under 1 atm, (b) −60 °C under 10 atm,
(c) 15 °C under 56 atm ?

11.17 Answer the following questions based on the P-T phase diagram of CO₂:

(a) CO₂ at 1 atm pressure and temperature −60 °C is compressed isothermally.
Does it go through a liquid phase ?

(b) What happens when CO₂ at 4 atm pressure is cooled from room temperature
at constant pressure ?

(c) Describe qualitatively the changes in a given mass of solid CO₂ at 10 atm
pressure and temperature −65 °C as it is heated up to room temperature at
constant pressure.

(d) CO₂ is heated to a temperature 70 °C and compressed isothermally. What
changes in its properties do you expect to observe ?

11.18 A child running a temperature of 101 °F is given an antipyrin (i.e. a medicine that
lowers fever) which causes an increase in the rate of evaporation of sweat from his
body. If the fever is brought down to 98 °F in 20 min, what is the average rate of
extra evaporation caused by the drug? Assume the evaporation mechanism to be
the only way by which heat is lost. The mass of the child is 30 kg. The specific
heat of human body is approximately the same as that of water, and latent heat of
evaporation of water at that temperature is about 580 cal g⁻¹.

11.19 A 'thermacole' icebox is a cheap and efficient method for storing small quantities
of cooked food in summer in particular. A cubical icebox of side 30 cm has a
thickness of 5.0 cm. If 4.0 kg of ice is put in the box, estimate the amount of ice
remaining after 6 h. The outside temperature is 45 °C, and co-efficient of thermal
conductivity of thermacole is 0.01 J s⁻¹ m⁻¹ K⁻¹. [Heat of fusion of water = 335 × 10³
J kg⁻¹]

11.20 A brass boiler has a base area of 0.15 m² and thickness 1.0 cm. It boils water at the
rate of 6.0 kg/min when placed on a gas stove. Estimate the temperature of the part
of the flame in contact with the boiler. Thermal conductivity of brass = 109 J s⁻¹ m⁻¹
K⁻¹ ; Heat of vaporisation of water = 2256 × 10³ J kg⁻¹.

11.21 Explain why :
(a) a body with large reflectivity is a poor emitter
(b) a brass tumbler feels much colder than a wooden tray on a chilly day
(c) an optical pyrometer (for measuring high temperatures) calibrated for an ideal
black body radiation gives too low a value for the temperature of a red hot
iron piece in the open, but gives a correct value for the temperature when the
same piece is in the furnace
(d) the earth without its atmosphere would be inhospitably cold
(e) heating systems based on circulation of steam are more efficient in warming
a building than those based on circulation of hot water

11.22 A body cools from 80 °C to 50 °C in 5 minutes. Calculate the time it takes to cool
from 60 °C to 30 °C. The temperature of the surroundings is 20 °C.

CHAPTER TWELVE


THERMODYNAMICS


12.1 Introduction
12.2 Thermal equilibrium
12.3 Zeroth law of Thermodynamics
12.4 Heat, internal energy and work
12.5 First law of thermodynamics
12.6 Specific heat capacity
12.7 Thermodynamic state variables and equation of state
12.8 Thermodynamic processes
12.9 Heat engines
12.10 Refrigerators and heat pumps
12.11 Second law of thermodynamics
12.12 Reversible and irreversible processes
12.13 Carnot engine
Summary
Points to ponder
Exercises

12.1 INTRODUCTION
In the previous chapter we studied the thermal properties of
matter. In this chapter we shall study the laws that govern
thermal energy. We shall study the processes where work is
converted into heat and vice versa. In winter, when we rub
our palms together, we feel warmer; here work done in rubbing
produces the 'heat'. Conversely, in a steam engine, the 'heat'
of the steam is used to do useful work in moving the pistons,
which in turn rotate the wheels of the train.
In physics, we need to define the notions of heat,
temperature, work, etc. more carefully. Historically, it took a
long time to arrive at the proper concept of 'heat'. Before the
modern picture, heat was regarded as a fine invisible fluid
filling in the pores of a substance. On contact between a hot
body and a cold body, the fluid (called caloric) flowed from
the hotter to the colder body. This is similar to what happens
when a horizontal pipe connects two tanks containing water
up to different heights. The flow continues until the levels of
water in the two tanks are the same. Likewise, in the 'caloric'
picture of heat, heat flows until the 'caloric levels' (i.e., the
temperatures) equalise.
In time, the picture of heat as a fluid was discarded in
favour of the modern concept of heat as a form of energy. An
important experiment in this connection was due to Benjamin
Thompson (also known as Count Rumford) in 1798. He
observed that boring of a brass cannon generated a lot of
heat, indeed enough to boil water. More significantly, the
amount of heat produced depended on the work done (by the
horses employed for turning the drill) but not on the
sharpness of the drill. In the caloric picture, a sharper drill
would scoop out more heat fluid from the pores; but this
was not observed. A most natural explanation of the
observations was that heat was a form of energy and the
experiment demonstrated conversion of energy from one form
to another, from work to heat.

Thermodynamics is the branch of physics that deals with the concepts of heat
and temperature and the inter-conversion of heat and other forms of energy.
Thermodynamics is a macroscopic science. It deals with bulk systems and does
not go into the molecular constitution of matter. In fact, its concepts and
laws were formulated in the nineteenth century before the molecular picture
of matter was firmly established. Thermodynamic description involves
relatively few macroscopic variables of the system, which are suggested by
common sense and can be usually measured directly. A microscopic description
of a gas, for example, would involve specifying the co-ordinates and
velocities of the huge number of molecules constituting the gas. The
description in kinetic theory of gases is not so detailed but it does involve
molecular distribution of velocities. Thermodynamic description of a gas, on
the other hand, avoids the molecular description altogether. Instead, the
state of a gas in thermodynamics is specified by macroscopic variables such
as pressure, volume, temperature, mass and composition that are felt by our
sense perceptions and are measurable*.

* Thermodynamics may also involve other variables that are not so obvious to
our senses e.g. entropy, enthalpy, etc., and they are all macroscopic variables.

The distinction between mechanics and thermodynamics is worth bearing in
mind. In mechanics, our interest is in the motion of particles or bodies
under the action of forces and torques. Thermodynamics is not concerned with
the motion of the system as a whole. It is concerned with the internal
macroscopic state of the body. When a bullet is fired from a gun, what
changes is the mechanical state of the bullet (its kinetic energy, in
particular), not its temperature. When the bullet pierces a piece of wood and
stops, the kinetic energy of the bullet gets converted into heat, changing
the temperature of the bullet and the surrounding layers of wood.
Temperature is related to the energy of the internal (disordered) motion of
the bullet, not to the motion of the bullet as a whole.

12.2 THERMAL EQUILIBRIUM
Equilibrium in mechanics means that the net external force and torque on a
system are zero. The term 'equilibrium' in thermodynamics appears in a
different context : we say the state of a system is an equilibrium state if
the macroscopic variables that characterise the system do not change in
time. For example, a gas inside a closed rigid container, completely
insulated from its surroundings, with fixed values of pressure, volume,
temperature, mass and composition that do not change with time, is in a
state of thermodynamic equilibrium.
In general, whether or not a system is in a state of equilibrium depends on
the surroundings and the nature of the wall that separates the system from
the surroundings. Consider two gases A and B occupying two different
containers. We know experimentally that pressure and volume of a given mass
of gas can be chosen to be its two independent variables. Let the pressure
and volume of the gases be (PA, VA) and (PB, VB) respectively. Suppose first
that the two systems are put in proximity but are separated by an adiabatic
wall, that is, an insulating wall (it can be movable) that does not allow
flow of energy (heat) from one to another. The systems are insulated from
the rest of the surroundings also by similar adiabatic walls. The situation
is shown schematically in Fig. 12.1 (a). In this case, it is found that any
possible pair of values (PA, VA) will be in equilibrium with any possible
pair of values (PB, VB). Next, suppose that the adiabatic wall is replaced
by a diathermic wall, that is, a conducting wall that allows energy flow
(heat) from one to another. It is then found that the macroscopic variables
of the systems A and B change spontaneously until both the systems attain
equilibrium states. After that there is no change in their states. The
situation is shown in Fig. 12.1(b). The pressure and volume variables of the
two gases change to (PB′, VB′) and (PA′, VA′) such that the new states of A
and B are in equilibrium with each other**. There is no more energy flow
from one to another. We then say that the system A is in thermal equilibrium
with the system B.

** Both the variables need not change. It depends on the constraints. For
instance, if the gases are in containers of fixed volume, only the pressures
of the gases would change to achieve thermal equilibrium.

What characterises the situation of thermal equilibrium between two systems ?
You can guess the answer from your experience. In thermal equilibrium, the
temperatures of the two systems

are equal. We shall see how one arrives at the concept of temperature in
thermodynamics. The Zeroth law of thermodynamics provides the clue.

12.3 ZEROTH LAW OF THERMODYNAMICS
Imagine two systems A and B, separated by an adiabatic wall, while each is
in contact with a third system C, via a conducting wall [Fig. 12.2(a)]. The
states of the systems (i.e., their macroscopic variables) will change until
both A and B come to thermal equilibrium with C. After this is achieved,
suppose that the adiabatic wall between A and B is replaced by a conducting
wall and C is insulated from A and B by an adiabatic wall [Fig. 12.2(b)]. It
is found that the states of A and B change no further i.e. they are found to
be in thermal equilibrium with each other. This observation forms the basis
of the Zeroth Law of Thermodynamics, which states that 'two systems in
thermal equilibrium with a third system separately are in thermal
equilibrium with each other'. R.H. Fowler formulated this law in 1931, long
after the first and second Laws of thermodynamics were stated and so
numbered.
The Zeroth Law clearly suggests that when two systems A and B are in thermal
equilibrium, there must be a physical quantity that has the same value for
both. This thermodynamic variable whose value is equal for two systems in
thermal equilibrium is called temperature (T). Thus, if A and B are
separately in equilibrium with C, TA = TC and TB = TC. This implies that
TA = TB i.e. the systems A and B are also in thermal equilibrium.
We have arrived at the concept of temperature formally via the Zeroth Law.
The next question is : how to assign numerical values to temperatures of
different bodies ? In other words, how do we construct a scale of
temperature ? Thermometry deals with this basic question to which we turn in
the next section.

Fig. 12.1 (a) Systems A and B (two gases) separated by an adiabatic wall, an
insulating wall that does not allow flow of heat. (b) The same systems A and
B separated by a diathermic wall, a conducting wall that allows heat to flow
from one to another. In this case, thermal equilibrium is attained in due
course.

Fig. 12.2 (a) Systems A and B are separated by an adiabatic wall, while each
is in contact with a third system C via a conducting wall. (b) The adiabatic
wall between A and B is replaced by a conducting wall, while C is insulated
from A and B by an adiabatic wall.

12.4 HEAT, INTERNAL ENERGY AND WORK
The Zeroth Law of Thermodynamics led us to the concept of temperature that
agrees with our commonsense notion. Temperature is a marker

of the 'hotness' of a body. It determines the direction of flow of heat when
two bodies are placed in thermal contact. Heat flows from the body at a
higher temperature to the one at lower temperature. The flow stops when the
temperatures equalise; the two bodies are then in thermal equilibrium. We
saw in some detail how to construct temperature scales to assign
temperatures to different bodies. We now describe the concepts of heat and
other relevant quantities like internal energy and work.
The concept of internal energy of a system is not difficult to understand.
We know that every bulk system consists of a large number of molecules.
Internal energy is simply the sum of the kinetic energies and potential
energies of these molecules. We remarked earlier that in thermodynamics, the
kinetic energy of the system, as a whole, is not relevant. Internal energy
is thus the sum of molecular kinetic and potential energies in the frame of
reference relative to which the centre of mass of the system is at rest.
Thus, it includes only the (disordered) energy associated with the random
motion of molecules of the system. We denote the internal energy of a system
by U.
Though we have invoked the molecular picture to understand the meaning of
internal energy, as far as thermodynamics is concerned, U is simply a
macroscopic variable of the system. The important thing about internal
energy is that it depends only on the state of the system, not on how that
state was achieved. Internal energy U of a system is an example of a
thermodynamic 'state variable' : its value depends only on the given state
of the system, not on history i.e. not on the 'path' taken to arrive at that
state. Thus, the internal energy of a given mass of gas depends on its state
described by specific values of pressure, volume and temperature. It does
not depend on how this state of the gas came about. Pressure, volume,
temperature, and internal energy are thermodynamic state variables of the
system (gas) (see section 12.7). If we neglect the small intermolecular
forces in a gas, the internal energy of a gas is just the sum of kinetic
energies associated with various random motions of its molecules. We will
see in the next chapter that in a gas this motion is not only translational
(i.e. motion from one point to another in the volume of the container); it
also includes rotational and vibrational motion of the molecules (Fig. 12.3).
What are the ways of changing internal energy of a system ? Consider again,
for simplicity, the system to be a certain mass of gas contained in a
cylinder with a movable piston as shown in Fig. 12.4. Experience shows there
are two ways of changing the state of the gas (and hence its internal
energy). One way is to put the cylinder in contact with a body at a higher
temperature than that of the gas. The temperature difference will cause a
flow of energy (heat) from the hotter body to the gas, thus increasing the
internal energy of the gas. The other way is to push the piston down i.e. to
do work on the system, which again results in increasing the internal energy
of the gas. Of course, both these things could happen in the reverse
direction. With surroundings at a lower temperature, heat would flow from
the gas to the surroundings. Likewise, the gas could push the piston up and
do work on the surroundings. In short, heat and work are two different modes
of altering the state of a thermodynamic system and changing its internal
energy.
The notion of heat should be carefully distinguished from the notion of
internal energy. Heat is certainly energy, but it is the energy in transit.
This is not just a play of words. The distinction is of basic significance.
The state of a thermodynamic system is characterised by its internal energy,
not heat. A statement like 'a gas in a given state has a certain amount of
heat' is as meaningless as the statement that 'a gas in a given state has a
certain amount of work'. In contrast, 'a gas in a given state has a certain
amount of internal energy' is a perfectly meaningful statement. Similarly,
the statements 'a certain amount of heat is supplied to the system' or 'a
certain amount of work was done by the system' are perfectly meaningful.
To summarise, heat and work in thermodynamics are not state variables. They
are modes of energy transfer to a system resulting in change in its internal
energy, which, as already mentioned, is a state variable.
In ordinary language, we often confuse heat with internal energy. The
distinction between them is sometimes ignored in elementary

physics books. For proper understanding of thermodynamics, however, the
distinction is crucial.

Fig. 12.3 (a) Internal energy U of a gas is the sum of the kinetic and
potential energies of its molecules when the box is at rest. Kinetic energy
due to various types of motion (translational, rotational, vibrational) is
to be included in U. (b) If the same box is moving as a whole with some
velocity, the kinetic energy of the box is not to be included in U.

Fig. 12.4 Heat and work are two distinct modes of energy transfer to a
system that results in change in its internal energy. (a) Heat is energy
transfer due to temperature difference between the system and the
surroundings. (b) Work is energy transfer brought about by means (e.g.
moving the piston by raising or lowering some weight connected to it) that
do not involve such a temperature difference.

12.5 FIRST LAW OF THERMODYNAMICS
We have seen that the internal energy U of a system can change through two
modes of energy transfer : heat and work. Let
ΔQ = Heat supplied to the system by the surroundings
ΔW = Work done by the system on the surroundings
ΔU = Change in internal energy of the system
The general principle of conservation of energy then implies that
$\Delta Q = \Delta U + \Delta W$    (12.1)
i.e. the energy (ΔQ) supplied to the system goes in partly to increase the
internal energy of the system (ΔU) and the rest in work on the environment
(ΔW). Equation (12.1) is known as the First Law of Thermodynamics. It is
simply the general law of conservation of energy applied to any system in
which the energy transfer from or to the surroundings is taken into account.
Let us put Eq. (12.1) in the alternative form
$\Delta Q - \Delta W = \Delta U$    (12.2)
Now, the system may go from an initial state to the final state in a number
of ways. For example, to change the state of a gas from (P1, V1) to
(P2, V2), we can first change the volume of the gas from V1 to V2, keeping
its pressure constant i.e. we can first go to the state (P1, V2) and then
change the pressure of the gas from P1 to P2, keeping volume constant, to
take the gas to (P2, V2). Alternatively, we can first keep the volume
constant and then keep the pressure constant. Since U is a state variable,
ΔU depends only on the initial and final states and not on the path taken by
the gas to go from one to the other. However, ΔQ and ΔW will, in general,
depend on the path taken to go from the initial to final states. From the
First Law of Thermodynamics, Eq. (12.2), it is clear that the combination
ΔQ − ΔW is, however, path independent. This shows that if a system is taken
through a process in which ΔU = 0 (for example, isothermal expansion of an
ideal gas, see section 12.8),
$\Delta Q = \Delta W$
i.e., heat supplied to the system is used up entirely by the system in doing
work on the environment.
If the system is a gas in a cylinder with a movable piston, the gas in
moving the piston

does work. Since force is pressure times area, and area times displacement
is volume, work done by the system against a constant pressure P is
$\Delta W = P\,\Delta V$
where ΔV is the change in volume of the gas. Thus, for this case, Eq. (12.1)
gives
$\Delta Q = \Delta U + P\,\Delta V$    (12.3)
As an application of Eq. (12.3), consider the change in internal energy for
1 g of water when we go from its liquid to vapour phase. The measured latent
heat of water is 2256 J/g. i.e., for 1 g of water ΔQ = 2256 J. At
atmospheric pressure, 1 g of water has a volume 1 cm³ in liquid phase and
1671 cm³ in vapour phase. Therefore,
$\Delta W = P\,(V_g - V_l) = 1.013 \times 10^{5} \times (1670) \times 10^{-6} = 169.2\ \mathrm{J}$
Equation (12.3) then gives
$\Delta U = 2256 - 169.2 = 2086.8\ \mathrm{J}$
We see that most of the heat goes to increase the internal energy of water
in transition from the liquid to the vapour phase.

12.6 SPECIFIC HEAT CAPACITY
Suppose an amount of heat ΔQ supplied to a substance changes its temperature
from T to T + ΔT. We define heat capacity of a substance (see Chapter 11) to
be
$S = \dfrac{\Delta Q}{\Delta T}$    (12.4)
We expect ΔQ and, therefore, heat capacity S to be proportional to the mass
of the substance. Further, it could also depend on the temperature, i.e., a
different amount of heat may be needed for a unit rise in temperature at
different temperatures. To define a constant characteristic of the substance
and independent of its amount, we divide S by the mass of the substance m in
kg :
$s = \dfrac{S}{m} = \dfrac{1}{m}\,\dfrac{\Delta Q}{\Delta T}$    (12.5)
s is known as the specific heat capacity of the substance. It depends on the
nature of the substance and its temperature. The unit of specific heat
capacity is J kg⁻¹ K⁻¹.
If the amount of substance is specified in terms of moles μ (instead of mass
m in kg), we can define heat capacity per mole of the substance by
$C = \dfrac{S}{\mu} = \dfrac{1}{\mu}\,\dfrac{\Delta Q}{\Delta T}$    (12.6)
C is known as molar specific heat capacity of the substance. Like s, C is
independent of the amount of substance. C depends on the nature of the
substance, its temperature and the conditions under which heat is supplied.
The unit of C is J mol⁻¹ K⁻¹. As we shall see later (in connection with
specific heat capacity of gases), additional conditions may be needed to
define C or s. The idea in defining C is that simple predictions can be made
in regard to molar specific heat capacities.
Table 12.1 lists measured specific and molar heat capacities of solids at
atmospheric pressure and ordinary room temperature.
We will see in Chapter 13 that predictions of specific heats of gases
generally agree with experiment. We can use the same law of equipartition of
energy that we use there to predict molar specific heat capacities of
solids. Consider a solid of N atoms, each vibrating about its mean position.
An oscillator in one dimension has average energy of
$2 \times \tfrac{1}{2} k_B T = k_B T$. In three dimensions, the average
energy is $3 k_B T$. For a mole of a solid, the total energy is
$U = 3 k_B T \times N_A = 3RT$
Now, at constant pressure, ΔQ = ΔU + PΔV ≅ ΔU, since for a solid ΔV is
negligible. Therefore,
$C = \dfrac{\Delta Q}{\Delta T} = \dfrac{\Delta U}{\Delta T} = 3R$    (12.7)

Table 12.1 Specific and molar heat capacities of some solids at room
temperature and atmospheric pressure

As Table 12.1 shows, the experimentally measured values generally agree with
the predicted value 3R at ordinary temperatures. (Carbon is an exception.)
The agreement is known to break down at low temperatures.
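The prediction C = 3R can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the specific heats quoted for aluminium and copper in the Chapter 11 exercises (0.91 and 0.39 J per gram per kelvin) and assumes standard molar masses (26.98 and 63.55 g per mole) and R = 8.314 J per mole per kelvin, none of which are stated in this section. The function name `molar_heat_capacity` is hypothetical.

```python
# Compare molar heat capacities of two solids with the equipartition
# prediction C = 3R of Eq. (12.7).

R = 8.314  # universal gas constant, J mol^-1 K^-1 (assumed standard value)


def molar_heat_capacity(specific_heat_j_per_g_k, molar_mass_g_per_mol):
    """Convert a specific heat (per gram) into a molar heat capacity."""
    return specific_heat_j_per_g_k * molar_mass_g_per_mol


# name -> (specific heat in J g^-1 K^-1, molar mass in g mol^-1)
# Specific heats are the values quoted in the Chapter 11 exercises;
# molar masses are assumed standard values.
solids = {
    "aluminium": (0.91, 26.98),
    "copper": (0.39, 63.55),
}

predicted = 3 * R  # equipartition prediction for one mole of a solid

for name, (s, molar_mass) in solids.items():
    c = molar_heat_capacity(s, molar_mass)
    deviation = 100 * (c - predicted) / predicted
    print(f"{name}: C = {c:.1f} J/mol/K, 3R = {predicted:.1f} J/mol/K, "
          f"deviation = {deviation:+.1f}%")
```

Both solids come out within about two per cent of 3R, which is the kind of agreement the text describes for ordinary temperatures.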

Specific heat capacity of water
The old unit of heat was calorie. One calorie was earlier defined to be the
amount of heat required to raise the temperature of 1 g of water by 1 °C.
With more precise measurements, it was found that the specific heat of water
varies slightly with temperature. Figure 12.5 shows this variation in the
temperature range 0 to 100 °C.

Fig. 12.5 Variation of specific heat capacity of water with temperature.

For a precise definition of calorie, it was, therefore, necessary to specify
the unit temperature interval. One calorie is defined to be the amount of
heat required to raise the temperature of 1 g of water from 14.5 °C to
15.5 °C. Since heat is just a form of energy, it is preferable to use the
unit joule, J. In SI units, the specific heat capacity of water is
4186 J kg⁻¹ K⁻¹ i.e. 4.186 J g⁻¹ K⁻¹. The so called mechanical equivalent of
heat, defined as the amount of work needed to produce 1 cal of heat, is in
fact just a conversion factor between two different units of energy :
calorie to joule. Since in SI units we use the unit joule for heat, work or
any other form of energy, the term mechanical equivalent is now superfluous
and need not be used.

As already remarked, the specific heat capacity depends on the process or
the conditions under which heat transfer takes place. For gases, for
example, we can define two specific heats : specific heat capacity at
constant volume and specific heat capacity at constant pressure. For an
ideal gas, we have a simple relation.
$C_p - C_v = R$    (12.8)
where Cp and Cv are molar specific heat capacities of an ideal gas at
constant pressure and volume respectively and R is the universal gas
constant. To prove the relation, we begin with Eq. (12.3) for 1 mole of the
gas :
$\Delta Q = \Delta U + P\,\Delta V$
If ΔQ is absorbed at constant volume, ΔV = 0
$C_v = \left(\dfrac{\Delta Q}{\Delta T}\right)_v = \left(\dfrac{\Delta U}{\Delta T}\right)_v = \dfrac{\Delta U}{\Delta T}$    (12.9)
where the subscript v is dropped in the last step, since U of an ideal gas
depends only on temperature. (The subscript denotes the quantity kept
fixed.) If, on the other hand, ΔQ is absorbed at constant pressure,
$C_p = \left(\dfrac{\Delta Q}{\Delta T}\right)_p = \left(\dfrac{\Delta U}{\Delta T}\right)_p + P\left(\dfrac{\Delta V}{\Delta T}\right)_p$    (12.10)
The subscript p can be dropped from the first term since U of an ideal gas
depends only on T. Now, for a mole of an ideal gas
$PV = RT$
which gives
$P\left(\dfrac{\Delta V}{\Delta T}\right)_p = R$    (12.11)
Equations (12.9) to (12.11) give the desired relation, Eq. (12.8).
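The steps of this derivation can be mirrored numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it takes one mole of an ideal gas with U = Cv T, uses the hydrogen value Cv = 4.87 cal per mole per kelvin from Exercise 11.15, assumes R = 1.987 cal per mole per kelvin, and evaluates ΔQ = ΔU + PΔV for a small temperature rise at constant volume and at constant pressure. All function names are hypothetical.

```python
# Numerical check of Cp - Cv = R for one mole of an ideal gas,
# following Eqs. (12.9) to (12.11).

R = 1.987   # cal mol^-1 K^-1 (assumed standard value)
CV = 4.87   # cal mol^-1 K^-1, hydrogen (value from Exercise 11.15)


def internal_energy(T):
    """U of one mole of ideal gas; depends on T only (assumes constant Cv)."""
    return CV * T


def volume(T, P):
    """Ideal gas equation of state for one mole: PV = RT."""
    return R * T / P


def heat_absorbed(T, dT, P, constant):
    """Delta Q = Delta U + P Delta V for a small temperature rise dT."""
    dU = internal_energy(T + dT) - internal_energy(T)
    if constant == "volume":
        dV = 0.0                                  # no work at constant volume
    elif constant == "pressure":
        dV = volume(T + dT, P) - volume(T, P)     # expansion at fixed P
    else:
        raise ValueError("constant must be 'volume' or 'pressure'")
    return dU + P * dV


# Arbitrary state; P is in units chosen so that P*V comes out in calories.
T, dT, P = 300.0, 0.01, 1.0
cv = heat_absorbed(T, dT, P, "volume") / dT
cp = heat_absorbed(T, dT, P, "pressure") / dT
print(f"Cv = {cv:.3f}, Cp = {cp:.3f}, Cp - Cv = {cp - cv:.3f}, R = {R}")
```

The difference Cp − Cv does not depend on the chosen T or P, because P ΔV = R ΔT for an ideal gas at constant pressure, which is the content of Eq. (12.11).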
12.7 THERMODYNAMIC STATE VARIABLES AND EQUATION OF STATE
Every equilibrium state of a thermodynamic system is completely described by
specific values of some macroscopic variables, also called state variables.
For example, an equilibrium state of a gas is completely specified by the
values of pressure, volume, temperature, and mass (and composition if there
is a mixture of gases). A thermodynamic system is not always in equilibrium.
For example, a gas allowed to expand freely against vacuum is not in an
equilibrium state [Fig. 12.6(a)]. During the rapid expansion, pressure of
the gas may

THERMODYNAMICS

temperature do not. To decide which variable is


extensive and which intensive, think of a
relevant system in equilibrium, and imagine that
it is divided into two equal parts. The variables
that remain unchanged for each part are
intensive. The variables whose values get halved
in each part are extensive. It is easily seen, for
example, that internal energy U, volume V, total
mass M are extensive variables. Pressure P,
temperature T, and density are intensive
variables. It is a good practice to check the
consistency of thermodynamic equations using
this classification of variables. For example, in
the equation
Q = U + P V

he
d

not be uniform throughout. Similarly, a mixture


of gases undergoing an explosive chemical
reaction (e.g. a mixture of petrol vapour and
air when ignited by a spark) is not an
equilibrium state; again its temperature and
pressure are not uniform [Fig. 12.6(b)].
Eventually, the gas attains a uniform
temperature and pressure and comes to
thermal and mechanical equilibrium with its
surroundings.

305

is

quantities on both sides are extensive*. (The


product of an intensive variable like P and an
extensive quantity V is extensive.)

bl

12.8 THERMODYNAMIC PROCESSES

no N
C
tt E
o R
be T
re
pu

12.8.1 Quasi-static process


Consider a gas in thermal and mechanical
equilibrium with its surroundings. The pressure
of the gas in that case equals the external
pressure and its temperature is the same as
that of its surroundings. Suppose that the
external pressure is suddenly reduced (say by
lifting the weight on the movable piston in the
container). The piston will accelerate outward.
During the process, the gas passes through
states that are not equilibrium states. The nonequilibrium states do not have well-defined
pressure and temperature. In the same way, if
a finite temperature difference exists between
the gas and its surroundings, there will be a
rapid exchange of heat during which the gas
will pass through non-equilibrium states. In
due course, the gas will settle to an equilibrium
state with well-defined temperature and
pressure equal to those of the surroundings. The
free expansion of a gas in vacuum and a mixture
of gases undergoing an explosive chemical
reaction, mentioned in section 12.7 are also
examples where the system goes through nonequilibrium states.
Non-equilibrium states of a system are difficult
to deal with. It is, therefore, convenient to
imagine an idealised process in which at every
stage the system is in an equilibrium state. Such a

Fig. 12.6 (a) The partition in the box is suddenly


removed leading to free expansion of the
gas. (b) A mixture of gases undergoing an
explosive chemical reaction. In both
situations, the gas is not in equilibrium and
cannot be described by state variables.

In short, thermodynamic state variables


describe equilibrium states of systems. The
various state variables are not necessarily
independent. The connection between the state
variables is called the equation of state. For
example, for an ideal gas, the equation of state
is the ideal gas relation
$$PV = \mu R T$$

For a fixed amount of the gas i.e. given μ, there


are thus, only two independent variables, say P
and V or T and V. The pressure-volume curve
for a fixed temperature is called an isotherm.
Real gases may have more complicated
equations of state.
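
As a rough numerical illustration of the equation of state (a sketch, not part of the original text), the Python snippet below evaluates PV = μRT along an isotherm. The function names and the sample values used in the usage note are illustrative assumptions.

```python
R = 8.314  # universal gas constant in J mol^-1 K^-1 (rounded value)


def ideal_gas_pressure(mu, T, V):
    """Pressure (Pa) of mu moles of an ideal gas at temperature T (K) in volume V (m^3)."""
    return mu * R * T / V


def isotherm(mu, T, volumes):
    """Return (V, P) pairs along the isotherm at temperature T for the given volumes."""
    return [(V, ideal_gas_pressure(mu, T, V)) for V in volumes]
```

For example, `isotherm(1.0, 300.0, [0.01, 0.02, 0.04])` gives three points on the 300 K isotherm of one mole of gas; the product PV is the same at each point, which is what makes only two of P, V and T independent.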
The thermodynamic state variables are of two
kinds: extensive and intensive. Extensive
variables indicate the size of the system.
Intensive variables such as pressure and

* As emphasised earlier, Q is not a state variable. However, ΔQ is clearly proportional to the total mass of
the system and hence is extensive.


A process in which the temperature of the


system is kept fixed throughout is called an
isothermal process. The expansion of a gas in
a metallic cylinder placed in a large reservoir of
fixed temperature is an example of an isothermal
process. (Heat transferred from the reservoir to
the system does not materially affect the
temperature of the reservoir, because of its very
large heat capacity.) In isobaric processes the
pressure is constant while in isochoric
processes the volume is constant. Finally, if
the system is insulated from the surroundings
and no heat flows between the system and the
surroundings, the process is adiabatic. The
definitions of these special processes are
summarised in Table 12.2.

Table 12.2 Some special thermodynamic processes


process is, in principle, infinitely slow, hence the


name quasi-static (meaning nearly static). The
system changes its variables (P, T, V ) so slowly
that it remains in thermal and mechanical
equilibrium with its surroundings throughout.
In a quasi-static process, at every stage, the
difference in the pressure of the system and the
external pressure is infinitesimally small. The
same is true of the temperature difference
between the system and its surroundings. To
take a gas from the state (P, T ) to another state
(P′, T′ ) via a quasi-static process, we change
the external pressure by a very small amount,
allow the system to equalise its pressure with
that of the surroundings and continue the
process infinitely slowly until the system
achieves the pressure P′. Similarly, to change
the temperature, we introduce an infinitesimal
temperature difference between the system and
the surrounding reservoirs and by choosing
reservoirs of progressively different temperatures
T to T′, the system achieves the temperature T′.

We now consider these processes in some detail :


Isothermal process

Fig. 12.7

In a quasi-static process, the temperature


of the surrounding reservoir and the
external pressure differ only infinitesimally
from the temperature and pressure of the
system.

A quasi-static process is obviously a


hypothetical construct. In practice, processes
that are sufficiently slow and do not involve
accelerated motion of the piston, large
temperature gradient, etc. are reasonable
approximations to an ideal quasi-static process.
We shall from now on deal with quasi-static
processes only, except when stated otherwise.

For an isothermal process (T fixed), the ideal gas


equation gives
PV = constant
i.e., pressure of a given mass of gas varies inversely
as its volume. This is nothing but Boyle's Law.
Suppose an ideal gas goes isothermally (at
temperature T ) from its initial state (P1, V1) to
the final state (P2, V 2). At any intermediate stage
with pressure P and volume change from V to
V + ΔV (ΔV small)
$$\Delta W = P\,\Delta V$$

Taking (ΔV → 0) and summing the quantity
ΔW over the entire process,
$$W = \int_{V_1}^{V_2} P\,dV = \mu R T \int_{V_1}^{V_2} \frac{dV}{V} = \mu R T \ln\frac{V_2}{V_1} \qquad (12.12)$$


where in the second step we have made use of


the ideal gas equation PV = μRT and taken the
constants out of the integral. For an ideal gas,
internal energy depends only on temperature.
Thus, there is no change in the internal energy
of an ideal gas in an isothermal process. The
First Law of Thermodynamics then implies that
heat supplied to the gas equals the work done
by the gas : Q = W. Note from Eq. (12.12) that
for V2 > V1, W > 0; and for V2 < V1, W < 0. That
is, in an isothermal expansion, the gas absorbs
heat and does work while in an isothermal
compression, work is done on the gas by the
environment and heat is released.
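
The result in Eq. (12.12) can be checked numerically. The following Python sketch compares the closed-form expression with a direct summation of P ΔV over many small volume steps. The function names, the midpoint summation and the step count are assumptions made for illustration, not part of the original derivation.

```python
import math

R = 8.314  # universal gas constant in J mol^-1 K^-1 (rounded value)


def isothermal_work(mu, T, V1, V2):
    """Work done by mu moles of ideal gas going isothermally from V1 to V2, Eq. (12.12)."""
    return mu * R * T * math.log(V2 / V1)


def isothermal_work_by_summation(mu, T, V1, V2, steps=100_000):
    """Approximate the same work by summing P * dV over small steps (midpoint rule)."""
    dV = (V2 - V1) / steps
    total = 0.0
    for i in range(steps):
        V_mid = V1 + (i + 0.5) * dV   # volume at the middle of this small step
        P = mu * R * T / V_mid        # ideal gas pressure at that volume
        total += P * dV
    return total
```

For an expansion (V2 > V1) both functions return a positive value, and for a compression a negative one, in line with the sign discussion above. Since ΔU = 0 for an ideal gas in an isothermal process, the same number is also the heat Q absorbed by the gas.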

Adiabatic process

In an adiabatic process, the system is insulated
from the surroundings and heat absorbed or
released is zero. From Eq. (12.1), we see that
work done by the gas results in decrease in its
internal energy (and hence its temperature for
an ideal gas). We quote without proof (the result
that you will learn in higher courses) that for
an adiabatic process of an ideal gas
$$P V^{\gamma} = \text{const} \qquad (12.13)$$
where γ is the ratio of specific heats (ordinary
or molar) at constant pressure and at constant
volume.
$$\gamma = \frac{C_p}{C_v}$$

Thus if an ideal gas undergoes a change in
its state adiabatically from (P1, V1) to (P2, V2) :
$$P_1 V_1^{\gamma} = P_2 V_2^{\gamma} \qquad (12.14)$$

Figure 12.8 shows the P-V curves of an ideal
gas for two adiabatic processes connecting two
isotherms.

We can calculate, as before, the work done in
an adiabatic change of an ideal gas from the
state (P1, V1, T1) to the state (P2, V2, T2).
$$W = \int_{V_1}^{V_2} P\,dV = \text{constant} \times \int_{V_1}^{V_2} \frac{dV}{V^{\gamma}} = \text{constant} \times \left[\frac{V^{-\gamma+1}}{1-\gamma}\right]_{V_1}^{V_2} = \frac{\text{constant}}{1-\gamma}\left[\frac{1}{V_2^{\gamma-1}} - \frac{1}{V_1^{\gamma-1}}\right] \qquad (12.15)$$

From Eq. (12.13), the constant is P1V1^γ or P2V2^γ
$$W = \frac{1}{1-\gamma}\left[\frac{P_2 V_2^{\gamma}}{V_2^{\gamma-1}} - \frac{P_1 V_1^{\gamma}}{V_1^{\gamma-1}}\right] = \frac{1}{1-\gamma}\left[P_2 V_2 - P_1 V_1\right] = \frac{\mu R (T_1 - T_2)}{\gamma - 1} \qquad (12.16)$$

As expected, if work is done by the gas in an
adiabatic process (W > 0), from Eq. (12.16),
T2 < T1. On the other hand, if work is done on
the gas (W < 0), we get T2 > T1 i.e., the
temperature of the gas rises.

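
The adiabatic relations above can be tried out numerically. The Python sketch below, under the assumption of an ideal gas with a constant ratio of specific heats γ, computes the final pressure from Eq. (12.14) and the work from the two equivalent forms in Eq. (12.16). All function names are hypothetical, chosen only for this illustration.

```python
R = 8.314  # universal gas constant in J mol^-1 K^-1 (rounded value)


def adiabatic_final_pressure(P1, V1, V2, gamma):
    """Final pressure from P1 * V1**gamma = P2 * V2**gamma, Eq. (12.14)."""
    return P1 * (V1 / V2) ** gamma


def adiabatic_work_from_pv(P1, V1, P2, V2, gamma):
    """Work done by the gas, W = (P1 V1 - P2 V2) / (gamma - 1), from Eq. (12.16)."""
    return (P1 * V1 - P2 * V2) / (gamma - 1.0)


def adiabatic_work_from_temperatures(mu, T1, T2, gamma):
    """Work done by the gas, W = mu R (T1 - T2) / (gamma - 1), from Eq. (12.16)."""
    return mu * R * (T1 - T2) / (gamma - 1.0)
```

Starting from one mole at 300 K and doubling the volume with γ = 1.4 (a value often quoted for diatomic gases, used here as an assumption), the two work expressions agree, the work is positive and the final temperature P2V2/(μR) comes out lower than 300 K.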

Isochoric process

In an isochoric process, V is constant. No work


is done on or by the gas. From Eq. (12.1), the
heat absorbed by the gas goes entirely to change
its internal energy and its temperature. The
change in temperature for a given amount of
heat is determined by the specific heat of the
gas at constant volume.
Isobaric process

In an isobaric process, P is fixed. Work done by


the gas is
$$W = P (V_2 - V_1) = \mu R (T_2 - T_1) \qquad (12.17)$$

Since temperature changes, so does internal


energy. The heat absorbed goes partly to
increase internal energy and partly to do work.
The change in temperature for a given amount
of heat is determined by the specific heat of the
gas at constant pressure.
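
The difference between the isochoric and isobaric cases can be seen in a small calculation. The Python sketch below assumes an ideal gas whose molar specific heat at constant volume Cv is given, and uses the relation Cp − Cv = R for the isobaric case. The function names and the choice Cv = 5R/2 in the usage note are illustrative assumptions.

```python
R = 8.314  # universal gas constant in J mol^-1 K^-1 (rounded value)


def isobaric_work(mu, T1, T2):
    """Work done by the gas at constant pressure, W = mu R (T2 - T1), Eq. (12.17)."""
    return mu * R * (T2 - T1)


def heat_isochoric(mu, Cv, T1, T2):
    """Heat absorbed at constant volume; all of it changes the internal energy."""
    return mu * Cv * (T2 - T1)


def heat_isobaric(mu, Cv, T1, T2):
    """Heat absorbed at constant pressure, using Cp = Cv + R for an ideal gas."""
    Cp = Cv + R
    return mu * Cp * (T2 - T1)
```

With `Cv = 2.5 * R`, heating one mole from 300 K to 350 K needs more heat at constant pressure than at constant volume, and the excess is exactly the work μR(T2 − T1) done by the gas.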
Cyclic process

Fig. 12.8

P-V curves for isothermal and adiabatic


processes of an ideal gas.

In a cyclic process, the system returns to its
initial state. Since internal energy is a state
variable, ΔU = 0 for a cyclic process. From
Eq. (12.1), the total heat absorbed equals the
work done by the system.

12.9 HEAT ENGINES

Heat engine is a device by which a system is
made to undergo a cyclic process that results
in conversion of heat to work.
(1) It consists of a working substance, the
system. For example, a mixture of fuel
vapour and air in a gasoline or diesel engine
or steam in a steam engine are the working
substances.
(2) The working substance goes through a cycle
consisting of several processes. In some of
these processes, it absorbs a total amount
of heat Q1 from an external reservoir at some
high temperature T1.
(3) In some other processes of the cycle, the
working substance releases a total amount
of heat Q2 to an external reservoir at some
lower temperature T2.
(4) The work done (W ) by the system in a cycle
is transferred to the environment via some
arrangement (e.g. the working substance
may be in a cylinder with a moving piston
that transfers mechanical energy to the
wheels of a vehicle via a shaft).
The basic features of a heat engine are
schematically represented in Fig. 12.9.

Fig. 12.9 Schematic representation of a heat engine.
The engine takes heat Q1 from a hot
reservoir at temperature T1, releases heat
Q2 to a cold reservoir at temperature T2
and delivers work W to the surroundings.

The cycle is repeated again and again to get
useful work for some purpose. The discipline of
thermodynamics has its roots in the study of heat
engines. A basic question relates to the efficiency
of a heat engine. The efficiency (η) of a heat
engine is defined by
$$\eta = \frac{W}{Q_1} \qquad (12.18)$$
where Q1 is the heat input i.e., the heat
absorbed by the system in one complete cycle
and W is the work done on the environment in
a cycle. In a cycle, a certain amount of heat (Q2)
may also be rejected to the environment. Then,
according to the First Law of Thermodynamics,
over one complete cycle,
$$W = Q_1 - Q_2 \qquad (12.19)$$
i.e.,
$$\eta = 1 - \frac{Q_2}{Q_1} \qquad (12.20)$$

For Q2 = 0, η = 1, i.e., the engine will have
100% efficiency in converting heat into work.
Note that the First Law of Thermodynamics i.e.,
the energy conservation law does not rule out
such an engine. But experience shows that
such an ideal engine with η = 1 is never possible,
even if we can eliminate various kinds of losses
associated with actual heat engines. It turns
out that there is a fundamental limit on the
efficiency of a heat engine set by an independent
principle of nature, called the Second Law of
Thermodynamics (section 12.11).
The mechanism of conversion of heat into
work varies for different heat engines. Basically,
there are two ways : the system (say a gas or a
mixture of gases) is heated by an external
furnace, as in a steam engine; or it is heated
internally by an exothermic chemical reaction
as in an internal combustion engine. The
various steps involved in a cycle also differ from
one engine to another.
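
The energy bookkeeping of a heat engine over one cycle can be written out as a short calculation. The Python sketch below uses only the First Law for a cycle (W = Q1 − Q2) and the definition of efficiency (η = W/Q1). The function name and the input checks are assumptions added for illustration.

```python
def engine_efficiency(Q1, Q2):
    """Efficiency of a heat engine that absorbs Q1 and rejects Q2 per cycle.

    Uses W = Q1 - Q2 (First Law over one complete cycle) and eta = W / Q1.
    """
    if Q1 <= 0:
        raise ValueError("Q1 (heat absorbed per cycle) must be positive")
    if Q2 < 0 or Q2 > Q1:
        raise ValueError("Q2 (heat rejected per cycle) must lie between 0 and Q1")
    W = Q1 - Q2      # work delivered to the surroundings in one cycle
    return W / Q1
```

For instance, an engine that absorbs 1000 J and rejects 600 J per cycle delivers 400 J of work, so `engine_efficiency(1000.0, 600.0)` is 0.4. Setting Q2 = 0 would give η = 1, which the First Law alone does not forbid.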

12.10 REFRIGERATORS AND HEAT PUMPS

A refrigerator is the reverse of a heat engine.


Here the working substance extracts heat Q2
from the cold reservoir at temperature T2, some
external work W is done on it and heat Q1 is
released to the hot reservoir at temperature T1
(Fig. 12.10).

Fig. 12.10 Schematic representation of a refrigerator


or a heat pump, the reverse of a heat
engine.


Pioneers of Thermodynamics


Lord Kelvin (William Thomson) (1824-1907), born in Belfast, Ireland, is


among the foremost British scientists of the nineteenth century. Thomson
played a key role in the development of the law of conservation of energy
suggested by the work of James Joule (1818-1889), Julius Mayer (1814-1878) and Hermann Helmholtz (1821-1894). He collaborated with Joule
on the so-called Joule-Thomson effect : cooling of a gas when it expands
into vacuum. He introduced the notion of the absolute zero of temperature
and proposed the absolute temperature scale, now called the Kelvin scale
in his honour. From the work of Sadi Carnot (1796-1832), Thomson arrived
at a form of the Second Law of Thermodynamics. Thomson was a versatile
physicist, with notable contributions to electromagnetic theory and
hydrodynamics.


Rudolf Clausius (1822-1888), born in Poland, is generally regarded as


the discoverer of the Second Law of Thermodynamics. Based on the work
of Carnot and Thomson, Clausius arrived at the important notion of entropy
that led him to a fundamental version of the Second Law of
Thermodynamics that states that the entropy of an isolated system can
never decrease. Clausius also worked on the kinetic theory of gases and
obtained the first reliable estimates of molecular size, speed, mean free
path, etc.

A heat pump is the same as a refrigerator.


What term we use depends on the purpose of
the device. If the purpose is to cool a portion of
space, like the inside of a chamber, and the higher
temperature reservoir is the surroundings, we call
the device a refrigerator; if the idea is to pump
heat into a portion of space (the room in a
building when the outside environment is cold),
the device is called a heat pump.
In a refrigerator the working substance
(usually, in gaseous form) goes through the
following steps : (a) sudden expansion of the gas
from high to low pressure which cools it and
converts it into a vapour-liquid mixture, (b)
absorption by the cold fluid of heat from the
region to be cooled converting it into vapour, (c)
heating up of the vapour due to external work
done on the system, and (d) release of heat by
the vapour to the surroundings, bringing it to
the initial state and completing the cycle.
The coefficient of performance (α) of a
refrigerator is given by
$$\alpha = \frac{Q_2}{W} \qquad (12.21)$$

where Q2 is the heat extracted from the cold
reservoir and W is the work done on the
system, the refrigerant. (α for heat pump is
defined as Q1/W.) Note that while η by definition
can never exceed 1, α can be greater than 1.
By energy conservation, the heat released to the
hot reservoir is
Q1 = W + Q2

i.e.,
$$\alpha = \frac{Q_2}{Q_1 - Q_2} \qquad (12.22)$$


In a heat engine, heat cannot be fully


converted to work; likewise a refrigerator cannot
work without some external work done on the
system, i.e., the coefficient of performance in Eq.
(12.21) cannot be infinite.
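
A short numerical sketch may help to see why the coefficient of performance can exceed 1. The Python functions below use Eq. (12.21) for a refrigerator, the definition Q1/W for a heat pump and the energy balance Q1 = W + Q2. The function names are hypothetical.

```python
def refrigerator_cop(Q2, W):
    """Coefficient of performance of a refrigerator, alpha = Q2 / W, Eq. (12.21)."""
    if W <= 0:
        raise ValueError("external work W must be positive")
    return Q2 / W


def heat_pump_cop(Q2, W):
    """Coefficient of performance of a heat pump, Q1 / W with Q1 = W + Q2."""
    if W <= 0:
        raise ValueError("external work W must be positive")
    Q1 = W + Q2      # heat released to the hot reservoir, by energy conservation
    return Q1 / W
```

If 300 J is extracted from the cold reservoir using 100 J of work, the refrigerator value is 3 and the heat pump value is 4. For the same Q2 and W the two always differ by exactly 1, and both grow without limit as W tends to zero, which is the case ruled out in the text.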
12.11 SECOND LAW OF THERMODYNAMICS
The First Law of Thermodynamics is the principle
of conservation of energy. Common experience
shows that there are many conceivable
processes that are perfectly allowed by the First
Law and yet are never observed. For example,
nobody has ever seen a book lying on a table
jumping to a height by itself. But such a thing


bring both the system and surroundings to their


initial states with no other effect anywhere ?
Experience suggests that for most processes in
nature this is not possible. The spontaneous
processes of nature are irreversible. Several
examples can be cited. The base of a vessel on
an oven is hotter than its other parts. When
the vessel is removed, heat is transferred from
the base to the other parts, bringing the vessel
to a uniform temperature (which in due course
cools to the temperature of the surroundings).
The process cannot be reversed; a part of the
vessel will not get cooler spontaneously and
warm up the base. It will violate the Second Law
of Thermodynamics, if it did. The free expansion
of a gas is irreversible. The combustion reaction
of a mixture of petrol and air ignited by a spark
cannot be reversed. Cooking gas leaking from a
gas cylinder in the kitchen diffuses to the
entire room. The diffusion process will not
spontaneously reverse and bring the gas back
to the cylinder. The stirring of a liquid in thermal
contact with a reservoir will convert the work
done into heat, increasing the internal energy
of the reservoir. The process cannot be reversed
exactly; otherwise it would amount to conversion
of heat entirely into work, violating the Second
Law of Thermodynamics. Irreversibility is a rule
rather than an exception in nature.
Irreversibility arises mainly from two causes:
one, many processes (like a free expansion, or
an explosive chemical reaction) take the system
to non-equilibrium states; two, most processes
involve friction, viscosity and other dissipative
effects (e.g., a moving body coming to a stop and
losing its mechanical energy as heat to the floor
and the body; a rotating blade in a liquid coming
to a stop due to viscosity and losing its
mechanical energy with corresponding gain in
the internal energy of the liquid). Since
dissipative effects are present everywhere and
can be minimised but not fully eliminated, most
processes that we deal with are irreversible.
A thermodynamic process (state i → state f )
is reversible if the process can be turned back
such that both the system and the surroundings
return to their original states, with no other
change anywhere else in the universe. From the
preceding discussion, a reversible process is an
idealised notion. A process is reversible only if
it is quasi-static (system in equilibrium with the


would be possible if the principle of conservation


of energy were the only restriction. The table
could cool spontaneously, converting some of its
internal energy into an equal amount of
mechanical energy of the book, which would
then hop to a height with potential energy equal
to the mechanical energy it acquired. But this
never happens. Clearly, some additional basic
principle of nature forbids the above, even
though it satisfies the energy conservation
principle. This principle, which disallows many
phenomena consistent with the First Law of
Thermodynamics is known as the Second Law
of Thermodynamics.
The Second Law of Thermodynamics gives a
fundamental limitation to the efficiency of a heat
engine and the co-efficient of performance of a
refrigerator. In simple terms, it says that
efficiency of a heat engine can never be unity.
According to Eq. (12.20), this implies that heat
released to the cold reservoir can never be made
zero. For a refrigerator, the Second Law says that
the co-efficient of performance can never be
infinite. According to Eq. (12.21), this implies
that external work (W ) can never be zero. The
following two statements, one due to Kelvin and
Planck denying the possibility of a perfect heat
engine, and another due to Clausius denying
the possibility of a perfect refrigerator or heat
pump, are a concise summary of these
observations.


Second Law of Thermodynamics

Kelvin-Planck statement
No process is possible whose sole result is the
absorption of heat from a reservoir and the
complete conversion of the heat into work.
Clausius statement

No process is possible whose sole result is the


transfer of heat from a colder object to a hotter
object.
It can be proved that the two statements
above are completely equivalent.
12.12 REVERSIBLE AND IRREVERSIBLE PROCESSES
Imagine some process in which a thermodynamic
system goes from an initial state i to a final
state f. During the process the system absorbs
heat Q from the surroundings and performs
work W on it. Can we reverse this process and


12.13 CARNOT ENGINE


Suppose we have a hot reservoir at temperature


T1 and a cold reservoir at temperature T2. What
is the maximum efficiency possible for a heat
engine operating between the two reservoirs and
what cycle of processes should be adopted to
achieve the maximum efficiency ? Sadi Carnot,
a French engineer, first considered this question
in 1824. Interestingly, Carnot arrived at the
correct answer, even though the basic concepts
of heat and thermodynamics had yet to be firmly
established.

complete a cycle, we need to take the system


from temperature T1 to T2 and then back from
temperature T2 to T1. Which processes should
we employ for this purpose that are reversible?
A little reflection shows that we can only adopt
reversible adiabatic processes for these
purposes, which involve no heat flow from any
reservoir. If we employ any other process that is
not adiabatic, say an isochoric process, to take
the system from one temperature to another, we
shall need a series of reservoirs in the
temperature range T2 to T1 to ensure that at each
stage the process is quasi-static. (Remember
again that for a process to be quasi-static and
reversible, there should be no finite temperature
difference between the system and the reservoir.)
But we are considering a reversible engine that
operates between only two temperatures. Thus
adiabatic processes must bring about the
temperature change in the system from T1 to T2
and T2 to T1 in this engine.


surroundings at every stage) and there are no


dissipative effects. For example, a quasi-static
isothermal expansion of an ideal gas in a
cylinder fitted with a frictionless movable piston
is a reversible process.
Why is reversibility such a basic concept in
thermodynamics ? As we have seen, one of the
concerns of thermodynamics is the efficiency
with which heat can be converted into work.
The Second Law of Thermodynamics rules out
the possibility of a perfect heat engine with 100%
efficiency. But what is the highest efficiency
possible for a heat engine working between two
reservoirs at temperatures T1 and T2 ? It turns
out that a heat engine based on idealised
reversible processes achieves the highest
efficiency possible. All other engines involving
irreversibility in any way (as would be the case
for practical engines) have lower than this
limiting efficiency.


We expect the ideal engine operating between


two temperatures to be a reversible engine.
Irreversibility is associated with dissipative
effects, as remarked in the preceding section,
and lowers efficiency. A process is reversible if
it is quasi-static and non-dissipative. We have
seen that a process is not quasi-static if it
involves finite temperature difference between
the system and the reservoir. This implies that
in a reversible heat engine operating between
two temperatures, heat should be absorbed
(from the hot reservoir) isothermally and
released (to the cold reservoir) isothermally. We
thus have identified two steps of the reversible
heat engine : isothermal process at temperature
T1 absorbing heat Q1 from the hot reservoir, and
another isothermal process at temperature T2
releasing heat Q2 to the cold reservoir. To

Fig. 12.11 Carnot cycle for a heat engine with an


ideal gas as the working substance.

A reversible heat engine operating between


two temperatures is called a Carnot engine. We
have just argued that such an engine must have
the following sequence of steps constituting one
cycle, called the Carnot cycle, shown in Fig.
12.11. We have taken the working substance of
the Carnot engine to be an ideal gas.
(a) Step 1 → 2 Isothermal expansion of the gas
taking its state from (P1, V1, T1) to
(P2, V2, T1).
The heat absorbed by the gas (Q1) from the
reservoir at temperature T1 is given by
Eq. (12.12). This is also the work done (W1→2)
by the gas on the environment.
$$W_{1\to 2} = Q_1 = \mu R T_1 \ln\frac{V_2}{V_1} \qquad (12.23)$$

(b) Step 2 → 3 Adiabatic expansion of the gas
from (P2, V2, T1) to (P3, V3, T2).
Work done by the gas, using
Eq. (12.16), is
$$W_{2\to 3} = \frac{\mu R (T_1 - T_2)}{\gamma - 1} \qquad (12.24)$$

(c) Step 3 → 4 Isothermal compression of the
gas from (P3, V3, T2) to (P4, V4, T2).
Heat released (Q2) by the gas to the reservoir
at temperature T2 is given by Eq. (12.12). This
is also the work done (W3→4) on the gas by the
environment.
$$W_{3\to 4} = Q_2 = \mu R T_2 \ln\frac{V_3}{V_4} \qquad (12.25)$$

(d) Step 4 → 1 Adiabatic compression of the
gas from (P4, V4, T2) to (P1, V1, T1).
Work done on the gas, [using Eq. (12.16)], is
$$W_{4\to 1} = \frac{\mu R (T_1 - T_2)}{\gamma - 1} \qquad (12.26)$$

From Eqs. (12.23) to (12.26) total work done
by the gas in one complete cycle is
$$W = W_{1\to 2} + W_{2\to 3} - W_{3\to 4} - W_{4\to 1} = \mu R T_1 \ln\frac{V_2}{V_1} - \mu R T_2 \ln\frac{V_3}{V_4} \qquad (12.27)$$

The efficiency of the Carnot engine is
$$\eta = \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1} = 1 - \frac{T_2}{T_1}\,\frac{\ln (V_3/V_4)}{\ln (V_2/V_1)} \qquad (12.28)$$

Now since step 2 → 3 is an adiabatic process,
$$T_1 V_2^{\gamma-1} = T_2 V_3^{\gamma-1}$$
i.e.
$$\frac{V_2}{V_3} = \left(\frac{T_2}{T_1}\right)^{1/(\gamma-1)} \qquad (12.29)$$

Similarly, since step 4 → 1 is an adiabatic
process
$$T_2 V_4^{\gamma-1} = T_1 V_1^{\gamma-1}$$
i.e.
$$\frac{V_1}{V_4} = \left(\frac{T_2}{T_1}\right)^{1/(\gamma-1)} \qquad (12.30)$$

From Eqs. (12.29) and (12.30),
$$\frac{V_3}{V_4} = \frac{V_2}{V_1} \qquad (12.31)$$

Using Eq. (12.31) in Eq. (12.28), we get
$$\eta = 1 - \frac{T_2}{T_1} \qquad \text{(Carnot engine)} \qquad (12.32)$$

We have already seen that a Carnot engine
is a reversible engine. Indeed it is the only
reversible engine possible that works between
two reservoirs at different temperatures. Each
step of the Carnot cycle given in Fig. 12.11 can
be reversed. This will amount to taking heat Q2
from the cold reservoir at T2, doing work W on
the system, and transferring heat Q1 to the hot
reservoir. This will be a reversible refrigerator.

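
The four steps of the Carnot cycle can be followed numerically. The Python sketch below assumes an ideal gas with constant γ, fixes V3 and V4 from the adiabatic relation T V^(γ−1) = constant, evaluates the work in each step as in Eqs. (12.23) to (12.26), and returns the efficiency W/Q1. The function name, the dictionary keys and the sample values in the usage note are illustrative assumptions.

```python
import math

R = 8.314  # universal gas constant in J mol^-1 K^-1 (rounded value)


def carnot_cycle(mu, T1, T2, V1, V2, gamma):
    """Work and heat in one Carnot cycle of mu moles of ideal gas (T1 > T2, V2 > V1)."""
    # Adiabatic steps: T * V**(gamma - 1) is constant, so both V3/V2 and V4/V1
    # equal (T1 / T2) ** (1 / (gamma - 1)).
    ratio = (T1 / T2) ** (1.0 / (gamma - 1.0))
    V3 = V2 * ratio
    V4 = V1 * ratio

    W12 = mu * R * T1 * math.log(V2 / V1)    # isothermal expansion, equals Q1
    W23 = mu * R * (T1 - T2) / (gamma - 1.0)  # adiabatic expansion, work by the gas
    W34 = mu * R * T2 * math.log(V3 / V4)    # isothermal compression, work on the gas, equals Q2
    W41 = mu * R * (T1 - T2) / (gamma - 1.0)  # adiabatic compression, work on the gas

    W = W12 + W23 - W34 - W41                 # net work done by the gas in one cycle
    return {"Q1": W12, "Q2": W34, "W": W, "efficiency": W / W12}
```

For example, `carnot_cycle(1.0, 500.0, 300.0, 1.0e-3, 2.0e-3, 1.4)` gives an efficiency that agrees with 1 − T2/T1 = 0.4 to within rounding, whatever values of V1, V2 and γ are chosen. The two adiabatic contributions cancel, so the net work is simply Q1 − Q2.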

We next establish the important result


(sometimes called Carnot's theorem) that
(a) working between two given temperatures T1
and T2 of the hot and cold reservoirs respectively,
no engine can have efficiency more than that of
the Carnot engine and (b) the efficiency of the
Carnot engine is independent of the nature of
the working substance.
To prove the result (a), imagine a reversible
(Carnot) engine R and an irreversible engine I
working between the same source (hot reservoir)
and sink (cold reservoir). Let us couple the
engines, I and R, in such a way so that I acts
like a heat engine and R acts as a refrigerator.
Let I absorb heat Q1 from the source, deliver
work W′ and release the heat Q1 − W′ to the sink.
We arrange so that R returns the same heat Q1
to the source, taking heat Q2 from the sink and
requiring work W = Q1 − Q2 to be done on it.


than that of the Carnot engine. A similar


argument can be constructed to show that a
reversible engine with one particular substance
cannot be more efficient than the one using
another substance. The maximum efficiency of
a Carnot engine given by Eq. (12.32) is
independent of the nature of the system
performing the Carnot cycle of operations. Thus
we are justified in using an ideal gas as a system
in the calculation of efficiency of a Carnot
engine. The ideal gas has a simple equation of
state, which allows us to readily calculate η, but
the final result for η, [Eq. (12.32)], is true for
any Carnot engine.


Now suppose ηR < ηI i.e. if R were to act
as an engine it would give less work output
than that of I i.e. W < W′ for a given Q1. With R
acting like a refrigerator, this would mean
Q2 = Q1 − W > Q1 − W′. Thus on the whole,
the coupled I-R system extracts heat
(Q1 − W) − (Q1 − W′) = (W′ − W) from the cold
reservoir and delivers the same amount of work
in one cycle, without any change in the source
or anywhere else. This is clearly against the
Kelvin-Planck statement of the Second Law of
Thermodynamics. Hence the assertion ηI > ηR
is wrong. No engine can have efficiency greater

Fig. 12.12 An irreversible engine (I) coupled to a
reversible refrigerator (R). If W′ > W, this
would amount to extraction of heat
W′ − W from the sink and its full
conversion to work, in contradiction with
the Second Law of Thermodynamics.

This final remark shows that in a Carnot
cycle,
$$\frac{Q_1}{Q_2} = \frac{T_1}{T_2} \qquad (12.33)$$
is a universal relation independent of the nature
of the system. Here Q1 and Q2 are respectively,
the heat absorbed and released isothermally
(from the hot and to the cold reservoirs) in a
Carnot engine. Equation (12.33), can, therefore,
be used as a relation to define a truly universal
thermodynamic temperature scale that is
independent of any particular properties of the
system used in the Carnot cycle. Of course, for
an ideal gas as a working substance, this
universal temperature is the same as the ideal
gas temperature introduced in section 12.11.

SUMMARY

1. The zeroth law of thermodynamics states that two systems in thermal equilibrium with a
third system are in thermal equilibrium with each other. The Zeroth Law leads to the
concept of temperature.
2. Internal energy of a system is the sum of kinetic energies and potential energies of the
molecular constituents of the system. It does not include the over-all kinetic energy of
the system. Heat and work are two modes of energy transfer to the system. Heat is the
energy transfer arising due to temperature difference between the system and the
surroundings. Work is energy transfer brought about by other means, such as moving
the piston of a cylinder containing the gas, by raising or lowering some weight connected
to it.
3. The first law of thermodynamics is the general law of conservation of energy applied to
any system in which energy transfer from or to the surroundings (through heat and
work) is taken into account. It states that
$$\Delta Q = \Delta U + \Delta W$$
where ΔQ is the heat supplied to the system, ΔW is the work done by the system and ΔU
is the change in internal energy of the system.

4. The specific heat capacity of a substance is defined by
$$s = \frac{1}{m}\frac{\Delta Q}{\Delta T}$$
where m is the mass of the substance and ΔQ is the heat required to change its
temperature by ΔT. The molar specific heat capacity of a substance is defined by
$$C = \frac{1}{\mu}\frac{\Delta Q}{\Delta T}$$
where μ is the number of moles of the substance. For a solid, the law of equipartition
of energy gives
C = 3R
which generally agrees with experiment at ordinary temperatures.
Calorie is the old unit of heat. 1 calorie is the amount of heat required to raise the
temperature of 1 g of water from 14.5 °C to 15.5 °C. 1 cal = 4.186 J.

5. For an ideal gas, the molar specific heat capacities at constant pressure and volume
satisfy the relation
$$C_p - C_v = R$$
where R is the universal gas constant.

6. Equilibrium states of a thermodynamic system are described by state variables. The
value of a state variable depends only on the particular state, not on the path used to
arrive at that state. Examples of state variables are pressure (P ), volume (V ), temperature
(T ), and mass (m ). Heat and work are not state variables. An Equation of State (like
the ideal gas equation PV = μRT ) is a relation connecting different state variables.

7. A quasi-static process is an infinitely slow process such that the system remains in
thermal and mechanical equilibrium with the surroundings throughout. In a
quasi-static process, the pressure and temperature of the environment can differ from
those of the system only infinitesimally.

8. In an isothermal expansion of an ideal gas from volume V1 to V2 at temperature T the
heat absorbed (Q) equals the work done (W ) by the gas, each given by
$$Q = W = \mu R T \ln\frac{V_2}{V_1}$$

9. In an adiabatic process of an ideal gas
$$P V^{\gamma} = \text{constant}$$
where
$$\gamma = \frac{C_p}{C_v}$$
Work done by an ideal gas in an adiabatic change of state from (P1, V1, T1) to (P2, V2, T2)
is
$$W = \frac{\mu R (T_1 - T_2)}{\gamma - 1}$$

10. Heat engine is a device in which a system undergoes a cyclic process resulting in
conversion of heat into work. If Q1 is the heat absorbed from the source, Q2 is the heat
released to the sink, and the work output in one cycle is W, the efficiency of the engine
is:
$$\eta = \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1}$$


11. In a refrigerator or a heat pump, the system extracts heat Q2 from the cold reservoir and
releases Q1 amount of heat to the hot reservoir, with work W done on the system. The
co-efficient of performance of a refrigerator is given by

$$\alpha = \frac{Q_2}{W} = \frac{Q_2}{Q_1 - Q_2}$$

12. The second law of thermodynamics disallows some processes consistent with the First
Law of Thermodynamics. It states


Kelvin-Planck statement
No process is possible whose sole result is the absorption of heat from a reservoir and
complete conversion of the heat into work.
Clausius statement

No process is possible whose sole result is the transfer of heat from a colder object to a
hotter object.


Put simply, the Second Law implies that no heat engine can have efficiency equal to
1 or no refrigerator can have co-efficient of performance equal to infinity.


13. A process is reversible if it can be reversed such that both the system and the surroundings
return to their original states, with no other change anywhere else in the universe.
Spontaneous processes of nature are irreversible. The idealised reversible process is a
quasi-static process with no dissipative factors such as friction, viscosity, etc.


14. Carnot engine is a reversible engine operating between two temperatures T1 (source) and
T2 (sink). The Carnot cycle consists of two isothermal processes connected by two
adiabatic processes. The efficiency of a Carnot engine is given by

$$\eta = 1 - \frac{T_2}{T_1} \qquad \text{(Carnot engine)}$$

No engine operating between two temperatures can have efficiency greater than that of
the Carnot engine.

15. If Q > 0, heat is added to the system
If Q < 0, heat is removed from the system
If W > 0, work is done by the system
If W < 0, work is done on the system

Quantity                          Symbol  Dimensions        Unit               Remark
Coefficient of volume expansion   αv      [K⁻¹]             K⁻¹                αv = 3 αl
Heat supplied to a system         ΔQ      [ML² T⁻²]         J                  Q is not a state variable
Specific heat                     s       [L² T⁻² K⁻¹]      J kg⁻¹ K⁻¹
Thermal conductivity              K       [MLT⁻³ K⁻¹]       J s⁻¹ m⁻¹ K⁻¹      H = −KA dT/dx


POINTS TO PONDER

1. Temperature of a body is related to its average internal energy, not to the kinetic energy
   of motion of its centre of mass. A bullet fired from a gun is not at a higher temperature
   because of its high speed.
2. Equilibrium in thermodynamics refers to the situation when macroscopic variables
   describing the thermodynamic state of a system do not depend on time. Equilibrium of
   a system in mechanics means the net external force and torque on the system are zero.
3. In a state of thermodynamic equilibrium, the microscopic constituents of a system are
   not in equilibrium (in the sense of mechanics).
4. Heat capacity, in general, depends on the process the system goes through when heat is
   supplied.
5. In isothermal quasi-static processes, heat is absorbed or given out by the system even
   though at every stage the gas has the same temperature as that of the surrounding
   reservoir. This is possible because of the infinitesimal difference in temperature between
   the system and the reservoir.

EXERCISES

12.1 A geyser heats water flowing at the rate of 3.0 litres per minute from 27 °C to 77 °C.
If the geyser operates on a gas burner, what is the rate of consumption of the fuel if
its heat of combustion is $4.0 \times 10^{4}$ J/g?

12.2 What amount of heat must be supplied to $2.0 \times 10^{-2}$ kg of nitrogen (at room
temperature) to raise its temperature by 45 °C at constant pressure? (Molecular
mass of N2 = 28; R = 8.3 J mol$^{-1}$ K$^{-1}$.)

12.3 Explain why
(a) Two bodies at different temperatures T1 and T2 if brought in thermal contact do
not necessarily settle to the mean temperature (T1 + T2 )/2.
(b) The coolant in a chemical or a nuclear plant (i.e., the liquid used to prevent
the different parts of a plant from getting too hot) should have high specific
heat.
(c) Air pressure in a car tyre increases during driving.
(d) The climate of a harbour town is more temperate than that of a town in a desert
at the same latitude.

12.4 A cylinder with a movable piston contains 3 moles of hydrogen at standard temperature
and pressure. The walls of the cylinder are made of a heat insulator, and the piston
is insulated by having a pile of sand on it. By what factor does the pressure of the
gas increase if the gas is compressed to half its original volume?

12.5 In changing the state of a gas adiabatically from an equilibrium state A to another
equilibrium state B, an amount of work equal to 22.3 J is done on the system. If the
gas is taken from state A to B via a process in which the net heat absorbed by the
system is 9.35 cal, how much is the net work done by the system in the latter case?
(Take 1 cal = 4.19 J)

12.6 Two cylinders A and B of equal capacity are connected to each other via a stopcock.
A contains a gas at standard temperature and pressure. B is completely evacuated.
The entire system is thermally insulated. The stopcock is suddenly opened. Answer
the following :
(a) What is the final pressure of the gas in A and B ?
(b) What is the change in internal energy of the gas ?
(c) What is the change in the temperature of the gas ?
(d) Do the intermediate states of the system (before settling to the final equilibrium
state) lie on its P-V-T surface ?

12.7 A steam engine delivers $5.4 \times 10^{8}$ J of work per minute and receives
$3.6 \times 10^{9}$ J of heat per minute from its boiler. What is the efficiency of the
engine? How much heat is wasted per minute?

12.8 An electric heater supplies heat to a system at a rate of 100 W. If the system performs
work at a rate of 75 joules per second, at what rate is the internal energy increasing?

12.9 A thermodynamic system is taken from an original state to an intermediate state by
the linear process shown in Fig. 12.13.

[Fig. 12.13]

Its volume is then reduced to the original value from E to F by an isobaric process.
Calculate the total work done by the gas from D to E to F.

12.10 A refrigerator is to maintain eatables kept inside at 9 °C. If room temperature is 36 °C,
calculate the coefficient of performance.

CHAPTER THIRTEEN

KINETIC THEORY

13.1 Introduction
13.2 Molecular nature of matter
13.3 Behaviour of gases
13.4 Kinetic theory of an ideal gas
13.5 Law of equipartition of energy
13.6 Specific heat capacity
13.7 Mean free path
Summary
Points to ponder
Exercises
Additional exercises

13.1 INTRODUCTION

Boyle discovered the law named after him in 1661. Boyle, Newton and several others tried to
explain the behaviour of gases by considering that gases are made up of tiny atomic
particles. The actual atomic theory got established more than 150 years later. Kinetic theory
explains the behaviour of gases based on the idea that the gas consists of rapidly moving
atoms or molecules. This is possible as the inter-atomic forces, which are short range forces
that are important for solids and liquids, can be neglected for gases. The kinetic theory
was developed in the nineteenth century by Maxwell, Boltzmann and others. It has been
remarkably successful. It gives a molecular interpretation of pressure and temperature
of a gas, and is consistent with gas laws and Avogadro's hypothesis. It correctly explains
specific heat capacities of many gases. It also relates measurable properties of gases
such as viscosity, conduction and diffusion with molecular parameters, yielding estimates
of molecular sizes and masses. This chapter gives an introduction to kinetic theory.

13.2 MOLECULAR NATURE OF MATTER

Richard Feynman, one of the great physicists of the 20th century, considered the discovery that
"Matter is made up of atoms" to be a very significant one. Humanity may suffer annihilation
(due to nuclear catastrophe) or extinction (due to environmental disasters) if we do not act
wisely. If that happens, and all of scientific knowledge were to be destroyed, then Feynman
would like the "Atomic Hypothesis" to be communicated to the next generation of creatures in
the universe. Atomic Hypothesis: All things are made of atoms - little particles that move
around in perpetual motion, attracting each other when they are a little distance apart,
but repelling upon being squeezed into one another.

Speculation that matter may not be continuous existed in many places and cultures. Kanada
in India and Democritus in Greece had suggested that matter may consist of indivisible
constituents. The scientific "Atomic Theory" is usually credited to John Dalton. He
proposed the atomic theory to explain the laws of definite and multiple proportions obeyed by
elements when they combine into compounds. The first law says that any given compound has
a fixed proportion by mass of its constituents. The second law says that when two elements
form more than one compound, for a fixed mass of one element, the masses of the other
element are in ratio of small integers.

To explain the laws, Dalton suggested, about 200 years ago, that the smallest constituents
of an element are atoms. Atoms of one element are identical but differ from those of other
elements. A small number of atoms of each element combine to form a molecule of the
compound. Gay Lussac's law, also given in the early 19th century, states: When gases combine
chemically to yield another gas, their volumes are in the ratios of small integers. Avogadro's
law (or hypothesis) says: Equal volumes of all gases at equal temperature and pressure have
the same number of molecules. Avogadro's law, when combined with Dalton's theory, explains
Gay Lussac's law. Since the elements are often in the form of molecules, Dalton's atomic theory
can also be referred to as the molecular theory of matter. The theory is now well accepted by
scientists. However, even at the end of the nineteenth century there were famous scientists
who did not believe in atomic theory!

Atomic Hypothesis in Ancient India and Greece

Though John Dalton is credited with the introduction of the atomic viewpoint in modern science,
scholars in ancient India and Greece conjectured long before the existence of atoms and
molecules. In the Vaiseshika school of thought in India founded by Kanada (Sixth century B.C.)
the atomic picture was developed in considerable detail. Atoms were thought to be eternal,
indivisible, infinitesimal and ultimate parts of matter. It was argued that if matter could be
subdivided without an end, there would be no difference between a mustard seed and the Meru
mountain. Four kinds of atoms (Paramanu - Sanskrit word for the smallest particle) were
postulated: Bhoomi (Earth), Ap (water), Tejas (fire) and Vayu (air), each with characteristic
mass and other attributes. Akasa (space) was thought to have no atomic structure and
was continuous and inert. Atoms combine to form different molecules (e.g. two atoms combine
to form a diatomic molecule dvyanuka, three atoms form a tryanuka or a triatomic molecule),
their properties depending upon the nature and ratio of the constituent atoms. The size of the
atoms was also estimated, by conjecture or by methods that are not known to us. The estimates
vary. In Lalitavistara, a famous biography of the Buddha written mainly in the second century
B.C., the estimate is close to the modern estimate of atomic size, of the order of $10^{-10}$ m.

In ancient Greece, Democritus (Fourth century B.C.) is best known for his atomic hypothesis. The
word "atom" means "indivisible" in Greek. According to him, atoms differ from each other
physically, in shape, size and other properties, and this resulted in the different properties
of the substances formed by their combination. The atoms of water were smooth and round and
unable to "hook" on to each other, which is why liquid/water flows easily. The atoms of earth
were rough and jagged, so they held together to form hard substances. The atoms of fire were
thorny, which is why it caused painful burns. These fascinating ideas, despite their ingenuity,
could not evolve much further, perhaps because they were intuitive conjectures and speculations
not tested and modified by quantitative experiments - the hallmark of modern science.

From many observations, in recent times we now know that molecules (made up of one or
more atoms) constitute matter. Electron microscopes and scanning tunnelling microscopes
enable us to even see them. The size of an atom is about an angstrom ($10^{-10}$ m).
In solids, which are tightly packed, atoms are spaced about a few angstroms (2 Å) apart. In
liquids the separation between atoms is also about the same. In liquids the atoms are not
as rigidly fixed as in solids, and can move around. This enables a liquid to flow. In gases
the interatomic distances are in tens of angstroms. The average distance a molecule
can travel without colliding is called the mean free path. The mean free path, in gases, is of
the order of thousands of angstroms. The atoms are much freer in gases and can travel long
distances without colliding. If they are not enclosed, gases disperse away. In solids and
liquids the closeness makes the interatomic force important. The force has a long range
attraction and a short range repulsion. The atoms attract when they are at a few angstroms
but repel when they come closer. The static appearance of a gas

is misleading. The gas is full of activity and the equilibrium is a dynamic one. In dynamic
equilibrium, molecules collide and change their speeds during the collision. Only the average
properties are constant.

Atomic theory is not the end of our quest, but the beginning. We now know that atoms are not
indivisible or elementary. They consist of a nucleus and electrons. The nucleus itself is made
up of protons and neutrons. The protons and neutrons are again made up of quarks. Even
quarks may not be the end of the story. There may be string like elementary entities. Nature
always has surprises for us, but the search for truth is often enjoyable and the discoveries
beautiful. In this chapter, we shall limit ourselves to understanding the behaviour of gases
(and a little bit of solids), as a collection of moving molecules in incessant motion.

13.3 BEHAVIOUR OF GASES

Properties of gases are easier to understand than those of solids and liquids. This is mainly
because in a gas, molecules are far from each other and their mutual interactions are
negligible except when two molecules collide. Gases at low pressures and high temperatures
much above that at which they liquefy (or solidify) approximately satisfy a simple relation
between their pressure, temperature and volume given by (see Ch. 11)

$$PV = KT \qquad (13.1)$$

for a given sample of the gas. Here T is the temperature in kelvin or (absolute) scale. K is
a constant for the given sample but varies with the volume of the gas. If we now bring in the
idea of atoms or molecules, then K is proportional to the number of molecules, (say) N in the
sample. We can write K = N k. Observation tells us that this k is same for all gases. It is
called Boltzmann constant and is denoted by $k_B$.

As

$$\frac{P_1 V_1}{N_1 T_1} = \frac{P_2 V_2}{N_2 T_2} = \text{constant} = k_B \qquad (13.2)$$

if P, V and T are same, then N is also same for all gases. This is Avogadro's hypothesis, that
the number of molecules per unit volume is same for all gases at a fixed temperature and
pressure. The number in 22.4 litres of any gas is $6.02 \times 10^{23}$. This is known as
Avogadro number and is denoted by $N_A$. The mass of 22.4 litres of any gas is equal to its
molecular weight in grams at S.T.P (standard temperature 273 K and pressure 1 atm). This
amount of substance is called a mole (see Chapter 2 for a more precise definition).
Avogadro had guessed the equality of numbers in equal volumes of gas at a fixed temperature
and pressure from chemical reactions. Kinetic theory justifies this hypothesis.

John Dalton (1766-1844)
He was an English chemist. When different types of atoms combine, they obey certain simple
laws. Dalton's atomic theory explains these laws in a simple way. He also gave a theory of
colour blindness.

Amedeo Avogadro (1776-1856)
He made a brilliant guess that equal volumes of gases have equal number of molecules at the
same temperature and pressure. This helped in understanding the combination of different
gases in a very simple way. It is now called Avogadro's hypothesis (or law). He also
suggested that the smallest constituent of gases like hydrogen, oxygen and nitrogen are not
atoms but diatomic molecules.

The perfect gas equation can be written as

$$PV = \mu RT \qquad (13.3)$$

where $\mu$ is the number of moles and $R = N_A k_B$ is a universal constant. The temperature
T is absolute temperature. Choosing kelvin scale for

absolute temperature, R = 8.314 J mol$^{-1}$ K$^{-1}$. Here

$$\mu = \frac{M}{M_0} = \frac{N}{N_A} \qquad (13.4)$$

where M is the mass of the gas containing N molecules, $M_0$ is the molar mass and $N_A$ the
Avogadro's number. Using Eq. (13.4), Eq. (13.3) can also be written as

$$PV = k_B N T \quad \text{or} \quad P = k_B n T$$

where n is the number density, i.e. number of molecules per unit volume. $k_B$ is the
Boltzmann constant introduced above. Its value in SI units is $1.38 \times 10^{-23}$ J K$^{-1}$.

Another useful form of Eq. (13.3) is

$$P = \frac{\rho R T}{M_0} \qquad (13.5)$$

where $\rho$ is the mass density of the gas.

A gas that satisfies Eq. (13.3) exactly at all pressures and temperatures is defined to be an
ideal gas. An ideal gas is a simple theoretical model of a gas. No real gas is truly ideal.
Fig. 13.1 shows departures from ideal gas behaviour for a real gas at three different
temperatures. Notice that all curves approach the ideal gas behaviour for low pressures and
high temperatures.

[Fig. 13.1 Real gases approach ideal gas behaviour at low pressures and high temperatures.
Vertical axis: $pV/\mu T$ (J mol$^{-1}$ K$^{-1}$); horizontal axis: P (atm).]

At low pressures or high temperatures the molecules are far apart and molecular
interactions are negligible. Without interactions the gas behaves like an ideal one.

If we fix $\mu$ and T in Eq. (13.3), we get

$$PV = \text{constant} \qquad (13.6)$$

i.e., keeping temperature constant, pressure of a given mass of gas varies inversely with volume.
This is the famous Boyle's law. Fig. 13.2 shows comparison between experimental P-V curves
and the theoretical curves predicted by Boyle's law. Once again you see that the agreement is
good at high temperatures and low pressures. Next, if you fix P, Eq. (13.1) shows that
$V \propto T$, i.e., for a fixed pressure, the volume of a gas is proportional to its absolute
temperature T (Charles' law). See Fig. 13.3.

[Fig. 13.2 Experimental P-V curves (solid lines) for steam at three temperatures compared
with Boyle's law (dotted lines). P is in units of 22 atm and V in units of 0.09 litres.]
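The different forms of the ideal gas equation, Eqs. (13.3) to (13.5), can be cross-checked numerically. The sketch below is an illustration only: the function names are hypothetical, 1 atm is taken as $1.013 \times 10^{5}$ Pa (a standard value not quoted in the text), and the density, molar mass and temperature in the second check are sample inputs.

```python
K_B = 1.38e-23   # Boltzmann constant, J/K (value quoted in the text)
R = 8.314        # universal gas constant, J/(mol K)
N_A = 6.02e23    # Avogadro number

ATM = 1.013e5    # 1 atm in pascal (assumed standard value)


def molecules_from_pvt(pressure, volume, temperature):
    """Number of molecules N from PV = k_B N T (all quantities in SI units)."""
    return pressure * volume / (K_B * temperature)


def pressure_from_density(rho, molar_mass, temperature):
    """Pressure from Eq. (13.5), P = rho R T / M0, with M0 in kg/mol."""
    return rho * R * temperature / molar_mass


# 22.4 litres of an ideal gas at 273 K and 1 atm should hold about N_A molecules.
n_stp = molecules_from_pvt(ATM, 22.4e-3, 273.0)
print(n_stp / N_A)

# A vapour of density 0.6 kg/m^3 and molar mass 0.018 kg/mol at 373 K
# should come out close to atmospheric pressure.
print(pressure_from_density(0.6, 0.018, 373.0) / ATM)
```

Both printed ratios should be close to 1, which is simply the statement that $R = N_A k_B$ and that the two forms of the gas equation describe the same gas.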

Finally, consider a mixture of non-interacting ideal gases: $\mu_1$ moles of gas 1, $\mu_2$
moles of gas 2, etc. in a vessel of volume V at temperature T and pressure P. It is then
found that the equation of state of the mixture is:

$$PV = (\mu_1 + \mu_2 + \dots)\,RT \qquad (13.7)$$

i.e.

$$P = \mu_1 \frac{RT}{V} + \mu_2 \frac{RT}{V} + \dots \qquad (13.8)$$

$$\;\; = P_1 + P_2 + \dots \qquad (13.9)$$

Clearly $P_1 = \mu_1 RT/V$ is the pressure gas 1 would exert at the same conditions of volume
and temperature if no other gases were present. This is called the partial pressure of the gas.
Thus, the total pressure of a mixture of ideal gases is the sum of partial pressures. This is
Dalton's law of partial pressures.
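Dalton's law of partial pressures, Eqs. (13.7) to (13.9), can be checked with a few lines of arithmetic. This is a minimal sketch under the ideal-gas assumption; the function names and the sample mixture (1 mol and 2 mol in 0.05 m$^3$ at 300 K) are hypothetical and chosen only for illustration.

```python
R = 8.314  # universal gas constant, J/(mol K)


def partial_pressures(moles, volume, temperature):
    """Partial pressure P_i = mu_i R T / V of each gas in the mixture (Pa)."""
    return [mu * R * temperature / volume for mu in moles]


def total_pressure(moles, volume, temperature):
    """Total pressure from Eq. (13.7), P = (mu_1 + mu_2 + ...) R T / V (Pa)."""
    return sum(moles) * R * temperature / volume


# Hypothetical mixture: 1 mol of gas 1 and 2 mol of gas 2.
mixture = [1.0, 2.0]
parts = partial_pressures(mixture, 0.05, 300.0)
total = total_pressure(mixture, 0.05, 300.0)

print(parts)   # individual partial pressures
print(total)   # equals the sum of the partial pressures
```

Since every gas shares the same V and T, the ratio of any two partial pressures equals the ratio of the corresponding mole numbers.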

[Fig. 13.3 Experimental T-V curves (solid lines) for CO2 at three pressures compared with
Charles' law (dotted lines). T is in units of 300 K and V in units of 0.13 litres.]

We next consider some examples which give us information about the volume occupied by
the molecules and the volume of a single molecule.

Example 13.1 The density of water is 1000 kg m$^{-3}$. The density of water vapour at 100 °C
and 1 atm pressure is 0.6 kg m$^{-3}$. The volume of a molecule multiplied by the total
number gives what is called molecular volume. Estimate the ratio (or fraction) of
the molecular volume to the total volume occupied by the water vapour under the
above conditions of temperature and pressure.

Answer For a given mass of water molecules, the density is less if volume is large. So the
volume of the vapour is $1000/0.6 = 1/(6 \times 10^{-4})$ times larger. If densities of bulk
water and water molecules are same, then the fraction of molecular volume to the total volume
in liquid state is 1. As volume in vapour state has increased, the fractional volume is less
by the same amount, i.e. $6 \times 10^{-4}$.

Example 13.2 Estimate the volume of a water molecule using the data in Example 13.1.

Answer In the liquid (or solid) phase, the molecules of water are quite closely packed. The
density of water molecule may therefore be regarded as roughly equal to the density of bulk
water = 1000 kg m$^{-3}$. To estimate the volume of a water molecule, we need to know the mass
of a single water molecule. We know that 1 mole of water has a mass approximately equal to
(2 + 16) g = 18 g = 0.018 kg.
Since 1 mole contains about $6 \times 10^{23}$ molecules (Avogadro's number), the mass of
a molecule of water is $(0.018)/(6 \times 10^{23})$ kg $= 3 \times 10^{-26}$ kg. Therefore, a
rough estimate of the volume of a water molecule is as follows:
Volume of a water molecule
= $(3 \times 10^{-26}$ kg$)/(1000$ kg m$^{-3})$
= $3 \times 10^{-29}$ m$^{3}$
= $(4/3)\,\pi\,(\text{Radius})^{3}$
Hence, Radius $\approx 2 \times 10^{-10}$ m = 2 Å

Example 13.3 What is the average distance between atoms (interatomic distance) in water?
Use the data given in Examples 13.1 and 13.2.

Answer: A given mass of water in vapour state has $1.67 \times 10^{3}$ times the volume of the
same mass of water in liquid state (Ex. 13.1). This is also the increase in the amount of
volume available for each molecule of water. When volume increases by $10^{3}$ times the radius
increases by $V^{1/3}$ or 10 times, i.e., 10 × 2 Å = 20 Å. So the average distance is
2 × 20 = 40 Å.

Example 13.4 A vessel contains two non-reactive gases: neon (monatomic) and
oxygen (diatomic). The ratio of their partial pressures is 3:2. Estimate the ratio of (i)
number of molecules and (ii) mass density of neon and oxygen in the vessel. Atomic
mass of Ne = 20.2 u, molecular mass of O2 = 32.0 u.

Answer Partial pressure of a gas in a mixture is the pressure it would have for the same volume
and temperature if it alone occupied the vessel. (The total pressure of a mixture of
non-reactive gases is the sum of partial pressures due to its constituent gases.) Each gas
(assumed ideal) obeys the gas law. Since V and T are common to the two gases, we have
$P_1 V = \mu_1 RT$ and $P_2 V = \mu_2 RT$, i.e. $(P_1/P_2) = (\mu_1/\mu_2)$. Here 1 and 2 refer
to neon and oxygen respectively. Since $(P_1/P_2) = (3/2)$ (given), $(\mu_1/\mu_2) = 3/2$.

(i) By definition $\mu_1 = (N_1/N_A)$ and $\mu_2 = (N_2/N_A)$ where $N_1$ and $N_2$ are the
number of molecules of 1 and 2, and $N_A$ is the Avogadro's number.
Therefore, $(N_1/N_2) = (\mu_1/\mu_2) = 3/2$.
(ii) We can also write $\mu_1 = (m_1/M_1)$ and $\mu_2 = (m_2/M_2)$ where $m_1$ and $m_2$ are
the masses of 1 and 2; and $M_1$ and $M_2$ are their molecular masses. (Both $m_1$ and $M_1$;
as well as $m_2$ and $M_2$ should be expressed in the same units). If $\rho_1$ and $\rho_2$
are the mass densities of 1 and 2 respectively, we have

$$\frac{\rho_1}{\rho_2} = \frac{m_1/V}{m_2/V} = \frac{m_1}{m_2} = \frac{\mu_1}{\mu_2} \times \frac{M_1}{M_2} = \frac{3}{2} \times \frac{20.2}{32.0} = 0.947$$

13.4 KINETIC THEORY OF AN IDEAL GAS

Kinetic theory of gases is based on the molecular picture of matter. A given amount of gas is a
collection of a large number of molecules (typically of the order of Avogadro's number) that
are in incessant random motion. At ordinary pressure and temperature, the average distance
between molecules is a factor of 10 or more than the typical size of a molecule (2 Å). Thus the
interaction between the molecules is negligible and we can assume that they move freely in
straight lines according to Newton's first law. However, occasionally, they come close to each
other, experience intermolecular forces and their velocities change. These interactions are
called collisions. The molecules collide incessantly against each other or with the walls and
change their velocities. The collisions are considered to be elastic. We can derive an
expression for the pressure of a gas based on the kinetic theory.

We begin with the idea that molecules of a gas are in incessant random motion, colliding
against one another and with the walls of the container. All collisions between molecules
among themselves or between molecules and the walls are elastic. This implies that total
kinetic energy is conserved. The total momentum is conserved as usual.

13.4.1 Pressure of an Ideal Gas

Consider a gas enclosed in a cube of side l. Take the axes to be parallel to the sides of the
cube, as shown in Fig. 13.4. A molecule with velocity $(v_x, v_y, v_z)$ hits the planar wall
parallel to yz-plane of area A (= $l^2$). Since the collision is elastic, the molecule rebounds
with the same speed; its y and z components of velocity do not change in the collision but the
x-component reverses sign. That is, the velocity after collision is $(-v_x, v_y, v_z)$. The
change in momentum of the molecule is: $-mv_x - (mv_x) = -2mv_x$. By the principle of
conservation of momentum, the momentum imparted to the wall in the collision $= 2mv_x$.

[Fig. 13.4 Elastic collision of a gas molecule with the wall of the container.]

To calculate the force (and pressure) on the wall, we need to calculate momentum imparted
to the wall per unit time. In a small time interval $\Delta t$, a molecule with x-component of
velocity $v_x$ will hit the wall if it is within the distance $v_x \Delta t$ from the wall.
That is, all molecules within the volume $A v_x \Delta t$ only can hit the wall in time
$\Delta t$. But, on the average, half of these are moving towards the wall and the other half
away from the wall. Thus the number of molecules with velocity $(v_x, v_y, v_z)$ hitting the
wall in time $\Delta t$ is $\tfrac{1}{2} A v_x \Delta t\, n$ where n is the number of molecules
per unit volume. The total momentum transferred to the wall by these molecules in
time $\Delta t$ is:

$$Q = (2mv_x)\left(\tfrac{1}{2}\, n A v_x \Delta t\right) \qquad (13.10)$$

The force on the wall is the rate of momentum transfer $Q/\Delta t$ and pressure is force per
unit area:

$$P = Q/(A\,\Delta t) = n m v_x^2 \qquad (13.11)$$

Actually, all molecules in a gas do not have the same velocity; there is a distribution in
velocities. The above equation therefore stands for pressure due to the group of molecules with
speed $v_x$ in the x-direction and n stands for the number density of that group of molecules.
The

total pressure is obtained by summing over the contribution due to all groups:

$$P = n m \overline{v_x^2} \qquad (13.12)$$

where $\overline{v_x^2}$ is the average of $v_x^2$. Now the gas is isotropic, i.e. there is no
preferred direction of velocity of the molecules in the vessel. Therefore, by symmetry,

$$\overline{v_x^2} = \overline{v_y^2} = \overline{v_z^2} = \tfrac{1}{3}\left[\overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2}\right] = \tfrac{1}{3}\,\overline{v^2} \qquad (13.13)$$

where v is the speed and $\overline{v^2}$ denotes the mean of the squared speed. Thus

$$P = \tfrac{1}{3}\, n m \overline{v^2} \qquad (13.14)$$

Some remarks on this derivation. First, though we choose the container to be a cube,
the shape of the vessel really is immaterial. For a vessel of arbitrary shape, we can always
choose a small infinitesimal (planar) area and carry through the steps above. Notice that both
A and $\Delta t$ do not appear in the final result. By Pascal's law, given in Ch. 10, pressure
in one portion of the gas in equilibrium is the same as anywhere else. Second, we have ignored
any collisions in the derivation. Though this assumption is difficult to justify rigorously,
we can qualitatively see that it will not lead to erroneous results. The number of molecules
hitting the wall in time $\Delta t$ was found to be $\tfrac{1}{2}\, n A v_x \Delta t$. Now the
collisions are random and the gas is in a steady state. Thus, if a molecule with velocity
$(v_x, v_y, v_z)$ acquires a different velocity due to collision with some molecule, there will
always be some other molecule with a different initial velocity which after a collision
acquires the velocity $(v_x, v_y, v_z)$. If this were not so, the distribution of velocities
would not remain steady. In any case we are finding $\overline{v_x^2}$. Thus, on the whole,
molecular collisions (if they are not too frequent and the time spent in a collision is
negligible compared to time between collisions) will not affect the calculation above.

13.4.2 Kinetic Interpretation of Temperature

Equation (13.14) can be written as

$$PV = \tfrac{1}{3}\, nV m \overline{v^2} \qquad (13.15a)$$

Founders of Kinetic Theory of Gases

James Clerk Maxwell (1831-1879), born in Edinburgh, Scotland, was among the greatest
physicists of the nineteenth century. He derived the thermal velocity distribution of molecules
in a gas and was among the first to obtain reliable estimates of molecular parameters from
measurable quantities like viscosity, etc. Maxwell's greatest achievement was the unification
of the laws of electricity and magnetism (discovered by Coulomb, Oersted, Ampere and Faraday)
into a consistent set of equations now called Maxwell's equations. From these he arrived at the
most important conclusion that light is an electromagnetic wave. Interestingly, Maxwell did not
agree with the idea (strongly suggested by Faraday's laws of electrolysis) that
electricity was particulate in nature.

Ludwig Boltzmann (1844-1906), born in Vienna, Austria, worked on the kinetic theory of gases
independently of Maxwell. A firm advocate of atomism, that is basic to kinetic theory,
Boltzmann provided a statistical interpretation of the Second Law of thermodynamics and the
concept of entropy. He is regarded as one of the founders of classical statistical mechanics.
The proportionality constant connecting energy and temperature in kinetic theory is known as
Boltzmann's constant in his honour.
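Returning to the pressure derivation of Section 13.4.1, the isotropy argument that turns $P = n m \overline{v_x^2}$ into $P = \tfrac{1}{3} n m \overline{v^2}$ can be illustrated with a small random-sampling experiment. This is a sketch under stated assumptions, not the textbook's method: each velocity component is drawn independently from a Gaussian of the same width (one convenient isotropic choice), and the spread `sigma`, the number density `n` and the molecular mass `m` are illustrative values.

```python
import random


def sample_mean_squares(num_molecules=100_000, sigma=300.0, seed=1):
    """Return (mean of vx^2, mean of v^2) for randomly sampled velocities.

    Each component is drawn from a Gaussian with standard deviation
    `sigma` (m/s); this is an assumed isotropic distribution.
    """
    rng = random.Random(seed)
    sum_vx2 = 0.0
    sum_v2 = 0.0
    for _ in range(num_molecules):
        vx = rng.gauss(0.0, sigma)
        vy = rng.gauss(0.0, sigma)
        vz = rng.gauss(0.0, sigma)
        sum_vx2 += vx * vx
        sum_v2 += vx * vx + vy * vy + vz * vz
    return sum_vx2 / num_molecules, sum_v2 / num_molecules


n = 2.5e25    # number density in m^-3 (illustrative)
m = 4.65e-26  # mass of one molecule in kg (illustrative)

mean_vx2, mean_v2 = sample_mean_squares()

p_from_x = n * m * mean_vx2           # pressure from the x-component average
p_isotropic = n * m * mean_v2 / 3.0   # pressure from the mean squared speed

print(p_from_x, p_isotropic)
```

The two estimates agree up to sampling noise, which shrinks as the number of sampled molecules grows. Neither the wall area A nor the time interval appears anywhere in the calculation, as remarked in the derivation.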

$$PV = \tfrac{2}{3}\, N \left[\tfrac{1}{2}\, m \overline{v^2}\right] \qquad (13.15b)$$

where N (= nV) is the number of molecules in the sample.
The quantity in the bracket is the average translational kinetic energy of the molecules in
the gas. Since the internal energy E of an ideal gas is purely kinetic*,

$$E = N \times \tfrac{1}{2}\, m \overline{v^2} \qquad (13.16)$$

* E denotes the translational part of the internal energy U that may include energies due to
other degrees of freedom also. See section 13.5.

Equation (13.15) then gives:

$$PV = \tfrac{2}{3}\, E \qquad (13.17)$$

We are now ready for a kinetic interpretation of temperature. Combining Eq. (13.17) with the
ideal gas Eq. (13.3), we get

$$E = \tfrac{3}{2}\, k_B N T \qquad (13.18)$$

or

$$E/N = \tfrac{1}{2}\, m \overline{v^2} = \tfrac{3}{2}\, k_B T \qquad (13.19)$$

i.e., the average kinetic energy of a molecule is proportional to the absolute temperature of
the gas; it is independent of pressure, volume or the nature of the ideal gas. This is a
fundamental result relating temperature, a macroscopic measurable parameter of a gas
(a thermodynamic variable as it is called) to a molecular quantity, namely the average kinetic
energy of a molecule. The two domains are connected by the Boltzmann constant. We note
in passing that Eq. (13.18) tells us that internal energy of an ideal gas depends only on
temperature, not on pressure or volume. With this interpretation of temperature, kinetic theory
of an ideal gas is completely consistent with the ideal gas equation and the various gas laws
based on it.

For a mixture of non-reactive ideal gases, the total pressure gets contribution from each gas
in the mixture. Equation (13.14) becomes

$$P = \tfrac{1}{3}\left[n_1 m_1 \overline{v_1^2} + n_2 m_2 \overline{v_2^2} + \dots\right] \qquad (13.20)$$

In equilibrium, the average kinetic energy of the molecules of different gases will be equal.
That is,

$$\tfrac{1}{2}\, m_1 \overline{v_1^2} = \tfrac{1}{2}\, m_2 \overline{v_2^2} = \tfrac{3}{2}\, k_B T$$

so that

$$P = (n_1 + n_2 + \dots)\, k_B T \qquad (13.21)$$

which is Dalton's law of partial pressures.

From Eq. (13.19), we can get an idea of the typical speed of molecules in a gas. At a
temperature T = 300 K, the mean square speed of a molecule in nitrogen gas is:

$$m = \frac{M_{N_2}}{N_A} = \frac{28}{6.02 \times 10^{26}} = 4.65 \times 10^{-26}\ \text{kg}$$

$$\overline{v^2} = 3 k_B T / m = (516)^2\ \text{m}^2\,\text{s}^{-2}$$

The square root of $\overline{v^2}$ is known as root mean square (rms) speed and is denoted by
$v_{rms}$. (We can also write $\overline{v^2}$ as $\langle v^2 \rangle$.)

$$v_{rms} = 516\ \text{m s}^{-1}$$

The speed is of the order of the speed of sound in air. It follows from Eq. (13.19) that at the
same temperature, lighter molecules have greater rms speed.

Example 13.5 A flask contains argon and chlorine in the ratio of 2:1 by mass. The
temperature of the mixture is 27 °C. Obtain the ratio of (i) average kinetic energy per
molecule, and (ii) root mean square speed $v_{rms}$ of the molecules of the two gases.
Atomic mass of argon = 39.9 u; Molecular mass of chlorine = 70.9 u.

Answer The important point to remember is that the average kinetic energy (per molecule) of any
(ideal) gas (be it monatomic like argon, diatomic like chlorine or polyatomic) is always equal
to $(3/2)\, k_B T$. It depends only on temperature, and is independent of the nature of the gas.
(i) Since argon and chlorine both have the same temperature in the flask, the ratio of average
kinetic energy (per molecule) of the two gases is 1:1.
(ii) Now $\tfrac{1}{2}\, m v_{rms}^2$ = average kinetic energy per molecule = $(3/2)\, k_B T$
where m is the mass of a molecule of the gas. Therefore,

$$\frac{\left(v_{rms}^2\right)_{Ar}}{\left(v_{rms}^2\right)_{Cl}} = \frac{(m)_{Cl}}{(m)_{Ar}} = \frac{(M)_{Cl}}{(M)_{Ar}} = \frac{70.9}{39.9} = 1.77$$

where M denotes the molecular mass of the gas. (For argon, a molecule is just an atom of argon.)
Taking square root of both sides,

$$\frac{\left(v_{rms}\right)_{Ar}}{\left(v_{rms}\right)_{Cl}} = 1.33$$

You should note that the composition of the mixture by mass is quite irrelevant to the above

326

PHYSICS

Maxwell Distribution Function

no N
C
tt E
o R
be T
re
pu

bl

is

he
d

In a given mass of gas, the velocities of all molecules are not the same, even when bulk
parameters like pressure, volume and temperature are fixed. Collisions change the direction
and the speed of molecules. However in a state of equilibrium, the distribution of speeds is
constant or fixed.
Distributions are very important and useful when dealing with systems containing a large
number of objects. As an example consider the ages of different persons in a city. It is not
feasible to deal with the age of each individual. We can divide the people into groups: children
up to age 20 years, adults between ages of 20 and 60, old people above 60. If we want more
detailed information we can choose smaller intervals, 0-1, 1-2,..., 99-100 of age groups. When
the size of the interval becomes smaller, say half a year, the number of persons in the interval
will also reduce, to roughly half the original number in the one year interval. The number of
persons dN(x) in the age interval x and x + dx is proportional to dx, or dN(x) = n_x dx. We have
used n_x to denote the number of persons at the value of x.

Maxwell distribution of molecular speeds

In a similar way the molecular speed distribution gives the number of molecules between
the speeds v and v + dv: $dN(v) = 4\pi N a^3 e^{-bv^2} v^2\, dv = n_v\, dv$. This is called the Maxwell distribution.
The plot of n_v against v is shown in the figure. The fraction of the molecules with speeds between v and
v + dv is equal to the area of the strip shown. The average of any quantity like v² is defined by
the integral $\langle v^2 \rangle = (1/N) \int v^2\, dN(v) = 3 k_B T/m$, which agrees with the result derived from
more elementary considerations.
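
The integral quoted above can be checked numerically. The sketch below is only an illustration: the box does not give the constants a and b, so the standard forms a = (m / 2π k_B T)^(1/2) and b = m / 2 k_B T are assumed here, and the function name and the nitrogen example are choices made for this sketch.

```python
import math

K_B = 1.38e-23  # Boltzmann constant in J/K, as used in this chapter


def maxwell_mean_square_speed(mass_kg, temperature_k, steps=20_000):
    """Evaluate <v^2> = (1/N) * integral of v^2 dN(v) with the midpoint rule."""
    # Assumed standard forms of the constants a and b (not given in the text).
    a = math.sqrt(mass_kg / (2 * math.pi * K_B * temperature_k))
    b = mass_kg / (2 * K_B * temperature_k)
    v_max = 10 / math.sqrt(b)          # far out in the exponential tail
    dv = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv             # midpoint of the i-th strip
        n_v_over_n = 4 * math.pi * a**3 * math.exp(-b * v * v) * v * v
        total += v * v * n_v_over_n * dv
    return total


m_n2 = 28.0e-3 / 6.02e23               # mass of one N2 molecule in kg
numeric = maxwell_mean_square_speed(m_n2, 300.0)
analytic = 3 * K_B * 300.0 / m_n2      # 3 k_B T / m
print(math.sqrt(numeric), math.sqrt(analytic))
```

Both square roots come out a little above 500 m/s for nitrogen at 300 K, so the distribution and the elementary result agree.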

calculation. Any other proportion by mass of


argon and chlorine would give the same answers
to (i) and (ii), provided the temperature remains
unaltered.
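
The speed ratio for argon and chlorine can be reproduced in a few lines. This is a minimal sketch; the function name is a hypothetical helper, and the only inputs are the molecular masses quoted in the example.

```python
import math

M_AR = 39.9    # molecular mass of argon, u
M_CL2 = 70.9   # molecular mass of chlorine, u


def rms_speed_ratio(mass_a, mass_b):
    """Ratio v_rms(A) / v_rms(B) at a common temperature, i.e. sqrt(M_B / M_A)."""
    return math.sqrt(mass_b / mass_a)


ratio_of_squares = M_CL2 / M_AR             # ratio of mean square speeds
ratio_of_speeds = rms_speed_ratio(M_AR, M_CL2)
print(ratio_of_squares, ratio_of_speeds)
```

The masses of gas present never enter the calculation, which is why the proportion by mass is irrelevant.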
Example 13.6 Uranium has two isotopes
of masses 235 and 238 units. If both are
present in Uranium hexafluoride gas which
would have the larger average speed ? If
atomic mass of fluorine is 19 units,
estimate the percentage difference in
speeds at any temperature.

Answer At a fixed temperature the average
energy = ½ m <v²> is constant. So the smaller the
mass of the molecule, the faster will be its speed.
The ratio of speeds is inversely proportional to
the square root of the ratio of the masses. The
masses are 349 and 352 units. So
$v_{349} / v_{352} = (352/349)^{1/2} = 1.0043$.
Hence difference $\Delta V / V = 0.43\,\%$.
[²³⁵U is the isotope needed for nuclear fission.
To separate it from the more abundant isotope
²³⁸U, the mixture is surrounded by a porous
cylinder. The porous cylinder must be thick and
narrow, so that the molecule wanders through
individually, colliding with the walls of the long
pore. The faster molecule will leak out more than


is V + u towards the bat. When the ball rebounds


(after hitting the massive bat) its speed, relative
to bat, is V + u moving away from the bat. So
relative to the wicket the speed of the rebounding
ball is V + (V + u) = 2V + u, moving away from
the wicket. So the ball speeds up after the
collision with the bat. The rebound speed will
be less than u if the bat is not massive. For a
molecule this would imply an increase in
temperature.
You should be able to answer (b) (c) and (d)
based on the answer to (a).
(Hint: Note the correspondence, piston → bat,


the slower one and so there is more of the lighter


molecule (enrichment) outside the porous
cylinder (Fig. 13.5). The method is not very
efficient and has to be repeated several times
for sufficient enrichment.].
When gases diffuse, their rate of diffusion is
inversely proportional to square root of the
masses (see Exercise 13.12 ). Can you guess the
explanation from the above answer?
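
The arithmetic of Example 13.6 can be sketched as follows. The variable names are illustrative; the masses are those given in the example.

```python
import math

M_F = 19                       # atomic mass of fluorine, u
m_light = 235 + 6 * M_F        # UF6 with uranium-235, u
m_heavy = 238 + 6 * M_F        # UF6 with uranium-238, u

# At a fixed temperature the speed varies as 1 / sqrt(mass).
speed_ratio = math.sqrt(m_heavy / m_light)
percent_difference = (speed_ratio - 1) * 100
print(m_light, m_heavy, speed_ratio, percent_difference)
```

The difference is well under one per cent, which is why the separation has to be repeated many times.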


cylinder → wicket, molecule → ball.)
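
The result 2V + u can be checked against the general formula for a one-dimensional elastic collision (see Ch. 6). The sketch below is illustrative only: the speeds and masses are made-up numbers, not values from the text, and the function name is a hypothetical helper.

```python
def rebound_velocity(m_ball, v_ball, m_bat, v_bat):
    """Velocity of the ball after a one-dimensional elastic collision."""
    return ((m_ball - m_bat) * v_ball + 2 * m_bat * v_bat) / (m_ball + m_bat)


u, V = 30.0, 10.0   # illustrative speeds in m/s, measured relative to the wicket

# Sign convention: motion away from the wicket is positive.
# The ball approaches at -u, the bat moves towards the ball at +V.
very_massive_bat = rebound_velocity(0.16, -u, 1.0e6, V)
ordinary_bat = rebound_velocity(0.16, -u, 1.2, V)
print(very_massive_bat, ordinary_bat)
```

For the very massive bat the rebound speed approaches 2V + u; with a bat of ordinary mass it is smaller, which bears on part (d).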

13.5 LAW OF EQUIPARTITION OF ENERGY


The kinetic energy of a single molecule is

$\varepsilon_t = \tfrac{1}{2} m v_x^2 + \tfrac{1}{2} m v_y^2 + \tfrac{1}{2} m v_z^2$    (13.22)
For a gas in thermal equilibrium at
temperature T the average value of energy,
denoted by $\langle \varepsilon_t \rangle$, is

Fig. 13.5 Molecules going through a porous wall.

Example 13.7 (a) When a molecule (or
an elastic ball) hits a (massive) wall, it
rebounds with the same speed. When a ball
hits a massive bat held firmly, the same
thing happens. However, when the bat is
moving towards the ball, the ball rebounds
with a different speed. Does the ball move
faster or slower? (Ch. 6 will refresh your
memory on elastic collisions.)
(b) When gas in a cylinder is compressed
by pushing in a piston, its temperature
rises. Guess at an explanation of this in
terms of kinetic theory using (a) above.
(c) What happens when a compressed gas
pushes a piston out and expands? What
would you observe?
(d) Sachin Tendulkar uses a heavy cricket
bat while playing. Does it help him in
any way?

Answer (a) Let the speed of the ball be u relative


to the wicket behind the bat. If the bat is moving
towards the ball with a speed V relative to the
wicket, then the relative speed of the ball to bat

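$\langle \varepsilon_t \rangle = \left\langle \tfrac{1}{2} m v_x^2 \right\rangle + \left\langle \tfrac{1}{2} m v_y^2 \right\rangle + \left\langle \tfrac{1}{2} m v_z^2 \right\rangle = \tfrac{3}{2} k_B T$    (13.23)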

Since there is no preferred direction, Eq. (13.23)


implies

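$\left\langle \tfrac{1}{2} m v_x^2 \right\rangle = \tfrac{1}{2} k_B T,\quad \left\langle \tfrac{1}{2} m v_y^2 \right\rangle = \tfrac{1}{2} k_B T,\quad \left\langle \tfrac{1}{2} m v_z^2 \right\rangle = \tfrac{1}{2} k_B T$    (13.24)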
A molecule free to move in space needs three
coordinates to specify its location. If it is
constrained to move in a plane it needs two; and
if constrained to move along a line, it needs just
one coordinate to locate it. This can also be
expressed in another way. We say that it has
one degree of freedom for motion in a line, two
for motion in a plane and three for motion in
space. Motion of a body as a whole from one
point to another is called translation. Thus, a
molecule free to move in space has three
translational degrees of freedom. Each
translational degree of freedom contributes a
term that contains square of some variable of
motion, e.g., $\tfrac{1}{2} m v_x^2$ and similar terms in
v_y and v_z. In Eq. (13.24) we see that in thermal
equilibrium, the average of each such term is
½ k_B T.


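$\varepsilon_t + \varepsilon_r = \tfrac{1}{2} m v_x^2 + \tfrac{1}{2} m v_y^2 + \tfrac{1}{2} m v_z^2 + \tfrac{1}{2} I_1 \omega_1^2 + \tfrac{1}{2} I_2 \omega_2^2$    (13.25)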


$\varepsilon = \varepsilon_t + \varepsilon_r + \varepsilon_v$    (13.26)
where k is the force constant of the oscillator
and y the vibrational co-ordinate.
Once again the vibrational energy terms in
Eq. (13.26) contain squared terms of vibrational
variables of motion y and dy/dt .
At this point, notice an important feature in
Eq.(13.26). While each translational and
rotational degree of freedom has contributed only
one squared term in Eq.(13.26), one vibrational
mode contributes two squared terms : kinetic
and potential energies.
Each quadratic term occurring in the
expression for energy is a mode of absorption of
energy by the molecule. We have seen that in
thermal equilibrium at absolute temperature T,
for each translational mode of motion, the
average energy is ½ k_B T. A most elegant principle
of classical statistical mechanics (first proved
by Maxwell) states that this is so for each mode
of energy: translational, rotational and
vibrational. That is, in equilibrium, the total
energy is equally distributed in all possible
energy modes, with each mode having an average
energy equal to ½ k_B T. This is known as the
law of equipartition of energy. Accordingly,
each translational and rotational degree of
freedom of a molecule contributes ½ k_B T to the
energy while each vibrational frequency
contributes 2 × ½ k_B T = k_B T, since a vibrational
mode has both kinetic and potential energy
modes.
The proof of the law of equipartition of energy
is beyond the scope of this book. Here we shall
apply the law to predict the specific heats of
gases theoretically. Later we shall also discuss
briefly, the application to specific heat of solids.
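
The counting rule of the law can be written out as a small calculation. This is a sketch under the assumptions of the text (½ k_B T per quadratic term, two quadratic terms per vibrational mode); the function and variable names are illustrative.

```python
K_B = 1.38e-23  # Boltzmann constant in J/K


def average_energy(temperature_k, translational=3, rotational=0, vibrational=0):
    """Average energy per molecule from the law of equipartition."""
    # Each vibrational mode carries a kinetic and a potential quadratic term.
    quadratic_terms = translational + rotational + 2 * vibrational
    return 0.5 * quadratic_terms * K_B * temperature_k


T = 300.0
e_monatomic = average_energy(T)                               # (3/2) k_B T
e_rigid_diatomic = average_energy(T, rotational=2)            # (5/2) k_B T
e_vibrating_diatomic = average_energy(T, rotational=2, vibrational=1)  # (7/2) k_B T
print(e_monatomic, e_rigid_diatomic, e_vibrating_diatomic)
```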


Molecules of a monatomic gas like argon have
only translational degrees of freedom. But what
about a diatomic gas such as O2 or N2? A
molecule of O2 has three translational degrees
of freedom. But in addition it can also rotate
about its centre of mass. Figure 13.6 shows the
two independent axes of rotation 1 and 2, normal
to the axis joining the two oxygen atoms about
which the molecule can rotate*. The molecule
thus has two rotational degrees of freedom, each
of which contributes a term to the total energy
consisting of translational energy ε_t and
rotational energy ε_r.

Fig. 13.6 The two independent axes of rotation of a diatomic molecule

where ω1 and ω2 are the angular speeds about
the axes 1 and 2 and I1, I2 are the corresponding
moments of inertia. Note that each rotational
degree of freedom contributes a term to the
energy that contains square of a rotational
variable of motion.
We have assumed above that the O2 molecule
is a rigid rotator, i.e. the molecule does not
vibrate. This assumption, though found to be
true (at moderate temperatures) for O2, is not
always valid. Molecules like CO even at moderate
temperatures have a mode of vibration, i.e. its
atoms oscillate along the interatomic axis like
a one-dimensional oscillator, and contribute a
vibrational energy term ε_v to the total energy:
$\varepsilon_v = \tfrac{1}{2} m \left(\dfrac{dy}{dt}\right)^2 + \tfrac{1}{2} k y^2$

13.6 SPECIFIC HEAT CAPACITY


13.6.1 Monatomic Gases
The molecule of a monatomic gas has only three
translational degrees of freedom. Thus, the
average energy of a molecule at temperature
T is (3/2)kBT . The total internal energy of a
mole of such a gas is

* Rotation along the line joining the atoms has very small moment of inertia and does not come into play for
quantum mechanical reasons. See end of section 13.6.

$U = \tfrac{3}{2} k_B T \times N_A = \tfrac{3}{2} RT$    (13.27)

The molar specific heat at constant volume,
C_v, is
$C_v$ (monatomic gas) $= \dfrac{dU}{dT} = \tfrac{3}{2} R$    (13.28)

For an ideal gas,
$C_p - C_v = R$    (13.29)
where C_p is the molar specific heat at constant
pressure. Thus,
$C_p = \tfrac{5}{2} R$    (13.30)

The ratio of specific heats
$\gamma = \dfrac{C_p}{C_v} = \dfrac{5}{3}$    (13.31)

13.6.2 Diatomic Gases

$\gamma = \dfrac{4 + f}{3 + f}$    (13.36)

Note that C_p − C_v = R is true for any ideal
gas, whether mono, di or polyatomic.
Table 13.1 summarises the theoretical
predictions for specific heats of gases ignoring
any vibrational modes of motion. The values are
in good agreement with experimental values of
specific heats of several gases given in Table 13.2.
Of course, there are discrepancies between
predicted and actual values of specific heats of
several other gases (not shown in the table), such
as Cl2, C2H6 and many other polyatomic gases.
Usually, the experimental values for specific
heats of these gases are greater than the
predicted values given in Table 13.1, suggesting
that the agreement can be improved by including
vibrational modes of motion in the calculation.
The law of equipartition of energy is thus well


As explained earlier, a diatomic molecule treated


as a rigid rotator like a dumbbell has 5 degrees
of freedom : 3 translational and 2 rotational.
Using the law of equipartition of energy, the total
internal energy of a mole of such a gas is

i.e. Cv = (3 + f ) R, Cp = (4 + f ) R,

$U = \tfrac{5}{2} k_B T \times N_A = \tfrac{5}{2} RT$    (13.32)
The molar specific heats are then given by
$C_v$ (rigid diatomic) $= \tfrac{5}{2} R$, $C_p = \tfrac{7}{2} R$    (13.33)
$\gamma$ (rigid diatomic) $= \tfrac{7}{5}$    (13.34)
If the diatomic molecule is not rigid but has
in addition a vibrational mode
$U = \left(\tfrac{5}{2} k_B T + k_B T\right) N_A = \tfrac{7}{2} RT$
$C_v = \tfrac{7}{2} R,\quad C_p = \tfrac{9}{2} R,\quad \gamma = \tfrac{9}{7}$    (13.35)

13.6.3 Polyatomic Gases

In general a polyatomic molecule has 3


translational, 3 rotational degrees of freedom
and a certain number ( f ) of vibrational modes.
According to the law of equipartition of energy,
it is easily seen that one mole of such a gas has
$U = \left(\tfrac{3}{2} k_B T + \tfrac{3}{2} k_B T + f\, k_B T\right) N_A$
Table 13.1 Predicted values of specific heat capacities of gases (ignoring vibrational modes)

Nature of Gas   Cv (J mol⁻¹ K⁻¹)   Cp (J mol⁻¹ K⁻¹)   Cp − Cv (J mol⁻¹ K⁻¹)   γ
Monatomic       12.5               20.8               8.31                    1.67
Diatomic        20.8               29.1               8.31                    1.40
Triatomic       24.93              33.24              8.31                    1.33
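
The entries of Table 13.1 follow directly from counting degrees of freedom. The sketch below assumes, as the table does, that vibrational modes are ignored and that the triatomic molecule has three rotational degrees of freedom; the function name is illustrative.

```python
R = 8.31  # universal gas constant in J mol^-1 K^-1


def molar_heats(translational=3, rotational=0, vibrational=0):
    """Return (Cv, Cp, gamma) predicted by the law of equipartition."""
    cv = 0.5 * (translational + rotational + 2 * vibrational) * R
    cp = cv + R                      # Cp - Cv = R for any ideal gas
    return cv, cp, cp / cv


table = {}
for name, rotational in [("Monatomic", 0), ("Diatomic", 2), ("Triatomic", 3)]:
    table[name] = molar_heats(rotational=rotational)
    cv, cp, gamma = table[name]
    print(f"{name:10s} {cv:6.2f} {cp:6.2f} {cp - cv:5.2f} {gamma:5.2f}")
```

Passing a non-zero `vibrational` count gives the larger values discussed for non-rigid molecules.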

Table 13.2 Measured values of specific heat capacities of some gases

Example 13.8 A cylinder of fixed capacity
44.8 litres contains helium gas at standard
temperature and pressure. What is the
amount of heat needed to raise the
temperature of the gas in the cylinder by
15.0 °C? (R = 8.31 J mol⁻¹ K⁻¹).

13.6.5 Specific Heat Capacity of Water


We treat water like a solid. For each atom the average
energy is 3 k_B T. A water molecule has three atoms,
two hydrogen and one oxygen. So it has
$U = 3 \times 3 k_B T \times N_A = 9RT$
and $C = \Delta Q/\Delta T = \Delta U/\Delta T = 9R$.
This is the value observed and the agreement
is very good. In the calorie, gram, degree units,
water is defined to have unit specific heat. As 1
calorie = 4.179 joules and one mole of water
is 18 grams, the heat capacity per mole is
~ 75 J mol⁻¹ K⁻¹ ~ 9R. However with more
complex molecules like alcohol or acetone the
arguments, based on degrees of freedom, become
more complicated.
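
The comparison between 9R and the measured heat capacity of water is a one-line conversion. A minimal sketch, using only the numbers quoted above (the variable names are illustrative):

```python
R = 8.31                    # J mol^-1 K^-1
JOULES_PER_CALORIE = 4.179  # value quoted in the text
MOLAR_MASS_WATER = 18.0     # grams per mole

# Water has unit specific heat: 1 cal per gram per degree.
measured_molar_heat = 1.0 * JOULES_PER_CALORIE * MOLAR_MASS_WATER
predicted_molar_heat = 9 * R
print(measured_molar_heat, predicted_molar_heat)
```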
Lastly, we should note an important aspect
of the predictions of specific heats, based on the
classical law of equipartition of energy. The
predicted specific heats are independent of
temperature. As we go to low temperatures,
however, there is a marked departure from this
prediction. Specific heats of all substances
approach zero as T → 0. This is related to the
fact that degrees of freedom get frozen and
ineffective at low temperatures. According to
classical physics degrees of freedom must
remain unchanged at all times. The behaviour
of specific heats at low temperatures shows the
inadequacy of classical physics and can be
explained only by invoking quantum
considerations, as was first shown by Einstein.
Quantum mechanics requires a minimum,
nonzero amount of energy before a degree of
freedom comes into play. This is also the reason
why vibrational degrees of freedom come into
play only in some cases.


Answer Using the gas law PV = μRT, you can
easily show that 1 mol of any (ideal) gas at
standard temperature (273 K) and pressure
(1 atm = 1.01 × 10⁵ Pa) occupies a volume of
22.4 litres. This universal volume is called molar
volume. Thus the cylinder in this example
contains 2 mol of helium. Further, since helium
is monatomic, its predicted (and observed) molar
specific heat at constant volume, Cv = (3/2) R,
and molar specific heat at constant pressure,
Cp = (3/2) R + R = (5/2) R. Since the volume of
the cylinder is fixed, the heat required is
determined by Cv. Therefore,
Heat required = no. of moles × molar specific
heat × rise in temperature
= 2 × 1.5 R × 15.0 = 45 R
= 45 × 8.31 = 374 J.
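
The same steps can be laid out as a short calculation. This is only a sketch of the arithmetic in the answer; the variable names are illustrative.

```python
R = 8.31                   # J mol^-1 K^-1
MOLAR_VOLUME_LITRES = 22.4  # volume of 1 mol of ideal gas at STP

moles = 44.8 / MOLAR_VOLUME_LITRES   # amount of helium in the cylinder
cv = 1.5 * R                         # monatomic gas, constant volume
heat_required = moles * cv * 15.0    # joules, for a 15.0 degree rise
print(moles, heat_required)
```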

As Table 13.3 shows the prediction generally


agrees with experimental values at ordinary
temperature (Carbon is an exception).

verified experimentally at ordinary temperatures.

13.6.4 Specific Heat Capacity of Solids


We can use the law of equipartition of energy to
determine specific heats of solids. Consider a
solid of N atoms, each vibrating about its mean
position. An oscillation in one dimension has
average energy of 2 × ½ k_B T = k_B T. In three
dimensions, the average energy is 3 k_B T. For a
mole of solid, N = N_A, and the total
energy is
$U = 3 k_B T \times N_A = 3RT$
Now at constant pressure ΔQ = ΔU + PΔV
= ΔU, since for a solid ΔV is negligible. Hence,
$C = \dfrac{\Delta Q}{\Delta T} = \dfrac{\Delta U}{\Delta T} = 3R$    (13.37)

Table 13.3 Specific Heat Capacity of some


solids at room temperature and
atmospheric pressure

13.7 MEAN FREE PATH

Molecules in a gas have rather large speeds of


the order of the speed of sound. Yet a gas leaking
from a cylinder in a kitchen takes considerable
time to diffuse to the other corners of the room.
The top of a cloud of smoke holds together for
hours. This happens because molecules in a gas
have a finite though small size, so they are bound
to undergo collisions. As a result, they cannot


Seeing is Believing

will collide with it (see Fig. 13.7). If n is the
number of molecules per unit volume, the
molecule suffers nπd² <v> Δt collisions in time
Δt. Thus the rate of collisions is nπd² <v> or the
time between two successive collisions is on the
average,
$\tau = 1/(n \pi \langle v \rangle d^2)$    (13.38)
The average distance between two successive
collisions, called the mean free path l, is:
$l = \langle v \rangle \tau = 1/(n \pi d^2)$    (13.39)
In this derivation, we imagined the other
molecules to be at rest. But actually all molecules
are moving and the collision rate is determined
by the average relative velocity of the molecules.
Thus we need to replace <v> by <v_r> in Eq.
(13.38). A more exact treatment gives


move straight unhindered; their paths keep


getting incessantly deflected.


Can one see atoms rushing about? Almost but not quite. One can see pollen grains of a flower being
pushed around by molecules of water. The size of the grain is ~ 10⁻⁵ m. In 1827, a Scottish botanist
Robert Brown, while examining, under a microscope, pollen grains of a flower suspended in water
noticed that they continuously moved about in a zigzag, random fashion.
Kinetic theory provides a simple explanation of the phenomenon. Any object suspended in water is
continuously bombarded from all sides by the water molecules. Since the motion of molecules is random,
the number of molecules hitting the object in any direction is about the same as the number hitting in
the opposite direction. The small difference between these molecular hits is negligible compared to the
total number of hits for an object of ordinary size, and we do not notice any movement of the object.
When the object is sufficiently small but still visible under a microscope, the difference in molecular
hits from different directions is not altogether negligible, i.e. the impulses and the torques given to the
suspended object through continuous bombardment by the molecules of the medium (water or some
other fluid) do not exactly sum to zero. There is a net impulse and torque in this or that direction. The
suspended object thus moves about in a zigzag manner and tumbles about randomly. This motion,
now called Brownian motion, is a visible proof of molecular activity. In the last 50 years or so molecules
have been seen by scanning tunneling and other special microscopes.
In 1987 Ahmed Zewail, an Egyptian scientist working in the USA, was able to observe not only the
molecules but also their detailed interactions. He did this by illuminating them with flashes of laser
light for very short durations, of the order of tens of femtoseconds, and photographing them (1 femtosecond = 10⁻¹⁵ s). One could study even the formation and breaking of chemical bonds. That is really
seeing!

Fig. 13.7 The volume swept by a molecule in time Δt in which any molecule will collide with it.

Suppose the molecules of a gas are spheres


of diameter d. Focus on a single molecule with
the average speed <v>. It will suffer collision with
any molecule that comes within a distance d
between the centres. In time Δt, it sweeps a
volume πd² <v> Δt wherein any other molecule

$l = \dfrac{1}{\sqrt{2}\, n \pi d^2}$    (13.40)

Let us estimate l and τ for air molecules with
average speeds <v> = (485 m/s). At STP
$n = \dfrac{6.02 \times 10^{23}}{22.4 \times 10^{-3}} = 2.7 \times 10^{25}\ \mathrm{m}^{-3}$.
Taking d = 2 × 10⁻¹⁰ m,
τ = 6.1 × 10⁻¹⁰ s
and l = 2.9 × 10⁻⁷ m ≈ 1500 d    (13.41)
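
The estimate can be reproduced numerically. The sketch below is an illustration with hypothetical variable names; it evaluates both the simple expressions, Eqs. (13.38) and (13.39), and the more exact Eq. (13.40). The figures quoted in Eq. (13.41) appear to correspond to the simple expressions, while Eq. (13.40) gives a value smaller by a factor of √2.

```python
import math

N_A = 6.02e23              # Avogadro's number
n = N_A / 22.4e-3          # number density at STP, m^-3
d = 2e-10                  # molecular diameter, m
v_avg = 485.0              # average speed, m/s

l_simple = 1 / (n * math.pi * d**2)   # Eq. (13.39), other molecules at rest
tau_simple = l_simple / v_avg         # Eq. (13.38)
l_exact = l_simple / math.sqrt(2)     # Eq. (13.40), relative motion included
print(n, tau_simple, l_simple, l_exact, l_simple / d)
```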

Answer The d for water vapour is same as that
of air. The number density is inversely
proportional to absolute temperature.
So $n = 2.7 \times 10^{25} \times \dfrac{273}{373} = 2 \times 10^{25}\ \mathrm{m}^{-3}$.
Hence, mean free path $l = 4 \times 10^{-7}$ m.

Note that the mean free path is 100 times the
interatomic distance ~ 40 Å = 4 × 10⁻⁹ m calculated
earlier. It is this large value of mean free path that
leads to the typical gaseous behaviour. Gases
cannot be confined without a container.
Using the kinetic theory of gases, the bulk
measurable properties like viscosity, heat
conductivity and diffusion can be related to the
microscopic parameters like molecular size. It
is through such relations that the molecular
sizes were first estimated.

SUMMARY

1. The ideal gas equation connecting pressure (P), volume (V) and absolute temperature
(T) is
$PV = \mu RT = k_B N T$
where μ is the number of moles and N is the number of molecules. R and k_B are universal
constants.

Example 13.9 Estimate the mean free path
for a water molecule in water vapour at 373 K.
Use information from Exercises 13.1 and
Eq. (13.41) above.

As expected, the mean free path given by


Eq. (13.40) depends inversely on the number
density and the size of the molecules. In a highly
evacuated tube n is rather small and the mean
free path can be as large as the length of the
tube.

R = 8.314 J mol⁻¹ K⁻¹, $k_B = \dfrac{R}{N_A} = 1.38 \times 10^{-23}$ J K⁻¹

Real gases satisfy the ideal gas equation only approximately, more so at low pressures
and high temperatures.

2. Kinetic theory of an ideal gas gives the relation
$P = \tfrac{1}{3}\, n\, m\, \overline{v^2}$
where n is number density of molecules, m the mass of the molecule and $\overline{v^2}$ is the
mean of squared speed. Combined with the ideal gas equation it yields a kinetic
interpretation of temperature.
$\tfrac{1}{2} m \overline{v^2} = \tfrac{3}{2} k_B T, \quad v_{rms} = \left(\overline{v^2}\right)^{1/2} = \sqrt{\dfrac{3 k_B T}{m}}$
This tells us that the temperature of a gas is a measure of the average kinetic energy
of a molecule, independent of the nature of the gas or molecule. In a mixture of gases at
a fixed temperature the heavier molecule has the lower average speed.

3. The translational kinetic energy
$E = \tfrac{3}{2} k_B N T$.
This leads to a relation
$PV = \tfrac{2}{3} E$

4.

The law of equipartition of energy states that if a system is in equilibrium at absolute
temperature T, the total energy is distributed equally in different energy modes of
absorption, the energy in each mode being equal to ½ k_B T. Each translational and
rotational degree of freedom corresponds to one energy mode of absorption and has
energy ½ k_B T. Each vibrational frequency has two modes of energy (kinetic and potential)
with corresponding energy equal to 2 × ½ k_B T = k_B T.

5. Using the law of equipartition of energy, the molar specific heats of gases can be
determined and the values are in agreement with the experimental values of specific
heats of several gases. The agreement can be improved by including vibrational modes
of motion.

6. The mean free path l is the average distance covered by a molecule between two
successive collisions:
$l = \dfrac{1}{\sqrt{2}\, n \pi d^2}$
where n is the number density and d the diameter of the molecule.

POINTS TO PONDER

1. Pressure of a fluid is not only exerted on the wall. Pressure exists everywhere in a fluid.
Any layer of gas inside the volume of a container is in equilibrium because the pressure
is the same on both sides of the layer.

2. We should not have an exaggerated idea of the intermolecular distance in a gas. At
ordinary pressures and temperatures, this is only 10 times or so the interatomic distance
in solids and liquids. What is different is the mean free path which in a gas is 100
times the interatomic distance and 1000 times the size of the molecule.

3. The law of equipartition of energy is stated thus: the energy for each degree of freedom
in thermal equilibrium is ½ k_B T. Each quadratic term in the total energy expression of
a molecule is to be counted as a degree of freedom. Thus, each vibrational mode gives
2 (not 1) degrees of freedom (kinetic and potential energy modes), corresponding to the
energy 2 × ½ k_B T = k_B T.

4. Molecules of air in a room do not all fall and settle on the ground (due to gravity)
because of their high speeds and incessant collisions. In equilibrium, there is a very
slight increase in density at lower heights (like in the atmosphere). The effect is small
since the potential energy (mgh) for ordinary heights is much less than the average
kinetic energy ½ mv² of the molecules.

5. <v²> is not always equal to (<v>)². The average of a squared quantity is not necessarily
the square of the average. Can you find examples for this statement?
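
One simple example for the last point, sketched with made-up speeds (the three values below are illustrative, not data from the text):

```python
speeds = [100.0, 300.0, 500.0]   # illustrative molecular speeds in m/s

mean_speed = sum(speeds) / len(speeds)                  # <v>
mean_square = sum(v * v for v in speeds) / len(speeds)  # <v^2>
print(mean_square, mean_speed**2)
```

The two numbers differ whenever the speeds are not all equal; the gap between them measures the spread of the speeds.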

EXERCISES

13.1 Estimate the fraction of molecular volume to the actual volume occupied by oxygen
gas at STP. Take the diameter of an oxygen molecule to be 3 Å.

13.2 Molar volume is the volume occupied by 1 mol of any (ideal) gas at standard
temperature and pressure (STP : 1 atmospheric pressure, 0 °C). Show that it is 22.4
litres.

13.3 Figure 13.8 shows plot of PV/T versus P for 1.00 × 10⁻³ kg of oxygen gas at two
different temperatures.

Fig. 13.8 [Plot of PV/T (J K⁻¹) versus P at two temperatures, T1 and T2.]

(a) What does the dotted plot signify?
(b) Which is true: T1 > T2 or T1 < T2?
(c) What is the value of PV/T where the curves meet on the y-axis?
(d) If we obtained similar plots for 1.00 × 10⁻³ kg of hydrogen, would we get the same
value of PV/T at the point where the curves meet on the y-axis? If not, what mass
of hydrogen yields the same value of PV/T (for low pressure, high temperature
region of the plot)? (Molecular mass of H2 = 2.02 u, of O2 = 32.0 u,
R = 8.31 J mol⁻¹ K⁻¹.)

13.4 An oxygen cylinder of volume 30 litres has an initial gauge pressure of 15 atm and
a temperature of 27 °C. After some oxygen is withdrawn from the cylinder, the gauge
pressure drops to 11 atm and its temperature drops to 17 °C. Estimate the mass of
oxygen taken out of the cylinder (R = 8.31 J mol⁻¹ K⁻¹, molecular mass of O2 = 32 u).

13.5 An air bubble of volume 1.0 cm³ rises from the bottom of a lake 40 m deep at a
temperature of 12 °C. To what volume does it grow when it reaches the surface,
which is at a temperature of 35 °C?

13.6 Estimate the total number of air molecules (inclusive of oxygen, nitrogen, water
vapour and other constituents) in a room of capacity 25.0 m³ at a temperature of
27 °C and 1 atm pressure.

13.7 Estimate the average thermal energy of a helium atom at (i) room temperature
(27 °C), (ii) the temperature on the surface of the Sun (6000 K), (iii) the temperature
of 10 million kelvin (the typical core temperature in the case of a star).

13.8 Three vessels of equal capacity have gases at the same temperature and pressure.
The first vessel contains neon (monatomic), the second contains chlorine (diatomic),
and the third contains uranium hexafluoride (polyatomic). Do the vessels contain
equal number of respective molecules? Is the root mean square speed of molecules
the same in the three cases? If not, in which case is v_rms the largest?

13.9 At what temperature is the root mean square speed of an atom in an argon gas
cylinder equal to the rms speed of a helium gas atom at 20 °C? (atomic mass of Ar
= 39.9 u, of He = 4.0 u).

13.10 Estimate the mean free path and collision frequency of a nitrogen molecule in a
cylinder containing nitrogen at 2.0 atm and temperature 17 °C. Take the radius of a
nitrogen molecule to be roughly 1.0 Å. Compare the collision time with the time the
molecule moves freely between two successive collisions (Molecular mass of N2 =
28.0 u).

Additional Exercises
13.11 A metre long narrow bore held horizontally (and closed at one end) contains a 76 cm
long mercury thread, which traps a 15 cm column of air. What happens if the tube
is held vertically with the open end at the bottom?

13.12 From a certain apparatus, the diffusion rate of hydrogen has an average value of
28.7 cm³ s⁻¹. The diffusion of another gas under the same conditions is measured to
have an average rate of 7.2 cm³ s⁻¹. Identify the gas.
[Hint : Use Graham's law of diffusion: R1/R2 = (M2/M1)^(1/2), where R1, R2 are diffusion
rates of gases 1 and 2, and M1 and M2 their respective molecular masses. The law is
a simple consequence of kinetic theory.]

13.13 A gas in equilibrium has uniform density and pressure throughout its volume. This
is strictly true only if there are no external influences. A gas column under gravity,
for example, does not have uniform density (and pressure). As you might expect, its
density decreases with height. The precise dependence is given by the so-called law
of atmospheres
$n_2 = n_1 \exp\left[-mg\,(h_2 - h_1)/k_B T\right]$
where n2, n1 refer to number density at heights h2 and h1 respectively. Use this
relation to derive the equation for sedimentation equilibrium of a suspension in a
liquid column:
$n_2 = n_1 \exp\left[-mg\, N_A\, (\rho - \rho')\,(h_2 - h_1)/(\rho R T)\right]$
where ρ is the density of the suspended particle, and ρ′ that of surrounding medium.
[N_A is Avogadro's number, and R the universal gas constant.] [Hint : Use Archimedes'
principle to find the apparent weight of the suspended particle.]

13.14 Given below are densities of some solids and liquids. Give rough estimates of the
size of their atoms:

Substance            Atomic Mass (u)   Density (10³ kg m⁻³)
Carbon (diamond)     12.01             2.22
Gold                 197.00            19.32
Nitrogen (liquid)    14.01             1.00
Lithium              6.94              0.53
Fluorine (liquid)    19.00             1.14

[Hint : Assume the atoms to be tightly packed in a solid or liquid phase, and use
the known value of Avogadro's number. You should, however, not take the actual
numbers you obtain for various atomic sizes too literally. Because of the crudeness
of the tight packing approximation, the results only indicate that atomic sizes are in
the range of a few Å.]

Chapter Two

ELECTROSTATIC
POTENTIAL AND
CAPACITANCE
2.1 INTRODUCTION
In Chapters 6 and 8 (Class XI), the notion of potential energy was
introduced. When an external force does work in taking a body from a
point to another against a force like spring force or gravitational force,
that work gets stored as potential energy of the body. When the external
force is removed, the body moves, gaining kinetic energy and losing
an equal amount of potential energy. The sum of kinetic and
potential energies is thus conserved. Forces of this kind are called
conservative forces. Spring force and gravitational force are examples of
conservative forces.
Coulomb force between two (stationary) charges is also a conservative
force. This is not surprising, since both have inverse-square dependence
on distance and differ mainly in the proportionality constants: the
masses in the gravitational law are replaced by charges in Coulomb's
law. Thus, like the potential energy of a mass in a gravitational
field, we can define electrostatic potential energy of a charge in an
electrostatic field.
Consider an electrostatic field E due to some charge configuration.
First, for simplicity, consider the field E due to a charge Q placed at the
origin. Now, imagine that we bring a test charge q from a point R to a
point P against the repulsive force on it due to the charge Q. With reference

to Fig. 2.1, this will happen if Q and q are both positive
or both negative. For definiteness, let us take Q, q > 0.
Two remarks may be made here. First, we assume
that the test charge q is so small that it does not disturb
the original configuration, namely the charge Q at the
origin (or else, we keep Q fixed at the origin by some
unspecified force). Second, in bringing the charge q from
R to P, we apply an external force F_ext just enough to
counter the repulsive electric force F_E (i.e., F_ext = −F_E).
This means there is no net force on or acceleration of
the charge q when it is brought from R to P, i.e., it is
brought with infinitesimally slow constant speed. In
this situation, work done by the external force is the negative of the work
done by the electric force, and gets fully stored in the form of potential
energy of the charge q. If the external force is removed on reaching P, the
electric force will take the charge away from Q; the stored energy (potential
energy) at P is used to provide kinetic energy to the charge q in such a
way that the sum of the kinetic and potential energies is conserved.

FIGURE 2.1 A test charge q (> 0) is moved from the point R to the
point P against the repulsive force on it by the charge Q (> 0)
placed at the origin.
Thus, work done by external forces in moving a charge q from R to P is
$W_{RP} = \int_R^P \mathbf{F}_{ext} \cdot d\mathbf{r} = -\int_R^P \mathbf{F}_E \cdot d\mathbf{r}$    (2.1)

This work done is against electrostatic repulsive force and gets stored
as potential energy.
At every point in an electric field, a particle with charge q possesses a
certain electrostatic potential energy; this work done increases its potential
energy by an amount equal to the potential energy difference between points
R and P.
Thus, potential energy difference
$\Delta U = U_P - U_R = W_{RP}$    (2.2)

(Note here that this displacement is in an opposite sense to the electric
force and hence work done by electric field is negative, i.e., $-W_{RP}$.)
Therefore, we can define electric potential energy difference between
two points as the work required to be done by an external force in moving
(without accelerating ) charge q from one point to another for electric field
of any arbitrary charge configuration.
Two important comments may be made at this stage:
(i) The right side of Eq. (2.2) depends only on the initial and final positions
of the charge. It means that the work done by an electrostatic field in
moving a charge from one point to another depends only on the initial
and the final points and is independent of the path taken to go from
one point to the other. This is the fundamental characteristic of a
conservative force. The concept of the potential energy would not be
meaningful if the work depended on the path. The path-independence
of work done by an electrostatic field can be proved using
Coulomb's law. We omit this proof here.

(ii) Equation (2.2) defines potential energy difference in terms
of the physically meaningful quantity work. Clearly,
potential energy so defined is undetermined to within an
additive constant. What this means is that the actual value
of potential energy is not physically significant; it is only
the difference of potential energy that is significant. We can
always add an arbitrary constant $\alpha$ to potential energy at
every point, since this will not change the potential energy
difference:
$(U_P + \alpha) - (U_R + \alpha) = U_P - U_R$
Put differently, there is a freedom in choosing the point
where potential energy is zero. A convenient choice is to have
electrostatic potential energy zero at infinity. With this choice,
if we take the point R at infinity, we get from Eq. (2.2)
$W_{\infty P} = U_P - U_\infty = U_P$    (2.3)
Since the point P is arbitrary, Eq. (2.3) provides us with a
definition of potential energy of a charge q at any point.
Potential energy of charge q at a point (in the presence of field
due to any charge configuration) is the work done by the
external force (equal and opposite to the electric force) in
bringing the charge q from infinity to that point.

2.2 ELECTROSTATIC POTENTIAL

Consider any general static charge configuration. We define
potential energy of a test charge q in terms of the work done
on the charge q. This work is obviously proportional to q, since
the force at any point is qE, where E is the electric field at that
point due to the given charge configuration. It is, therefore,
convenient to divide the work by the amount of charge q, so
that the resulting quantity is independent of q. In other words,
work done per unit test charge is characteristic of the electric
field associated with the charge configuration. This leads to
the idea of electrostatic potential V due to a given charge
configuration. From Eq. (2.1), we get:
Work done by external force in bringing a unit positive
charge from point R to P
$= V_P - V_R = \dfrac{U_P - U_R}{q}$    (2.4)
where $V_P$ and $V_R$ are the electrostatic potentials at P and R, respectively.

Count Alessandro Volta (1745–1827) Italian physicist, professor at
Pavia. Volta established that the animal electricity observed by Luigi
Galvani, 1737–1798, in experiments with frog muscle tissue placed in
contact with dissimilar metals, was not due to any exceptional property
of animal tissues but was also generated whenever any wet body
was sandwiched between dissimilar metals. This led him to develop the
first voltaic pile, or battery, consisting of a large stack of moist disks
of cardboard (electrolyte) sandwiched between disks of metal
(electrodes).

Note, as before, that it is not the actual value of potential but the potential
difference that is physically significant. If, as before, we choose the
potential to be zero at infinity, Eq. (2.4) implies:
Work done by an external force in bringing a unit positive charge
from infinity to a point = electrostatic potential (V) at that point.

In other words, the electrostatic potential (V )
at any point in a region with electrostatic field is
the work done in bringing a unit positive
charge (without acceleration) from infinity to
that point.
The qualifying remarks made earlier regarding
potential energy also apply to the definition of
potential. To obtain the work done per unit test
charge, we should take an infinitesimal test charge
$\delta q$, obtain the work done $\delta W$ in bringing it from
infinity to the point and determine the ratio
$\delta W/\delta q$. Also, the external force at every point of
the path is to be equal and opposite to the
electrostatic force on the test charge at that point.

FIGURE 2.2 Work done on a test charge q


by the electrostatic field due to any given
charge configuration is independent
of the path, and depends only on
its initial and final positions.
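
The path independence stated in the caption above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a point charge Q at the origin, takes the field from Coulomb's law, and sums the work done by the field on a test charge q along two different paths joining the same end points R and P. The function names, the charge values, the rounded constant K and the two paths are illustrative assumptions.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def field(Q, x, y):
    """Field (Ex, Ey) at (x, y) due to a point charge Q at the origin."""
    r = math.hypot(x, y)
    return K * Q * x / r**3, K * Q * y / r**3


def work_by_field(Q, q, path, steps=20000):
    """Work done by the field on charge q along path(t), 0 <= t <= 1."""
    total = 0.0
    x0, y0 = path(0.0)
    for i in range(1, steps + 1):
        x1, y1 = path(i / steps)
        # Midpoint rule: evaluate the field at the centre of each small step.
        ex, ey = field(Q, 0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += q * (ex * (x1 - x0) + ey * (y1 - y0))
        x0, y0 = x1, y1
    return total


def segment(p, s, t):
    """Point a fraction t of the way from p to s."""
    return (p[0] + t * (s[0] - p[0]), p[1] + t * (s[1] - p[1]))


R, P, CORNER = (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)  # coordinates in metres


def straight(t):
    return segment(R, P, t)


def detour(t):
    # First half of the path: R to CORNER; second half: CORNER to P.
    return segment(R, CORNER, 2 * t) if t < 0.5 else segment(CORNER, P, 2 * t - 1)


Q, q = 1e-6, 1e-9  # illustrative source and test charges, in coulomb
w_straight = work_by_field(Q, q, straight)
w_detour = work_by_field(Q, q, detour)
# Closed form that depends only on the distances of R and P from the charge.
w_expected = K * Q * q * (1 / math.hypot(*R) - 1 / math.hypot(*P))
print(w_straight, w_detour, w_expected)
```

Both numerical values should agree with the closed-form expression to within the discretisation error, whichever route is taken.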

2.3 POTENTIAL DUE TO A POINT CHARGE

Consider a point charge Q at the origin (Fig. 2.3). For definiteness, take Q
to be positive. We wish to determine the potential at any point P with
position vector r from the origin. For that we must
calculate the work done in bringing a unit positive
test charge from infinity to the point P. For Q > 0,
the work done against the repulsive force on the
test charge is positive. Since work done is
independent of the path, we choose a convenient
path along the radial direction from infinity to
the point P.
At some intermediate point P′ on the path, the
electrostatic force on a unit positive charge is
$\dfrac{Q \times 1}{4\pi\varepsilon_0 r'^2}\,\hat{\mathbf{r}}'$    (2.5)
where $\hat{\mathbf{r}}'$ is the unit vector along OP′. Work done
against this force from r′ to r′ + Δr′ is
$\Delta W = -\dfrac{Q}{4\pi\varepsilon_0 r'^2}\,\Delta r'$    (2.6)
The negative sign appears because for Δr′ < 0, ΔW is positive. Total
work done (W) by the external force is obtained by integrating Eq. (2.6)
from r′ = ∞ to r′ = r,
$W = -\int_\infty^r \dfrac{Q}{4\pi\varepsilon_0 r'^2}\,dr' = \left.\dfrac{Q}{4\pi\varepsilon_0 r'}\right|_\infty^r = \dfrac{Q}{4\pi\varepsilon_0 r}$    (2.7)
This, by definition, is the potential at P due to the charge Q
$V(r) = \dfrac{Q}{4\pi\varepsilon_0 r}$    (2.8)

FIGURE 2.3 Work done in bringing a unit positive test charge from
infinity to the point P, against the repulsive force of charge Q (Q > 0),
is the potential at P due to the charge Q.
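
As an illustrative check of Eqs. (2.6) to (2.8), the sketch below adds up the small contributions ΔW of Eq. (2.6) while a unit positive charge is moved inwards from a very large distance to r, and compares the sum with Eq. (2.8). It is a sketch under stated assumptions, not the text's own method: the function name, the rounded constant K, the finite starting distance (a million times r, standing in for infinity) and the sample values of Q and r are all assumptions.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def potential_by_summation(Q, r, far_factor=1e6, steps=100000):
    """Add up Eq. (2.6) step by step from r' = far_factor * r down to r' = r."""
    ratio = far_factor ** (1.0 / steps)  # geometric grid: finer steps near the charge
    total = 0.0
    r_outer = far_factor * r
    for _ in range(steps):
        r_inner = r_outer / ratio
        r_mid = 0.5 * (r_inner + r_outer)
        delta_r = r_inner - r_outer  # negative, because the charge moves inwards
        total += -K * Q / r_mid**2 * delta_r  # Eq. (2.6) for a unit test charge
        r_outer = r_inner
    return total


Q, r = 1e-9, 0.5  # illustrative values: 1 nC at a distance of 0.5 m
v_numeric = potential_by_summation(Q, r)
v_formula = K * Q / r  # Eq. (2.8)
print(v_numeric, v_formula)
```

The small remaining difference comes mainly from starting at a finite distance instead of at infinity; it shrinks as `far_factor` is increased.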

Equation (2.8) is true for any sign of the charge Q, though we
considered Q > 0 in its derivation. For Q < 0, V < 0, i.e., work done (by
the external force) per unit positive test charge in bringing it from
infinity to the point is negative. This is equivalent to saying that work
done by the electrostatic force in bringing the unit positive charge
from infinity to the point P is positive. [This is as it should be,
since for Q < 0, the force on a unit positive test charge is attractive, so
that the electrostatic force and the displacement (from infinity to P) are
in the same direction.] Finally, we note that Eq. (2.8) is consistent with
the choice that potential at infinity be zero.
Figure (2.4) shows how the electrostatic potential ($\propto 1/r$) and the
electrostatic field ($\propto 1/r^2$) vary with r.

FIGURE 2.4 Variation of potential V with r [in units of
$(Q/4\pi\varepsilon_0)$ m$^{-1}$] (blue curve) and field with r [in units
of $(Q/4\pi\varepsilon_0)$ m$^{-2}$] (black curve) for a point charge Q.

Example 2.1
(a) Calculate the potential at a point P due to a charge of $4 \times 10^{-7}$ C
located 9 cm away.
(b) Hence obtain the work done in bringing a charge of $2 \times 10^{-9}$ C
from infinity to the point P. Does the answer depend on the path
along which the charge is brought?
Solution
(a) $V = \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{r} = 9 \times 10^{9}\ \mathrm{N\,m^2\,C^{-2}} \times \dfrac{4 \times 10^{-7}\ \mathrm{C}}{0.09\ \mathrm{m}} = 4 \times 10^{4}$ V
(b) $W = qV = 2 \times 10^{-9}\ \mathrm{C} \times 4 \times 10^{4}\ \mathrm{V} = 8 \times 10^{-5}$ J
No, work done will be path independent. Any arbitrary infinitesimal
path can be resolved into two perpendicular displacements: one along
r and another perpendicular to r. The work done corresponding to
the latter will be zero.

2.4 POTENTIAL DUE TO AN ELECTRIC DIPOLE

As we learnt in the last chapter, an electric dipole consists of two charges
q and −q separated by a (small) distance 2a. Its total charge is zero. It is
characterised by a dipole moment vector p whose magnitude is q × 2a
and which points in the direction from −q to q (Fig. 2.5). We also saw that
the electric field of a dipole at a point with position vector r depends not
just on the magnitude r, but also on the angle between r and p. Further,
the field falls off, at large distance, not as
1/r 2 (typical of field due to a single charge)
but as 1/r 3. We, now, determine the electric
potential due to a dipole and contrast it
with the potential due to a single charge.
As before, we take the origin at the
centre of the dipole. Now we know that the
electric field obeys the superposition
principle. Since potential is related to the
work done by the field, electrostatic
potential also follows the superposition
principle. Thus, the potential due to the
dipole is the sum of potentials due to the
charges q and −q
$V = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q}{r_1} - \dfrac{q}{r_2}\right)$    (2.9)
where $r_1$ and $r_2$ are the distances of the
point P from q and −q, respectively.

FIGURE 2.5 Quantities involved in the calculation
of potential due to a dipole.

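Now, by geometry,
$r_1^2 = r^2 + a^2 - 2ar\cos\theta$
$r_2^2 = r^2 + a^2 + 2ar\cos\theta$    (2.10)
We take r much greater than a (r >> a) and retain terms only up to
the first order in a/r
$r_1^2 = r^2\left(1 - \dfrac{2a\cos\theta}{r} + \dfrac{a^2}{r^2}\right) \cong r^2\left(1 - \dfrac{2a\cos\theta}{r}\right)$    (2.11)
Similarly,
$r_2^2 \cong r^2\left(1 + \dfrac{2a\cos\theta}{r}\right)$    (2.12)
Using the Binomial theorem and retaining terms up to the first order
in a/r, we obtain,
$\dfrac{1}{r_1} \cong \dfrac{1}{r}\left(1 - \dfrac{2a\cos\theta}{r}\right)^{-1/2} \cong \dfrac{1}{r}\left(1 + \dfrac{a}{r}\cos\theta\right)$    [2.13(a)]
$\dfrac{1}{r_2} \cong \dfrac{1}{r}\left(1 + \dfrac{2a\cos\theta}{r}\right)^{-1/2} \cong \dfrac{1}{r}\left(1 - \dfrac{a}{r}\cos\theta\right)$    [2.13(b)]
Using Eqs. (2.9) and (2.13) and p = 2qa, we get
$V = \dfrac{q}{4\pi\varepsilon_0}\,\dfrac{2a\cos\theta}{r^2} = \dfrac{p\cos\theta}{4\pi\varepsilon_0 r^2}$    (2.14)
Now, $p\cos\theta = \mathbf{p}\cdot\hat{\mathbf{r}}$
where $\hat{\mathbf{r}}$ is the unit vector along the position vector OP.
The electric potential of a dipole is then given by
$V = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{\mathbf{p}\cdot\hat{\mathbf{r}}}{r^2}$;    (r >> a)    (2.15)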

Equation (2.15) is, as indicated, approximately true only for distances


large compared to the size of the dipole, so that higher order terms in
a/r are negligible. For a point dipole p at the origin, Eq. (2.15) is, however,
exact.
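
To see how good the large-distance approximation is, the sketch below compares the exact two-charge potential of Eqs. (2.9) and (2.10) with the dipole formula, Eq. (2.15). It is a minimal illustration: the function names, the rounded constant K and the sample values of q, a, r and the angle are assumptions, not values from the text.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def dipole_exact(q, a, r, theta):
    """Potential of charges +q and -q, a distance 2a apart, from Eqs. (2.9), (2.10)."""
    r1 = math.sqrt(r * r + a * a - 2 * a * r * math.cos(theta))  # distance from +q
    r2 = math.sqrt(r * r + a * a + 2 * a * r * math.cos(theta))  # distance from -q
    return K * (q / r1 - q / r2)


def dipole_far(q, a, r, theta):
    """Large-distance dipole potential, Eq. (2.15), with p = 2qa."""
    p = 2 * q * a
    return K * p * math.cos(theta) / r**2


def relative_error(q, a, r, theta):
    exact = dipole_exact(q, a, r, theta)
    return abs(dipole_far(q, a, r, theta) - exact) / abs(exact)


q, a = 1e-9, 1e-3          # illustrative: 1 nC charges, 2 mm apart
theta = math.pi / 3        # 60 degrees between r and p
err_far = relative_error(q, a, 100 * a, theta)   # r = 100 a
err_near = relative_error(q, a, 2 * a, theta)    # r = 2 a
v_equatorial = dipole_exact(q, a, 100 * a, math.pi / 2)
print(err_far, err_near, v_equatorial)
```

With these values the approximation is very close at r = 100a and noticeably off at r = 2a, which is the sense in which Eq. (2.15) holds only for r >> a.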
From Eq. (2.15), potential on the dipole axis ($\theta = 0, \pi$) is given by
$V = \pm\dfrac{1}{4\pi\varepsilon_0}\,\dfrac{p}{r^2}$    (2.16)
(Positive sign for $\theta = 0$, negative sign for $\theta = \pi$.) The potential in the
equatorial plane ($\theta = \pi/2$) is zero.
The important contrasting features of electric potential of a dipole
from that due to a single charge are clear from Eqs. (2.8) and (2.15):
(i) The potential due to a dipole depends not just on r but also on the
angle between the position vector r and the dipole moment vector p.
(It is, however, axially symmetric about p. That is, if you rotate the
position vector r about p, keeping $\theta$ fixed, the points corresponding
to P on the cone so generated will have the same potential as at P.)
(ii) The electric dipole potential falls off, at large distance, as $1/r^2$, not as
1/r, characteristic of the potential due to a single charge. (You can
refer to Fig. 2.4 for graphs of $1/r^2$ versus r and 1/r versus r,
drawn there in another context.)

2.5 POTENTIAL DUE TO A SYSTEM OF CHARGES

Consider a system of charges $q_1, q_2, \ldots, q_n$ with position vectors
$\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n$ relative to some origin (Fig. 2.6).
The potential $V_1$ at P due to the charge $q_1$ is
$V_1 = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1}{r_{1P}}$
where $r_{1P}$ is the distance between $q_1$ and P.
Similarly, the potential $V_2$ at P due to $q_2$ and
$V_3$ due to $q_3$ are given by
$V_2 = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_2}{r_{2P}}, \quad V_3 = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_3}{r_{3P}}$
where $r_{2P}$ and $r_{3P}$ are the distances of P from
charges $q_2$ and $q_3$, respectively; and so on for the
potential due to other charges. By the
superposition principle, the potential V at P due
to the total charge configuration is the algebraic
sum of the potentials due to the individual
charges
$V = V_1 + V_2 + \ldots + V_n$    (2.17)
$= \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q_1}{r_{1P}} + \dfrac{q_2}{r_{2P}} + \ldots + \dfrac{q_n}{r_{nP}}\right)$    (2.18)

FIGURE 2.6 Potential at a point due to a
system of charges is the sum of potentials
due to individual charges.

If we have a continuous charge distribution characterised by a charge
density $\rho(\mathbf{r})$, we divide it, as before, into small volume elements each of
size $\Delta v$ and carrying a charge $\rho\,\Delta v$. We then determine the potential due
to each volume element and sum (strictly speaking, integrate) over all
such contributions, and thus determine the potential due to the entire
distribution.
We have seen in Chapter 1 that for a uniformly charged spherical shell,
the electric field outside the shell is as if the entire charge is concentrated
at the centre. Thus, the potential outside the shell is given by
$V = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q}{r} \quad (r \geq R)$    [2.19(a)]

where q is the total charge on the shell and R its radius. The electric field
inside the shell is zero. This implies (Section 2.6) that potential is constant
inside the shell (as no work is done in moving a charge inside the shell),
and, therefore, equals its value at the surface, which is
$V = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q}{R}$    [2.19(b)]

Example 2.2 Two charges $3 \times 10^{-8}$ C and $-2 \times 10^{-8}$ C are located
15 cm apart. At what point on the line joining the two charges is the
electric potential zero? Take the potential at infinity to be zero.
Solution Let us take the origin O at the location of the positive charge.
The line joining the two charges is taken to be the x-axis; the negative
charge is taken to be on the right side of the origin (Fig. 2.7).

FIGURE 2.7

Let P be the required point on the x-axis where the potential is zero.
If x is the x-coordinate of P, obviously x must be positive. (There is no
possibility of potentials due to the two charges adding up to zero for
x < 0.) If x lies between O and A (the position of the negative charge), we have
$\dfrac{1}{4\pi\varepsilon_0}\left[\dfrac{3 \times 10^{-8}}{x \times 10^{-2}} - \dfrac{2 \times 10^{-8}}{(15 - x) \times 10^{-2}}\right] = 0$
where x is in cm. That is,
$\dfrac{3}{x} - \dfrac{2}{15 - x} = 0$
which gives x = 9 cm.
If x lies on the extended line OA, the required condition is
$\dfrac{3}{x} - \dfrac{2}{x - 15} = 0$
which gives
x = 45 cm
Thus, electric potential is zero at 9 cm and 45 cm away from the
positive charge on the side of the negative charge. Note that the
formula for potential used in the calculation required choosing
potential to be zero at infinity.
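
The two answers of Example 2.2 can be reproduced with the superposition formula, Eq. (2.18). The sketch below is illustrative only: it evaluates the potential on the x-axis and locates its zeros by bisection. The helper names, the bracketing intervals and the rounded constant K are assumptions chosen for this example.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)

# (charge in coulomb, position on the x-axis in metres), as in Example 2.2
CHARGES = [(3e-8, 0.0), (-2e-8, 0.15)]


def potential_on_axis(x):
    """Superposition, Eq. (2.18), for a point at coordinate x on the axis."""
    return sum(K * q / abs(x - xq) for q, xq in CHARGES)


def bisect_zero(f, lo, hi, iterations=100):
    """Locate a zero of f between lo and hi, where f changes sign."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_between = bisect_zero(potential_on_axis, 0.01, 0.14)  # between the charges
x_beyond = bisect_zero(potential_on_axis, 0.16, 1.00)   # beyond the negative charge
print(x_between * 100, "cm", x_beyond * 100, "cm")
```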
Example 2.3 Figures 2.8 (a) and (b) show the field lines of a positive
and negative point charge respectively.

FIGURE 2.8

(a) Give the signs of the potential difference $V_P - V_Q$; $V_B - V_A$.
(b) Give the sign of the potential energy difference of a small negative
charge between the points Q and P; A and B.
(c) Give the sign of the work done by the field in moving a small
positive charge from Q to P.
(d) Give the sign of the work done by the external agency in moving
a small negative charge from B to A.
(e) Does the kinetic energy of a small negative charge increase or
decrease in going from B to A?
Solution
(a) As $V \propto \dfrac{1}{r}$, $V_P > V_Q$. Thus, $(V_P - V_Q)$ is positive. Also $V_B$ is less negative
than $V_A$. Thus, $V_B > V_A$ or $(V_B - V_A)$ is positive.
(b) A small negative charge will be attracted towards positive charge.
The negative charge moves from higher potential energy to lower
potential energy. Therefore the sign of potential energy difference
of a small negative charge between Q and P is positive.
Similarly, (P.E.)$_A$ > (P.E.)$_B$ and hence sign of potential energy
differences is positive.
(c) In moving a small positive charge from Q to P, work has to be
done by an external agency against the electric field. Therefore,
work done by the field is negative.
(d) In moving a small negative charge from B to A work has to be
done by the external agency. It is positive.
(e) Due to force of repulsion on the negative charge, velocity decreases
and hence the kinetic energy decreases in going from B to A.

Electric potential, equipotential surfaces:
http://video.mit.edu/watch/4-electrostatic-potential-elctric-energy-ev-conservative-fieldequipotential-sufaces-12584/

2.6 EQUIPOTENTIAL SURFACES
An equipotential surface is a surface with a constant value of potential
at all points on the surface. For a single charge q, the potential is given
by Eq. (2.8):
$V = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q}{r}$
This shows that V is a constant if r is constant. Thus, equipotential
surfaces of a single point charge are concentric spherical surfaces centred
at the charge.
Now the electric field lines for a single charge q are radial lines starting
from or ending at the charge, depending on whether q is positive or negative.
Clearly, the electric field at every point is normal to the equipotential surface
passing through that point. This is true in general: for any charge
configuration, equipotential surface through a point is normal to the
electric field at that point. The proof of this statement is simple.
If the field were not normal to the equipotential surface, it would
have non-zero component along the surface. To move a unit test charge
against the direction of the component of the field, work would have to
be done. But this is in contradiction to the definition of an equipotential
surface: there is no potential difference between any two points on the
surface and no work is required to move a test charge on the surface.
The electric field must, therefore, be normal to the equipotential surface
at every point. Equipotential surfaces offer an alternative visual picture
in addition to the picture of electric field lines around a charge
configuration.

FIGURE 2.9 For a single charge q (a) equipotential surfaces are
spherical surfaces centred at the charge, and (b) electric field
lines are radial, starting from the charge if q > 0.

FIGURE 2.10 Equipotential surfaces for a uniform electric field.

For a uniform electric field E, say, along the x -axis, the equipotential
surfaces are planes normal to the x -axis, i.e., planes parallel to the y-z
plane (Fig. 2.10). Equipotential surfaces for (a) a dipole and (b) two
identical positive charges are shown in Fig. 2.11.

FIGURE 2.11 Some equipotential surfaces for (a) a dipole,
(b) two identical positive charges.

2.6.1 Relation between field and potential
Consider two closely spaced equipotential surfaces A and B (Fig. 2.12)
with potential values V and $V + \delta V$, where $\delta V$ is the change in V in the
direction of the electric field E. Let P be a point on the
surface B. $\delta l$ is the perpendicular distance of the
surface A from P. Imagine that a unit positive charge
is moved along this perpendicular from the surface B
to surface A against the electric field. The work done
in this process is $|\mathbf{E}|\,\delta l$.
This work equals the potential difference
$V_A - V_B$.
Thus,
$|\mathbf{E}|\,\delta l = V - (V + \delta V) = -\delta V$
i.e., $|\mathbf{E}| = -\dfrac{\delta V}{\delta l}$    (2.20)
Since $\delta V$ is negative, $\delta V = -|\delta V|$. We can rewrite
Eq. (2.20) as
$|\mathbf{E}| = -\dfrac{\delta V}{\delta l} = +\dfrac{|\delta V|}{\delta l}$    (2.21)

FIGURE 2.12 From the potential to the field.


We thus arrive at two important conclusions concerning the relation


between electric field and potential:
(i) Electric field is in the direction in which the potential decreases
steepest.
(ii) Its magnitude is given by the change in the magnitude of potential
per unit displacement normal to the equipotential surface at the point.
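
Equation (2.21) can be tried out on the one case where both field and potential are known in closed form, the point charge. The sketch below is an illustration under stated assumptions: it estimates $-\delta V/\delta l$ from the potential of Eq. (2.8) using a small radial displacement and compares it with the Coulomb field. The function names, the step size, the rounded constant K and the sample values are assumptions.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def potential(Q, r):
    """Point-charge potential, Eq. (2.8)."""
    return K * Q / r


def field_from_potential(Q, r, dl=1e-6):
    """Estimate |E| = -dV/dl along the radial direction, as in Eq. (2.21)."""
    dV = potential(Q, r + dl / 2) - potential(Q, r - dl / 2)
    return -dV / dl


def coulomb_field(Q, r):
    """Magnitude of the field of a point charge from Coulomb's law."""
    return K * Q / r**2


Q, r = 1e-9, 0.5  # illustrative values: 1 nC at 0.5 m
e_numeric = field_from_potential(Q, r)
e_coulomb = coulomb_field(Q, r)
print(e_numeric, e_coulomb)
```

The potential decreases as r increases (for Q > 0), so the estimate is positive: the field points radially outwards, in the direction of steepest decrease of V.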

2.7 POTENTIAL ENERGY OF A SYSTEM OF CHARGES

Consider first the simple case of two charges $q_1$ and $q_2$ with position vectors
$\mathbf{r}_1$ and $\mathbf{r}_2$ relative to some origin. Let us calculate the work done
(externally) in building up this configuration. This means that we consider
the charges $q_1$ and $q_2$ initially at infinity and determine the work done by
an external agency to bring the charges to the given locations. Suppose,
first the charge $q_1$ is brought from infinity to the point $\mathbf{r}_1$. There is no
external field against which work needs to be done, so work done in
bringing $q_1$ from infinity to $\mathbf{r}_1$ is zero. This charge produces a potential in
space given by
$V_1 = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1}{r_{1P}}$
where $r_{1P}$ is the distance of a point P in space from the location of $q_1$.
From the definition of potential, work done in bringing charge $q_2$ from
infinity to the point $\mathbf{r}_2$ is $q_2$ times the potential at $\mathbf{r}_2$ due to $q_1$:
work done on $q_2 = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1 q_2}{r_{12}}$
where $r_{12}$ is the distance between points 1 and 2.
Since electrostatic force is conservative, this work gets
stored in the form of potential energy of the system. Thus,
the potential energy of a system of two charges $q_1$ and $q_2$ is
$U = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1 q_2}{r_{12}}$    (2.22)

FIGURE 2.13 Potential energy of a system of charges $q_1$ and $q_2$ is
directly proportional to the product of charges and inversely to the
distance between them.

Obviously, if $q_2$ was brought first to its present location and
$q_1$ brought later, the potential energy U would be the same.
More generally, the potential energy expression,
Eq. (2.22), is unaltered whatever way the charges are brought to the specified
locations, because of path-independence of work for electrostatic force.
Equation (2.22) is true for any sign of $q_1$ and $q_2$. If $q_1 q_2 > 0$, potential
energy is positive. This is as expected, since for like charges (q1q2 > 0),
electrostatic force is repulsive and a positive amount of work is needed to
be done against this force to bring the charges from infinity to a finite
distance apart. For unlike charges (q1 q2 < 0), the electrostatic force is
attractive. In that case, a positive amount of work is needed against this
force to take the charges from the given location to infinity. In other words,
a negative amount of work is needed for the reverse path (from infinity to
the present locations), so the potential energy is negative.
Equation (2.22) is easily generalised for a system of any number of
point charges. Let us calculate the potential energy of a system of three
charges q1, q2 and q3 located at r1, r2, r3, respectively. To bring q1 first
from infinity to r1, no work is required. Next we bring q2 from infinity to
r2. As before, work done in this step is
$q_2 V_1(\mathbf{r}_2) = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1 q_2}{r_{12}}$    (2.23)
The charges $q_1$ and $q_2$ produce a potential, which at any point P is
given by
$V_{1,2} = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q_1}{r_{1P}} + \dfrac{q_2}{r_{2P}}\right)$    (2.24)
Work done next in bringing $q_3$ from infinity to the point $\mathbf{r}_3$ is $q_3$ times
$V_{1,2}$ at $\mathbf{r}_3$
$q_3 V_{1,2}(\mathbf{r}_3) = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q_1 q_3}{r_{13}} + \dfrac{q_2 q_3}{r_{23}}\right)$    (2.25)
The total work done in assembling the charges
at the given locations is obtained by adding the work
done in different steps [Eq. (2.23) and Eq. (2.25)],
$U = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q_1 q_2}{r_{12}} + \dfrac{q_1 q_3}{r_{13}} + \dfrac{q_2 q_3}{r_{23}}\right)$    (2.26)
Again, because of the conservative nature of the
electrostatic force (or equivalently, the path
independence of work done), the final expression for
U, Eq. (2.26), is independent of the manner in which
the configuration is assembled. The potential energy
is characteristic of the present state of configuration, and not the way
the state is achieved.

FIGURE 2.14 Potential energy of a system of three charges is given by
Eq. (2.26), with the notation given in the figure.

Example 2.4 Four charges are arranged at the corners of a square
ABCD of side d, as shown in Fig. 2.15. (a) Find the work required to
put together this arrangement. (b) A charge $q_0$ is brought to the centre
E of the square, the four charges being held fixed at its corners. How
much extra work is needed to do this?

FIGURE 2.15

Solution
(a) Since the work done depends on the final arrangement of the
charges, and not on how they are put together, we calculate work
needed for one way of putting the charges at A, B, C and D. Suppose,
first the charge +q is brought to A, and then the charges −q, +q, and
−q are brought to B, C and D, respectively. The total work needed can
be calculated in steps:
(i) Work needed to bring charge +q to A when no charge is present
elsewhere: this is zero.
(ii) Work needed to bring −q to B when +q is at A. This is given by
(charge at B) × (electrostatic potential at B due to charge +q at A)
$= -q \times \left(\dfrac{q}{4\pi\varepsilon_0 d}\right) = -\dfrac{q^2}{4\pi\varepsilon_0 d}$
(iii) Work needed to bring charge +q to C when +q is at A and −q is at
B. This is given by (charge at C) × (potential at C due to charges
at A and B)
$= +q\left(\dfrac{+q}{4\pi\varepsilon_0 d\sqrt{2}} + \dfrac{-q}{4\pi\varepsilon_0 d}\right) = \dfrac{-q^2}{4\pi\varepsilon_0 d}\left(1 - \dfrac{1}{\sqrt{2}}\right)$
(iv) Work needed to bring −q to D when +q is at A, −q at B, and +q at C.
This is given by (charge at D) × (potential at D due to charges at A,
B and C)
$= -q\left(\dfrac{+q}{4\pi\varepsilon_0 d} + \dfrac{-q}{4\pi\varepsilon_0 d\sqrt{2}} + \dfrac{q}{4\pi\varepsilon_0 d}\right) = \dfrac{-q^2}{4\pi\varepsilon_0 d}\left(2 - \dfrac{1}{\sqrt{2}}\right)$
Add the work done in steps (i), (ii), (iii) and (iv). The total work
required is
$= \dfrac{-q^2}{4\pi\varepsilon_0 d}\left\{(0) + (1) + \left(1 - \dfrac{1}{\sqrt{2}}\right) + \left(2 - \dfrac{1}{\sqrt{2}}\right)\right\} = \dfrac{-q^2}{4\pi\varepsilon_0 d}\left(4 - \sqrt{2}\right)$
The work done depends only on the arrangement of the charges, and
not how they are assembled. By definition, this is the total
electrostatic energy of the charges.
(Students may try calculating same work/energy by taking charges
in any other order they desire and convince themselves that the energy
will remain the same.)
(b) The extra work necessary to bring a charge $q_0$ to the point E when
the four charges are at A, B, C and D is $q_0$ × (electrostatic potential at
E due to the charges at A, B, C and D). The electrostatic potential at
E is clearly zero since potential due to A and C is cancelled by that
due to B and D. Hence no work is required to bring any charge to
point E.
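
The step-by-step assembly used in Example 2.4 can be written as a short routine. The sketch below is illustrative: it brings the charges in one at a time, multiplies each charge by the potential of those already in place, and repeats the sum for a different order of assembly. The function name, the coordinates chosen for the square, the rounded constant K and the sample values of q and d are assumptions.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def assembly_work(charges):
    """Work to bring in the charges one by one; charges = [(q, (x, y)), ...]."""
    total = 0.0
    for i, (q_i, pos_i) in enumerate(charges):
        # Potential at pos_i due to the charges that are already in place.
        v = sum(K * q_j / math.dist(pos_i, pos_j) for q_j, pos_j in charges[:i])
        total += q_i * v
    return total


q, d = 1e-6, 0.1  # illustrative: 1 microcoulomb charges, square of side 10 cm
# Corners A, B, C, D taken in order around the square, with alternating signs.
square = [(+q, (0.0, 0.0)), (-q, (d, 0.0)), (+q, (d, d)), (-q, (0.0, d))]

w_abcd = assembly_work(square)
w_other_order = assembly_work([square[2], square[0], square[3], square[1]])
w_formula = -K * q**2 / d * (4 - math.sqrt(2))  # result of part (a)

centre = (d / 2, d / 2)
v_centre = sum(K * q_j / math.dist(centre, pos_j) for q_j, pos_j in square)
print(w_abcd, w_other_order, w_formula, v_centre)
```

Both orders of assembly give the same energy, and the potential at the centre vanishes, so no extra work is needed in part (b).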

2.8 POTENTIAL ENERGY IN AN EXTERNAL FIELD

2.8.1 Potential energy of a single charge

In Section 2.7, the source of the electric field was specified (the charges
and their locations) and the potential energy of the system of those charges
was determined. In this section, we ask a related but distinct question.
What is the potential energy of a charge q in a given field? This question
was, in fact, the starting point that led us to the notion of the electrostatic
potential (Sections 2.1 and 2.2). But here we address this question again
to clarify in what way it is different from the discussion in Section 2.7.
The main difference is that we are now concerned with the potential
energy of a charge (or charges) in an external field. The external field E is
not produced by the given charge(s) whose potential energy we wish to
calculate. E is produced by sources external to the given charge(s). The
external sources may be known, but often they are unknown or
unspecified; what is specified is the electric field E or the electrostatic
potential V due to the external sources. We assume that the charge q
does not significantly affect the sources producing the external field. This
is true if q is very small, or the external sources are held fixed by other
unspecified forces. Even if q is finite, its influence on the external sources
may still be ignored in the situation when very strong sources far away
at infinity produce a finite field E in the region of interest. Note again that
we are interested in determining the potential energy of a given charge q
(and later, a system of charges) in the external field; we are not interested
in the potential energy of the sources producing the external electric field.
The external electric field E and the corresponding external potential
V may vary from point to point. By definition, V at a point P is the work
done in bringing a unit positive charge from infinity to the point P.

(We continue to take potential at infinity to be zero.) Thus, work done in
bringing a charge q from infinity to the point P in the external field is qV.
This work is stored in the form of potential energy of q. If the point P has
position vector r relative to some origin, we can write:
Potential energy of q at r in an external field
$= qV(\mathbf{r})$    (2.27)
where V(r) is the external potential at the point r.
Thus, if an electron with charge $q = e = 1.6 \times 10^{-19}$ C is accelerated by
a potential difference of $\Delta V = 1$ volt, it would gain energy of $q\Delta V = 1.6 \times 10^{-19}$ J.
This unit of energy is defined as 1 electron volt or 1 eV, i.e.,
1 eV = $1.6 \times 10^{-19}$ J. The units based on eV are most commonly used in
atomic, nuclear and particle physics (1 keV = $10^{3}$ eV = $1.6 \times 10^{-16}$ J,
1 MeV = $10^{6}$ eV = $1.6 \times 10^{-13}$ J, 1 GeV = $10^{9}$ eV = $1.6 \times 10^{-10}$ J and
1 TeV = $10^{12}$ eV = $1.6 \times 10^{-7}$ J). [This has already been defined on Page 117,
XI Physics Part I, Table 6.1.]

2.8.2 Potential energy of a system of two charges in an external field
Next, we ask: what is the potential energy of a system of two charges $q_1$
and $q_2$ located at $\mathbf{r}_1$ and $\mathbf{r}_2$, respectively, in an external field? First, we
calculate the work done in bringing the charge $q_1$ from infinity to $\mathbf{r}_1$.
Work done in this step is $q_1 V(\mathbf{r}_1)$, using Eq. (2.27). Next, we consider the
work done in bringing $q_2$ to $\mathbf{r}_2$. In this step, work is done not only against
the external field E but also against the field due to $q_1$.
Work done on $q_2$ against the external field
$= q_2 V(\mathbf{r}_2)$
Work done on $q_2$ against the field due to $q_1$
$= \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$
where $r_{12}$ is the distance between $q_1$ and $q_2$. We have made use of Eqs.
(2.27) and (2.22). By the superposition principle for fields, we add up
the work done on $q_2$ against the two fields (E and that due to $q_1$):
Work done in bringing $q_2$ to $\mathbf{r}_2$
$= q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$    (2.28)
Thus,
Potential energy of the system
= the total work done in assembling the configuration
$= q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$    (2.29)

Example 2.5
(a) Determine the electrostatic potential energy of a system consisting
of two charges 7 µC and −2 µC (and with no external field) placed
at (−9 cm, 0, 0) and (9 cm, 0, 0) respectively.
(b) How much work is required to separate the two charges infinitely
away from each other?
(c) Suppose that the same system of charges is now placed in an
external electric field $E = A\,(1/r^2)$; $A = 9 \times 10^{5}$ N C$^{-1}$ m$^{2}$. What would
the electrostatic energy of the configuration be?
Solution
(a) $U = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q_1 q_2}{r} = 9 \times 10^{9} \times \dfrac{7 \times (-2) \times 10^{-12}}{0.18} = -0.7$ J.
(b) $W = U_2 - U_1 = 0 - U = 0 - (-0.7) = 0.7$ J.
(c) The mutual interaction energy of the two charges remains
unchanged. In addition, there is the energy of interaction of the
two charges with the external electric field. We find,
$q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) = A\,\dfrac{7\ \mu\mathrm{C}}{0.09\ \mathrm{m}} + A\,\dfrac{-2\ \mu\mathrm{C}}{0.09\ \mathrm{m}}$
and the net electrostatic energy is
$q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}} = A\,\dfrac{7\ \mu\mathrm{C}}{0.09\ \mathrm{m}} + A\,\dfrac{-2\ \mu\mathrm{C}}{0.09\ \mathrm{m}} - 0.7\ \mathrm{J}$
$= 70 - 20 - 0.7 = 49.3$ J
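
The arithmetic of Example 2.5 follows directly from Eq. (2.29). The sketch below is an illustration, assuming (as the worked solution does implicitly) that the external field E = A/r² corresponds to an external potential V = A/r with V = 0 at infinity. The function and variable names and the rounded constant K are assumptions.

```python
import math

K = 9e9   # 1/(4*pi*eps0) in N m^2 C^-2, rounded as in the worked solution
A = 9e5   # constant A of the external field E = A / r^2 (SI units)


def external_potential(r):
    """Potential of the external field E = A / r^2, taking V = 0 at infinity."""
    return A / r


q1, q2 = 7e-6, -2e-6   # charges in coulomb
r1, r2 = 0.09, 0.09    # distances of the two charges from the origin, in metres
r12 = 0.18             # separation of the two charges, in metres

u_mutual = K * q1 * q2 / r12                                            # part (a)
work_to_separate = 0.0 - u_mutual                                       # part (b)
u_external = q1 * external_potential(r1) + q2 * external_potential(r2)
u_total = u_external + u_mutual                                         # part (c), Eq. (2.29)
print(u_mutual, work_to_separate, u_external, u_total)
```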

2.8.3 Potential energy of a dipole in an external field


Consider a dipole with charges $q_1 = +q$ and $q_2 = -q$ placed in a uniform
electric field E, as shown in Fig. 2.16.
As seen in the last chapter, in a uniform electric field,
the dipole experiences no net force; but experiences a
torque $\boldsymbol{\tau}$ given by
$\boldsymbol{\tau} = \mathbf{p} \times \mathbf{E}$    (2.30)
which will tend to rotate it (unless p is parallel or
antiparallel to E). Suppose an external torque $\boldsymbol{\tau}_{ext}$ is
applied in such a manner that it just neutralises this
torque and rotates it in the plane of paper from angle $\theta_0$
to angle $\theta_1$ at an infinitesimal angular speed and without
angular acceleration. The amount of work done by the
external torque will be given by
$W = \int_{\theta_0}^{\theta_1} \tau_{ext}(\theta)\,d\theta = \int_{\theta_0}^{\theta_1} pE\sin\theta\,d\theta = pE\left(\cos\theta_0 - \cos\theta_1\right)$    (2.31)

FIGURE 2.16 Potential energy of a dipole in a uniform external field.

This work is stored as the potential energy of the system. We can then
associate potential energy $U(\theta)$ with an inclination $\theta$ of the dipole. Similar
to other potential energies, there is a freedom in choosing the angle where
the potential energy U is taken to be zero. A natural choice is to take
$\theta_0 = \pi/2$. (An explanation for it is provided towards the end of discussion.)
We can then write,
$U(\theta) = pE\left(\cos\dfrac{\pi}{2} - \cos\theta\right) = -pE\cos\theta = -\mathbf{p}\cdot\mathbf{E}$    (2.32)
This expression can alternately be understood also from Eq. (2.29).
We apply Eq. (2.29) to the present system of two charges +q and −q. The
potential energy expression then reads
$U'(\theta) = q\left[V(\mathbf{r}_1) - V(\mathbf{r}_2)\right] - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a}$    (2.33)
Here, $\mathbf{r}_1$ and $\mathbf{r}_2$ denote the position vectors of +q and −q. Now, the
potential difference between positions $\mathbf{r}_1$ and $\mathbf{r}_2$ equals the work done
in bringing a unit positive charge against field from $\mathbf{r}_2$ to $\mathbf{r}_1$. The
displacement parallel to the force is $2a\cos\theta$. Thus, $[V(\mathbf{r}_1) - V(\mathbf{r}_2)] = -E \times 2a\cos\theta$.
We thus obtain,
$U'(\theta) = -pE\cos\theta - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a} = -\mathbf{p}\cdot\mathbf{E} - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a}$    (2.34)
We note that $U'(\theta)$ differs from $U(\theta)$ by a quantity which is just a constant
for a given dipole. Since a constant is insignificant for potential energy, we
can drop the second term in Eq. (2.34) and it then reduces to Eq. (2.32).
We can now understand why we took $\theta_0 = \pi/2$. In this case, the work
done against the external field E in bringing +q and −q are equal and
opposite and cancel out, i.e., $q[V(\mathbf{r}_1) - V(\mathbf{r}_2)] = 0$.
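
The relation between the two-charge expression and $-\mathbf{p}\cdot\mathbf{E}$ can be checked numerically. The sketch below is illustrative and rests on stated assumptions: the uniform field E points along +x, its potential is taken as V = −Ex (zero on the plane x = 0, a choice of reference made here for convenience), and the dipole is centred at the origin. Function names, the rounded constant K and the sample values are likewise assumptions.

```python
import math

K = 9e9  # 1/(4*pi*eps0) in N m^2 C^-2, rounded (assumed value)


def uniform_field_potential(E, x):
    """Potential of a uniform field E along +x, with V = 0 on the plane x = 0."""
    return -E * x


def energy_two_charges(q, a, E, theta):
    """Eq. (2.29) for +q and -q at +a and -a along p, with p at angle theta to E."""
    x_plus = a * math.cos(theta)    # x-coordinate of +q
    x_minus = -a * math.cos(theta)  # x-coordinate of -q
    u_field = q * uniform_field_potential(E, x_plus) - q * uniform_field_potential(E, x_minus)
    u_mutual = -K * q**2 / (2 * a)  # interaction of the two charges with each other
    return u_field + u_mutual


def energy_dipole(p, E, theta):
    """Dipole energy in the external field: -p E cos(theta)."""
    return -p * E * math.cos(theta)


q, a, E = 1e-9, 1e-3, 1e4   # illustrative values in SI units
p = 2 * q * a
constant = -K * q**2 / (2 * a)
angles = [0.0, math.pi / 6, math.pi / 2, 2.0, math.pi]
differences = [energy_two_charges(q, a, E, t) - energy_dipole(p, E, t) for t in angles]
print(differences, constant)
```

The difference between the two expressions is the same at every angle, which is why the constant term can be dropped.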

Example 2.6 A molecule of a substance has a permanent electric
dipole moment of magnitude $10^{-29}$ C m. A mole of this substance is
polarised (at low temperature) by applying a strong electrostatic field
of magnitude $10^{6}$ V m$^{-1}$. The direction of the field is suddenly changed
by an angle of 60°. Estimate the heat released by the substance in
aligning its dipoles along the new direction of the field. For simplicity,
assume 100% polarisation of the sample.
Solution Here, dipole moment of each molecule = $10^{-29}$ C m
As 1 mole of the substance contains $6 \times 10^{23}$ molecules,
total dipole moment of all the molecules, $p = 6 \times 10^{23} \times 10^{-29}$ C m
$= 6 \times 10^{-6}$ C m
Initial potential energy, $U_i = -pE\cos\theta = -6 \times 10^{-6} \times 10^{6} \cos 0° = -6$ J
Final potential energy (when $\theta = 60°$), $U_f = -6 \times 10^{-6} \times 10^{6} \cos 60° = -3$ J
Change in potential energy = −3 J − (−6 J) = 3 J
Just after the field is turned, the dipoles are therefore 3 J above the
energy they have when aligned with the field. As they realign along the
new field direction, this 3 J of potential energy is lost. This must be the
energy released by the substance in the form of heat in aligning its dipoles.

2.9 ELECTROSTATICS OF CONDUCTORS

Conductors and insulators were described briefly in Chapter 1.


Conductors contain mobile charge carriers. In metallic conductors, these
charge carriers are electrons. In a metal, the outer (valence) electrons
part away from their atoms and are free to move. These electrons are free
within the metal but not free to leave the metal. The free electrons form a
kind of gas; they collide with each other and with the ions, and move
randomly in different directions. In an external electric field, they drift
against the direction of the field. The positive ions made up of the nuclei
and the bound electrons remain held in their fixed positions. In electrolytic
conductors, the charge carriers are both positive and negative ions; but

the situation in this case is more involved: the movement of the charge
carriers is affected both by the external electric field and by the
so-called chemical forces (see Chapter 3). We shall restrict our discussion
to metallic solid conductors. Let us note important results regarding
electrostatics of conductors.

1. Inside a conductor, electrostatic field is zero


Consider a conductor, neutral or charged. There may also be an external
electrostatic field. In the static situation, when there is no current inside
or on the surface of the conductor, the electric field is zero everywhere
inside the conductor. This fact can be taken as the defining property of a
conductor. A conductor has free electrons. As long as electric field is not
zero, the free charge carriers would experience force and drift. In the
static situation, the free charges have so distributed themselves that the
electric field is zero everywhere inside. Electrostatic field is zero inside a
conductor.

2. At the surface of a charged conductor, electrostatic field must be normal to the surface at every point
If E were not normal to the surface, it would have some non-zero
component along the surface. Free charges on the surface of the conductor
would then experience force and move. In the static situation, therefore,
E should have no tangential component. Thus electrostatic field at the
surface of a charged conductor must be normal to the surface at every
point. (For a conductor without any surface charge density, field is zero
even at the surface.) See result 5.

3. The interior of a conductor can have no excess charge in the static situation
A neutral conductor has equal amounts of positive and negative charges
in every small volume or surface element. When the conductor is charged,
the excess charge can reside only on the surface in the static situation.
This follows from Gauss's law. Consider any arbitrary volume element
v inside a conductor. On the closed surface S bounding the volume
element v, electrostatic field is zero. Thus the total electric flux through S
is zero. Hence, by Gauss's law, there is no net charge enclosed by S. But
the surface S can be made as small as you like, i.e., the volume v can be
made vanishingly small. This means there is no net charge at any point
inside the conductor, and any excess charge must reside at the surface.

4. Electrostatic potential is constant throughout the volume of the conductor and has the same value (as inside) on its surface

This follows from results 1 and 2 above. Since E = 0 inside the conductor
and has no tangential component on the surface, no work is done in
moving a small test charge within the conductor and on its surface. That
is, there is no potential difference between any two points inside or on
the surface of the conductor. Hence, the result. If the conductor is charged,

electric field normal to the surface exists; this means potential will be
different for the surface and a point just outside the surface.
In a system of conductors of arbitrary size, shape and
charge configuration, each conductor is characterised by a constant
value of potential, but this constant may differ from one conductor to
the other.

5. Electric field at the surface of a charged conductor

$$\mathbf{E} = \frac{\sigma}{\varepsilon_0}\,\hat{\mathbf{n}} \qquad (2.35)$$

where $\sigma$ is the surface charge density and $\hat{\mathbf{n}}$ is a unit vector normal
to the surface in the outward direction.
To derive the result, choose a pill box (a short cylinder) as the Gaussian
surface about any point P on the surface, as shown in Fig. 2.17. The pill
box is partly inside and partly outside the surface of the conductor. It
has a small area of cross section $\delta S$ and negligible height.
Just inside the surface, the electrostatic field is zero; just outside, the
field is normal to the surface with magnitude E. Thus,
the contribution to the total flux through the pill box
comes only from the outside (circular) cross-section
of the pill box. This equals $\pm E\,\delta S$ (positive for $\sigma > 0$,
negative for $\sigma < 0$), since over the small area $\delta S$, E
may be considered constant and E and $\delta S$ are parallel
or antiparallel. The charge enclosed by the pill box
is $\sigma\,\delta S$.
By Gauss's law

$$E\,\delta S = \frac{|\sigma|\,\delta S}{\varepsilon_0}, \qquad E = \frac{|\sigma|}{\varepsilon_0} \qquad (2.36)$$

Including the fact that electric field is normal to the
surface, we get the vector relation, Eq. (2.35), which
is true for both signs of $\sigma$. For $\sigma > 0$, electric field is
normal to the surface outward; for $\sigma < 0$, electric field
is normal to the surface inward.

FIGURE 2.17 The Gaussian surface (a pill box) chosen to derive Eq. (2.35)
for electric field at the surface of a charged conductor.

6. Electrostatic shielding
Consider a conductor with a cavity, with no charges inside the cavity. A
remarkable result is that the electric field inside the cavity is zero, whatever
be the size and shape of the cavity and whatever be the charge on the
conductor and the external fields in which it might be placed. We have
proved a simple case of this result already: the electric field inside a charged
spherical shell is zero. The proof of the result for the shell makes use of
the spherical symmetry of the shell (see Chapter 1). But the vanishing of
electric field in the (charge-free) cavity of a conductor is, as mentioned
above, a very general result. A related result is that even if the conductor
is charged or charges are induced on a neutral
conductor by an external field, all charges reside
only on the outer surface of a conductor with cavity.
The proofs of the results noted in Fig. 2.18 are
omitted here, but we note their important
implication. Whatever be the charge and field
configuration outside, any cavity in a conductor
remains shielded from outside electric influence: the
field inside the cavity is always zero. This is known
as electrostatic shielding. The effect can be made
use of in protecting sensitive instruments from
outside electrical influence. Figure 2.19 gives a
summary of the important electrostatic properties
of a conductor.

FIGURE 2.18 The electric field inside a cavity of any conductor is zero. All
charges reside only on the outer surface of a conductor with cavity. (There are no
charges placed in the cavity.)

FIGURE 2.19 Some important electrostatic properties of a conductor.

Example 2.7
(a) A comb run through one's dry hair attracts small bits of paper.
Why? What happens if the hair is wet or if it is a rainy day? (Remember,
a paper does not conduct electricity.)
(b) Ordinary rubber is an insulator. But special rubber tyres of
aircraft are made slightly conducting. Why is this necessary?
(c) Vehicles carrying inflammable materials usually have metallic
ropes touching the ground during motion. Why?
(d) A bird perches on a bare high power line, and nothing happens
to the bird. A man standing on the ground touches the same line
and gets a fatal shock. Why?
Solution
(a) This is because the comb gets charged by friction. The molecules
in the paper get polarised by the charged comb, resulting in a
net force of attraction. If the hair is wet, or if it is a rainy day, friction
between hair and the comb reduces. The comb does not get
charged and thus it will not attract small bits of paper.
(b) To enable them to conduct charge (produced by friction) to the
ground; as too much of static electricity accumulated may result
in spark and result in fire.
(c) Reason similar to (b).
(d) Current passes only when there is difference in potential.

2.10 DIELECTRICS AND POLARISATION

Dielectrics are non-conducting substances. In contrast to conductors,
they have no (or negligible number of) charge carriers. Recall from Section
2.9 what happens when a conductor is placed in an
external electric field. The free charge carriers move
and charge distribution in the conductor adjusts
itself in such a way that the electric field due to
induced charges opposes the external field within
the conductor. This happens until, in the static
situation, the two fields cancel each other and the
net electrostatic field in the conductor is zero. In a
dielectric, this free movement of charges is not
possible. It turns out that the external field induces
dipole moment by stretching or re-orienting
molecules of the dielectric. The collective effect of all
the molecular dipole moments is net charges on the
surface of the dielectric which produce a field that
opposes the external field. Unlike in a conductor,
however, the opposing field so induced does not
exactly cancel the external field. It only reduces it.
The extent of the effect depends on the
nature of the dielectric. To understand the
effect, we need to look at the charge
distribution of a dielectric at the
molecular level.

FIGURE 2.20 Difference in behaviour of a conductor and a dielectric
in an external electric field.

The molecules of a substance may be
polar or non-polar. In a non-polar
molecule, the centres of positive and
negative charges coincide. The molecule
then has no permanent (or intrinsic) dipole
moment. Examples of non-polar molecules
are oxygen (O$_2$) and hydrogen (H$_2$)
molecules which, because of their
symmetry, have no dipole moment. On the
other hand, a polar molecule is one in which
the centres of positive and negative charges
are separated (even when there is no
external field). Such molecules have a
permanent dipole moment. An ionic
molecule such as HCl or a molecule of water
(H$_2$O) are examples of polar molecules.

FIGURE 2.21 Some examples of polar and non-polar molecules.

In an external electric field, the
positive and negative charges of a non-polar molecule are displaced in opposite
directions. The displacement stops when
the external force on the constituent
charges of the molecule is balanced by
the restoring force (due to internal fields
in the molecule). The non-polar molecule
thus develops an induced dipole moment.
The dielectric is said to be polarised by
the external field. We consider only the
simple situation when the induced dipole
moment is in the direction of the field and
is proportional to the field strength.
(Substances for which this assumption
is true are called linear isotropic
dielectrics.) The induced dipole moments
of different molecules add up giving a net
dipole moment of the dielectric in the
presence of the external field.
A dielectric with polar molecules also
develops a net dipole moment in an
external field, but for a different reason.
In the absence of any external field, the
different permanent dipoles are oriented
randomly due to thermal agitation; so
the total dipole moment is zero. When
an external field is applied, the individual dipole moments tend to align
with the field. When summed over all the molecules, there is then a net
dipole moment in the direction of the external field, i.e., the dielectric is
polarised. The extent of polarisation depends on the relative strength of
two mutually opposite factors: the dipole potential energy in the external
field tending to align the dipoles with the field and thermal energy tending
to disrupt the alignment. There may be, in addition, the 'induced dipole
moment' effect as for non-polar molecules, but generally the alignment
effect is more important for polar molecules.

FIGURE 2.22 A dielectric develops a net dipole moment in an external
electric field. (a) Non-polar molecules, (b) Polar molecules.

Thus in either case, whether polar or non-polar, a dielectric develops
a net dipole moment in the presence of an external field. The dipole
moment per unit volume is called polarisation and is denoted by P. For
linear isotropic dielectrics,

$$\mathbf{P} = \chi_e\,\mathbf{E} \qquad (2.37)$$

where $\chi_e$ is a constant characteristic of the dielectric and is known as the
electric susceptibility of the dielectric medium.
It is possible to relate $\chi_e$ to the molecular properties of the substance,
but we shall not pursue that here.
The question is: how does the polarised dielectric modify the original
external field inside it? Let us consider, for simplicity, a rectangular
dielectric slab placed in a uniform external field $\mathbf{E}_0$ parallel to two of its
faces. The field causes a uniform polarisation P of the dielectric. Thus
every volume element $\Delta v$ of the slab has a dipole moment
$\mathbf{P}\,\Delta v$ in the direction of the field. The volume element $\Delta v$ is
macroscopically small but contains a very large number of
molecular dipoles. Anywhere inside the dielectric, the
volume element $\Delta v$ has no net charge (though it has net
dipole moment). This is because the positive charge of one
dipole sits close to the negative charge of the adjacent dipole.
However, at the surfaces of the dielectric normal to the
electric field, there is evidently a net charge density. As seen
in Fig. 2.23, the positive ends of the dipoles remain
unneutralised at the right surface and the negative ends at
the left surface. The unbalanced charges are the induced
charges due to the external field.
Thus the polarised dielectric is equivalent to two charged
surfaces with induced surface charge densities, say $\sigma_p$
and $-\sigma_p$. Clearly, the field produced by these surface charges
opposes the external field. The total field in the dielectric
is, thereby, reduced from the case when no dielectric is
present. We should note that the surface charge density
$\pm\sigma_p$ arises from bound (not free) charges in the dielectric.

FIGURE 2.23 A uniformly polarised dielectric amounts to induced surface
charge density, but no volume charge density.

2.11 CAPACITORS AND CAPACITANCE

A capacitor is a system of two conductors separated by an insulator
(Fig. 2.24). The conductors have charges, say $Q_1$ and $Q_2$, and potentials
$V_1$ and $V_2$. Usually, in practice, the two conductors have charges $Q$
and $-Q$, with potential difference $V = V_1 - V_2$ between them. We shall
consider only this kind of charge configuration of the capacitor. (Even a
single conductor can be used as a capacitor by assuming the other at
infinity.) The conductors may be so charged by connecting them to the
two terminals of a battery. Q is called the charge of the capacitor, though
this, in fact, is the charge on one of the conductors; the total charge of
the capacitor is zero.
The electric field in the region between the
conductors is proportional to the charge Q. That
is, if the charge on the capacitor is, say doubled,
the electric field will also be doubled at every point.
(This follows from the direct proportionality
between field and charge implied by Coulomb's
law and the superposition principle.) Now,
potential difference V is the work done per unit
positive charge in taking a small test charge from
the conductor 2 to 1 against the field.
Consequently, V is also proportional to Q, and
the ratio Q/V is a constant:

$$C = \frac{Q}{V} \qquad (2.38)$$

The constant C is called the capacitance of the capacitor. C is independent
of Q or V, as stated above. The capacitance C depends only on the
geometrical configuration (shape, size, separation) of the system of two
conductors. [As we shall see later, it also depends on the nature of the
insulator (dielectric) separating the two conductors.] The SI unit of
capacitance is 1 farad (= 1 coulomb volt$^{-1}$) or 1 F = 1 C V$^{-1}$. A capacitor
with fixed capacitance is symbolically shown as --| |--, while the one with
variable capacitance is shown by the same symbol with an arrow drawn through it.

FIGURE 2.24 A system of two conductors separated by an insulator forms a capacitor.

Equation (2.38) shows that for large C, V is small for a given Q. This
means a capacitor with large capacitance can hold large amount of charge
Q at a relatively small V. This is of practical importance. High potential
difference implies strong electric field around the conductors. A strong
electric field can ionise the surrounding air and accelerate the charges so
produced to the oppositely charged plates, thereby neutralising the charge
on the capacitor plates, at least partly. In other words, the charge of the
capacitor leaks away due to the reduction in insulating power of the
intervening medium.
The maximum electric field that a dielectric medium can withstand
without break-down (of its insulating property) is called its dielectric
strength; for air it is about $3 \times 10^{6}$ V m$^{-1}$. For a separation between
conductors of the order of 1 cm or so, this field corresponds to a potential
difference of $3 \times 10^{4}$ V between the conductors. Thus, for a capacitor to
store a large amount of charge without leaking, its capacitance should
be high enough so that the potential difference and hence the electric
field do not exceed the break-down limits. Put differently, there is a limit
to the amount of charge that can be stored on a given capacitor without
significant leaking. In practice, a farad is a very big unit; the most common
units are its sub-multiples 1 µF = $10^{-6}$ F, 1 nF = $10^{-9}$ F, 1 pF = $10^{-12}$ F,
etc. Besides its use in storing charge, a capacitor is a key element of most
ac circuits with important functions, as described in Chapter 7.
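
The breakdown estimate above can be checked numerically. The following is a minimal sketch, assuming a uniform field between the conductors so that the largest safe potential difference is the dielectric strength times the separation. The function names and the 1 µF sample capacitance are illustrative assumptions, not values from the text.

```python
# Dielectric strength of air quoted in the text, in V/m.
AIR_DIELECTRIC_STRENGTH = 3e6


def max_voltage(separation_m, dielectric_strength=AIR_DIELECTRIC_STRENGTH):
    """Largest potential difference before breakdown, assuming a uniform field."""
    return dielectric_strength * separation_m


def max_charge(capacitance_f, separation_m):
    """Largest charge Q = C * V_max that can be held without breakdown."""
    return capacitance_f * max_voltage(separation_m)


v_max = max_voltage(0.01)        # 1 cm separation, as in the text
q_max = max_charge(1e-6, 0.01)   # hypothetical 1 microfarad capacitor
print(f"V_max = {v_max:.1e} V, Q_max = {q_max:.1e} C")
```

For a fixed separation, the charge limit scales with C, which is the sense in which a large capacitance lets a capacitor hold more charge without leaking.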

2.12 THE PARALLEL PLATE CAPACITOR


A parallel plate capacitor consists of two large plane parallel conducting
plates separated by a small distance (Fig. 2.25). We first take the
intervening medium between the plates to be
vacuum. The effect of a dielectric medium between
the plates is discussed in the next section. Let A be
the area of each plate and d the separation between
them. The two plates have charges $Q$ and $-Q$. Since
d is much smaller than the linear dimension of the
plates ($d^2 \ll A$), we can use the result on electric
field by an infinite plane sheet of uniform surface
charge density (Section 1.15). Plate 1 has surface
charge density $\sigma = Q/A$ and plate 2 has a surface
charge density $-\sigma$. Using Eq. (1.33), the electric field
in different regions is:
Outer region I (region above the plate 1),

$$E = \frac{\sigma}{2\varepsilon_0} - \frac{\sigma}{2\varepsilon_0} = 0 \qquad (2.39)$$

Outer region II (region below the plate 2),

$$E = \frac{\sigma}{2\varepsilon_0} - \frac{\sigma}{2\varepsilon_0} = 0 \qquad (2.40)$$

In the inner region between the plates 1 and 2, the electric fields due
to the two charged plates add up, giving

$$E = \frac{\sigma}{2\varepsilon_0} + \frac{\sigma}{2\varepsilon_0} = \frac{\sigma}{\varepsilon_0} = \frac{Q}{\varepsilon_0 A} \qquad (2.41)$$

The direction of electric field is from the positive to the negative plate.
Thus, the electric field is localised between the two plates and is
uniform throughout. For plates with finite area, this will not be true near
the outer boundaries of the plates. The field lines bend outward at the
edges, an effect called 'fringing of the field'. By the same token, $\sigma$ will not
be strictly uniform on the entire plate. [E and $\sigma$ are related by Eq. (2.35).]
However, for $d^2 \ll A$, these effects can be ignored in the regions sufficiently
far from the edges, and the field there is given by Eq. (2.41). Now for
uniform electric field, potential difference is simply the electric field times
the distance between the plates, that is,

$$V = Ed = \frac{1}{\varepsilon_0}\,\frac{Qd}{A} \qquad (2.42)$$

The capacitance C of the parallel plate capacitor is then

$$C = \frac{Q}{V} = \frac{\varepsilon_0 A}{d} \qquad (2.43)$$

which, as expected, depends only on the geometry of the system. For
typical values like A = 1 m$^2$, d = 1 mm, we get

$$C = \frac{8.85 \times 10^{-12}\ \text{C}^2\,\text{N}^{-1}\,\text{m}^{-2} \times 1\ \text{m}^2}{10^{-3}\ \text{m}} = 8.85 \times 10^{-9}\ \text{F} \qquad (2.44)$$

(You can check that 1 F = 1 C V$^{-1}$ = 1 C (N C$^{-1}$ m)$^{-1}$ = 1 C$^2$ N$^{-1}$ m$^{-1}$.)
This shows that 1 F is too big a unit in practice, as remarked earlier.
Another way of seeing the 'bigness' of 1 F is to calculate the area of the
plates needed to have C = 1 F for a separation of, say 1 cm:

$$A = \frac{Cd}{\varepsilon_0} = \frac{1\ \text{F} \times 10^{-2}\ \text{m}}{8.85 \times 10^{-12}\ \text{C}^2\,\text{N}^{-1}\,\text{m}^{-2}} \approx 10^{9}\ \text{m}^2 \qquad (2.45)$$

which is a plate about 30 km in length and breadth!

FIGURE 2.25 The parallel plate capacitor.

Factors affecting capacitance, capacitors in action. Interactive Java tutorial:
http://micro.magnet.fsu.edu/electromag/java/capacitance/

2.13 EFFECT OF DIELECTRIC ON CAPACITANCE

With the understanding of the behaviour of dielectrics in an external field
developed in Section 2.10, let us see how the capacitance of a parallel
plate capacitor is modified when a dielectric is present. As before, we
have two large plates, each of area A, separated by a distance d. The
charge on the plates is $\pm Q$, corresponding to the charge density $\pm\sigma$ (with
$\sigma = Q/A$). When there is vacuum between the plates,

$$E_0 = \frac{\sigma}{\varepsilon_0}$$

and the potential difference $V_0$ is

$$V_0 = E_0 d$$

The capacitance $C_0$ in this case is

$$C_0 = \frac{Q}{V_0} = \varepsilon_0\,\frac{A}{d} \qquad (2.46)$$

Consider next a dielectric inserted between the plates fully occupying
the intervening region. The dielectric is polarised by the field and, as
explained in Section 2.10, the effect is equivalent to two charged sheets
(at the surfaces of the dielectric normal to the field) with surface charge
densities $\sigma_p$ and $-\sigma_p$. The electric field in the dielectric then corresponds
to the case when the net surface charge density on the plates is $\pm(\sigma - \sigma_p)$.
That is,

$$E = \frac{\sigma - \sigma_p}{\varepsilon_0} \qquad (2.47)$$

so that the potential difference across the plates is

$$V = Ed = \frac{\sigma - \sigma_p}{\varepsilon_0}\,d \qquad (2.48)$$

For linear dielectrics, we expect $\sigma_p$ to be proportional to $E_0$, i.e., to $\sigma$.
Thus, $(\sigma - \sigma_p)$ is proportional to $\sigma$ and we can write

$$\sigma - \sigma_p = \frac{\sigma}{K} \qquad (2.49)$$

where K is a constant characteristic of the dielectric. Clearly, K > 1. We
then have

$$V = \frac{\sigma d}{\varepsilon_0 K} = \frac{Qd}{A\,\varepsilon_0 K} \qquad (2.50)$$

The capacitance C, with dielectric between the plates, is then

$$C = \frac{Q}{V} = \frac{\varepsilon_0 K A}{d} \qquad (2.51)$$

The product $\varepsilon_0 K$ is called the permittivity of the medium and is
denoted by $\varepsilon$

$$\varepsilon = \varepsilon_0 K \qquad (2.52)$$

For vacuum K = 1 and $\varepsilon = \varepsilon_0$; $\varepsilon_0$ is called the permittivity of the vacuum.
The dimensionless ratio

$$K = \frac{\varepsilon}{\varepsilon_0} \qquad (2.53)$$

is called the dielectric constant of the substance. As remarked before,
from Eq. (2.49), it is clear that K is greater than 1. From Eqs. (2.46) and
(2.51)

$$K = \frac{C}{C_0} \qquad (2.54)$$

Thus, the dielectric constant of a substance is the factor (>1) by which
the capacitance increases from its vacuum value, when the dielectric is
inserted fully between the plates of a capacitor. Though we arrived at
Eq. (2.54) for the case of a parallel plate capacitor, it holds good for any
type of capacitor and can, in fact, be viewed in general as a definition of
the dielectric constant of a substance.
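
As a numerical illustration of Eqs. (2.46), (2.51) and (2.54), the sketch below evaluates the parallel plate capacitance with and without a dielectric. It assumes the ideal case with fringing ignored; the function name and the value K = 5 are hypothetical choices made only for this example.

```python
EPSILON_0 = 8.85e-12  # permittivity of vacuum in C^2 N^-1 m^-2, value used in the text


def parallel_plate_capacitance(area_m2, separation_m, dielectric_constant=1.0):
    """C = epsilon_0 * K * A / d, ignoring fringing (valid for d^2 << A)."""
    return EPSILON_0 * dielectric_constant * area_m2 / separation_m


c_vacuum = parallel_plate_capacitance(1.0, 1e-3)        # A = 1 m^2, d = 1 mm
c_filled = parallel_plate_capacitance(1.0, 1e-3, 5.0)   # hypothetical K = 5
print(c_vacuum)              # close to 8.85e-9 F, the vacuum value
print(c_filled / c_vacuum)   # the ratio C / C0 is the dielectric constant K
```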

ELECTRIC DISPLACEMENT

We have introduced the notion of dielectric constant and arrived at Eq. (2.54), without
giving the explicit relation between the induced charge density $\sigma_p$ and the polarisation P.
We take without proof the result that

$$\sigma_p = \mathbf{P}\cdot\hat{\mathbf{n}}$$

where $\hat{\mathbf{n}}$ is a unit vector along the outward normal to the surface. The above equation is
general, true for any shape of the dielectric. For the slab in Fig. 2.23, P is along $\hat{\mathbf{n}}$ at the
right surface and opposite to $\hat{\mathbf{n}}$ at the left surface. Thus at the right surface, induced
charge density is positive and at the left surface, it is negative, as guessed already in our
qualitative discussion before. Putting the equation for electric field in vector form

$$\mathbf{E}\cdot\hat{\mathbf{n}} = \frac{\sigma - \mathbf{P}\cdot\hat{\mathbf{n}}}{\varepsilon_0}$$

or $(\varepsilon_0\mathbf{E} + \mathbf{P})\cdot\hat{\mathbf{n}} = \sigma$

The quantity $\varepsilon_0\mathbf{E} + \mathbf{P}$ is called the electric displacement and is denoted by D. It is a
vector quantity. Thus,

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}, \qquad \mathbf{D}\cdot\hat{\mathbf{n}} = \sigma$$

The significance of D is this: in vacuum, E is related to the free charge density $\sigma$.
When a dielectric medium is present, the corresponding role is taken up by D. For a
dielectric medium, it is D not E that is directly related to free charge density $\sigma$, as seen in
the above equation. Since P is in the same direction as E, all the three vectors P, E and D are
parallel.
The ratio of the magnitudes of D and E is

$$\frac{D}{E} = \frac{\sigma\,\varepsilon_0}{\sigma - \sigma_p} = \varepsilon_0 K$$

Thus,
$\mathbf{D} = \varepsilon_0 K\,\mathbf{E}$
and $\mathbf{P} = \mathbf{D} - \varepsilon_0\mathbf{E} = \varepsilon_0 (K - 1)\,\mathbf{E}$
This gives for the electric susceptibility $\chi_e$ defined in Eq. (2.37)
$\chi_e = \varepsilon_0 (K - 1)$
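
The relations in this box can be cross-checked with numbers. The sketch below is an illustration under stated assumptions: a slab that completely fills a parallel plate capacitor, a free surface charge density of $10^{-6}$ C m$^{-2}$ and K = 4, all chosen arbitrarily. The function name `slab_fields` is hypothetical.

```python
EPSILON_0 = 8.85e-12  # C^2 N^-1 m^-2


def slab_fields(sigma, dielectric_constant):
    """Return E, sigma_p, P and D for a slab filling a parallel plate capacitor."""
    # Field inside the dielectric: E = sigma / (epsilon_0 * K)
    e_field = sigma / (EPSILON_0 * dielectric_constant)
    # Induced surface charge density from sigma - sigma_p = sigma / K
    sigma_p = sigma * (1.0 - 1.0 / dielectric_constant)
    # Polarisation from P = epsilon_0 * (K - 1) * E
    polarisation = EPSILON_0 * (dielectric_constant - 1.0) * e_field
    # Electric displacement from D = epsilon_0 * E + P
    displacement = EPSILON_0 * e_field + polarisation
    return e_field, sigma_p, polarisation, displacement


E, sigma_p, P, D = slab_fields(sigma=1e-6, dielectric_constant=4.0)
print(P, sigma_p)   # the two agree: P equals the induced surface charge density
print(D)            # agrees with the free surface charge density sigma
```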

Example 2.8 A slab of material of dielectric constant K has the same
area as the plates of a parallel-plate capacitor but has a thickness
(3/4) d, where d is the separation of the plates. How is the capacitance
changed when the slab is inserted between the plates?
Solution Let $E_0 = V_0/d$ be the electric field between the plates when
there is no dielectric and the potential difference is $V_0$. If the dielectric
is now inserted, the electric field in the dielectric will be $E = E_0/K$.
The potential difference will then be

$$V = E_0\left(\frac{1}{4}d\right) + \frac{E_0}{K}\left(\frac{3}{4}d\right) = E_0 d\left(\frac{1}{4} + \frac{3}{4K}\right) = V_0\,\frac{K + 3}{4K}$$

The potential difference decreases by the factor (K + 3)/4K while the
free charge $Q_0$ on the plates remains unchanged. The capacitance
thus increases

$$C = \frac{Q_0}{V} = \frac{4K}{K + 3}\,\frac{Q_0}{V_0} = \frac{4K}{K + 3}\,C_0$$
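
The same reasoning works for a slab of any thickness: the potential difference is the sum of the drop across the vacuum gap and the drop across the slab. The sketch below is one way to express that, assuming the slab has the full plate area and fringing is ignored; `fill_fraction` is a hypothetical parameter name for the ratio of slab thickness to plate separation.

```python
def slab_capacitance_ratio(dielectric_constant, fill_fraction):
    """C / C0 for a slab of thickness fill_fraction * d between the plates.

    V / V0 = (1 - f) + f / K, because the field is E0 in the gap and
    E0 / K inside the slab; the free charge on the plates is unchanged.
    """
    vacuum_part = 1.0 - fill_fraction
    slab_part = fill_fraction / dielectric_constant
    return 1.0 / (vacuum_part + slab_part)


K = 3.0  # hypothetical dielectric constant
print(slab_capacitance_ratio(K, 0.75), 4 * K / (K + 3))  # both give 4K / (K + 3)
print(slab_capacitance_ratio(K, 1.0))                    # fully filled: ratio is K
```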

2.14 COMBINATION OF CAPACITORS

We can combine several capacitors of capacitance $C_1, C_2, \dots, C_n$ to obtain
a system with some effective capacitance C. The effective capacitance
depends on the way the individual capacitors are combined. Two simple
possibilities are discussed below.

2.14.1 Capacitors in series


Figure 2.26 shows capacitors $C_1$ and $C_2$ combined in series.
The left plate of $C_1$ and the right plate of $C_2$ are connected to two
terminals of a battery and have charges $Q$ and $-Q$,
respectively. It then follows that the right plate of $C_1$
has charge $-Q$ and the left plate of $C_2$ has charge $Q$.
If this was not so, the net charge on each capacitor
would not be zero. This would result in an electric
field in the conductor connecting $C_1$ and $C_2$. Charge
would flow until the net charge on both $C_1$ and $C_2$
is zero and there is no electric field in the conductor
connecting $C_1$ and $C_2$. Thus, in the series
combination, charges on the two plates ($\pm Q$) are the
same on each capacitor. The total potential drop V
across the combination is the sum of the potential
drops $V_1$ and $V_2$ across $C_1$ and $C_2$, respectively.

$$V = V_1 + V_2 = \frac{Q}{C_1} + \frac{Q}{C_2} \qquad (2.55)$$

i.e.,

$$\frac{V}{Q} = \frac{1}{C_1} + \frac{1}{C_2} \qquad (2.56)$$

Now we can regard the combination as an
effective capacitor with charge Q and potential
difference V. The effective capacitance of the
combination is

$$C = \frac{Q}{V} \qquad (2.57)$$

We compare Eq. (2.57) with Eq. (2.56), and
obtain

$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} \qquad (2.58)$$

FIGURE 2.26 Combination of two capacitors in series.

FIGURE 2.27 Combination of n capacitors in series.

The proof clearly goes through for any number of
capacitors arranged in a similar way. Equation (2.55),
for n capacitors arranged in series, generalises to

$$V = V_1 + V_2 + \dots + V_n = \frac{Q}{C_1} + \frac{Q}{C_2} + \dots + \frac{Q}{C_n} \qquad (2.59)$$

Following the same steps as for the case of two
capacitors, we get the general formula for effective
capacitance of a series combination of n capacitors:

$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3} + \dots + \frac{1}{C_n} \qquad (2.60)$$
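
The series rule can be tried out on numbers. The following is a minimal sketch, assuming ideal capacitors; the function names and the three sample capacitances are hypothetical. It also shows how the applied potential difference divides, since every capacitor in the chain carries the same charge Q.

```python
def series_capacitance(capacitances):
    """Effective capacitance of capacitors in series: 1/C = sum of 1/C_i."""
    return 1.0 / sum(1.0 / c for c in capacitances)


def series_voltages(capacitances, total_voltage):
    """Potential drop across each capacitor; all carry the same charge Q."""
    charge = series_capacitance(capacitances) * total_voltage
    return [charge / c for c in capacitances]


caps = [2e-6, 3e-6, 6e-6]            # hypothetical values in farads
c_eff = series_capacitance(caps)     # smaller than the smallest capacitor
drops = series_voltages(caps, 12.0)  # the drops add up to the applied 12 V
print(c_eff, drops, sum(drops))
```

The smallest capacitor takes the largest share of the potential difference, because each drop is Q divided by that capacitor's capacitance.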

2.14.2 Capacitors in parallel


Figure 2.28 (a) shows two capacitors arranged in
parallel. In this case, the same potential difference is
applied across both the capacitors. But the plate charges
($\pm Q_1$) on capacitor 1 and the plate charges ($\pm Q_2$) on the
capacitor 2 are not necessarily the same:

$$Q_1 = C_1 V, \qquad Q_2 = C_2 V \qquad (2.61)$$

The equivalent capacitor is one with charge

$$Q = Q_1 + Q_2 \qquad (2.62)$$

and potential difference V.

$$Q = CV = C_1 V + C_2 V \qquad (2.63)$$

The effective capacitance C is, from Eq. (2.63),

$$C = C_1 + C_2 \qquad (2.64)$$

The general formula for effective capacitance C for
parallel combination of n capacitors [Fig. 2.28 (b)]
follows similarly,

$$Q = Q_1 + Q_2 + \dots + Q_n \qquad (2.65)$$

i.e.,

$$CV = C_1 V + C_2 V + \dots + C_n V \qquad (2.66)$$

which gives

$$C = C_1 + C_2 + \dots + C_n \qquad (2.67)$$

FIGURE 2.28 Parallel combination of (a) two capacitors, (b) n capacitors.

Example 2.9 A network of four 10 µF capacitors is connected to a 500 V
supply, as shown in Fig. 2.29. Determine (a) the equivalent capacitance
of the network and (b) the charge on each capacitor. (Note, the charge
on a capacitor is the charge on the plate with higher potential, equal
and opposite to the charge on the plate with lower potential.)

FIGURE 2.29

Solution
(a) In the given network, $C_1$, $C_2$ and $C_3$ are connected in series. The
effective capacitance $C'$ of these three capacitors is given by

$$\frac{1}{C'} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3}$$

For $C_1 = C_2 = C_3 = 10$ µF, $C' = (10/3)$ µF. The network has $C'$ and $C_4$
connected in parallel. Thus, the equivalent capacitance C of the
network is

$$C = C' + C_4 = \left(\frac{10}{3} + 10\right)\ \mu\text{F} = 13.3\ \mu\text{F}$$

(b) Clearly, from the figure, the charge on each of the capacitors, $C_1$,
$C_2$ and $C_3$ is the same, say $Q$. Let the charge on $C_4$ be $Q'$. Now, since
the potential difference across AB is $Q/C_1$, across BC is $Q/C_2$, across
CD is $Q/C_3$, we have

$$\frac{Q}{C_1} + \frac{Q}{C_2} + \frac{Q}{C_3} = 500\ \text{V}$$

Also, $Q'/C_4 = 500$ V.
This gives for the given value of the capacitances,

$$Q = 500\ \text{V} \times \frac{10}{3}\ \mu\text{F} = 1.7 \times 10^{-3}\ \text{C} \quad\text{and}\quad Q' = 500\ \text{V} \times 10\ \mu\text{F} = 5.0 \times 10^{-3}\ \text{C}$$
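
The arithmetic of Example 2.9 can be reproduced in a few lines. This is a sketch only, assuming the network described in the solution (three capacitors in series, with that branch in parallel with the fourth); the variable names are illustrative.

```python
MICRO = 1e-6
c1 = c2 = c3 = c4 = 10 * MICRO   # four 10 microfarad capacitors
supply = 500.0                   # supply voltage in volts

# Series branch C1, C2, C3 (called C' in the solution)
c_branch = 1.0 / (1.0 / c1 + 1.0 / c2 + 1.0 / c3)
# The branch is in parallel with C4
c_network = c_branch + c4

q_series = c_branch * supply     # charge on each of C1, C2, C3
q_c4 = c4 * supply               # charge on C4

print(f"C = {c_network / MICRO:.1f} microfarad")
print(f"Q = {q_series:.1e} C, Q' = {q_c4:.1e} C")
```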

2.15 ENERGY STORED IN A CAPACITOR

A capacitor, as we have seen above, is a system of two conductors with
charge $Q$ and $-Q$. To determine the energy stored in this configuration,
consider initially two uncharged conductors 1 and 2. Imagine next a
process of transferring charge from conductor 2 to conductor 1 bit by
bit, so that at the end, conductor 1 gets charge $Q$. By
charge conservation, conductor 2 has charge $-Q$ at
the end (Fig. 2.30).
In transferring positive charge from conductor 2
to conductor 1, work will be done externally, since at
any stage conductor 1 is at a higher potential than
conductor 2. To calculate the total work done, we first
calculate the work done in a small step involving
transfer of an infinitesimal (i.e., vanishingly small)
amount of charge. Consider the intermediate situation
when the conductors 1 and 2 have charges $Q'$ and
$-Q'$ respectively. At this stage, the potential difference
$V'$ between conductors 1 and 2 is $Q'/C$, where C is the
capacitance of the system. Next imagine that a small
charge $\delta Q'$ is transferred from conductor 2 to 1. Work
done in this step ($\delta W$), resulting in charge $Q'$ on
conductor 1 increasing to $Q' + \delta Q'$, is given by

$$\delta W = V'\,\delta Q' = \frac{Q'}{C}\,\delta Q' \qquad (2.68)$$

FIGURE 2.30 (a) Work done in a small step of building charge on conductor 1
from $Q'$ to $Q' + \delta Q'$. (b) Total work done in charging the capacitor may be
viewed as stored in the energy of electric field between the plates.

Since $\delta Q'$ can be made as small as we like, Eq. (2.68) can be written as

$$\delta W = \frac{1}{2C}\left[(Q' + \delta Q')^2 - Q'^2\right] \qquad (2.69)$$

Equations (2.68) and (2.69) are identical because the term of second
order in $\delta Q'$, i.e., $\delta Q'^2/2C$, is negligible, since $\delta Q'$ is arbitrarily small. The
total work done (W) is the sum of the small work ($\delta W$) over the very large
number of steps involved in building the charge $Q'$ from zero to $Q$.

$$W = \sum_{\text{all steps}} \delta W = \sum_{\text{all steps}} \frac{1}{2C}\left[(Q' + \delta Q')^2 - Q'^2\right] \qquad (2.70)$$

$$= \frac{1}{2C}\left[\{\delta Q'^2 - 0\} + \{(2\,\delta Q')^2 - \delta Q'^2\} + \{(3\,\delta Q')^2 - (2\,\delta Q')^2\} + \dots + \{Q^2 - (Q - \delta Q')^2\}\right] \qquad (2.71)$$

$$= \frac{1}{2C}\left[Q^2 - 0\right] = \frac{Q^2}{2C} \qquad (2.72)$$

The same result can be obtained directly from Eq. (2.68) by integration

$$W = \int_0^Q \frac{Q'}{C}\,\delta Q' = \frac{1}{C}\,\frac{Q'^2}{2}\,\bigg|_0^Q = \frac{Q^2}{2C}$$
This is not surprising since integration is nothing but summation of
a large number of small terms.
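
The step-by-step charging process can also be imitated numerically. The sketch below is an illustration, not part of the derivation: it adds up the work $(Q'/C)\,\delta Q'$ over a finite number of equal steps and compares the total with $Q^2/2C$. The function name and the sample values (Q = 9 × 10⁻⁸ C on a 900 pF capacitor) are assumptions made for the example.

```python
def work_by_steps(total_charge, capacitance, steps):
    """Sum the work (Q'/C) * dQ' over equal steps of charge transfer."""
    dq = total_charge / steps
    work = 0.0
    for i in range(steps):
        q_now = i * dq                        # charge already on conductor 1
        work += (q_now / capacitance) * dq    # work for the next small transfer
    return work


Q, C = 9e-8, 900e-12          # assumed sample values: 9e-8 C on 900 pF
exact = Q ** 2 / (2 * C)      # closed-form result Q^2 / 2C
for n in (10, 1000, 100000):
    # The stepwise sum approaches the closed form as the steps get smaller.
    print(n, work_by_steps(Q, C, n), exact)
```

With n steps the sum falls short of $Q^2/2C$ by the fraction 1/n, which is the second-order term that becomes negligible as the steps shrink.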
We can write the final result, Eq. (2.72) in different ways

$$W = \frac{Q^2}{2C} = \frac{1}{2}CV^2 = \frac{1}{2}QV \qquad (2.73)$$
Since electrostatic force is conservative, this work is stored in the form
of potential energy of the system. For the same reason, the final result for
potential energy [Eq. (2.73)] is independent of the manner in which the
charge configuration of the capacitor is built up. When the capacitor
discharges, this stored-up energy is released. It is possible to view the
potential energy of the capacitor as stored in the electric field between
the plates. To see this, consider for simplicity, a parallel plate capacitor
[of area A (of each plate) and separation d between the plates].
Energy stored in the capacitor

$$W = \frac{1}{2}\,\frac{Q^2}{C} = \frac{(A\sigma)^2}{2} \times \frac{d}{\varepsilon_0 A} \qquad (2.74)$$

The surface charge density $\sigma$ is related to the electric field E between
the plates,

$$E = \frac{\sigma}{\varepsilon_0} \qquad (2.75)$$

From Eqs. (2.74) and (2.75), we get
Energy stored in the capacitor

$$U = \frac{1}{2}\,\varepsilon_0 E^2 \times A d \qquad (2.76)$$

Note that Ad is the volume of the region between the plates (where
electric field alone exists). If we define energy density as energy stored
per unit volume of space, Eq (2.76) shows that
Energy density of electric field,

$$u = \frac{1}{2}\,\varepsilon_0 E^2 \qquad (2.77)$$
Though we derived Eq. (2.77) for the case of a parallel plate capacitor,
the result on energy density of an electric field is, in fact, very general and
holds true for electric field due to any configuration of charges.
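As a rough check of Eqs. (2.76) and (2.77), the sketch below computes the energy of a parallel plate capacitor in two ways: from (1/2)CV² with C = ε₀A/d, and from the energy density u multiplied by the volume Ad, with E = V/d. The function names, the rounded value of ε₀ and the sample dimensions are assumptions chosen for illustration.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space in F/m (rounded)


def capacitor_energy(area, separation, voltage):
    """Energy from (1/2) C V^2, with C = epsilon_0 * A / d for a vacuum gap."""
    capacitance = EPSILON_0 * area / separation
    return 0.5 * capacitance * voltage ** 2


def field_energy(area, separation, voltage):
    """Energy from u = (1/2) epsilon_0 E^2 (Eq. 2.77) times the volume A*d."""
    field = voltage / separation                  # uniform field between the plates
    energy_density = 0.5 * EPSILON_0 * field ** 2
    return energy_density * area * separation


# Assumed sample values: A = 6e-3 m^2, d = 3 mm, V = 100 V.
u_cap = capacitor_energy(6e-3, 3e-3, 100.0)
u_field = field_energy(6e-3, 3e-3, 100.0)
print(u_cap, u_field, math.isclose(u_cap, u_field, rel_tol=1e-9))
```

The two routes agree because they are the same expression rearranged; the second simply attributes the energy to the field filling the volume between the plates.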
Example 2.10 (a) A 900 pF capacitor is charged by 100 V battery
[Fig. 2.31(a)]. How much electrostatic energy is stored by the capacitor?
(b) The capacitor is disconnected from the battery and connected to
another 900 pF capacitor [Fig. 2.31(b)]. What is the electrostatic energy
stored by the system?

FIGURE 2.31

Solution
(a) The charge on the capacitor is

$$Q = CV = 900 \times 10^{-12}\ \mathrm{F} \times 100\ \mathrm{V} = 9 \times 10^{-8}\ \mathrm{C}$$

The energy stored by the capacitor is

$$\frac{1}{2}CV^2 = \frac{1}{2}QV = \frac{1}{2} \times 9 \times 10^{-8}\ \mathrm{C} \times 100\ \mathrm{V} = 4.5 \times 10^{-6}\ \mathrm{J}$$

(b) In the steady situation, the two capacitors have their positive
plates at the same potential, and their negative plates at the
same potential. Let the common potential difference be V′. The
charge on each capacitor is then Q′ = CV′. By charge conservation,
Q′ = Q/2. This implies V′ = V/2. The total energy of the system is

$$2 \times \frac{1}{2}Q'V' = \frac{1}{4}QV = 2.25 \times 10^{-6}\ \mathrm{J}$$

Thus in going from (a) to (b), though no charge is lost, the final
energy is only half the initial energy. Where has the remaining
energy gone?
There is a transient period before the system settles to the
situation (b). During this period, a transient current flows from
the first capacitor to the second. Energy is lost during this time
in the form of heat and electromagnetic radiation.
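The arithmetic of Example 2.10 can be reproduced with a short script. This is a sketch under stated assumptions: the helper name `energy_after_sharing` is introduced here for illustration, and it generalises the example slightly by allowing the second (initially uncharged) capacitor to have a different capacitance, using charge conservation to find the common potential difference.

```python
import math


def stored_energy(capacitance, voltage):
    """Energy (1/2) C V^2 stored in a charged capacitor."""
    return 0.5 * capacitance * voltage ** 2


def energy_after_sharing(c1, v1, c2):
    """Total energy after c1 (charged to v1) is connected to an uncharged c2.

    Charge conservation gives the common potential difference
    V' = c1 * v1 / (c1 + c2).
    """
    v_common = c1 * v1 / (c1 + c2)
    return stored_energy(c1, v_common) + stored_energy(c2, v_common)


C = 900e-12   # 900 pF
V = 100.0     # 100 V battery

u_initial = stored_energy(C, V)
u_final = energy_after_sharing(C, V, C)
print("initial energy (J):", u_initial)
print("final energy (J):  ", u_final)
print("energy lost (J):   ", u_initial - u_final)
```

For two equal capacitors the final energy comes out as half the initial energy, matching part (b); the difference is the energy dissipated during the transient.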

2.16 VAN DE GRAAFF GENERATOR

Van de Graaff generator, principle and demonstration:
http://www.physics.gla.ac.uk/~kskeldon/PubSci/exhibits/E10/

This is a machine that can build up high voltages of the order of a few
million volts. The resulting large electric fields are used to accelerate
charged particles (electrons, protons, ions) to high energies needed for
experiments to probe the small scale structure of matter. The principle
underlying the machine is as follows.
Suppose we have a large spherical conducting shell of radius R, on
which we place a charge Q. This charge spreads itself uniformly all over
the sphere. As we have seen in Section 1.14, the field outside the sphere
is just that of a point charge Q at the centre; while the field inside the
sphere vanishes. So the potential outside is that of a point charge; and
inside it is constant, namely the value at the radius R. We thus have:
Potential inside conducting spherical shell of radius R carrying charge Q
= constant

$$= \frac{1}{4\pi\varepsilon_0}\frac{Q}{R} \qquad (2.78)$$

Now, as shown in Fig. 2.32, let us suppose that in some way we
introduce a small sphere of radius r, carrying some charge q, into the
large one, and place it at the centre. The potential due to this new charge
clearly has the following values at the radii indicated:
Potential due to small sphere of radius r carrying charge q

$$= \frac{1}{4\pi\varepsilon_0}\frac{q}{r} \quad \text{at surface of small sphere}$$

$$= \frac{1}{4\pi\varepsilon_0}\frac{q}{R} \quad \text{at large shell of radius } R. \qquad (2.79)$$

Taking both charges q and Q into account we have for the total
potential V and the potential difference the values

$$V(R) = \frac{1}{4\pi\varepsilon_0}\left(\frac{Q}{R} + \frac{q}{R}\right)$$

$$V(r) = \frac{1}{4\pi\varepsilon_0}\left(\frac{Q}{R} + \frac{q}{r}\right)$$

$$V(r) - V(R) = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{r} - \frac{1}{R}\right) \qquad (2.80)$$

Assume now that q is positive. We see that, independent of the amount
of charge Q that may have accumulated on the larger sphere, and even if
it is positive, the inner sphere is always at a higher potential: the
difference V(r) − V(R) is positive. The potential due to Q is constant
up to radius R and so cancels out in the difference!
This means that if we now connect the smaller and larger sphere by a
wire, the charge q on the former will immediately flow onto the latter,
even though the charge Q may be quite large. The natural tendency is for
positive charge to move from higher to lower potential. Thus, provided
we are somehow able to introduce the small charged sphere into the
larger one, we can in this way keep piling up larger and larger amounts
of charge on the latter. The potential (Eq. 2.78) at the outer sphere
would also keep rising, at least until we reach the breakdown field of air.

FIGURE 2.32 Illustrating the principle of the electrostatic generator.
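The claim that the potential difference does not depend on Q can be illustrated numerically. In the sketch below, the function name and all numerical values (q, r, R and the list of trial values of Q) are assumptions chosen only for illustration; the formulas are the expressions for V(R) and V(r) that lead to Eq. (2.80).

```python
import math

EPSILON_0 = 8.854e-12                     # F/m (rounded)
K = 1.0 / (4.0 * math.pi * EPSILON_0)     # Coulomb constant 1/(4*pi*epsilon_0)


def potential_difference(q, Q, r, R):
    """V(r) - V(R) for a small sphere (q, r) at the centre of a shell (Q, R)."""
    v_shell = K * (Q / R + q / R)   # total potential at the large shell
    v_inner = K * (Q / R + q / r)   # total potential at the small sphere
    return v_inner - v_shell


q, r, R = 1e-9, 0.1, 1.0            # assumed: 1 nC on a 10 cm sphere inside a 1 m shell
for Q in (0.0, 1e-6, 1e-3):         # shell charge varied over many orders of magnitude
    print(Q, potential_difference(q, Q, r, R))
```

Every trial value of Q gives the same positive difference (up to rounding), since the term Q/R is common to both potentials and cancels, as stated in the text.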
This is the principle of the Van de Graaff generator. It is a machine
capable of building up potential difference of a few million volts, and
fields close to the breakdown field of air, which is about
$3 \times 10^{6}$ V/m. A schematic diagram of the Van de Graaff generator
is given in Fig. 2.33. A large spherical conducting shell (of a few metres
radius) is supported at a height several metres above the ground on
an insulating column. A long narrow endless belt of insulating material,
like rubber or silk, is wound around two pulleys, one at ground level
and one at the centre of the shell. This belt is kept continuously moving
by a motor driving the lower pulley. It continuously carries positive
charge, sprayed on to it by a brush at ground level, to the top. There it
transfers its positive charge to another conducting brush connected to
the large shell. Thus positive charge is transferred to the shell, where
it spreads out uniformly on the outer surface. In this way, voltage
differences of as much as 6 or 8 million volts (with respect to ground)
can be built up.

FIGURE 2.33 Principle of construction of Van de Graaff generator.
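The link between shell size and attainable voltage can be sketched as follows. For an isolated charged sphere the surface field is Q/(4πε₀R²) and the potential is Q/(4πε₀R) (Eq. 2.78), so at the surface V = ER. The function name and the trial radii are assumptions made here for illustration; the breakdown field is the value quoted in the text.

```python
BREAKDOWN_FIELD_AIR = 3e6  # V/m, approximate breakdown field of air quoted in the text


def max_shell_potential(radius_m, breakdown_field=BREAKDOWN_FIELD_AIR):
    """Largest potential (in volts) a spherical shell can hold before the
    field at its surface reaches the breakdown value, using V = E * R."""
    return breakdown_field * radius_m


for radius in (0.5, 1.0, 2.0):      # assumed trial radii in metres
    print(radius, "m ->", max_shell_potential(radius), "V")
```

On this estimate a shell of about 2 m radius in air tops out near 6 million volts, which is consistent with the voltages mentioned above.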

SUMMARY

1. Electrostatic force is a conservative force. Work done by an external
force (equal and opposite to the electrostatic force) in bringing a charge
q from a point R to a point P is $q(V_P - V_R)$, which is the difference in
potential energy of charge q between the final and initial points.

2. Potential at a point is the work done per unit charge (by an external
agency) in bringing a charge from infinity to that point. Potential at a
point is arbitrary to within an additive constant, since it is the potential
difference between two points which is physically significant. If potential
at infinity is chosen to be zero, potential at a point with position vector
r due to a point charge Q placed at the origin is given by

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r}$$

3. The electrostatic potential at a point with position vector r due to a
point dipole of dipole moment p placed at the origin is

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\frac{\mathbf{p}\cdot\hat{\mathbf{r}}}{r^2}$$

The result is true also for a dipole (with charges −q and q separated by
2a) for r >> a.
4. For a charge configuration q1, q2, ..., qn with position vectors r1,
r2, ... rn, the potential at a point P is given by the superposition principle

$$V = \frac{1}{4\pi\varepsilon_0}\left(\frac{q_1}{r_{1P}} + \frac{q_2}{r_{2P}} + \dots + \frac{q_n}{r_{nP}}\right)$$

where $r_{1P}$ is the distance between q1 and P, and so on.


5. An equipotential surface is a surface over which potential has a constant
value. For a point charge, concentric spheres centred at the location of
the charge are equipotential surfaces. The electric field E at a point is
perpendicular to the equipotential surface through the point. E is in the
direction of the steepest decrease of potential.

6. Potential energy stored in a system of charges is the work done (by an
external agency) in assembling the charges at their locations. Potential
energy of two charges q1, q2 at r1, r2 is given by

$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r_{12}}$$

where $r_{12}$ is the distance between q1 and q2.


7. The potential energy of a charge q in an external potential V(r) is qV(r).
The potential energy of a dipole moment p in a uniform electric field E
is $-\mathbf{p}\cdot\mathbf{E}$.

8. The electrostatic field E is zero in the interior of a conductor; just
outside the surface of a charged conductor, E is normal to the surface
and is given by

$$\mathbf{E} = \frac{\sigma}{\varepsilon_0}\,\hat{\mathbf{n}}$$

where $\hat{\mathbf{n}}$ is the unit vector along the outward normal to the
surface and σ is the surface charge density. Charges in a conductor can
reside only at its surface. Potential is constant within and on the surface
of a conductor. In a cavity within a conductor (with no charges), the
electric field is zero.

9. A capacitor is a system of two conductors separated by an insulator. Its
capacitance is defined by C = Q/V, where Q and −Q are the charges on
the two conductors and V is the potential difference between them. C is
determined purely geometrically, by the shapes, sizes and relative
positions of the two conductors. The unit of capacitance is the farad:
1 F = 1 C V⁻¹. For a parallel plate capacitor (with vacuum between the
plates),

$$C = \varepsilon_0\frac{A}{d}$$

where A is the area of each plate and d the separation between them.
10. If the medium between the plates of a capacitor is filled with an insulating
substance (dielectric), the electric field due to the charged plates induces
a net dipole moment in the dielectric. This effect, called polarisation,
gives rise to a field in the opposite direction. The net electric field inside
the dielectric and hence the potential difference between the plates is
thus reduced. Consequently, the capacitance C increases from its value
$C_0$ when there is no medium (vacuum),

$$C = KC_0$$

where K is the dielectric constant of the insulating substance.

11. For capacitors in the series combination, the total capacitance C is given by

$$\frac{1}{C} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3} + \dots$$

In the parallel combination, the total capacitance C is:

$$C = C_1 + C_2 + C_3 + \dots$$

where $C_1, C_2, C_3, \dots$ are individual capacitances.
12. The energy U stored in a capacitor of capacitance C, with charge Q and
voltage V is

$$U = \frac{1}{2}QV = \frac{1}{2}CV^2 = \frac{1}{2}\frac{Q^2}{C}$$

The electric energy density (energy per unit volume) in a region with
electric field is $(1/2)\varepsilon_0 E^2$.
13. A Van de Graaff generator consists of a large spherical conducting shell
(a few metres in diameter). By means of a moving belt and suitable brushes,
charge is continuously transferred to the shell and potential difference
of the order of several million volts is built up, which can be used for
accelerating charged particles.
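The combination rules in point 11 translate directly into two small helper functions. This is a minimal sketch; the function names and the sample capacitances are assumptions chosen for illustration, and any consistent unit may be used for the inputs.

```python
def series_capacitance(*capacitances):
    """Total capacitance in series: 1/C = 1/C1 + 1/C2 + ..."""
    return 1.0 / sum(1.0 / c for c in capacitances)


def parallel_capacitance(*capacitances):
    """Total capacitance in parallel: C = C1 + C2 + ..."""
    return sum(capacitances)


# Assumed sample values, all in the same unit (say microfarads).
print(series_capacitance(2.0, 3.0, 6.0))
print(parallel_capacitance(2.0, 3.0, 4.0))
```

A series combination always comes out smaller than its smallest member, while a parallel combination is larger than its largest member, which is a quick sanity check on any result.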

| Physical quantity | Symbol | Dimensions | Unit | Remark |
|---|---|---|---|---|
| Potential | φ or V | [M¹ L² T⁻³ A⁻¹] | V | Potential difference is physically significant |
| Capacitance | C | [M⁻¹ L⁻² T⁴ A²] | F | |
| Polarisation | P | [L⁻² A T] | C m⁻² | Dipole moment per unit volume |
| Dielectric constant | K | [Dimensionless] | | |

POINTS TO PONDER
1. Electrostatics deals with forces between charges at rest. But if there is
a force on a charge, how can it be at rest? Thus, when we are talking of
electrostatic force between charges, it should be understood that each
charge is being kept at rest by some unspecified force that opposes the
net Coulomb force on the charge.

2. A capacitor is so configured that it confines the electric field lines
within a small region of space. Thus, even though field may have
considerable strength, the potential difference between the two
conductors of a capacitor is small.

3. Electric field is discontinuous across the surface of a spherical charged
shell. It is zero inside and $\frac{\sigma}{\varepsilon_0}\hat{\mathbf{n}}$ outside.
Electric potential is, however, continuous across the surface, equal to
$q/4\pi\varepsilon_0 R$ at the surface.

4. The torque $\mathbf{p} \times \mathbf{E}$ on a dipole causes it to oscillate
about E. Only if there is a dissipative mechanism, the oscillations are
damped and the dipole eventually aligns with E.

5. Potential due to a charge q at its own location is not defined; it is
infinite.

6. In the expression qV(r) for potential energy of a charge q, V(r) is the
potential due to external charges and not the potential due to q. As seen
in point 5, this expression will be ill-defined if V(r) includes potential
due to a charge q itself.

7. A cavity inside a conductor is shielded from outside electrical influences.
It is worth noting that electrostatic shielding does not work the other
way round; that is, if you put charges inside the cavity, the exterior of
the conductor is not shielded from the fields by the inside charges.

EXERCISES
2.1 Two charges $5 \times 10^{-8}$ C and $-3 \times 10^{-8}$ C are located 16 cm apart. At
what point(s) on the line joining the two charges is the electric
potential zero? Take the potential at infinity to be zero.

2.2 A regular hexagon of side 10 cm has a charge 5 µC at each of its
vertices. Calculate the potential at the centre of the hexagon.

2.3 Two charges 2 µC and −2 µC are placed at points A and B 6 cm
apart.
(a) Identify an equipotential surface of the system.
(b) What is the direction of the electric field at every point on this
surface?

2.4 A spherical conductor of radius 12 cm has a charge of $1.6 \times 10^{-7}$ C
distributed uniformly on its surface. What is the electric field
(a) inside the sphere
(b) just outside the sphere
(c) at a point 18 cm from the centre of the sphere?

2.5 A parallel plate capacitor with air between the plates has a
capacitance of 8 pF (1 pF = $10^{-12}$ F). What will be the capacitance if
the distance between the plates is reduced by half, and the space
between them is filled with a substance of dielectric constant 6?

2.6 Three capacitors each of capacitance 9 pF are connected in series.
(a) What is the total capacitance of the combination?
(b) What is the potential difference across each capacitor if the
combination is connected to a 120 V supply?

2.7 Three capacitors of capacitances 2 pF, 3 pF and 4 pF are connected
in parallel.
(a) What is the total capacitance of the combination?
(b) Determine the charge on each capacitor if the combination is
connected to a 100 V supply.

2.8 In a parallel plate capacitor with air between the plates, each plate
has an area of $6 \times 10^{-3}$ m² and the distance between the plates is 3 mm.
Calculate the capacitance of the capacitor. If this capacitor is
connected to a 100 V supply, what is the charge on each plate of
the capacitor?
2.9 Explain what would happen if in the capacitor given in Exercise
2.8, a 3 mm thick mica sheet (of dielectric constant = 6) were inserted
between the plates,
(a) while the voltage supply remained connected.
(b) after the supply was disconnected.

2.10 A 12 pF capacitor is connected to a 50 V battery. How much
electrostatic energy is stored in the capacitor?

2.11 A 600 pF capacitor is charged by a 200 V supply. It is then
disconnected from the supply and is connected to another
uncharged 600 pF capacitor. How much electrostatic energy is lost
in the process?

ADDITIONAL EXERCISES
2.12 A charge of 8 mC is located at the origin. Calculate the work done in
taking a small charge of $-2 \times 10^{-9}$ C from a point P (0, 0, 3 cm) to a
point Q (0, 4 cm, 0), via a point R (0, 6 cm, 9 cm).

2.13 A cube of side b has a charge q at each of its vertices. Determine the
potential and electric field due to this charge array at the centre of
the cube.

2.14 Two tiny spheres carrying charges 1.5 µC and 2.5 µC are located 30 cm
apart. Find the potential and electric field:
(a) at the mid-point of the line joining the two charges, and
(b) at a point 10 cm from this mid-point in a plane normal to the
line and passing through the mid-point.

2.15 A spherical conducting shell of inner radius r1 and outer radius r2
has a charge Q.
(a) A charge q is placed at the centre of the shell. What is the
surface charge density on the inner and outer surfaces of the
shell?
(b) Is the electric field inside a cavity (with no charge) zero, even if
the shell is not spherical, but has any irregular shape? Explain.

2.16 (a) Show that the normal component of electrostatic field has a
discontinuity from one side of a charged surface to another
given by

$$(\mathbf{E}_2 - \mathbf{E}_1)\cdot\hat{\mathbf{n}} = \frac{\sigma}{\varepsilon_0}$$

where $\hat{\mathbf{n}}$ is a unit vector normal to the surface at a point and
σ is the surface charge density at that point. (The direction of
$\hat{\mathbf{n}}$ is from side 1 to side 2.) Hence show that just outside a
conductor, the electric field is $\sigma\hat{\mathbf{n}}/\varepsilon_0$.
(b) Show that the tangential component of electrostatic field is
continuous from one side of a charged surface to another. [Hint:
For (a), use Gauss's law. For (b), use the fact that work done by
electrostatic field on a closed loop is zero.]

2.17 A long charged cylinder of linear charge density λ is surrounded
by a hollow co-axial conducting cylinder. What is the electric field in
the space between the two cylinders?

2.18 In a hydrogen atom, the electron and proton are bound at a distance
of about 0.53 Å:
(a) Estimate the potential energy of the system in eV, taking the
zero of the potential energy at infinite separation of the electron
from proton.
(b) What is the minimum work required to free the electron, given
that its kinetic energy in the orbit is half the magnitude of
potential energy obtained in (a)?
(c) What are the answers to (a) and (b) above if the zero of potential
energy is taken at 1.06 Å separation?

2.19 If one of the two electrons of a H2 molecule is removed, we get a
hydrogen molecular ion H2⁺. In the ground state of an H2⁺, the two
protons are separated by roughly 1.5 Å, and the electron is roughly
1 Å from each proton. Determine the potential energy of the system.
Specify your choice of the zero of potential energy.

2.20 Two charged conducting spheres of radii a and b are connected to
each other by a wire. What is the ratio of electric fields at the surfaces
of the two spheres? Use the result obtained to explain why charge
density on the sharp and pointed ends of a conductor is higher
than on its flatter portions.

2.21 Two charges −q and +q are located at points (0, 0, −a) and (0, 0, a),
respectively.
(a) What is the electrostatic potential at the points (0, 0, z) and
(x, y, 0)?
(b) Obtain the dependence of potential on the distance r of a point
from the origin when r/a >> 1.
(c) How much work is done in moving a small test charge from the
point (5, 0, 0) to (−7, 0, 0) along the x-axis? Does the answer
change if the path of the test charge between the same points
is not along the x-axis?

2.22 Figure 2.34 shows a charge array known as an electric quadrupole.
For a point on the axis of the quadrupole, obtain the dependence
of potential on r for r/a >> 1, and contrast your results with that
due to an electric dipole, and an electric monopole (i.e., a single
charge).

FIGURE 2.34

2.23 An electrical technician requires a capacitance of 2 µF in a circuit
across a potential difference of 1 kV. A large number of 1 µF capacitors
are available to him, each of which can withstand a potential
difference of not more than 400 V. Suggest a possible arrangement
that requires the minimum number of capacitors.

2.24 What is the area of the plates of a 2 F parallel plate capacitor, given
that the separation between the plates is 0.5 cm? [You will realise
from your answer why ordinary capacitors are in the range of µF or
less. However, electrolytic capacitors do have a much larger
capacitance (0.1 F) because of very minute separation between the
conductors.]

2.25 Obtain the equivalent capacitance of the network in Fig. 2.35. For a
300 V supply, determine the charge and voltage across each capacitor.

FIGURE 2.35

2.26 The plates of a parallel plate capacitor have an area of 90 cm² each
and are separated by 2.5 mm. The capacitor is charged by connecting
it to a 400 V supply.
(a) How much electrostatic energy is stored by the capacitor?
(b) View this energy as stored in the electrostatic field between
the plates, and obtain the energy per unit volume u. Hence
arrive at a relation between u and the magnitude of electric
field E between the plates.

2.27 A 4 µF capacitor is charged by a 200 V supply. It is then disconnected
from the supply, and is connected to another uncharged 2 µF
capacitor. How much electrostatic energy of the first capacitor is
lost in the form of heat and electromagnetic radiation?

2.28 Show that the force on each plate of a parallel plate capacitor has a
magnitude equal to (½) QE, where Q is the charge on the capacitor,
and E is the magnitude of electric field between the plates. Explain
the origin of the factor ½.

2.29 A spherical capacitor consists of two concentric spherical conductors,
held in position by suitable insulating supports (Fig. 2.36). Show
that the capacitance of a spherical capacitor is given by

$$C = \frac{4\pi\varepsilon_0\, r_1 r_2}{r_1 - r_2}$$

where r1 and r2 are the radii of outer and inner spheres,
respectively.

FIGURE 2.36

2.30 A spherical capacitor has an inner sphere of radius 12 cm and an
outer sphere of radius 13 cm. The outer sphere is earthed and the
inner sphere is given a charge of 2.5 µC. The space between the
concentric spheres is filled with a liquid of dielectric constant 32.
(a) Determine the capacitance of the capacitor.
(b) What is the potential of the inner sphere?
(c) Compare the capacitance of this capacitor with that of an
isolated sphere of radius 12 cm. Explain why the latter is much
smaller.

2.31 Answer carefully:
(a) Two large conducting spheres carrying charges Q1 and Q2 are
brought close to each other. Is the magnitude of electrostatic
force between them exactly given by $Q_1 Q_2/4\pi\varepsilon_0 r^2$, where r is
the distance between their centres?
(b) If Coulomb's law involved 1/r³ dependence (instead of 1/r²),
would Gauss's law be still true?
(c) A small test charge is released at rest at a point in an
electrostatic field configuration. Will it travel along the field
line passing through that point?
(d) What is the work done by the field of a nucleus in a complete
circular orbit of the electron? What if the orbit is elliptical?
(e) We know that electric field is discontinuous across the surface
of a charged conductor. Is electric potential also discontinuous
there?
(f) What meaning would you give to the capacitance of a single
conductor?
(g) Guess a possible reason why water has a much greater
dielectric constant (= 80) than say, mica (= 6).

2.32 A cylindrical capacitor has two co-axial cylinders of length 15 cm
and radii 1.5 cm and 1.4 cm. The outer cylinder is earthed and the
inner cylinder is given a charge of 3.5 µC. Determine the capacitance
of the system and the potential of the inner cylinder. Neglect end
effects (i.e., bending of field lines at the ends).

2.33 A parallel plate capacitor is to be designed with a voltage rating
1 kV, using a material of dielectric constant 3 and dielectric strength
about $10^{7}$ V m⁻¹. (Dielectric strength is the maximum electric field a
material can tolerate without breakdown, i.e., without starting to
conduct electricity through partial ionisation.) For safety, we should
like the field never to exceed, say 10% of the dielectric strength.
What minimum area of the plates is required to have a capacitance
of 50 pF?

2.34 Describe schematically the equipotential surfaces corresponding to
(a) a constant electric field in the z-direction,
(b) a field that uniformly increases in magnitude but remains in a
constant (say, z) direction,
(c) a single positive charge at the origin, and
(d) a uniform grid consisting of long equally spaced parallel charged
wires in a plane.

2.35 In a Van de Graaff type generator a spherical metal shell is to be a
$15 \times 10^{6}$ V electrode. The dielectric strength of the gas surrounding
the electrode is $5 \times 10^{7}$ V m⁻¹. What is the minimum radius of the
spherical shell required? (You will learn from this exercise why one
cannot build an electrostatic generator using a very small shell
which requires a small charge to acquire a high potential.)

2.36 A small sphere of radius r1 and charge q1 is enclosed by a spherical
shell of radius r2 and charge q2. Show that if q1 is positive, charge
will necessarily flow from the sphere to the shell (when the two are
connected by a wire) no matter what the charge q2 on the shell is.

2.37 Answer the following:
(a) The top of the atmosphere is at about 400 kV with respect to
the surface of the earth, corresponding to an electric field that
decreases with altitude. Near the surface of the earth, the field
is about 100 V m⁻¹. Why then do we not get an electric shock as
we step out of our house into the open? (Assume the house to
be a steel cage so there is no field inside!)
(b) A man fixes outside his house one evening a two metre high
insulating slab carrying on its top a large aluminium sheet of
area 1 m². Will he get an electric shock if he touches the metal
sheet next morning?
(c) The discharging current in the atmosphere due to the small
conductivity of air is known to be 1800 A on an average over
the globe. Why then does the atmosphere not discharge itself
completely in due course and become electrically neutral? In
other words, what keeps the atmosphere charged?
(d) What are the forms of energy into which the electrical energy
of the atmosphere is dissipated during a lightning?
(Hint: The earth has an electric field of about 100 V m⁻¹ at its
surface in the downward direction, corresponding to a surface
charge density = $-10^{-9}$ C m⁻². Due to the slight conductivity of
the atmosphere up to about 50 km (beyond which it is a good
conductor), about +1800 C is pumped every second into the
earth as a whole. The earth, however, does not get discharged
since thunderstorms and lightning occurring continually all
over the globe pump an equal amount of negative charge on
the earth.)
