GEOMETRICAL OPTICS
VIRENDRA N. MAHAJAN
FUNDAMENTALS OF
GEOMETRICAL OPTICS
Virendra N. Mahajan
THE AEROSPACE CORPORATION
AND
COLLEGE OF OPTICAL SCIENCES - THE UNIVERSITY OF ARIZONA
SPIE PRESS
Bellingham, Washington USA
Library of Congress Cataloging-in-Publication Data
Mahajan, Virendra N.
Fundamentals of geometrical optics / Virendra N. Mahajan.
pages cm
Includes bibliographical references and index.
ISBN 978-0-8194-9998-1
1. Geometrical optics--Study and teaching. 2. Optical instruments--Reliability--Study
and teaching. 3. Diffraction--Study and teaching. I. Title.
QC382.M34 2014
535'.32--dc23
2014010949
Published by
SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA
Phone: +1 360.676.3290
Fax: +1 360.647.1445
Email: Books@spie.org
Web: http://spie.org
All rights reserved. No part of this publication may be reproduced or distributed in any
form or by any means without written permission of the publisher.
The content of this book reflects the work and thought of the author(s). Every effort has
been made to publish reliable and accurate information herein, but the publisher is not
responsible for the validity of the information or for any outcomes resulting from reliance
thereon.
SHASHI PRABHA
FOREWORD
We are living in the most exciting time, so far, in the use and application
of the phenomenon of light, as we understand it. Optics is now an
important subject in many disciplines, and so competence in optics is at
issue. This volume provides the interested reader with a solid resource to
embark on learning about geometrical optics, which is the foundation of
imaging and non-imaging optics. Professor Virendra N. Mahajan provides
a clear and detailed discussion of essential topics for the understanding of
image formation.
Dr. Mahajan has significant experience teaching and writing about the
subject. He is well known in the optics community and has traveled
around the world, lecturing about optical imaging and aberrations; one of
his favorite topics is the use of Zernike polynomials in optics.
I have known Dr. Mahajan ever since he started teaching at the College of
Optical Sciences in 2005. He flew back and forth from Los Angeles to
Tucson every week to share his knowledge with students. I have also
enjoyed noticing the fine interest and polite interaction he has with his
optics colleagues.
From interacting with Dr. Mahajan over the years, it is apparent that
he is concerned with clearly connecting topics in geometrical optics to
provide students with a solid foundation. One example is his detailed
style in describing and deriving, say, the laws of geometrical optics and ray
tracing in 3D, and the evolution of Gaussian optics from them. Another
example is his insightful explanation of how the individual primary
aberration coefficients of a system of surfaces can be added directly
to form the overall system’s coefficients.
TABLE OF CONTENTS
FUNDAMENTALS OF
GEOMETRICAL OPTICS
Preface ............................................................................................................................ xix
Acknowledgment............................................................................................................ xxi
1.7 Paraxial Ray Tracing............................................................................................. 24
1.7.1 Snell’s Law ................................................................................................25
1.7.2 Point on a Spherical Surface ......................................................................25
1.7.3 Distance between Two Points....................................................................25
1.7.4 Unit Vector along a Surface Normal ......................................................... 26
1.7.5 Unit Vector along a Ray ............................................................................26
1.7.6 Transfer of a Ray ....................................................................................... 26
1.7.7 Refraction of a Ray ....................................................................................27
1.7.8 Reflection of a Ray ....................................................................................27
1.8 Gaussian Approximation and Imaging ................................................................28
1.8.1 Gaussian Approximation ........................................................................... 28
1.8.2 Gaussian Imaging by a Refracting Surface ............................................... 29
1.8.3 Gaussian Imaging by a Reflecting Surface................................................31
1.8.4 Gaussian Imaging by a Multisurface System ............................................34
1.9 Imaging beyond Gaussian Approximation ..........................................................34
2.11 Summary of Results ............................................................................................. 109
2.11.1 Imaging Equations ................................................................................... 109
2.11.1.1 General System ........................................................................109
2.11.1.2 Refracting Surface ................................................................... 111
2.11.1.3 Thin Lens ................................................................................. 111
2.11.1.4 Afocal System..........................................................................112
2.11.1.5 Plane-Parallel Plate ..................................................................112
2.11.2 Petzval Image ..........................................................................................112
2.11.3 Misalignments..........................................................................................113
2.11.3.1 Misaligned Surface ..................................................................113
2.11.3.2 Misaligned Thin Lens ..............................................................113
2.11.4 Anamorphic Imaging Systems ................................................................113
Problems ......................................................................................................................... 115
CHAPTER 4: PARAXIAL RAY TRACING
4.1 Introduction ..........................................................................................................147
4.2 Refracting Surface ............................................................................................... 148
4.3 General System..................................................................................................... 152
4.3.1 Determination of Cardinal Points ............................................................152
4.3.2 Combination of Two Systems ................................................................. 154
4.4 Thin Lens ..............................................................................................................155
4.5 Thick Lens ............................................................................................................159
4.6 Two-Lens System ................................................................................................. 162
4.7 Reflecting Surface (Mirror) ................................................................................165
4.8 Two-Mirror System ............................................................................................. 168
4.8.1 Focal Length ............................................................................................168
4.8.2 Obscuration ..............................................................................................170
4.9 Catadioptric System: Thin-Lens–Mirror Combination ................................... 172
4.10 Two-Ray Lagrange Invariant ............................................................................. 174
4.11 Summary of Results ............................................................................................. 177
4.11.1 Ray-Tracing Equations ............................................................................177
4.11.2 Thick Lens ............................................................................................... 179
4.11.3 Two-Lens System ....................................................................................180
4.11.4 Two-Mirror System ................................................................................. 180
4.11.5 Two-Ray Lagrange Invariant ..................................................................181
Problems ......................................................................................................................... 182
8.3 Wavefront Defocus Aberration ..........................................................................322
8.4 Wavefront Tilt Aberration ..................................................................................325
8.5 Aberrations of a Rotationally Symmetric System............................................. 326
8.5.1 Explicit Dependence on Object Coordinates........................................... 326
8.5.2 No Explicit Dependence on Object Coordinates ..................................... 329
8.6 Additivity of Primary Aberrations ..................................................................... 331
8.6.1 Introduction..............................................................................................331
8.6.2 Primary Wave Aberrations ......................................................................332
8.6.3 Transverse Ray Aberrations ....................................................................335
8.6.4 Off-Axis Point Object ..............................................................................336
8.6.5 Higher-Order Aberrations........................................................................337
8.7 Strehl Ratio and Aberration Balancing ............................................................. 337
8.7.1 Strehl Ratio ..............................................................................................337
8.7.2 Aberration Balancing............................................................................... 338
8.8 Zernike Circle Polynomials................................................................................. 340
8.8.1 Introduction..............................................................................................340
8.8.2 Polynomials in Optical Design ................................................................341
8.8.3 Polynomials in Optical Testing ............................................................... 345
8.8.4 Characteristics of Polynomial Aberrations ..............................................349
8.8.4.1 Isometric Characteristics ........................................................... 349
8.8.4.2 Interferometric Characteristics ..................................................350
8.9 Relationship between Zernike Polynomials and Classical Aberrations ......... 352
8.9.1 Introduction..............................................................................................352
8.9.2 Wavefront Tilt Aberration ....................................................................... 352
8.9.3 Wavefront Defocus Aberration................................................................353
8.9.4 Astigmatism............................................................................................. 353
8.9.5 Coma ........................................................................................................354
8.9.6 Spherical Aberration ................................................................................355
8.9.7 Seidel Coefficients from Zernike Coefficients ........................................355
8.10 Aberrations of an Anamorphic System ..............................................................356
8.10.1 Introduction..............................................................................................356
8.10.2 Classical Aberrations ............................................................................... 357
8.10.3 Polynomial Aberrations Orthonormal over a Rectangular Pupil ............358
8.10.4 Expansion of a Rectangular Aberration Function in Terms of
Orthonormal Rectangular Polynomials ................................................... 360
8.11 Observation of Aberrations ................................................................................363
8.11.1 Primary Aberrations ................................................................................364
8.11.2 Interferograms..........................................................................................364
8.11.3 Random Aberrations ................................................................................369
8.12 Summary of Results ............................................................................................. 370
8.12.1 Wave and Ray Aberrations ......................................................................370
8.12.2 Wavefront Defocus Aberration................................................................370
8.12.3 Wavefront Tilt Aberration ....................................................................... 370
8.12.4 Primary Aberrations ................................................................................371
8.12.5 Strehl Ratio and Aberration Balancing ....................................................371
8.12.6 Zernike Circle Polynomials ..................................................................... 371
8.12.6.1 Use of Zernike Polynomials in Wavefront Analysis ............... 371
8.12.6.2 Polynomials in Optical Design ................................................371
8.12.6.3 Zernike Primary Aberrations ................................................... 372
8.12.6.4 Polynomials in Optical Testing ............................................... 373
8.12.6.5 Isometric and Interferometric Characteristics ......................... 374
8.12.7 Relationship between Zernike and Seidel Coefficients ........................... 374
8.12.8 Aberrations of an Anamorphic System....................................................374
Appendix: Combination of Two Zernike Polynomial Aberrations with the
Same n Value and Varying as cos mθ and sin mθ ................................. 376
References ......................................................................................................................377
Problems ......................................................................................................................... 378
EPILOGUE
E1 Introduction ..........................................................................................................423
E2 Principles of Geometrical Optics and Imaging..................................................423
E3 Ray Tracing: Exact and Paraxial ....................................................................... 423
E4 Gaussian Optics ....................................................................................................424
E4.1 Tangent Plane or Paraxial Surface ..........................................................424
E4.2 Sign Convention ......................................................................................424
E4.3 Cardinal Points ........................................................................................424
E4.4 Graphical Imaging ................................................................................... 425
E4.5 Lagrange Invariant................................................................................... 425
E4.6 Matrix Approach to Gaussian Imaging....................................................426
E4.7 Petzval Image ..........................................................................................426
E4.8 Field of View ........................................................................................... 426
E4.9 Chromatic Aberrations ............................................................................426
E5 Image Brightness ..................................................................................................427
E6 Image Quality ....................................................................................................... 427
E6.1 Wave and Ray Aberrations ......................................................................427
E6.2 Primary Aberrations ................................................................................428
E6.3 Spot Size and Aberration Balancing ........................................................429
E6.4 Strehl Ratio and Aberration Balancing ....................................................429
E7 Reflecting Systems................................................................................................430
E8 Anamorphic Imaging Systems ............................................................................430
E9 Aberration Tolerance and a Golden Rule of Optical Design ........................... 431
E10 General Comments ..............................................................................................431
References ......................................................................................................................433
Bibliography................................................................................................................... 435
Index ............................................................................................................................. 437
PREFACE
Portions of this book have their origin in the author’s lectures given as an adjunct
professor in the electrical engineering/electrophysics department of the University of
Southern California from about 1984 to 1998. It is a precursor to the author’s Optical
Imaging and Aberrations books (Part I: Ray Geometrical Optics; Part II: Wave
Diffraction Optics; and Part III: Wavefront Analysis), all published by SPIE Press. It is
an expanded yet simplified version of some of the material from Part I, and contains some
new material. The focus is on Gaussian imaging, ray tracing, radiometry, basic optical
instruments, optical aberrations, and spot diagrams. The primary aberrations of simple
systems, such as a thin lens or a two-mirror telescope, that are derived in Part I are not
discussed here. The book can be used as a textbook for a senior undergraduate or a first-
year graduate class.
Some of the familiar optical instruments such as the eye, magnifier, microscope,
telescope, and pinhole camera are addressed in Chapter 6. The most common and
interesting among them is the eye, which is discussed in detail. The resolution of such
common optical instruments is discussed based on Rayleigh’s criterion of resolution, thus
necessitating a brief discussion of the aberration-free diffraction image of a point object,
i.e., the Airy pattern. The chromatic aberrations of a system are discussed in Chapter 7. A
refracting surface, a thin lens, a plane-parallel plate, and a doublet are considered as
simple examples of systems.
The content of each chapter is summarized in its last section. This section is written
to be comprehensive enough that it can be read on its own without reading the whole
chapter. Each chapter ends with a set of problems, which are an integral part of the book.
They help develop and test how to apply the results obtained in a chapter to practical
situations.
The book ends with an epilogue, which gives a summary of the imaging process, and
outlines the next steps within and beyond geometrical optics.
ACKNOWLEDGMENT
Once again, I am pleased to acknowledge the generous support I have received over
the years from my employer, The Aerospace Corporation, in preparing this book. I am
grateful to my former classmate Dr. William H. Swantner for his advice on this work. I
had useful discussions about the human eye with my son, Vinit Bharati, who is a retina
surgeon. My thanks to Drs. Pantazis Mouroulis and Brian Stone, and two anonymous
reviewers, for reading a draft of the book and providing useful feedback. Of course, I am
the only one responsible for any shortcomings or errors in the book. My special thanks go
to Professor José Sasián for writing the Foreword. The Sanskrit verse on p. xxv was
provided by Professor Sally Sutherland of the University of California at Berkeley.
I do not have enough words to thank my wife, Shashi Prabha, for tolerating my time
away from her while I was busy writing this book. This is the last of my five books on
optical imaging and aberrations, and I dedicate it to her.
Finally, I thank SPIE Press Editor Scott McNeill and Press Manager Tim Lamkins
for their quality support in bringing this book to publication. Scott has meticulously
upgraded some of the figures, including the color figures on chromatic aberrations.
SYMBOLS AND NOTATION
a    radius of exit pupil
q    shape factor
The snow does not diminish the beauty of the Himalayan mountains
which are the source of countless gems. Indeed, one flaw is lost
among a host of virtues, as the moon’s dark spot is lost among its rays.
CHAPTER 1
FOUNDATIONS OF GEOMETRICAL OPTICS
We begin this chapter with a brief introduction of the Cartesian sign convention for
the distances and heights of the object and image points, and the angles of incidence and
refraction or reflection and slope angles of the rays. We discuss Fermat’s principle that
the optical path length of a ray from one point to another is stationary, and derive the laws
of rectilinear propagation in a homogeneous medium, refraction by a refracting surface,
and reflection by a reflecting surface (first in 2D and then in 3D). These laws are used to
obtain ray-tracing equations representing the propagation of a ray exactly from a certain
point to a point on a refracting or a reflecting surface, or refraction or reflection of the ray
by the surface, and propagation of the refracted or reflected ray to the next surface. The
purpose of exact ray tracing is to determine the aberrations of a system consisting of a
series of refracting and/or reflecting surfaces that generally have a common axis of
rotational symmetry called the optical axis. Such a system is called a centered or a
rotationally symmetric system. Its surfaces bend light rays from an object according to the
three laws to form its image.
For rays and normals to the refracting and reflecting surfaces making small angles
with the optical axis, Gauss gave an extremely useful approximation to the exact theory.
In this approximation, the sines and tangents of the angles of the rays with the optical axis
are replaced by the angles, and any diagonal distances are approximated by the
corresponding axial distances. Gaussian optics or imaging relates the object distance and
size to the image distance and size through the parameters of the imaging system such as
the radii of curvature of the surfaces and refractive indices of the media between them.
The image of an object obtained according to geometrical optics in the Gaussian
approximation is called the Gaussian image.
meridional) plane, and rays lying in this plane are called tangential (or meridional) rays.
Those rays that intersect this plane are called skew rays.
The role of an optical designer is to design an imaging system so that it can form an
image of a certain size at a certain location, given the object size and location. Given the
radiance of an extended object or the intensity of a point object, the designer chooses the
sizes of the imaging elements that yield an image of some prescribed irradiance or
intensity. Gaussian optics is also used to determine the extent of the object that can be
imaged, i.e., it is used to determine the field of view of the system. A quantity of
paramount interest that lies beyond Gaussian optics, but that a design must satisfy, is the
expected quality of the image. A designer must choose the shapes and materials of the
imaging elements that balance their chromatic and monochromatic aberrations to produce
an image of acceptable quality across the field of view of the system.
2. Distances to the right of and above (left of and below) a reference point are
positive (negative). The object distance $S$ and image height $h'$ are numerically
negative in Figure 1-1, and object height $h$ and image distance $S'$ are
numerically positive.
4. The acute angle of a ray from the optical axis or from the surface normal is
positive (negative) if it is counterclockwise (clockwise). The angles $\theta$ and $\theta'$ of
the incident and refracted rays $P_0Q$ and $QP'_0$ from the surface normal $QC$ are
both positive in Figure 1-1. However, the angles $\phi$ and $\beta'_0$ of the surface normal
and the refracted ray from the optical axis $OA$ are both numerically negative.
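[Figure 1-1: refracting surface separating media of refractive indices $n$ and $n'$, with the optical axis $OA$ and the points $P$, $P_0$, $Q$, $V$, $C$, $P'_0$, and $P'$ labeled, together with the heights $h$ and $(-)h'$, the distances $(-)S$ and $S'$, the radius $R$, and the angles $\theta$, $\theta'$, $(-)\phi$, $\beta_0$, and $(-)\beta'_0$.]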
5. When light travels from right to left, such as when it is reflected by an odd
number of mirrors, then the refractive index and the spacing between two
adjacent surfaces are given a negative sign. The negative distance is consistent
with the sign convention for the distance, and a negative refractive index results
from the negative wave velocity.
Throughout the book, any quantities that are numerically negative are indicated in the
figures by a parenthetical negative sign (–).
If we consider the actual and neighboring paths of a ray in going from a point $P_1$ to a
point $P_2$, as indicated in Figure 1-2, so that the two paths deviate by no more than a small
quantity $\epsilon$, then the difference in their optical path lengths is given by

$$W(\epsilon) = \int_{P_1}^{P_2} n\,ds' - \int_{P_1}^{P_2} n\,ds \tag{1-1a}$$

$$\phantom{W(\epsilon)} = O(\epsilon^2) , \tag{1-1b}$$

where $ds$ and $ds'$ are the differential elements of path length along the actual and
neighboring virtual rays, respectively, $n$ is the corresponding refractive index, and $O(\epsilon^2)$
indicates a function that depends on $\epsilon$ through $\epsilon^2$ and/or higher powers of $\epsilon$. It is clear
from Eq. (1-1b) that

$$\lim_{\epsilon \to 0} \frac{\partial W}{\partial \epsilon} = 0 . \tag{1-2a}$$

Equivalently,

$$\delta \int_{P_1}^{P_2} n\,ds = 0 , \tag{1-2b}$$

where $\delta$ indicates a differential variation. Thus, up to the first order in $\epsilon$, the two optical
path lengths are equal.
The optical path length of an actual ray compared to those of the neighboring virtual
rays may be a maximum or a minimum, or all of the rays may have equal optical path
lengths. This may be seen from the properties of an ellipse (or ellipsoid), as illustrated in
Figure 1-3. An ellipse has the property (see Figure 1-3a) that the sum of the distances of a
Figure 1-2. The actual and virtual paths of a ray in going from a point $P_1$ to a point
$P_2$. The actual path is indicated by a solid line, and the two paths deviate from each
other by no more than a small quantity $\epsilon$ at any point along the path.
Figure 1-3. Stationarity of optical path length. (a) $[F_1PF_2] = [F_1QF_2]$ for the
ellipsoidal mirror; $[F_1PF_2]$ is a minimum for the plane mirror. (b) $[F_1PF_2]$ is a
maximum for the concave mirror. (c) $[F_1PF_2]$ is a minimum for the convex mirror.
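The statements in the caption of Figure 1-3a can be checked numerically. The following is a minimal sketch, assuming an ellipse with semi-axes 2 and 1 and a refractive index of unity; these values, and the names `path_length` and `on_ellipse`, are illustrative choices and do not come from the text.

```python
import math

# Hypothetical ellipse used only for illustration (semi-axes are assumptions).
A_SEMI, B_SEMI = 2.0, 1.0
C_FOCAL = math.sqrt(A_SEMI**2 - B_SEMI**2)  # distance of each focus from the center
F1, F2 = (-C_FOCAL, 0.0), (C_FOCAL, 0.0)


def path_length(point):
    """Geometrical path F1 -> point -> F2 (refractive index taken as 1)."""
    return math.dist(F1, point) + math.dist(point, F2)


def on_ellipse(t):
    """Point of the ellipse at parameter t."""
    return (A_SEMI * math.cos(t), B_SEMI * math.sin(t))


# Ellipsoidal mirror: every point of the ellipse gives the same path, 2 * A_SEMI.
ellipse_paths = [path_length(on_ellipse(t)) for t in (0.3, 1.0, math.pi / 2, 2.5)]

# Plane mirror tangent to the ellipse at its top point P = (0, B_SEMI):
# neighboring points on the tangent line give longer paths, so [F1 P F2] is a minimum.
P = (0.0, B_SEMI)
tangent_paths = [path_length((dx, B_SEMI)) for dx in (-0.1, -0.01, 0.01, 0.1)]

print(ellipse_paths)
print(path_length(P), tangent_paths)
```

Points of the tangent plane lie outside the ellipse, which is why their paths exceed the constant value found on the ellipse itself.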
Similarly, if we consider the concave mirror shown dashed in Figure 1-3b so that it
has a common tangent and therefore a common normal with the ellipse at the point $P$, then
the optical path length of the actual ray $F_1PF_2$ is a maximum compared with the
neighboring virtual (in the sense of fictitious) rays. We note, for example, that
where $V$ and $Q$ are the points of incidence of two neighboring rays $PAV$ and $PBQ$. From
Fermat’s principle, the optical path length $[AQA']$ of the virtual ray $AQA'$ may be
written

$$[AQA'] = [AVA'] + O(\epsilon^2) , \tag{1-5a}$$

where $\epsilon = VQ$ is a small quantity. Substituting Eq. (1-5a) into Eq. (1-4), we obtain

$$[AQA'] = [BQB'] + O(\epsilon^2) . \tag{1-5b}$$

[Figure: refracting surface between media of refractive indices $n$ and $n'$, with the points $P$, $A$, $B$, $V$, $A'$, $B'$, and $P'$ and the wavefronts $W$ and $W'$ labeled.]

where $AB$ is of the same order of magnitude as $VQ$. Subtracting Eq. (1-6a) from Eq.
(1-5b), we obtain

$$[QA'] = [QB'] + O(\epsilon^2) , \tag{1-6b}$$

or the ray $QB'$ is perpendicular to the wavefront $W'$ at the point $B'$. If the wavefront $W'$
is refracted by another refracting surface, the refracted rays and the wavefront produced
by it can again be shown to be orthogonal to each other.
It should be noted that although the incident wavefront $W$ is spherical with its center
of curvature at $P$, the refracted wavefront $W'$ may or may not be spherical, depending
on the shape of the refracting surface $S$. If $W'$ is spherical with its center of curvature at
$P'$, then $S$ is called a Cartesian surface, and the points $P$ and $P'$ are called a Cartesian
pair or perfect conjugates. Because the rays are perpendicular to the wavefront, an
alternative definition of a perfect image is that all of the rays pass through the image point
$P'$. For example, an ellipsoidal refracting surface with an eccentricity $e = n/n'$
separating media of refractive indices $n$ and $n'$ is a Cartesian surface for a collimated
beam (see Problem 1.1). Similarly, an ellipsoidal mirror is a Cartesian surface for a point
object placed at one of its two geometrical foci (see Problem 1.2).
$$
\begin{aligned}
W(\epsilon) &= n\left[(P_1B + BP_2) - (P_1A + AP_2)\right] \\
&= n\left\{\left[(P_1A)^2 + \epsilon^2\right]^{1/2} + \left[(AP_2)^2 + \epsilon^2\right]^{1/2} - (P_1A + AP_2)\right\} \\
&= n\left\{P_1A\left[1 + \frac{\epsilon^2}{2(P_1A)^2} + \dots\right] + AP_2\left[1 + \frac{\epsilon^2}{2(AP_2)^2} + \dots\right] - (P_1A + AP_2)\right\} \\
&= O(\epsilon^2) ,
\end{aligned}
\tag{1-7}
$$

where $\epsilon = AB$ is a small deviation of the virtual path from the actual, $n$ is the refractive
index of the homogeneous medium, and $O(\epsilon^2)$ represents terms with powers of $\epsilon$ greater
than or equal to two. As expected, there is no linear term in $\epsilon$; thus, the derivative of
$W(\epsilon)$ with respect to $\epsilon$ in the limit of $\epsilon \to 0$ is zero.
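The absence of a linear term in Eq. (1-7) can be illustrated numerically. The sketch below assumes $P_1A = 1$, $AP_2 = 2$, and $n = 1$ (arbitrary illustrative values) and displaces $B$ from $A$ by $\epsilon$ perpendicular to the straight line $P_1AP_2$, as in the second line of Eq. (1-7). The function name `path_difference` is hypothetical.

```python
import math


def path_difference(eps, p1a=1.0, ap2=2.0, n=1.0):
    """W(eps) = n[(P1B + BP2) - (P1A + AP2)] for B displaced from A by eps,
    perpendicular to the straight line P1 A P2 (geometry of Eq. (1-7))."""
    p1b = math.hypot(p1a, eps)
    bp2 = math.hypot(ap2, eps)
    return n * ((p1b + bp2) - (p1a + ap2))


# W/eps shrinks with eps (no linear term), while W/eps^2 approaches the
# leading coefficient (n/2)(1/P1A + 1/AP2) of the expansion in Eq. (1-7).
for eps in (1e-1, 1e-2, 1e-3):
    w = path_difference(eps)
    print(f"eps={eps:g}  W={w:.3e}  W/eps={w / eps:.3e}  W/eps^2={w / eps**2:.6f}")
```

For the assumed distances, the leading coefficient is $(1/2)(1 + 1/2) = 0.75$, so the last column should settle near that value as $\epsilon$ decreases.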
1.5.2 Refraction in 2D
Consider refraction of a ray at an interface between two media of refractive indices $n$
and $n'$, as illustrated in Figure 1-6. The optical path length of a ray in propagating from a
point $P$ to another point $P'$ after refraction at a point $A$ is given by

$$[PAP'] = n\left(a^2 + x^2\right)^{1/2} + n'\left[(b - x)^2 + c^2\right]^{1/2} . \tag{1-8}$$

If we displace the point $A$ by a small amount along the interface, the value of $x$ changes
by that amount. According to Fermat’s principle, the derivative of the optical path length
with respect to $x$ is zero. Equating to zero the derivative of the right-hand side of Eq. (1-8)
with respect to $x$, we obtain

$$0 = n\,\frac{x}{\left(a^2 + x^2\right)^{1/2}} - n'\,\frac{b - x}{\left[(b - x)^2 + c^2\right]^{1/2}} = n \sin\theta - n' \sin\theta' ,$$

or

$$n \sin\theta = n' \sin\theta' , \tag{1-9}$$

where $\theta$ and $\theta'$ are the angles of incidence and refraction of the incident and refracted
rays from the surface normal at the point of incidence. The incident ray, the refracted ray,
and the surface normal are coplanar. Equation (1-9), along with coplanarity of the rays
and the surface normal, is the law of refraction, also called Snell’s law.

[Figure 1-6: refraction of a ray at a point $A$ of an interface between media of refractive indices $n$ and $n'$, showing the incident ray from $P$, the refracted ray to $P'$, the surface normal, the angles $\theta$ and $\theta'$, and the distances $a$, $b$, $c$, $x$, and $b - x$.]
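The derivation of Snell's law from Eq. (1-8) can be checked by locating the stationary point of the optical path length numerically rather than analytically. The sketch below is one way to do so under assumed values $a = 1$, $b = 2$, $c = 1$, $n = 1$, and $n' = 1.5$; the ternary search and the function names are illustrative choices, not part of the text.

```python
import math


def optical_path(x, a, b, c, n, n_prime):
    """Eq. (1-8): [PAP'] = n (a^2 + x^2)^(1/2) + n' [(b - x)^2 + c^2]^(1/2)."""
    return n * math.hypot(a, x) + n_prime * math.hypot(b - x, c)


def stationary_x(a, b, c, n, n_prime, iterations=200):
    """Ternary search for the minimum of the convex optical path on 0 <= x <= b."""
    lo, hi = 0.0, b
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if optical_path(m1, a, b, c, n, n_prime) < optical_path(m2, a, b, c, n, n_prime):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Assumed geometry and indices (illustrative values only).
a, b, c = 1.0, 2.0, 1.0
n, n_prime = 1.0, 1.5

x = stationary_x(a, b, c, n, n_prime)
sin_theta = x / math.hypot(a, x)                   # sine of the angle of incidence
sin_theta_prime = (b - x) / math.hypot(b - x, c)   # sine of the angle of refraction
snell_residual = n * sin_theta - n_prime * sin_theta_prime

print(f"x = {x:.6f}, n sin(theta) - n' sin(theta') = {snell_residual:.2e}")
```

The residual is zero only to within the accuracy of the search, which is limited by floating-point rounding near the flat minimum of the path length.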
When light is incident normally on a surface so that the angle of incidence $\theta$ is zero,
then the angle of refraction $\theta'$ is also zero. The angle of refraction increases as the angle
of incidence increases. When light is incident from a medium of higher refractive index $n$
to a medium of lower index $n'$, the angle of refraction reaches its maximum value of 90°
corresponding to an angle of incidence of $\sin^{-1}(n'/n)$. This angle of incidence is called
the critical angle. Its value for a glass-to-air interface is 41.8°, as may be seen by letting
$n = 1.5$ and $n' = 1$ in Eq. (1-9). When light is incident at an angle that is larger than the
critical angle, it is reflected at the interface according to the law of reflection, discussed
below. This phenomenon, called total internal reflection, is used in a right-angle
reflecting prism, as illustrated in Figure 1-7. Such a prism is used in optical systems to
deviate the path of a beam by 90°. Its diagonal face acts like a mirror because the rays are
incident on it at an angle of 45°, which exceeds the critical angle, and undergo total internal reflection.
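The critical-angle value quoted above follows directly from $\sin^{-1}(n'/n)$. A minimal sketch, with the hypothetical helper name `critical_angle_deg`:

```python
import math


def critical_angle_deg(n, n_prime):
    """Critical angle sin^-1(n'/n) in degrees, for light going from index n to a lower index n'."""
    if n_prime >= n:
        raise ValueError("a critical angle exists only when n' < n")
    return math.degrees(math.asin(n_prime / n))


glass_to_air = critical_angle_deg(1.5, 1.0)
print(round(glass_to_air, 1))  # 41.8

# Rays meeting the diagonal face of a right-angle prism at 45 degrees exceed this angle.
undergoes_tir_at_45 = 45.0 > glass_to_air
print(undergoes_tir_at_45)  # True
```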
1.5.3 Reflection in 2D
Now consider the reflection of a ray from a reflecting surface, as illustrated in Figure
1-8. The optical path length of a ray in propagating from a point $P$ to another point $P'$
after reflection at a point $A$ is given by
\[ [PAP'] = n\,(PA - AP') = n \left\{ [(b - x)^2 + c^2]^{1/2} - (a^2 + x^2)^{1/2} \right\} , \tag{1-10} \]
where the refractive index associated with the reflected ray is -n because of its backward
propagation. Equating to zero the derivative of the right-hand side of Eq. (1-10) with
respect to x, as in the case of refraction, we obtain
\[ 0 = \frac{x}{(a^2 + x^2)^{1/2}} + \frac{b - x}{[(b - x)^2 + c^2]^{1/2}} = \sin\theta' + \sin\theta , \]
or
\[ \theta' = -\theta , \tag{1-11} \]
where $\theta$ and $\theta'$ are the angles of incidence and reflection that the incident and reflected
rays make with the surface normal at the point of incidence, respectively. The angle $\theta'$ in
Figure 1-8 is numerically negative, and so we have inserted a minus sign in Eq. (1-11).
The incident ray, the reflected ray, and the surface normal are coplanar. No
approximation is involved in Eq. (1-11). This equation along with the coplanarity of the
rays is the law of reflection. It can be obtained from Eq. (1-9) by letting $n' = -n$ because
the reflected ray lies in the same medium as the incident ray but travels backward
compared to the incident ray.
1.5.4 Refraction in 3D
As illustrated in Figure 1-9, which is not in a plane, consider a ray originating at a
point $P(\vec{r})$ incident on a refracting surface, separating media of refractive indices $n$ and
$n'$, at a point $A_0$ and passing through a point $P'(\vec{r}\,')$ after refraction by the surface,
where $\vec{r}$ and $\vec{r}\,'$ are the position vectors of the respective points with $O$ (not shown in the
figure) as an arbitrary origin of coordinates. Given the incident ray $PA_0$, we want to
determine the refracted ray $A_0P'$. Let $A$ be some point on the surface (not necessarily in
the plane of the paper) in the vicinity of $A_0$. Imagine the vector $\overrightarrow{OA}$ to move on the
surface along a curve through $A_0$ that obeys the equation $\overrightarrow{OA} = \vec{f}(u)$, where $u$ is the
length of this curve from $A_0$. The optical path length of the ray from point $P$ to point $P'$
through the point $A$ is given by
\[ [PAP'] = n \left| \vec{f}(u) - \vec{r} \right| + n' \left| \vec{r}\,' - \vec{f}(u) \right| . \tag{1-12} \]
As the point A moves, and therefore as u varies, a whole family of paths is generated.
From Fermat’s principle, the true path is obtained by letting the optical path length be
stationary, i.e., by letting
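\[ \left\{ \frac{d}{du} \left[ n \left| \vec{f}(u) - \vec{r} \right| + n' \left| \vec{r}\,' - \vec{f}(u) \right| \right] \right\}_{u=0} = 0 . \tag{1-13} \]
Now,
\[ \left| \vec{f}(u) - \vec{r} \right| = \left( \vec{f} \cdot \vec{f} + \vec{r} \cdot \vec{r} - 2 \vec{f} \cdot \vec{r} \right)^{1/2} . \tag{1-14} \]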
Therefore,
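\[ \frac{d}{du} \left| \vec{f}(u) - \vec{r} \right| = \frac{\vec{f} - \vec{r}}{\left| \vec{f} - \vec{r} \right|} \cdot \frac{d\vec{f}}{du} = \hat{e} \cdot \frac{d\vec{f}}{du} , \tag{1-15} \]
where $\hat{e}$ is a unit vector along the incident ray $PA$ given by
\[ \hat{e} = \frac{\vec{f} - \vec{r}}{\left| \vec{f} - \vec{r} \right|} . \tag{1-16} \]
Similarly,
\[ \frac{d}{du} \left| \vec{r}\,' - \vec{f}(u) \right| = - \frac{\vec{r}\,' - \vec{f}}{\left| \vec{r}\,' - \vec{f} \right|} \cdot \frac{d\vec{f}}{du} = - \hat{e}' \cdot \frac{d\vec{f}}{du} , \tag{1-17} \]
where $\hat{e}'$ is a unit vector along the refracted ray $AP'$ given by
\[ \hat{e}' = \frac{\vec{r}\,' - \vec{f}}{\left| \vec{r}\,' - \vec{f} \right|} . \tag{1-18} \]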
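Substituting Eqs. (1-15) and (1-17) into Eq. (1-13), we obtain
\[ \left[ \left( n\hat{e} - n'\hat{e}' \right) \cdot \frac{d\vec{f}}{du} \right]_{u=0} = 0 . \tag{1-19} \]
The vector $(d\vec{f}/du)_{u=0}$ is a tangent to the curve at $A_0$ but otherwise arbitrary.
Accordingly, $n\hat{e} - n'\hat{e}'$ must be parallel to the surface normal $\hat{v}$ at $A_0$. Thus, we can
write
\[ n\hat{e} - n'\hat{e}' = b\,\hat{v} , \tag{1-20} \]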
where $b$ is a constant. Because $\hat{e}'$ is a linear combination of $\hat{e}$ and $\hat{v}$, it must lie in the
plane of incidence, defined as the plane containing the unit vectors $\hat{e}$ and $\hat{v}$. Thus, the
incident and refracted rays, and the surface normal at the point of incidence, are coplanar.
Taking a dot product of both sides of Eq. (1-20) with $\hat{v}$, we obtain
\[ n\hat{e} \cdot \hat{v} - n'\hat{e}' \cdot \hat{v} = b , \]
or
\[ b = n \cos\theta - n' \cos\theta' , \tag{1-21} \]
where $\theta$ and $\theta'$ are the angles the incident and refracted rays make with the surface
normal, known as the angles of incidence and refraction, respectively. Substituting for $b$
from Eq. (1-21) into Eq. (1-20), we obtain
\[ n'\hat{e}' = n\hat{e} + \left( n' \cos\theta' - n \cos\theta \right) \hat{v} . \tag{1-22} \]
Taking a cross product of both sides of Eq. (1-22) with $\hat{v}$, we obtain
\[ n\hat{e} \times \hat{v} - n'\hat{e}' \times \hat{v} = 0 , \]
or
\[ n' \sin\theta' = n \sin\theta . \tag{1-23} \]
Equation (1-23) and coplanarity of the incident and refracted rays and the surface normal
is the law of refraction, or Snell's law in 3D. Thus, a ray incident at an angle $\theta$ is
refracted at an angle $\theta'$ such that the refracted ray lies in the plane of incidence.
Substituting for $\cos\theta'$ from Eq. (1-23) into Eq. (1-22) yields the value of $\hat{e}'$ according to
\[ n'\hat{e}' = n\hat{e} + \left[ \left( n'^2 - n^2 \sin^2\theta \right)^{1/2} - n \cos\theta \right] \hat{v} . \tag{1-24} \]
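Equation (1-24) translates directly into a few lines of code. The sketch below is a minimal illustration that assumes the unit normal v is oriented so that e·v = cos θ is positive, and that vectors are plain 3-tuples; the function name is hypothetical.

```python
import math

def refract(e, v, n, n_prime):
    """Unit vector e' of the refracted ray from Eq. (1-24).

    e: unit vector of the incident ray, v: unit surface normal with e.v > 0 (assumed),
    n, n_prime: refractive indices. Returns None when no refracted ray exists.
    """
    cos_theta = sum(ei * vi for ei, vi in zip(e, v))           # cos(theta) = e . v
    radicand = n_prime**2 - n**2 * (1.0 - cos_theta**2)        # n'^2 - n^2 sin^2(theta)
    if radicand < 0.0:
        return None                                            # total internal reflection
    g = math.sqrt(radicand) - n * cos_theta                    # bracket in Eq. (1-24)
    return tuple((n * ei + g * vi) / n_prime for ei, vi in zip(e, v))

# Illustrative ray: 30 degrees from a normal along z, air (1.0) into glass (1.5)
theta = math.radians(30.0)
e_in = (math.sin(theta), 0.0, math.cos(theta))
e_out = refract(e_in, (0.0, 0.0, 1.0), 1.0, 1.5)
print(e_out)
```

Because the normal is along z in this example, the x component of the result is sin θ', so multiplying it by n' should reproduce n sin θ.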
1.5.5 Reflection in 3D
Consider a ray originating at a point $P(\vec{r})$, as in Figure 1-10, incident on a reflecting
surface at a point $A_0$ and passing through a point $P'(\vec{r}\,')$ after reflection by the surface,
with $O$ as an arbitrary origin of coordinates. This figure, like Figure 1-9, is also not in a
plane. Given the incident ray $PA_0$, we want to determine the reflected ray $A_0P'$. Let $A$ be
some point on the surface in the vicinity of $A_0$. Imagine that the vector $\overrightarrow{OA}$ moves on the
surface along a curve through $A_0$ that obeys the equation $\overrightarrow{OA} = \vec{f}(u)$, where $u$ is the
length of this curve from $A_0$. The optical path length of the ray from point $P$ to point $P'$,
through the point $A$ in a medium of refractive index $n$, is given by
\[ [PAP'] = n\,(PA - AP') = n \left[ \left| \vec{f}(u) - \vec{r} \right| - \left| \vec{r}\,' - \vec{f}(u) \right| \right] , \tag{1-25} \]
where the optical path length of the reflected ray $AP'$ is negative due to the negative
refractive index associated with it. As $A$ moves, and therefore as $u$ varies, a whole family
of paths is generated. From Fermat's principle, the true path is obtained by letting the
optical path length be stationary, i.e., by letting
\[ \left\{ \frac{d}{du} \left[ \left| \vec{f}(u) - \vec{r} \right| - \left| \vec{r}\,' - \vec{f}(u) \right| \right] \right\}_{u=0} = 0 . \tag{1-26} \]
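Proceeding as in the case of refraction, we obtain
\[ \left[ \left( \hat{e} + \hat{e}' \right) \cdot \frac{d\vec{f}}{du} \right]_{u=0} = 0 , \tag{1-27} \]
where $\hat{e}$ and $\hat{e}'$ are unit vectors along the incident and reflected rays, respectively.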
1.6 Exact Ray Tracing 17
The vector $(d\vec{f}/du)_{u=0}$ lies along the tangent to the curve at $A_0$. The curve is
arbitrary so long as it passes through $A_0$ and stays on the surface. Therefore, $\hat{e} + \hat{e}'$ must
be perpendicular to all tangents to the surface at $A_0$, or $\hat{e} + \hat{e}'$ must be along the normal
$\hat{v}$ to the tangent plane at $A_0$. Thus, we can write
\[ \hat{e} + \hat{e}' = a\,\hat{v} , \tag{1-28} \]
where $a$ is a constant, and conclude that $\hat{e}'$ must lie in the plane of incidence, defined as
the plane containing $\hat{e}$ and $\hat{v}$. Thus, the incident and reflected rays, and the surface
normal at the point of incidence, are coplanar. From the triangle $A_0A_1A_2$ in Figure 1-10,
we find that $a = 2\cos\theta$, where $\theta$ is the angle of incidence of the ray. Substituting for $a$
into Eq. (1-28), we obtain
\[ \hat{e}' = -\hat{e} + 2\cos\theta\,\hat{v} . \tag{1-29} \]
Because $\hat{e}$ and $\hat{e}'$ are unit vectors and, therefore, have the same length, they intersect
$\hat{e} + \hat{e}'$ and, therefore, $\hat{v}$ at the same angle. Thus we obtain the law of reflection that the
reflected ray makes the same angle with the surface normal at the point of incidence as
the incident ray and lies in the plane of incidence. The reflection of a ray can be treated as
a special case of refraction by letting $n' = -n$, as may be seen by comparing the
corresponding equations, e.g., Eqs. (1-22) and (1-29).
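The vector form of the reflection law, ê' = −ê + 2 cos θ v̂ (which follows from ê + ê' = a v̂ with a = 2 cos θ), can be sketched as below. The function name is hypothetical. One point is an inference rather than a statement from the text: because a negative refractive index is associated with the reflected ray, the ê' of this convention appears to point opposite to the physical direction of travel of the reflected light, and the code comment flags that as an assumption.

```python
import math

def reflect(e, v):
    """Unit vector e' = -e + 2 cos(theta) v, with cos(theta) = e . v.

    Assumption: in the sign convention used here, the physical propagation
    direction of the reflected light is -e'.
    """
    cos_theta = sum(ei * vi for ei, vi in zip(e, v))
    return tuple(-ei + 2.0 * cos_theta * vi for ei, vi in zip(e, v))

theta = math.radians(30.0)
e_in = (math.sin(theta), 0.0, math.cos(theta))
v_hat = (0.0, 0.0, 1.0)
e_ref = reflect(e_in, v_hat)
print(e_ref)
```

The result has unit length and makes the same angle with the normal as the incident vector, which is the content of the reflection law.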
Consider a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$, as in Figure
1-11, originating at a point object $A_0$ with position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$
incident at a point $A_1$ with a position vector $\vec{r}_1$ and coordinates $(x_1, y_1, z_1)$ on a spherical
refracting surface of radius of curvature $R_1$ separating media of refractive indices $n_0$ and
$n_1$. Let the distance between the object plane and the vertex $V_1$ of the surface be $D_{01}$ so
that $z_0 = -D_{01}$. Because $A_1$ lies on the spherical surface with the origin at its vertex $V_1$,
\[ x_1^2 + y_1^2 + (z_1 - R_1)^2 = R_1^2 \tag{1-30} \]
or
\[ z_1 = R_1 - \left( R_1^2 - x_1^2 - y_1^2 \right)^{1/2} . \tag{1-31} \]
The z coordinate $z_1$ of a point on the surface represents the sag of the surface at that
point.
1.6.2 Rectilinear Propagation from the Object Plane to the First Refracting Surface
It is evident from Figure 1-11 that the position vectors $\vec{r}_1$ and $\vec{r}_0$ are related to each
other according to
\[ \vec{r}_1 = \vec{r}_0 + S_{01}\hat{e}_0 , \tag{1-32} \]
where
\[ S_{01} = \left[ (x_1 - x_0)^2 + (y_1 - y_0)^2 + (D_{01} + z_1)^2 \right]^{1/2} \tag{1-33} \]
is the distance between $A_0$ and $A_1$. The sign of $S_{01}$ is the same as that of $D_{01} + z_1$, as
may be seen by considering a ray incident at the vertex $V_1$. The transverse coordinates
$(x_1, y_1)$ of $A_1$ can be written
\[ x_1 = x_0 + S_{01} k_0 \tag{1-34a} \]
and
\[ y_1 = y_0 + S_{01} l_0 . \tag{1-34b} \]
We note that to determine the coordinates ( x1 , y1 ) from Eqs. (1-34), we need S01 ,
which itself depends on them through Eq. (1-33). Thus, these equations are coupled and
must be solved simultaneously. Substituting Eqs. (1-31) and (1-34) into Eq. (1-33), we
obtain a quadratic equation in S01 in terms of the known quantities. Solving this equation
and substituting the value thus obtained into Eqs. (1-34a) and (1-34b) yields the
transverse coordinates ( x1, y1) of the ray at A1 . The transfer operation of the ray in
propagating from point A0 to point A1 is described by Eqs. (1-31) and (1-34), along with
Eq. (1-33).
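One way to carry out this transfer operation is to write the quadratic in S01 relative to the center of curvature C1 = (0, 0, R1) and solve it in closed form. The sketch below does that. The choice of root (the intersection on the vertex side of the sphere, for a ray traveling toward +z) and the function name are assumptions made for illustration.

```python
import math

def transfer_to_sphere(x0, y0, D01, k0, l0, m0, R1):
    """Return (S01, x1, y1, z1) for a ray from (x0, y0, -D01) with direction (k0, l0, m0).

    The surface vertex is the origin and the center of curvature is (0, 0, R1).
    Returns None if the ray misses the sphere.
    """
    # Vector from the center of curvature to the object point A0
    px, py, pz = x0, y0, -D01 - R1
    p_dot_e = px * k0 + py * l0 + pz * m0
    disc = p_dot_e**2 - (px * px + py * py + pz * pz) + R1 * R1
    if disc < 0.0:
        return None
    # Assumed root selection: the intersection nearer the vertex
    S01 = -p_dot_e - math.copysign(math.sqrt(disc), R1)
    x1 = x0 + S01 * k0            # Eq. (1-34a)
    y1 = y0 + S01 * l0            # Eq. (1-34b)
    z1 = -D01 + S01 * m0          # z component of Eq. (1-32)
    return S01, x1, y1, z1

# Illustrative ray (assumed numbers): axial object point, R1 = 5, D01 = 10
k0, l0 = 0.1, 0.0
m0 = math.sqrt(1.0 - k0 * k0 - l0 * l0)
S01, x1, y1, z1 = transfer_to_sphere(0.0, 0.0, 10.0, k0, l0, m0, 5.0)
print(S01, x1, y1, z1)
```

The returned z1 should equal the sag of Eq. (1-31) at (x1, y1), and S01 should equal the distance of Eq. (1-33); both can serve as consistency checks.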
The ray is refracted at A1 by the refracting surface, according to Eq. (1-22). The unit
vector v̂1 along its normal at A1 is given by
\[ \hat{v}_1 = \frac{\overrightarrow{A_1C_1}}{R_1} = \frac{1}{R_1} \left( -x_1, -y_1, R_1 - z_1 \right) = \frac{1}{R_1} \left( -x_1, -y_1, \sqrt{R_1^2 - x_1^2 - y_1^2} \right) . \tag{1-35} \]
Substituting Eq. (1-35) into Eq. (1-22), we obtain the direction cosines (k1 , l1 , m1 ) of the
refracted ray with a unit vector ê1 :
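\[ n_1 k_1 = n_0 k_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{x_1}{R_1} , \tag{1-36a} \]
\[ n_1 l_1 = n_0 l_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{y_1}{R_1} , \tag{1-36b} \]
and
\[ n_1 m_1 = n_0 m_0 + \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{R_1 - z_1}{R_1} , \tag{1-36c} \]
where
\[ \cos\theta_0 = \hat{e}_0 \cdot \hat{v}_1 \tag{1-37a} \]
or
\[ \cos\theta_0 = \frac{1}{R_1} \left( -x_1 k_0 - y_1 l_0 + \sqrt{R_1^2 - x_1^2 - y_1^2}\,\sqrt{1 - k_0^2 - l_0^2} \right) ; \tag{1-37b} \]
and, from Snell's law,
\[ \cos\theta_1 = \left( 1 - \sin^2\theta_1 \right)^{1/2} = \left[ 1 - (n_0/n_1)^2 \sin^2\theta_0 \right]^{1/2} \tag{1-38a} \]
or
\[ \cos\theta_1 = \frac{1}{n_1} \left( n_1^2 - n_0^2 + n_0^2 \cos^2\theta_0 \right)^{1/2} . \tag{1-38b} \]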
Equations (1-36) through (1-38) describe the refraction operation of the ray at point A1 .
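The refraction operation can be sketched in code as follows. This is a minimal illustration, restricted to R1 > 0 so that the square root equals R1 − z1. The cosine of the angle of refraction is written with the sign that follows from Snell's law, n1² cos² θ1 = n1² − n0² + n0² cos² θ0. The function name and sample values are assumptions.

```python
import math

def refract_at_sphere(x1, y1, R1, k0, l0, n0, n1):
    """Direction cosines (k1, l1, m1) of the refracted ray at (x1, y1) on a sphere.

    Sketch for R1 > 0 and a ray traveling toward +z (m0 > 0).
    Returns None when the ray is totally internally reflected.
    """
    root = math.sqrt(R1 * R1 - x1 * x1 - y1 * y1)          # equals R1 - z1
    m0 = math.sqrt(1.0 - k0 * k0 - l0 * l0)
    cos0 = (-x1 * k0 - y1 * l0 + root * m0) / R1           # cos(theta_0) = e0 . v1
    radicand = n1 * n1 - n0 * n0 + n0 * n0 * cos0 * cos0
    if radicand < 0.0:
        return None
    cos1 = math.sqrt(radicand) / n1                        # cos(theta_1) from Snell's law
    g = n1 * cos1 - n0 * cos0
    k1 = (n0 * k0 - g * x1 / R1) / n1
    l1 = (n0 * l0 - g * y1 / R1) / n1
    m1 = (n0 * m0 + g * root / R1) / n1
    return k1, l1, m1

# Illustrative values (assumed): air-to-glass surface with R1 = 5
k1, l1, m1 = refract_at_sphere(1.0, 0.5, 5.0, 0.1, -0.05, 1.0, 1.5)
print(k1, l1, m1)
```

A useful sanity check is that the three returned direction cosines satisfy k1² + l1² + m1² = 1.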
1.6.4 Rectilinear Propagation from the First Refracting Surface to the Second
The refracted ray propagates in a straight line until it reaches a point $A_2$, with a
position vector $\vec{r}_2$, on the next spherical refracting surface of radius of curvature $R_2$ with
its vertex $V_2$ at a distance $D_{12}$ from the vertex $V_1$, separating media of refractive indices
$n_1$ and $n_2$. The straight-line propagation of the ray from point $A_1$ to point $A_2$ can be
obtained in a manner similar to the ray propagation from point $A_0$ to point $A_1$.
The sag of the second surface at $A_2$, measured from its vertex $V_2$, is
\[ z_2 = R_2 - \sqrt{R_2^2 - x_2^2 - y_2^2} . \tag{1-39} \]
The position vector $\vec{r}_2$ of point $A_2$ is given by
\[ \vec{r}_2 = \vec{r}_1 + S_{12}\hat{e}_1 , \tag{1-40} \]
where
\[ S_{12} = \left[ (x_2 - x_1)^2 + (y_2 - y_1)^2 + (D_{12} - z_1 + z_2)^2 \right]^{1/2} \tag{1-41} \]
is the distance between the points A1 and A2 . The sign of S12 is the same as that of
D12 - z1, as may be seen by considering a ray incident on the vertex V2 .
The transverse coordinates ( x 2 , y 2 ) of point A2 where the ray meets the second
surface are given by
x 2 = x1 + S12 k1 (1-42a)
and
y 2 = y1 + S12 l1 . (1-42b)
Equations (1-39) and (1-42), along with Eq. (1-41), describe the transfer operation of the
ray from point A1 to point A2 . Again, Eqs. (1-41) and (1-42) are coupled and have to be
solved simultaneously.
The tracing of a ray reflected by a reflecting surface, as illustrated in Figure 1-12, can
be treated in a similar manner. Consider a ray incident at a point ( x1 , y1 , z1 ) on a
reflecting surface of radius of curvature R1 with a unit vector ê0 and direction cosines
(k0 , l0 , m0 ) . From Eqs. (1-29) and (1-35), the direction cosines (k1, l1, m1) of the reflected
ray with a unit vector ê1 are given by
\[ k_1 = -k_0 - 2\cos\theta_0 \frac{x_1}{R_1} , \tag{1-43a} \]
\[ l_1 = -l_0 - 2\cos\theta_0 \frac{y_1}{R_1} , \tag{1-43b} \]
and
\[ m_1 = -m_0 + 2\cos\theta_0 \frac{R_1 - z_1}{R_1} , \tag{1-43c} \]
where $\theta_0$ is the angle of incidence of the ray, and $\cos\theta_0 = \hat{e}_0 \cdot \hat{v}_1$ is given by Eq. (1-37).
Equations (1-43), along with Eq. (1-37), describe the reflection operation of the ray at
point $A_1$. These equations can be obtained from the corresponding Eqs. (1-36) for a
refraction operation by letting $n_0 = 1 = -n_1$.
A conic surface of eccentricity $e$ and vertex radius of curvature $R$ is described by
\[ x^2 + y^2 - 2Rz + \left( 1 - e^2 \right) z^2 = 0 , \tag{1-44} \]
where ( x , y , z ) are the coordinates of a point on its surface with its origin at its vertex.
The sag of the surface, namely, the z coordinate, is given by
\[ z = \frac{\left( x^2 + y^2 \right)/R}{1 + \left[ 1 - \left( 1 - e^2 \right) \left( x^2 + y^2 \right)/R^2 \right]^{1/2}} . \tag{1-45} \]
If we let $e = 0$, as for a sphere, Eqs. (1-44) and (1-45) reduce to the corresponding
equations (1-30) and (1-31) for a spherical surface, respectively. In lens design, it is quite
common to use the curvature $c$ in place of $1/R$ and the Schwarzschild constant $k$ in place of
$-e^2$.
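The sag formula of Eq. (1-45) is easy to evaluate and to check against the implicit form of Eq. (1-44). The following is a minimal sketch; the function name and the sample values are illustrative.

```python
import math

def conic_sag(x, y, R, e):
    """Sag z of a conic surface of eccentricity e and vertex radius R, Eq. (1-45)."""
    r2 = x * x + y * y
    radicand = 1.0 - (1.0 - e * e) * r2 / (R * R)
    if radicand < 0.0:
        raise ValueError("point (x, y) lies outside the surface")
    return (r2 / R) / (1.0 + math.sqrt(radicand))

# Illustrative values (assumed): R = 5 at the point (1.0, 0.5)
z_sphere = conic_sag(1.0, 0.5, 5.0, 0.0)     # sphere, e = 0
z_parabola = conic_sag(1.0, 0.5, 5.0, 1.0)   # paraboloid, e = 1
z_ellipse = conic_sag(1.0, 0.5, 5.0, 0.5)
print(z_sphere, z_parabola, z_ellipse)
```

For e = 0 the value should match the spherical sag R − (R² − x² − y²)^{1/2}, and for e = 1 it reduces to (x² + y²)/(2R).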
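Equation (1-44) can be written in the form
\[ F(x, y, z) = \frac{1}{R} \left[ x^2 + y^2 + \left( 1 - e^2 \right) z^2 \right] - 2z = 0 . \tag{1-46} \]
The unit vector along the normal to the surface at the point $(x, y, z)$ can be written
\[ \hat{v} = \frac{-\left( \partial F/\partial x,\ \partial F/\partial y,\ \partial F/\partial z \right)}{\left[ (\partial F/\partial x)^2 + (\partial F/\partial y)^2 + (\partial F/\partial z)^2 \right]^{1/2}} = \frac{1}{V} \left[ -\frac{x}{R},\ -\frac{y}{R},\ 1 - \left( 1 - e^2 \right) \frac{z}{R} \right] , \tag{1-47} \]
where the minus sign in the first equation represents the fact that the surface normal is
toward the vertex center of curvature, and
\[ V = \left[ 1 + 2e^2 (z/R) - e^2 \left( 1 - e^2 \right) (z/R)^2 \right]^{1/2} . \tag{1-48} \]
Letting $e = 0$, it can be seen that $V \to 1$, and Eq. (1-47) reduces to the unit vector given
by Eq. (1-35) for a spherical surface.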
When a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$ originating at a
point with a position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$ is incident on a conic surface
of eccentricity $e_1$ and vertex radius of curvature $R_1$, the point of incidence $(x_1, y_1, z_1)$ is
still given by Eqs. (1-34), except that the sag value, following Eq. (1-45), is given by
\[ z_1 = \frac{\left( x_1^2 + y_1^2 \right)/R_1}{1 + \left[ 1 - \left( 1 - e_1^2 \right) \left( x_1^2 + y_1^2 \right)/R_1^2 \right]^{1/2}} . \tag{1-49} \]
Equation (1-49) is substituted into Eq. (1-33) for the distance S01 between the two points.
Substituting Eq. (1-47) into Eq. (1-22), we obtain the direction cosines of the
refracted ray:
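\[ n_1 k_1 = n_0 k_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{x_1}{V_1 R_1} , \tag{1-50a} \]
\[ n_1 l_1 = n_0 l_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{y_1}{V_1 R_1} , \tag{1-50b} \]
and
\[ n_1 m_1 = n_0 m_0 + \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{1}{V_1} \left[ 1 - \left( 1 - e_1^2 \right) \frac{z_1}{R_1} \right] , \tag{1-50c} \]
where
\[ V_1 = \left[ 1 + 2e_1^2 (z_1/R_1) - e_1^2 \left( 1 - e_1^2 \right) (z_1/R_1)^2 \right]^{1/2} , \tag{1-51} \]
\[ \cos\theta_0 = \hat{e}_0 \cdot \hat{v}_1 \tag{1-52a} \]
or
\[ \cos\theta_0 = \frac{1}{V_1} \left\{ -k_0 \frac{x_1}{R_1} - l_0 \frac{y_1}{R_1} + \sqrt{1 - k_0^2 - l_0^2} \left[ 1 - \left( 1 - e_1^2 \right) \frac{z_1}{R_1} \right] \right\} , \tag{1-52b} \]
and $\cos\theta_1$, obtained from Snell's law, is given by Eq. (1-38).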
When a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$ originating at a
point with a position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$ is reflected by a conic
reflecting surface of eccentricity $e_1$ and vertex radius of curvature $R_1$, the direction
cosines $(k_1, l_1, m_1)$ of the reflected ray with a unit vector $\hat{e}_1$ are given by [see Eq. (1-29)]
\[ k_1 = -k_0 - 2\cos\theta_0 \frac{x_1}{V_1 R_1} , \tag{1-53a} \]
\[ l_1 = -l_0 - 2\cos\theta_0 \frac{y_1}{V_1 R_1} , \tag{1-53b} \]
and
\[ m_1 = -m_0 + 2\cos\theta_0 \frac{1}{V_1} \left[ 1 - \left( 1 - e_1^2 \right) \frac{z_1}{R_1} \right] , \tag{1-53c} \]
where $\theta_0$ is the angle of incidence of the ray, and $\cos\theta_0 = \hat{e}_0 \cdot \hat{v}_1$ is given by Eq. (1-52).
paraxial ray tracing. In this section, we first list the relevant assumptions and then the
consequent paraxial ray-tracing equations. In each case, we write the exact equation
followed by its paraxial or approximate form to highlight their differences.
The angles of incidence and refraction are assumed to be small so that their sines are
equal to the respective angles, i.e., $\sin\theta \sim \theta$ and $\sin\theta' \sim \theta'$. Therefore, Snell's law,
\[ n' \sin\theta' = n \sin\theta , \tag{1-54a} \]
is approximated by
\[ n'\theta' \sim n\theta . \tag{1-54b} \]
Of course, the incident and refracted rays, and the surface normal at the point of
incidence on the refracting surface, are coplanar.
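The coordinates of a point on a spherical surface of radius of curvature $R$ are approximated
according to
\[ (x, y, z) = \left( x,\ y,\ R - \sqrt{R^2 - x^2 - y^2} \right) \tag{1-55a} \]
\[ (x, y, z) \sim (x, y, 0) . \tag{1-55b} \]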
The point is assumed to be close to the optical axis so that the sag z of the point on the
surface is negligible. Thus, a spherical surface is replaced by a planar surface, called the
tangent plane or the paraxial surface, passing through the surface vertex.
Replace the diagonal distances, such as S12 between two points on two surfaces in
Figure 1-11, with the corresponding axial distance D12 :
\[ S_{12} = \left[ (x_2 - x_1)^2 + (y_2 - y_1)^2 + (D_{12} - z_1 + z_2)^2 \right]^{1/2} \tag{1-56a} \]
\[ S_{12} \sim D_{12} . \tag{1-56b} \]
Thus, the distance between the two points is replaced by the distance between the two
tangent planes, i.e., it is approximated by the distance between their vertices. It has the
implication that the ray between two points is nearly parallel and close to the optical axis.
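The unit vector along the surface normal at a point $(x, y, z)$ on the surface is approximated
according to
\[ \hat{v} = \frac{1}{R} \left( -x,\ -y,\ \sqrt{R^2 - x^2 - y^2} \right) \tag{1-57a} \]
\[ \hat{v} \sim \frac{1}{R} \left( -x,\ -y,\ R \right) . \tag{1-57b} \]
Thus, the direction cosines $x/R$ and $y/R$ are very small so that the normal is practically
parallel to the z axis, which is the optical axis.
The direction cosines of a ray are approximated according to
\[ (k, l, m) = \left( k,\ l,\ \sqrt{1 - k^2 - l^2} \right) \tag{1-58a} \]
\[ (k, l, m) \sim (k, l, 1) . \tag{1-58b} \]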
With these approximations, the transfer, refraction, and reflection operation Eqs. (1-34),
(1-36), and (1-43) are simplified as follows.
\[ x_1 = x_0 + S_{01} k_0 \tag{1-59a} \]
\[ x_1 \sim x_0 + D_{01} k_0 , \tag{1-59b} \]
and
\[ y_1 = y_0 + S_{01} l_0 \tag{1-60a} \]
\[ y_1 \sim y_0 + D_{01} l_0 . \tag{1-60b} \]
Approximating the distance $S_{01}$ between two points by their axial distance $D_{01}$
decouples, for example, Eqs. (1-33) and (1-34).
1.7 Paraxial Ray Tracing 27
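\[ n_1 k_1 = n_0 k_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{x_1}{R_1} \tag{1-61a} \]
\[ n_1 k_1 \sim n_0 k_0 - (n_1 - n_0) \frac{x_1}{R_1} , \tag{1-61b} \]
and
\[ n_1 l_1 = n_0 l_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{y_1}{R_1} \tag{1-62a} \]
\[ n_1 l_1 \sim n_0 l_0 - (n_1 - n_0) \frac{y_1}{R_1} , \tag{1-62b} \]
where the angles of incidence and refraction $\theta_0$ and $\theta_1$ are small so that
$\cos\theta_1 \sim 1 \sim \cos\theta_0$.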
\[ k_1 = -k_0 - 2\cos\theta_0 \frac{x_1}{R_1} \tag{1-63a} \]
\[ k_1 \sim -k_0 - 2\frac{x_1}{R_1} , \tag{1-63b} \]
and
\[ l_1 = -l_0 - 2\cos\theta_0 \frac{y_1}{R_1} \tag{1-64a} \]
\[ l_1 \sim -l_0 - 2\frac{y_1}{R_1} . \tag{1-64b} \]
Again, the angles of incidence and reflection, which are equal to each other in magnitude,
are assumed to be small so that their cosine is unity.
We have seen that for rays and surface normals to refracting and reflecting surfaces
making small angles with the optical axis, we can replace the sines and tangents of the
angles of the rays with the optical axis by the angles, and any diagonal distances between
two points by the corresponding axial distances. Because the sine of an angle is replaced
by the angle itself, which is the first-order approximation, ray tracing in this
approximation is referred to as first-order optics. The approximation of small angles is
also called the Gaussian approximation. It is used to determine the image location and
size in terms of the object location and size. The image is referred to as the Gaussian
image, and the process of determining the image in this manner, regardless of the
magnitude of the angles and sizes, is called Gaussian optics. However, the larger the
angles and sizes are, the coarser the approximation, yielding poorer image quality due to
the larger aberrations.
It is quite common in optics literature to consider a point object along the y axis
when imaged by a rotationally symmetric optical system, thus making the yz plane the
tangential plane [2,3]. To maintain symmetry of the aberration function about this plane,
the polar angle $\theta$ of a pupil point is accordingly defined as the angle made by its position
vector with the y axis, contrary to the standard Cartesian convention as the angle with the
x axis. Choosing a point object along the x axis, thus making the zx plane the tangential
plane, removes this difficulty. The coma aberration, for example, is then expressed as
$x\left( x^2 + y^2 \right)$ instead of as $y\left( x^2 + y^2 \right)$.
In Gaussian optics, the aberrations of an image are neglected, or, equivalently, the
image is assumed to be aberration free. Gaussian imaging depends only on the vertex
radius of curvature. Thus, the Gaussian image formed by a conic surface of some vertex
radius of curvature is exactly the same as that formed by a spherical surface of the same
radius of curvature.
1.8 Gaussian Approximation and Imaging 29
Figure 1-13. Gaussian imaging by a refracting surface. A ray from a point object $A_0$
at a height $x_0$ is incident at a slope angle $\beta_0$ at a point $A_1$ at a height $x_1$ on the
tangent plane $V_1A_1$ of a refracting surface of vertex radius of curvature $R_1$, with its
center of curvature $C_1$ and separating media of refractive indices $n_0$ and $n_1$. The
ray is refracted at a slope angle $\beta_1$, forming the Gaussian image at a point $A_2$ at a
height $x_2$. The numerically negative quantities are indicated by a parenthetical
minus sign (–).

In the paraxial approximation, we use Eq. (1-59b) to obtain the height $x_1$ of the point
of incidence and Eq. (1-61b) to obtain the direction cosine $k_1$ of the refracted ray. If a ray
makes an angle $\alpha$ with the x axis, then its direction cosine $k = \cos\alpha$. Let the ray make
an angle $\beta$ with the z axis, called its slope angle, where $\alpha + \beta = \pi/2$. Evidently,
$\cos\alpha = \sin\beta$. For small slope angles, $\sin\beta \sim \beta$, and thus, $\cos\alpha \sim \beta$. Therefore, if $\beta_0$
and $\beta_1$ are the slope angles of the incident and the refracted rays, respectively, we can
write the ray-tracing equations (1-59b) and (1-61b) as
\[ x_1 = x_0 + D_{01}\beta_0 \tag{1-65} \]
and
\[ n_1\beta_1 = n_0\beta_0 - (n_1 - n_0) \frac{x_1}{R_1} . \tag{1-66} \]
Similarly, the height $x_2$ of a point $A_2$ on the refracted ray at an axial distance $D_{12}$ is
given by
\[ x_2 = x_1 + D_{12}\beta_1 . \tag{1-67} \]
Both $\beta_1$ and $x_2$ are numerically negative in Figure 1-13. Substituting for $x_1$ and $\beta_1$ from
Eqs. (1-65) and (1-66), we obtain
\[ x_2 = \left[ 1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1) \right] x_0 + \left[ D_{01} + \frac{n_0}{n_1} D_{12} + \frac{D_{01} D_{12}}{n_1 R_1}(n_0 - n_1) \right] \beta_0 . \tag{1-68} \]
In order that the point $A_2$ be the image of the point object $A_0$, all of the rays
originating at the point object and incident on the refracting surface must pass through the
image point after refraction. Thus, $x_2$ must be independent of $\beta_0$, or the coefficient of
$\beta_0$ in Eq. (1-68) must be zero. Therefore,
\[ D_{01} + \frac{n_0}{n_1} D_{12} + \frac{D_{01} D_{12}}{n_1 R_1}(n_0 - n_1) = 0 , \tag{1-69} \]
or, multiplying by $n_1/(D_{01} D_{12})$ and rearranging,
\[ \frac{n_0}{D_{01}} + \frac{n_1}{D_{12}} = \frac{n_1 - n_0}{R_1} . \tag{1-70} \]
This equation is called the Gaussian imaging equation for the refracting surface. It gives
the distance $D_{12}$ of the image from the surface in terms of the distance $D_{01}$ of the surface
from the object. With the coefficient of $\beta_0$ equal to zero, Eq. (1-68) reduces to
\[ x_2 = \left[ 1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1) \right] x_0 , \tag{1-71} \]
which gives the height x 2 of the image of an object of height x 0 . The ratio of x 2 to x 0 is
called the transverse magnification of the image. Thus,
\[ M_x = \frac{x_2}{x_0} \tag{1-72a} \]
\[ M_x = 1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1) \tag{1-72b} \]
or
\[ M_x = -\frac{n_0 D_{12}}{n_1 D_{01}} , \tag{1-72c} \]
where we have used Eq. (1-70) in the last step. Equations (1-70) and (1-72) describe the
location and size of the image in terms of the corresponding quantities for the object.
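The imaging condition can be checked by tracing paraxial rays with different slope angles and confirming that they all arrive at the same height. The sketch below does this for the refracting surface; the function names and the sample values (n0 = 1, n1 = 1.5, R1 = 10, D01 = 50) are illustrative assumptions.

```python
import math

def image_distance(D01, R1, n0, n1):
    """Image distance D12 from the Gaussian imaging equation n0/D01 + n1/D12 = (n1 - n0)/R1."""
    denom = (n1 - n0) / R1 - n0 / D01
    if denom == 0.0:
        return math.inf          # image at infinity
    return n1 / denom

def paraxial_trace(x0, beta0, D01, D12, R1, n0, n1):
    """Transfer, paraxial refraction, and transfer; returns (x2, beta1)."""
    x1 = x0 + D01 * beta0                                   # transfer to the tangent plane
    beta1 = (n0 * beta0 - (n1 - n0) * x1 / R1) / n1         # paraxial refraction
    x2 = x1 + D12 * beta1                                   # transfer to the image plane
    return x2, beta1

n0, n1, R1, D01 = 1.0, 1.5, 10.0, 50.0
D12 = image_distance(D01, R1, n0, n1)
Mx = -n0 * D12 / (n1 * D01)                                 # transverse magnification

# Rays from the same object point (x0 = 1) with different slope angles
heights = [paraxial_trace(1.0, b0, D01, D12, R1, n0, n1)[0] for b0 in (-0.02, 0.0, 0.01)]
print(D12, Mx, heights)
```

All three heights should coincide with Mx times the object height, which is what makes A2 the Gaussian image of A0.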
Substituting Eq. (1-65) into Eq. (1-66), the slope angle of the refracted ray can be written
\[ \beta_1 = \frac{n_0 - n_1}{n_1 R_1} x_0 + \left( \frac{n_0}{n_1} + \frac{n_0 - n_1}{n_1 R_1} D_{01} \right) \beta_0 . \tag{1-73} \]
If we consider a cone of rays of angular subtense $\Delta\beta_0$ diverging from the point object
and incident on the surface, the corresponding angular subtense $\Delta\beta_1$ of the cone of rays
converging to the image point can be obtained by differentiating Eq. (1-73):
\[ M_\beta = \frac{\partial\beta_1}{\partial\beta_0} \tag{1-74a} \]
\[ M_\beta = \frac{n_0}{n_1} + \frac{n_0 - n_1}{n_1 R_1} D_{01} , \tag{1-74b} \]
or
\[ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-74c} \]
where we have again used Eq. (1-70) in the last step. Equation (1-74) gives the angular
magnification of the rays. The product of the transverse and angular magnifications is
given by
\[ M_x M_\beta = \frac{n_0}{n_1} , \tag{1-75} \]
which is independent of the object and image distances. It illustrates that a large
transverse magnification is accompanied by a small angular magnification.
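As a quick check, the product of the two magnifications can be formed directly from their final forms, $M_x = -n_0 D_{12}/(n_1 D_{01})$ and $M_\beta = -D_{01}/D_{12}$. The numerical values in the second line are illustrative assumptions (with $R_1 = 10$, the imaging equation gives $D_{12} = 50$ for $D_{01} = 50$), not values from the text.

```latex
\begin{align*}
M_x M_\beta &= \left( -\frac{n_0 D_{12}}{n_1 D_{01}} \right) \left( -\frac{D_{01}}{D_{12}} \right) = \frac{n_0}{n_1} , \\
n_0 = 1,\ n_1 = 1.5,\ D_{01} = D_{12} = 50 :\quad
M_x &= -\tfrac{2}{3} , \quad M_\beta = -1 , \quad M_x M_\beta = \tfrac{2}{3} = \frac{n_0}{n_1} .
\end{align*}
```

The distances cancel in the product, which is why the result depends only on the two refractive indices.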
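Figure 1-14. Gaussian imaging by a reflecting surface. A ray from a point object $A_0$
at a height $x_0$ is incident at a slope angle $\beta_0$ at a point $A_1$ at a height $x_1$ on the
tangent plane $V_1A_1$ of a reflecting surface of vertex radius of curvature $R_1$, with its
center of curvature $C_1$. The ray is reflected at a slope angle $\beta_1$, forming the
Gaussian image at a point $A_2$ at a height $x_2$.

For a reflecting surface, as in Figure 1-14, the paraxial ray-tracing equations (1-59b) and
(1-63b) can be written in terms of the slope angles as
\[ x_1 = x_0 + D_{01}\beta_0 \tag{1-76} \]
and
\[ \beta_1 = -\beta_0 - 2\frac{x_1}{R_1} , \tag{1-77} \]
and the height $x_2$ of a point $A_2$ on the reflected ray at an axial distance $D_{12}$ is given by
\[ x_2 = x_1 + D_{12}\beta_1 . \tag{1-78} \]
Substituting for $x_1$ and $\beta_1$ from Eqs. (1-76) and (1-77), we obtain
\[ x_2 = \left( 1 - \frac{2D_{12}}{R_1} \right) x_0 + \left( D_{01} - D_{12} - \frac{2D_{01}D_{12}}{R_1} \right) \beta_0 . \tag{1-79} \]
In order that the point $A_2$ be the image of the point object $A_0$, all of the rays
originating at the point object and incident on the reflecting surface must pass through the
image point after reflection. Thus, $x_2$ must be independent of $\beta_0$, or the coefficient of $\beta_0$
in Eq. (1-79) must be zero. Therefore,
\[ D_{01} - D_{12} - \frac{2D_{01}D_{12}}{R_1} = 0 , \tag{1-80} \]
or, dividing by $D_{01}D_{12}$,
\[ \frac{1}{D_{12}} - \frac{1}{D_{01}} = \frac{2}{R_1} . \tag{1-81} \]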
This equation is called the Gaussian imaging equation for the reflecting surface. It gives
the distance D12 of the image from the surface in terms of the distance D01 of the surface
from the object. The image does not lie on the ray, but lies instead on its extension.
Accordingly, it is not real, but virtual.
With the coefficient of $\beta_0$ equal to zero, Eq. (1-79) reduces to
\[ x_2 = \left( 1 - \frac{2D_{12}}{R_1} \right) x_0 , \tag{1-82} \]
which gives the height $x_2$ of the image of an object of height $x_0$. The ratio of $x_2$ to $x_0$ is
called the transverse magnification of the image. Thus,
\[ M_x = \frac{x_2}{x_0} \tag{1-83a} \]
\[ M_x = 1 - \frac{2D_{12}}{R_1} \tag{1-83b} \]
or
\[ M_x = \frac{D_{12}}{D_{01}} , \tag{1-83c} \]
where we have used Eq. (1-81) in the last step. Equations (1-81) and (1-83) describe the
location and size of the image in terms of the corresponding quantities for the object.
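The same numerical check works for the reflecting surface. In the sketch below, the values R1 = −20 and D01 = 30 are illustrative assumptions, and the function names are hypothetical. The image distance is computed from 1/D12 − 1/D01 = 2/R1, and paraxial rays of different slope angles are traced through to it.

```python
import math

def mirror_image_distance(D01, R1):
    """Image distance D12 from the Gaussian imaging equation 1/D12 - 1/D01 = 2/R1."""
    denom = 2.0 / R1 + 1.0 / D01
    if denom == 0.0:
        return math.inf          # image at infinity
    return 1.0 / denom

def paraxial_mirror_trace(x0, beta0, D01, D12, R1):
    """Transfer, paraxial reflection, and transfer; returns (x2, beta1)."""
    x1 = x0 + D01 * beta0                 # transfer to the tangent plane
    beta1 = -beta0 - 2.0 * x1 / R1        # paraxial reflection
    x2 = x1 + D12 * beta1                 # transfer to the image plane
    return x2, beta1

R1, D01 = -20.0, 30.0                     # assumed sample values
D12 = mirror_image_distance(D01, R1)
Mx = D12 / D01                            # transverse magnification

heights = [paraxial_mirror_trace(1.0, b0, D01, D12, R1)[0] for b0 in (-0.01, 0.0, 0.01)]
print(D12, Mx, heights)
```

With these numbers D12 comes out negative, and every traced ray reaches the same height Mx x0 regardless of its initial slope angle.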
Substituting Eq. (1-76) into Eq. (1-77), the slope angle of the reflected ray can be written
\[ \beta_1 = -\frac{2}{R_1} x_0 - \left( 1 + \frac{2}{R_1} D_{01} \right) \beta_0 . \tag{1-84} \]
If we consider a cone of rays of angular subtense $\Delta\beta_0$ diverging from the point object
and incident on the surface, the corresponding angular subtense $\Delta\beta_1$ of the cone of rays
converging to the image point can be obtained by differentiating Eq. (1-84):
\[ M_\beta = \frac{\partial\beta_1}{\partial\beta_0} \tag{1-85a} \]
\[ M_\beta = -\left( 1 + \frac{2}{R_1} D_{01} \right) \tag{1-85b} \]
or
\[ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-85c} \]
where we have again used Eq. (1-81) in the last step. Equation (1-85) gives the angular
magnification of the rays. The product of the transverse and angular magnifications is
given by
\[ M_x M_\beta = -1 , \tag{1-86} \]
which is independent of the object and image distances. It illustrates that, as in the case of
imaging by a refracting surface, a large transverse magnification is accompanied by a
small angular magnification.
In a multisurface imaging system, the image formed by the first surface becomes the
object for the second surface, and so on, for the succeeding surfaces of the system. The
image thus formed by the last surface is the image formed by the system. Gaussian
imaging by a refracting system is developed in Chapter 2, and that by a reflecting system
in Chapter 3. These can be easily combined to treat imaging by a general imaging system
consisting of refracting and reflecting surfaces. It is shown in Chapter 2 that it is not
necessary to perform Gaussian imaging by each surface to obtain the image formed by a
system. Instead, once the location of two principal planes and two focal points of a
system are determined, the Gaussian image of an object formed by the system can be
determined in one step.
A quantity of paramount interest that is beyond Gaussian optics but that a design
must satisfy is the expected quality of the image. A designer must choose the shapes of
the imaging elements so as to balance their aberrations to yield an image of acceptable
quality across the field of view of the system. Because the image distance and transverse
magnification depend on the refractive indices of the materials of the elements of an
imaging system, which depend on the wavelength of the object radiation, the images
formed suffer from chromatic aberrations, as discussed in Chapter 7. For example, the
image of a white point object is not white. An optical designer strives to select materials
so that the chromatic aberrations they introduce cancel each other as much as possible.
In practice, the wavefront exiting from an imaging system is rarely spherical. Its
deviations from being spherical represent its wave aberrations. The possible aberrations
of a system with an axis of rotational symmetry are discussed in Chapter 8. When the
wavefront is not spherical, the rays intersect the image plane in the vicinity of the
Gaussian image point. The characteristics of the ray distribution in the image plane for
the various aberration types, i.e., the spot diagrams, are discussed in Chapter 9. A lens
designer resorts to using nonspherical surfaces to reduce or eliminate the aberrations over
a certain field of view. The Gaussian imaging properties of a nonspherical surface are, of
course, the same as those of the corresponding spherical surface because they are
determined by its vertex radius of curvature.
Even if the rays from a point object all converge to its Gaussian image point, its
observed image is not a point. The converging beam of the imaging light spreads due to
its diffraction as it propagates to the image plane. A brief discussion of the diffraction-
based aberration-free image, namely, the Airy pattern, is given in Section 6.8.2. In the
presence of aberrations, the light in the diffraction image spreads even more [7].
Our sign convention for distances, heights, and angles is the Cartesian sign
convention, discussed in Section 1.2. It has the advantage that there are no special rules to
remember other than those of a right-handed Cartesian coordinate system, regardless of
whether the object or the image is real or virtual, or whether a refracting or a reflecting
surface is convex or concave to the light incident on it.
Fermat's principle may be stated as
\[ \delta \int n\,ds = 0 , \tag{1-87} \]
where $ds$ is a differential element of path length along the ray, $n$ is the refractive index of
the medium as a function of the path, and $\delta$ represents a differential variation.
The law of refraction, or Snell's law, is
\[ n' \sin\theta' = n \sin\theta , \tag{1-88} \]
where $\theta$ and $\theta'$ are the angles of incidence and refraction from the surface normal at the
point of incidence. Moreover, the incident and refracted rays, and the surface normal, are
coplanar.
The law of reflection is
\[ \theta' = -\theta . \tag{1-89} \]
Once again, the incident and the reflected rays, and the surface normal are coplanar. The
law of reflection can be obtained from the law of refraction by letting $n' = -n$ because
the reflected ray lies in the same medium as the incident ray. The negative sign represents
the backward propagation of the reflected ray.
1.10 Summary of Results 37
The exact ray tracing consists of a transfer operation, in which a ray propagates from
a certain point to a point on a refracting or a reflecting surface, and a refraction or
reflection operation, which describes its refraction or reflection by the surface. Such ray
tracing is used primarily to determine the aberrations of a system with the aid of
computer software, and thereby the quality of an image.
When a ray with direction cosines (k0 , l 0 , m0 ) originating at a point object A0 with
coordinates ( x 0 , y 0 , z 0 ) , as in Figure 1-11, is incident on a spherical refracting surface of
radius of curvature R1 separating media of refractive indices n 0 and n1 at a distance
D01 , its rectilinear propagation from A0 to the point A1 where it meets the surface is
referred to as the transfer operation. The coordinates ( x1 , y1 , z1 ) of A1 are given by
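\[ x_1 = x_0 + S_{01} k_0 , \tag{1-90a} \]
\[ y_1 = y_0 + S_{01} l_0 , \tag{1-90b} \]
and
\[ z_1 = R_1 - \left( R_1^2 - x_1^2 - y_1^2 \right)^{1/2} , \tag{1-90c} \]
where
\[ S_{01} = \left[ (x_1 - x_0)^2 + (y_1 - y_0)^2 + (D_{01} + z_1)^2 \right]^{1/2} \tag{1-91} \]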
is the distance between A0 and A1 . The origin of the coordinates lies at the vertex V1 of
the surface, and thus z 0 = - D01 . Equation (1-90c) represents the fact that A1 lies on the
surface.
Equations (1-90) and (1-91) are coupled and must be solved simultaneously to obtain
the transverse coordinates ( x1, y1 ) of the ray at A1 . Once these coordinates are known,
the z1 coordinate is determined from Eq. (1-90c) by virtue of the fact that A1 lies on the
surface.
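The direction cosines $(k_1, l_1, m_1)$ of the refracted ray are given by
\[ n_1 k_1 = n_0 k_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{x_1}{R_1} , \tag{1-92a} \]
\[ n_1 l_1 = n_0 l_0 - \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{y_1}{R_1} , \tag{1-92b} \]
and
\[ n_1 m_1 = n_0 m_0 + \left( n_1 \cos\theta_1 - n_0 \cos\theta_0 \right) \frac{R_1 - z_1}{R_1} , \tag{1-92c} \]
where
\[ \cos\theta_0 = \frac{1}{R_1} \left( -x_1 k_0 - y_1 l_0 + \sqrt{R_1^2 - x_1^2 - y_1^2}\,\sqrt{1 - k_0^2 - l_0^2} \right) , \tag{1-93} \]
and
\[ \cos\theta_1 = \frac{1}{n_1} \left( n_1^2 - n_0^2 + n_0^2 \cos^2\theta_0 \right)^{1/2} . \tag{1-94} \]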
Once the direction cosines (k1 , l1 ) are known, the direction cosine m1 can be obtained
from the relation k12 + l12 + m12 = 1 .
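For a reflecting surface, the direction cosines $(k_1, l_1, m_1)$ of the reflected ray are given by
\[ k_1 = -k_0 - 2\cos\theta_0 \frac{x_1}{R_1} , \tag{1-95a} \]
\[ l_1 = -l_0 - 2\cos\theta_0 \frac{y_1}{R_1} , \tag{1-95b} \]
and
\[ m_1 = -m_0 + 2\cos\theta_0 \frac{R_1 - z_1}{R_1} , \tag{1-95c} \]
where $\theta_0$ is the angle of incidence of the ray given by Eq. (1-93). The reflection
operation can be obtained from the corresponding refraction operation by letting
$n_0 = 1 = -n_1$.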
The ray-tracing equations for conic refracting and reflecting surfaces are given
concisely in Sections 1.6.7 and 1.6.8. The differences for a conic surface compared to a
spherical surface result from their sag [see Eq. (1-45)] and the surface normal differences
[see Eq. (1-47)].
When the rays make small angles with the optical axis and surface normals, we can
approximate their sines and tangents with the angles themselves. Similarly, if the
transverse coordinates ( x , y ) of a point on a surface are much smaller than its radius of
curvature, we can neglect the sag of a refracting or reflecting surface and approximate the
diagonal distance between two points by the corresponding axial distance. Such
assumptions yield equations for the transverse coordinates that are no longer coupled.
The corresponding ray tracing is referred to as the paraxial ray tracing.
Moreover, the projections of a skew ray in the zx and yz planes propagate through a
system independently of each other. Consequently, for a rotationally symmetric imaging
system, we need to trace rays only in one of these planes. The plane of choice is generally
the tangential plane zx.
$$\frac{n_0}{D_{01}} + \frac{n_1}{D_{12}} = \frac{n_1 - n_0}{R_1} . \tag{1-96}$$
The transverse magnification of the image and the angular magnification of the rays
originating at a point object and converging to its Gaussian image point are given by
$$M_x = -\frac{n_0 D_{12}}{n_1 D_{01}} \tag{1-97}$$
and
$$M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-98}$$
so that
$$M_x M_\beta = \frac{n_0}{n_1} . \tag{1-99}$$
We point out that in ray tracing, the object distance D01 is measured from the object
to the surface. However, in Chapter 2, we will consider it from the vertex of the imaging
surface, thus changing its sign.
The Gaussian imaging equations for a reflecting surface of vertex radius of curvature
R1 , as in Figure 1-14, can be obtained from those for a corresponding refracting surface
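by letting $n_1 = -n_0$. Thus, we may write
$$\frac{1}{D_{12}} - \frac{1}{D_{01}} = \frac{2}{R_1} , \tag{1-100}$$
$$M_x = \frac{D_{12}}{D_{01}} , \tag{1-101}$$
$$M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-102}$$
and
$$M_x M_\beta = -1 . \tag{1-103}$$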
Generally, the refractive index of the medium for imaging by a reflecting surface is unity.
The imaging equations for a reflecting surface can be obtained from the corresponding
equations for a refracting surface by letting $n_0 = 1 = -n_1$. The minus sign with $n_1$
represents the backward propagation of the reflected ray compared to that of the incident
ray.
PROBLEMS
1.2 Show that an ellipsoidal mirror is a Cartesian surface for a point object placed at
one of its two geometrical foci.
1.3 Determine the focal length of a thin lens by considering an object at infinity. The
refractive index of the lens is n, and the radii of curvature of its two surfaces are R1
and R2 .
CHAPTER 2
REFRACTING SYSTEMS
We begin this chapter by rederiving the imaging equations for a refracting surface by
assuming small angles of incidence and refraction and small slope angles of the rays (as
in Section 1.8.2). How to determine the image graphically is also considered. We use
standard notation suitable for a multisurface imaging system. Both the Gaussian and
Newtonian forms of the imaging equations are given. These equations are used to obtain
the corresponding equations for a thin lens. The imaging equations for a multisurface
refracting system are derived next. The principal, focal, and nodal points, collectively
called the cardinal points of such systems, are discussed. It is shown that simple imaging
equations, similar to those for a single refracting surface, are obtained, provided the
object and image distances are measured from the respective principal points of the
system in the Gaussian form, and from the focal points in the Newtonian form of the
imaging equations. The concept of the Lagrange invariant is discussed in each case.
Afocal systems, i.e., those for which a parallel beam of light incident on them
emerges as a parallel beam of light, or the object and its image both lie at infinity, are also
discussed. Imaging by a plane-parallel plate is considered, and it is shown that the
distance between the object and its image is independent of the object location, depending
only on the refractive index and the thickness of the plate.
In Gaussian imaging, the object and image distances are measured along the optical
axis, even when they are located off the axis. This introduces a small focus error that
increases quadratically with the height of a point object. Consequently, an error-free
image of a plane object is formed on a spherical surface, called the Petzval image surface.
The radius of curvature of this surface is independent of the object or the image distance.
How this image is determined is discussed briefly. Next, how an image is displaced
because of a slight misalignment of an imaging element is discussed. Finally, we briefly
discuss imaging by an anamorphic system with different transverse magnifications in two
orthogonal symmetry planes, thus yielding a rectangular image of a square object.
We first consider the imaging of an axial point object $P_0$ lying at a distance $S$ from
$V$. An object ray $P_0Q$ incident at a point $Q$ on the surface at a height $x$ from the optical
axis is refracted as a ray $QP_0'$ intersecting the optical axis at a point $P_0'$ at a distance $S'$
from $V$. Let the angles of incidence and refraction (i.e., the angles of the incident and
refracted rays from the surface normal $QC$ at the point of incidence $Q$) be $\theta$ and $\theta'$,
respectively. Similarly, let the slope angles of the rays from the optical axis be $\beta_0$ and
$\beta_0'$.
[Figure 2-1: refraction of the ray $P_0Q$ by a spherical refracting surface of radius of curvature $R$ and center of curvature $C$ separating media of refractive indices $n$ and $n'$; the object distance $S$, the angle $\phi$, and the slope angle $\beta_0'$ are marked as numerically negative.]
In the Gaussian approximation of Snell's law (i.e., for small angles), the angles of
incidence and refraction are related to each other according to [see Eq. (1-54b)]
$$n'\theta' = n\theta . \tag{2-1}$$
The rays propagating according to this approximation are called paraxial rays. From the
triangle $P_0CQ$, we note that
$$\theta = \beta_0 - \phi , \tag{2-2a}$$
where the angle $\phi$ of the surface normal from the optical axis is numerically negative.
Similarly, from triangle $CP_0'Q$, we note that
$$\theta' = \beta_0' - \phi , \tag{2-2b}$$
where $\beta_0'$ is numerically negative. Now the tangent of a small angle is approximately
equal to the angle in radians. Thus, we may write
$$\beta_0 = -x/S , \tag{2-3a}$$
$$\beta_0' = -x/S' , \tag{2-3b}$$
and
$$\phi = -x/R , \tag{2-3c}$$
where the object distance $S$ is numerically negative because $P_0$ lies to the left of $V$.
Substituting Eqs. (2-3) into Eqs. (2-2) and substituting the results thus obtained into Eq.
(2-1), we obtain
$$\frac{n'}{S'} - \frac{n}{S} = \frac{n' - n}{R} . \tag{2-4}$$
We note that Eq. (2-4) is independent of the height $x$ of the point of incidence $Q$ of the
ray. Thus, in the Gaussian approximation, all rays incident on the surface pass through $P_0'$
after being refracted by it. Equation (2-4) is called the Gaussian imaging equation. It
gives the position of the image point for a given position of the object point. It is
applicable to any conic surface with a vertex radius of curvature $R$. The reference point
for the object and image distances is the vertex $V$ of the refracting surface. A point object,
such as $P_0$, and its corresponding Gaussian image point $P_0'$ are called conjugate
points.
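As a quick numerical illustration of Eq. (2-4), the sketch below solves it for the image distance. The function name `image_distance` and the sample values (air to glass, $R = 50$, $S = -200$, in arbitrary length units) are assumptions made for the example only.

```python
import math

def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4), n'/S' - n/S = (n' - n)/R, for the image distance S'.
    Distances follow the sign convention of the text: S < 0 for an object
    lying to the left of the vertex V."""
    vergence = (n_prime - n) / R + n / S      # equals n'/S'
    return math.inf if vergence == 0.0 else n_prime / vergence

S_prime = image_distance(n=1.0, n_prime=1.5, R=50.0, S=-200.0)
print(S_prime)   # about 300: a real image to the right of the vertex
```

Because the height $x$ of the point of incidence does not appear anywhere in the function, every paraxial ray from the object point yields the same $S'$.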
Imaging can also be considered in terms of waves. A point source emanates spherical
waves. A wave surface with a constant phase, spherical in this case, is called a wavefront.
Thus, as illustrated in Figure 2-2, a spherical wave of radius of curvature $S$ diverging
from the point object $P_0$ is incident on the refracting surface. The refracting surface
converts this wave into a spherical wave of radius of curvature $S'$ converging to the
image point $P_0'$. The curvature of a wavefront is called its vergence. When multiplied
by the refractive index of the medium in which the wavefront lies, it is called the optical
(or reduced) vergence. Thus, $V = n/S$ is the optical vergence of the incident wavefront,
and $V' = n'/S'$ is the optical vergence of the refracted wavefront. As shown later [see Eq.
(2-7)], the right-hand side of Eq. (2-4) is called the refracting power $K$ of the surface. In
terms of the vergences of the wavefronts and the power of the refracting surface, the
imaging equation can be written
$$V' - V = K . \tag{2-5}$$
[Figure 2-2: spherical wave of radius of curvature $S$ diverging from $P_0$ and, after refraction, a spherical wave of radius of curvature $S'$ converging to $P_0'$.]
In Figure 2-1, the point object $P_0$ is real in the sense that the object rays actually
originate from it. Similarly, the image point $P_0'$ is real in the sense that the refracted rays
actually pass through it. The object distance $S$ is numerically negative, and the image
distance $S'$ is numerically positive. However, we note from Eq. (2-4) that, as the object
moves closer to the refracting surface such that $|S| < nR/(n' - n)$, then $S'$ is numerically
negative, indicating that the image lies on the left-hand side of the refracting surface. This
is illustrated in Figure 2-3, where it is shown that an object ray $P_0Q$ from an object point
$P_0$ is refracted such that an extension of the refracted ray intersects the optical axis on the
object side at $P_0'$, which is the image of $P_0$. The image in this case is virtual in the sense
that any refracted ray appears to come from it, but does not actually pass through it. The
image can also be virtual if $R$ is numerically negative, i.e., if the center of curvature $C$ of
the refracting surface lies to the left of its vertex $V$, or if $n' < n$. If there is another
imaging element to the right of the refracting surface, then the rays incident on it are real,
and the virtual image becomes a real object for it.
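The transition from a real to a virtual image can be checked numerically. The sketch below is illustrative only; the helper names and the sample values ($n = 1$, $n' = 1.5$, $R = 50$) are hypothetical, and the threshold is taken as the magnitude $nR/(n' - n)$ of the object distance.

```python
import math

def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4) for S'; returns math.inf when the image is at infinity."""
    vergence = (n_prime - n) / R + n / S
    return math.inf if vergence == 0.0 else n_prime / vergence

def image_kind(S_prime):
    """For light travelling left to right, S' > 0 is real and S' < 0 is virtual."""
    return "real" if S_prime > 0 else "virtual"

n, n_prime, R = 1.0, 1.5, 50.0
limit = n * R / (n_prime - n)                    # 100 for these values
far = image_distance(n, n_prime, R, -200.0)      # object farther than the limit
near = image_distance(n, n_prime, R, -50.0)      # object closer than the limit
print(limit, far, image_kind(far), near, image_kind(near))
```

With these numbers, the object at a distance of 200 gives a real image, while the object at a distance of 50 gives a virtual image on the object side.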
2.2 Spherical Refracting Surface 49
Figure 2-3. Virtual image $P_0'$ of a real point object $P_0$, where $|S| < nR/(n' - n)$. An
object ray, such as $P_0Q$, is refracted by the refracting surface such that the
refracted ray appears to come from the image $P_0'$.
Figure 2-4. Imaging of a virtual point object $P_0$ by a refracting surface. The real
image lies at $P_0'$.
Figure 2-5. Imaging of a real point object $P_0$ lying to the right of a refracting
surface.
indices are reversed, i.e., they become negative quantities. However, Eq. (2-4) does not
change, as may be seen by reversing the signs of $n$ and $n'$. Therefore, the imaging
equation in this case becomes
$$\frac{n'}{S} - \frac{n}{S'} = \frac{n' - n}{R} . \tag{2-6}$$
emerging from it as lying in its image space. Sometimes, a distinction is made between a
real and a virtual space. The portion of the object space lying to the left of a system is
called its real object space, and the portion of the image space lying to its right is called
its real image space. The remaining portions are correspondingly called virtual object
and image spaces.
Figure 2-6. Focal points of a refracting surface. (a) Image-space focal point $F'$. (b)
Object-space focal point $F$. In Gaussian optics, refraction of the rays takes place at
the tangent plane passing through the vertex of the surface.
at infinity (i.e., $S' = \infty$), as in Figure 2-6b, is called the object-space focal length, where
$F$ is called the object-space focal point. Rays originating at $F$ and incident on the surface
are made parallel by it. Of course, if rays parallel to the optical axis are incident on the
surface from right to left, they will be focused at $F$ after being refracted by it. The planes
passing through the focal points $F$ and $F'$ that are perpendicular to the optical axis are
called the object-space and image-space focal planes, respectively. It should be evident
from Figure 2-6 that the focal points $F$ and $F'$ are not conjugate points.
By their definitions, the image-space and object-space focal lengths of the refracting
surface, obtained from Eq. (2-4), are given by
$$f' = \frac{n'}{n' - n} R \tag{2-7a}$$
and
$$f = -\frac{n}{n' - n} R , \tag{2-7b}$$
respectively. The two focal lengths are, therefore, related to each other according to
$$f' = -(n'/n) f . \tag{2-8}$$
Just as the image of an object can be virtual, so can the focal point of an imaging
system. This is illustrated in Figure 2-7, where the radius of curvature of the refracting
surface is numerically negative. A ray incident parallel to the optical axis is bent away
from the axis, and an extension of the refracted ray intersects the axis at the virtual focal
point $F'$. Similarly, the focal point is virtual if $R$ is numerically positive, but $n' < n$.
Figure 2-7. Virtual focal point $F'$ of a refracting surface. As in Figure 2-1, $n' > n$,
but $R$ is numerically negative.
The quantity on the right-hand side of Eq. (2-4) is called the refracting power $K$ of
the surface. It is a measure of the ability of the refracting surface to convert a parallel
beam into a converging beam; the shorter the distance at which the refracted beam is
focused, the higher the power of the refracting surface. Its reciprocal is called the
equivalent or effective focal length $f_e$ of the surface. Thus, we may write
$$K = \frac{n' - n}{R} = \frac{1}{f_e} . \tag{2-9}$$
The power $K$ and the equivalent focal length $f_e$ are positive if $n' - n$ and $R$ have the
same sign. Such a surface is called a positive or a converging surface. Similarly, $K$ and
$f_e$ are negative if $n' - n$ and $R$ have opposite signs. Such a surface is called a negative or
a diverging surface. We also note that $f_e = f'$ if $n' = 1$, i.e., the equivalent focal length
represents the image-space focal length when the refractive index $n'$ of the image space
is unity. In terms of the refracting power and focal lengths, Eq. (2-4) may be written
$$\frac{n'}{S'} - \frac{n}{S} = K = \frac{1}{f_e} = \frac{n'}{f'} = -\frac{n}{f} . \tag{2-10}$$
When the focal length is measured in meters, the unit of power is called a diopter (D),
which is measured in $\mathrm{m}^{-1}$.
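A short numerical sketch of Eqs. (2-7) through (2-10) follows. The function name `surface_focal_data` and the sample surface (air to glass with $R = 0.05$ m) are hypothetical choices for illustration.

```python
def surface_focal_data(n, n_prime, R):
    """Power and focal lengths of a single refracting surface (R in meters)."""
    K = (n_prime - n) / R          # Eq. (2-9), in diopters
    return {
        "K": K,
        "fe": 1.0 / K,             # equivalent focal length
        "f_prime": n_prime / K,    # image-space focal length, Eq. (2-7a)
        "f": -n / K,               # object-space focal length, Eq. (2-7b)
    }

data = surface_focal_data(n=1.0, n_prime=1.5, R=0.05)
print(data)   # K is about 10 D, fe about 0.10 m, f' about 0.15 m, f about -0.10 m
```

The two focal lengths returned satisfy $f' = -(n'/n) f$, and $f_e$ differs from $f'$ here because $n' \neq 1$.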
2.2.4 Magnifications and Lagrange Invariant
Now we consider the imaging of an off-axis point object $P$ lying at a height $h$ from
the optical axis in the object plane passing through $P_0$, as illustrated in Figure 2-8. The
incident and the refracted rays $PV$ and $VP'$, respectively, are shown in the figure passing
through the vertex $V$. The image lies at the point $P'$, where the refracted ray $VP'$
intersects the image plane passing through $P_0'$. The object and image planes are
parallel to each other and perpendicular to the optical axis. It is evident from the figure that
the angles of incidence and refraction from the surface normal at $V$, i.e., from the optical
axis, are given by
$$\theta = h/S \tag{2-11a}$$
and
$$\theta' = h'/S' , \tag{2-11b}$$
respectively. Note that $\theta$, $\theta'$, and $h'$ are all numerically negative. Substituting Eqs. (2-11)
into the Snell's law equation (2-1), we find that the transverse magnification of the
image is given by
$$M_t \equiv \frac{h'}{h} = \frac{nS'}{n'S} . \tag{2-12a}$$
It should be evident that the image of an extended object lying in the object plane is
uniformly magnified so that the image is geometrically similar to the object. Substituting
for $n'/S'$ from Eq. (2-10) into Eq. (2-12a), the magnification can be written in terms of
the object distance $S$ and the focal length $f'$:
$$M_t = \frac{nf'}{nf' + n'S} . \tag{2-12b}$$
It can also be written in terms of the distances of the object and the image from the
center of curvature $C$:
$$M_t = -\frac{S' - R}{R - S} . \tag{2-12c}$$
Figure 2-8. Imaging of an off-axis point object $P$ lying at a height $h$ from the optical
axis. The image point $P'$ lies at a height $h'$.
The ray angular magnification, representing the ratio of the angular convergence of
the rays to $P_0'$ to their angular divergence from $P_0$ (see Figure 2-9), is given by
$$M_\beta = \beta_0'/\beta_0 = S/S' . \tag{2-13}$$
Note that $M_\beta$ is not the ratio of the angular sizes $\theta'$ and $\theta$ of the image $P_0'P'$ and the
object $P_0P$, respectively, subtended at $V$ in Figure 2-8. From Eqs. (2-12) and (2-13), we
find that the product of the transverse and angular magnifications is given by
$$M_t M_\beta = n/n' , \tag{2-14}$$
which depends only on the ratio of the refractive index of the object space to that of the
image space. In particular, it does not depend on the object and image distances.
Consequently, a large transverse magnification of the image can be obtained only with a
correspondingly small angular magnification of the rays, i.e., by having a much smaller
angular divergence of the rays at the image than at the object. From the definitions of the
magnifications, namely, Eqs. (2-12a) and (2-13), Eq. (2-14) can also be written
$$n'h'\beta_0' = nh\beta_0 , \tag{2-15}$$
showing that the quantity $nh\beta_0$ does not change upon refraction (see Figure 2-9). This
quantity is called the Lagrange (or the Smith–Helmholtz) invariant.
From Eqs. (2-12a) and (2-13), the transverse magnification of the image can also be
written
$$M_t = \frac{n\beta_0}{n'\beta_0'} , \tag{2-16}$$
i.e., it can be obtained from the slope angles of the incident and refracted rays for an axial
object point.
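The invariance of $nh\beta_0$ can be verified with a few lines of arithmetic. The following sketch uses hypothetical sample values (a surface with $n = 1$, $n' = 1.5$, $R = 50$, an object at $S = -200$ of height $h = 2$, and a ray striking the surface at height $x = 1$); the variable names are illustrative only.

```python
n, n_prime, R = 1.0, 1.5, 50.0
S, h, x = -200.0, 2.0, 1.0

S_prime = n_prime / ((n_prime - n) / R + n / S)   # Eq. (2-4)
M_t = n * S_prime / (n_prime * S)                 # Eq. (2-12a)
M_beta = S / S_prime                              # Eq. (2-13)
h_prime = M_t * h

beta0 = -x / S                                    # Eq. (2-3a)
beta0_prime = -x / S_prime                        # Eq. (2-3b)

invariant_object = n * h * beta0
invariant_image = n_prime * h_prime * beta0_prime
print(M_t * M_beta, n / n_prime)                  # both about 0.667
print(invariant_object, invariant_image)          # both about 0.01
```

Changing $x$ rescales both slope angles by the same factor, so the two invariants remain equal to each other for any paraxial ray height.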
[Figure 2-9: slope angles $\beta_0$ and $\beta_0'$ of an axial ray from $P_0$ to $P_0'$, and the object and image heights $h$ and $h'$.]
$$M_\theta \equiv \frac{\theta'}{\theta} = \frac{h'}{S'}\,\frac{S}{h} = M_t M_\beta = \frac{n}{n'} . \tag{2-17}$$
Suppose we treat the angles $\theta$ and $\theta'$ as the angular divergences of the rays from an
object and its image located at $V$. We should then be able to obtain Eq. (2-17) from Eq.
(2-13) by replacing $\beta_0$ by $\theta$ and $\beta_0'$ by $\theta'$. This is indeed the case because $M_t = 1$ for the
conjugates located at the vertex. Thus, Eqs. (2-12a) and (2-13) yield the result
$M_\theta = n/n'$.
For a small change $\Delta S$ in the object distance, let the corresponding change in the
image distance be $\Delta S'$, as illustrated in Figure 2-10. The ratio $\Delta S'/\Delta S$ is called the
longitudinal magnification $M_l$ because it represents the magnification of the image of a
small axial object. Differentiating both sides of Eq. (2-4), we find that
$$M_l \equiv \frac{\Delta S'}{\Delta S} = \frac{n}{n'}\left(\frac{S'}{S}\right)^2 = \frac{n'}{n} M_t^2 . \tag{2-18}$$
[Figure 2-10: displacement $\Delta S$ of the axial object point from $P_0$ to $P_1$ and the corresponding displacement $\Delta S'$ of its image from $P_0'$ to $P_1'$.]
Since $M_l$ is positive, the image
moves in the same direction as the object. Because the value of $M_t$ varies with the
position of the object, $M_l$ also varies with it. Therefore, Eq. (2-18) is valid only for
infinitesimal values of $\Delta S$. In this equation, the refracting surface is assumed to be fixed,
and $\Delta S'$ represents the image displacement corresponding to an object displacement $\Delta S$.
However, if the object is fixed and the refracting surface is displaced by an amount $\Delta$,
then the corresponding displacement of the image is given by $(1 - n'M_t^2/n)\Delta$, as shown
in Section 2.8.3.
We note from Eq. (2-18) that, unless $M_t = \pm 1$, the longitudinal and transverse
magnifications are not equal, and the 3D image of a 3D object is accordingly
geometrically different from the object. This is illustrated in Figure 2-11. The transverse
image is reversed, as illustrated by the reversal of the arrows $P_0'x'$ and $P_0'y'$ compared to
the arrows $P_0x$ and $P_0y$. As illustrated by the arrows $P_0z$ and $P_0'z'$, the longitudinal
image points in the same direction as the object, yielding a positive longitudinal
magnification.
[Figure 2-11: arrows $P_0x$, $P_0y$, and $P_0z$ of a 3D object and their images $P_0'x'$, $P_0'y'$, and $P_0'z'$.]
[Figure 2-12: axial object $P_0P_1$ of length $L$ and its image $P_0'P_1'$ of length $L'$, with object distances $S_0$ and $S_1$ and image distances $S_0'$ and $S_1'$.]
$$\frac{n'}{S_0'} - \frac{n}{S_0} = K \tag{2-19a}$$
and
$$\frac{n'}{S_1'} - \frac{n}{S_1} = K . \tag{2-19b}$$
$$\frac{n'}{S_1'} - \frac{n}{S_1} = \frac{n'}{S_0'} - \frac{n}{S_0} ,$$
or
$$n'\left(\frac{1}{S_1'} - \frac{1}{S_0'}\right) = n\left(\frac{1}{S_1} - \frac{1}{S_0}\right) . \tag{2-20}$$
$$M_l = \frac{L'}{L} = \frac{S_1' - S_0'}{S_1 - S_0} = \frac{n}{n'}\,\frac{S_0' S_1'}{S_0 S_1} = \frac{n'}{n} M_0 M_1 , \tag{2-21}$$
where $M_0 = nS_0'/n'S_0$ and $M_1 = nS_1'/n'S_1$ are the transverse magnifications of the
images lying in image planes passing through $P_0'$ and $P_1'$, respectively. Thus, for
example, the image of a cube is a truncated pyramid, as illustrated in Figure 2-13. The
pyramid becomes approximately a rectangular parallelepiped if the cube is infinitesimal
in size. It is shown in Section 2.5 that for an afocal system, $M_t$ is independent of the
position of the object and, therefore, so is $M_l$.
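The difference between the finite and the infinitesimal longitudinal magnification can be seen with a small calculation. This sketch uses hypothetical values ($n = 1$, $n' = 1.5$, $R = 50$, and an axial object extending from $S_0 = -200$ to $S_1 = -150$); the helper names are illustrative.

```python
def image_distance(n, n_prime, R, S):
    """Solve n'/S' - n/S = (n' - n)/R for S'."""
    return n_prime / ((n_prime - n) / R + n / S)

def transverse_magnification(n, n_prime, S, S_prime):
    """M = n S' / (n' S) for a single refracting surface."""
    return n * S_prime / (n_prime * S)

n, n_prime, R = 1.0, 1.5, 50.0
S0, S1 = -200.0, -150.0
S0p = image_distance(n, n_prime, R, S0)          # about 300
S1p = image_distance(n, n_prime, R, S1)          # about 450
M0 = transverse_magnification(n, n_prime, S0, S0p)
M1 = transverse_magnification(n, n_prime, S1, S1p)

M_l_finite = (S1p - S0p) / (S1 - S0)             # L'/L
M_l_product = (n_prime / n) * M0 * M1            # right-hand side of Eq. (2-21)
M_l_local = (n_prime / n) * M0 ** 2              # infinitesimal value at P0
print(M_l_finite, M_l_product, M_l_local)        # about 3, 3, and 1.5
```

The first two numbers agree, as Eq. (2-21) requires, while the infinitesimal value at $P_0$ is only half as large because the transverse magnification changes from $-1$ to $-2$ along the object.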
Figure 2-13. Image of a cube. The image is a truncated pyramid owing to different
transverse magnifications of the images of objects lying in different object planes.
Extension of one or more of these rays may be necessary for them to intersect each other.
Moreover, in Gaussian optics, which is based on the paraxial rays, any refraction (or
reflection) at a surface takes place at the plane that is a tangent to it at its vertex, as
shown, for example, in Figures 2-9 and 2-14.
a dashed line) and passing through $C$ intersects the image-space focal plane at a point $D$.
The refracted ray corresponding to the incident ray $P_0E$ passes through the point $D$ and
intersects the optical axis at the Gaussian image point $P_0'$. The parallel rays $P_0E$ and $CD$
are focused by the refracting surface at the point $D$ in the focal plane. The point $D$ may
also be determined by considering a hypothetical parallel ray passing through the object-space
focal point $F$. It is refracted as a ray parallel to the optical axis intersecting the focal
plane at the point $D$.
[Figure 2-14: graphical determination of the image $P'$ of an off-axis point object $P$ using three rays, labeled 1, 2, and 3; the distances $z$ and $z'$ are measured from the focal points $F$ and $F'$.]
Figure 2-15. Graphical imaging to determine the image $P_0'$ of an axial point object
$P_0$.
2.3 Thin Lens 61
$$M_t \equiv h'/h = -f/z , \tag{2-22}$$
where $z$ (like $f$) is numerically negative because $P_0$ lies to the left of the reference point
$F$. Similarly, from similar triangles $VF'B$ and $F'P_0'P'$, it may also be written
$$M_t = -z'/f' . \tag{2-23}$$
Equating the right-hand sides of these equations, we obtain the Newtonian imaging
equation:
$$zz' = ff' = -(n/n') f'^2 . \tag{2-24}$$
It is evident from Eq. (2-24) that $z$ and $z'$ must have opposite signs, implying that an
object and its image lie on the opposite sides of the corresponding focal points. For
example, if the object lies to the left of $F$, then the image lies to the right of $F'$.
Differentiating both sides of Eq. (2-24) and using Eqs. (2-8), (2-22), and (2-23), we
obtain Eq. (2-18), relating the longitudinal and transverse magnifications.
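The Newtonian relations can be cross-checked against the Gaussian ones. In this sketch, $z = S - f$ and $z' = S' - f'$ are taken as the object and image distances measured from $F$ and $F'$, which is how the figure labels are read here; the numerical values are hypothetical.

```python
n, n_prime, R, S = 1.0, 1.5, 50.0, -200.0

f_prime = n_prime * R / (n_prime - n)              # Eq. (2-7a): 150
f = -n * R / (n_prime - n)                         # Eq. (2-7b): -100
S_prime = n_prime / ((n_prime - n) / R + n / S)    # Eq. (2-4): about 300

z = S - f                    # object distance measured from F
z_prime = S_prime - f_prime  # image distance measured from F'

M_t_gauss = n * S_prime / (n_prime * S)            # Eq. (2-12a)
M_t_object = -f / z                                # Eq. (2-22)
M_t_image = -z_prime / f_prime                     # Eq. (2-23)
print(z * z_prime, f * f_prime)                    # both about -15000
print(M_t_gauss, M_t_object, M_t_image)            # all about -1
```

Here $z$ is negative and $z'$ is positive, so the object lies to the left of $F$ and the image to the right of $F'$.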
Figure 2-16. Imaging of an axial point object $P_0$ by a thin lens of refractive index $n$.
The lens surfaces have radii of curvature of $R_1$ and $R_2$. The line $OA$ connecting
their centers of curvature $C_1$ and $C_2$ defines the optical axis of the lens. $C$ is the
center of the lens. $P_0'$ is the image of $P_0$ formed by the first surface, and $P_0''$ is the
image of the virtual object $P_0'$ formed by the second surface.
optical axis $OA$ of the lens. Consider an axial point object $P_0$ lying at a distance $S_1$ from
the lens. Its image $P_0'$ formed by the first surface lies at a distance $S_1'$ that, according to
Eq. (2-4), is given by
$$\frac{n}{S_1'} - \frac{1}{S_1} = \frac{n - 1}{R_1} . \tag{2-25}$$
A ray from $P_0$ is refracted by the surface intersecting the optical axis at $P_0'$. This image is
a virtual object for the second surface because the rays associated with it appear to
converge to it rather than actually diverge from it. It lies at a distance $S_2 = S_1'$. Its image
$P_0''$ formed by the second surface lies at a distance $S_2'$ that, according to Eq. (2-4), is given by
$$\frac{1}{S_2'} - \frac{n}{S_1'} = \frac{1 - n}{R_2} . \tag{2-26}$$
Adding Eqs. (2-25) and (2-26), we obtain
$$\frac{1}{S'} - \frac{1}{S} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) , \tag{2-27}$$
where we have let $S_1 = S$ and $S_2' = S'$ be the object and final image distances, as
indicated in Figure 2-16. Equation (2-27) is the Gaussian imaging equation relating the
object and image distances.
$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) . \tag{2-28}$$
Thus, a ray incident on the lens parallel to its optical axis is refracted by the first surface
intersecting the optical axis at $F_1'$ at a distance $nR_1/(n - 1)$, as illustrated in Figure 2-16.
This ray is refracted by the second surface intersecting the optical axis at $F'$, which is the
image-space focal point. In effect, the parallel ray incident on the lens is refracted by it
passing through $F'$, as illustrated in Figure 2-17a. Similarly, by definition of the object-space
focal length, $f$ represents the object distance that yields an image at infinity. Thus,
$S' = \infty$ when $S = f$, where $f = -f'$. A ray from the object-space focal point $F$ incident
on the lens emerges from it parallel to its optical axis upon refraction, as illustrated in
Figure 2-17b. It should be evident that the focal points $F$ and $F'$, which lie on the
opposite sides of the lens, are not conjugates of each other. The imaging equation (2-27)
can be written in terms of the focal length $f'$ as
$$\frac{1}{S'} - \frac{1}{S} = \frac{1}{f'} . \tag{2-29}$$
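The two-step imaging by the lens surfaces and the single thin-lens equation can be compared numerically. The sketch below assumes a hypothetical biconvex lens in air ($n = 1.5$, $R_1 = 100$, $R_2 = -100$) and an object at $S = -300$; the function name is illustrative.

```python
def surface_image_distance(n, n_prime, R, S):
    """Solve n'/S' - n/S = (n' - n)/R for S' (single refracting surface)."""
    return n_prime / ((n_prime - n) / R + n / S)

n_lens, R1, R2, S = 1.5, 100.0, -100.0, -300.0

# Step 1: refraction by the first surface, air to glass [Eq. (2-25)].
S1_prime = surface_image_distance(1.0, n_lens, R1, S)
# Step 2: that image is the object for the second surface, glass to air
# [Eq. (2-26)]; for a thin lens, S2 = S1'.
S2_prime = surface_image_distance(n_lens, 1.0, R2, S1_prime)

# Single-step thin-lens result [Eqs. (2-28) and (2-29)].
f_prime = 1.0 / ((n_lens - 1.0) * (1.0 / R1 - 1.0 / R2))
S_prime = 1.0 / (1.0 / f_prime + 1.0 / S)
print(S1_prime, S2_prime, f_prime, S_prime)   # about 900, 150, 100, 150
```

The intermediate image at about 900 lies far to the right of the lens and acts as a virtual object for the second surface; the final image distance agrees with the thin-lens equation.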
The right-hand side of Eq. (2-27) represents the refracting power $K$ of the lens. Its
reciprocal is called the equivalent or effective focal length $f_e$ of the lens. Thus, we may
write
$$K = \frac{1}{f_e} = \frac{1}{f'} \tag{2-30}$$
$$= (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{2-31a}$$
$$= (n - 1)(C_1 - C_2) , \tag{2-31b}$$
where $C = 1/R$ is the curvature of a surface. We note that the refracting power of the lens
is equal to the sum of the refracting powers $K_1$ and $K_2$ of its two surfaces, i.e.,
$$K = K_1 + K_2 , \tag{2-32}$$
where
$$K_1 = \frac{n - 1}{R_1} \tag{2-33a}$$
and
$$K_2 = \frac{1 - n}{R_2} . \tag{2-33b}$$
Figure 2-17. Focal points of a positive thin lens with its center $C$. (a) Image-space
focal point $F'$. (b) Object-space focal point $F$. Both focal points are real in that
parallel rays converge to $F'$, and rays actually originating from $F$ form a parallel
beam after refraction by the lens.
We note that the focal length or the power of a lens depends on the difference in the
curvatures of its surfaces but not on the curvatures themselves. Thus, if the curvatures of
the lens surfaces are changed by the same amount, its shape changes without changing its
Gaussian properties. This degree of freedom, called the bending of the lens, is used in
reducing its aberrations. The equation (2-31) for the focal length of a thin lens, in terms of
its refractive index and the curvatures of its surfaces, has traditionally been called the lens
maker’s formula. This is, however, not correct because a lens of zero thickness cannot be
fabricated. This name should instead be associated with Eq. (4-41) for a thick lens
(described in Chapter 4).
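The bending of a lens can be illustrated with a short calculation. In this sketch, the function name and the curvature values are hypothetical; both surface curvatures are changed by the same amount and the thin-lens power is recomputed.

```python
def thin_lens_power(n, C1, C2):
    """Power of a thin lens in air from its surface curvatures, Eq. (2-31b)."""
    return (n - 1.0) * (C1 - C2)

n = 1.5
C1, C2 = 0.01, -0.01          # equiconvex lens: R1 = 100, R2 = -100
bend = 0.02                   # same change applied to both curvatures

K_original = thin_lens_power(n, C1, C2)
K_bent = thin_lens_power(n, C1 + bend, C2 + bend)   # meniscus: R1 = 33.3, R2 = 100
print(K_original, K_bent)     # both about 0.01
```

The bent lens is a meniscus with the same Gaussian power, which is the degree of freedom the text describes as being used to reduce aberrations.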
A lens with surface curvatures of the same sign is called a meniscus lens. It can be
positive or negative, as illustrated in Figure 2-19. Unless it is surrounded by a medium of
higher refractive index, a lens that is thick at the center compared to its edges is positive,
and a lens that is thin at the center is negative. Of course, one of the surfaces may be
planar, in which case the lens is called planoconvex or planoconcave, depending on the
curvature of the other surface.
Figure 2-18. Focal points of a negative thin lens. (a) Image-space focal point $F'$. (b)
Object-space focal point $F$. Both focal points are virtual in that parallel rays appear
to diverge from $F'$, or rays appearing to converge to $F$ form a parallel beam after
refraction by the lens.
Figure 2-19. (a) Positive and (b) negative meniscus lens. The radii of curvature of
their surfaces have the same sign. The lens thickness at the center is higher
compared to that at the edges for a positive meniscus, and lower for a negative
meniscus.
Figure 2-20. (a) Virtual point object $P_0$ at the real focus $F'$ of a positive lens. The
image point $P_0'$ is real. (b) Real point object $P_0$ at the virtual focus $F'$ of a negative
lens. The image point $P_0'$ is virtual.
Similarly, if a point object is placed at the focal point $F'$ of a negative lens, as in Figure
2-20b, i.e., a real point object $P_0$ at $F'$, a virtual image is formed at $P_0'$. The image
distance in both cases is given by half the corresponding focal length.
$$M_1 \equiv h_1'/h_1 \tag{2-34a}$$
$$= \frac{S_1'}{nS_1} . \tag{2-34b}$$
A parallel ray from $P$ is refracted by the first surface passing through its focal point $F_1'$,
which, in turn, is refracted by the second surface passing through the focal point $F'$ of
the lens and intersecting the final image plane at the image point $P''$. The magnification
of the erect image $P_0''P''$ of the object $P_0'P'$ formed by the second surface is given by
$$M_2 \equiv h_2'/h_2 \tag{2-35a}$$
$$= \frac{nS_2'}{S_1'} . \tag{2-35b}$$
Therefore, the transverse magnification of the final image $P_0''P''$ of the object $P_0P$
formed by the lens as a whole is given by
$$M_t = M_1 M_2 = h_2'/h_1 = S_2'/S_1 , \tag{2-36}$$
or
$$M_t \equiv h'/h = S'/S , \tag{2-37a}$$
where we have let $h = h_1$ and $h' = h_2'$ be the object and final image heights, respectively.
Substituting for $S'$ from Eq. (2-29) into Eq. (2-37a), the magnification can also be
written in terms of $S$ and $f'$:
$$M_t = \frac{f'}{f' + S} . \tag{2-37b}$$
Figure 2-21. Imaging of an off-axis point object $P$. The dotted line simply shows that
the final image $P''$ lies on the line joining $P'$ and $C_2$, as expected.
The angular magnification of a ray bundle diverging from the axial point object $P_0$
and converging toward its image $P_0'$ (see Figure 2-22) is given by
$$M_\beta = \beta_0'/\beta_0 = S/S' . \tag{2-38}$$
From Eqs. (2-37) and (2-38), we find that the product of the transverse magnification of
the image and the angular magnification of the ray bundle for a thin lens is given by
$$M_t M_\beta = 1 . \tag{2-39}$$
From the definitions of the magnifications, Eq. (2-39) can also be written
$$h'\beta_0' = h\beta_0 , \tag{2-40}$$
showing that the quantity $h\beta_0$ is invariant upon refraction by the lens. This quantity is
called the Lagrange invariant. [It is shown in Section 5.4.10 that the object flux entering
the lens is proportional to its square.] From Eq. (2-40), the transverse magnification of the
image can also be written
$$M_t = \beta_0/\beta_0' , \tag{2-41}$$
i.e., it is given by the ratio of the slope angles of the incident and refracted rays for an
axial point object.
[Figure 2-22: slope angles $\beta_0$ and $\beta_0'$ of a ray from the axial point object $P_0$ to its image $P_0'$ formed by a thin lens, and the object and image heights $h$ and $h'$.]
The comments made following Eq. (2-18) apply to Eq. (2-42) as well. Thus, for example,
when the object is displaced longitudinally, the image is displaced in the same direction
as the object. In Eq. (2-42), the lens is assumed to be fixed in position, and $\Delta S'$
represents the displacement of the image corresponding to a displacement $\Delta S$ of the
object. However, if the object is fixed and the lens is displaced by an amount $\Delta$, then the
corresponding displacement of the image is $(1 - M_t^2)\Delta$, as shown in Section 2.9.3.
$$M_t \equiv h'/h = -f/z . \tag{2-43}$$
$$M_t = -z'/f' . \tag{2-44}$$
The negative sign on the right-hand sides of Eqs. (2-43) and (2-44) has been introduced
because $M_t$ in Figure 2-23 is numerically negative due to $h'$ being numerically negative.
From Eqs. (2-43) and (2-44), we obtain
$$zz' = ff' = -f'^2 , \tag{2-45}$$
which is the Newtonian imaging equation. It is clear from this equation that $z$ and $z'$ must
have opposite signs, implying that an object and its image lie on opposite sides of the
corresponding focal points. Differentiating both sides of Eq. (2-45) and using Eqs. (2-43)
and (2-44), we obtain Eq. (2-42) relating the transverse and longitudinal magnifications.
Figure 2-23. Imaging by a lens of refractive index $n$ and focal length $f'$. Compared
with Figure 2-15, the two-step imaging (one for each surface) has been replaced by
single-step imaging.
$$\frac{1}{L + S} - \frac{1}{S} = \frac{1}{f'} , \tag{2-46}$$
or
$$L = -\frac{S^2}{S + f'} , \tag{2-47}$$
and
$$0 = \frac{\partial L}{\partial S} = -\frac{S(S + 2f')}{(S + f')^2} . \tag{2-48}$$
We discard the solution $S = 0$ because it implies an object and its image both located at
the lens. Therefore,
$$S = -2f' . \tag{2-49}$$
$$L_{\min} = S' - S = 4f' . \tag{2-50}$$
Thus, the minimum throw of a thin lens for real conjugates is equal to $4f'$. The
magnification of the image in that case is $-1$.
Figure 2-24. Image throw $L$ of a lens representing the distance between an object
and its image.
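The minimum throw can be confirmed by a brute-force scan of Eq. (2-47). The focal length of 100 and the grid of object distances below are hypothetical choices for illustration; only real objects that form real images ($S < -f'$) are scanned.

```python
f_prime = 100.0

def throw(S):
    """Object-to-image distance L = -S^2 / (S + f'), Eq. (2-47)."""
    return -S * S / (S + f_prime)

# Real objects farther from the lens than the focal point: S = -101 ... -1000.
candidates = [-(f_prime + 1.0 + i) for i in range(900)]
S_best = min(candidates, key=throw)
L_min = throw(S_best)
M_t = (L_min + S_best) / S_best     # S'/S with S' = L + S
print(S_best, L_min, M_t)           # -200.0 400.0 -1.0
```

The scan returns $S = -2f'$ and $L = 4f'$, with unit magnification and an inverted image, in agreement with Eqs. (2-49) and (2-50).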
By interchanging the (magnitudes of the) object and image distances, Eq. (2-29)
shows that a pair of conjugates is obtained for two positions of the lens, as illustrated in
Figure 2-25. The focal length of a lens can be determined accurately from the throw of an
image and the spacing d between the lens positions. We note from the figure that
$$ L = S' - S \tag{2-51} $$
and
$$ d = -(S' + S) . \tag{2-52} $$
Solving Eqs. (2-51) and (2-52) for $S$ and $S'$, we obtain
$$ S = -\frac{L+d}{2} \tag{2-53a} $$
and
$$ S' = \frac{L-d}{2} . \tag{2-53b} $$
Figure 2-25. Two lens positions for a pair of conjugates. The object and image distances are interchanged as the lens is moved from one position to the other.
Substituting Eqs. (2-53) into Eq. (2-29), we obtain the focal length of the lens:
$$ f' = \frac{L^{2} - d^{2}}{4L} . \tag{2-54} $$
The transverse magnifications of the image for the two lens positions are
$$ M_1 \equiv \frac{h'}{h} = \frac{S'}{S} = -\frac{L-d}{L+d} \tag{2-55a} $$
and
$$ M_2 \equiv \frac{h''}{h} = \frac{S}{S'} = \frac{1}{M_1} . \tag{2-55b} $$
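As a worked illustration of the two-position method, the sketch below evaluates Eqs. (2-53) to (2-55) for an assumed throw and lens displacement. The numerical values and the function names are hypothetical and serve only to show how the quantities fit together.

```python
def focal_length_two_position(L, d):
    """Focal length from throw L and lens displacement d, Eq. (2-54)."""
    return (L**2 - d**2) / (4.0 * L)


def conjugate_distances(L, d):
    """Object and image distances for the first lens position, Eqs. (2-53)."""
    S = -(L + d) / 2.0
    S_prime = (L - d) / 2.0
    return S, S_prime


L, d = 500.0, 100.0  # hypothetical measured throw and lens displacement (mm)
f_prime = focal_length_two_position(L, d)
S, S_prime = conjugate_distances(L, d)
M1 = S_prime / S   # magnification at the first position, Eq. (2-55a)
M2 = S / S_prime   # magnification at the second position, Eq. (2-55b)

print(f"f' = {f_prime:.2f} mm, S = {S:.1f} mm, S' = {S_prime:.1f} mm")
print(f"M1 = {M1:.4f}, M2 = {M2:.4f}, M1*M2 = {M1 * M2:.4f}")
```

Only the throw and the spacing between the two lens positions enter the calculation; the positions of the lens relative to the object need not be measured.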
$$ \frac{n_m'}{S'} - \frac{n_m}{S} = \frac{n_l - n_m}{R_1} + \frac{n_m' - n_l}{R_2} \tag{2-56a} $$
$$ = K_1 + K_2 \tag{2-56b} $$
$$ = \frac{n_m'}{f'} = -\frac{n_m}{f} \tag{2-56c} $$
$$ = K = \frac{1}{f_e} , \tag{2-56d} $$
$$ M_t \equiv \frac{h'}{h} = \frac{n_m S'}{n_m' S} \tag{2-57a} $$
$$ = -\frac{z'}{f'} = -\frac{f}{z} , \tag{2-57c} $$
$$ M_\beta = \frac{\beta_0'}{\beta_0} = \frac{S}{S'} , \tag{2-58} $$
$$ M_t M_\beta = \frac{n_m}{n_m'} , \tag{2-59} $$
and
$$ M_l \equiv \frac{\Delta S'}{\Delta S} = \frac{n_m}{n_m'} \left( \frac{S'}{S} \right)^{2} = \frac{n_m'}{n_m} M_t^{2} = \frac{M_t}{M_\beta} . \tag{2-61} $$
Figure 2-26. Imaging by a thin lens with different media on its two sides. A ray incident toward its center deviates upon refraction, unlike the case of a thin lens in air.
Comparing these equations with those for a thin lens in air, we note that the power K of
the lens is again equal to the sum of the powers K1 and K 2 of its surfaces, although the
power of each surface is now different from that before. However, as illustrated in Figure
2-26, a ray incident in the direction of the lens center does not pass through it undeviated.
Moreover, the object- and image-space focal lengths have different magnitudes.
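The statements above can be illustrated numerically. The sketch below assumes a hypothetical biconvex glass lens with air on one side and water on the other (all indices, radii, and the object distance are illustrative values) and evaluates Eqs. (2-56) to (2-59).

```python
def thin_lens_in_media(n_m, n_l, n_m_prime, R1, R2):
    """Power and focal lengths of a thin lens between two media, Eqs. (2-56)."""
    K1 = (n_l - n_m) / R1          # power of the first surface
    K2 = (n_m_prime - n_l) / R2    # power of the second surface
    K = K1 + K2                    # power of the lens
    f_prime = n_m_prime / K        # image-space focal length
    f = -n_m / K                   # object-space focal length
    return K, f_prime, f


# Hypothetical example: air | glass | water, radii in mm.
n_m, n_l, n_m_prime = 1.0, 1.5, 1.33
K, f_prime, f = thin_lens_in_media(n_m, n_l, n_m_prime, R1=50.0, R2=-50.0)

S = -300.0                                # object distance (mm)
S_prime = n_m_prime / (K + n_m / S)       # from Eq. (2-56a)
M_t = n_m * S_prime / (n_m_prime * S)     # Eq. (2-57a)
M_beta = S / S_prime                      # Eq. (2-58)

print(f"f' = {f_prime:.2f} mm, f = {f:.2f} mm")
print(f"S' = {S_prime:.2f} mm, M_t = {M_t:.4f}, M_t*M_beta = {M_t * M_beta:.4f}")
```

The two focal lengths differ in magnitude by the ratio of the refractive indices of the two media, and the product of the transverse and angular magnifications equals $n_m / n_m'$.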
From Eqs. (2-56a) and (2-56b), the focal length of a thin lens in air may be written
$$ \frac{1}{f'} = \left( \frac{n_l}{n_a} - 1 \right) \left( \frac{1}{R_1} - \frac{1}{R_2} \right) , \tag{2-63} $$
where $n_a$ is the refractive index of air. The refractive index of air is approximately equal to 1.0003. It is a universal practice in optical design to specify the refractive index of a lens material relative to that of the air (instead of vacuum). Letting $n_l / n_a = n$, where $n$ is the specified index of the lens material, Eq. (2-63) reduces to Eq. (2-31a), showing that the index of air may be assumed to be unity when the index of the lens is specified with respect to it.
As illustrated in Figure 2-27, consider two thin lenses $L_1$ and $L_2$ of focal lengths $f_1'$ and $f_2'$ in contact. To determine the focal length of the doublet, we consider an object lying at infinity. The first lens forms its image at $F_1'$ at a distance $f_1'$. This image is the object for the second lens, which forms its image at a distance $f'$ that is the focal length of the doublet, according to
$$ \frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} . \tag{2-64} $$
Thus, the inverse of the focal length of the doublet is equal to the sum of the inverses of the focal lengths of its two lenses. Because the power of a lens is equal to the inverse of its focal length, the power of the doublet is equal to the sum of the powers of its two lenses. The above reasoning can be extended to three or more thin lenses, yielding the result that the power of a system consisting of any number of thin lenses in contact is equal to the sum of the powers of the individual lenses. Of course, the approximation involved in neglecting the thickness of the lenses becomes increasingly coarse as the number of lenses increases.
Figure 2-27. Doublet consisting of two thin lenses in contact. The focal length of the doublet is $f'$.
lens is thick means that the distance of this image from the vertex of the second surface
becomes the object distance for this surface. (The vertices of the two surfaces were
assumed to be coincident when the thickness of the lens was neglected in Section 2.3.)
Similarly, the image formed by a doublet consisting of two thin lenses separated by some
distance can be obtained by sequentially applying the imaging equation for a thin lens.
The size of the image formed in each case can also be obtained by a sequential
application of the magnification equation. The focal point of a thick lens or a doublet is
simply the image point corresponding to an axial point object lying at infinity, i.e., it is
the point where the rays incident parallel to the optical axis are focused after refraction by
the system. Of course, the image formed by a general system consisting of many imaging
elements can be obtained in a similar manner by a sequential application of the imaging
equation for a refracting surface and/or a thin lens, as the case may be. It should be
evident, though, that the imaging equation will become increasingly complex as the
number of imaging elements of a system increases. Moreover, although the focal point of
a general system can also be determined in the same manner as for a thick lens or a
doublet, it is not clear how its focal length should be defined and determined. It is not, for
example, equal to the distance of the focal point from the vertex of the last surface of the
system (which is called the image-space focal distance).
We show that by defining suitable reference points, called the cardinal points, the
imaging equation for any imaging system can be reduced to one similar to that for the
refracting surface. There are six cardinal points, of which only three are independent.
Once they are known, the system can be replaced by them, regardless of its complexity.
The object and image distances are measured from the respective principal points, which
correspond to planes of unity transverse magnification. Similarly, the focal lengths
represent the distances of the focal points from the respective principal points. The two
nodal points correspond to unity angular magnification. The thin lens is a special case in
which the principal (and nodal) points coincide with its center. The principal points of a refracting surface coincide with its vertex, and its nodal points coincide with its center of curvature.
A ray is shown incident parallel to the optical axis of the system in Figure 2-28a. It exits from the system intersecting the axis at $F'$. Only the first and the last surfaces of the system are shown schematically in the figure. Similarly, only the incident and the exit segments of the ray are shown, i.e., its intermediate segments are not shown. The image-space focal point $F'$ of a system is defined as the point through which rays incident parallel to its optical axis from the left pass after being refracted by it. The rays converging toward $F'$ when extended backward intersect the incident parallel rays in a plane called the image-space principal plane. This plane intersects the optical axis at a point $H'$ called the image-space principal point. The rays behave as if all of their deviation takes place at the principal plane. The distance $H'F'$ of the focal point $F'$ from the principal point $H'$ is called the image-space focal length $f'$.
The object-space focal point $F$, shown in Figure 2-28b, is defined as the axial point such that the rays originating from it and incident on the system emerge from it parallel to the optical axis after being refracted by it. The rays originating from $F$ when extended forward intersect the emergent parallel rays in a plane called the object-space principal plane. This plane intersects the optical axis at a point $H$ called the object-space principal point. The rays behave as if all of their deviation takes place at the principal plane. The distance $HF$ of the focal point $F$ from the principal point $H$ is called the object-space focal length $f$.
Figure 2-28. Principal and focal points of an imaging system. The system consisting of many surfaces is shown schematically by its first and last surfaces only. Similarly, only the incident and exit segments of the ray are shown. (a) The image-space focal point $F'$ and principal point $H'$, illustrating the focal length $f'$. (b) The object-space focal point $F$ and principal point $H$, illustrating the focal length $f$.
By definition, the principal planes are planes of unity transverse magnification. This may be seen from Figure 2-29, where a system with focal points $F$ and $F'$ is considered. A ray 1 incident in the direction $AQ$ parallel to the optical axis emerges from the system passing through $F'$, and the extensions of the incident and emergent rays intersect at a point $Q'$ on the image-space principal plane $H'Q'$. A second ray 2 incident on the system passing through $F$ emerges from it in the direction $Q'A'$ parallel to the optical axis, and the extensions of the incident and emergent rays intersect at a point $Q$ on the object-space principal plane $HQ$. Thus, the two rays initially directed toward $Q$ emerge in directions that intersect at $Q'$. Hence, $Q'$ is an image of $Q$, and vice versa, i.e., $Q$ and $Q'$ are conjugate points. Similarly, the principal planes $HQ$ and $H'Q'$ are conjugate planes. Because $HQ = H'Q'$, they are conjugate planes of unity (positive) transverse magnification.
It should be understood that all of the rays incident parallel to the optical axis pass through $F'$ after emerging from the system only in the Gaussian approximation. In reality, they generally intersect the focal plane at various points in the vicinity of $F'$. Similarly, the incident parallel rays and the corresponding exit rays do not generally intersect at points lying in a plane. The image-space principal plane $H'Q'$ is the Gaussian approximation of a nonplanar surface.
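[Figure 2-29: rays 1 and 2, points A, Q, Q′, A′ on the principal planes, focal points F and F′, principal points H and H′, and focal lengths f (numerically negative) and f′.]
$$ M_t M_\beta = \frac{n}{n'} , \tag{2-65} $$
$$ M_\beta = \frac{\beta'}{\beta} = \frac{n}{n'} . \tag{2-66} $$
$$ n'\beta' = n\beta . \tag{2-67} $$
[Figure 2-30: object P0P of height h and image P′0P′ of height h′; rays 1 to 4; points F, F′, H, H′, Q, Q′, R, R′, F1, F′1; ray angles β and β′; distances z, z′, f, f′, S, S′.]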
Now consider a ray 4 such that it and ray 3 leave the object-space focal plane from the same point $F_1$. The image of $F_1$ is formed at infinity, i.e., the emergent rays $F_1'F'$ and $H'P'$ are parallel to each other, and ray 4 passes through $F'$ after being refracted by the system. From the triangle $FHF_1$, we note that
$$ \beta = FF_1 / f . \tag{2-68a} $$
Similarly, for the emergent rays,
$$ \beta' = -FF_1 / f' , \tag{2-68b} $$
Substituting Eqs. (2-68) into Eq. (2-67), we obtain
$$ \frac{n'}{f'} = -\frac{n}{f} . \tag{2-69} $$
Noting that $h = \beta S$ and $h' = \beta' S'$, and using Eq. (2-67), the transverse magnification of the image $P_0'P'$ of the object $P_0P$ is given by
$$ M_t \equiv \frac{h'}{h} = \frac{nS'}{n'S} . \tag{2-70} $$
where the negative sign is due to $h'$ being numerically negative. Note that $f$ and $S$ are both numerically negative. Comparing Eqs. (2-70) and (2-71), we obtain the Gaussian imaging equation
$$ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} . \tag{2-72} $$
$$ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} = K = \frac{1}{f_e} . \tag{2-73} $$
[Figure 2-31: axial ray bundle from P0 to P′0 with slope angles β0 and β′0, meeting the principal planes at Q and Q′; object and image heights h and h′; distances S and S′.]
The angular magnification of a ray bundle diverging from the axial point object $P_0$ and converging to its image point $P_0'$ after refraction by the system, as illustrated in Figure 2-31, is given by
$$ M_\beta = \beta_0' / \beta_0 = S / S' , \tag{2-74} $$
where we have used the fact that $HQ = H'Q'$ by virtue of unity magnification of the principal planes. From Eqs. (2-70) and (2-74), the product of the transverse and angular magnifications is given by Eq. (2-65), as expected. From the definitions of the magnifications, Eq. (2-65) may also be written
$$ n'h'\beta_0' = nh\beta_0 , \tag{2-75} $$
thus demonstrating the Lagrange invariance for the entire system. It is closely related to the conservation of energy in the imaging process, as illustrated by Problem 5.7. From Eq. (2-75), the transverse magnification of the image can also be written
$$ M_t = \frac{n\beta_0}{n'\beta_0'} , \tag{2-76} $$
i.e., it can be obtained from the slope angles of an axial incident ray and the corresponding refracted ray in the image space of the system.
Differentiating Eq. (2-72), we find that the longitudinal magnification of the image is given by
$$ M_l \equiv \frac{\Delta S'}{\Delta S} = \frac{n}{n'} \left( \frac{S'}{S} \right)^{2} = \frac{n'}{n} M_t^{2} = \frac{M_t}{M_\beta} . \tag{2-77} $$
The comments made following Eq. (2-18) also apply to Eq. (2-77). Thus, for example, if the object is displaced longitudinally, the image is also displaced in the same direction.
It may be noted that only three parameters are needed to determine the location and size of the image of an object: the locations of the two principal points, and the focal length of the system.
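A short numerical sketch may help tie the Gaussian imaging equation to the three magnifications. The indices, focal length, and object distance below are hypothetical values, and the function name `gaussian_image` is illustrative; distances are measured from the respective principal points.

```python
def gaussian_image(S, n, n_prime, f_prime):
    """Image distance and magnifications for a general system.

    Uses Eq. (2-72) for S', Eq. (2-70) for M_t, Eq. (2-74) for M_beta,
    and Eq. (2-77) for M_l.
    """
    S_prime = n_prime / (n_prime / f_prime + n / S)
    M_t = n * S_prime / (n_prime * S)
    M_beta = S / S_prime
    M_l = (n_prime / n) * M_t**2
    return S_prime, M_t, M_beta, M_l


# Hypothetical system: object space n = 1, image space n' = 1.336, f' = 22 mm.
n, n_prime, f_prime = 1.0, 1.336, 22.0
S = -250.0
S_prime, M_t, M_beta, M_l = gaussian_image(S, n, n_prime, f_prime)

# Finite-difference estimate of the longitudinal magnification dS'/dS.
dS = 1e-3
S_prime_shifted = gaussian_image(S + dS, n, n_prime, f_prime)[0]
M_l_numeric = (S_prime_shifted - S_prime) / dS

print(f"S' = {S_prime:.3f} mm, M_t = {M_t:.5f}, M_beta = {M_beta:.4f}")
print(f"M_l = {M_l:.6f} (finite difference: {M_l_numeric:.6f})")
```

The finite-difference value agrees with Eq. (2-77) and is positive, consistent with the image moving in the same direction as the object.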
$$ NN' = AA' = HH' , \tag{2-78} $$
i.e., the distance between the nodal points is equal to the distance between the principal points. If we consider a second ray $FB$ parallel to the first but passing through $F$, it emerges parallel to the optical axis in the direction $BB'$. From the congruent triangles $HFB$ and $F'N'B'$, we find that
$$ F'N' = HF = f , \tag{2-79} $$
i.e., the distance of the image-space nodal point $N'$ from the corresponding focal point $F'$ is equal to the object-space focal length $f$. Also from the congruent triangles $HFB$ and $F'N'B'$,
$$ F'H' + H'N' = HN + NF . \tag{2-80} $$
Because $NN' = HH'$ according to Eq. (2-78),
$$ H'N' = HN . \tag{2-81} $$
Therefore,
$$ F'H' = NF , $$
or
$$ FN = H'F' = f' , \tag{2-82} $$
i.e., the distance of the object-space nodal point $N$ from the corresponding focal point $F$ is equal to the image-space focal length $f'$.
[Figure 2-32: nodal points N and N′, principal points H and H′, focal points F and F′, parallel rays through A, A′ and B, B′, conjugate points N1 and N′1 in the nodal planes, and focal lengths f (numerically negative) and f′.]
Letting $M_\beta = 1$ in Eq. (2-65), we note that the nodal planes are conjugate planes with a transverse magnification of $n/n'$. This may also be seen directly from Eq. (2-70) by considering the nodal points as Gaussian conjugates with $S \equiv HN$ and $S' \equiv H'N'$. It should be noted that only the nodal points $N$ and $N'$ have the property of unity ray angle magnification; the other conjugate points in the nodal planes do not have this property. For example, in Figure 2-32, $N_1$ and $N_1'$ are conjugate points in the nodal planes, as may be seen by considering a ray from $N_1$ incident parallel to the axis. It emerges from the system passing through $F'$. The extension of the emergent ray intersects the incident ray at $N_1'$. The transverse magnification of the image given by $N'N_1' / NN_1$ is equal to $n/n'$. Thus, a ray incident in the direction of $N_1$ emerges as if it is coming from $N_1'$, but the emergent ray is obviously not parallel to the incident ray. When $n = n'$, then $f = -f'$, and therefore $N$ and $H$ coincide, and $N'$ and $H'$ coincide.
$$ M_t \equiv \frac{h'}{h} = -\frac{f}{z} = -\frac{z'}{f'} . \tag{2-83} $$
Accordingly,
$$ zz' = ff' = -\frac{n}{n'} f'^{2} , \tag{2-84} $$
where the last step follows from Eq. (2-69). This is the Newtonian imaging equation. It is evident from this equation that $z$ and $z'$ must have opposite signs, implying that an object and its image lie on the opposite sides of the corresponding focal points. By differentiating both sides of Eq. (2-84), we obtain Eq. (2-77), relating the longitudinal and transverse magnifications.
the optical axis. This ray determines the height $h'$ of the image point $P'$. The intersection of these two rays in the image space determines the location of $P'$. A ray incident in the direction of the object-space nodal point $N$ emerges from the system in a parallel direction passing through the image-space nodal point $N'$, as illustrated in Figure 2-32. This third ray is not shown in Figure 2-30, but it provides a check on the graphical construction. As stated earlier, if the refractive indices of the object and image spaces are equal, then a principal point coincides with its corresponding nodal point.
It should be understood that, in Gaussian imaging, all of the object rays transmitted by the system pass through the Gaussian image point. In reality, of course, this does not generally happen. The rays intersect the image plane in the vicinity of the Gaussian image point. The deviation of a ray from the Gaussian image point is called its ray aberration (discussed in Chapter 8). The distribution of the rays in the image plane is called the spot diagram (discussed in Chapter 9). The Gaussian approximation helps determine the location of the image point, but the quality of the image depends on its aberrations.
Consider two object planes with axial points $P_0$ and $Q_0$ separated by a distance $L$, as illustrated in Figure 2-33. Let the corresponding image planes with axial points $P_0'$ and $Q_0'$ be separated by a distance $L'$. Let $z$ be the distance of $Q_0$ from the object-space focal point $F$, and let $z'$ be the distance of $Q_0'$ from the image-space focal point $F'$. The Newtonian imaging equation for determining the location of the $P_0'$- and $Q_0'$-image planes yields
$$ zz' = ff' \tag{2-85} $$
and
$$ (z + L)(z' + L') = ff' . \tag{2-86} $$
Figure 2-33. Object and image distances referred to conjugate planes other than the principal planes. $F$ and $F'$ are the object- and image-space focal points. The object and image distances of the conjugates $P_0$ and $P_0'$ are referred to the conjugates $Q_0$ and $Q_0'$, respectively.
Subtracting Eq. (2-85) from Eq. (2-86), we obtain
$$ zL' + z'L + LL' = 0 . \tag{2-87} $$
From Eq. (2-83), the transverse magnification of the image in the $Q_0'$ plane is
$$ M_Q = -\frac{f}{z} = -\frac{z'}{f'} . \tag{2-88} $$
Substituting for $z$ and $z'$ from Eq. (2-88) into Eq. (2-87) and dividing by $LL'$, we obtain
$$ \frac{f}{L M_Q} + \frac{f' M_Q}{L'} = 1 . \tag{2-89} $$
Using Eq. (2-69) to eliminate $f$, this becomes
$$ \frac{n' M_Q}{L'} - \frac{n}{L M_Q} = \frac{n'}{f'} . \tag{2-90} $$
Equation (2-90) is a generalized imaging equation wherein $L$ and $L'$ are the object and image distances referred to the corresponding conjugate planes with a transverse magnification of $M_Q$. The magnification of the image in the $P_0'$ plane is given by
$$ M_P = -\frac{z' + L'}{f'} = \frac{nL'}{n'L} \frac{1}{M_Q} , \tag{2-91} $$
where we have substituted for $z'/f'$ from Eq. (2-88) and for $L'/f'$ from Eq. (2-90).
As expected, if we let $M_Q = 1$ for the principal planes, Eqs. (2-90) and (2-91) reduce to Eqs. (2-73) and (2-70), respectively. Letting $M_Q = n/n'$ for the nodal planes, they reduce to
$$ \frac{n}{L'} - \frac{n'}{L} = \frac{n'}{f'} \tag{2-92} $$
and
$$ M_P = \frac{L'}{L} , \tag{2-93} $$
respectively. The entrance and exit pupils, discussed in Chapter 5, may also be used as reference planes by letting $M_Q$ equal the pupil magnification. Therefore, it is not essential that the object and image distances be referred to the principal planes. However, it is convenient to use them because of their unity magnification and the resulting simplicity of the associated graphical construction of imaging. Of course, because air is the medium of the object and image spaces in most applications, the nodal points are the same as the corresponding principal points.
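The generalized imaging equation can be checked against the Newtonian equation for an arbitrary pair of reference conjugates. The sketch below uses hypothetical values for the indices, focal length, and conjugate positions; the function name is illustrative.

```python
def image_distance_general(L, M_Q, n, n_prime, f_prime):
    """Image distance L' from reference conjugates of magnification M_Q, Eq. (2-90)."""
    return n_prime * M_Q / (n_prime / f_prime + n / (L * M_Q))


# Hypothetical system and reference conjugates (distances in mm).
n, n_prime, f_prime = 1.0, 1.5, 150.0
f = -n * f_prime / n_prime            # object-space focal length, Eq. (2-69)

z_Q = -50.0                           # Q0 measured from F
z_Q_prime = f * f_prime / z_Q         # Q0' measured from F', Eq. (2-85)
M_Q = -f / z_Q                        # Eq. (2-88)

L = -150.0                            # P0 measured from Q0
z_P_prime = f * f_prime / (z_Q + L)   # P0' measured from F', Eq. (2-86)
L_newton = z_P_prime - z_Q_prime      # P0' measured from Q0'

L_general = image_distance_general(L, M_Q, n, n_prime, f_prime)
M_P_newton = -f / (z_Q + L)                         # Eq. (2-83) applied to P0
M_P_general = n * L_general / (n_prime * L * M_Q)   # Eq. (2-91)

print(f"L' = {L_general:.3f} mm (Newtonian: {L_newton:.3f} mm)")
print(f"M_P = {M_P_general:.4f} (Newtonian: {M_P_newton:.4f})")
```

Both routes give the same image location and magnification, which illustrates that the choice of reference planes is a matter of convenience.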
It should be evident that the principal and nodal points of a thin lens in air (or any other medium) coincide at its center. A ray incident in the direction of the center is refracted without any deviation. When the media on its two sides have different refractive indices, then the principal points still coincide at its center, but the nodal points coincide at a distance $f' + f$ from it. A ray incident in the direction of the center in this case is refracted with deviation according to Snell's law. In each case, a ray incident parallel to the optical axis emerges passing through the image-space focal point $F'$, and a ray incident in the direction of the object-space focal point $F$ emerges parallel to the optical axis. The Newtonian imaging equation for a general system is the same as for a single refracting surface or a thin lens, because the principal points are not utilized in this equation.
Consider, for example, a system consisting of a series of $j$ refracting surfaces. Let the refractive indices of the object and image spaces for the $i$th surface be $n_i$ and $n_i'$, respectively, where $n_i' = n_{i+1}$ [because the image space for the $i$th surface is the object space for the $(i+1)$th surface]. Let the object and image distances for the $i$th surface be $S_i$ and $S_i'$, respectively. If $h_i$ and $h_i'$ are the heights of the object and image for this surface, where $h_i' = h_{i+1}$ [because the image for the $i$th surface is the object for the $(i+1)$th surface], the magnification of the image formed by it is given by
$$ M_i = \frac{h_i'}{h_i} = \frac{n_i S_i'}{n_i' S_i} . \tag{2-94} $$
The magnification of the final image formed by the system is
$$ M = \frac{h_j'}{h_1} = \frac{h_1'}{h_1} \frac{h_2'}{h_2} \cdots \frac{h_j'}{h_j} = M_1 M_2 \cdots M_j = \frac{n_1 S_1'}{n_1' S_1} \frac{n_2 S_2'}{n_2' S_2} \cdots \frac{n_j S_j'}{n_j' S_j} = \frac{n_1}{n_j'} \frac{S_1'}{S_1} \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-95} $$
If $S$ and $S'$ are the object and final image distances from the principal points of the system, the magnification of the image is also given by
$$ M = \frac{n_1 S'}{n_j' S} . \tag{2-96} $$
Comparing Eqs. (2-95) and (2-96), we obtain
$$ \frac{S'}{S} = \frac{S_1'}{S_1} \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-97} $$
For an object lying at infinity, both $S$ and $S_1$ are equal to infinity, and $S'$ and $S_1'$ become the image-space focal lengths $f'$ and $f_1'$ of the system and the first surface, respectively. Therefore, we may write
$$ f' = f_1' \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-98} $$
The image distance $S_j'$ in this case locates the image-space focal point $F'$ of the system from the vertex of the $j$th surface. It is called the image-space focal distance. The focal length $f'$ locates the image-space principal point $H'$, because $F'$ lies at a distance $f'$ from it. The focal length of a thick lens, for example, can be determined in this manner. The effect of its thickness can be determined by comparing it with Eq. (2-28) for the focal length of a thin lens. Once $f'$ is known, the object-space focal length $f$ can be obtained from Eq. (2-69). However, the object-space focal point $F$ and the principal point $H$ have to be determined separately by considering an object at infinity in the image space and determining its image in the object space.
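The surface-by-surface procedure can be sketched in a few lines. The example below is an illustration under stated assumptions: it uses the standard paraxial imaging equation for a refracting surface, $n'/S' - n/S = (n' - n)/R$, takes the product of the ratios $S_i'/S_i$ for the surfaces after the first, and applies the result to a hypothetical thick biconvex lens in air. The function name and all numerical values are illustrative.

```python
def system_focal_length(indices, radii, thicknesses):
    """Image-space focal length f' and focal distance S_j' of a series of surfaces.

    indices     : [n_1, n_1' (= n_2), ..., n_j'], one more entry than radii
    radii       : radii of curvature R_1 ... R_j
    thicknesses : axial spacings between consecutive surfaces (length j - 1)
    An axial object at infinity is imaged sequentially by each surface.
    """
    n, n_p = indices[0], indices[1]
    S_prime = n_p * radii[0] / (n_p - n)   # S_1' = f_1' for S_1 at infinity
    f_sys = S_prime
    for i in range(1, len(radii)):
        n, n_p = indices[i], indices[i + 1]
        S = S_prime - thicknesses[i - 1]   # object distance for surface i + 1
        S_prime = n_p / ((n_p - n) / radii[i] + n / S)
        f_sys *= S_prime / S               # accumulate the ratios S_i'/S_i
    return f_sys, S_prime


# Hypothetical thick biconvex lens in air: n = 1.5, R1 = 50, R2 = -50, t = 10 (mm).
n_glass, R1, R2, t = 1.5, 50.0, -50.0, 10.0
f_prime, focal_distance = system_focal_length([1.0, n_glass, 1.0], [R1, R2], [t])

# Cross-check against the familiar thick-lens formula for a lens in air.
inv_f = (n_glass - 1.0) * (1.0 / R1 - 1.0 / R2
                           + (n_glass - 1.0) * t / (n_glass * R1 * R2))

print(f"f' = {f_prime:.4f} mm (thick-lens formula: {1.0 / inv_f:.4f} mm)")
print(f"image-space focal distance S_j' = {focal_distance:.4f} mm")
print(f"H' lies {focal_distance - f_prime:.4f} mm from the last vertex")
```

The focal distance is shorter than the focal length here, so the image-space principal point lies inside the lens, to the left of the second vertex.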
$$ M_i = \frac{h_i'}{h_i} = \frac{S_i'}{S_i} . \tag{2-99} $$
$$ M = \frac{h_j'}{h_1} = \frac{h_1'}{h_1} \frac{h_2'}{h_2} \cdots \frac{h_j'}{h_j} = M_1 M_2 \cdots M_j = \frac{S_1'}{S_1} \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-100} $$
If $S$ and $S'$ are the object and final image distances from the corresponding principal points of the system, the magnification of the image is also given by
$$ M = \frac{S'}{S} . \tag{2-101} $$
Comparing Eqs. (2-100) and (2-101), we obtain
$$ \frac{S'}{S} = \frac{S_1'}{S_1} \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-102} $$
For an object at infinity, both $S$ and $S_1$ are equal to infinity, and $S'$ and $S_1'$ become the image-space focal lengths $f'$ and $f_1'$ of the system and the first lens, respectively. Therefore, we may write
$$ f' = f_1' \frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-103} $$
The image distance $S_j'$ is the image-space focal distance and locates the image-space focal point $F'$ of the system from the center of the $j$th lens. The focal length $f'$ locates the image-space principal point $H'$, because $F'$ lies at a distance $f'$ from it. Once $f'$ is known, the object-space focal length can be obtained from Eq. (2-69). However, the object-space focal point $F$ and the principal point $H$ have to be determined separately by considering an object at infinity in the image space and determining its image in the object space.
$$ S_1 = \infty , $$
$$ S_1' = f_1' , $$
$$ S_2 = f_1' - t , $$
$$ \frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} . $$
Thus,
$$ S_2' = \frac{f_2'(f_1' - t)}{f_1' + f_2' - t} . \tag{2-104} $$
Figure 2-34. Image-space focal point $F'$ of two thin lenses separated by a distance $t$.
$$ f' = f_1' (S_2' / S_2) = \frac{f_1' f_2'}{f_1' + f_2' - t} . \tag{2-105a} $$
The image distance $S_2'$ represents the image-space focal distance and locates the focal point $F'$. The principal point $H'$ is located by using the fact that $H'F' = f'$. For comparison with Eq. (2-64) for the case when the two lenses are in contact, we write Eq. (2-105a) in the form
$$ \frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{2-105b} $$
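The sequential procedure for thin lenses in air can be compared with the closed-form result for two separated lenses. The sketch below is a minimal illustration with hypothetical focal lengths and spacing; the function name is an assumption, and the degenerate case of an intermediate image falling exactly on a lens ($S_i = 0$) is not handled.

```python
def trace_thin_lenses(focal_lengths, spacings):
    """Focal length f' and focal distance for thin lenses in air.

    An axial object at infinity is imaged by each lens in turn using
    1/S' - 1/S = 1/f', and the ratios S_i'/S_i are accumulated as in Eq. (2-103).
    """
    S_prime = focal_lengths[0]           # S_1' = f_1' for an object at infinity
    f_sys = focal_lengths[0]
    for f_i, t in zip(focal_lengths[1:], spacings):
        S = S_prime - t                  # object distance for the next lens
        S_prime = 1.0 / (1.0 / f_i + 1.0 / S)
        f_sys *= S_prime / S
    return f_sys, S_prime


f1, f2, t = 100.0, 50.0, 30.0            # hypothetical values (mm)
f_prime, S2_prime = trace_thin_lenses([f1, f2], [t])

f_closed = f1 * f2 / (f1 + f2 - t)             # Eq. (2-105a)
S2_closed = f2 * (f1 - t) / (f1 + f2 - t)      # Eq. (2-104)

print(f"f'  = {f_prime:.4f} mm (Eq. 2-105a: {f_closed:.4f} mm)")
print(f"S2' = {S2_prime:.4f} mm (Eq. 2-104:  {S2_closed:.4f} mm)")
print(f"H' lies {S2_prime - f_prime:.4f} mm from the second lens")
```

Setting the spacing to zero in this sketch recovers Eq. (2-64) for lenses in contact.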
The nodal points of a lens system can also be determined in a laboratory by placing it on a nodal slide, which is a device that permits rotation of the lens about an axis that passes through a point on the optical axis and is perpendicular to it. The axis of rotation is changed by sliding the lens on the slide. When a collimated beam is incident on the lens parallel to its axis, it is focused at its focal point $F'$, as illustrated in Figure 2-35a. This figure is drawn for the general case when the refractive indices of the object and image space are not equal. The beam focus $P_0'$ coincides with the focal point $F'$. When the lens is rotated about a point on its axis, the focus of the beam is displaced, except when the rotation is about the nodal point $N'$. In Figure 2-35b, the lens has been rotated about a point $Q$ lying between $N$ and $N'$, resulting in a displacement of the beam focus to a point $P_0''$. A ray incident in the direction of $N$ emerges from the system parallel to the incident ray as if coming from $N'$. In Figure 2-35c, where the rotation is about $N'$, the beam focus stays at $P_0'$. In this case, a ray passing through the nodal point $N$ is displaced, but it passes through the nodal point $N'$ in the same direction as the incident ray. By turning the lens around (so that its front and back are interchanged) and repeating the process, the other nodal point $N$ can be determined.
If the refractive indices $n$ and $n'$ of the object and image spaces are equal, as would be the case in a laboratory measurement, then the principal points coincide with the corresponding nodal points. When the principal points of a system are located and its focal lengths are determined, so that the focal points are located, all of the Gaussian characteristics of the image of an object can be determined.
Figure 2-35. Determination of nodal points of a lens system. (a) A parallel beam is focused at $P_0'$ coincident with the focal point $F'$. (b) When the lens is rotated about a point $Q$, the beam focus is displaced to $P_0''$. (c) When the system is rotated about the nodal point $N'$, the beam focus stays at $P_0'$, although the focal point $F'$ has been displaced.
We start with a discussion of the Lagrange invariant for an object or its image at
infinity, and show that an afocal system images objects with transverse and longitudinal
magnifications that are independent of the object distance. Examples of afocal systems
(such as a beam reducer or expander), telephoto and wide-angle camera lenses, and a
telescope, are considered in Chapter 6. A plane-parallel plate, discussed in the next
section, is an example of an afocal system with unity transverse magnification.
$$ n'h'\beta_0' = -n x_0 \beta , \tag{2-106} $$
where $h'$ is the image height, and $\beta_0'$ is the slope angle of the axial ray in the image space. This result is rederived in Section 4.10 from a two-ray Lagrange invariant.
Similarly, if the image is formed at infinity, as when an object lies in the front focal plane (see Figure 2-36c), the image-space Lagrange invariant becomes indeterminate because $h' \to \infty$ and $\beta_0' \to 0$. Considering the image $P_0'P'$, we note that $x_0' = -S'\beta_0'$ and $h' = S'\beta'$. Eliminating $S'$, we obtain $h'\beta_0' = -x_0'\beta'$. As the image moves to infinity, the product $h'\beta_0'$ remains finite and equal to $-x_0'\beta'$. The corresponding Lagrange invariant is
$$ nh\beta_0 = -n'x_0'\beta' . \tag{2-107} $$
[Figure 2-36: (a) finite conjugates, (b) object at infinity with the image in the focal plane at a distance f′, (c) object in the front focal plane at a distance f with the image at infinity; each panel shows the principal points H and H′, the ray heights x0 and x′0, and the slope angles.]
$$ n'x_0'\beta' = n x_0 \beta , \tag{2-108} $$
where $n$ and $n'$ are the refractive indices of the object and image spaces, $x_0$ and $x_0'$ are the heights of the axial conjugate rays (i.e., rays parallel to the axis), and $\beta$ and $\beta'$ are the slope angles of conjugate rays from an off-axis point object. Thus, the ray angular magnification is given by
$$ \frac{\beta'}{\beta} = \frac{n x_0}{n' x_0'} . \tag{2-109} $$
Now we consider how afocal systems form images of objects located at finite distances. Consider, for example, the imaging of an object $P_0P$ of height $h$ by an afocal system, as illustrated in Figure 2-37b. Its image $P_0'P'$ has a height of $h'$ that can be obtained by determining the image formed successively by each surface of the system.
$$ S' = \frac{h'}{\beta_0'} = \frac{n'h'^{2}}{nh\beta_0} . \tag{2-110} $$
$$ \frac{S'}{S} = \frac{n'h'^{2}}{nh^{2}} = \frac{n'}{n} M_t^{2} . \tag{2-111} $$
Figure 2-37. (a) Lagrange invariant of an afocal system for infinite conjugates. (b) Finite conjugate imaging by an afocal system. Conceptually, the system is assumed to be multisurface; therefore, a dotted line in the figure does not represent a ray but merely a line joining its point of incidence on and its point of emergence from the system to establish a continuation of the ray.
2.6.1 Introduction
A plane-parallel plate, as its name implies, is a plate with two surfaces that are plane
and parallel to each other. It is a thick lens whose two surfaces have infinite radii of
curvature. Unlike a lens, a plane-parallel plate is not used for imaging per se, but it is
often used in imaging systems as a beam splitter or a window. The imaging equations for
such a plate cannot be obtained from those for a thin lens by letting the radii of curvature
of its two surfaces approach infinity because the thickness of the lens is neglected by its
definition. However, as discussed below, they can be obtained by applying the imaging
equations (2-4) and (2-12) for a spherical surface to its two surfaces and combining the
results thus obtained. We show that the distance between an object and its image formed
by the plate is independent of the object position. Thus, as illustrated in Figure 2-38a, a
plane-parallel plate placed in the path of a converging beam displaces the focus of the
beam from P1 to P2 by an amount that depends only on the thickness and the refractive
index of the plate.
Figure 2-38. (a) Plane-parallel plate placed in the path of a converging beam of light. Rays incident on the plate converging toward $P_1$ converge toward $P_2$ after being refracted by it. (b) A right-angle reflecting prism placed in the path of a converging beam. The optical path lengths of the rays for the prism are equivalent to those for a plane-parallel plate, where the virtual portion $ADC$ of the equivalent plate is obtained by a reflection of its real portion $ABC$ by the reflecting surface $AC$.
[Figure: imaging of a point object P by a plane-parallel plate of thickness t, showing the optical axis OA, the images P′ and P′′, the height h, and the numerically negative distances S1, S′1, S2, and S′2.]
$$ S_1' = nS_1 \equiv nS \tag{2-112a} $$
and
$$ S_2' = S_2 / n = (S_1' - t) / n = S - \frac{t}{n} . \tag{2-113a} $$
Noting that $S_2'$ is numerically negative, the displacement $PP''$ of the final image from the object may be written
$$ PP'' = -S_1 - (-S_2' - t) = t(1 - 1/n) . \tag{2-114} $$
Thus, the image displacement $PP''$ is independent of the object distance $S$; it depends only on the thickness $t$ and refractive index $n$ of the plate. Accordingly, the longitudinal magnification of the image is unity. This may also be seen from Eq. (2-77) by noting that the transverse magnification of the image is unity and the refractive indices of the object and image spaces are equal to each other, as they are both equal to unity. It should be evident that a plane-parallel plate is an afocal system with unity transverse and longitudinal magnifications.
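The independence of the image displacement from the object distance is easy to confirm numerically. The sketch below applies Eqs. (2-112a) and (2-113a) in sequence for a hypothetical plate in air; the thickness, index, object distances, and function name are illustrative.

```python
def plate_image(S, t, n):
    """Final image distance and displacement PP'' for a plane-parallel plate in air.

    S is the (numerically negative) object distance from the first surface.
    """
    S1_prime = n * S                       # image by the first surface, Eq. (2-112a)
    S2 = S1_prime - t                      # object distance for the second surface
    S2_prime = S2 / n                      # image by the second surface, Eq. (2-113a)
    displacement = -S - (-S2_prime - t)    # Eq. (2-114)
    return S2_prime, displacement


t, n = 10.0, 1.5                           # hypothetical plate: 10 mm thick, n = 1.5
object_distances = [-50.0, -200.0, -1000.0]
displacements = [plate_image(S, t, n)[1] for S in object_distances]

for S, dz in zip(object_distances, displacements):
    print(f"S = {S:8.1f} mm -> PP'' = {dz:.4f} mm")
print(f"t(1 - 1/n) = {t * (1.0 - 1.0 / n):.4f} mm")
```

Every object distance yields the same displacement, about one third of the plate thickness for an index of 1.5.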
Figure 2-40. Petzval image by a spherical refracting surface with its center of curvature at $C$. $P_0'P'$ is the Gaussian image of a planar object $P_0P$. The corresponding Petzval image is $P_0'P_p'$. $P_0P_1$ is a spherical object concentric with the refracting surface. Its Petzval image is the concentric surface $P_0'P_1'$. $P_0'P_2'$ is the spherical Petzval image of a spherical object $P_0P_2$, whose center of curvature lies on the optical axis at $C_2$. Note that $VP_1 = S$ and $VP_1' = S'$.
corresponding change $\Delta S'$ in the image distance is given by Eq. (2-18). In Figure 2-40, $P_1'P_2'$ gives the increase in image distance $VP_1' = S'$ corresponding to an increase of $P_1P_2$ in the (numerically negative) object distance $VP_1 = S$ of conjugates $P_1$ and $P_1'$. Now, $P_1P_2$ is approximately equal to the difference in the sags of points $P_2$ and $P_1$. As the heights of $P_1$ and $P_2$ from the optical axis are approximately equal to $h$, we may write
$$ \Delta S \equiv P_1P_2 = VP_2 - VP_1 \approx \frac{h^{2}}{2} \left( \frac{1}{R_o} - \frac{1}{R - S} \right) . \tag{2-115a} $$
Note that because $P_2$ lies to the right of $P_1$ in the figure, the center of curvature of the object lies between $P_0$ and $C$, and therefore $R_o < R - S$, so that $\Delta S$ is positive. Similarly, $\Delta S'$ is equal to the difference in the sags of points $P_2'$ and $P_1'$, i.e.,
$$ \Delta S' \equiv P_1'P_2' = VP_2' - VP_1' \approx \frac{h'^{2}}{2} \left( \frac{1}{R_i} - \frac{1}{R - S'} \right) , \tag{2-115b} $$
where $R_i$ is the radius of curvature of the image surface $P_0'P_2'$. From Eq. (2-18), we may write
$$ \frac{\Delta S'}{\Delta S} = \frac{n'}{n} \frac{h'^{2}}{h^{2}} . \tag{2-116} $$
Substituting for $\Delta S$ and $\Delta S'$ from Eqs. (2-115) into Eq. (2-116), we obtain (after some manipulations)
$$ \frac{1}{n'R_i} - \frac{1}{nR_o} = \frac{1}{R} \left( \frac{1}{n'} - \frac{1}{n} \right) . \tag{2-117} $$
For a planar object, $R_o \to \infty$, and Eq. (2-117) reduces to
$$ \frac{1}{R_i} = \frac{n - n'}{nR} , \tag{2-118} $$
which is numerically negative in Figure 2-40 for $n' > n$. This image surface, shown in the figure as $P_0'P_p'$, is called the Petzval image surface, and its radius of curvature $R_i$ is called the Petzval radius of curvature. Note that $R_i$ does not depend on the object distance $S$ or the image distance $S'$; it depends only on the radius of curvature of the refracting surface and the refractive indices of the media that this surface separates. The location of the Petzval image $P_p'$ of an off-axis point object $P$ is the point of intersection of the auxiliary axis $PCP'$ and a spherical surface of radius of curvature $R_i$ centered on the optical axis and passing through the axial image point $P_0'$. The corresponding Gaussian image point $P'$ is, of course, the point of intersection of the auxiliary axis with the Gaussian image plane.
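$$ \frac{1}{n_1R_{i1}} - \frac{1}{n_0R_o} = \frac{1}{R_1}\left(\frac{1}{n_1} - \frac{1}{n_0}\right), \tag{2-119} $$
$$ \frac{1}{n_2R_{i2}} - \frac{1}{n_1R_{i1}} = \frac{1}{R_2}\left(\frac{1}{n_2} - \frac{1}{n_1}\right), $$
...
and
$$ \frac{1}{n_kR_{ik}} - \frac{1}{n_{k-1}R_{i,k-1}} = \frac{1}{R_k}\left(\frac{1}{n_k} - \frac{1}{n_{k-1}}\right), \tag{2-120} $$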
respectively. Adding these equations and letting the radius of curvature of the object
surface Ro → ∞, we obtain the radius of curvature of the Petzval image surface produced
by a system of k refracting surfaces according to
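$$ \frac{1}{R_{ik}} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right) \tag{2-121a} $$
$$ \phantom{\frac{1}{R_{ik}}} = -\,n_k \sum_{j=1}^{k} \frac{K_j}{n_j n_{j-1}}, \tag{2-121b} $$
where
$$ K_j = \frac{n_j - n_{j-1}}{R_j} \tag{2-122} $$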
is the refracting power of the jth surface. We note that the Petzval radius is independent
of the object and image distances. Unless the sum on the right-hand side of Eq. (2-121a),
called the Petzval sum, is zero, the Petzval image is spherical with a radius of curvature
Rik .
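The sum in Eq. (2-121a) is easy to evaluate numerically. The sketch below is one possible Python rendering; the function name, the argument layout, and the equiconvex-lens values are assumptions made for illustration. For a thin lens in air, the result is compared with the thin-lens form −1/(n f′) obtained later in this section.

```python
def petzval_curvature(radii, indices):
    """Return 1/R_ik from Eq. (2-121a).

    radii   = [R_1, ..., R_k]        radii of curvature of the k surfaces
    indices = [n_0, n_1, ..., n_k]   refractive indices, n_0 being the object space
    """
    if len(indices) != len(radii) + 1:
        raise ValueError("need one more refractive index than surfaces")
    petzval_sum = sum(
        (1.0 / indices[j] - 1.0 / indices[j - 1]) / radii[j - 1]
        for j in range(1, len(radii) + 1)
    )
    return indices[-1] * petzval_sum

# Hypothetical thin equiconvex lens in air: n = 1.5, R1 = 10, R2 = -10 (cm).
n, R1, R2 = 1.5, 10.0, -10.0
curvature = petzval_curvature([R1, R2], [1.0, n, 1.0])
focal_length = 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))
print(curvature, -1.0 / (n * focal_length))  # the two values should agree
```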
$$ \frac{1}{R_{i1}} = \frac{1 - n}{R_1}. \tag{2-123} $$
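The second refracting surface images the first Petzval surface into a second surface, with
a radius of curvature Ri2 given by
$$ \frac{1}{R_{i2}} - \frac{1}{nR_{i1}} = \frac{1}{R_2}\left(1 - \frac{1}{n}\right), $$
or
$$ \frac{1}{R_{i2}} = \frac{1 - n}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{2-124a} $$
$$ \phantom{\frac{1}{R_{i2}}} = -\frac{1}{n f'}. \tag{2-124b} $$
Thus, the radius of curvature of the Petzval surface of the lens is
$$ R_p = -\,n f'. \tag{2-125} $$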
Note the radius of curvature of the Petzval surface does not depend on the object or
the image distance; it depends only on the refractive index and the focal length of the
lens. Its value is numerically negative for a positive lens; i.e., the Petzval surface is
curved toward the lens with its center of curvature lying to its left, as illustrated in Figure
2-41a. The radius of curvature of the virtual Petzval surface for a negative lens is
numerically positive, as illustrated in Figure 2-41b; it lies to the left of the lens and is
curved toward it.
If a system consists of a series of thin lenses, then the first lens forms the image of a
planar object on a Petzval surface. This image surface becomes the object for the next
lens in the series, and so on. It can be shown that the radius of curvature of the Petzval
surface of a system consisting of a series of m thin lenses of refractive indices nj and
focal lengths fj′, where j = 1, 2, ..., m, is given by (see Problem 2.14)
$$ \frac{1}{R_p} = -\sum_{j=1}^{m} \frac{1}{n_j f_j'}. \tag{2-126} $$
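Equation (2-126) can be checked with a few lines of Python. The sketch below is illustrative only: the function name and the lens data are assumed values, the second lens being chosen so that n1 f1′ = −n2 f2′ and the Petzval sum vanishes.

```python
def petzval_curvature_thin_lenses(lenses):
    """Return 1/R_p for thin lenses given as (n_j, f_j) pairs, Eq. (2-126)."""
    return -sum(1.0 / (n * f) for n, f in lenses)

# Hypothetical data (focal lengths in cm).
single = petzval_curvature_thin_lenses([(1.5, 10.0)])
pair = petzval_curvature_thin_lenses([(1.5, 10.0), (1.6, -9.375)])
print(1.0 / single)  # Petzval radius of the single positive lens, -n f'
print(pair)          # close to zero: the Petzval surface is flat
```

The function returns the curvature rather than the radius so that a flat Petzval surface (zero sum) does not cause a division by zero.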
Figure 2-41. Petzval surface of a thin lens. (a) Real for a positive lens. (b) Virtual for
a negative lens. Cp is the center of curvature of the Petzval surface.
2.8 MISALIGNED SURFACE
In Section 2.3.3, we determined the axial displacement of the image for a small
displacement of a point object, thus yielding an expression for the longitudinal
magnification. Now, we determine the displacement of the image when the imaging
surface is slightly decentered, tilted, or despaced. The imaging surface is either
nonspherical, so that it has a well-defined vertex, or is an element of a series of coaxial
imaging surfaces, so that there is a well-defined optical axis, and thus yields a vertex.
First, we consider an imaging surface that is laterally displaced from its nominal
position, as indicated in Figure 2-42. Such a displacement of the surface is referred to as
its decenter. In the perturbed position, its axis is still parallel to the optical axis of the
unperturbed system. Let the displacement be along the x axis with a value of Δ. In its
unperturbed position, let the heights of its object and image points P and P′ from its
optical axis VC be h and h′, where V is the vertex and C is the center of curvature of the
surface. The two heights are related to each other according to
$$ h' = Mh, \tag{2-127} $$
where M is the transverse magnification of the image.
Figure 2-42. Decentered surface. In the unperturbed state, the center of curvature of
the surface shown by the solid curve lies at C. The point object P and its image P′
are at heights h and h′, respectively, from the optical axis VC. When the surface is
decentered by an amount Δ along the x axis, as indicated by the dashed surface, its
center of curvature moves to Cp and the image is displaced to P″. The new object
and image heights are hp and hp′, respectively, from the new optical axis VpCp.
In the perturbed position, the object and image heights from the new optical axis
VpCp become
$$ h_p = h - \Delta \tag{2-128} $$
and
$$ h_p' = M h_p = h' - M\Delta, \tag{2-129} $$
respectively. Note that h and M are numerically negative in Figure 2-42. The image point
for the decentered surface lies at P″. The image displacement, which is also along the x
axis, is given by
$$ P'P'' = h_p' - (h' - \Delta) = (1 - M)\,\Delta, \tag{2-130a} $$
or
$$ P'P'' = (1 - M)\,\Delta_{cd}, \tag{2-130b} $$
where Δcd = Δ is the displacement of the center of curvature of the surface due to its
decenter. The image displacement is independent of the height h of the object.
Accordingly, the displacement P0′P0″ of the axial image point is also given by
Eq. (2-130b).
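The steps leading to Eq. (2-130a) can be traced numerically. The following Python sketch re-measures the heights from the displaced axis and confirms that the shift does not depend on the object height; the function name and the numerical values of M and Δ are assumptions for illustration.

```python
def decenter_shift(h, M, delta):
    """Image displacement P'P'' for a surface decentered by delta, Eqs. (2-127)-(2-130a)."""
    h_prime = M * h                        # Eq. (2-127): unperturbed image height
    h_p = h - delta                        # Eq. (2-128): object height from the new axis
    h_p_prime = M * h_p                    # Eq. (2-129): image height from the new axis
    return h_p_prime - (h_prime - delta)   # Eq. (2-130a)

# Hypothetical values: M = -0.5, decenter of 0.1 (same length unit as h).
M, delta = -0.5, 0.1
shifts = [decenter_shift(h, M, delta) for h in (-2.0, 0.0, 3.0)]
print(shifts)  # every entry should equal (1 - M) * delta
```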
$$ h_p = h - \beta S \tag{2-131} $$
and
$$ h_p' = M h_p = h' - M\beta S. \tag{2-132} $$
Figure 2-43. Tilted surface. When the surface is tilted by an angle β, indicated by
the dashed surface, its center of curvature C moves to Cp. The heights of the object
P and image P′ change from h to hp and from h′ to hp′, respectively. The image for
the tilted surface is located at P″.
The image displacement, which is along the x axis, as in the case of a decentered
surface, is given by
$$ P'P'' = h_p' - (h' - \beta S'), \tag{2-133a} $$
or
$$ P'P'' = \beta\,(S' - MS). \tag{2-133b} $$
Substituting for S in terms of S′ from Eq. (2-10) for the image magnification and S′ in
terms of R from Eq. (2-4) for imaging, we find that
$$ P'P'' = (1 - M)\,R\beta, \tag{2-134a} $$
or
$$ P'P'' = (1 - M)\,\Delta_{ct}, \tag{2-134b} $$
where Δct = Rβ is the displacement of the center of curvature of the surface due to its
tilt.
It should be evident from Eqs. (2-130b) and (2-134b) that the image is not displaced
unless the center of curvature of the surface is displaced. Thus, for example, if the
displacement of the center of curvature due to a decenter of the surface is canceled by its
tilt, the image does not move.
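The equivalence of Eqs. (2-133b) and (2-134a), and the cancellation of a tilt by a decenter, can be verified numerically. The Python sketch below assumes the single-surface imaging equation and magnification in the forms given in the chapter summary, n′/S′ − n/S = (n′ − n)/R and M = nS′/(n′S); the function names and the numerical values are illustrative assumptions.

```python
def image_distance(n, n_prime, R, S):
    """Solve n'/S' - n/S = (n' - n)/R for the image distance S'."""
    return n_prime / ((n_prime - n) / R + n / S)

def tilt_shift(n, n_prime, R, S, beta):
    """Return the image shift from Eq. (2-133b) and from Eq. (2-134a)."""
    S_prime = image_distance(n, n_prime, R, S)
    M = n * S_prime / (n_prime * S)              # transverse magnification
    return beta * (S_prime - M * S), (1.0 - M) * R * beta

# Hypothetical air-to-glass surface, lengths in cm, tilt in radians.
n, n_prime, R, S, beta = 1.0, 1.5, 10.0, -50.0, 1e-3
via_distances, via_radius = tilt_shift(n, n_prime, R, S, beta)

# A decenter of -R*beta brings the center of curvature back to its nominal position.
M = n * image_distance(n, n_prime, R, S) / (n_prime * S)
net_shift = via_radius + (1.0 - M) * (-R * beta)
print(via_distances, via_radius, net_shift)
```

The first two printed values should agree, and the net shift should vanish to within rounding error.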
$$ P_0'P_0'' = \left(1 - \frac{n'}{n}M^2\right)\Delta. \tag{2-135} $$
Note that the distance of the displaced image from the displaced surface is
S′ − (n′/n)M²Δ. The image height h′ may also change, but the more serious effect is the
defocused image. If a surface of a multisurface system is displaced, the distances of the
object for each surface that follows it also change.
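Equation (2-135) is a first-order result in Δ, which can be compared with a direct recalculation of the image position. The Python sketch below assumes the single-surface imaging equation n′/S′ − n/S = (n′ − n)/R from the chapter summary and a displacement Δ measured along the positive z direction; the names and values are illustrative assumptions.

```python
def image_distance(n, n_prime, R, S):
    """Solve n'/S' - n/S = (n' - n)/R for the image distance S'."""
    return n_prime / ((n_prime - n) / R + n / S)

# Hypothetical air-to-glass surface, lengths in cm.
n, n_prime, R, S, delta = 1.0, 1.5, 10.0, -50.0, 0.01
S_prime = image_distance(n, n_prime, R, S)
M = n * S_prime / (n_prime * S)

# With the surface moved by delta, the object distance becomes S - delta.
S_prime_new = image_distance(n, n_prime, R, S - delta)
recomputed_shift = delta + S_prime_new - S_prime           # image shift in the fixed frame
first_order_shift = (1.0 - n_prime * M**2 / n) * delta     # Eq. (2-135)
print(recomputed_shift, first_order_shift)
```

The two values differ only by terms of second order in Δ.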
Figure 2-44. Despaced surface. When the surface is despaced slightly, the object and
image distances change, and, therefore, the image is displaced, thus creating
defocus.
2.9 MISALIGNED THIN LENS
$$ h_p = h - \Delta \tag{2-136} $$
and
$$ h_p' = M h_p = h' - M\Delta, \tag{2-137} $$
$$ P'P'' = h_p' - (h' - \Delta), \tag{2-138a} $$
or
$$ P'P'' = (1 - M)\,\Delta, \tag{2-138b} $$
Figure 2-45. Decentered lens. When a lens is decentered slightly, the object and
image distances from its center do not change, but their heights from its optical axis
change.
$$ h_p' = M h_p = h' - M\beta S. \tag{2-139} $$
$$ P'P'' = h_p' - (h' - \beta S') = \beta\,(S' - MS) = 0. \tag{2-140} $$
Thus, the image does not move. This is not surprising, because the image lies on the ray
passing through the center of the lens, which does not change when the lens is tilted about
its center.
Figure 2-46. Tilted lens. When a lens is tilted about its center, the image stays
stationary.
Figure 2-47. Despaced lens. When the lens is despaced, the object and image
distances from its center change, and the image is therefore displaced, thus creating
defocus.

2.10 ANAMORPHIC IMAGING SYSTEMS
Consider a point object P located at a point (p, q) in the object plane imaged by an
anamorphic system at a point P′, as illustrated in Figure 2-48. The cylindrical lens L1
schematically represents cylindrical lenses with their symmetry axes parallel to the x axis,
and similarly for L2 along the y axis. The system is symmetric about the yz and zx planes,
whose intersection defines its optical axis z. The rays in the zx plane originating at P are
transmitted by L1 like a plane-parallel plate, and focused by L2 at P′. Similarly, the rays
in the yz plane are focused by L1 at P′ and transmitted by L2 like a plane-parallel plate.
The projections of skew rays on the zx and yz planes contribute to the image in a similar
manner.
Let S1 be the distance of the point object P, and S1′ be the distance of the Gaussian
image point P′ from the object- and image-space principal planes H1 and H1′ of the lens
L1, respectively, as illustrated in Figure 2-49. They are related to each other by the
image-space focal length f1′ according to
$$ \frac{1}{S_1'} - \frac{1}{S_1} = \frac{1}{f_1'}, \tag{2-141} $$
or
$$ S_1 = \frac{S_1' f_1'}{f_1' - S_1'}. \tag{2-142} $$
Similarly, the object and image distances S2 and S2′ for the lens L2 of focal length f2′
are related to each other according to
$$ \frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} \tag{2-143} $$
or
$$ \frac{1}{S_1' - d_2} - \frac{1}{S_1 - d_1} = \frac{1}{f_2'}, \tag{2-144} $$
where d1 and d2 are the distances H1H2 and H1′H2′ between the respective principal
planes of the two lenses. In the thin-lens approximation, d1 and d2 are equal to the
spacing between the lenses. Substituting for S1 from Eq. (2-142) into Eq. (2-144), we
obtain a quadratic equation in S1′ yielding two solutions for it. A corresponding value of
S1 can be obtained for each value of S1′ from Eq. (2-142). Thus, an anamorphic system
has only two pairs of conjugates, compared to an infinite number for a rotationally
symmetric imaging system. It should be evident that the image magnifications along the x
and y axes are different, as they are given by
$$ M_x = \frac{S_2'}{S_2} \tag{2-145} $$
and
$$ M_y = \frac{S_1'}{S_1}, \tag{2-146} $$
respectively. Consequently, for example, the image of a square object is rectangular and
that of a circle is elliptical.
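The quadratic equation mentioned above can be written out and solved explicitly. In the Python sketch below, the coefficients A, B, and C follow from eliminating S1 between Eqs. (2-142) and (2-144); the function name and the example values are assumptions, and the magnifications are taken as S′/S for each lens in air, as in Eq. (2-149b).

```python
import math

def anamorphic_conjugates(f1, f2, d1, d2):
    """Return the (S1, S1') pairs that satisfy Eqs. (2-142) and (2-144) together."""
    # Quadratic A*x**2 + B*x + C = 0 in x = S1', obtained by eliminating S1.
    A = f1 + d1 - f2
    B = -(d1 * f1 + d2 * (f1 + d1) + f2 * (d1 - d2))
    C = d1 * d2 * f1 + f1 * f2 * (d1 - d2)
    if A == 0:
        raise ValueError("degenerate case: the equation is linear in S1'")
    disc = B * B - 4.0 * A * C
    if disc < 0:
        raise ValueError("no real conjugates for these parameters")
    roots = [(-B - math.sqrt(disc)) / (2.0 * A), (-B + math.sqrt(disc)) / (2.0 * A)]
    return [(x * f1 / (f1 - x), x) for x in roots]   # Eq. (2-142) gives S1

# Hypothetical thin-lens example: f1' = f2' = 10, d1 = d2 = 15 (cm).
pairs = anamorphic_conjugates(10.0, 10.0, 15.0, 15.0)
for S1, S1p in pairs:
    S2, S2p = S1 - 15.0, S1p - 15.0
    print(S1, S1p, "My =", S1p / S1, "Mx =", S2p / S2)
```

For these values, one conjugate pair has a real object and a real image with unequal magnifications in the two symmetry planes, while the other pair corresponds to a virtual object.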
$$ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} \tag{2-147a} $$
and
$$ M_t \equiv \frac{h'}{h} = \frac{nS'}{n'S} = \frac{n\beta_0}{n'\beta_0'}, \tag{2-147b} $$
respectively, where f and f′ are the object- and image-space focal lengths of the system.
However, the focal points are obviously not conjugates of each other.
The principal planes are conjugate planes with a magnification of unity, as illustrated
by the fact that Q and Q′ are conjugate points at the same height from the axis. Note that
the dashed line QQ′ is not a ray but merely an illustration of this fact. A ray incident
parallel to the optical axis of the system passes through the image-space focal point F′
after emerging from the system. Similarly, a ray incident in the direction of the object-space
focal point F emerges parallel to the optical axis. This ray determines the image
height h′. The object-space nodal point N lies at a distance f′ from the corresponding
focal point F. Similarly, the image-space nodal point N′ lies at a distance f from the
corresponding focal point F′. The spacing between the nodal points is equal to that
between the principal points. A ray incident in the direction of N emerges from the
system in a parallel direction passing through N′. The principal points are conjugates of
each other, as are the nodal points.
The angular magnification of a ray bundle diverging from the axial point object P0
and converging to its image point P0′ after refraction by the system, as illustrated in
Figure 2-48, is given by
$$ M_\beta = \beta_0'/\beta_0 = S/S'. \tag{2-147c} $$
$$ M_t M_\beta = n/n', \tag{2-147d} $$
where ΔS′ represents the length of the axial image. It also represents the change in image
distance ΔS′ due to a small change ΔS in the object distance. It shows that the image
moves in the same direction as the object.
The Newtonian imaging equations, in which the object and image distances z and z′ are
measured from the object- and image-space focal points, respectively, are given by
$$ z z' = f f' = -\frac{n}{n'}\,f'^2 \tag{2-147g} $$
and
$$ M_t \equiv h'/h = -z'/f' = -f/z. \tag{2-147h} $$
$$ f' = \frac{n'}{n' - n}\,R. \tag{2-148} $$
The principal points coincide with the vertex of the surface, and the nodal points coincide
with its center of curvature. A ray incident in the direction of the center of curvature
passes through the surface undeviated because the angles of incidence and refraction are
both equal to zero.
In the case of a thin lens in air, the principal and nodal points all coincide at its
center. Thus, the object and image distances are measured from its center, and a ray
incident in the direction of its center passes through it undeviated. For a lens of refractive
index n with surfaces of radii of curvature R1 and R2 , the imaging equations reduce to
$$ \frac{1}{S'} - \frac{1}{S} = \frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = -\frac{1}{f}, \tag{2-149a} $$
$$ M_t \equiv h'/h = S'/S = \beta_0/\beta_0', \tag{2-149b} $$
$$ M_\beta = \beta_0'/\beta_0 = S/S', \tag{2-149c} $$
$$ M_t M_\beta = 1, \tag{2-149d} $$
$$ M_l \equiv \Delta S'/\Delta S = (S'/S)^2 = M_t^2 = M_t/M_\beta, \tag{2-149f} $$
$$ z z' = f f' = -f'^2, \tag{2-149g} $$
and
$$ M_t \equiv h'/h = -z'/f' = -f/z. \tag{2-149h} $$
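The thin-lens relations above can be exercised together in a short Python sketch, which also confirms that the Gaussian and Newtonian forms give the same magnification. The function name and the numerical values (f′ = 10 cm, S = −30 cm) are assumptions for illustration; an object at the front focal point, for which the image lies at infinity, is not handled.

```python
def thin_lens_image(f_prime, S):
    """Gaussian image distance and magnifications of a thin lens in air, Eqs. (2-149)."""
    S_prime = 1.0 / (1.0 / f_prime + 1.0 / S)   # Eq. (2-149a)
    M_t = S_prime / S                            # Eq. (2-149b)
    M_beta = S / S_prime                         # Eq. (2-149c)
    M_l = M_t ** 2                               # Eq. (2-149f)
    return S_prime, M_t, M_beta, M_l

f_prime, S = 10.0, -30.0
S_prime, M_t, M_beta, M_l = thin_lens_image(f_prime, S)

# Newtonian distances are measured from F (located at f = -f') and from F' (at f').
z, z_prime = S - (-f_prime), S_prime - f_prime
print(S_prime, M_t, M_l)
print(z * z_prime, -f_prime ** 2)        # Eq. (2-149g)
print(-z_prime / f_prime, f_prime / z)   # both equal M_t, Eq. (2-149h) with f = -f'
```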
In the case of an afocal system, the focal points lie at infinity. The ray angular
magnification for infinite conjugates is given by
$$ \frac{\beta'}{\beta} = \frac{n x_0}{n' x_0'}, \tag{2-150} $$
where x0 and x0′ are the heights of a parallel ray in the object and image spaces with
refractive indices n and n′, respectively. The image of an object can be determined by
applying the imaging equation successively for each surface or thin lens of the system.
Let Mt = h′/h be the magnification of the image. The image distance S′ for another
object lying at a distance S from the previous object is given by
$$ \frac{S'}{S} = \frac{n'h'^2}{n h^2} = \frac{n'}{n}\,M_t^2. \tag{2-151} $$
$$ \frac{1}{R_p} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right). \tag{2-152} $$
$$ R_p = -\,n f'. \tag{2-153} $$
2.11.3 Misalignments
$$ P'P'' = (1 - M)\,\Delta, \tag{2-154} $$
$$ P_0'P_0'' = \left(1 - \frac{n'}{n}M^2\right)\Delta. \tag{2-155} $$
$$ P_0'P_0'' = \left(1 - M^2\right)\Delta. \tag{2-156} $$
transverse magnifications in the two symmetry planes. As a result, the image of a square
object is rectangular and that of a rectangular object can be square.
PROBLEMS
Illustrate each problem by a diagram.
2.2 From Eqs. (2-79) and (2-82), verify that the nodal points of a refracting surface
coincide with its center of curvature.
2.3 Consider a glass sphere of radius of curvature 3 cm and refractive index 1.5. Find
the apparent position and relative size of a flower (a) embedded at its center, and
(b) placed at a distance of R/n from its center and observed from the other side of
the center. This problem illustrates the concept of a contact magnifier. A typical
lens magnifier produces a magnified (virtual) image of an object placed between it
and its front focal plane. A hemispherical or hyperhemispherical contact lens
magnifier produces a magnified (virtual) image of an object placed in contact with
its planar surface. Contact magnifiers can be used in reverse, as in immersed
detectors, where the image is focused on the detector, which is in contact with the
planar surface of the hemispherical or the hyperhemispherical lens. The image on
the detector is smaller in size by the magnification of the lens determined in parts
(a) and (b). See R. C. Jones, “Immersed radiation detectors,” Appl. Opt. 1, 607–613
(1962).
2.4 From Eq. (2-98), derive the focal length of a thick lens of refractive index n and
thickness t with surfaces of radii of curvature R1 and R2 . This problem is
discussed in detail by way of ray tracing in Chapter 4.
2.6 A 4-cm square slide is placed at a distance of 8 cm from a thin lens. Determine the
focal length of the lens if an image is to be formed on a screen at a distance of 2 m.
(a) What is the size of the image? (b) Sketch the Petzval surface indicating the
value of longitudinal defocus for the corner point of the slide.
2.7 Two thin lenses of focal lengths 10 cm and 20 cm are placed 5 cm apart. If an
object is placed at a distance of 30 cm from the first lens, determine the location
and size of the image formed by the system.
2.8 (a) Determine the radii of curvature of a thin equiconvex lens of refractive index
1.5 and a power of 5 D. (b) Determine the location of the image of an object lying
at a distance of 40 cm from the lens at a height of 5 cm from its optical axis. (c)
How does the image location change if the lens is displaced by 1 mm and is
decentered by 1 mm? Check the change in image height also.
2.9 Consider a thin lens of refractive index 1.5 and focal length 30 cm. Determine its
focal length and power when placed in water. The refractive index of water is 1.33.
2.10 Consider a plane-parallel plate of thickness t and refractive index n. (a) Derive an
expression for the location and the size of the image of an object lying at a distance
So from its front surface. (b) Derive an expression for the location and the size of
the image of its front surface formed by its back surface. (c) Sketch the various
quantities determined for t = 1 cm, n = 1.5, and So = 3 cm.
2.11 Consider an afocal system consisting of two lenses of equal focal lengths f′
placed 2f′ apart. (a) Determine the transverse and longitudinal magnifications of
the image of a nearby object. (b) Determine the space between the object and its
image. Show that the position and size of the image do not change as the system is
moved along its axis. (c) How are the imaging properties of the system affected if
a third lens of focal length f′ is placed at the common focal point of the first two?
(d) As an example, consider f′ = 10 cm and an object placed at a distance of 30
cm from the first lens.
2.12 The size of the image of a distant object depends on the focal length of the imaging
system. A telephoto lens consisting of a positive lens and a negative lens is used to
obtain a large image such that the back focal distance is kept small. (a) Design a
telephoto lens with a focal length of 20 cm and a back focal distance of 4 cm. Let
the focal length of the positive lens be 4 cm. (b) Determine the focal length and the
back focal distance of the lens when it is reversed. Show that the reversed lens
works as a wide-angle lens.
2.13 (a) Show that the power of a system changes when it is reversed (by rotating it by
180° about an axis normal to its optical axis) unless the refractive indices of its
object and image spaces are equal. Consider a thin lens of refractive index 1.5 and
radii of curvature 10 cm and −15 cm with air in its object space and water in its
image space. Calculate its focal lengths and power, then reverse the lens and
repeat the calculations.
2.14 Show that the radius of curvature of the Petzval surface of a system consisting of a
series of m thin lenses of refractive indices nj and focal lengths fj′, where j = 1, 2,
..., m, is given by $1/R_p = -\sum_{j=1}^{m} 1/(n_j f_j')$.
2.15 Consider a camera with an adjustable focus. Assume the lens to be thin with a focal
length of 10 cm. If the object distance changes from 2 m to 4 m, determine the lens
movements required to keep the image focused at the film.
CHAPTER 3
REFLECTING SYSTEMS
3.1 INTRODUCTION
In Section 1.8.3, we considered Gaussian imaging by a reflecting surface, and
showed that the curved surface can be replaced by a planar surface that is a tangent to the
surface at its vertex, called the tangent plane or the paraxial surface. In this chapter, we
rederive the Gaussian imaging equations for a spherical reflecting surface and show that
they can be obtained from those for a corresponding refracting surface by substituting the
refractive index associated with the reflected rays equal to the negative value of that
associated with the incident rays. Both Gaussian and Newtonian forms of the imaging
equations are given. We also show how to determine the image graphically. The Petzval
image, two-mirror telescopes, a beam expander, and the image displacement resulting
from the misalignment of a mirror are also discussed.
3.2 SPHERICAL REFLECTING SURFACE (SPHERICAL MIRROR)

According to the law of reflection,
$$ \theta' = -\theta, \tag{3-1} $$
where θ and θ′ are the angles of incidence and reflection, respectively. From
triangle P0CQ, we note that
$$ \phi = \beta_0 - \theta, \tag{3-2} $$
where φ is the angle that the surface normal at the point of incidence Q makes with the
optical axis. Similarly, from triangle CP0′Q, we note that
$$ \beta_0' = \phi + \theta'. \tag{3-3} $$
The tangent of a small angle is approximately equal to the angle in radians. Thus, we may
write
$$ \phi = -x/R, \tag{3-4a} $$
$$ \beta_0 = -x/S, \tag{3-4b} $$
and
$$ \beta_0' = -x/S', \tag{3-4c} $$
where x is the height of the point of incidence, and S′ is the image distance. Substituting
for θ and θ′ from Eqs. (3-2) and (3-3) into Eq. (3-1) and Eqs. (3-4) for the other angles,
we obtain the Gaussian imaging equation
$$ \frac{1}{S'} + \frac{1}{S} = \frac{2}{R}. \tag{3-5} $$
When the object lies at infinity, i.e., when S = −∞, the corresponding image
distance S′ ≡ VF′ = f′, where f′ is called the focal length of the mirror. Thus, the focal
length of the mirror is given by
$$ f' = R/2. \tag{3-6} $$
The rays incident parallel to the optical axis come to focus after reflection by the mirror at
the point F′, which lies halfway between V and C. It is evident that a mirror has only one
focal point. The object- and image-space focal points are coincident just as the two spaces
are coincident. Thus, if a point source is placed at the focal point F′, its rays incident on
the mirror become parallel after being reflected by it. Substituting Eq. (3-6) into Eq. (3-5),
we obtain
$$ \frac{1}{S'} + \frac{1}{S} = \frac{1}{f'}. \tag{3-7} $$
The focal point F′ of a mirror is illustrated in Figure 3-2 for both a concave and a
convex mirror. It is real in the case of a concave mirror, but it is virtual in the case of a
convex mirror. We note that Eq. (3-5) is independent of the refractive index of the
medium in which the rays are incident or reflected. Therefore, it is independent of the
direction of propagation of the rays. The focal length f′ is numerically negative for a
concave mirror but positive for a convex mirror.
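A short Python sketch of Eqs. (3-5) and (3-6) illustrates the sign convention for a concave (R < 0) and a convex (R > 0) mirror with a real object. The function names and the numerical values are assumptions chosen for illustration.

```python
def mirror_image_distance(R, S):
    """Image distance S' of a spherical mirror from Eq. (3-5): 1/S' + 1/S = 2/R."""
    return 1.0 / (2.0 / R - 1.0 / S)

def mirror_focal_length(R):
    """Focal length of a spherical mirror, Eq. (3-6)."""
    return R / 2.0

# Hypothetical values in cm: real object at S = -30 in front of each mirror.
for R in (-20.0, 20.0):
    print(R, mirror_focal_length(R), mirror_image_distance(R, -30.0))
```

A negative S′ for the concave mirror indicates a real image in front of the mirror, and a positive S′ for the convex mirror indicates a virtual image behind it.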
For object rays propagating from left to right, the rays on the first mirror (not
necessarily the first imaging element) of a system will be incident propagating from left
to right. In a medium of refractive index n, the incident rays will be associated with a
refractive index n, but the reflected rays will be associated with a refractive index
n′ = −n. Any refracting imaging elements following the mirror will be assigned
refractive indices with a negative value of their actual refractive indices because the rays
on them are incident propagating from right to left. When reflected by a second mirror in
the system, these rays will propagate from left to right and will be associated with a
refractive index n2′ = n. Therefore, we define the reflecting power K and the equivalent
or effective focal length fe of a mirror according to
$$ K = \frac{1}{f_e} = \frac{2n'}{R}, \tag{3-8} $$
where n′ is the refractive index associated with the rays reflected by it. Thus, if the first
mirror in a system is concave, it has a negative value of R, a negative value of n′ in Eq.
(3-8), and positive values of K and fe. Similarly, a second concave mirror in a system
will have a positive value of R, a positive value of n′ in Eq. (3-8), and positive values of
K and fe. Therefore, a concave mirror is always a positive imaging element, regardless
of the direction of the rays incident on it. Similarly, a convex mirror has negative values
of K and fe, i.e., it is always a negative imaging element, regardless of the direction of
the rays incident on it. In air, n′ = −1 for the first mirror, and its reflecting power and
equivalent focal length are given by
$$ K_1 = \frac{1}{f_{e1}} = -\frac{2}{R_1}, \tag{3-9a} $$
where R1 is its radius of curvature. Similarly, because n′ = 1 for a second mirror, its
reflecting power and equivalent focal length are given by
$$ K_2 = \frac{1}{f_{e2}} = \frac{2}{R_2}, \tag{3-9b} $$
where R2 is its radius of curvature. Continuing in this manner, we find that the reflecting
power Kj and equivalent focal length fej of a jth mirror in air of radius of curvature Rj in
a system are given by
$$ K_j = \frac{1}{f_{ej}} = (-1)^j\,\frac{2}{R_j}. \tag{3-10} $$

Figure 3-2. The focal point F′ of a mirror. It lies halfway between the vertex V and
the center of curvature C of the mirror. (a) Concave mirror. (b) Convex mirror.
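The alternating sign in Eq. (3-10) can be made concrete with a small Python sketch. The function name and the radii are assumptions; the sign of each radius follows the statements in the text (a concave first mirror has R < 0, a concave second mirror has R > 0).

```python
def mirror_power_in_air(j, R):
    """Reflecting power of the jth mirror in air, Eq. (3-10): K_j = (-1)**j * 2 / R_j."""
    return (-1) ** j * 2.0 / R

# Hypothetical radii in cm.
print(mirror_power_in_air(1, -20.0))  # concave first mirror: positive power
print(mirror_power_in_air(2, 30.0))   # concave second mirror: positive power
print(mirror_power_in_air(1, 20.0))   # convex first mirror: negative power
```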
Now we consider the imaging of an off-axis point object P at height h from the
optical axis in the object plane passing through P0, as illustrated in Figure 3-3. A ray PV
incident at the vertex V of the mirror is reflected as a ray VP′ intersecting the image
plane passing through P0′ at the point P′, which locates the image point at a height h′. It
is evident from the figure that
$$ \theta = h/S \tag{3-11a} $$
and
$$ \theta' = h'/S'. \tag{3-11b} $$
Substituting Eqs. (3-11) into Eq. (3-1), we find that the transverse magnification of the
image is given by
$$ M_t \equiv h'/h = -S'/S. \tag{3-12a} $$
The image formed by a concave mirror is inverted, as in Figure 3-3a, but that by a convex
mirror is erect, as in Figure 3-3b. Accordingly, the magnification is negative in Figure 3-3a
and positive in Figure 3-3b. From similar triangles P0PC and P0′P′C, we find that the
transverse magnification may also be written
$$ M_t = -\frac{R - S'}{S - R}. \tag{3-12b} $$
Letting R → ∞ shows that the magnification for a plane mirror is unity, as expected.
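The two expressions for the transverse magnification, Eqs. (3-12a) and (3-12b), can be compared numerically. The Python sketch below uses Eq. (3-5) for the image distance; the function name and the sample values are assumptions for illustration.

```python
def mirror_magnification(R, S):
    """Return M_t of a spherical mirror from Eq. (3-12a) and from Eq. (3-12b)."""
    S_prime = 1.0 / (2.0 / R - 1.0 / S)      # Eq. (3-5)
    via_distances = -S_prime / S             # Eq. (3-12a)
    via_center = -(R - S_prime) / (S - R)    # Eq. (3-12b)
    return via_distances, via_center

# Hypothetical values in cm: real object at S = -30.
print(mirror_magnification(-20.0, -30.0))  # concave mirror: negative (inverted image)
print(mirror_magnification(20.0, -30.0))   # convex mirror: positive (erect image)
```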
The ray angular magnification representing the ratio of the angular divergence of the
rays from P0 to the angular convergence of these rays to P0′, as in Figure 3-4, is given by
$$ M_\beta = \beta_0'/\beta_0 = S/S'. \tag{3-13} $$
Figure 3-3. Gaussian imaging of an off-axis point object P at height h. (a) Concave
mirror forms a real and inverted image at P′ at a height h′. (b) Convex mirror
forms a virtual and erect image.
$$ M_t M_\beta = -1, \tag{3-14} $$
or
$$ h\beta_0 = -\,h'\beta_0', \tag{3-15} $$
representing the Lagrange invariance for the mirror. The Lagrange invariant is nhβ0,
where n = 1. From Eq. (3-15), the transverse magnification of the image can also be
written
$$ M_t = -\beta_0/\beta_0', \tag{3-16} $$
i.e., it can also be obtained from the slope angles of the axial incident ray and the
corresponding reflected ray.

Figure 3-4. Lagrange invariant nhβ0 of a mirror. (a) Concave mirror. (b) Convex
mirror.
real object, an increase in the object distance takes place (from a larger negative value to
a smaller one) by moving the object closer to the mirror. In Figure 3-3a, a decrease in the
image distance (from a smaller negative value to a larger one) implies that the image
moves away from the mirror. Similarly, in Figure 3-3b, a decrease in the image distance
(from a larger positive value to a smaller one) implies that the image moves closer to the
mirror. Thus, the image moves in a direction opposite to that of the object. This is true for
a system with an odd number of mirrors, as may be seen from Eq. (2-61) by letting
n′/n = −1. The opposite is true if the number of mirrors is even because then n′/n = 1.
Figure 3-5 illustrates a 3D image of a 3D object. The reversal of the image arrows P0x′,
P0y′, and P0z′, compared with the corresponding object arrows P0x, P0y, and P0z,
shows that both the transverse and longitudinal magnifications are numerically negative.
This is different from that for a refracting surface, where the longitudinal magnification is
positive (see Figure 2-10).
In Eq. (3-17), the mirror is assumed to be fixed in position, and ΔS′ represents the
displacement of the image corresponding to a displacement ΔS of the object. However, if
the object is fixed and the mirror is displaced by an amount Δ, then the corresponding
displacement of the image is (1 + Mt²)Δ, as shown in Section 3.6.3.
Comparing Eq. (3-5) with Eq. (2-4), we note that the imaging properties of a
spherical reflecting surface can be obtained from those of a spherical refracting surface if
we let n = 1 because the medium between the object and the mirror is air, and n′ = −1,
representing a reflected ray propagating backward. Similarly, the expressions for the focal
length, reflecting power, magnifications, and Lagrange invariant for a mirror can be
obtained from the corresponding expressions for a refracting surface by letting n′ = −n,
where n = 1.
Equation (3-7) is the Gaussian imaging equation for a mirror in which the object and
image distances are measured from its vertex. If we measure the object and image
distances z and z′, respectively, from the focal point F′, as indicated in Figure 3-6, then
from similar triangles VF′B and P0′F′P′, and similar triangles P0F′P and VF′A, we find
that the transverse magnification of the image is given by
$$ M_t = h'/h = -z'/f' = -f'/z. \tag{3-18} $$
Therefore,
$$ z z' = f'^2. \tag{3-19} $$
Figure 3-6. Paraxial imaging of a real object P0P of height h. (a) Concave mirror
forms a real and inverted image P0′P′ of height h′. (b) Convex mirror forms a
virtual and erect image. In the Gaussian approximation, the reflection of rays takes
place at the tangent plane AVB.
3.3 TWO-MIRROR TELESCOPES
However, because one mirror obscures a portion of the other, the beam of light that forms
the final image is annular, resulting in a decrease in the amount of light in the image.
Let R1 and R2 be the vertex radii of curvature of the primary and secondary mirrors,
respectively. Their corresponding focal lengths are given by f1′ = R1/2 and f2′ = R2/2.
They are coaxial such that the system is rotationally symmetric about the optical axis that
passes through their vertices. Let the vertex-to-vertex spacing from mirror M1 to mirror
M2 be t (a numerically negative quantity).
Applying Eq. (3-7) to the primary mirror M1, we note that for an object at infinity,
S1 = −∞, and therefore the image is formed at its focus F1′, called the prime focus, at a
(numerically negative) distance S1′ = f1′ from M1. This image is the object for the
secondary mirror M2 and lies at a distance
$$ S_2 = f_1' - t \tag{3-20} $$
from it. In Figure 3-8a, F1′ lies inside the focus of the secondary mirror, i.e., |S2| < |f2′|,
but in Figure 3-8b, it lies outside, i.e., |S2| > |f2′|. In both cases, a real image is formed by
M2 that lies at the telescope (or Cassegrain) focus F′ at a distance S2′ given by
$$ \frac{1}{S_2'} = \frac{1}{f_2'} - \frac{1}{S_2}, \tag{3-21} $$
or
$$ S_2' = \frac{f_2' S_2}{S_2 - f_2'}. $$
S2′ locates the image formed by the system, and S2′ + t gives the distance of the image
from the primary mirror, called the working distance. Although the location of the focal
point F′ is thus determined, the focal length is not; it can be obtained by substituting for
S2 and S2′ into Eq. (2-98) and letting n2′ = 1, representing the refractive index associated
with the ray reflected by M2. Therefore, the focal length of the telescope is given by
$$ \frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'}. \tag{3-23} $$
By definition, the focal length is the distance of F′ from the principal point H′. As
illustrated in Figure 3-9, H′ is the point where the optical axis intersects the principal
plane, which, in turn, is the transverse plane passing through the point of intersection of a
ray incident parallel to the optical axis and the corresponding exit ray passing through
F′. To determine f′, we need to know the slope angle β′ of the exit ray in terms of the
height of the incident ray. This is done in Section 4.8, where Eq. (3-23) is rederived.
For an object lying at infinity at an angle β from the optical axis of the system (see
Figure 3-8), the height h1′ of its image formed by M1 is given by
$$ h_1' = -f_1'\beta. \tag{3-24} $$
The image formed by M1 is the object for M2. The height h2′ of the final image formed
by M2 (and therefore by the system) is determined by the magnification
$$ M_2 \equiv h_2'/h_1' = -S_2'/S_2, $$
or
$$ M_2 = -f'/f_1'. \tag{3-25} $$
Figure 3-9. Ray tracing of a two-mirror system to determine its focal point $F'$ and
principal point $H'$. A ray incident parallel to the optical axis is reflected by mirror
$M_1$ in the direction of its focal point $F_1'$ and is then reflected by mirror $M_2$
in the direction of the telescope focus $F'$. The incident and the exit rays meet in the
principal plane whose intersection with the optical axis locates the principal point
$H'$.
The magnification $M_2$ of the image formed by the secondary mirror is called the
secondary magnification. From Eqs. (3-24) and (3-25), we obtain
$$h_2' = \beta f' . \tag{3-26}$$
It is evident from Figure 3-9 that the focal ratio of the image-forming light cone is given
by
$$F = f'/D_1 , \tag{3-27}$$
where D1 is the diameter of the primary mirror. It should be evident from Figure 3-8 that
the secondary mirror obscures the beam incident on the primary mirror. Accordingly, the
image-forming light cone is annular (see Section 4.8 for obscuration value). The signs of
the various quantities associated with the Cassegrain and Gregorian telescopes are
summarized in Table 3-1.
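The relations in Eqs. (3-20), (3-21), (3-23), (3-25), and (3-27) can be checked numerically. The
Python sketch below is illustrative only: the function name `two_mirror_telescope` and all
numerical values are assumptions chosen for the example, not data from the text.

```python
import math


def two_mirror_telescope(f1, f2, t, d1):
    """Gaussian quantities of a two-mirror telescope in the text's sign convention.

    f1, f2: focal lengths of the primary and secondary mirrors
    t: mirror spacing (numerically negative)
    d1: diameter of the primary mirror
    """
    s2 = f1 - t                                        # Eq. (3-20)
    s2p = 1.0 / (1.0 / f2 - 1.0 / s2)                  # Eq. (3-21)
    f = 1.0 / (-1.0 / f1 + 1.0 / f2 - t / (f1 * f2))   # Eq. (3-23)
    return {
        "S2": s2,
        "S2p": s2p,
        "working_distance": s2p + t,  # image distance from the primary mirror
        "f": f,
        "M2": -s2p / s2,              # Eq. (3-25), first form
        "F": f / d1,                  # Eq. (3-27)
    }


# Assumed, illustrative Cassegrain-like values (arbitrary length units).
q = two_mirror_telescope(f1=-6.0, f2=-1.5, t=-4.8, d1=2.0)
print(q)
```

For these assumed values, the secondary magnification computed from $-S_2'/S_2$ agrees with
$-f'/f_1'$, as Eq. (3-25) requires, and the signs of $f_1'$, $f_2'$, $f'$, $M_2$, $S_2'$, and $t$
follow the Cassegrain pattern.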
If the two mirrors of the telescope are confocal (meaning "common focus"), i.e., if
$t = f_1' - f_2'$, then $f' \to \infty$, and the system becomes afocal, acting as a beam expander. As
illustrated in Figure 3-10, a parallel beam incident at an angle $\beta$ with the optical axis OA
is focused by the primary mirror $M_1$ at a height $h_1' = -\beta f_1'$ in the plane of the common
focus. The beam focus lies at an angle $h_1'/f_2'$ from the optical axis. Thus, the secondary
mirror recollimates the light, making an angle $\beta' = -h_1'/f_2'$. It is evident that the
beam-expansion ratio is $D_2/D_1 = f_2'/f_1'$, where $D_1$ and $D_2$ are the diameters of the primary
and secondary mirrors, respectively. The angular magnification of the output beam is
given by $\beta'/\beta = f_1'/f_2'$. The product of the transverse and angular magnifications is unity.
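Table 3-1. Signs of focal lengths, etc. for Cassegrain and Gregorian telescopes.

Quantity    Cassegrain    Gregorian
$f_1'$      –             –
$f_2'$      –             +
$f'$        +             –
$M_2$       +             –
$S_2'$      +             +
$t$         –             –
$R_p$       –             +

[Figure 3-10 (image and caption not recovered): afocal two-mirror system with mirrors $M_1$ and
$M_2$ of diameters $D_1$ and $D_2$, common focus $F_1'$, $F_2'$, beam angles $\beta$ and $\beta'$, image
height $h_1'$, spacing $t$, and focal lengths $S_1' = f_1'$ and $f_2'$.]

The radius of curvature $R_i$ of the Petzval image surface for a spherical refracting
surface of radius of curvature $R$ separating media of refractive indices $n$ and $n'$ is given
by Eq. (2-118):
$$\frac{1}{R_i} = \frac{n - n'}{nR} . \tag{3-28}$$
Letting $n = 1$ and $n' = -1$, we obtain the radius of curvature of the Petzval surface for a
corresponding reflecting surface:
$$\frac{1}{R_{i1}} = -\frac{1}{R}(-1 - 1) ,$$
or
$$R_p = R/2 = f' . \tag{3-29}$$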
For a concave (converging or a positive) mirror with its center of curvature to the left of
its vertex, $R$ and $f'$ are numerically negative (see Figure 3-11a). Therefore, $R_p$ is also
numerically negative, or the Petzval surface is curved in the same manner as the mirror
with a radius of curvature equal to the focal length of the mirror. For a convex (diverging
or a negative) mirror with its center of curvature to the right of its vertex, $R$ and $f'$ are
numerically positive (see Figure 3-11b). Therefore, $R_p$ is also numerically positive, or
the Petzval surface is virtual and curved in the same manner as the mirror. When the
object is at infinity so that the image lies in the focal plane of the mirror, the Petzval
surface is concentric with the mirror, regardless of whether the mirror is concave or
convex.
Figure 3-11. Petzval surface of a mirror. (a) Concave mirror with a real Petzval
image surface. (b) Convex mirror with a virtual Petzval image surface. $C$ and $F'$
are the center of curvature and the focal point of the mirror. $P_0'$ is the axial image
point, and $C_p$ is the center of curvature of the Petzval surface. $P_0'C_p$ is the radius of
curvature of the Petzval surface.
$$\frac{1}{R_{ik}} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right) . \tag{3-30}$$
The rays on each surface are incident from left to right, and the radius of curvature $R_j$ of a
surface, including the Petzval surface, is numerically positive or negative, depending on
whether its center of curvature lies to the right or the left of its vertex, i.e., depending on
whether it is convex or concave to the light incident on it. If the jth surface of a system is
a reflecting one, then $n_{j-1} = 1$ and $n_j = -1$ when rays are incident on it from left to right.
However, if they are incident from right to left, as, for example, on the secondary mirror
of a two-mirror system, then $n_{j-1} = -1$ and $n_j = 1$.
For a system consisting of two mirrors with radii of curvature $R_1$ and $R_2$, the
refractive indices have the values $n_0 = 1$, $n_1 = -1$, and $n_2 = 1$ (a second reflection makes
$n_2$ positive). Thus, Eq. (3-30) reduces to
$$\frac{1}{R_{i2}} = \frac{1}{R_1}(-1 - 1) + \frac{1}{R_2}(1 + 1) ,$$
or
$$\frac{1}{R_p} = 2\left(-\frac{1}{R_1} + \frac{1}{R_2}\right) = -\frac{1}{f_1'} + \frac{1}{f_2'} , \tag{3-31}$$
where $f_1'$ and $f_2'$ are the focal lengths of the mirrors. This surface is shown passing
through $F'$ in Figure 3-8. It is curved toward the primary mirror in the case of a
Cassegrain telescope, and away from it in the case of a Gregorian telescope.
Similarly, we find that the radius of curvature $R_p$ of the Petzval surface for a system
consisting of $k$ mirrors with radii of curvature $R_j$, $j = 1, 2, ..., k$, is given by
$$\frac{1}{R_p} = 2(-1)^k \sum_{j=1}^{k} (-1)^j \frac{1}{R_j} . \tag{3-32}$$
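As a quick numerical check of Eq. (3-32), the following Python sketch evaluates the Petzval
radius for a list of mirror radii. The function name `petzval_radius` and the sample radii are
assumptions made for illustration.

```python
import math


def petzval_radius(radii):
    """Petzval radius R_p of a k-mirror system from Eq. (3-32).

    radii: mirror radii of curvature R_1..R_k in the text's sign convention.
    Returns math.inf when the Petzval surface is flat.
    """
    k = len(radii)
    curvature = 2 * (-1) ** k * sum((-1) ** j / r for j, r in enumerate(radii, start=1))
    return math.inf if curvature == 0 else 1.0 / curvature


print(petzval_radius([-10.0]))         # single mirror: reduces to R/2, Eq. (3-29)
print(petzval_radius([-12.0, -3.0]))   # two mirrors: agrees with Eq. (3-31)
print(petzval_radius([-4.0, -4.0]))    # equal radii: flat Petzval surface
```

The single-mirror and two-mirror cases reproduce Eqs. (3-29) and (3-31), which is a useful
consistency test of the alternating signs in the sum.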
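$$h_p = h - \Delta \tag{3-33}$$
and
$$h_p' = M_t h_p = h' - M_t \Delta , \tag{3-34}$$
$$P'P'' = h_p' - (h' - \Delta) = (1 - M_t)\Delta . \tag{3-35}$$
[Figure 3-12 (image and caption not recovered): mirror diagram labeled with $C$, $C_p$, $V$, $V_p$,
the displacement $\Delta$, object point $P_0$ of height $h$, image points $P_0'$, $P'$, and $P''$, and the
distances $S$, $S'$, and $R$.]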
If a mirror is tilted about its vertex by an angle $\beta$, as illustrated in Figure 3-13, the
image is displaced from the point $P'$ to a point $P''$. The object and image heights with
respect to the tilted axis of the mirror are given by
$$h_p = h - \beta S \tag{3-36}$$
and
$$h_p' = M_t h_p = h' - M_t \beta S . \tag{3-37}$$
$$P'P'' = h_p' - (h' - \beta S') = \beta (S' - M_t S) = 2\beta S' , \tag{3-38}$$
where we have used Eqs. (3-12a) and (3-37). When a ray is incident at a certain angle on
the mirror (including a plane mirror), the reflected ray is tilted by an angle $2\beta$ when the
mirror is tilted by an angle $\beta$. Thus, the image is displaced by $2\beta S'$. Substituting for $S'$
in terms of $M_t$ and $R$ from Eqs. (3-5) and (3-12), Eq. (3-38) can also be written
$$P'P'' = (1 - M_t)\Delta , \tag{3-39}$$
where $\Delta = \beta R$ is the displacement of the center of curvature of the mirror. From Eqs. (3-35)
and (3-39), we note that the image displacement is determined by the displacement of
the center of curvature. If a mirror is decentered and tilted such that the center of
curvature is not displaced, then the image is also not displaced. Note that Eq. (3-39) does
not apply to a plane mirror because its center of curvature lies at infinity, and the
magnification of the image formed by it is unity.
Figure 3-13. Tilted mirror. When the mirror is tilted by an angle $\beta$, the image is
displaced by an amount $2\beta S'$.
$$P_0'P_0'' = (1 + M_t^2)\Delta . \tag{3-40}$$
Letting $M_t = 1$ for a plane mirror, the image displacement is twice that of the mirror, as
expected.
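The three image displacements in Eqs. (3-35), (3-38), and (3-40) are easy to compare numerically.
The Python sketch below is a hedged illustration: the function names and sample values are
hypothetical, and the transverse magnification is taken as $M_t = -S'/S$, which is the form implied
by the step $\beta(S' - M_t S) = 2\beta S'$ in Eq. (3-38).

```python
import math


def mirror_image(s, r):
    """Image distance S' and magnification M_t of a spherical mirror."""
    sp = 1.0 / (2.0 / r - 1.0 / s)  # mirror equation 1/S' + 1/S = 2/R
    mt = -sp / s                    # assumed form, implied by Eq. (3-38)
    return sp, mt


def decenter_shift(mt, delta):
    """Image displacement for a center-of-curvature shift delta, Eq. (3-35)."""
    return (1.0 - mt) * delta


def tilt_shift(sp, beta):
    """Image displacement for a mirror tilt beta about its vertex, Eq. (3-38)."""
    return 2.0 * beta * sp


def despace_shift(mt, delta):
    """Image displacement for a longitudinal mirror shift delta, Eq. (3-40)."""
    return (1.0 + mt ** 2) * delta


# Assumed example: concave mirror, R = -10, object at S = -15, tilt of 1 mrad.
sp, mt = mirror_image(s=-15.0, r=-10.0)
beta = 1.0e-3
print(tilt_shift(sp, beta), decenter_shift(mt, beta * -10.0))
```

Because a tilt $\beta$ about the vertex moves the center of curvature by $\Delta = \beta R$, the two printed
values coincide, which is the equivalence expressed by Eq. (3-39).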
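[Figure (image and caption not recovered): mirror with its vertex displaced from $V$ to $V_p$ by
$\Delta$, showing the object point $P_0$ of height $h$, the image points $P_0'$, $P_0''$, $P'$, and $P''$,
and the distances $S$, $S'$, and $R$.]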
Figure 3-15a shows a properly aligned two-mirror telescope. When the secondary
mirror is decentered by a small amount $\Delta$ along the x axis, as in Figure 3-15b, the image
is displaced by an amount $(1 - M_2)\Delta$, where $M_2$ is the transverse magnification of the
image formed by it.
When the secondary mirror is tilted with respect to the primary mirror by an angle $\beta$,
as in Figure 3-15c, the rays reflected by it tilt by an angle $2\beta$. Thus, the image formed by
it is displaced by an amount $2\beta S_2'$, where $S_2'$ is the distance of the final image from the
secondary mirror $M_2$.
[Figure 3-15 (image and caption not recovered): panels (a) to (d) of a two-mirror telescope with
mirrors $M_1$ and $M_2$, showing the decenter $\Delta$ with image shift $(1 - M_2)\Delta$, the tilt $\beta$
with image shift $2\beta S_2'$, and the centers of curvature $C_2$ and $C_2'$.]
The imaging equations for a spherical mirror of radius of curvature $R$ can be obtained
from the corresponding equations for a refracting surface by letting $n = 1$ because the
medium between the object and the mirror is air, and $n' = -1$, representing a reflected
ray propagating backward. However, a mirror has only one principal point that coincides
with its vertex $V$, one focal point $F'$ that lies halfway between $V$ and the center of
curvature $C$, and one nodal point that coincides with $C$ (see Figures 3-16 and 3-17). A ray
incident parallel to the optical axis is reflected by the mirror passing through $F'$, a ray
incident in the direction of $F'$ is reflected parallel to the axis, and a ray incident in the
direction of $C$ is reflected upon itself. The imaging equations are given by
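$$\frac{1}{S'} + \frac{1}{S} = \frac{2}{R} = \frac{1}{f'} , \quad \text{(Gaussian Imaging Equation)} \tag{3-41a}$$
$$M_t M_\beta = 1 , \tag{3-41d}$$
$$M_l = \Delta S'/\Delta S = -(S'/S)^2 = -M_t^2 . \quad \text{(Longitudinal Magnification)} \tag{3-41f}$$
[Figures 3-16 and 3-17 (images and captions not recovered): imaging by a spherical mirror,
showing the object $P_0P$ of height $h$, the image $P_0'P'$ of height $h'$, the points $V$, $F'$, and
$C$, the ray angles $\beta_0$ and $\beta_0'$, and the distances $z$, $z'$, $f'$, $S$, $S'$, and $R$.]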
The negative sign in Eq. (3-41e) indicates, for example, that if the object is displaced
longitudinally towards the mirror, the image is displaced away from it.
If the mirror is decentered and/or tilted such that its center of curvature is displaced
by an amount $\Delta$, the image is displaced by an amount $(1 - M_t)\Delta$. The image
displacement is also equal to $2\beta S'$ when the mirror is tilted by an angle $\beta$. If the mirror is
despaced by an amount $\Delta$, the image is displaced by an amount $(1 + M_t^2)\Delta$.
3.8.2 Imaging by a Two-Mirror Telescope
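The imaging equations for a two-mirror telescope, illustrated in Figure 3-18, with
mirrors of radii of curvature $R_1$ and $R_2$, focal lengths $f_i' = R_i/2$, and spaced a
(numerically negative) distance $t$ apart are given by
$$f' = \frac{f_1' f_2'}{f_1' - f_2' - t} , \quad \text{(Focal Length of the Telescope)} \tag{3-42a}$$
$$S_2' = \frac{f_2'(f_1' - t)}{f_1' - t - f_2'} , \quad \text{(Image Distance from Secondary Mirror)} \tag{3-42f}$$
$$\frac{1}{R_p} = 2\left(-\frac{1}{R_1} + \frac{1}{R_2}\right) = -\frac{1}{f_1'} + \frac{1}{f_2'} , \quad \text{(Petzval Radius of Curvature)} \tag{3-42h}$$
where $D_1$ is the diameter of the primary mirror, and $\beta$ is the field angle.
[Figure 3-18 (image and caption not recovered): two-mirror telescope showing $M_1$, $M_2$, $D_1$,
$F_1'$, $F'$, the field angle $\beta$, image heights $h_1'$ and $h_2'$, and the distances $S_2$, $S_2'$,
$t$, $f_1'$, and $S_2' + t$.]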
PROBLEMS
Illustrate each problem by a diagram.
3.2 The right-hand side mirror of automobiles is inscribed with the words "objects are
closer than they appear." Determine the radius of curvature of the mirror if the ratio
of the distances is 1.2.
3.3 A Mangin mirror consists of a thin, negative meniscus lens with a silvered back
surface. Show that, if $R_1$ and $R_2$ are the radii of curvature of the lens and $n$ is its
refractive index, the focal length of the Mangin mirror is given by
$f_s'^{-1} = 2nR_2^{-1} - 2(n - 1)R_1^{-1}$.
3.4 The Hubble space telescope is a Cassegrain telescope with a focal ratio of 24. Its
primary mirror has a diameter of 2.4 m and a focal ratio of 2.3. The spacing
between its two mirrors is 4.905 m. (a) Determine its focal length and illustrate it
on a schematic of the telescope. (b) Calculate its working distance. (c) Determine
the radius of curvature of the Petzval image surface.
3.5 Show that imaging by a thin converging lens of focal length $f'$ in contact with a
plane mirror is equivalent to a concave mirror of radius of curvature $f'$. If the lens
has a focal length of 15 cm, determine the image of an object lying at a distance of
10 cm from it (a) in the absence of the mirror and (b) when the mirror is present.
Chapter 4
Paraxial Ray Tracing
4.1 INTRODUCTION
Paraxial ray tracing was introduced in Chapter 1 (see Section 1.7) and utilized in
Chapters 2 and 3 to show that an imaging system could be characterized by its principal
points and focal lengths. Although a system has six cardinal points, only three are
independent. If the refractive indices of the object and image spaces are equal, which is
often the case in practice, then the nodal points coincide with the corresponding principal
points, and the object- and image-space focal lengths are equal in magnitude. We showed
that the image of a point object formed by a system can be determined graphically by
tracing any two of the three specific object rays: a ray incident parallel to the optical axis
of the system and emerging from it passing through the image-space focal point; a ray
incident passing through its object-space focal point and emerging parallel to the optical
axis; and a ray incident passing through its object-space nodal point and emerging from
the system passing through its image-space nodal point. The point of intersection of these
rays in the image space determines the image point.
However, before any of these three rays can be traced, we must know the location of
the cardinal points. Of course, we need their locations in order to apply the Gaussian
imaging equations as well. In the case of a single refracting surface, the principal points
coincide with its vertex, and its nodal points coincide with its center of curvature. The
principal and the nodal points of a thin lens (in air) coincide with its center. In this
chapter, we develop the paraxial ray-tracing equations and demonstrate their utility by
determining the cardinal points of simple imaging systems. Starting at an object point, a
ray undergoes rectilinear propagation to the first surface of the system; it is refracted or
reflected at the surface, depending on whether it is a refracting or a reflecting surface; it
undergoes rectilinear propagation again until it reaches the next surface; and the process
repeats itself until the ray reaches the image plane.
We first develop the paraxial ray-tracing equations for a refracting surface and
demonstrate their utility by determining its focal length. Ray tracing of a general system
to determine its cardinal points is considered next. How to determine the cardinal points
of a combination of two systems is also discussed. As examples of simple systems, a thin
lens, a thick lens, and a two-lens system are considered. The ray-tracing equations for a
mirror are derived next and applied to a two-mirror system, and a catadioptric system
consisting of a thin lens and a mirror. Some of the results obtained in Chapters 2 and 3 on
these simple systems are rederived to gain familiarity with the use of the ray-tracing
equations. In practice, the ray-tracing equations are used to determine not only the
Gaussian properties of a system but also the size of the imaging elements and apertures,
vignetting of the rays, and obscurations in mirror systems. This is illustrated by
determining the obscuration ratio of a two-mirror system, and the relative size of its
secondary mirror and the hole in its primary mirror.
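[Figure 4-1 (image and caption not recovered): ray $A_0A_1A_2$ refracted by a spherical surface of
radius $R_1$ with vertex $V$ and center of curvature $C$, separating media of refractive indices
$n_0$ and $n_1$, showing the ray heights $x_0$, $x_1$, $x_2$, the angles $\beta_0$, $\beta_1$, $\theta_0$, $\theta_1$,
and $\phi$, and the distances $t_0$ and $t_1$.]
It is evident from Figure 4-1 that for paraxial rays, rectilinear propagation from $A_0$
to $A_1$ gives
$$x_1 = x_0 + t_0\beta_0 , \tag{4-1}$$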
where $\beta_0$ is the slope angle of the incident ray $A_0A_1$ from the optical axis. Equation (4-1)
is called the transfer ray-tracing equation, and, except for notation, it is the same as
Eq. (1-65). According to the paraxial form of Snell's law, refraction of the ray at $A_1$ gives
$$n_1\theta_1 = n_0\theta_0 , \tag{4-2}$$
where $\theta_0$ and $\theta_1$ are the angles of incidence and refraction (i.e., the angles of the incident
and refracted rays from the surface normal $A_1C$ at the point $A_1$), respectively.
$$\phi = \beta_1 - \theta_1 \tag{4-3a}$$
and
$$\theta_0 = \beta_0 - \phi , \tag{4-3b}$$
where
$$\phi = -x_1/R_1 \tag{4-3c}$$
is the angle the surface normal makes with the optical axis. Both $\phi$ and $\beta_1$ are
numerically negative in the figure. Substituting for $\theta_0$, $\theta_1$, and $\phi$ from Eqs. (4-3) into
Eq. (4-2), we obtain
$$n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1} . \tag{4-4}$$
Equation (4-4) is called the refraction ray-tracing equation. It is the same as Eq. (1-66).
The rectilinear propagation of the refracted ray from $A_1$ to $A_2$ gives
$$x_2 = x_1 + t_1\beta_1 . \tag{4-5}$$
If the next surface lies at a distance t1 from the first, then A2 determines the point of
incidence on it. In that case the ray A1 A2 is refracted at the point of incidence A2 by the
second surface according to an equation similar to Eq. (4-4), and the ray propagates
rectilinearly until it reaches the next surface. Using Eqs. (4-1) and (4-4) recursively, the
ray can be propagated to the image plane of a multisurface system.
Figure 4-2. Ray tracing of a spherical refracting surface. (a) General case. (b)
Determination of focal point $F'$.
As a simple example, we use the ray-tracing equations to determine the focal length
of the refracting surface. If we let $\beta_0 = 0$, corresponding to a ray incident parallel to the
optical axis of the system, as in Figure 4-2b, and let $x_2 = 0$, corresponding to the point of
intersection of the refracted ray with the optical axis, then the corresponding value of $t_1$
gives the focal length $f_1'$. Letting $\beta_0 = 0$ in Eqs. (4-1) and (4-4), and $x_2 = 0$ in Eq. (4-5),
we find that
$$f_1' = \frac{n_1}{n_1 - n_0} R_1 . \tag{4-6}$$
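The transfer and refraction equations, Eqs. (4-1) and (4-4), translate directly into two small
functions. The Python sketch below uses hypothetical function names and assumed sample values; it
traces a ray incident parallel to the axis and recovers the focal length of Eq. (4-6).

```python
def transfer(x, beta, t):
    """Transfer equation, Eq. (4-1): ray height after propagating a distance t."""
    return x + t * beta


def refract(x, beta, n0, n1, r):
    """Refraction equation, Eq. (4-4): slope after a surface of radius r."""
    return (n0 * beta + (n0 - n1) * x / r) / n1


def focal_length_by_trace(n0, n1, r, x0=1.0):
    """Trace a ray parallel to the axis and return the distance t1 at which x2 = 0."""
    x1 = transfer(x0, 0.0, 10.0)         # beta0 = 0, so x1 = x0 for any t0
    beta1 = refract(x1, 0.0, n0, n1, r)
    return -x1 / beta1                   # x2 = x1 + t1 * beta1 = 0, Eq. (4-5)


# Assumed example: air-to-glass surface with n1 = 1.5 and R1 = 50.
print(focal_length_by_trace(1.0, 1.5, 50.0))
print(1.5 * 50.0 / (1.5 - 1.0))          # closed form, Eq. (4-6)
```

The result does not depend on the starting height `x0`, as expected in the paraxial
approximation, because both $x_1$ and $\beta_1$ scale linearly with it.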
Figure 4-3. Imaging of a point object by a refracting surface. (a) On-axis point
object $P_0$. (b) Off-axis point object $P$.
It should be evident that the position and height of the image of an object formed by
a refracting surface can also be obtained by using the ray-tracing equations. Thus, for
example, consider a ray from an axial point object $P_0$ with a slope angle $\beta_0$, as illustrated
in Figure 4-3a. Letting $x_0 = 0$ in Eq. (4-1), we obtain
$$x_1 = t_0\beta_0 , \tag{4-7}$$
as is evident from the figure. Substituting Eq. (4-7) into Eq. (4-4), the slope $\beta_1$ of the
refracted ray is given by
$$n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{t_0\beta_0}{R_1} . \tag{4-8}$$
Substituting for $\beta_1$ from Eq. (4-8) into Eq. (4-5) and letting $x_2 = 0$, we obtain the
distance $t_1$ of the axial image point $P_0'$:
$$t_1 = -\frac{x_1}{\beta_1} = -\frac{n_1 t_0}{n_0 + (n_0 - n_1)(t_0/R_1)} ,$$
or
$$\frac{n_1}{t_1} + \frac{n_0}{t_0} = \frac{n_1 - n_0}{R_1} . \tag{4-9}$$
$$x_2 = x_1 + \frac{t_1}{n_1}\left[n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1}\right]
= x_1 + \frac{t_1}{n_1}\left[n_0\beta_0 - \left(\frac{n_1}{t_1} + \frac{n_0}{t_0}\right)x_1\right]
= \frac{t_1}{n_1}\left(n_0\beta_0 - \frac{n_0}{t_0}x_1\right) , \tag{4-10}$$
where we have used Eq. (4-9). Substituting for $x_1$ from Eq. (4-1), we obtain
$$\frac{x_2}{x_0} = -\frac{n_0 t_1}{n_1 t_0} . \tag{4-11a}$$
Writing $t_0$ and $t_1$ in terms of the ray heights and slope angles, we obtain
$$\frac{x_2}{x_0} = \frac{n_0\beta_0}{n_1\beta_1} . \tag{4-11b}$$
Except for notation, Eqs. (4-9) and (4-11a) are the same as Eqs. (2-4) and (2-12a),
respectively. Similarly, Eq. (4-11b) is the same as Eq. (2-17). Note that t0 = - S because
S is the distance of the object from the refracting surface, and t0 is the distance a ray
propagates from the object to the refracting surface.
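Equations (4-9) and (4-11a) can be verified by tracing rays from an off-axis object point. The
Python sketch below is an illustration under assumed values; the function names are hypothetical.
It locates the image plane from Eq. (4-9) and then traces two rays with different slopes to
confirm that they reach the same image height.

```python
def transfer(x, beta, t):
    """Transfer equation, Eq. (4-1)."""
    return x + t * beta


def refract(x, beta, n0, n1, r):
    """Refraction equation, Eq. (4-4)."""
    return (n0 * beta + (n0 - n1) * x / r) / n1


def image_by_trace(x0, beta0, t0, n0, n1, r):
    """Return the image distance t1 from Eq. (4-9) and the ray height x2 there."""
    t1 = n1 / ((n1 - n0) / r - n0 / t0)   # Eq. (4-9) solved for t1
    x1 = transfer(x0, beta0, t0)
    beta1 = refract(x1, beta0, n0, n1, r)
    return t1, transfer(x1, beta1, t1)


# Assumed example: object of height 10 at t0 = 200 in front of an n = 1.5 surface, R1 = 50.
t1, x2 = image_by_trace(10.0, 0.02, 200.0, 1.0, 1.5, 50.0)
_, x2_other = image_by_trace(10.0, -0.03, 200.0, 1.0, 1.5, 50.0)
print(t1, x2, x2_other)
print(-1.0 * t1 / (1.5 * 200.0))          # magnification x2/x0 from Eq. (4-11a)
```

Both rays arrive at the same height in the image plane, and the ratio of that height to the
object height matches Eq. (4-11a).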
In this section we illustrate how the cardinal points of a system may be determined
from its design parameters by using the transfer and refraction ray-tracing equations. We
also show how the cardinal points of a combination of two systems can be determined
from the cardinal points of the individual systems.
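[Figure 4-4 (image and caption not recovered): imaging system represented by its principal points
$H$ and $H'$, with object and image spaces of refractive indices $n$ and $n'$, a ray of height $x$ at
the principal planes through $Q$ and $Q'$, slope angles $\beta_0$ and $\beta_0'$, axial points $P_0$ and
$P_0'$, and distances $S$ and $S'$.]
$$\frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} , \tag{4-12}$$
where $n$ and $n'$ are the refractive indices of the object and image spaces, and $f$ and $f'$ are
the focal lengths of the system in those spaces. By multiplying both sides of Eq. (4-12) by
the height $x$ of the ray at the principal planes and noting that $\beta_0 = -x/S$ and $\beta_0' = -x/S'$,
we can write it in terms of the slope angles of the rays:
$$n'\beta_0' - n\beta_0 = -\frac{n'x}{f'} . \tag{4-13}$$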
As illustrated in Figure 4-5, the focal length of the system can be determined by
considering a ray incident parallel to the optical axis at a certain height and determining
the point of intersection $F'$ of the emergent ray with the optical axis. Thus, as $S \to -\infty$
or $\beta_0 \to 0$, then $P_0' \to F'$, the image-space focal point. Letting $\beta_0 = 0$ in Eq. (4-13), we
find that
$$f' = -x/\beta_0' . \tag{4-14}$$
If $x_j$ is the height of the emergent ray at the last surface of the system, then the distance $t$
of the focal point $F'$ from the vertex $V_j$ of the surface is given by (see Figure 4-5)
$$t = -\frac{x_j}{\beta_0'} . \tag{4-15}$$
Figure 4-5. Determination of focal point $F'$. Only the first and the last surface of the
system are shown.
Of course, the incident and the emergent rays intersect each other at a point in the
image-space principal plane whose intersection with the optical axis locates the corresponding
principal point $H'$. Because $H'F' = f'$, the point $H'$ is located once the focal point $F'$
and the focal length $f'$ are known. The distance of $H'$ from the vertex $V_j$ of the last
surface may be written
$$V_jH' = \frac{x - x_j}{\beta_0'} , \tag{4-16}$$
as is evident from Figure 4-5. The object-space focal point $F$ and the principal point $H$
can be determined in a similar manner by tracing a ray incident parallel to the axis from
right to left.
Figure 4-6. Combination of two systems. The quantity $t \equiv H_1'H_2$ represents the
separation of the principal planes of the two systems of focal lengths $f_1'$ and $f_2'$.
focal point $F'$ of the system as the point of intersection of the emergent ray with the
optical axis. From Eq. (4-13), the slope angle $\beta_1$ of the ray incident on the second system
is given by
$$\beta_1 = -x_1/f_1' . \tag{4-17}$$
$$x_2 = x_1 + t\beta_1 = x_1(1 - t/f_1') . \tag{4-18}$$
The slope angle $\beta_2$ of the ray emerging from the second system is given by
$$n_2\beta_2 = -x_1\left(\frac{n_1}{f_1'} + \frac{n_2}{f_2'} - \frac{n_2 t}{f_1' f_2'}\right) . \tag{4-19}$$
$$\beta_2 = -x_1/f' . \tag{4-20}$$
$$\frac{n_2}{f'} = \frac{n_1}{f_1'} + \frac{n_2}{f_2'} - \frac{n_2 t}{f_1' f_2'} , \tag{4-21a}$$
$$K = K_1 + K_2 - \frac{t}{n_1}K_1K_2 . \tag{4-21b}$$
The principal point $H'$ of the system is located by considering its distance from $H_2'$,
which is given by
$$H_2'H' = (x_1 - x_2)/\beta_2 = -t(\beta_1/\beta_2) = -t(f'/f_1') \tag{4-22a}$$
$$= -t\frac{n_2}{n_1}\frac{K_1}{K} . \tag{4-22b}$$
Similarly, it can be shown that the location of the principal point $H$ is given by
$$= -t\frac{n_0}{n_1}\frac{K_2}{K} . \tag{4-23b}$$
Equations (4-22) and (4-23) can be used to obtain equations for a thick lens (of which a
thin lens is a special case) or two thin lenses. However, we will use the results for a
refracting surface recursively to gain some additional insight.
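The combination formulas, Eqs. (4-21a), (4-21b), (4-22a), and (4-22b), can be cross-checked
against one another numerically. The Python sketch below uses a hypothetical function name and
assumed values, with the powers taken as $K_1 = n_1/f_1'$, $K_2 = n_2/f_2'$, and $K = n_2/f'$, which
is what makes Eq. (4-21b) equivalent to Eq. (4-21a).

```python
def combine(f1, f2, t, n1=1.0, n2=1.0):
    """Focal length f' and distance H2'H' of a combination of two systems.

    f1, f2: image-space focal lengths of the two systems
    t: separation H1'H2 of their principal planes
    n1, n2: refractive indices of the intermediate and image spaces
    """
    k1, k2 = n1 / f1, n2 / f2
    k = k1 + k2 - (t / n1) * k1 * k2          # Eq. (4-21b)
    f = n2 / k
    h2p_hp = -t * (n2 / n1) * (k1 / k)        # Eq. (4-22b)
    return f, h2p_hp


# Assumed example values.
f, h2p_hp = combine(f1=100.0, f2=50.0, t=30.0, n1=1.5, n2=1.0)
print(f, h2p_hp)
print(1.0 / (1.5 / 100.0 + 1.0 / 50.0 - 1.0 * 30.0 / (100.0 * 50.0)))  # Eq. (4-21a)
print(-30.0 * f / 100.0)                                               # Eq. (4-22a)
```

A negative value of $H_2'H'$ places the principal point $H'$ of the combination to the left of the
image-space principal point of the second system.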
Equation (4-21a) gives the focal length of the combined system in terms of the
separation $t \equiv H_1'H_2$ of the principal planes of the two individual systems. In a
microscope, a standardized quantity of interest is its tube length $L$. It represents the
separation $F_1'F_2$ of the focal planes of the individual systems, called the objective and the
eyepiece. The object-space focal point $F_2$ of the eyepiece lies to the right of the
image-space focal point $F_1'$ of the objective, as illustrated in Figure 4-7. Let $f_1'$ and $f_2'$ be the
image-space focal lengths of the objective and the eyepiece, respectively. From the
figure, we find that
$$L = t - f_1' + f_2 , \tag{4-24}$$
where $f_2$ is the object-space focal length of the eyepiece. Although $n_0$ may be different
from unity, as in an oil immersion microscope, both $n_1$ and $n_2$ are equal to unity. Thus,
$f_2 = -f_2'$, and Eq. (4-24) may be written
$$L = t - f_1' - f_2' . \tag{4-25}$$
Substituting for $t$ in terms of $L$ from Eq. (4-25) into Eq. (4-21a) and letting $n_1 = n_2 = 1$,
we obtain
$$f' = -\frac{f_1' f_2'}{L} . \tag{4-26}$$
Figure 4-7. Schematic of the principal and focal points of a microscope and its
objective and eyepiece.
In terms of powers $K_i = 1/f_i'$ of the objective and the eyepiece, we can write
$$K = -K_1K_2L . \tag{4-27}$$
In the case of a thin lens, the refraction of an incident ray takes place at its two
surfaces that have a negligible spacing between them. This is illustrated schematically in
Figure 4-8a. The ray starts at point $(x_0, \beta_0)$ in a medium of refractive index $n_0$ and travels a
distance $t_0$ to the first surface. The point of incidence is $(x_1, \beta_1)$ on the first surface and
$(x_2, \beta_2)$ on the second. The lens has a refractive index $n$ and a thickness $t_1$ that is
negligible. The ray ends at a height $x_3$ from the optical axis at a distance $t_2$ from the
second surface in a medium of refractive index $n_0$.
Figure 4-8. Ray tracing of a thin lens. (a) General case. (b) Object at infinity. (c)
Simplified ray tracing of a thin lens where the lens thickness $t_1$ is neglected.
Now, we apply Eqs. (4-1) and (4-4) recursively to obtain the focal length of a thin
lens of refractive index $n$ and spherical surfaces of radii of curvature $R_1$ and $R_2$ in a
medium of refractive index $n_0$. The paraxial ray-tracing equations for the lens to
determine its focal point $F'$ and focal length $f'$ may be written as follows (see Figure 4-8b):
$$x_1 = x_0 + t_0\beta_0 \tag{4-28a}$$
$$= x_0 , \tag{4-28b}$$
$$n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1} \tag{4-29a}$$
$$= (n_0 - n_1)\frac{x_0}{R_1} , \tag{4-29b}$$
$$x_2 = x_1 + t_1\beta_1 \tag{4-30a}$$
$$\simeq x_1 \tag{4-30b}$$
$$= x_0 , \tag{4-30c}$$
$$n_2\beta_2 = n_1\beta_1 + (n_1 - n_2)\frac{x_2}{R_2} , \tag{4-31}$$
$$n_2\beta_2 = \left(\frac{n_0 - n_1}{R_1} + \frac{n_1 - n_2}{R_2}\right)x_0 ,$$
$$x_3 = x_2 + t_2\beta_2 \tag{4-32}$$
$$= x_0 + t_2\beta_2 = 0 ,$$
and
$$\frac{n_2}{f'} \equiv \frac{n_2}{t_2} = \frac{n_1 - n_0}{R_1} + \frac{n_2 - n_1}{R_2} . \tag{4-33}$$
Except for the notation, Eq. (4-33) is the same as what may be obtained from Eqs. (2-56).
For a lens surrounded by the same medium on both sides, e.g., air or water, we let
$n_2 = n_0$ in Eq. (4-33) and obtain
$$\frac{1}{f'} = \frac{n_1 - n_0}{n_0}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) . \tag{4-34}$$
Substituting Eqs. (4-29a) and (4-30b) into Eq. (4-31) with $n_2 = n_0$, and utilizing Eq. (4-34),
we obtain
$$\beta_2 = \beta_0 - \frac{x_1}{f'} . \tag{4-35}$$
Referring to Figure 4-8c, where a ray incident on the lens is shown refracted by it in
one step (rather than in two, as in Figures 4-8a and 4-8b), the ray-tracing Eqs. (4-28a), (4-35),
and (4-32) for a thin lens of image-space focal length $f_1'$ may be written
$$x_1 = x_0 + t_0\beta_0 , \tag{4-36}$$
$$\beta_1 = \beta_0 - \frac{x_1}{f_1'} , \tag{4-37}$$
and
$$x_2 = x_1 + t_1\beta_1 . \tag{4-38}$$
Now we consider a thick lens of refractive index $n$, thickness $t$, and surfaces with
radii of curvature $R_1$ and $R_2$, and determine its focal length by recursive application of
the ray-tracing equations (4-1) and (4-4) for transfer and refraction at a refracting
surface, respectively. With reference to Figure 4-9 and noting that $n_0 = 1$, $n_1 = n$, and
$n_2 = 1$, we proceed as follows by considering a ray incident on the lens from left to right,
parallel to its axis, so that $\beta_0 = 0$:
$$x_1 = x_0 + t_0\beta_0 ,$$
$$n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1} , \tag{4-39}$$
$$n\beta_1 = (1 - n)\frac{x_0}{R_1} ,$$
$$x_2 = x_1 + t_1\beta_1 = \left(1 - t\frac{n - 1}{nR_1}\right)x_0 ,$$
$$n_2\beta_2 = n_1\beta_1 + (n_1 - n_2)\frac{x_2}{R_2} , \tag{4-40}$$
or
$$\beta_2 = \left[\frac{1 - n}{R_1} + \frac{n - 1}{R_2}\left(1 - t\frac{n - 1}{nR_1}\right)\right]x_0 ,$$
and
$$\frac{1}{f'} = -\frac{\beta_2}{x_0} ,$$
or
$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{t(n - 1)^2}{nR_1R_2} . \tag{4-41}$$
Figure 4-9. Ray tracing of a thick lens of refractive index $n$ and thickness $t$. $C_1$ and
$C_2$ are the centers of curvature of the surfaces of the lens with vertices $V_1$ and $V_2$
and radii of curvature $R_1$ and $R_2$, respectively.
Because the medium surrounding the lens is air, the refractive index of the image space is
unity. Thus, $f'$ is also the equivalent focal length of the lens. Equation (4-41) is the
lensmaker's formula. It reduces to Eq. (2-28) for the focal length of a thin lens when the
term containing the thickness $t$ is neglected. In that case, the thickness $t$ is kept as small as
possible so that the lens can be fabricated, yet the term containing it can be neglected.
Similarly, letting $x_2 = x_1$ and substituting for $\beta_1$ from Eq. (4-39) into Eq. (4-40), we
obtain Eq. (4-37) for a thin lens.
$$x_3 = x_2 + t_2\beta_2 = 0 ,$$
$$t_2 = -\frac{x_2}{\beta_2} ,$$
or
$$t_2 = f'\left(1 - t\frac{n - 1}{nR_1}\right) . \tag{4-42}$$
The quantity $t_2 \equiv V_2F'$, where $V_2$ is the vertex of the second surface, locates the focal
point $F'$ and represents the image-space focal distance. A positive value of $t_2$ implies
that the focal point $F'$ lies to the right of $V_2$. The principal point $H'$ is located by noting
that $H'F' = f'$. A positive value of $f'$ implies a converging or a positive lens, implying
that $H'$ lies to the left of $F'$. It lies at a distance
$$V_2H' = t_2 - f' = -tf'\frac{n - 1}{nR_1} . \tag{4-43}$$
The object-space focal point $F$ and the principal point $H$ can be determined in a
similar manner by considering a ray incident parallel to the axis from right to left. Thus,
we can show that the distance of the focal point $F$ from the vertex $V_1$ of the first surface is
given by
$$V_1F = f\left(1 + t\frac{n - 1}{nR_2}\right) , \tag{4-44}$$
where $f = -f'$ is the object-space focal length of the lens. The distance of the principal
point $H$ from the vertex $V_1$ is given by
$$V_1H = -tf'\frac{n - 1}{nR_2} , \tag{4-45}$$
and a positive value implies that $H$ lies to the right of $V_1$. The distance of $H'$ from $H$ is
given by
$$HH' = t - (V_1H + H'V_2) = t\left[1 - f'\frac{n - 1}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right] \tag{4-46a}$$
$$\simeq \frac{n - 1}{n}t \tag{4-46b}$$
$$= t/3 , \tag{4-46c}$$
where we have used the thin lens formula for the focal length in obtaining Eq. (4-46b)
and n = 1.5 in further obtaining Eq. (4-46c). Thus, unless the lens is very thick, the
separation of its principal points is approximately equal to one-third of its thickness
independent of its radii of curvature. As expected in the limit of a thin lens, the principal
points coincide with its center.
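The thick-lens results can be reproduced by applying the transfer and refraction equations at the
two surfaces in turn. The Python sketch below is a minimal illustration with hypothetical function
names and assumed lens data; it compares the traced focal length and focal distance with
Eqs. (4-41) and (4-42), and evaluates the principal-point separation of Eq. (4-46a).

```python
def transfer(x, beta, t):
    """Transfer equation, Eq. (4-1)."""
    return x + t * beta


def refract(x, beta, n0, n1, r):
    """Refraction equation, Eq. (4-4)."""
    return (n0 * beta + (n0 - n1) * x / r) / n1


def thick_lens_trace(n, r1, r2, t, x0=1.0):
    """Trace a ray parallel to the axis through a thick lens in air.

    Returns the focal length f' and the image-space focal distance t2 = V2F'.
    """
    beta1 = refract(x0, 0.0, 1.0, n, r1)   # first surface, Eq. (4-39)
    x2 = transfer(x0, beta1, t)            # propagation through the lens
    beta2 = refract(x2, beta1, n, 1.0, r2) # second surface, Eq. (4-40)
    return -x0 / beta2, -x2 / beta2


# Assumed example: equiconvex lens with n = 1.5, R1 = 50, R2 = -50, t = 10.
n, r1, r2, t = 1.5, 50.0, -50.0, 10.0
f, t2 = thick_lens_trace(n, r1, r2, t)
f_closed = 1.0 / ((n - 1) * (1 / r1 - 1 / r2) + t * (n - 1) ** 2 / (n * r1 * r2))  # Eq. (4-41)
t2_closed = f_closed * (1 - t * (n - 1) / (n * r1))                                # Eq. (4-42)
hh = t * (1 - f * (n - 1) / n * (1 / r1 - 1 / r2))                                 # Eq. (4-46a)
print(f, f_closed, t2, t2_closed, hh, t / 3)
```

For this assumed lens, the separation of the principal points from Eq. (4-46a) is close to, but
not exactly, one-third of the thickness, consistent with the approximation made in Eq. (4-46b).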
Figure 4-10. The principal and focal points of a thick lens of increasing thickness.
The magnitudes of the radii of curvature of its two surfaces are assumed to be equal
in the figure. (a) Thin lens. (b) Thick lens. (c) Concentric lens. (d) Thick lens such
that the image-space focal point $F'$ lies at the back vertex $V_2$. (e) Afocal thick lens.
(f) Convex thick lens with a negative image-space focal length $f'$.
In Section 2.4.9, we considered a system with two thin lenses in air spaced a certain
distance apart, and determined its focal length as well as its principal and focal points.
We now revisit this problem by way of ray tracing.
Consider two thin lenses $L_1$ and $L_2$ of image-space focal lengths $f_1'$ and $f_2'$
separated by a distance $t_1$, as illustrated in Figure 4-11. Using Eqs. (4-36) and (4-37)
recursively, we can obtain the focal points and the principal points of the combined
imaging system as follows:
$$x_1 = x_0 + t_0\beta_0 ,$$
$$\beta_1 = \beta_0 - \frac{x_1}{f_1'} = -\frac{x_0}{f_1'} ,$$
$$x_2 = x_1 + t_1\beta_1 = x_0\left(1 - \frac{t_1}{f_1'}\right) ,$$
Figure 4-11. Ray tracing of a two-lens system to determine its image-space focal
point $F'$ and principal point $H'$.
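$$\beta_2 = \beta_1 - \frac{x_2}{f_2'} = -x_0\left(\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1' f_2'}\right) ,$$
$$\frac{1}{f'} = -\frac{\beta_2}{x_0} ,$$
or
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1' f_2'} , \tag{4-47}$$
$$x_3 = x_2 + t_2\beta_2 = 0 ,$$
and
$$t_2 = -\frac{x_2}{\beta_2} ,$$
or
$$t_2 = f'\left(1 - \frac{t_1}{f_1'}\right) . \tag{4-48}$$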
Equation (4-47) may also be obtained from Eq. (4-21a) by letting the refractive index of
the object and image spaces of the system be equal to unity, i.e., by letting n1 = n2 = 1 .
The quantity $t_2$, called the image-space focal distance, locates the image-space focal
point $F'$. The principal point $H'$ is located by noting that $H'F' = f'$. The object-space
focal point $F$ and principal point $H$ can be determined in a similar manner by considering
a ray incident parallel to the axis from right to left. We find that $F$ lies at a distance
$f(1 - t_1/f_2')$ from lens $L_1$, where $f = -f'$ because the lenses are in air. Such a distance
of the focal point $F$ from the vertex of the first element of a system is called its
object-space focal distance.
It is easy to see from Eqs. (4-47) and (4-48) that if $t_1 = f_1' + f_2'$, then $f' \to \infty$ and,
therefore, $t_2 \to \infty$. Thus, the system is afocal (as in a Keplerian or a Galilean telescope
discussed later in Chapter 6), and the focal point $F'$ lies at infinity on the right-hand side
of the system. The principal point $H'$ lies to the left-hand side of the lens $L_2$ at a
distance $f' - t_2 = f't_1/f_1' \to \infty$, i.e., it lies at infinity on the left-hand side of the system.
Similarly, we can show that the principal point $H$ and the focal point $F$ lie at infinity on
the right-hand and left-hand sides of the system, respectively. If $t_1 < f_1' + f_2'$, then the
system has a positive focal length. If, however, $t_1 > f_1' + f_2'$, then the system has a
negative focal length.
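The behavior described above can be explored with the thin-lens ray-tracing equations,
Eqs. (4-36) to (4-38). The Python sketch below is illustrative: the function names, the tolerance
used to detect the afocal case, and the sample focal lengths are all assumptions.

```python
import math


def thin_lens(x, beta, f):
    """Thin-lens refraction, Eq. (4-37): slope after a lens of focal length f."""
    return beta - x / f


def two_lens_system(f1, f2, t1, x0=1.0):
    """Trace a ray parallel to the axis through two thin lenses in air.

    Returns the focal length f' and the image-space focal distance t2;
    both are reported as infinite when the system is afocal.
    """
    beta1 = thin_lens(x0, 0.0, f1)
    x2 = x0 + t1 * beta1                  # transfer between the lenses, Eq. (4-38)
    beta2 = thin_lens(x2, beta1, f2)
    if abs(beta2) < 1e-15:                # emergent ray parallel to the axis
        return math.inf, math.inf
    return -x0 / beta2, -x2 / beta2


# Assumed example: f1' = 100, f2' = 50, for three spacings t1.
for t1 in (30.0, 150.0, 200.0):
    print(t1, two_lens_system(100.0, 50.0, t1))
```

The three assumed spacings illustrate the three regimes in the text: $t_1 < f_1' + f_2'$ gives a
positive focal length, $t_1 = f_1' + f_2'$ gives an afocal system, and $t_1 > f_1' + f_2'$ gives a
negative focal length.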
If lens $L_1$ is placed at the front focal point $F_2$ of lens $L_2$, i.e., if $t_1 = f_2'$, then
$f' = f_2'$, and the front focal point $F$ of the system coincides with $F_2$. Because the height
of the image of a certain point object is determined by the object ray passing through the
front focal point of the imaging system (see Section 2.4.6), we find that it is the same for
imaging by the doublet as it is for imaging by lens $L_2$ alone. This is why it is desirable to
place the spectacle lenses with different corrections for the two eyes in the front focal
plane of the eyes; otherwise, the images on the retinas will have different magnifications.
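The statement that $f' = f_2'$ when $t_1 = f_2'$ follows directly from Eq. (4-47) and is easy to confirm numerically. In the sketch below, the helper name `system_focal_length` and all numerical values (a nominal 17 mm focal length standing in for the eye and three spectacle focal lengths) are hypothetical choices made only for illustration.

```python
def system_focal_length(f1, f2, t1):
    """Focal length of two thin lenses in air, Eq. (4-47)."""
    return 1.0 / (1.0 / f1 + 1.0 / f2 - t1 / (f1 * f2))


f_eye = 17.0  # hypothetical focal length of lens L2 (mm)

# Place L1 (the spectacle lens) in the front focal plane of L2: t1 = f2'.
results = []
for f_spectacle in (-500.0, 250.0, 1000.0):  # hypothetical corrections (mm)
    f_sys = system_focal_length(f_spectacle, f_eye, t1=f_eye)
    results.append(f_sys)
    print(f_spectacle, f_sys)
```

Whatever the correction, the system focal length printed in the second column stays equal to `f_eye`, which is why the image magnification does not change between the two eyes.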
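\[ x_1 = x_0 + t_0\beta_0. \tag{4-49} \]
\[ \theta' = -\theta. \tag{4-50} \]
[Figure: reflection of a ray by a spherical mirror with vertex $V$, center of curvature $C$, radius of curvature $R_1$, and focal point $F'$. The ray passes through the points $A_0$, $A_1$, and $A_2$; the labels show the ray heights $x_0$, $x_1$, $x_2$, the angles $\beta_0$, $(-)\beta_1$, $\theta$, $(-)\theta'$, $(-)\phi$, and the distances $t_0$, $(-)t_1$, and $f_1'$. Diagram not reproduced.]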
\[ \beta_0 - \beta_1 = \theta - \theta' = 2\theta, \tag{4-51a} \]
\[ \beta_1 = \phi - \theta, \tag{4-51b} \]
and
\[ \phi = -\frac{x_1}{R_1}, \tag{4-51c} \]
where $\phi$ is the angle the surface normal makes with the optical axis. Note that $\beta_1$, $\theta'$, and
$\phi$ are all numerically negative angles in the figure. Substituting for $\theta$ from Eq. (4-51b)
and for $\phi$ from Eq. (4-51c) into Eq. (4-51a), we obtain
\[ \beta_1 = -\beta_0 - \frac{2x_1}{R_1}. \tag{4-52} \]
\[ x_2 = x_1 + t_1\beta_1. \tag{4-53} \]
Note that $t_1$ is numerically negative in this equation because the rays are propagating
from right to left as they travel from $A_1$ to $A_2$. Therefore, the quantity $t_1\beta_1$ is
numerically positive.
Figure 4-13. Ray tracing of a reflecting surface. (a) General case. (b) Determination
of the focal point.
If we let $\beta_0 = 0$, corresponding to a ray incident parallel to the optical axis, and let
$x_2 = 0$, corresponding to the intersection of the reflected ray with the optical axis, as
illustrated in Figure 4-13b, then the corresponding value of $t_1$ gives the focal length of
the mirror. Letting $\beta_0 = 0$ in Eqs. (4-49) and (4-52), and $x_2 = 0$ in Eq. (4-53), we find
that the focal length of the mirror is given by
\[ f_1' = \frac{R_1}{2}, \tag{4-54} \]
in agreement with Eq. (3-6). The reflecting power of the mirror is given by
\[ K \equiv \frac{n'}{f'} = -\frac{1}{f_1'}. \tag{4-55} \]
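As a quick numerical check of Eqs. (4-52) to (4-55), the sketch below traces a ray parallel to the axis onto a mirror and finds where the reflected ray crosses the axis. The function names and the radius of curvature are assumptions chosen for illustration; a negative $R_1$ is used so that the center of curvature lies to the left of the vertex (a concave mirror for light incident from the left).

```python
def reflect(x1, beta0, R1):
    """Slope of the reflected ray, Eq. (4-52)."""
    return -beta0 - 2.0 * x1 / R1


def mirror_focal_length(R1, x0=1.0):
    """Focal length from a trace with beta0 = 0 and x2 = 0 (Figure 4-13b)."""
    x1 = x0                       # Eq. (4-49) with beta0 = 0
    beta1 = reflect(x1, 0.0, R1)  # Eq. (4-52)
    t1 = -x1 / beta1              # Eq. (4-53) solved for x2 = 0
    return t1


R1 = -20.0                        # assumed radius of curvature (cm)
f1 = mirror_focal_length(R1)
K = -1.0 / f1                     # reflecting power, Eq. (4-55)
print(f1, R1 / 2.0, K)
```

The traced distance `f1` equals `R1 / 2`, in agreement with Eq. (4-54), and the reflecting power is positive for this concave mirror.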
\[ x_1 = x_0 + t_0\beta_0 = x_0, \]
\[ \beta_1 = -\beta_0 - \frac{2x_1}{R_1} = -\frac{2x_1}{R_1}, \]
\[ x_2 = x_1 + t_1\beta_1 = x_0\left(1 - \frac{2t_1}{R_1}\right), \tag{4-56} \]
Figure 4-14. Ray tracing of a two-mirror system to determine its focal point $F'$ and
principal point $H'$.
4.8 Two-Mirror System 169
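\[ \beta_2 = -\beta_1 - \frac{2x_2}{R_2} = 2x_0\left[\frac{1}{R_1} - \frac{1}{R_2}\left(1 - \frac{2t_1}{R_1}\right)\right], \]
\[ \frac{1}{f'} = -\frac{\beta_2}{x_0}, \tag{4-57} \]
or
\[ \frac{1}{f'} = -2\left(\frac{1}{R_1} - \frac{1}{R_2} + \frac{2t_1}{R_1 R_2}\right), \tag{4-58} \]
\[ x_3 = x_2 + t_2\beta_2, \]
\[ t_2 = -\frac{x_2}{\beta_2}, \]
or
\[ t_2 = f'\left(1 - \frac{2t_1}{R_1}\right). \tag{4-59} \]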
The quantity $t_2$ locates the image-space focal point $F'$ and represents its distance from
$M_2$, called the image-space focal distance of the system. The principal point $H'$ is
located by noting that $H'F' = f'$. A positive value of $f'$ implies that $F'$ lies to the right
of $H'$ at a distance $f'$ from it. Similarly, by considering a ray incident parallel to the
optical axis from right to left, the location of the object-space focal point $F$ and the
principal point $H$ can be determined. We find that $F$ lies at a distance $f(1 - 2t_1/R_2)$ from
$M_1$, where $f = -f'$ is the object-space focal length of the system. This distance is the
object-space focal distance of the system.
Letting $f_1' = R_1/2$ and $f_2' = R_2/2$ denote the focal lengths of the mirrors, Eq. (4-58)
for the focal length of the system can be written
\[ \frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1' f_2'}. \tag{4-60} \]
This result may also be obtained from Eq. (4-21a) by letting $n_1 = -1$ (representing the
refractive index associated with the ray reflected by $M_1$) and $n_2 = 1$ (representing the
refractive index associated with the ray reflected by $M_2$). In terms of the equivalent focal
lengths of the mirrors, $f_{e1} = -R_1/2$ and $f_{e2} = R_2/2$, defined by Eqs. (3-9), Eq. (4-60)
can also be written
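\[ \frac{1}{f'} = \frac{1}{f_{e1}} + \frac{1}{f_{e2}} + \frac{t_1}{f_{e1} f_{e2}}, \tag{4-61} \]
where the sign of the last term follows from substituting $f_1' = -f_{e1}$ and $f_2' = f_{e2}$ into Eq. (4-60), with $t_1$ numerically negative.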
We note that $f' \to \infty$ if $t_1 = f_1' - f_2'$, i.e., the system becomes afocal if the mirrors are
confocal (i.e., if they have a common focus). If the magnitude of the spacing between the
mirrors is smaller (than that for the afocal setting), then the system has a positive focal
length, and it is called a Cassegrain telescope. If it is larger, then the focal length of the
system is negative, and it is called a Gregorian telescope. Both telescopes form a real
image of an object lying at infinity, as illustrated in Figure 3-8.
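The two-mirror results can be verified with a short paraxial trace. The sketch below is a hedged illustration: the function name `trace_two_mirror` and the radii and spacing are hypothetical values, chosen only so that the numbers are easy to follow. It compares the traced focal length and image-space focal distance with Eqs. (4-59) and (4-60) and checks the afocal condition $t_1 = f_1' - f_2'$.

```python
def trace_two_mirror(R1, R2, t1, x0=1.0):
    """Trace a ray with beta0 = 0 through mirrors M1 and M2."""
    x1 = x0
    beta1 = -2.0 * x1 / R1             # reflection at M1, Eq. (4-52) with beta0 = 0
    x2 = x1 + t1 * beta1               # transfer to M2 (t1 is numerically negative)
    beta2 = -beta1 - 2.0 * x2 / R2     # reflection at M2
    f_sys = -x0 / beta2                # Eq. (4-57)
    t2 = -x2 / beta2                   # focal distance measured from M2
    return f_sys, t2


# Hypothetical mirror data (arbitrary length units).
R1, R2, t1 = -20.0, -7.0, -7.5
f1, f2 = R1 / 2.0, R2 / 2.0

f_sys, t2 = trace_two_mirror(R1, R2, t1)
f_eq460 = 1.0 / (-1.0 / f1 + 1.0 / f2 - t1 / (f1 * f2))
t2_eq459 = f_eq460 * (1.0 - 2.0 * t1 / R1)

# Afocal setting: the reciprocal focal length vanishes when t1 = f1' - f2'.
t1_afocal = f1 - f2
power_afocal = -1.0 / f1 + 1.0 / f2 - t1_afocal / (f1 * f2)

print(f_sys, f_eq460)
print(t2, t2_eq459)
print(t1_afocal, power_afocal)
```

For these values the trace gives a positive system focal length that is several times longer than the focal length of the first mirror alone.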
4.8.2 Obscuration
It should be evident from Figure 4-14 that the central portion of a bundle of rays
incident on the primary mirror $M_1$ is blocked by the secondary mirror $M_2$. Thus, the
image-forming beam is hollow on the inside. It is said to be centrally obscured in the case
of an axial point object lying at infinity. The ratio of the heights of the innermost to the
outermost rays of the image-forming light cone is called the obscuration ratio of the
system.
From Eq. (4-56), the obscuration ratio of the image-forming beam converging to
the image point at $F'$ is given by
\[ \epsilon \equiv \frac{x_2}{x_1} = 1 - \frac{t_1}{f_1'}. \tag{4-62} \]
\[ x_3 = x_2 + t_2\beta_2 = x_1\left[1 + t_1\left(\frac{1}{f'} - \frac{1}{f_1'}\right)\right]. \tag{4-63} \]
Figure 4-15. Axial ray tracing of a two-mirror system, illustrating its obscuration
ratio $\epsilon = x_2/x_1$ and the radius $x_3$ of the hole in the primary mirror $M_1$.
By tracing an off-axis ray, as illustrated in Figure 4-16, we can determine how the
field of view of a system affects the values of $x_2$ and $x_3$. Consider a ray incident at an
angle $\beta_0$ at a height $x_1$ on $M_1$, representing an outermost ray from an object point at
infinity making an angle $\beta_0$ with the optical axis. The equations for tracing this ray are:
\[ \beta_1 = -\beta_0 - \frac{x_1}{f_1'}, \]
\[ x_2 = x_1 + t_1\beta_1 = -\beta_0 t_1 + x_1\left(1 - \frac{t_1}{f_1'}\right), \tag{4-64} \]
Figure 4-16. Off-axis ray tracing of a two-mirror system, illustrating the increase in
radius of the secondary mirror $M_2$ and the hole in the primary mirror $M_1$. The
dashed axial ray is shown for comparison. The heights of the images formed by $M_1$
and the system are $h_1'$ and $h'$, respectively.
\[ \beta_2 = -\beta_1 - \frac{x_2}{f_2'} = \beta_0\left(1 + \frac{t_1}{f_2'}\right) - \frac{x_1}{f'}, \]
and (with $t_2 = -t_1$)
\[ x_3 = x_2 + t_2\beta_2 = -\beta_0 t_1\left(2 + \frac{t_1}{f_2'}\right) + x_1\left[1 + t_1\left(\frac{1}{f'} - \frac{1}{f_1'}\right)\right]. \tag{4-65} \]
Comparing Eq. (4-64) with Eq. (4-62), we find that the radius of $M_2$ increases by $-\beta_0 t_1$,
which, in turn, increases the obscuration ratio. Similarly, comparing Eq. (4-65) with Eq.
(4-63), we find that the radius of the hole in $M_1$ increases by $-\beta_0 t_1(2 + t_1/f_2')$. Good image
quality is generally obtained for only very small values of the field angle $\beta_0$ (a few
degrees) due to the rapid increase of aberrations with it. Thus, the approximate results are
reasonably accurate for a preliminary design of the system. The precise results in the final
stages of a design are obtained by exact ray tracing using a computer-based code.
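The effect of the field angle on the secondary-mirror radius and on the hole in the primary can be sketched numerically. The code below traces the outermost ray and compares the result with the closed forms of Eqs. (4-62) to (4-65). The function name `trace_outer_ray`, the mirror focal lengths, the spacing, and the field angle are all hypothetical values used only for illustration.

```python
def trace_outer_ray(x1, beta0, f1, f2, t1):
    """Heights of a ray on M2 (x2) and back in the plane of M1 (x3)."""
    beta1 = -beta0 - x1 / f1       # reflection at M1
    x2 = x1 + t1 * beta1           # transfer to M2
    beta2 = -beta1 - x2 / f2       # reflection at M2
    x3 = x2 + (-t1) * beta2        # transfer back to the plane of M1 (t2 = -t1)
    return x2, x3


# Hypothetical system: focal lengths f1', f2' and spacing t1 (arbitrary units).
f1, f2, t1 = -10.0, -3.5, -7.5
f_sys = 1.0 / (-1.0 / f1 + 1.0 / f2 - t1 / (f1 * f2))   # Eq. (4-60)
x1 = 1.0                                                # radius of the primary

# Axial ray bundle (beta0 = 0): obscuration ratio and hole radius.
x2_axial, x3_axial = trace_outer_ray(x1, 0.0, f1, f2, t1)
ratio_eq462 = 1.0 - t1 / f1
hole_eq463 = x1 * (1.0 + t1 * (1.0 / f_sys - 1.0 / f1))

# Off-axis bundle with a small field angle beta0.
beta0 = 0.005
x2_field, x3_field = trace_outer_ray(x1, beta0, f1, f2, t1)
x2_eq464 = -beta0 * t1 + x1 * (1.0 - t1 / f1)
x3_eq465 = -beta0 * t1 * (2.0 + t1 / f2) + hole_eq463

print(x2_axial / x1, ratio_eq462)
print(x3_axial, hole_eq463)
print(x2_field, x2_eq464)
print(x3_field, x3_eq465)
```

Both $x_2$ and $x_3$ grow linearly with the field angle, which is why the secondary mirror and the hole in the primary must be sized for the full field of view rather than for the axial bundle alone.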
Finally, we consider a catadioptric system consisting of a thin lens of focal length $f_l'$
and a concave mirror of radius of curvature $R$ (and, therefore, focal length $f_m' = R/2$)
separated by a distance $t$, as illustrated in Figure 4-17, and determine its focal length. The
results obtained are applied to a Schmidt camera, which consists of a spherical mirror and
a corrector plate placed at its center of curvature. The primary purpose of the plate is to
correct the spherical aberration of the mirror. However, it also has a small focus term and
thus acts like a (weak) lens.
Applying the ray-tracing equations (4-36) and (4-37) for a thin lens and Eqs. (4-49)
and (4-52) for a mirror, we obtain the focal length $f_s'$ of the system as follows:
\[ x_1 = x_0 + t_0\beta_0, \]
\[ \beta_1 = \beta_0 - \frac{x_1}{f_l'} = -\frac{x_0}{f_l'}, \]
\[ x_2 = x_1 + t_1\beta_1 = x_0\left(1 - \frac{t}{f_l'}\right), \]
4.9 Catadioptric System: Thin-Lens–Mirror Combination 173
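[Figure 4-17: catadioptric system consisting of a thin lens $L$ and a mirror $M$ (vertex $V$, center of curvature $C$, focal point $F_m'$) separated by a distance $t$, showing the ray heights $x_0$, $x_1$, $x_2$, $x_3$, the slope $\beta_2$, the system focal point $F'$ and principal point $H'$, the lens focal point $F_l'$, and the distances $(-)f_m'$, $(-)t_2$, $(-)R$, $(-)f_s'$, and $f_l'$. Diagram not reproduced.]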
\[ \beta_2 = -\beta_1 - \frac{x_2}{f_m'} = x_0\left[\frac{1}{f_l'} - \frac{1}{f_m'}\left(1 - \frac{t}{f_l'}\right)\right] = -\frac{x_0}{f_s'}, \]
where the negative sign in the last step accounts for the fact that $\beta_2$ is numerically
positive, whereas $f_s'$ is numerically negative in the figure. Thus, the focal length of the
system is given by
\[ \frac{1}{f_s'} = -\left(\frac{1}{f_l'} - \frac{1}{f_m'} + \frac{t}{f_l' f_m'}\right). \tag{4-66} \]
The focusing power $K_s$ and the equivalent focal length $f_e$ of the system are given by
\[ K_s \equiv \frac{n_s'}{f_s'} = -\frac{1}{f_s'} = \frac{1}{f_e}, \tag{4-67} \]
where $n_s' = -1$ is the refractive index of the image space of the system. The distance $t_2$
of the focal point $F'$ from the vertex $V$ of the mirror is given by
\[ x_3 = x_2 + t_2\beta_2. \]
Thus,
\[ t_2 = -\frac{x_2}{\beta_2}, \]
or
\[ t_2 = f_s'\left(1 - \frac{t}{f_l'}\right). \tag{4-68} \]
\[ \frac{1}{f_s'} = \frac{1}{f_l'} + \frac{1}{f_m'} \tag{4-69} \]
and
\[ t_2 = f_s'\left(1 + \frac{2f_m'}{f_l'}\right). \tag{4-70} \]
There is only one principal point and one focal point. Thus, the reference point for
both object and image distances is either the principal point $H'$ or the focal point $F'$,
depending on whether the Gaussian or the Newtonian imaging equation is used. It should
be noted that if the lens is placed close to the mirror, then the rays reflected by the mirror
are refracted by the lens before a final image is formed (see Problem 4.1).
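The lens-mirror trace can be checked numerically in the same way. In the sketch below, the function name `trace_lens_mirror` and the numerical values are assumptions introduced for illustration. The lens is placed at the center of curvature of the mirror, i.e., at $t = -R = -2f_m'$, which is the geometry under which Eqs. (4-66) and (4-68) reduce to Eqs. (4-69) and (4-70).

```python
def trace_lens_mirror(f_l, f_m, t, x0=1.0):
    """Trace a ray with beta0 = 0 through a thin lens and then a mirror."""
    x1 = x0
    beta1 = -x1 / f_l              # thin-lens refraction
    x2 = x1 + t * beta1            # transfer from the lens to the mirror
    beta2 = -beta1 - x2 / f_m      # reflection, with f_m' = R/2
    f_s = -x0 / beta2              # system focal length
    t2 = -x2 / beta2               # focal distance from the mirror vertex V
    return f_s, t2


# Hypothetical data: weak lens, concave mirror, lens at the center of curvature.
f_l = 200.0
R = -20.0
f_m = R / 2.0
t = -R

f_s, t2 = trace_lens_mirror(f_l, f_m, t)

f_s_eq466 = 1.0 / (-(1.0 / f_l - 1.0 / f_m + t / (f_l * f_m)))
t2_eq468 = f_s_eq466 * (1.0 - t / f_l)
f_s_eq469 = 1.0 / (1.0 / f_l + 1.0 / f_m)
t2_eq470 = f_s_eq469 * (1.0 + 2.0 * f_m / f_l)

print(f_s, f_s_eq466, f_s_eq469)
print(t2, t2_eq468, t2_eq470)
```

With a weak lens the system focal length stays close to that of the mirror alone, and the focal point shifts only slightly from the mirror focus.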
As shown in Sections 2.4.3 and 3.2.3, the Lagrange invariant, which is the product of
the slope angle of a ray from an axial point object, object height, and the refractive index
of the object space, is invariant upon refraction or reflection by a surface, and thus for a
system consisting of any number of such surfaces. Now we consider this invariant in
terms of the heights and slopes of two arbitrary rays incident on the system. We show
how this invariant reduces to that for finite or infinite conjugates. We also show that the
slope and the height of any other ray incident on the system can be obtained anywhere in
space as a linear combination of the slopes and heights of the other two in that space.
Consider, as illustrated in Figure 4-18, two linearly independent rays (such that one
is not a scaled version of the other) incident at heights $x_0$ and $x$ with slope angles $\beta_0$ and
$\beta$ on a refracting surface of radius of curvature $R$ separating media of refractive indices $n$
and $n'$. From Eq. (4-4), the slope angles $\beta_0'$ and $\beta'$ of the corresponding refracted rays
are given by
\[ n'\beta_0' = n\beta_0 + (n - n')\frac{x_0}{R} \tag{4-71} \]
4.10 Two-Ray Lagrange Invariant 175
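[Figure 4-18: two rays refracted by a spherical surface of radius of curvature $R$ (vertex $V$, center of curvature $C$, focal point $F'$, focal length $f'$) separating media of refractive indices $n$ and $n'$, showing the object points $P_0$ and $P$, the image points $P_0'$ and $P'$, the ray heights $x_0$, $x$, $x_0'$, $x'$, the slope angles $\beta_0$, $\beta$, $(-)\beta_0'$, $(-)\beta'$, and the object and image heights $(-)h$ and $h'$. Diagram not reproduced.]
and
\[ n'\beta' = n\beta + (n - n')\frac{x}{R}. \tag{4-72} \]
Multiplying Eq. (4-71) by $x$ and Eq. (4-72) by $x_0$, and subtracting, we obtain
\[ n'(\beta_0' x - \beta' x_0) = n(\beta_0 x - \beta x_0), \tag{4-73} \]
showing that the quantity $n(\beta_0 x - \beta x_0)$, called the two-ray Lagrange invariant, is
invariant upon refraction of the rays. If we let $x_0'$ and $x'$ be the heights of the rays in a
plane at a distance $t$ from the refracting surface, we find from Eq. (4-1) that
\[ x_0' = x_0 + t\beta_0' \tag{4-74} \]
and
\[ x' = x + t\beta'. \tag{4-75} \]
Multiplying Eq. (4-74) by $\beta'$ and Eq. (4-75) by $\beta_0'$, and subtracting, we obtain
\[ n'(\beta_0' x' - \beta' x_0') = n'(\beta_0' x - \beta' x_0). \tag{4-76} \]
Thus, the quantity $n'(\beta_0' x - \beta' x_0)$ remains invariant upon transfer of the rays from one
plane to another. From Eqs. (4-73) and (4-76) we find that
\[ n'(\beta_0' x' - \beta' x_0') = n(\beta_0 x - \beta x_0), \tag{4-77} \]
showing the equality of the Lagrange invariant in the object and image spaces. Thus, the
two-ray Lagrange invariant remains the same throughout the optical system, including the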
object and image spaces. This invariant relation applies to a multisurface system as well,
(as may be seen by placing another refracting surface at some distance from the existing
refracting surface), in which case the right- and left-hand sides of Eq. (4-77) refer to its
object and image spaces, respectively.
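The invariance of $n(\beta_0 x - \beta x_0)$ under refraction and transfer can be confirmed with a few lines of code. The sketch below uses the refraction relation of Eqs. (4-71) and (4-72) and a simple transfer; the function names and all numerical values are assumptions chosen for illustration.

```python
def refract(x, beta, n, n_prime, R):
    """Slope after refraction: n' beta' = n beta + (n - n') x / R."""
    return (n * beta + (n - n_prime) * x / R) / n_prime


def transfer(x, beta, t):
    """Ray height after propagating a distance t."""
    return x + t * beta


def lagrange(n, x0, beta0, x, beta):
    """Two-ray Lagrange invariant n (beta0 x - beta x0)."""
    return n * (beta0 * x - beta * x0)


# Assumed surface and ray data.
n, n_prime, R, t = 1.0, 1.5, 25.0, 40.0
x0, beta0 = 2.0, 0.0      # first ray
x, beta = -1.0, 0.03      # second ray

L_object = lagrange(n, x0, beta0, x, beta)

beta0_p = refract(x0, beta0, n, n_prime, R)
beta_p = refract(x, beta, n, n_prime, R)
L_refracted = lagrange(n_prime, x0, beta0_p, x, beta_p)

x0_p = transfer(x0, beta0_p, t)
x_p = transfer(x, beta_p, t)
L_transferred = lagrange(n_prime, x0_p, beta0_p, x_p, beta_p)

print(L_object, L_refracted, L_transferred)
```

The three printed values coincide, and repeating the refraction and transfer steps for further surfaces leaves the value unchanged.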
Now we utilize the two-ray Lagrange invariant to show that if the slope and height of
any third ray are known in a certain space, they can be obtained in any other space,
without tracing it, as a linear combination of the values of the two rays in that other
space. Let the slopes of the three rays in a certain space of refractive index $n$ be $\beta_1$, $\beta_2$,
and $\beta_3$, and let their heights in a certain plane in that space be $x_1$, $x_2$, and $x_3$,
respectively. Suppose that two of the rays have been traced such that their heights and
slopes $(x_1', \beta_1')$ and $(x_2', \beta_2')$ in another space are known. We show that the height and
slope $(x_3', \beta_3')$ of the third ray in that space can be determined without tracing it from the
heights and slopes of the other two.
[Figure: two rays refracted by a spherical surface of radius of curvature $R$ (vertex $V$, center of curvature $C$, focal point $F'$, focal length $f'$) separating media of refractive indices $n$ and $n'$ for an object at infinity, with $\beta_0 = 0$, ray heights $x_0$ and $x$, slope angles $\beta$, $(-)\beta_0'$, $(-)\beta'$, and image height $h'$. Diagram not reproduced.]
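The two-ray Lagrange invariant in the plane for which the heights and slopes of the
rays are known can be calculated according to
\[ L_{13} = n(\beta_1 x_3 - \beta_3 x_1) \tag{4-78a} \]
and
\[ L_{23} = n(\beta_2 x_3 - \beta_3 x_2). \tag{4-78b} \]
Using primes for the corresponding quantities in another space, the Lagrange invariant in
terms of the quantities in this space may be written
\[ L_{13} = n'(\beta_1' x_3' - \beta_3' x_1') \tag{4-79a} \]
and
\[ L_{23} = n'(\beta_2' x_3' - \beta_3' x_2'), \tag{4-79b} \]
respectively. From Eqs. (4-79), we find that the height and the slope of the third ray in the
other space are given by
\[ x_3' = \frac{L_{13}x_2' - L_{23}x_1'}{L_{12}} \tag{4-80} \]
and
\[ \beta_3' = \frac{L_{13}\beta_2' - L_{23}\beta_1'}{L_{12}}, \tag{4-81} \]
where $L_{12} = n'(\beta_1' x_2' - \beta_2' x_1')$ is the Lagrange invariant of the two traced rays.
Because the quantities on the right-hand sides of Eqs. (4-80) and (4-81) are known, the
height and slope $(x_3', \beta_3')$ of the third ray can be determined without actually tracing it.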
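The linear-combination result can be tested against a direct trace. The sketch below assumes the convention $L_{ij} = n(\beta_i x_j - \beta_j x_i)$ for the invariant of rays $i$ and $j$, which is consistent with Eq. (4-81), and uses a hypothetical system consisting of a thin lens in air followed by a gap. The function names and all values are illustrative assumptions.

```python
def lens_then_gap(x, beta, f, t):
    """Thin-lens refraction followed by a transfer through a distance t."""
    beta_out = beta - x / f
    return x + t * beta_out, beta_out


def invariant(n, xi, bi, xj, bj):
    """Assumed convention: L_ij = n (beta_i x_j - beta_j x_i)."""
    return n * (bi * xj - bj * xi)


n = 1.0                # the system is in air
f, t = 50.0, 30.0      # hypothetical focal length and gap

# (height, slope) of three rays in the object-space plane at the lens.
(x1, b1), (x2, b2), (x3, b3) = (5.0, 0.0), (0.0, 0.1), (2.0, -0.04)

L12 = invariant(n, x1, b1, x2, b2)
L13 = invariant(n, x1, b1, x3, b3)
L23 = invariant(n, x2, b2, x3, b3)

# Trace only the first two rays.
x1p, b1p = lens_then_gap(x1, b1, f, t)
x2p, b2p = lens_then_gap(x2, b2, f, t)

# Third ray predicted from the two traced rays.
x3p_pred = (L13 * x2p - L23 * x1p) / L12
b3p_pred = (L13 * b2p - L23 * b1p) / L12

# Direct trace of the third ray, for comparison only.
x3p, b3p = lens_then_gap(x3, b3, f, t)
print(x3p_pred, x3p)
print(b3p_pred, b3p)
```

The prediction matches the direct trace because paraxial ray tracing is linear in the ray height and slope.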
As discussed later in Section 5.2.3, a marginal ray from the axial point of an object
and a chief ray from its edge are the two rays that can be traced to determine the location
and size of the images of an object and the entrance pupil of a system. The first ray
passing through the edge of the entrance pupil passes through the edge of the exit pupil
and thus determines its size. It also passes through the center of the image and thereby
determines its location. The second ray passing through the center of the entrance pupil
passes through the center of the exit pupil and thus determines its location. It also passes
through the edge of the image and thus determines its size. The height and slope of a third
ray in any space can be determined from the heights and slopes of these two rays in that
space.
\[ x_1 = x_0 + t_0\beta_0. \tag{4-82} \]
\[ n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1}. \quad \text{(Refracting surface)} \tag{4-83} \]
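If the ray is refracted by a thin lens of focal length $f_1'$ instead, as in Figure 4-21, then its
slope after refraction is given by
\[ \beta_1 = \beta_0 - \frac{x_1}{f_1'}. \quad \text{(Thin lens)} \tag{4-84} \]
\[ \beta_1 = -\beta_0 - \frac{2x_1}{R_1}. \quad \text{(Mirror)} \tag{4-85} \]
[Figure: ray tracing of a refracting surface of radius of curvature $R_1$ (vertex $V$, center of curvature $C$) separating media of refractive indices $n_0$ and $n_1$, showing the ray heights $x_0$, $x_1$, $x_2$, the slopes $\beta_0$ and $(-)\beta_1$, and the distances $t_0$ and $t_1$. Diagram not reproduced.]
[Figure 4-21: ray tracing of a thin lens, showing the ray heights $x_0$, $x_1$, $x_2$, the slopes $\beta_0$ and $(-)\beta_1$, and the distances $t_0$ and $t_1$. Diagram not reproduced.]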
The above equations are applied recursively to trace a ray through a multisurface optical
system. For example, the height of the refracted or reflected ray after propagating a
distance t1 is given by
\[ x_2 = x_1 + t_1\beta_1. \tag{4-86} \]
Ray-tracing equations are used to determine not only the Gaussian properties of a
system but also the size of the imaging elements and apertures, vignetting of rays, and
obscurations in mirror systems.
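The recursive use of these equations can be sketched as a small table-driven tracer. The code below is an illustration under stated assumptions: the element encoding (`"transfer"`, `"surface"`, `"lens"`, `"mirror"`) and the function name `trace` are hypothetical, and the thick-lens data are sample values. The traced result is compared with the standard thick-lens focal length and with the back focal distance $f'[1 - t(n-1)/(nR_1)]$.

```python
def trace(ray, elements):
    """Apply transfer, refraction, thin-lens, and mirror steps in sequence."""
    x, beta = ray
    for kind, *params in elements:
        if kind == "transfer":                 # x2 = x1 + t beta
            (t,) = params
            x += t * beta
        elif kind == "surface":                # n1 beta1 = n0 beta0 + (n0 - n1) x / R
            n0, n1, R = params
            beta = (n0 * beta + (n0 - n1) * x / R) / n1
        elif kind == "lens":                   # beta1 = beta0 - x / f'
            (f,) = params
            beta -= x / f
        elif kind == "mirror":                 # beta1 = -beta0 - 2 x / R
            (R,) = params
            beta = -beta - 2.0 * x / R
        else:
            raise ValueError(f"unknown element: {kind}")
    return x, beta


# Sample thick lens in air (assumed values, in cm).
n, R1, R2, t = 1.5, 10.0, -10.0, 2.0
thick_lens = [("surface", 1.0, n, R1), ("transfer", t), ("surface", n, 1.0, R2)]

x_out, beta_out = trace((1.0, 0.0), thick_lens)   # ray parallel to the axis, x0 = 1
f_traced = -1.0 / beta_out                        # 1/f' = -beta/x0
bfd_traced = -x_out / beta_out                    # distance from the second vertex to F'

f_formula = 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2 + (n - 1.0) * t / (n * R1 * R2)))
bfd_formula = f_formula * (1.0 - t * (n - 1.0) / (n * R1))
print(f_traced, f_formula)
print(bfd_traced, bfd_formula)
```

Keeping the system as a list of steps makes it easy to reuse the same loop for other rays, for example a marginal or chief ray, by changing only the starting height and slope.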
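The front and back focal distances are given by $-f'[1 + t(n-1)/(nR_2)]$ and
$f'[1 - t(n-1)/(nR_1)]$, respectively.
[Figure: ray tracing of a reflecting surface of radius of curvature $R_1$ (vertex $V$, center of curvature $C$), showing the ray heights $x_0$, $x_1$, $x_2$, the slopes $\beta_0$ and $(-)\beta_1$, and the distances $t_0$ and $(-)t_1$. Diagram not reproduced.]
[Figure: thick lens of refractive index $n$, thickness $t$, and surface radii $R_1$ and $(-)R_2$, showing the optical axis OA, the vertices $V_1$ and $V_2$, the centers of curvature $C_1$ and $C_2$, the principal points $H$ and $H'$, the focal points $F$ and $F'$, the focal lengths $(-)f$ and $f'$, and the front and back focal distances. Diagram not reproduced.]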
\[ \frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'}. \tag{4-88} \]
The front and back focal distances are given by $-f'(1 - t/f_2')$ and $f'(1 - t/f_1')$,
respectively.
Figure 4-24. Two-lens system consisting of two thin lenses separated by a distance $t$.
4.11 Summary of Results 181
\[ \frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'}, \tag{4-89} \]
where $f_i' = R_i/2$ is the focal length of a mirror. The back focal distance, representing the
distance of the focal point $F'$ from the secondary mirror $M_2$, is given by $f'(1 - t/f_1')$.
The obscuration ratio, representing the ratio of the inner to the outer radii of the axial ray
bundle converging to the focal point, is given by $1 - t/f_1'$. The corresponding radius of
the hole in the primary mirror $M_1$ of radius $a$ is given by $a[1 + t(1/f' - 1/f_1')]$. Both the
obscuration ratio and the hole radius increase as the field of view increases.
where $n$ and $n'$ are the refractive indices of the object and image spaces, respectively.
[Figure: two-mirror system with primary mirror $M_1$ and secondary mirror $M_2$ on the optical axis OA, showing the principal point $H'$, the focal point $F_1'$ of the primary mirror, the system focal point $F'$, and the distances $(-)f_1'$, $(-)t$, $f'$, and the back focal distance. Diagram not reproduced.]
PROBLEMS
4.2 A thick lens has a refractive index of 1.5. Its surfaces have radii of curvature of 10
cm and – 25 cm. If the second surface is silvered and the lens is 2 cm thick, locate
the focal point and the principal point of the system.
4.4 Two thin lenses of focal lengths $\pm f'$ are placed a distance $f'$ apart. (a) Determine
the focal points, principal points, and the focal length of the system. How does the
order of the lenses affect the result? (b) Repeat the problem when the lenses have
focal lengths $f'$ and $-f'/6$ and are placed a distance $2f'/3$ apart.
4.5 Consider a system of two thin lenses of focal lengths $f_1'$ and $f_2'$ spaced a distance $t$
apart. (a) Determine its cardinal points if $f_1' = 2f_2'$ and $t = 0.5f_2'$, $f_2'$, $1.5f_2'$
(Huygens eyepiece), $2f_2'$, and $3f_2'$ (astronomical telescope). (b) Repeat the
problem if $f_1' = -2f_2'$ and $t = 0.5f_2'$, $-f_2'$ (Galilean telescope), and $-1.5f_2'$
(telephoto lens). Let $f_1' = 10$ cm.
[Figure: schematic eye with cornea, lens, and retina along the optical axis OA, showing the refractive indices $n_1$, $n_2$, $n_3$ and the thicknesses $t_1$, $t_2$, $t_3$. Surviving data: $C_3 = 0.16667$, $n_3 = 1.336$. Diagram not reproduced.]
(a) Determine $t_3$. (b) Determine the six cardinal points and show them on the axis.
(c) Determine the cardinal points for an underwater swimmer. Indicate the changes
from (b). Note that $C_i$ is the curvature of a surface, i.e., it is the reciprocal of its
radius of curvature. (Hint: One focal point is on the retina. The refractive index of
water is 1.336.) Note: $t$ is in units of mm, and $C$ is in units of mm$^{-1}$.
4.7 In a nearsighted eye, the focal point $F'$ lies in front of the retina. Assume that the
eye can be approximated, as shown in the figure below, such that $F'$ is 23 mm
from the cornea instead of 24.387 mm, as in a normal eye. (a) Determine the
prescription of a corrective lens placed 15 mm in front of the cornea that makes
$F'$ lie on the retina. (b) Repeat the calculation for a contact lens.
[Figure: schematic eye showing the cornea, lens, and retina, the principal points $H$ and $H'$, and the focal points $F$ and $F'$, with refractive indices $n = 1.000$ and $n = 1.336$. Labeled distances: 15.707 mm, 1.348 mm, 1.602 mm, and 24.387 mm. Diagram not reproduced.]
4.8 Consider a lens of refractive index n and thickness t with its two surfaces having
equal radii of curvature R. (a) Show that the distance between its principal points
is also equal to t. (b) Determine its principal and focal points for n = 1.5 ,
t = 2 cm , and R = 10 cm .
4.9 Consider a concentric lens of refractive index n with its two surfaces having radii
of curvature R1 and R2 . Show that such a lens behaves as a negative thin lens
placed at the common center of curvature of its two surfaces with a focal length
that is n times the focal length of a thin lens of the same refractive index and
surfaces with the same radii of curvature. Determine its principal and focal points
for n = 1.5 , R1 = 10 cm , and R2 = 8 cm .
4.10 The Hubble space telescope is a Cassegrain telescope with a focal ratio of 24. Its
primary mirror is its aperture stop (discussed in Chapter 5), with a diameter of 2.4
m and a focal ratio of 2.3. The spacing between its two mirrors is 4.905 m. (a)
Determine the location of its principal and focal points. (b) Determine the location
and size of its exit pupil. (c) Determine the diameters of the secondary mirror and
the hole in the primary mirror for a field of view of ± 5 mrad.
Chapter 5
Stops, Pupils, and Radiometry
5.1 INTRODUCTION
In previous chapters, we have shown how to determine the position and size of the
Gaussian image of an object. However, we did not consider the sizes of the imaging
elements or the apertures in the imaging system. Accordingly, no effort was made there to
determine the cone of object rays that enters or exits from the imaging system. Such
calculations are essential for the determination of the image intensity in terms of the
object intensity, or the image irradiance in terms of the object radiance.
We begin this chapter by introducing the concept of an aperture stop and its images,
called the entrance and exit pupils in the object and image spaces of an imaging system,
respectively. The light cone from a point object that enters the system is limited by the
entrance pupil. Similarly, the light cone that exits from the system and converges to the
image point is limited by the exit pupil. Certain special rays, such as the chief and
marginal rays, are defined. The chief ray from the edge of an object determines the
location of the exit pupil and the height of the image. Similarly, the marginal ray from the
axial point of the object determines the size of the exit pupil and the location of the axial
image point. Vignetting or blocking of the rays from an off-axis point object by the
aperture stop and/or other elements of the system, thus changing the effective shape of
the stop and pupils, is explained. A telecentric stop is defined, and its advantages are
briefly discussed. The field stop and its images, the entrance and exit windows, and the
angular field of view of a system are also described.
The field of radiometry deals with the determination of the amount of light radiated
by a source per unit area per unit solid angle, or the amount falling on a surface per unit
area [1–4]. We discuss the radiometry of point-object imaging, followed by the
radiometry of extended-object imaging. We introduce terms such as intensity of a point
source, radiance of an extended source, irradiance of a surface, and characteristics of a
Lambertian source. A relationship between the intensities of a point object and its point
image is derived. The irradiance of a surface due to a Lambertian disc is also derived. An
invariant relation between the radiances of an object and its image is obtained, and the
cosine-fourth law of image irradiance is discussed [5–7]. The irradiance distribution of
the images formed by systems that are telecentric or concentric is also discussed.
Figure 5-1. Imaging by a thin lens with an aperture stop at the lens. (a) On-axis
imaging. (b) Off-axis imaging. The cone angle of a ray bundle diverging from a
point object and incident on the lens is limited by the size of the lens. Similarly, the
cone angle of the ray bundle converging to the corresponding image point is also
limited by the size of the lens. The lens aperture is the aperture stop AS, entrance
pupil EnP, and exit pupil ExP of the imaging system. CR represents the chief ray
that passes through the center of the lens, and MR represents a marginal ray passing
through its edge.
the ray bundle converging to the image point $P_0'$. An observer looking at the lens from
$P_0'$ does not see the aperture stop but sees instead its image, the exit pupil ExP, formed by
the lens. In Figure 5-3, an aperture is placed behind the lens. We note that the cone angle
of the ray bundle transmitted by the system and reaching the image point is limited by it.
It is therefore also the exit pupil of the system. The image of the aperture stop by the lens
is the entrance pupil because it appears to limit the cone angle of the corresponding
incident ray bundle. An observer looking at the lens from $P_0$ does not see the aperture
stop AS, but sees instead its image, the entrance pupil EnP, formed by the lens.
Figure 5-2. Imaging by a thin lens with aperture stop AS in front of the lens. (a) On-axis
imaging. (b) Off-axis imaging. The aperture stop is also the entrance pupil
EnP, and its image by the lens is the exit pupil ExP. The chief ray CR passes
through the center of the aperture stop and appears to pass through the center of
the exit pupil. The cone angle of the ray bundle diverging from a point object and
incident on the lens is limited by the entrance pupil, and the cone angle of the ray
bundle converging to the image point appears to be limited by the exit pupil.
Suppose we add another lens to the right of the aperture stop so that the imaging
system consists of two thin lenses with an aperture lying between them, as illustrated in
Figure 5-4. Now AS is the aperture stop, and its images by the lenses L1 and L2 are the
entrance and exit pupils EnP and ExP, respectively. From the definition of the object and
image spaces given in Section 2.2.2, we note that the aperture stop lies in the image space
of lens L1 and the object space of lens L2 . Similarly, the entrance pupil lies in the
(virtual) object space of lens L1 , and the exit pupil lies in the (virtual) image space of lens
L2 . Moreover, the entrance pupil lies in the (virtual) object space and the exit pupil lies in
the (virtual) image space of the two-lens system. An observer looking at the system from
P0 does not see the aperture stop AS but sees instead its image, the entrance pupil EnP,
5.2 Stops, Pupils, and Vignetting 191
Figure 5-3. Imaging by a thin lens with an aperture stop AS behind the lens. (a) On-
axis imaging. (b) Off-axis imaging. The aperture stop is also the exit pupil ExP, and
its image by the lens is the entrance pupil EnP. The cone angle of the ray bundle
diverging from a point object appears to be limited by the entrance pupil, and the
cone of the ray bundle converging to the corresponding image point is limited by the
exit pupil. The chief ray CR passes through the center of the aperture stop and
appears to pass through the center of the entrance pupil.
formed by $L_1$. Similarly, an observer looking at the system from $P_0'$ also does not see the
aperture stop, but sees instead its image, the exit pupil ExP, formed by $L_2$.
Figure 5-4. (a) Imaging of an on-axis point object $P_0$ by an optical imaging system
consisting of two lenses $L_1$ and $L_2$. OA is the optical axis. The Gaussian image is at
$P_0'$. AS is the aperture stop; its image by $L_1$ is the entrance pupil EnP, and its image
by $L_2$ is the exit pupil ExP. $CR_0$ is the axial chief ray, and $MR_0$ is an axial marginal
ray. (b) Imaging of an off-axis point object $P$. The Gaussian image is at $P'$. CR is the
off-axis chief ray, and MR is an off-axis marginal ray.
(semidiameter) of each element and aperture in the system. The element or aperture with
the highest ratio is the aperture stop. If the angle of the chosen ray is increased, each ratio
increases by a proportional amount until it reaches a value of unity for the aperture stop.
Any further increase in the angle of the ray will lead to its vignetting by the aperture stop.
Which element of a system acts as its aperture stop depends on the location of the
object. As a simple example, the aperture A in Figure 5-5 is the aperture stop for objects
such as PA lying to the left of P, where P is the point of intersection of the line joining the
upper edges of A and the lens L with the optical axis. However, the lens itself acts as the
aperture stop for objects, such as PL , lying to the right of P.
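The ratio test for finding the aperture stop, and its dependence on the object position, can be sketched as follows. The code traces one ray from the axial object point and reports the element with the largest ratio of ray height to semidiameter. The function name `aperture_stop`, the element encoding, and the dimensions of the aperture $A$ and lens $L$ are hypothetical values chosen to mimic the situation of Figure 5-5.

```python
def aperture_stop(elements, z_obj, beta=1e-3):
    """Return the name of the element with the largest |ray height| / semidiameter.

    elements: list of (name, z, semidiameter, focal_length or None), ordered along
    the axis; a focal length of None denotes a plain aperture.
    """
    x, z = 0.0, z_obj                   # the ray starts at the axial object point
    stop_name, stop_ratio = None, 0.0
    for name, z_el, semidiameter, f in elements:
        x += (z_el - z) * beta          # transfer to the element
        z = z_el
        ratio = abs(x) / semidiameter
        if ratio > stop_ratio:
            stop_name, stop_ratio = name, ratio
        if f is not None:
            beta -= x / f               # thin-lens refraction
    return stop_name


# Hypothetical layout: aperture A at z = 0, lens L at z = 5 (arbitrary units).
elements = [("A", 0.0, 1.0, None), ("L", 5.0, 2.0, 10.0)]

# The line joining the upper edges of A and L meets the axis at z = -5 (point P).
far_object = aperture_stop(elements, z_obj=-20.0)    # object to the left of P
near_object = aperture_stop(elements, z_obj=-2.0)    # object between P and A
print(far_object, near_object)
```

Because every ratio scales with the starting slope, the initial value of `beta` does not affect which element is selected.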
Using the Lagrange invariant equation (2-74), we find that the angles of the marginal
and chief rays are related to each other according to
\[ n h \beta_0 = -n a \theta = -n' a' \theta' = n' h' \beta_0', \tag{5-1} \]
[Figure 5-5: aperture $A$ and lens $L$ on the optical axis, with the axial points $P_A$, $P$, and $P_L$. Diagram not reproduced.]
Figure 5-6. Schematic diagram of a system and its entrance and exit pupils EnP and
ExP, respectively, showing the marginal ray $P_0A \cdots A'P_0'$ from the axial point object
$P_0$ and the chief ray $PO \cdots O'P'$ from the off-axis point object $P$.
where $n$ and $n'$ are the refractive indices of the object and image spaces, $h$ and $h'$ are the
object and image heights, $\theta$ and $\theta'$ are the chief ray angles (both numerically negative)
in the object and image spaces, and $a$ and $a'$ are the radii of the entrance and exit pupils,
respectively. Moreover, we have used the fact that the object and image distances from
the entrance and exit pupils, respectively, are given by
\[ L_o = h/\theta = -a/\beta_0, \tag{5-2a} \]
and
\[ L_i = h'/\theta' = -a'/\beta_0'. \tag{5-2b} \]
The quantities in Eq. (5-1) represent, from left to right, the two-ray Lagrange invariant
(discussed in Section 4.10) in the planes of the object, entrance pupil, exit pupil, and the
image.
5.2.4 Vignetting
The amount of light in the image of a point object depends on the size and location of
the aperture stop or, equivalently, the entrance pupil of the imaging system. However, not
all of the rays transmitted by one element of the system are transmitted by another. Figure
5-7 illustrates vignetting of rays. The rays in the shaded region are transmitted by the
aperture stop AS, but they are missed by the lens L and are said to be vignetted.
Figure 5-7. Vignetting of rays. Rays from an off-axis point object $P$ in the shaded
region are transmitted by the aperture stop AS but vignetted by the lens $L$.
The vignetting of rays from an off-axis point object by a multielement system may
be determined by projecting the images of all elements and apertures (by the preceding
elements) on the entrance pupil using the point object as the center of projection. The
common area of these projections represents the effective entrance pupil of the system for
the point object under consideration. Its images formed by the elements that precede it
and by the entire system are the effective aperture stop and the effective exit pupil of the
system. An alternative but equally valid approach to determining the vignetting of rays is
to project the images of all elements on the exit pupil using the Gaussian image point as
the center of projection. The common area of these projections on the exit pupil
represents the effective exit pupil. The images of the common area by the elements that
follow it (looking at them from the image point) and by the entire system are the effective
aperture stop and the effective entrance pupil, respectively.
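The projection procedure can be sketched numerically. The code below projects the object-space images of the elements onto the entrance-pupil plane, using the point object as the center of projection, and estimates the fraction of the entrance pupil that lies inside all of the projections. It is a simplified illustration: the function names, the grid sampling, and the layout (positions and radii measured from the object plane) are hypothetical, and each projection is treated as a circle, which holds for circular outlines lying in planes parallel to the pupil plane.

```python
def projected_circle(z_elem, radius, z_pupil, h):
    """Project an on-axis circle at z_elem onto the pupil plane from a point at height h.

    Distances are measured from the object plane. Returns (center height, radius).
    """
    s = z_pupil / z_elem
    return h * (1.0 - s), radius * s


def unvignetted_fraction(pupil_radius, z_pupil, elements, h, n=201):
    """Fraction of the entrance pupil lying inside every projected circle."""
    circles = [projected_circle(z, r, z_pupil, h) for z, r in elements]
    in_pupil = in_all = 0
    for i in range(n):
        y = pupil_radius * (2.0 * i / (n - 1) - 1.0)
        for j in range(n):
            x = pupil_radius * (2.0 * j / (n - 1) - 1.0)
            if x * x + y * y > pupil_radius ** 2:
                continue                       # sample lies outside the entrance pupil
            in_pupil += 1
            if all(x * x + (y - c) ** 2 <= r * r for c, r in circles):
                in_all += 1
    return in_all / in_pupil


# Hypothetical layout: lens L1 and the image L2' of lens L2, with the entrance
# pupil (radius 5) lying between them at z = 120.
elements = [(100.0, 12.0), (150.0, 15.0)]      # (z, radius) of L1 and L2'
on_axis = unvignetted_fraction(5.0, 120.0, elements, h=0.0)
off_axis = unvignetted_fraction(5.0, 120.0, elements, h=-60.0)
print(on_axis, off_axis)
```

For the on-axis point every sample of the pupil is transmitted, whereas for the off-axis point the common area shrinks to a strip bounded by the two projected circles.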
In Figure 5-4a, the lenses are quite large compared with the aperture stop; therefore,
they do not in any way limit the ray bundle from the object point P0 transmitted by the
system. AS is indeed the aperture stop because it limits the ray bundle. Similarly, we note
from Figure 5-4b that, for any point on the object P0 P , there is no vignetting of the
aperture stop, i.e., any ray that is not blocked by the aperture stop is also not blocked by
either of the two lenses. Thus, for a circular aperture stop, the entrance and exit pupils are
also circular. We note that the cone of light rays from an axial point object illuminates the
lenses symmetrically, but the one from the off-axis point object illuminates them
eccentrically. We also note that different portions of the lenses are used for different point
objects. The same region of an imaging element is used for different point objects only
when the aperture stop is located at the element.
However, consider Figure 5-8a, which also shows a system consisting of two lenses
$L_1$ and $L_2$ with an aperture $A$ placed between them. The images of $A$ and $L_2$ by $L_1$ are
indicated as $A'$ and $L_2'$, respectively. An observer in the object space sees $L_1$, $A'$, and
$L_2'$, but not $A$ and $L_2$. We note that $A$ is the aperture stop of the system for only those
objects that have their axial points lying between $P_1$ and $P_2$, where $P_1$ and $P_2$ are the
points of intersection of the lines joining the upper edges of $L_1$ and $A'$, and $A'$ and $L_2'$,
respectively, with the optical axis. For these objects, $A'$ subtends the smallest angle (at
an axial point) among $L_1$, $A'$, and $L_2'$. It is, therefore, the entrance pupil of the system.
For objects lying to the left of $P_1$, $L_1$ subtends the smallest angle. Thus, it ($L_1$) is the
aperture stop of the system for such objects, in which case it is also the entrance pupil of
the system. For objects lying to the right of $P_2$, $L_2'$ subtends the smallest angle.
Therefore, for these objects, $L_2$ is the aperture stop and the exit pupil of the system, and
$L_2'$ is its entrance pupil.
Figure 5-8. Aperture stop of a system and its vignetting. $A'$ and $L_2'$ are the images of
$A$ and $L_2$ by $L_1$. (a) Determination of the aperture stop. (b) Diagram showing no
vignetting for an on-axis point object $P_0$. (c) Vignetting diagram for an off-axis
point object $P$. The circles on the right-hand side of the figure show projections of
$L_1$ and $L_2'$ on EnP with the point object under consideration as the center of
projection.
5.2 Stops, Pupils, and Vignetting 197
Figure 5-8c shows the projections of $L_1$ and $L_2'$ on EnP as viewed from an off-axis
point object $P$. These projections, illustrated as eccentric circles on the right-hand side of
the figure, are shown to be circular only as an approximation of the actual ellipses. The
ray bundle originating at $P$ and transmitted by the system is shown shaded in the figure. It
is clear that the upper marginal ray (sometimes called the upper rim ray) is limited by $L_2$,
and the lower marginal ray (sometimes called the lower rim ray) is limited by $L_1$; i.e., the
upper portion of the ray bundle from $P$ is blocked by $L_2$, and its lower portion is blocked
by $L_1$. Thus, there is vignetting of the aperture stop, and the effective aperture stop and
the corresponding entrance and exit pupils are no longer circular. The shape of the
effective entrance pupil is shown shaded in the figure as the region of EnP that is
common with the projections of $L_1$ and $L_2'$ on it. Its Gaussian images by $L_1$ and $L_2$ give
the shapes of the effective aperture stop and exit pupil, respectively. The consequence of
the variation of the shape of the entrance pupil with the location of point object $P$ lies not
only in the loss of light in its image but also in the distribution of the image light (because
it depends on the shape of the pupil). Diagrams such as those shown on the right-hand
side of Figures 5-8b and 5-8c, illustrating the shape of the pupil for a certain point object,
are called vignetting diagrams.
discussed in Section 2.5) and if the aperture stop is placed in an intermediate focal plane,
then both the entrance and exit pupils lie at infinity, and the system is said to be
telecentric on both object and image sides. However, a system cannot be telecentric on
the object side if the object lies at infinity because then the aperture stop will lie in the
image plane where it cannot control the cross-section of the focused beams. A telecentric
stop on the image side, for example, has the advantage that the size or the shape of an
image is insensitive to small focus errors, as may be seen from Figure 5-9. In Figure 5-9a,
the height of the image center does not change with defocus, i.e., $P'$ and $P''$ are at the
same height. However, in Figure 5-9b, where the aperture stop does not lie in the front
focal plane, a small defocus changes the height of the image center, as may be seen from
the fact that $P''$ is at a slightly larger height than $P'$, i.e., $h'' > h'$.
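The insensitivity of the image height to defocus can be illustrated with similar triangles: the chief ray appears to come from the center of the exit pupil, so in a plane displaced by a small defocus its height scales with the distance from the exit pupil. The sketch below assumes a paraxial chief ray and a distance `L_i` measured from the exit pupil to the image plane; `image_center_height` is a hypothetical helper, not a formula quoted from the text.

```python
import math

def image_center_height(h_img, L_i, defocus):
    """Chief-ray height in a plane displaced by `defocus` from the image plane.

    h_img: chief-ray height in the in-focus image plane.
    L_i: distance from the exit pupil to the image plane (math.inf if the
         system is telecentric on the image side).
    defocus: displacement of the observation plane away from the exit pupil.
    """
    if math.isinf(L_i):
        return h_img  # chief ray is parallel to the optical axis
    return h_img * (1.0 + defocus / L_i)

print(image_center_height(5.0, math.inf, 0.2))  # telecentric case
print(image_center_height(5.0, 100.0, 0.2))     # nontelecentric case
```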
Figure 5-9. (a) Telecentric aperture stop on the image side. (b) Nontelecentric
aperture stop. A dotted line shown within the system here and in Figure 5-10 does
not represent a ray but merely a line joining its points of incidence on and
emergence from the system. A small focus error does not change the height of the
image center in (a), but it does in (b).
Figure 5-10. Field stop, entrance and exit windows, and field of view of a system.
The field stop is assumed to lie at an intermediate image of the object. The dotted
line is not a ray but a mere illustration that the angle $\theta_o$ is limited by the field stop.
window EnW , and its image by the elements that follow it is called the exit window ExW .
The field stop is placed at a real image of the object. The image may be an intermediate
or the final one. Accordingly, the entrance and exit windows lie in the object and image
planes, respectively. The entrance window defines the object field that is actually imaged.
Simple examples of field stops are the rectangular diaphragm or the plate holder for the
film in a camera or for a slide in a slide projector. The field stop of a system is
determined by finding the image of each aperture and element by the imaging elements
that precede it and determining the image that subtends the smallest angle at the center of
the entrance pupil. This image is the entrance window, and the physical stop
corresponding to it is the field stop. The field stop may also be determined by tracing a
chief ray from a certain off-axis point object and calculating the ratio of the height and
radius of each element and aperture in the system. The element with the highest ratio is
the field stop.
The angle $\theta_o$ subtended by the entrance window at the center of the entrance pupil
defines the angular field of view of the system in object space. Similarly, the angle $\theta_i$
subtended by the exit window at the center of the exit pupil is the angular field of view of
the system in image space. According to Eq. (5-1), their ratio $\theta_o/\theta_i$ is equal to the
magnification of the exit pupil when the refractive indices of the object and image spaces
are equal.
It should be noted that, whereas the position and the size of the aperture stop
determine the quality and the amount of light in the final image (by virtue of blocking
rays with large aberrations), the field stop determines only the portion of the object that is
imaged. Additional stops and baffles are placed in optical systems to block stray light
from reaching the final image area. An example is a stop called a Lyot stop (or a cold stop
when used in an infrared system) placed at a real image of the aperture stop.
If a flux $dF$ from a point source irradiates a surface element of area $dS$, the flux incident
on the surface per unit area is called the irradiance $E$ (in watts/square meter, or W/m$^2$) of
the surface. Thus, the irradiance of the surface is given by
$$E = \frac{dF}{dS}. \tag{5-4}$$
Now we determine the flux incident on a circular aperture of radius $a$ from a point
source $P$ of intensity $I$ lying at a distance $R$ on its axis (see Figure 5-11). Consider an
annular element of radius $r$ and width $dr$ making an angle $\theta = \cos^{-1}(R/d)$ with the axis.
Its area is given by $dS = 2\pi r\,dr$, while its projected area perpendicular to the line joining
it and the point source is given by $dS\cos\theta$. The solid angle $d\Omega$ subtended by it at the
point source is given by $dS\cos\theta/d^2$, where $d$ is the distance between the two.
Accordingly, the flux incident on it is given by
$$dF = I\,d\Omega = \frac{I\,dS\cos^3\theta}{R^2} = IR\,\frac{2\pi r\,dr}{(R^2 + r^2)^{3/2}}.$$
Integrating over $r$ from 0 to $a$, we obtain the total flux incident on the aperture, i.e.,
$$F = \int dF = 2\pi IR \int_0^a \frac{r\,dr}{(R^2 + r^2)^{3/2}} \tag{5-5a}$$
$$= 2\pi IR \left[\frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}}\right] = I\Omega, \tag{5-5b}$$
5.3 Radiometry of Point Object Imaging 201
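[Figure 5-11: point source $P$ on the axis of a circular aperture of radius $a$; labels $r$, $dr$, $d$, $\theta$, and $\alpha$.]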
where
$$\Omega = 2\pi R \left[\frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}}\right] = 2\pi(1 - \cos\alpha) = 4\pi\sin^2(\alpha/2) \tag{5-5c}$$
is the solid angle, and $\alpha$ is the semiangle subtended by the aperture at the point source.
For $R \gg a$, these expressions reduce to
$$F = \frac{\pi a^2 I}{R^2} = \frac{IS}{R^2}, \tag{5-6a}$$
and
$$\Omega = S/R^2 = \pi\alpha^2, \tag{5-6b}$$
where $S = \pi a^2$ is the area of the aperture. Thus, for a distant point source, the solid angle
subtended by the aperture is simply its area divided by the square of its distance from the
point source, and $I/R^2$ is the uniform irradiance on the aperture. Equation (5-6a)
represents the inverse-square law of irradiance; namely, the irradiance of a surface by a
point source lying on its surface normal is inversely proportional to the square of its
distance from the radiating source. Figure 5-12 shows how the flux varies with $R/a$.
Comparing curve (a) with curve (b) shows that the exact value given by Eq. (5-5a) is smaller
than the approximate value given by Eq. (5-6a), and that the two are practically equal to
each other for $R/a \ge 5$. The difference between the two is less than 3% when $R/a = 5$.
[Figure 5-12: plot of $F/\pi I$ versus $R/a$ for $0 \le R/a \le 5$, curves (a) and (b).]
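The comparison between the exact flux of Eq. (5-5b) and the inverse-square approximation of Eq. (5-6a) is easy to check numerically. The following is a minimal sketch; the function names are illustrative, and the intensity and aperture radius are set to unity so that only the ratio $R/a$ matters.

```python
import math

def flux_exact(I, R, a):
    """Eq. (5-5b): flux on a circular aperture from an on-axis point source."""
    return 2.0 * math.pi * I * R * (1.0 / R - 1.0 / math.sqrt(R * R + a * a))

def flux_inverse_square(I, R, a):
    """Eq. (5-6a): distant-source approximation I * S / R^2 with S = pi a^2."""
    return I * math.pi * a * a / (R * R)

for ratio in (1.0, 2.0, 5.0, 10.0):
    exact = flux_exact(1.0, ratio, 1.0)
    approx = flux_inverse_square(1.0, ratio, 1.0)
    print(f"R/a = {ratio:4.1f}   exact/approx = {exact / approx:.4f}")
```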
$$F = I_o\,\Omega(P), \tag{5-7}$$
where
$$\Omega(P) = \frac{S_{en}\cos\theta}{(PO)^2} = \left(S_{en}/L_o^2\right)\cos^3\theta \tag{5-8}$$
is the solid angle subtended by the entrance pupil at the point object $P$. Here, $PO$ is the
distance between the point object $P$ and the center $O$ of the entrance pupil, and $\theta$ is the
angle the chief ray makes with the optical axis of the system in object space. It is assumed
here that the dimensions of the entrance pupil are small enough compared to its distance
from the object plane that the variation of the angle $\theta$ with the location of an area
element on the pupil can be neglected and, therefore, integration across the pupil is not
required. Equation (5-7) may also be written
$$F = I_o\,\Omega(P_0)\cos^3\theta, \tag{5-9}$$
where
$$\Omega(P_0) = S_{en}/L_o^2 \tag{5-10}$$
is the solid angle subtended by the entrance pupil at the axial point object $P_0$. If we divide
the flux $F$ by the area $S_{en}$, we find that the irradiance of the pupil is proportional to
$\cos^3\theta$, and thus obtain the cosine-third power law of irradiance of a surface by a point
source.
Figure 5-13. Radiometry of point object imaging. A point object $P$ lies in the object
plane at a distance $L_o$ from the entrance pupil EnP of the system. Its Gaussian
image $P'$ lies in the image plane at a distance $L_i$ from the exit pupil ExP of the
system. The chief ray CR makes an angle $\theta$ in the object space and $\theta'$ in the image
space of the system.
In the absence of any transmission losses in the system, the flux $F$ emerges from the
exit pupil and focuses on the image point $P'$. If $I_i$ is the intensity of the image point,
then the flux emerging from the exit pupil is given by
$$F' = I_i\,\Omega'(P'), \tag{5-11}$$
where
$$\Omega'(P') = \frac{S_{ex}\cos\theta'}{(O'P')^2} = \left(S_{ex}/L_i^2\right)\cos^3\theta' \tag{5-12}$$
is the solid angle subtended by the exit pupil at the image point. Here, $O'P'$ is the
distance between the center $O'$ of the exit pupil and the image point $P'$, $S_{ex}$ is the area of
the exit pupil, and $\theta'$ is the angle the chief ray makes with the optical axis in image
space. Equation (5-12) may also be written
$$\Omega'(P') = \Omega'(P_0')\cos^3\theta', \tag{5-13}$$
where
$$\Omega'(P_0') = S_{ex}/L_i^2 \tag{5-14}$$
is the solid angle subtended by the exit pupil at the axial image point $P_0'$. As in the case
of the entrance pupil, the dimensions of the exit pupil are assumed to be small enough
compared with its distance from the image plane that the variation of the angle $\theta'$ with
the location of an area element on the exit pupil can be neglected, and therefore
integration across the pupil is not required. Thus, Eq. (5-11) may be written
$$F' = I_i\,\Omega'(P_0')\cos^3\theta'. \tag{5-15}$$
Equating the fluxes given by Eqs. (5-9) and (5-15), we obtain
$$I_i = \frac{I_o\,\Omega(P_0)\cos^3\theta}{\Omega'(P_0')\cos^3\theta'}. \tag{5-16}$$
It should be noted that the image point is a uniform point source only within the solid
angle $\Omega'(P')$ because (according to geometrical optics) there is no radiation outside it.
For an axial point object, both $\theta$ and $\theta'$ approach zero, and Eq. (5-16) reduces to
$$I_i = \frac{I_o\,\Omega(P_0)}{\Omega'(P_0')}, \tag{5-17}$$
or
$$I_i = \frac{I_o F_{ex}^2}{F_{en}^2}, \tag{5-18}$$
where $F_{en} = L_o/D_{en}$ and $F_{ex} = L_i/D_{ex}$ are the focal ratios of the optical beams entering
and exiting from the system, and $D_{en}$ and $D_{ex}$ are the diameters of the entrance and exit
pupils.
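As a small numerical illustration of Eqs. (5-16) and (5-18), the sketch below evaluates the image intensity from the two focal ratios, using the fact that a small circular pupil of diameter $D$ at distance $L$ subtends a solid angle $\pi D^2/4L^2 = \pi/4F^2$. The function name and the sample focal ratios are assumptions made for illustration.

```python
import math

def image_intensity(I_o, F_en, F_ex, theta=0.0, theta_img=0.0):
    """Eq. (5-16), with Omega(P0) = pi/(4 F_en^2) and Omega'(P0') = pi/(4 F_ex^2).

    theta and theta_img are the chief-ray angles (radians) in object and image space.
    """
    omega_obj = math.pi / (4.0 * F_en ** 2)
    omega_img = math.pi / (4.0 * F_ex ** 2)
    return (I_o * omega_obj * math.cos(theta) ** 3
            / (omega_img * math.cos(theta_img) ** 3))

# Axial case: reduces to Eq. (5-18), I_i = I_o * F_ex^2 / F_en^2.
print(image_intensity(1.0, F_en=10.0, F_ex=5.0))
```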
Lambertian object formed by an optical system. We show that the irradiance in the image
plane decreases as the fourth power of the cosine of the chief ray angle.
The flux from an object element incident on the system and transmitted by it to the
image element can be calculated in two different ways, depending on the location of the
aperture stop. If the aperture stop lies in the object space so that it is also the entrance
pupil, then the flux entering the system can be calculated by integrating across the
entrance pupil. If the aperture stop lies in the image space so that it is also the exit pupil,
then the image flux is obtained by integrating across the exit pupil. Both of these
approaches are illustrated. If, however, the aperture stop lies somewhere inside the
system, then the integration may be performed across the entrance or the exit pupil. The
region of integration depends on the shape of the pupil, which may or may not be the
same as that of the aperture stop due to vignetting or distortion of the pupil image. The
pupil shape, which is system specific, may be determined by tracing a bundle of rays.
Generally, the radiance of self-radiating and reradiating surfaces does not vary
strongly with the direction of radiation. According to Eq. (5-19), the radiance of a surface
element is independent of the direction of radiation if its intensity is proportional to $\cos\theta$.
Such a surface is said to obey Lambert's cosine law of intensity. A surface that radiates
uniformly in all directions is called a Lambertian surface or a uniform diffuser, depending
on whether it is self-radiating or reradiating. The sun is a spherical blackbody radiating
uniformly in all directions and therefore appears as a uniform disc. A laser beam, on the
other hand, is highly directional and obviously not a Lambertian source of radiation.
[Figure: surface element $dS$ and its projected area $dS\cos\theta$ as seen from a point $P$ in the direction $\theta$.]
$$dE = \frac{dF_2}{dS_2} = B\,dS_1\cos\theta_1\cos\theta_2/R^2. \tag{5-21}$$
We now determine the axial irradiance of a Lambertian disc of radius $a$ and radiance
$B$ (see Figure 5-16). Because of the axial symmetry of the disc, we consider an elemental
ring of radius $r$ and width $dr$. Its area is given by $dS_e = 2\pi r\,dr$. The flux radiated by this
ring per unit solid angle on a parallel elemental area $dS_r$ centered on the axis of the disc
at a distance $R$ is given by $B\,dS_e\cos\theta$, where $\theta$ is the angle the line joining a point on the
ring and the receiver makes with the axis. The solid angle subtended by the receiver area
$dS_r$ at any point on the ring is given by $dS_r\cos\theta/d^2$ or $dS_r\cos^3\theta/R^2$, where
$d = R/\cos\theta = (r^2 + R^2)^{1/2}$ is the distance between a point on the ring and the receiver.
Thus, the flux incident on the receiver by the ring is given by $B\,dS_e\,dS_r\cos^4\theta/R^2$. Its
irradiance is accordingly given by
$$dE = B\,dS_e\cos^4\theta/R^2 = BR^2\,\frac{2\pi r\,dr}{(r^2 + R^2)^2}. \tag{5-22}$$
[Figure: elements $dS_1$ and $dS_2$ with angles $\theta_1$ and $(-)\theta_2$.]
[Figure 5-16: Lambertian disc of radius $a$ and an on-axis receiver element $dS_r$; labels $r$, $dr$, $d$, $\theta$, and $\alpha$.]
Integrating from 0 to $a$, we obtain the axial irradiance due to the disc:
$$E(0) = 2\pi BR^2 \int_0^a \frac{r\,dr}{(r^2 + R^2)^2} = \frac{BS}{a^2 + R^2} \tag{5-23a}$$
$$= \pi B\sin^2\alpha, \tag{5-23b}$$
where $S = \pi a^2$ is the area of the disc, and $\alpha$ is the semiangle subtended by the disc at the
point of observation. When $R \ll a$, $E(0) \to \pi B$. Thus, when the source is very large
compared with the distance of the receiver, the axial irradiance is independent of the
distance between the two.
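The closed form of Eq. (5-23a) can be checked against a direct numerical evaluation of the ring integral of Eq. (5-22). The sketch below uses a simple midpoint rule; the function names and sample values are illustrative only.

```python
import math

def axial_irradiance_closed(B, a, R):
    """Eq. (5-23a): E(0) = B S / (a^2 + R^2) with S = pi a^2."""
    return B * math.pi * a * a / (a * a + R * R)

def axial_irradiance_numeric(B, a, R, n=20000):
    """Midpoint-rule integration of Eq. (5-22) over the ring radius r from 0 to a."""
    dr = a / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += r * dr / (r * r + R * R) ** 2
    return 2.0 * math.pi * B * R * R * total

print(axial_irradiance_closed(1.0, 1.0, 2.0))
print(axial_irradiance_numeric(1.0, 1.0, 2.0))
# Very large source compared with the receiver distance: E(0) approaches pi * B.
print(axial_irradiance_closed(1.0, 1000.0, 1.0), math.pi)
```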
For $R \gg a$, Eq. (5-23a) reduces to
$$E(0) = BS/R^2 \tag{5-24a}$$
$$= B\Omega \tag{5-24b}$$
$$= I/R^2, \tag{5-24c}$$
where $\Omega = S/R^2$ is the solid angle subtended by the disc at the receiver, and $I = BS$ is
the intensity of the disc along its axis. Thus, the irradiance due to the disc at large
distances is equal to the product of its radiance and the solid angle subtended by it at the
distant receiver. The disc also behaves like a point source of intensity $BS$ at large
distances, as expected. The difference between the actual value and that given by the
For a distant off-axis receiver making an angle $\delta$ with the disc axis (see Figure 5-18),
the irradiance is given by
$$E(\delta) = E(0)\cos^4\delta, \tag{5-25a}$$
where one factor of $\cos\delta$ arises from the projected area of the disc along the line joining
its center and the point of observation, another due to the projected area of the receiver,
and $\cos^2\delta$ due to the increase in distance between the disc and the receiver. When $R$ is
not much greater than $a$, the actual irradiance values are higher than those predicted by
Eq. (5-25a). Figure 5-19 shows how the irradiance decreases as the off-axis angle $\delta$
increases, especially for $\delta \gtrsim 10^\circ$.
[Figure: plot of $E(0)/\pi B$ versus $R/a$ for $0 \le R/a \le 10$.]
[Figure 5-18: Lambertian disc of radius $a$ and a distant off-axis receiver element $dS_r$ at an angle $\delta$ from the disc axis.]
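The $\cos^4\delta$ falloff of Eq. (5-25a) can be tabulated directly. This is a minimal sketch for a distant receiver; the helper name is illustrative.

```python
import math

def relative_irradiance(delta_deg):
    """Eq. (5-25a): E(delta)/E(0) = cos^4(delta), with delta in degrees."""
    return math.cos(math.radians(delta_deg)) ** 4

for delta in (0, 10, 20, 30, 45, 60):
    print(f"delta = {delta:2d} deg   E/E(0) = {relative_irradiance(delta):.4f}")
```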
where we have used Eq. (5-20) with $dS_1 = dS$, $dS_2 = 2\pi r\,dr$, $\theta_1 = \theta_2 = \theta$, and $d$ is the
distance between the source and a point on the annulus. Letting $d = R/\cos\theta = (R^2 + r^2)^{1/2}$,
Eq. (5-26) may also be written
$$dF = B\,dS\,(2\pi r\,dr)\cos^4\theta/R^2 = \frac{B\,dS\,R^2\,(2\pi r\,dr)}{(R^2 + r^2)^2}. \tag{5-27}$$
Figure 5-19. The $\cos^4\delta$ variation of irradiance at large distances, where the off-axis
angle $\delta$ is in degrees. [Plot of $E(\delta)/E(0)$ versus $\delta$ from 0 to 60.]
[Figure: Lambertian source element $dS$ on the axis of a circular aperture of radius $a$; labels $r$, $dr$, $d$, $\theta$, and $\alpha$.]
The total flux received by the aperture is obtained by integrating over $r$ from 0 to $a$:
$$F = 2\pi B\,dS\,R^2 \int_0^a \frac{r\,dr}{(r^2 + R^2)^2} = \frac{\pi B\,dS\,a^2}{a^2 + R^2} = \pi B\,dS\sin^2\alpha, \tag{5-28}$$
where $\alpha$ is the semiangle subtended by the aperture at the source. Comparing Eq. (5-28)
with Eq. (5-23b), we note that the flux incident on an aperture of radius $a$ from a
Lambertian source of area $dS$ is exactly the same as the flux incident on an area $dS$ by a
Lambertian disc of radius $a$. It shows that the same amount of flux is transmitted from a
source to a receiver if their roles are interchanged. Comparing Eq. (5-28) with Eq. (5-5)
for the flux received from a point source, we note that the intensity $I$ of the point source
has been replaced by the radiance $B$ of the Lambertian source. The two expressions
differ because of the extra cosine factor in the projected area of the source.
The flux calculation may also be carried out in terms of the solid angle subtended by
the annulus. The flux incident on the annulus is given by $B\,dS\cos\theta\,d\Omega$, where
$$d\Omega = 2\pi r\,dr\cos\theta/d^2 \tag{5-29}$$
is the solid angle subtended by the annulus at the source. We note from the figure that
$$\sin\theta = r/d = \frac{r}{(r^2 + R^2)^{1/2}}. \tag{5-30}$$
5.4 Radiometry of Extended Object Imaging 211
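Differentiating Eq. (5-30), we obtain
$$d\theta = \frac{dr\cos\theta}{d}, \tag{5-31}$$
so that the solid angle subtended by the annulus may be written
$$d\Omega = 2\pi\sin\theta\,d\theta. \tag{5-32}$$
Integrating the flux over $\theta$ from 0 to $\alpha$, we obtain
$$F = 2\pi B\,dS \int_0^\alpha \sin\theta\cos\theta\,d\theta = \pi B\,dS\sin^2\alpha, \tag{5-33}$$
$$d\Omega = \frac{(r\,d\theta)(r\sin\theta\,d\phi)}{r^2} = \sin\theta\,d\theta\,d\phi. \tag{5-34}$$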
It represents the area on a unit sphere lying between the angles $\theta$ and $\theta + d\theta$, and $\phi$ and
$\phi + d\phi$, as may be seen from the figure. If $B$ is the radiance of the beam, the flux incident
on an elementary area $dS$ is given by
$$dF = B\,dS\cos\theta\,d\Omega. \tag{5-35}$$
The azimuthal angle $\phi$ does not change upon refraction by virtue of the fact that the
incident ray, the refracted ray, and the surface normal are coplanar. The solid angle of the
refracted beam is given by
$$d\Omega' = \sin\theta'\,d\theta'\,d\phi. \tag{5-37}$$
Figure 5-21. (a) Solid angle of an elementary beam in polar coordinates $(r, \theta, \phi)$ and
(b) its change from $d\Omega$ to $d\Omega'$ upon refraction at an interface separating media of
refractive indices $n$ and $n'$.
Equating the products of the left-hand sides of Eqs. (5-36) and (5-38) to the products of
their right-hand sides, we find that
$$n^2\cos\theta\,d\Omega = n'^2\cos\theta'\,d\Omega', \tag{5-39}$$
i.e., the quantity $n^2\cos\theta\,d\Omega$ is invariant upon refraction. The two solid angles are
different from each other because the rays bounding $d\Omega$ are refracted by slightly
different amounts due to their slightly different angles of incidence. For a reflecting
surface, they are equal because then $\theta' = \theta$.
$$dF' = B'\,dS\cos\theta'\,d\Omega'. \tag{5-40}$$
In the absence of any transmission loss, the incident flux is equal to the refracted flux,
i.e.,
$$dF' = dF. \tag{5-41}$$
Therefore, equating the right-hand sides of Eqs. (5-35) and (5-40), and substituting Eq.
(5-39), we obtain
$$\frac{B'}{n'^2} = \frac{B}{n^2}. \tag{5-42}$$
Thus, when the rays are refracted by a surface, the quantity $B/n^2$ associated with
them is invariant. We refer to this invariance as the radiance theorem. When the rays are
reflected by a lossless surface, their radiance is invariant because $n' = -n$ in that case.
Because the entrance pupil of an optical imaging system lies in its object space, the
radiance of rays at the entrance pupil is equal to the object radiance. Similarly, because
the exit pupil lies in the image space, the radiance of rays at the exit pupil is equal to the
image radiance. We make use of Eq. (5-42) in Section 5.4.7 in obtaining the image
irradiance distribution in terms of the object radiance.
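The invariance of $n^2\cos\theta\,d\Omega$ upon refraction, and hence the radiance theorem of Eq. (5-42), can be checked numerically with Snell's law. The sketch below is an illustration under assumed values (an air-to-glass interface with $n' = 1.5$ and a $30^\circ$ angle of incidence); the change in the refraction angle across the beam is estimated by a finite difference.

```python
import math

def refract(theta, n, n_p):
    """Snell's law: angle of refraction (radians) for an incidence angle theta."""
    return math.asin(n * math.sin(theta) / n_p)

def invariant(theta, dtheta, dphi, n):
    """n^2 cos(theta) dOmega, with dOmega = sin(theta) dtheta dphi."""
    return n * n * math.cos(theta) * math.sin(theta) * dtheta * dphi

def refracted_radiance(B, n, n_p):
    """Eq. (5-42) for a lossless interface: B' = B (n'/n)^2."""
    return B * (n_p / n) ** 2

n, n_p = 1.0, 1.5                        # assumed refractive indices
theta, dtheta, dphi = math.radians(30.0), 1e-6, 1e-6
theta_p = refract(theta, n, n_p)
dtheta_p = refract(theta + dtheta, n, n_p) - theta_p  # finite-difference d(theta')

before = invariant(theta, dtheta, dphi, n)
after = invariant(theta_p, dtheta_p, dphi, n_p)  # dphi is unchanged by refraction
print(before, after)
print(refracted_radiance(1.0, n, n_p))
```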
first imaging element is also the aperture stop. Let its radius be $a_{en}$ and area be
$S_{en} = \pi a_{en}^2$. We assume that the aperture stop is much smaller than the object distance
from it so that the line joining a given (Lambertian) object element of area $dS$ and any
element $dS_{en}$ on the aperture makes approximately the same angle $\gamma$ with the optical
axis (see Figure 5-22). If $B$ is the radiance of the object element, its intensity in the
direction of the entrance pupil is given by $B\,dS\cos\theta$, where $\theta$ is the angle of the chief
ray in the object space. The projected area of the pupil in the direction of the object
element is given by $S_{en}\cos\theta$, and the distance between the two is $L_o/\cos\theta$.
Accordingly, the solid angle subtended by the pupil at the object element is given by
$$d\Omega = \frac{S_{en}\cos\theta}{(L_o/\cos\theta)^2}. \tag{5-43}$$
The flux entering the system is therefore given by
$$F = B\,d\Omega\,dS\cos\theta = B\left(S_{en}/L_o^2\right)dS\cos^4\theta. \tag{5-44}$$
Neglecting the loss of light while propagating through the system, this flux is contained
in the corresponding image element. If $dS'$ is the area of this element, its irradiance is
given by
$$E(\theta) = F/dS' = E(0)\cos^4\theta, \tag{5-45}$$
where
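[Figure 5-22: object element $dS$ at $P$ in the object plane, entrance pupil EnP and exit pupil ExP of the optical system, and image element $dS'$ at $P'$ in the image plane; labels $n$, $n'$, $L_o$, $L_i$, $\theta$, $\theta'$, $\gamma$, $\gamma'$, $\alpha$, $\alpha'$, $d$, $dS_{en}$, $dS_{ex}$, and CR.]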
$$E(0) = \pi B\,\alpha^2/M^2 \tag{5-46a}$$
$$= B\,\Omega(P_0)/M^2 \tag{5-46b}$$
$$= \pi B/\left(4M^2F_{en}^2\right) \tag{5-46c}$$
is the irradiance at the axial image point, $M$ is the magnification of the image, and
$\alpha = a_{en}/L_o$ is the semiangle of the cone subtended by the
entrance pupil at the axial object point $P_0$. The quantity $2\alpha$ is called the angular
aperture of the light cone entering the system. As in Eq. (5-18),
$$F_{en} = L_o/D_{en} = 1/(2\alpha) \tag{5-48}$$
is the f-number of this light cone. Equation (5-45) represents the cosine-fourth power law
in the object space, showing that the irradiance of the image of a Lambertian object
decreases as the fourth power of the cosine of the chief ray angle $\theta$ in the object space.
This decrease can be overcome by introducing barrel distortion into the system. Of
course, there may be an additional decrease due to vignetting.
When $a_{en}$ is not very small compared to $L_o$, then $\alpha = \tan^{-1}(a_{en}/L_o)$, and it is
replaced by $\sin\alpha$ in Eq. (5-46a). The f-number of the light cone in that case is given by
$$F_{en} = 1/(2\sin\alpha). \tag{5-49}$$
The quantity $n\sin\alpha$ is called its numerical aperture in the object space.
If $d$ is the distance of the object-space focal point from the entrance pupil and $f$ is the
object-space focal length of the system, then Eq. (2-83) yields
$$M = -f/(L_o - d). \tag{5-50}$$
For an object lying at infinity, $M^2L_o^2 \to f^2$, and the image irradiance becomes
$$E(\theta) = \left(B/f^2\right)S_{en}\cos^4\theta = \frac{\pi B n'^2}{4n^2F_\infty^2}\cos^4\theta, \tag{5-51}$$
where
$$F_\infty = f'/D_{en} \tag{5-52}$$
is the focal ratio of the image-forming light cone for an object lying at infinity and we
have made use of Eq. (2-69). $F_\infty$ is called the f-number or the relative aperture of the
system. If the diameter $D_{en} = 2a_{en}$ of the entrance pupil is increased by a certain factor so
that the system collects more light, the image irradiance does not change if the image-space
focal length $f'$ is also increased by the same factor. The amount of light collected
increases as $D_{en}^2$, and the image area increases as $f'^2$ so that the irradiance does not
change unless the f-number also changes. Accordingly, the f-number and not the entrance
pupil diameter determines the light-gathering capability of a system in the sense of image
irradiance.
A camera lens with a small f-number is said to be fast since it yields higher
irradiance on film, thus requiring a shorter exposure time. Its speed is inversely
proportional to the square of its f-number. The effective diameter of the lens, and therefore
the flux density on the film, is controlled by an adjustable diaphragm, but its focal length is
fixed (unless an additional lens is attached). The f-number markings on the rim of a camera
lens, e.g., 22.6, 16, 11.3, 8, 5.6, 4, 2.8, 2, and 1.4, represent increasing diaphragm opening by
a factor of 2 in the area from one number to the next. Assuming a good-quality lens, a larger
lens opening (and, therefore, a smaller f-number) also gives a better resolution. Smaller
f-numbers are used for fast-moving or dimly illuminated objects.
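Because the image irradiance varies as $1/F_\infty^2$ [Eq. (5-51)], each step in the marking sequence quoted above should change the irradiance by a factor close to 2. The following sketch checks this for the listed markings; the helper name is illustrative.

```python
def irradiance_gain(f_closed, f_open):
    """Factor by which image irradiance rises when opening from f_closed to f_open."""
    return (f_closed / f_open) ** 2

markings = [22.6, 16, 11.3, 8, 5.6, 4, 2.8, 2, 1.4]
gains = [irradiance_gain(c, o) for c, o in zip(markings, markings[1:])]

for (c, o), g in zip(zip(markings, markings[1:]), gains):
    print(f"f/{c} -> f/{o}: irradiance x {g:.2f}")
```

The factors are not exactly 2 because the engraved markings are rounded values of successive powers of $\sqrt{2}$.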
The focal ratio of the image-forming light cone for finite conjugates can be related to
$F_\infty$ as follows. From Eqs. (2-70) and (2-72), the image distance $S'$ of $P_0'$ can be written
$$S' = f'(1 - M). \tag{5-53}$$
Similarly, the image distance $s'$ of the exit pupil can be written in terms of the pupil
magnification $m = D_{ex}/D_{en}$. Thus, we may write the focal ratio of the light cone exiting
from the exit pupil [see Eq. (5-18)]
$$F_{ex} = L_i/D_{ex} = (S' - s')/D_{ex} = F_\infty(1 - M/m). \tag{5-54}$$
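Equation (5-54) gives the working focal ratio at finite conjugates. A minimal sketch follows, assuming the sign convention in which a real, inverted image has $M < 0$; the function name and sample values are illustrative.

```python
def working_f_number(F_inf, M, m=1.0):
    """Eq. (5-54): F_ex = F_inf * (1 - M/m).

    F_inf: f-number for an object at infinity.
    M: image magnification (negative for a real, inverted image; an assumed convention).
    m: pupil magnification D_ex / D_en.
    """
    return F_inf * (1.0 - M / m)

print(working_f_number(4.0, 0.0))    # object at infinity: F_ex equals F_inf
print(working_f_number(4.0, -1.0))   # unit-magnification imaging with m = 1
```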
$$d\Omega' = \frac{dS'\cos\theta'}{(L_i/\cos\theta')^2}. \tag{5-55}$$
$$F' = B'S_{ex}\,d\Omega'\cos\theta' = \left(B'\,dS'/L_i^2\right)S_{ex}\cos^4\theta'. \tag{5-56}$$
$$E(\theta') = F'/dS' = E(0)\cos^4\theta', \tag{5-57}$$
where
$$E(0) = \pi B\,(n'/n)^2\,\alpha'^2 \tag{5-58a}$$
$$= B\,(n'/n)^2\left(S_{ex}/L_i^2\right) \tag{5-58b}$$
$$= B\,(n'/n)^2\,\Omega'(P_0') \tag{5-58c}$$
$$= \pi B\,(n'/2n)^2/F_{ex}^2, \tag{5-58d}$$
and we have written $B'$ in terms of $B$, according to Eq. (5-42). Here, $\alpha' = a_{ex}/L_i$ is the
semiangle of the cone subtended by the exit pupil at the axial image point $P_0'$, and the
angle $2\alpha'$ is called the angular aperture of the image-forming light cone exiting from
the system with its apex at $P_0'$. As in Eq. (5-18),
$$F_{ex} = L_i/D_{ex} = 1/(2\alpha') \tag{5-59}$$
is the f-number of the light cone exiting from the system. Equation (5-57) represents the
cosine-fourth power law in the image space, showing that the irradiance of the image of a
Lambertian object decreases as the fourth power of the cosine of the chief ray angle $\theta'$ in
the image space.
When $a_{ex}$ is not very small compared to $L_i$, then $\alpha' = \tan^{-1}(a_{ex}/L_i)$ and it is
replaced by $\sin\alpha'$ in Eq. (5-58a). The f-number of the exiting light cone in that case is
given by
$$F_{ex} = 1/(2\sin\alpha'). \tag{5-60}$$
The quantity $n'\sin\alpha'$ is called its numerical aperture in the image space.
If the object lies at infinity, then the image point $P_0'$ coincides with the image-space
focal point $F'$, and $F_{ex} \to F_\infty$, where
$$F_\infty = n'/(2\,NA_\infty'). \tag{5-61}$$
Here,
$$NA_\infty' = n'\sin\alpha_\infty' \tag{5-62}$$
The angular aperture, the f-number, and the numerical aperture all give a measure of
the light-gathering capability of an optical system in the sense that the image illumination
depends on them. It is customary to use the f-number of the image-forming light cone for
systems such as cameras imaging objects lying at large distances. The term numerical
aperture is used when imaging objects at short distances, as in microscopes.
5.4.9 Throughput
If we consider the corresponding object and image elements centered on the optical
axis at $P_0$ and $P_0'$, respectively, then equating the axial image irradiances given by Eqs.
(5-46b) and (5-58c), we obtain
$$n'^2\,dS'\,\Omega'(P_0') = n^2\,dS\,\Omega(P_0). \tag{5-63}$$
Thus, the quantity $n^2\,dS\,\Omega(P_0)$, called the optical throughput, is an invariant. Note that if
$n = n'$ (in practice, they are often both equal to unity), then $B = B'$, and the product of
the area and the solid angle may simply be called the throughput. In that case, the
throughput multiplied by the radiance gives the flux passing through the system.
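The throughput invariance of Eq. (5-63) can be illustrated with the simplest possible system. The sketch below assumes a thin lens in air with the aperture stop at the lens, so that both pupils coincide with the lens, and uses the thin-lens imaging equation with distances taken as magnitudes; all numerical values and names are hypothetical.

```python
import math

def throughput(n, dS, pupil_radius, distance):
    """n^2 * dS * Omega, with Omega = pi a^2 / L^2 for a small pupil."""
    omega = math.pi * pupil_radius ** 2 / distance ** 2
    return n * n * dS * omega

# Hypothetical thin lens in air, stop at the lens (entrance and exit pupils at the lens).
f, s_obj, a, dS = 50.0, 200.0, 5.0, 0.01   # focal length, object distance, pupil radius, object area
s_img = 1.0 / (1.0 / f - 1.0 / s_obj)       # thin-lens equation (magnitudes)
M = -s_img / s_obj                           # transverse magnification
dS_img = M * M * dS                          # area of the image element

T_obj = throughput(1.0, dS, a, s_obj)
T_img = throughput(1.0, dS_img, a, s_img)
print(T_obj, T_img)
```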
Figure 5-23. Invariant relations in imaging. The object is a small circular object of
radius $h$.
show that they are interrelated by the conservation of energy in the process. Consider a
small circular object of radius $h$ and radiance $B$ at a distance $L_o$ from the entrance pupil
of radius $a$ of a certain imaging system, as illustrated in Figure 5-23. The flux incident on
the entrance pupil is given by
$$F_o = \pi h^2 B\left(\pi a^2/L_o^2\right). \tag{5-64}$$
If the exit pupil has a radius $a'$ and the image has a radius of $h'$, radiance $B'$, and lies at
a distance $L_i$ from it, then the flux in the image is given by
$$F_i = \pi h'^2 B'\left(\pi a'^2/L_i^2\right). \tag{5-65}$$
Equating the flux entering the system to that exiting from it based on conservation of
energy, we obtain
$$B h^2 a^2/L_o^2 = B' h'^2 a'^2/L_i^2. \tag{5-66}$$
This is precisely the result obtained if we square the Lagrange invariant equation (2-75)
and multiply by the radiance invariance given by Eq. (5-42). If we substitute for $B'$ in
terms of $B$, we obtain the throughput invariance of Eq. (5-63).
angle subtended by $dS'$ at the exit pupil is simply equal to $dS'/L_i^2$. Thus, the flux
emerging from a small exit pupil and converging on the image element is given by
$$F' = B'\,dS'\left(S_{ex}/L_i^2\right)\cos\theta'$$
$$E(\theta') = B\,(n'/n)^2\,\Omega'(P_0')\cos\theta'. \tag{5-68}$$
Thus, the irradiance of the spherical image formed by a concentric system decreases
linearly with the cosine of the angle of the chief ray in the object or the image space. For
an object lying at infinity, we let $L_i = f'$, the focal length of the system.
5.5 PHOTOMETRY
Now we give a brief discussion of photometry, the branch of radiometry that is
limited to observations with the human eye, which is sensitive only in the visible region
of the electromagnetic spectrum called light. The theory of photometry, in terms of the
transfer of light from a source to a receiver, is the same as discussed earlier, except that
the spectral response of the eye must be taken into account to determine the final result of
any observation. The names, symbols, and units of photometric quantities are given,
along with an equation for obtaining a photometric quantity from a corresponding
radiometric quantity. It is shown that a Lambertian surface appears equally bright at all
distances and along all directions of observation. The reason stars can be observed during
daytime with the aid of a telescope is also discussed.
given by
$$F_l = k\int F_r(\lambda)\,V(\lambda)\,d\lambda, \tag{5-69}$$
Figure 5-24. Relative spectral response $V$ of the human eye for day (photopic) and
night (scotopic) vision, plotted against wavelength $\lambda$ (nm) from 380 to 780 nm.
Table 5-2. Relative spectral response of the human eye for day (photopic) and night
(scotopic) vision.
Consider an object of height h lying at a distance R from the front principal point
H, as illustrated in Figure 5-25. An image of height h ¢ is formed on the retina at a
distance R ¢ from the back principal point H ¢ . (see Problems 4.6 and 4.7 for a Gaussian
model of the human eye; see also Section 6.2.2.) The angular sizes and ¢ of the object
and image as seen from the respective principal points are related to each other according
to [see Eq. (2-67)]
n = n ¢ ¢ , (5-70)
where n and n ¢ are the refractive indices of the object and image spaces, respectively.
The image height h ¢ is given by
h ¢ = R ¢ ¢
(5-71)
= (n n¢) R¢ .
As the object distance varies, the eye lens changes its focal length by a process called
accommodation so that the distance R ¢ remains practically invariant (see Section 6.2.3).
Consequently, the apparent size of an object is proportional to the angle it subtends at
H , independent of the state of accommodation.
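[Figure 5-25: object $P_0P$ at a distance $(-)R$ from the front principal point $H$ and its retinal image of height $(-)h'$ at a distance $R'$ from the back principal point $H'$; refractive indices $n$ and $n'$.]
[Figure: Lambertian object element $dS_1$ observed at an angle $\theta_1$, eye pupil $dS_2$ at a distance $R$, and retinal image $dS_1'$ at a distance $R'$.]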
where $\theta_1$ is the angle between the normal to the surface $dS_1$ and the direction of
observation, i.e., the line joining the centers of $dS_1$ and $dS_2$. The angle $\theta_2$ is zero
because $dS_2$ is normal to this line.
If $\eta$ is the transmission factor of the eye, the flux reaching the retina is $\eta\,dF$. This
flux is distributed over the retinal image of object $dS_1$. The projected area of the observed
surface normal to the direction of observation is $dS_1\cos\theta_1$. Therefore, if $R'$ is the image
distance, then the area of the image is given by
$$dS_1' = (R'/n'R)^2\,dS_1\cos\theta_1, \tag{5-73}$$
where $nR'/n'R$ with $n = 1$ is the (linear) magnification of the image. Hence, the
illuminance on the retina is given by
$$E = \eta\,\frac{dF}{dS_1'} = \frac{\eta\,n'^2 L\,dS_2}{R'^2}. \tag{5-74}$$
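Equation (5-74) contains neither the object distance $R$ nor the observation angle $\theta_1$, which can be confirmed numerically. The sketch below assumes that the flux entering the eye pupil is $L\,dS_1\cos\theta_1\,dS_2/R^2$ (the form consistent with Eqs. (5-73) and (5-74), with $L$ the luminance of the surface); the image-space index 1.336 and the other values are illustrative assumptions.

```python
import math

def retinal_illuminance(L, dS1, theta1, dS2, R, R_img, n_img=1.336, eta=1.0):
    """Retinal illuminance as eta * dF / dS1', following Eqs. (5-73) and (5-74).

    L: luminance of the Lambertian surface; dS1: its area; theta1: observation angle (rad);
    dS2: eye pupil area; R: object distance; R_img: image distance;
    n_img: assumed image-space refractive index; eta: transmission factor of the eye.
    """
    flux = L * dS1 * math.cos(theta1) * dS2 / R ** 2                   # flux entering the pupil (assumed form)
    image_area = (R_img / (n_img * R)) ** 2 * dS1 * math.cos(theta1)   # Eq. (5-73) with n = 1
    return eta * flux / image_area

E_near = retinal_illuminance(100.0, 1e-4, 0.0, 1e-5, 1.0, 0.022)
E_far = retinal_illuminance(100.0, 1e-4, math.radians(60.0), 1e-5, 10.0, 0.022)
print(E_near, E_far)
```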
seen from the object and image spaces are the entrance (EnP) and exit (ExP) pupils,
respectively. An object ray passing through the center of the aperture stop and actually or
appearing to pass through the centers of the entrance and exit pupils is the chief (or the
principal) ray (CR). An object ray passing through the edge of the aperture stop and
actually or appearing to pass through the edges of the entrance and exit pupils is the
marginal ray (MR). The chief ray from the edge of an object determines the location of
the exit pupil and the height of the image. Similarly, the marginal ray from the axial point
object determines the size of the exit pupil and the location of the axial image point. The
approximate size of an imaging element to avoid vignetting by it is equal to the sum of
the magnitudes of the heights of the chief ray on it from the edge point object and the
marginal ray from the axial point object. A system is telecentric on the image side when
its aperture stop lies in its object-space focal plane. The exit pupil in this case lies at
infinity, and a chief ray lies parallel to the optical axis in the image space. Similarly, a
system is telecentric on the object side if its aperture stop lies in the image-space focal
plane. An afocal system with an aperture stop placed in an intermediate focal plane is
telecentric on both object and image sides.
The field stop of a system is an aperture, placed at a final or intermediate real image
of the object, that limits the cone angle of the transmitted chief rays from an object. Its
images as seen from the object and image spaces are the entrance and exit windows, EnW
and ExW, respectively. The entrance window defines the object field that is actually
imaged in the exit window. The angle subtended by the entrance window at the center of
the entrance pupil represents the angular field of view of the system in object space.
Similarly, the angle subtended by the exit window at the center of the exit pupil is the
angular field of view of the system in image space. The ratio of the two angles is equal to
the magnification of the exit pupil when the refractive indices of the object and image
spaces are equal.
$F = 2\pi I R \left[ \frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}} \right]$ (5-75a)
$= \frac{I}{R^2}\, S$ for $R \gg a$ , (5-75b)
where $S = \pi a^2$ is the area of the aperture. Because $S/R^2$ is the solid angle subtended by
the aperture at a distant point source, $IS/R^2$ is the flux incident on the aperture, and
$I/R^2$ is the uniform irradiance on it, yielding the inverse-square law of irradiance.
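To see how quickly Eq. (5-75a) approaches the inverse-square form of Eq. (5-75b), the two expressions can be compared for a few values of $R/a$. This is an illustrative sketch; the function names and the unit intensity are assumptions made for the example.

```python
import math

def flux_exact(intensity, a, r):
    """Eq. (5-75a): F = 2*pi*I*R * [1/R - 1/sqrt(R^2 + a^2)]."""
    return 2.0 * math.pi * intensity * r * (1.0 / r - 1.0 / math.hypot(r, a))

def flux_far(intensity, a, r):
    """Eq. (5-75b): F = I*S/R^2 with S = pi*a^2, valid for R >> a."""
    return intensity * math.pi * a**2 / r**2

# Aperture radius a = 1 in arbitrary units, unit intensity.
for r in (1.0, 10.0, 100.0):
    exact, approx = flux_exact(1.0, 1.0, r), flux_far(1.0, 1.0, r)
    print(f"R/a = {r:6.1f}  exact = {exact:.6e}  inverse-square = {approx:.6e}")
```

For $R$ comparable to $a$ the inverse-square expression overestimates the flux, while for $R \gg a$ the two agree closely.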
$I_i = I_o \left( \frac{F_{ex}}{F_{en}} \right)^2 \left( \frac{\cos\theta}{\cos\theta'} \right)^3$ , (5-76)
where $F_{en}$ and $F_{ex}$ are the focal ratios of the optical beams entering and exiting from the
imaging system, and $\theta$ and $\theta'$ are the chief ray angles in the object and image spaces.
$E(0) = \frac{BS}{a^2 + R^2}$ , (5-77)
where $S = \pi a^2$ is the area of the disc. At a large distance from the disc, Eq. (5-77)
reduces to
$= I/R^2$ , (5-78b)
The off-axis irradiance at a point at a large distance making an angle $\delta$ with the axis
of the disc, i.e., the flux incident on the area element $dS_r$ (illustrated in Figure 5-18) per
unit area, is given by
When R is not much greater than a, the actual irradiance values are higher than those
predicted by Eq. (5-79).
$\frac{B'}{n'^2} = \frac{B}{n^2}$ , (5-80)
where $n$ and $n'$ are the refractive indices of the object and image spaces, respectively. In
the case of imaging by a mirror, $n' = -n$, and therefore $B' = B$. In practice, $n' = n$ even
for a refracting system, and therefore $B' = B$. In reality, however, $B' < B$ due to losses in
the system.
5.6 Summary of Results 227
For a uniformly radiating object with a radiance B, the image irradiance distribution
is generally nonuniform. When the aperture stop of the system lies in the object space, it
decreases according to (see Figure 5-27)
where
$E(0) = (\pi B / M^2)\, \alpha^2$ (5-82a)
$= (\pi B / 4M^2) / F_{en}^2$ . (5-82b)
Here, $\theta$ is the chief ray angle in the object space, $M$ is the image magnification, $2\alpha$ is
the angular aperture of the entrance pupil, and $F_{en}$ is the focal ratio of the light cone
entering the entrance pupil.
where
$F_\infty = f'/D_{en}$ (5-84)
is the corresponding focal ratio of the image-forming light cone. Here, $f'$ is the focal
length of the system, and $D_{en}$ is the diameter of its entrance pupil. The focal ratio $F_{ex}$ for
finite conjugates is related to $F_\infty$ according to
$F_{ex} = F_\infty (1 - M/m)$ , (5-85)
Figure 5-27. Radiometry of point object imaging. $P$ and $P'$ are the object and image
points, and $dS$ and $dS'$ are the object and image elements.
228 STOPS, PUPILS, AND RADIOMETRY
where $m = D_{ex}/D_{en}$ is the pupil magnification, $D_{ex}$ being the diameter of the exit pupil.
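Equations (5-84) and (5-85) are easy to evaluate directly. The sketch below uses hypothetical function names and a hypothetical 100-mm lens with a 25-mm entrance pupil; unit pupil magnification is assumed unless stated otherwise.

```python
def focal_ratio_infinity(focal_length, entrance_pupil_diameter):
    """Eq. (5-84): F_inf = f' / D_en."""
    return focal_length / entrance_pupil_diameter

def focal_ratio_exit(f_inf, M, m=1.0):
    """Eq. (5-85): F_ex = F_inf * (1 - M/m)."""
    return f_inf * (1.0 - M / m)

f_inf = focal_ratio_infinity(100.0, 25.0)   # hypothetical lens, same length units
f_unit = focal_ratio_exit(f_inf, M=-1.0)    # 1:1 imaging with an inverted image
print(f_inf, f_unit)
```

With $M = -1$ and $m = 1$ the exit focal ratio is twice its infinite-conjugate value, so by the $1/F_{ex}^2$ dependence of the image irradiance the image is four times dimmer than for a distant object.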
If the aperture stop lies in the image space, then the irradiance distribution is given
by
where
$E(0) = \pi B\, (n'/n)^2 / (4F_{ex}^2)$ . (5-86b)
Equations (5-81) and (5-86a) represent the cosine-fourth power law of irradiance in the
object and image spaces, respectively, showing that the irradiance of the image of a
Lambertian object decreases as the fourth power of the cosine of the chief ray angle $\theta$ in
the object space or $\theta'$ in the image space.
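The falloff implied by the cosine-fourth law can be tabulated in a few lines. This is a sketch that assumes the law takes the form $E(\theta) = E(0)\cos^4\theta$, as the text states in words; the function name and the sample angles are illustrative.

```python
import math

def relative_irradiance(chief_ray_angle_deg):
    """E(theta) / E(0) = cos^4(theta), the cosine-fourth law."""
    return math.cos(math.radians(chief_ray_angle_deg)) ** 4

for angle in (0, 10, 20, 30, 45):
    print(f"{angle:2d} deg: {relative_irradiance(angle):.3f}")
```

At a chief ray angle of 45° the image irradiance is one quarter of its axial value, which is why wide-field systems show noticeable darkening toward the edge of the field.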
In a concentric system, the aperture stop, entrance pupil, and the exit pupil all lie at
the common center of curvature of the imaging elements, and the image is formed on a
concentric spherical surface. The chief ray angles in the object and image spaces are
equal, and an image element is normal to the line joining it and the center of the exit
pupil. The irradiance distribution is accordingly given by
Thus, the irradiance of the spherical image formed by a concentric system decreases
linearly with the cosine of the angle of the chief ray in the object or the image space,
where $E(0)$ may be obtained from Eq. (5-68).
$E = \eta\, n'^2 L\, S_e / R'^2$ , (5-88)
where $\eta$ is the transmission of the eye, $n'$ is its refractive index, $R'$ is its diameter, and
$S_e$ is the area of its pupil.
The size of the retinal image of an object subtending an angle $\beta$ at the eye is given
by $(n/n')\, R'\beta$, where $n$ and $n'$ are the refractive indices of the object and image spaces,
respectively, and $R'$ is the distance of the retina from the image-space principal point of
the eye. In practice, $n = 1$ for observations in air and 1.33 for observations in water,
$n' = 1.33$, and $R' \sim 2.5$ cm.
References 229
REFERENCES
1. R. McCluney, Introduction to Radiometry and Photometry, Artech, Boston
(1994).
5. M. Reiss, “The cos4 law of illumination,” J. Opt. Soc. Am. 35, 283–288 (1945).
7. M. Reiss, “Notes on the cos 4 law of illumination,” J. Opt. Soc. Am. 38, 980–986
(1948).
PROBLEMS
5.1 Consider a system consisting of two thin lenses of equal focal lengths with an
aperture stop placed midway between them. Show that its entrance and exit pupils
lie at its respective principal points.
5.2 A system consists of two thin lenses with focal lengths of 10 cm and 5 cm and
apertures of 4 cm, spaced 4 cm apart. A stop 2 cm in diameter is located
midway between them. (a) Determine its principal points. (b) Find the position and
size of its entrance and exit pupils. (c) Find the position and size of the image of an
object placed 10 cm from the first lens. (d) Sketch everything on a diagram
showing, in addition, the two tangential marginal rays and the chief ray from the
top of the object if it is 4 cm high. (e) In the object plane considered, what is the
maximum height of a point object for which there is no vignetting?
5.3 Consider a system consisting of two thin lenses placed 4 cm apart with a 4-cm
aperture placed midway between them. The first lens has a diameter of 4.6 cm and
a focal length of 5.8 cm. The second lens has a diameter of 5.8 cm. An object is
placed 8 cm from the first lens. (a) Determine the aperture stop of the system. (b)
Sketch the vignetting diagram for a point object 4 cm from the optical axis.
5.4 An exit pupil with a 3-cm aperture is located 6 cm in front of a convex mirror that
has a radius of curvature of 10 cm. An object 1 cm high is centrally located on the
axis 12 cm in front of the mirror. (a) Locate the entrance pupil and the image. (b)
Find the minimum diameter of the mirror needed to see the entire object from all
points of the exit pupil.
5.6 Show that the height of a light bulb (assumed to be a point source) from the center
of a circular table of radius $a$ for maximum illumination at its edges is given by
$a/\sqrt{2}$.
5.7 According to the Stefan–Boltzmann law, the exitance (i.e., the power radiated by a
unit area) of a blackbody at a temperature $T$ (in kelvin) is given by $\sigma T^4$, where
$\sigma = 5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$ is the Stefan–Boltzmann constant. Consider the sun to
be a blackbody at 6000 K. (a) Determine its radiance. (b) Calculate the solar
irradiance on the earth, called the solar constant (the solar constant is also
expressed as 2 calories/cm$^2$ min). (c) Compare it with the irradiance of the solar
image formed by a lens with an f-number of 5. (d) Assuming that the moon
reradiates 20% of the light incident on it, compare the lunar irradiance on the earth
for full moon with solar irradiance in full sunlight. Some of the sizes and distances
of interest are as follows: the radius of the sun and its distance from the earth are
$6.96 \times 10^8$ m and $1.49 \times 10^{11}$ m, respectively, and the radius of the moon and its
distance from the earth are $1.77 \times 10^6$ m and $3.80 \times 10^8$ m, respectively.
5.8 Consider an optical system imaging a small circular object of radius $h$ centered on
its optical axis. Let the circular image be of radius $h'$. Let $\beta_0$ and $\beta_0'$ be small
slope angles of the axial marginal rays in the object and image spaces of the system
(see Figure 5-2). Show by using the Lagrange invariance of Eq. (2-74) that the
object and image radiances are related to each other according to Eq. (5-34), where
$n$ and $n'$ are the refractive indices of the object and image spaces. The object and
image sizes are assumed to be small so that the entrance and exit pupils subtend
approximately the same angles at every point on them.
5.9 Determine the flux incident on a solar panel 1 m $\times$ 2 m when the sun is at zenith,
30° and 60°. Assume that the radiance of the sun is 22.5 MW/m$^2$ sr and its angular
diameter as seen from the earth is half a degree.
CHAPTER 6
OPTICAL INSTRUMENTS
6.1 INTRODUCTION
In this chapter we describe the basic principles of some of the commonly used
optical instruments. We start with the most common, the human eye, and discuss how
spectacles correct near- or farsightedness. We then discuss a magnifier (or a reading
glass), a microscope, and a telescope. We illustrate how the eye interacts with such
instruments when images are observed by humans. A pinhole camera is also described
briefly.
6.2 EYE
6.2.1 Anatomy and Structure
The human eye is a visual positive lens system that forms a real image on the retina,
as illustrated in Figure 6-1. It is nearly spherical, with a diameter of about 2.5 cm among
adults and a tough 1-mm-thick outer shell called the sclera. Its front portion, where the
eye bulges outward, represents the first element of the lens system called the cornea. It is
a transparent tissue approximately 0.5 mm thick, with a refractive index of 1.377, while
the rest of the sclera is white and opaque. Nearly two-thirds of the bending of object rays
takes place at the air–cornea interface. The cornea is also slightly reflective and acts like a
convex mirror, resulting in our ability to see ourselves in the eyes of another person.
Because the refractive index of the cornea is very close to that of water (1.333), no
significant refraction takes place at a water–cornea interface. Accordingly, a person
cannot see very well under water (divers wear a mask that creates airspace between the
water and the eye). The eyelids protect the delicate cornea from foreign particles. By
blinking constantly, they keep a layer of tears on the cornea. The tear film is produced by
glands within the lids. Without the tears, a dry cornea loses its transparency.
Rays emerging from the cornea pass through a chamber filled with a clear watery
fluid called the aqueous humor, which has a refractive index of 1.336. Because of the
closeness of the refractive indices, only a small refraction of the rays takes place at the
cornea–aqueous humor interface.
A diaphragm, called the iris, immersed in the aqueous humor, controls the amount of
light entering the eye. Its central hole is called the pupil, which can be seen as a small
central black spot of the eye. It is black, of course, because light goes through it. While
the iris defines the entrance pupil, its image by a crystalline lens, which lies immediately
behind the iris, defines the exit pupil of the eye. The exit pupil is located behind the iris
and is somewhat smaller than the entrance pupil. The iris also gives the eye its color, e.g.,
brown, green, or blue. It is made up of circular and radial muscles that expand or contract
to increase or decrease the diameter of the pupil from approximately 2 mm in bright light
to 8 mm in darkness. The lens, which is about 4 mm thick and 9 mm in diameter, is a
complex, layered fibrous mass surrounded by an elastic membrane. As many as 22,000
very fine layers are arranged as in an onion. Its index of refraction varies from about
1.406 at the inner core to approximately 1.386 at the less dense cortex. Its index is a
radial analog to the linearly varying index of spectacle glasses in use today. Behind the
lens is another chamber filled with a transparent gelatinous substance called the vitreous
humor, which has a refractive index of 1.337. The lens is suspended in place by
threadlike fibers that are connected to the ciliary muscle. The muscle contracts, loosening
tension on the lens and allowing it to bulge, thereby increasing its power to focus on a
nearby object.
Within the tough sclerotic wall is an inner shell, called the choroid. It is a dark layer
with blood vessels and pigmented cells. It absorbs any stray light like the interior black
walls of a camera. A paper-thin (about 50 $\mu$m) layer of photoreceptor cells, called the
retina, covers much of the inner surface of the eye. The red glow in the flash photo of
some people represents the light reflected from the retina by fine blood vessels.
Interestingly, the curved retina closely approximates the Petzval surface of the eye’s
optical system. There are two types of photoreceptor cells, called rods and cones. There
are about 125 million rods, 6.5 million cones, and a million fibers. The rods are
extremely sensitive to light, but do not distinguish color. The cones are used in bright
light, such as daylight, and provide color perception. They do not function in a color-
blind person. The normal wavelength range of human vision is approximately 380 nm to
780 nm. How the response varies with wavelength is given in Table 5-2 and Figure 5-24.
The crystalline lens absorbs in the ultraviolet. With age, the lens gets clouded and loses
transparency, a condition called cataract. People whose lenses have been surgically
removed are significantly more sensitive to ultraviolet light.
The electrical impulses generated by the retinal cells are carried by fibers at a rate of
about $10^9$ bits/sec. The eye interfaces with the brain through the optic nerve, which does
not contain any photoreceptors. It is therefore insensitive to light, thereby creating a blind
spot about 0.6 mm in diameter. The blind spot can be demonstrated very easily by
considering Figure 6-2. With the left eye closed, stare at the + sign, starting with it at a
distance of about 25 cm and slowly bringing the figure closer. At a certain distance the
picture on the right disappears as its image falls on the blind spot of the right eye. A
chronically elevated pressure of the fluids within the eye, a condition called glaucoma,
can lead to blindness if not treated.
At the center of the retina, there is an area about 2.5 to 3 mm in diameter, known as
the yellow spot or macula. There is a tiny, rod-free region about 0.3 mm in diameter,
called the fovea centralis. The cones in this region are thinner and more densely packed,
and thus yield the sharpest image. They have a diameter of about 1.5 $\mu$m and are spaced
about 2 to 2.5 $\mu$m apart. There are about 14,700 cells/mm$^2$ in the fovea, compared to
5500 dots/mm$^2$ for a fine laser printer. Without the fovea, 90–95% of the vision is lost;
only the peripheral day and night vision is retained. The fovea does not lie on the optical
axis of the lens system of the eye. The line joining the lens center and the fovea is
referred to as the visual axis of the eye.
Comparing the eye to a camera, the cornea provides a majority of the focusing, the
iris is the aperture stop, the crystalline lens provides the fine focusing, and the retina
plays the role of a film or a solid state detector (pixel) array. Of course, the eyelids are
equivalent to a lens cover.
Because the spacing between the principal points (and, therefore, between the nodal
points) is very small (only 0.32 mm), a single refracting surface represents both the
cornea and the lens in a reduced eye model, as illustrated in Figure 6-3c. The principal
points coincide with the vertex of the surface in this model, and the nodal points coincide
with its center of curvature. The eye is assumed to be filled with vitreous humor of
refractive index 1.333. Thus, the focal length of the reduced eye is the same as that of the
Helmholtz eye. The optical parameters of the three models are listed in Table 6-1.
Figure 6-2. Blind spot demonstration.
Figure 6-3. Paraxial models of the eye, illustrating its cardinal points. (a) Schematic
eye. (b) Simplified schematic eye (single-surface cornea). (c) Reduced eye (single
refracting surface).
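The single-surface reduced eye can be sketched numerically from the standard paraxial relations for one refracting surface between air and a medium of index $n'$. The values below are assumptions for illustration only: a total power of 60 diopters (the round figure quoted for the eye in Section 6.2.3) and $n' = 1.333$; the entries of Table 6-1 are not reproduced here, and the function name is hypothetical.

```python
def reduced_eye(power_diopters=60.0, n_prime=1.333):
    """Paraxial parameters of a single refracting surface (air to index n')."""
    K = power_diopters
    return {
        "radius_mm": 1000.0 * (n_prime - 1.0) / K,       # from K = (n' - n) / r, n = 1
        "image_focal_length_mm": 1000.0 * n_prime / K,   # f' = n' / K
        "object_focal_length_mm": -1000.0 / K,           # f = -n / K
    }

eye = reduced_eye()
# The nodal points lie a distance f' + f from the vertex.
nodal_distance_mm = eye["image_focal_length_mm"] + eye["object_focal_length_mm"]
print(eye, nodal_distance_mm)
```

The nodal distance $f' + f$ equals the radius of curvature, in agreement with the statement that the nodal points of the reduced eye coincide with the center of curvature of the surface.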
6.2.3 Accommodation
The cornea provides nearly 43 of the total 60 diopters of the focusing power of an
eye (see Problem 6.2). The fine focusing of the image of an object as its distance changes
is performed by the crystalline lens in a process called accommodation. Generally, the
lens muscles are relaxed when forming the image of an object lying at infinity, as
illustrated in Figure 6-4a. As the object moves closer, the ciliary muscle contracts, and the
front surface of the lens becomes more curved. The lens becomes thicker at the
center, thereby reducing its focal length and maintaining a sharp image on the retina. This
is illustrated in Figure 6-4b. Too much work by the ciliary muscles over long periods
leads to eye strain or fatigue. As the object still moves closer, a point is reached when the
lens shape cannot change any more, and the image of any closer object is blurred. The
closest point for which the eye can form a sharp image is called the near point. It varies
from 7 cm for a teenager to 25 cm or so for a young adult, and to roughly 100 cm in a
middle-aged person. Accommodation changes the power of the crystalline lens by about 4
diopters, although a teenager may possess more than 10 diopters.
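The near-point distances quoted above can be related to an accommodation amplitude in diopters. The sketch below uses the common thin-lens approximation that the amplitude equals the difference of the reciprocal near-point and far-point distances (in meters); this approximation and the function name are assumptions, not taken from the text.

```python
def accommodation_amplitude(near_point_cm, far_point_cm=float("inf")):
    """Amplitude in diopters: 1/near - 1/far, with both distances in meters."""
    return 100.0 / near_point_cm - 100.0 / far_point_cm

for label, near_cm in (("teenager", 7.0), ("young adult", 25.0), ("middle age", 100.0)):
    print(f"{label:12s} near point {near_cm:5.1f} cm -> "
          f"{accommodation_amplitude(near_cm):.1f} D")
```

A 25-cm near point with the far point at infinity corresponds to 4 diopters, and a 7-cm near point to more than 10 diopters, consistent with the figures given in the text.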
Figure 6-4. Normal eye. (a) Relaxed eye showing the far point at infinity. (b)
Accommodated eye illustrating the near point.
distance between the lens and retina. Birds of prey keep a rapidly moving object in
constant focus over a wide range of distances by changing the curvature of the cornea. In
Lasik (laser in-situ keratomileusis) surgery, it is indeed the curvature of the cornea that is
changed. In a cataract operation, on the other hand, the crystalline lens is replaced by a
plastic lens.
Visual acuity is maximum at the fovea and decreases in the outer region of the retina.
Thus, the fovea provides the details, and the outer region gives a general view of an
object scene. The eye has an elliptical field of view, approximately 150° high and 210°
wide. The stereoscopic field of view obtained with the use of both eyes is approximately
circular with an angular diameter of about 130°. The eyeball rotates automatically as
needed so that the image of the region of interest in a certain object falls on the fovea. In
low illumination, the eye becomes color blind due to the low sensitivity of the cones.
Visual acuity decreases as illumination decreases. The optic nerve transmits the retinal
image to the brain, which interprets it. For example, the image of an object on the retina is
inverted, yet a person sees it as being erect.
Figure 6-5. Snellen eye charts for (a) literate and (b) illiterate patients.
Table 6-2. Relationship between visual acuity and the corresponding corrective
power required in diopters for a nearsighted person.
Chart line    Visual acuity    Corrective power (D)
8             20/20             0.00
7             20/25            –0.25
6             20/30            –0.50
5             20/40            –0.75
4             20/50            –1.00
3             20/70            –1.25
2             20/100           –1.50
              20/150           –2.00
1             20/200           –2.50
              20/250           –3.0
              20/300           –3.5
              20/350           –4.0
              20/400           –4.5
              20/450           –5.0
              20/500           –5.5
              20/600           –6.5
Figure 6-6. Myopic (or nearsighted) eye. (a) Object at infinity focused at $F'$ by a
relaxed eye in front of its retina, thus forming a blurry image on the retina. (b)
Objects at the far point and closer are in focus, illustrating that nearby objects are
seen well. (c) Object at infinity imaged by a negative spectacle lens forming a virtual
image at the far point. The virtual image is the object for the eye, which images it at
$F'$ on the retina. (d) Object $P$ closer than the far point imaged at $P'$ on the retina
by the spectacle lens and accommodation.
When parallel rays from an object at infinity are focused at $F'$ in front of the retina,
the focusing power of the eye is too high, as illustrated in Figure 6-6a, and the eye is said
to be myopic. Such a condition can also happen if the curvature of the cornea is too high.
It has the consequence that the far point falls short of infinity, and all points beyond
it will appear blurred. An object point $P$ located at the far point is imaged at $P'$ on the
retina without any accommodation (see Figure 6-6b). Objects closer than $P$ are imaged on
the retina with accommodation. A person with a myopic eye is said to be nearsighted,
i.e., nearby objects are seen well, but the distant objects are not. A myope brings an
object close enough, i.e., at or within the far point, to see it well. The near point of a
myope with normal accommodation is closer than if the eye were emmetropic (normal). A
myopic eye can be compensated with a negative spectacle lens such that the combination
of the two yields a focus on the retina without accommodation [4]. An object at infinity is
imaged by the spectacle lens at the far point (see Figure 6-6c), which the eye is able to
focus on without accommodation. The objects at other distances are seen well with
accommodation. Of course, objects at distances shorter than the far point are also seen
well without the spectacles, using somewhat less accommodation. The nearby objects are
focused on with accommodation, as illustrated in Figure 6-6d.
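The condition that the spectacle lens images an object at infinity onto the far point fixes its power. As a minimal sketch, assuming a thin lens placed at the eye (the lens-to-eye distance is neglected) and a hypothetical function name, the required power is the negative reciprocal of the far-point distance in meters:

```python
def myopia_lens_power(far_point_cm):
    """Power (diopters) of a thin lens whose image-space focus lies at the far point."""
    return -100.0 / far_point_cm

for far_cm in (200.0, 50.0, 25.0):
    print(f"far point {far_cm:5.1f} cm -> {myopia_lens_power(far_cm):+.2f} D")
```

The closer the far point lies to the eye, the stronger the negative lens that is needed.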
When an object lying at infinity is focused beyond the retina, as illustrated in Figure
6-7a, the eye is said to be hyperopic. The focusing power of the eye is too weak, i.e., the
cornea is less curved, or the lens has become too thin in its relaxed state. Accordingly,
distant objects are seen well only by accommodation, as illustrated in Figure 6-7b.
However, there is not sufficient accommodation for nearby objects, which are imaged
beyond the retina and are therefore out of focus (see Figure 6-7c). A person with such a
condition is said to be farsighted, i.e., distant objects are seen well, but nearby objects are
not. A hyperope holds objects at arm's length (> 25 cm) to see them well. This
distance represents the near point of the person, and spectacles are needed to see well any
objects that are closer than this point. The far point $F'$ of a hyperope is virtually located
behind the eye, as in Figure 6-7a. With normal accommodation, the near point of such a
person is more distant than that for a normal eye. The focal length of the spectacle lens is
chosen so as to bring the near point to a comfortable distance of 25 cm (see Figure 6-7d).
With a positive spectacle lens, distant objects are imaged sharply without much
accommodation. However, nearby objects are imaged by the lens beyond the near point,
which, in turn, are brought in focus by accommodation. An object at infinity is imaged by
it at the far point, which the eye images on the retina, as illustrated in Figure 6-7e.
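The choice of focal length that brings the near point to 25 cm can be illustrated with the thin-lens imaging relation $1/S' - 1/S = 1/f'$, the same sign convention that appears in the magnifier equations of Section 6.3. The sketch below assumes a thin spectacle lens at the eye; the function name and the 1-m near point are hypothetical.

```python
def hyperopia_lens_power(near_point_cm, reading_distance_cm=25.0):
    """Thin-lens power (diopters) imaging an object at the reading distance
    onto the uncorrected near point.

    Uses 1/S' - 1/S = 1/f' with S = -(reading distance) and S' = -(near point),
    i.e., a virtual image on the object side; distances are converted to meters.
    """
    s = -reading_distance_cm / 100.0
    s_prime = -near_point_cm / 100.0
    return 1.0 / s_prime - 1.0 / s

print(hyperopia_lens_power(100.0))   # uncorrected near point at 1 m (hypothetical)
```

A near point that already lies at 25 cm requires no correction, and the required positive power grows as the near point recedes.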
Figure 6-7. Hyperopic (or farsighted) eye. (a) Object at infinity focused beyond the
retina by a relaxed eye, thus forming a blurry image on the retina. The focal point
$F'$ is the virtual far point of the eye. (b) Object at or beyond the near point focused
on the retina by accommodation. (c) Object $P$ closer than the near point imaged at
$P'$ beyond the retina even with accommodation. (d) Positive spectacle lens images
an object $P$ at 25 cm at the near point, which the eye images at $P'$ on the retina. (e)
Object at infinity imaged by the spectacle lens at the far point, which the eye images
on the retina.
Another common defect of the eye is astigmatism. It arises from a cornea that is toric
(or spherocylindrical) instead of being spherical, i.e., the cornea has an uneven curvature
(like an egg rather than a ping-pong ball), resulting in different power in different meridians
(see Figure 6-8). A toric surface, such as that illustrated in Figure 6-9, forms a line image
of a point object even when it lies on its axis. However, a person afflicted with
astigmatism sees only a blurry image. If the object consists of vertical and horizontal
lines, as in the wires of a window screen, such a person can focus (by accommodation) on
only the vertical or the horizontal lines at a time. This is analogous to the spoked-wheel
example of Figure 9-16, where the rim is in focus in one observation plane and the spokes
are in focus in another when imaged by a lens with astigmatism. The astigmatism of the
eye may be other than horizontal or vertical. Its axis can be determined by looking at an
astigmatism eye chart consisting of radial spokes, as shown in Figure 6-10a. The spoke
that is seen in focus, e.g., the one at 30° from the vertical in Figure 6-10b, represents the
axis of astigmatism. A cylindrical lens, illustrated in Figure 6-11, is used to correct
astigmatism. It introduces power only in the meridian perpendicular to its axis. If the eye
is also myopic or hyperopic, then a toric lens is required for correction. However, with a
rigid contact lens, the space between its back surface and the cornea is filled with the tear
fluid. Thus, astigmatism practically disappears, and the curvature of the front surface
provides the needed myopic or hyperopic correction. A soft contact lens requires proper
orientation to align its toroidal power with that of the eye.
Figure 6-8. Astigmatic eye illustrating a cornea with uneven curvature, resulting in a
blurry image of a point object on the retina. Although line images are formed in
front of and behind the retina, and a circular image is formed halfway between
them, a patient perceives only a blurry image.
Figure 6-10. (a) Astigmatism eye chart. (b) Chart as observed by an astigmatic eye,
indicating the axis of astigmatism at 30° from the vertical.
Figure 6-11. Cylindrical lens showing that parallel rays incident on it are focused on a
line (as opposed to a point in the case of a spherical lens). (a) Convex, forming a real
line focus. (b) Concave, forming a virtual line focus.
With age comes another condition called presbyopia, i.e., the inability of the eye to
accommodate. The crystalline lens hardens and becomes inflexible. As a result, a
nearsighted person, for example, cannot read a newspaper at a normal distance while
wearing glasses. In order to read it, the newspaper is kept at arm’s length. The near point
in this case has receded beyond the comfortable reading distance. The choice is either to
remove the eyeglasses or wear bifocal lenses with a less-negative lower half. Similarly, a
farsighted person wears bifocals with a more-positive lower half. Some people cannot
adjust to bifocal spectacles and keep two sets. For example, a nearsighted person may use
one pair of spectacles for distant objects (e.g., driving and watching television) and
another for nearby objects (e.g., reading and sewing).
A starting point for a prescription is determined by looking at the eye chart without
any spectacles. Visual acuity of a person does not by itself determine if that person is
Spectacle lenses are generally made with (spectacle) crown glass of refractive index
1.523. Sometimes, a glass with a high refractive index of 1.70 is used to reduce the
weight of the lens. These days, plastic lenses (Plexiglass) of index 1.495 are used because
their weight is roughly half that of a glass lens. Photochromic lenses, which change
transmission as a function of illumination level, are also quite common. The contact
lenses are meniscus lenses varying from 6 to 15 mm in diameter. A contact lens rests on
the cornea with a conforming shape. They are approximately 100 $\mu$m thick and ride on
the tear fluid. Their refractive index ranges from 1.43 to 1.49. They are made from some
polymer or even Plexiglass, weighing about 1 to 3 mg.
The cause of poor acuity is not always an abnormal refraction by the cornea or the
lens, or an abnormal distance from the lens to the retina. It may be due to other causes;
for example, the media of the eye may be partially opaque, or the retina may be diseased. Whether the spectacles
will improve the acuity or not can be easily checked by looking through a pinhole (so that
only a small central region of the cornea/lens is used). If the person under examination
can see well, the spectacles will help. Otherwise, the condition is pathological, and the
spectacles may not help at all.
6.3 MAGNIFIER
The apparent size of an object as seen with an unaided eye depends on the angle it
subtends at the eye. As an object is moved closer from a position P1 to a position P2 , the
angle it subtends at the eye increases from $\beta_1$ to $\beta_2$, as illustrated in Figure 6-12a. This
results in an increase in the size of the image on the retina, which the eye keeps in focus
by accommodation. However, there is a limit to how close the object can get before the
eye is not able to accommodate any more, and the image gets blurred. Although this
distance varies somewhat from person to person, 25 cm is considered a standard near
point or the distance of most distinct vision. A magnifier, also called a simple
microscope, is a positive lens used to magnify the image beyond one’s accommodation.
People use it as a magnifying reading glass to look at fine print, and watchmakers use it
as an eye loupe to look at the details inside a watch.
Let $\beta = -h/25$ be the angle subtended by an object of height $h$ (in cm) when placed
at the near point, as illustrated in Figure 6-12b. If the object is observed through a
magnifier, it can be brought much closer to the eye. If it is placed inside the focus $F$ of
the magnifier at a distance $S$ from it, as illustrated in Figure 6-12c, it forms a large virtual
but erect image of height $h'$ at a distance $S'$ from the magnifier. This virtual image is seen by
the eye subtending a much larger angle $\beta'$.
If a magnifier of focal length $f'$ lies at a distance $d$ from the eye, it forms a virtual
image of height $h'$ at a distance $S' + d$ from the eye, where
$h' = h\,(S'/S) = h\,(f' - S')/f'$ . (6-1)
Figure 6-12. Magnifier. (a) Object at various distances observed with an unaided
eye. (b) Object observed at the standard distance of 25 cm. (c) Object observed
through a magnifier of focal length $f'$ kept at a distance $d$ from the eye. (d) Object
placed in the front focal plane of the magnifier and observed through a magnifier in
(near) contact with the eye.
The virtual image is seen by the eye subtending an angle $\beta' = h'/(S' + d)$ at the eye. The
ratio of the size of the retinal image thus formed to its size when the object is seen
without the magnifier from a distance of 25 cm is called the visual magnification of the
magnifier. It is given by the angular magnification
$$M_\beta = \frac{\beta'}{\beta} = \frac{h'/(S' + d)}{-h/25} = -\frac{25\,(f' - S')}{f'\,(S' + d)} . \tag{6-2}$$
The magnification increases when the magnifier is moved closer to the eye. If it is
held close to the eye, we let $d \to 0$. Moreover, if the image formed by the magnifier lies
at the near point of the eye, then $S' = -25$ cm. Accordingly, the visual magnification
becomes
$$M_\beta = \frac{25}{f'} + 1 , \tag{6-3}$$
where $f'$ is in cm. The smaller the value of $f'$ is, the larger the value of the
magnification. If the object is placed in the focal plane of the magnifier, then $S = -f'$
and $S' = -\infty$, as in Figure 6-12d, and a normal eye sees it without much accommodation.
In this case, $\beta' \to -h/f'$ and
$$M_\beta \to 25/f' . \tag{6-4}$$
This result can also be obtained from Eq. (6-2) by letting $S' \to -\infty$, regardless of the
value of $d$. The magnifiers are often specified by this magnification. For example, a
magnifier with a focal length of 5 cm is labeled as $5\times$.
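As a numerical check of Eqs. (6-2) to (6-4), the following minimal Python sketch evaluates the visual magnification of a magnifier. The function name `magnifier_magnification` and its argument names are illustrative assumptions, not notation from the text. Distances are in cm, and both the virtual-image distance $S'$ and the eye distance $d$ are taken as numerically negative, as they are labeled in Figure 6-12c.

```python
def magnifier_magnification(f_cm, s_img_cm, d_cm=0.0, near_cm=25.0):
    """Visual magnification of a magnifier, Eq. (6-2).

    f_cm     : focal length f' of the magnifier (cm, positive)
    s_img_cm : virtual-image distance S' from the magnifier (cm, negative)
    d_cm     : magnifier-to-eye distance d (cm, zero or negative)
    near_cm  : standard near-point distance (25 cm)
    """
    return -near_cm * (f_cm - s_img_cm) / (f_cm * (s_img_cm + d_cm))


f = 5.0  # the 5 cm magnifier used as the example in the text

# Image at the near point, magnifier held at the eye: Eq. (6-3), 25/f' + 1
m_near = magnifier_magnification(f, s_img_cm=-25.0, d_cm=0.0)

# Image very far away (S' -> -infinity): approaches Eq. (6-4), 25/f'
m_relaxed = magnifier_magnification(f, s_img_cm=-1.0e9, d_cm=0.0)

# Same near-point image distance, but magnifier held 5 cm from the eye
m_held_away = magnifier_magnification(f, s_img_cm=-25.0, d_cm=-5.0)

print(m_near, m_relaxed, m_held_away)
```

Under these assumptions, the value for the magnifier held away from the eye is smaller than the value for the magnifier at the eye, in line with the statement that the magnification increases as the magnifier is moved closer to the eye.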
6.4 MICROSCOPE
A microscope is generally used to see the details of very small objects at very short
distances. As illustrated in Figure 6-13, a microscope, or more accurately, a compound
microscope, consists of two lenses, one with a very short focal length called the objective,
and the other with a somewhat longer focal length called the eyepiece. In practice, both
the objective and the eyepiece are actually made up of several lenses to reduce the
monochromatic as well as chromatic aberrations (which are discussed in Chapters 7 and
8). The objective of a microscope is its aperture stop, and its image by the eyepiece is its
exit pupil. All of the light entering the objective and refracted by the eyepiece passes
through the exit pupil. The pupil of the eye is placed at the exit pupil; otherwise, the field
of view is restricted.
[Figure 6-13. Compound microscope: objective (aperture stop AS, entrance pupil EnP), eyepiece, exit pupil ExP, and eye.]
When an object $P_0P$ is placed just beyond the focal point of the objective, a real
magnified image $P_0'P'$ is formed by it. This image is further magnified by the eyepiece
acting as a magnifier. The magnified virtual image $P_0''P''$ is observed by the eye, yielding
a final image $P_0'''P'''$ on the retina. The magnification $M$ of the retinal image is equal to
the product of the transverse magnification $M_t$ of the image formed by the objective and
the angular magnification $M_\beta$ of the image formed by the eyepiece:
$$M = M_t M_\beta . \tag{6-5}$$
Let $L$ be the distance of the image $P_0'P'$ formed by the objective from the
image-space focal point $F_o'$ of the objective. From Eq. (2-83), the magnification of this
image is given by
$$M_t = -L/f_o' . \tag{6-6}$$
This image is magnified by the eyepiece, which, in turn, is seen by the eye. The visual
magnification of the retinal image formed by the eyepiece is given by
$$M_\beta = 25/f_e' . \tag{6-7}$$
Both Eqs. (6-6) and (6-7) become exact when the objective forms the image $P_0'P'$ of the
object in the focal plane of the eyepiece, which, in turn, forms the virtual image $P_0''P''$ at
infinity. Substituting these equations into Eq. (6-5), we obtain the magnification of the
microscope:
$$M = -\frac{L}{f_o'}\,\frac{25}{f_e'} \tag{6-8a}$$
$$= 25/f' , \tag{6-8b}$$
where
$$f' = -\frac{f_o' f_e'}{L} \tag{6-9}$$
is the focal length of the microscope, as may be seen by letting $t = f_o' + f_e' + L$ in Eq. (4-26).
The magnification $M$ represents the ratio of the size of the retinal image when an
object is viewed through the microscope to its size when viewed without any aid but
placed at a distance of 25 cm.
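The chain from Eq. (6-5) to Eq. (6-8) can be checked with a short Python sketch. The function names and the sample values (a 16 cm distance $L$, a 0.4 cm objective, and a 2.5 cm eyepiece) are hypothetical choices for illustration only. The focal length of the microscope is taken here as $f' = -f_o' f_e'/L$, the form that makes Eq. (6-8b) agree with Eq. (6-8a).

```python
def microscope_magnification(L_cm, f_obj_cm, f_eye_cm, near_cm=25.0):
    """Magnification of a compound microscope, Eqs. (6-5) to (6-8a)."""
    m_t = -L_cm / f_obj_cm          # transverse magnification of the objective, Eq. (6-6)
    m_beta = near_cm / f_eye_cm     # angular magnification of the eyepiece, Eq. (6-7)
    return m_t * m_beta             # product, Eq. (6-5)


def microscope_focal_length(L_cm, f_obj_cm, f_eye_cm):
    """Focal length of the microscope as a whole (assumed form, see lead-in)."""
    return -f_obj_cm * f_eye_cm / L_cm


# Hypothetical instrument: L = 16 cm, 0.4 cm objective, 2.5 cm eyepiece
L, f_o, f_e = 16.0, 0.4, 2.5
M = microscope_magnification(L, f_o, f_e)
f_micro = microscope_focal_length(L, f_o, f_e)

# Eq. (6-8b): the same magnification written as 25/f'
M_from_focal_length = 25.0 / f_micro

print(M, f_micro, M_from_focal_length)
```

The negative sign of the result reflects the inverted intermediate image formed by the objective.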
Objects are often observed with a microscope by placing them under a cover glass.
Sometimes the space between the cover glass and the objective is filled with a liquid,
such as an oil, to yield a higher numerical aperture and better resolution, as discussed in
Section 6.8.5. A nearsighted or farsighted person can remove their spectacles and focus
the microscope by moving the eyepiece in and out, but an astigmatic person must wear
them when observing with a microscope.
6.5 TELESCOPE
Whereas a microscope is used to view very small, nearby objects, a telescope is used
to view large, distant objects. Like a microscope, an astronomical telescope also consists
of two lenses: an objective and an eyepiece. The two lenses in a telescope are confocal
(or common focus), as illustrated in Figure 6-14, and represent an example of an afocal
system. (A reflecting afocal telescope is discussed in Section 3.4 in the form of a beam
expander.) In Figure 6-14a, both lenses are positive, and the telescope is called Keplerian.
In Figure 6-14b, the first lens is positive, but the second is negative, and the telescope
is called Galilean. The first lens of diameter $D_1$ is called the objective (because it is
closer to the object), and the second lens of smaller diameter $D_2$ is called the eyepiece
(because the observing eye is placed near it).
Figure 6-14. Refracting telescope consisting of two lenses with a common focus. The
image-space focal point $F_1'$ of the first lens and the object-space focal point $F_2$ of the
second lens are coincident. (a) A Keplerian telescope has positive lenses, i.e., $f_1'$ and
$f_2'$ are both numerically positive. (b) A Galilean telescope has a positive first lens
but a negative second lens, i.e., $f_1'$ is numerically positive, but $f_2'$ is numerically
negative.
A parallel beam of light incident on the system is focused at the common focus by
the first lens and emerges as a parallel beam from the second. If the first lens has a longer
focal length than that of the second, the system may also be used as a beam reducer.
Similarly, if the second lens has a longer focal length, then the system can be used as a
beam expander, as may be seen by reversing the system. A screen with a hole, called a
spatial filter, can be inserted at the focus to clean up a laser beam incident on a Keplerian
telescope. However, if the beam is of high power, then the Galilean telescope can be
used to avoid air breakdown at the common focus. It is easy to see from the figure that
the beam-expansion ratio $D_2/D_1$ is given by $f_2'/f_1'$, where $D$ and $f'$ are the diameter
and the image-space focal length of a lens. In Figure 6-14, the image of an object can be
determined by applying the Gaussian or the Newtonian imaging equation recursively to
the two lenses. Now, if the object position changes by a distance $\Delta S$, then the image
position changes by a distance $\Delta S' = M_t^2\,\Delta S$, according to Eq. (2-111), where $M_t$ is the
transverse magnification of the image (and the refractive indices of the object and image
spaces are both equal to unity).
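The two relations in this paragraph lend themselves to a quick numerical illustration. The sketch below is a minimal example under stated assumptions: the function names are invented for illustration, the beam ratio is returned as a magnitude, and the transverse magnification of the afocal pair is taken as $M_t = -f_2'/f_1'$. The focal lengths of 50 mm and 200 mm are hypothetical.

```python
def beam_expansion_ratio(f1, f2):
    """Magnitude of D2/D1 = f2'/f1' for a confocal two-lens system."""
    return abs(f2 / f1)


def afocal_transverse_magnification(f1, f2):
    """Transverse magnification of the afocal pair, assumed as Mt = -f2'/f1'."""
    return -f2 / f1


def image_shift(m_t, delta_s):
    """Image displacement for an object displacement delta_s: Mt**2 * delta_s."""
    return m_t ** 2 * delta_s


# Hypothetical Keplerian beam expander: 50 mm lens followed by a 200 mm lens
ratio = beam_expansion_ratio(50.0, 200.0)
m_t = afocal_transverse_magnification(50.0, 200.0)
shift = image_shift(m_t, delta_s=1.0)   # object moved by 1 mm

print(ratio, m_t, shift)
```

Because the shift depends on the square of $M_t$, it has the same sign as the object displacement for both Keplerian and Galilean arrangements.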
Incidentally, the object-space focal point $F_1$ of the first lens and the image-space
focal point $F_2'$ of the second lens are conjugates of each other, as illustrated in Figure 6-15.
Considering $F_1$ as the object, the first lens forms its image at infinity. Thus, parallel
rays are incident on the second lens, which focuses them at $F_2'$.
Parallel rays from a point on a distant object are shown incident on an objective of a
long focal length $f_o'$ in Figure 6-16a. A real inverted image is formed at $P'$ in its focal
plane. If the focus of the eyepiece also lies in this plane (i.e., if the eyepiece is confocal
with the objective), it forms a virtual image $P''$ of $P'$ at infinity, which is observed by a
relatively relaxed eye as $P'''$.
In astronomical telescopes, the aperture stop AS lies at the objective, which is,
therefore, the entrance pupil EnP. Its image by the eyepiece of a short focal length $f_e'$ is
the exit pupil ExP. The distance between the eyepiece and the exit pupil is called the eye
relief. This distance, indicated as $s'$ in Figure 6-16b, is the image distance corresponding
to an object distance $s = -(f_e' + f_o')$. It is given by $s' = f_e'(f_e' + f_o')/f_o'$. The eye is placed
as close to the exit pupil as possible (to avoid restricting the field of view). If $D_{en}$ and
$D_{ex}$ are the diameters of the entrance and exit pupils, the magnification of the pupil is
given by
$$D_{ex}/D_{en} = s'/s = -f_e'/f_o' . \tag{6-11}$$
Figure 6-15. Conjugate focal points. (a) Keplerian telescope. (b) Galilean telescope.
Figure 6-16. Keplerian telescope with positive objective and eyepiece. (a) The image
formed by the objective is reimaged by the confocal eyepiece at infinity, which is
observed by a relaxed eye. (b) The eyepiece limits the angle of a chief ray CR that
can be transmitted by it.
This result may also be seen from similar triangles $ABF_e$ and $CDF_e$ formed by the
marginal ray from an axial point object at infinity.
The angular magnification of the telescope is defined as
$$M_\beta = \beta'/\beta , \tag{6-12}$$
where $\beta$ is the angle subtended by the object at the objective (or at the unaided eye), and
$\beta'$ is the angle subtended at the eye by the image formed by the eyepiece. It represents
the factor by which the telescope magnifies the angular separation of the images of two
distant objects. It may be seen from the triangles $BCE$ and $CEO$ formed by the chief ray
CR in Figure 6-16b that the magnification of the retinal image due to the telescope is
given by the ratio of the focal length of the objective to that of the eyepiece, i.e.,
$$M_\beta = -f_o'/f_e' .$$
For an object lying at a finite distance, the total magnification of the telescope is obtained
by multiplying the magnification of the objective with that of the eyepiece.
We also note that the eyepiece limits the angle of a chief ray transmitted by the
system. Although Figure 6-16a indicates that a chief ray with a larger angle than shown
may be transmitted by the eyepiece, it also indicates that the outer rays in the off-axis ray
bundle will be vignetted. Thus, the angle $\beta$ represents the unvignetted field of view of the
telescope. Hence, from Eq. (6-12), a telescope with a large field of view is accompanied
by a correspondingly small image magnification (as observed by the eye). For two
positive lenses, as in Figure 6-16b, both focal lengths are numerically positive and the
negative sign in Eq. (6-14) indicates that the image formed by the telescope is inverted.
Sometimes prisms or additional lenses are used to erect the image. Such astronomical
telescopes are referred to as terrestrial telescopes.
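The first-order quantities of a Keplerian telescope discussed above (angular magnification, eye relief, and exit-pupil diameter) can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the function name and the sample telescope (1000 mm objective of 100 mm diameter with a 25 mm eyepiece) are hypothetical, the objective is taken as the aperture stop, and the object distance of the stop from the eyepiece is treated as numerically negative, as drawn in Figure 6-16b.

```python
def keplerian_telescope(f_obj, f_eye, d_obj):
    """First-order properties of a confocal Keplerian telescope.

    Returns (angular magnification, eye relief, exit-pupil diameter).
    All lengths share one unit (for example, mm).
    """
    m_beta = -f_obj / f_eye                     # ratio of focal lengths, image inverted
    s = -(f_obj + f_eye)                        # objective as the object for the eyepiece
    s_img = 1.0 / (1.0 / f_eye + 1.0 / s)       # Gaussian imaging: 1/s' - 1/s = 1/f'
    d_exit = abs(s_img / s) * d_obj             # pupil magnification magnitude, Eq. (6-11)
    return m_beta, s_img, d_exit


# Hypothetical telescope: 1000 mm objective, 100 mm diameter, 25 mm eyepiece
m_beta, eye_relief, d_exit = keplerian_telescope(1000.0, 25.0, 100.0)
print(m_beta, eye_relief, d_exit)
```

For this example, the eye relief slightly exceeds the focal length of the eyepiece, and the exit-pupil diameter equals the objective diameter divided by the magnitude of the angular magnification.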
The field of view of a telescope can be increased by inserting a lens, called a field
lens, at the image formed by the objective. Figure 6-17a shows an object at a field angle
such that only the rays from the lowest portion of the objective are incident on the
eyepiece; all other rays are vignetted (unless the diameter of the eyepiece is increased).
With a slightly larger angle, there would be complete vignetting. However, as illustrated
in Figure 6-17b, a lens placed at the intermediate image plane bends the rays toward the
eyepiece, thus eliminating vignetting. The increase in the field of view is obtained
without increasing the diameter of the eyepiece. Although the position of the final image
is unaffected by the field lens, the exit pupil does move to the left, and the eye relief is
reduced. In practice, the field lens is often displaced slightly to avoid seeing its
imperfections.
Figure 6-17. Use of a field lens to increase the field of view of a Keplerian telescope.
(a) Field of view limitation because of vignetting of rays, except from the lowest
portion of the objective. (b) Vignetting eliminated by a field lens placed in the focal
plane of the objective.
In a Galilean telescope (Figure 6-18), we note that the objective limits the angle of a
chief ray transmitted by the telescope. Therefore, it is the field stop of the system. The
chief advantage of the Galilean telescope is its small overall length. It is used for viewing
operas and sports.
Figure 6-18. (a) Galilean telescope with a negative eyepiece. (b) Objective as the field
stop, and eye as the aperture stop AS. The entrance pupil EnP is indicated as the
object, and the exit pupil ExP is its image by the objective.
6.6 OCULAR
Although a magnifier may be used as an eyepiece in a microscope or a telescope, in
practice, compound lenses are designed to reduce lateral color (discussed in Chapter 7).
Such eyepieces are called oculars. Whereas a magnifier is used to look at a real object, an
ocular is used to look at an image formed by another optical system. An example of an
ocular is the Huygens eyepiece discussed in Section 7.6.2. It consists of two lenses of the
same material separated by a distance equal to half the sum of their focal lengths. The
lens closer to the eye is called the eye lens, and the one close to the objective is called the
field lens. Binoculars consist of two telescopes mounted side-by-side, one for each eye.
Figure 6-19. (a) Telephoto lens attached to an imaging system, giving a long effective
focal length and thereby a large image of a distant object. (b) Wide-angle lens
attached to an imaging system, giving a short effective focal length and thereby
giving a wide field of view. $F'$ and $H'$ are the image-space focal point and principal
point of the imaging system. Similarly, $F_e'$ and $H_e'$ are the image-space focal point
and principal point of the combined system.
Figure 6-19a shows how an imaging system of focal length $f'$ can be combined with an
afocal system to form a telephoto system. The first lens in the figure is positive, and the
second is negative. Thus, $f_1'$ is numerically positive, but $f_2'$ is numerically negative,
where $|f_2'| < f_1'$. The combined system has a longer effective focal length $f_e' = f'/M_t$,
where $M_t = -f_2'/f_1'$ is the transverse magnification of the afocal system, and $M_t < 1$.
We note from the figure that the angular magnification is $M_\beta = \beta'/\beta_e = 1/M_t$, and
$M_\beta > 1$ or $\beta_e < \beta'$.
Now, the size of the image of a distant object formed by an optical system depends
linearly on its focal length. Because such an image is increased in size by the use of the
afocal system, the combined system is a telephoto system. If $\beta'$ is the field of view
of the imaging system by itself, we find that the effective field of view of the combined
system is reduced to $\beta_e$, which is smaller than $\beta'$ by a factor of $1/M_t$. Note that to avoid
vignetting by the afocal system, $\beta_e$ must be $\le D_2/f_1'$, where $D_2$ is the diameter of the
beam emerging from the afocal system. It should be noted that adding an afocal system to
an imaging system is not the only way to achieve the telephoto effect. A positive and a
negative lens of suitable focal lengths also form a telephoto system, as illustrated by
Problem 2.12.
Similarly, a wide-angle lens is used to take pictures of large, nearby objects, for
example, a large group of people. When the afocal system is used in reverse so that the
first lens is negative ($f_1' < 0$) and the second lens is positive ($f_2' > 0$), as in Figure 6-19b,
the effective focal length of the combined system is reduced to $f_e' = f'/M_t$, where
$M_t = -f_2'/f_1'$, and $M_t > 1$. The angular magnification of the afocal system is
$M_\beta = \beta'/\beta_e = -1/M_t$, and $M_\beta < 1$ or $\beta_e > \beta'$. Note that $\beta'$ is numerically negative in
Figure 6-19b. If $\beta'$ is the field of view of the imaging system by itself, we find that the
effective field of view $\beta_e$ of the combined system is larger than $\beta'$, or that the combined
system is a wide-angle system.
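The effect of an afocal attachment on the effective focal length can be illustrated numerically. The sketch below assumes the relations $M_t = -f_2'/f_1'$ and $f_e' = f'/M_t$ as read from the discussion above; the function name and the lens values (a 50 mm imaging lens with 100 mm and 50 mm attachment lenses) are hypothetical.

```python
def afocal_attachment(f_imaging, f1, f2):
    """Effective focal length of an imaging lens behind an afocal attachment.

    f_imaging : focal length f' of the imaging system
    f1, f2    : focal lengths f1', f2' of the two attachment lenses
    Returns (Mt, effective focal length), with Mt = -f2'/f1' and f_e' = f'/Mt.
    """
    m_t = -f2 / f1
    return m_t, f_imaging / m_t


# Telephoto arrangement: positive first lens, weaker negative second lens
mt_tele, f_tele = afocal_attachment(50.0, f1=100.0, f2=-50.0)

# Wide-angle arrangement: the same pair used in reverse
mt_wide, f_wide = afocal_attachment(50.0, f1=-50.0, f2=100.0)

print(mt_tele, f_tele, mt_wide, f_wide)
```

In this example the same pair of lenses doubles the effective focal length in one orientation and halves it in the other, which is the sense in which the wide-angle attachment is the telephoto attachment used in reverse.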
6.8 RESOLUTION
6.8.1 Introduction
The resolution of an optical instrument represents its ability to resolve detail in an
object. Based on Gaussian optics, the image of a point object is also a point. All of the
rays emanating from a point object and transmitted by an imaging system pass through
the Gaussian image point. In reality, of course, the rays generally intersect the image
plane in the vicinity of the image point as a spot diagram due to the aberrations of the
system (see Chapter 9), so the ability of a system to resolve objects is limited by its
aberrations. However, because of diffraction of the wave by the finite aperture stop (or,
equivalently, the exit pupil) of the system, a point image is not obtained even if the
aberrations are zero (otherwise, the irradiance at the image point would be infinity). Thus,
in practice, the resolution of a system is inherently limited by diffraction. It is only further
degraded by the aberrations. In this section, we briefly discuss the characteristics of the
diffraction image of a point object, and introduce the Rayleigh criterion of resolution. We
discuss the resolution of an eye, a microscope, and a telescope, assuming an aberration-
free system.
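For an aberration-free system with a circular exit pupil, the irradiance distribution of
the image of a point object, normalized to unity at its center, has the form
$$I(r) = \left[\frac{2J_1(\pi r)}{\pi r}\right]^2 ,$$
where $J_1(\cdot)$ is the first-order Bessel function of the first kind, and $r$ is the radial distance
of a point from the Gaussian image point in units of $\lambda F$. The distribution is normalized
by the value $\pi P/4\lambda^2 F^2$ of the irradiance at the center $r = 0$, where $P$ is the total power.
This distribution is shown in Figure 6-20. It consists of a bright spot, called the Airy disc,
surrounded by dark and bright rings of decreasing irradiance.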
Figure 6-20. 2D diffraction image of a point object, called the Airy pattern, with
83.8% of the total light in the central bright spot.
The irradiance distribution has a principal maximum at the center with a value of
unity because
$$\lim_{x \to 0}\left[\frac{2J_1(x)}{x}\right] = 1 . \tag{6-16}$$
Noting that
$$\frac{d}{dx}\left[\frac{J_1(x)}{x}\right] = -\frac{J_2(x)}{x} , \tag{6-18}$$
where $J_2(\cdot)$ is the second-order Bessel function of the first kind, the positions of the
secondary maxima are given by the roots of
$$J_2(\pi r) = 0, \quad r \ne 0 . \tag{6-19}$$
Integrating the Airy pattern over a circular area, we obtain the power contained in a
circle of radius $r_c$:
$$P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c) ,$$
which is normalized by the total power $P$. Here, $J_0(\cdot)$ is the zeroth-order Bessel function
of the first kind. According to Eq. (6-17), because the dark rings (minima of zero
irradiance) correspond to $J_1(\pi r) = 0$, we note that the powers inside and outside an $m$th
dark ring of radius $r_m$ (in units of $\lambda F$) are given by
$$P(r_m) = 1 - J_0^2(\pi r_m)$$
and
$$1 - P(r_m) = J_0^2(\pi r_m) ,$$
respectively.
The irradiance and encircled power distributions of the Airy pattern are shown in
Figure 6-21. The positions of several minima and maxima, and the relative irradiance and
encircled power corresponding to them, are given in Table 6-4. We note that the Airy
disc, or the first dark ring, has a radius of 1.22 (in units of $\lambda F$) and contains 83.8% of the
total power. The first bright ring, with inner and outer radii of 1.22 and 2.23, contains
7.2%; the second bright ring 2.8%; and the third bright ring 1.4% of the total power.
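The quoted percentages can be reproduced numerically. The sketch below is a minimal illustration that assumes the standard forms of the Airy pattern, $I(r) = [2J_1(\pi r)/\pi r]^2$, and of its encircled power, $P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c)$, with $r$ and $r_c$ in units of $\lambda F$. To stay within the Python standard library, the Bessel functions are evaluated from Bessel's integral with the trapezoidal rule; the helper name `bessel_j` and the step count are illustrative choices.

```python
import math


def bessel_j(n, x, steps=400):
    """Bessel function J_n(x) of integer order n from Bessel's integral,
    J_n(x) = (1/pi) * integral from 0 to pi of cos(n*t - x*sin(t)) dt,
    evaluated with the trapezoidal rule."""
    def integrand(t):
        return math.cos(n * t - x * math.sin(t))

    h = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def airy_irradiance(r):
    """Irradiance normalized to unity at the center; r in units of lambda*F."""
    if r == 0.0:
        return 1.0
    x = math.pi * r
    return (2.0 * bessel_j(1, x) / x) ** 2


def encircled_power(r_c):
    """Fraction of the total power inside a circle of radius r_c (units of lambda*F)."""
    x = math.pi * r_c
    return 1.0 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2


airy_disc = encircled_power(1.22)                # power inside the first dark ring
first_ring = encircled_power(2.23) - airy_disc   # power in the first bright ring
print(round(airy_disc, 3), round(first_ring, 3))
```

The trapezoidal rule is a convenient choice here because the integrand is smooth and periodic, so a few hundred steps are ample for the arguments of interest.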
Figure 6-21. Irradiance $I(r)$ and encircled power $P(r_c)$ distributions of the Airy pattern, plotted against $r$, $r_c$.
Table 6-4. Irradiance and encircled power corresponding to the maxima and
minima of the Airy pattern. The irradiance is normalized by its central value
$\pi P/4\lambda^2 F^2$, and the encircled power is normalized by the total power $P$. The units of
$r$ and $r_c$ are $\lambda F$.
Max/Min    r, r_c    I(r)    P(r_c)
Max        0         1       0
Figure 6-22. Irradiance distribution $I(x)$ along the $x$ axis of the image of two incoherent
point objects of equal intensity separated by the Rayleigh resolution of $1.22\lambda F$. The
central value is 0.73.
Figure 6-23. 2D image of two incoherent point objects of equal intensity separated
by the Rayleigh resolution of $1.22\lambda F$.
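According to the Rayleigh criterion, two point images are just resolved when their
separation equals the radius of the Airy disc:
$$h' = 1.22\,\lambda' F \tag{6-24a}$$
$$= 1.22\,\lambda' L_i/D_{ex} \tag{6-24b}$$
$$= 0.61\,\lambda'/\alpha' , \tag{6-24c}$$
where $\alpha' = D_{ex}/2L_i$ is the semiangular aperture of the exit pupil as seen from the image.
In practice, $L_i \gg D_{ex}/2$; therefore, we have let $\tan\alpha' = \sin\alpha' = \alpha'$. From the sine
condition (see Reference 1 in Chapter 1), the magnification of the Gaussian image is
given by
$$\frac{h'}{h} = \frac{n\sin\alpha}{n'\sin\alpha'} \approx \frac{n\sin\alpha}{n'\alpha'} , \tag{6-25}$$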
where $\alpha$ is the semiangular aperture of the entrance pupil as seen from the object. In
observations with microscopes, the value of the angle $\alpha$ can be quite large. The quantity
$n\sin\alpha$ is called the numerical aperture of the imaging system. According to Eq. (5-33),
it determines the flux entering the system. From Eqs. (6-24c) and (6-25), we obtain the
minimum distance $h$ between resolved object points $P_0$ and $P$:
$$h \approx \frac{n' h' \alpha'}{n\sin\alpha} \tag{6-26a}$$
$$= \frac{0.61\,n'\lambda'}{n\sin\alpha} \tag{6-26b}$$
$$= \frac{0.61\,\lambda_0}{n\sin\alpha} . \tag{6-26c}$$
We note that the numerical aperture also determines the resolution of a system. The larger
its value is, the smaller the distance between two resolvable points, i.e., the better the
resolution.
[Figure: imaging of object points $P_0$ and $P$ by an optical system with entrance pupil EnP (diameter $D_{en}$) and exit pupil ExP (diameter $D_{ex}$), object distance $L_o$, and image distance $L_i$.]
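From Eq. (6-26c), the angular resolution in the object space is given by
$$\beta = h/L_o = \frac{0.61\,\lambda_0}{n L_o \sin\alpha} . \tag{6-27}$$
For a distant object, the angle $\alpha$ is small, and we may let $\sin\alpha \approx \alpha = D_{en}/2L_o$,
where $D_{en}$ is the diameter of the entrance pupil of the system. In such cases, Eq. (6-27)
reduces to
$$\beta = \frac{1.22\,\lambda_0}{n D_{en}} .$$
Similarly, from Eq. (6-24b), the angular resolution in the image space is given by
$$\beta' = h'/L_i . \tag{6-30a}$$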
For visual observations, the eye is placed at the exit pupil of the system, and the final
image is formed on its retina (indeed, this is the origin of the names for the entrance and
exit pupils). Otherwise, the apparent field of view is reduced. Moreover, the diameter of
the exit pupil is chosen to be equal to that of the eye’s pupil. If it is larger, then any light
outside of the eye’s pupil is wasted. If it is smaller, then it determines the diameter of the
Airy disc formed on the retina, which increases, thereby degrading the resolution. Of
course, the diameter of the exit pupil does not affect the distance between the Gaussian
images of the two object points, which depends on the magnification of the imaging
system. The magnification of a system when the diameters of the exit pupil and the eye
are equal is called its normal magnification. Choosing such a magnification is referred to
as pupil matching. Any magnification in excess of the normal magnification is called the
empty magnification because it does not improve the apparent resolution. However, it still
has merit in that positioning of the eye is eased and fatigue is reduced.
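The numerical aperture of the eye is given by
$$NA_{eye} = n\sin\alpha \approx \alpha = \frac{D_{eye}}{2L_o} = \frac{3\ \text{mm}}{2 \times 25\ \text{cm}} = 0.006 , \tag{6-31}$$
where we have let $n = 1$ for objects in air and $L_o = 25$ cm as the distance of most distinct
vision. The angle $\alpha$ represents the angle of the cone of light from a point object entering
the eye. From Eqs. (6-26c) and (6-27), the linear and angular resolutions of the eye in the
object space are given by
$$h = \frac{0.61\,\lambda_0}{NA_{eye}} = 101\,\lambda_0 = 55\ \mu\text{m} \tag{6-32}$$
and
$$\beta = h/L_o = \frac{55\ \mu\text{m}}{25\ \text{cm}} = 0.22\ \text{mrad} = 0.76' . \tag{6-33}$$
From Eqs. (6-24c) and (6-30a), the corresponding quantities in the image space are given
by
$$h' = \frac{0.61\,\lambda'}{\alpha'} = \frac{0.61\,\lambda_0}{n'\alpha'} = \frac{0.61\,\lambda_0}{1.33\,(1.5/25)} = 4.2\ \mu\text{m} \tag{6-34}$$
and
$$\beta' = h'/L_i = \frac{4.2\ \mu\text{m}}{25\ \text{mm}} = 0.17\ \text{mrad} = 0.57' . \tag{6-35}$$
[Figure: imaging by the eye with pupil diameter $D_{eye}$, object distance $L_o$, and image distance $L_i$.]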
If the pupil of the eye is reduced in size, the angle $\alpha$ and, therefore, the numerical
aperture of the eye decreases, and the Airy disc becomes larger, thus degrading the
resolution. If the diameter is increased, the diffraction-limited resolution increases, but
the aberrations of the eye degrade it.
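The numbers in Eqs. (6-31) to (6-35) can be reproduced with a short Python sketch. The function names are illustrative, and the wavelength of 0.55 µm is an assumption consistent with the rounded values quoted above. The unrounded results differ slightly from the printed ones because the text rounds at intermediate steps.

```python
import math


def eye_object_space_resolution(wavelength_um=0.55, pupil_mm=3.0, near_cm=25.0):
    """Numerical aperture and Rayleigh resolution of the eye in object space."""
    na_eye = (pupil_mm / 10.0) / (2.0 * near_cm)        # D_eye / (2 L_o), both in cm
    h_um = 0.61 * wavelength_um / na_eye                # linear resolution (micrometers)
    beta_mrad = (h_um * 1e-6) / (near_cm * 1e-2) * 1e3  # angular resolution (mrad)
    return na_eye, h_um, beta_mrad


def eye_image_space_resolution(wavelength_um=0.55, pupil_mm=3.0, n_img=1.33, L_i_mm=25.0):
    """Rayleigh resolution on the retina for an eye of image distance L_i."""
    alpha_img = (pupil_mm / 2.0) / L_i_mm               # semiangle of the image cone
    h_img_um = 0.61 * wavelength_um / (n_img * alpha_img)
    beta_img_mrad = (h_img_um * 1e-3) / L_i_mm * 1e3    # micrometers -> mm, then mrad
    return h_img_um, beta_img_mrad


def mrad_to_arcmin(angle_mrad):
    """Convert milliradians to minutes of arc."""
    return math.degrees(angle_mrad * 1e-3) * 60.0


na_eye, h_um, beta_mrad = eye_object_space_resolution()
h_img_um, beta_img_mrad = eye_image_space_resolution()
print(na_eye, h_um, beta_mrad, mrad_to_arcmin(beta_mrad))
print(h_img_um, beta_img_mrad, mrad_to_arcmin(beta_img_mrad))
```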
Consider a microscope imaging two point objects such that their images are just
resolved by the eye. Under normal magnification, the eye can resolve objects whose
images by the microscope subtend an angle of $1.22\,\lambda_0/D_{eye}$. The magnification of a
microscope is given by Eq. (6-5), where $M_t$ is the transverse magnification of the
objective, and $M_\beta = 25\ \text{cm}/f_e'$ is the angular magnification of the eyepiece. The sine
condition applied to the objective yields (see Figure 6-13)
$$M_t = \frac{n\sin\alpha}{\alpha'} = \frac{2n\sin\alpha}{D_{ex}/f_e} , \tag{6-36}$$
where
$$\alpha' = \frac{D_{ex}}{2f_e} , \tag{6-37}$$
and $D_{ex}$ is the diameter of the exit pupil assumed to be equal to that of the eye. Thus, the
magnification of the microscope may be written
$$M = \frac{2n\sin\alpha}{D_{ex}/25\ \text{cm}} = \frac{NA_{obj}}{NA_{eye}} , \tag{6-38}$$
where $NA_{obj} = n\sin\alpha$ is the numerical aperture of the objective. With a dry objective
($n = 1$), a practical limit for its numerical aperture is about 0.95. However, by filling the
space between the object and the objective with a liquid of refractive index matching that
of the objective, the numerical aperture can be increased to about 1.6. An oil-immersion
objective increases the numerical aperture compared to that of the eye and thereby
improves the resolution by a factor of 1.6/0.006 = 267. The linear resolution given by Eq.
(6-26c) changes from 55 µm to
$$h = \frac{0.61 \times 0.55\ \mu\text{m}}{1.6} = 0.21\ \mu\text{m} . \tag{6-39}$$
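A brief Python sketch makes the dependence on numerical aperture explicit. It evaluates the Rayleigh resolution of Eq. (6-26c) and the normal magnification of Eq. (6-38) for the dry and oil-immersion limits quoted above. The function names are illustrative, and the wavelength of 0.55 µm and the eye's numerical aperture of 0.006 are the values used in the text's numerical examples.

```python
def rayleigh_resolution_um(wavelength_um, numerical_aperture):
    """Smallest resolvable object separation, Eq. (6-26c): 0.61 * lambda_0 / NA."""
    return 0.61 * wavelength_um / numerical_aperture


def normal_magnification(na_objective, na_eye=0.006):
    """Microscope magnification with the exit pupil matched to the eye, Eq. (6-38)."""
    return na_objective / na_eye


wavelength = 0.55  # micrometers

h_dry = rayleigh_resolution_um(wavelength, 0.95)   # practical dry-objective limit
h_oil = rayleigh_resolution_um(wavelength, 1.6)    # oil-immersion limit from the text
m_dry = normal_magnification(0.95)
m_oil = normal_magnification(1.6)

print(h_dry, h_oil, m_dry, m_oil)
```

Magnification beyond the normal value does not reveal finer detail, because the resolution is already set by the numerical aperture of the objective.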
The improvement in resolution obtained with a telescope depends on the
diameters of the objective of the telescope and the eye’s pupil. The angular resolution of a
system is given by Eq. (6-29). Under normal magnification of a telescope, the diameter of
its exit pupil is equal to that of the eye’s pupil. Thus, the improvement in visual resolution
is given by the ratio of the diameter of the objective to that of the eye’s pupil [see Eq. (6-13)]:
$$M_\beta = \frac{D_{obj}}{D_{eye}} . \tag{6-40}$$
When $D_{ex} = D_{eye}$, the size of the Airy disc obtained with the telescope is the same as
that obtained without it, but the angular separation of the two objects as observed on the
retina is increased by $M_\beta$. The diameter of the exit pupil is given by Eq. (6-11). As in the
case of a microscope, if the exit pupil of the telescope is larger than the eye's pupil, then
the amount of light outside the eye’s pupil is wasted. If it is smaller, then the Airy disc on
the retina becomes larger, thereby degrading the resolution. The large objective of the
telescope not only improves the visual resolution, but also increases the total flux in the
image. Thus, it helps to see dim objects as well. It is common knowledge that stars are
too dim compared with the sky to be observed in the daytime with a naked eye. However,
they can be observed with the aid of a telescope. As explained below, this is due to an
increase in the star intensity on the retina when a telescope is used.
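The flux from an extended object of luminance $L$ and area $S_o$ at a distance $R$ entering
the eye through its pupil of area $S_e$ is given by
$$\Phi = L S_o \left(S_e/R^2\right) , \tag{6-41}$$
where $S_e/R^2$ is the solid angle subtended by the pupil at the object. This flux is spread
on the retina over an area $S_i$, given by (see Figure 6-26a)
$$S_i = \left(\frac{R'}{n'R}\right)^2 S_o . \tag{6-42a}$$
Thus, the retinal illuminance is given by
$$\frac{\Phi}{S_i} = L S_e \left(n'/R'\right)^2 . \tag{6-42b}$$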
If the same extended object is observed with the aid of a telescope, as illustrated in
Figure 6-26b, then the eye sees the image formed by the telescope. Let $D_1$ and $D_2$ be the
diameters of the objective and the eyepiece, respectively. If the aperture stop of the
telescope lies at the objective, then its image by the eyepiece is its exit pupil. For a
confocal telescope with its objective and eyepiece of focal lengths $f_1'$ and $f_2'$,
respectively, the exit pupil lies at a distance $f_2'(f_1' + f_2')/f_1'$ from the eyepiece with a
diameter $D_{ex} = D_2$. The images are observed by placing the eye at the exit pupil or as
close to it as possible.
Figure 6-26. Daytime observation of a star against a sky background (a) without and
(b) with the aid of a telescope. The lenses $L_1$ and $L_2$, called the objective and the
eyepiece, respectively, with a spacing equal to the sum of their focal lengths,
constitute the afocal telescope.
Because the luminance of the telescopic image, according to Eq. (5-42), is (at most)
equal to the object luminance, it is evident that the retinal illuminance will be the same as
in the case of an unaided observation, provided the diameter of the exit pupil of the
telescope is greater than or equal to the diameter of the eye pupil. If the diameter of the
exit pupil is smaller, then $S_e$ is replaced by the area $S_{ex}$ of the exit pupil, and the retinal
illuminance is reduced by a factor $S_{ex}/S_e$. Thus, the retinal illuminance of the image of a
distant extended object, such as the sky, observed with the aid of a telescope, is less than
or equal to the corresponding illuminance obtained when the object is observed with the
naked eye.
Now consider the observation of a point object, such as a star. The apparent intensity
of the star depends on the amount of light in its retinal image. The ratio of the amount of
light in this image with and without a telescope is given by $(D_1/D_e)^2$, provided $D_2 \le D_e$.
If $D_2 > D_e$, then a fraction $(D_e/D_2)^2$ of the total light received by the telescope enters
the eye. Thus, in this case, the ratio of the amount of light in the retinal image with and
without the telescope is given by $(D_1/D_2)^2$. In either case, this ratio is greater than 1.
Thus, the intensity of the star image on the retina is increased by using the telescope.
Because the illuminance of the sky background on the retina either stays the same or
decreases with the use of the telescope, and the intensity of the star image on the retina
increases with its use, the star visibility or the signal-to-noise ratio increases.
Accordingly, it is possible to observe bright stars in the daytime by using a telescope for
which $D_1 > D_e$.
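The argument above can be condensed into a small Python sketch that compares the gain in star light with the change in sky illuminance. The function names and the sample diameters (a 100 mm objective, a 3 mm eye pupil, and exit pupils of 2 mm and 5 mm) are hypothetical; the exit-pupil diameter plays the role of $D_2$ in the text.

```python
def star_light_gain(d_obj, d_exit, d_eye):
    """Ratio of the light in the retinal star image with and without the telescope."""
    if d_exit <= d_eye:
        return (d_obj / d_eye) ** 2      # all light collected by the objective enters the eye
    return (d_obj / d_exit) ** 2         # only the fraction (d_eye/d_exit)**2 enters the eye


def sky_illuminance_factor(d_exit, d_eye):
    """Retinal illuminance of the sky with the telescope relative to the naked eye."""
    return min(1.0, (d_exit / d_eye) ** 2)   # S_ex/S_e when the exit pupil is smaller


def visibility_gain(d_obj, d_exit, d_eye):
    """Improvement of the star signal relative to the sky background."""
    return star_light_gain(d_obj, d_exit, d_eye) / sky_illuminance_factor(d_exit, d_eye)


gain_small_pupil = visibility_gain(100.0, 2.0, 3.0)   # exit pupil smaller than the eye pupil
gain_large_pupil = visibility_gain(100.0, 5.0, 3.0)   # exit pupil larger than the eye pupil
print(gain_small_pupil, gain_large_pupil)
```

In both branches the visibility gain works out to the square of the ratio of the objective diameter to the exit-pupil diameter, which is one way to see why a large objective helps regardless of which pupil is the smaller one.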
6.9 PINHOLE CAMERA
The difference between a pinhole camera and a regular camera is that in the former
there is no lens to form the image. In a regular camera, a lens converts a diverging
spherical wave from a point object P0 into a spherical wave converging to an image point
P0¢ in the image plane. In a pinhole camera, a spherical wave of radius of curvature Lo
diverging from the object P0 is incident on the pinhole and continues as a diverging wave
toward the image plane, as illustrated in Figure 6-27a, where Lo is the object distance
from the pinhole. For a perfect image, a converging spherical wave of radius of curvature
Li (illustrated by the dashed wavefronts) should emerge from the pinhole converging to
the image point P0¢ , where Li is the image distance from the pinhole. Accordingly, the
defocus wave aberration at a distance r from the center of the pinhole is given by the sum
of the sags of two spherical wavefronts passing through the center of the pinhole with
their centers of curvature lying at the object and image points. This is illustrated in
Figure 6-27b as AB + BC = AC. Thus, the wave aberration may be written
$$W(r) = \frac{1}{2}\left(\frac{1}{L_i} - \frac{1}{L_o}\right) r^2 , \tag{6-43}$$
where the object distance $L_o$ is numerically negative. The image will be practically
diffraction limited according to the Rayleigh criterion [5], if the peak value of the
aberration is less than or equal to $\lambda/4$. Thus, we may write
$$\frac{1}{L_i} - \frac{1}{L_o} = \frac{\lambda}{2a^2} = \frac{1}{f_e} , \tag{6-44}$$
where $f_e$ is the effective focal length of the pinhole. For a distant object ($L_o \to \infty$), the
radius of the pinhole is simply given by
$$a = \left(L_i \lambda/2\right)^{1/2} . \tag{6-45}$$
The image spot for a point object is approximately the Airy disc with a radius of
$0.61\,\lambda L_i/a$.
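As an illustration of Eqs. (6-44) and (6-45), the following Python sketch computes the pinhole radius that keeps the peak defocus aberration at a quarter wave, together with the corresponding Airy-disc radius in the image plane. The function names and the sample values (a 100 mm camera length and a wavelength of 0.55 µm) are hypothetical; the object distance is numerically negative, with a default at infinity.

```python
import math


def pinhole_radius(L_i, wavelength, L_o=-math.inf):
    """Pinhole radius a from Eq. (6-44): 1/L_i - 1/L_o = wavelength / (2 a**2).

    L_i and L_o share the unit of the wavelength; L_o is numerically negative.
    """
    inverse_focal_length = 1.0 / L_i - 1.0 / L_o
    return math.sqrt(wavelength / (2.0 * inverse_focal_length))


def airy_radius(L_i, wavelength, a):
    """Approximate radius of the image spot: 0.61 * wavelength * L_i / a."""
    return 0.61 * wavelength * L_i / a


L_i = 100.0          # camera length in mm (hypothetical)
wavelength = 0.55e-3 # 0.55 micrometers expressed in mm

a = pinhole_radius(L_i, wavelength)       # distant object, Eq. (6-45)
spot = airy_radius(L_i, wavelength, a)
a_near = pinhole_radius(L_i, wavelength, L_o=-500.0)   # object 500 mm in front

print(a, spot, a_near)
```

For a nearby object the quarter-wave condition gives a slightly smaller radius than for a distant one, because the incident wavefront is more strongly curved.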
Figure 6-27. (a) Imaging by a pinhole camera of radius $a$. (b) Wavefront incident on
the pinhole, and emerging wavefront required for perfect imaging. The pinhole size
is extremely exaggerated for clarity of the wavefronts. The camera length $L_i \gg a$.
(c) Distortion-free imaging.
A pinhole camera suffers from chromatic aberration because its focal length depends
on the wavelength. Similarly, because the pinhole appears to be elliptical from an off-axis
point object, its focal length for an object in the horizontal plane differs from that in a
vertical plane. Thus, it suffers from astigmatism. However, it is free of distortion, i.e., the
transverse magnification of an image is independent of the field angle. We note that the
chief ray (i.e., an object ray incident through the center of the hole) reaches the image
plane without any deviation, as illustrated in Figure 6-27c. The magnification of the
image is given by
$$M = h'/h = L_i/L_o . \tag{6-46}$$
The main disadvantage of a pinhole camera is the long exposure it requires due to the
small size of the pinhole.
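As a numerical illustration of Eqs. (6-44) through (6-46), the following Python sketch evaluates the pinhole radius, the Airy disc radius, and the magnification. The function names and the sample values (a camera length of 20 cm, a wavelength of 0.5 μm, and an object 2 m away) are assumptions made here for illustration; they are not taken from the text.

```python
import math

def pinhole_radius(image_distance, wavelength, object_distance=-math.inf):
    """Pinhole radius a from Eq. (6-44): 1/Li - 1/Lo = wavelength / (2 a^2).

    The object distance Lo is numerically negative, as in the text;
    the default corresponds to a distant object, Eq. (6-45).
    """
    inverse_focal_length = 1.0 / image_distance - 1.0 / object_distance
    return math.sqrt(wavelength / (2.0 * inverse_focal_length))

def airy_disc_radius(image_distance, wavelength, radius):
    """Radius 0.61 * wavelength * Li / a of the image spot."""
    return 0.61 * wavelength * image_distance / radius

def magnification(image_distance, object_distance):
    """Transverse magnification M = Li / Lo of Eq. (6-46)."""
    return image_distance / object_distance

# Hypothetical sample values, all lengths in meters
Li, wavelength = 0.20, 0.5e-6
a = pinhole_radius(Li, wavelength)
spot = airy_disc_radius(Li, wavelength, a)
M = magnification(Li, -2.0)
print(f"pinhole radius = {a * 1e3:.3f} mm, Airy radius = {spot * 1e3:.3f} mm, M = {M:.2f}")
```

For a distant object, substituting Eq. (6-45) into 0.61λLi/a shows that the Airy disc radius is 1.22 times the pinhole radius, whatever the values of Li and λ; the sketch reproduces this ratio.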
A nearsighted (myopic) eye sees nearby objects well, but the images of objects
beyond a certain point (called the far point) are blurry. A negative lens with a focal length
equal to the distance of this point is used to see distant objects well. Similarly, a farsighted
(hyperopic) eye sees distant objects well with accommodation, but the images of nearby
objects within a certain point (called the near point) are blurry. A positive lens of focal
length such that it forms the image of this point at a comfortable reading distance of 25
cm is used to see nearby objects well.
A visual acuity of 20/20 implies that letters that subtend 5 arc min at the eye at a
distance of 20 ft can be read. Accordingly, a visual acuity of 20/60, for example, implies
that what a normal eye can resolve at 60 ft is being resolved at 20 ft.
6.10.2 Magnifier
The apparent size of an object as seen with an unaided eye increases as the object is
brought closer, until the eye reaches the limit of its accommodation at about 25 cm. The
object can be brought closer if it is observed through a magnifier of a short focal length
f′. If the object is placed in the front focal plane of the magnifying lens, the
magnification of the image seen with and without the magnifier is given by 25/f′, where
f′ is in cm.
6.10.3 Microscope
In a compound microscope, the objective of very short focal length forms a real
magnified image of the object (see Figure 6-13), which, in turn, is magnified as a virtual
image by the eyepiece. The magnification of the retinal image with and without the
microscope is equal to the product of the transverse magnification of the objective and the
angular magnification of the eyepiece. If the virtual image lies at a distance of 25 cm,
then the microscope magnification is also given by 25/f′, where f′ is the effective focal
length of the microscope in cm.
6.10.4 Telescope
A telescope is an afocal system used to view distant objects (practically at infinity).
In a Keplerian telescope, both the objective and the eyepiece are positive lenses; the
aperture stop lies at the objective, and the field stop at the eyepiece. The pupil
magnification is given by −fe′/fo′ and the telescope magnification by −fo′/fe′, where fo′
and fe′ are the focal lengths of the objective and the eyepiece, respectively. In a Galilean
telescope, used to watch opera and sports, the eyepiece is negative. The objective is the
field stop, and the observing eye is the aperture stop.
6.10.5 Resolution
The resolution of an optical imaging system is its ability to discern the details of an
object. According to the Rayleigh criterion of resolution, two incoherent point objects of
equal intensity are just resolved if their separation is given by
$$h = \frac{0.61\,\lambda_0}{n \sin\alpha} , \tag{6-47}$$
where λ0 is the wavelength of the object radiation in vacuum, n is the refractive index of
the object space, and α is the semiangular aperture of the entrance pupil of the system as
seen from the object. The quantity n sin α is called the numerical aperture of the system.
The numerical aperture of the eye with a pupil of diameter 3 mm observing an object
at its near point at a distance of 25 cm is 0.006. Thus, the resolution of the eye at a
wavelength of 0.55 μm is 55 μm. The numerical aperture of a microscope with a dry
objective is at most 0.95. Its resolution is accordingly 167 times better. If an oil of
refractive index n is used between the object and the objective, then the resolution is
improved by a factor n. In the case of a telescope, the resolution is improved, compared
to that of the eye, by a factor of the ratio of the diameter of the objective and the eye. We
have assumed diffraction-limited resolution in these calculations. In practice, it will be
degraded by the aberrations of the imaging system.
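The numbers quoted above can be checked with a short Python sketch of Eq. (6-47). The function names are assumptions introduced here; the input values (a 3-mm pupil, a 25-cm near point, a wavelength of 0.55 μm, and a dry-objective numerical aperture of 0.95) are those of the text. The sketch gives about 56 μm for the eye and a ratio of roughly 160, close to the rounded figures quoted.

```python
import math

def numerical_aperture(pupil_diameter, object_distance, n=1.0):
    """n sin(alpha), where alpha is the semiangle subtended by the pupil at the object."""
    alpha = math.atan(0.5 * pupil_diameter / object_distance)
    return n * math.sin(alpha)

def rayleigh_separation(wavelength_vacuum, na):
    """Just-resolved separation h of Eq. (6-47), in the units of the wavelength."""
    return 0.61 * wavelength_vacuum / na

na_eye = numerical_aperture(3.0, 250.0)        # lengths in mm
h_eye = rayleigh_separation(0.55, na_eye)      # wavelength and result in micrometers
h_dry = rayleigh_separation(0.55, 0.95)        # dry microscope objective
gain = h_eye / h_dry
print(f"NA(eye) = {na_eye:.4f}, h(eye) = {h_eye:.1f} um, h(dry) = {h_dry:.3f} um, gain = {gain:.0f}")
```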
where λ is the mean wavelength of object radiation. Its disadvantage is the long exposure
time due to the small aperture.
REFERENCES
1. W. N. Charman, “Optics of the eye,” in Handbook of Optics, Vol. III, Chapter 1,
M. Bass, Ed., 3rd ed., McGraw-Hill, New York (2010).
2. G. A. Fry, “The eye and vision,” in Applied Optics and Optical Engineering, Vol.
II, Chapter 1, R. Kingslake, Ed., Academic Press, San Diego, CA (1965).
PROBLEMS
6.1 A person with a 15-cm-wide face looks into the eyes of another person. Determine
the location and size of the image of the first person formed by the cornea of the
other person assuming a 15-cm gap between the two people.
6.2 Show from the data in Table 6-1 that each model of the eye provides the same total
focusing power of 60 D. Also show that the cornea of the schematic and simplified
schematic eyes provides nearly 43 D of focusing power.
6.3 The far point of a nearsighted eye lies at 1 m from it. Determine the power of a
correction spectacle lens worn at 15 mm from the eye. What is the power of a
contact correction lens?
6.4 The headlights of a car are approximately 1.5 m apart. Determine their separation
on the retina when the car is 30 m away.
6.5 A person can see himself clearly with relaxed eyes, standing 1 m from a mirror.
With accommodation, he can see well when only 15 cm away. (a) Determine the
prescription required to see distant objects clearly. (b) How far is his near point
when wearing glasses? (c) How close can he be to the mirror to see himself clearly
when wearing glasses? (d) How far, at the most, can he be from a concave mirror
of radius of curvature 1 m to see the image of a picture hanging 60 cm from the
mirror?
6.6 A person with a near point 25 cm away from her eyes wants to view an object with
a magnifying glass that is marked 6×. She holds the magnifying glass close to her
eyes. Determine the range of object distance to view the magnified image without
undue strain on the eyes.
6.7 A Ramsden eyepiece consists of two positive lenses of equal focal length f′
spaced (2/3) f′ apart. Determine the location of the image formed by a microscope
or telescope objective for relaxed-eye viewing through the eyepiece.
6.8 Determine the optimum diameter of the pinhole of a pinhole camera where the
photographic plate lies at a distance of 10 cm from it. Let the mean wavelength of
object radiation be 0.55 μm. Determine the size of the image of a 6-inch-high
object placed 6 ft from the camera.
CHAPTER 7
CHROMATIC ABERRATIONS
7.1 INTRODUCTION
So far, we have discussed the imaging relations without explicitly stating the
wavelength of the object radiation. Because the refractive index of a transparent
substance decreases with increasing wavelength, a thin lens, for example, made of such a
substance will have a shorter focal length for a shorter wavelength. Consequently, an
axial point object emanating white light will be imaged at different distances along the
axis depending on the wavelength, i.e., the image will not be a “white” point. Similarly,
the height of the image of an off-axis point object will vary with the wavelength, resulting
in different sizes of the image of a multiwavelength object. The axial and transverse
extents of the image of a multiwavelength point object are called longitudinal and
transverse chromatic aberrations, respectively. They describe a chromatic change in the
position and magnification of the image, which are discussed in this chapter. The
longitudinal chromatic aberration is also called the axial color.
7.2 REFRACTING SURFACE
Figure 7-1. Chromatic aberrations of a refracting surface RS. UR, MR, and CR are
the undeviated, marginal, and chief rays. (a) On-axis imaging. (b) Off-axis imaging.
The subscripts b and r denote blue and red light. The axial color δS′ = Sb′ − Sr′,
where Sb′ and Sr′ are the distances of the blue and red images. The transverse
chromatic aberration δh′ = hb′ − hr′, where hb′ and hr′ are the image heights in the
blue and red Gaussian image planes, respectively. The lateral color δhc′ represents
the height difference of the blue and red chief rays in a given image plane.
Consider a refracting surface RS of radius of curvature R separating media of
refractive indices n and n′, as in Figure 7-1. The Gaussian image distance S′ and height
h′ for an object at a distance S with a height h are given by
$$\frac{n'}{S'} - \frac{n}{S} = \frac{n' - n}{R} \tag{7-1}$$
and
$$M \equiv h'/h = nS'/n'S , \tag{7-2}$$
where M is the transverse magnification of the image. Let δ represent a small change in a
certain quantity corresponding to a small change in the wavelength, or, equivalently, a
small change in the refractive index. Because the object distance S is independent of the
wavelength, by differentiating both sides of Eq. (7-1), we obtain
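$$\frac{\delta n'}{S'} - \frac{n'}{S'^2}\,\delta S' - \frac{\delta n}{S} = \frac{\delta n' - \delta n}{R} . \tag{7-3}$$
Solving for δS′ and eliminating S with the help of Eq. (7-1), we obtain
$$\frac{\delta S'}{S'} = \left(\frac{\delta n}{n} - \frac{\delta n'}{n'}\right)\left(\frac{S'}{R} - 1\right) . \tag{7-4}$$
Similarly, taking the logarithmic differential of Eq. (7-2), we obtain
$$\begin{aligned}
\frac{\delta M}{M} &= \frac{\delta h'}{h'} \\
&= \frac{\delta n}{n} - \frac{\delta n'}{n'} + \frac{\delta S'}{S'} \\
&= \left(\frac{\delta n}{n} - \frac{\delta n'}{n'}\right)\frac{S'}{R} ,
\end{aligned} \tag{7-5}$$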
where in the last step we have used Eq. (7-4). Note that the fractional chromatic variation
of magnification is independent of the object (or image) height. The quantities δn and
δ n ′ represent the difference in the refractive index of the object and image spaces,
respectively, for the blue and red light. The blue and red light represent, in general, the
shortest and the longest wavelengths of the object radiation spectrum. The chromatic
change δS ′ = Sb′ − Sr′ in the position of the axial image represents the distance between
the axial Gaussian images P0′b and P0′r for the blue and red light. It is called the
longitudinal chromatic aberration or simply the axial color. The chromatic change
δh′ = hb′ − hr′ in the image height, called the transverse chromatic aberration, represents
the difference in the heights of the blue and red chief rays in the blue and red Gaussian
image planes, respectively.
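The first-order result of Eq. (7-4) can be compared with the difference of the blue and red image distances obtained directly from Eq. (7-1). The Python sketch below does so for a single air-to-glass surface. The function names, the geometry (S = −200 mm, R = 50 mm), and the index values for the d, F, and C lines are assumptions chosen for illustration.

```python
def image_distance(n, n_prime, S, R):
    """Gaussian image distance S' from Eq. (7-1): n'/S' - n/S = (n' - n)/R."""
    return n_prime / ((n_prime - n) / R + n / S)

def axial_color(n, n_prime, dn, dn_prime, S, R):
    """First-order axial color dS' from Eq. (7-4)."""
    S_prime = image_distance(n, n_prime, S, R)
    return S_prime * (dn / n - dn_prime / n_prime) * (S_prime / R - 1.0)

def fractional_magnification_change(n, n_prime, dn, dn_prime, S, R):
    """First-order dM/M = dh'/h' from Eq. (7-5)."""
    S_prime = image_distance(n, n_prime, S, R)
    return (dn / n - dn_prime / n_prime) * S_prime / R

# Assumed glass indices for the d, F (blue), and C (red) lines; object space is air
n_d, n_F, n_C = 1.5168, 1.52238, 1.51432
S, R = -200.0, 50.0  # mm

first_order = axial_color(1.0, n_d, 0.0, n_F - n_C, S, R)
exact = image_distance(1.0, n_F, S, R) - image_distance(1.0, n_C, S, R)
print(f"dS' (first order) = {first_order:.3f} mm, Sb' - Sr' = {exact:.3f} mm")
```

Both values are numerically negative, i.e., the blue image lies closer to the surface than the red image, and they agree to within about one percent for this example.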
From a practical standpoint, the quantity of interest is the size of the image of a point
object in a given Gaussian image plane. For example, the image of an on-axis point
object in the red Gaussian image plane consists of a bright-red Gaussian image point P0′r
at the center, surrounded by blue rays. The blue rays originating at P0 pass through the
Gaussian image point P0′b and diverge from it as a blue disk of rays in the red Gaussian
image plane. The radius P0′r B of the blue disk of rays is given by (see Figure 7-1a)
$$r_i = \alpha\,\delta S' = (a/L)\,\delta S' , \tag{7-7}$$
where a is the radius of the exit pupil, and L is the distance of the image from it.
Similarly, the image in the blue Gaussian image plane consists of a bright-blue Gaussian
image point P0′b at the center, surrounded by red rays that converge to P0′r . The radius
P0′b R of the red disk is approximately the same as that of the blue disk. For a given
angular size of the light cone forming a Gaussian image point, the ratio a/L is fixed, i.e.,
if the position of the exit pupil is changed so that L changes, its diameter (in practice, the
diameter of the aperture stop) is also changed so that a/L does not change. Thus, the size
of the blue or red image disk, called the transverse axial color, does not change as the
position of the exit pupil is changed.
In the case of an off-axis object point P, its image in the red Gaussian image plane
consists of a red Gaussian image point and a displaced disk of blue rays. The radius of the
blue disk is approximately the same as that for the on-axis image. The displacement of
the blue disk represents the difference in the heights of the blue and red chief rays in this
image plane. We note from Figure 7-1b that the displacement, called the lateral color and
representing the chromatic aberration of the chief ray in a given image plane, is given by
$$\delta h_c' = \delta h' - \gamma\,\delta S' = \delta h' - (h'/L)\,\delta S' , \tag{7-8}$$
where γ is the angle that the blue chief ray CRb makes with the optical axis in image
space. It differs from δ h ′ , which is the difference in the heights of the blue and red chief
rays in the blue and red Gaussian image planes, respectively. Like δS ′ and δ h ′ , δ hc′ is
also numerically negative in Figure 7-1b.
We note that the value of δ hc′ changes as the value of L changes. This is to be
expected because the chief ray changes as the position of the exit pupil is changed. From
similar triangles CP0′b Pb′ and Pb′DPr′ in Figure 7-1b, we find that
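$$\delta h' = \frac{h'}{S' - R}\,\delta S' . \tag{7-9}$$
Substituting Eq. (7-9) into Eq. (7-8), we obtain
$$\delta h_c' = h'\left(\frac{1}{S' - R} - \frac{1}{L}\right)\delta S' . \tag{7-10}$$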
Thus, δ hc′ = 0 as L → S ′ − R , i.e., when the exit pupil lies at the center of curvature.
This is to be expected because the undeviated ray UR becomes the chief ray for both blue
and red light.
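If the exit pupil is moved so that the distance L changes from L1 to L2, it follows from
Eq. (7-8) that the lateral color changes from δhc1′ to δhc2′ according to
$$\delta h_{c2}' = \delta h_{c1}' + \left(\frac{1}{L_1} - \frac{1}{L_2}\right) h'\,\delta S' . \tag{7-11}$$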
Equation (7-11) represents the stop-shift equation for the lateral color.
It is evident from Eq. (7-8) that if the longitudinal aberration δS ′ is zero [it cannot
happen for a single surface (unless S ′ = R) or even a thin lens (unless S ′ = 0 )], then δ hc′
is equal to δ h ′ , independent of the position of the exit pupil.
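The relations among Eqs. (7-8) through (7-11) can be exercised numerically. In the Python sketch below, the lateral color is first evaluated with the exit pupil at the surface (L = S′) and then shifted to the center of curvature (L = S′ − R) with the stop-shift equation, where it should vanish. The function names and the sample values are assumptions for illustration.

```python
def transverse_chromatic_aberration(h_prime, S_prime, R, delta_S):
    """dh' from Eq. (7-9)."""
    return h_prime * delta_S / (S_prime - R)

def lateral_color(delta_h, h_prime, L, delta_S):
    """dh_c' from Eq. (7-8) for an exit pupil at a distance L from the image."""
    return delta_h - (h_prime / L) * delta_S

def stop_shift(lateral_color_1, h_prime, L1, L2, delta_S):
    """Lateral color after moving the exit pupil from L1 to L2, Eq. (7-11)."""
    return lateral_color_1 + (1.0 / L1 - 1.0 / L2) * h_prime * delta_S

# Hypothetical values in mm: image distance, radius, image height, axial color
S_prime, R, h_prime, delta_S = 284.26, 50.0, -10.0, -7.08

delta_h = transverse_chromatic_aberration(h_prime, S_prime, R, delta_S)
at_surface = lateral_color(delta_h, h_prime, S_prime, delta_S)
at_center = stop_shift(at_surface, h_prime, S_prime, S_prime - R, delta_S)
print(f"dh' = {delta_h:.4f} mm, dh_c' (pupil at surface) = {at_surface:.4f} mm, "
      f"dh_c' (pupil at center of curvature) = {at_center:.2e} mm")
```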
The chromatic aberrations of an image formed by a thin lens of focal length f ′ and
refractive index n can be obtained by applying the results for a single refracting surface
successively to its two surfaces. Or, we can obtain them from the imaging and
magnification equations of a thin lens derived in Section 2.3. Because the image-space
focal length f ′ of the lens depends on its refractive index n, the image distance S ′ and
height h ′ also depend on it, i.e., the image is accompanied by both axial and lateral color.
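Differentiating the thin-lens imaging and magnification equations for a fixed object
distance S, we obtain
$$\frac{\delta S'}{S'^2} = \frac{\delta f'}{f'^2} = -\frac{1}{f'V} \tag{7-12}$$
and
$$\frac{\delta M}{M} = \frac{\delta h'}{h'} = \frac{\delta S'}{S'} \tag{7-13a}$$
$$= -\frac{S'}{f'V} , \tag{7-13b}$$
respectively, where
$$V = \frac{n - 1}{\delta n} \tag{7-14}$$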
is called the dispersive constant of the lens material. Thus, for a change δn in the
refractive index, there is a corresponding change δ f ′ in the focal length, δ S ′ in the
image distance, and δ h ′ in the image height. It is evident that the smaller the value of δn
is, the larger the value of V, the smaller the change in focal length, and the smaller the
values of chromatic aberrations.
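A short Python sketch of Eqs. (7-12) through (7-14) follows. The function names, the focal length of 100 mm, and the index values are assumptions for illustration only.

```python
def dispersive_constant(n, delta_n):
    """V = (n - 1) / dn, Eq. (7-14)."""
    return (n - 1.0) / delta_n

def axial_color(S_prime, f, V):
    """dS' = -S'^2 / (f' V), from Eq. (7-12)."""
    return -S_prime ** 2 / (f * V)

def fractional_magnification_change(S_prime, f, V):
    """dM/M = dh'/h' = -S' / (f' V), Eq. (7-13b)."""
    return -S_prime / (f * V)

# Assumed values: n = 1.5168 with a blue-minus-red index difference of 0.00806
V = dispersive_constant(1.5168, 0.00806)
f = 100.0  # mm

# For an object at infinity, S' = f', and the axial color equals df' = -f'/V
shift_at_infinity = axial_color(f, f, V)
print(f"V = {V:.1f}, df' = {shift_at_infinity:.3f} mm")
print(f"dh'/h' at S' = 150 mm: {fractional_magnification_change(150.0, f, V):.4f}")
```

With these values, the blue focus of a 100-mm lens lies about 1.6 mm closer to the lens than the red focus.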
It is common practice to consider n as the refractive index for the yellow line of
helium (λ = 0.5876 μm), called the d line, and δn as the difference nF − nC between the
refractive indices for the Fraunhofer lines F and C, i.e., for the blue (λ = 0.4861 μm) and
red (λ = 0.6563 μm) lines of hydrogen. Glass manufacturers often give the refractive
index data as a six-digit number. For example, BK7 glass is specified as #517642. The
first three digits define its refractive index according to nd − 1 = 0.517, and the
remaining three digits define its dispersive constant according to
$$V = \frac{n_d - 1}{n_F - n_C} \tag{7-15a}$$
$$= 64.2 . \tag{7-15b}$$
The dispersive constant of a glass defined according to Eq. (7-15a) is called its Abbe
number.
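The six-digit convention described above is easy to decode programmatically. The Python sketch below is a minimal illustration; the function names are assumptions introduced here, and the only input is the BK7 code quoted in the text.

```python
def parse_glass_code(code):
    """Split a six-digit glass code into (n_d, Abbe number).

    The first three digits give n_d - 1 in thousandths, and the last
    three give the Abbe number in tenths, as described in the text.
    """
    digits = code.lstrip("#")
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError(f"expected a six-digit glass code, got {code!r}")
    n_d = 1.0 + int(digits[:3]) / 1000.0
    abbe_number = int(digits[3:]) / 10.0
    return n_d, abbe_number

def principal_dispersion(n_d, abbe_number):
    """n_F - n_C obtained by inverting Eq. (7-15a)."""
    return (n_d - 1.0) / abbe_number

n_d, V_d = parse_glass_code("#517642")
print(f"n_d = {n_d:.3f}, V = {V_d:.1f}, n_F - n_C = {principal_dispersion(n_d, V_d):.5f}")
```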
The refractive indices of the available lens materials and their Abbe numbers from
Schott Optical Glass are given in Figure 7-2, called an nd /Vd diagram. Each glass in this
diagram is identified by a point whose position is called its optical position. The Abbe
numbers of glasses vary from about 20 to 90. The glasses with nd > 1.60, Vd > 50 or nd <
1.60, Vd > 55 are called crowns and are indicated by the letter K; others are called flints
and are indicated by the letter F. The simple crown (kron in German) glasses (soda-lime-
silicate glasses) have low dispersion, and simple flint glasses (lead-alkali-silicate glasses)
have high dispersion. The addition of barium oxide (BaO) yields a low dispersion with a
relatively high refractive index. The borosilicate crown glasses contain boron oxide
(B2 O3 ) instead of the calcium oxide used in normal soda-lime-silicate glass. The addition
of boron oxide yields a low refractive index and low dispersion.
The light and heavy flint glasses contain low and high lead and barium amounts,
respectively. Use of fluorine instead of oxygen also lowers the refractive index and
dispersion. The barium flint glasses contain both barium oxide and lead oxide; the crown
flint glasses contain calcium oxide and lead oxide, resulting in average dispersions. Use
of rare earths such as lanthanum (La) yields glasses of high refractive index and high
Abbe numbers. The terms heavy and light crowns or flints are also used, e.g., barium
heavy flint (BaSF) or phosphorus heavy crown (PSK) (the letter S is for schwer in
German, meaning “heavy” or “dense”). The barium crown glasses contain a large
proportion of boron oxide and barium oxide, while their silicon dioxide (SiO2 ) content is
low. The K group in the diagram includes the barium light crowns (BaLK) and the zinc
crown (ZK). The glasses given in the diagram are for use with visible light. The materials
for use with infrared radiation have been discussed in several publications by McCarthy
[1–6].
Figure 7-2. Refractive indices and Abbe numbers of various glass materials
available from Schott Optical Glass, Inc.
The radius of the blue or the red disk of rays in the red or the blue Gaussian image
plane, respectively, is again given by Eq. (7-7), as may be seen from Figure 7-3a. The
axial and transverse axial colors are independent of the position of the aperture stop since
a/L is kept fixed as the position is changed. Similarly, from Figure 7-3b, we can show
that the (numerically positive) displacement δhc′ of the blue disk from the red Gaussian
image point Pr′ of an off-axis point object P is given by Eq. (7-8). From similar triangles
CP0′b Pb′ and Pb′DPr′,
$$\delta h' = (h'/S')\,\delta S' . \tag{7-16}$$
Thus, the lateral color δhc′ representing the transverse chromatic aberration of the chief
ray in a given image plane may be written
$$\delta h_c' = \delta h' - \gamma\,\delta S' = h'\left(\frac{1}{S'} - \frac{1}{L}\right)\delta S' . \tag{7-17}$$
It approaches zero when the exit pupil lies at the lens as in Figure 7-3c, i.e., as L → S ′ .
The chief ray in this case passes through the center of the lens undeviated regardless of its
wavelength. Because the chief rays of different colors are coincident, they intersect an
image plane at the same point. In a given image plane, rays (other than the chief ray) of
different colors are not in sharp focus due to longitudinal chromatic aberration. The stop-
shift equation for the lateral color is the same as Eq. (7-11).
As a numerical example, Figure 7-4 shows how the focal length of a thin lens made
of BK7 glass varies with wavelength. The variation of its refractive index is also shown
in the figure. We note that the refractive index decreases as the wavelength increases.
Thus, from Eq. (2-28), the focal length increases as the wavelength increases.
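The trend described above can be reproduced with a dispersion formula. The Python sketch below uses a three-term Sellmeier equation with the coefficients commonly published for Schott N-BK7; these coefficients are an assumption made here (they are not given in the text) and should be checked against the manufacturer's data sheet. The normalization f′/fd′ = (nd − 1)/(n − 1) is that of a thin lens.

```python
import math

# Sellmeier coefficients commonly published for Schott N-BK7 (assumed here;
# they are not given in the text). Wavelengths are in micrometers.
B = (1.03961212, 0.231792344, 1.01046945)
C = (0.00600069867, 0.0200179144, 103.560653)

def refractive_index(wavelength_um):
    """Refractive index from the three-term Sellmeier equation."""
    w2 = wavelength_um ** 2
    return math.sqrt(1.0 + sum(b * w2 / (w2 - c) for b, c in zip(B, C)))

def normalized_focal_length(wavelength_um, d_line_um=0.5876):
    """Thin-lens focal length normalized by its d-line value: (n_d - 1)/(n - 1)."""
    n_d = refractive_index(d_line_um)
    return (n_d - 1.0) / (refractive_index(wavelength_um) - 1.0)

n_d = refractive_index(0.5876)
n_F = refractive_index(0.4861)
n_C = refractive_index(0.6563)
abbe_number = (n_d - 1.0) / (n_F - n_C)

for wavelength in (0.4861, 0.5876, 0.6563):
    print(f"lambda = {wavelength:.4f} um: n = {refractive_index(wavelength):.5f}, "
          f"f'/fd' = {normalized_focal_length(wavelength):.5f}")
print(f"Abbe number = {abbe_number:.1f}")
```

The index decreases and the normalized focal length increases with wavelength, and the computed Abbe number is close to the value of 64.2 implied by the glass code #517642.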
7.4 PLANE-PARALLEL PLATE
Because the aperture stop is located at the first surface, the entrance pupil EnP of the
system is also located there. Moreover, the entrance and exit pupils EnP1 and ExP1 for
this surface are also located at the surface. The entrance pupil EnP2 for the second
surface is ExP1. The exit pupil ExP2 for this surface is the image of EnP2 formed by it.
Figure 7-3. Chromatic aberrations of a thin lens. (a) On-axis imaging. (b) Off-axis
imaging. (c) Off-axis imaging with the exit pupil at the lens. The axial color is δS′,
and the lateral color is δhc′. The lateral color in (c) is zero.
Figure 7-4. Variation of refractive index and focal length of a thin lens made of BK7
glass #517642 with wavelength λ. The focal length is normalized by its value for the
d line. Thus, f′/fd′ = (nd − 1)/(n − 1). The wavelength is in micrometers.
Thus, letting n2 = n, n2′ = 1, s2 = −t, and R2 = ∞, we find from Eqs. (2-4) and (2-10)
that ExP2 is located at a distance s2′ = −t/n from the second surface, and its
magnification m2 = 1. As expected from Eq. (2-97), ExP2 lies at a distance t(1 − 1/n)
from the first surface. Of course, ExP2 is also the exit pupil ExP of the system. It is
evident that for the first surface, the distance L1 of the image P′ from ExP1 is equal to its
distance S1′ from the surface. For the second surface, distance L2 of the image P′′ from
ExP2 is given by
$$L_2 = S_2' - s_2' , \tag{7-18a}$$
because L2, S2′, and s2′ are all numerically negative. Substituting for S2′ and s2′, we find
that
$$L_2 = S . \tag{7-18b}$$
Now we consider the chromatic aberrations of the plate. Differentiating Eq. (2-96a),
the axial color is given by
$$\delta S_2' = \frac{t}{n^2}\,\delta n . \tag{7-19}$$
The transverse chromatic aberration δh′ is zero, because the image magnification is unity
regardless of the refractive index due to the zero refracting power of the plate.
The lateral color representing the difference in the heights of the blue and red chief rays
in the final image plane is given by Eq. (7-8):
$$\delta h_c' = -\gamma\,\delta S_2' = -\frac{h'}{L_2}\,\delta S_2' ,$$
or
$$\delta h_c' = -\frac{h'}{S}\,\frac{t}{n^2}\,\delta n . \tag{7-20}$$
Of course, the exit pupil, which is the image of the first surface by the second, also has
chromatic aberrations. That is why the centers of the blue and red exit pupil are shown in
Figure 7-6 to lie on the optical axis at Ob and Or , respectively. However, its impact on
Eq. (7-20) is a second-order effect.
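The axial and lateral colors of a plate can be illustrated with the Python sketch below. It locates the final image by applying Eq. (7-1) with R = ∞ at each of the two plane surfaces and compares the resulting blue-minus-red difference with the first-order expression of Eq. (7-19). The function names, the plate thickness, the object position, and the index values are assumptions for illustration.

```python
def refract_plane(n, n_prime, S):
    """Image distance for a plane surface: Eq. (7-1) with R = infinity."""
    return n_prime * S / n

def plate_image_distance(S, t, n):
    """Image distance S2' from the second surface of a plate in air.

    S is the (numerically negative) object distance from the first surface.
    """
    S1_prime = refract_plane(1.0, n, S)
    S2 = S1_prime - t
    return refract_plane(n, 1.0, S2)

def plate_axial_color(t, n, delta_n):
    """First-order axial color of Eq. (7-19)."""
    return t * delta_n / n ** 2

def plate_lateral_color(h_prime, S, t, n, delta_n):
    """First-order lateral color of Eq. (7-20)."""
    return -(h_prime / S) * plate_axial_color(t, n, delta_n)

# Assumed values: 10-mm plate, object 100 mm in front of it, image height -10 mm
n_d, n_F, n_C = 1.5168, 1.52238, 1.51432
S, t, h_prime = -100.0, 10.0, -10.0

first_order = plate_axial_color(t, n_d, n_F - n_C)
exact = plate_image_distance(S, t, n_F) - plate_image_distance(S, t, n_C)
lateral = plate_lateral_color(h_prime, S, t, n_d, n_F - n_C)
print(f"dS2' (first order) = {first_order:.5f} mm, exact = {exact:.5f} mm, "
      f"dh_c' = {lateral:.5f} mm")
```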
Figure 7-6. Chromatic aberrations of a plane-parallel plate. The axial color is δS2′,
and the lateral color is δhc′.
7.5 GENERAL SYSTEM
For a general imaging system, the object and image distances are related by the
Newtonian imaging equation
$$z z' = f f' , \tag{7-21}$$
where z is the object distance from the object-space focal point F, z′ is the image
distance from the image-space focal point F′, and f and f′ are the object-space and
image-space focal lengths of the imaging system, respectively, as illustrated in Figure 7-7.
In practice, a system is generally surrounded by air, and therefore n = n′ = 1 and
f = −f′.
Figure 7-7. General imaging system showing the location of its principal and focal
points H and H′, and F and F′, respectively. Also shown are the object and image
locations. For a system in air, n = 1 = n′ and f = −f′.
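Taking the logarithmic differential of Eq. (7-21) with f = −f′, we obtain
$$\frac{\delta z}{z} + \frac{\delta z'}{z'} = \frac{2}{f'}\,\delta f' . \tag{7-22}$$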
Let l be the distance of the object from the vertex V of the first surface of the system.
Similarly, let l ′ be the distance of the final image from the vertex V ′ of its last surface.
Also, let d and d ′ be the distances of the principal points H and H ′ from the vertices V
and V ′ , respectively. Then
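$$z = l - f - d = l + f' - d \tag{7-23a}$$
and
$$z' = l' - f' - d' . \tag{7-23b}$$
Differentiating Eqs. (7-23), we obtain
$$\delta z = \delta f' - \delta d \tag{7-24a}$$
and
$$\delta z' = \delta l' - \delta f' - \delta d' , \tag{7-24b}$$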
where δl is zero because the object position is independent of the wavelength. The object
and image distances are also related to each other by the transverse magnification Mt of
the image according to
$$z = -f/M_t = f'/M_t \tag{7-25a}$$
and
$$z' = -f' M_t . \tag{7-25b}$$
Substituting Eqs. (7-24) and (7-25) into Eq. (7-22), we obtain the axial color
$$\delta l' = \delta d' - M_t^2\,\delta d + (1 - M_t)^2\,\delta f' . \tag{7-26}$$
Thus, the axial color δ l ′ can be determined for any value of the image magnification Mt
from the change δf ′ of the image-space focal length f ′ and the displacements δd and
δ d ′ of the principal points H and H ′ , respectively. The displacements of the principal
and focal points are determined in the usual manner by tracing blue and red rays incident
on the system parallel to its optical axis.
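For the chromatic change in the image height, we write the transverse magnification in
the form
$$h'/h = -z'/f' \tag{7-27}$$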
and take its logarithmic differentiation, noting that the object height is independent of the
wavelength. Thus,
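$$\begin{aligned}
\frac{\delta h'}{h'} &= \frac{\delta z'}{z'} - \frac{\delta f'}{f'} \\
&= -\frac{\delta l' - \delta f' - \delta d'}{f' M_t} - \frac{\delta f'}{f'} \\
&= -\frac{1}{f'}\left[\frac{\delta l' - \delta d'}{M_t} + \left(1 - \frac{1}{M_t}\right)\delta f'\right] \\
&= \frac{1}{f'}\left[M_t\,\delta d - (M_t - 1)\,\delta f'\right] ,
\end{aligned} \tag{7-28}$$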
where we have substituted for δ l ′ from Eq. (7-26). The lateral color of the image lying at
a distance L from the exit pupil is given by
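$$\frac{\delta h_c'}{h'} = \frac{\delta h'}{h'} - \frac{\delta l'}{L}
= \frac{1}{f'}\left[M_t\,\delta d - (M_t - 1)\,\delta f'\right] - \frac{\delta l'}{L} . \tag{7-29}$$
For an object at infinity, Mt is zero, and Eqs. (7-26) and (7-29) reduce to
$$\delta l' = \delta d' + \delta f' \tag{7-30a}$$
and
$$\frac{\delta h_c'}{h'} = \frac{\delta f'}{f'} - \frac{\delta l'}{L} , \tag{7-30b}$$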
respectively. We note that if a system is designed so that its axial color δ l ′ is zero, its
lateral color δ hc′ is generally not equal to zero. We refer to a system as being achromatic
if its axial and lateral colors are both equal to zero.
The radius of the blue or red disk of rays in the red or the blue Gaussian image plane,
respectively, is given by Eq. (7-7), with δS ′ replaced by δ l ′ , as may be seen from Figure
7-8. Whereas the axial and transverse axial colors are independent of the position of the
aperture stop, the effect of a stop shift on the lateral color is given by Eq. (7-11). As a
simple example of a general system, the chromatic aberrations of a thick lens are
considered in Problem 7.2, where the condition for an achromatic focal length of a
singlet is derived.
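Equations (7-26), (7-28), and (7-29) translate directly into code. The Python sketch below evaluates them and, as a consistency check, applies them to a thin lens, for which the principal points do not shift (δd = δd′ = 0) and the result should agree with Eq. (7-12). The function names and the numerical values are assumptions for illustration.

```python
def axial_color(dd_prime, dd, df, Mt):
    """dl' from Eq. (7-26)."""
    return dd_prime - Mt ** 2 * dd + (1.0 - Mt) ** 2 * df

def fractional_height_change(dd, df, f, Mt):
    """dh'/h' from Eq. (7-28)."""
    return (Mt * dd - (Mt - 1.0) * df) / f

def fractional_lateral_color(dd_prime, dd, df, f, Mt, L):
    """dh_c'/h' from Eq. (7-29) for an exit pupil at a distance L from the image."""
    return fractional_height_change(dd, df, f, Mt) - axial_color(dd_prime, dd, df, Mt) / L

# Thin-lens check with assumed values: f' = 100 mm, V = 64.2, Mt = -0.5
f, V, Mt = 100.0, 64.2, -0.5
df = -f / V                      # from Eq. (7-12)
S_prime = f * (1.0 - Mt)         # thin-lens image distance for magnification Mt

general = axial_color(0.0, 0.0, df, Mt)
thin_lens = -S_prime ** 2 / (f * V)
print(f"dl' from Eq. (7-26) = {general:.4f} mm, dS' from Eq. (7-12) = {thin_lens:.4f} mm")

# With the exit pupil at the thin lens (L = S'), the lateral color vanishes
print(f"dh_c'/h' = {fractional_lateral_color(0.0, 0.0, df, f, Mt, S_prime):.2e}")
```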
7.6 DOUBLET
In this section, we determine the chromatic aberrations of a doublet, i.e., a system
consisting of two thin lenses. The lenses may be of the same or different materials. We
show that a doublet with two separated lenses cannot be achromatic. We also show that a
doublet consisting of two thin lenses of different materials in contact can be designed to
be achromatic.
Figure 7-8. Chromatic aberrations of a general imaging system. The axial color δl′
represents the difference in the distances of the blue and red images. The lateral
color δhc′ represents the difference in the heights of the blue and red chief rays in a
given image plane.
Consider two thin lenses of image-space focal lengths f1′ and f2′ separated by a
distance t, as in Figure 7-9. The focal length f′ of the combination is given by Eq. (4-32),
i.e.,
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{7-31}$$
Differentiating Eq. (7-31) and using Eq. (7-12), we find that the focal length f′ is
stationary, i.e., its differential is zero, if
$$t = \frac{f_1' V_1 + f_2' V_2}{V_1 + V_2} , \tag{7-32}$$
where V1 and V2 are the dispersive constants of the lenses. Although the variation of
focal length of a doublet with wavelength is much reduced (compared to that of a singlet)
by a combination of two lenses in this manner, it is not completely independent of the
wavelength. For example, if the spacing t is chosen by substituting the focal lengths and
V-numbers of the lenses for a certain wavelength, the blue and red focal lengths are
generally not equal to each other. However, they can be made equal, for example, if the
spacing t is chosen at a wavelength λ m for which the refractive index nm for each lens is
equal to the mean of the corresponding blue and red refractive indices, i.e., if λ m is such
that nm = (nF + nC)/2 (see Problem 7.4). The V-number of a lens in this case is
accordingly defined as Vm = (nm − 1)/(nF − nC).
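The statement that the blue and red focal lengths are equal when the spacing is chosen at the mean index can be verified numerically. In the Python sketch below, the focal length of each thin lens is scaled as 1/(n − 1), and blue and red light are modeled by changing n − 1 by the fractions +0.5/Vm and −0.5/Vm, respectively. The function names, the focal lengths, and the two V-numbers are assumptions for illustration.

```python
def combined_focal_length(f1, f2, t):
    """Focal length of two thin lenses spaced t apart, Eq. (7-31)."""
    return 1.0 / (1.0 / f1 + 1.0 / f2 - t / (f1 * f2))

def stationary_spacing(f1, V1, f2, V2):
    """Spacing t for a stationary focal length, Eq. (7-32)."""
    return (f1 * V1 + f2 * V2) / (V1 + V2)

def focal_length_at(f_mean, V_mean, eps):
    """Thin-lens focal length when n - 1 changes by the fraction eps / V_mean.

    eps = +0.5 and -0.5 represent blue and red light about the mean index,
    because the thin-lens power is proportional to n - 1.
    """
    return f_mean / (1.0 + eps / V_mean)

# Hypothetical lenses (focal lengths in cm) of two different glasses
f1, V1, f2, V2 = 15.0, 64.2, 7.5, 36.4
t = stationary_spacing(f1, V1, f2, V2)

f_mean = combined_focal_length(f1, f2, t)
f_blue = combined_focal_length(focal_length_at(f1, V1, 0.5), focal_length_at(f2, V2, 0.5), t)
f_red = combined_focal_length(focal_length_at(f1, V1, -0.5), focal_length_at(f2, V2, -0.5), t)
print(f"t = {t:.4f} cm, f'(mean) = {f_mean:.5f} cm, "
      f"f'(blue) = {f_blue:.5f} cm, f'(red) = {f_red:.5f} cm")
```

In this model the power 1/f′ is a quadratic function of the index change with no linear term, so the blue and red focal lengths coincide, and both are slightly longer than the focal length at the mean index.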
Figure 7-9. Doublet consisting of two thin lenses of focal lengths f1′ and f2′ spaced
apart by a distance t. Its focal length f′ is the same for blue and red light, but the
focal points are not coincident (because the principal points are not).
Because the focal length is stationary, δf′ = 0, and Eq. (7-26) for the axial color
reduces to
$$\delta l' = \delta d' - M_t^2\,\delta d . \tag{7-33}$$
Because the value of f ′ is the same for two wavelengths, the image-space focal point F ′
and the principal point H ′ for one wavelength are displaced from those for the other by
the same amount δ d ′ . Now, F ′ lies at a distance
$$t_2 = f'\left(1 - \frac{t}{f_1'}\right) \tag{7-34}$$
from the center of the second lens [see Eq. (4-34)]. Differentiating Eq. (4-34), we find
that H ′ and F ′ are displaced by
$$\delta d' \equiv \delta t_2 = f' t\,\frac{\delta f_1'}{f_1'^2} = -\frac{f' t}{f_1' V_1} . \tag{7-35}$$
Similarly, considering the distance f(1 − t/f2′) of the object-space focal point F from the
center of the first lens and noting that the object-space focal length f and the image-
space focal length f′ are related to each other according to f = −f′, we find that the
object-space principal point H and focal point F are displaced by an amount
$$\delta d = -f' t\,\frac{\delta f_2'}{f_2'^2} = \frac{f' t}{f_2' V_2} . \tag{7-36}$$
Substituting Eqs. (7-35) and (7-36) into Eq. (7-33), we obtain the axial color
$$\delta l' = -f' t\left(\frac{1}{f_1' V_1} + \frac{M_t^2}{f_2' V_2}\right) . \tag{7-37}$$
If the two lenses are made of the same material so that V1 = V2 = V, Eqs. (7-31) and
(7-32) reduce to
$$\frac{1}{f'} = \frac{1}{2}\left(\frac{1}{f_1'} + \frac{1}{f_2'}\right) \tag{7-39}$$
and
$$t = \frac{1}{2}\left(f_1' + f_2'\right) . \tag{7-40}$$
Because both f1′ and f2′ vary with the wavelength in the same manner, Eq. (7-40) can be
satisfied at one wavelength only, and the value of f′ at this wavelength may also be
written f′ = f1′f2′/t. Accordingly, the focal length of the doublet given by Eq. (7-39) is
independent of the wavelength to the first order in δn. Again, the blue and red focal
lengths are equal if the spacing t is chosen at a wavelength λm for which the refractive
index nm is equal to the mean of the blue and red refractive indices. Substituting for
t = f1′f2′/f′, Eqs. (7-35) through (7-38) reduce to
$$\delta d' = -f_2'/V , \tag{7-41a}$$
$$\delta d = f_1'/V , \tag{7-41b}$$
$$\delta l' = -\frac{1}{V}\left(f_2' + f_1' M_t^2\right) , \tag{7-41c}$$
and
$$\frac{\delta h_c'}{h'} = \frac{f_1' M_t}{f' V} - \frac{\delta l'}{L} . \tag{7-41d}$$
The variation of the focal length with wavelength is shown in Figure 7-10c. Its
minimum value is 10 cm, corresponding to a wavelength λ m = 0.5535 μm . Its value
increases as the wavelength deviates from this wavelength, but the deviation is quite
small, and the blue and red focal lengths are equal. Moreover, it is evident from the
parabola-like variation that there is a variety of pairwise wavelengths at which the focal
lengths are equal. Practically speaking, the variation of the focal length is negligible.
Now, the apparent size of an object as perceived by an observing eye is determined by the
size of the image formed on the retina, which, in turn, depends on the angle it subtends at
the eye. This angle for a point object at a certain height is independent of the wavelength
if the focal length is independent. Thus, the constant focal length of the eyepiece leads to
a constant magnification and, therefore, zero lateral color.
Figure 7-10. Doublet consisting of two thin lenses separated by a distance t1 ≡ t. (a)
Schematic of a Huygens eyepiece of focal length 10 cm. The two thin lenses are made
of BK7 glass. (b) The eyepiece forms an image at infinity of the image at F formed
by the objective (not shown). (c) Variation of focal length of the doublet with
wavelength. (d) Variation of back focal distance t2 with wavelength. The wavelength
is in micrometers, and t2 is in centimeters.
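The first-order quantities of a Huygens eyepiece follow from Eqs. (7-34), (7-39), (7-40), and (7-41). The Python sketch below collects them for two lenses of the same glass. The lens focal lengths of 15 cm and 7.5 cm are chosen so that the eyepiece focal length is 10 cm; the function name and the V-number of 64.2 are assumptions for illustration.

```python
def huygens_eyepiece(f1, f2, V):
    """First-order parameters of two same-glass thin lenses at the stationary spacing."""
    t = 0.5 * (f1 + f2)        # spacing, Eq. (7-40)
    f = f1 * f2 / t            # focal length, equivalent to Eq. (7-39)
    t2 = f * (1.0 - t / f1)    # back focal distance, Eq. (7-34)
    return {
        "t": t,
        "f": f,
        "t2": t2,
        "delta_d_prime": -f2 / V,   # Eq. (7-41a)
        "delta_d": f1 / V,          # Eq. (7-41b)
    }

# Assumed values: focal lengths in cm, V-number of the common glass
eyepiece = huygens_eyepiece(15.0, 7.5, 64.2)
for name, value in eyepiece.items():
    print(f"{name} = {value:.4f} cm")
```

The spacing of 11.25 cm, the focal length of 10 cm, and the back focal distance of 2.5 cm are the first-order values at the design wavelength; the principal-point displacements show that the focal points for blue and red light do not coincide even though the focal lengths do.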
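The back focal distance t2 of the doublet, i.e., the distance of the focal point F′ from the
second lens, is given by
$$\frac{1}{t_2} = \frac{1}{f_1' - t} + \frac{1}{f_2'} \tag{7-42a}$$
$$= \frac{1}{\dfrac{1}{(n - 1)\kappa_1} - t} + (n - 1)\kappa_2 , \tag{7-42b}$$
where κ for a lens in terms of the radii of curvature R1 and R2 of its two surfaces is given
by
$$\kappa_i = \left(\frac{1}{R_1} - \frac{1}{R_2}\right)_i , \quad i = 1, 2 . \tag{7-43}$$
Differentiating Eq. (7-42), we find that the variation of t2 with respect to n for lenses of
the same material is equal to zero if the value of t is given by
$$f_2' = -f_1'\left(1 - t/f_1'\right)^2 . \tag{7-44}$$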
It shows that the focal lengths f1′ and f2′ must be of opposite signs. Because the spacing
given by Eq. (7-44) is different from that given by Eq. (7-40), δ f ′ is no longer zero.
Therefore, Eq. (7-30b) shows that the lateral color given by (δf′/f′)h′ is not zero.
Therefore, the axial and lateral colors of a doublet with two separated thin lenses cannot
be simultaneously equal to zero. This is true even if the two lenses are made of different
materials, as may be seen from Eqs. (7-30). Zero axial color is obtained if δf′ = −δd′,
which, in turn, yields a lateral color of (δf′/f′)h′. The doublet is not achromatic unless
δ f ′ and δ d ′ are each equal to zero. This is (approximately) true in the case of a thin-lens
doublet discussed in Section 7.6.4. Accordingly, a Huygens eyepiece is achromatic if, for
example, its two separated lenses are each an achromatic thin-lens doublet.
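The condition of Eq. (7-44) can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes illustrative values f1′ = 10 and t = 5 (not taken from the text) and models a change in refractive index, for lenses of the same material, as a common scale factor s = (n − 1)/(n0 − 1) applied to both lens powers. The function name is hypothetical.

```python
def inverse_back_focal_distance(s, f1, f2, t):
    """Return 1/t2 from Eq. (7-42a) when both lens powers are scaled by s.

    For two lenses of the same material, changing n scales (n - 1) * kappa_i
    of both lenses by the same factor s, so f_i -> f_i / s.
    """
    return 1.0 / (f1 / s - t) + s / f2


# Illustrative values (assumed, not from the text).
f1, t = 10.0, 5.0
f2 = -f1 * (1.0 - t / f1) ** 2  # spacing condition of Eq. (7-44)

# Central-difference estimate of d(1/t2)/ds at the nominal index (s = 1).
eps = 1e-4
slope = (inverse_back_focal_distance(1 + eps, f1, f2, t)
         - inverse_back_focal_distance(1 - eps, f1, f2, t)) / (2 * eps)

print("f2' =", f2)
print("d(1/t2)/ds at s = 1:", slope)
```

The slope vanishes to within the finite-difference error, which is the statement that the back focal distance is stationary with respect to n; note that f2′ comes out negative, as the text observes.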
It is not surprising that a doublet consisting of two separated thin lenses is not
achromatic. It is shown next that a system with two separated components cannot be
achromatic unless each component is individually achromatic.
For an alternative proof for the system to be achromatic, we consider the imaging of
an object of height h1 lying at a distance S1 from L1 in two steps, as illustrated in Figure
7-12. L1 forms the image of the object at a distance S1′ with a height of h1′ given by
This image lies at a distance S2 from L2 , which forms its image at a distance S2′ with a
height h2′ given by
The axial color of the image formed by L2 is zero if S2′ is independent of wavelength. Its
lateral color is also independent of the wavelength if δ h2′ = 0. Or, because h1 and S1 are
= 0 , (7-47)
where we have used the fact that δ S2 = − δ S1′ because of the fixed spacing between L1
and L2 . Thus, δ h2′ = 0 if δ S1′ = 0 , i.e., if the image formed by L1 has zero axial color.
Equation (7-45) then shows that δ h1′ is also zero. Thus, the image formed by L1 must be
achromatic. Therefore, the two separated components L1 and L2 must be individually
achromatic if the system as a whole is to be achromatic.
$$\frac{f_1'}{f_2'} = -\frac{V_2}{V_1} . \tag{7-48}$$
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} , \tag{7-49}$$
$$f_1' = \frac{f'(V_1 - V_2)}{V_1} \tag{7-50a}$$
and
$$f_2' = \frac{f'(V_2 - V_1)}{V_2} . \tag{7-50b}$$
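As a small numerical illustration of Eqs. (7-48) to (7-50), the sketch below splits a prescribed doublet focal length between two thin lenses in contact. The Abbe numbers 60 and 30 and the function name are assumptions chosen only to give round numbers; they do not describe any particular glass pair.

```python
def achromat_focal_lengths(f, V1, V2):
    """Focal lengths of the two elements of a thin-lens achromat, Eqs. (7-50)."""
    f1 = f * (V1 - V2) / V1
    f2 = f * (V2 - V1) / V2
    return f1, f2


# Illustrative input (assumed): f' = 10, V1 = 60, V2 = 30.
f1, f2 = achromat_focal_lengths(10.0, 60.0, 30.0)

print("f1' =", f1, " f2' =", f2)
print("1/f1' + 1/f2' =", 1.0 / f1 + 1.0 / f2)  # Eq. (7-49): should equal 1/f'
print("f1'/f2' =", f1 / f2)                    # Eq. (7-48): should equal -V2/V1
```

The element with the larger Abbe number carries the positive power, and the two focal lengths have opposite signs whenever both Abbe numbers are positive.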
By the definition of a thin lens, the principal points of a thin-lens doublet coincide at
its center. Therefore, the blue and red focal points also coincide with each other.
Accordingly, both the axial and lateral colors are zero, regardless of the value of the
object distance. It should be noted, however, that the focal length of a thin-lens doublet
can be made the same at only two selected wavelengths for which the difference δn in
the refractive indices is used in defining V. This may be seen as follows: The focal
lengths f F′ and fC′ of the doublet for the F and C lines are equal to each other according
to Eq. (7-49) if
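$$\frac{1}{f_{F1}'} + \frac{1}{f_{F2}'} = \frac{1}{f_{C1}'} + \frac{1}{f_{C2}'} , \tag{7-51}$$
or
$$(n_{F1} - 1)\kappa_1 + (n_{F2} - 1)\kappa_2 = (n_{C1} - 1)\kappa_1 + (n_{C2} - 1)\kappa_2 , \tag{7-52}$$
or
$$\frac{\kappa_2}{\kappa_1} = -\frac{n_{F1} - n_{C1}}{n_{F2} - n_{C2}} . \tag{7-53}$$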
This is indeed the result obtained by substituting the expressions for the focal length and
the Abbe number from Eqs. (2-28) and (7-15a), respectively, into Eq. (7-49). The focal
lengths of the doublet for another pair of wavelengths will be equal to each other
provided the ratio of the differences in the refractive indices for them is equal to that
given by Eq. (7-53). The residual chromatic aberration at wavelengths other than λ F and
λ C is called the secondary spectrum.
The doublet has the same focal length for a third wavelength, e.g., the d line,
provided the refractive indices also satisfy the relation
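$$\frac{\kappa_2}{\kappa_1} = -\frac{n_{F1} - n_{d1}}{n_{F2} - n_{d2}} . \tag{7-54}$$
Comparing Eqs. (7-53) and (7-54), this requires that
$$\frac{n_{F1} - n_{d1}}{n_{F1} - n_{C1}} = \frac{n_{F2} - n_{d2}}{n_{F2} - n_{C2}} . \tag{7-55}$$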
[Figure 7-13, panels (a) and (b): (a) cemented BK7/SF2 doublet with R1 = 6.07, R2 = R3 = −4.22, and R4 = −14.29; (b) plot of f′/fd′ and of n versus λ.]
Figure 7-13. Achromatic thin-lens doublet. (a) Cemented doublet with a focal length
of 10 cm consisting of BK7 and SF2 glass lenses. The focal lengths of the two lenses
are 4.82 cm and –9.29 cm, respectively. (b) Variation of focal length with the
wavelength. The variation of the refractive index n of SF2 glass is also shown in the
figure. Its refractive index for the d line is 1.645, and its Abbe number is 33.60. The
Abbe number of BK7 is 64.17.
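The focal lengths quoted in the caption can be recovered from the radii labeled in Figure 7-13a with the thin-lens formula 1/f′ = (n − 1)(1/R1 − 1/R2). The sketch below is only a consistency check under stated assumptions: the d-line index of BK7 is taken as 1.517 (from the glass code #517642 quoted in this chapter), that of SF2 as 1.645 (from the caption), and lens thicknesses are neglected. The function name is hypothetical.

```python
def thin_lens_focal_length(n, R1, R2):
    """Thin-lens focal length from 1/f' = (n - 1)(1/R1 - 1/R2)."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))


n_bk7 = 1.517  # assumed d-line index of BK7 (glass code 517642)
n_sf2 = 1.645  # d-line index of SF2 quoted in the caption

# Radii in cm, as labeled in Figure 7-13a.
f1 = thin_lens_focal_length(n_bk7, 6.07, -4.22)
f2 = thin_lens_focal_length(n_sf2, -4.22, -14.29)

# Thin lenses in contact: the powers add.
f = 1.0 / (1.0 / f1 + 1.0 / f2)

print("f1' = %.2f cm, f2' = %.2f cm, f' = %.2f cm" % (f1, f2, f))
```

Within rounding, this reproduces the caption values of 4.82 cm, −9.29 cm, and 10 cm.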
Figure 7-13b. Its minimum occurs in the vicinity of the d line. We note again from the
parabola-like variation that there is a variety of pair-wise wavelengths for which the focal
lengths are equal. However, compared to a doublet with separated components, as in
Figure 7-10, there is a built-in design feature of equal focal lengths for the F and C lines.
We note from Eqs. (7-50) that because V1 and V2 are positive, f1′ and f2′ have
opposite signs. Moreover, specifying f′ and the dispersive constants of the lens
materials fixes the focal lengths f1′ and f2′. However, the focal length of a thin lens
depends on the difference in the curvatures of its surfaces, while its spherical aberration
and coma depend on the curvatures through its shape factor. This degree of freedom (i.e.,
the choice of the radii of curvature of its four surfaces) can be utilized to make the
achromatic thin-lens doublet free of spherical aberration and coma.
where f ′ is the image-space focal length, δd and δ d ′ are the axial colors of the object-
space and image-space principal points, and Mt is the image magnification. The axial
color is independent of any stop shift because the image distance is independent of the
location of the aperture stop. The radius of the blue (red) disk of rays in the red (blue)
Gaussian image plane is given by
$$r_i = (a/L)\,\delta S' , \tag{7-57}$$
where a is the radius of the exit pupil, and L is the distance of the image from it. It is also
independent of the stop shift because a/L is kept fixed when the stop is shifted. The
lateral color representing the difference in the heights of the blue and red chief rays in an
image plane is given by
$$\frac{\delta h_c'}{h'} = \frac{1}{f'}\left[M_t\,\delta d - (M_t - 1)\,\delta f'\right] - \frac{\delta S'}{L} , \quad \text{(Lateral Color)} \tag{7-58}$$
where L is the distance of the image from the exit pupil. For an object at infinity, the
magnification is zero. The values of lateral colors δh′c1 and δh′c2 corresponding to two
exit pupil locations, such that the image lies at distances L1 and L2 from them, are
related to each other according to
$$\delta h_{c2}' = \delta h_{c1}' + \left(\frac{1}{L_1} - \frac{1}{L_2}\right) h'\,\delta S' . \tag{7-59}$$
Equation (7-59) represents the stop-shift equation for the lateral color.
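The consistency of Eqs. (7-58) and (7-59) can be illustrated with a short sketch. All numerical values below are arbitrary placeholders (assumed, not from the text), and the function names are hypothetical. The lateral color is evaluated directly for two exit pupil locations and compared with the value obtained by shifting the stop.

```python
def lateral_color(h_img, f, Mt, dd, df, dS, L):
    """Lateral color delta h_c' from Eq. (7-58)."""
    return h_img * ((Mt * dd - (Mt - 1.0) * df) / f - dS / L)


def lateral_color_after_stop_shift(dhc1, h_img, dS, L1, L2):
    """Stop-shift relation of Eq. (7-59)."""
    return dhc1 + (1.0 / L1 - 1.0 / L2) * h_img * dS


# Placeholder values (assumed for illustration only).
h_img, f, Mt = 1.0, 10.0, -0.5
dd, df, dS = 0.01, -0.15, -0.2
L1, L2 = 12.0, 20.0

dhc1 = lateral_color(h_img, f, Mt, dd, df, dS, L1)
dhc2_direct = lateral_color(h_img, f, Mt, dd, df, dS, L2)
dhc2_shifted = lateral_color_after_stop_shift(dhc1, h_img, dS, L1, L2)

print(dhc2_direct, dhc2_shifted)
```

The two results agree because only the last term of Eq. (7-58) depends on the exit pupil location.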
$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) . \tag{7-60}$$
$$\frac{\delta f'}{f'} = -\frac{1}{V} , \tag{7-61}$$
where
$$V = \frac{n - 1}{\delta n} . \tag{7-62}$$
It is common practice to consider n as the refractive index for the yellow line of
helium (λ = 0.5876 μm), called the d line, and δn as the difference nF − nC between the
refractive indices for the Fraunhofer lines F and C, i.e., for the blue (λ = 0.4861 μm) and
red (λ = 0.6563 μm) lines of hydrogen. Glass manufacturers often give the refractive
index data as a six-digit number. For example, BK7 glass is specified as #517642. The
first three digits define its refractive index according to nd − 1 = 0.517, and the
remaining three digits define its dispersive constant according to
$$V = \frac{n_d - 1}{n_F - n_C} \tag{7-63a}$$
$$= 64.2 . \tag{7-63b}$$
The dispersive constant of a glass defined according to Eq. (7-15a) is called its Abbe
number.
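The six-digit glass code described above can be decoded mechanically. The following sketch assumes the convention stated in the text (first three digits give nd − 1 in thousandths, last three give V in tenths); the function name is hypothetical.

```python
def decode_glass_code(code):
    """Decode a six-digit glass code such as '517642' into (nd, V)."""
    digits = str(code).lstrip("#").zfill(6)
    nd = 1.0 + int(digits[:3]) / 1000.0  # nd - 1 = 0.517 for BK7
    V = int(digits[3:]) / 10.0           # V = 64.2 for BK7
    return nd, V


nd, V = decode_glass_code("#517642")
delta_n = (nd - 1.0) / V  # nF - nC implied by Eq. (7-63a)

print("nd =", nd, " V =", V, " nF - nC =", delta_n)
```

For BK7 this gives an index difference nF − nC of about 0.008 between the F and C lines.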
$$\delta S' = -\frac{S'^2}{f' V} \tag{7-64a}$$
and
$$\delta h_c' = h'\left(\frac{1}{S'} - \frac{1}{L}\right)\delta S' . \tag{7-64b}$$
These equations can be obtained from Eqs. (7-58) and (7-60) by setting the axial colors
δd and δ d ′ of the principal points of a general system equal to zero and using the thin-
lens imaging equations.
$$\delta S' = \frac{t}{n^2}\,\delta n \tag{7-65a}$$
and
$$\delta h_c' = -\frac{h'}{S}\,\frac{t}{n^2}\,\delta n . \tag{7-65b}$$
7.7.4 Doublet
The focal length f ′ of a doublet with lenses of focal lengths f1′ and f2′ spaced a
distance t apart is given by
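$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{7-66}$$
The focal length is achromatic if the spacing is
$$t = \frac{f_1' V_1 + f_2' V_2}{V_1 + V_2} , \tag{7-67}$$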
where V1 and V2 are the dispersive constants of the lenses. Its axial and lateral colors are
given by
$$\delta S' = -f' t\left(\frac{1}{f_1' V_1} + \frac{M_t^2}{f_2' V_2}\right) \tag{7-68a}$$
and
$$\frac{\delta h_c'}{h'} = \frac{t M_t}{f_2' V_2} - \frac{\delta l'}{L} . \tag{7-68b}$$
The axial colors of its principal points are
$$\delta d = -\frac{f' t}{f_2' V_2} \tag{7-69a}$$
and
$$\delta d' \equiv -\frac{f' t}{f_1' V_1} . \tag{7-69b}$$
If the lenses are made of the same material such that V1 = V2 = V , then
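$$\frac{1}{f'} = \frac{1}{2}\left(\frac{1}{f_1'} + \frac{1}{f_2'}\right) , \tag{7-70a}$$
$$t = \frac{1}{2}\left(f_1' + f_2'\right) , \tag{7-70b}$$
$$\delta S' = -\frac{1}{V}\left(f_2' + f_1' M_t^2\right) , \tag{7-70c}$$
$$\frac{\delta h_c'}{h'} = \frac{f_1' M_t}{V} - \frac{\delta l'}{L} , \tag{7-70d}$$
$$\delta d = f_1'/V , \tag{7-70e}$$
and
$$\delta d' = -f_2'/V . \tag{7-70f}$$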
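The same-material results can be tried on numbers resembling the Huygens eyepiece of Figure 7-10. In this sketch, f1′ = 15 cm is the value labeled in the figure, while f2′ = 7.5 cm is an assumption inferred here from f′ = 10 cm and t = 11.25 cm; V = 64.2 and Mt = −0.5 are likewise illustrative, and the function names are hypothetical. The sketch also checks that the general axial color of Eq. (7-68a) reduces to Eq. (7-70c).

```python
def same_material_doublet(f1, f2):
    """Focal length and spacing from Eqs. (7-70a) and (7-70b)."""
    f = 2.0 * f1 * f2 / (f1 + f2)
    t = 0.5 * (f1 + f2)
    return f, t


def axial_color_general(f, t, f1, f2, V1, V2, Mt):
    """Axial color from Eq. (7-68a)."""
    return -f * t * (1.0 / (f1 * V1) + Mt ** 2 / (f2 * V2))


def axial_color_same_material(f1, f2, V, Mt):
    """Axial color from Eq. (7-70c)."""
    return -(f2 + f1 * Mt ** 2) / V


f1, f2 = 15.0, 7.5   # cm; f2 is an inferred, assumed value
V, Mt = 64.2, -0.5   # illustrative values

f, t = same_material_doublet(f1, f2)
power_check = 1.0 / f1 + 1.0 / f2 - t / (f1 * f2)  # Eq. (7-66)

dS_general = axial_color_general(f, t, f1, f2, V, V, Mt)
dS_same = axial_color_same_material(f1, f2, V, Mt)

print("f' =", f, " t =", t, " 1/f' from Eq. (7-66) =", power_check)
print("axial color:", dS_general, dS_same)
```

The focal length and spacing come out as 10 cm and 11.25 cm, matching the figure labels, and the two axial-color expressions agree because f′t = f1′f2′ for this spacing.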
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} . \tag{7-71a}$$
It is achromatic if
$$\frac{f_1'}{f_2'} = -\frac{V_2}{V_1} , \tag{7-71b}$$
i.e., if
$$f_1' = \frac{f'(V_1 - V_2)}{V_1} \tag{7-71c}$$
and
$$f_2' = \frac{f'(V_2 - V_1)}{V_2} . \tag{7-71d}$$
The focal lengths of the doublet are equal for the F and C lines. The residual
chromatic aberration at wavelengths other than λ F and λ C is called the secondary
spectrum. A system corrected for three wavelengths is called apochromatic. Its focal
length for the F, C, and d lines is the same if the lenses have the same relative partial
dispersion (nF − nd)/(nF − nC), i.e., if
$$\frac{n_{F1} - n_{d1}}{n_{F1} - n_{C1}} = \frac{n_{F2} - n_{d2}}{n_{F2} - n_{C2}} . \tag{7-72}$$
REFERENCES
1. D. E. McCarthy, “The reflection and transmission of infrared materials, Part 1,
Spectra from 2 μm to 50 μm,” Appl. Opt. 2, 591–595 (1963).
PROBLEMS
7.1 Consider a plane-parallel plate placed in the path of a converging beam. The plate
has a refractive index of 1.5, a thickness of 1 cm, and a diameter of 4 cm. In the
absence of the plate, the beam comes to a focus at a distance of 8 cm from its front
surface at a height of 0.5 cm from its axis. (a) Calculate the position of the focus in
the presence of the plate. (b) Determine its chromatic aberrations for δn = 0.008
and illustrate by a diagram.
7.2 Consider the thick lens of refractive index n, thickness t, and surfaces of radii of
curvature R1 and R2 discussed in Section 4.6. (a) Show that its back focal distance
t2 can be written
$$\frac{1}{t_2} = (n - 1)\left[\frac{1}{R_1 - bt} - \frac{1}{R_2}\right] ,$$
where b = (n − 1)/n. (b) By letting ∂t2/∂n = 0, show that the position of its focal
point is achromatic if its thickness and radii of curvature are related according to
$$R_2 = \frac{(R_1 - bt)^2}{R_1 - b^2 t} .$$
Show that the corresponding focal length is given by
$$f' = \frac{b\,(t/R_1) - 1}{b^2 t}\,R_1^2 .$$
(c) Show that it is achromatic with respect to its focal length if its thickness is given
by
$$t = \frac{n^2 (R_1 - R_2)}{n^2 - 1} ,$$
or that the distance between the centers of curvature of its two surfaces is given by
t/n². Show that the corresponding focal length in this case is given by
$$\frac{1}{f'} = \frac{n - 1}{n + 1}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) .$$
7.3 Consider a concentric lens (see Problem 4.9) made of BK7 glass, with radii of
curvature 5 cm and 4 cm, placed in a converging beam of image-forming light of a
certain system such that the axial image is concentric with the lens. Calculate the
lateral color introduced by each surface and show that their contributions cancel
each other.
7.4 Show that a doublet is achromatic with respect to its focal length if the spacing t is
chosen at a wavelength λ m for which the refractive index nm for each lens is
equal to the mean of the corresponding blue and red refractive indices, i.e., if λm is
such that nm = (nF + nC)/2. The V-number of a lens in this case is given by
Vm = (nm − 1)/(nF − nC).
7.5 Consider the Mangin mirror of Problem 3.3 imaging an object so that the image
distance is S ′ . Show that its axial color is given by
$$\delta S' = S'^2\left[\frac{2 f_s' - R_1}{n R_1 f_s'}\right]\delta n .$$
For an aperture stop located at the mirror, its lateral color is zero.
CHAPTER 8
MONOCHROMATIC ABERRATIONS
In this chapter, the wave and transverse ray aberrations are discussed, and a
relationship between them is derived. The wave aberrations for a certain point object
represent the optical deviations of its wavefront at the exit pupil from being spherical.
The wave aberrations are zero if the wavefront is spherical, in which case all of the rays
converge to its center of curvature, and a perfect point image is obtained. The ray
aberrations represent the displacement of the rays from the Gaussian image point.
The concept of Strehl ratio as a measure of image quality is introduced next, and
balancing of an aberration of a certain order with one or more aberrations of lower orders
is discussed. The aberrations of a system are discussed in terms of Zernike circle
polynomials, which are not only orthogonal over a circular pupil but also represent
balanced classical aberrations with minimum variance across the pupil. The aberrations in
the form of Zernike polynomials are referred to as the orthogonal aberrations. The
relationships between the classical and orthogonal aberrations are discussed.
Although the transverse ray aberrations of a system for a certain point object can be
obtained by tracing the rays through the system and up to the image plane, they can also
be obtained from the wave aberrations. However, the distribution of rays in an image
plane does not represent the true picture of an image because it does not take into account
the diffraction of the wavefront at the exit pupil. Because the wave aberrations play a
fundamental role in determining the image quality, knowledge of them is essential. We
point out that the ray aberrations are not additive, in that those in the final image plane
cannot be obtained by adding their values in the intermediate image planes formed by the
surfaces of a system. Of course, the contribution of a surface to the ray aberration in the
final image plane can be obtained from its wave aberration using the parameters of the
final image.
8.2.1 Definitions
Consider an optical system imaging a point object P, as illustrated in Figure 8-1. The
object radiates a spherical wave. If the image is perfect, the diverging spherical wave
incident on the system is converted by it into a spherical wave converging to the Gaussian
image point P′. With a few exceptions, the wave exiting from practical systems is only
approximately spherical.
Figure 8-1. Perfect imaging by an optical system. P is the point object, and P′ is its
Gaussian image point.
We now introduce the concept of wave and ray aberrations associated with an object
ray and derive a relationship between the two. The optical path length of a ray in a
medium of refractive index n is equal to n times its geometrical path length. If rays from
a point object are traced through the system and up to the exit pupil such that each one
travels an optical path length equal to that of the chief ray, the surface passing through
their end points is called the system wavefront for the point object under consideration. If
the wavefront is spherical, with its center of curvature at the Gaussian image point, we
say that the image is perfect. The rays transmitted by the system in that case have equal
optical path lengths in propagating from P to P′, and they all pass through P′. If,
however, the actual wavefront deviates from the spherical wavefront, called the Gaussian
reference sphere, we say that the image is aberrated. The rays do not have equal optical
path lengths, and they intersect the Gaussian image plane in the vicinity of P′.
The optical deviation (i.e., the geometrical deviation times the refractive index ni of the
image space) of the wavefront from the Gaussian reference sphere along a ray is called its
wave aberration. It represents the difference between the optical path lengths of the ray
under consideration and the chief ray in traveling from the point object to the reference
sphere. Accordingly, the wave aberration associated with the chief ray is zero. Because
the optical path lengths of the rays from the reference sphere to the Gaussian image point
are equal, the wave aberration of a ray is also equal to the difference between its optical
path length from the point object P to the Gaussian image point P′ and that of the chief
ray.
The wave aberration of a ray from a point object is positive if it travels an extra
optical path length, compared to the chief ray, in order to reach the Gaussian reference
sphere [1]. Figures 8-2a and 8-2b illustrate the reference sphere S and the aberrated
wavefront W for on-axis and off-axis point objects P0 and P, respectively. The reference
sphere, which is centered at the Gaussian image point P0′ in Figure 8-2a or P′ in Figure
8-2b, and the wavefront pass through the center O of the exit pupil. The wave aberration
ni Q̄Q of a general ray GR0 or GR, as shown in the figures, is numerically positive.
Figure 8-2. (a) Aberrated wavefront for an on-axis point object. The reference
sphere S of radius of curvature R is centered at the Gaussian image point P0′. The
wavefront W and reference sphere S pass through the center O of the exit pupil ExP.
A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,
where the z axis is along the optical axis of the imaging system. Angular rotations α,
β, and γ about the three axes are also indicated. CR0 is the chief ray, and a general
ray GR0 is shown intersecting the Gaussian image plane at P0″.
Figure 8-2. (b) Aberrated wavefront for an off-axis point object. The reference
sphere S of radius of curvature R is centered at the Gaussian image point P′. The
value of R in this figure is slightly larger than its value in Figure 8-2a. GR is a
general ray intersecting the Gaussian image plane at the point P″. By definition,
the chief ray (not shown) passes through O, but it may or may not pass through P′.
The displacement of the chief ray in the image plane from P′ represents distortion.
We assume that a point object such as P lies along the x axis. (There is no loss of
generality due to this because the system is rotationally symmetric about the optical axis.)
The zx plane containing the optical axis and the point object is called the tangential or
the meridional plane. The corresponding Gaussian image point P′ lying in the Gaussian
image plane along its x axis also lies in the tangential plane. This may be seen by
considering a tangential object ray and Snell’s law, according to which the incident and
refracted (or reflected) rays lie in the same plane. The chief ray always lies in the
tangential plane. The plane normal to the tangential plane but containing the chief ray is
called the sagittal plane. As the chief ray bends when it is refracted or reflected by an
optical surface, so does the sagittal plane. It should be evident that only the chief ray lies
in both the tangential and sagittal planes because it lies along the line of intersection of
these two planes.
Figure 8-3. Right-hand coordinate system in the object, exit pupil, and image planes.
The optical axis of the system is along the z axis, and the off-axis point object P is
assumed to be along the x axis, thus making the zx plane the tangential plane.
Consider a ray such as GR from the object passing through the system and
intersecting the Gaussian image plane at P″(xi, yi), as illustrated in Figure 8-2b. The
displacement P′P″ of P″ from the Gaussian image point P′ is called the geometrical or
the transverse ray aberration. The distribution of rays in an image plane is called the ray
spot diagram. The ray aberrations and spot diagrams are discussed in Chapter 9. When
the wavefront is spherical, with its center of curvature at the Gaussian image point, then
the wave and ray aberrations are zero. In that case, all of the object rays transmitted by
the system pass through the Gaussian image point, and the image is said to be perfect.
$$W(x + \Delta x) = n_i\,(AB + CE) = W(x) + \Delta W , \tag{8-1}$$
where x + Δx is the height of the point D. Note that the angle EAC is equal to β, and
CE = βΔx. Now
$$\Delta W = n_i\,CE = n_i\,\beta\,\Delta x = n_i\,(x_i/R)\,\Delta x , \tag{8-2}$$
or
$$x_i = \frac{R}{n_i}\frac{\Delta W}{\Delta x} . \tag{8-3}$$
Figure 8-4. Wave and ray aberrations. W is the wavefront for a point object whose
Gaussian image lies at P′. P′P″ is the ray aberration, and ni AB is the wave
aberration of a ray ABP″ passing through a point B on the reference sphere S with
its center of curvature at P′. The ray ABP″ is normal to the wavefront W at the
point A, and BP′ is the surface normal at a point B on the reference sphere.
$$x_i = \frac{R}{n_i}\frac{\partial W}{\partial x} . \tag{8-4}$$
A similar equation is obtained for the y coordinate of the point P″. Thus, the wave and
ray aberrations are related to each other according to
$$(x_i, y_i) = \frac{R}{n_i}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right) . \tag{8-5}$$
Thus, if W ( x, y) is the wave aberration of a ray in the exit pupil, the corresponding
ray aberration in the image plane is given by its spatial derivative multiplied by the radius
of curvature of the Gaussian reference sphere and divided by the refractive index of the
image space. Because the rays are normal to a wavefront, the ray aberrations depend on
the shape of the wavefront and, therefore, on its geometrical path length difference from
the reference sphere. The division by ni in Eq. (8-5) converts the optical path length
difference into the geometrical path length difference. When an image is formed in free
space, as is often the case in practice, then ni = 1.
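The relation of Eq. (8-5) lends itself to a simple numerical sketch: given any wave aberration function W(x, y) in the exit pupil, the ray aberration follows from its gradient. The example below is illustrative only. It assumes a pure defocus W = A(x² + y²) with an arbitrary coefficient A, a reference-sphere radius R = 100, and ni = 1, and it uses central differences in place of analytic derivatives. The function names are hypothetical.

```python
def ray_aberration(W, x, y, R, n_i=1.0, step=1e-6):
    """Ray aberration (xi, yi) from Eq. (8-5) using central differences."""
    dW_dx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dW_dy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return (R / n_i) * dW_dx, (R / n_i) * dW_dy


A = 1e-4  # assumed defocus coefficient (same length units as x, y, and R)


def defocus(x, y):
    """Illustrative wave aberration: pure defocus, W = A (x^2 + y^2)."""
    return A * (x * x + y * y)


xi, yi = ray_aberration(defocus, 1.0, 0.5, R=100.0)

# Analytic result for comparison: (2 A R x, 2 A R y).
print(xi, yi)
```

For defocus, the ray aberration is proportional to the pupil coordinates of the ray, so rays from a circular zone of the pupil land on a circle in the image plane.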
Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the
exit pupil plane and thus correspond to θ = 0 or π. Similarly, the sagittal rays, i.e., those
lying in a plane orthogonal to the tangential plane but containing the chief ray, lie along
the y axis of the exit pupil plane and thus correspond to θ = π/2 or 3π/2. If W(r, θ)
represents the aberration in polar coordinates, then the ray aberrations are given by
$$(x_i, y_i) = \frac{R}{n_i}\left(\cos\theta\,\frac{\partial W}{\partial r} - \frac{\sin\theta}{r}\frac{\partial W}{\partial\theta},\; \sin\theta\,\frac{\partial W}{\partial r} + \frac{\cos\theta}{r}\frac{\partial W}{\partial\theta}\right) . \tag{8-7}$$
For a radially symmetric aberration W(r), a ray of zone r in the exit pupil plane
intersects the Gaussian image plane at a distance ri from the Gaussian image point given
by
$$r_i = \frac{R}{n_i}\frac{\partial W}{\partial r} . \tag{8-8}$$
[Figure 8-5: exit pupil ExP with the reference sphere S centered at P1 and the wavefront W centered at P2; Q1 and Q2 are points on S and W, respectively.]
to it at its vertex.) It is numerically positive because, compared with the chief ray passing
through O, it represents the extra optical path length that a ray passing through Q1 has to
travel in order to reach the reference sphere. Thus, the defocus wave aberration at the
point Q1 is given by
$$W(r) = \frac{n_i}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^2 , \tag{8-9}$$
where r is the distance of Q1 from the optical axis. We note that the defocus wave
aberration is proportional to r². If z ≈ R, then Eq. (8-9) may be written
$$W(r) \approx -\frac{n_i\,\Delta_R}{2R^2}\,r^2 , \tag{8-10}$$
where
$$\Delta_R = z - R \tag{8-11}$$
is called the longitudinal defocus. We note that the defocus wave aberration and the
longitudinal defocus have numerically opposite signs.
A defocus aberration is also introduced if the system is assembled properly, but the
image is observed in a plane other than the Gaussian image plane. Consider, for example,
an imaging system forming an aberration-free image at the Gaussian image point P2 .
(Note that the Gaussian image is now located at P2 in Figure 8-5.) Thus, the wavefront at
the exit pupil is spherical, passing through its center O with its center of curvature at P2 .
Let the image be observed in a defocused plane passing through a point P1 that lies on the
line joining O and P2 . For the observed image at P1 to be aberration free, the wavefront
at the exit pupil must be spherical, with its center of curvature at P1 . Such a wavefront
forms the reference sphere with respect to which the aberration of the actual wavefront
must be defined. Once again, the aberration of the wavefront at a point Q1 on the
reference sphere is given by Eq. (8-9).
For a system with a circular exit pupil of radius a, Eq. (8-9) may be written
$$W(\rho) = \frac{n_i}{2}\left(\frac{1}{z} - \frac{1}{R}\right) a^2 \rho^2 \tag{8-12a}$$
$$= B_d\,\rho^2 , \tag{8-12b}$$
where
$$\rho = r/a \tag{8-13}$$
is the normalized distance of a point in the pupil plane from its center, and
$$B_d = \frac{n_i}{2}\left(\frac{1}{z} - \frac{1}{R}\right) a^2 \tag{8-14a}$$
$$\approx -\,n_i\,\Delta_R / 8F^2 \tag{8-14b}$$
is the peak value of the defocus aberration. The quantity F in Eq. (8-14b) is the focal
ratio of the image-forming light cone. It is given by
$$F = R/2a . \tag{8-15}$$
We note that a positive value of Bd implies a negative value of the longitudinal defocus
ΔR, or z < R. Thus, an imaging system with a positive value of defocus aberration Bd can
be made defocus free if the image is observed in a plane lying farther from the exit pupil,
compared with the defocused image plane, by a distance 8BdF²/ni. Similarly, a
positive defocus aberration of Bd = −niΔR/8F² is introduced into the system if the
image is observed in a plane lying closer to the exit pupil, compared with the defocus-free
image plane, by a (numerically negative) distance ΔR.
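The size of the approximation in going from Eq. (8-14a) to Eq. (8-14b) can be gauged with a short sketch. The values below (R = 100, a = 5, ΔR = −0.01, ni = 1, all in the same length unit) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def defocus_coefficient_exact(z, R, a, n_i=1.0):
    """Peak defocus aberration B_d from Eq. (8-14a)."""
    return 0.5 * n_i * (1.0 / z - 1.0 / R) * a ** 2


def defocus_coefficient_approx(delta_R, F, n_i=1.0):
    """Peak defocus aberration B_d from Eq. (8-14b), valid for z close to R."""
    return -n_i * delta_R / (8.0 * F ** 2)


# Illustrative values (assumed).
R, a, n_i = 100.0, 5.0, 1.0
delta_R = -0.01           # longitudinal defocus, Eq. (8-11)
z = R + delta_R
F = R / (2.0 * a)         # focal ratio, Eq. (8-15)

B_exact = defocus_coefficient_exact(z, R, a, n_i)
B_approx = defocus_coefficient_approx(delta_R, F, n_i)
refocus = 8.0 * B_approx * F ** 2 / n_i  # plane shift that removes the defocus

print(B_exact, B_approx, refocus)
```

A negative longitudinal defocus gives a positive Bd, and the refocus distance 8BdF²/ni returns the magnitude of ΔR, in line with the discussion above.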
The radius of curvature Rik of the Petzval image surface of a system (of k imaging
surfaces) is given by Eq. (2-124). Therefore, an observation of the image of a point object
in the Gaussian image plane (i.e., on a planar surface) is equivalent to a longitudinal
defocus of h′²/(2Rik). Substituting into Eq. (8-14b), we obtain the corresponding field
curvature aberration of Bd = ni h′²/(16RikF²).
longer wavelengths. If the red wavefront is chosen as the reference sphere, then the
defocus wave aberration corresponding to an axial color of δS′ is given by
$$W_d(r) = -\frac{n_i\,\delta S'}{2R^2}\,r^2 , \tag{8-16}$$
where ni is the refractive index of the image space.
Next, we consider a wavefront tilt angle β and the corresponding wavefront tilt
aberration. We consider a system that has one or more of its optical elements
inadvertently tilted and/or decentered slightly, resulting in a transverse displacement of
the image of a point object from its Gaussian image at P1 to P2, as indicated in Figure 8-6.
Thus, a spherical wavefront with its center of curvature at P2 emerges from the exit
pupil of the system. The Gaussian reference sphere is, of course, centered at P1. The
aberration of the wavefront at a point Q1 on the reference sphere is its optical deviation
ni Q2Q1 from the reference sphere along the ray passing through Q1. It is evident that for
small values of the ray aberration P1P2, the wavefront and the reference sphere are tilted
with respect to each other by a small angle β. The ray and the wave aberrations can be
written
$$x_i = R\beta \tag{8-17}$$
Figure 8-6. Wavefront tilt. The spherical wavefront W is centered at P2, while the
reference sphere S is centered at P1. Thus, for small values of P1P2, the two
spherical surfaces are tilted with respect to each other by a small angle β = P1P2/R,
where R is their radius of curvature. The ray Q2P2 is normal to the wavefront at Q2.
and
$$W(r, \theta) = n_i\,\beta\,r\cos\theta , \tag{8-18}$$
respectively, where (r, θ) are the polar coordinates of the point Q1 projected onto the
plane of the exit pupil. Both the wave and ray aberrations are numerically positive in
Figure 8-6.
Once again, for a system with a circular exit pupil of radius a, Eq. (8-18) may be
written
$$W(\rho, \theta) = B_t\,\rho\cos\theta , \tag{8-19}$$
where
$$B_t = n_i\,a\,\beta \tag{8-20}$$
is the peak value of the tilt aberration. Note that a positive value of Bt implies that the
wavefront tilt angle β is also positive, as in Figure 8-6. Thus, if an aberration-free
wavefront is centered at P2, then an observation with respect to P1 as the origin implies
that we have introduced a tilt aberration of Bt ρ cos θ.
In the case of lateral color, the wavefronts are spherical, but their centers of curvature
lie at a higher height from the optical axis for the longer wavelength. Again choosing the
red wavefront as the reference sphere, the wavefront tilt aberration due to a lateral color
of δh′c is given by
$$W_t(r, \theta) = n_i\,\frac{\delta h_c'}{R}\,r\cos\theta . \tag{8-21}$$
8.5 ABERRATIONS OF A ROTATIONALLY SYMMETRIC SYSTEM
In this section, we obtain the form of the aberration terms for a rotationally
symmetric system, expand an aberration function in terms of them, and discuss the
primary aberrations. The aberration terms are discussed with and without their explicit
dependence on the object coordinates. The aberration function of a system is also
expanded in terms of Zernike polynomials, which are in widespread use in optical design
and testing.
Consider a rotationally symmetric optical system imaging a point object P. The axis
of rotational symmetry, namely, the optical axis, lies along the z axis. Let the position
vector of the object point be $\vec{h}$ with rectangular coordinates (xo, yo) in a plane
orthogonal to the optical axis. Similarly, let $\vec{r}$ be the position vector of a point with
rectangular coordinates (x, y) in the plane of the exit pupil of the system, which is also
orthogonal to the optical axis. The origins of (xo, yo) and (x, y) lie on the optical axis,
and we assume, for example, that the xo and x axes are coplanar.
Because the pupils of optical systems are generally circular, it is convenient to use
polar coordinates. Let (h, θo) and (r, θ) be the polar coordinates corresponding to the
rectangular coordinates (xo, yo) and (x, y) of the object and pupil points, respectively,
where
$$(x_o, y_o) = h\,(\cos\theta_o, \sin\theta_o) \tag{8-22}$$
and
$$(x, y) = r\,(\cos\theta, \sin\theta) . \tag{8-23}$$
Now, quantities that are invariant under rotation of the optical system about its axis of
symmetry are the three scalars h, r, and $\vec{h}\cdot\vec{r}$, where
$$h = |\vec{h}| = \left(x_o^2 + y_o^2\right)^{1/2} , \tag{8-24a}$$
$$r = |\vec{r}| = \left(x^2 + y^2\right)^{1/2} , \tag{8-24b}$$
and
$$\vec{h}\cdot\vec{r} = h r\cos(\theta - \theta_o) \tag{8-25a}$$
$$= x_o x + y_o y . \tag{8-25b}$$
In order that the aberration function consist of terms with positive integral powers of
the four rectangular coordinates, it must depend on the first two through h² and r². If we
rotate the system about the optical axis by a certain angle, the aberration function must
not change. We note that this is indeed the case. As the system rotates, so do the x and y
axes in each plane. Both θ and θo change by the angle of rotation, but h, r, and θ − θo
do not change. Thus, because of rotational symmetry, the aberration function depends on
the four variables (xo, yo) and (x, y) only through the three combinations h², r², and
hr cos(θ − θo). These combinations are called the rotational invariants of the aberration
function of an optical imaging system with an axis of rotational symmetry.
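The invariance argument above can be confirmed numerically. The sketch below, with arbitrary assumed coordinates and rotation angle and hypothetical function names, rotates the object and pupil vectors by the same angle about the optical axis and compares the three combinations h², r², and xo x + yo y before and after.

```python
import math


def rotational_invariants(xo, yo, x, y):
    """Return (h^2, r^2, h.r) as in Eqs. (8-24) and (8-25b)."""
    return xo * xo + yo * yo, x * x + y * y, xo * x + yo * y


def rotate(x, y, angle):
    """Rotate the point (x, y) about the z axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y


# Arbitrary object and pupil coordinates and rotation angle (assumed).
xo, yo, x, y = 0.3, -1.2, 0.8, 0.5
angle = 0.7

before = rotational_invariants(xo, yo, x, y)
after = rotational_invariants(*rotate(xo, yo, angle), *rotate(x, y, angle))

print(before)
print(after)
```

The individual coordinates change under the rotation, but the three combinations do not, which is why the aberration function can depend on the coordinates only through them.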
where $C_{ijm}$ and ${}_{l}a_{nm}$ are the expansion coefficients, i, j, and m are positive integers
including zero, 2i + m = l and 2j + m = n, and we have written the object height h in
terms of the image height h′. The subscripts on the coefficients ${}_{l}a_{nm}$ represent the
Because the aberration associated with the chief ray (for which r = 0) is zero, the
zero-degree term and those varying as $h'^{2i}$ but without any dependence on r must be
zero. The second-degree terms are also zero. For example, the term varying as r²
represents a defocus aberration that is independent of h′. It must be zero, because
otherwise the Gaussian image point with respect to which the aberration function is
defined must be incorrect. Similarly, the term varying as hr cos θ must also be zero,
because otherwise it implies a transverse shift of the image point, or the image height
being different from h′.
The first nonzero aberration terms are of fourth order, i.e., those for which k = 4 .
They are called the primary or the Seidel aberrations. The primary aberration function
may be written
$$W_P(h'; r, \theta) = {}_0a_{40}\,r^4 + {}_1a_{31}\,h' r^3\cos\theta + {}_2a_{22}\,h'^2 r^2\cos^2\theta + {}_2a_{20}\,h'^2 r^2 + {}_3a_{11}\,h'^3 r\cos\theta . \tag{8-27}$$
The values of the aberration coefficients depend on the construction parameters of the
system, such as the radii of curvature of its surfaces, the refractive indices of the spaces
between them, and the values of their spacing. The coefficients ${}_0a_{40}$, ${}_1a_{31}$, ${}_2a_{22}$, ${}_2a_{20}$,
and ${}_3a_{11}$ represent the coefficients of spherical aberration, coma, astigmatism, field
curvature, and distortion, respectively. The primary aberrations are listed in Table 8-1.
They can be determined from the Gaussian imaging characteristics of a system without
tracing rays [2]. They can also be determined from the ray-trace data of the paraxial chief
and marginal rays.
We note that the dependence of the field curvature term on the pupil coordinates is
just like the defocus aberration discussed in Section 8.3. Thus, this term is a defocus
whose coefficient varies quadratically with the height of the point object. It can be
eliminated by observing the image of a planar object on a curved surface (typically
spherical, as discussed in Section 9.3.3), thus the name field curvature. Similarly, the
dependence of the distortion term on the pupil coordinates is just like the wavefront tilt
aberration discussed in Section 8.4. Therefore, this term is a wavefront tilt aberration
whose coefficient varies cubically with the height of the point object. Accordingly, the
image of a point object in the presence of distortion is perfect, but it is transversally
displaced from the Gaussian image point; the displacement depends on the height of the
point object. The reason for the name distortion becomes clear when the image of an
extended object is considered. (For an example, see Section 9.3.4, where the distorted
image of a square grid is considered.)
8.5 Aberrations of a Rotationally Symmetric System 329
*The word “primary” is to be associated with these names, e.g., primary spherical.
\[ \rho = r/a , \quad 0 \le \rho \le 1 , \tag{8-28} \]
where a is the radius of the exit pupil of the system. Combining the aberration terms that
have different dependencies on the object coordinates but the same dependence on pupil
coordinates so that there is only one term for each pair of (n, m) values, the aberration
terms may also be written in the form $a_{nm}\rho^n\cos^m\theta$, where $n$ and $m$ are positive integers,
including zero, and $n - m \ge 0$ and even. Each aberration coefficient $a_{nm}$ depends on the
image height $h'$, and because $0 \le \rho \le 1$ and $|\cos\theta| \le 1$, it represents the peak value or
half of the peak-to-valley value of the corresponding aberration term, depending on
whether $m$ is even or odd, respectively.
The primary aberrations written in this simplified form are also listed in Table 8-1.
They correspond to terms with $n + m \le 4$. The primary aberration function of Eq. (8-27)
may be written in terms of these coefficients in the form
\[ W_P(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{40}\rho^4 , \tag{8-29} \]
where
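\[ a_{11} = {}_3a_{11}\,h'^3 a = a_t h'^3 a = A_t , \tag{8-30a} \]
\[ a_{20} = {}_2a_{20}\,h'^2 a^2 = a_d h'^2 a^2 = A_d , \tag{8-30b} \]
\[ a_{22} = {}_2a_{22}\,h'^2 a^2 = a_a h'^2 a^2 = A_a , \tag{8-30c} \]
\[ a_{31} = {}_1a_{31}\,h' a^3 = a_c h' a^3 = A_c , \tag{8-30d} \]
and
\[ a_{40} = {}_0a_{40}\,a^4 = a_s a^4 = A_s . \tag{8-30e} \]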
Comparing the distortion term given in Table 8-1 with the wavefront tilt aberration
given by Eq. (8-19b), we note that although the two are similar in their dependence on the
pupil coordinates, their coefficients depend on the image height differently. The
distortion coefficient $a_{11}$ (or $A_t$) varies with $h'$ as $h'^3$, but the tilt coefficient $B_t$ is
independent of $h'$. Similarly, comparing the field curvature term with the defocus wave
aberration given by Eq. (8-8b), we note that their dependence on the pupil coordinates is
the same. However, whereas the field curvature coefficient $a_{20}$ (or $A_d$) varies with $h'$ as
$h'^2$, the defocus coefficient $B_d$ is independent of $h'$. It is for these reasons that we have
used a different symbol, namely, $B$, for the defocus and tilt coefficients, compared to the
symbol $A$ for the field curvature and distortion coefficients.
The aberrations of sixth order, i.e., for which $k = 6$, are called the secondary or the
Schwarzschild aberrations. The aberration function through the sixth-order aberrations,
i.e., for $k \le 6$, or $n + m \le 6$, may be written
\[
W_S(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{33}\rho^3\cos^3\theta
+ a_{40}\rho^4 + a_{42}\rho^4\cos^2\theta + a_{51}\rho^5\cos\theta + a_{60}\rho^6 , \tag{8-31}
\]
where
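\[ a_{11} = \left({}_3a_{11}\,h'^3 + {}_5a_{11}\,h'^5\right) a , \tag{8-32a} \]
\[ a_{20} = \left({}_2a_{20}\,h'^2 + {}_4a_{20}\,h'^4\right) a^2 , \tag{8-32b} \]
\[ a_{22} = \left({}_2a_{22}\,h'^2 + {}_4a_{22}\,h'^4\right) a^2 , \tag{8-32c} \]
\[ a_{31} = \left({}_1a_{31}\,h' + {}_3a_{31}\,h'^3\right) a^3 , \tag{8-32d} \]
\[ a_{33} = {}_3a_{33}\,h'^3 a^3 , \tag{8-32e} \]
\[ a_{40} = \left({}_0a_{40} + {}_2a_{40}\,h'^2\right) a^4 , \tag{8-32f} \]
\[ a_{42} = {}_2a_{42}\,h'^2 a^4 , \tag{8-32g} \]
\[ a_{51} = {}_1a_{51}\,h' a^5 , \tag{8-32h} \]
\[ a_{60} = {}_0a_{60}\,a^6 . \tag{8-32i} \]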
Written in this form, the aberration function has nine aberration terms through the sixth
order. For convenience, the values of the indices n and m, and the combined aberration
8.6 Additivity of Primary Aberrations 331
terms along with their names, are listed in Table 8-2. Because the dependence of an
aberration term on the image height $h'$ is contained in the aberration coefficient $a_{nm}$, it
should be noted that the primary aberrations (including distortion and field curvature
terms) are not the same as those discussed earlier because they contain aberration
components not only of the fourth degree, but the sixth degree as well. For example,
$a_{40}\rho^4$ consists of spherical and lateral spherical aberrations ${}_0a_{40}\,a^4\rho^4$ and ${}_2a_{40}\,h'^2a^4\rho^4$.
Similarly, the aberration function through the eighth order can be written by
combining the primary, secondary, and the tertiary aberrations [2].
To determine the aberrations of a system for imaging a certain point object, we must
trace the object rays through the system and then determine their wave aberrations as the
differences in their optical path lengths in reaching the Gaussian reference sphere (with
its center of curvature at the Gaussian image point passing through the center of the exit
pupil of the system) from a certain reference ray. Typically, this ray is the chief ray that
passes through the center of the exit pupil. The wave aberrations represent the separations
of the wavefront from the reference sphere along the rays. The corresponding transverse
ray aberrations represent the separations of the rays in the Gaussian image plane from the
Gaussian image point.
Consider a ray $P_0A_1$ from the axial point object $P_0$ incident on the first surface at a
point $A_1$. Let the refracted ray intersect the first Gaussian image plane at $A_2$, where $P_1A_2$
represents the transverse aberration of the ray produced by the first surface. The wave
aberration of the ray, namely, its primary spherical aberration, is given by
Figure 8-7. Wave aberration $[A_3Q]$ and transverse ray aberration $P_2A_2$ of a ray
$P_0A_1$ originating at a point object $P_0$ when imaged by a system consisting of two
refracting surfaces separating media of refractive indices $n_0$, $n_1$, and $n_2$. $P_1$ is the
Gaussian image of $P_0$ formed by the first surface, and $P_2$ is the Gaussian image of
$P_1$ formed by the second surface.
where the square brackets indicate an optical path length, $r_1$ is the distance of point $A_1$
from the optical axis, and $a_1$ is the coefficient of spherical aberration. This coefficient
depends on the shape of the refracting surface, the object distance $D_{01}$, and the refractive
indices $n_0$ and $n_1$ [2]. Similarly, the wave aberration of the Gaussian image $P_2$ of the
point object $P_1$ formed by the second surface is given by
\[ W_2(r_2) = [P_1A_3P_2] - [P_1V_2P_2] = a_2 r_2^4 + O(r_2^6) , \tag{8-34} \]
where $r_2$ is the distance of point $A_3$ from the optical axis, and $a_2$ is the coefficient of its
spherical aberration that depends on the shape of the refracting surface, object distance
$D_{23}$, and the refractive indices $n_1$ and $n_2$. The distances $r_1$ and $r_2$ are approximately
related to each other according to
\[ r_2 = r_1\,\frac{D_{23}}{D_{12}} . \tag{8-35} \]
The wave aberration $W_s$ of the system in forming the image $P_2$ of the point object $P_0$ can
be written
\[ W_s = a_1 r_1^4 + O(r_1^6) + a_2 r_2^4 + O(r_2^6) , \tag{8-37} \]
thus demonstrating the additivity of the primary wave aberrations. Using Eq. (8-35), Eq.
(8-37) can be written in terms of a single variable $r_2$ in the form
\[ W_s(r_2) = a_s r_2^4 + O(r_2^6) , \tag{8-38} \]
where
\[ a_s = a_1\left(\frac{D_{12}}{D_{23}}\right)^4 + a_2 . \tag{8-39} \]
Now, according to Fermat’s principle, the difference in the optical path lengths of the
actual and virtual rays is of second order in the transverse distance between them. Thus,
\[ [A_1P_1A_3] - [A_1A_2A_3] \sim (P_1A_2)^2 = O(r_2^6) , \tag{8-40} \]
where $P_1A_2 \sim r_1^3 \sim r_2^3$ represents the transverse aberration of the ray for the first surface.
Therefore, by adding the optical path length of the incident ray $P_0A_1$ in Eq. (8-40), we
may write
\[ W_s(r_2) = [P_0A_1A_2A_3P_2] - [P_0V_1P_1V_2P_2] + O(r_2^6) . \tag{8-42} \]
Next, consider the Gaussian reference sphere of radius $P_2A_3$ passing through a point
$B$ on the optical axis. Thus,
\[ [A_3P_2] = [BP_2] . \tag{8-43} \]
Let the wavefront $W$ passing through the point $B$ intersect the actual ray $A_3A_4$ at $Q$.
Then, by definition, the wave aberration associated with the ray is $[A_3Q]$. It is
numerically negative because the optical path length $[P_0A_1A_2A_3]$ to reach the reference
\[ [P_0A_1A_2A_3P_2] + [A_3Q] = [P_0V_1P_1V_2BP_2] , \tag{8-45} \]
or
\[ [A_3Q] = [P_0A_1A_2A_3P_2] - [P_0V_1P_1V_2BP_2] = W_s(r_2) + O(r_2^6) . \tag{8-46} \]
Thus, in view of Eq. (8-38), the primary wave aberration associated with a ray is equal to
the sum of the primary aberrations associated with it for each of the two surfaces of the
system.
Now let us take into account the aberration difference between the point $Q$ lying on
the virtual ray $A_3P_2$ and the actual ray $A_3A_4$. If we let $Q_1$ and $Q_2$ be the points where
the wavefront intersects the two rays, the aberration difference $[A_3Q_2] - [A_3Q_1]$ is
proportional to the optical path difference between the two rays, which from Fermat's
principle is of second order in the transverse distance between them. This difference is
proportional to $[P_2A_4]^2$, which, in turn, is proportional to $r^6$. Thus, the primary wave
aberration of the two-surface system, being equal to the sum of the primary aberrations of
the two surfaces, is also valid for $Q$ lying on the actual ray $A_3A_4$. This result can be
generalized to a system consisting of any number of refracting and/or reflecting surfaces,
thus establishing the additivity theorem for primary wave aberrations.
It should also be clear that the primary aberrations cannot describe the exact wave
aberrations because the rays do not pass through the Gaussian image point formed by a
surface unless the image formed is indeed aberration free. To determine the exact wave
aberration, the optical path length of a ray must be determined by tracing it exactly from
the object plane to the Gaussian reference sphere of the system, and then compared with
that of the chief ray or some other reference ray.
The ray aberration of the image formed by the first surface in Figure 8-7 is $P_1A_2$,
given by
\[ P_1A_2 = \frac{D_{12}}{n_1}\,\frac{\partial W_1}{\partial r_1} . \tag{8-47} \]
It is numerically negative because the point $A_2$, where the ray intersects the first
Gaussian image plane, lies below the optical axis. If we consider a ray $P_1A_3$ originating at
the point object $P_1$ (which is the Gaussian image of the point object $P_0$ formed by the
first surface), it is refracted as a ray $A_3P_2'$. The transverse ray aberration associated with
this ray is $P_2P_2'$, given by
\[ P_2P_2' = \frac{D_{34}}{n_2}\,\frac{\partial W_2}{\partial r_2} . \tag{8-48} \]
The ray aberration for the system associated with the ray $P_0A_1$ incident on the first
surface from the point object $P_0$ is $P_2A_4$. It is given by
\[ P_2A_4 = \frac{D_{34}}{n_2}\,\frac{\partial W_s}{\partial r_2} . \tag{8-49} \]
It is numerically positive, because the point $A_4$ where the ray intersects the final image
plane is above the optical axis. It is not equal to the sum of the ray aberrations $P_1A_2$ and
$P_2P_2'$ of the two surfaces. In this sense, the ray aberrations are not additive.
Now, from Eq. (8-37), the primary wave aberration of the system can be written
Thus, Eq. (8-49) for the ray aberration of the system can be written
\[ P_2A_4 = \frac{D_{34}}{n_2}\left(\frac{\partial W_1}{\partial r_2} + \frac{\partial W_2}{\partial r_2}\right) . \tag{8-51} \]
The first term on the right-hand side of Eq. (8-51) represents the ray aberration
contribution of the first surface, and the second term represents that of the second surface.
Note the difference between the first term and the ray aberration $P_1A_2$ given by Eq. (8-47).
Such a difference will occur for each surface of a system, except for the last. Thus,
whereas the primary wave aberrations of the surfaces of a system are additive, their ray
aberrations are not. The ray aberration of the system must be obtained from its wave
aberration, and not by adding the ray aberrations of its surfaces.
So far, we have considered the imaging of an axial point object. For an off-axis point
object, additional primary aberrations, namely, coma, astigmatism, field curvature, and
distortion, appear. Of course, by definition of a primary aberration, they are all of fourth
order in the pupil and object coordinates. However, the same reasoning applies to show
that the primary aberrations are additive, i.e., the optical path length difference between a
8.7 Strehl Ratio and Aberration Balancing 337
real ray and a virtual ray is of second order in the transverse ray aberration. Because each
(primary) ray aberration is of third order, its square is of sixth order. Thus, the primary
wave aberrations are additive also for an off-axis point object.
Just as we can calculate the primary aberration of the image of a point object formed
by an imaging surface, we can also calculate its secondary and higher-order intrinsic
aberrations. However, adding the intrinsic secondary aberrations of the surfaces will not
yield the correct secondary aberrations of the system. The reason is simple: the image
formed by the previous surface, instead of being a point, is actually a spot diagram
resulting from its primary aberrations. The primary aberrations of the image formed by
this surface must be taken into account to determine the extrinsic secondary aberrations
of the next surface. The sum of the intrinsic and extrinsic secondary aberrations of a
surface yields its total secondary aberrations. The sum of the total secondary aberrations
of the surfaces yields the correct secondary aberrations of the system. Similarly, the
primary and secondary aberrations of an image formed by a surface must be taken into
account to determine the tertiary aberrations of the next surface; the aberrations of the
surfaces are then added to determine those of the system, and so on.
\[ S \sim \exp\left(-\sigma_\Phi^2\right) , \tag{8-52} \]
where $\sigma_\Phi^2$ is the variance of the phase aberration across the exit pupil. The variance is
given by
\[ \sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 , \tag{8-53} \]
where the mean and the mean square values of the aberration are obtained from the
expression
\[ \langle\Phi^n\rangle = \pi^{-1}\int_0^1\!\!\int_0^{2\pi}\Phi^n(\rho, \theta)\,\rho\,d\rho\,d\theta , \tag{8-54} \]
with $n = 1$ and 2, respectively. The amplitude across the pupil is assumed to be
uniform; otherwise, it would act as a weighting function in the integral in Eq. (8-54).
For a high-quality imaging system, a typical value of the Strehl ratio desired is 0.8,
corresponding to a wave aberration with $\sigma_w = \lambda/14$, where $\sigma_w = (\lambda/2\pi)\,\sigma_\Phi$ is the
standard deviation of the wave aberration, or the wavefront sigma.
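As a quick numerical check of the statement above, the following minimal sketch evaluates Eq. (8-52) for a wavefront sigma expressed in waves. The function name `strehl_from_sigma_w` is only illustrative and does not come from the text.

```python
import math

def strehl_from_sigma_w(sigma_w_in_waves: float) -> float:
    """Approximate Strehl ratio S ~ exp(-sigma_phi^2), Eq. (8-52).

    sigma_w_in_waves is the wavefront sigma in units of the wavelength,
    so the phase sigma in radians is 2*pi times that value.
    """
    sigma_phi = 2.0 * math.pi * sigma_w_in_waves
    return math.exp(-sigma_phi ** 2)

# Wavefront sigma of lambda/14, the value quoted for a Strehl ratio of about 0.8
s_14 = strehl_from_sigma_w(1.0 / 14.0)
print(f"sigma_w = lambda/14 gives S of about {s_14:.2f}")
```

The result is close to, but not exactly, 0.8, which is consistent with the approximate nature of Eq. (8-52).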
Table 8-3 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or
Seidel) aberration, where its coefficient $A_i$ represents the peak value of the aberration. It
also lists the aberration tolerance, i.e., the value of the aberration coefficient $A_i$, for a
Strehl ratio of 0.8. The aberration tolerance listed in Table 8-3 is for the wave (as opposed
to the phase) aberration coefficient, as is customary in optics. It should be understood that
the tolerance numbers given are not accurate to the second decimal place. They are listed
as such for consistency only. Note that the dependence of the field curvature on $\rho$ as $\rho^2$
is the same as that for the defocus wave aberration. Similarly, the dependence of
distortion on $(\rho, \theta)$ as $\rho\cos\theta$ is the same as that for the wavefront tilt aberration.

Aberration | Form | Sigma | Tolerance for S = 0.8
Spherical | $A_s\rho^4$ | $2A_s/(3\sqrt{5}) = A_s/3.35$ | $\lambda/4.19$
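\[ \Phi(\rho) = A_s\rho^4 + B_d\rho^2 . \tag{8-55} \]
\[ \langle\Phi\rangle = \frac{A_s}{3} + \frac{B_d}{2} \tag{8-56} \]
and
\[ \langle\Phi^2\rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_sB_d}{2} . \tag{8-57} \]
\[ \sigma_\Phi^2 = \frac{4A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_sB_d}{6} . \tag{8-58} \]
\[ \frac{\partial\sigma_\Phi^2}{\partial B_d} = 0 , \tag{8-59} \]
and checking that it yields a minimum and not a maximum. Thus, we find that the
optimum value is $B_d = -A_s$, and the balanced aberration is given by
\[ \Phi_{bs}(\rho) = A_s\left(\rho^4 - \rho^2\right) . \tag{8-60} \]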
Its standard deviation or sigma value is $A_s/(6\sqrt{5})$, which is a factor of 4 smaller than the
corresponding value $2A_s/(3\sqrt{5})$ for $B_d = 0$. Because the sigma value of the aberration has
been reduced by a factor of 4, its tolerance has been increased by the same factor. For
example, $S = 0.8$ is obtained in the Gaussian image plane for $A_s = \lambda/4$. However, the
same Strehl ratio is obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that
$B_d = -\lambda$. The defocused image plane lies at a distance $8\lambda F^2$ from the Gaussian image
plane.
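The balancing argument can be verified with exact rational arithmetic. The sketch below assumes a uniformly illuminated unit pupil, for which the mean of $\rho^p$ over the disk is $2/(p+2)$; the helper names are illustrative only.

```python
from fractions import Fraction

def disk_mean_of_power(p: int) -> Fraction:
    """Mean of rho**p over the unit disk: 2 * integral_0^1 rho**p * rho d(rho)."""
    return Fraction(2, p + 2)

def variance(a_s: Fraction, b_d: Fraction) -> Fraction:
    """Variance of Phi = a_s*rho^4 + b_d*rho^2, i.e. <Phi^2> - <Phi>^2."""
    mean = a_s * disk_mean_of_power(4) + b_d * disk_mean_of_power(2)
    mean_sq = (a_s ** 2 * disk_mean_of_power(8)
               + 2 * a_s * b_d * disk_mean_of_power(6)
               + b_d ** 2 * disk_mean_of_power(4))
    return mean_sq - mean ** 2

a_s = Fraction(1)
unbalanced = variance(a_s, Fraction(0))   # spherical aberration alone
balanced = variance(a_s, -a_s)            # balanced with defocus B_d = -A_s

# Scan B_d between -2*A_s and 0 in steps of 0.01 to locate the minimum variance.
candidates = [Fraction(k, 100) for k in range(-200, 1)]
best_b_d = min(candidates, key=lambda b: variance(a_s, b))

print(unbalanced, balanced, best_b_d)
```

The variance ratio of 16 corresponds to the factor of 4 in sigma quoted in the text.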
Similarly, we balance astigmatism with defocus and coma with tilt. Table 8-4 lists
the forms of balanced primary aberrations, their standard deviations, and their tolerances
for a Strehl ratio of 0.8, according to Eq. (8-52). Also listed in the table is the location of
the diffraction focus, i.e., the point with respect to which the aberration variance is
minimum so that the Strehl ratio is maximum at this point. The amount of balancing
defocus is minus half the amount of astigmatism, or the diffraction focus lies at a distance
$4F^2A_a$ from the Gaussian image plane along the z axis. The balancing tilt in the case of
coma is minus two-thirds the amount of coma. Thus, the maximum Strehl ratio is
obtained at a point that is displaced from the Gaussian image point by $4FA_c/3$ but lies in
the Gaussian image plane. The balancing of higher-order aberrations can be considered in
a similar manner.
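Aberration | Balanced form | Diffraction focus* | Sigma | Tolerance for S = 0.8
Spherical | $A_s(\rho^4 - \rho^2)$ | $(0, 0, 8F^2A_s)$ | $A_s/(6\sqrt{5})$ | $0.955\lambda$
Coma | $A_c(\rho^3 - 2\rho/3)\cos\theta$ | $(4FA_c/3, 0, 0)$ | $A_c/(6\sqrt{2})$ | $0.604\lambda$
Astigmatism | $A_a\rho^2(\cos^2\theta - 1/2) = (A_a/2)\rho^2\cos 2\theta$ | $(0, 0, 4F^2A_a)$ | $A_a/(2\sqrt{6})$ | $0.349\lambda$
*The diffraction focus coordinates are relative to the Gaussian image point.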
8.8 Zernike Circle Polynomials 341
We normalize the radial coordinate $r$ of a point on the circular pupil by its radius $a$ so
that the maximum value of $\rho = r/a$ is unity. We refer to a pupil normalized in this
manner as a unit circular pupil.
where $c_{nm}$ is a Zernike expansion coefficient, and $n$ and $m$ are positive integers including
zero such that $n - m \ge 0$ and even. The radial and angular dependence of the polynomials
is given by
\[ Z_n^m(\rho, \theta) = \left[\frac{2(n+1)}{1+\delta_{m0}}\right]^{1/2} R_n^m(\rho)\cos m\theta , \tag{8-62} \]
where
\[ R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s\,(n-s)!}{s!\left(\frac{n+m}{2}-s\right)!\left(\frac{n-m}{2}-s\right)!}\,\rho^{n-2s} . \tag{8-63} \]
The radial polynomials $R_n^m(\rho)$ are orthogonal to each other according to
\[ \int_0^1 R_n^m(\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho = \frac{1}{2(n+1)}\,\delta_{nn'} , \tag{8-64} \]
Moreover,
\[ R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ -\delta_{m0} & \text{for odd } n/2 . \end{cases} \tag{8-66} \]
The variation of some typical radial polynomials is shown in Figure 8-8. The angular
functions are orthogonal according to
\[ \int_0^{2\pi}\cos m\theta\,\cos m'\theta\,d\theta = \pi\left(1+\delta_{m0}\right)\delta_{mm'} . \tag{8-68} \]
\[ \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} Z_n^m(\rho, \theta)\,Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta = \delta_{nn'}\,\delta_{mm'} . \tag{8-69} \]
The mean value of a polynomial, except piston, is zero, as may be seen by letting
$n' = 0 = m'$.
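The radial polynomial of Eq. (8-63) and its orthogonality relation, Eq. (8-64), can be checked exactly with rational arithmetic. This is a minimal sketch; the function names `radial_coeffs` and `radial_inner_product` are illustrative and not part of the text.

```python
from fractions import Fraction
from math import factorial

def radial_coeffs(n: int, m: int) -> dict:
    """Coefficients of R_n^m(rho) from Eq. (8-63), as {power of rho: coefficient}."""
    if m < 0 or n < m or (n - m) % 2:
        raise ValueError("need n >= m >= 0 with n - m even")
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = Fraction(num, den)
    return coeffs

def radial_inner_product(n1: int, n2: int, m: int) -> Fraction:
    """Exact integral_0^1 R_n1^m R_n2^m rho d(rho).

    Each product term rho**p * rho**q * rho integrates to 1/(p + q + 2).
    """
    c1, c2 = radial_coeffs(n1, m), radial_coeffs(n2, m)
    return sum(a * b * Fraction(1, p + q + 2)
               for p, a in c1.items() for q, b in c2.items())

print(radial_coeffs(4, 0))               # coefficients of 6 rho^4 - 6 rho^2 + 1
print(radial_inner_product(4, 4, 0))     # should equal 1 / (2 (n + 1)) for n = 4
print(radial_inner_product(2, 4, 0))     # different n, same m: orthogonal
```

Working with `Fraction` avoids any quadrature error, so the orthogonality relation is reproduced exactly rather than to within rounding.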
The index $n$ of a Zernike polynomial represents its radial degree or the order
because it represents the highest power of $\rho$ in the polynomial. This is different from the
order of a classical aberration, which represents the degree of the Cartesian coordinates of
the point object (for which the aberration function is being considered) and pupil points
(see Section 8.5.1). The index $m$ of a polynomial is referred to as its azimuthal frequency.
The polynomials are ordered such that a polynomial with a lower value of $n$ is ordered
first, and for a given value of $n$, a polynomial with a lower value of $m$ is ordered first. The
polynomials through $n = 8$ and $m = 0$ are given in Table 8-5. The number of
polynomials of a certain order $n$ is $(n/2) + 1$ when $n$ is even, and $(n + 1)/2$ when it is odd.
Their number through an order $n$ is given by
\[ N_n = \left(\frac{n}{2} + 1\right)^2 \quad \text{for even } n , \tag{8-70a} \]
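Multiplying both sides of Eq. (8-61) by $Z_{n'}^{m'}(\rho, \theta)$, integrating over the unit pupil,
and utilizing the orthonormality Eq. (8-69), we obtain
\[
\int_0^1\!\!\int_0^{2\pi} W(\rho, \theta)\,Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta
= \sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\int_0^1\!\!\int_0^{2\pi} Z_n^m(\rho, \theta)\,Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta
= \pi\sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\,\delta_{nn'}\,\delta_{mm'} = \pi c_{n'm'} , \tag{8-71}
\]
or
\[ c_{nm} = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} W(\rho, \theta)\,Z_n^m(\rho, \theta)\,\rho\,d\rho\,d\theta . \tag{8-72} \]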
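\[ \sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 , \tag{8-73} \]
where the angular brackets indicate a mean value over the unit pupil according to
\[ \langle W^k(\rho, \theta)\rangle = \int_0^1\!\!\int_0^{2\pi} W^k(\rho, \theta)\,\rho\,d\rho\,d\theta \bigg/ \int_0^1\!\!\int_0^{2\pi}\rho\,d\rho\,d\theta , \quad k = 1, 2 . \tag{8-74} \]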
Figure 8-8. Variation of a Zernike circle radial polynomial $R_n^m(\rho)$ as a function of
$\rho$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.
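Table 8-5. Orthonormal Zernike circle polynomials and their names when identified
with aberrations.

n | m | Orthonormal Zernike polynomial | Aberration name*
0 | 0 | $1$ | Piston
2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Field curvature (defocus)
2 | 2 | $\sqrt{6}\,\rho^2\cos 2\theta$ | Primary astigmatism
3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho)\cos\theta$ | Primary coma
3 | 3 | $\sqrt{8}\,\rho^3\cos 3\theta$ |
4 | 0 | $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$ | Primary spherical
4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2)\cos 2\theta$ | Secondary astigmatism
4 | 4 | $\sqrt{10}\,\rho^4\cos 4\theta$ |
5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho)\cos\theta$ | Secondary coma
5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3)\cos 3\theta$ |
5 | 5 | $\sqrt{12}\,\rho^5\cos 5\theta$ |
6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical
6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2)\cos 2\theta$ | Tertiary astigmatism
6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4)\cos 4\theta$ |
6 | 6 | $\sqrt{14}\,\rho^6\cos 6\theta$ |
7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho)\cos\theta$ | Tertiary coma
7 | 3 | $4\,(21\rho^7 - 30\rho^5 + 10\rho^3)\cos 3\theta$ |
7 | 5 | $4\,(7\rho^7 - 6\rho^5)\cos 5\theta$ |
7 | 7 | $4\,\rho^7\cos 7\theta$ |
8 | 0 | $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$ | Tertiary spherical
*The words “orthonormal Zernike” are to be associated with these names, e.g., orthonormal
Zernike primary astigmatism.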
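\[ \langle W(\rho, \theta)\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\,\delta_{n0}\,\delta_{m0} = c_{00} , \tag{8-75} \]
where again we have used the orthonormality Eq. (8-69) for $n' = 0 = m'$, i.e., for $Z_0^0 = 1$.
The mean square value of the aberration function is given by
\[
\langle W^2(\rho, \theta)\rangle = \frac{1}{\pi}\sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\sum_{n'=0}^{\infty}\sum_{m'=0}^{n'} c_{n'm'}\int_0^1\!\!\int_0^{2\pi} Z_n^m(\rho, \theta)\,Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta
= \sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\sum_{n'=0}^{\infty}\sum_{m'=0}^{n'} c_{n'm'}\,\delta_{nn'}\,\delta_{mm'}
= \sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}^2 . \tag{8-76}
\]
or
\[ \sigma_W^2 = \sum_{n=1}^{\infty}\sum_{m=0}^{n} c_{nm}^2 . \tag{8-78} \]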
Thus, the variance of the aberration function is equal to the sum of the squares of the
orthonormal expansion coefficients $c_{nm}$, except $c_{00}$. It illustrates that an orthonormal
coefficient represents the standard deviation of the corresponding polynomial aberration
term. We point out that unless the mean value of the aberration $\langle W\rangle = 0$,
$\sigma_W \ne W_{rms}$, where $W_{rms} = \langle W^2\rangle^{1/2}$ is the root-mean-square (rms) value of the aberration.
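The distinction between sigma and rms can be illustrated with a radially symmetric aberration built from the piston, defocus, and primary spherical polynomials of Table 8-5. The coefficient values below are arbitrary illustrative numbers, not values from the text, and the helper names are hypothetical.

```python
import math

# Radially symmetric orthonormal polynomials as {power of rho: coefficient}
Z00 = {0: 1.0}
Z20 = {2: 2 * math.sqrt(3), 0: -math.sqrt(3)}
Z40 = {4: 6 * math.sqrt(5), 2: -6 * math.sqrt(5), 0: math.sqrt(5)}

def combine(terms):
    """Return sum of coefficient * polynomial as {power: coefficient}."""
    out = {}
    for c, poly in terms:
        for p, v in poly.items():
            out[p] = out.get(p, 0.0) + c * v
    return out

def square(poly):
    """Polynomial product poly * poly."""
    out = {}
    for p, a in poly.items():
        for q, b in poly.items():
            out[p + q] = out.get(p + q, 0.0) + a * b
    return out

def disk_mean(poly):
    """Mean over the unit disk, using <rho^p> = 2 / (p + 2)."""
    return sum(v * 2.0 / (p + 2) for p, v in poly.items())

c00, c20, c40 = 0.30, 0.10, -0.05   # illustrative coefficients, in waves
W = combine([(c00, Z00), (c20, Z20), (c40, Z40)])

mean_W = disk_mean(W)
mean_sq_W = disk_mean(square(W))
variance = mean_sq_W - mean_W ** 2
sigma = math.sqrt(variance)
rms = math.sqrt(mean_sq_W)
print(mean_W, variance, sigma, rms)
```

Because the piston coefficient is nonzero here, the rms value exceeds sigma; setting `c00 = 0` makes the two coincide.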
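An even number is associated with a cosine polynomial, and an odd number with a sine
polynomial. The orthogonality of the trigonometric functions yields
\[
\int_0^{2\pi} d\theta
\begin{cases}
\cos m\theta\,\cos m'\theta , & j \text{ and } j' \text{ are both even} \\
\cos m\theta\,\sin m'\theta , & j \text{ is even and } j' \text{ is odd} \\
\sin m\theta\,\cos m'\theta , & j \text{ is odd and } j' \text{ is even} \\
\sin m\theta\,\sin m'\theta , & j \text{ and } j' \text{ are both odd}
\end{cases}
\]
Therefore, the polynomials are orthonormal over a unit circular pupil according to
\[ \int_0^1\!\!\int_0^{2\pi} Z_j(\rho, \theta)\,Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta \bigg/ \int_0^1\!\!\int_0^{2\pi}\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{8-81} \]
\[ N_n = (n + 1)(n + 2)/2 . \tag{8-82} \]
\[ a_j = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} W(\rho, \theta)\,Z_j(\rho, \theta)\,\rho\,d\rho\,d\theta . \tag{8-84} \]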
It is evident from Eq. (8-84) that the value of a coefficient $a_j$ is independent of the
number $J$ of the polynomials used in Eq. (8-83) for the expansion of the aberration
function. Thus, one or more polynomial terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.

Table 8-6. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices $j$, $n$, and $m$
are called the polynomial number, radial degree, and azimuthal frequency,
respectively. The polynomials $Z_j$ are ordered such that an even $j$ corresponds to a
symmetric polynomial varying as $\cos m\theta$, and an odd $j$ corresponds to an
antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of $n$
is ordered first, and for a given value of $n$, a polynomial with a lower value of $m$ is
ordered first.

j | n | m | $Z_j(\rho, \theta)$ | Aberration name*
4 | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Defocus
7 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho)\sin\theta$ | Primary y-coma
8 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho)\cos\theta$ | Primary x-coma
9 | 3 | 3 | $\sqrt{8}\,\rho^3\sin 3\theta$ |
10 | 3 | 3 | $\sqrt{8}\,\rho^3\cos 3\theta$ |
11 | 4 | 0 | $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$ | Primary spherical aberration
12 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2)\cos 2\theta$ | 0° secondary astigmatism
13 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2)\sin 2\theta$ | 45° secondary astigmatism
14 | 4 | 4 | $\sqrt{10}\,\rho^4\cos 4\theta$ |
15 | 4 | 4 | $\sqrt{10}\,\rho^4\sin 4\theta$ |
16 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho)\cos\theta$ | Secondary x-coma
18 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3)\cos 3\theta$ |
19 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3)\sin 3\theta$ |
20 | 5 | 5 | $\sqrt{12}\,\rho^5\cos 5\theta$ |
21 | 5 | 5 | $\sqrt{12}\,\rho^5\sin 5\theta$ |
22 | 6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical
23 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2)\sin 2\theta$ | 45° tertiary astigmatism
25 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4)\sin 4\theta$ |
26 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4)\cos 4\theta$ |
27 | 6 | 6 | $\sqrt{14}\,\rho^6\sin 6\theta$ |
28 | 6 | 6 | $\sqrt{14}\,\rho^6\cos 6\theta$ |
29 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho)\sin\theta$ | Tertiary y-coma
33 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5)\sin 5\theta$ |
34 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5)\cos 5\theta$ |
35 | 7 | 7 | $4\,\rho^7\sin 7\theta$ |
36 | 7 | 7 | $4\,\rho^7\cos 7\theta$ |
37 | 8 | 0 | $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$ | Tertiary spherical
38 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2)\cos 2\theta$ | 0° quaternary astigmatism
42 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6)\cos 6\theta$ |
43 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6)\sin 6\theta$ |
44 | 8 | 8 | $\sqrt{18}\,\rho^8\cos 8\theta$ |
45 | 8 | 8 | $\sqrt{18}\,\rho^8\sin 8\theta$ |
*The words “orthonormal Zernike circle” are to be associated with these names, e.g.,
orthonormal Zernike circle 0° primary astigmatism.
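As in the case of polynomials in optical design, the mean and the mean square values
of the aberration function are given by
\[ \langle W(\rho, \theta)\rangle = a_1 \tag{8-85} \]
and
\[ \langle W^2(\rho, \theta)\rangle = \sum_{j=1}^{J} a_j^2 , \tag{8-86} \]
\[ \sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \sum_{j=1}^{J} a_j^2 - a_1^2 , \tag{8-87} \]
or
\[ \sigma_W^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-88} \]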
It should be evident that the P-V numbers of two polynomials with the same values
of n and m are the same. Moreover, the P-V numbers of polynomials with the same value
of n but different values of m, except m = 0, are also the same. The P-V numbers of a
polynomial representing the fabrication errors give a measure of the depth of material to
be removed in the fabrication process.
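Polynomial | P-V number | Polynomial | P-V number | Polynomial | P-V number
$Z_2$ | 4 | $Z_{17}$ | $2\sqrt{12} = 6.928$ | $Z_{32}$ | 8
$Z_3$ | 4 | $Z_{18}$ | $2\sqrt{12} = 6.928$ | $Z_{33}$ | 8
$Z_5$ | $2\sqrt{6} = 4.899$ | $Z_{20}$ | $2\sqrt{12} = 6.928$ | $Z_{35}$ | 8
$Z_6$ | $2\sqrt{6} = 4.899$ | $Z_{21}$ | $2\sqrt{12} = 6.928$ | $Z_{36}$ | 8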
Each fringe represents a contour of constant phase or aberration. The fringe is dark
when the phase is an odd multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In
the case of tilts, for example, the aberration changes by one wave four times, which is the
same as the P-V number of four waves. Thus, four straight-line fringes symmetric about
the center are obtained. The x-tilt polynomial $Z_2$ yields vertical fringes, and the y-tilt
polynomial $Z_3$ yields horizontal fringes. Similarly, defocus aberration $Z_4$ yields about
3.5 fringes. In the case of spherical aberration $Z_{11}$, the aberration starts at a value of $\sqrt{5}$
waves, decreases to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then increases
to $\sqrt{5}$ waves. Thus, the total number of times the aberration changes by unity is equal to
6.7, and approximately seven circular fringes are obtained. The interferograms of the
Zernike primary aberrations are shown in Figure 8-9. The corresponding diffraction PSFs
are included for completeness.
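The fringe counts quoted above for defocus and spherical aberration can be reproduced by summing the absolute changes of each polynomial along a pupil radius, for a coefficient of one wave. The sketch below samples the radius on a uniform grid; the sample count and the function names are arbitrary choices for illustration.

```python
import math

def z4(rho: float) -> float:
    """Orthonormal Zernike defocus, sqrt(3) (2 rho^2 - 1)."""
    return math.sqrt(3) * (2 * rho ** 2 - 1)

def z11(rho: float) -> float:
    """Orthonormal Zernike primary spherical, sqrt(5) (6 rho^4 - 6 rho^2 + 1)."""
    return math.sqrt(5) * (6 * rho ** 4 - 6 * rho ** 2 + 1)

def total_variation(f, n_samples: int = 10001) -> float:
    """Sum of absolute changes of f(rho) as rho runs from 0 to 1."""
    values = [f(i / (n_samples - 1)) for i in range(n_samples)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

fringes_defocus = total_variation(z4)      # monotonic, so equal to the P-V number
fringes_spherical = total_variation(z11)   # falls to -sqrt(5)/2, then rises again
print(f"defocus: {fringes_defocus:.2f} waves, spherical: {fringes_spherical:.2f} waves")
```

For defocus the total change equals the P-V number $2\sqrt{3}$, whereas for spherical aberration it is $3\sqrt{5}$, which is larger than the P-V number because the polynomial is not monotonic.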
Figure 8-9. Zernike circle polynomials shown as isometric plots on the top,
interferograms on the left, and diffraction PSFs on the right for a sigma value of one
wave.
The Seidel aberrations are well known in optical design, where the optical system
has an axis of rotational symmetry with the consequence that the angle-dependent terms
are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in
optical testing generally contain both the cosine and sine terms due to the assembly and
fabrication errors. We show how to define the effective Seidel coefficients in such cases.
We emphasize that the Seidel aberration coefficients determined from the primary
Zernike aberrations will be in error unless the higher-order terms that also contain Seidel
terms are negligible [5].
represents a tilt of the wavefront about the y axis by an angle $4(\lambda/D)a_2$, where the
aberration coefficient is in units of wavelength. It results in a displacement of the PSF
along the x axis by $4\lambda Fa_2$, where $F$ is the focal ratio of the image-forming light cone.
Similarly, the Zernike tilt aberration
represents a tilt of the wavefront about the x axis by an angle $4(\lambda/D)a_3$ and results in a
displacement of the PSF along the y axis by $4\lambda Fa_3$. The P-V number of the aberration is
$4a_2$.
It should be evident that when the cosine and sine terms of a certain aberration are
present simultaneously, as in optical testing, their combination represents the aberration
whose orientation depends on the value of the component terms. For example, if both x
and y Zernike tilts are present in the form
it can be written
\[ W(\rho, \theta) = 2\left(a_2^2 + a_3^2\right)^{1/2}\rho\cos\left[\theta - \tan^{-1}(a_3/a_2)\right] . \tag{8-92} \]
to decide the sign of the overall tilt and the value of its angle are discussed in the
Appendix. It is evident from Eq. (8-92) that if $b_j = 0$, then the angle of orientation is
zero, indicating a $\cos m\theta$ polynomial. Similarly, if $a_j = 0$, then the angle is $\pi/2m$,
indicating a $\sin m\theta$ polynomial. The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the
Seidel distortion in its $(\rho, \theta)$ dependence (see Tables 8-1 and 8-2). It displaces the image
along the x axis by $4\lambda Fa_2$.
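The equivalence of the two-term and single-term forms of the combined tilt in Eq. (8-92) can be checked numerically. In this sketch the x and y tilt polynomials are taken as $2\rho\cos\theta$ and $2\rho\sin\theta$, which follows from the normalization in Eq. (8-62). The use of `math.atan2` to select the quadrant of the angle is an assumption of this sketch and is not necessarily the rule given in the Appendix.

```python
import math

def combined_tilt(a2: float, a3: float):
    """Amplitude and orientation angle of a2*Z2 + a3*Z3, as in Eq. (8-92).

    atan2 is used (an assumption of this sketch) so that the quadrant of
    the angle follows the signs of a2 and a3.
    """
    amplitude = 2.0 * math.hypot(a2, a3)
    angle = math.atan2(a3, a2)
    return amplitude, angle

def tilt_sum(a2: float, a3: float, rho: float, theta: float) -> float:
    """Direct sum of the x and y Zernike tilts at the pupil point (rho, theta)."""
    return 2.0 * rho * (a2 * math.cos(theta) + a3 * math.sin(theta))

a2, a3 = 0.3, -0.4                 # illustrative coefficients, in waves
amplitude, angle = combined_tilt(a2, a3)

rho, theta = 0.7, 1.1              # arbitrary test point in the pupil
single_term = amplitude * rho * math.cos(theta - angle)
print(single_term, tilt_sum(a2, a3, rho, theta))
```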
\[ a_4Z_4(\rho, \theta) = a_4\sqrt{3}\left(2\rho^2 - 1\right) , \tag{8-93} \]
where $a_4$ is its sigma value. It varies with $\rho$ as $\rho^2$, just as the Seidel field curvature
varies with it. The constant term in $Z_4(\rho)$ results in its mean value across the circular
pupil to be zero, without changing its standard deviation. The P-V number for the
aberration is $2\sqrt{3}\,a_4$. Because the aberration is radially symmetric, so are the PSF and the
spot diagram.
8.9.4 Astigmatism
The Zernike primary astigmatism
can be written
\[ a_5Z_5(\rho, \theta) = \sqrt{6}\,a_5\rho^2\cos\left[2(\theta + \pi/4)\right] . \tag{8-96} \]
does not yield a line image in any plane. Because of its variation with $\theta$ as $\cos 2\theta$, it is
referred to as the 0° astigmatism in conformance with the corresponding primary
astigmatism. The name tertiary astigmatism in Table 8-5 can be explained similarly.
If both 0° and 45° astigmatisms are present so that the aberration function is
\[ W(\rho, \theta) = \left(a_5^2 + a_6^2\right)^{1/2}\sqrt{6}\,\rho^2\cos\left\{2\left[\theta - (1/2)\tan^{-1}(a_5/a_6)\right]\right\} , \tag{8-99} \]
\[ = a_6\sqrt{6}\left(2\rho^2\cos^2\theta - \rho^2\right) \tag{8-100b} \]
\[ = a_6\sqrt{6}\left(-2\rho^2\sin^2\theta + \rho^2\right) . \tag{8-100c} \]
8.9.5 Coma
The Zernike aberration terms $a_8Z_8(\rho, \theta)$ and $a_7Z_7(\rho, \theta)$ are called the x and y
Zernike comas. They represent classical coma $\rho^3\cos\theta$ or $\rho^3\sin\theta$ balanced with tilt
$\rho\cos\theta$ or $\rho\sin\theta$, respectively, to yield minimum variance (see Problem 8.5). The names
for the secondary and tertiary comas can be explained similarly.
When both x- and y-Zernike comas are present, the aberration may be written
\[ W(\rho, \theta) = \sqrt{8}\,a_8\left(3\rho^3 - 2\rho\right)\cos\theta + \sqrt{8}\,a_7\left(3\rho^3 - 2\rho\right)\sin\theta , \tag{8-101b} \]
or
\[ W(\rho, \theta) = \left(a_7^2 + a_8^2\right)^{1/2}\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\left[\theta - \tan^{-1}(a_7/a_8)\right] , \tag{8-101c} \]
8.9 Relationship between Zernike Polynomials and Classical Aberrations 355
Zernike comas also contain Seidel coma. Thus, it is only if the higher-order Zernike
comas are zero or negligible that the PSF aberrated by primary Zernike coma will be
symmetric about a line making an angle of $\tan^{-1}(a_7/a_8)$ with the x axis. Similarly, it is
only if the secondary and tertiary astigmatisms are zero or negligible that the Seidel
astigmatism is $2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2}$, as in Eq. (8-99).
where
\[ a_{22} = \frac{1}{20\sqrt{7}} , \quad a_{11} = \frac{1}{4\sqrt{5}} , \quad a_4 = \frac{9}{20\sqrt{3}} , \quad a_1 = \frac{1}{4} . \tag{8-103} \]
If we infer the Seidel spherical aberration from only the primary Zernike aberration
$a_{11}Z_{11}(\rho)$, its amount would be 1.5 waves. Such a conclusion is obviously incorrect,
because in reality the amount of Seidel spherical aberration is zero. Indeed, even if we
expand the aberration function up to as many as the first 21 terms, we will still
incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves.
However, the Seidel spherical aberration will correctly reduce to zero when at least the
first 22 terms are included in the expansion. For an off-axis image, there are
angle-dependent aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Thus, it is
important that the expansion be carried out up to a certain number of terms such that any
additional terms do not significantly change the mean square difference between the
function and its estimate. Otherwise, the inferred Seidel aberrations will be erroneous [5].
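The coefficients of Eq. (8-103) and the 1.5-wave figure can be checked with a short sketch. It assumes, consistently with those coefficients, that the aberration being expanded is $W(\rho) = \rho^6$; the function names are illustrative only. The radial polynomials are the standard $R_n^0(\rho)$, and exact fractions are used so that the cancellation is not hidden by rounding.

```python
from fractions import Fraction
from math import sqrt

# Radial Zernike polynomials R_n^0(rho) as {power: coefficient}, keyed by the
# single index j. For these m = 0 terms, Z_j = sqrt(n + 1) * R_n^0.
RADIAL = {
    1: (0, {0: 1}),
    4: (2, {2: 2, 0: -1}),
    11: (4, {4: 6, 2: -6, 0: 1}),
    22: (6, {6: 20, 4: -30, 2: 12, 0: -1}),
}

def projection(coeffs, power=6):
    """Return 2 * integral_0^1 rho**power * R(rho) * rho d(rho), exactly."""
    return 2 * sum(Fraction(c, p + power + 2) for p, c in coeffs.items())

def zernike_coefficient(j):
    """Orthonormal expansion coefficient a_j of W = rho**6 (assumed example)."""
    n, coeffs = RADIAL[j]
    return sqrt(n + 1) * float(projection(coeffs))

def inferred_seidel_spherical(max_j):
    """Coefficient of rho**4 collected from all terms with index <= max_j."""
    total = Fraction(0)
    for j, (n, coeffs) in RADIAL.items():
        if j <= max_j:
            # a_j * Z_j = (n + 1) * projection * R, so no square roots are needed
            total += (n + 1) * projection(coeffs) * coeffs.get(4, 0)
    return total

print(inferred_seidel_spherical(21))  # terms through Z_11 only
print(inferred_seidel_spherical(22))  # Z_22 included
```

With the expansion truncated before $Z_{22}$, the collected $\rho^4$ coefficient is 3/2; once $Z_{22}$ is included, it cancels exactly.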
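where $A_p$ is the piston aberration, other coefficients $A_i$ represent the peak value of the
corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel
aberration. They are given by
\[ A_p = a_1 - \sqrt{3}\,a_4 + \sqrt{5}\,a_{11} , \tag{8-105a} \]
\[ A_t = 2\left[\left(a_2 - \sqrt{8}\,a_8\right)^2 + \left(a_3 - \sqrt{8}\,a_7\right)^2\right]^{1/2} , \quad \beta_t = \tan^{-1}\left(\frac{a_3 - \sqrt{8}\,a_7}{a_2 - \sqrt{8}\,a_8}\right) , \tag{8-105b} \]
\[ A_d = 2\left(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11} - A_a\right) , \tag{8-105c} \]
\[ A_a = 2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2} , \quad \beta_a = \frac{1}{2}\tan^{-1}(a_5/a_6) , \tag{8-105d} \]
\[ A_c = 6\sqrt{2}\left(a_7^2 + a_8^2\right)^{1/2} , \quad \beta_c = \tan^{-1}(a_7/a_8) , \tag{8-105e} \]
and
\[ A_s = 6\sqrt{5}\,a_{11} . \tag{8-105f} \]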
As a note of caution, we add that the approximation of Eq. (8-104a) is good only when
the higher-order Zernike aberrations that also contain Seidel aberration terms are
negligible.
object and pupil points, which become the building blocks of its aberration function for a
certain point object. The six invariants reduce to three “rotational” invariants for a
rotationally symmetric system, or equivalently for an infinite number of symmetry
planes.
In this section, we briefly discuss the power series expansion of the aberration
function in terms of the six reflection invariants and define the classical aberrations of an
anamorphic system. The balanced aberrations for a rectangular pupil are represented by
the products of the Legendre polynomials, one for each of the two dimensions of the
rectangular pupil [6]. The compound Legendre polynomials are orthogonal across a
rectangular pupil and, like the classical aberrations, are inherently separable in the
Cartesian coordinates of the pupil point. They are different from the orthogonal
polynomials representing the balanced aberrations for a system with rotational symmetry
but a rectangular pupil. Although products of Chebyshev polynomials, one for the x axis
and the other for the y axis, are also orthogonal over a rectangular pupil [7], they are not
suitable for anamorphic systems, because they do not represent balanced aberrations for
such systems.
Because of the symmetry of the system about the orthogonal planes zx and yz (see
Figure 2-48), the aberration function, which depends on both $(p, q)$ and $(x, y)$
coordinates, consists of products of positive integral powers of six reflection invariants:
\[ p^2 ,\; x^2 ,\; px ,\; q^2 ,\; y^2 ,\; \text{and}\; qy . \tag{8-106} \]
The first three are symmetric about the yz plane, and the other three are symmetric about
the zx plane. A power-series expansion of the aberration function can be written
\[ W(p, q; x, y) = \sum_{i,j,k,l,m,n} C_{i,j,k,l,m,n}\,(p^2)^i (q^2)^j (x^2)^k (y^2)^l (px)^m (qy)^n , \tag{8-107} \]
where i, j, k, l, m, and n are positive integers including zero, and $C_{i,j,k,l,m,n}$ is the
coefficient of the aberration term that has a degree in the object and pupil coordinates
given by
It is evident that the degree of an aberration term is even, and thus the aberration function
consists of aberrations of even orders only. The zero-degree term must be zero, as it
represents the aberration of the chief ray, which is zero by its definition as the reference
ray. There are six terms of second degree, namely the reflection invariants multiplied
with their respective coefficients. Two of these terms, namely those in $p^2$ and $q^2$, are
piston terms, i.e., they are independent of the pupil coordinates, and can generally be
ignored. Among the other four, those in $px$ and $qy$ represent lateral deviations of the
image point from the Gaussian image point, and those in $x^2$ and $y^2$ represent
longitudinal deviations. Because our aberration function is defined with respect to the
Gaussian image point, these four terms must be zero. It is clear that the aberration terms
are separable in the Cartesian coordinates $(x, y)$ of a pupil point.
There are 21 terms of the fourth degree, of which three are piston terms and two are
equal to another two. Thus, we are left with 16 terms that depend on the pupil
coordinates. They are called the primary aberrations of an anamorphic system, compared
to only five for a rotationally symmetric system. The primary aberration function can be
written
\[ W(p, q; x, y) = \left(C_1 p^3 + C_2 pq^2\right)x + \left(C_3 p^2 q + C_4 q^3\right)y + \left(C_5 p^2 + C_6 q^2\right)x^2 + C_7\,pqxy + \left(C_8 p^2 + C_9 q^2\right)y^2 + C_{10}\,pxy^2 + C_{11}\,qyx^2 + C_{12}\,px^3 + \cdots \]
where we have expressed the aberration coefficients in a simplified form with one
subscript for convenience. For a rotationally symmetric system, the six reflection
invariants reduce to three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$,
and the 16 primary aberrations reduce to five. If $\vec{h}$ and $\vec{r}$ are the position vectors of the
object and pupil points, then the rotational invariants are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, and $\vec{h}\cdot\vec{r}$, or $h^2$, $r^2$, and
$hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and $\theta$ is the polar angle of $\vec{r}$ with respect to that of $\vec{h}$.
In conformance with the aberrations of a rotationally symmetric system, the linear terms
in x and y are the distortion aberrations; the quadratic terms may be referred to as the field
curvature, defocus, or astigmatism; the cubic terms are comas; and the quaternary terms
are the spherical aberrations. It is easy to see that an anamorphic system has three primary
aberrations for an axial point object, compared to only one for a rotationally symmetric
system.
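The term counts quoted above (6 second-degree terms, 21 fourth-degree products, 3 piston terms, 2 repeats, 16 primary aberrations, and 3 for an axial point object) can be reproduced by brute-force enumeration. The sketch below is only a counting aid; the dictionary and function names are illustrative and not part of the text.

```python
from itertools import combinations_with_replacement

# Exponents of (p, q, x, y) for the six reflection invariants of Eq. (8-106)
INVARIANTS = {
    "p^2": (2, 0, 0, 0), "x^2": (0, 0, 2, 0), "px": (1, 0, 1, 0),
    "q^2": (0, 2, 0, 0), "y^2": (0, 0, 0, 2), "qy": (0, 1, 0, 1),
}

def count_terms(n_factors):
    """Count products of n_factors invariants (degree 2 * n_factors)."""
    products = [
        tuple(sum(e) for e in zip(*combo))
        for combo in combinations_with_replacement(INVARIANTS.values(), n_factors)
    ]
    distinct = set(products)
    # Piston terms do not depend on the pupil coordinates (x, y)
    piston = {m for m in distinct if m[2] == 0 and m[3] == 0}
    # Terms that survive for an axial point object (p = q = 0)
    axial = {m for m in distinct - piston if m[0] == 0 and m[1] == 0}
    return {
        "products": len(products),
        "repeated": len(products) - len(distinct),
        "piston": len(piston),
        "pupil_dependent": len(distinct - piston),
        "axial": len(axial),
    }

print(count_terms(1))  # second-degree terms
print(count_terms(2))  # fourth-degree (primary) terms
```

The two repeated fourth-degree products are $p^2x^2 = (px)^2$ and $q^2y^2 = (qy)^2$, which is why 21 products yield only 19 distinct monomials.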
\[ Q_j(x, y) = L_l(x)\,L_m(y) , \tag{8-110} \]
where j is a polynomial ordering index starting with j = 1, and l and m are positive
integers (including zero). It is evident that these polynomials are inherently separable in
the Cartesian pupil coordinates x and y. This is different from the Zernike circle
polynomials, which are orthogonal over a unit circle, but separable in polar coordinates
$(\rho, \theta)$, where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The order n of a polynomial representing its
degree in the pupil coordinates is given by n = l + m. As in the case of Zernike circle
polynomials, the number of polynomials with a certain order n is n + 1. The number of
polynomials through a certain order n is given by
\[ N_n = (n + 1)(n + 2)/2 . \tag{8-111} \]
\[ Q_1(x, y) = L_0(x)\,L_0(y) = 1 . \tag{8-112} \]
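Table 8-8. Orthonormal Legendre polynomials $L_n(x)$.
n    $L_n(x)$
0    $1$
1    $\sqrt{3}\,x$
2    $(\sqrt{5}/2)(3x^2 - 1)$
3    $(\sqrt{7}/2)(5x^3 - 3x)$
4    $(3/8)(35x^4 - 30x^2 + 3)$
5    $(\sqrt{11}/8)(63x^5 - 70x^3 + 15x)$
6    $(\sqrt{13}/16)(231x^6 - 315x^4 + 105x^2 - 5)$
7    $(\sqrt{15}/16)(429x^7 - 693x^5 + 315x^3 - 35x)$
8    $(\sqrt{17}/128)(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$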
\[ \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} Q_j(x, y)\,Q_{j'}(x, y)\,dx\,dy = \delta_{jj'} . \tag{8-113} \]
The rectangular Q-polynomials up to and including the eighth order are listed in
Table 8-9 as products of the Legendre polynomials, along with the names associated with
some of them. Their explicit form can be obtained by using the expressions of the
orthonormal Legendre polynomials given in Table 8-8. Note that for each polynomial
$L_l(x)L_m(y)$, there is a corresponding polynomial $L_m(x)L_l(y)$. These polynomials are
evidently different from those for a rotationally symmetric system with a rectangular
pupil. The rectangular polynomials given in Section 9.4 for such a system are not
separable in the Cartesian coordinates (x, y) of a pupil point.
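The orthonormality of Eq. (8-113) follows from the one-dimensional relation for the Legendre factors, and this can be checked exactly. The sketch below assumes the standard normalization $L_n = \sqrt{2n+1}\,P_n$, with $P_n$ built from Bonnet's recurrence. The helper `q_index` encodes an ordering rule inferred from Table 8-9 (order $n = l + m$ first, then increasing m); it is a hypothetical convenience, not a formula given in the text.

```python
from fractions import Fraction
from math import sqrt

def legendre(n):
    """Coefficients (lowest power first) of P_n via Bonnet's recurrence."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        xp = [Fraction(0)] + p
        prev = p_prev + [Fraction(0)] * (len(xp) - len(p_prev))
        p_prev, p = p, [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(xp, prev)]
    return p

def mean_product(a, b):
    """(1/2) * integral over [-1, 1] of a(x) * b(x) dx, computed exactly."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if (i + j) % 2 == 0:          # odd powers integrate to zero
                total += ai * bj / (i + j + 1)
    return total

def q_inner(l1, m1, l2, m2):
    """(1/4) * double integral of L_l1(x) L_m1(y) L_l2(x) L_m2(y) over the square."""
    norm = sqrt((2 * l1 + 1) * (2 * m1 + 1) * (2 * l2 + 1) * (2 * m2 + 1))
    separable = mean_product(legendre(l1), legendre(l2)) * mean_product(legendre(m1), legendre(m2))
    return norm * float(separable)

def q_index(l, m):
    """Ordering index j of L_l(x) L_m(y), as inferred from Table 8-9."""
    n = l + m
    return n * (n + 1) // 2 + m + 1

print(q_inner(4, 3, 4, 3), q_inner(4, 3, 3, 4), q_index(4, 3))
```

Because the integrand separates in x and y, the two-dimensional check reduces to a product of two one-dimensional ones, which is the practical advantage of the Legendre products for a rectangular pupil.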
\[ Q_{32}(x, y) = L_4(x)\,L_3(y) . \tag{8-114} \]
It should be evident that the polynomials for a square pupil can be obtained from
those for a rectangular pupil by letting a = b , i.e., by using the same scale for the x and y
axes. Products of Chebyshev polynomials (one for the x, and the other for the y axis),
which are also orthogonal over a rectangular or a square pupil, have been suggested for
the analysis of rectangular wavefronts [7]. However, they are not suitable for anamorphic
systems because they do not represent balanced aberrations for such systems.
Table 8-9. Orthonormal rectangular polynomials $Q_j(x, y)$ as products of Legendre polynomials.
n    Polynomial                     Aberration name
0    $Q_1 = L_0(x)L_0(y)$           Piston
1    $Q_2 = L_1(x)L_0(y)$           x-tilt
1    $Q_3 = L_0(x)L_1(y)$           y-tilt
2    $Q_4 = L_2(x)L_0(y)$           x-defocus
2    $Q_5 = L_1(x)L_1(y)$
2    $Q_6 = L_0(x)L_2(y)$           y-defocus
3    $Q_7 = L_3(x)L_0(y)$           x-primary coma
3    $Q_8 = L_2(x)L_1(y)$
3    $Q_9 = L_1(x)L_2(y)$
4    $Q_{12} = L_3(x)L_1(y)$
4    $Q_{13} = L_2(x)L_2(y)$
4    $Q_{14} = L_1(x)L_3(y)$
5    $Q_{17} = L_4(x)L_1(y)$
5    $Q_{18} = L_3(x)L_2(y)$
5    $Q_{19} = L_2(x)L_3(y)$
5    $Q_{20} = L_1(x)L_4(y)$
6    $Q_{23} = L_5(x)L_1(y)$
6    $Q_{24} = L_4(x)L_2(y)$
6    $Q_{25} = L_3(x)L_3(y)$
6    $Q_{26} = L_2(x)L_4(y)$
6    $Q_{27} = L_1(x)L_5(y)$
7    $Q_{30} = L_6(x)L_1(y)$
7    $Q_{31} = L_5(x)L_2(y)$
7    $Q_{32} = L_4(x)L_3(y)$
7    $Q_{33} = L_3(x)L_4(y)$
7    $Q_{34} = L_2(x)L_5(y)$
7    $Q_{35} = L_1(x)L_6(y)$
8    $Q_{38} = L_7(x)L_1(y)$
8    $Q_{39} = L_6(x)L_2(y)$
8    $Q_{40} = L_5(x)L_3(y)$
8    $Q_{41} = L_4(x)L_4(y)$
8    $Q_{42} = L_3(x)L_5(y)$
8    $Q_{43} = L_2(x)L_6(y)$
8    $Q_{44} = L_1(x)L_7(y)$
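\[ a_j = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} W(x, y)\,Q_j(x, y)\,dx\,dy . \tag{8-116} \]
\[ \langle W(x, y)\rangle = a_1 . \tag{8-117} \]
\[ \sigma_W^2 = \left\langle [W(x, y)]^2\right\rangle - \langle W(x, y)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-119} \]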
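Equations (8-116) through (8-119) can be illustrated with a small exact calculation. The sketch below uses a made-up aberration, $W = x^2 + y^2$ over the square pupil $|x|, |y| \le 1$, chosen only because its nonzero coefficients are $a_1$, $a_4$, and $a_6$; the helper names are illustrative.

```python
from fractions import Fraction

def mean(poly):
    """Mean of a polynomial {(i, j): c} in x**i * y**j over the square pupil."""
    return sum(c * Fraction(1, (i + 1) * (j + 1))
               for (i, j), c in poly.items() if i % 2 == 0 and j % 2 == 0)

def multiply(a, b):
    """Product of two polynomials stored as {(i, j): c}."""
    out = {}
    for (i, j), ca in a.items():
        for (k, l), cb in b.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + ca * cb
    return out

W = {(2, 0): Fraction(1), (0, 2): Fraction(1)}   # assumed example: W = x^2 + y^2

# Unnormalized products P_l(x) P_m(y) with the squared norm factor (2l + 1)(2m + 1)
TERMS = {
    "Q1": ({(0, 0): Fraction(1)}, 1),
    "Q4": ({(2, 0): Fraction(3, 2), (0, 0): Fraction(-1, 2)}, 5),
    "Q6": ({(0, 2): Fraction(3, 2), (0, 0): Fraction(-1, 2)}, 5),
}

def coefficient_squared(name):
    """a_j**2 from Eq. (8-116), kept exact by squaring before taking roots."""
    poly, norm_sq = TERMS[name]
    return norm_sq * mean(multiply(W, poly)) ** 2

variance_direct = mean(multiply(W, W)) - mean(W) ** 2
variance_from_coefficients = coefficient_squared("Q4") + coefficient_squared("Q6")
print(mean(W), variance_direct, variance_from_coefficients)
```

The mean equals $a_1$, as in Eq. (8-117), and the variance obtained directly agrees with the sum of the squared coefficients with $j \ge 2$, as in Eq. (8-119).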
8.11.2 Interferograms
There are a variety of interferometers that are used to detect and measure aberrations
of optical systems [8]. Figure 8-11 schematically illustrates a Twyman–Green
interferometer in which a collimated laser beam is divided into two parts by a beam
splitter BS. One part, called the test beam, is incident on the system under test, indicated
by the lens L, and the other, called the reference beam, is incident on a plane mirror M1.
The focus F of the lens system lies at the center of curvature C of a spherical mirror M2.
As the angle of the incident light is changed to study the off-axis aberrations of the
system, the mirror is tilted so that its center of curvature lies at the current focus of the
beam. In this arrangement the mirror does not introduce any aberration because it is
forming the image of an object lying at its center of curvature. The two reflected beams
interfere in the region of their overlap. The lens L′ is used to observe the interference
pattern on a screen S. A record of the interference pattern is called an interferogram. Note
that because the test beam goes through the lens system L twice, its aberration is twice
that of the system.
[Figure panels: Defocus, ρ²; Spherical, ρ⁴; Balanced spherical, ρ⁴ − ρ²; Coma, ρ³ cos θ; Balanced coma, (ρ³ − (2/3)ρ) cos θ; Astigmatism, ρ² cos²θ; Balanced astigmatism, ρ² cos²θ − (1/2)ρ².]
[Figure 8-11: Twyman–Green interferometer. Labels: beam splitter BS, plane mirror M1, lens under test L, spherical mirror M2, observing lens L′.]
If the reference beam has uniform phase and the test beam has a phase distribution
$\Phi(x, y)$, and if their amplitudes are equal to each other, the irradiance distribution of their
interference pattern is given by
\[ I(x, y) = I_0\left|1 + \exp[i\Phi(x, y)]\right|^2 = 2I_0\left\{1 + \cos[\Phi(x, y)]\right\} , \tag{8-121} \]
where $I_0$ is the irradiance when only one beam is present. The irradiance has a maximum
value equal to $4I_0$ at those points for which
\[ \Phi(x, y) = 2\pi n \tag{8-122a} \]
and a minimum value equal to zero at those points for which
\[ \Phi(x, y) = 2\pi(n + 1/2) , \tag{8-122b} \]
where n is a positive or negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which, in turn, corresponds to the locus of $(x, y)$
points with the phase aberration given by Eq. (8-122a) for a bright fringe and Eq. (8-122b)
for a dark fringe. If the test beam is aberration free [$\Phi(x, y) = 0$], then the
interference pattern has a uniform irradiance of $4I_0$.
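A minimal sketch of Eqs. (8-121) and (8-122a) follows. It assumes a pure defocus phase $\Phi(\rho) = 2\pi B\rho^2$ across a unit pupil, with the peak value B expressed in waves; the function names are illustrative.

```python
import math

def irradiance(phase, i0=1.0):
    """Two-beam irradiance of Eq. (8-121) for equal-amplitude beams."""
    return 2.0 * i0 * (1.0 + math.cos(phase))

def bright_fringe_radii(peak_waves):
    """Normalized radii of bright fringes for the phase 2*pi*peak_waves*rho**2.

    Bright fringes satisfy Eq. (8-122a): peak_waves * rho**2 = n, with n = 0, 1, ...
    peak_waves is assumed to be positive.
    """
    orders = range(0, math.floor(peak_waves) + 1)
    return [math.sqrt(n / peak_waves) for n in orders]

# 3 waves of defocus in the system gives 6 waves in the double-pass test beam
radii = bright_fringe_radii(6.0)
print(len(radii), [round(r, 3) for r in radii])
print(irradiance(0.0), irradiance(math.pi))
```

The fringes crowd together toward the edge of the pupil because the radii scale as $\sqrt{n}$, which is the nonuniform spacing characteristic of defocus.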
Figure 8-12 shows interferograms when the lens system L under test suffers from
$3\lambda$ of a primary aberration, corresponding to $6\lambda$ of an aberration of the interfering test
beam. In our discussion, we give the value of an aberration coefficient in wavelength
units, rather than in radians, as is customary in optics. For defocus and spherical
aberration, the interference pattern consists of concentric circular interference fringes.
The fringe spacing depends on the type of aberration. Figure 8-12a shows the
interferogram obtained when the system is aberration free but is misfocused, i.e., when its
focus F lies to the left or right of the center of curvature C of the spherical mirror M2 by
an amount corresponding to $3\lambda$ of the defocus aberration. [See Eqs. (8-14) for the
relationship between the longitudinal defocus, i.e., the axial spacing between F and C,
and the peak defocus aberration $B_d$, which is $3\lambda$ in our example.]
Figure 8-12b shows the interferograms obtained when the system has $3\lambda$ of
spherical aberration (i.e., $A_s = 3\lambda$) and a certain amount of defocus. The case $B_d = 0$
(i.e., coincident F and C) represents such a system with an image of a certain object being
observed in its Gaussian or paraxial image plane. Similarly, the interferogram obtained
for $B_d/A_s = -2$ represents the system when the image is observed in its marginal image
plane. For a system with a positive spherical aberration, its marginal focus lies farther
from it than its paraxial focus (see Figure 8-11). Thus, this interferogram is obtained
when points F and C are separated from each other axially, according to Eq. (8-10b), by
$48\lambda F^2$, i.e., when F lies to the left of C by $48\lambda F^2$. The other two interferograms,
$B_d = -A_s$ and $B_d = -1.5A_s$, represent the system when the image is observed in the
minimum-aberration-variance plane and the circle-of-least-confusion plane, respectively.
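The axial separations implied by these defocus values can be tabulated with a short sketch. It assumes the approximate relation $B_d \simeq -n_i\Delta_R/8F^2$ quoted in the chapter summary, Eq. (8-126b), with $n_i = 1$, and expresses lengths in units of the wavelength; the names are illustrative.

```python
def longitudinal_defocus(bd_waves, focal_ratio, wavelength=1.0, n_image=1.0):
    """Invert B_d ~ -n_i * Delta_R / (8 F^2) to obtain Delta_R."""
    return -8.0 * focal_ratio ** 2 * bd_waves * wavelength / n_image

A_S = 3.0  # waves of spherical aberration, as in Figure 8-12b
PLANES = {
    "paraxial": 0.0,
    "minimum variance": -1.0,
    "least confusion": -1.5,
    "marginal": -2.0,
}

for name, ratio in PLANES.items():
    # Delta_R in units of wavelength * F^2 (focal_ratio set to 1 for that purpose)
    print(name, longitudinal_defocus(ratio * A_S, focal_ratio=1.0))
```

For the marginal image plane, $B_d = -2A_s = -6\lambda$, which reproduces the separation of $48\lambda F^2$ stated above.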
Figure 8-12c shows the interferograms obtained when light is incident at a certain
angle from the axis of the system so that it suffers from $3\lambda$ of coma. The fringes in this
case are cubic curves. The case $B_t = 0$ corresponds to two parallel interfering beams (F
and C are coincident in this case). The case $B_t = -2A_c/3$ represents the system
corresponding to a minimum aberration variance. A tilt aberration with a peak value of
$B_t$ may be obtained by transversally displacing C from F by $(-2FB_t, 0)$. It may also be
obtained by tilting the plane mirror M1 by an angle $B_t/a$, where a is the radius of the test
beam [see Eq. (8-15) and note the factors of 2 resulting from the reflection of the
reference beam by mirror M1 and the doubling of the system aberration in the test beam].
Figure 8-12d shows the interferograms obtained when the system suffers from $3\lambda$ of
astigmatism. When $B_d = 0$ or $-A_a$, representing the system with an image being
observed in a plane containing one or the other astigmatic focal line, respectively, we
obtain an interferogram with straight-line fringes because the aberration then depends on
either x or y (but not both). However, the fringe spacing is not uniform. When
$B_d = -A_a/2$, the fringe pattern consists of rectangular hyperbolas. If the system under
test is aberration free but the two interfering beams are tilted with respect to each other,
representing a wavefront tilt error, we obtain straight-line fringes that are uniformly
spaced. The fringe spacing is inversely proportional to the angle between the two beams.
Figure 8-13. Aberration introduced by atmospheric turbulence with $D/r_0 = 10$. Its
standard deviation is $0.4\lambda$. (a) Aberration shape. (b) Aberration interferogram.
(c) Interferogram with $25\lambda$ of tilt.
\[ (x_i, y_i) = \frac{R}{n_i}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right) , \tag{8-123} \]
where R is the radius of curvature of the reference sphere with respect to which the
aberration is defined, and $n_i$ is the refractive index of the image space. For a radial
aberration $W(r)$, the distance $r_i$ of the intersection point from the Gaussian image point
is given by
\[ r_i = \frac{R}{n_i}\frac{\partial W}{\partial r} . \tag{8-124} \]
The wave aberration of a ray is positive if it has to travel a longer optical path length,
compared to the chief ray, in order to reach the Gaussian reference sphere [1].
\[ W(\rho) = B_d\,\rho^2 , \tag{8-125} \]
where
\[ B_d = \frac{n_i}{2}\left(\frac{1}{z} - \frac{1}{R}\right)a^2 \tag{8-126a} \]
\[ \sim -\,n_i\Delta_R/8F^2 \quad \text{for}\; z \sim R \tag{8-126b} \]
is the peak defocus aberration. Here, $\rho = r/a$ is the normalized distance of a point in the
plane of the exit pupil, and $F = R/2a$ is the focal ratio of the image-forming light cone.
The quantity $\Delta_R$ is called the longitudinal defocus. The defocus wave aberration and the
longitudinal defocus have numerically opposite signs.
where
8.12 Summary of Results 371
\[ B_t = n_i\,a\,\beta \tag{8-127b} \]
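\[ W(\rho, \theta) = \begin{cases} A_s\,\rho^4 , & \text{Spherical} \\ A_c\,\rho^3\cos\theta , & \text{Coma} \\ A_a\,\rho^2\cos^2\theta , & \text{Astigmatism} \\ A_d\,\rho^2 , & \text{Field curvature} \\ A_t\,\rho\cos\theta , & \text{Distortion} , \end{cases} \tag{8-128} \]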
where Ai represents the peak value of an aberration and contains the dependence on the
object point location. The primary wave aberrations of a multisurface system are additive
in the sense that they can be obtained by adding the primary wave aberrations of the
surfaces, where the Gaussian image of a point object formed by one surface becomes the
point object for the next surface.
\[ S \sim \exp\left(-\sigma_\Phi^2\right) , \tag{8-129} \]
where $\sigma_\Phi^2$ is the variance of the phase aberration across the exit pupil of the imaging system.
The variance of an aberration can be reduced by balancing it with one or more aberrations
of the same and/or lower order, thereby increasing the Strehl ratio. The primary
aberrations with and without balancing are listed in Tables 8-3 and 8-4, respectively.
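Equation (8-129) is easy to evaluate in both directions. The sketch below assumes that the wave aberration sigma is given in waves, so that $\sigma_\Phi = 2\pi\sigma_W/\lambda$; the function names are illustrative.

```python
import math

def strehl_ratio(sigma_waves):
    """Approximate Strehl ratio of Eq. (8-129) for a sigma given in waves."""
    sigma_phase = 2.0 * math.pi * sigma_waves
    return math.exp(-sigma_phase ** 2)

def sigma_for_strehl(strehl):
    """Wave aberration sigma (in waves) that yields the requested Strehl ratio."""
    return math.sqrt(-math.log(strehl)) / (2.0 * math.pi)

print(round(sigma_for_strehl(0.8), 4))   # sigma in waves for S = 0.8
print(round(strehl_ratio(0.1), 3))       # Strehl ratio for sigma = 0.1 wave
```

Under this approximation, a Strehl ratio of 0.8 corresponds to a standard deviation of roughly 0.075 wave.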
\[ W(\rho, \theta) = \sum_{n=0}^{\infty}\sum_{m=0}^{n} c_{nm}\,Z_n^m(\rho, \theta) , \tag{8-130} \]
where c nm is an expansion coefficient, and n and m are positive integers, including zero,
such that n – m ≥ 0 and even. The radial and angular dependence of the polynomials is
given by
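\[ Z_n^m(\rho, \theta) = \left[\frac{2(n + 1)}{1 + \delta_{m0}}\right]^{1/2} R_n^m(\rho)\cos m\theta , \tag{8-131} \]
\[ \frac{1}{\pi}\int_0^1\int_0^{2\pi} Z_n^m(\rho, \theta)\,Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta = \delta_{nn'}\,\delta_{mm'} . \tag{8-132} \]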
The polynomials are ordered such that a polynomial with a lower value of n is ordered
first, and for a given value of n, a polynomial with a lower value of m is ordered first. The
polynomials through n = 8 and m = 0 are given in Table 8-5. The variance of the
aberration function is equal to the sum of the squares of the orthonormal expansion
coefficients $c_{nm}$, except $c_{00}$:
\[ \sigma_W^2 = \sum_{n=1}^{\infty}\sum_{m=0}^{n} c_{nm}^2 . \tag{8-133} \]
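\[ W(\rho, \theta) = \begin{cases} c_{40}\sqrt{5}\left(6\rho^4 - 6\rho^2 + 1\right) , & \text{Spherical} \\ c_{31}\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\theta , & \text{Coma} \\ c_{22}\sqrt{6}\,\rho^2\cos 2\theta , & \text{Astigmatism} \\ c_{20}\sqrt{3}\left(2\rho^2 - 1\right) , & \text{Field curvature} \\ c_{11}\,2\rho\cos\theta , & \text{Distortion} , \end{cases} \tag{8-134} \]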
where $c_{ij}$ is the aberration coefficient. The aberrations in this form are orthonormal over
a unit circular pupil, and the standard deviation of an aberration is given by $c_{ij}$. The
Zernike spherical aberration consists of Seidel spherical aberration and an equal and
opposite amount of defocus. The Zernike coma consists of Seidel coma and a tilt of −2/3
the amount of coma. The Zernike astigmatism consists of Seidel astigmatism and a defocus
of −1/2 the amount of astigmatism. An aberration balanced in this manner yields the
minimum standard deviation but not necessarily the minimum spot radius, as discussed in
Chapter 9.
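The statement that spherical aberration is balanced by an equal and opposite amount of defocus can be verified with exact arithmetic. The sketch below considers $\rho^4 + b\rho^2$ over a unit circular pupil and uses the disc average $\langle\rho^k\rangle = 2/(k + 2)$; the function names and the scan grid are illustrative choices.

```python
from fractions import Fraction

def disc_mean(power):
    """Mean of rho**power over the unit disc: 2 / (power + 2)."""
    return Fraction(2, power + 2)

def variance(b):
    """Variance of rho**4 + b * rho**2 over the unit disc."""
    mean_sq = disc_mean(8) + 2 * b * disc_mean(6) + b * b * disc_mean(4)
    mean = disc_mean(4) + b * disc_mean(2)
    return mean_sq - mean ** 2

# Scan defocus amounts from -2 to 0 in steps of 1/10 and pick the best one
candidates = [Fraction(k, 10) for k in range(-20, 1)]
best = min(candidates, key=variance)
print(best, variance(best), variance(Fraction(0)))
```

The variance falls from 4/45 without balancing to 1/180 at $b = -1$, i.e., the standard deviation drops by a factor of 4, consistent with $\rho^4 - \rho^2$ being proportional to the Zernike spherical polynomial apart from a constant.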
An even number is associated with a cosine polynomial, and an odd number with a sine
polynomial. The polynomials are orthonormal over a unit circular pupil according to
\[ \int_0^1\int_0^{2\pi} Z_j(\rho, \theta)\,Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta \bigg/ \int_0^1\int_0^{2\pi}\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{8-136} \]
The polynomials are ordered such that an even j corresponds to a symmetric polynomial
varying as $\cos m\theta$, and an odd j corresponds to an antisymmetric polynomial varying as
$\sin m\theta$. A polynomial with a lower value of n is ordered first, and for a given value of n, a
polynomial with a lower value of m is ordered first. The first 45 orthonormal Zernike
polynomials are listed in Table 8-6.
\[ \sigma_W^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-139} \]
The forms of the Seidel aberrations in terms of their dependence on the pupil coordinates
$(\rho, \theta)$, and the corresponding balanced and Zernike aberrations, are given in Table 8-10.
See the Appendix on how to combine cosine and sine aberration terms to obtain a Seidel
aberration at a certain angle.
Table 8-10. Seidel aberrations and the corresponding balanced aberrations and
Zernike polynomials
Seidel aberration                    Balanced aberration                        Zernike polynomial
Spherical, $\rho^4$                  $\rho^4 - \rho^2$                          $Z_4^0 = \sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$
Coma, $\rho^3\cos\theta$             $(\rho^3 - 2\rho/3)\cos\theta$             $Z_3^1 = \sqrt{8}\,(3\rho^3 - 2\rho)\cos\theta$
Astigmatism, $\rho^2\cos^2\theta$    $\rho^2\cos^2\theta - \rho^2/2$            $Z_2^2 = \sqrt{6}\,\rho^2\cos 2\theta$
Field curvature, $\rho^2$                                                       $Z_2^0 = \sqrt{3}\,(2\rho^2 - 1)$
Distortion, $\rho\cos\theta$                                                    $Z_1^1 = 2\rho\cos\theta$

The aberration function of an anamorphic system depends on the object and pupil
coordinates $(p, q)$ and $(x, y)$, respectively, through six reflection invariants $p^2$, $q^2$, $x^2$,
$y^2$, $px$, and $qy$, compared to three rotational invariants $p^2 + q^2$, $x^2 + y^2$, and $px + qy$
in the case of a rotationally symmetric system. Its aberration terms are separable in the
pupil coordinates. The degree of an aberration term is even, and the aberration function
accordingly consists of aberrations of even orders only. There are 16 primary aberrations
[see Eq. (8-109)], as opposed to only five for a rotationally symmetric system [see Eq. (2-16)].
If two Zernike polynomial aberrations with the same value of n and varying as
$\cos m\theta$ and $\sin m\theta$ are present simultaneously with sigma values $a_j$ and $b_j$, we can
write their sum in the form
\[ = \sqrt{2(n + 1)}\,R_n^m(\rho)\left(a_j\cos m\theta + b_j\sin m\theta\right) \]
\[ = \sqrt{2(n + 1)}\,R_n^m(\rho)\left(a_j^2 + b_j^2\right)^{1/2}\cos\left\{m\left[\theta - (1/m)\tan^{-1}(b_j/a_j)\right]\right\} . \tag{8A-1} \]
\[ \tan^{-1}(b_j/a_j) = \begin{cases} -\tan^{-1}\left(|b_j/a_j|\right) & \text{for positive } a_j \text{ and negative } b_j \qquad \text{(8A-2a)} \\ \pi - \tan^{-1}\left(|b_j/a_j|\right) & \text{for negative } a_j \text{ and positive } b_j . \qquad \text{(8A-2b)} \end{cases} \]
An alternative when $a_j$ is negative is to let the angle be $-\tan^{-1}\left(|b_j/a_j|\right)$, as when $a_j$ is
positive, but also replace $\left(a_j^2 + b_j^2\right)^{1/2}$ with $-\left(a_j^2 + b_j^2\right)^{1/2}$.
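The quadrant bookkeeping of the inverse tangent is what a two-argument arctangent handles automatically. The sketch below is one possible way to combine a cosine and a sine coefficient numerically; it uses `math.atan2` rather than the case-by-case rule of the text, and the function name is illustrative.

```python
import math

def combine(a, b, m):
    """Return (amplitude, angle) such that
    a*cos(m*t) + b*sin(m*t) == amplitude * cos(m * (t - angle)).
    """
    amplitude = math.hypot(a, b)
    angle = math.atan2(b, a) / m   # atan2 selects the correct quadrant
    return amplitude, angle

a, b, m = -1.0, 2.0, 2             # negative cosine and positive sine coefficients
amplitude, angle = combine(a, b, m)
t = 0.3
lhs = a * math.cos(m * t) + b * math.sin(m * t)
rhs = amplitude * math.cos(m * (t - angle))
print(amplitude, angle, lhs, rhs)
```

For a negative cosine coefficient and a positive sine coefficient, `atan2` returns $\pi$ minus the arctangent of the magnitude ratio, and for a positive cosine and negative sine coefficient it returns the negative of that arctangent, so the amplitude can always be kept positive.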
REFERENCES
1. As with the sign convention in Gaussian optics, different authors use different
sign conventions for the wave aberration associated with a ray. We have assigned
a positive sign to a ray that travels a longer optical path length compared to that of
the chief ray to reach the reference sphere. This convention is used, for example in
M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999). Some authors give it a negative sign; see, for example, W. T.
Welford, Aberrations of the Symmetrical Optical System (Academic Press, New
York, 1974). They assign a negative sign when the wavefront at the ray lags the
reference sphere, and a positive sign when it leads. As a result, their equation for
the transverse ray aberration has a minus sign on the right-hand side of Eq. (8-5),
because a ray must end up at the same point in the image plane regardless of the
sign convention used for the wave aberration. In practice, it does not matter which
sign convention is used as long as it is used consistently.
PROBLEMS
8.1 Show that the defocus wave aberration introduced by a lens of image-space focal
length $f'$ is given by $W(r) = -(n_i/2f')\,r^2$, where $n_i$ is the refractive index of the
image space.
8.2 The field curvature aberration of an imaging system may be written
$W(r) = a_d h'^2 r^2$, where $a_d$ is the aberration coefficient, and $h'$ is the height of a
Gaussian image point. Show that the effect of the aberration is eliminated if the
image is observed on a spherical surface of radius of curvature $1/(4a_d R^2)$ passing
through a corresponding axial Gaussian image point, where R is the radius of
curvature of the reference sphere with respect to which the aberration is defined.
The refractive index of the image space is assumed to be unity.
8.4 Consider an imaging system suffering from distortion aberration given by
$W(r, \theta) = a_t h'^3 r\cos\theta$, where $a_t$ is the aberration coefficient, and $h'$ is the height of a
Gaussian image point. Determine the height of the actual image point.
8.5 Show that the Seidel coma $\rho^3\cos\theta$ balanced with tilt $\rho\cos\theta$, and Seidel
astigmatism $\rho^2\cos^2\theta$ balanced with defocus for minimum variance across a
circular pupil have the form $[\rho^3 - (2/3)\rho]\cos\theta$ and $(1/2)\rho^2\cos 2\theta$, respectively.
8.7 Consider a three-mirror system. Determine the fabrication tolerance for the mirrors
for a Strehl ratio of 0.8. For simplicity, assume that each mirror has the same figure
tolerance.
CHAPTER 9
Spot Sizes and Diagrams
9.1 INTRODUCTION
In Chapter 4 we developed paraxial ray-tracing equations to determine the Gaussian
imaging properties of a system, vignetting of rays, size of imaging elements and
apertures, and obscurations in mirror systems. In the paraxial approximation, all rays
from a point object that are transmitted by the system pass through the Gaussian image
point. However, when the rays are traced according to the exact laws of geometrical optics,
they generally intersect in the vicinity of the Gaussian image point. The ray distribution
on an observation surface, as depicted by the intersection points, is called the spot
diagram, and its extent is called the spot size.
In this chapter, we discuss the distribution of rays in the image of a point object
aberrated by a primary aberration. The density of rays over an observation surface is
called the geometrical point-spread function. We define its centroid and sigma value, and
calculate them for primary aberrations, without explicitly calculating the ray density
distribution [1]. In the case of spherical aberration and astigmatism, the ray distribution
and spot size are also considered in image planes other than the Gaussian, thereby
introducing the concept of aberration balancing. In the early stages of the design of an
optical imaging system, one often considers its transverse ray aberrations in an image
plane for a set of rays lying along a certain line in the plane of the exit pupil and passing
through its center. Such a set of rays is called a ray fan. We illustrate the wave and ray
aberrations for ray fans along the x and y axes. Also discussed are the balanced
aberrations for the minimum spot sigma in terms of Zernike circle polynomials.
Aberration tolerances, including the depth of focus, and a golden rule of optical design
are discussed. The characteristics of the ray spots and tolerance for primary aberrations
are summarized in the last section.
9.2 THEORY
Consider an optical system consisting of a series of rotationally symmetric coaxial
refracting and/or reflecting surfaces imaging a point object P lying at a height h along the
x axis, as in Figure 8-3. The primary aberration function at the exit pupil of the system
may be written
where (r, θ) are the polar coordinates of a point in the xy plane of the exit pupil, h′ is
the height of the Gaussian image point P′, and $a_s$, $a_c$, $a_a$, $a_d$, and $a_t$ represent the
coefficients of spherical aberration, coma, astigmatism, field curvature, and distortion,
respectively. The angle θ is equal to zero or π for points lying in the tangential or
meridional plane (i.e., the zx plane containing the optical axis and the point object, and,
therefore, its Gaussian image). The chief ray, which by definition passes through the
center of the exit pupil, always lies in this plane. The plane normal to the tangential plane
but containing the chief ray is called the sagittal plane. The angle θ is equal to π/2 or
3π/2 for points lying in the sagittal plane. As the chief ray bends when it is refracted or
reflected by a surface, so does the sagittal plane. The rays lying in the tangential plane are
referred to as the tangential ray fan and those lying in the sagittal plane are referred to as
the sagittal ray fan. For an optical system with a circular exit pupil, say of radius a, it is
convenient to use normalized coordinates (ρ, θ), where ρ = r/a, 0 ≤ ρ ≤ 1, 0 ≤ θ < 2π,
suppress the explicit dependence on h′, and write the aberration function in the form
where the new aberration coefficients Ai are related to the ai used in Eq. (9-1) according
to
\[ A_s = a_s a^4 , \quad A_c = a_c h' a^3 , \quad A_a = a_a h'^2 a^2 , \quad A_d = a_d h'^2 a^2 , \quad A_t = a_t h'^3 a . \tag{9-3} \]
\[ (\xi, \eta) = \frac{1}{a}\,(x, y) \tag{9-4a} \]
In rectangular coordinates, Eq. (9-2) for the primary aberration function may be
written
\[ W(\xi, \eta) = A_s\left(\xi^2 + \eta^2\right)^2 + A_c\,\xi\left(\xi^2 + \eta^2\right) + A_a\,\xi^2 + A_d\left(\xi^2 + \eta^2\right) + A_t\,\xi . \tag{9-5} \]
The ray distribution on an observation surface is called the ray spot diagram, and
their density (i.e., the number of rays per unit area) over the surface is called the
geometrical point-spread function (PSF). If the system is aberration free, then the
wavefront at the exit pupil is spherical, and all of the object rays transmitted by the
system converge to the Gaussian image point. When the wavefront is aberrated, a ray
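passing through a point (ξ, η) or (ρ, θ) in the plane of the exit pupil intersects the
Gaussian image plane at a point $(x_i, y_i)$, which, following Eq. (8-5), may be written
\[ (x_i, y_i) = 2F\left(\frac{\partial W}{\partial\xi}, \frac{\partial W}{\partial\eta}\right) \tag{9-6a} \]
\[ = 2F\left(\cos\theta\,\frac{\partial W}{\partial\rho} - \frac{\sin\theta}{\rho}\frac{\partial W}{\partial\theta} ,\; \sin\theta\,\frac{\partial W}{\partial\rho} + \frac{\cos\theta}{\rho}\frac{\partial W}{\partial\theta}\right) , \tag{9-6b} \]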
where F = R 2 a is the focal ratio of the image-forming light cone. Here, R is the radius
of curvature of the Gaussian reference sphere with respect to which the aberration
W (ρ, θ) is defined, and ( xi , yi ) are the coordinates of the point of intersection of the ray
in the Gaussian image plane with respect to the Gaussian image point and represent its
ray aberrations. The reference sphere is centered at the Gaussian image point and, like the
aberrated wavefront, passes through the center of the exit pupil. In Eqs. (9-6), we have
assumed that the refractive index ni of the image space is unity because it is often the
case in practice.
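Equation (9-6a) can be applied directly to the primary aberration function of Eq. (9-5) by differentiating it term by term. The sketch below does this analytically; the function and argument names are illustrative, and the coefficients and the resulting ray aberrations share the same length units.

```python
def ray_aberration(xi, eta, focal_ratio, a_s=0.0, a_c=0.0, a_a=0.0, a_d=0.0, a_t=0.0):
    """Return (x_i, y_i) = 2F (dW/dxi, dW/deta) for the W of Eq. (9-5)."""
    r2 = xi * xi + eta * eta
    # Partial derivatives of each term of Eq. (9-5)
    dw_dxi = (4 * a_s * xi * r2 + a_c * (3 * xi * xi + eta * eta)
              + 2 * (a_a + a_d) * xi + a_t)
    dw_deta = 4 * a_s * eta * r2 + 2 * a_c * xi * eta + 2 * a_d * eta
    return 2 * focal_ratio * dw_dxi, 2 * focal_ratio * dw_deta

# Marginal ray on the xi axis with one unit of spherical aberration and F = 1
print(ray_aberration(1.0, 0.0, 1.0, a_s=1.0))
# Distortion alone shifts every ray by the same amount
print(ray_aberration(0.3, -0.7, 1.0, a_t=1.0))
```

The first call gives a displacement of $8FA_s$ for the marginal ray, and the second shows that distortion displaces all rays equally, so the image point moves without blurring.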
For a radially symmetric aberration, i.e., one for which W (ρ, θ) = W (ρ), we note
from Eq. (9-6b) that the PSF is also radially symmetric. The radial distance ri of a ray
from the Gaussian image point in that case is given by
\[ r_i = \left(x_i^2 + y_i^2\right)^{1/2} = 2F\left|\frac{\partial W(\rho)}{\partial\rho}\right| , \tag{9-7} \]
where the vertical bars ensure that ri is a numerically positive quantity.
\[ (x_c, y_c) = \left(\langle x_i\rangle , \langle y_i\rangle\right) = \frac{\iint (x_i, y_i)\,I_g(x_i, y_i)\,dx_i\,dy_i}{\iint I_g(x_i, y_i)\,dx_i\,dy_i} , \tag{9-9} \]
where the angular brackets indicate a mean value. However, it can be obtained in a
simple manner by substituting Eqs. (9-6a) and (9-8) into Eq. (9-9). Thus, for a uniformly
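\[ (x_c, y_c) = 2F\,\frac{\iint\left(\dfrac{\partial W}{\partial\xi}, \dfrac{\partial W}{\partial\eta}\right)d\xi\,d\eta}{\iint d\xi\,d\eta} = \frac{2F}{\pi}\iint\left(\frac{\partial W}{\partial\xi}, \frac{\partial W}{\partial\eta}\right)d\xi\,d\eta . \tag{9-10} \]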
The standard deviation of the image distribution or the spot sigma is given by
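\[ \sigma_s = \left\langle (x_i - x_c)^2 + (y_i - y_c)^2\right\rangle^{1/2} \tag{9-11a} \]
\[ = 2F\left\{\frac{1}{\pi}\iint\left[\left(\frac{\partial W}{\partial\xi} - x_c\right)^2 + \left(\frac{\partial W}{\partial\eta} - y_c\right)^2\right]d\xi\,d\eta\right\}^{1/2} . \tag{9-11b} \]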
For a symmetric aberration such as astigmatism, the PSF is symmetric, and the centroid
lies at the origin, i.e., $(x_c, y_c) = (0, 0)$. The spot sigma in such cases is equal to the root
mean square radius. Substituting Eq. (9-7) for a radially symmetric aberration, Eq. (9-11b)
reduces to
\[ \sigma_s = 2\sqrt{2}\,F\left[\int_0^1\left(\frac{\partial W}{\partial\rho}\right)^2\rho\,d\rho\right]^{1/2} . \tag{9-11c} \]
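The radial formula for the spot sigma lends itself to simple numerical quadrature. The sketch below uses a midpoint rule and takes the pupil limits as 0 to 1; the test aberration $A_s\rho^4 + B_d\rho^2$ and all names are illustrative assumptions. For this particular aberration the integral has the closed form $2A_s^2 + (8/3)A_sB_d + B_d^2$, which is used only as a cross-check.

```python
import math

def spot_sigma(dw_drho, focal_ratio, n=2000):
    """Spot sigma 2*sqrt(2)*F*[integral_0^1 (dW/drho)^2 rho d(rho)]^(1/2) by the midpoint rule."""
    h = 1.0 / n
    integral = sum(dw_drho((k + 0.5) * h) ** 2 * (k + 0.5) * h for k in range(n)) * h
    return 2.0 * math.sqrt(2.0) * focal_ratio * math.sqrt(integral)

def spherical_with_defocus(a_s, b_d):
    """Derivative of W = a_s * rho**4 + b_d * rho**2 with respect to rho."""
    return lambda rho: 4.0 * a_s * rho ** 3 + 2.0 * b_d * rho

def closed_form(a_s, b_d, focal_ratio):
    """Same quantity from the analytic value of the integral."""
    return 2.0 * math.sqrt(2.0) * focal_ratio * math.sqrt(
        2.0 * a_s ** 2 + (8.0 / 3.0) * a_s * b_d + b_d ** 2)

print(spot_sigma(spherical_with_defocus(1.0, 0.0), 1.0))    # Gaussian image plane
print(spot_sigma(spherical_with_defocus(1.0, -4.0 / 3.0), 1.0))
```

In the Gaussian image plane the result is $4FA_s$. Scanning $B_d$ with this expression places the smallest sigma at $B_d = -(4/3)A_s$, where it equals $(4/3)FA_s$.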
\[ W(\rho) = A_s\,\rho^4 \tag{9-12} \]
with respect to a reference sphere centered at the Gaussian image point P0′ of an axial
point object P0. Substituting Eq. (9-12) into Eq. (9-7), we find that a ray of zone ρ in the
plane of the exit pupil intersects the Gaussian image plane at a distance
9.3 Application to Primary Aberrations 385
\[ r_i = 8FA_s\,\rho^3 \tag{9-13} \]
from P0′. Thus, the rays lying on a circle of radius ρ in the exit pupil lie on a circle of
radius ri given by Eq. (9-13) in the Gaussian image plane. The maximum value of ri is
8FAs and corresponds to rays with ρ = 1, i.e., it corresponds to the marginal rays. We
refer to the maximum value of ri as the radius of the image spot. For an off-axis point
object, because As is independent of the height h of the point object from the optical axis,
the ray distribution owing to spherical aberration alone is also independent of h.
Let us consider the ray distribution in a slightly defocused image plane by intro-
ducing a defocus aberration Bd . The aberration with respect to a new reference sphere
centered at a defocused point lying at a distance z from the plane of the exit pupil may be
written
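$$W(\rho) = A_s \rho^4 + B_d \rho^2 , \tag{9-14}$$
where
$$B_d = \frac{1}{2} \left( \frac{1}{z} - \frac{1}{R} \right) a^2 \tag{9-15a}$$
$$\simeq -\frac{\Delta_R}{8F^2} . \tag{9-15b}$$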
Here, $\Delta_R = z - R$, and Eq. (9-15b) follows from Eq. (9-15a) when $z \sim R$. Note that $B_d$
is numerically negative for z > R, i.e., if the defocused image plane lies farther from the
exit pupil than the Gaussian image plane, or the longitudinal defocus $\Delta_R$ is positive.
Figure 9-1 shows how the wave aberration given by Eq. (9-14) varies across the exit pupil
for values of $B_d$ corresponding to the paraxial ($B_d = 0$), marginal ($B_d = -2A_s$), midway
($B_d = -A_s$), and least-confusion ($B_d = -1.5A_s$) image planes. The names of the image
planes given here will become clear from what follows. We note that for the negative
value of Bd , the aberration is negative everywhere except at the center and the edge of
the pupil, where it is zero.
The rays of zone ρ now lie in the defocused image plane on a circle of radius
$$r_i = 8 F A_s \left| \rho^3 + \frac{B_d}{2A_s}\, \rho \right| . \tag{9-16}$$
The circle in the image plane is traced out in the same sense as in the pupil plane as θ
varies from 0 to 2π to complete a circle of rays. In a given image plane, i.e., for a given
value of $B_d$, the maximum value of $r_i$ as ρ varies from 0 to 1 is the spot radius in that
plane. It occurs either at the stationary value of ρ obtained by letting $\partial r_i / \partial \rho = 0$ or at the
end value ρ = 1. We note that $r_i = 0$ at the other end point ρ = 0, implying that the chief
ray passes through the center of the image. When $B_d$ is negative, $r_i = 0$ also for rays
with $\rho = (-B_d / 2A_s)^{1/2}$.
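As a rough numerical companion to Eq. (9-16), the following sketch evaluates the ray radius and the spot radius in a defocused image plane. It works in units of 8FA_s and uses the shorthand c = −B_d/A_s; the function names are illustrative and are not taken from the text.

```python
import math

def ray_radius(rho, c):
    """Radius r_i of the image-plane circle for pupil zone rho, in units of 8*F*A_s.

    c = -B_d / A_s is the defocus in units of the spherical aberration coefficient.
    """
    return abs(rho**3 - 0.5 * c * rho)

def spot_radius(c):
    """Largest r_i over 0 <= rho <= 1: compare the stationary zone with the marginal zone."""
    candidates = [ray_radius(1.0, c)]          # end value rho = 1
    if 0.0 < c <= 6.0:                         # stationary zone rho = sqrt(c/6) lies inside the pupil
        candidates.append(ray_radius(math.sqrt(c / 6.0), c))
    return max(candidates)

for name, c in [("paraxial", 0.0), ("midway", 1.0), ("least confusion", 1.5), ("marginal", 2.0)]:
    print(f"{name:16s} c = {c:3.1f}  spot radius = {spot_radius(c):.3f}")
```

Only two candidate zones need to be compared because the radial function has a single interior stationary point.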
[Figure 9-1 plot: $W(\rho)/A_s = \rho^4 + (B_d/A_s)\rho^2$ versus ρ from 0 to 1, for $B_d/A_s = 0$, −1, −1.5, and −2.]
Figure 9-1. Variation of spherical aberration across the exit pupil in units of As
combined with different amounts of defocus Bd . Aberration variance is minimum
when Bd = - As .
How $r_i$ varies with ρ is shown in Figure 9-2 for the values of $B_d$ considered above.
We note that only when $B_d = 0$ does a given value of $r_i$ correspond to a single value of
ρ. When $B_d = -2A_s$, there are two different values of ρ lying between zero and one that
correspond to a given value of $r_i$, i.e., rays lying on two different circles in the pupil
plane lie on the same circle in the image plane. When $B_d = -A_s$ or $B_d = -3A_s/2$, there
are three different values of ρ lying between zero and one that correspond to a given
[Figure 9-2 plot: $r_i$ versus ρ from 0 to 1, for $B_d/A_s = 0$, −1, −1.5, and −2.]
Figure 9-2. Radius ri of a circle of rays in units of 8FAs in various image planes
characterized by the value of Bd as a function of corresponding radius ρ in the
pupil plane. The spot radius is minimum when Bd = -1.5 As .
value of $r_i$ for $0 < r_i < 1/(3\sqrt{6})$ or $0 < r_i < 1/4$, respectively, i.e., rays lying on three
different circles in the pupil plane lie on the same circle in the image plane. A circle of
rays with a larger value of $r_i$, up to $r_i = 1/2$, corresponds to only one circle of rays in the
pupil plane when $B_d = -A_s$. There are two circles of rays in the pupil plane, with ρ = 1/2
and 1, that correspond to $r_i = 1/4$ when $B_d = -3A_s/2$.
$$\Delta_R = -8 F^2 B_d \tag{9-17a}$$
$$= 16 F^2 A_s \tag{9-17b}$$
from P0′ . A positive value of Δ R implies that, compared with the old reference sphere,
the new reference sphere is centered at a point that is farther from the center of the exit
pupil, or that the defocused image plane lies farther from the exit pupil than the Gaussian
image plane. Thus, the point of intersection M of the marginal rays lies to the right of P0′ ,
as shown in Figure 9-3. This is to be expected because as may be seen from Figure 9-3,
the wavefront W is less curved than the reference sphere S for positive values of As . The
points $P_0'$ and M are called the Gaussian or paraxial (meaning for very small values of ρ)
and the marginal image points, respectively. Substituting Bd = − 2 As into Eq. (9-16),
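[Figure 9-3 diagram: marginal rays (MR) and chief ray (CR) from the exit pupil (ExP), with relative spot radii of 1, 0.5, 0.25, and 0.385 in the Gaussian (G), midway (MW), least-confusion (LC), and marginal (M) image planes, respectively; the longitudinal spherical aberration is the distance from $P_0'$ to M.]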
Figure 9-3. Ray spot radii in various image planes for a wavefront W aberrated by
spherical aberration. G – Gaussian or paraxial, M – marginal, MW – midway, LC –
least confusion. The reference sphere S is centered at a Gaussian image point P0′ .
we find that the maximum value of $r_i$ in the marginal image plane occurs for rays of zone
$\rho = 1/\sqrt{3}$. This maximum value, i.e., the spot radius, is $2/(3\sqrt{3})$ (or 0.385) times the
corresponding value in the Gaussian image plane. Thus, the marginal spot radius is
considerably smaller than the paraxial spot radius. The quantity Δ R given by Eq. (9-17b)
is called the longitudinal spherical aberration. It represents the distance of the marginal
image point from the Gaussian image point. If we consider the variation of longitudinal
spherical aberration with ρ , i.e., if we determine the distance of the point where the rays
of a zone ρ intersect the optical axis from P0′ , we find from Eqs. (9-15b) and (9-16) that
it varies quadratically with ρ according to
$$\Delta_R = 16 F^2 A_s \rho^2 . \tag{9-17c}$$
The image plane MW lying midway between the Gaussian and marginal planes
corresponds to $B_d = -A_s$. The spot radius in this plane is half of that in the Gaussian
image plane G and corresponds to marginal rays. The image plane that has the smallest
spot radius corresponds to that value of $B_d$ that minimizes the maximum value of $r_i$ as ρ
varies from 0 to 1 in Eq. (9-16). It is evident from Eq. (9-16) that $B_d$ must be negative; a
positive value of $B_d$ can only increase the value of $r_i$ for any value of ρ. The value of ρ
corresponding to the spot radius is either $\rho_1 = (c/6)^{1/2}$, obtained by letting $\partial r_i / \partial \rho = 0$,
where $c = -B_d / A_s$, or $\rho_2 = 1$. In units of $8FA_s$, the corresponding values of the spot
radius are $r_1 = c^{3/2} / (3\sqrt{6})$ and $r_2 = |1 - c/2|$, respectively. Figure 9-4 shows that $r_1$
increases monotonically as c increases, but $r_2$ first decreases, approaches zero as c → 2,
and then increases monotonically. The value of c that gives the minimum spot radius is
the one obtained by letting $r_1 = r_2$. This equality yields a cubic equation in c with
solutions c = 6, 6, and 3/2. The value 3/2 yields the minimum spot radius. Thus, the spot
radius is minimum in a plane LC (for least confusion) corresponding to $B_d = -3A_s/2$,
i.e., a plane that is 3/4 of the way from the Gaussian image plane to the marginal image
plane. The spot radius in this case is 1/4 of the Gaussian spot radius and corresponds to
the rays of zones ρ = 1/2 and 1. This spot is called the circle of least confusion. The spot
radii in the various image planes considered here are listed in Table 9-1. Note that they
increase linearly with F and $A_s$.
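The condition r1 = r2 can be checked numerically. Squaring c^(3/2)/(3√6) = 1 − c/2 and clearing denominators gives 2c³ − 27c² + 108c − 108 = 0. The sketch below locates the root of this cubic between the paraxial and marginal planes by bisection and confirms that c = 6 is also a root. The helper names are illustrative only.

```python
import math

def cubic(c):
    """2c^3 - 27c^2 + 108c - 108, which vanishes where r1(c) = r2(c)."""
    return 2 * c**3 - 27 * c**2 + 108 * c - 108

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c_lc = bisect(cubic, 0.0, 2.0)            # root between the paraxial and marginal planes
r1 = c_lc**1.5 / (3 * math.sqrt(6))       # stationary-zone radius, units of 8*F*A_s
r2 = abs(1 - c_lc / 2)                    # marginal-zone radius, units of 8*F*A_s
print(c_lc, r1, r2, cubic(6.0))
```

Bisection is used here only because it needs nothing beyond the standard library; any root finder would do.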
Because of the radial symmetry of spherical aberration, the wave and ray aberrations
of any ray fan can be written immediately from Eqs. (9-14) and (9-16), respectively. For
example, for the tangential ray fan, i.e., for the η = 0 rays, we may write
$$W(\xi, 0) = A_s \left[ \xi^4 + (B_d / A_s)\, \xi^2 \right] \tag{9-18a}$$
and
$$(x_i, y_i) = 8 F A_s \left[ \xi^3 + (B_d / 2A_s)\, \xi ,\; 0 \right] . \tag{9-18b}$$
Figure 9-5 shows how the wave and ray aberrations of a ray fan for spherical aberration
vary with ξ for the various defocus values listed in Table 9-1.
[Figure 9-4 plot: $r_1$ and $r_2$ versus c for 0 ≤ c ≤ 8.]
[Figure 9-5 plots: W(ξ, 0) and $x_i$ versus ξ from −1 to 1, for $B_d/A_s = 0$, −1, −3/2, and −2.]
Figure 9-5. Wave and ray aberrations for a ray fan for spherical aberration
corresponding to various image planes. The wave aberration is in units of As , and
the ray aberration is in units of FAs .
Table 9-1. Ray spot sizes in units of 8 FAs , for peak spherical aberration As .
Because of its radial symmetry, the centroid of the PSF lies at the Gaussian image
point (0, 0) . Substituting Eq. (9-16) into Eq. (9-11c), we obtain the image spot sigma
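$$\sigma_s = 8 F A_s \left[ \frac{1}{4} + \frac{1}{3} \frac{B_d}{A_s} + \frac{1}{2} \left( \frac{B_d}{2A_s} \right)^2 \right]^{1/2} . \tag{9-19}$$
Letting
$$\frac{\partial \sigma_s}{\partial B_d} = 0 , \tag{9-20}$$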
The deliberate mixing of one aberration with one or more other aberrations to reduce
the spot size is called aberration balancing. Here, we have balanced spherical aberration
with defocus in order to minimize the spot radius or its sigma value. The amount of
defocus that gives the smallest ray spot or its sigma value may be called the optimum
defocus based on geometrical optics. The balanced aberration giving the smallest ray spot
is $A_s [\rho^4 - (3/2) \rho^2]$. Similarly, the balanced aberration that gives the smallest spot
sigma is $A_s [\rho^4 - (4/3) \rho^2]$. Based on diffraction, the optimum amount of defocus
corresponds to the midway plane, because in that case it is used to reduce the variance of
the aberration across the exit pupil, i.e., the balanced aberration giving minimum
variance is $A_s (\rho^4 - \rho^2)$, similar to the Zernike circle polynomial $Z_4^0(\rho)$ [see Tables 8-4
and 8-5].
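A quick numerical cross-check of the spot sigma may be useful here. The sketch below evaluates the closed form of Eq. (9-19) and compares it with a direct midpoint-rule evaluation of Eq. (9-11c) for a wave aberration of the form A_s ρ⁴ + B_d ρ². Values are in units of 8FA_s, b stands for B_d/A_s, and the function names and grid size are illustrative choices.

```python
import math

def sigma_closed_form(b):
    """Eq. (9-19) in units of 8*F*A_s, with b = B_d / A_s."""
    return math.sqrt(0.25 + b / 3.0 + 0.5 * (b / 2.0) ** 2)

def sigma_numeric(b, n=20000):
    """Eq. (9-11c) by the midpoint rule, same units; dW/drho = A_s * (4 rho^3 + 2 b rho)."""
    h = 1.0 / n
    integral = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        integral += (4 * rho**3 + 2 * b * rho) ** 2 * rho * h
    return (2 * math.sqrt(2) / 8) * math.sqrt(integral)

for b in (0.0, -1.0, -4.0 / 3.0, -1.5, -2.0):
    print(f"B_d/A_s = {b:6.3f}  closed form = {sigma_closed_form(b):.4f}  numeric = {sigma_numeric(b):.4f}")
```

The smallest value in this list occurs at B_d/A_s = −4/3, in line with the balanced aberration quoted above for the minimum spot sigma.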
[Figure 9-6 plot: $\sigma_s$ versus $B_d/A_s$ from −2.0 to 0.]
Figure 9-6. Variation of $\sigma_s$ in units of $8FA_s$ for spherical aberration with defocus.
9.3.2 Coma
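The coma wave aberration is given by
$$W(\rho, \theta) = A_c \rho^3 \cos\theta \tag{9-21a}$$
or
$$W(\xi, \eta) = A_c\, \xi \left( \xi^2 + \eta^2 \right) . \tag{9-21b}$$
Substituting Eq. (9-21) into Eqs. (9-6), we obtain the corresponding ray aberrations in the
Gaussian image plane with respect to the Gaussian image point:
$$(x_i, y_i) = 2 F A_c \rho^2 \left( 2 + \cos 2\theta ,\; \sin 2\theta \right) \tag{9-22a}$$
$$= 2 F A_c \left( \rho^2 + 2\xi^2 ,\; 2\xi\eta \right) . \tag{9-22b}$$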
For a given value of ρ, the locus of the points of intersection of the rays in the Gaussian
image plane is given by
$$\left( x_i - 4 F A_c \rho^2 \right)^2 + y_i^2 = \left( 2 F A_c \rho^2 \right)^2 . \tag{9-23}$$
Thus, the rays coming from a circle of radius ρ in the exit pupil lie on a circle of radius
$2FA_c\rho^2$ centered at $(4FA_c\rho^2, 0)$ in the image plane. The circle in the image plane is
traced out twice in the same sense as in the pupil plane as θ varies from 0 to 2π to
complete a circle of rays. As illustrated in Figure 9-7, because $CB/CP' = 1/2$, all of the
rays in the image plane are contained in a cone with a semiangle of 30° bounded by a
[Figure 9-7 diagram: ray spot pattern for coma in the image plane, showing the circles of rays for ρ = 1 (center C, radius $2FA_c$, at a distance $4FA_c$ from P′) and ρ = 1/2, the tangents P′A and P′B at 30° to the $x_i$ axis, and the points S and T; a second sketch shows the tangential and sagittal marginal rays $MR_t$ and $MR_s$ and the chief ray CR from the exit pupil ExP.]
Figure 9-7. Ray spot diagram for coma. The tangential marginal rays $MR_t$ are
focused at the point T, and the sagittal marginal rays $MR_s$ are focused at the point
S. All of the rays in the image plane lie in a cone with a semiangle of 30° and its
vertex at the Gaussian image point P′, bounded by the upper arc of a circle of
radius $2FA_c$ centered at $(4FA_c, 0)$. The cone semiangle is 30° because $CB/CP' = 1/2$.
circle of radius 2FAc centered at ( 4 FAc , 0) corresponding to the marginal rays. Here, C
is the center of the circle formed by the marginal rays, and P ′A and P ′B are tangents to
the circle. The vertex of the cone, of course, coincides with the Gaussian image point P ′ .
Only the chief ray passes through P ′ . Rays in the image plane corresponding to a zone of
ρ = 1/2 are also shown in the figure. They lie on a circle of radius $FA_c/2$ centered at
$(FA_c, 0)$ in the image plane. Because the spot diagram has the shape of a comet, the
aberration is appropriately called coma. Note that the tangential marginal rays
$MR_t$ (ρ = 1, θ = 0, π) intersect this plane at a point T at a distance $6FA_c$ from P′ along
the xi axis, and the sagittal marginal rays MRs (ρ = 1, θ = π 2 , 3π 2) intersect the image
plane at a point S at a distance 2FAc from P ′ . Accordingly, the length 6FAc and half-
width 2FAc of the coma pattern are called tangential and sagittal coma, respectively.
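The geometry of the coma pattern can be reproduced with a few lines of code. The sketch below maps pupil points to the image plane using Eq. (9-22b), then checks that each zone lands on the circle of Eq. (9-23) and that every ray stays within 30° of the x_i axis. Coordinates are in units of FA_c, and the function name is illustrative.

```python
import math

def coma_ray(rho, theta):
    """Image-plane point (x_i, y_i) in units of F*A_c for pupil point (rho, theta)."""
    xi, eta = rho * math.cos(theta), rho * math.sin(theta)
    return 2 * (rho**2 + 2 * xi**2), 2 * (2 * xi * eta)

tan30 = math.tan(math.radians(30))
inside_wedge = True
on_circle = True
for rho in (0.25, 0.5, 0.75, 1.0):
    for k in range(360):
        x, y = coma_ray(rho, math.radians(k))
        # Each zone maps onto a circle of radius 2 rho^2 centered at (4 rho^2, 0).
        on_circle &= abs(math.hypot(x - 4 * rho**2, y) - 2 * rho**2) < 1e-12
        # Every ray stays within 30 degrees of the x_i axis.
        inside_wedge &= abs(y) <= x * tan30 + 1e-12

tangential = coma_ray(1.0, 0.0)[0]            # marginal tangential rays
sagittal = coma_ray(1.0, math.pi / 2)[0]      # marginal sagittal rays
print(on_circle, inside_wedge, tangential, sagittal)
```

The last two values are the tangential and sagittal coma in units of FA_c.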
According to Eq. (9-21b), the wave aberration for the tangential ray fan is given by
Wt (ξ, 0) = Ac ξ 3 . (9-24)
It is zero for the sagittal ray fan. The ray aberrations given by Eq. (9-22b) may be written
for the two types of rays in the form
$$(x_i, y_i)_t = \left( 6 F A_c \xi^2 ,\; 0 \right) \tag{9-25a}$$
and
$$(x_i, y_i)_s = \left( 2 F A_c \eta^2 ,\; 0 \right) . \tag{9-25b}$$
We note that even though the wave aberration of the rays in the sagittal fan is zero, their
ray aberration is not; the rays are displaced along the $x_i$ (or ξ) axis in the image plane.
Figure 9-8 shows the variation of wave and ray aberrations with pupil coordinates. We
note that the wave aberration is odd and the ray aberration is even in pupil coordinates.
Of course, this is also evident from Eqs. (9-21b) and (9-22b).
Because the PSF is highly asymmetric about the Gaussian image point P ′ , its
centroid does not lie at it. Substituting Eq. (9-22b) into Eq. (9-10), we obtain the
coordinates of the centroid
$$(x_c, y_c) = (2 F A_c ,\; 0) . \tag{9-26}$$
[Figure 9-8 plots: W(ξ, 0) versus ξ, and $x_i(\xi)$ and $x_i(\eta)$ versus ξ, η, each from −1 to 1.]
Figure 9-8. Wave and ray aberrations for tangential and sagittal ray fans for coma.
The wave aberration is in units of Ac , and the ray aberration is in units of FAc . The
wave aberration is zero for the sagittal ray fan.
Thus, the centroid lies at the point S in Figure 9-7 where the sagittal marginal rays
intersect the image plane. Substituting Eqs. (9-22a) and (9-26) into Eq. (9-11a), we obtain
the image spot sigma:
$$\sigma_s = 2 F A_c \left\langle \left[ \rho^2 (2 + \cos 2\theta) - 1 \right]^2 + \rho^4 \sin^2 2\theta \right\rangle^{1/2} = 2 \sqrt{2/3}\; F A_c . \tag{9-27}$$
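As a hedged numerical check of the centroid and spot sigma for coma, the sketch below averages the ray coordinates over a uniformly illuminated unit pupil on a polar midpoint grid. It uses the ray aberration of Eq. (9-22b) written in polar form with ξ = ρ cos θ and η = ρ sin θ, works in units of FA_c, and the grid sizes are arbitrary choices.

```python
import math

def pupil_average(f, n_rho=400, n_theta=360):
    """Mean of f(rho, theta) over the unit disc (uniform pupil), polar midpoint rule."""
    d_rho, d_theta = 1.0 / n_rho, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            total += f(rho, theta) * rho * d_rho * d_theta
    return total / math.pi

def x_i(rho, theta):
    return 2 * rho**2 * (2 + math.cos(2 * theta))   # units of F*A_c

def y_i(rho, theta):
    return 2 * rho**2 * math.sin(2 * theta)         # units of F*A_c

x_c = pupil_average(x_i)
y_c = pupil_average(y_i)
variance = pupil_average(lambda r, t: (x_i(r, t) - x_c) ** 2 + (y_i(r, t) - y_c) ** 2)
sigma = math.sqrt(variance)
print(x_c, y_c, sigma, 2 * math.sqrt(2 / 3))
```

The computed centroid falls at the sagittal coma point, and the sigma agrees with Eq. (9-27) to within the grid error.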
Measuring the ray coordinates in the image plane with respect to a point other than
the Gaussian image point is equivalent to introducing a wavefront tilt aberration in the
aberration function, which may then be written
$$W(\rho, \theta) = A_c \rho^3 \cos\theta + B_t \rho \cos\theta , \tag{9-28}$$
where $B_t$ is the peak value of the balancing tilt aberration and corresponds to measuring
the wave aberration with respect to a reference sphere centered at a point in the image
plane with coordinates $(-2FB_t, 0)$ or, equivalently, measuring the ray coordinates with
respect to this point. Thus, measuring the ray aberrations with respect to the centroid is
equivalent to a tilt aberration of $-A_c \rho \cos\theta$, or $B_t = -A_c$. Accordingly, the aberration
function with respect to the centroid can be written
$$W(\rho, \theta) = A_c \left( \rho^3 - \rho \right) \cos\theta . \tag{9-29}$$
It should be evident that if the ray aberrations are measured with respect to a point
other than the centroid, including the Gaussian image point, the sigma value of the spot
will increase. The aberration function given by Eq. (9-29) represents coma aberration
balanced optimally with tilt aberration to yield a minimum value of the spot sigma, or
to bring its centroid to the Gaussian image point. However, the variance of the wave
aberration is minimum when $B_t = -(2/3) A_c$, i.e., if the balanced aberration is
$A_c [\rho^3 - (2/3) \rho] \cos\theta$, similar to the Zernike polynomial $Z_3^1(\rho, \theta)$ [see Tables 8-4
and 8-5].
It is worth mentioning that the centroid of a PSF is associated with the line of sight of
an imaging system. Moreover, the centroid of a geometrical PSF is identical to that of the
diffraction PSF [2].
$$W(\xi, \eta) = (A_a + A_d + B_d)\, \xi^2 + (A_d + B_d)\, \eta^2 , \tag{9-30b}$$
where $A_a$ and $A_d$ are both proportional to $h'^2$, and the balancing defocus coefficient $B_d$
is related to the longitudinal defocus $\Delta_R$ according to Eq. (9-17a). The corresponding ray
aberrations are given by
$$(x_i, y_i) = 4 F \rho \left[ (A_a + A_d + B_d) \cos\theta ,\; (A_d + B_d) \sin\theta \right] \tag{9-31a}$$
$$= 4 F \left[ (A_a + A_d + B_d)\, \xi ,\; (A_d + B_d)\, \eta \right] . \tag{9-31b}$$
For a given value of ρ, the locus of the points of intersection of the rays in the defocused
image plane is given by
$$\left( \frac{x_i}{A} \right)^2 + \left( \frac{y_i}{B} \right)^2 = 1 , \tag{9-32}$$
where
$$A = 4F (A_a + A_d + B_d)\, \rho \tag{9-33}$$
and
$$B = 4F (A_d + B_d)\, \rho . \tag{9-34}$$
Thus, the rays lying on a circle of radius ρ in the exit pupil, in general, lie in a defocused
image plane on an ellipse whose semiaxes are given by A and B. The largest ellipse is
obtained for the marginal rays.
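To make the ellipse of Eq. (9-32) concrete, the sketch below evaluates the semiaxes of Eqs. (9-33) and (9-34) for the marginal zone in a few image planes. The plane definitions follow Table 9-2 and the condition A_d + B_d = −A_a/2 quoted for the circle of least confusion. The coefficient values are invented for illustration, and the function name is hypothetical.

```python
def ellipse_semiaxes(F, A_a, A_d, B_d, rho=1.0):
    """Semiaxes (A, B) of Eqs. (9-33) and (9-34); a negative sign only flips the sense of the ellipse."""
    A = 4 * F * (A_a + A_d + B_d) * rho
    B = 4 * F * (A_d + B_d) * rho
    return A, B

# Illustrative numbers only: focal ratio 10, coefficients in micrometers.
F, A_a, A_d = 10.0, 0.5, 0.2

planes = {
    "Gaussian": 0.0,
    "sagittal line (B_d = -A_d)": -A_d,
    "equal semiaxes (A_d + B_d = -A_a/2)": -(A_d + A_a / 2),
    "tangential line (B_d = -(A_d + A_a))": -(A_d + A_a),
}
for name, B_d in planes.items():
    A, B = ellipse_semiaxes(F, A_a, A_d, B_d)
    print(f"{name:40s} A = {A:7.2f}  B = {B:7.2f}")
```

One semiaxis vanishes in each of the two line-image planes, and the two are equal in magnitude in the plane between them.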
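[Figure 9-9 diagram: tangential and sagittal marginal rays $MR_t$ and $MR_s$ and the chief ray CR from the exit pupil ExP, with the tangential line image T, the circle of least confusion C, and the sagittal line image S at longitudinal distances $\Delta_{Rt}$, $\Delta_{Rb}$, and $\Delta_{Rs}$.]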
Figure 9-9. Astigmatic images in the presence of field curvature, showing elliptical
image spots and astigmatic focal lines. The sagittal marginal rays MRs are shown
converging on the sagittal line image S, and the tangential marginal rays MRt are
shown converging on the tangential line image T. The line images S and T, and the
circle of least confusion C, are special cases of the elliptical spots.
Because both $A_a$ and $A_d$ vary as $h'^2$, the length of the sagittal and tangential line images
of a point object increases quadratically with the height h′ of the Gaussian image point.
Similarly, $\Delta_{Rs}$, $\Delta_{Rt}$, $\Delta_{Rb}$, and longitudinal astigmatism increase as $h'^2$. For a line
object, equating $\Delta_R$ to the sag of a curved line image, we find that the sagittal,
tangential, and best images are parabolic with the vertex radii of curvature given by
$$R_s = \frac{h'^2}{16 F^2 A_d} \tag{9-35a}$$
$$= \frac{1}{4 R^2 a_d} , \tag{9-35b}$$
$$R_t = \frac{h'^2}{16 F^2 (A_a + A_d)} \tag{9-36a}$$
$$= \frac{1}{4 R^2 (a_a + a_d)} , \tag{9-36b}$$
and
$$R_b = \frac{h'^2}{8 F^2 (A_a + 2A_d)} \tag{9-37a}$$
$$= \frac{1}{2 R^2 (a_a + 2 a_d)} , \tag{9-37b}$$
$$\frac{3}{R_s} - \frac{1}{R_t} = 4 R^2 (2 a_d - a_a) . \tag{9-38}$$
The right-hand side is also related to the radius of curvature Rp of the Petzval image, and
Eq. (9-38) may be written
$$\frac{3}{R_s} - \frac{1}{R_t} = \frac{2}{R_p} . \tag{9-39}$$
Because the sag of a surface is inversely proportional to its (vertex) radius of curvature,
Eq. (9-39) has the consequence that, as illustrated in Figure 9-10, the Petzval surface is
three times as far from the tangential surface as it is from the sagittal surface. Moreover,
the sagittal surface always lies between the tangential and the Petzval surfaces. When
astigmatism is zero, the sagittal and the tangential surfaces reduce to the Petzval surface.
We also note from Eqs. (9-35) through (9-37) that
$$\frac{1}{R_b} = \frac{1}{2} \left( \frac{1}{R_s} + \frac{1}{R_t} \right) , \tag{9-40}$$
i.e., the vertex curvature of the best-image surface is equal to the mean value of the vertex
curvatures of the sagittal and tangential surfaces. The best-image surface is planar when
aa = − 2 ad . In that case, Rs = − Rt , i.e., the sagittal and tangential image surfaces have
equal but opposite vertex curvatures.
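The relations among the image surfaces are easy to verify numerically. The sketch below computes the vertex radii from Eqs. (9-35b), (9-36b), and (9-37b) and checks them against Eqs. (9-38) through (9-40). The numerical values of R, a_a, and a_d are made up for illustration, and the function name is hypothetical.

```python
def vertex_radii(R, a_a, a_d):
    """Vertex radii of curvature (R_s, R_t, R_b) of the sagittal, tangential, and best-image surfaces."""
    R_s = 1.0 / (4 * R**2 * a_d)
    R_t = 1.0 / (4 * R**2 * (a_a + a_d))
    R_b = 1.0 / (2 * R**2 * (a_a + 2 * a_d))
    return R_s, R_t, R_b

# Hypothetical values: R in cm, coefficients in cm^-3.
R, a_a, a_d = 10.0, 1.0e-4, 2.0e-4
R_s, R_t, R_b = vertex_radii(R, a_a, a_d)

petzval_curvature = 0.5 * (3 / R_s - 1 / R_t)      # 1/R_p from Eq. (9-39)
mean_curvature = 0.5 * (1 / R_s + 1 / R_t)         # right-hand side of Eq. (9-40)
print(R_s, R_t, R_b, 1 / petzval_curvature)
```

Because sag scales with vertex curvature, the difference between the tangential and Petzval curvatures comes out three times the difference between the sagittal and Petzval curvatures.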
The wave and ray aberrations of a tangential ray fan are given by Eqs. (9-30b) and
(9-31b) according to
$$W_t(\xi, 0) = (A_a + A_d + B_d)\, \xi^2 \tag{9-41}$$
and
$$(x_i, y_i)_t = 4F (A_a + A_d + B_d)\, (\xi, 0) , \tag{9-42}$$
respectively. Similarly, for the sagittal ray fan, they are given by
$$W_s(0, \eta) = (A_d + B_d)\, \eta^2 \tag{9-43}$$
and
$$(x_i, y_i)_s = 4F (A_d + B_d)\, (0, \eta) . \tag{9-44}$$
[Figure 9-10 diagram: relative positions of the tangential (T), sagittal (S), and Petzval (P) image surfaces.]
The centroid of the PSF lies at the Gaussian image point (0, 0) because the PSF is
symmetric about both the $x_i$ and $y_i$ axes. The image spot sigma may be obtained by
substituting Eq. (9-31a) into Eq. (9-11a). Thus,
[Figure 9-11 plots: W(ξ, 0) and $x_i$ versus ξ from −1 to 1, for $(A_d + B_d)/A_a = 0$, −1/2, and −1.]
Figure 9-11. Wave and ray aberrations for a tangential ray fan for astigmatism
corresponding to various image planes. The wave aberration is in units of Aa , and
the ray aberration is in units of FAa .
$$\sigma_s = 2 F A_a \left[ 1 + 2\, \frac{A_d + B_d}{A_a} + 2 \left( \frac{A_d + B_d}{A_a} \right)^2 \right]^{1/2} . \tag{9-45}$$
[Figure 9-12 plot: $\sigma_s$ versus $(A_d + B_d)/A_a$ from −1.0 to 0.]
Letting
$$\frac{\partial \sigma_s}{\partial B_d} = 0 , \tag{9-46}$$
we find that the spot sigma is minimum and equal to $\sqrt{2}\, F A_a$ when $A_d + B_d = -A_a/2$,
i.e., in the plane of the circle of least confusion, as expected for uniform irradiance. The
spot shape and size, including its $\sigma_s$ value, in an image plane defined by the balancing
defocus are summarized in Table 9-2.
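The location of the minimum can be confirmed with a coarse scan of Eq. (9-45). In the sketch below, d stands for (A_d + B_d)/A_a, sigma is in units of 2FA_a, and the grid spacing is an arbitrary choice.

```python
import math

def sigma_astigmatism(d):
    """Eq. (9-45) in units of 2*F*A_a, with d = (A_d + B_d) / A_a."""
    return math.sqrt(1 + 2 * d + 2 * d**2)

# Coarse scan over the range between the tangential (d = -1) and sagittal (d = 0) line images.
grid = [-1 + k / 1000 for k in range(1001)]
d_best = min(grid, key=sigma_astigmatism)
print(d_best, sigma_astigmatism(d_best), sigma_astigmatism(-1.0), sigma_astigmatism(0.0))
```

In these units the minimum is 1/√2, which corresponds to √2 FA_a, while both line images give a value of 1.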
If astigmatism is the only aberration present, i.e., if the field curvature coefficient
$A_d = 0$ in Eqs. (9-31), then all of the object rays transmitted by the exit pupil intersect the
Gaussian image plane on a line S of full length $8FA_a$ along the $x_i$ axis centered at the
Gaussian image point P′, as illustrated in Figure 9-13. This is the sagittal image of a
point object. The sagittal rays converge on the Gaussian image point. Similarly, a
tangential line image T of the same full length as the sagittal line image is obtained in a
defocused image plane corresponding to $B_d = -A_a$. The tangential rays converge to a
point at its center. The sagittal image of a line object is also a line that is slightly longer
(by an amount $8FA_a$) than, but coincident with, its Gaussian line image. However, its
tangential image is parabolic with a vertex radius of curvature of $h'^2 / (16 F^2 A_a)$ or
$1 / (4 R^2 a_a)$. Note that the longitudinal astigmatism in this case represents the sag of the
tangential image surface. Similarly, the sagittal image of a planar object will be planar,
but its tangential image will be paraboloidal.
Table 9-2. Ray spot shape, size, and sigma for astigmatism $A_a$ and field curvature $A_d$
in various image planes defined by defocus $B_d$.
| Image plane | Balancing defocus $B_d$ | Spot shape and size* | Spot sigma $\sigma_s / 2FA_a$ |
|---|---|---|---|
| General | $B_d$ | Elliptical, $8F(A_a + A_d + B_d) \times 8F(A_d + B_d)$ | $\left[ 1 + 2\, \frac{A_d + B_d}{A_a} + 2 \left( \frac{A_d + B_d}{A_a} \right)^2 \right]^{1/2}$ |
| Gaussian | 0 | Elliptical, $8F(A_a + A_d) \times 8FA_d$ | $\left[ 1 + 2\, \frac{A_d}{A_a} + 2 \left( \frac{A_d}{A_a} \right)^2 \right]^{1/2}$ |
| Sagittal | $-A_d$ | Line along $x_i$ axis, $8FA_a$ | 1 |
| Tangential | $-(A_d + A_a)$ | Line along $y_i$ axis, $8FA_a$ | 1 |
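[Figure 9-13 diagram: tangential and sagittal marginal rays $MR_t$ and $MR_s$ and the chief ray CR from the exit pupil ExP, with the sagittal focal line S through P′, the circle of least confusion C, and the tangential focal line T.]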
Figure 9-13. Astigmatic focal lines when only astigmatism is present. The tangential
marginal rays MRt are focused at a point on the tangential focal line T. Similarly,
the sagittal marginal rays MRs are focused at the Gaussian image point P ′ on the
sagittal focal line S. The focal lines S and T lie in the tangential and sagittal planes,
respectively. The circle of least confusion C lies in a plane midway between the
planes of line images S and T.
Figure 9-14 illustrates the effect of astigmatism and field curvature on the image of a
spoked wheel where the images formed on the sagittal and tangential surfaces are shown.
A magnification of − 1 is assumed in the figure. As discussed earlier, a point object P is
imaged as a sagittal or radial line Ps′ on the sagittal surface and as a tangential line Pt′ on
[Figure 9-14 panels: (a) object, with points at h = 1/2 and h = 1; (b) image on the sagittal surface; (c) image on the tangential surface.]
the tangential surface. Each point on the object is imaged in this manner, so that the
sagittal image consists of sharp radial lines and diffuse circles while the tangential image
consists of sharp circles and diffuse radial lines. If the object contains lines that are
neither radial nor tangential, they will not be sharply imaged on any surface.
It should be understood that the astigmatism discussed here is for a system that is
rotationally symmetric about its optical axis, and its value reduces to zero for an axial
point object. It is different from the astigmatism of the eye which is caused by one or
more of its refracting surfaces, usually the cornea, that is curved more in one plane than
another. The refracting surface that is normally spherical acquires a small cylindrical
component, i.e., it becomes toric. Such a surface forms a line image of a point object even
when it lies on its axis. Thus, a person afflicted with astigmatism sees points as lines. If
the object consists of vertical and horizontal lines as in the wires of a window screen,
such a person can focus (by accommodation) only on the vertical or the horizontal lines at
a time. This is analogous to the spoked wheel example where the rim is in focus in one
observation plane and the spokes are in focus in another.
$$W(\rho) = A_d \rho^2 , \tag{9-47}$$
where $A_d$ varies with the image height as $h'^2$. Because the wave aberration is radially
symmetric, the distribution of rays in the Gaussian image plane is also radially
symmetric. For rays lying on a circle of radius ρ in the exit pupil, the radius of the
corresponding circle of rays in the image plane, following Eq. (9-7), is given by
$$r_i = 4 F A_d \rho . \tag{9-48a}$$
Its maximum value of $4FA_d$ represents the spot radius, and corresponds to the marginal
rays. The circle in the image plane is traced out in the same sense as in the pupil as θ
varies from 0 to 2π. As may be seen by substituting Eq. (9-47) into Eq. (9-11c), the spot
sigma value is given by
$$\sigma_s = 2\sqrt{2}\, F A_d . \tag{9-48b}$$
From the discussion in Section 8.3, we note that an aberration represented by Eq. (9-47)
implies that the wavefront is spherical, but it is not centered at the Gaussian image
point. Instead, it is centered at a distance
$$\Delta_R = 8 F^2 A_d \tag{9-49}$$
from the Gaussian image point along the optical axis (strictly speaking, it is centered on
the line joining the center of the exit pupil and the Gaussian image point). Because the
$$W_t(\xi, 0) = A_d \xi^2 \tag{9-50a}$$
and
$$(x_i, y_i)_t = (4 F A_d \xi ,\; 0) . \tag{9-50b}$$
Figure 9-15 shows how the wave and ray aberrations vary with ξ. The PSF in this case
has a uniform irradiance of $I_p \left( a^2 / 2 R A_d \right)^2$ across a circle of radius $4FA_d$.
A similar result is obtained when the image is observed in a defocused image plane
at a distance z. According to Eq. (9-15b), a longitudinal defocus of $\Delta_R = z - R$
introduces a defocus aberration of $B_d \rho^2$, where
$$\Delta_R = -8 F^2 B_d . \tag{9-51}$$
Unlike the field curvature coefficient $A_d$, the value of $B_d$ is independent of the height of
a point object. From Eq. (9-48a), the spot radius is given by
$$r_{i\max} = 4 F |B_d| = \frac{|\Delta_R|}{2F} . \tag{9-52}$$
[Figure 9-15 plots: W(ξ, 0) and $x_i$ versus ξ from −1 to 1.]
Figure 9-15. Wave and ray aberrations of a ray fan for field curvature. The wave
aberration is in units of Ad , and the ray aberration is in units of FAd .
This result can also be obtained from a simple geometry of defocus, as illustrated in
Figure 9-16. It shows the rays coming to focus at the axial image point P0′ . It is seen from
the figure that, if the image is observed in a defocused image plane at a distance z ± Δ R ,
then the spot radius is given by $r_{i\max} / \Delta_R = a / L_i = 1/2F$, in agreement with Eq. (9-52).
The image quality (based on geometrical optics) is not affected as long as the spot
radius is smaller than the grain size of the film or the detector element of a photodetector
array used to record the image. Thus, the tolerable amount of longitudinal defocus, called
the depth of focus, can be determined. An alternative approach, based on the diffraction
image (instead of the ray image), is to use the Rayleigh criterion, according to which the
peak value of defocus aberration must be less than or equal to λ/4. This, in turn,
corresponds to a longitudinal defocus of $2\lambda F^2$. The corresponding tolerance on the
object position, called the depth of field, may be obtained from Eq. (2-77) for the
longitudinal magnification. Thus, the depth of field is given by $\Delta_R / M_t^2$, where $M_t$ is
the transverse magnification of the image.
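The two depth-of-focus estimates and the resulting depth of field can be put into numbers with a short sketch. The system parameters below (focal ratio, wavelength, detector half-width, magnification) are hypothetical, and the function names are illustrative. The depth-of-field step follows the text in using a longitudinal magnification of M_t squared.

```python
def geometric_depth_of_focus(F, spot_radius_limit):
    """Longitudinal defocus at which the ray spot radius |Delta_R| / (2F) reaches the given limit."""
    return 2 * F * spot_radius_limit

def rayleigh_depth_of_focus(F, wavelength):
    """Longitudinal defocus for a peak defocus aberration of wavelength / 4."""
    return 2 * wavelength * F**2

def depth_of_field(delta_R, M_t):
    """Object-space tolerance from the image-space tolerance via the longitudinal magnification M_t**2."""
    return delta_R / M_t**2

# Hypothetical system: F = 8, wavelength 0.5 um, detector element of 5 um half-width, M_t = -0.1.
F, wavelength, pixel_half_width, M_t = 8.0, 0.5, 5.0, -0.1
dR_geo = geometric_depth_of_focus(F, pixel_half_width)   # micrometers
dR_ray = rayleigh_depth_of_focus(F, wavelength)           # micrometers
print(dR_geo, dR_ray, depth_of_field(dR_ray, M_t))
```

With these numbers the geometrical and Rayleigh tolerances are of the same order, and the depth of field is larger by a factor of 100.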
9.3.5 Distortion
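The distortion wave aberration is given by
$$W(\rho, \theta) = A_t \rho \cos\theta \tag{9-53a}$$
or
$$W(\xi, \eta) = A_t\, \xi , \tag{9-53b}$$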
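[Figure 9-16 diagram: geometry of defocus, showing marginal rays MR from an exit pupil ExP of radius a coming to focus at $P_0'$ at a distance $L_i$, and the spot radius $r_{i\max}$ in a plane displaced by $\Delta_R$.]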
$$(x_i, y_i) = (2 F A_t ,\; 0) \tag{9-54a}$$
$$= \left( R a_t h'^3 ,\; 0 \right) . \tag{9-54b}$$
Because the ray aberrations are independent of the coordinates (ρ, θ) of a ray in the exit
pupil, all of the rays converge at the image point $(2FA_t, 0)$, which lies along the $x_i$ axis
at a distance $2FA_t$ from the Gaussian image point. Thus, a wavefront aberrated by
distortion is tilted with respect to the Gaussian reference sphere by an angle
$$= A_t / a . \tag{9-55}$$
This angle is proportional to $h'^3$, and represents the line-of-sight error in the location of a
point object. Similarly, the distance $2FA_t$ of the perfect image point from the Gaussian
image point is proportional to $h'^3$. Distortion is often measured as a fraction of the image
height. Thus, for example, the percent distortion is $100 \left( R a_t h'^2 \right)$.
It should be noted that although the ray aberration for distortion is independent of the
ray coordinates in the pupil plane, all of the rays converge at the point $(2FA_t, 0)$ only if
distortion is the only wave aberration present. If other wave aberrations are
present, then different rays will intersect the Gaussian image plane at different points.
The chief ray will still intersect the Gaussian image plane at the point (2 FAt , 0) because
its ray aberration due to the other wave aberrations is zero. Therefore, the ray distortion
aberration is the distance of the point where the chief ray intersects the Gaussian image
plane from the Gaussian image point, i.e., it represents the distance between the points of
intersection of the actual (within the approximation of a primary aberration) and the
paraxial chief rays in the Gaussian image plane.
In order for the distortion to be zero, the chief ray from any point in the object plane
must pass through its Gaussian image point. This has the implication that the image
magnification M must be independent of the object height. Thus, if we consider two point
objects P1 and P2 at heights h1 and h2 , as illustrated in Figure 9-17, the heights h1′ and
h2′ of their Gaussian images P1′ and P2′ must be related to each other according to
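[Figure 9-17 diagram: chief rays $CR_1$ and $CR_2$ from point objects $P_1$ and $P_2$ at heights $h_1$ and $h_2$, passing through the centers O and O′ of the entrance pupil (EnP) and exit pupil (ExP) at slope angles $\beta_1$, $\beta_2$ and $\beta_1'$, $\beta_2'$, to the Gaussian images $P_1'$ and $P_2'$ at heights $h_1'$ and $h_2'$; $L_o$ and $L_i$ are the object and image distances from the pupils.]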
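$$M = \frac{h_1'}{h_1} = \frac{h_2'}{h_2} . \tag{9-56}$$
Substituting for the object and image heights in terms of the slope angles of the
corresponding chief rays, we may write
$$\frac{\tan\beta_1'}{\tan\beta_1} = \frac{\tan\beta_2'}{\tan\beta_2} = M\, \frac{L_o}{L_i} , \tag{9-57}$$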
where $L_o$ and $L_i$ are the object and image distances from the entrance and exit pupils,
respectively. Thus, the requirement for zero distortion is that the ratio of the tangents of
the slope angles of a chief ray in the object and image spaces must be independent of the
location of the object point. The value of the ratio is given by $M (L_o / L_i)$. Equation (9-57)
is called the tangent condition for eliminating distortion. It should be noted, however, that
we have assumed that all of the chief rays in the image space of the system pass
through the center O′ of the exit pupil. This would be true only if the axial point O of the
entrance pupil is imaged perfectly at O′. In other words, spherical aberration of the
system for pupil imaging must be zero. This may often not be the case because a system
will normally be designed to reduce the spherical aberration for imaging of the object
plane. The tangent condition is satisfied in the case of imaging by a pinhole camera
(discussed in Section 6.9) and a thin lens with a collocated aperture stop, because the
chief ray is transmitted without any deviation in both cases.
Because of distortion, the image of any point object is displaced from its Gaussian
image point by an amount 2FAt along a line joining the axial image point and the
Gaussian image point under consideration. We consider imaging of point objects P1 and
P2 that are at distances h1 and h2 , respectively, from the axial point object P0 . Their
Gaussian images P1′ and P2′ are located at distances h1′ and h2′ , respectively, from the
Gaussian image P0′ of the axial object P0 . Because of distortion, the images are displaced
to positions $P_1''$ and $P_2''$ so that the displacements $P_1'P_1''$ and $P_2'P_2''$ are proportional to
$h_1'^3$ and $h_2'^3$, respectively.
We note from similar triangles $P_0'P_1'P_2'$ and $P_2'AP_2''$ in Figure 9-18 that
$$\frac{P_2'A}{h_1'} = \frac{P_2'P_2''}{h_2'} = \frac{AP_2''}{b} , \tag{9-58}$$
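[Figure 9-18 diagram: line object $L_1L_2$ through $P_1$ and $P_2$ at heights $h_1$ and $h_2$, its Gaussian image $L_1'L_2'$ through $P_1'$ and $P_2'$, and its distorted image $L_1''L_2''$ through $P_1''$ and $P_2''$, with the point A and the distance b marked.]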
Figure 9-18. Image of a square in the presence of distortion. The dashed square is
the Gaussian image. L1′ L2′ and L1′′ L2′′ are the Gaussian and distorted images of the
line object L1 L2 , respectively. A magnification of – 1.5 is assumed in the figure.
$$= R a_t h_1' \left( h_1'^2 + b^2 \right) .$$
which represents the sag of P2′′ from a line parallel to the Gaussian line image L1′L2′ but
passing through P1′′ . From Eq. (9-58),
For small values of $a_t$, $AP_2''$ is also small; therefore, $P_1''P_2'' \sim P_1'P_2' = b$. From Eq. (9-60)
we note then that the sag of $P_2''$ is proportional to the square of its distance b from
$P_1''$. Thus, the locus of $P_2''$ represents a parabola with a vertex at $P_1''$ and a vertex radius
of curvature of $1/(2 R a_t h_1')$. If $a_t$ is positive, the parabolic image is curved away from the
Gaussian image line, as shown in Figure 9-18. If it is negative, the parabolic image will
be curved toward the Gaussian image line. We note from Eq. (9-60) that if the line object
intersects the optical axis so that h1′ is zero, then the sag of P2′′ is also zero. Accordingly,
the image P2′′ of a point object P2 is simply displaced along the image line. Thus, the
image of a line object intersecting the optical axis is also a line differing from the
Gaussian image line only in that it is slightly longer. This discussion can be easily
extended to obtain the distorted images of a square grid shown in Figure 9-19. It should
be evident that when At is positive, we speak of a pincushion distortion. Similarly, when
At is negative, we speak of a barrel distortion.
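The parabolic bowing of an off-axis line image can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes that each Gaussian image point is displaced radially by c h′³, where c is a hypothetical lumped constant standing in for the distortion coefficient and its geometric factor, and the function names are illustrative only.

```python
def distort(x, y, c):
    """Displace a Gaussian image point (x, y) radially by c * h'^3,
    where h' is its distance from the axial image point (assumed model)."""
    h_sq = x * x + y * y          # h'^2
    scale = 1.0 + c * h_sq        # h' + c h'^3 = h' (1 + c h'^2)
    return x * scale, y * scale


def line_image_sag(h1, b, c):
    """Sag of P2'' measured from a line parallel to the Gaussian line image
    and passing through P1''. The Gaussian line is taken at height h1,
    with P1' at (0, h1) and P2' at (b, h1)."""
    _, y1 = distort(0.0, h1, c)   # P1''
    _, y2 = distort(b, h1, c)     # P2''
    return y2 - y1                # equals c * h1 * b**2 in this model
```

In this model the sag grows as b², vanishes for a line through the axis (h1 = 0), and changes sign with c, matching the pincushion and barrel cases described above.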
These polynomials are listed in Table 9-3 and may be obtained from the Zernike
polynomials given in Table 8-2. They are not orthogonal over a unit circle, but their
gradients, representing the ray aberrations, are orthogonal [4]. The polynomials
B40 (ρ) , B31 (ρ) cos θ , and B22 (ρ) cos 2θ represent balanced spherical aberration, coma, and
astigmatism, respectively, giving a minimum spot sigma.
$$\sigma_s = 2F \left\{ \sum_{n/2=1}^{\infty} 4n \left( b_n^0 \right)^2 + \sum_{m=1}^{\infty} \left[ m \left( b_m^m \right)^2 + \sum_{i=1}^{\infty} 2(2i+m) \left( b_{2i+m}^m \right)^2 \right] \right\}^{1/2} . \tag{9-64}$$
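Eq. (9-64) can be evaluated directly from a set of expansion coefficients. The sketch below is illustrative rather than definitive: it assumes the coefficients b_n^m are supplied in a dictionary keyed by (n, m), considers cosine terms only, and uses a hypothetical function name.

```python
import math


def spot_sigma(F, b):
    """Spot sigma from Eq. (9-64).

    F : focal ratio of the image-forming light cone.
    b : dict mapping (n, m) -> coefficient b_n^m of B_n^m(rho) cos(m theta).
    """
    total = 0.0
    for (n, m), coeff in b.items():
        if m < 0 or n < m or (n - m) % 2:
            raise ValueError(f"invalid indices n={n}, m={m}")
        if m == 0:
            total += 4 * n * coeff ** 2      # first sum: 4n (b_n^0)^2
        elif n == m:
            total += m * coeff ** 2          # m (b_m^m)^2
        else:
            total += 2 * n * coeff ** 2      # 2(2i+m) (b_{2i+m}^m)^2, n = 2i+m
    return 2 * F * math.sqrt(total)
```

For example, a defocus term with b_2^0 = 0.5 corresponds to ρ² − 1, i.e., one unit of defocus apart from piston, and spot_sigma(1.0, {(2, 0): 0.5}) evaluates to 2√2, in agreement with Eq. (9-70b).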
Figure 9-19. Images of a square grid in the presence of distortion. When the
distortion aberration coefficient At is positive, we obtain pincushion distortion, as in
(b). When At is negative, we obtain barrel distortion, as in (c). The dashed
squares represent the Gaussian image of the square object with a magnification of
−1.5.
9.4 Balanced Aberrations for the Minimum Spot Sigma
Table 9-3. Balanced wave aberration polynomials B_n^m(ρ) cos mθ for minimum spot
sigma σ_s.
n  m  B_n^m(ρ) cos mθ                          Aberration name
0  0  1                                        Piston
1  1  ρ cos θ                                  Tilt
2  0  2(ρ² − 1)                                Defocus
3  1  3(ρ³ − ρ) cos θ                          Primary coma
3  3  ρ³ cos 3θ
4  0  2(3ρ⁴ − 4ρ² + 1)                         Primary spherical
4  2  4(ρ⁴ − ρ²) cos 2θ                        Secondary astigmatism
4  4  ρ⁴ cos 4θ
5  1  5(2ρ⁵ − 3ρ³ + ρ) cos θ                   Secondary coma
5  3  5(ρ⁵ − ρ³) cos 3θ
5  5  ρ⁵ cos 5θ
6  0  2(10ρ⁶ − 18ρ⁴ + 9ρ² − 1)                 Secondary spherical
6  2  3(5ρ⁶ − 8ρ⁴ + 3ρ²) cos 2θ                Tertiary astigmatism
6  4  6(ρ⁶ − ρ⁴) cos 4θ
6  6  ρ⁶ cos 6θ
7  1  7(5ρ⁷ − 10ρ⁵ + 6ρ³ − ρ) cos θ            Tertiary coma
7  3  7(3ρ⁷ − 5ρ⁵ + 2ρ³) cos 3θ
7  5  7(ρ⁷ − ρ⁵) cos 5θ
7  7  ρ⁷ cos 7θ
8  0  2(35ρ⁸ − 80ρ⁶ + 60ρ⁴ − 16ρ² + 1)         Tertiary spherical
If an optical system is aberration free, the wavefront at its exit pupil corresponding to
a certain point object is spherical, and all of the object rays lying in the pupil plane
converge to the Gaussian image point. For an aberrated system, the wavefront is
nonspherical and the rays are distributed in a finite region of an image plane. This
distribution of rays is called a spot diagram. We first illustrate mapping of the zonal rays
from the pupil plane to the image plane for a primary aberration. We consider rays from
four zones of the exit pupil, namely, ρ = 1/4, 1/2, 3/4, and 1. In Figure 9-20, the rays
from these zones are indicated by different symbols so that they can be tracked in the
image plane.
Figure 9-21 illustrates the distribution of rays for spherical aberration in the Gaussian
or paraxial (Bd = 0), midway (Bd = −As), least-confusion (Bd = −(3/2)As), and
marginal (Bd = −2As) planes. We note that in the plane of least confusion, rays from
zones ρ = 1/2 and 1 arrive on the same circle. By definition, the marginal rays (ρ = 1)
intersect the optical axis at the marginal image point. The spot radius in the marginal
image plane corresponds to rays of zone ρ = 1/√3 = 0.577, and they are indicated by Δ in
the figure.
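The zonal statements above can be verified with a few lines of code. This is a sketch under stated assumptions: the ray height is taken as 2F ∂W/∂ρ for W = As ρ⁴ + Bd ρ², which reproduces the spot sizes quoted in this section, and the function names are hypothetical.

```python
import math


def ray_height(rho, bd_over_as):
    """Signed radial ray height in the observation plane, in units of F*As,
    for W = As rho^4 + Bd rho^2 (assumed ray aberration 2F dW/drho)."""
    return 8 * rho ** 3 + 4 * bd_over_as * rho


def spot_radius(bd_over_as, samples=2001):
    """Largest |ray height| over zones 0 <= rho <= 1, in units of F*As."""
    zones = (i / (samples - 1) for i in range(samples))
    return max(abs(ray_height(rho, bd_over_as)) for rho in zones)


# Spot radii in the Gaussian, midway, least-confusion, and marginal planes.
radii = {beta: spot_radius(beta) for beta in (0.0, -1.0, -1.5, -2.0)}

# In the marginal plane the extreme ray comes from the zone rho = 1/sqrt(3).
marginal_extreme = abs(ray_height(1 / math.sqrt(3), -2.0))
```

Under these assumptions, the zones ρ = 1/2 and ρ = 1 give heights of equal magnitude and opposite sign in the least-confusion plane, so their rays fall on the same circle.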
Figure 9-22 illustrates the distribution of rays for coma in the Gaussian image plane.
As in Figure 9-7, all rays lie in a cone of semiangle of 30° bounded by a circle of
marginal rays of radius 2FAc centered at ( 4 FAc , 0) .
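The geometry of the coma spot can be reproduced from the aberration function. The sketch below assumes that the ray aberration is 2F times the gradient of W = Ac ρ³ cos θ with respect to the normalized pupil coordinates; the function and variable names are illustrative only.

```python
import math


def coma_ray(rho, theta):
    """Image-plane point (x_i, y_i), in units of F*Ac, of the ray from the
    pupil point (rho, theta) for W = Ac rho^3 cos(theta)."""
    x = 2 * rho ** 2 * (2 + math.cos(2 * theta))
    y = 2 * rho ** 2 * math.sin(2 * theta)
    return x, y


# Marginal rays (rho = 1), sampled every degree around the pupil edge.
marginal = [coma_ray(1.0, math.radians(k)) for k in range(360)]

# Distance of each marginal ray from the point (4, 0).
distances = [math.hypot(x - 4.0, y) for x, y in marginal]

# Largest angle subtended at the Gaussian image point by a marginal ray.
semiangle_deg = max(math.degrees(math.atan2(y, x)) for x, y in marginal)
```

Each zone ρ maps to a circle of radius 2ρ² centered at (4ρ², 0), traversed twice as θ runs from 0 to 2π, which is why all rays stay inside the 30° cone.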
Figure 9-23 illustrates the ray distribution of various images for astigmatism. The
images shown are the (a) sagittal line, (b) least-confusion circle, (c) tangential line, and
(d) ellipse that is symmetrically opposite the least-confusion circle. The value of Bd
for these images is given by (Ad + Bd)/Aa = 0, −1/2, −1, and 1/2, respectively.
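The four astigmatic images follow from the semi-axes of the ray distribution. This is a minimal sketch, assuming W = Aa ρ² cos² θ + (Ad + Bd) ρ² and a ray aberration of 2F times the pupil gradient of W; the function name is hypothetical.

```python
def astigmatic_semi_axes(beta):
    """Semi-axes (along x_i, along y_i) of the spot, in units of F*Aa,
    where beta = (Ad + Bd)/Aa is the net defocus relative to astigmatism."""
    return 4 * abs(1 + beta), 4 * abs(beta)


images = {
    "sagittal line": astigmatic_semi_axes(0.0),
    "least-confusion circle": astigmatic_semi_axes(-0.5),
    "tangential line": astigmatic_semi_axes(-1.0),
    "opposite ellipse": astigmatic_semi_axes(0.5),
}
```

A semi-axis of zero marks a line image, and equal semi-axes mark the least-confusion circle.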
The ray distribution for field curvature alone in the Gaussian image plane is identical
to that for astigmatism in the plane of least confusion if Bd = Aa/2. Comparing Figures
9-21a, 9-22, and 9-23b, we note that rays of a given zone ρ lie on a circle whose
Figure 9-20. Zonal rays in the pupil plane corresponding to four zones: ρ = 1/4, 1/2,
3/4, and 1.
9.5 Spot Diagrams
Figure 9-21. Ray distribution for spherical aberration in (a) Gaussian, (b) midway,
(c) least-confusion, and (d) marginal image planes. The units of xi and yi are FAs.
In practice, the spot diagrams are obtained by tracing an array of object rays through
a system and determining their points of intersection with the image plane. They give a
qualitative description of the effects of an aberration. They do not, for example, bring out
the singularities of infinite irradiance of the aberrated PSFs, which are fortunately unreal
physically. A designer generally starts with rays that are distributed in a certain grid
pattern in the plane of the entrance pupil of the system. Figure 9-24 shows the ray grid
patterns in the pupil plane that are commonly used in practice. In Figure 9-24a, the rays
are distributed in a uniformly spaced square array, whereas in Figure 9-24b they are
distributed in a hexapolar array.
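The two pupil grids, and the spot diagram they produce, can be sketched as follows. This is an illustration under stated assumptions, not a ray trace of a real system: the rays are mapped to the image plane with the ray aberration 2F ∇W for W = As ρ⁴ + Bd ρ², and all function names are hypothetical.

```python
import math


def square_grid(n):
    """Points of a uniformly spaced (2n+1) x (2n+1) square array that lie
    inside the unit-radius pupil."""
    pts = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if i * i + j * j <= n * n:
                pts.append((i / n, j / n))
    return pts


def hexapolar_grid(rings):
    """Center point plus concentric rings; ring k has 6k equally spaced points."""
    pts = [(0.0, 0.0)]
    for k in range(1, rings + 1):
        r = k / rings
        for j in range(6 * k):
            t = 2 * math.pi * j / (6 * k)
            pts.append((r * math.cos(t), r * math.sin(t)))
    return pts


def spherical_spot(pupil_pts, bd_over_as=0.0):
    """Image-plane points, in units of F*As, for spherical aberration
    balanced with defocus Bd (assumed ray aberration 2F grad W)."""
    spot = []
    for x, y in pupil_pts:
        g = 8 * (x * x + y * y) + 4 * bd_over_as
        spot.append((g * x, g * y))
    return spot
```

Plotting spherical_spot(square_grid(n)) and spherical_spot(hexapolar_grid(n)) shows the grid symmetry imprinted on the spot, even though the aberration itself is radially symmetric.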
Figure 9-22. Ray distribution for coma in the paraxial image plane. The units of xi
and yi are FAc.
Figure 9-23. Ray distribution of various images for astigmatism: (a) sagittal, (b)
least confusion, (c) tangential, and (d) symmetrically opposite to least confusion. The
units of xi and yi are FAa.
Figure 9-24. Ray grid pattern in the pupil plane normalized by the pupil radius. (a)
Square grid of uniformly spaced points. (b) Hexapolar grid of concentric rings.
In the absence of any aberration, the spot diagram in a defocused image plane looks
exactly like the one in the pupil plane, except for its scale. The spot diagrams for
spherical aberration in various image planes considered above are shown in Figures 9-25
and 9-26. It is evident that, instead of the expected radial symmetry of the PSFs, a four-
fold symmetry is obtained in the case of the square grid of rays in the pupil plane, and
hexagonal symmetry in the case of the hexapolar grid. This is simply an artifact of the
Figure 9-25. Spot diagrams for spherical aberration in various image planes for a
square grid of rays: (a) Gaussian (Bd/As = 0), (b) midway (Bd/As = −1), (c) least
confusion (Bd/As = −1.5), and (d) marginal (Bd/As = −2). The spot sizes are in units
of FAs. The PSFs are four-fold symmetric, instead of being radially symmetric,
because of the square grid of rays in the pupil plane.
Figure 9-26. Spot diagrams for spherical aberration in various image planes for a
hexapolar grid of rays: (a) Gaussian (Bd/As = 0), (b) midway (Bd/As = −1), (c) least
confusion (Bd/As = −1.5), and (d) marginal (Bd/As = −2). The spot sizes are in units
of FAs. The PSFs are six-fold symmetric, instead of being radially symmetric,
because of the hexapolar grid of rays in the pupil plane.
ray grid used in the pupil plane. As in the case of defocus, the PSF for astigmatism is also
uniform. Thus, the spot diagram for it also looks like the input array across an elliptical
spot, which reduces to a circle or a line depending on the amount of balancing defocus.
The spot diagrams for coma are shown in Figure 9-27. Only the chief ray passes through
the Gaussian image point, which is shown with coordinates (0, 0) in the figure. Note that
the two grids yield different results near the top of the spot.
Figure 9-27. Spot diagrams for coma in units of FAc for (a) square and (b) polar
array of rays in the pupil plane. Only the chief ray passes through the Gaussian
image point, which is shown to lie at (0, 0).
9.6 Aberration Tolerance and a Golden Rule of Optical Design
It is common practice in lens design to look at the spot diagrams in the early stages
of a design, in spite of the fact that they do not represent reality. As discussed in Section
6.8.2, the aberration-free image of a point object is the Airy pattern. As the aberration
increases, the geometrical and diffraction PSFs begin to increasingly resemble each other.
In the diffraction treatment [2], an optical system is considered practically
diffraction limited if the peak (or peak-to-valley) aberration is less than λ/4 (Rayleigh’s
quarter-wave rule), or if the standard deviation of the aberration across the exit pupil is
less than λ/14 (Maréchal’s criterion). Similarly, optical designers consider a system to be
close to its diffraction limit if the ray spot radius is less than or equal to the radius
1.22λF of the Airy disc. We note, for example, that this holds for spherical aberration in
the Gaussian image plane if As ≤ 0.15λ, although a larger value of As is obtained in the
other image planes. Considering that the long dimension of the coma spot is 6FAc and
the line image for astigmatism is 8FAa long, the aberration tolerance for the spot size to
be smaller than the Airy disc is Ac < 0.4λ and Aa < 0.3λ, respectively. The aberration
tolerances based on the spot size are summarized in Table 9-4. These tolerances, although
larger than λ/4, are roughly consistent with Rayleigh’s quarter-wave rule. Thus, it is
reasonable to use the size of the spot diagrams as a qualitative measure of quality of the
design until it becomes smaller than the Airy disc. This yields a golden rule of optical
design of using spot diagrams until their size is approximately equal to that of the Airy
disc, and then analyzing the system by its aberration variance and diffraction
characteristics, such as the aberrated diffraction PSF or the modulation transfer function.
The depth of focus (giving the tolerance on the location of the plane for observing the
image) can be determined from Eq. (9-51). Thus, the aberration tolerance is Bd ≲ 0.3λ
for a spot radius smaller than or equal to that of the Airy disc, which, in turn, implies a
depth of focus of 2.4λF². This is roughly consistent with a value of 2λF² obtained
according to Rayleigh’s quarter-wave rule. The corresponding depth of field (giving the
tolerance on the object location for a fixed observation plane) can be determined from the
depth of focus by using Eq. (2-77) for the longitudinal magnification. Similarly,
distortion tolerance for a certain amount of line-of-sight error can be obtained from Eq.
(9-55) by replacing At by Bt .
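The quoted tolerances can be reproduced with simple arithmetic. The sketch below is an interpretation, not the author's calculation: it assumes that the spherical and defocus spot radii (8FAs and 4FBd) are compared with the Airy disc radius 1.22λF, that the full coma and astigmatism spot lengths (6FAc and 8FAa) are compared with the Airy disc diameter, and that the depth of focus is 8F²Bd. The names are hypothetical.

```python
AIRY_RADIUS = 1.22  # Airy disc radius in units of lambda * F


def tolerance(extent_per_unit_aberration, compare_to_diameter=False):
    """Aberration coefficient, in units of lambda, at which the spot extent
    (given in units of F per unit coefficient) equals the Airy disc size."""
    limit = 2 * AIRY_RADIUS if compare_to_diameter else AIRY_RADIUS
    return limit / extent_per_unit_aberration


tolerances = {
    "spherical": tolerance(8.0),                # spot radius 8 F As (Gaussian plane)
    "coma": tolerance(6.0, True),               # spot length 6 F Ac
    "astigmatism": tolerance(8.0, True),        # line length 8 F Aa
    "defocus": tolerance(4.0),                  # spot radius 4 F Bd
}

# Depth of focus in units of lambda * F^2, using 8 F^2 Bd.
depth_of_focus = 8.0 * tolerances["defocus"]
```

With these assumptions the values come out near 0.15λ, 0.4λ, 0.3λ, and 0.3λ, and the depth of focus is about 2.4λF².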
A positive value of longitudinal spherical aberration implies that the marginal image
corresponding to Bd = − 2 As lies farther from the exit pupil than the Gaussian image.
The circle of least confusion lies in a plane that is 3/4 of the way from the paraxial to the
marginal image plane.
9.7.2 Coma (Ac ρ³ cos θ)
Sagittal coma = 2FAc . (9-66a)
9.7.3 Astigmatism and Field Curvature (Aa ρ² cos² θ + Ad ρ²)
Full length of sagittal focal line = 8FAa . (9-67a)
This line is centered on the chief ray at a distance ΔRs = 8F²Ad from the Gaussian image
point and lies along the xi axis.
This circle is centered on the chief ray and lies in a plane that is midway between the
sagittal and tangential focal line images. It is referred to as the best image.
The radii of curvature of the sagittal, tangential, Petzval, and best-image surfaces are
given by
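$$R_s = \frac{h'^2}{16 F^2 A_d} , \tag{9-68a}$$
$$R_t = \frac{h'^2}{16 F^2 (A_d + A_a)} , \tag{9-68b}$$
$$\frac{2}{R_p} = \frac{3}{R_s} - \frac{1}{R_t} \tag{9-69a}$$
$$\phantom{\frac{2}{R_p}} = \frac{16 F^2}{h'^2} (2 A_d - A_a) , \tag{9-69b}$$
and
$$R_b = \frac{h'^2}{8 F^2 (A_a + 2 A_d)} . \tag{9-69c}$$
Moreover,
$$\frac{1}{R_b} = \frac{1}{2} \left( \frac{1}{R_s} + \frac{1}{R_t} \right) . \tag{9-69d}$$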
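The four radii of curvature can be computed together from Eqs. (9-68) and (9-69). The following is a minimal sketch with a hypothetical function name; all lengths are assumed to be in the same unit, and the curvatures are formed first so that a flat surface is reported as an infinite radius.

```python
def image_surface_radii(h, F, A_a, A_d):
    """Radii of curvature of the sagittal, tangential, Petzval, and
    best-image surfaces from Eqs. (9-68a) to (9-69d)."""
    k = 16 * F ** 2 / h ** 2
    curvatures = {
        "sagittal": k * A_d,                    # 1/Rs, Eq. (9-68a)
        "tangential": k * (A_d + A_a),          # 1/Rt, Eq. (9-68b)
        "petzval": 0.5 * k * (2 * A_d - A_a),   # 1/Rp, Eq. (9-69b)
        "best": 0.5 * k * (A_a + 2 * A_d),      # 1/Rb, Eq. (9-69c)
    }
    return {name: (1.0 / c if c != 0 else float("inf"))
            for name, c in curvatures.items()}


# Illustrative values only (arbitrary consistent length unit).
radii = image_surface_radii(h=1.0, F=5.0, A_a=1e-4, A_d=2e-4)
```

The result can be checked against Eq. (9-69a), 2/Rp = 3/Rs − 1/Rt, and Eq. (9-69d), which places the best-image curvature midway between the sagittal and tangential curvatures.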
If only field curvature is present, then an image of radius 4FAd is obtained in the
Gaussian image plane. The image reduces to a point if it is observed in an image plane at
a distance ΔR = 8F²Ad from the Gaussian image plane.
and
$$\sigma_s = 2\sqrt{2}\, F A_d , \tag{9-70b}$$
respectively.
$$= A_t / a , \tag{9-71}$$
where a is the radius of the exit pupil, and it represents the line-of-sight error in the
position of the image or the object point. The image point lies at (2 FAt , 0) relative to the
Gaussian image point.
$$A_i = \begin{cases} 0.16\lambda , & \text{Spherical} \\ 0.4\lambda , & \text{Coma} \\ 0.3\lambda , & \text{Astigmatism or defocus} . \end{cases} \tag{9-72}$$
If an aberration is balanced with another, the standard deviation of the aberration and
the spot size are not minimized for the same amount of the balancing aberration. For
example, when spherical aberration As ρ⁴ is balanced with defocus Bd ρ², the standard
deviation is minimized when Bd = −As, but the spot radius is minimized when
Bd = −1.5As. When astigmatism Aa ρ² cos² θ is balanced with defocus, the standard
deviation and spot radius are both minimized when Bd = −0.5Aa. The depth of focus is
given by 8F²Bd, and dividing it by Mt² gives the depth of field, where Mt is the
magnification of the image.
REFERENCES
1. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics,
SPIE Press, Bellingham, WA (1998) [doi: 10.1117/3.265735].
PROBLEMS
9.1 Consider Problem 2.5, imaging a slide by a thin lens. (a) Determine the depth of
focus for a defocus aberration of 0.3 λ , giving the tolerance on the distance
between the lens and the screen, i.e., on the location of the screen. (b) What is the
corresponding tolerance on the distance between the slide and the lens, i.e., on the
location of the slide?
9.2 Sketch the geometrical PSF of a system with a uniformly illuminated circular exit
pupil aberrated by spherical aberration W(ρ) = As ρ⁴ in the Gaussian, marginal,
least-confusion, and midway image planes for As = 1λ, λ = 0.5 μm, and F = 10,
and total image power of 1 W. Give the location of these image planes with respect
to the Gaussian image plane. Calculate the size and sigma value of the image spot
in these planes.
9.3 Consider the imaging system of Problem 7.2, except that it is aberrated by
astigmatism W(ρ, θ) = Aa ρ² cos² θ, where Aa = λ/4. Calculate the size, location,
and irradiance of the tangential, sagittal, and least-confusion images of a point
object.
9.4 Consider an imaging system forming the image of a point object at a distance of
15 cm from the plane of its exit pupil at a height of 0.2 cm from its optical axis. Let
the image be aberrated by λ/4 each of astigmatism and field curvature. If the
radius of the exit pupil is 1 cm, determine and sketch the tangential, sagittal, and
Petzval image surfaces for λ = 0.5 μm .
9.5 Sketch the pattern of the image of the point object considered in Problem 7.4 if it is
aberrated by coma given by W(ρ, θ) = Ac ρ³ cos θ, where Ac = λ/4. Illustrate the
tangential and sagittal coma on this sketch. Determine the spot sigma and centroid
of the image spot.
9.6 Sketch the pattern of the image of a point object aberrated by secondary coma
A5 ρ⁵ cos θ, where A5 is the peak value of the aberration. Illustrate the tangential
and sagittal coma on the sketch for F = 4 and A5 = 1.5λ, where λ = 3 μm. Also,
determine the centroid of the image and its sigma value.
EPILOGUE
E1 Introduction
E2 Principles of Geometrical Optics and Imaging
E3 Ray Tracing: Exact and Paraxial
E4 Gaussian Optics
    E4.1 Tangent Plane or Paraxial Surface
    E4.2 Sign Convention
    E4.3 Cardinal Points
    E4.4 Graphical Imaging
    E4.5 Lagrange Invariant
    E4.6 Matrix Approach to Gaussian Imaging
    E4.7 Petzval Image
    E4.8 Field of View
    E4.9 Chromatic Aberrations
E5 Image Brightness
E6 Image Quality
    E6.1 Wave and Ray Aberrations
    E6.2 Primary Aberrations
    E6.3 Spot Size and Aberration Balancing
    E6.4 Strehl Ratio and Aberration Balancing
E7 Reflecting Systems
E8 Anamorphic Imaging Systems
E9 Aberration Tolerance and a Golden Rule of Optical Design
E10 General Comments
References
Epilogue
E1 INTRODUCTION
We give a brief summary of the imaging process with emphasis on its salient
features, and outline the next steps within and beyond geometrical optics. The numbers
given in parentheses are the section numbers where a particular topic is discussed.
When a point object is imaged by an imaging system, a portion of the spherical wave
originating at the object is intercepted by the system. It propagates through the system,
and if a spherical wave exits from it, a perfect point image is formed at the center of
curvature of this converging spherical wave. If rays are traced from the point object
toward and through the imaging system, they exit from the system and converge to the
image point. Thus, a diverging spherical wavefront with its center of curvature at the
point object is converted by the imaging system into a spherical wavefront converging to
the perfect image point. The optical path lengths of the rays from the point object to the
image point are equal to each other. With few exceptions, the actual shape of the
wavefront emerging from the system is generally not spherical, indicating an aberrated
image.
When the rays make small angles with the optical axis and surface normals, their
sines and tangents can be approximated by the angles themselves. Similarly, if the
transverse coordinates ( x , y ) of a point on a refracting or a reflecting surface with its
symmetry axis along the z axis are much smaller than its radius of curvature, we can
neglect the sag of the surface, and approximate the diagonal distance between two points
by the corresponding axial distance. The ray tracing carried out under such assumptions is
called paraxial ray tracing (1.7). Under such ray tracing, the equations for the transverse
coordinates of a point on a ray are no longer coupled. Moreover, the projections of a skew
ray in the zx and yz planes propagate independently of each other. Consequently, for a
rotationally symmetric imaging system, we need to trace rays only in one of these planes.
This is generally done in the tangential plane, i.e., the plane containing the point object
and the optical axis. A ray incident in the tangential plane remains in this plane after its
refraction or reflection by an element of the system, and, therefore, by the entire system.
This is also true for the exact ray.
E4 GAUSSIAN OPTICS
E4.1 Tangent Plane or Paraxial Surface
The paraxial ray-tracing equations are used to determine the location and size of the
image formed by an imaging system in terms of the object location and size. The image
thus obtained is referred to as the Gaussian image, and the process of determining the
image in this manner, regardless of the magnitude of the angles and sizes, is called
Gaussian optics. Because of paraxial ray tracing, the curved refracting or reflecting
surface is replaced by a planar surface passing through its vertex, called the tangent plane
or the paraxial surface (1.8.2). Only the vertex radius of curvature of the surface is
utilized in the imaging equations. The use of the tangent plane implies, for example, that
there is no distinction between the Gaussian image formed by a spherical surface of a
certain radius of curvature and a conic surface with the same vertex radius of curvature.
An object and its corresponding image are referred to as conjugates of each other because
one is the image of the other. The Gaussian image is aberration free by definition. The
aberrations of an actual image are determined separately as the next step to evaluate the
quality of the image.
points, called the cardinal points of the system: two principal points, two focal points, and
two nodal points. Only three of the six cardinal points are independent (2.4.2). The
principal points are conjugates of each other, and so are the nodal points. If the refractive
indices of the object and image spaces are equal, which is often the case in practice, then
the nodal points coincide with the corresponding principal points, and the object- and
image-space focal lengths are equal in magnitude.
Once the cardinal points are known, the system can be replaced by them regardless
of its complexity. The object and image distances are measured from the respective
principal points, which correspond to conjugate planes of unity transverse magnification.
Similarly, the focal lengths represent the distances of the focal points from the respective
principal points. The two nodal points correspond to unity angular magnification. The
principal and the nodal points of a thin lens (in air) coincide with its center. The principal
points of a refracting surface coincide with its vertex, and its nodal points coincide with
its center of curvature. The imaging equation for any imaging system is similar to that for
a single refracting surface.
lateral color. A system is considered achromatic if both the axial and lateral colors are
zero.
E5 IMAGE BRIGHTNESS
Once an image of suitable location and size has been obtained, the next step is to
determine its brightness. This is done by determining the aperture stop and its images, the
entrance and exit pupils in the object and image spaces of the system. Rays with
increasingly larger cone angles are incident on the system to determine the aperture in the
system that most severely limits the solid angle of the transmitted rays (5.2.2). Such ray
tracing is also used to determine the size of the imaging elements or the obscurations in
imaging systems. Having obtained the aperture stop, the entrance and exit pupils are
obtained by using the Gaussian imaging equations. The light cone from a point object that
enters the system is limited by the entrance pupil. Similarly, the light cone that exits from
the system and converges onto the image point is limited by the exit pupil. The chief ray
from the edge of an object determines the location of the exit pupil and the height of the
image. Similarly, the marginal ray from the axial point of the object determines the size
of the exit pupil and the location of the axial image point.
The intensity of the image of a point object varies as the cube of the cosine of its
angle from the optical axis (5.3). The irradiance of the image of an extended object
decreases as the fourth power of the cosine of the angle of an object element from the
optical axis (5.4.6). For visual observations, as in the case of, e.g., telescopes and microscopes, the
spectral response of the human eye is taken into account. As the point object moves off-
axis, at some position, some of the rays intercepted by the entrance pupil begin to be
vignetted or blocked by one or another element. The aperture stop, which is circular for
the axial point object, becomes nearly elliptical with a corresponding reduction in the
transmitted flux.
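The off-axis falloff described above can be tabulated with two one-line functions. This is a sketch only: it assumes the cosine-cubed dependence for the image of a point object and the cosine-fourth dependence for the image of an extended object, normalized to unity on axis, with hypothetical function names and no vignetting.

```python
import math


def relative_point_image_intensity(theta):
    """Relative intensity of the image of a point object at field angle
    theta (radians): cos^3(theta)."""
    return math.cos(theta) ** 3


def relative_extended_image_irradiance(theta):
    """Relative irradiance of the image of an extended-object element at
    field angle theta (radians): cos^4(theta)."""
    return math.cos(theta) ** 4
```

At a field angle of 60°, for example, these give 1/8 and 1/16 of the on-axis values, respectively.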
E6 IMAGE QUALITY
E6.1 Wave and Ray Aberrations
In Gaussian optics, all of the object rays from a certain point object transmitted by a
system pass through the Gaussian image point. The imaging system is assumed to convert
the spherical wavefront diverging from the point object into a spherical wavefront
converging to the Gaussian image point. In reality, however, when the rays are traced
exactly (instead of paraxially), they generally do not converge to an image point. Instead
they intersect the image plane at various points in a small region in the vicinity of the
Gaussian image point in the form of a spot diagram, indicating that the exiting wavefront
is not spherical, or that it is aberrated. These aberrations determine the quality of the
image, as discussed in Sections E6.3 and E6.4.
The wave aberrations of the image of a point object are obtained by tracing rays from
the point object through the system and up to its exit pupil such that each one travels an
optical path length equal to that of the chief ray (i.e., the one passing through the center of
the pupil). The surface passing through the end points of the rays is the system wavefront
for the point object under consideration. If the wavefront is spherical, with its center of
curvature at the Gaussian image point, we obtain a perfect image point. The rays
transmitted by the system in that case have equal optical path lengths in propagating from
the object point to the Gaussian image point, and they all pass through the image point. If,
however, the actual wavefront deviates from the spherical wavefront, called the Gaussian
reference sphere, the image is aberrated (8.2.1). The rays do not have equal optical
path lengths, and they intersect the Gaussian image plane in the vicinity of the Gaussian
image point. The ( x , y ) separations of the intersection point of a ray from the Gaussian
image point are called its transverse ray aberrations, and they are positive or negative
according to the Cartesian sign convention. The wave aberration of a ray from a point
object is positive if it travels an extra optical path length, compared to the chief ray, in
order to reach the Gaussian reference sphere (see Reference 1 in Chapter 8).
There are five aberrations of fourth order in object (or image) and pupil coordinates,
referred to as the primary or the Seidel aberrations, namely, spherical aberration, coma,
astigmatism, field curvature, and distortion (8.5). The primary wave aberrations of a
multisurface system are additive in the sense that they can be obtained by adding the
primary wave aberrations of the surfaces, where the Gaussian image of a point object
formed by one surface becomes the point object for the next surface (8.6.2). Thus, by
knowing the primary aberrations of a refracting surface, the aberrations of a single lens,
for example, can be obtained. Similarly, by knowing the primary aberrations of a
reflecting surface, the aberrations of a two-mirror astronomical telescope can be obtained.
The higher-order aberrations, e.g., secondary or Schwarzschild aberrations, cannot be
obtained in this manner. To obtain the higher-order aberrations of a surface, the effect of
the aberrations of the image formed by the previous surface must be taken into account.
New aberrations arise when the system is perturbed so that one or more of its imaging
elements is decentered and/or tilted, and the system loses its rotational symmetry. These
aberrations have different dependencies on the object height but the same dependence on
the pupil coordinates as the aberrations of the unperturbed system (see Chapter 7 in
Reference 1).
Although the transverse ray aberrations of a system for a certain point object can be
obtained by tracing the rays through the system and up to the image plane, they can also
be obtained from the wave aberrations. The ray aberrations are not additive in that those
in the final image plane cannot be obtained by adding their values in the intermediate
image planes formed by the surfaces of a system. Of course, the contribution of a surface
to the ray aberration in the final image plane can be obtained from its wave aberration
using the parameters of the final image (8.6.3).
Because of the variation of the refractive index of a transparent substance with the
wavelength, the optical path length of a ray passing through it also depends on the
wavelength. Accordingly, the monochromatic aberrations of a refracting system also vary
with the wavelength. However, this variation is generally small, especially for a narrow
spectral bandwidth. It is calculated by exact ray tracing of the system. Of course, a
reflecting system is achromatic.
When some of the rays in the spot diagram are concentrated in a small area and the
others are scattered over a large area, as in the case of coma, the quantity of interest is the
standard deviation or the spot sigma of the ray distribution. The amount of the balancing
aberration for the minimum value of spot sigma is different. For example, the defocus
aberration that minimizes the spot sigma is −4/3 times the amount of spherical
aberration. As the criterion for balancing changes, so does the amount of the balancing
aberration.
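The factor of −4/3 can be checked with a short calculation. The lines below are a sketch under stated assumptions: the aberration is W = A_s ρ⁴ + B_d ρ² over a uniformly illuminated circular pupil, the ray height is 2F ∂W/∂ρ, and ⟨·⟩ denotes an average over the pupil, so that ⟨ρ^k⟩ = 2/(k + 2).

```latex
\begin{aligned}
\sigma_s^2 &= (2F)^2 \left\langle \left( 4A_s\rho^3 + 2B_d\rho \right)^2 \right\rangle
            = (2F)^2 \left( 16A_s^2 \langle\rho^6\rangle + 16A_sB_d \langle\rho^4\rangle + 4B_d^2 \langle\rho^2\rangle \right) \\
           &= (2F)^2 \left( 4A_s^2 + \tfrac{16}{3} A_s B_d + 2B_d^2 \right) , \\
\frac{\partial \sigma_s^2}{\partial B_d} &= (2F)^2 \left( \tfrac{16}{3} A_s + 4B_d \right) = 0
  \;\Rightarrow\; B_d = -\tfrac{4}{3} A_s , \qquad
  \sigma_{s,\min} = \tfrac{4}{3} F A_s .
\end{aligned}
```

The same defocus ratio appears in the balanced polynomial 2(3ρ⁴ − 4ρ² + 1) for primary spherical aberration, and it differs from the value Bd = −1.5As that minimizes the spot radius.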
The reason for the widespread use of Zernike circle polynomials in wavefront
analysis is that they are not only orthogonal over a circular pupil, but they also represent
balanced classical aberrations for such pupils (8.8). These polynomials are separable in
polar coordinates of a pupil point. The aberrations in the form of these polynomials are
referred to as the orthogonal aberrations. The coefficients of the classical aberrations can
be obtained from those of the orthogonal aberrations (8.9).
E7 REFLECTING SYSTEMS
Generally, the refractive index of the medium for imaging by a reflecting surface is
unity. The ray-tracing equations (exact as well as paraxial) for a reflecting surface can be
obtained from the corresponding equations for a refracting surface by letting the
refractive index associated with the reflected ray be equal in magnitude and opposite in sign to that
associated with the incident ray (1.6). The opposite sign accounts for the backward
propagation of the reflected ray compared to that of the incident ray. The imaging and
wave aberration equations for a reflecting surface can be obtained in a similar manner
from the corresponding equations for a refracting surface. Although it is convenient to
use the equations for a refracting surface to obtain the corresponding equations for a
reflecting surface, the physical insight is lost in so doing. That is why Gaussian imaging
by a reflecting system is discussed in this book on an equal basis as a refracting system.
The wave aberrations can also be balanced to give the smallest ray spot size, e.g., the
circle of least confusion in the case of spherical aberration or astigmatism, or the smallest
standard deviation of the ray distribution (often incorrectly called the root-mean-square
radius). It should be evident that if an aberration is balanced with another, the standard
deviation of the aberration and the spot size are not minimized for the same amount of the
balancing aberration. An exception is astigmatism, which, when balanced with defocus,
yields minimum variance as well as the smallest spot size (9.3.3).
It is common practice in lens design to look at the spot diagrams in the early stages
of a design, in spite of the fact that they do not represent reality. For example, based on
diffraction, the aberration-free image of a point object is the Airy pattern (6.8.2), but it is
a point only according to geometrical optics. So why do lens designers use spot
diagrams? The reason is not only that spot diagrams are easy to generate, but also that,
with increasing aberration, the geometrical and diffraction PSFs increasingly
resemble each other. Just as in the diffraction treatment an optical system is considered
practically diffraction limited if the peak (or peak-to-valley) aberration is less than λ/4
(Rayleigh’s quarter-wave rule) or if the standard deviation of the aberration across the
exit pupil is less than λ/14 (Maréchal’s criterion) (6.8.3), optical designers
consider a system to be close to its diffraction limit if the ray spot radius is less than or
equal to the radius of the Airy disc.
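The three criteria in this paragraph can be written as simple checks. The sketch below assumes a circular exit pupil, for which the Airy disc radius is 1.22 λF, where F is the focal ratio of the image-forming beam. Aberrations are expressed in units of wavelength. The Strehl estimate exp[-(2πσ)^2] is a small-aberration approximation added here for illustration; it is not quoted from this passage. All function names are hypothetical.

```python
import math


def airy_disc_radius(wavelength, f_number):
    """Radius of the Airy disc for a circular pupil: 1.22 * lambda * F."""
    return 1.22 * wavelength * f_number


def meets_rayleigh(peak_aberration_waves):
    """Rayleigh's quarter-wave rule: peak aberration below lambda/4."""
    return abs(peak_aberration_waves) < 0.25


def meets_marechal(sigma_waves):
    """Marechal's criterion: aberration standard deviation below lambda/14."""
    return sigma_waves < 1.0 / 14.0


def strehl_estimate(sigma_waves):
    """Approximate Strehl ratio for a small aberration (assumed formula)."""
    return math.exp(-(2.0 * math.pi * sigma_waves) ** 2)


def spot_within_airy_disc(spot_radius, wavelength, f_number):
    """Designer's rule: ray spot radius no larger than the Airy disc radius."""
    return spot_radius <= airy_disc_radius(wavelength, f_number)


if __name__ == "__main__":
    wavelength_um, f_number = 0.5, 8.0
    print("Airy disc radius (um):", airy_disc_radius(wavelength_um, f_number))
    print("Strehl estimate at sigma = lambda/14:", strehl_estimate(1.0 / 14.0))
    print("4 um spot acceptable:", spot_within_airy_disc(4.0, wavelength_um, f_number))
```

With the assumed Strehl formula, a standard deviation of λ/14 corresponds to a Strehl ratio of about 0.8, the value usually associated with Maréchal's criterion.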
The aberration tolerances based on the spot size are roughly consistent with
Rayleigh’s quarter-wave rule. Similarly, the depth of focus (giving the tolerance on the
location of the plane for observing the image) based on a spot radius smaller than or equal
to that of the Airy disc is roughly consistent with its value obtained according to
Rayleigh’s quarter-wave rule. The corresponding depth of field (giving the tolerance on
the object location for a fixed observation plane) can be obtained from the depth of focus
by using the longitudinal magnification. Accordingly, it is reasonable to use the size of
the spot diagrams as a qualitative measure of the quality of the design until it becomes
smaller than the Airy disc. This yields a golden rule of optical design: a designer
may strive for spot diagrams of a size nearly equal to that of the Airy disc, and then
analyze the system performance by its aberration variance and diffraction characteristics,
such as the aberrated diffraction point-spread function (PSF) or the modulation transfer
function (MTF) (9.6).
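The rough agreement between the two depth-of-focus estimates can be made explicit. The sketch below relies on three standard paraxial approximations, stated here as assumptions: a longitudinal shift Δz of the observation plane gives a peak defocus wave aberration of about Δz/(8F^2), it gives a geometrical blur radius of about |Δz|/(2F), and the longitudinal magnification is (n'/n) M^2 for a transverse magnification M. The function names are hypothetical.

```python
def depth_of_focus_rayleigh(wavelength, f_number):
    """Half-range |dz| for which the defocus dz/(8 F^2) stays within lambda/4."""
    return 2.0 * wavelength * f_number**2


def depth_of_focus_spot(wavelength, f_number):
    """Half-range |dz| for which the blur radius |dz|/(2F) stays within 1.22 lambda F."""
    return 2.44 * wavelength * f_number**2


def longitudinal_magnification(transverse_magnification, n_object=1.0, n_image=1.0):
    """Longitudinal magnification (n'/n) * M^2 for small axial displacements."""
    return (n_image / n_object) * transverse_magnification**2


def depth_of_field(depth_of_focus, transverse_magnification, n_object=1.0, n_image=1.0):
    """Object-space tolerance corresponding to a given image-space depth of focus."""
    return depth_of_focus / longitudinal_magnification(
        transverse_magnification, n_object, n_image
    )


if __name__ == "__main__":
    wavelength_mm, f_number = 0.5e-3, 8.0
    dz_rayleigh = depth_of_focus_rayleigh(wavelength_mm, f_number)
    dz_spot = depth_of_focus_spot(wavelength_mm, f_number)
    print("depth of focus, Rayleigh (mm):", dz_rayleigh)
    print("depth of focus, spot size (mm):", dz_spot)
    print("depth of field at M = -0.1 (mm):", depth_of_field(dz_rayleigh, -0.1))
```

Under these assumptions the two estimates differ only by the factor 1.22, independent of wavelength and focal ratio, which is the sense in which they are roughly consistent.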
Gaussian image, thus yielding the fact that the Gaussian images formed by conic and
spherical surfaces of the same radius of curvature are identical. The distinction between
the Gaussian and Petzval images should also be understood. Paraxial ray tracing is used to
determine the aperture stop and thereby the entrance and exit pupils and, in turn, the
irradiance of the image in terms of the radiance of the object. It is also used to determine
the approximate size of the imaging elements, obscurations in mirror systems, vignetting
of rays as the object moves increasingly off axis, and the resulting change in the shape of
the pupil.
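One way to see how paraxial ray tracing identifies the aperture stop is to trace a ray from the axial object point through the system and compare its height at each element with that element's semi-aperture; the element with the smallest ratio limits the beam. The sketch below assumes thin elements in air, each described by a power and a semi-aperture, and uses the common thin-element trace u' = u - y φ with the transfer y' = y + u' t. The data layout and the function name are illustrative, and the notation may differ from that of the text.

```python
def find_aperture_stop(object_distance, elements, spacings, slope=1.0e-3):
    """Return (index of the aperture stop, list of aperture-to-ray-height ratios).

    object_distance: distance from the axial object point to the first element (> 0).
    elements: list of (power, semi_aperture) pairs for thin elements in air;
              a bare diaphragm has power 0.
    spacings: distances between consecutive elements (len(elements) - 1 values).
    slope: arbitrary small starting slope of the traced ray.
    """
    u = slope
    y = u * object_distance                  # ray height at the first element
    ratios = []
    for i, (power, semi_aperture) in enumerate(elements):
        ratios.append(semi_aperture / abs(y) if y != 0.0 else float("inf"))
        u = u - y * power                    # thin-element refraction
        if i < len(spacings):
            y = y + u * spacings[i]          # transfer to the next element
    stop_index = min(range(len(ratios)), key=lambda k: ratios[k])
    return stop_index, ratios


if __name__ == "__main__":
    # A lens of focal length 100 and semi-aperture 10, followed 50 units later
    # by a diaphragm of semi-aperture 3; the object is 200 units in front.
    elements = [(0.01, 10.0), (0.0, 3.0)]
    index, ratios = find_aperture_stop(200.0, elements, [50.0])
    print("aperture stop is element", index, "with ratios", ratios)
```

In this example the diaphragm limits the axial beam and is therefore the aperture stop; enlarging its semi-aperture to 9 makes the lens rim the stop instead. The entrance and exit pupils would then follow as the images of the stop in object and image space.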
It is important to work on the problems given at the end of each chapter, because
they are extensions of the theory given in the text or, more often, applications of the
theory. They are an essential part of the book because only by working through such
problems can one appreciate the theory and validate one's understanding of it. Having tools is
not enough; one must also know how to use them. Only by working the problems can
readers gauge their aptitude. The use of computer software is discouraged until the basic
concepts of Gaussian imaging are thoroughly understood.
The next step beyond Gaussian optics is to determine the image quality, and that
requires exact ray tracing to determine the aberrations. The understanding of the primary
aberrations is of paramount importance, because they can be the dominant aberrations in
the early stages of a design. Once one can solve simple problems that use Gaussian
optics, paraxial ray tracing, and graphical imaging, one is ready to tackle complex
problems by using commercially available optical design and analysis software, such
as CODE V, ZEMAX, SYNOPSYS, and OSLO.
A lens designer designs an imaging system so that it can form an image of a certain
size at a certain location, given the size and the location of the object. Given the radiance
of an extended object or the intensity of a point object, the designer chooses the size of
the imaging elements that will yield an image of some prescribed irradiance or intensity.
Gaussian optics is also used to determine the extent of the object that can be imaged, i.e.,
it is used to determine the field of view of the system. A designer must also choose the
shapes and materials of the imaging elements to balance their chromatic and
monochromatic aberrations to yield an image of acceptable quality across the field of
view of the system. However, the task of a designer is not finished until a system is
fabricated, assembled, and tested.
Virendra N. Mahajan was born in Vihari, Pakistan, and educated in India and the
United States. He received his Ph.D. degree in optical sciences from the College of
Optical Sciences, University of Arizona. He spent nine years at the Charles Stark Draper
Laboratory in Cambridge, Massachusetts, where he worked on space optical systems.
Since 1983, he has been at The Aerospace Corporation in El Segundo, California, where
he is a distinguished scientist working on space-based surveillance systems. Parts I and II
of Optical Imaging and Aberrations evolved out of a graduate course he taught as an
adjunct professor in the Electrical Engineering-Electrophysics department at the
University of Southern California. Dr. Mahajan is an adjunct professor in the College of
Optical Sciences at the University of Arizona, and the Department of Optics and
Photonics at the National Central University in Taiwan, where he teaches graduate
courses on imaging and aberrations. He also teaches short courses on aberrations at
meetings of the Optical Society of America and SPIE. He has published numerous papers
on diffraction, aberrations, wavefront analysis, adaptive optics, and acousto-optics. He is a
fellow of OSA, SPIE, and the Optical Society of India. He is an associate editor of
OSA’s 3rd edition of the Handbook of Optics, and a recipient of SPIE’s Conrady award.
He has served as a Topical Editor of Optics Letters, chairman of OSA’s Astronomical,
Aeronautical, and Space Optics technical group, and a member of several committees of
both OSA and SPIE. Dr. Mahajan is the author of Aberration Theory Made Simple, 2nd
ed. (2011), editor of Selected Papers on Effects of Aberrations in Optical Imaging (1994),
and author of Optical Imaging and Aberrations, Part I: Ray Geometrical Optics (1998),
Part II: Wave Diffraction Optics, 2nd ed. (2011), and Part III: Wavefront Analysis (2013),
all published by SPIE Press.
Optical imaging starts with geometrical optics, and ray tracing lies at its forefront. This book
starts with Fermat’s principle and derives the three laws of geometrical optics from it. These
laws are used to obtain the exact ray-tracing equations, whose paraxial approximation yields
Gaussian imaging. After a discussion of imaging by refracting and reflecting systems, paraxial
ray tracing is used to determine the size of imaging elements and obscuration in mirror
systems. Stops, pupils, radiometry, and optical instruments are discussed next. The
chromatic and monochromatic aberrations are discussed in detail, followed by spot sizes and
spot diagrams of aberrated images of point objects. Each chapter ends with a summary and
a set of problems. The book ends with an epilogue, which summarizes the imaging process,
and outlines the next steps within and beyond geometrical optics.
P.O. Box 10
Bellingham, WA 98227-0010
ISBN: 9780819499981
SPIE Vol. No.: PM245