(Press Monograph) Virendra N. Mahajan-Fundamentals of Geometrical Optics-Society of Photo Optical (2014) PDF

FUNDAMENTALS OF
GEOMETRICAL OPTICS
VIRENDRA N. MAHAJAN
THE AEROSPACE CORPORATION
AND
COLLEGE OF OPTICAL SCIENCES - THE UNIVERSITY OF ARIZONA
SPIE PRESS
Bellingham, Washington USA
Library of Congress Cataloging-in-Publication Data
Mahajan, Virendra N.
Fundamentals of geometrical optics / Virendra N. Mahajan.
pages cm
Includes bibliographical references and index.
ISBN 978-0-8194-9998-1
1. Geometrical optics--Study and teaching. 2. Optical instruments--Reliability--Study
and teaching. 3. Diffraction--Study and teaching. I. Title.
QC382.M34 2014
535'.32--dc23
2014010949
Published by
SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA
Phone: +1 360.676.3290
Fax: +1 360.647.1445
Email: Books@spie.org
Web: http://spie.org
Copyright © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE)
All rights reserved. No part of this publication may be reproduced or distributed in any
form or by any means without written permission of the publisher.
The content of this book reflects the work and thought of the author(s). Every effort has
been made to publish reliable and accurate information herein, but the publisher is not
responsible for the validity of the information or for any outcomes resulting from reliance
thereon.
Printed in the United States of America.

Second printing
To my wife
SHASHI PRABHA
FOREWORD
We are living in the most exciting time, so far, in the use and application
of the phenomenon of light, as we understand it. Optics is now an
important subject in many disciplines, and so competence in optics is at
issue. This volume provides the interested reader with a solid resource to
embark on learning about geometrical optics, which is the foundation of
imaging and non-imaging optics. Professor Virendra N. Mahajan provides
a clear and detailed discussion of essential topics for the understanding of
image formation.
Dr. Mahajan has significant experience teaching and writing about the
subject. He is well known in the optics community and has traveled
around the world, lecturing about optical imaging and aberrations; one of
his favorite topics is the use of Zernike polynomials in optics.
I have known Dr. Mahajan ever since he started teaching at the College of
Optical Sciences in 2005. He flew back and forth from Los Angeles to
Tucson every week to share his knowledge with students. I have also
enjoyed noticing the fine interest and polite interaction he has with his
optics colleagues.
From interacting with Dr. Mahajan over the years, it is apparent that
he is concerned with clearly connecting topics in geometrical optics to
provide students with a solid foundation. One example is his detailed
style in describing and deriving, say, the laws of geometrical optics and ray
tracing in 3D, and the evolution of Gaussian optics from them. Another
example is his insightful explanation of how the individual primary
aberration coefficients of a system of surfaces can be added directly
to form the overall system’s coefficients.
I wish that readers will benefit from Vini Mahajan's Fundamentals of
Geometrical Optics and treasure it as a favorite reference.
May 2014 José Sasián
College of Optical Sciences
University of Arizona
Tucson, Arizona
TABLE OF CONTENTS
Preface ............................................................................................................................ xix
Acknowledgment............................................................................................................ xxi
Symbols and Notation.................................................................................................. xxiii
CHAPTER 1: FOUNDATIONS OF GEOMETRICAL OPTICS

1.1 Introduction ..............................................................................................................3
1.2 Sign Convention ....................................................................................................... 4
1.3 Fermat’s Principle....................................................................................................5
1.4 Rays and Wavefronts............................................................................................... 8
1.5 Laws of Geometrical Optics ..................................................................................10

1.5.1 Rectilinear Propagation ............................................................................. 10
1.5.2 Refraction in 2D ........................................................................................10
1.5.3 Reflection in 2D ........................................................................................12
1.5.4 Refraction in 3D ........................................................................................13
1.5.5 Reflection in 3D ........................................................................................15
1.6 Exact Ray Tracing ................................................................................................. 17
1.6.1 Ray Incident on a Spherical Surface..........................................................17
1.6.2 Rectilinear Propagation from the Object Plane to the First Refracting Surface ............. 18
1.6.3 Refraction of a Ray by a Spherical Refracting Surface............................. 19
1.6.4 Rectilinear Propagation from the First Refracting Surface to the Second ............. 20
1.6.5 Reflection of a Ray by a Spherical Reflecting Surface ............................. 21
1.6.6 Conic Surface and Surface Normal ..........................................................22
1.6.7 Refraction of a Ray by a Conic Refracting Surface ..................................22
1.6.8 Reflection of a Ray by a Conic Reflecting Surface................................... 23
1.6.9 Tracing a Tangential Ray ..........................................................................24
1.6.10 Determining Wave and Ray Aberrations ..................................................24
1.7 Paraxial Ray Tracing............................................................................................. 24
1.7.1 Snell’s Law ................................................................................................25
1.7.2 Point on a Spherical Surface ......................................................................25
1.7.3 Distance between Two Points....................................................................25
1.7.4 Unit Vector along a Surface Normal ......................................................... 26
1.7.5 Unit Vector along a Ray ............................................................................26
1.7.6 Transfer of a Ray ....................................................................................... 26
1.7.7 Refraction of a Ray ....................................................................................27
1.7.8 Reflection of a Ray ....................................................................................27
1.8 Gaussian Approximation and Imaging ................................................................28
1.8.1 Gaussian Approximation ........................................................................... 28
1.8.2 Gaussian Imaging by a Refracting Surface ............................................... 29
1.8.3 Gaussian Imaging by a Reflecting Surface................................................31
1.8.4 Gaussian Imaging by a Multisurface System ............................................34
1.9 Imaging beyond Gaussian Approximation ..........................................................34
1.10 Summary of Results ............................................................................................... 36

1.10.1 Sign Convention ........................................................................................36
1.10.2 Fermat’s Principle......................................................................................36
1.10.3 Laws of Geometrical Optics ......................................................................36
1.10.4 Exact Ray Tracing ..................................................................................... 37
1.10.4.1 Transfer Operation..................................................................... 37
1.10.4.2 Refraction Operation ................................................................. 37
1.10.4.3 Reflection Operation..................................................................38
1.10.4.4 Ray Tracing a Conic Surface..................................................... 39
1.10.4.5 Tracing a Tangential Ray ..........................................................39
1.10.5 Paraxial Ray Tracing ................................................................................. 39
1.10.6 Gaussian Optics ......................................................................................... 39
1.10.6.1 Gaussian Imaging by a Refracting Surface ............................... 39
1.10.6.2 Gaussian Imaging by a Reflecting Surface................................40
References ........................................................................................................................41
Problems ........................................................................................................................... 42
CHAPTER 2: REFRACTING SYSTEMS

2.1 Introduction ............................................................................................................45
2.2 Spherical Refracting Surface ................................................................................46
2.2.1 Gaussian Imaging Equation....................................................................... 46
2.2.2 Object and Image Spaces........................................................................... 50
2.2.3 Focal Lengths and Refracting Power ........................................................51
2.2.4 Magnifications and Lagrange Invariant..................................................... 53
2.2.5 Graphical Imaging ..................................................................................... 59
2.2.6 Newtonian Imaging Equation ....................................................................61
2.3 Thin Lens ................................................................................................................61
2.3.6 Image Throw..............................................................................................69
2.3.7 Thin Lens Not in Air..................................................................................71
2.3.8 Thin Lenses in Contact ..............................................................................73
2.4 General System....................................................................................................... 73
2.4.1 Introduction................................................................................................73
2.4.2 Cardinal Points and Planes ........................................................................75
2.4.3 Gaussian Imaging, Focal Lengths, and Magnifications ............................77
2.4.4 Nodal Points and Planes ............................................................................80
2.4.7 Reference to Other Conjugate Planes ........................................................82
2.4.8 Comparison of Imaging by a General System and a Refracting Surface or a Thin Lens ............84
2.4.9 Determination of Cardinal Points ..............................................................85
2.5 Afocal Systems ........................................................................................................90
2.5.1 Introduction................................................................................................90
2.5.2 Lagrange Invariant for an Infinite Conjugate ............................................91
2.5.3 Imaging by an Afocal System....................................................................91
2.6 Plane-Parallel Plate ................................................................................................93
2.6.1 Introduction................................................................................................93
2.6.2 Imaging Relations ......................................................................................94
2.7 Petzval Image..........................................................................................................96
2.7.1 Spherical Refracting Surface ..................................................................... 96
2.7.2 General System ..........................................................................................98
2.7.3 Thin Lens ................................................................................................... 99
2.8 Misaligned Surface............................................................................................... 101
2.8.1 Decentered Surface ..................................................................................101
2.8.2 Tilted Surface ..........................................................................................102
2.8.3 Despaced Surface ....................................................................................104
2.9 Misaligned Thin Lens ..........................................................................................105
2.9.1 Decentered Lens ......................................................................................105
2.9.2 Tilted Lens ............................................................................................... 106
2.9.3 Despaced Lens ......................................................................................... 106
2.10 Anamorphic Imaging Systems ............................................................................107
2.11 Summary of Results ............................................................................................. 109
2.11.1 Imaging Equations ................................................................................... 109
2.11.1.1 General System ........................................................................109
2.11.1.2 Refracting Surface ................................................................... 111
2.11.1.3 Thin Lens ................................................................................. 111
2.11.1.4 Afocal System..........................................................................112
2.11.1.5 Plane-Parallel Plate ..................................................................112
2.11.2 Petzval Image ..........................................................................................112
2.11.3 Misalignments..........................................................................................113
2.11.3.1 Misaligned Surface ..................................................................113
2.11.3.2 Misaligned Thin Lens ..............................................................113
2.11.4 Anamorphic Imaging Systems ................................................................113
Problems ......................................................................................................................... 115
CHAPTER 3: REFLECTING SYSTEMS

3.1 Introduction ..........................................................................................................119
3.2 Spherical Reflecting Surface (Spherical Mirror) ..............................................119
3.2.1 Gaussian Imaging Equation..................................................................... 119
3.2.2 Focal Length and Reflecting Power ........................................................121
3.2.3 Magnifications and the Lagrange Invariant............................................. 123
3.2.4 Graphical Imaging ................................................................................... 127
3.2.5 Newtonian Imaging Equation ..................................................................127
3.3 Two-Mirror Telescopes ....................................................................................... 129
3.4 Beam Expander ....................................................................................................133
3.5 Petzval Image........................................................................................................133
3.5.1 Single Mirror ........................................................................................... 133
3.5.2 Two-Mirror System ................................................................................. 135
3.5.3 System of k Mirrors ................................................................................. 136
3.6 Misaligned Mirror................................................................................................136
3.6.1 Decentered Mirror ................................................................................... 136
3.6.2 Tilted Mirror ............................................................................................137
3.6.3 Despaced Mirror ......................................................................................138
3.7 Misaligned Two-Mirror Telescope ..................................................................... 139
3.7.1 Decentered Secondary Mirror..................................................................139
3.7.2 Tilted Secondary Mirror ..........................................................................139
3.7.3 Despaced Secondary Mirror ....................................................................139
3.8.1 Imaging by a Mirror ................................................................................141
3.8.2 Imaging by a Two-Mirror Telescope ......................................................142
Problems ......................................................................................................................... 144
CHAPTER 4: PARAXIAL RAY TRACING
4.1 Introduction ..........................................................................................................147
4.2 Refracting Surface ............................................................................................... 148
4.3 General System..................................................................................................... 152
4.3.1 Determination of Cardinal Points ............................................................152
4.3.2 Combination of Two Systems ................................................................. 154
4.4 Thin Lens ..............................................................................................................155
4.5 Thick Lens ............................................................................................................159
4.6 Two-Lens System ................................................................................................. 162
4.7 Reflecting Surface (Mirror) ................................................................................165
4.8 Two-Mirror System ............................................................................................. 168
4.8.1 Focal Length ............................................................................................168
4.8.2 Obscuration ..............................................................................................170
4.9 Catadioptric System: Thin-Lens–Mirror Combination ................................... 172
4.10 Two-Ray Lagrange Invariant ............................................................................. 174
4.11.1 Ray-Tracing Equations ............................................................................177
4.11.2 Thick Lens ............................................................................................... 179
4.11.3 Two-Lens System ....................................................................................180
4.11.5 Two-Ray Lagrange Invariant ..................................................................181
Problems ......................................................................................................................... 182
CHAPTER 5: STOPS, PUPILS, AND RADIOMETRY

5.1 Introduction ..........................................................................................................187
5.2 Stops, Pupils, and Vignetting ..............................................................................188
5.2.1 Introduction..............................................................................................188
5.2.2 Aperture Stop, and Entrance and Exit Pupils ..........................................188
5.2.3 Chief and Marginal Rays ......................................................................... 193
5.2.4 Vignetting ................................................................................................194
5.2.5 Size of an Imaging Element ....................................................................197
5.2.6 Telecentric Aperture Stop ........................................................................197
5.2.7 Field Stop, and Entrance and Exit Windows ........................................... 198
5.3 Radiometry of Point Object Imaging ................................................................. 200
5.3.1 Flux Received by an Aperture ................................................................. 200
5.3.2 Inverse-Square Law of Irradiance ........................................................... 201
5.3.3 Image Intensity ........................................................................................202
5.4 Radiometry of Extended Object Imaging ..........................................................204
5.4.1 Introduction..............................................................................................204
5.4.2 Lambertian Surface..................................................................................205
5.4.3 Illumination by a Lambertian Disc ..........................................................206
5.4.5 Image Radiance ....................................................................................... 211
5.4.6 Image Irradiance: Aperture Stop in front of the System..........................213
5.4.7 Image Irradiance: Aperture Stop in back of the System ..........................216
5.4.8 Telecentric Systems ................................................................................. 218
5.4.9 Throughput ..............................................................................................218
5.4.10 Interrelations among Invariants in Imaging ............................................218
5.4.11 Concentric Systems ................................................................................. 219
5.5 Photometry ........................................................................................................... 220
5.5.1 Photometric Quantities and Spectral Response of the Human Eye......... 220
5.5.2 Imaging by the Human Eye ..................................................................... 223
5.5.3 Brightness of a Lambertian Surface ........................................................223
5.6.1 Stops, Pupils, Windows, and Field of View ............................................224
5.6.2 Radiometry of Point Object Imaging ......................................................225
5.6.3 Radiometry of Extended Object Imaging ................................................226
5.6.3.1 Illumination by a Lambertian Disc............................................226
5.6.3.2 Image Radiance ......................................................................... 226
5.6.3.3 Image Irradiance........................................................................227
5.6.4 Visual Observations................................................................................. 228
References ......................................................................................................................229
Problems ......................................................................................................................... 230
CHAPTER 6: OPTICAL INSTRUMENTS
6.1 Introduction ..........................................................................................................235

6.2 Eye ......................................................................................................................... 235
6.2.1 Anatomy and Structure ............................................................................235
6.2.2 Paraxial Models ....................................................................................... 237
6.2.3 Accommodation ......................................................................................238
6.2.4 Visual Acuity ........................................................................................... 240
6.2.5 Spectacles (or Eyeglasses)....................................................................... 242
6.3 Magnifier ..............................................................................................................249
6.4 Microscope ............................................................................................................251
6.5 Telescope ............................................................................................................... 253
6.6 Ocular....................................................................................................................259
6.7 Telephoto Lens and Wide-Angle Camera ..........................................................259
6.8 Resolution ............................................................................................................. 261
6.8.1 Introduction..............................................................................................261
6.8.2 Airy Pattern..............................................................................................261
6.8.3 Rayleigh Criterion of Resolution............................................................. 263
6.8.4 Resolution of an Imaging System ............................................................266
6.8.5 Resolution of the Eye ..............................................................................268
6.8.6 Resolution of a Microscope ..................................................................... 269
6.8.7 Resolution of a Telescope........................................................................270
6.9 Pinhole Camera ....................................................................................................273
6.10.1 Eye ........................................................................................................... 275
6.10.2 Magnifier ................................................................................................. 275
6.10.3 Microscope ..............................................................................................275
6.10.4 Telescope ................................................................................................. 276
6.10.5 Resolution ................................................................................................276
6.10.6 Pinhole Camera........................................................................................276
References ......................................................................................................................277
Problems ......................................................................................................................... 278
CHAPTER 7: CHROMATIC ABERRATIONS
7.1 Introduction ..........................................................................................................281

7.3 Thin Lens ..............................................................................................................285
7.4 Plane-Parallel Plate ..............................................................................................288
7.5 General System..................................................................................................... 292
7.6 Doublet ..................................................................................................................295
7.6.1 Lenses of Different Materials ..................................................................296
7.6.2 Lenses of the Same Material....................................................................297
7.6.3 Doublet with Two Separated Components ..............................................301
7.6.4 Thin-Lens Doublet................................................................................... 302
7.7.1 General System ........................................................................................305
7.7.2 Thin Lens ................................................................................................. 306
7.7.3 Plane-Parallel Plate ..................................................................................307
7.7.4 Doublet ....................................................................................................307
References ......................................................................................................................310
Problems ......................................................................................................................... 311
CHAPTER 8: MONOCHROMATIC ABERRATIONS

8.1 Introduction ..........................................................................................................315
8.2 Wave and Ray Aberrations ................................................................................. 316
8.2.1 Definitions ............................................................................................... 316
8.2.2 Relationship between Wave and Ray Aberrations ..................................320
8.3 Wavefront Defocus Aberration ..........................................................................322
8.4 Wavefront Tilt Aberration ..................................................................................325
8.5 Aberrations of a Rotationally Symmetric System............................................. 326
8.5.1 Explicit Dependence on Object Coordinates........................................... 326
8.5.2 No Explicit Dependence on Object Coordinates ..................................... 329
8.6 Additivity of Primary Aberrations ..................................................................... 331
8.6.1 Introduction..............................................................................................331
8.6.2 Primary Wave Aberrations ......................................................................332
8.6.3 Transverse Ray Aberrations ....................................................................335
8.6.4 Off-Axis Point Object ..............................................................................336
8.6.5 Higher-Order Aberrations........................................................................337
8.7 Strehl Ratio and Aberration Balancing ............................................................. 337
8.7.1 Strehl Ratio ..............................................................................................337
8.7.2 Aberration Balancing............................................................................... 338
8.8 Zernike Circle Polynomials................................................................................. 340
8.8.1 Introduction..............................................................................................340
8.8.2 Polynomials in Optical Design ................................................................341
8.8.3 Polynomials in Optical Testing ............................................................... 345
8.8.4 Characteristics of Polynomial Aberrations ..............................................349
8.8.4.1 Isometric Characteristics ........................................................... 349
8.8.4.2 Interferometric Characteristics ..................................................350
8.9 Relationship between Zernike Polynomials and Classical Aberrations ......... 352
8.9.1 Introduction..............................................................................................352
8.9.2 Wavefront Tilt Aberration ....................................................................... 352
8.9.3 Wavefront Defocus Aberration................................................................353
8.9.4 Astigmatism............................................................................................. 353
8.9.5 Coma ........................................................................................................354
8.9.6 Spherical Aberration ................................................................................355
8.9.7 Seidel Coefficients from Zernike Coefficients ........................................355
8.10 Aberrations of an Anamorphic System ..............................................................356
8.10.1 Introduction..............................................................................................356
8.10.2 Classical Aberrations ............................................................................... 357
8.10.3 Polynomial Aberrations Orthonormal over a Rectangular Pupil ............358
8.10.4 Expansion of a Rectangular Aberration Function in Terms of
Orthonormal Rectangular Polynomials ................................................... 360
8.11 Observation of Aberrations ................................................................................363
8.11.1 Primary Aberrations ................................................................................364
8.11.2 Interferograms..........................................................................................364
8.11.3 Random Aberrations ................................................................................369
8.12.1 Wave and Ray Aberrations ......................................................................370
[YL
8.12.5 Strehl Ratio and Aberration Balancing ....................................................371
8.12.6 Zernike Circle Polynomials ..................................................................... 371
8.12.6.1 Use of Zernike Polynomials in Wavefront Analysis ............... 371
8.12.6.2 Polynomials in Optical Design ................................................371
8.12.6.3 Zernike Primary Aberrations ................................................... 372
8.12.6.4 Polynomials in Optical Testing ............................................... 373
8.12.6.5 Isometric and Interferometric Characteristics ......................... 374
8.12.7 Relationship between Zernike and Seidel Coefficients ........................... 374
8.12.8 Aberrations of an Anamorphic System....................................................374
Appendix: Combination of Two Zernike Polynomial Aberrations with the
Same n Value and Varying as cos mqq and sin mqq ................................. 376
References ......................................................................................................................377
Problems ......................................................................................................................... 378
CHAPTER 9: SPOT SIZES AND DIAGRAMS
9.1 Introduction
9.2 Theory
9.3 Application to Primary Aberrations
9.3.2 Coma
9.3.3 Astigmatism and Field Curvature
9.3.4 Field Curvature and Depth of Focus
9.3.5 Distortion
9.4 Balanced Aberrations for the Minimum Spot Sigma
9.5 Spot Diagrams
9.6 Aberration Tolerance and a Golden Rule of Optical Design
9.7.1 Spherical Aberration
9.7.2 Coma
9.7.3 Astigmatism and Field Curvature
9.7.4 Field Curvature and Defocus
9.7.5 Distortion
9.7.6 Aberration Tolerance
9.7.7 A Golden Rule of Optical Design
References
Problems
EPILOGUE
E1 Introduction
E2 Principles of Geometrical Optics and Imaging
E3 Ray Tracing: Exact and Paraxial
E4 Gaussian Optics
E4.1 Tangent Plane or Paraxial Surface
E4.2 Sign Convention
E4.3 Cardinal Points
E4.4 Graphical Imaging
E4.5 Lagrange Invariant
E4.6 Matrix Approach to Gaussian Imaging
E4.7 Petzval Image
E4.8 Field of View
E4.9 Chromatic Aberrations
E5 Image Brightness
E6 Image Quality
E6.1 Wave and Ray Aberrations
E6.2 Primary Aberrations
E6.3 Spot Size and Aberration Balancing
E6.4 Strehl Ratio and Aberration Balancing
E7 Reflecting Systems
E8 Anamorphic Imaging Systems
E9 Aberration Tolerance and a Golden Rule of Optical Design
E10 General Comments
References
Bibliography
Index
PREFACE
Portions of this book have their origin in the author’s lectures given as an adjunct
professor in the electrical engineering/electrophysics department of the University of
Southern California from about 1984 to 1998. It is a precursor to the author’s Optical
Imaging and Aberrations books (Part I: Ray Geometrical Optics; Part II: Wave
Diffraction Optics; and Part III: Wavefront Analysis), all published by SPIE Press. It is
an expanded yet simplified version of some of the material from Part I, and contains some
new material. The focus is on Gaussian imaging, ray tracing, radiometry, basic optical
instruments, optical aberrations, and spot diagrams. The primary aberrations of simple
systems, such as a thin lens or a two-mirror telescope, that are derived in Part I are not
discussed here. The book can be used as a textbook for a senior undergraduate or a first-
year graduate class.
Geometrical optics is fundamental to optical imaging. Chapter 1 lays out its
foundations. It starts with the sign convention of Cartesian geometry, states Fermat’s
principle, and derives the three laws of geometrical optics from it. These laws are used to
obtain the equations for exact ray tracing, and those for paraxial ray tracing are obtained
from them as an approximation. The latter equations are used to obtain the basic
equations of Gaussian optics. In Chapter 2, the Gaussian and Newtonian imaging
equations are derived for a refracting surface using the small angle approximation of
Snell’s law. The equations thus obtained are applied to derive the imaging equations for a
thin lens, and for a general imaging system. Afocal systems, as applied to astronomical
telescopes, and telephoto and wide-angle camera lenses are discussed. The Petzval image
describing the defocus error of the Gaussian image of an off-axis point object is
considered. Also discussed is how the Gaussian image is displaced due to a misalignment
of a surface or a thin lens. Imaging by an anamorphic system is briefly considered.
Imaging by reflecting systems is discussed in Chapter 3, including Gaussian imaging by
two-mirror telescopes.
The imaging equations obtained in Chapters 2 and 3 are rederived in Chapter 4 by
using the paraxial ray-tracing equations. These ray-tracing equations are also used to
determine the size of the imaging elements, vignetting of rays by them for off-axis point
objects, and obscurations in mirror systems. Stops, pupils, and radiometry are discussed
in Chapter 5. How to determine the aperture stop of a system and its images in the object
and image spaces, i.e., the entrance and exit pupils, is described. The intensity of the
image of a point object, invariance of the radiance of a ray bundle as it is refracted or
reflected, and the irradiance distribution of the image of an extended object in terms of its
radiance distribution are discussed. A brief discussion of photometry is also given.
Some of the familiar optical instruments such as the eye, magnifier, microscope,
telescope, and pinhole camera are addressed in Chapter 6. The most common and
interesting among them is the eye, which is discussed in detail. The resolution of such
common optical instruments is discussed based on Rayleigh’s criterion of resolution, thus
necessitating a brief discussion of the aberration-free diffraction image of a point object,
i.e., the Airy pattern. The chromatic aberrations of a system are discussed in Chapter 7. A
refracting surface, a thin lens, a plane-parallel plate, and a doublet are considered as
simple examples of systems.
The monochromatic aberrations of a system with an emphasis on primary aberrations
are considered in Chapter 8. The wave and ray aberrations are introduced, and a simple
derivation of the relationship between them is given. The Strehl ratio of an image as a
measure of its quality is introduced, and the balancing of wave aberrations to minimize
their variance is discussed. The aberrations are also discussed in terms of the Zernike
circle polynomials because of their widespread use in optical design and testing. The
aberrations of an anamorphic imaging system are also discussed. The spot sizes and
diagrams for primary aberrations are addressed in Chapter 9. Aberration balancing for
minimum standard deviation of the ray distribution of an image spot is discussed. The
aberration tolerances for primary aberrations based on their spot radius are derived, and
the golden rule of optical design is described.
The content of each chapter is summarized in its last section. This section is written
to be comprehensive enough that it can be read on its own without reading the whole
chapter. Each chapter ends with a set of problems, which are an integral part of the book.
They help develop and test how to apply the results obtained in a chapter to practical
situations.
The book ends with an epilogue, which gives a summary of the imaging process, and
outlines the next steps within and beyond geometrical optics.
Virendra N. Mahajan
El Segundo, California
April 2014
ACKNOWLEDGMENT
Once again, I am pleased to acknowledge the generous support I have received over
the years from my employer, The Aerospace Corporation, in preparing this book. I am
grateful to my former classmate Dr. William H. Swantner for his advice on this work. I
had useful discussions about the human eye with my son, Vinit Bharati, who is a retina
surgeon. My thanks to Drs. Pantazis Mouroulis and Brian Stone, and two anonymous
reviewers, for reading a draft of the book and providing useful feedback. Of course, I am
the only one responsible for any shortcomings or errors in the book. My special thanks go
to Professor José Sasián for writing the Foreword. The Sanskrit verse on p. xxv was
provided by Professor Sally Sutherland of the University of California at Berkeley.
I do not have enough words to thank my wife, Shashi Prabha, for tolerating my time
away from her while I was busy writing this book. This is the last of my five books on
optical imaging and aberrations, and I dedicate it to her.
Finally, I thank SPIE Press Editor Scott McNeill and Press Manager Tim Lamkins
for their quality support in bringing this book to publication. Scott has meticulously
upgraded some of the figures, including the color figures on chromatic aberrations.
SYMBOLS AND NOTATION
a                  radius of exit pupil
ai                 aberration coefficient
Ai                 peak aberration coefficient
AS                 aperture stop
CR                 chief ray
e                  eccentricity
EnP                entrance pupil
EnW                entrance window
ExP                exit pupil
ExW                exit window
f                  focal length
F                  focal ratio or f-number, focal point, flux
GR                 general ray
h                  object height
h′                 image height
H                  principal point
K                  power of a system
L                  image distance from exit pupil
m                  pupil-image magnification
M                  object-image magnification
MR                 marginal ray
n                  refractive index
OA                 optical axis
p                  position factor
P                  point object
P′                 Gaussian image point
q                  shape factor
R                  radius of curvature of a surface or reference sphere
Rnm(ρ)             Zernike radial polynomial
s                  entrance-pupil distance
s′                 exit-pupil distance
S                  object distance
S′                 image distance
t                  thickness
V                  Abbe number, spectral response
W                  wave aberration
x, y               rectangular coordinates of a point
z                  sag, object or observation distance
z′                 image distance
β                  ray or field angle
∆R                 longitudinal defocus
r, θ               polar coordinates of a point
λ                  optical wavelength
(ξ, η) = (x, y)/a  normalized rectangular coordinates
ρ = r/a            normalized radial coordinate in the pupil plane
σF                 standard deviation of figure errors
σs                 ray spot sigma
σW                 standard deviation of wave aberration
Φ                  phase aberration
ψ                  angular deviation of ray
(−)x               numerically negative quantity x
Anantaratnaprabhavasya yasya himaṃ na saubhāgyavilopi jātam
Eko hi doṣo guṇasannipāte
nimajjatīndoḥ kiraṇeṣvivāṅkaḥ
The snow does not diminish the beauty of the Himalayan mountains
which are the source of countless gems. Indeed, one flaw is lost
among a host of virtues, as the moon’s dark spot is lost among its rays.
Kalidasa, Kumarasambhava 1.3
CHAPTER 1
FOUNDATIONS OF GEOMETRICAL OPTICS
1.1 Introduction
1.2 Sign Convention
1.3 Fermat’s Principle
1.4 Rays and Wavefronts
1.5 Laws of Geometrical Optics
1.5.1 Rectilinear Propagation
1.5.2 Refraction in 2D
1.5.3 Reflection in 2D
1.5.4 Refraction in 3D
1.5.5 Reflection in 3D
1.6 Exact Ray Tracing
1.6.1 Ray Incident on a Spherical Surface
1.6.2 Rectilinear Propagation from the Object Plane to the First Refracting Surface
1.6.3 Refraction of a Ray by a Spherical Refracting Surface
1.6.4 Rectilinear Propagation from the First Refracting Surface to the Second
1.6.5 Reflection of a Ray by a Spherical Reflecting Surface
1.6.6 Conic Surface and Surface Normal
1.6.7 Refraction of a Ray by a Conic Refracting Surface
1.6.8 Reflection of a Ray by a Conic Reflecting Surface
1.6.9 Tracing a Tangential Ray
1.6.10 Determining Wave and Ray Aberrations
1.7 Paraxial Ray Tracing
1.7.1 Snell’s Law
1.7.2 Point on a Spherical Surface
1.7.3 Distance between Two Points
1.7.4 Unit Vector along a Surface Normal
1.7.5 Unit Vector along a Ray
1.7.6 Transfer of a Ray
1.7.7 Refraction of a Ray
1.7.8 Reflection of a Ray
1.8 Gaussian Approximation and Imaging
1.8.1 Gaussian Approximation
1.8.2 Gaussian Imaging by a Refracting Surface
1.8.3 Gaussian Imaging by a Reflecting Surface
1.8.4 Gaussian Imaging by a Multisurface System
1.9 Imaging beyond Gaussian Approximation
1.10 Summary of Results
1.10.1 Sign Convention
1.10.2 Fermat’s Principle
1.10.3 Laws of Geometrical Optics
1.10.4 Exact Ray Tracing
1.10.4.1 Transfer Operation
1.10.4.2 Refraction Operation
1.10.4.3 Reflection Operation
1.10.4.4 Ray Tracing a Conic Surface
1.10.4.5 Tracing a Tangential Ray
1.10.5 Paraxial Ray Tracing
1.10.6 Gaussian Optics
1.10.6.1 Gaussian Imaging by a Refracting Surface
1.10.6.2 Gaussian Imaging by a Reflecting Surface
References
Problems
Chapter 1
Foundations of Geometrical Optics
1.1 INTRODUCTION
In geometrical optics, light is described by rays that propagate according to three
laws: rectilinear propagation, refraction, and reflection. Their direction of propagation
indicates the direction of the flow of light energy. They are normal to a wavefront. They
are not a physical entity in the sense that we cannot isolate a ray, yet they are very
convenient for describing the process of imaging by a system.
We begin this chapter with a brief introduction of the Cartesian sign convention for
the distances and heights of the object and image points, and the angles of incidence and
refraction or reflection and slope angles of the rays. We discuss Fermat’s principle that
the optical path length of a ray from one point to another is stationary, and derive the laws
of rectilinear propagation in a homogeneous medium, refraction by a refracting surface,
and reflection by a reflecting surface (first in 2D and then in 3D). These laws are used to
obtain ray-tracing equations representing the propagation of a ray exactly from a certain
point to a point on a refracting or a reflecting surface, or refraction or reflection of the ray
by the surface, and propagation of the refracted or reflected ray to the next surface. The
purpose of exact ray tracing is to determine the aberrations of a system consisting of a
series of refracting and/or reflecting surfaces that generally have a common axis of
rotational symmetry called the optical axis. Such a system is called a centered or a
rotationally symmetric system. Its surfaces bend light rays from an object according to the
three laws to form its image.
For rays and normals to the refracting and reflecting surfaces making small angles
with the optical axis, Gauss gave an extremely useful approximation to the exact theory.
In this approximation, the sines and tangents of the angles of the rays with the optical axis
are replaced by the angles, and any diagonal distances are approximated by the
corresponding axial distances. Gaussian optics or imaging relates the object distance and
size to the image distance and size through the parameters of the imaging system such as
the radii of curvature of the surfaces and refractive indices of the media between them.
The image of an object obtained according to geometrical optics in the Gaussian
approximation is called the Gaussian image.
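As a rough numerical illustration of the small-angle approximation described above, the
following Python sketch compares the sine and tangent of an angle with the angle itself.
It is not part of the original text; the function name paraxial_errors and the sample
angles are chosen here purely for illustration.

```python
import math

def paraxial_errors(angle_deg):
    """Relative errors made by replacing sin(theta) and tan(theta) with theta."""
    theta = math.radians(angle_deg)
    sin_err = (theta - math.sin(theta)) / math.sin(theta)   # theta overestimates sin
    tan_err = (math.tan(theta) - theta) / math.tan(theta)   # theta underestimates tan
    return sin_err, tan_err

# Illustrative angles only; the errors grow roughly as the square of the angle.
for deg in (1, 5, 10, 20):
    sin_err, tan_err = paraxial_errors(deg)
    print(f"{deg:>2} deg: sin error {sin_err:.2e}, tan error {tan_err:.2e}")
```

The errors are of second order in the angle (about θ²/6 for the sine and θ²/3 for the
tangent), which is why the approximation is adequate only for rays making small angles
with the optical axis.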
The assumption or approximation of small angles is referred to as the Gaussian or
the paraxial (meaning near the optical axis) approximation. A distinction is sometimes
made between Gaussian and paraxial optics in that paraxial optics is a limiting case
of Gaussian optics in which the angles are infinitesimal quantities. The rays traced in this
approximation are called paraxial rays and the corresponding method of ray tracing is
referred to as paraxial ray tracing. Because of the rotational symmetry, only rays lying in
the plane containing the optical axis and the point object under consideration need to be
considered to determine the Gaussian image. Such a plane is called the tangential (or
meridional) plane, and rays lying in this plane are called tangential (or meridional) rays.
Rays that do not lie in this plane but intersect it are called skew rays.
The role of an optical designer is to design an imaging system so that it can form an
image of a certain size at a certain location, given the object size and location. Given the
radiance of an extended object or the intensity of a point object, the designer chooses the
sizes of the imaging elements that yield an image of some prescribed irradiance or
intensity. Gaussian optics is also used to determine the extent of the object that can be
imaged, i.e., it is used to determine the field of view of the system. A quantity of
paramount interest that is beyond Gaussian optics, but that a design must satisfy, is the
expected quality of the image. A designer must choose the shapes and materials of the
imaging elements that balance their chromatic and monochromatic aberrations to produce
an image of acceptable quality across the field of view of the system.
1.2 SIGN CONVENTION

Although there is no universally accepted standard sign convention, we will use the
Cartesian sign convention [1]. It has the advantage that there are no special rules to
remember other than those of a right-handed Cartesian coordinate system, regardless of
whether the object or the image is real or virtual. Our sign convention is the same as that
used by Mouroulis and Macdonald [2] and Welford [3], but it is different from the sign
convention used, for example, by Jenkins and White [4], Klein and Furtak [5], and Hecht
and Zajac [6]. Its rules, as they apply to the quantities encountered in Gaussian optics, are
listed below. They are illustrated with the aid of Figure 1-1, which shows the imaging of
an object by a refracting surface of radius of curvature R separating media of refractive
indices n and n′. The object P₀P has a height of h and lies at a distance S from the
surface. Its image P₀′P′ has a height of h′ and lies at a distance S′.
1. Light is incident on an imaging system from left to right.
2. Distances to the right of and above (left of and below) a reference point are
positive (negative). The object distance S and image height h′ are numerically
negative in Figure 1-1, and object height h and image distance S′ are
numerically positive.
3. The radius of curvature of a surface is treated as the distance of its center of
curvature from its vertex. Thus, it is positive (negative) when the center of
curvature lies to the right (left) of the vertex. R is numerically positive in Figure
1-1.
4. The acute angle of a ray from the optical axis or from the surface normal is
positive (negative) if it is counterclockwise (clockwise). The angles θ and θ′ of
the incident and refracted rays P₀Q and QP₀′ from the surface normal QC are
both positive in Figure 1-1. However, the angles φ and β₀′ of the surface normal
and the refracted ray from the optical axis OA are both numerically negative.
5. When light travels from right to left, such as when it is reflected by an odd
number of mirrors, then the refractive index and the spacing between two
adjacent surfaces are given a negative sign. The negative distance is consistent
with the sign convention for the distance, and a negative refractive index results
from the negative wave velocity.
Throughout the book, any quantities that are numerically negative are indicated in the
figures by a parenthetical negative sign (–).
Figure 1-1. Gaussian imaging by a convex spherical refracting surface of radius of
curvature R separating media of refractive indices n and n′, where n′ > n. VC is
the optical axis OA of the surface, where V is the vertex of the surface and C is its
center of curvature. The axial point object P₀ lies at a distance S, and its image P₀′
lies at a distance S′ from V. The angles θ and θ′ are the angles of the incident and
refracted rays P₀Q and QP₀′, respectively, from the surface normal QC at the point
of incidence Q. The slope angles of these rays from the optical axis are β₀ and β₀′.
The off-axis point object P lies at a height h from the optical axis. Its image P′ lies
at a height h′. Numerically negative quantities are indicated by a negative
parenthetical sign (–).
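To see how the signs work out in practice, the following Python sketch applies the
convention to the geometry of Figure 1-1. It is an illustration only: the relations
n′/S′ − n/S = (n′ − n)/R and h′/h = nS′/(n′S) are the standard Gaussian imaging equations
for a single refracting surface in the Cartesian convention and are assumed here, not
derived in this section. The function name and all numerical values are hypothetical.

```python
def gaussian_image(n, n_prime, R, S, h):
    """Image distance and height for one refracting surface (Cartesian signs).

    Assumes n'/S' - n/S = (n' - n)/R and h'/h = n S' / (n' S).
    """
    S_prime = n_prime / ((n_prime - n) / R + n / S)
    h_prime = h * (n * S_prime) / (n_prime * S)
    return S_prime, h_prime

# Hypothetical values: object 200 units to the LEFT of the vertex (S < 0),
# center of curvature to the RIGHT of the vertex (R > 0), object above the axis (h > 0).
S_prime, h_prime = gaussian_image(n=1.0, n_prime=1.5, R=50.0, S=-200.0, h=10.0)
print(f"S' = {S_prime:.1f}, h' = {h_prime:.1f}")
```

With these inputs the image distance comes out positive (image to the right of the vertex)
and the image height negative (image below the axis), matching the signs marked in
Figure 1-1. No special rule for real or virtual objects is needed; the signs follow from
the coordinates alone.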
1.3 FERMAT’S PRINCIPLE

Fermat’s principle states that the time a ray takes in traveling from one point to
another along its actual path is stationary with respect to small changes of that path. By
definition, the refractive index of a medium is the ratio of the speed of light in vacuum to
its corresponding value in the medium. Because the time taken by a ray is inversely
proportional to the speed of light in a medium, which in turn is inversely proportional to
its refractive index, the principle may also be stated as follows: The optical path length of
a ray in traveling from one point to another along its actual path is stationary, where the
optical path length is equal to the geometrical path length multiplied by the refractive
index. The optical path length is stationary in the sense that any deviation of the path
from the actual that is of first order in small quantities produces a deviation in the optical
path length that is at least of second order in small quantities.
If we consider the actual and neighboring paths of a ray in going from a point P₁ to a
point P₂, as indicated in Figure 1-2, so that the two paths deviate by no more than a small
quantity ε, then the difference in their optical path lengths is given by
\begin{align}
W(\epsilon) &= \int_{P_1}^{P_2} n\,ds' - \int_{P_1}^{P_2} n\,ds \tag{1-1a} \\
            &= O^{2}(\epsilon) , \tag{1-1b}
\end{align}
where ds and ds′ are the differential elements of path length along the actual and
neighboring virtual rays, respectively, n is the corresponding refractive index, and O²(ε)
indicates a function that depends on ε through ε² and/or higher powers of ε. It is clear
from Eq. (1-1b) that
\[ \lim_{\epsilon \to 0} \frac{\partial W}{\partial \epsilon} = 0 . \tag{1-2a} \]
Equation (1-2a) may also be written
\[ \delta \int_{P_1}^{P_2} n\,ds = 0 , \tag{1-2b} \]
where δ indicates a differential variation. Thus, up to the first order in ε, the two optical
path lengths are equal.
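The second-order behavior of the path difference can be checked numerically. The
following Python sketch is an illustration under stated assumptions, not part of the
original text: it takes a plane interface y = 0 between media of indices 1.0 and 1.5,
two hypothetical end points, and a family of two-segment paths labeled by the crossing
point x. It locates the stationary path by bisection and then perturbs it by ε.

```python
import math

N1, N2 = 1.0, 1.5        # hypothetical indices above and below the interface y = 0
P1 = (0.0, 1.0)          # hypothetical start point in medium N1
P2 = (1.0, -1.0)         # hypothetical end point in medium N2

def optical_path_length(x):
    """OPL of the two-segment path P1 -> (x, 0) -> P2."""
    return N1 * math.dist(P1, (x, 0.0)) + N2 * math.dist((x, 0.0), P2)

def opl_slope(x):
    """Derivative of the OPL with respect to x: N1 sin(theta1) - N2 sin(theta2)."""
    sin1 = (x - P1[0]) / math.dist(P1, (x, 0.0))
    sin2 = (P2[0] - x) / math.dist((x, 0.0), P2)
    return N1 * sin1 - N2 * sin2

def stationary_crossing(lo=0.0, hi=1.0, iterations=100):
    """Bisection on the slope, which is negative at lo and positive at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if opl_slope(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x0 = stationary_crossing()
print("N1 sin(theta1) - N2 sin(theta2) at x0:", opl_slope(x0))
for eps in (1e-2, 1e-3):
    print(f"eps = {eps:g}: OPL difference = "
          f"{optical_path_length(x0 + eps) - optical_path_length(x0):.3e}")
```

Reducing ε by a factor of 10 reduces the path difference by roughly a factor of 100, as
expected of a quantity that is of second order in ε. The vanishing slope at the
stationary point is the statement n sin θ = n′ sin θ′ for this geometry.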
Figure 1-2. The actual and virtual paths of a ray in going from a point P₁ to a point
P₂. The actual path is indicated by a solid line, and the two paths deviate from each
other by no more than a small quantity ε at any point along the path.
The optical path length of an actual ray compared to those of the neighboring virtual
rays may be a maximum or a minimum, or all of the rays may have equal optical path
lengths. This may be seen from the properties of an ellipse (or ellipsoid), as illustrated in
Figure 1-3. An ellipse has the property (see Figure 1-3a) that the sum of the distances of a
point P on it from its geometrical foci F₁ and F₂ is independent of its location.
Moreover, according to the law of reflection derived later in this chapter, the angles made
by the lines F₁P and F₂P with the normal PN to the ellipse at P are equal. Thus, if we
place a point source at the focus F₁ of an ellipsoidal mirror, all of the rays from it pass
through F₂ after reflection by the mirror, and their optical path lengths are equal to each
other. Thus, for example, [F₁PF₂] = [F₁QF₂], where the square brackets indicate an
optical path length. However, for a plane mirror that is a tangent to the ellipse at the point
P, the optical path length [F₁PF₂] of the actual ray will be a minimum compared with any
neighboring optical path length, such as [F₁RF₂].
Figure 1-3. Stationarity of optical path length. (a) [F₁PF₂] = [F₁QF₂] for the
ellipsoidal mirror; [F₁PF₂] is a minimum for the plane mirror. (b) [F₁PF₂] is a
maximum for the concave mirror. (c) [F₁PF₂] is a minimum for the convex mirror.
Similarly, if we consider the concave mirror shown dashed in Figure 1-3b so that it has a common tangent and therefore a common normal with the ellipse at the point $P$, then the optical path length of the actual ray $F_1PF_2$ is a maximum compared with the neighboring virtual (in the sense of fictitious) rays. We note, for example, that
$$[F_1RF_2] < [F_1QF_2] = [F_1PF_2] . \tag{1-3a}$$
Moreover, if we consider a convex mirror as in Figure 1-3c, having a common tangent with the ellipse at the point $P$, then the optical path length of the actual ray $F_1PF_2$ is a minimum compared with the neighboring virtual rays. In this case,
$$[F_1RF_2] > [F_1QF_2] = [F_1PF_2] . \tag{1-3b}$$
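The ellipse property used above can be checked numerically. The following is a minimal sketch, assuming a hypothetical ellipse with semi-axes 5 and 3 in air (n = 1); the names `focal_sum` and `on_ellipse` are illustrative only. It confirms that the path length through any point of the ellipse is the same, and that points on the tangent plane mirror near P give longer paths.

```python
import math

def focal_sum(point, f1, f2):
    """Path length F1 -> point -> F2 (n = 1 assumed, so it equals the optical path)."""
    return math.dist(f1, point) + math.dist(point, f2)

# Hypothetical ellipse: semi-axes a_semi and b_semi, foci on the x axis.
a_semi, b_semi = 5.0, 3.0
c_focal = math.sqrt(a_semi**2 - b_semi**2)
f1, f2 = (-c_focal, 0.0), (c_focal, 0.0)

def on_ellipse(t):
    """Point of the ellipse at parameter t."""
    return (a_semi * math.cos(t), b_semi * math.sin(t))

# Paths through different points of the ellipsoidal mirror: all equal to 2 * a_semi.
ellipse_sums = [focal_sum(on_ellipse(t), f1, f2) for t in (0.3, 1.0, 1.57, 2.5)]

# Plane mirror tangent to the ellipse at P = (0, b_semi), i.e., the line y = b_semi.
actual = focal_sum((0.0, b_semi), f1, f2)
plane_sums = [focal_sum((dx, b_semi), f1, f2) for dx in (-0.2, -0.1, 0.1, 0.2)]
```

Here `actual` is smaller than every entry of `plane_sums`, which is the minimum case of Figure 1-3a.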
1.4 RAYS AND WAVEFRONTS

In geometrical optics, light is described by rays that propagate according to three
laws discussed in the next section: rectilinear propagation, refraction, and reflection.
Their direction of propagation indicates the direction of flow of the light energy. They are
not a physical entity in the sense that we cannot isolate a ray, yet they are very convenient
for describing the process of imaging by a system. The optical path length of a ray in a
certain medium is equal to its geometrical path length times the refractive index of the
medium. A surface passing through the end points of rays that have traveled equal optical
path lengths from a point object is called an optical wavefront. Accordingly, it is a
surface of constant phase. Using Fermat’s principle, we derive the Malus–Dupin theorem,
which states that a set of rays that are orthogonal to a wavefront remains so after
refraction by a refracting surface.
Let $W$ be a spherical wavefront of rays emanating from a point object $P$, as illustrated in Figure 1-4. In a uniform medium, the rays are orthogonal to the wavefront at the points of their intersection. When these rays are refracted by a refracting surface $S$ so that they all travel equal optical path lengths, a wavefront $W'$ is obtained. By definition of the wavefronts, we have
$$[AVA'] = [BQB'] , \tag{1-4}$$
where $V$ and $Q$ are the points of incidence of two neighboring rays $PAV$ and $PBQ$. From Fermat's principle, the optical path length $[AQA']$ of the virtual ray $AQA'$ may be written
$$[AQA'] = [AVA'] + O(\epsilon^2) , \tag{1-5a}$$
where $\epsilon = VQ$ is a small quantity. Substituting Eq. (1-5a) into Eq. (1-4), we obtain
$$[AQA'] = [BQB'] + O(\epsilon^2) . \tag{1-5b}$$
Because $BQ$ is perpendicular to the wavefront $W$ at the point $B$,
$$[AQ] = [BQ] + O(\epsilon^2) , \tag{1-6a}$$
where $AB$ is of the same order of magnitude as $VQ$. Subtracting Eq. (1-6a) from Eq. (1-5b), we obtain
$$[QA'] = [QB'] + O(\epsilon^2) , \tag{1-6b}$$
or the ray $QB'$ is perpendicular to the wavefront $W'$ at the point $B'$. If the wavefront $W'$ is refracted by another refracting surface, the refracted rays and the wavefront produced by it can again be shown to be orthogonal to each other.

Figure 1-4. Refraction of a spherical wavefront $W$ by a surface $S$ separating media of refractive indices $n$ and $n'$, showing that rays such as $BQ$ that are perpendicular to the wavefront $W$ remain perpendicular to the wavefront $W'$ after refraction.
It should be noted that although the incident wavefront $W$ is spherical with its center of curvature at $P$, the refracted wavefront $W'$ may or may not be spherical, depending on the shape of the refracting surface $S$. If $W'$ is spherical with its center of curvature at $P'$, then $S$ is called a Cartesian surface, and the points $P$ and $P'$ are called a Cartesian pair or perfect conjugates. Because the rays are perpendicular to the wavefront, an alternative definition of a perfect image is that all of the rays pass through the image point $P'$. For example, an ellipsoidal refracting surface with an eccentricity $e = n/n'$ separating media of refractive indices $n$ and $n'$ is a Cartesian surface for a collimated beam (see Problem 1.1). Similarly, an ellipsoidal mirror is a Cartesian surface for a point object placed at one of its two geometrical foci (see Problem 1.2).
If the wavefront $W'$ is not spherical, its deviations from a corresponding spherical surface are called the wave aberrations, and the distances of the points of intersection of the rays from $P'$ in an image plane passing through it are called the transverse ray aberrations. The distribution of rays in the image plane is called a spot diagram. In practice, we are generally interested in forming images of extended objects, not of just point sources. The task of a lens designer is to design systems with as few surfaces as possible, mutually balancing their aberrations to yield wavefronts in the image space that are close to being spherical over a wide region of the object space.
1.5 LAWS OF GEOMETRICAL OPTICS

In this section, we derive from Fermat's principle the three laws of geometrical
optics, namely, rectilinear propagation in a homogeneous medium, refraction at an
interface between two homogeneous media, and reflection at an interface. We first derive
them in 2D and then generalize them in 3D.
1.5.1 Rectilinear Propagation

In a homogeneous medium, i.e., one of uniform refractive index, a light ray propagates in a straight line, as indicated in Figure 1-5. This law is referred to as the law of rectilinear propagation. It is self-evident because a ray propagating from one point to another in a straight line joining the two points propagates along a path of the shortest optical path length. We note from Figure 1-5 that the difference in optical path lengths of a virtual (or fictitious) path $P_1BP_2$ and the actual path $P_1AP_2$ is given by
$$W(\epsilon) = n\left[(P_1B + BP_2) - (P_1A + AP_2)\right]$$
$$= n\left\{\left[(P_1A)^2 + \epsilon^2\right]^{1/2} + \left[(AP_2)^2 + \epsilon^2\right]^{1/2} - (P_1A + AP_2)\right\}$$
$$= n\left\{P_1A\left[1 + \frac{\epsilon^2}{2(P_1A)^2} + \ldots\right] + AP_2\left[1 + \frac{\epsilon^2}{2(AP_2)^2} + \ldots\right] - (P_1A + AP_2)\right\}$$
$$= O(\epsilon^2) , \tag{1-7}$$
where $\epsilon = AB$ is a small deviation of the virtual path from the actual, $n$ is the refractive index of the homogeneous medium, and $O(\epsilon^2)$ represents terms with powers of $\epsilon$ greater than or equal to two. As expected, there is no linear term in $\epsilon$; thus, the derivative of $W(\epsilon)$ with respect to $\epsilon$ in the limit of $\epsilon \to 0$ is zero.
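The second-order behavior of Eq. (1-7) can be illustrated numerically. This is a minimal sketch with hypothetical distances and refractive index; it takes AB perpendicular to the straight line P1P2, as the square roots in Eq. (1-7) imply, and compares W/ε² with the leading coefficient (n/2)(1/P1A + 1/AP2) read off the binomial expansion.

```python
import math

def path_difference(p1a, ap2, eps, n=1.0):
    """W(eps) of Eq. (1-7): optical path P1-B-P2 minus P1-A-P2, with AB = eps
    perpendicular to the line P1 P2."""
    virtual = math.hypot(p1a, eps) + math.hypot(ap2, eps)
    return n * (virtual - (p1a + ap2))

# Hypothetical values chosen only for illustration.
p1a, ap2, n = 2.0, 3.0, 1.5

# Coefficient of eps**2 in the expansion of Eq. (1-7).
leading = 0.5 * n * (1.0 / p1a + 1.0 / ap2)

# W / eps**2 should settle at `leading` as eps shrinks (no linear term in eps).
ratios = [path_difference(p1a, ap2, eps, n) / eps**2 for eps in (1e-1, 1e-2, 1e-3)]
```

If a linear term were present, the ratios would grow without bound as ε decreases instead of approaching a constant.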
Figure 1-5. Rectilinear propagation of a ray from a point $P_1$ to a point $P_2$.

1.5.2 Refraction in 2D
Consider refraction of a ray at an interface between two media of refractive indices $n$ and $n'$, as illustrated in Figure 1-6. The optical path length of a ray in propagating from a point $P$ to another point $P'$ after refraction at a point $A$ is given by
$$[PAP'] = n\left(a^2 + x^2\right)^{1/2} + n'\left[(b - x)^2 + c^2\right]^{1/2} . \tag{1-8}$$

Figure 1-6. Refraction of a ray. $PA$ is a ray incident on a planar surface separating media of refractive indices $n$ and $n'$ at an angle $\theta$ with the surface normal, and $AP'$ is the corresponding refracted ray at an angle $\theta'$.

If we displace the point $A$ by a small amount along the interface, the value of $x$ changes by that amount. According to Fermat's principle, the derivative of the optical path length with respect to $x$ is zero. Equating to zero the derivative of the right-hand side of Eq. (1-8) with respect to $x$, we obtain
$$0 = n\,\frac{x}{\left(a^2 + x^2\right)^{1/2}} - n'\,\frac{b - x}{\left[(b - x)^2 + c^2\right]^{1/2}}$$
$$= n \sin\theta - n' \sin\theta' ,$$
or
$$n' \sin\theta' = n \sin\theta , \tag{1-9}$$
where $\theta$ and $\theta'$ are the angles of incidence and refraction of the incident and refracted rays from the surface normal at the point of incidence. The incident ray, the refracted ray, and the surface normal are coplanar. Equation (1-9), along with coplanarity of the rays and the surface normal, is the law of refraction, also called Snell's law.
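The derivation can be checked by locating the stationary point of Eq. (1-8) numerically and then testing Eq. (1-9). The sketch below uses hypothetical values for a, b, c and the indices; the bisection on the derivative is simply one convenient way to find the stationary point and is not a method taken from the text.

```python
import math

def optical_path(x, a, b, c, n, n_prime):
    """Eq. (1-8): optical path length [PAP'] for a refraction point at x."""
    return n * math.hypot(a, x) + n_prime * math.hypot(b - x, c)

def stationary_x(a, b, c, n, n_prime, tol=1e-12):
    """Bisection on d[PAP']/dx, which increases monotonically on [0, b]."""
    def slope(x):
        return n * x / math.hypot(a, x) - n_prime * (b - x) / math.hypot(b - x, c)
    lo, hi = 0.0, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical geometry and indices (air to glass).
a, b, c, n, n_prime = 1.0, 2.0, 1.5, 1.0, 1.5
x = stationary_x(a, b, c, n, n_prime)
sin_theta = x / math.hypot(a, x)                   # sine of the angle of incidence
sin_theta_prime = (b - x) / math.hypot(b - x, c)   # sine of the angle of refraction
```

At the stationary point, `n * sin_theta` and `n_prime * sin_theta_prime` agree to within the bisection tolerance, and `optical_path` is larger at neighboring values of x.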
When light is incident normally on a surface so that the angle of incidence $\theta$ is zero, then the angle of refraction $\theta'$ is also zero. The angle of refraction increases as the angle of incidence increases. When light is incident from a medium of higher refractive index $n$ to a medium of lower index $n'$, the angle of refraction reaches its maximum value of 90° corresponding to an angle of incidence of $\sin^{-1}(n'/n)$. This angle of incidence is called the critical angle. Its value for a glass-to-air interface is 41.8°, as may be seen by letting $n = 1.5$ and $n' = 1$ in Eq. (1-9). When light is incident at an angle that is larger than the critical angle, it is reflected at the interface according to the law of reflection, discussed below. This phenomenon, called total internal reflection, is used in a right-angle reflecting prism, as illustrated in Figure 1-7. Such a prism is used in optical systems to deviate the path of a beam by 90°. Its diagonal face acts like a mirror because the rays are incident on it at an angle of 45° and undergo total internal reflection.

Figure 1-7. A right-angle reflecting prism. A parallel beam incident on it is reflected from its diagonal face as if it were a mirror.
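As a short worked example of the critical angle, the sketch below evaluates $\sin^{-1}(n'/n)$ for the glass-to-air values quoted in the text; the function name `critical_angle_deg` is hypothetical.

```python
import math

def critical_angle_deg(n, n_prime):
    """Critical angle in degrees for light going from index n to a lower index n_prime."""
    if n_prime >= n:
        raise ValueError("a critical angle exists only when n_prime < n")
    return math.degrees(math.asin(n_prime / n))

# Glass-to-air interface with the values used in the text.
glass_to_air = critical_angle_deg(1.5, 1.0)

# The 45-degree incidence on the diagonal face of the prism exceeds the critical angle.
prism_reflects_totally = 45.0 > glass_to_air
```

The result rounds to 41.8°, so the 45° incidence in the right-angle prism lies beyond the critical angle.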
1.5.3 Reflection in 2D
Now consider the reflection of a ray from a reflecting surface, as illustrated in Figure 1-8. The optical path length of a ray in propagating from a point $P$ to another point $P'$ after reflection at a point $A$ is given by
$$[PAP'] = n\,(PA - AP')$$
$$= n\left\{\left[(b - x)^2 + c^2\right]^{1/2} - \left(a^2 + x^2\right)^{1/2}\right\} , \tag{1-10}$$
where the refractive index associated with the reflected ray is $-n$ because of its backward propagation.

Figure 1-8. Reflection of a ray. $PA$ is a ray incident on a planar reflecting surface at an angle $\theta$ with the surface normal, and $AP'$ is the corresponding reflected ray at an angle $\theta'$.

Equating to zero the derivative of the right-hand side of Eq. (1-10) with respect to $x$, as in the case of refraction, we obtain
$$0 = \frac{x}{\left(a^2 + x^2\right)^{1/2}} + \frac{b - x}{\left[(b - x)^2 + c^2\right]^{1/2}}$$
$$= \sin\theta' + \sin\theta ,$$
or
$$\theta' = -\theta , \tag{1-11}$$
where $\theta$ and $\theta'$ are the angles of incidence and reflection that the incident and reflected rays make with the surface normal at the point of incidence, respectively. The angle $\theta'$ in Figure 1-8 is numerically negative, and so we have inserted a minus sign in Eq. (1-11). The incident ray, the reflected ray, and the surface normal are coplanar. No approximation is involved in Eq. (1-11). This equation along with the coplanarity of the rays is the law of reflection. It can be obtained from Eq. (1-9) by letting $n' = -n$ because the reflected ray lies in the same medium as the incident ray but travels backward compared to the incident ray.
1.5.4 Refraction in 3D
As illustrated in Figure 1-9, which is not in a plane, consider a ray originating at a point $P(\vec{r})$ incident on a refracting surface, separating media of refractive indices $n$ and $n'$, at a point $A_0$ and passing through a point $P'(\vec{r}\,')$ after refraction by the surface, where $\vec{r}$ and $\vec{r}\,'$ are the position vectors of the respective points with $O$ (not shown in the figure) as an arbitrary origin of coordinates. Given the incident ray $PA_0$, we want to determine the refracted ray $A_0P'$. Let $A$ be some point on the surface (not necessarily in the plane of the paper) in the vicinity of $A_0$. Imagine the vector $\vec{OA}$ to move on the surface along a curve through $A_0$ that obeys the equation $\vec{OA} = \vec{f}(u)$, where $u$ is the length of this curve from $A_0$. The optical path length of the ray from point $P$ to point $P'$ through the point $A$ is given by
$$[PAP'] = n\,PA + n'\,AP'$$
$$= n\left|\vec{f}(u) - \vec{r}\right| + n'\left|\vec{r}\,' - \vec{f}(u)\right| . \tag{1-12}$$

Figure 1-9. Refraction of a ray by a surface separating media of refractive indices $n$ and $n'$. $PA_0$ and $A_0P'$ are the actual incident and refracted rays, and $PA$ and $AP'$ are the corresponding nearby virtual rays. $\hat{v}$ is a unit vector along the normal to the surface at point $A_0$. $\theta$ and $\theta'$ are the angles the rays make with the surface normal, called the angles of incidence and refraction, respectively.

As the point $A$ moves, and therefore as $u$ varies, a whole family of paths is generated. From Fermat's principle, the true path is obtained by letting the optical path length be stationary, i.e., by letting
$$\left\{\frac{d}{du}\left[n\left|\vec{f}(u) - \vec{r}\right| + n'\left|\vec{r}\,' - \vec{f}(u)\right|\right]\right\}_{u=0} = 0 . \tag{1-13}$$
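Now,
$$\left|\vec{f}(u) - \vec{r}\right| = \left(\vec{f}\cdot\vec{f} + \vec{r}\cdot\vec{r} - 2\vec{f}\cdot\vec{r}\right)^{1/2} . \tag{1-14}$$
Therefore,
$$\frac{d}{du}\left|\vec{f}(u) - \vec{r}\right| = \frac{\left(\vec{f} - \vec{r}\right)\cdot\dfrac{d\vec{f}}{du}}{\left|\vec{f} - \vec{r}\right|}$$
$$= \hat{e}\cdot\frac{d\vec{f}}{du} , \tag{1-15}$$
where $\hat{e}$ is a unit vector along the ray $PA$ given by
$$\hat{e} = \frac{\vec{f} - \vec{r}}{\left|\vec{f} - \vec{r}\right|} . \tag{1-16}$$
Similarly,
$$\frac{d}{du}\left|\vec{r}\,' - \vec{f}(u)\right| = -\frac{\left(\vec{r}\,' - \vec{f}\right)\cdot\dfrac{d\vec{f}}{du}}{\left|\vec{r}\,' - \vec{f}\right|}$$
$$= -\hat{e}'\cdot\frac{d\vec{f}}{du} , \tag{1-17}$$
where $\hat{e}'$ is a unit vector along the refracted ray $AP'$ given by
$$\hat{e}' = \frac{\vec{r}\,' - \vec{f}}{\left|\vec{r}\,' - \vec{f}\right|} . \tag{1-18}$$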
Substituting Eqs. (1-15) and (1-17) into Eq. (1-13), we obtain
$$\left[\left(n\hat{e} - n'\hat{e}'\right)\cdot\frac{d\vec{f}}{du}\right]_{u=0} = 0 . \tag{1-19}$$
The vector $\left(d\vec{f}/du\right)_{u=0}$ is a tangent to the curve at $A_0$ but otherwise arbitrary. Accordingly, $n\hat{e} - n'\hat{e}'$ must be parallel to the surface normal $\hat{v}$ at $A_0$. Thus, we can write
$$n\hat{e} - n'\hat{e}' = b\hat{v} , \tag{1-20}$$
where $b$ is a constant. Because $\hat{e}'$ is a linear combination of $\hat{e}$ and $\hat{v}$, it must lie in the plane of incidence, defined as the plane containing the unit vectors $\hat{e}$ and $\hat{v}$. Thus, the incident and refracted rays, and the surface normal at the point of incidence, are coplanar. Taking a dot product of both sides of Eq. (1-20) with $\hat{v}$, we obtain
$$n\hat{e}\cdot\hat{v} - n'\hat{e}'\cdot\hat{v} = b ,$$
or
$$b = n\cos\theta - n'\cos\theta' , \tag{1-21}$$
where $\theta$ and $\theta'$ are the angles the incident and refracted rays make with the surface normal, known as the angles of incidence and refraction, respectively. Substituting for $b$ from Eq. (1-21) into Eq. (1-20), we obtain
$$n'\hat{e}' = n\hat{e} + \left(n'\cos\theta' - n\cos\theta\right)\hat{v} . \tag{1-22}$$
Similarly, taking a vector product with $\hat{v}$, we obtain
$$n\hat{e}\times\hat{v} - n'\hat{e}'\times\hat{v} = 0 ,$$
or
$$n'\sin\theta' = n\sin\theta . \tag{1-23}$$
Equation (1-23), together with the coplanarity of the incident and refracted rays and the surface normal, is the law of refraction, or Snell's law in 3D. Thus, a ray incident at an angle $\theta$ is refracted at an angle $\theta'$ such that the refracted ray lies in the plane of incidence. Substituting for $\cos\theta'$ from Eq. (1-23) into Eq. (1-22) yields the value of $\hat{e}'$ according to
$$n'\hat{e}' = n\hat{e} + \left[\left(n'^2 - n^2\sin^2\theta\right)^{1/2} - n\cos\theta\right]\hat{v} . \tag{1-24}$$
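Equation (1-24) translates directly into a small vector routine. The following is a minimal sketch, not a definitive implementation: the function name `refract` is hypothetical, vectors are plain tuples, and it assumes the normal is oriented so that $\hat{e}\cdot\hat{v} > 0$. Returning `None` for total internal reflection is a choice made for this sketch.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def refract(e, v, n, n_prime):
    """Unit vector of the refracted ray from Eq. (1-24).

    e: unit vector along the incident ray; v: unit surface normal with e . v > 0
    (an assumption of this sketch). Returns None when n sin(theta) > n_prime,
    i.e., when the ray is totally internally reflected.
    """
    cos_t = dot(e, v)
    radicand = n_prime**2 - n**2 * (1.0 - cos_t**2)
    if radicand < 0.0:
        return None
    b = math.sqrt(radicand) - n * cos_t      # n' cos(theta') - n cos(theta)
    return tuple((n * ei + b * vi) / n_prime for ei, vi in zip(e, v))

# Hypothetical example: 30-degree incidence from air into glass of index 1.5.
theta = math.radians(30.0)
e = (math.sin(theta), 0.0, math.cos(theta))
v = (0.0, 0.0, 1.0)
e_prime = refract(e, v, 1.0, 1.5)
```

For this input the component of `e_prime` perpendicular to the normal is sin 30°/1.5, which is Eq. (1-23), and the vector has unit length.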
1.5.5 Reflection in 3D
Consider a ray originating at a point $P(\vec{r})$, as in Figure 1-10, incident on a reflecting surface at a point $A_0$ and passing through a point $P'(\vec{r}\,')$ after reflection by the surface, with $O$ as an arbitrary origin of coordinates. This figure, like Figure 1-9, is also not in a plane. Given the incident ray $PA_0$, we want to determine the reflected ray $A_0P'$. Let $A$ be some point on the surface in the vicinity of $A_0$. Imagine that the vector $\vec{OA}$ moves on the surface along a curve through $A_0$ that obeys the equation $\vec{OA} = \vec{f}(u)$, where $u$ is the length of this curve from $A_0$. The optical path length of the ray from point $P$ to point $P'$, through the point $A$ in a medium of refractive index $n$, is given by
$$[PAP'] = n\,(PA - AP')$$
$$= n\left[\left|\vec{f}(u) - \vec{r}\right| - \left|\vec{r}\,' - \vec{f}(u)\right|\right] , \tag{1-25}$$
where the optical path length of the reflected ray $AP'$ is negative due to the negative refractive index associated with it.

Figure 1-10. Reflection of a ray by a reflecting surface in a medium of refractive index $n$. $PA_0$ and $A_0P'$ are the actual incident and reflected rays, and $PA$ and $AP'$ are the corresponding nearby virtual rays. $\hat{v}$ is a unit vector along the normal to the surface at point $A_0$. $\theta$ and $\theta'$ are the angles the rays make with the surface normal, called the angles of incidence and reflection.

As $A$ moves, and therefore as $u$ varies, a whole family of paths is generated. From Fermat's principle, the true path is obtained by letting the optical path length be stationary, i.e., by letting
$$\left\{\frac{d}{du}\left[\left|\vec{f}(u) - \vec{r}\right| - \left|\vec{r}\,' - \vec{f}(u)\right|\right]\right\}_{u=0} = 0 . \tag{1-26}$$
Using Eqs. (1-15) and (1-17), this becomes
$$\left[\left(\hat{e} + \hat{e}'\right)\cdot\frac{d\vec{f}}{du}\right]_{u=0} = 0 , \tag{1-27}$$
where $\hat{e}$ and $\hat{e}'$ are unit vectors along the incident and reflected rays, respectively.
The vector $\left(d\vec{f}/du\right)_{u=0}$ lies along the tangent to the curve at $A_0$. The curve is arbitrary so long as it passes through $A_0$ and stays on the surface. Therefore, $\hat{e} + \hat{e}'$ must be perpendicular to all tangents to the surface at $A_0$, or $\hat{e} + \hat{e}'$ must be along the normal $\hat{v}$ to the tangent plane at $A_0$. Thus, we can write
$$\hat{e} + \hat{e}' = a\hat{v} , \tag{1-28}$$
where $a$ is a constant, and conclude that $\hat{e}'$ must lie in the plane of incidence, defined as the plane containing $\hat{e}$ and $\hat{v}$. Thus, the incident and reflected rays, and the surface normal at the point of incidence, are coplanar. From the triangle $A_0A_1A_2$ in Figure 1-10, we find that $a = 2\cos\theta$, where $\theta$ is the angle of incidence of the ray. Substituting for $a$ into Eq. (1-28), we obtain
$$\hat{e}' = -\hat{e} + 2\hat{v}\cos\theta . \tag{1-29}$$
Because $\hat{e}$ and $\hat{e}'$ are unit vectors and, therefore, have the same length, they intersect $\hat{e} + \hat{e}'$ and, therefore, $\hat{v}$ at the same angle. Thus we obtain the law of reflection that the reflected ray makes the same angle with the surface normal at the point of incidence as the incident ray and lies in the plane of incidence. The reflection of a ray can be treated as a special case of refraction by letting $n' = -n$, as may be seen by comparing the corresponding equations, e.g., Eqs. (1-22) and (1-29).
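The properties claimed for Eq. (1-29) can be verified for a sample ray. The sketch below simply evaluates Eq. (1-29) as written, in the text's sign convention where the reflected ray is associated with the index $-n$; the function name and the 25° example are hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def reflect_eq_1_29(e, v):
    """e' = -e + 2 v cos(theta), Eq. (1-29), with cos(theta) = e . v."""
    cos_t = dot(e, v)
    return tuple(-ei + 2.0 * cos_t * vi for ei, vi in zip(e, v))

# Hypothetical incident unit vector at 25 degrees to the normal.
theta = math.radians(25.0)
e = (math.sin(theta), 0.0, math.cos(theta))
v = (0.0, 0.0, 1.0)                # unit normal at the point of incidence
e_prime = reflect_eq_1_29(e, v)

# e + e' should be along the normal, as required by Eq. (1-28).
e_sum = tuple(ei + pi for ei, pi in zip(e, e_prime))
```

The result is a unit vector, `e_sum` has no component perpendicular to the normal, and `e_prime` makes the same angle with the normal as `e`.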
1.6 EXACT RAY TRACING

Exact ray tracing is used to determine the wave and ray aberrations, and thereby the
spot sizes and diagrams, and the aberrated diffraction images. We discuss such ray
tracing here to illustrate its differences from the so-called paraxial ray tracing. Otherwise,
the approximations that are implicit in paraxial ray tracing remain hidden. Of course,
paraxial ray tracing, discussed in the next section, or imaging in the paraxial
approximation, leads to Gaussian optics.
1.6.1 Ray Incident on a Spherical Surface
Consider a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$, as in Figure 1-11, originating at a point object $A_0$ with position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$ incident at a point $A_1$ with a position vector $\vec{r}_1$ and coordinates $(x_1, y_1, z_1)$ on a spherical refracting surface of radius of curvature $R_1$ separating media of refractive indices $n_0$ and $n_1$. Let the distance between the object plane and the vertex $V_1$ of the surface be $D_{01}$ so that $z_0 = -D_{01}$. Because $A_1$ lies on the spherical surface with the origin at its vertex $V_1$,
$$x_1^2 + y_1^2 + \left(z_1 - R_1\right)^2 = R_1^2 \tag{1-30}$$
or
$$z_1 = R_1 - \left(R_1^2 - x_1^2 - y_1^2\right)^{1/2} . \tag{1-31}$$
The $z$ coordinate $z_1$ of a point on the surface represents the sag of the surface at that point.

Figure 1-11. Propagation of a ray from a point $A_0$ to a point $A_1$ on a spherical refracting surface of radius of curvature $R_1$ separating media of refractive indices $n_0$ and $n_1$; refraction of the ray at point $A_1$; and propagation of the refracted ray from point $A_1$ to a point $A_2$, where the ray meets a spherical refracting surface of radius of curvature $R_2$. $z_1$ and $z_2$ are the sags of the two surfaces, and $\hat{v}_1$ and $\hat{v}_2$ are unit vectors along their surface normals at the points of incidence, respectively, toward the centers of curvature $C_1$ and $C_2$. The angles of incidence and refraction of the ray are $\theta_0$ and $\theta_1$, respectively.
1.6.2 Rectilinear Propagation from the Object Plane to the First Refracting Surface
It is evident from Figure 1-11 that the position vectors $\vec{r}_1$ and $\vec{r}_0$ are related to each other according to
$$\vec{r}_1 = \vec{r}_0 + S_{01}\hat{e}_0 , \tag{1-32}$$
where
$$S_{01} = \left[\left(x_1 - x_0\right)^2 + \left(y_1 - y_0\right)^2 + \left(D_{01} + z_1\right)^2\right]^{1/2} \tag{1-33}$$
is the distance between $A_0$ and $A_1$. The sign of $S_{01}$ is the same as that of $D_{01} + z_1$, as may be seen by considering a ray incident at the vertex $V_1$. The transverse coordinates $(x_1, y_1)$ of $A_1$ can be written
$$x_1 = x_0 + S_{01}k_0 \tag{1-34a}$$
and
$$y_1 = y_0 + S_{01}l_0 . \tag{1-34b}$$
We note that to determine the coordinates $(x_1, y_1)$ from Eqs. (1-34), we need $S_{01}$, which itself depends on them through Eq. (1-33). Thus, these equations are coupled and must be solved simultaneously. Substituting Eqs. (1-31) and (1-34) into Eq. (1-33), we obtain a quadratic equation in $S_{01}$ in terms of the known quantities. Solving this equation and substituting the value thus obtained into Eqs. (1-34a) and (1-34b) yields the transverse coordinates $(x_1, y_1)$ of the ray at $A_1$. The transfer operation of the ray in propagating from point $A_0$ to point $A_1$ is described by Eqs. (1-31) and (1-34), along with Eq. (1-33).
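The transfer operation can be sketched as a ray-sphere intersection. The code below is an illustration under stated assumptions, not the text's own algebra: it writes the quadratic in $S_{01}$ directly from the sphere equation (1-30) with $\vec{r}_1 = \vec{r}_0 + S_{01}\hat{e}_0$, and it keeps the root that lies on the vertex side of the center of curvature. The function name `transfer_to_sphere` and the numbers in the example are hypothetical.

```python
import math

def transfer_to_sphere(r0, e0, radius):
    """Intersect the ray r0 + S*e0 with a sphere whose vertex is at the origin
    and whose center is at (0, 0, radius). Returns (S, point) or None."""
    px, py, pz = r0[0], r0[1], r0[2] - radius      # r0 relative to the center
    b = px * e0[0] + py * e0[1] + pz * e0[2]
    c = px * px + py * py + pz * pz - radius * radius
    disc = b * b - c                                # from S**2 + 2 b S + c = 0
    if disc < 0.0:
        return None                                 # the ray misses the sphere
    for s in (-b - math.sqrt(disc), -b + math.sqrt(disc)):
        point = tuple(r + s * e for r, e in zip(r0, e0))
        # Keep the intersection on the vertex side of the center of curvature.
        if -1e-12 <= point[2] / radius <= 1.0 + 1e-12:
            return s, point
    return None

# Hypothetical ray parallel to the axis at height x0 = 3, object plane at D01 = 10,
# meeting a surface of radius R1 = 5.
s01, a1 = transfer_to_sphere((3.0, 0.0, -10.0), (0.0, 0.0, 1.0), 5.0)
```

For this ray the sag from Eq. (1-31) is $5 - (25 - 9)^{1/2} = 1$, so $S_{01} = D_{01} + z_1 = 11$, which is what the routine returns.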
1.6.3 Refraction of a Ray by a Spherical Refracting Surface
The ray is refracted at $A_1$ by the refracting surface, according to Eq. (1-22). The unit vector $\hat{v}_1$ along its normal at $A_1$ is given by
$$\hat{v}_1 = \frac{\vec{A_1C_1}}{R_1}$$
$$= \frac{1}{R_1}\left(-x_1, -y_1, R_1 - z_1\right)$$
$$= \frac{1}{R_1}\left(-x_1, -y_1, \left(R_1^2 - x_1^2 - y_1^2\right)^{1/2}\right) . \tag{1-35}$$
Substituting Eq. (1-35) into Eq. (1-22), we obtain the direction cosines $(k_1, l_1, m_1)$ of the refracted ray with a unit vector $\hat{e}_1$:
$$n_1k_1 = n_0k_0 - \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{x_1}{R_1} , \tag{1-36a}$$
$$n_1l_1 = n_0l_0 - \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{y_1}{R_1} , \tag{1-36b}$$
and
$$n_1m_1 = n_0m_0 + \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{\left(R_1^2 - x_1^2 - y_1^2\right)^{1/2}}{R_1} , \tag{1-36c}$$
where $\theta_0$ and $\theta_1$ are the angles of incidence and refraction, respectively;
$$\cos\theta_0 = \hat{e}_0\cdot\hat{v}_1 \tag{1-37a}$$
or
$$\cos\theta_0 = \frac{1}{R_1}\left[-x_1k_0 - y_1l_0 + \left(R_1^2 - x_1^2 - y_1^2\right)^{1/2}\left(1 - k_0^2 - l_0^2\right)^{1/2}\right] ; \tag{1-37b}$$
and from Snell's law [see Eq. (1-23)],
$$\cos\theta_1 = \left(1 - \sin^2\theta_1\right)^{1/2}$$
$$= \left[1 - \left(n_0/n_1\right)^2\sin^2\theta_0\right]^{1/2} \tag{1-38a}$$
or
$$\cos\theta_1 = \frac{1}{n_1}\left(n_1^2 - n_0^2 + n_0^2\cos^2\theta_0\right)^{1/2} . \tag{1-38b}$$
Equations (1-36) through (1-38) describe the refraction operation of the ray at point $A_1$.
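The refraction operation can be sketched by chaining Eqs. (1-35), (1-37a), (1-38), and (1-36). This is a minimal illustration that assumes $n_1 > 0$ and a point already known to lie on the surface; the function name `refract_at_sphere` and the sample values are hypothetical.

```python
import math

def refract_at_sphere(point, e0, radius, n0, n1):
    """Direction cosines (k1, l1, m1) after refraction at (x1, y1, z1) on a
    spherical surface of radius `radius`; assumes n1 > 0."""
    x1, y1, _ = point
    k0, l0, m0 = e0
    root = math.sqrt(radius**2 - x1**2 - y1**2)
    normal = (-x1 / radius, -y1 / radius, root / radius)       # Eq. (1-35)
    cos0 = k0 * normal[0] + l0 * normal[1] + m0 * normal[2]    # Eq. (1-37a)
    radicand = n1**2 - n0**2 + (n0 * cos0)**2
    if radicand < 0.0:
        return None                                            # total internal reflection
    cos1 = math.sqrt(radicand) / n1                            # Snell's law, Eq. (1-38)
    g = n1 * cos1 - n0 * cos0
    # Eqs. (1-36): n1 e1 = n0 e0 + g * normal, component by component.
    return tuple((n0 * d + g * nu) / n1 for d, nu in zip(e0, normal))

# Hypothetical ray parallel to the axis hitting a surface of radius 5 at (3, 0, 1),
# going from air (n0 = 1) into glass (n1 = 1.5).
k1, l1, m1 = refract_at_sphere((3.0, 0.0, 1.0), (0.0, 0.0, 1.0), 5.0, 1.0, 1.5)
```

In this example sin θ0 = 0.6 and sin θ1 = 0.4, and the refracted ray is bent toward the axis (k1 is negative) while remaining a unit vector.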
1.6.4 Rectilinear Propagation from the First Refracting Surface to the Second
The refracted ray propagates in a straight line until it reaches a point $A_2$, with a position vector $\vec{r}_2$, on the next spherical refracting surface of radius of curvature $R_2$, with its vertex $V_2$ at a distance $D_{12}$ from the vertex $V_1$, separating media of refractive indices $n_1$ and $n_2$. The straight-line propagation of the ray from point $A_1$ to point $A_2$ can be obtained in a manner similar to the ray propagation from point $A_0$ to point $A_1$.
The sag $z_2$ of the surface is given by
$$z_2 = R_2 - \left(R_2^2 - x_2^2 - y_2^2\right)^{1/2} . \tag{1-39}$$
The position vector $\vec{r}_2$ of point $A_2$ is given by
$$\vec{r}_2 = \vec{r}_1 + S_{12}\hat{e}_1 , \tag{1-40}$$
where
$$S_{12} = \left[\left(x_2 - x_1\right)^2 + \left(y_2 - y_1\right)^2 + \left(D_{12} - z_1 + z_2\right)^2\right]^{1/2} \tag{1-41}$$
is the distance between the points $A_1$ and $A_2$. The sign of $S_{12}$ is the same as that of $D_{12} - z_1$, as may be seen by considering a ray incident on the vertex $V_2$.
The transverse coordinates $(x_2, y_2)$ of point $A_2$ where the ray meets the second surface are given by
$$x_2 = x_1 + S_{12}k_1 \tag{1-42a}$$
and
$$y_2 = y_1 + S_{12}l_1 . \tag{1-42b}$$
Equations (1-39) and (1-42), along with Eq. (1-41), describe the transfer operation of the ray from point $A_1$ to point $A_2$. Again, Eqs. (1-41) and (1-42) are coupled and have to be solved simultaneously.
1.6.5 Reflection of a Ray by a Spherical Reflecting Surface
The tracing of a ray reflected by a reflecting surface, as illustrated in Figure 1-12, can be treated in a similar manner. Consider a ray incident at a point $(x_1, y_1, z_1)$ on a reflecting surface of radius of curvature $R_1$ with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$. From Eqs. (1-29) and (1-35), the direction cosines $(k_1, l_1, m_1)$ of the reflected ray with a unit vector $\hat{e}_1$ are given by
$$k_1 = -k_0 - 2\cos\theta_0\,\frac{x_1}{R_1} , \tag{1-43a}$$
$$l_1 = -l_0 - 2\cos\theta_0\,\frac{y_1}{R_1} , \tag{1-43b}$$
and
$$m_1 = -m_0 + 2\cos\theta_0\,\frac{\left(R_1^2 - x_1^2 - y_1^2\right)^{1/2}}{R_1} , \tag{1-43c}$$
where $\theta_0$ is the angle of incidence of the ray, and $\cos\theta_0 = \hat{e}_0\cdot\hat{v}_1$ is given by Eq. (1-37). Equations (1-43), along with Eq. (1-37), describe the reflection operation of the ray at point $A_1$. These equations can be obtained from the corresponding Eqs. (1-36) for a refraction operation by letting $n_0 = 1 = -n_1$.

Figure 1-12. Reflection of a ray $A_0A_1$ by a reflecting spherical surface of radius of curvature $R_1$ at the point of incidence $A_1$, where $\theta_0$ is the angle of incidence of the ray, $z_1$ is the sag of the surface, and $\hat{v}_1$ is a unit vector along the surface normal at the point of incidence toward the center of curvature $C_1$.
1.6.6 Conic Surface and Surface Normal
The equation of a conic surface (conicoid) of eccentricity $e$ and vertex radius of curvature $R$ is given by [1]
$$x^2 + y^2 - 2Rz + \left(1 - e^2\right)z^2 = 0 , \tag{1-44}$$
where $(x, y, z)$ are the coordinates of a point on its surface with its origin at its vertex. The sag of the surface, namely, the $z$ coordinate, is given by
$$z = \frac{\left(x^2 + y^2\right)/R}{1 + \left[1 - \left(1 - e^2\right)\left(x^2 + y^2\right)/R^2\right]^{1/2}} . \tag{1-45}$$
If we let $e = 0$, as for a sphere, Eqs. (1-44) and (1-45) reduce to the corresponding equations (1-30) and (1-31) for a spherical surface, respectively. In lens design, it is quite common to use the curvature $c$ in place of $1/R$ and the Schwarzschild constant $k$ in place of $-e^2$.
The conic equation (1-44) can also be written in the form
$$F(x, y, z) = \frac{1}{R}\left[x^2 + y^2 + \left(1 - e^2\right)z^2\right] - 2z = 0 . \tag{1-46}$$
The unit vector along the normal to the surface at the point $(x, y, z)$ can be written
$$\hat{v} = \frac{-\left(\partial F/\partial x,\ \partial F/\partial y,\ \partial F/\partial z\right)}{\left[\left(\partial F/\partial x\right)^2 + \left(\partial F/\partial y\right)^2 + \left(\partial F/\partial z\right)^2\right]^{1/2}}$$
$$= \frac{1}{V}\left[\frac{-x}{R},\ \frac{-y}{R},\ 1 - \left(1 - e^2\right)\frac{z}{R}\right] , \tag{1-47}$$
where the minus sign in the first equation represents the fact that the surface normal is toward the vertex center of curvature, and
$$V = \left[1 + 2e^2\left(z/R\right) - e^2\left(1 - e^2\right)\left(z/R\right)^2\right]^{1/2} . \tag{1-48}$$
Letting $e = 0$, it can be seen that $V \to 1$, and Eq. (1-47) reduces to the unit vector given by Eq. (1-35) for a spherical surface.
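The sag and normal of a conic surface are easy to evaluate and cross-check. The sketch below implements Eqs. (1-45), (1-47), and (1-48); the function names and sample values are hypothetical. Consistency with Eq. (1-44) and with the spherical limit serves as a sanity check.

```python
import math

def conic_sag(x, y, radius, ecc):
    """Sag z of a conic surface, Eq. (1-45)."""
    rho2 = x * x + y * y
    return (rho2 / radius) / (1.0 + math.sqrt(1.0 - (1.0 - ecc**2) * rho2 / radius**2))

def conic_normal(x, y, radius, ecc):
    """Unit normal of Eq. (1-47), with the normalization V of Eq. (1-48)."""
    t = conic_sag(x, y, radius, ecc) / radius
    big_v = math.sqrt(1.0 + 2.0 * ecc**2 * t - ecc**2 * (1.0 - ecc**2) * t * t)
    return (-x / (big_v * radius),
            -y / (big_v * radius),
            (1.0 - (1.0 - ecc**2) * t) / big_v)

# Spherical limit (e = 0): the sag and normal of Eqs. (1-31) and (1-35).
sphere_sag = conic_sag(3.0, 0.0, 5.0, 0.0)
sphere_normal = conic_normal(3.0, 0.0, 5.0, 0.0)

# Hypothetical ellipsoid (e = 0.5): the point must satisfy Eq. (1-44).
x, y, radius, ecc = 1.0, 2.0, 10.0, 0.5
z = conic_sag(x, y, radius, ecc)
residual = x * x + y * y - 2.0 * radius * z + (1.0 - ecc**2) * z * z
```

For a paraboloid (e = 1), `conic_sag` reduces to $(x^2 + y^2)/(2R)$, and in every case the vector returned by `conic_normal` has unit length.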
1.6.7 Refraction of a Ray by a Conic Refracting Surface
When a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$ originating at a point with a position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$ is incident on a conic surface of eccentricity $e_1$ and vertex radius of curvature $R_1$, the point of incidence $(x_1, y_1, z_1)$ is still given by Eqs. (1-34), except that the sag value, following Eq. (1-45), is given by
$$z_1 = \frac{\left(x_1^2 + y_1^2\right)/R_1}{1 + \left[1 - \left(1 - e_1^2\right)\left(x_1^2 + y_1^2\right)/R_1^2\right]^{1/2}} . \tag{1-49}$$
Equation (1-49) is substituted into Eq. (1-33) for the distance $S_{01}$ between the two points.
Substituting Eq. (1-47) into Eq. (1-22), we obtain the direction cosines of the refracted ray:
$$n_1k_1 = n_0k_0 - \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{x_1}{V_1R_1} , \tag{1-50a}$$
$$n_1l_1 = n_0l_0 - \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{y_1}{V_1R_1} , \tag{1-50b}$$
and
$$n_1m_1 = n_0m_0 + \left(n_1\cos\theta_1 - n_0\cos\theta_0\right)\frac{1}{V_1}\left[1 - \left(1 - e_1^2\right)\frac{z_1}{R_1}\right] , \tag{1-50c}$$
where, following Eq. (1-48), $V_1$ is given by
$$V_1 = \left[1 + 2e_1^2\left(z_1/R_1\right) - e_1^2\left(1 - e_1^2\right)\left(z_1/R_1\right)^2\right]^{1/2} , \tag{1-51}$$
$$\cos\theta_0 = \hat{e}_0\cdot\hat{v}_1 , \tag{1-52a}$$
or
$$\cos\theta_0 = \frac{1}{V_1}\left\{-k_0\frac{x_1}{R_1} - l_0\frac{y_1}{R_1} + \left(1 - k_0^2 - l_0^2\right)^{1/2}\left[1 - \left(1 - e_1^2\right)\frac{z_1}{R_1}\right]\right\} , \tag{1-52b}$$
and $\cos\theta_1$, obtained from Snell's law, is given by Eq. (1-38).
1.6.8 Reflection of a Ray by a Conic Reflecting Surface
When a ray with a unit vector $\hat{e}_0$ and direction cosines $(k_0, l_0, m_0)$ originating at a point with a position vector $\vec{r}_0$ and coordinates $(x_0, y_0, z_0)$ is reflected by a conic reflecting surface of eccentricity $e_1$ and vertex radius of curvature $R_1$, the direction cosines $(k_1, l_1, m_1)$ of the reflected ray with a unit vector $\hat{e}_1$ are given by [see Eqs. (1-29) and (1-47)]
$$k_1 = -k_0 - 2\cos\theta_0\,\frac{x_1}{V_1R_1} , \tag{1-53a}$$
$$l_1 = -l_0 - 2\cos\theta_0\,\frac{y_1}{V_1R_1} , \tag{1-53b}$$
and
$$m_1 = -m_0 + 2\cos\theta_0\,\frac{1}{V_1}\left[1 - \left(1 - e_1^2\right)\frac{z_1}{R_1}\right] , \tag{1-53c}$$
where $\theta_0$ is the angle of incidence of the ray, and $\cos\theta_0 = \hat{e}_0\cdot\hat{v}_1$ is given by Eq. (1-52). For $e_1 = 0$, Eqs. (1-53) reduce to Eqs. (1-43) for a spherical reflecting surface.
1.6.9 Tracing a Tangential Ray

The plane passing through a point object and the optical axis is called the tangential
(or the meridional) plane. For a point object lying on the x axis, the zx plane is the
tangential plane. For a ray incident in this plane, called a tangential ray, both y 0 and l 0
are equal to zero. The y coordinate y1 of the point of incidence of this ray on a refracting
or a reflecting surface is also zero, as may also be seen from Eq. (1-34b). As a result, the
direction cosine l1 of the refracted or reflected ray is also equal to zero, as may be seen
from Eqs. (1-36b) and (1-43b), respectively. Accordingly, a tangential ray remains
tangential after refraction or reflection. This result is simply a consequence of the
coplanarity of the incident ray, surface normal, and the refracted or reflected ray. Note
that because the point of incidence of the tangential ray lies in the tangential plane, the
surface normal at this point also lies in this plane.
1.6.10 Determining Wave and Ray Aberrations

Exact ray tracing is used to determine the wave and transverse ray aberrations of a
system. To determine the wave aberrations, the rays originating at a point object are
traced through the system by repeating the process of transfer and refraction and/or
reflection operations until the rays reach a reference sphere. The reference sphere is centered at the Gaussian image point and passes through the center of the exit pupil of the system. The differences between the optical path lengths of the rays ending on the reference sphere and that of some reference ray, often the chief ray, which passes through the center of the exit pupil, are the wave aberrations for the point object under consideration. To determine the transverse ray aberrations, the rays are
traced up to the image plane. When the rays end on a planar surface, the ray-tracing
equations are modified by letting the radius of curvature of the last surface approach
infinity and its sag approach zero. The ray intercepts in the image plane from some
reference point, often the Gaussian image point, are the transverse ray aberrations. The
distribution of the rays in an observation plane is called the ray spot diagram. Of course,
the wave and ray aberrations can be obtained from each other because of the relationship
between them, which is discussed in Section 8.2.2 [see Eq. (8-5)].
1.7 PARAXIAL RAY TRACING

If the rays make small angles with the optical axis and surface normals, we can
approximate their sines and tangents with the angles themselves. Similarly, if the
transverse coordinates ( x , y ) of a point on a surface are much smaller than its radius of
curvature, we can approximate its diagonal distance from another point by the
corresponding axial distance. The ray tracing performed under such assumptions is called
1.7 Paraxial Ray Tracing 25
paraxial ray tracing. In this section, we first list the relevant assumptions and then the
consequent paraxial ray-tracing equations. In each case, we write the exact equation
followed by its paraxial or approximate form to highlight their differences.
1.7.1 Snell’s Law
The angles of incidence and refraction are assumed to be small so that their sines are
equal to the respective angles, i.e., $\sin\theta \sim \theta$ and $\sin\theta' \sim \theta'$. Therefore, Snell’s law,
$$ n'\sin\theta' = n\sin\theta , \tag{1-54a} $$
is approximated by
$$ n'\theta' \sim n\theta . \tag{1-54b} $$
Of course, the incident and refracted rays, and the surface normal at the point of
incidence on the refracting surface, are coplanar.
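To get a feel for how quickly the paraxial form of Snell's law departs from the exact one, the two can be compared numerically. The following is a minimal sketch; the function names and the air-to-glass indices (1.0 and 1.5) are illustrative assumptions, not values taken from the text.

```python
import math


def refraction_angle_exact(theta, n, n_prime):
    """Exact Snell's law, Eq. (1-54a): n' sin(theta') = n sin(theta)."""
    return math.asin(n * math.sin(theta) / n_prime)


def refraction_angle_paraxial(theta, n, n_prime):
    """Paraxial Snell's law, Eq. (1-54b): n' theta' ~ n theta."""
    return n * theta / n_prime


# Assumed illustrative indices: refraction from air into glass.
n, n_prime = 1.0, 1.5
for degrees in (1, 5, 10, 20, 40):
    theta = math.radians(degrees)
    exact = refraction_angle_exact(theta, n, n_prime)
    paraxial = refraction_angle_paraxial(theta, n, n_prime)
    rel_error = (paraxial - exact) / exact
    print(f"theta = {degrees:2d} deg   exact = {math.degrees(exact):8.4f} deg   "
          f"paraxial = {math.degrees(paraxial):8.4f} deg   rel. error = {rel_error:.2e}")
```

The relative error grows roughly as the square of the angle of incidence, which is why the approximation is restricted to rays making small angles with the surface normal.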
1.7.2 Point on a Spherical Surface
Coordinates $(x, y, z)$ of a point on a spherical surface of radius of curvature $R$:
$$ (x, y, z) = \left(x, y, R - \sqrt{R^2 - x^2 - y^2}\right) \tag{1-55a} $$
$$ \sim (x, y, 0) . \tag{1-55b} $$
The point is assumed to be close to the optical axis so that the sag z of the point on the
surface is negligible. Thus, a spherical surface is replaced by a planar surface, called the
tangent plane or the paraxial surface, passing through the surface vertex.
1.7.3 Distance between Two Points
Replace the diagonal distances, such as $S_{12}$ between two points on two surfaces in
Figure 1-11, with the corresponding axial distance $D_{12}$:
$$ S_{12} = \left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (D_{12} - z_1 + z_2)^2\right]^{1/2} \tag{1-56a} $$
$$ \sim D_{12} . \tag{1-56b} $$
Thus, the distance between the two points is replaced by the distance between the two
tangent planes, i.e., it is approximated by the distance between their vertices. It has the
implication that the ray between two points is nearly parallel and close to the optical axis.
1.7.4 Unit Vector along a Surface Normal
Unit vector $\hat{v}$ along the surface normal at a point $(x, y, z)$ on a spherical surface of
radius of curvature $R$:
$$ \hat{v} = \frac{1}{R}\left(-x, -y, \sqrt{R^2 - x^2 - y^2}\right) \tag{1-57a} $$
$$ \sim \left(-\frac{x}{R}, -\frac{y}{R}, 1\right) . \tag{1-57b} $$
Thus, the direction cosines $x/R$ and $y/R$ are very small so that the normal is practically
parallel to the $z$ axis, which is the optical axis.
1.7.5 Unit Vector along a Ray
The unit vector $(k, l, m)$ along a ray is:
$$ (k, l, m) = \left(k, l, \sqrt{1 - k^2 - l^2}\right) \tag{1-58a} $$
$$ \sim (k, l, 1) . \tag{1-58b} $$
Thus, the ray is assumed to be practically parallel to the $z$ axis.
With these approximations, the transfer, refraction, and reflection operation
Eqs. (1-34), (1-36), and (1-43) are simplified as follows.
1.7.6 Transfer of a Ray
Transfer of a ray with direction cosines $(k_0, l_0, m_0)$ from a point $A_0(x_0, y_0)$ to a point
$A_1(x_1, y_1)$ at a distance $S_{01}$, as in Figure 1-11, approximated by the corresponding axial
distance $D_{01}$:
$$ x_1 = x_0 + S_{01}k_0 \tag{1-59a} $$
$$ \sim x_0 + D_{01}k_0 , \tag{1-59b} $$
and
$$ y_1 = y_0 + S_{01}l_0 \tag{1-60a} $$
$$ \sim y_0 + D_{01}l_0 . \tag{1-60b} $$
Approximating the distance $S_{01}$ between two points by their axial distance $D_{01}$
decouples, for example, Eqs. (1-33) and (1-34).
1.7.7 Refraction of a Ray
Refraction of a ray with direction cosines $(k_0, l_0, m_0)$ by a spherical refracting
surface of radius of curvature $R_1$ separating media of refractive indices $n_0$ and $n_1$ at a
point $(x_1, y_1)$ to a refracted ray with direction cosines $(k_1, l_1, m_1)$, as in Figure 1-11:
$$ n_1 k_1 = n_0 k_0 - (n_1\cos\theta_1 - n_0\cos\theta_0)\frac{x_1}{R_1} \tag{1-61a} $$
$$ \sim n_0 k_0 - (n_1 - n_0)\frac{x_1}{R_1} , \tag{1-61b} $$
and
$$ n_1 l_1 = n_0 l_0 - (n_1\cos\theta_1 - n_0\cos\theta_0)\frac{y_1}{R_1} \tag{1-62a} $$
$$ \sim n_0 l_0 - (n_1 - n_0)\frac{y_1}{R_1} , \tag{1-62b} $$
where the angles of incidence and refraction $\theta_0$ and $\theta_1$ are small so that
$\cos\theta_1 \sim 1 \sim \cos\theta_0$.
1.7.8 Reflection of a Ray
Reflection of a ray with direction cosines $(k_0, l_0, m_0)$ by a spherical reflecting
surface of radius of curvature $R_1$ at a point $(x_1, y_1)$ to a reflected ray with direction
cosines $(k_1, l_1, m_1)$, as in Figure 1-12:
$$ k_1 = -k_0 - 2\cos\theta_0\,\frac{x_1}{R_1} \tag{1-63a} $$
$$ \sim -k_0 - 2\frac{x_1}{R_1} , \tag{1-63b} $$
and
$$ l_1 = -l_0 - 2\cos\theta_0\,\frac{y_1}{R_1} \tag{1-64a} $$
$$ \sim -l_0 - 2\frac{y_1}{R_1} . \tag{1-64b} $$
Again, the angles of incidence and reflection, which are equal to each other in magnitude,
are assumed to be small so that their cosine is unity.
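The paraxial operations of Sections 1.7.6 to 1.7.8 can be collected into three short functions. This is only a sketch: the function names and all numerical values below are assumptions chosen for illustration. Each function acts on a single transverse coordinate and its direction cosine, so the same code serves the $(x, k)$ and the $(y, l)$ pairs.

```python
def transfer(x, k, distance):
    """Paraxial transfer, Eq. (1-59b): x1 ~ x0 + D01 * k0."""
    return x + distance * k


def refract(x, k, n0, n1, radius):
    """Paraxial refraction, Eq. (1-61b): n1 * k1 ~ n0 * k0 - (n1 - n0) * x1 / R1."""
    return (n0 * k - (n1 - n0) * x / radius) / n1


def reflect(x, k, radius):
    """Paraxial reflection, Eq. (1-63b): k1 ~ -k0 - 2 * x1 / R1."""
    return -k - 2.0 * x / radius


# Assumed illustrative values: a ray leaving (x0, y0) = (1, -2) with direction
# cosines (k0, l0) = (0.02, 0.01), a surface at an axial distance of 100 with a
# radius of curvature of 50, and refractive indices 1.0 and 1.5.
x0, y0, k0, l0 = 1.0, -2.0, 0.02, 0.01
d01, r1, n0, n1 = 100.0, 50.0, 1.0, 1.5

# The x and y projections are traced independently of each other.
x1, y1 = transfer(x0, k0, d01), transfer(y0, l0, d01)
k1, l1 = refract(x1, k0, n0, n1, r1), refract(y1, l0, n0, n1, r1)
print("refracted ray:", (x1, k1), (y1, l1))
print("reflected ray:", (x1, reflect(x1, k0, r1)), (y1, reflect(y1, l0, r1)))
```

Because every operation is linear in the height and the direction cosine, tracing the sum of two rays gives the sum of the two traced rays.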
1.8 GAUSSIAN APPROXIMATION AND IMAGING

1.8.1 Gaussian Approximation
We have seen that for rays and surface normals to refracting and reflecting surfaces
making small angles with the optical axis, we can replace the sines and tangents of the
angles of the rays with the optical axis by the angles, and any diagonal distances between
two points by the corresponding axial distances. Because the sine of an angle is replaced
by the angle itself, which is the first-order approximation, ray tracing in this
approximation is referred to as first-order optics. The approximation of small angles is
also called the Gaussian approximation. It is used to determine the image location and
size in terms of the object location and size. The image is referred to as the Gaussian
image, and the process of determining the image in this manner, regardless of the
magnitude of the angles and sizes, is called Gaussian optics. However, the larger the
angles and sizes are, the coarser the approximation, yielding poorer image quality due to
the larger aberrations.
It is important to note that in paraxial ray-tracing equations of transfer, refraction,
and reflection, the coordinates of a point and the direction cosines of a ray depend
linearly on each other. Moreover, the $x$ and $k$ values at a point depend only on the $x$ and $k$
values at other points [see, for example, Eqs. (1-59b) and (1-61b)]. They are completely
independent of the $y$ and $l$ values, unlike in exact ray tracing where the $(x; k)$ and $(y; l)$
pairs of variables are coupled with each other [see, for example, Eqs. (1-36a) and
(1-37b)]. Accordingly, the projections of a skew ray in the zx and yz planes propagate
independently of each other. A ray in the zx plane, for example, remains in the zx plane
after refraction or reflection. Because of the rotational symmetry, it has the consequence
that we need to trace rays only in one of these planes. In the following chapters, we will
assume a point object along the x axis and trace rays in the tangential plane zx.
It is quite common in optics literature to consider a point object along the y axis
when imaged by a rotationally symmetric optical system, thus making the yz plane the
tangential plane [2,3]. To maintain symmetry of the aberration function about this plane,
the polar angle $\theta$ of a pupil point is accordingly defined as the angle made by its position
vector with the y axis, contrary to the standard Cartesian convention as the angle with the
x axis. Choosing a point object along the x axis, thus making the zx plane the tangential
plane, removes this difficulty. The coma aberration, for example, is then expressed as
$x\left(x^2 + y^2\right)$ instead of as $y\left(x^2 + y^2\right)$.
In Gaussian optics, the aberrations of an image are neglected, or, equivalently, the
image is assumed to be aberration free. Gaussian imaging depends only on the vertex
radius of curvature. Thus, the Gaussian image formed by a conic surface of some vertex
radius of curvature is exactly the same as that formed by a spherical surface of the same
radius of curvature.
1.8.2 Gaussian Imaging by a Refracting Surface
Consider a ray in the zx plane, as in Figure 1-13, originating at a point object $A_0$ at a
height $x_0$ with direction cosine $k_0$ incident at a point $A_1$ at an axial distance $D_{01}$ at a
height $x_1$ on the tangent plane $V_1A_1$ of a refracting surface of vertex radius of curvature
$R_1$ separating media of refractive indices $n_0$ and $n_1$, where $n_1 > n_0$, as a result of which
the incident ray refracts toward the surface normal (resulting in a smaller angle of
refraction compared to the angle of incidence). It should be noted that the Gaussian
imaging of an object depends only on the vertex radius of curvature. Therefore, it is not
necessary to specify the shape of the surface, e.g., spherical or conic, to determine the
location and size of the Gaussian image of an object.
In the paraxial approximation, we use Eq. (1-59b) to obtain the height $x_1$ of the point
of incidence and Eq. (1-61b) to obtain the direction cosine $k_1$ of the refracted ray. If a ray
makes an angle $\alpha$ with the x axis, then its direction cosine $k = \cos\alpha$. Let the ray make
an angle $\beta$ with the z axis, called its slope angle, where $\alpha + \beta = \pi/2$. Evidently,
$\cos\alpha = \sin\beta$. For small slope angles, $\sin\beta \sim \beta$, and thus, $\cos\alpha \sim \beta$. Therefore, if $\beta_0$
and $\beta_1$ are the slope angles of the incident and the refracted rays, respectively, we can
write the ray-tracing equations (1-59b) and (1-61b) as
$$ x_1 = x_0 + D_{01}\beta_0 \tag{1-65} $$
Figure 1-13. Gaussian imaging by a refracting surface. A ray from a point object $A_0$
at a height $x_0$ is incident at a slope angle $\beta_0$ at a point $A_1$ at a height $x_1$ on the
tangent plane $V_1A_1$ of a refracting surface of vertex radius of curvature $R_1$, with its
center of curvature $C_1$ and separating media of refractive indices $n_0$ and $n_1$. The
ray is refracted at a slope angle $\beta_1$, forming the Gaussian image at a point $A_2$ at a
height $x_2$. The numerically negative quantities are indicated by a parenthetical
minus sign (–).
and
$$ n_1\beta_1 = n_0\beta_0 - (n_1 - n_0)\frac{x_1}{R_1} , \tag{1-66} $$
where $\beta_0$ and $\beta_1$ are the slope angles of the incident and the refracted rays, respectively.
Similarly, the height $x_2$ of a point $A_2$ on the refracted ray at an axial distance $D_{12}$ is
given by
$$ x_2 = x_1 + D_{12}\beta_1 . \tag{1-67} $$
Both $\beta_1$ and $x_2$ are numerically negative in Figure 1-13. Substituting for $x_1$ and $\beta_1$ from
Eqs. (1-65) and (1-66), we obtain
$$ x_2 = \left[1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1)\right]x_0 + \left[D_{01} + \frac{n_0}{n_1}D_{12} + \frac{D_{01}D_{12}}{n_1 R_1}(n_0 - n_1)\right]\beta_0 . \tag{1-68} $$
In order that the point $A_2$ be the image of the point object $A_0$, all of the rays
originating at the point object and incident on the refracting surface must pass through the
image point after refraction. Thus, $x_2$ must be independent of $\beta_0$, or the coefficient of
$\beta_0$ in Eq. (1-68) must be zero. Therefore,
$$ D_{01} + \frac{n_0}{n_1}D_{12} + \frac{D_{01}D_{12}}{n_1 R_1}(n_0 - n_1) = 0 . \tag{1-69} $$
Equation (1-69) can be rearranged and written in the form
$$ \frac{n_0}{D_{01}} + \frac{n_1}{D_{12}} = \frac{n_1 - n_0}{R_1} . \tag{1-70} $$
This equation is called the Gaussian imaging equation for the refracting surface. It gives
the distance $D_{12}$ of the image from the surface in terms of the distance $D_{01}$ of the surface
from the object.
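The defining property of the image point, namely that $x_2$ does not depend on $\beta_0$, can be checked numerically. The sketch below solves Eq. (1-70) for $D_{12}$ and then traces a small fan of rays with Eqs. (1-65) to (1-67). The function names and the numerical values are assumptions made for illustration only.

```python
def image_distance(d01, n0, n1, r1):
    """Solve Eq. (1-70), n0/D01 + n1/D12 = (n1 - n0)/R1, for D12."""
    return n1 / ((n1 - n0) / r1 - n0 / d01)


def trace_height(x0, beta0, d01, d12, n0, n1, r1):
    """Height x2 reached by a paraxial ray at an axial distance D12 behind the surface."""
    x1 = x0 + d01 * beta0                              # Eq. (1-65)
    beta1 = (n0 * beta0 - (n1 - n0) * x1 / r1) / n1    # Eq. (1-66)
    return x1 + d12 * beta1                            # Eq. (1-67)


# Assumed illustrative values: air-to-glass surface with R1 = 50, object at a
# distance of 200 in front of the surface and at a height of 2.
n0, n1, r1, d01, x0 = 1.0, 1.5, 50.0, 200.0, 2.0
d12 = image_distance(d01, n0, n1, r1)
heights = [trace_height(x0, beta0, d01, d12, n0, n1, r1)
           for beta0 in (-0.02, 0.0, 0.01, 0.03)]
print("D12 =", d12)
print("x2 for each slope angle:", heights)
```

All rays of the fan arrive at the same height to within rounding error, whereas at any other axial distance the heights would depend on $\beta_0$.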
Because of Eq. (1-69), Eq. (1-68) reduces to
$$ x_2 = \left[1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1)\right]x_0 , \tag{1-71} $$
which gives the height $x_2$ of the image of an object of height $x_0$. The ratio of $x_2$ to $x_0$ is
called the transverse magnification of the image. Thus,
$$ M_x = \frac{x_2}{x_0} \tag{1-72a} $$
$$ = 1 + \frac{D_{12}}{n_1 R_1}(n_0 - n_1) \tag{1-72b} $$
or
$$ M_x = -\frac{n_0}{n_1}\frac{D_{12}}{D_{01}} , \tag{1-72c} $$
where we have used Eq. (1-70) in the last step. Equations (1-70) and (1-72) describe the
location and size of the image in terms of the corresponding quantities for the object.
Substituting for $x_1$ from Eq. (1-65) into Eq. (1-66), we obtain
$$ \beta_1 = \frac{(n_0 - n_1)}{n_1 R_1}x_0 + \left(\frac{n_0}{n_1} + \frac{n_0 - n_1}{n_1 R_1}D_{01}\right)\beta_0 . \tag{1-73} $$
If we consider a cone of rays of angular subtense $\Delta\beta_0$ diverging from the point object
and incident on the surface, the corresponding angular subtense $\Delta\beta_1$ of the cone of rays
converging to the image point can be obtained by differentiating Eq. (1-73):
$$ M_\beta = \frac{\partial\beta_1}{\partial\beta_0} \tag{1-74a} $$
$$ = \frac{n_0}{n_1} + \frac{n_0 - n_1}{n_1 R_1}D_{01} , \tag{1-74b} $$
or
$$ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-74c} $$
where we have again used Eq. (1-70) in the last step. Equation (1-74) gives the angular
magnification of the rays. The product of the transverse and angular magnifications is
given by
$$ M_x M_\beta = \frac{n_0}{n_1} , \tag{1-75} $$
which is independent of the object and image distances. It illustrates that a large
transverse magnification is accompanied by a small angular magnification.
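As a quick numerical cross-check of Eqs. (1-72), (1-74), and (1-75), the two forms of each magnification can be evaluated side by side. This is a sketch with assumed function names and assumed numerical values.

```python
def image_distance(d01, n0, n1, r1):
    """Solve Eq. (1-70) for the image distance D12."""
    return n1 / ((n1 - n0) / r1 - n0 / d01)


def transverse_magnification(d01, d12, n0, n1):
    """Eq. (1-72c): Mx = -(n0 / n1) * (D12 / D01)."""
    return -(n0 / n1) * (d12 / d01)


def angular_magnification(d01, d12):
    """Eq. (1-74c): M_beta = -D01 / D12."""
    return -d01 / d12


# Assumed illustrative values.
n0, n1, r1, d01 = 1.0, 1.5, 50.0, 200.0
d12 = image_distance(d01, n0, n1, r1)

m_x = transverse_magnification(d01, d12, n0, n1)
m_beta = angular_magnification(d01, d12)

# The same quantities from the forms that do not use the imaging equation.
m_x_alt = 1.0 + d12 * (n0 - n1) / (n1 * r1)          # Eq. (1-72b)
m_beta_alt = n0 / n1 + (n0 - n1) * d01 / (n1 * r1)   # Eq. (1-74b)

print("Mx      :", m_x, m_x_alt)
print("M_beta  :", m_beta, m_beta_alt)
print("product :", m_x * m_beta, "  n0/n1 :", n0 / n1)
```

Changing `d01` changes both magnifications, but their product stays at $n_0/n_1$ as long as `d12` is recomputed from the imaging equation.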
1.8.3 Gaussian Imaging by a Reflecting Surface
Consider a ray in the zx plane, as in Figure 1-14, originating at a point object $A_0$ at a
height $x_0$ with direction cosine $k_0$ incident at a point $A_1$ at an axial distance $D_{01}$ at a
height $x_1$ on the tangent plane $V_1A_1$ of a reflecting surface of vertex radius of curvature
$R_1$. In the paraxial approximation, we use Eq. (1-59b) to obtain the height $x_1$ of the point
of incidence, and Eq. (1-63b) to obtain the direction cosine $k_1$ of the reflected ray. Again
using the slope angles $\beta_0$ and $\beta_1$ of the incident and the reflected rays, respectively, we
may write
Figure 1-14. Gaussian imaging by a reflecting surface. A ray from a point object $A_0$
at a height $x_0$ is incident at a slope angle $\beta_0$ at a point $A_1$ at a height $x_1$ on the
tangent plane $V_1A_1$ of a reflecting surface of vertex radius of curvature $R_1$, with its
center of curvature $C_1$. The ray is reflected at a slope angle $\beta_1$, forming the
Gaussian image at a point $A_2$ at a height $x_2$.
$$ x_1 = x_0 + D_{01}\beta_0 \tag{1-76} $$
and
$$ \beta_1 = -\beta_0 - 2\frac{x_1}{R_1} , \tag{1-77} $$
where $\beta_1$ is numerically negative. Similarly, the height $x_2$ of a point $A_2$ on the reflected
ray at an axial distance $D_{12}$ is given by
$$ x_2 = x_1 + D_{12}\beta_1 . \tag{1-78} $$
Substituting for $x_1$ and $\beta_1$ from Eqs. (1-76) and (1-77), we obtain
$$ x_2 = \left(1 - \frac{2D_{12}}{R_1}\right)x_0 + \left(D_{01} - D_{12} - \frac{2D_{01}D_{12}}{R_1}\right)\beta_0 . \tag{1-79} $$
In order that the point $A_2$ be the image of the point object $A_0$, all of the rays
originating at the point object and incident on the reflecting surface must pass through the
image point after reflection. Thus, $x_2$ must be independent of $\beta_0$, or the coefficient of $\beta_0$
in Eq. (1-79) must be zero. Therefore,
$$ D_{01} - D_{12} - \frac{2D_{01}D_{12}}{R_1} = 0 . \tag{1-80} $$
Equation (1-80) can be rearranged and written in the form
$$ \frac{1}{D_{12}} - \frac{1}{D_{01}} = \frac{2}{R_1} . \tag{1-81} $$
This equation is called the Gaussian imaging equation for the reflecting surface. It gives
the distance $D_{12}$ of the image from the surface in terms of the distance $D_{01}$ of the surface
from the object. In Figure 1-14, the image does not lie on the reflected ray, but lies instead
on its extension. Accordingly, it is not real, but virtual.
Because of Eq. (1-80), Eq. (1-79) reduces to
$$ x_2 = \left(1 - \frac{2D_{12}}{R_1}\right)x_0 , \tag{1-82} $$
which gives the height $x_2$ of the image of an object of height $x_0$. The ratio of $x_2$ to $x_0$ is
called the transverse magnification of the image. Thus,
$$ M_x = \frac{x_2}{x_0} \tag{1-83a} $$
$$ = 1 - \frac{2D_{12}}{R_1} \tag{1-83b} $$
or
$$ M_x = \frac{D_{12}}{D_{01}} , \tag{1-83c} $$
where we have used Eq. (1-81) in the last step. Equations (1-81) and (1-83) describe the
location and size of the image in terms of the corresponding quantities for the object.
Substituting for $x_1$ from Eq. (1-76) into Eq. (1-77), we obtain
$$ \beta_1 = -\frac{2}{R_1}x_0 - \left(1 + \frac{2}{R_1}D_{01}\right)\beta_0 . \tag{1-84} $$
If we consider a cone of rays of angular subtense $\Delta\beta_0$ diverging from the point object
and incident on the surface, the corresponding angular subtense $\Delta\beta_1$ of the cone of rays
converging to the image point can be obtained by differentiating Eq. (1-84):
$$ M_\beta = \frac{\partial\beta_1}{\partial\beta_0} \tag{1-85a} $$
$$ = -\left(1 + \frac{2}{R_1}D_{01}\right) \tag{1-85b} $$
or
$$ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-85c} $$
where we have again used Eq. (1-81) in the last step. Equation (1-85) gives the angular
magnification of the rays. The product of the transverse and angular magnifications is
given by
$$ M_x M_\beta = -1 , \tag{1-86} $$
which is independent of the object and image distances. It illustrates that, as in the case of
imaging by a refracting surface, a large transverse magnification is accompanied by a
small angular magnification.
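The reflecting-surface relations can be exercised in the same way as those of the refracting surface. The sketch below uses assumed function names and assumed values ($R_1 = 100$, $D_{01} = 200$, $x_0 = 5$); it solves Eq. (1-81) for $D_{12}$, traces a few rays with Eqs. (1-76) to (1-78), and evaluates the magnifications.

```python
def mirror_image_distance(d01, r1):
    """Solve Eq. (1-81), 1/D12 - 1/D01 = 2/R1, for D12."""
    return 1.0 / (2.0 / r1 + 1.0 / d01)


def mirror_trace_height(x0, beta0, d01, d12, r1):
    """Height x2 of a paraxial ray (or its extension) at an axial distance D12."""
    x1 = x0 + d01 * beta0              # Eq. (1-76)
    beta1 = -beta0 - 2.0 * x1 / r1     # Eq. (1-77)
    return x1 + d12 * beta1            # Eq. (1-78)


# Assumed illustrative values.
r1, d01, x0 = 100.0, 200.0, 5.0
d12 = mirror_image_distance(d01, r1)

m_x = d12 / d01                        # Eq. (1-83c)
m_beta = -d01 / d12                    # Eq. (1-85c)
heights = [mirror_trace_height(x0, beta0, d01, d12, r1)
           for beta0 in (-0.01, 0.0, 0.01)]

print("D12 =", d12)
print("x2 for each slope angle:", heights)
print("Mx =", m_x, " M_beta =", m_beta, " product =", m_x * m_beta)
```

With these values the image lies closer to the vertex than $R_1/2$ and is reduced in size, while the cone of rays associated with it is correspondingly wider.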
1.8.4 Gaussian Imaging by a Multisurface System
In a multisurface imaging system, the image formed by the first surface becomes the
object for the second surface, and so on, for the succeeding surfaces of the system. The
image thus formed by the last surface is the image formed by the system. Gaussian
imaging by a refracting system is developed in Chapter 2, and that by a reflecting system
in Chapter 3. These can be easily combined to treat imaging by a general imaging system
consisting of refracting and reflecting surfaces. It is shown in Chapter 2 that it is not
necessary to perform Gaussian imaging by each surface to obtain the image formed by a
system. Instead, once the location of two principal planes and two focal points of a
system are determined, the Gaussian image of an object formed by the system can be
determined in one step.
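The surface-by-surface procedure described above can be sketched as a short loop. Two assumptions are made here that are not spelled out in the text: the object distance for the next surface is taken as the vertex spacing minus the image distance of the current surface (so that it is negative for a virtual object), and Eq. (1-70) is assumed to hold for such signed distances. The function names and the lens data are hypothetical.

```python
import math


def image_distance(d_obj, n_before, n_after, radius):
    """Eq. (1-70) solved for the image distance measured from the surface vertex."""
    return n_after / ((n_after - n_before) / radius - n_before / d_obj)


def image_through_surfaces(d_obj, surfaces, n_object_space=1.0):
    """Apply Eq. (1-70) surface by surface.

    surfaces: list of (radius, index_after, spacing_to_next_vertex) tuples.
    Returns the image distance measured from the vertex of the last surface.
    """
    n_before = n_object_space
    d_img = None
    for radius, n_after, spacing in surfaces:
        d_img = image_distance(d_obj, n_before, n_after, radius)
        # The image formed by this surface is the object for the next one
        # (assumed convention: distance measured from the object to the surface).
        d_obj = spacing - d_img
        n_before = n_after
    return d_img


# Hypothetical equiconvex lens in air: n = 1.5, R1 = 100, R2 = -100.
thin_lens = [(100.0, 1.5, 0.0), (-100.0, 1.0, 0.0)]
thick_lens = [(100.0, 1.5, 10.0), (-100.0, 1.0, 0.0)]

# An object at infinity gives the image-space focal distance from the last vertex.
print("thin lens :", image_through_surfaces(math.inf, thin_lens))
print("thick lens:", image_through_surfaces(math.inf, thick_lens))
```

In the zero-thickness limit the result agrees with the familiar thin-lens value $1/f = (n - 1)(1/R_1 - 1/R_2)$, which is 100 for the data assumed here.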
1.9 IMAGING BEYOND GAUSSIAN APPROXIMATION

Gaussian optics is based on paraxial rays. It determines the location and size of the
image in terms of the corresponding quantities of the object. Except for its size, the
Gaussian image is an exact replica of the object. We develop the imaging relationships of
Gaussian optics of refracting systems in Chapter 2, and those of reflecting systems in
Chapter 3. Paraxial ray tracing, discussed in detail in Chapter 4, is used to size the
imaging elements, including the stops and pupils, and determine the obscuration of light
beams and vignetting of the rays. Given the radiance of an extended object or the
intensity of a point object, the image irradiance or intensity can be determined, as
discussed in Chapter 5. Gaussian optics is also used to determine the extent of the object
that can be imaged, i.e., the field of view of the system. With such knowledge, some
imaging characteristics of simple optical instruments can be discussed, as in Chapter 6.
A quantity of paramount interest that is beyond Gaussian optics but that a design
must satisfy is the expected quality of the image. A designer must choose the shapes of
the imaging elements so as to balance their aberrations to yield an image of acceptable
quality across the field of view of the system. Because the image distance and transverse
magnification depend on the refractive indices of the materials of the elements of an
imaging system, which depend on the wavelength of the object radiation, the images
formed suffer from chromatic aberrations, as discussed in Chapter 7. For example, the
image of a white point object is not white. An optical designer strives to select materials
so that the chromatic aberrations they introduce cancel each other as much as possible.
The monochromatic wave aberrations introduced by an imaging system are completely
neglected in Gaussian or paraxial optics. The Gaussian image, by definition,
is aberration free; it makes no distinction between a spherical and an aspheric (e.g., conic)
surface, because only the vertex radius of curvature of an aspheric surface is used in the
Gaussian imaging and paraxial ray-tracing equations. However, the quality of the images
formed by these surfaces may be quite different due to the differences in their shapes. For
example, a paraboloidal mirror focuses the rays from an axial point object at infinity to a
point, but a corresponding spherical mirror does not. Exact ray tracing is needed to
determine the quality of an image, which depends on the aberrations. The image of a
point object is aberration free only if the rays emanating from a point object travel exactly
the same optical length in traversing the imaging system and reaching the Gaussian image
point. An imaging system must convert a diverging spherical wavefront originating at the
point object into a spherical wavefront converging to the Gaussian image point.
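The difference between a paraboloidal and a spherical mirror of the same vertex radius of curvature can be illustrated with a few lines of exact ray tracing. The sketch below assumes a mirror with its vertex at the origin, the standard sag expressions $z = r^2/(2R)$ for the paraboloid and $z = R - \sqrt{R^2 - r^2}$ for the sphere, and a ray incident parallel to the axis at a height $r$. It applies the exact law of reflection and reports where the reflected ray, or its extension, meets the axis. The function names and the numerical values are illustrative assumptions.

```python
import math


def axis_crossing(r, sag, slope):
    """Axial position where the reflected ray (or its extension) meets the axis.

    With the incident direction d = (0, 1) in the (r, z) plane and the unit
    normal n proportional to (-slope, 1), the reflected direction
    d - 2 (d . n) n is proportional to (2 * slope, slope**2 - 1).
    Following it from (r, sag) to zero height gives the expression below.
    """
    return sag + r * (1.0 - slope**2) / (2.0 * slope)


def sphere(r, R):
    """Sag and slope dz/dr of a spherical surface of radius of curvature R."""
    root = math.sqrt(R * R - r * r)
    return R - root, r / root


def paraboloid(r, R):
    """Sag and slope dz/dr of a paraboloid of vertex radius of curvature R."""
    return r * r / (2.0 * R), r / R


R = 100.0  # assumed vertex radius of curvature
for r in (1.0, 10.0, 20.0, 30.0):
    z_par = axis_crossing(r, *paraboloid(r, R))
    z_sph = axis_crossing(r, *sphere(r, R))
    print(f"r = {r:5.1f}   paraboloid: {z_par:9.5f}   sphere: {z_sph:9.5f}")
```

For the paraboloid, every ray meets the axis at $R/2$, the Gaussian focus. For the sphere, the crossing point moves away from $R/2$ as the ray height increases, which is the spherical aberration that Gaussian optics ignores.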
In practice, the wavefront exiting from an imaging system is rarely spherical. Its
deviations from being spherical represent its wave aberrations. The possible aberrations
of a system with an axis of rotational symmetry are discussed in Chapter 8. When the
wavefront is not spherical, the rays intersect the image plane in the vicinity of the
Gaussian image point. The characteristics of the ray distribution in the image plane for
the various aberration types, i.e., the spot diagrams, are discussed in Chapter 9. A lens
designer resorts to using nonspherical surfaces to reduce or eliminate the aberrations over
a certain field of view. The Gaussian imaging properties of a nonspherical surface are, of
course, the same as those of the corresponding spherical surface because they are
determined by its vertex radius of curvature.
Even if the rays from a point object all converge to its Gaussian image point, its
observed image is not a point. The converging beam of the imaging light spreads due to
its diffraction as it propagates to the image plane. A brief discussion of the diffraction-
based aberration-free image, namely, the Airy pattern, is given in Section 6.8.2. In the
presence of aberrations, the light in the diffraction image spreads even more [7].
1.10 SUMMARY OF RESULTS

1.10.1 Sign Convention
Our sign convention for distances, heights, and angles is the Cartesian sign
convention, discussed in Section 1.2. It has the advantage that there are no special rules to
remember other than those of a right-handed Cartesian coordinate system, regardless of
whether the object or the image is real or virtual, or whether a refracting or a reflecting
surface is convex or concave to the light incident on it.
1.10.2 Fermat’s Principle
In geometrical optics, light consists of rays. Their direction of propagation indicates
the direction of the flow of light energy. They are normal to a wavefront, which is a
surface of constant phase. Fermat’s principle states that the optical path length of a ray in
traveling from a point $P_1$ to another point $P_2$ is stationary, or
$$ \delta \int_{P_1}^{P_2} n\,ds = 0 , \tag{1-87} $$
where $ds$ is a differential element of path length along the ray, $n$ is the refractive index of
the medium as a function of the path, and $\delta$ represents a differential variation.
1.10.3 Laws of Geometrical Optics
The rays propagate according to three laws:
1. In a homogeneous medium, a ray propagates in a straight line. This is referred to as
the law of rectilinear propagation.
2. A ray incident at an interface separating media of refractive indices $n$ and $n'$ is
refracted according to Snell’s law, which states that
$$ n'\sin\theta' = n\sin\theta , \tag{1-88} $$
where $\theta$ and $\theta'$ are the angles of incidence and refraction from the surface normal at the
point of incidence. Moreover, the incident and refracted rays, and the surface normal, are
coplanar.
3. When a ray is incident on a reflecting surface at an angle $\theta$ from the surface
normal, it is reflected at an angle
$$ \theta' = -\theta . \tag{1-89} $$
Once again, the incident and the reflected rays, and the surface normal are coplanar. The
law of reflection can be obtained from the law of refraction by letting $n' = -n$ because
the reflected ray lies in the same medium as the incident ray. The negative sign represents
the backward propagation of the reflected ray.
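The connection between Fermat's principle and Snell's law can be illustrated numerically for the simplest case of a planar interface. In the sketch below, the geometry (points $P_1$ and $P_2$ on either side of the plane $z = 0$), the refractive indices, and the function names are all assumptions. For this geometry the stationary optical path length is a minimum, so a simple ternary search locates the crossing point, and Eq. (1-88) is then checked at that point.

```python
import math


def optical_path_length(x, n, n_prime, a, b, d):
    """Optical path length from P1 = (0, -a) to P2 = (d, b) through the point
    (x, 0) of a planar interface separating media of indices n and n_prime."""
    return n * math.hypot(x, a) + n_prime * math.hypot(d - x, b)


def stationary_crossing(n, n_prime, a, b, d, iterations=200):
    """Ternary search for the crossing point that minimizes the optical path length."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        left = lo + (hi - lo) / 3.0
        right = hi - (hi - lo) / 3.0
        opl_left = optical_path_length(left, n, n_prime, a, b, d)
        opl_right = optical_path_length(right, n, n_prime, a, b, d)
        if opl_left < opl_right:
            hi = right
        else:
            lo = left
    return 0.5 * (lo + hi)


# Assumed illustrative values.
n, n_prime, a, b, d = 1.0, 1.5, 1.0, 1.0, 1.0
x = stationary_crossing(n, n_prime, a, b, d)

sin_theta = x / math.hypot(x, a)                   # angle of incidence
sin_theta_prime = (d - x) / math.hypot(d - x, b)   # angle of refraction
print("n sin(theta)   =", n * sin_theta)
print("n' sin(theta') =", n_prime * sin_theta_prime)
```

The two printed values agree to within the accuracy of the search, i.e., the path of stationary optical length is the one that satisfies Snell's law.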
1.10.4 Exact Ray Tracing
The exact ray tracing consists of a transfer operation, in which a ray propagates from
a certain point to a point on a refracting or a reflecting surface, and a refraction or
reflection operation, which describes its refraction or reflection by the surface. Such ray
tracing is used primarily to determine the aberrations of a system with the aid of
computer software, and thereby the quality of an image.
1.10.4.1 Transfer Operation
When a ray with direction cosines $(k_0, l_0, m_0)$ originating at a point object $A_0$ with
coordinates $(x_0, y_0, z_0)$, as in Figure 1-11, is incident on a spherical refracting surface of
radius of curvature $R_1$ separating media of refractive indices $n_0$ and $n_1$ at a distance
$D_{01}$, its rectilinear propagation from $A_0$ to the point $A_1$ where it meets the surface is
referred to as the transfer operation. The coordinates $(x_1, y_1, z_1)$ of $A_1$ are given by
$$ x_1 = x_0 + S_{01}k_0 , \tag{1-90a} $$
$$ y_1 = y_0 + S_{01}l_0 , \tag{1-90b} $$
and
$$ z_1 = R_1 - \sqrt{R_1^2 - x_1^2 - y_1^2} , \tag{1-90c} $$
where
$$ S_{01} = \left[(x_1 - x_0)^2 + (y_1 - y_0)^2 + (D_{01} + z_1)^2\right]^{1/2} \tag{1-91} $$
is the distance between $A_0$ and $A_1$. The origin of the coordinates lies at the vertex $V_1$ of
the surface, and thus $z_0 = -D_{01}$. Equation (1-90c) represents the fact that $A_1$ lies on the
surface.
Equations (1-90) and (1-91) are coupled and must be solved simultaneously to obtain
the transverse coordinates $(x_1, y_1)$ of the ray at $A_1$. Once these coordinates are known,
the $z_1$ coordinate is determined from Eq. (1-90c) by virtue of the fact that $A_1$ lies on the
surface.
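One simple way to solve the coupled Eqs. (1-90) and (1-91) numerically is a fixed-point iteration that starts from the paraxial guess $S_{01} = D_{01}$. This is only a sketch of one possible approach, not necessarily the method used in ray-tracing software. It assumes $R_1 > 0$, as in Eq. (1-90c), and a ray that actually meets the surface. The function name, tolerance, and numerical values are assumptions.

```python
import math


def transfer_to_sphere(x0, y0, k0, l0, d01, r1, tol=1e-10, max_iter=100):
    """Iterate Eqs. (1-90) and (1-91) until the distance S01 stops changing."""
    s01 = d01  # paraxial starting guess: S01 ~ D01
    for _ in range(max_iter):
        x1 = x0 + s01 * k0                                  # Eq. (1-90a)
        y1 = y0 + s01 * l0                                  # Eq. (1-90b)
        z1 = r1 - math.sqrt(r1**2 - x1**2 - y1**2)          # Eq. (1-90c)
        s_new = math.sqrt((x1 - x0)**2 + (y1 - y0)**2
                          + (d01 + z1)**2)                  # Eq. (1-91)
        if abs(s_new - s01) < tol:
            return x1, y1, z1, s_new
        s01 = s_new
    raise RuntimeError("transfer iteration did not converge")


# Assumed illustrative values: axial point object 100 units from a surface of
# radius of curvature 50, ray with direction cosines (k0, l0) = (0.1, 0.05).
x1, y1, z1, s01 = transfer_to_sphere(0.0, 0.0, 0.1, 0.05, 100.0, 50.0)
print("exact    (x1, y1, z1, S01):", (x1, y1, z1, s01))
print("paraxial (x1, y1)         :", (0.1 * 100.0, 0.05 * 100.0))
```

The exact point of incidence lies slightly farther from the axis than the paraxial one because the ray travels the distance $S_{01} > D_{01}$ before reaching the curved surface.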
1.10.4.2 Refraction Operation
When a ray is incident on a refracting surface, its refraction is referred to as the
refraction operation. For an incident ray with direction cosines $(k_0, l_0, m_0)$ incident on a
point $A_1$ of a refracting surface of radius of curvature $R_1$ with coordinates $(x_1, y_1, z_1)$,
separating media of refractive indices $n_0$ and $n_1$, as in Figure 1-11, the direction cosines
$(k_1, l_1, m_1)$ of the refracted ray are given by
$$ n_1 k_1 = n_0 k_0 - (n_1\cos\theta_1 - n_0\cos\theta_0)\frac{x_1}{R_1} , \tag{1-92a} $$
$$ n_1 l_1 = n_0 l_0 - (n_1\cos\theta_1 - n_0\cos\theta_0)\frac{y_1}{R_1} , \tag{1-92b} $$
and
$$ n_1 m_1 = n_0 m_0 + (n_1\cos\theta_1 - n_0\cos\theta_0)\frac{\sqrt{R_1^2 - x_1^2 - y_1^2}}{R_1} , \tag{1-92c} $$
where $\theta_0$ and $\theta_1$ are the angles of incidence and refraction, respectively,
$$ \cos\theta_0 = \frac{1}{R_1}\left(-x_1 k_0 - y_1 l_0 + \sqrt{R_1^2 - x_1^2 - y_1^2}\,\sqrt{1 - k_0^2 - l_0^2}\right) , \tag{1-93} $$
and
$$ \cos\theta_1 = \frac{1}{n_1}\left(n_1^2 - n_0^2 + n_0^2\cos^2\theta_0\right)^{1/2} . \tag{1-94} $$
Equation (1-94) follows from Snell’s law, Eq. (1-88). Once the direction cosines $(k_1, l_1)$
are known, the direction cosine $m_1$ can be obtained
from the relation $k_1^2 + l_1^2 + m_1^2 = 1$.
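The refraction operation translates directly into code. The sketch below evaluates Eqs. (1-92) to (1-94) for one ray and checks two properties that any correct implementation should have: the refracted direction cosines form a unit vector, and the angles satisfy Snell's law. The function name and the numerical values are assumptions, and no provision is made for total internal reflection, in which case the square root in Eq. (1-94) has a negative argument.

```python
import math


def refract_exact(k0, l0, x1, y1, r1, n0, n1):
    """Exact refraction at the point (x1, y1) of a spherical surface of radius r1."""
    m0 = math.sqrt(1.0 - k0**2 - l0**2)
    root = math.sqrt(r1**2 - x1**2 - y1**2)
    cos0 = (-x1 * k0 - y1 * l0 + root * m0) / r1             # Eq. (1-93)
    cos1 = math.sqrt(n1**2 - n0**2 + (n0 * cos0)**2) / n1    # Eq. (1-94)
    g = n1 * cos1 - n0 * cos0
    k1 = (n0 * k0 - g * x1 / r1) / n1                        # Eq. (1-92a)
    l1 = (n0 * l0 - g * y1 / r1) / n1                        # Eq. (1-92b)
    m1 = (n0 * m0 + g * root / r1) / n1                      # Eq. (1-92c)
    return (k1, l1, m1), cos0, cos1


# Assumed illustrative values.
n0, n1 = 1.0, 1.5
(k1, l1, m1), cos0, cos1 = refract_exact(0.1, 0.05, 10.0, 5.0, 50.0, n0, n1)

norm = math.sqrt(k1**2 + l1**2 + m1**2)
snell_lhs = n1 * math.sqrt(1.0 - cos1**2)   # n1 sin(theta1)
snell_rhs = n0 * math.sqrt(1.0 - cos0**2)   # n0 sin(theta0)
print("refracted direction cosines:", (k1, l1, m1))
print("norm =", norm, "  n1 sin(theta1) =", snell_lhs, "  n0 sin(theta0) =", snell_rhs)
```

Computing $m_1$ from Eq. (1-92c) and comparing it with $\sqrt{1 - k_1^2 - l_1^2}$ is a convenient consistency check on an implementation.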
1.10.4.3 Reflection Operation
When a ray is incident on a reflecting surface, its reflection is referred to as the
reflection operation. For an incident ray with direction cosines $(k_0, l_0, m_0)$ incident on a
point $A_1$ of a reflecting surface of radius of curvature $R_1$ with coordinates $(x_1, y_1, z_1)$, as
in Figure 1-12, the direction cosines $(k_1, l_1, m_1)$ of the reflected ray can be obtained from
the corresponding direction cosines of a refracted ray by letting $n_0 = 1 = -n_1$. They are
given by
$$ k_1 = -k_0 - 2\cos\theta_0\,\frac{x_1}{R_1} , \tag{1-95a} $$
$$ l_1 = -l_0 - 2\cos\theta_0\,\frac{y_1}{R_1} , \tag{1-95b} $$
and
$$ m_1 = -m_0 + 2\cos\theta_0\,\frac{\sqrt{R_1^2 - x_1^2 - y_1^2}}{R_1} , \tag{1-95c} $$
where $\theta_0$ is the angle of incidence of the ray given by Eq. (1-93). The reflection
operation can be obtained from the corresponding refraction operation by letting
$n_0 = 1 = -n_1$.
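Equations (1-95) can be checked in the same spirit as the refraction operation. The sketch below, with an assumed function name and assumed numerical values, verifies that the reflected direction cosines form a unit vector and that the reflected ray makes an angle of the same magnitude with the surface normal as the incident ray.

```python
import math


def reflect_exact(k0, l0, x1, y1, r1):
    """Exact reflection at the point (x1, y1) of a spherical surface of radius r1."""
    m0 = math.sqrt(1.0 - k0**2 - l0**2)
    root = math.sqrt(r1**2 - x1**2 - y1**2)
    cos0 = (-x1 * k0 - y1 * l0 + root * m0) / r1   # Eq. (1-93)
    k1 = -k0 - 2.0 * cos0 * x1 / r1                # Eq. (1-95a)
    l1 = -l0 - 2.0 * cos0 * y1 / r1                # Eq. (1-95b)
    m1 = -m0 + 2.0 * cos0 * root / r1              # Eq. (1-95c)
    return (k1, l1, m1), cos0


# Assumed illustrative values.
x1, y1, r1 = 10.0, 5.0, 50.0
(k1, l1, m1), cos0 = reflect_exact(0.1, 0.05, x1, y1, r1)

# Unit normal used in Eq. (1-93): (-x1, -y1, sqrt(R1^2 - x1^2 - y1^2)) / R1.
normal = (-x1 / r1, -y1 / r1, math.sqrt(r1**2 - x1**2 - y1**2) / r1)
norm = math.sqrt(k1**2 + l1**2 + m1**2)
cos1 = k1 * normal[0] + l1 * normal[1] + m1 * normal[2]
print("reflected direction cosines:", (k1, l1, m1))
print("norm =", norm, "  cos(theta0) =", cos0, "  projection on normal =", cos1)
```

Algebraically, Eqs. (1-95) equal $-\left[\hat{d} - 2(\hat{d}\cdot\hat{v})\hat{v}\right]$, where $\hat{d} = (k_0, l_0, m_0)$ and $\hat{v}$ is the unit normal, i.e., the familiar vector form of the law of reflection with the overall sign reversed, consistent with the substitution $n_1 = -n_0$.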
1.10.4.4 Ray Tracing a Conic Surface
The ray-tracing equations for conic refracting and reflecting surfaces are given
concisely in Sections 1.6.7 and 1.6.8. The differences for a conic surface compared to a
spherical surface result from their sag [see Eq. (1-45)] and the surface normal differences
[see Eq. (1-47)].
1.10.4.5 Tracing a Tangential Ray

The plane containing the point object and the optical axis of a system is called the
tangential plane. A ray incident in the tangential plane remains in this plane after its
refraction or reflection by an element of the system, and therefore by the entire system.
1.10.5 Paraxial Ray Tracing
When the rays make small angles with the optical axis and surface normals, we can
approximate their sines and tangents with the angles themselves. Similarly, if the
transverse coordinates ( x , y ) of a point on a surface are much smaller than its radius of
curvature, we can neglect the sag of a refracting or reflecting surface and approximate the
diagonal distance between two points by the corresponding axial distance. Such
assumptions yield equations for the transverse coordinates that are no longer coupled.
The corresponding ray tracing is referred to as the paraxial ray tracing.
Moreover, the projections of a skew ray in the zx and yz planes propagate through a
system independently of each other. Consequently, for a rotationally symmetric imaging
system, we need to trace rays only in one of these planes. The plane of choice is generally
the tangential plane zx.
1.10.6 Gaussian Optics
The approximation of small angles is called the paraxial or the Gaussian approximation.
It is used to determine the image location and size in terms of the object
location and size. The image is referred to as the Gaussian image, and the process of
determining the image in this manner, regardless of the magnitude of the angles and sizes,
is called Gaussian optics. In Gaussian optics, paraxial ray tracing is utilized, and the
refracting or the reflecting surface is replaced by a planar surface passing through its
vertex, called the tangent plane or the paraxial surface. The aberrations of an image are
completely neglected, or the image is assumed to be aberration free.
1.10.6.1 Gaussian Imaging by a Refracting Surface
The imaging equation for a refracting surface of vertex radius of curvature $R_1$
separating media of refractive indices $n_0$ and $n_1$, as in Figure 1-13, relating the object
and image distances $D_{01}$ and $D_{12}$ from the vertex of the surface, is given by
$$ \frac{n_0}{D_{01}} + \frac{n_1}{D_{12}} = \frac{n_1 - n_0}{R_1} . \tag{1-96} $$
The transverse magnification of the image and the angular magnification of the rays
originating at a point object and converging to its Gaussian image point are given by
$$ M_x = -\frac{n_0}{n_1}\frac{D_{12}}{D_{01}} \tag{1-97} $$
and
$$ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-98} $$
respectively. The product of the transverse and angular magnifications is given by
$$ M_x M_\beta = \frac{n_0}{n_1} . \tag{1-99} $$
We point out that in ray tracing, the object distance $D_{01}$ is measured from the object
to the surface. However, in Chapter 2, we will consider it from the vertex of the imaging
surface, thus changing its sign.
1.10.6.2 Gaussian Imaging by a Reflecting Surface
The Gaussian imaging equations for a reflecting surface of vertex radius of curvature
$R_1$, as in Figure 1-14, can be obtained from those for a corresponding refracting surface
by letting $n_1 = -n_0$. Thus, we may write
$$ \frac{1}{D_{12}} - \frac{1}{D_{01}} = \frac{2}{R_1} , \tag{1-100} $$
$$ M_x = \frac{D_{12}}{D_{01}} , \tag{1-101} $$
$$ M_\beta = -\frac{D_{01}}{D_{12}} , \tag{1-102} $$
and
$$ M_x M_\beta = -1 . \tag{1-103} $$
Generally, the refractive index of the medium for imaging by a reflecting surface is unity.
The imaging equations for a reflecting surface can be obtained from the corresponding
equations for a refracting surface by letting $n_0 = 1 = -n_1$. The minus sign with $n_1$
represents the backward propagation of the reflected ray compared to that of the incident
ray.
REFERENCES
1. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics,
Section 1.6, SPIE Press, Bellingham, WA (1998) [doi:10.1117/3.265735.ch1].
2. P. Mouroulis and J. Macdonald, Geometrical Optics and Optical Design, Oxford,
New York (1997).
3. W. T. Welford, Aberrations of the Symmetrical Optical System, Academic Press,
New York (1974).
4. F. A. Jenkins and H. E. White, Fundamentals of Optics, 4th ed., McGraw-Hill,
New York (1976).
5. M. V. Klein and T. E. Furtak, Optics, John Wiley & Sons, New York (1988).
6. E. Hecht, Optics, 4th ed., Addison-Wesley, San Francisco (2002).
7. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction
Optics, 2nd ed., SPIE Press, Bellingham, WA (2011) [doi:10.1117/3.898443].
PROBLEMS
1.1 Show that an ellipsoidal refracting surface with an eccentricity $e = n/n'$ separating
media of refractive indices $n$ and $n'$ is a Cartesian surface for a collimated beam
incident on it.
1.2 Show that an ellipsoidal mirror is a Cartesian surface for a point object placed at
one of its two geometrical foci.
1.3 Determine the focal length of a thin lens by considering an object at infinity. The
refractive index of the lens is $n$, and the radii of curvature of its two surfaces are $R_1$
and $R_2$.
CHAPTER 2
REFRACTING SYSTEMS
2.1 Introduction ............................................................................................................45

2.2 Spherical Refracting Surface ................................................................................46
2.2.2 Object and Image Spaces........................................................................... 50
2.3 Thin Lens ................................................................................................................61
2.3.6 Image Throw..............................................................................................69
2.3.7 Thin Lens Not in Air..................................................................................71
2.3.8 Thin Lenses in Contact ..............................................................................73
2.4 General System....................................................................................................... 73
2.4.1 Introduction................................................................................................73
2.4.2 Cardinal Points and Planes ........................................................................75
2.4.3 Gaussian Imaging, Focal Lengths, and Magnifications ............................77
2.4.4 Nodal Points and Planes ............................................................................80
2.4.7 Reference to Other Conjugate Planes ........................................................82
2.4.8 Comparison of Imaging by a General System and a Refracting Surface
or a Thin Lens ............................................................................................84
2.4.9 Determination of Cardinal Points ..............................................................85
2.5 Afocal Systems ........................................................................................................90
2.5.1 Introduction................................................................................................90
2.5.2 Lagrange Invariant for an Infinite Conjugate ............................................91
2.5.3 Imaging by an Afocal System....................................................................91
2.6 Plane-Parallel Plate ................................................................................................93
2.6.1 Introduction................................................................................................93
2.6.2 Imaging Relations ......................................................................................94
2.7 Petzval Image..........................................................................................................96

2.7.1 Spherical Refracting Surface ..................................................................... 96
2.7.2 General System ..........................................................................................98
2.7.3 Thin Lens ................................................................................................... 99
2.8 Misaligned Surface............................................................................................... 101
2.8.1 Decentered Surface ..................................................................................101
2.8.2 Tilted Surface ..........................................................................................102
2.8.3 Despaced Surface ....................................................................................104
2.9 Misaligned Thin Lens ..........................................................................................105
2.9.1 Decentered Lens ......................................................................................105
2.9.2 Tilted Lens ............................................................................................... 106
2.9.3 Despaced Lens ......................................................................................... 106
2.10 Anamorphic Imaging Systems ............................................................................107
2.11.1 Imaging Equations ................................................................................... 109
2.11.1.1 General System ........................................................................109
2.11.1.2 Refracting Surface ................................................................... 111
2.11.1.3 Thin Lens ................................................................................. 111
2.11.1.4 Afocal System..........................................................................112
2.11.1.5 Plane-Parallel Plate ..................................................................112
2.11.2 Petzval Image ..........................................................................................112
2.11.3 Misalignments..........................................................................................113
2.11.3.1 Misaligned Surface ..................................................................113
2.11.3.2 Misaligned Thin Lens ..............................................................113
2.11.4 Anamorphic Imaging Systems ................................................................113
Problems ......................................................................................................................... 115
Chapter 2
Refracting Systems
2.1 INTRODUCTION
In Section 1.6, we showed how to trace rays exactly from one surface to the next, and
when they are refracted or reflected by a refracting or a reflecting surface. We then
considered paraxial ray tracing, i.e., when the rays make small angles with the surface
normals and the optical axis. We showed that, in the paraxial approximation, the curved
surface could be replaced by a planar surface that is a tangent to the surface at its vertex,
called the tangent plane or the paraxial surface. The approximation led to Gaussian
optics, which represents imaging equations for obtaining the image of an object, i.e., the
size and the location of the image in terms of the size and location of the object and the
parameters of the imaging system.
We begin this chapter by rederiving the imaging equations for a refracting surface by
assuming small angles of incidence and refraction and small slope angles of the rays (as
in Section 1.8.2). How to determine the image graphically is also considered. We use
standard notation suitable for a multisurface imaging system. Both the Gaussian and
Newtonian forms of the imaging equations are given. These equations are used to obtain
the corresponding equations for a thin lens. The imaging equations for a multisurface
refracting system are derived next. The principal, focal, and nodal points, collectively
called the cardinal points of such systems, are discussed. It is shown that simple imaging
equations, similar to those for a single refracting surface, are obtained, provided the
object and image distances are measured from the respective principal points of the
system in the Gaussian form, and from the focal points in the Newtonian form of the
imaging equations. The concept of the Lagrange invariant is discussed in each case.
Afocal systems, i.e., those for which a parallel beam of light incident on them
emerges as a parallel beam of light, or the object and its image both lie at infinity, are also
discussed. Imaging by a plane-parallel plate is considered, and it is shown that the
distance between the object and its image is independent of the object location, depending
only on the refractive index and the thickness of the plate.
In Gaussian imaging, the object and image distances are measured along the optical
axis, even when they are located off the axis. This introduces a small focus error that
increases quadratically with the height of a point object. Consequently, an error-free
image of a plane object is formed on a spherical surface, called the Petzval image surface.
The radius of curvature of this surface is independent of the object or the image distance.
How this image is determined is discussed briefly. Next, how an image is displaced
because of a slight misalignment of an imaging element is discussed. Finally, we briefly
discuss imaging by an anamorphic system with different transverse magnifications in two
orthogonal symmetry planes, thus yielding a rectangular image of a square object.
2.2 SPHERICAL REFRACTING SURFACE

In this section, we derive equations that describe the imaging of an object by a
spherical refracting surface. The concept of Lagrange invariance is introduced, showing
that a large transverse magnification of an image is accompanied by a small angular
magnification of the rays so that the product of the two magnifications is a constant.
2.2.1 Gaussian Imaging Equation

As indicated in Figure 2-1, consider a spherical refracting surface of a radius of
curvature R separating media of refractive indices n and n′, where n′ > n. The line VC
joining its vertex V and center of curvature C defines its optical axis OA. Because the
surface is spherical, it does not have a unique vertex. However, the central point of the
surface defines its vertex.
We first consider the imaging of an axial point object P0 lying at a distance S from
V. An object ray P0Q incident at a point Q on the surface at a height x from the optical
axis is refracted as a ray QP0′ intersecting the optical axis at a point P0′ at a distance S′
from V. Let the angles of incidence and refraction (i.e., the angles of the incident and
refracted rays from the surface normal QC at the point of incidence Q) be θ and θ′,
respectively. Similarly, let the slope angles of the rays from the optical axis be β0 and
β0′.
Figure 2-1. Imaging by a convex spherical refracting surface of radius of curvature
R separating media of refractive indices n and n′, where n′ > n. VC is the optical
axis OA of the surface, where V is the vertex of the surface, and C is its center of
curvature. The distances from V of the axial point object P0 and its Gaussian image
P0′ are S and S′, respectively. The angles θ and θ′ are the angles of the incident
and refracted rays P0Q and QP0′, respectively, from the surface normal QC at the
point of incidence Q. The slope angles of these rays are β0 and β0′. The numerically
negative quantities are indicated by a parenthetical negative sign (–).
In the Gaussian approximation of Snell's law (i.e., for small angles), the angles of
incidence and refraction are related to each other according to [see Eq. (1-54b)]
$$ n'\theta' = n\theta . \tag{2-1} $$
The rays propagating according to this approximation are called paraxial rays. From the
triangle P0CQ, we note that
$$ \theta = \beta_0 - \phi , \tag{2-2a} $$
where the angle φ of the surface normal from the optical axis is numerically negative.
Similarly, from triangle CP0′Q, we note that
$$ \theta' = \beta_0' - \phi , \tag{2-2b} $$
where β0′ is numerically negative. Now the tangent of a small angle is approximately
equal to the angle in radians. Thus, we may write
$$ \beta_0 = -x/S , \tag{2-3a} $$
$$ \beta_0' = -x/S' , \tag{2-3b} $$
and
$$ \phi = -x/R , \tag{2-3c} $$
where the object distance S is numerically negative because P0 lies to the left of V.
Substituting Eqs. (2-3) into Eqs. (2-2) and substituting the results thus obtained into Eq.
(2-1), we obtain
$$ \frac{n'}{S'} - \frac{n}{S} = \frac{n' - n}{R} . \tag{2-4} $$
We note that Eq. (2-4) is independent of the height x of the point of incidence Q of the
ray. Thus, in the Gaussian approximation, all rays incident on the surface pass through P0′
after being refracted by it. Equation (2-4) is called the Gaussian imaging equation. It
gives the position of the image point for a given position of the object point. It is
applicable to any conic surface with a vertex radius of curvature R. The reference point
for the object and image distances is the vertex V of the refracting surface. A point object,
such as P0, and its corresponding Gaussian image point P0′ are called conjugate
points.
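As a numerical illustration of Eq. (2-4), the following minimal Python sketch solves the Gaussian imaging equation for the image distance. The function name `image_distance` and the sample values (n = 1, n′ = 1.5, R = 10, S = −60, in arbitrary length units) are assumptions chosen for illustration; they do not come from the text.

```python
import math


def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4), n'/S' - n/S = (n' - n)/R, for the image distance S'.

    Sign convention of the text: distances to the left of the vertex V are negative.
    """
    reduced = n / S + (n_prime - n) / R  # this quantity equals n'/S'
    if reduced == 0.0:
        return math.inf  # refracted rays are parallel; the image lies at infinity
    return n_prime / reduced


# Hypothetical example: air-to-glass surface, object 60 units to the left of V.
S_prime = image_distance(n=1.0, n_prime=1.5, R=10.0, S=-60.0)
print(f"S' = {S_prime:.3f}")  # about 45: a real image to the right of V
```

The result does not depend on any ray height x, which reflects the statement above that all paraxial rays from P0 pass through the same image point.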
Imaging can also be considered in terms of waves. A point source emanates spherical
waves. A wave surface with a constant phase, spherical in this case, is called a wavefront.
Thus, as illustrated in Figure 2-2, a spherical wave of radius of curvature S diverging
from the point object P0 is incident on the refracting surface. The refracting surface
converts this wave into a spherical wave of radius of curvature S′ converging to the
image point P0′. The curvature of a wavefront is called its vergence. When multiplied
by the refractive index of the medium in which the wavefront lies, it is called the optical
(or reduced) vergence. Thus, V = n/S is the optical vergence of the incident wavefront,
and V′ = n′/S′ is the optical vergence of the refracted wavefront. As shown later [see Eq.
(2-9)], the right-hand side of Eq. (2-4) is called the refracting power K of the surface. In
terms of the vergences of the wavefronts and the power of the refracting surface, the
imaging equation can be written
$$ V' - V = K . \tag{2-5} $$
The vergence of a wavefront is numerically positive or negative, depending on whether it
is converging or diverging.
Figure 2-2. Imaging in terms of wavefronts. Spherical wavefronts originate at the
object point P0 and converge to the image point P0′ after refraction by a surface
with its center of curvature at C. The straight lines are the corresponding incident
and refracted rays.
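The vergence form of the imaging equation can be sketched numerically as follows. This is only an illustration: the helper `refract_vergence` and the values (n = 1, n′ = 1.5, R = 0.10 m, S = −0.60 m) are assumed, with distances in meters so that vergences and power come out in diopters.

```python
def refract_vergence(V, K):
    """Eq. (2-5): the optical vergence after refraction is V' = V + K."""
    return V + K


n, n_prime = 1.0, 1.5        # hypothetical refractive indices
R = 0.10                     # radius of curvature in meters (assumed)
S = -0.60                    # object distance in meters (object to the left of V)

K = (n_prime - n) / R        # refracting power of the surface, in m^-1
V = n / S                    # optical vergence of the incident wavefront (negative: diverging)
V_prime = refract_vergence(V, K)
S_prime = n_prime / V_prime  # image distance recovered from V' = n'/S'

print(f"K = {K:.3f}, V = {V:.3f}, V' = {V_prime:.3f}, S' = {S_prime:.3f} m")
```

Here the diverging incident wavefront (V < 0) becomes a converging one (V′ > 0) because the power of the surface exceeds the magnitude of the incident vergence.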
In Figure 2-1, the point object P0 is real in the sense that the object rays actually
originate from it. Similarly, the image point P0′ is real in the sense that the refracted rays
actually pass through it. The object distance S is numerically negative, and the image
distance S′ is numerically positive. However, we note from Eq. (2-4) that, as the object
moves closer to the refracting surface such that |S| < nR/(n′ − n), S′ becomes numerically
negative, indicating that the image lies on the left-hand side of the refracting surface. This
is illustrated in Figure 2-3, where it is shown that an object ray P0Q from an object point
P0 is refracted such that an extension of the refracted ray intersects the optical axis on the
object side at P0′, which is the image of P0. The image in this case is virtual in the sense
that any refracted ray appears to come from it, but does not actually pass through it. The
image can also be virtual if R is numerically negative, i.e., if the center of curvature C of
the refracting surface lies to the left of its vertex V, or if n′ < n. If there is another
imaging element to the right of the refracting surface, then the rays incident on it are real,
and the virtual image becomes a real object for it.
Figure 2-3. Virtual image P0′ of a real point object P0, where |S| < nR/(n′ − n). An
object ray, such as P0Q, is refracted by the refracting surface such that the
refracted ray appears to come from the image P0′.
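The transition from a real to a virtual image can be checked numerically. The sketch below is an illustration under assumed values (n = 1, n′ = 1.5, R = 10, in arbitrary length units), for which the limiting object distance nR/(n′ − n) equals 20; the helper `image_distance` is a hypothetical name.

```python
def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4) for S' (assumes the image is not at infinity)."""
    return n_prime / (n / S + (n_prime - n) / R)


n, n_prime, R = 1.0, 1.5, 10.0       # hypothetical values
threshold = n * R / (n_prime - n)    # object distance magnitude below which the image is virtual

for S in (-60.0, -15.0):
    S_prime = image_distance(n, n_prime, R, S)
    kind = "real" if S_prime > 0 else "virtual"
    print(f"S = {S:6.1f}  ->  S' = {S_prime:8.2f}  ({kind})")
```

With these numbers, the object at S = −60 lies farther from the surface than the threshold and yields a real image, whereas the object at S = −15 lies inside it and yields a virtual image on the object side.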
If a beam of rays converging to a point on the right-hand side of the vertex V is
incident on the refracting surface, as in Figure 2-4, the corresponding point object P0 is
considered virtual, and its distance S from the surface is numerically positive. The object
may be the image of some other object formed by another imaging system preceding the
refracting surface. However, if a real point object lies on the right-hand side of V at a
distance S so that light initially travels from right to left, as in Figure 2-5, then the object
lies in a medium of refractive index n′, and the image lies in a medium of index n. This
is the case, for example, when considering the image of one element of a system by
another that precedes it. Because light is traveling backward, the signs of the refractive
indices are reversed, i.e., they become negative quantities. However, Eq. (2-4) does not
change, as may be seen by reversing the signs of n and n′. Therefore, the imaging
equation in this case becomes
$$ \frac{n'}{S} - \frac{n}{S'} = \frac{n' - n}{R} . \tag{2-6} $$
Of course, S is numerically positive, and a numerically positive value of S′ implies that
the image point lies on the right-hand side of V. One can use an alternative and perhaps a
simpler approach that treats the real object on the right-hand side of V as an image and
determines its conjugate object as the actual image. Thus, we let S′ equal the numerically
positive object distance in Eq. (2-4) and determine the value of S. A numerically positive
value of S implies that the actual image lies on the right-hand side of V. Of course, one
can also mentally rotate the system 180 degrees about a vertical line passing through V,
use an equation, such as Eq. (2-4), in which S is numerically negative in a medium of
index n′, reverse the sign of R, determine S′ in a medium of index n, and rotate the
system back to its original configuration.
Figure 2-4. Imaging of a virtual point object P0 by a refracting surface. The real
image lies at P0′.
Figure 2-5. Imaging of a real point object P0 lying to the right of a refracting
surface.
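The equivalence of Eq. (2-6) and the rotation approach can be illustrated with a short Python sketch. The values (n = 1, n′ = 1.5, R = 10, object 60 units to the right of V) and the helper `image_distance` are assumptions made for this example only.

```python
def image_distance(n_obj, n_img, R, S):
    """Eq. (2-4) for light traveling left to right: n_img/S' - n_obj/S = (n_img - n_obj)/R."""
    return n_img / (n_obj / S + (n_img - n_obj) / R)


n, n_prime, R = 1.0, 1.5, 10.0   # hypothetical surface, oriented as in Figure 2-5
S_obj = 60.0                     # real object to the RIGHT of V, in the medium of index n'

# Approach 1: Eq. (2-6), n'/S - n/S' = (n' - n)/R, solved for S'.
S_img_eq26 = n / (n_prime / S_obj - (n_prime - n) / R)

# Approach 2: rotate the system by 180 degrees so that light travels left to right.
# The object then lies at -S_obj in a medium of index n', R changes sign, and the image
# forms in a medium of index n. Rotating back flips the sign of the result.
S_img_rot = -image_distance(n_obj=n_prime, n_img=n, R=-R, S=-S_obj)

print(S_img_eq26, S_img_rot)  # both about -40: the image lies to the left of V
```

A negative image distance here means that the image lies to the left of V, i.e., on the side toward which the light is traveling, so it is a real image in the medium of index n.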
2.2.2 Object and Image Spaces

Any imaging system is associated with its object and image spaces. Its object space
is the space that contains all of the physical objects lying to its left and all of the points
that are conjugate to any physical objects lying to its right. Similarly, its image space is
the space that contains all of the physical objects lying to its right and all of the points
that are conjugate to any physical objects lying to its left. Of course, the two spaces are
conjugates of each other. Now, every object is associated with an image, which may be
real or virtual, lying on either side of the system. Accordingly, both the object and image
spaces extend from infinity on the left of the system to infinity on its right. Thus, the two
spaces are superimposed on each other. A distinction is made between the two spaces by
considering the rays before entering the system as lying in its object space and those
emerging from it as lying in its image space. Sometimes, a distinction is made between a
real and a virtual space. The portion of the object space lying to the left of a system is
called its real object space, and the portion of the image space lying to its right is called
its real image space. The remaining portions are correspondingly called virtual object
and image spaces.
2.2.3 Focal Lengths and Refracting Power

If an object lies at infinity, i.e., if S = −∞, as illustrated in Figure 2-6a, then the
corresponding image distance S′ ≡ VF′ = f′, where f′ is called the image-space focal
length of the refracting surface. The point F′ is called the image-space focal point of the
surface. Rays incident on the surface parallel to its optical axis are focused at F′ after
being refracted by it. Similarly, the object distance S ≡ VF = f, for which the image lies
at infinity (i.e., S′ = ∞), as in Figure 2-6b, is called the object-space focal length, where
F is called the object-space focal point. Rays originating at F and incident on the surface
are made parallel by it. Of course, if rays parallel to the optical axis are incident on the
surface from right to left, they will be focused at F after being refracted by it. The planes
passing through the focal points F and F′ that are perpendicular to the optical axis are
called the object-space and image-space focal planes, respectively. It should be evident
from Figure 2-6 that the focal points F and F′ are not conjugate points.
Figure 2-6. Focal points of a refracting surface. (a) Image-space focal point F′. (b)
Object-space focal point F. In Gaussian optics, refraction of the rays takes place at
the tangent plane passing through the vertex of the surface.
By their definitions, the image-space and object-space focal lengths of the refracting
surface, obtained from Eq. (2-4), are given by
$$ f' = \frac{n'}{n' - n} R \tag{2-7a} $$
and
$$ f = -\frac{n}{n' - n} R , \tag{2-7b} $$
respectively. The two focal lengths are, therefore, related to each other according to
$$ f' = -(n'/n) f . \tag{2-8} $$
If f′ is numerically positive, as in Figure 2-6a, then f is numerically negative, and F lies
to the left of V, as in Figure 2-6b. The focal points F and F′ lie on opposite sides of
the vertex V at different distances from it.
Just as the image of an object can be virtual, so can the focal point of an imaging
system. This is illustrated in Figure 2-7, where the radius of curvature of the refracting
surface is numerically negative. A ray incident parallel to the optical axis is bent away
from the axis, and an extension of the refracted ray intersects the axis at the virtual focal
point F′. Similarly, the focal point is virtual if R is numerically positive, but n′ < n.
Figure 2-7. Virtual focal point F′ of a refracting surface. As in Figure 2-1, n′ > n,
but R is numerically negative.
The quantity on the right-hand side of Eq. (2-4) is called the refracting power K of
the surface. It is a measure of the ability of the refracting surface to convert a parallel
beam into a converging beam; the shorter the distance at which the refracted beam is
focused, the higher the power of the refracting surface. Its reciprocal is called the
equivalent or effective focal length fe of the surface. Thus, we may write
$$ K = \frac{n' - n}{R} = \frac{1}{f_e} . \tag{2-9} $$
The power K and the equivalent focal length fe are positive if n′ − n and R have the
same sign. Such a surface is called a positive or a converging surface. Similarly, K and
fe are negative if n′ − n and R have opposite signs. Such a surface is called a negative or
a diverging surface. We also note that fe = f′ if n′ = 1, i.e., the equivalent focal length
represents the image-space focal length when the refractive index n′ of the image space
is unity. In terms of the refracting power and focal lengths, Eq. (2-4) may be written
$$ \frac{n'}{S'} - \frac{n}{S} = K = \frac{1}{f_e} = \frac{n'}{f'} = -\frac{n}{f} . \tag{2-10} $$
When the focal length is measured in meters, the unit of power is called a diopter (D),
which is measured in m⁻¹.
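The relations among the two focal lengths, the power, and the equivalent focal length can be checked with a few lines of Python. The function `focal_lengths` and the sample values (n = 1, n′ = 1.5, R = 10) are assumptions used only for illustration.

```python
def focal_lengths(n, n_prime, R):
    """Return (f', f, K, f_e) from Eqs. (2-7a), (2-7b), and (2-9)."""
    f_prime = n_prime * R / (n_prime - n)   # image-space focal length, Eq. (2-7a)
    f = -n * R / (n_prime - n)              # object-space focal length, Eq. (2-7b)
    K = (n_prime - n) / R                   # refracting power, Eq. (2-9)
    return f_prime, f, K, 1.0 / K


n, n_prime, R = 1.0, 1.5, 10.0              # hypothetical values
f_prime, f, K, f_e = focal_lengths(n, n_prime, R)
print(f"f' = {f_prime:.2f}, f = {f:.2f}, K = {K:.3f}, f_e = {f_e:.2f}")

# Consistency checks corresponding to Eqs. (2-8) and (2-10).
print(abs(f_prime + (n_prime / n) * f) < 1e-9)   # f' = -(n'/n) f
print(abs(K - n_prime / f_prime) < 1e-12)        # K = n'/f'
print(abs(K + n / f) < 1e-12)                    # K = -n/f
```

For this converging surface, f′ is positive and f is negative, and the two focal points lie at different distances from the vertex because n′ differs from n.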
2.2.4 Magnifications and Lagrange Invariant
Now we consider the imaging of an off-axis point object P lying at a height h from
the optical axis in the object plane passing through P0 , as illustrated in Figure 2-8. The
incident and the refracted rays PV and VP′, respectively, are shown in the figure passing
through the vertex V. The image lies at the point P′, where the refracted ray VP′
intersects the image plane passing through P0′. Both the object and the image planes are
mutually parallel and perpendicular to the optical axis. It is evident from the figure that
the angles of incidence and refraction from the surface normal at V, i.e., from the optical
axis, are given by
$$ \theta = h/S \tag{2-11a} $$
and
$$ \theta' = h'/S' , \tag{2-11b} $$
respectively. Note that θ, θ′, and h′ are all numerically negative. Substituting Eqs. (2-11)
into the Snell's law equation (2-1), we find that the transverse magnification of the
image is given by
$$ M_t \equiv \frac{h'}{h} = \frac{nS'}{n'S} . \tag{2-12a} $$
It should be evident that the image of an extended object lying in the object plane is
uniformly magnified so that the image is geometrically similar to the object. Substituting
for n′/S′ from Eq. (2-10) into Eq. (2-12a), the magnification can be written in terms of
the object distance S and the focal length f′:
$$ M_t = \frac{nf'}{nf' + n'S} . \tag{2-12b} $$
Another equation for magnification can be obtained by considering an object ray PA
incident in the direction of the center of curvature C. Because the angle of incidence of
this ray is zero, its angle of refraction is also zero, and the ray remains undeviated when
refracted by the surface. The ray PACP′ is exact, and the Gaussian image point P′ lies
on it. It is seen from similar triangles P0CP and P0′CP′ that the magnification of the
image is given by
$$ M_t = -\frac{S' - R}{R - S} . \tag{2-12c} $$
Figure 2-8. Imaging of an off-axis point object P lying at a height h from the optical
axis. The image point P′ lies at a height h′.
For an object lying at infinity (S = −∞), Mt = 0. For an object lying between
infinity and the object-space focal plane, Mt < 0, i.e., the image is inverted. An object
ray is refracted toward the optical axis. As the object approaches the object-space focal
plane, |Mt| → ∞, and an object ray is refracted parallel to the optical axis. For an object
lying between the object-space focal plane and the surface, Mt > 0, i.e., the image is erect.
The image is virtual in this case, as illustrated in Figure 2-3. As the object approaches the
surface, the image also approaches it with Mt → 1.
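The three expressions for the transverse magnification, Eqs. (2-12a) through (2-12c), can be compared numerically. The values below (n = 1, n′ = 1.5, R = 10, S = −60) are assumed for illustration only.

```python
n, n_prime, R, S = 1.0, 1.5, 10.0, -60.0           # hypothetical values

S_prime = n_prime / (n / S + (n_prime - n) / R)    # image distance from Eq. (2-4)
f_prime = n_prime * R / (n_prime - n)              # image-space focal length, Eq. (2-7a)

Mt_a = n * S_prime / (n_prime * S)                 # Eq. (2-12a)
Mt_b = n * f_prime / (n * f_prime + n_prime * S)   # Eq. (2-12b)
Mt_c = -(S_prime - R) / (R - S)                    # Eq. (2-12c)

print(Mt_a, Mt_b, Mt_c)  # all three should agree (about -0.5 here)
```

The negative value indicates an inverted image, as expected for an object lying between infinity and the object-space focal plane.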
The ray angular magnification, representing the ratio of the angular convergence of
the rays to P0′ to their angular divergence from P0 (see Figure 2-9), is given by
$$ M_\beta \equiv \beta_0'/\beta_0 = S/S' . \tag{2-13} $$
Note that Mβ is not the ratio of the angular sizes θ′ and θ of the image P0′P′ and the
object P0P, respectively, subtended at V in Figure 2-8. From Eqs. (2-12a) and (2-13), we
find that the product of the transverse and angular magnifications is given by
$$ M_t M_\beta = n/n' , \tag{2-14} $$
which depends only on the ratio of the refractive index of the object space to that of the
image space. In particular, it does not depend on the object and image distances.
Consequently, a large transverse magnification of the image can be obtained only with a
correspondingly small angular magnification of the rays, i.e., by having a much smaller
angular divergence of the rays at the image than at the object. From the definitions of the
magnifications, namely, Eqs. (2-12a) and (2-13), Eq. (2-14) can also be written
$$ n'h'\beta_0' = nh\beta_0 , \tag{2-15} $$
showing that the quantity nhβ0 does not change upon refraction (see Figure 2-9). This
quantity is called the Lagrange (or the Smith–Helmholtz) invariant.
From Eqs. (2-12a) and (2-13), the transverse magnification of the image can also be
written
$$ M_t = \frac{n\beta_0}{n'\beta_0'} , \tag{2-16} $$
i.e., it can be obtained from the slope angles of the incident and refracted rays for an axial
object point.
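A short numerical check of the Lagrange invariant may help here. In this sketch the values (n = 1, n′ = 1.5, R = 10, S = −60, object height h = 2, ray height x = 1) and all variable names are assumptions for illustration.

```python
n, n_prime, R = 1.0, 1.5, 10.0   # hypothetical surface
S, h, x = -60.0, 2.0, 1.0        # object distance, object height, ray height at the surface

S_prime = n_prime / (n / S + (n_prime - n) / R)   # Eq. (2-4)
beta0 = -x / S                                     # slope angle of the object ray, Eq. (2-3a)
beta0_prime = -x / S_prime                         # slope angle of the image ray, Eq. (2-3b)

M_t = n * S_prime / (n_prime * S)                  # transverse magnification, Eq. (2-12a)
M_beta = beta0_prime / beta0                       # angular magnification, Eq. (2-13)
h_prime = M_t * h                                  # image height

lagrange_object = n * h * beta0
lagrange_image = n_prime * h_prime * beta0_prime

print(M_t * M_beta, n / n_prime)         # Eq. (2-14): the two numbers should agree
print(lagrange_object, lagrange_image)   # Eq. (2-15): the two numbers should agree
```

Changing S or x in this sketch changes the individual magnifications and slope angles, but the product n h β0 in object space continues to equal n′ h′ β0′ in image space.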
From Eqs. (2-11), the ratio of the angular sizes θ′ and θ of the image P0′P′ and the
object P0P subtended at the vertex V in Figure 2-8 is given by
$$ M_\theta \equiv \frac{\theta'}{\theta} = \frac{h'}{S'}\,\frac{S}{h} = M_t M_\beta = \frac{n}{n'} . \tag{2-17} $$
Figure 2-9. Lagrange invariant nhβ0 of a refracting surface.
Suppose we treat the angles θ and θ′ as the angular divergences of the rays from an
object and its image located at V. We should then be able to obtain Eq. (2-17) from Eq.
(2-13) by replacing β0 by θ and β0′ by θ′. This is indeed the case because Mt = 1 for the
conjugates located at the vertex. Thus, Eqs. (2-12a) and (2-13) yield the result
Mθ = n/n′.
For a small change ΔS in the object distance, let the corresponding change in the
image distance be ΔS′, as illustrated in Figure 2-10. The ratio ΔS′/ΔS is called the
longitudinal magnification Ml because it represents the magnification of the image of a
small axial object. Differentiating both sides of Eq. (2-4), we find that
$$ M_l \equiv \Delta S'/\Delta S = (n/n')(S'/S)^2 = (n'/n) M_t^2 = M_t/M_\beta . \tag{2-18} $$
Whether the transverse magnification Mt is positive or negative, the longitudinal
magnification Ml is always positive, indicating, for example, that if the object distance S
increases (from a larger negative value to a smaller one), i.e., if the object moves closer to
the refracting surface, then the image distance S′ also increases (from a smaller positive
value to a larger one), i.e., the image moves farther from the surface. Thus, the image
moves in the same direction as the object. Because the value of Mt varies with the
position of the object, Ml also varies with it. Therefore, Eq. (2-18) is valid only for
infinitesimal values of ΔS. In this equation, the refracting surface is assumed to be fixed,
and ΔS′ represents the image displacement corresponding to an object displacement ΔS.
However, if the object is fixed and the refracting surface is displaced by an amount Δ,
then the corresponding displacement of the image is given by (1 − n′Mt²/n)Δ, as shown
in Section 2.8.3.
Figure 2-10. Longitudinal magnification. As an axial point object moves from P0 to
P1 by a small amount ΔS, the image moves in the same direction from P0′ to P1′ by
an amount ΔS′. The longitudinal magnification ΔS′/ΔS of the image is given by
Eq. (2-18).
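The differential character of Eq. (2-18) can be illustrated by comparing it with a finite-difference estimate. The following sketch assumes the values n = 1, n′ = 1.5, R = 10, S = −60 and a small step dS = 10⁻⁴; the function and variable names are hypothetical.

```python
def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4) for the image distance S'."""
    return n_prime / (n / S + (n_prime - n) / R)


n, n_prime, R, S = 1.0, 1.5, 10.0, -60.0   # hypothetical values
dS = 1e-4                                   # small object displacement toward the surface

S_prime = image_distance(n, n_prime, R, S)
M_t = n * S_prime / (n_prime * S)           # transverse magnification, Eq. (2-12a)

M_l_formula = (n_prime / n) * M_t**2        # longitudinal magnification, Eq. (2-18)
M_l_numeric = (image_distance(n, n_prime, R, S + dS) - S_prime) / dS

print(M_l_formula, M_l_numeric)  # both close to 0.375 for these values
```

The two numbers agree only to the extent that dS is small, which is the sense in which Eq. (2-18) holds for infinitesimal displacements.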
We note from Eq. (2-18) that, unless Mt = ±1, the longitudinal and transverse
magnifications are not equal, and the 3D image of a 3D object is accordingly
geometrically different from the object. This is illustrated in Figure 2-11. The transverse
image is reversed, as illustrated by the reversal of the arrows P0′x′ and P0′y′ compared to
the arrows P0x and P0y. As illustrated by the arrows P0z and P0′z′, the longitudinal
image points in the same direction as the object, yielding a positive longitudinal
magnification.
Figure 2-11. 3D image of a 3D object. The magnification of the transverse image is
negative, as illustrated by the reversal of arrows in the x and y directions. The z
arrows point in the same direction, indicating a positive longitudinal magnification.
For a finite value of ΔS, let S0 and S1 be the object distances of P0 and P1, as
illustrated in Figure 2-12. The corresponding image distances S0′ and S1′ are given by
$$ \frac{n'}{S_0'} - \frac{n}{S_0} = K \tag{2-19a} $$
and
$$ \frac{n'}{S_1'} - \frac{n}{S_1} = K . \tag{2-19b} $$
Figure 2-12. Image P0′P1′ of a longitudinal object P0P1. The longitudinal
magnification L′/L of the image is given by Eq. (2-21).
Equating the left-hand sides of Eqs. (2-19a) and (2-19b), we obtain
$$ \frac{n'}{S_1'} - \frac{n}{S_1} = \frac{n'}{S_0'} - \frac{n}{S_0} , $$
or
$$ n'\left(\frac{1}{S_1'} - \frac{1}{S_0'}\right) = n\left(\frac{1}{S_1} - \frac{1}{S_0}\right) . \tag{2-20} $$
Accordingly, the longitudinal magnification is given by
$$ M_l = \frac{L'}{L} = \frac{S_1' - S_0'}{S_1 - S_0} = \frac{n}{n'}\,\frac{S_0' S_1'}{S_0 S_1} = \frac{n'}{n} M_0 M_1 , \tag{2-21} $$
where M0 = nS0′/(n′S0) and M1 = nS1′/(n′S1) are the transverse magnifications of the
images lying in image planes passing through P0′ and P1′, respectively. Thus, for
example, the image of a cube is a truncated pyramid, as illustrated in Figure 2-13. The
pyramid becomes approximately a rectangular parallelepiped if the cube is infinitesimal
in size. It is shown in Section 2.5 that for an afocal system, Mt is independent of the
position of the object and, therefore, so is Ml.
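Equation (2-21) for a finite longitudinal object can be verified with a small numerical example. The values (n = 1, n′ = 1.5, R = 10, S0 = −60, S1 = −40) and the helper `image_distance` are assumptions chosen for illustration.

```python
def image_distance(n, n_prime, R, S):
    """Solve Eq. (2-4) for the image distance S'."""
    return n_prime / (n / S + (n_prime - n) / R)


n, n_prime, R = 1.0, 1.5, 10.0      # hypothetical surface
S0, S1 = -60.0, -40.0               # object distances of the end points P0 and P1

S0_prime = image_distance(n, n_prime, R, S0)
S1_prime = image_distance(n, n_prime, R, S1)

M0 = n * S0_prime / (n_prime * S0)  # transverse magnification in the plane through P0'
M1 = n * S1_prime / (n_prime * S1)  # transverse magnification in the plane through P1'

M_l_direct = (S1_prime - S0_prime) / (S1 - S0)   # L'/L computed from the distances
M_l_product = (n_prime / n) * M0 * M1            # right-hand side of Eq. (2-21)

print(M_l_direct, M_l_product)  # both close to 0.75 for these values
```

Because M0 and M1 differ, the two faces of a cube lying in the planes through P0 and P1 are imaged with different sizes, which is the origin of the truncated-pyramid shape mentioned above.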
Figure 2-13. Image of a cube. The image is a truncated pyramid owing to different
transverse magnifications of the images of objects lying in different object planes.
2.2.5 Graphical Imaging

The location of the Gaussian image point P′ can be determined graphically as the
point of intersection of any two of the following three conveniently drawn rays from the
point object P, as illustrated in Figure 2-14.
1. By definition of the image-space focal point F′, a ray 1 incident parallel to the
optical axis passes through this point after refraction.
2. Ray 2 incident in the direction of the center of curvature C of the refracting
surface is refracted by it without any deviation. This is because the angle of
incidence of the ray is zero, and therefore the angle of refraction is also zero.
3. By definition of the object-space focal point F, a ray 3 incident passing through
this point is refracted parallel to the optical axis.
Extension of one or more of these rays may be necessary for them to intersect each other.
Moreover, in Gaussian optics, which is based on the paraxial rays, any refraction (or
reflection) at a surface takes place at the plane that is a tangent to it at its vertex, as
shown, for example, in Figures 2-9 and 2-14.
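The graphical construction with rays 1, 2, and 3 can be mimicked numerically by intersecting straight lines, with refraction taking place at the tangent plane z = 0. This is only a sketch: the helper `intersect`, the coordinate choice (vertex V at the origin, z along the optical axis), and the values (n = 1, n′ = 1.5, R = 10, S = −60, h = 2) are assumptions made for illustration.

```python
def intersect(z1, y1, m1, z2, y2, m2):
    """Intersection of the lines y = y1 + m1 (z - z1) and y = y2 + m2 (z - z2)."""
    z = (y2 - y1 + m1 * z1 - m2 * z2) / (m1 - m2)
    return z, y1 + m1 * (z - z1)


n, n_prime, R = 1.0, 1.5, 10.0          # hypothetical surface with vertex V at z = 0
S, h = -60.0, 2.0                       # object point P at (z, y) = (S, h)
f_prime = n_prime * R / (n_prime - n)   # image-space focal length, Eq. (2-7a)
f = -n * R / (n_prime - n)              # object-space focal length, Eq. (2-7b)

# Ray 1: parallel to the axis up to the tangent plane at (0, h), then through F' at (f', 0).
ray1 = (0.0, h, -h / f_prime)
# Ray 2: straight line from P through the center of curvature C at (R, 0).
ray2 = (S, h, -h / (R - S))

z_img, y_img = intersect(*ray1, *ray2)

# Ray 3: from P through F at (f, 0) to the tangent plane, then parallel to the axis,
# so its height at the tangent plane is also the image height.
y_ray3 = h + (h / (S - f)) * (0.0 - S)

print(z_img, y_img, y_ray3)  # z_img near 45; y_img and y_ray3 near -1
```

For these values, the intersection point agrees with the image distance S′ = 45 obtained from Eq. (2-4) and with the image height h′ = Mt h = −1 obtained from Eq. (2-12a).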
The Gaussian image P0′ of an on-axis point object P0 can be determined
independently (rather than as the point of intersection of the optical axis and the line that
is perpendicular to it and passes through P′) as follows: Consider a ray P0E incident on
the surface, as shown in Figure 2-15. A hypothetical ray incident parallel to it (shown by
a dashed line) and passing through C intersects the image-space focal plane at a point D.
The refracted ray corresponding to the incident ray P0E passes through the point D and
intersects the optical axis at the Gaussian image point P0′. The parallel rays P0E and CD
are focused by the refracting surface at the point D in the focal plane. The point D may
also be determined by considering a hypothetical parallel ray passing through the
object-space focal point F. It is refracted as a ray parallel to the optical axis intersecting
the focal plane at the point D.
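The on-axis construction can likewise be sketched numerically. The coordinates (vertex V at z = 0, refraction at the tangent plane), the ray height x = 1 at the point E, and the values n = 1, n′ = 1.5, R = 10, S = −60 are assumptions for this illustration.

```python
n, n_prime, R = 1.0, 1.5, 10.0          # hypothetical surface with vertex V at z = 0
S, x = -60.0, 1.0                       # axial object P0 at z = S; ray P0E meets the plane at height x
f_prime = n_prime * R / (n_prime - n)   # image-space focal length, Eq. (2-7a)
f = -n * R / (n_prime - n)              # object-space focal length, Eq. (2-7b)

slope_in = -x / S                       # slope of the incident ray P0E

# Point D: a ray parallel to P0E through C at (R, 0), evaluated in the focal plane z = f'.
y_D = slope_in * (f_prime - R)
# Alternative for D: a parallel ray through F at (f, 0) reaches the tangent plane at this
# height and then travels parallel to the axis.
y_D_alt = slope_in * (0.0 - f)

slope_out = (y_D - x) / f_prime         # refracted ray runs from E at (0, x) to D at (f', y_D)
z_image = -x / slope_out                # axial crossing point of the refracted ray

S_prime = n_prime / (n / S + (n_prime - n) / R)   # Eq. (2-4), for comparison
print(z_image, S_prime, y_D, y_D_alt)
```

Both ways of locating D give the same height, and the axial crossing of the refracted ray coincides with the image distance from Eq. (2-4), independent of the chosen ray height x.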
Figure 2-14. Graphical imaging to determine the image P0′P′ of an object P0P by a
refracting surface. In the Gaussian approximation, refraction of the object rays
takes place at the tangent plane AVB.
Figure 2-15. Graphical imaging to determine the image P0′ of an axial point object
P0.
2.2.6 Newtonian Imaging Equation

In the Gaussian imaging equation (2-4), the object and image distances S and S′,
respectively, are measured from the vertex V of the refracting surface. In the Newtonian
imaging equation, they are measured from the respective focal points F and F′. Thus, let
z and z′ be the object and image distances from the focal points F and F′, respectively,
as indicated in Figure 2-14. From similar triangles P0FP and FVA in this figure, we note
that the transverse magnification may be written
\[ M_t \equiv \frac{h'}{h} = -\frac{f}{z} , \tag{2-22} \]
where z (like f) is numerically negative because P0 lies to the left of the reference point
F. Similarly, from similar triangles VF′B and F′P0′P′, it may also be written
\[ M_t = -\frac{z'}{f'} . \tag{2-23} \]
Equating the right-hand sides of these equations, we obtain the Newtonian imaging
equation:
\[ z z' = f f' = -\frac{n}{n'} f'^2 . \tag{2-24} \]
It is evident from Eq. (2-24) that z and z′ must have opposite signs, implying that an
object and its image lie on the opposite sides of the corresponding focal points. For
example, if the object lies to the left of F, then the image lies to the right of F′.
Differentiating both sides of Eq. (2-24) and using Eqs. (2-8), (2-22), and (2-23), we
obtain Eq. (2-18), relating the longitudinal and transverse magnifications.
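The agreement between the Newtonian and Gaussian forms can be checked numerically. The following is a minimal sketch, assuming that Eq. (2-4) has the form n′/S′ - n/S = (n′ - n)/R (the form used later in Eq. (2-25)); the function names and the numerical values are illustrative assumptions, not taken from the text.

```python
import math

def surface_focal_lengths(n, n_prime, R):
    """Return (f, f') for a refracting surface of radius R between indices n and n'.

    Assumes the Gaussian equation n'/S' - n/S = (n' - n)/R.
    """
    f_prime = n_prime * R / (n_prime - n)   # image distance for an object at infinity
    f = -n * R / (n_prime - n)              # object distance that sends the image to infinity
    return f, f_prime

def gaussian_image_distance(S, n, n_prime, R):
    """Solve n'/S' - n/S = (n' - n)/R for S' (distances measured from the vertex V)."""
    return n_prime / ((n_prime - n) / R + n / S)

# Assumed example: air-to-glass surface, object 50 units to the left of V.
n, n_prime, R, S = 1.0, 1.5, 10.0, -50.0

f, f_prime = surface_focal_lengths(n, n_prime, R)
S_prime = gaussian_image_distance(S, n, n_prime, R)

z = S - f                        # object distance measured from F
z_prime = S_prime - f_prime      # image distance measured from F'

newton_lhs = z * z_prime                 # left-hand side of Eq. (2-24)
newton_rhs = f * f_prime                 # middle term of Eq. (2-24)
M_t_from_object = -f / z                 # Eq. (2-22)
M_t_from_image = -z_prime / f_prime      # Eq. (2-23)

print(newton_lhs, newton_rhs, M_t_from_object, M_t_from_image)
```

For a real object to the left of F, z is negative and z′ is positive, in line with the opposite-sign statement above.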
2.3 THIN LENS

It should be evident that the image formed by a lens consisting of two refracting
surfaces can be obtained by a repeated application of the imaging equation for a
refracting surface. The image formed by the first surface becomes the object for the
second, and its image by the second surface yields the image formed by the lens. In this
section, we consider imaging by a thin lens in air, i.e., one for which the spacing between
its two surfaces is negligible. We derive simple imaging equations for a thin lens such
that it is not necessary to apply the imaging equations for each surface to determine the
image of an object. Thus, we show that it is possible to determine the image of an object
without determining the image formed by its two surfaces sequentially. The imaging
equations for a thin lens with different media on its two sides are also given.
Finally, it is shown that the power of a system consisting of thin lenses in contact is equal to
the sum of the powers of the individual lenses.

Consider a thin lens in air made of a material of refractive index n, as illustrated in
Figure 2-16. Let the radii of curvature of its two surfaces be R1 and R2, with their
centers of curvature at C1 and C2, respectively. The line joining C1 and C2 defines the
optical axis OA of the lens. Consider an axial point object P0 lying at a distance S1 from
the lens. Its image P0′ formed by the first surface lies at a distance S1′ that, according to
Eq. (2-4), is given by
\[ \frac{n}{S_1'} - \frac{1}{S_1} = \frac{n - 1}{R_1} . \tag{2-25} \]
Figure 2-16. Imaging of an axial point object P0 by a thin lens of refractive index n.
The lens surfaces have radii of curvature of R1 and R2. The line OA connecting
their centers of curvature C1 and C2 defines the optical axis of the lens. C is the
center of the lens. P0′ is the image of P0 formed by the first surface, and P0″ is the
image of the virtual object P0′ formed by the second surface.
A ray from P0 is refracted by the first surface, intersecting the optical axis at P0′. This
image is a virtual object for the second surface because the rays associated with it appear
to converge to it rather than actually diverge from it. It lies at a distance S2 = S1′. Its image
P0″ formed by the second surface lies at a distance S2′ that, according to Eq. (2-4), is given by
\[ \frac{1}{S_2'} - \frac{n}{S_1'} = \frac{1 - n}{R_2} . \tag{2-26} \]
Adding Eqs. (2-25) and (2-26), we obtain
\[ \frac{1}{S'} - \frac{1}{S} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) , \tag{2-27} \]
where we have let S1 = S and S2′ = S′ be the object and final image distances, as
indicated in Figure 2-16. Equation (2-27) is the Gaussian imaging equation relating the
object and image distances.
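The equivalence of the two-step (surface-by-surface) calculation and the single-step equation (2-27) can be illustrated with a short numerical sketch. It assumes the single-surface equation in the form n′/S′ - n/S = (n′ - n)/R; the function names and the sample lens are hypothetical.

```python
import math

def refract(S, n_in, n_out, R):
    """Image distance S' for one surface: n_out/S' - n_in/S = (n_out - n_in)/R."""
    return n_out / ((n_out - n_in) / R + n_in / S)

def thin_lens_two_step(S, n, R1, R2):
    """Apply Eq. (2-25) and then Eq. (2-26), with S2 = S1' for a thin lens."""
    S1_prime = refract(S, 1.0, n, R1)       # image formed by the first surface
    return refract(S1_prime, n, 1.0, R2)    # that image is the object for the second

def thin_lens_single_step(S, n, R1, R2):
    """Solve Eq. (2-27) directly for S'."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2) + 1.0 / S)

# Assumed example: equiconvex lens, n = 1.5, object 30 units to the left of the lens.
n, R1, R2, S = 1.5, 10.0, -10.0, -30.0

S_two_step = thin_lens_two_step(S, n, R1, R2)
S_single = thin_lens_single_step(S, n, R1, R2)
print(S_two_step, S_single)
```

In this example the intermediate image distance S1′ is positive, so the first-surface image lies to the right of the lens and acts as a virtual object for the second surface. The sketch does not guard against an image at infinity, where the denominator vanishes.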
2.3.2 Focal Lengths and Refracting Power

By definition, the image-space focal length f′ represents the image distance when the
object lies at infinity, i.e., S′ = f′ when S = -∞. Therefore, from Eq. (2-27), f′ is
given by
\[ \frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) . \tag{2-28} \]
Thus, a ray incident on the lens parallel to its optical axis is refracted by the first surface
intersecting the optical axis at F1′ at a distance nR1/(n - 1), as illustrated in Figure 2-16.
This ray is refracted by the second surface intersecting the optical axis at F′, which is the
image-space focal point. In effect, the parallel ray incident on the lens is refracted by it
passing through F′, as illustrated in Figure 2-17a. Similarly, by definition of the
object-space focal length, f represents the object distance that yields an image at infinity. Thus,
S′ = ∞ when S = f, where f = -f′. A ray from the object-space focal point F incident
on the lens emerges from it parallel to its optical axis upon refraction, as illustrated in
Figure 2-17b. It should be evident that the focal points F and F′, which lie on the
opposite sides of the lens, are not conjugates of each other. The imaging equation (2-27)
can be written in terms of the focal length f′ as
\[ \frac{1}{S'} - \frac{1}{S} = \frac{1}{f'} . \tag{2-29} \]
The right-hand side of Eq. (2-27) represents the refracting power K of the lens. Its
reciprocal is called the equivalent or effective focal length fe of the lens. Thus, we may
write
\[ K = \frac{1}{f_e} = \frac{1}{f'} \tag{2-30} \]
\[ = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{2-31a} \]
\[ = (n - 1)(C_1 - C_2) , \tag{2-31b} \]
where C = 1/R is the curvature of a surface. We note that the refracting power of the lens
is equal to the sum of the refracting powers K1 and K2 of its two surfaces, i.e.,
\[ K = K_1 + K_2 , \tag{2-32} \]
where
\[ K_1 = \frac{n - 1}{R_1} \tag{2-33a} \]
and
\[ K_2 = \frac{1 - n}{R_2} . \tag{2-33b} \]
Figure 2-17. Focal points of a positive thin lens with its center C. (a) Image-space
focal point F′. (b) Object-space focal point F. Both focal points are real in that
parallel rays converge to F′, and rays actually originating from F form a parallel
beam after refraction by the lens.
We note that the focal length or the power of a lens depends on the difference in the
curvatures of its surfaces but not on the curvatures themselves. Thus, if the curvatures of
the lens surfaces are changed by the same amount, its shape changes without changing its
Gaussian properties. This degree of freedom, called the bending of the lens, is used in
reducing its aberrations. The equation (2-31) for the focal length of a thin lens, in terms of
its refractive index and the curvatures of its surfaces, has traditionally been called the lens
maker’s formula. This is, however, not correct because a lens of zero thickness cannot be
fabricated. This name should instead be associated with Eq. (4-41) for a thick lens
(described in Chapter 4).
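The bending degree of freedom can be illustrated with a small sketch based on Eq. (2-31b). The function name and the sample curvatures are assumptions chosen only for illustration.

```python
import math

def thin_lens_power(n, C1, C2):
    """Refracting power K = (n - 1)(C1 - C2) of a thin lens in air, Eq. (2-31b)."""
    return (n - 1.0) * (C1 - C2)

n = 1.5
C1, C2 = 0.10, -0.10    # assumed equiconvex starting shape (curvatures in inverse length units)
bend = 0.15             # the same curvature change applied to both surfaces

K_original = thin_lens_power(n, C1, C2)
K_bent = thin_lens_power(n, C1 + bend, C2 + bend)   # here the bent lens is a meniscus
print(K_original, K_bent)
```

Both curvatures of the bent lens are positive, so its shape has changed from equiconvex to meniscus while the power, and hence the Gaussian imaging, is unchanged.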
A lens with a positive value of K, fe, or f′, as illustrated in Figure 2-17, is called a
converging or a positive lens. A lens whose first surface has a positive curvature and
second surface has a negative curvature of the same magnitude as the first is referred to
as an equiconvex lens. The surfaces refract a ray incident on the lens toward the optical
axis. Similarly, a lens with a negative value of K, fe, or f′ is called a diverging or a
negative lens. It is shown in Figure 2-18, illustrating its focal points. Parallel rays incident
on the lens, as in Figure 2-18a, are refracted by it, appearing to diverge from the
image-space focal point F′. Similarly, rays converging to the virtual object-space focal
point F are refracted by the lens into a parallel beam, as illustrated in Figure 2-18b. A lens
whose first surface has a negative curvature and second surface has a positive curvature of
the same magnitude as the first is referred to as an equiconcave lens.
A lens with surface curvatures of the same sign is called a meniscus lens. It can be
positive or negative, as illustrated in Figure 2-19. Unless it is surrounded by a medium of
higher refractive index, a lens that is thick at the center compared to its edges is positive,
and a lens that is thin at the center is negative. Of course, one of the surfaces may be
planar, in which case the lens is called planoconvex or planoconcave, depending on the
curvature of the other surface.
It should be noted, however, that if a beam converging to F′ is incident on a positive
lens, as in Figure 2-20a, i.e., a virtual point object P0 at F′, a real image is formed at P0′.
Similarly, if a point object is placed at the focal point F′ of a negative lens, as in Figure
2-20b, i.e., a real point object P0 at F′, a virtual image is formed at P0′. The image
distance in both cases is given by half the corresponding focal length, since letting S = f′
in Eq. (2-29) yields S′ = f′/2.
Figure 2-18. Focal points of a negative thin lens. (a) Image-space focal point F′. (b)
Object-space focal point F. Both focal points are virtual in that parallel rays appear
to diverge from F′, or rays appearing to converge to F form a parallel beam after
refraction by the lens.
Figure 2-19. (a) Positive and (b) negative meniscus lens. The radii of curvature of
their surfaces have the same sign. The lens thickness at the center is higher
compared to that at the edges for a positive meniscus, and lower for a negative
meniscus.
Figure 2-20. (a) Virtual point object P0 at the real focus F′ of a positive lens. The
image point P0′ is real. (b) Real point object P0 at the virtual focus F′ of a negative
lens. The image point P0′ is virtual.

The transverse magnification of the image formed by the lens can be obtained by
applying Eq. (2-12) to the images formed by its two surfaces. A ray from an off-axis
point object P passing through the center of curvature C1 of the first surface is shown in
Figure 2-21 intersecting the image plane at its image P′. The magnification of the
inverted image P0′P′ of the object P0P formed by the first surface is given by
\[ M_1 \equiv \frac{h_1'}{h_1} \tag{2-34a} \]
\[ = \frac{S_1'}{n S_1} . \tag{2-34b} \]
A parallel ray from P is refracted by the first surface passing through its focal point F1′,
which, in turn, is refracted by the second surface passing through the focal point F′ of
the lens and intersecting the final image plane at the image point P″. The magnification
of the erect image P0″P″ of the object P0′P′ formed by the second surface is given by
\[ M_2 \equiv \frac{h_2'}{h_2} = \frac{h_2'}{h_1'} \tag{2-35a} \]
\[ = \frac{n S_2'}{S_1'} . \tag{2-35b} \]
Therefore, the transverse magnification of the final image P0″P″ of the object P0P
formed by the lens as a whole is given by
\[ M_t = M_1 M_2 = \frac{h_2'}{h_1} = \frac{S_2'}{S_1} , \tag{2-36} \]
or
\[ M_t \equiv \frac{h'}{h} = \frac{S'}{S} , \tag{2-37a} \]
where we have let h = h1 and h′ = h2′ be the object and final image heights, respectively.
Substituting for S′ from Eq. (2-29) into Eq. (2-37a), the magnification can also be
written in terms of S and f′:
\[ M_t = \frac{f'}{f' + S} . \tag{2-37b} \]
Figure 2-21. Imaging of an off-axis point object P. The dotted line simply shows that
the final image P″ lies on the line joining P′ and C2, as expected.
The angular magnification of a ray bundle diverging from the axial point object P0
and converging toward its image P0′ (see Figure 2-22) is given by
\[ M_\beta = \frac{\beta_0'}{\beta_0} = \frac{S}{S'} . \tag{2-38} \]
From Eqs. (2-37) and (2-38), we find that the product of the transverse magnification of
the image and the angular magnification of the ray bundle for a thin lens is given by
\[ M_t M_\beta = 1 . \tag{2-39} \]
From the definitions of the magnifications, Eq. (2-39) can also be written
\[ h' \beta_0' = h \beta_0 , \tag{2-40} \]
showing that the quantity hβ0 is invariant upon refraction by the lens. This quantity is
called the Lagrange invariant. [It is shown in Section 5.4.10 that the object flux entering
the lens is proportional to its square.] From Eq. (2-40), the transverse magnification of the
image can also be written
\[ M_t = \frac{\beta_0}{\beta_0'} , \tag{2-41} \]
i.e., it is given by the ratio of the slope angles of the incident and refracted rays for an
axial point object.
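Equations (2-37) through (2-40) can be checked numerically for a sample conjugate pair. This is a sketch only: the sign of a paraxial slope angle is taken here as β = -y/S for a ray meeting the lens at height y, which is an assumption inferred from the signs indicated in Figure 2-22, and all names and values are illustrative.

```python
import math

def thin_lens_image_distance(S, f_prime):
    """Solve Eq. (2-29), 1/S' - 1/S = 1/f', for S'."""
    return 1.0 / (1.0 / f_prime + 1.0 / S)

# Assumed example: f' = 10, object of height 2 placed 30 units to the left of the lens.
f_prime, S, h = 10.0, -30.0, 2.0
S_prime = thin_lens_image_distance(S, f_prime)

M_t = S_prime / S                    # Eq. (2-37a)
M_t_focal = f_prime / (f_prime + S)  # Eq. (2-37b)
h_prime = M_t * h

y = 1.0                              # height at which an axial ray meets the lens
beta0 = -y / S                       # slope angle in object space (assumed sign convention)
beta0_prime = -y / S_prime           # slope angle in image space
M_beta = beta0_prime / beta0         # Eq. (2-38)

invariant_object = h * beta0                 # h * beta0
invariant_image = h_prime * beta0_prime      # h' * beta0'
print(M_t, M_beta, invariant_object, invariant_image)
```

The choice of y does not matter, since it cancels in the ratio of the slope angles.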
Differentiating both sides of Eq. (2-27), we obtain the longitudinal magnification of
the image:
\[ M_l \equiv \frac{\Delta S'}{\Delta S} = \left(\frac{S'}{S}\right)^2 = M_t^2 = \frac{M_t}{M_\beta} . \tag{2-42} \]
Figure 2-22. Lagrange invariant hβ0 for imaging by a thin lens.

The comments made following Eq. (2-18) apply to Eq. (2-42) as well. Thus, for example,
when the object is displaced longitudinally, the image is displaced in the same direction
as the object. In Eq. (2-42), the lens is assumed to be fixed in position, and ΔS′
represents the displacement of the image corresponding to a displacement ΔS of the
object. However, if the object is fixed and the lens is displaced by an amount Δ, then the
corresponding displacement of the image is (1 - Mt²)Δ, as shown in Section 2.9.3.
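Both statements can be checked by finite differences. The sketch below assumes a thin lens in air obeying Eq. (2-29); the helper names, the step size, and the sample values are illustrative assumptions.

```python
import math

def image_distance(S, f_prime):
    """Solve Eq. (2-29) for S'."""
    return 1.0 / (1.0 / f_prime + 1.0 / S)

def image_position(x_lens, f_prime):
    """Image position on the axis for an object fixed at x = 0 and a lens at x_lens."""
    return x_lens + image_distance(-x_lens, f_prime)

f_prime, S, step = 10.0, -30.0, 1e-4
M_t = image_distance(S, f_prime) / S

# Lens fixed, object displaced: central-difference estimate of dS'/dS.
M_l_numeric = (image_distance(S + step, f_prime)
               - image_distance(S - step, f_prime)) / (2.0 * step)

# Object fixed, lens displaced: image displacement per unit lens displacement.
x_lens = -S
lens_shift_ratio = (image_position(x_lens + step, f_prime)
                    - image_position(x_lens - step, f_prime)) / (2.0 * step)

print(M_l_numeric, M_t ** 2, lens_shift_ratio, 1.0 - M_t ** 2)
```

The positive value of the first ratio reflects the statement that the image moves in the same direction as the object.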

The Gaussian image of a point object can be located graphically, as illustrated in
Figure 2-23, in the same manner as in Section 2.2.5 for the case of a refracting surface,
except that a ray through the center of curvature of the surface is replaced by one through
the center of the lens. Thus, a ray from an object point P incident parallel to the optical
axis of the lens emerges from it passing through its image-space focal point F′, and a ray
incident in the direction of its object-space focal point F emerges parallel to the optical
axis. The intersection of these two rays locates the image point P′. The ray passing
through F determines the image height h′. The transverse magnification given by
Eq. (2-37) dictates similarity of the triangles P0CP and P0′CP′, showing that a ray incident in
the direction of the center C of the lens passes through it undeviated. Figure 2-23 is
similar to Figure 2-16 except that the two-step imaging (one for each surface) has been
replaced by single-step imaging. Only two of the three rays from an off-axis point object,
namely, parallel to the axis, in the direction of the object-space focal point, and in the
direction of the center, are needed to determine the image point. Of course, the third ray
provides a good check on the correctness of the drawing.

Figure 2-23. Imaging by a lens of refractive index n and focal length f′. Compared
with Figure 2-16, the two-step imaging (one for each surface) has been replaced by
single-step imaging.

In the Gaussian imaging equation (2-27), the object and image distances S and S′,
respectively, are measured from the lens center. In the corresponding Newtonian imaging
equation, they are measured from the respective focal points. Thus, as indicated in Figure
2-23, let z and z′ be the object and image distances from the focal points F and F′,
respectively. From similar triangles P0FP and FCA, we note that the transverse
magnification of the image can be written
\[ M_t \equiv \frac{h'}{h} = -\frac{f}{z} . \tag{2-43} \]
Similarly, from similar triangles CF′B and P0′F′P′, it may also be written
\[ M_t = -\frac{z'}{f'} . \tag{2-44} \]
The negative sign on the right-hand sides of Eqs. (2-43) and (2-44) has been introduced
because Mt in Figure 2-23 is numerically negative due to h′ being numerically negative.
From Eqs. (2-43) and (2-44), we obtain
\[ z z' = f f' = -f'^2 , \tag{2-45} \]
which is the Newtonian imaging equation. It is clear from this equation that z and z′ must
have opposite signs, implying that an object and its image lie on opposite sides of the
corresponding focal points. Differentiating both sides of Eq. (2-45) and using Eqs. (2-43)
and (2-44), we obtain Eq. (2-42) relating the transverse and longitudinal magnifications.
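As a cross-check, the image of a thin lens in air can be located from the Newtonian equation (2-45) and compared with the Gaussian result of Eq. (2-29). This is an illustrative sketch; the function names and values are assumptions.

```python
import math

def image_distance_gaussian(S, f_prime):
    """S' from Eq. (2-29), measured from the lens center."""
    return 1.0 / (1.0 / f_prime + 1.0 / S)

def image_distance_newtonian(S, f_prime):
    """S' obtained through Eq. (2-45), using f = -f' for a thin lens in air."""
    f = -f_prime
    z = S - f                        # object distance measured from F
    z_prime = -f_prime ** 2 / z      # Eq. (2-45): z z' = -f'^2
    return f_prime + z_prime         # convert back to a distance from the lens center

f_prime, S = 10.0, -30.0
S_gauss = image_distance_gaussian(S, f_prime)
S_newton = image_distance_newtonian(S, f_prime)

z = S + f_prime
M_t_newton = f_prime / z             # Eq. (2-43) with f = -f'
print(S_gauss, S_newton, M_t_newton)
```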
2.3.6 Image Throw

The distance L between an object and its image is called the throw (see Figure 2-24).
Its minimum value for real conjugates may be obtained by letting S′ = L + S in
Eq. (2-29) and setting the derivative of L with respect to S equal to zero. Thus,
\[ \frac{1}{L + S} - \frac{1}{S} = \frac{1}{f'} , \tag{2-46} \]
or
\[ L = -\frac{S^2}{S + f'} , \tag{2-47} \]
and
\[ 0 = \frac{\partial L}{\partial S} = -\frac{S(S + 2f')}{(S + f')^2} . \tag{2-48} \]
We discard the solution S = 0 because it implies an object and its image both located at
the lens. Therefore,
\[ S = -2f' . \tag{2-49} \]
The corresponding value of S′ from Eq. (2-29) is equal to 2f′. Accordingly,
\[ L_{\min} = S' - S = 4f' . \tag{2-50} \]
Thus, the minimum throw of a thin lens for real conjugates is equal to 4f′. The
magnification of the image in that case is -1.
Figure 2-24. Image throw L of a lens representing the distance between an object
and its image.
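The minimum can also be located by a brute-force scan of Eq. (2-47) over real conjugates of a positive lens (S < -f′). The grid, the focal length, and the names below are illustrative assumptions; the scan is a numerical check, not a replacement for the derivation.

```python
import math

def throw(S, f_prime):
    """Object-to-image distance L = -S^2 / (S + f'), Eq. (2-47)."""
    return -S * S / (S + f_prime)

f_prime = 10.0   # assumed focal length of a positive thin lens

# Scan object distances from -1.001 f' to -10 f' in steps of 0.001 f'.
candidates = [-f_prime * (1.0 + k / 1000.0) for k in range(1, 9001)]
S_at_min = min(candidates, key=lambda S: throw(S, f_prime))
L_min = throw(S_at_min, f_prime)

S_prime_at_min = L_min + S_at_min
M_t_at_min = S_prime_at_min / S_at_min
print(S_at_min, L_min, M_t_at_min)
```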
By interchanging the (magnitudes of the) object and image distances, Eq. (2-29)
shows that a pair of conjugates is obtained for two positions of the lens, as illustrated in
Figure 2-25. The focal length of a lens can be determined accurately from the throw of an
image and the spacing d between the lens positions. We note from the figure that
\[ L = S' - S \tag{2-51} \]
and
\[ d = -(S' + S) . \tag{2-52} \]
Solving for S and S′, we obtain
\[ S = -\frac{L + d}{2} \tag{2-53a} \]
and
\[ S' = \frac{L - d}{2} . \tag{2-53b} \]
Substituting Eqs. (2-53) into Eq. (2-29), we obtain the focal length of the lens:
\[ f' = \frac{L^2 - d^2}{4L} . \tag{2-54} \]
The magnifications of the two images are given by
\[ M_1 \equiv \frac{h'}{h} = \frac{S'}{S} = -\frac{L - d}{L + d} \tag{2-55a} \]
and
\[ M_2 \equiv \frac{h''}{h} = \frac{S}{S'} = \frac{1}{M_1} , \tag{2-55b} \]
and they are reciprocals of each other.
Figure 2-25. Two lens positions for a pair of conjugates. The object and image
distances are interchanged as the lens is moved from one position to the other.
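The two-position relation (2-54) lends itself to a simple round-trip check: generate a conjugate pair for a lens of known focal length, form L and d from Eqs. (2-51) and (2-52), and recover the focal length. The function name and the numbers are assumed for illustration.

```python
import math

def focal_length_from_two_positions(L, d):
    """Focal length from the throw L and the lens-position spacing d, Eq. (2-54)."""
    return (L * L - d * d) / (4.0 * L)

# Simulated measurement with an assumed lens of focal length 10.
f_true, S = 10.0, -30.0
S_prime = 1.0 / (1.0 / f_true + 1.0 / S)   # Eq. (2-29)

L = S_prime - S                # Eq. (2-51)
d = -(S_prime + S)             # Eq. (2-52)

f_recovered = focal_length_from_two_positions(L, d)
M1 = -(L - d) / (L + d)        # Eq. (2-55a)
M2 = 1.0 / M1                  # Eq. (2-55b)
print(f_recovered, M1, M2)
```

Only L and d enter the result, so the location of the lens center itself need not be known.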
2.3.7 Thin Lens Not in Air

If the media on the object and image sides of a thin lens of refractive index nl have
the refractive indices nm and nm′, as illustrated in Figure 2-26, then it can be shown that
the following imaging equations are obtained:
\[ \frac{n_m'}{S'} - \frac{n_m}{S} = \frac{n_l - n_m}{R_1} + \frac{n_m' - n_l}{R_2} \tag{2-56a} \]
\[ = K_1 + K_2 \tag{2-56b} \]
\[ = \frac{n_m'}{f'} = -\frac{n_m}{f} \tag{2-56c} \]
\[ = K = \frac{1}{f_e} , \tag{2-56d} \]
\[ M_t \equiv \frac{h'}{h} = \frac{n_m S'}{n_m' S} \tag{2-57a} \]
\[ = \frac{n_m \beta_0}{n_m' \beta_0'} \tag{2-57b} \]
\[ = -\frac{z'}{f'} = -\frac{f}{z} , \tag{2-57c} \]
\[ M_\beta = \frac{\beta_0'}{\beta_0} = \frac{S}{S'} , \tag{2-58} \]
\[ M_t M_\beta = \frac{n_m}{n_m'} , \tag{2-59} \]
\[ n_m' h' \beta_0' = n_m h \beta_0 , \tag{2-60} \]
\[ M_l \equiv \frac{\Delta S'}{\Delta S} = \frac{n_m}{n_m'}\left(\frac{S'}{S}\right)^2 = \frac{n_m'}{n_m} M_t^2 = \frac{M_t}{M_\beta} , \tag{2-61} \]
and
\[ z z' = f f' = -\frac{n_m}{n_m'} f'^2 . \tag{2-62} \]
Comparing these equations with those for a thin lens in air, we note that the power K of
the lens is again equal to the sum of the powers K1 and K2 of its surfaces, although the
power of each surface is now different from that before. However, as illustrated in Figure
2-26, a ray incident in the direction of the lens center does not pass through it undeviated.
Moreover, the object- and image-space focal lengths have different magnitudes.
Figure 2-26. Imaging by a thin lens with different media on its two sides. A ray
incident toward its center deviates upon refraction, unlike the case of a thin lens in
air.
From Eqs. (2-56a) and (2-56c), the focal length of a thin lens in air may be written
\[ \frac{1}{f'} = \left(\frac{n_l}{n_a} - 1\right)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) , \tag{2-63} \]
where na is the refractive index of air. The refractive index of air is approximately equal
to 1.0003. It is a universal practice in optical design to specify the refractive index of a
lens material relative to that of the air (instead of vacuum). Letting nl/na = n, where n is
the specified index of the lens material, Eq. (2-63) reduces to Eq. (2-31a), showing that
the index of air may be assumed to be unity when the index of the lens is specified with
respect to it.
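The following sketch evaluates Eqs. (2-56) for two cases: a lens with different media on its two sides, and a lens in air treated with the absolute index of air. The function names, the water index of 1.33, and the lens data are assumptions made for illustration.

```python
import math

def thin_lens_power(n_l, n_m, n_m_prime, R1, R2):
    """Power K = K1 + K2 of a thin lens between media n_m and n_m', Eqs. (2-56a, b)."""
    K1 = (n_l - n_m) / R1
    K2 = (n_m_prime - n_l) / R2
    return K1 + K2

def focal_lengths(n_l, n_m, n_m_prime, R1, R2):
    """Return (f, f') from n_m'/f' = -n_m/f = K, Eqs. (2-56c, d)."""
    K = thin_lens_power(n_l, n_m, n_m_prime, R1, R2)
    return -n_m / K, n_m_prime / K

R1, R2 = 10.0, -10.0

# Assumed example: glass lens (1.5) with air in front and water (1.33) behind.
f_water, f_prime_water = focal_lengths(1.5, 1.0, 1.33, R1, R2)

# Lens in air: the absolute lens index is n * n_a when n = 1.5 is specified relative to air.
n_a, n = 1.0003, 1.5
f_air, f_prime_air = focal_lengths(n * n_a, n_a, n_a, R1, R2)

# Eq. (2-28) with the index of air taken as unity and n specified relative to air.
f_prime_relative = 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))

print(f_water, f_prime_water, f_prime_air, f_prime_relative)
```

In the first case the two focal lengths differ in magnitude by the ratio of the indices of the two media; in the second they are equal and opposite, and the explicit index of air drops out.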
2.3.8 Thin Lenses in Contact

The image of an object formed by two or more thin lenses in contact can be obtained
in the same manner as for the two surfaces of a thin lens, except that the imaging equation
for a surface is replaced by that of a lens. The image of an object formed by the first lens
acts as the object for the second lens. Just as we showed that the power of a thin lens is
given by the sum of the powers of its surfaces, we now show that the power of a doublet
consisting of two thin lenses in contact is given by the sum of the powers of the thin
lenses.
As illustrated in Figure 2-27, consider two thin lenses L1 and L2 of focal lengths f1′
and f2′ in contact. To determine the focal length of the doublet, we consider an object
lying at infinity. The first lens forms its image at F1′ at a distance f1′. This image is the
object for the second lens, which forms its image at a distance f′ that is the focal length
of the doublet, according to
\[ \frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} . \tag{2-64} \]
Thus, the inverse of the focal length of the doublet is equal to the sum of the inverses of
the focal lengths of its two lenses. It shows that, because the power of a lens is equal to
the inverse of its focal length, the power of the doublet is equal to the sum of the powers
of its two lenses. The above reasoning can be extended to three or more thin lenses,
yielding the result that the power of a system consisting of any number of thin lenses in
contact is equal to the sum of the powers of the individual lenses. Of course, the
approximation involved in neglecting the thickness of the lenses becomes increasingly
coarse as the number of lenses increases.
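The additivity of powers can be illustrated by imaging sequentially through lenses in contact and comparing with a single lens of the summed power. This is a sketch with assumed focal lengths and a finite object distance; the function names are hypothetical.

```python
import math

def image_through_lenses_in_contact(S, focal_lengths):
    """Apply Eq. (2-29) lens by lens; with zero spacing, each image is the next object."""
    for f_prime in focal_lengths:
        S = 1.0 / (1.0 / f_prime + 1.0 / S)
    return S

def combined_focal_length(focal_lengths):
    """Focal length of the stack from the sum of the individual powers, Eq. (2-64)."""
    return 1.0 / sum(1.0 / f_prime for f_prime in focal_lengths)

lenses = [10.0, -20.0, 40.0]   # assumed focal lengths of three thin lenses in contact
S = -30.0                      # assumed object distance

S_sequential = image_through_lenses_in_contact(S, lenses)
f_stack = combined_focal_length(lenses)
S_single = 1.0 / (1.0 / f_stack + 1.0 / S)
print(S_sequential, S_single, f_stack)
```

The sketch does not handle an intermediate image at infinity, for which the denominator in the loop vanishes.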
2.4 GENERAL SYSTEM

2.4.1 Introduction
If an imaging system consists of a thick lens such that the spacing between its two
surfaces is not negligible, the image formed by it can be obtained by sequentially
applying the imaging equation for a refracting surface. The image formed by the first
surface becomes the object for the second, as in the case of a thin lens. The fact that the
lens is thick means that the distance of this image from the vertex of the second surface
becomes the object distance for this surface. (The vertices of the two surfaces were
assumed to be coincident when the thickness of the lens was neglected in Section 2.3.)
Similarly, the image formed by a doublet consisting of two thin lenses separated by some
distance can be obtained by sequentially applying the imaging equation for a thin lens.
The size of the image formed in each case can also be obtained by a sequential
application of the magnification equation. The focal point of a thick lens or a doublet is
simply the image point corresponding to an axial point object lying at infinity, i.e., it is
the point where the rays incident parallel to the optical axis are focused after refraction by
the system. Of course, the image formed by a general system consisting of many imaging
elements can be obtained in a similar manner by a sequential application of the imaging
equation for a refracting surface and/or a thin lens, as the case may be. It should be
evident, though, that the imaging equation will become increasingly complex as the
number of imaging elements of a system increases. Moreover, although the focal point of
a general system can also be determined in the same manner as for a thick lens or a
doublet, it is not clear how its focal length should be defined and determined. It is not, for
example, equal to the distance of the focal point from the vertex of the last surface of the
system (which is called the image-space focal distance).
Figure 2-27. Doublet consisting of two thin lenses in contact. The focal length of the
doublet is f′.
We show that by defining suitable reference points, called the cardinal points, the
imaging equation for any imaging system can be reduced to one similar to that for the
refracting surface. There are six cardinal points, of which only three are independent.
Once they are known, the system can be replaced by them, regardless of its complexity.
The object and image distances are measured from the respective principal points, which
correspond to planes of unity transverse magnification. Similarly, the focal lengths
represent the distances of the focal points from the respective principal points. The two
nodal points correspond to unity angular magnification. The thin lens is a special case in
which the principal (and nodal) points coincide with its center. The principal points of a
refracting surface coincide with its vertex, and its nodal points coincide with its center of
curvture.
2.4.2 Cardinal Points and Planes

A general imaging system is characterized by six cardinal points: two principal
points, two focal points, and two nodal points. The planes normal to the optical axis and
passing through these points are correspondingly called principal planes, focal planes,
and nodal planes. The location of the principal and focal points is sufficient to describe
Gaussian imaging by the system.
A ray is shown incident parallel to the optical axis of the system in Figure 2-28a. It
exits from the system intersecting the axis at F′. Only the first and the last surfaces of
the system are shown schematically in the figure. Similarly, only the incident and the exit
segments of the ray are shown, i.e., its intermediate segments are not shown. The
image-space focal point F′ of a system is defined as the point through which rays incident
parallel to its optical axis from the left pass after being refracted by it. The rays
converging toward F′ when extended backward intersect the incident parallel rays in a
plane called the image-space principal plane. This plane intersects the optical axis at a
point H′ called the image-space principal point. The rays behave as if all of their
deviation takes place at the principal plane. The distance H′F′ of the focal point F′ from
the principal point H′ is called the image-space focal length f′.
The object-space focal point F, shown in Figure 2-28b, is defined as the axial point
such that the rays originating from it and incident on the system emerge from it parallel to
the optical axis after being refracted by it. The rays originating from F when extended
forward intersect the emergent parallel rays in a plane called the object-space principal
plane. This plane intersects the optical axis at a point H called the object-space principal
point. The rays behave as if all of their deviation takes place at the principal plane. The
distance HF of the focal point F from the principal point H is called the object-space
focal length f.
Figure 2-28. Principal and focal points of an imaging system. The system consisting
of many surfaces is shown schematically by its first and last surfaces only. Similarly,
only the incident and exit segments of the ray are shown. (a) The image-space focal
point F′ and principal point H′, illustrating the focal length f′. (b) The
object-space focal point F and principal point H, illustrating the focal length f.
By definition, the principal planes are planes of unity transverse magnification. This
may be seen from Figure 2-29, where a system with focal points F and F′ is considered.
A ray 1 incident in the direction AQ parallel to the optical axis emerges from the system
passing through F′, and the extensions of the incident and emergent rays intersect at a
point Q′ on the image-space principal plane H′Q′. A second ray 2 incident on the
system passing through F emerges from it in the direction Q′A′ parallel to the optical
axis, and the extensions of the incident and emergent rays intersect at a point Q on the
object-space principal plane HQ. Thus, the two rays initially directed toward Q emerge in
directions that intersect at Q′. Hence, Q′ is an image of Q, and vice versa, i.e., Q and Q′
are conjugate points. Similarly, the principal planes HQ and H′Q′ are conjugate planes.
Because HQ = H′Q′, they are conjugate planes of unity (positive) transverse
magnification.
It should be understood that all of the rays incident parallel to the optical axis pass
through F′ after emerging from the system only in the Gaussian approximation. In
reality, they generally intersect the focal plane at various points in the vicinity of F′.
Similarly, the incident parallel rays and the corresponding exit rays do not generally
intersect at points lying in a plane. The image-space principal plane H′Q′ is the Gaussian
approximation of a nonplanar surface.
Figure 2-29. Unity transverse magnification of principal planes.

2.4.3 Gaussian Imaging, Focal Lengths, and Magnifications

Figure 2-30 illustrates imaging by a general optical system represented by its
principal planes. Given the cardinal points of a system, the Gaussian image of a point
object formed by it can be determined graphically in the same manner as in the case of a
refracting surface, except that the center of curvature of the surface is replaced by the
nodal points of the system discussed later (see Figure 2-32). Thus, the ray 1 incident
parallel to the optical axis emerges from the system passing through the focal point F′.
Because of the unity magnification of the principal planes, the emergent ray appears to
come from Q′ on the image-space principal plane, which is at the same height as the
point of incidence Q on the object-space principal plane. Similarly, the ray 2 incident in
the direction of F emerges parallel to the optical axis such that the point of emergence R′
is at the same height as the point of incidence R. This ray determines the image height h′.
Moreover, a ray incident in the direction of H emerges as if it were coming from H′.

If n and n′ are the refractive indices of the object and image spaces of the system, then
repeated application of Eq. (2-15) for a refracting surface yields for the system
\[ M_t M_\beta = \frac{n}{n'} , \tag{2-65} \]
where Mt is the transverse magnification of the image, and Mβ is the corresponding
angular magnification of the ray bundle from the axial point object. Because Mt = 1 for
the principal planes, the angular magnification of a ray 3 incident in the direction of H,
making an angle β with the optical axis, and appearing to emerge from H′, making an
angle β′, is given by
\[ M_\beta = \frac{\beta'}{\beta} = \frac{n}{n'} . \tag{2-66} \]
Thus, for the principal planes,
\[ n' \beta' = n \beta . \tag{2-67} \]
Note that both β and β′ are numerically negative in Figure 2-30.
Figure 2-30. Imaging by a general imaging system.
Now consider a ray 4 such that it and ray 3 leave the object-space focal plane from
the same point F1. The image of F1 is formed at infinity, i.e., the emergent rays F1′F′
and H′P′ are parallel to each other, and ray 4 passes through F′ after being refracted by
the system. From the triangle FHF1, we note that
\[ \beta = \frac{FF_1}{f} . \tag{2-68a} \]
Similarly, because FF1 = H′F1′, we find from triangle H′F′F1′ that
\[ \beta' = -\frac{FF_1}{f'} , \tag{2-68b} \]
where we have introduced a negative sign on the right-hand side because β′ is
numerically negative. Substituting Eqs. (2-68) into Eq. (2-67), we obtain a relationship
between the focal lengths f and f′, similar to Eq. (2-8) for a single refracting surface:
\[ \frac{n'}{f'} = -\frac{n}{f} . \tag{2-69} \]
Noting that h = βS and h′ = β′S′, and using Eq. (2-67), the transverse magnification of
the image P0′P′ of the object P0P is given by
\[ M_t \equiv \frac{h'}{h} = \frac{n S'}{n' S} . \tag{2-70} \]
Considering similar triangles P0FP and FHR, we find that
\[ \frac{h'}{h} = -\frac{f}{S - f} = \frac{n f'}{n f' + n' S} , \tag{2-71} \]
where the negative sign is due to h′ being numerically negative. Note that f and S are
both numerically negative. Comparing Eqs. (2-70) and (2-71), we obtain the Gaussian
imaging equation
\[ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} . \tag{2-72} \]
The ratio of the image-/object-space refractive index to the image-/object-space focal
length is called the refracting power K of the system. Its reciprocal represents the
equivalent or effective focal length fe. Thus, Eq. (2-72) may be written
\[ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} = K = \frac{1}{f_e} . \tag{2-73} \]
The angular magnification of a ray bundle diverging from an axial point object P0
and converging to its image point P0′ after refraction by the system, as illustrated in
Figure 2-31, is given by
\[ M_\beta = \frac{\beta_0'}{\beta_0} = \frac{S}{S'} , \tag{2-74} \]
where we have used the fact that HQ = H′Q′ by virtue of unity magnification of the
principal planes. From Eqs. (2-70) and (2-74), the product of the transverse and angular
magnifications is given by Eq. (2-65), as expected. From the definitions of the
magnifications, Eq. (2-65) may also be written
\[ n' h' \beta_0' = n h \beta_0 , \tag{2-75} \]
thus demonstrating the Lagrange invariance for the entire system. It is closely related to
the conservation of energy in the imaging process, as illustrated by Problem 5.7. From
Eq. (2-75), the transverse magnification of the image can also be written
\[ M_t = \frac{n \beta_0}{n' \beta_0'} , \tag{2-76} \]
i.e., it can be obtained from the slope angles of an axial incident ray and the
corresponding refracted ray in the image space of the system.
Figure 2-31. Lagrange invariant nhβ0 of an imaging system.
Differentiating Eq. (2-72), we find that the longitudinal magnification of the image is
given by
Ml ∫ D S ¢ D S = (n n ¢ )( S ¢ S ) 2 = (n ¢ n) Mt2 = Mt Mb . (2-77)
The comments made following Eq. (2-18) also apply to Eq. (2-77). Thus, for example, if
the object is displaced longitudinally, the image is also displaced in the same direction.
It may be noted that only three parameters are needed to determine the location and
size of the image of an object: the locations of the two principal points, and the focal
length of the system.
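As an illustrative numerical check of Eqs. (2-70), (2-72), (2-74), and (2-77), the following Python sketch computes the image distance and the three magnifications from n, n′, f′, and S. The function name `gaussian_image` and the numerical values (an object in air imaged into a water-like medium) are assumptions made only for this example; they do not come from the text.

```python
def gaussian_image(n, n_prime, f_prime, S):
    """Gaussian image of an object at distance S from the principal point H.

    Sign convention of the text: S and S' are measured from H and H'.
    Returns (S', M_t, M_beta, M_l).
    """
    K = n_prime / f_prime                 # refracting power, Eq. (2-73)
    S_prime = n_prime / (K + n / S)       # Gaussian imaging equation (2-72)
    M_t = n * S_prime / (n_prime * S)     # transverse magnification, Eq. (2-70)
    M_beta = S / S_prime                  # angular magnification, Eq. (2-74)
    M_l = (n_prime / n) * M_t ** 2        # longitudinal magnification, Eq. (2-77)
    return S_prime, M_t, M_beta, M_l


# Hypothetical values: object in air, image space of index 1.33.
n, n_prime, f_prime, S = 1.0, 1.33, 20.0, -60.0
S_prime, M_t, M_beta, M_l = gaussian_image(n, n_prime, f_prime, S)
print(S_prime, M_t, M_beta, M_l)
```

With these values, the product M_t * M_beta should equal n/n′ and M_l should equal M_t/M_beta to within rounding, which is a convenient sanity check on any implementation.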
2.4.4 Nodal Points and Planes

The nodal points N and N′ correspond to unity ray angular magnification, as
illustrated in Figure 2-32. Thus, a ray incident in the direction of N emerges parallel to it
as if coming from N′. From the parallelogram AA′NN′, we note that
$$NN' = AA' = HH' , \tag{2-78}$$
i.e., the distance between the nodal points is equal to the distance between the principal
points. If we consider a second ray FB parallel to the first but passing through F, it
emerges parallel to the optical axis in the direction BB′. From the congruent triangles
HFB and F′N′B′, we find that
$$F'N' = HF = f , \tag{2-79}$$
i.e., the distance of the image-space nodal point N′ from the corresponding focal point
F′ is equal to the object-space focal length f. Also from the congruent triangles HFB and
F′N′B′,
$$F'H' + H'N' = HN + NF . \tag{2-80}$$
However, from the congruent triangles NHA and N′H′A′,
$$H'N' = HN . \tag{2-81}$$
Figure 2-32. Unity angular magnification of nodal points N and N′ of an imaging system.
Substituting Eq. (2-81) into Eq. (2-80), we obtain
$$F'H' = NF ,$$
or
$$FN = H'F' = f' , \tag{2-82}$$
i.e., the distance of the object-space nodal point N from the corresponding focal point F is
equal to the image-space focal length f′.
Letting Mβ = 1 in Eq. (2-65), we note that the nodal planes are conjugate planes with
a transverse magnification of n/n′. This may also be seen directly from Eq. (2-70) by
considering the nodal points as Gaussian conjugates with S ≡ HN and S′ ≡ H′N′. It
should be noted that only the nodal points N and N′ have the property of unity ray angle
magnification; the other conjugate points in the nodal planes do not have this property.
For example, in Figure 2-32, N1 and N1′ are conjugate points in the nodal planes, as may
be seen by considering a ray from N1 incident parallel to the axis. It emerges from the
system passing through F′. The extension of the emergent ray intersects the incident ray
at N1′. The transverse magnification of the image given by N′N1′/NN1 is equal to n/n′.
Thus, a ray incident in the direction of N1 emerges as if it is coming from N1′, but the
emergent ray is obviously not parallel to the incident ray. When n = n′, then f = -f′,
and therefore N and H coincide, and N′ and H′ coincide.
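The location of the nodal points relative to the principal points can be sketched numerically. Combining H′F′ = f′ with Eq. (2-79) gives H′N′ = f′ + f, and Eq. (2-81) gives the same value for HN. The Python sketch below uses a hypothetical function name `nodal_point_offset` and hypothetical numerical values; it also checks that N and N′ satisfy the Gaussian imaging equation with a transverse magnification of n/n′.

```python
def nodal_point_offset(n, n_prime, f_prime):
    """Return (f, HN), where HN = H'N' is the signed distance of each nodal
    point from its corresponding principal point."""
    f = -n * f_prime / n_prime        # object-space focal length, Eq. (2-69)
    # H'N' = H'F' + F'N' = f' + f by Eq. (2-79); HN is equal by Eq. (2-81).
    return f, f_prime + f


n, n_prime, f_prime = 1.0, 1.5, 30.0          # hypothetical system
f, HN = nodal_point_offset(n, n_prime, f_prime)

# Treat N and N' as Gaussian conjugates with S = HN and S' = H'N' = HN.
residual = n_prime / HN - n / HN - n_prime / f_prime   # Eq. (2-72); should vanish
M_t_nodal = n * HN / (n_prime * HN)                    # Eq. (2-70); equals n/n'
print(f, HN, residual, M_t_nodal)
```

When n = n′, the function returns a zero offset, which is the statement that each nodal point coincides with its principal point.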

If we measure the object and image distances z and z′ from the focal points F and
F′, respectively, as illustrated in Figure 2-30, we find from similar triangles P0FP and
FHR, and H′F′Q′ and F′P0′P′, that
$$M_t \equiv h'/h = -f/z = -z'/f' . \tag{2-83}$$
Accordingly,
$$zz' = ff' = -(n/n')f'^2 , \tag{2-84}$$
which is the Newtonian imaging equation. It is evident from this equation that z and z′
must have opposite signs, implying that an object and its image lie on the opposite sides
of the corresponding focal points. By differentiating both sides of Eq. (2-84), we obtain
Eq. (2-77), relating the longitudinal and transverse magnifications.
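The equivalence of the Newtonian and Gaussian imaging equations can be checked numerically by noting that S = f + z and S′ = f′ + z′. The Python sketch below is only an illustration; the function name `newtonian_image` and the numerical values are assumptions, not part of the text.

```python
def newtonian_image(n, n_prime, f_prime, z):
    """Image distance z' from F' and magnification for an object at z from F."""
    f = -n * f_prime / n_prime        # Eq. (2-69)
    z_prime = f * f_prime / z         # Newtonian imaging equation
    M_t = -f / z                      # Eq. (2-83)
    return f, z_prime, M_t


# Hypothetical system and object distance S measured from H.
n, n_prime, f_prime, S = 1.0, 1.33, 20.0, -60.0
z = S - (-n * f_prime / n_prime)      # S = HF + FP0 = f + z
f, z_prime, M_t = newtonian_image(n, n_prime, f_prime, z)
S_prime = f_prime + z_prime           # S' = H'F' + F'P0' = f' + z'

# Residual of the Gaussian imaging equation (2-72); should vanish.
gaussian_residual = n_prime / S_prime - n / S - n_prime / f_prime
print(z, z_prime, S_prime, M_t, gaussian_residual)
```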

The Gaussian image point P′ of a point object P at a height h can also be determined
graphically in a manner similar to that for imaging by a refracting surface or by a thin
lens. As in Figure 2-30, an object ray incident parallel to the optical axis of the system
passes through the image-space focal point F′ after emerging from the system. Similarly,
an object ray incident in the direction of the object-space focal point F emerges parallel to
the optical axis. This ray determines the height h′ of the image point P′. The intersection
of these two rays in the image space determines the location of P′. A ray incident in
the direction of the object-space nodal point N emerges from the system in a parallel
direction passing through the image-space nodal point N′, as illustrated in Figure 2-32.
This third ray is not shown in Figure 2-30, but it provides a check on the graphical
construction. As stated earlier, if the refractive indices of the object and image spaces are
equal, then a principal point coincides with its corresponding nodal point.
It should be understood that, in Gaussian imaging, all of the object rays transmitted
by the system pass through the Gaussian image point. In reality, of course, this does not
generally happen. The rays intersect the image plane in the vicinity of the Gaussian image
point. The deviation of a ray from the Gaussian image point is called its ray aberration
(discussed in Chapter 8). The distribution of the rays in the image plane is called the spot
diagram (discussed in Chapter 9). The Gaussian approximation helps determine the
location of the image point, but the quality of the image depends on its aberrations.
2.4.7 Reference to Other Conjugate Planes

So far we have used the principal planes as the reference planes for defining the
object and image distances in the Gaussian imaging equation. We now derive a
generalized imaging equation where the object and image distances are measured with
respect to an arbitrary pair of conjugate planes.
Consider two object planes with axial points P0 and Q0 separated by a distance L, as
illustrated in Figure 2-33. Let the corresponding image planes with axial points P0′ and
Q0′ be separated by a distance L′. Let z be the distance of Q0 from the object-space focal
point F, and let z′ be the distance of Q0′ from the image-space focal point F′. The
Newtonian imaging equation for determining the location of the Q0′- and P0′-image
planes yields
$$zz' = ff' \tag{2-85}$$
and
$$(z + L)(z' + L') = ff' , \tag{2-86}$$
respectively. Subtracting Eq. (2-85) from Eq. (2-86), we obtain
$$zL' + z'L + LL' = 0 . \tag{2-87}$$
Figure 2-33. Object and image distances referred to conjugate planes other than the
principal planes. F and F′ are the object- and image-space focal points. The object
and image distances of the conjugates P0 and P0′ are referred to the conjugates Q0
and Q0′, respectively.
The magnification MQ of the image in the Q0′-plane is given by
$$M_Q = -\frac{f}{z} = -\frac{z'}{f'} . \tag{2-88}$$
Substituting for z and z′ in terms of MQ into Eq. (2-87), we may write
$$\frac{f}{L M_Q} + \frac{f' M_Q}{L'} = 1 . \tag{2-89}$$
Substituting for f in terms of f′ from Eq. (2-69), Eq. (2-89) becomes
$$\frac{n' M_Q}{L'} - \frac{n}{L M_Q} = \frac{n'}{f'} . \tag{2-90}$$
Equation (2-90) is a generalized imaging equation wherein L and L′ are the object and
image distances referred to the corresponding conjugate planes with a transverse
magnification of MQ. The magnification of the image in the P0′ plane is given by
$$M_P = -\frac{z' + L'}{f'} = \frac{nL'}{n'L}\,\frac{1}{M_Q} , \tag{2-91}$$
where we have substituted for z′/f′ from Eq. (2-88) and for L′/f′ from Eq. (2-90).
As expected, if we let MQ = 1 for the principal planes, Eqs. (2-90) and (2-91) reduce
to Eqs. (2-73) and (2-70), respectively. Letting MQ = n/n′ for the nodal planes, they
reduce to
$$\frac{n}{L'} - \frac{n'}{L} = \frac{n'}{f'} \tag{2-92}$$
and
$$M_P = \frac{L'}{L} , \tag{2-93}$$
respectively. The entrance and exit pupils, discussed in Chapter 5, may also be used as
reference planes by letting MQ equal the pupil magnification. Therefore, it is not
essential that the object and image distances be referred to the principal planes. However,
it is convenient to use them because of their unity magnification and the resulting
simplicity of the associated graphical construction of imaging. Of course, because air is
the medium of the object and image spaces in most applications, the nodal points are the
same as the corresponding principal points.
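The generalized imaging equation can be checked against the Newtonian equation with a short numerical sketch. In the Python example below, the function name `image_from_reference_planes` and all numerical values are hypothetical; the reference conjugates are chosen to have MQ = -0.5 for a system in air.

```python
def image_from_reference_planes(n, n_prime, f_prime, M_Q, L):
    """Image distance L' and magnification M_P referred to conjugate planes
    of transverse magnification M_Q, from Eqs. (2-90) and (2-91)."""
    L_prime = n_prime * M_Q / (n_prime / f_prime + n / (L * M_Q))   # Eq. (2-90)
    M_P = n * L_prime / (n_prime * L * M_Q)                          # Eq. (2-91)
    return L_prime, M_P


# Hypothetical system in air with reference conjugates of magnification -0.5.
n, n_prime, f_prime, M_Q, L = 1.0, 1.0, 50.0, -0.5, -50.0
L_prime, M_P = image_from_reference_planes(n, n_prime, f_prime, M_Q, L)

# Independent check using the Newtonian equations (2-85), (2-86), and (2-88).
f = -n * f_prime / n_prime
z = -f / M_Q                           # reference object plane, from F
z_prime = -M_Q * f_prime               # reference image plane, from F'
L_prime_check = f * f_prime / (z + L) - z_prime
M_P_check = -f / (z + L)
print(L_prime, L_prime_check, M_P, M_P_check)
```

Setting M_Q = 1 in the same function should reproduce the ordinary Gaussian result referred to the principal planes.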
2.4.8 Comparison of Imaging by a General System and a Refracting Surface or a Thin Lens

Comparing the imaging equations for a general optical system with those for a single
refracting surface, we find that they are indeed similar to each other. The only significant
difference is that, in the case of the former, the object and image distances are measured
from the principal points H and H′, respectively. For a general system, a ray incident in
the direction of the object-space nodal point N emerges from the system in a parallel
direction as if coming from the image-space nodal point N′. For a single refracting (or
reflecting) surface, the principal points coincide with its vertex, and its nodal points
coincide with its center of curvature. A ray incident in the direction of the center of
curvature is refracted without any deviation because both the angles of incidence and
refraction are zero. The tangent plane or the paraxial refracting surface referred to in
Section 2.2.5 is indeed its principal plane.
It should be evident that the principal and nodal points of a thin lens in air (or any
other medium) coincide at its center. A ray incident in the direction of the center is
refracted without any deviation. When the media on its two sides have different refractive
indices, then the principal points still coincide at its center, but the nodal points coincide
at a distance f′ + f from it. A ray incident in the direction of the center in this case is
refracted with deviation according to Snell's law. In each case, a ray incident parallel to
the optical axis emerges passing through the image-space focal point F′, and a ray
incident in the direction of the object-space focal point F emerges parallel to the optical
axis. The Newtonian imaging equation for a general system is the same as for a single
refracting surface or a thin lens, because the principal points are not utilized in this
equation.
2.4.9 Determination of Cardinal Points

The cardinal points of a system can be determined from its design parameters by a
sequential application of the imaging equations for its elements. We show how to
calculate the location of principal and focal points. The location of the nodal points can
then be determined by using Eqs. (2-79) and (2-82). However, in a laboratory, where the
medium for the object and image spaces of the system is air, the principal points are
coincident with the corresponding nodal points, and it is more convenient to determine
the former based on a property of the latter.
Consider, for example, a system consisting of a series of j refracting surfaces. Let the
refractive indices of the object and image spaces for the ith surface be n_i and n_i′,
respectively, where n_i′ = n_{i+1} [because the image space for the ith surface is the object
space for the (i + 1)th surface]. Let the object and image distances for the ith surface be
S_i and S_i′, respectively. If h_i and h_i′ are the heights of the object and image for this
surface, where h_i′ = h_{i+1} [because the image for the ith surface is the object for the
(i + 1)th surface], the magnification of the image formed by it is given by
$$M_i = \frac{h_i'}{h_i} = \frac{n_i S_i'}{n_i' S_i} . \tag{2-94}$$
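The magnification M of the final image is given by
$$\begin{aligned}
M &= \frac{h_j'}{h_1} = \frac{h_1'}{h_1}\,\frac{h_2'}{h_2} \cdots \frac{h_j'}{h_j} = M_1 M_2 \cdots M_j \\
  &= \frac{n_1 S_1'}{n_1' S_1}\,\frac{n_2 S_2'}{n_2' S_2} \cdots \frac{n_j S_j'}{n_j' S_j}
   = \frac{n_1}{n_j'}\,\frac{S_1'}{S_1}\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} .
\end{aligned} \tag{2-95}$$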
If S and S′ are the object and final image distances from the principal points of the
system, the magnification of the image is also given by
$$M = \frac{n_1 S'}{n_j' S} . \tag{2-96}$$
Equating the right-hand sides of Eqs. (2-95) and (2-96), the common factor n_1/n_j′
cancels, and we obtain
$$\frac{S'}{S} = \frac{S_1'}{S_1}\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-97}$$
For an object lying at infinity, both S and S1 are equal to infinity, and S′ and S1′
become the image-space focal lengths f′ and f1′ of the system and the first surface,
respectively. Therefore, we may write
$$f' = f_1'\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-98}$$
The image distance S_j′ in this case locates the image-space focal point F′ of the system
from the vertex of the jth surface. It is called the image-space focal distance. The focal
length f′ locates the image-space principal point H′, because F′ lies at a distance f′
from it. The focal length of a thick lens, for example, can be determined in this manner.
The effect of its thickness can be determined by comparing it with Eq. (2-28) for the focal
length of a thin lens. Once f′ is known, the object-space focal length f can be obtained
from Eq. (2-69). However, the object-space focal point F and the principal point H have
to be determined separately by considering an object at infinity in the image space and
determining its image in the object space.
As another example, we consider a system consisting of a series of j thin lenses in
air. Let the object and image distances for the ith lens be S_i and S_i′, respectively. If h_i
and h_i′ are the heights of the object and image for this lens, where h_i′ = h_{i+1} [because the
image height for the ith lens is the object height for the (i + 1)th lens], the magnification
of the image formed by it is given by
$$M_i = \frac{h_i'}{h_i} = \frac{S_i'}{S_i} . \tag{2-99}$$
The magnification M of the final image is given by
$$M = \frac{h_j'}{h_1} = \frac{h_1'}{h_1}\,\frac{h_2'}{h_2} \cdots \frac{h_j'}{h_j} = M_1 M_2 \cdots M_j
    = \frac{S_1'}{S_1}\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-100}$$
If S and S′ are the object and final image distances from the corresponding principal
points of the system, the magnification of the image is also given by
$$M = \frac{S'}{S} . \tag{2-101}$$
Equating the right-hand sides of Eqs. (2-100) and (2-101), we obtain
$$\frac{S'}{S} = \frac{S_1'}{S_1}\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-102}$$
For an object at infinity, both S and S1 are equal to infinity, and S′ and S1′ become
the image-space focal lengths f′ and f1′ of the system and the first lens, respectively.
Therefore, we may write
$$f' = f_1'\,\frac{S_2'}{S_2} \cdots \frac{S_j'}{S_j} . \tag{2-103}$$
The image distance S_j′ is the image-space focal distance and locates the image-space
focal point F′ of the system from the center of the jth lens. The focal length f′ locates
the image-space principal point H′, because F′ lies at a distance f′ from it. Once f′ is
known, the object-space focal length can be obtained from Eq. (2-69). However, the
object-space focal point F and the principal point H have to be determined separately by
considering an object at infinity in the image space and determining its image in the
object space.
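The sequential procedure for thin lenses in air can be sketched in a few lines of Python. The function below images an axial object at infinity through each lens in turn using the thin-lens equation 1/S′ - 1/S = 1/f′ and accumulates the product of Eq. (2-103). The function name `system_focal_length`, its argument layout, and the three-lens values are assumptions made for illustration. The sketch does not guard against an intermediate image falling exactly on a lens or at infinity, where a division by zero would occur.

```python
import math


def system_focal_length(focal_lengths, separations):
    """Image-space focal length f' and focal distance S_j' of thin lenses in air.

    focal_lengths: f_1', ..., f_j' of the lenses, in order.
    separations:   distances between consecutive lenses (j - 1 values).
    """
    S = -math.inf                 # axial object at infinity
    S_prime = math.inf
    f_sys = 1.0
    for i, f_i in enumerate(focal_lengths):
        if i > 0:
            S = S_prime - separations[i - 1]       # image of lens i is object of lens i+1
        S_prime = 1.0 / (1.0 / f_i + 1.0 / S)      # thin-lens imaging equation
        f_sys *= S_prime if i == 0 else S_prime / S   # running product of Eq. (2-103)
    return f_sys, S_prime


# Hypothetical three-lens system (all lengths in the same unit).
f_sys, focal_distance = system_focal_length([100.0, -50.0, 80.0], [60.0, 30.0])
print(f_sys, focal_distance)
```

The image-space principal point H′ then lies at a distance `focal_distance - f_sys` from the last lens, because H′F′ = f′.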
As an example of a simple application, we determine the image-space focal point F′
and the principal point H′ of a system consisting of two thin lenses L1 and L2 of focal
lengths f1′ and f2′ separated by a distance t, as illustrated in Figure 2-34. The focal point
F′ is the image of an axial object at infinity. Therefore, we proceed as follows:
$$S_1 = \infty , \qquad S_1' = f_1' , \qquad S_2 = f_1' - t , \qquad \frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} .$$
Thus,
$$S_2' = \frac{f_2'(f_1' - t)}{f_1' + f_2' - t} . \tag{2-104}$$
From Eq. (2-103), we obtain
$$f' = f_1'(S_2'/S_2) = \frac{f_1' f_2'}{f_1' + f_2' - t} . \tag{2-105a}$$
The image distance S2′ represents the image-space focal distance and locates the focal
point F′. The principal point H′ is located by using the fact that H′F′ = f′. For
comparison with Eq. (2-64) for the case when the two lenses are in contact, we write Eq.
(2-105a) in the form
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{2-105b}$$
Of course, Eq. (2-105b) reduces to Eq. (2-64) as t approaches zero.
Figure 2-34. Image-space focal point F′ of two thin lenses separated by a distance t.
In a telephoto lens, a positive lens L1 is combined with a negative lens L2 to yield a
long focal length f′ while keeping the focal distance S2′ short. The long focal length
gives a large image, because M = f′/z, according to Eq. (2-83) when n = n′, and the
short focal distance keeps the camera length to a manageable size. We also note from Eq.
(2-105a) that if t = f1′ + f2′, then both S2′ and f′ approach infinity. Thus, when the lenses
are confocal (i.e., when they have a common focus), a parallel beam incident on the
system emerges from it as a parallel beam. Such a system is called afocal and forms the
basis of a beam expander (or a reducer when used in reverse) or a telescope. Afocal
systems are discussed in more detail in Section 2.5. A beam expander, a telephoto lens,
and a telescope are all discussed in more detail in Chapter 6.
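The telephoto and afocal cases can be illustrated numerically with the closed-form results of Eqs. (2-104) and (2-105a). In the Python sketch below, the function name `two_lens_system` and the lens values are hypothetical; the confocal case is handled by returning infinity, which is a choice made for this example only.

```python
import math


def two_lens_system(f1, f2, t):
    """Focal length f' and focal distance S2' of two thin lenses in air
    separated by a distance t."""
    denom = f1 + f2 - t
    if denom == 0:                        # confocal lenses: afocal system
        return math.inf, math.inf
    S2_prime = f2 * (f1 - t) / denom      # Eq. (2-104)
    f_prime = f1 * f2 / denom             # Eq. (2-105a)
    return f_prime, S2_prime


# Hypothetical telephoto combination: positive lens followed by a negative lens.
t = 60.0
f_prime, S2_prime = two_lens_system(f1=100.0, f2=-50.0, t=t)
camera_length = t + S2_prime              # distance from the first lens to F'
print(f_prime, S2_prime, camera_length)
```

For these values the distance from the first lens to the focal point is shorter than the focal length, which is the defining property of a telephoto lens described above.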
The determination of the image-space focal point F′ of a system as the image of an
object at infinity is equivalent to tracing a ray incident parallel to the optical axis and
finding its intersection with the axis in the image space of the system. Determination of
the cardinal points in this manner from the design parameters of the system is discussed
in Chapter 4 on paraxial ray tracing. Several examples are also discussed there.
Given a lens system, its image-space focal point F′ can be determined in a
laboratory by shining a collimated beam on it and locating the point where the beam is
focused. Similarly, its object-space focal point can be determined by shining the beam
from the other direction. Its focal length can be determined by finding the image of a
point object and using the Newtonian imaging equation (2-84). The two focal lengths are
related to each other by the refractive indices of the object and image spaces, according to
Eq. (2-69). Once the focal lengths are known, the principal points can be located with
respect to the focal points and, in turn, the nodal points by using Eqs. (2-79) and (2-82).
The nodal points of a lens system can also be determined in a laboratory by placing it
on a nodal slide, which is a device that permits rotation of the lens about an axis that
intersects the optical axis and is perpendicular to it. The axis of rotation is changed by
sliding the lens on the slide. When a collimated beam is incident on the lens parallel to its
axis, it is focused at its focal point F′, as illustrated in Figure 2-35a. This figure is drawn
for the general case when the refractive indices of the object and image space are not equal.
The beam focus P0′ coincides with the focal point F′. When the lens is rotated about a point
on its axis, the focus of the beam is displaced, except when the rotation is about the nodal
point N′. In Figure 2-35b, the lens has been rotated about a point Q lying between N and
N′, resulting in a displacement of the beam focus to a point P0″. A ray incident in the
direction of N emerges from the system parallel to the incident ray as if coming from N′.
In Figure 2-35c, where the rotation is about N′, the beam focus stays at P0′. In this case,
a ray passing through the nodal point N is displaced, but it passes through the nodal point
N′ in the same direction as the incident ray. By turning the lens around (so that its front
and back are interchanged) and repeating the process, the other nodal point N can be
determined.
If the refractive indices n and n′ of the object and image spaces are equal, as would
be the case in a laboratory measurement, then the principal points coincide with the
corresponding nodal points. When the principal points of a system are located and its
focal lengths are determined, so that the focal points are located, all of the Gaussian
characteristics of the image of an object can be determined.
Figure 2-35. Determination of nodal points of a lens system. (a) A parallel beam is
focused at P0′ coincident with the focal point F′. (b) When the lens is rotated about
a point Q, the beam focus is displaced to P0″. (c) When the system is rotated about
the nodal point N′, the beam focus stays at P0′, although the focal point F′ has
been displaced.
2.5 AFOCAL SYSTEMS

2.5.1 Introduction
An afocal (or without focus, or without focal length) optical imaging system is one
that forms the image at infinity of an object at infinity, i.e., a parallel beam of light
incident on such a system emerges from it as a parallel beam. Because an emerging ray
does not intersect the optical axis or the corresponding incident ray, the concepts of focal
points, principal points, and therefore, focal lengths lose their meanings for such a
system. One may say that the corresponding principal and focal points lie at infinity on
opposite sides of the system and that the focal length is infinity as well, or the focusing
power K = 0 . An afocal system is characterized by its transverse magnification, which is
independent of the object distance.
We start with a discussion of the Lagrange invariant for an object or its image at
infinity, and show that an afocal system images objects with transverse and longitudinal
magnifications that are independent of the object distance. Examples of afocal systems
(such as a beam reducer or expander), telephoto and wide-angle camera lenses, and a
telescope, are considered in Chapter 6. A plane-parallel plate, discussed in the next
section, is an example of an afocal system with unity transverse magnification.
2.5.2 Lagrange Invariant for an Infinite Conjugate

When an object lies at infinity at a certain angle β from the optical axis of a system,
the Lagrange invariant becomes indeterminate because h → ∞ and β0 → 0. However,
we now show that the product hβ0 remains finite. Consider the imaging of an object P0P
of height h lying at a distance S, as illustrated in Figure 2-36a. An axial ray making an
angle β0 with the optical axis is incident on the system at a height x0 = -Sβ0. The point
object P lies at an angle β from the optical axis, where h = Sβ. Eliminating S from the
two expressions, we find that hβ0 = -x0β. If the object is moved to infinity, as in Figure
2-36b, then S → -∞, β0 → 0 (i.e., the axial ray is incident parallel to the optical axis at
a height x0), and h → ∞, but hβ0 remains finite and equal to -x0β. Thus, the object-space
Lagrange invariant nhβ0 equals -nx0β. Because the object lies at infinity, its
image is formed in the focal plane and the Lagrange invariant equation (2-75) is replaced
by
$$n'h'\beta_0' = -n x_0 \beta , \tag{2-106}$$
where h′ is the image height, and β0′ is the slope angle of the axial ray in the image
space. This result is rederived in Section 4.10 from a two-ray Lagrange invariant.
Similarly, if the image is formed at infinity, as when an object lies in the front focal
plane (see Figure 2-36c), the image-space Lagrange invariant becomes indeterminate
because h′ → ∞ and β0′ → 0. Considering the image P0′P′, we note that x0′ = -S′β0′
and h′ = S′β′. Eliminating S′, we obtain h′β0′ = -x0′β′. As the image moves to infinity,
the product h′β0′ remains finite and equal to -x0′β′. The corresponding Lagrange
invariant equation is given by
$$nh\beta_0 = -n' x_0' \beta' . \tag{2-107}$$
Figure 2-36. (a) Imaging of an object P0P of height h lying at a distance S. (b)
Imaging of an object lying at infinity. The object-space Lagrange invariant is
-nx0β. (c) Imaging of an object lying in the front focal plane. The image-space
Lagrange invariant is -n′x0′β′.
2.5.3 Imaging by an Afocal System

In Section 2.5.2, we discussed the Lagrange invariant for an object lying at infinity.
For an afocal system working at infinite conjugates, as illustrated in Figure 2-37a, the
Lagrange invariant equation (2-75) becomes
$$n' x_0' \beta' = n x_0 \beta , \tag{2-108}$$
where n and n′ are the refractive indices of the object and image spaces, x0 and x0′ are
the heights of the axial conjugate rays (i.e., rays parallel to the axis), and β and β′ are
the slope angles of conjugate rays from an off-axis point object. Thus, the ray angular
magnification is given by
$$\frac{\beta'}{\beta} = \frac{n x_0}{n' x_0'} . \tag{2-109}$$
Now we consider how afocal systems form images of objects located at finite
distances. Consider, for example, the imaging of an object P0P of height h by an afocal
system, as illustrated in Figure 2-37b. Its image P0′P′ has a height of h′ that can be
obtained by determining the image formed successively by each surface of the system.
Consider another object Q0Q at a distance S from P0P. The distance S′ of its image
Q0′Q′ from the image P0′P′ can be determined as follows. Because the system is afocal,
the transverse magnification h′/h is independent of the position of the object, as may be
seen from the figure. If we consider a ray P0Q incident on the system at an angle β0, the
corresponding angle β0′ of the emerging ray may be obtained from the Lagrange
invariant equation (2-75). Therefore, the image distance S′ is given by
$$S' = h'/\beta_0' = \frac{n' h'^2}{n h \beta_0} . \tag{2-110}$$
Substituting h = β0S, the ratio of the conjugate distances is given by
$$\frac{S'}{S} = \frac{n' h'^2}{n h^2} = \frac{n'}{n} M_t^2 . \tag{2-111}$$
Because the transverse magnification is independent of the object position, the
longitudinal magnification S′/S is also constant.
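As an illustration of Eq. (2-111), consider a hypothetical afocal system made of two confocal thin lenses in air, so that n = n′. The Python sketch below images two objects at different distances and compares the ratio of the image and object separations with the square of the transverse magnification. The function name `afocal_two_lens_image` and the numerical values are assumptions; object positions that place an intermediate image on the second lens or at infinity are avoided because the sketch does not guard against division by zero.

```python
def afocal_two_lens_image(f1, f2, S1):
    """Image an object at distance S1 from the first of two confocal thin
    lenses in air. Returns (S2', M_t), with S2' measured from the second lens."""
    t = f1 + f2                               # confocal separation
    S1_prime = 1.0 / (1.0 / f1 + 1.0 / S1)    # image formed by the first lens
    S2 = S1_prime - t                         # object distance for the second lens
    S2_prime = 1.0 / (1.0 / f2 + 1.0 / S2)    # image formed by the second lens
    M_t = (S1_prime / S1) * (S2_prime / S2)   # product of magnifications, Eq. (2-100)
    return S2_prime, M_t


f1, f2 = 100.0, 50.0                          # hypothetical focal lengths
Sa, Sb = -400.0, -200.0                       # two object positions
Sa_prime, Ma = afocal_two_lens_image(f1, f2, Sa)
Sb_prime, Mb = afocal_two_lens_image(f1, f2, Sb)

# Ratio of image separation to object separation, i.e., S'/S of Eq. (2-111).
longitudinal = (Sb_prime - Sa_prime) / (Sb - Sa)
print(Ma, Mb, longitudinal)
```

Both object positions should give the same transverse magnification, equal to -f2/f1 for this two-lens arrangement, and the separation ratio should equal its square.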
Figure 2-37. (a) Lagrange invariant of an afocal system for infinite conjugates. (b)
Finite conjugate imaging by an afocal system. Conceptually, the system is assumed
to be multisurface; therefore, a dotted line in the figure does not represent a ray but
merely a line joining its point of incidence on and its point of emergence from the
system to establish a continuation of the ray.
2.6 PLANE-PARALLEL PLATE
2.6.1 Introduction
A plane-parallel plate, as its name implies, is a plate with two surfaces that are plane
and parallel to each other. It is a thick lens whose two surfaces have infinite radii of
curvature. Unlike a lens, a plane-parallel plate is not used for imaging per se, but it is
often used in imaging systems as a beam splitter or a window. The imaging equations for
such a plate cannot be obtained from those for a thin lens by letting the radii of curvature
of its two surfaces approach infinity because the thickness of the lens is neglected by its
definition. However, as discussed below, they can be obtained by applying the imaging
equations (2-4) and (2-12) for a spherical surface to its two surfaces and combining the
results thus obtained. We show that the distance between an object and its image formed
by the plate is independent of the object position. Thus, as illustrated in Figure 2-38a, a
plane-parallel plate placed in the path of a converging beam displaces the focus of the
beam from P1 to P2 by an amount that depends only on the thickness and the refractive
index of the plate.
Figure 2-38. (a) Plane-parallel plate placed in the path of a converging beam of light.
Rays incident on the plate converging toward P1 converge toward P2 after being
refracted by it. (b) A right-angle reflecting prism placed in the path of a converging
beam. The optical path lengths of the rays for the prism are equivalent to those for a
plane-parallel plate, where the virtual portion ADC of the equivalent plate is
obtained by a reflection of its real portion ABC by the reflecting surface AC.
Figure 2-38b shows a right-angle reflecting prism as an example of a plane-parallel
plate. The prism is used in optical systems to deviate the path of a beam by 90°, as
discussed in Section 1.3.2. Its diagonal face acts as a mirror because the rays incident on
it undergo a total internal reflection. The “unfolded” path of the rays, called a tunnel
diagram, illustrates that the prism ABC is equivalent to a plane-parallel plate ABCD in
terms of their optical path lengths.
2.6.2 Imaging Relations
Consider, as indicated in Figure 2-39, a plane-parallel plate of thickness t and
refractive index n forming the image of a point object P lying at a distance S from its
front surface and at a height h from its axis. Using Eqs. (2-4) and (2-12), we determine
the location of the image of the point object P. For the first surface, n1 = 1, n1′ = n, and
R1 = ∞. Accordingly, it forms the image of P at P′ at a distance S1′ and height h1′,
where
$$S_1' = nS_1 \equiv nS \tag{2-112a}$$
and
$$M_1 \equiv \frac{h_1'}{h_1} = \frac{n_1 S_1'}{n_1' S_1} = 1 , \tag{2-112b}$$
where h1 ≡ h. For the second surface, n2 = n, n2′ = 1, R2 = ∞, and S2 = S1′ - t.
Therefore, it forms the image of P′ at P″ at a distance S2′ and height h2′, where
$$S_2' = \frac{S_2}{n} = \frac{S_1' - t}{n} = S - \frac{t}{n} \tag{2-113a}$$
and
$$M_2 \equiv \frac{h_2'}{h_1'} = \frac{n_2 S_2'}{n_2' S_2} = 1 . \tag{2-113b}$$
Thus, the transverse magnification of the image is unity.
Figure 2-39. Imaging of a point object P by a plane-parallel plate of refractive index
n and thickness t. P′ is the image of P formed by the first surface of the plate, and
P″ is the image of P′ formed by its second surface.

Noting that S2′ is numerically negative, the displacement PP″ of the final image
from the object may be written
$$PP'' = -S_1 - (-S_2' - t) = t(1 - 1/n) . \tag{2-114}$$
Thus, the image displacement PP″ is independent of the object distance S; it depends
only on the thickness t and refractive index n of the plate. Accordingly, the longitudinal
magnification of the image is unity. This may also be seen from Eq. (2-77) by noting that
the transverse magnification of the image is unity and the refractive indices of the object
and image spaces are equal to each other, as they are both equal to unity. It should be
evident that a plane-parallel plate is an afocal system with unity transverse and
longitudinal magnifications.
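The two-surface calculation above can be reproduced numerically. The Python sketch below applies Eqs. (2-112a) and (2-113a) in sequence and evaluates the displacement of Eq. (2-114) for several object distances; the function name `plate_image` and the plate parameters are hypothetical.

```python
def plate_image(n, t, S):
    """Image of a point object at distance S from the front surface of a
    plane-parallel plate of index n and thickness t in air.

    Returns (S2', displacement), with S2' measured from the second surface.
    """
    S1_prime = n * S                        # Eq. (2-112a): first-surface image
    S2 = S1_prime - t                       # object distance for the second surface
    S2_prime = S2 / n                       # Eq. (2-113a): second-surface image
    displacement = -S - (-S2_prime - t)     # PP'' of Eq. (2-114)
    return S2_prime, displacement


# Hypothetical plate: n = 1.5, t = 6 (same length unit as S).
n, t = 1.5, 6.0
shifts = [plate_image(n, t, S)[1] for S in (-10.0, -100.0, -1000.0)]
print(shifts, t * (1.0 - 1.0 / n))
```

Every entry of `shifts` should agree with t(1 - 1/n), showing that the displacement does not depend on the object distance.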
2.7 PETZVAL IMAGE

axis. This results in a slight error in the position of the image thus determined for an off-
axis point object. The correct position of the image formed by an imaging surface is
obtained by considering the object and image distances along an auxiliary axis connecting
the off-axis point object and the (vertex) center of curvature of the surface. The image
thus obtained is called the Petzval image point. The correct image of a planar object
obtained in this manner is spherical, and it is called the Petzval image. Similarly, the
image formed by a multisurface system is also spherical, unless a quantity called the
Petzval sum is zero. In this section, we derive expressions for the radius of curvature of
the Petzval surface for a single refracting surface, generalize it to a system of surfaces to
obtain the Petzval sum, and apply the result to imaging by a thin lens.
2.7.1 Spherical Refracting Surface

Consider imaging by a spherical surface of radius of curvature R separating media of
refractive indices n and n′, as illustrated in Figure 2-40. P0′P′ is the Gaussian image of
object P0P obtained by use of Eqs. (2-4) and (2-12). The object and image distances are
S and S′ regardless of whether a point object is axial or not. It should be evident from the
symmetry of the spherical refracting surface that the image of a concentric spherical
object surface P0P1 will also be a concentric spherical surface P0′P1′. Thus, P1 and P1′ are
Gaussian conjugate points on the auxiliary axis PCP′, just as P0 and P0′ are on the
optical axis P0CP0′. We now show that the image of a planar object P0P is also a
spherical surface, and not a planar surface, as assumed in Figure 2-8.
Figure 2-40. Petzval image by a spherical refracting surface with its center of
curvature at C. P0′P′ is the Gaussian image of a planar object P0P. The
corresponding Petzval image is P0′Pp′. P0P1 is a spherical object concentric with the
refracting surface. Its Petzval image is the concentric surface P0′P1′. P0′P2′ is the
spherical Petzval image of a spherical object P0P2, whose center of curvature lies on
the optical axis at C2. Note that VP1 = S and VP1′ = S′.

For generality, we first consider a spherical object surface P0P2 of radius of
curvature Ro with its center of curvature C2 lying on the optical axis. Let P0′P2′ be the
corresponding Petzval image surface, where P2′ is the Gaussian conjugate of P2 on the
auxiliary axis. If the object distance changes by a small amount ΔS, then the
corresponding change ΔS′ in the image distance is given by Eq. (2-18). In Figure 2-40,
P1′P2′ gives the increase in image distance VP1′ = S′ corresponding to an increase of P1P2
in the (numerically negative) object distance VP1 = S of conjugates P1 and P1′. Now,
P1P2 is approximately equal to the difference in the sags of points P2 and P1. As the
heights of P1 and P2 from the optical axis are approximately equal to h, we may write
$$
\Delta S \equiv P_1 P_2 = VP_2 - VP_1 \simeq \frac{h^2}{2}\left(\frac{1}{R_o} - \frac{1}{R - S}\right) . \tag{2-115a}
$$
Note that because P2 lies to the right of P1 in the figure, the center of curvature of the
object lies between P0 and C, and therefore Ro < R − S. Similarly, ΔS′ is equal to the
difference in the sags of points P2′ and P1′, i.e.,
$$
\Delta S' \equiv P_1' P_2' = VP_2' - VP_1' \simeq \frac{h'^2}{2}\left(\frac{1}{R_i} - \frac{1}{R - S'}\right) , \tag{2-115b}
$$
where Ri is the radius of curvature of the image surface P0′P2′. From Eq. (2-18), we may
write
$$
\frac{\Delta S'}{\Delta S} = \frac{n'}{n}\,\frac{h'^2}{h^2} . \tag{2-116}
$$
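Substituting for ΔS and ΔS′ from Eqs. (2-115a) and (2-115b) into Eq. (2-116), and using the
imaging equation, Eq. (2-4), we obtain (after some manipulations)
$$
\frac{1}{n' R_i} - \frac{1}{n R_o} = \frac{1}{R}\left(\frac{1}{n'} - \frac{1}{n}\right) . \tag{2-117}
$$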
By letting Ro → ∞, we obtain the radius of curvature of the spherical image surface
for a planar object P0P:
$$
\frac{1}{R_i} = \frac{n - n'}{nR} , \tag{2-118}
$$
which is numerically negative in Figure 2-40 for n′ > n. This image surface, shown in
the figure as P0′Pp′, is called the Petzval image surface, and its radius of curvature Ri is
called the Petzval radius of curvature. Note that Ri does not depend on the object distance S or
the image distance S′; it depends only on the radius of curvature of the refracting surface
and the refractive indices of the media that this surface separates. The location of the
Petzval image Pp′ of an off-axis point object P is the point of intersection of the auxiliary
axis PCP′ and a spherical surface of radius of curvature Ri centered on the optical axis
and passing through the axial image point P0′. The corresponding Gaussian image point
P′ is, of course, the point of intersection of the auxiliary axis with the Gaussian image
plane.
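As a numerical check of Eqs. (2-117) and (2-118), the following Python sketch evaluates the radius of the image surface for a single refracting surface. The function name and the sample values (an air-to-glass surface with n = 1, n′ = 1.5, R = 10 cm, and S = −40 cm) are assumptions chosen only for illustration, and the Gaussian imaging equation is taken in the form n′/S′ − n/S = (n′ − n)/R.

```python
import math

def image_surface_radius(n, n_prime, R, R_o=math.inf):
    """Radius R_i of the Petzval image surface from Eq. (2-117).

    R_o is the radius of curvature of the object surface; the default
    (a planar object, R_o = infinity) reduces to Eq. (2-118).
    """
    object_term = 0.0 if math.isinf(R_o) else 1.0 / (n * R_o)
    curvature = n_prime * (object_term + (1.0 / n_prime - 1.0 / n) / R)
    return math.inf if curvature == 0.0 else 1.0 / curvature

# Assumed sample surface: air to glass, R = 10 cm
n, n_prime, R = 1.0, 1.5, 10.0

# Planar object: Eq. (2-118) gives 1/R_i = (n - n')/(nR)
R_planar = image_surface_radius(n, n_prime, R)

# Concentric object at S = -40 cm has R_o = R - S; by symmetry its image
# surface should also be concentric, i.e., R_i = R - S'
S = -40.0
S_prime = n_prime / (n / S + (n_prime - n) / R)  # Gaussian imaging equation
R_concentric = image_surface_radius(n, n_prime, R, R_o=R - S)

print(f"planar object:     R_i = {R_planar:.3f} cm")
print(f"concentric object: R_i = {R_concentric:.3f} cm, R - S' = {R - S_prime:.3f} cm")
```

The second case reproduces the symmetry argument made at the start of this section: a concentric object surface yields a concentric image surface.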
2.7.2 General System

If n0, n1, …, nk represent the refractive indices of the media separating a series of k
refracting surfaces of vertex radii of curvature R1, R2, …, Rk, then following Eq. (2-117),
the radii of curvature Ri1, Ri2, …, Rik of the Petzval image surfaces formed by
them are given by
$$
\frac{1}{n_1 R_{i1}} - \frac{1}{n_0 R_o} = \frac{1}{R_1}\left(\frac{1}{n_1} - \frac{1}{n_0}\right) , \tag{2-119}
$$
$$
\frac{1}{n_2 R_{i2}} - \frac{1}{n_1 R_{i1}} = \frac{1}{R_2}\left(\frac{1}{n_2} - \frac{1}{n_1}\right) ,
$$
$$
\vdots
$$
and
$$
\frac{1}{n_k R_{ik}} - \frac{1}{n_{k-1} R_{i\,k-1}} = \frac{1}{R_k}\left(\frac{1}{n_k} - \frac{1}{n_{k-1}}\right) , \tag{2-120}
$$
respectively. Adding these equations and letting the radius of curvature of the object
surface Ro → ∞, we obtain the radius of curvature of the Petzval image surface produced
by a system of k refracting surfaces according to
$$
\frac{1}{R_{ik}} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right) \tag{2-121a}
$$
$$
\phantom{\frac{1}{R_{ik}}} = -\,n_k \sum_{j=1}^{k} \frac{K_j}{n_j n_{j-1}} , \tag{2-121b}
$$
where
$$
K_j = \frac{n_j - n_{j-1}}{R_j} \tag{2-122}
$$
is the refracting power of the jth surface. We note that the Petzval radius is independent
of the object and image distances. Unless the sum on the right-hand side of Eq. (2-121a),
called the Petzval sum, is zero, the Petzval image is spherical with a radius of curvature
Rik.
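The sum in Eq. (2-121) is easy to evaluate numerically. The sketch below computes the Petzval curvature 1/Rik in both forms, Eq. (2-121a) and Eq. (2-121b), and compares them. The function names and the three-surface prescription are hypothetical values introduced only for illustration.

```python
def petzval_curvature(indices, radii):
    """1/R_ik from Eq. (2-121a).

    indices = [n0, n1, ..., nk] and radii = [R1, ..., Rk].
    """
    if len(indices) != len(radii) + 1:
        raise ValueError("k surfaces require k + 1 refractive indices")
    total = sum((1.0 / n_j - 1.0 / n_prev) / R_j
                for n_prev, n_j, R_j in zip(indices[:-1], indices[1:], radii))
    return indices[-1] * total

def petzval_curvature_from_powers(indices, radii):
    """1/R_ik from Eq. (2-121b), using the surface powers K_j of Eq. (2-122)."""
    total = sum(((n_j - n_prev) / R_j) / (n_j * n_prev)
                for n_prev, n_j, R_j in zip(indices[:-1], indices[1:], radii))
    return -indices[-1] * total

# Hypothetical three-surface system in air (radii in cm)
indices = [1.0, 1.5, 1.6, 1.0]
radii = [10.0, -10.0, -40.0]

c_sum = petzval_curvature(indices, radii)
c_pow = petzval_curvature_from_powers(indices, radii)
print(f"1/R_ik = {c_sum:.6f} per cm (sum form), {c_pow:.6f} per cm (power form)")
print(f"R_ik   = {1.0 / c_sum:.3f} cm")
```

Returning the curvature rather than the radius avoids a division by zero when the Petzval sum vanishes and the image surface is planar.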
2.7.3 Thin Lens

Consider imaging by a thin lens of refractive index n with surfaces of radii of
curvature R1 and R2. The radius of curvature of its Petzval surface can be obtained by
applying Eq. (2-117) to refraction by its two surfaces, or equivalently by using
Eq. (2-121a) and letting n0 = 1, n1 = n, and n2 = 1. The radius of curvature of the Petzval
image surface produced by the first refracting surface (for a planar object) is given by
$$
\frac{1}{R_{i1}} = \frac{1 - n}{R_1} . \tag{2-123}
$$
The second refracting surface images the first Petzval surface into a second surface, with
a radius of curvature Ri2 given by
$$
\frac{1}{R_{i2}} - \frac{1}{n R_{i1}} = \frac{1}{R_2}\left(1 - \frac{1}{n}\right) ,
$$
or
$$
\frac{1}{R_{i2}} = \frac{1 - n}{n}\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \tag{2-124a}
$$
$$
\phantom{\frac{1}{R_{i2}}} = -\frac{1}{n f'} . \tag{2-124b}
$$
Thus, the radius of curvature Rp ≡ Ri2 of the Petzval image surface is given by
$$
R_p = -\,n f' . \tag{2-125}
$$
Note that the radius of curvature of the Petzval surface does not depend on the object or
the image distance; it depends only on the refractive index and the focal length of the
lens. Its value is numerically negative for a positive lens; i.e., the Petzval surface is
curved toward the lens with its center of curvature lying to its left, as illustrated in Figure
2-41a. The radius of curvature of the virtual Petzval surface for a negative lens is
numerically positive, as illustrated in Figure 2-41b; it lies to the left of the lens and is
curved toward it.
If a system consists of a series of thin lenses, then the first lens forms the image of a
planar object on a Petzval surface. This image surface becomes the object for the next
lens in the series, and so on. It can be shown that the radius of curvature of the Petzval
surface of a system consisting of a series of m thin lenses of refractive indices nj and
focal lengths fj′, where j = 1, 2, ..., m, is given by (see Problem 2.14)
$$
\frac{1}{R_p} = -\sum_{j=1}^{m} \frac{1}{n_j f_j'} . \tag{2-126}
$$
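Equation (2-126) can be illustrated with a short Python sketch. The function name and the lens data are assumptions made for this example; in particular, the second lens is chosen so that n2 f2′ = −n1 f1′, which makes the Petzval sum vanish.

```python
def petzval_curvature_thin_lenses(lenses):
    """1/R_p from Eq. (2-126); lenses is a sequence of (n_j, f_j') pairs."""
    return -sum(1.0 / (n_j * f_j) for n_j, f_j in lenses)

# Single positive lens (assumed values): n = 1.5, f' = 10 cm, so R_p = -n f' = -15 cm
single = petzval_curvature_thin_lenses([(1.5, 10.0)])

# Assumed pair with n2 * f2' = -n1 * f1': the Petzval sum is zero
pair = petzval_curvature_thin_lenses([(1.5, 10.0), (1.6, -9.375)])

print(f"single lens: R_p = {1.0 / single:.3f} cm")
print(f"lens pair:   1/R_p = {pair:.2e} per cm (planar Petzval surface)")
```

The single-lens result agrees with Eq. (2-125), and the pair shows how a negative lens can cancel the Petzval curvature of a positive lens.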
Figure 2-41. Petzval surface of a thin lens. (a) Real for a positive lens. (b) Virtual for
a negative lens. Cp is the center of curvature of the Petzval surface.
2.8 MISALIGNED SURFACE
In Section 2.3.3, we determined the axial displacement of the image for a small
displacement of a point object, thus yielding an expression for the longitudinal
magnification. Now, we determine the displacement of the image when the imaging
surface is slightly decentered, tilted, or despaced. The imaging surface is either
nonspherical, so that it has a well-defined vertex, or is an element of a series of coaxial
imaging surfaces, so that there is a well-defined optical axis, and thus yields a vertex.
2.8.1 Decentered Surface
First, we consider an imaging surface that is laterally displaced from its nominal
position, as indicated in Figure 2-42. Such a displacement of the surface is referred to as
its decenter. In the perturbed position, its axis is still parallel to the optical axis of the
unperturbed system. Let the displacement be along the x axis with a value of Δ. In its
unperturbed position, let the heights of its object and image points P and P′ from its
optical axis VC be h and h′, where V is the vertex and C is the center of curvature of the
surface, respectively. The two heights are related to each other according to
$$
h' = M h , \tag{2-127}
$$
where M is the (transverse) magnification of the image.
Figure 2-42. Decentered surface. In the unperturbed state, the center of curvature of
the surface shown by the solid curve lies at C. The point object P and its image P′
are at heights h and h′, respectively, from the optical axis VC. When the surface is
decentered by an amount Δ along the x axis, as indicated by the dashed surface, its
center of curvature moves to Cp and the image is displaced to P″. The new object
and image heights are hp and hp′, respectively, from the new optical axis VpCp.
In the perturbed position, the object and image heights from the new optical axis
VpCp become
$$
h_p = h - \Delta \tag{2-128}
$$
and
$$
h_p' = M h_p = h' - M\Delta , \tag{2-129}
$$
respectively. Note that h and M are numerically negative in Figure 2-42. The image point
for the decentered surface lies at P″. The image displacement, which is also along the x
axis, is given by
$$
P'P'' = h_p' - (h' - \Delta) = (1 - M)\,\Delta , \tag{2-130a}
$$
or
$$
P'P'' = (1 - M)\,\Delta_{cd} , \tag{2-130b}
$$
where Δcd = Δ is the displacement of the center of curvature of the surface due to its
decenter. The image displacement is independent of the height h of the object.
Accordingly, the displacement P0′P0″ of the axial image point is also given by
Eq. (2-130b).
2.8.2 Tilted Surface
Now we consider a tilt of an imaging surface from its nominal orientation. We
assume that the surface has been rotated by a small angle β about its vertex in the
tangential plane, as illustrated in Figure 2-43. In the unperturbed position, the point object
P and its Gaussian image P′ are at heights h and h′ from the optical axis VC. When the
surface is tilted, the Gaussian image of the point object P is displaced to P″. With
respect to the tilted optical axis of the surface, the heights of the object point P and image
point P″ are given by
$$
h_p = h - \beta S \tag{2-131}
$$
and
$$
h_p' = M h_p = h' - M\beta S . \tag{2-132}
$$
Note that because h is numerically negative in the figure, h − βS is a numerically smaller
height than h.
Figure 2-43. Tilted surface. When the surface is tilted by an angle β, indicated by
the dashed surface, its center of curvature C moves to Cp. The heights of the object
P and image P′ change from h to hp and from h′ to hp′, respectively. The image for
the tilted surface is located at P″.
The image displacement, which is along the x axis, as in the case of a decentered
surface, is given by
$$
P'P'' = h_p' - (h' - \beta S') , \tag{2-133a}
$$
or
$$
P'P'' = \beta\,(S' - M S) . \tag{2-133b}
$$
Substituting for S in terms of S′ from Eq. (2-10) for the image magnification and S′ in
terms of R from Eq. (2-4) for imaging, we find that
$$
P'P'' = (1 - M)\,\beta R , \tag{2-134a}
$$
or
$$
P'P'' = (1 - M)\,\Delta_{ct} , \tag{2-134b}
$$
where Δct = βR is the displacement of the center of curvature of the surface due to its
tilt.
It should be evident from Eqs. (2-130b) and (2-134b) that the image is not displaced
unless the center of curvature of the surface is displaced. Thus, for example, if the
displacement of the center of curvature due to a decenter of the surface is canceled by its
tilt, the image does not move.
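The equivalence of Eqs. (2-133b) and (2-134a), and the cancellation of a decenter by a tilt, can be checked numerically. The sketch below assumes the imaging equation in the form n′/S′ − n/S = (n′ − n)/R and the magnification M = nS′/(n′S); the function names and the sample values are hypothetical.

```python
def image_distance(n, n_prime, R, S):
    """Gaussian image distance S' for a spherical refracting surface."""
    return n_prime / (n / S + (n_prime - n) / R)

def magnification(n, n_prime, S, S_prime):
    """Transverse magnification M = n S' / (n' S)."""
    return n * S_prime / (n_prime * S)

def shift_decenter(M, delta):
    """Image displacement for a decenter delta, Eq. (2-130a)."""
    return (1.0 - M) * delta

def shift_tilt(M, beta, S, S_prime):
    """Image displacement for a tilt beta about the vertex, Eq. (2-133b)."""
    return beta * (S_prime - M * S)

# Assumed sample surface: air to glass, R = 10 cm, object at S = -40 cm
n, n_prime, R, S = 1.0, 1.5, 10.0, -40.0
S_prime = image_distance(n, n_prime, R, S)
M = magnification(n, n_prime, S, S_prime)

beta = 1.0e-3                              # tilt in radians
tilt_shift = shift_tilt(M, beta, S, S_prime)
center_form = (1.0 - M) * beta * R         # Eq. (2-134a)

# A decenter of -beta*R returns the center of curvature to its nominal position
net_shift = shift_decenter(M, -beta * R) + tilt_shift

print(f"S' = {S_prime:.3f} cm, M = {M:.3f}")
print(f"tilt shift = {tilt_shift:.5f} cm, (1 - M) beta R = {center_form:.5f} cm")
print(f"net shift with compensating decenter = {net_shift:.2e} cm")
```

The net shift vanishes because the compensating decenter leaves the center of curvature where it was.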
2.8.3 Despaced Surface

When an optical surface of an imaging system is displaced longitudinally, i.e., along
its optical axis, the distance of the object point from it changes, and therefore the distance
of its image point also changes. The longitudinal displacement of the surface is referred
to as its despace. For a despace Δ of the surface, as in Figure 2-44, the object distance
changes from S to S − Δ; or it changes by δS = −Δ. Thus, according to Eq. (2-18), the
change in the image distance due to a change in the object distance alone is
δS′ = −(n′/n)M²Δ. Adding it to the displacement of the surface, the net displacement of
the image is given by
$$
P_0'P_0'' = \left(1 - \frac{n' M^2}{n}\right)\Delta . \tag{2-135}
$$
Note that the distance of the displaced image from the displaced surface is
S′ − (n′/n)M²Δ. The image height h′ may also change, but the more serious effect is the
defocused image. If a surface of a multisurface system is displaced, the distances of the
object for each surface that follows it also change.

In Eq. (2-18), the refracting surface is assumed to be fixed in position, and ΔS′
represents the displacement of the image corresponding to a displacement ΔS of the
object. However, when the object is fixed and the refracting surface is displaced by an
amount Δ, then the corresponding displacement of the image is given by Eq. (2-135).
When the change in object distance is the same in both cases, the change in the image
distance is not; the image location is indeed different in the two cases. The difference
comes about because when the refracting surface is displaced, the reference point for both
the object and image distances changes.
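Equation (2-135) is a first-order result, which can be compared with a direct recalculation of the image position for the displaced surface. The following sketch does this under the same assumed forms of the imaging and magnification equations as before; the function names and numerical values are hypothetical.

```python
def image_distance(n, n_prime, R, S):
    """Gaussian image distance S' for a spherical refracting surface."""
    return n_prime / (n / S + (n_prime - n) / R)

def despace_shift_first_order(n, n_prime, M, delta):
    """Net axial image displacement from Eq. (2-135)."""
    return (1.0 - n_prime * M**2 / n) * delta

def despace_shift_recomputed(n, n_prime, R, S, delta):
    """Image displacement found by re-imaging the fixed object.

    The surface moves by delta, so the object distance becomes S - delta
    and the image is located relative to the displaced vertex.
    """
    S_prime_old = image_distance(n, n_prime, R, S)
    S_prime_new = image_distance(n, n_prime, R, S - delta)
    return delta + S_prime_new - S_prime_old

# Assumed sample surface: air to glass, R = 10 cm, object at S = -40 cm
n, n_prime, R, S = 1.0, 1.5, 10.0, -40.0
S_prime = image_distance(n, n_prime, R, S)
M = n * S_prime / (n_prime * S)

delta = 1.0e-3   # despace in cm
first_order = despace_shift_first_order(n, n_prime, M, delta)
recomputed = despace_shift_recomputed(n, n_prime, R, S, delta)

print(f"first order: {first_order:.7f} cm")
print(f"recomputed:  {recomputed:.7f} cm")
```

The two values differ only by a term of second order in the despace, as expected for a small displacement.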
Figure 2-44. Despaced surface. When the surface is despaced slightly, the object and
image distances change, and, therefore, the image is displaced, thus creating
defocus.
2.9 MISALIGNED THIN LENS

Now we consider the displacement of an image formed by a thin lens when it is
slightly misaligned. We show that the image does not move when the lens is tilted about
its center.
2.9.1 Decentered Lens

Consider a thin lens forming the image of an object. Let the heights of the object and
its image from the optical axis be h and h′, respectively. Let M = h′/h = S′/S be the
image magnification. We are interested in determining the displacement of the image if
the lens is decentered by an amount Δ along the x axis, as illustrated in Figure 2-45. The
optical axis is still parallel to the z axis, but it is displaced along the x axis by an amount
Δ from its nominal position. The object and image heights from the perturbed optical
axis are given by
$$
h_p = h - \Delta \tag{2-136}
$$
and
$$
h_p' = M h_p = h' - M\Delta , \tag{2-137}
$$
respectively. Therefore, the image displacement is given by
$$
P'P'' = h_p' - (h' - \Delta) , \tag{2-138a}
$$
or
$$
P'P'' = (1 - M)\,\Delta , \tag{2-138b}
$$
which is independent of the height h of the point object P.
Figure 2-45. Decentered lens. When a lens is decentered slightly, the object and
image distances from its center do not change, but their heights from its optical axis
change.
2.9.2 Tilted Lens

When a lens is tilted about its center by an angle β, as illustrated in Figure 2-46, the
object height changes from h to hp = h − βS. The corresponding image height changes
from h′ to
$$
h_p' = M h_p = h' - M\beta S . \tag{2-139}
$$
The image displacement is given by
$$
P'P'' = h_p' - (h' - \beta S') = \beta\,(S' - M S) = 0 . \tag{2-140}
$$
Thus, the image does not move. This is not surprising, because the image lies on the ray
passing through the center of the lens, which does not change when the lens is tilted about
its center.
2.9.3 Despaced Lens

When a thin lens is displaced longitudinally, i.e., along its optical axis, as illustrated
in Figure 2-47, the distance of the object point from it changes, and therefore the distance
of its image point also changes. However, the heights of the object and image points do
not change. For a longitudinal movement Δ of the lens, the object distance changes
from S to S − Δ, or it changes by δS = −Δ. Thus, according to Eq. (2-36), the change in
the image distance due to a change in the object distance alone is δS′ = −M²Δ. Adding it
to the displacement of the lens, the net displacement of the image is given by (1 − M²)Δ.
Note that the distance of the displaced image from the displaced lens is S′ − M²Δ. As in
the case of a refracting surface, the image height h′ may also change, but the more
serious effect is the defocused image.
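The three thin-lens results of this section can be gathered in one numerical sketch. The lens data (f′ = 10 cm, S = −30 cm) and the perturbation sizes are assumed values, and the variable names are introduced only for this illustration; the thin-lens equation is used in the form 1/S′ − 1/S = 1/f′.

```python
def thin_lens_image_distance(f_prime, S):
    """Image distance from the thin-lens equation 1/S' - 1/S = 1/f'."""
    return 1.0 / (1.0 / f_prime + 1.0 / S)

# Assumed lens and object
f_prime, S = 10.0, -30.0
S_prime = thin_lens_image_distance(f_prime, S)
M = S_prime / S

delta = 0.01     # decenter or despace in cm
beta = 1.0e-3    # tilt about the lens center in radians

decenter_shift = (1.0 - M) * delta              # Eq. (2-138b)
tilt_shift = beta * (S_prime - M * S)           # Eq. (2-140)
despace_first_order = (1.0 - M**2) * delta      # first-order despace result

# Direct recalculation: the lens moves by delta, the object stays fixed
despace_recomputed = (delta
                      + thin_lens_image_distance(f_prime, S - delta)
                      - S_prime)

print(f"S' = {S_prime:.3f} cm, M = {M:.3f}")
print(f"decenter shift = {decenter_shift:.5f} cm")
print(f"tilt shift     = {tilt_shift:.2e} cm")
print(f"despace shift  = {despace_first_order:.6f} cm (first order), "
      f"{despace_recomputed:.6f} cm (recomputed)")
```

The tilt shift vanishes identically because M S = S′ for a thin lens in air, which is the content of Eq. (2-140).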
Figure 2-46. Tilted lens. When a lens is tilted about its center, the image stays
stationary.

Figure 2-47. Despaced lens. When the lens is despaced, the object and image
distances from its center change, and, therefore, displace the image, thus creating
defocus.
2.10 ANAMORPHIC IMAGING SYSTEMS

An anamorphic imaging system, for example, one consisting of cylindrical optics, is
symmetric about two orthogonal planes whose intersection defines its optical axis. The
Gaussian images of a point object with object rays in the two symmetry planes are
formed separately. They are coincident in the final image space of the system for only
two pairs of conjugate planes. By definition, an anamorphic system forms the image of an
extended object with different transverse magnifications in the two symmetry planes.
Thus, for example, the image of a square object is rectangular and that of a rectangular
object can be square.
Consider a point object P located at a point (p, q) in the object plane imaged by an
anamorphic system at a point P′, as illustrated in Figure 2-48. The cylindrical lens L1
schematically represents cylindrical lenses with their symmetry axes parallel to the x axis,
and similarly for L2 along the y axis. The system is symmetric about the yz and zx planes
whose intersection defines its optical axis z. The rays in the zx plane originating at P are
transmitted by L1 like a plane-parallel plate, and focused by L2 at P′. Similarly, the rays
in the yz plane are focused by L1 at P′ and transmitted by L2 like a plane-parallel plate.
The projections of skew rays on the zx and yz planes contribute to the image in a similar
manner.
Figure 2-48. Schematic of an anamorphic imaging system consisting of orthogonal
cylindrical lenses in a configuration called crossed cylinders. The system is
symmetric about the yz and zx planes whose intersection defines the optical axis z. A
fan of rays in the zx plane is shown originating at a point P in the center of a square
object. The cylindrical lens L1 acts as a plane-parallel plate on these rays and
transmits them without any bending. When the transmitted rays are incident on the
cylindrical lens L2, they are refracted by it just like a spherical lens, and focused at
the image point P′.

Figure 2-49. Gaussian imaging by an anamorphic imaging system, such as in Figure
2-48.

Let S1 be the distance of the point object P, and S1′ be the distance of the Gaussian
image point P′ from the object- and image-space principal planes H1 and H1′ of the lens
L1, respectively, as illustrated in Figure 2-49. They are related to each other by the
image-space focal length f1′ according to
$$
\frac{1}{S_1'} - \frac{1}{S_1} = \frac{1}{f_1'} , \tag{2-141}
$$
or
$$
S_1 = \frac{S_1' f_1'}{f_1' - S_1'} . \tag{2-142}
$$
Similarly, the object and image distances S2 and S2′ for the lens L2 of focal length f2′
are related to each other according to
$$
\frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} \tag{2-143}
$$
or
$$
\frac{1}{S_1' - d_2} - \frac{1}{S_1 - d_1} = \frac{1}{f_2'} , \tag{2-144}
$$
where d1 and d2 are the distances H1H2 and H1′H2′ between the respective principal
planes of the two lenses. In the thin-lens approximation, d1 and d2 are equal to the
spacing between the lenses. Substituting for S1 from Eq. (2-142) into Eq. (2-144), we
obtain a quadratic equation in S1′ yielding two solutions for it. A corresponding value of
S1 can be obtained for each value of S1′ from Eq. (2-142). Thus, an anamorphic system
has only two pairs of conjugates, compared to an infinite number for a rotationally
symmetric imaging system. It should be evident that the image magnifications along the x
and y axes are different, as they are given by
$$
M_x = -\frac{S_2'}{S_2} \tag{2-145}
$$
and
$$
M_y = -\frac{S_1'}{S_1} , \tag{2-146}
$$
respectively. Consequently, for example, the image of a square object is rectangular and
that of a circle is elliptical.
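The quadratic equation mentioned above can be written out and solved explicitly. Eliminating S1 between Eqs. (2-142) and (2-144) gives a S1′² + b S1′ + c = 0 with the coefficients used in the sketch below; these coefficients are our own algebra rather than an expression quoted from the text, and the function name and the crossed-cylinder data (f1′ = 10 cm, f2′ = 8 cm, thin lenses 4 cm apart) are assumed for illustration.

```python
import math

def anamorphic_conjugates(f1, f2, d1, d2):
    """Conjugate pairs (S1, S1') satisfying Eqs. (2-142) and (2-144).

    Substituting Eq. (2-142) into Eq. (2-144) and clearing fractions gives
    a*x**2 + b*x + c = 0 for x = S1' (coefficients derived for this sketch).
    """
    a = f1 + d1 - f2
    b = -(d1 * f1 + d2 * f1 + d1 * d2) - f2 * (d1 - d2)
    c = f1 * (d1 * d2 + f2 * (d1 - d2))
    if a == 0.0:
        roots = [] if b == 0.0 else [-c / b]   # degenerate (linear) case
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return []                          # no real conjugates
        sq = math.sqrt(disc)
        roots = [(-b + sq) / (2.0 * a), (-b - sq) / (2.0 * a)]
    pairs = []
    for S1_prime in roots:
        if S1_prime == f1:
            continue                           # object at infinity for L1
        S1 = S1_prime * f1 / (f1 - S1_prime)   # Eq. (2-142)
        pairs.append((S1, S1_prime))
    return pairs

# Assumed crossed cylinders: f1' = 10 cm, f2' = 8 cm, thin lenses 4 cm apart
f1, f2, d = 10.0, 8.0, 4.0
pairs = anamorphic_conjugates(f1, f2, d, d)

for S1, S1_prime in pairs:
    S2, S2_prime = S1 - d, S1_prime - d
    print(f"S1 = {S1:8.3f} cm, S1' = {S1_prime:7.3f} cm, "
          f"S1'/S1 = {S1_prime / S1:7.3f}, S2'/S2 = {S2_prime / S2:7.3f}")
```

Each solution satisfies both imaging equations, and the unequal ratios S1′/S1 and S2′/S2 that enter Eqs. (2-145) and (2-146) show the anamorphic magnification directly.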

2.11.1 Imaging Equations
2.11.1.1 General System
If an object of height h lies at a distance S from the principal point H of an imaging
system in object space of refractive index n, its image distance S′ from the image-space
principal point H′ in image space of refractive index n′ and the image height h′ are
given by (see Figure 2-50)
$$
\frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} \tag{2-147a}
$$
and
$$
M_t \equiv \frac{h'}{h} = \frac{n S'}{n' S} = \frac{n \beta_0}{n' \beta_0'} , \tag{2-147b}
$$
respectively, where f and f′ are the object- and image-space focal lengths of the system.
However, the focal points are obviously not conjugates of each other.

Figure 2-50. Imaging by a general system.
The principal planes are conjugate planes with a magnification of unity, as illustrated
by the fact that Q and Q′ are conjugate points at the same height from the axis. Note that
the dashed line QQ′ is not a ray but merely an illustration of this fact. A ray incident
parallel to the optical axis of the system passes through the image-space focal point F′
after emerging from the system. Similarly, a ray incident in the direction of the object-
space focal point F emerges parallel to the optical axis. This ray determines the image
height h′. The object-space nodal point N lies at a distance f′ from the corresponding
focal point F. Similarly, the image-space nodal point N′ lies at a distance f from the
corresponding focal point F′. The spacing between the nodal points is equal to that
between the principal points. A ray incident in the direction of N emerges from the
system in a parallel direction passing through N′. The principal points are conjugates of
each other, as are the nodal points.
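The angular magnification of a ray bundle diverging from the axial point object P0
and converging to its image point P0′ after refraction by the system, as illustrated in
Figure 2-50, is given by
$$
M_\beta = \beta_0' / \beta_0 = S / S' . \tag{2-147c}
$$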
The product of the transverse and angular magnifications is given by
$$
M_t M_\beta = n / n' , \tag{2-147d}
$$
showing that a large value of Mt is accompanied by a correspondingly small value of
Mβ. Equation (2-147b) may also be written
$$
n' h' \beta_0' = n h \beta_0 , \tag{2-147e}
$$
representing Lagrange invariance. The longitudinal magnification representing the
magnification of an axial object of length ΔS is given by
$$
M_l \equiv \Delta S' / \Delta S = (n / n')(S' / S)^2 = (n' / n) M_t^2 = M_t / M_\beta , \tag{2-147f}
$$
where ΔS′ represents the length of the axial image. It also represents the change in image
distance ΔS′ due to a small change ΔS in the object distance. It shows that the image
moves in the same direction as the object.
The Newtonian imaging equations, where the object and image distances z and z′ are
measured from the object- and image-space focal points, respectively, are given by
$$
z z' = f f' = -(n / n') f'^2 \tag{2-147g}
$$
and
$$
M_t \equiv h' / h = -z' / f' = -f / z . \tag{2-147h}
$$
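The relations in Eqs. (2-147a) through (2-147h) can be verified for a sample case. In the sketch below, the function name and the numerical values (n = 1, n′ = 1.5, f′ = 30 cm, S = −40 cm) are assumed, and the Newtonian distances are taken as z = S − f and z′ = S′ − f′, which is our reading of Figure 2-50.

```python
def general_imaging(n, n_prime, f_prime, S):
    """Gaussian and Newtonian quantities for a general system."""
    f = -(n / n_prime) * f_prime                      # from Eq. (2-147a)
    S_prime = n_prime / (n_prime / f_prime + n / S)   # Eq. (2-147a)
    M_t = n * S_prime / (n_prime * S)                 # Eq. (2-147b)
    M_beta = S / S_prime                              # Eq. (2-147c)
    M_l = (n_prime / n) * M_t**2                      # Eq. (2-147f)
    return {
        "f": f, "f_prime": f_prime, "S_prime": S_prime,
        "M_t": M_t, "M_beta": M_beta, "M_l": M_l,
        "z": S - f, "z_prime": S_prime - f_prime,     # Newtonian distances
    }

# Assumed sample system
n, n_prime = 1.0, 1.5
r = general_imaging(n, n_prime, f_prime=30.0, S=-40.0)

print(f"S' = {r['S_prime']:.3f} cm, M_t = {r['M_t']:.3f}, M_beta = {r['M_beta']:.4f}")
print(f"M_t * M_beta = {r['M_t'] * r['M_beta']:.4f}, n/n' = {n / n_prime:.4f}")
print(f"M_l = {r['M_l']:.3f}, M_t / M_beta = {r['M_t'] / r['M_beta']:.3f}")
print(f"z z' = {r['z'] * r['z_prime']:.1f}, f f' = {r['f'] * r['f_prime']:.1f}")
```

The chosen focal length corresponds to a single refracting surface with R = 10 cm between air and glass, so the numbers also serve as a check on the single-surface case.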
2.11.1.2 Refracting Surface
If the system consists simply of a refracting surface of radius of curvature R
separating media of refractive indices n and n′, the above equations apply with
$$
f' = \frac{n'}{n' - n}\,R . \tag{2-148}
$$
The principal points coincide with the vertex of the surface, and the nodal points coincide
with its center of curvature. A ray incident in the direction of the center of curvature
passes through the surface undeviated because the angles of incidence and refraction are
both equal to zero.
2.11.1.3 Thin Lens
In the case of a thin lens in air, the principal and nodal points all coincide at its
center. Thus, the object and image distances are measured from its center, and a ray
incident in the direction of its center passes through it undeviated. For a lens of refractive
index n with surfaces of radii of curvature R1 and R2, the imaging equations reduce to
$$
\frac{1}{S'} - \frac{1}{S} = \frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = -\frac{1}{f} , \tag{2-149a}
$$
$$
M_t \equiv h' / h = S' / S = \beta_0 / \beta_0' , \tag{2-149b}
$$
$$
M_\beta = \beta_0' / \beta_0 = S / S' , \tag{2-149c}
$$
$$
M_t M_\beta = 1 , \tag{2-149d}
$$
$$
h' \beta_0' = h \beta_0 , \tag{2-149e}
$$
$$
M_l \equiv \Delta S' / \Delta S = (S' / S)^2 = M_t^2 = M_t / M_\beta , \tag{2-149f}
$$
$$
z z' = f f' = -f'^2 , \tag{2-149g}
$$
and
$$
M_t \equiv h' / h = -z' / f' = -f / z . \tag{2-149h}
$$
2.11.1.4 Afocal System
In the case of an afocal system, the focal points lie at infinity. The ray angular
magnification for infinite conjugates is given by
$$
\frac{\beta'}{\beta} = \frac{n x_0}{n' x_0'} , \tag{2-150}
$$
where x0 and x0′ are the heights of a parallel ray in the object and image spaces with
refractive indices n and n′, respectively. The image of an object can be determined by
applying the imaging equation successively for each surface or thin lens of the system.
Let Mt = h′/h be the magnification of the image. The image distance S′ for another
object lying at a distance S from the previous object is given by
$$
\frac{S'}{S} = \frac{n' h'^2}{n h^2} = \frac{n'}{n} M_t^2 . \tag{2-151}
$$
The transverse magnification Mt and longitudinal magnification Ml = S′/S are
independent of the object position.
2.11.1.5 Plane-Parallel Plate
The image of an object formed by a plane-parallel plate of refractive index n and
thickness t lies at a distance t(1 − 1/n) from the object, independent of the object
distance. Its transverse and longitudinal magnifications are both equal to unity.
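The plate result can be reproduced by applying the Gaussian imaging equation to the two planar surfaces in turn (R → ∞, so that n′/S′ = n/S at each surface). The sketch below assumes a plate in air; the function names and the sample values are hypothetical.

```python
def plane_surface_image_distance(n, n_prime, S):
    """Gaussian imaging by a planar surface: n'/S' = n/S."""
    return n_prime * S / n

def plate_image_shift(n, t, S):
    """Displacement of the image from the object for a plate in air.

    S is the (numerically negative) object distance from the front surface.
    """
    S1_prime = plane_surface_image_distance(1.0, n, S)    # front surface
    S2 = S1_prime - t                                     # object for back surface
    S2_prime = plane_surface_image_distance(n, 1.0, S2)   # back surface
    image_from_front = t + S2_prime
    return image_from_front - S

# Assumed plate: n = 1.5, t = 2 cm, with two different object distances
n, t = 1.5, 2.0
shift_near = plate_image_shift(n, t, S=-5.0)
shift_far = plate_image_shift(n, t, S=-50.0)

print(f"shift for S = -5 cm:  {shift_near:.4f} cm")
print(f"shift for S = -50 cm: {shift_far:.4f} cm")
print(f"t (1 - 1/n) = {t * (1.0 - 1.0 / n):.4f} cm")
```

Both object distances give the same shift, in agreement with the statement that the displacement is independent of the object distance.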
2.11.2 Petzval Image

In Gaussian optics, the object and image distances are measured along the optical
axis of a system when determining the Gaussian image formed by it. This leads to a slight
error in the position of the image for an off-axis point object. The correct position of the
image formed by an imaging surface is obtained by considering the object and image
distances along an auxiliary axis connecting the off-axis point object and the (vertex)
center of curvature of the surface. The image thus obtained is called the Petzval image
point. It has the consequence that the correct image of a planar object is spherical and is
called the Petzval image surface. Accordingly, the image formed by a multisurface
system is also spherical unless a quantity called the Petzval sum is zero.
If n0, n1, …, nk represent the refractive indices of the media separating a series of k
refracting surfaces of vertex radii of curvature R1, R2, …, Rk, the radius of curvature of
the Petzval image of a planar object is given by
$$
\frac{1}{R_p} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right) . \tag{2-152}
$$
For a thin lens of refractive index n and focal length f′, it is given by
$$
R_p = -\,n f' . \tag{2-153}
$$
2.11.3 Misalignments
2.11.3.1 Misaligned Surface
If a surface of radius of curvature R separating media of refractive indices n and n′ is
decentered or tilted so that its center of curvature is displaced by an amount Δ, its image
is displaced from its position P′ to a position P″ such that
$$
P'P'' = (1 - M)\,\Delta , \tag{2-154}
$$
where M is the magnification of the image. If a surface is tilted by an angle β, then
Δ = βR. If the surface is displaced longitudinally by an amount Δ, then the image is
displaced from its axial position P0′ to a position P0″ such that
$$
P_0'P_0'' = \left(1 - \frac{n' M^2}{n}\right)\Delta . \tag{2-155}
$$
2.11.3.2 Misaligned Thin Lens
When a thin lens is decentered by an amount Δ, an image of magnification M
formed by it is displaced by an amount that is also given by Eq. (2-154). However, there
is no image displacement if the lens is tilted by a certain angle about its center. The image
in this case lies on the ray passing through the center of the lens, which does not change
when it is tilted about its center. If the lens is displaced longitudinally by an amount Δ,
the image is displaced by an amount
$$
P_0'P_0'' = \left(1 - M^2\right)\Delta . \tag{2-156}
$$
2.11.4 Anamorphic Imaging Systems

An anamorphic imaging system is symmetric about two orthogonal planes whose
intersection defines its optical axis. The Gaussian images of a point object with object
rays in the two symmetry planes are formed separately. They are coincident in the final
image space of the system for only two pairs of conjugate planes, as shown in Section
2.10. An anamorphic system forms the image of an extended object with different
transverse magnifications in the two symmetry planes. As a result, the image of a square
object is rectangular and that of a rectangular object can be square.
PROBLEMS
Illustrate each problem by a diagram.
2.1 Determine the focal length of an air bubble 2 mm in diameter suspended in a

solution of refractive index 1.48.
2.2 From Eqs. (2-79) and (2-82), verify that the nodal points of a refracting surface
coincide with its center of curvature.
2.3 Consider a glass sphere of radius of curvature 3 cm and refractive index 1.5. Find
the apparent position and relative size of a flower (a) embedded at its center, and
(b) placed at a distance of R/n from its center and observed from the other side of
the center. This problem illustrates the concept of a contact magnifier. A typical
lens magnifier produces a magnified (virtual) image of an object placed between it
and its front focal plane. A hemispherical or hyperhemispherical contact lens
magnifier produces a magnified (virtual) image of an object placed in contact with
its planar surface. Contact magnifiers can be used in reverse, as in immersed
detectors, where the image is focused on the detector, which is in contact with the
planar surface of the hemispherical or the hyperhemispherical lens. The image on
the detector is smaller in size by the magnification of the lens determined in parts
(a) and (b). See R. C. Jones, “Immersed radiation detectors,” Appl. Opt. 1, 607–613
(1962).
2.4 From Eq. (2-98), derive the focal length of a thick lens of refractive index n and
thickness t with surfaces of radii of curvature R1 and R2 . This problem is
discussed in detail by way of ray tracing in Chapter 4.
2.5 Consider a plano-convex lens 3 cm thick with a radius of curvature 10 cm and

refractive index 1.5. Determine how far to place a point source from its planar
surface to yield a parallel beam of light exiting from the lens.
2.6 A 4-cm square slide is placed at a distance of 8 cm from a thin lens. Determine the
focal length of the lens if an image is to be formed on a screen at a distance of 2 m.
(a) What is the size of the image? (b) Sketch the Petzval surface indicating the
value of longitudinal defocus for the corner point of the slide.
2.7 Two thin lenses of focal lengths 10 cm and 20 cm are placed 5 cm apart. If an
object is placed at a distance of 30 cm from the first lens, determine the location
and size of the image formed by the system.
2.8 (a) Determine the radii of curvature of a thin equiconvex lens of refractive index
1.5 and a power of 5 D. (b) Determine the location of the image of an object lying
at a distance of 40 cm from the lens at a height of 5 cm from its optical axis. (c)
How does the image location change if the lens is displaced by 1 mm and is
decentered by 1 mm? Check the change in image height also.
2.9 Consider a thin lens of refractive index 1.5 and focal length 30 cm. Determine its
focal length and power when placed in water. The refractive index of water is 1.33.
2.10 Consider a plane-parallel plate of thickness t and refractive index n. (a) Derive an
expression for the location and the size of the image of an object lying at a distance
So from its front surface. (b) Derive an expression for the location and the size of
the image of its front surface formed by its back surface. (c) Sketch the various
quantities determined for t = 1 cm, n = 1.5, and So = 3 cm.
2.11 Consider an afocal system consisting of two lenses of equal focal lengths f′
placed 2f′ apart. (a) Determine the transverse and longitudinal magnifications of
the image of a nearby object. (b) Determine the space between the object and its
image. Show that the position and size of the image do not change as the system is
moved along its axis. (c) How are the imaging properties of the system affected if
a third lens of focal length f′ is placed at the common focal point of the first two?
(d) As an example, consider f′ = 10 cm and an object placed at a distance of 30
cm from the first lens.
2.12 The size of the image of a distant object depends on the focal length of the imaging
system. A telephoto lens consisting of a positive lens and a negative lens is used to
obtain a large image such that the back focal distance is kept small. (a) Design a
telephoto lens with a focal length of 20 cm and a back focal distance of 4 cm. Let
the focal length of the positive lens be 4 cm. (b) Determine the focal length and the
back focal distance of the lens when it is reversed. Show that the reversed lens
works as a wide-angle lens.
2.13 (a) Show that the power of a system changes when it is reversed (by rotating it by
180° about an axis normal to its optical axis) unless the refractive indices of its
object and image spaces are equal. (b) Consider a thin lens of refractive index 1.5 and
radii of curvature 10 cm and −15 cm with air in its object space and water in its
image space. Calculate its focal lengths and power, then reverse the lens and
repeat the calculations.
2.14 Show that the radius of curvature of the Petzval surface of a system consisting of a
series of m thin lenses of refractive indices nj and focal lengths fj′, where j = 1, 2,
..., m, is given by $1/R_p = -\sum_{j=1}^{m} 1/(n_j f_j')$.
2.15 Consider a camera with an adjustable focus. Assume the lens to be thin with a focal
length of 10 cm. If the object distance changes from 2 m to 4 m, determine the lens
movements required to keep the image focused at the film.
CHAPTER 3
REFLECTING SYSTEMS
3.1 Introduction
3.2 Spherical Reflecting Surface (Spherical Mirror)
3.2.1 Gaussian Imaging Equation
3.2.2 Focal Length and Reflecting Power
3.2.3 Magnifications and the Lagrange Invariant
3.2.4 Graphical Imaging
3.2.5 Newtonian Imaging Equation
3.3 Two-Mirror Telescopes
3.4 Beam Expander
3.5 Petzval Image
3.5.1 Single Mirror
3.5.2 Two-Mirror System
3.5.3 System of k Mirrors
3.6 Misaligned Mirror
3.6.1 Decentered Mirror
3.6.2 Tilted Mirror
3.6.3 Despaced Mirror
3.7 Misaligned Two-Mirror Telescope
3.7.1 Decentered Secondary Mirror
3.7.2 Tilted Secondary Mirror
3.7.3 Despaced Secondary Mirror
3.8.1 Imaging by a Mirror
3.8.2 Imaging by a Two-Mirror Telescope
Problems
Chapter 3
Reflecting Systems
3.1 INTRODUCTION
In Section 1.8.3, we considered Gaussian imaging by a reflecting surface, and
showed that the curved surface can be replaced by a planar surface that is a tangent to the
surface at its vertex, called the tangent plane or the paraxial surface. In this chapter, we
rederive the Gaussian imaging equations for a spherical reflecting surface and show that
they can be obtained from those for a corresponding refracting surface by setting the
refractive index associated with the reflected rays equal to the negative of that
associated with the incident rays. Both Gaussian and Newtonian forms of the imaging
equations are given. We also show how to determine the image graphically. The Petzval
image, two-mirror telescopes, a beam expander, and the image displacement resulting
from the misalignment of a mirror are also discussed.
3.2 SPHERICAL REFLECTING SURFACE (SPHERICAL MIRROR)

3.2.1 Gaussian Imaging Equation
Consider a spherical reflecting surface of radius of curvature $R$, as illustrated in
Figure 3-1. The line $VC$ joining its vertex $V$ and its center of curvature $C$ defines its
optical axis. Consider an axial point object $P_0$ at a distance $S$. Let the slope angles of the
incident and reflected rays from the optical axis be $\beta_0$ and $\beta_0'$, respectively. According
to the law of reflection [see Eq. (1-6)],
$$\theta' = -\theta , \tag{3-1}$$
where $\theta$ and $\theta'$ are the angles of incidence and reflection, respectively. From
triangle $P_0CQ$, we note that
$$\phi = \beta_0 - \theta , \tag{3-2}$$
where $\phi$ is the angle that the surface normal at the point of incidence $Q$ makes with the
optical axis. Similarly, from triangle $CP_0'Q$, we note that
$$\beta_0' = \phi + \theta' . \tag{3-3}$$
The tangent of a small angle is approximately equal to the angle in radians. Thus, we may
write
$$\phi = -x/R , \tag{3-4a}$$
$$\beta_0 = -x/S , \tag{3-4b}$$
and
$$\beta_0' = -x/S' , \tag{3-4c}$$
where $x$ is the height of the point of incidence, and $S'$ is the image distance. Substituting
for $\theta$ and $\theta'$ from Eqs. (3-2) and (3-3) into Eq. (3-1), and using Eqs. (3-4) for the other
angles, we obtain the Gaussian imaging equation
$$\frac{1}{S'} + \frac{1}{S} = \frac{2}{R} . \tag{3-5}$$
As $R \to \infty$, $S' \to -S$, as expected for a plane mirror.
Figure 3-1. Gaussian imaging of an axial point object $P_0$ by a spherical reflecting
surface of radius of curvature $R$ and center of curvature $C$. (a) Concave mirror
forms a real Gaussian image $P_0'$. (b) Convex mirror forms a virtual Gaussian image.
$F'$ is the focal point of the mirror.
3.2.2 Focal Length and Reflecting Power
When the object lies at infinity, i.e., when $S = -\infty$, the corresponding image
distance $S' \equiv VF' = f'$, where $f'$ is called the focal length of the mirror. Thus, the focal
length of the mirror is given by
$$f' = R/2 . \tag{3-6}$$
The rays incident parallel to the optical axis come to focus after reflection by the mirror at
the point $F'$, which lies halfway between $V$ and $C$. It is evident that a mirror has only one
focal point. The object- and image-space focal points are coincident just as the two spaces
are coincident. Thus, if a point source is placed at the focal point $F'$, its rays incident on
the mirror become parallel after being reflected by it. Substituting Eq. (3-6) into Eq. (3-5),
we obtain
$$\frac{1}{S'} + \frac{1}{S} = \frac{1}{f'} . \tag{3-7}$$
The focal point $F'$ of a mirror is illustrated in Figure 3-2 for both a concave and a
convex mirror. It is real in the case of a concave mirror, but it is virtual in the case of a
convex mirror. We note that Eq. (3-5) is independent of the refractive index of the
medium in which the rays are incident or reflected. Therefore, it is independent of the
direction of propagation of the rays. The focal length $f'$ is numerically negative for a
concave mirror but positive for a convex mirror.
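As a quick numerical check of Eqs. (3-5) through (3-7), the following Python sketch solves the Gaussian imaging equation for the image distance. The function names and the sample values (a concave mirror with R = -20 cm and an object 30 cm to its left) are illustrative assumptions, not taken from the text; distances follow the sign convention used above.

```python
import math

def mirror_focal_length(R):
    """Focal length f' = R/2 of a spherical mirror, Eq. (3-6)."""
    return R / 2.0

def mirror_image_distance(S, R):
    """Image distance S' from 1/S' + 1/S = 2/R, Eq. (3-5).

    S = -math.inf represents an object at infinity. The function returns
    math.inf when the image lies at infinity (object at the focal point).
    """
    inv_S = 0.0 if math.isinf(S) else 1.0 / S
    inv_S_prime = 2.0 / R - inv_S
    if inv_S_prime == 0.0:
        return math.inf
    return 1.0 / inv_S_prime

# Hypothetical concave mirror (R < 0) with a real object to its left (S < 0).
R = -20.0
S = -30.0
S_prime = mirror_image_distance(S, R)          # negative: real image, left of the vertex
f_prime = mirror_image_distance(-math.inf, R)  # object at infinity gives S' = f'
```

With these sample values the image distance is numerically negative, as expected for a real image formed by a concave mirror, and the image of an object at infinity lies at R/2.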
Figure 3-2. The focal point $F'$ of a mirror. It lies halfway between the vertex $V$ and
the center of curvature $C$ of the mirror. (a) Concave mirror. (b) Convex mirror.
For object rays propagating from left to right, the rays on the first mirror (not
necessarily the first imaging element) of a system will be incident propagating from left
to right. In a medium of refractive index $n$, the incident rays will be associated with a
refractive index $n$, but the reflected rays will be associated with a refractive index
$n' = -n$. Any refracting imaging elements following the mirror will be assigned the
negative of their actual refractive indices because the rays on them are incident
propagating from right to left. When reflected by a second mirror in the system, these rays
will propagate from left to right and will be associated with a refractive index $n_2' = n$.
Therefore, we define the reflecting power $K$ and the equivalent or effective focal length
$f_e$ of a mirror according to
$$K = \frac{1}{f_e} = \frac{2n'}{R} , \tag{3-8}$$
where $n'$ is the refractive index associated with the rays reflected by it. Thus, if the first
mirror in a system is concave, it has a negative value of $R$, a negative value of $n'$ in Eq.
(3-8), and positive values of $K$ and $f_e$. Similarly, a second concave mirror in a system
will have a positive value of $R$, a positive value of $n'$ in Eq. (3-8), and positive values of
$K$ and $f_e$. Therefore, a concave mirror is always a positive imaging element, regardless
of the direction of the rays incident on it. Similarly, a convex mirror has negative values
of $K$ and $f_e$, i.e., it is always a negative imaging element, regardless of the direction of
the rays incident on it. In air, $n' = -1$ for the first mirror, and its reflecting power and
equivalent focal length are given by
$$K_1 = \frac{1}{f_{e1}} = -\frac{2}{R_1} , \tag{3-9a}$$
where $R_1$ is its radius of curvature. Similarly, because $n' = 1$ for a second mirror, its
reflecting power and equivalent focal length are given by
$$K_2 = \frac{1}{f_{e2}} = \frac{2}{R_2} , \tag{3-9b}$$
where $R_2$ is its radius of curvature. Continuing in this manner, we find that the reflecting
power $K_j$ and equivalent focal length $f_{ej}$ of a $j$th mirror of radius of curvature $R_j$ in a
system in air are given by
$$K_j = \frac{1}{f_{ej}} = (-1)^j \frac{2}{R_j} . \tag{3-10}$$
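The sign bookkeeping of Eqs. (3-8) through (3-10) can be sketched in a few lines of Python. The function name, the optional medium index n, and the sample radii are assumptions made for illustration; the index associated with the reflected rays is taken as $(-1)^j n$ for the $j$th mirror, as described above.

```python
def reflecting_power(j, R_j, n=1.0):
    """Reflecting power K_j of the jth mirror (j = 1, 2, ...) in a system.

    The rays reflected by the jth mirror are associated with a refractive
    index n' = (-1)**j * n, so that K_j = 2 n' / R_j, Eq. (3-8). For n = 1
    this reduces to Eq. (3-10).
    """
    if j < 1:
        raise ValueError("mirror index j must be 1 or greater")
    n_reflected = (-1) ** j * n
    return 2.0 * n_reflected / R_j

# Hypothetical radii of curvature (same length unit throughout).
K1 = reflecting_power(1, -20.0)        # concave first mirror (R < 0)
K2_concave = reflecting_power(2, 8.0)  # concave second mirror (R > 0)
K2_convex = reflecting_power(2, -8.0)  # convex second mirror (R < 0)
```

Both concave mirrors come out with positive power and the convex one with negative power, regardless of the direction of the incident rays.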
3.2.3 Magnifications and the Lagrange Invariant
Now we consider the imaging of an off-axis point object $P$ at height $h$ from the
optical axis in the object plane passing through $P_0$, as illustrated in Figure 3-3. A ray $PV$
incident at the vertex $V$ of the mirror is reflected as a ray $VP'$ intersecting the image
plane passing through $P_0'$ at the point $P'$, which locates the image point at a height $h'$. It
is evident from the figure that
$$\theta = h/S \tag{3-11a}$$
and
$$\theta' = h'/S' . \tag{3-11b}$$
Substituting Eqs. (3-11) into Eq. (3-1), we find that the transverse magnification of the
image is given by
$$M_t \equiv h'/h = -S'/S . \tag{3-12a}$$
The image formed by a concave mirror is inverted, as in Figure 3-3a, but that by a convex
mirror is erect, as in Figure 3-3b. Accordingly, the magnification is negative in Figure 3-3a
and positive in Figure 3-3b. From similar triangles $P_0PC$ and $P_0'P'C$, we find that the
transverse magnification may also be written
$$M_t = -\frac{R - S'}{S - R} . \tag{3-12b}$$
Letting $R \to \infty$ shows that the magnification for a plane mirror is unity, as expected.
The ray angular magnification, i.e., the ratio of the slope angle of a ray converging to
$P_0'$ to the slope angle of the same ray diverging from $P_0$, as in Figure 3-4, is given by
$$M_\beta = \beta_0'/\beta_0 = S/S' . \tag{3-13}$$
From Eqs. (3-12a) and (3-13), we obtain
$$M_t M_\beta = -1 . \tag{3-14}$$
Equation (3-14) may also be written
$$h'\beta_0' = -h\beta_0 , \tag{3-15}$$
representing the Lagrange invariance for the mirror. The Lagrange invariant is $nh\beta_0$,
where $n = 1$. From Eq. (3-15), the transverse magnification of the image can also be
written
$$M_t = -\beta_0/\beta_0' , \tag{3-16}$$
i.e., it can also be obtained from the slope angles of the axial incident ray and the
corresponding reflected ray.
Figure 3-3. Gaussian imaging of an off-axis point object $P$ at height $h$. (a) Concave
mirror forms a real and inverted image at $P'$ at a height $h'$. (b) Convex mirror
forms a virtual and erect image.
Figure 3-4. Lagrange invariant $nh\beta_0$ of a mirror. (a) Concave mirror. (b) Convex
mirror.
Differentiating Eq. (3-5), we obtain the longitudinal magnification $M_l$ of the image
in terms of its transverse magnification $M_t$ according to
$$M_l = \Delta S'/\Delta S = -(S'/S)^2 = -M_t^2 . \tag{3-17}$$
Equation (3-17) shows that whether $M_t$ is positive or negative, $M_l$ is always negative.
Thus, for example, if the object distance increases, the image distance decreases. For a
real object, an increase in the object distance takes place (from a larger negative value to
a smaller one) by moving the object closer to the mirror. In Figure 3-3a, a decrease in the
image distance (from a smaller negative value to a larger one) implies that the image
moves away from the mirror. Similarly, in Figure 3-3b, a decrease in the image distance
(from a larger positive value to a smaller one) implies that the image moves closer to the
mirror. Thus, the image moves in a direction opposite to that of the object. This is true for
a system with an odd number of mirrors, as may be seen from Eq. (2-61) by letting
$n'/n = -1$. The opposite is true if the number of mirrors is even because then $n'/n = 1$.
Figure 3-5 illustrates a 3D image of a 3D object. The reversal of the image arrows $P_0'x'$,
$P_0'y'$, and $P_0'z'$, compared with the corresponding object arrows $P_0x$, $P_0y$, and $P_0z$,
shows that both the transverse and longitudinal magnifications are numerically negative.
This is different from that for a refracting surface, where the longitudinal magnification is
positive (see Figure 2-10).
In Eq. (3-17), the mirror is assumed to be fixed in position, and $\Delta S'$ represents the
displacement of the image corresponding to a displacement $\Delta S$ of the object. However, if
the object is fixed and the mirror is displaced by an amount $\Delta$, then the corresponding
displacement of the image is $(1 + M_t^2)\Delta$, as shown in Section 3.6.3.
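The relations among the transverse, angular, and longitudinal magnifications can be checked numerically. The sketch below is a minimal illustration under assumed sample values (R = -20, S = -30, a finite object distance not at the focal point); the function name is hypothetical. It also estimates the longitudinal magnification by a finite difference of Eq. (3-5) for comparison with $-M_t^2$.

```python
def mirror_magnifications(S, R):
    """Return (S', M_t, M_beta, M_l) for a spherical mirror.

    Assumes a finite object distance S that does not coincide with the focus.
    """
    S_prime = 1.0 / (2.0 / R - 1.0 / S)  # Gaussian imaging equation, Eq. (3-5)
    M_t = -S_prime / S                   # transverse magnification, Eq. (3-12a)
    M_beta = S / S_prime                 # ray angular magnification, Eq. (3-13)
    M_l = -M_t ** 2                      # longitudinal magnification, Eq. (3-17)
    return S_prime, M_t, M_beta, M_l

# Hypothetical concave mirror and real object.
S, R = -30.0, -20.0
S_prime, M_t, M_beta, M_l = mirror_magnifications(S, R)

# Finite-difference estimate of dS'/dS with the mirror held fixed.
dS = 1e-6
S_prime_shifted = mirror_magnifications(S + dS, R)[0]
M_l_numeric = (S_prime_shifted - S_prime) / dS

lagrange_product = M_t * M_beta  # should equal -1, Eq. (3-14)
```

The finite-difference value agrees with $-M_t^2$ to within the step size, and the product of the transverse and angular magnifications is -1 for any valid pair (S, R).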
Comparing Eq. (3-5) with Eq. (2-4), we note that the imaging properties of a
spherical reflecting surface can be obtained from those of a spherical refracting surface if
we let $n = 1$ because the medium between the object and the mirror is air, and $n' = -1$,
representing a reflected ray propagating backward. Similarly, the expressions for the focal
length, reflecting power, magnifications, and Lagrange invariant for a mirror can be
obtained from the corresponding expressions for a refracting surface by letting $n' = -n$,
where $n = 1$.
Figure 3-5. 3D image of a 3D object. Both the transverse and longitudinal
magnifications are numerically negative, as illustrated by the reversal of the image
of the x, y, and z arrows. This is different from that for a refracting surface, where
the longitudinal magnification is positive (see Figure 2-11).
3.2.4 Graphical Imaging
The graphical construction of an image formed by a reflecting surface is similar to
that for a refracting surface, except that the former has only one focal point. It is
illustrated in Figure 3-6 for a concave and a convex mirror. Thus, the Gaussian image $P'$
of a point object $P$ may be determined as the point of intersection of any two of the
following three rays after reflection by the mirror. First, a ray incident parallel to the axis
of the mirror is reflected by it passing through its focal point $F'$. Second, a ray incident in
the direction of $F'$ is reflected parallel to the axis. Third, a ray incident in the direction of
the center of curvature $C$ is reflected upon itself. It should be evident that a mirror has only
one principal point, which coincides with its vertex; its nodal points coincide with its
center of curvature. It should be remembered that in Gaussian optics, which is based on
paraxial rays, any reflection at a surface takes place at a plane that is tangent to it at its
vertex, namely, the tangent plane $AVB$, as illustrated in Figure 3-6. The tangent plane is
sometimes referred to as the paraxial reflecting surface.
The Gaussian image $P_0'$ of an on-axis point object $P_0$ can be determined
independently (rather than as the point of intersection of the optical axis and the line that
is perpendicular to it and passes through $P'$) as follows: Consider a ray $P_0E$ incident on
the surface, as shown in Figure 3-7. A hypothetical ray incident parallel to it and passing
through $C$ intersects the focal plane at a point $D$. The reflected ray corresponding to the
incident ray $P_0E$ passes through the point $D$ and intersects the optical axis at the
Gaussian image point $P_0'$. The point $D$ may also be determined by considering a
hypothetical parallel ray passing through the focal point $F'$. It is reflected as a ray
parallel to the optical axis intersecting the focal plane at the point $D$.
3.2.5 Newtonian Imaging Equation
Equation (3-7) is the Gaussian imaging equation for a mirror in which the object and
image distances are measured from its vertex. If we measure the object and image
distances $z$ and $z'$, respectively, from the focal point $F'$, as indicated in Figure 3-6, then
from similar triangles $VF'B$ and $P_0'F'P'$, and similar triangles $P_0F'P$ and $VF'A$, we find
that the transverse magnification of the image is given by
$$M_t = h'/h = -z'/f' = -f'/z . \tag{3-18}$$
Therefore,
$$z z' = f'^2 , \tag{3-19}$$
which is the Newtonian imaging equation.
Figure 3-6. Paraxial imaging of a real object $P_0P$ of height $h$. (a) Concave mirror
forms a real and inverted image $P_0'P'$ of height $h'$. (b) Convex mirror forms a
virtual and erect image. In the Gaussian approximation, the reflection of rays takes
place at the tangent plane $AVB$.
Figure 3-7. Graphical Gaussian imaging by a spherical reflecting surface of an axial
point object $P_0$. (a) Concave mirror. (b) Convex mirror.
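The equivalence of the Newtonian and Gaussian forms of the imaging equation can be verified numerically. In the sketch below, the function names and sample values are assumptions chosen for illustration; the object and image distances from the focal point are taken as $z = S - f'$ and $z' = S' - f'$, which is our reading of the distances indicated in Figure 3-6.

```python
def image_distance_gaussian(S, R):
    """Image distance S' measured from the vertex, Eq. (3-5)."""
    return 1.0 / (2.0 / R - 1.0 / S)

def image_distance_newtonian(S, R):
    """Image distance S' obtained through the Newtonian equation z z' = f'^2."""
    f_prime = R / 2.0               # Eq. (3-6)
    z = S - f_prime                 # object distance measured from F'
    z_prime = f_prime ** 2 / z      # Eq. (3-19)
    return z_prime + f_prime        # convert back to a distance from the vertex

# Hypothetical concave (R < 0) and convex (R > 0) mirrors with the same real object.
S = -30.0
concave = (image_distance_gaussian(S, -20.0), image_distance_newtonian(S, -20.0))
convex = (image_distance_gaussian(S, 20.0), image_distance_newtonian(S, 20.0))
```

Both routes give the same image distance for the concave and the convex case, which is a useful sanity check on the sign convention for $z$ and $z'$.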
3.3 TWO-MIRROR TELESCOPES

In this section, we consider imaging by a telescope consisting of two mirrors. The
strict definition of a telescope is an optical system that is afocal, i.e., for an object lying at
infinity on one side, it forms the image at infinity on the other side. Historically, this
definition held as long as the image was observed by humans because the eye is most
relaxed when it sees an object lying at infinity. However, with the advent of
photographic film and, more recently, solid-state optical detectors, the definition of a
telescope has evolved to include a system that is focal, such that the image is formed at a
finite distance on the film or a detector array, often called the focal plane array. Reflecting
telescopes (as opposed to the refracting telescopes discussed in Section 6.5) have the
advantage that the images formed by them do not suffer from chromatic aberrations.
However, because one mirror obscures a portion of the other, the beam of light that forms
the final image is annular, resulting in a decrease in the amount of light in the image.
Two configurations of a two-mirror astronomical telescope imaging an object lying
at infinity are illustrated in Figure 3-8. Both telescopes have a concave primary mirror,
but one has a convex secondary mirror, and the other a concave. The axial image
formation is illustrated by the solid-line rays, and the off-axis by the dashed-line ray. In
the Cassegrain telescope, the image formed by the concave primary mirror $M_1$ lies
beyond the convex secondary mirror $M_2$ and is thus a virtual object for it. In the
Gregorian telescope, both mirrors are concave, and the image formed by the primary
mirror lies between the two. For a given focal length of the primary mirror, the
Cassegrain telescope is evidently shorter in length compared to the Gregorian.
Let $R_1$ and $R_2$ be the vertex radii of curvature of the primary and secondary mirrors,
respectively. Their corresponding focal lengths are given by $f_1' = R_1/2$ and $f_2' = R_2/2$.
The mirrors are coaxial such that the system is rotationally symmetric about the optical
axis that passes through their vertices. Let the vertex-to-vertex spacing from mirror $M_1$ to
mirror $M_2$ be $t$ (a numerically negative quantity).
Applying Eq. (3-7) to the primary mirror $M_1$, we note that for an object at infinity,
$S_1 = -\infty$, and therefore the image is formed at its focus $F_1'$, called the prime focus, at a
(numerically negative) distance $S_1' = f_1'$ from $M_1$. This image is the object for the
secondary mirror $M_2$ and lies at a distance
$$S_2 = f_1' - t \tag{3-20}$$
from it. In Figure 3-8a, $F_1'$ lies inside the focus of the secondary mirror, i.e., $|S_2| < |f_2'|$,
but in Figure 3-8b, it lies outside, i.e., $|S_2| > |f_2'|$. In both cases, a real image is formed by
$M_2$ that lies at the telescope (or Cassegrain) focus $F'$ at a distance $S_2'$ given by
$$\frac{1}{S_2'} = \frac{1}{f_2'} - \frac{1}{S_2} , \tag{3-21}$$
or
$$S_2' = \frac{f_2'(f_1' - t)}{f_1' - t - f_2'} . \tag{3-22}$$
$S_2'$ locates the image formed by the system, and $S_2' + t$ gives the distance of the image
from the primary mirror, called the working distance. Although the location of the focal
point $F'$ is thus determined, the focal length is not; it can be obtained by substituting for
$S_2$ and $S_2'$ into Eq. (2-98) and letting $n_2' = 1$, representing the refractive index associated
with the ray reflected by $M_2$. Therefore, the focal length of the telescope is given by
$$\frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{3-23}$$
Figure 3-8. Astronomical telescope consisting of two mirrors $M_1$ and $M_2$.
(a) Cassegrain and (b) Gregorian forms are shown in this figure. The spherical
image surface passing through the focal point $F'$ is an illustration of the Petzval
image surface. The axial image formation is illustrated by the solid-line rays, and
off-axis by the dashed-line rays.
By definition, the focal length is the distance of $F'$ from the principal point $H'$. As
illustrated in Figure 3-9, $H'$ is the point where the optical axis intersects the principal
plane, which, in turn, is the transverse plane passing through the point of intersection of a
ray incident parallel to the optical axis and the corresponding exit ray passing through
$F'$. To determine $f'$, we need to know the slope angle $\beta'$ of the exit ray in terms of the
height of the incident ray. This is done in Section 4.8, where Eq. (3-23) is rederived.
For an object lying at infinity at an angle $\beta$ from the optical axis of the system (see
Figure 3-8), the height $h_1'$ of its image formed by $M_1$ is given by
$$h_1' = -\beta f_1' . \tag{3-24}$$
The image formed by $M_1$ is the object for $M_2$. The height $h_2'$ of the final image formed
by $M_2$ (and therefore by the system) follows from the magnification
$$M_2 \equiv h_2'/h_1' = -S_2'/S_2 = -f_2'/(f_1' - t - f_2') ,$$
or
$$M_2 = -f'/f_1' . \tag{3-25}$$
Figure 3-9. Ray tracing of a two-mirror system to determine its focal point $F'$ and
principal point $H'$. A ray incident parallel to the optical axis is reflected by mirror
$M_1$ in the direction of its focal point $F_1'$, which, in turn, is reflected by mirror $M_2$
in the direction of the telescope focus $F'$. The incident and the exit rays meet in the
principal plane whose intersection with the optical axis locates the principal point
$H'$.
The magnification $M_2$ of the image formed by the secondary mirror is called the
secondary magnification. From Eqs. (3-24) and (3-25), we obtain
$$h_2' = \beta f' . \tag{3-26}$$
It is evident from Figure 3-9 that the focal ratio of the image-forming light cone is given
by
$$F = f'/D_1 , \tag{3-27}$$
where $D_1$ is the diameter of the primary mirror. It should be evident from Figure 3-8 that
the secondary mirror obscures the beam incident on the primary mirror. Accordingly, the
image-forming light cone is annular (see Section 4.8 for obscuration value). The signs of
the various quantities associated with the Cassegrain and Gregorian telescopes are
summarized in Table 3-1.
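The chain of Eqs. (3-20) through (3-25) is easy to evaluate numerically. The following Python sketch is illustrative only: the function name, the dictionary keys, and the sample focal lengths and spacings are assumptions, not values from the text. The secondary magnification is computed directly as $-S_2'/S_2$ so that it can be compared with $-f'/f_1'$ of Eq. (3-25).

```python
def two_mirror_telescope(f1, f2, t):
    """Gaussian parameters of a two-mirror telescope imaging an object at infinity.

    f1, f2 : focal lengths of the primary and secondary mirrors (f = R/2)
    t      : vertex-to-vertex spacing from M1 to M2 (numerically negative)
    """
    S2 = f1 - t                                            # Eq. (3-20)
    S2_prime = f2 * S2 / (S2 - f2)                         # Eq. (3-22)
    f = 1.0 / (-1.0 / f1 + 1.0 / f2 - t / (f1 * f2))       # Eq. (3-23)
    M2 = -S2_prime / S2                                    # secondary magnification
    return {
        "S2": S2,
        "S2_prime": S2_prime,
        "f": f,
        "M2": M2,
        "working_distance": S2_prime + t,
    }

# Hypothetical Cassegrain (convex secondary, f2 < 0) and Gregorian (concave
# secondary, f2 > 0) telescopes sharing the same primary mirror.
cassegrain = two_mirror_telescope(f1=-10.0, f2=-3.0, t=-7.5)
gregorian = two_mirror_telescope(f1=-10.0, f2=3.0, t=-13.5)
```

For these sample values the Cassegrain has a positive focal length and secondary magnification, the Gregorian has negative ones, and $S_2'$ is positive in both, in line with the signs listed in Table 3-1.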
3.4 BEAM EXPANDER
If the two mirrors of the telescope are confocal (meaning "common focus"), i.e., if
$t = f_1' - f_2'$, then $f' \to \infty$, and the system becomes afocal, acting as a beam expander. As
illustrated in Figure 3-10, a parallel beam incident at an angle $\beta$ with the optical axis OA
is focused by the primary mirror $M_1$ at a height $h_1' = -\beta f_1'$ in the plane of the common
focus. The beam focus lies at an angle $h_1'/f_2'$ from the optical axis. Thus, the secondary
mirror recollimates the light, making an angle $\beta' = -h_1'/f_2'$. It is evident that the
beam-expansion ratio is $D_2/D_1 = f_2'/f_1'$, where $D_1$ and $D_2$ are the diameters of the primary
and secondary mirrors, respectively. The angular magnification of the output beam is
given by $\beta'/\beta = f_1'/f_2'$. The product of the transverse and angular magnifications is unity.
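A short numerical sketch of the confocal condition follows. The function name and the sample focal lengths are assumptions for illustration; the system power is evaluated with the two-mirror focal-length expression of Eq. (3-23) to confirm that it vanishes when $t = f_1' - f_2'$.

```python
def beam_expander(f1, f2):
    """Confocal two-mirror system: spacing, power, and magnifications.

    Returns (t, power, angular_magnification, diameter_ratio), where power
    is 1/f' evaluated from Eq. (3-23) and should vanish for an afocal system.
    """
    t = f1 - f2                                    # confocal spacing
    power = -1.0 / f1 + 1.0 / f2 - t / (f1 * f2)   # Eq. (3-23)
    angular_magnification = f1 / f2                # beta'/beta
    diameter_ratio = f2 / f1                       # D2/D1
    return t, power, angular_magnification, diameter_ratio

# Hypothetical focal lengths of the primary and secondary mirrors.
t, power, M_angular, D_ratio = beam_expander(f1=-10.0, f2=-2.0)
```

The product of the diameter ratio and the angular magnification is unity, so a beam reduced in diameter by a factor of five has its field angle magnified by the same factor.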
Table 3-1. Signs of focal lengths, etc. for Cassegrain and Gregorian telescopes.
Quantity    Cassegrain    Gregorian
$f_1'$      −             −
$f_2'$      −             +
$f'$        +             −
$M_2$       +             −
$S_2'$      +             +
$t$         −             −
$R_p$       −             +
Figure 3-10. Schematic of a beam-expander system consisting of two confocal
mirrors $M_1$ and $M_2$ with their focal points $F_1'$ and $F_2'$. The dotted lines shown are
parallel to the optical axis OA.
3.5 PETZVAL IMAGE
3.5.1 Single Mirror
The radius of curvature $R_i$ of the Petzval image surface for a spherical refracting
surface of radius of curvature $R$ separating media of refractive indices $n$ and $n'$ is given
by Eq. (2-118):
$$\frac{1}{R_i} = \frac{n - n'}{nR} . \tag{3-28}$$
Letting $n = 1$ and $n' = -1$, we obtain the radius of curvature of the Petzval surface for a
corresponding reflecting surface:
$$\frac{1}{R_{i1}} = -\frac{1}{R}(-1 - 1) ,$$
or
$$R_p = R/2 = f' . \tag{3-29}$$
For a concave (converging or a positive) mirror with its center of curvature to the left of
its vertex, $R$ and $f'$ are numerically negative (see Figure 3-11a). Therefore, $R_p$ is also
numerically negative, or the Petzval surface is curved in the same manner as the mirror
with a radius of curvature equal to the focal length of the mirror. For a convex (diverging
or a negative) mirror with its center of curvature to the right of its vertex, $R$ and $f'$ are
numerically positive (see Figure 3-11b). Therefore, $R_p$ is also numerically positive, or
the Petzval surface is virtual and curved in the same manner as the mirror. When the
object is at infinity so that the image lies in the focal plane of the mirror, the Petzval
surface is concentric with the mirror, regardless of whether the mirror is concave or
convex.
Figure 3-11. Petzval surface of a mirror. (a) Concave mirror with a real Petzval
image surface. (b) Convex mirror with a virtual Petzval image surface. $C$ and $F'$
are the center of curvature and the focal point of the mirror. $P_0'$ is the axial image
point, and $C_p$ is the center of curvature of the Petzval surface. $P_0'C_p$ is the radius of
curvature of the Petzval surface.
3.5.2 Two-Mirror System
The radius of curvature $R_{ik}$, or $R_p$, of the Petzval image surface of a system
consisting of $k$ refracting surfaces of radii of curvature $R_j$, $j = 1, 2, \ldots, k$, separating
media of refractive indices $n_0, n_1, \ldots, n_k$, respectively, is given by Eq. (2-121a):
$$\frac{1}{R_{ik}} = n_k \sum_{j=1}^{k} \frac{1}{R_j}\left(\frac{1}{n_j} - \frac{1}{n_{j-1}}\right) . \tag{3-30}$$
The rays on each surface are incident from left to right, and the radius of curvature $R_j$ of a
surface, including the Petzval surface, is numerically positive or negative, depending on
whether its center of curvature lies to the right or the left of its vertex, i.e., depending on
whether it is convex or concave to the light incident on it. If the $j$th surface of a system is
a reflecting one, then $n_{j-1} = 1$ and $n_j = -1$ when rays are incident on it from left to right.
However, if they are incident from right to left, as, for example, on the secondary mirror
of a two-mirror system, then $n_{j-1} = -1$ and $n_j = 1$.
For a system consisting of two mirrors with radii of curvature $R_1$ and $R_2$, the
refractive indices have the values $n_0 = 1$, $n_1 = -1$, and $n_2 = 1$ (a second reflection makes
$n_2$ positive). Thus, Eq. (3-30) reduces to
$$\frac{1}{R_{i2}} = \frac{1}{R_1}(-1 - 1) + \frac{1}{R_2}(1 + 1) ,$$
or
$$\frac{1}{R_p} = 2\left(-\frac{1}{R_1} + \frac{1}{R_2}\right) = -\frac{1}{f_1'} + \frac{1}{f_2'} , \tag{3-31}$$
where $f_1'$ and $f_2'$ are the focal lengths of the mirrors. This surface is shown passing
through $F'$ in Figure 3-8. It is curved toward the primary mirror in the case of a
Cassegrain telescope, and away from it in the case of a Gregorian telescope.
3.5.3 System of k Mirrors
Similarly, we find that the radius of curvature $R_p$ of the Petzval surface for a system
consisting of $k$ mirrors with radii of curvature $R_j$, $j = 1, 2, \ldots, k$, is given by
$$\frac{1}{R_p} = 2(-1)^k \sum_{j=1}^{k} \frac{(-1)^j}{R_j} . \tag{3-32}$$
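Equation (3-32) translates directly into a few lines of Python. The sketch below returns the Petzval curvature $1/R_p$ rather than the radius so that a flat Petzval surface (zero curvature) does not cause a division by zero; the function name and the sample radii are assumptions for illustration.

```python
def petzval_curvature(radii):
    """Petzval curvature 1/R_p of a system of k mirrors in air, Eq. (3-32).

    radii : sequence of vertex radii of curvature R_1, ..., R_k, each signed
            according to whether its center of curvature lies to the right (+)
            or to the left (-) of its vertex.
    """
    k = len(radii)
    if k == 0:
        raise ValueError("at least one mirror is required")
    total = sum((-1) ** j / R_j for j, R_j in enumerate(radii, start=1))
    return 2.0 * (-1) ** k * total

# Single concave mirror: expect R_p = R/2, Eq. (3-29).
single = petzval_curvature([-20.0])

# Hypothetical Cassegrain pair with f1' = -10 and f2' = -3: expect Eq. (3-31).
pair = petzval_curvature([-20.0, -6.0])
```

For the sample Cassegrain pair the curvature is negative, matching the sign of $R_p$ listed for the Cassegrain telescope in Table 3-1.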
3.6 MISALIGNED MIRROR
The displacement of an image due to the misalignment of a reflecting surface can be
obtained from the corresponding results for a refracting surface by letting $n' = -n = -1$.
However, it is instructive to rederive them here.
3.6.1 Decentered Mirror
Consider a mirror of radius of curvature $R$ forming the image of a point object $P$
lying at a distance $S$ and height $h$, as illustrated in Figure 3-12. The image point lies at
distance $S'$ and height $h'$ given by Eqs. (3-5) and (3-12), respectively. If the mirror is
decentered by an amount $\Delta$, the object and image heights with respect to the decentered
axis of the mirror become
$$h_p = h - \Delta \tag{3-33}$$
and
$$h_p' = M_t h_p = h' - M_t \Delta , \tag{3-34}$$
respectively, where $M_t$ is the transverse magnification of the image. The image is
displaced to position $P''$, where the displacement $P'P''$ is given by
$$P'P'' = h_p' - (h' - \Delta) = (1 - M_t)\Delta . \tag{3-35}$$
Letting $M_t = 1$ for a plane mirror, the image stays stationary, as expected.

Figure 3-12. Decentered mirror. When the mirror is decentered by an amount $\Delta$,
the image is displaced by an amount $(1 - M_t)\Delta$.
3.6.2 Tilted Mirror
If a mirror is tilted about its vertex by an angle $\beta$, as illustrated in Figure 3-13, the
image is displaced from the point $P'$ to a point $P''$. The object and image heights with
respect to the tilted axis of the mirror are given by
$$h_p = h - \beta S \tag{3-36}$$
and
$$h_p' = M_t h_p = h' - M_t \beta S . \tag{3-37}$$
The image displacement is given by
$$P'P'' = h_p' - (h' - \beta S') = \beta(S' - M_t S) = 2\beta S' , \tag{3-38}$$
where we have used Eqs. (3-12a) and (3-37). When a ray is incident at a certain angle on
the mirror (including a plane mirror), the reflected ray is tilted by an angle $2\beta$ when the
mirror is tilted by an angle $\beta$. Thus, the image is displaced by $2\beta S'$. Substituting for $S'$
in terms of $M_t$ and $R$ from Eqs. (3-5) and (3-12), Eq. (3-38) can also be written
$$P'P'' = (1 - M_t)\Delta , \tag{3-39}$$
where $\Delta = \beta R$ is the displacement of the center of curvature of the mirror. From Eqs.
(3-35) and (3-39), we note that the image displacement is determined by the displacement of
the center of curvature. If a mirror is decentered and tilted such that the center of
curvature is not displaced, then the image is also not displaced. Note that Eq. (3-39) does
not apply to a plane mirror because its center of curvature lies at infinity, and the
magnification of the image formed by it is unity.
Figure 3-13. Tilted mirror. When the mirror is tilted by an angle $\beta$, the image is
displaced by an amount $2\beta S'$.
3.6.3 Despaced Mirror
If a mirror is displaced along its axis by an amount $\Delta$, as illustrated in Figure 3-14,
the object distance changes from $S$ to $S - \Delta$, i.e., it changes by $\delta S = -\Delta$. According to
Eq. (3-17), the corresponding change in the image distance from the new position of the
mirror is given by $\delta S' = -M_t^2\,\delta S = M_t^2 \Delta$. Adding the displacement of the mirror to it,
the net image displacement is given by
$$P_0'P_0'' = (1 + M_t^2)\Delta . \tag{3-40}$$
Letting $M_t = 1$ for a plane mirror, the image displacement is twice that of the mirror, as
expected.
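The three image displacements of Section 3.6 can be collected into small helper functions. The sketch below is illustrative: the function names and the sample mirror (R = -20, S = -30, a tilt of 1 mrad) are assumptions. It also checks that the tilt result $2\beta S'$ of Eq. (3-38) agrees with the form $(1 - M_t)\beta R$ of Eq. (3-39).

```python
def decenter_displacement(M_t, delta):
    """Image displacement for a mirror decentered by delta, Eq. (3-35)."""
    return (1.0 - M_t) * delta

def tilt_displacement(S_prime, beta):
    """Image displacement for a mirror tilted about its vertex by beta (rad), Eq. (3-38)."""
    return 2.0 * beta * S_prime

def despace_displacement(M_t, delta):
    """Longitudinal image displacement for a mirror despaced by delta, Eq. (3-40)."""
    return (1.0 + M_t ** 2) * delta

# Hypothetical concave mirror and real object.
R, S = -20.0, -30.0
S_prime = 1.0 / (2.0 / R - 1.0 / S)   # Eq. (3-5)
M_t = -S_prime / S                    # Eq. (3-12a)

beta = 1.0e-3                         # assumed tilt angle in radians
tilt = tilt_displacement(S_prime, beta)
tilt_via_center = decenter_displacement(M_t, beta * R)   # Eq. (3-39) with delta = beta*R

# Plane-mirror limits (M_t = 1): no displacement for decenter, 2*delta for despace.
plane_decenter = decenter_displacement(1.0, 0.5)
plane_despace = despace_displacement(1.0, 0.5)
```

The agreement between the two tilt expressions reflects the statement above that the image displacement is determined by the displacement of the center of curvature.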
Figure 3-14. Despaced mirror. When the mirror is displaced longitudinally by an
amount $\Delta$, the image is displaced by an amount $(1 + M_t^2)\Delta$.
3.7 MISALIGNED TWO-MIRROR TELESCOPE
As a practical example, we apply the results of a misaligned mirror to a two-mirror
telescope when its secondary mirror is misaligned with respect to its primary mirror.
3.7.1 Decentered Secondary Mirror
Figure 3-15a shows a properly aligned two-mirror telescope. When the secondary
mirror is decentered by a small amount $\Delta$ along the x axis, as in Figure 3-15b, the image
is displaced by an amount $(1 - M_2)\Delta$, where $M_2$ is the transverse magnification of the
image formed by it.
3.7.2 Tilted Secondary Mirror
When the secondary mirror is tilted with respect to the primary mirror by an angle $\beta$,
as in Figure 3-15c, the rays reflected by it tilt by an angle $2\beta$. Thus, the image formed by
it is displaced by an amount $2\beta S_2'$, where $S_2'$ is the distance of the final image from the
secondary mirror $M_2$.
3.7.3 Despaced Secondary Mirror
The effect of a longitudinal displacement of the secondary mirror relative to the
primary mirror is to change the spacing $t$ between them. If the secondary mirror moves by
an amount $\delta t$, as in Figure 3-15d, the position of the image formed by the primary mirror
does not change. However, the distance of the object for the secondary mirror changes by
$\delta t$, and the distance of the image formed by it from its displaced position accordingly
changes by $M_2^2\,\delta t$. Accordingly, the final image is displaced by an amount $(1 + M_2^2)\,\delta t$.
Figure 3-15. Misalignments of a two-mirror telescope. (a) Aligned telescope.
(b) Secondary mirror decentered along the x axis by $\Delta$. (c) Secondary mirror tilted
in the zx plane by an angle $\beta$ so that its center of curvature is displaced from $C_2$ to
$C_2'$. (d) Secondary mirror despaced by $\delta t$.
3.8.1 Imaging by a Mirror
The imaging equations for a spherical mirror of radius of curvature R can be obtained
from the corresponding equations for a refracting surface by letting n = 1 because the
medium between the object and the mirror is air, and n′ = −1, representing a reflected
ray propagating backward. However, a mirror has only one principal point that coincides
with its vertex V, one focal point F′ that lies halfway between V and the center of
curvature C, and one nodal point that coincides with C (see Figures 3-16 and 3-17). A ray
incident parallel to the optical axis is reflected by the mirror passing through F′, a ray
incident in the direction of F′ is reflected parallel to the axis, and a ray incident in the
direction of C is reflected upon itself. The imaging equations are given by
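\[ \frac{1}{S'} + \frac{1}{S} = \frac{2}{R} = \frac{1}{f'} , \quad \text{(Gaussian Imaging Equation)} \tag{3-41a} \]
\[ M_t \equiv \frac{h'}{h} = -\frac{S'}{S} = -\frac{\beta_0}{\beta_0'} , \quad \text{(Transverse Magnification)} \tag{3-41b} \]
\[ M_\beta = -\frac{\beta_0'}{\beta_0} = -\frac{S}{S'} , \quad \text{(Angular Magnification)} \tag{3-41c} \]
\[ M_t M_\beta = 1 , \tag{3-41d} \]
\[ h'\beta_0' = -h\beta_0 , \quad \text{(Lagrange Invariance)} \tag{3-41e} \]
\[ M_l = \frac{\Delta S'}{\Delta S} = -\left(\frac{S'}{S}\right)^{2} = -M_t^{2} , \quad \text{(Longitudinal Magnification)} \tag{3-41f} \]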
Figure 3-16. Imaging by a mirror of radius of curvature R.

Figure 3-17. Angular magnification of rays.
\[ z z' = f'^{2} , \quad \text{(Newtonian Imaging Equation)} \tag{3-41g} \]
\[ M_t = \frac{h'}{h} = -\frac{z'}{f'} = -\frac{f'}{z} , \quad \text{(Transverse Magnification)} \tag{3-41h} \]
and
\[ R_p = R/2 = f' . \quad \text{(Petzval Radius of Curvature)} \tag{3-41i} \]
The negative sign in Eq. (3-41f) indicates, for example, that if the object is displaced
longitudinally towards the mirror, the image is displaced away from it.
If the mirror is decentered and/or tilted such that its center of curvature is displaced
by an amount Δ, the image is displaced by an amount (1 − Mt)Δ. The image
displacement is also equal to 2βS′ when the mirror is tilted by an angle β. If the mirror is
despaced by an amount Δ, the image is displaced by an amount (1 + Mt²)Δ.
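As a numerical illustration of the mirror equations, the following sketch solves the Gaussian imaging equation for the image distance and checks the result against the Newtonian form. The function name and the sample mirror (R = −20 cm, object 30 cm in front of it) are hypothetical choices, and the sign convention is the one used above, with distances to the left of the vertex negative.

```python
def mirror_image(S, R):
    """Return (S', Mt, Ml) for a spherical mirror from 1/S' + 1/S = 2/R."""
    f = R / 2.0                        # focal length f' = R/2
    if S == f:
        raise ValueError("object at the focal point: image at infinity")
    Sp = 1.0 / (1.0 / f - 1.0 / S)     # Gaussian imaging equation solved for S'
    Mt = -Sp / S                       # transverse magnification
    Ml = -Mt**2                        # longitudinal magnification
    return Sp, Mt, Ml


# Hypothetical concave mirror: R = -20 cm, real object 30 cm in front of it.
R, S = -20.0, -30.0
Sp, Mt, Ml = mirror_image(S, R)

# Newtonian check: object and image distances measured from the focal point F'.
f = R / 2.0
z, zp = S - f, Sp - f
print(f"S' = {Sp:.3f} cm, Mt = {Mt:.3f}, Ml = {Ml:.3f}, z z' = {z * zp:.3f}, f'^2 = {f * f:.3f}")
```

The negative transverse magnification indicates an inverted image, and the negative longitudinal magnification reproduces the statement that the image moves away from the mirror when the object moves toward it.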
3.8.2 Imaging by a Two-Mirror Telescope
The imaging equations for a two-mirror telescope, illustrated in Figure 3-18, with
mirrors of radii of curvature R1 and R2, focal lengths fi′ = Ri/2, and spaced a
(numerically negative) distance t apart are given by
\[ f' = \frac{f_1' f_2'}{f_1' - f_2' - t} , \quad \text{(Focal Length of the Telescope)} \tag{3-42a} \]
\[ h_1' = -\beta f_1' , \quad \text{(Primary Image Height)} \tag{3-42b} \]
\[ h_2' = \beta f' , \quad \text{(Image Height)} \tag{3-42c} \]
\[ M_2 = -f'/f_1' , \quad \text{(Secondary Magnification)} \tag{3-42d} \]

Figure 3-18. Imaging by a two-mirror telescope.
\[ F = f'/D_1 , \quad \text{(Focal Ratio of the Image-Forming Light Cone)} \tag{3-42e} \]
\[ S_2' = \frac{f_2'(f_1' - t)}{f_1' - t - f_2'} , \quad \text{(Image Distance from Secondary Mirror)} \tag{3-42f} \]
\[ S_2' + t , \quad \text{(Working Distance)} \tag{3-42g} \]
\[ \frac{1}{R_p} = 2\left(-\frac{1}{R_1} + \frac{1}{R_2}\right) = -\frac{1}{f_1'} + \frac{1}{f_2'} , \quad \text{(Petzval Radius of Curvature)} \tag{3-42h} \]
where D1 is the diameter of the primary mirror, and β is the field angle.
If the secondary mirror is decentered with respect to the primary mirror by an
amount Δ, the image is displaced by an amount (1 − M2)Δ. If it is tilted by an angle β,
the image is displaced by an amount 2βS2′. If it is despaced by an amount δt, the image
is displaced by an amount (1 + M2²) δt.
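The telescope relations of Eqs. (3-42) lend themselves to a direct numerical check. The sketch below is only an illustration: the function name, the dictionary keys, and the Cassegrain-like sample values are assumptions, not data from the text.

```python
def two_mirror_telescope(f1, f2, t, D1=None):
    """Gaussian parameters of a two-mirror telescope, following Eqs. (3-42).

    f1, f2 : focal lengths f1', f2' of the primary and secondary mirrors
    t      : mirror spacing (numerically negative in this sign convention)
    D1     : optional diameter of the primary mirror
    """
    f = f1 * f2 / (f1 - f2 - t)               # telescope focal length, Eq. (3-42a)
    S2p = f2 * (f1 - t) / (f1 - t - f2)       # image distance from secondary, Eq. (3-42f)
    result = {
        "f": f,
        "M2": -f / f1,                        # secondary magnification, Eq. (3-42d)
        "S2p": S2p,
        "working_distance": S2p + t,          # Eq. (3-42g)
        "petzval_curvature": -1.0 / f1 + 1.0 / f2,   # 1/Rp, Eq. (3-42h)
    }
    if D1 is not None:
        result["focal_ratio"] = f / D1        # Eq. (3-42e)
    return result


# Hypothetical Cassegrain-like values in meters (not taken from the text).
tel = two_mirror_telescope(f1=-2.0, f2=-0.5, t=-1.6, D1=0.5)
for key, value in tel.items():
    print(f"{key}: {value:.4f}")
```

As a consistency check, the secondary magnification obtained from −f′/f1′ agrees with −S2′/S2, where S2 = f1′ − t is the distance of the primary image from the secondary mirror.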
PROBLEMS
3.1 An object is placed at a distance of 15 cm from a concave mirror of radius of
curvature 10 cm. (a) Determine the location and magnification of the image. (b)
How does the image change if the object and the mirror are immersed in water? (c)
Repeat problem (a) for an object placed at the center of curvature of the mirror.
3.2 The right-hand side mirror of automobiles is inscribed with the words "objects are
closer than they appear." Determine the radius of curvature of the mirror if the ratio
of the distances is 1.2.
3.3 A Mangin mirror consists of a thin, negative meniscus lens with a silvered back
surface. Show that, if R1 and R2 are the radii of curvature of the lens and n is its
refractive index, the focal length of the Mangin mirror is given by
\[ \frac{1}{f_s'} = \frac{2n}{R_2} - \frac{2(n - 1)}{R_1} . \]
3.4 The Hubble space telescope is a Cassegrain telescope with a focal ratio of 24. Its
primary mirror has a diameter of 2.4 m and a focal ratio of 2.3. The spacing
between its two mirrors is 4.905 m. (a) Determine its focal length and illustrate it
on a schematic of the telescope. (b) Calculate its working distance. (c) Determine
the radius of curvature of the Petzval image surface.
3.5 Show that imaging by a thin converging lens of focal length f′ in contact with a
plane mirror is equivalent to a concave mirror of radius of curvature f′. If the lens
has a focal length of 15 cm, determine the image of an object lying at a distance of
10 cm from it (a) in the absence of the mirror and (b) when the mirror is present.
CHAPTER 4
PARAXIAL RAY TRACING
4.1 Introduction
4.2 Refracting Surface
4.3 General System
4.3.1 Determination of Cardinal Points
4.3.2 Combination of Two Systems
4.4 Thin Lens
4.5 Thick Lens
4.6 Two-Lens System
4.7 Reflecting Surface (Mirror)
4.8 Two-Mirror System
4.8.1 Focal Length
4.8.2 Obscuration
4.9 Catadioptric System: Thin-Lens–Mirror Combination
4.10 Two-Ray Lagrange Invariant
4.11.1 Ray-Tracing Equations
4.11.2 Thick Lens
4.11.3 Two-Lens System
4.11.5 Two-Ray Lagrange Invariant
Problems
4.1 INTRODUCTION
Paraxial ray tracing was introduced in Chapter 1 (see Section 1.7) and utilized in
Chapters 2 and 3 to show that an imaging system could be characterized by its principal
points and focal lengths. Although a system has six cardinal points, only three are
independent. If the refractive indices of the object and image spaces are equal, which is
often the case in practice, then the nodal points coincide with the corresponding principal
points, and the object- and image-space focal lengths are equal in magnitude. We showed
that the image of a point object formed by a system can be determined graphically by
tracing any two of the three specific object rays: a ray incident parallel to the optical axis
of the system and emerging from it passing through the image-space focal point; a ray
incident passing through its object-space focal point and emerging parallel to the optical
axis; and a ray incident passing through its object-space nodal point and emerging from
the system passing through its image-space nodal point. The point of intersection of these
rays in the image space determines the image point.
However, before any of these three rays can be traced, we must know the location of
the cardinal points. Of course, we need their locations in order to apply the Gaussian
imaging equations as well. In the case of a single refracting surface, the principal points
coincide with its vertex, and its nodal points coincide with its center of curvature. The
principal and the nodal points of a thin lens (in air) coincide with its center. In this
chapter, we develop the paraxial ray-tracing equations and demonstrate their utility by
determining the cardinal points of simple imaging systems. Starting at an object point, a
ray undergoes rectilinear propagation to the first surface of the system; it is refracted or
reflected at the surface, depending on whether it is a refracting or a reflecting surface; it
undergoes rectilinear propagation again until it reaches the next surface; and the process
repeats itself until the ray reaches the image plane.
We first develop the paraxial ray-tracing equations for a refracting surface and
demonstrate their utility by determining its focal length. Ray tracing of a general system
to determine its cardinal points is considered next. How to determine the cardinal points
of a combination of two systems is also discussed. As examples of simple systems, a thin
lens, a thick lens, and a two-lens system are considered. The ray-tracing equations for a
mirror are derived next and applied to a two-mirror system, and a catadioptric system
consisting of a thin lens and a mirror. Some of the results obtained in Chapters 2 and 3 on
these simple systems are rederived to gain familiarity with the use of the ray-tracing
equations. In practice, the ray-tracing equations are used to determine not only the
Gaussian properties of a system but also the size of the imaging elements and apertures,
vignetting of the rays, and obscurations in mirror systems. This is illustrated by
determining the obscuration ratio of a two-mirror system, and the relative size of its
secondary mirror and the hole in its primary mirror.
The Lagrange invariant discussed in Chapters 2 and 3 is generalized by describing it
in terms of the height and slope of two rays. The invariant in terms of the height and
slope of one ray can be obtained from it as a special case. We also show that given the
slope and height of a ray in a certain space of a system, its slope and height in any other
space can be obtained, without tracing it, from the slopes and heights of any two other
rays in that space. Two particular paraxial rays that are useful in determining the location
and size of the image of an object and that of the aperture stop of a system are the chief
and marginal rays, discussed in Chapter 5. Ultimately, for the final design and image
quality assessment, the rays have to be traced exactly, as discussed in Section 1.6.
Fortunately, this laborious task is routinely performed these days with the aid of
computers. Indeed, computer codes are commercially available for this purpose.
4.2 REFRACTING SURFACE
We now derive the ray-tracing equations for a refracting surface. As indicated in
Figure 4-1, consider a spherical refracting surface of radius of curvature R1 separating
media of refractive indices n0 and n1. An object ray A0A1 from a point object A0
incident at a point A1 on the refracting surface is refracted as a ray A1A2. Let x0, x1, and
x2 be the heights of the points A0, A1, and A2, respectively, from the optical axis VC,
where V is the vertex, and C is the center of curvature of the surface. Let t0 be the axial
distance of V (or the tangent plane passing through V in the Gaussian approximation) from
A0 and t1 be the distance of A2 from V. Note that per our sign convention, the object
distance S is equal to −t0 because it represents the distance from V.
It is evident from Figure 4-1 that for paraxial rays, rectilinear propagation from A0
to A1 gives
Figure 4-1. Ray tracing of a spherical refracting surface of radius of curvature R1
separating media of refractive indices n0 and n1.
\[ x_1 = x_0 + t_0 \beta_0 , \tag{4-1} \]
where β0 is the slope angle of the incident ray A0A1 from the optical axis. Equation (4-1)
is called the transfer ray-tracing equation, and, except for notation, it is the same as
Eq. (1-65). According to the paraxial form of Snell's law, refraction of the ray at A1 gives
\[ n_1 \theta_1 = n_0 \theta_0 , \tag{4-2} \]
where θ0 and θ1 are the angles of incidence and refraction (i.e., the angles of the incident
and refracted rays from the surface normal A1C at the point A1), respectively.
We note from Figure 4-1 that
\[ \phi = \beta_1 - \theta_1 \tag{4-3a} \]
and
\[ \theta_0 = \beta_0 - \phi , \tag{4-3b} \]
where
\[ \phi = -x_1 / R_1 \tag{4-3c} \]
is the angle the surface normal makes with the optical axis. Both φ and β1 are
numerically negative in the figure. Substituting for θ0, θ1, and φ from Eqs. (4-3) into
Eq. (4-2), we obtain
\[ n_1 \beta_1 = n_0 \beta_0 + (n_0 - n_1) \frac{x_1}{R_1} . \tag{4-4} \]
Equation (4-4) is called the refraction ray-tracing equation. It is the same as Eq. (1-66).
The rectilinear propagation of the refracted ray from A1 to A2 gives
\[ x_2 = x_1 + t_1 \beta_1 . \tag{4-5} \]
If the next surface lies at a distance t1 from the first, then A2 determines the point of
incidence on it. In that case the ray A1A2 is refracted at the point of incidence A2 by the
second surface according to an equation similar to Eq. (4-4), and the ray propagates
rectilinearly until it reaches the next surface. Using Eqs. (4-1) and (4-4) recursively, the
ray can be propagated to the image plane of a multisurface system.
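The recursive use of the transfer and refraction equations can be sketched in a few lines of code. The function names, the layout of the surface list as (R, n_after, t_after) tuples, and the sample surface are assumptions introduced here for illustration only.

```python
def transfer(x, beta, t):
    """Transfer equation, Eqs. (4-1)/(4-5): height after an axial distance t."""
    return x + t * beta


def refract(x, beta, n_in, n_out, R):
    """Refraction equation, Eq. (4-4): slope after a surface of radius R."""
    return (n_in * beta + (n_in - n_out) * x / R) / n_out


def trace(x0, beta0, n0, t0, surfaces):
    """Trace a paraxial ray through surfaces given as (R, n_after, t_after).

    Returns the ray height and slope after the final transfer.
    """
    x = transfer(x0, beta0, t0)                  # propagate to the first surface
    beta, n = beta0, n0
    for R, n_next, t_next in surfaces:
        beta = refract(x, beta, n, n_next, R)    # refract at the surface
        x = transfer(x, beta, t_next)            # propagate to the next plane
        n = n_next
    return x, beta


# Hypothetical single surface: R1 = 50, n0 = 1, n1 = 1.5, axial object 200 in front of it.
x2, beta1 = trace(x0=0.0, beta0=0.01, n0=1.0, t0=200.0, surfaces=[(50.0, 1.5, 300.0)])
print(x2, beta1)
```

For these sample values the ray returns to the axis in the plane 300 units behind the surface, which is where the axial image point lies.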
The ray tracing of a refracting surface is illustrated schematically in Figure 4-2a. A
ray starts at a height x0 from the optical axis with a slope angle β0 and propagates an
axial distance t0 to the refracting surface. The starting point is indicated by the
coordinates (x0, β0). The ray is incident on the surface at a height x1 and is refracted
with a slope β1. The point of incidence is indicated by (x1, β1), where x1 and β1 are
given by Eqs. (4-1) and (4-4), respectively. The height of a point x2 on the refracted ray
at a distance t1 from the refracting surface is given by Eq. (4-5).
Figure 4-2. Ray tracing of a spherical refracting surface. (a) General case. (b)
Determination of focal point F′.
As a simple example, we use the ray-tracing equations to determine the focal length
of the refracting surface. If we let β0 = 0, corresponding to a ray incident parallel to the
optical axis of the system, as in Figure 4-2b, and let x2 = 0, corresponding to the point of
intersection of the refracted ray with the optical axis, then the corresponding value of t1
gives the focal length f1′. Letting β0 = 0 in Eqs. (4-1) and (4-4), and x2 = 0 in Eq. (4-5),
we find that
\[ f_1' = \frac{n_1}{n_1 - n_0} R_1 , \tag{4-6} \]
in agreement with Eq. (2-7a).
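For reference, the intermediate steps leading to Eq. (4-6) can be written out explicitly. The following is only an expansion of the substitution described above, using the same symbols.

```latex
\begin{align*}
\beta_0 = 0 &\;\Rightarrow\; x_1 = x_0 , \qquad
  n_1 \beta_1 = (n_0 - n_1)\,\frac{x_0}{R_1} , \\
x_2 = x_1 + t_1 \beta_1 = 0 &\;\Rightarrow\;
  t_1 = -\frac{x_1}{\beta_1} = \frac{n_1 R_1}{n_1 - n_0} \equiv f_1' .
\end{align*}
```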
It should be evident that the position and height of the image of an object formed by
a refracting surface can also be obtained by using the ray-tracing equations. Thus, for
example, consider a ray from an axial point object P0 with a slope angle β0, as illustrated
in Figure 4-3a. Letting x0 = 0 in Eq. (4-1), we obtain
Figure 4-3. Imaging of a point object by a refracting surface. (a) On-axis point
object P0. (b) Off-axis point object P.
\[ x_1 = t_0 \beta_0 , \tag{4-7} \]
as is evident from the figure. Substituting Eq. (4-7) into Eq. (4-4), the slope β1 of the
refracted ray is given by
\[ n_1 \beta_1 = n_0 \beta_0 + (n_0 - n_1) \frac{t_0 \beta_0}{R_1} . \tag{4-8} \]
Substituting for β1 from Eq. (4-8) into Eq. (4-5) and letting x2 = 0, we obtain the
distance t1 of the axial image point P0′:
\[ t_1 = -\frac{x_1}{\beta_1} = -\frac{n_1 t_0}{n_0 + (n_0 - n_1)(t_0 / R_1)} , \]
or
\[ \frac{n_1}{t_1} + \frac{n_0}{t_0} = \frac{n_1 - n_0}{R_1} . \tag{4-9} \]
To obtain the height of the image P′ of an off-axis point object P, we consider an
object ray at a height x0 with a slope angle β0, as illustrated in Figure 4-3b. The ray
height at the surface is given by Eq. (4-1). The slope of the refracted ray is given by Eq.
(4-4). Substituting Eq. (4-4) into Eq. (4-5), we obtain the height x2 of the image point P′
at an axial distance t1:
\begin{align}
x_2 &= x_1 + \frac{t_1}{n_1}\left[ n_0 \beta_0 + (n_0 - n_1)\frac{x_1}{R_1} \right] \notag \\
    &= x_1 + \frac{t_1}{n_1}\left[ n_0 \beta_0 - \left( \frac{n_1}{t_1} + \frac{n_0}{t_0} \right) x_1 \right] \notag \\
    &= \frac{t_1}{n_1}\left( n_0 \beta_0 - \frac{n_0}{t_0} x_1 \right) , \tag{4-10}
\end{align}
where we have used Eq. (4-9). Substituting for x1 from Eq. (4-1), we obtain
\[ \frac{x_2}{x_0} = -\frac{n_0 t_1}{n_1 t_0} . \tag{4-11a} \]
Writing t0 and t1 in terms of the ray heights and slope angles, we obtain
\[ \frac{x_2}{x_0} = \frac{n_0 \beta_0}{n_1 \beta_1} . \tag{4-11b} \]
Except for notation, Eqs. (4-9) and (4-11a) are the same as Eqs. (2-4) and (2-12a),
respectively. Similarly, Eq. (4-11b) is the same as Eq. (2-17). Note that t0 = −S because
S is the distance of the object from the refracting surface, and t0 is the distance a ray
propagates from the object to the refracting surface.
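A quick numerical experiment illustrates that Eqs. (4-9) and (4-11a) describe a true image: rays leaving an off-axis object point with different slopes all arrive at the same height in the image plane. The function names and the sample surface below are hypothetical.

```python
def image_by_surface(t0, n0, n1, R1):
    """Image distance t1 from Eq. (4-9) and magnification x2/x0 from Eq. (4-11a)."""
    t1 = n1 / ((n1 - n0) / R1 - n0 / t0)
    m = -n0 * t1 / (n1 * t0)
    return t1, m


def trace_height(x0, beta0, t0, n0, n1, R1, t1):
    """Height x2 in the plane at t1, obtained from Eqs. (4-1), (4-4), and (4-5)."""
    x1 = x0 + t0 * beta0                                # transfer to the surface
    beta1 = (n0 * beta0 + (n0 - n1) * x1 / R1) / n1     # refraction
    return x1 + t1 * beta1                              # transfer to the image plane


# Hypothetical surface: R1 = 50, n0 = 1, n1 = 1.5, object plane 200 in front of it.
t1, m = image_by_surface(t0=200.0, n0=1.0, n1=1.5, R1=50.0)
heights = [trace_height(1.0, b, 200.0, 1.0, 1.5, 50.0, t1) for b in (-0.01, 0.0, 0.01)]
print(t1, m, heights)
```

All three traced heights coincide with m times the object height, independent of the starting slope, which is the defining property of the Gaussian image plane.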
4.3 GENERAL SYSTEM
In this section we illustrate how the cardinal points of a system may be determined
from its design parameters by using the transfer and refraction ray-tracing equations. We
also show how the cardinal points of a combination of two systems can be determined
from the cardinal points of the individual systems.
4.3.1 Determination of Cardinal Points

By recursive application of the transfer and refraction ray-tracing equations (4-1) and
(4-4), a ray can be traced through a multielement system. Consider, for example, as
illustrated in Figure 4-4, a ray incident on the system from an axial point object P0 with a
slope angle β0 at a distance S from the object-space principal plane. The ray emerges
from the system with a slope angle β0′ intersecting the optical axis at the image point P0′
at a distance S′ from the image-space principal plane. The image distance is given by the
imaging equation (2-72):
\[ \frac{n'}{S'} - \frac{n}{S} = \frac{n'}{f'} = -\frac{n}{f} , \tag{4-12} \]

Figure 4-4. Image location by ray tracing.
where n and n′ are the refractive indices of the object and image spaces, and f and f′ are
the focal lengths of the system in those spaces. By multiplying both sides of Eq. (4-12) by
the height x of the ray at the principal planes and noting that β0 = −x/S and β0′ = −x/S′,
we can write it in terms of the slope angles of the rays:
\[ n' \beta_0' - n \beta_0 = -\frac{x n'}{f'} . \tag{4-13} \]
As illustrated in Figure 4-5, the focal length of the system can be determined by
considering a ray incident parallel to the optical axis at a certain height and determining
the point of intersection F′ of the emergent ray with the optical axis. Thus, as S → −∞
or β0 → 0, then P0′ → F′, the image-space focal point. Letting β0 = 0 in Eq. (4-13), we
find that
\[ f' = -\frac{x}{\beta_0'} . \tag{4-14} \]
If xj is the height of the emergent ray at the last surface of the system, then the distance t
of the focal point F′ from the vertex Vj of the surface is given by (see Figure 4-5)
\[ t = -\frac{x_j}{\beta_0'} . \tag{4-15} \]

Figure 4-5. Determination of focal point F′. Only the first and the last surface of the
system are shown.
Of course, the incident and the emergent rays intersect each other at a point in the image-
space principal plane whose intersection with the optical axis locates the corresponding
principal point H′. Because H′F′ = f′, the point H′ is located once the focal point F′
and the focal length f′ are known. The distance of H′ from the vertex Vj of the last
surface may be written
\[ V_j H' = \frac{x - x_j}{\beta_0'} , \tag{4-16} \]
as is evident from Figure 4-5. The object-space focal point F and the principal point H
can be determined in a similar manner by tracing a ray incident parallel to the axis from
right to left.
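The procedure of this section can be sketched as a short routine that traces a single ray parallel to the axis through a list of refracting surfaces and then applies Eqs. (4-14) to (4-16). The function name, the (R, n_after, t_after) surface layout, and the sample lens are assumptions made for illustration.

```python
def image_space_cardinal_points(surfaces, n0=1.0, x=1.0):
    """Trace a ray parallel to the axis and return (f', V_jF', V_jH').

    surfaces: list of (R, n_after, t_after); t_after of the last surface is ignored.
    """
    beta, n, h = 0.0, n0, x
    for i, (R, n_next, t_next) in enumerate(surfaces):
        beta = (n * beta + (n - n_next) * h / R) / n_next   # refraction, Eq. (4-4)
        n = n_next
        if i < len(surfaces) - 1:
            h = h + t_next * beta                           # transfer, Eq. (4-1)
    f = -x / beta          # Eq. (4-14): focal length
    vf = -h / beta         # Eq. (4-15): distance of F' from the last vertex
    vh = (x - h) / beta    # Eq. (4-16): distance of H' from the last vertex
    return f, vf, vh


# Hypothetical biconvex lens in air: R1 = 100, R2 = -100, n = 1.5, thickness 10.
f, vf, vh = image_space_cardinal_points([(100.0, 1.5, 10.0), (-100.0, 1.0, 0.0)])
print(f"f' = {f:.3f}, VjF' = {vf:.3f}, VjH' = {vh:.3f}")
```

The same routine applied to the system reversed left to right (surfaces in reverse order with the signs of the radii changed) would give the object-space focal and principal points.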
4.3.2 Combination of Two Systems

If two systems with known cardinal points are combined, it is easy to determine the
cardinal points of the combined system. Let (H1, H1′) and (H2, H2′) be the principal
points of the two systems separated by a distance t between H1′ and H2, as illustrated in
Figure 4-6. Let the refractive indices of the object and image spaces of the combined
system be n0 and n2, and let the refractive index of the space between the two systems
be n1. Let the image-space focal lengths of the two systems be f1′ and f2′.
To determine the image-space focal length f′ of the combined system, we trace a
ray incident parallel to the common optical axis at a certain height x1 and determine the
focal point F′ of the system as the point of intersection of the emergent ray with the
optical axis.

Figure 4-6. Combination of two systems. The quantity t ≡ H1′H2 represents the
separation of the principal planes of the two systems of focal lengths f1′ and f2′.

From Eq. (4-13), the slope angle β1 of the ray incident on the second system
is given by
\[ \beta_1 = -x_1 / f_1' . \tag{4-17} \]
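The height x2 of the ray incident on the second system is given by
\begin{align}
x_2 &= x_1 + t \beta_1 \notag \\
    &= x_1 (1 - t / f_1') . \tag{4-18}
\end{align}
The slope angle β2 of the ray emerging from the second system is given by
\begin{align}
n_2 \beta_2 &= n_1 \beta_1 - n_2 x_2 / f_2' \notag \\
            &= -x_1 \left( \frac{n_1}{f_1'} + \frac{n_2}{f_2'} - \frac{n_2 t}{f_1' f_2'} \right) . \tag{4-19}
\end{align}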
It is evident from the figure that
\[ \beta_2 = -x_1 / f' . \tag{4-20} \]
Comparing Eqs. (4-19) and (4-20), we find that
\[ \frac{n_2}{f'} = \frac{n_1}{f_1'} + \frac{n_2}{f_2'} - \frac{n_2 t}{f_1' f_2'} , \tag{4-21a} \]
or, in terms of the powers Ki = ni/fi′ of the systems,
\[ K = K_1 + K_2 - \frac{t}{n_1} K_1 K_2 . \tag{4-21b} \]
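The principal point H′ of the system is located by considering its distance from H2′,
which is given by
\begin{align}
H_2' H' &= (x_1 - x_2) / \beta_2 \notag \\
        &= -t \, (\beta_1 / \beta_2) \notag \\
        &= -t \, (f' / f_1') \tag{4-22a} \\
        &= -t \, \frac{n_2}{n_1} \frac{K_1}{K} . \tag{4-22b}
\end{align}
Similarly, it can be shown that the location of the principal point H is given by
\begin{align}
H_1 H &= t \, \frac{n_0 f'}{n_1 f_2'} \tag{4-23a} \\
      &= t \, \frac{n_0}{n_1} \frac{K_2}{K} . \tag{4-23b}
\end{align}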
Equations (4-22) and (4-23) can be used to obtain equations for a thick lens (of which a
thin lens is a special case) or two thin lenses. However, we will use the results for a
refracting surface recursively to gain some additional insight.
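The combination formulas can be evaluated directly. The sketch below uses Eqs. (4-21b), (4-22a), and (4-23a); the function name and the example of two thin lenses in air are illustrative assumptions.

```python
def combine_two_systems(f1, f2, t, n0=1.0, n1=1.0, n2=1.0):
    """Focal length and principal points of two combined systems.

    f1, f2 : image-space focal lengths f1', f2' of the two systems
    t      : separation H1'H2 of the adjacent principal planes
    Returns (f', H2'H', H1H).
    """
    K1, K2 = n1 / f1, n2 / f2              # powers K_i = n_i / f_i'
    K = K1 + K2 - (t / n1) * K1 * K2       # Eq. (4-21b)
    f = n2 / K                             # image-space focal length of the combination
    H2pHp = -t * f / f1                    # Eq. (4-22a)
    H1H = t * n0 * f / (n1 * f2)           # Eq. (4-23a)
    return f, H2pHp, H1H


# Hypothetical example: two thin lenses in air, f1' = 100, f2' = 50, t = 30.
f, H2pHp, H1H = combine_two_systems(100.0, 50.0, 30.0)
print(f"f' = {f:.3f}, H2'H' = {H2pHp:.3f}, H1H = {H1H:.3f}")
```

For two positive lenses the image-space principal point falls to the left of the second lens and the object-space principal point to the right of the first, so the principal planes are "crossed".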
Equation (4-21a) gives the focal length of the combined system in terms of the
separation t ≡ H1′H2 of the principal planes of the two individual systems. In a
microscope, a standardized quantity of interest is its tube length L. It represents the
separation F1′F2 of the focal planes of the individual systems, called the objective and the
eyepiece. The object-space focal point F2 of the eyepiece lies to the right of the image-
space focal point F1′ of the objective, as illustrated in Figure 4-7. Let f1′ and f2′ be the
image-space focal lengths of the objective and the eyepiece, respectively. From the
figure, we find that
\[ L = t - f_1' + f_2 , \tag{4-24} \]
where f2 is the object-space focal length of the eyepiece. Although n0 may be different
from unity, as in an oil immersion microscope, both n1 and n2 are equal to unity. Thus,
f2 = −f2′, and Eq. (4-24) may be written
\[ L = t - f_1' - f_2' . \tag{4-25} \]
Substituting for t in terms of L from Eq. (4-25) into Eq. (4-21a) and letting n1 = n2 = 1,
we obtain
\[ f' = -\frac{f_1' f_2'}{L} . \tag{4-26} \]
Figure 4-7. Schematic of the principal and focal points of a microscope and its
objective and eyepiece.
In terms of the powers Ki = 1/fi′ of the objective and the eyepiece, we can write
\[ K = -K_1 K_2 L . \tag{4-27} \]
The object-space focal length of the microscope is given by f = −n0 f′. A standard value
of L is 16 cm, although some manufacturers have used other values.
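As a check on the microscope result, the focal length can be computed directly from Eq. (4-21a) after expressing the principal-plane separation through the tube length with Eq. (4-25). The function name and the objective and eyepiece focal lengths below are hypothetical; only the 16 cm tube length is quoted from the text.

```python
def microscope_focal_length(f1, f2, L):
    """Focal length of a microscope from its tube length L.

    The principal-plane separation follows from Eq. (4-25), t = L + f1' + f2',
    and the focal length from Eq. (4-21a) with n1 = n2 = 1.
    """
    t = L + f1 + f2
    return 1.0 / (1.0 / f1 + 1.0 / f2 - t / (f1 * f2))


# Hypothetical objective and eyepiece focal lengths in cm, with L = 16 cm.
f = microscope_focal_length(f1=1.6, f2=2.5, L=16.0)
print(f"f' = {f:.4f} cm")
```

The computed focal length is negative and equal in magnitude to f1′f2′/L, consistent with the image-space focal length being marked as numerically negative in Figure 4-7.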
4.4 THIN LENS
In the case of a thin lens, the refraction of an incident ray takes place at its two
surfaces that have a negligible spacing between them. This is illustrated schematically in
Figure 4-8a. A ray starts at point (x0, β0) in a medium of refractive index n0 and travels a
distance t0 to the first surface. The point of incidence is (x1, β1) on the first surface and
(x2, β2) on the second. The lens has a refractive index n1 and a thickness t1 that is
negligible. The ray ends at a height x3 from the optical axis at a distance t2 from the
second surface in a medium of refractive index n2.
Figure 4-8. Ray tracing of a thin lens. (a) General case. (b) Object at infinity. (c)
Simplified ray tracing of a thin lens where the lens thickness t1 is neglected.
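Now, we apply Eqs. (4-1) and (4-4) recursively to obtain the focal length of a thin
lens of refractive index n1 and spherical surfaces of radii of curvature R1 and R2, with
media of refractive indices n0 and n2 on its object and image sides, respectively. The
paraxial ray-tracing equations for the lens to determine its focal point F′ and focal
length f′ may be written as follows (see Figure 4-8b):
\begin{align}
x_1 &= x_0 + t_0 \beta_0 \tag{4-28a} \\
    &= x_0 \quad \text{for a ray incident parallel to the optical axis,} \tag{4-28b} \\
n_1 \beta_1 &= n_0 \beta_0 + (n_0 - n_1) \frac{x_1}{R_1} \tag{4-29a} \\
            &= (n_0 - n_1) \frac{x_0}{R_1} , \tag{4-29b} \\
x_2 &= x_1 + t_1 \beta_1 \tag{4-30a} \\
    &= x_1 \quad \text{because we neglect } t_1 \tag{4-30b} \\
    &= x_0 , \tag{4-30c} \\
n_2 \beta_2 &= n_1 \beta_1 + (n_1 - n_2) \frac{x_2}{R_2} , \tag{4-31}
\end{align}
or, substituting for n1β1 from Eq. (4-29b),
\[ n_2 \beta_2 = \left( \frac{n_0 - n_1}{R_1} + \frac{n_1 - n_2}{R_2} \right) x_0 , \]
\begin{align}
x_3 &= x_2 + t_2 \beta_2 \tag{4-32} \\
    &= x_0 + t_2 \beta_2 \notag \\
    &= 0 \quad \text{for the focal point } F' , \notag
\end{align}
and
\[ \frac{n_2}{f'} \equiv \frac{n_2}{t_2} = \frac{n_1 - n_0}{R_1} + \frac{n_2 - n_1}{R_2} . \tag{4-33} \]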
Except for the notation, Eq. (4-33) is the same as may be obtained from Eqs. (2-56).
For a lens surrounded by the same medium on both sides, e.g., air or water, we let
n2 = n0 in Eq. (4-33) and obtain
\[ \frac{1}{f'} = \frac{n_1 - n_0}{n_0} \left( \frac{1}{R_1} - \frac{1}{R_2} \right) . \tag{4-34} \]
Substituting Eqs. (4-29a) and (4-30b) into Eq. (4-31) with n2 = n0, and utilizing
Eq. (4-34), we obtain
\[ \beta_2 = \beta_0 - \frac{x_1}{f'} . \tag{4-35} \]
Referring to Figure 4-8c, where a ray incident on the lens is shown refracted by it in
one step (rather than in two, as in Figures 4-8a and 4-8b), the ray-tracing Eqs. (4-28a),
(4-35), and (4-32) for a thin lens of image-space focal length f1′ may be written
\[ x_1 = x_0 + t_0 \beta_0 , \tag{4-36} \]
\[ \beta_1 = \beta_0 - \frac{x_1}{f_1'} , \tag{4-37} \]
and
\[ x_2 = x_1 + t_1 \beta_1 , \tag{4-38} \]
respectively. We note that if x1 = 0, then β1 = β0, showing that a ray incident in the
direction of the center of the lens emerges from it undeviated. Accordingly, the principal
and nodal points of a thin lens coincide at its center.
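The one-step thin-lens equations (4-36) to (4-38) translate directly into code. In the sketch below the function name and the sample lens are hypothetical, and the image distance is taken from the Gaussian thin-lens equation written with the propagation distances, 1/t1 + 1/t0 = 1/f1′ (with t0 = −S).

```python
def thin_lens_trace(x0, beta0, t0, f, t1):
    """Apply Eqs. (4-36) to (4-38): transfer, thin-lens refraction, transfer."""
    x1 = x0 + t0 * beta0          # height at the lens, Eq. (4-36)
    beta1 = beta0 - x1 / f        # slope after the lens, Eq. (4-37)
    x2 = x1 + t1 * beta1          # height a distance t1 behind the lens, Eq. (4-38)
    return x1, beta1, x2


# Hypothetical lens of focal length 50 with an axial object 150 in front of it.
f, t0 = 50.0, 150.0
t1 = 1.0 / (1.0 / f - 1.0 / t0)   # image distance behind the lens

# A ray from the axial object point returns to the axis in the image plane.
_, _, x2_axial = thin_lens_trace(0.0, 0.02, t0, f, t1)

# A ray aimed at the lens center (x1 = 0) keeps its slope.
_, beta_center, _ = thin_lens_trace(0.0, 0.02, 0.0, f, t1)
print(t1, x2_axial, beta_center)
```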
4.5 THICK LENS
Now we consider a thick lens of refractive index n, thickness t, and surfaces with
radii of curvature R1 and R2, and determine its focal length by recursive application of
the ray-tracing equations (4-1) and (4-4) for transfer and refraction at a refracting
surface, respectively. With reference to Figure 4-9 and noting that n0 = 1, n1 = n, and
n2 = 1, we proceed as follows by considering a ray incident on the lens from left to right,
parallel to its axis, so that β0 = 0:
\begin{align}
x_1 &= x_0 + t_0 \beta_0 \notag \\
    &= x_0 \quad \text{for a ray incident parallel to the optical axis,} \notag \\
n_1 \beta_1 &= n_0 \beta_0 + (n_0 - n_1) \frac{x_1}{R_1} , \tag{4-39} \\
n \beta_1 &= (1 - n) \frac{x_0}{R_1} , \notag \\
x_2 &= x_1 + t_1 \beta_1 \notag \\
    &= \left( 1 - t \, \frac{n - 1}{n R_1} \right) x_0 , \notag
\end{align}
Figure 4-9. Ray tracing of a thick lens of refractive index n and thickness t. C1 and
C2 are the centers of curvature of the surfaces of the lens with vertices V1 and V2
and radii of curvature R1 and R2, respectively.
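\[ n_2 \beta_2 = n_1 \beta_1 + (n_1 - n_2) \frac{x_2}{R_2} , \tag{4-40} \]
or
\[ \beta_2 = \left[ \frac{1 - n}{R_1} + \frac{n - 1}{R_2} \left( 1 - t \, \frac{n - 1}{n R_1} \right) \right] x_0 , \]
and
\[ \frac{1}{f'} = -\frac{\beta_2}{x_0} , \]
or
\[ \frac{1}{f'} = (n - 1) \left( \frac{1}{R_1} - \frac{1}{R_2} \right) + \frac{t (n - 1)^{2}}{n R_1 R_2} . \tag{4-41} \]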
Because the medium surrounding the lens is air, the refractive index of the image space is
unity. Thus, f′ is also the equivalent focal length of the lens. Equation (4-41) is the
lensmaker's formula. It reduces to Eq. (2-28) for the focal length of a thin lens when the
term containing the thickness t is neglected. In that case, the thickness t is kept as small as
possible so that the lens can be fabricated, yet the term containing it can be neglected.
Similarly, letting x2 = x1 and substituting for β1 from Eq. (4-39) into Eq. (4-40), we
obtain Eq. (4-37) for a thin lens.
The image-space focal point F′ is located by letting x3 = 0. Thus,
\begin{align*}
x_3 &= x_2 + t_2 \beta_2 \\
    &= 0 , \\
t_2 &= -\frac{x_2}{\beta_2} ,
\end{align*}
or
\[ t_2 = f' \left( 1 - t \, \frac{n - 1}{n R_1} \right) . \tag{4-42} \]
The quantity t2 ≡ V2F′, where V2 is the vertex of the second surface, locates the focal
point F′ and represents the image-space focal distance. A positive value of t2 implies
that the focal point F′ lies to the right of V2. The principal point H′ is located by noting
that H′F′ = f′. A positive value of f′ implies a converging or a positive lens, implying
that H′ lies to the left of F′. It lies at a distance
\begin{align}
V_2 H' &= t_2 - f' \notag \\
       &= -t f' \, \frac{n - 1}{n R_1} \tag{4-43}
\end{align}
from V2, and a negative value indicates that H′ lies to the left of V2.
The object-space focal point F and the principal point H can be determined in a
similar manner by considering a ray incident parallel to the axis from right to left. Thus,
we can show that the distance of the focal point F from the vertex V1 of the first surface is
given by
\[ V_1 F = f \left( 1 + t \, \frac{n - 1}{n R_2} \right) , \tag{4-44} \]
where f = −f′ is the object-space focal length of the lens. The distance of the principal
point H from the vertex V1 is given by
\[ V_1 H = -t f' \, \frac{n - 1}{n R_2} , \tag{4-45} \]
and a positive value implies that H lies to the right of V1. The distance of H′ from H is
given by
\begin{align}
HH' &= t - (V_1 H + H' V_2) \notag \\
    &= t \left[ 1 - f' \, \frac{n - 1}{n} \left( \frac{1}{R_1} - \frac{1}{R_2} \right) \right] \tag{4-46a} \\
    &\sim \frac{n - 1}{n} \, t \tag{4-46b} \\
    &= t / 3 , \tag{4-46c}
\end{align}
where we have used the thin lens formula for the focal length in obtaining Eq. (4-46b)
and n = 1.5 in further obtaining Eq. (4-46c). Thus, unless the lens is very thick, the
separation of its principal points is approximately equal to one-third of its thickness
independent of its radii of curvature. As expected in the limit of a thin lens, the principal
points coincide with its center.
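The thick-lens results can be gathered into one routine for numerical work. The sketch below evaluates Eqs. (4-41) to (4-46a); the function name, the dictionary keys, and the sample biconvex lens are assumptions introduced for illustration.

```python
def thick_lens(n, R1, R2, t):
    """Cardinal-point distances of a thick lens in air, Eqs. (4-41) to (4-46a)."""
    K = (n - 1.0) * (1.0 / R1 - 1.0 / R2) + t * (n - 1.0) ** 2 / (n * R1 * R2)
    f = 1.0 / K                                    # Eq. (4-41)
    V2F = f * (1.0 - t * (n - 1.0) / (n * R1))     # Eq. (4-42): image-space focal distance
    V2H = -t * f * (n - 1.0) / (n * R1)            # Eq. (4-43): V2H'
    V1F = -f * (1.0 + t * (n - 1.0) / (n * R2))    # Eq. (4-44) with f = -f'
    V1H = -t * f * (n - 1.0) / (n * R2)            # Eq. (4-45)
    HH = t - (V1H - V2H)                           # Eq. (4-46a), using H'V2 = -V2H'
    return {"f": f, "V2F'": V2F, "V2H'": V2H, "V1F": V1F, "V1H": V1H, "HH'": HH}


# Hypothetical biconvex lens: n = 1.5, R1 = 100, R2 = -100, thickness 10.
lens = thick_lens(n=1.5, R1=100.0, R2=-100.0, t=10.0)
for key, value in lens.items():
    print(f"{key}: {value:.4f}")
```

For this moderately thick lens the principal-point separation comes out close to one-third of the thickness, in line with the estimate of Eq. (4-46c).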
As illustrated in Figure 4-10, it is interesting to explore the variation in the positions
of the principal and focal points of a thick lens with increasing thickness. In this figure,
the magnitudes of the radii of curvature of its two surfaces are assumed to be equal, i.e.,
|R1| = |R2|. Figure 4-10a shows the thin-lens approximation of a thick lens. Accordingly,
the principal points coincide. In practice, there is some spacing between them, as
indicated in Figure 4-10b. Equation (4-46a) shows that the principal points coincide if the
two surfaces of the thick lens are concentric, as shown in Figure 4-10c. The focal length
in this case is given by f′ = nR/[2(n − 1)], where R = |R1| = |R2|. According to Eq. (4-42),
the image-space focal point F′ lies at the back vertex V2, as in Figure 4-10d, if
t = nR1/(n − 1). Its focal length in that case is given by f′ = R1/(n − 1), which is
independent of the value of R2, showing that the second refracting surface has no effect
on the image formed at its vertex. Similarly, according to Eq. (4-44), if t = −nR2/(n − 1),
the object-space focal point F lies at the front vertex V1, and the focal length of the lens
is given by f′ = −R2/(n − 1), independent of the value of R1. When F lies at V1 in
Figure 4-9, |R1| = |R2| and F′ lies at V2.
If the thickness is increased to t = n(R1 − R2)/(n − 1), the focal length approaches
infinity, i.e., the lens becomes afocal. Because R1 = −R2 in the figure, F lies at V1 when
F′ lies at V2. As illustrated in Figure 4-10e, parallel rays incident on the lens are focused
inside it and emerge from it as parallel rays. The corresponding principal and focal points
lie at infinity on the opposite sides of the lens. If the thickness of the lens is increased
further, the principal points lie farther from the respective vertices than the corresponding
focal points, as shown in Figure 4-10f, thus giving it a negative image-space focal length
f′ even though its shape is biconvex. In this case, the term in Eq. (4-41) containing the
thickness t is numerically negative and larger in magnitude than the other term.
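The thickness regimes described above can be reproduced numerically from Eq. (4-41). In this sketch the function name and the equiconvex sample lens are hypothetical; the special thicknesses are the ones quoted in the text.

```python
def power(n, R1, R2, t):
    """Power 1/f' of a thick lens in air, Eq. (4-41)."""
    return (n - 1.0) * (1.0 / R1 - 1.0 / R2) + t * (n - 1.0) ** 2 / (n * R1 * R2)


n, R1, R2 = 1.5, 100.0, -100.0           # hypothetical equiconvex lens
t_concentric = R1 - R2                    # surfaces share one center (a ball of radius 100)
t_back_vertex = n * R1 / (n - 1.0)        # F' lies at the back vertex V2
t_afocal = n * (R1 - R2) / (n - 1.0)      # power vanishes

for t in (t_concentric, t_back_vertex, t_afocal, 1.5 * t_afocal):
    K = power(n, R1, R2, t)
    label = "afocal" if abs(K) < 1e-12 else f"f' = {1.0 / K:.1f}"
    print(f"t = {t:6.1f}: {label}")
```

Beyond the afocal thickness the power changes sign, which corresponds to the biconvex lens with a negative focal length in Figure 4-10f.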
4.6 TWO-LENS SYSTEM
In Section 2.4.9, we considered a system with two thin lenses in air spaced a certain
distance apart, and determined its focal length as well as its principal and focal points.
We now revisit this problem by way of ray tracing.
Consider two thin lenses L1 and L2 of image-space focal lengths f1′ and f2′
separated by a distance t1, as illustrated in Figure 4-11. Using Eqs. (4-36) and (4-37)
recursively, we can obtain the focal points and the principal points of the combined
imaging system as follows:
\begin{align*}
x_1 &= x_0 + t_0 \beta_0 \\
    &= x_0 \quad \text{for a ray incident parallel to the optical axis,}
\end{align*}

Figure 4-10. The principal and focal points of a thick lens of increasing thickness.
The magnitudes of the radii of curvature of its two surfaces are assumed to be equal
in the figure. (a) Thin lens. (b) Thick lens. (c) Concentric lens. (d) Thick lens such
that the image-space focal point $F'$ lies at the back vertex $V_2$. (e) Afocal thick lens.
(f) Convex thick lens with a negative image-space focal length $f'$.
$$\beta_1 = \beta_0 - \frac{x_1}{f_1'} = -\frac{x_0}{f_1'},$$
$$x_2 = x_1 + t_1\beta_1 = x_0\left(1 - \frac{t_1}{f_1'}\right),$$
Figure 4-11. Ray tracing of a two-lens system to determine its image-space focal
point $F'$ and principal point $H'$.
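$$\beta_2 = \beta_1 - \frac{x_2}{f_2'} = -x_0\left(\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1'f_2'}\right),$$
$$\frac{1}{f'} = -\frac{\beta_2}{x_0},$$
or
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1'f_2'}, \tag{4-47}$$
$$x_3 = x_2 + t_2\beta_2 = 0 \quad \text{for the right focal point},$$
and
$$t_2 = -\frac{x_2}{\beta_2},$$
or
$$t_2 = f'\left(1 - \frac{t_1}{f_1'}\right). \tag{4-48}$$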
Equation (4-47) may also be obtained from Eq. (4-21a) by letting the refractive index of
the object and image spaces of the system be equal to unity, i.e., by letting $n_1 = n_2 = 1$.
The quantity $t_2$, called the image-space focal distance, locates the image-space focal
point $F'$. The principal point $H'$ is located by noting that $H'F' = f'$. The object-space
focal point $F$ and principal point $H$ can be determined in a similar manner by considering
a ray incident parallel to the axis from right to left. We find that $F$ lies at a distance
$f(1 - t_1/f_2')$ from lens $L_1$, where $f = -f'$ because the lenses are in air. Such a distance
of the focal point $F$ from the vertex of the first element of a system is called its
object-space focal distance.
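The recursive trace above can be sketched numerically as follows. This is a minimal illustration rather than a general ray tracer: the helper names `transfer` and `thin_lens` and the values of $f_1'$, $f_2'$, and $t_1$ are assumptions chosen only for the example.

```python
def transfer(x, beta, t):
    """Rectilinear propagation over an axial distance t: x -> x + t*beta."""
    return x + t * beta, beta


def thin_lens(x, beta, f):
    """Paraxial refraction by a thin lens in air: beta -> beta - x/f."""
    return x, beta - x / f


# Assumed illustrative values (same length units throughout).
f1, f2, t1 = 10.0, 5.0, 3.0
x0, beta0 = 1.0, 0.0            # ray incident parallel to the optical axis

x1, b1 = thin_lens(x0, beta0, f1)   # x1 = x0 because beta0 = 0
x2, b1 = transfer(x1, b1, t1)       # height at the second lens
x2, b2 = thin_lens(x2, b1, f2)

f_sys = -x0 / b2    # from 1/f' = -beta2/x0
t2 = -x2 / b2       # image-space focal distance, from x3 = 0

print(f_sys, t2)
```

The traced values can be compared with Eqs. (4-47) and (4-48); for the assumed numbers, $1/f' = 0.24$ in both cases.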
It is easy to see from Eqs. (4-47) and (4-48) that if $t_1 = f_1' + f_2'$, then $f' \to \infty$ and,
therefore, $t_2 \to \infty$. Thus, the system is afocal (as in a Keplerian or a Galilean telescope
discussed later in Chapter 6), and the focal point $F'$ lies at infinity on the right-hand side
of the system. The principal point $H'$ lies to the left-hand side of the lens $L_2$ at a
distance $f' - t_2 = f't_1/f_1' \to \infty$, i.e., it lies at infinity on the left-hand side of the system.
Similarly, we can show that the principal point $H$ and the focal point $F$ lie at infinity on
the right-hand and left-hand sides of the system, respectively. For $f_1'f_2' > 0$, the
system has a positive focal length if $t_1 < f_1' + f_2'$ and a negative focal length if
$t_1 > f_1' + f_2'$.
If lens $L_1$ is placed at the front focal point $F_2$ of lens $L_2$, i.e., if $t_1 = f_2'$, then
$f' = f_2'$, and the front focal point $F$ of the system coincides with $F_2$. Because the height
of the image of a certain point object is determined by the object ray passing through the
front focal point of the imaging system (see Section 2.4.6), we find that it is the same for
imaging by the doublet as it is for imaging by lens L2 alone. This is why it is desirable to
place the spectacle lenses with different corrections for the two eyes in the front focal
plane of the eyes; otherwise, the images on the retinas will have different magnifications.
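The special spacings discussed above follow directly from Eq. (4-47), as the short sketch below illustrates. The function name `two_lens_power` and all numerical values are assumptions made for illustration only.

```python
def two_lens_power(f1, f2, t1):
    """1/f' of two thin lenses in air separated by t1, Eq. (4-47)."""
    return 1 / f1 + 1 / f2 - t1 / (f1 * f2)


# Assumed illustrative focal lengths (same length units).
f1, f2 = 10.0, 5.0

# Afocal spacing t1 = f1' + f2' (two positive lenses).
power_keplerian = two_lens_power(f1, f2, f1 + f2)
# The same condition with a negative second lens.
power_galilean = two_lens_power(10.0, -5.0, 10.0 + (-5.0))

# Shorter and longer spacings than the afocal one, for two positive lenses.
power_short = two_lens_power(f1, f2, 12.0)
power_long = two_lens_power(f1, f2, 18.0)

# L1 placed at the front focal point of L2 (t1 = f2'): expect f' = f2'.
t1 = f2
f_sys = 1 / two_lens_power(f1, f2, t1)
# Object-space focal distance f(1 - t1/f2') with f = -f'; zero means F is at L1.
front_focal_distance = -f_sys * (1 - t1 / f2)

print(power_keplerian, power_galilean, power_short, power_long, f_sys)
```

In the last case the system focal length equals that of $L_2$ alone whatever the value of $f_1'$, which is the property used in the spectacle-lens argument.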
4.7 REFLECTING SURFACE (MIRROR)
We now derive the ray-tracing equations for a reflecting surface. Consider a
spherical reflecting surface of radius of curvature $R_1$ with a vertex $V$ and center of
curvature $C$, as illustrated in Figure 4-12. An object ray $A_0A_1$ from a point object $A_0$
incident on the surface at a point $A_1$ with a slope angle $\beta_0$ is reflected as a ray $A_1A_2$ so
that the magnitudes of the angles of incidence $\theta$ and reflection $\theta'$ from the surface
normal $A_1C$ at $A_1$ are equal. Let $x_0$, $x_1$, and $x_2$ be the heights of $A_0$, $A_1$, and $A_2$,
respectively, from the optical axis $VC$ of the mirror. Also, let $t_0$ be the axial distance of $V$
from $A_0$ and $t_1$ be the axial distance of $A_2$ from $V$. We note from the figure that for
paraxial rays, rectilinear propagation from $A_0$ to $A_1$ gives
$$x_1 = x_0 + t_0\beta_0. \tag{4-49}$$
According to the law of reflection,
$$\theta' = -\theta. \tag{4-50}$$
We note from Figure 4-12 that

Figure 4-12. Ray tracing of a convex spherical mirror of radius of curvature $R_1$
with center of curvature $C$ and vertex $V$.
$$\beta_0 - \beta_1 = \theta - \theta' = 2\theta, \tag{4-51a}$$
$$\beta_1 = \phi - \theta, \tag{4-51b}$$
and
$$\phi = -\frac{x_1}{R_1}, \tag{4-51c}$$
where $\phi$ is the angle the surface normal makes with the optical axis. Note that $\beta_1$, $\theta'$, and
$\phi$ are all numerically negative angles in the figure. Substituting for $\theta$ from Eq. (4-51b)
and for $\phi$ from Eq. (4-51c) into Eq. (4-51a), we obtain
$$\beta_1 = -\beta_0 - \frac{2x_1}{R_1}. \tag{4-52}$$
Equation (4-52) is called the reflection ray-tracing equation. Rectilinear propagation of
the reflected ray from $A_1$ to $A_2$ gives
$$x_2 = x_1 + t_1\beta_1. \tag{4-53}$$
Note that $t_1$ is numerically negative in this equation because the rays are propagating
from right to left as they travel from $A_1$ to $A_2$. Therefore, the quantity $t_1\beta_1$ is
numerically positive.
The ray tracing of a reflecting surface is illustrated schematically in Figure 4-13a. A
ray starts at a height $x_0$ from the optical axis with a slope angle $\beta_0$ and propagates an
axial distance $t_0$ to the reflecting surface. The starting point is indicated by the
coordinates $(x_0, \beta_0)$. The ray is incident on the surface at a height $x_1$ and is reflected
with a slope $\beta_1$. The point of incidence is indicated by $(x_1, \beta_1)$, where $x_1$ and $\beta_1$ are
given by Eqs. (4-49) and (4-52), respectively. The height $x_2$ of a point $A_2$ on the
reflected ray at a distance $t_1$ from the reflecting surface is given by Eq. (4-53).
Figure 4-13. Ray tracing of a reflecting surface. (a) General case. (b) Determination
of the focal point.
If we let $\beta_0 = 0$, corresponding to a ray incident parallel to the optical axis, and let
$x_2 = 0$, corresponding to the intersection of the reflected ray with the optical axis, as
illustrated in Figure 4-13b, then the corresponding value of $t_1$ gives the focal length of
the mirror. Letting $\beta_0 = 0$ in Eqs. (4-49) and (4-52), and $x_2 = 0$ in Eq. (4-53), we find
that the focal length of the mirror is given by
$$f_1' = \frac{R_1}{2}, \tag{4-54}$$
in agreement with Eq. (3-6). The reflecting power of the mirror is given by
$$K \equiv \frac{n'}{f'} = -\frac{1}{f_1'}. \tag{4-55}$$
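A minimal numerical sketch of the mirror trace, assuming the sign convention of Figure 4-12 (a convex mirror has $R_1 > 0$), is given below. The helper names `transfer`, `reflect`, and `mirror_focal_length`, and the radius of 20 units, are illustrative assumptions.

```python
def transfer(x, beta, t):
    """Rectilinear propagation, Eq. (4-49): x -> x + t*beta."""
    return x + t * beta, beta


def reflect(x, beta, R):
    """Paraxial reflection by a spherical mirror, Eq. (4-52)."""
    return x, -beta - 2.0 * x / R


def mirror_focal_length(R, x0=1.0, t0=5.0):
    """Trace a ray parallel to the axis and return t1 for which x2 = 0."""
    x1, beta0 = transfer(x0, 0.0, t0)   # beta0 = 0, so x1 = x0
    x1, beta1 = reflect(x1, beta0, R)
    return -x1 / beta1                  # from x2 = x1 + t1*beta1 = 0


f_convex = mirror_focal_length(20.0)    # convex mirror, R1 > 0
f_concave = mirror_focal_length(-20.0)  # concave mirror, R1 < 0
K_convex = -1.0 / f_convex              # reflecting power, Eq. (4-55)

print(f_convex, f_concave, K_convex)
```

Both results reproduce $f_1' = R_1/2$: the focal point of the convex mirror lies to the right of its vertex (behind the mirror), and that of the concave mirror lies to the left.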
4.8 TWO-MIRROR SYSTEM
In this section we consider a two-mirror system, such as the one considered in

Section 3.3, and determine its focal length. We also illustrate how to determine the
obscuration ratio of the image-forming beam as well as the size of the hole in the primary
mirror such that this beam is transmitted by it.
4.8.1 Focal Length

We now consider an imaging system consisting of two mirrors $M_1$ and $M_2$ of radii
of curvature $R_1$ and $R_2$ separated by a (numerically negative) distance $t_1$, as illustrated in
Figure 4-14, and determine its focal length and its principal and focal points. Starting
with a ray incident parallel to the optical axis ($\beta_0 = 0$) at a height $x_0$, we apply
Eqs. (4-49) and (4-52) recursively as follows:
$$x_1 = x_0 + t_0\beta_0 = x_0,$$
$$\beta_1 = -\beta_0 - \frac{2x_1}{R_1} = -\frac{2x_1}{R_1},$$
$$x_2 = x_1 + t_1\beta_1 = x_0\left(1 - \frac{2t_1}{R_1}\right), \tag{4-56}$$
Figure 4-14. Ray tracing of a two-mirror system to determine its focal point $F'$ and
principal point $H'$.
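$$\beta_2 = -\beta_1 - \frac{2x_2}{R_2} = 2x_0\left[\frac{1}{R_1} - \frac{1}{R_2}\left(1 - \frac{2t_1}{R_1}\right)\right],$$
$$\frac{1}{f'} = -\frac{\beta_2}{x_0}, \tag{4-57}$$
or
$$\frac{1}{f'} = -2\left(\frac{1}{R_1} - \frac{1}{R_2} + \frac{2t_1}{R_1R_2}\right), \tag{4-58}$$
$$x_3 = x_2 + t_2\beta_2 = 0 \quad \text{for a focal point},$$
$$t_2 = -\frac{x_2}{\beta_2},$$
or
$$t_2 = f'\left(1 - \frac{2t_1}{R_1}\right). \tag{4-59}$$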
The quantity $t_2$ locates the image-space focal point $F'$ and represents its distance from
$M_2$, called the image-space focal distance of the system. The principal point $H'$ is
located by noting that $H'F' = f'$. A positive value of $f'$ implies that $F'$ lies to the right
of $H'$ at a distance $f'$ from it. Similarly, by considering a ray incident parallel to the
optical axis from right to left, the location of the object-space focal point $F$ and the
principal point $H$ can be determined. We find that $F$ lies at a distance $f(1 - 2t_1/R_2)$ from
$M_1$, where $f = -f'$ is the object-space focal length of the system. This distance is the
object-space focal distance of the system.
Letting $f_1' = R_1/2$ and $f_2' = R_2/2$ denote the focal lengths of the mirrors, Eq. (4-58)
for the focal length of the system can be written
$$\frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t_1}{f_1'f_2'}. \tag{4-60}$$
This result may also be obtained from Eq. (4-21a) by letting $n_1 = -1$ (representing the
refractive index associated with the ray reflected by $M_1$) and $n_2 = 1$ (representing the
refractive index associated with the ray reflected by $M_2$). In terms of the equivalent focal
lengths of the mirrors, $f_{e1} = -R_1/2$ and $f_{e2} = R_2/2$, defined by Eqs. (3-9), for which
$f_{e1}f_{e2} = -f_1'f_2'$, Eq. (4-60) can also be written
$$\frac{1}{f'} = \frac{1}{f_{e1}} + \frac{1}{f_{e2}} + \frac{t_1}{f_{e1}f_{e2}}. \tag{4-61}$$
We note that $f' \to \infty$ if $t_1 = f_1' - f_2'$, i.e., the system becomes afocal if the mirrors are
confocal (i.e., if they have a common focus). If the magnitude of the spacing between the
mirrors is smaller (than that for the afocal setting), then the system has a positive focal
length, and it is called a Cassegrain telescope. If it is larger, then the focal length of the
system is negative, and it is called a Gregorian telescope. Both telescopes form a real
image of an object lying at infinity, as illustrated in Figure 3-8.
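The two-mirror trace can be checked numerically with the sketch below. The mirror radii and spacing are assumed values describing a concave primary ($R_1 < 0$) and a convex secondary ($R_2 < 0$) with $M_2$ to the left of $M_1$ ($t_1 < 0$); the helper names are likewise assumptions.

```python
def transfer(x, beta, t):
    """Rectilinear propagation: x -> x + t*beta."""
    return x + t * beta, beta


def reflect(x, beta, R):
    """Paraxial reflection by a spherical mirror, Eq. (4-52)."""
    return x, -beta - 2.0 * x / R


# Assumed illustrative system (same length units throughout).
R1, R2, t1 = -8.0, -2.5, -3.0
x0, beta0 = 1.0, 0.0              # ray incident parallel to the optical axis

x1, b1 = reflect(x0, beta0, R1)   # reflection by the primary mirror M1
x2, b1 = transfer(x1, b1, t1)     # propagation from M1 to M2, Eq. (4-56)
x2, b2 = reflect(x2, b1, R2)      # reflection by the secondary mirror M2

f_sys = -x0 / b2                  # Eq. (4-57)
t2 = -x2 / b2                     # image-space focal distance

# Closed-form results for comparison.
power_458 = -2.0 * (1 / R1 - 1 / R2 + 2 * t1 / (R1 * R2))   # Eq. (4-58)
f1, f2 = R1 / 2, R2 / 2
power_460 = -1 / f1 + 1 / f2 - t1 / (f1 * f2)               # Eq. (4-60)
t2_459 = f_sys * (1 - 2 * t1 / R1)                           # Eq. (4-59)

print(f_sys, t2)
```

For the assumed values the system has a positive focal length of 20 units, and the focal point lies 5 units to the right of $M_2$, i.e., 2 units behind the primary mirror.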
4.8.2 Obscuration
It should be evident from Figure 4-14 that the central portion of a bundle of rays
incident on the primary mirror M1 is blocked by the secondary mirror M2 . Thus, the
image-forming beam is hollow on the inside. It is said to be centrally obscured in the case
of an axial point object lying at infinity. The ratio of the heights of the innermost to the
outermost rays of the image-forming light cone is called the obscuration ratio of the
system.
If x1 is the radius of M1 , as illustrated in Figure 4-15, then a ray incident parallel to

the optical axis at a height x1 defines the outermost ray. Its height at M2 (after reflection
by M1 ) gives the radius of M2 required so that the axial rays reflected by M1 are not
missed by M2 . Its height x3 at M1 (after reflection by M2 ) gives the radius of the hole
required in M1 so that the axial rays forming the image at F ¢ are not blocked by it. A
parallel ray incident at a height x2 , shown dotted in Figure 4-14, defines the innermost
ray.
From Eq. (4-56), the obscuration ratio of the image-forming beam converging to
the image point at $F'$ is given by
$$\frac{x_2}{x_1} = 1 - \frac{t_1}{f_1'}. \tag{4-62}$$
The radius $x_3$ of the hole required in $M_1$ is given by
$$x_3 = x_2 + t_2\beta_2 = x_1\left[1 + t_1\left(\frac{1}{f'} - \frac{1}{f_1'}\right)\right], \tag{4-63}$$
where $t_2 = -t_1$ (because the ray is propagating from $M_2$ to $M_1$), and we have substituted
for $x_2$ and $\beta_2$ from Eqs. (4-56) and (4-57), respectively.
Figure 4-15. Axial ray tracing of a two-mirror system, illustrating its obscuration
ratio $x_2/x_1$ and the radius $x_3$ of the hole in the primary mirror $M_1$.
By tracing an off-axis ray, as illustrated in Figure 4-16, we can determine how the
field of view of a system affects the values of $x_2$ and $x_3$. Consider a ray incident at an
angle $\beta_0$ at a height $x_1$ on $M_1$, representing an outermost ray from an object point at
infinity making an angle $\beta_0$ with the optical axis. The equations for tracing this ray are:
$$\beta_1 = -\beta_0 - \frac{x_1}{f_1'},$$
$$x_2 = x_1 + t_1\beta_1 = -\beta_0t_1 + x_1\left(1 - \frac{t_1}{f_1'}\right), \tag{4-64}$$
$$\beta_2 = -\beta_1 - \frac{x_2}{f_2'} = \beta_0\left(1 + \frac{t_1}{f_2'}\right) - \frac{x_1}{f'},$$
and (with $t_2 = -t_1$)
$$x_3 = x_2 + t_2\beta_2 = -\beta_0t_1\left(2 + \frac{t_1}{f_2'}\right) + x_1\left[1 + t_1\left(\frac{1}{f'} - \frac{1}{f_1'}\right)\right]. \tag{4-65}$$
Figure 4-16. Off-axis ray tracing of a two-mirror system, illustrating the increase in
radius of the secondary mirror $M_2$ and the hole in the primary mirror $M_1$. The
dashed axial ray is shown for comparison. The heights of the images formed by $M_1$
and the system are $h_1'$ and $h'$, respectively.
Comparing Eq. (4-64) with Eq. (4-62), we find that the radius of $M_2$ increases by $-\beta_0t_1$,
which, in turn, increases the obscuration ratio. Similarly, comparing Eq. (4-65) with Eq.
(4-63), we find that the radius of the hole in $M_1$ increases by $-\beta_0t_1(2 + t_1/f_2')$. Good image
quality is generally obtained for only very small values of the field angle $\beta_0$ (a few
degrees) due to the rapid increase of aberrations with it. Thus, the approximate results are
reasonably accurate for a preliminary design of the system. The precise results in the final
stages of a design are obtained by exact ray tracing using a computer-based code.
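A preliminary sizing of the secondary mirror and of the hole in the primary mirror can be sketched as follows. The function name `trace_two_mirror`, the mirror focal lengths, the primary radius `a`, and the field angle are all assumed values used only to illustrate Eqs. (4-62) through (4-65).

```python
def trace_two_mirror(x1, beta0, f1, f2, t1):
    """Trace a ray incident on M1 at height x1 with slope beta0.

    Returns its height x2 at M2 and its height x3 back at M1 (t2 = -t1).
    """
    beta1 = -beta0 - x1 / f1          # reflection by M1
    x2 = x1 + t1 * beta1              # height at M2, Eq. (4-64)
    beta2 = -beta1 - x2 / f2          # reflection by M2
    x3 = x2 + (-t1) * beta2           # height in the plane of M1, Eq. (4-65)
    return x2, x3


# Assumed illustrative system: concave primary, convex secondary.
f1, f2, t1 = -4.0, -1.25, -3.0
f_sys = 1.0 / (-1.0 / f1 + 1.0 / f2 - t1 / (f1 * f2))   # Eq. (4-60)
a = 1.0            # radius of the primary mirror
beta0 = 0.005      # field angle in radians

x2_axial, x3_axial = trace_two_mirror(a, 0.0, f1, f2, t1)
x2_field, x3_field = trace_two_mirror(a, beta0, f1, f2, t1)

obscuration_axial = x2_axial / a                          # Eq. (4-62)
hole_axial = a * (1 + t1 * (1 / f_sys - 1 / f1))          # Eq. (4-63)

print(obscuration_axial, x3_axial, x2_field, x3_field)
```

For these assumed values the field angle increases the required secondary radius from 0.25 to 0.265 and the hole radius from 0.1 to 0.166, in agreement with the increments $-\beta_0t_1$ and $-\beta_0t_1(2 + t_1/f_2')$.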
4.9 CATADIOPTRIC SYSTEM: THIN-LENS–MIRROR COMBINATION
Finally, we consider a catadioptric system consisting of a thin lens of focal length $f_l'$
and a concave mirror of radius of curvature $R$ (and, therefore, focal length $f_m' = R/2$)
separated by a distance t, as illustrated in Figure 4-17, and determine its focal length. The
results obtained are applied to a Schmidt camera, which consists of a spherical mirror and
a corrector plate placed at its center of curvature. The primary purpose of the plate is to
correct the spherical aberration of the mirror. However, it also has a small focus term and
thus acts like a (weak) lens.
Applying the ray-tracing equations (4-36) and (4-37) for a thin lens and Eqs. (4-49)
and (4-52) for a mirror, we obtain the focal length $f_s'$ of the system as follows:
$$x_1 = x_0 + t_0\beta_0 = x_0 \quad \text{for a ray incident parallel to the optical axis},$$
$$\beta_1 = \beta_0 - \frac{x_1}{f_l'} = -\frac{x_0}{f_l'},$$
$$x_2 = x_1 + t\beta_1 = x_0\left(1 - \frac{t}{f_l'}\right),$$
Figure 4-17. Catadioptric system consisting of a thin lens $L$ of image-space focal
length $f_l'$ and a mirror $M$ of radius of curvature $R$ separated by a distance $t$. The
dotted line is a continuation of the ray refracted by the lens and intersects the
optical axis at the image-space focal point $F_l'$ of the lens.
$$\beta_2 = -\beta_1 - \frac{x_2}{f_m'} = x_0\left[\frac{1}{f_l'} - \frac{1}{f_m'}\left(1 - \frac{t}{f_l'}\right)\right] = -\frac{x_0}{f_s'},$$
where the negative sign in the last step accounts for the fact that $\beta_2$ is numerically
positive, whereas $f_s'$ is numerically negative in the figure. Thus, the focal length of the
system is given by
$$\frac{1}{f_s'} = -\left(\frac{1}{f_l'} - \frac{1}{f_m'} + \frac{t}{f_l'f_m'}\right). \tag{4-66}$$
The focusing power $K_s$ and the equivalent focal length $f_e$ of the system are given by
$$K_s \equiv \frac{n_s'}{f_s'} = -\frac{1}{f_s'} = \frac{1}{f_e}, \tag{4-67}$$
where $n_s' = -1$ is the refractive index of the image space of the system. The distance $t_2$
of the focal point $F'$ from the vertex $V$ of the mirror is given by
$$x_3 = x_2 + t_2\beta_2 = 0 \quad \text{for a focal point}.$$
Thus,
$$t_2 = -\frac{x_2}{\beta_2},$$
or
$$t_2 = f_s'\left(1 - \frac{t}{f_l'}\right). \tag{4-68}$$
If the lens is placed at the center of curvature $C$ of the mirror, as in a Schmidt
camera, then $t = -R = -2f_m'$, and Eqs. (4-66) and (4-68) reduce to
$$\frac{1}{f_s'} = \frac{1}{f_l'} + \frac{1}{f_m'} \tag{4-69}$$
and
$$t_2 = f_s'\left(1 + \frac{2f_m'}{f_l'}\right). \tag{4-70}$$
There is only one principal point and one focal point. Thus, the reference point for
both object and image distances is either the principal point $H'$ or the focal point $F'$,
depending on whether the Gaussian or the Newtonian imaging equation is used. It should
be noted that if the lens is placed close to the mirror, then the rays reflected by the mirror
are refracted by the lens before a final image is formed (see Problem 4.1).
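The lens-mirror trace can be sketched numerically as shown below. The function name `trace_catadioptric` and the values of $f_l'$, $R$, and $t$ are assumptions for illustration; as in the derivation, the reflected rays are taken not to pass through the lens again.

```python
def trace_catadioptric(f_lens, R, t, x0=1.0):
    """Trace a ray parallel to the axis through a thin lens and then a mirror.

    Returns the system focal length fs' and the focal distance t2 from the
    mirror vertex.
    """
    beta1 = -x0 / f_lens              # thin-lens refraction with beta0 = 0
    x2 = x0 + t * beta1               # height at the mirror
    beta2 = -beta1 - 2.0 * x2 / R     # reflection, Eq. (4-52), with fm' = R/2
    return -x0 / beta2, -x2 / beta2


# Assumed illustrative values: weak positive lens, concave mirror (R < 0).
f_lens, R, t = 50.0, -10.0, 4.0
f_mirror = R / 2

fs, t2 = trace_catadioptric(f_lens, R, t)
power_466 = -(1 / f_lens - 1 / f_mirror + t / (f_lens * f_mirror))  # Eq. (4-66)
t2_468 = fs * (1 - t / f_lens)                                       # Eq. (4-68)

# Schmidt-type arrangement: lens at the center of curvature, t = -R.
fs_schmidt, t2_schmidt = trace_catadioptric(f_lens, R, -R)
power_469 = 1 / f_lens + 1 / f_mirror                                # Eq. (4-69)
t2_470 = fs_schmidt * (1 + 2 * f_mirror / f_lens)                    # Eq. (4-70)

print(fs, t2, fs_schmidt, t2_schmidt)
```

Both focal lengths come out numerically negative, consistent with the sign of $f_s'$ indicated in Figure 4-17.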
4.10 TWO-RAY LAGRANGE INVARIANT
As shown in Sections 2.4.3 and 3.2.3, the Lagrange invariant, which is the product of
the slope angle of a ray from an axial point object, object height, and the refractive index
of the object space, is invariant upon refraction or reflection by a surface, and thus for a
system consisting of any number of such surfaces. Now we consider this invariant in
terms of the heights and slopes of two arbitrary rays incident on the system. We show
how this invariant reduces to that for finite or infinite conjugates. We also show that the
slope and the height of any other ray incident on the system can be obtained anywhere in
space as a linear combination of the slopes and heights of the other two in that space.
Consider, as illustrated in Figure 4-18, two linearly independent rays (such that one
is not a scaled version of the other) incident at heights $x_0$ and $x$ with slope angles $\beta_0$ and
$\beta$ on a refracting surface of radius of curvature $R$ separating media of refractive indices $n$
and $n'$. From Eq. (4-4), the slope angles $\beta_0'$ and $\beta'$ of the corresponding refracted rays
are given by
$$n'\beta_0' = n\beta_0 + (n - n')\frac{x_0}{R} \tag{4-71}$$
Figure 4-18. Lagrange invariant of two rays incident on a refracting surface of
radius of curvature $R$ separating media of refractive indices $n$ and $n'$.
and
$$n'\beta' = n\beta + (n - n')\frac{x}{R}. \tag{4-72}$$
Eliminating $(n - n')/R$ from Eqs. (4-71) and (4-72), we find that
$$n'(\beta_0'x - \beta'x_0) = n(\beta_0x - \beta x_0), \tag{4-73}$$
showing that the quantity $n(\beta_0x - \beta x_0)$, called the two-ray Lagrange invariant, is
invariant upon refraction of the rays. If we let $x_0'$ and $x'$ be the heights of the rays in a
plane at a distance $t$ from the refracting surface, we find from Eq. (4-1) that
$$x_0' = x_0 + t\beta_0' \tag{4-74}$$
and
$$x' = x + t\beta'. \tag{4-75}$$
Eliminating $t$ from Eqs. (4-74) and (4-75), we find that
$$n'(\beta_0'x' - \beta'x_0') = n'(\beta_0'x - \beta'x_0). \tag{4-76}$$
Thus, the quantity $n'(\beta_0'x - \beta'x_0)$ remains invariant upon transfer of the rays from one
plane to another. From Eqs. (4-73) and (4-76) we find that
$$n'(\beta_0'x' - \beta'x_0') = n(\beta_0x - \beta x_0), \tag{4-77}$$
showing the equality of the Lagrange invariant in the object and image spaces. Thus, the
two-ray Lagrange invariant remains the same throughout the optical system, including the
object and image spaces. This invariant relation applies to a multisurface system as well
(as may be seen by placing another refracting surface at some distance from the existing
refracting surface), in which case the right- and left-hand sides of Eq. (4-77) refer to its
object and image spaces, respectively.
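The invariance expressed by Eqs. (4-73), (4-76), and (4-77) can be verified numerically with the following sketch. The helper names `refract`, `transfer`, and `lagrange`, the two rays, and the surface parameters are assumptions chosen only for illustration.

```python
def refract(x, beta, n, n_prime, R):
    """Paraxial refraction: n' beta' = n beta + (n - n') x / R."""
    return x, (n * beta + (n - n_prime) * x / R) / n_prime


def transfer(x, beta, t):
    """Rectilinear propagation: x -> x + t*beta."""
    return x + t * beta, beta


def lagrange(n, x0, beta0, x, beta):
    """Two-ray Lagrange invariant n (beta0 x - beta x0)."""
    return n * (beta0 * x - beta * x0)


# Assumed illustrative surface and rays; each ray is (height, slope).
n, n_prime, R, t = 1.0, 1.5, 10.0, 7.0
ray_a = (1.0, 0.02)      # (x0, beta0)
ray_b = (-0.5, 0.05)     # (x, beta)

L_object = lagrange(n, *ray_a, *ray_b)

a_ref = refract(*ray_a, n, n_prime, R)
b_ref = refract(*ray_b, n, n_prime, R)
L_surface = lagrange(n_prime, *a_ref, *b_ref)    # after refraction, Eq. (4-73)

a_out = transfer(*a_ref, t)
b_out = transfer(*b_ref, t)
L_image = lagrange(n_prime, *a_out, *b_out)      # after transfer, Eq. (4-76)

print(L_object, L_surface, L_image)
```

All three values agree to within rounding error, as Eq. (4-77) requires.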
If we consider the Lagrange invariant in two conjugate planes passing through $P_0$
and $P_0'$ so that $x_0 = 0$, $x = h$, $x_0' = 0$, and $x' = h'$, as in Figure 2-9, we find that
Eq. (4-77) reduces to Eq. (2-15). If one of the conjugates lies at infinity, we let $\beta_0 = 0$ and find
that the object-space Lagrange invariant reduces to $-n\beta x_0$, as discussed in Section 2.5.2.
For the refracting surface, it is easy to show that the various expressions for the Lagrange
invariant are equal to each other. For example, in Figure 4-19, $-n\beta x_0$ and
$n'(\beta_0'x - \beta'x_0)$ at the surface, and $n'h'\beta_0'$ at the image plane, are all equal to $nf'\beta\beta_0'$,
where $f'$ is the image-space focal length of the refracting surface. If the system is afocal,
as in Figure 2-37a, then $\beta_0$ and $\beta_0'$ are both equal to zero, and the image-space Lagrange
invariant reduces to $-n'x_0'\beta'$, thus yielding Eq. (2-108).
Now we utilize the two-ray Lagrange invariant to show that if the slope and height of
any third ray are known in a certain space, they can be obtained in any other space,
without tracing it, as a linear combination of the values of the two rays in that other
space. Let the slopes of the three rays in a certain space of refractive index $n$ be $\beta_1$, $\beta_2$,
and $\beta_3$, and let their heights in a certain plane in that space be $x_1$, $x_2$, and $x_3$,
respectively. Suppose that two of the rays have been traced such that their heights and
slopes $(x_1', \beta_1')$ and $(x_2', \beta_2')$ in another space are known. We show that the height and
slope $(x_3', \beta_3')$ of the third ray in that space can be determined without tracing it from the
heights and slopes of the other two.
Figure 4-19. Lagrange invariant of a refracting surface for an object at infinity.
The two-ray Lagrange invariant in the plane for which the heights and slopes of the
rays are known can be calculated according to
$$L_{12} = n(\beta_1x_2 - \beta_2x_1), \tag{4-78a}$$
$$L_{13} = n(\beta_1x_3 - \beta_3x_1), \tag{4-78b}$$
and
$$L_{23} = n(\beta_2x_3 - \beta_3x_2). \tag{4-78c}$$
Using primes for the corresponding quantities in another space, the Lagrange invariant in
terms of the quantities in this space may be written
$$L_{12} = n'(\beta_1'x_2' - \beta_2'x_1'), \tag{4-79a}$$
$$L_{13} = n'(\beta_1'x_3' - \beta_3'x_1'), \tag{4-79b}$$
and
$$L_{23} = n'(\beta_2'x_3' - \beta_3'x_2'), \tag{4-79c}$$
respectively. From Eqs. (4-79), we find that the height and the slope of the third ray in the
other space are given by
$$x_3' = \frac{L_{13}x_2' - L_{23}x_1'}{L_{12}} \tag{4-80}$$
and
$$\beta_3' = \frac{L_{13}\beta_2' - L_{23}\beta_1'}{L_{12}}. \tag{4-81}$$
Because the quantities on the right-hand sides of Eqs. (4-80) and (4-81) are known, the
height and slope $(x_3', \beta_3')$ of the third ray can be determined without actually tracing it.
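As an illustration of Eqs. (4-80) and (4-81), the sketch below predicts a third ray from two traced rays and compares the prediction with a direct trace. The example system (a thin lens of focal length 50 followed by 30 units of free space, all in air), the three input rays, and the function names are assumptions made only for this check.

```python
def trace(x, beta):
    """Assumed example system in air: thin lens (f' = 50), then 30 units of space."""
    beta = beta - x / 50.0
    return x + 30.0 * beta, beta


def invariant(n, ray_i, ray_j):
    """Two-ray Lagrange invariant n (beta_i x_j - beta_j x_i), Eqs. (4-78)."""
    (xi, bi), (xj, bj) = ray_i, ray_j
    return n * (bi * xj - bj * xi)


n = 1.0
# Each ray is (height, slope) in the input plane; values are illustrative.
ray1, ray2, ray3 = (1.0, 0.0), (0.0, 0.1), (0.4, -0.03)

# Trace only the first two rays.
x1p, b1p = trace(*ray1)
x2p, b2p = trace(*ray2)

L12 = invariant(n, ray1, ray2)
L13 = invariant(n, ray1, ray3)
L23 = invariant(n, ray2, ray3)

# Third ray obtained without tracing it, Eqs. (4-80) and (4-81).
x3p = (L13 * x2p - L23 * x1p) / L12
b3p = (L13 * b2p - L23 * b1p) / L12

# Direct trace of the third ray, for comparison only.
x3_direct, b3_direct = trace(*ray3)

print(x3p, b3p, x3_direct, b3_direct)
```

The two input rays must be linearly independent so that $L_{12} \neq 0$; here $L_{12} = -0.1$.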
As discussed later in Section 5.2.3, a marginal ray from the axial point of an object
and a chief ray from its edge are the two rays that can be traced to determine the location
and size of the images of an object and the entrance pupil of a system. The first ray
passing through the edge of the entrance pupil passes through the edge of the exit pupil
and thus determines its size. It also passes through the center of the image and thereby
determines its location. The second ray passing through the center of the entrance pupil
passes through the center of the exit pupil and thus determines its location. It also passes
through the edge of the image and thus determines its size. The height and slope of a third
ray in any space can be determined from the heights and slopes of these two rays in that
space.
4.11.1 Ray-Tracing Equations

Consider a ray starting at a height $x_0$ with a slope $\beta_0$. Its height $x_1$ after propagating
a distance $t_0$ is given by
$$x_1 = x_0 + t_0\beta_0. \tag{4-82}$$
Its slope $\beta_1$ after refraction by a spherical refracting surface of radius of curvature $R_1$
separating media of refractive indices $n_0$ and $n_1$, as illustrated in Figure 4-20, is given by
$$n_1\beta_1 = n_0\beta_0 + (n_0 - n_1)\frac{x_1}{R_1}. \quad \text{(Refracting surface)} \tag{4-83}$$
If the ray is refracted by a thin lens of focal length $f_1'$ instead, as in Figure 4-21, then its
slope after refraction is given by
$$\beta_1 = \beta_0 - \frac{x_1}{f_1'}. \quad \text{(Thin lens)} \tag{4-84}$$
In the case of reflection by a spherical mirror of radius of curvature $R_1$, as in Figure 4-22,
it is given by
$$\beta_1 = -\beta_0 - \frac{2x_1}{R_1}. \quad \text{(Mirror)} \tag{4-85}$$
The above equations are applied recursively to trace a ray through a multisurface optical
system. For example, the height of the refracted or reflected ray after propagating a
distance $t_1$ is given by
$$x_2 = x_1 + t_1\beta_1. \tag{4-86}$$
Figure 4-20. Ray tracing of a spherical refracting surface of radius of curvature $R_1$
separating media of refractive indices $n_0$ and $n_1$.
Figure 4-21. Ray tracing of a thin lens of focal length $f_1'$.
Ray-tracing equations are used to determine not only the Gaussian properties of a
system but also the size of the imaging elements and apertures, vignetting of rays, and
obscurations in mirror systems.
4.11.2 Thick Lens

The focal length $f'$ of a thick lens of refractive index $n$, thickness $t$, and surfaces of
radii of curvature $R_1$ and $R_2$, as illustrated in Figure 4-23, is given by
$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{t(n - 1)^2}{nR_1R_2}. \tag{4-87}$$
The front and back focal distances are given by $-f'[1 + t(n - 1)/(nR_2)]$ and
$f'[1 - t(n - 1)/(nR_1)]$, respectively.
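The recursive use of Eqs. (4-82), (4-83), and (4-86) can be illustrated by tracing a ray through the two surfaces of a thick lens in air and comparing the result with Eq. (4-87). The helper names and the lens data ($n = 1.5$, $R_1 = 4$, $R_2 = -4$, $t = 2$) are assumed for the example.

```python
def transfer(x, beta, t):
    """Rectilinear propagation, Eq. (4-82): x -> x + t*beta."""
    return x + t * beta, beta


def refract(x, beta, n0, n1, R):
    """Paraxial refraction, Eq. (4-83): n1 beta1 = n0 beta0 + (n0 - n1) x / R."""
    return x, (n0 * beta + (n0 - n1) * x / R) / n1


# Assumed illustrative thick lens in air (same length units throughout).
n, R1, R2, t = 1.5, 4.0, -4.0, 2.0
x0 = 1.0                                  # ray incident parallel to the axis

x1, b1 = refract(x0, 0.0, 1.0, n, R1)     # first surface: air to glass
x2, b1 = transfer(x1, b1, t)              # propagation through the lens
x2, b2 = refract(x2, b1, n, 1.0, R2)      # second surface: glass to air

f_traced = -x0 / b2                       # focal length from the exit slope
bfd_traced = -x2 / b2                     # back focal distance from V2

f_formula = 1.0 / ((n - 1) * (1 / R1 - 1 / R2) + t * (n - 1) ** 2 / (n * R1 * R2))
bfd_formula = f_formula * (1 - t * (n - 1) / (n * R1))

print(f_traced, f_formula, bfd_traced, bfd_formula)
```

The traced focal length and back focal distance agree with the closed-form expressions to within rounding error.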
Figure 4-22. Ray tracing of a spherical mirror of radius of curvature $R_1$.
Figure 4-23. Thick lens of refractive index $n$ and thickness $t$.
4.11.3 Two-Lens System

The focal length $f'$ of a two-lens system with thin lenses of focal lengths $f_1'$ and $f_2'$
separated by a distance $t$, as illustrated in Figure 4-24, is given by
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1'f_2'}. \tag{4-88}$$
The front and back focal distances are given by $-f'(1 - t/f_2')$ and $f'(1 - t/f_1')$,
respectively.
Figure 4-24. Two-lens system consisting of two thin lenses separated by a distance $t$.
4.11.4 Two-Mirror System

The focal length $f'$ of a two-mirror system with mirrors of radii of curvature $R_1$ and
$R_2$ spaced a (numerically negative) distance $t$ apart, as illustrated in Figure 4-25, is given
by
$$\frac{1}{f'} = -\frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1'f_2'}, \tag{4-89}$$
where $f_i' = R_i/2$ is the focal length of a mirror. The back focal distance, representing the
distance of the focal point $F'$ from the secondary mirror $M_2$, is given by $f'(1 - t/f_1')$.
The obscuration ratio, representing the ratio of the inner to the outer radii of the axial ray
bundle converging to the focal point, is given by $1 - t/f_1'$. The corresponding radius of
the hole in the primary mirror $M_1$ of radius $a$ is given by $a[1 + t(1/f' - 1/f_1')]$. Both the
obscuration ratio and the hole radius increase as the field of view increases.
4.11.5 Two-Ray Lagrange Invariant

If two rays with slopes $\beta_0$ and $\beta$ are incident on a system at heights $x_0$ and $x$,
respectively, as illustrated in Figure 4-18, and if the slopes and heights of the
corresponding rays in the image space are given by $\beta_0'$ and $\beta'$, and $x_0'$ and $x'$, then their
Lagrange invariant yields the relation
$$n'(\beta_0'x' - \beta'x_0') = n(\beta_0x - \beta x_0), \tag{4-90}$$
where $n$ and $n'$ are the refractive indices of the object and image spaces, respectively.
Figure 4-25. Two-mirror system.

PROBLEMS
4.1 A thin lens with a focal length of 10 cm is located at a distance of 3 cm in front of

a concave spherical mirror with a radius of curvature of 20 cm. (a) Determine the
focal point and the principal point of the system. (b) Repeat the problem when the
lens is in contact with the mirror.
4.2 A thick lens has a refractive index of 1.5. Its surfaces have radii of curvature of 10
cm and – 25 cm. If the second surface is silvered and the lens is 2 cm thick, locate
the focal point and the principal point of the system.
4.3 Consider a thick equiconvex lens with radii of curvature R1 = 4 cm and

R2 = - 4 cm , and refractive index n = 1.5 . Calculate its focal length and sketch
its principal and focal points if its thickness is 0.3 cm, 2 cm, 8 cm, 12 cm, 24 cm,
or 36 cm.
4.4 Two thin lenses of focal lengths $\pm f'$ are placed a distance $f'$ apart. (a) Determine
the focal points, principal points, and the focal length of the system. How does the
order of the lenses affect the result? (b) Repeat the problem when the lenses have
focal lengths $f'$ and $-f'/6$ and are placed a distance $2f'/3$ apart.
4.5 Consider a system of two thin lenses of focal lengths $f_1'$ and $f_2'$ spaced a distance $t$
apart. (a) Determine its cardinal points if $f_1' = 2f_2'$ and $t = 0.5f_2'$, $f_2'$, $1.5f_2'$
(Huygens eyepiece), $2f_2'$, and $3f_2'$ (astronomical telescope). (b) Repeat the
problem if $f_1' = -2f_2'$ and $t = 0.5f_2'$, $-f_2'$ (Galilean telescope), and $-1.5f_2'$
(telephoto lens). Let $f_1' = 10$ cm.
4.6 The human eye may be represented in a simplified form as follows:
[Figure: schematic eye along the optical axis OA, showing the cornea, lens, and retina, the media of refractive indices n1, n2, and n3, and the axial spacings t1, t2, and t3.]
i    Ci (mm–1)    ni       ti (mm)
1    0.1282051    1.336    3.6
2    0.10         1.413    3.6
3    0.16667      1.336    (to be determined)
(a) Determine $t_3$. (b) Determine the six cardinal points and show them on the axis.
(c) Determine the cardinal points for an underwater swimmer. Indicate the changes
from (b). Note that $C_i$ is the curvature of a surface, i.e., it is the reciprocal of its
radius of curvature. (Hint: One focal point is on the retina. The refractive index of
water is 1.336.) Note: t is in units of mm, and C is in units of mm–1.
4.7 In a nearsighted eye, the focal point $F'$ lies in front of the retina. Assume that the
eye can be approximated, as shown in the figure below, such that $F'$ is 23 mm
from the cornea instead of 24.387 mm, as in a normal eye. (a) Determine the
prescription of a corrective lens placed 15 mm in front of the cornea that makes
$F'$ lie on the retina. (b) Repeat the calculation for a contact lens.
[Figure: schematic eye with its cardinal points F, H, H′, and F′ and the cornea, lens, and retina marked; labeled distances 15.707 mm, 1.348 mm, 1.602 mm, and 24.387 mm; refractive indices n = 1.000 and n = 1.336.]
4.8 Consider a lens of refractive index n and thickness t with its two surfaces having
equal radii of curvature R. (a) Show that the distance between its principal points
is also equal to t. (b) Determine its principal and focal points for n = 1.5 ,
t = 2 cm , and R = 10 cm .
4.9 Consider a concentric lens of refractive index n with its two surfaces having radii
of curvature R1 and R2 . Show that such a lens behaves as a negative thin lens
placed at the common center of curvature of its two surfaces with a focal length
that is n times the focal length of a thin lens of the same refractive index and
surfaces with the same radii of curvature. Determine its principal and focal points
for n = 1.5 , R1 = 10 cm , and R2 = 8 cm .
4.10 The Hubble space telescope is a Cassegrain telescope with a focal ratio of 24. Its
primary mirror is its aperture stop (discussed in Chapter 5), with a diameter of 2.4
m and a focal ratio of 2.3. The spacing between its two mirrors is 4.905 m. (a)
Determine the location of its principal and focal points. (b) Determine the location
and size of its exit pupil. (c) Determine the diameters of the secondary mirror and
the hole in the primary mirror for a field of view of ± 5 mrad.
CHAPTER 5
STOPS, PUPILS, AND RADIOMETRY
5.1 Introduction
5.2 Stops, Pupils, and Vignetting
5.2.1 Introduction
5.2.2 Aperture Stop, and Entrance and Exit Pupils
5.2.3 Chief and Marginal Rays
5.2.4 Vignetting
5.2.5 Size of an Imaging Element
5.2.6 Telecentric Aperture Stop
5.2.7 Field Stop, and Entrance and Exit Windows
5.3 Radiometry of Point Object Imaging
5.3.2 Inverse-Square Law of Irradiance
5.3.3 Image Intensity
5.4 Radiometry of Extended Object Imaging
5.4.1 Introduction
5.4.2 Lambertian Surface
5.4.3 Illumination by a Lambertian Disc
5.4.5 Image Radiance
5.4.6 Image Irradiance: Aperture Stop in front of the System
5.4.7 Image Irradiance: Aperture Stop in back of the System
5.4.8 Telecentric Systems
5.4.9 Throughput
5.4.10 Interrelations among Invariants in Imaging
5.4.11 Concentric Systems
5.5 Photometry
5.5.1 Photometric Quantities and Spectral Response of the Human Eye
5.5.2 Imaging by the Human Eye
5.5.3 Brightness of a Lambertian Surface
5.6.1 Stops, Pupils, Windows, and Field of View
5.6.2 Radiometry of Point Object Imaging
5.6.3 Radiometry of Extended Object Imaging
5.6.3.1 Illumination by a Lambertian Disc
5.6.3.2 Image Radiance
5.6.3.3 Image Irradiance
5.6.4 Visual Observations
References
Problems
Chapter 5
Stops, Pupils, and Radiometry
5.1 INTRODUCTION
In previous chapters, we have shown how to determine the position and size of the
Gaussian image of an object. However, we did not consider the sizes of the imaging
elements or the apertures in the imaging system. Accordingly, no effort was made there to
determine the cone of object rays that enters or exits from the imaging system. Such
calculations are essential for the determination of the image intensity in terms of the
object intensity, or the image irradiance in terms of the object radiance.
We begin this chapter by introducing the concept of an aperture stop and its images,
called the entrance and exit pupils in the object and image spaces of an imaging system,
respectively. The light cone from a point object that enters the system is limited by the
entrance pupil. Similarly, the light cone that exits from the system and converges to the
image point is limited by the exit pupil. Certain special rays, such as the chief and
marginal rays, are defined. The chief ray from the edge of an object determines the
location of the exit pupil and the height of the image. Similarly, the marginal ray from the
axial point of the object determines the size of the exit pupil and the location of the axial
image point. Vignetting or blocking of the rays from an off-axis point object by the
aperture stop and/or other elements of the system, thus changing the effective shape of
the stop and pupils, is explained. A telecentric stop is defined, and its advantages are
briefly discussed. The field stop and its images, the entrance and exit windows, and the
angular field of view of a system are also described.
The field of radiometry deals with the determination of the amount of light radiated
by a source per unit area per unit solid angle, or the amount falling on a surface per unit
area [1–4]. We discuss the radiometry of point-object imaging, followed by the
radiometry of extended-object imaging. We introduce terms such as intensity of a point
source, radiance of an extended source, irradiance of a surface, and characteristics of a
Lambertian source. A relationship between the intensities of a point object and its point
image is derived. The irradiance of a surface due to a Lambertian disc is also derived. An
invariant relation between the radiances of an object and its image is obtained, and the
cosine-fourth law of image irradiance is discussed [5–7]. The irradiance distribution of
the images formed by systems that are telecentric or concentric is also discussed.
A brief discussion of photometry, the branch of radiometry limited to human
observations in the visible region of the electromagnetic spectrum, is given. Photometry
is different from the rest of radiometry in that the spectral response of the human eye is
taken into account to determine the results of any observation. The brightness of a
Lambertian surface is discussed, showing that it appears equally bright at all distances
along all directions of observation. It is also shown why stars may be observed during
daytime with the aid of a telescope.
5.2 STOPS, PUPILS, AND VIGNETTING

5.2.1 Introduction
In this section, we define the aperture stop and the entrance and exit pupils of an
optical system, and discuss how to determine them. The chief and marginal rays are
defined as the object rays that pass through the center and edge of the aperture stop,
respectively. The chief ray from the edge of an object locates the pupils and determines
the image size. Similarly, the marginal ray from the axial point object locates the image
plane and determines the sizes of the pupils. We describe how to determine the minimum
size of an imaging element (e.g., the diameter of a lens or a mirror) required to avoid
vignetting of rays. A telecentric aperture stop is discussed, which offers the advantage
that the size and shape of an image are insensitive to small defocus errors. Finally, a field stop
is defined whose images in its object and image spaces, called the entrance and exit
windows, define the angular fields of view in those spaces, respectively.
5.2.2 Aperture Stop, and Entrance and Exit Pupils

When an object is imaged by a system, not all of the object rays incident on the
system are transmitted by it; some of them are blocked by one or another of its elements.
An aperture in the system that physically limits the solid angle of the transmitted rays
from a point object the most is called its aperture stop (AS). For an extended (i.e., a
nonpoint) object, it is customary to consider the aperture stop as the limiting aperture for
an axial point object and to determine the vignetting or blocking of some rays by this stop
and other elements of the system for off-axis object points. The image of the stop by
surfaces of the system that precede it in the sense of light propagation, i.e., by those that
lie between it and the object, is called the entrance pupil (EnP). When observed from the
object side, the entrance pupil appears to limit the solid angle of the rays entering the
system to form the image of the object. Similarly, the image of the aperture stop by
surfaces that follow it, i.e., by those that lie between it and the image, is called the exit
pupil (ExP). The solid angle of the rays converging to the image point appears to be
limited by the exit pupil. Because the entrance and exit pupils are images of the aperture
stop formed by the system elements that precede and follow it, respectively, the two
pupils are conjugates of each other for the whole system, i.e., if one pupil is considered as
the object, the other is its image formed by the system.
As a simple example, consider imaging by a thin lens, as illustrated in Figure 5-1.
The cone angle of the ray bundle diverging from the axial point object P0 and incident on
the lens, as in Figure 5-1a, is limited by the size of the lens. Similarly, the cone angle of
the ray bundle converging to the image point P0′ is limited by the lens size. The lens
aperture is, therefore, the aperture stop, the entrance pupil, and the exit pupil of the
imaging system. Figure 5-1b shows the imaging of an off-axis point object. However, if
an aperture is placed in front of the lens, as in Figure 5-2, then it limits the cone angle of
the incident ray bundle. This aperture is the aperture stop as well as the entrance pupil of
the system. Its image by the lens is the exit pupil, which appears to limit the cone angle of
the ray bundle converging to the image point P0′. An observer looking at the lens from
P0′ does not see the aperture stop but sees instead its image, the exit pupil ExP, formed by
the lens. In Figure 5-3, an aperture is placed behind the lens. We note that the cone angle
of the ray bundle transmitted by the system and reaching the image point is limited by it.
It is therefore also the exit pupil of the system. The image of the aperture stop by the lens
is the entrance pupil because it appears to limit the cone angle of the corresponding
incident ray bundle. An observer looking at the lens from P0 does not see the aperture
stop AS, but sees instead its image, the entrance pupil EnP, formed by the lens.

Figure 5-1. Imaging by a thin lens with an aperture stop at the lens. (a) On-axis
imaging. (b) Off-axis imaging. The cone angle of a ray bundle diverging from a
point object and incident on the lens is limited by the size of the lens. Similarly, the
cone angle of the ray bundle converging to the corresponding image point is also
limited by the size of the lens. The lens aperture is the aperture stop AS, entrance
pupil EnP, and exit pupil ExP of the imaging system. CR represents the chief ray
that passes through the center of the lens, and MR represents a marginal ray passing
through its edge.

Figure 5-2. Imaging by a thin lens with aperture stop AS in front of the lens. (a) On-axis
imaging. (b) Off-axis imaging. The aperture stop is also the entrance pupil
EnP, and its image by the lens is the exit pupil ExP. The chief ray CR passes
through the center of the aperture stop and appears to pass through the center of
the exit pupil. The cone angle of the ray bundle diverging from a point object and
incident on the lens is limited by the entrance pupil, and the cone angle of the ray
bundle converging to the image point appears to be limited by the exit pupil.
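As a numerical illustration of the arrangement in Figure 5-2, the following minimal Python
sketch images an aperture stop through a thin lens in air to locate the exit pupil. It
assumes the Gaussian thin-lens equation 1/s′ − 1/s = 1/f with the Cartesian sign convention
(distances measured to the left of the lens are negative). The function name
`image_by_thin_lens` and all numerical values are hypothetical, chosen only for illustration.

```python
def image_by_thin_lens(s, f):
    """Image distance and transverse magnification for a thin lens in air.

    s: object distance from the lens (negative when the object lies to its left).
    f: image-space focal length (positive for a converging lens).
    Assumes the Gaussian imaging equation 1/s' - 1/s = 1/f.
    """
    inv = 1.0 / f + 1.0 / s
    if inv == 0.0:
        # Object in the front focal plane: the image lies at infinity.
        return float("inf"), float("inf")
    s_img = 1.0 / inv
    return s_img, s_img / s


# Hypothetical numbers: a stop of radius 10 placed 50 in front of a lens with f = 100.
stop_distance, stop_radius, focal_length = -50.0, 10.0, 100.0

# The stop is the entrance pupil; its image by the lens is the exit pupil.
exp_distance, pupil_magnification = image_by_thin_lens(stop_distance, focal_length)
exp_radius = abs(pupil_magnification) * stop_radius

print(exp_distance, pupil_magnification, exp_radius)
```

For these values the exit pupil is a virtual image lying 100 units to the left of the lens
with a radius of 20, i.e., twice the size of the stop, so it lies farther from the lens than
the stop itself.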
Suppose we add another lens to the right of the aperture stop so that the imaging
system consists of two thin lenses with an aperture lying between them, as illustrated in
Figure 5-4. Now AS is the aperture stop, and its images by the lenses L1 and L2 are the
entrance and exit pupils EnP and ExP, respectively. From the definition of the object and
image spaces given in Section 2.2.2, we note that the aperture stop lies in the image space
of lens L1 and the object space of lens L2. Similarly, the entrance pupil lies in the
(virtual) object space of lens L1, and the exit pupil lies in the (virtual) image space of lens
L2. Moreover, the entrance pupil lies in the (virtual) object space and the exit pupil lies in
the (virtual) image space of the two-lens system. An observer looking at the system from
P0 does not see the aperture stop AS but sees instead its image, the entrance pupil EnP,
formed by L1. Similarly, an observer looking at the system from P0′ also does not see the
aperture stop, but sees instead its image, the exit pupil ExP, formed by L2.

Figure 5-3. Imaging by a thin lens with an aperture stop AS behind the lens. (a) On-axis
imaging. (b) Off-axis imaging. The aperture stop is also the exit pupil ExP, and
its image by the lens is the entrance pupil EnP. The cone angle of the ray bundle
diverging from a point object appears to be limited by the entrance pupil, and the
cone of the ray bundle converging to the corresponding image point is limited by the
exit pupil. The chief ray CR passes through the center of the aperture stop and
appears to pass through the center of the entrance pupil.
The aperture stop of a multielement imaging system may be determined by forming
the image of each element and aperture by the imaging elements that precede it and
determining the smallest image as seen from the axial point of an object. The smallest
image is the entrance pupil of the system, and the corresponding element or aperture is its
aperture stop. The image of the entrance pupil by the whole system or, equivalently, the
image of the aperture stop by the imaging elements that follow it, is the exit pupil of the
system. Alternatively, the aperture stop may be determined directly by tracing a ray from
the axial point object and calculating the ratio of the ray height to the radius
(semidiameter) of each element and aperture in the system. The element or aperture with
the highest ratio is the aperture stop. If the angle of the chosen ray is increased, each ratio
increases by a proportional amount until it reaches a value of unity for the aperture stop.
Any further increase in the angle of the ray will lead to its vignetting by the aperture stop.

Figure 5-4. (a) Imaging of an on-axis point object P0 by an optical imaging system
consisting of two lenses L1 and L2. OA is the optical axis. The Gaussian image is at
P0′. AS is the aperture stop; its image by L1 is the entrance pupil EnP, and its image
by L2 is the exit pupil ExP. CR0 is the axial chief ray, and MR0 is an axial marginal
ray. (b) Imaging of an off-axis point object P. The Gaussian image is at P′. CR is the
off-axis chief ray, and MR is an off-axis marginal ray.
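The ray-height-to-radius procedure described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the elements are idealized
as thin lenses and plain apertures in air, the ray is traced paraxially, and the function
name `find_aperture_stop` and the layout values are hypothetical.

```python
def find_aperture_stop(z_object, elements, slope=0.01):
    """Return (name, ratios), where name is the element with the largest |y|/radius.

    elements: list of (name, z, radius, focal_length) ordered along the axis;
              focal_length is None for a plain aperture.
    A paraxial ray from the axial object point is traced through thin elements.
    """
    y, u, z_prev = 0.0, slope, z_object
    ratios = {}
    for name, z, radius, focal_length in elements:
        y += u * (z - z_prev)            # transfer to the element
        ratios[name] = abs(y) / radius   # ray height relative to the semidiameter
        if focal_length is not None:     # paraxial thin-lens refraction
            u -= y / focal_length
        z_prev = z
    stop = max(ratios, key=ratios.get)
    return stop, ratios


# Hypothetical two-lens system with an aperture A between the lenses.
system = [
    ("L1", 0.0, 20.0, 100.0),
    ("A", 30.0, 5.0, None),
    ("L2", 60.0, 20.0, 100.0),
]
trial_slope = 0.01
stop_name, height_ratios = find_aperture_stop(-200.0, system, trial_slope)

# Scaling the trial ray until its ratio at the stop reaches unity gives the marginal ray.
marginal_slope = trial_slope / height_ratios[stop_name]
print(stop_name, height_ratios, marginal_slope)
```

Because every ratio scales linearly with the slope of the trial ray, any small trial slope
identifies the same element; only the final rescaling fixes the marginal ray itself.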
Which element of a system acts as its aperture stop depends on the location of the
object. As a simple example, the aperture A in Figure 5-5 is the aperture stop for objects
such as PA lying to the left of P, where P is the point of intersection of the line joining the
upper edges of A and the lens L with the optical axis. However, the lens itself acts as the
aperture stop for objects, such as PL, lying to the right of P.
5.2.3 Chief and Marginal Rays

An object ray passing through the center of the aperture stop and passing, or
appearing to pass, through the centers of the entrance and exit pupils is called a chief (or
the principal) ray (CR). An object ray passing through the edge of the aperture stop and
passing, or appearing to pass, through the edges of the entrance and exit pupils is called a
marginal ray (MR). The rays lying between the center and the edge of the aperture, and
therefore appearing to lie between the center and edge of the entrance and exit pupils, are
called zonal rays. Given the location O and the radius a of the entrance pupil EnP, the
axial marginal ray P0A···A′P0′, indicated by the ray MR0 in Figure 5-6, determines the
radius a′ of the exit pupil and the location of the axial image point P0′. Similarly, the
chief ray PO···O′P′ from the edge of the object at a height h determines the location O′
of the exit pupil and the height h′ of the image P′. The two rays can be traced by using
the procedure discussed in Chapter 4. The slope and height of any other ray in a certain
space can be obtained from the slopes and heights of these rays in that space, as discussed
in Section 4.10.
Figure 5-5. Change of aperture stop with object location.

Figure 5-6. Schematic diagram of a system and its entrance and exit pupils EnP and
ExP, respectively, showing the marginal ray P0A···A′P0′ from the axial point object
P0 and the chief ray PO···O′P′ from the off-axis point object P.

Using the Lagrange invariant equation (2-74), we find that the angles of the marginal
and chief rays are related to each other according to

$$ n h \beta_0 = -n a \theta = -n' a' \theta' = n' h' \beta_0' , \tag{5-1} $$

where n and n′ are the refractive indices of the object and image spaces, h and h′ are the
object and image heights, θ and θ′ are the chief ray angles (both numerically negative)
in the object and image spaces, β0 and β0′ are the corresponding angles of the axial
marginal ray, and a and a′ are the radii of the entrance and exit pupils,
respectively. Moreover, we have used the fact that the object and image distances from
the entrance and exit pupils, respectively, are given by

$$ L_o = h/\theta = -a/\beta_0 \tag{5-2a} $$

and

$$ L_i = h'/\theta' = -a'/\beta_0' . \tag{5-2b} $$

The quantities in Eq. (5-1) represent, from left to right, the two-ray Lagrange invariant
(discussed in Section 4.10) in the planes of the object, entrance pupil, exit pupil, and the
image.
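As a short check, the pupil-plane forms of the invariant follow from the pupil distances
by eliminating Lo and Li. The sketch below writes the axial marginal ray angles in the
object and image spaces as β0 and β0′ (positive and negative, respectively, for the geometry
of Figure 5-6), so that h = Lo θ and β0 = −a/Lo, and similarly in image space:

```latex
\begin{align*}
n h \beta_0 &= n \,(L_o \theta)\left(-\frac{a}{L_o}\right) = -n a \theta , \\
n' h' \beta_0' &= n' \,(L_i \theta')\left(-\frac{a'}{L_i}\right) = -n' a' \theta' .
\end{align*}
```

Equating the two right-hand sides gives n a θ = n′ a′ θ′, so that for equal refractive
indices the ratio θ/θ′ of the chief ray angles equals the pupil magnification a′/a.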
5.2.4 Vignetting
The amount of light in the image of a point object depends on the size and location of
the aperture stop or, equivalently, the entrance pupil of the imaging system. However, not
all of the rays transmitted by one element of the system are transmitted by another. Figure
5-7 illustrates vignetting of rays. The rays in the shaded region are transmitted by the
aperture stop AS, but they are missed by the lens L and are said to be vignetted.
The vignetting of rays from an off-axis point object by a multielement system may
be determined by projecting the images of all elements and apertures (by the preceding
elements) on the entrance pupil using the point object as the center of projection. The
common area of these projections represents the effective entrance pupil of the system for
the point object under consideration. Its images formed by the elements that precede it
and by the entire system are the effective aperture stop and the effective exit pupil of the
system. An alternative but equally valid approach to determining the vignetting of rays is
to project the images of all elements on the exit pupil using the Gaussian image point as
the center of projection. The common area of these projections on the exit pupil
represents the effective exit pupil. The images of the common area by the elements that
follow it (looking at them from the image point) and by the entire system are the effective
aperture stop and the effective entrance pupil, respectively.

Figure 5-7. Vignetting of rays. Rays from an off-axis point object P in the shaded
region are transmitted by the aperture stop AS but vignetted by the lens L.
In Figure 5-4a, the lenses are quite large compared with the aperture stop; therefore,
they do not in any way limit the ray bundle from the object point P0 transmitted by the
system. AS is indeed the aperture stop because it limits the ray bundle. Similarly, we note
from Figure 5-4b that, for any point on the object P0P, there is no vignetting of the
aperture stop, i.e., any ray that is not blocked by the aperture stop is also not blocked by
either of the two lenses. Thus, for a circular aperture stop, the entrance and exit pupils are
also circular. We note that the cone of light rays from an axial point object illuminates the
lenses symmetrically, but the one from the off-axis point object illuminates them
eccentrically. We also note that different portions of the lenses are used for different point
objects. The same region of an imaging element is used for different point objects only
when the aperture stop is located at the element.
However, consider Figure 5-8a, which also shows a system consisting of two lenses
L1 and L2 with an aperture A placed between them. The images of A and L2 by L1 are
indicated as A′ and L2′, respectively. An observer in the object space sees L1, A′, and
L2′, but not A and L2. We note that A is the aperture stop of the system for only those
objects that have their axial points lying between P1 and P2, where P1 and P2 are the
points of intersection of the lines joining the upper edges of L1 and A′, and A′ and L2′,
respectively, with the optical axis. For these objects, A′ subtends the smallest angle (at
an axial point) among L1, A′, and L2′. It is, therefore, the entrance pupil of the system.
For objects lying to the left of P1, L1 subtends the smallest angle. Thus, it (L1) is the
aperture stop of the system for such objects, in which case it is also the entrance pupil of
the system. For objects lying to the right of P2, L2′ subtends the smallest angle.
Therefore, for these objects, L2 is the aperture stop and the exit pupil of the system, and
L2′ is its entrance pupil.
To illustrate vignetting, we consider an object such as P0P. It is evident from the
foregoing that, as indicated in Figure 5-8b, A is the aperture stop AS, and A′ is the
entrance pupil EnP. For the axial point object P0, the projections of L1 and L2′ on the
entrance pupil are indicated in the figure and illustrated on its right-hand side. It is
evident that EnP is smaller than the projections of L1 and L2′, and there is no vignetting,
as expected. As stated earlier, for a circular aperture stop, the entrance and exit pupils are
also circular.
Figure 5-8. Aperture stop of a system and its vignetting. A′ and L2′ are the images of
A and L2 by L1. (a) Determination of the aperture stop. (b) Diagram showing no
vignetting for an on-axis point object P0. (c) Vignetting diagram for an off-axis
point object P. The circles on the right-hand side of the figure show projections of
L1 and L2′ on EnP with the point object under consideration as the center of
projection.
Figure 5-8c shows the projections of L1 and L2′ on EnP as viewed from an off-axis
point object P. These projections, illustrated as eccentric circles on the right-hand side of
the figure, are shown to be circular only as an approximation of the actual ellipses. The
ray bundle originating at P and transmitted by the system is shown shaded in the figure. It
is clear that the upper marginal ray (sometimes called the upper rim ray) is limited by L2,
and the lower marginal ray (sometimes called the lower rim ray) is limited by L1; i.e., the
upper portion of the ray bundle from P is blocked by L2, and its lower portion is blocked
by L1. Thus, there is vignetting of the aperture stop, and the effective aperture stop and
the corresponding entrance and exit pupils are no longer circular. The shape of the
effective entrance pupil is shown shaded in the figure as the region of EnP that is
common with the projections of L1 and L2′ on it. Its Gaussian images by L1 and L2 give
the shapes of the effective aperture stop and exit pupil, respectively. The consequence of
the variation of the shape of the entrance pupil with the location of point object P lies not
only in the loss of light in its image but also in the distribution of the image light (because
it depends on the shape of the pupil). Diagrams such as those shown on the right-hand
side of Figures 5-8b and 5-8c, illustrating the shape of the pupil for a certain point object,
are called vignetting diagrams.
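The projection construction behind a vignetting diagram can be imitated numerically. The
following Python sketch is an illustration only: it treats the object-space images of the
elements (L1 and L2′ in the notation above) as on-axis circles, projects them onto the
entrance pupil plane from the point object, approximates the projections as circles, and
estimates the unvignetted fraction of the pupil on a sampling grid. The function names and
all layout values are hypothetical.

```python
def project_circle(z_obj, h_obj, z_circle, radius, z_pupil):
    """Project an on-axis circle at z_circle onto the plane z_pupil, using the
    point (z_obj, h_obj) as the center of projection.
    Returns (center_height, radius) of the projected circle."""
    k = (z_pupil - z_obj) / (z_circle - z_obj)
    return h_obj * (1.0 - k), abs(k) * radius


def unvignetted_fraction(z_obj, h_obj, z_pupil, pupil_radius, circles, n=200):
    """Fraction of the entrance pupil area common to all projected circles,
    estimated on an n-by-n grid of sample points."""
    projected = [project_circle(z_obj, h_obj, z, r, z_pupil) for z, r in circles]
    inside = total = 0
    for i in range(n):
        x = (2.0 * (i + 0.5) / n - 1.0) * pupil_radius
        for j in range(n):
            y = (2.0 * (j + 0.5) / n - 1.0) * pupil_radius
            if x * x + y * y > pupil_radius ** 2:
                continue  # sample lies outside the entrance pupil
            total += 1
            if all(x * x + (y - c) ** 2 <= r * r for c, r in projected):
                inside += 1
    return inside / total


# Hypothetical object-space layout: L1 at z = 0, EnP at z = 40, image of L2 at z = 90.
z_object, z_enp, enp_radius = -200.0, 40.0, 8.0
element_images = [(0.0, 12.0), (90.0, 14.0)]   # (z, radius) of L1 and of L2's image

on_axis = unvignetted_fraction(z_object, 0.0, z_enp, enp_radius, element_images)
off_axis = unvignetted_fraction(z_object, -40.0, z_enp, enp_radius, element_images)
print(on_axis, off_axis)
```

In this layout the on-axis pupil is fully used, whereas for the off-axis point the projected
circle of L1 cuts one side of the pupil and that of the image of L2 cuts the opposite side,
which is the qualitative behavior shown in a vignetting diagram.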
5.2.5 Size of an Imaging Element

To avoid vignetting for a certain field of view, the size of an imaging element in a
system, e.g., the diameter of a lens or a mirror, can be determined by tracing the marginal
ray from a point on the edge of the object and making the size of the element large
enough that this ray is not obstructed by it. The approximate size of an element can be
obtained by adding the magnitudes of the heights of the chief ray from an edge point
object and the marginal ray from the axial point object. This is because the angle between
the chief and marginal rays is approximately independent of the location of the point
object in the object plane. Thus, we do not have to trace the marginal ray from the edge of
the object. For example, in Figure 5-4, the radius of lens L1 required to avoid vignetting
of rays from the point object P is A1C1 or A1 B1 + B1C1 . However, B1C1 is
approximately equal to A01 B01 . Thus, the lens radius is given by the sum of the
magnitudes of the heights of the axial marginal ray and the edge chief ray on the lens.
Similarly, the radius of lens L2 is given by A2 C2 or A2 B2 + B2 C2 , where B2 C2 is
approximately equal to A02 B02 .
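The sizing rule can be checked with a simple straight-line trace. The Python sketch below
assumes a single thin lens with the aperture stop in front of it, so that both rays travel
in object space as straight lines up to the lens. The helper `height_at` and the numerical
values are hypothetical.

```python
def height_at(z, point, through):
    """Height at plane z of the straight line through two (z, y) points."""
    (z1, y1), (z2, y2) = point, through
    return y1 + (y2 - y1) * (z - z1) / (z2 - z1)


z_obj, h = -200.0, 10.0     # object plane and height of the edge point
z_stop, a = -50.0, 5.0      # aperture stop (also the entrance pupil) and its radius
z_lens = 0.0

# Heights on the lens of the axial marginal ray and the edge chief ray.
y_marginal_axial = height_at(z_lens, (z_obj, 0.0), (z_stop, a))
y_chief_edge = height_at(z_lens, (z_obj, h), (z_stop, 0.0))
estimated_radius = abs(y_marginal_axial) + abs(y_chief_edge)

# Direct trace of the two marginal rays from the edge point, for comparison.
y_upper = height_at(z_lens, (z_obj, h), (z_stop, a))
y_lower = height_at(z_lens, (z_obj, h), (z_stop, -a))
required_radius = max(abs(y_upper), abs(y_lower))

print(estimated_radius, required_radius)
```

In this straight-line (paraxial) model the two values coincide, because the height of an
off-axis marginal ray is the sum of the heights of the chief ray and the axial marginal ray;
for real rays the agreement is only approximate, as stated in the text.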
5.2.6 Telecentric Aperture Stop

If the aperture stop lies in the front focal plane of a system, as in Figure 5-9a, then
the exit pupil lies at infinity. Considering the front focal point F as an object, any ray
passing through it will emerge from the system parallel to its optical axis. Accordingly,
any chief ray in the image space lies parallel to the axis. Such a system is said to be
telecentric on the image side. Similarly, if the aperture stop lies in the back focal plane,
then the entrance pupil lies at infinity. In this case, any chief ray in the object space lies
parallel to the optical axis, and the system is said to be telecentric on the object side. If
the system is afocal (i.e., one that forms the image at infinity of an object at infinity, as
discussed in Section 2.5) and if the aperture stop is placed in an intermediate focal plane,
then both the entrance and exit pupils lie at infinity, and the system is said to be
telecentric on both object and image sides. However, a system cannot be telecentric on
the object side if the object lies at infinity because then the aperture stop will lie in the
image plane where it cannot control the cross-section of the focused beams. A telecentric
stop on the image side, for example, has the advantage that the size or the shape of an
image is insensitive to small focus errors, as may be seen from Figure 5-9. In Figure 5-9a,
the height of the image center does not change with defocus, i.e., P′ and P″ are at the
same height. However, in Figure 5-9b, where the aperture stop does not lie in the front
focal plane, a small defocus changes the height of the image center, as may be seen from
the fact that P″ is at a slightly larger height than P′, i.e., h″ > h′.
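The insensitivity to defocus can be illustrated with a paraxial chief ray trace. The Python
sketch below assumes a single thin lens in air at z = 0 with the aperture stop in front of
it; the function name `chief_ray_image_slope` and the numbers are hypothetical.

```python
def chief_ray_image_slope(z_stop, f, z_obj, h):
    """Image-space slope of the chief ray for a thin lens at z = 0 in air.

    The chief ray leaves the object point (z_obj, h) and passes through the
    center of the aperture stop at (z_stop, 0) in front of the lens.
    """
    u = (0.0 - h) / (z_stop - z_obj)   # object-space slope
    y_lens = h + u * (0.0 - z_obj)     # ray height at the lens
    return u - y_lens / f              # paraxial thin-lens refraction


f, z_obj, h = 100.0, -300.0, -10.0
slope_telecentric = chief_ray_image_slope(-f, f, z_obj, h)       # stop in front focal plane
slope_nontelecentric = chief_ray_image_slope(-50.0, f, z_obj, h)

# A focus error dz shifts the center of the image spot by (slope * dz).
dz = 1.0
print(slope_telecentric * dz, slope_nontelecentric * dz)
```

With the stop in the front focal plane, the chief ray emerges parallel to the axis, so the
shift of the image center with defocus vanishes; with the stop elsewhere, the shift is
proportional to the defocus.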
5.2.7 Field Stop, and Entrance and Exit Windows

Whereas the aperture stop limits the cone angle of the rays from a point on an object
transmitted by the system, there is another stop, called the field stop, which limits the
cone angle of the transmitted chief rays from the object, as illustrated in Figure 5-10. The
image of the field stop by the imaging elements that precede it is called the entrance
window EnW, and its image by the elements that follow it is called the exit window ExW.
The field stop is placed at a real image of the object. The image may be an intermediate
or the final one. Accordingly, the entrance and exit windows lie in the object and image
planes, respectively. The entrance window defines the object field that is actually imaged.
Simple examples of field stops are the rectangular diaphragm or the plate holder for the
film in a camera or for a slide in a slide projector. The field stop of a system is
determined by finding the image of each aperture and element by the imaging elements
that precede it and determining the image that subtends the smallest angle at the center of
the entrance pupil. This image is the entrance window, and the physical stop
corresponding to it is the field stop. The field stop may also be determined by tracing a
chief ray from a certain off-axis point object and calculating the ratio of the ray height to
the radius of each element and aperture in the system. The element with the highest ratio is
the field stop.

Figure 5-9. (a) Telecentric aperture stop on the image side. (b) Nontelecentric
aperture stop. A dotted line shown within the system here and in Figure 5-10 does
not represent a ray but merely a line joining its points of incidence on and
emergence from the system. A small focus error does not change the height of the
image center in (a), but it does in (b).

Figure 5-10. Field stop, entrance and exit windows, and field of view of a system.
The field stop is assumed to lie at an intermediate image of the object. The dotted
line is not a ray but a mere illustration that the angle θo is limited by the field stop.
The angle θo subtended by the entrance window at the center of the entrance pupil
defines the angular field of view of the system in object space. Similarly, the angle θi
subtended by the exit window at the center of the exit pupil is the angular field of view of
the system in image space. According to Eq. (5-1), their ratio θo/θi is equal to the
magnification of the exit pupil when the refractive indices of the object and image spaces
are equal.
It should be noted that, whereas the position and the size of the aperture stop
determine the quality and the amount of light in the final image (by virtue of blocking
rays with large aberrations), the field stop determines only the portion of the object that is
imaged. Additional stops and baffles are placed in optical systems to block stray light
from reaching the final image area. An example is a stop called a Lyot stop (or a cold stop
when used in an infrared system) placed at a real image of the aperture stop.
5.3 RADIOMETRY OF POINT OBJECT IMAGING

In this section, we discuss the radiometry of point object imaging. The flux incident
on a surface by a point source is calculated, and the inverse-square law of irradiance is
derived. We also obtain an expression for the intensity of the Gaussian image of a point
object in terms of the object intensity.
5.3.1 Flux Received by an Aperture

If a point source radiates a flux dF (in watts, or W) in a certain direction into a solid
angle dΩ, then its intensity I (in watts/steradian, or W/sr) in that direction is given by

$$ I = \frac{dF}{d\Omega} . \tag{5-3} $$

If a flux dF from a point source irradiates a surface element of area dS, the flux incident
on the surface per unit area is called the irradiance E (in watts/square meter, or W/m²) of
the surface. Thus, the irradiance of the surface is given by

$$ E = \frac{dF}{dS} . \tag{5-4} $$
Now we determine the flux incident on a circular aperture of radius a from a point
source P of intensity I lying at a distance R on its axis (see Figure 5-11). Consider an
annular element of radius r and width dr making an angle θ = cos⁻¹(R/d) with the axis.
Its area is given by dS = 2πr dr, while its projected area perpendicular to the line joining
it and the point source is given by dS cos θ. The solid angle dΩ subtended by it at the
point source is given by dS cos θ/d², where d is the distance between the two.
Accordingly, the flux incident on it is given by

\begin{align*}
dF &= I\, d\Omega \\
   &= \frac{I\, dS \cos^3\theta}{R^2} \\
   &= I R\, \frac{2\pi r\, dr}{(R^2 + r^2)^{3/2}} .
\end{align*}

Integrating over r from 0 to a, we obtain the total flux incident on the aperture, i.e.,

\begin{align}
F &= \int dF \notag \\
  &= 2\pi I R \int_0^a \frac{r\, dr}{(R^2 + r^2)^{3/2}} \tag{5-5a} \\
  &= 2\pi I R \left[ \frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}} \right] \notag \\
  &= I\Omega , \tag{5-5b}
\end{align}

where

\begin{align}
\Omega &= 2\pi R \left[ \frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}} \right] \notag \\
       &= 2\pi (1 - \cos\alpha) \notag \\
       &= 4\pi \sin^2(\alpha/2) \tag{5-5c}
\end{align}

is the solid angle, and α is the semiangle subtended by the aperture at the point source.

Figure 5-11. Point source irradiating an aperture.
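The closed-form result can be checked numerically. The following Python sketch sums the
annular contributions I R 2πr dr/(R² + r²)^(3/2) with a midpoint rule and compares the sum
with the closed form 2πIR[1/R − (R² + a²)^(−1/2)] and with I·2π(1 − cos α), where α is the
semiangle subtended by the aperture. The function names and numerical values are
hypothetical.

```python
import math


def flux_closed_form(intensity, R, a):
    """Flux on an aperture of radius a at an axial distance R from a point source."""
    return 2.0 * math.pi * intensity * R * (1.0 / R - 1.0 / math.sqrt(R * R + a * a))


def flux_numerical(intensity, R, a, n=20000):
    """Midpoint-rule sum of the flux on annular elements of radius r and width dr."""
    dr = a / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += intensity * R * 2.0 * math.pi * r * dr / (R * R + r * r) ** 1.5
    return total


intensity, R, a = 1.0, 2.0, 1.0          # hypothetical values
alpha = math.atan(a / R)                 # semiangle subtended by the aperture
flux_exact = flux_closed_form(intensity, R, a)
flux_summed = flux_numerical(intensity, R, a)
flux_from_angle = intensity * 2.0 * math.pi * (1.0 - math.cos(alpha))
flux_from_half_angle = intensity * 4.0 * math.pi * math.sin(alpha / 2.0) ** 2

print(flux_exact, flux_summed, flux_from_angle, flux_from_half_angle)
```

All four values agree to within the accuracy of the numerical sum, which confirms that the
three forms of the solid angle are equivalent.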
5.3.2 Inverse-Square Law of Irradiance

For a small aperture, i.e., for a << R, the solid angle given by Eq. (5-5c) reduces to S/R²,
where S = πa² is the area of the aperture. In that case, Eq. (5-5b) reduces to

$$ F = \frac{\pi a^2 I}{R^2} = \frac{I S}{R^2} , \tag{5-6a} $$

and

$$ \Omega = S/R^2 = \pi \alpha^2 . \tag{5-6b} $$

Thus, for a distant point source, the solid angle
subtended by the aperture is simply its area divided by the square of its distance from the
point source, and I/R² is the uniform irradiance on the aperture. Equation (5-6a)
represents the inverse-square law of irradiance; namely, the irradiance of a surface by a
point source lying on its surface normal is inversely proportional to the square of its
distance from the radiating source. Figure 5-12 shows how the flux varies with R a.
Comparing curve (a) with (b), it shows that the exact value given by Eq. (5-5a) is smaller
1.8
1.6
1.4
1.2
F/pI
1
(b)
(a)
0.8
0.6
0.4
0.2
0
0 1 2 3 4 5
R/a
Figure 5-12. Flux collected from a point source of intensity I by an aperture of

radius a lying at a distance R. Curve (b) represents Eq. (5-6a).
than the approximate value given by Eq. (5-6a), and that the two are practically equal to
each other for R a ≥ 5. The difference between the two is less than 3% when R = 5 .
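The comparison between the exact and the inverse-square results can be reproduced numerically. The sketch below is illustrative only (the function names are assumptions); it evaluates Eqs. (5-5b) and (5-6a) for unit intensity and unit aperture radius, and it should show a relative difference just under 3% at $R/a = 5$.

```python
import math

def flux_exact(I, a, R):
    """Flux on the aperture from Eq. (5-5b): I times the exact solid angle."""
    return 2.0 * math.pi * I * (1.0 - R / math.hypot(R, a))

def flux_inverse_square(I, a, R):
    """Small-aperture flux from Eq. (5-6a): I S / R^2 with S = pi a^2."""
    return I * math.pi * a * a / (R * R)

for ratio in (1, 2, 5, 10):
    exact = flux_exact(1.0, 1.0, ratio)
    approx = flux_inverse_square(1.0, 1.0, ratio)
    # Relative difference, normalized by the inverse-square value.
    print(f"R/a = {ratio:2d}: relative difference = {(approx - exact) / approx:.4f}")
```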
5.3.3 Image Intensity

Now we discuss the radiometry of point object imaging, i.e., determination of the
intensity of the image point in terms of the intensity of the object point and the
parameters of the system. Consider, as indicated in Figure 5-13, a point object $P$ lying in
a plane at a distance $L_o$ from the entrance pupil of an optical imaging system. Its
Gaussian image lies at $P'$ in a plane at a distance $L_i$ from the exit pupil of the system. If
the object is a uniform point source of intensity $I_o$, the flux incident on the entrance
pupil of area $S_{en}$ is given by

$$
\Phi = I_o\,\Omega(P) , \tag{5-7}
$$

where

\begin{align*}
\Omega(P) &= \frac{S_{en}\cos\theta}{(PO)^2} \\
&= \left(S_{en}/L_o^2\right)\cos^3\theta \tag{5-8}
\end{align*}
is the solid angle subtended by the entrance pupil at the point object $P$. Here, $PO$ is the
distance between the point object $P$ and the center $O$ of the entrance pupil, and $\theta$ is the
angle the chief ray makes with the optical axis of the system in object space. It is assumed
here that the dimensions of the entrance pupil are small enough compared to its distance
from the object plane that the variation of the angle $\theta$ with the location of an area
element on the pupil can be neglected and, therefore, integration across the pupil is not
required. Equation (5-7) may also be written

$$
\Phi = I_o\,\Omega(P_0)\cos^3\theta , \tag{5-9}
$$

where

$$
\Omega(P_0) = S_{en}/L_o^2 \tag{5-10}
$$

is the solid angle subtended by the entrance pupil at the axial point object $P_0$. If we divide
the flux $\Phi$ by the area $S_{en}$, we find that the irradiance of the pupil is proportional to
$\cos^3\theta$, and thus obtain the cosine-third power law of irradiance of a surface by a point
source.

Figure 5-13. Radiometry of point object imaging. A point object $P$ lies in the object
plane at a distance $L_o$ from the entrance pupil EnP of the system. Its Gaussian
image $P'$ lies in the image plane at a distance $L_i$ from the exit pupil ExP of the
system. The chief ray CR makes an angle $\theta$ in the object space and $\theta'$ in the image
space of the system.
In the absence of any transmission losses in the system, the flux $\Phi$ emerges from the
exit pupil and focuses on the image point $P'$. If $I_i$ is the intensity of the image point,
then the flux emerging from the exit pupil is given by

$$
\Phi' = I_i\,\Omega'(P') , \tag{5-11}
$$

where

\begin{align*}
\Omega'(P') &= \frac{S_{ex}\cos\theta'}{(O'P')^2} \\
&= \left(S_{ex}/L_i^2\right)\cos^3\theta' \tag{5-12}
\end{align*}

is the solid angle subtended by the exit pupil at the image point. Here, $O'P'$ is the
distance between the center $O'$ of the exit pupil and the image point $P'$, $S_{ex}$ is the area of
the exit pupil, and $\theta'$ is the angle the chief ray makes with the optical axis in image
space. Equation (5-12) may also be written

$$
\Omega'(P') = \Omega'(P_0')\cos^3\theta' , \tag{5-13}
$$

where

$$
\Omega'(P_0') = S_{ex}/L_i^2 \tag{5-14}
$$

is the solid angle subtended by the exit pupil at the axial image point $P_0'$. As in the case
of the entrance pupil, the dimensions of the exit pupil are assumed to be small enough
compared with its distance from the image plane that the variation of the angle $\theta'$ with
the location of an area element on the exit pupil can be neglected, and therefore
integration across the pupil is not required. Thus, Eq. (5-11) may be written

$$
\Phi' = I_i\,\Omega'(P_0')\cos^3\theta' . \tag{5-15}
$$
Due to conservation of energy, $\Phi = \Phi'$. Therefore, by equating the right-hand sides
of Eqs. (5-9) and (5-15), we obtain the intensity of the image point:

$$
I_i = \frac{I_o\,\Omega(P_0)\cos^3\theta}{\Omega'(P_0')\cos^3\theta'} . \tag{5-16}
$$

It should be noted that the image point is a uniform point source only within the solid
angle $\Omega'(P')$ because (according to geometrical optics) there is no radiation outside it.
For an axial point object, both $\theta$ and $\theta'$ approach zero, and Eq. (5-16) reduces to

$$
I_i = \frac{I_o\,\Omega(P_0)}{\Omega'(P_0')} , \tag{5-17}
$$

which may also be written

$$
I_i = \frac{I_o F_{ex}^2}{F_{en}^2} , \tag{5-18}
$$

where $F_{en} = L_o/D_{en}$ and $F_{ex} = L_i/D_{ex}$ are the focal ratios of the optical beams entering
and exiting from the system, and $D_{en}$ and $D_{ex}$ are the diameters of the entrance and exit
pupils.
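As an illustration, Eq. (5-16) can be evaluated with the pupil solid angles expressed through the focal ratios, as in Eq. (5-18). This is a minimal sketch that assumes a lossless system and circular pupils; the function name and argument names are hypothetical.

```python
import math

def image_intensity(I_o, F_en, F_ex, theta=0.0, theta_prime=0.0):
    """Image-point intensity from Eq. (5-16), with Omega(P0)/Omega'(P0')
    replaced by (F_ex/F_en)^2 as in Eq. (5-18). Angles are in radians."""
    ratio = (F_ex / F_en) ** 2
    return I_o * ratio * (math.cos(theta) / math.cos(theta_prime)) ** 3

# Axial point object: a beam entering at f/10 and exiting at f/5
# gives an image intensity of one quarter of the object intensity.
print(image_intensity(1.0, 10.0, 5.0))  # 0.25
```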
5.4 RADIOMETRY OF EXTENDED OBJECT IMAGING

5.4.1 Introduction
In this section, we define a Lambertian surface and determine the irradiance due to a
Lambertian disc. We derive an invariant property of the radiance of rays when they are
refracted or reflected. This property is used to obtain the irradiance of the image of a
Lambertian object formed by an optical system. We show that the irradiance in the image
plane decreases as the fourth power of the cosine of the chief ray angle.
The flux from an object element incident on the system and transmitted by it to the
image element can be calculated in two different ways, depending on the location of the
aperture stop. If the aperture stop lies in the object space so that it is also the entrance
pupil, then the flux entering the system can be calculated by integrating across the
entrance pupil. If the aperture stop lies in the image space so that it is also the exit pupil,
then the image flux is obtained by integrating across the exit pupil. Both of these
approaches are illustrated. If, however, the aperture stop lies somewhere inside the
system, then the integration may be performed across the entrance or the exit pupil. The
region of integration depends on the shape of the pupil, which may or may not be the
same as that of the aperture stop due to vignetting or distortion of the pupil image. The
pupil shape, which is system specific, may be determined by tracing a bundle of rays.
5.4.2 Lambertian Surface

Unless an irradiated surface is highly polished, it will reflect or radiate like a
self-radiating object over a wide range of directions. Its intensity at any point $P$ (see
Figure 5-14) per unit projected area in a certain direction is called its radiance (in watts/square
meter steradian, or W/m$^2$ sr) at that point in the direction under consideration. Thus, if an
area element $dS$ has an intensity $dI$ in a direction inclined to its normal at an angle $\theta$, the
projected area is $dS\cos\theta$, and its radiance $B$ is given by

$$
B = \frac{dI}{dS\cos\theta} . \tag{5-19}
$$

Generally, the radiance of self-radiating and reradiating surfaces does not vary
strongly with the direction of radiation. According to Eq. (5-19), the radiance of a surface
element is independent of the direction of radiation if its intensity is proportional to $\cos\theta$.
Such a surface is said to obey Lambert’s cosine law of intensity. A surface that radiates
uniformly in all directions is called a Lambertian surface or a uniform diffuser, depending
on whether it is self-radiating or reradiating. The sun is a spherical blackbody radiating
uniformly in all directions and therefore appears as a uniform disc. A laser beam, on the
other hand, is highly directional and obviously not a Lambertian source of radiation.

Figure 5-14. Radiance of a Lambertian surface.
5.4.3 Illumination by a Lambertian Disc

Consider a surface element of area $dS_1$ and radiance $B$ irradiating a surface element
$dS_2$ at a distance $R$ such that the normals to the two surface elements make angles $\theta_1$ and
$\theta_2$ with the line joining their centers, as illustrated in Figure 5-15. The intensity of $dS_1$ in
the direction of $dS_2$ is $B\,dS_1\cos\theta_1$. It represents the flux radiated by $dS_1$ per unit solid
angle in the direction of $dS_2$. The solid angle subtended by $dS_2$ at $dS_1$ is given by
$dS_2\cos\theta_2/R^2$. Thus, the total flux received by $dS_2$ from $dS_1$ is given by

$$
d\Phi_2 = B\,dS_1\,dS_2\cos\theta_1\cos\theta_2/R^2 , \tag{5-20}
$$

and its irradiance is given by

\begin{align*}
dE &= \frac{d\Phi_2}{dS_2} \\
&= B\,dS_1\cos\theta_1\cos\theta_2/R^2 . \tag{5-21}
\end{align*}
We now determine the axial irradiance of a Lambertian disc of radius $a$ and radiance
$B$ (see Figure 5-16). Because of the axial symmetry of the disc, we consider an elemental
ring of radius $r$ and width $dr$. Its area is given by $dS_e = 2\pi r\,dr$. The flux radiated by this
ring per unit solid angle on a parallel elemental area $dS_r$ centered on the axis of the disc
at a distance $R$ is given by $B\,dS_e\cos\theta$, where $\theta$ is the angle the line joining a point on the
ring and the receiver makes with the axis. The solid angle subtended by the receiver area
$dS_r$ at any point on the ring is given by $dS_r\cos\theta/d^2$ or $dS_r\cos^3\theta/R^2$, where
$d = R/\cos\theta = (r^2 + R^2)^{1/2}$ is the distance between a point on the ring and the receiver.
Thus, the flux incident on the receiver by the ring is given by $B\,dS_e\,dS_r\cos^4\theta/R^2$. Its
irradiance is accordingly given by

\begin{align*}
dE(0) &= B\,dS_e\cos^4\theta/R^2 \\
&= BR^2\,\frac{2\pi r\,dr}{(r^2 + R^2)^2} . \tag{5-22}
\end{align*}

Figure 5-15. Irradiance of a surface element by a Lambertian surface element.

Figure 5-16. Axial illumination at a distance $R$ by a Lambertian disc of radius $a$ and
radiance $B$.

Integrating from 0 to $a$, we obtain the axial irradiance due to the disc:

\begin{align*}
E(0) &= 2\pi BR^2 \int_0^a \frac{r\,dr}{(r^2 + R^2)^2} \\
&= \frac{BS}{a^2 + R^2} \tag{5-23a} \\
&= \pi B\sin^2\alpha , \tag{5-23b}
\end{align*}

where $S = \pi a^2$ is the area of the disc, and $\alpha$ is the semiangle subtended by the disc at the
point of observation. When $R \ll a$, $E(0) \to \pi B$. Thus, when the source is very large
compared with the distance of the receiver, the axial irradiance is independent of the
distance between the two.
At a large distance from the disc ($R \gg a$), Eq. (5-23a) reduces to

\begin{align*}
E(0) &= \pi B\tan^2\alpha \tag{5-24a} \\
&= B\,\Omega \tag{5-24b} \\
&= I/R^2 , \tag{5-24c}
\end{align*}

where $\Omega = S/R^2$ is the solid angle subtended by the disc at the receiver, and $I = BS$ is
the intensity of the disc along its axis. Thus, the irradiance due to the disc at large
distances is equal to the product of its radiance and the solid angle subtended by it at the
distant receiver. The disc also behaves like a point source of intensity $BS$ at large
distances, as expected. The difference between the actual value and that given by the
point-source approximation is less than 4% if $R \ge 5a$. The difference becomes less than
1% when $R \ge 10a$. Note that $\sin\alpha \sim \tan\alpha \sim \alpha$ for $R \ge 5a$. Figure 5-17 shows that Eq.
(5-24) approximates the exact Eq. (5-23) very well for $R/a \ge 5$.

Figure 5-17. Axial irradiance at a distance $R$ by a Lambertian disc of radius $a$ and
radiance $B$. The right-hand vertical scale is for $R/a \ge 5$. [Plot of $E(0)/\pi B$ versus $R/a$,
with curves $(R/a)^{-2}$ and $[1 + (R/a)^2]^{-1}$.]

For a distant off-axis receiver making an angle $\delta$ with the disc axis (see Figure 5-18),
the irradiance is given by

\begin{align*}
E(\delta) &= E(0)\cos^4\delta \tag{5-25a} \\
&= \pi B\tan^2\alpha\,\cos^4\delta , \tag{5-25b}
\end{align*}

where one factor of $\cos\delta$ arises from the projected area of the disc along the line joining
its center and the point of observation, another due to the projected area of the receiver,
and $\cos^2\delta$ due to the increase in distance between the disc and the receiver. When $R$ is
not much greater than $a$, the actual irradiance values are higher than those predicted by
Eq. (5-25a). Figure 5-19 shows how the irradiance decreases as the off-axis angle $\delta$
increases, especially for $\delta \gtrsim 10^\circ$.

Figure 5-18. Off-axis illumination at a large distance $R$ by a Lambertian disc of
radius $a$ and radiance $B$.
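The exact axial result and its point-source approximation can be compared numerically. The sketch below is illustrative (the function names are assumptions); it evaluates Eq. (5-23a) and Eqs. (5-24c) and (5-25a), and it should show a difference below 4% at $R = 5a$ and below 1% at $R = 10a$.

```python
import math

def disc_axial_irradiance(B, a, R):
    """Exact axial irradiance of a Lambertian disc, Eq. (5-23a): B S / (a^2 + R^2)."""
    return B * math.pi * a * a / (a * a + R * R)

def disc_far_field_irradiance(B, a, R, delta=0.0):
    """Point-source approximation, Eqs. (5-24c) and (5-25a):
    (B S / R^2) cos^4(delta), with delta in radians."""
    return B * math.pi * a * a / (R * R) * math.cos(delta) ** 4

for ratio in (1.0, 5.0, 10.0):
    exact = disc_axial_irradiance(1.0, 1.0, ratio)
    approx = disc_far_field_irradiance(1.0, 1.0, ratio)
    print(f"R/a = {ratio:4.1f}: relative difference = {(approx - exact) / approx:.4f}")

# Off-axis falloff of the far-field irradiance at 10, 30, and 60 degrees.
for deg in (10, 30, 60):
    print(deg, math.cos(math.radians(deg)) ** 4)
```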
5.4.4 Flux Received by an Aperture

In Section 5.3.1, we determined the flux received by an aperture from a point source.
Now we consider the same problem except that the point source is replaced by a small
Lambertian source. As illustrated in Figure 5-20, we consider a small Lambertian source
of area $dS$ and radiance $B$ centered on the axis of an aperture of radius $a$ at a distance $R$
from it. The flux radiated by the source into an annulus of radius $r$ and width $dr$ is given
by

$$
d\Phi = B\,dS\,(2\pi r\,dr)\cos^2\theta/d^2 , \tag{5-26}
$$

where we have used Eq. (5-20) with $dS_1 = dS$, $dS_2 = 2\pi r\,dr$, $\theta_1 = \theta_2 = \theta$, and $d$ is the
distance between the source and a point on the annulus. Letting $d = R/\cos\theta = (R^2 + r^2)^{1/2}$,
Eq. (5-26) may also be written

\begin{align*}
d\Phi &= B\,dS\,(2\pi r\,dr)\cos^4\theta/R^2 \\
&= \frac{B\,dS\,R^2\,(2\pi r\,dr)}{(R^2 + r^2)^2} . \tag{5-27}
\end{align*}

Figure 5-19. The $\cos^4\delta$ variation of irradiance at large distances, where the off-axis
angle $\delta$ is in degrees. [Plot of $E(\delta)/E(0)$ versus $\delta$.]

Figure 5-20. Flux incident on an aperture of radius $a$ by a Lambertian source of
area $dS$ at a distance $R$.
The total flux received by the aperture is obtained by integrating over $r$ from 0 to $a$:

\begin{align*}
\Phi &= 2\pi B\,dS\,R^2 \int_0^a \frac{r\,dr}{(r^2 + R^2)^2} \\
&= \frac{\pi B\,dS\,a^2}{a^2 + R^2} \\
&= \pi B\,dS\sin^2\alpha , \tag{5-28}
\end{align*}

where $\alpha$ is the semiangle subtended by the aperture at the source. Comparing Eq. (5-28)
with Eq. (5-23b), we note that the flux incident on an aperture of radius $a$ from a
Lambertian source of area $dS$ is exactly the same as the flux incident on an area $dS$ by a
Lambertian disc of radius $a$. It shows that the same amount of flux is transmitted from a
source to a receiver if their roles are interchanged. Comparing Eq. (5-28) with Eq. (5-5)
for the flux received from a point source, we note that the intensity $I$ of the point source
has been replaced by the radiance $B$ of the Lambertian source. The two expressions
differ because of the extra cosine factor in the projected area of the source.
The flux calculation may also be carried out in terms of the solid angle subtended by
the annulus. The flux incident on the annulus is given by $B\,dS\cos\theta\,d\Omega$, where

$$
d\Omega = 2\pi r\,dr\cos\theta/d^2 \tag{5-29}
$$

is the solid angle subtended by the annulus at the source. We note from the figure that

\begin{align*}
\sin\theta &= r/d \\
&= \frac{r}{(r^2 + R^2)^{1/2}} . \tag{5-30}
\end{align*}

Differentiating both sides, we obtain

$$
d\theta = \frac{dr\cos\theta}{d} . \tag{5-31}
$$

Hence, Eq. (5-29) can be written

$$
d\Omega = 2\pi\sin\theta\,d\theta . \tag{5-32}
$$

Substituting for $d\Omega$ and integrating over $\theta$ from 0 to $\alpha$, we obtain

\begin{align*}
\Phi &= \pi B\,dS \int_0^\alpha \sin 2\theta\,d\theta \\
&= \pi B\,dS\sin^2\alpha , \tag{5-33}
\end{align*}

which is the same as Eq. (5-28).
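The agreement between the two routes can also be confirmed numerically. The following sketch, with hypothetical function names and an arbitrarily chosen step count, compares the closed form of Eq. (5-28) with a midpoint-rule evaluation of the angular integral built from Eq. (5-32).

```python
import math

def lambertian_flux_closed_form(B, dS, a, R):
    """Eq. (5-28): pi B dS sin^2(alpha), with sin^2(alpha) = a^2 / (a^2 + R^2)."""
    return math.pi * B * dS * a * a / (a * a + R * R)

def lambertian_flux_by_angle(B, dS, a, R, steps=20_000):
    """Midpoint-rule integral of B dS cos(theta) dOmega over 0..alpha,
    using dOmega = 2 pi sin(theta) dtheta from Eq. (5-32)."""
    alpha = math.atan2(a, R)  # semiangle subtended by the aperture
    dtheta = alpha / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        total += math.cos(theta) * 2.0 * math.pi * math.sin(theta) * dtheta
    return B * dS * total

if __name__ == "__main__":
    # Hypothetical values: B = 100 W/m^2 sr, dS = 1e-4 m^2, a = 0.05 m, R = 0.2 m.
    print(lambertian_flux_closed_form(100.0, 1e-4, 0.05, 0.2))
    print(lambertian_flux_by_angle(100.0, 1e-4, 0.05, 0.2))
```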
5.4.5 Image Radiance

Now we show how the radiance of rays changes when they are refracted or reflected.
Consider an elementary beam of solid angle $d\Omega$ incident in a direction $(\theta, \phi)$ on an
interface separating media of refractive indices $n$ and $n'$, as shown in Figure 5-21. The
solid angle $d\Omega$ is given by

\begin{align*}
d\Omega &= \frac{(r\,d\theta)(r\sin\theta\,d\phi)}{r^2} \\
&= \sin\theta\,d\theta\,d\phi . \tag{5-34}
\end{align*}

It represents the area on a unit sphere lying between the angles $\theta$ and $\theta + d\theta$, and $\phi$ and
$\phi + d\phi$, as may be seen from the figure. If $B$ is the radiance of the beam, the flux incident
on an elementary area $dS$ is given by

$$
d\Phi = B\,dS\cos\theta\,d\Omega . \tag{5-35}
$$

The beam is refracted at the interface in the direction $(\theta', \phi)$, where $\theta'$ is given by
Snell's law according to

$$
n'\sin\theta' = n\sin\theta . \tag{5-36}
$$

The azimuthal angle $\phi$ does not change upon refraction by virtue of the fact that the
incident ray, the refracted ray, and the surface normal are coplanar. The solid angle of the
refracted beam is given by

$$
d\Omega' = \sin\theta'\,d\theta'\,d\phi . \tag{5-37}
$$

Differentiating both sides of Eq. (5-36), we obtain

$$
n'\cos\theta'\,d\theta' = n\cos\theta\,d\theta . \tag{5-38}
$$

Figure 5-21. (a) Solid angle of an elementary beam in polar coordinates $(r, \theta, \phi)$ and
(b) its change from $d\Omega$ to $d\Omega'$ upon refraction at an interface separating media of
refractive indices $n$ and $n'$.

Equating the products of the left-hand sides of Eqs. (5-36) and (5-38) to the products of
their right-hand sides, we find that

$$
n'^2\cos\theta'\,d\Omega' = n^2\cos\theta\,d\Omega , \tag{5-39}
$$

i.e., the quantity $n^2\cos\theta\,d\Omega$ is invariant upon refraction. The two solid angles are
different from each other because the rays bounding $d\Omega$ are refracted by slightly
different amounts due to their slightly different angles of incidence. For a reflecting
surface, they are equal because then $\theta' = \theta$.
If $B'$ is the radiance of the refracted beam, the flux contained in it is given by

$$
d\Phi' = B'\,dS\cos\theta'\,d\Omega' . \tag{5-40}
$$

In the absence of any transmission loss, the incident flux is equal to the refracted flux,
i.e.,

$$
d\Phi' = d\Phi . \tag{5-41}
$$

Therefore, equating the right-hand sides of Eqs. (5-35) and (5-40), and substituting Eq.
(5-39), we obtain

$$
\frac{B'}{n'^2} = \frac{B}{n^2} . \tag{5-42}
$$

Thus, when the rays are refracted by a surface, the quantity $B/n^2$ associated with
them is invariant. We refer to this invariance as the radiance theorem. When the rays are
reflected by a lossless surface, their radiance is invariant because $n' = -n$ in that case.
Because the entrance pupil of an optical imaging system lies in its object space, the
radiance of rays at the entrance pupil is equal to the object radiance. Similarly, because
the exit pupil lies in the image space, the radiance of rays at the exit pupil is equal to the
image radiance. We make use of Eq. (5-42) in Section 5.4.7 in obtaining the image
irradiance distribution in terms of the object radiance.
In optical imaging by a multisurface system, if the refractive indices $n$ and $n'$ of the
object and image spaces are equal (in practice, they are often both equal to unity), then
the radiance of an image element is equal to the radiance of the corresponding object
element. Taking into account the loss of energy at a refracting or a reflecting surface, we
conclude that image radiance can, at most, be equal to the object radiance, i.e., $B' \le B$
when $n = n'$. If a beam propagates in a given medium, i.e., if it is neither refracted nor
reflected, then $n = n'$, and Eq. (5-42) reduces to $B' = B$. Thus, the radiance of a tube of
rays remains invariant as they propagate in a certain medium.
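The radiance theorem can be checked numerically by refracting a thin beam and applying flux conservation directly. The sketch below is an illustration under stated assumptions: a lossless interface, no total internal reflection for the chosen angle, and a small finite angular width in place of the differential $d\theta$. The function names are hypothetical.

```python
import math

def refracted_radiance(B, n, n_prime):
    """Radiance after lossless refraction from Eq. (5-42): B'/n'^2 = B/n^2."""
    return B * (n_prime / n) ** 2

def refracted_radiance_from_beam(B, n, n_prime, theta, dtheta=1e-6):
    """Same quantity from flux conservation, Eqs. (5-35), (5-40), and (5-41).
    The common factors dS and dphi cancel, so only theta terms are kept."""
    def snell(t):
        return math.asin(n * math.sin(t) / n_prime)  # Eq. (5-36)

    theta_p = snell(theta)
    dtheta_p = snell(theta + dtheta) - theta_p       # finite-difference form of Eq. (5-38)
    beam_in = math.cos(theta) * math.sin(theta) * dtheta        # cos(theta) dOmega / dphi
    beam_out = math.cos(theta_p) * math.sin(theta_p) * dtheta_p  # cos(theta') dOmega' / dphi
    return B * beam_in / beam_out

if __name__ == "__main__":
    # Air to glass (n' = 1.5) at 0.5 rad incidence; both values should agree closely.
    print(refracted_radiance(1.0, 1.0, 1.5))
    print(refracted_radiance_from_beam(1.0, 1.0, 1.5, 0.5))
```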
5.4.6 Image Irradiance: Aperture Stop in front of the System

Consider a system with its aperture stop lying in the object space so that it is also the
entrance pupil. For example, in astronomical telescopes (discussed in Section 6.5), the
first imaging element is also the aperture stop. Let its radius be $a_{en}$ and area be
$S_{en} = \pi a_{en}^2$. We assume that the aperture stop is much smaller than the object distance
from it so that the line joining a given (Lambertian) object element of area $dS$ and any
element $dS_{en}$ on the aperture makes approximately the same angle $\gamma$ with the optical
axis (see Figure 5-22). If $B$ is the radiance of the object element, its intensity in the
direction of the entrance pupil is given by $B\,dS\cos\theta$, where $\theta$ is the angle of the chief
ray in the object space. The projected area of the pupil in the direction of the object
element is given by $S_{en}\cos\theta$, and the distance between the two is $L_o/\cos\theta$.
Accordingly, the solid angle subtended by the pupil at the object element is given by

$$
d\Omega = \frac{S_{en}\cos\theta}{(L_o/\cos\theta)^2} . \tag{5-43}
$$

Thus, the flux incident on the pupil is given by

\begin{align*}
\Phi &= B\,d\Omega\,dS\cos\theta \\
&= B\left(S_{en}/L_o^2\right)dS\cos^4\theta . \tag{5-44}
\end{align*}

Neglecting the loss of light while propagating through the system, this flux is contained
in the corresponding image element. If $dS'$ is the area of this element, its irradiance is
given by

\begin{align*}
E(\theta) &= \Phi/dS' \\
&= E(0)\cos^4\theta , \tag{5-45}
\end{align*}

where
\begin{align*}
E(0) &= \left(\pi B/M^2\right)\alpha^2 \tag{5-46a} \\
&= \left(B/M^2\right)\Omega(P_0) \tag{5-46b} \\
&= \left(\pi B/4M^2\right)/F_{en}^2 \tag{5-46c}
\end{align*}

is the irradiance of an axial image element,

$$
M = (dS'/dS)^{1/2} \tag{5-47}
$$

is its magnification, and $\alpha = a_{en}/L_o$ is the semiangle of the cone subtended by the
entrance pupil at the axial object point $P_0$. The quantity $2\alpha$ is called the angular
aperture of the light cone entering the system. As in Eq. (5-18),

\begin{align*}
F_{en} &= L_o/D_{en} \\
&= 1/2\alpha \tag{5-48}
\end{align*}

is the f-number of this light cone. Equation (5-45) represents the cosine-fourth power law
in the object space, showing that the irradiance of the image of a Lambertian object
decreases as the fourth power of the cosine of the chief ray angle $\theta$ in the object space.
This decrease can be overcome by introducing barrel distortion into the system. Of
course, there may be an additional decrease due to vignetting.

Figure 5-22. Radiometry of extended object imaging.
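A short numerical illustration of the cosine-fourth law follows. It is a sketch only, assuming a lossless system and a small entrance pupil; the function name and the sample values are hypothetical.

```python
import math

def image_irradiance(B, M, F_en, theta=0.0):
    """Image irradiance from Eqs. (5-45) and (5-46c):
    E = pi B / (4 M^2 F_en^2) * cos^4(theta), with theta in radians."""
    axial = math.pi * B / (4.0 * M * M * F_en * F_en)
    return axial * math.cos(theta) ** 4

# Relative falloff across the field for chief ray angles of 0 to 30 degrees.
E0 = image_irradiance(100.0, 0.1, 20.0)
for deg in (0, 10, 20, 30):
    E = image_irradiance(100.0, 0.1, 20.0, math.radians(deg))
    print(f"theta = {deg:2d} deg: E/E(0) = {E / E0:.4f}")
```

At a chief ray angle of 30 degrees the irradiance drops to $(\sqrt{3}/2)^4 = 0.5625$ of its axial value.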
When $a_{en}$ is not very small compared to $L_o$, then $\alpha = \tan^{-1}(a_{en}/L_o)$, and it is
replaced by $\sin\alpha$ in Eq. (5-46a). The f-number of the light cone in that case is given by

$$
F_{en} = 1/(2\sin\alpha) . \tag{5-49}
$$

The quantity $n\sin\alpha$ is called its numerical aperture in the object space.
If $d$ is the distance of the object-space focal point from the entrance pupil and $f$ is the
object-space focal length of the system, then Eq. (2-83) yields

$$
M = -f/(L_o - d) . \tag{5-50}
$$

For an object at infinity, as in astronomical observations, $L_o M \to -f$. Moreover, in that
case, all of the rays are parallel to the chief ray, and therefore every element of the
entrance pupil makes exactly the same angle $\theta$ with the optical axis. Thus, Eq. (5-44)
yields

\begin{align*}
E(\theta) &= \left(B/f^2\right)S_{en}\cos^4\theta \\
&= \left(\pi B n'^2/4n^2F_\infty^2\right)\cos^4\theta , \tag{5-51}
\end{align*}

where

$$
F_\infty = f'/D_{en} \tag{5-52}
$$

is the focal ratio of the image-forming light cone for an object lying at infinity and we
have made use of Eq. (2-69). $F_\infty$ is called the f-number or the relative aperture of the
system. If the diameter $D_{en} = 2a_{en}$ of the entrance pupil is increased by a certain factor so
that the system collects more light, the image irradiance does not change if the
image-space focal length $f'$ is also increased by the same factor. The amount of light collected
increases as $D_{en}^2$, and the image area increases as $f'^2$ so that the irradiance does not
change unless the f-number also changes. Accordingly, the f-number and not the entrance
pupil diameter determines the light-gathering capability of a system in the sense of image
irradiance.
A camera lens with a small f-number is said to be fast since it yields higher
irradiance on film, thus requiring a shorter exposure time. Its speed is inversely
proportional to the square of its f-number. The diameter of the lens, and therefore the flux
density on the film, is controlled by a shutter, but its focal length is fixed (unless an
additional lens is attached). The f-number markings on the rim of a camera lens, e.g.,
22.6, 16, 11.3, 8, 5.6, 4, 2.8, 2, and 1.4, represent increasing shutter opening by a factor of
2 in the area from one number to the next. Assuming a good-quality lens, a larger lens
opening (and, therefore, a smaller f-number) also gives a better resolution. Smaller f-
numbers are used for fast-moving or dimly illuminated objects.
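The f-number markings quoted above are successive powers of $\sqrt{2}$, so that each step changes the aperture area by a factor of 2. A minimal sketch, with hypothetical function names, generates the series and the corresponding relative exposure time:

```python
import math

def f_stop_series(count=9):
    """Full-stop f-numbers as powers of sqrt(2), starting near f/1.4.
    Each step up in f-number halves the aperture area."""
    return [math.sqrt(2.0) ** k for k in range(1, count + 1)]

def relative_exposure_time(F, F_ref):
    """Exposure time at f-number F relative to F_ref for equal exposure,
    using the fact that image irradiance scales as 1/F^2."""
    return (F / F_ref) ** 2

print([round(F, 1) for F in f_stop_series()])
print(relative_exposure_time(8.0, 4.0))  # 4.0: two stops slower
```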
The focal ratio of the image-forming light cone for finite conjugates can be related to
$F_\infty$ as follows. From Eqs. (2-70) and (2-72), the image distance $S'$ of $P_0'$ can be written

$$
S' = f'(1 - M) . \tag{5-53}
$$

Similarly, the image distance $s'$ of the exit pupil can be written in terms of the pupil
magnification $m = D_{ex}/D_{en}$. Thus, we may write the focal ratio of the light cone exiting
from the exit pupil [see Eq. (5-18)]

\begin{align*}
F_{ex} &= L_i/D_{ex} \\
&= (S' - s')/D_{ex} \\
&= F_\infty (1 - M/m) . \tag{5-54}
\end{align*}

We note that $F_{ex} \to F_\infty$ as $M \to 0$, i.e., as the object moves to infinity.
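Equation (5-54) is simple to evaluate. The sketch below uses a hypothetical function name and sample values; the sign convention for $M$ (negative for an inverted image) is assumed to follow the rest of the text.

```python
def exit_focal_ratio(F_inf, M, m=1.0):
    """Focal ratio of the exiting light cone from Eq. (5-54): F_ex = F_inf (1 - M/m).
    M is the image magnification and m the pupil magnification."""
    return F_inf * (1.0 - M / m)

print(exit_focal_ratio(4.0, 0.0))   # 4.0: object at infinity
print(exit_focal_ratio(4.0, -1.0))  # 8.0: unit magnification with unit pupil magnification
```

For 1:1 imaging with $m = 1$, the light cone is thus two full stops slower than the f-number of the system would suggest.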
5.4.7 Image Irradiance: Aperture Stop in back of the System

When the aperture stop lies in the image space of the system, it is also its exit pupil.
We assume that the exit pupil is much smaller than the distance of the image from it so
that the line joining any element $dS_{ex}$ on it and the image element makes approximately
the same angle $\gamma'$ with the optical axis (see Figure 5-22). If $B'$ is the radiance at the exit
pupil, its intensity in the direction of the image element is given by $B'S_{ex}\cos\theta'$, where
$\theta'$ is the chief ray angle in the image space. The projected area of the image element in
the direction of the pupil is given by $dS'\cos\theta'$, and the distance between the two is
$L_i/\cos\theta'$. Accordingly, the solid angle subtended by the image element at the pupil is
given by

$$
d\Omega' = \frac{dS'\cos\theta'}{(L_i/\cos\theta')^2} . \tag{5-55}
$$

The total flux in the image element is given by

\begin{align*}
\Phi' &= B'S_{ex}\,d\Omega'\cos\theta' \\
&= \left(B'\,dS'/L_i^2\right)S_{ex}\cos^4\theta' . \tag{5-56}
\end{align*}

Therefore, the irradiance of the image element is given by

\begin{align*}
E(\theta') &= \Phi'/dS' \\
&= E(0)\cos^4\theta' , \tag{5-57}
\end{align*}
where

\begin{align*}
E(0) &= \pi B\,(n'/n)^2\,\alpha'^2 \tag{5-58a} \\
&= B\,(n'/n)^2\left(S_{ex}/L_i^2\right) \tag{5-58b} \\
&= B\,(n'/n)^2\,\Omega'(P_0') \tag{5-58c} \\
&= \pi B\,(n'/2n)^2/F_{ex}^2 , \tag{5-58d}
\end{align*}

and we have written $B'$ in terms of $B$, according to Eq. (5-42). Here, $\alpha' = a_{ex}/L_i$ is the
semiangle of the cone subtended by the exit pupil at the axial image point $P_0'$, and the
angle $2\alpha'$ is called the angular aperture of the image-forming light cone exiting from
the system with its apex at $P_0'$. As in Eq. (5-18),

\begin{align*}
F_{ex} &= L_i/D_{ex} \\
&= 1/2\alpha' \tag{5-59}
\end{align*}

is the f-number of the light cone exiting from the system. Equation (5-57) represents the
cosine-fourth power law in the image space, showing that the irradiance of the image of a
Lambertian object decreases as the fourth power of the cosine of the chief ray angle $\theta'$ in
the image space.
When $a_{ex}$ is not very small compared to $L_i$, then $\alpha' = \tan^{-1}(a_{ex}/L_i)$ and it is
replaced by $\sin\alpha'$ in Eq. (5-58a). The f-number of the exiting light cone in that case is
given by

$$
F_{ex} = 1/(2\sin\alpha') . \tag{5-60}
$$

The quantity $n'\sin\alpha'$ is called its numerical aperture in the image space.
If the object lies at infinity, then the image point $P_0'$ coincides with the image-space
focal point $F'$, and $F_{ex} \to F_\infty$, where

$$
F_\infty = n'/(2\,NA'_\infty) . \tag{5-61}
$$

Here,

$$
NA'_\infty = n'\sin\alpha'_\infty \tag{5-62}
$$

is the corresponding image-space numerical aperture, where $\alpha'_\infty$ is the semiangle of the
image-forming light cone with its apex at $F'$.
The angular aperture, the f-number, and the numerical aperture all give a measure of
the light-gathering capability of an optical system in the sense that the image illumination
depends on them. It is customary to use the f-number of the image-forming light cone for
systems such as cameras imaging objects lying at large distances. The term numerical
aperture is used when imaging objects at short distances, as in microscopes.
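The relations among these aperture measures can be illustrated with a short calculation. This is a sketch with hypothetical function names; it applies the form of Eq. (5-61) to a general light cone, which is an assumption made here for illustration, since the text states it for an object at infinity.

```python
import math

def f_number_paraxial(a_ex, L_i):
    """Eq. (5-59): F_ex = L_i / D_ex = 1 / (2 alpha') for a small pupil."""
    return L_i / (2.0 * a_ex)

def f_number_exact(a_ex, L_i):
    """Eq. (5-60): F_ex = 1 / (2 sin alpha') with alpha' = atan(a_ex / L_i)."""
    return 1.0 / (2.0 * math.sin(math.atan2(a_ex, L_i)))

def numerical_aperture(n_prime, a_ex, L_i):
    """Image-space numerical aperture n' sin(alpha')."""
    return n_prime * math.sin(math.atan2(a_ex, L_i))

def f_number_from_na(na, n_prime=1.0):
    """Form of Eq. (5-61): F = n' / (2 NA')."""
    return n_prime / (2.0 * na)

# Hypothetical pupil of radius 1 at a distance of 10 (same length units).
print(f_number_paraxial(1.0, 10.0), f_number_exact(1.0, 10.0))
print(f_number_from_na(numerical_aperture(1.0, 1.0, 10.0)))
```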
5.4.8 Telecentric Systems

As discussed in Section 5.2.6, if the aperture stop lies in the object-space focal plane,
then the system is telecentric on the image side. Accordingly, the chief ray angle $\theta'$ in
the image space is zero for any position of the object element. The irradiance distribution
of the image is given by Eq. (5-45) with an appropriate value of $L_o$. If, in addition, the
object lies at infinity, then the distribution is given by Eq. (5-51). It would be a mistake to
consider Eq. (5-57) with $\theta' = 0$ as the correct equation for this case, because it would
lead to the incorrect result that the image irradiance is uniform. Similarly, if the aperture
stop lies in the image-space focal plane, then the system is telecentric on the object side,
and the chief ray angle $\theta$ in the object space is zero for any object element. The image
irradiance distribution in this case is given by Eq. (5-57).
5.4.9 Throughput
If we consider the corresponding object and image elements centered on the optical
axis at $P_0$ and $P_0'$, respectively, then equating the axial image irradiances given by Eqs.
(5-46b) and (5-58c), we obtain

$$
n'^2\,dS'\,\Omega'(P_0') = n^2\,dS\,\Omega(P_0) . \tag{5-63}
$$

Thus, the quantity $n^2\,dS\,\Omega(P_0)$, called the optical throughput, is an invariant. Note that if
$n = n'$ (in practice, they are often both equal to unity), then $B = B'$, and the product of
the area and the solid angle may simply be called the throughput. In that case, the
throughput multiplied by the radiance gives the flux passing through the system.
5.4.10 Interrelation among Invariants in Imaging

We have shown that the quantity $nh\beta_0$ (Lagrange invariant) and the throughput
$n^2\,dS\,\Omega$ remain invariant in the imaging process. By using the radiance theorem, we now
show that they are interrelated by the conservation of energy in the process. Consider a
small circular object of radius $h$ and radiance $B$ at a distance $L_o$ from the entrance pupil
of radius $a$ of a certain imaging system, as illustrated in Figure 5-23. The flux incident on
the entrance pupil is given by

$$
\Phi_o = \pi h^2 B \left(\pi a^2/L_o^2\right) . \tag{5-64}
$$

If the exit pupil has a radius $a'$ and the image has a radius of $h'$, radiance $B'$, and lies at
a distance $L_i$ from it, then the flux in the image is given by

$$
\Phi_i = \pi h'^2 B' \left(\pi a'^2/L_i^2\right) . \tag{5-65}
$$

Equating the flux entering the system to that exiting from it based on conservation of
energy, we obtain

$$
h'^2 B' \beta_0'^2 = h^2 B \beta_0^2 , \tag{5-66}
$$

where $\beta_0$ and $\beta_0'$ are the marginal ray angles in the object and image spaces (see Figure 5-23),
with magnitudes $a/L_o$ and $a'/L_i$, respectively.
This is precisely the result obtained if we square the Lagrange invariant equation (2-75)
and multiply by the radiance invariance given by Eq. (5-42). If we substitute for $B'$ in
terms of $B$, we obtain the throughput invariance of Eq. (5-63).

Figure 5-23. Invariant relations in imaging. The object is a small circular object of
radius $h$.
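The interrelation can be checked with numbers. In the sketch below, the Lagrange invariant is taken in the form $n h \beta_0 = n' h' \beta_0'$ (an assumption about the form of Eq. (2-75), which is not reproduced here), signs are ignored because only squares enter, and all names and values are hypothetical.

```python
import math

def image_side_quantities(n, n_prime, h, beta0, B, M):
    """Image radius, marginal ray angle, and radiance for magnification M,
    assuming n h beta0 = n' h' beta0' and the radiance theorem, Eq. (5-42)."""
    h_p = M * h                                  # image radius
    beta0_p = n * h * beta0 / (n_prime * h_p)    # from the assumed Lagrange invariant
    B_p = B * (n_prime / n) ** 2                 # radiance theorem
    return h_p, beta0_p, B_p

def flux(h, B, beta0):
    """Flux of Eqs. (5-64) and (5-65), with beta0 standing for a/L."""
    return math.pi * h * h * B * math.pi * beta0 * beta0

# Hypothetical system: object in air, image in a medium of index 1.33.
h_p, beta0_p, B_p = image_side_quantities(1.0, 1.33, 2.0, 0.05, 10.0, 0.5)
print(flux(2.0, 10.0, 0.05), flux(h_p, B_p, beta0_p))
```

The two printed fluxes should agree to rounding error, which is the content of Eq. (5-66).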
5.4.11 Concentric Systems

In a concentric system, such as a Schmidt or a Bouwers–Maksutov camera [8], the
aperture stop and the entrance and exit pupils all lie at its common center of curvature,
and the image is formed on a spherical surface concentric with the system. The chief ray
incident through the common center passes undeviated, and the chief ray angles in the
object and image spaces are equal. The image element area $dS'$ is normal to the line
joining its center and the center of the exit pupil, and the distance between the two centers
is independent of the location of $dS'$ on the spherical image surface. Therefore, the solid
angle subtended by $dS'$ at the exit pupil is simply equal to $dS'/L_i^2$. Thus, the flux
emerging from a small exit pupil and converging on the image element is given by

\begin{align*}
\Phi' &= B'\,dS'\left(S_{ex}/L_i^2\right)\cos\theta' \\
&= B'\,dS'\,\Omega'(P_0')\cos\theta' . \tag{5-67}
\end{align*}

Accordingly, the irradiance of the image element is given by

$$
E(\theta') = B\,(n'/n)^2\,\Omega'(P_0')\cos\theta' . \tag{5-68}
$$

Thus, the irradiance of the spherical image formed by a concentric system decreases
linearly with the cosine of the angle of the chief ray in the object or the image space. For
an object lying at infinity, we let $L_i = f'$, the focal length of the system.
5.5 PHOTOMETRY
Now we give a brief discussion of photometry, the branch of radiometry that is
limited to observations with the human eye, which is sensitive only in the visible region
of the electromagnetic spectrum called light. The theory of photometry, in terms of the
transfer of light from a source to a receiver, is the same as discussed earlier, except that
the spectral response of the eye must be taken into account to determine the final result of
any observation. The names, symbols, and units of photometric quantities are given,
along with an equation for obtaining a photometric quantity from a corresponding
radiometric quantity. It is shown that a Lambertian surface appears equally bright at all
distances and along all directions of observation. The reason stars can be observed during
daytime with the aid of a telescope is also discussed.
5.5.1 Photometric Quantities and Spectral Response of the Human Eye

The units of some of the basic quantities used in photometry are given in Table 5-1,
along with their radiometric counterparts. The abbreviation of a unit is indicated in
parentheses. To avoid confusion, the corresponding photometric and radiometric terms
are distinguished from each other by adding to them the adjectives “luminous” and
“radiant,” respectively, e.g., luminous flux and radiant flux. It is common practice to use
the term “luminance” in place of “luminous radiance,” and “illuminance” in place of
“luminous irradiance.”
A photometric quantity can be obtained from a corresponding spectral radiometric
quantity by weighting it with the spectral response of the eye. The relative spectral
response of the eye is shown in Figure 5-24, and its numerical values are given in Table
5-2 for both day (photopic) and night (scotopic) vision. The peak values of the two
spectral visions lie at 555 nm and 507 nm, and correspond to 683 lm/W and 1754 lm/W,
respectively. Thus, for example, the absolute daytime response of the eye at 600 nm is
given by $0.631 \times 683 = 431$ lm/W. If $V(\lambda)$ represents the relative spectral response of
the eye, then the luminous flux $\Phi_l$ of a source with a spectral radiant flux $\Phi_r(\lambda)$ is
given by

$$
\Phi_l = k \int \Phi_r(\lambda)\,V(\lambda)\,d\lambda , \tag{5-69}
$$

where $k = 683$ lm/W or 1754 lm/W, depending upon whether $V(\lambda)$ is for day or
night vision.
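Equation (5-69) can be evaluated numerically from tabulated values. The sketch below is illustrative: it copies a few photopic entries from Table 5-2, uses the trapezoidal rule on the tabulated wavelengths, and its function names and sample spectrum are assumptions.

```python
K_DAY = 683.0  # lm/W at the photopic peak (555 nm)

# Photopic values copied from Table 5-2 (wavelength in nm -> V).
V_DAY = {540: 0.954, 550: 0.995, 555: 1.0, 560: 0.995, 570: 0.952,
         580: 0.870, 590: 0.757, 600: 0.631}

def luminous_flux_monochromatic(radiant_flux_w, wavelength_nm, v_table=V_DAY, k=K_DAY):
    """Luminous flux (lm) of a monochromatic source: k * Phi_r * V(lambda)."""
    return k * radiant_flux_w * v_table[wavelength_nm]

def luminous_flux(wavelengths_nm, spectral_flux_w_per_nm, v_table=V_DAY, k=K_DAY):
    """Trapezoidal approximation of Eq. (5-69) on tabulated wavelengths.
    spectral_flux_w_per_nm[i] is the spectral radiant flux at wavelengths_nm[i]."""
    total = 0.0
    for i in range(len(wavelengths_nm) - 1):
        w0, w1 = wavelengths_nm[i], wavelengths_nm[i + 1]
        y0 = spectral_flux_w_per_nm[i] * v_table[w0]
        y1 = spectral_flux_w_per_nm[i + 1] * v_table[w1]
        total += 0.5 * (y0 + y1) * (w1 - w0)
    return k * total

print(round(luminous_flux_monochromatic(1.0, 600)))  # 431, as in the text

# Hypothetical flat spectrum of 0.01 W/nm between 540 nm and 600 nm.
grid = [540, 550, 560, 570, 580, 590, 600]
print(luminous_flux(grid, [0.01] * len(grid)))
```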
Table 5-1. Photometric and radiometric units of some basic quantities.
Quantity                    Photometric Unit                           Radiometric Unit
Energy                      talbot                                     joule (J)
Flux                        lumen (lm)                                 watt (W)
Intensity                   lumens/steradian (lm/sr) = candela (cd)    W/sr
Radiance (Luminance)        lm/m^2 sr                                  W/m^2 sr
Irradiance (Illuminance)    lm/m^2 = lux (lx)                          W/m^2
[Plot: relative response V, from 0.0 to 1.0, versus wavelength λ (nm), from 380 to 780; curves labeled Night and Day.]
Figure 5-24. Relative spectral response of the human eye for day (photopic) and
night (scotopic) vision.
Table 5-2. Relative spectral response of the human eye for day (photopic) and night
(scotopic) vision.
Wavelength λ (nm)    Day (Photopic) V    Night (Scotopic) V
380 0.00004 0.000589

390 0.00012 0.002209
400 0.0004 0.00929
410 0.0012 0.03484
420 0.0040 0.0966
430 0.0116 0.1998
440 0.023 0.3281
450 0.038 0.455
460 0.060 0.567
470 0.091 0.676
480 0.139 0.793
490 0.208 0.904
500 0.323 0.982

507 0.445 1
510 0.503 0.997
520 0.710 0.935
530 0.862 0.811
540 0.954 0.650
550 0.995 0.481
555 1 0.402
560 0.995 0.3288
570 0.952 0.2076
580 0.870 0.1212
590 0.757 0.0655
600 0.631 0.03315

610 0.503 0.01593
620 0.381 0.00737
630 0.265 0.003335
640 0.175 0.001497
650 0.107 0.000677
660 0.061 0.0003129
670 0.032 0.0001480
680 0.017 0.0000715
690 0.0082 0.00003533
700 0.0041 0.00001780

710 0.0021 0.00000914
720 0.00105 0.00000478
730 0.00052 0.000002546
740 0.00025 0.000001379
750 0.00012 0.000000760
760 0.00006 0.000000425
770 0.00003 0.0000002413
780 0.000015 0.0000001390
5.5.2 Imaging by the Human Eye

The human eye is a lens system with an iris that acts as the aperture stop whose
diameter increases or decreases, depending on the luminance of an object under
observation. It forms images of objects on a light-sensitive screen called the retina. It
does not have a field stop, but the resolution of the retina decreases rapidly as a function
of the distance from its center. In looking at objects, the eye rotates until the image of an
object under observation falls on the central portion of the retina, called the fovea. The
apparent size of an object is determined by the size of its retinal image.
Consider an object of height $h$ lying at a distance $R$ from the front principal point
$H$, as illustrated in Figure 5-25. An image of height $h'$ is formed on the retina at a
distance $R'$ from the back principal point $H'$ (see Problems 4.6 and 4.7 for a Gaussian
model of the human eye; see also Section 6.2.2). The angular sizes $\beta$ and $\beta'$ of the object
and image as seen from the respective principal points are related to each other according
to [see Eq. (2-67)]
$$n\beta = n'\beta' , \tag{5-70}$$
where $n$ and $n'$ are the refractive indices of the object and image spaces, respectively.
The image height $h'$ is given by
$$h' = R'\beta' = (n/n')\, R'\beta . \tag{5-71}$$
As the object distance varies, the eye lens changes its focal length by a process called
accommodation so that the distance $R'$ remains practically invariant (see Section 6.2.3).
Consequently, the apparent size of an object is proportional to the angle it subtends at
$H$, independent of the state of accommodation.
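The proportionality between retinal image size and subtended angle can be illustrated with a short Python sketch of h′ = (n/n′)R′β. It is only a rough illustration: the function name and the sample object are hypothetical, signs are ignored (magnitudes only), and the nominal eye values n′ = 1.33 and R′ of about 2.5 cm are those quoted in Section 5.6.4.

```python
def retinal_image_height(h, R, n=1.0, n_prime=1.33, R_prime=0.025):
    """Return h' = (n / n') * R' * beta, with beta = h / R (small angles).

    Magnitudes only: the sign convention of the text is ignored here.
    Lengths are in meters; n' and R' default to nominal values for the eye.
    """
    beta = h / R                      # angle subtended at H, in radians
    return (n / n_prime) * R_prime * beta


near = retinal_image_height(h=1.8, R=20.0)   # hypothetical 1.8-m object at 20 m
far = retinal_image_height(h=1.8, R=40.0)    # same object twice as far away
print(f"{near * 1e3:.2f} mm, {far * 1e3:.2f} mm")
```

Doubling the object distance halves the subtended angle and therefore halves the retinal image, which is why the angle alone sets the apparent size.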
5.5.3 Brightness of a Lambertian Surface

Consider a Lambertian surface or a uniform diffuser of area $dS_1$ and luminance $L$
observed by an eye of pupil area $dS_2$ lying at a distance $R$, as illustrated in Figure 5-26.
The subjective brightness of the surface depends on the illuminance on the retina. The
total flux entering the eye is given by
$$dF = \frac{L\, dS_1 \cos\theta_1\, dS_2}{R^2} , \tag{5-72}$$
where $\theta_1$ is the angle between the normal to the surface $dS_1$ and the direction of
observation, i.e., the line joining the centers of $dS_1$ and $dS_2$. The angle $\theta_2$ is zero
because $dS_2$ is normal to this line.
Figure 5-25. Imaging by the human eye.
Figure 5-26. Observation of a Lambertian surface.
If $\eta$ is the transmission factor of the eye, the flux reaching the retina is $\eta\, dF$. This
flux is distributed over the retinal image of object $dS_1$. The projected area of the observed
surface normal to the direction of observation is $dS_1 \cos\theta_1$. Therefore, if $R'$ is the image
distance, then the area of the image is given by
$$dS_1' = \left(\frac{R'}{n'R}\right)^2 dS_1 \cos\theta_1 , \tag{5-73}$$
where $nR'/n'R$ with $n = 1$ is the (linear) magnification of the image. Hence, the
illuminance on the retina is given by
$$E = \eta\, \frac{dF}{dS_1'} = \frac{\eta\, n'^2 L\, dS_2}{R'^2} . \tag{5-74}$$
Because the retinal illuminance is independent of the angle of observation and of the
distance $R$, a Lambertian surface appears equally bright at all distances and along all
directions of observation. For example, the headlights of a car,
which consist of a parabolic mirror with a lamp placed at its focus, appear equally bright
at all distances.
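The cancellation of the distance and the angle of observation in the retinal illuminance can be checked numerically. The sketch below evaluates the flux entering the pupil, L dS1 cos(θ1) dS2 / R², and the retinal image area, (R′/n′R)² dS1 cos(θ1), and divides one by the other. The function name, the lossless default (transmission factor of 1), the nominal eye values, and the sample numbers are all assumptions made for illustration.

```python
import math


def retinal_illuminance(L, dS1, theta1, R, dS2,
                        eta=1.0, n_prime=1.33, R_prime=0.025):
    """Retinal illuminance following Eqs. (5-72) to (5-74), SI units, n = 1."""
    dF = L * dS1 * math.cos(theta1) * dS2 / R**2                       # flux into pupil
    dS1_image = (R_prime / (n_prime * R))**2 * dS1 * math.cos(theta1)  # image area
    return eta * dF / dS1_image


pupil = math.pi * (2e-3)**2          # hypothetical 4-mm-diameter pupil
results = []
for R, theta_deg in [(1.0, 0.0), (5.0, 30.0), (20.0, 60.0)]:
    E = retinal_illuminance(L=100.0, dS1=1e-4,
                            theta1=math.radians(theta_deg), R=R, dS2=pupil)
    results.append(E)
    print(R, theta_deg, E)
```

All three cases print the same illuminance to within rounding, since both the flux and the image area scale as cos(θ1)/R².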

5.6.1 Stops, Pupils, Windows, and Field of View
An aperture in an imaging system that physically limits the solid angle of the
transmitted rays from a point object the most is called its aperture stop (AS). Its images as
seen from the object and image spaces are the entrance (EnP) and exit (ExP) pupils,
respectively. An object ray passing through the center of the aperture stop and actually or
appearing to pass through the centers of the entrance and exit pupils is the chief (or the
principal) ray (CR). An object ray passing through the edge of the aperture stop and
actually or appearing to pass through the edges of the entrance and exit pupils is the
marginal ray (MR). The chief ray from the edge of an object determines the location of
the exit pupil and the height of the image. Similarly, the marginal ray from the axial point
object determines the size of the exit pupil and the location of the axial image point. The
approximate size of an imaging element to avoid vignetting by it is equal to the sum of
the magnitudes of the heights of the chief ray on it from the edge point object and the
marginal ray from the axial point object. A system is telecentric on the image side when
its aperture stop lies in its object-space focal plane. The exit pupil in this case lies at
infinity, and a chief ray lies parallel to the optical axis in the image space. Similarly, a
system is telecentric on the object side if its aperture stop lies in the image-space focal
plane. An afocal system with an aperture stop placed in an intermediate focal plane is
telecentric on both object and image sides.
The field stop of a system is an aperture, placed at a final or intermediate real image
of the object, that limits the cone angle of the transmitted chief rays from an object. Its
images as seen from the object and image spaces are the entrance and exit windows, EnW
and ExW, respectively. The entrance window defines the object field that is actually
imaged in the exit window. The angle subtended by the entrance window at the center of
the entrance pupil represents the angular field of view of the system in object space.
Similarly, the angle subtended by the exit window at the center of the exit pupil is the
angular field of view of the system in image space. The ratio of the two angles is equal to
the magnification of the exit pupil when the refractive indices of the object and image
spaces are equal.
5.6.2 Radiometry of Point Object Imaging

The flux from a point source of intensity $I$ incident on an aperture of radius $a$ lying at a
distance $R$ is given by
$$F = 2\pi I R \left[ \frac{1}{R} - \frac{1}{(R^2 + a^2)^{1/2}} \right] \tag{5-75a}$$
$$\;\; = \frac{I}{R^2}\, S \quad \text{for } R \gg a , \tag{5-75b}$$
where $S = \pi a^2$ is the area of the aperture. Because $S/R^2$ is the solid angle subtended by
the aperture at a distant point source, $IS/R^2$ is the flux incident on the aperture, and
$I/R^2$ is the uniform irradiance on it, yielding the inverse-square law of irradiance.
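To see how quickly the exact flux approaches its inverse-square limit, the two expressions can be compared numerically. This is a minimal sketch with hypothetical function names; it takes the exact form as F = 2πIR[1/R − 1/(R² + a²)^(1/2)] and the far-field form as F = IS/R² with S = πa².

```python
import math


def flux_exact(I, a, R):
    """Flux on a circular aperture of radius a at distance R, Eq. (5-75a)."""
    return 2.0 * math.pi * I * R * (1.0 / R - 1.0 / math.sqrt(R**2 + a**2))


def flux_far(I, a, R):
    """Inverse-square approximation I * S / R^2 for R >> a, Eq. (5-75b)."""
    return I * math.pi * a**2 / R**2


for ratio in (1, 3, 10, 100):          # R expressed in units of a
    exact = flux_exact(1.0, 1.0, float(ratio))
    approx = flux_far(1.0, 1.0, float(ratio))
    print(ratio, exact, approx, approx / exact)
```

The approximation overestimates the flux when R is comparable to a and agrees to better than 0.1% once R is about 100 times a.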
If $I_o$ is the intensity of a point object $P$, the intensity $I_i$ of its image $P'$ is given by
(see Figure 5-27)
$$I_i = I_o \left(\frac{F_{ex}}{F_{en}}\right)^2 \left(\frac{\cos\theta}{\cos\theta'}\right)^3 , \tag{5-76}$$
where $F_{en}$ and $F_{ex}$ are the focal ratios of the optical beams entering and exiting from the
imaging system, and $\theta$ and $\theta'$ are the chief ray angles in the object and image spaces.
5.6.3 Radiometry of Extended Object Imaging

5.6.3.1 Illumination by a Lambertian Disc
A Lambertian surface radiates uniformly in all directions. The axial irradiance of a
Lambertian disc of radius $a$ and radiance $B$ at a distance $R$ is given by (see Figure 5-18)
$$E(0) = \frac{BS}{a^2 + R^2} , \tag{5-77}$$
where $S = \pi a^2$ is the area of the disc. At a large distance from the disc, Eq. (5-77)
reduces to
$$E(0) = BS/R^2 \quad \text{for } R \gg a \tag{5-78a}$$
$$\;\; = I/R^2 , \tag{5-78b}$$
where $I = \pi a^2 B$ is the intensity of the disc along its axis.
The off-axis irradiance at a point at a large distance making an angle $\delta$ with the axis
of the disc, i.e., the flux incident on the area element $dS_r$ (illustrated in Figure 5-18) per
unit area, is given by
$$E(\delta) = E(0) \cos^4\delta . \tag{5-79}$$
When $R$ is not much greater than $a$, the actual irradiance values are higher than those
predicted by Eq. (5-79).
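The disc results can be collected into a small Python sketch. It assumes the axial form E(0) = BS/(a² + R²) with S = πa², and the far-field form (I/R²)cos⁴δ with I = πa²B; the function names and sample values are hypothetical.

```python
import math


def axial_irradiance(B, a, R):
    """Eq. (5-77): E(0) = B*S / (a^2 + R^2) with S = pi*a^2."""
    return B * math.pi * a**2 / (a**2 + R**2)


def far_field_irradiance(B, a, R, delta=0.0):
    """Eqs. (5-78) and (5-79): E(delta) = (I / R^2) * cos^4(delta), for R >> a."""
    intensity = math.pi * a**2 * B       # axial intensity of the disc
    return intensity / R**2 * math.cos(delta)**4


B, a, R = 1.0, 0.01, 1.0                 # hypothetical disc: 1-cm radius seen from 1 m
print(axial_irradiance(B, a, R), far_field_irradiance(B, a, R))
print(far_field_irradiance(B, a, R, math.radians(30)) / far_field_irradiance(B, a, R))
```

The second line prints the cosine-fourth falloff factor at 30 degrees, which is (3/4)² = 0.5625.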
5.6.3.2 Image Radiance
The radiance $B'$ of the image of an object of radiance $B$ is given by (assuming
lossless imaging)
$$\frac{B'}{n'^2} = \frac{B}{n^2} , \tag{5-80}$$
where $n$ and $n'$ are the refractive indices of the object and image spaces, respectively. In
the case of imaging by a mirror, $n' = -n$, and therefore $B' = B$. In practice, $n' = n$ even
for a refracting system, and therefore $B' = B$. In reality, however, $B' < B$ due to losses in
the system.
5.6.3.3 Image Irradiance
For a uniformly radiating object with a radiance $B$, the image irradiance distribution
is generally nonuniform. When the aperture stop of the system lies in the object space, it
decreases according to (see Figure 5-27)
$$E(\theta) = E(0) \cos^4\theta , \tag{5-81}$$
where
$$E(0) = \frac{\pi B}{M^2}\, \alpha^2 \tag{5-82a}$$
$$\;\; = \frac{\pi B}{4 M^2 F_{en}^2} . \tag{5-82b}$$
Here, $\theta$ is the chief ray angle in the object space, $M$ is the image magnification, $2\alpha$ is
the angular aperture of the entrance pupil, and $F_{en}$ is the focal ratio of the light cone
entering the entrance pupil.
For an object lying at infinity, Eq. (5-82b) reduces to
$$E(0) = \frac{\pi B n'^2}{4 n^2 F_\infty^2} , \tag{5-83}$$
where
$$F_\infty = f'/D_{en} \tag{5-84}$$
is the corresponding focal ratio of the image-forming light cone. Here, $f'$ is the focal
length of the system, and $D_{en}$ is the diameter of its entrance pupil. The focal ratio $F_{ex}$ for
finite conjugates is related to $F_\infty$ according to
$$F_{ex} = F_\infty (1 - M/m) , \tag{5-85}$$
where $m = D_{ex}/D_{en}$ is the pupil magnification, $D_{ex}$ being the diameter of the exit pupil.
Figure 5-27. Radiometry of point object imaging. $P$ and $P'$ are the object and image
points, and $dS$ and $dS'$ are the object and image elements.
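The focal-ratio relations lend themselves to a quick numerical check. The sketch below reads them as F∞ = f′/D_en, F_ex = F∞(1 − M/m), and E(0) = πBn′²/(4n²F∞²) for an object at infinity; the function names and the sample lens are hypothetical.

```python
import math


def f_infinity(f_prime, D_en):
    """Focal ratio for an object at infinity, Eq. (5-84)."""
    return f_prime / D_en


def f_exit(F_inf, M, m=1.0):
    """Focal ratio of the exiting beam for finite conjugates, Eq. (5-85)."""
    return F_inf * (1.0 - M / m)


def axial_irradiance_infinity(B, F_inf, n=1.0, n_prime=1.0):
    """Axial image irradiance for an object at infinity, Eq. (5-83)."""
    return math.pi * B * n_prime**2 / (4.0 * n**2 * F_inf**2)


F_inf = f_infinity(f_prime=100.0, D_en=20.0)   # hypothetical 100-mm lens, 20-mm pupil
print(F_inf)                                   # focal ratio at infinite conjugates
print(f_exit(F_inf, M=-1.0))                   # unit-magnification imaging, m = 1
print(axial_irradiance_infinity(B=1.0, F_inf=F_inf))
```

At unit magnification with m = 1 the exiting beam is twice as slow as at infinite conjugates, so the axial irradiance drops by a factor of four.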
If the aperture stop lies in the image space, then the irradiance distribution is given
by
$$E(\theta') = E(0) \cos^4\theta' , \tag{5-86a}$$
where
$$E(0) = \frac{\pi B\, (n'/n)^2}{4 F_{ex}^2} . \tag{5-86b}$$
Equations (5-81) and (5-86a) represent the cosine-fourth power law of irradiance in the
object and image spaces, respectively, showing that the irradiance of the image of a
Lambertian object decreases as the fourth power of the cosine of the chief ray angle $\theta$ in
the object space or $\theta'$ in the image space.
In a concentric system, the aperture stop, entrance pupil, and the exit pupil all lie at
the common center of curvature of the imaging elements, and the image is formed on a
concentric spherical surface. The chief ray angles in the object and image spaces are
equal, and an image element is normal to the line joining it and the center of the exit
pupil. The irradiance distribution is accordingly given by
$$E(\theta') = E(0) \cos\theta' . \tag{5-87}$$
Thus, the irradiance of the spherical image formed by a concentric system decreases
linearly with the cosine of the angle of the chief ray in the object or the image space,
where $E(0)$ may be obtained from Eq. (5-68).
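The difference between the cosine-fourth law of a conventional system and the single-cosine law of a concentric system is easy to tabulate. This is a small illustrative sketch with a hypothetical function name; it returns the irradiance relative to its axial value for a given chief ray angle.

```python
import math


def relative_irradiance(chief_ray_angle_deg, concentric=False):
    """E/E(0): cos(theta') for a concentric system, cos^4 otherwise."""
    c = math.cos(math.radians(chief_ray_angle_deg))
    return c if concentric else c**4


for angle in (0, 10, 20, 30, 40):
    print(angle,
          round(relative_irradiance(angle), 3),
          round(relative_irradiance(angle, concentric=True), 3))
```

At a chief ray angle of 30 degrees the conventional image has lost nearly half of its axial irradiance, whereas the concentric image has lost only about 13%.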
5.6.4 Visual Observations

The radiance of an element of a Lambertian surface is independent of the direction of
radiation. The brightness of a Lambertian surface of luminance $L$, described by the
illuminance on the retina, is given by
$$E = \eta\, \frac{n'^2 L\, S_e}{R'^2} , \tag{5-88}$$
where $\eta$ is the transmission of the eye, $n'$ is its refractive index, $R'$ is its diameter, and
$S_e$ is the area of its pupil.
The size of the retinal image of an object subtending an angle $\beta$ at the eye is given
by $(n/n')\, R'\beta$, where $n$ and $n'$ are the refractive indices of the object and image spaces,
respectively, and $R'$ is the distance of the retina from the image-space principal point of
the eye. In practice, $n = 1$ for observations in air and 1.33 for observations in water,
$n' = 1.33$, and $R' \sim 2.5$ cm.
REFERENCES
1. R. McCluney, Introduction to Radiometry and Photometry, Artech, Boston
(1994).
2. R. Kingslake, “Illumination in optical images,” in Applied Optics and Optical
Engineering, Vol. II, Ed. R. Kingslake, pp. 195–228, Academic Press, San
Diego, CA (1965).
3. J. R. Meyer-Arendt, “Radiometry and photometry: units and conversion factors,”
Appl. Opt. 7, 2081–2084 (1968).
4. W. H. Steel, “Luminosity, throughput, or etendue,” Appl. Opt. 13, 704 (1974);
also, “Luminosity, throughput, or etendue? Further comments,” Appl. Opt. 14,
252 (1975).
5. M. Reiss, “The cos⁴ law of illumination,” J. Opt. Soc. Am. 35, 283–288 (1945).
6. I. C. Gardner, “Validity of the cosine-fourth power law of illumination,” J.
Research Nat. Bur. Stand. 39, 213–219 (1947).
7. M. Reiss, “Notes on the cos⁴ law of illumination,” J. Opt. Soc. Am. 38, 980–986
(1948).

Section 6.6, SPIE Press, Bellingham, WA (1998) [doi:10.1117/3.265735.ch6].
PROBLEMS
5.1 Consider a system consisting of two thin lenses of equal focal lengths with an
aperture stop placed midway between them. Show that its entrance and exit pupils
lie at its respective principal points.
5.2 A system consists of two thin lenses with focal lengths of 10 cm and 5 cm and
apertures of 4 cm, spaced 4 cm apart. A stop 2 cm in diameter is located
midway between them. (a) Determine its principal points. (b) Find the position and
size of its entrance and exit pupils. (c) Find the position and size of the image of an
object placed 10 cm from the first lens. (d) Sketch everything on a diagram
showing, in addition, the two tangential marginal rays and the chief ray from the
top of the object if it is 4 cm high. (e) In the object plane considered, what is the
maximum height of a point object for which there is no vignetting?
5.3 Consider a system consisting of two thin lenses placed 4 cm apart with a 4-cm
aperture placed midway between them. The first lens has a diameter of 4.6 cm and
a focal length of 5.8 cm. The second lens has a diameter of 5.8 cm. An object is
placed 8 cm from the first lens. (a) Determine the aperture stop of the system. (b)
Sketch the vignetting diagram for a point object 4 cm from the optical axis.
5.4 An exit pupil with a 3-cm aperture is located 6 cm in front of a convex mirror that
has a radius of curvature of 10 cm. An object 1 cm high is centrally located on the
axis 12 cm in front of the mirror. (a) Locate the entrance pupil and the image. (b)
Find the minimum diameter of the mirror needed to see the entire object from all
points of the exit pupil.
5.5 Consider a Schwarzschild telescope consisting of two concentric spherical mirrors
such that the ratio of their radii of curvature is $(3 \pm \sqrt{5})/2$. Its aperture stop is
located at the primary mirror. (a) Determine its focal length in terms of the focal
lengths of its mirrors. (b) Determine the distance between its focal plane and the
mirror close to it. (This distance is often referred to as the working distance of a
telescope.) (c) Calculate the position and size of its exit pupil. (d) Determine the
obscuration ratio of the image-forming beam. (The obscuration ratio is the ratio of
the inner and outer diameters of the light cone focusing to the image point.)
(e) Determine the diameter of its secondary mirror for a field of view of ± 5 mrad.
(f) Sketch the system if its focal ratio is 2 and the diameter of its primary mirror is
10 cm.
5.6 Show that the height of a light bulb (assumed to be a point source) from the center
of a circular table of radius $a$ for maximum illumination at its edges is given by
$a/\sqrt{2}$.
5.7 According to the Stefan–Boltzmann law, the exitance (i.e., the power radiated by a
unit area) of a blackbody at a temperature $T$ (in Kelvin) is given by $\sigma T^4$, where
$\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$ is the Stefan–Boltzmann constant. Consider the sun to
be a blackbody at 6000 K. (a) Determine its radiance. (b) Calculate the solar
irradiance on the earth, called the solar constant (the solar constant is also
expressed as 2 calories/cm² min). (c) Compare it with the irradiance of the solar
image formed by a lens with an f-number of 5. (d) Assuming that the moon
reradiates 20% of the light incident on it, compare the lunar irradiance on the earth
for full moon with solar irradiance in full sunlight. Some of the sizes and distances
of interest are as follows: the radius of the sun and its distance from the earth are
$6.96 \times 10^{8}$ m and $1.49 \times 10^{11}$ m, respectively, and the radius of the moon and its
distance from the earth are $1.77 \times 10^{6}$ m and $3.80 \times 10^{8}$ m, respectively.
5.8 Consider an optical system imaging a small circular object of radius $h$ centered on
its optical axis. Let the circular image be of radius $h'$. Let $\beta_0$ and $\beta_0'$ be small
slope angles of the axial marginal rays in the object and image spaces of the system
(see Figure 5-2). Show by using the Lagrange invariance of Eq. (2-74) that the
object and image radiances are related to each other according to Eq. (5-34), where
$n$ and $n'$ are the refractive indices of the object and image spaces. The object and
image sizes are assumed to be small so that the entrance and exit pupils subtend
approximately the same angles at every point on them.
5.9 Determine the flux incident on a solar panel 1 m × 2 m when the sun is at zenith,
30°, and 60°. Assume that the radiance of the sun is 22.5 MW/m² sr and its angular
diameter as seen from the earth is half a degree.
CHAPTER 6
OPTICAL INSTRUMENTS
6.1 Introduction
6.2 Eye
    6.2.1 Anatomy and Structure
    6.2.2 Paraxial Models
    6.2.3 Accommodation
    6.2.4 Visual Acuity
    6.2.5 Spectacles (or Eyeglasses)
6.3 Magnifier
6.4 Microscope
6.5 Telescope
6.6 Ocular
6.7 Telephoto Lens and Wide-Angle Camera
6.8 Resolution
    6.8.1 Introduction
    6.8.2 Airy Pattern
    6.8.3 Rayleigh Criterion of Resolution
    6.8.4 Resolution of an Imaging System
    6.8.5 Resolution of the Eye
    6.8.6 Resolution of a Microscope
    6.8.7 Resolution of a Telescope
6.9 Pinhole Camera
    6.10.1 Eye
    6.10.2 Magnifier
    6.10.3 Microscope
    6.10.4 Telescope
    6.10.5 Resolution
    6.10.6 Pinhole Camera
References
Problems
6.1 INTRODUCTION
In this chapter we describe the basic principles of some of the commonly used
optical instruments. We start with the most common, the human eye, and discuss how
spectacles correct near- or farsightedness. We then discuss a magnifier (or a reading
glass), a microscope, and a telescope. We illustrate how the eye interacts with such
instruments when images are observed by humans. A pinhole camera is also described
briefly.
6.2 EYE
6.2.1 Anatomy and Structure
The human eye is a visual positive lens system that forms a real image on the retina,
as illustrated in Figure 6-1. It is nearly spherical, with a diameter of about 2.5 cm among
adults and a tough 1-mm-thick outer shell called the sclera. Its front portion, where the
eye bulges outward, represents the first element of the lens system called the cornea. It is
a transparent tissue approximately 0.5 mm thick, with a refractive index of 1.377, while
the rest of the sclera is white and opaque. Nearly two-thirds of the bending of object rays
takes place at the air–cornea interface. The cornea is also slightly reflective and acts like a
convex mirror, resulting in our ability to see ourselves in the eyes of another person.
Because the refractive index of the cornea is very close to that of water (1.333), no
significant refraction takes place at a water–cornea interface. Accordingly, a person
cannot see very well under water (divers wear a mask that creates airspace between the
water and the eye). The eyelids protect the delicate cornea from foreign particles. By
blinking constantly, they keep a layer of tears on the cornea. The tear film is produced by
glands within the lids. Without the tears, a dry cornea loses its transparency.
Figure 6-1. Anatomy and structure of the eye.

Rays emerging from the cornea pass through a chamber filled with a clear watery
fluid called the aqueous humor, which has a refractive index of 1.336. Because of the
closeness of the refractive indices, only a small refraction of the rays takes place at the
cornea–aqueous humor interface.
A diaphragm, called the iris, immersed in the aqueous humor, controls the amount of
light entering the eye. Its central hole is called the pupil, which can be seen as a small
central black spot of the eye. It is black, of course, because light goes through it. While
the iris defines the entrance pupil, its image by a crystalline lens, which lies immediately
behind the iris, defines the exit pupil of the eye. The exit pupil is located behind the iris
and is somewhat smaller than the entrance pupil. The iris also gives the eye its color, e.g.,
brown, green, or blue. It is made up of circular and radial muscles that expand or contract
to increase or decrease the diameter of the pupil from approximately 2 mm in bright light
to 8 mm in darkness. The lens, which is about 4 mm thick and 9 mm in diameter, is a
complex, layered fibrous mass surrounded by an elastic membrane. As many as 22,000
very fine layers are arranged as in an onion. Its index of refraction varies from about
1.406 at the inner core to approximately 1.386 at the less dense cortex. Its index is a
radial analog to the linearly varying index of spectacle glasses in use today. Behind the
lens is another chamber filled with a transparent gelatinous substance called the vitreous
humor, which has a refractive index of 1.337. The lens is suspended in place by
threadlike fibers that are connected to the ciliary muscle. The muscle contracts, loosening
tension on the lens and allowing it to bulge, thereby increasing its power to focus on a
nearby object.
Within the tough sclerotic wall is an inner shell, called the choroid. It is a dark layer
with blood vessels and pigmented cells. It absorbs any stray light like the interior black
walls of a camera. A paper-thin (about 50 μm) layer of photoreceptor cells, called the
retina, covers much of the inner surface of the eye. The red glow in the flash photo of
some people represents the light reflected from the retina by fine blood vessels.
Interestingly, the curved retina closely approximates the Petzval surface of the eye’s
optical system. There are two types of photoreceptor cells, called rods and cones. There
are about 125 million rods, 6.5 million cones, and a million fibers. The rods are
extremely sensitive to light, but do not distinguish color. The cones are used in bright
light, such as daylight, and provide color perception. They do not function in a color-
blind person. The normal wavelength range of human vision is approximately 380 nm to
780 nm. How the response varies with wavelength is given in Table 5-2 and Figure 5-24.
The crystalline lens absorbs in the ultraviolet. With age, the lens gets clouded and loses
transparency, a condition called cataract. People whose lenses have been surgically
removed are significantly more sensitive to ultraviolet light.
The electrical impulses generated by the retinal cells are carried by fibers at a rate of
about 10⁹ bits/sec. The eye interfaces with the brain through the optic nerve, which does
not contain any photoreceptors. It is therefore insensitive to light, thereby creating a blind
spot about 0.6 mm in diameter. The blind spot can be demonstrated very easily by
considering Figure 6-2. With the left eye closed, stare at the + sign, starting it at a
distance of about 25 cm and slowly bringing the figure closer. At a certain distance the
picture on the right disappears as its image falls on the blind spot of the right eye. A
chronically elevated pressure of the fluids within the eye, a condition called glaucoma,
can lead to blindness if not treated.
At the center of the retina, there is an area about 2.5 to 3 mm in diameter, known as
the yellow spot or macula. There is a tiny, rod-free region about 0.3 mm in diameter,
called the fovea centralis. The cones in this region are thinner and more densely packed,
and thus yield the sharpest image. They have a diameter of about 1.5 μm and are spaced
about 2 to 2.5 μm apart. There are about 14,700 cells/mm² in the fovea compared to a fine
laser printer, which has 5500 dots/mm². Without the fovea, 90–95% of the vision is lost;
only the peripheral day and night vision is retained. The fovea does not lie on the optical
axis of the lens system of the eye. The line joining the lens center and the fovea is
referred to as the visual axis of the eye.
Comparing the eye to a camera, the cornea provides a majority of the focusing, the
iris is the aperture stop, the crystalline lens provides the fine focusing, and the retina
plays the role of a film or a solid state detector (pixel) array. Of course, the eyelids are
equivalent to a lens cover.
6.2.2 Paraxial Models

A simplified paraxial model of the eye was considered in Problem 4.6. Here, we
briefly describe three such models [1,2]. In the schematic eye model, the cornea and the
lens are represented by two surfaces each, as illustrated in Figure 6-3a. In the simplified
schematic eye, also known as the Helmholtz eye and illustrated in Figure 6-3b, the cornea
is represented by a single refracting surface. The radius of curvature of the cornea, and
the position, thickness, and the radii of curvature of the lens represent the average values
for adult eyes. The aqueous and vitreous humors are assigned a refractive index of 1.333.
The lens is assigned a uniform index of 1.416 to yield the same refracting power as the
schematic eye. When the index of the lens is adjusted to 1.45, the object- and image-
space focal lengths become round numbers equal to 15 mm and 20 mm, respectively.
Because the spacing between the principal points (and, therefore, between the nodal
points) is very small (only 0.32 mm), a single refracting surface represents both the
cornea and the lens in a reduced eye model, as illustrated in Figure 6-3c. The principal
points coincide with the vertex of the surface in this model, and the nodal points coincide
with its center of curvature. The eye is assumed to be filled with vitreous humor of
refractive index 1.333. Thus, the focal length of the reduced eye is the same as that of the
Helmholtz eye. The optical parameters of the three models are listed in Table 6-1.
Figure 6-2. Blind spot demonstration.
Figure 6-3. Paraxial models of the eye, illustrating its cardinal points. (a) Schematic
eye. (b) Simplified schematic eye (single-surface cornea). (c) Reduced eye (single
refracting surface).
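The cardinal points of the reduced eye follow from the standard paraxial relations for a single refracting surface of radius R separating media of indices n and n′: the focal lengths have magnitudes f = nR/(n′ − n) and f′ = n′R/(n′ − n), and the power is n′/f′. The Python sketch below applies them to R = 5.55 mm and n′ = 4/3. The function name is hypothetical, and the use of these particular formulas is an assumption of the sketch rather than a derivation given in this section.

```python
def reduced_eye(R_mm=5.55, n=1.0, n_prime=4.0 / 3.0):
    """Focal lengths (mm, magnitudes) and power (diopters) of a single surface."""
    f_object = n * R_mm / (n_prime - n)          # object-space focal length
    f_image = n_prime * R_mm / (n_prime - n)     # image-space focal length
    power_diopters = 1000.0 * n_prime / f_image  # n'/f' with f' in meters
    return f_object, f_image, power_diopters


f_obj, f_img, power = reduced_eye()
print(f"f = {f_obj:.2f} mm, f' = {f_img:.2f} mm, power = {power:.1f} D")
```

The results agree with the reduced-eye focal points of Table 6-1 to within the rounding of R, and the power comes out close to 60 diopters.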
6.2.3 Accommodation
The cornea provides nearly 43 of the total 60 diopters of the focusing power of an
eye (see Problem 6.2). The fine focusing of the image of an object as its distance changes
is performed by the crystalline lens in a process called accommodation. Generally, the
lens muscles are relaxed when forming the image of an object lying at infinity, as
illustrated in Figure 6-4a. As the object moves closer, the ciliary muscle contracts and the
front surface of the lens becomes more curved. The lens becomes thicker at the
center, thereby reducing its focal length and maintaining a sharp image on the retina. This
is illustrated in Figure 6-4b. Too much work by the ciliary muscles over long periods
leads to eye strain or fatigue. As the object still moves closer, a point is reached when the
lens shape cannot change any more, and the image of any closer object is blurred. The
closest point for which the eye can form a sharp image is called the near point. It varies
from 7 cm for a teenager to 25 cm or so for a young adult, and to roughly 100 cm in a
middle-aged person. Accommodation changes the power of the crystalline lens by about 4
diopters, although a teenager may possess more than 10 diopters.
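The link between near point and accommodation can be made concrete with a one-line calculation: for an eye whose far point is at infinity, bringing an object at distance d into focus requires roughly 1/d diopters of added power, with d in meters. The following sketch uses a hypothetical function name and treats the eye as a thin system, which is an approximation.

```python
def accommodation_diopters(near_point_cm, far_point_cm=float("inf")):
    """Change in object vergence (diopters) between far point and near point."""
    return 100.0 / near_point_cm - 100.0 / far_point_cm


for d_cm in (7, 25, 100):
    print(d_cm, round(accommodation_diopters(d_cm), 1))
```

A 25-cm near point corresponds to 4 diopters, and a 7-cm near point to about 14 diopters, in line with the figures quoted for a young adult and a teenager.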
Mammals, such as humans, generally accommodate by varying the lens curvature,
but fish move the lens toward or away from the retina, just as in a focusing, as opposed to
a fixed-focus, camera. Some mollusks contract or expand the whole eye to alter the
distance between the lens and retina. Birds of prey keep a rapidly moving object in
constant focus over a wide range of distances by changing the curvature of the cornea. In
Lasik (laser in-situ keratomileusis) surgery, it is indeed the curvature of the cornea that is
changed. In a cataract operation, on the other hand, the crystalline lens is replaced by a
plastic lens.
Table 6-1. Optical parameters of the paraxial models of the eye.
Parameter                 Element              Schematic   Simplified       Reduced
                                               eye         schematic eye    eye
Radii of curvature of     Anterior cornea        7.80         7.80            5.55
surfaces (mm)             Posterior cornea       6.50         —               —
                          Anterior lens         10.20        10.00            —
                          Posterior lens        –6.00        –6.00            —
Distances from            Posterior cornea       0.55         —               —
anterior cornea (mm)      Anterior lens          3.60         3.60            —
                          Posterior lens         7.60         7.20            —
                          Retina                24.20        23.90            —
                          Principal point H      1.59         1.55            0
                          Principal point H′     1.91         1.85            0
                          Nodal point N          7.20         7.06            5.55
                          Nodal point N′         7.51         7.36            5.55
                          Focal point F        –15.09       –14.99          –16.67
                          Focal point F′        24.20        23.90           22.22
Refractive indices        Cornea                 1.3771       —               4/3
                          Aqueous humor          1.3374       1.333           4/3
                          Lens                   1.4200       1.416           4/3
                          Vitreous humor         1.3360       1.333           4/3
Figure 6-4. Normal eye. (a) Relaxed eye showing the far point at infinity. (b)
Accommodated eye illustrating the near point.
6.2.4 Visual Acuity

Visual acuity (or sharpness of distant vision) is a measure of one's ability to resolve
details. It is generally measured by ophthalmologists by using an eye chart containing
rows of letters, called Snellen letters, that are progressively smaller in size from one row
to the next, as shown in Figure 6-5a. It is customary to place the Snellen chart at a
distance of 20 ft (or 6 m) from the patient and make the letter size in successive rows
such that they subtend an angle of 5 arc min at distances of, for example, 10, 15, 20, 25,
30, 50, 70, 100, and 200 ft. The width of the black strokes and white spaces in the letters
in a given row subtend an angle of one arc min each at a distance corresponding to the
row in question. If the smallest letters that a patient can read at 20 ft are the ones that a
normal eye would distinguish at 60 ft, his vision is referred to as 20/60 (or 6/18 in the
metric system). Normal vision is 20/20, but some people have a visual
acuity of 20/15, i.e., they can distinguish at 20 ft the letters that a normal person can
distinguish only at 15 ft. Of course, the letters used are in the native language of the person.
People who cannot read (e.g., small children or illiterate folks) are shown instead a row of
E letters oriented at various angles, as illustrated in Figure 6-5b, and the patients are
asked to describe the orientation using their fingers. Pictures of familiar animals and other
artifacts of progressively smaller size are also used with children to get their attention. A
person is considered legally blind if the corrected vision is 20/200 or worse.
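The geometry of the chart can be checked with a few lines of Python. The sketch below assumes
only what the paragraph states (a letter subtends 5 arc min at the distance assigned to its
row, and each stroke subtends 1 arc min). The function and constant names are illustrative.

```python
import math

FT_TO_MM = 304.8                    # 1 ft = 304.8 mm
ARCMIN = math.radians(1.0 / 60.0)   # one arc minute in radians


def snellen_letter_height_mm(row_distance_ft: float, angle_arcmin: float = 5.0) -> float:
    """Height of a feature that subtends `angle_arcmin` at `row_distance_ft`."""
    return row_distance_ft * FT_TO_MM * math.tan(angle_arcmin * ARCMIN)


for row_ft in (10, 15, 20, 25, 30, 50, 70, 100, 200):
    letter = snellen_letter_height_mm(row_ft)
    stroke = snellen_letter_height_mm(row_ft, angle_arcmin=1.0)
    print(f"20/{row_ft:<3d} row: letter {letter:6.2f} mm, stroke {stroke:5.2f} mm")
```

Under these assumptions the 20/20 letters come out a little under 9 mm tall, and the 20/200
letter is ten times larger.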
As a starting point for the correction, Eggers developed an approximate
correspondence between visual acuity and the required corrective power by testing a large
number of patients [3]. An adaptation of his table for a nearsighted person is shown in
Table 6-2. For every row above the eighth in the Snellen chart, which is for 20/20 vision,
the power correction changes by –0.25 diopter per row up to the second row and by
–0.5 diopter in the other rows. These days, automated phoropters estimate the
refractive error of a patient’s vision and switch the lenses; the patient selects the lens that
gives the best vision while looking at the Snellen letters.
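The correspondence in Table 6-2 is easy to encode as a lookup. The following sketch simply
transcribes the tabulated values; the helper names are hypothetical, and the result is only
the approximate starting point described above, not a prescription.

```python
# Table 6-2 (adapted from Eggers): Snellen denominator -> approximate correction in diopters.
EGGERS_TABLE = [
    (20, 0.00), (25, -0.25), (30, -0.50), (40, -0.75), (50, -1.00),
    (70, -1.25), (100, -1.50), (150, -2.00), (200, -2.50), (250, -3.0),
    (300, -3.5), (350, -4.0), (400, -4.5), (450, -5.0), (500, -5.5), (600, -6.5),
]


def starting_correction(denominator: int) -> float:
    """Return the tabulated starting correction for an acuity of 20/denominator."""
    for den, power in EGGERS_TABLE:
        if den == denominator:
            return power
    raise KeyError(f"20/{denominator} is not a row of Table 6-2")


def acuity_bracket(sphere_diopters: float) -> tuple[int, int]:
    """Return the two tabulated denominators whose corrections bracket a myopic sphere power."""
    for (den_lo, p_lo), (den_hi, p_hi) in zip(EGGERS_TABLE, EGGERS_TABLE[1:]):
        if p_hi <= sphere_diopters <= p_lo:
            return den_lo, den_hi
    raise ValueError("sphere power lies outside the range of Table 6-2")


print(starting_correction(70))   # tabulated value for 20/70
print(acuity_bracket(-3.75))     # rows bracketing a -3.75 D sphere
```

A sphere of –3.75 D, for example, falls between the 20/300 and 20/350 entries of the table.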
Visual acuity is maximum at the fovea and decreases in the outer region of the retina.
Thus, the fovea provides the details, and the outer region gives a general view of an
object scene. The eye has an elliptical field of view, approximately 150° high and 210°
wide. The stereoscopic field of view obtained with the use of both eyes is approximately
circular with an angular diameter of about 130°. The eyeball rotates automatically as
needed so that the image of the region of interest in a certain object falls on the fovea. In
low illumination, the eye becomes color blind due to the low sensitivity of the cones.
Visual acuity decreases as illumination decreases. The optic nerve transmits the retinal
image to the brain, which interprets it. For example, the image of an object on the retina is
inverted, yet a person sees it as being erect.

Figure 6-5. Snellen eye charts for (a) literate and (b) illiterate patients.

Table 6-2. Relationship between visual acuity and the corresponding corrective
power required in diopters for a nearsighted person.

Snellen row   Visual acuity   Refractive correction
8             20/20             0.00
7             20/25            –0.25
6             20/30            –0.50
5             20/40            –0.75
4             20/50            –1.00
3             20/70            –1.25
2             20/100           –1.50
              20/150           –2.00
1             20/200           –2.50
              20/250           –3.0
              20/300           –3.5
              20/350           –4.0
              20/400           –4.5
              20/450           –5.0
              20/500           –5.5
              20/600           –6.5

6.2.5 Spectacles (or Eyeglasses)

In a normal eye, called emmetropic, the most distant point, called the far point, is
located at infinity. An eye that does not form a sharp image of an object at infinity on the
retina is called ametropic. There are two types of ametropia: myopia, popularly known as
nearsightedness, and hypermetropia (or hyperopia), or farsightedness. They arise because of an
incorrect relationship between the curvature of the cornea and its distance from the retina,
i.e., the length of the eyeball.
When parallel rays from an object at infinity are focused at F′ in front of the retina,
the focusing power of the eye is too high, as illustrated in Figure 6-6a, and the eye is said
to be myopic. Such a condition can also happen if the curvature of the cornea is too high.
It has the consequence that the far point falls short of infinity, and all points beyond
it will appear blurred. An object point P located at the far point is imaged at P′ on the
retina without any accommodation (see Figure 6-6b). Objects closer than P are imaged on
the retina with accommodation. A person with a myopic eye is said to be nearsighted,
i.e., nearby objects are seen well, but distant objects are not. A myope brings an
object close enough, i.e., at or within the far point, to see it well. The near point of a
myope with normal accommodation is closer than if the eye were emmetropic (normal). A
myopic eye can be compensated with a negative spectacle lens such that the combination
of the two yields a focus on the retina without accommodation [4]. An object at infinity is
imaged by the spectacle lens at the far point (see Figure 6-6c), which the eye is able to
focus on without accommodation. Objects at other distances are seen well with
accommodation. Of course, objects at distances shorter than the far point are also seen
well without the spectacles, using somewhat less accommodation. Nearby objects are
focused on with accommodation, as illustrated in Figure 6-6d.

Figure 6-6. Myopic (or nearsighted) eye. (a) Object at infinity focused at F′ by a
relaxed eye in front of its retina, thus forming a blurry image on the retina. (b)
Objects at the far point and closer are in focus, illustrating that nearby objects are
seen well. (c) Object at infinity imaged by a negative spectacle lens forming a virtual
image at the far point. The virtual image is the object for the eye, which images it at
F′ on the retina. (d) Object P closer than the far point imaged at P′ on the retina
by the spectacle lens and accommodation.

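As a rough numerical illustration of the myopic correction, the spectacle lens must image an
object at infinity at the far point, so its focal length equals the far-point distance taken
as negative. The sketch below assumes a thin lens and neglects the distance between the lens
and the eye; the function name is illustrative.

```python
def myopia_spectacle_power(far_point_cm: float) -> float:
    """Power (diopters) of a thin lens that images infinity at a far point in front of it.

    The virtual image must lie at S' = -far_point, so f' = -far_point and P = 1/f' (f' in m).
    """
    if far_point_cm <= 0:
        raise ValueError("far point distance must be positive for a myopic eye")
    return -100.0 / far_point_cm


for far_point_cm in (200.0, 50.0, 25.0):
    print(f"far point {far_point_cm:5.1f} cm -> {myopia_spectacle_power(far_point_cm):+.2f} D")
```

A far point at 50 cm, for instance, calls for a lens of about –2 D under these assumptions.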
When an object lying at infinity is focused beyond the retina, as illustrated in Figure
6-7a, the eye is said to be hyperopic. The focusing power of the eye is too weak, i.e., the
cornea is less curved, or the lens has become too thin in its relaxed state. Accordingly,
distant objects are seen well only by accommodation, as illustrated in Figure 6-7b.
However, there is not sufficient accommodation for nearby objects, which are imaged
beyond the retina and are therefore out of focus (see Figure 6-7c). A person with such a
condition is said to be farsighted, i.e., distant objects are seen well, but nearby objects are
not. A hyperope holds objects at arm's length (> 25 cm) to see them well. This
distance represents the near point of the person, and spectacles are needed to see well any
objects that are closer than this point. The far point F′ of a hyperope is virtual and is located
behind the eye, as in Figure 6-7a. With normal accommodation, the near point of such a
person is more distant than that for a normal eye. The focal length of the spectacle lens is
chosen so as to bring the near point to a comfortable distance of 25 cm (see Figure 6-7d).
With a positive spectacle lens, distant objects are imaged sharply without much
accommodation. However, nearby objects are imaged by the lens beyond the near point,
and these images, in turn, are brought into focus by accommodation. An object at infinity is
imaged by the lens at the far point, which the eye images on the retina, as illustrated in Figure 6-7e.
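The choice of focal length for a hyperope can be sketched numerically. The example assumes a
thin lens in contact with the eye and the Gaussian imaging equation 1/S′ – 1/S = 1/f′ with
distances to the left of the lens taken as negative; the function name and the sample near
points are hypothetical.

```python
def hyperopia_spectacle_power(near_point_cm: float, reading_distance_cm: float = 25.0) -> float:
    """Thin-lens power (diopters) that images an object at the reading distance onto the near point.

    Gaussian imaging with distances in meters: 1/S' - 1/S = 1/f', where
    S = -reading distance (real object) and S' = -near point (virtual image).
    """
    if near_point_cm <= reading_distance_cm:
        raise ValueError("the near point is already at or inside the reading distance")
    s = -reading_distance_cm / 100.0
    s_prime = -near_point_cm / 100.0
    return 1.0 / s_prime - 1.0 / s


for near_point_cm in (50.0, 100.0, 200.0):
    print(f"near point {near_point_cm:5.1f} cm -> {hyperopia_spectacle_power(near_point_cm):+.2f} D")
```

A near point at 100 cm, for example, requires about +3 D to bring the reading distance to 25 cm.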
Another common defect of the eye is astigmatism. It arises from a cornea that is toric
(or spherocylindrical) instead of being spherical, i.e., the cornea has an uneven curvature
(like an egg rather than a ping-pong ball), resulting in different powers in different meridians
(see Figure 6-8). A toric surface, such as that illustrated in Figure 6-9, forms a line image
of a point object even when it lies on its axis. However, a person afflicted with
astigmatism sees only a blurry image. If the object consists of vertical and horizontal
lines, as in the wires of a window screen, such a person can focus (by accommodation) on
only the vertical or the horizontal lines at a time. This is analogous to the spoked-wheel
example of Figure 9-16, where the rim is in focus in one observation plane and the spokes
are in focus in another when imaged by a lens with astigmatism. The astigmatism of the
eye may be other than horizontal or vertical. Its axis can be determined by looking at an
astigmatism eye chart consisting of radial spokes, as shown in Figure 6-10a. The spoke
that is seen in focus, e.g., the one at 30° from the vertical in Figure 6-10b, represents the
axis of astigmatism. A cylindrical lens, illustrated in Figure 6-11, is used to correct
astigmatism. It introduces power only in the meridian perpendicular to its axis. If the eye is
also myopic or hyperopic, then a toric lens is required for correction. However, with a rigid
contact lens, the space between its back surface and the cornea is filled with the tear fluid. Thus,
astigmatism practically disappears, and the curvature of the front surface provides the
needed myopic or hyperopic correction. A soft contact lens requires proper orientation to
align its toroidal power with that of the eye.

Figure 6-7. Hyperopic (or farsighted) eye. (a) Object at infinity focused beyond the
retina by a relaxed eye, thus forming a blurry image on the retina. The focal point
F′ is the virtual far point of the eye. (b) Object at or beyond the near point focused
on the retina by accommodation. (c) Object P closer than the near point imaged at
P′ beyond the retina even with accommodation. (d) Positive spectacle lens images
an object P at 25 cm at the near point, which the eye images at P′ on the retina. (e)
Object at infinity imaged by the spectacle lens at the far point, which the eye images
on the retina.

Figure 6-8. Astigmatic eye illustrating a cornea with uneven curvature, resulting in a
blurry image of a point object on the retina. Although line images are formed in
front of and behind the retina, and a circular image is formed halfway between
them, a patient perceives only a blurry image.

Figure 6-9. Toric surface. (a) Convex. (b) Concave.

Astigmatism of the eye is different from that of rotationally symmetric optical
imaging systems, which is zero for an axial object. However, astigmatism of the eye,
which results from an uneven curvature of the cornea, is nonzero even for an axial object.

Figure 6-10. (a) Astigmatism eye chart. (b) Chart as observed by an astigmatic eye,
indicating the axis of astigmatism at 30° from the vertical.

Figure 6-11. Cylindrical lens showing parallel rays incident on it are focused on a
line (as opposed to a point in the case of a spherical lens). (a) Convex. (b) Concave.

With age comes another condition called presbyopia, i.e., the inability of the eye to
accommodate. The crystalline lens hardens and becomes inflexible. As a result, a
nearsighted person, for example, cannot read a newspaper at a normal distance while
wearing glasses. In order to read it, the newspaper is kept at arm’s length. The near point
in this case has receded beyond the comfortable reading distance. The choice is either to
remove the eyeglasses or wear bifocal lenses with a less-negative lower half. Similarly, a
farsighted person wears bifocals with a more-positive lower half. Some people cannot
adjust to bifocal spectacles and keep two sets. For example, a nearsighted person may use
one pair of spectacles for distant objects (e.g., driving and watching television) and
another for nearby objects (e.g., reading and sewing).
A starting point for a prescription is determined by looking at the eye chart without
any spectacles. Visual acuity of a person does not by itself determine if that person is
nearsighted or farsighted. A precise prescription is determined by looking at the eye chart
through a variety of lenses in a trial frame until the visual acuity becomes normal or as
high as possible. The range of spherical lenses is generally ± 20 D, and the range of
cylindrical lenses is ± 7 D in steps of 0.25 D. A typical eye prescription may read like the
author's shown in Table 6-3.

Table 6-3. A typical eye prescription.

Eye                             Sphere    Cylinder   Axis
Right eye (Oculus Dexter)       –3.75     –0.75      137°
Left eye (Oculus Sinister)      –3.50     –0.25       30°

Bifocal (Reading Glasses)
Right eye                       2.50 D
Left eye                        2.50 D

The prescription for the right eye calls for a combination of a sphere having a power
of – 3.75 D and a cylinder having a power of – 0.75 D, with its axis inclined 47° from the
vertical toward the temple. The negative sign on the spherical correction implies that this
eye is nearsighted, and Table 6-2 indicates a visual acuity lower than 20/300. Similarly,
the left eye needs a combination of a sphere having a power of – 3.50 D and a cylinder
having a power of –0.25 D, with its axis inclined 30° from the vertical toward the temple.
Thus, both eyes have astigmatism. The optician can fill the prescription for the right eye
by starting from a flat blank and grinding a sphere with a power of – 3.75 D on one side
and a cylinder with a power of –0.75 D on the other. In practice, they start with a blank
that is close to the power required, e.g., a blank with – 3.00 D on one surface. Such a lens
is satisfactory when viewing through its center, but it has large aberrations near its edge.
The outer portions of the field of view are improved by using a meniscus lens, i.e., one
that has surfaces with curvatures of the same sign, as illustrated in Figure 2-19. The rear
surface (i.e., the one closer to the eye) is concave with a power of –6.00 D. Because the
prescription calls for a combination of a sphere and a cylinder, the front surface needs to
be toroidal, and the lens is said to be toric. The optician transposes the prescription such
that in the meridian at 137°, the power of the front surface is 2.25 D, and the power of the
cylinder in the 47° meridian is 6.75 D. The power of a spectacle lens is defined to within
a quarter diopter. The prescription is bifocal, needing a spherical power of 2.50 D for
both eyes in the lower half of the spectacle lens.
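The meridional powers implied by a sphere-cylinder prescription, and the standard transposition
between its minus-cylinder and plus-cylinder forms, can be sketched as follows. The sketch
assumes the usual ophthalmic convention that a cylinder contributes no power along its axis and
its full power in the perpendicular meridian. The class and function names are illustrative and
not from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prescription:
    sphere: float    # diopters
    cylinder: float  # diopters
    axis_deg: float  # cylinder axis in degrees, 0 < axis <= 180


def principal_powers(rx: Prescription) -> dict[float, float]:
    """Total power in the two principal meridians, keyed by meridian angle in degrees."""
    perpendicular = (rx.axis_deg + 90.0) % 180.0
    return {rx.axis_deg: rx.sphere, perpendicular: rx.sphere + rx.cylinder}


def transpose(rx: Prescription) -> Prescription:
    """Describe the same lens with the sign of the cylinder flipped (standard transposition)."""
    new_axis = (rx.axis_deg + 90.0) % 180.0 or 180.0
    return Prescription(rx.sphere + rx.cylinder, -rx.cylinder, new_axis)


right_eye = Prescription(sphere=-3.75, cylinder=-0.75, axis_deg=137.0)
print(principal_powers(right_eye))  # powers in the 137 and 47 degree meridians
print(transpose(right_eye))         # equivalent plus-cylinder form
```

For the right eye of Table 6-3, this gives –3.75 D in the 137° meridian and –4.50 D in the 47°
meridian. With a –6.00 D rear surface, the first value agrees with the 2.25 D front-surface
power quoted above.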
Spectacle lenses are generally made with (spectacle) crown glass of refractive index
1.523. Sometimes, a glass with a high refractive index of 1.70 is used to reduce the
weight of the lens. These days, plastic lenses (Plexiglass) of index 1.495 are used because
their weight is roughly half that of a glass lens. Photochromic lenses, which change
transmission as a function of illumination level, are also quite common. Contact
lenses are meniscus lenses varying from 6 to 15 mm in diameter. A contact lens rests on
the cornea with a conforming shape. Contact lenses are approximately 100 μm thick and ride on
the tear fluid. Their refractive index ranges from 1.43 to 1.49. They are made from some
polymer or even Plexiglass, weighing about 1 to 3 mg.
The cause of poor acuity is not always an abnormal refraction by the cornea or the
lens, or an incorrect distance from the retina. It may be due to other causes, e.g., the media
of the eye may be partially opaque, or the retina may be diseased. Whether spectacles
will improve the acuity can be easily checked by looking through a pinhole (so that
only a small central region of the cornea/lens is used). If the person under examination
can see well through the pinhole, spectacles will help. Otherwise, the condition is
pathological, and spectacles may not help at all.
6.3 MAGNIFIER
The apparent size of an object as seen with an unaided eye depends on the angle it
subtends at the eye. As an object is moved closer from a position $P_1$ to a position $P_2$, the
angle it subtends at the eye increases from $\beta_1$ to $\beta_2$, as illustrated in Figure 6-12a. This
results in an increase in the size of the image on the retina, which the eye keeps in focus
by accommodation. However, there is a limit to how close the object can get before the
eye is not able to accommodate any more, and the image gets blurred. Although this
distance varies somewhat from person to person, 25 cm is considered a standard near
point or the distance of most distinct vision. A magnifier, also called a simple
microscope, is a positive lens used to magnify the image beyond one’s accommodation.
People use it as a magnifying reading glass to look at fine print, and watchmakers use it
as an eye loupe to look at the details inside a watch.
Let $\beta = -h/25$ be the angle subtended by an object of height $h$ (in cm) when placed
at the near point, as illustrated in Figure 6-12b. If the object is observed through a
magnifier, it can be brought much closer to the eye. If it is placed inside the focus $F$ of
the magnifier at a distance $S$ from it, as illustrated in Figure 6-12c, it forms a large virtual
but erect image of height $h'$ at a distance $S'$ from the magnifier. This virtual image is seen by
the eye subtending a much larger angle $\beta'$.
If a magnifier of focal length $f'$ lies at a distance $d$ from the eye, it forms a virtual
image of height $h'$ at a distance $S' + d$ from the eye, where
$$h' = h\,(S'/S) = h\,(f' - S')/f' . \tag{6-1}$$

Figure 6-12. Magnifier. (a) Object at various distances observed with an unaided
eye. (b) Object observed at the standard distance of 25 cm. (c) Object observed
through a magnifier of focal length $f'$ kept at a distance $d$ from the eye. (d) Object
placed in the front focal plane of the magnifier and observed through a magnifier in
(near) contact with the eye.

The virtual image is seen by the eye subtending an angle $\beta' = h'/(S' + d)$ at the eye. The
ratio of the size of the retinal image thus formed to its size when the object is seen
without the magnifier from a distance of 25 cm is called the visual magnification of the
magnifier. It is given by the angular magnification
$$M_\beta = \frac{\beta'}{\beta} = \frac{h'/(S' + d)}{-h/25} = -\frac{25\,(f' - S')}{f'\,(S' + d)} . \tag{6-2}$$
The magnification increases when the magnifier is moved closer to the eye. If it is
held close to the eye, we let $d \to 0$. Moreover, if the image formed by the magnifier lies
at the near point of the eye, then $S' = -25$ cm. Accordingly, the visual magnification
becomes
$$M_\beta = \frac{25}{f'} + 1 , \tag{6-3}$$
where $f'$ is in cm. The smaller the value of $f'$ is, the larger the value of the
magnification. If the object is placed in the focal plane of the magnifier, then $S = -f'$
and $S' = -\infty$, as in Figure 6-12d, and a normal eye sees it without much accommodation.
In this case, $\beta' \to -h/f'$ and
$$M_\beta \to 25/f' . \tag{6-4}$$
This result can also be obtained from Eq. (6-2) by letting $S' \to -\infty$, regardless of the
value of $d$. Magnifiers are often specified by this magnification. For example, a
magnifier with a focal length of 5 cm is labeled as $5\times$.
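Equations (6-2) through (6-4) can be checked numerically with a short Python sketch. It assumes
the sign convention used above (the image distance and the magnifier-to-eye distance are
numerically negative, and all lengths are in cm). The function name and sample values are
illustrative.

```python
import math


def magnifier_visual_magnification(f_cm: float, s_image_cm: float, d_cm: float = 0.0) -> float:
    """Eq. (6-2): M = -25 (f' - S') / (f' (S' + d)), with all lengths in cm.

    f_cm       : image-space focal length f' of the magnifier (positive)
    s_image_cm : image distance S' from the magnifier (negative for a virtual image)
    d_cm       : signed magnifier-to-eye distance d (negative or zero in this convention)
    """
    if math.isinf(s_image_cm):
        return 25.0 / f_cm  # limiting form, Eq. (6-4)
    return -25.0 * (f_cm - s_image_cm) / (f_cm * (s_image_cm + d_cm))


f = 5.0
print(magnifier_visual_magnification(f, -25.0))        # image at the near point, Eq. (6-3)
print(magnifier_visual_magnification(f, -math.inf))    # image at infinity, Eq. (6-4)
print(magnifier_visual_magnification(f, -25.0, -5.0))  # magnifier held 5 cm from the eye
```

For a 5-cm magnifier, the first case reproduces 25/f′ + 1 and the second 25/f′, while the third
shows the drop in magnification when the magnifier is held away from the eye.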
6.4 MICROSCOPE
A microscope is generally used to see the details of very small objects at very short
distances. As illustrated in Figure 6-13, a microscope, or more accurately, a compound
microscope, consists of two lenses, one with a very short focal length called the objective,
and the other with a somewhat longer focal length called the eyepiece. In practice, both
the objective and the eyepiece are actually made up of several lenses to reduce the
monochromatic as well as chromatic aberrations (which are discussed in Chapters 7 and
8). The objective of a microscope is its aperture stop, and its image by the eyepiece is its
exit pupil. All of the light entering the objective and refracted by the eyepiece passes
through the exit pupil. The pupil of the eye is placed at the exit pupil; otherwise, the field
of view is restricted.
When an object $P_0P$ is placed just beyond the focal point of the objective, a real
magnified image $P_0'P'$ is formed by it. This image is further magnified by the eyepiece
acting as a magnifier. The magnified virtual image $P_0''P''$ is observed by the eye, yielding
a final image $P_0'''P'''$ on the retina. The magnification $M$ of the retinal image is equal to
the product of the transverse magnification $M_t$ of the image formed by the objective and
the angular magnification $M_\beta$ of the image formed by the eyepiece:
$$M = M_t M_\beta . \tag{6-5}$$

Figure 6-13. Microscope (or a compound microscope). Object $P_0P$ is placed slightly
beyond the focus of the first lens, called the objective. The second lens, called the
eyepiece, magnifies the image $P_0'P'$ formed by the objective. The magnified image
$P_0''P''$ is viewed by the eye, placed at the exit pupil of the microscope, as the image
$P_0'''P'''$.

An approximate expression for the magnification may be obtained in terms of the
tube length $L$ of the microscope and the focal lengths $f_o'$ and $f_e'$ of the objective and the
eyepiece (see Figure 6-13). The objective forms the image of the object in the vicinity of
the object-space focal point $F_e$ of the eyepiece at an approximate distance $L$ from the
image-space focal point $F_o'$ of the objective. From Eq. (2-83), the magnification of this
image is given by
$$M_t = -L/f_o' . \tag{6-6}$$
This image is magnified by the eyepiece, which, in turn, is seen by the eye. The visual
magnification of the retinal image formed by the eyepiece is given by
$$M_\beta = 25/f_e' . \tag{6-7}$$
Both Eqs. (6-6) and (6-7) become exact when the objective forms the image $P_0'P'$ of the
object in the focal plane of the eyepiece, which, in turn, forms the virtual image $P_0''P''$ at
infinity. Substituting these equations into Eq. (6-5), we obtain the magnification of the
microscope:
$$M = -\frac{L}{f_o'}\,\frac{25}{f_e'} \tag{6-8a}$$
$$\phantom{M} = 25/f' , \tag{6-8b}$$
where
$$f' = -\frac{f_o' f_e'}{L} \tag{6-9}$$
is the focal length of the microscope, as may be seen by letting $t = f_o' + f_e' + L$ in Eq. (4-26).
The magnification $M$ represents the ratio of the size of the retinal image when an
object is viewed through the microscope to its size when viewed without any aid but
placed at a distance of 25 cm.
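A short numerical check of the microscope relations may be helpful. The sketch below assumes
thin lenses, lengths in cm, and a lens separation of $f_o' + f_e' + L$ (with $L$ measured from
$F_o'$ to $F_e$). It uses the standard thin-lens combination formula, which is assumed here to
be equivalent to the two-lens formula cited in the text. The tube length and focal lengths are
hypothetical values chosen only for illustration.

```python
def microscope(L_cm: float, f_obj_cm: float, f_eye_cm: float) -> dict[str, float]:
    """Magnification of a compound microscope from Eqs. (6-5) to (6-9); lengths in cm."""
    m_t = -L_cm / f_obj_cm                   # Eq. (6-6): transverse magnification of the objective
    m_beta = 25.0 / f_eye_cm                 # Eq. (6-7): visual magnification of the eyepiece
    f_system = -f_obj_cm * f_eye_cm / L_cm   # Eq. (6-9): focal length of the microscope
    return {"M_t": m_t, "M_beta": m_beta, "M": m_t * m_beta, "f": f_system}


def two_lens_focal_length(f1: float, f2: float, t: float) -> float:
    """Standard thin-lens combination: 1/f = 1/f1 + 1/f2 - t/(f1 f2)."""
    return 1.0 / (1.0 / f1 + 1.0 / f2 - t / (f1 * f2))


# Hypothetical values: 16 cm tube length, 0.4 cm objective, 2.5 cm eyepiece.
result = microscope(L_cm=16.0, f_obj_cm=0.4, f_eye_cm=2.5)
print(result)
print(25.0 / result["f"])                                  # Eq. (6-8b)
print(two_lens_focal_length(0.4, 2.5, 0.4 + 2.5 + 16.0))   # focal length from the lens separation
```

With these values the objective contributes a magnification of –40 and the eyepiece 10, for a
total of –400, and both routes to the system focal length give –0.0625 cm.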
Objects are often observed with a microscope by placing them under a cover glass.
Sometimes the space between the cover glass and the objective is filled with a liquid,
such as an oil, to yield a higher numerical aperture and better resolution, as discussed in
Section 6.8.5. A nearsighted or farsighted person can remove their spectacles and focus
the microscope by moving the eyepiece in and out, but an astigmatic person must wear
them when observing with a microscope.
6.5 TELESCOPE
Whereas a microscope is used to view very small, nearby objects, a telescope is used
to view large, distant objects. Like a microscope, an astronomical telescope also consists
of two lenses: an objective and an eyepiece. The two lenses in a telescope are confocal
(or common focus), as illustrated in Figure 6-14, and represent an example of an afocal
system. (A reflecting afocal telescope is discussed in Section 3.4 in the form of a beam
expander.) In Figure 6-14a, both lenses are positive, and the telescope is called Keplerian.
In Figure 6-14b, the first lens is positive, but the second is negative, and the telescope
is called Galilean. The first lens of diameter $D_1$ is called the objective (because it is
closer to the object), and the second lens of smaller diameter $D_2$ is called the eyepiece
(because the observing eye is placed near it).

Figure 6-14. Refracting telescope consisting of two lenses with a common focus. The
image-space focal point $F_1'$ of the first lens and the object-space focal point $F_2$ of the
second lens are coincident. (a) A Keplerian telescope has positive lenses, i.e., $f_1'$ and
$f_2'$ are both numerically positive. (b) A Galilean telescope has a positive first lens
but a negative second lens, i.e., $f_1'$ is numerically positive, but $f_2'$ is numerically
negative.

A parallel beam of light incident on the system is focused at the common focus by
the first lens and emerges as a parallel beam from the second. If the first lens has a longer
focal length than that of the second, the system may also be used as a beam reducer.
Similarly, if the second lens has a longer focal length, then the system can be used as a
beam expander, as may be seen by reversing the system. A screen with a hole, called a
spatial filter, can be inserted at the focus to clean up a laser beam incident on a Keplerian
telescope. However, if the beam is of high power, then the Galilean telescope can be
used to avoid air breakdown at the common focus. It is easy to see from the figure that
the beam-expansion ratio $D_2/D_1$ is given (in magnitude) by $f_2'/f_1'$, where $D$ and $f'$ are the
diameter and the image-space focal length of a lens. In Figure 6-14, the image of an object can be
determined by applying the Gaussian or the Newtonian imaging equation recursively to
the two lenses. Now, if the object position changes by a distance $\Delta S$, then the image
position changes by a distance $\Delta S' = M_t^2\,\Delta S$, according to Eq. (2-111), where $M_t$ is the
transverse magnification of the image (and the refractive indices of the object and image
spaces are both equal to unity).
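The beam-expansion ratio and the longitudinal image shift can be illustrated with a small
Python sketch. It assumes thin lenses in air and a confocal spacing equal to the sum of the two
image-space focal lengths. The function names and focal lengths are hypothetical.

```python
def afocal_pair(f1: float, f2: float) -> dict[str, float]:
    """Confocal two-lens system: lens spacing and output/input beam-diameter ratio."""
    return {
        "spacing": f1 + f2,              # F1' and F2 coincide
        "diameter_ratio": abs(f2 / f1),  # |D2/D1| = |f2'/f1'|
    }


def image_shift(m_t: float, object_shift: float) -> float:
    """Image shift for a small object shift: Delta S' = M_t**2 * Delta S (unit indices)."""
    return m_t ** 2 * object_shift


# Hypothetical 5x beam expanders (focal lengths in mm).
print(afocal_pair(50.0, 250.0))    # two positive lenses, real common focus
print(afocal_pair(-50.0, 250.0))   # negative first lens, no real focus between the lenses
print(image_shift(0.2, 10.0))      # longitudinal shift for M_t = 0.2
```

Both layouts expand the beam five times, but the one with a negative lens is shorter and has no
real focus between the lenses, which is the property relied on above for high-power beams.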
Incidentally, the object-space focal point $F_1$ of the first lens and the image-space
focal point $F_2'$ of the second lens are conjugates of each other, as illustrated in Figure 6-15.
Considering $F_1$ as the object, the first lens forms its image at infinity. Thus, parallel
rays are incident on the second lens, which focuses them at $F_2'$.
Parallel rays from a point on a distant object are shown incident on an objective of a
long focal length $f_o'$ in Figure 6-16a. A real inverted image is formed at $P'$ in its focal
plane. If the focus of the eyepiece also lies in this plane (i.e., if the eyepiece is confocal
with the objective), it forms a virtual image $P''$ of $P'$ at infinity, which is observed by a
relatively relaxed eye as $P'''$.
In astronomical telescopes, the aperture stop AS lies at the objective, which is,
therefore, the entrance pupil EnP. Its image by the eyepiece of a short focal length $f_e'$ is
the exit pupil ExP. The distance between the eyepiece and the exit pupil is called the eye
relief. This distance, indicated as $s'$ in Figure 6-16b, is the image distance corresponding
to an object distance $s = -(f_e' + f_o')$. It is given by $s' = f_e'(f_e' + f_o')/f_o'$. The eye is placed
as close to the exit pupil as possible (to avoid restricting the field of view). If $D_{en}$ and
$D_{ex}$ are the diameters of the entrance and exit pupils, the magnification of the pupil is
given by
$$m \equiv D_{ex}/D_{en} \tag{6-10}$$
$$\phantom{m} = s'/s = -f_e'/f_o' . \tag{6-11}$$
This result may also be seen from similar triangles $ABF_e$ and $CDF_e$ formed by the
marginal ray from an axial point object at infinity.

Figure 6-15. Conjugate focal points. (a) Keplerian telescope. (b) Galilean telescope.

Figure 6-16. Keplerian telescope with positive objective and eyepiece. (a) The image
formed by the objective is reimaged by the confocal eyepiece at infinity, which is
observed by a relaxed eye. (b) The eyepiece limits the angle of a chief ray CR that
can be transmitted by it.

The magnification of the telescope is given by
$$M_\beta = \beta'/\beta , \tag{6-12}$$
where $\beta$ is the angle subtended by the object at the objective (or at the unaided eye), and
$\beta'$ is the angle subtended at the eye by the image formed by the eyepiece. It represents
the factor by which the telescope magnifies the angular separation of the images of two
distant objects. It may be seen from the triangles $BCE$ and $CEO$ formed by the chief ray
CR in Figure 6-16b that the magnification of the retinal image due to the telescope is
given by the ratio of the focal length of the objective to that of the eyepiece, i.e.,
$$M_\beta = s/s' = D_{en}/D_{ex} \tag{6-13}$$
$$\phantom{M_\beta} = -f_o'/f_e' . \tag{6-14}$$
For an object lying at a finite distance, the total magnification of the telescope is obtained
by multiplying the magnification of the objective with that of the eyepiece.
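The eye relief, pupil magnification, and angular magnification of a Keplerian telescope can be
evaluated together in a short sketch. It assumes thin lenses, the aperture stop at the
objective, and the Gaussian equation 1/s′ – 1/s = 1/f′ with the objective lying to the left of
the eyepiece (a numerically negative object distance, as the label (–)s in Figure 6-16b
suggests). The focal lengths and aperture are hypothetical values in mm.

```python
def keplerian_telescope(f_obj: float, f_eye: float, d_entrance: float) -> dict[str, float]:
    """Paraxial quantities for a confocal telescope with the aperture stop at the objective.

    The objective is the object for the eyepiece, at a distance f_obj + f_eye to its left,
    so s = -(f_obj + f_eye) in the Gaussian equation 1/s' - 1/s = 1/f_eye.
    """
    s = -(f_obj + f_eye)
    s_prime = 1.0 / (1.0 / f_eye + 1.0 / s)   # eye relief
    pupil_mag = s_prime / s                   # Eq. (6-11): equals -f_eye / f_obj
    return {
        "eye_relief": s_prime,
        "pupil_magnification": pupil_mag,
        "exit_pupil_diameter": abs(pupil_mag) * d_entrance,
        "angular_magnification": -f_obj / f_eye,  # Eq. (6-14)
    }


# Hypothetical telescope: 1000 mm objective of 100 mm diameter with a 25 mm eyepiece.
scope = keplerian_telescope(f_obj=1000.0, f_eye=25.0, d_entrance=100.0)
print(scope)
```

For these values the eye relief is slightly longer than the eyepiece focal length, the exit
pupil is 2.5 mm across, and the angular magnification is –40, the reciprocal of the pupil
magnification.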
We also note that the eyepiece limits the angle of a chief ray transmitted by the
system. Although Figure 6-16a indicates that a chief ray with a larger angle than shown
may be transmitted by the eyepiece, it also indicates that the outer rays in the off-axis ray
bundle will be vignetted. Thus, the angle $\beta$ represents the unvignetted field of view of the
telescope. Hence, from Eq. (6-12), a telescope with a large field of view is accompanied
by a correspondingly small image magnification (as observed by the eye). For two
positive lenses, as in Figure 6-16b, both focal lengths are numerically positive and the
negative sign in Eq. (6-14) indicates that the image formed by the telescope is inverted.
Sometimes prisms or additional lenses are used to erect the image. Such astronomical
telescopes are referred to as terrestrial telescopes.
The field of view of a telescope can be increased by inserting a lens, called a field
lens, at the image formed by the objective. Figure 6-17a shows an object at a field angle
such that only the rays from the lowest portion of the objective are incident on the
eyepiece; all other rays are vignetted (unless the diameter of the eyepiece is increased).
With a slightly larger angle, there would be complete vignetting. However, as illustrated
in Figure 6-17b, a lens placed at the intermediate image plane bends the rays toward the
eyepiece, thus eliminating vignetting. The increase in the field of view is obtained
without increasing the diameter of the eyepiece. Although the position of the final image
is unaffected by the field lens, the exit pupil does move to the left, and the eye relief is
reduced. In practice, the field lens is often displaced slightly to avoid seeing its
imperfections.
In a Galilean telescope, the eyepiece is a negative lens, as illustrated in Figure 6-18a.
The objective forms a real inverted image of the object at $P'$ that becomes a virtual
object for the negative eyepiece. An erect but virtual image is formed at infinity by the
eyepiece, and is observed by the eye. If the aperture stop of the telescope lies at the
objective, its image by the eyepiece yields the exit pupil of the telescope. This image lies
between the objective and the eyepiece. Accordingly, the eye cannot be placed at the exit
pupil of the telescope. Thus, the eye, placed as close to the eyepiece as possible, becomes
the aperture stop AS and the exit pupil ExP of the telescope, as illustrated in Figure 6-18b.
Its large image formed by the telescope is the entrance pupil EnP of the system. It lies
behind the observer in Figure 6-18b. Treating the EnP as the object at a distance $s$ and
ExP as its image by the objective at a distance $s' = f_o' + f_e'$, and neglecting the distance
between the eye and the eyepiece, the magnification of the pupil (i.e., the ratio of the
diameters of the exit and entrance pupils) is given by Eq. (6-11). Similarly, the
magnification of the retinal image due to the telescope is given by Eq. (6-14). We
note that in this case the objective limits the angle of a chief ray transmitted by the
telescope. Therefore, it is the field stop of the system. The chief advantage of the Galilean
telescope is its small overall length. It is used for viewing operas and sports.

Figure 6-17. Use of a field lens to increase the field of view of a Keplerian telescope.
(a) Field of view limitation because of vignetting of rays, except from the lowest
portion of the objective. (b) Vignetting eliminated by a field lens placed in the focal
plane of the objective.
Figure 6-18. (a) Galilean telescope with a negative eyepiece. (b) Objective as the field
stop, and eye as the aperture stop AS. The entrance pupil EnP is indicated as the
object, and the exit pupil ExP is its image by the objective.
6.6 OCULAR
Although a magnifier may be used as an eyepiece in a microscope or a telescope, in
practice, compound lenses are designed to reduce lateral color (discussed in Chapter 7).
Such eyepieces are called oculars. Whereas a magnifier is used to look at a real object, an
ocular is used to look at an image formed by another optical system. An example of an
ocular is the Huygens eyepiece discussed in Section 7.6.2. It consists of two lenses of the
same material separated by a distance equal to half the sum of their focal lengths. The
lens closer to the eye is called the eye lens, and the one closer to the objective is called the
field lens. Binoculars consist of two telescopes mounted side-by-side, one for each eye.
6.7 TELEPHOTO LENS AND WIDE-ANGLE CAMERA

A telephoto lens is used to take adequate-size pictures of distant objects. An afocal
system can also be used to change the effective focal length and field of view of another
system by inserting it in the collimated region of the other system. Consider, for example,
a system with an image-space focal length $f'$ and an angular field of view $\beta'$.
Figure 6-19. (a) Telephoto lens attached to an imaging system, giving a long effective
focal length and thereby a large image of a distant object. (b) Wide-angle lens
attached to an imaging system, giving a short effective focal length and thereby
giving a wide field of view. $F'$ and $H'$ are the image-space focal point and principal
point of the imaging system. Similarly, $F_e'$ and $H_e'$ are the image-space focal point
and principal point of the combined system.
Figure 6-19a shows how it can be combined with an afocal system to form a telephoto
system. The first lens in the figure is positive, and the second is negative. Thus, $f_1'$ is
numerically positive, but $f_2'$ is numerically negative, where $|f_2'| < f_1'$. The combined
system has a longer effective focal length $f_e' = f'/M_t$, where $M_t = -f_2'/f_1'$ is the
transverse magnification of the afocal system, and $M_t < 1$. We note from the figure that
the angular magnification is $M_\beta = \beta'/\beta_e = 1/M_t$, and $M_\beta > 1$ or $\beta_e < \beta'$.
Now, the size of the image of a distant object formed by an optical system depends
linearly on its focal length. Because such an image is increased in size by the use of the
afocal system, the combined system is a telephoto system. If $\beta'$ is the field of view
of the imaging system by itself, we find that the effective field of view of the combined
system is reduced to $\beta_e$, which is smaller than $\beta'$ by a factor of $1/M_t$. Note that to avoid
vignetting by the afocal system, $\beta_e$ must be $\le D_2/f_1'$, where $D_2$ is the diameter of the
beam emerging from the afocal system. It should be noted that adding an afocal system to
an imaging system is not the only way to achieve the telephoto effect. A positive and a
negative lens of suitable focal lengths also form a telephoto system, as illustrated by
Problem 2.12.
Similarly, a wide-angle lens is used to take pictures of large, nearby objects, such as
a large group of people. When the afocal system is used in reverse so that the
first lens is negative ($f_1' < 0$) and the second lens is positive ($f_2' > 0$), as in Figure 6-19b,
the effective focal length of the combined system is reduced to $f_e' = f'/M_t$, where
$M_t = -f_2'/f_1'$, and $M_t > 1$. The angular magnification of the afocal system is
$M_\beta = \beta'/\beta_e = -1/M_t$, and $|M_\beta| < 1$ or $|\beta_e| > |\beta'|$. Note that $\beta'$ is numerically negative in
Figure 6-19b. If $\beta'$ is the field of view of the imaging system by itself, we find that the
effective field of view $\beta_e$ of the combined system is larger than $\beta'$, or that the combined
system is a wide-angle system.
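
As a minimal sketch of the relations above, the following Python function evaluates
$M_t = -f_2'/f_1'$, the effective focal length $f_e' = f'/M_t$, and the magnitude of the
effective field of view for an afocal attachment. The function name, its arguments, and
the focal lengths in the usage lines are illustrative assumptions, not values from the text.

```python
def afocal_attachment(f_camera, f1, f2, beta_camera):
    """Effect of an afocal attachment placed in front of an imaging system.

    f_camera    : image-space focal length f' of the imaging system
    f1, f2      : focal lengths f1', f2' of the first and second afocal lenses
    beta_camera : magnitude of the field of view beta' of the imaging system alone
    Returns (M_t, effective focal length, magnitude of effective field of view).
    """
    m_t = -f2 / f1                # transverse magnification of the afocal system
    f_eff = f_camera / m_t        # effective focal length f_e' = f'/M_t
    beta_eff = beta_camera * m_t  # |beta_e| = |beta'| * M_t, since |M_beta| = 1/M_t
    return m_t, f_eff, beta_eff


# Hypothetical telephoto attachment: positive first lens, negative second lens.
print(afocal_attachment(50.0, 100.0, -50.0, 0.4))   # M_t = 0.5: focal length doubled
# The same pair used in reverse acts as a wide-angle attachment.
print(afocal_attachment(50.0, -50.0, 100.0, 0.4))   # M_t = 2.0: focal length halved
```

Only magnitudes of the field angles are tracked here; the sign conventions of Figure 6-19
are not modeled.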
6.8 RESOLUTION
6.8.1 Introduction
The resolution of an optical instrument represents its ability to resolve detail in an
object. Based on Gaussian optics, the image of a point object is also a point. All of the
rays emanating from a point object and transmitted by an imaging system pass through
the Gaussian image point. In reality, of course, the rays generally intersect the image
plane in the vicinity of the image point as a spot diagram due to the aberrations of the
system (see Chapter 9). Thus, the ability of a system to resolve objects is limited by its
aberrations. However, because of diffraction of the wave by the finite aperture stop (or,
equivalently, the exit pupil) of the system, a point image is not obtained even if the
aberrations are zero (otherwise, the irradiance at the image point would be infinity). Thus,
in practice, the resolution of a system is inherently limited by diffraction. It is only further
degraded by the aberrations. In this section, we briefly discuss the characteristics of the
diffraction image of a point object, and introduce the Rayleigh criterion of resolution. We
discuss the resolution of an eye, a microscope, and a telescope, assuming an aberration-
free system.
6.8.2 Airy Pattern

Let $\lambda$ be the wavelength of the object radiation and $F$ be the focal ratio of the
image-forming light cone. For a system with an exit pupil of diameter $D$ and a spherical
wavefront of radius of curvature $R$ converging to the Gaussian image point, $F = R/D$.
The irradiance distribution of the diffraction image of a point object, called the Airy
pattern, is given by [5]

$$I(r) = \left[\frac{2 J_1(\pi r)}{\pi r}\right]^2 , \tag{6-15}$$

where $J_1(\cdot)$ is the first-order Bessel function of the first kind, and $r$ is the radial distance
of a point from the Gaussian image point in units of $\lambda F$. The distribution is normalized
by the value $\pi P/(4\lambda^2 F^2)$ of the irradiance at the center $r = 0$, where $P$ is the total power.
This distribution is shown in Figure 6-20. It consists of a bright spot, called the Airy disc,
surrounded by dark and bright rings of decreasing irradiance.
Figure 6-20. 2D diffraction image of a point object, called the Airy pattern, with
83.8% of the total light in the central bright spot.
The irradiance distribution has a principal maximum at the center with a value of
unity because

$$\lim_{x \to 0} \left[\frac{2 J_1(x)}{x}\right] = 1 . \tag{6-16}$$

Its minima are zero at the positions given by the roots of

$$J_1(\pi r) = 0, \quad r \neq 0 . \tag{6-17}$$

Noting that

$$\frac{d}{dx}\left[\frac{J_1(x)}{x}\right] = -\frac{J_2(x)}{x} , \tag{6-18}$$

where $J_2(\cdot)$ is the second-order Bessel function of the first kind, the positions of the
secondary maxima are given by the roots of

$$J_2(\pi r) = 0, \quad r \neq 0 . \tag{6-19}$$

Integrating the Airy pattern over a circular area, we obtain the power contained in a
circle of radius $r_c$:

$$P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c) , \tag{6-20}$$

which is normalized by the total power $P$. Here, $J_0(\cdot)$ is the zeroth-order Bessel function
of the first kind. According to Eq. (6-17), because the dark rings (minima of zero
irradiance) correspond to $J_1(\pi r) = 0$, we note that the powers inside and outside an $m$th
dark ring of radius $r_m$ (in units of $\lambda F$) are given by

$$P_{in}(r_m) = 1 - J_0^2(\pi r_m) , \tag{6-21}$$

and

$$P_{out}(r_m) = J_0^2(\pi r_m) , \tag{6-22}$$

respectively.
The irradiance and encircled power distributions of the Airy pattern are shown in
Figure 6-21. The positions of several minima and maxima, and the relative irradiance and
encircled power corresponding to them, are given in Table 6-4. We note that the Airy
disc, or the first dark ring, has a radius of 1.22 (in units of $\lambda F$) and contains 83.8% of the
total power. The first bright ring, with inner and outer radii of 1.22 and 2.23, contains
7.2%; the second bright ring 2.8%; and the third bright ring 1.4% of the total power.
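
The quoted values can be checked numerically. The sketch below is an illustration only:
it evaluates Eqs. (6-15) and (6-20) with the Bessel functions computed from Bessel's
integral, $J_n(x) = (1/\pi)\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$, using the trapezoidal rule.
The function names and the number of integration steps are assumptions made for this example.

```python
import math


def bessel_j(n, x, steps=400):
    """Bessel function J_n(x) from Bessel's integral, by the trapezoidal rule."""
    h = math.pi / steps
    # End points tau = 0 and tau = pi carry half weight.
    total = 0.5 * (1.0 + math.cos(n * math.pi - x * math.sin(math.pi)))
    for k in range(1, steps):
        tau = k * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi


def airy_irradiance(r):
    """Eq. (6-15): irradiance at radius r (units of lambda*F), unity at r = 0."""
    x = math.pi * r
    if abs(x) < 1e-9:
        return 1.0
    return (2.0 * bessel_j(1, x) / x) ** 2


def encircled_power(rc):
    """Eq. (6-20): fraction of the total power inside a circle of radius rc."""
    x = math.pi * rc
    return 1.0 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2


for r in (1.22, 1.64, 2.23):
    print(r, round(airy_irradiance(r), 4), round(encircled_power(r), 3))
```

Under these assumptions, the encircled power at $r_c = 1.22$ comes out close to 0.838 and
the irradiance at $r = 1.64$ close to 0.0175, in agreement with Table 6-4.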
6.8.3 Rayleigh Criterion of Resolution

A measure of the imaging quality of a system is its ability to resolve closely spaced
objects. According to the Rayleigh criterion of resolution, two incoherent point objects of
equal intensity are just resolved if the principal maximum of the Airy pattern of one of
them falls on the first zero of the other. Thus, two incoherent object points are just
resolved if the linear separation between their Gaussian images is $1.22\lambda F$ or if their
angular separation is $1.22\lambda/D$. If the Gaussian images are located at $x = \pm 0.61$, the
irradiance distribution of the image of the two-point object along the $x$ axis is given by

$$I(x) = \left\{\frac{2 J_1[\pi(x - 0.61)]}{\pi(x - 0.61)}\right\}^2 + \left\{\frac{2 J_1[\pi(x + 0.61)]}{\pi(x + 0.61)}\right\}^2 , \tag{6-23}$$

where $x$ is in units of $\lambda F$. This distribution, which is symmetrical about $x = 0$, is shown
in Figure 6-22. We note that there is a dip in the irradiance at the center. The central value
is 0.73, compared to a maximum value of unity at $x = \pm 0.61$. The 2D image of the two-
point object is shown in Figure 6-23. If the spacing between the Gaussian images is much
smaller than $1.22\lambda F$, then the Airy patterns overlap to the extent that the two point
objects cannot be resolved.
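
The central dip of Eq. (6-23) can be reproduced with a short calculation. The following
Python sketch is illustrative only; the function names are assumptions, and $J_1$ is
evaluated from Bessel's integral by the trapezoidal rule rather than from a library routine.

```python
import math


def bessel_j1(x, steps=400):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x sin tau) dtau."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, steps):
        tau = k * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def airy(r):
    """Airy pattern of Eq. (6-15); r in units of lambda*F, even in r."""
    x = math.pi * r
    if abs(x) < 1e-9:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2


def two_point_irradiance(x, half_separation=0.61):
    """Eq. (6-23): incoherent sum of two Airy patterns centered at +/- half_separation."""
    return airy(x - half_separation) + airy(x + half_separation)


print(round(two_point_irradiance(0.0), 2))    # central dip
print(round(two_point_irradiance(0.61), 2))   # value at one Gaussian image point
```

Changing `half_separation` shows how the dip deepens for wider spacings and disappears
as the two Gaussian images approach each other.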
[Plot of $I(r)$ and $P(r_c)$ versus $r$, $r_c$ over the range 0 to 3.]
Figure 6-21. Irradiance and encircled power distributions of the Airy pattern.
Table 6-4. Irradiance and encircled power corresponding to the maxima and
minima of the Airy pattern. The irradiance is normalized by its central value
$\pi P/(4\lambda^2 F^2)$, and the encircled power is normalized by the total power $P$. The units of
$r$ and $r_c$ are $\lambda F$.

Max/Min    r, r_c    I(r)      P(r_c)
Max        0         1         0
Min        1.22      0         0.838
Max        1.64      0.0175    0.867
Min        2.23      0         0.910
Max        2.68      0.0042    0.922
Min        3.24      0         0.938
Max        3.70      0.0016    0.944
Min        4.24      0         0.952
Max        4.71      0.0008    0.957

[Plot of $I(x)$ versus $x$ over the range $-4$ to 4.]
Figure 6-22. Irradiance distribution along the x axis of the image of two incoherent
point objects of equal intensity separated by the Rayleigh resolution of $1.22\lambda F$. The
central value is 0.73.
Figure 6-23. 2D image of two incoherent point objects of equal intensity separated
by the Rayleigh resolution of $1.22\lambda F$.
6.8.4 Resolution of an Imaging System

Figure 6-24. Imaging of closely spaced objects.

Consider a system imaging two closely spaced point objects $P_0$ and $P$, as illustrated
in Figure 6-24. Let the refractive indices of the object and image spaces be $n$ and $n'$,
respectively. If $\lambda_0$ is the wavelength of object radiation in vacuum, its value in the object
and image spaces is given by $\lambda = \lambda_0/n$ and $\lambda' = \lambda_0/n'$, respectively. The object lies at a
distance $L_o$ from the entrance pupil of the system. Its image lies at a distance $L_i$ from the
exit pupil, which has a diameter of $D_{ex}$. We assume that the system is aberration free so
that the resolution is limited only by diffraction of light at the exit pupil. Accordingly, the
image of each point object is an Airy disc of radius $1.22\lambda' F$, where $F = L_i/D_{ex}$ is the
focal ratio of the image-forming light cone. The two point objects are just resolved if the
spacing $h'$ between their Gaussian images $P_0'$ and $P'$, i.e., between the centers of their
Airy discs, is equal to $1.22\lambda' F$. Thus, we may write

$$\begin{aligned}
h' &= 1.22\,\lambda' F && \text{(6-24a)} \\
   &= 1.22\,\lambda' (L_i/D_{ex}) && \text{(6-24b)} \\
   &= 0.61\,\lambda'/\alpha' , && \text{(6-24c)}
\end{aligned}$$

where $\alpha' = D_{ex}/(2L_i)$ is the semiangular aperture of the exit pupil as seen from the image.
In practice, $L_i \gg D_{ex}/2$; therefore, we have let $\tan\alpha' = \sin\alpha' = \alpha'$. From the sine
condition (see Reference 1 in Chapter 1), the magnification of the Gaussian image is
given by

$$\frac{h'}{h} = \frac{n \sin\alpha}{n' \sin\alpha'} \sim \frac{n \sin\alpha}{n' \alpha'} , \tag{6-25}$$

where $\alpha$ is the semiangular aperture of the entrance pupil as seen from the object. In
observations with microscopes, the value of the angle $\alpha$ can be quite large. The quantity
$n \sin\alpha$ is called the numerical aperture of the imaging system. According to Eq. (5-33),
it determines the flux entering the system. From Eqs. (6-24c) and (6-25), we obtain the
minimum distance $h$ between resolved object points $P_0$ and $P$:

$$\begin{aligned}
h &\sim \frac{n' h' \alpha'}{n \sin\alpha} && \text{(6-26a)} \\
  &= \frac{0.61\, n' \lambda'}{n \sin\alpha} && \text{(6-26b)} \\
  &= \frac{0.61\, \lambda_0}{n \sin\alpha} . && \text{(6-26c)}
\end{aligned}$$

We note that the numerical aperture also determines the resolution of a system. The larger
its value is, the smaller the distance between two resolvable points, i.e., the better the
resolution.
From Eq. (6-26c), the angular resolution in the object space is given by

$$\beta = \frac{h}{L_o} = \frac{0.61\, \lambda_0}{n L_o \sin\alpha} . \tag{6-27}$$

For small values of $\alpha$, as in the case of telescopes, we may write

$$\sin\alpha \sim \alpha = D_{en}/(2L_o) , \tag{6-28}$$

where $D_{en}$ is the diameter of the entrance pupil of the system. In such cases, Eq. (6-27)
reduces to

$$\beta = 1.22\, \lambda_0 / D_{en} . \tag{6-29}$$

Similarly, from Eq. (6-24b), the angular resolution in the image space is given by

$$\begin{aligned}
\beta' &= h'/L_i && \text{(6-30a)} \\
       &= 1.22\, \lambda' / D_{ex} . && \text{(6-30b)}
\end{aligned}$$

When $\beta$ and $\beta'$ are large, they are replaced by their tangents.
For visual observations, the eye is placed at the exit pupil of the system, and the final
image is formed on its retina (indeed, this is the origin of the names for the entrance and
exit pupils). Otherwise, the apparent field of view is reduced. Moreover, the diameter of
the exit pupil is chosen to be equal to that of the eye’s pupil. If it is larger, then any light
outside of the eye’s pupil is wasted. If it is smaller, then it determines the diameter of the
Airy disc formed on the retina, which increases, thereby degrading the resolution. Of
course, the diameter of the exit pupil does not affect the distance between the Gaussian
images of the two object points, which depends on the magnification of the imaging
system. The magnification of a system when the diameters of the exit pupil and the eye's pupil
are equal is called its normal magnification. Choosing such a magnification is referred to
as pupil matching. Any magnification in excess of the normal magnification is called the
empty magnification because it does not improve the apparent resolution. However, it still
has merit in that positioning of the eye is eased and fatigue is reduced.
6.8.5 Resolution of the Eye

Consider an eye looking at two point objects $P_0$ and $P$ lying in a plane at a distance
$L_o$, as illustrated in Figure 6-25. Although the diameter of the pupil of the eye varies with
the level of illumination, it is customary to assume a value of 3 mm for normal
observations. For simplicity, we assume that the eye is a thin lens forming images on the
retina at a distance of 25 mm, which is approximately the diameter of an adult eyeball.
The image space consists of the vitreous humor with a refractive index of about 1.33. We
assume an object wavelength of 0.55 μm, to which the eye is most sensitive. The
numerical aperture of the eye is given by

$$NA_{eye} = n \sin\alpha \sim \alpha = \frac{D_{eye}}{2 L_o} = \frac{3\ \text{mm}}{2 \times 25\ \text{cm}} = 0.006 , \tag{6-31}$$

where we have let $n = 1$ for objects in air and $L_o = 25$ cm as the distance of most distinct
vision. The angle $\alpha$ represents the angle of the cone of light from a point object entering
the eye. From Eqs. (6-26c) and (6-27), the linear and angular resolutions of the eye in the
object space are given by

$$h = \frac{0.61\, \lambda_0}{NA_{eye}} = 101\, \lambda_0 = 55\ \mu\text{m} \tag{6-32}$$

and

$$\beta = \frac{h}{L_o} = \frac{55\ \mu\text{m}}{25\ \text{cm}} = 0.22\ \text{mrad} = 0.76' . \tag{6-33}$$

Figure 6-25. Schematic of imaging of closely spaced objects by a human eye.

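From Eqs. (6-24c) and (6-30a), the corresponding quantities in the image space are given
by

$$h' = \frac{0.61\, \lambda'}{\alpha'} = \frac{0.61\, \lambda_0}{n' \alpha'} = \frac{0.61\, \lambda_0}{1.33\,(1.5/25)} = 7.6\, \lambda_0 = 4.2\ \mu\text{m} \tag{6-34}$$

and

$$\beta' = \frac{h'}{L_i} = \frac{4.2\ \mu\text{m}}{25\ \text{mm}} = 0.17\ \text{mrad} = 0.57' . \tag{6-35}$$
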
If the pupil of the eye is reduced in size, the angle $\alpha$ and, therefore, the numerical
aperture of the eye decreases, and the Airy disc becomes larger, thus degrading the
resolution. If the diameter is increased, the diffraction-limited resolution increases, but
the aberrations of the eye degrade it.
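
The arithmetic of Eqs. (6-31) through (6-35) can be retraced with a few lines of Python.
This is a sketch under the thin-lens eye model stated above; the function name, the
argument names, and the returned dictionary keys are assumptions made for illustration.

```python
import math


def eye_resolution(wavelength_um=0.55, pupil_mm=3.0, object_distance_mm=250.0,
                   image_distance_mm=25.0, n_object=1.0, n_image=1.33):
    """Diffraction-limited resolution of the thin-lens eye model."""
    # Eq. (6-31): numerical aperture in the small-angle approximation.
    na_eye = n_object * pupil_mm / (2.0 * object_distance_mm)
    # Eq. (6-32): linear resolution in the object space, in micrometers.
    h_um = 0.61 * wavelength_um / na_eye
    # Eq. (6-33): angular resolution in the object space, in radians.
    beta_rad = (h_um * 1e-3) / object_distance_mm
    # Eq. (6-34): linear resolution on the retina, in micrometers.
    alpha_image = pupil_mm / (2.0 * image_distance_mm)
    h_image_um = 0.61 * wavelength_um / (n_image * alpha_image)
    # Eq. (6-35): angular resolution in the image space, in radians.
    beta_image_rad = (h_image_um * 1e-3) / image_distance_mm
    return {
        "na_eye": na_eye,
        "h_um": h_um,
        "beta_mrad": beta_rad * 1e3,
        "beta_arcmin": math.degrees(beta_rad) * 60.0,
        "h_image_um": h_image_um,
        "beta_image_mrad": beta_image_rad * 1e3,
        "beta_image_arcmin": math.degrees(beta_image_rad) * 60.0,
    }


for name, value in eye_resolution().items():
    print(f"{name}: {value:.4g}")
```

The unrounded results differ slightly from the quoted ones because the text rounds
intermediate values (for example, 55 μm for the linear resolution).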
6.8.6 Resolution of a Microscope

We have seen that the eye can resolve objects in air that are separated by about
55 μm when observed at a distance of 25 cm. When the objects are closer together, a
microscope is used to observe them. The objective of a microscope is its aperture stop,
and its image by the eyepiece is its exit pupil. All of the light entering the objective and
refracted by the eyepiece passes through the exit pupil. The pupil of the eye is placed at
the exit pupil; otherwise, the apparent field of view is reduced.
Consider a microscope imaging two point objects such that their images are just
resolved by the eye. Under normal magnification, the eye can resolve objects whose
images by the microscope subtend an angle of $1.22\,\lambda_0/D_{eye}$. The magnification of a
microscope is given by Eq. (6-5), where $M_t$ is the transverse magnification of the
objective, and $M_\beta = 25\ \text{cm}/f_e'$ is the angular magnification of the eyepiece. The sine
condition applied to the objective yields (see Figure 6-13)

$$M_t = \frac{n \sin\alpha}{\alpha'} = \frac{2 n \sin\alpha}{D_{ex}/f_e} , \tag{6-36}$$

where

$$\alpha' = \frac{D_{ex}}{2 f_e} , \tag{6-37}$$

and $D_{ex}$ is the diameter of the exit pupil assumed to be equal to that of the eye. Thus, the
magnification of the microscope may be written

$$M = \frac{2 n \sin\alpha}{D_{ex}/25\ \text{cm}} = \frac{NA_{obj}}{NA_{eye}} , \tag{6-38}$$

where $NA_{obj} = n \sin\alpha$ is the numerical aperture of the objective. With a dry objective
($n = 1$), a practical limit for its numerical aperture is about 0.95. However, by filling the
space between the object and the objective with a liquid of refractive index matching that
of the objective, the numerical aperture can be increased to about 1.6. An oil-immersion
objective increases the numerical aperture compared to that of the eye and thereby
improves the resolution by a factor of 1.6/0.006 = 267. The linear resolution given by Eq.
(6-26c) changes from 55 μm to

$$h = \frac{0.61 \times 0.55\ \mu\text{m}}{1.6} = 0.21\ \mu\text{m} . \tag{6-39}$$
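
A brief numerical sketch of Eqs. (6-38) and (6-39) follows. The function names and default
arguments are illustrative assumptions; the numerical apertures are those quoted above.

```python
def normal_magnification(na_objective, eye_pupil_mm=3.0, near_point_mm=250.0):
    """Eq. (6-38): ratio of the numerical apertures of the objective and the eye."""
    na_eye = eye_pupil_mm / (2.0 * near_point_mm)   # 0.006 for a 3-mm pupil at 25 cm
    return na_objective / na_eye


def linear_resolution_um(na_objective, wavelength_um=0.55):
    """Eq. (6-26c): smallest resolvable object separation, in micrometers."""
    return 0.61 * wavelength_um / na_objective


for label, na in (("dry objective", 0.95), ("oil immersion", 1.6)):
    print(label, round(normal_magnification(na)), round(linear_resolution_um(na), 2))
```

With a numerical aperture of 1.6, this reproduces the factor of about 267 and the
resolution of about 0.21 μm; the dry-objective row is added only for comparison.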
6.8.7 Resolution of a Telescope

A microscope is used for viewing nearby objects. Thus, the quantity of interest from
the standpoint of resolution is the minimum linear distance between two point objects in a
plane that can be resolved. This is expressed as a ratio of the numerical apertures of the
microscope and the eye. A telescope is used to view distant objects (such as stars and the
moon), and the quantity of interest is accordingly the minimum angle between two point
objects that can be resolved. The normal magnification is expressed in terms of the
diameters of the objective of the telescope and the eye’s pupil. The angular resolution of a
system is given by Eq. (6-29). Under normal magnification of a telescope, the diameter of
its exit pupil is equal to that of the eye’s pupil. Thus, the improvement in visual resolution
is given by the ratio of the diameter of the objective to that of the eye’s pupil [see Eq. (6-13)]:

$$M_\beta = \frac{D_{obj}}{D_{eye}} . \tag{6-40}$$

When $D_{ex} = D_{eye}$, the size of the Airy disc obtained with the telescope is the same as
that obtained without it, but the angular separation of the two objects as observed on the
retina is increased by $M_\beta$. The diameter of the exit pupil is given by Eq. (6-11). As in the
case of a microscope, if the exit pupil of the telescope is larger than the eye's pupil, then
the amount of light outside the eye’s pupil is wasted. If it is smaller, then the Airy disc on
the retina becomes larger, thereby degrading the resolution. The large objective of the
telescope not only improves the visual resolution, but also increases the total flux in the
image. Thus, it helps to see dim objects as well. It is common knowledge that stars are
too dim compared with the sky to be observed in the daytime with a naked eye. However,
they can be observed with the aid of a telescope. As explained below, this is due to an
increase in the star intensity on the retina when a telescope is used.
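
As an illustration of Eqs. (6-29) and (6-40), the sketch below compares the angular
resolution of the unaided eye with that of a telescope used at its normal magnification.
The 150-mm objective is a hypothetical value, and the function names are assumptions
made for this example.

```python
def normal_magnification(d_objective_mm, d_eye_mm=3.0):
    """Eq. (6-40): ratio of the objective diameter to the eye's pupil diameter."""
    return d_objective_mm / d_eye_mm


def angular_resolution_rad(wavelength_um, aperture_mm):
    """Eq. (6-29): diffraction-limited angular resolution 1.22*lambda/D, in radians."""
    return 1.22 * (wavelength_um * 1e-3) / aperture_mm   # wavelength converted to mm


d_objective = 150.0   # hypothetical objective diameter in mm
beta_eye = angular_resolution_rad(0.55, 3.0)
beta_telescope = angular_resolution_rad(0.55, d_objective)
print(normal_magnification(d_objective))   # 50.0
print(beta_eye / beta_telescope)           # improvement in resolution, about 50
```

The ratio of the two angular resolutions equals the normal magnification, as Eq. (6-40)
states.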
Consider the observation of a distant extended object of area $S_o$ and luminance $L$
lying at a distance $R$, as in Figure 6-26a. The flux entering an eye with a pupil of area $S_e$
is given by

$$F = L S_o \left(S_e / R^2\right) , \tag{6-41}$$

where $S_e/R^2$ is the solid angle subtended by the pupil at the object. This flux is spread
on the retina over an area $S_i$, given by (see Figure 6-26a)

$$S_i = \left(\frac{R'}{n' R}\right)^2 S_o . \tag{6-42a}$$

Therefore, the illuminance of the retinal image is given by

$$\frac{F}{S_i} = L S_e \left(\frac{n'}{R'}\right)^2 . \tag{6-42b}$$

If the same extended object is observed with the aid of a telescope, as illustrated in
Figure 6-26b, then the eye sees the image formed by the telescope. Let $D_1$ and $D_2$ be the
diameters of the objective and the eyepiece, respectively. If the aperture stop of the
telescope lies at the objective, then its image by the eyepiece is its exit pupil. For a
confocal telescope with its objective and eyepiece of focal lengths $f_1'$ and $f_2'$,
respectively, the exit pupil lies at a distance $f_2'(f_1' + f_2')/f_1'$ from the eyepiece with a
diameter $D_{ex} = D_2$. The images are observed by placing the eye at the exit pupil or as
close to it as possible.

Figure 6-26. Daytime observation of a star against a sky background (a) without and
(b) with the aid of a telescope. The lenses $L_1$ and $L_2$, called the objective and the
eyepiece, respectively, with a spacing equal to the sum of their focal lengths,
constitute the afocal telescope.

Because the luminance of the telescopic image, according to Eq. (5-42), is (at most)
equal to the object luminance, it is evident that the retinal illuminance will be the same as
in the case of an unaided observation, provided the diameter of the exit pupil of the
telescope is greater than or equal to the diameter of the eye pupil. If the diameter of the
exit pupil is smaller, then $S_e$ is replaced by the area $S_{ex}$ of the exit pupil, and the retinal
illuminance is reduced by a factor $S_{ex}/S_e$. Thus, the retinal illuminance of the image of a
distant extended object, such as the sky, observed with the aid of a telescope, is less than
or equal to the corresponding illuminance obtained when the object is observed with the
naked eye.
Now consider the observation of a point object, such as a star. The apparent intensity
of the star depends on the amount of light in its retinal image. The ratio of the amount of
light in this image with and without a telescope is given by $(D_1/D_e)^2$, provided $D_2 \le D_e$.
If $D_2 > D_e$, then a fraction $(D_e/D_2)^2$ of the total light received by the telescope enters
the eye. Thus, in this case, the ratio of the amount of light in the retinal image with and
without the telescope is given by $(D_1/D_2)^2$. In either case, this ratio is greater than 1.
Thus, the intensity of the star image on the retina is increased by using the telescope.
Because the illuminance of the sky background on the retina either stays the same or
decreases with the use of the telescope, and the intensity of the star image on the retina
increases with its use, the star visibility or the signal-to-noise ratio increases.
Accordingly, it is possible to observe bright stars in the daytime by using a telescope for
which $D_1 > D_e$.
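
The two cases above can be combined into a short calculation. The sketch below is
illustrative: the function names are assumptions, $D_2$ is treated as the exit-pupil
diameter as in the text, and the diameters in the usage lines are hypothetical.

```python
def star_light_gain(d1, d2, d_eye):
    """Ratio of the light in the retinal star image with and without the telescope.

    d1    : diameter D1 of the objective
    d2    : diameter D2 of the exit pupil
    d_eye : diameter De of the eye's pupil
    """
    if d2 <= d_eye:
        return (d1 / d_eye) ** 2   # all of the collected light enters the eye
    return (d1 / d2) ** 2          # only the fraction (De/D2)^2 enters the eye


def sky_illuminance_factor(d2, d_eye):
    """Ratio of the retinal illuminance of the sky with and without the telescope."""
    return min(1.0, (d2 / d_eye) ** 2)   # S_ex/S_e when the exit pupil is smaller


# Hypothetical diameters in mm: 60-mm objective, 2-mm exit pupil, 3-mm eye pupil.
gain = star_light_gain(60.0, 2.0, 3.0)
sky = sky_illuminance_factor(2.0, 3.0)
print(gain, sky, gain / sky)   # star gain, sky factor, improvement in visibility
```

The last printed number, the star gain divided by the sky factor, is one simple way to
express the increase in star visibility described in the text.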
6.9 PINHOLE CAMERA

A pinhole camera (or camera obscura) has been the subject of many investigations,
including those by Petzval and Rayleigh [6]. Not only is it simple (a pinhole on one side
of a box and a photographic plate on the other), but it is also distortion free with an
infinite depth of field and a very wide field of view. Based on geometrical optics, the
image of a distant point object in the absence of a lens will be approximately the same
size as the pinhole if the pinhole is large. Reducing the pinhole size reduces the image
size until diffraction by the pinhole spreads it. To determine the optimum size of the
pinhole, we proceed as follows.
The difference between a pinhole camera and a regular camera is that in the former
there is no lens to form the image. In a regular camera, a lens converts a diverging
spherical wave from a point object $P_0$ into a spherical wave converging to an image point
$P_0'$ in the image plane. In a pinhole camera, a spherical wave of radius of curvature $L_o$
diverging from the object $P_0$ is incident on the pinhole and continues as a diverging wave
toward the image plane, as illustrated in Figure 6-27a, where $L_o$ is the object distance
from the pinhole. For a perfect image, a converging spherical wave of radius of curvature
$L_i$ (illustrated by the dashed wavefronts) should emerge from the pinhole converging to
the image point $P_0'$, where $L_i$ is the image distance from the pinhole. Accordingly, the
defocus wave aberration at a distance $r$ from the center of the pinhole is given by the sum
of the sags of two spherical wavefronts passing through the center of the pinhole with
their centers of curvature lying at the object and image points. This is illustrated in
Figure 6-27b as AB + BC = AC. Thus, the wave aberration may be written

$$W(r) = \frac{1}{2}\left(\frac{1}{L_i} - \frac{1}{L_o}\right) r^2 , \tag{6-43}$$

where the object distance $L_o$ is numerically negative. The image will be practically
diffraction limited according to the Rayleigh criterion [5], if the peak value of the
aberration is less than or equal to $\lambda/4$. Thus, we may write

$$\frac{1}{L_i} - \frac{1}{L_o} = \frac{\lambda}{2a^2} = \frac{1}{f_e} , \tag{6-44}$$

where $f_e$ is the effective focal length of the pinhole. For a distant object ($L_o \to \infty$), the
radius of the pinhole is simply given by

$$a = \left(L_i \lambda / 2\right)^{1/2} . \tag{6-45}$$

The image spot for a point object is approximately the Airy disc with a radius of
$0.61\,\lambda L_i / a$.
Figure 6-27. (a) Imaging by a pinhole camera of radius $a$. (b) Wavefront incident on
the pinhole, and emerging wavefront required for perfect imaging. The pinhole size
is extremely exaggerated for clarity of the wavefronts. The camera length $L_i \gg a$.
(c) Distortion-free imaging.
A pinhole camera suffers from chromatic aberration because its focal length depends
on the wavelength. Similarly, because the pinhole appears to be elliptical from an off-axis
point object, its focal length for an object in the horizontal plane differs from that in a
vertical plane. Thus, it suffers from astigmatism. However, it is free of distortion, i.e., the
transverse magnification of an image is independent of the field angle. We note that the
chief ray (i.e., an object ray incident through the center of the hole) reaches the image
plane without any deviation, as illustrated in Figure 6-27c. The magnification of the
image is given by

$$M = h'/h = L_i/L_o . \tag{6-46}$$

The main disadvantage of a pinhole camera is the long exposure it requires due to the
small size of the pinhole.
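
To give a feeling for the numbers, the following sketch evaluates Eq. (6-45) and the Airy
disc radius $0.61\,\lambda L_i/a$ for a distant object. The camera length of 100 mm and the
function names are assumptions chosen only for illustration.

```python
import math


def pinhole_radius(camera_length, wavelength):
    """Eq. (6-45): pinhole radius giving a quarter-wave peak defocus aberration."""
    return math.sqrt(camera_length * wavelength / 2.0)


def airy_disc_radius(camera_length, wavelength, radius):
    """Approximate radius of the image spot of a point object, 0.61*lambda*Li/a."""
    return 0.61 * wavelength * camera_length / radius


L_i = 0.100            # hypothetical camera length in meters
wavelength = 0.55e-6   # wavelength in meters
a = pinhole_radius(L_i, wavelength)
spot = airy_disc_radius(L_i, wavelength, a)
print(f"pinhole radius = {a * 1e3:.3f} mm")
print(f"image spot radius = {spot * 1e3:.3f} mm")
print(f"spot radius / pinhole radius = {spot / a:.2f}")
```

Substituting Eq. (6-45) into $0.61\,\lambda L_i/a$ gives $1.22\,a$, so under these assumptions
the image spot radius is 1.22 times the pinhole radius for any camera length.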

6.10.1 Eye
The eye is an interesting optical instrument. The cornea, which is approximately 0.5
mm thick, provides about two-thirds of the nearly 60-D power. The crystalline lens is
about 4 mm thick and 9 mm in diameter. It changes shape to provide fine focusing to
keep objects at varying distance in focus on the retina. The whole eye can be
approximated by a single surface of radius of curvature 5.55 mm with air on one side and
humor of index 1.33 on the other. The iris, which gives the eye its color, acts as a variable
aperture stop.
A nearsighted (myopic) eye sees nearby objects well, but the images of objects
beyond a certain point (called the far point) are blurry. A negative lens with a focal length
equal to the distance of this point is used to see distant objects well. Similarly, a farsighted
(hyperopic) eye sees distant objects well with accommodation, but the images of nearby
objects within a certain point (called the near point) are blurry. A positive lens of focal
length such that it forms the image of this point at a comfortable reading distance of 25
cm is used to see nearby objects well.
A visual acuity of 20/20 implies that letters that subtend 5 arc min at the eye at a
distance of 20 ft can be read. Accordingly, a visual acuity of 20/60, for example, implies
that what a normal eye can resolve at 60 ft is being resolved at 20 ft.
Astigmatism of an eye results from an uneven curvature of its cornea. It is corrected
by using a cylindrical lens.
6.10.2 Magnifier
The apparent size of an object as seen with an unaided eye increases as the object is
brought closer, until the eye reaches the limit of its accommodation at about 25 cm. The
object can be brought closer if it is observed through a magnifier of a short focal length
$f'$. If the object is placed in the front focal plane of the magnifying lens, the
magnification of the image seen with the magnifier, relative to that seen without it, is
given by $25/f'$, where $f'$ is in cm.
6.10.3 Microscope
In a compound microscope, the objective of very short focal length forms a real
magnified image of the object (see Figure 6-13), which, in turn, is magnified as a virtual
image by the eyepiece. The magnification of the retinal image with and without the
microscope is equal to the product of the transverse magnification of the objective and the
angular magnification of the eyepiece. If the virtual image lies at a distance of 25 cm,
then the microscope magnification is also given by $25/f'$, where $f'$ is the effective focal
length of the microscope in cm.
6.10.4 Telescope
A telescope is an afocal system used to view distant objects (practically at infinity).
In a Keplerian telescope, both the objective and the eyepiece are positive lenses; the
aperture stop lies at the objective, and the field stop at the eyepiece. The pupil
magnification is given by $-f_e'/f_o'$ and the telescope magnification by $-f_o'/f_e'$, where $f_o'$
and $f_e'$ are the focal lengths of the objective and the eyepiece, respectively. In a Galilean
telescope, used to watch opera and sports, the eyepiece is negative. The objective is the
field stop, and the observing eye is the aperture stop.
6.10.5 Resolution
The resolution of an optical imaging system is its ability to discern the details of an
object. According to the Rayleigh criterion of resolution, two incoherent point objects of
equal intensity are just resolved if their separation is given by

$$h = \frac{0.61\, \lambda_0}{n \sin\alpha} , \tag{6-47}$$

where $\lambda_0$ is the wavelength of the object radiation in vacuum, $n$ is the refractive index of
the object space, and $\alpha$ is the semiangular aperture of the entrance pupil of the system as
seen from the object. The quantity $n \sin\alpha$ is called the numerical aperture of the system.
The numerical aperture of the eye with a pupil of diameter 3 mm observing an object
at its near point at a distance of 25 cm is 0.006. Thus, the resolution of the eye at a
wavelength of 0.55 μm is 55 μm. The numerical aperture of a microscope with a dry
objective is at most 0.95. Its resolution is accordingly about 158 times better (0.95/0.006).
If an oil of refractive index $n$ is used between the object and the objective, then the
resolution is improved by a factor $n$. In the case of a telescope, the resolution is improved,
compared to that of the eye, by the ratio of the diameters of the objective and the eye's pupil. We
have assumed diffraction-limited resolution in these calculations. In practice, it will be
degraded by the aberrations of the imaging system.
6.10.6 Pinhole Camera

A pinhole camera is a lensless camera with distortion-free imaging, infinite depth of
field, and a very wide field of view. The radius $a$ of the pinhole and the camera length $L_i$
(i.e., the distance between the pinhole and the film) are related to each other according to
\[ a = \left( L_i \lambda / 2 \right)^{1/2} , \tag{6-48} \]
where $\lambda$ is the mean wavelength of the object radiation. The disadvantage of the camera is the
long exposure time required by its small aperture.
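As an illustration of Eq. (6-48), the Python sketch below computes the pinhole radius and the resulting focal ratio for an assumed camera length of 20 cm (a value chosen only for this example). The large focal ratio shows why the exposure time is long.

```python
import math

def pinhole_radius(camera_length, wavelength):
    """Pinhole radius a = sqrt(L_i * lambda / 2), Eq. (6-48); one length unit throughout."""
    return math.sqrt(camera_length * wavelength / 2.0)

L_i = 200.0      # camera length in mm (illustrative assumption)
lam = 0.55e-3    # mean wavelength in mm (0.55 micrometers)

a = pinhole_radius(L_i, lam)
focal_ratio = L_i / (2.0 * a)   # camera length divided by pinhole diameter

print(f"pinhole radius = {a:.4f} mm")
print(f"focal ratio    = {focal_ratio:.0f}")
```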
REFERENCES
1. W. N. Charman, “Optics of the eye,” in Handbook of Optics, Vol. III, Chapter 1,
M. Bass, Ed., 3rd ed., McGraw-Hill, New York (2010).
2. G. A. Fry, “The eye and vision,” in Applied Optics and Optical Engineering, Vol.
II, Chapter 1, R. Kingslake, Ed., Academic Press, San Diego, CA (1965).
3. H. Eggers, “Estimation of uncorrected visual acuity in malingerers,” Arch.
Ophthalmol. 33, 23–27 (1945).
4. I. B. Lueck, “Spectacle lenses,” in Applied Optics and Optical Engineering, Vol.
III, Chapter 6, R. Kingslake, Ed., Academic Press, San Diego, CA (1965).
6. M. Young, “Pinhole optics,” Appl. Opt. 10, 2763–2767 (1971).

PROBLEMS
6.1 A person with a 15-cm-wide face looks into the eyes of another person. Determine
the location and size of the image of the first person formed by the cornea of the
other person assuming a 15-cm gap between the two people.
6.2 Show from the data in Table 6-1 that each model of the eye provides the same total
focusing power of 60 D. Also show that the cornea of the schematic and simplified
schematic eyes provides nearly 43 D of focusing power.
6.3 The far point of a nearsighted eye lies at 1 m from it. Determine the power of a
correction spectacle lens worn at 15 mm from the eye. What is the power of a
contact correction lens?
6.4 The headlights of a car are approximately 1.5 m apart. Determine their separation
on the retina when the car is 30 m away.
6.5 A person can see himself clearly with relaxed eyes, standing 1 m from a mirror.
With accommodation, he can see well when only 15 cm away. (a) Determine the
prescription required to see distant objects clearly. (b) How far is his near point
when wearing glasses? (c) How close can he be to the mirror to see himself clearly
when wearing glasses? (d) How far, at the most, can he be from a concave mirror
of radius of curvature 1 m to see the image of a picture hanging 60 cm from the
mirror?
6.6 A person with a near point 25 cm away from her eyes wants to view an object with
a magnifying glass that is marked 6×. She holds the magnifying glass close to her
eyes. Determine the range of object distance to view the magnified image without
undue strain on the eyes.
6.7 A Ramsden eyepiece consists of two positive lenses of equal focal length $f'$
spaced $(2/3) f'$ apart. Determine the location of the image formed by a microscope
or telescope objective for relaxed-eye viewing through the eyepiece.
6.8 Determine the optimum diameter of the pinhole of a pinhole camera where the
photographic plate lies at a distance of 10 cm from it. Let the mean wavelength of
object radiation be 0.55 μm. Determine the size of the image of a 6-inch-high
object placed 6 ft from the camera.
CHAPTER 7
CHROMATIC ABERRATIONS
7.1 Introduction
7.2 Refracting Surface
7.3 Thin Lens
7.4 Plane-Parallel Plate
7.5 General System
7.6 Doublet
7.6.1 Lenses of Different Materials
7.6.2 Lenses of the Same Material
7.6.3 Doublet with Two Separated Components
7.6.4 Thin-Lens Doublet
7.7.1 General System
7.7.2 Thin Lens
7.7.3 Plane-Parallel Plate
7.7.4 Doublet
References
Problems
7.1 INTRODUCTION
So far, we have discussed the imaging relations without explicitly stating the
wavelength of the object radiation. Because the refractive index of a transparent
substance decreases with increasing wavelength, a thin lens, for example, made of such a
substance will have a shorter focal length for a shorter wavelength. Consequently, an
axial point object emitting white light will be imaged at different distances along the
axis depending on the wavelength, i.e., the image will not be a “white” point. Similarly,
the height of the image of an off-axis point object will vary with the wavelength, resulting
in different sizes of the image of a multiwavelength object. The axial and transverse
extents of the image of a multiwavelength point object are called longitudinal and
transverse chromatic aberrations, respectively. They describe a chromatic change in the
position and magnification of the image, which are discussed in this chapter. The
longitudinal chromatic aberration is also called the axial color.
There is ambiguity about the definition of the chromatic change in magnification. As
a differential of the image height, it represents the difference in image heights of the chief
rays of two colors in their respective Gaussian image planes. From a practical standpoint,
the quantity of interest is the difference of image heights in a given image plane. The
latter is referred to as the lateral color. We define a system as being achromatic if both
the axial and lateral colors are zero.
We start this chapter with a discussion of the chromatic aberrations of a single
refracting surface and apply the results to obtain the chromatic aberrations of a thin lens,
a doublet, and finally, a general system consisting of a series of refracting surfaces. The
chromatic aberrations of a plane-parallel plate are considered as an example of the
general theory. Doublets consisting of two thin lenses, either separated or in contact, that
are designed so that the focal length is achromatic are also discussed. Numerical examples
are given to illustrate the concepts.
7.2 REFRACTING SURFACE

First we consider, as illustrated in Figure 7-1, the chromatic aberrations of a single
refracting surface of vertex radius of curvature R separating media of refractive indices n
and n ′. The distance S ′ and height h ′ of the image P ′ of a point object P lying at a
distance S and a height h are given by the relations [see Eqs. (2-4) and (2-12a)]
\[ \frac{n'}{S'} - \frac{n}{S} = \frac{n' - n}{R} \tag{7-1} \]
and
\[ M \equiv \frac{h'}{h} = \frac{n S'}{n' S} , \tag{7-2} \]
Figure 7-1. Chromatic aberrations of a refracting surface RS. UR, MR, and CR are
the undeviated, marginal, and chief rays. (a) On-axis imaging. (b) Off-axis imaging.
The subscripts $b$ and $r$ denote blue and red light. The axial color $\delta S' = S_b' - S_r'$,
where $S_b'$ and $S_r'$ are the distances of the blue and red images. The transverse
chromatic aberration $\delta h' = h_b' - h_r'$, where $h_b'$ and $h_r'$ are the image heights in the
blue and red Gaussian image planes, respectively. The lateral color $\delta h_c'$ represents
the height difference of the blue and red chief rays in a given image plane.
where M is the transverse magnification of the image. Let δ represent a small change in a
certain quantity corresponding to a small change in the wavelength, or, equivalently, a
small change in the refractive index. Because the object distance S is independent of the
wavelength, by differentiating both sides of Eq. (7-1), we obtain
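\[ \frac{\delta n'}{S'} - \frac{n'}{S'^2}\,\delta S' - \frac{\delta n}{S} = \frac{\delta n' - \delta n}{R} . \tag{7-3} \]
Substituting for $S$ from Eq. (7-1), we find that
\[ \frac{\delta S'}{S'} = \left( \frac{\delta n}{n} - \frac{\delta n'}{n'} \right) \left( \frac{S'}{R} - 1 \right) . \tag{7-4} \]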
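Similarly, because the object height $h$ is independent of the wavelength, by differentiating
both sides of Eq. (7-2), we obtain (see Figure 7-1b)
\begin{align}
\frac{\delta M}{M} &= \frac{\delta h'}{h'} \notag \\
&= \frac{\delta n}{n} - \frac{\delta n'}{n'} + \frac{\delta S'}{S'} \notag \\
&= \left( \frac{\delta n}{n} - \frac{\delta n'}{n'} \right) \frac{S'}{R} , \tag{7-5}
\end{align}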
where in the last step we have used Eq. (7-4). Note that the fractional chromatic variation
of magnification is independent of the object (or image) height. The quantities $\delta n$ and
$\delta n'$ represent the differences in the refractive indices of the object and image spaces,
respectively, for the blue and red light. The blue and red light represent, in general, the
shortest and the longest wavelengths of the object radiation spectrum. The chromatic
change $\delta S' = S_b' - S_r'$ in the position of the axial image represents the distance between
the axial Gaussian images $P_{0b}'$ and $P_{0r}'$ for the blue and red light. It is called the
longitudinal chromatic aberration or simply the axial color. The chromatic change
\[ \delta h' = h\,\delta M = h_b' - h_r' \tag{7-6} \]
in the image height, called the transverse chromatic aberration, represents the difference
in the heights of the blue and red chief rays in the blue and red Gaussian image planes,
respectively.
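The first-order results of Eqs. (7-4) and (7-5) can be compared with a direct blue-minus-red calculation. The Python sketch below does this for an air-to-glass surface; the radius, object distance, and refractive indices are illustrative assumptions (the indices are typical of a borosilicate crown at the C and F lines), not values taken from the text.

```python
def image_distance(n, n_img, S, R):
    """Solve n'/S' - n/S = (n' - n)/R, Eq. (7-1), for S'."""
    return n_img / ((n_img - n) / R + n / S)

def magnification(n, n_img, S, R):
    """Transverse magnification M = n S' / (n' S), Eq. (7-2)."""
    return n * image_distance(n, n_img, S, R) / (n_img * S)

n, R, S = 1.0, 5.0, -30.0            # object space in air; lengths in cm (assumed)
n_red, n_blue = 1.51432, 1.52238     # assumed image-space indices for red and blue
n_img = 0.5 * (n_red + n_blue)       # mean index used in the first-order formulas
dn, dn_img = 0.0, n_blue - n_red     # delta n and delta n' (blue minus red)

S_img = image_distance(n, n_img, S, R)
factor = dn / n - dn_img / n_img

dS_direct = image_distance(n, n_blue, S, R) - image_distance(n, n_red, S, R)
dS_eq74 = S_img * factor * (S_img / R - 1.0)                     # Eq. (7-4)

dM_direct = (magnification(n, n_blue, S, R)
             - magnification(n, n_red, S, R)) / magnification(n, n_img, S, R)
dM_eq75 = factor * S_img / R                                     # Eq. (7-5)

print(f"axial color:  direct {dS_direct:.5f} cm, Eq. (7-4) {dS_eq74:.5f} cm")
print(f"delta M / M:  direct {dM_direct:.6f}, Eq. (7-5) {dM_eq75:.6f}")
```

The negative axial color indicates that the blue image lies closer to the surface than the red image, as drawn in Figure 7-1.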
From a practical standpoint, the quantity of interest is the size of the image of a point
object in a given Gaussian image plane. For example, the image of an on-axis point
object in the red Gaussian image plane consists of a bright-red Gaussian image point $P_{0r}'$
at the center, surrounded by blue rays. The blue rays originating at $P_0$ pass through the
Gaussian image point $P_{0b}'$ and diverge from it as a blue disk of rays in the red Gaussian
image plane. The radius $P_{0r}'B$ of the blue disk of rays is given by (see Figure 7-1a)
\begin{align}
r_i &= \alpha\,\delta S' \notag \\
&= (a/L)\,\delta S' , \tag{7-7}
\end{align}
where $a$ is the radius of the exit pupil, and $L$ is the distance of the image from it.
Similarly, the image in the blue Gaussian image plane consists of a bright-blue Gaussian
image point $P_{0b}'$ at the center, surrounded by red rays that converge to $P_{0r}'$. The radius
$P_{0b}'R$ of the red disk is approximately the same as that of the blue disk. For a given
angular size of the light cone forming a Gaussian image point, the ratio $a/L$ is fixed, i.e.,
if the position of the exit pupil is changed so that $L$ changes, its diameter (in practice, the
diameter of the aperture stop) is also changed so that $a/L$ does not change. Thus, the size
of the blue or red image disk, called the transverse axial color, does not change as the
position of the exit pupil is changed.
In the case of an off-axis object point P, its image in the red Gaussian image plane
consists of a red Gaussian image point and a displaced disk of blue rays. The radius of the
blue disk is approximately the same as that for the on-axis image. The displacement of
the blue disk represents the difference in the heights of the blue and red chief rays in this
image plane. We note from Figure 7-1b that the displacement, called the lateral color and
representing the chromatic aberration of the chief ray in a given image plane, is given by
\begin{align}
\delta h_c' &= \delta h' - \gamma\,\delta S' \notag \\
&= \delta h' - (h'/L)\,\delta S' , \tag{7-8}
\end{align}
where $\gamma$ is the angle that the blue chief ray $CR_b$ makes with the optical axis in image
space. The lateral color differs from $\delta h'$, which is the difference in the heights of the blue
and red chief rays in the blue and red Gaussian image planes, respectively. Like $\delta S'$ and
$\delta h'$, $\delta h_c'$ is also numerically negative in Figure 7-1b.
We note that the value of $\delta h_c'$ changes as the value of $L$ changes. This is to be
expected because the chief ray changes as the position of the exit pupil is changed. From
similar triangles $C P_{0b}' P_b'$ and $P_b' D P_r'$ in Figure 7-1b, we find that
\[ \delta h' = \frac{h'}{S' - R}\,\delta S' . \tag{7-9} \]
Substituting Eq. (7-9) into Eq. (7-8), we obtain
\[ \delta h_c' = h' \left( \frac{1}{S' - R} - \frac{1}{L} \right) \delta S' . \tag{7-10} \]
Thus, $\delta h_c' \to 0$ as $L \to S' - R$, i.e., when the exit pupil lies at the center of curvature.
This is to be expected because the undeviated ray UR then becomes the chief ray for both
blue and red light.
From Eq. (7-10), the values of the lateral colors $\delta h_{c1}'$ and $\delta h_{c2}'$ corresponding to two
exit pupil locations, so that the image lies at distances $L_1$ and $L_2$ from them, are related
to each other according to
\[ \delta h_{c2}' = \delta h_{c1}' + \left( \frac{1}{L_1} - \frac{1}{L_2} \right) h'\,\delta S' . \tag{7-11} \]
Equation (7-11) represents the stop-shift equation for the lateral color.
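A brief numerical illustration of Eqs. (7-10) and (7-11) follows. It is only a sketch: the image height, image distance, radius of curvature, axial color, and the two pupil positions are assumed values chosen for the example.

```python
def lateral_color(h_img, S_img, R, L, dS):
    """Lateral color of a refracting surface, Eq. (7-10)."""
    return h_img * (1.0 / (S_img - R) - 1.0 / L) * dS

# Illustrative values in cm (assumptions, not taken from the text).
h_img, S_img, R, dS = -1.0, 21.6, 5.0, -0.38
L1, L2 = 21.6, 30.0     # image distances from two exit pupil positions

dhc1 = lateral_color(h_img, S_img, R, L1, dS)
dhc2 = lateral_color(h_img, S_img, R, L2, dS)

# Stop-shift equation, Eq. (7-11): obtain dhc2 from dhc1 without recomputing.
dhc2_shifted = dhc1 + (1.0 / L1 - 1.0 / L2) * h_img * dS

# Exit pupil at the center of curvature: L = S' - R, so the lateral color vanishes.
dhc_center = lateral_color(h_img, S_img, R, S_img - R, dS)

print(f"lateral color at L1: {dhc1:.5f} cm")
print(f"lateral color at L2: {dhc2:.5f} cm (stop shift gives {dhc2_shifted:.5f} cm)")
print(f"lateral color with pupil at center of curvature: {dhc_center:.1e} cm")
```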
It is evident from Eq. (7-8) that if the longitudinal aberration δS ′ is zero [it cannot
happen for a single surface (unless S ′ = R) or even a thin lens (unless S ′ = 0 )], then δ hc′
is equal to δ h ′ , independent of the position of the exit pupil.
7.3 THIN LENS
The chromatic aberrations of an image formed by a thin lens of focal length f ′ and
refractive index n can be obtained by applying the results for a single refracting surface
successively to its two surfaces. Or, we can obtain them from the imaging and
magnification equations of a thin lens derived in Section 2.3. Because the image-space
focal length f ′ of the lens depends on its refractive index n, the image distance S ′ and
height h ′ also depend on it, i.e., the image is accompanied by both axial and lateral color.
Differentiating Eqs. (2-29) and (2-37a), we obtain
\begin{align}
\frac{\delta S'}{S'^2} &= \frac{\delta f'}{f'^2} \notag \\
&= -\frac{1}{f' V} \tag{7-12}
\end{align}
and
\begin{align}
\frac{\delta M}{M} = \frac{\delta h'}{h'} &= \frac{\delta S'}{S'} \tag{7-13a} \\
&= -\frac{S'}{f' V} , \tag{7-13b}
\end{align}
respectively, where
\[ V = \frac{n - 1}{\delta n} \tag{7-14} \]
is called the dispersive constant of the lens material. Thus, for a change $\delta n$ in the
refractive index, there is a corresponding change $\delta f'$ in the focal length, $\delta S'$ in the
image distance, and $\delta h'$ in the image height. It is evident that the smaller the value of $\delta n$
is, the larger the value of $V$, the smaller the change in focal length, and the smaller the
values of chromatic aberrations.
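As a rough check of Eqs. (7-12) and (7-13b), the Python sketch below compares the first-order axial color of a thin lens with a direct blue-minus-red calculation. The focal length, object distance, and indices are illustrative assumptions, and the thin-lens imaging equation is assumed to have the form $1/S' - 1/S = 1/f'$.

```python
def image_distance(S, f):
    """Thin-lens imaging, 1/S' - 1/S = 1/f' (sign convention assumed here)."""
    return 1.0 / (1.0 / f + 1.0 / S)

n_d, n_F, n_C = 1.51680, 1.52238, 1.51432   # assumed d-, F-, and C-line indices
V = (n_d - 1.0) / (n_F - n_C)               # dispersive constant, Eq. (7-14)

f_d, S = 10.0, -30.0                        # cm, illustrative values
# For a thin lens, 1/f' is proportional to (n - 1).
f_blue = f_d * (n_d - 1.0) / (n_F - 1.0)
f_red = f_d * (n_d - 1.0) / (n_C - 1.0)

S_img = image_distance(S, f_d)
dS_direct = image_distance(S, f_blue) - image_distance(S, f_red)
dS_eq712 = -S_img**2 / (f_d * V)            # Eq. (7-12)
dM_over_M = -S_img / (f_d * V)              # Eq. (7-13b)

print(f"V = {V:.1f}, S' = {S_img:.2f} cm")
print(f"axial color: direct {dS_direct:.4f} cm, Eq. (7-12) {dS_eq712:.4f} cm")
print(f"fractional change in magnification: {dM_over_M:.5f}")
```

The two axial-color values differ slightly because Eq. (7-12) is a first-order result.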
It is common practice to consider $n$ as the refractive index for the yellow line of
helium ($\lambda = 0.5876$ μm), called the d line, and $\delta n$ as the difference $n_F - n_C$ between the
refractive indices for the Fraunhofer lines F and C, i.e., for the blue ($\lambda = 0.4861$ μm) and
red ($\lambda = 0.6563$ μm) lines of hydrogen. Glass manufacturers often give the refractive
index data as a six-digit number. For example, BK7 glass is specified as #517642. The
first three digits define its refractive index according to $n_d - 1 = 0.517$, and the
remaining three digits define its dispersive constant according to
\begin{align}
V &= \frac{n_d - 1}{n_F - n_C} \tag{7-15a} \\
&= 64.2 . \tag{7-15b}
\end{align}
The dispersive constant of a glass defined according to Eq. (7-15a) is called its Abbe
number.
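The six-digit specification can be decoded mechanically, as in the following Python sketch. The function name is ours, and the F- and C-line indices used for the comparison are assumed catalog-type values for a BK7-like glass.

```python
def decode_glass_code(code):
    """Split a six-digit glass code into (n_d, V): first three digits give n_d - 1,
    last three digits give ten times the Abbe number."""
    digits = str(code).lstrip("#").zfill(6)
    n_d = 1.0 + int(digits[:3]) / 1000.0
    abbe = int(digits[3:]) / 10.0
    return n_d, abbe

n_d, abbe = decode_glass_code("#517642")
print(f"n_d = {n_d:.3f}, V = {abbe:.1f}")

# Abbe number from assumed indices at the d, F, and C lines, Eq. (7-15a).
n_F, n_C = 1.52238, 1.51432
abbe_from_indices = (1.51680 - 1.0) / (n_F - n_C)
print(f"V from indices = {abbe_from_indices:.1f}")
```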
The refractive indices of the available lens materials and their Abbe numbers from
Schott Optical Glass are given in Figure 7-2, called an nd/Vd diagram. Each glass in this
diagram is identified by a point whose position is called its optical position. The Abbe
numbers of glasses vary from about 20 to 90. The glasses with nd > 1.60, Vd > 50 or nd <
1.60, Vd > 55 are called crowns and are indicated by the letter K; others are called flints
and are indicated by the letter F. The simple crown (kron in German) glasses (soda-lime-
silicate glasses) have low dispersion, and simple flint glasses (lead-alkali-silicate glasses)
have high dispersion. The addition of barium oxide (BaO) yields a low dispersion with a
relatively high refractive index. The borosilicate crown glasses contain boron oxide
(B2O3) instead of the calcium oxide used in normal soda-lime-silicate glass. The addition
of boron oxide yields a low refractive index and low dispersion.
The light and heavy flint glasses contain low and high lead and barium amounts,
respectively. Use of fluorine instead of oxygen also lowers the refractive index and
dispersion. The barium flint glasses contain both barium oxide and lead oxide; the crown
flint glasses contain calcium oxide and lead oxide, resulting in average dispersions. Use
of rare earths such as lanthanum (La) yields glasses of high refractive index and high
Abbe numbers. The terms heavy and light crowns or flints are also used, e.g., barium
heavy flint (BaSF) or phosphorus heavy crown (PSK) (the letter S is for schwer in
German, meaning “heavy” or “dense”). The barium crown glasses contain a large
proportion of boron oxide and barium oxide, while their silicon dioxide (SiO2) content is
low. The K group in the diagram includes the barium light crowns (BaLK) and the zinc
crown (ZK). The glasses given in the diagram are for use with visible light. The materials
for use with infrared radiation have been discussed in several publications by McCarthy
[1–6].
Figure 7-2. Refractive indices and Abbe numbers of various glass materials
available from Schott Optical Glass, Inc.
The radius of the blue or the red disk of rays in the red or the blue Gaussian image
plane, respectively, is again given by Eq. (7-7), as may be seen from Figure 7-3a. The
axial and transverse axial colors are independent of the position of the aperture stop since
$a/L$ is kept fixed as the position is changed. Similarly, from Figure 7-3b, we can show
that the (numerically positive) displacement $\delta h_c'$ of the blue disk from the red Gaussian
image point $P_r'$ of an off-axis point object $P$ is given by Eq. (7-8). From similar triangles
$C P_{0b}' P_b'$ and $P_b' D P_r'$,
\[ \delta h' = (h'/S')\,\delta S' . \tag{7-16} \]
Thus, the lateral color $\delta h_c'$ representing the transverse chromatic aberration of the chief
ray in a given image plane may be written
\begin{align}
\delta h_c' &= \delta h' - \gamma\,\delta S' \notag \\
&= h' \left( \frac{1}{S'} - \frac{1}{L} \right) \delta S' . \tag{7-17}
\end{align}
It approaches zero when the exit pupil lies at the lens as in Figure 7-3c, i.e., as $L \to S'$.
The chief ray in this case passes through the center of the lens undeviated regardless of its
wavelength. Because the chief rays of different colors are coincident, they intersect an
image plane at the same point. In a given image plane, rays (other than the chief ray) of
different colors are not in sharp focus due to longitudinal chromatic aberration. The stop-
shift equation for the lateral color is the same as Eq. (7-11).
As a numerical example, Figure 7-4 shows how the focal length of a thin lens made
of BK7 glass varies with wavelength. The variation of its refractive index is also shown
in the figure. We note that the refractive index decreases as the wavelength increases.
Thus, from Eq. (2-28), the focal length increases as the wavelength increases.
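The trend described above can be reproduced from a dispersion formula. The Python sketch below uses the Sellmeier coefficients commonly quoted for Schott N-BK7, which we assume here as a stand-in for the BK7 glass of the text, and tabulates the index and the normalized focal length $f'/f_d' = (n_d - 1)/(n - 1)$.

```python
import math

# Sellmeier coefficients commonly quoted for Schott N-BK7 (an assumption standing
# in for the BK7 of the text); wavelength in micrometers.
B = (1.03961212, 0.231792344, 1.01046945)
C = (0.00600069867, 0.0200179144, 103.560653)

def index_bk7(lam_um):
    """Refractive index from the three-term Sellmeier formula."""
    lam2 = lam_um ** 2
    return math.sqrt(1.0 + sum(b * lam2 / (lam2 - c) for b, c in zip(B, C)))

n_d = index_bk7(0.5876)
n_F = index_bk7(0.4861)
n_C = index_bk7(0.6563)
V = (n_d - 1.0) / (n_F - n_C)
print(f"n_d = {n_d:.4f}, V = {V:.1f}")

for lam in (0.4, 0.4861, 0.5876, 0.6563, 0.8, 1.0):
    n = index_bk7(lam)
    ratio = (n_d - 1.0) / (n - 1.0)      # f'/f'_d for a thin lens
    print(f"lambda = {lam:6.4f} um   n = {n:.5f}   f'/f'_d = {ratio:.5f}")
```

The index decreases and the normalized focal length increases monotonically across the tabulated range, in agreement with Figure 7-4.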
7.4 PLANE-PARALLEL PLATE

In Section 2.6, we showed that a plane-parallel plate forms the image of an object
with unity magnification. The distance of the image from the object is given by Eq. (2-97),
which is independent of the object distance. We now consider a plate with its
aperture stop located at its front surface. Its exit pupil ExP, therefore, lies at a distance
$t(1 - 1/n)$ from it, as illustrated in Figure 7-5. For an object point $P$ lying at a distance $S$
from the front surface, the distance of the image $P''$ from ExP (indicated as distance $L_2$
from $\mathrm{ExP}_2$ in the figure) is equal to $S$. A detailed derivation is given below.
Because the aperture stop is located at the first surface, the entrance pupil EnP of the
system is also located there. Moreover, the entrance and exit pupils EnP1 and ExP1 for
this surface are also located at the surface. The entrance pupil EnP2 for the second
surface is ExP1 . The exit pupil ExP2 for this surface is the image of EnP2 formed by it.
Figure 7-3. Chromatic aberrations of a thin lens. (a) On-axis imaging. (b) Off-axis
imaging. (c) Off-axis imaging with the exit pupil at the lens. The axial color is $\delta S'$,
and the lateral color is $\delta h_c'$. The lateral color in (c) is zero.
Figure 7-4. Variation of refractive index $n$ and focal length of a thin lens made of BK7
glass #517642 with wavelength $\lambda$. The focal length is normalized by its value for the
d line. Thus, $f'/f_d' = (n_d - 1)/(n - 1)$. The wavelength is in micrometers.
(Plot axes: $\lambda$ from 0.4 to 1.0; $f'/f_d'$ from 0.97 to 1.02; $n$ from 1.505 to 1.535.)
Thus, letting $n_2 = n$, $n_2' = 1$, $s_2 = -t$, and $R_2 = \infty$, we find from Eqs. (2-4) and (2-10)
that $\mathrm{ExP}_2$ is located at a distance $s_2' = -t/n$ from the second surface, and its
magnification $m_2 = 1$. As expected from Eq. (2-97), $\mathrm{ExP}_2$ lies at a distance $t(1 - 1/n)$
from the first surface. Of course, $\mathrm{ExP}_2$ is also the exit pupil ExP of the system. It is
evident that for the first surface, the distance $L_1$ of the image $P'$ from $\mathrm{ExP}_1$ is equal to its
distance $S_1'$ from the surface. For the second surface, distance $L_2$ of the image $P''$ from
$\mathrm{ExP}_2$ is given by
\[ L_2 = S_2' - s_2' , \tag{7-18a} \]
because $L_2$, $S_2'$, and $s_2'$ are all numerically negative. Substituting for $S_2'$ and $s_2'$, we find
that
\[ L_2 = S . \tag{7-18b} \]
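The pupil and image locations for the plate can be traced surface by surface. The following Python sketch does so for assumed values of the index, thickness, and object distance, using the plane-surface limit ($R = \infty$) of Eq. (7-1), namely $n'/S' = n/S$.

```python
def refract_plane(n_in, n_out, s):
    """Paraxial imaging by a plane surface (R = infinity): n'/s' = n/s."""
    return n_out * s / n_in

n, t, S = 1.5, 2.0, -20.0   # index, thickness (cm), object distance (cm); illustrative

S1_img = refract_plane(1.0, n, S)       # image P' formed by the first surface
S2 = S1_img - t                         # P' measured from the second surface
S2_img = refract_plane(n, 1.0, S2)      # final image P''
s2_img = refract_plane(n, 1.0, -t)      # exit pupil: image of the first surface
L2 = S2_img - s2_img                    # Eq. (7-18a)
pupil_from_front = t + s2_img           # exit pupil measured from the first surface

print(f"s2' = {s2_img:.4f} cm (expected -t/n = {-t / n:.4f} cm)")
print(f"exit pupil at {pupil_from_front:.4f} cm from the first surface")
print(f"L2 = {L2:.4f} cm, S = {S:.4f} cm")
```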
Now we consider the chromatic aberrations of the plate. Differentiating Eq. (2-96a),
the axial color is given by
\[ \delta S_2' = \frac{t}{n^2}\,\delta n . \tag{7-19} \]
Figure 7-5. Imaging of a point object $P$ by a plane-parallel plate of refractive index
$n$ and thickness $t$. $P'$ is the image of $P$ formed by the first surface, and $P''$ is the
image of $P'$ formed by the second surface of the plate. The aperture stop AS and,
therefore, the entrance pupil EnP of the plate are located at the first surface. The
exit pupil ExP is the image of the first surface by the second.
The transverse chromatic aberration $\delta h'$ is zero, because the image magnification is unity
regardless of the wavelength, owing to the zero refracting power of the plate.
The lateral color representing the difference in the heights of the blue and red chief rays
in the final image plane is given by Eq. (7-8):
\[ \delta h_c' = -\gamma\,\delta S_2' = -\frac{h'}{L_2}\,\delta S_2' , \]
or
\[ \delta h_c' = -\frac{h'}{S}\,\frac{t}{n^2}\,\delta n . \tag{7-20} \]
Of course, the exit pupil, which is the image of the first surface by the second, also has
chromatic aberrations. That is why the centers of the blue and red exit pupils are shown in
Figure 7-6 to lie on the optical axis at $O_b$ and $O_r$, respectively. However, its impact on
Eq. (7-20) is a second-order effect.
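A short numerical sketch of Eqs. (7-19) and (7-20) follows. The plate thickness, object distance, image height, and refractive indices are assumptions made for illustration; the image distance from the second surface is obtained from two successive plane refractions.

```python
n_red, n_blue = 1.51432, 1.52238      # assumed C- and F-line indices of the plate
n = 0.5 * (n_red + n_blue)            # mean index
dn = n_blue - n_red                   # delta n (blue minus red)
t, S, h_img = 1.0, -20.0, -1.0        # thickness, object distance, image height (cm)

def final_image_distance(S, t, n):
    """Image distance from the second surface after two plane refractions."""
    return (n * S - t) / n

dS2_direct = final_image_distance(S, t, n_blue) - final_image_distance(S, t, n_red)
dS2_eq719 = t * dn / n**2                       # Eq. (7-19)
dhc_eq720 = -(h_img / S) * (t / n**2) * dn      # Eq. (7-20), using L2 = S

print(f"axial color: direct {dS2_direct:.6f} cm, Eq. (7-19) {dS2_eq719:.6f} cm")
print(f"lateral color from Eq. (7-20): {dhc_eq720:.7f} cm")
```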
Figure 7-6. Chromatic aberrations of a plane-parallel plate. The axial color is $\delta S_2'$,
and the lateral color is $\delta h_c'$.
7.5 GENERAL SYSTEM

Just as we obtained an imaging equation in terms of the positions of the focal points
and principal points of a multielement imaging system in Section 2.4, we can similarly
obtain a relationship between the chromatic aberrations of an image and the chromatic
displacements of these points. To obtain a relationship between the axial color and the
displacements of the focal points and the principal points with a change in wavelength, it
is convenient to use the Newtonian imaging equation (2-83):
\[ z z' = f f' , \tag{7-21} \]
where $z$ is the object distance from the object-space focal point $F$, $z'$ is the image
distance from the image-space focal point $F'$, and $f$ and $f'$ are the object-space and
image-space focal lengths of the imaging system, respectively, as illustrated in Figure 7-7.
In practice, a system is generally surrounded by air, and therefore $n = n' = 1$ and
$f = -f'$.
Figure 7-7. General imaging system showing the location of its principal and focal
points $H$ and $H'$, and $F$ and $F'$, respectively. Also shown are the object and image
locations. For a system in air, $n = 1 = n'$ and $f = -f'$.
Taking a logarithmic differentiation of Eq. (7-21) with $f = -f'$, we obtain
\[ \frac{\delta z}{z} + \frac{\delta z'}{z'} = \frac{2}{f'}\,\delta f' . \tag{7-22} \]
Let l be the distance of the object from the vertex V of the first surface of the system.
Similarly, let l ′ be the distance of the final image from the vertex V ′ of its last surface.
Also, let d and d ′ be the distances of the principal points H and H ′ from the vertices V
and V ′ , respectively. Then
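\begin{align}
z &= l - f - d \notag \\
&= l + f' - d \tag{7-23a}
\end{align}
and
\[ z' = l' - f' - d' . \tag{7-23b} \]
Differentiating Eqs. (7-23), we obtain
\[ \delta z = \delta f' - \delta d \tag{7-24a} \]
and
\[ \delta z' = \delta l' - \delta f' - \delta d' , \tag{7-24b} \]
where $\delta l$ is zero because the object position is independent of the wavelength. The object
and image distances are also related to each other by the transverse magnification $M_t$ of
the image according to
\begin{align}
z &= -f/M_t \notag \\
&= f'/M_t \tag{7-25a}
\end{align}
and
\[ z' = -f' M_t . \tag{7-25b} \]
Substituting Eqs. (7-24) and (7-25) into Eq. (7-22), we obtain
\[ \delta l' = \delta d' - M_t^2\,\delta d + (1 - M_t)^2\,\delta f' . \tag{7-26} \]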
Thus, the axial color δ l ′ can be determined for any value of the image magnification Mt
from the change δf ′ of the image-space focal length f ′ and the displacements δd and
δ d ′ of the principal points H and H ′ , respectively. The displacements of the principal
and focal points are determined in the usual manner by tracing blue and red rays incident
on the system parallel to its optical axis.
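Equation (7-26) can be checked against a direct calculation in which the image is located separately for the two colors. The Python sketch below does this for a hypothetical system in air; the focal length, the principal-point distances, and their chromatic changes are assumed values chosen only for illustration.

```python
def image_position(l, f, d, d_img):
    """Image distance l' from the last vertex for a system in air (f = -f')."""
    z = l + f - d              # Eq. (7-23a)
    z_img = -f**2 / z          # Eq. (7-21) with f = -f'
    return z_img + f + d_img   # Eq. (7-23b) solved for l'

# Red-light data in cm and blue-minus-red changes (all assumed).
f, d, d_img, l = 10.0, 1.0, -0.5, -25.0
df, dd, dd_img = -0.01, 0.002, -0.003

M_t = f / (l + f - d)          # Eq. (7-25a)
dl_direct = (image_position(l, f + df, d + dd, d_img + dd_img)
             - image_position(l, f, d, d_img))
dl_eq726 = dd_img - M_t**2 * dd + (1.0 - M_t)**2 * df      # Eq. (7-26)

print(f"M_t = {M_t:.4f}")
print(f"axial color: direct {dl_direct:.6f} cm, Eq. (7-26) {dl_eq726:.6f} cm")
```

The small residual difference is of second order in the chromatic changes.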
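To determine the lateral color, we write Eq. (7-25b) in the form
\[ h'/h = -z'/f' \tag{7-27} \]
and take its logarithmic differentiation, noting that the object height is independent of the
wavelength. Thus,
\begin{align}
\frac{\delta h'}{h'} &= \frac{\delta z'}{z'} - \frac{\delta f'}{f'} \notag \\
&= -\frac{\delta l' - \delta f' - \delta d'}{f' M_t} - \frac{\delta f'}{f'} \notag \\
&= -\frac{1}{f'} \left[ \frac{\delta l' - \delta d'}{M_t} + \left( 1 - \frac{1}{M_t} \right) \delta f' \right] \notag \\
&= \frac{1}{f'} \left[ M_t\,\delta d - (M_t - 1)\,\delta f' \right] , \tag{7-28}
\end{align}
where we have substituted for $\delta l'$ from Eq. (7-26). The lateral color of the image lying at
a distance $L$ from the exit pupil is given by
\begin{align}
\frac{\delta h_c'}{h'} &= \frac{\delta h'}{h'} - \frac{\delta l'}{L} \notag \\
&= \frac{1}{f'} \left[ M_t\,\delta d - (M_t - 1)\,\delta f' \right] - \frac{\delta l'}{L} . \tag{7-29}
\end{align}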
For an object at infinity, $M_t$ is zero, and Eqs. (7-26) and (7-29) reduce to
\[ \delta l' = \delta d' + \delta f' \tag{7-30a} \]
and
\[ \frac{\delta h_c'}{h'} = \frac{\delta f'}{f'} - \frac{\delta l'}{L} , \tag{7-30b} \]
respectively. We note that if a system is designed so that its axial color $\delta l'$ is zero, its
lateral color $\delta h_c'$ is generally not equal to zero. We refer to a system as being achromatic
if its axial and lateral colors are both equal to zero.
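The lateral-color expressions can be exercised in the same way. In the Python sketch below, the system data, the chromatic changes, and the pupil distance $L$ are assumed values; the fractional change in image height from Eq. (7-28) is compared with a direct evaluation of the magnification for the two colors, and Eq. (7-29) is then applied.

```python
def magnification(l, f, d):
    """Transverse magnification M_t = f'/z with z = l + f' - d, Eqs. (7-23a) and (7-25a)."""
    return f / (l + f - d)

f, d, l = 10.0, 1.0, -25.0            # cm, red-light values (assumed)
df, dd, dl = -0.01, 0.002, -0.0302    # blue-minus-red changes in f', d, and l' (assumed)
L = 12.0                              # image distance from the exit pupil in cm (assumed)

M_t = magnification(l, f, d)
dh_direct = (magnification(l, f + df, d + dd) - M_t) / M_t
dh_eq728 = (M_t * dd - (M_t - 1.0) * df) / f      # Eq. (7-28)
dhc_eq729 = dh_eq728 - dl / L                      # Eq. (7-29)

print(f"delta h'/h':   direct {dh_direct:.6f}, Eq. (7-28) {dh_eq728:.6f}")
print(f"delta h_c'/h': Eq. (7-29) {dhc_eq729:.6f}")
```

In this example the two terms of Eq. (7-29) have opposite signs, which illustrates that the lateral color depends on the pupil position through $L$ even when the chromatic changes of the system are fixed.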
The radius of the blue or red disk of rays in the red or the blue Gaussian image plane,
respectively, is given by Eq. (7-7), with δS ′ replaced by δ l ′ , as may be seen from Figure
7-8. Whereas the axial and transverse axial colors are independent of the position of the
aperture stop, the effect of a stop shift on the lateral color is given by Eq. (7-11). As a
simple example of a general system, the chromatic aberrations of a thick lens are
considered in Problem 7.2, where the condition for an achromatic focal length of a
singlet is derived.
7.6 DOUBLET
In this section, we determine the chromatic aberrations of a doublet, i.e., a system
consisting of two thin lenses. The lenses may be of the same or different materials. We
show that a doublet with two separated lenses cannot be achromatic. We also show that a
doublet consisting of two thin lenses of different materials in contact can be designed to
be achromatic.
Figure 7-8. Chromatic aberrations of a general imaging system. The axial color $\delta l'$
represents the difference in the distances of the blue and red images. The lateral
color $\delta h_c'$ represents the difference in the heights of the blue and red chief rays in a
given image plane.
7.6.1 Lenses of Different Materials
Consider two thin lenses of image-space focal lengths $f_1'$ and $f_2'$ separated by a
distance $t$, as in Figure 7-9. The focal length $f'$ of the combination is given by Eq. (4-32),
i.e.,
\[ \frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'} . \tag{7-31} \]
Differentiating Eq. (7-31) and using Eq. (7-12), we find that the focal length $f'$ is
stationary, i.e., its differential is zero, if
\[ t = \frac{f_1' V_1 + f_2' V_2}{V_1 + V_2} , \tag{7-32} \]
where $V_1$ and $V_2$ are the dispersive constants of the lenses. Although the variation of
focal length of a doublet with wavelength is much reduced (compared to that of a singlet)
by a combination of two lenses in this manner, it is not completely independent of the
wavelength. For example, if the spacing $t$ is chosen by substituting the focal lengths and
V-numbers of the lenses for a certain wavelength, the blue and red focal lengths are
generally not equal to each other. However, they can be made equal, for example, if the
spacing $t$ is chosen at a wavelength $\lambda_m$ for which the refractive index $n_m$ for each lens is
equal to the mean of the corresponding blue and red refractive indices, i.e., if $\lambda_m$ is such
that $n_m = (n_F + n_C)/2$ (see Problem 7.4). The V-number of a lens in this case is
accordingly defined as $V_m = (n_m - 1)/(n_F - n_C)$.
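The spacing condition of Eq. (7-32) can be illustrated numerically. In the Python sketch below, the focal lengths and Abbe numbers are assumed values (a crown and a flint), and the dispersion of each lens is modeled by scaling its $(n - 1)$ by $(1 + s/V)$, with $s = \pm 1/2$ representing blue and red indices placed symmetrically about the mean index, as in the definition of $V_m$ above.

```python
def doublet_power(f1, f2, t, V1, V2, s):
    """Power 1/f' from Eq. (7-31) after scaling each (n - 1) by (1 + s / V).

    s = 0 is the design wavelength; s = +1/2 and -1/2 stand for blue and red.
    """
    p1 = (1.0 + s / V1) / f1
    p2 = (1.0 + s / V2) / f2
    return p1 + p2 - t * p1 * p2

f1, f2 = 15.0, 7.5      # cm, illustrative focal lengths
V1, V2 = 64.2, 36.4     # assumed Abbe numbers of the two lenses

t = (f1 * V1 + f2 * V2) / (V1 + V2)     # Eq. (7-32)

spaced = doublet_power(f1, f2, t, V1, V2, 0.5) - doublet_power(f1, f2, t, V1, V2, -0.5)
contact = doublet_power(f1, f2, 0.0, V1, V2, 0.5) - doublet_power(f1, f2, 0.0, V1, V2, -0.5)

print(f"spacing from Eq. (7-32): t = {t:.3f} cm")
print(f"blue-minus-red power with this spacing: {spaced:.2e} per cm")
print(f"blue-minus-red power in contact (t = 0): {contact:.2e} per cm")
```

With the spacing of Eq. (7-32), the blue and red powers agree to rounding error, whereas the same two positive lenses in contact show a power difference of several parts per thousand.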
Figure 7-9. Doublet consisting of two thin lenses of focal lengths $f_1'$ and $f_2'$ spaced
apart by a distance $t$. Its focal length $f'$ is the same for blue and red light, but the
focal points are not coincident (because the principal points are not).
The axial color of a doublet with $\delta f' = 0$ is given by [see Eq. (7-26)]
\[ \delta l' = \delta d' - M_t^2\,\delta d . \tag{7-33} \]
Because the value of $f'$ is the same for two wavelengths, the image-space focal point $F'$
and the principal point $H'$ for one wavelength are displaced from those for the other by
the same amount $\delta d'$. Now, $F'$ lies at a distance
\[ t_2 = f' \left( 1 - \frac{t}{f_1'} \right) \tag{7-34} \]
from the center of the second lens [see Eq. (4-34)]. Differentiating Eq. (4-34), we find
that $H'$ and $F'$ are displaced by
\begin{align}
\delta d' \equiv \delta t_2 &= f' t\,\frac{\delta f_1'}{f_1'^2} \notag \\
&= -\frac{f' t}{f_1' V_1} . \tag{7-35}
\end{align}
Similarly, considering the distance $f(1 - t/f_2')$ of the object-space focal point $F$ from the
center of the first lens and noting that the object-space focal length $f$ and the image-
space focal length $f'$ are related to each other according to $f = -f'$, we find that the
object-space principal point $H$ and focal point $F$ are displaced by an amount
\begin{align}
\delta d &= -f' t\,\frac{\delta f_2'}{f_2'^2} \notag \\
&= \frac{f' t}{f_2' V_2} . \tag{7-36}
\end{align}
Substituting Eqs. (7-35) and (7-36) into Eq. (7-33), we obtain the axial color
\[ \delta l' = -f' t \left( \frac{1}{f_1' V_1} + \frac{M_t^2}{f_2' V_2} \right) . \tag{7-37} \]
Similarly, Eq. (7-29) yields the lateral color
\[ \frac{\delta h_c'}{h'} = \frac{t M_t}{f_2' V_2} - \frac{\delta l'}{L} . \tag{7-38} \]
7.6.2 Lenses of the Same Material

If the two lenses are made of the same material with an Abbe number $V$, then letting
$V_1 = V_2 = V$ in Eq. (7-32) and substituting the result in Eq. (7-31), we obtain
\[ \frac{1}{f'} = \frac{1}{2} \left( \frac{1}{f_1'} + \frac{1}{f_2'} \right) \tag{7-39} \]
and
\[ t = \frac{1}{2} \left( f_1' + f_2' \right) . \tag{7-40} \]
Because both $f_1'$ and $f_2'$ vary with the wavelength in the same manner, Eq. (7-40) can be
satisfied at one wavelength only, and the value of $f'$ at this wavelength may also be
written $f' = f_1' f_2'/t$. Accordingly, the focal length of the doublet given by Eq. (7-39) is
independent of the wavelength to the first order in $\delta n$. Again, the blue and red focal
lengths are equal if the spacing $t$ is chosen at a wavelength $\lambda_m$ for which the refractive
index $n_m$ is equal to the mean of the blue and red refractive indices. Substituting for
$t = f_1' f_2'/f'$, Eqs. (7-35) through (7-38) reduce to
\[ \delta d' = -f_2'/V , \tag{7-41a} \]
\[ \delta d = f_1'/V , \tag{7-41b} \]
\[ \delta l' = -\frac{1}{V} \left( f_2' + f_1' M_t^2 \right) , \tag{7-41c} \]
and
\[ \frac{\delta h_c'}{h'} = \frac{f_1' M_t}{f' V} - \frac{\delta l'}{L} . \tag{7-41d} \]
A numerical example of a doublet with an achromatic focal length and consisting of
two separated thin lenses using BK7 glass is shown in Figure 7-10a. It is a Huygens
eyepiece consisting of two planoconvex thin lenses of focal lengths 15 and 7.5 cm,
respectively, with a separation of 11.25 cm. The object-space focal point F2 of the
second lens coincides with the image-space principal point H ′ of the eyepiece. Similarly,
the image-space focal point F1′ of the first lens coincides with the object-space principal
point H of the eyepiece. An eyepiece is used with a telescope or a microscope objective.
The objective forms the image of an object in the object-space focal plane (passing
through F) of the eyepiece, which, in turn, forms the image at infinity for comfortable
viewing by a human eye, as illustrated in Figure 7-10b.
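The numbers quoted for this eyepiece can be reproduced from Eqs. (7-31), (7-34), and (7-40). The following Python sketch is only a check of that arithmetic; the function name and the dictionary keys are illustrative, and all distances are in centimeters, measured from the first lens for object-space points and from the second lens for image-space points.

```python
def two_lens_cardinal_points(f1, f2, t):
    """Focal and principal points of two thin lenses separated by t."""
    f = f1 * f2 / (f1 + f2 - t)            # image-space focal length, Eq. (7-31)
    back_focal = f * (1.0 - t / f1)        # F' from the second lens, Eq. (7-34)
    front_focal = -f * (1.0 - t / f2)      # F from the first lens, using f_object = -f'
    return {
        "f": f,
        "F_image": back_focal,
        "H_image": back_focal - f,         # d': H' lies a distance f' before F'
        "F_object": front_focal,
        "H_object": front_focal + f,       # d: H lies a distance f' after F
    }


f1, f2 = 15.0, 7.5
t = 0.5 * (f1 + f2)                        # Eq. (7-40) gives 11.25 cm
eyepiece = two_lens_cardinal_points(f1, f2, t)
```

The result places H′ at −7.5 cm from the second lens, which is the location of F2, and H at 15 cm from the first lens, which is the location of F1′.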
The variation of the focal length with wavelength is shown in Figure 7-10c. Its
minimum value is 10 cm, corresponding to a wavelength λ m = 0.5535 μm . Its value
increases as the wavelength deviates from this wavelength, but the deviation is quite
small, and the blue and red focal lengths are equal. Moreover, it is evident from the
parabola-like variation that there is a variety of pairwise wavelengths at which the focal
lengths are equal. Practically speaking, the variation of the focal length is negligible.
Now, the apparent size of an object as perceived by an observing eye is determined by the
size of the image formed on the retina, which, in turn, depends on the angle it subtends at
the eye. This angle for a point object at a certain height is independent of the wavelength
if the focal length is independent of it. Thus, the constant focal length of the eyepiece
leads to a constant magnification and, therefore, zero lateral color.
[Figure 7-10: (a) and (b) eyepiece schematics marking F, F1′ (coincident with H), F2 (coincident with H′), and F′, with f1′ = 15 = d, f2 = −7.5 = d′, t1 = 11.25, and f′ = 10; (c) plot of f′/f′m versus λ; (d) plot of t2 versus λ.]
Figure 7-10. Doublet consisting of two thin lenses separated by a distance t1 ≡ t. (a)
Schematic of a Huygens eyepiece of focal length 10 cm. The two thin lenses are made
of BK7 glass. (b) The eyepiece forms an image at infinity of the image at F formed
by the objective (not shown). (c) Variation of focal length of the doublet with
wavelength. (d) Variation of back focal distance t2 with wavelength. The wavelength
is in micrometers, and t2 is in centimeters.
The transverse magnification Mt of an object lying at infinity is zero. Thus, from
Eqs. (7-33) and (7-38), the axial and lateral colors of the image are given by δd′ and
−(h′/L)δd′, respectively. Figure 7-10d illustrates the axial color of the eyepiece in this
case. It shows how the back focal distance t2 , i.e., the distance of the focal point F ′ from
the center of the second lens, varies with the wavelength. Its value is 2.5 cm for the
wavelength λ m and increases as the wavelength increases. In order that the axial color be
zero, the position of F ′ must be independent of the wavelength, i.e., δt2 obtained from
Eq. (7-34) must be zero. Substituting for f ′ from Eq. (7-31), Eq. (7-34) may be written
$$\frac{1}{t_2} = \frac{1}{f_1' - t} + \frac{1}{f_2'} \tag{7-42a}$$
$$\phantom{\frac{1}{t_2}} = \frac{1}{\dfrac{1}{(n-1)\kappa_1} - t} + (n-1)\kappa_2, \tag{7-42b}$$
where κ for a lens in terms of the radii of curvature R1 and R2 of its two surfaces is given
by
$$\kappa_i = \left(\frac{1}{R_1} - \frac{1}{R_2}\right)_i, \quad i = 1, 2. \tag{7-43}$$
Differentiating Eq. (7-42), we find that the variation of t2 with respect to n for lenses of
the same material is equal to zero if the value of t is given by
$$f_2' = -f_1'\left(1 - t/f_1'\right)^2. \tag{7-44}$$
It shows that the focal lengths f1′ and f2′ must be of opposite signs. Because the spacing
given by Eq. (7-44) is different from that given by Eq. (7-40), δf′ is no longer zero.
Therefore, Eq. (7-30b) shows that the lateral color given by (δf′/f′)h′ is not zero.
Thus, the axial and lateral colors of a doublet with two separated thin lenses cannot
be simultaneously equal to zero. This is true even if the two lenses are made of different
materials, as may be seen from Eqs. (7-30). Zero axial color is obtained if δf′ = −δd′,
which, in turn, yields a lateral color of (δf′/f′)h′. The doublet is not achromatic unless
δf′ and δd′ are each equal to zero. This is (approximately) true in the case of a thin-lens
doublet discussed in Section 7.6.4. Accordingly, a Huygens eyepiece is achromatic if, for
example, its two separated lenses are each an achromatic thin-lens doublet.
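The condition of Eq. (7-44) can be checked numerically by differentiating Eq. (7-42b) with a finite difference. The Python sketch below is such a check under stated assumptions: the index 1.5168, the step δn = 10⁻⁴, and the values f1′ = 10 and t = 4 are arbitrary illustrative choices, and the function names are not from the text.

```python
def back_focal_distance(n, kappa1, kappa2, t):
    """Back focal distance t2 of two separated thin lenses of the same glass, Eq. (7-42b)."""
    f1 = 1.0 / ((n - 1.0) * kappa1)
    return 1.0 / (1.0 / (f1 - t) + (n - 1.0) * kappa2)


def axial_color_slope(f1, f2, t, n0=1.5168, dn=1e-4):
    """Central-difference estimate of d(t2)/dn for lenses with focal lengths f1, f2 at n0."""
    kappa1 = 1.0 / ((n0 - 1.0) * f1)       # curvature difference of lens 1, Eq. (7-43)
    kappa2 = 1.0 / ((n0 - 1.0) * f2)       # curvature difference of lens 2
    upper = back_focal_distance(n0 + dn, kappa1, kappa2, t)
    lower = back_focal_distance(n0 - dn, kappa1, kappa2, t)
    return (upper - lower) / (2.0 * dn)


f1, t = 10.0, 4.0                                # assumed illustrative values
f2_zero_axial = -f1 * (1.0 - t / f1) ** 2        # Eq. (7-44)
slope_at_7_44 = axial_color_slope(f1, f2_zero_axial, t)
slope_elsewhere = axial_color_slope(f1, -5.0, t) # a second lens that violates Eq. (7-44)
```

The slope is negligible when f2′ satisfies Eq. (7-44) and clearly nonzero otherwise. In this example the two focal lengths indeed have opposite signs.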
It is not surprising that a doublet consisting of two separated thin lenses is not
achromatic. It is shown next that a system with two separated components cannot be
achromatic unless each component is individually achromatic.
7.6.3 Doublet with Two Separated Components

Consider an imaging system consisting of two separated components L1 and L2 in
air, as illustrated in Figure 7-11. We show that the system is achromatic only if each
component is individually achromatic. In order that the axial color of the system be zero,
the blue and red rays from an axial point object P0 must cross the optical axis at the
image point P0′, where β and β′ are the slope angles of the rays incident on and
emerging from the system, respectively. Similarly, for zero lateral color, the blue and red
rays (not shown in the figure) from an off-axis point object P at a height h must pass
through the image point P′ at a height h′. The Lagrange invariant h′β′ = hβ shows that
because h′ is the same for the two off-axis rays, the angle β′ for the axial rays must also
be the same. Therefore, the two axial rays not only must pass through P0′, but must also
emerge from L2 at the same point. This is possible only if L1 is itself achromatic. Thus,
each of the two components must be individually achromatic in order that the system be
achromatic.
For an alternative proof for the system to be achromatic, we consider the imaging of
an object of height h1 lying at a distance S1 from L1 in two steps, as illustrated in Figure
7-12. L1 forms the image of the object at a distance S1′ with a height of h1′ given by
$$h_1' = h_1\left(S_1'/S_1\right). \tag{7-45}$$
This image lies at a distance S2 from L2, which forms its image at a distance S2′ with a
height h2′ given by
$$h_2' = h_1'\left(S_2'/S_2\right) = h_1\,\frac{S_1' S_2'}{S_1 S_2}. \tag{7-46}$$
The axial color of the image formed by L2 is zero if S2′ is independent of wavelength. Its
lateral color is also zero if δh2′ = 0 or, because h1 and S1 are independent of the
wavelength, if
$$\delta\left(S_1'/S_2\right) = S_2^{-2}\left(S_1' + S_2\right)\delta S_1' = 0, \tag{7-47}$$
where we have used the fact that δS2 = −δS1′ because of the fixed spacing between L1
and L2. Thus, δh2′ = 0 if δS1′ = 0, i.e., if the image formed by L1 has zero axial color.
Equation (7-45) then shows that δh1′ is also zero. Thus, the image formed by L1 must be
achromatic. Therefore, the two separated components L1 and L2 must be individually
achromatic if the system is to be achromatic.
[Figure 7-11: components L1 and L2; blue (B) and red (R) rays from the axial point object P0 with slope angles β and (−)β′ meeting at P0′; off-axis point P at height h imaged at P′ at height (−)h′.]
Figure 7-11. Imaging by a system of two separated components L1 and L2 in air.
The system is achromatic only if the axial blue and red rays not only pass through
P0′ but also make the same angle β′ in the image space, i.e., if L1 and L2 are
individually achromatic.
[Figure 7-12: components L1 and L2; object P0P of height h1, intermediate image P0′P′ with heights labeled (−)h1′ and (−)h2, final image P0″P″ with height labeled (−)h2′; distances (−)S1, S1′, (−)S2, and S2′.]
Figure 7-12. Imaging by a system of two separated components L1 and L2 in air.
Imaging by the system is achromatic provided imaging by each component is
individually achromatic.
7.6.4 Thin-Lens Doublet

If the two thin lenses are in contact (t = 0) , then the doublet, called a thin-lens
doublet, is achromatic with respect to its focal length, according to Eq. (7-32), if the ratio
of their focal lengths is given by
$$\frac{f_1'}{f_2'} = -\frac{V_2}{V_1}. \tag{7-48}$$
Because, for zero spacing, Eq. (7-31) reduces to
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'}, \tag{7-49}$$
the two focal lengths are given by
$$f_1' = \frac{f'\left(V_1 - V_2\right)}{V_1} \tag{7-50a}$$
and
$$f_2' = \frac{f'\left(V_2 - V_1\right)}{V_2}. \tag{7-50b}$$
Thus, a thin-lens doublet with an achromatic focal length is obtained by combining a
positive lens of low dispersion (small δn or large V) and a negative lens of high
dispersion.
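Equations (7-50) translate directly into a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, and the Abbe numbers 64.17 and 33.60 are the values quoted in this section for BK7 and SF2. Because these are first-order thin-lens estimates, the results need not coincide exactly with focal lengths obtained from specific index data.

```python
def achromat_focal_lengths(f, V1, V2):
    """Element focal lengths of a thin-lens achromat of focal length f, Eqs. (7-50a,b)."""
    if V1 == V2:
        raise ValueError("Eq. (7-48) cannot be satisfied with equal Abbe numbers")
    f1 = f * (V1 - V2) / V1
    f2 = f * (V2 - V1) / V2
    return f1, f2


# Illustrative values: f' = 10 cm, crown (V1 = 64.17) in front of flint (V2 = 33.60)
f1, f2 = achromat_focal_lengths(10.0, 64.17, 33.60)
```

With V1 > V2 the first element comes out positive and the second negative, in agreement with the statement above.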
By the definition of a thin lens, the principal points of a thin-lens doublet coincide at
its center. Therefore, the blue and red focal points also coincide with each other.
Accordingly, both the axial and lateral colors are zero, regardless of the value of the
object distance. It should be noted, however, that the focal length of a thin-lens doublet
can be made the same at only two selected wavelengths for which the difference δn in
the refractive indices is used in defining V. This may be seen as follows: The focal
lengths f F′ and fC′ of the doublet for the F and C lines are equal to each other according
to Eq. (7-49) if
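$$\frac{1}{f_{F1}'} + \frac{1}{f_{F2}'} = \frac{1}{f_{C1}'} + \frac{1}{f_{C2}'}, \tag{7-51}$$
or
$$(n_{F1} - 1)\kappa_1 + (n_{F2} - 1)\kappa_2 = (n_{C1} - 1)\kappa_1 + (n_{C2} - 1)\kappa_2, \tag{7-52}$$
or
$$\frac{\kappa_2}{\kappa_1} = -\frac{n_{F1} - n_{C1}}{n_{F2} - n_{C2}}. \tag{7-53}$$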
This is indeed the result obtained by substituting the expressions for the focal length and
the Abbe number from Eqs. (2-28) and (7-15a), respectively, into Eq. (7-48). The focal
lengths of the doublet for another pair of wavelengths will be equal to each other
provided the ratio of the differences in the refractive indices for them is equal to that
given by Eq. (7-53). The residual chromatic aberration at wavelengths other than λ F and
λ C is called the secondary spectrum.
The doublet has the same focal length for a third wavelength, e.g., the d line,
provided the refractive indices also satisfy the relation
$$\frac{\kappa_2}{\kappa_1} = -\frac{n_{F1} - n_{d1}}{n_{F2} - n_{d2}}. \tag{7-54}$$
Equations (7-53) and (7-54) yield the equality
$$\frac{n_{F1} - n_{d1}}{n_{F1} - n_{C1}} = \frac{n_{F2} - n_{d2}}{n_{F2} - n_{C2}}. \tag{7-55}$$
The quantity (nF − nd)/(nF − nC) is called the relative partial dispersion of a material.
Thus, a doublet with its two lenses obeying Eq. (7-48) has the same focal length for three
wavelengths if they have the same relative partial dispersion. A system corrected for three
wavelengths is called apochromatic.
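As an illustration of Eq. (7-55), the Python sketch below compares the relative partial dispersions of two glasses. The index values are approximate, catalog-style numbers for a BK7-type crown and an SF2-type flint; they are assumptions made for this example, not data from the text, and the function names are hypothetical.

```python
def abbe_number(nF, nd, nC):
    """Dispersive constant V = (nd - 1) / (nF - nC)."""
    return (nd - 1.0) / (nF - nC)


def relative_partial_dispersion(nF, nd, nC):
    """Relative partial dispersion (nF - nd) / (nF - nC)."""
    return (nF - nd) / (nF - nC)


# Approximate indices (nF, nd, nC); assumed values for illustration only
crown = (1.52238, 1.51680, 1.51432)    # BK7-type glass
flint = (1.66123, 1.64769, 1.64210)    # SF2-type glass

P_crown = relative_partial_dispersion(*crown)
P_flint = relative_partial_dispersion(*flint)
# Eq. (7-55) requires P_crown == P_flint for equal focal lengths at F, d, and C
mismatch = P_flint - P_crown
```

With these assumed indices the two values differ, so a doublet of such glasses corrected at the F and C lines would retain a secondary spectrum at the d line.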
A numerical example of an achromatic thin-lens doublet made of BK7 and SF2
glasses is shown in Figure 7-13a. It is a cemented doublet in that the contact surface
between the two thin lenses is common. Thus, the radius of curvature of the second
surface of the first lens is the same as that of the first surface of the second lens. The focal
length of the doublet is 10 cm for the d line. How it varies with wavelength is shown in
Figure 7-13b. Its minimum occurs in the vicinity of the d line. We note again from the
parabola-like variation that there is a variety of pairwise wavelengths for which the focal
lengths are equal. However, compared to a doublet with separated components, as in
Figure 7-10, there is a built-in design feature of equal focal lengths for the F and C lines.
[Figure 7-13: (a) cemented doublet of BK7 and SF2 lenses with R1 = 6.07, R2 = R3 = −4.22, and R4 = −14.29; (b) plot of f′/fd′ and of the refractive index n of SF2 versus λ.]
Figure 7-13. Achromatic thin-lens doublet. (a) Cemented doublet with a focal length
of 10 cm consisting of BK7 and SF2 glass lenses. The focal lengths of the two lenses
are 4.82 cm and –9.29 cm, respectively. (b) Variation of focal length with the
wavelength. The variation of the refractive index n of SF2 glass is also shown in the
figure. Its refractive index for the d line is 1.645, and its Abbe number is 33.60. The
Abbe number of BK7 is 64.17.
We note from Eqs. (7-50) that because V1 and V2 are positive, f1′ and f2′ have
opposite signs. Moreover, the specification of f ′ and the dispersive constants of the lens
materials specifies their focal lengths f1′ and f2′ . However, the focal length of a thin lens
depends on the difference in the curvatures of its surfaces, while its spherical aberration
and coma depend on the curvatures through its shape factor. This degree of freedom (i.e.,
the choice of the radii of curvature of its four surfaces) can be utilized to make the
achromatic thin-lens doublet free of spherical aberration and coma.

Let blue and red be the shortest and longest wavelengths of an image.
7.7.1 General System

The axial color of a system representing the difference in the blue and red image
distances is given by
$$\delta S' = \delta d' - M_t^2\,\delta d + \left(1 - M_t\right)^2 \delta f', \quad \text{(Axial Color)} \tag{7-56}$$
where f ′ is the image-space focal length, δd and δ d ′ are the axial colors of the object-
space and image-space principal points, and Mt is the image magnification. The axial
color is independent of any stop shift because the image distance is independent of the
location of the aperture stop. The radius of the blue (red) disk of rays in the red (blue)
Gaussian image plane is given by
$$r_i = (a/L)\,\delta S', \tag{7-57}$$
where a is the radius of the exit pupil, and L is the distance of the image from it. It is also
independent of the stop shift because a/L is kept fixed when the stop is shifted. The
lateral color representing the difference in the heights of the blue and red chief rays in an
image plane is given by
$$\frac{\delta h_c'}{h'} = \frac{1}{f'}\left[M_t\,\delta d - \left(M_t - 1\right)\delta f'\right] - \frac{\delta S'}{L}, \quad \text{(Lateral Color)} \tag{7-58}$$
where L is the distance of the image from the exit pupil. For an object at infinity,
magnification is zero. The values of lateral colors δhc1′ and δhc2′ corresponding to two
exit pupil locations, such that the image lies at distances L1 and L2 from them, are
related to each other according to
$$\delta h_{c2}' = \delta h_{c1}' + \left(\frac{1}{L_1} - \frac{1}{L_2}\right) h'\,\delta S'. \tag{7-59}$$
Equation (7-59) represents the stop-shift equation for the lateral color.
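The consistency of Eqs. (7-58) and (7-59) can be seen in a few lines of Python. This is a sketch with hypothetical function names and arbitrary illustrative inputs; it simply evaluates the lateral color directly for the second pupil position and compares it with the stop-shifted value from the first.

```python
def lateral_color(h_img, f, Mt, dd, df, dS, L):
    """Lateral color delta h_c' from Eq. (7-58) for an image at distance L from the exit pupil."""
    return h_img * ((Mt * dd - (Mt - 1.0) * df) / f - dS / L)


def lateral_color_after_stop_shift(dh1, h_img, dS, L1, L2):
    """Stop-shift relation of Eq. (7-59)."""
    return dh1 + (1.0 / L1 - 1.0 / L2) * h_img * dS


# Arbitrary illustrative values (not taken from the text)
params = dict(h_img=1.0, f=10.0, Mt=-0.5, dd=0.02, df=-0.15, dS=-0.3)
L1, L2 = 12.0, 20.0

direct = lateral_color(L=L2, **params)
shifted = lateral_color_after_stop_shift(
    lateral_color(L=L1, **params), params["h_img"], params["dS"], L1, L2
)
```

The two routes agree because only the term −δS′/L in Eq. (7-58) depends on the pupil position.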
7.7.2 Thin Lens

The focal length f ′ of a thin lens of refractive index n and surfaces with radii of
curvature R1 and R2 is given by
$$\frac{1}{f'} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right). \tag{7-60}$$
Its relative change with a change δn in its refractive index is given by
$$\frac{\delta f'}{f'} = -\frac{1}{V}, \tag{7-61}$$
where
$$V = \frac{n - 1}{\delta n} \tag{7-62}$$
is the dispersive constant of the lens material.
It is common practice to consider n as the refractive index for the yellow line of
helium (λ = 0.5876 μm), called the d line, and δn as the difference nF − nC between the
refractive indices for the Fraunhofer lines F and C, i.e., for the blue (λ = 0.4861 μm) and
red (λ = 0.6563 μm) lines of hydrogen. Glass manufacturers often give the refractive
index data as a six-digit number. For example, BK7 glass is specified as #517642. The
first three digits define its refractive index according to nd − 1 = 0.517, and the
remaining three digits define its dispersive constant according to
$$V = \frac{n_d - 1}{n_F - n_C} \tag{7-63a}$$
$$\phantom{V} = 64.2. \tag{7-63b}$$
The dispersive constant of a glass defined according to Eq. (7-15a) is called its Abbe
number.
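The six-digit convention described above can be decoded mechanically. The following Python sketch does so for the BK7 example; the function name `decode_glass_code` is an illustrative choice, and the routine assumes exactly the digit layout stated in the text.

```python
def decode_glass_code(code):
    """Split a six-digit glass code into (nd, V).

    The first three digits give nd - 1 in thousandths; the last three
    give the Abbe number in tenths.
    """
    digits = str(code).lstrip("#")
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError("expected a six-digit glass code such as 517642")
    nd = 1.0 + int(digits[:3]) / 1000.0
    V = int(digits[3:]) / 10.0
    return nd, V


nd, V = decode_glass_code("#517642")   # BK7 example from the text
delta_n = (nd - 1.0) / V               # implied nF - nC, from Eq. (7-63a)
```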
The axial and lateral colors of a thin lens are given by
$$\delta S' = -\frac{S'^2}{f' V} \tag{7-64a}$$
and
$$\delta h_c' = h'\left(\frac{1}{S'} - \frac{1}{L}\right)\delta S'. \tag{7-64b}$$
These equations can be obtained from Eqs. (7-56) and (7-58) by setting the axial colors
δd and δ d ′ of the principal points of a general system equal to zero and using the thin-
lens imaging equations.
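A short Python sketch of Eqs. (7-64) follows. It assumes the thin-lens imaging equation in the form 1/S′ − 1/S = 1/f′, with the object distance S numerically negative for a real object; this sign convention, the function name, and the sample values are assumptions made for illustration.

```python
def thin_lens_colors(S, f, V, h_img, L):
    """Image distance, axial color, and lateral color of a thin lens, Eqs. (7-64a,b)."""
    S_img = 1.0 / (1.0 / f + 1.0 / S)              # assumed imaging equation 1/S' - 1/S = 1/f'
    dS = -S_img**2 / (f * V)                       # Eq. (7-64a)
    dh = h_img * (1.0 / S_img - 1.0 / L) * dS      # Eq. (7-64b)
    return S_img, dS, dh


# Object 30 cm in front of a 10-cm lens, with the exit pupil at the lens (L = S')
S, f, V = -30.0, 10.0, 64.17
S_img, dS, dh = thin_lens_colors(S, f, V, h_img=1.0, L=15.0)

# Cross-check against Eq. (7-56) with delta d = delta d' = 0 and delta f' = -f'/V
Mt = S_img / S
dS_general = (1.0 - Mt) ** 2 * (-f / V)
```

With the stop at the lens, L equals S′ and the lateral color vanishes, which is what Eq. (7-64b) predicts.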
7.7.3 Plane-Parallel Plate

The image of an object formed by a plane-parallel plate of refractive index n and
thickness t lies at a distance t(1 − 1/n) from the object, independent of the object
distance. Its magnification is unity. Its axial and lateral colors are given by
$$\delta S' = \frac{t}{n^2}\,\delta n \tag{7-65a}$$
and
$$\delta h_c' = -\frac{h'}{S}\,\frac{t}{n^2}\,\delta n. \tag{7-65b}$$
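Equation (7-65a) is the first-order change of the image displacement t(1 − 1/n) with index, which the Python sketch below confirms with a finite difference. The plate thickness, index, and δn are arbitrary illustrative values, and the function names are hypothetical.

```python
def plate_image_shift(t, n):
    """Displacement of the image from the object produced by a plane-parallel plate."""
    return t * (1.0 - 1.0 / n)


def plate_axial_color(t, n, dn):
    """Axial color of a plane-parallel plate, Eq. (7-65a)."""
    return t * dn / n**2


t, n, dn = 2.0, 1.5, 0.01      # assumed values: thickness in cm, mean index, index difference
shift = plate_image_shift(t, n)
axial = plate_axial_color(t, n, dn)

# Difference of the shifts at the two extreme indices, for comparison with Eq. (7-65a)
finite_difference = plate_image_shift(t, n + dn / 2) - plate_image_shift(t, n - dn / 2)
```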
7.7.4 Doublet
The focal length f ′ of a doublet with lenses of focal lengths f1′ and f2′ spaced a
distance t apart is given by
$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'} - \frac{t}{f_1' f_2'}. \tag{7-66}$$
It is stationary, i.e., its differential is zero, if
$$t = \frac{f_1' V_1 + f_2' V_2}{V_1 + V_2}, \tag{7-67}$$
where V1 and V2 are the dispersive constants of the lenses. Its axial and lateral colors are
given by
$$\delta S' = -f' t\left(\frac{1}{f_1' V_1} + \frac{M_t^2}{f_2' V_2}\right) \tag{7-68a}$$
and
$$\frac{\delta h_c'}{h'} = \frac{t M_t}{f_2' V_2} - \frac{\delta S'}{L}. \tag{7-68b}$$
The axial color of the principal points is given by
$$\delta d = \frac{f' t}{f_2' V_2} \tag{7-69a}$$
and
$$\delta d' = -\frac{f' t}{f_1' V_1}. \tag{7-69b}$$
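That the spacing of Eq. (7-67) makes the focal length of Eq. (7-66) stationary can be verified numerically. In the Python sketch below, each lens focal length is perturbed by its first-order chromatic change δf_i′ = −ε f_i′/V_i, where ε is a hypothetical scale factor introduced only for this check (ε = 1 would correspond to the full index difference δn). The lens data are arbitrary illustrative values.

```python
def focal_length(f1, f2, t):
    """Doublet focal length from Eq. (7-66)."""
    return f1 * f2 / (f1 + f2 - t)


def stationary_spacing(f1, f2, V1, V2):
    """Spacing that makes the focal length stationary, Eq. (7-67)."""
    return (f1 * V1 + f2 * V2) / (V1 + V2)


def focal_length_slope(f1, f2, V1, V2, t, eps=1e-3):
    """Central-difference estimate of d f'/d(eps) at eps = 0."""
    def perturbed(e):
        # each lens changes by its first-order chromatic shift, Eq. (7-61)
        return focal_length(f1 * (1.0 - e / V1), f2 * (1.0 - e / V2), t)
    return (perturbed(eps) - perturbed(-eps)) / (2.0 * eps)


# Assumed illustrative doublet of two different glasses
f1, V1, f2, V2 = 15.0, 64.17, 7.5, 33.60
t_stat = stationary_spacing(f1, f2, V1, V2)
slope_stationary = focal_length_slope(f1, f2, V1, V2, t_stat)
slope_other = focal_length_slope(f1, f2, V1, V2, 5.0)   # a spacing that violates Eq. (7-67)
```

The slope is negligible at the spacing of Eq. (7-67) and clearly nonzero at the other spacing.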
If the lenses are made of the same material such that V1 = V2 = V , then
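$$\frac{1}{f'} = \frac{1}{2}\left(\frac{1}{f_1'} + \frac{1}{f_2'}\right), \tag{7-70a}$$
$$t = \frac{1}{2}\left(f_1' + f_2'\right), \tag{7-70b}$$
$$\delta S' = -\frac{1}{V}\left(f_2' + f_1' M_t^2\right), \tag{7-70c}$$
$$\frac{\delta h_c'}{h'} = \frac{f_1' M_t}{f' V} - \frac{\delta S'}{L}, \tag{7-70d}$$
$$\delta d = f_1'/V, \tag{7-70e}$$
and
$$\delta d' = -f_2'/V. \tag{7-70f}$$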
The focal length of a thin-lens doublet, i.e., one with t = 0 , is given by
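$$\frac{1}{f'} = \frac{1}{f_1'} + \frac{1}{f_2'}. \tag{7-71a}$$
It is achromatic if
$$\frac{f_1'}{f_2'} = -\frac{V_2}{V_1}, \tag{7-71b}$$
i.e., if
$$f_1' = \frac{f'\left(V_1 - V_2\right)}{V_1} \tag{7-71c}$$
and
$$f_2' = \frac{f'\left(V_2 - V_1\right)}{V_2}. \tag{7-71d}$$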
Such a doublet is obtained by combining a positive lens of low dispersion (small δn or
large V) and a negative lens of high dispersion. It has zero axial and lateral colors
regardless of the object distance. In a cemented doublet, the contact surface between the
two lenses is shared.
The focal lengths of the doublet are equal for the F and C lines. The residual
chromatic aberration at wavelengths other than λ F and λ C is called the secondary
spectrum. A system corrected for three wavelengths is called apochromatic. Its focal
length for the F, C, and d lines is the same if the lenses have the same relative partial
dispersion (nF − nd)/(nF − nC), i.e., if
$$\frac{n_{F1} - n_{d1}}{n_{F1} - n_{C1}} = \frac{n_{F2} - n_{d2}}{n_{F2} - n_{C2}}. \tag{7-72}$$
REFERENCES
1. D. E. McCarthy, “The reflection and transmission of infrared materials, Part 1,
Spectra from 2 μm to 50 μm,” Appl. Opt. 2, 591–595 (1963).
2. D. E. McCarthy, “Part 2, Bibliography,” Appl. Opt. 2, 596–603 (1963).
3. D. E. McCarthy, “Part 3, Spectra from 2 μm to 50 μm,” Appl. Opt. 4, 317–320 (1965).
5. D. E. McCarthy, “Part 5, Spectra from 2 μm to 50 μm,” Appl. Opt. 7, 1997–2000 (1968).
PROBLEMS
7.1 Consider a plane-parallel plate placed in the path of a converging beam. The plate
has a refractive index of 1.5, a thickness of 1 cm, and a diameter of 4 cm. In the
absence of the plate, the beam comes to a focus at a distance of 8 cm from its front
surface at a height of 0.5 cm from its axis. (a) Calculate the position of the focus in
the presence of the plate. (b) Determine its chromatic aberrations for δn = 0.008
and illustrate by a diagram.
7.2 Consider the thick lens of refractive index n, thickness t, and surfaces of radii of
curvature R1 and R2 discussed in Section 4.6. (a) Show that its back focal distance
t2 can be written
$$\frac{1}{t_2} = (n - 1)\left[\frac{1}{R_1 - bt} - \frac{1}{R_2}\right],$$
where b = (n − 1)/n. (b) By letting ∂t2/∂n = 0, show that the position of its focal
point is achromatic if its thickness and radii of curvature are related according to
$$R_2 = \frac{\left(R_1 - bt\right)^2}{R_1 - b^2 t}.$$
Also show that the corresponding focal length may be written
$$f' = R_1^2\,\frac{b\left(t/R_1\right) - 1}{b^2 t}.$$
(c) Show that it is achromatic with respect to its focal length if its thickness is given
by
$$t = \frac{n^2\left(R_1 - R_2\right)}{n^2 - 1},$$
or that the distance between the centers of curvature of its two surfaces is given by
t/n². Show that the corresponding focal length in this case is given by
$$\frac{1}{f'} = \frac{n - 1}{n + 1}\left(\frac{1}{R_1} - \frac{1}{R_2}\right),$$
i.e., it is longer by a factor of n + 1 compared with that of a corresponding thin
lens.
7.3 Consider a concentric lens (see Problem 4.9) made of BK7 glass, with radii of
curvature 5 cm and 4 cm, placed in a converging beam of image-forming light of a
certain system such that the axial image is concentric with the lens. Calculate the
lateral color introduced by each surface and show that their contributions cancel
each other.
7.4 Show that a doublet is achromatic with respect to its focal length if the spacing t is
chosen at a wavelength λ m for which the refractive index nm for each lens is
equal to the mean of the corresponding blue and red refractive indices, i.e., if λ m is
such that nm = (nF + nC)/2. The V-number of a lens in this case is given by
Vm = (nm − 1)/(nF − nC).
7.5 Consider the Mangin mirror of Problem 3.3 imaging an object so that the image
distance is S ′ . Show that its axial color is given by
$$\delta S' = S'^2\left[\left(2 f_s' - R_1\right)/\left(n R_1 f_s'\right)\right]\delta n.$$
For an aperture stop located at the mirror, its lateral color is zero.
CHAPTER 8
MONOCHROMATIC ABERRATIONS
8.1 Introduction
8.2 Wave and Ray Aberrations
8.2.1 Definitions
8.2.2 Relationship between Wave and Ray Aberrations
8.3 Wavefront Defocus Aberration
8.4 Wavefront Tilt Aberration
8.5 Aberrations of a Rotationally Symmetric System
8.5.1 Explicit Dependence on Object Coordinates
8.5.2 No Explicit Dependence on Object Coordinates
8.6 Additivity of Primary Aberrations
8.6.1 Introduction
8.6.2 Primary Wave Aberrations
8.6.3 Transverse Ray Aberrations
8.6.4 Off-Axis Point Object
8.6.5 Higher-Order Aberrations
8.7 Strehl Ratio and Aberration Balancing
8.7.1 Strehl Ratio
8.7.2 Aberration Balancing
8.8 Zernike Circle Polynomials
8.8.1 Introduction
8.8.2 Polynomials in Optical Design
8.8.3 Polynomials in Optical Testing
8.8.4 Characteristics of Polynomial Aberrations
8.8.4.1 Isometric Characteristics
8.8.4.2 Interferometric Characteristics
8.9 Relationship between Zernike Polynomials and Classical Aberrations
8.9.1 Introduction
8.9.4 Astigmatism
8.9.5 Coma
8.9.7 Seidel Coefficients from Zernike Coefficients
8.10 Aberrations of an Anamorphic System
8.10.1 Introduction
8.10.2 Classical Aberrations
8.10.3 Polynomial Aberrations Orthonormal over a Rectangular Pupil
8.10.4 Expansion of a Rectangular Aberration Function in Terms of Orthonormal Rectangular Polynomials
8.11 Observation of Aberrations
8.11.2 Interferograms
8.11.3 Random Aberrations
8.12.1 Wave and Ray Aberrations
8.12.5 Strehl Ratio and Aberration Balancing
8.12.6 Zernike Circle Polynomials
8.12.6.1 Use of Zernike Polynomials in Wavefront Analysis
8.12.6.2 Polynomials in Optical Design
8.12.6.3 Zernike Primary Aberrations
8.12.6.4 Polynomials in Optical Testing
8.12.6.5 Isometric and Interferometric Characteristics
8.12.7 Relationship between Zernike and Seidel Coefficients
8.12.8 Aberrations of an Anamorphic System
Same n Value and Varying as cos mθ and sin mθ
References
Problems
Chapter 8
Monochromatic Aberrations
8.1 INTRODUCTION
Given the radii of curvature of the surfaces of an imaging system and the refractive
indices of the media surrounding them, the position and the size of the Gaussian image of
an object can be determined by using the equations given in Chapters 2, 3, and 4. By
determining the position and the size of its entrance and exit pupils, the irradiance
distribution of the image of an object with a certain radiance distribution can be
calculated, as discussed in Chapter 5. However, the quality of the image, which depends
on the aberrations of the system, was not discussed. In Gaussian optics, all of the object
rays from a certain point object transmitted by a system pass through the Gaussian image
point. The imaging system converts the spherical wavefront diverging from the point
object into a spherical wavefront converging to the Gaussian image point. In reality,
however, when the rays are traced according to the exact laws of geometrical optics, they
generally do not converge to an image point.
In this chapter, the wave and transverse ray aberrations are discussed, and a
relationship between them is derived. The wave aberrations for a certain point object
represent the optical deviations of its wavefront at the exit pupil from being spherical.
The wave aberrations are zero if the wavefront is spherical, in which case all of the rays
converge to its center of curvature, and a perfect point image is obtained. The ray
aberrations represent the displacement of the rays from the Gaussian image point.
A defocus wave aberration is introduced when the image is observed in an image
plane other than the one in which the center of curvature lies. It is also introduced if one
or more imaging elements of the system are displaced along its optical axis. We derive a
relationship between the longitudinal defocus of an image and the defocus aberration
resulting from it. Similarly, when an imaging element is slightly tilted or displaced
perpendicular to the axis, a wavefront tilt is introduced. We show how the wavefront tilt
is related to the wavefront tilt aberration.
The possible aberrations of a rotationally symmetric imaging system in the form of a
power series are discussed. They are referred to as the classical aberrations. They are of
even order in the object and pupil coordinates. They are constructed from three rotational
invariants. There are five aberrations of fourth order, referred to as the primary or the
Seidel aberrations. We show that the primary wave aberrations of a multisurface system
are additive in the sense that they can be obtained by adding the primary wave aberrations
of the surfaces, where the Gaussian image of a point object by one surface becomes the
point object for the next surface.
The concept of Strehl ratio as a measure of image quality is introduced next, and
balancing of an aberration of a certain order with one or more aberrations of lower orders
is discussed. The aberrations of a system are discussed in terms of Zernike circle
polynomials, which are not only orthogonal over a circular pupil but also represent
balanced classical aberrations with minimum variance across the pupil. The aberrations in
the form of Zernike polynomials are referred to as the orthogonal aberrations. The
relationships between the classical and orthogonal aberrations are discussed.
Although the transverse ray aberrations of a system for a certain point object can be
obtained by tracing the rays through the system and up to the image plane, they can also
be obtained from the wave aberrations. However, the distribution of rays in an image
plane does not represent the true picture of an image because it does not take into account
the diffraction of the wavefront at the exit pupil. Because the wave aberrations play a
fundamental role in determining the image quality, knowledge of them is essential. We
point out that the ray aberrations are not additive, in that those in the final image plane
cannot be obtained by adding their values in the intermediate image planes formed by the
surfaces of a system. Of course, the contribution of a surface to the ray aberration in the
final image plane can be obtained from its wave aberration using the parameters of the
final image.
Because the refractive index of a transparent substance depends on the wavelength,
the optical path length of a ray passing through it also depends on the wavelength.
Accordingly, the monochromatic aberrations of a refracting system discussed also vary
with the wavelength. For example, the variation of spherical aberration with wavelength,
called spherochromatism, can be calculated by substituting the appropriate value of the
refractive index. However, this variation is generally small, especially for a narrow
spectral bandwidth.
The aberrations of anamorphic systems, consisting of cylindrical optics, that are
symmetrical about two orthogonal planes are discussed briefly. It is shown that there are
six reflection invariants, yielding sixteen primary aberrations. The balanced aberrations
orthogonal over a rectangular pupil are products of Legendre polynomials, one for each of
the two axes of the pupil.
Interferograms of primary aberrations are discussed briefly to illustrate how such
aberrations may be recognized in practice. Finally, the results of this chapter are
summarized.
8.2 WAVE AND RAY ABERRATIONS

In this section we define wave and transverse ray aberrations and derive a
relationship between the two. We show that the transverse aberration of a ray is
proportional to the derivative of its wave aberration with respect to its pupil coordinates.
8.2.1 Definitions
Consider an optical system imaging a point object P, as illustrated in Figure 8-1. The
object radiates a spherical wave. If the image is perfect, the diverging spherical wave
incident on the system is converted by it into a spherical wave converging to the Gaussian
image point P′. With a few exceptions, the wave exiting from practical systems is only
approximately spherical.
Figure 8-1. Perfect imaging by an optical system. P is the point object, and P′ is its
Gaussian image point.
We now introduce the concept of wave and ray aberrations associated with an object
ray and derive a relationship between the two. The optical path length of a ray in a
medium of refractive index n is equal to n times its geometrical path length. If rays from
a point object are traced through the system and up to the exit pupil such that each one
travels an optical path length equal to that of the chief ray, the surface passing through
their end points is called the system wavefront for the point object under consideration. If
the wavefront is spherical, with its center of curvature at the Gaussian image point, we
say that the image is perfect. The rays transmitted by the system in that case have equal
optical path lengths in propagating from P to $P'$, and they all pass through $P'$. If,
however, the actual wavefront deviates from the spherical wavefront, called the Gaussian
reference sphere, we say that the image is aberrated. The rays do not have equal optical
path lengths, and they intersect the Gaussian image plane in the vicinity of $P'$.
The optical deviation (i.e., the geometrical deviation times the refractive index $n_i$ of
the image space) of the wavefront from the Gaussian reference sphere along a ray is called
its wave aberration. It represents the difference between the optical path lengths of the ray
under consideration and the chief ray in traveling from the point object to the reference
sphere. Accordingly, the wave aberration associated with the chief ray is zero. Because
the optical path lengths of the rays from the reference sphere to the Gaussian image point
are equal, the wave aberration of a ray is also equal to the difference between its optical
path length from the point object P to the Gaussian image point $P'$ and that of the chief
ray.
The wave aberration of a ray from a point object is positive if it travels an extra
optical path length, compared to the chief ray, in order to reach the Gaussian reference
sphere [1]. Figures 8-2a and 8-2b illustrate the reference sphere S and the aberrated
wavefront W for on-axis and off-axis point objects $P_0$ and $P$, respectively. The reference
sphere, which is centered at the Gaussian image point $P_0'$ in Figure 8-2a or $P'$ in Figure
8-2b, and the wavefront pass through the center O of the exit pupil. The wave aberration
$n_i \bar{Q}Q$ of a general ray $GR_0$ or $GR$, as shown in the figures, is numerically positive.
The coordinate system is also illustrated in these figures. We choose a right-hand Cartesian
coordinate system such that the optical axis lies along the z axis. The object, entrance
pupil, exit pupil, and the Gaussian image lie in mutually parallel planes that are
perpendicular to this axis. Figure 8-3 illustrates the coordinate systems in the object, exit
pupil, and image planes. The origin of the coordinate system lies at O, and the Gaussian
image plane lies at a distance $z_g$ from it along the z axis.

Figure 8-2. (a) Aberrated wavefront for an on-axis point object. The reference
sphere S of radius of curvature R is centered at the Gaussian image point $P_0'$. The
wavefront W and reference sphere S pass through the center O of the exit pupil ExP.
A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,
where the z axis is along the optical axis of the imaging system. Angular rotations
$\alpha$, $\beta$, and $\gamma$ about the three axes are also indicated. $CR_0$ is the chief
ray, and a general ray $GR_0$ is shown intersecting the Gaussian image plane at $P_0''$.

Figure 8-2. (b) Aberrated wavefront for an off-axis point object. The reference
sphere S of radius of curvature R is centered at the Gaussian image point $P'$. The
value of R in this figure is slightly larger than its value in Figure 8-2a. GR is a
general ray intersecting the Gaussian image plane at the point $P''$. By definition,
the chief ray (not shown) passes through O, but it may or may not pass through $P'$.
The displacement of the chief ray in the image plane from $P'$ represents distortion.

We assume that a point object such as P lies along the x axis. (There is no loss of
generality due to this because the system is rotationally symmetric about the optical axis.)
The zx plane containing the optical axis and the point object is called the tangential or
the meridional plane. The corresponding Gaussian image point $P'$ lying in the Gaussian
image plane along its x axis also lies in the tangential plane. This may be seen by
considering a tangential object ray and Snell’s law, according to which the incident and
refracted (or reflected) rays lie in the same plane. The chief ray always lies in the
tangential plane. The plane normal to the tangential plane but containing the chief ray is
called the sagittal plane. As the chief ray bends when it is refracted or reflected by an
optical surface, so does the sagittal plane. It should be evident that only the chief ray lies
in both the tangential and sagittal planes because it lies along the line of intersection of
these two planes.

Figure 8-3. Right-hand coordinate system in the object, exit pupil, and image planes.
The optical axis of the system is along the z axis, and the off-axis point object P is
assumed to be along the x axis, thus making the zx plane the tangential plane.

Consider a ray such as GR from the object passing through the system and
intersecting the Gaussian image plane at $P''(x_i, y_i)$, as illustrated in Figure 8-2b. The
displacement $P'P''$ of $P''$ from the Gaussian image point $P'$ is called the geometrical or
the transverse ray aberration. The distribution of rays in an image plane is called the ray
spot diagram. The ray aberrations and spot diagrams are discussed in Chapter 9. When
the wavefront is spherical, with its center of curvature at the Gaussian image point, then
the wave and ray aberrations are zero. In that case, all of the object rays transmitted by
the system pass through the Gaussian image point, and the image is said to be perfect.
8.2.2 Relationship between Wave and Ray Aberrations

In Figures 8-2, a general ray such as $GR_0$ or $GR$ is shown intersecting the wavefront
and the reference sphere at points $\bar{Q}$ and $Q$, respectively. By definition of the
wavefront, the optical path length of a ray starting at the point object and ending at
$\bar{Q}$ is the same as that of the chief ray ending at O. Thus, $n_i \bar{Q}Q$ gives the wave
aberration of the ray under consideration, which, as shown in the figures, is numerically
positive. Let $W(x, y)$ represent this wave aberration, where $(x, y, z)$ are the coordinates
of the point Q. We need not consider the dependence of W on z explicitly because z is
related to x and y by virtue of Q being on the reference sphere, e.g.,
$x^2 + y^2 + z^2 = R^2$ for a sphere of radius of curvature R centered at (0, 0).
Consider a ray $ABP''$ passing through a point A on the wavefront, as illustrated in
Figure 8-4. It intersects the reference sphere at the point B at height x, and the image
plane at $P''$ at a height $x_g + x_i$, where $x_g$ is the height of the Gaussian image point
$P'$. The wave aberration of the ray is given by $W(x) = n_i AB$. Let the angle of the ray with
the surface normal at B, indicated by the dashed line $BP'$, be $\beta$. The ray aberration
$P'P'' = x_i$ is approximately equal to $\beta R$. Now consider a point C on the wavefront in
the vicinity of A. Let CED be the ray passing through it. Draw a line AE parallel to BD so
that $AB = ED$. The wave aberration $n_i CD$ of the ray may be written

$$W(x + \Delta x) = n_i (AB + CE) = W(x) + \Delta W , \tag{8-1}$$

where $x + \Delta x$ is the height of the point D. Note that the angle EAC is equal to
$\beta$, and $CE = \beta \Delta x$. Now

$$\Delta W = n_i CE = n_i \beta \Delta x = n_i (x_i / R) \Delta x , \tag{8-2}$$

or

$$x_i = \frac{R}{n_i} \frac{\Delta W}{\Delta x} . \tag{8-3}$$

Figure 8-4. Wave and ray aberrations. W is the wavefront for a point object whose
Gaussian image lies at $P'$. $P'P''$ is the ray aberration, and $n_i AB$ is the wave
aberration of a ray $ABP''$ passing through a point B on the reference sphere S with
its center of curvature at $P'$. The ray $ABP''$ is normal to the wavefront W at the
point A, and $BP'$ is the surface normal at a point B on the reference sphere.

In the limit, as C approaches A, we obtain

$$x_i = \frac{R}{n_i} \frac{\partial W}{\partial x} . \tag{8-4}$$

A similar equation is obtained for the y coordinate of the point $P''$. Thus, the wave and
ray aberrations are related to each other according to

$$(x_i, y_i) = \frac{R}{n_i} \left( \frac{\partial W}{\partial x}, \frac{\partial W}{\partial y} \right) , \tag{8-5}$$

where $(x_i, y_i)$ represent the coordinates of $P''$ with respect to those of the Gaussian
image point $P'$. For systems with narrow fields of view, $P'$ lies close to $P_0'$, and we may
replace R with $z_g$. Note that in the case of an axial point object, $R = z_g$. For a rigorous
derivation, see Reference 2.
Thus, if $W(x, y)$ is the wave aberration of a ray in the exit pupil, the corresponding
ray aberration in the image plane is given by its spatial derivative multiplied by the radius
of curvature of the Gaussian reference sphere and divided by the refractive index of the
image space. Because the rays are normal to a wavefront, the ray aberrations depend on
the shape of the wavefront and, therefore, on its geometrical path length difference from
the reference sphere. The division by $n_i$ in Eq. (8-5) converts the optical path length
difference into the geometrical path length difference. When an image is formed in free
space, as is often the case in practice, then $n_i = 1$.
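
As a rough numerical illustration of Eq. (8-5), the sketch below differentiates a sample
wave aberration by central differences and scales the result by $R/n_i$. The function names,
the coefficient `C`, and the numerical values are assumptions made for this example only;
they are not taken from the text.

```python
# Sketch of Eq. (8-5): (x_i, y_i) = (R / n_i) * (dW/dx, dW/dy).
# All names and numbers here are illustrative assumptions.

def ray_aberration(W, x, y, R, n_i=1.0, step=1e-4):
    """Estimate the transverse ray aberration by central differences of W(x, y)."""
    dW_dx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dW_dy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return (R / n_i) * dW_dx, (R / n_i) * dW_dy

C = 1.0e-6  # assumed coefficient of a rotationally symmetric r^4 aberration

def sample_wave_aberration(x, y):
    """W(x, y) = C (x^2 + y^2)^2, a spherical-aberration-like test function."""
    return C * (x * x + y * y) ** 2

R = 100.0          # radius of curvature of the reference sphere (assumed)
x, y = 3.0, 4.0    # pupil point (assumed)

xi, yi = ray_aberration(sample_wave_aberration, x, y, R)

# Analytic gradient of the test function for comparison:
# dW/dx = 4 C (x^2 + y^2) x, and similarly for y.
xi_exact = R * 4.0 * C * (x * x + y * y) * x
yi_exact = R * 4.0 * C * (x * x + y * y) * y
print(xi, xi_exact)
print(yi, yi_exact)
```

The finite-difference and analytic values should agree closely, which is a quick way to
check a hand-derived gradient before using it in a spot-diagram calculation.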
We refer to the aberration $W(x, y)$ as the wave aberration at a projected point $(x, y)$
in the plane of the exit pupil. If $(r, \theta)$ are the polar coordinates of this point, as
illustrated in Figure 8-3, they are related to its rectangular coordinates $(x, y)$ according to

$$(x, y) = r (\cos\theta, \sin\theta) . \tag{8-6}$$

Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the
exit pupil plane and thus correspond to $\theta = 0$ or $\pi$. Similarly, the sagittal rays,
i.e., those lying in a plane orthogonal to the tangential plane but containing the chief ray,
lie along the y axis of the exit pupil plane and thus correspond to $\theta = \pi/2$ or
$3\pi/2$. If $W(r, \theta)$ represents the aberration in polar coordinates, then the ray
aberrations are given by

$$(x_i, y_i) = \frac{R}{n_i} \left( \cos\theta \frac{\partial W}{\partial r} - \frac{\sin\theta}{r} \frac{\partial W}{\partial \theta} ,\; \sin\theta \frac{\partial W}{\partial r} + \frac{\cos\theta}{r} \frac{\partial W}{\partial \theta} \right) . \tag{8-7}$$

For a radially symmetric aberration $W(r)$, a ray of zone r in the exit pupil plane
intersects the Gaussian image plane at a distance $r_i$ from the Gaussian image point given
by

$$r_i = \frac{R}{n_i} \frac{\partial W}{\partial r} . \tag{8-8}$$
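
The polar form of Eq. (8-7) can be checked against the Cartesian form of Eq. (8-5) on a
simple test function. The sketch below uses $W = C r^3 \cos\theta = C x (x^2 + y^2)$, whose
derivatives are easy to write by hand; the coefficient `C` and all numerical values are
assumptions chosen for illustration.

```python
import math

# Sketch comparing Eq. (8-7) (polar) with Eq. (8-5) (Cartesian).
# Names and numbers are illustrative assumptions.

def ray_aberration_polar(dW_dr, dW_dtheta, r, theta, R, n_i=1.0):
    """Eq. (8-7): ray aberration from the polar derivatives of W(r, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    x_i = (R / n_i) * (c * dW_dr - (s / r) * dW_dtheta)
    y_i = (R / n_i) * (s * dW_dr + (c / r) * dW_dtheta)
    return x_i, y_i

C = 2.0e-5                       # assumed coefficient
R = 100.0                        # assumed reference-sphere radius
r, theta = 5.0, math.radians(30.0)

# Polar derivatives of W = C r^3 cos(theta).
dW_dr = 3.0 * C * r**2 * math.cos(theta)
dW_dtheta = -C * r**3 * math.sin(theta)
polar = ray_aberration_polar(dW_dr, dW_dtheta, r, theta, R)

# Cartesian derivatives of the same function, W = C x (x^2 + y^2).
x, y = r * math.cos(theta), r * math.sin(theta)
cartesian = (R * C * (3.0 * x * x + y * y), R * C * 2.0 * x * y)

print(polar)
print(cartesian)
```

Both routes should give the same pair $(x_i, y_i)$ up to rounding, since Eq. (8-7) is just
the chain rule applied to Eq. (8-5).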
8.3 WAVEFRONT DEFOCUS ABERRATION

We now discuss defocus wave aberration of a system and relate it to the longitudinal
defocus of an image. Consider, as indicated in Figure 8-5, an imaging system for which
the expected Gaussian image of a point object is located at P1 . If the system is assembled
properly and it is aberration free, a spherical wavefront with its center of curvature at P1
emerges from its exit pupil, and a perfect image is observed in the Gaussian image plane.
However, if one or more of its elements is slightly displaced along its optical axis, then
the image is displaced longitudinally to a point, say P2 , such that P2 lies on the line OP1
joining the center O of the exit pupil and the Gaussian image point P1 . Accordingly, the
wavefront W for this point object is spherical, with its center of curvature at P2 .
Figure 8-5. Defocused wavefront W is spherical with a radius of curvature R
centered at P2. The reference sphere S with a radius of curvature z is centered at
P1. Both W and S pass through the center O of the exit pupil ExP. The ray Q2P2 is
normal to the wavefront at Q2.

The aberration of the wavefront with respect to the Gaussian reference sphere S,
which is centered at P1, is the optical deviation between the two along a ray. For a point
Q1 on the reference sphere, this deviation in the figure is given by $n_i Q_2 Q_1$, where $n_i$
is the refractive index of the image space, and $Q_2 Q_1$ is approximately equal to the
difference in the sags of the reference sphere of radius of curvature z and the spherical
wavefront of radius of curvature R. (The sag of a surface S at a point Q1, indicated by OB in
Figure 8-5, represents its deviation along its axis of symmetry from a plane surface that is
tangent to it at its vertex.) It is numerically positive because, compared with the chief ray
passing through O, it represents the extra optical path length that a ray passing through Q1
has to travel in order to reach the reference sphere. Thus, the defocus wave aberration at the
point Q1 is given by

$$W(r) = \frac{n_i}{2} \left( \frac{1}{z} - \frac{1}{R} \right) r^2 , \tag{8-9}$$

where r is the distance of Q1 from the optical axis. We note that the defocus wave
aberration is proportional to $r^2$. If $z \approx R$, then Eq. (8-9) may be written

$$W(r) \approx -\frac{n_i}{2} \frac{\Delta R}{R^2} r^2 , \tag{8-10}$$

where

$$\Delta R = z - R \tag{8-11}$$

is called the longitudinal defocus. We note that the defocus wave aberration and the
longitudinal defocus have numerically opposite signs.
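
To get a feel for how good the approximation of Eq. (8-10) is, the following sketch
evaluates it next to the sag-difference form of Eq. (8-9). The function names and the
numerical values of z, R, and r are assumptions chosen only for illustration.

```python
# Sketch comparing Eq. (8-9) with its small-defocus approximation, Eq. (8-10).
# Names and numbers are illustrative assumptions.

def defocus_exact(r, z, R, n_i=1.0):
    """Eq. (8-9): W(r) = (n_i / 2) (1/z - 1/R) r^2."""
    return 0.5 * n_i * (1.0 / z - 1.0 / R) * r * r

def defocus_approx(r, z, R, n_i=1.0):
    """Eq. (8-10): W(r) ~ -(n_i / 2) (Delta R / R^2) r^2, with Delta R = z - R."""
    delta_R = z - R
    return -0.5 * n_i * delta_R / (R * R) * r * r

z, R, r = 100.0, 100.1, 5.0      # assumed: wavefront focus 0.1 beyond the reference focus
w_exact = defocus_exact(r, z, R)
w_approx = defocus_approx(r, z, R)
relative_error = abs(w_exact - w_approx) / abs(w_exact)
print(w_exact, w_approx, relative_error)
```

With these numbers the longitudinal defocus $\Delta R$ is negative and both expressions give
a positive aberration, in line with the sign remark above; the two values differ by roughly
the ratio $|\Delta R|/R$.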
A defocus aberration is also introduced if the system is assembled properly, but the
image is observed in a plane other than the Gaussian image plane. Consider, for example,
an imaging system forming an aberration-free image at the Gaussian image point P2 .
(Note that the Gaussian image is now located at P2 in Figure 8-5.) Thus, the wavefront at
the exit pupil is spherical, passing through its center O with its center of curvature at P2 .
Let the image be observed in a defocused plane passing through a point P1 that lies on the
line joining O and P2 . For the observed image at P1 to be aberration free, the wavefront
at the exit pupil must be spherical, with its center of curvature at P1 . Such a wavefront
forms the reference sphere with respect to which the aberration of the actual wavefront
must be defined. Once again, the aberration of the wavefront at a point Q1 on the
reference sphere is given by Eq. (8-9).
For a system with a circular exit pupil of radius a, Eq. (8-9) may be written

$$W(\rho) = \frac{n_i}{2} \left( \frac{1}{z} - \frac{1}{R} \right) a^2 \rho^2 \tag{8-12a}$$

$$= B_d \rho^2 , \tag{8-12b}$$

where

$$\rho = r / a \tag{8-13}$$

is the normalized distance of a point in the pupil plane from its center, and

$$B_d = \frac{n_i}{2} \left( \frac{1}{z} - \frac{1}{R} \right) a^2 \tag{8-14a}$$

$$\approx - n_i \Delta R / 8 F^2 \tag{8-14b}$$

is the peak value of the defocus aberration. The quantity F in Eq. (8-14b) is the focal
ratio of the image-forming light cone. It is given by

$$F = R / 2a . \tag{8-15}$$

We note that a positive value of $B_d$ implies a negative value of the longitudinal defocus
$\Delta R$, or $z < R$. Thus, an imaging system with a positive value of defocus aberration
$B_d$ can be made defocus free if the image is observed in a plane lying farther from the exit
pupil, compared with the defocused image plane, by a distance $8 B_d F^2 / n_i$. Similarly, a
positive defocus aberration of $B_d = - n_i \Delta R / 8 F^2$ is introduced into the system if
the image is observed in a plane lying closer to the exit pupil, compared with the defocus-free
image plane, by a (numerically negative) distance $\Delta R$.
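
The relations among the peak defocus aberration, the focal ratio, and the refocusing
distance can be exercised with a few lines of code. This is only a sketch under assumed
values (R = 200, a = 10, and a longitudinal defocus of -0.8, all in the same length unit);
the helper names are hypothetical.

```python
# Sketch of Eqs. (8-14b) and (8-15) and the refocusing distance 8 B_d F^2 / n_i.
# Helper names and numerical values are illustrative assumptions.

def focal_ratio(R, a):
    """Eq. (8-15): F = R / (2a)."""
    return R / (2.0 * a)

def defocus_peak(delta_R, F, n_i=1.0):
    """Eq. (8-14b): B_d ~ -n_i * Delta R / (8 F^2)."""
    return -n_i * delta_R / (8.0 * F * F)

def refocus_distance(B_d, F, n_i=1.0):
    """Distance by which the observation plane moves away from the exit pupil
    to cancel a peak defocus aberration B_d."""
    return 8.0 * B_d * F * F / n_i

F = focal_ratio(R=200.0, a=10.0)          # F = 10
B_d = defocus_peak(delta_R=-0.8, F=F)     # negative Delta R gives positive B_d
shift = refocus_distance(B_d, F)
print(F, B_d, shift)
```

In this example a longitudinal defocus of -0.8 corresponds to a peak aberration of 0.001,
and the refocusing distance recovers the magnitude of the original defocus.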
The radius of curvature $R_{ik}$ of the Petzval image surface of a system (of k imaging
surfaces) is given by Eq. (2-124). Therefore, an observation of the image of a point object
in the Gaussian image plane (i.e., on a planar surface) is equivalent to a longitudinal
defocus of $h'^2 / 2 R_{ik}$. Substituting into Eq. (8-14b), we obtain the corresponding field
curvature aberration of $B_d = n_i h'^2 / 16 R_{ik} F^2$.
The longitudinal chromatic aberration or axial color represents chromatic
longitudinal defocus; thus, it can be written as a defocus wave aberration. The wavefronts
for different wavelengths are spherical, but their radii of curvature are longer for the
longer wavelengths. If the red wavefront is chosen as the reference sphere, then the
defocus wave aberration corresponding to an axial color of $\delta S'$ is given by

$$W_d(r) = - \frac{n_i}{2} \frac{\delta S'}{R^2} r^2 , \tag{8-16}$$

where $n_i$ is the refractive index of the image space.
8.4 WAVEFRONT TILT ABERRATION
Next, we consider a wavefront tilt angle and the corresponding wavefront tilt
aberration. We consider a system that has one or more of its optical elements
inadvertently tilted and/or decentered slightly, resulting in a transverse displacement of
the image of a point object from its Gaussian image at P1 to P2, as indicated in Figure 8-6.
Thus, a spherical wavefront with its center of curvature at P2 emerges from the exit
pupil of the system. The Gaussian reference sphere is, of course, centered at P1. The
aberration of the wavefront at a point Q1 on the reference sphere is its optical deviation
$n_i Q_2 Q_1$ from the reference sphere along the ray passing through Q1. It is evident that for
small values of the ray aberration P1P2, the wavefront and the reference sphere are tilted
with respect to each other by a small angle $\beta$. The ray and the wave aberrations can be
written

$$x_i = R \beta \tag{8-17}$$

and

$$W(r, \theta) = n_i \beta r \cos\theta , \tag{8-18}$$

respectively, where $(r, \theta)$ are the polar coordinates of the point Q1 projected onto the
plane of the exit pupil. Both the wave and ray aberrations are numerically positive in
Figure 8-6.

Figure 8-6. Wavefront tilt. The spherical wavefront W is centered at P2, while the
reference sphere S is centered at P1. Thus, for small values of P1P2, the two
spherical surfaces are tilted with respect to each other by a small angle
$\beta = P_1 P_2 / R$, where R is their radius of curvature. The ray Q2P2 is normal to the
wavefront at Q2.

Once again, for a system with a circular exit pupil of radius a, Eq. (8-18) may be
written

$$W(\rho, \theta) = B_t \rho \cos\theta , \tag{8-19}$$

where

$$B_t = n_i a \beta \tag{8-20}$$

is the peak value of the tilt aberration. Note that a positive value of $B_t$ implies that the
wavefront tilt angle $\beta$ is also positive, as in Figure 8-6. Thus, if an aberration-free
wavefront is centered at P2, then an observation with respect to P1 as the origin implies
that we have introduced a tilt aberration of $B_t \rho \cos\theta$.
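
The tilt relations of Eqs. (8-17) through (8-20) reduce to a few multiplications, as the
sketch below shows. The image displacement, reference-sphere radius, and pupil radius used
here are assumed values, and the function names are hypothetical.

```python
import math

# Sketch of the wavefront tilt relations, Eqs. (8-17) to (8-20).
# Names and numbers are illustrative assumptions.

def tilt_angle(x_i, R):
    """Eq. (8-17) solved for the tilt angle: beta = x_i / R."""
    return x_i / R

def tilt_peak(beta, a, n_i=1.0):
    """Eq. (8-20): B_t = n_i * a * beta."""
    return n_i * a * beta

def tilt_aberration(rho, theta, B_t):
    """Eq. (8-19): W(rho, theta) = B_t * rho * cos(theta)."""
    return B_t * rho * math.cos(theta)

beta = tilt_angle(x_i=0.01, R=200.0)     # transverse image shift of 0.01 (assumed)
B_t = tilt_peak(beta, a=10.0)
w_edge_plus = tilt_aberration(1.0, 0.0, B_t)        # pupil edge on the +x side
w_edge_minus = tilt_aberration(1.0, math.pi, B_t)   # pupil edge on the -x side
print(beta, B_t, w_edge_plus, w_edge_minus)
```

The aberration swings between $+B_t$ and $-B_t$ across the pupil, so its peak-to-valley
value is $2 B_t$.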
In the case of lateral color, the wavefronts are spherical, but their centers of curvature
lie at a higher height from the optical axis for the longer wavelength. Again choosing the
red wavefront as the reference sphere, the wavefront tilt aberration due to a lateral color
of $\delta h_c'$ is given by

$$W_t(r, \theta) = n_i \frac{\delta h_c'}{R} r \cos\theta . \tag{8-21}$$

8.5 ABERRATIONS OF A ROTATIONALLY SYMMETRIC SYSTEM
In this section, we obtain the form of the aberration terms for a rotationally
symmetric system, expand an aberration function in terms of them, and discuss the
primary aberrations. The aberration terms are discussed with and without their explicit
dependence on the object coordinates. The aberration function of a system is also
expanded in terms of Zernike polynomials, which are in widespread use in optical design
and testing.
8.5.1 Explicit Dependence on Object Coordinates
Consider a rotationally symmetric optical system imaging a point object P. The axis
of rotational symmetry, namely, the optical axis, lies along the z axis. Let the position
vector of the object point be $\vec{h}$ with rectangular coordinates $(x_o, y_o)$ in a plane
orthogonal to the optical axis. Similarly, let $\vec{r}$ be the position vector of a point with
rectangular coordinates $(x, y)$ in the plane of the exit pupil of the system, which is also
orthogonal to the optical axis. The origins of $(x_o, y_o)$ and $(x, y)$ lie on the optical
axis, and we assume, for example, that the $x_o$ and x axes are coplanar.

Because the pupils of optical systems are generally circular, it is convenient to use
polar coordinates. Let $(h, \theta_o)$ and $(r, \theta)$ be the polar coordinates corresponding
to the rectangular coordinates $(x_o, y_o)$ and $(x, y)$ of the object and pupil points,
respectively, where

$$(x_o, y_o) = h (\cos\theta_o, \sin\theta_o) \tag{8-22}$$

and

$$(x, y) = r (\cos\theta, \sin\theta) . \tag{8-23}$$

Now, quantities that are invariant under rotation of the optical system about its axis of
symmetry are the three scalars $h$, $r$, and $\vec{h} \cdot \vec{r}$, where

$$h = |\vec{h}| = \left( x_o^2 + y_o^2 \right)^{1/2} , \tag{8-24a}$$

$$r = |\vec{r}| = \left( x^2 + y^2 \right)^{1/2} , \tag{8-24b}$$

and

$$\vec{h} \cdot \vec{r} = h r \cos(\theta - \theta_o) \tag{8-25a}$$

$$= x_o x + y_o y . \tag{8-25b}$$

In order that the aberration function consist of terms with positive integral powers of
the four rectangular coordinates, it must depend on the first two through $h^2$ and $r^2$. If
we rotate the system about the optical axis by a certain angle, the aberration function must
not change. We note that this is indeed the case. As the system rotates, so do the x and y
axes in each plane. Both $\theta$ and $\theta_o$ change by the angle of rotation, but h, r, and
$\theta - \theta_o$ do not change. Thus, because of rotational symmetry, the aberration function
depends on the four variables $(x_o, y_o)$ and $(x, y)$ only through the three combinations
$h^2$, $r^2$, and $h r \cos(\theta - \theta_o)$. These combinations are called the rotational
invariants of the aberration function of an optical imaging system with an axis of rotational
symmetry.
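
The invariance of the three combinations is easy to confirm numerically. The sketch below
rotates an object point and a pupil point by the same angle and recomputes $h^2$, $r^2$, and
$\vec{h} \cdot \vec{r}$; the coordinates, the rotation angle, and the helper names are
assumptions made for illustration.

```python
import math

# Sketch: h^2, r^2, and h.r are unchanged when the object and pupil
# coordinates are rotated together about the optical axis.
# Names and numbers are illustrative assumptions.

def rotate(point, angle):
    """Rotate a 2D point about the origin by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])

def invariants(obj, pupil):
    """Return (h^2, r^2, h.r) of Eqs. (8-24) and (8-25b)."""
    h2 = obj[0] ** 2 + obj[1] ** 2
    r2 = pupil[0] ** 2 + pupil[1] ** 2
    dot = obj[0] * pupil[0] + obj[1] * pupil[1]
    return h2, r2, dot

obj = (1.5, -0.7)      # (x_o, y_o), assumed
pupil = (0.4, 2.2)     # (x, y), assumed
angle = math.radians(37.0)

before = invariants(obj, pupil)
after = invariants(rotate(obj, angle), rotate(pupil, angle))
print(before)
print(after)
```

The two printed triples should match to rounding error, whereas the individual coordinates
change under the rotation.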
Because of the rotational symmetry, there is no loss of generality if we assume for
simplicity that the point object lies along the $x_o$ axis so that $\theta_o = 0$, as in Figure
8-3. A power-series expansion of an aberration function of a system may be written

$$W(\vec{h}; \vec{r}) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \sum_{m=0}^{\infty} C_{ijm} \left( h^2 \right)^i \left( r^2 \right)^j \left( h r \cos\theta \right)^m$$

$$= \sum_{l=0}^{\infty} \sum_{n=0}^{\infty} \sum_{m=0}^{n} {}_l a_{nm} h'^{\,l} r^n \cos^m\theta , \tag{8-26}$$

where $C_{ijm}$ and ${}_l a_{nm}$ are the expansion coefficients, i, j, and m are positive
integers including zero, $2i + m = l$ and $2j + m = n$, and we have written the object height h
in terms of the image height $h'$. The subscripts on the coefficients ${}_l a_{nm}$ represent
the powers of $h'$, r, and $\cos\theta$, respectively. Note that $n - m = 2j$ is positive and
even. It is evident that the order of an aberration term, i.e., its degree $l + n$ in the
object and pupil coordinates, is even. The number of terms of a given order k is
$(k + 2)(k + 4)/8$; for example, there are six terms of fourth order.
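
The index bookkeeping of Eq. (8-26) can be checked by brute-force enumeration. The sketch
below lists the index triples $(l, n, m)$ of all terms of a given even order $k = l + n$,
using $l = 2i + m$ and $n = 2j + m$; counting them suggests that a single order k contributes
$(k + 2)(k + 4)/8$ terms. The function name is hypothetical, and the count is a result of
this enumeration rather than a statement quoted from the text.

```python
# Sketch: enumerate the terms h'^l r^n cos^m(theta) of Eq. (8-26) of a given order.
# The function name is a hypothetical helper for this illustration.

def terms_of_order(k):
    """Return the (l, n, m) triples with l + n = k, l = 2i + m, n = 2j + m."""
    if k < 0 or k % 2 != 0:
        raise ValueError("the order k must be a non-negative even integer")
    half = k // 2                      # i + j + m = k / 2
    terms = []
    for i in range(half + 1):
        for j in range(half + 1 - i):
            m = half - i - j
            terms.append((2 * i + m, 2 * j + m, m))
    return terms

fourth_order = terms_of_order(4)
# Terms with n = 0 do not depend on the pupil coordinates; dropping them
# leaves the terms that can contribute to an aberration.
pupil_dependent = [t for t in fourth_order if t[1] > 0]
print(fourth_order)
print(pupil_dependent)
```

For k = 4 the enumeration yields six terms, one of which is independent of r; the remaining
five have the index patterns of the primary aberrations discussed next.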
Because the aberration associated with the chief ray (for which $r = 0$) is zero, the
zero-degree term and those varying as $h'^{2i}$ but without any dependence on r must be
zero. The second-degree terms are also zero. For example, the term varying as $r^2$
represents a defocus aberration that is independent of $h'$. It must be zero, because
otherwise the Gaussian image point with respect to which the aberration function is
defined must be incorrect. Similarly, the term varying as $h' r \cos\theta$ must also be zero,
because otherwise it implies a transverse shift of the image point, or the image height
being different from $h'$.
The first nonzero aberration terms are of fourth order, i.e., those for which $k = 4$.
They are called the primary or the Seidel aberrations. The primary aberration function
may be written

$$W_P(h'; r, \theta) = {}_0 a_{40} r^4 + {}_1 a_{31} h' r^3 \cos\theta + {}_2 a_{22} h'^2 r^2 \cos^2\theta + {}_2 a_{20} h'^2 r^2 + {}_3 a_{11} h'^3 r \cos\theta . \tag{8-27}$$

The values of the aberration coefficients depend on the construction parameters of the
system, such as the radii of curvature of its surfaces, the refractive indices of the spaces
between them, and the values of their spacing. The coefficients ${}_0 a_{40}$, ${}_1 a_{31}$,
${}_2 a_{22}$, ${}_2 a_{20}$, and ${}_3 a_{11}$ represent the coefficients of spherical
aberration, coma, astigmatism, field
curvature, and distortion, respectively. The primary aberrations are listed in Table 8-1.
They can be determined from the Gaussian imaging characteristics of a system without
tracing rays [2]. They can also be determined from the ray-trace data of the paraxial chief
and marginal rays.
We note that the dependence of the field curvature term on the pupil coordinates is
just like the defocus aberration discussed in Section 8.3. Thus, this term is a defocus
whose coefficient varies quadratically with the height of the point object. It can be
eliminated by observing the image of a planar object on a curved surface (typically
spherical, as discussed in Section 9.3.3), thus the name field curvature. Similarly, the
dependence of the distortion term on the pupil coordinates is just like the wavefront tilt
aberration discussed in Section 8.4. Therefore, this term is a wavefront tilt aberration
whose coefficient varies cubically with the height of the point object. Accordingly, the
image of a point object in the presence of distortion is perfect, but it is transversally
displaced from the Gaussian image point; the displacement depends on the height of the
point object. The reason for the name distortion becomes clear when the image of an
extended object is considered. (For an example, see Section 9.3.4, where the distorted
image of a square grid is considered.)

Table 8-1. Primary or Seidel aberrations.

Aberration Term                              Aberration Term                  Aberration
${}_l a_{nm} h'^{\,l} r^n \cos^m\theta$      $a_{nm} \rho^n \cos^m\theta$     Name*

${}_0 a_{40} r^4$                            $a_{40} \rho^4$                  Spherical
${}_1 a_{31} h' r^3 \cos\theta$              $a_{31} \rho^3 \cos\theta$       Coma
${}_2 a_{22} h'^2 r^2 \cos^2\theta$          $a_{22} \rho^2 \cos^2\theta$     Astigmatism
${}_2 a_{20} h'^2 r^2$                       $a_{20} \rho^2$                  Field curvature
${}_3 a_{11} h'^3 r \cos\theta$              $a_{11} \rho \cos\theta$         Distortion

*The word “primary” is to be associated with these names, e.g., primary spherical.

8.5.2 No Explicit Dependence on Object Coordinates

For a system imaging a given point object, the aberration terms may be written so
that their explicit dependence on the image height $h'$ is suppressed. We may also let

$$\rho = r / a , \quad 0 \le \rho \le 1 , \tag{8-28}$$

where a is the radius of the exit pupil of the system. Combining the aberration terms that
have different dependencies on the object coordinates but the same dependence on pupil
coordinates so that there is only one term for each pair of (n, m) values, the aberration
terms may also be written in the form $a_{nm} \rho^n \cos^m\theta$, where n and m are positive
integers, including zero, and $n - m \ge 0$ and even. Each aberration coefficient $a_{nm}$
depends on the image height $h'$, and because $0 \le \rho \le 1$ and $|\cos\theta| \le 1$, it
represents the peak value or half of the peak-to-valley value of the corresponding aberration
term, depending on whether m is even or odd, respectively.
The primary aberrations written in this simplified form are also listed in Table 8-1.
They correspond to terms with $n + m \le 4$. The primary aberration function of Eq. (8-27)
may be written in terms of these coefficients in the form

$$W_P(\rho, \theta) = a_{11} \rho \cos\theta + a_{20} \rho^2 + a_{22} \rho^2 \cos^2\theta + a_{31} \rho^3 \cos\theta + a_{40} \rho^4 , \tag{8-29}$$

where

$$a_{11} = {}_3 a_{11} h'^3 a = a_t h'^3 a = A_t , \tag{8-30a}$$

$$a_{20} = {}_2 a_{20} h'^2 a^2 = a_d h'^2 a^2 = A_d , \tag{8-30b}$$

$$a_{22} = {}_2 a_{22} h'^2 a^2 = a_a h'^2 a^2 = A_a , \tag{8-30c}$$

$$a_{31} = {}_1 a_{31} h' a^3 = a_c h' a^3 = A_c , \tag{8-30d}$$

and

$$a_{40} = {}_0 a_{40} a^4 = a_s a^4 = A_s , \tag{8-30e}$$

and we have introduced aberration coefficients $a_i$ and $A_i$ with abbreviated notation.
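
The equivalence of the two forms of the primary aberration function can be verified
numerically. In the sketch below, the coefficient values in `SEIDEL`, the image height, the
pupil radius, and the function names are all assumptions made for illustration; only the
structure of Eqs. (8-27), (8-29), and (8-30) comes from the text.

```python
import math

# Sketch: Eq. (8-27) in (h', r, theta) versus Eq. (8-29) in (rho, theta),
# linked by Eqs. (8-30a)-(8-30e). Coefficient values are hypothetical.
SEIDEL = {"0a40": 2.0e-6, "1a31": -1.5e-6, "2a22": 4.0e-6,
          "2a20": 3.0e-6, "3a11": -2.5e-6}

def primary_explicit(h, r, theta, c=SEIDEL):
    """Eq. (8-27), with explicit dependence on the image height h'."""
    cos_t = math.cos(theta)
    return (c["0a40"] * r**4
            + c["1a31"] * h * r**3 * cos_t
            + c["2a22"] * h**2 * r**2 * cos_t**2
            + c["2a20"] * h**2 * r**2
            + c["3a11"] * h**3 * r * cos_t)

def combined_coefficients(h, a, c=SEIDEL):
    """Eqs. (8-30a)-(8-30e): peak coefficients for a pupil of radius a."""
    return {"a11": c["3a11"] * h**3 * a,
            "a20": c["2a20"] * h**2 * a**2,
            "a22": c["2a22"] * h**2 * a**2,
            "a31": c["1a31"] * h * a**3,
            "a40": c["0a40"] * a**4}

def primary_normalized(rho, theta, A):
    """Eq. (8-29), in the normalized radial coordinate rho = r / a."""
    cos_t = math.cos(theta)
    return (A["a11"] * rho * cos_t + A["a20"] * rho**2
            + A["a22"] * rho**2 * cos_t**2
            + A["a31"] * rho**3 * cos_t + A["a40"] * rho**4)

h_image, a = 2.0, 10.0                 # assumed image height and pupil radius
r, theta = 7.0, math.radians(40.0)     # assumed pupil point
A = combined_coefficients(h_image, a)
w_explicit = primary_explicit(h_image, r, theta)
w_normalized = primary_normalized(r / a, theta, A)
print(w_explicit, w_normalized)
```

The two evaluations should agree to rounding error, because the factors of a absorbed into
the coefficients exactly compensate the normalization of r.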
Comparing the distortion term given in Table 8-1 with the wavefront tilt aberration
given by Eq. (8-19), we note that although the two are similar in their dependence on the
pupil coordinates, their coefficients depend on the image height differently. The
distortion coefficient $a_{11}$ (or $A_t$) varies with $h'$ as $h'^3$, but the tilt coefficient
$B_t$ is independent of $h'$. Similarly, comparing the field curvature term with the defocus
wave aberration given by Eq. (8-12b), we note that their dependence on the pupil coordinates
is the same. However, whereas the field curvature coefficient $a_{20}$ (or $A_d$) varies with
$h'$ as $h'^2$, the defocus coefficient $B_d$ is independent of $h'$. It is for these reasons
that we have used a different symbol, namely, B, for the defocus and tilt coefficients,
compared to the symbol A for the field curvature and distortion coefficients.
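The aberrations of sixth order, i.e., for which $k = 6$, are called the secondary or the
Schwarzschild aberrations. The aberration function through the sixth order aberrations,
i.e., for $k \le 6$, or $n + m \le 6$, may be written

$$W_S(\rho, \theta) = a_{11} \rho \cos\theta + a_{20} \rho^2 + a_{22} \rho^2 \cos^2\theta + a_{31} \rho^3 \cos\theta + a_{33} \rho^3 \cos^3\theta + a_{40} \rho^4 + a_{42} \rho^4 \cos^2\theta + a_{51} \rho^5 \cos\theta + a_{60} \rho^6 , \tag{8-31}$$

where

$$a_{11} = \left( {}_3 a_{11} h'^3 + {}_5 a_{11} h'^5 \right) a , \tag{8-32a}$$

$$a_{20} = \left( {}_2 a_{20} h'^2 + {}_4 a_{20} h'^4 \right) a^2 , \tag{8-32b}$$

$$a_{22} = \left( {}_2 a_{22} h'^2 + {}_4 a_{22} h'^4 \right) a^2 , \tag{8-32c}$$

$$a_{31} = \left( {}_1 a_{31} h' + {}_3 a_{31} h'^3 \right) a^3 , \tag{8-32d}$$

$$a_{33} = {}_3 a_{33} h'^3 a^3 , \tag{8-32e}$$

$$a_{40} = \left( {}_0 a_{40} + {}_2 a_{40} h'^2 \right) a^4 , \tag{8-32f}$$

$$a_{42} = {}_2 a_{42} h'^2 a^4 , \tag{8-32g}$$

$$a_{51} = {}_1 a_{51} h' a^5 , \tag{8-32h}$$

$$a_{60} = {}_0 a_{60} a^6 . \tag{8-32i}$$
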
Written in this form, the aberration function has nine aberration terms through the sixth
order. For convenience, the values of the indices n and m, and the combined aberration
terms along with their names, are listed in Table 8-2. Because the dependence of an
aberration term on the image height $h'$ is contained in the aberration coefficient $a_{nm}$,
it should be noted that the primary aberrations (including distortion and field curvature
terms) are not the same as those discussed earlier because they contain aberration
components not only of the fourth degree, but the sixth degree as well. For example,
$a_{40} \rho^4$ consists of spherical and lateral spherical aberrations ${}_0 a_{40} a^4 \rho^4$
and ${}_2 a_{40} h'^2 a^4 \rho^4$.
Similarly, the aberration function through the eighth order can be written by
combining the primary, secondary, and the tertiary aberrations [2].
8.6 ADDITIVITY OF PRIMARY ABERRATIONS

8.6.1 Introduction
As discussed in Chapter 2, the Gaussian image of an object formed by a multisurface
system can be obtained by sequentially determining the image formed by one surface and
treating this image as the object for the next surface. By its definition, the Gaussian image
is aberration free. In practice, we use the principal points of the system to determine the
image in one step using the Gaussian imaging equation. The object and the (final) image
distances are measured from the respective principal points.
Table 8-2. Combined primary and secondary aberrations; i ≤ 6, n + m ≤ 6.

n    m    Aberration Term                  Aberration Name
          $a_{nm} \rho^n \cos^m\theta$

1    1    $a_{11} \rho \cos\theta$         Distortion
2    0    $a_{20} \rho^2$                  Field curvature
2    2    $a_{22} \rho^2 \cos^2\theta$     Primary astigmatism
3    1    $a_{31} \rho^3 \cos\theta$       Primary coma
3    3    $a_{33} \rho^3 \cos^3\theta$     Elliptical coma (arrows)
4    0    $a_{40} \rho^4$                  Primary spherical
4    2    $a_{42} \rho^4 \cos^2\theta$     Secondary astigmatism
5    1    $a_{51} \rho^5 \cos\theta$       Secondary coma
6    0    $a_{60} \rho^6$                  Secondary spherical

To determine the aberrations of a system for imaging a certain point object, we must
trace the object rays through the system and then determine their wave aberrations as the
differences in their optical path lengths in reaching the Gaussian reference sphere (with
its center of curvature at the Gaussian image point passing through the center of the exit
pupil of the system) from a certain reference ray. Typically, this ray is the chief ray that
passes through the center of the exit pupil. The wave aberrations represent the separations
of the wavefront from the reference sphere along the rays. The corresponding transverse
ray aberrations represent the separations of the rays in the Gaussian image plane from the
Gaussian image point.

Only the Gaussian imaging parameters of an imaging surface are needed to
determine its primary wave aberrations [2]. We show that the primary wave aberrations
of a multisurface system can be obtained by adding the primary wave aberrations of the
surfaces, where the object for each surface is the Gaussian image point formed by the
previous surface. The aberrations of a surface under this assumption are called its
intrinsic aberrations. We also show that the transverse ray aberrations of the surfaces
determined in this manner cannot be added to obtain them for the system. They must be
obtained from the additive primary aberrations of the system using the Gaussian imaging
parameters of the final image. Of course, the contribution of a surface to the ray
aberrations of the final image can be obtained term by term from the wave aberration sum
of the system. The secondary and higher-order aberrations cannot be obtained by adding
them for the surfaces obtained for point objects because the aberration up to a certain
order of the image formed by a surface must be taken into account to determine its
aberrations of the next higher order. The additional wave aberrations for each surface
owing to the aberrations of the previous surface are referred to as its extrinsic
aberrations.
8.6.2 Primary Wave Aberrations
Consider a system consisting of two refracting surfaces separating media of
refractive indices $n_0$, $n_1$, and $n_2$, with their vertices at V1 and V2, respectively, forming
the image of an axial point object P0 , as in Figure 8-7. The first surface forms the
Gaussian image of P0 at P1 . The image P1 becomes the object for the second surface,
which, in turn, forms its Gaussian image at point P2 . Of course, P2 is also the Gaussian
image of P0 formed by the two-surface system. In reality, because the image P1 is
generally not a point due to the aberrations of the first surface, the aberrations of the
second surface will be different than if it were a point. Thus, the wave aberrations of the
two surfaces will generally not be additive. We show that the primary aberrations of the
two surfaces are indeed additive, i.e., the primary aberration of the system in forming the
image P2 is equal to the sum of the primary aberration of the image P1 formed by the first
surface, and the primary aberration of the image P2 of point object P1 formed by the
second surface.
Figure 8-7. Wave aberration $[A_3Q]$ and transverse ray aberration $P_2A_4$ of a ray
$P_0A_1$ originating at a point object $P_0$ when imaged by a system consisting of two
refracting surfaces separating media of refractive indices $n_0$, $n_1$, and $n_2$. $P_1$ is the
Gaussian image of $P_0$ formed by the first surface, and $P_2$ is the Gaussian image of
$P_1$ formed by the second surface.

Consider a ray $P_0A_1$ from the axial point object $P_0$ incident on the first surface at a
point $A_1$. Let the refracted ray intersect the first Gaussian image plane at $A_2$, where $P_1A_2$
represents the transverse aberration of the ray produced by the first surface. The wave
aberration of the ray, namely, its primary spherical aberration, is given by
$$W_1(r_1) = [P_0A_1P_1] - [P_0V_1P_1] = a_1 r_1^4 + O(r_1^6) , \tag{8-33}$$
where the square brackets indicate an optical path length, $r_1$ is the distance of point $A_1$
from the optical axis, and $a_1$ is the coefficient of spherical aberration. This coefficient
depends on the shape of the refracting surface, the object distance $D_{01}$, and the refractive
indices $n_0$ and $n_1$ [2]. Similarly, the wave aberration of the Gaussian image $P_2$ of the
point object $P_1$ formed by the second surface is given by
$$W_2(r_2) = [P_1A_3P_2] - [P_1V_2P_2] = a_2 r_2^4 + O(r_2^6) , \tag{8-34}$$
where $r_2$ is the distance of point $A_3$ from the optical axis, and $a_2$ is the coefficient of its
spherical aberration that depends on the shape of the refracting surface, object distance
$D_{23}$, and the refractive indices $n_1$ and $n_2$. The distances $r_1$ and $r_2$ are approximately
related to each other according to
$$r_2 = r_1 \frac{D_{23}}{D_{12}} . \tag{8-35}$$
The wave aberration $W_s$ of the system in forming the image $P_2$ of the point object $P_0$ can
be written
$$W_s(r_1, r_2) = [P_0A_1P_1A_3P_2] - [P_0V_1P_1V_2P_2] \tag{8-36}$$
$$= \{[P_0A_1P_1] - [P_0V_1P_1]\} + \{[P_1A_3P_2] - [P_1V_2P_2]\}$$
$$= a_1 r_1^4 + O(r_1^6) + a_2 r_2^4 + O(r_2^6) , \tag{8-37}$$
thus demonstrating the additivity of the primary wave aberrations. Using Eq. (8-35), Eq.
(8-37) can be written in terms of a single variable $r_2$ in the form
$$W_s(r_2) = a_s r_2^4 + O(r_2^6) , \tag{8-38}$$
where
$$a_s = a_1 \left(\frac{D_{12}}{D_{23}}\right)^4 + a_2 \tag{8-39}$$
is the coefficient of spherical aberration for the system.
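The step from Eq. (8-37) to Eqs. (8-38) and (8-39) can be checked numerically. The following is a minimal sketch, assuming only the primary (fourth-order) terms are kept; the function names and all numerical values are hypothetical, chosen for illustration, and the sign conventions for the distances are ignored.

```python
def system_spherical_coefficient(a1, a2, d12, d23):
    """Eq. (8-39): a_s = a1 * (D12 / D23)**4 + a2."""
    return a1 * (d12 / d23) ** 4 + a2


def summed_primary_aberration(a1, a2, r1, d12, d23):
    """Primary terms of Eq. (8-37), with r2 obtained from Eq. (8-35)."""
    r2 = r1 * d23 / d12
    return a1 * r1 ** 4 + a2 * r2 ** 4, r2


# Hypothetical values in arbitrary units (not taken from the text).
a1, a2, d12, d23, r1 = 2.0e-3, -5.0e-4, 10.0, 4.0, 0.5

w_sum, r2 = summed_primary_aberration(a1, a2, r1, d12, d23)
a_s = system_spherical_coefficient(a1, a2, d12, d23)
w_single = a_s * r2 ** 4  # Eq. (8-38) without the O(r2^6) remainder

print(w_sum, w_single)  # the two values agree to rounding error
```

Writing the sum in terms of $r_2$ alone only rescales the first-surface coefficient by $(D_{12}/D_{23})^4$; no new information enters.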
Now, according to Fermat’s principle, the difference in the optical path lengths of the
actual and virtual rays is of second order in the transverse distance between them. Thus,
$$[A_1P_1A_3] - [A_1A_2A_3] \sim (P_1A_2)^2 = O(r_2^6) , \tag{8-40}$$
where $P_1A_2 \sim r_1^3 \sim r_2^3$ represents the transverse aberration of the ray for the first surface.
Therefore, by adding the optical path length of the incident ray $P_0A_1$ in Eq. (8-40), we
may write
$$[P_0A_1P_1A_3] - [P_0A_1A_2A_3] = O(r_2^6) . \tag{8-41}$$
Substituting Eq. (8-41) into Eq. (8-36), we may write
$$W_s(r_2) = [P_0A_1A_2A_3P_2] - [P_0V_1P_1V_2P_2] + O(r_2^6) . \tag{8-42}$$
Next, consider the Gaussian reference sphere of radius $P_2A_3$ passing through a point
$B$ on the optical axis. Thus,
$$[A_3P_2] = [BP_2] . \tag{8-43}$$
Let the wavefront $W$ passing through the point $B$ intersect the actual ray $A_3A_4$ at $Q$.
Then, by definition, the wave aberration associated with the ray is $[A_3Q]$. It is
numerically negative because the optical path length $[P_0A_1A_2A_3]$ to reach the reference
sphere is smaller than the corresponding optical path length $[P_0V_1P_1V_2B]$ of the reference
ray. Moreover, by definition of the wavefront,
$$[P_0A_1A_2A_3Q] = [P_0V_1P_1V_2B] . \tag{8-44}$$
Adding Eqs. (8-43) and (8-44), we obtain
$$[P_0A_1A_2A_3Q] + [A_3P_2] = [P_0V_1P_1V_2B] + [BP_2] ,$$
or, assuming $Q$ to lie on the virtual path $A_3P_2$, we can write
$$[P_0A_1A_2A_3P_2] + [A_3Q] = [P_0V_1P_1V_2BP_2] , \tag{8-45}$$
or
$$[A_3Q] = [P_0A_1A_2A_3P_2] - [P_0V_1P_1V_2BP_2] = W_s(r_2) + O(r_2^6) . \tag{8-46}$$
Thus, in view of Eq. (8-38), the primary wave aberration associated with a ray is equal to
the sum of the primary aberrations associated with it for each of the two surfaces of the
system.
Now let us take into account the aberration difference between the point $Q$ lying on
the virtual ray $A_3P_2$ and the actual ray $A_3A_4$. If we let $Q_1$ and $Q_2$ be the points where
the wavefront intersects the two rays, the aberration difference $[A_3Q_2] - [A_3Q_1]$ is
proportional to the optical path difference between the two rays, which from Fermat’s
principle is of second order in the transverse distance between them. This difference is
proportional to $[P_2A_4]^2$, which, in turn, is proportional to $r^6$. Thus, the primary wave
aberration of the two-surface system, being equal to the sum of the primary aberrations of
the two surfaces, is also valid for $Q$ lying on the actual ray $A_3A_4$. This result can be
generalized to a system consisting of any number of refracting and/or reflecting surfaces,
thus establishing the additivity theorem for primary wave aberrations.
It should also be clear that the primary aberrations cannot describe the exact wave
aberrations because the rays do not pass through the Gaussian image point formed by a
surface unless the image formed is indeed aberration free. To determine the exact wave
aberration, the optical path length of a ray must be determined by tracing it exactly from
the object plane to the Gaussian reference sphere of the system, and then compared with
that of the chief ray or some other reference ray.
8.6.3 Transverse Ray Aberrations
The ray aberration of the image formed by the first surface in Figure 8-7 is $P_1A_2$,
given by
$$P_1A_2 = \frac{D_{12}}{n_1} \frac{\partial W_1}{\partial r_1} . \tag{8-47}$$
It is numerically negative because the point $A_2$, where the ray intersects the first
Gaussian image plane, lies below the optical axis. If we consider a ray $P_1A_3$ originating at
the point object $P_1$ (which is the Gaussian image of the point object $P_0$ formed by the
first surface), it is refracted as a ray $A_3P_2'$. The transverse ray aberration associated with
this ray is $P_2P_2'$, given by
$$P_2P_2' = \frac{D_{34}}{n_2} \frac{\partial W_2}{\partial r_2} . \tag{8-48}$$
The ray aberration for the system associated with the ray $P_0A_1$ incident on the first
surface from the point object $P_0$ is $P_2A_4$. It is given by
$$P_2A_4 = \frac{D_{34}}{n_2} \frac{\partial W_s}{\partial r_2} . \tag{8-49}$$
It is numerically positive, because the point $A_4$ where the ray intersects the final image
plane is above the optical axis. It is not equal to the sum of the ray aberrations $P_1A_2$ and
$P_2P_2'$ of the two surfaces. In this sense, the ray aberrations are not additive.
Now, from Eq. (8-37), the primary wave aberration of the system can be written
$$W_s(r_1, r_2) = W_1(r_1) + W_2(r_2) . \tag{8-50}$$
Thus, Eq. (8-49) for the ray aberration of the system can be written
$$P_2A_4 = \frac{D_{34}}{n_2} \left( \frac{\partial W_1}{\partial r_2} + \frac{\partial W_2}{\partial r_2} \right) . \tag{8-51}$$
The first term on the right-hand side of Eq. (8-51) represents the ray aberration
contribution of the first surface, and the second term represents that of the second surface.
Note the difference between the first term and the ray aberration $P_1A_2$ given by Eq. (8-47).
Such a difference will occur for each surface of a system, except for the last. Thus,
whereas the primary wave aberrations of the surfaces of a system are additive, their ray
aberrations are not. The ray aberration of the system must be obtained from its wave
aberration, and not by adding the ray aberrations of its surfaces.
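The non-additivity of the ray aberrations can be illustrated with numbers. The sketch below assumes pure primary spherical aberration, $W_1 = a_1 r_1^4$ and $W_2 = a_2 r_2^4$, and evaluates Eqs. (8-47) through (8-51). All names and values are hypothetical, and the signs of the distances are not intended to follow the sign convention of the text.

```python
def ray_aberrations(a1, a2, r1, d12, d23, d34, n1, n2):
    """Return (P1A2, P2P2', P2A4, first-surface term of Eq. (8-51))."""
    r2 = r1 * d23 / d12                          # Eq. (8-35)
    p1a2 = (d12 / n1) * 4.0 * a1 * r1 ** 3       # Eq. (8-47) with W1 = a1 r1^4
    p2p2_prime = (d34 / n2) * 4.0 * a2 * r2 ** 3  # Eq. (8-48) with W2 = a2 r2^4
    a_s = a1 * (d12 / d23) ** 4 + a2             # Eq. (8-39)
    p2a4 = (d34 / n2) * 4.0 * a_s * r2 ** 3      # Eq. (8-49) with Ws = a_s r2^4
    first_term = p2a4 - p2p2_prime               # (D34/n2) dW1/dr2 in Eq. (8-51)
    return p1a2, p2p2_prime, p2a4, first_term


# Hypothetical values in arbitrary units (not taken from the text).
p1a2, p2p2_prime, p2a4, first_term = ray_aberrations(
    a1=2.0e-3, a2=-5.0e-4, r1=0.5,
    d12=10.0, d23=4.0, d34=8.0, n1=1.5, n2=1.0,
)

print("sum of surface ray aberrations:", p1a2 + p2p2_prime)
print("system ray aberration P2A4:    ", p2a4)
print("first term of Eq. (8-51):      ", first_term, "vs P1A2:", p1a2)
```

For these values the sum of the two surface ray aberrations differs from the system value, while the two terms of Eq. (8-51) add up to it by construction.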
8.6.4 Off-Axis Point Object
So far, we have considered the imaging of an axial point object. For an off-axis point
object, additional primary aberrations, namely, coma, astigmatism, field curvature, and
distortion, appear. Of course, by definition of a primary aberration, they are all of fourth
order in the pupil and object coordinates. However, the same reasoning applies to show
that the primary aberrations are additive, i.e., the optical path length difference between a
real ray and a virtual ray is of second order in the transverse ray aberration. Because each
(primary) ray aberration is of third order, its square is of sixth order. Thus, the primary
wave aberrations are additive also for an off-axis point object.
8.6.5 Higher-Order Aberrations
Just as we can calculate the primary aberration of the image of a point object formed
by an imaging surface, we can also calculate its secondary and higher-order intrinsic
aberrations. However, adding the intrinsic secondary aberrations of the surfaces will not
yield the correct secondary aberrations of the system. The reason is simple: the image
formed by the previous surface, instead of being a point, is actually a spot diagram
resulting from its primary aberrations. The primary aberrations of the image formed by
this surface must be taken into account to determine the extrinsic secondary aberrations
of the next surface. The sum of the intrinsic and extrinsic secondary aberrations of a
surface yields its total secondary aberrations. The sum of the total secondary aberrations
of the surfaces yields the correct secondary aberrations of the system. Similarly, the
primary and secondary aberrations of an image formed by a surface must be taken into
account to determine the tertiary aberrations of the next surface, and one then adds the
aberrations of the surfaces to determine them for the system, and so on.
8.7 STREHL RATIO AND ABERRATION BALANCING

8.7.1 Strehl Ratio
As illustrated in Figure 8-1, a spherical wave emanating from a point object and
incident on an imaging system exits as a spherical wave from its exit pupil converging to
the Gaussian image point if the system is aberration free. The irradiance in the
image plane is maximum at the Gaussian image point, given by $P S_e / \lambda^2 R^2$, where $P$ is
the total power in the image, $S_e$ is the area of the exit pupil, $\lambda$ is the wavelength of
object radiation, and $R$ is the radius of curvature of the exiting spherical wavefront. The
Huygens secondary wavelets originating on the spherical wavefront are all in phase and
interfere constructively at this point. With a few exceptions, the wave exiting from
practical systems is only approximately spherical, and its differences from the spherical
wavefront constitute aberrations that reduce the irradiance at the Gaussian image point.
The ratio of the irradiances with and without aberration is called the Strehl ratio of an
image. For small aberrations, its value is approximately given by [3]
$$S \simeq \exp\left(-\sigma_\Phi^2\right) , \tag{8-52}$$
where $\sigma_\Phi^2$ is the variance of the phase aberration across the exit pupil. The variance is
given by
$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 , \tag{8-53}$$
where the mean and the mean square values of the aberration are obtained from the
expression
$$\langle \Phi^n \rangle = \pi^{-1} \int_0^1 \int_0^{2\pi} \Phi^n(\rho, \theta)\, \rho\, d\rho\, d\theta , \tag{8-54}$$
with $n = 1$ and 2, respectively. It is assumed that the amplitude across the pupil is
uniform, which would otherwise act as the weighting function in the integral in Eq. (8-54).
For a high-quality imaging system, a typical value of the Strehl ratio desired is 0.8,
corresponding to a wave aberration with $\sigma_w = \lambda/14$, where $\sigma_w = (\lambda/2\pi)\, \sigma_\Phi$ is the
standard deviation of the wave aberration, or the wavefront sigma.
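As a quick numerical illustration of Eq. (8-52), the following sketch converts a wavefront sigma in waves to a phase sigma and evaluates the approximate Strehl ratio. The function names are hypothetical; the only relations assumed are Eq. (8-52) and $\sigma_w = (\lambda/2\pi)\,\sigma_\Phi$.

```python
import math


def strehl_ratio(sigma_w_waves):
    """Approximate Strehl ratio of Eq. (8-52) for a wavefront sigma in waves."""
    sigma_phi = 2.0 * math.pi * sigma_w_waves  # phase sigma in radians
    return math.exp(-sigma_phi ** 2)


def sigma_w_for_strehl(strehl):
    """Wavefront sigma (in waves) that gives the requested Strehl ratio."""
    return math.sqrt(-math.log(strehl)) / (2.0 * math.pi)


print(strehl_ratio(1.0 / 14.0))        # close to 0.8
print(1.0 / sigma_w_for_strehl(0.8))   # close to 14
```

Under this approximation a sigma of $\lambda/14$ gives a Strehl ratio slightly above 0.8, which is why $\lambda/14$ is quoted as the typical requirement.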
8.7.2 Aberration Balancing

Because the Strehl ratio increases as the variance of an aberration decreases, we mix
an aberration with aberrations of the same or lower orders to decrease its variance. Such a
process of balancing a higher-order aberration with one or more aberrations of the same
and/or lower orders to minimize the variance is called aberration balancing. The
balancing of an aberration to reduce the geometrical ray spot size is discussed in Chapter
9.
Table 8-3 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or
Seidel) aberration, where its coefficient $A_i$ represents the peak value of the aberration. It
also lists the aberration tolerance, i.e., the value of the aberration coefficient $A_i$, for a
Strehl ratio of 0.8. The aberration tolerance listed in Table 8-3 is for the wave (as opposed
to the phase) aberration coefficient, as is customary in optics. It should be understood that
the tolerance numbers given are not accurate to the second decimal place. They are listed
as such for consistency only. Note that the dependence of the field curvature on $\rho$ as $\rho^2$
is the same as that for the defocus wave aberration. Similarly, the dependence of
distortion on $(\rho, \theta)$ as $\rho \cos\theta$ is the same as that for the wavefront tilt aberration.

Table 8-3. Standard deviation and tolerance for primary aberrations.

Aberration                   $\Phi(\rho, \theta)$          $\sigma_\Phi$                       $A_i$ for $S = 0.8$
Spherical                    $A_s \rho^4$                  $2A_s/(3\sqrt{5}) = A_s/3.35$       $\lambda/4.19$
Coma                         $A_c \rho^3 \cos\theta$       $A_c/(2\sqrt{2}) = A_c/2.83$        $\lambda/4.96$
Astigmatism                  $A_a \rho^2 \cos^2\theta$     $A_a/4$                             $\lambda/3.51$
Field curvature (defocus)    $A_d \rho^2$                  $A_d/(2\sqrt{3}) = A_d/3.46$        $\lambda/4.06$
Distortion (tilt)            $A_t \rho \cos\theta$         $A_t/2$                             $\lambda/7.03$
The variance of primary spherical aberration or astigmatism can be reduced by
balancing it with defocus aberration. Thus, for example, we balance primary spherical
aberration with defocus aberration and write it as
$$\Phi(\rho) = A_s \rho^4 + B_d \rho^2 . \tag{8-55}$$
The defocus aberration is introduced by making an observation in a defocused image
plane. The mean and the mean square value of the aberration function are given by
$$\langle \Phi \rangle = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} \left( A_s \rho^4 + B_d \rho^2 \right) \rho\, d\rho\, d\theta = \frac{A_s}{3} + \frac{B_d}{2} \tag{8-56}$$
and
$$\langle \Phi^2 \rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_s B_d}{2} . \tag{8-57}$$
The aberration variance is accordingly given by
$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 = \frac{4 A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_s B_d}{6} . \tag{8-58}$$
The value of defocus $B_d$ yielding minimum variance is obtained by letting
$$\frac{\partial \sigma_\Phi^2}{\partial B_d} = 0 , \tag{8-59}$$
and checking that it yields a minimum and not a maximum. Thus, we find that the
optimum value is $B_d = -A_s$, and the balanced aberration is given by
$$\Phi_{bs}(\rho) = A_s \left( \rho^4 - \rho^2 \right) . \tag{8-60}$$
Its standard deviation or sigma value is $A_s/(6\sqrt{5})$, which is a factor of 4 smaller than the
corresponding value $2A_s/(3\sqrt{5})$ for $B_d = 0$. Because the sigma value of the aberration has
been reduced by a factor of 4, its tolerance has been increased by the same factor. For
example, $S = 0.8$ is obtained in the Gaussian image plane for $A_s = \lambda/4$. However, the
same Strehl ratio is obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that
$B_d = -\lambda$. The defocused image plane lies at a distance $8\lambda F^2$ from the Gaussian image
plane.
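The balancing result can be verified with exact rational arithmetic. This is a minimal sketch, assuming a uniformly illuminated unit circular pupil so that the mean of $\rho^k$ over the pupil is $2/(k+2)$; the helper names are hypothetical.

```python
from fractions import Fraction as Fr


def moment(power):
    """Mean of rho**power over the unit disc: (1/pi) * integral = 2/(power + 2)."""
    return Fr(2, power + 2)


def variance(a_s, b_d):
    """Variance of A_s rho^4 + B_d rho^2 over the unit disc, Eq. (8-58)."""
    mean = a_s * moment(4) + b_d * moment(2)
    mean_square = (a_s ** 2 * moment(8) + b_d ** 2 * moment(4)
                   + 2 * a_s * b_d * moment(6))
    return mean_square - mean ** 2


# The variance is quadratic in B_d; read off its coefficients for A_s = 1.
quad = variance(Fr(0), Fr(1))                                   # 1/12
cross = variance(Fr(1), Fr(1)) - variance(Fr(1), Fr(0)) - quad  # 1/6
b_opt = -cross / (2 * quad)                                     # Eq. (8-59)

unbalanced = variance(Fr(1), Fr(0))   # 4/45
balanced = variance(Fr(1), b_opt)     # 1/180
print(b_opt, unbalanced, balanced, unbalanced / balanced)
```

The variance ratio of 16 corresponds to the factor of 4 reduction in the sigma value quoted above.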
Similarly, we balance astigmatism with defocus and coma with tilt. Table 8-4 lists
the forms of balanced primary aberrations, their standard deviations, and their tolerances
for a Strehl ratio of 0.8, according to Eq. (8-52). Also listed in the table is the location of
the diffraction focus, i.e., the point with respect to which the aberration variance is
minimum so that the Strehl ratio is maximum at this point. The amount of balancing
defocus is minus half the amount of astigmatism, or the diffraction focus lies at a distance
$4F^2 A_a$ from the Gaussian image plane along the $z$ axis. The balancing tilt in the case of
coma is minus two-thirds the amount of coma. Thus, the maximum Strehl ratio is
obtained at a point that is displaced from the Gaussian image point by $4FA_c/3$ but lies in
the Gaussian image plane. The balancing of higher-order aberrations can be considered in
a similar manner.
8.8 ZERNIKE CIRCLE POLYNOMIALS

8.8.1 Introduction
The Zernike circle polynomials are in widespread use because they are not only
orthogonal over a circular pupil, but also represent balanced classical aberrations that
yield minimum variance across the pupil [3]. In optical design, the rays from a point
object are traced through an optical system to determine the wave aberrations at an array
of points. In optical testing of a system, they are measured at an array of points. For a
system with a circular pupil, the Zernike polynomials are used to determine the content of
the aberration function formed by the array data values. When the aberration function is
expanded in terms of the Zernike polynomials, each polynomial represents a certain type
of aberration, and its coefficient represents its value. Because of the orthogonality of the
polynomials, their coefficients are independent of each other in the sense that their values
do not depend on the number of polynomials used in the expansion.
Table 8-4. Balanced primary aberrations and corresponding diffraction focus,
standard deviation, and aberration tolerance.

Balanced Aberration   $\Phi(\rho, \theta)$                                                                  Diffraction Focus*       $\sigma_\Phi$         $A_i$ for $S = 0.8$
Spherical             $A_s(\rho^4 - \rho^2)$                                                                $(0, 0, 8F^2 A_s)$       $A_s/(6\sqrt{5})$     $0.955\lambda$
Coma                  $A_c(\rho^3 - 2\rho/3)\cos\theta$                                                     $(4FA_c/3, 0, 0)$        $A_c/(6\sqrt{2})$     $0.604\lambda$
Astigmatism           $A_a \rho^2(\cos^2\theta - 1/2) = (A_a/2)\,\rho^2 \cos 2\theta$                       $(0, 0, 4F^2 A_a)$       $A_a/(2\sqrt{6})$     $0.349\lambda$

*The diffraction focus coordinates are relative to the Gaussian image point.

We normalize the radial coordinate $r$ of a point on the circular pupil by its radius $a$ so
that the maximum value of $\rho = r/a$ is unity. We refer to a pupil normalized in this
manner as a unit circular pupil.
8.8.2 Polynomials in Optical Design

The aberration function $W(\rho, \theta)$ for a system with a circular exit pupil can be
expanded in terms of a complete set of Zernike circle polynomials $Z_n^m(\rho, \theta)$ in the form
$$W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \tag{8-61}$$
where $c_{nm}$ is a Zernike expansion coefficient, and $n$ and $m$ are positive integers including
zero such that $n - m \ge 0$ and even. The radial and angular dependence of the polynomials
is given by
$$Z_n^m(\rho, \theta) = \left[ \frac{2(n+1)}{1 + \delta_{m0}} \right]^{1/2} R_n^m(\rho) \cos m\theta , \tag{8-62}$$
where
$$R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n-s)!}{s! \left( \frac{n+m}{2} - s \right)! \left( \frac{n-m}{2} - s \right)!}\, \rho^{n-2s} \tag{8-63}$$
is a radial polynomial of degree $n$ in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, $\ldots$, and $\rho^m$.
The radial polynomials $R_n^m(\rho)$ are orthogonal to each other according to
$$\int_0^1 R_n^m(\rho) R_{n'}^m(\rho)\, \rho\, d\rho = \frac{1}{2(n+1)}\, \delta_{nn'} , \tag{8-64}$$
where $\delta_{ij}$ is a Kronecker delta, i.e., $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$ if $i \ne j$. They are even or
odd in $\rho$ depending on whether $n$ (or $m$) is even or odd. It is evident from Eq. (8-63) that
$$R_n^n(\rho) = \rho^n . \tag{8-65}$$
Moreover,
$$R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ -\delta_{m0} & \text{for odd } n/2 \end{cases} \tag{8-66}$$
and
$$R_n^n(1) = 1 . \tag{8-67}$$
The variation of some typical radial polynomials is shown in Figure 8-8. The angular
functions are orthogonal according to
$$\int_0^{2\pi} \cos m\theta \cos m'\theta\, d\theta = \pi (1 + \delta_{m0})\, \delta_{mm'} . \tag{8-68}$$
Therefore, the polynomials $Z_n^m(\rho, \theta)$ are orthonormal to each other according to
$$\frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta)\, \rho\, d\rho\, d\theta = \delta_{nn'} \delta_{mm'} . \tag{8-69}$$
The mean value of a polynomial, except piston, is zero, as may be seen by letting
$n' = 0 = m'$.
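The radial polynomial of Eq. (8-63) translates directly into code. The sketch below evaluates it from the factorial sum and checks the orthogonality relation of Eq. (8-64) with a simple midpoint rule; the function names and the number of samples are assumptions made for illustration.

```python
from math import factorial


def zernike_radial(n, m, rho):
    """Radial polynomial R_n^m(rho) from the finite sum in Eq. (8-63)."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n with n - m even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def radial_inner_product(n, n_prime, m, samples=2000):
    """Midpoint-rule estimate of the integral on the left of Eq. (8-64)."""
    h = 1.0 / samples
    total = 0.0
    for k in range(samples):
        rho = (k + 0.5) * h
        total += zernike_radial(n, m, rho) * zernike_radial(n_prime, m, rho) * rho
    return total * h


print(zernike_radial(4, 0, 0.5))      # 6 rho^4 - 6 rho^2 + 1 at rho = 0.5
print(radial_inner_product(4, 4, 0))  # close to 1 / (2 * (4 + 1))
print(radial_inner_product(2, 4, 0))  # close to 0
```

For large $n$ the alternating factorial sum loses precision in floating point, so a recurrence would be preferable there; for the low orders tabulated in this chapter the direct sum is adequate.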
The index $n$ of a Zernike polynomial represents its radial degree or the order
because it represents the highest power of $\rho$ in the polynomial. This is different from the
order of a classical aberration, which represents the degree of the Cartesian coordinates of
the point object (for which the aberration function is being considered) and pupil points
(see Section 8.5.1). The index $m$ of a polynomial is referred to as its azimuthal frequency.
The polynomials are ordered such that a polynomial with a lower value of $n$ is ordered
first, and for a given value of $n$, a polynomial with a lower value of $m$ is ordered first. The
polynomials through $n = 8$ and $m = 0$ are given in Table 8-5. The number of
polynomials of a certain order $n$ is $(n/2) + 1$ when $n$ is even, and $(n + 1)/2$ when it is odd.
Their number through an order $n$ is given by
$$N_n = \left( \frac{n}{2} + 1 \right)^2 \quad \text{for even } n , \tag{8-70a}$$
$$N_n = (n + 1)(n + 3)/4 \quad \text{for odd } n . \tag{8-70b}$$
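The counts in Eqs. (8-70a) and (8-70b) can be confirmed by listing the allowed index pairs directly. This is a small sketch with hypothetical function names; it assumes only the rule stated for Eq. (8-61), namely $0 \le m \le n$ with $n - m$ even.

```python
def design_polynomial_indices(n_max):
    """All (n, m) with 0 <= m <= n <= n_max and n - m even, in table order."""
    return [(n, m)
            for n in range(n_max + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0]


def count_through_order(n):
    """Number of polynomials through order n, Eqs. (8-70a) and (8-70b)."""
    if n % 2 == 0:
        return (n // 2 + 1) ** 2
    return (n + 1) * (n + 3) // 4


for n in range(9):
    print(n, len(design_polynomial_indices(n)), count_through_order(n))
```

Through $n = 7$ the enumeration gives 20 polynomials, which together with the single $n = 8$, $m = 0$ term accounts for the 21 entries of Table 8-5.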
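Multiplying both sides of Eq. (8-61) by $Z_{n'}^{m'}(\rho, \theta)$, integrating over the unit pupil,
and utilizing the orthonormality Eq. (8-69), we obtain
$$\int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_{n'}^{m'}(\rho, \theta)\, \rho\, d\rho\, d\theta = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta)\, \rho\, d\rho\, d\theta$$
$$= \pi \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}\, \delta_{nn'} \delta_{mm'} = \pi c_{n'm'} , \tag{8-71}$$
or
$$c_{nm} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_n^m(\rho, \theta)\, \rho\, d\rho\, d\theta . \tag{8-72}$$
The variance of the aberration function is given by
$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 , \tag{8-73}$$
where the angular brackets indicate a mean value over the unit pupil according to
$$\langle W^k(\rho, \theta) \rangle = \int_0^1 \int_0^{2\pi} W^k(\rho, \theta)\, \rho\, d\rho\, d\theta \Big/ \int_0^1 \int_0^{2\pi} \rho\, d\rho\, d\theta , \quad k = 1, 2 . \tag{8-74}$$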
Figure 8-8. Variation of a Zernike circle radial polynomial $R_n^m(\rho)$ as a function of
$\rho$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.
Table 8-5. Orthonormal Zernike circle polynomials and their names when identified
with aberrations. The polynomials have the form
$Z_n^m(\rho, \theta) = \left[ 2(n+1)/(1 + \delta_{m0}) \right]^{1/2} R_n^m(\rho) \cos m\theta$.

n   m   Polynomial                                                        Aberration Name*
0   0   $1$                                                               Piston
1   1   $2\rho\cos\theta$                                                 Distortion (tilt)
2   0   $\sqrt{3}\,(2\rho^2 - 1)$                                         Field curvature (defocus)
2   2   $\sqrt{6}\,\rho^2\cos 2\theta$                                    Primary astigmatism
3   1   $\sqrt{8}\,(3\rho^3 - 2\rho)\cos\theta$                           Primary coma
3   3   $\sqrt{8}\,\rho^3\cos 3\theta$
4   0   $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$                               Primary spherical
4   2   $\sqrt{10}\,(4\rho^4 - 3\rho^2)\cos 2\theta$                      Secondary astigmatism
4   4   $\sqrt{10}\,\rho^4\cos 4\theta$
5   1   $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho)\cos\theta$              Secondary coma
5   3   $\sqrt{12}\,(5\rho^5 - 4\rho^3)\cos 3\theta$
5   5   $\sqrt{12}\,\rho^5\cos 5\theta$
6   0   $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$                  Secondary spherical
6   2   $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2)\cos 2\theta$          Tertiary astigmatism
6   4   $\sqrt{14}\,(6\rho^6 - 5\rho^4)\cos 4\theta$
6   6   $\sqrt{14}\,\rho^6\cos 6\theta$
7   1   $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho)\cos\theta$           Tertiary coma
7   3   $4\,(21\rho^7 - 30\rho^5 + 10\rho^3)\cos 3\theta$
7   5   $4\,(7\rho^7 - 6\rho^5)\cos 5\theta$
7   7   $4\rho^7\cos 7\theta$
8   0   $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$             Tertiary spherical

*The words “orthonormal Zernike” are to be associated with these names, e.g., orthonormal
Zernike primary astigmatism.
The mean value of the aberration function is given by
$$\langle W(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}\, \frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta)\, \rho\, d\rho\, d\theta = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}\, \delta_{n0} \delta_{m0} = c_{00} , \tag{8-75}$$
where again we have used the orthonormality Eq. (8-69) for $n' = 0 = m'$, i.e., for $Z_0^0 = 1$.
The mean square value of the aberration function is given by
$$\langle W^2(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} \sum_{n'=0}^{\infty} \sum_{m'=0}^{n'} c_{n'm'}\, \frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta)\, \rho\, d\rho\, d\theta$$
$$= \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} \sum_{n'=0}^{\infty} \sum_{m'=0}^{n'} c_{n'm'}\, \delta_{nn'} \delta_{mm'} = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{8-76}$$
Substituting Eqs. (8-75) and (8-76) into (8-73), we obtain
$$\sigma_W^2 = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 - c_{00}^2 , \tag{8-77}$$
or
$$\sigma_W^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{8-78}$$
Thus, the variance of the aberration function is equal to the sum of the squares of the
orthonormal expansion coefficients $c_{nm}$, except $c_{00}$. It illustrates that an orthonormal
coefficient represents the standard deviation of the corresponding polynomial aberration
term. We point out that unless the mean value of the aberration $\langle W \rangle = 0$,
$\sigma_W \ne W_{rms}$, where $W_{rms} = \langle W^2 \rangle^{1/2}$ is the root-mean-square (rms) value of the aberration.
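The distinction between the sigma and rms values is easy to see numerically. The sketch below applies Eqs. (8-75) through (8-78) to a set of orthonormal coefficients; the dictionary layout, the function name, and the coefficient values are all hypothetical.

```python
import math


def aberration_statistics(coeffs):
    """Return (mean, rms, sigma) from orthonormal coefficients c_nm.

    coeffs maps (n, m) -> c_nm, all in the same units (e.g., waves).
    """
    mean = coeffs.get((0, 0), 0.0)                   # Eq. (8-75)
    mean_square = sum(c * c for c in coeffs.values())  # Eq. (8-76)
    variance = mean_square - mean ** 2               # Eq. (8-77)
    return mean, math.sqrt(mean_square), math.sqrt(variance)


# Hypothetical coefficients in waves: piston, defocus, primary spherical.
coeffs = {(0, 0): 0.05, (2, 0): 0.03, (4, 0): 0.04}
mean, rms, sigma = aberration_statistics(coeffs)
print(mean, rms, sigma)
```

With a nonzero piston coefficient the rms value exceeds the sigma value; setting the piston coefficient to zero makes the two equal.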
8.8.3 Polynomials in Optical Testing

Because the aberrations introduced by fabrication errors or atmospheric turbulence
are random in nature, we need both the cosine and the sine Zernike circle polynomials to
express them. It is convenient in such cases to write their form and numbering as [3]:
$$Z_{\text{even } j}(\rho, \theta) = \sqrt{2(n+1)}\, R_n^m(\rho) \cos m\theta , \quad m \ne 0 ,$$
$$Z_{\text{odd } j}(\rho, \theta) = \sqrt{2(n+1)}\, R_n^m(\rho) \sin m\theta , \quad m \ne 0 , \tag{8-79}$$
$$Z_j(\rho, \theta) = \sqrt{n+1}\, R_n^0(\rho) , \quad m = 0 .$$
An even number is associated with a cosine polynomial, and an odd number with a sine
polynomial. The orthogonality of the trigonometric functions yields
$$\int_0^{2\pi} d\theta \begin{cases} \cos m\theta \cos m'\theta , & j \text{ and } j' \text{ are both even} \\ \cos m\theta \sin m'\theta , & j \text{ is even and } j' \text{ is odd} \\ \sin m\theta \cos m'\theta , & j \text{ is odd and } j' \text{ is even} \\ \sin m\theta \sin m'\theta , & j \text{ and } j' \text{ are both odd} \end{cases}
= \begin{cases} \pi (1 + \delta_{m0})\, \delta_{mm'} , & j \text{ and } j' \text{ are both even} \\ \pi\, \delta_{mm'} , & j \text{ and } j' \text{ are both odd} \\ 0 , & \text{otherwise} . \end{cases} \tag{8-80}$$
Therefore, the polynomials are orthonormal over a unit circular pupil according to
$$\int_0^1 \int_0^{2\pi} Z_j(\rho, \theta) Z_{j'}(\rho, \theta)\, \rho\, d\rho\, d\theta \Big/ \int_0^1 \int_0^{2\pi} \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{8-81}$$
The index $j$ is a polynomial-ordering number and is a function of both $n$ and $m$. The
polynomials are ordered such that an even $j$ corresponds to a symmetric polynomial
varying as $\cos m\theta$, and an odd $j$ corresponds to an antisymmetric polynomial varying as
$\sin m\theta$. A polynomial with a lower value of $n$ is ordered first, and for a given value of $n$, a
polynomial with a lower value of $m$ is ordered first. The number of polynomials of a
given order $n$ is $n + 1$. Their number through a certain order $n$ is given by
$$N_n = (n + 1)(n + 2)/2 . \tag{8-82}$$
The first 45 orthonormal Zernike polynomials are listed in Table 8-6.
An aberration function $W(\rho, \theta)$ across a unit pupil representing the aberrations
resulting from fabrication errors or atmospheric turbulence can be expanded in terms of
the polynomials $Z_j(\rho, \theta)$ in the form
$$W(\rho, \theta) = \sum_{j=1}^{J} a_j Z_j(\rho, \theta) , \tag{8-83}$$
where $a_j$ is an expansion coefficient, and we have truncated the polynomials at a
maximum value $J$ of $j$. Multiplying both sides of Eq. (8-83) by $Z_j(\rho, \theta)$, integrating over
the unit disc, and using the orthonormality Eq. (8-81), we obtain the expansion
coefficients:
$$a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_j(\rho, \theta)\, \rho\, d\rho\, d\theta . \tag{8-84}$$
It is evident from Eq. (8-84) that the value of a coefficient $a_j$ is independent of the
number $J$ of the polynomials used in Eq. (8-83) for the expansion of the aberration
function. Thus, one or more polynomial terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.

Table 8-6. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices $j$, $n$, and $m$
are called the polynomial number, radial degree, and azimuthal frequency,
respectively. The polynomials $Z_j$ are ordered such that an even $j$ corresponds to a
symmetric polynomial varying as $\cos m\theta$, and an odd $j$ corresponds to an
antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of $n$
is ordered first, and for a given value of $n$, a polynomial with a lower value of $m$ is
ordered first.

j    n   m   $Z_j(\rho, \theta)$                                                     Aberration Name*
1    0   0   $1$                                                                     Piston
2    1   1   $2\rho\cos\theta$                                                       x-tilt
3    1   1   $2\rho\sin\theta$                                                       y-tilt
4    2   0   $\sqrt{3}\,(2\rho^2 - 1)$                                               Defocus
5    2   2   $\sqrt{6}\,\rho^2\sin 2\theta$                                          45° primary astigmatism
6    2   2   $\sqrt{6}\,\rho^2\cos 2\theta$                                          0° primary astigmatism
7    3   1   $\sqrt{8}\,(3\rho^3 - 2\rho)\sin\theta$                                 Primary y-coma
8    3   1   $\sqrt{8}\,(3\rho^3 - 2\rho)\cos\theta$                                 Primary x-coma
9    3   3   $\sqrt{8}\,\rho^3\sin 3\theta$
10   3   3   $\sqrt{8}\,\rho^3\cos 3\theta$
11   4   0   $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$                                     Primary spherical aberration
12   4   2   $\sqrt{10}\,(4\rho^4 - 3\rho^2)\cos 2\theta$                            0° secondary astigmatism
13   4   2   $\sqrt{10}\,(4\rho^4 - 3\rho^2)\sin 2\theta$                            45° secondary astigmatism
14   4   4   $\sqrt{10}\,\rho^4\cos 4\theta$
15   4   4   $\sqrt{10}\,\rho^4\sin 4\theta$
16   5   1   $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho)\cos\theta$                    Secondary x-coma
17   5   1   $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho)\sin\theta$                    Secondary y-coma
18   5   3   $\sqrt{12}\,(5\rho^5 - 4\rho^3)\cos 3\theta$
19   5   3   $\sqrt{12}\,(5\rho^5 - 4\rho^3)\sin 3\theta$
20   5   5   $\sqrt{12}\,\rho^5\cos 5\theta$
21   5   5   $\sqrt{12}\,\rho^5\sin 5\theta$
22   6   0   $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$                        Secondary spherical
23   6   2   $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2)\sin 2\theta$                45° tertiary astigmatism
24   6   2   $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2)\cos 2\theta$                0° tertiary astigmatism
25   6   4   $\sqrt{14}\,(6\rho^6 - 5\rho^4)\sin 4\theta$
26   6   4   $\sqrt{14}\,(6\rho^6 - 5\rho^4)\cos 4\theta$
27   6   6   $\sqrt{14}\,\rho^6\sin 6\theta$
28   6   6   $\sqrt{14}\,\rho^6\cos 6\theta$
29   7   1   $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho)\sin\theta$                 Tertiary y-coma
30   7   1   $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho)\cos\theta$                 Tertiary x-coma
31   7   3   $4\,(21\rho^7 - 30\rho^5 + 10\rho^3)\sin 3\theta$
32   7   3   $4\,(21\rho^7 - 30\rho^5 + 10\rho^3)\cos 3\theta$
33   7   5   $4\,(7\rho^7 - 6\rho^5)\sin 5\theta$
34   7   5   $4\,(7\rho^7 - 6\rho^5)\cos 5\theta$
35   7   7   $4\rho^7\sin 7\theta$
36   7   7   $4\rho^7\cos 7\theta$
37   8   0   $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$                   Tertiary spherical
38   8   2   $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2)\cos 2\theta$   0° quaternary astigmatism
39   8   2   $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2)\sin 2\theta$   45° quaternary astigmatism
40   8   4   $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4)\cos 4\theta$
41   8   4   $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4)\sin 4\theta$
42   8   6   $\sqrt{18}\,(8\rho^8 - 7\rho^6)\cos 6\theta$
43   8   6   $\sqrt{18}\,(8\rho^8 - 7\rho^6)\sin 6\theta$
44   8   8   $\sqrt{18}\,\rho^8\cos 8\theta$
45   8   8   $\sqrt{18}\,\rho^8\sin 8\theta$

*The words “orthonormal Zernike circle” are to be associated with these names, e.g.,
orthonormal Zernike circle 0° primary astigmatism.
As in the case of polynomials in optical design, the mean and the mean square values
of the aberration function are given by
$$\langle W(\rho, \theta) \rangle = a_1 \tag{8-85}$$
and
$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{8-86}$$
respectively. Accordingly, the variance of the aberration function is given by
$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=1}^{J} a_j^2 - a_1^2 , \tag{8-87}$$
or
$$\sigma_W^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-88}$$
8.8.4 Characteristics of Polynomial Aberrations

8.8.4.1 Isometric Characteristics
The P-V numbers of Zernike polynomials are given in Table 8-7 [4]. From the form
of the polynomials given in Eqs. (8-79) for $m \ne 0$, they are given by $2\sqrt{2(n+1)}$,
because $R_n^m(0) = 0$, $R_n^m(1) = 1$, $|R_n^m(\rho)| \le 1$ (as may be seen from Figure 8-8), and $\cos m\theta$
or $\sin m\theta$ varies by 2 from –1 to 1. When $m = 0$ and $n/2$ is even, as for the primary and
tertiary Zernike spherical aberrations $Z_{11}$ and $Z_{37}$, the P-V numbers are given by
$(1 - b)\sqrt{n+1}$, where $b$ is the extreme negative value of $R_n^0(\rho)$ as $\rho$ varies between 0 and
1. However, when $m = 0$ and $n/2$ is odd, as for defocus $Z_4$ and secondary Zernike
spherical aberration $Z_{22}$, $R_n^0(\rho)$ varies from –1 at $\rho = 0$ to 1 at $\rho = 1$, as may be seen
from Figure 8-8. The P-V numbers in this case are given by $2\sqrt{n+1}$. The isometric
plots of the Zernike primary aberrations are shown in Figure 8-9.
It should be evident that the P-V numbers of two polynomials with the same values
of $n$ and $m$ are the same. Moreover, the P-V numbers of polynomials with the same value
of $n$ but different values of $m$, except $m = 0$, are also the same. The P-V numbers of a
polynomial representing the fabrication errors give a measure of the depth of material to
be removed in the fabrication process.
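The P-V numbers of the rotationally symmetric ($m = 0$) polynomials can be reproduced by sampling the radial polynomial on a fine grid. This is a minimal sketch, assuming the normalization $\sqrt{n+1}\,R_n^0(\rho)$ of Eq. (8-79); the function names and the grid size are hypothetical.

```python
from math import factorial, sqrt


def radial_m0(n, rho):
    """R_n^0(rho) for even n, from the factorial sum of the radial polynomial."""
    half = n // 2
    return sum((-1) ** s * factorial(n - s)
               / (factorial(s) * factorial(half - s) ** 2)
               * rho ** (n - 2 * s)
               for s in range(half + 1))


def pv_number_m0(n, samples=10001):
    """Peak-to-valley value of sqrt(n + 1) * R_n^0(rho) for 0 <= rho <= 1."""
    values = [radial_m0(n, k / (samples - 1)) for k in range(samples)]
    return sqrt(n + 1) * (max(values) - min(values))


for n in (2, 4, 6, 8):
    print(n, round(pv_number_m0(n), 3))
```

The sampled values agree with the closed forms $2\sqrt{n+1}$ for odd $n/2$ and $(1 - b)\sqrt{n+1}$ for even $n/2$ to the three decimal places used in the tabulated values.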
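Table 8-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
Zernike polynomial aberrations for a sigma value of one wave.

$Z_j$      P-V #                     $Z_j$      P-V #                      $Z_j$      P-V #
$Z_1$      0                         $Z_{16}$   $2\sqrt{12} = 6.928$       $Z_{31}$   8
$Z_2$      4                         $Z_{17}$   $2\sqrt{12} = 6.928$       $Z_{32}$   8
$Z_3$      4                         $Z_{18}$   $2\sqrt{12} = 6.928$       $Z_{33}$   8
$Z_4$      $2\sqrt{3} = 3.464$       $Z_{19}$   $2\sqrt{12} = 6.928$       $Z_{34}$   8
$Z_5$      $2\sqrt{6} = 4.899$       $Z_{20}$   $2\sqrt{12} = 6.928$       $Z_{35}$   8
$Z_6$      $2\sqrt{6} = 4.899$       $Z_{21}$   $2\sqrt{12} = 6.928$       $Z_{36}$   8
$Z_7$      $4\sqrt{2} = 5.657$       $Z_{22}$   $2\sqrt{7} = 5.292$        $Z_{37}$   4.286
$Z_8$      $4\sqrt{2} = 5.657$       $Z_{23}$   $2\sqrt{14} = 7.483$       $Z_{38}$   $2\sqrt{18} = 8.485$
$Z_9$      $4\sqrt{2} = 5.657$       $Z_{24}$   $2\sqrt{14} = 7.483$       $Z_{39}$   $2\sqrt{18} = 8.485$
$Z_{10}$   $4\sqrt{2} = 5.657$       $Z_{25}$   $2\sqrt{14} = 7.483$       $Z_{40}$   $2\sqrt{18} = 8.485$
$Z_{11}$   $1.5\sqrt{5} = 3.354$     $Z_{26}$   $2\sqrt{14} = 7.483$       $Z_{41}$   $2\sqrt{18} = 8.485$
$Z_{12}$   $2\sqrt{10} = 6.325$      $Z_{27}$   $2\sqrt{14} = 7.483$       $Z_{42}$   $2\sqrt{18} = 8.485$
$Z_{13}$   $2\sqrt{10} = 6.325$      $Z_{28}$   $2\sqrt{14} = 7.483$       $Z_{43}$   $2\sqrt{18} = 8.485$
$Z_{14}$   $2\sqrt{10} = 6.325$      $Z_{29}$   8                          $Z_{44}$   $2\sqrt{18} = 8.485$
$Z_{15}$   $2\sqrt{10} = 6.325$      $Z_{30}$   8                          $Z_{45}$   $2\sqrt{18} = 8.485$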
8.8.4.2 Interferometric Characteristics

The symmetry of an interferogram of a Zernike polynomial aberration, as in optical
testing, can be different from that of the aberration, because a fringe is formed
independent of its sign. For example, astigmatism $Z_6$ varying as $\cos 2\theta$ is 2-fold
symmetric, i.e., the aberration function does not change when it is rotated by $\pi$.
Rotating by $\pi/2$ yields an aberration of the same magnitude but with an
opposite sign. Accordingly, its interferogram is 4-fold symmetric. Thus, the fringes
intersecting the x axis are formed by a positive aberration, and those intersecting the y
axis are formed by a negative aberration. The number of fringes in an interferogram,
which is equal to the number of times the aberration changes by one wave as we move
from the center to the edge of the pupil, is different for the different polynomials.
Each fringe represents a contour of constant phase or aberration. The fringe is dark
when the phase is an odd multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In
the case of tilts, for example, the aberration changes by one wave four times, which is the
same as the P-V number of four waves. Thus, four straight-line fringes symmetric about
the center are obtained. The x-tilt polynomial $Z_2$ yields vertical fringes, and the y-tilt
polynomial $Z_3$ yields horizontal fringes. Similarly, defocus aberration $Z_4$ yields about
3.5 fringes. In the case of spherical aberration $Z_{11}$, the aberration starts at a value of
$\sqrt{5}$ waves, decreases to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then
increases to $\sqrt{5}$ waves. Thus, the total number of times the aberration changes by unity
is equal to $3\sqrt{5} = 6.7$, and approximately seven circular fringes are obtained. The
interferograms of the Zernike primary aberrations are shown in Figure 8-9. The
corresponding diffraction PSFs are included for completeness.
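The fringe counts quoted above can be checked numerically. The following is a minimal sketch, assuming the standard orthonormal forms of $Z_2$, $Z_4$, $Z_6$, and $Z_{11}$ and a sigma value of one wave; the grid sizes and variable names are illustrative choices, not part of the text.

```python
import numpy as np

# Polar grid over the unit pupil (grid sizes are arbitrary choices).
rho = np.linspace(0.0, 1.0, 2001)
theta = np.linspace(0.0, 2.0 * np.pi, 721)
R, T = np.meshgrid(rho, theta)

# Orthonormal Zernike polynomials for a sigma value of one wave.
zernike = {
    "Z2": 2.0 * R * np.cos(T),
    "Z4": np.sqrt(3.0) * (2.0 * R**2 - 1.0),
    "Z6": np.sqrt(6.0) * R**2 * np.cos(2.0 * T),
    "Z11": np.sqrt(5.0) * (6.0 * R**4 - 6.0 * R**2 + 1.0),
}

# Peak-to-valley number of each polynomial over the pupil.
pv = {name: float(W.max() - W.min()) for name, W in zernike.items()}

# Total change of Z11 along a radius: it falls from sqrt(5) to -sqrt(5)/2
# and rises back to sqrt(5), so the sum of the absolute steps is 3*sqrt(5).
z11_radial = np.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)
z11_total_variation = float(np.abs(np.diff(z11_radial)).sum())

for name, value in pv.items():
    print(f"{name}: P-V = {value:.3f} waves")
print(f"Z11 total radial change = {z11_total_variation:.3f} waves")
```

The total radial change of $Z_{11}$ differs from its P-V number because the aberration is not monotonic in $\rho$, which is why about seven fringes appear although the P-V number is only 3.354 waves.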

[Figure 8-9 panels: Z1, Z2, Z4, Z6, Z8, Z11]
Figure 8-9. Zernike circle polynomials shown as isometric plots on the top,
interferograms on the left, and diffraction PSFs on the right for a sigma value of one
wave.
8.9 RELATIONSHIP BETWEEN ZERNIKE POLYNOMIALS AND CLASSICAL ABERRATIONS
8.9.1 Introduction
As discussed in Section 8.5.1, a classical aberration depends on the polar angle $\theta$ as
$\cos^m\theta$. However, a Zernike polynomial depends on the angle as $\cos m\theta$ (or $\sin m\theta$). By
expressing $\cos^m\theta$ as a series of $\cos m\theta$ terms, or $\cos m\theta$ as a power series of $\cos\theta$
terms, the coefficients of classical aberrations can be obtained from the Zernike
coefficients, and vice versa [2]. We illustrate this for primary aberrations. The names of
some of the aberrations associated with the Zernike polynomials are given in Table 8-3.
They are a carryover from the names associated with the classical aberrations.
The Seidel aberrations are well known in optical design, where the optical system
has an axis of rotational symmetry with the consequence that the angle-dependent terms
are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in
optical testing generally contain both the cosine and sine terms due to the assembly and
fabrication errors. We show how to define the effective Seidel coefficients in such cases.
We emphasize that the Seidel aberration coefficients determined from the primary
Zernike aberrations will be in error unless the higher-order terms that also contain Seidel
terms are negligible [5].
8.9.2 Wavefront Tilt Aberration

The Zernike tilt aberration
$$a_2 Z_2(\rho, \theta) = 2a_2\rho\cos\theta \tag{8-89}$$
represents a tilt of the wavefront about the y axis by an angle $4(\lambda/D)a_2$, where the
aberration coefficient is in units of wavelength. It results in a displacement of the PSF
along the x axis by $4\lambda F a_2$, where F is the focal ratio of the image-forming light cone.
Similarly, the Zernike tilt aberration
$$a_3 Z_3(\rho, \theta) = 2a_3\rho\sin\theta \tag{8-90}$$
represents a tilt of the wavefront about the x axis by an angle $4(\lambda/D)a_3$ and results in a
displacement of the PSF along the y axis by $4\lambda F a_3$. The P-V number of the aberration is
$4a_2$.
It should be evident that when the cosine and sine terms of a certain aberration are
present simultaneously, as in optical testing, their combination represents the aberration
whose orientation depends on the value of the component terms. For example, if both x
and y Zernike tilts are present in the form
$$W(\rho, \theta) = a_2 Z_2(\rho, \theta) + a_3 Z_3(\rho, \theta) \tag{8-91a}$$
$$= 2a_2\rho\cos\theta + 2a_3\rho\sin\theta , \tag{8-91b}$$
it can be written
$$W(\rho, \theta) = 2\left(a_2^2 + a_3^2\right)^{1/2}\rho\cos\left[\theta - \tan^{-1}(a_3/a_2)\right] . \tag{8-92}$$
Thus, it represents a Zernike wavefront tilt aberration of magnitude $2\left(a_2^2 + a_3^2\right)^{1/2}$ about
an axis that is orthogonal to a line making an angle of $\tan^{-1}(a_3/a_2)$ with the x axis. How
to decide the sign of the overall tilt and the value of its angle is discussed in the
Appendix. It is evident from Eq. (8-92) that if $b_j = 0$, then the angle of orientation is
zero, indicating a $\cos m\theta$ polynomial. Similarly, if $a_j = 0$, then the angle is $\pi/2m$,
indicating a $\sin m\theta$ polynomial. The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the
Seidel distortion in its $(\rho, \theta)$ dependence (see Tables 8-1 and 8-2). It displaces the image
along the x axis by $4\lambda F a_2$.
8.9.3 Wavefront Defocus Aberration

The Zernike defocus aberration is given by
$$a_4 Z_4(\rho) = \sqrt{3}\,a_4\left(2\rho^2 - 1\right) , \tag{8-93}$$
where $a_4$ is its sigma value. It varies with $\rho$ as $\rho^2$, just as the Seidel field curvature
does. The constant term in $Z_4(\rho)$ makes its mean value across the circular
pupil zero, without changing its standard deviation. The P-V number for the
aberration is $2\sqrt{3}\,a_4$. Because the aberration is radially symmetric, so are the PSF and the
spot diagram.
8.9.4 Astigmatism
The Zernike primary astigmatism
$$a_6 Z_6(\rho, \theta) = \sqrt{6}\,a_6\rho^2\cos 2\theta \tag{8-94}$$
is referred to as the 0° astigmatism. It consists of Seidel astigmatism $\rho^2\cos^2\theta$ balanced
with defocus aberration $\rho^2$ to yield minimum variance (see Problem 8.5). The Zernike
primary astigmatism
$$a_5 Z_5(\rho, \theta) = \sqrt{6}\,a_5\rho^2\sin 2\theta \tag{8-95}$$
can be written
$$a_5 Z_5(\rho, \theta) = \sqrt{6}\,a_5\rho^2\cos\left[2(\theta - \pi/4)\right] . \tag{8-96}$$
Comparing it with Eq. (8-94), we find that it is equivalent to changing $\theta$ to $\theta - \pi/4$,
i.e., to rotating the 0° astigmatism by $\pi/4$.
Accordingly, it is called the 45° astigmatism. The secondary Zernike astigmatism, given
by
$$a_{12} Z_{12}(\rho, \theta) = \sqrt{10}\,a_{12}\left(4\rho^4 - 3\rho^2\right)\cos 2\theta , \tag{8-97}$$
does not yield a line image in any plane. Because of its variation with $\theta$ as $\cos 2\theta$, it is
referred to as the 0° astigmatism in conformance with the corresponding primary
astigmatism. The name tertiary astigmatism in Table 8-5 can be explained similarly.
If both 0° and 45° astigmatisms are present so that the aberration function is
$$W(\rho, \theta) = a_6 Z_6(\rho, \theta) + a_5 Z_5(\rho, \theta) \tag{8-98a}$$
$$= \sqrt{6}\,a_6\rho^2\cos 2\theta + \sqrt{6}\,a_5\rho^2\sin 2\theta , \tag{8-98b}$$
we may write it in the form
$$W(\rho, \theta) = \left(a_5^2 + a_6^2\right)^{1/2}\sqrt{6}\,\rho^2\cos\left\{2\left[\theta - (1/2)\tan^{-1}(a_5/a_6)\right]\right\} , \tag{8-99}$$
showing that it is Zernike astigmatism of magnitude $\left(a_5^2 + a_6^2\right)^{1/2}$ at an angle of
$(1/2)\tan^{-1}(a_5/a_6)$ with the x axis.
It should be evident that there is ambiguity in determining astigmatism, because it
can be written in different but equivalent forms by separating defocus aberration from it.
For example, a 0° astigmatism can be written
$$a_6 Z_6(\rho, \theta) = a_6\sqrt{6}\,\rho^2\cos 2\theta \tag{8-100a}$$
$$= a_6\sqrt{6}\left(2\rho^2\cos^2\theta - \rho^2\right) \tag{8-100b}$$
$$= a_6\sqrt{6}\left(-2\rho^2\sin^2\theta + \rho^2\right) . \tag{8-100c}$$
It is clear that a 0° Zernike astigmatism given by Eq. (8-100a) can be written as a
combination of a 0° positive Seidel astigmatism and a negative defocus, as in
Eq. (8-100b), or a 90° negative Seidel astigmatism and a positive defocus, as in Eq. (8-100c).
8.9.5 Coma
The Zernike aberration terms $a_8 Z_8(\rho, \theta)$ and $a_7 Z_7(\rho, \theta)$ are called the x and y
Zernike comas. They represent classical coma $\rho^3\cos\theta$ or $\rho^3\sin\theta$ balanced with tilt
$\rho\cos\theta$ or $\rho\sin\theta$, respectively, to yield minimum variance (see Problem 8.5). The names
for the secondary and tertiary comas can be explained similarly.
When both x- and y-Zernike comas are present, the aberration may be written
$$W(\rho, \theta) = a_8 Z_8(\rho, \theta) + a_7 Z_7(\rho, \theta) \tag{8-101a}$$
$$= \sqrt{8}\,a_8\left(3\rho^3 - 2\rho\right)\cos\theta + \sqrt{8}\,a_7\left(3\rho^3 - 2\rho\right)\sin\theta , \tag{8-101b}$$
or
$$W(\rho, \theta) = \left(a_7^2 + a_8^2\right)^{1/2}\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\left[\theta - \tan^{-1}(a_7/a_8)\right] , \tag{8-101c}$$
which is equivalent to a Zernike coma of magnitude $\left(a_7^2 + a_8^2\right)^{1/2}$ inclined at an angle of
$\tan^{-1}(a_7/a_8)$ with the x axis.
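The same pattern combines a cosine and sine pair for tilt, astigmatism, and coma. As a minimal sketch, the hypothetical helper `combine_pair` below returns the magnitude and orientation angle of such a pair. It uses the two-argument arctangent to pick the quadrant, which is an assumption of this sketch and not necessarily the sign convention adopted in the Appendix.

```python
import math

def combine_pair(a_cos, a_sin, m):
    """Return (magnitude, angle) such that
    a_cos*cos(m*t) + a_sin*sin(m*t) == magnitude*cos(m*(t - angle)).

    m = 1 for tilt and coma, m = 2 for astigmatism.
    atan2 resolves the quadrant (an assumption of this sketch).
    """
    magnitude = math.hypot(a_cos, a_sin)
    angle = math.atan2(a_sin, a_cos) / m
    return magnitude, angle

# Illustrative coefficients in waves (hypothetical values).
tilt = combine_pair(a_cos=0.3, a_sin=0.4, m=1)         # (a2, a3)
astigmatism = combine_pair(a_cos=0.0, a_sin=1.0, m=2)  # (a6, a5): pure 45-degree term
coma = combine_pair(a_cos=-0.2, a_sin=0.1, m=1)        # (a8, a7)

print(tilt, astigmatism, coma)
```

For the pure $\sin 2\theta$ astigmatism the returned angle is $\pi/4$, in agreement with its name of 45° astigmatism.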
8.9.6 Spherical Aberration

The Zernike primary spherical aberration $Z_{11}(\rho)$ represents the Seidel spherical
aberration varying as $\rho^4$ balanced with defocus varying as $\rho^2$ to yield minimum variance
(see Problem 8.5). As in the case of the Zernike defocus term $Z_4(\rho)$, the constant term in
$Z_{11}(\rho)$ makes its mean value across the circular pupil zero. Similarly, the Zernike
secondary and tertiary spherical aberrations $Z_{22}$ and $Z_{37}$ also contain a constant term so
that their mean value is zero. The secondary Zernike aberration $Z_{22}$ consists of classical
secondary aberration $\rho^6$ balanced with the primary spherical aberration and defocus to
yield minimum variance.
8.9.7 Seidel Coefficients from Zernike Coefficients

It should be noted that the wavefront tilt aberration given by Eq. (8-92) represents the
tilt aberration obtained from the Zernike tilt aberrations. However, there are other Zernike
aberrations that also have tilt aberration built into them, e.g., Zernike primary, secondary,
or tertiary coma. Similarly, the Seidel coma $3\sqrt{8}\left(a_7^2 + a_8^2\right)^{1/2}$ in Eq. (8-101c) at an angle of
$\tan^{-1}(a_7/a_8)$ is only from the primary Zernike comas, but the secondary and tertiary
Zernike comas also contain Seidel coma. Thus, it is only if the higher-order Zernike
comas are zero or negligible that the PSF aberrated by primary Zernike coma will be
symmetric about a line making an angle of $\tan^{-1}(a_7/a_8)$ with the x axis. Similarly, it is
only if the secondary and tertiary astigmatisms are zero or negligible that the Seidel
astigmatism is $2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2}$, as in Eq. (8-99).
To illustrate how an incorrect Seidel coefficient can be inferred unless it is obtained
from all of the significant Zernike terms that contain Seidel aberrations, we consider an
axial image aberrated by one wave of the secondary spherical aberration $\rho^6$. In terms of
Zernike polynomials it will be written as
$$W(\rho) = a_{22} Z_{22}(\rho) + a_{11} Z_{11}(\rho) + a_4 Z_4(\rho) + a_1 Z_1(\rho) , \tag{8-102}$$
where
$$a_{22} = 1/\left(20\sqrt{7}\right) , \quad a_{11} = 1/\left(4\sqrt{5}\right) , \quad a_4 = 9/\left(20\sqrt{3}\right) , \quad a_1 = 1/4 . \tag{8-103}$$
If we infer the Seidel spherical aberration from only the primary Zernike aberration
$a_{11} Z_{11}(\rho)$, its amount would be 1.5 waves. Such a conclusion is obviously incorrect,
because in reality the amount of Seidel spherical aberration is zero. Indeed, even if we
expand the aberration function up to as many as the first 21 terms, we will still
incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves.
However, the Seidel spherical aberration will correctly reduce to zero when at least the
first 22 terms are included in the expansion. For an off-axis image, there are
angle-dependent aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Thus, it is important
that the expansion be carried out up to a certain number of terms such that any additional
terms do not significantly change the mean square difference between the function and its
estimate. Otherwise, the inferred Seidel aberrations will be erroneous [5].
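The coefficients of Eq. (8-103) and the spurious 1.5 waves can be reproduced numerically. The following is a minimal sketch, assuming the standard orthonormal radial forms of $Z_1$, $Z_4$, $Z_{11}$, and $Z_{22}$; each coefficient is obtained as the pupil average of $\rho^6 Z_j$, evaluated here by Gauss-Legendre quadrature (a choice of this sketch).

```python
import numpy as np

def radial_zernikes(rho):
    """Rotationally symmetric orthonormal Zernike polynomials, keyed by index j."""
    return {
        1: np.ones_like(rho),
        4: np.sqrt(3.0) * (2.0 * rho**2 - 1.0),
        11: np.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0),
        22: np.sqrt(7.0) * (20.0 * rho**6 - 30.0 * rho**4 + 12.0 * rho**2 - 1.0),
    }

# Gauss-Legendre nodes mapped from [-1, 1] to [0, 1]; exact for these polynomials.
nodes, weights = np.polynomial.legendre.leggauss(20)
rho = 0.5 * (nodes + 1.0)
w = 0.5 * weights

W = rho**6  # one wave of secondary spherical aberration

# a_j = <W Z_j> = 2 * integral_0^1 W(rho) Z_j(rho) rho d(rho) for symmetric terms.
coeffs = {j: float(np.sum(w * 2.0 * rho * W * Z)) for j, Z in radial_zernikes(rho).items()}

# Seidel spherical aberration inferred from Z11 alone (coefficient of rho^4 in a11*Z11).
seidel_from_z11_only = 6.0 * np.sqrt(5.0) * coeffs[11]

# Coefficient of rho^4 when the Z22 term is also included: -30*sqrt(7)*a22 cancels it.
rho4_with_z22 = seidel_from_z11_only - 30.0 * np.sqrt(7.0) * coeffs[22]

print(coeffs)
print(seidel_from_z11_only, rho4_with_z22)
```

The $\rho^4$ content of $a_{22}Z_{22}$ exactly cancels that of $a_{11}Z_{11}$, which is why truncating the expansion before the 22nd term gives a misleading Seidel coefficient.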
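If we approximate a certain aberration function by the primary Zernike aberrations
only, we may write
$$W(\rho, \theta) = \sum_{j=1}^{8} a_j Z_j(\rho, \theta) + a_{11} Z_{11}(\rho) \tag{8-104a}$$
$$= A_p + A_t\rho\cos(\theta - \beta_t) + A_d\rho^2 + A_a\rho^2\cos^2(\theta - \beta_a) + A_c\rho^3\cos(\theta - \beta_c) + A_s\rho^4 , \tag{8-104b}$$
where $A_p$ is the piston aberration, other coefficients $A_i$ represent the peak value of the
corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel
aberration. They are given by
$$A_p = a_1 - \sqrt{3}\,a_4 + \sqrt{5}\,a_{11} , \tag{8-105a}$$
$$A_t = 2\left[\left(a_2 - \sqrt{8}\,a_8\right)^2 + \left(a_3 - \sqrt{8}\,a_7\right)^2\right]^{1/2} , \quad \beta_t = \tan^{-1}\left(\frac{a_3 - \sqrt{8}\,a_7}{a_2 - \sqrt{8}\,a_8}\right) , \tag{8-105b}$$
$$A_d = 2\left(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11}\right) - A_a/2 , \tag{8-105c}$$
$$A_a = 2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2} , \quad \beta_a = \frac{1}{2}\tan^{-1}(a_5/a_6) , \tag{8-105d}$$
$$A_c = 6\sqrt{2}\left(a_7^2 + a_8^2\right)^{1/2} , \quad \beta_c = \tan^{-1}(a_7/a_8) , \tag{8-105e}$$
and
$$A_s = 6\sqrt{5}\,a_{11} . \tag{8-105f}$$
The term $-A_a/2$ in Eq. (8-105c) is the defocus that results from writing the Zernike
astigmatism $\cos 2(\theta - \beta_a)$ as $2\cos^2(\theta - \beta_a) - 1$, as in Eq. (8-100b).
As a note of caution, we add that the approximation of Eq. (8-104a) is good only when
the higher-order Zernike aberrations that also contain Seidel aberration terms are
negligible.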
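The conversion from primary Zernike coefficients to Seidel coefficients can be checked by evaluating both forms of the aberration function at the same pupil points. The sketch below is illustrative only: the function names, the random test coefficients, and the use of the two-argument arctangent for the orientation angles are assumptions. The defocus coefficient is obtained by collecting the $\rho^2$ terms directly, namely $2\sqrt{3}a_4$ from $Z_4$, $-6\sqrt{5}a_{11}$ from $Z_{11}$, and $-A_a/2$ from the identity $\cos 2\phi = 2\cos^2\phi - 1$ applied to the astigmatism term.

```python
import numpy as np

S3, S5, S6, S8 = np.sqrt([3.0, 5.0, 6.0, 8.0])

def zernike_sum(a, rho, theta):
    """Sum of a_j Z_j for j = 1..8 and 11; `a` maps j to a coefficient in waves."""
    coma_radial = S8 * (3.0 * rho**3 - 2.0 * rho)
    return (a[1]
            + 2.0 * rho * (a[2] * np.cos(theta) + a[3] * np.sin(theta))
            + a[4] * S3 * (2.0 * rho**2 - 1.0)
            + S6 * rho**2 * (a[5] * np.sin(2.0 * theta) + a[6] * np.cos(2.0 * theta))
            + coma_radial * (a[7] * np.sin(theta) + a[8] * np.cos(theta))
            + a[11] * S5 * (6.0 * rho**4 - 6.0 * rho**2 + 1.0))

def seidel_coefficients(a):
    """Peak values and orientation angles of the Seidel terms."""
    tx, ty = a[2] - S8 * a[8], a[3] - S8 * a[7]
    A_a = 2.0 * S6 * np.hypot(a[5], a[6])
    return {
        "A_p": a[1] - S3 * a[4] + S5 * a[11],
        "A_t": 2.0 * np.hypot(tx, ty), "b_t": np.arctan2(ty, tx),
        "A_d": 2.0 * S3 * a[4] - 6.0 * S5 * a[11] - 0.5 * A_a,
        "A_a": A_a, "b_a": 0.5 * np.arctan2(a[5], a[6]),
        "A_c": 3.0 * S8 * np.hypot(a[7], a[8]), "b_c": np.arctan2(a[7], a[8]),
        "A_s": 6.0 * S5 * a[11],
    }

def seidel_sum(s, rho, theta):
    """Aberration function written in terms of the Seidel coefficients."""
    return (s["A_p"]
            + s["A_t"] * rho * np.cos(theta - s["b_t"])
            + s["A_d"] * rho**2
            + s["A_a"] * rho**2 * np.cos(theta - s["b_a"])**2
            + s["A_c"] * rho**3 * np.cos(theta - s["b_c"])
            + s["A_s"] * rho**4)

# Hypothetical coefficients and random pupil points for the comparison.
rng = np.random.default_rng(0)
a = {j: rng.uniform(-1.0, 1.0) for j in (1, 2, 3, 4, 5, 6, 7, 8, 11)}
rho = rng.uniform(0.0, 1.0, 500)
theta = rng.uniform(0.0, 2.0 * np.pi, 500)

difference = zernike_sum(a, rho, theta) - seidel_sum(seidel_coefficients(a), rho, theta)
max_diff = float(np.max(np.abs(difference)))
print(f"largest difference between the two forms: {max_diff:.2e} waves")
```

The two forms agree to rounding error for any choice of the nine coefficients, which confirms that the conversion is an identity within the primary terms; it says nothing about the higher-order terms that the approximation neglects.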
8.10 ABERRATIONS OF AN ANAMORPHIC SYSTEM

8.10.1 Introduction
An anamorphic imaging system, for example, consisting of cylindrical optics, is
symmetric about two orthogonal planes whose intersection defines its optical axis. As
discussed in Section 2.10, the Gaussian images of a point object with object rays in the
two symmetry planes are formed separately. They are coincident in the final image space
of the system for only two pairs of conjugate planes. By definition, an anamorphic system
forms the image of an extended object with different transverse magnifications in the two
symmetry planes. Thus, for example, the image of a square object is rectangular, and that
of a rectangular object can be square. The two orthogonal planes of symmetry of the
imaging system yield six reflection invariants in terms of the Cartesian coordinates of the
object and pupil points, which become the building blocks of its aberration function for a
certain point object. The six invariants reduce to three “rotational” invariants for a
rotationally symmetric system, or equivalently for an infinite number of symmetry
planes.
In this section, we briefly discuss the power series expansion of the aberration
function in terms of the six reflection invariants and define the classical aberrations of an
anamorphic system. The balanced aberrations for a rectangular pupil are represented by
the products of the Legendre polynomials, one for each of the two dimensions of the
rectangular pupil [6]. The compound Legendre polynomials are orthogonal across a
rectangular pupil and, like the classical aberrations, are inherently separable in the
Cartesian coordinates of the pupil point. They are different from the orthogonal
polynomials representing the balanced aberrations for a system with rotational symmetry
but a rectangular pupil. Although products of Chebyshev polynomials, one for the x axis
and the other for the y axis, are also orthogonal over a rectangular pupil [7], they are not
suitable for anamorphic systems, because they do not represent balanced aberrations for
such systems.
8.10.2 Classical Aberrations

Consider a point object located at a point (p, q) in the object plane imaged by an
anamorphic system. Let the exit pupil of the system be rectangular with half widths a and
b. It may be an aperture stop in the image space of the system. Let (x, y) be the
coordinates of a pupil point normalized by (a, b) so that $-1 \le x \le 1$ and $-1 \le y \le 1$.
Because of the symmetry of the system about the orthogonal planes zx and yz (see
Figure 2-48), the aberration function, which depends on both (p, q) and (x, y)
coordinates, consists of products of nonnegative integral powers of six reflection invariants:
$$p^2 ,\ x^2 ,\ px ,\ q^2 ,\ y^2 ,\ \text{and}\ qy . \tag{8-106}$$
The first three are symmetric about the yz plane, and the other three are symmetric about
the zx plane. A power-series expansion of the aberration function can be written
$$W(p, q; x, y) = \sum_{i, j, k, l, m, n} C_{i, j, k, l, m, n}\left(p^2\right)^i\left(q^2\right)^j\left(x^2\right)^k\left(y^2\right)^l(px)^m(qy)^n , \tag{8-107}$$
where i, j, k, l, m, and n are nonnegative integers, and $C_{i, j, k, l, m, n}$ is the
coefficient of the aberration term that has a degree in the object and pupil coordinates
given by
$$\text{degree} = 2(i + j + k + l + m + n) . \tag{8-108}$$
It is evident that the degree of an aberration term is even, and thus the aberration function
consists of aberrations of even orders only. The zero-degree term must be zero, as it
represents the aberration of the chief ray, which is zero by its definition as the reference
ray. There are six terms of second degree, namely the reflection invariants multiplied
with their respective coefficients. Two of these terms, namely those in $p^2$ and $q^2$, are
piston terms, i.e., they are independent of the pupil coordinates, and can generally be
ignored. Among the other four, those in px and qy represent lateral deviations of the
image point from the Gaussian image point, and those in $x^2$ and $y^2$ represent
longitudinal deviations. Because our aberration function is defined with respect to the
Gaussian image point, these four terms must be zero. It is clear that the aberration terms
are separable in the Cartesian coordinates (x, y) of a pupil point.
There are 21 terms of the fourth degree, of which three are piston terms and two are
equal to another two. Thus, we are left with 16 terms that depend on the pupil
coordinates. They are called the primary aberrations of an anamorphic system, compared
to only five for a rotationally symmetric system. The primary aberration function can be
written
$$W(p, q; x, y) = \left(C_1 p^3 + C_2 pq^2\right)x + \left(C_3 p^2 q + C_4 q^3\right)y + \left(C_5 p^2 + C_6 q^2\right)x^2$$
$$\qquad + C_7 pqxy + \left(C_8 p^2 + C_9 q^2\right)y^2 + C_{10}\,pxy^2 + C_{11}\,qyx^2 + C_{12}\,px^3$$
$$\qquad + C_{13}\,qy^3 + C_{14}\,x^2y^2 + C_{15}\,x^4 + C_{16}\,y^4 , \tag{8-109}$$
where we have expressed the aberration coefficients in a simplified form with one
subscript for convenience. For a rotationally symmetric system, the six reflection
invariants reduce to three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$,
and the 16 primary aberrations reduce to five. If $\vec{h}$ and $\vec{r}$ are the position vectors of the
object and pupil points, then the rotational invariants are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, $\vec{h}\cdot\vec{r}$ or $h^2$, $r^2$, and
$hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and $\theta$ is the polar angle of $\vec{r}$ with respect to that of $\vec{h}$.
In conformance with the aberrations of a rotationally symmetric system, the linear terms
in x and y are the distortion aberrations; the quadratic terms may be referred to as the field
curvature, defocus, or astigmatism; the cubic terms are comas; and the quartic terms
are the spherical aberrations. It is easy to see that an anamorphic system has three primary
aberrations for an axial point object, compared to only one for a rotationally symmetric
system.
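The count of fourth-degree terms can be reproduced by enumerating all pairwise products of the six reflection invariants. The following is a minimal sketch in which each invariant is stored as its exponents in (p, q, x, y); the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Each reflection invariant as exponents of (p, q, x, y).
INVARIANTS = {
    "p^2": (2, 0, 0, 0),
    "x^2": (0, 0, 2, 0),
    "px": (1, 0, 1, 0),
    "q^2": (0, 2, 0, 0),
    "y^2": (0, 0, 0, 2),
    "qy": (0, 1, 0, 1),
}

# All products of two invariants (with repetition): the fourth-degree terms.
products = list(combinations_with_replacement(INVARIANTS, 2))

# Distinct monomials; a count of 2 marks a pair of products that are equal,
# e.g., (p^2)(x^2) and (px)(px).
monomials = Counter(
    tuple(u + v for u, v in zip(INVARIANTS[f], INVARIANTS[g])) for f, g in products
)

piston = [m for m in monomials if m[2] == 0 and m[3] == 0]   # no pupil dependence
primary = [m for m in monomials if m[2] + m[3] > 0]          # depend on x or y
axial = [m for m in primary if m[0] == 0 and m[1] == 0]      # survive for p = q = 0

print(len(products), len(monomials), len(piston), len(primary), len(axial))
```

The enumeration gives 21 products, 19 distinct monomials, 3 piston terms, 16 primary aberrations, and 3 aberrations for an axial point object ($x^4$, $y^4$, and $x^2y^2$).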
8.10.3 Aberration Polynomials Orthonormal over a Rectangular Pupil

The balanced classical aberrations for an anamorphic imaging system are separable
in the x and y coordinates of a point on the rectangular exit pupil. Thus, the balanced
aberrations for a rectangular pupil and the polynomials representing them are the products
of Legendre polynomials in x and y variables. The polynomials with the x and y variables
properly normalized by the dimensions of the rectangular pupil (a, b) are orthogonal over
the pupil, and may be referred to as the orthogonal aberrations. The order of an
orthogonal aberration is represented by its degree in the pupil coordinates, which is the
same as that of its leading classical aberration term only for an axial point object. For
example, the fourth-order classical aberration $px^3$ in the object and pupil coordinates
representing x-coma becomes a third-order orthogonal aberration $p\left[x^3 - (3/5)x\right]$ in pupil
coordinates.
The Legendre polynomials orthonormal over the interval $-1 \le x \le 1$ are given in
Table 8-8, where $P_n(x)$ is a regular Legendre polynomial of order n. We define the
products of Legendre polynomials in x and y variables that are orthonormal over the
rectangular pupil:
$$Q_j(x, y) = L_l(x)\,L_m(y) , \tag{8-110}$$
where j is a polynomial ordering index starting with j = 1, and l and m are nonnegative
integers. It is evident that these polynomials are inherently separable in
the Cartesian pupil coordinates x and y. This is different from the Zernike circle
polynomials, which are orthogonal over a unit circle, but separable in polar coordinates
$(\rho, \theta)$, where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The order n of a polynomial representing its
degree in the pupil coordinates is given by n = l + m. As in the case of Zernike circle
polynomials, the number of polynomials with a certain order n is n + 1. The number of
polynomials through a certain order n is given by
$$N_n = (n + 1)(n + 2)/2 . \tag{8-111}$$
The first polynomial is the piston polynomial
$$Q_1(x, y) = L_0(x)\,L_0(y) = 1 . \tag{8-112}$$

Table 8-8. Legendre polynomials $L_n(x) = \sqrt{2n + 1}\,P_n(x)$ orthonormal over an
interval $-1 \le x \le 1$.

n    $L_n(x)$
0    $1$
1    $\sqrt{3}\,x$
2    $\left(\sqrt{5}/2\right)\left(3x^2 - 1\right)$
3    $\left(\sqrt{7}/2\right)\left(5x^3 - 3x\right)$
4    $(3/8)\left(35x^4 - 30x^2 + 3\right)$
5    $\left(\sqrt{11}/8\right)\left(63x^5 - 70x^3 + 15x\right)$
6    $\left(\sqrt{13}/16\right)\left(231x^6 - 315x^4 + 105x^2 - 5\right)$
7    $\left(\sqrt{15}/16\right)\left(429x^7 - 693x^5 + 315x^3 - 35x\right)$
8    $\left(\sqrt{17}/128\right)\left(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35\right)$

The orthonormality of the polynomials is expressed by
$$\frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} Q_j(x, y)\,Q_{j'}(x, y)\,dx\,dy = \delta_{jj'} . \tag{8-113}$$
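The orthonormality condition can be verified numerically. The following is a minimal sketch that builds $L_n(x) = \sqrt{2n+1}\,P_n(x)$ from NumPy's Legendre routines and evaluates the pupil average of products of Q-polynomials by Gauss-Legendre quadrature. The helper name `L`, the number of quadrature nodes, and the restriction to orders through four are choices of this sketch.

```python
import numpy as np
from numpy.polynomial import legendre

def L(n, x):
    """Orthonormal Legendre polynomial L_n(x) = sqrt(2n + 1) P_n(x)."""
    c = np.zeros(n + 1)
    c[n] = 1.0  # select P_n in the Legendre series
    return np.sqrt(2.0 * n + 1.0) * legendre.legval(x, c)

# 12-point Gauss-Legendre rule: exact for polynomials up to degree 23 per axis.
nodes, weights = legendre.leggauss(12)
X, Y = np.meshgrid(nodes, nodes)
W2 = np.outer(weights, weights)

# (l, m) pairs for orders n = l + m = 0..4, i.e., the first 15 Q-polynomials.
pairs = [(n - m, m) for n in range(5) for m in range(n + 1)]

def pupil_average(f):
    """(1/4) * double integral of f over the square -1 <= x, y <= 1."""
    return 0.25 * np.sum(W2 * f)

gram = np.array([
    [pupil_average(L(l1, X) * L(m1, Y) * L(l2, X) * L(m2, Y)) for (l2, m2) in pairs]
    for (l1, m1) in pairs
])

print(np.allclose(gram, np.eye(len(pairs))))
```

The matrix of pairwise averages is the identity, and the number of pairs through order four equals $N_4 = 15$, in agreement with Eq. (8-111).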
The rectangular Q-polynomials up to and including the eighth order are listed in
Table 8-9 as products of the Legendre polynomials, along with the names associated with
some of them. Their explicit form can be obtained by using the expressions of the
orthonormal Legendre polynomials given in Table 8-8. Note that for each polynomial
Ll ( x ) Lm ( y ) , there is a corresponding polynomial Lm ( x ) Ll ( y ) . These polynomials are
evidently different from those for a rotationally symmetric system with a rectangular
pupil. The rectangular polynomials given in Section 9.4 for such a system are not
separable in the Cartesian coordinates (x, y) of a pupil point.
The higher-order Q-polynomials can be written in a similar manner. It should be
evident that $Q_7$ represents the balanced x-primary coma, and $Q_{11}$ represents the balanced
x-primary spherical aberration. The piston term in $Q_{11}$ yields a zero mean value across
the rectangular pupil (without changing its variance). Because the aberration function is
separable in the Cartesian coordinates x and y of a pupil point, an aberration term
containing both x and y dependence is balanced separately for its x and y factors. Thus,
for example, a seventh-order aberration term $x^4y^3$ will yield a balanced aberration of the
form $\left[x^4 - (6/7)x^2\right]\left[y^3 - (3/5)y\right]$. The corresponding seventh-order orthonormal
polynomial is given by
$$Q_{32}(x, y) = L_4(x)\,L_3(y) . \tag{8-114}$$
It should be evident that the polynomials for a square pupil can be obtained from
those for a rectangular pupil by letting a = b , i.e., by using the same scale for the x and y
axes. Products of Chebyshev polynomials (one for the x, and the other for the y axis),
which are also orthogonal over a rectangular or a square pupil, have been suggested for
the analysis of rectangular wavefronts [7]. However, they are not suitable for anamorphic
systems because they do not represent balanced aberrations for such systems.
8.10.4 Expansion of a Rectangular Aberration Function in Terms of Orthonormal Rectangular Polynomials
Table 8-9. Orthonormal polynomials $Q_j(x, y)$ for an anamorphic system with a
rectangular pupil, where the (x, y) coordinates of a pupil point have been
normalized by the half-widths (a, b) of the pupil.

Order n = l + m    Polynomial $Q_j(x, y) = L_l(x)\,L_m(y)$    Aberration name
0                  $Q_1 = L_0(x)\,L_0(y)$                     Piston
1                  $Q_2 = L_1(x)\,L_0(y)$                     x-tilt
1                  $Q_3 = L_0(x)\,L_1(y)$                     y-tilt
2                  $Q_4 = L_2(x)\,L_0(y)$                     x-defocus
2                  $Q_5 = L_1(x)\,L_1(y)$
2                  $Q_6 = L_0(x)\,L_2(y)$                     y-defocus
3                  $Q_7 = L_3(x)\,L_0(y)$                     x-primary coma
3                  $Q_8 = L_2(x)\,L_1(y)$
3                  $Q_9 = L_1(x)\,L_2(y)$
3                  $Q_{10} = L_0(x)\,L_3(y)$                  y-primary coma
4                  $Q_{11} = L_4(x)\,L_0(y)$                  x-primary spherical
4                  $Q_{12} = L_3(x)\,L_1(y)$
4                  $Q_{13} = L_2(x)\,L_2(y)$
4                  $Q_{14} = L_1(x)\,L_3(y)$
4                  $Q_{15} = L_0(x)\,L_4(y)$                  y-primary spherical
5                  $Q_{16} = L_5(x)\,L_0(y)$                  x-secondary coma
5                  $Q_{17} = L_4(x)\,L_1(y)$
5                  $Q_{18} = L_3(x)\,L_2(y)$
5                  $Q_{19} = L_2(x)\,L_3(y)$
5                  $Q_{20} = L_1(x)\,L_4(y)$
5                  $Q_{21} = L_0(x)\,L_5(y)$                  y-secondary coma
6                  $Q_{22} = L_6(x)\,L_0(y)$                  x-secondary spherical
6                  $Q_{23} = L_5(x)\,L_1(y)$
6                  $Q_{24} = L_4(x)\,L_2(y)$
6                  $Q_{25} = L_3(x)\,L_3(y)$
6                  $Q_{26} = L_2(x)\,L_4(y)$
6                  $Q_{27} = L_1(x)\,L_5(y)$
6                  $Q_{28} = L_0(x)\,L_6(y)$                  y-secondary spherical
7                  $Q_{29} = L_7(x)\,L_0(y)$                  x-tertiary coma
7                  $Q_{30} = L_6(x)\,L_1(y)$
7                  $Q_{31} = L_5(x)\,L_2(y)$
7                  $Q_{32} = L_4(x)\,L_3(y)$
7                  $Q_{33} = L_3(x)\,L_4(y)$
7                  $Q_{34} = L_2(x)\,L_5(y)$
7                  $Q_{35} = L_1(x)\,L_6(y)$
7                  $Q_{36} = L_0(x)\,L_7(y)$                  y-tertiary coma
8                  $Q_{37} = L_8(x)\,L_0(y)$                  x-tertiary spherical
8                  $Q_{38} = L_7(x)\,L_1(y)$
8                  $Q_{39} = L_6(x)\,L_2(y)$
8                  $Q_{40} = L_5(x)\,L_3(y)$
8                  $Q_{41} = L_4(x)\,L_4(y)$
8                  $Q_{42} = L_3(x)\,L_5(y)$
8                  $Q_{43} = L_2(x)\,L_6(y)$
8                  $Q_{44} = L_1(x)\,L_7(y)$
8                  $Q_{45} = L_0(x)\,L_8(y)$                  y-tertiary spherical

An aberration function defined over a rectangular exit pupil can be expanded in
terms of the rectangular Q-polynomials in the form
$$W(x, y) = \sum_{j=1}^{J} a_j Q_j(x, y) , \tag{8-115}$$
where J is the number of polynomials used in the expansion, and $a_j$ is an expansion
coefficient of the polynomial $Q_j(x, y)$ given by
$$a_j = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} W(x, y)\,Q_j(x, y)\,dx\,dy . \tag{8-116}$$
It is evident that the value of a coefficient is independent of the number of polynomials
used in the expansion. Because the mean value of each polynomial (other than piston) is
zero and $Q_1(x, y)$ is unity, the mean value of the aberration function is given by the
piston coefficient $a_1$:
$$\langle W(x, y)\rangle = a_1 . \tag{8-117}$$
Its mean square value is given by
$$\left\langle [W(x, y)]^2\right\rangle = \sum_{j=1}^{J} a_j^2 . \tag{8-118}$$
Accordingly, its variance is given by
$$\sigma_W^2 = \left\langle [W(x, y)]^2\right\rangle - \langle W(x, y)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-119}$$
When an aberration function is obtained at a discrete array of points by tracing rays

from a point object through an imaging system or by testing a system interferometrically,
it can be expanded in terms of the orthonormal polynomials. Once the expansion
coefficients are calculated using Eq. (8-116), they can be used to reconstruct the function
and obtain it continuously across the pupil. Because of the orthogonality of the Legendre
polynomials, the coefficients are independent of each other, and an orthogonal aberration
term can be added to or subtracted from the aberration function without affecting the
other terms.
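The expansion and its statistics can be illustrated with a small numerical example. In the sketch below, the aberration function `W_example` is hypothetical, the sample points are taken as Gauss-Legendre nodes so that the integral of Eq. (8-116) becomes a weighted sum (measured data on a regular grid would need a different quadrature or a least-squares fit), and the polynomial ordering follows Table 8-9.

```python
import numpy as np
from numpy.polynomial import legendre

def L(n, x):
    """Orthonormal Legendre polynomial L_n(x) = sqrt(2n + 1) P_n(x)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt(2.0 * n + 1.0) * legendre.legval(x, c)

def q_pairs(max_order):
    """(l, m) index pairs in the order Q_1, Q_2, ... of Table 8-9."""
    return [(n - m, m) for n in range(max_order + 1) for m in range(n + 1)]

def expand(W, max_order, n_nodes=16):
    """Return expansion coefficients, mean, and variance of W over the pupil."""
    nodes, weights = legendre.leggauss(n_nodes)
    X, Y = np.meshgrid(nodes, nodes)
    w2 = np.outer(weights, weights)
    samples = W(X, Y)
    coeffs = np.array([0.25 * np.sum(w2 * samples * L(l, X) * L(m, Y))
                       for l, m in q_pairs(max_order)])
    mean = 0.25 * np.sum(w2 * samples)
    mean_square = 0.25 * np.sum(w2 * samples**2)
    return coeffs, float(mean), float(mean_square - mean**2)

def reconstruct(coeffs, max_order, x, y):
    """Evaluate the truncated expansion at the pupil point (x, y)."""
    return sum(a * L(l, x) * L(m, y) for a, (l, m) in zip(coeffs, q_pairs(max_order)))

def W_example(x, y):
    """Hypothetical aberration function in waves (not from the text)."""
    return 0.5 * x**4 + 0.2 * x * y**2 - 0.1 * y

coeffs, mean, variance = expand(W_example, max_order=4)
print(f"a_1 = {coeffs[0]:.4f}, mean = {mean:.4f}")
print(f"variance = {variance:.6f}, sum of a_j^2 for j >= 2 = {np.sum(coeffs[1:]**2):.6f}")
```

Because the example function is a polynomial of degree four, the 15 polynomials through the fourth order reproduce it exactly, so the piston coefficient equals the mean and the sum of the squares of the remaining coefficients equals the variance.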
8.11 OBSERVATION OF ABERRATIONS

Now we briefly describe how the aberrations of an optical system may be observed.
The emphasis of our discussion is on how to recognize a primary aberration and not on
how to measure it precisely. Because the optical frequencies are very high ($10^{14}$–$10^{15}$
Hz), optical wavefronts, aberrated or not, cannot be observed directly; optical detectors
simply do not respond at these frequencies. However, the primary aberration of a system
may be recognized by observing the image of a monochromatic point object formed by
the system. The image is characteristically different for a different aberration [3]. A direct
way to recognize an aberration is to form an interferogram by combining two parts of a
light beam, one of which has been transmitted through the system. An aberration in the
system yields an interference pattern that is characteristically different for a different
aberration. Here, we briefly discuss the interference patterns for primary aberrations. An
alternative approach is to measure the ray aberrations with a Hartmann sensor and
calculate the wave aberrations from them [8].
8.11.1 Primary Aberrations

Considering an optical system with a circular exit pupil of radius a and letting $(\rho, \theta)$
be the polar coordinates of a point in the plane of its exit pupil, the functional form of the
primary phase aberrations may be written
$$\Phi(\rho, \theta) = \left\{\begin{array}{lll}
A_s\rho^4 + B_d\rho^2 , & \text{Spherical combined with defocus} & \text{(8-120a)} \\
A_c\rho^3\cos\theta + B_t\rho\cos\theta , & \text{Coma combined with tilt} & \text{(8-120b)} \\
A_a\rho^2\cos^2\theta + B_d\rho^2 , & \text{Astigmatism combined with defocus} & \text{(8-120c)} \\
A_d\rho^2 , & \text{Field curvature} & \text{(8-120d)} \\
A_t\rho\cos\theta , & \text{Distortion} . & \text{(8-120e)}
\end{array}\right.$$
In Eq. (8-120a), when $B_d = 0$, the aberration is spherical. A nonzero value of $B_d$
implies that the aberration is combined with defocus, i.e., the aberration is not defined
with respect to a reference sphere centered at the Gaussian image point but with respect
to another sphere centered at a distance z from the plane of the exit pupil, according to
Eq. (8-10). As discussed in Section 9.3.1, the reference sphere is centered at the marginal
image point, the center of the circle of least confusion, and the point midway between the
marginal and Gaussian image points when $B_d/A_s = -2$, $-1.5$, and $-1$, respectively.
The midway point corresponds to minimum variance of the aberration, as may be seen by
comparing the aberration thus obtained with the Zernike polynomial $Z_4^0(\rho)$.
In Eq. (8-120b), when $B_t = 0$, the aberration is coma. A nonzero value of $B_t$ implies
that the aberration is combined with tilt, or that it is defined with respect to a reference
sphere centered at a point $(2FB_t, 0)$ in the image plane, where F is the focal ratio or the
f-number of the image-forming light cone. The variance of the aberration is minimum
when $B_t/A_c = -2/3$, as may be seen by comparing the aberration thus obtained with the
Zernike polynomial $Z_3^1(\rho, \theta)$.
In Eq. (8-120c), when $B_d = 0$, the aberration is astigmatism. A nonzero value of $B_d$
implies that it is combined with defocus. The variance of the aberration is minimum when
$B_d/A_a = -1/2$. When $B_d/A_a = 0$ or $-1$, we obtain the so-called tangential and sagittal
images of a point object, discussed in Section 9.3.3. Equations (8-120d) and (8-120e)
represent field curvature and distortion aberrations, respectively. The coefficients $A_d$ and
$A_t$ of these aberrations vary with the image height as $h'^2$ and $h'^3$, respectively. However,
for a given image height, these aberrations are equivalent to defocus and tilt aberrations,
respectively. Figure 8-10 shows 3D plots of the various aberrations.
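The minimum-variance ratio quoted for spherical aberration can be verified directly. As a short worked derivation, assuming only that the pupil is circular and uniformly illuminated so that $\langle\rho^n\rangle = 2/(n+2)$, the variance of $\Phi = A_s\rho^4 + B_d\rho^2$ is a quadratic in $B_d$:

```latex
\begin{align*}
\langle \rho^{n} \rangle &= \frac{1}{\pi}\int_0^{2\pi}\!\!\int_0^1 \rho^{n}\,\rho\,d\rho\,d\theta = \frac{2}{n+2}, \\
\langle \Phi \rangle &= \frac{A_s}{3} + \frac{B_d}{2}, \qquad
\langle \Phi^2 \rangle = \frac{A_s^2}{5} + \frac{A_s B_d}{2} + \frac{B_d^2}{3}, \\
\sigma_\Phi^2 &= \langle \Phi^2 \rangle - \langle \Phi \rangle^2
  = \frac{4A_s^2}{45} + \frac{A_s B_d}{6} + \frac{B_d^2}{12}, \\
\frac{\partial \sigma_\Phi^2}{\partial B_d} &= \frac{A_s + B_d}{6} = 0
  \;\Rightarrow\; B_d = -A_s, \qquad
\sigma_{\Phi,\min}^2 = \frac{A_s^2}{180}.
\end{align*}
```

The minimum at $B_d/A_s = -1$ reduces the variance from $4A_s^2/45$ to $A_s^2/180$, i.e., the standard deviation drops by a factor of four. The ratios $-2/3$ for coma and $-1/2$ for astigmatism follow from the same procedure applied to Eqs. (8-120b) and (8-120c).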
8.11.2 Interferograms
There are a variety of interferometers that are used to detect and measure aberrations
of optical systems [8]. Figure 8-11 schematically illustrates a Twyman–Green
interferometer in which a collimated laser beam is divided into two parts by a beam
splitter BS. One part, called the test beam, is incident on the system under test, indicated
by the lens L, and the other, called the reference beam, is incident on a plane mirror M1.
The focus F of the lens system lies at the center of curvature C of a spherical mirror M2.
As the angle of the incident light is changed to study the off-axis aberrations of the
system, the mirror is tilted so that its center of curvature lies at the current focus of the
beam. In this arrangement the mirror does not introduce any aberration because it is
forming the image of an object lying at its center of curvature. The two reflected beams
interfere in the region of their overlap. The lens L′ is used to observe the interference
pattern on a screen S. A record of the interference pattern is called an interferogram. Note
that because the test beam goes through the lens system L twice, its aberration is twice
that of the system.

[Figure 8-10 panels: Defocus: ρ²; Spherical: ρ⁴; Balanced Spherical: ρ⁴ − ρ²; Coma: ρ³ cos θ;
Balanced Coma: (ρ³ − (2/3)ρ) cos θ; Astigmatism: ρ² cos² θ; Balanced Astigmatism: ρ² cos² θ − (1/2)ρ²]
Figure 8-10. Shape of primary aberrations representing the difference between an
ideal wavefront (typically, spherical) and an actual wavefront.

[Figure 8-11 labels: M1, BS, L, M2, L′]
Figure 8-11. Twyman–Green interferometer for testing a lens system L. F is the
image-space focal point of L, and C is the center of curvature of a spherical mirror
M2. The interfering beams are focused by a lens L′, and the interference pattern is
observed on a screen S.

If the reference beam has uniform phase and the test beam has a phase distribution
$\Phi(x, y)$, and if their amplitudes are equal to each other, the irradiance distribution of their
interference pattern is given by
$$I(x, y) = I_0\left|1 + \exp[i\Phi(x, y)]\right|^2 = 2I_0\left\{1 + \cos[\Phi(x, y)]\right\} , \tag{8-121}$$
where $I_0$ is the irradiance when only one beam is present. The irradiance has a maximum
value equal to $4I_0$ at those points for which
$$\Phi(x, y) = 2\pi n \tag{8-122a}$$
and a minimum value equal to zero wherever
$$\Phi(x, y) = 2\pi(n + 1/2) , \tag{8-122b}$$
where n is a positive or negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which, in turn, corresponds to the locus of (x, y)
points with the phase aberration given by Eq. (8-122a) for a bright fringe and
Eq. (8-122b) for a dark fringe. If the test beam is aberration free $[\Phi(x, y) = 0]$, then, according
to Eq. (8-121), the interference pattern has a uniform irradiance of $4I_0$.
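The fringe conditions can be illustrated by evaluating the irradiance along a pupil radius. The sketch below is a minimal example under stated assumptions: the function names are illustrative, the aberration is supplied in waves as a function of the normalized radius, and dark fringes are counted as sign changes of $\cos(\Phi/2)$, which follows from writing $2I_0(1 + \cos\Phi)$ as $4I_0\cos^2(\Phi/2)$.

```python
import numpy as np

def irradiance(phase, i0=1.0):
    """Two-beam irradiance for equal amplitudes: I = 2 I0 (1 + cos(phase))."""
    return 2.0 * i0 * (1.0 + np.cos(phase))

def dark_fringes_along_radius(waves, n=20000):
    """Count dark fringes crossed between the pupil center and its edge.

    `waves` maps the normalized radius rho to the aberration in wavelengths.
    Since I = 4 I0 cos^2(phase / 2), a dark fringe is a zero of cos(phase / 2),
    detected here as a sign change between neighboring samples.
    """
    rho = np.linspace(0.0, 1.0, n)
    phase = 2.0 * np.pi * waves(rho)
    positive = np.cos(0.5 * phase) > 0.0
    return int(np.count_nonzero(positive[1:] != positive[:-1]))

# Defocus of 6 waves at the pupil edge, e.g., 3 waves of system defocus
# doubled by the two passes of the test beam through the system.
n_defocus = dark_fringes_along_radius(lambda rho: 6.0 * rho**2)

print(irradiance(0.0), irradiance(np.pi), n_defocus)
```

For six waves of defocus the count is six dark rings, at the radii where the aberration equals 1/2, 3/2, ..., 11/2 waves; the rings crowd together toward the edge of the pupil because the aberration varies as $\rho^2$.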
Figure 8-12 shows interferograms when the lens system L under test suffers from
3λ of a primary aberration, corresponding to 6λ of an aberration of the interfering test
beam. In our discussion, we give the value of an aberration coefficient in wavelength
units, rather than in radians, as is customary in optics. For defocus and spherical
aberration, the interference pattern consists of concentric circular interference fringes.
The fringe spacing depends on the type of aberration. Figure 8-12a shows the
interferogram obtained when the system is aberration free but is misfocused, i.e., when its
focus F lies to the left or right of the center of curvature C of the spherical mirror M2 by
an amount corresponding to 3λ of the defocus aberration. [See Eqs. (8-14) for the
relationship between the longitudinal defocus, i.e., the axial spacing between F and C,
and the peak defocus aberration Bd, which is 3λ in our example.]
Figure 8-12b shows the interferograms obtained when the system has 3λ of
spherical aberration (i.e., As = 3λ) and a certain amount of defocus. The case Bd = 0
(i.e., coincident F and C) represents such a system with an image of a certain object being
observed in its Gaussian or paraxial image plane. Similarly, the interferogram obtained
for Bd/As = −2 represents the system when the image is observed in its marginal image
plane. For a system with a positive spherical aberration, its marginal focus lies farther
from it than its paraxial focus (see Figure 8-11). Thus, this interferogram is obtained
when points F and C are separated from each other axially, according to Eq. (8-10b), by
48λF², i.e., when F lies to the left of C by 48λF². The other two interferograms,
Bd = −As and Bd = −1.5As, represent the system when the image is observed in the
minimum-aberration-variance plane and the circle-of-least-confusion plane, respectively.
Figure 8-12c shows the interferograms obtained when light is incident at a certain
angle from the axis of the system so that it suffers from 3λ of coma. The fringes in this
case are cubic curves. The case Bt = 0 corresponds to two parallel interfering beams (F
and C are coincident in this case). The case Bt = −2Ac/3 represents the system
corresponding to a minimum aberration variance. A tilt aberration with a peak value of
Bt may be obtained by transversely displacing C from F by (−2FBt, 0). It may also be
obtained by tilting the plane mirror M1 by an angle Bt/a, where a is the radius of the test
beam [see Eq. (8-15) and note the factors of 2 resulting from the reflection of the
reference beam by mirror M1 and the doubling of the system aberration in the test beam].
Figure 8-12. Interferograms of primary aberrations: (a) defocus Bd ρ², (b) spherical
aberration combined with defocus As ρ⁴ + Bd ρ², (c) coma combined with tilt
(Ac ρ³ + Bt ρ) cos θ, and (d) astigmatism combined with defocus Aa ρ² cos²θ + Bd ρ². The
aberrations in the interferograms are twice their corresponding values in the system
under test, because the test beam goes through the system twice.
Figure 8-12d shows the interferograms obtained when the system suffers from 3λ of
astigmatism. When Bd = 0 or −Aa, representing the system with an image being
observed in a plane containing one or the other astigmatic focal line, respectively, we
obtain an interferogram with straight-line fringes because the aberration then depends on
either x or y (but not both). However, the fringe spacing is not uniform. When
Bd = −Aa/2, the fringe pattern consists of rectangular hyperbolas. If the system under
test is aberration free but the two interfering beams are tilted with respect to each other,
representing a wavefront tilt error, we obtain straight-line fringes that are uniformly
spaced. The fringe spacing is inversely proportional to the angle between the two beams.
8.11.3 Random Aberrations

So far we have discussed interferograms of primary aberrations when only one of
them is present. These interferograms are relatively simple, and the aberration type may
be recognized from the shape of the fringes. It should be evident that a general aberration
consisting of a mixture of these aberrations and/or others will yield a much more complex
interferogram. As an example of a general aberration, Figure 8-13a shows a possible
aberration introduced by atmospheric turbulence as in ground-based astronomical
observations. The ratio of the telescope diameter D and the atmospheric coherence length
r0 is 10. The interferogram for this aberration is shown in Figure 8-13b. When 25λ of tilt
are added to the aberration, the interferogram appears as in Figure 8-13c. Doubling of the
aberration, as in a Twyman–Green interferometer, is not considered in Figure 8-13. If the
aberration were zero, then Figure 8-13a would appear as a plane, Figure 8-13b as
uniformly bright, and Figure 8-13c as uniformly spaced straight lines.
Figure 8-13. Aberration introduced by atmospheric turbulence with D/r0 = 10. Its
standard deviation is σW = 0.4λ. (a) Aberration shape. (b) Aberration interferogram (no tilt).
(c) Interferogram with 25λ of tilt.

8.12.1 Wave and Ray Aberrations
If W(x, y) is the wave aberration of a ray at a point (x, y) on the reference sphere
for a certain point object, then the ray aberration (xi, yi) representing the coordinates of
the point of its intersection with the Gaussian image plane with respect to the Gaussian
image point is given by
$$ (x_i, y_i) = \frac{R}{n_i} \left( \frac{\partial W}{\partial x}, \frac{\partial W}{\partial y} \right) , \tag{8-123} $$
where R is the radius of curvature of the reference sphere with respect to which the
aberration is defined, and ni is the refractive index of the image space. For a radial
aberration W(r), the distance ri of the intersection point from the Gaussian image point
is given by
$$ r_i = \frac{R}{n_i} \frac{\partial W}{\partial r} . \tag{8-124} $$
The wave aberration of a ray is positive if it has to travel a longer optical path length,
compared to the chief ray, in order to reach the Gaussian reference sphere [1].
8.12.2 Wavefront Defocus Aberration

If the Gaussian image lies at a distance R from the plane of the exit pupil of radius a,
but the image is observed in a plane at a distance z, its defocus aberration is given by
$$ W(\rho) = B_d \rho^2 , \tag{8-125} $$
where
$$ B_d = \frac{n_i}{2} \left( \frac{1}{z} - \frac{1}{R} \right) a^2 \tag{8-126a} $$
$$ \simeq - \frac{n_i \Delta_R}{8 F^2} \quad \text{for } z \simeq R \tag{8-126b} $$
is the peak defocus aberration. Here, ρ = r/a is the normalized distance of a point in the
plane of the exit pupil, and F = R/2a is the focal ratio of the image-forming light cone.
The quantity ΔR = z − R is called the longitudinal defocus. The defocus wave aberration and the
longitudinal defocus have numerically opposite signs.
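As a rough illustration of how close the approximation of Eq. (8-126b) is to Eq. (8-126a), the sketch below evaluates both for one set of numbers. The function names and the numerical values (an f/5 cone observed 0.1 mm beyond the Gaussian image) are hypothetical and chosen only for illustration.

```python
def defocus_coefficient(z, R, a, n_i=1.0):
    """Peak defocus aberration, Eq. (8-126a): (n_i/2) * (1/z - 1/R) * a**2."""
    return 0.5 * n_i * (1.0 / z - 1.0 / R) * a**2

def defocus_coefficient_approx(z, R, a, n_i=1.0):
    """Eq. (8-126b), valid for z close to R: -n_i * (z - R) / (8 * F**2)."""
    F = R / (2.0 * a)                     # focal ratio of the image-forming cone
    return -n_i * (z - R) / (8.0 * F**2)

# Hypothetical values in millimeters: R = 100, a = 10 (F = 5), z = R + 0.1.
R, a, z = 100.0, 10.0, 100.1
exact = defocus_coefficient(z, R, a)
approx = defocus_coefficient_approx(z, R, a)
relative_error = abs(exact - approx) / abs(approx)
```

Both values are negative for z > R, and in this example they differ by about R/z − 1, i.e., roughly one part in a thousand.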
8.12.3 Wavefront Tilt Aberration

A wavefront tilt of a small angle β corresponds to a wave aberration of
$$ W(\rho, \theta) = B_t \rho \cos\theta , \tag{8-127a} $$
where
$$ B_t = n_i a \beta \tag{8-127b} $$
is the peak value of the aberration.
8.12.4 Primary Aberrations

The degree or order of a classical primary (or Seidel) wave aberration in the
coordinates of the object and pupil points is four. There are five primary aberrations, and
their form, in terms of their dependence on the pupil coordinates (ρ, θ), can be written
$$ W(\rho, \theta) = \begin{cases} A_s \rho^4 , & \text{Spherical} \\ A_c \rho^3 \cos\theta , & \text{Coma} \\ A_a \rho^2 \cos^2\theta , & \text{Astigmatism} \\ A_d \rho^2 , & \text{Field curvature} \\ A_t \rho \cos\theta , & \text{Distortion} , \end{cases} \tag{8-128} $$
where Ai represents the peak value of an aberration and contains the dependence on the
object point location. The primary wave aberrations of a multisurface system are additive
in the sense that they can be obtained by adding the primary wave aberrations of the
surfaces, where the Gaussian image of a point object formed by one surface becomes the
point object for the next surface.
8.12.5 Strehl Ratio and Aberration Balancing

A measure of the quality of an image is its Strehl ratio, which represents the ratio of
the central irradiances of the diffraction image of a point object with and without
aberration. For small aberrations, its value is approximately given by [2]
$$ S \simeq \exp(-\sigma_\Phi^2) , \tag{8-129} $$
where σΦ² is the variance of the phase aberration across the exit pupil of the imaging system.
The variance of an aberration can be reduced by balancing it with one or more aberrations
of the same and/or lower order, thereby increasing the Strehl ratio. The primary
aberrations with and without balancing are listed in Tables 8-3 and 8-4, respectively.
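Eq. (8-129) is easy to evaluate and to invert. The sketch below is illustrative only; the function names are not from the text, and the conversion from an rms wave aberration in waves to a phase sigma in radians (multiplication by 2π) is the usual one.

```python
import math

def strehl_ratio(sigma_w_waves):
    """Approximate Strehl ratio, Eq. (8-129), from the rms wave aberration in waves."""
    sigma_phi = 2.0 * math.pi * sigma_w_waves     # phase sigma in radians
    return math.exp(-sigma_phi**2)

def sigma_for_strehl(strehl):
    """Invert Eq. (8-129): rms wave aberration (in waves) for a target Strehl ratio."""
    return math.sqrt(-math.log(strehl)) / (2.0 * math.pi)

s = strehl_ratio(1.0 / 14.0)      # rms aberration of one-fourteenth of a wave
tol = sigma_for_strehl(0.8)       # rms aberration allowed for a Strehl ratio of 0.8
```

Under this approximation, an rms aberration of about λ/14 gives a Strehl ratio close to 0.8, which is why that figure is often quoted as a tolerance.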
8.12.6 Zernike Circle Polynomials
8.12.6.1 Use of Zernike Polynomials in Wavefront Analysis

The Zernike circle polynomials are used in wavefront analysis because they are
orthogonal over a circular pupil, and represent balanced classical aberrations for such
pupils.
8.12.6.2 Polynomials in Optical Design

The aberration function W(ρ, θ) for a system with a circular exit pupil can be
expanded in terms of a complete set of Zernike circle polynomials Z_n^m(ρ, θ) in the form
$$ W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \tag{8-130} $$
where c_nm is an expansion coefficient, and n and m are positive integers, including zero,
such that n − m ≥ 0 and even. The radial and angular dependence of the polynomials is
given by
$$ Z_n^m(\rho, \theta) = \left[ \frac{2(n+1)}{1 + \delta_{m0}} \right]^{1/2} R_n^m(\rho) \cos m\theta , \tag{8-131} $$
where R_n^m(ρ) is a radial polynomial of degree n in ρ containing terms in ρ^n, ρ^(n−2), …,
and ρ^m.
The polynomials Z_n^m(ρ, θ) are orthonormal to each other according to
$$ \frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta) \, \rho \, d\rho \, d\theta = \delta_{nn'} \delta_{mm'} . \tag{8-132} $$
The polynomials are ordered such that a polynomial with a lower value of n is ordered
first, and for a given value of n, a polynomial with a lower value of m is ordered first. The
polynomials through n = 8 and m = 0 are given in Table 8-5. The variance of the
aberration function is equal to the sum of the squares of the orthonormal expansion
coefficients c_nm, except c_00:
$$ \sigma_W^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{8-133} $$
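The orthonormality of Eq. (8-132) can be verified numerically for polynomials with the same m. The sketch below assumes the standard factorial series for the radial polynomial, which is not quoted in this section; the function names and the midpoint-rule integration are illustrative choices. Polynomials with different m are orthogonal through the angular integral and are not tested here.

```python
import math

def radial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho) from the standard factorial series."""
    total = 0.0
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total

def norm(n, m):
    """Normalization factor of Eq. (8-131): sqrt(2(n+1)/(1 + delta_m0))."""
    return math.sqrt(2.0 * (n + 1) / (2.0 if m == 0 else 1.0))

def inner_product(n1, n2, m, steps=2000):
    """Left-hand side of Eq. (8-132) for two polynomials with the same m.

    The angular integral of cos^2(m*theta), divided by pi, is 2 for m = 0
    and 1 otherwise; the radial integral uses the midpoint rule.
    """
    angular = 2.0 if m == 0 else 1.0
    h = 1.0 / steps
    radial_sum = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        radial_sum += radial(n1, m, rho) * radial(n2, m, rho) * rho * h
    return norm(n1, m) * norm(n2, m) * angular * radial_sum
```

For example, inner_product(4, 4, 0) and inner_product(3, 3, 1) evaluate to approximately 1, while inner_product(2, 4, 0) and inner_product(1, 3, 1) are approximately 0.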
8.12.6.3 Zernike Primary Aberrations
The Zernike or orthogonal primary aberrations have the form
$$ W(\rho, \theta) = \begin{cases} c_{40} \sqrt{5} \, (6\rho^4 - 6\rho^2 + 1) , & \text{Spherical} \\ c_{31} \sqrt{8} \, (3\rho^3 - 2\rho) \cos\theta , & \text{Coma} \\ c_{22} \sqrt{6} \, \rho^2 \cos 2\theta , & \text{Astigmatism} \\ c_{20} \sqrt{3} \, (2\rho^2 - 1) , & \text{Field curvature} \\ c_{11} \, 2\rho \cos\theta , & \text{Distortion} , \end{cases} \tag{8-134} $$
where c_ij is the aberration coefficient. The aberrations in this form are orthonormal over
a unit circular pupil, and the standard deviation of an aberration is given by c_ij. The
Zernike spherical aberration consists of Seidel spherical aberration and an equal and
opposite amount of defocus. The Zernike coma consists of Seidel coma and a tilt of −2/3
the amount of coma. The Zernike astigmatism consists of Seidel astigmatism and a defocus
of −1/2 the amount of astigmatism. An aberration balanced in this manner yields the minimum
standard deviation but not necessarily the minimum spot radius, as discussed in Chapter
9.
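The gain from balancing can be made concrete by comparing the variance of Seidel spherical aberration ρ⁴ with that of its balanced form ρ⁴ − ρ² over a uniformly weighted unit circular pupil. The sketch below is an illustration, not the author's calculation; it represents a radial polynomial as a hypothetical {power: coefficient} dictionary and uses the pupil mean of ρ^k, which is 2/(k + 2).

```python
from fractions import Fraction

def disc_mean(poly):
    """Mean over the unit disc of a radial polynomial given as {power: coefficient}.

    Uses <rho**k> = 2/(k + 2) for a uniformly weighted circular pupil.
    """
    return sum(c * Fraction(2, k + 2) for k, c in poly.items())

def square(poly):
    """Product of a radial polynomial with itself."""
    out = {}
    for k1, c1 in poly.items():
        for k2, c2 in poly.items():
            out[k1 + k2] = out.get(k1 + k2, 0) + c1 * c2
    return out

def variance(poly):
    """Variance over the pupil: mean of the square minus the square of the mean."""
    return disc_mean(square(poly)) - disc_mean(poly) ** 2

seidel = {4: Fraction(1)}                        # rho^4
balanced = {4: Fraction(1), 2: Fraction(-1)}     # rho^4 - rho^2
var_seidel = variance(seidel)
var_balanced = variance(balanced)
```

In this example the variance drops from 4/45 to 1/180, so the standard deviation is reduced by a factor of 4 when an equal and opposite amount of defocus is added.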
8.12.6.4 Polynomials in Optical Testing

Because the aberrations introduced by fabrication errors or atmospheric turbulence
are random in nature, we need both the cosine and the sine Zernike circle polynomials to
express them. It is convenient in such cases to write their form and numbering as:
$$ Z_{\text{even } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \cos m\theta , \quad m \neq 0 , $$
$$ Z_{\text{odd } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \sin m\theta , \quad m \neq 0 , \tag{8-135} $$
$$ Z_j(\rho, \theta) = \sqrt{n+1} \, R_n^0(\rho) , \quad m = 0 . $$
An even number is associated with a cosine polynomial, and an odd number with a sine
polynomial. The polynomials are orthonormal over a unit circular pupil according to
$$ \int_0^1 \int_0^{2\pi} Z_j(\rho, \theta) Z_{j'}(\rho, \theta) \, \rho \, d\rho \, d\theta \bigg/ \int_0^1 \int_0^{2\pi} \rho \, d\rho \, d\theta = \delta_{jj'} . \tag{8-136} $$
The polynomials are ordered such that an even j corresponds to a symmetric polynomial
varying as cos mθ, and an odd j corresponds to an antisymmetric polynomial varying as
sin mθ. A polynomial with a lower value of n is ordered first, and for a given value of n, a
polynomial with a lower value of m is ordered first. The first 45 orthonormal Zernike
polynomials are listed in Table 8-6.
An aberration function W(ρ, θ) across a unit pupil representing the aberrations
resulting from fabrication errors or atmospheric turbulence can be expanded in terms of
the polynomials Z_j(ρ, θ) in the form
$$ W(\rho, \theta) = \sum_{j=1}^{J} a_j Z_j(\rho, \theta) , \tag{8-137} $$
where a_j is an expansion coefficient, and we have truncated the polynomials at a
maximum value J of j. The expansion coefficients are given by
$$ a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_j(\rho, \theta) \, \rho \, d\rho \, d\theta . \tag{8-138} $$
The value of a coefficient a_j is independent of the number J of the polynomials used in
Eq. (8-48) for the expansion of the aberration function. Thus, one or more polynomial
terms can be added to or subtracted from the aberration function without affecting the
value of the coefficients of the other polynomials in the expansion. The variance of the
aberration function is given by
$$ \sigma_W^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-139} $$
8.12.6.5 Isometric and Interferometric Characteristics

The P-V numbers of Zernike polynomials are given in Table 8-7. For m ≠ 0, they
are given by 2√[2(n + 1)]. When m = 0 and n/2 is even, they are given by (1 − b)√(n + 1),
where b is the extreme negative value of R_n^0(ρ) as ρ varies between 0 and 1. However,
when m = 0 and n/2 is odd, they are given by 2√(n + 1). The isometric plots of the
Zernike primary aberrations are shown in Figure 8-8. The P-V numbers of a polynomial
representing the fabrication errors give a measure of the depth of material to be removed
in the fabrication process.
8.12.7 Relationship between Zernike and Seidel Coefficients
The forms of the Seidel aberrations in terms of their dependence on the pupil coordinates
(ρ, θ), and the corresponding balanced and Zernike aberrations, are given in Table 8-10.
See the Appendix on how to combine cosine and sine aberration terms to obtain a Seidel
aberration at a certain angle.
Table 8-10. Seidel aberrations and the corresponding balanced aberrations and
Zernike polynomials
Seidel Aberration          Balanced Aberration      Zernike Polynomial
Spherical, ρ⁴              ρ⁴ − ρ²                  Z_4^0 = √5 (6ρ⁴ − 6ρ² + 1)
Coma, ρ³ cos θ             (ρ³ − 2ρ/3) cos θ        Z_3^1 = √8 (3ρ³ − 2ρ) cos θ
Astigmatism, ρ² cos²θ      ρ² (cos²θ − 1/2)         Z_2^2 = √6 ρ² cos 2θ
Field curvature, ρ²                                 Z_2^0 = √3 (2ρ² − 1)
Distortion, ρ cos θ                                 Z_1^1 = 2ρ cos θ
8.12.8 Aberrations of an Anamorphic System

An anamorphic imaging system has only two pairs of Gaussian conjugates,
compared to an infinite number for a rotationally symmetric imaging system. It is
assumed that the aperture stop lies in the image space of the system so that it is also its
exit pupil.
The aberration function of an anamorphic system depends on the object and pupil
coordinates (p, q) and (x, y), respectively, through six reflection invariants p², q², x²,
y², px, and qy, compared to three rotational invariants p² + q², x² + y², and px + qy
in the case of a rotationally symmetric system. Its aberration terms are separable in the
pupil coordinates. The degree of an aberration term is even, and the aberration function
accordingly consists of aberrations of even orders only. There are 16 primary aberrations
[see Eq. (8-109)], as opposed to only five for a rotationally symmetric system [see Eq. (2-16)].
The orthonormal polynomials Q_j(x, y) representing balanced aberrations are
products of the Legendre polynomials L_l(x) and L_m(y) in the x and y variables,
respectively, as in Table 8-9, where the x and y coordinates are normalized by the
half-widths (a, b) of the rectangular pupil. They are inherently separable in the Cartesian
coordinates of a pupil point. For each polynomial L_l(x)L_m(y), there is a corresponding
polynomial L_m(x)L_l(y). If l and m are the degrees of the x- and y-Legendre polynomials,
then the degree or the order n of the orthonormal polynomial obtained by their product is
n = l + m. There are n + 1 polynomials of a certain order n. The polynomials for a square
pupil can be obtained from the rectangular polynomials by letting a = b.

Same n Value and Varying as cos mθ and sin mθ
If two Zernike polynomial aberrations with the same value of n and varying as
cos mθ and sin mθ are present simultaneously with sigma values a_j and b_j, we can
write their sum in the form
$$ \begin{aligned} W(\rho, \theta) &= a_j Z_{\text{even } j}(\rho, \theta) + b_j Z_{\text{odd } j}(\rho, \theta) \\ &= \sqrt{2(n+1)} \, R_n^m(\rho) \left( a_j \cos m\theta + b_j \sin m\theta \right) \\ &= \sqrt{2(n+1)} \, R_n^m(\rho) \left( a_j^2 + b_j^2 \right)^{1/2} \cos\left\{ m \left[ \theta - (1/m) \tan^{-1}(b_j / a_j) \right] \right\} . \end{aligned} \tag{8A-1} $$
It represents an aberration of the form cos mθ with a sigma value of (a_j² + b_j²)^(1/2), except
that its orientation is different by an angle (1/m) tan⁻¹(b_j/a_j). Thus, the orientation of
the PSF also changes by this angle. It is evident from Eq. (8A-1) that if b_j = 0, then the
angle of orientation is zero, indicating a cos mθ polynomial. Similarly, if a_j = 0, then
the angle is π/2m, indicating a sin mθ polynomial.
It is easy to see that when both a_j and b_j are negative, (a_j² + b_j²)^(1/2) in Eq. (8A-1)
must be replaced by −(a_j² + b_j²)^(1/2). However, when one of the coefficients is positive and
the other is negative, then tan⁻¹(b_j/a_j) of a negative argument has two solutions: a
negative acute angle or that angle increased by π. The choice is made depending on
whether a_j or b_j is negative according to
$$ \tan^{-1}(b_j / a_j) = \begin{cases} - \tan^{-1} \left| b_j / a_j \right| & \text{for positive } a_j \text{ and negative } b_j \qquad \text{(8A-2a)} \\ \pi - \tan^{-1} \left| b_j / a_j \right| & \text{for negative } a_j \text{ and positive } b_j . \qquad \text{(8A-2b)} \end{cases} $$
An alternative when a_j is negative is to let the angle be −tan⁻¹|b_j/a_j|, as when a_j is
positive, but also replace (a_j² + b_j²)^(1/2) with −(a_j² + b_j²)^(1/2).
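The combination of a cosine and a sine term into a single rotated cosine term, as in Eq. (8A-1), can be sketched in a few lines. This is an illustration under an assumed convention rather than the text's own procedure: it uses the two-argument arctangent, which picks the quadrant from the signs of both coefficients and always returns a nonnegative sigma, whereas the text also describes the equivalent choice of an acute angle with a negative sigma. The function name is hypothetical.

```python
import math

def combine(a_j, b_j, m):
    """Return (sigma, angle) such that
    a_j*cos(m*t) + b_j*sin(m*t) == sigma*cos(m*(t - angle)) for all t."""
    sigma = math.hypot(a_j, b_j)            # (a_j^2 + b_j^2)^(1/2)
    angle = math.atan2(b_j, a_j) / m        # quadrant-aware (1/m) arctan(b_j/a_j)
    return sigma, angle

# Hypothetical coefficients: negative cosine term, positive sine term, m = 2.
a_j, b_j, m = -1.0, 2.0, 2
sigma, angle = combine(a_j, b_j, m)
t = 0.3
lhs = a_j * math.cos(m * t) + b_j * math.sin(m * t)
rhs = sigma * math.cos(m * (t - angle))
```

For a negative a_j and a positive b_j, the angle returned before division by m is π minus the acute angle, and for a positive a_j and a negative b_j it is the negative acute angle.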
REFERENCES
1. As with the sign convention in Gaussian optics, different authors use different
sign conventions for the wave aberration associated with a ray. We have assigned
a positive sign to a ray that travels a longer optical path length compared to that of
the chief ray to reach the reference sphere. This convention is used, for example in
M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999). Some authors give it a negative sign; see, for example, W. T.
Welford, Aberrations of the Symmetrical Optical System (Academic Press, New
York, 1974). They assign a negative sign when the wavefront at the ray lags the
reference sphere, and a positive sign when it leads. As a result, their equation for
the transverse ray aberration has a minus sign on the right-hand side of Eq. (8-5),
because a ray must end up at the same point in the image plane regardless of the
sign convention used for the wave aberration. In practice, it does not matter which
sign convention is used as long as it is used consistently.

SPIE Press, Bellingham, WA (1998) [doi: 10.1117/3.265735].

Optics, 2nd ed. SPIE Press, Bellingham, WA (2011) [doi: 10.1117/3.898443].
4. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular

polynomial aberrations,” Appl. Opt. 52, 2062–2074 (2013).
5. V. N. Mahajan and W. H. Swantner, “Seidel coefficients in optical testing,” Asian

J. Phys. 15, 203–209 (2006).
6. V. N. Mahajan, “Orthonormal aberration polynomials for anamorphic optical

imaging systems with rectangular pupils,” Appl. Opt. 49, 6924–6929 (2010).
7. F. Liu, B. M. Robinson, and J. M. Geary, “Analyzing optics on rectangular

apertures using 2-D Chebyshev polynomials,” Opt. Eng. 50 (4), 043609 (April
2011) [doi: 10.1117/1.3569692].
8. For a detailed discussion of different methods of aberration measurement, see D.

Malacara, Ed., Optical Shop Testing, 3rd ed., John Wiley and Sons, New York
(2007).
PROBLEMS
8.1 Show that the defocus wave aberration introduced by a lens of image-space focal
length f′ is given by W(r) = −(ni/2f′) r², where ni is the refractive index of the
image space.
8.2 The field curvature aberration of an imaging system may be written W(r)
= ad h′² r², where ad is the aberration coefficient, and h′ is the height of a
Gaussian image point. Show that the effect of the aberration is eliminated if the
image is observed on a spherical surface of radius of curvature 1/(4 ad R²) passing
through a corresponding axial Gaussian image point, where R is the radius of
curvature of the reference sphere with respect to which the aberration is defined.
The refractive index of the image space is assumed to be unity.
8.3 If the Gaussian image of an object is formed at infinity by an imaging system at an

angle from its optical axis, but the system suffers from field curvature according
to W (r ) = bd 2 r 2 , what is the distance at which the image rays come to focus?
8.4 Consider an imaging system suffering from distortion aberration given by W(r, θ)
= at h′³ r cos θ, where at is the aberration coefficient, and h′ is the height of a
Gaussian image point. Determine the height of the actual image point.
8.5 Show that the Seidel coma ρ³ cos θ balanced with tilt ρ cos θ, and Seidel
astigmatism ρ² cos²θ balanced with defocus for minimum variance across a
circular pupil have the form [ρ³ − (2/3)ρ] cos θ, and (1/2)ρ² cos 2θ, respectively.
8.6 Consider the primary (Seidel) aberration function
$$ W_P(\rho, \theta) = a_{11} \rho \cos\theta + a_{20} \rho^2 + a_{22} \rho^2 \cos^2\theta + a_{31} \rho^3 \cos\theta + a_{40} \rho^4 . $$
(a) Write it in terms of Zernike circle polynomials.
(b) Determine its mean value.
(c) Determine its rms value and its standard deviation.
8.7 Consider a three-mirror system. Determine the fabrication tolerance for the mirrors
for a Strehl ratio of 0.8. For simplicity, assume that each mirror has the same figure
tolerance.
CHAPTER 9
SPOT SIZES AND DIAGRAMS
9.1 Introduction
9.2 Theory
9.3 Application to Primary Aberrations
9.3.1 Spherical Aberration
9.3.2 Coma
9.3.3 Astigmatism and Field Curvature
9.3.4 Field Curvature and Depth of Focus
9.3.5 Distortion
9.4 Balanced Aberrations for the Minimum Spot Sigma
9.5 Spot Diagrams
9.6 Aberration Tolerance and a Golden Rule of Optical Design
9.7.1 Spherical Aberration
9.7.2 Coma
9.7.3 Astigmatism and Field Curvature
9.7.4 Field Curvature and Defocus
9.7.5 Distortion
9.7.6 Aberration Tolerance
9.7.7 A Golden Rule of Optical Design
References
Problems
9.1 INTRODUCTION
In Chapter 4 we developed paraxial ray-tracing equations to determine the Gaussian
imaging properties of a system, vignetting of rays, size of imaging elements and
apertures, and obscurations in mirror systems. In the paraxial approximation, all rays
from a point object that are transmitted by the system pass through the Gaussian image
point. However, when the rays are traced according to the exact laws of geometrical optics,
they generally intersect in the vicinity of the Gaussian image point. The ray distribution
on an observation surface, as depicted by the intersection points, is called the spot
diagram, and its extent is called the spot size.
In this chapter, we discuss the distribution of rays in the image of a point object
aberrated by a primary aberration. The density of rays over an observation surface is
called the geometrical point-spread function. We define its centroid and sigma value, and
calculate them for primary aberrations, without explicitly calculating the ray density
distribution [1]. In the case of spherical aberration and astigmatism, the ray distribution
and spot size are also considered in image planes other than the Gaussian, thereby
introducing the concept of aberration balancing. In the early stages of the design of an
optical imaging system, one often considers its transverse ray aberrations in an image
plane for a set of rays lying along a certain line in the plane of the exit pupil and passing
through its center. Such a set of rays is called a ray fan. We illustrate the wave and ray
aberrations for ray fans along the x and y axes. Also discussed are the balanced
aberrations for the minimum spot sigma in terms of Zernike circle polynomials.
Aberration tolerances, including the depth of focus, and a golden rule of optical design
are discussed. The characteristics of the ray spots and tolerance for primary aberrations
are summarized in the last section.
9.2 THEORY
Consider an optical system consisting of a series of rotationally symmetric coaxial
refracting and/or reflecting surfaces imaging a point object P lying at a height h along the
x axis, as in Figure 8-3. The primary aberration function at the exit pupil of the system
may be written
$$ W(r, \theta; h') = a_s r^4 + a_c h' r^3 \cos\theta + a_a h'^2 r^2 \cos^2\theta + a_d h'^2 r^2 + a_t h'^3 r \cos\theta , \tag{9-1} $$
where (r, θ) are the polar coordinates of a point in the xy plane of the exit pupil, h′ is
the height of the Gaussian image point P′, and as, ac, aa, ad, and at represent the
coefficients of spherical aberration, coma, astigmatism, field curvature, and distortion,
respectively. The angle θ is equal to zero or π for points lying in the tangential or
meridional plane (i.e., the zx plane containing the optical axis and the point object and,
therefore, its Gaussian image). The chief ray, which by definition passes through the
center of the exit pupil, always lies in this plane. The plane normal to the tangential plane
but containing the chief ray is called the sagittal plane. The angle θ is equal to π/2 or
3π/2 for points lying in the sagittal plane. As the chief ray bends when it is refracted or
reflected by a surface, so does the sagittal plane. The rays lying in the tangential plane are
referred to as the tangential ray fan and those lying in the sagittal plane are referred to as
the sagittal ray fan. For an optical system with a circular exit pupil, say of radius a, it is
convenient to use normalized coordinates (ρ, θ), where ρ = r/a, 0 ≤ ρ ≤ 1, 0 ≤ θ < 2π,
suppress the explicit dependence on h′, and write the aberration function in the form
$$ W(\rho, \theta) = A_s \rho^4 + A_c \rho^3 \cos\theta + A_a \rho^2 \cos^2\theta + A_d \rho^2 + A_t \rho \cos\theta , \tag{9-2} $$
where the new aberration coefficients Ai are related to the ai used in Eq. (9-1) according
to
$$ A_s = a_s a^4 , \quad A_c = a_c h' a^3 , \quad A_a = a_a h'^2 a^2 , \quad A_d = a_d h'^2 a^2 , \quad A_t = a_t h'^3 a . \tag{9-3} $$
If (x, y) represent the rectangular coordinates of a pupil point, the corresponding
normalized coordinates (ξ, η) are given by
$$ (\xi, \eta) = \frac{1}{a} (x, y) \tag{9-4a} $$
$$ = \rho (\cos\theta, \sin\theta) , \tag{9-4b} $$
where −1 ≤ ξ ≤ 1, −1 ≤ η ≤ 1, and ξ² + η² = ρ² ≤ 1. The aberration function defined in
the form of Eq. (9-2) has the advantage that an aberration coefficient Ai has the
dimensions of length (i.e., dimensions of the wave aberration), and represents the peak or
the maximum value of the corresponding primary aberration. For example, if As = 1 λ,
where λ is the wavelength of the object radiation, we speak of one wave of spherical
aberration.
In rectangular coordinates, Eq. (9-2) for the primary aberration function may be
written
$$ W(\xi, \eta) = A_s (\xi^2 + \eta^2)^2 + A_c \xi (\xi^2 + \eta^2) + A_a \xi^2 + A_d (\xi^2 + \eta^2) + A_t \xi . \tag{9-5} $$
An aberration term is even in pupil coordinates if W(−ξ, −η) = W(ξ, η); it is odd if
W(−ξ, −η) = −W(ξ, η). Among the five primary aberrations in Eq. (9-5), only coma and
distortion are odd aberrations; the other three aberrations are even. Of course, spherical
aberration and field curvature are radially symmetric.
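The even/odd classification can be confirmed by evaluating each term of Eq. (9-5) at a pupil point and at its reflection through the origin. The sketch below is illustrative; the function names, the unit coefficients, and the sample point are assumptions made for the example.

```python
def primary_terms(xi, eta, A_s=1.0, A_c=1.0, A_a=1.0, A_d=1.0, A_t=1.0):
    """The five terms of Eq. (9-5) at the normalized pupil point (xi, eta)."""
    r2 = xi * xi + eta * eta
    return {
        "spherical": A_s * r2 * r2,
        "coma": A_c * xi * r2,
        "astigmatism": A_a * xi * xi,
        "field curvature": A_d * r2,
        "distortion": A_t * xi,
    }

def parity(name, xi=0.3, eta=-0.4):
    """Classify a term as 'even' or 'odd' under (xi, eta) -> (-xi, -eta)."""
    w = primary_terms(xi, eta)[name]
    w_flipped = primary_terms(-xi, -eta)[name]
    if w_flipped == w:
        return "even"
    if w_flipped == -w:
        return "odd"
    return "neither"

parities = {name: parity(name) for name in primary_terms(0.3, -0.4)}
```

The resulting dictionary marks coma and distortion as odd and the other three terms as even.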
The ray distribution on an observation surface is called the ray spot diagram, and
the density of rays (i.e., the number of rays per unit area) over the surface is called the
geometrical point-spread function (PSF). If the system is aberration free, then the
wavefront at the exit pupil is spherical, and all of the object rays transmitted by the
system converge to the Gaussian image point. When the wavefront is aberrated, a ray
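passing through a point (ξ, η) or (r, θ) in the plane of the exit pupil intersects the
Gaussian image plane at a point (xi, yi), which, following Eq. (8-5), may be written
$$ (x_i, y_i) = 2F \left( \frac{\partial W}{\partial \xi}, \frac{\partial W}{\partial \eta} \right) \tag{9-6a} $$
$$ = 2F \left( \cos\theta \frac{\partial W}{\partial \rho} - \frac{\sin\theta}{\rho} \frac{\partial W}{\partial \theta} , \ \sin\theta \frac{\partial W}{\partial \rho} + \frac{\cos\theta}{\rho} \frac{\partial W}{\partial \theta} \right) , \tag{9-6b} $$
where F = R/2a is the focal ratio of the image-forming light cone. Here, R is the radius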
of curvature of the Gaussian reference sphere with respect to which the aberration
W (ρ, θ) is defined, and ( xi , yi ) are the coordinates of the point of intersection of the ray
in the Gaussian image plane with respect to the Gaussian image point and represent its
ray aberrations. The reference sphere is centered at the Gaussian image point and, like the
aberrated wavefront, passes through the center of the exit pupil. In Eqs. (9-6), we have
assumed that the refractive index ni of the image space is unity because it is often the
case in practice.
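Eq. (9-6a) lends itself to a simple numerical check: the ray aberration is 2F times the gradient of W with respect to the normalized pupil coordinates. The sketch below estimates the gradient by central differences; the function name, the step size, and the astigmatism example with unit coefficient are illustrative assumptions, and the image-space refractive index is taken as unity as in the text.

```python
def ray_aberration(W, xi, eta, F, h=1e-6):
    """(x_i, y_i) of Eq. (9-6a) from central differences of W(xi, eta); n_i = 1 assumed."""
    dW_dxi = (W(xi + h, eta) - W(xi - h, eta)) / (2.0 * h)
    dW_deta = (W(xi, eta + h) - W(xi, eta - h)) / (2.0 * h)
    return 2.0 * F * dW_dxi, 2.0 * F * dW_deta

# Hypothetical example: astigmatism W = A_a * xi**2 with A_a = 1 and F = 10.
A_a, F = 1.0, 10.0
x_i, y_i = ray_aberration(lambda xi, eta: A_a * xi**2, 0.5, 0.2, F)
```

For this aberration the rays are displaced only along x, by 4 F A_a ξ, so the image of a point is a line in this plane.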
For a radially symmetric aberration, i.e., one for which W(ρ, θ) = W(ρ), we note
from Eq. (9-6b) that the PSF is also radially symmetric. The radial distance ri of a ray
from the Gaussian image point in that case is given by
$$ r_i = \left( x_i^2 + y_i^2 \right)^{1/2} = 2F \left| \frac{\partial W(\rho)}{\partial \rho} \right| , \tag{9-7} $$
where the vertical bars ensure that ri is a numerically positive quantity.
An element of area dS p = dxdy centered at a point ( x, y) in the plane of the exit

pupil is mapped into an element of area dSi = dxi dyi centered at the point ( xi , yi ) in the
image plane. If I p ( x, y) represents the irradiance or the density of rays at the point ( x, y)
in the pupil plane, then the irradiance or the density of rays Ig ( xi , yi ) at a point ( xi , yi ) in
the image plane is given by
I g ( xi , yi ) dxi dyi = I p ( x, y) dxdy . (9-8)
The centroid (or the center of gravity) of a PSF is given by
$$ (x_c, y_c) = \left( \langle x_i \rangle , \langle y_i \rangle \right) = \frac{\iint (x_i, y_i) \, I_g(x_i, y_i) \, dx_i \, dy_i}{\iint I_g(x_i, y_i) \, dx_i \, dy_i} , \tag{9-9} $$
where the angular brackets indicate a mean value. However, it can be obtained in a
simple manner by substituting Eqs. (9-6a) and (9-8) into Eq. (9-9). Thus, for a uniformly
illuminated pupil, i.e., for a constant I p ( x, y) , say I p , we may write
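$$ (x_c, y_c) = 2F \, \frac{\iint \left( \dfrac{\partial W}{\partial \xi}, \dfrac{\partial W}{\partial \eta} \right) d\xi \, d\eta}{\iint d\xi \, d\eta} = \frac{2F}{\pi} \iint \left( \frac{\partial W}{\partial \xi}, \frac{\partial W}{\partial \eta} \right) d\xi \, d\eta . \tag{9-10} $$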
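As a worked illustration of Eq. (9-10), the sketch below integrates the gradient of the coma term of Eq. (9-5), W = A_c ξ(ξ² + η²), over a uniformly illuminated unit pupil. The function name, the polar midpoint-rule grid, and the numerical values are assumptions chosen for the example.

```python
import math

def centroid(dW_dxi, dW_deta, F, n_rho=400, n_theta=360):
    """Centroid of Eq. (9-10) for a uniform circular pupil (midpoint rule in polar form)."""
    sx = sy = 0.0
    d_rho, d_theta = 1.0 / n_rho, 2.0 * math.pi / n_theta
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            xi, eta = rho * math.cos(theta), rho * math.sin(theta)
            area = rho * d_rho * d_theta          # element d(xi) d(eta) in polar form
            sx += dW_dxi(xi, eta) * area
            sy += dW_deta(xi, eta) * area
    return 2.0 * F * sx / math.pi, 2.0 * F * sy / math.pi

# Hypothetical values: unit coma coefficient and an f/10 image-forming cone.
A_c, F = 1.0, 10.0
x_c, y_c = centroid(lambda xi, eta: A_c * (3 * xi**2 + eta**2),   # dW/d(xi) for coma
                    lambda xi, eta: 2 * A_c * xi * eta,           # dW/d(eta) for coma
                    F)
```

In this example the integral evaluates to a centroid at approximately (2 F A_c, 0), i.e., the coma spot is displaced along the x axis, as expected for an odd aberration.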
The standard deviation of the image distribution or the spot sigma is given by
$$ \sigma_s = \left\langle (x_i - x_c)^2 + (y_i - y_c)^2 \right\rangle^{1/2} \tag{9-11a} $$
$$ = 2F \left\{ \frac{1}{\pi} \iint \left[ \left( \frac{\partial W}{\partial \xi} - \frac{x_c}{2F} \right)^2 + \left( \frac{\partial W}{\partial \eta} - \frac{y_c}{2F} \right)^2 \right] d\xi \, d\eta \right\}^{1/2} . \tag{9-11b} $$
For a symmetric aberration such as astigmatism, the PSF is symmetric, and the centroid
lies at the origin, i.e., (xc, yc) = (0, 0). The spot sigma in such cases is equal to the root
mean square radius. Substituting Eq. (9-7) for a radially symmetric aberration, Eq. (9-11b)
reduces to
$$ \sigma_s = 2\sqrt{2} \, F \left[ \int_0^1 \left( \frac{\partial W}{\partial \rho} \right)^2 \rho \, d\rho \right]^{1/2} . \tag{9-11c} $$
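The radial form of the spot sigma in Eq. (9-11c) can be evaluated with a one-dimensional integral over the normalized pupil radius from 0 to 1. The sketch below applies it to spherical aberration W = A_s ρ⁴ observed in the Gaussian image plane; the function name, the midpoint rule, and the numerical values are illustrative assumptions.

```python
import math

def spot_sigma_radial(dW_drho, F, steps=2000):
    """Spot sigma of Eq. (9-11c): 2*sqrt(2)*F*[integral_0^1 (dW/drho)^2 rho drho]^(1/2)."""
    h = 1.0 / steps
    integral = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h                       # midpoint of the k-th radial interval
        integral += dW_drho(rho) ** 2 * rho * h
    return 2.0 * math.sqrt(2.0) * F * math.sqrt(integral)

# Hypothetical values: unit spherical aberration and an f/10 image-forming cone.
A_s, F = 1.0, 10.0
sigma_gauss = spot_sigma_radial(lambda rho: 4.0 * A_s * rho**3, F)
```

For this case the integral is 2 A_s², so the sketch returns approximately 4 F A_s, half of the marginal-ray radius 8 F A_s.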
Next, we discuss the characteristics of an image aberrated by a primary aberration.

To be definite, we assume that each aberration coefficient Ai is positive, unless stated
otherwise. If two or more of these aberrations are present simultaneously, the image
coordinates ( xi , yi ) of a ray are given by the sum of the coordinates for each aberration.
9.3 APPLICATION TO PRIMARY ABERRATIONS

In this section we discuss the shapes and sizes of PSFs for primary aberrations and
uniform pupil illumination I p . The concept of aberration balancing is introduced,
whereby a given aberration is mixed with another to reduce the spot size. The wave and
ray aberrations for tangential and sagittal ray fans are also considered.
9.3.1 Spherical Aberration

Consider a wavefront aberrated by a spherical aberration
$$ W(\rho) = A_s \rho^4 \tag{9-12} $$
with respect to a reference sphere centered at the Gaussian image point P0′ of an axial
point object P0 . Substituting Eq. (9-12) into Eq. (9-7), we find that a ray of zone ρ in the
plane of the exit pupil intersects the Gaussian image plane at a distance
ri = 8 FAs ρ3 (9-13)
from P0′. Thus, the rays lying on a circle of radius ρ in the exit pupil lie on a circle of
radius ri given by Eq. (9-13) in the Gaussian image plane. The maximum value of ri is
8FAs and corresponds to rays with ρ = 1, i.e., it corresponds to the marginal rays. We
refer to the maximum value of ri as the radius of the image spot. For an off-axis point
object, because As is independent of the height h of the point object from the optical axis,
the ray distribution owing to spherical aberration alone is also independent of h.
Let us consider the ray distribution in a slightly defocused image plane by introducing
a defocus aberration Bd . The aberration with respect to a new reference sphere
centered at a defocused point lying at a distance z from the plane of the exit pupil may be
written
$$ W(\rho) = A_s \rho^4 + B_d \rho^2 , \tag{9-14} $$
where the defocus coefficient Bd is given by [see Eq. (8-10)]
$$ B_d = \frac{1}{2} \left( \frac{1}{z} - \frac{1}{R} \right) a^2 \tag{9-15a} $$
$$ \sim -\frac{\Delta R}{8F^2} . \tag{9-15b} $$
Here, Δ R = z − R , and Eq. (9-15b) follows from Eq. (9-15a) when z ~ R . Note that Bd
is numerically negative for z > R , i.e., if the defocused image plane lies farther from the
exit pupil than the Gaussian image plane, or the longitudinal defocus Δ R is positive.
Figure 9-1 shows how the wave aberration given by Eq. (9-14) varies across the exit pupil
for values of Bd corresponding to paraxial ( Bd = 0) , marginal ( Bd = − 2 As ) , midway
( Bd = − As ) , and least-confusion ( Bd = − 1.5 As ) image planes. The names of the image
planes given here will become clear from what follows. We note that for the negative
values of Bd considered here, the aberration is negative everywhere except at the center
of the pupil and, when Bd = −As , at its edge, where it is zero.
The rays of zone ρ now lie in the defocused image plane on a circle of radius
$$ r_i = 8FA_s \left| \rho^3 + \frac{B_d}{2A_s}\, \rho \right| . \tag{9-16} $$
The circle in the image plane is traced out in the same sense as in the pupil plane as θ
varies from 0 to 2π to complete a circle of rays. In a given image plane, i.e., for a given
value of Bd , the maximum value of ri as ρ varies from 0 to 1 is the spot radius in that
plane. It occurs either at the stationary value of ρ obtained by letting ∂ri/∂ρ = 0 or at the
end value ρ = 1. We note that ri = 0 at the other end point ρ = 0, implying that the chief
ray passes through the center of the image. When Bd is negative, ri = 0 also for rays
with ρ = (−Bd/2As)^{1/2}.
Figure 9-1. Variation of spherical aberration across the exit pupil in units of As
combined with different amounts of defocus Bd . Aberration variance is minimum
when Bd = −As . [Plot of W(ρ)/As = ρ⁴ + (Bd/As)ρ² versus ρ for Bd/As = 0, −1, −1.5, and −2.]
How ri varies with ρ is shown in Figure 9-2 for the values of Bd considered above.
We note that only when Bd = 0 does a given value of ri correspond to a single value of
ρ. When Bd = −2As , there are two different values of ρ lying between zero and one that
correspond to a given value of ri , i.e., rays lying on two different circles in the pupil
plane lie on the same circle in the image plane. When Bd = −As or Bd = −3As/2, there
are three different values of ρ lying between zero and one that correspond to a given
value of ri for 0 < ri < 1/(3√6) or 0 < ri < 1/4, respectively (ri in units of 8FAs ), i.e.,
rays lying on three different circles in the pupil plane lie on the same circle in the image
plane. A circle of rays with a larger value of ri up to ri = 1/2 corresponds to only one
circle of rays in the pupil plane when Bd = −As . There are two circles of rays in the pupil
plane with ρ = 1/2 and 1 that correspond to ri = 1/4 when Bd = −3As/2.
Figure 9-2. Radius ri of a circle of rays in units of 8FAs in various image planes
characterized by the value of Bd as a function of corresponding radius ρ in the
pupil plane. The spot radius is minimum when Bd = −1.5 As . [Plot of ri versus ρ for
Bd/As = 0, −1, −1.5, and −2.]
For the marginal rays, i.e., for ρ = 1, ri → 0 if Bd = −2As . From Eq. (9-15), we
find that the marginal rays intersect the axis at a distance
$$ \Delta R = -8F^2 B_d \tag{9-17a} $$
$$ = 16F^2 A_s \tag{9-17b} $$
from P0′ . A positive value of Δ R implies that, compared with the old reference sphere,
the new reference sphere is centered at a point that is farther from the center of the exit
pupil, or that the defocused image plane lies farther from the exit pupil than the Gaussian
image plane. Thus, the point of intersection M of the marginal rays lies to the right of P0′ ,
as shown in Figure 9-3. This is to be expected because as may be seen from Figure 9-3,
the wavefront W is less curved than the reference sphere S for positive values of As . The
points P0′ and M are called the Gaussian or paraxial (meaning for very small values of ρ)
and the marginal image points, respectively. Substituting Bd = −2As into Eq. (9-16),
we find that the maximum value of ri in the marginal image plane occurs for rays of zone
ρ = 1/√3. This maximum value, i.e., the spot radius, is 2/(3√3) (or 0.385) times the
corresponding value in the Gaussian image plane. Thus, the marginal spot radius is
considerably smaller than the paraxial spot radius. The quantity ΔR given by Eq. (9-17b)
is called the longitudinal spherical aberration. It represents the distance of the marginal
image point from the Gaussian image point. If we consider the variation of longitudinal
spherical aberration with ρ, i.e., if we determine the distance from P0′ of the point where
the rays of a zone ρ intersect the optical axis, we find from Eqs. (9-15b) and (9-16) that
it varies quadratically with ρ according to
$$ \Delta R = 16F^2 A_s \rho^2 . \tag{9-17c} $$
Figure 9-3. Ray spot radii in various image planes for a wavefront W aberrated by
spherical aberration. G – Gaussian or paraxial, M – marginal, MW – midway, LC –
least confusion. The reference sphere S is centered at a Gaussian image point P0′ .
[Ray diagram showing the exit pupil ExP, marginal rays MR, chief ray CR, the reference
sphere S of radius R, the image planes G, MW, LC, and M with relative spot radii 1, 0.5,
0.25, and 0.385, and the longitudinal spherical aberration.]
The image plane MW lying midway between the Gaussian and marginal planes
corresponds to Bd = −As . The spot radius in this plane is half of that in the Gaussian
image plane G and corresponds to marginal rays. The image plane that has the smallest
spot radius corresponds to that value of Bd that minimizes the maximum value of ri as ρ
varies from 0 to 1 in Eq. (9-16). It is evident from Eq. (9-16) that Bd must be negative; a
positive value of Bd can only increase the value of ri for any value of ρ. The value of ρ
corresponding to the spot radius is either ρ1 = √(c/6), obtained by letting ∂ri/∂ρ = 0,
where c = −Bd/As , or ρ2 = 1. In units of 8FAs , the corresponding values of the spot
radius are r1 = c^{3/2}/(3√6) and r2 = |1 − c/2|, respectively. Figure 9-4 shows that r1
increases monotonically as c increases, but r2 first decreases, approaches zero as c → 2,
and then increases monotonically. The value of c that gives the minimum spot radius is
the one obtained by letting r1 = r2 . This equality yields a cubic equation in c with
solutions c = 6, 6, and 3/2. The value 3/2 yields the minimum spot radius. Thus, the spot
radius is minimum in a plane LC (for least confusion) corresponding to Bd = −3As/2,
i.e., a plane that is 3/4 of the way from the Gaussian image plane to the marginal image
plane. The spot radius in this case is 1/4 of the Gaussian spot radius and corresponds to
the rays of zone ρ = 1/2 and 1. This spot is called the circle of least confusion. The spot
radii in the various image planes considered here are listed in Table 9-1. Note that they
increase linearly with F and As .
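The spot radii quoted above can be checked by brute force. The following Python sketch (function and variable names are illustrative assumptions, not from the text) evaluates the maximum of the ray radius of Eq. (9-16), in units of 8FAs, over the pupil for a given c = −Bd/As, and then searches a grid of c values for the smallest spot.

```python
import numpy as np

def spot_radius(c, n=4001):
    """Spot radius in units of 8*F*As for c = -Bd/As: max of |rho^3 - (c/2) rho| on [0, 1]."""
    rho = np.linspace(0.0, 1.0, n)
    return float(np.max(np.abs(rho**3 - 0.5 * c * rho)))

# Image planes named in the text.
for name, c in [("Gaussian", 0.0), ("midway", 1.0),
                ("least confusion", 1.5), ("marginal", 2.0)]:
    print(f"{name:16s} c = {c:3.1f}  spot radius = {spot_radius(c):.3f}")

# Brute-force search for the defocus that minimizes the spot radius.
c_grid = np.linspace(0.0, 2.0, 2001)
radii = np.array([spot_radius(c) for c in c_grid])
c_best = float(c_grid[np.argmin(radii)])
r_best = float(radii.min())
print(f"minimum spot radius {r_best:.3f} at c = {c_best:.3f}")
```

The search is a plain grid scan rather than a root solve of the cubic, so it locates the minimum only to within the grid spacing in c.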
Because of the radial symmetry of spherical aberration, the wave and ray aberrations
of any ray fan can be written immediately from Eqs. (9-14) and (9-16), respectively. For
example, for the tangential ray fan, i.e., for the η = 0 rays, we may write
$$ W(\xi, 0) = A_s \left[ \xi^4 + (B_d / A_s)\, \xi^2 \right] \tag{9-18a} $$
and
$$ (x_i, y_i) = 8FA_s \left[ \xi^3 + (B_d / 2A_s)\, \xi ,\ 0 \right] . \tag{9-18b} $$
Figure 9-5 shows how the wave and ray aberrations of a ray fan for spherical aberration
vary with ξ for the various defocus values listed in Table 9-1.
Figure 9-4. Variation of image spot radius with c = −Bd/As . [Plot of r1 and r2 versus c
for 0 ≤ c ≤ 8.]
Figure 9-5. Wave and ray aberrations for a ray fan for spherical aberration
corresponding to various image planes. The wave aberration is in units of As , and
the ray aberration is in units of FAs . [Two panels: W(ξ, 0) versus ξ and xi versus ξ, each
for Bd/As = 0, −1, −3/2, and −2.]
Table 9-1. Ray spot sizes in units of 8FAs for peak spherical aberration As .
Image Plane          Balancing Defocus Bd/As   Spot Radius rimax   Spot Sigma σs
Gaussian             0                         1                   0.5
Marginal             −2                        0.385               0.289
Midway               −1                        0.5                 0.204
Minimum spot sigma   −4/3                      1/3                 0.167
Least confusion      −3/2                      0.25                0.177
Because of the radial symmetry of the aberration, the centroid of the PSF lies at the
Gaussian image point (0, 0). Substituting Eq. (9-16) into Eq. (9-11c), we obtain the image
spot sigma
$$ \sigma_s = 8FA_s \left[ \frac{1}{4} + \frac{1}{3} \frac{B_d}{A_s} + \frac{1}{2} \left( \frac{B_d}{2A_s} \right)^2 \right]^{1/2} . \tag{9-19} $$
Letting
$$ \frac{\partial \sigma_s}{\partial B_d} = 0 , \tag{9-20} $$
we find that σs is minimum when Bd = −(4/3)As . Its value is equal to 4FAs/3,
compared with its value of 4FAs in the Gaussian image plane. We note that σs is
minimum in a plane that is different from the least-confusion plane in which the spot
radius is minimum. The values of σs in various image planes are listed in Table 9-1. The
variation of σs with defocus is shown in Figure 9-6.
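As a numerical cross-check of Eq. (9-19), the Python sketch below compares the closed-form spot sigma with a direct radial average of the squared ray radius of Eq. (9-16) over a uniformly illuminated pupil, both in units of 8FAs. The function names, the variable b = Bd/As, and the sample counts are assumptions made for illustration.

```python
import numpy as np

def sigma_closed_form(b):
    """Eq. (9-19) in units of 8*F*As, with b = Bd/As."""
    return np.sqrt(0.25 + b / 3.0 + 0.5 * (b / 2.0) ** 2)

def sigma_numeric(b, n=200000):
    """RMS ray radius from Eq. (9-16), averaged over a uniformly illuminated pupil."""
    rho = (np.arange(n) + 0.5) / n             # midpoints of n equal radial intervals
    r = rho**3 + 0.5 * b * rho                 # signed ray radius in units of 8*F*As
    return np.sqrt(np.mean(r**2 * 2.0 * rho))  # area weight 2*rho*d(rho) on the unit disk

for b in (0.0, -1.0, -4.0 / 3.0, -1.5, -2.0):
    print(f"Bd/As = {b:6.3f}  closed form {sigma_closed_form(b):.3f}  "
          f"numeric {sigma_numeric(b):.3f}")

# Locate the minimum of the closed-form sigma on a grid of defocus values.
b_grid = np.linspace(-2.0, 0.0, 2001)
b_min = float(b_grid[np.argmin(sigma_closed_form(b_grid))])
print("sigma is smallest near Bd/As =", round(b_min, 3))
```

The five defocus values correspond to the rows of Table 9-1, so the printed numbers can be compared directly with its last column.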
The deliberate mixing of one aberration with one or more other aberrations to reduce
the spot size is called aberration balancing. Here, we have balanced spherical aberration
with defocus in order to minimize the spot radius or its sigma value. The amount of
defocus that gives the smallest ray spot or its sigma value may be called the optimum
defocus based on geometrical optics. The balanced aberration giving the smallest ray spot
is As[ρ⁴ − (3/2)ρ²]. Similarly, the balanced aberration that gives the smallest spot
sigma is As[ρ⁴ − (4/3)ρ²]. Based on diffraction, the optimum amount of defocus
corresponds to the midway plane, because in that case it is used to reduce the variance of
the aberration across the exit pupil, i.e., the balanced aberration giving minimum
variance is As(ρ⁴ − ρ²), similar to the Zernike circle polynomial Z_4^0(ρ) [see Tables 8-4
and 8-5].
Figure 9-6. Variation of σs in units of 8FAs for spherical aberration with defocus.
[Plot of σs versus Bd/As from −2 to 0.]
9.3.2 Coma
The coma wave aberration is given by
$$ W(\rho, \theta) = A_c \rho^3 \cos\theta , \tag{9-21a} $$
or
$$ W(\xi, \eta) = A_c\, \xi \left( \xi^2 + \eta^2 \right) . \tag{9-21b} $$
Substituting Eq. (9-21) into Eqs. (9-6), we obtain the corresponding ray aberrations in the
Gaussian image plane with respect to the Gaussian image point:
$$ (x_i, y_i) = 2FA_c \rho^2 \left( 2 + \cos 2\theta ,\ \sin 2\theta \right) \tag{9-22a} $$
$$ = 2FA_c \left( \rho^2 + 2\xi^2 ,\ 2\xi\eta \right) . \tag{9-22b} $$
For a given value of ρ, the locus of the points of intersection of the rays in the Gaussian
image plane is given by
$$ \left( x_i - 4FA_c \rho^2 \right)^2 + y_i^2 = \left( 2FA_c \rho^2 \right)^2 . \tag{9-23} $$
Thus, the rays coming from a circle of radius ρ in the exit pupil lie on a circle of radius
2FAcρ² centered at (4FAcρ², 0) in the image plane. The circle in the image plane is
traced out twice in the same sense as in the pupil plane as θ varies from 0 to 2π to
complete a circle of rays. As illustrated in Figure 9-7, because CB/CP′ = 1/2, all of the
rays in the image plane are contained in a cone with a semiangle of 30° bounded by a
circle of radius 2FAc centered at (4FAc , 0) corresponding to the marginal rays. Here, C
is the center of the circle formed by the marginal rays, and P′A and P′B are tangents to
the circle. The vertex of the cone, of course, coincides with the Gaussian image point P′.
Only the chief ray passes through P′. Rays in the image plane corresponding to a zone of
ρ = 1/2 are also shown in the figure. They lie on a circle of radius FAc/2 centered at
(FAc , 0) in the image plane. Because the spot diagram has the shape of a comet, the
aberration is appropriately called coma. Note that the tangential marginal rays
MRt (ρ = 1, θ = 0, π) intersect this plane at a point T at a distance 6FAc from P′ along
the xi axis, and the sagittal marginal rays MRs (ρ = 1, θ = π/2, 3π/2) intersect the image
plane at a point S at a distance 2FAc from P′. Accordingly, the length 6FAc and half-
width 2FAc of the coma pattern are called tangential and sagittal coma, respectively.
Figure 9-7. Ray spot diagram for coma. The tangential marginal rays MRt are
focused at the point T, and the sagittal marginal rays MRs are focused at the point
S. All of the rays in the image plane lie in a cone with a semiangle of 30° and its
vertex at the Gaussian image point P′ bounded by the upper arc of a circle of
radius 2FAc centered at (4FAc , 0). The cone angle is 30° because CB/CP′ = 1/2.
[Diagram showing the exit pupil ExP with rays MRt, MRs, and CR, and the spot diagram in
the image plane with the circles of the ρ = 1 and ρ = 1/2 rays, the points C, A, B, S, T,
and P′, the lengths 2FAc and 4FAc, and the 30° semiangle.]
According to Eq. (9-21b), the wave aberration for the tangential ray fan is given by
$$ W_t(\xi, 0) = A_c \xi^3 . \tag{9-24} $$
It is zero for the sagittal ray fan. The ray aberrations given by Eq. (9-22b) may be written
for the two types of rays in the form
$$ (x_i, y_i)_t = \left( 6FA_c \xi^2 ,\ 0 \right) \tag{9-25a} $$
and
$$ (x_i, y_i)_s = \left( 2FA_c \eta^2 ,\ 0 \right) . \tag{9-25b} $$
We note that even though the wave aberration of the rays in the sagittal fan is zero, their
ray aberration is not; the rays are displaced along the x (or ξ) axis in the image plane.
Figure 9-8 shows the variation of wave and ray aberrations with pupil coordinates. We
note that the wave aberration is odd and the ray aberration is even in pupil coordinates.
Of course, this is also evident from Eqs. (9-21b) and (9-22b).
Because the PSF is highly asymmetric about the Gaussian image point P ′ , its
centroid does not lie at it. Substituting Eq. (9-22b) into Eq. (9-10), we obtain the
coordinates of the centroid
(xc , yc ) = (2FAc , 0) . (9-26)
Figure 9-8. Wave and ray aberrations for tangential and sagittal ray fans for coma.
The wave aberration is in units of Ac , and the ray aberration is in units of FAc . The
wave aberration is zero for the sagittal ray fan. [Two panels: W(ξ, 0) versus ξ, and xi(ξ)
and xi(η) versus ξ, η.]
Thus, the centroid lies at the point S in Figure 9-7 where the sagittal marginal rays
intersect the image plane. Substituting Eqs. (9-22a) and (9-26) into Eq. (9-11a), we obtain
the image spot sigma:
$$ \sigma_s = 2FA_c \left\langle \left[ \rho^2 (2 + \cos 2\theta) - 1 \right]^2 + \rho^4 \sin^2 2\theta \right\rangle^{1/2} = 2\sqrt{2/3}\, FA_c . \tag{9-27} $$
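The coma results can be reproduced with a small ray-spot computation. The Python sketch below (names and grid size are illustrative assumptions) samples the pupil uniformly, maps each ray to the Gaussian image plane with Eq. (9-22b), and estimates the centroid, the spot sigma, and the semiangle of the cone containing the rays, in units of FAc.

```python
import numpy as np

def coma_spot(F=1.0, Ac=1.0, n=801):
    """Ray coordinates in the Gaussian image plane for coma, Eq. (9-22b)."""
    u = np.linspace(-1.0, 1.0, n)
    xi, eta = np.meshgrid(u, u)
    inside = xi**2 + eta**2 <= 1.0          # uniformly sampled unit-radius pupil
    xi, eta = xi[inside], eta[inside]
    x = 2.0 * F * Ac * (xi**2 + eta**2 + 2.0 * xi**2)
    y = 2.0 * F * Ac * (2.0 * xi * eta)
    return x, y

x, y = coma_spot()
xc, yc = x.mean(), y.mean()
sigma = np.sqrt(np.mean((x - xc) ** 2 + (y - yc) ** 2))
half_angle = np.degrees(np.max(np.abs(np.arctan2(y, x))))
print(f"centroid = ({xc:.3f}, {yc:.3f}) in units of F*Ac")
print(f"spot sigma = {sigma:.3f} in units of F*Ac")
print(f"cone semiangle = {half_angle:.2f} degrees")
```

The estimates should approach (2, 0), 2√(2/3), and 30° as the grid is refined; the grid average carries an error of roughly the grid spacing.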
Measuring the ray coordinates in the image plane with respect to a point other than
the Gaussian image point is equivalent to introducing a wavefront tilt aberration in the
aberration function, which may then be written
$$ W(\rho, \theta) = A_c \rho^3 \cos\theta + B_t \rho \cos\theta , \tag{9-28} $$
where Bt is the peak value of the balancing tilt aberration and corresponds to measuring
the wave aberration with respect to a reference sphere centered at a point in the image
plane with coordinates (−2FBt , 0) or, equivalently, measuring the ray coordinates with
respect to this point. Thus, measuring the ray aberrations with respect to the centroid is
equivalent to a tilt aberration of −Acρ cos θ, or Bt = −Ac . Accordingly, the aberration
function with respect to the centroid can be written
$$ W(\rho, \theta) = A_c \left( \rho^3 - \rho \right) \cos\theta . \tag{9-29} $$
It should be evident that if the ray aberrations are measured with respect to a point
other than the centroid, including the Gaussian image point, the sigma value of the spot
will increase. The aberration function given by Eq. (9-29) represents coma aberration
balanced optimally with tilt aberration to yield a minimum value of the spot sigma, or
to bring its centroid to the Gaussian image point. However, the variance of the wave
aberration is minimum when Bt = −(2/3)Ac , i.e., if the balanced aberration is
Ac[ρ³ − (2/3)ρ]cos θ, similar to the Zernike polynomial Z_3^1(ρ, θ) [see Tables 8-4 and
8-5].
It is worth mentioning that the centroid of a PSF is associated with the line of sight of
an imaging system. Moreover, the centroid of a geometrical PSF is identical to that of the
diffraction PSF [2].
9.3.3 Astigmatism and Field Curvature

If the image of a point object is observed in a defocused plane, the aberration
function of a system aberrated by astigmatism and field curvature may be written
$$ W(\rho, \theta) = A_a \rho^2 \cos^2\theta + A_d \rho^2 + B_d \rho^2 \tag{9-30a} $$
or
$$ W(\xi, \eta) = (A_a + A_d + B_d)\, \xi^2 + (A_d + B_d)\, \eta^2 , \tag{9-30b} $$
where Aa and Ad are both proportional to h′², and the balancing defocus coefficient Bd
is related to the longitudinal defocus ΔR according to Eq. (9-17a). The corresponding ray
aberrations are given by
$$ (x_i, y_i) = 4F\rho \left[ (A_a + A_d + B_d) \cos\theta ,\ (A_d + B_d) \sin\theta \right] \tag{9-31a} $$
$$ = 4F \left[ (A_a + A_d + B_d)\, \xi ,\ (A_d + B_d)\, \eta \right] . \tag{9-31b} $$
For a given value of ρ, the locus of the points of intersection of the rays in the defocused
image plane is given by
$$ \left( \frac{x_i}{A} \right)^2 + \left( \frac{y_i}{B} \right)^2 = 1 , \tag{9-32} $$
where
$$ A = 4F (A_a + A_d + B_d)\, \rho \tag{9-33} $$
and
$$ B = 4F (A_d + B_d)\, \rho . \tag{9-34} $$
Thus, the rays lying on a circle of radius ρ in the exit pupil, in general, lie in a defocused
image plane on an ellipse whose semiaxes are given by A and B. The largest ellipse is
obtained for the marginal rays.
The Gaussian image (Bd = 0) is an elliptical spot with semiaxes 4F(Aa + Ad) and
4FAd , as illustrated in Figure 9-9. We note that if Bd = −Ad , corresponding to
ΔRs = 8F²Ad , the ellipse reduces to a line S of full length 8FAa parallel to the xi axis.
The line image is called the sagittal (or radial) image because the sagittal rays converge
to a point at its center. It lies in the tangential (or meridional) plane zx, containing the
point object (which lies along the xi axis in the object plane) and the optical axis. The
corresponding wave aberration Aaρ²cos²θ is called the sagittal astigmatism. If, however,
Bd = −(Aa + Ad), corresponding to ΔRt = 8F²(Aa + Ad), then the ellipse reduces to a
line T parallel to the yi axis. The full length of this line image is the same as that of the
line image S. This line image is called the tangential image because the tangential rays
converge to a point at its center, and it lies in the sagittal plane. The corresponding wave
aberration −Aaρ²sin²θ is called the tangential astigmatism. The distance 8F²Aa between
the two line images is called the longitudinal astigmatism. It should be evident that it is
independent of the zone value ρ of the rays. The two line images are called the
astigmatic focal lines. (The terms “radial” and “tangential” images also become evident
by consideration of Figure 9-14, where these images are shown for a point object P as
well as for straight and circular line objects.)
Figure 9-9. Astigmatic images in the presence of field curvature, showing elliptical
image spots and astigmatic focal lines. The sagittal marginal rays MRs are shown
converging on the sagittal line image S, and the tangential marginal rays MRt are
shown converging on the tangential line image T. The line images S and T, and the
circle of least confusion C, are special cases of the elliptical spots. [Diagram showing
the exit pupil ExP, the optical axis OA, the rays CR, MRs, and MRt, the image point P′, and
the images S, C, and T at longitudinal distances ΔRs, ΔRb, and ΔRt.]
If Bd = −(Aa + 2Ad)/2, corresponding to ΔRb = 4F²(Aa + 2Ad), the ellipse
reduces to a circle C of diameter 4FAa , which is half the full length of the two line
images. Because this circle is the smallest of all the possible images, Gaussian or
defocused, it is called the circle of least (astigmatic) confusion. The circle in the image
plane is traced out once in the opposite sense of that in the pupil plane as θ varies from 0
to 2π to complete a circle of rays, as may be seen from Eq. (9-31a). Substituting
Bd = −(Aa + 2Ad)/2 into Eq. (9-30a), we obtain the balanced aberration (Aa/2)ρ²cos 2θ,
similar to the Zernike polynomial Z_2^2(ρ, θ) [see Tables 8-4 and 8-5]. Astigmatism
balanced in this manner not only gives the smallest spot but also yields minimum
variance of the aberration. We will refer to the image thus obtained as the best image.
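The special image planes described above follow directly from the ellipse semiaxes of Eqs. (9-33) and (9-34). The Python sketch below tabulates the full extents of the marginal-ray ellipse and the longitudinal defocus of Eq. (9-17a) for the Gaussian, sagittal, tangential, and least-confusion planes. The numerical values of F, Aa, and Ad are hypothetical and chosen only so that the arithmetic is easy to follow; the function names are likewise illustrative.

```python
def astigmatic_spot(Aa, Ad, Bd, F):
    """Full extents of the marginal-ray ellipse along x_i and y_i (Eqs. 9-33, 9-34 with rho = 1)."""
    width_x = 8.0 * F * abs(Aa + Ad + Bd)
    width_y = 8.0 * F * abs(Ad + Bd)
    return width_x, width_y

def longitudinal_defocus(Bd, F):
    """Eq. (9-17a): Delta R = -8 F^2 Bd."""
    return -8.0 * F**2 * Bd

# Hypothetical values for illustration (any consistent length unit).
F, Aa, Ad = 4.0, 1.0, 0.5
planes = {
    "Gaussian": 0.0,
    "sagittal": -Ad,
    "tangential": -(Aa + Ad),
    "least confusion": -(Aa + 2.0 * Ad) / 2.0,
}
spots = {name: astigmatic_spot(Aa, Ad, Bd, F) for name, Bd in planes.items()}
for name, Bd in planes.items():
    wx, wy = spots[name]
    print(f"{name:16s} Bd = {Bd:5.2f}  Delta R = {longitudinal_defocus(Bd, F):6.1f}  "
          f"extent = {wx:5.1f} x {wy:5.1f}")
```

With these numbers the two line images have full length 8FAa = 32, the circle of least confusion has diameter 4FAa = 16, and the planes of the line images are separated by 8F²Aa = 128.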
Because both Aa and Ad ~ h′², the length of the sagittal and tangential line images
of a point object increases quadratically with the height h′ of the Gaussian image point.
Similarly, ΔRs , ΔRt , ΔRb , and longitudinal astigmatism increase as h′². For a line
object, equating ΔR to the sag of a curved line image, we find that the sagittal,
tangential, and best images are parabolic with the vertex radii of curvature given by
$$ R_s = \frac{h'^2}{16F^2 A_d} \tag{9-35a} $$
$$ = \frac{1}{4R^2 a_d} , \tag{9-35b} $$
$$ R_t = \frac{h'^2}{16F^2 (A_a + A_d)} \tag{9-36a} $$
$$ = \frac{1}{4R^2 (a_a + a_d)} , \tag{9-36b} $$
and
$$ R_b = \frac{h'^2}{8F^2 (A_a + 2A_d)} \tag{9-37a} $$
$$ = \frac{1}{2R^2 (a_a + 2a_d)} , \tag{9-37b} $$
respectively. Note that a positive value of Rs , for example, corresponds to positive
values of Ad and ΔRs . The images of a planar object centered on the optical axis are the
corresponding paraboloids symmetric about the optical axis.
From Eqs. (9-35b) and (9-36b) we note that
$$ \frac{3}{R_s} - \frac{1}{R_t} = 4R^2 \left( 2a_d - a_a \right) . \tag{9-38} $$
The right-hand side is also related to the radius of curvature Rp of the Petzval image, and
Eq. (9-38) may be written
$$ \frac{3}{R_s} - \frac{1}{R_t} = \frac{2}{R_p} . \tag{9-39} $$
Because the sag of a surface is inversely proportional to its (vertex) radius of curvature,
Eq. (9-39) has the consequence that, as illustrated in Figure 9-10, the Petzval surface is
three times as far from the tangential surface as it is from the sagittal surface. Moreover,
the sagittal surface always lies between the tangential and the Petzval surfaces. When
astigmatism is zero, the sagittal and the tangential surfaces reduce to the Petzval surface.
We also note from Eqs. (9-35) through (9-37) that
$$ \frac{1}{R_b} = \frac{1}{2} \left( \frac{1}{R_s} + \frac{1}{R_t} \right) , \tag{9-40} $$
i.e., the vertex curvature of the best-image surface is equal to the mean value of the vertex
curvatures of the sagittal and tangential surfaces. The best-image surface is planar when
aa = − 2 ad . In that case, Rs = − Rt , i.e., the sagittal and tangential image surfaces have
equal but opposite vertex curvatures.
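The relations among the image-surface curvatures are simple enough to verify with exact arithmetic. The Python sketch below builds the vertex curvatures from Eqs. (9-35b) through (9-37b) and checks Eqs. (9-38) and (9-40), together with the statement that the Petzval surface is three times as far from the tangential surface as from the sagittal surface. The coefficient values are hypothetical, and the function name is an illustrative choice.

```python
from fractions import Fraction

def vertex_curvatures(aa, ad, R=1):
    """Vertex curvatures 1/Rs, 1/Rt, 1/Rb from Eqs. (9-35b), (9-36b), and (9-37b)."""
    inv_Rs = 4 * R**2 * ad
    inv_Rt = 4 * R**2 * (aa + ad)
    inv_Rb = 2 * R**2 * (aa + 2 * ad)
    return inv_Rs, inv_Rt, inv_Rb

# Hypothetical coefficients; exact rational arithmetic avoids rounding.
aa, ad, R = Fraction(3, 7), Fraction(2, 5), Fraction(10)
inv_Rs, inv_Rt, inv_Rb = vertex_curvatures(aa, ad, R)

# Eq. (9-38) and Eq. (9-40).
lhs_938 = 3 * inv_Rs - inv_Rt
rhs_938 = 4 * R**2 * (2 * ad - aa)
mean_curvature = (inv_Rs + inv_Rt) / 2
print(lhs_938 == rhs_938, inv_Rb == mean_curvature)

# Eq. (9-39) gives the Petzval curvature 1/Rp = (3/Rs - 1/Rt)/2. With the sag
# proportional to the vertex curvature, the T-to-P separation is 3x the S-to-P one.
inv_Rp = lhs_938 / 2
print((inv_Rt - inv_Rp) == 3 * (inv_Rs - inv_Rp))

# Planar best-image surface when aa = -2 ad.
flat_Rs, flat_Rt, flat_Rb = vertex_curvatures(-2 * ad, ad, R)
print(flat_Rb == 0, flat_Rs == -flat_Rt)
```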
Figure 9-10. Parabolic image surfaces. S – sagittal, T – tangential, and P – Petzval.
The sagittal and tangential surfaces correspond to astigmatism, and the Petzval
surface corresponds to field curvature. The sagittal surface lies between the
tangential and Petzval surfaces, as in (a) and (b), when astigmatism is nonzero. The
Petzval surface is three times as far from the tangential surface as it is from the
sagittal surface. The sagittal and tangential surfaces coincide with the Petzval
surface, as in (c), when astigmatism is zero. P0′P′ is the Gaussian image of a planar
object. [Panels: (a) Aa < 0, (b) Aa > 0, (c) Aa = 0.]
The wave and ray aberrations of a tangential ray fan are given by Eqs. (9-30b) and
(9-31b) according to
$$ W_t(\xi, 0) = (A_a + A_d + B_d)\, \xi^2 \tag{9-41} $$
and
$$ (x_i, y_i)_t = 4F (A_a + A_d + B_d)\, (\xi, 0) , \tag{9-42} $$
respectively. Similarly, for the sagittal ray fan, they are given by
$$ W_s(0, \eta) = (A_d + B_d)\, \eta^2 \tag{9-43} $$
and
$$ (x_i, y_i)_s = 4F (A_d + B_d)\, (0, \eta) . \tag{9-44} $$
The wave and ray aberrations for Ad + Bd = 0, −Aa/2, and −Aa are illustrated in
Figure 9-11. It is evident that the wave aberration varies quadratically with a pupil
coordinate, and the ray aberration varies linearly with it.
Figure 9-11. Wave and ray aberrations for a tangential ray fan for astigmatism
corresponding to various image planes. The wave aberration is in units of Aa , and
the ray aberration is in units of FAa . [Two panels: W(ξ, 0) versus ξ and xi versus ξ, each
for (Ad + Bd)/Aa = 0, −1/2, and −1.]
The centroid of the PSF lies at the Gaussian image point (0, 0) because it is
symmetric about both the xi and yi axes. The image spot sigma may be obtained by
substituting Eq. (9-31a) into Eq. (9-11a). Thus,
$$ \sigma_s = 2FA_a \left[ 1 + 2\, \frac{A_d + B_d}{A_a} + 2 \left( \frac{A_d + B_d}{A_a} \right)^2 \right]^{1/2} . \tag{9-45} $$
The variation of σs with Ad + Bd is shown in Figure 9-12. Letting
$$ \frac{\partial \sigma_s}{\partial B_d} = 0 , \tag{9-46} $$
we find that the spot sigma is minimum and equal to √2 FAa when Ad + Bd = −Aa/2,
i.e., in the plane of the circle of least confusion, as expected for uniform irradiance. The
spot shape and size, including its σs value, in an image plane defined by the balancing
defocus are summarized in Table 9-2.
Figure 9-12. Variation of σs in units of FAa for astigmatism with Ad + Bd . [Plot of σs
versus (Ad + Bd)/Aa from −1 to 0.]
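Eq. (9-45) can be checked against a direct ray average. The Python sketch below (names and grid size are illustrative assumptions) evaluates the closed form and compares it with the RMS distance of rays from the Gaussian image point computed from Eq. (9-31b) on a uniformly sampled pupil, both in units of FAa and as functions of d = (Ad + Bd)/Aa.

```python
import numpy as np

def sigma_closed_form(d):
    """Eq. (9-45) in units of F*Aa, with d = (Ad + Bd)/Aa."""
    return 2.0 * np.sqrt(1.0 + 2.0 * d + 2.0 * d**2)

def sigma_from_rays(d, n=801):
    """RMS ray distance from the Gaussian image point using Eq. (9-31b), in units of F*Aa."""
    u = np.linspace(-1.0, 1.0, n)
    xi, eta = np.meshgrid(u, u)
    inside = xi**2 + eta**2 <= 1.0          # uniformly sampled unit-radius pupil
    x = 4.0 * (1.0 + d) * xi[inside]
    y = 4.0 * d * eta[inside]
    return float(np.sqrt(np.mean(x**2 + y**2)))

for d in (0.0, -0.5, -1.0):
    print(f"(Ad + Bd)/Aa = {d:4.1f}  Eq. (9-45): {sigma_closed_form(d):.3f}  "
          f"rays: {sigma_from_rays(d):.3f}")
```

The three values of d correspond to the sagittal, best, and tangential image planes; the closed form gives 2, √2, and 2 in units of FAa, matching the spot-sigma entries 1, 1/√2, and 1 (in units of 2FAa) of Table 9-2.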
If astigmatism is the only aberration present, i.e., if the field curvature coefficient
Ad = 0 in Eqs. (9-31), then all of the object rays transmitted by the exit pupil intersect the
Gaussian image plane on a line S of full length 8FAa along the xi axis centered at the
Gaussian image point P′, as illustrated in Figure 9-13. This is the sagittal image of a
point object. The sagittal rays converge on the Gaussian image point. Similarly, a
tangential line image T of the same full length as the sagittal line image is obtained in a
defocused image plane corresponding to Bd = −Aa . The tangential rays converge to a
point at its center. The sagittal image of a line object is also a line that is slightly longer
(by an amount 8FAa ) than but coincident with its Gaussian line image. However, its
tangential image is parabolic with a vertex radius of curvature of h′²/(16F²Aa) or
1/(4R²aa). Note that the longitudinal astigmatism in this case represents the sag of the
tangential image surface. Similarly, the sagittal image of a planar object will be planar,
but its tangential image will be paraboloidal.
Table 9-2. Ray spot shape, size, and sigma for astigmatism Aa and field curvature Ad
in various image planes defined by defocus Bd .
Image Plane   Balancing Defocus Bd   Spot Shape and Size*                         Spot Sigma σs/2FAa
General       Bd                     Elliptical, 8F(Aa + Ad + Bd) × 8F(Ad + Bd)   {1 + 2(Ad + Bd)/Aa + 2[(Ad + Bd)/Aa]²}^{1/2}
Gaussian      0                      Elliptical, 8F(Aa + Ad) × 8FAd               [1 + 2Ad/Aa + 2(Ad/Aa)²]^{1/2}
Sagittal      −Ad                    Line along xi axis, 8FAa                     1
Tangential    −(Ad + Aa)             Line along yi axis, 8FAa                     1
Best          −(Ad + Aa/2)           Circular, 4FAa                               1/√2

*Spot sizes are full major and minor axes of an elliptical image, full length of a line image, and
diameter of a circular image.
Figure 9-13. Astigmatic focal lines when only astigmatism is present. The tangential
marginal rays MRt are focused at a point on the tangential focal line T. Similarly,
the sagittal marginal rays MRs are focused at the Gaussian image point P′ on the
sagittal focal line S. The focal lines S and T lie in the tangential and sagittal planes,
respectively. The circle of least confusion C lies in a plane midway between the
planes of line images S and T. [Diagram showing the exit pupil ExP, the optical axis OA,
the rays CR, MRt, and MRs, and the images S, C, and T.]
Figure 9-14 illustrates the effect of astigmatism and field curvature on the image of a
spoked wheel where the images formed on the sagittal and tangential surfaces are shown.
A magnification of −1 is assumed in the figure. As discussed earlier, a point object P is
imaged as a sagittal or radial line Ps′ on the sagittal surface and as a tangential line Pt′ on
the tangential surface. Each point on the object is imaged in this manner, so that the
sagittal image consists of sharp radial lines and diffuse circles while the tangential image
consists of sharp circles and diffuse radial lines. If the object contains lines that are
neither radial nor tangential, they will not be sharply imaged on any surface.
Figure 9-14. Astigmatic images of a spoked wheel. Gaussian magnification of the
image is assumed to be −1. The sagittal and tangential images Ps′ and Pt′ of a point
object P are shown very much exaggerated. The dashed circles in (b) are the
Gaussian images of the object circles. [Panels: (a) object, with circles at h = 1/2 and
h = 1; (b) image on sagittal surface; (c) image on tangential surface.]
It should be understood that the astigmatism discussed here is for a system that is
rotationally symmetric about its optical axis, and its value reduces to zero for an axial
point object. It is different from the astigmatism of the eye which is caused by one or
more of its refracting surfaces, usually the cornea, that is curved more in one plane than
another. The refracting surface that is normally spherical acquires a small cylindrical
component, i.e., it becomes toric. Such a surface forms a line image of a point object even
when it lies on its axis. Thus, a person afflicted with astigmatism sees points as lines. If
the object consists of vertical and horizontal lines as in the wires of a window screen,
such a person can focus (by accommodation) only on the vertical or the horizontal lines at
a time. This is analogous to the spoked wheel example where the rim is in focus in one
observation plane and the spokes are in focus in another.
9.3.4 Field Curvature and Depth of Focus

We now consider the case when field curvature is the only aberration present, i.e.,
when the wave aberration is given by
$$ W(\rho) = A_d \rho^2 , \tag{9-47} $$
where Ad varies with the image height as h′². Because the wave aberration is radially
symmetric, the distribution of rays in the Gaussian image plane is also radially
symmetric. For rays lying on a circle of radius ρ in the exit pupil, the radius of the
corresponding circle of rays in the image plane, following Eq. (9-7), is given by
$$ r_i = 4FA_d \rho . \tag{9-48a} $$
Its maximum value of 4FAd represents the spot radius, and corresponds to the marginal
rays. The circle in the image plane is traced out in the same sense as in the pupil as θ
varies from 0 to 2π. As may be seen by substituting Eq. (9-47) into Eq. (9-11c), the spot
sigma value is given by
$$ \sigma_s = 2\sqrt{2}\, FA_d . \tag{9-48b} $$
From the discussion in Section 8.3, we note that an aberration represented by Eq. (9-47)
implies that the wavefront is spherical, but it is not centered at the Gaussian image
point. Instead, it is centered at a distance
$$ \Delta R = 8F^2 A_d \tag{9-49} $$
from the Gaussian image point along the optical axis (strictly speaking, it is centered on
the line joining the center of the exit pupil and the Gaussian image point). Because the
aberration coefficient Ad ~ h′², ΔR also increases as h′². Thus, the sagittal image of a
line object will be spherical with a vertex radius of curvature of h′²/(16F²Ad), or
1/(4R²ad). Similarly, the image of a planar object will be spherical. The spherical surface
for a system with zero astigmatism is called the Petzval image surface.
As in the case of spherical aberration, because of the radial symmetry of field

curvature, the wave and ray aberrations of any ray fan can be written immediately from
Eqs. (9-47) and (9-48), respectively. For example, for the tangential ray fan, we may
write
$$ W_t(\xi, 0) = A_d \xi^2 \tag{9-50a} $$
and
$$ (x_i, y_i)_t = \left( 4FA_d\, \xi ,\ 0 \right) . \tag{9-50b} $$
Figure 9-15 shows how the wave and ray aberrations vary with ξ. The PSF in this case
has a uniform irradiance of Ip [a²/(2RAd)]² across a circle of radius 4FAd .
A similar result is obtained when the image is observed in a defocused image plane
at a distance z. According to Eq. (9-15b), a longitudinal defocus of $\Delta_R = z - R$
introduces a defocus aberration of $B_d\rho^2$, where
$\Delta_R = 8F^2B_d$ . (9-51)
Unlike the field curvature coefficient $A_d$, the value of $B_d$ is independent of the height of
a point object. From Eq. (9-48), the spot radius is given by
$r_{i\,\max} = 4FB_d = \Delta_R/(2F)$ . (9-52)

Figure 9-15. Wave and ray aberrations of a ray fan for field curvature. The wave
aberration $W(\xi, 0)$ is in units of $A_d$, and the ray aberration $x_i$ is in units of $FA_d$.
This result can also be obtained from a simple geometry of defocus, as illustrated in
Figure 9-16. It shows the rays coming to focus at the axial image point P0′. It is seen from
the figure that, if the image is observed in a defocused image plane at a distance $z \pm \Delta_R$,
then the spot radius is given by $r_{i\,\max}/\Delta_R = a/L_i = 1/(2F)$, in agreement with Eq. (9-52).
The image quality (based on geometrical optics) is not affected as long as the spot
radius is smaller than the grain size of the film or the detector element of a photodetector
array used to record the image. Thus, the tolerable amount of longitudinal defocus, called
the depth of focus, can be determined. An alternative approach, based on the diffraction
image (instead of the ray image), is to use the Rayleigh criterion, according to which the
peak value of defocus aberration must be less than or equal to λ/4. This, in turn,
corresponds to a longitudinal defocus of $2\lambda F^2$. The corresponding tolerance on the
object position, called the depth of field, may be obtained from Eq. (2-77) for the
longitudinal magnification. Thus, the depth of field is given by $\Delta_R/M_t^2$, where $M_t$ is
the transverse magnification of the image.
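As a rough numerical illustration of Eqs. (9-51) and (9-52), the following Python sketch evaluates the depth of focus, the corresponding spot radius, and the depth of field for a quarter-wave of defocus. The function names and the numerical values (wavelength, f-number, magnification) are assumptions chosen only for illustration, and the refractive index of the image space is taken to be unity.

```python
def depth_of_focus(F, B_d):
    """Longitudinal defocus Delta_R = 8 F^2 B_d, Eq. (9-51)."""
    return 8.0 * F**2 * B_d


def spot_radius(F, B_d):
    """Geometrical spot radius 4 F B_d = Delta_R / (2F), Eq. (9-52)."""
    return 4.0 * F * B_d


def depth_of_field(delta_R, M_t):
    """Object-space tolerance Delta_R / M_t^2 (longitudinal magnification M_t^2)."""
    return delta_R / M_t**2


# Assumed illustrative values (not taken from the text)
wavelength = 0.5e-3   # mm
F = 10.0              # f-number of the image-forming light cone
M_t = -1.5            # transverse magnification

B_d = wavelength / 4                  # Rayleigh quarter-wave defocus
dR = depth_of_focus(F, B_d)           # equals 2 * wavelength * F^2
print("depth of focus (mm):", dR)
print("spot radius (mm):   ", spot_radius(F, B_d))
print("depth of field (mm):", depth_of_field(dR, M_t))
```

With these inputs the depth of focus equals $2\lambda F^2$, as stated above for the Rayleigh criterion.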
9.3.5 Distortion
The distortion wave aberration is given by
W (ρ, θ) = At ρ cosθ (9-53a)
or
W (ξ, η) = At ξ , (9-53b)
where the aberration coefficient $A_t$ is proportional to $h'^3$. The corresponding ray
aberrations are given by
$(x_i, y_i) = (2FA_t, 0)$ (9-54a)
$= (R a_t h'^3, 0)$ . (9-54b)

Figure 9-16. Depth of focus $\Delta_R$ for a spot radius $r_{i\,\max}$.

Because the ray aberrations are independent of the coordinates (ρ, θ) of a ray in the exit
pupil, all of the rays converge at the image point $(2FA_t, 0)$, which lies along the $x_i$ axis
at a distance $2FA_t$ from the Gaussian image point. Thus, a wavefront aberrated by
distortion is tilted with respect to the Gaussian reference sphere by an angle
$\beta = A_t/a$ . (9-55)
This angle is proportional to $h'^3$, and represents the line-of-sight error in the location of a
point object. Similarly, the distance $2FA_t$ of the perfect image point from the Gaussian
image point is proportional to $h'^3$. Distortion is often measured as a fraction of the image
height. Thus, for example, the percent distortion is $100\,R a_t h'^2$.
It should be noted that, although the ray aberration for distortion is independent of the
ray coordinates in the pupil plane, all of the rays converge at the point $(2FA_t, 0)$ only if
distortion is the only wave aberration present. If other wave aberrations are
present, then different rays will intersect the Gaussian image plane at different points.
The chief ray will still intersect the Gaussian image plane at the point (2 FAt , 0) because
its ray aberration due to the other wave aberrations is zero. Therefore, the ray distortion
aberration is the distance of the point where the chief ray intersects the Gaussian image
plane from the Gaussian image point, i.e., it represents the distance between the points of
intersection of the actual (within the approximation of a primary aberration) and the
paraxial chief rays in the Gaussian image plane.
In order for the distortion to be zero, the chief ray from any point in the object plane
must pass through its Gaussian image point. This has the implication that the image
magnification M must be independent of the object height. Thus, if we consider two point
objects P1 and P2 at heights h1 and h2 , as illustrated in Figure 9-17, the heights h1′ and
h2′ of their Gaussian images P1′ and P2′ must be related to each other according to
$M = \frac{h_1'}{h_1} = \frac{h_2'}{h_2}$ . (9-56)

Figure 9-17. The tangent condition for zero distortion.

Substituting for the object and image heights in terms of the slope angles of the
corresponding chief rays, we may write
$\frac{\tan\beta_1'}{\tan\beta_1} = \frac{\tan\beta_2'}{\tan\beta_2} = M\,\frac{L_o}{L_i}$ , (9-57)
where $L_o$ and $L_i$ are the object and image distances from the entrance and exit pupils,
respectively. Thus the requirement for zero distortion is that the ratio of the tangents of
the slope angles of a chief ray in the object and image spaces must be independent of the
location of the object point. The value of the ratio is given by $M(L_o/L_i)$. Equation (9-57)
is called the tangent condition for eliminating distortion. It should be noted, however, that
we have assumed that all of the chief rays in the image space of the system pass
through the center O′ of the exit pupil. This would be true only if the axial point O of the
entrance pupil is imaged perfectly at O′ . In other words, spherical aberration of the
system for pupil imaging must be zero. This may often not be the case because a system
will normally be designed to reduce the spherical aberration for imaging of the object
plane. The tangent condition is satisfied in the case of imaging by a pinhole camera
(discussed in Section 6.9) and a thin lens with a collocated aperture stop, because the
chief ray is transmitted without any deviation in both cases.
If we consider a line object L1 L2 , as illustrated in Figure 9-18, at a distance h1 from

the optical axis, its Gaussian image is also a line parallel to it at a distance h1′ from the
optical axis, where h1 and h1′ are related to each other by the Gaussian magnification of
the system (just as h and h ′ are related to each other). A magnification of − 1.5 is
assumed in the figure.
Because of distortion, the image of any point object is displaced from its Gaussian
image point by an amount 2FAt along a line joining the axial image point and the
Gaussian image point under consideration. We consider imaging of point objects P1 and
P2 that are at distances h1 and h2 , respectively, from the axial point object P0 . Their
Gaussian images P1′ and P2′ are located at distances h1′ and h2′ , respectively, from the
Gaussian image P0′ of the axial object P0 . Because of distortion, the images are displaced
to positions P1′′ and P2′′ so that the displacements P1′ P1′′ and P2′ P2′′ are proportional to
$h_1'^3$ and $h_2'^3$, respectively.
We note from similar triangles P0′P1′P2′ and P2′AP2′′ in Figure 9-18 that
$\frac{P_2'A}{h_1'} = \frac{P_2'P_2''}{h_2'} = \frac{AP_2''}{b}$ , (9-58)
where b = P1′P2′. Therefore,

$P_2'A = (h_1'/h_2')\,P_2'P_2'' = R a_t h_1' h_2'^2 = R a_t h_1'\,(h_1'^2 + b^2)$ . (9-59)

Figure 9-18. Image of a square in the presence of distortion. The dashed square is
the Gaussian image. L1′L2′ and L1′′L2′′ are the Gaussian and distorted images of the
line object L1L2, respectively. A magnification of −1.5 is assumed in the figure.

Because $P_1'P_1'' = R a_t h_1'^3$,
$P_2'A - P_1'P_1'' = R a_t h_1' b^2$ , (9-60)
which represents the sag of P2′′ from a line parallel to the Gaussian line image L1′L2′ but
passing through P1′′ . From Eq. (9-58),
$AP_2'' = (b/h_2')\,P_2'P_2'' = R a_t b\,h_2'^2$ . (9-61)
For small values of $a_t$, AP2′′ is also small; therefore, P1′′P2′′ ≈ P1′P2′ = b. From Eq. (9-60)
we note then that the sag of P2′′ is proportional to the square of its distance b from
P1′′. Thus, the locus of P2′′ represents a parabola with a vertex at P1′′ and a vertex radius
of curvature of $1/(2 R a_t h_1')$. If $a_t$ is positive, the parabolic image is curved away from the
Gaussian image line, as shown in Figure 9-18. If it is negative, the parabolic image will
be curved toward the Gaussian image line. We note from Eq. (9-60) that if the line object
intersects the optical axis so that h1′ is zero, then the sag of P2′′ is also zero. Accordingly,
the image P2′′ of a point object P2 is simply displaced along the image line. Thus, the
image of a line object intersecting the optical axis is also a line differing from the
Gaussian image line only in that it is slightly longer. This discussion can be easily
extended to obtain the distorted images of a square grid shown in Figure 9-19. It should
be evident that when At is positive, we speak of a pincushion distortion. Similarly, when
At is negative, we speak of a barrel distortion.
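The geometry of Figure 9-18 can be checked numerically. The sketch below displaces each Gaussian image point radially by $R a_t h'^3$ and compares the resulting sag and lateral stretch with Eqs. (9-60) and (9-61). The function names and the numerical values of $R a_t$, $h_1'$, and b are assumptions made only for this illustration.

```python
import math


def distort(x, y, R_at):
    """Move a Gaussian image point radially outward by R_at * h'^3 (primary distortion)."""
    h = math.hypot(x, y)
    scale = 1.0 + R_at * h**2        # (h' + R_at h'^3) / h'
    return x * scale, y * scale


def percent_distortion(h, R_at):
    """Distortion as a percentage of the image height, 100 * R_at * h'^2."""
    return 100.0 * R_at * h**2


# Assumed illustrative values (arbitrary length units)
R_at = 0.02   # product R * a_t; positive gives pincushion distortion
h1 = 1.5      # distance of the Gaussian line image from the optical axis
b = 0.8       # distance of P2' from P1' along the Gaussian line image

x1, _ = distort(h1, 0.0, R_at)     # P1''
x2, y2 = distort(h1, b, R_at)      # P2''

sag = x2 - x1                      # compare with R_at * h1 * b^2, Eq. (9-60)
stretch = y2 - b                   # compare with R_at * b * h2'^2, Eq. (9-61)
print(sag, R_at * h1 * b**2)
print(stretch, R_at * b * (h1**2 + b**2))
```

Repeating the calculation for several values of b traces the parabolic line image; a negative `R_at` reverses the sign of the sag, corresponding to barrel distortion.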
9.4 BALANCED ABERRATIONS FOR THE MINIMUM SPOT SIGMA

A balanced aberration, giving the smallest spot sigma in terms of Zernike circle
polynomials $R_n^m(\rho)\cos m\theta$, is given by $B_n^m(\rho)\cos m\theta$, where [3]
$B_n^m(\rho) = R_n^m(\rho) - R_{n-2}^m(\rho)$ . (9-62)
These polynomials are listed in Table 9-3 and may be obtained from the Zernike
polynomials given in Table 8-2. They are not orthogonal over a unit circle, but their
gradients, representing the ray aberrations, are orthogonal [4]. The polynomials
$B_4^0(\rho)$, $B_3^1(\rho)\cos\theta$, and $B_2^2(\rho)\cos 2\theta$ represent balanced spherical aberration, coma, and
astigmatism, respectively, giving a minimum spot sigma.
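Equation (9-62) is easy to evaluate symbolically. The sketch below builds the radial polynomials as dictionaries mapping the power of ρ to its coefficient. It assumes the standard closed-form expression for the unnormalized Zernike radial polynomials, with $R_n^m(1) = 1$; any normalization factors used in Table 8-2 are not included, and the function names are hypothetical.

```python
from math import factorial


def zernike_radial(n, m):
    """Coefficients {power: coeff} of the unnormalized radial polynomial R_n^m(rho).

    Returns an empty dict when R_n^m does not exist (n < m or n - m odd).
    """
    if n < m or (n - m) % 2:
        return {}
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s)
               * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = num // den   # the division is exact
    return coeffs


def balanced_radial(n, m):
    """Coefficients of B_n^m(rho) = R_n^m(rho) - R_{n-2}^m(rho), Eq. (9-62)."""
    coeffs = dict(zernike_radial(n, m))
    for power, c in zernike_radial(n - 2, m).items():
        coeffs[power] = coeffs.get(power, 0) - c
    return {p: c for p, c in coeffs.items() if c != 0}


print(balanced_radial(2, 0))   # defocus
print(balanced_radial(3, 1))   # primary coma
print(balanced_radial(4, 0))   # primary spherical
```

For n = 4 and m = 0, for example, the result corresponds to $6\rho^4 - 8\rho^2 + 2 = 2(3\rho^4 - 4\rho^2 + 1)$.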
If an aberration function is written in terms of these polynomials, e.g.,

$W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} b_{nm}\,B_n^m(\rho)\cos m\theta$ , (9-63)
then the image spot sigma is given by
$\sigma_s = 2F \left\{ \sum_{n/2=1}^{\infty} 4n\,b_{n0}^2 + \sum_{m=1}^{\infty} \left[ m\,b_{mm}^2 + \sum_{i=1}^{\infty} 2(2i+m)\,b_{2i+m,\,m}^2 \right] \right\}^{1/2}$ . (9-64)

Figure 9-19. Images of a square grid in the presence of distortion: (a) object,
(b) pincushion distortion ($A_t > 0$), and (c) barrel distortion ($A_t < 0$). When the
distortion aberration coefficient $A_t$ is positive, we obtain pincushion distortion, as in
(b). When $A_t$ is negative, we obtain barrel distortion, as in (c). The dashed
squares represent the Gaussian image of the square object with a magnification of
−1.5.

Table 9-3. Balanced wave aberration polynomials $B_n^m(\rho)\cos m\theta$ for minimum spot
sigma $\sigma_s$.

n   m   $B_n^m(\rho)\cos m\theta$                                   Balanced Aberration
0   0   $1$                                                         Piston
1   1   $\rho\cos\theta$                                            Tilt
2   0   $2(\rho^2 - 1)$                                             Defocus
2   2   $\rho^2\cos 2\theta$                                        Primary astigmatism
3   1   $3(\rho^3 - \rho)\cos\theta$                                Primary coma
3   3   $\rho^3\cos 3\theta$
4   0   $2(3\rho^4 - 4\rho^2 + 1)$                                  Primary spherical
4   2   $4(\rho^4 - \rho^2)\cos 2\theta$                            Secondary astigmatism
4   4   $\rho^4\cos 4\theta$
5   1   $5(2\rho^5 - 3\rho^3 + \rho)\cos\theta$                     Secondary coma
5   3   $5(\rho^5 - \rho^3)\cos 3\theta$
5   5   $\rho^5\cos 5\theta$
6   0   $2(10\rho^6 - 18\rho^4 + 9\rho^2 - 1)$                      Secondary spherical
6   2   $3(5\rho^6 - 8\rho^4 + 3\rho^2)\cos 2\theta$                Tertiary astigmatism
6   4   $6(\rho^6 - \rho^4)\cos 4\theta$
6   6   $\rho^6\cos 6\theta$
7   1   $7(5\rho^7 - 10\rho^5 + 6\rho^3 - \rho)\cos\theta$          Tertiary coma
7   3   $7(3\rho^7 - 5\rho^5 + 2\rho^3)\cos 3\theta$
7   5   $7(\rho^7 - \rho^5)\cos 5\theta$
7   7   $\rho^7\cos 7\theta$
8   0   $2(35\rho^8 - 80\rho^6 + 60\rho^4 - 16\rho^2 + 1)$          Tertiary spherical

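The spot sigma of Eq. (9-64) can be evaluated directly from the expansion coefficients. The following sketch is one way to do so, assuming the coefficients are supplied as a dictionary keyed by (n, m); the function name and data layout are hypothetical. The three examples reuse results quoted elsewhere in this chapter: defocus $B_d\rho^2$ written as $(B_d/2)B_2^0$ plus piston, spherical aberration optimally balanced with defocus written as $(A_s/6)B_4^0$, and coma written as $(A_c/3)B_3^1\cos\theta$ with its tilt term left out, which we take to correspond to measuring the sigma about the centroid.

```python
from math import sqrt


def spot_sigma(b, F):
    """Spot sigma from Eq. (9-64); b maps (n, m) -> coefficient b_nm."""
    total = 0.0
    for (n, m), coeff in b.items():
        if n == 0:
            continue                 # piston produces no ray aberration
        if m == 0:
            weight = 4 * n           # radially symmetric terms
        elif m == n:
            weight = m               # terms of the form rho^m cos(m theta)
        else:
            weight = 2 * n           # n = 2i + m with i >= 1
        total += weight * coeff**2
    return 2.0 * F * sqrt(total)


F = 1.0
print(spot_sigma({(2, 0): 0.5}, F))       # defocus, B_d = 1
print(spot_sigma({(4, 0): 1.0 / 6}, F))   # balanced spherical, A_s = 1
print(spot_sigma({(3, 1): 1.0 / 3}, F))   # coma without tilt, A_c = 1
```

The three values agree with $2\sqrt{2}\,FB_d$, $(4/3)FA_s$, and $2\sqrt{2/3}\,FA_c$, respectively.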
9.5 SPOT DIAGRAMS
If an optical system is aberration free, the wavefront at its exit pupil corresponding to
a certain point object is spherical, and all of the object rays lying in the pupil plane
converge to the Gaussian image point. For an aberrated system, the wavefront is
nonspherical and the rays are distributed in a finite region of an image plane. This
distribution of rays is called a spot diagram. We first illustrate mapping of the zonal rays
from the pupil plane to the image plane for a primary aberration. We consider rays from
four zones of the exit pupil, namely, ρ = 1/4, 1/2, 3/4, and 1. In Figure 9-20, the rays
from these zones are indicated by different symbols so that they can be tracked in the
image plane.
Figure 9-21 illustrates the distribution of rays for spherical aberration in the Gaussian
or paraxial ($B_d = 0$), midway ($B_d = -A_s$), least-confusion ($B_d = -\tfrac{3}{2}A_s$), and
marginal ($B_d = -2A_s$) planes. We note that in the plane of least confusion, rays from
zones ρ = 1/2 and 1 arrive on the same circle. By definition, the marginal rays (ρ = 1)
intersect the optical axis at the marginal image point. The spot radius in the marginal
image plane corresponds to rays of zone $\rho = 1/\sqrt{3} = 0.577$, and they are indicated by Δ in
the figure.
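The zonal behavior described above can be reproduced with a few lines of Python. The sketch assumes that the radial ray aberration is $2F\,\partial W/\partial\rho$, which is consistent with Eq. (9-48a), so that for $W = A_s\rho^4 + B_d\rho^2$ the rays of zone ρ lie on a circle of radius $|8\rho^3 + 4(B_d/A_s)\rho|$ in units of $FA_s$. The function name is hypothetical.

```python
import math


def zonal_radius(rho, bd_over_as):
    """Radius (in units of F*A_s) of the image-plane circle for pupil zone rho."""
    return abs(8.0 * rho**3 + 4.0 * bd_over_as * rho)


planes = {
    "Gaussian": 0.0,
    "midway": -1.0,
    "least confusion": -1.5,
    "marginal": -2.0,
}
zones = (0.25, 0.5, 0.75, 1.0)

for name, bd in planes.items():
    radii = [round(zonal_radius(rho, bd), 3) for rho in zones]
    print(f"{name:16s}", radii)

# Zone giving the spot radius in the marginal image plane
rho_max = 1.0 / math.sqrt(3.0)
print(round(rho_max, 3), round(zonal_radius(rho_max, -2.0), 3))
```

In the least-confusion plane the zones ρ = 1/2 and ρ = 1 give the same radius, and in the marginal plane the marginal rays return to the axis while the zone $\rho = 1/\sqrt{3}$ sets the spot radius.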
Figure 9-22 illustrates the distribution of rays for coma in the Gaussian image plane.
As in Figure 9-7, all rays lie in a cone of semiangle of 30° bounded by a circle of
marginal rays of radius 2FAc centered at ( 4 FAc , 0) .
Figure 9-23 illustrates the ray distribution of various images for astigmatism. The
images shown are the (a) sagittal line, (b) least-confusion circle, (c) tangential line, and
(d) ellipse that is symmetrically opposite the least-confusion circle. The value of $B_d$
for these images is given by $(A_d + B_d)/A_a = 0$, $-1/2$, $-1$, and $1/2$, respectively.
The ray distribution for field curvature alone in the Gaussian image plane is identical
to that for astigmatism in the plane of least confusion if $B_d = A_a/2$. Comparing Figures
9-21a, 9-22, and 9-23b, we note that rays of a given zone ρ lie on a circle whose
radius is proportional to $\rho^3$ in the case of spherical aberration, $\rho^2$ in the case of coma,
and ρ in the case of astigmatism in the least-confusion image plane. They also lie on a
circle whose radius is proportional to ρ in the case of field curvature or defocus. Note,
however, that the circles are not concentric in the case of coma; they are centered at
points along its symmetry axis at distances from the Gaussian image point that vary as
$\rho^2$.

Figure 9-20. Zonal rays in the pupil plane corresponding to four zones: ρ = 1/4, 1/2,
3/4, and 1.

Figure 9-21. Ray distribution for spherical aberration in (a) Gaussian, (b) midway,
(c) least-confusion, and (d) marginal image planes. The units of $x_i$ and $y_i$ are $FA_s$.

In practice, the spot diagrams are obtained by tracing an array of object rays through
a system and determining their points of intersection with the image plane. They give a
qualitative description of the effects of an aberration. They do not, for example, bring out
the singularities of infinite irradiance of the aberrated PSFs, which are fortunately unreal
physically. A designer generally starts with rays that are distributed in a certain grid
pattern in the plane of the entrance pupil of the system. Figure 9-24 shows the ray grid
patterns in the pupil plane that are commonly used in practice. In Figure 9-24a, the rays
are distributed in a uniformly spaced square array, whereas in Figure 9-24b they are
distributed in a hexapolar array.
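A minimal sketch of how such spot diagrams may be generated is given below for spherical aberration combined with defocus. It assumes that the ray aberration is $2F$ times the gradient of the wave aberration in the normalized pupil coordinates (ξ, η), and that a hexapolar grid consists of a central point plus rings of 6k points; both the function names and these grid conventions are assumptions for illustration, not a description of any particular design program.

```python
import math


def square_grid(n):
    """Uniformly spaced square array (n points across the diameter), clipped to the unit pupil."""
    pts = []
    for i in range(n):
        for j in range(n):
            xi = -1.0 + 2.0 * i / (n - 1)
            eta = -1.0 + 2.0 * j / (n - 1)
            if xi * xi + eta * eta <= 1.0 + 1e-12:
                pts.append((xi, eta))
    return pts


def hexapolar_grid(rings):
    """Center point plus `rings` concentric rings, with 6k points on ring k."""
    pts = [(0.0, 0.0)]
    for k in range(1, rings + 1):
        rho = k / rings
        for j in range(6 * k):
            theta = 2.0 * math.pi * j / (6 * k)
            pts.append((rho * math.cos(theta), rho * math.sin(theta)))
    return pts


def spot_spherical(pupil_pts, bd_over_as=0.0):
    """Image-plane points (units of F*A_s) for W = A_s rho^4 + B_d rho^2."""
    spot = []
    for xi, eta in pupil_pts:
        g = 8.0 * (xi * xi + eta * eta) + 4.0 * bd_over_as
        spot.append((g * xi, g * eta))
    return spot


def spot_radius(spot):
    """Largest distance of any ray from the Gaussian image point."""
    return max(math.hypot(x, y) for x, y in spot)


pupil = hexapolar_grid(4)
print(len(pupil), spot_radius(spot_spherical(pupil, 0.0)))    # Gaussian plane
print(len(pupil), spot_radius(spot_spherical(pupil, -1.5)))   # least confusion
```

Plotting the returned points for either grid gives a spot diagram; the symmetry of the plotted pattern follows that of the chosen pupil grid.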

Figure 9-22. Ray distribution for coma in the paraxial image plane. The units of $x_i$
and $y_i$ are $FA_c$.

Figure 9-23. Ray distribution of various images for astigmatism: (a) sagittal, (b)
least confusion, (c) tangential, and (d) symmetrically opposite to least confusion. The
units of $x_i$ and $y_i$ are $FA_a$.

Figure 9-24. Ray grid pattern in the pupil plane normalized by the pupil radius. (a)
Square grid of uniformly spaced points. (b) Hexapolar grid of concentric rings.

In the absence of any aberration, the spot diagram in a defocused image plane looks
exactly like the one in the pupil plane, except for its scale. The spot diagrams for
spherical aberration in various image planes considered above are shown in Figures 9-25
and 9-26. It is evident that, instead of the expected radial symmetry of the PSFs, a
four-fold symmetry is obtained in the case of the square grid of rays in the pupil plane, and
hexagonal symmetry in the case of the hexapolar grid. This is simply an artifact of the
ray grid used in the pupil plane. As in the case of defocus, the PSF for astigmatism is also
uniform. Thus, the spot diagram for it also looks like the input array across an elliptical
spot, which reduces to a circle or a line depending on the amount of balancing defocus.
The spot diagrams for coma are shown in Figure 9-27. Only the chief ray passes through
the Gaussian image point, which is shown with coordinates (0, 0) in the figure. Note that
the two grids yield different results near the top of the spot.

Figure 9-25. Spot diagrams for spherical aberration in various image planes for a
square grid of rays: (a) Gaussian ($B_d/A_s = 0$), (b) midway ($B_d/A_s = -1$), (c) least
confusion ($B_d/A_s = -1.5$), and (d) marginal ($B_d/A_s = -2$). The spot sizes are in units
of $FA_s$. The PSFs are four-fold symmetric, instead of being radially symmetric, because
of the square grid of rays in the pupil plane.

Figure 9-26. Spot diagrams for spherical aberration in various image planes for a
hexapolar grid of rays: (a) Gaussian ($B_d/A_s = 0$), (b) midway ($B_d/A_s = -1$), (c) least
confusion ($B_d/A_s = -1.5$), and (d) marginal ($B_d/A_s = -2$). The spot sizes are in units
of $FA_s$. The PSFs are six-fold symmetric, instead of being radially symmetric, because
of the hexapolar grid of rays in the pupil plane.

Figure 9-27. Spot diagrams for coma in units of $FA_c$ for (a) square and (b) polar
array of rays in the pupil plane. Only the chief ray passes through the Gaussian
image point, which is shown to lie at (0, 0).

9.6 ABERRATION TOLERANCE AND A GOLDEN RULE OF OPTICAL DESIGN

It is common practice in lens design to look at the spot diagrams in the early stages
of a design, in spite of the fact that they do not represent reality. As discussed in Section
6.8.2, the aberration-free image of a point object is the Airy pattern. As the aberration
increases, the geometrical and diffraction PSFs begin to increasingly resemble each other.
Just as in the diffraction treatment [2] an optical system is considered practically
diffraction limited if the peak (or peak-to-valley) aberration is less than λ/4 (Rayleigh’s
quarter-wave rule), or the standard deviation of the aberration across the exit pupil is less
than λ/14 (Maréchal’s criterion), similarly optical designers consider a system to be
close to its diffraction limit if the ray spot radius is less than or equal to the radius
$1.22\,\lambda F$ of the Airy disc. We note, for example, that this holds for spherical aberration in
the Gaussian image plane if $A_s \le 0.15\,\lambda$, although a larger value of $A_s$ is obtained in the
other image planes. Considering that the long dimension of the coma spot is $6FA_c$ and
the line image for astigmatism is $8FA_a$ long, the aberration tolerance for the spot size to
be smaller than the Airy disc is $A_c < 0.4\,\lambda$ and $A_a < 0.3\,\lambda$, respectively. The aberration
tolerances based on the spot size are summarized in Table 9-4. These tolerances, although
larger than λ/4, are roughly consistent with Rayleigh’s quarter-wave rule. Thus, it is
reasonable to use the size of the spot diagrams as a qualitative measure of quality of the
design until it becomes smaller than the Airy disc. This yields a golden rule of optical
design of using spot diagrams until their size is approximately equal to that of the Airy
disc, and then analyzing the system by its aberration variance and diffraction
characteristics, such as the aberrated diffraction PSF or the modulation transfer function.
The depth of focus (giving the tolerance on the location of the plane for observing the
image) can be determined from Eq. (9-51). Thus, the aberration tolerance is $B_d \lesssim 0.3\,\lambda$
for a spot radius smaller than or equal to that of the Airy disc, which, in turn, implies a
depth of focus of $2.4\,\lambda F^2$. This is roughly consistent with a value of $2\,\lambda F^2$ obtained
according to Rayleigh’s quarter-wave rule. The corresponding depth of field (giving the
tolerance on the object location for a fixed observation plane) can be determined from the
depth of focus by using Eq. (2-77) for the longitudinal magnification. Similarly,
distortion tolerance for a certain amount of line-of-sight error can be obtained from Eq.
(9-55) by replacing $A_t$ by $B_t$.
Table 9-4. Aberration tolerance based on the ray spot size.

Aberration     Spot ‘radius’ in Gaussian image plane    Tolerance for near diffraction limit
Spherical      $8FA_s$                                  $A_s \le 0.15\lambda$
Coma           $3FA_c$                                  $A_c \le 0.4\lambda$
Astigmatism    $4FA_a$                                  $A_a \le 0.3\lambda$
Defocus        $4FB_d$                                  $B_d \le 0.3\lambda$

The size of a geometrical ray image spot corresponding to a Gaussian image at a
height $h'$ from the axis of an optical system aberrated by a primary aberration of peak
value $A_i$ is given below. The refractive index of the medium in which the ray spot is
formed is assumed to be unity, and the f-number of the image-forming light cone is F.
The tolerance for a primary aberration based on the spot radius being equal to the radius
of the Airy disc is also given.
9.7.1 Spherical Aberration ($A_s\rho^4$)

Longitudinal spherical aberration = $16F^2A_s$ . (9-65a)
A positive value of longitudinal spherical aberration implies that the marginal image
corresponding to $B_d = -2A_s$ lies farther from the exit pupil than the Gaussian image.
Radius of circle of least confusion = $2FA_s$ . (9-65b)
The circle of least confusion lies in a plane that is 3/4 of the way from the paraxial to the
marginal image plane.
PSF centroid, $(x_c, y_c) = (0, 0)$ . (9-65c)
Circle of least confusion $\sigma_s = \sqrt{2}\,FA_s$ . (9-65d)
Minimum spot sigma $\sigma_s = (4/3)\,FA_s$ .
9.7.2 Coma ($A_c\rho^3\cos\theta$)
Sagittal coma = $2FA_c$ . (9-66a)
Tangential coma = $6FA_c$ . (9-66b)
PSF centroid, $(x_c, y_c) = (2FA_c, 0)$ . (9-66c)
Spot sigma $\sigma_s = 2\sqrt{2/3}\,FA_c$ . (9-66d)
9.7.3 Astigmatism and Field Curvature ($A_a\rho^2\cos^2\theta + A_d\rho^2$)
Full length of sagittal focal line = $8FA_a$ . (9-67a)
This line is centered on the chief ray at a distance $\Delta_{Rs} = 8F^2A_d$ from the Gaussian image
point and lies along the $x_i$ axis.
Full length of tangential focal line = $8FA_a$ . (9-67b)
This line is centered on the chief ray at a distance $\Delta_{Rt} = 8F^2(A_a + A_d)$ from the
Gaussian image point and lies along the $y_i$ axis. The distance $8F^2A_a$ between the two
line images is the longitudinal astigmatism.
Diameter of circle of least confusion = $4FA_a$ . (9-67c)
This circle is centered on the chief ray and lies in a plane that is midway between the
sagittal and tangential focal line images. It is referred to as the best image.
PSF centroid, $(x_c, y_c) = (0, 0)$ . (9-67d)
Circle of least confusion $\sigma_s = \sqrt{2}\,FA_a$ . (9-67e)
The radii of curvature of the sagittal, tangential, Petzval, and best-image surfaces are
given by
$R_s = \frac{h'^2}{16F^2A_d}$ , (9-68a)
$R_t = \frac{h'^2}{16F^2(A_d + A_a)}$ , (9-68b)
$\frac{2}{R_p} = \frac{3}{R_s} - \frac{1}{R_t}$ (9-69a)
$= \frac{16F^2}{h'^2}\,(2A_d - A_a)$ , (9-69b)
and
$R_b = \frac{h'^2}{8F^2(A_a + 2A_d)}$ . (9-69c)
Moreover,
$\frac{1}{R_b} = \frac{1}{2}\left(\frac{1}{R_s} + \frac{1}{R_t}\right)$ . (9-69d)
If only field curvature is present, then an image of radius $4FA_d$ is obtained in the
Gaussian image plane. The image reduces to a point if it is observed in an image plane at
a distance $\Delta_R = 8F^2A_d$ from the Gaussian image plane.
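Equations (9-68) and (9-69) can be collected into a single helper, sketched below. The function name and the numerical inputs are assumptions for illustration only; all lengths must share one unit, and the expressions become singular when a denominator vanishes (for example, $2A_d = A_a$ gives a flat Petzval surface).

```python
def image_surface_radii(h, F, A_a, A_d):
    """Vertex radii of curvature (R_s, R_t, R_p, R_b) of the sagittal, tangential,
    Petzval, and best-image surfaces, from Eqs. (9-68) and (9-69)."""
    k = 16.0 * F**2 / h**2
    R_s = 1.0 / (k * A_d)
    R_t = 1.0 / (k * (A_d + A_a))
    R_p = 2.0 / (k * (2.0 * A_d - A_a))
    R_b = 2.0 / (k * (A_a + 2.0 * A_d))
    return R_s, R_t, R_p, R_b


# Assumed illustrative values, all lengths in mm
h = 2.0          # image height
F = 8.0          # f-number
A_a = 0.25e-3    # astigmatism coefficient
A_d = 0.50e-3    # field curvature coefficient

R_s, R_t, R_p, R_b = image_surface_radii(h, F, A_a, A_d)
print(R_s, R_t, R_p, R_b)

# Consistency with Eqs. (9-69a) and (9-69d)
print(2.0 / R_p, 3.0 / R_s - 1.0 / R_t)
print(1.0 / R_b, 0.5 * (1.0 / R_s + 1.0 / R_t))
```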
9.7.4 Field Curvature and Defocus

The field curvature and defocus aberration both vary as $\rho^2$. We use the notation that
the peak value of the former is $A_d$, and $B_d$ for the latter. Whereas $A_d$ is proportional to
$h'^2$, $B_d$ is constant for a given defocused image plane. The spot radius and spot sigma
for them are given by
$r_{i\,\max} = 4FA_d$ (9-70a)
and
$\sigma_s = 2\sqrt{2}\,FA_d$ , (9-70b)
respectively.
9.7.5 Distortion ( At ρ cos θ)

A distortion wave aberration of $A_t\rho\cos\theta$ corresponds to a wavefront tilt of
$\beta = A_t/a$ , (9-71)
where a is the radius of the exit pupil, and it represents the line-of-sight error in the
position of the image or the object point. The image point lies at $(2FA_t, 0)$ relative to the
Gaussian image point.
9.7.6 Aberration Tolerance

The tolerance for a primary aberration (in terms of its peak value) based on the spot
radius in the Gaussian image plane being equal to the radius of the Airy disc is given by
$A_i = \begin{cases} 0.15\,\lambda , & \text{Spherical} \\ 0.4\,\lambda , & \text{Coma} \\ 0.3\,\lambda , & \text{Astigmatism or defocus} \end{cases}$ . (9-72)
If an aberration is balanced with another, the standard deviation of the aberration and
the spot size are not minimized for the same amount of the balancing aberration. For
example, when spherical aberration $A_s\rho^4$ is balanced with defocus $B_d\rho^2$, the standard
deviation is minimized when $B_d = -A_s$, but the spot radius is minimized when
$B_d = -1.5A_s$. When astigmatism $A_a\rho^2\cos^2\theta$ is balanced with defocus, the standard
deviation and spot radius are both minimized when $B_d = -0.5A_a$. The depth of focus is
given by $8F^2B_d$, and dividing it by $M_t^2$ gives the depth of field, where $M_t$ is the
magnification of the image.
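The different optimum amounts of defocus for spherical aberration can be confirmed by a brute-force search, sketched below. The code scans $B_d/A_s$ in steps of 0.01, computing the variance of the wave aberration over a uniformly illuminated circular pupil by a midpoint rule and the spot radius as the largest zonal ray radius, assumed here to be $|8\rho^3 + 4(B_d/A_s)\rho|$ in units of $FA_s$. The function names, step sizes, and sampling densities are arbitrary choices made for this illustration.

```python
def wave_variance(bd_over_as, n=2000):
    """Variance of rho^4 + (B_d/A_s) rho^2 over the unit pupil (midpoint rule)."""
    s1 = s2 = 0.0
    for i in range(n):
        rho = (i + 0.5) / n
        w = rho**4 + bd_over_as * rho**2
        weight = 2.0 * rho / n          # normalized area element of a circular pupil
        s1 += w * weight
        s2 += w * w * weight
    return s2 - s1 * s1


def spot_radius(bd_over_as, n=1000):
    """Largest ray radius (units of F*A_s) over the pupil zones 0 <= rho <= 1."""
    return max(abs(8.0 * (i / n)**3 + 4.0 * bd_over_as * (i / n))
               for i in range(n + 1))


candidates = [-k / 100 for k in range(201)]        # B_d/A_s from 0 down to -2
best_variance = min(candidates, key=wave_variance)
best_radius = min(candidates, key=spot_radius)

print("minimum variance at B_d/A_s =", best_variance)
print("minimum spot radius at B_d/A_s =", best_radius, "radius =", spot_radius(best_radius))
```

The search returns $B_d = -A_s$ for the minimum variance and $B_d = -1.5A_s$ for the minimum spot radius, in agreement with the values quoted above.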
9.7.7 A Golden Rule of Optical Design

Although spot diagrams do not represent reality because they don’t account for
diffraction, it is reasonable to use their size as a qualitative measure of image quality until
it becomes approximately equal to that of the Airy disc. The smaller the spot size is, the
better the image quality. For high-quality imaging, this yields a golden rule of optical
design to use spot diagrams until their size becomes comparable to that of the Airy disc,
and then analyze the system by its aberration variance, Strehl ratio, and other diffraction
characteristics, such as the ensquared power of the aberrated diffraction PSF or the
modulation transfer function.
REFERENCES
SPIE Press, Bellingham, WA (1998) [doi: 10.1117/3.265735].

3. J. Braat, “Polynomial expansion of severely aberrated wavefronts,” J. Opt. Soc. Am. A 4, 643–650 (1987).
4. V. N. Mahajan, Optical Imaging and Aberrations, Part III: Wavefront Analysis, SPIE Press, Bellingham, WA (2013) [doi: 10.1117/3.927341].
PROBLEMS
9.1 Consider Problem 2.5, imaging a slide by a thin lens. (a) Determine the depth of
focus for a defocus aberration of 0.3 λ , giving the tolerance on the distance
between the lens and the screen, i.e., on the location of the screen. (b) What is the
corresponding tolerance on the distance between the slide and the lens, i.e., on the
location of the slide?
9.2 Sketch the geometrical PSF of a system with a uniformly illuminated circular exit
pupil aberrated by spherical aberration $W(\rho) = A_s\rho^4$ in the Gaussian, marginal,
least-confusion, and midway image planes for $A_s = 1\,\lambda$, λ = 0.5 μm, and F = 10,
and total image power of 1 W. Give the location of these image planes with respect
to the Gaussian image plane. Calculate the size and sigma value of the image spot
in these planes.
9.3 Consider the imaging system of Problem 7.2, except that it is aberrated by
astigmatism $W(\rho, \theta) = A_a\rho^2\cos^2\theta$, where $A_a = \lambda/4$. Calculate the size, location,
and irradiance of the tangential, sagittal, and least-confusion images of a point
object.
9.4 Consider an imaging system forming the image of a point object at a distance of
15 cm from the plane of its exit pupil at a height of 0.2 cm from its optical axis. Let
the image be aberrated by λ/4 each of astigmatism and field curvature. If the
radius of the exit pupil is 1 cm, determine and sketch the tangential, sagittal, and
Petzval image surfaces for λ = 0.5 μm.
9.5 Sketch the pattern of the image of the point object considered in Problem 7.4 if it is
aberrated by coma given by $W(\rho, \theta) = A_c\rho^3\cos\theta$, where $A_c = \lambda/4$. Illustrate the
tangential and sagittal coma on this sketch. Determine the spot sigma and centroid
of the image spot.
9.6 Sketch the pattern of the image of a point object aberrated by secondary coma
$A_5\rho^5\cos\theta$, where $A_5$ is the peak value of the aberration. Illustrate the tangential
and sagittal coma on the sketch for F = 4 and $A_5 = 1.5\,\lambda$, where λ = 3 μm. Also,
determine the centroid of the image and its sigma value.
EPILOGUE
E1 Introduction
E2 Principles of Geometrical Optics and Imaging
E3 Ray Tracing: Exact and Paraxial
E4 Gaussian Optics
    E4.1 Tangent Plane or Paraxial Surface
    E4.2 Sign Convention
    E4.3 Cardinal Points
    E4.4 Graphical Imaging
    E4.5 Lagrange Invariant
    E4.6 Matrix Approach to Gaussian Imaging
    E4.7 Petzval Image
    E4.8 Field of View
    E4.9 Chromatic Aberrations
E5 Image Brightness
E6 Image Quality
    E6.1 Wave and Ray Aberrations
    E6.2 Primary Aberrations
    E6.3 Spot Size and Aberration Balancing
    E6.4 Strehl Ratio and Aberration Balancing
E7 Reflecting Systems
E8 Anamorphic Imaging Systems
E9 Aberration Tolerance and a Golden Rule of Optical Design
E10 General Comments
References
Epilogue
E1 INTRODUCTION
We give a brief summary of the imaging process with emphasis on its salient
features, and outline the next steps within and beyond geometrical optics. The numbers
given in parentheses are the section numbers where a particular topic is discussed.
E2 PRINCIPLES OF GEOMETRICAL OPTICS AND IMAGING

In geometrical optics, light consists of rays. Their direction of propagation indicates
the direction of flow of light energy. They are normal to a wavefront, which is a surface
of constant phase. We started this book with a statement of Fermat’s principle, namely,
that the optical path length of a ray in traveling from one point to another is stationary.
This principle yields the three laws of geometrical optics (1.5), namely: rectilinear
propagation of a ray in a homogeneous medium, refraction of a ray at an interface
separating media of two refractive indices according to Snell’s law, and reflection of a
ray from a reflecting surface at the same angle as the angle of incidence. The incident ray,
the surface normal at the point of incidence, and the refracted or the reflected ray are
coplanar.
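As a small illustration of the law of refraction, the following sketch computes the exact refraction angle from Snell's law and returns nothing when the ray undergoes total internal reflection. The function name and the refractive indices used in the example are assumptions for illustration.

```python
import math


def refract_angle(n1, n2, theta_i):
    """Exact refraction angle (radians) from Snell's law, n1 sin(theta_i) = n2 sin(theta_t).

    Returns None when no refracted ray exists (total internal reflection).
    """
    s = n1 * math.sin(theta_i) / n2
    if abs(s) > 1.0:
        return None
    return math.asin(s)


# Ray going from air (n = 1) into glass (assumed n = 1.5) at 30 degrees incidence
theta_t = refract_angle(1.0, 1.5, math.radians(30.0))
print(math.degrees(theta_t))

# Ray going from glass into air at 60 degrees: beyond the critical angle
print(refract_angle(1.5, 1.0, math.radians(60.0)))
```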
When a point object is imaged by an imaging system, a portion of the spherical wave
originating at the object is intercepted by the system. It propagates through the system,
and if a spherical wave exits from it, a perfect point image is formed at the center of
curvature of this converging spherical wave. If rays are traced from the point object
toward and through the imaging system, they exit from the system and converge to the
image point. Thus, a diverging spherical wavefront with its center of curvature at the
point object is converted by the imaging system into a spherical wavefront converging to
the perfect image point. The optical path lengths of the rays from the point object to the
image point are equal to each other. With few exceptions, the actual shape of the
wavefront emerging from the system is generally not spherical, indicating an aberrated
image.
E3 RAY TRACING: EXACT AND PARAXIAL

Exact ray tracing consists of a transfer operation, in which a ray propagates from a
certain point to a point on some surface, e.g., a refracting or a reflecting surface, and a
refraction or reflection operation, which describes its refraction or reflection by the
surface. The ray-tracing equations for the transverse coordinates ( x , y ) of a point on a ray
propagating in the z direction are coupled with each other and have to be solved
simultaneously (1.6). Such ray tracing is used primarily to determine the optical
deviations of the emerging wavefront from a spherical surface, i.e., the aberrations of an
imaging system, and thereby the quality of an image.
When the rays make small angles with the optical axis and surface normals, their
sines and tangents can be approximated by the angles themselves. Similarly, if the
transverse coordinates ( x , y ) of a point on a refracting or a reflecting surface with its
symmetry axis along the z axis are much smaller than its radius of curvature, we can
neglect the sag of the surface, and approximate the diagonal distance between two points
by the corresponding axial distance. The ray tracing carried out under such assumptions is
called paraxial ray tracing (1.7). Under such ray tracing, the equations for the transverse
coordinates of a point on a ray are no longer coupled. Moreover, the projections of a skew
ray in the zx and yz planes propagate independently of each other. Consequently, for a
rotationally symmetric imaging system, we need to trace rays only in one of these planes.
This is generally done in the tangential plane, i.e., the plane containing the point object
and the optical axis. A ray incident in the tangential plane remains in this plane after its
refraction or reflection by an element of the system, and, therefore, by the entire system.
This is also true for the exact ray.
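The two paraxial operations described above can be sketched in a few lines of Python. The sketch assumes the widely used height-and-slope form of the paraxial equations, n2 u2 = n1 u - y (n2 - n1)/R for refraction and y2 = y + u t for transfer, with R positive when the center of curvature lies to the right of the vertex. The book's own symbols and slope-angle convention may differ, and the function names are hypothetical.

```python
def refract(y, u, n1, n2, R):
    """Paraxial refraction at a surface of vertex radius R.

    Implements n2*u2 = n1*u - y*(n2 - n1)/R and returns the new slope u2.
    """
    return (n1 * u - y * (n2 - n1) / R) / n2

def transfer(y, u, t):
    """Paraxial transfer over an axial distance t: the height changes by u*t."""
    return y + u * t

# Assumed example: air-to-glass surface (n = 1.0 to 1.5), R = 50,
# with an incident ray parallel to the axis at height 1
y0, u0 = 1.0, 0.0
u1 = refract(y0, u0, 1.0, 1.5, 50.0)
axis_crossing = -y0 / u1   # distance from the vertex at which the ray reaches the axis
print(axis_crossing)       # compare with n2*R/(n2 - n1)
```

Because the equations are linear and uncoupled, the same two functions can be applied separately to the x and y projections of a skew ray.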
E4 GAUSSIAN OPTICS
E4.1 Tangent Plane or Paraxial Surface
The paraxial ray-tracing equations are used to determine the location and size of the
image formed by an imaging system in terms of the object location and size. The image
thus obtained is referred to as the Gaussian image, and the process of determining the
image in this manner, regardless of the magnitude of the angles and sizes, is called
Gaussian optics. Because of paraxial ray tracing, the curved refracting or reflecting
surface is replaced by a planar surface passing through its vertex, called the tangent plane
or the paraxial surface (1.8.2). Only the vertex radius of curvature of the surface is
utilized in the imaging equations. The use of the tangent plane implies, for example, that
there is no distinction between the Gaussian image formed by a spherical surface of a
certain radius of curvature and a conic surface with the same vertex radius of curvature.
An object and its corresponding image are referred to as conjugates of each other because
one is the image of the other. The Gaussian image is aberration free by definition. The
aberrations of an actual image are determined separately as the next step to evaluate the
quality of the image.
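A minimal sketch of Gaussian imaging by a single refracting surface follows. It assumes the standard imaging equation n2/S2 - n1/S = (n2 - n1)/R in the Cartesian sign convention, with distances measured from the vertex. The function name and numerical values are illustrative assumptions. Note that only the vertex radius R appears, so a conic surface with the same R gives the same result.

```python
def gaussian_image(S, n1, n2, R):
    """Image distance and transverse magnification for one refracting surface.

    Uses n2/S2 - n1/S = (n2 - n1)/R, with S and S2 measured from the vertex
    and positive to the right (Cartesian sign convention).
    """
    S2 = n2 / ((n2 - n1) / R + n1 / S)
    M = (n1 * S2) / (n2 * S)
    return S2, M

# Assumed example: object 300 units to the left of an air-to-glass surface, R = 50
S2, M = gaussian_image(-300.0, 1.0, 1.5, 50.0)
print(S2, M)
```

A positive image distance together with a negative magnification indicates a real, inverted image in this convention.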
E4.2 Sign Convention

It is essential to have a sign convention for the distances and heights of objects and
images, and the angles or cone angles of the rays. The Cartesian sign convention (1.2)
has the advantage that there are no special rules to remember other than those of a right-
handed Cartesian coordinate system, regardless of whether the object or the image is real
or virtual, or a refracting or a reflecting surface is convex or concave to the light incident
on it (1.2). The object distance is generally measured from the vertex of a refracting or a
reflecting surface, but in ray tracing it is measured from the object to the surface, i.e., in
the direction of the propagation of the ray. In this book, the distances and the angles are
indicated with an arrow, and those that are negative are indicated with a (–) sign.
E4.3 Cardinal Points

The Gaussian image of an object formed by a multisurface system can be obtained
by sequential application of the imaging equation for a refracting and/or a reflecting
surface. However, such a calculation is greatly simplified by defining suitable reference
points, called the cardinal points of the system: two principal points, two focal points, and
two nodal points. Only three of the six cardinal points are independent (2.4.2). The
principal points are conjugates of each other, and so are the nodal points. If the refractive
indices of the object and image spaces are equal, which is often the case in practice, then
the nodal points coincide with the corresponding principal points, and the object- and
image-space focal lengths are equal in magnitude.
Once the cardinal points are known, the system can be replaced by them regardless
of its complexity. The object and image distances are measured from the respective
principal points, which correspond to conjugate planes of unity transverse magnification.
Similarly, the focal lengths represent the distances of the focal points from the respective
principal points. The two nodal points correspond to unity angular magnification. The
principal and the nodal points of a thin lens (in air) coincide with its center. The principal
points of a refracting surface coincide with its vertex, and its nodal points coincide with
its center of curvature. The imaging equation for any imaging system is similar to that for
a single refracting surface.
E4.4 Graphical Imaging

The Gaussian image of an object can be determined graphically by tracing any two
of three specific object rays: a ray incident parallel to the optical axis of the system and
emerging from it passing through the image-space focal point; a ray incident passing
through its object-space focal point and emerging from the system parallel to the optical
axis; and a ray incident passing through its object-space nodal point and emerging from
the system passing through its image-space nodal point (2.4.6). The point of intersection
of these rays in the image space determines the image point. Because of the unity
magnification of the principal planes, the emergent ray appears to come from a point on
the image-side principal plane, which is at the same height as the point of incidence on
the object-side principal plane. Similarly, a ray incident in the direction of the object-side
focal point emerges parallel to the optical axis such that the point of emergence is at the
same height as the point of incidence. The height of the ray in the image space yields the
image height.
E4.5 Lagrange Invariant

An interesting property of the Gaussian image is that the product of the transverse
magnification of the image and its angular magnification is constant (equal to the ratio of
the refractive indices of the object and image spaces). It is a representation of the Lagrange
invariant, which is the product of the slope angle of a ray from an axial point object,
object height, and the refractive index of the object space (2.4.3). This product is
invariant upon refraction or reflection by a surface, and thus for a system consisting of
any number of such surfaces. If we consider the invariant in terms of the heights and
slopes of two arbitrary rays incident on the system, the slope and height of any other ray
incident on the system can be obtained anywhere in space as a linear combination of the
slopes and heights of the other two in that space.
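The invariance described above can be checked numerically for two arbitrary paraxial rays. The sketch below assumes the common paraxial refraction equation n2 u2 = n1 u - y (n2 - n1)/R and the two-ray form of the invariant, n (u_b y_a - u_a y_b). The ray data and function names are hypothetical.

```python
def refract(y, u, n1, n2, R):
    """Paraxial refraction: returns the slope after the surface."""
    return (n1 * u - y * (n2 - n1) / R) / n2

def lagrange(n, y_a, u_a, y_b, u_b):
    """Two-ray invariant n*(u_b*y_a - u_a*y_b) evaluated in a medium of index n."""
    return n * (u_b * y_a - u_a * y_b)

n1, n2, R = 1.0, 1.5, 50.0
ya, ua = 5.0, 0.02     # assumed height and slope of ray a at the surface
yb, ub = -1.0, 0.05    # assumed height and slope of ray b at the surface

H_before = lagrange(n1, ya, ua, yb, ub)
H_after = lagrange(n2, ya, refract(ya, ua, n1, n2, R), yb, refract(yb, ub, n1, n2, R))
print(H_before, H_after)
```

The heights do not change on refraction, only the slopes, and the two values printed agree to within rounding.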
E4.6 Matrix Approach to Gaussian Imaging

The ray-tracing equations in Gaussian optics are linear in ray heights and slopes. As
a result, the whole imaging process can be represented by a 2 × 2 matrix. Although the
matrix approach for tracing paraxial rays or determining the Gaussian image is equally
viable, it has the disadvantage of losing physical insight into ray tracing and the imaging
process. The matrix approach is not discussed in this book, but it can be found in
Reference 1.
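Since the matrix approach is not developed in this book, the following is only a sketch under one common convention, which is an assumption here: a ray is the column vector (y, u) in air, a transfer over distance d is [[1, d], [0, 1]], and a thin lens of focal length f is [[1, 0], [-1/f, 1]]. The example values are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def translation(d):
    """Transfer over an axial distance d in air: y -> y + d*u, u unchanged."""
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    """Thin lens of focal length f in air: y unchanged, u -> u - y/f."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

# Assumed example: two thin lenses, f1 = f2 = 100, separated by d = 50.
# Matrices act right to left, so the first element met by the ray is rightmost.
system = matmul(thin_lens(100.0), matmul(translation(50.0), thin_lens(100.0)))
focal_length = -1.0 / system[1][0]
determinant = system[0][0] * system[1][1] - system[0][1] * system[1][0]
print(focal_length, determinant)
```

The focal length obtained from the lower-left element agrees with f1 f2/(f1 + f2 - d), and the unit determinant reflects the equal refractive indices of the object and image spaces.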
E4.7 Petzval Image

In Gaussian imaging, the distances of the object and image points are measured along the optical
axis even when they are located off the axis. This introduces a small focus error that
increases quadratically with the height of a point object. Consequently, an error-free
image of a plane object is formed on a spherical surface, called the Petzval image surface
(2.7). The radius of curvature of this surface is independent of the object or the image
distance. A lens designer balances the Petzval curvatures of the surfaces of a system so
that the curvature of the final image surface is zero. Otherwise, the image observed on a
planar surface will suffer from field curvature.
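The balancing of Petzval curvatures can be illustrated with the standard Petzval sum over the surfaces of a system, the sum of (n2 - n1)/(n1 n2 R), which this section does not write out. The curvature of the Petzval image surface is proportional to this sum, so a zero sum corresponds to a flat Petzval surface. The function name and example values below are assumptions.

```python
def petzval_sum(surfaces):
    """Sum of (n2 - n1) / (n1 * n2 * R) over surfaces given as (n1, n2, R) tuples."""
    return sum((n2 - n1) / (n1 * n2 * R) for n1, n2, R in surfaces)

# Assumed example: an equiconvex thin lens in air, n = 1.5, R1 = 50, R2 = -50
n = 1.5
lens = [(1.0, n, 50.0), (n, 1.0, -50.0)]
power = (n - 1.0) * (1.0 / 50.0 - 1.0 / -50.0)   # thin-lens power 1/f
print(petzval_sum(lens), power / n)              # the two values agree for a thin lens
```

For thin lenses in air the sum reduces to the sum of 1/(n f) over the lenses, which is why positive and negative elements of suitable indices are combined to flatten the field.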
E4.8 Field of View

The field stop of a system is an aperture, placed at the final or an intermediate real
image of the object, that limits the cone angle of the transmitted chief rays from an object.
Its images, as seen from the object and image spaces, are the entrance and exit windows
EnW and ExW , respectively. The entrance window defines the object field that is
actually imaged in the exit window. The angle subtended by the entrance window at the
center of the entrance pupil represents the angular field of view of the system in object
space (5.2.7). Similarly, the angle subtended by the exit window at the center of the exit
pupil is the angular field of view of the system in image space. The ratio of the two
angles is equal to the magnification of the exit pupil when the refractive indices of the
object and image spaces are equal.
E4.9 Chromatic Aberrations

Because the refractive index of a transparent substance decreases with increasing
wavelength, a thin lens, for example, made of such a substance will have a shorter focal
length for a shorter wavelength. Consequently, an axial point object emanating white
light will be imaged at different distances along the axis depending on the wavelength,
with the consequence that the image will not be a “white” point. Similarly, the height of
the image of an off-axis point object will vary with the wavelength, resulting in different
sizes of the image of a multiwavelength object. The axial and transverse extents of the
image of a multiwavelength point object are called longitudinal and transverse chromatic
aberrations, respectively (Chapter 7). They describe a chromatic change in the position
and magnification of the image. The longitudinal chromatic aberration is also called the
axial color. The difference of image heights in a given image plane is referred to as the
lateral color. A system is considered achromatic if both the axial and lateral colors are
zero.
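As a sketch of how axial color is removed in the simplest case, consider two thin lenses in contact with powers p1 and p2 and Abbe numbers V1 and V2. The well-known achromatic condition is p1/V1 + p2/V2 = 0 together with p1 + p2 equal to the required total power. The function name and glass values below are illustrative assumptions.

```python
def achromat_powers(total_power, V1, V2):
    """Powers of two thin lenses in contact with zero axial color.

    Solves p1 + p2 = total_power and p1/V1 + p2/V2 = 0.
    """
    p1 = total_power * V1 / (V1 - V2)
    p2 = -total_power * V2 / (V1 - V2)
    return p1, p2

# Assumed example: focal length 100 with Abbe numbers 60 (crown) and 36 (flint)
p1, p2 = achromat_powers(1.0 / 100.0, 60.0, 36.0)
print(1.0 / p1, 1.0 / p2)   # focal lengths of the positive and negative elements
```

The positive element is stronger than the doublet as a whole, and the weaker negative element of more dispersive glass cancels its chromatic variation of focus.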
E5 IMAGE BRIGHTNESS
Once an image of suitable location and size has been obtained, the next step is to
determine its brightness. This is done by determining the aperture stop and its images, the
entrance and exit pupils in the object and image spaces of the system. Rays with
increasingly larger cone angles are incident on the system to determine the aperture in the
system that most severely limits the solid angle of the transmitted rays (5.2.2). Such ray
tracing is also used to determine the size of the imaging elements or the obscurations in
imaging systems. Having obtained the aperture stop, the entrance and exit pupils are
obtained by using the Gaussian imaging equations. The light cone from a point object that
enters the system is limited by the entrance pupil. Similarly, the light cone that exits from
the system and converges onto the image point is limited by the exit pupil. The chief ray
from the edge of an object determines the location of the exit pupil and the height of the
image. Similarly, the marginal ray from the axial point of the object determines the size
of the exit pupil and the location of the axial image point.
The intensity of the image of a point object varies as the cube of the cosine of its
angle from the optical axis (5.3). The irradiance of the image of an extended object
decreases as the fourth power of the cosine of the angle of an object element from the optical axis
(5.4.6). For visual observations, as in the case of, e.g., telescopes and microscopes, the
spectral response of the human eye is taken into account. As the point object moves off-
axis, at some position, some of the rays intercepted by the entrance pupil begin to be
vignetted or blocked by one or another element. The aperture stop, which is circular for
the axial point object, becomes nearly elliptical with a corresponding reduction in the
transmitted flux.
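The cosine-third and cosine-fourth dependences can be tabulated with a short sketch. The function names are hypothetical, the values are relative to the on-axis value, and vignetting is ignored.

```python
import math

def relative_intensity(theta_deg):
    """Relative intensity of the image of a point object: cos^3 of the field angle."""
    return math.cos(math.radians(theta_deg)) ** 3

def relative_irradiance(theta_deg):
    """Relative irradiance of the image of an extended object: cos^4 of the field angle."""
    return math.cos(math.radians(theta_deg)) ** 4

for angle in (0.0, 10.0, 20.0, 30.0):
    print(angle, round(relative_intensity(angle), 4), round(relative_irradiance(angle), 4))
```

At a field angle of 30 degrees the irradiance has already dropped to a little over half of its axial value, which is why the falloff matters for wide-field systems.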
E6 IMAGE QUALITY
E6.1 Wave and Ray Aberrations
In Gaussian optics, all of the object rays from a certain point object transmitted by a
system pass through the Gaussian image point. The imaging system is assumed to convert
the spherical wavefront diverging from the point object into a spherical wavefront
converging to the Gaussian image point. In reality, however, when the rays are traced
exactly (instead of paraxially), they generally do not converge to an image point. Instead
they intersect the image plane at various points in a small region in the vicinity of the
Gaussian image point in the form of a spot diagram, indicating that the exiting wavefront
is not spherical, or that it is aberrated. These aberrations determine the quality of the
image, as discussed in Sections E6.3 and E6.4.
The wave aberrations of the image of a point object are obtained by tracing rays from
the point object through the system and up to its exit pupil such that each one travels an
optical path length equal to that of the chief ray (i.e., the one passing through the center of
the pupil). The surface passing through the end points of the rays is the system wavefront
for the point object under consideration. If the wavefront is spherical, with its center of
curvature at the Gaussian image point, we obtain a perfect image point. The rays
transmitted by the system in that case have equal optical path lengths in propagating from
the object point to the Gaussian image point, and they all pass through the image point. If,
however, the actual wavefront deviates from the spherical wavefront, called the Gaussian
reference sphere, the image is aberrated (8.2.1). The rays do not have equal optical
path lengths, and they intersect the Gaussian image plane in the vicinity of the Gaussian
image point. The ( x , y ) separations of the intersection point of a ray from the Gaussian
image point are called its transverse ray aberrations, and they are positive or negative
according to the Cartesian sign convention. The wave aberration of a ray from a point
object is positive if it travels an extra optical path length, compared to the chief ray, in
order to reach the Gaussian reference sphere (see Reference 1 in Chapter 8).
E6.2 Primary Aberrations

The aberrations of a rotationally symmetric imaging system depend on three
rotational invariants: $h^2$, $r^2$, and $hr\cos(\theta - \theta_o)$, where $(h, \theta_o)$ and $(r, \theta)$ are the polar
coordinates of the object and pupil points, respectively. A general aberration term of the
expansion of the aberration function in terms of these invariants may accordingly be
written $(h^2)^i (r^2)^j (hr\cos\theta)^m$, where i, j, and m are positive integers including zero.
Letting $2i + m = l$ and $2j + m = n$, the aberration term may be written $h^l r^n \cos^m\theta$. It is
evident that the order of an aberration term, i.e., its degree l + n in the object and pupil
coordinates, is even. The order of the corresponding ray aberration term is odd, because it
is one less than that of the wave aberration owing to the spatial derivative relationship
between the two. The wave aberrations of the power-series expansion of an aberration
function are referred to as the classical aberrations.
There are five aberrations of fourth order in object (or image) and pupil coordinates,
referred to as the primary or the Seidel aberrations, namely, spherical aberration, coma,
astigmatism, field curvature, and distortion (8.5). The primary wave aberrations of a
multisurface system are additive in the sense that they can be obtained by adding the
primary wave aberrations of the surfaces, where the Gaussian image of a point object
formed by one surface becomes the point object for the next surface (8.6.2). Thus, by
knowing the primary aberrations of a refracting surface, the aberrations of a single lens,
for example, can be obtained. Similarly, by knowing the primary aberrations of a
reflecting surface, the aberrations of a two-mirror astronomical telescope can be obtained.
The higher-order aberrations, e.g., secondary or Schwarzschild aberrations, cannot be
obtained in this manner. To obtain the higher-order aberrations of a surface, the effect of
the aberrations of the image formed by the previous surface must be taken into account.
New aberrations arise when the system is perturbed so that one or more of its imaging
elements is decentered and/or tilted, and the system loses its rotational symmetry. These
aberrations have different dependencies on the object height but the same dependence on
the pupil coordinates as the aberrations of the unperturbed system (see Chapter 7 in
Reference 1).
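The count of five primary aberrations can be checked by enumerating the exponents of the general term. The sketch below lists the triples (l, n, m) of a given even order, using l = 2i + m and n = 2j + m as defined above, and discards the term with n = 0 because it does not depend on the pupil coordinates. The function name is hypothetical.

```python
def aberration_terms(order):
    """List (l, n, m) for terms h**l * r**n * cos(theta)**m of the given even order.

    Terms arise from (h^2)^i (r^2)^j (h r cos(theta))^m with l = 2i + m and
    n = 2j + m, so that l + n = 2(i + j + m) = order.
    Terms with n = 0 are skipped: they are independent of the pupil point.
    """
    terms = []
    half = order // 2
    for i in range(half + 1):
        for j in range(half + 1 - i):
            m = half - i - j
            l, n = 2 * i + m, 2 * j + m
            if n > 0:
                terms.append((l, n, m))
    return terms

primary = aberration_terms(4)
print(primary)
```

For order four, the triples (0, 4, 0), (1, 3, 1), (2, 2, 2), (2, 2, 0), and (3, 1, 1) correspond to spherical aberration, coma, astigmatism, field curvature, and distortion, respectively.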
Although the transverse ray aberrations of a system for a certain point object can be
obtained by tracing the rays through the system and up to the image plane, they can also
be obtained from the wave aberrations. The ray aberrations are not additive in that those
in the final image plane cannot be obtained by adding their values in the intermediate
image planes formed by the surfaces of a system. Of course, the contribution of a surface
to the ray aberration in the final image plane can be obtained from its wave aberration
using the parameters of the final image (8.6.3).
Because of the variation of the refractive index of a transparent substance with the
wavelength, the optical path length of a ray passing through it also depends on the
wavelength. Accordingly, the monochromatic aberrations of a refracting system also vary
with the wavelength. However, this variation is generally small, especially for a narrow
spectral bandwidth. It is calculated by exact ray tracing of the system. Of course, a
reflecting system is achromatic.
E6.3 Spot Size and Aberration Balancing

The extent of the ray distribution in a spot diagram is called its spot size. The spot
size for a certain aberration can be reduced if it is balanced by one or more lower-order
aberrations (9.4). For example, when a certain amount of spherical aberration is balanced
with –1.5 times that amount of defocus aberration, the spot radius is reduced by a
factor of four. The reduced spot is the well-known circle of least confusion.
When some of the rays in the spot diagram are concentrated in a small area and the
others are scattered over a large area, as in the case of coma, the quantity of interest is the
standard deviation or the spot sigma of the ray distribution. The amount of the balancing
aberration for the minimum value of spot sigma is different. For example, the defocus
aberration that minimizes the spot sigma is $-4/3$ times the amount of spherical
aberration. As the criterion for balancing changes, so does the amount of the balancing
aberration.
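Both balancing results quoted above can be verified with a short calculation. The sketch assumes a wave aberration rho^4 + b rho^2 with a unit spherical aberration coefficient and a uniformly filled circular pupil of unit radius, so that the normalized ray aberration is the radial derivative 4 rho^3 + 2 b rho. The function names are hypothetical.

```python
def spot_radius(b, samples=2000):
    """Largest |4 rho^3 + 2 b rho| over 0 <= rho <= 1 (normalized ray aberration)."""
    return max(abs(4 * (k / samples) ** 3 + 2 * b * (k / samples))
               for k in range(samples + 1))

def mean_square_radius(b):
    """Pupil average of (4 rho^3 + 2 b rho)^2 over a uniformly filled unit circle.

    Closed form of 2 * integral_0^1 (4 rho^3 + 2 b rho)^2 rho d(rho).
    """
    return 4 + 16 * b / 3 + 2 * b * b

print(spot_radius(0.0) / spot_radius(-1.5))          # reduction of the spot radius
print(mean_square_radius(-4 / 3), mean_square_radius(-1.5))
```

The first ratio shows the fourfold reduction of the spot radius for b = -1.5, while the mean-square radius is smaller at b = -4/3 than at b = -1.5, in line with the statement that the two criteria call for different amounts of defocus.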
E6.4 Strehl Ratio and Aberration Balancing

A measure of the quality of an image is its Strehl ratio, which represents the ratio of
the central irradiances of the diffraction image of a point object with and without
aberration. For a small aberration, the Strehl ratio can be estimated from its variance
across the exit pupil (8.7.1). The smaller the variance is, the larger the Strehl ratio. The
variance of an aberration can be reduced by balancing it with one or more aberrations of
the same and/or lower order, thereby increasing the Strehl ratio (8.7.2). For example, the
standard deviation or the wavefront sigma of spherical aberration is reduced by a factor of
four when balanced with an equal and opposite amount of defocus.
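The factor of four quoted above follows from the pupil averages of powers of rho over a unit circle, which equal 1/(k + 1) for rho^(2k). The sketch below computes the variance of rho^4 + b rho^2 exactly and also evaluates one commonly used estimate of the Strehl ratio, exp(-sigma_phi^2), where sigma_phi is the standard deviation of the phase aberration. The choice of this particular estimate and the function names are assumptions here.

```python
import math
from fractions import Fraction

def variance_spherical_with_defocus(b):
    """Variance over a unit circular pupil of W = rho^4 + b*rho^2 (unit coefficient)."""
    b = Fraction(b)
    mean = Fraction(1, 3) + b / 2                     # <rho^4> + b <rho^2>
    mean_sq = Fraction(1, 5) + b / 2 + b * b / 3      # <rho^8> + 2b <rho^6> + b^2 <rho^4>
    return mean_sq - mean * mean

def strehl_estimate(sigma_waves):
    """Approximate Strehl ratio exp(-sigma_phi^2) with sigma_phi = 2*pi*sigma_waves."""
    return math.exp(-(2 * math.pi * sigma_waves) ** 2)

ratio = math.sqrt(variance_spherical_with_defocus(0) / variance_spherical_with_defocus(-1))
print(ratio)                      # reduction of the wavefront sigma by balancing
print(strehl_estimate(1 / 14))    # Strehl estimate for a sigma of 1/14 wave
```

A standard deviation of 1/14 of a wave gives an estimated Strehl ratio of about 0.8, which is the value usually associated with a practically diffraction-limited system.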
The reason for the widespread use of Zernike circle polynomials in wavefront
analysis is that they are not only orthogonal over a circular pupil, but they also represent
balanced classical aberrations for such pupils (8.8). These polynomials are separable in
polar coordinates of a pupil point. The aberrations in the form of these polynomials are
referred to as the orthogonal aberrations. The coefficients of the classical aberrations can
be obtained from those of the orthogonal aberrations (8.9).
It is important that the Zernike polynomials be ordered in a logical and systematic
manner (8.8); otherwise, the coefficients obtained by different people may not match. It
is also practically advantageous to use the orthonormal form of the Zernike polynomials
so that their coefficients represent the standard deviation or the sigma value of the
corresponding aberration terms. Accordingly, the variance of the aberration function is
simply equal to the sum of the squares of the orthonormal expansion coefficients (except
piston). Because of the orthogonal property of the polynomials, the value of a coefficient
is independent of the number of the polynomials used in the expansion of an aberration
function. Hence, one or more polynomial terms can be added to or subtracted from the
aberration function without affecting the value of the other coefficients in the expansion.
The P-V numbers of a polynomial representing the fabrication errors give a measure of
the depth of the material to be removed in the fabrication process.
E7 REFLECTING SYSTEMS
Generally, the refractive index of the medium for imaging by a reflecting surface is
unity. The ray-tracing equations (exact as well as paraxial) for a reflecting surface can be
obtained from the corresponding equations for a refracting surface by letting the
refractive index associated with the reflected ray be equal in magnitude and opposite in sign to that
associated with the incident ray (1.6). The opposite sign accounts for the backward
propagation of the reflected ray compared to that of the incident ray. The imaging and
wave aberration equations for a reflecting surface can be obtained in a similar manner
from the corresponding equations for a refracting surface. Although it is convenient to
use the equations for a refracting surface to obtain the corresponding equations for a
reflecting surface, the physical insight is lost in so doing. That is why Gaussian imaging
by a reflecting system is discussed in this book on an equal basis as a refracting system.
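As a worked instance of the substitution described above, consider the Gaussian imaging equation for a refracting surface in its standard Cartesian form. The symbols S and S' for the object and image distances and R for the vertex radius of curvature are assumed here. Setting n' = -n gives the familiar imaging equation for a mirror:

```latex
\frac{n'}{S'} - \frac{n}{S} = \frac{n' - n}{R}
\quad\xrightarrow{\;n' = -n\;}\quad
-\frac{n}{S'} - \frac{n}{S} = -\frac{2n}{R}
\quad\Longrightarrow\quad
\frac{1}{S'} + \frac{1}{S} = \frac{2}{R}
```

The refractive index cancels, so the Gaussian image formed by a mirror is independent of the medium in which it is immersed.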
E8 ANAMORPHIC IMAGING SYSTEMS

An anamorphic imaging system is symmetric about two orthogonal planes whose
intersection defines its optical axis. The Gaussian images of a point object with object
rays in the two symmetry planes are formed separately. They are coincident in the final
image space of the system for only two pairs of conjugate planes, compared to an infinite
number for a rotationally symmetric imaging system. An anamorphic system forms the
image of an extended object with different transverse magnifications in the two symmetry
planes. As a result, the image of a square object is rectangular and that of a rectangular
object can be square (2.10). The aberration function of an anamorphic system depends on
the object and pupil coordinates through six reflection invariants, compared to three
rotational invariants in the case of a rotationally symmetric system. It has 16 primary
aberrations, as opposed to only five for a rotationally symmetric system. The orthonormal
polynomials representing balanced aberrations are products of Legendre polynomials in
the x and y variables. They are inherently separable in the Cartesian coordinates of a pupil
point (8.10).
E9 ABERRATION TOLERANCE AND A GOLDEN RULE OF OPTICAL DESIGN
The wave aberrations can also be balanced to give the smallest ray spot size, e.g., the
circle of least confusion in the case of spherical aberration or astigmatism, or the smallest
standard deviation of the ray distribution (often incorrectly called the root-mean-square
radius). It should be evident that if an aberration is balanced with another, the standard
deviation of the aberration and the spot size are not minimized for the same amount of the
balancing aberration. An exception is astigmatism, which, when balanced with defocus,
yields minimum variance as well as the smallest spot size (9.3.3).
It is common practice in lens design to look at the spot diagrams in the early stages
of a design, in spite of the fact that they do not represent reality. For example, based on
diffraction, the aberration-free image of a point object is the Airy pattern (6.8.2), but it is
a point only according to geometrical optics. So why do the lens designers use spot
diagrams? The reason is that not only are the spot diagrams easy to generate but also that
with increasing aberration, the geometrical and diffraction PSFs begin to increasingly
resemble each other. Just as in the diffraction treatment an optical system is considered
practically diffraction limited if the peak (or peak-to-valley) aberration is less than $\lambda/4$
(Rayleigh’s quarter-wave rule) or if the standard deviation of the aberration across the
exit pupil is less than $\lambda/14$ (Maréchal’s criterion) (6.8.3), similarly, the optical designers
consider a system to be close to its diffraction limit if the ray spot radius is less than or
equal to the radius of the Airy disc.
The aberration tolerances based on the spot size are roughly consistent with
Rayleigh’s quarter-wave rule. Similarly, the depth of focus (giving the tolerance on the
location of the plane for observing the image) based on a spot radius smaller than or equal
to that of the Airy disc is roughly consistent with its value obtained according to
Rayleigh’s quarter-wave rule. The corresponding depth of field (giving the tolerance on
the object location for a fixed observation plane) can be obtained from the depth of focus
by using the longitudinal magnification. Accordingly, it is reasonable to use the size of
the spot diagrams as a qualitative measure of quality of the design until it becomes
smaller than the Airy disc. This yields a golden rule of optical design in that a designer
may strive for spot diagrams of a size nearly equal to that of the Airy disc, and then
analyze the system performance by its aberration variance and diffraction characteristics,
such as the aberrated diffraction point-spread function (PSF) or the modulation transfer
function (MTF) (9.6).
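The comparison at the heart of this rule can be sketched as follows. The radius of the Airy disc is taken as 1.22 times the wavelength times the focal ratio of the image-forming light cone, which is the standard result for a circular pupil. The function names and example values are assumptions.

```python
def airy_radius(wavelength, f_number):
    """Radius of the first dark ring of the Airy pattern: 1.22 * wavelength * F."""
    return 1.22 * wavelength * f_number

def spot_within_airy_disc(spot_radius, wavelength, f_number):
    """True if the geometrical spot radius does not exceed the Airy disc radius."""
    return spot_radius <= airy_radius(wavelength, f_number)

# Assumed example: wavelength 0.5 micrometers, focal ratio 8
r_airy = airy_radius(0.5, 8.0)
print(r_airy)                                  # in micrometers
print(spot_within_airy_disc(4.0, 0.5, 8.0))    # spot radius of 4 micrometers
print(spot_within_airy_disc(6.0, 0.5, 8.0))    # spot radius of 6 micrometers
```

Once the test returns True across the field, the spot diagram stops being informative, and the analysis should move on to the aberration variance, PSF, or MTF.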
E10 GENERAL COMMENTS

A good understanding of Gaussian optics is essential for performing Gaussian (or
first-order) design and analysis of an optical imaging system. Based on the paraxial ray
tracing, it yields the location and size of the image. However, graphical imaging to
determine these parameters also gives insight. The use of the tangent plane in place of the
curved imaging surface illustrates that only the vertex radius of curvature determines the
Gaussian image, thus yielding the fact that the Gaussian images formed by conic and
spherical surfaces of the same radius of curvature are identical. The distinction between
the Gaussian and Petzval image should also be understood. Paraxial ray tracing is used to
determine the aperture stop and thereby the entrance and exit pupils and, in turn, the
irradiance of the image in terms of the radiance of the object. It is also used to determine
the approximate size of the imaging elements, obscurations in mirror systems, vignetting
of rays as the object moves increasingly off axis, and the resulting change in the shape of
the pupil.
It is important to work on the problems given at the end of each chapter, because
they are extensions of the theory given in the text, or, more often, applications of the
theory. They are an essential part of the book because only by working through such
problems can one appreciate the theory and validate one's understanding of it. Having tools is
not enough; one must also know how to use them. Only by working the problems can the
readers gauge their aptitude. The use of computer software is discouraged until the basic
concepts of Gaussian imaging are thoroughly understood.
The closed-form analytical expressions for primary aberrations of simple systems,
such as a spherical refracting surface, thin lens, spherical mirror, and two-mirror
telescopes, can be derived, but the derivations are not simple. Although the Gaussian
images formed by conic and spherical surfaces of the same radius of curvature are
identical, it is important to understand how and why the aberrations of the two images
differ from each other. The derivations of primary aberrations are beyond the scope of
this book, but they can be found in Reference 1. We have only discussed the origin of
these aberrations in optical systems, and how they can be recognized by their
interferograms or the spot diagrams. We have also not discussed the diffraction effects of
the aberrations beyond the Strehl ratio, but they can be found in Reference 2.
The next step beyond Gaussian optics is to determine the image quality, and that
requires exact ray tracing to determine the aberrations. The understanding of the primary
aberrations is of paramount importance, because they can be the dominant aberrations in
the early stages of a design. Once one can solve simple problems that use Gaussian
optics, paraxial ray tracing, and graphical imaging, one is ready to tackle complex
problems by using the commercially available optical design and analysis software such
as CODE V, ZEMAX, SYNOPSYS, and OSLO.
A lens designer designs an imaging system so that it can form an image of a certain
size at a certain location, given the size and the location of the object. Given the radiance
of an extended object or the intensity of a point object, the designer chooses the size of
the imaging elements that will yield an image of some prescribed irradiance or intensity.
Gaussian optics is also used to determine the extent of the object that can be imaged, i.e.,
it is used to determine the field of view of the system. A designer must also choose the
shapes and materials of the imaging elements to balance their chromatic and
monochromatic aberrations to yield an image of acceptable quality across the field of
view of the system. However, the task of a designer is not finished until a system is
fabricated, assembled, and tested.
REFERENCES

SPIE Press, Bellingham, WA (1998) [doi:10.1117/3.265735].

Bibliography
A. E. Conrady, Applied Optics and Optical Design, Parts I and II, Oxford, London,
(1929); Reprinted by Dover, New York (1957).
E. Hecht, Optics, 4th ed., Addison Wesley, San Francisco (2002).
F. A. Jenkins and H. E. White, Fundamentals of Optics, 4th ed., McGraw-Hill, New York
(1976).
R. Kingslake and B. Johnson, Lens Design Fundamentals, 2nd ed., Academic Press, San
Diego, CA (2009).
R. Kingslake, Optical System Design, Academic Press, New York (1983).
M. V. Klein, Optics, John Wiley and Sons, New York (1970).
M. V. Klein and T. E. Furtak, Optics, John Wiley and Sons, New York (1988).
D. Korsch, Reflective Optics, Academic Press, San Diego (1991).
D. Malacara, Geometrical and Instrumental Optics, Academic Press, San Diego, CA

(1988).
D. Malacara and Z. Malacara, Handbook of Lens Design, Dekker, New York (1994).
L. C. Martin and W. T. Welford, Technical Optics, Vol. I, 2nd ed., Pitman, London,
(1966).
W. R. McCluney, Introduction to Radiometry and Photometry, Artech, Norwood, MA

(1994).
P. Mouroulis and J. Macdonald, Geometrical Optics and Optical Design, Oxford, New
York (1997).
D. C. O’Shea, Elements of Modern Optical Design, John Wiley and Sons, New York
(1985).
H. Rutten and M. Van Venrooij, Telescope Optics, Willmann-Bell, Richmond, VA

(1988).
D. J. Schroeder, Astronomical Optics, 2nd ed., Academic Press, San Diego, CA (2000).
W. J. Smith, Modern Optical Engineering, 2nd ed., McGraw-Hill, New York (1990).
W. T. Welford, Aberrations of the Symmetrical Optical System, Academic Press, San
Diego, CA (1974).
Index
A
Abbe number ..... 286
aberration
  balanced ..... 340
  chromatic ..... 281
  classical ..... 352, 357
  combined primary and secondary ..... 331
  definition ..... 317
  defocus ..... 323
  extrinsic ..... 332
  geometrical ..... 320
  intrinsic ..... 322
  order ..... 327
  peak-to-valley value ..... 329
  peak value ..... 329
  primary ..... 328, 329, 331, 332
  Schwarzschild ..... 330
  secondary ..... 328, 330, 331, 337
  Seidel ..... 328, 329, 331
  tilt ..... 325–327, 351, 353
  tolerance ..... 338, 340, 378
  transverse ray ..... 315, 320, 332, 336
  variance ..... 339
  wave ..... 315, 317, 321, 332
aberration balancing
  definition ..... 338, 378, 390
  primary aberrations ..... 340
aberration tolerance ..... 338, 340, 378, 415
accommodation ..... 238
achromatic systems
  doublet ..... 302
additivity theorem ..... 335
afocal system ..... 90
  beam expander ..... 133
  for telephoto lens ..... 259
  for wide-angle lens ..... 260, 261
  reflecting telescope ..... 133, 253
  refracting telescope ..... 253
Airy disc ..... 261
Airy pattern ..... 5, 261, 262, 264
ametropic ..... 242
anamorphic system
  imaging ..... 107
  aberrations ..... 357
  reflection invariants ..... 356
angular aperture ..... 217, 219, 220, 229
angular field of view
  image space ..... 189, 225
  object space ..... 189, 225
angular magnification
  general system ..... 78
  reflecting surface ..... 34, 54, 123
  refracting surface ..... 31, 54
  thin lens ..... 67
aperture stop ..... 187, 188
apochromatic ..... 304
aspheric surface ..... 35
astigmatism
  definition ..... 328
  focal lines ..... 395
  interferogram ..... 368
  longitudinal ..... 395
  sagittal ..... 395
  shape ..... 365
  spot diagram ..... 412
  spot sigma ..... 395, 400
  tangential ..... 395
atmospheric coherence length ..... 369
atmospheric turbulence ..... 369
auxiliary axis ..... 96, 98, 112
axial color
  definition ..... 281, 283
  doublet ..... 297
  general system ..... 295
  plane-parallel plate ..... 290
  refracting surface ..... 283
  thin lens ..... 285
B
beam expander
  reflecting ..... 133
  refracting ..... 88, 254
beam-expansion ratio ..... 254
blind spot ..... 236, 237
C
cardinal points ..... 45, 74, 84
  combination of two systems ..... 154
Cartesian pair
  definition ..... 9
  reflecting surface ..... 42
  refracting surface ..... 42
Cartesian surface
  definition ..... 9
  reflecting ..... 42
  refracting ..... 42
Cassegrain focus ..... 130
catadioptric system ..... 173
  thin-lens–mirror combination
    focal length ..... 172
cataract ..... 236
centroid
  definition ..... 391
  for coma ..... 393
chief ray ..... 24, 177
chromatic aberrations ..... 35, 281
  axial color ..... 281–283
  doublet ..... 295–305
  general system ..... 292–295
  lateral color ..... 281, 284, 285
  longitudinal ..... 281, 283
  plane-parallel plate ..... 290–292
  refracting surface ..... 281, 282
  thin lens ..... 288, 289
  transverse ..... 281, 283
  transverse axial color ..... 284
circle of least confusion
  astigmatic ..... 396
  spherical ..... 388
classical aberrations
  anamorphic system ..... 357
  rotationally symmetric system ..... 327, 352, 428
coherence length ..... 369
cold stop ..... 99
coma
  definition ..... 328
  interferogram ..... 368
  sagittal ..... 393, 416
  shape ..... 365
  spot diagram ..... 412, 414
  spot sigma ..... 394
  tangential ..... 393, 416
concentric lens ..... 162, 183
  axial color ..... 311
concentric systems ..... 219, 228
conic surface ..... 22
  reflecting ..... 23
  refracting ..... 22
conjugate points ..... 47
contact lens ..... 247, 249, 250, 278
cosine law of intensity ..... 206
cosine-fourth law of irradiance by an extended source ..... 215, 217
cosine-third law of irradiance by a point source ..... 203
coupled equations ..... 19
  decoupled equations ..... 26
critical angle ..... 11
crown glass ..... 286
cylindrical lens ..... 108, 247
D
defocus
  sigma ..... 402, 417
  spot radius ..... 402, 417
  wave aberration ..... 324
depth of field ..... 404
depth of focus ..... 404
diffraction ..... 261, 262, 266, 273
diffraction focus ..... 340
diopter ..... 53
dispersive constant ..... 285
distortion
  barrel ..... 408
  for uniform image irradiance ..... 215
  image of a square ..... 407
  image of a square grid ..... 408
  pincushion ..... 408
  wave aberration ..... 381, 404, 417
doublet
  achromatic ..... 302
  cemented ..... 304
  chromatic aberrations ..... 295–305
  focal length ..... 87, 162
  thin lens ..... 302, 304
E
eccentricity ..... 22
effective aperture stop ..... 195
effective entrance pupil ..... 195
effective exit pupil ..... 195
effective (or equivalent) focal length
  reflecting surface ..... 121
  refracting surface ..... 53
  system ..... 78
  thin lens ..... 63
entrance pupil ..... 187, 188
entrance window ..... 187, 199
equiconcave lens ..... 64
equiconvex lens ..... 64, 215
equivalent focal length
  see effective focal length
exact ray tracing ..... , 188
exit pupil ..... 187
exit window ..... 187, 199
exitance ..... 230
eye ..... 235, 275
  ametropic ..... 242
  astigmatism ..... 244, 246
  bifocal lenses ..... 247
  blind spot ..... 236, 237
  cardinal points ..... 238
  cataract ..... 236
  chart ..... 241
  cones ..... 236
  contact lens ..... 247
  crystalline lens ..... 236, 238
  emmetropic ..... 242
  far point ..... 239, 243–245
  fovea centralis ..... 237
  glaucoma ..... 237
  hypermetropia (farsighted) ..... 242, 244, 245
  imaging ..... 223
  iris ..... 236
  myopia (nearsighted) ..... 242, 243
  near point ..... 238, 239, 245
  presbyopia ..... 247
  prescription ..... 248
  resolution ..... 268
  retina ..... 236
  rods ..... 236
  spectral response ..... 220
eye models
  reduced eye ..... 237, 238
  schematic eye ..... 237, 238
  simplified eye ..... 237, 238
eyepiece ..... 254, 272
  Huygens ..... 298, 299
F
f-number ..... 215, 218
fabrication errors ..... 345, 346, 349, 352, 373, 374
Fermat’s principle ..... 5, 334, 335
field curvature ..... 328
field of view ..... 199, 257
field stop ..... 187, 188, 198, 199, 233, 225
figure errors ..... 378
  tolerance ..... 378
first-order optics ..... 28
flint glass ..... 286
focal distance
  general system ..... 74, 85, 87
  telephoto lens ..... 116
  thick lens ..... 161, 179, 180
  two-lens system ..... 88, 165, 180
  two-mirror system ..... 169, 181
focal length of a refracting surface
  image space ..... 51
  object space ..... 52
focal planes ..... 75
focal points ..... 75
  image space ..... 51
  object space ..... 52
focal ratio ..... 204
focusing power ..... 90
fourth-order wave aberrations ..... 328
fringe ..... 350, 351, 367, 369
G
Gaussian approximation ..... 3
Gaussian image ..... 3, 28
Gaussian imaging equation
  general system ..... 78
  reflecting surface ..... 33, 40, 121, 141
  refracting surface ..... 30, 39, 111
  thin lens ..... 62
Gaussian optics ..... 3, 28, 424
Gaussian reference sphere ..... 317
geometrical optics ..... 3
geometrical path length ..... 317
geometrical point-spread function ..... 382
geometrical ray aberration ..... 320
glass sphere ..... 115
golden rule of optical design ..... 415, 418
graphical imaging
  general system ..... 81
  reflecting surface ..... 127
  refracting surface ..... 59
  thin lens ..... 68
H
Hubble Space Telescope ..... 144
Huygens eyepiece ..... 298, 299
I
image irradiance ..... 227
image magnification
  reflecting surface ..... 123, 125
  refracting surface ..... 53, 55
image radiance ..... 227
image space ..... 50, 51
immersed detectors ..... 115
intensity ..... 200, 204
interference pattern ..... 363
interferogram ..... 350, 351, 361, 364, 366, 368
invariants
  reflection ..... 356, 357
  rotational ..... 327
inverse-square law of irradiance ..... 201
irradiance ..... 200
L
Lagrange invariance ..... 79
Lagrange invariant
  afocal system ..... 90
  general system ..... 79
  infinite conjugates ..... 91
  thin lens ..... 67
  two-ray ..... 174, 181
Lambertian disc ..... 206, 226
Lambertian source ..... 187
Lambertian surface ..... 206
  brightness ..... 223
Lambert’s cosine law of intensity ..... 206
lateral aberrations ..... 331
lateral color
  definition ..... 281
  doublet ..... 297
  general system ..... 294
  plane-parallel plate ..... 291
  refracting surface ..... 284
  thin lens ..... 288
lateral spherical aberration ..... 331
law of reflection
  in 2D ..... 12
  in 3D ..... 15
laws of geometrical optics ..... 10, 36
law of refraction
  in 2D ..... 10
  in 3D ..... 13
Legendre polynomials ..... 359
lens bending ..... 64
lensmaker’s formula ..... 64, 160
line-of-sight error ..... 415
longitudinal astigmatism ..... 395
longitudinal chromatic aberration
  (see axial color)
longitudinal defocus ..... 323
longitudinal magnification
  general system ..... 79
  thin lens ..... 67
longitudinal spherical aberration ..... 388
Lyot stop ..... 199
M
magnifier ..... 235, 249, 250, 275
  contact magnifier ..... 115
Malus–Dupin theorem ..... 8
Mangin mirror
  chromatic ..... 312
  focal length ..... 144
marginal focus ..... 387
marginal image plane ..... 387, 388
marginal image points ..... 387
marginal ray ..... 177, 193
  lower ..... 197
  upper ..... 197
matrix approach ..... 426
meniscus lens ..... 64, 65, 248, 249
meridional plane
  (see tangential plane)
microscope ..... 156, 251, 252
mirror
  concave ..... 122, 135
  converging ..... 134
  convex ..... 134, 135
  diverging ..... 134, 135
misalignments
  reflecting surface ..... 136
  refracting surface ..... 101, 113
  thin lens ..... 105, 113
  two-mirror telescope ..... 139
N
negative lens ..... 64
negative surface ..... 53
neutral zone ..... 392
Newtonian imaging equation
  general system ..... 81
  refracting surface ..... 61
  reflecting surface ..... 127
  thin lens ..... 68
nodal planes ..... 81
nodal points ..... 80
nodal slide ..... 88
numerical aperture ..... 267
  image space ..... 217
  object space ..... 215
O
object space ..... 50, 51
objective ..... 254, 272
  oil immersion ..... 270
obscuration ratio ..... 170, 230
oculars ..... 259
oil-immersion objective ..... 270
optical axis ..... 3
optical path length ..... 5
optical wavefront ..... 8
optimum defocus ..... 390
orthogonal aberrations
  anamorphic system ..... 358
  rotationally symmetric system ..... 316
P
parabola ..... 407
parallel beam ..... 45, 53, 63, 133
paraxial approximation ..... 25
paraxial ray tracing ..... 24, 29, 34, 35, 39
paraxial surface ..... 25, 39
peak-to-valley aberration ..... 329, 350
peak value ..... 329, 371
  defocus ..... 367, 370
  Seidel ..... 356
  tilt ..... 371
perfect conjugates ..... 9
perfect imaging ..... 370
Petzval image point ..... 96
Petzval image surface
  definition ..... 45, 96, 98
  general formula ..... 98
  mirror ..... 134
  system of mirrors ..... 135
  thin lens ..... 99
  two-mirror telescope ..... 134
Petzval sum ..... 99, 112
phoropters ..... 240
photometry ..... 187, 220
pinhole camera ..... 273, 274, 276
piston aberration ..... 342, 344, 347, 356, 358, 363
plane of incidence ..... 15
plane-parallel plate
  chromatic aberrations ..... 290–292
  imaging ..... 93
point-spread function (PSF)
  geometrical ..... 382, 384, 390, 393, 394, 403, 411, 413–415, 418, 420
  diffraction ..... 351, 394, 415
positive surface ..... 53
power-series expansion
  anamorphic system ..... 357
  rotationally symmetric system ..... 327
primary aberrations ..... 328, 371
  balanced ..... 340, 364
  tolerance ..... 338
prime focus ..... 130
principal planes ..... 75, 76
principal points ..... 74
principal ray ..... 193
projected area ..... 101
R
radial image ..... 395
radiance ..... 187, 205, 211
radiance theorem ..... 213
radiometry
  extended object imaging ..... 204, 214, 226
  point object imaging ..... 200, 225
random aberrations ..... 369
ray aberration ..... 321
ray angular magnification ..... 54, 80, 92, 112, 123, 133, 142
ray fan
  astigmatism ..... 398, 399
  coma ..... 393
  sagittal ..... 382
  spherical aberration ..... 388, 389
  tangential ..... 382
ray spot diagram ..... 382
ray tracing ..... 3
  exact ..... 3
  paraxial ..... 3
ray-tracing equations ..... 178
  reflection ..... 166
  refraction ..... 149
  transfer ..... 149
Rayleigh criterion of resolution ..... 263
Rayleigh’s quarter-wave rule ..... 415
rays ..... 3
  meridional ..... 4
  skew ..... 4
  virtual ..... 6
rectangular polynomials ..... 360
rectilinear propagation ..... 10, 18
reflecting power of a mirror ..... 121
reflection invariants ..... 356
refracting power
  general system ..... 78
  refracting surface ..... 48
relative aperture ..... 205
resolution ..... 261, 276
  diffraction-limited ..... 269
  eye ..... 268
  imaging system ..... 266
  microscope ..... 269, 276
  telescope ..... 270, 276
  two-point ..... 263, 265
rim ray
  lower ..... 197
  upper ..... 197
root-mean-square aberration ..... 345
root-mean-square radius ..... 384
rotational invariants ..... 327
rotationally symmetric system ..... 357, 358, 360, 375
S
sag of a surface ..... 18
sagittal coma ..... 393
sagittal image ..... 364, 395
sagittal plane ..... 382
sagittal rays ..... 395, 400
Schott glass ..... 287
Schwarzschild aberrations ..... 330, 428
secondary aberration ..... 328, 331
secondary magnification ..... 133
secondary spectrum ..... 303
Seidel aberrations ..... 328
Seidel coefficients ..... 355
sign convention ..... 4
sine condition ..... 270
skew rays ..... 28
slope angle ..... 29
Smith–Helmholtz invariant ..... 55
Snell’s law ..... 11, 15
  approximate expression ..... 25
Snellen chart ..... 241
Snellen letters ..... 240
spectacles ..... 242
speed of a lens ..... 216
spherical aberration
  definition ..... 328
  interferogram ..... 368
  longitudinal ..... 388
  shape ..... 365
  spot sigma ..... 390, 416
spherical mirror ..... 119
spherochromatism ..... 316
spot diagram ..... 9, 24
spot sigma
  astigmatism ..... 400, 417
  coma ..... 394, 416
  definition ..... 384, 429
  spherical ..... 390, 391
spot radius
  defocus ..... 415, 418
  Seidel aberration ..... 388, 390
  spherical ..... 403, 404
standard deviation
  aberration ..... 338–340, 415
  ray distribution ..... 388
stop
  aperture ..... 187, 188
  field ..... 187, 198
  Lyot (cold) stop ..... 199
  telecentric ..... 187, 196
stop-shift equation for lateral color
  refracting surface ..... 285
  thin lens ..... 288
Strehl ratio ..... 337, 371
T
tangent condition ..... 406
tangent plane ..... 25
tangential coma ..... 393
tangential image ..... 364
tangential plane ..... 4, 24, 319, 381, 395
tangential ray ..... 24
tangential ray fan ..... 382
telecentric stop or system ..... 197
telephoto lens ..... 259, 260
telephoto system ..... 259, 260
telescope ..... 253
  astronomical ..... 130, 253, 255
  Cassegrain ..... 130
  Galilean ..... 182, 254, 255, 257–259
  Gregorian ..... 130
  Hubble ..... 144
  Keplerian ..... 254, 255, 256, 258
  Schwarzschild ..... 230
  terrestrial ..... 257
  two-mirror ..... 129, 168
tertiary aberrations ..... 331
thick lens ..... 74, 115, 159, 179, 311
thin lens
  converging ..... 64
  diverging ..... 64
  focal length ..... 63
  imaging equation ..... 62, 63
  Lagrange invariant ..... 67
  magnification ..... 66
  negative ..... 64
  Petzval surface ..... 99
  positive ..... 64
thin-lens doublet ..... 302
third-order ray aberrations ..... 337
throughput ..... 218
throw of a lens ..... 69
toric lens ..... 246, 247
total internal reflection ..... 11
transfer operation ..... 19, 37
transverse axial color ..... 284
transverse chromatic aberration
  see lateral color
transverse magnification ..... 30
  reflecting surface ..... 33, 123
  refracting surface ..... 30, 53
transverse ray aberration ..... 337, 381
two thin lenses ..... 87, 162, 180
Twyman–Green interferometer ..... 364
U
uniform diffuser ..... 206
unit circle ..... 359
unit circular pupil ..... 341
unit pupil ..... 346
V
variance ..... 337
vergence ..... 47
vertex radius of curvature ..... 22, 23, 28
vignetting ..... 187, 188, 196, 215, 258
vignetting diagram ..... 197
virtual image ..... 48
virtual object ..... 49
virtual path ..... 335
visual acuity ..... 240
W
wave aberration
  definition ..... 9
  due to defocus ..... 323
  due to Petzval curvature ..... 324
  relationship with ray aberration ..... 321
wavefront ..... 8
wavefront defocus ..... 322
wavefront defocus aberration ..... 322, 353
wavefront sigma ..... 338, 429
wavefront tilt ..... 325
wavefront tilt aberration ..... 325, 352
wide-angle lens ..... 261
wide-angle system ..... 260
working distance ..... 130
Z
Zernike aberrations ..... 352
  astigmatism ..... 353
  coma ..... 354
  defocus ..... 353
  primary ..... 372
  spherical ..... 355
  tilt ..... 352
Zernike circle polynomials ..... 340, 371
  azimuthal frequency ..... 342
  characteristics
    interferometric ..... 350
    isometric ..... 349
  in optical design ..... 341, 371
  in optical testing ..... 345, 373
  peak-to-valley value ..... 350
  polynomial ordering ..... 346
  polynomial ordering number ..... 346
  radial degree ..... 342
Zernike coefficient ..... 341
zonal rays ..... 193
ABOUT THE AUTHOR
Virendra N. Mahajan was born in Vihari, Pakistan, and educated in India and the
United States. He received his Ph.D. degree in optical sciences from the College of
Optical Sciences, University of Arizona. He spent nine years at the Charles Stark Draper
Laboratory in Cambridge, Massachusetts, where he worked on space optical systems.
Since 1983, he has been at The Aerospace Corporation in El Segundo, California, where
he is a distinguished scientist working on space-based surveillance systems. Parts I and II
of Optical Imaging and Aberrations evolved out of a graduate course he taught as an
adjunct professor in the Electrical Engineering-Electrophysics department at the
University of Southern California. Dr. Mahajan is an adjunct professor in the College of
Optical Sciences at the University of Arizona, and the Department of Optics and
Photonics at the National Central University in Taiwan, where he teaches graduate
courses on imaging and aberrations. He also teaches short courses on aberrations at
meetings of the Optical Society of America and SPIE. He has published numerous papers
on diffraction, aberrations, wavefront analysis, adaptive optics, and acousto-optics. He is a
fellow of OSA, SPIE, and the Optical Society of India. He is an associate editor of
OSA’s 3rd edition of the Handbook of Optics, and a recipient of SPIE’s Conrady award.
He has served as a Topical Editor of Optics Letters, chairman of OSA’s Astronomical,
Aeronautical, and Space Optics technical group, and a member of several committees of
both OSA and SPIE. Dr. Mahajan is the author of Aberration Theory Made Simple, 2nd
ed. (2011), editor of Selected Papers on Effects of Aberrations in Optical Imaging (1994),
and author of Optical Imaging and Aberrations, Part I: Ray Geometrical Optics (1998),
Part II: Wave Diffraction Optics, 2nd ed. (2011), and Part III: Wavefront Analysis (2013),
all published by SPIE Press.
FUNDAMENTALS OF
GEOMETRICAL OPTICS
Virendra N. Mahajan
Optical imaging starts with geometrical optics, and ray tracing lies at its forefront. This book
starts with Fermat’s principle and derives the three laws of geometrical optics from it. These
laws are used to obtain the exact ray-tracing equations, whose paraxial approximation yields
Gaussian imaging. After discussing imaging by refracting and reflecting systems, paraxial
ray tracing is used to determine the size of imaging elements and obscuration in mirror
systems. Stops, pupils, radiometry, and optical instruments are discussed next. The
chromatic and monochromatic aberrations are discussed in detail, followed by spot sizes and
spot diagrams of aberrated images of point objects. Each chapter ends with a summary and
a set of problems. The book ends with an epilogue, which summarizes the imaging process,
and outlines the next steps within and beyond geometrical optics.
Contents: Foundations of Geometrical Optics; Imaging by Refracting and Reflecting
Systems; Paraxial Ray Tracing; Stops, Pupils, and Radiometry; Optical Instruments;
Chromatic and Monochromatic Aberrations; Spot Sizes and Spot Diagrams.
Virendra N. Mahajan received his Ph.D. degree in optical sciences from
the College of Optical Sciences, University of Arizona. He is with The
Aerospace Corporation, where he is a distinguished scientist working on
space-based surveillance systems. He is an adjunct professor at the
University of Arizona and the National Central University in Taiwan. Dr.
Mahajan is the author of Aberration Theory Made Simple, 2nd ed. (2011),
editor of Selected Papers on Effects of Aberrations in Optical Imaging
(1994), and the author of Optical Imaging and Aberrations, Part I: Ray
Geometrical Optics, Part II: Wave Diffraction Optics, 2nd ed. (2011), and
Part III: Wavefront Analysis (2013).
P.O. Box 10
Bellingham, WA 98227-0010
ISBN: 9780819499981
SPIE Vol. No.: PM245
