the so-called ATA matrix [8] and not because it controlled the length of the solution vector, as was the true purpose of Levenberg.
The publication of Spencer's paper in Applied Optics caused a stir among those involved in lens design methods: it was greeted as a panacea. Unfortunately Spencer's method did not live fully up to expectations, probably because of the discontinuity discussed here, and was abandoned by many of its earlier enthusiastic supporters.
E. Glatzel, a mathematician and outstanding lens designer with Carl Zeiss, Oberkochen, had developed in 1961 [9] the method of automatic adaptive correction (AAC), which was published in detail only in 1968 [10]. Unlike Levenberg's method, the length of the solution vector is controlled by checking linearity at each step of the process with a completely original device. With Glatzel's method it is theoretically possible to bring both image errors and constraint errors exactly to target. The number of these errors cannot exceed the available degrees of freedom. It is not possible to automatically balance the entities involved. The method works well in the hands of experienced users. For these reasons Glatzel's method has appealed to only very few lens designers.
J. Rayces and L. Lebich in 1983 [11,12], in trying to implement Spencer's CDLS method in a lens design program, discovered a disturbing discontinuity in the method. This flaw is serious at the beginning of the design, when the constraint errors are large, but it can be overcome by reducing these errors to negligible values with Glatzel's method [13]. The flaw in CDLS is less serious during the course of the design, when the constraint errors are …

… and minimize the merit function. The initial results were very successful and led to the hope that one day soon the computer would completely take over the lens designer's job, and prematurely people talked about automatic lens design. It has not happened yet, after 52 years. More recently, and more realistically, computer-aided lens design has been used: design is done partly by the computer and partly by the lens designer, who operates the computer and the program. For this reason, even if the computer program does not minimize the merit function, the two together will optimize the design. So this is the word we shall use in this paper.

2.1 Lens system as a point in the N-dimensional parameter space.

A lens system at any time during the design stage may be represented by a point P in an N-dimensional space, figure 1, where the coordinates are parameters, that is, all entities in the lens system that are allowed to vary during the design process. Examples of lens parameters are the surface curvatures of lens elements, the axial separations between surfaces, the refractive index and dispersion of the glass materials, etc.

Fig. 1. Start point, predicted point, solution vector, and region of linearity.
The aim of all design methods is to improve the performance of the lens system. These methods will compute changes ∆xn to the parameters. The vector formed with the parameter changes is the

SOLUTION VECTOR  [∆x] = [∆x1, ∆x2, ∆x3, …, ∆xN]. (2)

SCALED POSITION  [X] = [X1, X2, X3, …, XN], (5)

SCALED SOLUTION  [∆X] = [∆X1, ∆X2, ∆X3, …, ∆XN]. (6)

The scaling factor is an indispensable tool to handle parameters of a different nature, like curvatures and axial separations, but also parameters of the same nature and different dimensions, like the curvatures of the primary and secondary mirrors in astronomical telescopes.
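The effect of such scaling can be shown in a few lines. This is a minimal sketch only; the paper's scaling equations are not reproduced in this excerpt, so the simple divisive scale factors and the numerical values below are assumptions for illustration:

```python
import numpy as np

# Hypothetical raw parameters of mixed nature: a curvature (1/mm)
# and an axial separation (mm). Their magnitudes differ by orders
# of magnitude, so raw parameter changes are not comparable.
x = np.array([0.02, 45.0])       # [curvature, separation]
scale = np.array([0.01, 10.0])   # assumed characteristic scales

# Scaled position X_n = x_n / scale_n: both entries are now O(1),
# so the length of a scaled solution vector [dX] is meaningful.
X = x / scale

# A unit change in any scaled parameter corresponds to one
# characteristic scale of the underlying physical quantity.
dX = np.array([0.5, -0.3])       # scaled solution vector
dx = dX * scale                  # back to physical parameter changes
```

With this normalization the components of the solution vector can be compared and its length controlled, regardless of the physical units of each parameter.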
                      soft     hard
COMPUTED FUNCTIONS    ϕj(P)    ψk(P)
ASSIGNED WEIGHTS      wj       none
ASSIGNED TARGETS      sj       hk
COMPUTED ERRORS       εj       ηk

Soft errors are those that can logically be minimized; there is a reasonable tolerance for them. Soft errors are mostly image errors, and they are related to the other entities by:

SOFT ERROR  εj(P) = wj · (ϕj(P) − sj). (7)

Hard errors, mostly non-image errors, are those that for good reasons must be reduced to zero; there is no useful tolerance for them. If a tolerance exists and it were taken advantage of, the benefit to image quality would be imperceptible. Suppose there is a 1% tolerance in the focal length: if the focal length is made shorter by that amount, the aberrations would decrease by the same amount, which is negligible. If there is a tolerance of 10%, then it would be wise to use that tolerance at the outset, but it would not be permissible to go beyond it during lens optimization.

The reason for functions and errors being called soft or hard is simply that soft targets or hard targets, respectively, are assigned to them. It is possible for an image error to be assigned a hard target, and then it becomes a hard error. Also, a non-image error could be assigned a soft target, and then it becomes a soft error.

SOFT ERROR VECTOR:  E = [ε1, ε2, …, εJ], (9)

HARD ERROR VECTOR:  H = [η1, η2, …, ηK]. (10)

There are no weights for hard errors because eventually they will be reduced to zero, and then a weight would have no effect on them.

2.5 Soft and hard merit functions.

In order to assess the improvement in performance of a system it helps to reduce the results to the least possible number of quantities. In optical design it is common practice to call these quantities merit functions, defined as the sum of the squares of the errors. We have, then:

SOFT MERIT FUNCTION:  Φ²(P) = EᵀE = Σ_{j=1}^{J} εj², (11)

HARD MERIT FUNCTION:  Ψ²(P) = HᵀH = Σ_{k=1}^{K} ηk². (12)

It should be kept in mind that the soft weights are absorbed into the soft errors. Weights make up for the difference in importance of the functions that go into the construction of the soft merit function.
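In code, equations (7) and (9)-(12) amount to a few lines. A minimal sketch; the function values, weights and targets below are hypothetical placeholders, not taken from the paper:

```python
import numpy as np

def soft_errors(phi, w, s):
    """Eq. (7): weighted soft errors eps_j = w_j * (phi_j - s_j)."""
    return w * (phi - s)

def merit(v):
    """Eqs. (11)-(12): sum of squared errors, v^T v."""
    return float(v @ v)

# Hypothetical computed function values, weights and targets.
phi = np.array([0.3, -0.1, 0.05])   # soft functions phi_j(P)
w   = np.array([1.0, 2.0, 4.0])     # weights w_j
s   = np.array([0.0, 0.0, 0.1])     # soft targets s_j
psi = np.array([100.2])             # hard function psi_k(P), e.g. a focal length
h   = np.array([100.0])             # hard target h_k

E = soft_errors(phi, w, s)          # soft error vector, eq. (9)
H = psi - h                         # hard errors carry no weights, eq. (10)
Phi2 = merit(E)                     # soft merit function, eq. (11)
Psi2 = merit(H)                     # hard merit function, eq. (12)
```

Note that, as the text says, the weights never appear again after eq. (7): they are absorbed into the soft error vector E.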
2.6 Linearization and error changes.

The optimization methods discussed here require that either the soft merit function be minimized, or the hard merit function be reduced to zero, or both:

Φ²(P) = EᵀE = Σ_{j=1}^{J} εj² = min. (13)
In the region of linearity the changes ∆εj, ∆ηk of the soft and hard errors are given by systems of linear equations:

∆εj = Σ_{n=1}^{N} (∂εj/∂Xn) ∆Xn = Σ_{n=1}^{N} wj (∂ϕj/∂Xn) ∆Xn, (15)

∆ηk = Σ_{n=1}^{N} (∂ηk/∂Xn) ∆Xn = Σ_{n=1}^{N} (∂ψk/∂Xn) ∆Xn. (16)

Two vectors are formed with the soft and hard error changes.

The partial derivatives are approximated by finite differences:

∂ϕ/∂x ≅ ∆ϕ/∆x,  ∂ψ/∂x ≅ ∆ψ/∆x. (21)

With matrix notation, the linear equations for the changes in the errors when the parameters change can be written

∆E = A · ∆X, (22)

∆H = C · ∆X. (23)
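The derivative matrices A and C of equations (22)-(23) are built one column per parameter from the finite differences of equation (21). A minimal sketch with a hypothetical error function of two parameters:

```python
import numpy as np

def jacobian(f, X, dx=1e-6):
    """Finite-difference Jacobian, eq. (21): df_j/dX_n ~ delta f / delta x.
    f maps an N-vector of parameters to a vector of errors."""
    f0 = np.asarray(f(X), dtype=float)
    J = np.empty((f0.size, X.size))
    for n in range(X.size):              # one column per parameter
        Xp = X.copy()
        Xp[n] += dx
        J[:, n] = (np.asarray(f(Xp)) - f0) / dx
    return J

# Hypothetical soft-error function of two parameters (not from the paper).
def soft(X):
    return np.array([X[0] ** 2 - 1.0, X[0] * X[1]])

X0 = np.array([2.0, 3.0])
A = jacobian(soft, X0)                   # matrix A of eq. (22)
dX = np.array([0.01, -0.02])
dE_lin = A @ dX                          # predicted error changes, eq. (22)
```

The same routine applied to the hard (constraint) functions yields matrix C of equation (23).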
3. NUMERICAL EXAMPLE

We shall illustrate the design methods discussed below with an example using soft and hard non-linear functions of two parameters, X1 and X2 (N = 2).

SOFT ERRORS AND MERIT FUNCTION

ϕ1 = (X1 − 3)² · (X2 + 1)²,  s1 = 0, w1 = 1,

ϕ2 = X1² + 25 (X2 − 4)²,  s2 = 0, w2 = 1,

Φ² = (ϕ1 − s1)² w1² + (ϕ2 − s2)² w2².

This problem, even with the limitation to N = 2, gives our 3-D mind a hint of what happens in the N-dimensional space.

4. LEAST SQUARES AND DAMPED LEAST SQUARES METHODS

4.1 The problem and the solution.

The problem of minimizing the soft merit function is stated as:

Φ²(P) = EᵀE = Σ_{j=1}^{J} εj² = min. (24)
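The two soft functions of the example are easy to reproduce. A sketch of the example's soft merit function, with s1 = s2 = 0 and unit weights as given above:

```python
def phi1(X1, X2):
    # First soft function of the example.
    return (X1 - 3.0) ** 2 * (X2 + 1.0) ** 2

def phi2(X1, X2):
    # Second soft function of the example.
    return X1 ** 2 + 25.0 * (X2 - 4.0) ** 2

def soft_merit(X1, X2):
    # Phi^2 = (phi1 - s1)^2 w1^2 + (phi2 - s2)^2 w2^2 with s = 0, w = 1.
    return phi1(X1, X2) ** 2 + phi2(X1, X2) ** 2
```

Note that the two functions have no common zero: phi2 = 0 forces (X1, X2) = (0, 4), where phi1 = 225, so the least-squares minimum is a genuine balance of both errors rather than an exact solution.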
∆X = (AᵀA + κI)⁻¹ AᵀE, (28)

where I is the unit matrix. The damping factor κ can take any positive value between zero and infinity, but its optimum value, the value such that Φ² is a minimum, must be found by trial and error.

Damped least squares solution:

∞ ≥ κ ≥ 0;
if κ = ∞,      ∆X = 0;
if ∞ > κ > 0,  ∆X = (AᵀA + κI)⁻¹ AᵀE;
if κ = 0,      ∆X = (AᵀA)⁻¹ AᵀE.

Fig. 6. In the method of damped least squares the beginning of an iteration is at the point where the previous iteration ends: there is no discontinuity.

4.3 Search for halt point.

The merit function becomes a function of the damping factor, Φ²(κ), during a given iteration. In order to find Φ²min, a quantity χ is defined:

χ = [1/(κ + 1)]^{1/m},  m > 0,  ∴ κ = 1/χ^m − 1. (29)

This quantity is incremented in short steps from χ = 0 (κ = ∞) to χ = 1 (κ = 0). The merit function Φ², or log(Φ²), is computed at every step: it will decrease slowly at the beginning, and then will increase abruptly. When Φ² begins increasing, the previous step is retraced and the search for Φ²min halts, to begin a new computation of partial derivatives.

In Figure 4 it may be seen that the starting point and end point of the plain least squares solution and of damped least squares are the same. But there is a leap in the former, while the latter describes a curved path. It may be seen that the curve starts normal to the loci of Φ² = const., then takes a sharp turn and ends parallel to the loci. In between, the curve has a better chance of being tangent to one of the loci, and that is the point where Φ² = min occurs. It is conceivable that something similar happens in the N-dimensional space, where the loci are hypersurfaces and the path of solutions is still a curved line.

5. LAGRANGE'S METHOD OF UNDETERMINED MULTIPLIERS AND SPENCER'S CONSTRAINED DAMPED LEAST SQUARES METHOD

5.1 The problem and the solution.

Given the conditions that the soft merit function be minimized and the hard merit function be reduced to zero,

Φ²(P) = EᵀE = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0), (30)

and assuming strict linearity, the problem may be solved with Lagrange's method [8]:

[ ∆X ]   [ AᵀA   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C     0 ]    · [  H  ] . (31)

The λ's are undetermined multipliers.

In the case of non-linear systems, like optimization, it will be necessary to compute the matrices and iterate the process. But every time there will be a leap, always with the danger of jumping into a bad area from where it is difficult to return. There is no record of anyone using Lagrange multipliers in optimization with even partial success.

5.2 Control of the solution vector length.

Borrowing from the method of damped least squares for non-linear systems, Spencer multiplied the sum of the squares of the parameter changes by a scalar κ, added the product to the sum of the squares of the errors, and minimized both terms, thus:

Φ²(P) = EᵀE + κ ∆Xᵀ∆X = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0). (32)

Note that nothing is done to damp the contribution of the vector of hard errors to the problem. In the solution the damping factor κ appears added to the diagonal elements of matrix AᵀA, as in the case of damped least squares, but not to the entire matrix:

[ ∆X ]   [ AᵀA + κI   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C          0 ]    · [  H  ] . (33)
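One iteration of damped least squares with the halt-point search of section 4.3 can be sketched as follows. The sweep walks χ from 0 toward 1 (κ from ∞ to 0) per equation (29) and retraces one step when Φ² starts to increase. The error function, Jacobian and exponent m are hypothetical; also, errors here are modeled so that E(X + ∆X) ≈ E + A·∆X, hence the minus sign in the solve, whereas the paper's equation (28), with its own sign convention, writes +AᵀE:

```python
import numpy as np

def dls_step(E_fun, J_fun, X, m=2.0, nsteps=50):
    """One DLS iteration, eqs. (28)-(30): sweep the damping factor via
    chi = (1/(kappa+1))^(1/m), eq. (29), and halt when Phi^2 turns up."""
    E0 = E_fun(X)
    A = J_fun(X)
    ATA, ATE = A.T @ A, A.T @ E0
    best_X, best_merit = X, float(E0 @ E0)        # chi = 0 means dX = 0
    for chi in np.linspace(0.0, 1.0, nsteps + 1)[1:]:
        kappa = 1.0 / chi ** m - 1.0              # eq. (29) inverted
        dX = np.linalg.solve(ATA + kappa * np.eye(len(X)), -ATE)  # eq. (28)
        E = E_fun(X + dX)
        phi2 = float(E @ E)
        if phi2 > best_merit:                     # Phi^2 began increasing:
            break                                 # retrace and halt
        best_X, best_merit = X + dX, phi2
    return best_X, best_merit

# Hypothetical nonlinear soft errors in two parameters.
E_fun = lambda X: np.array([X[0] ** 2 - 1.0, X[0] * X[1] - 2.0])
J_fun = lambda X: np.array([[2 * X[0], 0.0], [X[1], X[0]]])

X1, m1 = dls_step(E_fun, J_fun, np.array([2.0, 3.0]))
```

The halt point X1 then serves as the start point of the next iteration, where the partial derivatives are recomputed: as Fig. 6 states, there is no discontinuity.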
5.3 A gap in Spencer's method.

Spencer's method comes very close to being the ideal method for lens design, except for the already mentioned indetermination at the origin when the hard errors are not negligible. If κ = 0,

[ ∆X ]   [ AᵀA   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C     0 ]    · [  H  ] ,

but as κ → ∞ the solution vector ∆X does not vanish: it tends to a non-zero value, Appendix I. This proves beyond doubt that there is a discontinuity in the solution path. It is a jump or leap into a most likely undesirable region of the parameter space. How serious the leap is depends on matrix C and the vector of hard errors H. Nothing can be done about C, since it is made up of computed partial derivatives.

At the very beginning of the design process it is probable that the constraints are far from their targets and the vector of hard errors H is large. Then the discontinuity is a serious matter. Later on in this discussion we shall see the advantage of using Glatzel's automatic adaptive method to bring all constraints to target before starting to use Spencer's method. As the computation progresses the vector of hard errors becomes shorter, yet care is necessary to keep it within limits during the search for a halt point, as will be explained immediately.

Fig. 8. In Spencer's method one iteration begins at a point different from the ending point of a previous iteration when the vector of hard errors is not zero: therefore there is a discontinuity at the origin.

5.4 Large quantities of errors.

The DLS and CDLS methods are well suited to handle a large number of image errors, as is the case when the errors are samples of ray aberrations of dense fans of rays through the entrance pupil for several different colors (wavelengths), many field points and a good number of zoom positions. The number of errors may thus reach several thousand. This does not mean that the computer memory has to store matrices with several thousand rows, as explained in Appendix II.

6. GLATZEL'S AUTOMATIC ADAPTIVE CORRECTION METHOD.

The hard merit function is reduced to zero and the square of the vector of parameter changes is minimized:

Ψ²(P) = HᵀH = Σ_{k=1}^{K} ηk² = 0 (∴ each ηk = 0),  Ω = Σ_n ∆Xn² = min. (34)

∆X = Cᵀ (CCᵀ)⁻¹ · H. (35)

The search for a solution in a non-linear system is in steps of length

∆ξ = χ · ∆X,  0 < χ ≤ 1. (36)

Fig. 9. Spencer's method: single iteration. Partial derivatives for the current iteration are computed at the halt point of the previous iteration. Note the discontinuity from the start point to the first point of the current iteration.

At each step the error vector and the linearity ratio are computed. This linearity ratio is defined as

L = ∆σ / ∆ξ. (37)

∆ξ is the step length defined above and ∆σ is the solution vector increment, Fig. 11.

Fig. 10. Spencer's method: several iterations. Points P0, P1, … are start points of each iteration, where partial derivatives are computed. Points E0, E1, … are halt points of each iteration, either when Φ = min or when κ = 0. Note the discontinuities.

The value of the linearity ratio is used to determine the length of the parameter increment ∆ξ. It is also used to establish the end of the iteration, when L exceeds certain arbitrary limits:

0.7 < L < 1.4. (38)

Then a new iteration starts with the computation of partial derivatives. This is the counterpart of the DLS and CDLS methods, where a new iteration starts at the point Φmin of the previous iteration.

As already mentioned, DLS and CDLS are well suited to handle a large number of errors, as will be the case when the goal is to minimize the rms spot size or the rms wave aberration for each design wavelength and each field point, with sample values from rays distributed regularly over the lens pupil. Glatzel's method, on the other hand, is limited to bringing to target only a comparatively small number of errors. It is completely out of the question to try to control the rms spot size or the rms of the wave aberration obtained from a number of sample values obtained by ray tracing.

Instead it is possible, from ray tracing data, to compute the coefficients of the expansions of the geometric aberration or wave aberration [14], define these coefficients as errors, and then control their values with Glatzel's method.

Glatzel's method is particularly well suited to work with the coefficients of the Zernike expansion of the wave aberration function [15] written in the form [16]

W(ρ, ϑ) = A00 + Σ_{n=1}^{∞} √(n+1) An0 Rn⁰(ρ)
        + Σ_{n=1}^{∞} Σ_{m=1}^{n} √(2(n+1)) Anm Rnᵐ(ρ) cos mϑ. (39)

…"number of lower powers which balance it in the best way."

Except that only a finite number of coefficients can be used: the Zernike expansion, equation (39), must be truncated somewhere. This is a problem common to all infinite expansions.

Fig. 12. Spencer's method. It works well when the hard errors are all zero at the beginning and are maintained at very small residuals during the iterations.
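The κ → ∞ behaviour at the root of the discontinuity of section 5.3 (proved in Appendix I) is easy to verify numerically: as the damping grows, the step from Spencer's bordered system (33) does not shrink to zero but approaches Glatzel's solution (35). A small sketch with hypothetical, randomly generated matrices standing in for real lens derivatives:

```python
import numpy as np

def spencer_dX(A, C, E, H, kappa):
    """Solve Spencer's bordered system, eq. (33), and return dX."""
    N, K = A.shape[1], C.shape[0]
    M = np.block([[A.T @ A + kappa * np.eye(N), C.T],
                  [C, np.zeros((K, K))]])
    rhs = np.concatenate([A.T @ E, H])
    return np.linalg.solve(M, rhs)[:N]

def glatzel_dX(C, H):
    """Glatzel's solution, eq. (35): dX = C^T (C C^T)^-1 H."""
    return C.T @ np.linalg.solve(C @ C.T, H)

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # hypothetical soft-error derivatives
C = rng.standard_normal((1, 3))   # one hard constraint
E = rng.standard_normal(6)
H = np.array([0.5])               # non-zero hard error

dX_damped = {k: spencer_dX(A, C, E, H, k) for k in (1e2, 1e4)}
dX_gla = glatzel_dX(C, H)
# For non-zero H the heavily damped Spencer step is NOT zero: it
# approaches Glatzel's step. That non-zero limit is the discontinuity.
```

Raising κ from 10² to 10⁴ brings the damped step markedly closer to Glatzel's, while the constraint C·∆X = H remains satisfied at every κ; only when H = 0 does the step shrink to zero as in plain DLS.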
Glatzel’s method succeeded because of a is simplified by using two merit functions. One of them
number of devices not evident in the solution of matrix involves only image errors and we call it soft merit
equations. One of these devices is the test that will insure function. The other involves all the other errors, mostly
that the successive points in the solution vector are at all opto-mechanical constraints, and we call it hard merit
times within the region of linearity, i.e. the linearity test function. The value of the latter should at all times be
already explained. zero or a very small value, therefore there is practically
only one merit function to watch to asses the progress of
the computations. The design ends at a point when the
∆ξ step length soft merit function cannot be further optimized and
success depends to a much lesser extent on the choice of
increment of predicted weights than the DLS.
solution vector length ∆σ
The problem with CDLS is a disturbing
error vector length
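The linearity test of equations (36)-(38) can be sketched as a stepping loop: advance along Glatzel's solution vector in fractions χ, compare the actual change of the error vector with the linear prediction, and end the iteration when the ratio leaves the stated limits. This is one plausible reading of the ratio (37); the paper defines ∆σ and ∆ξ only loosely in this excerpt, so the ratio below, the constraint function and the fixed χ are illustrative assumptions:

```python
import numpy as np

def glatzel_iteration(H_fun, C_fun, X, chi=0.1, lo=0.7, hi=1.4, max_steps=50):
    """March along dX = C^T (C C^T)^-1 H (eq. 35) in steps chi*dX (eq. 36),
    halting when the linearity ratio L (eq. 37) leaves lo < L < hi (eq. 38)."""
    H0, C = H_fun(X), C_fun(X)                # derivatives once per iteration
    dX = C.T @ np.linalg.solve(C @ C.T, H0)
    for _ in range(max_steps):
        step = -chi * dX                      # reduce errors (value - target)
        predicted = C @ step                  # linear prediction of dH
        H1 = H_fun(X + step)
        L = np.linalg.norm(H1 - H0) / np.linalg.norm(predicted)
        if not (lo < L < hi):                 # linearity lost: end iteration,
            break                             # recompute derivatives next time
        X, H0 = X + step, H1
    return X, H0

# Hypothetical hard error: bring X1^2 + X2^2 to the target 4.
H_fun = lambda X: np.array([X[0] ** 2 + X[1] ** 2 - 4.0])
C_fun = lambda X: np.array([[2 * X[0], 2 * X[1]]])

X_end, H_end = glatzel_iteration(H_fun, C_fun, np.array([3.0, 0.0]))
```

Each pass of this loop is one of Glatzel's iterations: the hard error shrinks steadily while every accepted step stays inside the region of linearity.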
At that point the designer changes over to the CDLS method, which is superior when dealing with a large number of errors and when it becomes necessary to make a compromise at the same time that the constraints are all maintained on target.

The operation with AAC and CDLS combined for the solution of the numerical example is shown in figure 14.

A lens design program called Eikonal, based on these two methods, was used exclusively at the Applied Optics Division of the Perkin-Elmer Corporation from 1970. It continued to be used in its former divisions after 1991, when the company divested itself of the defense and space business.

ACKNOWLEDGMENTS

The authors wish to express their appreciation to Dr. Wu Jiang, Dr. Chungte B. Chen, Ms. Ronnie Yuan and Miss Stacy Shouru Chen for their help and suggestions in preparing this manuscript and the slides for the presentation at Photonics Asia, 2002.

IN MEMORIAM

This paper is dedicated to the memory of Dr. Erhard Glatzel, one of the most outstanding lens designers of the 20th century, who passed away in Heidenheim, Germany, early this year.

APPENDIX I. SPENCER'S SOLUTION FOR κ → ∞

Spencer's solution for the vector of parameter changes, ∆X, becomes undefined for κ → ∞. Therefore it is necessary to derive an alternative form valid for this particular case. We start with the conditions in Spencer's method:

Φ²(P) = EᵀE + κ ∆Xᵀ∆X = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0). (38)

If both sides of the first equation are divided by any arbitrary quantity the solution is unchanged: the minimum falls at exactly the same point. As a particular case we divide both sides by the damping factor κ to obtain

(1/κ) Φ²(P) = (1/κ) EᵀE + ∆Xᵀ∆X = min. (39)

Obviously

lim_{κ→∞} (1/κ) Φ²(P) = 0,  lim_{κ→∞} (1/κ) EᵀE = 0, (40)

therefore

∆Xᵀ∆X = min. (41)

This, together with the second of equations (38), is the same as the conditions in Glatzel's method, equations (34); therefore the solution is the same, equation (35):

∆X = Cᵀ (CCᵀ)⁻¹ · H. (42)

QED

APPENDIX II. THE PRODUCTS AᵀA AND AᵀE

One should expect to find in an optimization problem a reasonable number N of parameters and a reasonable number K of hard errors (constraints). But the number of soft errors J may be really large, like many thousands, when optimizing, for instance, the spot sizes.

At a glance it may appear necessary to store in the computer memory a huge J×N matrix A and a huge J×1 vector E. This is not so: the largest objects to be stored are an N×N matrix AᵀA and an N×1 vector AᵀE.

Just before evaluation, j = 0, the matrix AᵀA and the vector AᵀE are reset to zero. Errors and partial derivatives are computed one at a time, from j = 1 to j = J, as the lens system is evaluated. The partial derivatives are set in a 1×N vector Aj and the error in a 1×1 vector Ej:

Aj = [∂εj/∂X1, ∂εj/∂X2, …, ∂εj/∂XN],  Ej = (εj). (43)

Each time the derivatives and errors are computed, the matrix and the vector are updated:

AᵀA = Σ_{i=0}^{j−1} AᵢᵀAᵢ + AⱼᵀAⱼ,  AᵀE = Σ_{i=0}^{j−1} AᵢᵀEᵢ + AⱼᵀEⱼ, (44)

until j = J.

REFERENCES

1. S. Rosen and C. Eldert, "Least squares method for optical correction", J. Opt. Soc. Am. 44, No. 3, pp. 250-252 (1954).
2. K. Levenberg, "A method for the solution of certain nonlinear problems in least-squares", Quart. Appl. Math. 2, 164 (1944).
3. R. E. Hopkins, C. A. McCarthy and R. Walters, J. Opt. Soc. Am. 45, 363 (1955).
4. A. Girard, Rev. Opt. 37, 225, 397 (1958).
5. J. Meiron, J. Opt. Soc. Am. 49, 293 (1959).
6. C. G. Wynne, Proc. Phys. Soc. (London) 73, 777 (1959).
7. G. H. Spencer, "A flexible automatic lens correction procedure", Appl. Opt. 2, No. 12, pp. 1257-1264 (1963).
8. G. Golub, "Numerical methods for solving linear least squares problems", Numerische Mathematik 7, pp. 206-216 (1965).
9. E. Glatzel, "Ein neues Verfahren zur automatischen Korrektion optischer Systeme mit elektronischen Rechenmaschinen", Optik 18 (10/11), pp. 577-580 (1961).
10. E. Glatzel and R. Wilson, "Adaptive automatic correction in optical design", Appl. Opt. 7, No. 22, p. 265 (1968).
11. J. L. Rayces and L. Lebich, "Optimization of constrained optical systems" (A), J. Opt. Soc. Am. A 1, p. 1219 (1984).
12. J. L. Rayces and L. Lebich, "Experiments on constrained optimization with Spencer's method", Optical Engineering 27, No. 12, pp. 1031-1034 (1988).
13. J. L. Rayces and L. Lebich, "A hybrid method of lens optimization", OSA Annual Meeting, Santa Clara, California, October 30-November 4 (1988).
14. J. L. Rayces, "Least squares fitting of orthogonal polynomials to the wave aberration function", Applied Optics 31, 13, pp. 2223-2228 (1992).
15. J. L. Rayces and A. Hsieh, "Lens design with Glatzel's adaptive method and the Nijboer-Zernike aberration coefficients" (A), J. Opt. Soc. Am. 62, p. 711 (1970).
16. G. S. Frysinger, "Using Zernike polynomials", Optics and Photonics News 3, 12, p. 3 (1992).
17. F. Zernike, "The diffraction theory of aberrations", Optical Image Evaluation Symposium, NBS Circular 526, p. 1 (1954).
18. R. E. Hopkins and G. Spencer, "Creative thinking and computing machines in optical design", J. Opt. Soc. Am. 52, 2, pp. 172-176 (1962).