the so-called ATA matrix [8] and not because it controlled the length of the solution vector, as was the true purpose of Levenberg.
The publication of Spencer's paper in Applied Optics caused a stir among those involved in lens design methods: it was greeted as a panacea. Unfortunately Spencer's method did not live fully up to expectations, probably because of the discontinuity discussed here, and was abandoned by many of its earlier enthusiastic supporters.
E. Glatzel, a mathematician and outstanding lens designer with Carl Zeiss, Oberkochen, had developed in 1961 [9] the method of automatic adaptive correction (AAC), which was published in detail only in 1968 [10]. Unlike Levenberg's method, the length of the solution vector is controlled by checking linearity at each step of the process with a completely original device. With Glatzel's method it is theoretically possible to bring both image errors and constraint errors exactly to target. The number of these errors cannot exceed the available degrees of freedom. It is not possible to automatically balance the entities involved. The method works well in the hands of experienced users. For these reasons Glatzel's method has appealed to only very few lens designers.
J. Rayces and L. Lebich in 1983 [11,12], in trying to implement Spencer's CDLS method in a lens design program, discovered a disturbing discontinuity in the method. This flaw is serious at the beginning of the design, when the constraint errors are large, but it can be overcome by reducing these errors to negligible values with Glatzel's method [13]. The flaw in CDLS is less serious during the course of the design, when the constraint errors are …

… and minimize the merit function. The initial results were very successful and led to the hope that one day soon the computer would completely take over the lens designer's job, and prematurely people talked about automatic lens design. It has not happened yet, after 52 years. More recently, and more realistically, computer-aided lens design has been used: design is done partly by the computer and partly by the lens designer, who operates the computer and the program. For this reason, even if the computer program does not minimize the merit function, the two together will optimize the design. So this is the word we shall use in this paper.

2.1 Lens system as a point in the N-dimensional parameter space.

A lens system at any time during the design stage may be represented by a point P in an N-dimensional space, figure 1, where the coordinates are parameters, that is, all entities in the lens system that are allowed to vary during the design process. Examples of lens parameters are the surface curvatures of lens elements, the axial separations between surfaces, the refractive index and dispersion of the glass materials, etc.

Fig. 1. Start point, predicted point, solution vector, and region of linearity.
The aim of all design methods is to improve the performance of the lens system. These methods will compute changes ∆xn to the parameters. The vector formed with the parameter changes is the

SOLUTION VECTOR  [∆x] = [∆x1, ∆x2, ∆x3, …, ∆xN]. (2)

SCALED POSITION  [X] = [X1, X2, X3, …, XN], (5)

SCALED SOLUTION  [∆X] = [∆X1, ∆X2, ∆X3, …, ∆XN]. (6)

The scaling factor is an indispensable tool to handle parameters of a different nature, like curvatures and axial separations, but also parameters of the same nature and different dimensions, like the curvatures of the primary and secondary mirrors in astronomical telescopes.
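The effect of such scaling can be shown in a few lines. This is a minimal sketch only; the paper's scaling equations are not reproduced in this excerpt, so the simple divisive scale factors and the numerical values below are assumptions for illustration:

```python
import numpy as np

# Hypothetical raw parameters of mixed nature: a curvature (1/mm)
# and an axial separation (mm). Their magnitudes differ by orders
# of magnitude, so raw parameter changes are not comparable.
x = np.array([0.02, 45.0])       # [curvature, separation]
scale = np.array([0.01, 10.0])   # assumed characteristic scales

# Scaled position X_n = x_n / scale_n: both entries are now O(1),
# so the length of a scaled solution vector [dX] is meaningful.
X = x / scale

# A unit change in any scaled parameter corresponds to one
# characteristic scale of the underlying physical quantity.
dX = np.array([0.5, -0.3])       # scaled solution vector
dx = dX * scale                  # back to physical parameter changes
```

With this normalization the components of the solution vector can be compared and its length controlled, regardless of the physical units of each parameter.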
                      soft     hard
COMPUTED FUNCTIONS    ϕj(P)    ψk(P)
ASSIGNED WEIGHTS      wj       none
ASSIGNED TARGETS      sj       hk
COMPUTED ERRORS       εj       ηk

Soft errors are those that can logically be minimized; there is a reasonable tolerance for them. Soft errors are mostly image errors, and they are related to the other entities by:

SOFT ERROR  εj(P) = wj · (ϕj(P) − sj). (7)

Hard errors, mostly non-image errors, are those that for good reasons must be reduced to zero; there is no useful tolerance for them. If a tolerance exists and it were taken advantage of, the benefit to image quality would be imperceptible. Suppose there is a 1% tolerance in the focal length: if the focal length is made shorter by that amount, the aberrations would decrease by the same amount, which is negligible. If there is a tolerance of 10%, then it would be wise to use that tolerance at the outset, but it would not be permissible to go beyond it during lens optimization.

The reason for functions and errors being called soft or hard is simply that soft targets or hard targets, respectively, are assigned to them. It is possible for an image error to be assigned a hard target, and then it becomes a hard error. Also, a non-image error could be assigned a soft target, and then it becomes a soft error.

SOFT ERROR VECTOR:  E = [ε1, ε2, …, εJ], (9)

HARD ERROR VECTOR:  H = [η1, η2, …, ηK]. (10)

There are no weights for hard errors because eventually they will be reduced to zero, and then a weight would have no effect on them.

2.5 Soft and hard merit functions.

In order to assess the improvement in performance of a system it helps to reduce the results to the least possible number of quantities. In optical design it is common practice to call these quantities merit functions, defined as the sum of the squares of the errors. We have, then:

SOFT MERIT FUNCTION:  Φ²(P) = EᵀE = Σ_{j=1}^{J} εj², (11)

HARD MERIT FUNCTION:  Ψ²(P) = HᵀH = Σ_{k=1}^{K} ηk². (12)

It should be kept in mind that the soft weights are absorbed into the soft errors. Weights make up for the difference in importance of the functions that go into the construction of the soft merit function.
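In code, equations (7) and (9)-(12) amount to a few lines. A minimal sketch; the function values, weights and targets below are hypothetical placeholders, not taken from the paper:

```python
import numpy as np

def soft_errors(phi, w, s):
    """Eq. (7): weighted soft errors eps_j = w_j * (phi_j - s_j)."""
    return w * (phi - s)

def merit(v):
    """Eqs. (11)-(12): sum of squared errors, v^T v."""
    return float(v @ v)

# Hypothetical computed function values, weights and targets.
phi = np.array([0.3, -0.1, 0.05])   # soft functions phi_j(P)
w   = np.array([1.0, 2.0, 4.0])     # weights w_j
s   = np.array([0.0, 0.0, 0.1])     # soft targets s_j
psi = np.array([100.2])             # hard function psi_k(P), e.g. a focal length
h   = np.array([100.0])             # hard target h_k

E = soft_errors(phi, w, s)          # soft error vector, eq. (9)
H = psi - h                         # hard errors carry no weights, eq. (10)
Phi2 = merit(E)                     # soft merit function, eq. (11)
Psi2 = merit(H)                     # hard merit function, eq. (12)
```

Note that, as the text says, the weights never appear again after eq. (7): they are absorbed into the soft error vector E.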
2.6 Linearization and error changes.

The optimization methods discussed here require that either the soft merit function be minimized, or the hard merit function be reduced to zero, or both:

Φ²(P) = EᵀE = Σ_{j=1}^{J} εj² = min. (13)
In the region of linearity the changes ∆εj, ∆ηk of the soft and hard errors are given by systems of linear equations:

∆εj = Σ_{n=1}^{N} (∂εj/∂Xn) ∆Xn = Σ_{n=1}^{N} wj (∂ϕj/∂Xn) ∆Xn, (15)

∆ηk = Σ_{n=1}^{N} (∂ηk/∂Xn) ∆Xn = Σ_{n=1}^{N} (∂ψk/∂Xn) ∆Xn. (16)

Two vectors are formed with the soft and hard error changes.

The partial derivatives are approximated by finite differences:

∂ϕ/∂x ≅ ∆ϕ/∆x,  ∂ψ/∂x ≅ ∆ψ/∆x. (21)

With matrix notation, the linear equations for the changes in the errors when the parameters change can be written

∆E = A · ∆X, (22)

∆H = C · ∆X. (23)
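The derivative matrices A and C of equations (22)-(23) are built one column per parameter from the finite differences of equation (21). A minimal sketch with a hypothetical error function of two parameters:

```python
import numpy as np

def jacobian(f, X, dx=1e-6):
    """Finite-difference Jacobian, eq. (21): df_j/dX_n ~ delta f / delta x.
    f maps an N-vector of parameters to a vector of errors."""
    f0 = np.asarray(f(X), dtype=float)
    J = np.empty((f0.size, X.size))
    for n in range(X.size):              # one column per parameter
        Xp = X.copy()
        Xp[n] += dx
        J[:, n] = (np.asarray(f(Xp)) - f0) / dx
    return J

# Hypothetical soft-error function of two parameters (not from the paper).
def soft(X):
    return np.array([X[0] ** 2 - 1.0, X[0] * X[1]])

X0 = np.array([2.0, 3.0])
A = jacobian(soft, X0)                   # matrix A of eq. (22)
dX = np.array([0.01, -0.02])
dE_lin = A @ dX                          # predicted error changes, eq. (22)
```

The same routine applied to the hard (constraint) functions yields matrix C of equation (23).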
3. NUMERICAL EXAMPLE

We shall illustrate the design methods discussed below with an example using soft and hard non-linear functions of two parameters, X1 and X2 (N = 2).

SOFT ERRORS AND MERIT FUNCTION

ϕ1 = (X1 − 3)² · (X2 + 1)²,  s1 = 0, w1 = 1,

ϕ2 = X1² + 25 (X2 − 4)²,  s2 = 0, w2 = 1,

Φ² = (ϕ1 − s1)² w1² + (ϕ2 − s2)² w2².

This problem, even with the limitation to N = 2, gives our 3-D mind a hint of what happens in the N-dimensional space.

4. LEAST SQUARES AND DAMPED LEAST SQUARES METHODS

4.1 The problem and the solution.

The problem of minimizing the soft merit function is stated as:

Φ²(P) = EᵀE = Σ_{j=1}^{J} εj² = min. (24)
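The two soft functions of the example are easy to reproduce. A sketch of the example's soft merit function, with s1 = s2 = 0 and unit weights as given above:

```python
def phi1(X1, X2):
    # First soft function of the example.
    return (X1 - 3.0) ** 2 * (X2 + 1.0) ** 2

def phi2(X1, X2):
    # Second soft function of the example.
    return X1 ** 2 + 25.0 * (X2 - 4.0) ** 2

def soft_merit(X1, X2):
    # Phi^2 = (phi1 - s1)^2 w1^2 + (phi2 - s2)^2 w2^2 with s = 0, w = 1.
    return phi1(X1, X2) ** 2 + phi2(X1, X2) ** 2
```

Note that the two functions have no common zero: phi2 = 0 forces (X1, X2) = (0, 4), where phi1 = 225, so the least-squares minimum is a genuine balance of both errors rather than an exact solution.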
∆X = (AᵀA + κI)⁻¹ AᵀE, (28)

where I is the unit matrix. The damping factor κ can take any positive value between zero and infinity, but its optimum value, the value such that Φ² is a minimum, must be found by trial and error.

Damped least squares solution:

∞ ≥ κ ≥ 0;
if κ = ∞,      ∆X = 0;
if ∞ > κ > 0,  ∆X = (AᵀA + κI)⁻¹ AᵀE;
if κ = 0,      ∆X = (AᵀA)⁻¹ AᵀE.

Fig. 6. In the method of damped least squares the beginning of an iteration is at the point where the previous iteration ends: there is no discontinuity.

4.3 Search for halt point.

The merit function becomes a function of the damping factor, Φ²(κ), during a given iteration. In order to find Φ²min, a quantity χ is defined:

χ = [1/(κ + 1)]^{1/m},  m > 0,  ∴ κ = 1/χ^m − 1. (29)

This quantity is incremented in short steps from χ = 0 (κ = ∞) to χ = 1 (κ = 0). The merit function Φ², or log(Φ²), is computed at every step: it will decrease slowly at the beginning, and then will increase abruptly. When Φ² begins increasing, the previous step is retraced and the search for Φ²min halts, to begin a new computation of partial derivatives.

In Figure 4 it may be seen that the starting point and end point of the plain least squares solution and of damped least squares are the same. But there is a leap in the former, while the latter describes a curved path. It may be seen that the curve starts normal to the loci of Φ² = const., then takes a sharp turn and ends parallel to the loci. In between, the curve has a better chance of being tangent to one of the loci, and that is the point where Φ² = min occurs. It is conceivable that something similar happens in the N-dimensional space, where the loci are hypersurfaces and the path of solutions is still a curved line.

5. LAGRANGE'S METHOD OF UNDETERMINED MULTIPLIERS AND SPENCER'S CONSTRAINED DAMPED LEAST SQUARES METHOD

5.1 The problem and the solution.

Given the conditions that the soft merit function be minimized and the hard merit function be reduced to zero,

Φ²(P) = EᵀE = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0), (30)

and assuming strict linearity, the problem may be solved with Lagrange's method [8]:

[ ∆X ]   [ AᵀA   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C     0 ]    · [  H  ] . (31)

The λ's are undetermined multipliers.

In the case of non-linear systems, like optimization, it will be necessary to compute the matrices and iterate the process. But every time there will be a leap, always with the danger of jumping into a bad area from where it is difficult to return. There is no record of anyone using Lagrange multipliers in optimization with even partial success.

5.2 Control of the solution vector length.

Borrowing from the method of damped least squares for non-linear systems, Spencer multiplied the sum of the squares of the parameter changes by a scalar κ, added the product to the sum of the squares of the errors, and minimized both terms, thus:

Φ²(P) = EᵀE + κ ∆Xᵀ∆X = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0). (32)

Note that nothing is done to damp the contribution of the vector of hard errors to the problem. In the solution the damping factor κ appears added to the diagonal elements of matrix AᵀA, as in the case of damped least squares, but not to the entire matrix:

[ ∆X ]   [ AᵀA + κI   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C          0 ]    · [  H  ] . (33)
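One iteration of damped least squares with the halt-point search of section 4.3 can be sketched as follows. The sweep walks χ from 0 toward 1 (κ from ∞ to 0) per equation (29) and retraces one step when Φ² starts to increase. The error function, Jacobian and exponent m are hypothetical; also, errors here are modeled so that E(X + ∆X) ≈ E + A·∆X, hence the minus sign in the solve, whereas the paper's equation (28), with its own sign convention, writes +AᵀE:

```python
import numpy as np

def dls_step(E_fun, J_fun, X, m=2.0, nsteps=50):
    """One DLS iteration, eqs. (28)-(30): sweep the damping factor via
    chi = (1/(kappa+1))^(1/m), eq. (29), and halt when Phi^2 turns up."""
    E0 = E_fun(X)
    A = J_fun(X)
    ATA, ATE = A.T @ A, A.T @ E0
    best_X, best_merit = X, float(E0 @ E0)        # chi = 0 means dX = 0
    for chi in np.linspace(0.0, 1.0, nsteps + 1)[1:]:
        kappa = 1.0 / chi ** m - 1.0              # eq. (29) inverted
        dX = np.linalg.solve(ATA + kappa * np.eye(len(X)), -ATE)  # eq. (28)
        E = E_fun(X + dX)
        phi2 = float(E @ E)
        if phi2 > best_merit:                     # Phi^2 began increasing:
            break                                 # retrace and halt
        best_X, best_merit = X + dX, phi2
    return best_X, best_merit

# Hypothetical nonlinear soft errors in two parameters.
E_fun = lambda X: np.array([X[0] ** 2 - 1.0, X[0] * X[1] - 2.0])
J_fun = lambda X: np.array([[2 * X[0], 0.0], [X[1], X[0]]])

X1, m1 = dls_step(E_fun, J_fun, np.array([2.0, 3.0]))
```

The halt point X1 then serves as the start point of the next iteration, where the partial derivatives are recomputed: as Fig. 6 states, there is no discontinuity.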
5.3 A gap in Spencer's method.

Spencer's method comes very close to being the ideal method for lens design, except for the already mentioned indetermination at the origin when the hard errors are not negligible. If κ = 0,

[ ∆X ]   [ AᵀA   Cᵀ ]⁻¹   [ AᵀE ]
[  λ ] = [  C     0 ]    · [  H  ] ,

but as κ → ∞ the solution vector ∆X does not vanish: it tends to a non-zero value, Appendix I. This proves beyond doubt that there is a discontinuity in the solution path. It is a jump or leap into a most likely undesirable region of the parameter space. How serious the leap is depends on matrix C and the vector of hard errors H. Nothing can be done about C, since it is made up of computed partial derivatives.

At the very beginning of the design process it is probable that the constraints are far from their targets and the vector of hard errors H is large. Then the discontinuity is a serious matter. Later on in this discussion we shall see the advantage of using Glatzel's automatic adaptive method to bring all constraints to target before starting to use Spencer's method. As the computation progresses the vector of hard errors becomes shorter, yet care is necessary to keep it within limits during the search for a halt point, as will be explained immediately.

Fig. 8. In Spencer's method one iteration begins at a point different from the ending point of a previous iteration when the vector of hard errors is not zero: therefore there is a discontinuity at the origin.

5.4 Large quantities of errors.

The DLS and CDLS methods are well suited to handle a large number of image errors, as is the case when the errors are samples of ray aberrations of dense fans of rays through the entrance pupil for several different colors (wavelengths), many field points and a good number of zoom positions. The number of errors may thus reach several thousand. This does not mean that the computer memory has to store matrices with several thousand rows, as explained in Appendix II.

6. GLATZEL'S AUTOMATIC ADAPTIVE CORRECTION METHOD.

The hard merit function is reduced to zero and the square of the vector of parameter changes is minimized:

Ψ²(P) = HᵀH = Σ_{k=1}^{K} ηk² = 0 (∴ each ηk = 0),  Ω = Σ_n ∆Xn² = min. (34)

∆X = Cᵀ (CCᵀ)⁻¹ · H. (35)

The search for a solution in a non-linear system is in steps of length

∆ξ = χ · ∆X,  0 < χ ≤ 1. (36)

Fig. 9. Spencer's method: single iteration. Partial derivatives for the current iteration are computed at the halt point of the previous iteration. Note the discontinuity from the start point to the first point of the current iteration.

At each step the error vector and the linearity ratio are computed. This linearity ratio is defined as

L = ∆σ / ∆ξ. (37)

∆ξ is the step length defined above and ∆σ is the solution vector increment, Fig. 11.

Fig. 10. Spencer's method: several iterations. Points P0, P1, … are start points of each iteration, where partial derivatives are computed. Points E0, E1, … are halt points of each iteration, either when Φ = min or when κ = 0. Note the discontinuities.

The value of the linearity ratio is used to determine the length of the parameter increment ∆ξ. It is also used to establish the end of the iteration, when L exceeds certain arbitrary limits:

0.7 < L < 1.4. (38)

Then a new iteration starts with the computation of partial derivatives. This is the counterpart of the DLS and CDLS methods, where a new iteration starts at the point Φmin of the previous iteration.

As already mentioned, DLS and CDLS are well suited to handle a large number of errors, as will be the case when the goal is to minimize the rms spot size or the rms wave aberration for each design wavelength and each field point, with sample values from rays distributed regularly over the lens pupil. Glatzel's method, on the other hand, is limited to bringing to target only a comparatively small number of errors. It is completely out of the question to try to control the rms spot size or the rms of the wave aberration obtained from a number of sample values obtained by ray tracing.

Instead it is possible, from ray tracing data, to compute the coefficients of the expansions of the geometric aberration or wave aberration [14], define these coefficients as errors, and then control their values with Glatzel's method.

Glatzel's method is particularly well suited to work with the coefficients of the Zernike expansion of the wave aberration function [15] written in the form [16]

W(ρ, ϑ) = A00 + Σ_{n=1}^{∞} √(n+1) An0 Rn⁰(ρ)
        + Σ_{n=1}^{∞} Σ_{m=1}^{n} √(2(n+1)) Anm Rnᵐ(ρ) cos mϑ. (39)

…"number of lower powers which balance it in the best way."

Except that only a finite number of coefficients can be used: the Zernike expansion, equation (39), must be truncated somewhere. This is a problem common to all infinite expansions.

Fig. 12. Spencer's method. It works well when the hard errors are all zero at the beginning and are maintained at very small residuals during the iterations.
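The κ → ∞ behaviour at the root of the discontinuity of section 5.3 (proved in Appendix I) is easy to verify numerically: as the damping grows, the step from Spencer's bordered system (33) does not shrink to zero but approaches Glatzel's solution (35). A small sketch with hypothetical, randomly generated matrices standing in for real lens derivatives:

```python
import numpy as np

def spencer_dX(A, C, E, H, kappa):
    """Solve Spencer's bordered system, eq. (33), and return dX."""
    N, K = A.shape[1], C.shape[0]
    M = np.block([[A.T @ A + kappa * np.eye(N), C.T],
                  [C, np.zeros((K, K))]])
    rhs = np.concatenate([A.T @ E, H])
    return np.linalg.solve(M, rhs)[:N]

def glatzel_dX(C, H):
    """Glatzel's solution, eq. (35): dX = C^T (C C^T)^-1 H."""
    return C.T @ np.linalg.solve(C @ C.T, H)

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))   # hypothetical soft-error derivatives
C = rng.standard_normal((1, 3))   # one hard constraint
E = rng.standard_normal(6)
H = np.array([0.5])               # non-zero hard error

dX_damped = {k: spencer_dX(A, C, E, H, k) for k in (1e2, 1e4)}
dX_gla = glatzel_dX(C, H)
# For non-zero H the heavily damped Spencer step is NOT zero: it
# approaches Glatzel's step. That non-zero limit is the discontinuity.
```

Raising κ from 10² to 10⁴ brings the damped step markedly closer to Glatzel's, while the constraint C·∆X = H remains satisfied at every κ; only when H = 0 does the step shrink to zero as in plain DLS.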
Glatzel’s method succeeded because of a is simplified by using two merit functions. One of them
number of devices not evident in the solution of matrix involves only image errors and we call it soft merit
equations. One of these devices is the test that will insure function. The other involves all the other errors, mostly
that the successive points in the solution vector are at all opto-mechanical constraints, and we call it hard merit
times within the region of linearity, i.e. the linearity test function. The value of the latter should at all times be
already explained. zero or a very small value, therefore there is practically
only one merit function to watch to asses the progress of
the computations. The design ends at a point when the
∆ξ step length soft merit function cannot be further optimized and
success depends to a much lesser extent on the choice of
increment of predicted weights than the DLS.
solution vector length ∆σ
The problem with CDLS is a disturbing
error vector length
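The linearity test of equations (36)-(38) can be sketched as a stepping loop: advance along Glatzel's solution vector in fractions χ, compare the actual change of the error vector with the linear prediction, and end the iteration when the ratio leaves the stated limits. This is one plausible reading of the ratio (37); the paper defines ∆σ and ∆ξ only loosely in this excerpt, so the ratio below, the constraint function and the fixed χ are illustrative assumptions:

```python
import numpy as np

def glatzel_iteration(H_fun, C_fun, X, chi=0.1, lo=0.7, hi=1.4, max_steps=50):
    """March along dX = C^T (C C^T)^-1 H (eq. 35) in steps chi*dX (eq. 36),
    halting when the linearity ratio L (eq. 37) leaves lo < L < hi (eq. 38)."""
    H0, C = H_fun(X), C_fun(X)                # derivatives once per iteration
    dX = C.T @ np.linalg.solve(C @ C.T, H0)
    for _ in range(max_steps):
        step = -chi * dX                      # reduce errors (value - target)
        predicted = C @ step                  # linear prediction of dH
        H1 = H_fun(X + step)
        L = np.linalg.norm(H1 - H0) / np.linalg.norm(predicted)
        if not (lo < L < hi):                 # linearity lost: end iteration,
            break                             # recompute derivatives next time
        X, H0 = X + step, H1
    return X, H0

# Hypothetical hard error: bring X1^2 + X2^2 to the target 4.
H_fun = lambda X: np.array([X[0] ** 2 + X[1] ** 2 - 4.0])
C_fun = lambda X: np.array([[2 * X[0], 2 * X[1]]])

X_end, H_end = glatzel_iteration(H_fun, C_fun, np.array([3.0, 0.0]))
```

Each pass of this loop is one of Glatzel's iterations: the hard error shrinks steadily while every accepted step stays inside the region of linearity.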
At that point the designer changes over to the CDLS method, which is superior when dealing with a large number of errors and when it becomes necessary to make a compromise at the same time that the constraints are all maintained on target.

The operation with AAC and CDLS combined for the solution of the numerical example is shown in figure 14.

A lens design program called Eikonal, based on these two methods, was used exclusively at the Applied Optics Division of the Perkin-Elmer Corporation from 1970. It continued to be used in its former divisions after 1991, when the company divested itself of the defense and space business.

ACKNOWLEDGMENTS

The authors wish to express their appreciation to Dr. Wu Jiang, Dr. Chungte B. Chen, Ms. Ronnie Yuan and Miss Stacy Shouru Chen for their help and suggestions in preparing this manuscript and the slides for the presentation at Photonics Asia, 2002.

IN MEMORIAM

This paper is dedicated to the memory of Dr. Erhard Glatzel, one of the most outstanding lens designers of the 20th century, who passed away in Heidenheim, Germany, early this year.

APPENDIX I. SPENCER'S SOLUTION FOR κ → ∞

Spencer's solution for the vector of parameter changes, ∆X, becomes undefined for κ → ∞. Therefore it is necessary to derive an alternative form valid for this particular case. We start with the conditions in Spencer's method:

Φ²(P) = EᵀE + κ ∆Xᵀ∆X = min,  Ψ²(P) = HᵀH = 0 (∴ each ηk = 0). (38)

If both sides of the first equation are divided by any arbitrary quantity the solution is unchanged: the minimum falls at exactly the same point. As a particular case we divide both sides by the damping factor κ to obtain

(1/κ) Φ²(P) = (1/κ) EᵀE + ∆Xᵀ∆X = min. (39)

Obviously

lim_{κ→∞} (1/κ) Φ²(P) = 0,  lim_{κ→∞} (1/κ) EᵀE = 0, (40)

therefore

∆Xᵀ∆X = min. (41)

This, together with the second of equations (38), is the same as the conditions in Glatzel's method, equations (34); therefore the solution is the same, equation (35):

∆X = Cᵀ (CCᵀ)⁻¹ · H. (42)

QED

APPENDIX II. THE PRODUCTS AᵀA AND AᵀE

One should expect to find in an optimization problem a reasonable number N of parameters and a reasonable number K of hard errors (constraints). But the number of soft errors J may be really large, like many thousands, when optimizing, for instance, the spot sizes.

At a glance it may appear necessary to store in the computer memory a huge J×N matrix A and a huge J×1 vector E. This is not so: the largest objects to be stored are an N×N matrix AᵀA and an N×1 vector AᵀE.

Just before evaluation, j = 0, the matrix AᵀA and the vector AᵀE are reset to zero. Errors and partial derivatives are computed one at a time, from j = 1 to j = J, as the lens system is evaluated. The partial derivatives are set in a 1×N vector Aj and the error in a 1×1 vector Ej:

Aj = [∂εj/∂X1, ∂εj/∂X2, …, ∂εj/∂XN],  Ej = (εj). (43)

Each time the derivatives and errors are computed, the matrix and the vector are updated:

AᵀA = Σ_{i=0}^{j−1} AᵢᵀAᵢ + AⱼᵀAⱼ,  AᵀE = Σ_{i=0}^{j−1} AᵢᵀEᵢ + AⱼᵀEⱼ, (44)

until j = J.

REFERENCES

1. S. Rosen and C. Eldert, "Least squares method for optical correction", J. Opt. Soc. Am. 44, No. 3, pp. 250-252 (1954).
2. K. Levenberg, "A method for the solution of certain nonlinear problems in least-squares", Quart. Appl. Math. 2, 164 (1944).
3. R. E. Hopkins, C. A. McCarthy and R. Walters, J. Opt. Soc. Am. 45, 363 (1955).
4. A. Girard, Rev. Opt. 37, 225, 397 (1958).
5. J. Meiron, J. Opt. Soc. Am. 49, 293 (1959).
6. C. G. Wynne, Proc. Phys. Soc. (London) 73, 777 (1959).
7. G. H. Spencer, "A flexible automatic lens correction procedure", Appl. Opt. 2, No. 12, pp. 1257-1264 (1963).
8. G. Golub, "Numerical methods for solving linear least squares problems", Numerische Mathematik 7, pp. 206-216 (1965).
9. E. Glatzel, "Ein neues Verfahren zur automatischen Korrektion optischer Systeme mit elektronischen Rechenmaschinen", Optik 18 (10/11), pp. 577-580 (1961).
10. E. Glatzel and R. Wilson, "Adaptive automatic correction in optical design", Appl. Opt. 7, No. 22, p. 265 (1968).
11. J. L. Rayces and L. Lebich, "Optimization of constrained optical systems" (A), J. Opt. Soc. Am. A 1, p. 1219 (1984).
12. J. L. Rayces and L. Lebich, "Experiments on constrained optimization with Spencer's method", Optical Engineering 27, No. 12, pp. 1031-1034 (1988).
13. J. L. Rayces and L. Lebich, "A hybrid method of lens optimization", OSA Annual Meeting, Santa Clara, California, October 30-November 4 (1988).
14. J. L. Rayces, "Least squares fitting of orthogonal polynomials to the wave aberration function", Applied Optics 31, 13, pp. 2223-2228 (1992).
15. J. L. Rayces and A. Hsieh, "Lens design with Glatzel's adaptive method and the Nijboer-Zernike aberration coefficients" (A), J. Opt. Soc. Am. 62, p. 711 (1970).
16. G. S. Frysinger, "Using Zernike polynomials", Optics and Photonics News 3, 12, p. 3 (1992).
17. F. Zernike, "The diffraction theory of aberrations", Optical Image Evaluation Symposium, NBS Circular 526, p. 1 (1954).
18. R. E. Hopkins and G. Spencer, "Creative thinking and computing machines in optical design", J. Opt. Soc. Am. 52, 2, pp. 172-176 (1962).