


L. ALLEN, Brighton, England

M. FRANCON, Paris, France
E. INGELSTAM, Stockholm, Sweden
K. KINOSITA, Tokyo, Japan
A. LOHMANN, Erlangen, Germany
M. MOVSESSIAN, Armenia, U.S.S.R.
G. SCHULZ, Berlin, D.D.R.
W. H. STEEL, Chippendale, N.S.W., Australia
W. T. WELFORD, London, England


University of Rochester, N.Y., U.S.A.


J. C. DAINTY, A. LABEYRIE


M. A. DUGUAY, G. SCHMAHL, D. RUDOLPH

P. J. VERNIER, P. J. B. CLARRICOATS




All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopy-
ing, recording or otherwise, without the prior permission of the Copyright owner.


NORTH-HOLLAND ISBN: 0 7204 1514 4
ELSEVIER NORTH-HOLLAND ISBN: 0 444 10914 5





NEW YORK, N.Y. 10017


CONTENTS OF VOLUME I (1961)
DIFFRACTION . . . . . . . . 67-108
IV. LIGHT AND INFORMATION, D. GABOR . . . . . . . . 109-153
INFORMATION, H. WOLTER . . . . . . . . 155-210
VI. INTERFERENCE COLOR, H. KUBOTA . . . . . . . . 211-251
VIII. MODERN ALIGNMENT DEVICES, A. C. S. VAN HEEL . . . . . . . . 289-329

CONTENTS OF VOLUME II (1963)
SPECTROSCOPY, G. W. STROKE . . . . . . . . 1-72
SPATIAL FREQUENCY FILTERING, J. TSUJIUCHI . . . . . . . . 131-180
V. FLUCTUATIONS OF LIGHT BEAMS, L. MANDEL . . . . . . . . 181-248

CONTENTS OF VOLUME III (1964)
APODISATION, AND B. ROIZEN-DOSSIER . . . . . . . . 29-186
COHERENCE, H. GAMO . . . . . . . . 187-332

CONTENTS OF VOLUME IV (1965)
I. HIGHER ORDER ABERRATION THEORY, J. FOCKE . . . . . . . . 1-36
AND P. BOUSQUET . . . . . . . . 145-197

CONTENTS OF VOLUME V (1966)
II. NON-LINEAR OPTICS, P. S. PERSHAN . . . . . . . . 83-144
III. TWO-BEAM INTERFEROMETRY, W. H. STEEL . . . . . . . . 145-197
MURATA . . . . . . . . 199-245
R. JACOBSSON . . . . . . . . 247-286
OPTICS, H. LIPSON AND C. A. TAYLOR . . . . . . . . 287-350

CONTENTS OF VOLUME VI (1967)
AND S. MALLICK . . . . . . . . 71-104
IV. DESIGN OF ZOOM LENSES, K. YAMAJI . . . . . . . . 105-170
STRONG AND A. W. SMITH . . . . . . . . 211-257
VII. FOURIER SPECTROSCOPY, G. A. VANASSE AND H. SAKAI . . . . . . . . 259-330
KOTTLER . . . . . . . . 331-377

CONTENTS OF VOLUME VII (1969)
G. KOPPELMAN . . . . . . . . 1-66
R. J. PEGIS . . . . . . . . 67-137
III. ECHOES AT OPTICAL FREQUENCIES, I. D. ABELLA . . . . . . . . 139-168
TER-MIKAELIAN . . . . . . . . 231-297
VI. THE PHOTOGRAPHIC IMAGE, S. OOUE . . . . . . . . 299-358

CONTENTS OF VOLUME VIII (1970)
I. SYNTHETIC-APERTURE OPTICS, J. W. GOODMAN . . . . . . . . 1-50
III. LIGHT BEATING SPECTROSCOPY, H. Z. CUMMINS AND H. L. SWINNEY . . . . . . . . 133-200
MICROSCOPY, T. YAMAMOTO . . . . . . . . 295-341
VII. VISION IN COMMUNICATION, L. LEVI . . . . . . . . 343-372

CONTENTS OF VOLUME IX (1971)
A. L. BLOOM . . . . . . . . 1-30
II. PICOSECOND LASER PULSES, A. J. DEMARIA . . . . . . . . 31-71
STROHBEHN . . . . . . . . 73-122
GINZBURG . . . . . . . . 235-280
WAVES, K. GNIADEK AND J. PETYKIEWICZ . . . . . . . . 281-310
BASED . . . . . . . . 311-407


II. THE USE OF IMAGE TUBES AS SHUTTERS, R. W. SMITH . . . . . . . . 41-87
D. L. DEXTER . . . . . . . . 165-228
VII. QUANTUM DETECTION THEORY, C. W. HELSTROM . . . . . . . . 289-369


H. YOSHINAGA . . . . . . . . 77-122
CREWE . . . . . . . . 223-246
VII. GRADIENT INDEX LENSES, E. W. MARCHAND . . . . . . . . 305-337


BEAMS, O. SVELTO . . . . . . . . 1-51
II. SELF-INDUCED TRANSPARENCY, R. E. SLUSHER . . . . . . . . 53-100
GRAHAM . . . . . . . . 233-286
VI. BEAM-FOIL SPECTROSCOPY, S. BASHKIN . . . . . . . . 287-344

CONTENTS OF VOLUME XIII (1976)


NONEQUILIBRIUM ENVIRONMENT, H. P. BALTES . . . . . . . . 1-25
HUMAN EYE, W. M. ROSENBLUM, J. L. CHRISTENSEN . . . . . . . . 69-91
V. K. TRIPATHI . . . . . . . . 169-265
VI. APLANATISM AND ISOPLANATISM, W. T. WELFORD . . . . . . . . 267-292

The present volume of PROGRESS IN OPTICS, just like its thirteen
predecessors, contains review articles covering recent researches in optics and
related subjects.
The first article, by J. C. Dainty, deals with the statistics of speckle
patterns, i.e., with the statistics of the random variation of intensity that is
produced when a highly coherent light beam is reflected or transmitted by
an optically rough surface. In view of the increasing utilization of laser
light it has become imperative to gain a good understanding of speckle
phenomena. The article presents the most important aspects of the under-
lying theory.
The second article, by A. Labeyrie, provides a review of modern high
resolution techniques employed in optical astronomy. It gives an account
of the methods that are being gradually developed for the purpose of
overcoming the limitations that the earth's atmosphere imposes on the
resolving power for astronomical observations in the optical region of the
electromagnetic spectrum. An introductory section, dealing briefly with the
history of the subject, is followed by an account of the main features of
atmospheric turbulence and of its effects on the degradation of optical
images. Various aspects of direct stellar interferometry are then discussed.
Classic stellar interferometry, originating in the pioneering investigations
of Fizeau, Michelson, Anderson and Pease, is reviewed and its modern
refinements and modifications are described. Other interesting new
developments, including Labeyrie's own important contributions in the area of
speckle interferometry, are then presented. The article also includes accounts
of the synthetic aperture technique, of intensity interferometry and of
heterodyne interferometry.
Rare-earth-activated materials are being more and more frequently
utilized in quantum electronic devices such as lasers, quantum counters and
infrared-to-visible upconvertors. The third article in this volume,
contributed by L. A. Riseberg and M. J. Weber, deals with relaxation
phenomena in rare-earth luminescence. It provides a survey of the energy levels
and the excitation and decay modes of the rare earths and also covers such
topics as relaxation by radiative decay, multiphonon processes and
ion-ion interactions. Examples of some applications are also given.
The fourth article, written by M. A. Duguay, discusses a useful recent
application of picosecond laser pulses, namely the development of ultrafast
shutters based on the optical Kerr effect. This device utilizes optically
induced birefringence to obtain gating times of the order of a few
picoseconds. After a discussion of gating in different substances and the factors
that limit the resolution, some applications of the ultrafast shutters are
described. Among them are their use in ultrahigh speed photography,
which has made it possible, for example, to photograph a light pulse in
flight, and the development of a technique for the sampling of ultrashort
optical signals by means of which molecular fluorescence signals can be
displayed on the picosecond time scale.
A relatively new technique for making diffraction gratings is described
in the fifth article, contributed by G. Schmahl and D. Rudolph. The
grating profile is provided by the intensity distribution of holographically
produced interference fringes that are stored on a glass blank coated with
a thin film of photoresist. After a presentation of the basic principles of
such holographic diffraction gratings, their production is described. The
properties of such gratings are then discussed and their performance is
compared with that of gratings of more conventional type.
The sixth article, by P. J. Vernier, is concerned with a basic question con-
cerning the photoelectric effect, namely the origin of photoelectrons. It has
long been known that generally only a very thin layer of an irradiated solid
gives rise to photoelectrons. A more accurate knowledge is, however,
required in connection with efforts to improve photocathodes for use in
photometry and for a precise interpretation of the results of electron
spectroscopy. In this article the theoretical foundations of this subject are
first discussed. A detailed review is then presented of researches on the
escape depth of photoelectrons, and experimental and theoretical results
are compared. Investigations on surface photoexcitations are also reviewed.
The last article in this volume, contributed by P. J. B. Clarricoats,
presents an account of theoretical researches on optical properties of fibre
waveguides. Since about the early 1950s, when glass fibres appear to have
been first seriously considered as optical elements, much research has been
conducted in this field. Today optical fibres promise to play an important
role in the field of telecommunications. In this article the basic properties
of optical fibre waveguides of various types are discussed and their relative
merits are brought out. The modal as well as the ray methods of treatment
are employed in analysing their properties.
This volume attests once again to the vigor and the breadth of current
research in optics.

Department of Physics and Astronomy EMIL WOLF

University of Rochester
Rochester, N.Y. 14627

July 1976


by J. C. DAINTY
1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. NORMAL SPECKLE PATTERNS . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 First order statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Second order statistics . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Statistics of the measured intensity. . . . . . . . . . . . . . . . . . . . 13
3. PARTIALLY COHERENT ILLUMINATION ..................... 18
3.1 Spatial coherence - Fraunhofer plane . . . . . . . . . . . . . . . . . . 18
3.2 Spatial coherence - image plane . . . . . . . . . . . . . . . . . . . . . 21
3.3 Temporal coherence . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1 Depolarising surfaces. . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2 A small number of scatterers . . . . . . . . . . . . . . . . . . . . . . 35
4.3 Slightly rough surfaces . . . . . . . . . . . . . . . . . . . . . . . . . 41
5. CONCLUDING REMARKS ........................... 44
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44


0 . INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
0.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
0.2 History. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
1. ATMOSPHERIC OPTICS . . . . . . . . . . . . . . . . . . . . . . . . . 51
1.1 The atmospheric heterogeneity . . . . . . . . . . . . . . . . . . . 51
1.2 Wave deformations and shadow patterns . . . . . . . . . . . . . . . 53
1.3 The speckled structure of images. . . . . . . . . . . . . . . . . . . 53
1.4 The MTF for short and long exposures . . . . . . . . . . . . . . . 58
2. DIRECT INTERFEROMETRY . . . . . . . . . . . . . . . . . . . . . . . . 59
2.1 Basic principles . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.2 Visibility modulus determination. . . . . . . . . . . . . . . . . . . 61
2.3 Quantum noise . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3. INTERFEROMETER DESIGNS AND RESULTS . . . . . . . . . . . . . . . . . . 64
3.1 The Fizeau and Michelson interferometers . . . . . . . . . . . . . . 64
3.2 Photoelectric Fizeau interferometers . . . . . . . . . . . . . . . . . 66
3.3 The speckle interferometer . . . . . . . . . . . . . . . . . . . . . 69
3.4 Interferometry with two telescopes. . . . . . . . . . . . . . . . . . 73
4. THE IMAGE RECONSTRUCTION PROBLEM . . . . . . . . . . . . . . . . . . . . 76
4.1 The visibility phase problem with direct interferometry . . . . . . . . . . 76
4.2 The triple interferometer . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3 The seeing compensation approach . . . . . . . . . . . . . . . . . . . . 78
6. INTENSITY INTERFEROMETRY . . . . . . . . . . . . . . . . . . . . . . . . . 82
7. HETERODYNE INTERFEROMETRY . . . . . . . . . . . . . . . . . . . . . . . . 84
8. CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85


(Waltham, Massachusetts) and M. J. WEBER (Livermore, California)
1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2. HISTORICAL DEVELOPMENTS . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.1 Spectra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.2 Relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3. RARE-EARTH ENERGY LEVELS . . . . . . . . . . . . . . . . . . . . . . . 98
4. EXCITATION AND DECAY IN RARE-EARTH SYSTEMS. . . . . . . . . . . . . . . 102
5. RADIATIVE DECAY .................... ......... 106
5.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.1.1 Electric-dipole transitions: Judd-Ofelt theory . . . . . . . . . . . . . 106
5.1.2 Magnetic-dipole and electric-quadrupole transitions . . . . . . . . . . 109
5.2 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6. MULTIPHONON RELAXATION . . . . . . . . . . . . . . . .. . . . . . . . . 116
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.3 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.3.1 Temperature dependence . . . . . . . . . . . . . . . . . . . . . 124
6.3.2 Energy gap dependence . . . . . . . . . . . . . . . . . . . . . . 127
6.3.3 Host dependence . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3.4 5d → 4f relaxation . . . . . . . . . . . . . . . . . . . . . . . . 132
7. COOPERATIVE RELAXATION. . . . . . . . . . . . . . . . . . . . . . . . . 133
7.1 Ion-ion energy transfer . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
7.1.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.1.3 Energy migration . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.2 Experimental results . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8. SELECTED APPLICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.1 Upconversion phosphors . . . . . . . . . . . . . . . . . . . . . . . . 151
8.2 Lasers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
9. CONCLUDING REMARKS . . . . . . . . . . . . . . . . . .. . . . . . . . . 155
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156


(Murray Hill, New Jersey)
1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
2. THE ULTRAFAST OPTICAL KERR SHUTTER . . . . . . . . . . . . . . . . . . . 165
2.1 Gating in CS₂ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
2.2 Gating in nitrobenzene . . . . . . . . . . . . . . . . . . . . . . . . . 169
2.3 Gating with subpicosecond pulses . . . . . . . . . . . . . . . . . . . . 170
2.4 Gating in glass . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
2.5 Time response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
2.6 Collinear gating . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
2.7 Self-focusing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
2.8 Transverse gating . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
3. ULTRAHIGH SPEED PHOTOGRAPHY . . . . . . . . . . . . . . . . . . . . . . 177
3.1 Light photographed in flight . . . . . . . . . . . . . . . . . . . . . . 177
3.2 Gated picture ranging . . . . . . . . . . . . . . . . . . . . . . . . . 180
3.3 Ultrahigh speed framing photography . . . . . . . . . . . . . . . . . . 182
4. SAMPLING OPTICAL SIGNALS . . . . . . . . . . . . . . . . . . . . . . . . 183
4.1 Fluorescence lifetime measurements . . . . . . . . . . . . . . . . . . . 183
4.2 The echelon technique . . . . . . . . . . . . . . . . . . . . . . . . . 185
4.3 The optical sampling oscilloscope (OSO) . . . . . . . . . . . . . . . . . 186
4.4 Multichannel sampling with detector arrays . . . . . . . . . . . . . . . . 188
5. CONCLUDING REMARKS ........................... 191
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
REFERENCES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192


and D. RUDOLPH (Göttingen)
1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
3.1 Interference fringe system . . . . . . . . . . . . . . . . . . . . . . . . 201
3.1.1 Accuracy of the interference fringe system . . . . . . . . . . . . . . 201
3.1.2 Interference arrangements . . . . . . . . . . . . . . . . . . . . . 207
3.1.3 Improvement of the ruling accuracy by superposition of identical
reconstructed wavefronts . . . . . . . . . . . . . . . . . . . . . . . . 207
3.1.4 Frequency and wavelength stability of the laser light . . . . . . . . . 213
3.2 Photoresist layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
4. PRODUCTION OF HOLOGRAPHIC GRATINGS . . . . . . . . . . . . . . . . . . . 216
4.1 Gratings with symmetrical groove profiles . . . . . . . . . . . . . . . . 217
4.2 Gratings with asymmetrical groove profiles . . . . . . . . . . . . . . . . 219
5.1 Wavefront interferogram, resolution and instrumental profile of plane gratings 224
5.2 Scattered light . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
5.3 Efficiency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
5.4 X-ray gratings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
5.5 Gratings with imaging properties . . . . . . . . . . . . . . . . . . . . 235
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242

1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
1.1 The 3-step model . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
2. THEORETICAL BASIS OF THE PE . . . . . . . . . . . . . . . . . . . . . . . 250
2.1 Calculation of the density of absorbed photons (DAP) from the bulk dielectric
constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
2.2 Dielectric constant and microscopic processes in a solid . . . . . . . . . . . 252
2.2.1 Collective motion of the electrons . . . . . . . . . . . . . . . . . . 253
2.2.2 Collective motion of the ions . . . . . . . . . . . . . . . . . . . . 253
2.2.3 One-electron excitations (direct transitions) . . . . . . . . . . . . . 253
2.2.4 One-electron excitations (non-direct transitions) . . . . . . . . . . . 255
2.3 Photoexcitation coefficient . . . . . . . . . . . . . . . . . . . . . . . 255
2.4 Fresnel equations and the DAP . . . . . . . . . . . . . . . . . . . . . 256
2.4.1 Validity of the Fresnel equations and spatial dispersion . . . . . . . . 258
2.4.2 Validity of the Fresnel equations, surface roughness and plasma oscillations 260
2.5 Surface photoexcitation . . . . . . . . . . . . . . . . . . . . . . . . . 261
2.5.1 Surface photoexcitation from bulk states (SPBS) . . . . . . . . . . . 261
2.5.2 Photoexcitation from surface states . . . . . . . . . . . . . . . . . 262
2.5.3 Surface absorption, Fresnel equations and DAP . . . . . . . . . . . 263
2.6 The electron escape probability . . . . . . . . . . . . . . . . . . . . . 263
2.6.1 Coulomb repulsion between electrons . . . . . . . . . . . . . . . . 264
2.6.2 Phonon scattering . . . . . . . . . . . . . . . . . . . . . . . . 267
2.6.3 Electron-hole recombination . . . . . . . . . . . . . . . . . . . . 268
2.6.4 Transmission by the surface . . . . . . . . . . . . . . . . . . . . 268
2.7 Theoretical determination of the escape probability . . . . . . . . . . . . 270
2.7.1 Electron-electron interaction and the ballistic approximation . . . . . . 270
2.7.2 Diffusion equation and electron-hole recombination in negative electron
affinity (NEA) photocathodes
2.7.3 De-excitation of photoelectrons by phonon scattering only . . . . . . . 273
2.7.4 De-excitation of photoelectrons by both phonon and electron-electron
interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
2.7.5 The escape probability and analysis of the experimental data . . . . . . 275
2.8 Photoemission and many-body effects . . . . . . . . . . . . . . . . . . 275
2.9 One-step theories of photoemission . . . . . . . . . . . . . . . . . . . 277
3.1 Estimation of the escape depth from one photoyield . . . . . . . . . . . 280
3.2 Estimation of the escape depth from the variation of the ratio of front Y + to back
Y - yield versus the thickness zo of thin films . . . . . . . . . . . . . . 281
3.3 Estimation of the escape depth from the variation of the photoyield of thin films
with thickness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
3.4 Estimation of the escape depth from the back and front photoyields of one thin
film . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
3.5 Estimation of the escape depth from the variation of the photoyield with the
angle of incidence. . . . . . . . . . . . . . . . . . . . . . . . . . . 288
3.6 Estimation of the escape depth from the PE of substrate through a coating layer 290
3.7 Non-photoelectric methods . . . . . . . . . . . . . . . . . . . . . . 292
3.8 Escape depth in negative electron affinity (NEA) photocathodes . . . . . . 295
3.9 Estimation of the escape depth from band bending considerations . . . . . 299
3.10 Estimation of the elastic escape depth for high energy electrons . . . . . . . 302
4 . SURFACE PHOTOEXCITATION . . . . . . . . . . . . . . . . . . . . . . . . . 305
4.1 Detection of a surface effect from the thickness dependence of the photoyield
of thin films . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
4.2 Detection of a surface effect from the polarization dependence of the photoyield 307
4.3 Evidence for a surface photoemission from the spectral yield distribution . 312
4.4 Evidence for the photoemission from surface states obtained from photoelectron
energy distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 314
5 . CONCLUSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
ACKNOWLEDGEMENT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321


1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
2.2 Characteristic equation . . . . . . . . . . . . . . . . . . . . . . . . 333
2.3 Approximate solutions of the characteristic equation . . . . . . . . . . . 334
2.4 Ray interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . 338
2.5 Validity of core and cladding mode approximations . . . . . . . . . . . . 339
2.6 Group delay and pulse dispersion in the absence of mode coupling . . . . . 342
2.7 Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
2.8 Power flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
2.9 Attenuation due to a lossy layer . . . . . . . . . . . . . . . . . . . . 354
2.10 Leaky modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
2.11 Attenuation due to bends . . . . . . . . . . . . . . . . . . . . . . . 363
2.12 Mode coupling due to perturbations . . . . . . . . . . . . . . . . . . 366
2.13 Coupling due to bends . . . . . . . . . . . . . . . . . . . . . . . . 366
2.14 Mode coupling and pulse dispersion. . . . . . . . . . . . . . . . . . . 369
2.15 Reduction of pulse dispersion by intentional inhomogeneities . . . . . . . . 374
2.16 Excitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
2.16.1 Incoherent . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
2.16.2 Coherent . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
3. FIBRES WITH NON-UNIFORM REFRACTIVE INDEX. . . . . . . . . . . . . . . . 381
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
3.2 The W-fibre . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
3.3 Graded index fibres . . . . . . . . . . . . . . . . . . . . . . . . . . 383
3.3.1 General remarks . . . . . . . . . . . . . . . . . . . . . . . . . 383
3.3.2 Fibres with parabolic index variation . . . . . . . . . . . . . . . . 384
3.3.3 Impulse response of graded index fibres . . . . . . . . . . . . . . 389
3.4 Fibres with ring-shaped refractive index profiles . . . . . . . . . . . . . 396
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400

AUTHOR INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403

SUBJECT INDEX . . . . . . . . . . . . . . . . . . . . . . . . . . 412

CUMULATIVE INDEX . VOLUMES I-XIV . . . . . . . . . . . . . . . . . 420




Queen Elizabeth College,
Campden Hill Road, London W8 7AH, U.K.


§ 1. INTRODUCTION . . . . . . . . . . . . . . . . . . . 3
§ 2. NORMAL SPECKLE PATTERNS . . . . . . . . . . . 5
PATTERNS . . . . . . . . . . . . . . . . . . . . . . 33
§ 5. CONCLUDING REMARKS . . . . . . . . . . . . . . 44
ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . 44
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . 44

§ 1. Introduction

Light with a fair degree of spatial and temporal coherence incident on an
optically rough surface produces a reflected or transmitted beam that has
a random spatial variation of intensity. This intensity distribution is called
a speckle pattern. Fig. 1 shows the speckle pattern obtained at a distance
of 1 m from a 1 mm diameter area of ground glass illuminated by a He-Ne
laser. In this case the speckle pattern is of high contrast, and has a
characteristic scale, or speckle size, approximately equal to the diameter of the
Airy disc that would be produced in the absence of the ground glass (i.e.
approximately 1 mm).
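
The order of magnitude quoted above can be checked with a short calculation. The sketch below is an illustration only: it assumes the common red He-Ne wavelength of 632.8 nm (the text does not state the wavelength) and uses the standard far-field expression 2.44 λz/D for the diameter of the first dark ring of the Airy pattern of a uniformly illuminated circular aperture of diameter D observed at distance z. The function name is hypothetical.

```python
def airy_disc_diameter(wavelength_m, distance_m, aperture_diameter_m):
    """Diameter of the first dark ring of the Airy pattern (far field)."""
    return 2.44 * wavelength_m * distance_m / aperture_diameter_m


# Assumed value: the text only says "He-Ne laser"; 632.8 nm is the usual red line.
HE_NE_WAVELENGTH_M = 632.8e-9

# Geometry taken from the text: 1 mm illuminated area observed at 1 m.
diameter_m = airy_disc_diameter(HE_NE_WAVELENGTH_M, 1.0, 1.0e-3)
print(f"Airy disc diameter: {diameter_m * 1e3:.2f} mm")
```

Under these assumptions the result is of the order of a millimetre, in line with the speckle size stated in the text.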

Fig. 1. Speckle pattern produced in the Fraunhofer plane of an optically rough diffuser
illuminated by a He-Ne laser.

The statistical properties of a speckle pattern depend, in general, on both
the coherence of the incident light and the statistics of the scattering surface
or medium. In the laboratory, speckle patterns are usually produced by
highly coherent light incident on relatively large areas of optically very
rough surfaces and this case is in fact an exception to the general rule; the
statistics of such speckle patterns do not depend on the detailed surface
properties and we shall refer to these as normal speckle patterns. The
statistics of such spatial patterns are very closely related to those of the
temporal fluctuations of thermal (Gaussian) light sources (JAKEMAN [1974]).
The general dependence of speckle statistics on the coherence of the incident
light and the nature of the scatterer has led to several applications in the
measurement of coherence and scattering parameters.
One of the first recorded observations of a speckle pattern was by EXNER
[1877, 1880] who sketched the form of the pattern produced by candlelight
incident on a glass plate on which he had breathed. The
non-monochromaticity of this light source caused the pattern to have a radially fibrous
structure and this feature was extensively discussed in the early literature (VON
LAUE [1914, 1916, 1917], DE HAAS [1918a, b], BUCHWALD [1919], RAMAN
[1919], RAMACHANDRAN [1954]). The mathematical basis for much of the
analysis of the statistics of speckle patterns was established by Lord
RAYLEIGH [1880, 1918, 1919]. The detailed first and second order statistics of
normal speckle patterns formed in the Fraunhofer plane were fully evaluated
by VON LAUE [1914, 1916]; in particular he calculated general expressions
for the second order probability density function of the intensity and the
autocorrelation function of the intensity. Early work on speckle patterns
is reviewed by HARIHARAN [1972].
Following the invention of the laser the phenomenon of speckle was
re-discovered and a large number of short papers describing various simple
properties were published. These are included in a bibliography on speckle
compiled by SINGH [1972].
In this article we shall concentrate our attention on the first and second
order statistics of speckle patterns. § 2 is concerned with normal speckle
patterns formed in perfectly coherent light. In this section, patterns formed
in the image plane and in the Fraunhofer plane of a diffuser are considered
separately, although of course a general theory could be used to cover both
cases; with the appropriate assumptions in each case, the statistics are in
fact identical. The statistics in the near-field are not evaluated as they are
essentially the same as those in the far-field except for regions very close
to the diffuser (ELIASSON and MOTTIER [1971]). The effect of partially
coherent illumination on speckle statistics is discussed in § 3, and surface
dependent aspects are considered in § 4.
Random intensity patterns produced by volume scatterers such as the
atmosphere are not explicitly included in this article. The question of
propagation through scattering media is very much more complex than
the relatively simple cases described here and has only partially been
solved (CHERNOV [1960], TATARSKI [1961], STROHBEHN [1971]). Finally,
many of the details of the scattering process and the nature of the scattering
surface are excluded from the discussion below and are described by
BECKMANN and SPIZZICHINO [1963] and BECKMANN [1967].

§ 2. Normal Speckle Patterns


We shall consider first a speckle pattern formed in coherent light in the
Fraunhofer plane, as shown in Fig. 2.

Fig. 2. Formation of a speckle pattern in the Fraunhofer plane.

The effective complex amplitude of the scattered light in the scattering
plane may be written as

$$A(\xi,\eta) = \sum_{j=1}^{N} a_j \exp(i\phi_j)\,\delta(\xi-\xi_j)\,\delta(\eta-\eta_j),$$

where $N$ is the number of independent scatterers,
$a_j$ is the modulus of the scattered wave due to the $j$th scatterer,
$\phi_j$ is the phase of the scattered wave,
$(\xi_j,\eta_j)$ is the position of the $j$th scatterer,
and $\delta(\xi)\delta(\eta)$ is the two-dimensional Dirac delta function.
The complex amplitude $A(\xi,\eta)$ is a random process with the following
assumptions made (GOODMAN [1963, 1975c]):
i) the scatterers are randomly distributed over the area of the diffuser with
uniform probability,
ii) the $a_j$ are statistically independent random variables,
iii) the $\phi_j$ are statistically independent random variables, are uniformly
distributed in the interval $-\pi$ to $\pi$ (rough surface approximation) and
are independent of the $a_j$,
iv) the polarisation of the incident wave is unaltered.
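
The consequences of assumptions (ii) and (iii) can be explored numerically. The following is a minimal sketch, not the author's method: it draws many independent realisations of a sum of random phasors, with phases uniform on $(-\pi, \pi)$ and moduli drawn from an arbitrarily chosen uniform distribution (the text does not specify one), and estimates the statistics of the resulting field at a single point. The function names, the number of scatterers, the $1/\sqrt{N}$ normalisation and the seed are all illustrative assumptions.

```python
import cmath
import math
import random
import statistics


def random_phasor_sum(n_scatterers, rng):
    """One realisation of sum_j a_j * exp(i * phi_j), scaled by 1/sqrt(N)."""
    total = 0j
    for _ in range(n_scatterers):
        a = rng.uniform(0.5, 1.5)             # modulus a_j; this distribution is an assumption
        phi = rng.uniform(-math.pi, math.pi)  # phase uniform on (-pi, pi), independent of a_j
        total += a * cmath.exp(1j * phi)
    return total / math.sqrt(n_scatterers)    # scaling keeps the variance independent of N


def speckle_statistics(n_scatterers=50, n_samples=4000, seed=0):
    """Estimate single-point statistics of the field over many realisations."""
    rng = random.Random(seed)
    fields = [random_phasor_sum(n_scatterers, rng) for _ in range(n_samples)]
    re = [f.real for f in fields]
    im = [f.imag for f in fields]
    intensity = [abs(f) ** 2 for f in fields]

    mean_re, mean_im = statistics.fmean(re), statistics.fmean(im)
    var_re, var_im = statistics.pvariance(re), statistics.pvariance(im)
    cov = statistics.fmean((r - mean_re) * (i - mean_im) for r, i in zip(re, im))
    return {
        "mean_re": mean_re,
        "mean_im": mean_im,
        "var_re": var_re,
        "var_im": var_im,
        "correlation": cov / math.sqrt(var_re * var_im),
        # Contrast = standard deviation of intensity divided by its mean.
        "contrast": statistics.pstdev(intensity) / statistics.fmean(intensity),
    }


if __name__ == "__main__":
    for name, value in speckle_statistics().items():
        print(f"{name}: {value:.3f}")
```

With a large number of scatterers one would expect the real and imaginary parts to have near-zero means, nearly equal variances and negligible correlation, and the intensity contrast to be close to unity, which is the high-contrast behaviour described for normal speckle patterns.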
It should be noted that these assumptions apply to the effect of the
scattering medium on the incident field thus circumventing a detailed con-
sideration of the interaction of the field and scatterer. In the so-called
Beckmann model (BECKMANN and SPIZZICHINO [1963]) the statistical
properties of the scattering surface are specified and the field immediately
after the scatterer is deduced making appropriate assumptions about the
interaction. This more general approach however can only be pursued in
detail for scatterers with a Gaussian distribution of surface heights and
does not indicate the conditions under which we might expect surface-
independent (normal) speckle patterns. It can be shown that for a Gaussian
distribution of surface heights with an autocorrelation function whose
scale width is very much less than the overall dimensions of the scatterer
the Beckmann model gives the same result as the Goodman model for the
statistics of the speckle intensity fluctuation; in addition it also predicts
the shape of the envelope of the average intensity in the Fraunhofer plane.
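Making the usual far field assumptions, the complex amplitude in the
observing plane $A(x,y)$ may be written as

$$A(x,y) = \iint_{-\infty}^{\infty} A(\xi,\eta)\,\exp\left\{-\frac{2\pi i}{\lambda D}(x\xi + y\eta)\right\} d\xi\,d\eta,$$

where the unimportant phase factor has been ignored. Substituting for
$A(\xi,\eta)$ and evaluating the Fourier transform we obtain

$$A(x,y) = \sum_{j=1}^{N} a_j \exp(i\phi_j)\,\exp\left\{-\frac{2\pi i}{\lambda D}(x\xi_j + y\eta_j)\right\}. \qquad (2)$$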

It is clear from this expression that the complex amplitude in the observing
plane is given by the sum of a large number of random phase and amplitude
vectors. As a result of the central limit theorem (see for example CHANDRA-
SEKHAR [1943], MIDDLETON [1960]) the random process A(x,y) tends to
a complex Gaussian process. The real and imaginary parts of the field are
identically distributed with zero mean and variance $\sigma^2/2$ and at any single
point they are statistically independent. The joint probability density
function for the real and imaginary parts of the field, denoted by $A_R$ and
$A_I$ respectively, is given by

$$p(A_R, A_I) = \frac{1}{\pi\sigma^2}\exp\left\{-\frac{A_R^2 + A_I^2}{\sigma^2}\right\}. \qquad (3)$$

Provided that the phase of the scattered field at the scatterer is uniform
in the interval $-\pi$ to $\pi$ and that the number $N$ of scatterers is very large,
the complex amplitude of the speckle pattern will be a complex Gaussian
process regardless of whether the scatterers have a uniform or random
modulus. The convergence of the process to a Gaussian form will in general
depend on the statistics of the $a_j$ and this is discussed further in § 4.2. In
many practical situations the number of scatterers is so large that problems
of convergence are not important.
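If the real and imaginary parts of the field have a joint Gaussian distribution,
then it follows using a probability transformation that the modulus
has a Rayleigh distribution, the intensity has a negative exponential distribution
and the phase is uniformly distributed in the interval $-\pi$ to $\pi$:

$$p(|A|) = \frac{2|A|}{\sigma^2}\exp\left\{-\frac{|A|^2}{\sigma^2}\right\}, \quad |A| \ge 0; \qquad p(|A|) = 0, \quad |A| < 0, \qquad (4)$$

$$p(I) = \frac{1}{\langle I\rangle}\exp\left\{-\frac{I}{\langle I\rangle}\right\}, \quad I \ge 0; \qquad p(I) = 0, \quad I < 0, \qquad (5)$$

$$p(\phi) = \frac{1}{2\pi}, \quad -\pi \le \phi < \pi, \qquad (6)$$

where $I$ is the intensity,
$\phi$ is the phase,
and $\langle I\rangle = \sigma^2$ is the ensemble average (mean) intensity.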
The probability density function for intensity (eq. (5)) is usually the most
relevant distribution in practice; subject to the constraints on intensity of
a finite mean and positivity, the negative exponential distribution has
maximum entropy and indicates that a normal speckle pattern is totally
random. Experimentally the probability density function for intensity $p(I)$
is found to be negative exponential to a high degree of accuracy (MCKECHNIE
[1974b]), as shown in Fig. 3.
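
The first order statistics described above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it sums a finite number of random phasors with phases uniform between $-\pi$ and $\pi$ and examines the resulting intensity. The function name, the choice of moduli (uniform between 0.5 and 1.5), the number of scatterers and the sample size are all illustrative assumptions.

```python
import numpy as np

def random_phasor_field(n_scatterers=100, n_points=20_000, seed=0):
    """Complex amplitude at n_points independent observation points,
    each formed by summing n_scatterers random phasors."""
    rng = np.random.default_rng(seed)
    # Assumption: moduli a_j uniform on [0.5, 1.5]; the limiting statistics
    # should not depend on this choice.
    moduli = rng.uniform(0.5, 1.5, size=(n_points, n_scatterers))
    # Phases uniform on (-pi, pi) and independent of the moduli.
    phases = rng.uniform(-np.pi, np.pi, size=(n_points, n_scatterers))
    return (moduli * np.exp(1j * phases)).sum(axis=1) / np.sqrt(n_scatterers)

field = random_phasor_field()
intensity = np.abs(field) ** 2

# For a negative exponential density the contrast (std/mean) is 1 and
# the probability of exceeding the mean intensity is exp(-1).
contrast = intensity.std() / intensity.mean()
frac_above_mean = np.mean(intensity > intensity.mean())
print(f"contrast = {contrast:.3f}, P(I > <I>) = {frac_above_mean:.3f}")
```

With a finite number of scatterers the contrast is expected to fall slightly below unity, which is one way of seeing the convergence question raised above.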

Fig. 3. A measured histogram based on some 23000 intensity measurements taken from a
speckle pattern. Because the sampling aperture was small in relation to the speckle size,
the histogram should have the same form as the negative exponential reference curve which is
shown (MCKECHNIE [1974b]).

The first order statistics of speckle patterns formed in the image plane
of a rough diffuser illuminated by coherent light are the same as those in the
Fraunhofer plane, provided that a large number of scatterers lie within the
area of the point spread function of the imaging system in object space.
This condition is likely to be satisfied in many practical situations, but will
not hold for high resolution optical systems. The area of the point spread
function of an aberration-free lens is approximately $\lambda^2/\mathrm{NA}^2$, where NA is
the numerical aperture, whereas the minimum possible phase decorrelation
area of any diffuser is approximately $\lambda^2$; the maximum number of scatterers
contributing to a particular image point is therefore approximately equal
to $1/\mathrm{NA}^2$. This maximum number may be further reduced depending upon
the actual phase decorrelation area of the diffuser used in practice. We
cannot expect the statistics of the intensity to follow the negative exponential
form for optical systems with numerical apertures greater than approximately
0.1. The resulting statistics for small numbers of scatterers and
other surface dependent features are discussed in § 4.
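
The order-of-magnitude estimate of the number of contributing scatterers is simple arithmetic, and a short sketch makes the scaling with numerical aperture explicit. The function name is hypothetical and the result is only the upper bound quoted in the text (the ratio of the point spread function area to the minimum phase decorrelation area), not a measured quantity.

```python
def max_scatterers_per_image_point(numerical_aperture):
    """Upper bound (lambda^2 / NA^2) / lambda^2 = 1 / NA^2 on the number of
    independent scatterers contributing to one image point."""
    return 1.0 / numerical_aperture ** 2

for na in (0.01, 0.1, 0.5):
    bound = max_scatterers_per_image_point(na)
    print(f"NA = {na}: at most about {round(bound)} scatterers")
```

At a numerical aperture of 0.1 the bound is about one hundred scatterers, and it falls to only a few at high numerical apertures, which is why Gaussian statistics cannot be assumed there.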


We again consider separately speckle patterns formed in the Fraunhofer
plane and in the image plane. The second order statistics of the scattered
field describe, in general terms, the spatial structure of the field. For speckle
patterns formed in the Fraunhofer plane, as in Fig. 2, it is intuitively obvious
that the dimension of the finest structure in the scattered field is inversely
related to the effective diameter of the scatterer (just as the dimension of
the Airy disc is related to the diameter of the diffracting aperture).
The most general second order statistic of the intensity is the second
order probability function, $p(I_1, I_2)$, which was first derived by VON LAUE
[1916]. If $I_1$ and $I_2$ are the intensities at two points $(x_1, y_1)$ and $(x_2, y_2)$ in
the Fraunhofer plane, then

where $I_0$ is the modified Bessel function of zero order and $C_{12}$ is the
modulus of the autocorrelation function of the complex amplitude and
is given by

where $S(\xi,\eta)$ is the intensity distribution at the scattering plane. In the
derivation of (7) and (8), we assume that a large number of randomly
phased scatterers lie within the scattering aperture. It can be seen that
$p(I_1, I_2)$ is completely defined in terms of the autocorrelation function $C_A$;
it is a property of Gaussian processes that all probability density functions
are completely specified by the autocorrelation function. Thus in practice
the only second order statistic we need to evaluate is the autocorrelation
function or its Fourier transform, the Wiener spectrum.
The autocorrelation function of the intensity in the Fraunhofer plane of
a rough diffuser has been derived by many authors (VON LAUE [1916],
GOODMAN [1963], GOLDFISCHER [1965], SUZUKI and HIOKI [1966], ROSS
[1970] and YAMAGUCHI [1972, 1973]); here we follow Goodman's
approach. The autocorrelation function of the complex amplitude $C_A(x_1,y_1)$
is defined as

$$C_A(x_1,y_1) = \langle A(x+x_1,\,y+y_1)\,A^*(x,y)\rangle, \qquad (9)$$

where $\langle\ \rangle$ denotes the ensemble average. (For a statistically stationary
speckle pattern the ensemble average may be replaced by a space average.)
The complex amplitude in the Fraunhofer plane of a rough scatterer is
given by eq. (2), and combination of equations (2) and (9) leads to

$$C_A(x_1,y_1) = \sum_{j=1}^{N} \langle |a_j|^2\rangle \exp\left\{-\frac{2\pi i}{\lambda D}(x_1\xi_j + y_1\eta_j)\right\},$$

where the various cross-products reduce to zero upon taking the average.
If the scatterers are considered to be packed sufficiently closely on the
scattering area, then the scattering intensity $|a_j|^2$ can be represented by a
continuous function $S(\xi,\eta)$ and we may write

$$C_A(x_1,y_1) = \iint_{-\infty}^{\infty} S(\xi,\eta)\exp\left\{-\frac{2\pi i}{\lambda D}(x_1\xi + y_1\eta)\right\} d\xi\,d\eta. \qquad (10)$$

The intensity function $S(\xi,\eta)$ is proportional to the incident intensity
distribution over the scattering area and zero elsewhere, so that once this
distribution is known the autocorrelation function of the complex amplitude
of the speckle pattern is also known apart from a multiplicative constant.
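The autocorrelation function of the intensity fluctuation is defined as

$$C_I(x_1,y_1) = \langle I(x+x_1,\,y+y_1)\,I(x,y)\rangle - \langle I\rangle^2. \qquad (11)$$

The first term on the right-hand side of eq. (11) can be expressed in terms
of $C_A(x_1,y_1)$ using a moment theorem for complex Gaussian processes due
to REED [1962]:

$$\langle z_1^* z_2^* z_3 z_4\rangle = \langle z_1^* z_3\rangle\langle z_2^* z_4\rangle + \langle z_2^* z_3\rangle\langle z_1^* z_4\rangle. \qquad (12)$$

The result is

$$C_I(x_1,y_1) = |C_A(x_1,y_1)|^2 \qquad (13)$$

and therefore the autocorrelation function of the intensity fluctuation in a
normal speckle pattern is given by

$$C_I(x_1,y_1) = \left|\iint_{-\infty}^{\infty} S(\xi,\eta)\exp\left\{-\frac{2\pi i}{\lambda D}(x_1\xi + y_1\eta)\right\} d\xi\,d\eta\right|^2, \qquad (14)$$

where it is assumed that $S(\xi,\eta)$ is suitably normalised.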
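
The relation between the intensity and amplitude autocorrelations can be tested numerically at a single pair of points. The sketch below is an illustration under stated assumptions, not the derivation used in the text: it draws two correlated circular complex Gaussian variables with a prescribed correlation coefficient `mu` (a hypothetical parameter standing for the normalised amplitude correlation between the two points) and compares the covariance of their intensities with the squared modulus of the amplitude correlation.

```python
import numpy as np

def intensity_covariance(mu, n_samples=200_000, seed=0):
    """Return (amplitude correlation, intensity covariance) estimated from
    two unit-variance circular complex Gaussian variables with <z2 z1*> = mu."""
    rng = np.random.default_rng(seed)

    def unit_complex_gaussian():
        # Real and imaginary parts independent, each of variance 1/2.
        return (rng.standard_normal(n_samples)
                + 1j * rng.standard_normal(n_samples)) / np.sqrt(2)

    z1 = unit_complex_gaussian()
    z2 = mu * z1 + np.sqrt(1 - abs(mu) ** 2) * unit_complex_gaussian()

    c_a = np.mean(z2 * np.conj(z1))                      # amplitude correlation
    i1, i2 = np.abs(z1) ** 2, np.abs(z2) ** 2
    c_i = np.mean(i1 * i2) - np.mean(i1) * np.mean(i2)   # intensity covariance
    return c_a, c_i

c_a, c_i = intensity_covariance(mu=0.6)
print(f"|C_A|^2 = {abs(c_a) ** 2:.3f}, C_I = {c_i:.3f}")
```

The two printed values should agree to within sampling error, as the moment theorem for complex Gaussian processes requires.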

The Wiener spectrum (or power spectrum) $W(u,v)$ of the intensity
fluctuation in the Fraunhofer plane, which in general terms describes the
spatial frequency content of the scattered intensity, is equal to the Fourier
transform of the autocorrelation function:

$$W(u,v) \propto \iint_{-\infty}^{\infty} S(\xi,\eta)\,S(\xi+\lambda D u,\,\eta+\lambda D v)\,d\xi\,d\eta, \qquad (15)$$

since $u = \xi/\lambda D$ and $v = \eta/\lambda D$. Thus the Wiener spectrum of the intensity
fluctuation is simply equal to the autocorrelation function of the intensity
distribution across the scattering aperture suitably scaled and normalised;
it should be noted that

$$\iint_{-\infty}^{\infty} W(u,v)\,du\,dv = C_I(0,0) = \sigma_I^2 = \langle I\rangle^2,$$

where $\sigma_I^2$ is the variance of the intensity.

Since both the autocorrelation function and Wiener spectrum depend only
on the intensity distribution across the scattering aperture, the above
analysis applies also to planes other than the Fraunhofer plane, provided
that a large number of scatterers contribute to the intensity at any point
in the plane.
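
The statement that the Wiener spectrum is the autocorrelation of the intensity distribution across the scattering aperture can be illustrated with a one-dimensional discrete simulation. This is a sketch under simplifying assumptions (a uniformly illuminated slit of `width` samples, unit moduli, a discrete Fourier transform standing in for Fraunhofer propagation); the function names and array sizes are hypothetical.

```python
import numpy as np

def simulated_wiener_spectrum(n=256, width=32, trials=2000, seed=1):
    """Average power spectrum of the intensity fluctuation of 1-D speckle
    produced by a slit of `width` samples with random phases."""
    rng = np.random.default_rng(seed)
    fields = np.zeros((trials, n), dtype=complex)
    fields[:, :width] = np.exp(2j * np.pi * rng.random((trials, width)))
    # Far-field intensity of each realisation.
    intensity = np.abs(np.fft.fft(fields, axis=1)) ** 2
    fluctuation = intensity - intensity.mean(axis=1, keepdims=True)
    # Squared modulus of the transform of the fluctuation, averaged over trials.
    spectra = np.abs(np.fft.ifft(fluctuation, axis=1)) ** 2
    return spectra.mean(axis=0)

def aperture_autocorrelation(n=256, width=32):
    """Circular autocorrelation of the slit intensity distribution."""
    s = np.zeros(n)
    s[:width] = 1.0
    return np.real(np.fft.ifft(np.abs(np.fft.fft(s)) ** 2))

w = simulated_wiener_spectrum()
ref = aperture_autocorrelation()
# Index 0 is excluded because the mean intensity has been subtracted.
print("max deviation / peak:", np.max(np.abs(w[1:] - ref[1:])) / ref[0])
```

For a slit the reference curve is a triangle, the one-dimensional counterpart of the circular-aperture result that follows.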
For a uniformly illuminated circular scattering area of radius r, the
autocorrelation function and Wiener spectrum of the intensity fluctuation
of the speckle pattern are given by


where $J_1(x)$ is the first order Bessel function and $w_0$ is the cut-off spatial
frequency given by $w_0 = 2r/\lambda D$.
Several thousand values of the speckle pattern intensity are required to
verify the above results accurately, and the experimental measurements of
HOHN [1968] and DAINTY [1970] only approximately verified the theoretical
results. However, recent experimental results by MCKECHNIE [1974a],
shown in Fig. 4, indicate that the expressions given above are verified to a
high degree of accuracy. The results of Fig. 4 also show that the roughness
of the scatterer does not influence the Wiener spectrum (or autocorrelation
function), provided of course that the illumination is perfectly coherent,
that the phase fluctuation at the scatterer is uniform in the interval $-\pi$
to $\pi$ and that a large number of scatterers contribute to the intensity at
any point in the Fraunhofer plane.

Fig. 4. Measured Wiener spectra of the intensity of speckle patterns observed in the Fraunhofer
plane of two optically rough diffusers. The solid line is the form predicted using eq. (17)
(MCKECHNIE [1974a]).

The autocorrelation function and Wiener spectrum of speckle patterns
produced in the image plane of rough diffusers may be found either by
using an extension of the Fraunhofer case (ENLOE [1967]) or by straightforward
application of linear filter and square law detection theory
(BURCKHARDT [1970]). The latter approach is described here.
Let the optical system be represented by a linear filter whose isoplanatic
amplitude point spread function is $P(x,y)$ and transfer function is $T(u,v)$;
the transfer function is equal to the pupil function $H(\xi,\eta)$ suitably scaled,

$$T(u,v) = H(\lambda f u,\,\lambda f v), \qquad (18)$$

where $f$ is the focal length. If the complex amplitude distribution of an
object is $A(x_1,y_1)$, then the complex amplitude distribution in the image
$A'(x,y)$ is given by

$$A'(x,y) = \iint_{-\infty}^{\infty} A(x_1,y_1)\,P(x-x_1,\,y-y_1)\,dx_1\,dy_1.$$

Suppose that the object distribution $A(x_1,y_1)$ is a stationary statistical
process with a Wiener spectrum $W_A(u,v)$. The spectrum of the output
amplitude of the linear filter is given by a well-known result (MIDDLETON [1960]),

$$W'_A(u,v) = |T(u,v)|^2\,W_A(u,v). \qquad (19)$$

The Wiener spectrum of the intensity in the speckle pattern $W(u,v)$ is
found from the Wiener spectrum of the complex amplitude $W'_A(u,v)$ by
applying a result derived by RICE [1954] in his analysis of the square law
detection problem:

$$W(u,v) = \iint_{-\infty}^{\infty} W'_A(u_1,v_1)\,W'_A(u_1+u,\,v_1+v)\,du_1\,dv_1. \qquad (20)$$


If the diffuser structure is very much finer than the diameter of the point
spread function (this condition is also necessary for Gaussian statistics for
the image amplitude), the Wiener spectrum of the object amplitude $W_A(u,v)$
will be constant for the values of $(u,v)$ for which $|T(u,v)|^2$ is significantly
non-zero, and combination of equations (18) to (20) gives, for this special
case of a white noise object,

$$W(u,v) = \kappa \iint_{-\infty}^{\infty} |T(u_1,v_1)|^2\,|T(u_1+u,\,v_1+v)|^2\,du_1\,dv_1, \qquad (21)$$

where $\kappa$ is a normalisation constant.

Comparison of eq. (15) for the Fraunhofer plane and eq. (21) for the image
plane shows that the Wiener spectra have the same form in each case. For
speckle patterns produced in the Fraunhofer plane the intensity distribution
of the illumination is the quantity that determines the spectrum, whereas in
the image plane it is the squared modulus of the pupil function.
For an unshaded circular imaging pupil of radius r, the autocorrelation
function and Wiener spectrum of the intensity fluctuation are given by
equations (16) and (17) respectively, with the distance D replaced by the
focal length f. The autocorrelation function has the same form as the
intensity distribution of an Airy disc, and this fact has given rise to the
general rule-of-thumb that the speckle size is of the same order of magni-
tude as the size of the Airy disc; note however that the autocorrelation
function and hence the speckle size is independent of any aberration of
the imaging system. When imaging a general diffuse object, the object
amplitude is not statistically stationary, and the statistics of the image are
described by non-stationary functions; this is discussed by LOWENTHAL
and ARSENAULT [1970] as an extension of the above arguments for the
stationary case.


The statistical properties we have evaluated so far all relate to the complex
amplitude or intensity at one or more points in a speckle pattern. In practice,
speckle patterns are more likely to be averaged over some non-zero
area by, for example, a scanning aperture and we shall call this averaged
or integrated intensity the measured intensity. The measured intensity
$I_b(x,y)$ is related to the intensity of the speckle pattern $I(x_1,y_1)$ by a
convolution formula,

$$I_b(x,y) = \iint_{-\infty}^{\infty} I(x_1,y_1)\,B(x-x_1,\,y-y_1)\,dx_1\,dy_1, \qquad (22)$$

where $B(x,y)$ is the intensity point spread function of the averaging device
normalised such that the volume of $B(x,y)$ is unity.
The Wiener spectrum of the measured intensity fluctuation $W_b(u,v)$ is
simply related to that of the speckle pattern intensity:

$$W_b(u,v) = W(u,v)\,|b(u,v)|^2, \qquad (23)$$

where $b(u,v)$ is the Fourier transform of $B(x,y)$. The variance of the measured
intensity $\sigma_b^2$ is given by

$$\sigma_b^2 = \iint_{-\infty}^{\infty} W_b(u,v)\,du\,dv = \iint_{-\infty}^{\infty} W(u,v)\,|b(u,v)|^2\,du\,dv. \qquad (24)$$

The Wiener spectra of the intensity fluctuation in normal speckle patterns
formed in the Fraunhofer and image planes are given by equations (15)
and (21) respectively.
GERRITSEN, HANNAN and RAMBERG [1968] calculated the ratio $\langle I\rangle/\sigma_b$
(a signal-to-noise ratio) for square pupils and scanning apertures. For
a circular imaging pupil of radius $r$ and a circular scanning aperture of
radius $a$, eq. (24) becomes,

where

$$k = \frac{\text{diameter of scanning aperture}}{\text{Rayleigh resolution limit of lens}} = \frac{2a}{0.61\,\lambda f/r}.$$

The quantity $k$ is a measure of the aperture size relative to the speckle
size, since we showed in § 2.2 that the speckle size is of the same order
of magnitude as the Airy disc. In Fig. 5, $\sigma_b/\langle I\rangle$ is plotted as a function of
the relative aperture diameter for both top-hat and Gaussian scanning
apertures.

Fig. 5. The standard deviation of the intensity fluctuations relative to the mean intensity for
top-hat and Gaussian scanning apertures for a speckle pattern formed by an optical system
with a top-hat pupil function. For the top-hat aperture, $k$ = (diameter of scanning
aperture)/(Rayleigh resolution limit of lens). For the Gaussian aperture, $k$ = (diameter
of aperture at 1/e points)/(Rayleigh resolution limit of lens).

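If the scanning aperture is very large compared to the speckle size
then the variance is given approximately by

$$\sigma_b^2 \simeq W(0,0)\iint_{-\infty}^{\infty} |b(u,v)|^2\,du\,dv.$$

For unshaded pupils and scanning apertures, this expression reduces to

$$\frac{\sigma_b^2}{\langle I\rangle^2} \simeq \frac{1}{\mathcal{A}_b\,\mathcal{A}_p}, \qquad (26)$$

where $\mathcal{A}_b$ is the area of the scanning or averaging aperture, and $\mathcal{A}_p$ is the
area of the pupil in spatial frequency units (i.e. $\mathcal{A}_p = \pi r^2/\lambda^2 f^2$).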
Finding the first order probability density function for the measured
intensity, $p(I_b)$, is not straightforward. The problem is analogous to that
of square-law detection and low-pass filtration in the one-dimensional
time domain, which was analysed by KAC and SIEGERT [1947] and SLEPIAN
[1958]. The extension* to the case of the measured intensity in a speckle
pattern was made by CONDIE [1966], DAINTY [1971] and BARAKAT [1973].
Equation (22) for the measured intensity may be rewritten in terms of the
complex amplitude in the speckle pattern $A(x_1,y_1)$ as

$$I_b(x,y) = \iint_{-\infty}^{\infty} |A(x_1,y_1)|^2\,B(x-x_1,\,y-y_1)\,dx_1\,dy_1.$$

* All three authors in fact made errors in extending the results of KAC and SIEGERT [1947];
these arose as a result of failing to include all aspects of the fact that the amplitude is a complex
Gaussian process.

The problem in finding $p(I_b)$ is that the measured intensity is a weighted sum
of correlated random variables. By expanding $A(x_1,y_1)$ in terms of orthogonal
functions (the Karhunen-Loève expansion), the measured intensity can be expressed as
a weighted sum of independent random variables and following this approach
the probability density function for the measured intensity is given by

$$p(I_b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \exp(-izI_b)\prod_{n=0}^{\infty}(1 - iz\varepsilon_n)^{-1}\,dz, \qquad (27)$$

where $\varepsilon_n$ are the eigenvalues of the homogeneous Fredholm equation

$$\varepsilon_n\,\psi_n(x,y) = \iint_{-\infty}^{\infty} C_A(x-x_1,\,y-y_1)\,B(x_1,y_1)\,\psi_n(x_1,y_1)\,dx_1\,dy_1, \qquad (28)$$

with $\varepsilon_0 \ge \varepsilon_1 \ge \varepsilon_2 \ldots$ and $C_A(x,y)$ is the autocorrelation function of the
complex amplitude in the speckle pattern.
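It is in general difficult to calculate the eigenvalues from eq. (28); however,
if the eigenvalues $\varepsilon_n$ are plotted as a function of $n$ it is often found
that for a wide range of $C_A(x_1,y_1)$ and $B(x_1,y_1)$, the eigenvalues are
approximately equal to $\varepsilon_0$ up to a value of $n \simeq n_0$, and approximately
equal to zero for $n > n_0$. With this approximation, the probability density
function for the measured intensity becomes (CONDIE [1966], SCRIBOT [1974])

$$p(I_b) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \exp(-izI_b)\,(1 - iz\varepsilon_0)^{-n_0}\,dz, \qquad (29)$$

which, upon integration, gives the gamma distribution,

$$p(I_b) = \frac{I_b^{\,n_0-1}}{\Gamma(n_0)\,\varepsilon_0^{\,n_0}}\exp(-I_b/\varepsilon_0), \quad I_b \ge 0.$$

It can also be shown that, in this approximate case,

$$\langle I\rangle = \varepsilon_0 n_0,$$

$$\sigma_b^2 = \varepsilon_0^2 n_0.$$

Equation (29) can therefore be rewritten as

$$p(I_b) = \left(\frac{n_0}{\langle I\rangle}\right)^{n_0}\frac{I_b^{\,n_0-1}}{\Gamma(n_0)}\exp\left(-\frac{n_0 I_b}{\langle I\rangle}\right), \qquad (30)$$

where $n_0$ is interpreted as the number of independent correlation cells
(speckles) within the scanning aperture.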
It is interesting to note that this approximate expression for the probability
density function of the measured intensity has also been derived
using a heuristic argument (GOODMAN [1965]) that is again an extension
of work in the one-dimensional time domain (RICE [1954], MANDEL [1959]).
The scanning aperture is regarded as consisting of $n_0$ independent correlation
cells, the intensity being taken as constant within any one cell and
statistically independent of the intensity in all other cells. The intensity of
each correlation cell is assumed to have a negative exponential distribution
and therefore the total measured intensity is approximately a gamma
variate, as in eq. (30), where $n_0$ is chosen such that the variances of the
approximate and exact distributions are equal:

$$n_0 = \langle I\rangle^2/\sigma_b^2. \qquad (31)$$
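
The heuristic correlation-cell picture lends itself to a direct numerical check. The sketch below is an illustration under that picture only (it assumes exactly independent cells of equal mean, which the exact theory does not): it sums a chosen number of negative exponential variates, recovers the number of cells from the ratio of squared mean to variance, and evaluates the gamma density for the same parameters. Function names and sample sizes are hypothetical.

```python
import numpy as np
from math import gamma

def integrated_intensity(n_cells, mean=1.0, samples=100_000, seed=2):
    """Measured intensity modelled as the sum of n_cells independent
    negative exponential cell intensities, with total mean `mean`."""
    rng = np.random.default_rng(seed)
    cells = rng.exponential(mean / n_cells, size=(samples, n_cells))
    return cells.sum(axis=1)

def gamma_pdf(i, n0, mean=1.0):
    """Gamma density with n0 correlation cells and the given mean intensity."""
    return (n0 / mean) ** n0 * i ** (n0 - 1) * np.exp(-n0 * i / mean) / gamma(n0)

measured = integrated_intensity(n_cells=4)
# Number of cells recovered from the first two moments.
n0_estimate = measured.mean() ** 2 / measured.var()
print(f"estimated n0 = {n0_estimate:.2f}")

grid = np.linspace(0.0, 20.0, 20001)
area = gamma_pdf(grid, n0=4).sum() * (grid[1] - grid[0])
print(f"area under gamma density = {area:.4f}")
```

A histogram of `measured` can be compared with `gamma_pdf` to see the transition from the negative exponential form (one cell) towards a Gaussian form (many cells).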

Fig. 6. The approximate probability density function of a normal speckle pattern scanned by
an aperture. This distribution has two parameters, the mean intensity (assumed to be unity)
and the standard deviation $\sigma_b$ of the intensity fluctuation (relative to the mean intensity).

In Fig. 6 the approximate probability density function is drawn for a mean
measured intensity of unity and various values of $\sigma_b/\langle I\rangle$; for $\sigma_b/\langle I\rangle \to 1$,
the distribution tends to the negative exponential form and for $\sigma_b/\langle I\rangle \to 0$
it tends to a Gaussian form.
An important difference between the approximate distribution given by
eq. (30) and the exact distribution given by eqs. (27) and (28) is that the
approximate distribution depends on the Wiener spectrum of the speckle
only through a weighted integral (eq. (24)) whereas the exact distribution
depends on the form of the Wiener spectrum. It is found however that the
agreement between the approximate and exact formulae is close regardless
of the form of the speckle Wiener spectrum (MANDEL [1959], BEDARD,
CHANG and MANDEL [1967], SCRIBOT [1974]).

§ 3. Partially Coherent Illumination


Speckle patterns were first observed in situations where the light incident
on the scatterer was both spatially and temporally partially coherent, and
the properties of such patterns are somewhat less straightforward than
those for perfectly coherent illumination. For each case of spatial and
temporal partial coherence we again make the subdivision into speckle
patterns formed in the region of the Fraunhofer plane and those formed in
the image plane, although a perfectly general theory may of course be used
to apply to both cases (PARRY [1975b]).
In Fig. 7 we show an optical system that might be used to form a speckle
pattern in the Fraunhofer plane of a diffuser illuminated by spatially
partially coherent light. The diffuser is uniformly illuminated by light
which has a mutual coherence function $\Gamma(\xi_1,\xi_2,\eta_1,\eta_2)$ given by the
inverse Fourier transform of the source intensity $s(x,y)$ suitably scaled
and normalised. The area of the diffuser that contributes to the speckle
pattern is limited by the extent of the lens L2 whose intensity transmittance
is equal to the squared modulus of the pupil function, $|H(\xi,\eta)|^2$.

Fig. 7. Formation of a speckle pattern produced in the Fraunhofer plane of a diffuser illuminated
by spatially partially coherent light.
To find the statistics of the speckle pattern intensity $I(x,y)$ formed in
this case, we must first obtain a relationship between $I(x,y)$, $s(x,y)$, $H(\xi,\eta)$
and $Z(\xi,\eta)$, where this last function is the complex amplitude at the pupil
due to the diffuser. There are two conceptually different approaches that
have been used to obtain this relationship. The first (LABEYRIE [1970],
DAINTY [1973]) is based on the fact that $I(x,y)$ is an image, albeit a very
poor one, of the source intensity and the usual equations for imaging of
self-luminous (incoherent) objects apply; the effective pupil function is
equal to the product $Z(\xi,\eta)H(\xi,\eta)$. This approach is useful when the
diffuser is weak, as in the case of the atmosphere; the speckle pattern
only exists in a small region of the $x$-plane (see § 4.2). The second, more
conventional, approach is to follow the mutual coherence function as it
propagates through the system (ROSS [1969], ASAKURA, FUJII and MURATA
[1972], FUJII and ASAKURA [1973, 1974a]) and this is outlined below.
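Assuming that the illumination is quasimonochromatic and that the
mutual coherence function is stationary, then (BORN and WOLF [1970]),

$$I(x,y) = \iiiint_{-\infty}^{\infty} \Gamma(\xi_1-\xi_2,\,\eta_1-\eta_2)\,H(\xi_1,\eta_1)Z(\xi_1,\eta_1)\,H^*(\xi_2,\eta_2)Z^*(\xi_2,\eta_2)\,\exp\left\{\frac{2\pi i}{\lambda f}\left[(\xi_1-\xi_2)x + (\eta_1-\eta_2)y\right]\right\} d\xi_1\,d\eta_1\,d\xi_2\,d\eta_2. \qquad (32)$$

Making the substitutions

$$u = -\frac{\xi_1-\xi_2}{\lambda f}, \qquad v = -\frac{\eta_1-\eta_2}{\lambda f},$$

we obtain

$$I(x,y) \propto \iint_{-\infty}^{\infty} \Gamma(-\lambda f u,\,-\lambda f v)\left[\iint_{-\infty}^{\infty} H(\xi_1,\eta_1)Z(\xi_1,\eta_1)\,H^*(\xi_1+\lambda f u,\,\eta_1+\lambda f v)\,Z^*(\xi_1+\lambda f u,\,\eta_1+\lambda f v)\,d\xi_1\,d\eta_1\right] \exp\{-2\pi i(ux+vy)\}\,du\,dv,$$

which further simplifies to

$$I(x,y) = s(x,y) \otimes \left|\iint_{-\infty}^{\infty} H(\lambda f u,\,\lambda f v)\,Z(\lambda f u,\,\lambda f v)\,\exp\{-2\pi i(ux+vy)\}\,du\,dv\right|^2,$$

where $\otimes$ denotes convolution. However the second term in the convolution
is simply the normal speckle pattern that would be obtained in perfectly
coherent light (provided of course that the complex amplitude transmittance
of the diffuser $Z(x,y)$ satisfies the constraints given in § 2). Thus,

$$I(x,y) = s(x,y) \otimes I_c(x,y), \qquad (33)$$

where $I_c(x,y)$ is the speckle pattern intensity that would be obtained with
perfectly coherent illumination.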
Equation (33) relates the intensities of speckle patterns produced in the
Fraunhofer plane in spatially partially coherent and fully coherent illumination,
and the first and second order statistics of the partially coherent case
follow quite simply. The effect of the non-zero source extent $s(x,y)$ is to
blur out and to reduce the contrast of the fully coherent speckle pattern.
Comparison of eq. (33) with eq. (22) shows that the source distribution
$s(x,y)$ in the partially coherent case plays the same role as the scanning
aperture $B(x,y)$ in the case of the measured intensity in coherent light.
Thus the exact first order probability density function is given by eqs.
(27) and (28) with $s(x,y)$ substituted for $B(x,y)$, or approximately by

$$p(I) = \left(\frac{n_0}{\langle I\rangle}\right)^{n_0}\frac{I^{\,n_0-1}}{\Gamma(n_0)}\exp\left(-\frac{n_0 I}{\langle I\rangle}\right),$$

where $n_0$ is the number of (coherent) speckles that lie within the source
function $s(x,y)$.
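
The blurring effect of a finite source can be illustrated by convolving a simulated coherent speckle pattern with a source profile. The sketch below is one-dimensional and makes several simplifying assumptions that are not in the text: a uniformly illuminated slit with unit moduli and random phases, a discrete Fourier transform in place of Fraunhofer propagation, a top-hat source, and circular (periodic) convolution. All function names and sizes are hypothetical.

```python
import numpy as np

def coherent_speckle_1d(n=2 ** 14, aperture=1024, seed=3):
    """Fully coherent 1-D speckle intensity; speckle size is about
    n / aperture samples."""
    rng = np.random.default_rng(seed)
    field = np.zeros(n, dtype=complex)
    field[:aperture] = np.exp(2j * np.pi * rng.random(aperture))
    return np.abs(np.fft.fft(field)) ** 2

def blur_with_source(intensity, source_width):
    """Circular convolution of the coherent pattern with a top-hat source
    of `source_width` samples, normalised to unit area."""
    source = np.zeros(intensity.size)
    source[:source_width] = 1.0 / source_width
    return np.real(np.fft.ifft(np.fft.fft(intensity) * np.fft.fft(source)))

def contrast(intensity):
    return intensity.std() / intensity.mean()

i_c = coherent_speckle_1d()
for width in (1, 64, 256):
    blurred = blur_with_source(i_c, width)
    print(f"source width {width:4d} samples: contrast = {contrast(blurred):.3f}")
```

A source width of one sample reproduces the coherent pattern, and the contrast should fall as the source grows to cover several coherent speckles, in line with the interpretation of $n_0$ given above.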
The Wiener spectrum of the intensity fluctuation in a speckle pattern
formed by spatially partially coherent illumination of a diffuser is given by

$$W(u,v) = W_c(u,v)\,|\Gamma(\lambda f u,\,\lambda f v)|^2, \qquad (34)$$

where $\Gamma(\xi,\eta)$ is the mutual coherence function and $W_c(u,v)$ is the Wiener
spectrum of a coherent speckle pattern. Equation (34) is the basis of a
method of measuring the mutual coherence function; providing the diffuser
satisfies the conditions laid down in § 2, the function $W_c(u,v)$ is of a known
form (eq. (21) in the present notation) and by measuring $W(u,v)$ the squared
modulus of $\Gamma(\xi,\eta)$ can be found. An example of an experimental result is
given in Fig. 8; the same basic method is also used in astronomy for the
determination of stellar diameters and binary star separations.

Fig. 8. An experimental measurement of the spatial coherence of a light source using speckle
patterns (FUJII and ASAKURA [1973]).


The statistics of the intensity in the image of a spatially partially coherently
illuminated diffuser are slightly more difficult to find than those in the
Fraunhofer case. Referring to Fig. 9, the image intensity is given by

$$I(x,y) = \iiiint_{-\infty}^{\infty} \Gamma(x_1-x_2,\,y_1-y_2)\,h(x-x_1,\,y-y_1)\,Z(x_1,y_1)\,h^*(x-x_2,\,y-y_2)\,Z^*(x_2,y_2)\,dx_1\,dy_1\,dx_2\,dy_2, \qquad (35)$$

where $h(x,y)$ is the amplitude point spread function of the imaging system
and $Z(x,y)$ is the complex amplitude transmittance of the diffuser. In
terms of the effective source $J(u,v) = s(a/\lambda D,\,b/\lambda D)$,

$$J(u,v) = \iint_{-\infty}^{\infty} \Gamma(x,y)\,\exp\{-2\pi i(ux+vy)\}\,dx\,dy,$$

this may be written as

$$I(x,y) = \iint_{-\infty}^{\infty} J(u,v)\,|Q(x,y;u,v)|^2\,du\,dv, \qquad (36)$$

where $Q(x,y;u,v)$ is the amplitude in a speckle pattern produced by an
oblique coherent wave incident on the diffuser at angle $(\lambda u, \lambda v)$,

$$Q(x,y;u,v) = \iint_{-\infty}^{\infty} h(x-x_1,\,y-y_1)\,Z(x_1,y_1)\,\exp\{-2\pi i(ux_1+vy_1)\}\,dx_1\,dy_1.$$

Fig. 9. Formation of a speckle pattern produced in the image plane of a diffuser illuminated
by spatially partially coherent light.

Equation (36) states that the speckle pattern produced in the image plane
in spatially partially coherent illumination is just the (weighted) sum of
the intensities of speckle patterns produced by different angles of coherent
illumination. The exact first order probability density function is therefore
given by an expression similar to eq. (27), where the eigenvalues $\varepsilon_n$ are
given by an expression similar to (28). The approximate probability density
function is given by equation (30), where the value of $n_0$ is approximately
equal to the ratio of the diameter of the point spread function to the diameter
of the coherence patch at the diffuser. In an optical system which has an
illuminating condenser of numerical aperture $\mathrm{NA}_c$ and an imaging objective
of numerical aperture $\mathrm{NA}_o$, $n_0 \simeq \mathcal{S} = \mathrm{NA}_c/\mathrm{NA}_o$ (for $n_0 \gg 1$).
The Wiener spectrum of the intensity fluctuation in a speckle pattern
produced in a spatially partially coherent optical system has been derived
for optically very rough surfaces by DAINTY [1970] and YAMAGUCHI [1973],
and for surfaces of an arbitrary magnitude of roughness but with a Gaussian
phase profile by FUJII and ASAKURA [1974c]; this latter case is discussed
in § 4.3, and only the result for a very rough surface is found here.
Equation (35) for the image intensity may be rewritten in the form

$$I(x) = \iint_{-\infty}^{\infty} V(x_1,x_2)\,Z(x-x_1)\,Z^*(x-x_2)\,dx_1\,dx_2.$$


One dimensional notation is used here to simplify the derivation. The
autocorrelation of the intensity fluctuation $\Delta I(x) = I(x) - \langle I\rangle$ is given by

$$C(x_0) = \langle \Delta I(x)\,\Delta I(x+x_0)\rangle = \iiiint_{-\infty}^{\infty} V(x_1,x_2)\,V(x_3,x_4)\,\{\langle Z(x-x_1)Z^*(x-x_2)Z(x+x_0-x_3)Z^*(x+x_0-x_4)\rangle - \langle Z(x-x_1)Z^*(x-x_2)\rangle\langle Z(x+x_0-x_3)Z^*(x+x_0-x_4)\rangle\}\,dx_1\,dx_2\,dx_3\,dx_4. \qquad (37)$$

It is now assumed that the object amplitude $Z(x)$ is a complex Gaussian
process so that applying the theorem of REED [1962] to the fourth order
moment we obtain,

$$C(x_0) = \iiiint_{-\infty}^{\infty} V(x_1,x_2)\,V(x_3,x_4)\,\{\langle Z(x-x_1)Z^*(x+x_0-x_4)\rangle\,\langle Z^*(x-x_2)Z(x+x_0-x_3)\rangle\}\,dx_1\,dx_2\,dx_3\,dx_4,$$

or in terms of the autocorrelation of the diffuser complex amplitude $C_Z(x)$,

$$C(x_0) = \iiiint_{-\infty}^{\infty} V(x_1,x_2)\,V(x_3,x_4)\,\{C_Z^*(x_1-x_4+x_0)\,C_Z(x_2-x_3+x_0)\}\,dx_1\,dx_2\,dx_3\,dx_4. \qquad (38)$$

The Wiener spectrum of the intensity fluctuations is the Fourier transform
of eq. (38), and reduces to

$$W(u) = \int_{-\infty}^{\infty} W_Z(u_1)\,W_Z(u+u_1)\,\mathcal{T}(-u_1,\,-(u+u_1))\,\mathcal{T}(u+u_1,\,u_1)\,du_1, \qquad (39)$$

where $W_Z(u)$ is the Wiener spectrum of the object amplitude, and $\mathcal{T}(u_1,u_2)$
is the transmission cross-coefficient (BORN and WOLF [1970]) and is given by

$$\mathcal{T}(u_1,u_2) = \int_{-\infty}^{\infty} J(u')\,H(u'+u_1)\,H^*(u'-u_2)\,du'.$$

Since $\mathcal{T}(u_1,u_2) = \mathcal{T}^*(-u_2,-u_1)$, eq. (39) reduces further to

$$W(u) = \int_{-\infty}^{\infty} W_Z(u_1)\,W_Z(u+u_1)\,|\mathcal{T}(u+u_1,\,u_1)|^2\,du_1, \qquad (40)$$

and for a diffuser whose lateral structure is small compared with the point
spread function of the imaging system (a white-noise diffuser),

$$W(u) \propto \int_{-\infty}^{\infty} |\mathcal{T}(u+u_1,\,u_1)|^2\,du_1. \qquad (41)$$

It should be noted that the result of eq. (41) for a white noise diffuser is
not restricted to the case of Gaussian statistics for the diffuser complex
amplitude transmittance; if the function $Z(x,y)$ has an autocorrelation
function that is a delta-function (i.e. white noise), then it can be shown
that eq. (41) must follow regardless of the statistics of $Z(x,y)$.
In Fig. 10, the zero-frequency value $W(0)$ of the Wiener spectrum is plotted as a
function of the coherence parameter $\mathcal{S}$,

$$\mathcal{S} = \frac{\text{radius of effective source}}{\text{radius of entrance pupil}} = \frac{\mathrm{NA}_c}{\mathrm{NA}_o}.$$

Fig. 10. The zero spatial frequency value of the Wiener spectrum of the intensity fluctuation
in a speckle pattern formed by an aberration-free partially coherent optical system with a
top-hat effective source distribution, plotted as a function of the coherence parameter $\mathcal{S}$
(DAINTY [1970]).

The quantity $W(0)$ gives an indication of the variance $\sigma^2$ of the speckle
pattern, and it is clear from the figure that the speckle contrast remains
high for $\mathcal{S} < 1$. This result is also confirmed by FUJII and ASAKURA [1974c],
and is discussed further in § 4.3.


Much of the early discussion on the nature of speckle patterns was
concerned with the effect of the non-monochromaticity of the light source
(DE HAAS [1918], RAMACHANDRAN [1943]). Speckle patterns formed in
the Fraunhofer plane of diffusers illuminated by all lines of an argon laser
are shown in Fig. 11 for three surfaces of different roughness (a colour
plate is given by MARTIENSSEN and SPILLER [1965]). Speckle patterns
formed by non-monochromatic sources have two notable features that
distinguish them from normal (monochromatic) speckle patterns. Firstly
there is a radial structure which depends both on the temporal coherence
of the source and on the roughness of the diffuser; secondly, the contrast
of the pattern is a function of position in the Fraunhofer plane and also
depends on the temporal coherence and surface roughness.
For a surface whose r.m.s. height variation $\sigma_h$ is greater than one wavelength
but less than the coherence length $L_c$ of the incident radiation (see
Fig. 12 (a)), each wavelength forms a normal speckle pattern that is fairly
well correlated with the speckle patterns produced by neighbouring wave-
lengths, at least near the centre of the pattern. The speckle size is of course
related directly to the wavelength and the overall effect is to produce a
radial structure in the total pattern. The length of a speckle in the radial
structure depends upon the magnitude of the relative phase changes from
point to point on the diffuser as a function of angle and this depends on the
r.m.s. height variation $\sigma_h$; for a large surface roughness the relative phase
change is large for a small change in angle and the radial structure is less
apparent (see Fig. 11). The contrast of the pattern is also less for surfaces
whose r.m.s. roughness is greater than the coherence length of the incident
light (see Fig. 12 (b)) because at any point in the Fraunhofer plane fewer
scatterers give coherent contributions and more give incoherent contribu-
tions. For normal incidence and observation at the centre of the Fraunhofer
plane, GOODMAN [1963] showed that high contrast is only achieved if
oh < Lc.
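The loss of contrast with increasing roughness can be illustrated with a small Monte Carlo sketch. It is not taken from the papers cited above. It assumes, purely for illustration, a fixed number of independent scatterers with Gaussian heights, back-scattering at normal incidence (so that each contribution has phase $2kh$), and equal weights for the four argon lines listed in the caption of Fig. 11. The function name and all parameter values are hypothetical.

```python
import cmath
import math
import random

# Argon-laser wavelengths in micrometres (caption of Fig. 11).
# Equal line weights are an assumption made for this illustration.
WAVELENGTHS_UM = [0.514, 0.496, 0.488, 0.476]


def on_axis_contrast(sigma_h_um, n_scatterers=30, n_surfaces=2000, seed=1):
    """Estimate sigma_I / <I> on axis for a polychromatic speckle pattern.

    Each surface realisation has n_scatterers independent Gaussian heights
    with r.m.s. value sigma_h_um. For normal incidence and back-scattering
    the phase of each contribution is 2*k*h.
    """
    rng = random.Random(seed)
    wavenumbers = [2.0 * math.pi / lam for lam in WAVELENGTHS_UM]
    intensities = []
    for _ in range(n_surfaces):
        heights = [rng.gauss(0.0, sigma_h_um) for _ in range(n_scatterers)]
        total = 0.0
        for k in wavenumbers:
            amplitude = sum(cmath.exp(2j * k * h) for h in heights)
            total += abs(amplitude) ** 2  # intensities of different lines add
        intensities.append(total / len(wavenumbers))
    mean = sum(intensities) / n_surfaces
    variance = sum((i - mean) ** 2 for i in intensities) / n_surfaces
    return math.sqrt(variance) / mean
```

Calling `on_axis_contrast(0.5)` and `on_axis_contrast(5.0)` compares a surface whose roughness is comparable to the wavelength with a much rougher one; under these assumptions the rougher surface should give the lower contrast, because the patterns of the individual lines are then almost uncorrelated.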
Fig. 11. Speckle patterns produced in the Fraunhofer planes of diffusers with three surface
roughnesses illuminated by an argon laser (wavelengths present 514, 496, 488, 476 nm);
(a) $\sigma_h \approx 1\,\mu$m, (b) $\sigma_h \approx 3\,\mu$m, (c) $\sigma_h \approx$ ... (PARRY [1974b]).

Fig. 12. Surfaces whose r.m.s. height variation $\sigma_h$ is (a) less than the coherence
length, $\sigma_h < L_c$, and (b) greater than the coherence length, $\sigma_h > L_c$.

More complete treatments of the statistics of speckle patterns formed
in the Fraunhofer plane in temporally partially coherent (white) light
have been given by GOODMAN [1963], FUJII and ASAKURA [1974a], PARRY
[1974a, b, 1975a, b] and PEDERSEN [1975a, b]. The results given so far
in this article have applied to all surfaces whose r.m.s. roughness is greater
than one wavelength, regardless of the detailed statistical properties of
the surface. However we can no longer accept this naive picture of the
scattering surface and it is necessary to introduce surface dependent
parameters using the Beckmann model (BECKMANN and SPIZZICHINO [1963]).
We consider the scattering geometry shown in Fig. 13 (PEDERSEN [1975b]).
The surface lies in the $(x, y) \equiv \mathbf{x}$ plane and has a profile $h(\mathbf{x})$ whose r.m.s.
fluctuation is $\sigma_h > \lambda$. The surface is illuminated by a unit amplitude plane
wave with wave vector $\mathbf{k}_0$, and the scattered wave $\mathbf{k}$ is observed in the
Fraunhofer plane, where $|\mathbf{k}| = |\mathbf{k}_0| = 2\pi/\lambda$. The scattered amplitude is a
function of $\mathbf{k}_0$ and $\mathbf{k}$ only through the difference $\mathbf{q} = \mathbf{k} - \mathbf{k}_0$ as multiple
reflections are ignored. The problem of finding the statistics of the scattered
field for surfaces of arbitrary profile $h(\mathbf{x})$ is difficult. If the surface roughness
is greater than one wavelength and we illuminate the surface with
monochromatic light then as we have seen in §2 it is possible to obtain a useful
statistical description of the scattered field. However, it should be noted
that the description given in §2 does not explain the roughness dependent
angular correlation of speckle patterns (ARCHBOLD and ENNOS [1972]).

Fig. 13. Diffraction geometry for the formation of speckle patterns, showing the incident
wave and the surface profile (PEDERSEN [1975b]).
In this present case we make two important assumptions about the surface
profile. Firstly we assume that $h(\mathbf{x})$ is a stationary Gaussian random variable
with zero mean and autocorrelation function $C_h(\mathbf{x}_1 - \mathbf{x}_2)$; the first order
characteristic function is therefore given by

$$\Phi_1(t) = \langle \exp(ith) \rangle = \exp(-\tfrac{1}{2}\sigma_h^2 t^2). \tag{42}$$

Secondly it is assumed that the total area of the illuminated surface is large
compared to the area of correlated surface structure (i.e. large compared
to the extent of $C_h(\mathbf{x}_1 - \mathbf{x}_2)$) so that there are many correlation areas within
the illuminated area. Provided that these two assumptions are made, it
can be shown that (regardless of the form of $C_h(\mathbf{x}_1 - \mathbf{x}_2)$) the normalised
autocorrelation function of the scattered intensity is locally stationary and
given by

$$C(\Delta\mathbf{q}) = |\Phi_1(\Delta q_z)|^2 \, C(\Delta\mathbf{q}_x), \tag{43}$$

where $\Delta\mathbf{q} = \mathbf{q}_1 - \mathbf{q}_2$, $\Delta q_z$ is the projection of $\Delta\mathbf{q}$ on the z axis, and $\Delta\mathbf{q}_x$ is
the projection of $\Delta\mathbf{q}$ on the $\mathbf{x}$ plane. $C(\Delta\mathbf{q}_x)$ is the normalised
autocorrelation function of the intensity in a normal speckle pattern which in the
present notation is given by

$$C(\Delta\mathbf{q}_x) = \left| \int S(\mathbf{x}) \exp(-i\,\Delta\mathbf{q}_x \cdot \mathbf{x}) \, d^2x \right|^2 \Big/ \left| \int S(\mathbf{x}) \, d^2x \right|^2, \tag{44}$$

where $S(\mathbf{x})$ is the intensity distribution of the incident light.
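As a concrete illustration, the sketch below evaluates the normalised intensity autocorrelation of a normal speckle pattern numerically for a one-dimensional Gaussian illumination profile. It assumes the usual form of this result, namely the squared modulus of the Fourier transform of $S(x)$ divided by its value at zero frequency. For a Gaussian profile of standard deviation $\sigma_x$ (a symbol introduced here only for the example) this should reduce to $\exp(-\sigma_x^2 \Delta q^2)$. The function names and numerical values are hypothetical.

```python
import math


def speckle_autocorrelation(delta_q, intensity, x_min, x_max, n_steps=4000):
    """Normalised intensity autocorrelation |FT{S}(delta_q)|^2 / |FT{S}(0)|^2.

    One-dimensional version of the Fourier-transform relation, evaluated
    with the midpoint rule. `intensity` is the illumination profile S(x).
    """
    dx = (x_max - x_min) / n_steps
    real = imag = norm = 0.0
    for i in range(n_steps):
        x = x_min + (i + 0.5) * dx
        s = intensity(x)
        real += s * math.cos(delta_q * x) * dx
        imag -= s * math.sin(delta_q * x) * dx
        norm += s * dx
    return (real * real + imag * imag) / (norm * norm)


def gaussian_spot(sigma_x):
    """Gaussian illumination profile with standard deviation sigma_x."""
    return lambda x: math.exp(-x * x / (2.0 * sigma_x * sigma_x))


if __name__ == "__main__":
    sigma_x = 1.0  # illustrative spot size, arbitrary units
    spot = gaussian_spot(sigma_x)
    for dq in (0.0, 0.5, 1.0, 2.0):
        numeric = speckle_autocorrelation(dq, spot, -10.0, 10.0)
        analytic = math.exp(-(sigma_x * dq) ** 2)
        print(f"dq={dq:.1f}  numeric={numeric:.6f}  analytic={analytic:.6f}")
```

The same routine accepts any non-negative profile, so a top-hat or an asymmetric spot can be substituted for the Gaussian to see how the speckle size follows the extent of the illuminated area.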

It is useful at this point to let

$$\mathbf{q} = \mathbf{k} - \mathbf{k}_0 = k(\mathbf{n} - \mathbf{n}_0) = k\mathbf{m},$$

where $k$ is the wavenumber ($k = 2\pi/\lambda$) and $\mathbf{m} = \mathbf{n} - \mathbf{n}_0$ is the change in
propagation direction between the incident and scattered wave.

If a polychromatic speckle pattern is formed using light whose spectral
intensity distribution is $S(k)$, the intensity of the observed speckle pattern
$I(\mathbf{m})$ is simply the sum of the monochromatic patterns $I(k\mathbf{m})$ suitably
weighted by $S(k)$:

$$I(\mathbf{m}) = \int_0^\infty S(k)\, I(k\mathbf{m})\, dk.$$

It follows that the mean intensity $\langle I(\mathbf{m}) \rangle$ is given by

$$\langle I(\mathbf{m}) \rangle = \int_0^\infty S(k)\, \langle I(k\mathbf{m}) \rangle\, dk,$$

and provided that this mean is locally stationary over regions of speckle
correlation, the normalised angular autocorrelation function $C_w(\mathbf{m}_1, \mathbf{m}_2)$
of the white light speckle intensity fluctuation is given by

$$C_w(\mathbf{m}_1, \mathbf{m}_2) = \int_0^\infty\!\!\int_0^\infty S(k_1) S(k_2)\, C(k_2\mathbf{m}_2 - k_1\mathbf{m}_1)\, dk_1\, dk_2, \tag{46}$$

where $S(k)$ is normalised so that $\int_0^\infty S(k)\, dk = 1$.

To gain further insight into the white light speckle angular autocorrelation
function it is useful to consider a particular example in which the spatial
intensity distribution $S(\mathbf{x})$ of the illuminating light is Gaussian with r.m.s.
radius $r/2$. It follows from eqs. (42) to (44) that (PEDERSEN [1975b]),

$$C(\Delta\mathbf{q}) = \exp\{-r^2 |\Delta\mathbf{q}_x|^2 - \sigma_h^2 \Delta q_z^2\}.$$

We further assume that the normalised spectral density $S(k)$ is also Gaussian
with an r.m.s. spectral bandwidth $W$ around a mean wavenumber $k_0$,

$$S(k) = \frac{1}{\sqrt{2\pi}\, W} \exp\{-(k - k_0)^2 / 2W^2\}.$$

We now introduce a path vector $\mathbf{s} = r\mathbf{m}_x + \sigma_h m_z \mathbf{e}_z$, where

$$s = |\mathbf{s}| = (r^2 |\mathbf{m}_x|^2 + \sigma_h^2 m_z^2)^{1/2} \tag{47}$$

is the r.m.s. path deviation of the light contributing at the point $\mathbf{m}$ in the
speckle pattern and $\mathbf{e}_z$ is a unit vector in the z direction. Equation (46) for
the angular autocorrelation of the intensity can now be evaluated in closed
form to give eq. (48).
Let the mean path vector be $\mathbf{s} = (\mathbf{s}_1 + \mathbf{s}_2)/2$ and a difference path vector be
$\mathbf{d} = \mathbf{s}_2 - \mathbf{s}_1$. In the limit $\mathbf{d} = 0$ eq. (48) reduces to

$$C_w(0; \mathbf{s}) = \sigma^2 / \langle I \rangle^2 = 1 / \sqrt{1 + (2Ws)^2}, \tag{49}$$

where $s$ is in general given by eq. (47); for normally incident illumination
and viewing at the centre of the Fraunhofer plane $s = 2\sigma_h$ and we obtain

$$\sigma^2 / \langle I \rangle^2 = 1 / \sqrt{1 + (4W\sigma_h)^2}. \tag{50}$$

This simple expression clearly shows how the speckle contrast decreases
as either the surface roughness $\sigma_h$ or the illumination bandwidth $W$ increases.
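As a numerical cross-check on the contrast formula, the sketch below evaluates the double integral of eq. (46) at zero displacement for a Gaussian spectrum and compares it with the closed form $1/\sqrt{1 + (2Ws)^2}$. It assumes the Gaussian forms quoted above, so that for $\mathbf{m}_1 = \mathbf{m}_2$ the monochromatic correlation between wavenumbers $k_1$ and $k_2$ is $\exp\{-s^2 (k_2 - k_1)^2\}$. The function names and the numerical values are illustrative only.

```python
import math


def gaussian_spectrum(k, k0, bandwidth):
    """Normalised Gaussian spectral density, mean k0, r.m.s. width bandwidth."""
    norm = math.sqrt(2.0 * math.pi) * bandwidth
    return math.exp(-((k - k0) ** 2) / (2.0 * bandwidth ** 2)) / norm


def contrast_ratio_numeric(s, k0, bandwidth, n=300, span=6.0):
    """Evaluate eq. (46) at zero displacement by a midpoint double sum.

    With m1 = m2 the monochromatic correlation reduces to
    exp(-s^2 (k2 - k1)^2), where s is the r.m.s. path deviation.
    """
    lo = k0 - span * bandwidth
    dk = 2.0 * span * bandwidth / n
    ks = [lo + (i + 0.5) * dk for i in range(n)]
    weights = [gaussian_spectrum(k, k0, bandwidth) * dk for k in ks]
    total = 0.0
    for k1, w1 in zip(ks, weights):
        for k2, w2 in zip(ks, weights):
            total += w1 * w2 * math.exp(-(s * (k2 - k1)) ** 2)
    return total


def contrast_ratio_closed_form(s, bandwidth):
    """sigma^2 / <I>^2 = 1 / sqrt(1 + (2 W s)^2)."""
    return 1.0 / math.sqrt(1.0 + (2.0 * bandwidth * s) ** 2)


if __name__ == "__main__":
    # Illustrative values: wavenumbers in rad/um, sigma_h = 2 um so s = 4 um.
    print(contrast_ratio_numeric(s=4.0, k0=12.0, bandwidth=0.2))
    print(contrast_ratio_closed_form(s=4.0, bandwidth=0.2))
```

The mean wavenumber drops out at zero displacement, which is why only the product $Ws$ appears in the closed form.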
If we now restrict our problem still further by assuming that we are
using relatively narrow band sources such that $W \ll k_0$, then eq. (48)
reduces to eq. (51).
In order to show how the speckles are elongated and in what way this
elongation depends on the surface and illumination properties, it is
convenient to assume that the illumination is normal to the scattering surface
and consider paraxial diffraction angles. We then have (Fig. 14),

$$\mathbf{s} = \{s_x, s_y, s_z\} = \{r\theta, 0, 2\sigma_h\},$$

where the x-axis lies along $s_x$ and $\theta$ is the polar diffraction angle. Similarly,

$$\mathbf{d} = \{r\Delta\theta_x, r\Delta\theta_y, 0\},$$

where $\Delta\theta_x$ and $\Delta\theta_y$ are the components of the change in diffraction angles
in the radial x-direction and the azimuthal y-direction respectively.
Equation (51) becomes

$$C_w(\Delta\boldsymbol{\theta}; \theta) = C_w(0; \theta) \exp\{\cdots\}. \tag{52}$$

Fig. 14. Diffraction geometry for the calculation of polychromatic speckle correlation with
normal incidence (PEDERSEN [1975b]).

Fig. 15. Illustration of the correlation regions in polychromatic speckle patterns. On axis the
correlation region is diffraction-limited and circular. Off axis the angular dispersion causes
the region to be radially elongated (PEDERSEN [1975b]).

It can be seen from eq. (52) that the degree of angular correlation is
locally stationary with an elliptical Gaussian correlation function. In the
azimuthal direction, the width of the correlation function (i.e. the speckle
width) is determined only by the normal diffraction value. In the radial
x-direction, the correlation function is lengthened by the process of angular
dispersion. This is illustrated in Fig. 15. The elongation of the speckles in
the radial direction due to angular dispersion is given by

$$(\sigma_\theta / \theta)^2_{\mathrm{sp}} = 2 (W_c / k_0)^2, \tag{53}$$

where $W_c$ is a correlation bandwidth given by

$$W_c = W / \sqrt{1 + (4W\sigma_h)^2}. \tag{54}$$

It is clear from eqs. (53) and (54) that the speckle length in the radial
direction depends on both the surface roughness $\sigma_h$ and the illumination
bandwidth $W$.
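A short sketch of how eqs. (53) and (54) might be used follows. It reads eq. (54) as $W_c = W/\sqrt{1 + (4W\sigma_h)^2}$ and eq. (53) as $(\sigma_\theta/\theta)^2 = 2(W_c/k_0)^2$; this is an interpretation of the printed formulas that agrees with a direct narrow-band evaluation of eq. (46) for Gaussian statistics, not a quotation. The function names and sample values are hypothetical.

```python
import math


def correlation_bandwidth(bandwidth, sigma_h):
    """Correlation bandwidth W_c = W / sqrt(1 + (4 W sigma_h)^2)."""
    return bandwidth / math.sqrt(1.0 + (4.0 * bandwidth * sigma_h) ** 2)


def radial_elongation(theta, bandwidth, sigma_h, k0):
    """R.m.s. radial lengthening of a speckle at polar angle theta.

    Uses the reading (sigma_theta / theta)^2 = 2 (W_c / k0)^2, so the
    lengthening grows linearly with distance from the axis.
    """
    w_c = correlation_bandwidth(bandwidth, sigma_h)
    return theta * math.sqrt(2.0) * w_c / k0


if __name__ == "__main__":
    k0, w = 12.0, 0.2  # rad/um, illustrative values only
    for sigma_h in (0.0, 1.0, 5.0):
        print(sigma_h, correlation_bandwidth(w, sigma_h),
              radial_elongation(0.05, w, sigma_h, k0))
```

For a smooth surface the correlation bandwidth equals the full bandwidth $W$, while for a very rough surface it tends to $1/(4\sigma_h)$ and becomes independent of $W$, which is one way of seeing why the radial structure fades for rough diffusers.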
The probability density function of intensity in a polychromatic speckle
pattern can be found by considering the pattern to be the sum of a number
of partially correlated monochromatic speckle patterns, the degree of
correlation depending upon position and surface roughness. Using an
analysis similar to that given in §2.3 for the first order statistics of the
measured intensity in a normal speckle pattern, it can be shown that the
probability density function of intensity in a polychromatic pattern is
approximately given by a gamma variate (eq. (30)) or by exact expressions
similar to eqs. (27) and (28) (PARRY [1975a]).
The above results indicate that the observation of white light speckle
patterns formed in the Fraunhofer plane may produce useful measures of
surface roughness and this has been suggested by SPRAGUE [1972] and
TRIBILLON [1974], both of whom present preliminary experimental results.
However, it must be borne in mind that the above analysis contained a
large number of assumptions. In particular it was assumed that single
scattering occurred (first Born approximation) and that the surface could
be characterised either by a Gaussian distribution of height with non-zero
correlation length, or by a white-noise distribution; it is not clear whether
these assumptions are valid in many practical scattering problems.
Speckle patterns formed in the image plane of a diffuser illuminated by
spatially coherent, polychromatic light have been studied by ELBAUM,
GREENBAUM and KING [1972], GEORGE and JAIN [1972, 1973, 1974], and
MCKECHNIE [1975]. The main application here is the reduction of speckle
in the images of diffuse objects. The analysis of the statistics is very similar
to the above case for the Fraunhofer plane. The main results are (i) the
speckle pattern is statistically stationary for a uniform diffuse object
provided that the aberrations of the imaging system are not field-dependent,
(ii) the speckle contrast decreases as the bandwidth $W$ of the illumination
increases and as the surface roughness $\sigma_h$ increases, (iii) the speckle pattern
depends on the aberrations of the imaging system. GEORGE and JAIN [1974]
and MCKECHNIE [1975] have given detailed analyses of the statistics using
Goodman's model of the scattering surface.

§ 4. Surface-dependent Features of Speckle Patterns

In earlier sections we have stressed that speckle patterns resulting from
certain surfaces being illuminated by monochromatic spatially coherent
light are essentially independent of detailed surface properties. The main
requirements were (i) the surface does not alter the polarisation of the
incident light, (ii) a large number of scattering centres contribute to the
amplitude at any point in the observation plane, and (iii) the phase of the
scattered wave is random in the interval $-\pi$ to $\pi$ (or $\sigma_h > \lambda$). In
monochromatic light it was shown that the value of the surface roughness $\sigma_h$
did not influence the speckle statistics provided that condition (iii) was
satisfied, although in polychromatic light $\sigma_h$ does influence the speckle
statistics.
In this section we shall consider the statistics of speckle patterns formed
when the above requirements are not fulfilled and we shall see that in
general the properties of speckle patterns depend in a very complicated
way on surface properties.


When plane-polarised monochromatic light is incident on a scattering
medium, the transmitted or reflected amplitude in general consists of
components of the field that are parallel and perpendicular to the incident field;
these components have in general unequal intensities and an arbitrary
correlation factor. A discussion of the mechanism of depolarisation and its
relation to the scattering medium is outside the scope of this article (BECK-
As far as the effect of depolarisation on the statistics of the resultant
speckle pattern is concerned, we may reduce the problem to that of an
addition of two partially correlated speckle patterns of unequal intensity
(BARAKAT [1973], GOODMAN [1975a, c]). Let $A_1$, $I_1$ and $A_2$, $I_2$ be the
amplitude and intensity of the speckle patterns produced by the parallel
and perpendicular components respectively. The correlation coefficient for
the amplitude is given by

$$\rho_{12} = \langle A_1 A_2^* \rangle / (\langle I_1 \rangle \langle I_2 \rangle)^{1/2}.$$

Experimentally we can measure a correlation matrix of intensities, with
elements $c_{mn}$ (CHAKRABORTY [1973], GEORGE, JAIN and MELVILLE [1975]).

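However, for a large number of scattering centres the complex amplitude
in each speckle pattern is a complex Gaussian process and using the results
of REED [1962]

$$c_{mn} = |\rho_{mn}|^2, \qquad \rho_{mn} = \sqrt{c_{mn}}\, \exp(i\phi_{mn}),$$

where $\phi_{mn}$ is a phase factor. It turns out that the phase factor is of no
consequence in this case, but has an important influence when a similar analysis
is applied to the addition of three or more correlated speckle patterns
(GOODMAN [1975a, c]). The hermitian coherency matrix whose elements are
$\langle A_m A_n^* \rangle$ can therefore be written

$$\begin{pmatrix} \langle I_1 \rangle & (c_{12} \langle I_1 \rangle \langle I_2 \rangle)^{1/2}\, e^{i\phi_{12}} \\ (c_{12} \langle I_1 \rangle \langle I_2 \rangle)^{1/2}\, e^{-i\phi_{12}} & \langle I_2 \rangle \end{pmatrix}.$$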

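In problems where we have the sum of correlated random processes,
the standard method of solution involves expanding the random process
in an orthogonal series such that the sum is of independent random processes;
standard theorems of probability theory can then be applied to yield $p(I)$,
the probability density function. Applying this to our example yields
(GOODMAN [1975a, c]), for distinct eigenvalues,

$$p(I) = \begin{cases} \dfrac{\exp(-I/\varepsilon_1) - \exp(-I/\varepsilon_2)}{\varepsilon_1 - \varepsilon_2}, & I \ge 0, \\ 0 & \text{otherwise}, \end{cases}$$

or for identical eigenvalues, $\varepsilon_1 = \varepsilon_2 = \varepsilon_0$,

$$p(I) = \begin{cases} \dfrac{I}{\varepsilon_0^2} \exp(-I/\varepsilon_0), & I \ge 0, \\ 0 & \text{otherwise}, \end{cases}$$

where $\varepsilon_n$ are the eigenvalues of the coherency matrix. An example is given
in Fig. 16 for the cases of $c_{12} = 0$ ($\varepsilon_1 = \varepsilon_2 = 0.5$), $c_{12} = 0.6$ ($\varepsilon_1 = 0.887$,
$\varepsilon_2 = 0.113$) and $c_{12} = 1$ ($\varepsilon_1 = 1$, $\varepsilon_2 = 0$) with the assumption that the two
mean intensities are equal.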
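The eigenvalues quoted for Fig. 16 can be reproduced with a few lines of code. The sketch below assumes a 2 x 2 hermitian coherency matrix with diagonal elements $\langle I_1 \rangle$, $\langle I_2 \rangle$ and an off-diagonal element of modulus $(c_{12} \langle I_1 \rangle \langle I_2 \rangle)^{1/2}$, and it uses the standard density of the sum of two independent negative-exponential variables whose means are the eigenvalues. The function names are hypothetical.

```python
import math


def coherency_eigenvalues(mean_i1, mean_i2, c12):
    """Eigenvalues of the 2x2 coherency matrix.

    The off-diagonal modulus is sqrt(c12 * <I1> * <I2>); the phase of the
    off-diagonal term does not affect the eigenvalues.
    """
    half_trace = 0.5 * (mean_i1 + mean_i2)
    det = mean_i1 * mean_i2 * (1.0 - c12)
    root = math.sqrt(max(half_trace * half_trace - det, 0.0))
    return half_trace + root, half_trace - root


def intensity_pdf(intensity, eps1, eps2, tol=1e-12):
    """Density of the sum of two independent exponential intensities
    with means eps1 >= eps2 (the eigenvalues)."""
    if intensity < 0.0:
        return 0.0
    if eps2 < tol:  # fully correlated patterns: a single exponential
        return math.exp(-intensity / eps1) / eps1
    if abs(eps1 - eps2) < tol:  # identical eigenvalues
        return intensity * math.exp(-intensity / eps1) / (eps1 * eps1)
    return (math.exp(-intensity / eps1) - math.exp(-intensity / eps2)) / (eps1 - eps2)


if __name__ == "__main__":
    for c12 in (0.0, 0.6, 1.0):
        e1, e2 = coherency_eigenvalues(0.5, 0.5, c12)
        print(f"c12={c12}: eps1={e1:.3f}, eps2={e2:.3f}, p(1)={intensity_pdf(1.0, e1, e2):.3f}")
```

With equal mean intensities of 0.5 the eigenvalues are $\tfrac{1}{2}(1 \pm \sqrt{c_{12}})$, which gives 0.887 and 0.113 for $c_{12} = 0.6$, in agreement with the values quoted in the text.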

Fig. 16. The probability density functions for the intensity of the sum of two partially
correlated normal speckle patterns with equal mean intensity and $c_{12}$ = 0, 0.6 and 1.0
(GOODMAN [1975a, c]).


If only a small number of scatterers contribute to the amplitude at a
point in the observation plane then the central limit theorem cannot be
applied and the complex amplitude will not have a complex Gaussian
distribution. In this case the statistics of the scattered field will depend on
the statistics of the scattering medium. This dependence is discussed below.
A small number of scatterers also implies that the autocorrelation function
of the amplitude of the scattered wave immediately behind the scatterer
is not small compared to the dimensions of the illuminated area. In all
cases considered so far in this article we have assumed that this correlation
function is small in extent relative to the illuminated area (i.e. diffusers
of relatively fine structure) and so we first consider the effect of allowing
this autocorrelation function to increase in area, but at the same time
assuming that the number of correlation areas is sufficiently large to apply
the central limit theorem.
Consider the optical system in Fig. 17. A plane wave is incident on a
random medium which imposes a complex amplitude distribution $A(\xi)$ at
the entrance pupil of a lens whose pupil function is $H(\xi)$. If the lens is
aberration-free then the amplitude in the observation (Fraunhofer) plane
is the finite Fourier transform of $A(\xi)$, but in general the amplitude in the
observation plane is the Fourier transform of the product $A(\xi) H(\xi)$. We
have included this dependence on the pupil function as this problem first
arose in an application in astronomy where the atmosphere is the random
medium and the effect of telescope aberrations is of practical interest
(DAINTY [1973, 1974]). The autocorrelation function of the amplitude $A(\xi)$,
denoted $C_A(\xi)$, is independent of position in the pupil if the random
wavefront $A(\xi)$ is statistically stationary. Because the extent of $C_A(\xi)$ is
not considered to be small compared with the limiting aperture of the pupil
function, the intensity distribution in the observation plane (the Fraunhofer
plane of $A(\xi)$ and also the image plane of the point source) is concentrated
in a region near to the optic axis; within this region a speckle-like pattern
is seen.

Fig. 17. Formation of a speckle pattern in the Fraunhofer plane for a random complex
amplitude with a correlation function of non-zero extent. The random medium imposes the
amplitude $A(\xi)$ at the pupil $H(\xi)$ and the intensity $I(x)$ is observed.
The Wiener spectrum of the intensity distribution is equal to $\langle |i(u)|^2 \rangle$,
where $i(u)$ is the Fourier transform of the intensity distribution $I(x)$. Thus
the Wiener spectrum is given by

$$W(u) = \iint H^*(\xi_1) H(\xi_1 + \xi) H(\xi_2) H^*(\xi_2 + \xi)\, \langle A^*(\xi_1) A(\xi_1 + \xi) A(\xi_2) A^*(\xi_2 + \xi) \rangle\, d\xi_1\, d\xi_2, \tag{55}$$

where $u = \xi/\lambda f$. It is clear from this equation that the Wiener spectrum of
the speckle depends on the fourth order moment of the complex amplitude
of the scattered field. If this complex amplitude is a white noise process,
the fourth order moment can be expressed in terms of delta functions, and
the expression for the Wiener spectrum reduces to that given by either
eq. (15) or (21) in §2. To consider the effect of the autocorrelation of the
complex amplitude on the Wiener spectrum of the speckle intensity, we
consider a particular example in which it is assumed that $A(\xi)$ is a complex
Gaussian process (DAINTY [1973]). The fourth order moment can then
be written in terms of the autocorrelation function and eq. (55) becomes,
after some rearrangement, eq. (56).

This expression is essentially the same as one given by ENLOE [1967] for
an analogous case where a partially resolved diffuser is imaged by an optical
system and also as eq. (59) below. The first term on the right-hand side of
eq. (56) is governed by the average intensity in the observation plane and
the second term describes the intensity fluctuation. For a white noise
field $C_A(\xi) = \delta(\xi)$, and the intensity fluctuation term becomes identical to
that given in eqs. (15) and (21) in §2.
In Fig. 18 we give an example of the effect of $C_A(\xi)$ on the Wiener spectrum
of the intensity fluctuation (second term in eq. (56)). The autocorrelation
function is assumed (for computational convenience) to have a top-hat
form with a diameter that is some fraction $R$ of the diameter of the optical
system (DAINTY [1974]). In Fig. 18(a) and (b) the effect of defocus is
shown for $R = 0.2$ and $R = 0.1$ respectively. Clearly the speckle pattern
is independent of lens aberration only in the limit $R \to 0$ (i.e. a white
noise field $A(\xi)$).
Fig. 18. Wiener spectra of the intensity fluctuation for speckle patterns observed in the
Fraunhofer plane of a defocussed lens for defocus values of 0, 1/π, 2/π and 4/π wavelengths.
Upper (a), R = 0.2; lower (b), R = 0.1. The parameter R is equal to the diameter of the
correlation area relative to that of the optical system (DAINTY [1974]).

The problem of finding the first order statistics when $N$ is finite was
first examined by Lord RAYLEIGH [1919] and more recently his results
were applied by BURCH [1969]. More general analyses with application
to the scattering by liquid crystals have been given by JAKEMAN and PUSEY
[1973a, b, 1975] and a summary of their results is presented below.
It is assumed that the scatterer is a deep phase screen such that the phase
of the scattered light immediately after the screen has a Gaussian
distribution with a variance $\sigma_\phi^2 \gg 1$. The complex amplitude in the Fraunhofer
plane is given by

$$A(x, y) = \sum_{j=1}^{N} a_j(x, y) \exp(i\phi_j), \tag{57}$$

where $N$ is finite, $a_j(x, y)$ is a scattering factor, and $\phi_j$ is the random phase
from the jth scatterer and is independent of the phases of all other scatterers.
Equation (57) describes a finite random walk with variable step length
studied by Lord RAYLEIGH [1919] and others. Assuming that all the $a_j$
can be described by the same distribution function, the following expressions
for the first two moments of the intensity can be found:

$$\langle I \rangle = N \langle a^2 \rangle,$$
$$\langle I^2 \rangle = N \langle a^4 \rangle + 2N(N - 1) \langle a^2 \rangle^2.$$
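These two moments are easy to check by simulation. The sketch below assumes, for simplicity, unit step lengths ($a_j = 1$) and independent phases uniformly distributed over $(-\pi, \pi)$, which is the limiting behaviour of a deep phase screen; under these assumptions $\langle I \rangle = N$ and $\langle I^2 \rangle = 2N^2 - N$, so the variance ratio is $1 - 1/N$. The function names and the number of trials are hypothetical.

```python
import cmath
import math
import random


def simulate_intensity_moments(n_scatterers, n_trials=20000, seed=7):
    """Monte Carlo estimate of <I> and <I^2> for a finite random walk.

    Unit step lengths and independent phases uniform on (-pi, pi)
    are assumed.
    """
    rng = random.Random(seed)
    sum_i = sum_i2 = 0.0
    for _ in range(n_trials):
        amplitude = sum(cmath.exp(1j * rng.uniform(-math.pi, math.pi))
                        for _ in range(n_scatterers))
        intensity = abs(amplitude) ** 2
        sum_i += intensity
        sum_i2 += intensity * intensity
    return sum_i / n_trials, sum_i2 / n_trials


def predicted_moments(n_scatterers, a2=1.0, a4=1.0):
    """<I> = N<a^2> and <I^2> = N<a^4> + 2N(N-1)<a^2>^2."""
    n = n_scatterers
    return n * a2, n * a4 + 2.0 * n * (n - 1) * a2 * a2


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(n, simulate_intensity_moments(n), predicted_moments(n))
```

The departure of the variance ratio from unity, here $-1/N$, is the simplest example of the non-Gaussian behaviour discussed in this section.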

Higher moments can also be found. Evaluating the moments of $a_j$ yields
an expression, eq. (58), for the ratio of the variance of the speckle intensity
to the square of the mean intensity, $\sigma^2/\langle I \rangle^2$ (JAKEMAN and PUSEY [1975]),
in which $k = 2\pi/\lambda$, $r_\phi$ is the phase correlation length, and $\theta$ is the angle
of observation (for normal incidence). Clearly as $N \to \infty$, $\sigma^2/\langle I \rangle^2$ tends to
the Gaussian value of unity. However for finite $N$ this ratio (the speckle
contrast) depends on the angle of observation, the variance $\sigma_\phi^2$ and
correlation scale length $r_\phi$ of the phase immediately after the scatterer
and of course on $N$ itself. Some experimental results (PUSEY and JAKEMAN
[1975]) for the variation of speckle contrast with angle of observation $\theta$
and the number of scatterers for a liquid crystal scatterer are given in Figs.
19 and 20 respectively; these broadly confirm eq. (58).

Fig. 19. Angular dependence of the speckle contrast for a small number of scatterers
(PUSEY and JAKEMAN [1975]).

Fig. 20. Dependence of speckle contrast on illuminated area of diffuser for a fixed size of
inhomogeneity (PUSEY and JAKEMAN [1975]).

The normalised angular autocorrelation function of the intensity,
$C(\theta, \theta')$, can also be found using eq. (57); evaluating the appropriate moments
of $a_j$ yields (JAKEMAN and PUSEY)

$$C(\theta, \theta') = C_n(\Delta\theta) + \cdots \exp\{-k^2 r_\phi^2 \Delta\theta^2 / 16\}, \tag{59}$$

where small $\theta$ and $\theta'$ have been assumed for simplicity, $\Delta\theta = \theta - \theta'$, and
where $C_n(\Delta\theta)$ is the autocorrelation function of the normal speckle pattern
produced when $N \to \infty$ as given by eq. (14). Clearly eq. (59) gives the
correct result of $C(\theta, \theta') \to C_n(\Delta\theta)$ as $N \to \infty$. However for finite $N$ two
characteristic scale lengths appear in the autocorrelation function. The first
is the normal speckle size governed by the total extent of the scattering
volume, whilst the second depends on the variance and scale length of the
phase immediately behind the scatterer. This result is essentially the same
as that given in eq. (56) but derived and used in a different context (DAINTY
[1973]). Some experimental measurements of $C(\theta, \theta')$ in the scattering by
liquid crystals are shown in Fig. 21 and broadly confirm the above
theoretical results. The statistics of speckle patterns produced when only a finite
number of scatterers contribute to the pattern at any point clearly depend
on the properties of the scattering surface and may be of use in determining
these properties.

Fig. 21. Angular autocorrelation functions in the Fraunhofer plane for speckle patterns
produced by many scatterers (100 V) and a few scatterers (20 V) (PUSEY and JAKEMAN [1975]).


The statistics of speckle patterns considered throughout this article have
been derived on the assumption that the surface roughness is greater than
one wavelength. If $\sigma_h < \lambda$ then the speckle pattern will contain information
on the surface properties; several authors have suggested that useful surface
parameter information is contained in speckle patterns produced by
slightly rough surfaces (ALLEN and JONES [1963], CRANE [1970], GOODMAN
[1973, 1975b, c], PEDERSEN [1974], FUJII and ASAKURA [1974b, c], WELFORD
[1975]).
It is convenient to first examine the speckle patterns formed in coherent
light in the image plane of a slightly rough diffuser. Such a speckle pattern
can be considered to consist of a uniform beam arising from a specularly
reflected component added coherently to a diffusely reflected beam which
forms a normal speckle pattern. The relative intensities of the two
components will depend on the properties of the scatterer. The probability
density function of the intensity of a speckle pattern produced in the image
plane of a slightly rough diffuser will be similar to that obtained when a
uniform beam is added coherently to a speckle pattern. This problem has
already been investigated by GOODMAN [1967] and DAINTY [1972]. The
real and imaginary parts of the complex amplitude in the speckle pattern
have a Gaussian distribution with non-zero mean and the distribution of
intensity and phase can be found using the appropriate probability
transformation (DAINTY [1972]).
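The effect of the specular component on the contrast can be illustrated with a short simulation. The sketch below adds a constant amplitude to a circular complex Gaussian field of unit mean intensity and compares the estimated contrast with $\sqrt{1 + 2r}/(1 + r)$, the value that follows from the mean $1 + r$ and variance $1 + 2r$ of the resulting intensity, where $r$ is the specular/diffuse beam ratio. The function names, sample sizes and seed are hypothetical, and the simulation is not taken from the papers cited.

```python
import math
import random


def simulate_contrast(beam_ratio, n_samples=40000, seed=3):
    """Monte Carlo contrast sigma_I/<I> for a uniform beam plus normal speckle.

    beam_ratio = I_specular / <I_diffuse>. The diffuse field is circular
    complex Gaussian with unit mean intensity.
    """
    rng = random.Random(seed)
    specular = math.sqrt(beam_ratio)
    std = math.sqrt(0.5)  # each quadrature has variance 1/2
    total = total_sq = 0.0
    for _ in range(n_samples):
        re = specular + rng.gauss(0.0, std)
        im = rng.gauss(0.0, std)
        intensity = re * re + im * im
        total += intensity
        total_sq += intensity * intensity
    mean = total / n_samples
    variance = total_sq / n_samples - mean * mean
    return math.sqrt(variance) / mean


def predicted_contrast(beam_ratio):
    """sqrt(1 + 2r) / (1 + r) for beam ratio r."""
    return math.sqrt(1.0 + 2.0 * beam_ratio) / (1.0 + beam_ratio)


if __name__ == "__main__":
    for r in (0.0, 1.0, 4.0, 10.0):
        print(r, round(simulate_contrast(r), 3), round(predicted_contrast(r), 3))
```

For $r = 0$ the contrast is unity, as for a normal speckle pattern, and it falls steadily as the specular component grows.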
Fig. 22. (a) Experimental probability distributions of speckle intensity in the images of
slightly rough diffusers (FUJII and ASAKURA [1974b]). (b) Probability density functions for
a range of specular/diffuse beam ratios (DAINTY [1972]).

In Fig. 22(a) we show some experimental measurements of the probability
distribution of intensity $p(I)$ for four surfaces, and in Fig. 22(b)
theoretical curves for $p(I)$ are shown for a range of specular/diffuse beam
ratios. The agreement between the overall forms of the two sets of curves
is good and indicates that the simple approach outlined above may be
adequate for practical purposes. The main difficulty lies in establishing a
link between the surface parameters and the ratio of specular/diffuse
transmittance (or reflectance). In order to establish this link it is again necessary
to use a surface model that gives a Gaussian distribution of phase in the
scattered wave immediately after the diffuser; the standard deviation of
the phase is $\sigma_\phi$ and the scale length of the phase correlation function is $r_\phi$.
Using this model FUJII and ASAKURA [1974c] evaluated $\sigma/\langle I \rangle$ as a function
of both the relevant surface parameters and the degree of spatial
coherence.
Their theoretical and experimental results are summarised in Fig. 23.
The surface parameter of importance is (for reflected light) the roughness
characteristic $\sigma_\phi^2/r_\phi$ of eq. (60).
The final term on the right hand side of eq. (60), $(\sigma_h/r_\phi)$, is closely related
to the r.m.s. gradient of the surface height distribution and this equation
indicates that a simple measure of speckle contrast in the image cannot
uniquely determine the surface roughness.
Fig. 23. Speckle contrast as a function of roughness characteristic $\sigma_\phi^2/r_\phi$ for four
conditions of partial spatial coherence (a) - (d) (FUJII and ASAKURA [1974c]).

Recently OHTSUBO and ASAKURA [1975] have observed that the speckle
contrast is somewhat smaller in the image plane than in out-of-focus planes.
GOODMAN [1975b, c] has shown that this is because the scattered field
does not have circular Gaussian statistics (as assumed above) and that
the degree of non-circularity depends on the surface roughness. The dip
in speckle contrast through focus is due to a change from circular to
non-circular statistics and back again.
It may also be useful to study the intensity at the centre of the Fraunhofer
plane within the first bright ring of the Airy disc for a slightly rough
surface. A similar statistical variation to that shown in Fig. 22 would be
obtained, but in this case one would not expect on simple physical grounds
any strong dependence on the r.m.s. gradient of the surface height. Further
advances are awaited on this aspect of scattering from rough surfaces.

§ 5. Concluding Remarks
We began this review article by stressing that in a large number of practical
situations normal speckle patterns are produced; the statistics of these
speckle patterns are well-established both theoretically and experimentally
and in particular they do not depend on the detailed scattering properties
of the scattering medium. When examining the effects of polychromatic
illumination on speckle statistics we found that the value of the r.m.s. height
variation of a scattering surface strongly influenced the statistical properties
of the scattered intensity. Finally in §4 it was seen that the general
relationship between the scattered intensity and the scattering surface may be very
complicated, even for surfaces with a Gaussian distribution of surface
heights. However, it must be stressed that through the whole review we
have glossed over the details of the interaction of an electromagnetic wave
with a scattering medium and have used relatively naive models of scattering
surfaces. Hopefully this unsatisfactory state of affairs in the subject as a
whole will be rectified by the advances of future workers.

I wish to thank Professor W. T. Welford of Imperial College, London
for his encouragement, advice and criticism of my work on speckle patterns
over a number of years. I also wish to thank Dr. G. Parry for his advice on
the original manuscript.

ALLEN,L. and D. G. C. JONES, 1963, Phys. Lett. 7,321.
ARCHBOLD, E. and A. E. ENNOS,1972, Opt. Acta 19, 253.
ASAKURA, T., H. FUJIIand K. MURATA,1972, Opt, Acta 19, 273.
BARAKAT, R., 1973a, Opt. Acta 20,729.

BARAKAT, R., 1973b, Opt. Commun. 8, 14.

BECKMANN, P., 1967, in: Progress in Optics, Vol. VI, ed. E. Wolf (North-Holland).
BECKMANN, P. and A. SPIZZICHINO, 1963, The Scattering of Electromagnetic Waves from
Rough Surfaces (Pergamon/MacMillan, London/New York).
BEDARD, G., J. C. CHANGand L. MANDEL,1967, Phys. Rev. 160, 1496.
BORN,M. and E. WOLF,1970,Principles ofoptics (fourth ed., Pergamon Press, London, N.Y.).
BUCHWALD, E., 1919, Ber. Deut. Phys. Ges. 21, 492.
BURCH,J. M., 1969, in: Optical Instruments and Techniques, ed. J. Home-Dickson (Oriel
Press, Newcastle-upon-Tyne).
BURCKHARDT, C. B., 1970, Bell Syst. Tech. J. 49, 309.
CHAKRABORTY, A. K., 1973, Opt. Commun. 8, 366.
CHANDRASEKHAR, S., 1943, Rev. Mod. Phys. 15, 1.
CHERNOV, L. A,, 1960, Wave Propagation in a Random Medium (Dover Press, New York).
CONDIE,M. A., 1966, Thesic Stanford University.
CRANE,R. B.. 1970, J. Opt. SOC.Am. 60,1658.
DAINTY, J. C.. 1970, Opt. Acta 17, 761.
DAINTY, J. C., 1971, Opt. Acta 18, 327.
DAINTY, J. C., 1972, J. Opt. SOC. Am. 62, 595.
DAINTY,J. C., 1973, Opt. Commun. 7, 129.
DAINTY,J. C., 1974, Mon. Not. R. Astr. SOC.169, 631.
ELBAUM, M., M. GREENBAUM and M. KING,1972, Opt. Commun. 5, 171.
ELIASSON, B. and F. M. MOTTIER,1971, J. Opt. SOC.Am. 61, 559.
ENLOE,L. H., 1967, Bell Syst. Tech. J. 46,1479.
EXNER,K., 1877, Sitzungsber. Kaiserl. Akad. Wiss. (wein) 76, 522.
EXNER,K., 1880, Wiedemanns. Ann. Physik 9, 239.
FUJII,H. and T. ASAKURA, 1973, Optik 39, 99.
FUJII,H. and T. ASAKURA, 1974a, Optik 39, 284.
FUJII,H. and T. ASAKURA, 1974b, Opt. Commun. 11, 35.
FUJII,H. and T. ASAKURA, 1974c, Opt. Commun. 12, 32.
GEORGE, N. and A. JAIN,1972, Opt. Commun. 6, 253.
GEORGE, N. and A. JAIN,1973, Appl. Opt. 12, 1202.
GEORGE, N. and A. JAN, 1974, Appl. Phys. 4, 201.
GEORGE, N., A. JAINand R. D. S. MELVILLE, 1975, Appl. Phys. 6, 65.
GERRITSEN, H. J., W. J. HANNAN and E. G. RAMBERG, 1968, Appl. Opt. 7, 2301.
GOLDFISCHER, L. I., 1965, J. Opt. SOC.Am. 55, 247.
GOODMAN, J. W., 1963, Stanford Electronics Lab. TR2303-1 (SEL-63-140).
GOODMAN, J. W.. 1965, Proc. I.E.E.E. 53, 1688.
GOODMAN, J. W., 1967, J. Opt. SOC.Am. 57, 493.
GOODMAN, J. W., 1973, in: Remote Techniques for Capillary Wave Measurement, eds. K. S.
Krishnan and N. A. Peppers, Stanford Research Institute Report, 2nd April.
GOODMAN, J. W., 1975a, Opt. Commun. 13, 244.
GOODMAN, J. W., 1975b, Opt. Commun. 14, 324.
GOODMAN, J. W., 1975c, in: Laser Speckle and Related Phenomena, ed. J, C. Dainty
(Springer-Verlag, Heidelberg).
HAAS,DE W. J., 1918a, Koninklijke Acad. van Wetenschappen (Amsterdam) 20, 1278.
HAAS,DE,W. J., 1918b, Ann. Phys. IV 57, 568.
HARIHARAN, P.; 1972, Opt. Acta 19, 791.
HCIHN,D. H., 1968, Optik 27, 353.
JAKEMAN, E., 1974, in: Photon Correlation and Light Beating Spectroscopy, eds. H. Z.
Cummins and E. R. Pike (Plenum Press, New York).
JAKEMAN, E. and P. N. PUSEY,1973a, Phys. Lett. 44A, 456.
JAKEMAN, E. and P. N. PUSEY,1973b, J. Phys. A 6, L88.
JAKEMAN, E. and P. N. PUSEY, 1975, J. Phys. A 8, 369.
U C , M. and A. J. F. SIEGERT, 1947, J. Appl. Phys. 18, 383.
LABEYRIE, A,, 1970, Astron. and Astrophys. 6, 85.
LAUE,M. VON,1914, Sitzungs. Akad. Wiss. (Berlin) 44, 1144.
LAUE,M. VON,1916, Mitt. Physik. Ges. (Zurich) 18, 90.
LAUE,M. VON,1917, Verhandl. Deut. Phys. Ges. 19, 19.
LOWENTHAL, S. and H. ARSENHAULT, 1970, J. Opt. SOC.Am. 60, 1478.
MCKECHNIE, T. S., 1974a, Optik 39, 258.
MCKECHNIE, T. S., 1974b, Thesis, University of London.
MCKECHNIE, T. S., 1975, in: Laser Speckle and Related Phenomena, ed. J. C. Dainty (Springer-
Verlag, Heidelberg).
MANDEL, L., 1959, Proc. Phys. SOC.74, 233.
MARTIENSSEN, W. and E. SPILLER,1965, Naturwiss. 52, 53.
MIDDLETON, D., 1960, An Introduction to Statistical Communication Theory (McGraw-
Hill, New York).
OHTSUBO, J. and T. ASAKURA, 1975, Opt. Commun. 12, 30.
PARRY, G., 1974a, Opt. Acta 21, 763.
PARRY, G., 1974b, Opt. Comniun. 12, 75.
PARRY. G., 1975a, Opt. Quant. Elect. 7, 3 18.
PARRY,G., 1975b, in: Laser Speckle and Related Phenomena, ed. J. C. Dainty (Springer-
Verlag, Heidelberg).
PEDERSEN, H. M., 1974, Opt. Commun. 12, 156.
PEDERSEN, H. M., 1975a, Opt. Acta 22, 15.
PEDERSEN, H. M., 1975b, Opt. Acta 22, 523.
PUSEY, P. N. and E. JAKEMAN, 1975, J. Phys. A 8, 392.
RAMACHANDRAN, G. N., 1943, Proc. Ind. Acad. Soc. (A) 18, 190.
RAMAN, C. V., 1919, Phil. Mag. 38, 568.
RAYLEIGH, Lord, 1880, Phil. Mag. 10, 73.
RAYLEIGH, Lord, 1918, Phil. Mag. 36, 429.
RAYLEIGH, Lord, 1919, Phil. Mag. 37, 321.
REED, I. S., 1962, I. R. E. Trans. Info. Th. IT-8, 194.
RICE, S. O., 1954, in: Selected Papers on Noise and Stochastic Processes, ed. N. Wax (Dover
Press, New York).
ROSS, G., 1969, Opt. Acta 16, 611.
ROSS, G., 1970, Phil. Trans. Roy. Soc. 268A, 177.
SCRIBOT, A. A., 1974, Opt. Commun. 11, 238.
SINGH, K., 1972, Pubblicazioni dell'Istituto Nazionale di Ottica 27, 197.
SLEPIAN, D., 1958, Bell Syst. Tech. J. 37, 163.
SPRAGUE, R. A., 1972, Appl. Opt. 11, 2811.
STROHBEHN, J. W., 1971, in: Progress in Optics IX, ed. E. Wolf (North-Holland).
SUZUKI, T. and R. HIOKI, 1966, Jap. J. Appl. Phys. 5, 807.
TATARSKI, V. I., 1961, Wave Propagation in a Turbulent Medium (Dover Press, New York).
TRIBILLON, G., 1974, Opt. Commun. 11, 172.
WELFORD, W. T., 1975, Opt. Quant. Elect. 7, 413.
YAMAGUCHI, I., 1972, Optik 35, 591 and 36, 173.
YAMAGUCHI, I., 1973, Optik 37, 141.




Observatoire de Paris,
92190 Meudon, France


§ 0. INTRODUCTION . . . . . . . . . . . . . . . . . . . 49
§ 1. ATMOSPHERIC OPTICS . . . . . . . . . . . . . . . . 51
§ 2. DIRECT INTERFEROMETRY . . . . . . . . . . . . . 59
ARRAY OF OPTICAL TELESCOPES . . . . . . . . . . 79
§ 6. INTENSITY INTERFEROMETRY . . . . . . . . . . . 82
§ 7. HETERODYNE INTERFEROMETRY . . . . . . . . . . 84
§ 8. CONCLUSIONS . . . . . . . . . . . . . . . . . . . . 84
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . 85
§ 0. Introduction

Continued progress in the art of building optical instruments resulted,
at about the time of Newton, Herschel and Foucault, in telescopes having
better optical quality than the atmosphere. From then on, the subsequent
evolution which led to today's giant telescopes failed to further improve
the resolving power of optical observations through the atmosphere.
A few obstinate physicists however succeeded in showing that the
atmospheric degradation of images can be avoided to some degree by using
special observing techniques. Used to observe a few favorably bright stars,
these techniques have indeed demonstrated principles which can yield high
resolution information beyond the normal atmospheric cut-off. The
techniques utilize approaches known as stellar interferometry, intensity
interferometry and lunar occultation. The state of the art in these fields has
been reviewed by Hanbury Brown in 1968. The present review is more
specifically concerned with the more recent developments of stellar inter-
ferometry, and the corresponding instruments, generally referred to as
coherent synthetic-aperture systems.
Pioneering work by A. A. Michelson established the feasibility of large
resolution gains, but lack of a mature technology long prevented followers
from even repeating Michelson's observations. Only very recently did the
progress of sensors and electronics allow improvements to the original
Michelson interferometer. In the last few years, the progress of coherent
optics allowed a better understanding of the speckle phenomenon in
stellar images, which in turn triggered spectacular developments in both
post- and pre-detection image processing methods. The operation of two
independent telescopes as a Michelson interferometer, achieved a few
months before this review was concluded, indicates that the technology is
now ready for operating large arrays of telescopes as optical synthetic
apertures. As already experienced at radio wavelengths, telescope arrays
are likely to improve enormously the resolution and the luminosity of
observations. Their operation from ground-based sites will help in designing
similar systems for space use.


The history of attempts to penetrate the turbulent atmosphere dates
back to 1868, when Fizeau proposed to observe Young's fringes in stellar
images. Following initial attempts by Stephan, this suggestion was brilliantly
developed by Michelson using aperture masks on the Yerkes refractor
and the 100-inch Hooker reflector. Later, Michelson, Pease and Hale
successively built two models of a synthetic aperture system having respectively
20 and 50 feet span, and in which the observer's eye served as the sensor.
Each system involved a pair of small apertures supported by a single steel
structure, designed to be as rigid as possible while being mobile around
the polar axis of an equatorial mount. After the equality of optical paths
had been carefully adjusted, fringes could be observed in stellar images.
As we shall see, the phase shifts introduced by the atmosphere induced
fast displacements of the fringes, which decreased their apparent contrast.
Nevertheless, fringe contrast measurements were possible and these gave
apparent stellar diameters for several bright red stars. The observations
were difficult and time-consuming, as can be realized from the fact that
no one has ever succeeded in repeating these experiments. Pease's measurements
of stellar diameters were however confirmed recently by different
methods benefiting from the large aperture of the 5-meter Hale telescope,
and thus easier to implement.
The level of activity in this field dropped near zero between 1930 and
1958. In 1958, Hanbury Brown and Twiss proposed the intensity inter-
ferometer method as an alternate approach intended to avoid the problems
arising with direct interferometry. In spite of the method's inherently low
sensitivity, the patient work of Hanbury Brown and Davis at Narrabri
(Australia) resulted in a major resolution breakthrough. The angular
diameters of the 32 brightest southern stars were measured, with resolution
of the order of a millisecond of arc. In the meanwhile, interferometric
observing techniques underwent considerable development at radio wave-
lengths. The synthetic-aperture arrays of radio telescopes built in this
period by Ryle and his collaborators began to surpass the traditional
one-second limit of resolution with optical telescopes, and quickly achieved
arc-second, as different groups began to use heterodyning techniques
with antennas spaced thousands of kilometers apart.
Interest in the direct synthetic-aperture approach at optical wavelengths
revived in the year 1965, presumably in relation with the general progress
of coherent optics and laboratory interferometry which the invention of
lasers has triggered. Summer schools held at Woods Hole in 1966 and
1967, followed by a symposium at Tucson in 1970, stimulated interest in
long-baseline interferometers such as proposed by MILLER [1966]. A
belief emerged that modern technology could solve the problems encoun-
tered by Michelson and Pease, and that coherent arrays of telescopes could
be operated at optical as well as at radio wavelengths. Several groups,
particularly in the USA, USSR and France, began to tackle these problems,
investigating in particular the use of artificial sensors to replace the human
eye.
In 1970, speckle interferometry established a generalized form of Fizeau
interferometry, which utilizes the full aperture of large telescopes. Used
at the 200-inch Hale telescope by GEZARI, LABEYRIE and STACHNIK [1972],
the method provided a confirmation of Peases measurements as well as
additional hints on color-dependence and limb-darkening effects relating
to the two largest apparent stellar disks, Betelgeuse and Mira Ceti. The
method also provided diffraction-limited measurements on a dozen binary
stars, unresolvable by conventional techniques. Following these initial
results, progress in the technology of sensors, and particularly the development
of photon-counting television by BOKSENBERG [1972], resulted in
considerable gains in both the sensitivity and the accuracy of data reduction.
At this point a number of articles discussed further possibilities for
improvements, particularly in the direction of image reconstruction.
Computer simulations as well as laboratory experiments produced partic-
ularly encouraging results in relation with the approach known as rubber
telescope imaging. In this approach, atmospheric disturbances are actively
corrected by servo devices requiring a bright star in the observed field.
In the meanwhile, work continued in the direction of long-baseline
systems. Preliminary designs were made by MILLER [1966] for a 100-meter
interferometer, while the present author attempted to produce interference
with two telescopes. The latter project reached its goal in recent months.
Work is now under way to build a coherent array of telescopes.

§ 1. Atmospheric Optics

The turbulent behaviour of the atmosphere has been extensively studied
in recent years. Chernov, Tatarsky, LEE and HARP [1969], LAWRENCE
and STROHBEHN [1970] and others developed the theory of wave propagation
in turbulent air, while HUFNAGEL and STANLEY [1964], FRIED [1966] and
KORFF, DRYDEN and MILLER [1972] studied the corresponding effects on
images. Experimental measurements of scintillation phenomena were
made by YOUNG [1969], MIKESELL [1955] and PROTHEROE [1961]. Phase
effects were observed by RODDIER and RODDIER [1973] and VERNIN and
RODDIER [1973].
A variety of mechanisms occurring in clear air account for the
disturbances which affect optical waves propagating horizontally or
vertically. This review is mainly concerned with vertical propagation,
and the relevant body of knowledge may be briefly summarized as follows :
Air at a uniform pressure and temperature is in principle an excellent
optical medium. Apart from the microscopic fluctuations responsible for
Rayleigh scattering, the Brownian motion alone introduces negligible
wide-scale phase fluctuations on optical waves. However, temperature
is not uniform in the atmosphere, and its fluctuations produce optical phase
fluctuations on propagating waves, in response to the local variations of
the temperature-sensitive refractive index. This temperature effect is
predominant at optical wavelengths, but other causes such as humidity or
partial pressure fluctuations may contribute at other wavelengths, such
as in the infra-red.
The temperature fluctuations are caused by a variety of turbulence
mechanisms, depending on weather conditions and terrain topography.
Among these are thermal convection, interface turbulence between layers
at different temperatures, wind shear turbulence, wake turbulence (downwind
from certain mountain peaks), etc. As reviewed by LUMLEY and
PANOVSKY [1964], the theory of turbulence evolved by fluid dynamics
since the 1950s ascribes a power law to the spatial power spectrum of
density fluctuations. The exponent value generally accepted is that obtained
theoretically by Kolmogoroff, namely -11/3. The scale size for turbules
ranges between an inner scale amounting to a millimeter or so in air at
sea level, and an outer scale on the order of 10 to 100 meters at sea level.
Reasonable agreement exists between theoretical results and experimental
measurements, although certain points remain to be clarified.
The turbulence parameters which are most significant with respect to
optical effects are : 1. the size and the size distribution of turbulence cells;
2. the RMS density fluctuation, integrated along the line of sight; 3. the
altitude of turbulent layers; 4. the apparent lifetime of turbules along a
given line of sight. In the presence of winds, this lifetime differs from the
intrinsic lifetime, since turbules are carried across the line-of-sight by the
wind.
Concerning the spatial distribution of turbulence cells, these are usually
organized in random isotropic fashion. However, they sometimes take the

form of waves resembling those on water surfaces, and similarly generated
by gravity oscillations at the interface between atmospheric layers having
different densities (GAVIOLA [1949]). The temporal frequency spectrum of
atmospheric turbulence is of particular concern to interferometric observers.
This is largely governed by wind speeds at ground and altitude levels. Lifetimes
vary between 1 and second, depending on weather conditions.


The optical wave received at ground level from a point source located
above the atmosphere has both phase and amplitude perturbations. The
amplitude fluctuation pattern, also referred to as the shadow pattern, is
responsible for the well known twinkling of stars. Because absorption
cannot account for the effect, it is generally interpreted as resulting from
the action of high-altitude turbulent layers according to elementary strioscopic
or Schlieren effects. Both geometric ray deflection and interference
may contribute to this effect.
A simple method for viewing the high-altitude turbulent layers responsible
for the shadow pattern was used extensively by BOYER [unpublished] for
wind monitoring purposes. It consists in observing the lunar limb with a
small, amateur-type telescope. By defocusing slightly the eyepiece outward,
it is possible to focus on the turbulent layers and to see the flowing stream
of turbules, their invisible phase pattern being translated into an amplitude
pattern by a Schlieren effect. The altitude, velocity and direction of the
turbulent flow may thus be determined. More elaborate methods involving
large mirrors and shadow pattern correlations have recently been
developed by Roddier and his collaborators (MARTIN, BORGNINO and
RODDIER [1975]).
Phase corrugations on the wave are generated by turbulence at all alti-
tudes, but with increased efficiency at decreasing altitudes. The low-altitude
phase cells are observed easily on the aperture of large telescopes when
conducting knife-edge tests. In addition to the steady cell flow outside of
the dome, a stationary turbulence component occurring inside the dome
is also generally observed. RMS values for the fast phase fluctuation have
been estimated to vary in the range from 1 to 20 radians. Available data
do not cover the long-wave components of phase corrugations.


On good observing nights, small telescopes in the 2 to 10 cm range of
aperture sizes are smaller than seeing cells. Thus the phase is nearly uniform
over their aperture. Consequently, these telescopes are nearly diffraction
limited, and star images have the appearance of the classical Airy disk.
Considering now the other extreme case of very large instruments, their
aperture may cover several thousand seeing cells, having random phases.
The Airy-disk pattern is completely destroyed under such conditions ; and
classical diffraction experiments, in accordance with elementary Fourier
transform analysis, show that the angular spread of the image is of the order
of λ/d, d being the characteristic size of seeing cells. 12-cm cells thus correspond
to one arc-second as the approximate width of the image projected
onto the sky plane; and this figure is rather typical of average observing
conditions with large instruments in the best observatory sites. Only very
exceptionally are seeing cells larger than 24 cm; and this explains why
larger instruments do not benefit from their theoretical resolution performance.
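The correspondence between cell size and image width quoted above can be checked with a few lines of arithmetic. The sketch below evaluates the angular spread λ/d; the wavelength of 0.55 µm is an assumption (the text does not specify one), and the function name is purely illustrative.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi   # about 206265

def seeing_angle_arcsec(wavelength_m, cell_size_m):
    """Angular spread lambda/d of the long-exposure image, in arc-seconds."""
    return wavelength_m / cell_size_m * ARCSEC_PER_RADIAN

wavelength = 0.55e-6   # assumed visible wavelength in metres (not given in the text)
for cell_size in (0.12, 0.24):
    angle = seeing_angle_arcsec(wavelength, cell_size)
    print(f"d = {cell_size * 100:.0f} cm -> image width about {angle:.2f} arc-second")
```

With these assumed values, 12-cm cells give an image slightly under one arc-second wide, and 24-cm cells about half that.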

Fig. 1. The corrugated optical wave and corresponding image: (a) the full aperture of a large
telescope produces a speckled spread function; (b) a Fizeau mask with two small apertures
produces Young's fringes in the image.

Some astronomers, especially binary star observers, have long remarked
on the existence of a fast-moving fine structure inside the typical one-second
image (Fig. 1). Recently (LABEYRIE [1970]), this structure has been interpreted
as a speckle phenomenon, the same as the well-known effect in diffused
laser beams, and this identification led to the interferometric method
called speckle interferometry (sect. 3.3). As is apparent in the recent
review of DAINTY [1976], the speckle phenomenon has been extensively
studied in recent years. In the astronomical case, it results from the
fact that any point inside the star image indeed receives coherent contributions
from many of the phase cells in the aperture plane. The resulting
amplitude at the image point considered is thus a sum of many vibrations
with random phases. Summing vibrations with random phases has been
a classical problem since Rayleigh: the squared modulus of the sum may
have any value between 0 and |Σa|², a being the amplitude of the individual
vibrations. In the image plane, the particular value found in the summing
process depends on the image point considered since individual vibration
phases vary when the point is displaced in the image plane by more than
λ/α, α being the angular aperture. The scale size for intensity variations in
the image plane is thus λ/α. As discussed by GOODMAN [1965], in the case
of laser speckles, and by KORFF, DRYDEN and MILLER [1972], in the
astronomical case, the addition of coherent, but randomly phased, vibra-
tions corresponds to a two-dimensional random walk in the complex
plane if component vibrations are represented by vectors in this plane.
The sum vector is distributed according to a Gaussian law, and the amplitude
A has a Rayleigh distribution of the form $(2A/\sigma^2)\exp(-A^2/\sigma^2)$,
where $\sigma^2$ is the mean intensity. The distribution of intensities I is
exponential, $\sigma^{-2}\exp(-I/\sigma^2)$. This result is independent
of the model assumed for seeing cells, provided random phase variations
are present. Indeed, the appearance and the second-order statistical properties
of laser speckle do not depend on the type of scatterer used. Different
scatterers, or qualities of seeing in the stellar case, change only the size of
the image envelope, without affecting the statistics of the fine speckle
pattern within it. This important property appreciably simplifies simulation
experiments carried out in the laboratory or on computers, since real
atmospheric seeing does not have to be exactly reproduced.

Fig. 2. Speckled images of a non-resolved star (Vega) obtained simultaneously in different
colors at the 200-inch telescope. Colors (and spectral bandwidths) are, going clockwise from
top left: red (500 Å), yellow (250 Å), green (250 Å) and blue (250 Å). Residual atmospheric
dispersion is apparent in the blue image. Exposure time is 0.01 second.

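The random-walk argument lends itself to a direct numerical check. The following is a minimal sketch, not a model of real seeing: it assumes unit-amplitude vibrations with independent phases uniform over 0 to 2π, and the number of cells and of image points are arbitrary choices. For exponentially distributed intensities one expects a contrast (standard deviation over mean) close to 1 and a fraction exp(-1) of points brighter than the mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 100       # assumed number of phase cells contributing to an image point
n_points = 20_000   # independent image points (or independent exposures)

# Each image point receives n_cells unit-amplitude vibrations with random phases.
phases = rng.uniform(0.0, 2.0 * np.pi, size=(n_points, n_cells))
amplitude = np.abs(np.exp(1j * phases).sum(axis=1))   # modulus A of the sum vector
intensity = amplitude ** 2                             # I = A^2

mean_intensity = intensity.mean()
contrast = intensity.std() / mean_intensity
bright_fraction = np.mean(intensity > mean_intensity)

print(f"mean intensity {mean_intensity:.1f} (number of cells: {n_cells})")
print(f"contrast {contrast:.2f} (exponential law predicts 1)")
print(f"fraction above the mean {bright_fraction:.3f} "
      f"(exponential law predicts {np.exp(-1):.3f})")
```

No intensity can exceed the square of the number of cells, which is the upper bound |Σa|² for unit amplitudes; values approaching it are extremely rare.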
The analysis applies also regardless of aperture shape, and in particular
to segmented or multiple apertures such as may be encountered in synthetic-aperture
systems (§ 2 and § 4). The shape of speckles however depends on
the aperture geometry: the aperture stop may be represented by a multiplicative
term applied to the infinite incident wave, and the speckles in the
image plane are thus convolved, in complex amplitudes, with the diffraction-limited
spread function. Speckled images recorded at the Palomar 200-inch
telescope are reproduced in Fig. 2. The laboratory equivalent of synthetic-aperture
systems involving several large telescopes produces the speckled
images in Fig. 3. These images were obtained as photographs of a monochromatic
point source, using a multi-aperture diaphragm and a mild
diffuser in front of the camera lens. Compared to the speckles from the
monolithic aperture, these contain an additional finer interference structure
which takes the form of fringes with two apertures, of a honeycomb
pattern with 3 apertures, etc.

Fig. 3. Synthetic aperture systems and corresponding spread functions: top row - apertures
consisting of 1, 2 and 6 telescopes, as well as giant monolithic aperture; middle row - corresponding
spread functions, in the absence of atmosphere and optical aberrations; bottom row -
short-exposure, monochromatic, spread-functions in the presence of the atmosphere. This
is a laboratory simulation result obtained by photographing a monochromatic point source
(a laser-illuminated pinhole) through a mild diffuser and aperture mask. The conditions
simulated correspond to 1.5-meter telescopes and typical 1 sec. seeing.

The number of speckles contained within the image envelope increases
with aperture size D as (D/d)². When dealing with low-altitude seeing,
no ensemble translation of speckles occurs in the image even when seeing
cells are carried as a rigid pattern by the wind flow. Instead, speckles
appear and disappear locally much like vapor bubbles at the surface of
boiling water. It is apparent from the video images recorded at Palomar
that the lifetime of speckles increases at increasing wavelengths, from near
ultra-violet to near infra-red.
In small telescopes, few seeing cells are present across the aperture at
any instant, which results in few speckles in the images. Consistently with
simple statistics, experience shows that such images are subject to rapid
wander and size fluctuations. In such cases, image selection or exposure
triggering techniques such as used by RATT [1957], may improve markedly
the image sharpness. Telescopes up to a meter in aperture diameter may
thus produce nearly diffraction-limited images during a few milliseconds
every hour or so. Using short-exposure electronographic photographs of
binary stars, RÖSCH, WLÉRICK and BOUSSUGE [1961] have shown that the
instantaneous speckle patterns are identical for closely spaced pairs but
different in the case of widely spaced pairs. The angular extent over which
speckle is invariant, called the isoplanatic patch, is on the order of 3 to
10 arc-seconds. Patch size is mainly dependent upon the presence of
turbulent layers at high altitudes, since these layers are crossed in different
regions by the light beams coming from different stars into the telescope.
Temporal and spatial coherence have both been assumed in the above
analysis. The size d of speckles being proportional to wavelength, some
degree of monochromaticity is indeed required for purity of the pattern.
If atmospheric and telescopic aberrations are smaller than the wavelength
λ, the requirement may be written λ/Δλ > D/d. In white light, at the 200-inch
telescope, the image of a star close to zenith shows contrasted speckles

at the center of the typical one-second envelope. The few central speckles
are surrounded by radially-oriented coloured streaks generated by the
chromatic spreading of speckles at higher interference orders. This is well
observable visually in conditions of moderate wind speed, using a strong
eyepiece. It takes a filter with 200 Ångström or narrower bandpass to
observe pure speckles all the way to the edge of a one-second image. When
pointing at a star lower toward the horizon, dispersion tends to elongate
the speckles and it takes a much narrower filter to remove this effect unless
some form of prismatic compensation is used. The experimentally-observed
proportionality of speckle size to wavelength supports the diffraction-
interference interpretation of the speckle phenomenon, as opposed to the
ray-deflection interpretation which had sometimes been proposed before
the advent of lasers and speckle theory.
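As a rough numerical illustration of the monochromaticity requirement, read here as λ/Δλ > D/d (an assumed reading of the relation), the sketch below evaluates the number of speckles and the admissible bandwidth for a 200-inch aperture. The wavelength and the seeing-cell size are assumptions chosen for illustration, not values stated in the text.

```python
# Worked numbers for the requirement lambda / delta_lambda > D / d.
wavelength_angstrom = 5000.0   # assumed observing wavelength
aperture_m = 5.08              # 200-inch aperture
cell_m = 0.12                  # assumed seeing-cell size (roughly one-second seeing)

speckles_across = aperture_m / cell_m          # D/d, speckles across the envelope
speckles_in_image = speckles_across ** 2       # (D/d)^2, speckles inside the envelope
max_bandwidth_angstrom = wavelength_angstrom / speckles_across

print(f"D/d = {speckles_across:.0f}, (D/d)^2 = {speckles_in_image:.0f}")
print(f"bandwidth below about {max_bandwidth_angstrom:.0f} Angstrom")
```

Under these assumptions the admissible bandwidth comes out near one hundred Ångström, the same order of magnitude as the 200 Ångström filter mentioned above.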
The optical components of large astronomical telescopes are rarely
made to the accuracy meeting the Rayleigh criterion. Residual coma,
spherical aberration and astigmatism amounting to 0.5 arc-second are
usually tolerated since the effect of the atmosphere is even worse. Such
transverse aberrations, as long as they remain inferior to seeing effects,
have no influence on the speckle patterns : somewhat paradoxically, bright
speckles retain their similarity with Airy peaks in the presence of aberrations
which would destroy the Airy peak if they acted alone.
Realistic simulations of astronomical speckle phenomena may be carried
out in the laboratory using a bright artificial star, a sheet of polyethylene
or other diffusing material representing the atmosphere, a lens aperture to
represent the telescope, and filters. In addition to laboratory simulations,
a number of authors have used computers to derive image spread functions
by Fourier-transforming random distributions of phase cells generated
across some circular aperture. In all cases, the speckle patterns obtained
resemble closely those obtained with large telescopes.
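A computer simulation of this kind can be sketched in a few lines. The version below is only a crude stand-in for real seeing: it assumes square phase cells of fixed size, each with an independent phase uniform over 0 to 2π, and the grid dimensions are arbitrary. The function names are illustrative and not taken from any of the works cited.

```python
import numpy as np

def point_source_image(n=256, aperture_px=64, cell_px=8, turbulent=True, rng=None):
    """Monochromatic image of a point source through a circular aperture.

    With turbulent=True every cell_px x cell_px cell of the incoming wave gets an
    independent random phase (assumed toy model of the atmosphere).
    """
    rng = np.random.default_rng() if rng is None else rng
    y, x = np.indices((n, n)) - n // 2
    pupil = (x**2 + y**2) <= (aperture_px / 2) ** 2
    if turbulent:
        cells = rng.uniform(0.0, 2.0 * np.pi, size=(n // cell_px, n // cell_px))
        phase = np.kron(cells, np.ones((cell_px, cell_px)))
    else:
        phase = np.zeros((n, n))
    wave = pupil * np.exp(1j * phase)
    # The spread function is the squared modulus of the Fourier transform of the wave.
    image = np.abs(np.fft.fftshift(np.fft.fft2(wave))) ** 2
    return image / image.sum()

def central_energy(image, half_width=5):
    """Fraction of the light falling in a small box around the image centre."""
    c = image.shape[0] // 2
    return image[c - half_width:c + half_width + 1,
                 c - half_width:c + half_width + 1].sum()

rng = np.random.default_rng(3)
airy = point_source_image(turbulent=False)
speckled = point_source_image(rng=rng)
print(f"energy in central box: Airy {central_energy(airy):.2f}, "
      f"speckled {central_energy(speckled):.2f}")
```

The unperturbed case concentrates most of the light in the central Airy peak, whereas the perturbed wave spreads it over an envelope whose width is set by the cell size, filled with speckles whose size is set by the aperture.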


The speckled or fringed structure in short-exposure images disappears
when dealing with long exposures, due to the averaging of energy distributions
in time-dependent speckle patterns. The short-exposure MTF, defined
as the Fourier transform of the intensity distribution in the instantaneous
spread function, has been studied experimentally (GEZARI, LABEYRIE and
STACHNIK [1972]) and theoretically (KORFF, DRYDEN and MILLER [1972],
DAINTY [1973]). As shown in Fig. 4, it features a central peak and surrounding
feet extending all the way to the diffraction-limited cut-off frequency.


Fig. 4. Telescope-atmosphere MTF for a large aperture: A, diffraction-limited MTF; B, single
short exposure; C, long exposure; D, quadratic average from many short exposures.

It has been shown that the RMS profile, obtained by averaging the squared
modulus of successive Fourier transforms, is identical to that for the
diffraction-limited MTF except in the central region where a central peak
is added.
The long-exposure MTF, obtained as the simple time-average of the
short exposure MTF, or equivalently as the Fourier transform of the
long-exposure image, consists of the central peak only, the feet being
cancelled in the averaging process.
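The difference between the simple average and the quadratic average of short-exposure transfer functions can be illustrated numerically. The sketch below is one-dimensional and uses an assumed toy atmosphere (independent uniform random phases on cells of fixed size); the transfer function of each exposure is computed as the autocorrelation of the pupil wave. The labels A to D refer to the curves of Fig. 4 only by analogy.

```python
import numpy as np

def otf_1d(wave):
    """Transfer function of one exposure: autocorrelation of the pupil wave."""
    n = wave.size
    spectrum = np.fft.fft(wave, 2 * n)            # zero-padding avoids wrap-around
    acf = np.fft.ifft(np.abs(spectrum) ** 2)[:n]  # lags 0 .. n-1
    return acf / acf[0].real                      # unity at zero frequency

rng = np.random.default_rng(1)
n_pupil, cell, n_frames = 256, 8, 400             # assumed toy dimensions

otfs = np.empty((n_frames, n_pupil), dtype=complex)
for k in range(n_frames):
    phase = np.repeat(rng.uniform(0.0, 2.0 * np.pi, n_pupil // cell), cell)
    otfs[k] = otf_1d(np.exp(1j * phase))

diffraction_mtf = 1.0 - np.arange(n_pupil) / n_pupil   # A: unperturbed 1-D aperture
single_mtf = np.abs(otfs[0])                           # B: one short exposure
long_exposure_mtf = np.abs(otfs.mean(axis=0))          # C: simple time average
rms_mtf = np.sqrt((np.abs(otfs) ** 2).mean(axis=0))    # D: quadratic average

f = n_pupil // 2   # a frequency well beyond the seeing cut-off
print(f"at f = {f}: long exposure {long_exposure_mtf[f]:.3f}, "
      f"quadratic average {rms_mtf[f]:.3f}, diffraction limit {diffraction_mtf[f]:.3f}")
```

In this toy model the simple average collapses towards zero beyond the seeing cut-off, while the quadratic average keeps a non-zero response out to the aperture cut-off; its square is expected to follow the diffraction-limited curve scaled by the ratio of cell size to aperture size.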
Rather than being Fourier transformed, the star image may be auto-
correlated. The central peak appearing in the short-exposure case may be
shown to be identical to what would be obtained under diffraction-limited
conditions. It follows that the speckle pattern has certain similarities with
a random array of diffraction-limited images, i.e. Airy peaks in the case
of a circular aperture. Under certain conditions of oceanic storms where
organized swell is replaced by random waves, sailors have learned to
fear the sudden appearance of monster waves, much higher than average
(ADLARD COLES [1967]). Similarly, and because speckled electromagnetic
fields are governed by the same Rayleigh statistics as gravity-induced
oscillations at liquid surfaces, there is a rare occurrence of exceptionally
bright speckles in astronomical images. These are not dangerous, fortu-
nately, and may in fact be exploited for diffraction-limited imaging purposes.

§ 2. Direct Interferometry

As discussed in section 1.4, the stellar images produced by large telescopes
generally feature an envelope inside which a finer interference structure is
present under adequate conditions of temporal coherence and of exposure.
The angular size of the envelope is on the order of one to several seconds
of arc depending on atmospheric conditions, while the scale size i for the
smallest interference details is inversely related to the aperture size d
according to the relation i = λ/d. With existing large telescopes such as
the 200-inch telescope, the interference structure is thus 50 times finer
than the one-second image. Ordinary observing procedures ignore the
interference structure and thereby do not take any advantage of its presence.
Their angular resolution is therefore limited to about one or two seconds,
although certain exposures have sometimes succeeded in showing 0.3 sec
detail on bright objects such as the sun or planets (it has sometimes been
attempted to improve the resolution of long-exposure images with decon-
volution procedures, but the more or less Gaussian spread function in this
case does not lend itself to significant resolution gains with the noise levels
encountered in typical astronomical images).
Interferometric observing, on the other hand, concentrates more on the
fine interference structure than on the envelope. Indeed, this fine structure
contains information on object structures having a comparatively fine scale,
and which escape detection if one observes only the envelope. This results,
according to widely accepted results of the theory of coherence, from the
fact that the interference structure is a spread function which is convolved,
in intensities, with the function representing the brightness distribution on
the object. Indeed, different object points (within the isoplanatic patch)
contribute identical patterns in the image. These patterns are translated in
the image plane relative to each other in accordance with the source geom-
etry. For ordinary sources (i.e. spatially incoherent sources, which have a
coherence time shorter than the exposure time) these contributions add in
energy, thereby reducing the contrast or visibility of the interference
features according to a convolution operation.
Unlike the Gaussian spread functions mentioned earlier, the more com-
plicated interference structures lend themselves to successful deconvolutions
down to the size of the finest interference detail, i.e. down to the diffraction
limit. The spread function is, however, not completely known, being time-dependent,
and it is thus generally impossible to directly apply deconvolution
procedures. Instead, it is possible to use power spectrum or autocorrelation
analysis, but these techniques extract only part of the high
resolution information.
When dealing in particular with the simple and historically important
case of a Fizeau-type apertured telescope, on which a mask reduces the
aperture to a pair of small holes, the interference features consist of Young's
fringes (Fig. 1b). The fringes oscillate in response to atmospheric phase
shifts, but remain detectable visually under most circumstances of moderate
wind. Their contrast is, however, decreased in case of celestial objects
larger than the fringe spacing projected onto the sky. The RMS fringe
contrast, measured at different aperture separations, can be used to obtain
a visibility curve which relates to the object's structure through a Fourier
transform. This is the type of analysis which was used successfully by
MICHELSON and PEASE [1921] to obtain the first measurements of stellar
diameters with baselines up to 15 meters. The response time of the human
eye, on the order of 0.1 second, is sufficiently short under favorable circum-
stances to effectively freeze the fringe motion for efficient visual work.
Optical paths in both beams must, however, be equalized to within a few
wavelengths when observing in white light. This proved to be a difficult
requirement in systems involving segmented rather than monolithic optics :
not only are monolithic telescope mirrors very stiff and accurately figured
to provide optical path equality, but they are efficiently supported in
flotation cells designed to maintain the exact figure within a few wave-
lengths at all observing angles. In comparison, segmented optics systems
such as used by MICHELSON and PEASE [1921] suffer from considerable
flexibility requiring frequent readjustments of path equality.
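The Fourier-transform relation between the visibility curve and the object structure can be illustrated for the simplest stellar model, a uniformly bright disk. The sketch below evaluates the normalised Fourier transform of the disk numerically; the wavelength and the angular diameter are assumed values chosen for illustration, and the factor 1.22 is the familiar first zero of the Airy-type function for a circular disk.

```python
import numpy as np

ARCSEC = np.pi / (180.0 * 3600.0)   # one arc-second in radians

def disk_visibility(baseline_m, wavelength_m, diameter_rad, n=201):
    """Fringe visibility of a uniform disk: modulus of its normalised Fourier transform."""
    r = diameter_rad / 2.0
    x = np.linspace(-r, r, n)
    xx, yy = np.meshgrid(x, x)
    disk = (xx**2 + yy**2 <= r**2).astype(float)   # brightness distribution O(x, y)
    u = baseline_m / wavelength_m                  # spatial frequency, cycles per radian
    return abs(np.sum(disk * np.exp(-2j * np.pi * u * xx))) / disk.sum()

wavelength = 0.575e-6         # assumed effective wavelength (m)
diameter = 0.047 * ARCSEC     # assumed stellar diameter, for illustration only
for baseline in (0.0, 1.0, 2.0, 3.0, 4.0):
    v = disk_visibility(baseline, wavelength, diameter)
    print(f"baseline {baseline:.1f} m: visibility {v:.2f}")

first_null = 1.22 * wavelength / diameter   # baseline at which the fringes vanish
print(f"fringes vanish near {first_null:.2f} m")
```

Measuring the aperture separation at which the fringes disappear thus gives the angular diameter directly, provided the disk model is adequate.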


It is of interest to establish in better detail the exact relationship between
fringe contrast and object structure. In this section, light will for most
purposes be assumed to be effectively monochromatic. More precisely
this means that its coherence time is longer than the differential light
propagation delays occurring in the interferometer, but shorter than the
exposures. Spectral filtering with a bandwidth of the order of an Angstrom
produces effectively monochromatic light in most practical circumstances
where optical path differences are less than a millimeter and exposures
are longer than a microsecond.
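The orders of magnitude behind this statement are easily checked. The sketch below uses the usual estimate λ²/Δλ for the coherence length; the wavelength of 5000 Ångström is an assumption, while the one-Ångström bandwidth is the figure quoted above.

```python
C = 299_792_458.0   # speed of light, m/s

def coherence_length_m(wavelength_m, bandwidth_m):
    """Order-of-magnitude coherence length lambda^2 / delta_lambda."""
    return wavelength_m ** 2 / bandwidth_m

wavelength = 5000e-10   # assumed wavelength: 5000 Angstrom
bandwidth = 1e-10       # 1 Angstrom filter
length_m = coherence_length_m(wavelength, bandwidth)
coherence_time_s = length_m / C

print(f"coherence length about {length_m * 1e3:.1f} mm")
print(f"coherence time about {coherence_time_s * 1e12:.1f} ps")
```

With these values the coherence length is a few millimeters, longer than path differences below a millimeter, and the coherence time is a few picoseconds, far shorter than a microsecond exposure.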
Within the framework of Zernike's coherence theory, the concept of
degree of mutual coherence has classically been used to relate fringe visibility
measurements and object structure (BORN and WOLF [1970]). As an
alternate treatment we will use here an equivalent derivation more directly
adapted to multiple and large apertures. If S(x, y, λ, t) is the intensity
spread function in the focal plane of the instrument, then the intensity in
the image I(x, y, λ, t) of an incoherent object characterized by the apparent
intensity distribution O(x, y) is obtained through the convolution I = S ⊗ O.
The spread function consists of Young's fringes in the case of a FIZEAU
[1868] or MICHELSON [1920] interferometer, of speckles in the case of a
single large aperture, of fringed speckles in the case of two large apertures
(Fig. 3), of honeycombed speckles with 3 large apertures located according
to a triangular geometry, etc.
It has been mentioned in section 2.1 that direct image summing, i.e. of
the form ∫ I(x, y, λ, t) dt, results in a loss of the high-resolution detail.
Instead, the fringe or speckle information may be preserved and extracted
by summing either the power spectra or the autocorrelation functions of
images. This becomes apparent by writing the relations simultaneously in
the image space and in the Fourier space:
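
A sketch of relations of this kind is given below. The notation is an assumption made here for illustration: a star denotes autocorrelation, a tilde denotes the two-dimensional Fourier transform with spatial frequencies (u, v), and the sums run over the successive short exposures. The first line is the single-exposure convolution of the spread function S with the object O; the second line is the summed form in which the object and atmospheric terms separate.

```latex
\begin{align*}
I(x,y,t) &= S(x,y,t) \otimes O(x,y)
  &\Longleftrightarrow\quad
  \tilde I(u,v,t) &= \tilde S(u,v,t)\,\tilde O(u,v) \\
\sum_t I \star I &= (O \star O) \otimes \sum_t S \star S
  &\Longleftrightarrow\quad
  \sum_t |\tilde I|^2 &= |\tilde O|^2 \sum_t |\tilde S|^2
\end{align*}
```

In this form the factor summed over the spread functions plays the role of the atmospheric term, and the object enters only through its autocorrelation or, equivalently, through the squared modulus of its Fourier transform.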

The summed atmospheric term in eq. (3) tends towards the well defined
limit mentioned in sect. 1.4. In the Fourier space, this limit consists of the
diffraction-limited MTF with an additional central peak. This term being
known, from experiment or theory, a division gives the modulus of the
object function, which corresponds to the visibility curve obtained by
Michelson and Pease.
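
The procedure can be illustrated with a small one-dimensional simulation. This is only a sketch under simplifying assumptions that are not in the text: a 32-sample image, a 16-sample aperture whose pupil phase is completely random from sample to sample and from frame to frame, an equal double star separated by 4 samples, and a naive Fourier transform. The summed power spectrum of the object is divided by that of an unresolved reference star observed through different seeing realizations.

```python
import cmath
import math
import random

N = 32     # samples across the one-dimensional image
W = 16     # aperture width in pupil samples; frequencies 1 .. W-1 are transmitted
SEP = 4    # separation of the hypothetical equal double star, in image samples


def dft(x):
    """Naive discrete Fourier transform, adequate for N = 32."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]


def speckle_psf(rng):
    """Instantaneous spread function S for one random pupil phase screen."""
    pupil = [cmath.exp(2j * math.pi * rng.random()) if j < W else 0.0
             for j in range(N)]
    return [abs(a) ** 2 for a in dft(pupil)]


def image(obj, psf):
    """Short-exposure image, the circular convolution of object and spread function."""
    return [sum(obj[y] * psf[(x - y) % N] for y in range(N)) for x in range(N)]


def summed_power_spectrum(obj, n_frames, rng):
    """Sum of |FT(image)|^2 over n_frames independent short exposures."""
    total = [0.0] * N
    for _ in range(n_frames):
        spectrum = dft(image(obj, speckle_psf(rng)))
        for k in range(N):
            total[k] += abs(spectrum[k]) ** 2
    return total


rng = random.Random(1)
binary = [0.0] * N
binary[0] = binary[SEP] = 1.0
point = [0.0] * N
point[0] = 1.0

p_obj = summed_power_spectrum(binary, 200, rng)
p_ref = summed_power_spectrum(point, 200, rng)   # different seeing realizations

# Division by the reference term, restricted to frequencies inside the cutoff.
object_power = {k: p_obj[k] / p_ref[k] for k in range(1, W)}
for k in (4, 8, 12):
    print(k, round(object_power[k], 3))
```

For this double star the squared modulus of the object spectrum is 2 + 2 cos(2πk SEP/N), so the ratio should vanish at k = 4 and 12 and lie near 4 at k = 8, within the statistical scatter left by the finite number of frames.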
This general procedure can be applied regardless of aperture geometry
and gives spatial information on the object with diffraction-limited MTF
characteristics. It can be used not only with a Fizeau interferometer or
the full aperture of a large telescope, but also, in principle, with a coherent
array of large telescopes (Fig. X). It is interesting to note that early observers
seem to have attributed magic virtues to the Fizeau screen, without realizing
that information is actually gained when this aperture plate is removed
from the top of the telescope. However, some justification of Fizeau-
aperturing practice lies in the fact that the Fizeau screen simplifies the
image structure so that it can be processed by the eyes and brain of the
visual observer. By contrast, the considerable information content in speckled
images from a large aperture exceeds the data processing power of the
human eye-brain system.

Fig. X (additional): Principle of synthetic-aperture telescope: the giant (for example
100-meter) parabolic mirror in A may be apertured with a multi-aperture mask as shown in B
while retaining the same limiting resolution. B is optically equivalent to the array of mirrors
shown in C if component mirrors are accurately adjusted in tilt and axial position to reproduce
the B geometry with λ/4 accuracy. This difficult tolerance can be relaxed to a few microns
when using filtered light. C is also equivalent to the coudé arrangements in Figs. 12 and 13.

One major limitation to interferometric observing has to do with extended
objects, or more generally with objects consisting of more than a few pixels*
the size of the diffraction limit. For these objects, the convolution
represented by eq. (1) results in a very faintly contrasted image requiring
exceptional signal-to-noise ratio for a valid reduction.

* Pixel is a generally adopted word which means picture element.


The above theory does not take into account the noise component
appearing when few photon-events are recorded in each image. However,
this happens to be a serious limitation to the accuracy of measurements
in most practical cases, since the requirements for short exposures,
narrow spectral bands and slow focal ratios (high f-numbers) do result in
photon-starved images. The problem is especially relevant when observing
faint objects with the new generation of image sensors working in the
photon-counting mode (BOKSENBERG [1972]). Because amplification is
virtually noise-free in these receivers, the discrete nature of photon-events
becomes the dominant source of noise. The low level image is seen in the
form of a few bright scintillations occurring on a dark background, and
distributed in time and space according to a compound Poisson law.
When observing fringes at decreasing illumination levels, it eventually
becomes impossible to decide whether fringes are present or not. If the
fringe pattern were fixed, prolonged integration would solve the problem.
The fringes are however moving in our case, and higher order statistical
averages must be performed to extract the fringe signal. The analysis
procedure described in section 2.2 may still be used, but the validity of
eq. (3) under such conditions may be questioned. Classical results pertaining
to compound Poisson distributions show that the equations are still
valid at low level if care is taken to remove the only distortion arising in
the summed autocorrelation function, a narrow central peak appearing
at low level due to the correlation of each photon-event with itself. Assuming
low levels, pure photon noise, and many images, the signal-to-noise ratio
in the summed autocorrelation function is found to be

    N_p N_s^{1/2} N_i^{1/2}    (4)

where N_p is the number of photon-events per speckle, N_s is the number of
speckles in the image, and N_i is the total number of images used to build
up the summed autocorrelation (LABEYRIE [1974]). Applied to speckle
interferometry observations with a 200-inch aperture, this expression
leads one to expect a limiting magnitude in excess of 20. A more detailed
noise analysis published by DAINTY [1974] yields similar conclusions.
Not surprisingly, the limiting magnitude derived is much fainter than
that found by GUSKOVA and KOROLKOV [1973] for the case of photoelectric
interferometers involving two small apertures. Because size determinations
for the faintest cosmological objects are of crucial importance to modern
cosmology, appreciable effort is currently being made to reach the above
sensitivity limits of interferometric observing. Applied now to multi-aperture
systems such as described in § 2.4 and § 4, expression (4) predicts limiting
magnitudes in excess of 15, assuming the 1 to 10 Å spectral bandwidth imposed
for the fainter objects by the more severe temporal coherence requirements
for segmented systems.
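
The scaling implied by expression (4) can be explored numerically. The sketch below assumes that the expression reads N_p (N_s N_i)^{1/2}, i.e. the number of photon-events per speckle multiplied by the square root of the total number of speckles recorded; the numerical values are hypothetical and serve only as an illustration.

```python
import math


def autocorrelation_snr(n_p, n_s, n_i):
    """Expression (4), assumed here to read N_p * sqrt(N_s * N_i)."""
    return n_p * math.sqrt(n_s * n_i)


def images_required(n_p, n_s, target_snr):
    """Number of images N_i needed to reach target_snr, by inverting expression (4)."""
    return (target_snr / n_p) ** 2 / n_s


# Hypothetical faint-object case: 0.001 photon-event per speckle,
# 2000 speckles per image, 50 000 images.
snr = autocorrelation_snr(0.001, 2000, 50_000)
needed = images_required(0.001, 2000, 10.0)
print(f"SNR = {snr:.1f}; images needed for SNR 10 = {needed:.0f}")
```

With this form, halving the number of photon-events per speckle requires four times as many images for the same signal-to-noise ratio, which is why the number of images that can be processed matters so much for faint objects.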

§ 3. Interferometer Designs and Results


The original Fizeau interferometer, used for the first time by Stephan at
Marseilles in 1873, involved only the aperture screen as special equipment
on the telescope. A strong eyepiece was used to observe the fringes. Tele-
scope optics are usually sufficiently good to meet the one-micron or so
tolerance on path equality which allows fringe observations in white light.
Problems are created by the prismatic dispersion of the atmosphere when
observing stars at low elevation.
Michelson, Pease and Anderson first used a Fizeau interferometer at the
Mt Wilson 100-inch reflector (1920-1933). In order to avoid difficulties
with a 100-inch mask, they installed a small mask some distance above
the focal plane, in the converging beam. It could be rotated in position
angle, and atmospheric dispersion was corrected with a glass plate that
could be tilted. This system failed to resolve any stellar disk but was
successful in resolving the conspicuous spectroscopic binary Capella, whose
components are separated by approximately 0.05". The fringes were found to
disappear for certain baseline orientations, allowing precise measurement of
the separation and position angle between the component stars. This was the
first instance, and is still one of the very few cases, where stellar masses
could be determined directly. Several variants of this interferometer have
continued to be used by double star observers (FINSEN [1964]), but their
contribution to double star observing in general has been of somewhat
secondary importance in comparison with conventional visual work.
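
The disappearance of fringes for a double star can be checked with the standard two-point-source visibility formula. The sketch below assumes a wavelength of 550 nm and equal components, neither of which is given in the text; only the 0.05" separation is taken from it. The function names are introduced here for illustration.

```python
import math

ARCSEC = math.pi / (180 * 3600)   # one arcsecond in radians


def binary_visibility(baseline_m, separation_rad, wavelength_m, flux_ratio=1.0):
    """Fringe visibility modulus for two point sources separated along the baseline."""
    phase = 2 * math.pi * baseline_m * separation_rad / wavelength_m
    r = flux_ratio
    return math.sqrt(max(0.0, 1 + r * r + 2 * r * math.cos(phase))) / (1 + r)


def first_null_baseline(separation_rad, wavelength_m):
    """Baseline at which the fringes of an equal double star first vanish."""
    return wavelength_m / (2 * separation_rad)


separation = 0.05 * ARCSEC     # separation quoted in the text
wavelength = 550e-9            # assumed observing wavelength
baseline = first_null_baseline(separation, wavelength)
print(f"fringes vanish at a projected baseline of {baseline:.2f} m")
```

Under these assumptions the first null falls at a little over one meter, well within the span available on a 100-inch aperture, and it occurs only when the baseline is oriented along the line joining the two stars.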
In order to increase the possible baseline span beyond the maximum size
of available telescopes, Michelson equipped the 100-inch reflector with a
20-foot beam. The beam carried four flat mirrors designed to reflect the
light beams in periscopic fashion. These could be positioned, with about
50 microns accuracy, for optical path equality. Further adjustment for
fringe acquisition in white light was achieved by observing the star image
through a direct-view prism. Reference fringes of adjustable contrast were
provided by an auxiliary Fizeau interferometer for visual contrast
measurements, and corrections had to be made for a systematic effect
which decreased the apparent fringe contrast at long baseline settings.
In spite of appreciable operating difficulties, Michelson and Pease
succeeded in resolving and measuring nine stars. These angular measurements
confirmed the enormous linear dimensions of objects such as Betelgeuse
in comparison to our sun, and gave the foundations for a scale of stellar
temperatures. A second interferometer having a 50-foot beam was later
constructed by Hale and Pease at Mt Wilson (Fig. 5). Similar in its principle
to the 20-foot system, the larger interferometer failed to produce many
additional results for reasons which are not completely clear but seem
related to operational difficulties greater than were expected. During a
recent visit to Mt Wilson, I found the interferometer well preserved in its
building. In spite of some excellent design features, it appears that
the 50-foot cantilever beam may have suffered from poor stability about
its symmetry axis. The instrument could be revived at moderate cost, and
modern electronics for guiding and fringe sensing could certainly make
its operation much easier than in Pease's days. It is not clear, however,
whether the modernized interferometer could compete with recent
instruments involving two independently mounted telescopes.

Fig. 5. Diagram of the 50-foot Michelson interferometer built by Hale and Pease at Mt Wilson.


Photomultiplier tubes are hardly more sensitive than the human eye
for short exposure work at medium illumination levels, but they have faster
response and better photometric accuracy. Before the newer generation of
television sensors appeared, these advantages led several groups to develop
photoelectric fringe sensors replacing the human eye in Fizeau-type systems.
ELLIOTT and GLASS [1970] have used a picket-fence mask to generate a
photoelectric signal from Young's fringes, but most workers in the field
have preferred beam-splitter arrangements to obtain a flat interference
field (Fig. 6), following Michelson who had already considered using
beam-splitters for variants of the basic Fizeau arrangement. He apparently
found no advantage in doing so for visual work, but the single-pixel nature
of photomultiplier tubes obviously makes it easier to work on flat
interference fields than on Young's fringes. This advantage of beam-splitter
arrangements no longer holds with the new generation of multi-pixel sensors.

Fig. 6. Types of single-pixel photoelectric fringe sensors: a, Kösters prism (K) arrangement
used by Currie with two polarizers (P); the waves made to interfere have opposite orientations
in the figure plane; b, system used by Cagnet, in which waves are identically oriented and
polarizers unnecessary; c, standing-wave phototube used by the author, in which the thin S11
film probes a standing-wave pattern produced by the two beams; d, picket-fence mask used
by Elliott and Glass in a Young's fringe pattern.

A wide variety of configurations are possible with beam-splitter
arrangements. These fall in several classes depending on their symmetry properties:
1. lateral shear; 2. radial shear; 3. axisymmetric flip; 4. centrosymmetric flip.
Lateral shear applied to the aperture of a large telescope produces an
interference field in which the interferometric baseline is everywhere the same.
A single-pixel sensor could suffice to probe the entire wave in the absence of
seeing. In the axisymmetric case no interference occurs in natural light since
interference maxima for one of the polarization vectors correspond to
minima for the other polarization. This situation arises in all single-pass
beam-splitter arrangements where the two incident beams are subjected
to equal numbers of reflexions before meeting the beam-splitter. The
phenomenon has been responsible for appreciable frustration in some
attempts at operating stellar interferometers. Once the effect is understood,
however, fringes may be retrieved easily by using an additional mirror
or polarizing element. In the centrosymmetric case, a two-dimensional
display of the object's visibility function may be obtained directly if the
waves are tilted so as to produce narrow fringes: the fringe contrast varies
locally in proportion to the local visibility value. The shadow pattern on
the wave, however, somewhat degrades this display. Devices belonging to
the first class have been used on telescopes by KULAGIN [1970], CAGNET
[1973] and C. RODDIER [1971]. The fringe signal in the Cagnet system is
obtained as the difference between the outputs of two photomultipliers
located on each side of the beam splitter (Fig. 6b). The device was operated
in the Fizeau mode at the Haute Provence 193 cm reflector. C. RODDIER
[1971] worked with a different system which in fact uses the full aperture of
the telescope; interference thus tends to vanish when using large apertures
with a single-pixel sensor.
CURRIE, KNAPP and LIEWER [1974] have developed an axisymmetry
interferometer (Fig. 6a) which they mounted on the 100-inch and 200-inch
telescopes of the Hale observatories. They used a differential detection
scheme similar to that of Cagnet, but working in the photon-counting
mode with an on-line digital processor. With a pair of 2-cm apertures and
10 Angstroms spectral bandwidth, they were able to measure visibility curves
on Betelgeuse and alpha Herculis. In this system, polarizers are used to
avoid the above mentioned polarization incoherence problem associated
with axisymmetry devices. The system appears to work at rather low
counting rates, and this leads one to expect considerable sensitivity for future
large-aperture, multi-pixel devices working in the photon-counting mode.
Centrosymmetry systems have been operated in the laboratory by different
authors. BEAVERS [1963] has experimented with several forms of
photoelectric fringe sensing on the reduced-scale version of the Michelson 20-foot
interferometer which he has built.
Yet another type of single-pixel sensor was used by the author
concurrently with a Cagnet-type device in initial attempts with the
two-telescope interferometer at Meudon. As shown in Fig. 6c, the device is based
upon a special photomultiplier tube (built to the author's specifications
by the collaborators of Prof. Lallemand at Observatoire de Paris). The
S11 photocathode may be illuminated from both sides with plane waves. If
these are coherent, they interfere in the form of a standing-wave pattern
which is probed by the semi-transparent and very thin S11 film. For flat
interference, the photocathode plane has to be parallel to the nodal planes
of the standing waves, much as is the case with the beam-splitter films used
by CAGNET [1973] and CURRIE, KNAPP and LIEWER [1974]. If the phase is
made to oscillate with 180° amplitude by an auxiliary mirror drive, the
phototube signal may be submitted to lock-in detection, thus providing
with a single phototube the equivalent of the two-photomultiplier systems
mentioned above. Used concurrently with the Cagnet-type sensor, the
standing-wave sensor was generally preferred. However, both sensors were
discarded after a photon-counting television sensor was built and its
superiority was recognized (sect. 3.3).
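
The equivalence between phase switching on a single phototube and the difference of two complementary beam-splitter outputs can be sketched as follows. The model is an idealization assumed here: a normalized signal 1 + V cos(phase), a square-wave modulation that switches the phase by exactly 180°, and no noise. The function names are hypothetical.

```python
import math


def phototube_signal(visibility, phase, modulation):
    """Normalized detector signal for fringes of given visibility and phase."""
    return 1.0 + visibility * math.cos(phase + modulation)


def lock_in(visibility, phase, n_cycles=100):
    """Demodulate a square-wave phase modulation switching between 0 and 180 degrees."""
    in_phase = 0.0
    shifted = 0.0
    for _ in range(n_cycles):
        in_phase += phototube_signal(visibility, phase, 0.0)
        shifted += phototube_signal(visibility, phase, math.pi)
    # Difference over sum, exactly as with two complementary photomultiplier outputs.
    return (in_phase - shifted) / (in_phase + shifted)


print(round(lock_in(0.6, 0.0), 3))           # fringe maximum on the photocathode
print(round(lock_in(0.6, math.pi / 2), 3))   # fringe in quadrature
```

The demodulated output equals V cos(phase): it carries the fringe contrast but also depends on the instantaneous fringe phase, which is why measurements of this kind must be repeated within the seeing lifetime.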


The principle of speckle interferometry follows immediately from the
discussion of image speckles (sects. 1.3 and 2.2). As proposed by Labeyrie
in 1970, the method consists in recording speckled images at the focus of
a large telescope. The images are then analyzed statistically according to
eq. (3), to obtain diffraction-limited information in the form of a
two-dimensional visibility function formally similar to Michelson's visibility
curve. Although the very existence of speckles appears to have long remained
ignored by many experienced astronomers, including perhaps Michelson
himself, the principle had already been used on a somewhat intuitive basis
by double-star observers working visually on large refractors. With adequate
training, and because of the moderate aperture size, the brain of these
observers could perform the second order statistical analysis required,
and detect stellar companions spaced by 0.1" under 0.5" seeing conditions.
They could notice the double character of the granules or condensations
which moved inside the image of close binary stars. It has however been
impossible to record the phenomenon until receivers more sensitive than
photographic plates appeared. Using a Lallemand-Duchesne electronographic
tube, RÖSCH, WLÉRICK and BOUSSUGE [1961] were first to succeed,
and started a double-star program with the one-meter telescope at Pic du
Midi.
The speckle interferometer which I use at the prime focus of the 200-inch
telescope is represented in Fig. 7. As described previously (LABEYRIE [1974]),
it includes essentially a magnifying lens and a field-grating arrangement
serving as a tunable filter, and also to correct atmospheric dispersion.
Different types of sensors have been used successively: 1. photographic
film with an image intensifier; 2. a standard television camera equipped
with a SIT tube; 3. a photon-counting television camera. Because standard
television image frequencies are close to the optimum values, determined
by atmospheric frequencies, television-type sensors are particularly efficient
for this application. Initially, optical analog techniques were used to reduce
speckle interferometry data. This was required on account of the high
information content in the numerous two-dimensional images which had
to be processed. The images recorded on film, or transferred from video
tape to film, were Fourier transformed in a laser processor, and their
power spectra were summed by multiple-exposing a photographic plate.
200 to 6000 exposures were typically used in the summation for residual
noise on the order of a few per cent. Fringes in the summed pattern were
interpreted as evidence of stellar duplicity, while attenuation of the outer
edges indicated a resolved stellar disk (Fig. 8). The power spectrum of the
object could, in principle, be obtained by dividing the summed power
spectra obtained respectively for the object and an unresolved reference
star. In practice, the difficulty of ensuring the required sensitometric gamma
resulted in unavoidable photometric distortions in the processed data.

Fig. 7. Speckle interferometer used at the prime focus of the 200-inch telescope: M, field-
finder mirror, with 0.1 mm hole; L, magnifying lens; G, concave field grating (Jobin-Yvon
holographic type) in plane of magnified image; U, spherical mirror; S, spectral mask; TV,
television camera; R, auxiliary spectrum-viewing lens; C, geometry-calibration grid. The optics
provides magnification and tunable filtering (color and bandwidth are separately adjustable).
Atmospheric dispersion is corrected by translating TV axially and rotating the complete system
about the telescope axis.
Nevertheless, the sensitivity and two-dimensional character of the
method permitted a number of findings: 200 objects were observed as of
December 1974, 90 of them in the course of only two nights. Twelve
stars were found to be binaries (LABEYRIE, BONNEAU, STACHNIK and GEZARI
[1974]), allowing stellar mass determinations in five cases. Most of the
supergiant stars resolved by Michelson and Pease were also resolved in
spite of slightly inferior resolution with the 200-inch telescope. Two of
these (Betelgeuse and Mira Ceti) were found to feature a markedly
limb-darkened profile, the width of which increases from red to blue wavelengths
(BONNEAU and LABEYRIE [1973]). All the resolved disks were found to be
revolution symmetrical, with the possible exception of Mira (Fig. 9).

Fig. 8. Stellar structure evidenced from time-averaged power spectra. Speckled images on top with corresponding power spectra below. From left
to right: Betelgeuse (resolved disk), Capella (resolved binary), unresolved reference star. The power spectra presented here are relatively noisy due
to the small number of frames used in the average.

Fig. 9. Typical integrated power spectra of 200-inch images, obtained optically, showing resolution of six stellar disks and two binaries.
Object-reference pairs are indicated by a bar. The alteration in the case of p Canis Majoris is believed to result from aberrations resulting
from flexure of the 200-inch mirror for certain orientations. A mirror mask suppresses the bright central peak.
Attempts are currently being made to improve the image analysis procedures.
High-speed digital correlators have been used in real time, but they provided
only one dimension. A two-dimensional technique applicable to digital
video signals at extremely low pulse rates has been worked out for work on
faint objects or attenuated bright ones. As discussed in sect. 2.3, it appears
indeed that interferometric information may be recovered at extremely
low counting rates, down to 2 photon-events per image, or even less. Part
of the information in high-level analog images must be sacrificed due to
rate limitations in digital processing techniques. It appears that beam
attenuation, for the brighter objects, is the optimum way of sacrificing
information if it allows digital reduction in the photon-counting mode.
Expression (4) shows that minutes of observation suffice to provide adequate
signal-to-noise ratio at counting rates on the order of 100 per image, which
are compatible with on-line digital processing. A software and a hardware
autocorrelation system using a special algorithm are currently being developed
along these lines at Meudon.
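
One simple way of building a summed autocorrelation directly from photon-event coordinates is to histogram the difference vectors between all pairs of distinct events within each image. The sketch below is an illustration under that assumption and is not the special algorithm referred to above; excluding the pairing of each event with itself removes the narrow central peak discussed in sect. 2.3. The function name and the sample coordinates are hypothetical.

```python
from collections import Counter


def summed_autocorrelation(frames):
    """Histogram of difference vectors between distinct photon-events, summed over frames.

    Each frame is a list of (x, y) integer pixel coordinates of photon-events.
    """
    total = Counter()
    for events in frames:
        for i, (x1, y1) in enumerate(events):
            for j, (x2, y2) in enumerate(events):
                if i != j:   # skip the correlation of a photon-event with itself
                    total[(x2 - x1, y2 - y1)] += 1
    return total


# Three hypothetical images containing 2, 1 and 2 photon-events.
frames = [[(0, 0), (3, 1)], [(5, 5)], [(2, 2), (5, 3)]]
acf = summed_autocorrelation(frames)
print(acf[(3, 1)], acf[(-3, -1)], acf[(0, 0)])
```

The cost grows as the square of the number of events per image, which is one reason why moderate counting rates are convenient for on-line processing.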
R. Lynds and his collaborators (1973) at Kitt Peak National Observatory
have developed a digital two-dimensional correlator which they use to
reduce their analog speckle interferometry data. Steps have been taken
in a different direction by STACHNIK and NISENSON [1973]: they use
electro-optic crystals as transducers for optical reduction of speckle data
in real time.


Because self-supporting structures such as the 50-foot interferometer at
Mt Wilson cannot be extrapolated for baselines on the order of 100 meters,
it has been suggested by MILLER [1966] and others to use separately mounted
collectors for long-baseline work. Miller has studied configurations
involving a pair of heliostats and a central station equipped with optical
delay lines.
Following this general philosophy, I have constructed an interferometer
utilizing two telescopes. Installed at Nice, the system recently produced
fringes on Vega (LABEYRIE [1975]). The instrument was intended to test
design concepts suitable for future extrapolation toward long baselines, large
component apertures, and progressive growth into an array including
perhaps 40 telescopes. It consists of two 25-cm telescopes located on each
side of a laboratory building, along a 12 meter baseline oriented in the
North-South direction. The telescopes have a Cassegrain-coudé
configuration, both coudé beams being received in the laboratory building at
focal ratio f:3000, and recombined on an optical table as shown in Figs.
10, 11 and 12. The table is mobile on tracks to compensate the optical path
variations occurring as the star follows its diurnal motion. It carries a twin
autoguider system and a roof-mirror arrangement for recombining the
two images in order to produce Young's fringes similar to those observed
by Michelson with his instrument. The telescope mounts are of a special
altitude-altitude design providing adequate stiffness with a minimum
number of coudé flats. Considerable care has been taken to avoid possible
mount vibrations.
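
The optical path variation that the mobile table has to follow can be estimated from the geometry of a horizontal North-South baseline. The sketch below uses the standard projection of the baseline onto the star direction; the latitude of Nice (about 43.7°) and the declination of Vega (about +38.8°) are values assumed here, and the sign convention is arbitrary.

```python
import math

SIDEREAL_RATE = 7.2921e-5   # Earth's rotation rate, rad/s


def path_difference(baseline_m, latitude, declination, hour_angle):
    """Geometric path difference (m) for a horizontal North-South baseline.

    All angles are in radians.
    """
    return baseline_m * (math.sin(declination) * math.cos(latitude)
                         - math.cos(declination) * math.sin(latitude) * math.cos(hour_angle))


def path_difference_rate(baseline_m, latitude, declination, hour_angle):
    """Time derivative of path_difference (m/s) as the star follows its diurnal motion."""
    return (baseline_m * math.cos(declination) * math.sin(latitude)
            * math.sin(hour_angle) * SIDEREAL_RATE)


latitude = math.radians(43.7)       # assumed latitude of the site
declination = math.radians(38.78)   # assumed declination of Vega
hour_angle = math.radians(15.0)     # one hour after meridian passage

delay = path_difference(12.0, latitude, declination, hour_angle)
rate = path_difference_rate(12.0, latitude, declination, hour_angle)
print(f"path difference = {delay:.3f} m, drifting at {rate * 1e3:.3f} mm/s")
```

With these values the path difference drifts by roughly a tenth of a millimeter per second one hour from the meridian, and the drift vanishes at meridian passage.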
The synthetic image, or a fringed spectrum, is observed either visually
or with the photon-counting television camera. The camera is interfaced
to a PDP8 minicomputer through a preprocessor. The fringes observed
repeatedly on Vega were found to be of very good contrast, suggesting that
the subjacent limestone soil at Nice provides adequate stability. Also, it
appears that the narrow coudé beams are essentially insensitive to turbulence
in the horizontal path, implying that no piping should be required with
this configuration at long baseline settings.

Fig. 10. The two-telescope interferometer at Nice. Narrow coudé beams from both telescopes
are received in the central building, where they recombine to produce Young's fringes in the
synthetic image. The telescopes have special alt-alt mounts built from heavy-gauge materials
for dimensional stability. Tracks are currently being designed for a variable baseline.

Fig. 11. One of the two telescopes operated as a Michelson interferometer. The coudé
beam exits through the bearing visible in front. Also visible are the massive secondary spider,
yoke and concrete support.
In contrast with some of the beam-splitter arrangements used to
recombine beams, the simple optical configuration can be adapted for work
with N telescopes. This requires replacing the roof-mirror by a pyramid
mirror. Optical delay lines such as proposed by Miller may prove necessary
in this case. Making accurate contrast measurements on inherently variable
fringes has been a standing problem since Michelson and Pease. Systematic
errors may be caused by incomplete temporal coherence, atmospheric
dispersion, polarization effects, differential field rotation, vibrations,
insufficiently short exposures, and guiding errors. The problem of avoiding
all these effects has not yet been solved at Nice, but it appears that the use
of larger component apertures with digital reduction methods in the
photon-counting mode should attain a level of accuracy sufficient for many
astrophysical problems. Steps are currently being taken to replace the
small telescopes by 60-inch ones. Also, railway tracks are being installed
for variable and progressively longer baselines.

Fig. 12. Optical layout of the Meudon/Nice two-telescope interferometer: Tn, Ts- North and
South telescopes; M- primary mirror (f = 850 mm); m- Cassegrain secondary (f = 7.5 mm);
F- coudé flat; L- field lens; rm- roof mirror in pupil plane; D- dichroic mirror; TV1- guiding
camera; bl- bilens serving to separate the North and South guiding fields; S and P- slit and
direct view prism used for fringe acquisition; TV2- photon counting camera (tunable filter
or disperser not represented); Tr- tracks on which table moves (programming mechanism not
represented).

§ 4. The Image Reconstruction Problem


The equations in sect. 2.2 show how the autocorrelation function of
objects, or equivalently their power spectrum, may be obtained. This is
generally insufficient for reconstructing images. The information missing
is the relative phase of the various spatial frequency components, i.e., the
exact location of the component fringe patterns on the sky plane. In the
Fizeau interferometer for example, the phase cannot be determined since
the fringes observed for any baseline value oscillate in such a way that no
average position exists. Due to turbulence and tracking instabilities, it
is not practicable to pin down some average fringe position on the sky
plane. Thus, fringe positions on the sky plane at different baseline settings
cannot be compared. A similar problem is classically encountered in the
interpretation of X-ray diffraction data for mapping crystal structures.
The X-ray spot diagrams do not contain phase information, and
considerable effort has been invested in trying to obtain this information
indirectly in order to deduce the electron density distribution in crystals.
The absence of a well defined average fringe position may be explained
by the random walk character of the operation which consists in adding
fringed images. This addition has the form Σ_i [1 + sin(2πx/s + φ_i)], which is
equal to N + Σ_i sin(2πx/s + φ_i). The latter term may be represented by a
sum of randomly oriented vectors in the complex plane, and the resulting
phase keeps varying wildly as N increases. MCGLAMERY [1967] and others
have proposed to average directly the phases φ_i. This might be feasible in
the absence of amplitude variations. However, these variations cut the
fringe contrast repeatedly during the integration period, thus creating
360° ambiguities which are likely to affect the average result.
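
The random walk character of the summation is easily verified numerically. In the sketch below each fringed image contributes a unit vector with a phase drawn uniformly at random, which is an idealization assumed here: the contrast remaining in the summed image falls off as the inverse square root of N, while the phase of the resultant settles on no particular value.

```python
import cmath
import math
import random


def summed_fringe_term(phases):
    """Resultant of the unit vectors representing the sum of sin(2*pi*x/s + phi_i)."""
    return sum(cmath.exp(1j * p) for p in phases)


rng = random.Random(7)
for n in (10, 100, 1000, 10000):
    resultant = summed_fringe_term(rng.uniform(0, 2 * math.pi) for _ in range(n))
    contrast = abs(resultant) / n    # fringe contrast left in the summed image
    phase = math.degrees(cmath.phase(resultant))
    print(n, round(contrast, 4), round(phase, 1))
```

The modulus of the resultant grows only as the square root of N, whereas the uniform background grows as N, so that prolonged summation washes the fringes out instead of revealing them.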
Although the visibility phase information is generally necessary to
reconstruct images, there are a number of special object geometry cases
where the phase is not needed if a certain a priori knowledge of the object
exists. Such cases include: 1. centrosymmetrical objects, for which the
Fourier spectrum is purely real; 2. objects with a reference star in the
immediate vicinity. The application of speckle interferometry in the latter
case has been discussed and explored through laboratory experiments by
BATES, GOUGH and NAPIER [1973]. When a reference star is present at a
suitable distance, the autocorrelation function indeed contains a pair of
side terms consisting of cross-correlations between the object and the
reference star. The reference star being a delta function, these side terms
turn out to be reconstructed images of the object. Choosing which side term
corresponds to the actual object orientation is normally impossible, unless
one uses the image envelope, and this introduces a 180° ambiguity in the
orientation of the object.
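
The appearance of the two side terms can be seen in a one-dimensional toy example. The intensities below are hypothetical: a three-sample object followed, ten samples away, by a single-sample reference star. The autocorrelation is computed directly from its definition.

```python
def autocorrelation(f):
    """Autocorrelation of a finite sequence, indexed by lag from -(n-1) to n-1."""
    n = len(f)
    return {lag: sum(f[x] * f[x + lag] for x in range(n) if 0 <= x + lag < n)
            for lag in range(-(n - 1), n)}


obj = [3, 2, 1]                    # hypothetical asymmetric object
scene = obj + [0] * 7 + [1]        # reference star at sample 10

acf = autocorrelation(scene)
side_negative = [acf[lag] for lag in (-10, -9, -8)]
side_positive = [acf[lag] for lag in (8, 9, 10)]
print(side_negative, side_positive)
```

One side term reproduces the object and the other its mirror image. The autocorrelation alone does not tell which is which, and this is the orientation ambiguity mentioned above. The side terms are cleanly separated from the central term only because the reference star lies farther from the object than the object is wide.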
LIU and LOHMANN [1973] have discussed the case of speckle interferometry
on objects which contain a dark background interrupted by relatively
small islands, one of these at least being a point star. They show that a
high-resolution image may be reconstructed by utilizing image envelopes
to eliminate unwanted cross-correlation terms. Stellar configurations
suitable for imagery by the last two methods are not too infrequent due to
the high occurrence of multiple star systems. With the new generation of
very sensitive receivers, many sources should become amenable to
high-resolution imaging by these methods. We shall see in section 3.3 that these
observations are potentially more sensitive than those with pre-detection
compensation systems.
One more possible method, which has not been very much investigated
yet, involves the exceptionally bright speckles which are likely to appear
sometimes in the image according to Rayleigh statistics. These
super-speckles may be considered as diffraction-limited images of the source.


Rogstad proposed to use for visibility phase measurements in the
optical region the 3-antenna method worked out by JENNISON [1967] at
radio wavelengths. The concept was further developed by GOODMAN and
RHODES [1973] with the help of laboratory and computer experiments. The
principle of the triple interferometer may be explained in the context of
a Fizeau interferometer with 3 apertures instead of the conventional
two. Holes are obturated in sequence according to a circular permutation
scheme in such a way that only two holes are used simultaneously. Three
possible fringe systems are thus recorded in sequence, with a cycle time
shorter than the atmospheric lifetime. With a point source, whenever
atmospheric phase shifts are so arranged that fringe systems 1 and 2 coincide,
then every second maximum of 3 should also coincide with the maxima of 1 and 2.
Lack of coincidence implies that the source is not a point source, and gives
the relative phases of its Fourier components at spatial frequencies f and 2f.
Due to lack of experience in real astronomical situations, it is unclear yet
how this method will compare with the seeing compensation and
super-speckle selection methods.
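
The reason why three apertures give access to phase information can be sketched with the closure-phase formulation commonly associated with Jennison's method; this formulation is swapped in here for illustration and is not the coincidence criterion described above. Each aperture is assumed to add its own random atmospheric phase, and the object phases on the three baselines are hypothetical values.

```python
import math
import random


def wrap(angle):
    """Bring an angle back into the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def observed_fringe_phase(object_phase, atm_a, atm_b):
    """Fringe phase for the aperture pair (a, b): object term plus differential seeing."""
    return object_phase + atm_a - atm_b


def closure_phase(phi_12, phi_23, phi_31):
    """Sum of the three fringe phases around the triangle of apertures."""
    return wrap(phi_12 + phi_23 + phi_31)


rng = random.Random(3)
object_phases = (0.3, 0.5, -0.2)   # hypothetical object phases on baselines 12, 23, 31
for _ in range(5):
    a1, a2, a3 = (rng.uniform(-math.pi, math.pi) for _ in range(3))
    closure = closure_phase(
        observed_fringe_phase(object_phases[0], a1, a2),
        observed_fringe_phase(object_phases[1], a2, a3),
        observed_fringe_phase(object_phases[2], a3, a1),
    )
    print(round(closure, 6))
```

Every realization returns the same value, the sum of the object phases, because each atmospheric phase enters once with each sign. For a point source the sum vanishes, which corresponds to the coincidence of maxima described in the text.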


One of the most attractive approaches to diffraction-limited imaging

through the atmosphere is that suggested by BABCOCK [1953]. The idea
is to remove atmospheric phase fluctuations by means of active phasing
devices. The servo loop originally proposed by Babcock involved a rotating
knife-edge to map wave defects, with a television camera and an Eidophor
optical transducer for applying phase corrections on the wave. The state
of the art has long been incompatible with such attempts, but the idea has
recently been revived by different groups at Berkeley (Lawrence Radiation
Laboratory), Itek Corp. and Hughes Labs. MULLER and BUFFINGTON
[1974] at Berkeley, in collaboration with DYSON [1974], used computer
simulations to study the performance of arrays having a few dozen aperture
elements with individually controllable phases. A parameter suitable for
deriving an error signal was found to be the intensity at one preselected
image pixel: the first aperture is phased to maximize intensity, then the
second, etc., and a few cycles of adjustments performed within the seeing
lifetime suffice to increase dramatically the image sharpness, even though
no attempt is made to suppress the shadow pattern on the aperture.
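
The sequential phasing scheme can be tried on a toy model. The sketch below is not the simulation of Muller and Buffington; it assumes unit-amplitude aperture elements, a random initial phase per element, and a fixed set of trial phase steps, and it takes as error signal the intensity at one pixel, modelled as the squared modulus of the summed phasors. All names and parameter values are illustrative.

```python
import cmath
import math
import random


def pixel_intensity(phases):
    """Intensity at the preselected pixel for unit-amplitude elements."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) ** 2


def sharpen(phases, trial_steps=16, cycles=3):
    """Phase each element in turn so as to maximize the pixel intensity."""
    phases = list(phases)
    # Trial corrections, including zero, so intensity can never decrease.
    trials = [2 * math.pi * k / trial_steps for k in range(trial_steps)]
    for _ in range(cycles):
        for i in range(len(phases)):
            best = max(
                trials,
                key=lambda t: pixel_intensity(
                    phases[:i] + [phases[i] + t] + phases[i + 1:]
                ),
            )
            phases[i] += best
    return phases


if __name__ == "__main__":
    random.seed(0)
    n = 24  # "a few dozen" elements
    start = [random.uniform(0, 2 * math.pi) for _ in range(n)]
    final = sharpen(start)
    # Intensities normalized to the value for a perfectly phased array.
    print(pixel_intensity(start) / n**2, pixel_intensity(final) / n**2)
```

In this model a few cycles bring the normalized intensity close to unity, the residual being set by the coarseness of the trial steps.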
A different system, resembling more the original Babcock device has
been demonstrated in the laboratory by HARDY, FEINLIEB and WYANT
[1974] at Itek. In this seeing compensator, a wavefront-shearing inter-
ferometer serves to measure the phase distribution on the wave. A simple
analog computer derives correction signals, and these are supplied to a
deformable piezoelectric mirror. The spectacular image improvement
already obtained with this other simple system in its present stage of
development suggests a bright future for the seeing compensation approach.
A rather modest limiting magnitude, of the order of 10 to 13, has been
predicted for devices of that kind. However, for bright enough objects, not
only imaging cameras but also spectrographic equipment should benefit
from reconcentrating the energy which is scattered by the atmosphere.

§ 5. Construction of a Synthetic-Aperture Array of Optical Telescopes

In the past few years, interest has arisen in building optical telescopes
much larger than those in existence. As discussed by CODE [1973], it has
generally been felt that the conventional monolithic approach is not
suitable for building such giant instruments. The construction costs with
the conventional approach indeed appear to increase faster than collecting
area. Instead, it has been seriously envisaged to build arrays of optical
telescopes resembling those used in radio astronomy. ODGERS and RICHARDSON
[1972], for example, have proposed an incoherent array consisting of
forty 1.5 meter telescopes. In their project, a synthetic image is produced
at a common focus where the coudé beams meet. The usable field is small
but the array is primarily intended for feeding light into a spectrograph.
The advantages and limitations of telescope arrays have been discussed by
CODE [1973], primarily from the incoherent synthesis point of view. Code
pointed out a unique advantage of arrays over monolithic telescopes: they
can grow, i.e., it is possible to start with a relatively modest system and
progressively expand it once satisfactory operation is demonstrated.
However, the greatest potential interest of the array approach lies
perhaps in the coherent synthesis applications. Interferometric baselines
on the order of 100 meters can indeed be envisaged with an array since
system cost depends mostly on the total collecting area and not on the
telescope spacing. Component telescopes for an array do not need to
involve unduly novel techniques, except perhaps for the coudé beam
arrangements, which should require as few reflexions as possible. Telescopes
should preferably be movable for a flexible baseline geometry, unless their
individual size is excessive. Optical path equality can be achieved by means
of tunable delay lines, as discussed by MILLER [1971]. Fringe detection in
the image may be achieved by speckle interferometry or using any of the
numerous possible beam-splitter arrangements. Figure 3 shows, under
conditions of laboratory simulation, the possible appearance of images
produced by a two-telescope array and a 6-telescope array, component
apertures being on the order of 1.5 meter. More complicated interference
structures would be observed within the speckles in the case of more than
6 telescopes, but the image analysis procedures used with a single telescope
remain valid. Fig. 13 shows the general layout of an array proposed by the
author. The array will have variable baseline geometries. Altitude-altitude
or spherical mounts are envisaged for the telescopes. These are 1.5 meter
Cassegrainians with a small interchangeable secondary and single-flat

Fig. 13. Proposed synthetic-aperture array of telescopes. Narrow coudé beams propagate
from each telescope into the central station, where they recombine. The telescope mounts
represented consist of ferro-cement spheres tracked on fluid pads. They are expected
to provide better dimensional stability than conventional coudé mounts. Sphere surfaces
are precision ground. Conventional 1.5 meter (60-inch) telescope optics are mounted inside
the spheres.

coudé focus. Because the mounts are not equatorial, photoelectric or
computer-controlled guiding is needed. Also, the coudé flat must be rotated
along the alt-2 axis during observing. This is automatically achieved by a
gear mechanism, and satisfactory operation has been achieved with the
prototype telescopes installed at Nice.
The array design is such that telescopes may be utilized : 1. individually,
for conventional observing with each telescope separately ; 2. collectively
in the incoherent mode, as proposed by ODGERS and RICHARDSON [1972]
for their array; 3. collectively in the coherent mode. The Meudon/Nice
interferometer is a prototype intended for evaluating the concept. Its
operation has been demonstrated and found satisfactory using photo-
electric guiding to maintain the superposition of the two star images. The
mechanical design of telescope mounts must be unusually stiff in order to

Fig. 14. Cross-section of concrete spherical mount studied for the array of 1.5 meter telescopes.
Reinforced concrete has good vibration damping characteristics, short-term stability and
low cost. The outer sphere surface is ground smooth and supported on rollers, water bearings,
or piezoelectric pads.

avoid vibrations at the 0.1 micron scale. This implies particularly thick
spider arms and a very rigid yoke or sphere. In the prototype system, it
was found necessary to sacrifice partially the possibility of operating at low
elevation angles in order to meet these requirements. This is no great loss
for coherent work, since atmospheric turbulence and dispersion increase
sharply at large zenith angles. An attractive alternate design involves
spherical mounts. For low-cost production and excellent vibration-damping
characteristics, these mounts could consist of a shell made of reinforced
concrete or ferro-cement and supported on three rollers or fluid pads, as
shown in Fig. 14. This design appears to have attractive advantages, and
the unusual tracking problems appear to be quite solvable with the help
of modern minicomputers. Among the advantages are : 1. structural
simplicity; 2. no dome is needed since the sphere includes self-enclosed
laboratory space; 3. the spherical mount may be tracked either in the
alt-alt, equatorial, or even alt-az modes.
A project along these lines is currently being worked out at Meudon.
A 3.5 meter ferro-cement sphere has been constructed and will shortly
be equipped with a 1.5 meter mirror to develop the spherical tracking
technology. A second telescope will then be built, and institutions from
different countries will later be invited to contribute additional telescopes
for progressive array growth. The system's luminosity will surpass that
of Mt. Palomar's 200-inch instrument if the array grows to include eleven
1.5 meter telescopes or six 2-meter telescopes. Because it is still difficult
to predict the maximum baseline dimensions that will be usable, some
flat expanse of terrain of at least one square kilometer should be selected,
in a region having low nebulosity and jet-stream activity in addition to
other desirable astronomical characteristics.
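
For orientation on the baselines discussed in this section, the fringe resolution can be estimated as wavelength divided by baseline. The short sketch below assumes this simple ratio, with no numerical factor for aperture shape, and an illustrative visible wavelength of 550 nm, which is not a value taken from the text.

```python
import math

# Conversion factor from radians to seconds of arc.
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def fringe_resolution_arcsec(wavelength_m, baseline_m):
    """Angular fringe spacing, wavelength / baseline, in arc-seconds."""
    return wavelength_m / baseline_m * RAD_TO_ARCSEC


if __name__ == "__main__":
    wavelength = 550e-9  # assumed visible wavelength, in meters
    # A single 1.5 meter aperture compared with 100 m and 1 km baselines.
    for size in (1.5, 100.0, 1000.0):
        print(size, fringe_resolution_arcsec(wavelength, size))
```

With these assumptions a 100 meter baseline corresponds to roughly one thousandth of an arc-second, against several hundredths of an arc-second for a single diffraction-limited 1.5 meter aperture.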

§ 6. Intensity Interferometry
In order to overcome the problems which stellar interferometry had to
face in the 1950s, Hanbury Brown and Twiss proposed the novel
method which they called intensity interferometry. The principle may
be presented in the following elementary fashion, readers being referred
to Hanbury Brown's articles for more details.
Neglecting atmospheric effects, which have no influence on this method,
the illumination produced on the ground by a stellar source is not uniform
if mapped during a period shorter than the coherence time $\tau = 1/f$ of the
beam ($f$ being the frequency bandwidth). The non-uniformity results from
interference of light emitted by different parts of the source. Indeed, the
electric field behaves as if the source were spatially coherent during such
a short interval. The phase is however not uniform on the source, so that
the emitter may be compared to a piece of diffusing glass illuminated by
a laser beam. If the diffusing glass is further assumed to fluctuate randomly
with lifetime $\tau$, an accurate model of the beam's spatio-temporal structure
is obtained. The theory of speckle phenomena mentioned in section 1.3
applies to this case and shows that a distant screen, the terrestrial ground
in the present case, is illuminated with a speckle pattern fluctuating with
the time-constant $\tau$. The size of the speckles $d$ is related to the angular size
$\alpha$ of the source by the usual relation $d = \lambda/\alpha$. This is equivalent to saying
that the whole spatio-temporal structure of an incoherent beam is speckled:
temporal as well as spatial speckles are present. The beam structure may
be described as a flow of random cells or speckles (also known as field
modes or coherence cells) propagating in space with velocity c while
deforming themselves. Fast detectors located on a transverse surface (the
terrestrial ground) see simultaneous intensity fluctuations if they are
spaced by less than the transverse dimension of speckles. They see
uncorrelated fluctuations in the opposite case. The cell size may thus be
determined by comparing the signals from two detectors having a variable
spacing. This is the technique used by HANBURY BROWN, DAVIS and ALLEN [1974]
at Narrabri observatory. A pair of 6.5-meter light collectors mobile on a
300-meters diameter circular track are each equipped with a fast photo-
multiplier and narrow-band filter. The two photoelectric signals are multi-
plied, and the result is integrated for several hours or days until adequate
signal-to-noise ratio is obtained. Because of the limited electrical band-
width in the photomultipliers and amplifier circuits, the optical frequency
bandwidth effectively utilized is extremely narrow, on the order of
Angstroms. This implies a comfortable tolerance on the dimensional
stability of mechanical structures, but also a very inefficient use of the
incident energy. For this last reason, the method has been applicable only
to the very brightest blue stars up to magnitude 2.5. It nevertheless permitted
remarkably accurate measurements on 32 stars with the unrivaled resolution
of $10^{-3}$ arc-second.
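
The two scales quoted above, the coherence time (inverse of the frequency bandwidth) and the speckle size on the ground (wavelength divided by the angular size of the source), are easily evaluated. The sketch below is only an order-of-magnitude aid; the wavelength, bandwidth and angular diameter used in the example are illustrative assumptions, not values from the Narrabri instrument.

```python
import math

# Conversion factor from seconds of arc to radians.
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def coherence_time(bandwidth_hz):
    """Coherence time of the beam, taken as 1 / frequency bandwidth."""
    return 1.0 / bandwidth_hz


def speckle_size(wavelength_m, angular_size_arcsec):
    """Transverse speckle size on the ground, wavelength / angular size."""
    return wavelength_m / (angular_size_arcsec * ARCSEC_TO_RAD)


def detectors_correlated(spacing_m, wavelength_m, angular_size_arcsec):
    """True if two detectors sit within one speckle, i.e. they are
    expected to see correlated intensity fluctuations."""
    return spacing_m < speckle_size(wavelength_m, angular_size_arcsec)


if __name__ == "__main__":
    # Assumed example: blue light, star of 0.001 arc-second diameter.
    print(speckle_size(450e-9, 1e-3))
    print(coherence_time(1e8))  # assumed 100 MHz bandwidth
```

For a star of one thousandth of an arc-second observed in blue light, the speckles are of the order of a hundred meters across, which is why detector spacings of that order are needed to resolve such a source.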
Following the recent results with two telescopes operated as a Michelson
interferometer (sect. 3.4), J. Davis and myself have discussed the potential
applicability of both methods for work at very long baselines, on the order
of 2 kilometers. Whereas baselines as long as one kilometer may be usable
with direct interferometry, it seems that propagation of an electrical signal
is easier over long distances than the undisturbed coherent propagation of

an optical beam. For a 2-kilometer baseline, the optical beam produced by

a 1.5 meter (60 inch) telescope may be propagated in 10 cm piping (no
periscope-type relay lenses being required). Vacuum is also unnecessary,
especially if the pipe structure includes thermal insulation. In spite of their
remarkably low loss characteristics, state-of-the-art fiber-optics light guides
cannot be used since they destroy the temporal coherence. Single-mode
fibers may however become available before interferometric baselines grow
enough to require their use. It is thus difficult to predict the result of the
competition between direct and intensity interferometry in the coming
years, although short baselines appear to favor direct interferometry at
this time.

§ 7. Heterodyne Interferometry
Heterodyning techniques have been successfully employed at radio
wavelengths for the very-long-baseline observations involving antennas
located several thousand kilometers apart. The method consists in beating
light from the star with that from a local oscillator, on a suitable sensor.
Simultaneous work at two stations using the same oscillator frequency,
provides interference information. Heterodyne interferometers for work
at 10.6 microns are currently being developed by TOWNES and his collaborators
[1974] at Berkeley, as well as by GAY and JOURNET [1973] at Observatoire
de Paris. At the University of Utrecht, VAN DE STADT [1973] and Nieuwenhuyzen
are building a system intended for work at 3.4 microns. Like intensity interferometry,
and for the same reason, the heterodyne approach restricts
considerably the spectral band used. Its potential usefulness is generally
considered as marginal in the ultra-violet, visible and near infra-red regions
where photoemissive sensors are available. In the 2 to 10 micron infra-red
range, heterodyning becomes more efficient, but even there it is not yet
clear how it will compete with direct interferometry. Work on direct inter-
ferometry at 10 microns is carried out at Berkeley by D. Cudaback and
J. Franck.
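
The restriction of the spectral band can be made concrete by converting an electrical bandwidth into the equivalent wavelength interval, using the standard relation between a frequency interval and a wavelength interval (wavelength squared times frequency interval, divided by the speed of light). The 1 GHz electrical bandwidth in the sketch below is an assumption chosen for illustration, not a figure from the instruments cited.

```python
C = 299_792_458.0  # speed of light, in m/s


def optical_frequency(wavelength_m):
    """Optical frequency corresponding to a wavelength, in Hz."""
    return C / wavelength_m


def wavelength_bandwidth(wavelength_m, electrical_bandwidth_hz):
    """Wavelength interval equivalent to a frequency interval,
    delta_lambda = lambda**2 * delta_f / c."""
    return wavelength_m ** 2 * electrical_bandwidth_hz / C


if __name__ == "__main__":
    wavelength = 10.6e-6   # meters, as in the text
    bandwidth = 1e9        # Hz, assumed electrical bandwidth
    d_lambda = wavelength_bandwidth(wavelength, bandwidth)
    # Fraction of the optical spectrum actually used by the detector.
    print(d_lambda, d_lambda / wavelength)
```

Under this assumption only a few parts in a hundred thousand of the optical frequency are used at 10.6 microns, and the same electrical bandwidth corresponds to a far smaller fraction at visible wavelengths.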

§ 8. Conclusions
Present trends in the fields concerned with high-resolution observation
at optical wavelengths indicate the likelihood of major improvements in
the coming decade. The directions which appear to hold most promise are:
1. synthetic-aperture arrays of telescopes. It is unclear yet how far it will
be possible to push this technique which is still in its infancy, but orders
of magnitude will probably be gained in the resolution and luminosity
of interferometric work; 2. imaging techniques related to the rubber
telescope concept may succeed in providing improved images of the bright
stars and fainter sources in their vicinity. The corresponding devices may
become useful accessories to spectrographic equipment and also to tele-
scopes participating in array work.
Satellite-based telescopes such as NASA's proposed Large Space Telescope
also provide remarkable new possibilities. Concerning synthetic
aperture arrays, however, it is likely that ground-based systems in the
100-meter range of size will be operational before equivalent space systems.
The experience gathered with them should help in designing space systems
for completely diffraction-limited performance. Such systems will open
a new era in optical astronomy.

Restoration of atmospherically degraded images, 1966, Woods Hole Summer Study, Nat.
Acad. of Sciences, Nat. Res. Council.
Synthetic aperture Optics, 1967, Woods Hole Summer Study, Nat. Acad. of Sciences, Nat.
Res. Council.
Synthetic Aperture Optics, 1970, ed. M. W. Stockton, Optical Sciences Center Report 58,

ADLARD COLES, K., 1967, in: Heavy weather sailing.
ANDERSON, J. A., 1920, Ap. J. 51, 263-275.
BABCOCK, H. W., 1953, Pub. Astr. Soc. Pac. 65, 229.
BATES, R. H. T., P. T. GOUGH and P. J. NAPIER, 1973, Astr. and Ap. 22, 319.
BEAVERS, W., 1963, Astron. J. 68, 273.
BLUM, E. and M. CAGNET, 1961, Comptes Rendus, p. 253-265.
BOKSENBERG, A., 1972, in: Auxiliary Instrumentation for Large Telescopes, ESO/CERN
conference, Geneva, 1972.
BONNEAU, D. and A. LABEYRIE, 1973, Ap. J. Letters, 181, 1.
BORN, M. and E. WOLF, 1970, Principles of Optics (Pergamon Press, 4th ed.).
BOWEN, I. S., 1964, Astron. J. 69, 816.
BOYER, W., private communication.
BRECKINRIDGE, J. B. and J. W. HARVEY, Kitt Peak Nat. Obs. report.
CAGNET, M., 1973, Optics Comm. 8, 4.
CODE, A. D., 1973, in: Ann. Rev. Astron. Ap., p. 239.
CURRIE, D. G., S. L. KNAPP and K. M. LIEWER, 1974, Ap. J. 187, 131.
DAINTY, J. C., 1973, Optics Comm. 7, 129.
DAINTY, J. C., 1974, Mon. Not. R. Astr. Soc. 169, 63434.
DAINTY, J. C., 1976, in: Laser Speckle and Related Phenomena, ed. J. C. Dainty (Springer
Verlag, Berlin) in print.
DYSON, F. J., 1974, in preparation.
ELLIOT, J. L. and I. S. GLASS, 1970, Astron. J. 75, 10.
FINSEN, W., 1964, Astron. J. 69, 319-324.
FIZEAU, H., 1868, Comptes Rendus, p. 66-934.
FRIED, D. L., 1966, J.O.S.A. 56, 1372.
GAVIOLA, E., 1949, Astron. J. 1178, 155.
GAY, J. and A. JOURNET, 1973, Nature 241, 32.
GEZARI, D., A. LABEYRIE and R. V. STACHNIK, 1972, Ap. J. Letters 173, L1-L5.
GOODMAN, J. W., 1965, Proc. IEEE 53, 1688.
GOODMAN, J. W. and RHODES, 1973, private communication.
GUSKOVA, O. and D. V. KOROLKOV, 1973, Izd-vo Naouka, 5, 112-118.
HANBURY BROWN, R. and R. Q. TWISS, 1958, Proc. Roy. Soc. A 248, 199-221.
HANBURY BROWN, R., 1968, in: Ann. Rev. Astron. and Ap. 6, 13.
HANBURY BROWN, R., J. DAVIS and L. R. ALLEN, 1974, M.N.R.A.S. 167, 121-136.
HARDY, J. W., J. FEINLIEB and J. C. WYANT, 1974, in: Optical propagation through turbulence,
OSA meeting, July 1974, Boulder.
HUFNAGEL, R. E. and N. R. STANLEY, 1964, J.O.S.A. 54, 52.
JENNISON, R. C., 1967, Introduction to Radioastronomy, Philosophical Library, New York,
p. 125-129.
KNOX, K. T. and B. J. THOMSON, Ap. J. Letters 182, L133.
KOLMOGOROFF, A. N., 1941, D.A.N. SSSR 30 (4), 229.
KORFF, D., G. DRYDEN and M. G. MILLER, 1972, Optics Comm. 5, 187.
KOZLOV, Yu. G., 1968, Opt. and Spectroscopy 25, 424-425.
KULAGIN, E. S., 1970, Soviet Phys.-Astronomy 13, 6.
LABEYRIE, A., 1970, Astron. and Ap. 6, 85-87.
LABEYRIE, A., 1974, Nouv. Rev. d'Optique 5, 3.
LABEYRIE, A., D. BONNEAU, R. V. STACHNIK and D. G. GEZARI, 1974, Ap. J. 194, L147-L151.
LABEYRIE, A., 1975, Japan J. Appl. Phys., to appear.
LABEYRIE, A., 1975, Ap. J. 196, L71-L75.
LAWRENCE, R. S. and J. W. STROHBEHN, 1970, Proc. IEEE 58, 1523.
LEE, R. W. and J. C. HARP, 1969, Proc. IEEE 57, 375.
LEIGHTON, R. B., 1966, Scientific American 194, 157.
LIU, Y. C. and A. W. LOHMANN, 1973, Optics Comm. 8, 4.
LUMLEY, J. L. and H. A. PANOVSKY, 1964, The Structure of Atmospheric Turbulence
(Interscience, New York).
LYNDS, R., 1973, private communication.
MARTIN, F., J. BORGNINO and F. RODDIER, 1975, Nouv. Rev. d'Optique 6, 1.
MCGLAMERY, B. J., 1967, J.O.S.A. 57, 293.
MICHELSON, A. A., 1920, Ap. J. 51, 257-262.
MICHELSON, A. A. and F. G. PEASE, 1921, Ap. J. 53, 249-259.
MIKESELL, A. M., 1955, U.S. Naval Observatory, 2nd series, 17, p. 141, part II.
MILLER, R. H., 1966, Science 153, 3766.
MILLER, R. H., 1971, AURA technical report no. 40, Tucson.
MULLER, R. A. and A. BUFFINGTON, 1974, in: Optical propagation through turbulence, OSA
meeting, July 1974, Boulder.
ODGERS, G. J. and E. H. RICHARDSON, 1972, J.R.A.S. Canada 66, 2.
PEASE, F. G., 1930, Armour Engineer 16, 125-128.
PEASE, F. G., 1931, Ergebn. Exakten Naturwiss. 10, 8 6 9 6.
PLATT, J. R., 1957, Ap. J. 125, 601.
PROTHEROE, W. M., 1961, Pub. Univ. Pennsylvania (Flower and Cook Observatory) 90, 27.
ROCCA, A., F. RODDIER and J. VERNIN, 1974, J.O.S.A. 64, 1000.
RODDIER, C., 1971, private communication.
RODDIER, C. and F. RODDIER, 1973, J.O.S.A. 63 (6), 661.
ROGSTADT, 1967, in: Synthetic Aperture Optics, Woods Hole Summer Study, Nat. Acad. of
Sciences, Nat. Res. Council.
RÖSCH, J., G. WLERICK and BOUSSUGE, 1961, Travaux de l'Obs. Pic du Midi, 4.
STACHNIK, R. and P. NISENSON, 1973, private communication.
STEPHAN, H., 1873, Comptes Rendus 76.
TOWNES, C., 1974, private communication.
VAN DE STADT, 1973, submitted to Astr. and Ap.
VERNIN, J. and F. RODDIER, 1973, J.O.S.A. 63 (3), 270.
YOUNG, A. T., 1969, App. Optics 8, 869.
ZERNIKE, F., 1938, Physica 5, 785.




GTE Laboratories Incorporated,
Waltham, Massachusetts, 02154, U.S.A.


Lawrence Livermore Laboratory,
Livermore, California, 94550, U.S.A.

* Work performed under the auspices of the U.S. Atomic Energy Commission.


§ 1. INTRODUCTION
§ 2. HISTORICAL DEVELOPMENTS
§ 3. RARE-EARTH ENERGY LEVELS
§ 5. RADIATIVE DECAY
§ 6. MULTIPHONON RELAXATION
§ 7. COOPERATIVE RELAXATION
§ 8. SELECTED APPLICATIONS
§ 9. CONCLUDING REMARKS
REFERENCES
§ 1. Introduction

The processes active for the relaxation of rare-earth ions in excited

electronic states include (1) radiative decay, (2) nonradiative decay wherein
the excitation energy is converted into vibrational quanta of the sur-
roundings, and (3) nonradiative transfer of energy between like and unlike
ions with possible degradation of the excitation. These relaxation phe-
nomena for rare-earth ions in solids are treated in this article. In the past
decade considerable progress has been made in the experimental and
theoretical investigation of rare-earth relaxation and in extending our
understanding of these processes. Our objective in this review is to provide
a coherent summary of the totality of advances made in this area in recent years.
The upsurge of interest and activity devoted to relaxation phenomena
has been prompted by a combination of circumstances. First, as a result
of extensive studies of rare-earth spectra, many of the electronic properties
of the rare-earths such as the lower energy level schemes and line intensities
are now well established. A good survey of the theoretical understanding of
these properties in the mid-1960s was given by WYBOURNE [1965]. It was
natural, therefore, that attention should turn to other questions such as
ion-lattice interactions and cooperative phenomena, both of which are
essential to a complete understanding of luminescence and relaxation.
Concurrently, there has been a growth in the utilization of rare-earth-
activated materials. These include not only phosphors for photolumines-
cence, cathodoluminescence, and radioluminescence applications but more
recently quantum electronic devices such as lasers, quantum counters,
and infrared-to-visible upconverters.
Because of these device implications, particular emphasis will be devoted
to presenting the current knowledge of rare-earth relaxation phenomena
in a form which will be of utility to workers in applied fields. We seek to
provide sufficient depth to ensure a complete presentation but not an
exhaustive one. Esoteric discussions which would be of interest only to
specialists are avoided. The approach to the subject is generally phenomenological.
This seems appropriate at present since satisfactory ab initio
calculations of the rates of most relaxation processes are still beyond our
capabilities. The phenomenological approach, on the other hand, has
established parameters and guidelines which are useful in predicting and
analyzing the behavior of rare-earth systems.
Given an ion in an excited electronic state, we shall be interested in the
rates and relative importance of the various radiative and nonradiative
processes active for relaxation. The excited states of the rare-earth ion to
be considered include those of the ground $4f^N$ electronic configuration or
of higher-lying configurations, such as $4f^{N-1}5d$. Excitation via charge
transfer states, while active for some rare-earth phosphors, is more special-
ized and is not treated explicitly. The excitation may be produced by optical
radiation (photoluminescence), electron beam (cathodoluminescence), or
other ionizing radiation (radioluminescence) such as X-rays or gamma
rays. We shall not be concerned with the specific excitation process involved
or its rate, since in general the relaxation phenomena to be treated are
common for all forms of initial excitation.
The rare earths comprise two series: the lanthanides - lanthanum (57)
through lutetium (71), and the actinides - actinium (89) through lawrencium
(103). Because of the limited number of investigations and demonstrated
uses of actinide ions, the lanthanide ions will be treated almost exclusively.
The extension of the concepts to the actinides should be evident. Through-
out, the concern will be with the relaxation of excited rare earths present
in crystals, glasses, and liquids rather than in the free ion or gaseous state.
The behavior of rare earths in crystals has received particular attention
in the study of ion-lattice interactions since the phonon properties and
eigenstates are better defined.
A brief historic review of the evolution of the spectroscopy of rare earths
which set the stage for relaxation studies is presented in section 2. This is
followed in sections 3 and 4 by a survey of the energy levels and the excita-
tion and decay modes of the rare earths. The next three sections then treat,
respectively, relaxation by radiative decay (5), multiphonon processes (6),
and ion-ion interactions (7). Finally, in section 8, some typical examples
are given of the use of this information in the discovery and development
of rare-earth activated luminescent materials for specific applications.

§ 2. Historical Developments


In 1907 BECQUEREL [1907] first made spectroscopic observations of
crystalline rare-earth salts at low temperatures. He was intrigued by the
observation that at low temperatures the broad absorption bands resolved
into a multitude of sharp lines. This feature is indicative of the unique
characteristics of these materials which over a period of more than a half
century have formed a basis for the rich history of rare-earth spectroscopy.
A firm scientific basis for the interpretation of rare-earth spectra was
made possible by the theoretical work of BETHE [1930] and KRAMERS [1930]
in the late nineteen twenties. An upsurge of interest in experimental studies
followed, most notably in work of Spedding and Freed and their col-
laborators on the absorption spectra, in work on the fluorescence spectra
in Germany, and in work in the Soviet Union by Zaidel and his collab-
orators. This early work was directed toward the location and identification
of the large number of energy levels that contributed to the observed
complex, sharp-line spectrum. The theory continued to advance, with a
notable contribution by VAN VLECK [1937] who explained the origin of
the observed optical transitions.
The greatest advances in understanding the details of rare-earth spectra
came in the decade immediately following World War II with the availability,
for the first time, of large quantities of rare earths in high purities.
The availability of liquid-helium temperatures also permitted experimental
solution of previously intractable spectroscopic problems. In the years
immediately after the war the work of Hellwege and his collaborators,
as well as that of the Dutch investigators, pioneered the interest in this
activity. Corresponding theoretical work was carried out, particularly by
Jorgensen and by the group at Oxford. The latter work was stimulated by
experimental activity in the paramagnetic resonance of rare-earth ions.
With the introduction of the powerful techniques of tensor calculus and
crystal-field theory, the stage was set for a major thrust during the late
nineteen fifties in the detailed investigation of rare-earth spectra, especially
in the precise assignment of energy levels and in the generation of rare-earth
wavefunctions in crystals. By the early sixties the Johns Hopkins group,
under the direction of DIEKE [1961], had generated a complete set of energy
level assignments for all of the trivalent rare-earth ions in the anhydrous
trichlorides. These energy level diagrams are summarized in Fig. 2.1.


















[Energy-level diagram; columns: Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb.]

Fig. 2.1. Observed energy levels of the trivalent rare earths. The width of the levels indicates
the approximate magnitude of the Stark splitting in LaCl$_3$. The semicircles denote those levels
from which fluorescence has been observed in LaCl$_3$ (after DIEKE [1968]).


At this time, with the lower energy-level structure of the rare earths
established, activity became directed toward some of the more subtle
phenomena, among which were the relaxation processes accompanying
the luminescence.* A particular stimulus for these studies from the early
sixties onward was the development and increasing importance of lasers
involving rare-earth ions in crystals, glasses, and liquids.
Over the years, the field of relaxation phenomena has received review
treatment in various forms. The monumental treatise of ELYASHEVICH
[1953] contained a summary and detailed description of work up to that
time. Review papers in the early 1960s by DIEKE [1961, 1963] were in the
nature of updated reports. More recently, reviews by MOOS [1970], GRANT
[1971] and KUSHIDA [1973a, b, c] contained discussions of one or more
aspects of this field, and a broad coverage was given in the book by
DI BARTOLO [1968].
Although only recently studied in detail, relaxation effects were mani-
fested very early in the study of rare-earth luminescence. The question
of the source of the radiative transitions was, as noted above, elucidated
by Van Vleck. This model was confirmed experimentally by BROER, GORTER
and HOOGSCHAGEN [1946]. The symmetry properties of the crystal field
were invoked in early spectroscopic studies to account for the selection
rules and polarization observed in these radiative transitions. With the
availability of good energy level assignments and wavefunctions, calculations
of oscillator strengths for magnetic-dipole transitions could be
made. Theoretical calculations of electric-dipole strengths also became
possible with the techniques and phenomenological approach introduced
by JUDD [1962] and OFELT [1962]. Experimental verification of spectral
intensity calculations is now well established.
As for nonradiative relaxation phenomena, very early qualitative
observations indicated their importance in affecting the luminescence
properties. It was first noted in the hydrated salts that only the ions in the
center of the rare-earth series exhibited fluorescence. A variety of param-
eters appeared to be related to their stability, but the importance of the
H$_2$O vibrations in luminescence quenching was soon recognized. The
stability of these central ions was explained by HELLWEGE [1942] in terms

* In a paper titled "The Puzzle of Rare-Earth Spectra in Solids", VAN VLECK [1937] noted
that "Decision between the various alternatives will probably be possible only when a detailed
spectroscopic classification and Zeeman data become available for various energy levels in
the solid spectrum." This comment is also apropos for the study of relaxation phenomena.

of the electronic energy that had to be taken up by vibrational energy.
The larger the energy separation between the excited state and the next-
lower state, and hence the greater the number of vibrational quanta required
to conserve energy, the smaller would be the probability for nonradiative
decay.
Similar observations were made in anhydrous crystalline systems. For
example, it was known that in the anhydrous rare-earth trichlorides only
those levels separated by at least 1000 cm$^{-1}$ from the next-lower level were
sufficiently stable for radiative decay to be observed. (Fluorescing levels
are indicated by half circles in Fig. 2.1.) In the late 1950s, extensive measurements
of the lifetimes were begun which led to the quantification of these
phenomena. BARASCH and DIEKE [1965] and WEBER [1966] observed that
lifetimes were generally shorter, the smaller the energy gap from the
fluorescing level to the level below. Subsequently, by combining lifetime
data with measurements of quantum efficiencies and calculations of the
radiative transition, the actual decay rates for multiphonon relaxation
were extracted. Such measurements indicated a systematic behavior of the
rates with energy gap, with temperature, and with the vibrational frequency
spectrum of the host.
Related theoretical activity was directed toward the treatment of
phonon-induced relaxation. The earliest considerations of the effects of
the dynamic crystalline field were concerned with spin-lattice relaxation
between Zeeman levels of the paramagnetic ground state. Waller, van Vleck
and Kronig were pioneers in the early theories of the 1930s and 40s.
ORBACH's [1961] parametrization of the orbit-lattice interaction led to a
detailed and definitive stage of theoretical and experimental work in this
area in the early nineteen sixties. The same one- and two-phonon relaxation
processes operating between crystalline Stark levels accounted for the
temperature-dependent linewidths observed in optical spectra (YATSIV
[1962], YEN, SCOTT and SCHAWLOW [1964]).
Multiphonon relaxation of excited rare earths involves the simultaneous
emission of several phonons. The treatment of these processes is a more
difficult theoretical problem. KIEL's [1964] treatment using the first-order
dynamic crystal-field interaction in higher-order perturbation theory
indicated the limitations of precise calculations. The phenomenological
theory introduced by RISEBERG and MOOS [1968] accounted satisfactorily
for the observed experimental behavior. Further refinements in the treatment
of multiphonon relaxation were developed by MIYAKAWA and DEXTER
[1970] and by FONG, NABERHUIS and MILLER [1972] using an adiabatic
theory. These have also yielded satisfactory agreement with experiment,
again in a similar phenomenological way. Although detailed calculations
are prohibitively difficult, these semi-empirical approaches have provided
an adequate understanding of the general considerations governing the rates
of multiphonon relaxation.
A similar evolutionary development took place in the field of relaxation
by ion-ion energy transfer. Again it was observed in the spectroscopic
study of rare earths that (1) concentrated materials appeared to show
fluorescence quenching relative to diluted samples and (2) energy transfer
between different ions in the same lattice could take place prior to fluores-
cence, that is, energy absorbed by one ionic species appeared as lumines-
cence from a second group of ions. Much of this work was pioneered by
the group at Philips Laboratories in Holland and was reviewed in an
article by BOTDEN [1952].
A milestone in the understanding of these phenomena was the theoretical
work of DEXTER [1953], who adapted the earlier treatment of FÖRSTER
[1948, 1949] for the case of organic luminophors to explain the sensitization
of luminescence in solids. Later, DEXTER and SCHULMAN [1954]
considered the specialized case of migration of resonant excitation among
like atoms to isolated impurities which act as quenching centers. The
energy transfer mechanism invoked was the electrostatic interaction
between the charge clouds of the two ions; this was expressed as a sum of
multipolar products. Observation of radiative processes involving pairs
of ions was explained by DEXTER [1962] through a straightforward extension
of his earlier theory.
Much of the work on ion-ion interactions in the succeeding ten years
was devoted to the measurement of concentration quenching (VAN
UITERT [1966]) and the assignment of particular multipolar terms governing
the transfer. Exchange interactions and the details of the dynamics of the
energy transfer were treated by INOKUTI and HIRAYAMA [1965]. It was
subsequently shown that the effects of higher-order terms of the multipolar
expansion and exchange must be considered carefully and that the
results of concentration dependence studies were limited in their conclusiveness.
During this period, particularly in the mid 1960s, great emphasis was
placed on the utilization of these ion-pair energy transfer processes for
sensitization of rare-earth lasers (VAN UITERT [1966]). A further stimulus
occurred in 1969-1971 with the investigation of infrared upconversion
phosphors. These studies were predominantly device oriented and tended
to test and verify rather than extend the basic understanding of energy
transfer processes (AUZEL [1973]).
More recently a number of inroads have been made in the detailed
experimental investigation of such processes as resonant energy migration
between like ions, diffusion-limited relaxation, and non-resonant transfer
between unlike ions. Detailed studies of energy diffusion processes led to
some of the first agreements of experiments with theory for electric dipole-
dipole interactions. Additional theoretical work in this area dealt with
some of the subtleties of these phenomena, such as the dependence of
non-resonant transfer rates on the energy mismatch; this was later con-
firmed experimentally. An additional activity during the past few years
has been the observation and study of cooperative luminescence processes
between similar and dissimilar ions.

§ 3. Rare-Earth Energy Levels

Rare earths of the lanthanide series are most commonly incorporated
into materials in the trivalent state. These ions are characterized by a
xenon-like electron structure ($1s^2 2s^2 2p^6 3s^2 3p^6 3d^{10} 4s^2 4p^6 4d^{10} 5s^2 5p^6$)
with a partially filled 4f shell. The ground electronic configuration is $4f^N$;
the first excited configuration is $4f^{N-1}5d$. The relative location and energy
extent of the 4f and 5d configurations for the tripositive rare earths are
shown in Fig. 3.1.
Most of the rare earths can also be stabilized in the divalent state in
appropriate hosts by the use of special growth and post-growth treatment.
Eu2+ and Yb2+, which have half-filled and filled 4f shells, respectively,
have the smallest reduction potentials, followed by Sm2+ and Tm2+. These
are the most commonly occurring dipositive ions. The two lowest con-
figurations for the divalent rare earths are shown in Fig. 3.2. Comparing
Figs. 3.1 and 3.2, the 4f and 5d states of divalent ions are located at lower
energies than for the isoelectronic trivalent ions.
Optical line spectra of rare earths arise from transitions between levels
of the $4f^N$ configuration. The positions of these levels arise from a combination
of the Coulomb interaction among the electrons, the spin-orbit
coupling and the crystalline electric field. The resultant splittings of the
$4f^N$ configuration are shown schematically in Fig. 3.3. The electrostatic
interaction yields terms $^{2S+1}L$ with separations of the order of $10^4$ cm$^{-1}$.
The spin-orbit interaction then splits these terms into J states with typical
splittings of $10^3$ cm$^{-1}$. Finally, the J degeneracy of the free ion states is
partially or fully removed by the crystalline Stark field, yielding a Stark
manifold usually extending over several hundred cm$^{-1}$.

[Figure 3.1; the horizontal axis lists the ions Ce, Pr, Nd, Pm, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb, Lu.]

Fig. 3.1. Approximate extent of the two lowest configurations of the trivalent rare earths.
White - $4f^N$; black - $4f^{N-1}5d$ (after DIEKE [1968]).

Fig. 2.1 provides a good guide to the location of the J states of the tri-
valent rare earths since the centers of gravity of the J manifolds exhibit
only small variations with host. The order and separation of the levels
within a J manifold, on the other hand, vary considerably from host to
host. The overall extent of the crystalline Stark splittings is small on the
energy scale of Fig. 2.1.





Fig. 3.2. Approximate extent of the two lowest configurations of the divalent rare earths.
White - $4f^N$; black - $4f^{N-1}5d$ (after DIEKE [1968]).


Fig. 3.3. Schematic diagram of the splitting of rare-earth energy levels due to the electrostatic,
spin-orbit, and crystal-field interactions.

The complex energy level structures of the rare earths were successfully
unraveled with the advent of the powerful tensor operator techniques and
crystal-field theory. This work has been thoroughly discussed, for example,
by WYBOURNE [1965] and by DIEKE [1968]. The free ion states, obtained
by diagonalizing the combined electrostatic and spin-orbit energy matrices,
are linear combinations of Russell-Saunders states of the form

$$|f^N[\gamma SL]J\rangle = \sum_{\gamma,S,L} C(\gamma SL)\,|f^N\gamma SLJ\rangle. \qquad (3.1)$$

In this intermediate coupling scheme the total angular momentum J is a
good quantum number but the spin and orbital angular momentum
numbers S and L are not (this is denoted by the brackets in eq. (3.1)). $\gamma$
includes whatever other quantum numbers are required to specify the
states. In the labeling of rare-earth states and levels as in Fig. 2.1, the SLJ
designation is usually taken to be the dominant Russell-Saunders state
component in eq. (3.1) or the one in the limit as the spin-orbit coupling
goes to zero.
The crystal field reduces the (2J+1)-fold degeneracy of the above
free-ion states and causes a small admixing of J states. Because of the
shielding effects of the outer 5s and 5p shell electrons, the crystal-field
interaction with the inner 4f electrons is weak. The crystal field can thus
be treated as a perturbation on the free-ion states. The crystal-field potential
is expanded in a series of spherical harmonic terms of the form

$$V_{CF} = \sum_{k,q,i} B_q^k\,\big(C_q^{(k)}\big)_i, \qquad (3.2)$$

where the factors $B_q^k$ are parameters describing the strength of the crystal-field
components, $C_q^{(k)}$ are tensor operator components which transform
as corresponding spherical harmonics, and the summation is over the i
electrons of the ion. The $B_q^k$ are related to the products $A_q^k\langle r^k\rangle$ of the commonly
used parameters $A_q^k$ and radial integrals $\langle r^k\rangle$. The number and types of
terms appearing in the expansion in eq. (3.2) are derivable using group
theory and the point symmetry at the rare earth site. For the f shell, k is
limited to values $\le 6$.
In the above approach the strength of the crystal field is given by a
small number of $B_q^k$ parameters. Attempts to calculate these parameters
using lattice sums and including covalency and overlap effects have achieved
only very limited success. Therefore the values have been determined
experimentally. To do this the energy matrix including the crystal-field
interaction is diagonalized using an estimated set of starting parameters.
The resulting predicted energy levels are compared with the observed
levels and, by an iterative fitting procedure, the parameters are adjusted
to obtain a best overall fit to experiment. When the positions of many
levels have been measured, and the site symmetry is high so that only a
few terms appear in the expansion in eq. (3.2), root-mean-square deviations
of observed and calculated energies as small as 10 cm$^{-1}$ have been obtained.
The crystal-field parameters have also been interpreted using a superposition
model pioneered by NEWMAN [1971]. The field is assumed to
arise from a sum of independent contributions from the other ions in the
crystal. All processes contributing to the crystal field are included in this
model.
Once the crystal-field parameters for a given ion-host system have been
determined, a complete set of energy levels and eigenstates can be computed.
These states are labeled by a crystal quantum number $\mu$ and are
of the form

$$|f^N[\gamma SL]J\mu\rangle = \sum_{\gamma,S,L,J,J_z} C(\gamma SLJJ_z)\,|f^N\gamma SLJJ_z\rangle. \qquad (3.3)$$

These states can be used to calculate matrix elements for radiative and
nonradiative transitions between any rare-earth $f^N$ energy levels of interest.
Some of the levels from higher-lying electronic configurations (including
5d and 6s) have been studied for rare earths in crystals (see, for example,
LOH [1968]). In general, however, the locations of higher-lying con-
figurations are not well established because most of the levels are at energies
beyond the readily accessible optical region and above the fundamental
absorption edge of many materials. For the 5d states there is a strong
interaction between the outer 5d states and the static and dynamic crystal
fields. As a consequence, the locations of the 5d states vary by many
thousands of cm$^{-1}$ in different materials and the optical transitions have
large linewidths.
The locations of states having parity opposite to that of the $4f^N$ configuration
are of interest since, as discussed in section 5, electric-dipole
transitions between 4f states become allowed via admixing with these
states. Hence the locations of states such as $4f^{N-1}5d$ and $4f^{N-1}5g$ can
affect the rate of radiative decay from 4f states.

§ 4. Excitation and Decay in Rare-Earth Systems

As we have seen in the preceding section, rare-earth ions have a complex
energy level structure. Given such a multilevel system, which levels may
be expected to fluoresce, under what conditions, and with what efficiency?
Before examining in detail the processes which govern the answers to the
above questions, we review the decay schemes and terminology used in
describing rare-earth relaxation phenomena. At this point only radiative
decay and nonradiative decay by multiphonon emission will be considered
since they are the most basic processes active and are always present to
some degree. Decay schemes arising from ion-ion interactions are reviewed
in the section devoted to these processes.
Transitions in the optical region of the spectrum occur between J levels
of different terms. The crystalline Stark splitting results in fine structure
and groups of closely spaced lines. Since all levels are radiatively coupled
to other levels with varying transition probabilities, emission or absorption
experiments involving rare earths in crystals may exhibit hundreds of sharp
lines. In glasses, because of local variations in the symmetry and range of
the surrounding charges, there is usually a distribution of Stark splittings
and the structure may not be resolved in optical spectra because of in-
homogeneous broadening. Similarly in liquids the averaged field produces
broad spectral lines. Early spectroscopic studies were directed toward the
rationalization of the locations of spectral lines and their assignment to
specific energy levels. Relaxation studies are concerned with the appearance
or absence of emission lines and their intensities.
The decay of rare-earth luminescence is principally a result of transitions
between J manifolds. Relaxation between levels within a given J manifold
is rapid because the level separations are generally within the range of
phonon energies and hence one- and two-phonon processes are very
probable. These nonradiative processes cause lifetime broadening of rare-earth
spectra and are active for spin-lattice relaxation as studied in paramagnetic
resonance experiments. At 300 K, the rate of intra-manifold
relaxation may be $\gtrsim 10^{12}$ sec$^{-1}$. Because of the fast thermal equilibration
among the Stark levels, the J manifold can, in most cases, be treated as a
whole when considering the slower radiative and nonradiative decay rates
to other J manifolds. The radiative and nonradiative transition probabilities
from individual Stark levels, however, are not equal and thus, as we shall
see, it is sometimes necessary to consider the Boltzmann population and
transition probabilities from individual Stark levels.
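
The role of the Boltzmann populations can be illustrated numerically. The following Python sketch assumes thermal equilibrium within the J manifold and computes the population-weighted decay rate of the manifold. The function names, Stark energies, and rates are hypothetical and chosen only for illustration.

```python
import math

K_B_CM = 0.6950348  # Boltzmann constant in cm^-1 per kelvin


def boltzmann_populations(energies_cm, temperature_k, degeneracies=None):
    """Fractional populations of Stark levels assumed to be in thermal equilibrium."""
    if degeneracies is None:
        degeneracies = [1] * len(energies_cm)
    lowest = min(energies_cm)
    weights = [g * math.exp(-(e - lowest) / (K_B_CM * temperature_k))
               for e, g in zip(energies_cm, degeneracies)]
    total = sum(weights)
    return [w / total for w in weights]


def effective_rate(energies_cm, rates, temperature_k, degeneracies=None):
    """Population-weighted decay rate of the whole manifold."""
    populations = boltzmann_populations(energies_cm, temperature_k, degeneracies)
    return sum(p * w for p, w in zip(populations, rates))


# Hypothetical manifold: two Stark levels 100 cm^-1 apart with different decay rates (sec^-1).
stark_energies = [0.0, 100.0]
stark_rates = [1000.0, 5000.0]
rate_77k = effective_rate(stark_energies, stark_rates, 77.0)
rate_300k = effective_rate(stark_energies, stark_rates, 300.0)
```

In this toy case the upper Stark level has the larger rate, so the effective rate of the manifold rises with temperature as that level becomes populated.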
The simplified energy level diagram for a rare earth in Fig. 4.1 illustrates
possible decay modes involving transitions between J manifolds. Consider
a photoluminescence experiment beginning with the absorption of optical
radiation causing excitation from the ground state 0 to one or more of the
ensemble of upper levels 3, 4, 5. When the upper levels are closely spaced,
relaxation occurs predominantly by a level-by-level nonradiative cascade
to level 3, denoted by the wavy line transitions. If the energy gap between
levels 3 and 2 is large, nonradiative decay by multiphonon emission is less
probable and radiative transitions from level 3 to terminal levels 2, 1, 0,
denoted by straight lines, become important. Levels 2 and 1 subsequently
relax to the ground state by multiphonon emission in this example. Fig. 4.1
is a fair representation of many of the relaxation schemes active for the
trivalent rare-earth energy levels in Fig. 2.1.

Fig. 4.1. Schematic energy level diagram showing the radiative (straight line) and nonradiative
(wavy line) decay modes of a rare-earth ion following optical excitation.

The rate of relaxation of an excited J state is governed by a combination
of probabilities for radiative and nonradiative processes. The lifetime $\tau_a$ of
an excited state a is given by

$$\frac{1}{\tau_a} = \sum_b W_{ab}^{R} + \sum_b W_{ab}^{NR}, \qquad (4.1)$$

where the summations are for transitions terminating on all final states b.
The radiative probability $W^R$ includes both purely electronic and phonon-assisted
transitions; the nonradiative probability $W^{NR}$ includes relaxation
by multiphonon emission and effective energy transfer rates arising from
ion-ion interactions. The radiative quantum efficiency $\eta_a$ is defined by

$$\eta_a = \frac{\sum_b W_{ab}^{R}}{\sum_b W_{ab}^{R} + \sum_b W_{ab}^{NR}} = \tau_a \sum_b W_{ab}^{R}. \qquad (4.2)$$

It is also equal to the ratio of the number of photons emitted from level a
to the number of ions excited by photons into level a.
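
As a minimal sketch of these definitions, the Python function below takes the lifetime as the inverse of the total decay rate and the quantum efficiency as the radiative fraction of that rate. The function name and the numerical rates are hypothetical.

```python
def lifetime_and_efficiency(radiative_rates, nonradiative_rates):
    """Return (lifetime, radiative quantum efficiency) of a level.

    Both arguments are lists of rates (sec^-1) to the individual final states b.
    """
    w_r = sum(radiative_rates)
    w_nr = sum(nonradiative_rates)
    total = w_r + w_nr
    if total <= 0:
        raise ValueError("total decay rate must be positive")
    return 1.0 / total, w_r / total


# Hypothetical level: two radiative channels and one multiphonon channel.
tau, eta = lifetime_and_efficiency([300.0, 100.0], [600.0])
```

With these illustrative numbers the level decays mostly nonradiatively, so its efficiency lies below one half.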
Depending upon the specific rare-earth energy levels and host involved,
the relative probabilities for radiative or nonradiative decay between given
levels may range from comparable values to the two extremes where
$W_{ab}^{R} \gg W_{ab}^{NR}$ or $W_{ab}^{NR} \gg W_{ab}^{R}$. The radiative quantum efficiency of a level
may therefore approach zero or unity. Determination of the relative and
absolute magnitudes of $W^R$ and $W^{NR}$ constitutes the basic thrust in the
study of relaxation phenomena.
Determination of any two of the quantities $\tau$, $\eta$, $\sum W^R$, $\sum W^{NR}$ in eqs.
(4.1)-(4.2) is sufficient to derive the other two quantities. Several
approaches, theoretical and experimental and combinations thereof, have
been applied to accomplish this. As will become evident later, meaningful
ab initio calculations of either radiative or nonradiative decay rates are
still beyond our present capabilities. Therefore all approaches involve
experiment and phenomenological treatment.
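
The statement that two of the four quantities fix the other two can be made concrete. The sketch below assumes the total decay rate is the inverse lifetime and the quantum efficiency is the radiative fraction of it. It shows two common starting pairs: a measured lifetime with a calculated radiative rate, and a measured lifetime with a measured efficiency. Function names and numbers are hypothetical.

```python
def from_lifetime_and_radiative_rate(tau, w_r_total):
    """Given lifetime (sec) and total radiative rate (sec^-1), return (efficiency, W_NR)."""
    total = 1.0 / tau
    w_nr_total = total - w_r_total
    if w_nr_total < -1e-9 * total:
        raise ValueError("radiative rate exceeds the total decay rate 1/tau")
    return w_r_total * tau, max(w_nr_total, 0.0)


def from_lifetime_and_efficiency(tau, eta):
    """Given lifetime (sec) and quantum efficiency, return (W_R, W_NR) in sec^-1."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("quantum efficiency must lie between 0 and 1")
    return eta / tau, (1.0 - eta) / tau


# Hypothetical measurement: 1 msec lifetime, calculated radiative rate of 400 sec^-1.
eta_a, w_nr = from_lifetime_and_radiative_rate(1.0e-3, 400.0)
w_r_b, w_nr_b = from_lifetime_and_efficiency(1.0e-3, 0.4)
```

Both routes give the same nonradiative rate for consistent inputs, which provides a simple cross-check when lifetime, efficiency, and a calculated radiative rate are all available.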
Experimentally, with the present existence of fast, intense pulsed light
sources and selective excitation techniques, measurements of excited-state
lifetimes present no serious problems. For a second quantity, measurements
can be made of the quantum efficiency $\eta$, but care is required to avoid the
introduction of systematic errors. As an alternative, measurements of the
line strengths in absorption have been combined with the Einstein A-B
relations to determine the corresponding emission probability $W^R$. Knowledge
of the rare-earth concentration is required, however, to obtain the
absolute intensities of absorption spectra.
When transitions to several terminal J states are present, relative fluorescence
intensities must be measured to determine the total $\sum W^R$. This
becomes tedious if the wavelength and intensity ranges involved become
large. Different detectors may be required for different spectral ranges and
accurate calibration of the spectral sensitivity of the detection system is
required.
Success in calculating $W^R$ using the Judd-Ofelt approach and phenomenological
intensity parameters has prompted several studies wherein
calculations of $W_{ab}^{R}$ are combined with measurements of $\tau_a$ to determine
the other quantities in eqs. (4.1) and (4.2). This approach can also be
tedious since it generally requires measurements of absorption spectra to
determine the best set of intensity parameters and computation of transition
matrix elements. In addition, whereas overall the r.m.s. errors between calculated
and measured line strengths may be small ($\lesssim$ 10%), the strengths
of individual transitions may exhibit larger variations. As discussed in the
following section, the Judd-Ofelt approach works best for the lower-lying
states of the $4f^N$ configurations.

§ 5. Radiative Decay


Optical transitions between electronic states of rare-earth ions are predominantly
of electric-dipole nature. Magnetic-dipole and electric-quadrupole
transitions are allowed but their contributions to radiative decay
are generally small or negligible. Given appropriate eigenstates for the
rare earth, the probabilities for magnetic-dipole (MD) and electric-quad-
rupole (EQ) transitions can be calculated ; similar calculations for electric-
dipole (ED) transitions are not possible, however. The reasons for this are
reviewed below together with the Judd-Ofelt treatment which has been
applied to interpret the optical intensities of rare earths. The latter approach,
combined with empirical intensity parameters, can be employed to calculate
radiative decay probabilities.

5.1.1. Electric-dipole transitions: Judd-Ofelt theory

The selection rules for electric-dipole transitions are $\Delta l = \pm 1$, $\Delta S = 0$,
$|\Delta L|, |\Delta J| \le 2l$, where for lanthanide series ions $l = 3$. Transitions between
$f^N$ levels involve no change in parity and hence ED transitions are forbidden
by Laporte's rule. Such transitions become allowed if odd harmonics in
the static or dynamic crystal field admix states of opposite parity into $4f^N$.
This can occur statically if the rare-earth ion resides in a lattice site lacking
inversion symmetry. Noncentrosymmetric sites occur for the following
point group symmetries: $C_s$, $C_1$, $C_2$, $C_{2v}$, $C_3$, $C_{3v}$, $C_4$, $C_{4v}$, $C_6$, $C_{6v}$, $D_{2d}$,
$D_3$, $D_{3h}$, $D_4$, $D_6$, $C_{3h}$, $S_4$, $T$, $T_d$, $O$.
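Consider the admixing caused by the odd-parity terms (k odd) in the
crystal-field expansion in eq. (3.2). To first order, the eigenstate for the
ath level of the $f^N$ configuration is

$$|\psi_a\rangle = |\phi_a\rangle + \sum_\beta \frac{\langle\phi_\beta|V_{odd}|\phi_a\rangle}{E_a - E_\beta}\,|\phi_\beta\rangle, \qquad (5.1)$$

where $V_{odd}$ is the odd-parity part of the crystal field and the summation is over
states $\beta$ of configurations having opposite parity.
Possible excited configurations include $4f^{N-1}n'l'$, where $n'l'$ = 5d or 5g,
and core excitations $3d^9 4f^{N+1}$. Matrix elements of the electric-dipole
operator P between the states in eq. (5.1) are given by

$$\langle\psi_a|P|\psi_b\rangle = \sum_\beta \Big[(E_a - E_\beta)^{-1}\langle\phi_a|V_{odd}|\phi_\beta\rangle\langle\phi_\beta|P|\phi_b\rangle + (E_b - E_\beta)^{-1}\langle\phi_a|P|\phi_\beta\rangle\langle\phi_\beta|V_{odd}|\phi_b\rangle\Big], \qquad (5.2)$$

where

$$P_q = -e\sum_i r_i\,\big(C_q^{(1)}\big)_i.$$
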

Any attempt at ab initio calculation of the matrix elements in eq. (5.2) is
plagued by numerous unknown quantities and difficulties. These include:
(1) the values for $B_q^k$ (k odd) in $V_{CF}$ which, as discussed earlier in connection
with the even k terms in the static crystal field case, have not been calculated
satisfactorily; (2) the energy denominators, since the energy levels $E_\beta$ of
the excited configurations are generally not known; (3) the interconfigurational
radial integrals $\int R(4f)\,R(n'l')\,r\,dr$, which require good radial
wavefunctions R for their evaluation; and finally (4) lengthy computations
involving consideration of many states.
A decade ago JUDD [1962] and OFELT [1962] independently showed
that calculations of ED probabilities could be made tractable by treating
the $4f^N$ and the excited, opposite-parity configurations as degenerate with
a single average energy separation and by replacing $C_p^{(1)}|\phi_\beta\rangle\langle\phi_\beta|C_q^{(k)}$ by the
tensor operator component $U_{p+q}^{(t)}$ of even order t. By so doing one can
invoke closure over the summation in eq. (5.2). The electric-dipole matrix
element for the pth component of polarization can then be expressed
simply as

$$\langle\psi_a|P_p|\psi_b\rangle = \sum_{q,\;t\ \mathrm{even}} Y(t,q,p)\,\langle f^N\gamma SLJJ_z|U_{p+q}^{(t)}|f^N\gamma'S'L'J'J_z'\rangle, \qquad (5.3)$$

where the energy denominator $(E_{n'l'} - E_{4f})_{av}$, the crystal-field parameters,
and the radial integrals are all incorporated into the phenomenological
parameters $Y(t,q,p)$. The number and type of components $A_q^k$ which
enter into the $Y(t,q,p)$ can be determined from group theory. Since at
most f and g states are involved, $k \le 7$. Application of this approach to the
intensities of transitions between crystalline Stark levels is dependent upon
the availability of reliable crystal-field eigenstates. When ions reside in
sites of low symmetry, these are difficult to obtain.
In an intermediate coupling scheme, the line strength $S = |\langle\psi_a|P|\psi_b\rangle|^2$
for ED transitions reduces to a simple expression containing three intensity
parameters given by

$$S(aJ; bJ') = e^2\sum_{t=2,4,6}\Omega_t\,\big|\langle f^N[\gamma SL]J\|U^{(t)}\|f^N[\gamma'S'L']J'\rangle\big|^2. \qquad (5.4)$$

The eigenstates are of the form in eq. (3.1) and the matrix elements of $U^{(t)}$
can be derived (WYBOURNE [1965]) using tabulated doubly reduced matrix
elements of NIELSON and KOSTER [1964] and 3-j and 6-j symbols. The
expression for the electric-dipole line strength arising from one-phonon
vibronic transitions is identical in form to that given in eq. (5.4) (JUDD
[1962]); hence this contribution is included in the intensity parameters $\Omega_t$.
The oscillator strength f and the absorption cross section $\sigma$ for a transition
of frequency $\nu$ are related to S by

$$f(aJ; bJ') = \frac{8\pi^2 m\nu}{3(2J+1)he^2}\,S(aJ; bJ') \qquad (5.5)$$

and

$$\int\sigma(\nu)\,d\nu = \frac{\pi e^2(n^2+2)^2}{9nmc}\,f(aJ; bJ'), \qquad (5.6)$$

and thus can also be determined once the intensity parameters are known.
In eq. (5.6), n is the index of refraction of the host. Due to the ED forbiddenness
of f-f transitions, the oscillator strengths for transitions between
J states are small, of the order of $10^{-6}$.
For relaxation one is interested in the spontaneous emission probability
given by

$$A(aJ; bJ') = \frac{64\pi^4\nu^3\chi}{3(2J+1)hc^3}\,S(aJ; bJ'). \qquad (5.7)$$

$\chi$ is the local field correction and for ED transitions is approximated by
$n(n^2+2)^2/9$. The radiative lifetime and the fluorescence branching ratios
from a level a are defined by

$$\tau_a^R = \Big[\sum_b A(a,b)\Big]^{-1} \qquad (5.8)$$

and

$$\beta_{ab} = \frac{A(a,b)}{\sum_b A(a,b)}, \qquad (5.9)$$

respectively, where the summations are over all terminal levels b.
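
The chain from intensity parameters to radiative lifetime can be sketched in a few lines of Python. The sketch makes several assumptions that should be checked against the equations above: Gaussian (cgs) units, the usual Judd-Ofelt form of the line strength ($e^2$ times the sum over t = 2, 4, 6 of $\Omega_t$ times the squared reduced matrix element of $U^{(t)}$), the oscillator strength written without a refractive-index factor, and the ED local field correction $n(n^2+2)^2/9$. All function names, intensity parameters, and matrix elements are hypothetical.

```python
import math

# Physical constants in Gaussian (cgs) units.
E_CHARGE = 4.80320471e-10   # electron charge, esu
M_E = 9.1093837e-28         # electron mass, g
H_PLANCK = 6.62607015e-27   # Planck constant, erg sec
C_LIGHT = 2.99792458e10     # speed of light, cm/sec


def line_strength(omegas_cm2, u_squared):
    """Assumed form: S = e^2 * sum_t Omega_t * |<||U(t)||>|^2 for t = 2, 4, 6."""
    return E_CHARGE**2 * sum(om * u2 for om, u2 in zip(omegas_cm2, u_squared))


def oscillator_strength(wavenumber_cm, j_initial, strength):
    """f = 8 pi^2 m nu S / [3 (2J+1) h e^2], with nu = c * wavenumber."""
    nu = C_LIGHT * wavenumber_cm
    return (8 * math.pi**2 * M_E * nu * strength
            / (3 * (2 * j_initial + 1) * H_PLANCK * E_CHARGE**2))


def local_field_ed(n):
    """Assumed electric-dipole local field correction n (n^2 + 2)^2 / 9."""
    return n * (n**2 + 2)**2 / 9.0


def emission_probability(wavenumber_cm, j_upper, n, strength):
    """A = 64 pi^4 nu^3 chi S / [3 (2J+1) h c^3]; note nu^3 / c^3 = wavenumber^3."""
    return (64 * math.pi**4 * wavenumber_cm**3 * local_field_ed(n) * strength
            / (3 * (2 * j_upper + 1) * H_PLANCK))


def lifetime_and_branching(probabilities):
    """Radiative lifetime and branching ratios from a dict {terminal level: A}."""
    total = sum(probabilities.values())
    return 1.0 / total, {level: a / total for level, a in probabilities.items()}


# Hypothetical intensity parameters (cm^2) and squared reduced matrix elements.
omegas = (1.0e-20, 1.0e-20, 1.0e-20)
s_example = line_strength(omegas, (0.0, 0.1, 0.0))
f_example = oscillator_strength(20000.0, 7.5, s_example)
a_example = emission_probability(10000.0, 4.5, 1.5, s_example)
tau_r, betas = lifetime_and_branching({"b1": 30.0, "b2": 70.0})
```

With intensity parameters of this magnitude the oscillator strength comes out near $10^{-7}$ and the emission probability at tens of sec$^{-1}$, which corresponds to radiative lifetimes in the millisecond range.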

Matrix elements of the tensor operator $U^{(t)}$ for intermediate coupling
have been tabulated for many ions and transitions. Those for absorption
from the ground state have been given for all lanthanide ions by CARNALL,
FIELDS and RAJNAK [1968]. Only a limited number of matrix elements for
emission have been given, partly because the states from which fluorescence
is observed vary with the host. In Table 5.1, references are given which
provide values of $U^{(t)}$ matrix elements for fluorescence transitions of
several rare earths. Although somewhat different sets of intermediate
coupled eigenstates may have been used in obtaining the above matrix
elements, the resulting values exhibit only small differences, and hence
values derived for one material have generally been applied to other
materials. This point has been discussed by GASHUROV and SOVERS [1969]
and several sets of matrix elements for Er3+ obtained by different workers
have been tabulated for comparison by REISFELD, BOEHM, LIEBLICH and
BARNETT [1973]. In cases where spin-forbidden transitions become allowed
via a small admixing of spin states, the matrix elements are sensitive to the
eigenstates used. Examples of this occur for Eu3+ and Tb3+. Severe truncation
of eigenstates can also lead to significantly different values.
An attractive feature of the Judd-Ofelt approach is that once the set of
intensity parameters has been obtained for a given rare earth-host combination,
they can be used to calculate absorption and emission probabilities
between any $f^N$ levels of the system. This includes transitions such as
excited-state absorptions which are difficult to measure experimentally.
The application and limitations of the Judd-Ofelt approach are discussed
later in section 5.2.
Electric-dipole transitions between states of 4f and 5d configurations are
parity allowed. The oscillator strengths for f-d transitions are therefore
much larger than for f-f transitions, with magnitudes of $10^{-1}$ to $10^{-2}$ (LOH
[1966]). Emission from 5d states, while not common, has been observed
for several rare earths (see, for example, WEBER [1973a]); Ce3+ and Eu2+
are notable examples. Calculation of the probability for radiative decay
from 5d states by ED transitions requires eigenstates and interconfigurational
radial integrals. Examples of such calculations for Ce3+ have been
given by HOSHINA and KUBONIWA [1971] and MANTHEY [1973].

5.1.2. Magnetic-dipole and electric-quadrupole transitions

Magnetic-dipole transitions are parity allowed between states of $f^N$ and
subject to selection rules $\Delta l = \Delta S = \Delta L = 0$ and $|\Delta J| \le 1$ ($0 \nleftrightarrow 0$) in the
Russell-Saunders limit. The most significant contributions usually involve
intramultiplet transitions. Other transitions become possible because spin-orbit
coupling admixes S and L states, but these transitions are rarely of
importance.

Table 5.1. References to intermediate coupling matrix elements and Judd-Ofelt studies for rare-earth
ions in crystals.

Rare-earth ion    Matrix elements    Judd-Ofelt parameters
Pr3+
Ho3+

A. WEBER [1968b].
B. KRUPKE [1971].
C. KRUPKE [1972].
D. AXE [1963].
E. WEBER [1967b].
H. WEBER [1967a].
J. KRUPKE [1966].
M. WEBER [1968a].
N. DETRIO [1971].
O. BECKER [1971].
P. KRUPKE and GRUBER [1965].

The line strength for MD transitions is given by
$$S_{md}(aJ; bJ') = \beta^2\,\big|\langle f^N[\gamma SL]J\|L+2S\|f^N[\gamma'S'L']J'\rangle\big|^2, \qquad (5.10)$$

where $\beta = e\hbar/2mc$. The matrix elements of the magnetic-dipole operator
L + 2S between SLJ states are easily calculated from formulas given elsewhere
(WYBOURNE [1965]) and have been tabulated for absorptive transitions
for lanthanide ions by CARNALL, FIELDS and RAJNAK [1968]. The
spontaneous emission probability for magnetic-dipole transitions is calculated
using eq. (5.10) in eq. (5.7) with $\chi_{md} = n^3$.
Electric-quadrupole transitions are also parity allowed between states
of $f^N$ with the selection rules $\Delta S = 0$; $|\Delta L|, |\Delta J| \le 2$. Expressions for EQ
line strengths have been evaluated for several rare-earth transitions;
however, the resulting probabilities have been found to be several orders
of magnitude smaller than those for dipolar processes. Possible enhancement
mechanisms for quadrupole transitions have been considered (JØRGENSEN
and JUDD [1964]), but thus far no cases have been discovered where
electric-quadrupole transitions are significant. Hence they are generally
considered to be unimportant for radiative decay.


The radiative decay rates of rare-earth ions have generally been derived
either from the Einstein A-B relationship between the probabilities for
absorption and spontaneous emission or from the Judd-Ofelt approach.
In the first case the decay rate $A_{a0}$ for the transition from an excited state
a to the ground state 0 is determined from measurements of the associated
integrated absorption cross section $\sigma_{0a}$ and the relation

(5.11)

If radiative decay occurs to other terminal levels, as is frequently the case
for rare earths, additional measurements of fluorescence branching ratios
$\beta_{aj}$ are needed. Since

$$A_{a0} : A_{a1} : A_{a2} \cdots = \beta_{a0} : \beta_{a1} : \beta_{a2} \cdots, \qquad (5.12)$$

the other $A_{aj}$ and the radiative lifetime $\tau_a^R$ in eq. (5.8) can be found. This
approach was used many years ago by HELLWEGE [1947] and RINCK [1948]
and more recently by CHAMBERLAIN, PAXMAN and PAGE [1966] and others.
Accurate knowledge of the rare-earth concentration is required to determine
$\sigma_{0a}$ and careful correction for the spectral response of the apparatus
is required to obtain the relative fluorescence intensities and $\beta$'s. When
these requirements are fulfilled, this method is a convenient one for deter-
mining radiative decay rates.
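
The procedure just described can be sketched as follows. The sketch assumes a commonly used form of the Einstein relation between integrated absorption and spontaneous emission, $A = 8\pi n^2 c\,\tilde{\nu}^2 (g_{lower}/g_{upper}) \int\sigma\,d\tilde{\nu}$ with $\tilde{\nu}$ in cm$^{-1}$, which may differ in notation from the relation used in the text. The other decay rates then follow from the proportionality between rates and branching ratios. Function names and all numerical inputs are hypothetical.

```python
import math

C_LIGHT = 2.99792458e10  # speed of light, cm/sec


def integrate_trapezoid(x, y):
    """Trapezoidal integral of sampled data, e.g. cross section versus wavenumber."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def einstein_a(center_wavenumber_cm, integrated_sigma_cm, n, g_lower, g_upper):
    """Assumed relation: A = 8 pi n^2 c k^2 (g_lower / g_upper) * integral(sigma dk)."""
    return (8 * math.pi * n**2 * C_LIGHT * center_wavenumber_cm**2
            * (g_lower / g_upper) * integrated_sigma_cm)


def rates_from_branching(a_ground, branching):
    """Scale all decay rates from the ground-state rate; key 0 denotes the ground state."""
    scale = a_ground / branching[0]
    rates = {level: scale * beta for level, beta in branching.items()}
    return rates, 1.0 / sum(rates.values())


# Hypothetical absorption band: triangular profile, peak 2e-20 cm^2, base width 100 cm^-1.
wavenumbers = [9950.0, 10000.0, 10050.0]
sigma = [0.0, 2.0e-20, 0.0]
integral = integrate_trapezoid(wavenumbers, sigma)
a_ground = einstein_a(10000.0, integral, 1.5, 1, 1)
rates, tau_r = rates_from_branching(a_ground, {0: 0.5, 1: 0.3, 2: 0.2})
```

Because every rate is scaled from the single measured ground-state value, any error in the rare-earth concentration or in the spectral calibration propagates to all the derived rates and to the lifetime.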
Implicit in the above relationship is the principle of detailed balance
between the probabilities for absorption and emission. This is considered
to be a valid assumption for transitions between $f^N$ levels of rare earths,
although no definitive test has been reported. FOWLER and DEXTER
[1965] have formulated a more general relationship given by

where $E_{eff}$ is the effective field at the emitting center, $g_b$ and $g_a$ are the
degeneracies of the lower and upper electronic levels, and $\langle r_{\gamma\delta}\rangle$ is the
electric-dipole matrix element between component states $\gamma$ and $\delta$ of levels a
and b. Possible differences in the matrix elements for absorption and
emission should be considered when treating $f \leftrightarrow d$ transitions of the rare
earths.
The optical intensities of rare-earth spectra from liquids, glasses, and
crystals have been successfully rationalized using the Judd-Ofelt approach.
The intensity parameters for a given ion-host combination are derived from
a least-squares fit of calculated and observed intensities using as many
experimental values as available. Average deviations ranging from 5 to
20% have been reported. Discrepancies for the intensities of transitions
between individual pairs of J states can, however, be much larger. An
example of the fit obtainable between measured and calculated oscillator
strengths is shown for Er3+ in Y2O3 in Table 5.2. When levels from different
J manifolds are not resolved, intensities are attributed to the sum of
individual transitions.

TABLE 5.2
Measured and calculated oscillator strengths for the absorption spectrum of Er3+ in Y2O3
(KRUPKE [1965])

Energy (cm$^{-1}$)    Oscillator strength f ($10^{-6}$)
                  measured    calculated
  6 520            1.25       0.52 MD + 0.73 ED
 10 250            0.34       0.27
 12 500            0.31       0.37
 15 210            1.73       1.44
 18 200            0.34       0.24
 19 180           11.03       9.95
 20 360            1.13       1.34
 21 050            0.40       0.48
 24 500            0.61       0.67
 26 350           20.55      21.55
 31 210            0.03       0.07
 33 920            0.16       0.40
 34 600            0.09       0.07
 36 350            0.57       1.22
 38 750            8.89      10.31
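As a rough check on the fit in Table 5.2, the deviations between measured and calculated oscillator strengths can be tabulated directly. The sketch below is illustrative only: the function names are hypothetical, and the rms definition used (sum of squared deviations divided by the number of lines minus the three fitted intensity parameters) is an assumption, not necessarily the measure used in the cited studies. For the 6520 cm$^{-1}$ line the calculated value is taken as the sum of the MD and ED contributions.

```python
import math

# (energy in cm^-1, measured f, calculated f), f in units of 1e-6, from Table 5.2.
TABLE_5_2 = [
    (6520, 1.25, 0.52 + 0.73),  # calculated value is MD + ED
    (10250, 0.34, 0.27),
    (12500, 0.31, 0.37),
    (15210, 1.73, 1.44),
    (18200, 0.34, 0.24),
    (19180, 11.03, 9.95),
    (20360, 1.13, 1.34),
    (21050, 0.40, 0.48),
    (24500, 0.61, 0.67),
    (26350, 20.55, 21.55),
    (31210, 0.03, 0.07),
    (33920, 0.16, 0.40),
    (34600, 0.09, 0.07),
    (36350, 0.57, 1.22),
    (38750, 8.89, 10.31),
]


def rms_deviation(rows, n_params=3):
    """Rms deviation for a fit with n_params adjustable parameters (assumed definition)."""
    squared = sum((calc - meas) ** 2 for _, meas, calc in rows)
    return math.sqrt(squared / (len(rows) - n_params))


def relative_deviations(rows):
    """Fractional deviation (calculated - measured) / measured for each line."""
    return [(energy, (calc - meas) / meas) for energy, meas, calc in rows]


if __name__ == "__main__":
    print(f"rms deviation: {rms_deviation(TABLE_5_2):.2f} x 1e-6")
    for energy, rel in relative_deviations(TABLE_5_2):
        print(f"{energy:6d} cm^-1  {100 * rel:+5.0f} %")
```

The weak lines dominate the relative deviations while contributing little to the rms value, which is consistent with the remark that discrepancies for individual transitions can be much larger than the average.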
Application of the Judd-Ofelt theory to trivalent rare earths in liquids has
been well illustrated in a series of papers by CARNALL et al. [1965, 1968].
Although Eu3+ and Tb3+ were not tested thoroughly, satisfactory agreement
was obtained for all ions except Pr3+. PEACOCK [1971, 1973] has
also made extensive studies of the spectral intensities of $4f^N$ series ions in
solution in the context of the Judd-Ofelt theory, again with satisfactory
agreement.
Spectral intensities of rare earths in glasses have been investigated by
Reisfeld and co-workers, and many results are summarized in a review by
REISFELD [1973] and references therein. Recently KRUPKE [1974] has
carefully examined the intensities of Nd3+ in several glasses used for lasers
and has obtained extremely good fits with rms deviations of only 5%.
The resulting Judd-Ofelt parameters were used to calculate fluorescence
lifetimes, branching ratios, and stimulated emission cross sections.
The literature of the Judd-Ofelt treatment applied to rare earths in
crystals is more dispersed. A summary of references to intensity studies in
crystals is given in Table 5.1. Unlike liquids and glasses, where the Stark
splittings are usually not resolved, transitions between individual Stark
levels can be observed in crystal spectra. Although most studies have
considered only the total transition probabilities between J manifolds, in several
instances intensities of spectra between Stark levels have been analyzed
(see, for example, AXE [1963], KRUPKE and GRUBER [1965], BECKER
[1971], DELSART and PELLETIER-ALLARD [1971]). A prerequisite for such
studies, as noted earlier, is the existence of satisfactory crystal-field
eigenstates.
Judd-Ofelt parameters are expected to vary throughout the lanthanide
series because of differences in the average energy separations of the
opposite parity configurations, the interconfigurational radial integrals,
and the relative contributions of vibronic intensities. In several studies
(KRUPKE [1966], ... [1973]) intensity parameters have been determined for a
number of rare earths in the same liquid or crystalline environment. Some
systematic variations in the $\Omega_\lambda$ values have been noted and used to predict
parameters for adjacent ions in the $4f^N$ series. Overall, however, the behavior
based upon the results reported thus far is still somewhat unclear and additional
systematic studies are needed.
Radiative decay probabilities are calculated using Judd-Ofelt parameters
and eqs. (5.7) and (5.8). In Table 5.3, calculated radiative lifetimes
of the $^4F_{3/2}$ state of Nd3+ in several different hosts are compared with
observed lifetimes. The latter values are those measured at low concentrations
and temperatures where a quantum efficiency of unity is expected.
In view of the 10% uncertainty in the Judd-Ofelt parameters, the agreement
is quite satisfactory. In general, agreement is good when, as in the
case of Nd3+, radiative transitions occur to several J manifolds, thus
providing a further averaging effect. Since the spontaneous emission
probability, eq. (5.7), varies as $\nu^3$, radiative decay becomes more probable
for higher lying states. For 4f levels of rare earths, radiative decay rates
range from approximately 10 to $10^6$ sec$^{-1}$.

TABLE 5.3
Comparison of the radiative lifetime calculated using reported Judd-Ofelt parameters and the
observed lifetime for the $^4F_{3/2}$ state of Nd3+ in different hosts

Host           Lifetime (µs)
               calculated    observed
LaF3            635           670
Y2O3            270           260
Y3Al5O12        259           250
ED-2 glass      372           ~175

The intensities of many transitions involving $|\Delta J| \le 2$ have been found
to be very sensitive to small changes in the local environment. This is
observed in solids, liquids, and gases. Several possible sources of this
hypersensitivity have been considered by JØRGENSEN and JUDD [1964].
JUDD [1966] has noted that the $B_q^1$ component in the expansion of the
crystal-field potential affects only the $\Omega_2$ parameter and thus may account
for the environmental sensitivity. This component is allowed for the following
site symmetries: $C_s$, $C_2$, $C_{2v}$, $C_3$, $C_{3v}$, $C_4$, $C_{4v}$, $C_6$, $C_{6v}$. Vibronic
mechanisms have also been proposed, but the behavior appears to be
different for crystals, molecular complexes, and gaseous phases (see
PEACOCK [1972a] and references therein).
Examples of transitions having large $U^{(2)}$ matrix elements and exhibiting
hypersensitivity are Nd3+: $^4I_{9/2} \to {}^4G_{5/2}$, Eu3+: $^7F_0 \to {}^5D_2$, and
Er3+: $^4I_{15/2} \to {}^2H_{11/2}$ and $^4G_{11/2}$. In the last case, PEACOCK [1972b] found
better results by ascribing separate $\Omega_2$ values to each of the two
hypersensitive transitions. Because the energies of the final states differ
significantly, a breakdown of the average f-d separation used in the Judd-Ofelt
theory was suggested. The occurrence of hypersensitive transitions should
therefore be noted when employing the Judd-Ofelt method to calculate radiative
decay probabilities.
For some ion-host combinations, charge transfer states occur at energies
which are lower than 5d states. This is observed, for example, for Eu3+
in the oxysulfides (STRUCK and FONGER [1971]). These levels could be
responsible for the introduction of opposite parity states. In glasses,
REISFELD, BOEHM, LIEBLICH and BARNETT [1973] have found that increasing
covalency is correlated with lower charge transfer bands and larger $\Omega_2$
parameters. Thus far no systematic study and treatment of possible effects
of charge transfer states on radiative decay of rare earths has been made.
The Judd-Ofelt theory is also applicable to the spectral intensities and
radiative decay of actinide series ions. Few intensity and decay rate
measurements have been reported for $5f^N$ ions. From absorption spectra, it is known
that the oscillator strengths of f-f transitions for these ions are generally
larger than for $4f^N$ ions, and therefore admixing of states from opposite
parity configurations may be greater.
In summary, the Judd-Ofelt method provides a useful description of
rare-earth intensities and can be used to predict radiative decay probabili-
ties. There are several limitations to its accuracy, however, due to assump-
tions inherent in the theory and to its application in practice. As examples
of the latter, when considering transitions between J manifolds, equal
ion populations in the Stark levels of the initial manifold are usually
assumed and J-state mixing is usually neglected. It should also be noted
that whereas the quality of Judd-Ofelt parameter fits is dependent
only upon relative absorption coefficients or fluorescence branching ratios,
spontaneous emission probabilities and absorption/emission cross sections
require accurate knowledge of the rare-earth concentration to be meaningful.
The Judd-Ofelt treatment is expected to be most satisfactory for transitions
between low-lying levels of $f^N$ where the approximation of an average
energy denominator in eq. (5.2) is most valid. Studies of Pr3+ and Sm3+,
for example, have indicated that a single set of parameters is sometimes
not sufficient to account for all intensities and a second set of parameters
should be used for the higher-lying levels. This is indicative of a possible
breakdown of the closure approximation involved in arriving at eq. (5.3).

§ 6. Multiphonon Relaxation

Nonradiative relaxation between J states can occur by the simultaneous
emission of several phonons sufficient to conserve the energy of the transition.
These multiphonon processes arise from the interaction of the rare-earth
ion with the fluctuating crystalline electric field. The crystal field at
the ion site is not static but undergoes oscillatory behavior due to the
vibrations of the lattice or molecular groups. The lattice vibrations are
quantized as phonons having symmetry properties determined by the
symmetry of the crystal and excitation energies determined by the masses
of the constituent ions and the binding forces.
The effects of the oscillating electric field are manifest in the spectros-
copy of rare-earth ions in ways other than multiphonon relaxation. These
include paramagnetic spin-lattice relaxation and the appearance of line
broadening and vibrational sidebands in optical spectra. The earliest
treatment of the ion-lattice interaction, both experimentally and theoreti-
cally, was carried out for the relaxation of Zeeman levels within the para-
magnetic ground state. At low temperatures relaxation occurs by the
absorption and/or emission of low-energy acoustic phonons either via
direct processes involving a single phonon or via Raman processes involving
the inelastic scattering of two phonons. Relaxation of Zeeman levels can
also occur via an Orbach process involving consecutive phonon absorption
to and emission from a higher-lying Stark level.
Transitions between Stark levels can be induced by both acoustic and
optic phonons. Since the Stark level separations are usually much larger
than Zeeman splittings, the more numerous, higher energy acoustic and
optic phonons contribute to direct processes and very rapid relaxation
rates are possible. The associated lifetime broadening frequently accounts
for the observed linewidths of optical transitions (YATSIV [1962]).
When the lattice and ion are treated as a coupled system, optical transi-
tions are considered to occur between vibrational-electronic or vibronic
states. These states are written in terms of Born-Oppenheimer product states

$$|\psi\rangle \prod_i |n_i\rangle, \qquad (6.1)$$

where $|\psi\rangle$ is the eigenfunction of the electronic state, the $|n_i\rangle$ are
eigenfunctions of the phonon number operator, and the $n_i$ are phonon
mode occupation numbers. Transitions involving no change in vibrational
quantum number n are purely electronic or zero-phonon lines. Transitions
to adjacent levels involving the creation or annihilation of one or more
phonons appear as vibronic sidebands to the zero-phonon line. Since the
ion-lattice coupling for rare earths in levels of $4f^N$ is weak, only one- or
two-phonon sidebands are observed. The vibronic sidebands are of special
interest because they reflect the extent of the phonon spectrum and the
strength of the ion-phonon coupling for the different vibrational modes.
Although multiphonon relaxation processes were postulated and under-
stood qualitatively many years ago, it has only been within the last decade
that detailed experimental studies have been carried out and a rational
interpretation provided.
From early spectroscopic studies of rare earths in LaCl3, it had become
apparent that a level had to be separated from the next lower level by an
energy in excess of 1000 cm$^{-1}$ for it to be sufficiently stable to exhibit
fluorescence. Since it was known that the phonon spectrum of LaCl3
extended to a maximum energy of about 250 cm$^{-1}$, it was evident that
a competing nonradiative relaxation process must involve the simultaneous
emission of many phonons. Similar behavior regarding the occurrence
of metastable fluorescing levels was observed in other crystal systems.
Measurements of fluorescence lifetimes (BARASCH and DIEKE [1965],
WEBER [1966]) provided semi-quantitative confirmation of the decreasing
importance of multiphonon decay with increasing number of phonons
required for energy conservation. This was followed in the mid-1960's by
a concerted effort to measure multiphonon relaxation rates for various rare
earths in several different host crystals. Based upon these measurements,
theories have evolved which describe the essential features of multiphonon
relaxation rather well. All theories, however, are necessarily approximate
in nature. Multiphonon processes involve a combination of crystal-field
theory and lattice dynamics and thus present the solid-state theorist with
a formidable problem defying definitive treatment.
For pedagogical reasons, we proceed in reverse chronological order
below, that is, the theory of multiphonon relaxation is treated first followed
by a review of the experimental work.


As mentioned above, the treatment of multiphonon relaxation processes
has its basis in the dynamic crystal-field Hamiltonian introduced for one-
and two-phonon spin-lattice relaxation processes. The extension to
multiphonon processes necessarily begins with this basic Hamiltonian. Several
theoretical techniques have been employed to calculate the rates of
multiphonon processes. The assumption of weak ion-lattice coupling, treatment
of the lattice in the harmonic approximation, and neglect of the detailed
phonon and electronic properties are common to the approaches used to
calculate multiphonon transition rates between $4f^N$ levels of the rare earths.
KIEL [1964] treated the problem first by applying conventional
time-dependent perturbation theory in higher orders. The intractability of
ab initio calculations was immediately evident. RISEBERG and MOOS [1968]
extended this approach by developing a phenomenological treatment
which was very successful in describing a wide body of experimental data
on multiphonon transition rates collected by them, by WEBER [1968a],
and subsequently by other authors. Later theoretical treatments by MI-
employed more elegant mathematical techniques and yielded somewhat
more elaborate dependences of the rate of multiphonon emission on energy
gap and temperature. By using adjustable parameters, these treatments
were similarly successful in fitting the existing experimental data.
Divalent and trivalent rare earths can also be excited via absorption
into states of higher-lying configurations, such as $4f^{N-1}5d$, with subsequent
relaxation to 4f states. Because the ion-lattice interaction for the outer
d electrons is greater than for the inner f electrons, the weak coupling
approximation is not valid and the nonradiative decay rates are faster.
$d \to f$ relaxation of rare earths has been the subject of recent experimental
and theoretical investigations (WEBER [1973a]; LAUER and FONG [1974]).
Since the phenomenological model provides an intuitively satisfying
description of the experimental results and permits a straightforward and
generally useful account of multiphonon relaxation rates in a wide variety
of different systems, it will be reviewed here. Included, too, are a few cases
which appear to be exceptions and where some extension is in order.
We begin by expanding the crystal-field Hamiltonian $H_{CF}$ in a Taylor
series about the equilibrium ion positions, that is,

$$H_{CF} = V_{CF} + \sum_i Q_i \frac{\partial V_{CF}}{\partial Q_i} + \cdots = V_{CF} + \sum_i V_i Q_i + \frac{1}{2} \sum_{i,j} V_{ij} Q_i Q_j + \cdots, \qquad (6.2)$$

where $Q_i$ represents the ith normal mode coordinate. The equilibrium
crystal field $V_{CF}$ is identical to the static field given by eq. (3.2) and discussed
in section 3; the remaining terms in eq. (6.2) constitute the dynamic
crystal-field interaction. The $V_i$, $V_{ij}$, ... are partial derivatives of the crystal field and
are expressible as tensor operator sums in the form given in eq. (3.2) and
acting on the electronic states. The $Q_i$ are expressible in terms of phonon
creation and annihilation operators. Thus the dynamic crystal-field
interaction involves products of operators acting upon product states of the
ion-lattice system given in eq. (6.1).
In time-dependent perturbation theory, a process involving the transition
from electronic state $|\psi_a\rangle$ to electronic state $|\psi_b\rangle$ with the emission
of p phonons can arise in different ways. First, the first-order term in eq.
(6.2) can simply be used in pth order perturbation theory or, second, the
pth order term in eq. (6.2) can be used in first-order perturbation theory.
In the most general formulation, the transition probability also includes
contributions from intermediate terms. Taking only the contributions
from the two extreme terms, the probability for multiphonon relaxation
between states a and b involving p phonons is given by

where . . . $|\psi_{m_{n-1}}\rangle$ are virtual intermediate states and $g(\omega_i) \ldots g(\omega_j)$
are the densities of phonon states. The electronic levels are assumed to be
infinitely narrow, and energy conservation is guaranteed implicitly by
the $\delta$-function. The summation is over all possible intermediate states
and phonon modes.
Although evaluation of eq. (6.3) is possible in principle, in practice only
order-of-magnitude estimates are feasible. This is also true for the more
elaborate theoretical treatments of multiphonon relaxation.
The greatest difficulty arises from lack of detailed information regarding
the frequency, polarization, and propagation properties of the vibrations
and the associated strength of the ion-phonon coupling coefficients $V_i$,
$V_{ij}$, .... It is possible, as exemplified by GERMAN and KIEL [1973], to discuss
relative rates of multiphonon emission in similar systems using estimates
based upon eq. (6.3). The phenomenological model derived from eq. (6.3),
as shown below, also has significant predictive value.
The rate of multiphonon relaxation is temperature dependent. This
arises from two possible sources. The first is the orbit-lattice interaction.
For rare-earth ions, however, we assume that the interaction remains
harmonic and that the $V_i$, $V_{ij}$, ... are essentially independent of temperature,
at least up to temperatures of several hundred °C. The second and
dominant source arises from stimulated emission of phonons as the modes
become thermally populated. As we shall see later, a simple single-frequency
phonon model frequently yields good agreement with experimental
observations.
Consider multiphonon relaxation across an energy gap $\Delta E$ to the
next-lowest level. The number of phonons $p_i$ of equal energy $\hbar\omega_i$ required to
conserve energy, and hence the order of the process, is determined by the
relation

$$p_i \hbar\omega_i = \Delta E. \qquad (6.4)$$

The temperature dependence from eq. (6.3) is then

$$W(T) = W_0 (n_i + 1)^{p_i}, \qquad (6.5)$$

where $n_i$ is the occupation number of the ith phonon mode and $W_0$ is the
spontaneous transition rate, that is, $W(T) = W_0$ at $T = 0$. Replacing $n_i$ by
its Bose-Einstein average

$$n_i = [\exp(\hbar\omega_i/kT) - 1]^{-1}, \qquad (6.6)$$

we have for the temperature-dependent multiphonon transition rate for a
single-frequency p-phonon process,

$$W_{p_i}(T) = W_0 \left[ \frac{\exp(\hbar\omega_i/kT)}{\exp(\hbar\omega_i/kT) - 1} \right]^{p_i}. \qquad (6.7)$$
The critical feature in the temperature dependence is the order of the
process. This is illustrated in a striking manner in Fig. 6.1 where eq. (6.7)
is plotted for processes involving four, five, and six phonons of equal
energy and satisfying eq. (6.4). The difference in the curves is substantial.
Therefore, although there is actually a spread of phonon energies, it is the
order of the process rather than the precise phonon energy distribution
which governs the temperature dependence.
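The sensitivity of eq. (6.7) to the order of the process can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the function name is hypothetical, energies are expressed in cm$^{-1}$ with the Boltzmann constant taken as 0.695035 cm$^{-1}$ per kelvin, and the gap is assumed to be 1500 cm$^{-1}$ purely for illustration.

```python
import math

K_B_CM = 0.695035  # Boltzmann constant in cm^-1 per kelvin


def multiphonon_rate(w0, phonon_energy_cm, order, temperature_k):
    """Single-frequency rate of eq. (6.7): W0 * [exp(x) / (exp(x) - 1)]**p, x = hw/kT."""
    if temperature_k <= 0:
        return w0  # spontaneous rate, W(T) = W0 at T = 0
    x = phonon_energy_cm / (K_B_CM * temperature_k)
    # exp(x) / (exp(x) - 1) equals 1 / (1 - exp(-x)); this form cannot overflow.
    return w0 * (-math.expm1(-x)) ** (-order)


if __name__ == "__main__":
    gap_cm = 1500.0  # assumed energy gap in cm^-1 (illustrative)
    for order in (4, 5, 6):
        phonon = gap_cm / order  # eq. (6.4): p * hw = gap
        ratio = multiphonon_rate(1.0, phonon, order, 300.0)
        print(f"p = {order}: hw = {phonon:.0f} cm^-1, W(300 K)/W0 = {ratio:.2f}")
```

Because a higher order implies lower-energy phonons for a fixed gap, both the base and the exponent in eq. (6.7) grow with p, so the enhancement at a given temperature rises steeply with the order.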
Since the highest energy and/or most strongly coupled phonons can
conserve the energy $\Delta E$ in the lowest order process, they are expected to
make the largest contribution to the relaxation rate and its temperature
dependence. As pointed out by KISLIUK and MOORE [1967], the initial
rate of relaxation $W_0$ by lower energy phonons $\omega_i$ may be smaller, but
due to the larger order $p_i$, the temperature dependence in eq. (6.7) is stronger
and hence may become dominant at elevated temperatures. (See also
LAUER and FONG [1974].) In a more general treatment, one must sum
contributions from all phonon modes, as has been done by STURGE [1973].
Eq. (6.7) is consistent with Sturge's results in the limit of weak coupling
characteristic of rare earths in crystals.

Fig. 6.1. Theoretical temperature dependences of multiphonon relaxation for the single
frequency model; $\Delta E = 1500$ K and $p_i = 4$, 5 and 6 (after RISEBERG [1968]).
The model leading to eq. (6.7) assumes a simple process involving decay
between two single discrete levels. In reality, as noted earlier, decay occurs
between groups of Stark levels of two J multiplets. If levels of the upper
multiplet are designated by the subscript a and levels of the lower multiplet
by the subscript b, then the total decay rate from one of the upper levels
is $\sum_b W_{ab}$. Since the levels within the multiplet are in thermal equilibrium,
the combined rate is a Boltzmann average of the rates from the separate
levels given by

$$W(T) = \frac{\sum_a \sum_b W_{ab}\, g_a \exp(-\Delta_a/kT)}{\sum_a g_a \exp(-\Delta_a/kT)}, \qquad (6.8)$$

where $W_{ab}$ is an individual decay rate from upper multiplet level a to lower
multiplet level b, $g_a$ is the degeneracy and $\Delta_a$ is the energy separation of the
ath level from the bottom level of the upper multiplet. The intrinsic temperature
dependence of the multiphonon rate $W_{ab}$ is expressed by eq. (6.7).

Eq. (6.8) represents a precise description of the characteristics of the
temperature dependence. A similar expression applies to the temperature
dependence of the total radiative rate (where $W_{ab}$ is replaced by $A_{ab}$),
since the probabilities $A_{ab}$ for radiative transitions between individual
Stark levels are also not equal. In practice the individual rates $W_{ab}$ cannot
be determined with sufficient accuracy to exploit the precision implicit in
eq. (6.8). Several authors (PARTLOW and MOOS [1967], RISEBERG, GANDRUD
and MOOS [1967]) have successfully fitted experimental results by
approximating eq. (6.8) through the introduction of a limited number of effective
levels. The most significant value of the temperature dependence, however,
is in establishing the order of the multiphonon decay process and the
energies of the dominant phonons involved. This can be accomplished
adequately using an approximate treatment as shown by the experimental
work to be discussed later.
In the phenomenological model the energy gap dependence of the
multiphonon emission rate arises as a consequence of the convergence of the
perturbation expansion. The ratio of the pth order transition rate to the
(p-1)th order rate in a given host, again considering a single-frequency
phonon model, is simply

$$W^{(p)}/W^{(p-1)} = \varepsilon < 1, \qquad (6.9)$$

where $\varepsilon$ is a coupling constant whose magnitude determines the rapidity of
the convergence of terms in the perturbation expansion in eq. (6.3). The rate
for a pth order process can then be written phenomenologically as

$$W^{(p)} = A \varepsilon^p, \qquad (6.10)$$

where A is a constant and the order p is determined from eq. (6.4).
The validity of this approach depends upon the extent to which the
individual features of the phonon modes and electronic states are statisti-
cally averaged out in higher-order multiphonon processes. In the case of
many crystals, the rare-earth vibronic spectrum, which reflects the strength
and extent of the ion-lattice interaction, is rather broad and diffuse. When
this effective phonon density of states is convolved p times (p > 2) as in
eq. (6.3), little structure remains and the exact properties of the phonons
become unimportant. The remaining critical parameter is the number of
phonons required to conserve energy and, therefore, the energy gap.
Similarly, because of the large number of equivalent processes involving
many different intermediate states, the symmetry properties of the electronic
wavefunction are generally also unimportant. Exceptions can occur,
however. Terms in the expansion of the ion-lattice interaction containing
tensor operators with k odd cannot couple levels of the same configuration
because of parity. In addition, only those terms will contribute which
contain even-order operators satisfying the triangle rule

$$|J_a - J_b| \le k \le |J_a + J_b| \qquad (6.11)$$

for initial and final states a and b. Thus, for example, there are no
first-order matrix elements of the orbit-lattice interaction connecting states
with J = 0 and J = 1 and hence relaxation can occur only by higher-order
processes. In other cases, eq. (6.11) may restrict the number of terms in
eq. (6.2) which are effective. Overall, however, these selection rules are
not too restrictive and the properties of the electronic states as well as the
phonons are averaged out in multiphonon relaxation processes.
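The restriction imposed by parity and eq. (6.11) is easy to enumerate. The sketch below lists the even tensor ranks k that survive for a given pair of J values. The function name is hypothetical, and two assumptions are made that the text does not state explicitly: ranks are limited to k ≤ 6, the usual limit for operators acting within an f-electron configuration, and k = 0 is excluded because a scalar operator cannot connect different J states.

```python
def allowed_even_ranks(j_a, j_b, max_rank=6):
    """Even ranks k with |J_a - J_b| <= k <= |J_a + J_b| (eq. 6.11).

    max_rank=6 is an assumption appropriate to f electrons; k = 0 is omitted.
    """
    lower = abs(j_a - j_b)
    upper = abs(j_a + j_b)
    return [k for k in range(2, max_rank + 1, 2) if lower <= k <= upper]


if __name__ == "__main__":
    print(allowed_even_ranks(0, 1))      # J = 0 and J = 1: no first-order coupling
    print(allowed_even_ranks(0, 2))      # only k = 2 survives
    print(allowed_even_ranks(4.5, 2.5))  # half-integral J values
```

For J = 0 and J = 1 the list is empty, reproducing the example in the text in which relaxation can proceed only through higher-order processes.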
The order of the process, under the single-frequency model, is determined
by the energy gap and eq. (6.4). The phonon frequency $\omega_i$, generally
speaking, should be close to the maximum of the phonon spectrum,
considering the rapidity of the convergence as expressed by the smallness of $\varepsilon$.
This is usually true and experimental studies of the temperature dependence
confirm that the order of the process is often the lowest order consistent
with energy conservation and the cut-off frequency of the phonon spectrum.
If lower frequency modes are more numerous and/or more strongly coupled,
the dominant multiphonon process may occur in a higher order. Thus, an
assignment of the proper value to assume for $\omega_i$ should be made in
conjunction with knowledge of the intensity of the vibronic spectrum.
A notable omission above is the class of materials where high-energy
vibrations arise from the internal modes of well-defined molecular groups.
In such crystals and in glasses, the rare earth may be relatively weakly
coupled to the molecular vibration. Furthermore, the molecular vibrational
frequencies are large (up to ~1000 cm$^{-1}$) and may have very
well-defined energies due to the absence of dispersion. The decay by emission
of a few such vibrational quanta may have rates comparable with radiative
rates in the optical region. There is an additional contribution from the
more strongly coupled lattice modes which are lower in energy. As might
be anticipated, the above model might not be applicable in such cases
because the degree of averaging is smaller.
In summary, in the limit of the validity of eqs. (6.4) and (6.10), we can
combine the expressions and obtain a phenomenological expression

$$W = C \exp(-\alpha \Delta E), \qquad (6.12)$$

where C and $\alpha$ are positive-definite constants characteristic of a given
material at low temperatures*. The experimental thrust over the past
several years has been the determination of $\omega_i$, $\varepsilon$, C and $\alpha$ for a wide variety
of rare-earth doped materials. These results are discussed and summarized
below.
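The step from eqs. (6.4) and (6.10) to an exponential gap law can be made explicit. Substituting $p = \Delta E/\hbar\omega_i$ into $W = A\varepsilon^p$ gives $W = A \exp(-\alpha \Delta E)$ with $\alpha = -\ln(\varepsilon)/\hbar\omega_i$, which is positive because $\varepsilon < 1$. The sketch below checks this identity numerically; the function names and all numerical values are hypothetical, and identifying the prefactor C with A holds only within this simple single-frequency picture.

```python
import math


def gap_law_constants(a, epsilon, phonon_energy_cm):
    """Return (C, alpha) of W = C * exp(-alpha * gap) from A, epsilon and hw."""
    alpha = -math.log(epsilon) / phonon_energy_cm  # positive when epsilon < 1
    return a, alpha


def rate_from_order(a, epsilon, gap_cm, phonon_energy_cm):
    """Eq. (6.10) with the order p taken from eq. (6.4)."""
    order = gap_cm / phonon_energy_cm
    return a * epsilon ** order


def rate_from_gap(c, alpha, gap_cm):
    """Exponential energy-gap law."""
    return c * math.exp(-alpha * gap_cm)


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    a, epsilon, phonon = 1.0e12, 0.05, 300.0
    c, alpha = gap_law_constants(a, epsilon, phonon)
    for gap in (900.0, 1500.0, 2100.0):
        print(gap, rate_from_order(a, epsilon, gap, phonon), rate_from_gap(c, alpha, gap))
```

In this picture each additional phonon lowers the rate by the factor $\varepsilon$, so a plot of log W against energy gap is a straight line of slope $-\alpha$.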


Methods for experimentally determining pure multiphonon relaxation
rates follow from the discussion in section 4. They generally involve
measurements of excited-state lifetimes and one or more other quantities. Three main
approaches have been successfully employed. One combines absorption
data and the relationship between the Einstein A and B coefficients to
find the probability for radiative decay which is then subtracted from the
fluorescence decay rate to determine the nonradiative or multiphonon rate.
(See, e.g., RINCK [1948], and CHAMBERLAIN, PAXMAN and PAGE [1966].)
Alternatively, the radiative contribution can be calculated by the
Judd-Ofelt method, as shown by WEBER [1967a, 1968a, b, 1973b] and DELSART
and PELLETIER-ALLARD [1973]. The third technique involves measurements
of relative quantum efficiencies to determine the multiphonon branching
ratios and hence the transition rates. This method was introduced by
PARTLOW and MOOS [1967] and used by RISEBERG and MOOS [1967, 1968]
to measure rates for an extensive set of rare-earth levels in several different
crystals. Variations of these approaches have subsequently been applied
by other workers to measure multiphonon rates in additional crystalline
materials and, more recently, in glasses.

6.3.1. Temperature dependence

The temperature dependence of multiphonon relaxation rates was studied
by early workers and provided strong support for the phenomenological
approach and the use of a single-frequency model. As an example, Fig. 6.2
shows the temperature dependence of the E($^6$F) to D($^6$F) multiphonon
decay process in LaBr3:Dy3+. The theoretical fit is given by eq. (6.7) with
the following parameters:

$p_i = 5$, $\hbar\omega_i = 155$ cm$^{-1}$, $W_0 = 8.8 \times 10^3$ sec$^{-1}$.

* Treatment of multiphonon relaxation by the adiabatic approximation (FREED and
JORTNER [1970], FONG, NABERHUIS and MILLER [1972], STURGE [1973]) predicts a more
complicated energy gap law given by Wcc e-hEhn/vAE, where the dimensionless Q is defined
by (log (AE/(hv)- 1 ) and 5 is a dimensionless displacement. The departure of this law from the
simple exponential dependence in eq. (6.12) has thus far not been established experimentally.





Fig. 6.2. Temperature dependence of the E($^6$F) to D($^6$F) multiphonon decay process in
LaBr3:Dy3+ (after RISEBERG and MOOS [1967]).

The energy gap is taken to be that to the uppermost Stark level of the
terminal J multiplet. In this case it is not necessary to apply the detailed
model of eq. (6.8) because the emitting state consists of two Stark levels
separated by only a few cm$^{-1}$. The theoretical fit clearly indicates that the
phonons correspond to the high-energy optical region, and that the process
is the lowest order consistent with energy conservation and the high
frequency cut-off of the phonon spectrum at 175 cm$^{-1}$.
A more complex situation is provided by the F($^5F_3$) to E($^5F_4$, $^5S_2$)
relaxation of Ho3+ in LaF3. The theoretical fit to the experimental data
in Fig. 6.3 is given by eq. (6.7) with the following parameters:

$p_i = 6$, $\hbar\omega_i = 300$ cm$^{-1}$;
$\Delta_1 = 0$, $\Delta_2 = 150$ cm$^{-1}$, $\Delta_3 = 500$ cm$^{-1}$;
$g_1 = 1$, $g_2 = 1$, $g_3 = 5$;
$W_1 = 2.2 \times 10^4$ sec$^{-1}$, $W_2 = 0$, $W_3 = 0$.
The effect of the Stark levels above the decaying level is accounted for by
a single non-decaying level at 150 cm$^{-1}$. This reflects the reduced decay
rates for these levels because of the larger gaps. Similarly, the effect of the
next highest multiplets (G($^5F_2$) and H($^3K_8$)) is accounted for by the
assumption of five non-decaying levels at 500 cm$^{-1}$.

Fig. 6.3. Temperature dependence of the F($^5F_3$) to E($^5F_4$, $^5S_2$) multiphonon decay process
in LaF3:Ho3+ (after RISEBERG and MOOS [1968]).
Although the parameters used in fitting the temperature dependences
are only a rough approximation to reality, the model does successfully
describe the effects of thermal depopulation of the lowest Stark component
of the decaying level. FONG, NABERHUIS and MILLER [1972] have performed
further fits to temperature dependences taking into account the full Stark
structure with rates for individual levels obtained from the gap dependence.
Similarly good fits to experiment were achieved.
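The effective-level description used for LaF3:Ho3+ can be reproduced as a Boltzmann average of single-level rates. The sketch below is an illustration under stated assumptions rather than the authors' fitting code: the function names are hypothetical, the Boltzmann constant is taken as 0.695035 cm$^{-1}$ per kelvin, the degeneracy of the decaying level is taken as 1, and the same order p is applied to every level (immaterial here, since the upper effective levels are non-decaying).

```python
import math

K_B_CM = 0.695035  # Boltzmann constant in cm^-1 per kelvin


def single_level_rate(w0, phonon_energy_cm, order, temperature_k):
    """Single-frequency multiphonon rate, eq. (6.7)."""
    if temperature_k <= 0 or w0 == 0:
        return w0
    x = phonon_energy_cm / (K_B_CM * temperature_k)
    return w0 * (-math.expm1(-x)) ** (-order)


def boltzmann_averaged_rate(levels, phonon_energy_cm, order, temperature_k):
    """Thermal average over effective levels given as (delta_cm, degeneracy, w0)."""
    if temperature_k <= 0:
        # Only the lowest level is populated at T = 0.
        return min(levels, key=lambda level: level[0])[2]
    kt = K_B_CM * temperature_k
    numerator = 0.0
    partition = 0.0
    for delta, degeneracy, w0 in levels:
        weight = degeneracy * math.exp(-delta / kt)
        numerator += weight * single_level_rate(w0, phonon_energy_cm, order, temperature_k)
        partition += weight
    return numerator / partition


# Effective levels for LaF3:Ho3+ from the text; g = 1 for the decaying level is assumed.
HO_LEVELS = [(0.0, 1, 2.2e4), (150.0, 1, 0.0), (500.0, 5, 0.0)]

if __name__ == "__main__":
    for temperature in (4.0, 77.0, 300.0):
        rate = boltzmann_averaged_rate(HO_LEVELS, 300.0, 6, temperature)
        print(f"T = {temperature:5.0f} K: W = {rate:.3e} sec^-1")
```

Thermal population of the non-decaying levels reduces the observed rate relative to eq. (6.7) applied to the lowest level alone, which is the depopulation effect described in the text.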
The phonons used in the above fits represent the lowest order process
consistent with energy conservation and the cut-off frequency of the
phonon spectrum. This result has generally been obtained for a number
of crystals where the vibronic spectrum shows strong peaks near the
phonon cut-off frequency. There are other crystals, however, for which
the situation is more ambiguous. If the most intense peaks in the spectrum
occur at lower energies than the extreme upper limit, it is no longer clear
which value to assume for the effective phonon energy in the single
frequency model. This is borne out by temperature dependence studies in
Y3Al5O12 (ZVEREV, KOLODNYI and ONISHCHENKO [1971]) and in YAlO3
(WEBER [1973b]). In these crystals the temperature dependence cannot be
fitted by the lowest possible order. Instead, better fits are obtained using
a larger number of lower energy phonons corresponding to the more
prominent peaks in the high-energy region of the phonon spectra. In both
cases the highest energy phonons appear as weak peaks in the vibronic
spectra. Therefore, in obtaining a value for the effective phonon energy,
both the cut-off and the shape of the vibronic spectrum should be con-
sidered. In most instances, however, it may simply be taken to be the highest

6.3.2. Energy gap dependence

Measurements of spontaneous multiphonon emission rates for rare
earths in a number of different crystalline materials have confirmed the
validity of the exponential dependence on energy gap given by eq. (6.12).
As an example, data on the multiphonon emission rates in YAlO3 at 77 K
are plotted versus energy gap to the next-lower level in Fig. 6.4. The rates
for five different ions and fifteen different excited states can be fitted
approximately (within a factor of 2) by a simple exponential dependence
on the energy gap. (The two exceptions, the 5D1 and 5D2 states of Eu3+,
are subject to selection rules and the restrictions of eq. (6.11).) From
knowledge of the phonon spectrum of YAlO3 and for the range of energy gaps
studied, a minimum of three and as many as seven or more phonons are
active in the relaxation. The close obedience to the exponential energy gap
law over a range of approximately six decades provides convincing testi-
mony to the phenomenological approach and the averaging out of the
detailed features of the electronic states and phonon modes in these high-
order processes.
The exponential dependence expressed by eq. (6.12) and illustrated in
Fig. 6.4, while applicable to processes involving many phonons, must
eventually break down for small energy gaps where relaxation by one- or
two-phonon processes is possible. In this regime the statistical averaging
that occurs for higher-order processes is no longer prominent; relaxation
rates are strongly dependent on the phonon density of states and on the
ion-phonon coupling. This is evident from the optical linewidths observed
for transitions to individual Stark levels. These widths, when determined by
lifetime broadening due to rapid one- or two-phonon processes, correspond
to relaxation rates which are much larger than would be obtained by
extrapolation of the energy gap dependence (WEBER [1973b]).

Fig. 6.4. Dependence of the rate of multiphonon emission on energy gap to the next lower level
for excited states of rare-earth ions in YAlO3 at 77 K (after WEBER [1973b]). Horizontal axis:
energy gap to next-lower level (cm^-1); the plotted ions include europium and holmium.

For large energy gaps the multiphonon decay rates become smaller and
approach the regime where ΣA_ij > W. If in this limit W is obtained by
subtracting A from τ^-1, then uncertainties in A can greatly affect the
value and accuracy of W. The multiphonon rates for YAlO3 were obtained
using A's found from Judd-Ofelt calculations. The uncertainty in their
values produces the large uncertainties in the resulting W values plotted
in Fig. 6.4. Use of relative quantum efficiency measurements is a more
desirable experimental approach in this case.
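To see why the subtraction becomes unreliable in this limit, the sketch below propagates an uncertainty in the radiative rate A through W = 1/τ - A. The lifetime, the radiative rate, and the 20% uncertainty are hypothetical numbers, not data from the text, and the measured lifetime is treated as exact for simplicity.

```python
def multiphonon_rate(lifetime_s, radiative_rate, radiative_rel_err):
    """Return (W, absolute error of W) from W = 1/tau - A.

    Only the uncertainty of A is propagated; the lifetime is treated as exact.
    """
    w = 1.0 / lifetime_s - radiative_rate
    w_err = radiative_rate * radiative_rel_err
    return w, w_err

# Hypothetical level: tau = 0.9 ms, summed radiative rate A = 1000 s^-1 known to 20%.
w, w_err = multiphonon_rate(0.9e-3, 1000.0, 0.20)
print(f"W = {w:.0f} +/- {w_err:.0f} s^-1 ({100 * w_err / w:.0f}% uncertainty)")
```

With these illustrative numbers a modest error in A exceeds W itself, which is the situation in which relative quantum efficiency measurements are preferable.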
The selection rule in eq. (6.11) has generally not been found to be an
important factor for multiphonon relaxation when only one or two terms
of the ion-phonon interaction are forbidden. This, however, may be
because the effect is not expected to be significantly larger than the normally
observed variations from the exponential dependence. In cases where
multiphonon relaxation is formally forbidden, as in the case of J = 1 to
J = 0 transitions, the rates deviate significantly from the exponential law
(see the 5D1 datum for Eu3+ in Fig. 6.4). This has been discussed recently
by GERMAN and KIEL [1973] for the 3P1 to 3P0 decay in LaCl3:Pr3+. The
model usually employed for such processes assumes a virtual intermediate
state different from either the initial or final state and thus the reduction
in the transition rate occurs by virtue of the increased energy denominator.
German and Kiel also treated the 1I6 to 3P1 decay and proposed that this
rate is much faster than that anticipated from an extrapolation of the
dependence on energy gap.

6.3.3. Host dependence

Multiphonon relaxation rates have now been measured for rare earths
in several varieties of crystals. In many instances, data sufficient to illustrate
the energy gap dependence has been obtained. This is shown for a selection
of crystals in Fig. 6.5. Included in parentheses are the phonon energies,
ħω_eff, which, based upon the temperature dependence of multiphonon
rates and/or vibronic spectra, appear to be most important for relaxation.
Since the rare earth 4f levels exhibit only small changes with host (~100
cm^-1), the different rates evident in Fig. 6.5 for a given energy gap are
clearly a property of the host via the phonon frequency spectrum and the
strength of the ion-phonon coupling. If the rates are plotted as a function
of a normalized energy gap ΔE/ħω_eff, which is approximately equal to the
number of phonons active, the resulting slopes of the curves will reflect
the relative strength of the ion-phonon coupling in the different hosts
(RISEBERG and MOOS [1968]).

Fig. 6.5. Spontaneous multiphonon emission rates from excited states of trivalent rare earths
as a function of energy gap (cm^-1) to the next lower level (after WEBER [1973c]).

The parameters C and α in the expression for the multiphonon emission
rate in eq. (6.12) can be determined directly from the exponential energy
gap dependences. These parameters for crystals studied to date are listed
in Table 6.1. There is a wide variation in C, which ranges from 4.5 x 10^7 to
1.5 x 10^10 sec^-1. As noted earlier, C does not correspond to the rate
for single-phonon processes, which are known to be ≳ 10^11 sec^-1.
The parameter ε in eq. (6.9) can also be determined from the data and
the value of ħω_eff derived as noted above. The results are included in Table
6.1. ε reflects the strength of the ion-phonon coupling and correlates with
the crystalline fields in the various crystals studied. This was strikingly
confirmed by results obtained for LaCl3 and LaBr3, where identical values
for ε were obtained (RISEBERG and MOOS [1968]). This is expected since
the structures and crystal fields of these two materials are identical. The
difference in multiphonon rates for these crystals in Fig. 6.5 arises only
from differences in the phonon spectra (via the halogen masses) and thus
the order required to conserve energy.
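The LaCl3 and LaBr3 comparison can be made concrete with a small sketch. It assumes that eq. (6.9) amounts to a factor ε per emitted phonon, so that the rate scales as ε^p with p = ΔE/ħω_eff. The values ε = 0.037, ħω_eff = 260 cm^-1 (LaCl3) and 175 cm^-1 (LaBr3) are those listed in Table 6.1, while the 1500 cm^-1 gap is a hypothetical choice and the prefactors are ignored.

```python
def phonon_factor(gap_cm, phonon_cm, epsilon):
    """Return (p, epsilon**p) for a gap bridged by phonons of energy phonon_cm."""
    order = gap_cm / phonon_cm
    return order, epsilon ** order

EPSILON = 0.037  # same coupling parameter for both hosts (Table 6.1)
GAP_CM = 1500.0  # hypothetical energy gap, for illustration only

for host, phonon_cm in (("LaCl3", 260.0), ("LaBr3", 175.0)):
    order, factor = phonon_factor(GAP_CM, phonon_cm, EPSILON)
    print(f"{host}: p = {order:.2f}, epsilon^p = {factor:.2e}")
```

With identical ε, the softer LaBr3 lattice needs more phonons for the same gap, and under these assumptions that alone suppresses the rate by several orders of magnitude.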
Once sufficient data have been obtained for a given host to define the
exponential energy gap dependence and parameters in eq. (6.12), the
results can be used to predict the rate of spontaneous multiphonon decay
from any 4f rare-earth level of interest. The results are generally good to
within a factor of ~2-3. The only cautionary considerations are the
possible selection rule restrictions and the inapplicability of extrapolations
to small energy gaps.
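A minimal sketch of such a prediction follows. It assumes that eq. (6.12) has the usual exponential form W = C exp(-α ΔE) and that ε is related to α by ε = exp(-α ħω_eff). It uses the LaF3 entries of Table 6.1 (C = 6.6 x 10^8 sec^-1, α = 5.6 x 10^-3 cm, ħω_eff = 350 cm^-1); the gap values are hypothetical and the function names are illustrative only.

```python
import math

# LaF3 parameters as listed in Table 6.1
C_RATE = 6.6e8        # sec^-1
ALPHA = 5.6e-3        # cm
PHONON_CM = 350.0     # effective phonon energy, cm^-1

def multiphonon_rate(gap_cm, c_rate=C_RATE, alpha=ALPHA):
    """Spontaneous multiphonon emission rate, assuming W = C * exp(-alpha * gap)."""
    return c_rate * math.exp(-alpha * gap_cm)

def coupling_parameter(alpha=ALPHA, phonon_cm=PHONON_CM):
    """Per-phonon reduction factor, assuming epsilon = exp(-alpha * phonon energy)."""
    return math.exp(-alpha * phonon_cm)

for gap in (1000.0, 2000.0, 3000.0):  # hypothetical gaps in cm^-1
    print(f"gap {gap:.0f} cm^-1: p ~ {gap / PHONON_CM:.1f}, W ~ {multiphonon_rate(gap):.2e} sec^-1")
print(f"epsilon ~ {coupling_parameter():.2f}")
```

The computed ε of about 0.14 agrees with the tabulated LaF3 value, which is a useful consistency check when reading parameters off the table. As cautioned above, such estimates should not be extrapolated to gaps small enough for one- or two-phonon relaxation.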
Table 6.1. Phenomenological parameters for multiphonon relaxation of rare-earth ions in crystals

Crystal    ħω_eff (cm^-1)  C (sec^-1)    α (cm)        ε      Reference
LaCl3      260             1.5 x 10^10   1.3 x 10^-2   0.037  RISEBERG, GANDRUD and MOOS [1967]
LaBr3      175             1.2 x 10^10   1.9 x 10^-2   0.037  RISEBERG and MOOS [1968]
LaF3       350             6.6 x 10^8    5.6 x 10^-3   0.14   WEBER [1967a]; RISEBERG and MOOS [1968]
Y2O3       550             2.7 x 10^8    3.8 x 10^-3   0.12   WEBER [1968a]; RISEBERG and MOOS [1968]
SrF2       360             3.1 x 10^8    4.5 x 10^-3   0.20   RISEBERG and MOOS [1968]
Y3Al5O12   700             9.7 x 10^7    3.1 x 10^-3   0.045  ZVEREV, KOLODNYI and ONISHCHENKO [1971]
YAlO3      600             5.0 x 10^9    4.6 x 10^-3   0.063  WEBER [1973b]
LiYF4      400             3.5 x 10^7    3.8 x 10^-3   0.22   JENSSEN [1971]
BaY2F8     -               4.5 x 10^7    4.1 x 10^-3   -

Multiphonon relaxation in crystals containing high-frequency molecular
group vibrations in addition to the lattice phonon continuum requires
special consideration. An early treatment of CaWO4 (RISEBERG [1968])
indicated, at least qualitatively, a convergence of the multiphonon rate
with energy gap. REED and MOOS [1973a, b], from a more detailed study
of relaxation in YVO4, YAsO4, and YPO4, have shown that the rates can
be critically dependent on the degree of resonance between the gap energy
and the sums of peaks in the vibrational spectrum.
Experimental results to date have been limited to crystalline hosts;
however, similar dependences of multiphonon emission rates on temperature,
energy gap, and host may be expected for rare earths in liquids and
glasses. For liquids, in virtually all cases, there are high-frequency molec-
ular vibrations to consider. Relaxation in liquids has been treated by the
study of fluorescence lifetimes and efficiencies for a given transition in
different solutions (differing in some cases by a single substitution). The
vibrational spectrum is thus varied. Results have confirmed, in an ap-
proximate sense, the convergence of the multiphonon rates with the order
of the process (KROPP and WINDSOR [1965]). Although multiphonon rates
have not been extracted explicitly, it is probable that behavior similar to
that observed in molecular crystals will be observed.
Glasses also contain vibrational groups with frequencies higher than
most phonon frequencies in crystals. Although the degree of coupling of
these vibrations to rare-earth impurities is not well established, multi-
phonon decay bridging large energy gaps should be possible. This is
reflected, qualitatively, by the smaller number of fluorescing levels observed
for rare earths in glasses compared to crystals. Studies of radiative and
nonradiative relaxation of rare-earth doped glasses by REISFELD [1973]
and co-workers have already shown some evidence of a systematic behavior
of multiphonon decay rates similar to that found in crystals.
Overall, the phenomenon of multiphonon relaxation is reasonably well
understood with a workable theory having predictive value and well
supported by extensive experimental work.

6.3.4. 5d → 4f relaxation
When excitation occurs via 5d states, 5d → 4f relaxation is of interest.
As shown in Fig. 3.1, the 5d states of trivalent rare earths are located at
energies ≳ 50000 cm^-1. Optical transitions to these states are therefore
frequently masked by absorption of the host material. The 5d states of
divalent rare earths (Fig. 3.2), in comparison, are located at lower energies.
For most divalent and trivalent lanthanide ions, levels of the 4f^(N-1)5d
configuration overlap those of the 4f^N configuration. In such cases ions excited
into 5d levels rapidly decay nonradiatively to nearby 4f levels. This is
evident from the absence of 5d emission and the relative intensity and
quantum efficiency of 5d bands in the excitation spectra of 4f fluorescence.
Relaxation of 5d states has been investigated for trivalent rare earths in
Y3Al5O12 (WEBER [1973a]). In this host the crystal-field splitting of the
5d states is large, resulting in low-lying 5d levels. For ions such as Nd3+
and Tb3+, the broad 5d bands overlap 4f levels; no 5d → 4f emission was
observed and rapid nonradiative relaxation was inferred. For Ce3+ and
Pr3+, however, there are large energy separations from the lowest 5d level
to levels of 4f, approximately 16500 and 10000 cm^-1, respectively. 5d → 4f
fluorescence was observed from these ions and the lifetime and intensity
were measured as a function of temperature. The results are shown in
Fig. 6.6. At elevated temperatures, the Ce3+ and Pr3+ 5d lifetimes exhibit
a rapid decrease. Since the d → f energy gap for Ce3+ is larger, more phonons
are needed to conserve energy and a higher temperature is required before
stimulated phonon processes compete with radiative decay. The tempera-
ture dependences in Fig. 6.6 are more rapid than for f → f relaxation and
cannot be fitted using a single-frequency phonon model. LAUER and FONG
[1974] obtained good agreement with the Pr3+ temperature dependence
by using an intermediate strength coupling model, thus confirming that
the weak coupling approximation is not valid for 5d relaxation.

Fig. 6.6. Temperature dependence of the 5d fluorescence lifetimes and intensities for Ce3+
and Pr3+ in Y3Al5O12 (after WEBER [1973a]).

Nonradiative decay between the 5d levels of Ce3+ was also studied and
found to be fast, although only a limiting rate of ≥ 5 x 10^8 sec^-1 at 77 K
was obtained. For comparison, the extrapolated multiphonon emission
rate at 77 K between similarly spaced 4f levels in Y3Al5O12 is
sec^-1. Whereas intra-4f and intra-5d nonradiative transitions occur
between initial and final electronic states having the same parity, inter-
configuration 5d-4f nonradiative transitions require that at least one odd-
parity vibrational mode be active in the ion-lattice interaction. The ΔS = 0
spin selection rule should not be important since the spin-orbit interaction
causes significant spin state admixing in both configurations. If emission
can be observed from several high 5d levels, an energy gap dependence for
nonradiative decay similar to those for 4f^N levels might be obtained.
Generally, however, the 5d levels are sufficiently close that, based upon
the above rates, a rapid nonradiative cascade to the lowest 5d would be
expected to dominate.

§ 7. Cooperative Relaxation

7.1.1. Introduction
Paramagnetic ions in solids can be treated as isolated ions only when
they are well separated. As the concentration is increased or if non-random
distribution occurs, the ion spacing may become sufficiently small that the
ions interact. The coupling of adjacent paramagnetic ions can arise via
exchange interactions if their wavefunctions overlap, via super-exchange
interactions involving intervening ions, or via various electric and magnetic
multipolar interactions. Exchange effects are manifested in deviations
from single-ion paramagnetism or in ferromagnetic or antiferromagnetic
ordering. Bulk magnetic properties of crystals have been the subject of
active research beginning in the early years of this century. With the advent
of electron paramagnetic resonance (EPR) techniques after World War II,
it was possible to measure the magnitude of the fundamental interactions
between rare-earth ion pairs in solids. The resulting knowledge of the
nature and strength of the interactions obtained from dilute crystals could
then be applied to explain the magnetic properties of concentrated materials
such as magnetic transition temperatures, ordering characteristics in the
magnetic phase, and collective excitations (magnons). High-resolution
optical spectroscopy of magnetic materials has been directed toward
gaining a similar understanding of ion-ion interactions (LEASK [1968]).
Until the mid-1960s, the field of energy transfer in rare earths progressed
almost independently of the studies directed toward magnetic properties,
despite the fact that some of the basic interactions are the same. The first
systematic investigations of ion-ion energy transfer began in the 1940s.
These included observation of (1) sensitized luminescence, wherein excita-
tion into the absorption bands of one class of ions (sensitizers or donors)
resulted in emission from a second class of ions (activators or acceptors),
(2) fluorescence quenching by relaxation to another ion acting as a non-
radiative energy sink, and (3) self-quenching wherein the relative intensity
of fluorescence was observed to decrease as the concentration of the
activator was increased. These phenomena were observed even in relatively
dilute systems, thus suggesting that long-range interactions were responsible.
FÖRSTER [1948] had introduced a formalism for treating energy transfer in
molecules involving the electric dipole-dipole interaction. DEXTER [1953],
in a classic paper, considered the role of electric dipole-dipole, electric
dipole-quadrupole, and exchange interactions and derived expressions for
the transition probabilities and for the luminescence yield as a function of
concentration.
The Förster-Dexter theory was the starting point for modern research
on ion-ion processes for rare earths. Most work up until the late 1960s
was devoted to studying the concentration dependence of the fluorescence
quenching and attempting to determine the dominant term in the multi-
polar expansion of the interaction for a given system (VAN UITERT [1966]).
INOKUTI and HIRAYAMA [1965] extended Förster's and Dexter's work by
treating exchange coupling and including the time dependence of the
fluorescence decay when ion-pair interactions are present. This latter
feature yields an additional means of identifying the nature of the coupling.
Energy transfer from the point of view of the magnitudes of the ion-ion
interactions in the ground state, as measured by EPR, has been discussed
by BIRGENEAU [1968]. Recently KUSHIDA [1973a, b, c] has analyzed ion-
pair relaxation rates utilizing theoretical estimates of the interaction
Resonant energy transfer between like ions gives rise to spatial migration
of excitation. Relaxation by energy migration to quenching centers was
proposed by BOTDEN [1952] to account for concentration quenching of
fluorescence; the theory for this process was developed by DEXTER and
SCHULMAN [1954]. Within the past few years, the details and rates of energy
migration processes for rare earths have been elucidated further, principally
through studies of the time dependent behavior (WEBER [1971]).
In addition to these developments, a phenomenological approach has
evolved, one involving rate equations but not the explicit form of the
interaction. This approach has seen wide application to the understanding
of materials whose properties are dependent upon ion-ion coupling, such
as infrared-to-visible upconversion phosphors (AUZEL [1973]).

7.1.2. Theory

There are several interactions involving two or more ions which provide
the means for energy transfer and cooperative relaxation. Consider first
the electric multipolar coupling arising from the Coulomb interaction
between the electron charge clouds of two ions. Let r_Ai and r_Bj be the
coordinate vectors of electrons i and j belonging to ions A and B, respec-
tively. The electrostatic interaction is

where R is the internuclear separation and κ is the dielectric constant. The
various multipolar terms appear from a power series expansion of the
denominator. This expansion is most succinctly expressed in terms of
tensor operators (KUSHIDA [1973a]):

where C^{k1 k2}_{q1 q2} is a numerical factor dependent on the orientation of the
coordinate axes and D_q^{(k)} is a multipole operator


The leading terms in the expansion in eq. (7.2) are the electric dipole-dipole
(EDD), dipole-quadrupole (EDQ), and quadrupole-quadrupole (EQQ)
interactions. These have radial dependences of R^-3, R^-4, and R^-5, respec-
tively. Higher-order terms are generally of negligible importance.
The matrix elements of H_ES are subject to selection rules ΔS = 0 and
|ΔL|, |ΔJ| ≤ k. As was discussed earlier for radiative decay, these rules
are substantially relaxed for transitions between states of the 4f^N con-
figuration because of the large spin-orbit admixing of SL states and possible
J state mixing. Whereas EQ transitions are parity-allowed, ED transitions
require an admixing of opposite-parity states into 4f^N and are correspond-
ingly weak for rare earths. Because of this forbiddenness, EQQ transitions
may be more probable than EDD transitions (AXE and WELLER [1964]).
However, due to the more abrupt radial dependence, R^-5 versus R^-3,
EDD interactions should dominate at large separations. The intensity of
EDD transitions can be treated as in the Judd-Ofelt theory for radiative
A second ion-ion coupling mechanism is the magnetic dipole-dipole

where μ_i = l_i + 2s_i and (l_i, s_i), (l_j, s_j) are the orbital and spin operators for
the ith and jth electrons of ions A and B, respectively. The selection rules
ΔS, ΔL, ΔJ = 0, ±1 for transitions between 4f^N states are again relaxed by
SLJ state admixing. The MDD interaction has the same long-range R^-3
radial dependence as the EDD interaction.
Finally we consider the exchange interaction given by

J_ij in eq. (7.5) represents the isotropic or Heisenberg component of the
exchange integral. More generally J_ij is a tensor with additional terms for
the anisotropic and asymmetric contributions of the exchange interaction
(ERDÖS [1966]). For rare earths in solids, the large residual orbital angular
momentum results in high-rank orbital contributions to J_ij expressible in
the form

The high-rank terms can make significant contributions. Additionally,
such terms relax the selection rules for exchange interactions within 4f^N to
|ΔS| ≤ 1, |ΔL| ≤ 6, |ΔJ| ≤ 7. Again, because of state admixing, exchange is
relatively free of selection rule restrictions.



Fig. 7.1. Schematic diagrams of cooperative relaxation processes: (a) transfer involving
excited state to ground state de-excitation of one ion and excitation of a neighboring ion
to the same excited state; (b) transfer involving dissimilar pairs of levels of like or unlike ions
(dashed lines for like ions).
The manner in which the interactions between two like or two unlike
ions lead to energy transfer and relaxation is illustrated schematically in
Fig. 7.1. The simplest process is shown in Fig. 7.1a, where an ion A in an
excited state (2) decays to its ground state (1) with corresponding excita-
tion of a neighboring ion B from its ground state (1) to (2). If A and B are
identical ions, this process involves resonant transfer of energy from ion
to ion. While this does not lead to net relaxation, it does give rise to spatial
energy migration. In concentrated rare-earth materials resonant transfer
can be fast. Discussion of energy migration is postponed to section 7.1.3;
below we consider only a single ion pair.
Fig. 7.1b depicts a process whereby an excited ion A decays from state
(2) to (2') while ion B, initially in its ground state, is excited to (1'). Here
ions A and B may or may not be identical, as indicated by the dashed lines
for possible matching levels. In the absence of any additional interactions,
energy conservation imposes the constraint of resonance; that is,
(E_2 - E_2') = (E_1' - E_1). In liquids and solids, however, due to the presence
of ion-lattice coupling, any energy mismatch may be taken up by the
emission or absorption of one or more phonons. The interactions in eqs.
(7.2), (7.4), and (7.5) can all be treated to include ion-lattice coupling
(ORBACH [1967]).
The transition probability for energy transfer from ion A to B is given by

where H_AB is the ion-pair coupling Hamiltonian and F_A(E) and F_B(E)
are normalized line-shape functions for the transitions of ions A and B.
Detailed expressions for these probabilities for the cases of EDD, EDQ,
and direct exchange coupling were derived by DEXTER [1953]. The overlap
integral in eq. (7.7) denotes a resonance model with the transition rate
dependent on the degree of overlap of the two line-shape functions.
These ion-ion interactions differ in their dependence on ion separation.
The radial dependence of the ion-pair transfer rate is derived from the
square of H_AB in eq. (7.7) and is therefore R^-6, R^-8, and R^-10 for EDD,
EDQ, and EQQ coupling, respectively, and R^-6 for MDD. Direct exchange
involves an exponential decrease of the wave-functions contained in the cal-
culation of J_ij and can be treated in terms of a radial dependence exp(-R/L),
where L is an effective Bohr radius of the ground and excited states
under consideration (DEXTER [1953]). Thus W_AB has an exp(-2R/L)
radial dependence. Dexter correctly concluded that direct exchange, with
its exponential radial dependence, is probably too short-range for effective
energy transfer in dilute materials and little attention was paid to the role
of exchange until recently.
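The difference in range can be illustrated numerically. The sketch below compares the transfer rate, relative to its value at a reference separation R_0, for the multipolar laws (R_0/R)^n and for exchange, exp[-2(R - R_0)/L]. Normalizing every mechanism to the same rate at R_0 and taking R_0/L = 5 are assumptions made purely for illustration.

```python
import math

def multipolar_ratio(r_over_r0, n):
    """W(R)/W(R0) for an R^-n transfer rate (n = 6 EDD or MDD, 8 EDQ, 10 EQQ)."""
    return r_over_r0 ** (-n)

def exchange_ratio(r_over_r0, r0_over_l):
    """W(R)/W(R0) for an exp(-2R/L) transfer rate."""
    return math.exp(-2.0 * r0_over_l * (r_over_r0 - 1.0))

R0_OVER_L = 5.0  # hypothetical ratio of reference separation to effective Bohr radius

for x in (1.0, 1.5, 2.0, 3.0):
    row = [multipolar_ratio(x, n) for n in (6, 8, 10)]
    row.append(exchange_ratio(x, R0_OVER_L))
    print(x, ["%.1e" % value for value in row])
```

Under these assumptions the exchange rate falls far below every multipolar rate within a few multiples of R_0, consistent with its being ineffective in dilute materials.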
DEXTER [1953] also treated the concentration dependence of the lumines-
cence yield. For ion A, this is defined by

where W_AB is given by eq. (7.7) and τ_A^0 is the radiative lifetime of state A(2).
As the ion concentration is changed, the ion-ion separation and the prob-
ability for relaxation both change correspondingly. For a given interaction
and functional dependence on R, the average yield η̄_A for a single-pair model
is obtained by taking an integral over all space of η_A times the probability
that the nearest ion B is at a distance R. This yields

$$\bar{\eta}(y) = 1 - y \int_0^{\infty} \frac{e^{-yt}}{1 + t^{n/3}}\,dt \qquad (7.9)$$

with y = γ_n ρ_B, the reduced density, and t = 4πR^3/3γ_n. ρ_B is the density of
B ions and γ_n is defined by τ_A^0 W_AB = (3γ_n/4πR^3)^(n/3) for n = 6 (EDD), n = 8 (EDQ),
and n = 10 (EQQ).
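For orientation, eq. (7.9) can be evaluated numerically. The following is a minimal sketch, assuming the form η̄(y) = 1 - y ∫_0^∞ e^{-yt}/(1 + t^{n/3}) dt. The midpoint quadrature, the substitution t = u/(1 - u), and the function name are illustrative choices and are not part of Dexter's treatment.

```python
import math

def dexter_yield(y, n, steps=20000):
    """Average luminescence yield of eq. (7.9) for reduced density y and multipole order n.

    The semi-infinite t integral is mapped onto 0 < u < 1 via t = u / (1 - u)
    and evaluated with the midpoint rule.
    """
    h = 1.0 / steps
    integral = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        t = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2
        integral += math.exp(-y * t) / (1.0 + t ** (n / 3.0)) * jacobian * h
    return 1.0 - y * integral

for n in (6, 8, 10):
    print(n, [round(dexter_yield(y, n), 3) for y in (0.01, 0.1, 1.0, 10.0)])
```

The yield stays near unity at small reduced density and falls toward zero as y grows, which is the concentration quenching the model is meant to describe.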
A great body of experimental work over the past decade was devoted
to applying the Dexter model to determine which multipolar term of H_ES
was dominant for energy transfer and luminescence quenching in a partic-
ular dopant-crystal system. Van Uitert and co-workers, among others,
carried out an extensive series of studies in a wide variety of materials (see
GRANT [1971] for a bibliography of this work).
Although Dexter's treatment of luminescence yield describes the observed
concentration dependences in a reasonable way, the assignments of the
radial dependences and dominant interactions in a given case are often
ambiguous. In some instances, with the available data, there were not suffi-
cient differences in the theoretical dependences of η̄(n) on n. There are
several features of the overall approach which made analysis difficult.
First, at concentrations large enough to show substantial ion-pair decay,
resonant transfer (as in Fig. 7.1a) can be exceedingly fast among ions A,
particularly since the degree of resonance for ion-pair decay (as in Fig. 7.1b)
can be expected to be generally smaller. Therefore, those A ions surrounded
by a greater number of B ions than the average will dominate the decay,
and the short-range (but stronger) interactions will be enhanced (i.e. EDQ
and EQQ). At low concentrations, where the average separation is larger,
the longer-range interactions, such as EDD, will be dominant. Thus, in a
given material, different interactions dominate for different concentration

ranges. In the case where A and B represent different species, for sufficiently
high concentrations of A (greater than a few %) resonant transfer within
the A system is sufficiently fast that the excitation may be considered
equally shared among all of the ions A. A rate equation approach for
A → B transfer is then justified (GRANT [1971]).
An assumption of the Dexter approach was that the rate was dominated
by transfer to the nearest B ions. An extension to the entire environment
including the dynamics of the transfer was formulated by INOKUTI and
HIRAYAMA [1965] (hereafter referred to as IH). The difficulty in treating
the dynamics of these processes is that for an excited ion A decaying to a
random distribution of quenching ions B, the environment of the A ions
varies and the ion-pair relaxation rate depends on the particular environ-
ment of any excited ion A. Thus the decay of the system of excited ions A
is nonexponential. In the IH approach, an ion A is considered to be sur-
rounded by a set of quenching ions B_k at distances R_k. The energy transfer
rate from an A ion to the kth B ion is W_AB(R_k). The time dependence of
the A excited-state population is then

where N is the total number of ions B_k in a finite volume surrounding A,
and τ_A is the intrinsic lifetime of an isolated A ion. Experimentally the
observed quantities are proportional to the statistical average φ(t) of ρ(t)
over an infinitely large number of sensitizers. These volume integrals and
φ(t) were calculated for a variety of interaction models by IH. For the
inverse power laws characteristic of multipolar interactions, φ(t) is given by

$$\phi(t) = \phi_0 \exp\left[-\frac{t}{\tau_A} - \Gamma\left(1 - \frac{3}{n}\right)\frac{C}{C_0}\left(\frac{t}{\tau_A}\right)^{3/n}\right] \qquad (7.11)$$

where C is the acceptor (B) concentration and n = 6, 8, 10 for EDD, EDQ,
and EQQ, respectively. C_0 is a critical concentration defined by

$$C_0 = \frac{3}{4\pi R_0^3} \qquad (7.12)$$

where R_0 is the separation at which the energy transfer rate for an isolated
A-B pair is equal to 1/τ_A.
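The decay law of eq. (7.11) is straightforward to evaluate. The sketch below assumes the form φ(t) = exp[-t/τ_A - Γ(1 - 3/n)(C/C_0)(t/τ_A)^{3/n}], normalized here to φ(0) = 1, together with C_0 = 3/(4πR_0^3). The 1 ms lifetime, the concentration ratio, and the function names are hypothetical.

```python
import math

def critical_concentration(r0):
    """C_0 = 3 / (4 pi R_0^3), in units of inverse volume set by the units of r0."""
    return 3.0 / (4.0 * math.pi * r0 ** 3)

def ih_decay(t, tau_a, c_over_c0, n):
    """Inokuti-Hirayama donor decay for an R^-n multipolar interaction, with phi(0) = 1."""
    x = t / tau_a
    return math.exp(-x - math.gamma(1.0 - 3.0 / n) * c_over_c0 * x ** (3.0 / n))

TAU_A = 1.0e-3  # hypothetical intrinsic donor lifetime, seconds
for n in (6, 8, 10):
    decay = [ih_decay(f * TAU_A, TAU_A, 1.0, n) for f in (0.1, 0.5, 1.0, 2.0)]
    print(n, ["%.3e" % value for value in decay])
```

Setting the concentration ratio to zero recovers the simple exponential decay of an isolated donor, which is a convenient sanity check on any fitting routine built around this expression.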
The IH theory also treats the direct exchange interaction where φ(t) has
the form

and γ = 2R_0/L. g(z) in eq. (7.13) is a function defined by

$$g(z) = 6z \sum_{m=0}^{\infty} \frac{(-z)^m}{m!\,(m+1)^4} \qquad (7.14)$$
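As a quick numerical check on this series, the sketch below sums g(z) = 6z Σ (-z)^m / [m!(m+1)^4] term by term. The truncation at 60 terms is an arbitrary choice that is adequate only for moderate z, since the alternating terms cancel strongly when z is large, and the function name is illustrative.

```python
import math

def g_series(z, terms=60):
    """Partial sum of g(z) = 6 z * sum_m (-z)^m / (m! (m+1)^4), starting at m = 0."""
    total = 0.0
    for m in range(terms):
        total += (-z) ** m / (math.factorial(m) * (m + 1) ** 4)
    return 6.0 * z * total

for z in (0.001, 0.1, 1.0, 5.0):
    print(z, g_series(z))
```

For small z the function reduces to about 6z, so the leading term alone is sufficient at early times in the decay.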

Thus energy transfer can be studied via the observation of the time depen-
dence of the decay at a particular concentration. φ(t) may also be used in
the analysis of other quantities. For example, the quantum yield η̄_A is
simply given by

and the mean lifetime by

Numerical integration of eqs. (7.15) and (7.16) can be performed as de-
scribed by IH to determine these quantities as functions of C.
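A sketch of such a numerical integration is given below. Because eqs. (7.15) and (7.16) are not reproduced here, it assumes the usual definitions, η̄_A = (1/τ_A) ∫_0^∞ φ(t) dt and τ_m = ∫ t φ(t) dt / ∫ φ(t) dt, applied to the multipolar decay of eq. (7.11) with φ(0) = 1. The grid, the cutoff, and the function names are illustrative.

```python
import math

def ih_decay(x, c_over_c0, n):
    """Multipolar Inokuti-Hirayama decay versus x = t / tau_A, with phi(0) = 1."""
    return math.exp(-x - math.gamma(1.0 - 3.0 / n) * c_over_c0 * x ** (3.0 / n))

def yield_and_mean_lifetime(c_over_c0, n, x_max=40.0, steps=50000):
    """Return (quantum yield, mean lifetime / tau_A) by midpoint integration."""
    h = x_max / steps
    area = 0.0
    first_moment = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        phi = ih_decay(x, c_over_c0, n)
        area += phi * h
        first_moment += x * phi * h
    return area, first_moment / area

for ratio in (0.0, 0.5, 1.0, 2.0):
    eta, tau_m = yield_and_mean_lifetime(ratio, 6)
    print(f"C/C0 = {ratio}: yield ~ {eta:.3f}, mean lifetime ~ {tau_m:.3f} tau_A")
```

Both quantities fall monotonically as the acceptor concentration rises, and at zero concentration they return a yield of one and a mean lifetime of τ_A.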

7.1.3. Energy migration

Thus far only one-step processes involving resonant energy transfer
between donors and acceptors have been considered. Relaxation by energy
migration is a multistep process involving resonant energy transfer from
one ion to another of the same species in a random walk manner and finally
to an acceptor which acts as a quenching center or energy sink. The basic
step in the migration is illustrated in Fig. 7.1a. Migration becomes in-
creasingly important as the rare-earth content is increased. In concentrated
materials or where the rare earth is a constituent of the host, the probability
for resonant energy transfer between donor ions may be large. Rapid energy
diffusion can lead to a spatial equilibrium of excitation within the donor
system. The rate limiting step for the donor relaxation then becomes either
the donor-acceptor transfer rate or the acceptor relaxation rate. In the
limit of fast diffusion, a simple rate equation model for the donor system
relaxation can be used which predicts a simple exponential decay.
When the rate of energy diffusion within the donor system is slow but
still comparable to the intrinsic decay rate, the donor decay is composed
of competing processes. Those excited donors near acceptors relax pre-
dominantly by direct ion-pair energy transfer; those more distant donors,
however, must first diffuse into the vicinity of an acceptor before relaxation
occurs. The time evolution of φ(t) is governed by a diffusion equation

where D is the diffusion constant and v(r - r_n) is the probability for energy
transfer from an excited donor to the nth acceptor at r_n. YOKOTA and
TANIMOTO [1967] obtained a general solution for the donor fluorescence
decay function including both diffusion within the donor system and
donor-acceptor energy transfer via EDD coupling. Their expression at
earlier times in the decay reduces to eq. (7.11) for n = 6. The decay at
long times after excitation reduces to a simple exponential decay.
The asymptotic solution at long times is described by a characteristic
lifetime τ given by

$$\frac{1}{\tau} = \frac{1}{\tau_0} + 4\pi N_A D \rho \qquad (7.18)$$

where N_A is the density of acceptors and ρ is a length defined by (DE GENNES)

$$\rho = 0.68\,(C/D)^{1/4} \qquad (7.19)$$

which characterizes the relative effectiveness of direct transfer versus
migration to acceptors. This solution applies when d ≪ ρ ≪ a, where d is
the donor-donor separation and a is the distance between acceptors. The
changing time-dependent decay behavior from an initial nonexponential
decay given by eq. (7.11) to an exponential decay with a lifetime given by
eq. (7.18) is a distinguishing feature of diffusion-limited relaxation (WEBER [1971]).
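To give a feel for eqs. (7.18) and (7.19), the sketch below computes the length ρ and the asymptotic lifetime. It takes C in eq. (7.19) to be the donor-acceptor EDD coupling constant (transfer rate C/R^6), which is the usual meaning in this result but is an assumption here, and every numerical value is hypothetical.

```python
import math

def diffusion_length(c_da, diffusion):
    """rho = 0.68 * (C/D)**(1/4); C in cm^6/s and D in cm^2/s give rho in cm."""
    return 0.68 * (c_da / diffusion) ** 0.25

def asymptotic_lifetime(tau0, n_acceptor, diffusion, rho):
    """Long-time lifetime from 1/tau = 1/tau0 + 4 pi N_A D rho."""
    return 1.0 / (1.0 / tau0 + 4.0 * math.pi * n_acceptor * diffusion * rho)

# Hypothetical parameters, chosen only to give round numbers.
C_DA = 1.0e-39         # donor-acceptor coupling constant, cm^6/s
D = 1.0e-11            # diffusion constant, cm^2/s
N_A = 1.0e19           # acceptor density, cm^-3
TAU0 = 1.0e-3          # intrinsic donor lifetime, s

rho = diffusion_length(C_DA, D)
tau = asymptotic_lifetime(TAU0, N_A, D, rho)
print(f"rho ~ {rho * 1e8:.1f} angstrom, tau ~ {tau * 1e6:.0f} microseconds")
```

Because the added rate 4πN_A Dρ is linear in the acceptor density, plotting the measured long-time decay rate against acceptor concentration is one way such a product Dρ could be extracted.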


Since its appearance, the Inokuti-Hirayama theory has been employed

extensively to interpret ion-ion energy transfer studies. Shionoya and
co-workers have applied the IH model to several rare-earth systems
[1972]). Fig.7.2 shows the results for TbtoNd transfer in Ca(FQ,), glass.
The quantum yield ijA (referred to as Z/Zo) and a conveniently defined
lifetime zh are plotted as a function of concentration. The theoretical fit
for n = 8 suggests the dominance of the EDQ interaction. A variety of
other combinations of ions in Ca(PO,), glass were examined and all
showed more or less comparable agreement with EDQ coupling.

Fig. 7.2. Theoretically calculated concentration dependences (IH theory) of the emission
intensity (or quantum efficiency) $I/I_0$ and the lifetime $\tau_h/\tau_0$ of donor luminescence for the
EDQ interaction (n = 8). Circles show experimental data for Tb-to-Nd transfer in Ca(PO3)2 glass
(after NAKAZAWA and SHIONOYA [1967]).

The various theoretical dependences of $\phi(t)$ and $\eta_A$, however, are actually
very close, and it is often difficult to make an unambiguous determination
of the operative interaction. For example, Fig. 7.3 shows the Eu3+ $^5D_0$
luminescence decay for the system Y2O3: 1% Eu, 4% Yb. Although the
EDQ interaction provides the best fit, other fits are only slightly beyond
the experimental error bars. In contrast, Fig. 7.4 shows the luminescence
decay of the $^2F_{5/2}$ level of Yb3+ in YF3: 0.3% Yb, 6% Ho where the fit for
the EDD interaction is quite closely followed.
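
To make the comparison between multipolar fits more concrete, the following minimal sketch evaluates the donor decay function of the IH theory for n = 6, 8 and 10 (EDD, EDQ and EQQ). It assumes the standard Inokuti-Hirayama form, $\phi(t) = \exp[-t/\tau_0 - \Gamma(1 - 3/n)(c/c_0)(t/\tau_0)^{3/n}]$, which is presumably what eq. (7.11) expresses; the function name `ih_decay`, the reduced concentration `c_ratio` and the sample times are illustrative choices, not values from the studies cited above.

```python
import math


def ih_decay(t, tau0, c_ratio, n):
    """Inokuti-Hirayama donor decay phi(t) for a multipolar interaction.

    t       : time after pulsed excitation
    tau0    : intrinsic donor lifetime (same units as t)
    c_ratio : acceptor concentration divided by the critical concentration c0
    n       : 6 (EDD), 8 (EDQ) or 10 (EQQ)
    """
    x = t / tau0
    return math.exp(-x - math.gamma(1.0 - 3.0 / n) * c_ratio * x ** (3.0 / n))


# Tabulate the three curves at a few reduced times for c/c0 = 1 (assumed value).
for x in (0.25, 0.5, 1.0, 2.0):
    values = [ih_decay(x, 1.0, 1.0, n) for n in (6, 8, 10)]
    print(x, ["%.4f" % v for v in values])
```

In an actual fit the critical concentration $c_0$ is adjusted separately for each n, which tends to bring the curves closer together than this fixed-$c/c_0$ tabulation suggests.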
Although the IH theory appears to describe such cooperative relaxation
reasonably well, it suffers from limitations similar to those which plague
the Dexter theory. Because of the requirement that resonant energy migra-
tion does not occur, such experiments are valid only at low concentrations.
At higher concentrations, different multipolar terms may be expected to
begin to dominate because of their different range dependences. KUSHIDA
[1973a, b, c] has shown, from a detailed theoretical analysis and comparison
of various experimental results, that at lower concentrations the EDD
term should dominate because of its greater range. At high concentrations
(small average separations) the forbiddenness of EDD within $4f^n$ becomes
important compared with higher-order terms and the relation
EQQ > EDQ > EDD gradually becomes operable. Since the contributions
from each of these processes are somewhat comparable, definitive identification
of multipolar processes from an IH type analysis is difficult.
GRANT [1971], from a theoretical examination of the energy transfer
process, has concluded that concentration dependences do not reflect
multipolar terms, but rather the number of particles (ions) participating
in the relaxation process.

Fig. 7.3. Luminescence decay of the $^5D_0$ level of Eu3+ in Y2O3: 1% Eu3+, 4% Yb3+ at
liquid nitrogen temperature (Eu $^5D_0$: $\tau_0$ = 880 µs). The theoretical curves show the IH calculations for EDD, EDQ
and EQQ interactions (after YAMADA, SHIONOYA and KUSHIDA [1972]).

Exchange has never been definitively associated with a cooperative
relaxation process for rare earths from fits using the IH model. BIRGENEAU
[1968] has discussed the possible contribution of exchange for nearest
neighbor (nn) and sixth nearest-neighbor (6nn) coupling. In the nn case,
direct exchange was found to be comparable with EQQ, which is expected
to be the most important multipolar term for small separations. Depending
on the particular system under consideration, either of the two interactions
could dominate. For the 6nn case, it was found that superexchange via
the intervening ion ligands could be comparable with the long-range
multipolar interaction terms (MDD as well as electric). Here the situation
is even more unclear, since the expected concentration dependence of
superexchange would be difficult to analyze.

Fig. 7.4. Luminescence decay of the Yb3+ $^2F_{5/2}$ level in YF3: 1% Yb3+, 10% Ho3+. The solid
curve is experimental. Theoretical curves for exchange (dashed curve) and EDD (short and
long dashed curve) interactions using the IH theory are included (after WATTS and RICHTER
[1972]). Horizontal axis: time (ms).

While a dependable analysis of the interactions contributing to cooper-
ative relaxation is still lacking, much work has been done by treating the
relaxation phenomena in a more phenomenological fashion. For example,
the resonance imposed by the overlap integral of the Dexter model (eq.
(7.7)) has led to a variety of experiments to determine the dependence on
this parameter. NAKAZAWA and SHIONOYA [1967] compared the transfer
rates derived from the IH model for a variety of relaxation processes with
the expected dependence on the overlap integral. Using measured line-
widths in their calculations, they found very good agreement.
One of the implications of the resonance condition is the different rates
expected for different Stark components of a given J state. Thus, in addition
to line-broadening and line-shift effects, a substantial temperature depen-
dence is introduced as a result of the change in the thermal distributions
of ions among the various Stark levels. The temperature dependence can
be studied phenomenologically via a straightforward consideration of
Boltzmann statistics (ASAWA and ROBINSON [1966]).
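
The Boltzmann averaging described above can be sketched as follows. This is a minimal illustration rather than the procedure of the cited work: the Stark-level energies and the per-level transfer rates are hypothetical numbers, all levels are assumed to have equal degeneracy, and the level populations are assumed to be in thermal equilibrium.

```python
import math

K_B_CM = 0.695  # Boltzmann constant in cm^-1 per kelvin (rounded)


def stark_populations(energies_cm, temperature_k):
    """Fractional Boltzmann populations of Stark levels (energies in cm^-1)."""
    e0 = min(energies_cm)
    weights = [math.exp(-(e - e0) / (K_B_CM * temperature_k)) for e in energies_cm]
    total = sum(weights)
    return [w / total for w in weights]


def thermal_average_rate(energies_cm, rates, temperature_k):
    """Population-weighted transfer rate for a thermalized Stark manifold."""
    pops = stark_populations(energies_cm, temperature_k)
    return sum(p * w for p, w in zip(pops, rates))


# Hypothetical manifold: three Stark levels and their individual rates (s^-1).
LEVELS_CM = [0.0, 50.0, 200.0]
RATES = [10.0, 1000.0, 100.0]

for temperature in (4.2, 77.0, 300.0):
    print(temperature, thermal_average_rate(LEVELS_CM, RATES, temperature))
```

With these assumed numbers the effective rate rises steeply between liquid-helium and liquid-nitrogen temperature as the second Stark level becomes populated, which is the kind of temperature dependence the resonance condition implies.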
Many systems which exhibit ion-pair decay involve nonresonant
processes and large energy mismatches requiring the participation of
many phonons to conserve energy. In light of the discussion in section 6
of single-ion multiphonon relaxation, an exponential dependence of the
rate on the energy mismatch might again be expected to be operative.
Because the additional ion-ion coupling term in multiphonon-assisted pair
relaxation may well depend in a substantial way on atomic parameters, the
gap dependence might not hold quite so well as in the single-ion case. The
statistical averaging over the orbit-lattice interaction should still be relevant.
MIYAKAWA and DEXTER [1970] in their theoretical analysis of multi-
phonon processes derived a cooperative relaxation analog of the multi-
phonon gap dependence of eq. (6.12). This is given by
$$W = C\,e^{-\beta\,\Delta E} \qquad (7.20)$$
where $\beta$ is related to the single-ion multiphonon exponent $\alpha$ by $\beta = \alpha - \gamma$ and


$$\gamma = (\hbar\omega)^{-1}\ln(1 + g_B/g_A). \qquad (7.21)$$
In eq. (7.21), $g_A$ and $g_B$ are electron-lattice coupling parameters for ions
A and B, respectively. YAMADA, SHIONOYA and KUSHIDA [1972] carried
out an experimental study of the above energy gap dependence for a
variety of rare earth ion-pair systems in Y2O3. Assuming the interaction
to be EDQ, they determined the cooperative relaxation rate for an interionic
separation $R_{AB}$ = 10 Å (corresponding to a concentration of B = 4
mol. %). The results are shown in Fig. 7.5; an exponential gap dependence is
obeyed reasonably well with a $\beta$ value of $2.5 \times 10^{-3}$ cm. Taking $g_B = g_A$
and a value of $\alpha$ of $3.8 \times 10^{-3}$ cm from Table 6.1 yields a $\beta$ of $2.2 \times 10^{-3}$
cm, which is in excellent agreement with the results of Fig. 7.5.
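
Extracting the exponent from data such as those of Fig. 7.5 amounts to a straight-line fit of ln W against the energy gap. The sketch below shows one way to do this with an ordinary least-squares fit. The rates are synthetic, generated from an assumed prefactor and the exponent quoted above; they are not the measured values, and `fit_gap_law` is a hypothetical helper name.

```python
import math


def fit_gap_law(gaps_cm, rates):
    """Least-squares fit of ln W = ln C - beta * dE; returns (C, beta)."""
    n = len(gaps_cm)
    ys = [math.log(w) for w in rates]
    mean_x = sum(gaps_cm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in gaps_cm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gaps_cm, ys))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


# Synthetic, noise-free rates from assumed parameters (not measured data).
TRUE_C = 1.0e5        # s^-1, hypothetical prefactor
TRUE_BETA = 2.5e-3    # cm, the exponent quoted in the text
GAPS = [500.0, 1000.0, 1500.0, 2000.0, 2500.0]   # cm^-1
RATES = [TRUE_C * math.exp(-TRUE_BETA * g) for g in GAPS]

c_fit, beta_fit = fit_gap_law(GAPS, RATES)
print("C = %.3e s^-1, beta = %.3e cm" % (c_fit, beta_fit))
```

With real data the scatter about the line gives a direct impression of how well the exponential gap dependence is obeyed.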

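Fig. 7.5. Energy gap dependence of multiphonon assisted energy transfer in Y2O3 at liquid
nitrogen temperature (after YAMADA, SHIONOYA and KUSHIDA [1972]). Transfers plotted:
Ho($^5S_2$)→Sm, Ho($^5S_2$)→Tm, Er($^4S_{3/2}$)→Yb, Sm($^4G_{5/2}$)→Eu and Tm($^1G_4$)→Yb.
Horizontal axis: energy gap (cm$^{-1}$).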

Fig. 7.6. Schematic diagrams of various cooperative relaxation phenomena: (a) multi-ion
relaxation, (b) cooperative excitation, (c) Raman luminescence.

Several additional related processes of cooperative excited-state relaxation
have been observed experimentally. One is decay involving more
than two ions, which is indicated schematically in Fig. 7.6a. Here an ion A
decays, exciting two neighboring ions B1 and B2 to excited levels, the sum
of whose energies equals the transition energy of ion A. PORTER [1968]
has identified such a process in LaCl3: Ho3+. The inverse of this process,
cooperative excitation, is shown in Fig. 7.6b. Here two ions, A1 and A2,
decay with the energy going into excitation of a third ion. This process
has been observed by FEOFILOV and OVSYANKIN [1967]. A third example
is cooperative luminescence, in which two excited ions decay simultaneously,
emitting a single photon with an energy equal to the sum of the transition
energies of the two ions. This process was observed in YbPO4 by NAKAZAWA and SHIONOYA
[1970]. Finally, the process shown in Fig. 7.6c, known as Raman luminescence,
was observed in Yb2O3: Gd3+ by FEOFILOV and TROFIMOV [1969]. In
this case an ion A decays, simultaneously exciting ion B and emitting a
photon at an energy equal to the difference between the transition energies of ions A and B.
Similar processes have been observed involving one rare earth and one
iron group ion (VAN DER ZIEL [1970]) and have been described as
rare-earth-terminated chromium fluorescence.
The orbit-lattice interaction is operative for the above processes, and
there is a rich variety of relaxation phenomena involving the ion-ion interaction
in various combinations with radiative and phonon-induced transitions.
Theoretical treatments of such processes have been presented by
DEXTER [1962].
Another area where phenomenological treatment has produced useful
descriptions of experimental information arises when the concentration of
donor ions A is large and fast migration averages the environment of
acceptor ions B. GANDRUD and MOOS [1968] have studied fast energy
diffusion between rare earths in concentrated crystals. Because of the
condition of an averaged environment, the use of a rate equation model is
appropriate. Transition rates can be simply determined from the measurement
of the exponential decay of the excited ions. These approaches have
seen wide utilization in studies related to infrared upconversion phosphors
and GEUSIC [1971]; WATTS [1970]; KUSHIDA [1973b]). These phosphors
are discussed in section 8.
Several investigations in the past few years have demonstrated the existence
of diffusion-limited relaxation. The first study by WEBER [1971] used a
chromium-doped europium metaphosphate glass, where excited Eu3+ ions
formed the donor system and Cr3+ impurities acted as energy acceptors.
When Cr3+ ions were present, the initial Eu3+ fluorescence decay was
nonexponential, as given by eq. (7.11); the final decay, however, was
exponential with a rate dependent on Cr3+ concentration as given in eq.
(7.18). Since the rate of energy diffusion within the Eu3+ system could be
varied by changing the temperature, and thereby the number of resonant
transitions, it was possible (1) to verify the $D^{3/4}$ law predicted by combining
eqs. (7.18) and (7.19); and (2) to cover the range of relaxation from diffusion
limited to fast diffusion.
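
The $D^{3/4}$ law follows from substituting eq. (7.19) into eq. (7.18): the acceptor-induced part of the decay rate is $4\pi N_A D \rho = 4\pi (0.68) N_A C^{1/4} D^{3/4}$. The short sketch below evaluates this expression and checks the scaling numerically. The values chosen for the transfer constant C, the diffusion constant D and the acceptor density are hypothetical and serve only to give plausible orders of magnitude.

```python
import math


def transfer_length(c_da, d):
    """rho = 0.68 (C/D)^(1/4), eq. (7.19); C in cm^6/s and D in cm^2/s give cm."""
    return 0.68 * (c_da / d) ** 0.25


def diffusion_limited_rate(n_acceptor, c_da, d):
    """Acceptor-induced rate 1/tau - 1/tau0 = 4 pi N_A D rho from eq. (7.18), in s^-1."""
    return 4.0 * math.pi * n_acceptor * d * transfer_length(c_da, d)


# Hypothetical parameters, for illustration only.
C_DA = 1.0e-38   # donor-acceptor transfer constant, cm^6/s
D = 1.0e-10      # diffusion constant, cm^2/s
N_A = 1.0e19     # acceptor density, cm^-3

rate = diffusion_limited_rate(N_A, C_DA, D)
ratio = diffusion_limited_rate(N_A, C_DA, 16.0 * D) / rate
print("rate = %.1f s^-1, rate(16 D) / rate(D) = %.3f" % (rate, ratio))
```

Raising D by a factor of 16 raises the rate by a factor of $16^{3/4} = 8$, while the rate stays linear in the acceptor density.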
VAN DER ZIEL, KOPF and VAN UITERT [1972] conducted similar studies
of Tb3+ relaxation in (Y1-xTbx)3Al5O12 crystals and again verified the
$D^{3/4}$ dependence for dipolar processes. Because rapid resonant migration
is possible in concentrated rare-earth materials, it was concluded that
fluorescence quenching could frequently be due to energy migration to
unavoidable contaminants which serve as nonradiative sinks. Other
studies of energy migration include those of WATTS and RICHTER [1972]
(Yb3+ → Ho3+) and KRASUTSKY and MOOS [1973] (Pr3+ → Nd3+). In
these experiments the concentrations of both the donor and acceptor ions
were varied, thus changing the rate and relative importance of energy
migration. Recently the quantum efficiency of diffusion-limited energy
transfer between Ce3+ and Tb3+ was studied in La1-x-yCexTbyPO4
(BOURCET and FONG [1974]). Here the diffusion of donor energy involves
d-f transitions.
A recently developed technique for the observation of energy migration
in rare-earth systems involves excitation with a narrow-band laser in a
system with significant inhomogeneous broadening (RISEBERG [1973]). In
such a system excitation is selective, involving only a narrow energy band
within the inhomogeneously broadened system. A narrowed fluorescence
signal is observed, corresponding to emission by that class of ions that has
been excited. Migration from these ions to other ions can then be studied
by observation of the line-narrowed signal. Recently, a time-resolved
study of such laser-induced line-narrowing effects was carried out by
MOTEGI and SHIONOYA [1973] for Eu3+-doped Ca(PO3)2 glass. They
observed migration between Eu3+ ions located at sites having different
crystal fields, and were able to assign to the transfer process a phonon-
assisted EDD interaction.

§ 8. Selected Applications
A significant impetus for the study of rare-earth relaxation during the
last decade has been the application of rare-earth activated materials as
phosphors and in a variety of quantum electronic devices, particularly
lasers. For example, the luminescence of Eu3+ in the 600-650 nm region
has become the standard red phosphor emission in color television. Other
significant uses of rare-earth phosphors include fluorescent lamps and
X-ray image intensifying screens. In the case of lasers, a total of eleven
different rare earths (Pr3+, Nd3+, Sm2+, Eu3+, Tb3+, Dy2+,3+, Ho3+,
Er3+, Tm2+,3+, Yb3+, U3+,4+) have been lased in crystals, glasses and
liquids. The single class of lasers enjoying the widest practical utilization
today is based on the $^4F_{3/2} \to {}^4I_{11/2}$ emission of the Nd3+ ion at 1.06 µm.
In applications involving the luminescence of rare earths, the relaxation
processes discussed in this chapter are implicitly involved in a critical way.
After excitation, whether by optical, X-ray, cathode-ray, or other means,
it is the rates and relative importance of various relaxation processes that
determine what the emission spectrum will be, its intensity and efficiency,
and the lifetimes of the initial and final states. Below we discuss two rep-
resentative examples in which relaxation phenomena have been considered
explicitly in the design of luminescent devices.


AUZEL [1966] was the first to demonstrate the use of sequential excitation
whereby infrared light absorbed by Yb3+ ions is converted to visible light
emitted by Ho3+, Er3+, or Tm3+ ions. This infrared-to-visible upconversion
process for the Yb3+-Er3+ system is indicated schematically in
Fig. 8.1. Here an Yb3+ ion excited into the $^2F_{5/2}$ level relaxes by interacting
with an Er3+ ion and raising it to the $^4I_{11/2}$ level. Subsequently, a second
excited Yb3+ ion relaxes and raises the previously excited Er3+ to a higher
$^4F_{7/2}$ level. Phonon-induced relaxation across the small energy gaps lying
above the $^4S_{3/2}$ follows. Fluorescence in the green at 540 nm occurs from
the $^4S_{3/2}$ to the $^4I_{15/2}$ ground state, thereby completing the upconversion
process.
The Yb3+ concentrations in IR-to-visible upconversion materials are
relatively high (usually 10% or greater). Fast resonant transfer of the
excitation among the Yb3+ ions is therefore possible and variations in the
Er3+ surroundings are effectively averaged out. Because the Yb3+ system
sees an averaged environment, the excited-state dynamics can be treated
in terms of rate equations. HEWES and SARVER [1969] introduced the
use of rate equations in the investigation of these processes. Other workers

Fig. 8.1. Schematic diagram of infrared upconversion in the Yb3+-Er3+ system. Sequential
relaxation of two Yb ions results in excitation of visible luminescence from the Er3+ $^4S_{3/2}$ level.


and GEUSIC [1971]; MITA [1972]) have applied this form of analysis to
other systems with good success.
A general set of rate equations describing the upconversion process (after
AUZEL [1973]) are given below. For the population $A_i$ of the $i$th level of
the activator (having $n$ levels),
$$\frac{dA_i}{dt} = S_2\sum_{j=1}^{i-1} A_j\alpha_{ji} - S_2\sum_{j=i+1}^{n} A_i\alpha_{ij} + \sum_{j=i+1}^{n} A_j w_{ji} - \sum_{j=1}^{i-1} A_i w_{ij}$$
$$\qquad\qquad + S_1\sum_{j=i+1}^{n} A_j\beta_{ji} - S_1\sum_{j=1}^{i-1} A_i\beta_{ij} - \sum_{j=i+1}^{n} A_i A_j\gamma_{ji} - \sum_{j=1}^{i-1} A_i A_j\gamma_{ij} \qquad (8.1)$$
and for the population $S_2$ of the excited level of the sensitizer
$$\frac{dS_2}{dt} = S_1 R - S_2\Bigl[w_s + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \alpha_{ij}A_i\Bigr] + S_1\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \beta_{ji}A_j. \qquad (8.2)$$
Since the total populations A and S are constant, one has the additional
condition
$$\sum_{i=1}^{n} A_i = A; \qquad S_1 + S_2 = S. \qquad (8.3)$$
The transition probabilities $\alpha_{ij}$, $\beta_{ij}$, $\gamma_{ij}$, and $w_{ij}$ are independent of sensitizer
and activator concentrations and are defined as
    $\alpha_{ij}$ = transfer rate from Yb3+ to Er3+
    $\beta_{ij}$ = back transfer rate from Er3+ to Yb3+
    $\gamma_{ij}$ = Er3+ to Er3+ transition rate
    $w_{ij}$ = single-ion decay rate (radiative plus nonradiative)
$R$ is the rate of excitation of the Yb3+ ions.
Whereas eqs. (8.1) and (8.2) have been written in a more general form than
necessary to describe the situation of Fig. 8.1, the equations can readily be
specialized to any of the IR upconversion systems.
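
As an illustration of such a specialization, the sketch below reduces eqs. (8.1)-(8.3) to a two-level sensitizer and a three-level activator, neglects back transfer and activator-activator terms, and integrates the equations to the steady state with a fourth-order Runge-Kutta step. Everything numerical here is an assumption made for the example: populations are normalized to unity, the rate constants are dimensionless placeholder values, and the integration settings are arbitrary. It is not the model of any of the cited studies.

```python
def derivatives(state, p):
    """Right-hand sides for (S2, A2, A3); S1 and A1 follow from eq. (8.3)."""
    s2, a2, a3 = state
    s1 = p["S"] - s2
    a1 = p["A"] - a2 - a3
    # Eq. (8.2) with back transfer neglected.
    ds2 = p["R"] * s1 - s2 * (p["w_s"] + p["alpha12"] * a1 + p["alpha23"] * a2)
    # Eq. (8.1) for i = 2: feeding from level 1 and level 3, loss by transfer and decay.
    da2 = (p["alpha12"] * s2 * a1 - p["alpha23"] * s2 * a2
           + p["w32"] * a3 - p["w21"] * a2)
    # Eq. (8.1) for i = 3: second transfer step minus single-ion decay.
    da3 = p["alpha23"] * s2 * a2 - (p["w31"] + p["w32"]) * a3
    return (ds2, da2, da3)


def rk4_step(state, p, dt):
    """One classical Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))
    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2.0), p)
    k3 = derivatives(shifted(k2, dt / 2.0), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def steady_state(p, dt=0.02, t_end=60.0):
    """Integrate from unexcited ions until transients have died out."""
    state = (0.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, p, dt)
    return state


# Placeholder parameters in units of the sensitizer decay rate w_s.
BASE = dict(S=1.0, A=1.0, w_s=1.0, w21=1.0, w31=0.8, w32=0.2,
            alpha12=0.1, alpha23=0.1, R=1.0e-3)


def emission(excitation_rate):
    """Steady-state emission rate from level 3 to the ground level."""
    params = dict(BASE, R=excitation_rate)
    return BASE["w31"] * steady_state(params)[2]


ratio = emission(2.0e-3) / emission(1.0e-3)
print("emission ratio for doubled excitation: %.3f" % ratio)
```

At these weak excitation rates, doubling R increases the visible emission by close to a factor of four, the quadratic dependence expected for a two-step sequential process. At much larger R the ground-state depletion terms become important and the dependence weakens.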
For a complex ion-ion system with many levels, the above set of equations
contains a large number of rate constants. Some of the rates can be determined
from measurements of fluorescence lifetimes and quantum efficiencies.
Others, such as multiphonon relaxation rates, can be estimated from
the guidelines and phenomenological models discussed earlier. In any
given system, the large number of unknown parameters in eqs. (8.1) and
(8.2) may necessitate the use of other simplifying assumptions. Thus, for
example, HEWES and SARVER [1969] reduced the number of levels to be
considered and treated only three levels of Er3+ ($^4I_{15/2}$, $^4I_{11/2}$ and $^4S_{3/2}$). In
other cases, back transfer has either been neglected (HEWES and SARVER
[1969]) or been considered to be as probable as direct transfer and much
faster than other de-excitation processes for either of the levels (KINGSLEY
[1970]). The justification of such assumptions must ultimately depend
upon the relative rates of the associated relaxation processes.
Studies of the kinetics are useful in assessing the performance of these
systems. One can predict, for example, the dependence of efficiency on
input power. By a combination of experimental measurements and phenomenological
analysis, WATTS [1970] was able to solve the kinetics and
extract the power efficiency of green emission from Ho3+ in YF3: Yb, Ho.
The efficiency E is shown plotted against the excitation rate X in Fig. 8.2
normalized to the radiative quantum efficiency $\eta$ of the $^5S_2$, $^5F_4$ state. The
significant point of the analysis is the saturation behavior predicted at
high values of E.

Fig. 8.2. Power efficiency E (normalized to radiative quantum efficiency $\eta$) for the Ho3+
green emission in YF3: Yb, Ho as a function of Yb excitation rate X (after WATTS [1970]).



All of the relaxation phenomena we have discussed (radiative, multiphonon,
cooperative) are implicitly relevant to the development and
evaluation of the performance of rare-earth lasers. A well-known example
is the Nd3+ laser. Referring to the energy level scheme in Fig. 2.1, optical
pumping is achieved by absorption into the levels lying above $^4F_{3/2}$. For
high efficiency the medium should exhibit rapid nonradiative decay from
these levels directly to the $^4F_{3/2}$ level via a cascade of multiphonon emission
processes. Multiphonon relaxation from the $^4F_{3/2}$, in contrast, should be
slow compared to its radiative decay. The strength of the radiative transition
from $^4F_{3/2}$ to $^4I_{11/2}$ determines the stimulated emission cross-section and
thereby the threshold, gain, and energy storage properties of the medium.
To maintain the characteristics of a four-level system, multiphonon decay
from the $^4I_{11/2}$ to the $^4I_{9/2}$ must be fast compared to decay from $^4F_{3/2}$. Finally,
if the Nd3+ content becomes too high (above 1-5%, in most cases), self-quenching
of the fluorescence sets in, involving $^4F_{3/2} \to {}^4I_{15/2}$ decay of one
ion with excitation of a neighboring Nd3+ ion to either $^4I_{13/2}$ or $^4I_{15/2}$.
Various materials do not satisfy the above requirements. For example,
soft crystals suffer from bottlenecking in the cascade to the $^4F_{3/2}$ because
optical phonon energies are low and multiphonon rates are slow. In certain
glasses, such as the borates, the presence of high-energy vibrational quanta
results in partial nonradiative quenching of the $^4F_{3/2}$ level. Fortunately, the
desirable properties are observed for most hard crystals (high-energy
optical phonons) at Nd3+ concentrations of less than a few percent. It is
interesting to note that the Nd3+ laser was invented before such refined
analysis of the basic relaxation phenomena was made.
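
The competing requirements on the multiphonon rates can be made quantitative with a rough estimate. The sketch below assumes that the single-ion gap law of section 6 has the exponential form $W = C e^{-\alpha \Delta E}$, by analogy with eq. (7.20), and computes the radiative quantum efficiency $A/(A + W)$ of a level as a function of the gap below it. The exponent is the $\alpha$ value quoted in section 7 for Y2O3; the prefactor, the radiative rate and the two gaps are hypothetical values chosen only to show the trend.

```python
import math

ALPHA_CM = 3.8e-3     # gap-law exponent (cm) quoted in section 7 for Y2O3
C_PREFACTOR = 1.0e8   # hypothetical multiphonon prefactor, s^-1
A_RADIATIVE = 4.0e3   # hypothetical radiative decay rate, s^-1


def multiphonon_rate(gap_cm):
    """Assumed gap law W = C exp(-alpha * dE), with dE in cm^-1."""
    return C_PREFACTOR * math.exp(-ALPHA_CM * gap_cm)


def quantum_efficiency(gap_cm):
    """Fraction of decays from a level that are radiative."""
    return A_RADIATIVE / (A_RADIATIVE + multiphonon_rate(gap_cm))


# A small gap (as in the pump-band cascade) versus a large gap (metastable level).
for gap in (1000.0, 5500.0):
    print("gap %6.0f cm^-1: W = %.3e s^-1, efficiency = %.5f"
          % (gap, multiphonon_rate(gap), quantum_efficiency(gap)))
```

With these assumed numbers the small gap is bridged several hundred times faster than the radiative rate, while the large gap leaves the level almost purely radiative, which is the combination a four-level laser medium needs.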
Utilization of energy transfer to increase the optical pumping efficiency
was an early feature of the investigatory stages of rare-earth solid-state
lasers. One of the most successful applications of energy transfer is the
Ho3+ laser. Increased absorption and pumping efficiency were obtained
by multiple-doping with Er3+, Tm3+, and Yb3+ to sensitize the Ho3+ $^5I_7$
fluorescence. The approach was first developed for Y3Al5O12 (JOHNSON,
GEUSIC and VAN UITERT [1966]), and has subsequently been extended to
other hosts. Co-doping with high concentrations of Er3+ and Yb3+ yields
intense absorption bands which provide large spectral overlap with broad-band
pump sources. A complex sequence of ion pair processes results in
most of the excitation going into the $^5I_7$ level of Ho3+. This is the lowest
excited J level and remains metastable. The Tm3+, although it also contributes
to the absorption, serves mainly to provide additional levels in the
fast energy transfer sequence. This sensitized laser is the most efficient
optically pumped solid-state laser ever operated.
Another more recent example of the use of relaxation phenomena in
laser design is LiYF4: Tb3+ (JENSSEN, CASTLEBERRY, GABBE and LINZ
[1973]). Optical pumping of Tb3+ is via the levels above $^5D_3$. The $^5D_3$-to-$^5D_4$
gap, however, is too large to be bridged by multiphonon decay. To
circumvent this, the concentration of Tb3+ was raised to 25% to increase
the probability for $^5D_3$ to $^5D_4$ decay via ion-pair relaxation involving
resonant or near-resonant transitions of the type $^5D_3 \to {}^5D_4$ : $^7F_6 \to {}^7F_0$,
or $^5D_3 \to {}^7F_0$ : $^7F_6 \to {}^5D_4$. Lasing then occurs from $^5D_4$ to $^7F_5$. Since
the $^7F_5$ to $^7F_6$ gap is small, the $^7F_5$ terminal laser level is rapidly depleted
by multiphonon decay, thus completing a four-level laser scheme.
There are many other examples of the importance of relaxation phenomena
in applications of luminescent rare-earth materials. Those discussed
above indicate, in a general way, how such processes have been considered
in the past. Such considerations no doubt will play a similar role
in the development of future materials and devices.

§ 9. Concluding Remarks
In this chapter, we have attempted to survey the relaxation phenomena
affecting the luminescent properties of rare-earth ions in solids. Due to
the rapid growth of activity in this field in recent years, the literature is
voluminous and therefore no attempt was made to cover it completely.
Instead, examples were chosen for their appropriateness in illustrating the
discussion. The coverage has therefore been selective in detail, yet broad
in scope. The intention has been to acquaint the non-specialist with the
subject and to provide the applications-oriented individual with a familiar-
ity with the basic phenomena having implications on his project.
In the case of radiative relaxation, the Judd-Ofelt theory provides a useful
rationale for parametrizing the phenomena. Through a combination of
experimental measurements and theoretical calculations, the radiative
transitions of the lower $4f^N$ levels of rare earths can be satisfactorily accounted
for by the worker willing to devote the effort required to carry out
such an analysis. In the case of multiphonon relaxation, the phenomenological
model provides a powerful description. It is relatively straightforward
to apply in most cases, with an accuracy which, while not precise, is adequate
for semi-quantitative evaluation and prediction of performance.
For cooperative relaxation, the situation is not as satisfactory. The
question of which basic interaction dominates cannot be easily answered
for a given situation. Nor is it clear what approach can shed light on a
more general description of ion-pair interactions. Although qualitative
remarks can be made, quantitative predictions of an ion-pair decay rate
for a given rare-earth ion concentration in a given crystal are subject to
large error. As in most fields that have matured to any degree, it appears
that the remaining problems are the most difficult.

References
ASAWA, C. K. and M. ROBINSON, 1966, Phys. Rev. 141, 251.
AUZEL, F., 1966, C. R. Acad. Sci. (Paris) 262, 1016.
AUZEL, F., 1973, Proc. IEEE 61, 758.
AXE, J. D., 1963, J. Chem. Phys. 39, 1154.
AXE, J. D. and P. F. WELLER, 1964, J. Chem. Phys. 40, 3066.
BARASCH, G. E. and G. H. DIEKE, 1965, J. Chem. Phys. 43, 988.
BECKER, P. J., 1971, Phys. Status Solidi B 43, 988.
BECQUEREL, J., 1907, Radium 4, 328.
BETHE, H., 1930, Z. Physik 60, 218.
BIRGENEAU, R. J., 1968, Appl. Phys. Letters 13, 193.
BLEANEY, B. and K. W. H. STEVENS, 1953, Rept. Prog. Phys. 16, 108.
BOTDEN, T. P. J., 1952, Philips Res. Repts. 7, 197.
BROER, L. J. F., C. J. GORTER and J. HOOGSCHAGEN, 1945, Physica 11, 231.
BOURCET, J. C. and F. K. FONG, 1974, J. Chem. Phys. 60, 34.
BUKIETYNSKA, K. and G. R. CHOPPIN, 1970, J. Chem. Phys. 52 (6), 2875.
CARNALL, W. T., P. R. FIELDS and K. RAJNAK, 1968, J. Chem. Phys. 49, 4412, 4424, 4443, 4447, 4450.
CARNALL, W. T., P. R. FIELDS and B. G. WYBOURNE, 1965, J. Chem. Phys. 42, 3797.
CHAMBERLAIN, J. R., A. C. EVERITT and J. W. ORTON, 1968, J. Phys. C 1, 157.
CHAMBERLAIN, J. R., D. H. PAXMAN and J. L. PAGE, 1966, Proc. Phys. Soc. (London) 89, 143.
DE GENNES, P. G., 1958, J. Phys. Chem. Solids 7, 345.
DELSART, C. and N. PELLETIER-ALLARD, 1971, J. Phys. (Paris) 32, 507.
DELSART, C. and N. PELLETIER-ALLARD, 1973, J. Phys. C 6, 1277.
DETRIO, J. A., 1971, Phys. Rev. B4, 1422.
DEXTER, D. L., 1953, J. Chem. Phys. 21, 836.
DEXTER, D. L., 1962, Phys. Rev. 126, 1962.
DEXTER, D. L. and J. H. SCHULMAN, 1954, J. Chem. Phys. 22, 1063.
DI BARTOLO, B., 1968, Optical Interactions in Solids (Wiley, New York).
DIEKE, G. H., 1961, Spectroscopic Observations on Maser Materials, in: Advances in Quantum Electronics, ed. J. R. Singer (Columbia University Press, New York) pp. 164-186.
DIEKE, G. H., 1963, Energy Levels of and Energy Transfer in Rare Earth Salts, in: Paramagnetic Resonance, ed. W. Low (Academic Press, New York) pp. 237-252.
DIEKE, G. H., 1968, Spectra and Energy Levels of Rare Earth Ions in Crystals (John Wiley and Sons, Inc., New York).
ELYASHEVICH, M. A., 1953, Spectra of the Rare Earths (State Publishing House of Technical-Theoretical Literature, Moscow).
ERDÖS, P., 1966, J. Phys. Chem. Solids 27, 1705.
FEOFILOV, P. P. and V. V. OVSYANKIN, 1967, Appl. Opt. 6, 1828.
FEOFILOV, P. P. and A. K. TROFIMOV, 1969, Opt. and Spect. 27, 291.
FONG, F. K., S. L. NABERHUIS and M. M. MILLER, 1972, J. Chem. Phys. 56, 4020.
FÖRSTER, T., 1948, Ann. Physik 2, 55.
FÖRSTER, T., 1949, Z. Naturforsch. 4a, 321.
FOWLER, W. B. and D. L. DEXTER, 1962, Phys. Rev. 128, 2154.
FOWLER, W. B. and D. L. DEXTER, 1965, J. Chem. Phys. 43, 1768.
FREED, K. F. and J. JORTNER, 1970, J. Chem. Phys. 52, 6272.
GANDRUD, W. B. and H. W. MOOS, 1968, J. Chem. Phys. 49, 2170.
GASHUROV, G. and O. J. SOVERS, 1969, J. Chem. Phys. 50, 429.
GERMAN, K. R. and A. KIEL, 1973, Phys. Rev. B8, 1846.
GRANT, W. J. C., 1971, Phys. Rev. B5, 648.
HELLWEGE, K. H., 1942, Ann. d. Phys. 40, 529.
HELLWEGE, K. H., 1947, Naturwiss. 34, 212.
HEWES, R. A. and J. F. SARVER, 1969, Phys. Rev. 182, 427.
HOSHINA, T. and S. KUBONIWA, 1971, J. Phys. Soc. Japan 31, 828.
INOKUTI, M. and F. HIRAYAMA, 1965, J. Chem. Phys. 43, 1978.
JENSSEN, H. P., 1971, Phonon Assisted Laser Transitions and Energy Transfer in Rare Earth Crystals, Technical Report No. 16 (M.I.T. Crystal Physics Laboratory, Cambridge).
JENSSEN, H. P., D. CASTLEBERRY, D. GABBE and A. LINZ, 1973, IEEE J. Quant. Elec. QE-9, 665.
JOHNSON, L. F., J. E. GEUSIC and L. G. VAN UITERT, 1966, Appl. Phys. Letters 8, 200.
JOHNSON, L. F., H. J. GUGGENHEIM, T. C. RICH and F. W. OSTERMAYER, 1972, J. Appl. Phys. 43, 1125.
JOHNSON, L. F. and H. J. GUGGENHEIM, 1973, Appl. Phys. Letters 23, 96.
JØRGENSEN, C. K. and B. R. JUDD, 1964, Mol. Phys. 8, 281.
JUDD, B. R., 1962, Phys. Rev. 127, 750.
JUDD, B. R., 1966, J. Chem. Phys. 44, 839.
KIEL, A., 1964, Multi-Phonon Spontaneous Emission in Paramagnetic Crystals, in: Quantum Electronics, eds. P. Grivet and N. Bloembergen (Columbia University Press, New York) pp. 765-777.
KINGSLEY, J. D., 1970, J. Appl. Phys. 41, 175.
KISLIUK, P. and C. A. MOORE, 1967, Phys. Rev. 160, 307.
KRAMERS, H. A., 1930, Proc. Acad. Sci. Amsterdam 32, 1176.
KRASUTSKY, N. and H. W. MOOS, 1973, Phys. Rev. B8, 1010.
KROPP, J. L. and M. W. WINDSOR, 1965, J. Chem. Phys. 42, 1599.
KRUPKE, W. F., 1966, Phys. Rev. 145, 325.
KRUPKE, W. F., 1971, IEEE J. Quant. Elec. QE-7, 153.
KRUPKE, W. F., 1972, IEEE J. Quant. Elec. QE-8, 725.
KRUPKE, W. F., 1974, IEEE J. Quant. Elec. QE-10, 450.
KRUPKE, W. F. and J. B. GRUBER, 1965, Phys. Rev. 139, 2008.
KUBONIWA, S. and T. HOSHINA, 1972, J. Phys. Soc. Japan 32, 1059.
KUSHIDA, T., 1973a, J. Phys. Soc. Japan 34, 1318.
KUSHIDA, T., 1973b, J. Phys. Soc. Japan 34, 1327.
KUSHIDA, T., 1973c, J. Phys. Soc. Japan 34, 1334.
LAUER, H. V. and F. K. FONG, 1974, J. Chem. Phys. 60, 274.
LEASK, M. J. M., 1968, J. Appl. Phys. 39, 908.
LOH, E., 1966, Phys. Rev. 147, 332.
LOH, E., 1968, Phys. Rev. 175, 533.
MANTHEY, W. J., 1973, Phys. Rev. B8, 4086.
MITA, Y., 1972, J. Appl. Phys. 43, 1772.
MIYAKAWA, T. and D. L. DEXTER, 1970, Phys. Rev. B1, 2961.
MOOS, H. W., 1970, J. Luminescence 1, 106.
MOTEGI, N. and S. SHIONOYA, 1973, J. Luminescence 8, 1.
NAKAZAWA, E. and S. SHIONOYA, 1967, J. Chem. Phys. 47, 322.
NAKAZAWA, E. and S. SHIONOYA, 1970, Phys. Rev. Letters 25, 1710.
NEWMAN, D. J., 1971, Adv. Phys. 20, 197.
NIELSON, C. W. and G. F. KOSTER, 1964, Spectroscopic Coefficients for the $p^n$, $d^n$ and $f^n$ Configurations (The M.I.T. Press, Cambridge, Massachusetts).
OFELT, G. S., 1962, J. Chem. Phys. 37, 511.
ORBACH, R., 1961, Proc. Roy. Soc. A264, 458.
ORBACH, R., 1967, Phonon Sidebands and Energy Transfer, in: Optical Properties of Ions in Crystals, eds. H. M. Crosswhite and H. W. Moos (Wiley Interscience, New York) pp. 445-455.
GEUSIC, 1971, Phys. Rev. B3, 2698.
PARTLOW, W. D. and H. W. MOOS, 1967, Phys. Rev. 157, 252.
PEACOCK, R. D., 1971, J. Chem. Soc. A, 2028.
PEACOCK, R. D., 1972a, J. C. S. Faraday II 68, 169.
PEACOCK, R. D., 1972b, Chem. Phys. Letters 16, 590.
PEACOCK, R. D., 1973, Mol. Phys. 25, 817.
PORTER Jr., J. F., 1968, Bull. Am. Phys. Soc. 13, 102.
REED Jr., E. D. and H. W. MOOS, 1973a, Phys. Rev. B8, 980.
REED Jr., E. D. and H. W. MOOS, 1973b, Phys. Rev. B8, 988.
REISFELD, R., 1973, Spectra and Energy Transfer of Rare Earths in Inorganic Glasses, in: Structure and Bonding, Vol. 13, eds. J. D. Dunitz, P. Hemmerich, J. Ibers, C. Jørgensen, J. Neilands, R. Nyholm, D. Reinen and R. Williams (Springer-Verlag, Berlin, Heidelberg, New York) pp. 53-98.
REISFELD, R., L. BOEHM, N. LIEBLICH and B. BARNETT, 1973, Proc. Tenth Rare Earth Res. Conf. 2, 1142.
RINCK, B., 1948, Z. Naturforsch. 3, 406.
RISEBERG, L. A., 1968, Multiphonon Orbit-Lattice Relaxation of Excited States of Rare Earth Ions in Crystals, Ph.D. Thesis (The Johns Hopkins University, Baltimore).
RISEBERG, L. A., 1973, Phys. Rev. A7, 671.
RISEBERG, L. A., W. B. GANDRUD and H. W. MOOS, 1967, Phys. Rev. 159, 262.
RISEBERG, L. A. and H. W. MOOS, 1967, Phys. Rev. Letters 19, 1423.
RISEBERG, L. A. and H. W. MOOS, 1968, Phys. Rev. 174, 429.
STRUCK, C. W. and W. H. FONGER, 1971, Phys. Rev. B4, 22.
STURGE, M. D., 1973, Phys. Rev. B8, 6.
VAN DER ZIEL, J. P., 1970, J. Luminescence 1, 807.
VAN DER ZIEL, J. P., L. KOPF and L. G. VAN UITERT, 1972, Phys. Rev. B6, 615.
VAN UITERT, L. G., 1966, Luminescence of Insulating Solids for Optical Masers, in: Luminescence of Inorganic Solids, ed. P. Goldberg (Academic Press, New York) pp. 465-
VAN VLECK, J. H., 1937, J. Phys. Chem. 41, 67.
WATTS, R. K., 1970, J. Chem. Phys. 53, 3552.
WATTS, R. K. and H. J. RICHTER, 1972, Phys. Rev. B6, 1584.
WEBER, M. J., 1966, Radiative and Nonradiative Transitions of Rare-Earth Ions: Er3+ in LaF3, in: Physics of Quantum Electronics, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) pp. 350-360.
WEBER, M. J., 1967a, Phys. Rev. 157, 262.
WEBER, M. J., 1967b, Relaxation Processes for Excited States of Eu3+ in LaF3, in: Optical Properties of Ions in Crystals, eds. H. M. Crosswhite and H. W. Moos (Wiley Interscience, New York) pp. 467-484.
WEBER, M. J., 1968a, Phys. Rev. 171, 283.
WEBER, M. J., 1968b, J. Chem. Phys. 48, 4771.
WEBER, M. J., 1971, Phys. Rev. B4, 2932.
WEBER, M. J., 1973a, Solid State Commun. 12, 741.
WEBER, M. J., 1973b, Phys. Rev. B8, 54.
WEBER, M. J., 1973c, Proc. Tenth Rare Earth Res. Conf. 2, 932.
WEBER, M. J., B. MATSINGER, V. L. DONLAN and G. T. SURRATT, 1972, J. Chem. Phys. 57, 562.
WEBER, M. J., T. E. VARITIMOS and B. H. MATSINGER, 1973, Phys. Rev. B8, 47.
WYBOURNE, B. G., 1965, Spectroscopic Properties of Rare Earths (Wiley Interscience, New York).
YAMADA, N. S., S. SHIONOYA and T. KUSHIDA, 1972, J. Phys. Soc. Japan 32, 1577.
YATSIV, S., 1962, Physica 28, 521.
YEN, W. M., W. C. SCOTT and A. L. SCHAWLOW, 1964, Phys. Rev. 136, A271.
YOKOTA, M. and O. TANIMOTO, 1967, J. Phys. Soc. Japan 22, 779.
ZVEREV, G., G. Y. KOLODNYI and A. M. ONISHCHENKO, 1971, Soviet Physics JETP 33, 497.



Bell Laboratories,
Murray Hill, New Jersey 07974, U.S.A.

* Present address: Sandia Laboratories, Albuquerque, New Mexico 87115, U.S.A.


§ 1. INTRODUCTION
§ 2. THE ULTRAFAST OPTICAL KERR SHUTTER
§ 3. ULTRAHIGH SPEED PHOTOGRAPHY
§ 4. SAMPLING OPTICAL SIGNALS
§ 5. CONCLUDING REMARKS
ACKNOWLEDGEMENTS
REFERENCES
§ 1. Introduction

One of the new frontiers to be opened by the laser has been the area of
picosecond pulses. This began in 1966 when DEMARIA, HEYNAU and
STETSER [1966] reported the emission of ultrashort pulses by a neodymium:glass
laser Q-switched by a saturable absorber. Soon thereafter ARMSTRONG
[1967] measured these pulses and found them to have a duration on the
order of 6 picoseconds, a record at the time in terms of ultrashort optical
or electrical pulses.
The discovery of picosecond laser pulses stimulated the invention of new
techniques for measuring ultrashort optical pulses, notably the two-photon
fluorescence technique of GIORDMAINE, RENTZEPIS, SHAPIRO and WECHT
[1967]. Picosecond pulses were also applied to scores of experiments
ranging from measurements of molecular relaxation times to laser fusion.
One particularly useful application of picosecond laser pulses, and one
that will be the object of this chapter, has been in the development of an
ultrafast shutter (or gate) based on the optical Kerr effect (also called AC
Kerr effect) by DUGUAY and HANSEN [1969a]. This effect, which had been
predicted by BUCKINGHAM [1956] well before the laser, and which was
first observed by MAYER and GIRES [1964], refers to the very short-lived
birefringence induced in transparent media by high-power laser pulses
propagating therein. The ultrafast optical Kerr shutter uses this optically
induced birefringence to gate light on and off on the picosecond time scale,
much in the way in which the conventional Kerr cell uses an electrically
induced birefringence to gate light on and off on the nanosecond time scale.
To date, gating times as short as 5 ps have been achieved when using
ultrashort pulses from neodymium:glass lasers. By comparison, the fastest
electrically driven Kerr cells have an open time of 0.5 ns. Moreover, the
recent development of subpicosecond pulses in CW mode-locked dye lasers
by SHANK and IPPEN [1974] has enabled these workers to achieve
gating times as short as 2 ps in a CS2 Kerr shutter (IPPEN and SHANK [1974]).
The ultrafast optical Kerr shutter has found a number of applications,
some of which will be described in this chapter. When used in conjunction
with an ordinary camera, the optical Kerr shutter has made possible the
stop-motion photography of light pulses in flight. Also, gated picture
ranging on the centimeter scale has been demonstrated, opening the
possibility of eventually seeing through biological tissue, such as the
human skin.
The Kerr shutter has been applied to the measurement of some interest-
ing molecular relaxation times. For example, by filling the shutter with
CS2, and then with nitrobenzene, DUGUAY and HANSEN [1969a] found
that it takes nitrobenzene molecules about 32 ps to relax from a state of
partial alignment to one of random orientation. In one of the shortest direct
time measurements to date, IPPEN and SHANK [1975] have recently measured
the analogous relaxation time in CS2, and found it to be 2 ps (at 25 °C).
Of more general interest has been the use of the ultrafast shutter in dis-
playing the time profile of a variety of optical signals. This has been done
by employing one of two methods. The first method, which is patterned
after the electronic sampling oscilloscope, consists in using the gate to cut
out a short time sample from the signal at a time progressively advancing
with each laser shot. By using a photomultiplier behind the gate, a very
high sensitivity has been achieved in sampling ultrashort optical signals.
With this method the fluorescence decay time of 1,1'-diethyl-2,2'-carbocyanine
iodide in methanol has been found to be 14 ± 3 ps (DUGUAY and
HANSEN [1969b]). ALFANO and SHAPIRO [1972] have used the same method
to measure ultrafast relaxation times in erythrosin and in a tetracene
crystal (ALFANO, SHAPIRO and POPE [1973]).
The second method features multichannel sampling of a single event
optical signal. For this, one takes advantage of the wide angular acceptance
(10-20 degrees) of the ultrafast Kerr shutter to cut out and record a multi-
plicity of samples from the one optical signal. The samples are distributed
uniformly from the leading to the lagging edge of the pulse, and their
envelope represents a display of the signal pulse shape. This approach has
been followed by TOPP, RENTZEPIS and JONES [1971], who have developed
the echelon technique for this purpose, and by DUGUAY and SAVAGE [1973]
and by VOGEL, SAVAGE and DUGUAY [1974], who have used an organ fiber
array with the shutter in order to build an optical sampling oscilloscope
(OSO). With the OSO, optical picosecond signals with peak powers on the
watt level can be displayed instantly and accurately on an oscilloscope.
Recently, MOUROU and MALLEY [1974] have used the shutter in a configuration
where 512 samples are taken in one laser shot, and they have
thus obtained very accurate displays of molecular fluorescence signals on
the picosecond time scale.
Another recent application of the ultrafast gate has been in the measurement
of ultrashort X-ray pulses emitted by laser-produced plasmas (DUGUAY
and OLSEN [1975]). For this the X-rays were converted to light in a plastic
scintillator, and the light was measured on the picosecond time scale with
the optical sampling oscilloscope technique.

§ 2. The Ultrafast Optical Kerr Shutter


The simplest, and what has been so far the most useful configuration
for the ultrafast Kerr gate (or shutter), is shown in Fig. 1.
A short cell containing carbon disulfide is placed between two crossed
polarizers. In the spectral range 5000-7500 Å sheet polarizers (like Polaroid
HN22) can be used. Elsewhere in the spectrum good extinction and sizeable
angular acceptance (about 15°) require the use of calcite (or mercurous chloride)
crystal polarizers. The CS2 cell is made of low-birefringence glass and is
held gently to avoid strain birefringence. The gating laser pulse is directed
into the CS2 cell at a small angle α from the axis AX, along which the
signal light to be gated is traveling. When working with Nd:glass picosecond
pulse lasers, the gating pulse will typically be 10 ps in duration
at a wavelength of 1.06 μm. In the absence of the gating pulse the crossed
polarizers shut out the signal light; the gate transmission in the OFF
state is typically 0.001 %.



Fig. 1. The ultrafast optical Kerr shutter in its simplest form. Polarizers P1 and P2 are crossed
and have their polarization axes at ±45° to the plane of polarization of the gating laser pulse.
Filter F greatly attenuates the infrared gating pulse to prevent possible damage to P2.
In some cases F can be dispensed with (see MOUROU and MALLEY [1974]).

The gating pulse (infrared pulse in Fig. 1) is plane polarized in the horizontal
plane, as is usually the case with practical lasers which have vertical
Brewster-faced components. Polarizers P1 and P2 have their axes at ±45° to
the horizontal. As the gating pulse propagates through the CS2 cell, it creates
a region of birefringence that travels along with it. That ultrashort portion
of the signal light that happens to enter the CS2 cell at the same time as
the gating pulse experiences the effect of this moving birefringent region,
and as a result undergoes a change in polarization that allows it to be
partially transmitted through polarizer P2. The birefringence induced in
CS2 by a laser pulse whose plane polarized electric field is E(t) in the medium
is given by:

$$\delta n_\parallel - \delta n_\perp = n_{2B} \int_{-\infty}^{t} \overline{E^2}(t')\, \exp[-(t-t')/\tau]\, dt'/\tau \tag{2.1}$$

Here δn∥ and δn⊥ refer to changes in the refractive index in directions
parallel and perpendicular, respectively, to the applied field E(t), n2B is
the AC Kerr effect coefficient, τ is the relaxation time of the AC Kerr effect
in the gate medium, and the bar over E² signifies a local time average over
one optical period.
In CS2 the bulk of the induced birefringence arises when molecules are
rotated so that their axis of easy polarizability moves closer to the plane
of polarization of the applied field E(t). The relaxation time associated
with this motion is about 1.8 ps in CS2 at 25 °C (BROIDA and SHAPIRO
[1967], IPPEN and SHANK [1975]). This means that it takes 1.8 ps after
the passage of a δ-function gating pulse in order for the molecules to
randomize again, thereby returning the ultrafast gate to its OFF state.
This time defines the minimum open time of a CS2 shutter. For pulses
much longer than τ = 1.8 ps, eq. (2.1) can be approximated by:

$$\delta n_\parallel - \delta n_\perp \simeq n_{2B}\, \overline{E^2}(t) \tag{2.2}$$

where the bar indicates a time average over about 2 ps.
The optical signal S(t) is polarized at 45° to the horizontal by P1. After
travelling with the gating field E(t) over a distance L in the Kerr medium,
the horizontal component S∥(t) of the signal undergoes a phase lag φ
relative to the vertical component S⊥(t) given by:

$$\varphi = 2\pi (\delta n_\parallel - \delta n_\perp) L/\lambda \tag{2.3}$$

where λ is the signal wavelength in vacuum. As in the conventional Kerr
cell, the transmission of the device is given by:

$$T = \tfrac{1}{2} T_1 T_2 \sin^2(\varphi/2) \tag{2.4}$$

where T1,2 is the transmission of polarizer P1,2 for light parallel to its
polarization axis. In writing down the factor of ½ we have assumed an unpolarized
signal. For a signal polarized along the polarization axis of P1,
½ is to be replaced by unity. Combining eqs. (2.2), (2.3) and (2.4) we get:

$$T(t) = \tfrac{1}{2} T_1 T_2 \sin^2\!\left(\pi n_{2B}\, \overline{E^2}(t)\, L/\lambda\right). \tag{2.5}$$

The term $\overline{E^2}(t)$ is related to the instantaneous power density P (in
W/m²) carried by the pulse through a medium of refractive index n by
the relation (we use SI units in the formulas throughout this chapter, except
where otherwise noted)

$$\overline{E^2}(t) = P(t)/(c\, n\, \varepsilon_0), \tag{2.6}$$

where ε0 = 8.85 × 10⁻¹² F/m and c = 3 × 10⁸ m/s. Substituting in eq.
(2.5) we get:

$$T(t) = \tfrac{1}{2} T_1 T_2 \sin^2\!\left[\pi n_{2B} L P(t)/(\lambda c n \varepsilon_0)\right]. \tag{2.7}$$

As a specific example let us consider the experiment of DUGUAY and
HANSEN [1969a], where infrared ultrashort pulses from a Nd:glass laser
are used to gate ultrashort green pulses derived from them by second
harmonic generation. The various numbers in eqs. (2.5) and (2.6) are:
½T1T2 = 0.14 (HN22 type sheet polarizers from Polaroid Corp.), L = 10⁻² m,
λ = 0.53 × 10⁻⁶ m, n = 1.60 and n2B = 2.2 × 10⁻²⁰ m²/V² as measured
for CS2 by MAYER and GIRES [1964] and by PAILLETTE [1969].
Using these numbers in eq. (2.7) we get

$$T(t) = 0.14 \sin^2(3.15 \times 10^{-13} P) \tag{2.8a}$$

or, more conveniently, expressing P in MW/cm², we have

$$T(t) = 0.14 \sin^2[3.15 \times 10^{-3} P(\mathrm{MW/cm^2})] \tag{2.8b}$$

which is plotted in Fig. 2.
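The numerical coefficient in eqs. (2.8a) and (2.8b) can be checked with a short script. The following is only a sketch: it assumes the values L = 1 cm, λ = 0.53 μm, n = 1.60 and n2B = 2.2 × 10⁻²⁰ m²/V², and the function names are illustrative. With these rounded inputs the coefficient comes out near 3.1 × 10⁻¹³ m²/W, within a few percent of the quoted 3.15 × 10⁻¹³, and the first transmission maximum falls close to 500 MW/cm².

```python
import math

# Values assumed from the worked example in the text (SI units).
N2B = 2.2e-20         # AC Kerr coefficient of CS2, m^2/V^2
L = 1.0e-2            # cell length, m
WAVELENGTH = 0.53e-6  # signal wavelength in vacuum, m
N = 1.60              # refractive index of CS2
C = 3.0e8             # speed of light, m/s
EPS0 = 8.85e-12       # permittivity of free space, F/m
HALF_T1T2 = 0.14      # (1/2) T1 T2 for HN22 sheet polarizers


def phase_coefficient():
    """Argument of sin^2 in eq. (2.7) per unit power density (per W/m^2)."""
    return math.pi * N2B * L / (WAVELENGTH * C * N * EPS0)


def transmission(p_mw_per_cm2):
    """Gate transmission of eq. (2.7) for a power density given in MW/cm^2."""
    p_si = p_mw_per_cm2 * 1.0e10  # 1 MW/cm^2 = 1e10 W/m^2
    return HALF_T1T2 * math.sin(phase_coefficient() * p_si) ** 2


def first_maximum_mw_per_cm2():
    """Power density at which the sin^2 argument reaches pi/2."""
    return (math.pi / 2.0) / phase_coefficient() / 1.0e10


if __name__ == "__main__":
    print("coefficient per W/m^2:", phase_coefficient())
    print("first maximum (MW/cm^2):", first_maximum_mw_per_cm2())
    for p in (100, 200, 500):
        print(p, "MW/cm^2 ->", transmission(p))
```

The small difference from the printed coefficient presumably reflects rounding of the input constants; it does not change the shape of the curve.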

As can be seen from Fig. 2 and eq. (2.7), when the gating power density
is less than 200 MW/cm², the transmission T(t) is approximately quadratic
in P(t):

$$T(t) \simeq \tfrac{1}{2} T_1 T_2 [\pi n_{2B} L P(t)/(\lambda c n \varepsilon_0)]^2. \tag{2.8c}$$

Under these conditions the gate open time is somewhat shorter than the
gating pulse. For example, if P(t) is a Gaussian with a full width at half
maximum (FWHM) of 14 ps, T(t) will also be Gaussian, but with an FWHM
equal to 14/√2 = 10 ps.
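The narrowing by a factor of √2 can be confirmed numerically. The sketch below samples a 14 ps Gaussian P(t), squares it to mimic the quadratic regime of eq. (2.8c), and measures both widths; the helper names and the sampling step are arbitrary choices made for illustration.

```python
import math


def gaussian(t, fwhm):
    """Unit-peak Gaussian with the given full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-t * t / (2.0 * sigma * sigma))


def fwhm_of(samples, dt):
    """FWHM of a single-peaked sampled curve, using linear interpolation."""
    half = max(samples) / 2.0
    above = [i for i, s in enumerate(samples) if s >= half]
    lo, hi = above[0], above[-1]
    # Interpolate the half-maximum crossing on each side of the peak.
    left = lo - (samples[lo] - half) / (samples[lo] - samples[lo - 1])
    right = hi + (samples[hi] - half) / (samples[hi] - samples[hi + 1])
    return (right - left) * dt


DT = 0.01  # sampling step in ps
times = [i * DT for i in range(-5000, 5001)]
power = [gaussian(t, 14.0) for t in times]  # gating pulse P(t)
gate = [p * p for p in power]               # T(t) proportional to P^2(t)

if __name__ == "__main__":
    print("FWHM of P(t):", fwhm_of(power, DT), "ps")
    print("FWHM of T(t):", fwhm_of(gate, DT), "ps")
```

The two printed widths should be close to 14 ps and 9.9 ps respectively.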


Fig. 2. The transmission T (in %) of a CS2 ultrafast Kerr shutter at a signal wavelength of 0.53 μm
is plotted versus the power density in the gating beam. For a shorter signal wavelength the
maximum would move towards smaller gating power densities. The CS2 shutter has a length
L = 1 cm and the polarizers are of type HN22 (made by Polaroid). With calcite crystal polarizers,
the peak transmission could reach 50 % for unpolarized signal light.

In the experiment of DUGUAY and HANSEN [1969a], second harmonic
green pulses are used to probe the gate transmission versus time by changing
the delay τ' by which the green pulse lags the infrared gating pulse at the
gate. The second harmonic pulse has a time profile which is closely approximated
by P²(t). The green signal G⁽⁴⁾(τ') which emerges from the gate is
therefore given by the convolution

$$G^{(4)}(\tau') = \int_{-\infty}^{\infty} T(t)\, P^2(t-\tau')\, dt. \tag{2.9}$$

At gating power densities below 200 MW/cm², T(t) ∝ P²(t) and we have

$$G^{(4)}(\tau') \propto \int_{-\infty}^{\infty} P^2(t)\, P^2(t-\tau')\, dt. \tag{2.10}$$

An experimental plot of G⁽⁴⁾(τ') is shown in Fig. 3. The shape of the
time profile of P(t) cannot be derived from G⁽⁴⁾(τ'). However, the approximate
overall duration of P(t) can be deduced from that of G⁽⁴⁾(τ'). Assuming
a Gaussian-shaped P(t), for example, G⁽⁴⁾(τ') is also Gaussian and has
the same width as P(t). From this it follows that in the CS2 gating experiment
of Fig. 3, the gate open time at half maximum was 17/√2 = 12 ps.
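The statement that the fourth-order correlation of a Gaussian pulse has the same width as the pulse itself can be verified by evaluating the integral of P²(t)P²(t − τ') directly. The sketch below uses a plain Riemann sum; the 17 ps pulse width is taken from the text, while the step sizes and function names are illustrative assumptions.

```python
import math


def gaussian(t, fwhm):
    """Unit-peak Gaussian pulse P(t) with the given FWHM."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-t * t / (2.0 * sigma * sigma))


def g4(delay, pulse_fwhm, dt=0.1, span=100.0):
    """Riemann-sum estimate of the integral of P^2(t) P^2(t - delay) dt."""
    steps = int(span / dt)
    total = 0.0
    for i in range(-steps, steps + 1):
        t = i * dt
        total += gaussian(t, pulse_fwhm) ** 2 * gaussian(t - delay, pulse_fwhm) ** 2
    return total * dt


def half_width_delay(pulse_fwhm, step=0.05):
    """Smallest positive delay at which the correlation drops to half its peak."""
    peak = g4(0.0, pulse_fwhm)
    delay = 0.0
    while g4(delay, pulse_fwhm) > peak / 2.0:
        delay += step
    return delay


if __name__ == "__main__":
    pulse_fwhm = 17.0  # ps
    print("FWHM of correlation:", 2.0 * half_width_delay(pulse_fwhm), "ps")
    print("gate open time:", pulse_fwhm / math.sqrt(2.0), "ps")
```

The correlation width should come out at about 17 ps, equal to the assumed pulse width, while the gate itself is open for about 12 ps.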

Fig. 3. The solid line represents the transmitted green light signal G⁽⁴⁾(τ') as a function of the
delay τ' (in ps) between the powerful gating 1.06 μm pulse and the weak probing second harmonic
(0.53 μm) pulse. The width of G⁽⁴⁾(τ') is 15 ps and is approximately equal to the duration (at half
height) of the gating pulse. With gating pulses about 150 MW/cm² in peak power density,
the shutter transmitted about 4 % of the green pulses at τ' = 0. When the shutter is filled with
nitrobenzene, the dotted curve is obtained. The exponentially decaying part of the curve
gives the relaxation time (32 ± 4 ps) of the orientational Kerr effect in nitrobenzene (at 23 °C).


Also shown in Fig. 3 is a plot of G(τ') when nitrobenzene is used as the
optical gate medium. Nitrobenzene molecules are bigger than CS2 molecules,
and once aligned along a certain direction by the AC Kerr effect, it
takes them longer to relax to a random orientation state after the passage
of the laser pulse. This is evident in Fig. 3 where the gate transmission is
seen to decay exponentially with a time constant of 16 ± 2 ps after the
gating pulse has left the gate. Since the gate transmission is proportional
to the square of the induced birefringence, the relaxation time associated
with the latter is 32 ± 4 ps in nitrobenzene at 25 °C. This relatively long
relaxation time supports the view that the mechanism responsible for the
AC Kerr effect in nitrobenzene is indeed orientational in nature, in agreement
with the light scattering studies of STARUNOV, and FABELINSKII
[1966] (see also FABELINSKII [1968]).


Recently SHANK and IPPEN [1974] have developed a continuously
pumped mode-locked dye laser that produces pulses as short as 0.7 ps with
a peak power of the order of one kilowatt. By focussing these pulses down
to a waist of about 20 μm they have been able to drive a CS2 Kerr shutter
just about as fast as it can go. Their experimental arrangement is shown in
Fig. 4. The opening of the shutter is probed by weaker (5 % of the incident
power) pulses split off from the input beam (wavelength: 0.60 μm). This
time the pulses are so short that the more exact expression (2.1) for δn∥ − δn⊥
must be used in eqs. (2.3) and (2.4). The result is that the transmission T(t)
is given by:

$$T(t) = \tfrac{1}{2} T_1 T_2 \sin^2[\varphi(t)/2] \tag{2.11a}$$

$$\varphi(t) = (2\pi n_{2B} L/\lambda) \int_{-\infty}^{t} \overline{E^2}(t')\, \exp[-(t-t')/\tau]\, dt'/\tau. \tag{2.11b}$$

Fig. 4. Gating with subpicosecond pulses has been achieved by IPPEN and SHANK [1974] with
this set-up. A continuously pumped dye laser produces a day-long train of pulses at λ = 0.60 μm.
The major part (95 %) of each pulse gates on the Kerr shutter, while the minor part (5 %) probes
the shutter transmission. The availability of day-long trains of pulses allows the use of highly
sensitive lock-in detection techniques.


As one varies the delay τ' by which the probe pulse lags the gating pulse,
the transmitted signal G(τ') varies like:

$$G(\tau') \propto \int_{-\infty}^{\infty} T(t)\, P(t-\tau')\, dt. \tag{2.12}$$

In the limit where the subpicosecond pulse P(t) becomes a δ-function
centered at time t = 0, and for small enough φ such that sin φ ≈ φ, these
equations simplify to:

$$\varphi(t) \propto (\pi L/\lambda) \exp(-t/\tau) \tag{2.13a}$$

$$T(t) \propto \tfrac{1}{2} T_1 T_2 (\pi L/\lambda)^2 \exp(-2t/\tau) \tag{2.13b}$$

$$G(\tau') \propto T(\tau') \propto \exp(-2\tau'/\tau). \tag{2.13c}$$

In this limit the transmitted signal therefore reproduces a decaying
exponential characteristic of the rotational relaxation time τ of CS2. The
curve obtained by IPPEN and SHANK [1975] is shown in Fig. 5. An asymmetry
with a decaying tail towards large positive τ' is clearly detectable.
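The factor of two in the exponent of eqs. (2.13b) and (2.13c) can be illustrated numerically. The sketch below integrates the differential form of eq. (2.11b), dφ/dt = (drive − φ)/τ, for a short Gaussian gating pulse and then fits the decay of sin²(φ/2) in the tail. The pulse width (0.3 ps), its amplitude (arbitrary units chosen so that φ stays small) and the forward-Euler step are assumptions made only for this illustration.

```python
import math

TAU = 1.8       # ps, orientational relaxation time quoted for CS2
DT = 0.001      # ps, integration step
T_START = -2.0  # ps, start of the simulated time window


def pump(t, fwhm=0.3):
    """Short Gaussian gating pulse in arbitrary units (assumed shape)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-t * t / (2.0 * sigma * sigma))


def induced_phase(n_steps):
    """Forward-Euler solution of d(phi)/dt = (pump(t) - phi) / TAU."""
    phi, trace = 0.0, []
    for i in range(n_steps):
        t = T_START + i * DT
        phi += DT * (pump(t) - phi) / TAU
        trace.append(phi)
    return trace


def transmission(phi):
    """Relative gate transmission sin^2(phi/2), as in eq. (2.11a)."""
    return math.sin(phi / 2.0) ** 2


def decay_constant(trace, t1, t2):
    """Exponential time constant of the transmission between t1 and t2 (ps)."""
    i1 = round((t1 - T_START) / DT)
    i2 = round((t2 - T_START) / DT)
    ratio = transmission(trace[i2]) / transmission(trace[i1])
    return -(t2 - t1) / math.log(ratio)


trace = induced_phase(10000)  # covers -2 ps to +8 ps
tail = decay_constant(trace, 2.0, 3.0)

if __name__ == "__main__":
    print("transmission decay constant:", tail, "ps; tau/2 =", TAU / 2.0, "ps")
```

Once the gating pulse has passed, the transmission decays with a time constant close to τ/2, which is why the measured decay must be doubled to obtain the relaxation time of the birefringence.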

Fig. 5. Signal G(τ') transmitted by the CS2 optical Kerr shutter when driven and probed by
1.2 ps pulses from a mode-locked CW dye laser (IPPEN and SHANK [1975]). The ordinate is
linear and in arbitrary units. The shutter transmission at τ' = 0 is of the order of 0.1 %.

The peak power density at the waist is 10⁸ W/cm² and the beam waist
extends over about 1.0 mm. The product PL therefore falls somewhat
short of what is required for efficient gating (see Fig. 2), explaining the low
peak transmission (≈ 0.1 %) observed.
In order to get more signal and to bring out the effect of the relaxation
time τ more clearly, IPPEN and SHANK [1975] have also measured the
transmission T(t) when the shutter is biassed half-way to the first maximum.
This is achieved by introducing a quarter-wave plate between the crossed
polarizers. For this case eq. (2.11a) becomes

$$T(t) = \tfrac{1}{2} T_1 T_2 \sin^2[\pi/4 + \varphi(t)/2] \tag{2.14}$$

where φ(t) designates as before the laser induced phase lag. For small φ,
eq. (2.14) approximates to

$$T(t) = \tfrac{1}{4} T_1 T_2 (1 + \varphi(t)), \qquad \varphi(t) \ll 1. \tag{2.15}$$

The expression for the transmitted signal G(τ') becomes

$$G(\tau') \propto \int_{-\infty}^{\infty} (1 + \varphi(t))\, \overline{E^2}(t-\tau')\, dt. \tag{2.16}$$

For pulse durations well under one picosecond, G(τ') simplifies in this case
to

$$G(\tau') \propto \mathrm{const.} + \exp(-\tau'/\tau). \tag{2.17}$$

The curve measured by IPPEN and SHANK [1975], shown in Fig. 6, portrays
a steep rise and an exponential decay indicating a time constant
τ = 2.1 ± 0.3 ps, one of the shortest lifetimes to be measured directly by
optical techniques.
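The benefit of biassing can be made concrete by comparing the exact biased transmission with its linearized form and with the unbiased gate. In the sketch below the prefactor ½T1T2 is set to 0.5 (ideal polarizers), which is an assumption made purely for illustration, as are the function names.

```python
import math


def biased_transmission(phi, half_t1t2=0.5):
    """Exact transmission with a quarter-wave bias: (1/2)T1T2 sin^2(pi/4 + phi/2)."""
    return half_t1t2 * math.sin(math.pi / 4.0 + phi / 2.0) ** 2


def linearized_transmission(phi, half_t1t2=0.5):
    """Small-phi approximation: (1/4) T1 T2 (1 + phi)."""
    return 0.5 * half_t1t2 * (1.0 + phi)


def unbiased_transmission(phi, half_t1t2=0.5):
    """Unbiased gate for comparison: (1/2)T1T2 sin^2(phi/2)."""
    return half_t1t2 * math.sin(phi / 2.0) ** 2


def signal_gain(phi):
    """Ratio of the phi-induced change in the biased gate to the unbiased signal."""
    change = biased_transmission(phi) - biased_transmission(0.0)
    return change / unbiased_transmission(phi)


if __name__ == "__main__":
    phi = 0.05  # rad, a small induced phase lag
    print("exact biased:", biased_transmission(phi))
    print("linearized  :", linearized_transmission(phi))
    print("signal gain over unbiased gate:", signal_gain(phi))
```

For φ = 0.05 rad the induced change in transmission is roughly 2/φ = 40 times larger than the unbiased signal, at the price of a constant background. Because the biased signal is linear in φ, it decays with the time constant τ rather than τ/2.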
Forthcoming increases in the peak power of subpicosecond pulses
promise to render possible the study of light induced refractive index
changes in a great variety of liquid and solid systems and on a very fine
timescale. These studies should provide a most valuable complement to
light scattering techniques (FABELINSKII [1968]) in the study of ultrafast
molecular dynamics.


Fig. 6. Same as Fig. 5, except that this time the shutter is biassed half-way by inserting a
quarter-wave plate between the two crossed polarizers.


In order to achieve ultimately the shortest possible gating times, it will
be necessary to use the AC Kerr effect of the type found in glasses, where
the relaxation time is expected to be exceedingly short. Recent work by
OWYOUNG, HELLWARTH and GEORGE [1972] has shown that the optically
induced birefringence in glasses results from a distortion of the atomic
electronic clouds, and therefore probably has a response time on the order
of one atomic orbital period. With subpicosecond laser
pulses now becoming available (SHANK and IPPEN [1974]), optical gates
based on glasses promise to achieve subpicosecond open times.
Gating experiments were carried out by DUGUAY and HANSEN [1970]
in fused quartz and in two types of Schott glass: BK-7 and LaSF-7. The
curve for the gated green signal G⁽⁴⁾(τ') is shown in Fig. 7 for a sample of
BK-7 10 cm in length.

Fig. 7. This curve plots the transmitted signal for a glass ultrafast shutter driven by 1.06 μm
pulses from a mode-locked Nd:glass laser. The probe pulses (λ = 0.53 μm) are harmonically
derived from the infrared ones. The width of the curve for G⁽⁴⁾(τ') at half-height is
approximately equal to the duration of the infrared pulses. Although harder to gate on, glass
Kerr shutters would ultimately allow gating on the femtosecond (10⁻¹⁵ s) time scale.

The curve for G⁽⁴⁾(τ') obtained with LaSF-7 is consistent with a very
fast response. Any relaxation time, if present, is less than 5 ps. One disadvantage
of glass optical gates is the high power density required to get
the same transmission as with a CS2 gate. With LaSF-7 the first transmission
maximum occurs at P = 15 GW/cm² for a one-centimeter thick sample.
These experiments have served to measure the nonlinear index of
birefringence n2B of glasses. By using the value n2B(CS2) = 2.2 × 10⁻²⁰
m²/V² as a reference, the values n2B(LaSF-7) = (6 ± 2) × 10⁻²² m²/V²,
n2B(BK-7) = (2 ± 1) × 10⁻²² m²/V² and n2B(fused silica) = (1.5 ± 0.5) ×
10⁻²² m²/V² were obtained (DUGUAY and HANSEN [1970]).


In the geometry shown in Fig. 1 two other factors limit the resolution
for pulses less than 5 ps in duration. One is the group velocity mismatch
between the gating and the gated beams. In CS2 a green light pulse will fall
behind an infrared (λ = 1.06 μm) pulse by 2.0 ps after 1 cm of travel. In LaSF-7
glass that number would be 1.8 ps. A tenfold reduction in the gate thickness
down to one millimeter would reduce this differential delay to 0.18 ps in
LaSF-7. However, the power density required to reach the first transmission
maximum would then be increased to 150 GW/cm2, a relatively large
value, but one that is well within the state of the art.
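The trade-off described above follows from two simple scalings: the differential delay grows linearly with the gate thickness, while the power density needed for a given phase lag varies inversely with it, since the phase depends on the product PL. A minimal sketch, using the LaSF-7 figures quoted in the text (1.8 ps per cm and 15 GW/cm² for 1 cm) and illustrative function names:

```python
def walkoff_delay_ps(thickness_cm, delay_per_cm_ps):
    """Differential delay between green and infrared pulses across the gate."""
    return thickness_cm * delay_per_cm_ps


def first_maximum_power(thickness_cm, power_at_1cm):
    """Power density for the first transmission maximum, scaling as 1/L."""
    return power_at_1cm / thickness_cm


if __name__ == "__main__":
    # LaSF-7 values quoted in the text: 1.8 ps/cm and 15 GW/cm^2 at 1 cm.
    for thickness in (1.0, 0.5, 0.1):
        delay = walkoff_delay_ps(thickness, 1.8)
        power = first_maximum_power(thickness, 15.0)
        print(thickness, "cm:", delay, "ps walk-off,", power, "GW/cm^2")
```

For a 1 mm gate this reproduces the 0.18 ps delay and the 150 GW/cm² power density given in the text.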


Another factor lengthening the time response of the optical gate is the
spread in arrival times at the gate for different parts of the infrared gating
beam, due to the finite angle α between gated and gating beams. As can
be seen in Fig. 8, this spread amounts to (d tan α)/c, where d is the gating
beam diameter. In the experiment of DUGUAY and HANSEN [1969b], the
numbers were d = 0.5 cm, α = 0.08, so that (d tan α)/c was 1.4 ps.

Fig. 8. Illustration of the time smearing factor (d tan α)/c introduced by the small angle α
between gating and gated beams. In the figure the gating and gated pulses are assumed to be
nearly delta-functions and are represented by the solid and crossed areas. When the bottom
parts of the two beams enter the gate in perfect synchronism, the upper parts are (d tan α)/c
apart. In the experiment of DUGUAY and HANSEN [1969a], this factor amounted to 2.0 ps.

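The geometrical smearing term is easy to evaluate. The sketch below assumes that the quoted angle of 0.08 is expressed in radians (the text gives no unit) and uses the vacuum speed of light; the function name is illustrative.

```python
import math

C_CM_PER_PS = 0.03  # speed of light in vacuum: 3e10 cm/s = 0.03 cm/ps


def time_smear_ps(beam_diameter_cm, angle_rad):
    """Arrival-time spread (d tan(alpha))/c across the gating beam, in ps."""
    return beam_diameter_cm * math.tan(angle_rad) / C_CM_PER_PS


if __name__ == "__main__":
    print("smearing:", time_smear_ps(0.5, 0.08), "ps")
```

With d = 0.5 cm and α = 0.08 rad the result is a little over 1.3 ps, consistent with the rounded value of 1.4 ps. The smearing vanishes as α goes to zero, which is the motivation for a collinear geometry.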
This geometrical time-smearing effect can be completely eliminated by
going over to a collinear geometry for the gated and gating beams. In the
arrangement of MOUROU and MALLEY [1974], shown in Fig. 9(a), this
was achieved by letting the gating beam go through the first polarizer P1
of type Polaroid HN22, which is almost completely transparent at 1.06 μm.

Fig. 9. One way to avoid the time smearing factor of Fig. 8 is to employ a geometry where the
gating and gated beams are collinear. (a) In the geometry of MOUROU and MALLEY [1974]
the gating beam at 1.06 μm is sent through polarizer P1 (type Polaroid HN22) where it suffers
negligible absorption, and therefore undergoes very little change in polarization (i.e., it
stays plane polarized at 45° from the axis of P1). The signal beam is coupled in by means of a
dichroic dielectric mirror. (b) In the set-up used by YU and ALFANO [1974] a dichroic mirror
is used to couple the gating beam into the shutter.

Another geometry, used by YU and ALFANO [1974], shown in Fig. 9(b),
makes use of dichroic mirror M to reflect the gating pulse into the gate,
and let the signal through. In this case, for best extinction of the crossed
polarizers, angle θ should be kept to a minimum, and the polarization axis
of P1 should be either in the plane of Fig. 9 or normal to it. Correspondingly,
the infrared pulse should be plane polarized at 45° to the horizontal plane.


Self-focusing does not normally occur in a properly designed ultrafast
Kerr gate. Nevertheless, it is worth looking at the conditions under which
it might arise. Self-focusing, which was predicted by KELLEY [1965], following
the self-trapping work of CHIAO, GARMIRE and TOWNES [1964], refers
to the phenomenon whereby a powerful laser beam induces in a medium
a refractive index change that tends to focus parts of the beam to a number
of points. The change in refractive index seen by a plane polarized laser
field E(t) is given by

$$\delta n_\parallel = n_2\, \overline{E^2}(t) \tag{2.18}$$

where n2 is the nonlinear index of the medium (for CS2, n2 = ⅔ n2B =
1.5 × 10⁻²⁰ m²/V² at 1.06 μm).
In the case of the spatially inhomogeneous and high power laser beams
used in practice for gating, the self-focussing theory of SUYDAM [1973]
(see also CAMPILLO, SHAPIRO and SUYDAM [1973, 1974]) is applicable.
Both theory and experiment find that parts of a laser beam will self-focus
after a minimum distance z_f given by


where δ describes the radial inhomogeneity of the beam and is equal to
δE0²/E0², E0 being the peak value of the optical field. With Nd:glass
lasers δ values in the range 0.03-0.3 can be expected. Assuming a gating
power density of 500 MW/cm² (for which $n_2 \overline{E^2}$ = 1.7 × 10⁻⁵), and substituting
in eq. (2.19), we find z_f ranges from 1.5 to 3.0 cm for gating at 1.06 μm.
In the one-centimeter length of a typical ultrafast shutter, there is little
danger therefore of seeing any part of the gating beam self-focus.
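The index change quoted for a 500 MW/cm² gating beam can be reproduced from the relation between mean-square field and power density, E² = P/(c n ε0), used earlier in this section. The sketch below takes n2 = 1.5 × 10⁻²⁰ m²/V² for CS2 (two-thirds of the n2B value of 2.2 × 10⁻²⁰ m²/V²; the exponent is an assumption) and n = 1.60. It does not attempt to evaluate the self-focusing distance itself.

```python
C = 3.0e8         # speed of light, m/s
EPS0 = 8.85e-12   # permittivity of free space, F/m
N_CS2 = 1.60      # refractive index of CS2
N2_CS2 = 1.5e-20  # nonlinear index of CS2 in m^2/V^2 (assumed exponent)


def mean_square_field(power_mw_per_cm2, n=N_CS2):
    """Mean-square optical field in V^2/m^2 from P / (c n eps0)."""
    power_si = power_mw_per_cm2 * 1.0e10  # 1 MW/cm^2 = 1e10 W/m^2
    return power_si / (C * n * EPS0)


def index_change(power_mw_per_cm2, n2=N2_CS2):
    """Nonlinear index change n2 * <E^2> seen by the gating beam."""
    return n2 * mean_square_field(power_mw_per_cm2)


if __name__ == "__main__":
    print("n2 <E^2> at 500 MW/cm^2:", index_change(500.0))
```

The result, about 1.8 × 10⁻⁵, agrees with the quoted 1.7 × 10⁻⁵ to within the rounding of the constants.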


In their study of self-focused filaments in CS2, SHIMIZU and STOICHEFF
[1969] used the geometry shown in Fig. 10, which was later recognized by
MALLEY and RENTZEPIS [1970] as a useful one for the ultrafast gating of
optical signals. In the experiment of SHIMIZU and STOICHEFF [1969], the
birefringence induced by the 1.06 μm ultrashort pulse was probed by photographing
the green light (second harmonic) transmitted through the
crossed polarizers. This way both the temporal and the transverse spatial
profiles of the 1.06 μm beam in CS2 were recorded. MALLEY and RENTZEPIS
[1970] used the same geometry to display the time profile of a 0.63 μm pulse
generated by a rhodamine 6G laser pumped by the second harmonic of
the 1.06 μm pulse (Nd:glass laser). As the 1.06 μm gating pulse propagates
from A to B in Fig. 10, the Kerr gate first opens at A and then later on at B.
This way the time profile of the pulse incident from the top appears as a
spatial intensity profile on the film.

Fig. 10. Set-up used by SHIMIZU and STOICHEFF [1969] to study self-focused filaments in CS2,
and as a method of displaying the 1.06 μm pulse. When used as a Kerr gate, the transverse
geometry provides an open slit moving from A to B at the speed of light in CS2. In photographing
the transmitted signal light, axis AB becomes a time axis and one can use this set-up
to display the signal (MALLEY and RENTZEPIS [1970]). Self-focusing of the gating beam can be
a problem in a transverse geometry Kerr gate.
The transverse geometry has also been used by FISCHER and ROSSMANITH
[1973] in studies of synchrotron radiation, and by RICHARDSON and SALA
[1973] in their ultrafast framing photography of laser produced plasmas,
which we will return to later. TOPP, RENTZEPIS and JONES [1971] have
combined a transverse Kerr shutter with a spectrograph in order to time
resolve the spectrum of an optically pumped rhodamine 6G laser.
Self-focusing is a problem that is more severe in a Kerr gate of transverse
geometry. Because the minimum opening time at a given z is limited by
the transit time nd/c, it is necessary to keep d as small as possible. For 5 ps
resolution, for example, one must have d = 1.0 mm. In order to get the
same phase retardation δφ as one had in Fig. 1 over L = 1 cm, it is necessary
to increase the gating power density tenfold to 5 GW/cm². As a result of
both the smaller diameter and the increased power density, the gating
beam can undergo self-focusing within the long CS2 cell dimension (≈ 1 cm)
and break up in filaments as in the experiment of SHIMIZU and STOICHEFF
[1969]. This leads to severe inhomogeneities in the gate transmission.
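The two numbers in this argument, the transit-time limit nd/c and the tenfold increase in gating power, can be checked as follows. The refractive index n = 1.60 for CS2 and the reference point of 500 MW/cm² over a 1 cm path are taken from earlier in this section; the function names are illustrative.

```python
C_MM_PER_PS = 0.3  # speed of light in vacuum: 3e8 m/s = 0.3 mm/ps


def transit_time_ps(beam_width_mm, n=1.60):
    """Time n*d/c for signal light to cross a gating beam of width d."""
    return n * beam_width_mm / C_MM_PER_PS


def required_power(path_mm, reference_power=500.0, reference_path_mm=10.0):
    """Gating power density (MW/cm^2) giving the same phase lag over a shorter path."""
    return reference_power * reference_path_mm / path_mm


if __name__ == "__main__":
    print("transit time for d = 1 mm:", transit_time_ps(1.0), "ps")
    print("power for 1 mm path:", required_power(1.0), "MW/cm^2")
```

A 1 mm wide gating beam gives a transit time of about 5.3 ps, and keeping the same phase retardation over 1 mm instead of 1 cm raises the required power density from 500 MW/cm² to 5 GW/cm².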

§ 3. Ultrahigh Speed Photography


A unique application of the ultrafast shutter has been in the photography
of light in flight (see DUGUAY and HANSEN [1970], DUGUAY and MATTICK
[1971] and DUGUAY [1971]).

Fig. 11. Experimental set-up used for the ultrahigh-speed photography of light pulses in flight.
Ultrashort infrared (1.06 μm) pulses from a Nd:glass laser open the ultrafast Kerr shutter for
about 10 ps. The green pulses are harmonically derived from the infrared pulses. As they pass
through a cell of milky water, light scattering makes them brightly visible from the side, so that
they can be stop-motion photographed in flight (see Fig. 12). Filter F is made of infrared
absorbing glass that is transparent to the visible; it attenuates the 1.06 μm pulse by a factor
of 1000.

The set-up that was used to accomplish this
is shown in Fig. 11. An ultrashort pulse of green light is directed into a cell
containing a colloidal suspension of milk particles in water. The milk
particles greatly increase the instantaneous light scattering that occurs in
pure water, thereby making the green pulse brightly visible from the side.
(A laser pulse propagating in vacuum would not be visible from the side.)
A 35 mm camera, whose mechanical shutter has been manually opened,
is placed behind a conventional ultrafast CS2 shutter. The latter is driven
by an infrared pulse (λ = 1.06 μm) produced by a mode-locked Nd:glass
laser. The green pulse is harmonically derived from the infrared pulse.
The lengths of the paths followed by the two pulses are adjusted in such
a way that when the shutter opens for about 10 ps it captures a picture of
the green pulse in midflight through the milky water cell, as shown in
Fig. 12. Thus the green light bullet is stop-motion photographed in
flight. The red round spot to the left of the cell results from the direct
head-on impact onto the film made by the infrared pulse incompletely
attenuated by filter F in Fig. 11. Thus Fig. 12 also provides a pictorial
representation of second harmonic generation from infrared (in red) to
green.

Fig. 12. An ultrashort pulse of green laser light is photographed in flight as it propagates
from right to left through a cell of milky water. The scale is in millimeters. The shutter open time
was about 10 ps. The red spot on the left side of the picture is the impression made on the high
speed Ektachrome film by the infrared laser pulse used to activate the ultrafast Kerr shutter
and incompletely attenuated by filter F in Fig. 11.

In principle the ultrafast photography of light in flight constitutes one
of the most direct ways of displaying ultrashort laser pulses. Pulses obtained
from a mode-locked Nd:glass laser stand out bright and well isolated,
the contrast between the brightness of the pulse and that of the background
being better than 200 to 1.
Even though the ultrashort pulse display shown in Fig. 12 is one of the
best obtained so far, it still is far from perfection because the shutter remains
open for about the same time (≈ 10 ps) as the green pulse duration. In 10 ps
light moves 2.2 mm in water, and as a consequence, the picture of the
green pulse is blurred out. One cannot recover the precise shape of the
green pulse. What would be needed in this case would be a 0.3 ps gating
pulse (perhaps from a mode-locked dye laser) and a glass Kerr shutter
because the green pulse is known to have a subpicosecond substructure
(SHAPIRO and DUGUAY [1969]). Nevertheless, this technique has proved
useful in the study of weaker satellite pulses accompanying the powerful
ultrashort pulses generated by mode-locked Nd:glass lasers (see DUGUAY
and MATTICK [1971]).
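The motion blur in such a photograph is simply the distance the pulse travels while the shutter is open. The sketch below assumes a refractive index of 1.33 for the milky water, a value not given in the text, and uses an illustrative function name.

```python
C_MM_PER_PS = 0.3  # speed of light in vacuum, mm/ps


def blur_length_mm(open_time_ps, refractive_index=1.33):
    """Distance travelled by a light pulse in the medium during the open time."""
    return C_MM_PER_PS * open_time_ps / refractive_index


if __name__ == "__main__":
    for open_time in (10.0, 2.0, 0.3):
        print(open_time, "ps ->", blur_length_mm(open_time), "mm")
```

A 10 ps open time corresponds to a blur of a little over 2.2 mm, whereas a 0.3 ps gate would reduce it to less than 0.1 mm, fine enough to begin resolving subpicosecond structure.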


The technique of gated picture ranging has been used on the nanosecond
time scale to improve the visibility of targets, such as airplanes and ships,
obscured by fog or other obstacles. A powerful nanosecond light pulse is
sent out from the observation point towards the target. At a preselected
later time an electronic image converter tube is gated on for a few nano-
seconds and only the echo image scattered back by the target is recorded.
Earlier (or later) echoes from the fog corresponding to closer (or farther)
ranges are not recorded because the picture tube is gated off at those times.
With picosecond and now even subpicosecond laser pulses, the same
technique can be applied on the centimeter and even millimeter scales by
using the ultrafast Kerr shutter to gate the echoes. In a feasibility experiment
using the set-up shown in Fig. 13(a), a second-harmonic green pulse was
sent through a piece of thin paper tissue (facial tissue) towards a target
carrying the stylized drawing of a bell (shown unobscured by the tissue in
the upper left corner of Fig. 13(b)). When photographed under room light
illumination (see Fig. 13(b), upper right corner), the target is completely
obscured by the tissue. When the green pulse is sent through the tissue and
the ultrafast shutter is turned on at the right time by the infrared pulse,
only the echo from the target is recorded by the camera. Two results are
shown in the lower part of Fig. 13(b).
Fig. 13. (a) Schematic description of set-up used for gated picture ranging through a piece
of paper (or "facial") tissue. An ultrashort green laser pulse first illuminates the tissue and then
33 ps (1 cm/c) later the target. The target echo, which carries the image information, lags the
tissue echo by 66 ps. (b) The top left picture shows the target under room lighting when the
tissue is removed. With the tissue back in place the target is completely invisible under room
lighting. The bottom two pictures show two examples of gated picture ranging through the
tissue. The target is visible, but the passage of the image-carrying echo through the tissue has
degraded the quality of the picture (the ultrafast shutter itself does not degrade the picture, see
DUGUAY and MATTICK [1971]).

In certain parts of the human body the skin is partially transmitting to
light, and veins, for example, can be seen. With the recent achievement of
0.5 ps laser pulses and 2 ps Kerr gating times, a spatial resolution of better
than 1 mm should be possible in gated picture ranging, and greatly improved
vision through the skin appears feasible in those parts of the body where the
skin is 1-2 mm thick. A clearer picture of veins and arteries under the skin
might help the diagnosis of certain diseases and injuries. The outcome of
such experiments remains to be seen.
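The timing figures quoted for gated picture ranging can be checked with a short Python sketch. It assumes propagation in air (refractive index 1) and uses the usual round-trip estimate, depth slice = c x gate time / 2, for the range resolution; that formula and the function names are assumptions for illustration, not taken from the text.

```python
# Echo timing and depth resolution for gated picture ranging.
C_MM_PER_PS = 0.299792458  # speed of light in vacuum, mm per picosecond


def one_way_delay_ps(distance_mm, refractive_index=1.0):
    """Time for light to cover distance_mm once."""
    return distance_mm * refractive_index / C_MM_PER_PS


def echo_lag_ps(separation_mm, refractive_index=1.0):
    """Extra round-trip time of the target echo relative to the tissue echo."""
    return 2.0 * one_way_delay_ps(separation_mm, refractive_index)


def depth_resolution_mm(gate_time_ps, refractive_index=1.0):
    """Depth slice selected by a gate of the given duration (round-trip estimate)."""
    return C_MM_PER_PS * gate_time_ps / (2.0 * refractive_index)


print(f"1 cm one way:   {one_way_delay_ps(10.0):.1f} ps")
print(f"echo lag:       {echo_lag_ps(10.0):.1f} ps")
print(f"2 ps gate:      {depth_resolution_mm(2.0):.2f} mm depth slice")
```

The first two results reproduce the 33 ps and 66 ps of Fig. 13, and a 2 ps gate selects a slice of roughly 0.3 mm, consistent with the claim that a resolution better than 1 mm should be possible.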

RICHARDSON and SALA [1973] have combined an ultrafast Kerr shutter
with an electronic streak camera in order to multiframe photograph a
laser-produced plasma (or laser spark). In their experiment a train of
about 30 pulses, spaced by 6.7 ns and produced by a mode-locked Nd:
glass laser, is focused in air, causing it to break down (to spark). Part
of the beam is used to drive an ultrafast Kerr shutter of transverse geometry.
Thus the shutter opens for about 10 ps every 6.7 ns. Light emitted by the
growing laser spark is focused into and through the shutter onto an elec-
tronic streak camera. In this camera the image is electronically swept
downward at a speed of up to 2.5 mm per nanosecond. Consequently, the
series of discrete images that emerge from the shutter at 6.7 ns intervals
are recorded at progressively lower positions on the film. As can be seen
in Fig. 14, the sequential images are well separated from one another and
one can follow the growth of the laser spark.
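A rough Python check of the framing geometry follows. It assumes the streak camera runs at the maximum sweep speed quoted (2.5 mm per nanosecond); the function names are illustrative.

```python
# Frame separation and in-frame smear for the Kerr-shutter / streak-camera combination.

def frame_spacing_mm(pulse_period_ns, sweep_speed_mm_per_ns):
    """Distance the image is swept between two successive shutter openings."""
    return pulse_period_ns * sweep_speed_mm_per_ns


def smear_mm(gate_time_ps, sweep_speed_mm_per_ns):
    """Distance the image is swept while the shutter is open."""
    return gate_time_ps * 1e-3 * sweep_speed_mm_per_ns


spacing = frame_spacing_mm(6.7, 2.5)   # pulses arrive every 6.7 ns
smear = smear_mm(10.0, 2.5)            # each frame lasts about 10 ps
print(f"frame spacing: {spacing:.2f} mm, smear within a frame: {smear:.3f} mm")
```

At this sweep speed successive frames land almost 17 mm apart while each frame is smeared by only a few hundredths of a millimetre, which is why the images in Fig. 14 are well separated and sharp.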

Fig. 14. Picosecond framing photography of a laser-produced spark in air achieved by
combining an electronic streak camera with an ultrafast optical Kerr shutter. The latter is
opened for 10 ps every 6.7 ns. In the 6.7 ns interval during which the Kerr shutter is closed,
the position of the image is electronically swept down in the streak camera tube, so that the
next 10 ps frame transmitted by the Kerr shutter is recorded well below the preceding image.
(a) Five frames in the initial stage of optical breakdown in air. (b) Five frames about 200 ns
after breakdown. RICHARDSON and SALA [1973].

In electronic high speed framing cameras, the framing time is limited
to about 300 ps. Since the Kerr shutter is capable of gating times down to
2 ps, the technique of RICHARDSON and SALA [1973] clearly represents a
major improvement in ultrafast framing photography.

§ 4. Sampling Optical Signals


The sampling technique used in the electronic sampling oscilloscope was
developed by JANSSEN [1950]. The beauty of this technique resides in the
fact that both the highest sensitivity and the fastest time response can be
achieved simultaneously in displaying weak electronic signals. The key
elements needed in electronic sampling are an ultrafast gate (the shortest
gating times are 20 ps at present) and an amplifier to amplify the sample
cut out at a given time from the signal.
An ultrafast optical Kerr shutter (or gate) and a photomultiplier con-
stitute the analogous key elements in applying the sampling technique to
optical signals. Since a Kerr shutter transmission of 50% can in principle
be achieved when using crystal (e.g., calcite) polarizers, and since photo-
multipliers have quantum efficiencies as high as 30%, this optical sampler
is close to the theoretical limit of sensitivity.
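The sensitivity argument amounts to multiplying two efficiencies, as the following Python sketch shows. The photon number used in the example is arbitrary and the function names are assumptions for illustration.

```python
# Overall detection efficiency of the optical sampler: shutter transmission
# multiplied by photomultiplier quantum efficiency.

def detection_efficiency(shutter_transmission, quantum_efficiency):
    """Probability that a photon arriving during the gate yields a photoelectron."""
    return shutter_transmission * quantum_efficiency


def mean_photoelectrons(photons_in_sample, shutter_transmission, quantum_efficiency):
    """Average number of photoelectrons produced by one gated sample."""
    return photons_in_sample * detection_efficiency(shutter_transmission, quantum_efficiency)


efficiency = detection_efficiency(0.5, 0.30)  # values quoted in the text
print(f"overall efficiency: {efficiency:.2f}")
print(f"photoelectrons from a 100-photon sample: {mean_photoelectrons(100, 0.5, 0.30):.0f}")
```

With the quoted values about 15% of the photons reaching the gate during its open time are detected.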
Fig. 15. Picosecond fluorescence decay times were first measured by using an ultrafast shutter
together with a photomultiplier in order to do point by point sampling of optical signals
(DUGUAY and HANSEN [1969]). The fluorescent dye is excited by green pulses about 10 ps in
duration and emits a fluorescent signal at λ = 0.75 µm. On a given laser shot, a 10 ps sample is
sliced from the incoherent fluorescence signal and is detected by the photomultiplier. As the
delay is varied from shot to shot, the entire signal can be sampled as a function of time.

The optical sampling arrangement used by DUGUAY and HANSEN [1969b]
is shown in Fig. 15. The CS2 gate is driven, as before, by 1.06 µm pulses
about 10 ps in duration generated by a Nd:glass laser. The 0.53 µm second
harmonic pulses are sent into a cell containing a cyanine dye dissolved in
methanol or acetone; the dye is 1,1'-diethyl-2,2'-carbocyanine iodide (commonly
known as DDI). Some of the spontaneous fluorescence light emitted
by the DDI is collected by a lens and directed into the ultrafast Kerr shutter.
At each firing of the laser, a sample (~7 ps wide) is cut out from the fluorescence
signal at a given point on the waveform. With each laser firing the
relative delay between the gating and the excitation pulses is changed, so
that after many shots the whole signal has been sampled and displayed.
An example of a fluorescence waveform display obtained by this method
is shown as the full line in Fig. 16.

[Fig. 16 plot; horizontal axis: time t (ps), from -40 to 100.]

Fig. 16. The solid line shows the fluorescence signal emitted by the dye DDI sampled as a
function of time. Time is measured relative to the arrival time of the green pulse which excites
the fluorescence. The dotted line represents the prompt response of the measuring system.
It is obtained by removing the dye and by sampling the green pulse itself (it is the same as
$G^{(4)}(\tau)$ in the text). A deconvolution of the prompt curve from the fluorescence curve gives
a decay time of 14 ± 3 ps for DDI dissolved in methanol or acetone.
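The relation between the prompt curve and the fluorescence curve can be illustrated with a forward model: the measured signal is the prompt response convolved with an exponential decay. The Python sketch below is not the deconvolution procedure used for Fig. 16. It assumes a Gaussian prompt curve of 10 ps full width at half maximum and a 14 ps decay time, and all names are illustrative.

```python
import math


def gaussian_prompt(t_ps, fwhm_ps):
    """Assumed Gaussian prompt response with unit peak height."""
    sigma = fwhm_ps / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * (t_ps / sigma) ** 2)


def convolve_with_decay(prompt, step_ps, decay_time_ps):
    """Discrete causal convolution of the sampled prompt curve with exp(-t/tau)."""
    signal = []
    for i in range(len(prompt)):
        total = 0.0
        for j in range(i + 1):
            total += prompt[j] * math.exp(-(i - j) * step_ps / decay_time_ps)
        signal.append(total * step_ps)
    return signal


STEP_PS = 0.5
times = [-40.0 + k * STEP_PS for k in range(281)]   # -40 ps to 100 ps, as in Fig. 16
prompt = [gaussian_prompt(t, 10.0) for t in times]
signal = convolve_with_decay(prompt, STEP_PS, 14.0)

peak_index = max(range(len(signal)), key=lambda k: signal[k])
print(f"modelled fluorescence peaks at t = {times[peak_index]:.1f} ps")
```

In this model the fluorescence peak lags the prompt peak by a few picoseconds and the late tail falls by a factor e every 14 ps. Fitting such a model to the sampled data, with the decay time as the free parameter, is one common way to carry out the deconvolution in practice.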

By using a similar arrangement, ALFANO, SHAPIRO and POPE [1973]
have measured a 145 ps fluorescence decay time in a tetracene crystal at
room temperature. In a more recent experiment using a collinear gating
arrangement, Yu, Ho, ALFANO and SEIBERT [1975] have measured a 60 ps
lifetime in photosystem I of spinach.
One problem that has plagued the users of this shot by shot sampling
technique is the time consumed in displaying one optical signal. Nd:glass
lasers typically fire once a minute, and because of pulse height and duration
fluctuations, several shots must be taken for each delay value in order to
obtain a statistically significant display. For this reason workers in the
field have sought to develop multichannel sampling schemes, whereby all
samples are taken in one laser shot. This will be the topic of the following
three sections.


In the echelon technique developed by TOPP, RENTZEPIS and JONES
[1971] (see Fig. 17), the signal beam is reflected from an echelon reflector.



Fig. 17. The echelon technique developed by TOPP, RENTZEPIS and JONES [1971] makes use of an
echelon reflector to divide the signal beam into a number of segments spaced apart in time
by typically 3 ps. One ultrashort sample is cut out by the ultrafast shutter from each segment.
The samples are recorded on photographic film (as shown above) or by a linear array of
photodiodes.

The echelon step is chosen so that the various segments of the reflected
signal are progressively delayed at intervals of 4 ps, for example, as they
enter the shutter. When the shutter opens for a time dictated by the infrared
gating pulse, one ultrashort sample is cut out from each segment of the
signal beam. These samples are recorded photographically and, by
photodensitometry, a plot of segment height vs. number (Fig. 18) gives the time
profile of the signal. Recently this method has been improved by NETZEL,
RENTZEPIS and LEIGH [1973] by using a linear array of photodiodes instead
of film to record the signal segments.
TOPP, RENTZEPIS and JONES [1971b] have also combined the echelon-shutter
technique with a spectrograph to time resolve the spectrum of the
stimulated emission from a rhodamine 6G laser. Interesting data on
the photobleaching of rhodopsin have been recently obtained by using
these techniques (see NETZEL, RENTZEPIS and LEIGH [1973]).

Fig. 18. Experimental result obtained with the echelon technique. In this example the signal
beam is the second harmonic of the 1.06 µm gating beam. The various segments of the green
pulse probe the opening of the ultrafast shutter.


Oscilloscope displays of picosecond optical signals have been obtained
in experiments done by DUGUAY and SAVAGE [1973] and by VOGEL,
SAVAGE and DUGUAY [1974]. In the experimental set-up of DUGUAY and
SAVAGE [1973] (see Fig. 19) the green pulse (λ = 0.53 µm) to be displayed is
sent into a cell containing highly dilute milk. Light pulses scattered from
a set of 10 discrete points (only 4 are shown in Fig. 19) are coupled through
an ultrafast gate into a set of 10 optical fibers. The 10 scattered pulses
replicate the shape of the incident green pulse. The spacing between succes-
sive points A, B, C, . . . is such that the scattered light pulses arrive at the
gate with delays progressively longer by 3.9 ps increments. When the
shutter is opened by a single ultrashort 1.06 µm pulse, samples are cut out
from the 10 scattered pulses. These samples are centered at 3.9 ps intervals
from the leading to the lagging edge of the green pulse shape.
Fig. 19. Set-up used to optically sample ultrashort laser pulses and display them on a real-time
oscilloscope. The pulse of green (λ = 0.53 µm) light to be displayed comes down from the top
left and enters a cell containing highly diluted milk (or a Ludox silica suspension type LS).
The ultrafast shutter is driven by the infrared pulse shown. The shutter opens only once and
cuts out a slice (or sample) from each scattered pulse. The samples are centered at 3.9 ps
intervals from the leading to the lagging edge of the incident green pulse. The fiber array
transforms the spatially distinct samples at points A, B, C, ... into temporally distinct
pulses on the oscilloscope screen. The fiber ends at A, B, C, ... are butted against a
glycerin-wetted glass window (not shown) for good optical coupling.

Another and more efficient way of obtaining 10 pulses replicating the
shape of the incident pulse is by using Vogel's multibeamsplitter (see
VOGEL, SAVAGE and DUGUAY [1974]) shown in Fig. 20. The spacing between
pulses is determined by the thickness of the glass etalons used. With 3.3 mm
thick glass etalons, the pulses are 4.1 ps apart. Vogel's multibeamsplitter
has the advantage over the echelon (see TOPP, RENTZEPIS and JONES [1971])
in this application of replicating the signal pulse not only temporally but
also spatially. When the signal beam has spatial inhomogeneities, this
ensures that all replica pulses entering the shutter are identical in shape.
Fig. 20. Multibeamsplitter developed by VOGEL, SAVAGE and DUGUAY [1974] to subdivide an
incident optical signal into 10 replica pulses (only four shown above, for clarity), identical
except as to height to the incident pulse, and spaced apart by 4.1 ps in this case. In the absence
of the antireflection-coated etalons $E_1, E_2, \ldots, E_{10}$, the replica pulses would be synchronous
at the shutter. The ten identical glass etalons introduce a uniform interpulse delay of 4.1 ps.
The ten replica pulses are directed towards the ultrafast Kerr shutter for multichannel sampling.

The 10 samples cut out by the ultrafast gate are sent into 10 optical
fibers, cut to progressively longer lengths and giving delays of 5, 15, 25, 35,
..., 95 ns. This set of optical fibers is referred to as an organ array, by
analogy with the progressively longer pipes of the musical organ. The output
ends of the fibers are placed near the face of a single photomultiplier
of 3 ns response time, the output of which is displayed on a fast (0.8 ns
risetime) oscilloscope. This way the 10 samples, which were taken within
the same 5-10 ps interval, are spaced out at 10 ns intervals for detection
and display.
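The fiber lengths needed for such an organ array follow directly from the delays. The Python sketch below assumes a group index of 1.5 for the fiber core, a typical value for glass that is not given in the text; the function name is illustrative.

```python
# Fiber lengths for an "organ array" giving delays of 5, 15, ..., 95 ns.
C_M_PER_NS = 0.299792458   # speed of light in vacuum, metres per nanosecond
N_CORE = 1.5               # assumed group index of the fiber core


def fiber_length_m(delay_ns, group_index=N_CORE):
    """Length of fiber that delays a pulse by delay_ns."""
    return delay_ns * C_M_PER_NS / group_index


delays_ns = [5 + 10 * k for k in range(10)]   # 5, 15, ..., 95 ns
for delay in delays_ns:
    print(f"{delay:3d} ns -> {fiber_length_m(delay):6.2f} m")
```

Under this assumption the shortest fiber is about 1 m long and each further channel adds about 2 m, so that samples taken within a few picoseconds of one another reach the photomultiplier 10 ns apart.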
An example of a display obtained with Vogel's multibeamsplitter is
shown in Fig. 21. The second harmonic pulse at λ = 0.53 µm is being sampled
here. If the shutter had opened for, say, 2 ps in this case, the envelope of
the samples would represent the green pulse shape. For that laser shot,
however, the duration of the 1.06 µm gating pulse was more like 16 ps,
leading to a gate open time of about 10 ps, a duration about equal to the
green pulse width. The envelope of the sample pulses in Fig. 21 represents
in fact the function $G^{(4)}(\tau)$ given in eq. (2.10).
The sensitivity of the optical sampling oscilloscope (OSO) with Vogel's
beamsplitter was such that pulses with peak powers down to a few watts could
be displayed. With various improvements (see VOGEL, SAVAGE and DUGUAY [1974])
sensitivity down to the level of ten milliwatts in optical power seems possible
in practice.


Fig. 21. (a) Sampled display of an ultrashort laser pulse obtained with a Vogel
multibeamsplitter, an ultrafast shutter and an organ array as shown in Fig. 20. The envelope of the
samples constitutes the displayed shape. This shape is the convolution $G^{(4)}(\tau)$ described in the
text involving the green pulse shape (~12 ps wide) and the shutter transmission function T(t)
(also 12 ps at half maximum in this sample). The width of the display at half maximum is
about equal to the duration of the infrared pulse that drove the shutter in this example (17 ps).
The ordinate scale points up the excellent sensitivity available with this technique. (b) Picture
obtained when the ultrafast shutter is manually opened by removing P. Unequal heights
reflect residual inequalities in the coupling and transmission of the various channels.

Recently MOUROU and MALLEY [1974] have implemented an elegant
way of measuring picosecond fluorescence decay times. Their set-up is
shown in Fig. 22. Second harmonic pulses (λ = 0.53 µm) about 4 ps in duration
derived from a Nd:glass laser are passed through a cell containing a
molar solution of erythrosin in water. The green beam has been collimated
down to a diameter of 1 mm. The spontaneous fluorescence light
emitted along this narrow track is imaged through a collinear Kerr gate onto
a linear array of 512 photodetectors (only 5 are shown in Fig. 22 for clarity).
Each photodetector collects light from a small volume element situated
along the fluorescence track. The light pulses emitted by each volume
element reach the ultrafast shutter after a delay proportional to the distance
along AZ. When the shutter is opened by the 1.06 µm pulse (~6 ps in duration)
it cuts out one sample from each pulse. Just as in the optical sampling
oscilloscope, the samples are spaced uniformly from the leading to the
lagging edge of the pulse shape.

Fig. 22. Set-up used by MOUROU and MALLEY [1974] to measure ultrafast relaxation times.
A second harmonic green pulse 4 ps in duration propagates through the dye solution
(erythrosin in water) leaving an exponentially decaying tail of fluorescent light in its wake.
This fluorescence tail is imaged through the ultrafast shutter onto an array of 512 photodiodes.
When the shutter is opened for 4 ps by an infrared pulse, a record of the tail is captured by
the linear diode array.

Fig. 23. Result obtained by MOUROU and MALLEY [1974] using the arrangement of Fig. 22.
The risetime of the fluorescence signal from erythrosin B in water (3 x 10- M) appears limited
by the widths of the excitation and gating pulses (~4 ps). When the dye is replaced by milky
water, a stop-motion image of the second harmonic (530 nm) pulse is projected onto the linear
photodiode array: the sharp curve obtained gives the prompt response of the system, as in
Fig. 16.
The result obtained by MOUROU and MALLEY [1974] for erythrosin is
shown in Fig. 23. The number of samples taken is large enough to make the
recorded trace appear continuous. Mourou and Malley found a fluorescence
signal risetime of 4 ± 1 ps, that is, essentially equal to the time resolution
of the apparatus. The implication is that the true fluorescence risetime
(i.e., the risetime under delta-function pulse excitation) is less than 4 ps.
Another way of looking at the Mourou-Malley experiment is from the
point of view of ultrahigh speed photography (see DUGUAY and HANSEN
[1970]). The green pulse (or light bullet) leaves in its wake a tail of
fluorescence light that is stop-motion photographed in flight by the Kerr
shutter and detector array. The latter replaces the film and records only one
horizontal line of the picture, but that is all that matters here.
MOUROU and MALLEY [1974] have also used the ultrafast shutter and
their detector array in conjunction with a spectrograph to do time resolved
spectroscopy of the spontaneous emission of rhodamine 6G on the picosecond
time scale.

§ 5. Concluding Remarks
The ultrafast Kerr shutter has established itself as a useful instrument
in fields of studies involving picosecond laser pulses. The shutter has been
driven by pulses derived from a variety of lasers, including CO2 laser pulses
(see DUGUAY and SAVAGE [1973], OWEN, COLEMAN and BURGESS [1973]).
The potential of this instrument in studies of ultrafast molecular dynamics
has been left largely unexploited so far, probably for a number of reasons,
one of which is certainly the difficulty and expense of producing stable
powerful laser pulses. If the day comes when laser pulses achieve the
reliability and flexibility of electronic pulses, use of the ultrafast Kerr
shutter will become an easy task, a task that will not involve the difficulties
and dangers of high voltage pulses used in conventional Kerr cells.
The use of subpicosecond pulses in driving the shutter (IPPEN and SHANK
[1975]) has opened a most intriguing new frontier where the electronic
Kerr effect will certainly be called into play. Thus, one century after the
discovery of the DC Kerr effect (KERR [1875]), a closely related effect,
the AC Kerr effect, is playing an active role at the pinpoint of technology.

I would like to acknowledge the help of Mrs. Jeri Romaine and Mr. A.
Savage in preparing the manuscript for publication.

ALFANO, R. R. and S. L. SHAPIRO, 1972, Optics Commun. 6, 98.
ALFANO, R. R., S. L. SHAPIRO and M. POPE, 1973, Optics Commun. 9, 388.
BROIDA, H. P. and S. L. SHAPIRO, 1967, Phys. Rev. 154, 129.
BUCKINGHAM, A. D., 1956, Proc. Phys. Soc. B69, 344.
CAMPILLO, A. J., S. L. SHAPIRO and B. R. SUYDAM, 1973, Appl. Phys. Lett. 23, 628.
CAMPILLO, A. J., S. L. SHAPIRO and B. R. SUYDAM, 1974, Appl. Phys. Lett. 24, 178.
CHIAO, R. Y., E. GARMIRE and C. H. TOWNES, 1964, Phys. Rev. Lett. 13, 479.
DEMARIA, A. J., D. A. STETSER and H. HEYNAU, 1966, Appl. Phys. Lett. 8, 174.
DUGUAY, M. A., 1971, American Scientist 59, 550.
DUGUAY, M. A. and J. W. HANSEN, 1969a, Appl. Phys. Lett. 15, 192.
DUGUAY, M. A. and J. W. HANSEN, 1969b, Optics Commun. 1, 254.
DUGUAY, M. A. and J. W. HANSEN, 1970, NBS Special Publication No. 341, pp. 45-49 (Gov.
Printing Office, Washington, D.C.).
DUGUAY, M. A. and A. T. MATTICK, 1971, Appl. Optics 10, 2162.
DUGUAY, M. A. and A. SAVAGE, 1973, Optics Commun. 9, 212.
DUGUAY, M. A. and J. N. OLSEN, 1975, Picosecond X-ray Pulses, IEEE J. Quant. Electr.
QE-11, 170.
FABELINSKII, I. L., 1968, Molecular Scattering of Light (Plenum Press, New York).
FISCHER, R. and R. ROSSMANITH, 1973, IEEE Trans. Nucl. Sci. NS-20, 549.
GIORDMAINE, J. A., P. M. RENTZEPIS, S. L. SHAPIRO and K. W. WECHT, 1967, Appl. Phys. Lett.
11, 216.
JANSSEN, J. M. L., 1950, Philips Tech. Rev. 12, 52.
IPPEN, E. P. and C. V. SHANK, 1975, Appl. Phys. Lett. 26, 92.
KELLEY, P. L., 1965, Phys. Rev. Lett. 15, 1005.
KERR, J., 1875, Phil. Mag. 50, 337.
MALLEY, M. M. and P. M. RENTZEPIS, 1970, Chem. Phys. Lett. 7, 57.
MAYER, G. and F. GIRES, 1964, C. R. Hebd. Seanc. Acad. Sc. Paris 258, 2039.
MOUROU, G. and M. M. MALLEY, 1974, Optics Commun. 11, 282.
NETZEL, T., P. M. RENTZEPIS and J. LEIGH, 1973, Science 182, 238.
OWEN, T. C., L. W. COLEMAN and T. J. BURGESS, 1973, Appl. Phys. Lett. 6, 272.
OWYOUNG, A., R. W. HELLWARTH and N. GEORGE, 1972, Phys. Rev. B5, 628.
PAILLETTE, M., 1969, Annales de Physique 4, 671.
RICHARDSON, M. C. and K. SALA, 1973, Appl. Phys. Lett. 23, 420.
SHANK, C. V. and E. P. IPPEN, 1974, Appl. Phys. Lett. 24, 313.
SHAPIRO, S. L. and M. A. DUGUAY, 1969, Phys. Lett. 28A, 698.
SHIMIZU, F. and B. P. STOICHEFF, 1969, IEEE J. Quant. Electr. QE-5, 544.
STARUNOV, V. S., E. V. TIGANOV and I. L. FABELINSKII, 1966, Zh. Eksp. i Teor. Fiz. Pisma
Redaktsiyu 4, 262 [Engl. Trans.: Soviet Phys. JETP Letters 4, 176].
SUYDAM, B. R., 1973, NBS Special Publication No. 387 (Gov. Printing Office, Washington,
D.C.) pp. 42-48.
TOPP, M. R., P. M. RENTZEPIS and R. P. JONES, 1971a, J. Appl. Phys. 42, 3415.
TOPP, M. R., P. M. RENTZEPIS and R. P. JONES, 1971b, Chem. Phys. Lett. 9, 1.
VOGEL, G. C., A. SAVAGE and M. A. DUGUAY, 1974, IEEE J. Quant. Electr. QE-10, 642.
Yu, W. and R. R. ALFANO, 1974, Opt. Electronics 6, 243.
Yu, W., P. P. Ho, R. R. ALFANO and M. SEIBERT, 1975, Biochimica et Biophysica Acta 387, 159.




Universitäts-Sternwarte, Göttingen


§ 1. Introduction
The first experiments with grating-like structures were probably made
by the American astronomer David RITTENHOUSE [1786] in Philadelphia.
He used parallel hairs laid in a fine screw and observed diffraction effects
of light. The first ruled optical gratings were produced by Joseph VON
FRAUNHOFER [1821/22], who also discovered the fundamental properties
of optical diffraction gratings. Comprehensive reviews of the history,
theory and manufacture of gratings have been given by KAYSER [1900],
STROKE [1963, 1967] and HARRISON [1973].
Because of the severe mechanical problems of ruling gratings many
alternative methods of production have been considered. MICHELSON
[1927] suggested producing gratings by photographing stationary waves
using Lippmann plates. At the National Physical Laboratory in England
BURCH and PALMER [1961] made gratings by photographing fine interference
fringes and measured the shift caused by processing of photographic
emulsions. LABEYRIE [1966] suggested using layers of bichromated gelatine,
Daguerre layers, photoconductive layers in connection with sputtering
techniques or thermoplastic layers as recording materials. None of these
methods gave gratings suitable for practical spectroscopic use. Stimulated
by the fact that in high resolution stellar spectroscopy with large telescopes
it is necessary to use large diffraction gratings of good quality, the authors
of the present article proposed making diffraction gratings holographically
by using photoresist layers. The first results, obtained in 1967, already
demonstrated the good optical quality of such holographically made
diffraction gratings (RUDOLPH and SCHMAHL [1967a, b, c, 1968]). In the
meantime several groups have started working in this field and holographic
diffraction gratings compete with classically ruled gratings from the
visible to the ultraviolet and soft X-ray regions.
The scope of this article is to describe the basic method of making gratings
holographically, to describe the results obtained and to compare these
results with those obtained from gratings produced by traditional means.

§ 2. Theoretical Characteristics of Spectroscopic Diffraction Gratings

The notation for the basic grating equations is introduced in this section
and a brief summary of the theory of spectral image formation by optical
gratings is given (STROKE [1963]). The relation between the wavelength
and the angle of diffraction is given by the grating equation
$$\sin i + \sin i' = \frac{m\lambda}{a},$$
where $i$ and $i'$ are the angles formed by the wave propagation vectors of
the incident and the diffracted waves with the normal to the grating surface.
The grating spacing is $a$, the wavelength of the diffracted radiation $\lambda$ and
the spectral order $m$. The angle of diffraction $i'$ is positive if the incident
and diffracted wavefronts are on the same side of the normal and negative
if they are on opposite sides of the normal.
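As an illustration of the grating equation, the following Python sketch solves it for the diffraction angle. The numerical values (1000 grooves/mm, a wavelength of 0.4 µm, first order) and the function names are chosen here for illustration only.

```python
import math


def diffraction_angle_deg(wavelength_um, spacing_um, order, incidence_deg):
    """Solve sin i + sin i' = m * lambda / a for the diffraction angle i' (degrees).

    Returns None when the order does not propagate (|sin i'| > 1).
    """
    sin_diffracted = order * wavelength_um / spacing_um - math.sin(math.radians(incidence_deg))
    if abs(sin_diffracted) > 1.0:
        return None
    return math.degrees(math.asin(sin_diffracted))


def autocollimation_angle_deg(wavelength_um, spacing_um, order):
    """Angle for which i = i', i.e. 2 sin i = m * lambda / a."""
    return math.degrees(math.asin(order * wavelength_um / (2.0 * spacing_um)))


# Illustrative case: a = 1 um (1000 grooves/mm), lambda = 0.4 um, first order.
auto_angle = autocollimation_angle_deg(0.4, 1.0, 1)
print(f"autocollimation angle: {auto_angle:.2f} deg")
print(f"i' at that incidence:  {diffraction_angle_deg(0.4, 1.0, 1, auto_angle):.2f} deg")
print("third order at normal incidence:", diffraction_angle_deg(0.4, 1.0, 3, 0.0))
```

For these values the autocollimation angle is about 11.5 degrees, and the third order at normal incidence does not exist because it would require a sine larger than 1.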
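The angular dispersion, $\delta$, is given by
$$\delta = \frac{di'}{d\lambda} = \frac{m}{a\cos i'} = \frac{1}{\lambda}\,\frac{\sin i + \sin i'}{\cos i'}. \tag{2.2}$$
In autocollimation $i = i'$, and equation (2.2) reduces to $\delta = (2/\lambda)\tan i$.

The linear dispersion in the focal plane of a focussing system of focal
length $f$ is
$$L = \delta \cdot f.$$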
Assume that a grating is used in a spectrograph with a very narrow
entrance slit; there will then be no difference between coherent and
incoherent illumination. According to the Rayleigh criterion two spectral
elements with wavelengths $\lambda$ and $\lambda + \Delta\lambda$ can be separated if the maximum
of the diffraction pattern of the first spectral element coincides with the first
minimum of the second spectral element. For a monochromatic plane wave
with a width $A = W\cos i'$ (see Fig. 1) the first minimum of the diffraction
pattern is located at
$$\beta_0 = \frac{\lambda}{A} = \frac{\lambda}{W\cos i'}. \tag{2.4}$$
With $A = W\cos i'$ and $di' = \beta_0$ in equation (2.2) and using equation (2.4)
the resolving power, $R$, is seen to be expressible in the form
$$R = \frac{\lambda}{\Delta\lambda} = \frac{W}{\lambda}\,(\sin i + \sin i') = N \cdot m = A \cdot \delta,$$
where $N$ is the total number of grooves.
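The two equivalent forms of the resolving power can be compared numerically. The Python sketch below assumes an illustrative grating 100 mm wide with 1000 grooves/mm, used in first-order autocollimation at a wavelength of 0.4 µm; these values and the function names are not taken from the text.

```python
import math


def resolving_power_from_grooves(width_mm, grooves_per_mm, order):
    """R = N * m, with N the total number of grooves across the ruled width W."""
    return width_mm * grooves_per_mm * abs(order)


def resolving_power_from_angles(width_mm, wavelength_um, incidence_deg, diffraction_deg):
    """R = (W / lambda) * (sin i + sin i')."""
    width_um = width_mm * 1000.0
    return width_um / wavelength_um * (
        math.sin(math.radians(incidence_deg)) + math.sin(math.radians(diffraction_deg))
    )


# Autocollimation for a = 1 um, lambda = 0.4 um, m = 1: 2 sin i = 0.4.
angle = math.degrees(math.asin(0.2))
r_grooves = resolving_power_from_grooves(100.0, 1000.0, 1)
r_angles = resolving_power_from_angles(100.0, 0.4, angle, angle)
print(f"R = N*m = {r_grooves:.0f}, R from angles = {r_angles:.0f}")
print(f"smallest resolvable interval at 0.4 um: {400.0 / r_grooves:.4f} nm")
```

Both expressions give a resolving power of 100 000, corresponding to a resolvable wavelength interval of 0.004 nm at 0.4 µm.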
In the focal plane of the spectrograph the complex amplitude of the
diffraction pattern of a monochromatic plane wave is given by

where the angular coordinate of the diffraction pattern in the focal plane
is given by $\beta = \pi x/(\lambda f/A)$ and $\eta = u/(A/2)$, the normalised width of the
diffracted plane wave. The complex amplitude of the diffracted monochromatic
plane wave $g(\eta)$ is given by

Here $\Delta(\eta)$ is the phase of the diffracted wavefront which can be derived
from wavefront interferograms. $\Delta(\eta)$ is constant for a plane wave. We
also assume that the amplitude is constant, which means that the efficiency
and polarisation properties are uniform over the whole area of the grating.
With $\mathfrak{A}(\eta) = 1$

The intensity distribution of the diffraction pattern is given by
$$I(\beta) = \frac{1}{\pi}\left(\frac{\sin\beta}{\beta}\right)^{2},$$
where $I(\beta)$ is normalised so that the total intensity is 1. Wave front
aberrations $\Delta(\eta)$ are introduced by periodic, random and progressive errors
in the rulings and also by variations in the flatness of the blank or in the
layer in which the grooves are made. Aberrations also occur if $\mathfrak{A}(\eta)$ is not
constant. All aberrations give rise to deviations from the theoretical
diffracted intensity distribution.
High quality diffraction gratings should therefore have the following
properties :
a) A high ruling accuracy; i.e. periodic, random and progressive ruling
errors have to be as small as possible. Periodic errors result in ghosts in
the spectrum, random errors give broad wings to the diffraction pattern
i.e. scattered light over a broad wavelength region. Progressive ruling
errors (also known as error of run) over parts of the grating give rise to
deviations from the theoretical line profile near the centre of the line.
b) Good optical quality of the grating blank and the layer in which the
individual grooves are made.
c) A well defined profile for the individual grooves to obtain high
efficiency and small amount of light scattering.
d) The profile of the grooves should be uniform over the entire grating
to obtain constant intensity over the diffracted wavefronts when using
the grating in a spectrograph or a spectrometer.

§ 3. Basic Principles of Holographic Diffraction Gratings

A hologram made by superposition of two monochromatic waves and
subsequent storage of the resultant intensity distribution is an optical
element with dispersive and imaging properties. The most simple holograms
can be made by using two plane waves, a plane wave and a spherical wave
or two spherical waves.
In principle the process for making gratings holographically is very
simple. Glass blanks polished to optical tolerances are coated with a thin
film of photoresist which is capable of forming well defined profiles of the
individual grooves of a grating and which can also store an interference
fringe system with accuracy and durability.
The process of constructing gratings with laser light and photoresist
layers could as well be called interferographic. However, the term
holographic grating is more useful than the term interference grating
especially for gratings with imaging properties and for gratings with
improved accuracy made with identical reconstructed wavefronts.


3.1.1. Accuracy of the interference fringe system

The accuracy of the interference fringe system necessary for making
gratings holographically will be discussed in this article in the case of an
interference pattern for plane gratings with equidistant spacings.
If we assume that the blank and the photoresist layer are perfectly plane
the accuracy of the diffracted wavefronts will only depend on the ruling
accuracy. For holographic gratings the ruling accuracy is determined by
the accuracy of the interference system used to make the gratings.

Fig. 2. Formation of interference fringes in the xz-plane.

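In Fig. 2 two monochromatic beams, with direction of propagation
parallel to the x, y plane are superposed
$$S_1(x, y) = A(x, y)\,\exp[\mathrm{i}(\varphi_1 + \omega t)]$$
and
$$S_2(x, y) = B(x, y)\,\exp[\mathrm{i}(\varphi_2 + \omega t)]. \tag{3.1}$$
If these beams are parallel beams we can write
$$\varphi_1 = \frac{2\pi}{\lambda}\,x\sin i + \rho_1, \qquad \varphi_2 = -\frac{2\pi}{\lambda}\,x\sin i' + \rho_2. \tag{3.2}$$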

From Fig. 2, $i = \alpha + \beta$ and $i' = \alpha - \beta$. $i'$ is negative if $i$ and $i'$ are on opposite
sides of the normal in the xy-plane. Since we are only interested in the
spatial frequency and not in the position of the interference maxima and
minima we can choose $\rho_1 = \rho_2 = 0$. The superposition yields an intensity
$$I(x) = \left[A\,e^{\mathrm{i}(\varphi_1 + \omega t)} + B\,e^{\mathrm{i}(\varphi_2 + \omega t)}\right]\left[A^{*}e^{-\mathrm{i}(\varphi_1 + \omega t)} + B^{*}e^{-\mathrm{i}(\varphi_2 + \omega t)}\right] = A^{2} + B^{2} + 2AB\cos(\varphi_1 - \varphi_2). \tag{3.3}$$

Equation (3.3) corresponds to equally spaced fringes. If $A = B$ the maximum
intensity variation, from zero intensity in the minima to $I(x) = 4A^{2}$
in the maxima, is obtained. From equations (3.2) and (3.3) the spacing of
the fringes can be obtained as
$$a = \frac{\lambda}{\sin(\alpha + \beta) + \sin(\alpha - \beta)} = \frac{\lambda}{2\sin\alpha\cos\beta}. \tag{3.4}$$
Using a symmetrical system ($\beta = 0$) and $\lambda = 0.4\ \mu$m one obtains fringes
with the following spacings (Table 1). For a perfect plane grating the
grating spacing is constant over the whole area of the grating. Variations
in the spacing yield aberrations, $\Delta(\eta)$, in the diffracted wavefronts and
consequently deviations from the ideal intensity distribution in the spectrum.

Table 1
Grating spacing as a function of the angle α for λ = 0.4 µm

α [°]     a [µm]    grooves/mm
 1.1      10         100
11.5       1        1000
53.1       0.25     4000
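
The entries of Table 1 follow directly from eq. (3.4). The short Python sketch below recomputes them for the symmetric case (β = 0); the function names are illustrative and not part of the original text.

```python
import math

def fringe_spacing_um(wavelength_um, alpha_deg, beta_deg=0.0):
    """Fringe spacing a = lambda / (2 sin(alpha) cos(beta)), eq. (3.4)."""
    alpha = math.radians(alpha_deg)
    beta = math.radians(beta_deg)
    return wavelength_um / (2.0 * math.sin(alpha) * math.cos(beta))

def half_angle_deg(wavelength_um, spacing_um):
    """Angle alpha that gives the requested spacing in a symmetric set-up (beta = 0)."""
    return math.degrees(math.asin(wavelength_um / (2.0 * spacing_um)))

# Recompute the rows of Table 1 for lambda = 0.4 um
for spacing in (10.0, 1.0, 0.25):
    alpha = half_angle_deg(0.4, spacing)
    print(f"alpha = {alpha:5.1f} deg   a = {spacing:5.2f} um   {1000.0 / spacing:6.0f} grooves/mm")
```

The same function can be used with β ≠ 0 to see how an asymmetric arrangement widens the spacing by the factor 1/cos β.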

The production of interference fringes with exactly equal spacings would require perfectly plane wavefronts. However, these cannot exist for two reasons: Firstly, in practice optical elements must be used to produce the wavefronts and these are never perfect. Secondly, because finite beams must be used, divergence due to diffraction at the apertures is introduced.
Wavefronts diffracted by high quality spectroscopic diffraction gratings should have no aberrations of high spatial frequency. Aberrations of low spatial frequency should not exceed λ/20, if λ is the minimum wavelength used. In holographic production of gratings one normally uses high quality optical elements, which do not show aberrations of high spatial frequency. Common aberrations of such elements are of very low spatial frequency. A holographic grating made with wavefronts having such aberrations has, for example, a ruling error Δx over the length l of the grating. This means that the position of one groove at the end of a ruled interval of the length l differs by Δx from the ideal position. For a grating used in autocollimation such a ruling error yields a wavefront aberration

$$\Delta(\varphi) = 2\,\Delta x \sin i'. \tag{3.5}$$
Table 2
Values of Δx (in µm) for various values of i' and Δ(φ). The quantities Δx, i' and Δ(φ) denote the ruling error, the angle of diffraction, and the diffracted wavefront aberration (for λ = 0.4 µm), respectively.

Δ(φ)     i' = 15°    30°     45°     60°
λ/4      0.19        0.10    0.07    0.06
λ/10     0.08        0.04    0.03    0.02
λ/20     0.04        0.02    0.01    0.01
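
The tolerances in Table 2 can be recomputed from eq. (3.5) solved for the ruling error, Δx = Δ(φ)/(2 sin i'). The following Python sketch does this for λ = 0.4 µm; the function name is illustrative only.

```python
import math

def max_ruling_error_um(aberration_um, i_prime_deg):
    """Ruling error dx that produces the given wavefront aberration, eq. (3.5)."""
    return aberration_um / (2.0 * math.sin(math.radians(i_prime_deg)))

wavelength_um = 0.4
for fraction in (4, 10, 20):
    # One table row: aberration lambda/fraction at i' = 15, 30, 45 and 60 degrees
    row = [max_ruling_error_um(wavelength_um / fraction, angle) for angle in (15, 30, 45, 60)]
    print(f"lambda/{fraction:<2d}", "  ".join(f"{dx:.2f}" for dx in row))
```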

Table 2 shows values of Δx for various values of i' and Δ(φ) for λ = 0.4 µm. The larger i', the smaller Δx has to be. In addition, the value of Δx has to be smaller as the wavelength at which the grating is used decreases. When constructing the grating holographically according to Fig. 2, the wavefronts of the two beams S1 and S2 will have aberrations due to the optical elements used. We assume here that S1 has an ideal wavefront and S2 has an aberration Δ(φ). Then a grating produced with these two wavefronts (on an ideal blank and photoresist layer) reconstructs the beam S2, apart from amplitude factors, when the grating is illuminated with the wavefront S1. We can call the reconstructed beam the +1 order.
This example demonstrates that for the construction of a holographic grating which will be used in the first (or a low) order the optical elements with which the beams S1 and S2 are produced do not necessarily have to be much better than the collimator and the camera of the spectrograph in which the grating is mounted. To make gratings holographically which are used in higher orders (echelle gratings) the accuracy of good optical elements is not sufficient. The reason is that the wavefront aberrations for a grating used in the order m (in autocollimation) are given by

$$\Delta(\varphi) = 2\,\Delta x \sin i' = m\lambda\,\frac{\Delta x}{a}. \tag{3.6}$$

Because Δx ∝ a (with a given wavefront aberration the ruling error Δx increases linearly with increasing grating spacing) it follows that Δ(φ) ∝ m. Assuming an echelle grating with 100 grooves/mm used for 0.5 µm in the 17th order with i = i' = 60°, the aberrations are seventeen times larger than for a high line density grating to be used in the first order for λ = 0.5 µm made with the same beams. Nevertheless it is possible to obtain the ruling accuracy necessary for holographic echelle gratings. This point will be discussed in section 3.1.3.
Independent of the source of aberrations of ideal plane wavefronts (zone errors, diffraction or misalignment of the optical elements) one can consider two extreme possibilities. The first case is the superposition of two slightly divergent beams, the second case is the superposition of one slightly divergent beam with one slightly convergent beam (RUDOLPH and SCHMAHL [1970]).
For a grating with a progressive ruling error one can write

$$x = n a + \Delta x = \sum_{\nu=1}^{\infty} \frac{\delta_\nu}{\nu!}\, n^{\nu}, \tag{3.7}$$

where x is the width of the grating up to the nth groove, measured from the middle of the grating. Δx is the deviation of the nth groove from its ideal position. For the change of the grating spacing as a function of n it follows from eq. (3.7) that

$$\frac{d^2x}{dn^2} = \sum_{\nu=2}^{\infty} \delta_\nu\, \frac{n^{\nu-2}}{(\nu-2)!}. \tag{3.8}$$

The term $\delta_2$ corresponds to a progressive linear ruling error and yields an
astigmatism of the grating, and higher terms of $\delta_\nu$ influence the spectral
The superposition of two divergent (or two convergent) beams yields an interference field which can be described as a section of a family of two-sheet hyperboloids of revolution (hyperbolic case). In Fig. 3 two divergent beams originate from the two points $F_1$ and $F_2$ which are the focal points of the hyperboloids of revolution that are the geometrical loci of the interference maxima. The hyperboloids are given by $x^2/a_1^2 - y^2/a_2^2 - z^2/a_3^2 = 1$ with $2a_1 = |r_1 - r_2| = n\lambda$, $n = 0, 1, 2, \dots$ and $a_2^2 = a_3^2 = c^2 - \tfrac14 n^2\lambda^2$. In a plane $z = h$ parallel to the xy-plane the geometrical loci of the interference maxima are hyperbolas

$$x^2 = \tfrac14 n^2\lambda^2\left(1 + \frac{y^2 + h^2}{c^2 - \tfrac14 n^2\lambda^2}\right). \tag{3.9}$$

Fig. 3.

The spacings a of the interference maxima, which are given by the angle $2\alpha = \angle F_1PF_2$, are considered as a function of x in the vicinity of the plane x = 0, i.e. the divergences of both beams are nearly equal. In all practical cases c is in the order of several kilometers, that means $c \gg n\lambda$. With $y = s$ and $s^2 + h^2 = b^2$ we obtain from eq. (3.9)

$$x^2 = \tfrac14 n^2\lambda^2\left(1 + \frac{b^2}{c^2}\left(1 + \frac{n^2\lambda^2}{4c^2} + \dots\right)\right) \tag{3.10}$$

and hence on taking the square root we obtain the following expansion for x:

$$x = \frac{\lambda}{2}\left(1 + \frac{b^2}{c^2}\right)^{1/2} n + \frac{\lambda^3 b^2}{16c^4}\left(1 + \frac{b^2}{c^2}\right)^{-1/2} n^3 + \dots \tag{3.11}$$

We note that eq. (3.11) contains n in odd powers only. For the change of the grating spacing as a function of n it follows from eq. (3.11)

$$\frac{da}{dn} = \frac{d^2x}{dn^2} = \frac{6\lambda^3 b^2}{16c^4}\left(1 + \frac{b^2}{c^2}\right)^{-1/2} n + \dots \tag{3.12}$$

A comparison of eq. (3.12) with eq. (3.8) shows that $\delta_2 = 0$, that means that in the hyperbolic case the grating is, to the first approximation, free of aberrations.
The superposition of one divergent beam with one convergent beam yields an interference field which can be described as a section of a family of ellipsoids of revolution (elliptic case). In Fig. 4 one divergent beam originates from the point $F_1$ and one beam converges to the point $F_2$. $F_1$ and $F_2$ are the focal points of the ellipsoids of revolution that are the geometrical loci of the interference maxima. The ellipsoids are given by $x^2/a_1^2 + y^2/a_2^2 + z^2/a_3^2 = 1$ with $2a_3 = (r_1 + r_2) = 2c + t\lambda$ and $a_1^2 = a_2^2 = c\,t\,\lambda + \tfrac14 t^2\lambda^2$. In a plane $z = h$ parallel to the xy-plane the geometrical loci of the interference maxima are circles

$$x^2 = -y^2 + \left(c\,t\,\lambda + \tfrac14 t^2\lambda^2\right)\left(1 - h^2\left(c + \tfrac12 t\lambda\right)^{-2}\right). \tag{3.13}$$

Fig. 4.

The interference pattern is therefore a section of a zone plate pattern (SCHMAHL and RUDOLPH [1969]). A grating made with one divergent and one convergent beam is, therefore, a section of a zone plate at very large values of t. We may, therefore, restrict the discussion to determining the spacing a as a function of x for $y \ll x$ and $z \ll x$. In Fig. 2 for z = y = 0, β = 0 it follows from eq. (3.13) that

$$x = (t\lambda c)^{1/2}\left(1 + \frac{t\lambda}{8c} - \dots\right). \tag{3.14}$$

Because of the assumption that the grating is a section with large zone numbers, t can be replaced by $t = t_0 + n$, where n is the running number of the grooves; n = 0 represents the middle of the grating. With $t = t_0 + n$ and expansion of the square root eq. (3.14) gives

$$x = (t_0\lambda c)^{1/2}\left(1 + \frac{n}{2t_0} - \frac{n^2}{8t_0^2} + \dots\right). \tag{3.15}$$

In the expansions (3.14) and (3.15) higher powers of $\lambda/t_0c$ are neglected. The change in grating spacing as a function of n follows from eq. (3.15) within an accuracy better than 10% for all practical cases:

$$\frac{da}{dn} = \frac{d^2x}{dn^2} = -\tfrac14\,(\lambda c)^{1/2}\, t_0^{-3/2} + \dots \tag{3.16}$$

A comparison of eq. (3.16) with eq. (3.8) shows that $\delta_2 \neq 0$; that means that in the elliptic case the grating has a linear progressive ruling error. In Table 3 numerical values for Δx (cf. eq. (3.7)) are given for gratings with a width W = 400 mm and grating spacings in the range 10 µm to 0.5 µm.

Table 3
Values of Δx calculated for gratings in the hyperbolic and elliptic cases.

grooves/mm    Δx
 100          1
 500          0.3
1000          0.1
2000          0.05

In the two cases the beams were assumed to have a divergence or angle of aperture 2γ = 0.2 seconds of arc. The result is that in the hyperbolic case the accuracy is much higher than in the elliptic case. The physical reason for this effect is that in the first case the intersection angle 2α of the two wavefronts is nearly constant over the whole beam diameter. It follows that it is not absolutely necessary to use nearly plane wavefronts to make gratings of high quality. The wavefronts may have certain aberrations, but they must be identical in the two beams. With usual optical elements it is impossible to obtain identical wavefronts. This, however, is possible by holographic reconstruction and will be discussed in section 3.1.3.
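
The contrast between the hyperbolic and the elliptic case can also be checked numerically. The Python sketch below is only an illustration under stated assumptions and does not attempt to reproduce Table 3. The geometry is assumed, not taken from the text: both foci lie at a distance R from the grating centre, at angles ±α from the grating normal, and in the elliptic case the convergent beam is focused at the mirror image of F1 behind the recording plane. The relation R ≈ W/(2γ) between focal distance and divergence and the wavelength of 0.4 µm are likewise assumptions. High-precision decimal arithmetic is used because the path lengths are of the order of hundreds of kilometres while the ruling errors are far below a micrometre.

```python
import math
from decimal import Decimal, getcontext

getcontext().prec = 50  # plenty of digits for subtracting nearly equal path lengths

def ruling_error_m(x, R, sin_alpha, case):
    """Shift (in metres) of the fringe at x from its ideal equidistant position."""
    x, R, s = Decimal(x), Decimal(R), Decimal(sin_alpha)
    c = (1 - s * s).sqrt()
    r1 = ((x + R * s) ** 2 + (R * c) ** 2).sqrt()  # distance from (x, 0) to focus F1
    r2 = ((x - R * s) ** 2 + (R * c) ** 2).sqrt()  # distance from (x, 0) to focus F2
    if case == "hyperbolic":      # two divergent beams: fringe order follows r1 - r2
        path = r1 - r2
    elif case == "elliptic":      # divergent plus convergent beam: order follows r1 + r2'
        path = 2 * r1 - 2 * R     # r2' equals r1 for the mirrored focus (assumed geometry)
    else:
        raise ValueError(f"unknown case: {case}")
    ideal = 2 * x * s             # path difference of two perfectly plane waves
    return float(abs(path - ideal) / (2 * s))

W = 0.4                                  # grating width in metres
R = W / math.radians(0.2 / 3600.0)       # assumed focal distance for 0.2 arcsec divergence
sin_alpha = 0.4e-6 / (2 * 10e-6)         # 100 grooves/mm at an assumed 0.4 um wavelength
for case in ("hyperbolic", "elliptic"):
    print(case, ruling_error_m(W / 2, R, sin_alpha, case), "m at the grating edge")
```

In this sketch the elliptic error grows with the square of x (a linear progressive ruling error), while the hyperbolic error only appears in the cubic term, which is the behaviour stated by eqs. (3.12) and (3.16).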

3.1.2. Interference arrangements

Fig. 5 shows some optical arrangements suitable for the construction of normal holographic gratings. In Fig. 5a a lens O, a spatial filter S and a spherical or parabolic mirror P are used to produce a plane wave, parts of which are reflected by plane mirrors PI to generate the interference fringe system which is recorded in the plane E. Fig. 5b shows an arrangement with two off-axis mirror systems. In Fig. 5c an arrangement is shown suitable for making large gratings (with a central aperture), e.g. for coudé spectrographs of large telescopes.

3.1.3. Improvement of the ruling accuracy by superposition of identical reconstructed wavefronts

As discussed in section 3.1.1 the accuracy of an interference field, made with commercially available good optical elements, is sufficient for holographic gratings for use in the visible region in a low order. This is no longer true for spectroscopic gratings for use in higher orders, for spectroscopic gratings constructed with visible light and used at much shorter wavelengths and for gratings with a low line density for use as scales.

Fig. 5. Optical arrangements.

To enhance the ruling accuracy of such gratings one can use the following procedure (SCHMAHL and RUDOLPH [1970]): Each wavefront affected with aberrations can be assumed to consist of parts of wavefronts with different divergences. If one succeeds in superposing wavefronts, the parts of which have in each case the same divergence, i.e. identical wavefronts, the accuracy of the above discussed hyperbolic case can be obtained over the whole surface of the grating. Identical wavefronts can be made by reconstruction using two holograms.

Fig. 6. Principle of the reconstruction of identical wavefronts (recording process and reconstruction).

The principle is illustrated in Fig. 6: One hologram $H_{12}$ is made using the two beams $R_1$ and $R_2$, and a second hologram $H_{32}$
using the beams $R_3$ and $R_2$. Illuminating the hologram $H_{12}$ with $R_1$ and the hologram $H_{32}$ with $R_3$, the beam $R_2$ is reconstructed twice, except for amplitude factors. The superposition of the two beams $R_2$ yields the grating $H_{22}$. In a model experiment holograms $H_{12}$ and $H_{32}$ were made with the beams $R_1$, $R_2$ and $R_3$. The wavefronts of all three beams had aberrations of several wavelengths. These aberrations were introduced by using bad optical elements. In Fig. 7a and Fig. 7b Moiré patterns from the holograms $H_{12}$ and $H_{32}$ are shown. The Moiré patterns were made by illuminating the holograms with an interference fringe system made with two plane wavefronts having aberrations ≤ λ/10. The Moiré patterns demonstrate that these holograms, in accordance with the fact that they are made with distorted wavefronts, have large ruling errors. From these two holograms two identical wavefronts $R_2$ were reconstructed according to Fig. 6. Superposition of these two wavefronts yielded the hologram $H_{22}$. Fig. 7c shows the Moiré pattern of this hologram which demonstrates the high ruling accuracy of this hologram. Figs. 7a-c show, in agreement with theoretical considerations, that the gain in accuracy obtained by this method is about two orders of magnitude, if the aberrations are not greater than a few wavelengths.

Fig. 7a. Moiré pattern of the hologram $H_{12}$.
Fig. 7b. Moiré pattern of the hologram $H_{32}$.
Fig. 7c. Moiré pattern of the hologram $H_{22}$.

In Fig. 6 as well as in the model experiment discussed, the aberrations introduced by the blanks and the resist layer of the holograms $H_{12}$ and $H_{32}$ have been neglected. This effect can be taken into account by providing the hologram $H_{12}$ with the aberrations of the blank of the hologram $H_{32}$ and vice versa. In Fig. 8 an example of an optical arrangement suitable for such an experiment is shown. In Fig. 8a the laser beam L is divided by a beamsplitter 1. A quasi-plane wavefront is generated by the mirror 4 via the mirror 2 and a lens and a spatial filter 3. After reflection on the blank 5, provided with a resist layer and coated with a metal layer, and the mirror 6 this wavefront is combined with a spherical wave produced with the help of mirror 8 and a lens and a spatial filter 9. With this interference pattern blank 7, coated with a photoresist layer, is exposed. The undeveloped resist layer on blank 7 is then made reflective and the reflective coating on the resist layer of blank 5 is stripped off. In the next step the blank 5 is replaced by blank 7 and exposed by an interference figure according to Fig. 8b. After removing the reflective coating from blank 7 both the exposed resist layers of blanks 5 and 7 are developed and recoated. As in Fig. 8c the holograms on blanks 5 and 7 reconstruct two beams with identical wavefronts which are superposed. The resulting interference pattern with highly improved ruling accuracy is recorded in a resist layer on the blank 12.


Fig. 8a-c. Optical arrangements for producing holographic gratings with improved accuracy by
the use of identical reconstructed wavefronts.

3.1.4. Frequency and wavelength stability of the laser light

Laser beams have a finite spectral width which affects the contrast K in the interference figure, given by $K = (I_{max} - I_{min})/(I_{max} + I_{min})$, where $I_{max}$ and $I_{min}$ are the intensities of the interference maxima and minima respectively. When the path length difference Δ between the two interfering beams is zero, the contrast is unity under optimal conditions. The contrast of the interference fringes as a function of the path length difference depends on the width and the profile of the laser line used. When the path length difference is small compared to the coherence length, the contrast is nearly independent of the profile of the spectral line. The coherence length can, therefore, be expressed by the formula $L \approx \pi c/\Delta\nu$, where Δν is the width of the spectral line. To make a good grating we assume that the contrast of the interference fringes does not decrease by more than 5% from the centre to the edge of the interference pattern. In this case the necessary coherence length is L ≈ 10 Δ. If Δ is zero at the centre of the interference pattern, i.e. at x = 0 in Fig. 2, the path length difference at x = l = 0.5 W with β = 0 is given by Δ = 0.5 W sin α. For a grating with W = 400 mm and with 2000 grooves/mm made with the laser line λ = 457.9 nm it follows that Δ = 92 mm and L = 10 Δ = 920 mm or Δν = 10⁹ Hz. In addition to a small width the laser line must have a high frequency and wavelength stability. According to eq. (2.1) the relative variation of the spacing is given by Δa/a = Δλ/λ = Δν/ν. For a fixed number of interference maxima 2N, i.e. with W = 2Na = 2l, the change of the width of the grating is given by Δl/l = Δa/a. Let ν₀ be the frequency of the centre of the line at the time t₀ and ν₀ + Δν the frequency of the centre of the line at t₀ + Δt, Δt being the exposure time. To obtain sufficient ruling accuracy the Nth interference maximum should not shift by more than 0.05 a during the time Δt. It follows that Δl = (Δa/a) l = (Δν/ν) l ≤ 0.05 a. In the above mentioned example, with W = 2l = 400 mm and a = 0.5 µm, the required frequency stability is given by Δν/ν = 1.25 × 10⁻⁷. When using the laser lines λ = 457.9 nm and λ = 350.7 nm, this corresponds to Δν = 80 MHz and Δν = 100 MHz respectively. Such a high frequency stability in connection with a narrow line width can be obtained by use of an oven stabilised étalon in the laser cavity (DOWLEY [1971]) which filters one axial mode. The frequency interval between two axial modes for a laser running in TEM₀₀ is given by $\Delta\nu = c/2R_L$, where $R_L$ is the length of the laser cavity. The width of an axial mode is a small fraction of this interval. With high power CW Ar⁺ and Kr⁺ lasers for which $R_L$ = 145 cm, the interval between two axial modes is 10⁸ Hz. Using an oven stabilised étalon with such lasers and only one axial mode, with a correspondingly long coherence length of several meters, the frequency stability was found to be better than 75 MHz. The frequency stability was measured by using a confocal scanning interferometer. Even with high frequency stability the wavelength of the laser light can be shifted by temperature and pressure variations in the air surrounding the laser,
because of the variation of the refractive index. For example, a variation in the pressure of Δp = 0.1 mm Hg yields a wavelength variation of Δλ/λ = 4 × 10⁻⁸, and a variation of temperature ΔT = 0.1 K yields Δλ/λ ≈ 10⁻⁷. Under normal weather conditions pressure variations are usually much smaller than Δp = 0.1 mm Hg during the exposure times necessary to make gratings holographically and hence pressure stabilisation is not necessary. It is, however, useful to stabilise the temperature.
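
The numerical example of this section can be retraced with a few lines of Python. The sketch below assumes that the coherence-length relation reads L ≈ πc/Δν (this reading reproduces the quoted line width of about 10⁹ Hz) and uses the path difference Δ = 0.5 W sin α exactly as given in the text; the function names are illustrative.

```python
import math

C = 2.998e8  # speed of light in m/s

def coherence_requirements(width_m, grooves_per_mm, wavelength_m):
    """Path difference at the grating edge, required coherence length and line width."""
    a = 1e-3 / grooves_per_mm                # grating spacing in metres
    sin_alpha = wavelength_m / (2 * a)       # symmetric arrangement, eq. (3.4) with beta = 0
    delta = 0.5 * width_m * sin_alpha        # path difference as given in the text
    coherence_length = 10 * delta            # contrast loss below 5 %
    linewidth = math.pi * C / coherence_length   # assumed relation L = pi * c / dnu
    return delta, coherence_length, linewidth

def frequency_stability(width_m, spacing_m, wavelength_m, max_shift=0.05):
    """Relative and absolute frequency drift that shifts the outermost fringe by max_shift * a."""
    half_width = width_m / 2
    relative = max_shift * spacing_m / half_width
    return relative, relative * C / wavelength_m

def axial_mode_spacing(cavity_length_m):
    """Frequency interval c / (2 R_L) between two axial modes."""
    return C / (2 * cavity_length_m)

delta, L, dnu = coherence_requirements(0.4, 2000, 457.9e-9)
print(f"delta = {delta * 1e3:.0f} mm, L = {L * 1e3:.0f} mm, line width = {dnu:.2e} Hz")
rel, drift = frequency_stability(0.4, 0.5e-6, 457.9e-9)
print(f"dnu/nu = {rel:.3g}, dnu = {drift / 1e6:.0f} MHz")
print(f"axial mode spacing = {axial_mode_spacing(1.45):.2e} Hz")
```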


To fulfill the four requirements mentioned in § 2 it is necessary to use a low noise recording material with which it is possible to obtain well defined profiles of the individual grooves and to preserve the high accuracy of an interference fringe system. Such recording materials can be found in the family of photoresists (cf. e.g. CLARK [1973]). Normal photographic emulsions of silver halides in gelatine and other light sensitive layers based on gelatine are useless for this purpose because it is not possible to obtain sufficient dimensional stability in a gelatine layer. Photoresist layers are widely used in microcircuit technology. There are two types of photoresists, namely positive- and negative-working resists. In positive resists the exposed parts show an enhanced solubility in appropriate developing agents relative to unexposed parts. The contrary is true for negative resists. Photoresist layers can be deposited onto optical substrates in optical quality of nearly any desired thickness. For thin layers, i.e. layers with a thickness of less than about one micron, well known spinning techniques can be employed. An important point is that for positive-working photoresists, for example, the changes which occur when the resist is exposed take place in the molecular structure so that no grain effects are observed and the resolution is extremely high.
Fig. 9. Characteristic curve for the positive working photoresist Shipley AZ 1350 (abscissa: energy density E [mJ/cm²], 0 to 350).

Normally photoresists have their peak sensitivity in the ultraviolet. Consequently, to make gratings with photoresist and laser light a high energy density in the ultraviolet or blue wavelength region is necessary. For large gratings it follows that lasers with rather high powers are necessary. In Fig. 9 a characteristic curve for the positive working photoresist Shipley AZ 1350 is shown. In this figure the depth of removed resist is plotted against the energy density E of the exposure with λ = 457.9 nm for a development time of 15 seconds, using concentrated AZ 1350 developer. This curve has been obtained with structures of low spatial frequencies.
In this case the depth d can approximately be written as $d = c_1 (I \cdot t)^2 + c_2$ with $E = I \cdot t$, where E is the energy density and t is the exposure time. $c_1$ and $c_2$ depend on the way the resist was treated before the exposure, e.g. drying and aging, and on the development conditions, i.e. concentration of the developer and development time. Fig. 9 shows that to obtain a depth of 0.2 µm, for example, one needs an energy density of 2 × 10⁶ erg/cm².
In comparison to that the fine grain holographic emulsions Agfa Scientia
8 E 56 and 8 E 75 only need about 2 x 10' erg/cm2 for a good hologram.
Deeper modulation with the same energy density than shown in Fig. 9
can be obtained with longer development times or stronger developers.
For example d can be doubled by increasing the development time by a
factor of four. By use of the developer AZ 303 one can even obtain a gain
in sensitivity by a factor of about five. Making holographic gratings with
high line densities, however, normally rather thin layers are used, the
adhesion of which can be critical when using too strong a development

§ 4. Production of Holographic Gratings

The following lines of an Ar⁺ laser are suitable for the production of holographic gratings: 351.1 nm, 363.8 nm, 457.9 nm, 488.0 nm and 514.5 nm (preferably in combination with second harmonic generation). The lines 350.7 nm and 356.4 nm of a Kr⁺ laser and the lines 325.0 nm and 441.6 nm of helium-cadmium lasers are also suitable. Widely used in combination with positive working photoresist such as AZ 1350 are the UV lines of Ar⁺ and Kr⁺ lasers and the line 457.9 nm. With commercially available CW lasers one can now obtain powers of 0.1 up to 1 watt in these lines. The very strong line 488.0 nm, which can be obtained with a power of several watts, is especially suitable for use in combination with photopolymers sensitised for the blue wavelength region, e.g. Kodak Ortho resist. With the above mentioned powers typical exposure times for gratings of a width of about W = 200 mm made in AZ 1350 are of the order of a few minutes. Consequently the stability requirements concerning temperature and vibration are higher than for normal holographic experiments but less drastic than for ruling engines. As an example the conditions in the Optical Laboratory of the Göttingen Observatory are briefly discussed: The optical bench is set up on a combination of springs and damping elements and is located in a well isolated chamber. It was verified by optical methods that the amplitudes of the residual vibrations of the interference system were less than 20 nm. The temperature stabilisation of the interference chamber is performed by heat exchangers, using temperature controlled brine, resulting in a temperature stability 0.01 < ΔT < 0.1 K. To avoid thermal and mechanical disturbances in the interference chamber it was necessary to set up the laser outside the chamber and to stabilise the environment to about ±1 K so as to obtain the necessary frequency stability of the Ar⁺ and Kr⁺ lasers.
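
The statement that exposures take a few minutes can be made plausible with a rough estimate. The Python sketch below combines the energy density of 2 × 10⁶ erg/cm² (200 mJ/cm²) quoted for AZ 1350 with a laser power in the stated range. It assumes, purely for illustration, a uniformly illuminated circular area of diameter W and that all of the laser power reaches the resist; losses in the optics would lengthen the exposure accordingly.

```python
import math

ERG_PER_MILLIJOULE = 1.0e4  # 1 mJ = 1e-3 J = 1e4 erg

def exposure_time_s(energy_density_mj_cm2, diameter_mm, power_w, efficiency=1.0):
    """Exposure time for a uniformly illuminated circular area (illustrative assumption)."""
    area_cm2 = math.pi * (diameter_mm / 20.0) ** 2        # radius converted from mm to cm
    energy_j = energy_density_mj_cm2 * 1e-3 * area_cm2    # total energy needed in joules
    return energy_j / (power_w * efficiency)

density = 2e6 / ERG_PER_MILLIJOULE   # 2 x 10^6 erg/cm^2 expressed in mJ/cm^2
for power in (0.1, 0.5, 1.0):
    t = exposure_time_s(density, 200.0, power)
    print(f"P = {power:.1f} W  ->  exposure of about {t / 60.0:.1f} min")
```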


The normal process to make holographic gratings with optical arrangements according to Fig. 5 yields transmission gratings in photoresist on glass blanks with symmetrical groove profiles. By coating them with thin metal layers, e.g. aluminium, these gratings can be converted into reflection gratings.
With an appropriate thickness of the resist layers and a sinusoidal intensity distribution of the interference fringes one normally obtains, after development, a sinusoidal groove profile, as shown in Fig. 10. One possible explanation for the formation of nearly ideal sinusoidal groove profiles, despite the non-linear characteristic curve shown in Fig. 9, may be a non-isotropic solution process of the exposed photoresist for structures of high spatial frequency.

Fig. 10. Scanning electron micrograph of a grating with sinusoidal groove profiles.

Using resist layers with a thickness small compared with the grating
spacing one can obtain square wave (laminar) profiles for the production
of phase gratings. Coating with a metal layer, e.g. chromium, and sub-
sequent stripping yields gratings with square wave profiles in metal on
glass. Other methods, e.g. etching processes, can also be used. Fig. 11
shows a microphotograph of such a grating in metal on glass. After coating
with a metal layer with a high atomic number, e.g. gold, such gratings
with groove densities of three hundred to several thousands per millimeter
are well suited as gratings for grazing incidence for the soft X-ray region.
Fig. 12 shows a raster scan photograph of the groove profiles of such a
grating with a grating spacing a = 1.67 µm and a groove height of about
10 nm. Advantages of such gratings without any organic material are
their resistance against thermal and mechanical stresses. This is especially
important when using gratings with high power radiation.
Gratings in metal on glass with low grating spacings can be used as
scales, especially if made with a high ruling accuracy, using the method
described in section 3.1.3. The good uniformity of holographically made
scales (a) in comparison to classically ruled scales (b) is demonstrated in
Fig. 13. The transmission was measured using a microphotometer.

Fig. 13. Comparison of the uniformity of a holographically and a classically made scale (transmission in %, a = 8 µm).


In the spectral range from the infrared to the ultraviolet holographic gratings with sinusoidal groove profiles have high efficiency values only if the grating spacing is comparable to the wavelengths used, as will be shown in section 5.3. For gratings with spacings large compared with the wavelength, high efficiency values can only be obtained by using sawtooth groove profiles. There are several methods of producing holographic gratings with such profiles. The first method is to produce gratings with a triangular profile by inclining the grating blank to the direction of the interference fringes (SHERIDON [1968]), as shown in Fig. 14.

Fig. 14. Method to make holographic gratings with sawtooth groove profiles. Refraction in the resist layer and the blank has not been taken into account.

The grating spacing is given by
$$a = \frac{\lambda}{2\sin\alpha\cos\rho} = \frac{\lambda'}{2\sin\alpha'\cos\rho'}. \tag{4.1}$$
The blaze wavelength $\lambda_B$ is given by the distance between the interference maxima and minima:

$$\lambda_B = 2h = \frac{\lambda'}{\sin\alpha'} = \frac{\lambda}{n\sin\alpha'}. \tag{4.2}$$
The primed symbols are used for the beams in a medium with the refractive index n, i.e. glass and photoresist. The grating spacing and the blaze wavelength can be changed easily by changing the angle of incidence ρ, the laser wavelength λ and the intersection angle of the interfering beams S1 and S2. If the refractive index of the media through which the interfering beams pass were constant, a change in the inclination angle ρ only, without changing λ and α, would change the grating spacing and not the blaze wavelength $\lambda_B$. Because of the different refractive indices of air, glass and photoresist it is necessary, in practice, to change at least two parameters if one wishes to change the grating spacing without changing the blaze wavelength or vice versa.
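
As a numerical illustration of this dependence, the following Python sketch evaluates the two relations under the reading a = λ/(2 sin α cos ρ) for eq. (4.1) and λ_B = λ/(n sin α') for eq. (4.2). The refractive index n = 1.6 and the angles are hypothetical values chosen only for the example, and the refraction that links the angles in air to those inside the resist is not modelled.

```python
import math

def grating_spacing(wavelength, alpha_deg, rho_deg):
    """Grating spacing along the blank surface, read from eq. (4.1) in air."""
    return wavelength / (2.0 * math.sin(math.radians(alpha_deg)) * math.cos(math.radians(rho_deg)))

def blaze_wavelength(wavelength, n, alpha_prime_deg):
    """Blaze wavelength 2h, read from eq. (4.2), with alpha' measured inside the medium."""
    return wavelength / (n * math.sin(math.radians(alpha_prime_deg)))

wavelength_nm = 457.9
n_resist = 1.6            # hypothetical refractive index, for illustration only
for rho in (50.0, 60.0, 70.0):
    a = grating_spacing(wavelength_nm, 10.0, rho)
    blaze = blaze_wavelength(wavelength_nm, n_resist, 60.0)
    print(f"rho = {rho:4.1f} deg  a = {a:7.1f} nm  blaze wavelength = {blaze:6.1f} nm")
```

With α' held fixed, the inclination changes only the spacing, which corresponds to the idealised constant-index situation described in the text.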
The method that we just discussed has been successfully applied (HUTLEY [1974], NAGATA and KISHI [1974]). Fig. 15 shows an asymmetric groove profile of a grating with 800 grooves/mm, measured with a scanning electron microscope.

Fig. 15. Scanning electron micrograph of a grating with sawtooth groove profiles, made by the method illustrated in Fig. 14. (Reproduced from HUTLEY [1974a] Fig. 2.)

The method has two disadvantages. Firstly it is necessary that one of the interfering beams passes through the blank from behind. This requires blanks with a good optical homogeneity to prevent aberrations of the wavefront. Secondly it is impossible to make concave gratings with a large f-ratio with this method.
Another method for making asymmetric groove profiles is to use a
Fourier synthesis method (SCHMAHL and RUDOLPH [1974], SCHMAHL
[1975]). Sawtooth intensity profiles can be made by superposing n fringe
systems of the form

The first two terms with an appropriate phase relation already yield a good approximation to an ideal sawtooth profile. In practice it is, however, very difficult to superpose two fringe systems with the required accuracy when using different optical arrangements for the different fringe systems. For a 100 mm grating with, e.g., 600 grooves/mm the accuracy of the adjustment of the mirrors has to be better than 0.1 second of arc. To overcome this difficulty one can use the following method. With a symmetric arrangement, i.e. β = 0 in Fig. 2, the number of grooves per mm, 1/a, is proportional to sin α. To double the line density one has, therefore, to double sin α. This can be done exactly according to Fig. 16. A grating g with a grating spacing 2a is illuminated with a monochromatic parallel wavefront at normal incidence. The +1 and -1 orders diffracted respectively by the parts 1 and 1 of the grating g form a fringe system at b, with the spacing a, whereas the +2 and -2 orders diffracted respectively by the parts 2 and 2 form a fringe system with the spacing a/2. The proper phase relation between the two fringe systems can be obtained, for example, by using a plane-parallel plate p in a part of the parallel wavefront or by evaporating a step onto the section 2 of the grating g. The method also works by replacing the grating g of Fig. 16 by two identical gratings $g_1$ and $g_2$ made either on one blank polished to optical tolerances or on two separate blanks. The latter arrangement has the advantage that the gratings $g_1$ and $g_2$ have to be only about twice as large as the grating b. The asymmetric profiles shown in Fig. 17 were made by successive exposure of a photoresist layer arranged at b with the two fringe systems according to Fig. 16. For the profiles shown, the measured ratio of the intensity in the first positive to that in the first negative order was 17 at the blaze wavelength. One advantage of this method is that it is possible to produce sawtooth profiles not only on flat but also on curved blanks, e.g. it is possible to produce concave gratings with sawtooth profiles and uniform blaze over the whole area.

Fig. 16. Arrangement for making holographic gratings with sawtooth groove profiles (sin α₂ = 2 sin α₁).

Fig. 17. Scanning electron micrograph of a grating with sawtooth groove profiles made with the arrangement illustrated in Fig. 16, a = 1.67 µm.
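
The effect of adding the second fringe system can be illustrated with a small Python sketch. It assumes the textbook Fourier series of a sawtooth, i.e. sine terms with amplitudes 1/k, which need not be the exact form used by the authors, and it measures the asymmetry simply as the fraction of the period over which the summed profile is rising. The second function checks the geometry of Fig. 16: the ±m orders of a grating with period 2a, illuminated at normal incidence, interfere with the fringe spacing a/m.

```python
import math

def fringe_sum(x, period, n_terms=2):
    """Sum of the first n_terms sine harmonics with amplitudes 1/k (assumed form)."""
    return sum(math.sin(2 * math.pi * k * x / period) / k for k in range(1, n_terms + 1))

def rising_fraction(period=1.0, n_terms=2, samples=2000):
    """Fraction of one period over which the summed profile is increasing."""
    xs = [period * i / samples for i in range(samples + 1)]
    ys = [fringe_sum(x, period, n_terms) for x in xs]
    rising = sum(1 for y0, y1 in zip(ys, ys[1:]) if y1 > y0)
    return rising / samples

def fringe_spacing_from_orders(order, grating_period, wavelength):
    """Spacing of the fringes formed by the +order and -order beams of one grating."""
    sin_theta = order * wavelength / grating_period   # grating equation at normal incidence
    return wavelength / (2 * sin_theta)

print("one term :", rising_fraction(n_terms=1))   # symmetric sinusoid
print("two terms:", rising_fraction(n_terms=2))   # asymmetric, sawtooth-like profile
a = 1.67  # um, the spacing quoted for Fig. 17
print(fringe_spacing_from_orders(1, 2 * a, 0.4579), fringe_spacing_from_orders(2, 2 * a, 0.4579))
```

With a single term the profile rises over half of the period; with two terms it rises over about one third, so one flank is already roughly twice as long as the other.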
Finally it should be mentioned that it is possible to make asymmetric
groove profiles by copying holographic gratings using visible light, UV or
X-rays and by using ion etching processes.

§ 5. Properties of Holographic Gratings and Comparison with Classical Gratings


Since the first holographic gratings for serious spectroscopic use were made in 1967 in the Optical Laboratory of the Observatory of the University of Göttingen such gratings have been competitive with classically ruled gratings from the visible to the ultraviolet and soft X-ray regions. Today it is possible to realise large gratings with a width of more than 600 mm, to reach line densities of more than 10000 lines per millimeter, and to make gratings with a very low amount of straylight and completely free of ghosts. It is possible to obtain good efficiency values for gratings with symmetrical groove profiles and high line densities. In addition it is possible to attain high efficiency values independent of the line density by making asymmetric groove profiles holographically. Holographic gratings can be formed independently of substrate curvature and, in principle, with any desired surface variation of grating frequency. Furthermore replicas of holographic gratings can also be made.
One method of obtaining information about the optical quality of gratings
is to examine wavefront interferograms. The spacing between two maxima
or two minima in an interferogram is called one fringe and corresponds
to a wavefront aberration $\Delta(\varphi)$ of one wavelength of the light
used to make the interferogram. According to eq. (3.5) a wavefront aberration
$\Delta(\varphi)$ of p fringes of a grating measured in autocollimation
corresponds to a ruling error $\Delta x = p\lambda/(2\sin\varphi)$. Normally
wavefront interferograms are made with large Michelson interferometers.
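
As a rough numerical illustration of the fringe-to-ruling-error conversion,
the following Python sketch evaluates the relation quoted above, read here as
Δx = pλ/(2 sin φ) with φ the autocollimation angle. This reading of the
formula is an assumption, and the helper name ruling_error_nm is purely
illustrative. The input values are those of the grating discussed later in
this section (1500 grooves/mm, first order, λ = 457.9 nm, aberration below
one tenth of a fringe).

```python
import math


def ruling_error_nm(fringes, wavelength_nm, grooves_per_mm, order=1):
    """Ruling error (nm) equivalent to a wavefront aberration of `fringes` fringes.

    Assumes autocollimation, where 2 a sin(phi) = m * lambda fixes the angle phi.
    """
    spacing_nm = 1e6 / grooves_per_mm                      # grating spacing a
    sin_phi = order * wavelength_nm / (2.0 * spacing_nm)   # autocollimation angle
    if not 0.0 < sin_phi <= 1.0:
        raise ValueError("this order does not propagate at this wavelength")
    return fringes * wavelength_nm / (2.0 * sin_phi)


if __name__ == "__main__":
    dx = ruling_error_nm(fringes=0.1, wavelength_nm=457.9, grooves_per_mm=1500)
    print(f"ruling error for 1/10 fringe: {dx:.1f} nm")
```

In autocollimation the expression reduces to Δx = p·a/m, so under these
assumptions one tenth of a fringe in the first order corresponds to a groove
displacement of one tenth of the grating spacing.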

Fig. 18. Grating interferometer for producing wavefront interferograms.

To avoid large beam splitters - necessary in a Michelson interferometer -
we used a grating interferometer as shown in Fig. 18 (SCHMAHL [1973]).
A parallel beam S1 of laser light is produced by using the lens O and the
parabolic mirror P. The grating G1 is arranged so that the first order S2
is diffracted in the opposite direction of propagation of S1 (first arm of
the interferometer). The zero order of the grating G1, the beam S3 (second
arm of the interferometer), illuminates the grating G2, which can be tested
in the zero, first and higher orders in autocollimation. The interfering
beams S2 and S4, collimated by the mirror P and reflected by the beam
splitter TP, produce the interferogram, which can be recorded in the plane
E. The light used has to have a coherence length of more than twice the
distance between the gratings G1 and G2 and, if possible, less than four
times this distance to avoid multiple-beam interferences. The optical
quality of the grating interferometer can be tested by use of a plane mirror
of good optical quality instead of grating G2. Fig. 19 shows, as an example,
a wavefront interferogram from a holographic reflection grating made in
our laboratory with 1500 grooves/mm and a width of 100 mm, examined
in the first order with λ = 457.9 nm. The interferogram shows that the
wavefront aberrations are smaller than λ/10. From the wavefront
interferograms one can deduce that the instrumental profiles are symmetrical
and that the spectral resolution reaches the theoretical values.

Fig. 19. Wavefront interferogram of a grating with 1500 grooves/mm, made with the
arrangement illustrated in Fig. 18.

Fig. 20a and Fig. 20b show photoelectric recordings of hyperfine structure
of the mercury lines 435.8 nm and 546.1 nm in the second order in
autocollimation, made with a grating having ruled width 180 mm and
1465 grooves/mm, made in the Göttingen Laboratory. The measurements were
made with an 8-meter spectrograph in the Solar Tower of the Göttingen
Observatory. The theoretical resolving power of the grating is 527000 in
the second order. The measurements were made with 94% of the grating surface
- because of vignetting by the autocollimation lens of the spectrometer -
and with an entrance slit width of $W_{\mathrm{En}}$ = 20 µm, an exit slit
width of $W_{\mathrm{Ex}}$ = 10 µm and an uncooled gas discharge lamp,
corresponding to $W_{\mathrm{L}} \approx 0.4$ pm.
With $W_{\mathrm{obs}}^2 = W_{\mathrm{En}}^2 + W_{\mathrm{Ex}}^2 + W_{\mathrm{L}}^2 + W_{\mathrm{th}}^2$
one obtains for the observed width of a single isotope component at half
maximum $W_{\mathrm{obs}} = 1.2 \pm 0.1$ pm for the wavelength 435.8 nm.
This value is in full accordance with the measured widths of the single
components of Fig. 20a and means that the grating has full theoretical
resolving power.

Fig. 20a. Hyperfine structure of the mercury line 435.8 nm (m = 2,
180 × 130 mm², 1465 l/mm).

Fig. 20b. Hyperfine structure of the mercury line 546.1 nm (m = 2,
180 × 130 mm², 1465 l/mm).
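
The width budget for the 435.8 nm line can be checked with a short Python
sketch. It rests on several assumptions that are not spelled out in the text:
the "8-meter spectrograph" is taken to mean a focal length of 8 m, the slit
widths are converted to spectral widths with the autocollimation dispersion
dλ/dx = a cos φ/(m f), the spectral widths are taken in picometres, and the
grating contribution is taken as λ divided by the resolving power of the
illuminated 94% of the ruled width. All function names are illustrative.

```python
import math


def resolving_power(order, grooves_per_mm, ruled_width_mm):
    """Theoretical resolving power R = m * N (order times number of grooves)."""
    return order * grooves_per_mm * ruled_width_mm


def littrow_dispersion_pm_per_um(order, grooves_per_mm, wavelength_nm, focal_mm):
    """Reciprocal linear dispersion in autocollimation, in pm per micrometre.

    Uses d(lambda)/dx = a cos(phi) / (m f) with sin(phi) = m lambda / (2 a).
    With a in nm and f in mm the result is in nm/mm, which equals pm/um.
    """
    spacing_nm = 1e6 / grooves_per_mm
    sin_phi = order * wavelength_nm / (2.0 * spacing_nm)
    cos_phi = math.sqrt(1.0 - sin_phi ** 2)
    return spacing_nm * cos_phi / (order * focal_mm)


def expected_width_pm(order, grooves_per_mm, ruled_width_mm, wavelength_nm,
                      focal_mm, entrance_um, exit_um, lamp_pm, used_fraction):
    """Quadratic sum of slit, lamp and grating contributions (all in pm)."""
    disp = littrow_dispersion_pm_per_um(order, grooves_per_mm,
                                        wavelength_nm, focal_mm)
    r_used = used_fraction * resolving_power(order, grooves_per_mm,
                                             ruled_width_mm)
    grating_pm = 1e3 * wavelength_nm / r_used   # nm -> pm
    parts = (entrance_um * disp, exit_um * disp, lamp_pm, grating_pm)
    return math.sqrt(sum(p * p for p in parts))


if __name__ == "__main__":
    print("R =", resolving_power(2, 1465, 180))
    width = expected_width_pm(2, 1465, 180, 435.8, 8000.0, 20.0, 10.0, 0.4, 0.94)
    print(f"expected width: {width:.2f} pm")
```

Under these assumptions the quadratic sum lands within the quoted
1.2 ± 0.1 pm, with the grating term the largest single contribution.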


In a grating spectrograph or spectrometer the light which is observed in
between the orders and which is in excess of that due to Fraunhofer
diffraction of an ideal grating can occur as ghosts, grass and diffuse
scattered light. Ghosts and grass are typical attributes of classically ruled
gratings and are caused respectively by periodic and random errors in the
positions or depths of the grooves. Much work has been done to describe
ghosts and grass quantitatively. The better way, however, is to avoid such
impurities completely, which is indeed the case for holographic gratings.
The only impurity that occurs with holographic gratings is diffuse scattered
light, caused by the roughness of the grating surface.
In practical cases it is necessary to know to what extent spectral measure-
ments are influenced by scattered light. For this purpose one should know
the intensity of scattered light as a fraction of the intensity of an emission
line in a certain order. Unfortunately the scattered light arising from a
grating and measured in a grating spectrograph or spectrometer does not
only depend on the quality of the grating but also on geometric parameters
such as widths and heights of entrance and exit slits, area of the grating and
linear dispersion.

The best way to measure the scattered light is to test a grating in the
same arrangement as used for spectral measurements (HUTLEY [1973]).
Nevertheless, to give an idea of the amount of scattered light from a high
quality holographic grating we compared the scattered light from a grating
with that scattered from a mirror by measuring both under the same condi-
tions in a spectrometer.
Fig. 21 shows the results of our measurements and demonstrates that
the amount of light scattered from our gratings is comparable to that
scattered from good mirrors and is not caused by ruling errors. These
results are in good agreement with measurements of other authors (HUTLEY
[1974], HUNTER [1975], PIEUCHARD and FLAMAND [1975]). The low level
of scattered light from holographic gratings is especially important in the
cases of Raman spectroscopy and high resolution stellar absorption
spectroscopy (SCHMAHL and RUDOLPH [1972]). Fig. 22 shows a photoelectric
record of the solar spectrum near the sodium resonance lines, taken
in the middle of the solar disc. The Na D1 line shows a residual intensity
of only 6 % without any rectification. This test was made with a spectral
range of about 200 nm entering the spectrometer. Up to now comparably
low residual intensities of these lines could be obtained with conventional
gratings only with a strongly reduced spectral range and/or double pass.

Fig. 21. Comparison of scattered light of a holographic grating and of a mirror, both coated
with aluminium.

Fig. 22. Solar spectrum near the sodium resonance lines (wavelength axis λ [Å], marked at
5890 and 5896).

Fig. 23 shows a comparison of a conventional and a holographic concave
grating in grazing incidence. Both had a radius of curvature of about 2
meters. The conventional grating had 1200 grooves/mm and the holographic
grating had 1800 grooves/mm. Although one has to bear in mind that an
exact comparison would require equal grating spacings for both gratings,
the results indicate that the signal-to-noise ratio is much better in the case
of the holographic grating (HUNTER [1975]).

Fig. 23. Comparison of a conventional and a holographic concave grating. (Reproduced
from HUNTER [1975] Fig. 16.)


One of the most important properties of diffraction gratings is the
efficiency. The absolute efficiency is defined as the ratio of the diffracted
flux in a given order to the incident monochromatic flux. Often the term
relative efficiency or groove efficiency is used, that is, the absolute
efficiency divided by the reflectivity of the surface layer, e.g. aluminium.
It will be shown in this section that holographic gratings have efficiency
values comparable to those of classically ruled gratings. In combination
with the good straylight properties that we just discussed, holographic
gratings therefore show signal-to-noise ratios which are normally much
better than those of classical gratings. The properties of gratings which
determine the efficiency in a given spectral region are groove profile and
line density. If one wants to calculate the efficiency of gratings one has
to solve Maxwell's equations for the diffracted electromagnetic fields with
the grating surface as boundary. The solution of this problem is rather
difficult and many authors have worked in this field. In recent years good
agreement between theoretical and experimental results has been obtained
(PETIT [1966], LOEWEN, MAYSTRE, MCPHEDRAN and WILSON [1975]).
A detailed discussion of this work relating to holographic gratings is
beyond the scope of the present article.
As discussed in section 4.1 two-beam interference arrangements normally
yield gratings with symmetrical groove profiles of sinusoidal shape. With
such profiles high efficiency values can be obtained if the ratio of the
wavelength λ to the grating spacing a is in the region 0.7 ≤ λ/a ≤ 1.5.
Especially in this region, where gratings normally are used in the first
diffraction order - e.g. in the visible with groove densities between 1200
grooves/mm and 2400 grooves/mm - experimental and theoretical results
show that the efficiency of holographic gratings with symmetrical groove
profiles is comparable to the efficiency of classically ruled gratings with
sawtooth profiles. Fig. 24 shows a comparison between gratings with 1800
grooves/mm, measured in the first order. The efficiency in this region
depends strongly on the polarisation properties of the incident beam.

Fig. 24a, b. Comparison of the efficiency values of a classical (a) and a holographic (b) grating
with sinusoidal groove profiles. Both gratings have 1800 grooves/mm. (Reproduced from
FLAMAND [1975] Fig. 3.)

Fig. 24c. Absolute efficiency of a holographic grating with sinusoidal groove profiles and 1800
grooves/mm. (Reproduced from HUTLEY [1974b] Fig. 6.)

Fig. 25. Efficiency values of holographic gratings with sinusoidal groove profiles and with
1260, 1580, 2090 and 3600 grooves/mm. (Axes: relative efficiency [%] versus λ [nm].)

Therefore one has to avoid regions which show strong anomalies, i.e.,
rapid variations of efficiency over a comparatively short range of
wavelength. Fig. 25a to Fig. 25c show the efficiency in the visible region of
three holographic gratings with 1260, 1580 and 2090 grooves/mm and
symmetrical groove profiles, measured in an arrangement near autocollimation
in the first order (MIKELSKIS [1973]). Fig. 25d shows the efficiency of a
grating with symmetrical groove profiles and 3600 grooves/mm, measured
in the ultraviolet in the first order. The curves demonstrate that the peak
efficiency values occur in the range 0.8 ≤ λ/a ≤ 1, which means that the
peak efficiency shifts to shorter wavelengths with increasing line densities.
Measurements of the efficiency as a function of the angle δ between incident
and diffracted beam show that the peak efficiency shifts to longer
wavelengths with increasing δ relative to measurements in autocollimation
(δ = 0). For example, the peak efficiency is shifted by about 80 nm for
δ = 34° for a grating with 1560 grooves/mm with respect to the position
for peak efficiency measured for δ = 0. These results are in contradiction
to calculations of the peak efficiency with increasing δ made under the
assumption that the blaze of classical gratings with sawtooth profiles
results from the fact that the facets of the grooves act as small mirrors which
reflect the light in the same direction as the grating is sending it by
diffraction.
Fig. 25 shows that when λ/a < 0.8 strong anomalies occur, in this case
for gratings used in the first order in autocollimation. Such anomalies are
characteristic of all gratings, irrespective of whether they are made
mechanically or holographically. The anomalies were first mentioned by
Wood. A general theoretical treatment shows that the Wood anomalies
are actually of two distinct types, a resonance type and a form first mentioned
by Rayleigh, which appears at wavelengths due to the emergence or reentry
of another spectral order at the grating surface (HESSEL and OLINER [1965]).
In special cases these two types are merged together. Anomalies are
connected with plasma waves in the electron gas in the metal coating of
the gratings. Such plasma oscillations are known as surface plasmons
(TENG and STERN [1967], HUTLEY [1973]). Anomalies are well suited to test
electromagnetic grating theories, which should reproduce the observed
anomalies with regard to strength and location in the spectrum.
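
The location of the Rayleigh-type anomalies follows from the grating equation
alone, which a short Python sketch can make concrete. It assumes the grating
equation in the form sin α + sin β_n = nλ/a and autocollimation in order m
(sin α = mλ/(2a)); an order n passes off at grazing emergence when
sin β_n = ±1, which gives λ/a = 1/|n − m/2|. The function name and the range
of orders scanned are illustrative choices, and resonance-type anomalies are
not located by this relation.

```python
def rayleigh_ratios(order_used=1, max_order=3):
    """Values of lambda/a at which order n emerges at grazing angle.

    Assumes autocollimation in `order_used`; returns {n: lambda/a}.
    """
    ratios = {}
    for n in range(-max_order, max_order + 1):
        denom = n - order_used / 2.0
        if denom == 0.0:
            # This order leaves along the grating normal and never grazes.
            continue
        ratios[n] = 1.0 / abs(denom)
    return ratios


if __name__ == "__main__":
    spacing_nm = 1e6 / 1580.0   # one of the groove densities of Fig. 25
    for n, ratio in sorted(rayleigh_ratios(1, 3).items()):
        print(f"order {n:+d}: lambda/a = {ratio:.3f}, "
              f"lambda = {ratio * spacing_nm:.0f} nm")
```

For first-order autocollimation this places the longest-wavelength Rayleigh
condition of a neighbouring order at λ/a = 2/3, which is at least consistent
with the observation that strong anomalies appear for λ/a < 0.8.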
When λ/a < 0.8 one can obtain high efficiency values only with asymmetric
sawtooth groove profiles. As shown in section 4.2 such profiles can
be made holographically. Efficiency values of various gratings with
sawtooth profiles made by HUTLEY [1974] according to the method of SHERIDON
are given in Fig. 26 and Fig. 27. Fig. 26 shows the efficiency of different
gratings with 600 grooves/mm to 1570 grooves/mm, measured in the first
order in the vacuum ultraviolet. Fig. 27 shows the distribution of light
among the various diffracted orders of grating A in Fig. 26. The results
given demonstrate that holographic gratings can fully compete with
classically ruled gratings concerning the efficiency. This is not only true for
gratings with symmetrical sinusoidal groove profiles in the range
0.8 < λ/a < 1.5, but also for holographic gratings with asymmetric
sawtooth profiles.

Fig. 26. Efficiency values of holographic gratings with sawtooth groove profiles. A) 1200
grooves/mm, B) 1570 grooves/mm, C) 600 grooves/mm, D) 800 grooves/mm. (Reproduced
from HUTLEY [1974a] Fig. 5. Horizontal axis: wavelength (nm).)
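
The λ/a criteria used in this section translate directly into wavelength
windows for a given groove density. The Python sketch below does this
conversion for the groove densities of Fig. 25; the function name is
illustrative, and the windows are simply the stated λ/a limits multiplied by
the grating spacing, not a prediction of measured efficiency curves.

```python
def wavelength_window_nm(grooves_per_mm, ratio_min, ratio_max):
    """Wavelength interval (nm) for which ratio_min <= lambda/a <= ratio_max."""
    spacing_nm = 1e6 / grooves_per_mm   # grating spacing a in nm
    return ratio_min * spacing_nm, ratio_max * spacing_nm


if __name__ == "__main__":
    for density in (1260, 1580, 2090, 3600):
        low, high = wavelength_window_nm(density, 0.8, 1.0)
        print(f"{density} grooves/mm: peak efficiency expected between "
              f"{low:.0f} and {high:.0f} nm")
```

For 3600 grooves/mm the window falls in the ultraviolet, in line with the
statement that the peak efficiency moves to shorter wavelengths as the line
density increases.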


As discussed in section 4.1 gratings for the soft X-ray region can be made
holographically in metal on glass form without any remaining organic
material. It has been found possible in this way to control groove profile
accurately and so produce square wave (laminar) grooves with heights of
Fig. 27. Light distribution among the various diffracted orders of grating A) of Fig. 26.
Relative efficiency (full line) and absolute efficiency. (Reproduced from HUTLEY [1974a]
Fig. 7.)

the grooves, showing phase cancellation in the zero order (RUDOLPH,
SCHMAHL, JOHNSON and SPEER [1973]). Fig. 28 shows the efficiency of
such a grating with 294 grooves/mm and with groove heights of 22 nm,
measured at 4.5 nm. The efficiency is greater than that of most mechanically
ruled gratings used in the soft X-ray region (SPEER and RUDOLPH [1974]).
The holographic process offers the possibility of overcoming the limitations
of classically ruled gratings used at grazing incidence. The limitations
arise from the fact that such gratings are almost universally ruled on
spherical blanks, for use in Rowland circle geometries, with grooves equally
spaced along the chord. Spherical aberration imposes a severe upper limit
on the allowed focal ratio, typically f/80 to f/100 for maximum resolution
in this region. Furthermore, correction of astigmatism is not usually
New design possibilities of holographic gratings derive from the fact
that such gratings can be formed independently of substrate curvature,
and, in principle, with any desired surface variation of grating frequency.

Fig. 28. Absolute efficiency of a 2 meter radius concave grating with 294 grooves/mm at 4.5 nm
in the ±1 orders (right hand scale). The zero order (left hand scale) shows modification due to
phase cancellation. (Reproduced from JOHNSON [1975].)

An analysis has shown (HABER [1950]) that primary astigmatism in Rowland
circle mountings can be corrected by a suitable choice of toroidal
grating blank radii. This solution has been realised by making a reflection
grating for grazing incidence on a toroidal blank. Fig. 29 shows such a
toroidal grating with 600 grooves/mm and a ruled area of 8 × 45 mm².
The minor radius is 5.65 mm (SPEER, TURNER, JOHNSON, RUDOLPH and
SCHMAHL [1974]). As a result, the threshold sensitivity for photographic
recording has been improved by a factor of about 35 in comparison
to the classical case. Correction of spherical aberration can be achieved by
suitable surface variation of groove frequency. Using such solutions it is
possible to realise concave gratings with an improved focal ratio.

Fig. 29. Toroidal grating for grazing incidence.


It is well known that it is possible to combine in one optical element the
diffractive properties of a grating with the focussing properties of a spherical
mirror (ROWLAND [1883]). Classical concave gratings are ruled mechanically
with equidistant grooves along the chord and have been discussed in
detail (BEUTLER [1945], NAMIOKA [1961]). To reduce the aberrations,
especially the astigmatism, it has been proposed to rule gratings with
non-uniform groove distribution (ROWLAND [1902], SAKAYANAGI [1967]).
Successful attempts to produce concave gratings of this type mechanically
have been made in recent years (GERASIMOV, YAKOVLEV, PEISAKHSON and
The basic element of a hologram is a zone plate, i.e. a grating with
variable spacing and curved lines which has dispersive and imaging
properties, even if recorded on a plane surface. The imaging properties can be
improved by making an aplanatic system, i.e., one satisfying the Abbe sine
condition, for certain wavelengths. This can be done by use of a zone
plate recorded on a spherical blank (MURTY [1960], MURTY and DAS
In principle it is possible to minimise the aberrations of gratings having
imaging properties in a given wavelength region by an optimal choice of
the following parameters: curvature of the blank and surface variation of
the spacing frequency. This can be done by a suitable choice of the recording
wavelength, the geometry during the recording process, and by use of
suitable wavefronts, e.g. plane waves, spherical waves or appropriate
aspherical waves. A general treatment of concave gratings with variable
spacings and curved lines has been given by NODA, NAMIOKA and SEYA
[1974a]. It is evident that it is possible to make holographic concave
gratings on spherical blanks with equidistant spacings along the chord
and with straight grooves by the use of two recording plane waves as
indicated in Fig. 2, in a symmetrical arrangement (B = 0). These gratings
have imaging properties and aberrations identical to those of classical
concave gratings. The size of gratings produced holographically is limited
only by the size of the optics used and not by any mechanical limitations.
With sinusoidal groove profiles they have the same high efficiency
values in the appropriate λ/a region as holographic plane gratings. With
sawtooth profiles made by Fourier synthesis high efficiency values can also
be obtained in other regions. Unlike most classically ruled concave gratings,
which are often ruled as tri-partite gratings especially for large f-ratios,
holographic concave gratings have an efficiency that is quite uniform over
the entire grating surface. Furthermore, holographic concave gratings have
the same good properties concerning scattered light and are completely
free of ghosts, that is, they have a good signal-to-noise ratio. In Fig. 30
a holographic grating of ruled area 8.6 × 26.8 cm² and a classical concave
grating of ruled area 6.5 × 4 cm² are shown. Both have 2400 grooves/mm
and a radius of curvature of 85 cm. In Fig. 31 and Fig. 32 a comparison
of the gratings shown in Fig. 30 is made (HUNTER [1975]). This
comparison demonstrates that the efficiency values are comparable, but
also that the efficiency of the holographic grating is much more uniform
in spite of the larger aperture. Fig. 33 shows another comparison of a
holographic and a classical grating in the XUV (HUNTER [1975]) and
demonstrates that especially in classically ruled tri-partite gratings large
variations of the efficiency over the grating surface occur, whereas
holographic gratings of the same f-ratios are quite uniform.

Fig. 30. Photograph of a conventional grating of ruled area 6.5 × 4 cm² and a holographic
grating of ruled area 8.6 × 26.8 cm². (Reproduced from HUNTER [1975] Fig. 10.)
In addition to these classical concave gratings special types of
holographic concave gratings with reduced aberrations have been made. The
first proposals and results in this field of special concave gratings were
given by CORDELLE, FLAMAND, PIEUCHARD and LABEYRIE [1969]. Further
investigations and special designs of gratings with imaging properties for
the visible, UV and soft X-ray regions have been made by several groups
(… [1974], NODA, NAMIOKA and SEYA [1974b], NIEMANN, RUDOLPH and
SCHMAHL [1974]).

Fig. 31. Comparisons of the gratings of Fig. 30. a) Efficiency of the holographic grating.
b) Efficiency of the classical grating. Zero (×), positive (Δ), and negative (○) first orders at
15° angle of incidence. The dotted line represents the negative first order groove efficiency.
(Reproduced from HUNTER [1975] Fig. 13 and Fig. 12.)

Fig. 32. Comparison of the gratings of Fig. 30. a) Holographic grating. b) Conventional
grating. Efficiency maps of the zero and the positive and negative first orders at 144 nm.
The angle of incidence is 15°. The two large peaks on either side of the zero order of the
conventional grating are caused by specular reflections from the unruled edges. (Reproduced
from HUNTER [1975] Fig. 14 and Fig. 11.)



Fig. 33a. Efficiency maps of a holographic grating at 121.6 nm in the zero order and positive
and negative first and second orders. 1200 grooves/mm, gold coating, and 1 m radius of
curvature. (Reproduced from HUNTER [1975] Fig. 8.)

We will now discuss some special types of gratings and their imaging
properties.
1. Gratings on a spherical blank recorded with two spherical wavefronts:
By choice of suitable construction parameters it is possible to construct
concave gratings for Rowland circle geometry with highly reduced
aberrations. In particular it is possible to avoid completely astigmatism and
some types of coma and spherical aberration for one particular wavelength. As
a rule of thumb one can say that the astigmatism can be reduced over a large
wavelength region by about one order of magnitude compared to classical
concave gratings in Rowland mounting. Special solutions are discussed, for
example, by FLAMAND [1975] and NODA, NAMIOKA and SEYA [1974b]. Special
properties of concave gratings made on a spherical blank can be obtained
when one of the spherical waves used for construction originates
from the center of curvature of the blank. The basic principle for such
gratings has been treated by MURTY [1960]. Holographic gratings of this
type have been discussed by CORDELLE, FLAMAND, PIEUCHARD and
LABEYRIE [1969]. The principle of this type of grating is shown in Fig. 34. The
grating is made with two spherical waves of wavelength λ0, originating
from O and A'. After processing a concave reflection grating G is obtained.
If one illuminates this grating in the same geometrical arrangement with
polychromatic light originating from A', one can easily see from the
holographic principle that the wavelength λ0 is focussed stigmatically at O.
According to the grating equation the wavelength 2λ0 is focussed
stigmatically at A'. Furthermore, MURTY [1960] has shown that a third point A
is a stigmatic image of A' if OA = mR with OA' = R/m. A circle with the
points O, A' and A with the given conditions is called the circle of Apollonius
and divides AA' harmonically both internally and externally. By use of the
grating equation one can show that under these conditions the wavelength
focussed at A is given by λ = (m+1)λ0 in the first diffracted order. It is
possible to locate the polychromatic source at each of the three points O,
A' and A. Table 4 gives the wavelengths of the resulting stigmatic images.
It is not essential to use waves originating from O when recording the
pattern. Concave gratings with three stigmatic points are also obtained
using waves originating from A' and A, which yields a similar kind of
matrix as shown in Table 4 (FLAMAND [1975]). The line H in Fig. 34 is
the horizontal or sagittal focus. The curved line V is the vertical or
tangential focus, i.e. - at least for small f-ratios of the grating - V is the
focus of the spectral lines. But it must be stressed that, especially for large
apertures, severe aberrations occur between the three stigmatic points. These
aberrations must be calculated for every particular arrangement.

Fig. 34. Principle of holographic concave gratings with non-equidistant spacings and three
stigmatic points.

Table 4. Wavelengths and positions of stigmatic images, according to Fig. 34, for the first
spectral order; λ0 = recording wavelength.

                        Image location
Source location    O            A'               A
O                  λ = 0        λ = λ0           λ = mλ0
A'                 λ = λ0       λ = 2λ0          λ = (m+1)λ0
A                  λ = mλ0      λ = (m+1)λ0      λ = 2mλ0

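
The entries of Table 4 can be reproduced with a few lines of Python. The
sketch below relies on an observation about the table rather than on a
derivation given in the text: each entry appears to be the sum of a weight
attached to the source point and a weight attached to the image point
(0 for O, 1 for A', m for A), multiplied by the recording wavelength λ0. The
function names, and the reading of the table with the points O, A' and A, are
assumptions made for illustration.

```python
def stigmatic_weights(m):
    """Per-point weights that reproduce Table 4 (an empirical reading)."""
    return {"O": 0, "A'": 1, "A": m}


def stigmatic_wavelength(source, image, m, recording_wavelength):
    """Wavelength imaged stigmatically from `source` to `image`, first order."""
    weights = stigmatic_weights(m)
    return (weights[source] + weights[image]) * recording_wavelength


def table4(m, recording_wavelength):
    """Full source/image matrix as a dictionary keyed by (source, image)."""
    points = ("O", "A'", "A")
    return {(s, i): stigmatic_wavelength(s, i, m, recording_wavelength)
            for s in points for i in points}


if __name__ == "__main__":
    # Hypothetical example: m = 3, recording wavelength 457.9 nm.
    for (source, image), wavelength in table4(3, 457.9).items():
        print(f"source {source:2s} -> image {image:2s}: {wavelength:7.1f} nm")
```

The symmetry of the resulting matrix reflects the fact that source and image
points can be interchanged.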
2. The holographic method allows the construction of gratings with
imaging properties on extremely curved substrates. One example has
been discussed in section 5.4.
3. Up to now only examples where the recording waves were either plane or
spherical have been discussed. Another possibility is - as already
mentioned - the use of aspherical waves for the recording process, to
correct aberrations of gratings with imaging properties. A special case of
diffraction gratings are zone plates, i.e., circular gratings with radially
increasing line density. Zone plates with large zone numbers can be realised
holographically by superposition of two spherical waves or one spherical
wave and one plane wave. Such high power zone plates can be used for
imaging and/or spectrometric purposes in the soft X-ray region (X-ray
microscopy, X-ray astronomy, X-ray spectroscopy) (RUDOLPH and
SCHMAHL [1967], SCHMAHL and RUDOLPH [1969]). If such zone plates
are constructed with visible light and are used in the X-ray region, large
spherical aberration occurs. This aberration has been corrected by use
of aspherical recording wavefronts (RUDOLPH [1974], NIEMANN, RUDOLPH
and SCHMAHL [1974]). This method can, in principle, be applied to other
gratings with imaging properties.
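
The origin of the spherical aberration mentioned in item 3 can be illustrated
numerically. The Python sketch below assumes a zone plate recorded by an
on-axis point source at a distance z interfering with an on-axis plane wave,
so that the n-th fringe satisfies sqrt(z² + r²) − z = nλ_rec, and then used
with a plane wave at a much shorter wavelength. It evaluates the path error
at zone n relative to the paraxial focus f = zλ_rec/λ_use. The geometry, the
value of z and all function names are hypothetical choices made for this
illustration; the two wavelengths are values that appear elsewhere in this
article.

```python
import math


def recorded_zone_radius(n, rec_wavelength, source_distance):
    """Radius of the n-th fringe for point source plus plane wave (SI units)."""
    return math.sqrt(2.0 * n * rec_wavelength * source_distance
                     + (n * rec_wavelength) ** 2)


def paraxial_focal_length(rec_wavelength, use_wavelength, source_distance):
    """First-order focal length at the wavelength of use."""
    return source_distance * rec_wavelength / use_wavelength


def path_error(n, rec_wavelength, use_wavelength, source_distance):
    """Path error at zone n relative to the ideal value n * use_wavelength.

    Equals sqrt(f^2 + r_n^2) - f - n * use_wavelength, written in a form
    that avoids subtracting two nearly equal numbers.
    """
    f = paraxial_focal_length(rec_wavelength, use_wavelength, source_distance)
    s = f + n * use_wavelength
    d = n ** 2 * (rec_wavelength ** 2 - use_wavelength ** 2)
    return d / (math.sqrt(s * s + d) + s)


if __name__ == "__main__":
    lam_rec, lam_use, z = 457.9e-9, 4.5e-9, 0.01   # z = 10 mm is hypothetical
    for n in (10, 100, 1000):
        err = path_error(n, lam_rec, lam_use, z) / lam_use
        print(f"zone {n:4d}: path error = {err:.3f} wavelengths")
```

Under these assumptions the error grows roughly as n², so only a limited
number of inner zones contribute constructively unless the recording
wavefronts are made aspherical.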

§ 6. Further Improvements of Holographic Gratings

Up to now only holographic plane and concave gratings with high groove
densities and sinusoidal groove profiles have been widely used. These
gratings have efficiency values comparable to those of classically ruled
gratings but have less scattered light and are completely free of ghosts.
Such gratings have, therefore, an improved signal-to-noise ratio. As
discussed in section 4.2 it is also possible to make gratings with sawtooth
groove profiles holographically. These processes are more complicated
than the process of making gratings with sinusoidal groove profiles.
Holographic gratings with sawtooth profiles will, therefore, be used
widely only when replica gratings in large series are available.
Furthermore, it can be expected that considerable progress will take place
in the field of gratings with imaging properties. The application of holographic
X-ray gratings is just beginning, and it can be expected that it will be
possible to enhance the étendue of spectrometers in the grazing incidence
region. As shown in section 4.1 the holographic methods allow the
production of scales of high accuracy, especially by use of identical wavefronts,
as discussed in section 3.1.3. Although so far not used in practice, this
could be important in the future development of metrology.


The authors wish to thank Mr. W. R. Hunter, U. S. Naval Research
Laboratory, Washington, Dr. M. C. Hutley, National Physical Laboratory,
England and Dr. J. A. Flamand, Société Jobin Yvon, France for sending
us new results. We wish to thank Dr. R. L. Johnson, Imperial College of
Science and Technology, London, for reading the manuscript. We are also
indebted to Mrs. H. Brandt and Mr. R. Spindler for the illustrations and
photographs.

BEUTLER, H. G., 1945, J. Opt. Soc. Amer. 35, 311.
BURCH, J. M. and D. A. PALMER, 1961, Optica Acta 8, 73.
CLARK, K. G., 1973, Electronic Components June-September, 553.
CORDELLE, J., J. FLAMAND, G. PIEUCHARD and A. LABEYRIE, 1969, Aberration-Corrected
Concave Gratings Made Holographically, in: Optical Instruments and Techniques, ed.
J. Home Dickson (Oriel Press, 1970).
DOWLEY, M. W., 1971, Coherent Radiation, Technical Bull. Nr. 106.
FLAMAND, J., 1975, Rev. Physique-Chimie, in press.
FRAUNHOFER, J. v., 1821/22, Denkschrift der kgl. Akademie München, 8, 1-76.
troc. 28, 423.
HABER, H., 1950, J. Opt. Soc. Amer. 40, 153.
HARADA, T., S. MORIYAMA and T. KITA, 1975, Suppl. Jap. Journal of Appl. Phys. 14-1, 175.
HARRISON, G. R., 1973, Appl. Opt. 12, 2039.
HESSEL, A. and A. OLINER, 1965, Appl. Opt. 4, 1275.
HUNTER, W. R., 1975, Journal of the Spectroscopical Society of Japan 24, Suppl. Nr. 1, 37.
HUTLEY, M. C., 1973a, National Physical Laboratory Report MOM 1, January 1973.
HUTLEY, M. C., 1973b, Optica Acta 20, 771.
HUTLEY, M. C., 1974a, Blazed Interference Diffraction Gratings for the Ultraviolet, in: Vacuum
Ultraviolet Radiation Physics, eds. E. Koch, R. Haensel, C. Kunz (Pergamon/Vieweg,
Hamburg) p. 713.
HUTLEY, M. C., 1974b, Sci. Prog. Oxford 61, 301.
JOHNSON, R. L., 1975, Ph. D. thesis (University of London).
KAYSER, H., 1900, Handbuch der Spektroskopie, Bd. I (S. Hirzel, Leipzig).
LABEYRIE, A., 1966, Quelques Nouvelles Méthodes en Holographie, Mémoire pour obtenir le
Diplôme d'Études Supérieures de Sciences Physiques (University of Paris/Orsay).
LOEWEN, E., D. MAYSTRE, R. MCPHEDRAN and I. WILSON, 1975, Suppl. of the Jap. Journal
of Appl. Phys. 14-1, 143.
MICHELSON, A. A., 1927, Studies in Optics (Phoenix Books, The University of Chicago
Press) p. 104.
MIKELSKIS, H., 1973, Diplomarbeit, Göttingen.
MURTY, M. V. R., 1960, J. Opt. Soc. Amer. 50, 923.
MURTY, M. V. R. and N. C. DAS, 1971, J. Opt. Soc. Amer. 61, 1001.
NAGATA, H. and M. KISHI, 1974, Production of Blazed Holographic Gratings by a Simple
Optical System, in: Suppl. Jap. Journal of Appl. Phys. 14-1, 181.
NAMIOKA, T., 1961, Choice of Grating Mountings Suitable for a Monochromator in a Space
Telescope, in: Space Astrophysics, ed. W. Liller (McGraw Hill, New York, Toronto,
London) p. 228 ff.
NIEMANN, B., D. RUDOLPH and G. SCHMAHL, 1974, Optics Communications 12, 160.
NODA, H., T. NAMIOKA and M. SEYA, 1974a, J. Opt. Soc. Amer. 64, 1031.
NODA, H., T. NAMIOKA and M. SEYA, 1974b, J. Opt. Soc. Amer. 64, 1043.
PETIT, R., 1966, Rev. Opt. 6, 249.
PIEUCHARD, G. and J. FLAMAND, 1975, Suppl. Jap. Journal of Appl. Physics 14-1, 153.
POUEY, M., 1974, Journal of the Spectroscopical Society of Japan 24, Suppl. Nr. 1, 67.
RITTENHOUSE, D., 1786, Trans. Amer. Phil. Soc. 2, 201.
ROWLAND, H. A., 1883, Amer. J. Sci. (3), 26, 87.
ROWLAND, H. A., 1902, Phys. Papers, The Johns Hopkins University Press, Baltimore.
RUDOLPH, D. and G. SCHMAHL, 1967a, Umschau in Wissenschaft und Technik 67, 225.
RUDOLPH, D. and G. SCHMAHL, 1967b, Mitt. Astron. Ges. 23, 46.
RUDOLPH, D. and G. SCHMAHL, 1967c, German Patent Application Nr. 1623803.
RUDOLPH, D. and G. SCHMAHL, 1970, Optik 30, 475.
RUDOLPH, D., G. SCHMAHL, R. L. JOHNSON and R. J. SPEER, 1973, Appl. Opt. 12, 1731.
RUDOLPH, D., 1974, Bundesministerium für Forschung und Technologie, Forschungsbericht
W 74-07.
SAKAYANAGI, Y., 1967, Science of Light 16, 129.
SCHMAHL, G. and D. RUDOLPH, 1968, Mitt. Astron. Ges. 24, 41.
SCHMAHL, G. and D. RUDOLPH, 1969, Optik 29, 577.
SCHMAHL, G. and D. RUDOLPH, 1970, Optik 30, 606.
SCHMAHL, G. and D. RUDOLPH, 1972, Stellar Spectroscopy With Holographic Gratings, in:
Proceedings of the ESO/CERN-Conference on Auxiliary Instrumentation for Large
Telescopes, eds. S. Laustsen and A. Reiz (Geneva, June 1972).
SCHMAHL, G., 1973, Bundesministerium für Forschung und Technologie, Forschungsbericht
T 73-16.
SCHMAHL, G. and D. RUDOLPH, 1974, German Patent Application Nr. P 2433800, 9.
SCHMAHL, G., 1975, Journal of the Spectroscopical Society of Japan 24, Suppl. Nr. 1, 3.
SHERIDON, N. K., 1968, Appl. Phys. Lett. 12, 316.
SPEER, R. J., D. TURNER, R. L. JOHNSON, D. RUDOLPH and G. SCHMAHL, 1974, Appl. Opt. 13,
SPEER, R. J. and D. RUDOLPH, 1974, Design and Performance of Soft X-ray Reflection Gratings
Formed Holographically, in: Vacuum Ultraviolet Radiation Physics, eds. E. Koch, R.
Haensel, C. Kunz (Pergamon/Vieweg, Hamburg) p. 709.
STROKE, G. W., 1963, Ruling, Testing and Use of Optical Gratings for High Resolution
Spectroscopy, in: Progress in Optics, Vol. II, ed. E. Wolf (North-Holland, Amsterdam).
STROKE, G. W., 1967, Diffraction Gratings, in: Encyclopedia of Physics, ed. S. Flügge (Springer,
TENG, Y. Y. and E. A. STERN, 1967, Phys. Rev. Lett. 19, 511.




Laboratoire de Photoélectricité,
Faculté des Sciences (M.I.P.C.),
Université de Dijon,
Dijon, France


§ 1. INTRODUCTION
DEPTH OF THE PHOTOELECTRONS
§ 4. SURFACE PHOTOEXCITATION
§ 5. CONCLUSION
ACKNOWLEDGEMENT
REFERENCES
§ 1. Introduction

During the first years after its discovery by Hertz in 1887, photoemission
(PE) was investigated in experiments by Hallwachs, by Elster and Geitel, and
by Lenard. The results of these experiments suggested to Einstein the concept
of the photon and his well-known equation, which was later verified by
Millikan. Early studies that led to high-yield photocathodes still in use
(AgO-Cs, SbCs3) are well presented in classical monographs (HUGHES and
The investigations have been pursued for nearly a century in order to
produce higher-yield photocathodes in a wider spectral range and with a
better reliability. SOMMER [1970], who played an important role in the
development of trialkaline photocathodes, gave a very complete review
of the experimental knowledge and of possible technical applications.
Quite recently a new role has been found for PE in electron spectroscopy,
namely the investigation of electron energy levels and the transitions
between them. After the pioneering work of Spicer, nearly all elements
and many alloys and compounds have been measured. The spectral range
investigated has widened during recent years and now extends to X-rays.
Applications to chemical analysis have gradually become reliable and are now
commercially exploited (ESCA). Several review papers on electron spectroscopy
have been published recently (GÖRLICH and SUMI [1970], N. V. SMITH
[1971]). (See also SHIRLEY [1971].)
In this paper we shall only be concerned with the origin of the photo-
electrons. We shall deal with different theories, techniques and materials.
For a systematic review of important points such as technical applications,
electron spectroscopy and its relations with band structure, work function
(for this last point see RIVIERE [1969]), etc., we refer the reader elsewhere.
It was established a long time ago that the part of a solid responsible for
PE is generally a very thin superficial layer. A precise knowledge of the
exact origin of the photoelectrons is fundamental both to improving the
photocathodes used in photometry and to interpreting the results of electron
spectroscopy.
Strictly speaking, the solid is a whole and any modification of one atom
induces a change of all solid state properties, e.g. of the PE induced by a
given incident light beam. Nevertheless, because of screening effects it is
possible, within some approximation, to construct models where the PE
current is analysed as a sum of local contributions. A further benefit of
this local analysis is that it gives a quantitative evaluation of the screening
effects, which can be used as an intermediate result in determining more
fundamental processes.


Early theories dealt with metals as free-electron gases and with PE as a
surface phenomenon (FOWLER [1931], MITCHELL [1934, 1935, 1936]). The
emission of an electron was the result of the simultaneous interaction of
an electron of the solid with the surface and with the electromagnetic field.
Later theoretical (FAN [1945]) and experimental studies led workers to
describe PE as a volume process and to divide it into 3 steps:
1) Absorption of one photon and transfer of its energy to one excited
electron.
2) Transport of the excited electron towards the surface.
3) Transport of the excited electron across the surface.
The contribution of a small volume $\delta V$ to the PE is then the product of
three factors:
the total number of absorbed photons,
the proportion of absorbed photons that lead to the excitation of one
electron,
the escape probability of an electron excited in $\delta V$.
The first two factors describe step 1, the last factor describes steps 2 and 3.
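Insofar as we can make $\delta V$ tend to zero, the photoyield Y can be written as

$$ Y = \beta_0 \int D(r)\, p(r)\, d^3r, \qquad (1.1) $$

where $\beta_0$ is the proportion of absorbed photons that lead to the excitation
of one electron, D(r) is the density of absorbed photons per incident photon
(DAP) at r, and p(r) is the escape probability for an electron excited at r.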
To investigate the photoelectron energy distribution (PED) it is useful to
define the probability for an absorbed photon of energy $h\nu$ to excite an
electron to an energy between E and E + dE, which we can write in a normalized
form as $\beta(E, h\nu)\,dE$, with

$$ \beta_0 = \int \beta(E, h\nu)\, dE. $$

We shall also define the probability for an electron excited at energy E'
at r to emerge with an energy between E and E + dE, as p(E', E, r) dE. The
number of electrons emitted with an energy between E and E + dE is then

$$ n(E)\,dE = dE \iint D(r)\, \beta(E', h\nu)\, p(E', E, r)\, dE'\, d^3r, \qquad (1.3) $$

and

$$ Y = \int n(E)\, dE. $$
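
As an illustration of how the three-step expression for the photoyield is evaluated, the following minimal sketch integrates the product of a DAP and an escape probability over depth. Everything specific in it is an assumption made for the example and is not taken from this section: the one-dimensional geometry, the exponential forms chosen for D(z) and p(z), and the names `photoyield`, `alpha`, `p0` and `escape_length`.

```python
import math

def photoyield(beta0, reflectance, alpha, p0, escape_length, depth, steps=20000):
    """Numerically integrate Y = beta0 * integral of D(z) * p(z) dz from 0 to depth.

    Assumed (hypothetical) model forms:
      D(z) = (1 - reflectance) * alpha * exp(-alpha * z)   density of absorbed photons
      p(z) = p0 * exp(-z / escape_length)                  escape probability
    """
    h = depth / steps
    total = 0.0
    for n in range(steps):
        z = (n + 0.5) * h  # midpoint rule
        dap = (1.0 - reflectance) * alpha * math.exp(-alpha * z)
        escape = p0 * math.exp(-z / escape_length)
        total += dap * escape * h
    return beta0 * total

def photoyield_closed_form(beta0, reflectance, alpha, p0, escape_length):
    """Closed form of the same integral for an infinitely thick sample."""
    x = alpha * escape_length
    return beta0 * (1.0 - reflectance) * p0 * x / (1.0 + x)

if __name__ == "__main__":
    numeric = photoyield(0.5, 0.3, 0.05, 0.4, 10.0, depth=2000.0)
    exact = photoyield_closed_form(0.5, 0.3, 0.05, 0.4, 10.0)
    print(numeric, exact)
```

With these assumed forms the yield depends on the absorption coefficient and the escape length only through their product, which is why a short escape length confines the origin of the photoelectrons to a thin superficial layer.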

To obtain a more general validity for the expressions (1.1) and (1.3) we may
try to introduce into the DAP D(r) a surface term of the form $A\,\delta(z)$, where z
is the distance from r to the surface. A depends on the polarization and
on the angle of incidence of the light. Of course, no transport process
through the solid needs to be considered, and the escape probability $p_s$
associated with the surface absorption is merely the ratio of the number
of emitted electrons to the total number of excited electrons. $p_s$ might not
be the limit of p(r) as z tends to zero, and $\beta(E, h\nu)$ can take different forms
$\beta_s$ and $\beta_v$ for surface and volume effects. Two objections can be made
against the 3-step model as it is described here.

1) Is it possible to separate the 3 steps?

This objection appears with special clarity in the case of a surface effect.
The separation of the electron transport through the surface and of the
photon absorption may seem difficult because no absorption occurs if the
electron is not perturbed by the surface. As a matter of fact the decom-
position of the surface PE into several steps is rather artificial. Never-
theless $\beta_s$ and $A\,\delta(z)$ are well defined and it is convenient to use a step model
to compare surface and volume PE. We shall later discuss the one-step
formalism, but the 3-step description is still the one most used for the analysis
of experimental data. We must note, however, that from a physical point
of view the 3 steps are not necessarily independent. For instance, many-body
processes such as electron-electron interactions perturb, to a greater or
lesser extent, both light absorption and transport phenomena.

2) Is it possible to consider light absorption in a volume element that tends
to zero, so that the integrals (1.1) and (1.3) are valid?

We may try to give a significance to the DAP in a volume that tends to
zero by considering the expectation values of the density of current and
the electric field. But apart from the doubtful significance of such concepts,
they would be of very little use in the interpretation of experiments. The
DAP is a macroscopic quantity if we calculate it from macroscopic con-
stants such as the index of refraction or the dielectric constant. This implies
that we consider the solid as homogeneous. It is then certainly meaningless
to use the DAP to calculate the absorption within a volume smaller than
one crystalline cell. Furthermore we must note that the determination of
the dielectric constant and of the index of refraction, either from theoretical
considerations or from experimental data, is always made under the assumption
of a quasi-infinite solid. The application of the concept of DAP to one
crystalline cell requires, therefore, much caution, especially if the cell is
near the surface. We shall give a special discussion of the surface absorption,
which cannot be dealt with if the solid properties are described by the di-
electric constant alone.
The 3-step model will be our guide in this study. In the first part, we shall
discuss separately the theoretical basis of the density of absorbed photons
(DAP) and of the electron escape probability that appear in the expressions
(1.1) and (1.3), and we will describe how both contribute to determining
the origin of the photoelectrons. In the study of the DAP we shall give special
attention to surface effects. We shall also discuss briefly the more elegant
one-step theories that have been developed recently.
Sections 3 and 4 deal with the analysis of the experimental results. Insofar
as PE is a pure volume effect, the DAP can be calculated from purely optical
data and needs no discussion here. In section 3, we shall discuss the experi-
mental determinations of the escape probability of the photoelectrons in
the light of the results of section 2. In section 4, we shall discuss the
possibility of a surface effect.

§ 2. Theoretical Basis of the Photoemission (PE)

PE is generally measured for solid slabs bounded by two parallel planes
or for semi-infinite solids, i.e. slabs of infinite thickness. The DAP is then
a function of the distance z from the surface. We shall consider here the
DAP in slabs illuminated by a monochromatic parallel incident beam of
frequency $\nu = \omega/2\pi$ that induces in the slab an electric field of complex
amplitude $\mathcal{E}(r)$ at the point r. We shall disregard the magnetic absorption,
which is not important in PE. The response of the solid to the electric field is
then described by the complex amplitude either of the density of electric
current j(r) or of the electric displacement R(r). These quantities are
related by

$$ j(r) = i\omega R(r), \qquad (2.1) $$

if we include in the density of current the displacement current. Within the
linear approximation, the most general relation between $\mathcal{E}(r)$ and R(r) has
the form:

$$ R(r) = \int \varepsilon(r, r')\, \mathcal{E}(r')\, d^3r'. \qquad (2.2) $$

The DAP at r is given by:

$$ D(r) = \frac{\mathrm{Re}\,[\,j(r)\cdot\mathcal{E}^*(r)\,]}{2N\hbar\omega} = -\frac{\mathrm{Im}\,[\,R(r)\cdot\mathcal{E}^*(r)\,]}{2N\hbar}, \qquad (2.3) $$
where N is the number of incident photons per unit time. As defined by (2.3)
the DAP may have a microscopic significance and can be used to define
the absorption in an arbitrarily small volume $\delta V$. But in this paper we shall
not make use of this possibility.
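
The two forms of the DAP in (2.3) are equivalent because of (2.1). The short sketch below checks this numerically for a local, complex dielectric constant. The numerical values and the function names are arbitrary choices made for the illustration; only the two expressions themselves come from the text.

```python
def dap_from_current(j, field, n_photons, hbar, omega):
    """D = Re[j . E*] / (2 N hbar omega), first form of (2.3)."""
    dot = sum(a * b.conjugate() for a, b in zip(j, field))
    return dot.real / (2 * n_photons * hbar * omega)

def dap_from_displacement(disp, field, n_photons, hbar):
    """D = -Im[R . E*] / (2 N hbar), second form of (2.3)."""
    dot = sum(a * b.conjugate() for a, b in zip(disp, field))
    return -dot.imag / (2 * n_photons * hbar)

if __name__ == "__main__":
    hbar = 1.054571817e-34          # J s
    omega = 3.0e15                  # rad/s, arbitrary
    n_photons = 1.0e18              # arbitrary normalisation
    eps = 2.0 - 0.7j                # local dielectric constant, eps1 - i*eps2
    field = [1 + 2j, 0.5 - 1j, 0.3j]            # complex field amplitude
    disp = [eps * e for e in field]             # local relation R = eps * E
    current = [1j * omega * r for r in disp]    # eq. (2.1)
    print(dap_from_current(current, field, n_photons, hbar, omega))
    print(dap_from_displacement(disp, field, n_photons, hbar))
```

Both forms give the same positive value as long as the imaginary part of the dielectric constant is negative, which is the sign convention used in this section.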
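In practice screening effects make $\varepsilon(r, r')$ tend to zero when $|r - r'|$ tends
to infinity. To define the DAP in a volume $\delta V$ large compared with the
size of the crystalline cell we may use a local dielectric constant that is
generally complex,

$$ \varepsilon(r) = \varepsilon_1(r) - i\varepsilon_2(r), \qquad (2.4) $$

and write

$$ R(r) = \varepsilon(r)\,\mathcal{E}(r). \qquad (2.5) $$

We may also introduce the index of refraction $n = \nu - i\kappa$ defined by:

$$ n^2 = \varepsilon/\varepsilon_0, $$

where $\varepsilon_0$ is the dielectric constant of vacuum. We then write:

$$ D(r) = \frac{4\pi\nu\kappa}{\lambda\cos i_0}\,\frac{|\mathcal{E}(r)|^2}{|\mathcal{E}_0|^2}, \qquad (2.8) $$

where $i_0$ is the angle of incidence of the incident light beam, $\lambda$ the wave-
length in vacuo, and $\mathcal{E}_0$ the complex amplitude of the electric field in the
incident beam.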
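
The conversion from a local dielectric constant to the index of refraction can be sketched as follows. This is an illustration under stated assumptions: the dielectric constant is taken relative to vacuum, the convention $\varepsilon = \varepsilon_1 - i\varepsilon_2$, $n = \nu - i\kappa$ is the one used above, and the local DAP is taken in the form $4\pi\nu\kappa|\mathcal{E}|^2 / (\lambda \cos i_0 |\mathcal{E}_0|^2)$, which follows from dividing the absorbed power density by the incident photon flux. The function names are hypothetical.

```python
import cmath
import math

def index_from_epsilon(eps1, eps2):
    """Return (nu, kappa) with (nu - i*kappa)**2 = eps1 - i*eps2 (relative values)."""
    n = cmath.sqrt(complex(eps1, -eps2))
    # The principal square root has a non-negative real part; kappa is reported
    # as a non-negative number.
    return n.real, abs(n.imag)

def local_dap(nu, kappa, wavelength, i0, field_ratio_sq):
    """Local DAP per unit depth for a given |E(r)|^2 / |E_0|^2 (assumed form)."""
    return 4 * math.pi * nu * kappa * field_ratio_sq / (wavelength * math.cos(i0))

if __name__ == "__main__":
    nu, kappa = index_from_epsilon(-2.0, 0.5)
    print(nu, kappa)
    print(local_dap(nu, kappa, 400e-9, 0.0, 1.0))
```

The real and imaginary parts of the relation give $\nu^2 - \kappa^2 = \varepsilon_1/\varepsilon_0$ and $2\nu\kappa = \varepsilon_2/\varepsilon_0$, so the factor $\nu\kappa$ in the DAP is simply proportional to the imaginary part of the dielectric constant.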
The smallest volume where (2.8) can be applied is one crystalline cell,
but this is only a very rough approximation if the cell is not embedded in a set
of identical cells. This restriction is still more important if we wish to go
beyond the local approximation by introducing the wave-vector-dependent
dielectric constant $\varepsilon(q, \omega)$. If the gradient of the components of $\mathcal{E}(r)$ is large
enough, so that $\mathcal{E}(r')$ is significantly different from $\mathcal{E}(r)$ for values of $r - r'$
which do not cancel $\varepsilon(r, r')$, the expressions (2.5) and (2.8) cannot be
substituted for (2.2) and (2.3). This occurs when the screening length in
the solid is not much smaller than the wavelength of the light. $\varepsilon(q, \omega)$ can
then be defined as the ratio of the Fourier transforms of R(r) and $\mathcal{E}(r)$ in
an infinite solid. This adds new restrictions to the use of $\varepsilon(q, \omega)$ in the
calculation of the dielectric (mean) response and the DAP in a finite volume
$\delta V$, if $\delta V$ is near the surface.
Even in an isotropic or cubic material, the isotropy of the relations
between $\mathcal{E}(r)$ and R(r) is destroyed by the definition of the light wave vector
q. $\varepsilon(q, \omega)$ then becomes a tensor. In the simple case when q is real and the
material is isotropic, we have a cylindrical symmetry and we can define
scalar longitudinal and transverse dielectric constants $\varepsilon_L(q, \omega)$ and $\varepsilon_T(q, \omega)$
that are the components of the dielectric tensor in a particular reference sys-
tem (STERN [1963]). We may note that the cylindrical symmetry is generally
broken when q is complex (heterogeneous wave).


In principle the dielectric constant can be calculated by considering
every independent excitation that can be induced in an infinite solid by an
electromagnetic wave. Each excitation affords a contribution to the
absorption, i.e. to $\varepsilon_2$. $\varepsilon_1$ is deduced from the dispersion curve of $\varepsilon_2$ by the
Kramers-Kronig relations (STERN [1963]). Independent excitations are
very difficult to define because of the large number of degrees of freedom
that exist in a solid, and it would be most difficult to define a complete set
of independent excitations that would completely describe the response of
the solid. In practice the components of absorption are described as damped
excitations, i.e. excitations that decay into one another.

Because $\varepsilon_2$ describes an infinite solid, we have only to consider bulk
excitations that occur in an infinite solid. For convenience they can be
described in a finite volume if periodic boundary conditions are applied
to the parameters that describe the excitation. A complete a priori cal-
culation of $\varepsilon_2$ has never been attempted for real solids, but more limited
calculations of the contributions of the different excitations permit us to
identify them in the experimental results.

2.2.1. Collective motion of the electrons

Every description of the free or quasi-free electron gas can be used to
describe this component. It can be approximated in a classical way from three
parameters: the density of electrons, their relaxation time and their effective
mass. This leads to the classical Ketteler-Helmholtz formula. Quantum
treatments of the interacting electron gas can also be given.
Resonant oscillations of the electrons occur when the incident electro-
magnetic wave can be coupled with a spontaneous mode (plasma oscilla-
tion, quantized in plasmons). This occurs for the frequency (and wave
vector, when spatial dispersion is important) that cancels or nearly cancels
the dielectric constant. If the plasma oscillations are damped, i.e. if the
plasmon can spontaneously decay into another type of excitation, the
resonant character of the response to the electromagnetic wave is also
damped. In addition to the volume plasma resonance observed in an
infinite solid for $\varepsilon = 0$, a surface plasma resonance can be observed in a
semi-infinite solid bounded by an insulator of dielectric constant $\varepsilon_i$ when

$$ \varepsilon + \varepsilon_i = 0. $$
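
The two resonance conditions can be made concrete with a free-electron sketch. The Drude-type expression used below, built from the electron density, relaxation time and effective mass, is assumed here to stand in for the classical formula mentioned above; it is not quoted from the text. Dielectric constants are relative to vacuum, and the sign of the damping term is chosen so that the imaginary part is negative, in agreement with $\varepsilon = \varepsilon_1 - i\varepsilon_2$. All function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19      # C
EPS0 = 8.8541878128e-12         # F/m
M_E = 9.1093837015e-31          # kg

def plasma_frequency(density, effective_mass=M_E):
    """Angular plasma frequency of a free-electron gas of given density (m^-3)."""
    return math.sqrt(density * E_CHARGE ** 2 / (EPS0 * effective_mass))

def drude_epsilon(omega, omega_p, tau=math.inf):
    """Relative dielectric constant eps1 - i*eps2 of the assumed free-electron model."""
    return 1 - omega_p ** 2 / (omega ** 2 - 1j * omega / tau)

def surface_plasma_frequency(omega_p, eps_i=1.0):
    """Frequency at which eps + eps_i = 0 in the undamped model."""
    return omega_p / math.sqrt(1 + eps_i)

if __name__ == "__main__":
    wp = plasma_frequency(1.8e29)            # density chosen arbitrarily
    ws = surface_plasma_frequency(wp, eps_i=1.0)
    print(abs(drude_epsilon(wp, wp)))        # volume resonance: eps = 0
    print(drude_epsilon(ws, wp) + 1.0)       # surface resonance: eps + eps_i = 0
```

In the undamped limit the surface resonance against vacuum therefore lies below the volume resonance by a factor of the square root of two; a finite relaxation time broadens both.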

2.2.2. Collective motion of the ions

The collective motion of the ions can be described as a set of phonons.
Optical phonons can be coupled with the electromagnetic wave and contribute
to the absorption. This effect is important in the infra-red but not in
the spectral range where PE occurs.

2.2.3. One-electron excitations (direct transitions)

In the volume theory of PE a photon can lead to the emission of an
electron only if its energy brings one electron up to an energy level above
the vacuum level. This process, especially in semiconductors, is often called
electron-hole pair creation because the electron brought up to a state
above the Fermi level $E_F$ leaves below $E_F$ a vacant state, or hole.

Generally, this process is treated in the one-electron approximation.
The electronic state of an infinite solid is described by a set of occupied
Bloch states, with wave functions $\psi_k(r) = u_k(r)\,e^{ik\cdot r}$. For each band $u_k(r)$
is a different function which has the crystal periodicity. We shall assume
that each wave function is normalized in a unit volume and that a complete
set of wave functions is obtained by using cyclic conditions at the boundaries
of a unit volume of appropriate form. We assume that the dimensions of
the crystal cell are much smaller than the unit length. The density of per-
mitted states in k space is then $4\pi^3/V_c$, where $V_c$ is the volume of the crystal-
line cell.
At temperature T, the occupation probability for the $\psi_k$ level is a func-
tion of its energy $E_k$,

$$ f(E_k) = \frac{1}{1 + \exp[(E_k - E_F)/kT]}. $$

The imaginary part of the one-electron contribution to the dielectric constant
is given by:

E2a = s., s., F f ( E k ) [1-f(Ekt)]Okk,d3kd3k

__ (2.11)

where $\langle|\mathcal{E}(r)|^2\rangle$ is the mean value of the squared modulus of the complex
amplitude $\mathcal{E}(r)$ in the volume $\delta V$ for which $\varepsilon_{2a}$ is defined. When the vector
potential of the electromagnetic field has the complex amplitude A(r), the
transition probability $O_{kk'}$ from the state $\psi_k$ to the state $\psi_{k'}$ is

$$ O_{kk'} = |\mathcal{M}_{kk'}|^2\,\delta(E_{k'} - E_k - \hbar\omega). \qquad (2.12) $$
The matrix element is

( A ( r ) .P 3 - P * A(r))I$k)3 (2.13)

where $p = (\hbar/i)\nabla_r$ is the momentum operator. The integral (2.11) is
extended over the complete first Brillouin zone for k and k', and a contribu-
tion of type (2.11) must be added for each combination of a partially or
fully occupied band and a partially or fully vacant band.
In the long-wavelength approximation, we can consider $\mathcal{E}(r)$ to be uni-
form in $\delta V$, so that
It follows immediately from the periodicity of $u_k(r)$ and $u_{k'}(r)$ that
$\mathcal{M}_{kk'} \neq 0$ only if k = k' (momentum-conservation selection rule). Had we
not taken k and k' in the same Brillouin zone, the selection rule would have
been

$$ k = k' + K, \qquad (2.15) $$

where K belongs to the reciprocal lattice of the crystal.
In the free-electron model, $u_k$ and $u_{k'}$ are constant, if k and k' are taken
in the appropriate Brillouin zones. Therefore $\mathcal{M}_{kk'} = 0$ and no absorption
occurs in this approximation.
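
The free-electron statement can be illustrated with a one-dimensional toy calculation, given here only as a sketch. It assumes plane waves in a periodic box of length L, with $\hbar = 1$ by default, and evaluates the momentum matrix element between two of them on a grid; the function name and parameters are hypothetical.

```python
import cmath
import math

def momentum_matrix_element(m_final, m_initial, length=1.0, points=256, hbar=1.0):
    """<k'|p|k> for plane waves exp(i k x)/sqrt(L) with k = 2*pi*m/L in a periodic box."""
    k = 2 * math.pi * m_initial / length
    k_final = 2 * math.pi * m_final / length
    dx = length / points
    total = 0j
    for n in range(points):
        x = n * dx
        # The momentum operator acting on exp(i k x) gives hbar * k * exp(i k x).
        total += cmath.exp(-1j * k_final * x) * hbar * k * cmath.exp(1j * k * x) * dx
    return total / length

if __name__ == "__main__":
    print(abs(momentum_matrix_element(3, 3)))   # diagonal element, equals hbar * k
    print(abs(momentum_matrix_element(5, 3)))   # off-diagonal element, vanishes
```

Only the diagonal element survives, and it connects a state to itself, so the energy difference is zero and cannot match the photon energy. In this model a photon is therefore not absorbed unless something else (the lattice potential, the surface, or a scattering process) supplies the missing momentum.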

2.2.4. One-electron excitation (non-direct transitions)

Other processes can lead to the conversion of the photon energy into
a one-electron excitation (or electron-hole pair creation). The coupling
between the occupied state $\psi_k$ and the vacant state $\psi_{k'}$ can occur through
lattice deformation. The emission or the absorption of one or several
phonons ensures momentum conservation. The energy transferred to
the electron differs from the photon energy by the energy of the phonons.
Coupling between $\psi_k$ and $\psi_{k'}$ can also occur through other many-body
effects, e.g. the electron-electron interaction or the readjustment of the
conduction band due to the created positive hole (SPICER [1967], DONIACH
[1970]). Any breakdown of the model which assumes independent Bloch
electrons involves the breakdown of the k-conservation selection rule.
Later on we shall consider surface effects that cannot be included in a
description of the solid by the dielectric constant.


Our survey of the possible photon absorption mechanisms in the bulk
of a solid is not complete. We did not take into account multiphoton
absorption, multielectron excitation and exciton resonances, which would
not easily fit in a 3-step process. In any case the interactions between the
different excitations make our analysis approximate. The one-step theories
of PE were developed to answer this objection.
In principle one can deduce from our survey an estimate of the photo-
excitation coefficient $\beta_0$ and its distribution between the different electron
energies $\beta(E, h\nu)$. $\beta(E, h\nu)$ is the most useful intermediate between the
observed PED $n(E, h\nu)$ and the microscopic processes that many authors
have investigated through PE experiments. $\beta(E, h\nu)$ differs from $n(E, h\nu)$ first
because of the elimination of the electrons of energy smaller than the vacuum
level $E_v$, and also because $n(E, h\nu)$ includes inelastic scattering during the
transport to the surface.
In the different interactions of light with the solid that can be observed
in PE data, we must distinguish
1) those which produce one-electron excitation,
2) those which compete with one-electron excitation, and
3) those which only modify the distribution of the electric field and the
DAP in the solid.
The collective motion of electrons, insofar as it is damped
only by one-electron excitations or light emission, is of the third type. The
plasma resonance is a particular form of this collective motion, and we
could apply the expressions (2.6) and (2.7) without any reference to the
plasmon concept. We can also consider that a photon excites a plasmon
that decays either into a one-electron excitation or into a reflected or scattered
photon. The plasmon appears there as an additional step in the PE process,
and the excitation of a plasmon must not be included in the calculation
of $\beta(E, h\nu)$ as competing with one-electron excitations. If a collective
oscillation of the electrons is damped by thermalization of its energy, the
process that decreases $\beta(E, h\nu)$ is in fact the thermalization.


The Fresnel equations permit the calculation of the electric field excited
by an incident plane wave in a solid. At the interface of two semi-infinite
homogeneous media, characterized by indexes of refraction $n_0$ and $n_1$, the
reflection coefficient is

$$ r_s = \frac{n_0\cos i_0 - n_1\cos i_1}{n_0\cos i_0 + n_1\cos i_1} \qquad (2.16) $$

for s polarization, and

$$ r_p = \frac{n_1\cos i_0 - n_0\cos i_1}{n_1\cos i_0 + n_0\cos i_1} \qquad (2.17) $$

for p polarization. Here $i_0$ and $i_1$ are respectively the angles of incidence of
the wave vectors $k_0$ and $k_1$ in the incident medium (generally vacuum) and
in the photoemitting solid. They are related by Snell's law

$$ n_0 \sin i_0 = n_1 \sin i_1. $$
Absorption always occurs in a photoemitter, hence $n_1$, $i_1$ and $k_1$ are always
complex.

In a semi-infinite photocathode the DAP at a distance z from the surface
is

$$ D(z) = \frac{4\pi\nu\kappa\,(1 - rr^*)}{\lambda\cos i_0}\,\exp(-4\pi q z/\lambda), \qquad (2.18) $$

where

$$ n_1 = \nu - i\kappa, $$

and p and q are the positive solutions of (2.19),


r is given by (2.16) or (2.17) depending on the state of polarization of the
light. In a thin film of thickness $z_0$ we must take into account the reflection
of the light on both faces of the film. For s polarization

$$ D_s(z) = \frac{4\pi\nu\kappa}{\lambda\cos i_0}\left|\frac{1 + r_s}{1 + r_s r_s' \exp(-2iq_z z_0)} \times \{\exp(-iq_z z) + r_s' \exp(iq_z[z - 2z_0])\}\right|^2, \qquad (2.20) $$

where $r_s$ and $r_s'$ are the coefficients of reflection for the electric field, given
by (2.16), when the incident light beam encounters respectively the first and
the second face of the film, and

$$ q_z = \frac{2\pi n_1 \cos i_1}{\lambda} \qquad (2.21) $$

is the z component of the wave vector q in the photoemitter. Because of
absorption $q_z$ is always complex. q, which appears in (2.18), is the imaginary
part of $\lambda q_z/2\pi$. For p polarization, because the angle of refraction $i_1$ in the
photoemitter is complex, we must add separately the energy densities
$D_{px}(z)$ and $D_{pz}(z)$ associated with the tangential (x) and normal (z) com-
ponents of the electric field:

(1 ~,)(cosi,/cos io)
1 rprb exp ( - 2iq, z,)

x cos i, x {exp ( - iq, z) + TI,exp (iq,(z - 22,)))


471~1~ (1+ rp)(cosi,/cos i,)

1+ rprb exp (- 2iq, z,)
= ___
I cos i,

x sin i, x { exp ( - iq, z ) - rb exp (iq,(z - 22,))) ; (2.22)

$r_p$ and $r_p'$ have for p polarization the same significance as $r_s$ and $r_s'$ for s
polarization.
The calculations of the photoyield by COQUET and VERNIER [1966] and
by many other authors since that time have been based on equations (2.20)
and (2.22).
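
A thin-film DAP of this kind can be checked against energy conservation. The sketch below is built directly from the Fresnel coefficient (2.16) rather than transcribed from the printed formulas. It assumes s polarization, a free-standing film with vacuum on both sides (so that the second-face coefficient is the negative of the first), an index $n_1 = \nu - i\kappa$, and a local DAP proportional to $\nu\kappa|\mathcal{E}|^2$. All function names are hypothetical.

```python
import cmath
import math

def _film_quantities(z0, wavelength, n1, i0):
    c0 = math.cos(i0)
    w = cmath.sqrt(n1 * n1 - math.sin(i0) ** 2)   # n1*cos(i1), imaginary part <= 0
    qz = 2 * math.pi * w / wavelength             # z component of the wave vector
    r = (c0 - w) / (c0 + w)                       # first face, (2.16) with n0 = 1
    r_back = -r                                   # second face, film to vacuum
    den = 1 + r * r_back * cmath.exp(-2j * qz * z0)
    return c0, qz, r, r_back, den

def film_dap_s(z, z0, wavelength, n1, i0):
    """DAP per unit depth at depth z in the film, s polarization."""
    c0, qz, r, r_back, den = _film_quantities(z0, wavelength, n1, i0)
    field = (1 + r) / den * (cmath.exp(-1j * qz * z)
                             + r_back * cmath.exp(1j * qz * (z - 2 * z0)))
    nu, kappa = n1.real, -n1.imag
    return 4 * math.pi * nu * kappa * abs(field) ** 2 / (wavelength * c0)

def film_reflectance_transmittance_s(z0, wavelength, n1, i0):
    """Intensity reflectance and transmittance of the same film."""
    c0, qz, r, r_back, den = _film_quantities(z0, wavelength, n1, i0)
    r_film = (r + r_back * cmath.exp(-2j * qz * z0)) / den
    t_film = (1 + r) * (1 + r_back) * cmath.exp(-1j * qz * z0) / den
    return abs(r_film) ** 2, abs(t_film) ** 2

def absorbed_fraction(z0, wavelength, n1, i0, steps=2000):
    """Integrate the DAP over the film thickness with Simpson's rule (steps even)."""
    h = z0 / steps
    total = film_dap_s(0.0, z0, wavelength, n1, i0) + film_dap_s(z0, z0, wavelength, n1, i0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * film_dap_s(k * h, z0, wavelength, n1, i0)
    return total * h / 3

if __name__ == "__main__":
    n1 = complex(1.5, -2.0)
    R, T = film_reflectance_transmittance_s(30e-9, 400e-9, n1, 0.6)
    A = absorbed_fraction(30e-9, 400e-9, n1, 0.6)
    print(R, T, A, R + T + A)
```

The integral of the DAP over the film equals 1 - R - T, which is a convenient test of any implementation before the escape probability is introduced.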

2.4.1. Validity of the Fresnel equations and spatial dispersion

The Fresnel equations (2.16) and (2.17) represent the solution of the
Maxwell equations when two homogeneous media described by two local
indexes of refraction are separated by a plane discontinuity. The well-known
continuity of the tangential components of the electric and magnetic fields
(CTCF), which is used as boundary condition to derive eqs. (2.16) and (2.17),
is a consequence of the Maxwell equations at singular points where the
discontinuity of the index of refraction occurs.
If spatial dispersion is not negligible we can try to substitute for the
local index of refraction the wave-vector-dependent one, $n(q, \omega)$. Well inside
the solid a complete set of solutions of the Maxwell equations includes the
plane waves with a wave vector q that satisfies the dispersion equation

$$ q^2 = \frac{\omega^2}{c^2}\, n^2(q, \omega). \qquad (2.23) $$

The wave vector q of the transmitted wave associated with a given
incident wave must be deduced from (2.23) and the Snell laws (equality
of the tangential components of the wave vectors of the incident, transmitted
and reflected waves). New effects introduced by spatial dispersion arise from
the possible multiplicity of solutions for q. HOPFIELD and THOMAS [1963]
explained observed anomalies in the reflectivity of CdS for photon energies
near the exciton resonance by such an effect.
This situation is similar to the phenomenon of double refraction in
anisotropic media. To determine the amplitude of both transmitted waves,
the incident one is divided into two components, linearly polarized in the
simplest case, elliptically in the general case. Each component is coupled
with one transmitted wave by the CTCF and a generalization of the Fresnel
equations can be found for it. Such a decomposition has not been found
in the case of spatial dispersion, and the CTCF associated with the dis-
persion equation (2.23) are insufficient to determine the amplitude of
several transmitted waves. HOPFIELD and THOMAS [1963] and many later
authors introduced an additional boundary condition by examining the
microscopic process that is responsible for the spatial dispersion; e.g.
Hopfield and Thomas cancel the contribution of the exciton to the electric
polarization at the surface. MELNYK and HARRISON [1970] introduced an
additional boundary condition, namely the continuity of the normal com-
ponent of the electric field, to account for the excitation of longitudinal
plasma oscillations in metals in addition to the standard transverse waves.
For p polarization at high angles of incidence, they predicted oscillatory
variations of the transmittance and absorptance of thin films of K with
photon energy above the plasmon resonance. ANDEREGG, FEUERBACHER
and FITTON [1971] observed such variations in the spectral distribution of
the photoyield.
The justification of the additional boundary conditions raised a good
deal of controversy. AGARWAL, PATTANAYAK and WOLF [1971a, b, 1974]
substituted for the additional boundary condition a coupling between the
transmitted waves at the surface.
VERNIER [1973] noted that the use of boundary conditions implies that
the transition layer, where the bulk index of refraction of the homogeneous
medium does not represent the solid properties, has a negligible thickness.
We may ask whether the standard CTCF can be applied across this layer
when the local approximation breaks down. The rigorous solution is to
solve the coupled equations that relate the motion of the charges and the
electric field in the vicinity of the surface. Such a solution has been given
for a free electron gas by SAUTER [1967], FORSTMANN [1967], and FUCHS
and KLIEWER [1969]. FUCHS and KLIEWER [1969] calculated the imped-
ances $Z_p$ and $Z_s$ that appear in the standard expressions of the reflectance:
for p polarization,

$$ R_p = \frac{\cos i_0 - Z_p}{\cos i_0 + Z_p}, \qquad (2.24) $$

for s polarization,

$$ R_s = \frac{1 - Z_s \cos i_0}{1 + Z_s \cos i_0}. \qquad (2.25) $$
These expressions would be equivalent to the Fresnel equations (2.16) and
(2.17) if $Z_p$ and $Z_s$ could be deduced from a single dielectric constant $\varepsilon$ for
every angle of incidence by

$$ Z_p = \frac{\varepsilon_0}{\varepsilon}\left(\frac{\varepsilon}{\varepsilon_0} - \sin^2 i_0\right)^{1/2}, \qquad (2.26) $$

$$ Z_s = \left(\frac{\varepsilon}{\varepsilon_0} - \sin^2 i_0\right)^{-1/2}. \qquad (2.27) $$

This requirement is, however, not fulfilled. Moreover Fuchs and Kliewer
found a surface absorption term. We note here that such surface absorp-
tion implies that the normal component of the Poynting vector is not con-
tinuous at the surface. Therefore the CTCF is no longer valid. The process
of surface absorption considered by Kliewer and Fuchs does not lead to
one-electron excitation and therefore does not lead to PE. Later on we shall
note other processes of surface absorption that lead to PE and that are also
inconsistent with the CTCF and, therefore, with the Fresnel equations.
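
The equivalence that a local dielectric constant would imply can be checked numerically. The sketch below assumes the local-limit impedances $Z_p = (\varepsilon - \sin^2 i_0)^{1/2}/\varepsilon$ and $Z_s = (\varepsilon - \sin^2 i_0)^{-1/2}$, with $\varepsilon$ relative to vacuum and vacuum as the incident medium; these are the forms that (2.24) and (2.25) require in order to reproduce (2.16) and (2.17). The function names are hypothetical.

```python
import cmath
import math

def local_impedances(eps_rel, i0):
    """Return (Z_p, Z_s) for a local relative dielectric constant (assumed forms)."""
    root = cmath.sqrt(eps_rel - math.sin(i0) ** 2)
    return root / eps_rel, 1 / root

def reflectance_from_impedances(eps_rel, i0):
    """Intensity reflectances (p, s) from the impedance expressions."""
    z_p, z_s = local_impedances(eps_rel, i0)
    c0 = math.cos(i0)
    r_p = (c0 - z_p) / (c0 + z_p)
    r_s = (1 - z_s * c0) / (1 + z_s * c0)
    return abs(r_p) ** 2, abs(r_s) ** 2

def reflectance_from_fresnel(eps_rel, i0):
    """Intensity reflectances (p, s) from the Fresnel coefficients, with n0 = 1."""
    n1 = cmath.sqrt(eps_rel)
    c0 = math.cos(i0)
    w = cmath.sqrt(eps_rel - math.sin(i0) ** 2)   # n1*cos(i1)
    cos_i1 = w / n1
    r_s = (c0 - w) / (c0 + w)
    r_p = (n1 * c0 - cos_i1) / (n1 * c0 + cos_i1)
    return abs(r_p) ** 2, abs(r_s) ** 2

if __name__ == "__main__":
    eps = complex(-2.0, -0.5)     # eps1 - i*eps2 with eps2 > 0
    for angle in (0.0, 0.5, 1.2):
        print(reflectance_from_impedances(eps, angle), reflectance_from_fresnel(eps, angle))
```

Any measured or calculated pair of impedances that cannot be written in this way for all angles with one value of the dielectric constant signals a departure from the Fresnel description.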

2.4.2. Validity of the Fresnel equations, surface roughness and plasma

The Fresnel equations have been established for a perfectly smooth sur-
face. If the surface is slightly rough, the incident light produces scattered
waves in addition to the reflected and transmitted ones. The Fresnel equa-
tions can still be valid if the total intensity of the scattered waves is negligible.
This can occur for relatively large roughness if no resonance occurs.
Surface plasma oscillations can be dealt with as a special form of scattered
wave, and the breakdown of the Fresnel equations when the incident wave is
coupled with plasma oscillations is a very popular means of proving their
existence. JASPERSON and SCHNATTERLY [1969], ENDRIZ [1973] and DAUDE,
SAVARY and ROBIN [1972] plotted the reflectance of different metals
(Ag, Al, Mg) versus the photon energy and observed a dip at the plasma
resonance that increases with roughness. CALLCOTT and ARAKAWA [1974]
observed a similar dip in the plot of the reflectance of films of Li versus
the angle of incidence. They were able to fit the observed values of the
reflectance for all angles of incidence with one value of the complex index
of refraction only for $h\nu > 6$ eV. In both cases the breakdown of the Fresnel
equations was attributed to an effect of roughness that becomes especially
large when surface plasmons can be excited.
Arakawa and his coworkers gave special attention to the excitation of
surface plasmons in metallic gratings, which can explain the Wood anomaly.
It is beyond the scope of this paper to give a complete review of the
plasmon phenomena (see RITCHIE [1973]). We shall only note that plasmons
can be excited on a perfectly smooth surface when a metallic film is deposited
onto the face of a prism or semi-cylinder on which a beam of light is totally
reflected (OTTO [1968, 1970]). In that case the surface plasmon represents
the evanescent wave that is excited in the film by the incident wave. The
calculation of the frustrated total reflection by glass when it is coated with
a metallic film can be performed with the Fresnel equations. Of course in that
case too the Fresnel equations are only valid when the surfaces are smooth.


In the one-electron approximation, we can distinguish two effects :

1) Surface photoexcitation from bulk states (SPBS) due to the perturba-
tion of bulk states by the surface.
2) Photoexcitation from surface states.

2.5.1. Surface photoexcitation from bulk states (SPBS)

Each bulk Bloch function $\psi(r)$ that represents the electronic state of the
infinite solid must be corrected in a semi-infinite solid by a factor $\chi_k(r)$,

$$ \psi(r) = \chi_k(r)\, u_k(r)\, e^{ik\cdot r}, \qquad (2.28) $$

with
$\chi_k(r) = 0$ when r is outside the solid,
$\chi_k(r) = 1$ when r is inside the solid.

In a first approximation $\chi_k(r)$ may be assumed to be discontinuous at the
surface; in a better approximation $\chi_k(r)$ varies in a continuous manner
from 0 to 1 in a transition layer. The substitution of (2.28) in the expression
(2.14) of the matrix element implies that


is replaced by

$$ \mathcal{M}_{kk'} + \Delta\mathcal{M}_{kk'} \qquad (2.30) $$


A A =
~ <Xk(r)uk,(r)eik'.r(
~ ~ -
- b(r.) (V,Xk(r))luk(r)eik").

We assumed here that the gradient of the electron wave function is much
steeper than the gradient of the electric field. This assumption might not
be valid if the screening length is very short; to calculate the matrix element
the expression (2.13) should then be used instead of (2.14).
SCHAICH and ASCHCROFT [1971] noted that the surface contribution
cannot be separated from the volume contribution because the quadratic
character of the photoelectric response produces interference effects. The
difference between the arguments of $\mathcal{A}_{kk'}$ and $\Delta\mathcal{A}_{kk'}$ can take any value
between 0 and $2\pi$. But, if $\mathbf{k} = \mathbf{k}'$, $\Delta\mathcal{A}_{kk'}$ is generally much smaller than
$\mathcal{A}_{kk'}$, so that the surface perturbation does not significantly modify the
matrix elements associated with direct transitions. If $\mathbf{k} \neq \mathbf{k}'$, $\mathcal{A}_{kk'} = 0$ and
we can write
A&kkp = dh?kke.

Therefore, in a first approximation, we can justify the separation of
surface from volume made by ENDRIZ and SPICER [1971b], by writing

An important feature of the SPBS is that in a first approximation
$\nabla_r \chi_k(\mathbf{r})$ is a vector normal to the surface. We may then write
$$\boldsymbol{\mathcal{E}}(\mathbf{r}) \cdot (\nabla_r \chi_k(\mathbf{r})) = \mathcal{E}_n(\mathbf{r})\,|\nabla_r \chi_k(\mathbf{r})|, \tag{2.33}$$
where $\mathcal{E}_n(\mathbf{r})$ is the normal component of the complex amplitude of the
electric field $\boldsymbol{\mathcal{E}}(\mathbf{r})$. If the surface is perfectly plane, only an oblique p-polarized
wave can produce surface absorption. For normally incident light
or for s waves, only the roughness of the surface or the periodic variation
of the normal due to the disposition of the superficial atoms can produce
SPBS. This characteristic of SPBS has been well known since the early
theory of MITCHELL [1934].

2.5.2. Photoexcitation from surface states

In a semi-infinite solid we find surface states in addition to bulk Bloch
states. They were first introduced by TAMM [1932] and are characterized by
wave functions quite similar to (2.28), with a complex wave vector
$$\mathbf{k} = \mathbf{k}_R - i\mathbf{k}_I. \tag{2.34}$$
The imaginary part $\mathbf{k}_I$ is normal to the surface and directed towards the
interior of the solid. Intrinsic surface states are associated with dangling
bonds and are expected to disappear with surface contamination. Of course,
surface contamination can introduce impurity surface levels. DAVISON and
LEVINE [1970] reviewed the theoretical and experimental work on intrinsic
surface states. Recently, MONCH [1973] reviewed the special case of the Si
surface. Experimental evidence for surface states is most easily obtained
when they lie in the forbidden band of a semi-conductor e.g., through band
bending effects (ALLEN and GOBELI [1962]). But surface states have been
theoretically predicted at other energy levels in the band scheme both in
semi-conductors and metals (BORTOLANI, CALANDRA and KELLY [1973],
[1972]). We are concerned here with PE from surface states. It is generally
difficult to discriminate between PE from intrinsic surface states and other
levels that occupy the same position in the band scheme. No polarization
test is available and the best test is the dependence of the observed
phenomena on an exposure to very small amounts of gases.

2.5.3. Surface absorption, Fresnel equations and DAP

We have seen that collective motion of the electrons and the quantum
transitions between one-electron states can induce surface absorption. We
have also seen that surface absorption implies a breakdown of the CTCF
and, therefore, of the Fresnel equations (2.16) and (2.17). But this effect
is expected to be small and has always been neglected in the calculations
of the volume contribution to the DAP. We may hope to give an adequate
account of the surface effects by including the surface term $A\delta(z)$ in the


The absorption of one photon at $\mathbf{r}$ produces excited electrons distributed
in energy according to the function $\mathcal{P}(E, h\nu)$. To define completely the effect
of the absorbed photon we should also give the distribution of the wave
vectors of the excited electrons. Different interactions suffered by the
excited electron prevent it from being emitted with the energy $E$. Their
effect is characterized by the functions $p(E', E, \mathbf{r})$ and $p(\mathbf{r})$, that have been
defined in § 1.1. If the photo-excitation can be assumed to be isotropic,
$p(E', E, \mathbf{r})$ and $p(\mathbf{r})$ are independent of the angle of incidence and of the
state of polarization. They depend only on the distribution function
$\mathcal{P}(E, h\nu)$. Because of the influence of the distribution of the wave vectors,
the escape probability may be quite different for photoexcited electrons
and for electrons injected by other means. Most experimenters assume,
without discussion, that
$$p(\mathbf{r}) = p_0 \exp(-z/L), \tag{2.35}$$
where $p_0$ is the escape probability of an electron excited just at the surface and
$L$ is the escape depth or attenuation length. The fit of the experimental
data when (2.35) is substituted into (1.1) may be taken as a sufficient
justification of (2.35). Then $L$ appears as a phenomenological parameter
that measures the effective depth that contributes to the PE.
A more fundamental justification of (2.35) requires an examination of
the interactions involving an excited electron during its transport towards
the surface.

2.6.1. Coulomb repulsion between electrons

The Coulomb repulsion between electrons cannot be fully accounted
for, if the electronic state of the solid is represented as a set of independent
Bloch electrons. Because of this effect we must consider that an electron
in an excited state has a finite lifetime $\tau_e$. If no other process occurs to
deexcite a Bloch electron, the mean free path may be written as
$$l_e = v_g \tau_e, \tag{2.36}$$
where $v_g$ is the group velocity associated with the electron.
Two types of Coulomb scattering have been considered: those due to
interaction with one-electron states (electron-electron scattering) and
those arising from the excitation of plasmons.
a) Electron-electron scattering or electron-hole pair creation is the interaction
of the excited electron with a less energetic electron which is, with a
very high probability, under the Fermi level $E_F$. Because of this interaction
the photon energy is then shared between two excited electrons or, in other
terms, between two electron-hole pairs. The energy lost by the excited electron
of energy $E$ is most often of the order of $(E - E_F)/2$. Within
a few eV of the threshold one interaction generally brings both electrons
under the vacuum level $E_v$. For high energy photons (in the UV or X-ray
regions) both electrons may be brought above the vacuum level. This
possibility can be described from a phenomenological point of view by
$p(\mathbf{r})$ and $p(E', E, \mathbf{r})$, if we consider these functions as mean numbers of
emitted electrons per absorbed photon. $p(\mathbf{r})$ can, in principle, become larger
than unity, but many absorbed photons lead to no PE at all.
b ) Plasmon scattering results from the interaction of the excited electron
with a set of conduction electrons. A collective motion of the conduction
electrons is then induced. We can describe the interaction macroscopically,
in terms of the dielectric constant. This type of interaction becomes impor-
tant if a plasma resonance can be excited and the loss of energy is then
quantized into surface or volume plasmons. Plasmons are found in the
spectrum of characteristic energy losses when charged particles are sent
through thin films (see review by PRADAL, GOUT and FABRE [1965]).
QUINN [1962] calculated the mean free path of an excited electron of
energy $E$ and momentum $p$ in a free electron gas with respect to electron-electron
scattering ($l_{ee}$) and plasmon scattering ($l_{pe}$). He found a decrease
of $l_{ee}$ when $E$ increases in the low energy range. This result has been generalized
by BERGLUND and SPICER [1964]. They calculated the lifetime
$\tau(E)$ as a function of the density of states $\rho(E)$. They assumed that the
probability for an electron of energy $E$ to transfer an energy $\Delta E$ to another
electron of energy $E_0$ is proportional to
$$\rho(E_0 + \Delta E)\,\rho(E - \Delta E),$$

Fig. 1. Mean free path of an electron of energy E above the Fermi level with respect to
electron-electron interaction in gold, calculated by KROLIKOWSKI and SPICER [1969] (continuous
curve), and by SZE, MOLL and SUGANO [1964] (dotted curve). Experimental values of
the escape depth for photons of energy hν = E, obtained by: CROWELL, HOWARTH, LABATE
and SPITZER [1962], × SZE, MOLL and SUGANO [1964], ○ KATRICH and SARBEI [1961],
[1962] (non photoelectric method), * KANTER [1970] (non photoelectric method).

if both final levels $E_0 + \Delta E$ and $E - \Delta E$ are vacant, i.e., lie above the Fermi
energy $E_F$ (random-k approximation). Therefore
$$\frac{1}{\tau(E)} \propto \int_{2E_F - E}^{E_F} dE_0 \int_{E_F - E_0}^{E - E_F} d\Delta E\; \rho(E_0)\,\rho(E_0 + \Delta E)\,\rho(E - \Delta E). \tag{2.37}$$
Results of the calculation of $l(E)$ by KROLIKOWSKI and SPICER [1969] using
(2.37) are plotted together with experimental data (see § 3 or Fig. 1 for
gold). KANE [1967] developed the random-k approximation in the case of
Si and discussed its validity by comparing the results with Monte-Carlo
calculations, based on a more refined model.
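The random-k integral (2.37) can be explored numerically. The sketch below is only an illustration under stated assumptions: the function name `relative_scattering_rate`, the midpoint rule, and the flat density of states used in the example are choices made here and do not come from the cited papers. The integration limits are those implied by requiring both final levels to lie above $E_F$ and the partner electron to start below $E_F$.

```python
def relative_scattering_rate(E, E_F, rho, n=200):
    """Midpoint-rule estimate of the random-k double integral, i.e. a quantity
    proportional to 1/tau(E). Units are arbitrary.

    rho is a callable density of states (an assumption of this sketch).
    """
    if E <= E_F:
        return 0.0  # no vacant final states can be reached
    # Initial energies E0 of the partner electron: occupied (below E_F) and
    # high enough that a non-empty window of energy transfers exists.
    e0_min, e0_max = 2.0 * E_F - E, E_F
    h0 = (e0_max - e0_min) / n
    total = 0.0
    for i in range(n):
        E0 = e0_min + (i + 0.5) * h0
        # Energy transfers that leave both final levels above E_F.
        de_min, de_max = E_F - E0, E - E_F
        h1 = (de_max - de_min) / n
        inner = 0.0
        for j in range(n):
            dE = de_min + (j + 0.5) * h1
            inner += rho(E0 + dE) * rho(E - dE)
        total += rho(E0) * inner * h1 * h0
    return total


def flat_dos(energy):
    """Hypothetical constant density of states."""
    return 1.0


rate_2eV = relative_scattering_rate(7.0, 5.0, flat_dos)  # E - E_F = 2
rate_4eV = relative_scattering_rate(9.0, 5.0, flat_dos)  # E - E_F = 4
print(rate_2eV, rate_4eV)
```

For a constant density of states the double integral reduces to $(E - E_F)^2/2$, so the lifetime falls as $(E - E_F)^{-2}$, which is the familiar free-electron behaviour and a convenient check on the quadrature.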
Quinn found for plasmon scattering in a free electron metal


where $\hbar\omega_p$ is the plasmon energy, $p_F$ is the momentum of an electron at
the Fermi energy and $a_0$ is the Bohr radius. Fig. 2 represents the variations
of the mean free paths $l_{pe}$ and $l_{ee}$ with the electron energy $E - E_F$ above
the Fermi level as calculated by SMITH and FISCHER [1971] from eqs. (2.37)
and (2.38). Fig. 2 also represents the variations of the mean free path
$l = l_{pe} l_{ee}/(l_{pe} + l_{ee})$ that results from the combination of plasmon and
electron-electron scattering. The strong decrease of $l_{pe}$ occurs when $l_{ee}$ is
already very small and is not seen in $l$, but the situation might be quite
different in other metals. The experimental data from Smith and Fischer
will be discussed in section 3.1.
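The combination rule for the two mean free paths amounts to adding the scattering rates, that is, adding the inverse mean free paths. A minimal sketch follows; the function name and the numerical values are illustrative assumptions, not data from Fig. 2.

```python
def combined_mean_free_path(l_pe, l_ee):
    """Mean free path when plasmon and electron-electron scattering act together.

    Equivalent to 1/l = 1/l_pe + 1/l_ee; both inputs must be positive and
    expressed in the same length unit.
    """
    return l_pe * l_ee / (l_pe + l_ee)


# Hypothetical values in angstroms: the shorter path dominates the result.
print(combined_mean_free_path(100.0, 25.0))   # 20.0
print(combined_mean_free_path(1.0e6, 25.0))   # close to 25
```

The second call shows why a strong decrease of one mean free path is invisible in the combined value as long as the other is already much shorter.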
In a general discussion of the electron energy losses during PE, MAHAN
[1973] has shown that surface excitations, and especially surface plasmons,
may be important. FEIBELMAN [1973] investigated the validity of the
assumption of a mean free path independent of the distance $z$ to the surface.
He found for Auger electrons (of a few hundred eV) that near the
surface the increase of the mean free path with respect to the bulk plasmons
is partly compensated by the decrease of the mean free path with respect
to surface plasmons. But the generalization of such a result to low energy
electrons might not be possible.

2.6.2. Phonon scattering

The Bloch theorem has been proved in the approximation of fixed ions.
The interaction of electrons with lattice vibrations is represented by phonon-assisted
transitions of each electron from one Bloch state to another. The
energy lost by an electron in one interaction equals that of the created
phonon, typically a fraction of $k\Theta_D$, where $\Theta_D$ is the Debye temperature.
In principle the annihilation of a phonon that increases the energy of
the electron is possible but practically negligible in metals. For most
excited electrons $k\Theta_D$ is much smaller than the energy of the electron
above the Fermi level and many phonon interactions are necessary to
reduce the energy of an excited electron below the vacuum level. Since
the mean momentum exchanged between the electron and the phonon is
large and the energy loss small, the phonon interaction may often be approximated
as elastic.
The mean lifetime $\tau_p$ and the mean free path $l_p$ of the electron with respect
to phonon creation can be defined as for scattering by electrons. We may
write
$$l_p = v_g \tau_p.$$
$\tau_p$ is the main parameter that determines the conductivity of metals. From
conductivity measurements one finds that $l_p$ lies in the range of a few
100 Å. But the energy ranges of the excited electrons are quite different in
PE and in conductivity. We may also note that the conductivity depends
on exchanges of momentum whereas the PE depends mainly on exchanges
of energy.
In semiconductors, if the difference between the energy of the excited
electron and the bottom of the conduction band is not larger than the band
gap, both electron-electron and plasmon scattering are forbidden. In this
case phonon-scattering can produce a thermalization of the excited electrons
at the bottom of a conduction band. After a sufficient number of phonon
interactions, the excited electrons, which are trapped in the conduction
band, attain an equilibrium distribution when the number of created
phonons equals the number of annihilated phonons. The probability of
occupation of an energy level E of the conduction band is then proportional
to exp (- E/kT).
The total number of electrons in the conduction band is determined by
the transitions that bring an electron into and out of the conduction band
(photon absorption and recombination with holes). This process is very
important in the negative electron affinity (NEA) photocathodes, where
the vacuum level $E_v$ lies below the bottom of the conduction band, because
every thermalized electron can be emitted.

2.6.3. Electron-hole recombination

In semi-conductors an electron can recombine with a hole with, or
without, radiation. The lifetime $\tau_h$ of the excited electron with respect to
such a process is generally several orders of magnitude greater than $\tau_e$
and $\tau_p$. But in NEA photocathodes the other interactions are completely
forbidden by the gap and the electron-hole recombination is the only
process that affects electrons trapped at the bottom of the conduction band.

2.6.4. Transmission by the surface

When an electron reaches the surface in a Bloch state, defined by an
energy $E'$ and a wave vector $\mathbf{k}'$, it has a probability $p_{E'}(\mathbf{k}')$ of escaping.
$E'$ and $\mathbf{k}'$ are generally different from the initial energy and wave vector
of the electron after excitation. $p_{E'}(\mathbf{k}')$ can be easily calculated only in the
case of a free electron gas bounded by an ideal surface potential barrier.
If $\mathbf{k}'$ is taken in the appropriate Brillouin zone, the electron momentum
is $\hbar\mathbf{k}'$ and we may write


The z components of the wave vector inside ($k_z^{i}$) and outside ($k_z^{o}$) the solid
are related to the work function $W_a$ by

The transmission probability (2.39) may be expressed as a function of the
angle of incidence $\theta'$ of the excited electron at the surface and of the limit
angle $\theta_0$ given by
$$W_k \cos^2\theta_0 = W_a, \tag{2.41}$$
where $W_k$ is the kinetic energy of the electron. The transmission probability
is given by
$$p_{E'}(\theta') = \frac{4\cos\theta'\,(\cos^2\theta' - \cos^2\theta_0)^{1/2}}{[\cos\theta' + (\cos^2\theta' - \cos^2\theta_0)^{1/2}]^2} \quad \text{when } \theta' < \theta_0,$$
$$p_{E'}(\theta') = 0 \quad \text{when } \theta' > \theta_0. \tag{2.42}$$

We may try to extend the expressions (2.41) and (2.42) to the case of
Bloch electrons; $\theta'$ then represents the angle of incidence of the group
velocity and $W_k$ the energy of the electron above the bottom of the conduction
band:
$$W_k = E - E_c, \tag{2.43}$$
$$W_a = E_v - E_c. \tag{2.44}$$

We may note that $\theta_0$ is smaller in metals than in semi-conductors. This
extension is not valid for NEA photocathodes. At best we could give
$\theta_0 = \pi/2$ and $p_{E'}(\theta') = 1$. Nevertheless the result may be qualitatively correct
and contribute to the high yield of NEA photocathodes.
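The escape cone and the barrier transmission for the free-electron case can be sketched numerically. The code below is a hedged illustration: it assumes the textbook quantum-mechanical transmission of a potential step, written in terms of the angle of incidence and a limit angle defined by $W_k \cos^2\theta_0 = W_a$. The names `limit_angle`, `transmission`, `W_k` and `W_a` are labels chosen here.

```python
import math


def limit_angle(W_k, W_a):
    """Half-angle of the escape cone from W_k * cos^2(theta_0) = W_a.

    W_k: kinetic energy inside the solid; W_a: barrier height (same units).
    Returns 0 if the electron cannot escape, pi/2 if there is no barrier.
    """
    if W_a <= 0.0:
        return math.pi / 2.0
    if W_k <= W_a:
        return 0.0
    return math.acos(math.sqrt(W_a / W_k))


def transmission(theta, theta_0):
    """Step-barrier transmission probability (textbook form, assumed here)."""
    if theta >= theta_0:
        return 0.0  # outside the escape cone: total reflection
    c = math.cos(theta)
    # Quantity proportional to the normal wave vector outside the solid;
    # max() guards against tiny negative values from rounding.
    s = math.sqrt(max(c * c - math.cos(theta_0) ** 2, 0.0))
    return 4.0 * c * s / (c + s) ** 2


theta_0 = limit_angle(4.0, 1.0)      # hypothetical energies
print(theta_0, transmission(0.0, theta_0), transmission(1.2, theta_0))
```

With no barrier the limit angle becomes $\pi/2$ and the transmission tends to unity for every direction, which corresponds to the limiting case mentioned for NEA photocathodes.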
Most authors merely define the escape probability $p_0$ for an electron
excited at the surface by taking $z = 0$ in $p(\mathbf{r})$. $p_0$ then becomes an empirical
parameter which can include the effect of an activating surface layer. This
is done in expression (2.35).
Let us recall here that if surface photo-excitation occurs, the corresponding
escape probability $p_s$, that has been defined in section 1.1, may be
different from $p_0$.


The escape probability $p(\mathbf{r})$ results from the combination of several
scattering processes and of the transmission through the surface. The general
calculation is quite complicated and cannot lead to general formulae
except in the simplest approximations.

2.7.1. Electron-electron interaction and the ballistic approximation

In many cases the electron-electron interaction is the dominant scattering
process ($l_e \ll l_p$, e.g. in metals). For small photon energy the energy loss
of the excited electron in one scattering event brings it under the vacuum
level $E_v$. If the expressions (2.41) to (2.44) are valid, the escape probability
for an electron that has been excited in a state of energy $E$ with a group
velocity making an angle $\theta$ with the normal to the surface is

$$p_E(\theta, \mathbf{r}) = p_E(\theta) \exp[-z/l(E)\cos\theta], \tag{2.45}$$

where $p_E(\theta)$ is given by (2.42) and $l(E)$ is the mean free path for an electron
of energy $E$. For an isotropic excitation $p_E(\mathbf{r})$ may be written for an electron
of energy $E$ as


At least in metals the limit angle $\theta_0$ defined by eqs. (2.41), (2.43) and (2.44)
is small for small photon energies and we may write,



The escape probability increases sharply with $E$ because of $\theta_0$. Therefore
most of the emitted electrons have been excited from quite near to the
Fermi level. If the mean free path $l(E)$ does not vary too sharply with $E$,
we can approximate (2.48) by

$$p(z) = p_0 \exp[-z/l(E_F + h\nu)]. \tag{2.49}$$

The escape depth $L$ that appears in (2.35) is, therefore, approximately the
mean free path for an electron excited from the Fermi level. $p_0$ depends
only on the photon energy and can be written as


In the assumption of a slow variation of $l$ with the electron energy, CROWELL,
SPITZER, HOWARTH and LABATE [1962] substitute for the expression (2.49)
L = ICE, + hv, +$hv - hv,)], (2.51)
where $h\nu_0$ is the photoelectric threshold.
KROLIKOWSKI and SPICER [1969] calculated how the observable PED
$n(E, h\nu)$ depends on the scattering processes in the ballistic approximation.
In this case the scattering has only subtractive effects on the initial distribution
of the electrons $\mathcal{P}(E, h\nu)$. These authors approximated the expression
(2.42) by
$p_E(\theta) = 1$ when $\theta < \theta_0$,
$p_E(\theta) = 0$ when $\theta > \theta_0$,
and integrated the contribution of the different layers to the photoelectric
current. They assumed a normal incidence illumination with an absorption
coefficient $\alpha$. They described the escape probability by the expression
(2.45). After integrating over $\theta$ they found that


$$f(\theta_0, \alpha, E) = \frac{1}{2}\left[1 - \cos\theta_0 - \frac{1}{\alpha l(E)} \ln\frac{1 + \alpha l(E)}{1 + \alpha l(E)\cos\theta_0}\right]. \tag{2.53}$$

Here $\theta_0$ depends on the energy $E$ and is given by (2.41). To apply the expression
(2.53) KROLIKOWSKI and SPICER use the expression (2.37).
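The layer and angle integration described above can be checked numerically. The sketch assumes an isotropic excitation, an escape cone of half-angle $\theta_0$ with unit transmission inside it, an exponential absorption profile $\alpha e^{-\alpha z}$, and an attenuation $\exp[-z/(l\cos\theta)]$ along the path to the surface. Under these assumptions the depth integral can be done analytically for each direction, and the remaining angular integral has the closed form coded in `escape_fraction_closed`. All function names are illustrative.

```python
import math


def escape_fraction_closed(theta_0, alpha, l):
    """Closed form of the escaping fraction under the stated assumptions."""
    c0 = math.cos(theta_0)
    al = alpha * l
    return 0.5 * (1.0 - c0 - math.log((1.0 + al) / (1.0 + al * c0)) / al)


def escape_fraction_numeric(theta_0, alpha, l, n=2000):
    """Midpoint rule over c = cos(theta) inside the escape cone.

    For each direction the depth integral of
    alpha*exp(-alpha*z) * exp(-z/(l*c)) from 0 to infinity equals
    alpha*l*c / (1 + alpha*l*c); the factor 1/2 is the isotropic weight.
    """
    c0 = math.cos(theta_0)
    h = (1.0 - c0) / n
    total = 0.0
    for i in range(n):
        c = c0 + (i + 0.5) * h
        total += alpha * l * c / (1.0 + alpha * l * c)
    return 0.5 * total * h


# Hypothetical values: theta_0 = 0.5 rad, alpha = 0.05 per nm, l = 30 nm.
print(escape_fraction_closed(0.5, 0.05, 30.0))
print(escape_fraction_numeric(0.5, 0.05, 30.0))
```

When $\alpha l$ is very large all photons are absorbed within a mean free path of the surface and the fraction tends to the solid-angle factor $(1 - \cos\theta_0)/2$.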
The ballistic approximation breaks down when secondary electrons keep
enough energy to get out of the solid. KANE [1967] performed a numerical
calculation of the energy distribution of secondary electrons in Si, after
scattering by pair creation. SMITH and SPICER [1969] attributed a structure
observed in the energy distribution of photoelectrons excited in alkali
metals by 10.2 eV photons to electrons that have lost energy by plasmon

2.7.2. Diffusion equation and electron-hole recombination in negative electron affinity (NEA) photocathodes
At the opposite extreme from the ballistic approximation (section 2.7.1) the mean
free path may be larger for electron scattering than for phonon scattering.
Before being brought under the vacuum level, the excited electron is then
subjected to a large number of phonon scattering events. In each event it
loses much of its momentum but very little of its energy. In NEA photocathodes
the transport of the excited electrons can be described as elastic
scattering of the electrons by phonons. The excited electrons are quickly
thermalized into the bottom of the conduction band (see section 2.6.2)
and their lifetime $\tau_h$ in that state is limited only by the electron-hole recombination,
because the average energy gain by phonon annihilation
exactly balances the average loss by phonon creation. The thermalization
time can be neglected with respect to $\tau_h$. The density $f(z)$ of excited electrons
at the depth $z$ obeys the diffusion equation (JAMES and MOLL [1965]),

$$-\mathcal{D}\frac{d^2 f}{dz^2} + \frac{f}{\tau_h} = \beta_0 D(z). \tag{2.54}$$
Here $\beta_0 D(z)$ is the density of electrons excited per unit time and $\mathcal{D}$ is the
diffusion coefficient.
The density of photoelectric current is given by the current at $z = 0$,
$$I = -\mathcal{D}e\left(\frac{df}{dz}\right)_{z=0}. \tag{2.55}$$
The surface properties of the cathode define boundary conditions that we
can describe by an escape probability $p_0$. The surface treatment necessary
to obtain the NEA is the deposition of an activating film in which inelastic
scattering occurs and reduces $p_0$.
In a semi-infinite solid we can deduce from (2.54) and (2.55) that

$$Y = \int_0^\infty p_0 \exp(-z/L_d)\,\beta_0 D(z)\,dz, \tag{2.56}$$

where $L_d$ is the diffusion length

$$L_d = (\tau_h \mathcal{D})^{1/2}. \tag{2.57}$$
Eq. (2.56) can also be obtained by substituting from (2.35) into (1.1) and
the diffusion length can be identified with the escape depth.
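The identification of the diffusion length with the escape depth can be illustrated with a small numerical check. The sketch below makes several assumptions that are not in the text: a generation profile $\alpha e^{-\alpha z}$ normalized to one absorbed photon, a perfectly absorbing surface ($f = 0$ at $z = 0$, which corresponds to $p_0 = 1$), a semi-infinite solid, and dimensionless units. It compares the diffusion flux at the surface with the integral of the generation profile weighted by $\exp(-z/L_d)$.

```python
import math


def diffusion_length(tau_h, D):
    """L_d = sqrt(tau_h * D)."""
    return math.sqrt(tau_h * D)


def density(z, alpha, tau_h, D):
    """Steady-state solution of -D f'' + f/tau_h = alpha*exp(-alpha*z)
    with f(0) = 0 and f -> 0 at large depth (requires alpha*L_d != 1)."""
    L = diffusion_length(tau_h, D)
    C = alpha * L ** 2 / (D * (1.0 - (alpha * L) ** 2))
    return C * (math.exp(-alpha * z) - math.exp(-z / L))


def surface_flux(alpha, tau_h, D, h=1e-6):
    """Flux D * df/dz at z = 0, by a forward difference."""
    return D * (density(h, alpha, tau_h, D) - density(0.0, alpha, tau_h, D)) / h


def escape_integral(alpha, L, n=20000):
    """Midpoint-rule value of the integral of alpha*exp(-alpha*z)*exp(-z/L)."""
    z_max = 40.0 * L  # far enough for the integrand to vanish
    h = z_max / n
    return sum(alpha * math.exp(-(alpha + 1.0 / L) * (i + 0.5) * h)
               for i in range(n)) * h


alpha, tau_h, D = 1.0, 4.0, 1.0      # hypothetical dimensionless values
L_d = diffusion_length(tau_h, D)
print(surface_flux(alpha, tau_h, D), escape_integral(alpha, L_d),
      alpha * L_d / (1.0 + alpha * L_d))
```

All three printed numbers agree with $\alpha L_d/(1 + \alpha L_d)$, which is what an exponential escape probability with escape depth $L_d$ gives for this profile.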
In a thin film of thickness $z_0$ we can deduce from (2.54) that

$$Y = \int_0^{z_0} [A \exp(-z/L_d) + B \exp(+z/L_d)]\,\beta_0 D(z)\,dz. \tag{2.58}$$

Here $A$ and $B$ are constants which can be deduced from boundary conditions
at the interfaces of the film. The quantity in brackets can be identified with
the escape probability $p(\mathbf{r})$ in the expression (1.1).
We can consider that the terms $A \exp(-z/L_d)$ and $B \exp(+z/L_d)$ result
from successive back and forth diffusions in the film with an escape probability
$p_0$ at the surface and an absorption probability $p_A$ at the substrate
interface and we can write

2.7.3. De-excitation of photoelectrons by phonon scattering only

In semi-conductors, when the NEA is not achieved and when the pair
creation is forbidden by a band gap larger than the electron affinity, the
de-excitation of the photoelectrons results from successive small energy
losses in phonon scattering events. The age theory derived by Fermi to
calculate the neutron transport in nuclear reactors was applied by HEBB
[1951], LYE and DEKKER [1957] to the transport of secondary electrons.
They assumed that phonon creation was the only energy loss process for
the electrons and that an electron of energy $E$ above the bottom of the
conduction band $E_c$ loses a fraction $\xi$ of $E - E_c$ in each phonon creation.
They found for the escape probability
$$p(z) = p_0\,\mathrm{Erfc}(z/2\tau_F^{1/2}), \tag{2.60}$$
where
$$\mathrm{Erfc}(y) = \frac{2}{\sqrt{\pi}} \int_y^\infty \exp(-x^2)\,dx, \tag{2.61}$$

and $\tau_F$ is the age of the electron after excitation,


$\mathcal{D}$ is the diffusion coefficient due to phonon scattering. $l_p$ is the mean free
path for phonon scattering. Figure 3 represents the functions Erfc(x) and
exp(-2x). Within the standard accuracy of photoelectric data these functions
cannot be distinguished from one another and the expression (2.60) can
be approximated by (2.35) with

$$L = \tau_F^{1/2}. \tag{2.63}$$

Fig. 3. Comparison of the functions Erfc(x) (continuous curve) and exp(-2x) (dashed curve).
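The comparison of Fig. 3 is easy to reproduce. The sketch below tabulates the complementary error function and exp(-2x) and reports their largest absolute difference on a grid; the grid, the range and the function name are arbitrary choices made here.

```python
import math


def max_abs_difference(x_max=3.0, n=3000):
    """Largest |Erfc(x) - exp(-2x)| on a uniform grid over [0, x_max]."""
    return max(abs(math.erfc(i * x_max / n) - math.exp(-2.0 * i * x_max / n))
               for i in range(n + 1))


for x in (0.0, 0.25, 0.5, 1.0, 1.5, 2.0):
    print(f"{x:4.2f}  Erfc = {math.erfc(x):.4f}  exp(-2x) = {math.exp(-2.0 * x):.4f}")
print("largest difference:", max_abs_difference())
```

Both functions start at 1 and decay on the same scale. On this grid the largest gap is a little over 0.12 of the surface value and occurs near x = 0.35, so whether the two forms can be separated depends on the accuracy of the yield data at small depths.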

2.7.4. De-excitation of photoelectrons by both phonon and electron-electron scattering

BARTELINK, MOLL and MEYER [1963] introduced in the Fermi age
equations an absorption term that can describe the electron-electron
interaction. When the energy loss in phonon scattering becomes negligible,
they could approximate the escape probability by the equation (2.35) with
the escape depth,


KANE [1966] calculated, in a one-dimensional model, the rate of phonon
creation and electron-hole pair creation as a function of the mean free
paths $l_p$ and $l_e$. He also calculated the phonon energy loss distribution that
distorts the observed PED from $\mathcal{P}(E, h\nu)$. DUCKETT [1968] showed on a
random-walk model that Kane had well approximated the 3-dimensional
case. LANGRETH [1971] derived results quite similar to Kane's in the modern
formalism of Green functions.
BALLANTYNE [1972] included the energy losses due to phonon scattering
and the variation with the photon energy $h\nu$ of the complex dielectric constant
$\varepsilon_1(h\nu) - i\varepsilon_2(h\nu)$ in the expression of the photoyield near threshold.
He could thus fit the experimental results in a wider spectral range than
with the expressions of KANE [1962]. Insofar as the mean free path $l(E)$
is a slowly varying function of the electron energy $E$, the pair production
scattering determines the overall number of emitted electrons but does not
affect the form of the spectral yield nor the PED near the threshold. For
a non negligible energy loss $E_{ph}$ in phonon scattering, he found that if
the final states at threshold are not at an extremum in the conduction band

$$Y \propto (h\nu - h\nu_0)^3; \tag{2.65}$$

and that if $E_{ph}$ is negligible, then

$$Y \propto (h\nu - h\nu_0)^2. \tag{2.66}$$

The expressions are the same for direct and indirect transitions; only the
value of the threshold $h\nu_0$ is different. The expression (2.65) gave quite
good results for semi-conductors.
In the most general case a simple explicit form of the escape probability
cannot be given. STUART, WOOTEN and SPICER [1964] and STUART and
WOOTEN [1967] performed numerical calculations for various values of
$E_{ph}$, $l_e$ and $l_p$. The results roughly agree with simple models.

2.7.5. The escape probability and analysis of the experimental data.

The simple expression (2.35) has been justified in many simple cases and
we may expect that it provides a reasonable basis for the analysis of the
experimental data. The expression (2.60) is probably as good, but the
integration of (1.1) can only be made by numerical methods if (2.60) is
substituted into (1.1). For this reason nearly all authors have used (2.35)
with the exception of HOFFMANN and DEUTSCHER [1970] and HIRSCHBERG
and DEUTSCHER [1968].


In the 3-step model, the breakdown of the independent electron
approximation has been introduced as scattering processes that determine
the transport properties. Many-body effects have also been introduced
into the determination of the photoexcitation probability $\mathcal{P}$ as photon
absorption processes that compete with the production of one-electron
excitation. We have seen that the decay of the various excitations of the
solid into one another may bring about difficulties in the interpretation
of $\mathcal{P}(E, h\nu)$. The transition probabilities given by the expressions (2.11),
(2.12), (2.13) have been derived within the one-electron approximation.
The k conservation selection rule in the volume effect is a direct consequence
of (2.13). Such a treatment may not be sufficient and we need to introduce
also the many-body effects in the calculation of the transition probability
$\mathcal{P}_{kk'}$. The indirect transitions with phonon creation may be considered
as a transition between two Bloch states of different wave vectors that is
allowed by a subsequent scattering of the electron in the final state by the
lattice. In a more general way, we may associate with each scattering
process of the excited electron a perturbation of the transition probability
$\mathcal{P}_{kk'}$. The perturbation may be an increase or a decrease only when the
transition probability is not zero without scattering, e.g., for direct transitions.
In most cases the perturbation merely adds a new term to the absorption.
Indirect transitions with phonon creation have a quite low probability
in comparison to direct transitions and have been observed by optical
means only in the spectral range where no direct transition can occur.
Because the mean free path of the electrons for the electron interaction is
often much shorter than for the phonon interaction we may expect in the
first case a much stronger perturbation that, quite strangely, is seldom
taken into account.
A semi-empirical way of taking into account the dependence of photon
absorption on the scattering processes is to introduce damping in the
Bloch waves that are used to calculate the matrix element (2.13). We may
attribute to the initial state $\psi_k$ a quasi-infinite lifetime and a finite lifetime
$\tau$ to the final state. The theory of Weisskopf and Wigner, quoted by DAVYDOFF
[1965], when adapted to our notations gives, instead of (2.12),


The expression (2.67) can be very useful for comparing the results of optical
or PE measurements with the information deduced from escape depth
measurements. It must be applied with great caution when several scattering
events follow one another; $\tau$ must be taken as some sort of an empirical
parameter that represents the time necessary for a complete thermalization
of the absorbed photon energy.
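To connect a final-state lifetime with escape-depth data it helps to see the orders of magnitude involved. The sketch below does not reproduce expression (2.67); it only uses two textbook relations taken here as assumptions: a lifetime $\tau = l/v_g$ obtained from a mean free path and a group velocity, and an energy width $\Gamma = \hbar/\tau$ with a normalized Lorentzian line shape. Function names and the numerical inputs are hypothetical.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV s


def lifetime_from_mean_free_path(l_m, v_g_m_per_s):
    """tau = l / v_g, with l in metres and v_g in metres per second."""
    return l_m / v_g_m_per_s


def lifetime_width_eV(tau_s):
    """Energy width Gamma = hbar / tau, in eV."""
    return HBAR_EV_S / tau_s


def lorentzian(delta_E, gamma):
    """Normalized Lorentzian of full width at half maximum gamma."""
    return (gamma / (2.0 * math.pi)) / (delta_E ** 2 + (gamma / 2.0) ** 2)


# Hypothetical electron: mean free path 20 angstroms, group velocity 1e6 m/s.
tau = lifetime_from_mean_free_path(20e-10, 1.0e6)
gamma = lifetime_width_eV(tau)
print(tau, gamma, lorentzian(0.0, gamma))
```

With these illustrative inputs the lifetime is 2 fs and the width is about 0.33 eV, which shows that mean free paths of a few tens of angstroms correspond to broadenings that are not small on the scale of band-structure features.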
In the dynamical theory of electron diffraction (DEDERICHS [1972])
damping of the electron wave is often introduced by substituting for the
classical Bloch wave (2.9) the expression

$$\psi_k = u_k(\mathbf{r}) \exp(i\mathbf{k}\cdot\mathbf{r}) \exp(-\mathbf{k}_I\cdot\mathbf{r}). \tag{2.68}$$

The addition of an imaginary component $i\mathbf{k}_I$ to the wave vector $\mathbf{k}$ is equivalent,
in principle, to the introduction of a finite lifetime, if $\mathbf{k}_I$ has the same
direction as the group velocity associated with the $\mathbf{k}$ state and if


where $l$ is the mean free path of the Bloch electron.

If we substitute (2.68) into the expression (2.13) or (2.14), $|\mathcal{A}_{kk'}|$ is no
longer zero for $\mathbf{k} \neq \mathbf{k}'$. We can divide $\mathbf{k}$ and $\mathbf{k}'$ into two components respectively
parallel and perpendicular to $\mathbf{k}_I$,
k = k,+kN,
k = ki+kh.

The k conservation rule that could be expressed in the approximation
(2.13) by
$$\mathcal{A}_{kk'} = A(\mathbf{k})\,\delta(\mathbf{k} - \mathbf{k}') \tag{2.70}$$


Based on the same principle as the Wigner and Weisskopf expression
(2.67), the expression (2.71) suffers the same limitations because of multiple
scattering. We can nevertheless use it to predict a correlation between the
value of the escape depth of the electrons and the importance of the non-direct
transitions that were first introduced by Spicer. We note here that $|\mathcal{A}_{kk'}|^2$
as deduced from (2.71) has a tail of non negligible values for quite high
values of $|\mathbf{k}_N - \mathbf{k}'_N|$.
Another way of breaking the validity of the k conservation rule is to
assume an interaction of the excited electron with the hole created below
the Fermi level by its excitation (SPICER [1967], DONIACH [1970]).
A rigorous treatment of the problems of light absorption requires the
determination of independent excitations of infinite lifetime that presently
are far from being known in the most general case.


In the microscopic interpretation of the PE process the different steps
certainly depend on one another and the division into steps appeared to
several authors as artificial. For that reason, several formalisms have been
derived to treat PE as a one-step process.
MAKINSON [1949] developed a theory of surface PE as a one-step process.
In order to include in the expression of the photo-yield a factor representing
the transmission of the electrons, Makinson introduced the coupling of
the electrons of the solid with the sets of electron waves obtained by
associating with each incoming wave the reflected and transmitted ones.
In more recent work the set of waves associated with each incoming wave
has been completed by the diffracted waves and its coupling with the
electrons of the solid has appeared as a basic element of the one-step
theories of PE.
ADAWI [1964] and MAHAN [1970] have introduced the methods of
scattering theory for the surface effect and the volume effect respectively.
They treated the PE as inelastic scattering of the electrons of the solid by
photons. Mahan used the asymptotic form of the Green functions to calculate
the photoelectric current $dI$ emitted within a solid angle $d\Omega$. He
obtained the result


The integral is extended over the range of wave vectors $\mathbf{k}_i$ associated
with occupied states $\phi_i$ in the solid,
$\mathbf{p}$ is the momentum of the emitted electron,
$\phi$ is a wave function that represents an incoming electron wave with
a momentum $\mathbf{p}$ directed within $d\Omega$, along with the reflected, diffracted
and transmitted waves,
$\langle\phi|\mathcal{T}|\phi_i\rangle$ is the element of the T matrix that couples the initial state
with the state $\phi$.
The operator $\mathcal{T}$ is given by the Lippmann-Schwinger equation. If $V$ is
the operator that represents all interactions that have been neglected in
describing the electron states by stationary wave functions $\phi_i$ and $\phi$,
we may write
$$\mathcal{T} = V + V G_0 \mathcal{T}, \tag{2.73}$$

where $G_0$ represents the Green operator. In practical calculation, $\mathcal{T}$ must
be expanded,

$$\mathcal{T} = V + V G_0 V + V G_0 V G_0 V + \ldots. \tag{2.74}$$

Each term represents a scattering of definite multiplicity.
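The structure of the Lippmann-Schwinger equation and of its expansion can be illustrated with a toy model in which $V$ and $G_0$ are plain numbers instead of operators. This is only a sketch of the iteration, not a photoemission calculation; the names `born_series` and `exact_t` and the numerical values are assumptions made here.

```python
def born_series(V, G0, order):
    """Partial sum T = V + V*G0*V + ... truncated after `order` extra scatterings.

    order = 0 keeps single scattering only, order = 1 adds double scattering,
    and so on.
    """
    term = V
    total = V
    for _ in range(order):
        term = V * G0 * term   # one more scattering event
        total += term
    return total


def exact_t(V, G0):
    """Closed-form solution of T = V + V*G0*T for scalars (V*G0 != 1)."""
    return V / (1.0 - V * G0)


V, G0 = 0.3, 1.5   # hypothetical values with |V*G0| < 1, so the series converges
for order in (0, 1, 2, 5, 60):
    print(order, born_series(V, G0, order))
print("exact:", exact_t(V, G0))
```

The partial sums approach the closed-form value geometrically, with ratio $V G_0$; when that product is not small, many orders of multiple scattering are needed, which is one reason truncated expansions must be used with care.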


Mahan developed his calculation mainly for $\phi_i$ representing free electrons
but the expression (2.72) is still valid when $\phi_i$ represents Bloch electrons.
Let us note here that we may include in the double scattering term
of (2.74) double scattering by the electromagnetic field, and thus include
two-photon photoemission.
The same T-matrix elements appear in the Mahan theory of PE and in
the theory of LEED and Auger emission. In the latter cases the calculations
are much more advanced than in PE (see for instance TONG, RHODIN and
TAIT [1973]) and could probably be used in PE. In recent years, other one-step
formalisms have been developed by SUTTON [1970], SCHAICH and
ASCHCROFT [1970], THORNBER [1971], HERMEKING [1972, 1973], TZOAR
more rigorous principles than the 3-step model, they have been able to
predict new structures in the PE of X-rays (NOZIERES, DEDOMINICIS [1969]).
But up to now no one-step theory of PE has been able to include in a proper
manner the effects of electron-electron scattering. In the papers of MAHAN
[1970] and SCHAICH and ASCHCROFT [1970] electron scattering is introduced
just like in the 3-step model. In any case the functions that appear in the
3-step model may be used at least as phenomenological parameters to
interpret the experimental data. We may hope that advances in the one-step
theory will afford a better link between the functions that can be deduced
from experimental data and the microscopic processes.

§ 3. Experimental Determination of the Escape Depth of the Photoelectrons

In this section we shall assume that surface absorption is negligible and
analyse the photoelectric data on the basis of expression (1.1). The
consistency of our results will provide evidence of the validity of our
assumptions. In the next section we will discuss the evidence for surface PE.
The DAP $D(z)$ can be calculated for thick films or bulk emitters by
means of the simple expression (2.18) when the index of refraction is
known from preliminary optical data. For thin films the more complicated
expressions (2.20) and (2.22), which take into account the interference
between the beams obtained by multiple reflections on both faces, are
required. The thickness of the film must have been deduced from optical or
any other physical data. For a review of optical measurements see VERNIER
If the escape probability is assumed to have the form (2.35), the
measurement of one photoelectric yield Y gives one equation between $\beta_0$, $p_0$, L
and known quantities, such as the angle of incidence $i_1$, the index of
refraction $\nu - i\kappa$, and the thickness $z_0$,

$$Y = \int_0^{z_0} D(z)\,\beta_0\,p_0 \exp(-z/L)\,dz = Y(\nu, \kappa, z_0, \beta_0, p_0, L). \tag{3.1}$$

The integration of (3.1) when D(z) is given by (2.20) or (2.22) was performed
CORNAZ [1971], PEPPER [1970].


An estimate of L is possible with one equation (3.1) if $\beta_0 p_0$ can be deduced
from theoretical considerations. Thus, SMITH and FISCHER [1971] obtained
the mean free path of the unscattered photoelectrons produced in Cs and
Rb by photons in the range 2-11 eV. They separated by energy analysis
the unscattered electrons, which are distributed in a quite narrow peak about
the mean energy $E - E_0$ above the level $E_0$ of the bottom of the conduction
band. They approximated the average transmission of the surface for
electrons of energy E by assuming that every electron with a wave vector
within the escape cone is transmitted, and found

$$p_0(E) = \frac{1 - \cos\theta_0}{2}, \tag{3.2}$$

where $\theta_0$ is given by (2.41).
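The escape-cone estimate rests on a purely geometrical fact: for an isotropic distribution of directions, the fraction lying within a cone of half-angle theta is (1 - cos theta)/2. The following sketch checks this against a simple Monte Carlo count. The function names, the sample size and the value of the half-angle are assumptions chosen for illustration; the actual cone angle of the text comes from (2.41), which is not reproduced here.

```python
import math
import random

def cone_fraction(theta):
    """Fraction of isotropic directions inside a cone of half-angle theta (radians)."""
    return (1.0 - math.cos(theta)) / 2.0

def cone_fraction_mc(theta, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the same fraction."""
    rng = random.Random(seed)
    cos_limit = math.cos(theta)
    inside = 0
    for _ in range(n_samples):
        # For isotropic directions the cosine of the polar angle
        # is uniformly distributed between -1 and 1.
        if rng.uniform(-1.0, 1.0) >= cos_limit:
            inside += 1
    return inside / n_samples

theta = 0.6  # hypothetical escape-cone half-angle in radians
print(cone_fraction(theta), cone_fraction_mc(theta))
```

The fraction counts all directions, including those pointing away from the surface, so a half-angle of 90 degrees gives one half.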

In the spectral range investigated by Smith and Fischer (2 < $h\nu$ < 11.2 eV,
4 < $E - E_0$ < 13 eV), the mean free path is very small for Cs ($l \approx 1.5$ Å).
This value is in good agreement with qualitative estimates obtained by
other methods (MAYER [1961], VERNIER, M. PAUTY and F. PAUTY [1969]).
The theoretical predictions of QUINN [1962] and THOMAS [1957], based
on deexcitation by plasmons and by hole-electron pair creation, give values
of the mean free path somewhat larger, but also very small (Fig. 2). For
such small values of the mean free path surface phenomena could be very
important, but the estimate of Smith and Fischer provides a very useful
order of magnitude. We may note that the escape depth L of the
photoelectrons might be appreciably larger than the mean free path $l$ for the
largest photon energy when scattered electrons can escape. KROLIKOWSKI
and SPICER [1969, 1970] estimated the mean free path of photoelectrons
in Cu ($l = 22$ Å for electrons 8.6 eV above the Fermi level) and in Au (Fig. 1)
by a less direct method that is based on the same principle. We have seen
(section 2.6.1) how the variation of the mean free path $l(E)$ of an electron
with its energy E is related to the optical density of states (ODS). We also have
seen (section 2.7.1) that the observed photoelectron energy distribution
(PED) depends on both $l(E)$ and the ODS. Krolikowski and Spicer obtained
the energy dependence of the mean free path within a constant factor, by
fitting it with all PED data. They obtained the constant factor from one
absolute yield for $h\nu = 8.6$ eV by assuming the escape conditions described
in section 2.7.1.
GESELL and ARAKAWA [1971] deduced the attenuation length of
unscattered photoelectrons in Al and Mg by two methods similar to that of
Smith et al. and that of Krolikowski et al. respectively. They also used the
variation of the yield with the angle of incidence of light (see section 3.5).
They found a steep decrease of the attenuation length from several hundreds
of ångströms for electrons of 5.6 eV above the Fermi level to a few ångströms
for electrons of 10.2 eV. This decrease is much sharper than the one
calculated by the theory of RITCHIE and ASHLEY [1965].
Even if the scattered electrons are eliminated by energy analysis, the
models used to estimate the transmittance of the surface for the excited
electrons are very rough. Moreover, the correct calibration of a source of
light, especially in the ultra-violet range, is quite delicate. Therefore this
method of estimation of the escape depth is exposed to large systematic
errors and more redundant data give much safer results.
To obtain both $p_0$ and L we need two independent equations of type
(3.1), i.e., we must illuminate the sample in two ways such that the associated
DAP are not proportional. Taking the ratio of the associated photoyields
eliminates $p_0$, and the absolute calibration of the light source is no longer
necessary. This has most often been accomplished with several thin films
of different thicknesses or with one thin film illuminated through the
substrate (back illumination) and directly from vacuum (front illumination).
With particular materials and in definite spectral ranges the spectral yield
distribution or the yield variation with the angle of incidence may give
sufficient data.
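As an illustration of how a ratio of two yields can determine L without knowing the surface escape probability or the light calibration, the sketch below treats the case of two films of different thickness under front illumination. It is a deliberately simplified model and not the full calculation of the text: the absorption profile is assumed to be a single decaying exponential exp(-alpha z) (multiple reflections and interference are ignored), the escape probability is taken proportional to exp(-z/L), and alpha is assumed known. All function names and numerical values are hypothetical.

```python
import math

def yield_front(z0, alpha, L):
    """Relative yield: integral of exp(-alpha*z) * exp(-z/L) from 0 to z0."""
    k = alpha + 1.0 / L
    return (1.0 - math.exp(-k * z0)) / k

def escape_depth_from_ratio(ratio, z_a, z_b, alpha, k_max=10.0, tol=1e-12):
    """Solve Y(z_a)/Y(z_b) = ratio for L by bisection on k = alpha + 1/L.

    Requires z_a < z_b; lengths and 1/alpha share the same unit.
    """
    def f(k):
        return (1.0 - math.exp(-k * z_a)) / (1.0 - math.exp(-k * z_b)) - ratio

    lo, hi = alpha * (1.0 + 1e-9), k_max   # k must exceed alpha for L > 0
    if f(lo) > 0 or f(hi) < 0:
        raise ValueError("ratio is not compatible with a positive escape depth")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    return 1.0 / (k - alpha)

# Synthetic example (hypothetical numbers): lengths in angstrom.
alpha = 0.005            # absorption coefficient, 1/angstrom
L_true = 40.0            # escape depth used to generate the "data"
ratio = yield_front(50.0, alpha, L_true) / yield_front(150.0, alpha, L_true)
print(escape_depth_from_ratio(ratio, 50.0, 150.0, alpha))
```

The prefactors common to both yields cancel in the ratio, which is the point made in the text; in a real analysis the exponential profile would be replaced by the rigorous DAP for thin films.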


An elementary analysis of the DAP gives for large values of $z_0$ and for
front illumination

$$D_+(z) = (1 - R_+) \exp(-\alpha z); \tag{3.3}$$

for back illumination one obtains

$$D_-(z) = (1 - R_-) \exp(\alpha (z - z_0)). \tag{3.4}$$

Here $R_+$ and $R_-$ are respectively the reflectances for front and back
illumination. $\alpha = 4\pi q/\lambda$ is the absorption coefficient, where q is given by
(2.19); $q = \kappa$ for normal incidence. Assuming $R_+ = R_-$, SZE, MOLL and
SUGANO [1964] found that

$$\frac{Y_+}{Y_-} = \frac{1 + \alpha L}{1 - \alpha L} \times \frac{\exp(-\alpha z_0) - \exp(-z_0/L)}{1 - \exp(-(\alpha + 1/L) z_0)}. \tag{3.5}$$


If $L \ll 1/\alpha$ we may approximate (3.5) by

Plotting the experimental value of $\ln(Y_+/Y_-)$ versus $z_0$ gives, without
any preliminary optical measurement, the value of $\alpha$ from the slope of the
asymptote. Its intercept on the x-axis gives L (Fig. 4).
The validity of the expressions (3.5) and (3.6) may be seriously
compromised when $R_-$ is large, because the beam reflected on the front face of
the film produces, near the free face of the film, stationary waves that cannot
be neglected in the calculation of $Y_-$ even for large values of $z_0$. Then the
rigorous expressions (2.20) and (2.22) should be substituted for (3.3) and
(3.4). (See the introduction to section 3.)
Because of its simplicity, the original method, based upon the expression
(3.6), has quickly become very popular. Gold has often been measured and
can be taken as a reference because of its chemical inertness. The scatter
in the results is high (Fig. 1) but the fit can be considered as good enough
if we note that the structure of the film, the surface state and the work
function may not be the same in every experiment. KATRICH and SARBEI
[1961], SZE, MOLL and SUGANO [1964], VERNIER and COQUET [1965]
measured the escape depth for photoelectrons emitted into vacuum by
gold films with different work functions. CROWELL, SPITZER, HOWARTH
and LABATE [1962], and SZE, MOLL and SUGANO [1964] were able to
measure the escape depth for photon energies smaller than 1 eV by measuring
the PE into a semiconductor (respectively Si and GaP). It would have
been interesting to investigate the variation of the escape depth with the
work function to test the deexcitation mechanism (in the ballistic
approximation the escape depth should not depend on the work function). But the
results are not sufficiently reliable. One result is clearly established in gold.
The escape depth is found to decrease when the photon energy increases.
This is in good agreement with the calculation of KROLIKOWSKI and SPICER
The escape depth obtained for other metals lies in the same range of
magnitude as for gold, 20-80 Å in the photon range 4-5 eV (HALL and
MEE [1970]: Fe, Mn, Ca; BISNER, ROBOZ and BARNA [1971]: In). The
method is probably not accurate enough to attribute the observed
variations of L to differences between the metals.
As for gold ($L = 740 \pm 60$ Å), Crowell et al. have found for photoemission
from Ag, Pd and Cu into Si in the infra-red spectral range (0.7-1.05 eV)
values of the escape depth in the range of a few hundreds of ångströms
($L = 440 \pm 60$ Å for Ag, $L = 170 \pm 30$ Å for Pd, 50 Å < L < 200 Å for Cu).

The same method can be applied in the far ultra-violet as long as
transparent substrates exist, i.e. down to the LiF transparency cut-off at 11.2 eV.
PEISNER, QUEMERAIS, PRIOL and ROBIN [1973] applied it to amorphous
films of Ge (Fig. 4).



The pioneering works of MAYER [1961] and of his school, of H. THOMAS
[1966] and of REPENBRING [1966] have demonstrated the volume character
of PE from the variations of the photoyield of thin films with thickness.
They gave a first order of magnitude of L (a few tens of ångströms in Na,
K and Rb, a few ångströms in Cs). VERNIER, COQUET and BIGUEURE [1966]
were able to fit the result of the calculation from (3.1) to the measured
variation of the photoyield of gold films by a proper choice of the escape
depth L (Fig. 5). The best fit was obtained for the same value of L for
back and front yield, s and p polarization. The fit was good up to the thinnest
films that could be obtained with a continuous structure. The values of L
confirmed the earlier results of VERNIER and COQUET [1965] by the simple
method (3.2). MAYER, BLANARU and STEPHEN [1970] obtained in a similar
way the escape depth in Rb (Fig. 6). We may note in their results the quite
high values of the escape depth for small photon energy and the steep
decrease of L when $h\nu$ becomes larger than the plasmon energy.

Fig. 5. Variation of the photoyield of a thin film of gold with its thickness, as calculated by
COQUET and VERNIER [1966] for $\lambda = 2804$ Å, $i_1 = 65°$ and direct illumination polarization.
The circles represent the experimental data after VERNIER, COQUET and BIGUEURE [1966].
The photoyield of a film of infinite thickness has been taken as unity.

Fig. 6. Escape depth of the photoelectrons in Rb as a function of photon energy after


The fit between the theoretical and experimental curves that represent Y
versus film thickness is not always good. It cannot be good if the structure
or the work function of the film depends on its thickness (SHU'LMAN,
PONG et al. [1966, 1967, 1970, 1972] extended the method into the far
ultra-violet. They measured the photoyield of films of increasing
thickness with illumination through the LiF substrate. For the insulators CuBr
(PONG [1966]) and KBr (PONG [1967]) he first deposited a transparent film
of gold. Pong found L = 180 Å in KBr, $L = 30 \pm 10$ Å in PbTe and 51 Å
for CuBr for 8 eV < $h\nu$ < 11 eV. We may note that scattering by
electron-hole pair creation may be more important in CuBr than in KBr, where the
band gap is quite large. PONG, SUMIDA and MOORE [1970] found
$L = 40 \pm 10$ Å for gold and L = 230 Å for Al in the spectral range 5.5-10.2
eV. The value for Al is in good agreement with the theory of STUART and
WOOTEN [1967] if the mean free path for electron interaction is $l_e = 500$ Å
and the mean free path for phonon interaction is $l_p = 130$ Å. For gold the
value of L seems quite large when compared with the results of VERNIER,
COQUET and BIGUEURE [1966] and of PIERCE and SIEGMANN [1974].


In the previous methods L has been calculated from data measured on
several films that were assumed identical in every respect (structure, surface
state) except in thickness. Such a set of films may be difficult to produce,
especially from the high yield compounds used for industrial photocathodes.
Especially in that case it may be safest to deduce the index of refraction
from data measured on the film itself and not taken from the literature.
BURTON [1947] measured, in thin films of SbCs$_3$, the transmittance, the
reflectance and the front and back yields for normal incidence. He deduced
from these data the absorption coefficient and gave for the escape depth
L = 250 Å.
HIRSCHBERG and DEUTSCHER [1968] measured the same data and in
addition the thickness by the Tolansky method. They deduced the complex
index of refraction from optical measurements. They analyzed the
photoyields for front and back illumination on the basis of expression (1.1), but
described the escape probability by (2.60). They found a characteristic length
of 150 Å in the photon energy range 2-3 eV. This means that, with the
assumption of an escape depth given by the usual expression (2.35), they
would have found L approximately equal to this length.
HOFMANN and DEUTSCHER [1970] analysed multialkali layers in a
quite similar manner and found a length of 300 Å in the spectral range 1.8-2.4 eV.
We note the large value of the escape depth found in these semiconductors
with respect to the low value generally found in metals.
VERNIER, GOUDONNET, CHABRIER and CORNAZ [1971] measured, in thin
films of amorphous Ge, the transmittance, the reflectance, and the front and back
yields for several angles of incidence and for both polarizations. They
deduced the index of refraction and the thickness of each film from the
transmittances and reflectances. For every wavelength, all the optical data
were consistent with one value of the thickness for each film. The index of
refraction did not depend on the film thickness $z_0$ when $z_0 > 80$ Å. It can
therefore be assumed that the films were homogeneous and reproducible.
They deduced one value of the escape depth L for each angle of incidence
and for each polarization by comparing the observed value of $Y_+/Y_-$
with the result of calculation from the substitution of (2.20) or (2.22) into
(3.1). They found $L = 40 \pm 10$ Å for every angle of incidence and
polarization in the photon energy range 4-5 eV. GOUDONNET,
TRUITARD and VERNIER [1973] applied the same method to silver in the
vicinity of the plasma resonance. For clean films the threshold is at $h\nu_0 \approx 4$
eV and all data in the range 4-4.5 eV are consistent with an escape depth
$L = 40 \pm 10$ Å (Fig. 7). When a silver film is activated by a submonolayer of
Cs, the escape depth is not changed for $h\nu > 4$ eV. The photoelectric
threshold falls below $h\nu_0 = 3$ eV, but for $h\nu < 4$ eV, L seems to depend
on the polarization of light. We may try to attribute this anomaly to an
anisotropy of the photoelectric excitation that involves a real dependence
of L on polarization, but we shall discuss in section 4.2 interpretations
based on a surface effect.

Fig. 7. Estimated values of the escape depth of photoelectrons from the photoyield of a
280 Å thick film of Ag as a function of photon energy, for normal incidence, for s-polarization
at angles of incidence of 30° and 60°, and for p-polarization at angles of incidence
of 30° and 60°.


the same method to thin films of SbNa$_2$K(Cs), deposited in vacuum-sealed
cells of appropriate form, by taking into account the additional
reflections due to the windows. In the spectral range 1.5-3 eV, they found
results consistent in order of magnitude with HOFMANN and DEUTSCHER
[1970] (L = 300 Å). They did not find any variation of L with polarization
but found that L depended considerably on the samples, which are quite
difficult to obtain in a reproducible manner.


This could be a satisfactory method for bulk material and in the far
ultra-violet for thin films when the most general methods cannot be used.
The ratio of the photoyield $Y(i_1)$ for the angle of incidence $i_1$ to $Y(0)$
may be expressed as a function $\rho_{i_1}(L)$ with the expression (3.1). The
relative uncertainty $\Delta\rho/\rho$ on the experimental estimate of $\rho$ is determined
by the apparatus. At best we may hope that $\Delta\rho/\rho$ is a few percent, but
at high angles of incidence $\Delta\rho/\rho$ would be much larger. An estimate of
L is possible if $\rho(\infty) - \rho(0)$ is larger than $\Delta\rho$. QUEMERAIS, PEISNER, PRIOL
and ROBIN [1973] calculated $\rho_{i_1}(L)$ for bulk materials and for thin films.
They noted that $\rho_{i_1}(L)$ depends more strongly on the state of
polarization of the light than on L. Therefore, any uncertainty in the
polarization induces an error $\Delta\rho$ larger than $\rho(\infty) - \rho(0)$, which makes the
determination of L impossible. GOUDONNET, CHABRIER, VERNIER [1974] calculated
$(\rho(\infty) - \rho(0))/\rho(0)$ and found it larger than any obtainable value of $\Delta\rho/\rho$
except when both the imaginary and the real parts of the index of
refraction are very small (typically smaller than 0.1-0.3). This occurs only
in a few solids, and in a very narrow spectral range in the far ultra-violet.
HAMM and BIRKHOFF [1974] measured the variation of the photoelectric
yield Y of films with the angle of incidence $i_1$ in the far ultra-violet. The state
of polarization of the light had been measured for each wavelength. They
sought an estimate of the index of refraction $\nu - i\kappa$, the thickness of the
film $z_0$ and the escape depth of the electrons L by fitting the experimental
Y versus $i_1$ curve with the theoretical expression (3.1). Gesell and Arakawa
were able to place L between 4 Å and 12 Å for Al at $h\nu = 21.2$ eV. ARAKAWA,
BRAUNDMEIER, WILLIAMS, HAMM and BIRKHOFF [1974] obtained a good
estimate of $\nu$ and $\kappa$ for carbon films in the photon energy range 15-80 eV
but they obtained an estimate of L only for $h\nu > 30$ eV, in agreement with the
results of GOUDONNET and CHABRIER [1974].

An electromagnetic wave can also be induced in a film by frustrated total
reflection, when a light beam is sent from the substrate at an angle of
incidence $i_2$ larger than the limit angle. The reflectance has a very deep
dip for a critical angle of incidence $i_{2p}$ and for p-polarization, when surface
plasmons in the film can be coupled to the incident light (OTTO [1968, 1970]).
The dip in the reflectance is associated with a peak in the curve that
represents the photoyield $Y_-(i_2)$ versus the angle $i_2$. MACEK, OTTO and
STEINMANN [1972] observed this peak for $h\nu = 5$ eV and Al films deposited
onto a quartz prism. CALLCOTT and ARAKAWA [1975] observed this peak
for 5 eV < $h\nu$ < 10 eV with Al films deposited onto a LiF semicylinder.
They calculated the ratio of the photoyields $Y_-(i_2)/Y_-(0)$ from the
expression (3.1) in Pepper's formulation for different values of L (Fig. 8) and
observed that the peak value $Y_-(i_{2p})/Y_-(0)$ was very sensitive to the value
of L. The distribution of the electromagnetic energy $D(z)$ that has been
calculated by Macek et al. and by Callcott et al. is very different for $i_2 = i_{2p}$
and for other angles of incidence. Therefore the comparison of the
experimental result with the calculation gives an estimate of L. Macek et al.
found $L = 50 \pm 8$ Å for $h\nu = 5$ eV.

Fig. 8. Comparison of experimental values of the angular yield ratio $Y_-(i_2)/Y_-(0)$ with
values calculated from the expression (3.1). No possible choice of film thickness $z_0$ and escape
depth L can account for the yield observed at the resonance peak (after CALLCOTT and
ARAKAWA [1975]).
Callcott and Arakawa also measured the photoyield $Y_+(i_1)$ for light
incident from vacuum at the angle of incidence $i_1$. They obtained from
the ratio $Y_-(0)/Y_+(0)$ (section 3.4) the estimates $L = 40 \pm 10$ Å for $h\nu = 5$
eV and $L = 15 \pm 5$ Å for $h\nu = 10$ eV. But the measured value of the ratio
$Y_-(i_{2p})/Y_-(0)$ was larger than the result of the calculation, whatever
the assumed values of L and $z_0$.
As in the experiments of CHABRIER, CORNAZ, GOUDONNET and VERNIER
[1970], not all values of the photoyield can be explained within the scheme
of a pure volume effect with one value of the escape depth. Here too an
explanation can be found in a surface-effect contribution (cf. section 4.2).



The principle of such methods is to observe the transparency of a thin
film for the electrons excited in its photoemissive substrate. This
transparency is proportional to $\exp(-z_0/L)$, where $z_0$ is the film's thickness and L is
the escape depth of the film.
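If the transmitted signal follows Y = Y0 exp(-z0/L), the logarithm of the yield is a straight line in the film thickness with slope -1/L, and L can be read from a least-squares fit. The sketch below shows this on synthetic, noise-free data; the function name and all numbers are assumptions for illustration, not measured values.

```python
import math

def fit_escape_depth(thicknesses, yields):
    """Least-squares line through (z0, ln Y); returns (L, ln Y at z0 = 0)."""
    n = len(thicknesses)
    logs = [math.log(y) for y in yields]
    mean_z = sum(thicknesses) / n
    mean_l = sum(logs) / n
    sxx = sum((z - mean_z) ** 2 for z in thicknesses)
    sxy = sum((z - mean_z) * (l - mean_l) for z, l in zip(thicknesses, logs))
    slope = sxy / sxx                 # equals -1/L for Y = Y0 exp(-z0/L)
    intercept = mean_l - slope * mean_z
    return -1.0 / slope, intercept

# Synthetic data (hypothetical): thicknesses in angstrom, arbitrary yield units.
L_true = 40.0
z_values = [0.0, 10.0, 20.0, 40.0, 80.0]
y_values = [2.5 * math.exp(-z / L_true) for z in z_values]

L_fit, ln_y0 = fit_escape_depth(z_values, y_values)
print(L_fit, math.exp(ln_y0))
```

With real data the points at large thickness carry the largest relative errors, so a weighted fit would usually be preferable to the unweighted one shown here.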
LEWOWSKI, BASTIE and BIZOUARD [1970] deposited thin films of alkali
halides onto photocathodes of Al. For an illumination at $h\nu = 4.89$ eV
the films were transparent and no absorption occurred in them. Plotting
ln Y against the thickness of the film, they obtained a straight line with
slope $-1/L$. The values of L are roughly proportional to the volume of the
crystalline cell when one alkali halide is replaced by another, L = 10 Å
for NaCl, L = 110 Å for CsI. These values are much lower than the result
obtained by PONG [1967] in KBr for $h\nu = 8$ eV (L = 180 Å). Differences
in the transport mechanism could explain this result. In Pong's experiments
the electron-hole pair creation may be forbidden in KBr, because the
electron affinity is smaller than the band gap. The escape depth limited by
phonon scattering could then be quite high (SPICER [1960]). For Lewowski
et al. the electrons excited at lower levels might have to jump from trap to trap.
PONG [1972] illuminated a photocathode of CuI through its LiF
substrate. He covered it with a thin film of amorphous Se and plotted the
logarithm of the emitted current versus the Se-film thickness $z_0$. For small
values of $z_0$ the contribution of the Se film to photoexcitation was negligible
and the curve was a straight line with slope $-1/L$. The measured escape
depth of the electron in Se was $L = 40 \pm 10$ Å for $h\nu = 7.8$ eV; the mean
energy of the emitted electrons was then 4.7 eV above the Fermi level.
PONG and SMITH [1973] replaced the selenium by copper phthalocyanine
and then found L = 11 Å for electrons 1.5 eV above the Fermi level (for
$h\nu = 7.8$ eV).
When a film and its substrate both contribute to the photoexcitation,
the contributions of the film and the substrate can sometimes be separated
by an energy analysis of the photoelectrons. EASTMAN [1970] deposited
thin films of yttrium onto a substrate of gold and measured the variation
of the integrated intensity I of the d peak of gold as a function of the Y
film thickness $z_0$. The experimental data could be fitted with the law

$$I = I_0 \exp(-z_0/L'),$$

with $L' = 10$ Å in the photon energy range 5-8 eV. The same method
can be applied to thin films of any substance, if its PED has no structure
in the d band of gold. By substituting Gd or Ni for Y, Eastman obtained
the same value $L' = 10$ Å for $h\nu = 7$ eV. We must note that such a method
gives the elastic escape depth $L'$; the escape probability without energy
loss is proportional to $\exp(-z/L')$. $L'$ may be well approximated by the
mean free path $l$. But the escape depth L for all emitted electrons, as defined
by (1.1), may be much larger, especially for high energy electrons excited
by soft X-rays or by electron bombardment. Applications of Eastman's
method to this energy range will be discussed in section 3.10.
CAMPAGNA, PIERCE, SATTLER and SIEGMANN [1973] measured the spin
polarization P of the photoelectrons emitted by a ferromagnetic material
placed in a homogeneous magnetic field normal to its surface,

$$P = \frac{n_\uparrow - n_\downarrow}{n_\uparrow + n_\downarrow},$$

where $n_\uparrow$ and $n_\downarrow$ are the respective numbers of spin up and spin down
photoelectrons. Of course if copper is substituted for the magnetic material
then P = 0. Many important results about magnetism have been obtained
from such measurements of P. We are concerned here with the possibility
of separating the electrons originating from a ferromagnetic and a
non-magnetic material. PIERCE and SIEGMANN [1974] deposited thin films of
copper of increasing thickness $z_0$ on a substrate of Ni and measured the
variation of P with $z_0$. For a uniformly magnetized film and a constant
photoelectric current, P was found to be proportional to the number of
electrons excited in Ni. Pierce and Siegmann could thus obtain the
transparency $\exp(-z_0/L)$ and the escape depth L for the Cu film (Fig. 9). Pierce
and Siegmann deduced in a similar manner the escape depth in Ni from
the variation of P for the electrons emitted by a substrate of Cu covered
with a thin film of Ni. Pierce and Siegmann found L = 11 Å in Cu for
electrons 5.2 eV above the Fermi level.

Fig. 9. Spin polarization P of the electrons emitted by a Ni sample coated with a film of Cu
of thickness $z_0$. The rectangular fields represent the statistical uncertainties for both P and $z_0$.
Fields with the same cross hatching are for films successively evaporated on the same Ni
substrate. The solid curve is a least-squares fit by an exponential curve (after PIERCE and
SIEGMANN [1974]).


In the experiments of EASTMAN [1970], LEWOWSKI, BASTIE and BIZOUARD
[1970], PONG [1972], and PONG and SMITH [1973] the emitted electrons
are excited in one material and deexcited in another. To investigate the
escape probability any type of excitation can be substituted for
photoexcitation. We must note, however, that the energetic and angular
distribution of the excited electrons may not be the same as in PE.
KANTER [1970] injected electrons into self-supporting films of Al, Au
and Ag with a low energy electron gun (Fig. 10). He measured the current
transmitted by the film with a collector and he could eliminate, with a
retarding potential technique, the electrons that had suffered large energy
losses. Kanter plotted the logarithm of the electron current versus the film
thickness, and found a straight line of slope $-1/L_0$. Here $L_0$ is the elastic
escape depth $L'$ when the scattered electrons are eliminated, and the total
escape depth L when all transmitted electrons are collected. In this energy
range, where L and $L'$ are nearly equal, we may expect the escape depth
measured for electrons injected with a kinetic energy $E_k$ to be approximately
the same as for photoelectrons excited by photons of energy $h\nu = E_k + W_F$,
where $W_F$ is the work function of the film. In Fig. 1 the results of Kanter
are plotted for gold at that energy. Kanter found no significant difference
between the values of the escape depth for Al, Au and Ag. He found a
decrease from L = 40 Å for an electron energy of E = 5.4 eV above the
Fermi level down to 15-20 Å for E = 10 eV. The experiments of Kanter
determine the electron interaction processes more precisely than PE,
because the angular and energy spreads of the electrons are smaller. The
main difficulty is to avoid holes and other defects in the films.

Fig. 10. Principle of the apparatus of KANTER [1970]. The beam is collimated by a 0.02
Tesla magnetic induction. The beam diameter is about 1 mm. The beam could be moved, with
the help of deflection plates, across the film surface so that the detection of pin holes and
other film nonuniformities was greatly facilitated. The normal component of the beam energy
spread, as measured at the collector without a target inserted, was 0.5 eV between the 10%
and 90% points of the collector current versus retardation voltage curve. Typical bombarding
currents were 2 x 10-'A. Current leaving the film was in the 10-"-IO-'4A region and were
measured with a vibrating reed electrometer. The noise current was about $10^{-14}$ A.
The injection of hot electrons into a thin film may be obtained in
sandwiches of metal-insulator-metal thin films (Fig. 11). Such sandwiches emit
electrons into vacuum when an appropriate voltage V is applied between
the metal films. Electrons of the base metal film are transported across the
insulator either by thermoinjection into the conduction band or by the
tunnel effect. They are then transported across the outer metal film and
are emitted into vacuum. We may assume the transport probabilities across
the insulator and the outer metal film to be proportional to $\exp(-z_I/L_I)$
and $\exp(-z_M/L_M)$, respectively, where $z_I$ and $z_M$ are the thicknesses of the
insulator and the outer metal film, and $L_I$ and $L_M$ are the corresponding
escape depths. Many investigators deduced $L_I$ and $L_M$ from the variation
of the emitted current with $z_I$ and $z_M$. To compare the results with
photoelectric estimates of the escape depth, we could assume that most
electrons attain the outer metal film with an energy $eV$ above the Fermi level.

Fig. 11. Scheme of a sandwich electron emitter (outer metal film, base metal film, substrate).
However, this assumption is not consistent with the observations by
HICKMOTT [1963, 1965], VERDERBER and SIMMONS [1967], NIQUET,
VERNIER and HARTMANN [1970] of emitted electrons when V is smaller
than the work function of the outer metal. Moreover, the small values
generally found for $L_I$ (5 to 24 Å in Al$_2$O$_3$ according to SAVOYE and
and DAVIES [1963], KANTER and FEIBELMAN [1962]) suggest very strong
energy losses in the insulator. Only a few authors have found larger values
of $L_I$ (HICKMOTT [1965]: $L_I = 200$ Å in Al$_2$O$_3$, GOULD, HOGARTH and
COLLINS [1973]: $L_I$ = 400 to 1000 Å in SiO,).
The average energy of the electrons injected in the metal might be much
lower than $eV$. Because the escape depth of the electrons is a decreasing
function of their energy, we might expect that the attenuation length
measured in the outer electrodes of the sandwiches is generally greater
than that obtained from photoelectric data (MEAD [1962], COLLINS and
HARTMANN [1970], COLLINS, EDGE and LEGG [1972]).


We have shown in section 2.7.2 that in NEA photocathodes, the escape
depth is the diffusion length of thermalized electrons at the bottom of the
conduction band. A restriction has been made on the validity of the
expression (2.35) in thin films whose thickness is not much greater than the
diffusion length, because electrons can be diffused back by the rear face
of the film. In that case, (2.59) should be substituted for (2.35).
Several non-photoelectric methods can be used to determine the
diffusion length (HACKETT [1972], FRANK and GARBE [1973], ASHLEY,
CARR, ROMANO-MORAN [1973]). The published values range from 0.1 µm
to 10 µm. The special features of the transport process and of optical
excitation in NEA emitters allow special photoelectric methods to determine
the escape depth of the photoelectrons.
In GaAs and other III-V compounds, the conduction band contains two
minimum energy levels at the points $\Gamma$ and X of the Brillouin zone (Fig. 12).

Fig. 12. Band scheme of GaAs, showing the excitation and thermalization of electrons in
the $\Gamma$ and X minima of the conduction band, for 1.4 < $h\nu$ < 1.7 eV (a) and for
$h\nu > 1.75$ eV (b and c) (after JAMES and MOLL [1969]).

For 1.4 eV < $h\nu$ < 1.7 eV all excited electrons are near the $\Gamma$ level and,
except for a small path before thermalization, they undergo the same
transport process, characterized by the same escape probability

$$P = P_\Gamma \exp(-z/L_\Gamma). \tag{3.8}$$

The escape probability at the surface $P_\Gamma$ and the escape depth $L_\Gamma$ are
independent of the photon energy. The photoyield Y for 1.4 eV < $h\nu$ < 1.7 eV
depends on $h\nu$ only because the absorption constant $\alpha$ depends on $h\nu$. We
may write

$$Y = \frac{P_\Gamma}{1 + 1/(\alpha L_\Gamma)}. \tag{3.9}$$
The plot of $1/Y$ versus $1/\alpha$ (Fig. 13) for that spectral range is a straight line
that intersects the x axis at the abscissa $-L_\Gamma$. Its ordinate at the origin is
$1/P_\Gamma$, so that its slope is $1/(P_\Gamma L_\Gamma)$. Except for very strong doping, $\alpha$ results
from direct interband transitions and does not depend on the sample; $\alpha$ can,
therefore, be taken from the literature.
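The 1/Y versus 1/alpha construction can be sketched numerically. The example assumes that the yield per absorbed photon has the form Y = P / (1 + 1/(alpha L)), which is what one obtains by integrating an absorption profile alpha exp(-alpha z) against an escape probability P exp(-z/L) over a thick sample; this form, the function names and the numerical values are assumptions made for illustration. Under this assumption 1/Y is linear in 1/alpha, the intercept on the ordinate axis is 1/P, and the line crosses the abscissa axis at -L.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def gamma_parameters(alphas, yields):
    """Return (L, P) from the straight line 1/Y = 1/P + (1/(P*L)) * (1/alpha)."""
    slope, intercept = fit_line([1.0 / a for a in alphas],
                                [1.0 / y for y in yields])
    return intercept / slope, 1.0 / intercept

# Synthetic, noise-free data (hypothetical values, lengths in micrometres).
L_true, P_true = 1.5, 0.3
alphas = [0.5, 1.0, 2.0, 4.0]          # absorption constants in 1/micrometre
yields = [P_true / (1.0 + 1.0 / (a * L_true)) for a in alphas]

L_fit, P_fit = gamma_parameters(alphas, yields)
print(L_fit, P_fit)
```

The ratio intercept/slope equals L whatever the value of P, which is why the diffusion length can be read from the plot even when the surface escape probability changes with the Cs-O coverage.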
Many determinations of the spectral yield of GaAs and other III-V NEA
photocathodes lead to determinations of $L_\Gamma$ and $P_\Gamma$. EDEN, MOLL and
SPICER [1967] found for GaAs $L_\Gamma = 0.15$ µm. GARBE [1969a] found that
for Zn-doped GaAs $L_\Gamma$ depends on the doping and on the fabrication
process (Fig. 14). JAMES and MOLL [1969] found, independently, quite
similar results. SCHADE, NELSON and KRESSEL [1971] found $L_\Gamma$ = 2 to 7 µm
in slightly Ge-doped crystals and ASHLEY, CARR and ROMANO-MORAN
[1973] found by Hackett's method diffusion lengths up to 23 µm. FRANK
and GARBE [1973] showed that Hackett's method and photoelectric
methods give consistent results. Many other III-V materials have been
obtained with NEA: binary InP and GaP, ternary GaInAs and GaAsSb,
quaternary InGaAsP (for a review see BELL and SPICER [1970], WILLIAMS
and TIETJEN [1971], SOMMER [1973]). The NEA was first obtained in
cleaved monocrystals, but it has also been obtained on epitaxial thin films,
used as semi-transparent photocathodes (ANTYPAS, JAMES and UEBBING
[1970], LIU, MOLL and SPICER [1970], GARBE [1973]). In a quite general
way the diffusion length of the electron $L_\Gamma$ found in these materials is much
smaller than in GaAs. It is generally about 0.1 µm or less. The reason is very
likely not fundamental and improvements in the fabrication process could
probably increase $L_\Gamma$.

Fig. 13. $1/Y$ versus $1/\alpha$ plot for a Zn-doped GaAs crystal (density of Zn atoms p = 2.8 x 10''
cm$^{-3}$), with different Cs, O coverages (after GARBE [1969a]).

Fig. 14. Diffusion length $L_\Gamma$ in GaAs as a function of the Zn doping concentration p and
crystal growth process (after GARBE [1969a]).
In GaAs, when hν > 1.7 eV a fraction F_X of the excited electrons is
thermalized in the vicinity of the X level. JAMES and MOLL [1969] and
GARBE [1969a] separated in the PED two contributions to the photoyield.
The first one, Y_Γ, is due to emission from the Γ level, the second one, Y_X,
from the X level. The lifetime of the electrons in the X level is limited by diffusion
towards the Γ level. The transport of the X electrons is characterized by a
diffusion length L_X and an escape probability P_X for the X electrons that
reach the surface. From the joint diffusion equations for X and Γ electrons
it has been found that

$$Y_X = \frac{P_X F_X}{1 + 1/(\alpha L_X)}. \qquad (3.11)$$

Plotting 1/Y_X versus 1/α for hν > 1.7 eV gives L_X just as L_Γ was obtained
for hν < 1.7 eV (Fig. 13). To obtain the other constants, James and Moll, and
Garbe, fitted the experimental curve to the expression (3.10). They found
in both works that L_X ≈ 0.03 μ is much shorter than L_Γ.
At present Si has been activated to NEA only on (100) crystallographic
faces by Cs-O treatment (MARTINELLI [1970], RICHARD [1973]) or Rb-O
treatment (MARTINELLI [1973]). The emission process is roughly the same
as in III-V materials. The electrons are excited by photon absorption and
rapidly thermalized into the bottom of the conduction band, which in Si
is at the X point of the Brillouin zone. An expression of type (3.9) can,
therefore, be used to analyze the spectral yield of NEA Si in the same
manner as NEA GaAs for hν between 1.4 eV and 1.7 eV, e.g. by plotting
1/Y versus 1/α. The diffusion length, depending on the sample and its
doping, ranges between 2 and 18 μm. SOMMER [1973] noted that the diffusion
length in Si is usually larger than in III-V compounds but that the high
thermoelectronic emission reduces the practical application of Si NEA
photocathodes.
We must note here that the very high escape depths and the very high
quantum yields that result are made possible by the NEA, where the escape
depth is identical with the diffusion length of the thermalized electrons.
Whenever the NEA is not achieved, the escape depth in the same materials
drops to much smaller values, e.g. for Si near threshold 10 to 30 Å, as we
shall see in the next paragraph.



Fig. 15. Scheme of the band profile of a p-type Si crystal with positive electron affinity.

Fig. 16. Band scheme just after cleavage of the degenerate n-Si crystal (density of donors 10
cm-) for which WAGNER and SPICER [1972] measured the PED represented in Fig. 17.

Because of surface states and because surface doping is different from
volume doping, the distance δE from the valence band to the Fermi level
depends on the distance z to the surface (Figs. 15 and 16). This effect makes
it possible to obtain NEA. When the electron affinity is positive, VAN LAAR
and SCHEER [1962] calculated the spectral yield distribution on the assumption
that each layer of thickness dz contributes to the emitted current the
quantity

$$dI \propto \bigl(h\nu - E_0 - \delta E(z)\bigr)^{4} \exp(-z/L)\,dz. \qquad (3.12)$$

They calculated the profile of the band E(z) by solving the Poisson equation,
and integrated dI. For p-type silicon the result of the calculation depends
sufficiently strongly on L to allow a determination of L by comparison with
the experimental data. Van Laar and Scheer have found L = 15 Å in the
spectral range 5-6 eV.
VI, § 3] THE ESCAPE DEPTH OF THE PHOTOELECTRONS 301

GOBELI and ALLEN [1962] distinguished in the contribution of each
layer a term associated with indirect excitation with a threshold hν_i(z) =
E_0 + δE(z) and a term associated with direct excitation with a threshold
hν_d(z) = hν_i(z) + 0.3 eV. They assumed for each layer a contribution

dY = (c,(hv-hv,(z))~+c,(hv-hv,(z)) exp (-z/L)dz

to the photoyield. They calculated hν_i(z), integrated over z, and compared
the result with the experimental result of Scheer and Van Laar. They found
L = 25 ± 5 Å in the same energy range.
Fig. 17. PED for n-Si just after cleavage in ultrahigh vacuum (1) and after exposure to residual
gases (2), for hν = 10.2 eV (after WAGNER and SPICER [1972]).

WAGNER and SPICER [1972] investigated freshly cleaved degenerate n-type
samples of Si (10" As atoms per cm³, 0.001 Ω·cm) for hν in the range
6-12 eV. After an exposure to oxygen at very low pressure (10-'' torr),
they observed several changes in the PED N(E) (Fig. 17), viz.

a. The structures A and B disappeared. The peak B was not observed in
every cleave. Wagner and Spicer attributed these structures to surface
states lying at 0.5 eV and 1.1 eV below the Fermi level. We shall discuss
surface states in the next section.
b. The low energy edge of the PED is shifted by 0.6 eV because of a
decrease of the work function of 0.6 eV.
c. The two structures C and D, attributed to direct transitions from the
valence band, are shifted by only 0.2 eV.
Wagner and Spicer interpreted the results of Fig. 17 by the band bending
represented in Fig. 16. On clean n-type Si the free bonds at the surface
induce a space charge and a negative band bending. When the cleavage
has been exposed to oxygen the bonds are saturated and the band becomes
straight; the distance from the vacuum level to the Fermi level is then
decreased by 0.6 eV. Because of the very short escape depth, the electrons
excited in the flat band region cannot reach the surface. Wagner and Spicer
transferred the observed shift of 0.2 eV for the structures C and D onto the
band profile calculated by GOBELI and ALLEN [1962] and found L = 12 Å
for hν = 10 eV. This value is consistent with the value of 25 Å that had been
found by Gobeli and Allen near threshold. VILJOEN, JAZZAR and FISCHER
[1972] modulated the band bending of n- and p-type GaSb with infra-red
illumination. The photoelectric threshold for clean GaSb samples was
about 5 eV. They measured the variation of the infra-red modulation of
the photoyield with the UV photon energy and deduced from it an estimate
of L in GaSb. They found L = 100 Å, larger than the value expected from
comparison with similar semi-conductors.



METZGER [1965] and DUCKETT and METZGER [1965] measured the
spectral distribution of the photoyield of the alkali halides and found a
drastic decrease of the photoyield when the photon energy hν becomes
larger than twice the gap E_g, i.e. when excited electrons can induce pair
creation. If hν is further increased (above approximately 3E_g) the photoyield
increases again and may become larger than unity because of the
emission of scattered and secondary electrons.
The importance of pair scattering and secondary emission is better
hν > 13 eV the peak due to the valence band decreases and for hν > 14 eV a
peak due to scattered electrons appears (Fig. 18).
When scattered electrons can be emitted, the elastic escape depth L' is
more significantly related to microscopic processes than is the total escape
depth defined by the expression (1.1).
Fig. 18. Energy distribution of photoelectrons emitted by KI for various photon energies.
Each curve is indexed by the photon energy in eV (after BLECHSCHMIDT, SKIBOWSKI and
STEINMANN [1970]).

BAER and LAPEYRE [1973] deduced L' from the energy distribution of
photoelectrons emitted by KI in the photon energy range 7-22 eV. They assumed
that the intensity I of the elastic peak from the valence band is given by


where α is the optical absorption coefficient for a photon of energy hν and
L'(hν) is the elastic escape depth for electrons excited at a level hν above
the valence band. They assumed that the hν dependence of L' could be
calculated by the expression (2.37) (KROLIKOWSKI and SPICER [1969]). They
could then fit the observed variation of the intensity of the elastic peak
with the result of a calculation based on eq. (3.14). They found that L' = 16 Å
for hν = 14 eV.
The electron spectroscopy of photoelectrons excited by soft X-rays has
been applied to chemical analysis (ESCA). Characteristic peaks of each
element are found in the PED at hν − E_b, where E_b is the binding energy
of a core electron. BAER, HEDEN, HEDMAN, KLASSON and NORDLING [1970]
measured the variation with gold thickness z_0 of the chromium 3p peak
(E_b = 43 eV) and of the gold 4f₇/₂ peak (E_b = 83 eV) in the energy distribution
of the electrons emitted by a chromium substrate covered by a gold film
(Fig. 19). When z_0 increases, the chromium peak is expected to decrease as

$$e^{-z_0/L'}, \qquad (3.15)$$

and symmetrically the gold peak is expected to increase as

$$1 - e^{-z_0/L'}. \qquad (3.16)$$

We may assume that the elastic escape depth L' in gold has the same value
for both chromium and gold peaks because the binding energy E_b is in
both cases small with respect to the photon energy. The theoretical curve
agrees with experimental data for L' = 22 ± 4 Å and hν = 1253 eV (Fig. 19).
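
The attenuation laws (3.15) and (3.16) lend themselves to a short numerical sketch. The Python fragment below, a minimal illustration rather than the procedure of the cited authors, generates the two peak intensities for a given L' and then recovers L' from the substrate peak by a least-squares line through the origin in ln(intensity) versus thickness. The function names are hypothetical, intensities are assumed normalized to the bare substrate and to a thick film respectively, and the data are synthetic.

```python
import math


def substrate_peak(z0, escape_depth):
    """Relative substrate peak intensity under a film of thickness z0, eq. (3.15)."""
    return math.exp(-z0 / escape_depth)


def overlayer_peak(z0, escape_depth):
    """Relative film peak intensity for a film of thickness z0, eq. (3.16)."""
    return 1.0 - math.exp(-z0 / escape_depth)


def escape_depth_from_attenuation(thicknesses, substrate_intensities):
    """Estimate L' from ln(I) = -z0 / L' with a least-squares line through the origin."""
    num = sum(z * math.log(i) for z, i in zip(thicknesses, substrate_intensities))
    den = sum(z * z for z in thicknesses)
    return -den / num


# Synthetic data in angstroms, generated with the L' = 22 A quoted in the text.
L_true = 22.0
thicknesses = [5.0, 10.0, 20.0, 40.0, 80.0]
cr_peak = [substrate_peak(z, L_true) for z in thicknesses]
au_peak = [overlayer_peak(z, L_true) for z in thicknesses]

L_fit = escape_depth_from_attenuation(thicknesses, cr_peak)
print(f"L' = {L_fit:.1f} A")
```

With measured intensities the two peaks give two independent estimates of L', which can then be compared.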

Fig. 19. Variation with gold thickness of the intensity of the gold (×) and chromium (○) peaks
for photoelectrons emitted by a chromium substrate covered by a gold thin film illuminated
by 1252 eV photons. The continuous curves are calculated for L' = 22 Å (after BAER, HEDEN,
HEDMAN, KLASSON and NORDLING [1970]).

X-rays also induce the emission of Auger electrons, with an energy
independent of hν and characteristic of the chemical elements, because of the
deep holes that they create in the solids. KLASSON, HEDMAN, BERNDTSSON,
NILSSON, NORDLING and MELNIK [1972] used both Auger electrons and
directly photoexcited electron peaks to determine the escape depth of
electrons in gold and alumina. They found in gold L' = 19 Å for E = 0.9
keV and L' = 37 Å for E = 1.4 keV, in alumina L' = 13 Å for E = 1.4 keV
and L' = 22 Å for E = 3.9 keV. STEINHARDT, HUDIS and PERLMAN [1972]
investigated L' in a thin film of carbon deposited onto a gold substrate.
They found L' = 10 Å for E = 920 eV and L' = 13 Å for E = 1169 eV.
The same method applies when Auger emission is induced by electron
bombardment. We may assume a uniform excitation in the part of the film
and substrate that is responsible for the Auger emission, because the
incident electrons have a penetration depth greater than L', both because
L' increases with electron energy and because secondary electrons can also
induce Auger emission. SEAH [1972] discussed the validity of eqs. (3.15)
and (3.16) and the significance of L' for the description of the intensities of
Auger peaks induced in a substrate coated with films of different thicknesses.
PALMBERG and RHODIN [1968] deposited Ag epitaxial films on a monocrystal
of Au. They found for the elastic escape depth in Ag L' = 4 Å for
E = 72 eV (Au peak) and L' = 8 Å for E = 362 eV (Ag peak). RIDGWAY
and HANEMANN [1971] deposited Fe onto a Si monocrystal and found
L' = 7.4 Å for the escape depth of 91 eV electrons in Fe. SEAH [1972]
investigated L' in films of Be, Ag and Cu deposited onto substrates of Be,
Ag and Cu. The estimates of L' obtained from the increase of the layer
peak and from the decrease of the substrate peak as the film thickness
increases are of the same order of magnitude, but the second ones seemed
less reliable. They are given in parentheses in Table 3.1. In the same
table we have given the results obtained by TARNG and WEHNER [1973]
with Auger electrons emitted by W substrates covered with Mo and Mo
substrates covered with W.
JACOBI and HÖLZL [1971] measured the Auger emission of self-supporting
carbon films both for transmission and reflection of primary electrons with
energy ranging from 100 to 2500 eV. They found for the unscattered escape
depth L' = 7.5 ± 1.5 Å for electrons emitted at 262 eV. This value is in good
agreement with the results of STEINHARDT, HUDIS and PERLMAN [1972].
We have summarized in Table 3.1 the results obtained for the elastic escape
depth of high energy electrons.
Table 3.1
Elastic escape depth of electrons (in Å)

Energy (eV) → 48 60 110 120 262 350 355 920 935 1169 1400 1736 3900
Be     4.7 (8.6) 10.0 12.7
Al2O3  13 22
C      7.5 10 13
Cu     (6.1)
Ag     9.4
Mo     5.2 6.7
W      (7.2) 6

§ 4. Surface Photoexcitation
A surface contribution to photon absorption may arise, within a
one-electron theory, from two distinct effects: surface PE from bulk states
(SPBS) and PE from surface states. We shall analyse experimental data
on the assumption that surface and volume effects can be separated in the
intensity of the emitted current. When necessary, we shall use the
one-electron approximation.
The surface photoexcitation is more likely to be detected by photoemission
than by any other of its physical consequences (e.g., reflectance or
transmittance studies), because the small value of the escape depth reduces the
relative importance of bulk phenomena with respect to surface ones. In
spite of that, and even for quite small values of the escape depth, most
analyses of PE data have been based on the assumption of a pure volume
photoexcitation and calculated from the bulk value of the dielectric constant.
The consistency of the calculated escape depths with all experimental
data is evidence of the quite small value of the surface effect with respect
to the volume one. This result must be especially noted because in early
theoretical work on PE from metals, the free-electron model led theorists
to consider only the surface effect.



The first photoyield-versus-thickness curves for thin films of alkali
metals obtained by Mayer and his school (see the review by THOMAS [1966])
are of historical importance because they showed that the volume effect
represents at least the main part of PE. An additive surface absorption
term could be detected in the photoyield versus thickness if the film could
be produced in the bulk structure and bounded by two parallel planes for
the smallest thicknesses. In practice several monolayers at least are necessary
to approach this model. Up to quite thick films, the roughness and work
function may depend on the thickness and often on other uncontrolled
quantities. An agreement between the thickness dependences of the photoyield,
as it is measured and as it is calculated with the assumption of a
volume effect by means of the expression (3.1), can be a negative test for
the surface effect. We must note that it is sensitive only if the thinnest good
films have a few monolayers, and it is difficult to be sure of the quality of
STEPHEN [1970] have found no important contribution of the surface effect
in gold and Rb respectively.
In Cs, theoretical (QUINN [1962]) and experimental (VERNIER, M. PAUTY,
F. PAUTY [1969], SMITH and FISCHER [1971]) considerations have shown
that the escape depth of photoelectrons is very small, at most a few monolayers.
This fact would make the surface effect easier to detect if it exists.
The steep increase of photoyield at zero thickness, observed by MAYER
[1961], is consistent with both volume and surface photoexcitation, and
we shall need further experimental tests to decide whether the mechanism
of the absorption that can lead to photoemission only in the surface layers
is different from bulk absorption.



As predicted by the earliest theories (section 2.5.1), in a surface effect
due to the surface perturbation of bulk states (SPBS) the only active component
of the electric field is normal to the surface. Therefore in a pure
SPBS the ratio Y_p/Y_s of the photoyields for p and s polarizations should
be infinite for a perfectly smooth surface. On the assumption of a pure
volume effect in a semi-infinite solid or in an opaque film, Y_p/Y_s should
equal the ratio (1 − R_p)/(1 − R_s) of the fluxes that penetrate into the solid,
where R_p and R_s are the reflectances for the p and s polarizations. In a thin
film, the volume effect assumption leads to expressions of Y_p/Y_s that differ
from (1 − R_p)/(1 − R_s) because of interference and can be deduced from
eq. (3.1). For very small escape depths it is sufficient to consider the ratio
of the density of electromagnetic energy just below the surface for p and
s polarizations. The equality of Y_p/Y_s with the value that is calculated on
the assumption of a pure volume effect ((1 − R_p)/(1 − R_s) for opaque films)
affords a good negative test for a surface effect. No surface effect was detected
by the polarization effect in the experiments of IVES and BRIGGS [1936, 1938],
MAYER and THOMAS [1957], METHFESSEL [1957] in alkali metals and of
VERNIER, PAUTY and BERTHELEMY [1965] in gold, silver and aluminium.
Only recently have anomalies been found that could not be explained by
the standard theory of the volume effect.
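
For an opaque film, the volume-effect value (1 − R_p)/(1 − R_s) can be evaluated from the Fresnel reflection coefficients of a surface with complex refractive index n + ik. The sketch below is only an illustration of that calculation: the function names are hypothetical and the index used is an arbitrary metal-like value, not a tabulated optical constant of any of the metals discussed here.

```python
import cmath
import math


def reflectances(n_complex, angle_deg):
    """Return (R_p, R_s) for light incident from vacuum on a semi-infinite medium."""
    theta = math.radians(angle_deg)
    cos_i = math.cos(theta)
    sin2_i = math.sin(theta) ** 2
    eps = n_complex ** 2
    # Normal component of the transmitted wave vector, in units of omega / c.
    root = cmath.sqrt(eps - sin2_i)
    r_s = (cos_i - root) / (cos_i + root)
    r_p = (eps * cos_i - root) / (eps * cos_i + root)
    return abs(r_p) ** 2, abs(r_s) ** 2


def volume_effect_ratio(n_complex, angle_deg):
    """Ratio (1 - R_p) / (1 - R_s) of the fluxes that penetrate into the solid."""
    R_p, R_s = reflectances(n_complex, angle_deg)
    return (1.0 - R_p) / (1.0 - R_s)


n_metal = complex(0.2, 3.0)  # hypothetical index, for illustration only
ratio_0 = volume_effect_ratio(n_metal, 0.0)
ratio_60 = volume_effect_ratio(n_metal, 60.0)
print(f"normal incidence: {ratio_0:.3f}, 60 degrees: {ratio_60:.3f}")
```

A measured Y_p/Y_s significantly larger than this ratio at the same angle would be the kind of anomaly that the polarization test is meant to reveal.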
The detection of a surface contribution could be expected if one or
several of the following conditions are satisfied:
(a) If the free-electron model is a good approximation, because it predicts
no volume effect, e.g., for Al, Mg, alkali metals.
(b) If the volume effect requires the violation of the k conservation
selection rule. This can occur near the threshold.
(c) If a special distribution of the electromagnetic field concentrates
the energy in the form of an electric field normal to the surface. This occurs
near surface plasma resonance for p-polarization (Fig. 20) (CHABRIER,
[1975]). Surface roughness may help this concentration (see section 2.4.2).
This resonance can be described (ENDRIZ [1973]) as an additional step in
the interaction of an incident photon with the solid; the photon excites a
surface plasmon that subsequently decays into a one-electron excitation
(or a reflected photon). The distinction between surface and volume
photoexcitations is based on the same principle at plasma resonance and at
other photon energies. We must know whether the electric field, whether
or not coupled with plasma oscillations, excites photoelectric transitions
because of surface perturbation or because of standard bulk processes.
The plasma resonance of silver is especially popular because it is in the
near ultraviolet and the chemical reactivity is much smaller in silver
than in alkali metals. The only drawback of silver is that the work function
of the clean metal is larger than the plasmon energy. Therefore PE experiments
at the plasma frequency need a surface coating with a sub-monolayer
of an electropositive metal (Cs, Ba, or Al) or compound (Cs2O, CsF or
BaO). For either volume or surface effect the photoyield Y_p and the ratio
Y_p/Y_s have a peak at the plasma frequency, just as do the absorption
coefficient A_p and the ratio of absorption coefficients A_p/A_s. But the amplitude
of the peak is not consistent with an isotropic excitation in the volume
of silver (HOFMANN and STEINMANN [1968]). FONTENEAU, PAUTY and
VERNIER [1971] have found Y_p/A_p = Y_s/A_s for a light frequency larger
than the plasma resonance but Y_p/A_p < Y_s/A_s in the region of plasma
resonance. They also observed this anomaly when the light frequency is
smaller than the plasma resonance in the thickest films. A contribution of the
SPBS to PE would produce a discrepancy of the opposite sign, so that
the explanation should probably be sought in the direction of PE from
surface states localized in the activating layer (aluminium oxidized by
residual gases). The experiments of CHABRIER, GOUDONNET, TRUITARD
and VERNIER [1973] quoted in section 3.4, which led to a determination of
the escape depth, afford a more complete set of data to investigate this
anomaly of silver. For light frequency above the plasma resonance the
consistency of the values of escape depth for p and s polarizations is
equivalent to the result of Fonteneau et al., Y_p/A_p = Y_s/A_s. The results described
by Chabrier et al. as a larger escape depth for p polarization imply the
opposite of Fonteneau et al., Y_p/A_p > Y_s/A_s. Here the sign of the anomaly
is consistent with a SPBS. This interpretation is confirmed by the appearance
of the anomaly at the plasma frequency, where the electromagnetic energy
is concentrated in the normal component of the electric field (condition (c))
(Fig. 20). The anomaly also appeared at light frequencies smaller than the
plasma resonance, where the condition (c) might be satisfied. But the
sensitivity of the anomaly to the composition of the surface does not
allow us to rule out a PE from surface localized impurity levels.

Fig. 20. Variation of the density of electrostatic energy per incident light power, near the
surface, associated with the 3 components of the electric field in a 280 Å thick silver film
illuminated at an angle of incidence of 60°, plotted against wavelength λ (Å) and photon
energy hν (eV). (------) x component parallel to the surface (p wave), (-) z component
normal to the surface (p wave), (- - - -) y component parallel to the surface (s wave).
ENDRIZ and SPICER [1971b] observed in aluminium values of the ratio
Y_p/Y_s larger than expected for a volume effect below the surface plasmon
energy, for hν < 10.5 eV. ENDRIZ and SPICER [1971a,b], FLODSTRÖM and
ENDRIZ [1973] and ENDRIZ [1973] discussed in detail the expected contribution
of the surface effect (SPBS) with the model developed by MITCHELL
[1934, 1935, 1936] and SCHIFF and THOMAS [1935]. ENDRIZ and SPICER
[1971] have shown that roughness causes a strong decrease of the reflectance
and an increase of the photoyield near plasma resonance.
We have seen (section 3.4, Fig. 8) that CALLCOTT and ARAKAWA [1975]
observed a misfit between the expression (3.1) for the volume effect and
the observed values of Y-(i2)/Y-(0) about the angle of incidence iz,, at
which plasma resonance occurs. They explained this anomaly by a contribution
of the surface effect that amounts to up to 65 % because, as in silver,
the plasma resonance concentrates the light energy in the electric field
component that is normal to the surface (CHABRIER, GOUDONNET, TRUITARD
and VERNIER [1973]).
The investigations quoted in this section about the polarization depen-
dence of the photoyield of alkali metals were consistent with a pure volume
effect. Discrepancies appeared only recently in the work of MONIN[1973]
and MONINand BOUTRY [1974]. Monin measured the index of refraction
of opaque alkali films by ellipsometry and the quantum yield Y. He cal-
culated the true quantum efficiency, i.e., the ratio Q = Y/(1 -R) (Figs. 21
and 22). Just after the deposition of films at 77 °K (under ultra-high vacuum,
of course) Q was roughly independent of the polarization of the light, as
was expected on the assumption of a volume isotropic excitation (Table 4.1).
But if the film was reheated to ordinary temperature Q decreased more
for the s than for the p polarization (Table 4.1).

Table 4.1
Value of the true quantum efficiency that gives maximum polarization dependence for opaque
alkali films deposited at 77 K, before and after reheating, for both polarizations, Q_p and Q_s,
at wavelengths λ in ångströms (after MONIN and BOUTRY [1974])

                            Cs      Rb      K       Na

Cold films       λ (Å)      5000    4550    4300    3200
                 10⁴ Q_p    0.95    28      270     880
                 10⁴ Q_s    0.75    21      245     710

Reheated films   λ (Å)      5000    4360    3880    3340
                 10⁴ Q_p    0.8     16      17      230
                 10⁴ Q_s    0.5     2       9.1     64

The results of Table 4.1 are consistent with an important contribution
of the surface effect (SPBS) in reheated films of Rb and K and a less
important one in reheated films of Na. But we cannot explain why no
surface effect occurs before reheating. A precise knowledge of the structural
change in the bulk and in the surface could help the interpretation.
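
Since Q = Y/(1 − R) already removes the polarization dependence of the absorbed flux, an isotropic volume excitation would give Q_p/Q_s close to unity, so this ratio is one simple way to read Table 4.1. The short sketch below computes it from the tabulated values as transcribed here; the data structure and function name are illustrative only, and no interpretation beyond the arithmetic is implied.

```python
# Values of 10^4 * Q transcribed from Table 4.1 (after MONIN and BOUTRY [1974]).
TABLE_4_1 = {
    "cold": {
        "Qp": {"Cs": 0.95, "Rb": 28.0, "K": 270.0, "Na": 880.0},
        "Qs": {"Cs": 0.75, "Rb": 21.0, "K": 245.0, "Na": 710.0},
    },
    "reheated": {
        "Qp": {"Cs": 0.8, "Rb": 16.0, "K": 17.0, "Na": 230.0},
        "Qs": {"Cs": 0.5, "Rb": 2.0, "K": 9.1, "Na": 64.0},
    },
}


def polarization_ratios(table):
    """Return Qp/Qs for every film state and metal."""
    return {
        state: {metal: rows["Qp"][metal] / rows["Qs"][metal] for metal in rows["Qp"]}
        for state, rows in table.items()
    }


ratios = polarization_ratios(TABLE_4_1)
for state, by_metal in ratios.items():
    for metal, ratio in by_metal.items():
        print(f"{state:9s} {metal:2s} Qp/Qs = {ratio:.2f}")
```

The common factor 10⁴ cancels in the ratio, so the tabulated numbers can be used directly.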

Fig. 21. Spectral distribution of the "true quantum yield" of an opaque film of Rb after
deposition at 77 °K (+) and after reheating at 195 °K (*) for s and p polarization (after MONIN
Q_p and Q_s for Cs are not very different and this suggests that the relative
contribution of the surface effect is very small, although the contribution
of the volume effect is reduced by the very small value of the escape depth
that is confirmed here by the small value of both Q_p and Q_s (several orders
of magnitude smaller than for the other alkali metals). But the distinction
between the volume effect and emission from surface states becomes rather
semantic when the escape depth is reduced to a few or even to one atomic
layer (section 3.1). In so thin a surface layer the usual approximations of
the volume effect may be quite rough, especially the description of the
photon absorption by the bulk dielectric constant and the assumption of an
electric field discontinuous at the surface. It seems quite difficult then to
deal with the surface layer by using the model of a homogeneous solid limited
by an abrupt surface.
Fig. 22. Spectral distribution of the true quantum yield Q of an opaque film of Cs after
deposition at 77 K (+) and after reheating at 195 K (*) for s and p polarization (after MONIN

Every observed anomaly of the ratio Y_p/Y_s seems up to now to depend
on structural parameters that are not completely controlled. If these
parameters concern the roughness, the explanation could be found in SPBS. If
these parameters concern the surface impurities, the explanation would
lie rather in photoemission from surface states.



In metals the good results obtained with the free electron model in many
fields, e.g., electrical conductivity, led to the idea that no volume photoeffect
could occur, and photoemission was explained in the early
theories as a pure surface effect. When the possibility of a volume effect
was recognized (FAN [1945]), the k conservation selection rule led one to
expect a larger threshold hν_v for the volume effect than hν_s for the surface
effect, because in the latter case only the tangential component of the
electron wave vector k is to be conserved when the photon energy is absorbed
by the electron. WEISSLER [1956] interpreted the spectral yield distribution
of several materials (W, Mo, Pt) with two thresholds hν_s and hν_v. The
experiments described in the last paragraph did not confirm this interpretation
but the surface effect is still expected to afford a more noticeable
contribution near the threshold (ENDRIZ [1973], CHABRIER, GOUDONNET,
The steep increase of the photoyield that Weissler had interpreted as
a threshold for the volume effect is now attributed to a reflectance decrease
that permits a better penetration of the light into the solid when the photon
energy becomes larger than the plasmon energy (GÖRLICH [1959]). An alternative
explanation of the PE at low photon energy, when no k-conserving transition
exists, lies in indirect transitions with phonon creation or annihilation
or in non-direct transitions where the momentum conservation is ensured
by other many-body effects (see section 2.8).
GARTLAND, BERGE and SLAGSVOLD [1973] compared the spectral distribution
of the photoyields Y_0° for normal incidence and Y_70° for an incidence
of 70°. As expected they found Y_0° to be proportional to (hν − hν_0)²
near the threshold, but this was not so for Y_70°. They attributed the
difference between Y_70° and the expected (hν − hν_0)² law to a contribution
of the surface effect. A calculation of the penetration of light for an angle
of incidence of 70° would be necessary to confirm this interpretation.
In a semi-conductor high values of the photoyield require transitions
from the valence band. A threshold must then be observed when the photon
energy equals the difference hν_i between the vacuum level and the top of
the valence band. Another threshold hν_d has been considered by ALLEN
and GOBELI [1962], because the corresponding transitions are generally
not direct, and they assumed a much higher transition probability when
direct transitions are possible.
When photoelectrons come from surface states, a tail can be observed
in the spectral yield distribution below the threshold hν_i. To give a correct
interpretation of this tail we must take into account the band bending
induced by the space charge due to surface states and, of course, a possible
shift of the spectral yield distribution due to a change of the work function.
In a theoretical analysis KANE [1962] has made a survey of 11 different
possible surface and volume emission processes and associated with each
one a contribution to the spectral yield:

$$Y_i = C_i\,(h\nu - h\nu_i)^{p_i}. \qquad (4.1)$$

The threshold hν_i and the exponent p_i depend on the process. VAN LAAR
and SCHEER [1965] fitted the measured photoyield of a cleaned slightly
p-doped Si crystal with the expression

Y = C,(hv-E,)~+C,(hv-E,)~ (4.2)
with E_1 = 4.85 eV, E_2 = 5.40 eV. In such a crystal they expected no band
bending and they attributed the first term to a surface effect and the second
to a volume effect. FISCHER [1968] fitted the spectral yield curve of different
III-V compounds with the expression
Y = C,(hv-E,)+C,(hv-E,). (4.3)
Fischer ascribed the linear term to direct transitions between valence and
conduction band but could give the emission from surface states as the
only possible interpretation of the second term.
We must note here that the Kane expression (4.1) is nothing more than
the first term of an expansion that is certainly not valid over a very wide
photon energy range. BALLANTYNE [1972] derived expressions that take
into account electron scattering and are expected to be valid in a wider
energy range (see section 2.7.4).
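
A power law of the Kane type (4.1) is commonly analysed by linearization: if Y = C (hν − hν_i)^p with a known exponent p, then Y^(1/p) is a straight line in hν whose intercept with the energy axis gives the threshold. The sketch below illustrates this for a single process; the exponent p = 3, the prefactor and the function names are assumptions chosen for illustration, and the yields are synthetic, generated with the 4.85 eV threshold quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_from_power_law(photon_energies, yields, exponent):
    """Fit Y = C * (hv - hv_i)**exponent by a straight line in Y**(1/exponent).

    Returns (threshold, prefactor). Only points above threshold should be passed.
    """
    roots = [y ** (1.0 / exponent) for y in yields]
    slope, intercept = fit_line(photon_energies, roots)
    threshold = -intercept / slope   # the line crosses zero at hv = hv_i
    prefactor = slope ** exponent    # slope equals C**(1/exponent)
    return threshold, prefactor


# Synthetic yields (illustrative values only, energies in eV).
C_true, E_true, p = 2.0e-3, 4.85, 3.0
energies = [5.0, 5.2, 5.4, 5.6, 5.8]
yields = [C_true * (e - E_true) ** p for e in energies]

E_fit, C_fit = threshold_from_power_law(energies, yields, p)
print(f"threshold = {E_fit:.2f} eV")
```

As noted above, such a fit is meaningful only close to the threshold, where a single term of the expansion dominates.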
have seen in the low energy tail of the spectral distribution of the photo-
yield from cleaved crystals of Si pieces of structure that disappeared when
the cleavage (made under ultra-high vacuum), was exposed to the residual
gases or to oxygen. We shall return to these investigations in the discussion
of the results obtained by PED with the same material.



The PED has been the most direct way to investigate the energy levels
of electrons in a solid. If a piece of structure observed in the PED appears
or disappears when a very small amount of gas is adsorbed, we think at
first of a surface localized level.
We reported in section 3.9 and Fig. 17 the PED observed by Wagner
and Spicer for highly n-doped samples of Si. The attribution of the peaks
A and B to surface states is strongly suggested by arguments other than
their disappearance after an exposure to oxygen. Their distance to the
Fermi level and their intensity are independent of the photon energy, in
contrast to the peaks C and D that have been attributed to direct interband
transitions. Wagner and Spicer estimated a density of surface levels of
about n_s = 8 × 10¹⁴ cm⁻².

EASTMAN and GROBMAN [1972] measured the PED for cleaved surfaces
of lightly n-doped crystals of Si ( l l l ) , Ge (111) and GaAs (110). The
resistivity of the Si and Ge crystals were respectively 5 0 * cm and 4 Q * cm.
They used synchrotron radiation in the spectral range 7-25 eV. Like
Wagner and Spicer, Eastman and Grobman found a structure near the
Fermi level that disappeared after an exposure of the sample to the residual
gases for a few hours (lo-'' torr). Except for the peak B, that has not been
found by Eastman and Grobman, the estimated density of surface states
in Si is roughly the same in both experiments. RANKEand JACOBI[I9731
have observed oxygen sensitive structure in the PED from polar faces of
GaAs crystals ((111) Ga and (1 11) As), that they attributed to surface states.
In Si samples that had been exposed to residual gases, Eastman and
Grobman observed the appearance of a broad band of filled states centered
at 7 eV below the vacuum level, which they attributed to extrinsic levels of
(Si-O). Eastman and Grobman gave a rough estimate of the escape
depth of bulk excited electrons by comparing the numbers of electrons
excited in the bulk and at the surface, and assuming the same matrix
element for both transitions. They obtained L = 40 Å for hν = 8.5 eV,
L = 17 Å for hν = 10 eV and L = 6 Å for hν = 12 eV. The agreement
with other estimates (WAGNER and SPICER [1972], GOBELI and ALLEN
[1962], KANE [1967]) is satisfactory, although the difference between
the values of L for hν = 8.5 eV and hν = 12 eV seems quite large.
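
The surface sensitivity implied by these escape depths can be illustrated by a
short numerical sketch. It rests on two simplifying assumptions that are not
taken from the measurements themselves: an electron excited at depth z escapes
without scattering with probability exp(-z/L), and the excitation is uniform
over the depths considered (light attenuation length much larger than L). The
layer thickness of 3 Å and the function name are hypothetical, chosen only for
illustration.

```python
import math

def fraction_from_top_layer(escape_depth, layer_thickness):
    """Fraction of the unscattered photoelectron signal that originates
    within `layer_thickness` of the surface, assuming uniform excitation
    and an escape probability exp(-z / escape_depth)."""
    return 1.0 - math.exp(-layer_thickness / escape_depth)

# Escape depths quoted above (in Angstrom), keyed by photon energy in eV.
ESCAPE_DEPTHS = {8.5: 40.0, 10.0: 17.0, 12.0: 6.0}

# Hypothetical thickness of the surface layer, in Angstrom (an assumption).
LAYER = 3.0

for photon_energy, depth in ESCAPE_DEPTHS.items():
    share = fraction_from_top_layer(depth, LAYER)
    print(f"hv = {photon_energy:4.1f} eV  L = {depth:4.1f} A  "
          f"top-layer share = {share:.2f}")
```

Under these assumptions the share of the signal coming from the outermost layer
grows several-fold between hν = 8.5 eV and hν = 12 eV, which is why the shorter
escape depths favour the detection of surface levels.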
The cleavage of a crystal under an ultra-high vacuum is not a sufficient
condition for obtaining a perfect or perfectly reproducible surface. ERBUDAK
and FISCHER [1972] observed the surface of cleaved crystals of Si by
LEED and by Auger electron spectroscopy, and determined the PED
for photons of low energy before and after annealing at different
temperatures. In agreement with the observations of ALLEN and GOBELI [1962],
CALLCOTT [1967], FISCHER [1968], and EASTMAN and GROBMAN [1972],
before annealing they observed no structure that could be attributed to
PE from surface states, as would be expected if the levels associated with
the peak B of Wagner and Spicer were present. Erbudak and Fischer observed
such structures only after annealing. The LEED pattern of the Si surface,
which was of type 2 × 1 just after cleavage, became 1 × 1 after annealing at
550 °K and 7 × 7 after annealing at 800 °K.
SEBENNE, GUICHAR, BOLMONT and BALKANSKI [1973] deduced from the
spectral yield distribution of n-Si and its modification by an exposure to
oxygen a density of surface levels in agreement with the results described above.
But LAUDE [1973] suggested that the structures previously attributed
to surface states should be attributed to direct transitions from the Γ₂₅′
level of bulk Si. The disappearance of the structure with contamination
would then be explained by surface inelastic scattering.
In metals PE from intrinsic and extrinsic surface states has been invoked
to explain structures in the PED that were sensitive to residual gases.
FORSTMANN and HEINE [1970] suggested intrinsic surface states for
anomalous structures previously observed for Ni and Cu, e.g., by CALLCOTT
and MAC RAE [1969], who had measured the PED for (111) faces of Ni
monocrystals that had been checked by LEED. EASTMAN [1971] believes
that these structures should rather be attributed to PE from extrinsic
surface states. EASTMAN and CASHION [1971] measured the PED for thin
films of Ni exposed to oxygen and CO and found characteristic electronic
levels at 5.5 eV below the Fermi level for oxygen and at 7.5 and 10.7 eV for CO.
BAKER and EASTMAN [1973] also observed characteristic impurity levels
when (111) and (100) faces of a W monocrystal were exposed to oxygen and CO.
They could check the structure of the adsorbed films by LEED and Auger
spectroscopy, and identify in the PED from the adsorbed CO layer a peak at
-8.9 eV due to the CO molecule.
In metals most of the effects that could reveal a gap between the energy
bands associated with each direction of the wave vector k are masked by
overlapping when all directions of k add their contributions. Nonetheless
VERNIER, COQUET and BOURSEY [1968] predicted a possible increase of the
photoelectric threshold for crystallographic faces of a crystal such that
the vacuum level lies in the band gap associated with k normal to this face.
FEUERBACHER and FITTON [1973] suppose in that case that a new type of
surface effect occurs, with direct transition from a bulk Bloch state to a free
electron state in vacuum. A surface coupling between these states is possible
because in a semi-infinite solid each Bloch state in the solid includes an
exponentially damped part outside the solid and each free electron wave
outside the solid includes a damped part inside the solid.
A determination of the PED for one direction of emission can separate
the overlapping bands and is especially useful to detect the surface states
associated with one direction of k. The energy distribution for the electrons
emitted normal to the surface within an angle of 12-15° was measured by
WACLAWSKI and PLUMMER [1972] with a retardation potential technique
and by FEUERBACHER and FITTON [1972, 1973] with a 127° cylindrical analyser.
FEUERBACHER and FITTON [1972] observed (100) faces of a W monocrystal
and WACLAWSKI and PLUMMER [1972] observed polycrystalline samples,
but purified the sample by a heat treatment that induced a dominant (100)
orientation. In both publications 7.7 eV and 10.2 eV photons induced a
PED with a peak that has been attributed to electrons excited from a surface
level at 0.4 eV below the Fermi energy. Feuerbacher and Fitton made
the -0.4 eV peak disappear by an exposure of W to H₂. Waclawski and
Plummer observed the same effect with H₂, O₂, N₂ and CO and noticed
at the same time the appearance of peaks characteristic of the adsorbed gas.

Fig. 23. Energy distribution of photoelectrons emitted normally to (100), (110) and (111) faces
of a monocrystal of W for several photon energies (after FEUERBACHER and FITTON [1973]).

FEUERBACHER and FITTON [1973] gave the energy distribution of the
photoelectrons emitted normal to (100), (110) and (111) faces of a
monocrystal. They confirmed the results obtained earlier with (100) faces and
found very different distributions for the other faces (Fig. 23). The surface
peak 1 that had been found at -0.4 eV for the (100) face does not exist for
other orientations. This confirms its surface origin. The structure no. 2
could be explained at first sight by non-direct transitions from bulk levels,
but Feuerbacher and Fitton attributed it to the photoexcitation of bulk
states allowed by the surface, because it is observed only for (100) faces. The
peaks no. 4 and 6, observed for the (110) face, can be associated with a strong
density of levels at E₄ = -0.3 eV and E₆ = -1.4 eV, but no Bloch state
exists at the energies E₄ + hν and E₆ + hν on the line ΓN of the band diagram
(Fig. 24). Feuerbacher and Fitton explain these structures by the surface
coupling between the Bloch state inside the solid and the free electron
state outside the solid that we have previously considered in this paragraph.
Other structures 3, 8, 9, 10, 11 can be readily interpreted as direct transitions.
(See also TURTLE and CALLCOTT [1975].)


Fig. 24. Band structure of tungsten after Christensen. (Quoted by FEUERBACHER

§ 5. Conclusion

The surface excitation, that was alone considered in the early works
based on the free electron model, appears now as a minor contribution that
can be proved only by very refined technique.
In a good approximation the PE is localized by electron scattering within
a more or less thin superficial layer. The attenuation length of a light beam
is generally at least in the range of one hundred Angstrom and is larger
than the escape depth of the electrons. The escape depth of the electrons
varies in a very large range according to the photon energy. It depends to
a quite small extent on the material when the electron-electron interaction
is not forbidden by a band gap. As was first recognized by SPICER [1960]
in alkali-antimonides, much higher values of the escape depth have been
obtained for small photon energy in semiconductors when the electron
affinity is smaller than the band gap. The drop of the elastic escape depth
in KI by one order of magnitude when the electron energy becomes large
enough to allow scattering by pair production (BAER and LAPEYRE [1973])
is another instance of the dominant efficiency of the electron-electron
interaction. Still higher values of the total escape depth are obtained when
the electron affinity is negative. These abnormal values, which violate the
general rule, have, of course, a special importance in the fabrication of light
We have represented in Fig. 25 the general trend of the variation of the
escape depth with energy. The number of published data is so high that
we have not been able to indicate individual measurements. For high energy
photons quite a large difference appears between the inelastic and elastic
escape depths. In principle, the values represented in Fig. 25 are associated
with elastic escape depth. But we must be aware that the distinction between
elastic and inelastic escape depth is based on a finite energy resolution of the
experimental apparatus. Obviously the inelastic escape depth results from
the combination of an increasing number of mean free paths, and in the far
ultraviolet and soft X-ray range the inelastic escape length may be several
orders of magnitude larger than the elastic one.
The escape depth is often reduced to a few monolayers (less than one
for Cs). Then the dependence of the light absorption process on the elec-
tron scattering is certainly very important for purely optical as well as for
photoelectric phenomena. The least perturbation, that we can expect,
concerns direct transitions. When electron scattering becomes more
important, increasing differences between initial and final wave vectors of
the electron are allowed. It should be much more satisfactory to treat such


Fig. 25. General behavior of the variation of the escape depth with electron energy.

cases within a one-step theory and we may hope that one-step theories will
soon be able to take quantitative account of the electron-electron scattering.
Of course, the surface is one of the most important elements in photo-
emission. The work function is determined by surface phenomena and the
interaction of an incident electron with the surface can include elastic or
inelastic scattering that cannot be fully described by the work function
alone. The PE has much to receive from and much to contribute to progress
in surface physics. Here too, it would be desirable to gather into a one-step
theory the interaction of the electrons with the surface, with light and with
other elements of the solid. Especially when the escape depth is very small,
it is very crude to describe the absorption of light by the surface layer with
the same dielectric constant as the absorption by the bulk. The interpretation
of reflectance data is subject to the same objection as the interpretation of
photoelectric data, especially in the case of cesium.

We may hope for great progress in PE and in surface physics because
of the advances in ultra high vacuum technology, LEED, Auger electron
analysis and PE experimentation. With cross-checking of several tech-
niques we may hope to determine the position and the nature of the surface
atoms and their interactions with light and electrons and to describe them
with realistic models.

The author wishes to thank T. A. Callcott for many stimulating sugges-
tions as well as critical reading of the manuscript.

ADAWI, I., 1964, Phys. Rev. 134 A, 788.
AGARWAL, G. S., D. N. PATTANAYAK and E. WOLF, 1971a, Optics Commun. 4, 255; 1971b,
Phys. Rev. Lett. 27, 1022.
AGARWAL, G. S., D. N. PATTANAYAK and E. WOLF, 1974, Phys. Rev. B 10, 1447.
ALLEN, F. G. and G. W. GOBELI, 1962, Phys. Rev. 127, 150.
ANDEREGG, M., B. FEUERBACHER and B. FITTON, 1971, Phys. Rev. Lett. 27, 1565.
ANTYPAS, G. A., L. W. JAMES and J. J. UEBBING, 1970, J. Appl. Phys. 41, 2888.
APPELBAUM, J. A. and D. R. HAMANN, 1973, Phys. Rev. Lett. 31, 106.
1974, Communication at the 4th Intern. Conf. on Vacuum-Ultra-Violet Radiation Physics,
ASHLEY, K. L., D. L. CARR and R. ROMANO-MORAN, 1973, Appl. Phys. Lett. 22, 23.
BAER, A. D. and G. J. LAPEYRE, 1973, Phys. Rev. Lett. 31, 304.
BAER, Y., P. F. HEDEN, J. HEDMAN, M. KLASSON and C. NORDLING, 1970, Solid State Commun.
8, 1479.
BAKER, J. M. and D. E. EASTMAN, 1973, J. Vac. Sci. & Technol. 10, 223.
BALLANTYNE, J. M., 1972, Phys. Rev. B 6, 1436.
BARTELINK, D. J., J. L. MOLL and N. I. MEYER, 1963, Phys. Rev. 130, 972.
BELL, R. L., L. W. JAMES, G. A. ANTYPAS, J. EDGECUMBE and R. L. MOON, 1971, Appl. Phys.
Lett. 19, 513.
BELL, R. L. and W. E. SPICER, 1970, Proc. IEEE 58, 1788.
BELL, R. L. and J. J. UEBBING, 1968, Appl. Phys. Lett. 12, 76.
BERGLUND, C. N. and W. E. SPICER, 1964, Phys. Rev. 136 A, 1030.
BLECHSCHMIDT, D., M. SKIBOWSKI and W. STEINMANN, 1970, Phys. Stat. Sol. 42, 61.
BORTOLANI, V., C. CALANDRA and M. J. KELLY, 1973, J. Phys. C 6, 349.
BURTON, J. A., 1947, Phys. Rev. 72, 531.
CALLCOTT, T. A., 1967, Phys. Rev. 161, 746.
CALLCOTT, T. A. and E. T. ARAKAWA, 1974, J. Opt. Soc. Amer. 64, 839.
CALLCOTT, T. A. and E. T. ARAKAWA, 1975, Phys. Rev. B 11, 2750.
CALLCOTT, T. A. and A. U. MAC RAE, 1969, Phys. Rev. 178, 966.
CAMPAGNA, M., D. T. PIERCE, K. SATTLER and H. C. SIEGMANN, 1973, J. Phys. 34, C 6-95.
CHABRIER, G., J. CORNAZ, J. P. GOUDONNET and P. J. VERNIER, 1970, Optics Commun. 1, 391.
CHABRIER, G., J. P. GOUDONNET, J. F. TRUITARD and P. J. VERNIER, 1973, Phys. Stat. Sol. (b)
60, K 23.


Electron. 16, 203.
COLLINS, R. A. and L. W. DAVIES, 1963, Appl. Phys. Lett. 2, 213.
COLLINS, R. A. and R. D. GOULD, 1971, Solid State Electronics 14, 805.
COLLINS, R. A., I. A. EDGE and K. O. LEGG, 1972, Phys. Stat. Sol. (a) 9, 309.
COQUET, E. and P. J. VERNIER, 1966, C.R. Acad. Sci. 262, 1141.
CROWELL, C. R., W. G. SPITZER, L. E. HOWARTH and E. E. LABATE, 1962, Phys. Rev. 127, 2006.
CROWELL, C. R. and S. M. SZE, 1967, Physics of Thin Films 4, 325.
DAUDE, A., A. SAVARY and S. ROBIN, 1972, J. Opt. Soc. Amer. 62, 1.
DAVISON, S. G. and J. D. LEVINE, 1970, Solid State Physics 25, 1.
DAVYDOFF, A. S., 1965, Quantum Mechanics (Pergamon Press) p. 309.
DEDERICHS, P. H., 1972, Sol. State Phys. 27, 135.
DONIACH, S., 1970, Phys. Rev. B 2, 3898.
DUCKETT, S. W., 1968, Phys. Rev. 166, 302.
DUCKETT, S. W. and P. M. METZGER, 1965, Phys. Rev. 137 A, 953.
EASTMAN, D. E., 1970, Sol. State Communications 8, 41.
EASTMAN, D. E., 1971, Phys. Rev. B 3, 1769.
EASTMAN, D. E. and J. K. CASHION, 1971, Phys. Rev. Lett. 27, 1520.
EASTMAN, D. E. and W. D. GROBMAN, 1972, Phys. Rev. Lett. 28, 1378.
EASTMAN, D. E. and W. F. KROLIKOWSKI, 1968, Phys. Rev. Lett. 21, 623.
EDEN, R. C., J. L. MOLL and W. E. SPICER, 1967, Phys. Rev. Lett. 18, 597.
ENDRIZ, J. G., 1973, Phys. Rev. B 7, 3464.
ENDRIZ, J. G. and W. E. SPICER, 1971a, Phys. Rev. Lett. 27, 570.
ENDRIZ, J. G. and W. E. SPICER, 1971b, Phys. Rev. B 4, 4159.
ERBUDAK, M. and T. E. FISCHER, 1972, Phys. Rev. Lett. 29, 732.
FAN, H. Y., 1945, Phys. Rev. 68, 43.
FEIBELMAN, P. J., 1973, Surf. Sci. 36, 558.
FEUERBACHER, B. and B. FITTON, 1972, Phys. Rev. Lett. 29, 786.
FEUERBACHER, B. and B. FITTON, 1973, Phys. Rev. Lett. 30, 923.
FISCHER, D. G., R. E. ENSTROM, J. S. ESCHER and B. F. WILLIAMS, 1972, J. Appl. Phys. 43,
FISCHER, T. E., 1968, Surf. Sci. 10, 399.
FISCHER, T. E., 1969, Surf. Sci. 13, 30.
FLODSTRÖM, S. A. and J. G. ENDRIZ, 1973, Phys. Rev. Lett. 31, 893.
FONTENEAU, J. Y., M. PAUTY and P. VERNIER, 1971, C.R. Acad. Sci. B 272, 441.
FORSTMANN, F., 1967, Z. Phys. 203, 495.
FORSTMANN, F. and V. HEINE, 1970, Phys. Rev. Lett. 24, 1419.
FORSTMANN, F. and J. B. PENDRY, 1970, Z. Phys. 235, 75.
FOWLER, R. H., 1931, Phys. Rev. 38, 45.
FRANK, G. and S. GARBE, 1973, Acta Electronica 16, 237.
FUCHS, R. and K. L. KLIEWER, 1969, Phys. Rev. 185, 905.
GARBE, S., 1969a, Solid State Electronics 12, 893.
GARBE, S., 1969b, Phys. Stat. Sol. 33, K 87.
GARBE, S., 1970, Phys. Stat. Sol. (a) 2, 497.
GARBE, S. and G. FRANK, 1970, Proc. Third Intern. Symp. on Gallium Arsenide and Related
Compounds, p. 208.
GARTLAND, P. O., S. BERGE and B. J. SLAGSVOLD, 1973, Phys. Rev. Lett. 30, 916.
GAUDART, L., 1973, J. Phys. 34, C 6-97.
GERSTEN, J. I. and N. TZOAR, 1973, Phys. Rev. B 8, 5671.
GERSTEN, J. I. and N. TZOAR, 1974, Phys. Rev. B 9, 4038.
GESELL, T. F. and E. T. ARAKAWA, 1971, Phys. Rev. Lett. 26, 377.
GESELL, T. F. and E. T. ARAKAWA, 1971, Report TM 2617 of Oak Ridge National Laboratory,
Oak Ridge, Tenn.

GOBELI, G. W. and F. G. ALLEN, 1962, Phys. Rev. 127, 141.
GÖRLICH, P., 1959, Advances in Electronics and Electron Physics 11, 1.
GÖRLICH, P. and K. SUMI, 1970, Physica Status Solidi 2, 427.
GOUDONNET, J. P., G. CHABRIER and P. VERNIER, 1974, Communication at the Intern. Conf.
on Vacuum-Ultra-Violet Radiation Physics, Hamburg.
GOULD, R. D., C. A. HOGARTH and R. A. COLLINS, 1973, J. Non Crystalline Solids 12, 131.
GURMAN, S. J. and J. B. PENDRY, 1973, Phys. Rev. Lett. 31, 637.
HACKETT, W. H., 1972, J. Appl. Phys. 43, 1649.
HALL, C. K. and C. H. B. MEE, 1970, Congrès Intern. sur les couches minces, Le Vide,
supplément no. 147.
HEBB, M. H., 1951, Phys. Rev. 81, 707.
HEINE, V., 1972, Proc. R. Soc. Lond. A 331, 307.
HERMEKING, H., 1972, Z. Phys. 253, 379.
HERMEKING, H., 1973, J. Phys. 6, 2898.
HICKMOTT, T. W., 1963, J. Appl. Phys. 34, 1569.
HICKMOTT, T. W., 1965, J. Appl. Phys. 36, 1885.
HIRSCHBERG, K. and K. DEUTSCHER, 1968, Phys. Stat. Sol. 26, 527.
HOFMANN, H. H. and K. DEUTSCHER, 1970, Z. Phys. 236, 288.
HOFMANN, J. and W. STEINMANN, 1968, Phys. Stat. Sol. 30, K 53.
HOPFIELD, J. J. and D. G. THOMAS, 1963, Phys. Rev. 132, 563.
HUGHES, A. L. and L. A. DUBRIDGE, 1942, Photoelectric Phenomena (McGraw-Hill).
IVES, H. E. and H. B. BRIGGS, 1936, J. Opt. Soc. Amer. 26, 247.
IVES, H. E. and H. B. BRIGGS, 1938, J. Opt. Soc. Amer. 28, 330.
JACOBI, K. and J. HÖLZL, 1971, Surf. Sci. 26, 54.
JAMES, L. W., G. A. ANTYPAS, J. EDGECUMBE, R. L. MOON and R. L. BELL, 1971, J. Appl.
Phys. 42, 4876.
JAMES, L. W., G. A. ANTYPAS, R. L. MOON, J. EDGECUMBE and R. L. BELL, 1973, Appl. Phys.
Lett. 22, 270.
JAMES, L. W. and J. L. MOLL, 1969, Phys. Rev. 183, 740.
JASPERSON, S. N. and S. E. SCHNATTERLY, 1969, Phys. Rev. 188, 759.
KANE, E. O., 1962, Phys. Rev. 127, 131.
KANE, E. O., 1966, Phys. Rev. 147, 335.
KANE, E. O., 1967, Phys. Rev. 154, 624.
KANTER, M. and P. J. FEIBELMAN, 1962, J. Appl. Phys. 33, 3580.
KANTER, M., 1970, Phys. Rev. B 1, 522.
KATRICH, G. A. and O. G. SARBEI, 1961, Sov. Phys. Solid State 3, 1181.
Phys. Scripta 5, 93.
KLEIN, W., 1969, J. Appl. Phys. 40, 4384.
KROLIKOWSKI, W. F. and W. E. SPICER, 1969, Phys. Rev. 185, 882.
KROLIKOWSKI, W. F. and W. E. SPICER, 1970, Phys. Rev. B 1, 478.
LANGRETH, D. C., 1971, Phys. Rev. B 3, 3120.
LAUDE, L. D., 1973, J. Phys. 34, C 6-35.
LEWOWSKI, T., P. BASTIE and M. BIZOUARD, 1969, C.R. Acad. Sci. B 268, 110.
LEWOWSKI, T., P. BASTIE and M. BIZOUARD, 1970, Phys. Stat. Sol. (a) 2, 847.
LIU, Y. Z., J. L. MOLL and W. E. SPICER, 1970, Appl. Phys. Lett. 17, 60.
LYE, R. G. and A. J. DEKKER, 1957, Phys. Rev. 107, 977.
MACEK, C. H., A. OTTO and W. STEINMANN, 1972, Phys. Stat. Sol. (b) 51, K 59.
MAHAN, G. D., 1970, Phys. Rev. Lett. 24, 1068.
MAHAN, G. D., 1970, Phys. Rev. B 2, 4334.
MAHAN, G. D., 1973, Phys. Status Solidi B 55, 703.
MAKINSON, R. E. B., 1949, Phys. Rev. 75, 1908.
MARTINELLI, R. U., 1970, Appl. Phys. Lett. 16, 261.

MARTINELLI, R. U., 1973, J. Appl. Phys. 44, 2566.
MAYER, H., 1961, Symp. of the electric and magnetic properties of thin metallic layers,
Leuven, Belgium.
MAYER, H., D. L. BLANARU and H. STEFFEN, 1970, Thin Solid Films 5, 389.
MAYER, H. and H. THOMAS, 1957, Z. Phys. 147, 419.
MEAD, C. A., 1962, Phys. Rev. Lett. 8, 56.
MELNYK, A. R. and M. J. HARRISON, 1970, Phys. Rev. B 2, 835.
METHFESSEL, S., 1957, Z. Phys. 147, 442.
METZGER, S. W., 1965, J. Phys. Chem. Solids 26, 1879.
MITCHELL, K., 1934, Proc. Roy. Soc. A 146, 442.
MITCHELL, K., 1935, Proc. Camb. Phil. Soc. 31, 416.
MITCHELL, K., 1936, Proc. Roy. Soc. A 153, 513.
MÖNCH, W., 1973, in: Festkörperprobleme XIII, ed. H. J. Queisser (Pergamon Press) p. 241.
MONIN, J., 1973, Acta Electronica 16, 139.
MONIN, J. and G. A. BOUTRY, 1974, Phys. Rev. B 9, 1309.
NIQUET, G., P. J. VERNIER and P. HARTMANN, 1970, C.R. Acad. Sci. Ser. B 270, 1234.
NOZIÈRES, P. and C. T. DE DOMINICIS, 1969, Phys. Rev. 178, 1097.
PALMBERG, P. W. and T. N. RHODIN, 1968, J. Appl. Phys. 39, 2425.
OTTO, A., 1968, Phys. Stat. Sol. 26, K 99.
OTTO, A., 1970, Phys. Stat. Sol. 42, K 37.
PEISNER, J., A. QUEMERAIS, M. PRIOL and S. ROBIN, 1973, J. Phys. 34, C 6-9.
PEISNER, J., P. ROBOZ and P. B. BARNA, 1971, Phys. Stat. Sol. (a) 4, K 187.
PEPPER, S. V., 1970, J. Opt. Soc. Amer. 60, 805.
PIEPENBRING, F. J., 1966, in: Basic problems in thin films, eds. R. Niedermayer and H. Mayer
(Göttingen, Vandenhoeck and Ruprecht) p. 325.
PIERCE, D. T. and H. C. SIEGMANN, 1974, Phys. Rev. B 9, 4035.
PLUMMER, E. W. and J. W. GADZUK, 1970, Phys. Rev. Lett. 25, 1493.
PONG, W., 1966, J. Appl. Phys. 37, 3033.
PONG, W., 1967, J. Appl. Phys. 38, 4103.
PONG, W., 1972, J. Appl. Phys. 43, 60.
PONG, W. and J. A. SMITH, 1973, J. Appl. Phys. 44, 174.
PONG, W., R. SUMIDA and G. MOORE, 1970, J. Appl. Phys. 41, 1869.
PRADAL, F., C. GOUT and D. FABRE, 1965, J. Phys. 26, 372.
QUEMERAIS, A., J. PEISNER, M. PRIOL and S. ROBIN, 1973, C.R. Acad. Sci. B 276, 753.
QUINN, J. J., 1962, Phys. Rev. 126, 1453.
RANKE, W. and K. JACOBI, 1973, Solid State Commun. 13, 705.
RICHARD, J. C., 1973, Acta Electron. 16, 245.
RIDGWAY, J. W. T. and D. HANEMANN, 1971, Surf. Sci. 24, 451.
RITCHIE, R. H., 1973, Surf. Sci. 34, 11.
RITCHIE, R. H. and J. C. ASHLEY, 1965, J. Phys. Chem. Solids 26, 1689.
RIVIERE, J. C., 1969, in: Solid State Surface Science 1, ed. Mino Green (Marcel Dekker,
N.Y.) p. 179.
SAUTER, F., 1967, Z. Phys. 203, 488.
SAVOYE, E. D. and D. E. ANDERSON, 1967, J. Appl. Phys. 38, 3245.
SCHADE, H., H. NELSON and H. KRESSEL, 1971, Appl. Phys. Lett. 18, 121.
SCHAICH, W. L. and N. W. ASHCROFT, 1970, Solid State Commun. 8, 1959.
SCHAICH, W. L. and N. W. ASHCROFT, 1971, Phys. Rev. B 3, 2452.
SCHIFF, L. I. and L. H. THOMAS, 1935, Phys. Rev. 47, 860.
SEAH, P., 1972, Surf. Sci. 32, 703.
SEBENNE, C., G. GUICHAR, D. BOLMONT and M. BALKANSKI, 1973, J. Phys. 34, C 6-35.
SHIRLEY, D. A., 1971, Ed., Electron Spectroscopy, Proc. of an Intern. Conf. held at Asilomar,
Pacific Grove, California (North-Holland, 1972).
Phys. Ser., English transl. 33, 463.

SMITH, N. V., 1971, Critical Rev. Solid State Sci. 2, 45.
SMITH, N. V. and G. B. FISCHER, 1971, Phys. Rev. B 3, 3662.
SMITH, N. V. and W. E. SPICER, 1969, Phys. Rev. 188, 593.
SOMMER, A. H., 1970, Photoemissive Materials (John Wiley).
SOMMER, A. H., 1973, J. Phys. 34, C 6-51.
SPICER, W. E., 1958, Phys. Rev. 112, 114.
SPICER, W. E., 1960, J. Appl. Phys. 31, 1077.
SPICER, W. E., 1967, Phys. Rev. 154, 385.
STEINHARDT, R. G., J. HUDIS and M. L. PERLMAN, 1972, Phys. Rev. B 5, 1016.
STEINHARDT, R. G., J. HUDIS and M. L. PERLMAN, 1971, Proc. Intern. Conf. on electron
spectroscopy, ed. D. A. Shirley (North-Holland) p. 557.
STEINMANN, W., 1968, Phys. Stat. Sol. 28, 437.
STERN, F., 1963, Sol. State Phys. 15, 299.
STUART, R. N. and F. WOOTEN, 1967, Phys. Rev. 156, 364.
STUART, R. N., F. WOOTEN and W. E. SPICER, 1964, Phys. Rev. 135, A 495.
SUHRMAN, R. and H. SIMON, 1958, Der lichtelektrische Effekt und seine Anwendungen
(Springer Verlag).
SUTTON, L., 1970, Phys. Rev. Lett. 24, 386.
SZE, S. M., J. L. MOLL and T. SUGANO, 1964, Solid State Electronics 7, 509.
TAMM, I. and S. SCHUBIN, 1931, Z. Phys. 68, 97.
TAMM, I., 1932, Phys. Z. Sowj. 1, 733.
TARNG, M. L. and G. K. WEHNER, 1973, J. Appl. Phys. 44, 1534.
THOMAS, H., 1957, Z. Phys. 147, 395.
THOMAS, H., 1966, in: Basic problems in thin films, eds. R. Niedermayer and H. Mayer
(Göttingen, Vandenhoeck and Ruprecht) p. 307.
THORNBER, K. K., 1971, Phys. Lett. 34 A, 205.
TONG, S. Y., T. N. RHODIN and R. H. TAIT, 1973, Phys. Rev. 8, 421.
TURTLE, R. R. and T. A. CALLCOTT, 1975, Phys. Rev. Lett. 34, 86.
TZOAR, N. and J. I. GERSTEN, 1973, Phys. Rev. B 8, 5684.
UEBBING, J. J. and R. BELL, 1968, Proc. IEEE 56, 1625.
VAN LAAR, J. and J. J. SCHEER, 1962, Philips Research Reports 17, 101.
VERDERBER, R. R. and J. G. SIMMONS, 1967, Radio Elec. Eng. 33, 347.
VERNIER, P. J., 1973, Acta Electronica 16, 181.
VERNIER, P. J. and E. COQUET, 1965, in: Basic problems in thin film physics, eds. R. Niedermayer
and H. Mayer (Göttingen, Vandenhoeck and Ruprecht) p. 328.
VERNIER, P. J., J. P. GOUDONNET, G. CHABRIER and J. CORNAZ, 1971, J. Opt. Soc. Am. 61, 1065.
VERNIER, P. J., E. COQUET and M. BIGUEURE, 1966, C.R. Acad. Sci. B 262, 1728.
VERNIER, P. J., E. COQUET and E. BOURSEY, 1968, J. Phys. 29, Suppl. 2-3, 534.
VERNIER, P. J., M. PAUTY and J. F. BERTHELEMY, 1965, in: Optical properties and electronic
structure of metals and alloys, ed. F. Abeles (North-Holland) p. 323.
VERNIER, P. J., M. PAUTY and F. PAUTY, 1969, J. Vac. Science and Technology 6, 743.
VILJOEN, P. E., M. S. JAZZAR and T. E. FISCHER, 1972, Surf. Sci. 32, 506.
WACLAWSKI, B. J. and E. W. PLUMMER, 1972, Phys. Rev. Lett. 29, 783.
WAGNER, L. F. and W. E. SPICER, 1972, Phys. Rev. Lett. 28, 1381.
WEISSLER, G. L., 1956, Photoionization in gases and photoelectric emission from solids, in:
Handbuch der Physik, ed. S. Flugge 21, 341.
WILLIAMS, B. F. and J. J. TIETJEN, 1971, Proc. IEEE 59, 1489.
WOOTEN, F., T. HUEN and R. N. STUART, 1966, Optical properties and electronic structure of
metals and alloys, ed. F. Abeles (North-Holland) p. 332.
ZWORYKIN, V. K. and E. G. RAMBERG, 1949, Photoelectricity and its application (J. Wiley,
New York).




Department of Electrical and Electronic Engineering, Queen Mary College, University of London,


§ 1. INTRODUCTION
§ 2. FIBRES WITH CORE AND CLADDING POSSESSING UNIFORM REFRACTIVE INDEX
ACKNOWLEDGEMENTS
REFERENCES
§ 1. Introduction
During the period 1969-75, the use of optical fibres for telecommuni-
cation purposes has grown from a position of speculative research into
one of commercial reality. This growth has been largely due to the efforts
of those engaged both in fibre manufacture and in the development of
light sources. Against this background many workers have been engaged
in research into the theory of optical waveguides and it is with their con-
tribution that the present review is concerned.
Work on optical fibres goes back at least to the early 1950's and a
résumé of the early history is to be found in the book by KAPANY [1967].
Credit for the proposal to use a cladded optical fibre for telecommunications
purposes is attributed to KAO and HOCKHAM [1966], whose classic
paper appeared in July 1966. Thereafter much of the early work aimed
at the development of long-haul systems was undertaken in the United
Kingdom at the British Post Office and by Standard Telecommunication
Laboratories. By 1969, workers in the U.S.A., especially at the Bell
Telephone Laboratories, in Germany and in Japan, were enlarging their
efforts. At first, fibre attenuation was prohibitively large and solid-state
laser sources inefficient and of very short lifetime. A major breakthrough
occurred in 1970 with the announcement in the U.S.A. by KAPRON, KECK
and MAURER [1970] of the Corning company, of a quartz fibre with losses
below 20 dB km⁻¹. Thereafter, other laboratories also achieved
exceptionally low loss fibres based on quartz. In 1972, another low loss fibre
was announced using an organic liquid (tetrachloroethylene) as a core
material. This was proposed independently by OGILVIE, ESDAILE and
KIDD [1972] in Australia and by KAISER, TYNES, CHERIN and PEARSON
[1972] in the U.S.A. Their announcement was soon followed by one by PAYNE
and GAMBLING [1972] in England, who proposed the use of a related
organic liquid with even lower loss. In Japan, effort was concentrated
mainly on fibres with graded refractive index since these have the considerable
advantage of neutralising modal pulse dispersion, which is the principal
disadvantage associated with multimode fibres. Their researches were
concentrated mainly around the material borosilicate whose loss, although
larger than that found in quartz, now comes within the range required for
long-haul systems.
Theoretical work on optical fibres for telecommunication purposes
was initiated in the United Kingdom in 1965, by Kao, although work
on related propagation problems associated with vision had been undertaken
earlier by ENOCH [1961], SNITZER and OSTERBERG [1961] and by
BIERNSON and KINSLEY [1965]. A major study by SNYDER [1969] paved
the way for the analysis of many fundamental problems in optical waveguides
and his team in Australia have continued with their investigations
subsequently. At Bell Telephone Laboratories the theoretical contributions*
of Arnaud, Gloge, Marcatili, Marcuse, Miller and Personick
were especially relevant, while in Germany, Unger* and his co-workers
made many important investigations.

Fig. 1. Optical fibre configurations. [Panels include: monomode fibre with cylindrical core;
monomode fibre with triangular core; doubly clad fibre, also called W-fibre; graded refractive
index fibre; multimode fibre with ring shaped refractive index; single material fibre.]

In a fast moving field, the task facing the author of a review is especially
difficult as a subtle change in technology can rapidly shift the emphasis
which should be given to a specific topic. Several excellent review papers
have already appeared (see references) and in one of them MILLER,
MARCATILI and TINGYE LI [1973] have treated the topics of systems, sources
and fibre manufacture. None of these are discussed here, concentration
being given instead to those fundamental electromagnetic aspects which
underlie propagation in fibres. A number of relevant books are cited in
the references and it is known that others are in preparation. For this
reason, I have stopped short of developing many of the equations in their
entirety and refer the interested reader to other works if more detail is
required. It has also proved impossible to treat all of the fibre configurations
shown in Fig. 1; instead attention is confined to those of Fig. 1(A), (C),
(D), (E) and (F). Readers interested in the configurations of Fig. 1(B) and
(G) should consult references by DYOTT, DAY and BRAIN [1973], TYNES
[1974], MARCATILI [1974] and STANDLEY and HOLDEN [1974].

§ 2. Fibres with Core and Cladding Possessing Uniform Refractive Index


Fig. 2(a) shows a cylindrical fibre with uniform refractive index in
both core and cladding regions together with a lossy outer layer. With an
appropriate choice of refractive indices and core radius, the fibre may
support only one mode with field closely confined to the core, see Table 1.
Higher order modes, if excited, can be attenuated if a lossy outer layer is
present because their fields are significant beyond $r_2$. This is the case with
the monomode fibre. In the multimode fibre, more than one mode has its
field confined to the core region and in practical fibres designed to support
many modes, the condition is usually accomplished by making the core
radius much larger than in the monomode case.

* References to the numerous contributions of these workers will be given in later sections.

OPTICAL FIBRE WAVEGUIDES

Fig. 2. Optical fibre configurations: (a) fibre with finite cladding boundary, (b) infinite cladding
approximation, (c) cladding mode approximation.

Table 1. Typical fibre parameters

$n_1$    $n_2$    $\Delta$               N.A.   $r_1$ (μm)   $V$     Number of modes
1.53     1.5      $2 \times 10^{-2}$     0.3    1.0          2.21    single mode
1.53     1.5      $2 \times 10^{-2}$     0.3    30           66      2200
1.465    1.458    $7 \times 10^{-3}$     0.17   1.75         2.21    single mode
1.458    1.415    $4.3 \times 10^{-2}$   0.42   35           108     5540

A first step in evaluating the transmission properties of a fibre involves
the calculation of the propagation coefficients of the modes. An exact
analysis for the configuration of Fig. 2(a) has been made by CLARRICOATS
and CHAN [1973]; however, their study serves to validate the widely used
infinite cladding assumption which reduces the configuration to that of
Fig. 2(b). A further assumption applies to modes below their core-mode
cut-off: when the fields are virtually unperturbed by the presence of the
core, the fibre can then be represented by Fig. 2(c).
Studies of the propagation behaviour of a cylindrical dielectric embedded
in an infinite medium of different permittivity as in Fig. 2(b) or 2(c) began
with a theoretical treatment by HONDROS and DEBYE [1910]. These were
followed much later by theoretical and experimental investigations relevant
to the propagation of microwave signals along dielectric rod waveguides
and to the preliminary analysis of circular waveguides containing ferrite
rods. The possible use of the structure as an optical waveguide is attributed
to SNITZER and OSTERBERG [1961] whose investigations stimulated interest
in optical fibre waveguides for telecommunication applications.


The general form of the characteristic equation for modes with azimuthal
dependence* n in the structure of Fig. 2(b) is given below, the derivation
of which is presented in numerous texts and papers (see references),


For surface waves to be guided without radiation loss by a straight
uniform fibre, $\epsilon_2 < \bar\beta^2 < \epsilon_1$, a condition which defines the regime of trapped
modes or surface waves. From the above relations we obtain
$$U^2 + W^2 = k^2 r_1^2(\epsilon_1 - \epsilon_2) = k^2 r_1^2(n_1^2 - n_2^2) = V^2 = (k r_1)^2 (\mathrm{N.A.})^2. \tag{2}$$
Since in optical fibres $n_1 \approx n_2$, we have the relationship
$$V \approx k r_1 n_1 \left(\frac{2(n_1 - n_2)}{n_1}\right)^{1/2} = k r_1 n_1 (2\Delta)^{1/2} \approx k r_1 n_1 \theta_c, \tag{3}$$
where $\Delta = (n_1 - n_2)/n_1$, $\theta_c$ = complement of the critical angle for total
internal reflection and N.A. = the fibre numerical aperture.
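As a numerical illustration of equations (2) and (3), the following Python sketch evaluates $\Delta$, N.A. and V for the first fibre in the table of typical parameters ($n_1 = 1.53$, $n_2 = 1.5$, core radius 1.0). The function name, the reading of the radius as micrometres and the free-space wavelength of 0.85 μm are assumptions made only for this example; none of them is stated in the text.

```python
import math


def fibre_parameters(n1, n2, core_radius_um, wavelength_um):
    """Evaluate Delta, N.A. and V from equations (2) and (3).

    core_radius_um and wavelength_um are assumed to be in micrometres;
    wavelength_um is the free-space wavelength, so k = 2*pi/wavelength.
    """
    delta = (n1 - n2) / n1                        # relative index difference
    numerical_aperture = math.sqrt(n1**2 - n2**2)  # N.A. from equation (2)
    k = 2.0 * math.pi / wavelength_um             # free-space wavenumber
    v_exact = k * core_radius_um * numerical_aperture
    # Weak-guidance form of equation (3): V ~ k r1 n1 (2*Delta)^(1/2)
    v_weak = k * core_radius_um * n1 * math.sqrt(2.0 * delta)
    return delta, numerical_aperture, v_exact, v_weak


# Illustrative values only: the wavelength is an assumption.
delta, na, v_exact, v_weak = fibre_parameters(1.53, 1.50, 1.0, 0.85)
print(f"Delta = {delta:.4f}, N.A. = {na:.4f}")
print(f"V (exact) = {v_exact:.3f}, V (weak guidance) = {v_weak:.3f}")
```

With these assumed inputs the two expressions for V agree to better than one per cent, and V stays below 2.405, which is consistent with the "single mode" entry in the table.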
Real values of are obtained, provided that the refractive indices n,
and $n_2$ are real, $n_1 > n_2$ and V exceeds the cut-off value for the mode as
given in Table 2. When n = 0, the boundary conditions at $r = r_1$ may be
satisfied with either $E_z = 0$ (TE$_{0m}$ modes) or $H_z = 0$ (TM$_{0m}$ modes); otherwise
both longitudinal components of electric and magnetic field are
required and the modes are classified as HE$_{nm}$ or EH$_{nm}$. However, when
$n_1 \approx n_2$, the longitudinal components of field are of a smaller order than
the transverse components, as first noted by SNYDER [1969a, b]. Under
these conditions, the HE$_{nm}$ and EH$_{n-2,m}$ modes are almost degenerate and
the transverse fields of the combination are almost linearly polarised.

* Note: Throughout this text n is the azimuthal mode number, which should not be confused
with the refractive indices $n_1$, $n_2$, ....

$U_c$ and $U_\infty$ for first 28 modes ($n_1 \approx n_2$)

$U_c$      $U_\infty$
0          2.405
2.405      3.832
3.832      5.136, 5.520
5.136      6.380
5.520      7.016
6.380      7.588
7.016      8.417, 8.654
7.588      8.772
8.417      9.761
8.654      10.173
8.772      9.936
9.761      11.065


On introducing the condition $n_1 \approx n_2$ into equations (1) and (2)

With the use of recurrence relations for Bessel functions, then
$$\frac{U J_n(U)}{J_{n\mp1}(U)} = \pm\frac{n_1}{n_2}\,\frac{W K_n(W)}{K_{n\mp1}(W)}. \tag{5}$$
The equation involving $J_{n-1}(U)$ refers to HE$_{nm}$ modes while that with $J_{n+1}(U)$
refers to EH$_{nm}$ modes. The following alternative form, which applies when
$n_1 \approx n_2$, was also first noted by SNYDER [1969a]:
$$\frac{U J_{n\mp2}(U)}{J_{n\mp1}(U)} = \mp\frac{W K_{n\mp2}(W)}{K_{n\mp1}(W)}. \tag{6}$$
Fig. 3. Eigenvalue U as a function of normalised frequency V. The solid lines are determined
numerically from equation (5); the dashed lines are the approximate solution given by
equation (T3.2) of Table 3.

In the limiting case when $n_1 \to n_2$, the form of equations (5) and (6) reveals
that the modes HE$_{n+1,m}$ and EH$_{n-1,m}$ are degenerate. When $V \to \infty$, $W \to \infty$;
thus U satisfies $J_{n\mp1}(U_\infty) = 0$ in that limit. For propagating HE$_{nm}$ and
EH$_{nm}$ modes, U lies between consecutive roots of the Bessel function
$J_{n\mp1}(U)$. Figs. 3 and 4 show U and W as functions of V, while special cases
of U and W are contained in Table 3. The behaviour of modes below
cut-off will be deferred to the section on leaky modes.
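The limiting values of U discussed above are roots of Bessel functions, and the entries of Table 2 can be checked against them. The sketch below is a minimal, standard-library-only illustration: it evaluates $J_n$ from its integral representation and locates roots by bisection. The function names, step sizes and starting point are choices made for this example and are not taken from the text.

```python
import math


def bessel_j(n, x, steps=400):
    """J_n(x) from (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt.

    The trapezoidal rule is used; it converges very quickly here because
    the integrand extends to a smooth periodic function of t.
    """
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # end points t = 0 and t = pi
    for i in range(1, steps):
        t = i * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def bessel_zeros(n, count, x_start=0.5, step=0.05):
    """Return the first `count` positive roots of J_n found beyond x_start."""
    zeros = []
    a, fa = x_start, bessel_j(n, x_start)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(n, b)
        if fa * fb < 0.0:            # sign change: a root lies in (a, b)
            lo, hi, flo = a, b, fa
            for _ in range(50):      # plain bisection
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(n, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros


for order in (0, 1, 2):
    print(order, [round(z, 3) for z in bessel_zeros(order, 3)])
```

The roots of $J_0$ and $J_1$ obtained this way reproduce the first entries of Table 2 (2.405, 3.832, 5.520, 7.016, ...), and the second root of $J_2$ comes out near 8.417.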
Table 3 contains approximate analytic expressions for U derived
originally by SNYDER [1969] and GLOGE [1971a]. Equations (T3.1a),
( and (T3.2), are very accurate while (T3.2) is useful for many calculations.
When $\Delta = 0.05$, equation (T3.1b) giving U for the HE$_{11}$ mode
is accurate to within 0.5% in the range 1.5 < V < 5.
From the definitions of U and V given in equations (1) and (2), GLOGE
[1971a] writes

Fig. 4. The eigenvalue W as a function of normalised frequency V for the lowest order modes
of the fibre shown in the configuration of Fig. 2(b): 01 = HE$_{11}$; 11 = HE$_{21}$, TM$_{01}$;
21 = HE$_{31}$, EH$_{11}$; 02 = HE$_{12}$; 31 = HE$_{41}$, EH$_{21}$; etc.

Thus with equations (T3.1) or (T3.2) and (7), $\bar\beta$, the normalised phase-change
coefficient, can be easily evaluated as a function of V. Fig. 5 shows results
due to GLOGE [1971a], who uses a linearly polarised modal notation which
is discussed in section 2.7. The relationship to the conventional notation
is explained in Fig. 4.


Evidently $\bar\beta$ increases from $n_2$ to $n_1$ as V increases from the cut-off
value $V = U_c$ towards infinity. Fig. 6 shows a ray interpretation which is
based on an asymptotic evaluation of the Bessel function whose dependence
describes the transverse field within the core region. From Fig. 6 we
see that
$$\bar\beta = n_1 \cos\phi\,\cos\delta. \tag{9}$$

Fig. 5. Normalised propagation coefficient b defined by equation (8) as a function of normalised
frequency V for optical fibre configuration of Fig. 2(b).

Equation (7) then leads to

and for small angles

where $\theta_c$ is the complement of the critical angle for total internal reflection
at the interface between core and cladding regions. We now see that
as $\bar\beta$ increases from $n_2$ to $n_1$ the angle $\theta$ which the ray vector $n_1 k$ subtends
with the generatrix of the cylinder changes from $\theta_c$ to zero.


Fig. 6. Rays in an optical fibre waveguide: (a) Dotted region corresponds to trapped modes such as ray 1. Ray 2 describes a leaky ray; ray 3
describes a refracting ray. (b) Section through plane containing ray 1. (c) Detailed view of trapped, leaky and refracting ray regions. Locus ABC
corresponds to increasing meridional angle $\delta$, while $\phi$ remains constant.

Fig. 7. Normalised propagation coefficient $\bar\beta$ as a function of normalised frequency V. Fibre
parameters $n_1 = 1.53$; $n_2 = 1.50$; $r_2/r_1 = 5$.

Fig. 7 shows $\bar\beta$ as a function of V for the configuration of Fig. 2(a) as
computed by CLARRICOATS and CHAN [1973]. Detailed examination of
the results leading to the figure showed that the core mode approximation
was exceedingly good above the core mode cut-off ($V > V_c$), while the
cladding mode approximation was quite accurate at $\bar\beta$ values a few percent
below the value $\bar\beta = n_2$ corresponding to the core mode cut-off. However,
an interesting cross-over in the propagation curves occurs just below that
value. The degeneracy at the cross-over is broken in the presence of finite
losses, and detailed examination of the fields shows that continuity is
maintained along the upper and lower branches of the propagation curves.
Further studies of cross-overs have been undertaken more recently by
KUHN [1975] and YIP and HUANG [1975].



To determine the response of a fibre of length L, the delay $\tau$ is found
from the equation
$$\tau = \frac{L}{c}\frac{d\beta}{dk} = \frac{L}{v_g}, \tag{13}$$
where c = the velocity of light in vacuum
and $v_g$ = the group velocity of the mode.
From equation (7) and following GLOGE [1971a], we have

where it is assumed that

kdn, kdn,
N , = - - -dk
- - N 2 = - dk &,2.

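The first term in equation (14) arises from the material dispersion and is
independent of the waveguide mode; the second is due to waveguide
dispersion and is mode dependent. By using the relation for dU/dV given in
Table 3,
$$\begin{aligned}
\frac{d(Vb)}{dV} &= 1 - \frac{U^2}{V^2}\,[1 - 2\kappa_n] \\
&\approx 1 + \frac{U^2}{V^2}\left[1 - \frac{2}{V}\right] \quad \text{when } V \gg 1 \\
&= 2\kappa_n \quad \text{when } U = V \text{ at cut-off}.
\end{aligned} \tag{17}$$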
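The first line of the expression above can be checked by direct differentiation. The short derivation below is a sketch under two assumptions that are not legible in this copy of the text: that the normalised propagation coefficient of equation (8) is $b = 1 - U^2/V^2$, and that the Table 3 relation is $dU/dV = (U/V)(1 - \kappa)$, as is usual in the weakly guiding treatment associated with GLOGE [1971a].

```latex
\begin{aligned}
Vb &= V - \frac{U^2}{V}, \\
\frac{d(Vb)}{dV}
  &= 1 + \frac{U^2}{V^2} - \frac{2U}{V}\,\frac{dU}{dV}
   = 1 + \frac{U^2}{V^2} - \frac{2U^2}{V^2}\,(1 - \kappa)
   = 1 - \frac{U^2}{V^2}\,(1 - 2\kappa).
\end{aligned}
```

Setting U = V in the last form gives $2\kappa$, which is the cut-off value quoted above.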

From the limiting form of $\kappa_n$ when $W \to 0$, one finds that at cut-off for
HE$_{1m}$, HE$_{2m}$, TM$_{0m}$ and TE$_{0m}$ modes

$d(Vb)/dV = 0$.
While for HE$_{nm}$ and EH$_{n-2,m}$ modes
d(Vb)

Fig. 8. d(Vb)/dV as a function of normalised frequency V for optical fibre configuration of
Fig. 2(b).

GLOGE [1971a] has plotted d(Vb)/dV as a function of V for the lowest order
modes of a multimode fibre, as shown in Fig. 8. For fibres with large V values,
which support modes with large values of n, the spread in delay $\Delta\tau$ is
obtained from equations (19) and (16) as

In the limit of large n, the spread in arrival time of the fastest and slowest
modes, as predicted by equation (20), is precisely that obtained using a
ray approach. Extreme skew rays cross the axis of the fibre at an angle
$\theta_c = (2\Delta)^{1/2}$ and for these, the time difference relative to the axial ray is
given by the limiting form of equation (20). GAMBLING, PAYNE and SUNAK
[1971], GAMBLING, DAKIN, PAYNE and SUNAK [1972] and ROSMAN [1972]
have used the simple ray model to investigate the response of fibres and
have reported good agreement with experiment. In these cases mode
coupling must presumably have been small for, as discussed later, when
present it leads to a dependence on length of the form $L^{1/2}$ rather than L.
For large V, $\Delta\tau \approx 5\Delta \times 10^{-6}$ s km$^{-1}$; thus, as an example, when $\Delta = 0.01$,
$\Delta\tau \approx 50 \times 10^{-9}$ s km$^{-1}$. The above calculation is pessimistic since mode
coupling reduces $\Delta\tau$; also, in practice, higher order modes are usually
both less strongly excited and more strongly attenuated, which reduces
the spreading.
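The ray picture behind this estimate is easy to reproduce numerically. The sketch below compares the transit time of an axial ray with that of a ray travelling at the angle $\theta_c = (2\Delta)^{1/2}$ to the axis. The core index $n_1 = 1.5$ is an assumption (it is the value that reproduces the coefficient quoted in the text), and the function name is hypothetical.

```python
import math

C_KM_PER_S = 299_792.458  # velocity of light in vacuum, km/s


def ray_delay_spread(n1, delta):
    """Delay difference per km between the extreme ray and the axial ray.

    Returns (exact, small_angle) in s/km, where the extreme ray makes an
    angle theta_c = sqrt(2*delta) with the fibre axis and so travels a
    path longer by the factor 1/cos(theta_c).
    """
    theta_c = math.sqrt(2.0 * delta)
    axial = n1 / C_KM_PER_S                       # s per km along the axis
    exact = axial * (1.0 / math.cos(theta_c) - 1.0)
    small_angle = axial * delta                   # since theta_c**2 / 2 = delta
    return exact, small_angle


# n1 = 1.5 is an assumed, illustrative core index.
exact, approx = ray_delay_spread(1.5, 0.01)
print(f"exact: {exact * 1e9:.1f} ns/km, small-angle: {approx * 1e9:.1f} ns/km")
```

For $\Delta = 0.01$ the small-angle form gives about 50 ns per km, matching the figure in the text, and the exact path-length expression differs from it by less than one per cent.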
The effect of a finite cladding boundary on the group velocity has also
been studied by CLARRICOATS and CHAN [1973]. Fig. 9 shows $v_g/c$ as a
function of V for a fibre with $\Delta = 0.02$ and $r_2/r_1 = 5$. We find that, provided
a mode is above core mode cut-off, $v_g$ corresponds closely to the value
predicted on the basis of an infinite cladding. For the HE$_{11}$ mode the
approximation is good above V = 1.5.

Fig. 9. Normalised group velocity $v_g/c$ as a function of normalised frequency V. Fibre
parameters as Fig. 7.

For a single mode waveguide, pulse dispersion due to the term d(Vb)/dV
in equation (14) is minimal, and in practice the dominant pulse broadening
effect is caused by material dispersion coupled with the finite linewidth
of the source, as noted by DYOTT and STERN [1970], KAPRON and KECK
[1971], GLOGE [1971b] and others. For a multimode waveguide, the time
spread caused by a light source of frequency f and bandwidth B is given by

From equation (14)

Fig. 10. $\partial\tau/\partial f$ for HE$_{11}$ mode as a function of V and $\lambda$ for optical fibre comprising Schott
K, and K, glasses as core and cladding materials. Dotted curves show dispersion in the
constituent materials. Fibre parameters $r_2/r_1 = 5$.

For a single mode waveguide, the first term dominates and published
values of dN/dk may be used to estimate pulse dispersion. To demonstrate
this point, Fig. 10 shows computed values of $\partial\tau/\partial f$ for the HE$_{11}$ mode in
a glass fibre waveguide without approximations and including the effects
of finite cladding. Also shown are curves of $k\,dN_1/dk$ and $k\,dN_2/dk$ as
a function of V. For most purposes, the approximation described above
gives an accurate description of the pulse broadening effect. In the example
of Fig. 10, the dispersion caused by a source such as a GaAs laser, with
0.1% relative linewidth and for a fibre with V = 2.4, is $10^{-10}$ s km$^{-1}$. This
is three orders of magnitude lower than that caused by modal dispersion
in a multimode waveguide. However, light emitting diodes (LEDs) have
relative linewidths of up to 4% and with liquid core fibres, which have
about three times the dispersion of glass, the effects of material and modal
dispersion become more comparable. Many workers have investigated
how fibre bandwidth is related to material dispersion. Fig. 11, due to
DYOTT [1974], shows how fibre bandwidth depends on LED wavelength
for different glasses, while Fig. 12, due to TIMMERMAN [1974a], shows
maximum bit rate as a function of source bandwidth for a specific material.

Fig. 11. Bandwidth limit due to material dispersion: variation with wavelength for constant
line width $\Delta\lambda = 40$ nm. Curves correspond respectively to silica and various glasses.


In a multimode waveguide, where the second term in equation (22) may
be significant, it is useful to express the incremental delay $\Delta\tau$ for a given
mode in terms of the mode number M and the total number of modes $M_T$
which the fibre will support. To do so we must first relate the total number
of modes to the V value of the fibre. Several methods have been used.
For large V, most modes have U close to the asymptotic value $U_\infty$. To
obtain $M_T$, MARCUSE [1974a] proposes that the roots of $J_n(U_m) = 0$ with
$U_m < V$ then be counted, making use of the asymptotic form


m = 1, 2, 3, ...

Fig. 13. n, m mode number plane. Dashed curve indicates the location of points n + 2m = constant.

For constant $U_m$, the integer values of n and m satisfying equation (23) lie
along parallel lines as in Fig. 13. Neglecting the small constant term in equation (23),
all values of n and m lie within the triangle. Because there are two
polarisations and two mode types (HE and EH) for each value of n and m,
the total number of modes is four times the area of the triangle. As the
maximum values of n and m are $2V/\pi$ and $V/\pi$ respectively, the total number
of (trapped) modes $M_T$ is given by

Fig. 14. Computed total number of modes $M_T$ as a function of cut-off V value. Curve B
includes modes of the form HE$_{nm}$ and EH$_{nm}$ with both polarisations present. Curve A
includes modes with only one polarisation present.

Fig. 14 shows that the above represents an underestimate. A more accurate
value can be obtained by assuming that the beam width $\alpha_M$ of all modes
of a fibre with core of radius $r_1$ is $\lambda/\pi r_1$. Then if the associated solid angle
is $\pi\alpha_M^2$ and the acceptance solid angle $\pi\theta_c^2$, the number of modes accepted
by the fibre is given by

which is seen in Fig. 14 to be quite accurate even at low V values. The
above approach was first adopted by GLOGE [1971]. An alternative method
due to PASK, SNYDER and MITCHELL [1975] gives the same result. The use
of a compound mode number by GLOGE [1972a]
$$M = (n + 2m) \tag{26}$$
together with the knowledge that for all V, $U \approx U_c$ (the cut-off value of U),
allows us to write
$$U_c = (2M)^{1/2}, \tag{27}$$
then from equation (8)

and from equation (16)
$$V\,\frac{d^2(Vb)}{dV^2} = -\frac{2U^2}{V^2} \approx -\frac{2M}{M_T};$$
equation (22) may then be rephrased as

The first term of equation (30) is positive whereas the second term is negative.
Thus, as proposed by GLOGE [1971b], the possibility exists for some
compensation of material dispersion by waveguide effects in the case of
higher order modes for which $M \approx M_T$.
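The two mode-count estimates described earlier in this section can be compared numerically. In the sketch below, the first function follows the triangle argument (four times the area of a triangle with sides $2V/\pi$ and $V/\pi$); the second takes the solid-angle estimate to be $M_T \approx V^2/2$, which is an assumption here since the corresponding equation is not legible in this copy, although it agrees with the multimode entries in the table of typical fibre parameters. Function names are hypothetical.

```python
import math


def modes_triangle_count(v):
    """Four times the area of the (n, m) triangle with sides 2V/pi and V/pi."""
    area = 0.5 * (2.0 * v / math.pi) * (v / math.pi)
    return 4.0 * area


def modes_solid_angle(v):
    """Assumed form of the solid-angle (beam width) estimate: V**2 / 2."""
    return v**2 / 2.0


v = 66.0  # V value of the second fibre in the table of typical parameters
print(f"triangle count    : {modes_triangle_count(v):.0f}")
print(f"solid-angle count : {modes_solid_angle(v):.0f}")
print(f"ratio             : {modes_triangle_count(v) / modes_solid_angle(v):.3f}")
```

The ratio of the two estimates is $8/\pi^2$, about 0.81, independent of V, which illustrates why the triangle count is described as an underestimate.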
The possibility of compensating material dispersion by appropriate
choice of core and cladding glasses has been proposed by DYOTT and STERN
[1970], SMITH and SNITZER [1973], and JURGENSEN [1975]. The latter
author shows that no suitable glasses exist which will provide compensation
in a monomode fibre at wavelengths below about 1.3 μm. Also, so far,
attenuation considerations have dominated the choice of materials for
optical fibres and the limitations on bit-rate imposed by material
dispersion have taken second place.


Table 4 presents the field components of the HE$_{nm}$ and EH$_{n-2,m}$ modes
under the assumption that $n_1 \approx n_2$. Fig. 15 shows computed transverse
fields of the HE11 and HE2 modes and the linearly polarised nature of
these fields is evident. It may also be demonstrated as follows. If the Cartesian
components of the transverse electric field are formed from the cylindrical
components we have

when a modal combination is formed so that

Y) sin (v + 1)O + J,- l(Kl r) sin (v - 1)O
E, = J v + ,(K, (32)
Jv(Kr)cos v0

The integer $\nu$ in equation (33) is associated with a quasi-degenerate modal
pair corresponding to the HE$_{(\nu+1)m}$ mode and the EH$_{(\nu-1)m}$ mode. GLOGE
[1971a] calls such a modal combination an LP mode, choosing these
initials because the field is almost linearly polarised. The notation LP$_{\nu m}$
identifies the above mode pair. An extensive discussion of this topic has
been presented by MARCUSE [1972b, 1974a].

Field components when $n_1 = n_2$

Component    Function
             $r < r_1$    $r > r_1$

* Upper sign denotes HE$_{nm}$ mode, lower sign denotes EH$_{n-2,m}$ mode.

If initially we assume that a Cartesian field, as expressed in equation (33),
represents a solution for the configuration of Fig. 2(b) and provided
$n_1 \approx n_2$, then, following ARNAUD [1974], continuity of $E_y$ and $\partial E_y/\partial r$ at
$r = r_1$ leads directly to the characteristic equation

where the functions are defined with eq. (1). This solution corresponds to
either an HE$_{(\nu+1)m}$ mode or an EH$_{(\nu-1)m}$ mode. The appropriate
characteristic equations for HE$_{nm}$ and EH$_{nm}$ modes are then

With a little manipulation it can be shown that equation (35) is identical
to equation (4) in the limit $n_1 = n_2$. There is, however, considerable
simplicity in the above approach, especially if an extension to a multilayer
fibre is envisaged. Aside from its relevance to fibres with several boundaries,
the technique is useful when fibres with continuously graded refractive
index profiles are analysed.

Fig. 15. Normalised field intensity as a function of radial position within cladded optical
fibre corresponding to Fig. 2(a). Parameters as Fig. 7.


The power flowing in core and cladding regions is obtained from an
integration of the longitudinal component of the Poynting vector over
the appropriate cross-section. If the longitudinal fields exhibit a
dependence $\cos n\theta$ or $\sin n\theta$

On writing v = n+ 1 and recalling that HE,, and EH,... l m modes are

nearly degenerate, the power in the core region for an LP,, mode is given

If v = 0, the above expression is doubled. From a similar integration, the

power in the cladding region of Fig. 2(b) is given by

$\kappa_\nu$ is defined in Table 3. GLOGE [1971a] gives a related expression which
is twice as large since it includes the power in both orthogonal polarisations.
He also removes the constant a, by expressing the field at r = rI as

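In this notation the total power $P_T$ is then given by

$$\frac{P_{\mathrm{core}}}{P_T} = 1 - \frac{U^2}{V^2}\,[1 - \kappa_\nu], \qquad \frac{P_{\mathrm{clad}}}{P_T} = \frac{U^2}{V^2}\,[1 - \kappa_\nu].$$
For large V,
$$\kappa_\nu \approx 1 - \frac{1}{V}, \qquad \frac{P_{\mathrm{clad}}}{P_T} \approx \frac{U^2}{V^3}.$$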

Fig. 16. (power in cladding)/(total power) and (power in core)/(total power), as a function
of normalised frequency V.

Fig. 16, due to GLOGE [1971a], shows the above ratios as a function of V.
CLARRICOATS and CHAN [1973] have produced similar results for the
configuration of Fig. 2(a) and have found that the role of the outer cladding
boundary is not significant other than near the core mode cut-off. The above
figures show that for all modes above their core mode cut-off, the power
flow is concentrated in the core. If equal power flows in every mode, as
could be the case if excitation is by means of an LED source, the total power
flow in the cladding for all modes is given by


Consider now a cladding region which beyond $r > r_2$ possesses a loss
tangent $\tan\delta$. The attenuation coefficient $\alpha$ of a mode which is not too
near cut-off is then given by

For large V we obtain the simple result

If the loss region has only a finite thickness dr and its inner boundary still
lies at $r = r_2$,

Fig. 17. Attenuation coefficient as a function of normalised frequency for optical fibre
configuration of Fig. 2(a). Parameters as for Fig. 7.

Fig. 18. Leaky mode attenuation coefficient as a function of ratio $r_2/r_1$ with V value as
parameter, for configuration of Fig. 2(a) with $n_1 = 1.05$; $n_2 = 1.025$; $n_3 = 1.5$.

CLARRICOATS and CHAN [1973] have computed the attenuation
characteristics of a number of lower order modes of a cladded fibre with a layer of
finite thickness, by using exact expressions for the fields. Their results,
which are shown in Fig. 17, conform closely to the law expressed in
equation (48) over the linear range. The very substantial differential attenuation
which exists between the HE$_{11}$ mode and the next higher order mode triplet,
provided V does not approach 2.4, makes it possible to provide powerful
suppression of all higher order modes in monomode fibres. This requires
a careful choice of fibre parameters and the application of a lossy paint
or a lossy plastic coating to the cladding boundary. KUHN [1975] has
discussed this optimisation problem in detail and has included in his study

the case when the outer lossy region (which is assumed to be semi-infinite)
has a real part of refractive index greater than n, . Under these circum-
stances so-called leaky mode conditions prevail and the attenuation rises
rapidly with increasing n, or decreasing V. CLARRICOATS and CHANG [1972]
have analysed the structure of Fig. 2(a) when $n_3 > n_1 > n_2$ and Fig. 18
shows computed attenuation coefficients as a function of V, due to CHAN
[1973]. Although the parameters are relevant to a microwave application,
the main features apply in optical waveguides.


Fig. 19. Complex eigenvalue U as a function of normalised frequency V for optical fibre
configuration of Fig. 2(b).

Fig. 20. Attenuation coefficient $\alpha$ as a function of normalised frequency V corresponding to
leaky modes of HE$_{nm}$ class n = 2, 3, 4, 5, 6, 7.

A very comprehensive study of leaky modes has been undertaken by
SNYDER [1974] and his co-workers for the configuration of Fig. 2(b).
Leaky conditions occur when V is less than the cut-off value $V_c = U_c$ for
a given mode and then equation (5) has complex roots. Fig. 19, due to
SAMMUT and SNYDER [1975], shows the real and imaginary parts of U as
a function of V, while Fig. 20 shows the leaky mode attenuation coefficient
$\alpha$ where

Below cut-off, excepting modes with n = 1, $U_r$ provides an analytic
continuation of U above cut-off and $U_i$ gives rise to attenuation of the mode
in the z direction. For modes with n = 1, there are no solutions just below
cut-off, although for HE$_{1m}$ modes, $m \ge 2$, solutions do exist when V is
decreased further below cut-off. However, these solutions are not
continuous with the bound modes. Expressions at $V_c - V = \Delta V$ for $U_r - U_c =
\Delta U_r$ and $U_i$ are given by SNYDER [1974] as

=- n 2 3

ui = -
In 1AVI2

For n > 1 the modes are only weakly attenuated compared to the below
cut-off modes of a dielectric slab waveguide. A physical explanation is
to be found in the ray diagram of Fig. 21. There we find that weakly leaky
modes correspond to rays which satisfy $\theta > \theta_c$ with $\delta < \theta_c$. These rays
would appear to be trapped according to geometric optics since the angle
$\delta$ made with the tangent plane is less than $\theta_c$, the complement of the
critical angle for total internal reflection. However, as demonstrated by
SNYDER [1973], MARCATILI [1973] and MARCUSE [1973b] for cylindrical fibres,
it is the angle $\theta$ which the ray makes with the generatrix of the cylinder
which determines whether a ray is trapped or the corresponding mode is bound.
When both $\theta$ and $\delta$ are greater than $\theta_c$, the corresponding ray is refracting
and the leaky mode attenuation is large, see Fig. 20. Such modes are called
leaky refracting modes by SNYDER [1974] and the associated rays are leaky
refracting rays. An anatomy of modes on dielectric structures, due to
SNYDER [1974], is contained in Table 5.
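The ray classification described in the two preceding paragraphs reduces to two comparisons with $\theta_c$. The sketch below encodes it directly; the function name and the treatment of the boundary cases (angles exactly equal to $\theta_c$) are choices made for this illustration and are not specified in the text.

```python
import math


def classify_ray(theta, delta, theta_c):
    """Classify a ray in a step-index fibre.

    theta   : angle between the ray and the fibre axis (generatrix)
    delta   : angle between the ray and the tangent plane at the boundary
    theta_c : complement of the critical angle
    All angles are in radians.
    """
    if theta < theta_c:
        return "bound"        # trapped ray, bound mode
    if delta < theta_c:
        return "tunnelling"   # weakly leaky ray
    return "refracting"       # strongly leaky ray


theta_c = math.sqrt(2.0 * 0.01)  # Delta = 0.01, illustrative value only
for theta, delta in [(0.05, 0.05), (0.20, 0.10), (0.20, 0.20)]:
    print(theta, delta, classify_ray(theta, delta, theta_c))
```

The second case is the one of interest: the ray meets the boundary at a glancing angle that geometric optics would treat as totally reflected, yet its inclination to the axis exceeds $\theta_c$, so it is classed as tunnelling rather than bound.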
Fig. 21. Ray diagram indicating that some leaky rays in the configuration of Fig. 2(b) are
trapped in the configuration of Fig. 2(a). See also Fig. 6.

Anatomy of modes on dielectric structures

Bound modes
(1) Energy is unattenuated as it propagates, i.e., the fields are made up by waves that undergo total internal reflection.
(2) Fields external to the structure are evanescent.
(3) Characteristics found from the real roots of eigenvalue equation.
(4) modes gives bound energy of structure.
Exist only on non-finite structures with an index of refraction greater than their surrounds. Examples: cylinder or slab in free space.

Leaky modes
(1) Energy within the structure is attenuated due to radiation, i.e., the fields are made up by waves that undergo partial reflection.
(2) Fields oscillate (radiate) at finite distances from structure.
(3) Characteristics found from certain complex roots of the eigenvalue equation. On structures that support bound modes, leaky modes are the bound modes below their cutoff frequency.
(4) modes approximate the radiation field within and external to the structure far from the source.
Exist on all dielectric structures.

Refracting leaky modes
(1) Radiation appears to originate at boundary of the structure, i.e., the fields everywhere external to the structure have an oscillatory behavior.
(2) These modes are strongly leaky. As $\lambda \to 0$, the leakage is nonvanishing.
Exist on all dielectric structures.

Tunnelling leaky modes
(1) Radiation appears to originate some distance from the boundary of the structure. The fields external to, but immediately adjacent to, the structure are evanescent and oscillate at greater distances from the structure.
(2) These modes are weakly leaky. As $\lambda \to 0$, the leakage vanishes.
Exist only on structures that have (a) an index of refraction higher than that of their surrounds and (b) have curved boundaries that are concave inwards. The dielectric sphere

Bound or trapped rays
The rays that form the bound modes undergo total internal reflection.

Refracted rays
These rays undergo partial reflection due to refraction at the boundary.

Tunnelling rays
These rays undergo partial reflection due to tunnelling upon incidence at the curved boundary.

The field of a weakly leaky mode decays exponentially to a radius $r_{tp}$
whereafter propagating conditions are found. This behaviour is precisely
analogous to that found in cylindrical antennas, where a stored energy
region precedes the domain of propagation as one moves outward from
the cylinder boundary. The higher the azimuthal dependence of the mode,
the greater $r_{tp}$, as indicated by

$Q_{nm} = U^2 - V^2$. (55)
Fig. 22 shows the evanescent and propagating regions associated with
weakly leaky rays. As the energy appears to tunnel across the region $r_{tp} - r_1$,
Snyder designates such modes and rays as tunnelling modes and tunnelling
rays respectively. Expressions for the attenuation coefficients of leaky
rays are presented by SAMMUT and SNYDER [1975] while Fig. 23 shows
ale, as a function of OlO,.
Although leaky modes can play a significant role in the transport of
power in the configuration of Fig. 2(b), especially if V is very large, it should
be recognised that Fig. 2(b) is only an approximation to that of Fig. 2(a)
and there leaky modes will not prove significant unless n3 > n,. This is
evident in Fig. 21 which shows that a leaky ray of the configuration of
Fig. 2(b) is trapped in that of Fig. 2(a). Because of the large difference
between A = (n,- nz)/nl and A , = $[n; - l]f/n,, a very large number of
modes which would be leaky in Fig. 2(b) are in fact trapped by the outer
cladding boundary and these are the so-called cladding modes of the struc-
ture. However, if n3 is lossy, many of these modes will be highly attenuated.

Fig. 22(a). Geometrical interpretation of the fields inside and outside of optical fibre con-
figuration of Fig. 2(b) under leaky ray conditions. Ci, denotes interior caustic; C,,, is the
caustic beyond which radiation occurs. The unshaded regions correspond to rays i.e. oscilla-
tory fields.


Fig. 22(b). Field of a leaky mode for the configuration of Fig. 2(b). For r less than $r_{tp}$, the
field resembles a bound mode. For r greater than $r_{tp}$, energy is radiated. $\theta$ is the direction of
radiation in the far field.

Fig. 23. Diagram showing the U plane for leaky (LR) and refracting (RR) rays.


Fig. 24. Configurations for curved slab and cylindrical optical waveguides. (a) Curved
cylindrical fibre; (b) Curved slab waveguide.

If a lossless fibre is bent into an arc as in Fig. 24(a), energy in a mode
which would propagate without attenuation when the fibre is straight
leaks from the fibre and the mode suffers attenuation. UNGER [1964] and
MARCATILI and MILLER [1969] were the first to consider the problem of
attenuation in curved dielectric waveguides. MARCATILI and MILLER
[1969] studied, in particular, slab waveguides with uniform refractive
indices, and the relative behaviour of slabs with uniform and graded indices
was taken up later by GLOGE [1972b]. The above authors used modal
methods to obtain their attenuation coefficients, whereas SNYDER and LOVE
[1975] determine attenuation by generalising a ray treatment. They extend
the Fresnel transmission coefficient for plane media so as to account for the
curvature and obtain a result for slab waveguides which is in accord
with GLOGE [1972b] (whose expression in reference equation (11) should
be halved). SNYDER [1974] has also derived an attenuation coefficient for
a curved cylindrical fibre as an extension of the curved slab case; an outline
of his method follows.
By considering the curved slab boundary, in Fig. 24(b), as a perturbation
of a plane interface, a modification of the analysis which leads to Fresnel's
laws yields a transmission coefficient
$$T(\theta) = 1 - \frac{\text{Power of reflected ray}}{\text{Power of incident ray}}$$

k' is the wavenumber for plane wav8propagation in the medium n,

The angle between successive reflections of the r,ay, A& is given by
.-> **
2(Po -P A
A+ =
Po9 .
The power of the mode which is contained in the slab q(0) is

Finally, the attenuation coefficient a (= 2pa' where a' is the coefficient used
in SNYDER[19741) is given by

a = +k'(ef - e2)exp -(+k'po(8: - e2)t).
According to SNYDER [1974], for a circular fibre supporting many modes,
only meridional rays which strike the boundary at a radius $\rho_0$ suffer
significant bending loss and their attenuation coefficient will be the same as
for a slab waveguide, except for the factor q($\theta$) which can be found from
Table 3.
Near cut-off we can neglect $\kappa$ in the expression for q($\theta$); then, on
substituting for $(U/V)^2$ from equation (12)
Fig. 25. Curvature loss as a function of normalised ray angle for graded refractive index and
stepped refractive index planar waveguides.

Substitution of equation (62) into equation (60) yields the attenuation
coefficient. This is dominated by the exponential term, the argument of
which vanishes when $\theta$ approaches $\theta_c$. Fig. 25, due to GLOGE [1972b],
shows the curvature loss as a function of $\theta/\theta_c$ for slabs of thickness t with
uniform and graded profiles where - and 0 are related through
- = (ez - 2t/p0).


From the above description, we see that it is the energy in rays which
are close to the critical angle in the straight fibre which is lost first when the
fibre is gradually bent. In modal terms, it is those modes which are closest
to cut-off which first couple to the radiation modes and which give rise to
attenuation. Only for a severe bend would low order modes in a multimode
fibre couple directly to the radiation field.
Mode coupling due to perturbations in optical waveguides, including
bends, has been extensively considered by MARCUSE [1969, 1972, 1973a,
1974a] following an initial treatment of the problem by SNYDER [1969a, b].
GLOGE [1973] and MARCUSE [1972a] have shown how mode coupling
influences pulse propagation, while SNYDER [1969c] and CLARRICOATS and
CHAN [1973] have considered how isolated inhomogeneities couple power
between modes and from a mode to the radiation field.
When the core refractive index is perturbed by $h(r, \theta, z)$ the coupling
coefficient between the pth and qth modes is given by MARCUSE [1974a] as

where $e_p$ and $e_q$ are the linearly polarised vector fields associated with the
modes of the unperturbed fibre as given by the equations of Table 4.


For a fibre with a section of constant radius of curvature $\rho_0$ and length
L, the ratio of the power coupling coefficient between the pth and qth
modes (with $\nu > 0$) at L is, MARCUSE [1974a],


P, iP,.
Notice that the power oscillates between the modes with a period which
depends on the beat wavelength $\Lambda$, see Fig. 26, where

4 = 27Wp-8,8,)
x 4r,p

far from cut-off, equation (64) reduces to

where 8 x ep x 8, is the ray angle within the fibre as defined in the previous
section. Equation (66) shows that coupling between modes due to a bend
is inversely dependent on the sixth power of the bend radius.
The effect of random bends in a fibre has been considered by GLOGE
[1972] using a ray treatment and by MARCUSE[1969, 1973a1,who uses
modal methods to show that the power coupled between the pth and 4th
modes Ppqis
PjJq = I ~,,I2F(P,- 8,). (67

In equation (67), the z dependence of $K_{pq}$ (see equation (63)) has been
removed, leaving $K_{pq}$ with $F(\beta_p - \beta_q)$ representing the power spectrum of
the z dependent curvature function. Only the spectral component with
wavelength $\Lambda$, given by equation (65), influences coupling between the pth
and qth modes. For modes far from cut-off,

where $\langle\;\rangle$ denotes an ensemble average and the integration extends over
the fibre length L.
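
The selection rule contained in equation (67) can be illustrated numerically. The sketch below estimates the power spectrum F of a sampled curvature function by a direct Fourier sum and shows that a purely sinusoidal curvature of period $\Lambda$ contributes only at the spatial frequency $2\pi/\Lambda$. The normalisation of F (squared magnitude of the transform divided by the fibre length), the sample values, and all function names are assumptions made for illustration, since the defining expression is not reproduced here.

```python
import cmath
import math


def curvature_power_spectrum(curvature, dz, omega):
    """Estimate F(omega) = |sum c(z_k) exp(-i*omega*z_k) dz|^2 / L.

    `curvature` holds samples c(z_k) at z_k = k*dz; dividing by the
    fibre length L = N*dz is an assumed normalisation.
    """
    total = 0j
    for k, c in enumerate(curvature):
        total += c * cmath.exp(-1j * omega * k * dz) * dz
    length = len(curvature) * dz
    return abs(total) ** 2 / length


def coupled_power(k_pq, spectrum_value):
    """Equation (67): P_pq = |K_pq|^2 * F(beta_p - beta_q)."""
    return abs(k_pq) ** 2 * spectrum_value


# Hypothetical case: 1 m of fibre whose curvature is a single sine wave
# of period 10 mm, sampled every 0.5 mm (2000 samples).
period = 0.01
dz = 0.0005
samples = [math.sin(2.0 * math.pi * k * dz / period) for k in range(2000)]

# Spatial frequency that matches beta_p - beta_q when Lambda equals the period.
omega_match = 2.0 * math.pi / period
f_match = curvature_power_spectrum(samples, dz, omega_match)
f_off = curvature_power_spectrum(samples, dz, 1.5 * omega_match)
```

In this example `f_match` is close to L/4 = 0.25 while `f_off` is negligible, so only mode pairs whose beat wavelength equals the period of the curvature exchange appreciable power.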
Fig. 27. The output beam angle and total number of modes as a function of length in a multimode
fibre which propagates 700 modes. The straight line represents a best fit to a theory
which assumes mode coupling arises due to random bends. C is the curvature spectrum function.
(Figure annotation: $C = 7 \times 10^{-5}$ rad$^2$ m$^{-1}$; horizontal axis: length, m.)

Fig. 27, due to GLOGE [1972a], shows how the output beam angle and
total mode number vary with length along a fibre where coupling of
neighbouring modes is due mainly to random bends. The straight line
represents a best fit to theory in which the coefficient, labelled C by GLOGE
[1972a], is the curvature spectrum function. C is given by

where $\delta$ is the amplitude of sine waves $\sin(2\pi z/\Lambda)$ used to model the random
curvature and $\eta$ the number of undulations per unit length.
The above results refer to a multimode fibre. In a single mode fibre,
radiation loss can occur due to bends, core diameter fluctuations, or on
account of inhomogeneities, while coupling to cladding modes can also
occur. Fig. 28, due to CLARRICOATS and CHAN [1973], shows, for a monomode
fibre with finite cladding boundary, how the normalised power
which is both coupled into the cladding modes from the HE$_{11}$ mode and
transformed into radiation varies with the location of an isolated
inhomogeneity. The curve labelled HE$_{11}$ shows the power coupled back into
the HE$_{11}$ mode by the inhomogeneity. Following SNYDER [1969a, b], the
valid assumption is made that equal amounts scatter in both forward and
backward directions. Evidently, coupling to radiation predominates over
coupling to the cladding modes. The latter varies with $r/r_1$ in a manner
which is determined by the convolution of the field patterns of the HE$_{11}$
mode and the coupled mode. The influence on system performance of
multiple reflections due to scattering was first analysed by DYOTT and
STERN [1971] and an extended study has been made by HUBBARD [1972].

Fig. 28. Normalised coupling coefficient between the HE$_{11}$ mode and other modes of an optical
fibre, in the configuration of Fig. 2(a), as a function of the position of an inhomogeneity of
volume V and permittivity difference $\Delta\varepsilon$. Also shown is the coupling into radiated power.


Although, in general, one aims to minimise inhomogeneities during fibre
manufacture, it transpires that, provided coupling to the radiation field is
not too great, intermodal coupling actually plays an advantageous role in
multimode fibres. PERSONICK [1971] was the first to recognise that, when
mode coupling is present, pulse dispersion is dependent on $L^{1/2}$ rather than
L. Subsequently, GLOGE [1972] and MARCUSE [1974a, b] studied the problem
in depth, and in the following section we present a résumé of the analysis
of GLOGE [1973].
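
The practical consequence of this change in length dependence can be seen with a short calculation. The sketch below compares how pulse dispersion grows when the fibre length is multiplied by a given factor, assuming dispersion proportional to L without mode coupling and to the square root of L with it. Only the scaling is modelled; the function name is hypothetical and no absolute dispersion values are implied.

```python
import math


def dispersion_growth(length_ratio: float, mode_coupling: bool) -> float:
    """Factor by which pulse dispersion grows when the fibre length is
    multiplied by `length_ratio`.

    Assumes dispersion proportional to L without mode coupling and to
    sqrt(L) when mode coupling is present.
    """
    if length_ratio <= 0.0:
        raise ValueError("length ratio must be positive")
    return math.sqrt(length_ratio) if mode_coupling else length_ratio


# Quadrupling the fibre length.
growth_uncoupled = dispersion_growth(4.0, mode_coupling=False)
growth_coupled = dispersion_growth(4.0, mode_coupling=True)
```

Quadrupling the length therefore quadruples the dispersion in the uncoupled case but only doubles it when the modes are coupled.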
In a multimode fibre with large V, many modes propagate and the associated
rays are so densely packed that their distribution as a function of $\theta$ can be
considered continuous. The power distribution in the fibre $P(\theta, z, t)$ then

where the first term on the right-hand side arises from attenuation effects
at the cladding and core-cladding interface, which increase as $\theta^2$. The
coefficient A is measured in m$^{-1}$ rad$^{-2}$; $\theta$ independent loss is omitted
but this can be incorporated in the final solution. The second term is
associated with the differential group delay according to ray optics. The
third term arises from mode coupling between closely adjacent modes
and takes the form of a diffusion process in the ray picture. $D = (4/\pi^2)C$,
see equation (69), is the coupling coefficient, which is assumed in the following
to be independent of $\theta$. Gloge also discusses the influence of a $\theta$ dependent
coupling coefficient. Equation (70) may be solved using the Laplace transform;
then equation (70) becomes

$$\Theta_\infty = (4D/A)^{1/4} \qquad (77)$$
$$\gamma_\infty = (4DA)^{1/2}. \qquad (78)$$
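
Equations (77) and (78) are straightforward to evaluate. The sketch below reads the exponents as 1/4 and 1/2, the choice that gives an angle in radians and a loss coefficient in m$^{-1}$ when A is in m$^{-1}$ rad$^{-2}$ and D is in rad$^2$ m$^{-1}$. The numerical values of A and D and the function names are hypothetical and serve only to exercise the formulas.

```python
import math


def steady_state_width(d_coupling: float, a_loss: float) -> float:
    """Equation (77): steady-state angular width (4 D / A) ** (1/4), in rad."""
    return (4.0 * d_coupling / a_loss) ** 0.25


def steady_state_loss(d_coupling: float, a_loss: float) -> float:
    """Equation (78): steady-state loss coefficient (4 D A) ** (1/2), in 1/m."""
    return math.sqrt(4.0 * d_coupling * a_loss)


# Hypothetical values, not taken from the text.
D = 3.0e-5  # coupling (diffusion) coefficient, rad^2 / m
A = 1.0     # loss coefficient, 1 / (m rad^2)

theta_inf = steady_state_width(D, A)
gamma_inf = steady_state_loss(D, A)
```

With these definitions the two quantities are linked by $\gamma_\infty = A\,\Theta_\infty^2$, so the steady-state loss equals the angle-dependent loss evaluated at the steady-state width.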

For C.W. excitation ($s = 0$), the angular width $\Theta(z, 0)$ changes monotonically
from $\Theta_0$ to $\Theta_\infty$ as z increases. $\Theta_\infty$ characterises a distribution which
propagates without change and with the minimum overall loss coefficient $\gamma_\infty$.
Closed form solutions of (74) exist for very small and very large values of
$\gamma_\infty z$. For the former, if the Gaussian angular distribution at the input and
at infinity are the same,

The denominator ex