
Laboratory Techniques

(1718)

by

ATHER HASSAN, Department of Physics, Allama Iqbal Open University, Islamabad

Laboratory Techniques
(1718)
First launched for research students, November 2013

ATHER HASSAN, Department of Physics, Allama Iqbal Open University, Islamabad

Foreword

In order to understand what it means to work successfully in groups and individually, students must develop both a foundation of laboratory skills and an understanding of the key elements critical to achieving group success. I wrote this book to provide a framework for learning these necessary skills in a way that emphasizes the uniqueness of each group and each individual within the group. Successful group work starts with strong skills and ambition. This text provides the necessary roots towards skills, decisions and complete laboratory activities.

Ather

Contents

Articles
Vacuum  1
Pressure measurement  16
Pirani gauge  29
Hot-filament ionization gauge  32
Vacuum pump  34
Cryopump  41
Getter  42
Ion pump (physics)  44
Rotary vane pump  45
Diaphragm pump  47
Liquid ring pump  49
Reciprocating compressor  51
Scroll compressor  52
Archimedes' screw  56
Wankel engine  60
Roots-type supercharger  78
Toepler pump  82
Lobe pump  83
Diffusion pump  84
Turbomolecular pump  87
Outgassing  90
Coating  91
Wafer (electronics)  95
Substrate (electronics)  100
Chemical vapor deposition  100
Metalorganic vapour phase epitaxy  106
Electrostatic spray-assisted vapour deposition  110
Epitaxy  111
Molecular beam epitaxy  115
Physical vapor deposition  117
Cathodic arc deposition  120
Electron beam physical vapor deposition  123
Evaporation (deposition)  126
Pulsed laser deposition  129
Sputter deposition  132
Calo tester  136
Nanoindentation  137
Tribometer  142
Ion plating  144
Ion beam-assisted deposition  145
Vacuum evaporation  146
Plating  147
Electroplating  151
Spray painting  156
Thermal spraying  160
Plasma transferred wire arc thermal spraying  167
Powder coating  168
Spin coating  171
Thin film  172
Sol-gel  175
Plasma-enhanced chemical vapor deposition  183
Atomic layer deposition  185
Sputtering  189
Electrohydrodynamics  192
Chemical bath deposition  194
Chemical beam epitaxy  195
Deposition (aerosol physics)  199
Aerosol  201
X-ray crystallography  210
Powder diffraction  231
Bragg's law  238
Structure factor  243
Crystallography  249
Miller index  254
Crystal structure  258
Kikuchi line  268
X-ray scattering techniques  273
Small-angle X-ray scattering  274
X-ray reflectivity  277
Wide-angle X-ray scattering  278
Chemical structure  279
Atomic spectroscopy  280
Atomic absorption spectroscopy  282
Atomic emission spectroscopy  290
Fluorescence spectroscopy  291
Conductivity (electrolytic)  295
Electrical conductivity meter  300
Electrical resistivity and conductivity  302
LCR meter  313
Device under test  314
Q factor  315
Electrical impedance  321
Hall effect  330
Hall effect sensor  339
Scanning electron microscope  341
Topography  352
Raster scan  356
Secondary electrons  361
Backscatter  362
Energy-dispersive X-ray spectroscopy  363
Cathodoluminescence  366
Depth of field  367
Scanning tunneling microscope  393
Transmission electron microscopy  401
Scanning transmission electron microscopy  419
Charge-coupled device  421

References
Article Sources and Contributors  431
Image Sources, Licenses and Contributors  439

Article Licenses
License  447

Vacuum

Vacuum is space that is empty of matter. The word stems from the Latin adjective vacuus, "vacant" or "void". An approximation to such a vacuum is a region with a gaseous pressure much less than atmospheric pressure. Physicists often discuss ideal test results that would occur in a perfect vacuum, which they sometimes simply call "vacuum" or free space, and use the term partial vacuum to refer to an actual imperfect vacuum as one might have in a laboratory or in space. The Latin term in vacuo is used to describe an object as being in what would otherwise be a vacuum.

A pump to demonstrate vacuum

The quality of a partial vacuum refers to how closely it approaches a perfect vacuum. Other things equal, lower gas pressure means higher-quality vacuum. For example, a typical vacuum cleaner produces enough suction to reduce air pressure by around 20%. Much higher-quality vacuums are possible. Ultra-high vacuum chambers, common in chemistry, physics, and engineering, operate below one trillionth (10⁻¹²) of atmospheric pressure (100 nPa), and can reach around 100 particles/cm³. Outer space is an even higher-quality vacuum, with the equivalent of just a few hydrogen atoms per cubic metre on average. Some theories predict that even if all matter could be removed from a volume, it would still not be "empty" due to vacuum fluctuations, dark energy, and other phenomena in quantum physics. In modern particle physics, the vacuum state is considered the ground state of matter.

Vacuum has been a frequent topic of philosophical debate since ancient Greek times, but was not studied empirically until the 17th century. Evangelista Torricelli produced the first laboratory vacuum in 1643, and other experimental techniques were developed as a result of his theories of atmospheric pressure. A Torricellian vacuum is created by filling a tall glass container, closed at one end, with mercury and then inverting the container into a bowl to contain the mercury. Vacuum became a valuable industrial tool in the 20th century with the introduction of incandescent light bulbs and vacuum tubes, and a wide array of vacuum technology has since become available. The recent development of human spaceflight has raised interest in the impact of vacuum on human health, and on life forms in general.


Etymology
From Latin vacuum ("an empty space, void"), a noun use of the neuter of vacuus ("empty"), related to vacare ("to be empty"). "Vacuum" is one of the few words in the English language that contains two consecutive 'u's.

Electromagnetism
In classical electromagnetism, the vacuum of free space, or sometimes just free space or perfect vacuum, is a standard reference medium for electromagnetic effects. Some authors refer to this reference medium as classical vacuum, a terminology intended to separate this concept from QED vacuum or QCD vacuum, where vacuum fluctuations can produce transient virtual particle densities and a relative permittivity and relative permeability that are not identically unity.

A large vacuum chamber

In the theory of classical electromagnetism, free space has the following properties:
Electromagnetic radiation travels, where unobstructed, at the speed of light, the defined value 299,792,458 m/s in SI units.
The superposition principle is always exactly true. For example, the electric potential generated by two charges is the simple addition of the potentials generated by each charge in isolation. The value of the electric field at any point around these two charges is found by calculating the vector sum of the two electric fields from each of the charges acting alone.
The permittivity and permeability are exactly the electric constant ε0 and the magnetic constant μ0, respectively (in SI units), or exactly 1 (in Gaussian units).
The characteristic impedance (η) equals the impedance of free space Z0 ≈ 376.73 Ω.

The vacuum of classical electromagnetism can be viewed as an idealized electromagnetic medium with the constitutive relations in SI units

    D(r, t) = ε0 E(r, t)
    H(r, t) = B(r, t) / μ0

relating the electric displacement field D to the electric field E, and the magnetic field or H-field H to the magnetic induction or B-field B. Here r is a spatial location and t is time.
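To make the free-space constants above concrete, the minimal Python sketch below computes the speed of light and the impedance of free space from the electric and magnetic constants; it is an illustrative addition, with the constants quoted only to the precision shown.

    import math

    EPSILON_0 = 8.8541878128e-12   # electric constant, F/m
    MU_0 = 4 * math.pi * 1e-7      # magnetic constant, H/m (pre-2019 exact SI value)

    # The speed of light and the impedance of free space follow from the constants.
    c = 1 / math.sqrt(EPSILON_0 * MU_0)   # ~2.998e8 m/s
    Z0 = math.sqrt(MU_0 / EPSILON_0)      # ~376.73 ohm

    print(f"c  = {c:.6e} m/s")
    print(f"Z0 = {Z0:.2f} ohm")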


Quantum mechanics
In quantum mechanics and quantum field theory, the vacuum is defined as the state (that is, the solution to the equations of the theory) with the lowest possible energy (the ground state of the Hilbert space). In quantum electrodynamics this vacuum is referred to as 'QED vacuum' to separate it from the vacuum of quantum chromodynamics, denoted as QCD vacuum. QED vacuum is a state with no matter particles (hence the name), and also no photons, no gravitons, etc. As described above, this state is impossible to achieve experimentally. (Even if every matter particle could somehow be removed from a volume, it would be impossible to eliminate all the blackbody photons.) Nonetheless, it provides a good model for realizable vacuum, and agrees with a number of experimental observations as described next.

Video of an experiment showing vacuum fluctuations (in the red ring) amplified by spontaneous parametric down-conversion

QED vacuum has interesting and complex properties. In QED vacuum, the electric and magnetic fields have zero average values, but their variances are not zero. As a result, QED vacuum contains vacuum fluctuations (virtual particles that hop into and out of existence) and a finite energy called vacuum energy. Vacuum fluctuations are an essential and ubiquitous part of quantum field theory. Some experimentally verified effects of vacuum fluctuations include spontaneous emission, the Casimir effect and the Lamb shift. Coulomb's law and the electric potential in vacuum near an electric charge are modified.

Theoretically, in QCD vacuum multiple vacuum states can coexist. The starting and ending of cosmological inflation is thought to have arisen from transitions between different vacuum states. For theories obtained by quantization of a classical theory, each stationary point of the energy in the configuration space gives rise to a single vacuum. String theory is believed to have a huge number of vacua, the so-called string theory landscape.
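As a numerical illustration of one of the vacuum-fluctuation effects listed above, the minimal Python sketch below evaluates the ideal Casimir pressure between two parallel, perfectly conducting plates, P = π²ħc/(240 a⁴); the 1 µm separation is only an assumed example value.

    import math

    HBAR = 1.054571817e-34  # reduced Planck constant, J*s
    C = 2.99792458e8        # speed of light, m/s

    def casimir_pressure(separation_m):
        """Ideal Casimir pressure (attractive) between two parallel,
        perfectly conducting plates a distance `separation_m` apart."""
        return math.pi**2 * HBAR * C / (240 * separation_m**4)

    # Plates 1 micrometre apart feel roughly a millipascal of attraction.
    print(f"{casimir_pressure(1e-6):.2e} Pa")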


Outer space
Outer space has very low density and pressure, and is the closest physical approximation of a perfect vacuum. But no vacuum is truly perfect, not even in interstellar space, where there are still a few hydrogen atoms per cubic metre. Stars, planets and moons keep their atmospheres by gravitational attraction, and as such, atmospheres have no clearly delineated boundary: the density of atmospheric gas simply decreases with distance from the object. The Earth's atmospheric pressure drops to about 3.2×10⁻² Pa at 100 kilometres (62 mi) of altitude, the Kármán line, which is a common definition of the boundary with outer space. Beyond this line, isotropic gas pressure rapidly becomes insignificant when compared to radiation pressure from the Sun and the dynamic pressure of the solar wind, so the definition of pressure becomes difficult to interpret. The thermosphere in this range has large gradients of pressure, temperature and composition, and varies greatly due to space weather. Astrophysicists prefer to use number density to describe these environments, in units of particles per cubic centimetre.

Outer space is not a perfect vacuum, but a tenuous plasma awash with charged particles, electromagnetic fields, and the occasional star.

But although it meets the definition of outer space, the atmospheric density within the first few hundred kilometres above the Kármán line is still sufficient to produce significant drag on satellites. Most artificial satellites operate in this region, called low Earth orbit, and must fire their engines every few days to maintain orbit. The drag here is low enough that it could theoretically be overcome by radiation pressure on solar sails, a proposed propulsion system for interplanetary travel. Planets are too massive for their trajectories to be significantly affected by these forces, although their atmospheres are eroded by the solar winds. All of the observable universe is filled with large numbers of photons, the so-called cosmic background radiation, and quite likely a correspondingly large number of neutrinos. The current temperature of this radiation is about 3 K, or -270 degrees Celsius or -454 degrees Fahrenheit.
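Because these environments are usually described by number density rather than pressure, the minimal Python sketch below converts a pressure and temperature into particles per cubic centimetre using the ideal-gas relation n = P/(k_B T); the 200 K temperature in the example is an assumed round value for the lower thermosphere.

    K_B = 1.380649e-23  # Boltzmann constant, J/K

    def number_density_per_cm3(pressure_pa, temperature_k):
        """Ideal-gas number density, converted from m^-3 to cm^-3."""
        return pressure_pa / (K_B * temperature_k) / 1e6

    # Example: roughly the pressure near the Karman line (~3.2e-2 Pa) at ~200 K.
    print(f"{number_density_per_cm3(3.2e-2, 200):.2e} particles/cm^3")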

Historical interpretation
Historically, there has been much dispute over whether such a thing as a vacuum can exist. Ancient Greek philosophers debated the existence of a vacuum, or void, in the context of atomism, which posited void and atom as the fundamental explanatory elements of physics. Following Plato, even the abstract concept of a featureless void faced considerable skepticism: it could not be apprehended by the senses, it could not, itself, provide additional explanatory power beyond the physical volume with which it was commensurate and, by definition, it was quite literally nothing at all, which cannot rightly be said to exist. Aristotle believed that no void could occur naturally, because the denser surrounding material continuum would immediately fill any incipient rarity that might give rise to a void. In his Physics, book IV, Aristotle offered numerous arguments against the void: for example, that motion through a medium which offered no impediment could continue ad infinitum, there being no reason that something would

come to rest anywhere in particular. Although Lucretius argued for the existence of vacuum in the first century BC and Hero of Alexandria tried unsuccessfully to create an artificial vacuum in the first century AD, it was European scholars such as Roger Bacon, Blasius of Parma and Walter Burley in the 13th and 14th century who focused considerable attention on these issues. Eventually following Stoic physics in this instance, scholars from the 14th century onward increasingly departed from the Aristotelian perspective in favor of a supernatural void beyond the confines of the cosmos itself, a conclusion widely acknowledged by the 17th century, which helped to segregate natural and theological concerns.

Almost two thousand years after Plato, René Descartes also proposed a geometrically based alternative theory of atomism, without the problematic nothing-everything dichotomy of void and atom. Although Descartes agreed with the contemporary position that a vacuum does not occur in nature, the success of his namesake coordinate system and, more implicitly, the spatial-corporeal component of his metaphysics would come to define the philosophically modern notion of empty space as a quantified extension of volume. By the ancient definition, however, directional information and magnitude were conceptually distinct. With the acquiescence of Cartesian mechanical philosophy to the "brute fact" of action at a distance, and at length its successful reification by force fields and ever more sophisticated geometric structure, the anachronism of empty space widened until "a seething ferment" of quantum activity in the 20th century filled the vacuum with a virtual pleroma.

The explanation of a clepsydra or water clock was a popular topic in the Middle Ages. Although a simple wine skin sufficed to demonstrate a partial vacuum, in principle, more advanced suction pumps had been developed in Roman Pompeii. In the medieval Middle Eastern world, the physicist and Islamic scholar Al-Farabi (Alpharabius, 872-950) conducted a small experiment concerning the existence of vacuum, in which he investigated handheld plungers in water. He concluded that air's volume can expand to fill available space, and he suggested that the concept of perfect vacuum was incoherent. However, according to Nader El-Bizri, the physicist Ibn al-Haytham (Alhazen, 965-1039) and the Mu'tazili theologians disagreed with Aristotle and Al-Farabi, and they supported the existence of a void. Using geometry, Ibn al-Haytham mathematically demonstrated that place (al-makan) is the imagined three-dimensional void between the inner surfaces of a containing body. According to Ahmad Dallal, Abū Rayhān al-Bīrūnī also states that "there is no observable evidence that rules out the possibility of vacuum". The suction pump later appeared in Europe from the 15th century.


Medieval thought experiments into the idea of a vacuum considered whether a vacuum was present, if only for an instant, between two flat plates when they were rapidly separated. There was much discussion of whether the air moved in quickly enough as the plates were separated, or, as Walter Burley postulated, whether a 'celestial agent' prevented the vacuum arising. The commonly held view that nature abhorred a vacuum was called horror vacui. Speculation that even God could not create a vacuum if he wanted to was shut down by the 1277 Paris condemnations of Bishop Etienne Tempier, which required there to be no restrictions on the powers of God, and which led to the conclusion that God could create a vacuum if he so wished. Jean Buridan reported in the 14th century that teams of ten horses could not pull open bellows when the port was sealed.

Torricelli's mercury barometer produced one of the first sustained vacuums in a laboratory.

The 17th century saw the first attempts to quantify measurements of partial vacuum. Evangelista Torricelli's mercury barometer of 1643 and Blaise Pascal's experiments both demonstrated a partial vacuum. In 1654, Otto von Guericke invented the first vacuum pump and conducted his famous Magdeburg hemispheres experiment, showing that teams of horses could not separate two hemispheres from which the air had been partially evacuated. Robert Boyle improved Guericke's design and, with the help of Robert Hooke, further developed vacuum pump technology. Thereafter, research into the partial vacuum lapsed until 1850, when August Toepler invented the Toepler pump, and 1855, when Heinrich Geissler invented the mercury displacement pump, achieving a partial vacuum of about 10 Pa (0.1 Torr). A number of electrical properties become observable at this vacuum level, which renewed interest in further research.

The Crookes tube, used to discover and study cathode rays, was an evolution of the Geissler tube.

While outer space provides the most rarefied example of a naturally occurring partial vacuum, the heavens were originally thought to be seamlessly filled by a rigid indestructible material called aether. Borrowing somewhat from the pneuma of Stoic physics, aether came to be regarded as the rarefied air from which it took its name. Early theories of light posited a ubiquitous terrestrial and celestial medium through which light propagated. Additionally, the concept informed Isaac Newton's explanations of both refraction and of radiant heat. 19th-century experiments into this luminiferous aether attempted to detect a minute drag on the Earth's orbit. While the Earth does, in fact, move through a relatively dense medium in comparison to that of interstellar space, the drag is so minuscule that it could not be detected. In 1912, astronomer Henry Pickering commented: "While the interstellar absorbing medium may be simply the ether, it is characteristic of a gas, and free gaseous molecules are certainly there".

In 1930, Paul Dirac proposed a model of the vacuum as an infinite sea of particles possessing negative energy, called the Dirac sea. This theory helped refine the predictions of his earlier formulated Dirac equation, and successfully predicted the existence of the positron, confirmed two years later. Werner Heisenberg's uncertainty principle, formulated in 1927, predicts a fundamental limit within which instantaneous position and momentum, or energy and time, can be measured. This has far-reaching consequences on the "emptiness" of space between particles. In the late 20th century, so-called virtual particles that arise spontaneously from empty space were confirmed.

Measurement
The quality of a vacuum is indicated by the amount of matter remaining in the system, so that a high-quality vacuum is one with very little matter left in it. Vacuum is primarily measured by its absolute pressure, but a complete characterization requires further parameters, such as temperature and chemical composition. One of the most important parameters is the mean free path (MFP) of residual gases, which indicates the average distance that molecules will travel between collisions with each other. As the gas density decreases, the MFP increases, and when the MFP is longer than the chamber, pump, spacecraft, or other objects present, the continuum assumptions of fluid mechanics do not apply. This vacuum state is called high vacuum, and the study of fluid flows in this regime is called particle gas dynamics. The MFP of air at atmospheric pressure is very short, about 70 nm, but at 100 mPa (~1×10⁻³ Torr) the MFP of room-temperature air is roughly 100 mm, which is on the order of everyday objects such as vacuum tubes. The Crookes radiometer turns when the MFP is larger than the size of the vanes.
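The scaling of the mean free path with pressure can be checked with the standard kinetic-theory estimate λ = k_B T / (√2 π d² P). The minimal Python sketch below is illustrative; the 0.37 nm effective molecular diameter assumed for air is an approximate value.

    import math

    K_B = 1.380649e-23   # Boltzmann constant, J/K
    D_AIR = 3.7e-10      # assumed effective molecular diameter of air, m

    def mean_free_path(pressure_pa, temperature_k=293.0):
        """Kinetic-theory mean free path in metres."""
        return K_B * temperature_k / (math.sqrt(2) * math.pi * D_AIR**2 * pressure_pa)

    print(f"{mean_free_path(101325):.1e} m   # atmospheric pressure, ~7e-8 m")
    print(f"{mean_free_path(0.1):.1e} m   # 100 mPa, on the order of 0.1 m")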
Vacuum quality is subdivided into ranges according to the technology required to achieve or measure it. These ranges do not have universally agreed definitions, but a typical distribution is as follows:

Range                    Pressure (Torr)           Pressure (Pa)
Atmospheric pressure     760                       1.013×10⁺⁵
Low vacuum               760 to 25                 1×10⁺⁵ to 3×10⁺³
Medium vacuum            25 to 1×10⁻³              3×10⁺³ to 1×10⁻¹
High vacuum              1×10⁻³ to 1×10⁻⁹          1×10⁻¹ to 1×10⁻⁷
Ultra high vacuum        1×10⁻⁹ to 1×10⁻¹²         1×10⁻⁷ to 1×10⁻¹⁰
Extremely high vacuum    < 1×10⁻¹²                 < 1×10⁻¹⁰
Outer space              1×10⁻⁶ to < 3×10⁻¹⁷       1×10⁻⁴ to < 3×10⁻¹⁵
Perfect vacuum           0                         0

Atmospheric pressure is variable but standardized at 101.325 kPa (760 Torr).
Low vacuum, also called rough vacuum or coarse vacuum, is vacuum that can be achieved or measured with rudimentary equipment such as a vacuum cleaner and a liquid column manometer.
Medium vacuum is vacuum that can be achieved with a single pump, but the pressure is too low to measure with a liquid or mechanical manometer. It can be measured with a McLeod gauge, thermal gauge or a capacitive gauge.
High vacuum is vacuum where the MFP of residual gases is longer than the size of the chamber or of the object under test. High vacuum usually requires multi-stage pumping and ion gauge measurement. Some texts differentiate between high vacuum and very high vacuum.

Ultra high vacuum requires baking the chamber to remove trace gases, and other special procedures. British and German standards define ultra high vacuum as pressures below 10⁻⁶ Pa (10⁻⁸ Torr).
Deep space is generally much more empty than any artificial vacuum. It may or may not meet the definition of high vacuum above, depending on what region of space and astronomical bodies are being considered. For example, the MFP of interplanetary space is smaller than the size of the solar system, but larger than small planets and moons. As a result, solar winds exhibit continuum flow on the scale of the solar system, but must be considered a bombardment of particles with respect to the Earth and Moon.
Perfect vacuum is an ideal state of no particles at all. It cannot be achieved in a laboratory, although there may be small volumes which, for a brief moment, happen to have no particles of matter in them. Even if all particles of matter were removed, there would still be photons and gravitons, as well as dark energy, virtual particles, and other aspects of the quantum vacuum.
Hard vacuum and soft vacuum are terms with a dividing line defined differently by different sources, such as 1 Torr or 0.1 Torr; the common denominator is that a hard vacuum is a higher vacuum than a soft one.
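As a rough illustration of the ranges tabulated above (the boundaries are conventional, not universally agreed), the minimal Python sketch below maps an absolute pressure in pascals to a named range.

    def vacuum_range(pressure_pa):
        """Classify an absolute pressure (Pa) using the approximate ranges above."""
        if pressure_pa >= 1.0e5:
            return "about atmospheric pressure or above"
        if pressure_pa >= 3e3:
            return "low vacuum"
        if pressure_pa >= 1e-1:
            return "medium vacuum"
        if pressure_pa >= 1e-7:
            return "high vacuum"
        if pressure_pa >= 1e-10:
            return "ultra high vacuum"
        return "extremely high vacuum"

    print(vacuum_range(100.0))   # medium vacuum (freeze-drying territory)
    print(vacuum_range(1e-8))    # ultra high vacuum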

Relative versus absolute measurement


Vacuum is measured in units of pressure, typically as a subtraction relative to ambient atmospheric pressure on Earth. But the amount of relative measurable vacuum varies with local conditions. On the surface of Jupiter, where ground-level atmospheric pressure is much higher than on Earth, much higher relative vacuum readings would be possible. On the surface of the Moon, with almost no atmosphere, it would be extremely difficult to create a measurable vacuum relative to the local environment. Similarly, much higher than normal relative vacuum readings are possible deep in the Earth's ocean. A submarine maintaining an internal pressure of 1 atmosphere submerged to a depth of 10 atmospheres (98 metres; a 9.8 metre column of seawater has the equivalent weight of 1 atm) is effectively a vacuum chamber keeping out the crushing exterior water pressures, though the 1 atm inside the submarine would not normally be considered a vacuum. Therefore, to properly understand the following discussions of vacuum measurement, it is important that the reader assumes the relative measurements are being done on Earth at sea level, at exactly 1 atmosphere of ambient atmospheric pressure.


Measurements relative to 1 atm


The SI unit of pressure is the pascal (symbol Pa), but vacuum is often measured in torrs, named for Torricelli, an early Italian physicist (1608-1647). A torr is equal to the displacement of a millimetre of mercury (mmHg) in a manometer, with 1 Torr equaling 133.3223684 pascals above absolute zero pressure. Vacuum is often also measured on the barometric scale or as a percentage of atmospheric pressure in bars or atmospheres. Low vacuum is often measured in millimetres of mercury (mmHg) or pascals (Pa) below standard atmospheric pressure. "Below atmospheric" means that the absolute pressure is equal to the current atmospheric pressure minus the vacuum reading; a low vacuum gauge that reads, for example, 50.79 Torr below an ambient pressure of 760 Torr corresponds to an absolute pressure of 709.21 Torr. Many inexpensive low vacuum gauges have a margin of error and may report a vacuum of 0 Torr, but in practice reaching much beyond (lower than) 1 Torr generally requires a two-stage rotary vane or another medium type of vacuum pump.
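A minimal Python sketch of that bookkeeping (illustrative only; gauge conventions differ between instruments) converts a reading taken as "Torr below atmospheric" into an absolute pressure.

    TORR_TO_PA = 133.3223684

    def absolute_from_vacuum_reading(vacuum_torr, ambient_torr=760.0):
        """Absolute pressure implied by a low-vacuum gauge reading taken as
        'Torr below ambient atmospheric pressure'."""
        absolute_torr = ambient_torr - vacuum_torr
        return absolute_torr, absolute_torr * TORR_TO_PA

    torr, pa = absolute_from_vacuum_reading(50.79)
    print(f"{torr:.2f} Torr = {pa:.0f} Pa")   # 709.21 Torr, roughly 9.5e4 Pa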

Measuring instruments
Many devices are used to measure the pressure in a vacuum, depending on what range of vacuum is needed.
A glass McLeod gauge, drained of mercury

Hydrostatic gauges (such as the mercury column manometer) consist of a vertical column of liquid in a tube whose ends are exposed to different pressures. The column will rise or fall until its weight is in equilibrium with the pressure differential between the two ends of the tube. The simplest design is a closed-end U-shaped tube, one side of which is connected to the region of interest. Any fluid can be used, but mercury is preferred for its high density and low vapour pressure. Simple hydrostatic gauges can measure pressures ranging from 1 Torr (100 Pa) to above atmospheric. An important variation is the McLeod gauge, which isolates a known volume of vacuum and compresses it to multiply the height variation of the liquid column. The McLeod gauge can measure vacuums as high as 10⁻⁶ Torr (0.1 mPa), which is the lowest direct measurement of pressure that is possible with current technology. Other vacuum gauges can measure lower pressures, but only indirectly by measurement of other pressure-controlled properties. These indirect measurements must be calibrated via a direct measurement, most commonly a McLeod gauge.

Mechanical or elastic gauges depend on a Bourdon tube, diaphragm, or capsule, usually made of metal, which will change shape in response to the pressure of the region in question. A variation on this idea is the capacitance manometer, in which the diaphragm makes up a part of a capacitor. A change in pressure leads to the flexure of the diaphragm, which results in a change in capacitance. These gauges are effective from 10⁺³ Torr to 10⁻⁴ Torr, and beyond.

Thermal conductivity gauges rely on the fact that the ability of a gas to conduct heat decreases with pressure. In this type of gauge, a wire filament is heated by running current through it. A thermocouple or resistance temperature detector (RTD) can then be used to measure the temperature of the filament. This temperature is dependent on the rate at which the filament loses heat to the surrounding gas, and therefore on the thermal conductivity. A common variant is the Pirani gauge, which uses a single platinum filament as both the heated element and RTD. These gauges are accurate from 10 Torr to 10⁻³ Torr, but they are sensitive to the chemical composition of the gases being measured.

Ion gauges are used in ultrahigh vacuum. They come in two types: hot cathode and cold cathode. In the hot cathode version an electrically heated filament produces an electron beam. The electrons travel through the gauge and ionize gas molecules around them. The resulting ions are collected at a negative electrode. The current depends on the number of ions, which depends on the pressure in the gauge. Hot cathode gauges are accurate from 10⁻³ Torr to 10⁻¹⁰ Torr. The principle behind the cold cathode version is the same, except that electrons are produced in a discharge created by a high-voltage electrical discharge. Cold cathode gauges are accurate from 10⁻² Torr to 10⁻⁹ Torr. Ionization gauge calibration is very sensitive to construction geometry, chemical composition of gases being measured, corrosion and surface deposits. Their calibration can be invalidated by activation at atmospheric pressure or low vacuum. The composition of gases at high vacuums will usually be unpredictable, so a mass spectrometer must be used in conjunction with the ionization gauge for accurate measurement.


Uses
Vacuum is useful in a variety of processes and devices. Its first widespread use was in the incandescent light bulb to protect the filament from chemical degradation. The chemical inertness produced by a vacuum is also useful for electron beam welding, cold welding, vacuum packing and vacuum frying. Ultra-high vacuum is used in the study of atomically clean substrates, as only a very good vacuum preserves atomic-scale clean surfaces for a reasonably long time (on the order of minutes to days). High to ultra-high vacuum removes the obstruction of air, allowing particle beams to deposit or remove materials without contamination. This is the principle behind chemical vapor deposition, physical vapor deposition, and dry etching which are essential to the fabrication of semiconductors and optical coatings, and to surface science. The reduction of convection provides the thermal insulation of thermos bottles. Deep vacuum lowers the boiling point of liquids and promotes low temperature outgassing which is used in freeze drying, adhesive preparation, distillation, metallurgy, and process purging. The electrical properties of vacuum make electron microscopes and vacuum tubes possible, including cathode ray tubes. The elimination of air friction is useful for flywheel energy storage and ultracentrifuges.

Light bulbs contain a partial vacuum, usually backfilled with argon, which protects the tungsten filament


Vacuum-driven machines
Vacuums are commonly used to produce suction, which has an even wider variety of applications. The Newcomen steam engine used vacuum instead of pressure to drive a piston. In the 19th century, vacuum was used for traction on Isambard Kingdom Brunel's experimental atmospheric railway. Vacuum brakes were once widely used on trains in the UK but, except on heritage railways, they have been replaced by air brakes. Manifold vacuum can be used to drive accessories on automobiles. The best-known application is the vacuum servo, used to provide power assistance for the brakes. Obsolete applications include vacuum-driven windscreen wipers and fuel pumps. Some aircraft instruments (the attitude indicator (AI) and the heading indicator (HI)) are typically vacuum-powered, as protection against loss of all (electrically powered) instruments, since early aircraft often did not have electrical systems, and since there are two readily available sources of vacuum on a moving aircraft: the engine and an external venturi. Vacuum induction melting uses electromagnetic induction within a vacuum.

Outgassing

Evaporation and sublimation into a vacuum is called outgassing. All materials, solid or liquid, have a small vapour pressure, and their outgassing becomes important when the vacuum pressure falls below this vapour pressure. In man-made systems, outgassing has the same effect as a leak and can limit the achievable vacuum. Outgassing products may condense on nearby colder surfaces, which can be troublesome if they obscure optical instruments or react with other materials. This is of great concern to space missions, where an obscured telescope or solar cell can ruin an expensive mission.

The most prevalent outgassing product in man-made vacuum systems is water absorbed by chamber materials. It can be reduced by desiccating or baking the chamber, and removing absorbent materials. Outgassed water can condense in the oil of rotary vane pumps and reduce their net speed drastically if gas ballasting is not used. High vacuum systems must be clean and free of organic matter to minimize outgassing. Ultra-high vacuum systems are usually baked, preferably under vacuum, to temporarily raise the vapour pressure of all outgassing materials and boil them off. Once the bulk of the outgassing materials are boiled off and evacuated, the system may be cooled to lower vapour pressures and minimize residual outgassing during actual operation. Some systems are cooled well below room temperature by liquid nitrogen to shut down residual outgassing and simultaneously cryopump the system.
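A common way to quantify this limit is a rate-of-rise estimate: with the pumps valved off, the pressure climbs at roughly dP/dt = qA/V, where q is the specific outgassing rate of the exposed surfaces. The minimal Python sketch below is illustrative only; the rate used for unbaked stainless steel is an assumed order-of-magnitude figure, and real rates depend strongly on pumping time and surface preparation.

    def pressure_rise_rate(q_pa_m_per_s, area_m2, volume_m3):
        """Approximate rate of pressure rise (Pa/s) in a sealed chamber whose
        walls outgas at specific rate q (Pa*m^3 per second per m^2 of surface)."""
        return q_pa_m_per_s * area_m2 / volume_m3

    # Assumed example: ~1e-5 Pa*m/s for unbaked stainless steel after several
    # hours of pumping, 0.5 m^2 of wall area, and a 0.05 m^3 chamber.
    print(f"{pressure_rise_rate(1e-5, 0.5, 0.05):.1e} Pa/s")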

This shallow water well pump reduces atmospheric air pressure inside the pump chamber. Atmospheric pressure extends down into the well, and forces water up the pipe into the pump to balance the reduced pressure. Above-ground pump chambers are only effective to a depth of approximately 9 metres due to the water column weight balancing the atmospheric pressure.

Pumping and ambient air pressure


Fluids cannot generally be pulled, so a vacuum cannot be created by suction. Suction can spread and dilute a vacuum by letting a higher pressure push fluids into it, but the vacuum has to be created first before suction can occur. The easiest way to create an artificial vacuum is to expand the volume of a container. For example, the diaphragm muscle expands the chest cavity, which causes the volume of the lungs to increase. This expansion reduces the pressure and creates a partial vacuum, which is soon filled by air pushed in by atmospheric pressure. To continue evacuating a chamber indefinitely without requiring infinite growth, a compartment of the vacuum can be repeatedly closed off, exhausted, and expanded again. This is the principle behind positive displacement pumps, like the manual water pump for example. Inside the pump, a mechanism expands a small sealed cavity to create a vacuum. Because of the pressure differential, some fluid from the chamber (or the well, in our example) is pushed into the pump's small cavity. The pump's cavity is then sealed from the chamber, opened to the atmosphere, and squeezed back to a minute size.
Deep wells have the pump chamber down in the well close to the water surface, or in the water. A "sucker rod" extends from the handle down the center of the pipe deep into the well to operate the plunger. The pump handle acts as a heavy counterweight against both the sucker rod weight and the weight of the water column standing on the upper plunger up to ground level.

The above explanation is merely a simple introduction to vacuum pumping, and is not representative of the entire range of pumps in use. Many variations of the positive displacement pump have been developed, and many other pump designs rely on fundamentally different principles. Momentum transfer pumps, which bear some similarities to dynamic pumps used at higher pressures, can achieve much higher quality vacuums than positive displacement pumps. Entrapment pumps can capture gases in a solid or absorbed state, often with no moving parts, no seals and no vibration. None of these pumps are universal; each type has important performance limitations. They all share a difficulty in pumping low molecular weight gases, especially hydrogen, helium, and neon. The lowest pressure that can be attained in a system is also dependent on many things other than the nature of the pumps. Multiple pumps may be connected in series, called stages, to achieve higher vacuums. The choice of seals, chamber geometry, materials, and pump-down procedures will all have an impact. Collectively, these are called

A cutaway view of a turbomolecular pump, a momentum transfer pump used to achieve high vacuum

vacuum technique. And sometimes, the final pressure is not the only relevant characteristic. Pumping systems differ in oil contamination, vibration, preferential pumping of certain gases, pump-down speeds, intermittent duty cycle, reliability, or tolerance to high leakage rates. In ultra-high vacuum systems, some very "odd" leakage paths and outgassing sources must be considered. The water absorption of aluminium and palladium becomes an unacceptable source of outgassing, and even the adsorptivity of hard metals such as stainless steel or titanium must be considered. Some oils and greases will boil off in extreme vacuums. The permeability of the metallic chamber walls may have to be considered, and the grain direction of the metallic flanges should be parallel to the flange face. The lowest pressures currently achievable in the laboratory are about 10⁻¹³ Torr (13 pPa). However, pressures as low as 5×10⁻¹⁷ Torr (6.7 fPa) have been indirectly measured in a 4 K cryogenic vacuum system. This corresponds to about 100 particles/cm³.
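For rough sizing of a pumping system in the viscous-flow regime, a commonly used estimate is t = (V/S) ln(P0/P1), with S the effective pumping speed; it neglects outgassing, leaks and conductance losses. The Python sketch below is a minimal illustration with assumed example numbers.

    import math

    def pump_down_time(volume_m3, speed_m3_per_s, p_start_pa, p_end_pa):
        """Idealized pump-down time t = (V/S) * ln(P0/P1), ignoring
        outgassing, leaks and pipe conductance."""
        return (volume_m3 / speed_m3_per_s) * math.log(p_start_pa / p_end_pa)

    # Assumed example: a 0.1 m^3 chamber and a 5 l/s (0.005 m^3/s) rotary vane
    # pump, evacuating from atmospheric pressure down to 10 Pa.
    t = pump_down_time(0.1, 0.005, 101325, 10)
    print(f"{t:.0f} s (about {t/60:.1f} minutes)")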


Effects on humans and animals


Humans and animals exposed to vacuum will lose consciousness after a few seconds and die of hypoxia within minutes, but the symptoms are not nearly as graphic as commonly depicted in media and popular culture. The reduction in pressure lowers the temperature at which blood and other body fluids boil, but the elastic pressure of blood vessels ensures that this boiling point remains above the internal body temperature of 37 °C. Although the blood will not boil, the formation of gas bubbles in bodily fluids at reduced pressures, known as ebullism, is still a concern. The gas may bloat the body to twice its normal size and slow circulation, but tissues are elastic and porous enough to prevent rupture. Swelling and ebullism can be restrained by containment in a flight suit. Shuttle astronauts wore a fitted elastic garment called the Crew Altitude Protection Suit (CAPS) which prevents ebullism at pressures as low as 2 kPa (15 Torr). Rapid boiling will cool the skin and create frost, particularly in the mouth, but this is not a significant hazard.

The painting An Experiment on a Bird in the Air Pump by Joseph Wright of Derby, 1768, depicts an experiment performed by Robert Boyle in 1660.

Animal experiments show that rapid and complete recovery is normal for exposures shorter than 90 seconds, while longer full-body exposures are fatal and resuscitation has never been successful. There is only a limited amount of data available from human accidents, but it is consistent with animal data. Limbs may be exposed for much longer if breathing is not impaired. Robert Boyle was the first to show, in 1660, that vacuum is lethal to small animals. An experiment indicates that plants are able to survive in a low-pressure environment (1.5 kPa) for about 30 minutes. During 1942, in one of a series of experiments on human subjects for the Luftwaffe, the Nazi regime experimented on prisoners in Dachau concentration camp by exposing them to low pressure.

Cold or oxygen-rich atmospheres can sustain life at pressures much lower than atmospheric, as long as the density of oxygen is similar to that of the standard sea-level atmosphere. The colder air temperatures found at altitudes of up to 3 km generally compensate for the lower pressures there. Above this altitude, oxygen enrichment is necessary to prevent altitude sickness in humans that did not undergo prior acclimatization, and spacesuits are necessary to prevent ebullism above 19 km. Most spacesuits use only 20 kPa (150 Torr) of pure oxygen. This pressure is high enough to prevent ebullism, but decompression sickness and gas embolisms can still occur if decompression rates are not managed.

Rapid decompression can be much more dangerous than vacuum exposure itself. Even if the victim does not hold his or her breath, venting through the windpipe may be too slow to prevent the fatal rupture of the delicate alveoli of the lungs. Eardrums and sinuses may be ruptured by rapid decompression, soft tissues may bruise and seep blood, and the stress of shock will accelerate oxygen consumption, leading to hypoxia. Injuries caused by rapid decompression are called barotrauma. A pressure drop of 13 kPa (100 Torr), which produces no symptoms if it is gradual, may be fatal if it occurs suddenly. Some extremophile microorganisms, such as tardigrades, can survive vacuum for a period of days.


Examples
                                   Pressure (Pa)                            Pressure (Torr)      Mean free path     Molecules per cm³
Standard atmosphere (comparison)   101.325 kPa                              760                  66 nm              2.5×10¹⁹
Vacuum cleaner                     approximately 8×10⁺⁴                     600                  70 nm              10¹⁹
Liquid ring vacuum pump            approximately 3.2×10⁺³                   24                   1.75 μm            10¹⁸
Mars atmosphere                    1.155 kPa to 0.03 kPa (mean 0.6 kPa)     8.66 to 0.23         -                  -
Freeze drying                      100 to 10                                1 to 0.1             100 μm to 1 mm     10¹⁶ to 10¹⁵
Rotary vane pump                   100 to 0.1                               1 to 10⁻³            100 μm to 10 cm    10¹⁶ to 10¹³
Incandescent light bulb            10 to 1                                  0.1 to 0.01          1 mm to 1 cm       10¹⁵ to 10¹⁴
Thermos bottle                     1 to 0.01                                10⁻² to 10⁻⁴         1 cm to 1 m        10¹⁴ to 10¹²
Earth thermosphere                 1 to 1×10⁻⁷                              10⁻² to 10⁻⁹         1 cm to 100 km     10¹⁴ to 10⁷
Vacuum tube                        1×10⁻⁵ to 1×10⁻⁸                         10⁻⁷ to 10⁻¹⁰        1 to 1,000 km      10⁹ to 10⁶
Cryopumped MBE chamber             1×10⁻⁷ to 1×10⁻⁹                         10⁻⁹ to 10⁻¹¹        100 to 10,000 km   10⁷ to 10⁵
Pressure on the Moon               approximately 1×10⁻⁹                     10⁻¹¹                10,000 km          4×10⁵
Interplanetary space               -                                        -                    -                  11
Interstellar space                 -                                        -                    -                  1
Intergalactic space                -                                        -                    -                  10⁻⁶


Pressure measurement
Many techniques have been developed for the measurement of pressure and vacuum. Instruments used to measure pressure are called pressure gauges or vacuum gauges. A manometer is an instrument that uses a column of liquid to measure pressure, although the term is often used nowadays to mean any pressure-measuring instrument. A vacuum gauge is used to measure the pressure in a vacuum, which is further divided into two subcategories, high and low vacuum (and sometimes ultra-high vacuum). The applicable pressure ranges of many of the techniques used to measure vacuums overlap. Hence, by combining several different types of gauge, it is possible to measure system pressure continuously from 10 mbar down to 10⁻¹¹ mbar.

Absolute, gauge and differential pressures - zero reference

The construction of a Bourdon tube gauge; the construction elements are made of brass.

Everyday pressure measurements, such as for tire pressure, are usually made relative to ambient air pressure. In other cases measurements are made relative to a vacuum or to some other specific reference. When distinguishing between these zero references, the following terms are used:
Absolute pressure is zero-referenced against a perfect vacuum, so it is equal to gauge pressure plus atmospheric pressure.
Gauge pressure is zero-referenced against ambient air pressure, so it is equal to absolute pressure minus atmospheric pressure. Negative signs are usually omitted. To distinguish a negative pressure, the value may be appended with the word "vacuum" or the gauge may be labeled a "vacuum gauge."
Differential pressure is the difference in pressure between two points.

The zero reference in use is usually implied by context, and these words are added only when clarification is needed. Tire pressure and blood pressure are gauge pressures by convention, while atmospheric pressures, deep vacuum pressures, and altimeter pressures must be absolute. For most working fluids, where a fluid exists in a closed system, gauge pressure measurement prevails. Pressure instruments connected to the system will indicate pressures relative to the current atmospheric pressure. The situation changes when extreme vacuum pressures are measured; absolute pressures are typically used instead.

Differential pressures are commonly used in industrial process systems. Differential pressure gauges have two inlet ports, each connected to one of the volumes whose pressure is to be monitored. In effect, such a gauge performs the mathematical operation of subtraction through mechanical means, obviating the need for an operator or control system to watch two separate gauges and determine the difference in readings.

Moderate vacuum pressure readings can be ambiguous without the proper context, as they may represent absolute pressure or gauge pressure without a negative sign. Thus a vacuum of 26 inHg gauge is equivalent to an absolute pressure of 30 inHg (typical atmospheric pressure) minus 26 inHg, that is, 4 inHg.

Atmospheric pressure is typically about 100 kPa at sea level, but is variable with altitude and weather. If the absolute pressure of a fluid stays constant, the gauge pressure of the same fluid will vary as atmospheric pressure changes. For example, when a car drives up a mountain, the (gauge) tire pressure goes up because atmospheric pressure goes down. The absolute pressure in the tire is essentially unchanged.
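The zero-reference bookkeeping above is easy to get wrong in software, so the minimal Python sketch below converts between gauge and absolute pressure and reproduces the 26 inHg vacuum example; the 30 inHg ambient value is the one assumed in the text.

    INHG_TO_PA = 3386.389  # 1 inch of mercury in pascals

    def absolute_from_gauge(gauge_pa, atmospheric_pa=101325.0):
        return gauge_pa + atmospheric_pa

    def gauge_from_absolute(absolute_pa, atmospheric_pa=101325.0):
        return absolute_pa - atmospheric_pa

    # A reading of "26 inHg vacuum" is a gauge pressure of -26 inHg.
    atmospheric = 30 * INHG_TO_PA
    absolute = absolute_from_gauge(-26 * INHG_TO_PA, atmospheric)
    print(f"{absolute / INHG_TO_PA:.0f} inHg absolute")   # 4 inHg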

Using atmospheric pressure as reference is usually signified by a "g" for gauge after the pressure unit, e.g. 70 psig, which means that the pressure measured is the total pressure minus atmospheric pressure. There are two types of gauge reference pressure: vented gauge (vg) and sealed gauge (sg). A vented gauge pressure transmitter, for example, allows the outside air pressure to be exposed to the negative side of the pressure-sensing diaphragm, via a vented cable or a hole on the side of the device, so that it always measures the pressure referred to ambient barometric pressure. Thus a vented gauge reference pressure sensor should always read zero pressure when the process pressure connection is held open to the air.

A sealed gauge reference is very similar, except that atmospheric pressure is sealed on the negative side of the diaphragm. This is usually adopted on high pressure ranges, such as hydraulics, where atmospheric pressure changes will have a negligible effect on the accuracy of the reading, so venting is not necessary. This also allows some manufacturers to provide secondary pressure containment as an extra precaution for pressure equipment safety if the burst pressure of the primary pressure-sensing diaphragm is exceeded. There is another way of creating a sealed gauge reference, and this is to seal a high vacuum on the reverse side of the sensing diaphragm. Then the output signal is offset so the pressure sensor reads close to zero when measuring atmospheric pressure. A sealed gauge reference pressure transducer will never read exactly zero because atmospheric pressure is always changing and the reference in this case is fixed at 1 bar.

An absolute pressure measurement is one that is referred to absolute vacuum. The best example of an absolute referenced pressure is atmospheric or barometric pressure. To produce an absolute pressure sensor the manufacturer will seal a high vacuum behind the sensing diaphragm. If the process pressure connection of an absolute pressure transmitter is open to the air, it will read the actual barometric pressure.


Units
Pressure units
          Pascal (Pa)     Bar (bar)        Technical atmosphere (at)   Standard atmosphere (atm)   Torr (Torr)      Pounds per square inch (psi)
1 Pa      = 1 N/m²        10⁻⁵             1.0197×10⁻⁵                 9.8692×10⁻⁶                 7.5006×10⁻³      1.450377×10⁻⁴
1 bar     10⁵             = 10⁶ dyn/cm²    1.0197                      0.98692                     750.06           14.50377
1 at      0.980665×10⁵    0.980665         = 1 kp/cm²                  0.9678411                   735.5592         14.22334
1 atm     1.01325×10⁵     1.01325          1.0332                      = p₀ = 1                    760              14.69595
1 Torr    133.3224        1.333224×10⁻³    1.359551×10⁻³               1.315789×10⁻³               = 1 mmHg         1.933678×10⁻²
1 psi     6.8948×10³      6.8948×10⁻²      7.03069×10⁻²                6.8046×10⁻²                 51.71493         = 1 lbF/in²

The SI unit for pressure is the pascal (Pa), equal to one newton per square metre (N·m⁻² or kg·m⁻¹·s⁻²). This special name for the unit was added in 1971; before that, pressure in SI was expressed in units such as N/m². When indicated, the zero reference is stated in parentheses following the unit, for example 101 kPa (abs). The pound per square inch (psi) is still in widespread use in the US and Canada, for measuring, for instance, tire pressure. A letter is often appended to the psi unit to indicate the measurement's zero reference: psia for absolute, psig for gauge, psid for differential, although this practice is discouraged by the NIST.
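The conversion factors in the table above can be wrapped in a small helper; the minimal Python sketch below converts between any two of the listed units via pascals.

    # Factors converting each unit to pascals, matching the table above.
    TO_PA = {
        "Pa": 1.0,
        "bar": 1.0e5,
        "at": 0.980665e5,
        "atm": 1.01325e5,
        "Torr": 133.3224,
        "psi": 6.8948e3,
    }

    def convert_pressure(value, from_unit, to_unit):
        """Convert a pressure between any two units listed in TO_PA."""
        return value * TO_PA[from_unit] / TO_PA[to_unit]

    print(f"{convert_pressure(1, 'atm', 'Torr'):.1f} Torr")   # 760.0
    print(f"{convert_pressure(14.7, 'psi', 'bar'):.3f} bar")  # about 1.013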

Because pressure was once commonly measured by its ability to displace a column of liquid in a manometer, pressures are often expressed as a depth of a particular fluid (e.g., inches of water). Manometric measurement is the subject of pressure head calculations. The most common choices for a manometer's fluid are mercury (Hg) and water; water is nontoxic and readily available, while mercury's density allows for a shorter column (and so a smaller manometer) to measure a given pressure. The abbreviation "W.C." or the words "water column" are often printed on gauges and measurements that use water for the manometer.

Fluid density and local gravity can vary from one reading to another depending on local factors, so the height of a fluid column does not define pressure precisely. Measurements in "millimetres of mercury" or "inches of mercury" can be converted to SI units as long as attention is paid to the local factors of fluid density and gravity. Temperature fluctuations change the value of fluid density, while location can affect gravity. Although no longer preferred, these manometric units are still encountered in many fields. Blood pressure is measured in millimetres of mercury (see torr) in most of the world, and lung pressures in centimetres of water are still common, as in settings for CPAP machines. Natural gas pipeline pressures are measured in inches of water, expressed as "inches W.C." Scuba divers often use a manometric rule of thumb: the pressure exerted by ten metres depth of water is approximately equal to one atmosphere.

In vacuum systems, the units torr, micrometre of mercury (micron), and inch of mercury (inHg) are most commonly used. Torr and micron usually indicate an absolute pressure, while inHg usually indicates a gauge pressure. Atmospheric pressures are usually stated using kilopascals (kPa) or atmospheres (atm), except in American meteorology where the hectopascal (hPa) and millibar (mbar) are preferred. In American and Canadian engineering, stress is often measured in kip. Note that stress is not a true pressure since it is not scalar. In the cgs system the unit of pressure was the barye (ba), equal to 1 dyn/cm². In the mts system, the unit of pressure was the pieze, equal to 1 sthene per square metre. Many other hybrid units are used, such as mmHg/cm² or grams-force/cm² (sometimes as kg/cm² without properly identifying the force units). Using the names kilogram, gram, kilogram-force, or gram-force (or their symbols) as a unit of force is prohibited in SI; the unit of force in SI is the newton (N).


Static and dynamic pressure


Static pressure is uniform in all directions, so pressure measurements are independent of direction in an immovable (static) fluid. Flow, however, applies additional pressure on surfaces perpendicular to the flow direction, while having little impact on surfaces parallel to the flow direction. This directional component of pressure in a moving (dynamic) fluid is called dynamic pressure. An instrument facing the flow direction measures the sum of the static and dynamic pressures; this measurement is called the total pressure or stagnation pressure. Since dynamic pressure is referenced to static pressure, it is neither gauge nor absolute; it is a differential pressure. While static gauge pressure is of primary importance in determining net loads on pipe walls, dynamic pressure is used to measure flow rates and airspeed. Dynamic pressure can be measured by taking the differential pressure between instruments parallel and perpendicular to the flow. Pitot-static tubes, for example, perform this measurement on airplanes to determine airspeed. The presence of the measuring instrument inevitably acts to divert flow and create turbulence, so its shape is critical to accuracy and the calibration curves are often non-linear.
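A worked check of the relation q = ½ρv², using the standard sea-level air density of about 1.225 kg/m³: at the 129 m/s speed quoted later in this article, the dynamic pressure is indeed roughly 10% of sea-level atmospheric pressure. The Python sketch below is a minimal illustration.

    RHO_SEA_LEVEL = 1.225  # standard sea-level air density, kg/m^3

    def dynamic_pressure(speed_m_per_s, density=RHO_SEA_LEVEL):
        """Dynamic pressure q = 1/2 * rho * v^2, in pascals."""
        return 0.5 * density * speed_m_per_s**2

    q = dynamic_pressure(129.0)
    print(f"q = {q:.0f} Pa, about {100 * q / 101325:.0f}% of 1 atm")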


Applications
Altimeter
Barometer
MAP sensor
Pitot tube
Sphygmomanometer

Instruments
Many instruments have been invented to measure pressure, with different advantages and disadvantages. Pressure range, sensitivity, dynamic response and cost all vary by several orders of magnitude from one instrument design to the next. The oldest type is the liquid column manometer (a vertical tube filled with mercury), invented by Evangelista Torricelli in 1643. The U-tube was invented by Christiaan Huygens in 1661.

Hydrostatic
Hydrostatic gauges (such as the mercury column manometer) compare pressure to the hydrostatic force per unit area at the base of a column of fluid. Hydrostatic gauge measurements are independent of the type of gas being measured, and can be designed to have a very linear calibration. They have poor dynamic response.

Piston

Piston-type gauges counterbalance the pressure of a fluid with a spring (for example tire-pressure gauges of comparatively low accuracy) or a solid weight, in which case it is known as a deadweight tester and may be used for calibration of other gauges.

Liquid column

By using Bernoulli's principle and the derived pressure head equation, liquids can be used for instrumentation where gravity is present. Liquid column gauges consist of a vertical column of liquid in a tube whose ends are exposed to different pressures. The column will rise or fall until its weight (a force applied due to gravity) is in equilibrium with the pressure differential between the two ends of the tube (a force applied due to fluid pressure). A very simple version is a U-shaped tube half-full of liquid, one side of which is connected to the region of interest while the reference pressure (which might be the atmospheric pressure or a vacuum) is applied to the other. The difference in liquid level represents the applied pressure.

The pressure exerted by a column of fluid of height h and density ρ is given by the hydrostatic pressure equation, P = ρgh. Therefore the pressure difference between the applied pressure Pa and the reference pressure P0 in a U-tube manometer can be found by solving Pa - P0 = ρgh. In other words, the pressure on either end of the liquid must be balanced (since the liquid is static), and so Pa = P0 + ρgh.

The difference in fluid height in a liquid column manometer is proportional to the pressure difference.

In most liquid column measurements, the result of the measurement is the height, h, expressed typically in mm, cm, or inches. The h is also known as the pressure head. When expressed as a pressure head, pressure is specified in units of length and the measurement fluid must be specified. When accuracy is critical, the temperature of the measurement fluid must likewise be specified, because liquid density is a function of temperature. So, for example, pressure head might be written "742.2 mmHg" or "4.2 inH2O at 59 °F" for measurements taken with mercury or water as the manometric fluid, respectively. The word "gauge" or "vacuum" may be added to such a measurement to distinguish between a pressure above or below atmospheric pressure. Both mm of mercury and inches of water are common pressure heads, which can be converted to SI units of pressure using unit conversion and the above formulas.

If the fluid being measured is significantly dense, hydrostatic corrections may have to be made for the height between the moving surface of the manometer working fluid and the location where the pressure measurement is desired, except when measuring differential pressure of a fluid (for example across an orifice plate or venturi), in which case the density should be corrected by subtracting the density of the fluid being measured.

To measure the pressure of a fluid accurately using a liquid column, the fluid being measured should not be flowing for a static pressure measurement. A column connected to a flowing fluid will measure static plus dynamic pressure. So if a fluid is flowing, the liquid column will change due to dynamic pressure, proportional to the square of the fluid's velocity. This of course is precisely the desired measurement when a differential pressure measurement is needed for a venturi or an orifice plate. Measuring dynamic pressures is commonly used as an intermediary in determining a fluid's velocity or flow rate.
See flow measurement.
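The hydrostatic relation above translates directly into code; the minimal Python sketch below converts a measured height difference in a mercury U-tube into a pressure difference via ΔP = ρgh (the densities are typical room-temperature values).

    G = 9.80665            # standard gravity, m/s^2
    RHO_MERCURY = 13534.0  # density of mercury near room temperature, kg/m^3
    RHO_WATER = 998.0      # density of water near room temperature, kg/m^3

    def pressure_from_head(height_m, density=RHO_MERCURY, gravity=G):
        """Pressure difference implied by a manometer height difference: dP = rho*g*h."""
        return density * gravity * height_m

    # Example: a 760 mm mercury column supports about one standard atmosphere.
    print(f"{pressure_from_head(0.760):.0f} Pa")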


Pressure measurement As an example, an airplane flying through the air at sea level would experience the atmospheric pressure as a static pressure exerted on the skin of the aircraft. However, the forward surfaces of an aircraft in flight experience dynamic pressure in addition to the static pressure. To measure the static air pressure, we use a barometer in still air. To measure the dynamic pressure, imagine a mercury manometer like the U-tube above with an open end pointing in the direction of the airplane's travel and a closed end kept at the static air pressure. Mercury is pushed down the tube farther than it would if only measuring still air. For a plane traveling around 129m/s, the dynamic pressure adds about 10% to the atmospheric pressure at sea level. A U-tube for measuring dynamic pressure on an airplane would be impractical, so a pitot tube is used instead that relies on a diaphragm rather than columns of fluid. Although dynamic pressure can be measured directly, fluid speed and air speed can be measured indirectly using the Bernoulli principle if both dynamic and static pressures are known. Although any fluid can be used, mercury is preferred for its high density (13.534 g/cm3) and low vapour pressure. For low pressure differences well above the vapour pressure of water, water is commonly used (and "inches of water" or "Water Column" is a common pressure unit). Liquid-column pressure gauges are independent of the type of fluid being measured and have a highly linear calibration. They have poor dynamic response because the fluid in the column may react slowly to a pressure change. When measuring vacuum, the working liquid may evaporate and contaminate the vacuum if its vapor pressure is too high. When measuring liquid pressure, a loop filled with gas or a light fluid can isolate the liquids to prevent them from mixing but this can be unnecessary, for example when mercury is used as the manometer fluid to measure differential pressure of a fluid such as water. Simple hydrostatic gauges can measure pressures ranging from a few Torr (a few 100 Pa) to a few atmospheres. (Approximately 1,000,000 Pa) A single-limb liquid-column manometer has a larger reservoir instead of one side of the U-tube and has a scale beside the narrower column. The column may be inclined to further amplify the liquid movement. Based on the use and structure following type of manometers are used 1. 2. 3. 4. Simple Manometer Micromanometer Differential manometer Inverted differential manometer McLeod gauge A McLeod gauge isolates a sample of gas and compresses it in a modified mercury manometer until the pressure is a few mmHg. The gas must be well-behaved during its compression (it must not condense, for example). The technique is slow and unsuited to continual monitoring, but is capable of good accuracy. Useful range: above 10-4 torr 106Torr (0.1 mPa), (roughly 10-2 Pa) as high as


0.1 mPa is the lowest direct measurement of pressure that is possible with current technology. Other vacuum gauges can measure lower pressures, but only indirectly by measurement of other pressure-controlled properties. These indirect measurements must be calibrated to SI units via a direct measurement, most commonly a McLeod gauge.
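The compression principle described above follows Boyle's law, P1·V1 = P2·V2: a large, known compression ratio turns an unmeasurably low pressure into one that a mercury column can read. The Python sketch below illustrates this; the bulb and capillary volumes are illustrative figures, not the specification of any real McLeod gauge.

# McLeod gauge principle: an isolated gas sample is compressed by a known volume ratio,
# and Boyle's law (P1 * V1 = P2 * V2) relates the readable compressed pressure back to
# the original vacuum pressure. Example numbers only.

bulb_volume_cm3 = 200.0          # volume of the trapped sample before compression
capillary_volume_cm3 = 0.002     # volume after compression into the closed capillary
compressed_pressure_torr = 1.0   # pressure read from the mercury column after compression

compression_ratio = bulb_volume_cm3 / capillary_volume_cm3
original_pressure_torr = compressed_pressure_torr / compression_ratio

print(f"Compression ratio: {compression_ratio:.0f}:1")
print(f"Original pressure: {original_pressure_torr:.1e} Torr")
# -> 1.0e-05 Torr, i.e. within the useful range quoted above.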

A McLeod gauge, drained of mercury


Aneroid
Aneroid gauges are based on a metallic pressure-sensing element that flexes elastically under the effect of a pressure difference across the element. "Aneroid" means "without fluid", and the term originally distinguished these gauges from the hydrostatic gauges described above. However, aneroid gauges can be used to measure the pressure of a liquid as well as a gas, and they are not the only type of gauge that can operate without fluid. For this reason, they are often called mechanical gauges in modern language. Aneroid gauges are not dependent on the type of gas being measured, unlike thermal and ionization gauges, and are less likely to contaminate the system than hydrostatic gauges. The pressure-sensing element may be a Bourdon tube, a diaphragm, a capsule, or a set of bellows, which will change shape in response to the pressure of the region in question. The deflection of the pressure-sensing element may be read by a linkage connected to a needle, or it may be read by a secondary transducer. The most common secondary transducers in modern vacuum gauges measure a change in capacitance due to the mechanical deflection. Gauges that rely on a change in capacitance are often referred to as Baratron gauges.
Bourdon
The Bourdon pressure gauge uses the principle that a flattened tube tends to straighten or regain its circular cross-section when pressurized. Although the change in cross-section may be hardly noticeable, involving only moderate stresses within the elastic range of easily workable materials, the strain of the tube material is magnified by forming the tube into a C shape or even a helix, so that the entire tube tends to straighten out or uncoil elastically as it is pressurized. Eugene Bourdon patented his gauge in France in 1849, and it was widely adopted because of its superior sensitivity, linearity, and accuracy; Edward Ashcroft purchased Bourdon's American patent rights in 1852 and became a major manufacturer of gauges. Also in 1849, Bernard Schaeffer in Magdeburg, Germany patented a successful diaphragm (see below) pressure gauge, which, together with the Bourdon gauge, revolutionized pressure measurement in industry. After Bourdon's patents expired in 1875, Schaeffer's company, Schaeffer and Budenberg, also manufactured Bourdon tube gauges. (Figure: membrane-type manometer.)
In practice, a flattened thin-wall, closed-end tube is connected at the hollow end to a fixed pipe containing the fluid pressure to be measured. As the pressure increases, the closed end moves in an arc, and this motion is converted into the rotation of a (segment of a) gear by a connecting link that is usually adjustable. A small-diameter pinion gear is on the pointer shaft, so the motion is magnified further by the gear ratio. The positioning of the indicator card behind the pointer, the initial pointer shaft position, and the linkage length and initial position all provide means to calibrate the pointer to indicate the desired range of pressure for variations in the behaviour of the Bourdon tube itself. Differential pressure can be measured by gauges containing two different Bourdon tubes, with connecting linkages. Bourdon tubes measure gauge pressure, relative to ambient atmospheric pressure, as opposed to absolute pressure; vacuum is sensed as a reverse motion. Some aneroid barometers use Bourdon tubes closed at both ends (but most use diaphragms or capsules, see below).
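A small numeric Python sketch of the lever-and-gear amplification just described: the tube tip motion turns the sector gear through a small angle, and the pinion on the needle shaft multiplies that angle by the ratio of gear radii. All dimensions below are invented for illustration and are not those of any pictured gauge.

# Rough estimate of Bourdon gauge amplification: tube tip motion -> lever -> sector gear -> pinion.
# All dimensions are illustrative assumptions.

import math

tip_travel_mm = 2.0          # arc length moved by the sealed tube end at full scale
lever_short_arm_mm = 4.0     # link boss to pivot pin distance (drives the sector)
sector_radius_mm = 20.0      # effective radius of the sector gear teeth
pinion_radius_mm = 2.0       # radius of the spur gear on the needle shaft

# Angle through which the sector gear turns (small-angle approximation).
sector_angle_rad = tip_travel_mm / lever_short_arm_mm
# The pinion (and needle) turns faster by the ratio of gear radii.
needle_angle_rad = sector_angle_rad * (sector_radius_mm / pinion_radius_mm)

print(f"Sector rotation: {math.degrees(sector_angle_rad):.1f} degrees")
print(f"Needle rotation: {math.degrees(needle_angle_rad):.1f} degrees")

With these example numbers, a 2 mm tip movement becomes roughly a 290 degree sweep of the needle, which is the order of magnitude a full-scale dial requires.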
When the measured pressure is rapidly pulsing, such as when the gauge is near a reciprocating pump, an orifice restriction in the connecting pipe is frequently used to avoid unnecessary wear on the gears and to provide an average reading; when the whole gauge is subject to mechanical vibration, the entire case, including the pointer and indicator card, can be filled with oil or glycerin. Tapping on the face of the gauge is not recommended, as it will tend to falsify the readings initially presented by the gauge; the Bourdon tube is separate from the face of the gauge, so tapping has no effect on the actual pressure reading. Typical high-quality modern

gauges provide an accuracy of ±2% of span, and a special high-precision gauge can be as accurate as 0.1% of full scale. In the following illustrations the transparent cover face of the pictured combination pressure and vacuum gauge has been removed and the mechanism removed from the case. This particular gauge is a combination vacuum and pressure gauge used for automotive diagnosis: the left side of the face, used for measuring manifold vacuum, is calibrated in centimetres of mercury on its inner scale and inches of mercury on its outer scale; the right portion of the face is used to measure fuel pump pressure and is calibrated in fractions of 1 kgf/cm² on its inner scale and pounds per square inch on its outer scale.


Indicator side with card and dial

Mechanical side with Bourdon tube

Mechanical details
Stationary parts:
A: Receiver block. This joins the inlet pipe to the fixed end of the Bourdon tube (1) and secures the chassis plate (B). The two holes receive screws that secure the case.
B: Chassis plate. The face card is attached to this. It contains bearing holes for the axles.
C: Secondary chassis plate. It supports the outer ends of the axles.
D: Posts to join and space the two chassis plates.
Moving parts:
1. Stationary end of Bourdon tube. This communicates with the inlet pipe through the receiver block.
2. Moving end of Bourdon tube. This end is sealed.
3. Pivot and pivot pin.
4. Link joining pivot pin to lever (5) with pins to allow joint rotation.


5. Lever. This is an extension of the sector gear (7).
6. Sector gear axle pin.
7. Sector gear.
8. Indicator needle axle. This has a spur gear that engages the sector gear (7) and extends through the face to drive the indicator needle. Due to the short distance between the lever arm link boss and the pivot pin, and the difference between the effective radius of the sector gear and that of the spur gear, any motion of the Bourdon tube is greatly amplified. A small motion of the tube results in a large motion of the indicator needle.
9. Hair spring to preload the gear train to eliminate gear lash and hysteresis.
Diaphragm
A second type of aneroid gauge uses deflection of a flexible membrane that separates regions of different pressure. The amount of deflection is repeatable for known pressures, so the pressure can be determined by calibration. The deformation of a thin diaphragm depends on the difference in pressure between its two faces. The reference face can be open to atmosphere to measure gauge pressure, open to a second port to measure differential pressure, or sealed against a vacuum or other fixed reference pressure to measure absolute pressure. The deformation can be measured using mechanical, optical or capacitive techniques. Ceramic and metallic diaphragms are used. Useful range: above 10⁻² Torr (roughly 1 Pa). For absolute measurements, welded pressure capsules with diaphragms on either side are often used.

A pile of pressure capsules with corrugated diaphragms in an aneroid barograph.

Common diaphragm shapes include flat, corrugated, flattened tube, and capsule.


Bellows
In gauges intended to sense small pressures or pressure differences, or which require that an absolute pressure be measured, the gear train and needle may be driven by an enclosed and sealed bellows chamber, called an aneroid, which means "without liquid". (Early barometers used a column of liquid such as water or the liquid metal mercury suspended by a vacuum.) This bellows configuration is used in aneroid barometers (barometers with an indicating needle and dial card), altimeters, altitude-recording barographs, and the altitude telemetry instruments used in weather balloon radiosondes. These devices use the sealed chamber as a reference pressure and are driven by the external pressure. Other sensitive aircraft instruments, such as air speed indicators and rate-of-climb indicators (variometers), have connections both to the internal part of the aneroid chamber and to an external enclosing chamber.

Electronic pressure sensors


Piezoresistive strain gauge: Uses the piezoresistive effect of bonded or formed strain gauges to detect strain due to applied pressure.
Capacitive: Uses a diaphragm and pressure cavity to create a variable capacitor to detect strain due to applied pressure (see the sketch after this list).
Magnetic: Measures the displacement of a diaphragm by means of changes in inductance (reluctance), LVDT, Hall effect, or the eddy current principle.
Piezoelectric: Uses the piezoelectric effect in certain materials such as quartz to measure the strain upon the sensing mechanism due to pressure.
Optical: Uses the physical change of an optical fiber to detect strain due to applied pressure.
Potentiometric: Uses the motion of a wiper along a resistive mechanism to detect the strain caused by applied pressure.
Resonant: Uses the changes in resonant frequency in a sensing mechanism to measure stress, or changes in gas density, caused by applied pressure.
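For the capacitive type listed above, a parallel-plate model gives a feel for the numbers: C = ε0·A/d, so a pressure-induced change in the gap d changes the capacitance that the electronics detect. The Python sketch below uses illustrative plate area, gap and deflection values, not the parameters of any commercial capacitance manometer.

# Parallel-plate model of a capacitive pressure sensor: C = eps0 * A / d.
# A pressure difference deflects the diaphragm, shrinking the gap d and raising C.
# Plate area, gap and deflection are illustrative assumptions.

EPS0 = 8.854e-12                    # permittivity of free space, F/m

def capacitance(area_m2, gap_m):
    return EPS0 * area_m2 / gap_m

area = 1e-4                         # 1 cm^2 electrode area
gap_rest = 50e-6                    # 50 micrometre gap with no applied pressure
deflection = 5e-6                   # 5 micrometre diaphragm deflection at some test pressure

c_rest = capacitance(area, gap_rest)
c_deflected = capacitance(area, gap_rest - deflection)

print(f"Capacitance at rest:        {c_rest * 1e12:.2f} pF")
print(f"Capacitance when deflected: {c_deflected * 1e12:.2f} pF")
print(f"Relative change: {100 * (c_deflected / c_rest - 1):.1f} %")

Even a few micrometres of deflection produces a change of several percent in a picofarad-scale capacitance, which is easily resolved by an AC bridge or oscillator circuit.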

Thermal conductivity
Generally, as a real gas increases in density (which may indicate an increase in pressure), its ability to conduct heat increases. In this type of gauge, a wire filament is heated by running current through it. A thermocouple or resistance temperature detector (RTD) can then be used to measure the temperature of the filament. This temperature depends on the rate at which the filament loses heat to the surrounding gas, and therefore on the gas's thermal conductivity. A common variant is the Pirani gauge, which uses a single platinum filament as both the heated element and the RTD. These gauges are accurate from 10 Torr to 10⁻³ Torr, but they are sensitive to the chemical composition of the gases being measured.

Two-wire
One wire coil is used as a heater, and the other is used to measure the nearby temperature due to convection.
Pirani (one wire)
A Pirani gauge consists of a metal wire open to the pressure being measured. The wire is heated by a current flowing through it and cooled by the surrounding gas. If the gas pressure is reduced, the cooling effect decreases and the equilibrium temperature of the wire increases. The resistance of the wire is a function of its temperature: by measuring the voltage across the wire and the current flowing through it, the resistance (and so the gas pressure) can be determined. This type of gauge was invented by Marcello Pirani. Thermocouple gauges and thermistor gauges work in a similar manner, except that a thermocouple or thermistor is used to measure the temperature of the wire. Useful range: 10⁻³ to 10 Torr (roughly 10⁻¹ to 1000 Pa).
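Because thermal-conductivity gauges are calibrated for a particular gas (usually air or nitrogen), a reading taken in another gas must be corrected. A minimal Python sketch of such a correction follows; the correction factors are rough, illustrative numbers only, and a real gauge should be corrected using the manufacturer's gas-conversion curves.

# Convert an air-calibrated thermal-conductivity (Pirani-type) reading to another gas
# using a simple multiplicative correction factor. The factors below are rough,
# illustrative values only; real gauges ship with gas-specific conversion curves.

correction_factors = {
    "air": 1.0,
    "argon": 1.6,     # illustrative: poorer heat transport than air
    "helium": 0.8,    # illustrative: better heat transport than air
}

def true_pressure(indicated_torr, gas):
    """Approximate true pressure from an air-calibrated reading (valid only in the
    roughly linear part of the gauge's range)."""
    return indicated_torr * correction_factors[gas]

reading = 0.05   # Torr, as displayed by an air-calibrated gauge
for gas in correction_factors:
    print(f"{gas:7s}: indicated {reading} Torr -> about {true_pressure(reading, gas):.3f} Torr")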


Ionization gauge
Ionization gauges are the most sensitive gauges for very low pressures (also referred to as hard or high vacuum). They sense pressure indirectly by measuring the electrical ions produced when the gas is bombarded with electrons. Fewer ions are produced by lower-density gases. The calibration of an ion gauge is unstable and dependent on the nature of the gases being measured, which is not always known. They can be calibrated against a McLeod gauge, which is much more stable and independent of gas chemistry. Thermionic emission generates electrons, which collide with gas atoms and generate positive ions. The ions are attracted to a suitably biased electrode known as the collector. The current in the collector is proportional to the rate of ionization, which is a function of the pressure in the system. Hence, measuring the collector current gives the gas pressure. There are several sub-types of ionization gauge. Useful range: 10⁻¹⁰ to 10⁻³ Torr (roughly 10⁻⁸ to 10⁻¹ Pa).
Most ion gauges come in two types: hot cathode and cold cathode. A third type that is more sensitive and expensive, known as a spinning rotor gauge, exists but is not discussed here. In the hot cathode version, an electrically heated filament produces an electron beam. The electrons travel through the gauge and ionize gas molecules around them. The resulting ions are collected at a negative electrode. The current depends on the number of ions, which depends on the pressure in the gauge. Hot cathode gauges are accurate from 10⁻³ Torr to 10⁻¹⁰ Torr. The principle behind the cold cathode version is the same, except that electrons are produced in the discharge of a high voltage. Cold cathode gauges are accurate from 10⁻² Torr to 10⁻⁹ Torr. Ionization gauge calibration is very sensitive to construction geometry, chemical composition of the gases being measured, corrosion and surface deposits. Their calibration can be invalidated by activation at atmospheric pressure or low vacuum. The composition of gases at high vacuums will usually be unpredictable, so a mass spectrometer must be used in conjunction with the ionization gauge for accurate measurement.

Hot cathode
A hot-cathode ionization gauge is composed mainly of three electrodes acting together as a triode, wherein the cathode is the filament. The three electrodes are a collector or plate, a filament, and a grid. The collector current is measured in picoamps by an electrometer. The filament voltage to ground is usually at a potential of 30 volts, while the grid voltage is 180 to 210 volts DC, unless there is an optional electron-bombardment feature, by heating the grid, which may be at a high potential of approximately 565 volts. The most common ion gauge is the hot-cathode Bayard-Alpert gauge, with a small ion collector inside the grid. A glass envelope with an opening to the vacuum can surround the electrodes, but usually the nude gauge is inserted in the vacuum chamber directly, the pins being fed through a ceramic plate in the wall of the chamber. Hot-cathode gauges can be damaged or lose their calibration if they are exposed to atmospheric pressure or even low vacuum while hot. The measurements of a hot-cathode ionization gauge are always logarithmic.


Bayard-Alpert hot-cathode ionization gauge

Electrons emitted from the filament move back and forth around the grid several times before finally entering it. During these movements, some electrons collide with a gaseous molecule to form an ion-electron pair (electron ionization). The number of these ions is proportional to the gaseous molecule density multiplied by the electron current emitted from the filament, and these ions pour into the collector to form an ion current. Since the gaseous molecule density is proportional to the pressure, the pressure is estimated by measuring the ion current. The low-pressure sensitivity of hot-cathode gauges is limited by the photoelectric effect: electrons hitting the grid produce X-rays that produce photoelectric noise in the ion collector. This limits the range of older hot-cathode gauges to 10⁻⁸ Torr and the Bayard-Alpert to about 10⁻¹⁰ Torr. Additional wires at cathode potential in the line of sight between the ion collector and the grid prevent this effect. In the extraction type the ions are not attracted by a wire, but by an open cone. As the ions cannot decide which part of the cone to hit, they pass through the hole and form an ion beam. This ion beam can be passed on to:
a Faraday cup
a microchannel plate detector with Faraday cup
a quadrupole mass analyzer with Faraday cup
a quadrupole mass analyzer with microchannel plate detector
an ion lens and acceleration voltage, directed at a target to form a sputter gun (in this case a valve lets gas into the grid-cage).

Cold cathode
There are two subtypes of cold-cathode ionization gauges: the Penning gauge (invented by Frans Michel Penning) and the inverted magnetron, also called a Redhead gauge. The major difference between the two is the position of the anode with respect to the cathode. Neither has a filament, and each may require a DC potential of about 4 kV for operation. Inverted magnetrons can measure down to 1×10⁻¹² Torr. Cold-cathode gauges may be reluctant to start at very low pressures, in that the near-absence of gas makes it difficult to establish an electrode current, in particular in Penning gauges, which use an axially symmetric magnetic field to create path lengths for electrons that are of the order of metres. In ambient air, suitable ion pairs are ubiquitously formed by cosmic radiation; in a Penning gauge, design features are used to ease the set-up of a discharge path. For example, the electrode of a Penning gauge is usually finely tapered to facilitate the field emission of electrons. Maintenance cycles of cold-cathode gauges are, in general, measured in years, depending on the gas type and pressure in which they are operated. Using a cold-cathode gauge in gases with substantial organic components, such as pump oil fractions, can result in the growth of delicate carbon films and shards within the gauge that eventually either short-circuit the electrodes or impede the generation of a discharge path.


Calibration
Pressure gauges are either direct- or indirect-reading. Hydrostatic and elastic gauges, which respond directly to the force exerted on a surface by the incident particle flux, are called direct-reading gauges. Thermal and ionization gauges read pressure indirectly by measuring a gas property that changes in a predictable manner with gas density. Indirect measurements are susceptible to more errors than direct measurements. Common calibration references include the dead-weight tester, the McLeod gauge, and a mass spectrometer used together with an ionization gauge.

Dynamic transients
When fluid flows are not in equilibrium, local pressures may be higher or lower than the average pressure in the medium. These disturbances propagate from their source as longitudinal pressure variations along the path of propagation: this is sound. Sound pressure is the instantaneous local pressure deviation from the average pressure caused by a sound wave. Sound pressure can be measured using a microphone in air and a hydrophone in water. The effective sound pressure is the root mean square of the instantaneous sound pressure over a given interval of time. Sound pressures are normally small and are often expressed in units of microbar. When measuring such transients, the frequency response and resonance of the pressure sensor must also be considered.
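The effective (RMS) sound pressure defined above is straightforward to compute from a sampled signal. The short Python sketch below uses a synthetic 1 kHz sine-wave pressure disturbance with an illustrative 0.2 Pa amplitude rather than real microphone data.

# Effective sound pressure = root mean square of the instantaneous pressure deviation.
# The signal here is a synthetic 1 kHz sine wave with 0.2 Pa amplitude (illustrative).

import math

SAMPLE_RATE = 48_000            # samples per second
DURATION = 0.01                 # seconds of signal
AMPLITUDE = 0.2                 # Pa
FREQ = 1000.0                   # Hz

samples = [AMPLITUDE * math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
           for n in range(int(SAMPLE_RATE * DURATION))]

rms = math.sqrt(sum(s * s for s in samples) / len(samples))

print(f"RMS sound pressure: {rms:.4f} Pa (expected {AMPLITUDE / math.sqrt(2):.4f} Pa for a sine)")
print(f"In microbar: {rms * 10:.3f} microbar")   # 1 Pa = 10 microbar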


Pirani gauge
The Pirani gauge is a robust thermal conductivity gauge used to measure pressure in vacuum systems. It was invented in 1906 by Marcello Pirani.

Structure
The Pirani gauge consists of a metal filament (usually platinum) suspended in a tube which is connected to the system whose vacuum is to be measured. Connection is usually made either by a ground glass joint or a flanged metal connector, sealed with an o-ring. The filament is connected to an electrical circuit from which, after calibration, a pressure reading may be taken.
Pirani probe with 64mm vacuum flange

Mode of operation
A heated metal wire (also called a filament) suspended in a gas will lose heat to the gas as its molecules collide with the wire and remove heat. If the gas pressure is reduced, the number of molecules present falls proportionately and the wire loses heat more slowly. Measuring the heat loss is an indirect indication of pressure.

Block diagram of Pirani gauge

The electrical resistance of a wire varies with its temperature, so the resistance indicates the temperature of the wire.


In many systems, the wire is maintained at a constant resistance R by controlling the current I through the wire. The resistance can be set using a bridge circuit. The power delivered to the wire is I²R, and the same power is transferred to the gas. The current required to achieve this balance is therefore a measure of the vacuum. The gauge may be used for pressures between 0.5 Torr and 10⁻⁴ Torr. The thermal conductivity and heat capacity of the gas may affect the readout from the meter, so the apparatus may need calibrating before accurate readings are obtainable. For lower-pressure measurement, other instruments such as a Penning gauge are used.
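A minimal numerical sketch of the constant-resistance mode described above: the bridge holds the filament at resistance R, the electrical power I²R equals the heat carried away by the gas, and a previously measured calibration maps that power to pressure. The calibration table and the 30 ohm filament resistance in the Python sketch below are invented for illustration.

# Constant-resistance Pirani operation: the controller adjusts I so the filament stays at
# resistance R; the heating power I^2 * R then tracks the gas heat loss, which is mapped
# to pressure through a calibration. Calibration points below are invented for illustration.

import math
from bisect import bisect_left

R = 30.0                     # ohms, the constant filament resistance held by the bridge

# (heating power in mW, pressure in Torr) - monotone calibration, illustrative values
calibration = [(2.0, 1e-3), (4.0, 1e-2), (8.0, 1e-1), (15.0, 1.0), (22.0, 10.0)]

def pressure_from_current(current_amps):
    """Interpolate pressure from the measured bridge current (log-linear between points)."""
    power_mw = current_amps ** 2 * R * 1000.0
    powers = [p for p, _ in calibration]
    i = min(max(bisect_left(powers, power_mw), 1), len(calibration) - 1)
    (p0, pr0), (p1, pr1) = calibration[i - 1], calibration[i]
    frac = (power_mw - p0) / (p1 - p0)
    return 10 ** (math.log10(pr0) + frac * (math.log10(pr1) - math.log10(pr0)))

print(f"Bridge current 18 mA -> about {pressure_from_current(0.018):.2e} Torr")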
Curves to convert air readings to other gases

Pulsed Pirani gauge


A special form of the Pirani gauge is the pulsed Pirani vacuum gauge, where the filament is not operated at a constant temperature but is cyclically heated up to a certain temperature threshold by an increasing voltage ramp. When the threshold is reached, the heating voltage is switched off and the filament cools down again. For adequately low pressure, the supplied heating power P(t) and the filament temperature T(t) are related by

P(t) = c · m · dT(t)/dt + k1 · T(t) + k2

where c is the specific heat capacity of the filament, m is the mass of the filament, and k1 and k2 are constants.

Advantages and disadvantages of the pulsed gauge


Advantages:
Significantly better resolution in the range above 100 Torr.
The power consumption is drastically reduced compared to continuously operated Pirani gauges.
The gauge's thermal influence on the measurement is lowered considerably, due to the low temperature threshold of 80 °C and the ramp heating in pulsed mode.
The pulsed mode allows for efficient implementation of modern microprocessor technology.
Disadvantages:
Increased calibration effort.
Longer heat-up phase.


Hot-filament ionization gauge


(Figure: Bayard-Alpert hot thoriated iridium filament ionization gauge on a 2.75 in ConFlat flange.)
Other names: hot cathode gauge, Bayard-Alpert gauge
Uses: pressure measurement

The hot-filament ionization gauge, sometimes called a hot-filament gauge or hot-cathode gauge, is the most widely used low-pressure (vacuum) measuring device for the region from 10⁻³ to 10⁻¹⁰ Torr. It is a triode, in which the filament is the cathode. (The principles are largely the same for the hot-cathode ion sources used to create electrons in particle accelerators.)

Function
A regulated electron current (typically 10 mA) is emitted from a heated filament. The electrons are attracted to the helical grid by a DC potential of about +150 volts. Most of the electrons pass through the grid and collide with gas molecules in the enclosed volume, causing a fraction of them to be ionized. The gas ions formed by the electron collisions are attracted to the central ion collector wire by the negative voltage on the collector (typically −30 volts). Ion currents are on the order of 1 mA/Pa. This current is amplified and displayed by a high-gain differential amplifier/electrometer. The ion current differs for different gases at the same pressure; that is, a hot-filament ionization gauge is composition-dependent. Over a wide range of molecular density, however, the ion current from a gas of constant composition will be directly proportional to the molecular density of the gas in the gauge.
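The proportionality described above is commonly written I_collector = S · I_emission · P, where S is the gauge sensitivity for the gas in question (on the order of 10 per Torr for nitrogen in a typical Bayard-Alpert gauge, consistent with the roughly 1 mA/Pa figure quoted above for 10 mA emission). The Python sketch below simply inverts that relation; the sensitivity value and the example collector currents are illustrative assumptions.

# Hot-cathode ionization gauge readout: I_collector = S * I_emission * P, so
# P = I_collector / (S * I_emission). Sensitivity and currents are illustrative values.

SENSITIVITY_PER_TORR = 10.0      # typical order of magnitude for N2 in a Bayard-Alpert gauge
EMISSION_CURRENT_A = 10e-3       # 10 mA regulated emission current, as in the text above

def pressure_torr(collector_current_amps):
    return collector_current_amps / (SENSITIVITY_PER_TORR * EMISSION_CURRENT_A)

for i_c in (1e-4, 1e-7, 1e-10):                 # collector currents in amps
    print(f"I_collector = {i_c:.0e} A  ->  P = {pressure_torr(i_c):.0e} Torr")
# A collector current of 100 pA corresponds to about 1e-9 Torr, which is why the
# collector current must be measured with a picoammeter/electrometer.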


Construction
A hot-cathode ionization gauge is composed mainly of three electrodes all acting as a triode, wherein the cathode is the filament. The three electrodes are a collector or plate, a filament, and a grid. The collector current is measured in picoamps by an electrometer. The filament voltage to ground is usually at a potential of 30 volts, while the grid voltage is 180 to 210 volts DC, unless there is an optional electron-bombardment feature, by heating the grid, which may be at a high potential of approximately 565 volts. The most common ion gauge is the hot-cathode Bayard-Alpert gauge, with a small collector inside the grid. (Figure: tubulated hot-cathode ionization gauge.) A glass envelope with an opening to the vacuum can surround the electrodes, but usually the nude gauge is inserted in the vacuum chamber directly, the pins being fed through a ceramic plate in the wall of the chamber. Hot-cathode gauges can be damaged or lose their calibration if they are exposed to atmospheric pressure or even low vacuum while hot.
Electrons emitted from the filament move back and forth around the grid several times before finally entering it. During these movements, some electrons collide with a gaseous molecule to form an ion-electron pair (electron ionization). The number of these ions is proportional to the gaseous molecule density multiplied by the electron current emitted from the filament, and these ions pour into the collector to form an ion current. Since the gaseous molecule density is proportional to the pressure, the pressure is estimated by measuring the ion current. The low-pressure sensitivity of hot-cathode gauges is limited by the photoelectric effect: electrons hitting the grid produce X-rays that produce photoelectric noise in the ion collector. This limits the range of older hot-cathode gauges to 10⁻⁸ Torr and the Bayard-Alpert to about 10⁻¹⁰ Torr. Additional wires at cathode potential in the line of sight between the ion collector and the grid prevent this effect. In the extraction type the ions are not attracted by a wire but by an open cone. As the ions cannot decide which part of the cone to hit, they pass through the hole and form an ion beam. This ion beam can be passed on to:
a Faraday cup
a microchannel plate detector with Faraday cup
a quadrupole mass analyzer with Faraday cup
a quadrupole mass analyzer with microchannel plate detector
an ion lens and acceleration voltage, directed at a target to form a sputter gun (in this case a valve lets gas into the grid-cage).

Types of hot-filament ionization gauges


Bayard-Alpert gauge (uses a sealed tube)
Nude gauge (uses the vacuum chamber to make a complete seal)


Vacuum pump
A vacuum pump is a device that removes gas molecules from a sealed volume in order to leave behind a partial vacuum. The first vacuum pump was invented in 1650 by Otto von Guericke, and was preceded by the suction pump, which dates to antiquity.

Types
Pumps can be broadly categorized according to three techniques: Positive displacement pumps use a mechanism to repeatedly expand a cavity, allow gases to flow in from the chamber, seal off the cavity, and exhaust it to the atmosphere. Momentum transfer pumps, also called molecular pumps, use high speed jets of dense fluid or high speed rotating blades to knock gas molecules out of the chamber. Entrapment pumps capture gases in a solid or adsorbed state. This includes cryopumps, getters, and ion pumps.
The Roots blower is one example of a vacuum pump

Positive displacement pumps are the most effective for low vacuums. Momentum transfer pumps in conjunction with one or two positive displacement pumps are the most common configuration used to achieve high vacuums. In this configuration the positive displacement pump serves two purposes. First it obtains a rough vacuum in the vessel being evacuated before the momentum transfer pump can be used to obtain the high vacuum, as momentum transfer pumps cannot start pumping at atmospheric pressures. Second the positive displacement pump backs up the momentum transfer pump by evacuating to low vacuum the accumulation of displaced molecules in the high vacuum pump. Entrapment pumps can be added to reach ultrahigh vacuums, but they require periodic regeneration of the surfaces that trap air molecules or ions. Due to this requirement their available operational time can be unacceptably short in low and high vacuums, thus limiting their use to ultrahigh vacuums. Pumps also differ in details like manufacturing tolerances, sealing material, pressure, flow, admission or no admission of oil vapor, service intervals, reliability, tolerance to dust, tolerance to chemicals, tolerance to liquids and vibration.


Performance measures
Pumping speed refers to the volume flow rate of a pump at its inlet, often measured in volume per unit of time. Momentum transfer and entrapment pumps are more effective on some gases than others, so the pumping rate can be different for each of the gases being pumped, and the average volume flow rate of the pump will vary depending on the chemical composition of the gases remaining in the chamber.
Throughput refers to the pumping speed multiplied by the gas pressure at the inlet, and is measured in units of pressure × volume per unit of time. At a constant temperature, throughput is proportional to the number of molecules being pumped per unit time, and therefore to the mass flow rate of the pump. When discussing a leak in the system or backstreaming through the pump, throughput refers to the volume leak rate multiplied by the pressure at the vacuum side of the leak, so the leak throughput can be compared to the pump throughput.
Positive displacement and momentum transfer pumps have a constant volume flow rate (pumping speed), but as the chamber's pressure drops, this volume contains less and less mass. So although the pumping speed remains constant, the throughput and mass flow rate drop exponentially. Meanwhile, the leakage, evaporation, sublimation and backstreaming rates continue to produce a constant throughput into the system.
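The definitions above (throughput = pumping speed × inlet pressure) lead to the standard constant-speed estimate for pumping down a chamber: ignoring leaks and outgassing, P(t) = P0·exp(−S·t/V), so the time to go from P0 to P1 is (V/S)·ln(P0/P1). The chamber volume and pump speed in the Python sketch below are illustrative assumptions.

# Throughput Q = S * P and the constant-speed pump-down estimate
# t = (V / S) * ln(P0 / P1), valid when leaks and outgassing are negligible.
# Chamber volume and pumping speed are illustrative assumptions.

import math

V = 0.1            # chamber volume, m^3 (100 litres)
S = 0.005          # pumping speed, m^3/s (5 L/s, a small rotary vane pump)
P0 = 101_325.0     # starting pressure, Pa (atmosphere)
P1 = 10.0          # target pressure, Pa

throughput_at_start = S * P0                      # Pa * m^3 / s
t = (V / S) * math.log(P0 / P1)

print(f"Throughput at the start of pump-down: {throughput_at_start:.0f} Pa*m^3/s")
print(f"Estimated time from {P0:.0f} Pa to {P1} Pa: {t:.0f} s (~{t/60:.1f} min)")

In practice the time is longer, because the pumping speed falls as the pump approaches its ultimate pressure and because outgassing supplies a continuing gas load.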

Positive displacement
A partial vacuum may be generated by increasing the volume of a container. To continue evacuating a chamber indefinitely without requiring infinite growth, a compartment of the vacuum can be repeatedly closed off, exhausted, and expanded again. This is the principle behind positive displacement pumps, like the manual water pump for example. Inside the pump, a mechanism expands a small sealed cavity to reduce its pressure below that of the atmosphere. Because of the pressure differential, some fluid from the chamber (or the well, in our example) is pushed into the pump's small cavity. The pump's cavity is then sealed from the chamber, opened to the atmosphere, and squeezed back to a minute size. More sophisticated systems are used for most industrial applications, but the basic principle of cyclic volume removal is the same:
Rotary vane pump, the most common
Diaphragm pump, zero oil contamination
Liquid ring pump, high resistance to dust
Piston pump, fluctuating vacuum
Scroll pump, the highest-speed dry pump
Screw pump (10 Pa)
Wankel pump
External vane pump
Roots blower, also called a booster pump, which has the highest pumping speed but a low compression ratio
Multistage Roots pump, which combines several stages to provide high pumping speed with a better compression ratio
Toepler pump
Lobe pump
The manual water pump draws water up from a well by creating a vacuum that water rushes in to fill. In a sense, it acts to evacuate the well, although the high leakage rate of dirt prevents a high quality vacuum from being maintained for any length of time.


The base pressure of a rubber- and plastic-sealed piston pump system is typically 1 to 50 kPa, while a scroll pump might reach 10 Pa (when new) and a rotary vane oil pump with a clean and empty metallic chamber can easily achieve 0.1 Pa. A positive displacement vacuum pump moves the same volume of gas with each cycle, so its pumping speed is constant unless it is overcome by backstreaming.

Mechanism of a scroll pump

Momentum transfer
In a momentum transfer pump, gas molecules are accelerated from the vacuum side to the exhaust side (which is usually maintained at a reduced pressure by a positive displacement pump). Momentum transfer pumping is only possible below pressures of about 0.1 kPa. Matter flows differently at different pressures based on the laws of fluid dynamics. At atmospheric pressure and mild vacuums, molecules interact with each other and push on their neighboring molecules in what is known as viscous flow. When the distance between the molecules increases, the molecules interact with the walls of the chamber more often than with the other molecules, and molecular pumping becomes more effective than positive displacement pumping. This regime is generally called high vacuum. Molecular pumps sweep out a larger area than mechanical pumps, and do so more frequently, making them capable of much higher pumping speeds. They do this at the expense of the seal between the vacuum and their exhaust. Since there is no seal, a small pressure at the exhaust can easily cause backstreaming through the pump; this is called stall. In high vacuum, however, pressure gradients have little effect on fluid flows, and molecular pumps can attain their full potential.
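The crossover from viscous to molecular flow described above is usually judged with the Knudsen number Kn = λ/d, where λ is the mean free path, λ = kB·T / (√2·π·dm²·P), and d is a characteristic chamber dimension; Kn well below about 0.01 corresponds to viscous flow and Kn above about 1 to molecular flow. The molecular diameter and chamber size in the Python sketch below are illustrative values for air at room temperature.

# Mean free path and Knudsen number, used to judge whether flow is viscous or molecular.
# lambda = k_B * T / (sqrt(2) * pi * d_m^2 * P);  Kn = lambda / d_chamber.
# Molecular diameter and chamber dimension are illustrative values.

import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
T = 293.0                # temperature, K
D_MOLECULE = 3.7e-10     # effective molecular diameter of air, m (approximate)
D_CHAMBER = 0.1          # characteristic chamber dimension, m

def mean_free_path(pressure_pa):
    return K_B * T / (math.sqrt(2) * math.pi * D_MOLECULE ** 2 * pressure_pa)

for p in (101_325.0, 100.0, 0.1, 1e-4):          # pressures in Pa
    kn = mean_free_path(p) / D_CHAMBER
    regime = "viscous" if kn < 0.01 else "transitional" if kn < 1 else "molecular"
    print(f"P = {p:10.4g} Pa  ->  Kn = {kn:9.3g}  ({regime})")

At atmospheric pressure the mean free path of air is only about 70 nm, while at 10⁻⁴ Pa it is tens of metres, which is why molecular pumps only become effective once a roughing pump has removed most of the gas.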

A cutaway view of a turbomolecular high vacuum pump

The two main types of molecular pumps are the diffusion pump and the turbomolecular pump. Both types of pumps blow out gas molecules that diffuse into the pump by imparting momentum to the gas molecules. Diffusion pumps blow out gas molecules with jets of oil or mercury, while turbomolecular pumps use high-speed fans to push the gas. Both of these pumps will stall and fail to pump if exhausted directly to atmospheric pressure, so they must be exhausted to a lower-grade vacuum created by a mechanical pump. As with positive displacement pumps, the base pressure will be reached when leakage, outgassing, and backstreaming equal the pump speed, but now minimizing leakage and outgassing to a level comparable to backstreaming becomes much more difficult. See also: diffusion pump, turbomolecular pump.


Regenerative Pump
Regenerative pumps utilize the vortex behavior of the fluid (air). Their construction is based on a hybrid concept combining a centrifugal pump and a turbopump. They usually consist of several sets of perpendicular teeth on the rotor circulating air molecules inside stationary hollow grooves, like a multistage centrifugal pump. They can reach about 1×10⁻⁵ mbar and exhaust directly to atmospheric pressure; this design is sometimes referred to as a side-channel pump. Because of the high pumping rate from atmosphere to high vacuum and the low contamination (the bearing can be installed on the exhaust side), this type of pump is used in the load locks of semiconductor manufacturing processes. This type of pump suffers from high power consumption (about 1 kW) compared to a turbomolecular pump (less than 100 W) at low pressure, since most of the power is consumed in working against atmospheric pressure. This can be reduced nearly ten-fold by backing the pump with a small diaphragm pump.

Entrapment
Entrapment pumps may be cryopumps, which use cold temperatures to condense gases to a solid or adsorbed state; chemical pumps, which react with gases to produce a solid residue; or ionization pumps, which use strong electrical fields to ionize gases and propel the ions into a solid substrate. A cryomodule uses cryopumping. Examples include the ion pump, the cryopump, the sorption pump, and the non-evaporable getter.

Other pump types


Venturi vacuum pump (aspirator) (10 to 30 kPa)
Steam ejector (vacuum depends on the number of stages, but can be very low)

Techniques
Vacuum pumps are combined with chambers and operational procedures into a wide variety of vacuum systems. Sometimes more than one pump will be used (in series or in parallel) in a single application. A partial vacuum, or rough vacuum, can be created using a positive displacement pump that transports a gas load from an inlet port to an outlet (exhaust) port. Because of their mechanical limitations, such pumps can only achieve a low vacuum. To achieve a higher vacuum, other techniques must then be used, typically in series (usually following an initial fast pump-down with a positive displacement pump). Some examples might be use of an oil-sealed rotary vane pump (the most common positive displacement pump) backing a diffusion pump, or a dry scroll pump backing a turbomolecular pump. There are other combinations depending on the level of vacuum being sought.
Achieving high vacuum is difficult because all of the materials exposed to the vacuum must be carefully evaluated for their outgassing and vapour pressure properties. For example, oils, greases, and rubber or plastic gaskets used as seals for the vacuum chamber must not boil off when exposed to the vacuum, or the gases they produce would prevent the creation of the desired degree of vacuum. Often, all of the surfaces exposed to the vacuum must be baked at high temperature to drive off adsorbed gases. Outgassing can also be reduced simply by desiccation prior to vacuum pumping. High vacuum systems generally require metal chambers with metal gasket seals such as Klein flanges or ISO flanges, rather than the rubber gaskets more common in low-vacuum chamber seals. The system must be clean and free of organic matter to minimize outgassing. All materials, solid or liquid, have a small vapour pressure, and their outgassing becomes important when the vacuum pressure falls below this vapour pressure. As a result, many materials that work well in low

vacuums, such as epoxy, will become a source of outgassing at higher vacuums. With these standard precautions, vacuums of 1 mPa are easily achieved with an assortment of molecular pumps. With careful design and operation, 1 µPa is possible.
Several types of pumps may be used in sequence or in parallel. In a typical pump-down sequence, a positive displacement pump would be used to remove most of the gas from a chamber, starting from atmosphere (760 Torr, 101 kPa) to 25 Torr (3 kPa). Then a sorption pump would be used to bring the pressure down to 10⁻⁴ Torr (10 mPa). A cryopump or turbomolecular pump would be used to bring the pressure further down to 10⁻⁸ Torr (1 µPa). An additional ion pump can be started below 10⁻⁶ Torr to remove gases which are not adequately handled by a cryopump or turbo pump, such as helium or hydrogen.
Ultra-high vacuum generally requires custom-built equipment, strict operational procedures, and a fair amount of trial and error. Ultra-high vacuum systems are usually made of stainless steel with metal-gasketed ConFlat flanges. The system is usually baked, preferably under vacuum, to temporarily raise the vapour pressure of all outgassing materials in the system and boil them off. If necessary, this outgassing of the system can also be performed at room temperature, but this takes much more time. Once the bulk of the outgassing materials are boiled off and evacuated, the system may be cooled to lower vapour pressures to minimize residual outgassing during actual operation. Some systems are cooled well below room temperature by liquid nitrogen to shut down residual outgassing and simultaneously cryopump the system.
In ultra-high vacuum systems, some very odd leakage paths and outgassing sources must be considered. The water absorption of aluminium and palladium becomes an unacceptable source of outgassing, and even the absorptivity of hard metals such as stainless steel or titanium must be considered. Some oils and greases will boil off in extreme vacuums. The porosity of the metallic chamber walls may have to be considered, and the grain direction of the metallic flanges should be parallel to the flange face. The impact of molecular size must be considered: smaller molecules can leak in more easily and are more easily absorbed by certain materials, and molecular pumps are less effective at pumping gases with lower molecular weights. A system may be able to evacuate nitrogen (the main component of air) to the desired vacuum, but the chamber could still be full of residual atmospheric hydrogen and helium. Vessels lined with a highly gas-permeable material such as palladium (which is a high-capacity hydrogen sponge) create special outgassing problems.


Uses of vacuum pumps


Vacuum pumps are used in many industrial and scientific processes, including:
Composite plastic moulding processes (VRTM)[5]
Driving some of the flight instruments in older and simpler aircraft without electrical systems
The production of most types of electric lamps, vacuum tubes, and CRTs, where the device is either left evacuated or re-filled with a specific gas or gas mixture
Semiconductor processing, notably ion implantation, dry etch and PVD, ALD, PECVD and CVD deposition, and soon in photolithography
Electron microscopy
Medical processes that require suction
Uranium enrichment
Medical applications such as radiotherapy, radiosurgery and radiopharmacy
Analytical instrumentation to analyse gas, liquid, solid, surface and bio materials
Mass spectrometers, to create an ultra-high vacuum between the ion source and the detector
Vacuum coating on glass, metal and plastics for decoration, durability and energy saving, such as low-emissivity glass
Hard coating for engine components (as in Formula One)
Ophthalmic coating

Milking machines and other equipment in dairy sheds
Vacuum impregnation of porous products such as wood or electric motor windings
Air conditioning service: removing all contaminants from the system before charging with refrigerant
Trash compactors
Vacuum engineering
Sewage systems
Freeze drying
Fusion research


Vacuum may be used to power, or provide assistance to, mechanical devices. In hybrid and diesel-engined motor vehicles, a pump fitted on the engine (usually on the camshaft) is used to produce vacuum. In petrol engines, instead, vacuum is typically obtained as a side-effect of the operation of the engine and the flow restriction created by the throttle plate, but it may also be supplemented by an electrically operated vacuum pump to boost braking assistance or improve fuel consumption. This vacuum may then be used to power the following motor vehicle components:
The vacuum servo booster for the hydraulic brakes
Motors that move dampers in the ventilation system
The throttle driver in the cruise control servomechanism
Door locks or trunk releases

In an aircraft, the vacuum source is often used to power gyroscopes in the various flight instruments. To prevent the complete loss of instrumentation in the event of an electrical failure, the instrument panel is deliberately designed with certain instruments powered by electricity and other instruments powered by the vacuum source.

History of the vacuum pump


The predecessor to the vacuum pump was the suction pump, which was known to the Romans. Dual-action suction pumps were found in the city of Pompeii.[6] The Arabic engineer Al-Jazari also described suction pumps in the 13th century, saying that his model was a larger version of the siphons the Byzantines used to discharge Greek fire. The suction pump later reappeared in Europe from the 15th century. By the 17th century, water pump designs had improved to the point that they produced measurable vacuums, but this was not immediately understood. What was known was that suction pumps could not pull water beyond a certain height: 18 Florentine yards according to a measurement taken around 1635. (The conversion to metres is uncertain, but it would be about 9 or 10 metres.) This limit was a concern to irrigation projects, mine drainage, and

decorative water fountains planned by the Duke of Tuscany, so the Duke commissioned Galileo to investigate the problem. Galileo advertised the puzzle to other scientists, including Gaspar Berti, who replicated it by building the first water barometer in Rome in 1639. Berti's barometer produced a vacuum above the water column, but he could not explain it. The breakthrough was made by Evangelista Torricelli in 1643. Building upon Galileo's notes, he built the first mercury barometer and wrote a convincing argument that the space at the top was a vacuum. The height of the column was then limited to the maximum weight that atmospheric pressure could support; this is the limiting height of a suction pump, and the same as the maximum height of a siphon, which operates by the same principle. Some people believe that although Torricelli's experiment was crucial, it was Blaise Pascal's experiments that proved the top space really contained a vacuum. In 1654, Otto von Guericke invented the first vacuum pump and conducted his famous Magdeburg hemispheres experiment, showing that teams of horses could not separate two hemispheres from which the air had been evacuated. Robert Boyle improved Guericke's design and conducted experiments on the properties of vacuum. Robert Hooke also helped Boyle produce an air pump which helped to produce the vacuum. The study of vacuum then lapsed until 1855, when Heinrich Geissler invented the mercury displacement pump and achieved a record vacuum of about 10 Pa (0.1 Torr). A number of electrical properties become observable at this vacuum level, and this renewed interest in vacuum. This, in turn, led to the development of the vacuum tube. In the 19th century, Nikola Tesla designed an apparatus containing a Sprengel pump to create a high degree of exhaustion.


Hazards
Old vacuum-pump oils that were produced before circa 1980 often contain a mixture of several different dangerous polychlorinated biphenyls (PCBs), which are highly toxic, carcinogenic, persistent organic pollutants.

Student of Smolny Institute Catherine Molchanova with vacuum pump, by Dmitry Levitzky, 1776


Cryopump
A cryopump or "cryogenic pump" is a vacuum pump that traps gases and vapours by condensing them on a cold surface. Cryopumps are only effective on some gases, depending on the freezing and boiling points of the gas relative to the cryopump's temperature. They are sometimes used to block particular contaminants, for example in front of a diffusion pump to trap backstreaming oil, or in front of a McLeod gauge to keep out water. In this function, they are called a cryotrap or cold trap, even though the physical mechanism is the same as for a cryopump.
Cryotrapping can also refer to a somewhat different effect, where molecules increase their residence time on a cold surface without actually freezing: there is a delay between a molecule impinging on the surface and rebounding from it, during which kinetic energy is lost and the molecule slows down. For example, hydrogen will not condense at 8 kelvin, but it can be cryotrapped. This effectively traps molecules for an extended period and thereby removes them from the vacuum environment, just like cryopumping.

Operation
Cryopumps are commonly cooled by compressed helium, though they may also use dry ice or liquid nitrogen, and stand-alone versions may include a built-in cryocooler. Baffles are often attached to the cold head to expand the surface area available for condensation, but they also increase the radiative heat uptake of the cryopump. Over time, the surface eventually saturates with condensate and the pumping speed gradually drops to zero. The pump will hold the trapped gases as long as it remains cold, but it will not condense fresh gases from leaks or backstreaming until it is regenerated. Saturation happens very quickly in low vacuums, so cryopumps are usually only used in high or ultrahigh vacuum systems.
Regeneration of a cryopump is the process of evaporating the trapped gases. This can be done at room temperature and pressure, or the process can be made more complete by exposure to vacuum and faster by elevated temperatures. Best practice is to heat the whole chamber under vacuum to the highest temperature allowed by the materials, allow time for outgassing products to be exhausted by the mechanical pumps, and then cool and use the cryopump without breaking the vacuum.
Some cryopumps have multiple stages at various low temperatures, with the outer stages shielding the coldest inner stages. The outer stages condense high-boiling-point gases such as water and oil, thus saving the surface area and refrigeration capacity of the inner stages for lower-boiling-point gases such as nitrogen. As the cooling temperature decreases from dry ice to liquid nitrogen to compressed helium, gases of lower molecular weight can be trapped. Trapping nitrogen, helium, and hydrogen requires extremely low temperatures (~10 K) and a large surface area, as described below. Even at this temperature, the lighter gases helium and hydrogen have very low trapping efficiency and are the predominant molecules in ultra-high vacuum systems.
Cryopumps are often combined with sorption pumps by coating the cold head with highly adsorbing materials such as activated charcoal or a zeolite. As the sorbent saturates, the effectiveness of a sorption pump decreases, but it can be recharged by heating the zeolite material (preferably under conditions of low pressure) to outgas it. The breakdown temperature of the zeolite material's porous structure may limit the maximum temperature to which it may be heated for regeneration. Sorption pumps are a type of cryopump that is often used as a roughing pump to reduce pressure from the atmospheric range to on the order of 0.1 Pa (10⁻³ Torr), while lower pressures are achieved using a finishing pump (see vacuum).


Getter
A getter is a deposit of reactive material that is placed inside a vacuum system for the purpose of completing and maintaining the vacuum. When gas molecules strike the getter material, they combine with it chemically or by adsorption. Thus the getter removes small amounts of gas from the evacuated space. The getter is usually a coating applied to a surface within the evacuated chamber.
A vacuum is initially created by connecting a closed container to a vacuum pump. After achieving a vacuum, the container can be sealed, or the vacuum pump can be left running. Getters are especially important in sealed systems, such as vacuum tubes, including cathode ray tubes (CRTs), and vacuum insulated panels, which must maintain a vacuum for a long time. This is because the inner surfaces of the container release adsorbed gases for a long time after the vacuum is established. The getter continually removes this residual gas as it is produced. Even in systems which are continually evacuated by a vacuum pump, getters are also used to remove residual gas, often to achieve a higher vacuum than the pump could achieve alone. Although it weighs almost nothing and has no moving parts, a getter is itself a vacuum pump.
(Figure: centre, a vacuum tube with a "flashed getter" coating on the inner surface of the top of the tube; left, the inside of a similar tube, showing the reservoir that holds the material that is evaporated to create the getter coating.)
Getters cannot react permanently with inert gases, though some getters will adsorb them in a reversible fashion. Also, hydrogen is usually handled by adsorption rather than reaction. Small amounts of gas within a vacuum tube will ionize, causing undesired conduction leading to major malfunction. Small amounts of gas within a vacuum insulated panel can greatly compromise its insulation value. Getters help to maintain the vacuum.


Flashed getters
"Flashed getters" are prepared by arranging a reservoir of a volatile and reactive material inside the vacuum system. Once the system is evacuated and sealed, the material is heated, usually by RF induction heating, and evaporates, depositing itself on the walls to leave a coating. Flashed getters are commonly used in vacuum tubes, and the standard flashed getter material is barium. It can usually be seen as a silvery metallic spot on the inside of the tube's glass envelope. Large transmitting and specialized tubes often use more exotic getters, including aluminium, magnesium, calcium, sodium, strontium, caesium and phosphorus.

Dead display (the getter spot has become white)

If the tube breaks, the getter reacts with incoming air leaving a white deposit inside the tube, and it becomes useless; for this reason, flashed getters are not used in systems which are intended to be opened. A functioning phosphorus getter looks very much like an oxidised metal getter, though it has an iridescent pink or orange appearance which oxidised metal getters lack. Phosphorus was frequently used before metallic getters were developed. In systems which need to be opened to air for maintenance, a titanium sublimation pump provides similar functionality to flashed getters, but can be flashed repeatedly. Alternatively, nonevaporable getters may be used.

Non-evaporable getters
Non-evaporable getters which work at high temperature generally consist of a film of a special alloy, often primarily zirconium; the requirement is that the alloy materials must form a passivation layer at room temperature which disappears when heated. Common alloys have names of the form St (Stabil) followed by a number: St 707 is 70% zirconium, 24.6% vanadium and the balance iron, St 787 is 80.8% zirconium, 14.2% cobalt and balance mischmetal, St 101 is 84% zirconium and 16% aluminium. In tubes used in electronics, the getter material coats plates within the tube which are heated in normal operation; when getters are used within more general vacuum systems, such as in semiconductor manufacturing, they are introduced as separate pieces of equipment in the vacuum chamber, and turned on when needed. It is of course important not to heat the getter when the system is not already in a good vacuum.


Ion pump (physics)


"Ion pump" redirects here. For a protein that moves ions across a plasma membrane, see Ion transporter. An ion pump is not to be confused with an ionic liquid piston pump or an ionic liquid ring vacuum pump. An ion pump (also referred to as a sputter ion pump) is a type of vacuum pump capable of reaching pressures as low as 1011 mbar under ideal conditions. An ion pump ionizes gas within the vessel it is attached to and employs a strong electrical potential, typically 3kV to 7kV, which allows the ions to accelerate into and be captured by a solid electrode and its residue. The basic element of the common ion pump is a Penning trap. A swirling cloud of electrons produced by an electric discharge are temporarily stored in the anode region of a Penning trap. These electrons ionize incoming gas atoms and molecules. The resultant swirling ions are accelerated to strike a chemically active cathode (usually titanium). On impact the accelerated ions will either become buried within the cathode or sputter cathode material onto the walls of the pump. The freshly sputtered chemically active cathode material acts as a getter that then evacuates the gas by both chemisorption and physisorption resulting in a net pumping action. Inert and lighter gases, such as He and H2 tend not sputter and are absorbed by physisorption. Some fraction of the energetic gas ions (including gas that is not chemically active with the cathode material) can strike the cathode and acquire an electron from the surface neutralizing it as it rebounds. These rebounding energetic neutrals are buried in exposed pump surfaces. Both the pumping rate and capacity of such capture methods are dependent on the specific gas species being collected and the cathode material absorbing it. Some species, such as carbon monoxide, will chemically bind to the surface of a cathode material. Others, such as hydrogen, will diffuse into the metallic structure. In the former example, the pump rate can drop as the cathode material becomes coated. And, in the latter, the rate remains fixed by the rate at which the hydrogen diffuses. There are three main types of ion pumps, the conventional or standard diode pump, the noble diode pump and the triode pump. Ion pumps are commonly used in ultra high vacuum (UHV) systems, as they can attain ultimate pressures less than 1011 mbar. In contrast to other common UHV pumps, such as turbomolecular pumps and diffusion pumps, ion pumps have no moving parts and use no oil. They are therefore clean, need little maintenance, and produce no vibrations. these advantages make ion pumps well-suited for use in scanning probe microscopy and other high-precision apparatus. Recent work has suggested that free radicals escaping from ion pumps can influence the results of some experiments.


Rotary vane pump


A rotary vane pump is a positive-displacement pump that consists of vanes mounted to a rotor that rotates inside of a cavity. In some cases these vanes can be variable length and/or tensioned to maintain contact with the walls as the pump rotates. It was invented by Charles C. Barnes of Sackville, New Brunswick who patented it on June 16, 1874.

Types
The simplest vane pump is a circular rotor rotating inside a larger circular cavity. The centers of these two circles are offset, causing eccentricity. Vanes are allowed to slide into and out of the rotor and seal on all edges, creating vane chambers that do the pumping work. (Figure: an eccentric rotary vane pump; modern pumps have an area contact between rotor and stator rather than a line contact. 1: pump housing, 2: rotor, 3: vanes, 4: spring.) On the intake side of the pump, the vane chambers are increasing in volume. These increasing-volume vane chambers are filled with fluid forced in by the inlet pressure. Inlet pressure is actually the pressure from the system being pumped, often just the atmosphere. On the discharge side of the pump, the vane chambers are decreasing in volume, forcing fluid out of the pump. The action of the vane drives out the same volume of fluid with each rotation. Multistage rotary vane vacuum pumps can attain pressures as low as 10⁻³ mbar (0.1 Pa).
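Since the pump displaces the same volume with every rotation, as noted above, a rough pumping speed follows from the swept volume per revolution times the rotation rate. The figures in the Python sketch below are illustrative, not the specification of any particular pump.

# Rotary vane pump: pumping speed ~ swept volume per revolution * rotation rate.
# Swept volume and rotor speed below are illustrative assumptions.

swept_volume_per_rev_l = 0.05      # litres displaced per revolution
rpm = 1500.0                       # rotor speed, revolutions per minute

speed_l_per_min = swept_volume_per_rev_l * rpm
speed_m3_per_h = speed_l_per_min * 60.0 / 1000.0

print(f"Pumping speed: {speed_l_per_min:.0f} L/min  (about {speed_m3_per_h:.1f} m^3/h)")
# Note: this is the speed near atmospheric inlet pressure; the effective speed drops
# as the ultimate pressure of the pump is approached.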

Uses
Common uses of vane pumps include high pressure hydraulic pumps and automotive uses including supercharging, power steering and automatic transmission pumps. Pumps for mid-range pressures include applications such as carbonators for fountain soft drink dispensers and espresso coffee machines. Furthermore, vane pumps can be used in low-pressure gas applications such as secondary air injection for auto exhaust emission control, or in low pressure chemical vapor deposition systems.

Rotary vane pumps are also a common type of vacuum pump, with two-stage pumps able to reach pressures well below 10⁻⁶ bar. These vacuum pumps are found in numerous applications, such as providing braking assistance in large trucks and diesel powered passenger cars (whose engines do not generate intake vacuum) through a braking booster, in most light aircraft to drive gyroscopic flight instruments, in evacuating refrigerant lines during installation of air conditioners, in laboratory freeze dryers, and in vacuum experiments in physics. In the vane pump the pumped gas and the oil are mixed within the pump, and so they must be separated externally. Therefore the inlet and the outlet have a large chamber, sometimes with a swirl, where the oil drops fall out of the gas. Sometimes the inlet has a venetian blind cooled by the room air (the pump is usually 40 K hotter) to condense cracked pumping oil and water, and let it drop back into the inlet. When these pumps are used in high vacuum systems (where the inflow of gas into the pump becomes very low), a significant concern is contamination of the entire system by molecular oil backstreaming.


Variable displacement vane pumps


One of the major advantages of the vane pump is that the design readily lends itself to become a variable displacement pump, rather than a fixed displacement pump such as a spur-gear (X-X) or a gerotor (I-X) pump. The centerline distance from the rotor to the eccentric ring is used to determine the pump's displacement. By allowing the eccentric ring to pivot or translate relative to the rotor, the displacement can be varied. It is even possible for a vane pump to pump in reverse if the eccentric ring moves far enough. However, performance cannot be optimized to pump in both directions. This makes the design well suited to use as a hydraulic control oil pump. A variable displacement vane pump is used as an energy savings device, and has been used in many applications, including automotive transmissions, for over 30 years.

Materials
Externals (head, casing): cast iron, ductile iron, steel, and stainless steel
Vanes, pushrods: carbon graphite, PEEK
End plates: carbon graphite
Shaft seal: component mechanical seals, industry-standard cartridge mechanical seals, and magnetically driven (sealless) designs


Diaphragm pump
A diaphragm pump (also known as a membrane pump, air operated double diaphragm pump (AODD) or pneumatic diaphragm pump) is a positive displacement pump that uses a combination of the reciprocating action of a rubber, thermoplastic or Teflon diaphragm and suitable valves on either side of the diaphragm (check valves, butterfly valves, flap valves, or any other form of shut-off valves) to pump a fluid.

There are three main types of diaphragm pumps:
Those in which the diaphragm is sealed with one side in the fluid to be pumped, and the other in air or hydraulic fluid. The diaphragm is flexed, causing the volume of the pump chamber to increase and decrease. A pair of non-return check valves prevent reverse flow of the fluid.
Those employing volumetric positive displacement where the prime mover of the diaphragm is electro-mechanical, working through a crank or geared motor drive, or purely mechanical, such as with a lever or handle. This method flexes the diaphragm through simple mechanical action, and one side of the diaphragm is open to air.
Those employing one or more unsealed diaphragms with the fluid to be pumped on both sides. The diaphragm(s) again are flexed, causing the volume to change.

When the volume of a chamber of either type of pump is increased (the diaphragm moving up), the pressure decreases, and fluid is drawn into the chamber. When the chamber pressure later increases from decreased volume (the diaphragm moving down), the fluid previously drawn in is forced out. Finally, the diaphragm moving up once again draws fluid into the chamber, completing the cycle. This action is similar to that of the cylinder in an internal combustion engine.

Diaphragm pump schematic.

Characteristics
Diaphragm pumps:
have good suction lift characteristics; some are low pressure pumps with low flow rates, while others are capable of higher flow rates, dependent on the effective working diameter of the diaphragm and its stroke length
can handle sludges and slurries with a relatively high amount of grit and solid content
are suitable for discharge pressures up to 1,200 bar
have good dry running characteristics
can be used to make artificial hearts
are used to make air pumps for the filters on small fish tanks
can be up to 97% efficient
have good self-priming capabilities
can handle highly viscous liquids (a viscosity correction chart can be used as a tool to help prevent under-sizing AOD pumps)
are available for industrial, chemical and hygienic applications
cause a pulsating flow that may cause water hammer (this can be minimised by using a pulsation dampener)


History


The diaphragm pump was invented in 1857 by Jacob Edson. Full production of the first pumps began two years later under the name of the Edson Corporation, located in Boston, Massachusetts. The company continues to thrive today in New Bedford, Massachusetts.

Air compressors
Small mechanically activated diaphragm pumps are also used as air compressors and as a source of low-grade vacuum. Compared to other compressors, these pumps are quiet, cheap and, most importantly, have no moving parts in the airstream. This allows them to be used without added lubrication in contact with the air, so the compressed air produced can be guaranteed clean.


Liquid ring pump


A liquid ring pump is a rotating positive displacement pump. It is typically used as a vacuum pump, but can also be used as a gas compressor. The function of a liquid ring pump is similar to that of a rotary vane pump, with the difference being that the vanes are an integral part of the rotor and churn a rotating ring of liquid to form the compression chamber seal. It is an inherently low friction design, with the rotor being the only moving part. Sliding friction is limited to the shaft seals. Liquid ring pumps are typically powered by an induction motor.

Description of operation
The liquid ring pump compresses gas by rotating a vaned impeller located eccentrically within a cylindrical casing. Liquid (usually water) is fed into the pump and, by centrifugal acceleration, forms a moving cylindrical ring against the inside of the casing. This liquid ring creates a series of seals in the space between the impeller vanes, which form compression chambers. The eccentricity between the impeller's axis of rotation and the casing geometric axis results in a cyclic variation of the volume enclosed by the vanes and the ring. Gas, often air, is drawn into the pump via an inlet port in the end of the casing. The gas is trapped in the compression chambers formed by the impeller vanes and the liquid ring. The reduction in volume caused by the impeller rotation compresses the gas, which reports to the discharge port in the end of the casing.

History
The earliest liquid ring pumps date from 1903 when a patent was granted in Germany to Siemens-Schuckert. US Patent 1,091,529, for liquid ring vacuum pumps and compressors, was granted to Lewis H. Nash in 1914. They were manufactured by the Nash Engineering Company in Norwalk, CT. Around the same time, in Austria, Patent 69274 was granted to Siemens-Schuckertwerke for a similar liquid ring vacuum pump.

Recirculation of ring-liquid
Some ring-liquid is also entrained with the discharge stream. This liquid is separated from the gas stream by other equipment external to the pump. In some systems, the discharged ring-liquid is cooled via a heat exchanger or cooling tower, then returned to the pump casing. In some recirculating systems, contaminants from the gas become trapped in the ring-liquid, depending on system configuration. These contaminants become concentrated as the liquid continues to recirculate, eventually causing damage and reducing the life of the pump. In this case, filtration systems are required to ensure contamination is kept to acceptable levels. In non-recirculating systems, the discharged hot liquid (usually water) is treated as a waste stream. In this case, fresh, cool water is used to make up the loss. Environmental considerations are making such "once-through" systems increasingly rare.


Types and applications


Liquid ring systems can be single or multi-stage. Typically a multi-stage pump will have up to two compression stages on a common shaft. In vacuum service, the attainable pressure reduction is limited by the vapour pressure of the ring-liquid. As the generated vacuum approaches the vapour pressure of the ring-liquid, the increasing volume of vapor released from the ring-liquid diminishes the remaining vacuum capacity. The efficiency of the system declines as a result. Single stage vacuum pumps typically produce vacuum to 35 torr (mmHg) or 0.047 bar, and two-stage pumps can produce vacuum to 25 torr (mmHgA), assuming air is being pumped and the ring-liquid is water at 15 °C (60 °F) or less. Dry air and 15 °C sealant water temperature is the standard performance basis, which most manufacturers use for their performance curves.

These simple, but highly reliable pumps have a variety of industrial applications. They are used on paper machines to dewater the pulp slurry and to extract water from press felts. Another application is the vacuum forming of molded paper pulp products (egg cartons and other packaging). Other applications include soil remediation, where contaminated ground water is drawn from wells by vacuum. In petroleum refining, vacuum distillation also makes use of liquid ring vacuum pumps to provide the process vacuum. Liquid ring compressors are often used in vapor recovery systems.

Liquid ring vacuum pumps can use any liquid compatible with the process, provided it has the appropriate vapor pressure properties, as the sealant liquid. Although the most common sealant is water, almost any liquid can be used. The second most common is oil. Since oil has a very low vapor pressure, oil-sealed liquid ring vacuum pumps are typically air-cooled. The ability to use any liquid allows the liquid ring vacuum pump to be ideally suited for solvent (vapor) recovery. If a process, such as distillation, or a vacuum dryer is generating toluene vapors, for example, then it is possible to use toluene as the sealant, provided the cooling water is cold enough to keep the vapor pressure of the sealant liquid low enough to pull the desired vacuum. Ionic liquids in liquid ring vacuum pumps can lower the vacuum pressure from about 70 mbar to below 1 mbar.
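The vapour-pressure limit described above can be made concrete with the Antoine equation for water. The coefficients used below are commonly quoted literature values for roughly 1–100 °C, and the temperatures are chosen only for illustration; treat the exact numbers as assumptions of this sketch.

# Vapour pressure of water from the Antoine equation,
# log10(P/mmHg) = A - B / (C + T/degC), with commonly quoted coefficients.
A, B, C = 8.07131, 1730.63, 233.426

def water_vapour_pressure_torr(t_celsius):
    return 10 ** (A - B / (C + t_celsius))

for t in (15, 25, 40):
    print(f"{t:>2} degC sealant water -> vapour pressure ~ "
          f"{water_vapour_pressure_torr(t):.1f} torr")

At 15 °C the sealant water alone contributes roughly 13 torr, which is consistent with the 25–35 torr ultimate pressures quoted above; at 40 °C the contribution rises to around 55 torr, illustrating why warm sealant sharply degrades the attainable vacuum.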


Reciprocating compressor
A reciprocating compressor or piston compressor is a positive-displacement compressor that uses pistons driven by a crankshaft to deliver gases at high pressure. The intake gas enters the suction manifold, then flows into the compression cylinder where it gets compressed by a piston driven in a reciprocating motion via a crankshaft, and is then discharged. Applications include oil refineries, gas pipelines, chemical plants, natural gas processing plants and refrigeration plants. One specialty application is the blowing of plastic bottles made of polyethylene terephthalate (PET).

Reciprocating compressor function

A motor-driven six-cylinder reciprocating compressor that can operate with two, four or six cylinders.


Scroll compressor
A scroll compressor (also called spiral compressor, scroll pump and scroll vacuum pump) is a device for compressing air or refrigerant. It is used in air conditioning equipment, as an automobile supercharger (where it is known as a scroll-type supercharger) and as a vacuum pump. A scroll compressor operating in reverse is known as a scroll expander, and can be used to generate mechanical work from the expansion of a fluid, compressed air or gas. Many residential central heat pump and air conditioning systems and a few automotive air conditioning systems employ a scroll compressor instead of the more traditional rotary, reciprocating, and wobble-plate compressors.

Mechanism of a scroll pump; here two Archimedean spirals

History
Léon Creux first patented a scroll compressor in 1905 in France and the US (Patent number 801182). Creux invented the compressor as a rotary steam engine concept, but the metal casting technology of the period was not sufficiently advanced to construct a working prototype, since a scroll compressor demands very tight tolerances to function effectively. The first practical scroll compressors did not appear on the market until after World War II, when higher-precision machine tools enabled their construction. They were not commercially produced for air conditioning until the early 1980s.

Design

Operation of a scroll compressor

A scroll compressor uses two interleaving scrolls to pump, compress or pressurize fluids such as liquids and gases. The vane geometry may be involute, Archimedean spiral, or hybrid curves. Often, one of the scrolls is fixed, while the other orbits eccentrically without rotating, thereby trapping and pumping or compressing pockets of fluid between the scrolls. Another method for producing the compression motion is co-rotating the scrolls, in synchronous motion, but with offset centers of rotation. The relative motion is the same as if one were orbiting.

Another variation is with flexible (layflat) tubing where the Archimedean spiral acts as a peristaltic pump, which operates on much the same principle as a toothpaste tube. These pumps have casings filled with lubricant to prevent abrasion of the exterior of the pump tube and to aid in the dissipation of heat, and use reinforced tubes, often called 'hoses'. This class of pump is often called a 'hose pumper'. Furthermore, since there are no moving parts in contact with the fluid, peristaltic pumps are inexpensive to manufacture. Their lack of valves, seals and glands makes them comparatively inexpensive to maintain, and the use of a hose or tube makes for a low-cost maintenance item compared to other pump types.
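A minimal sketch of the interleaved-scroll geometry is given below. It uses plain Archimedean spirals and an idealized orbit radius (half the radial pitch, i.e. zero wall thickness); production scrolls are usually involute profiles with wall thickness and tip geometry taken into account, so treat this purely as an illustration.

# Simplified sketch of the two interleaved spirals of a scroll machine.
# Plain Archimedean spirals (r = a + b*theta) are used for clarity; the
# spiral parameters and orbit radius are illustrative assumptions.
import numpy as np

def archimedean_spiral(a, b, theta):
    r = a + b * theta
    return r * np.cos(theta), r * np.sin(theta)

theta = np.linspace(0, 6 * np.pi, 600)   # three full wraps
a, b = 1.0, 0.5                          # spiral parameters (assumed)
orbit_radius = np.pi * b                 # half the radial pitch, zero wall thickness (assumed)

# Fixed scroll.
x_fixed, y_fixed = archimedean_spiral(a, b, theta)

# Orbiting scroll: the same spiral rotated by 180 degrees (x, y -> -x, -y)
# and displaced by the orbit radius; in operation this displacement revolves
# while the scroll itself does not rotate.
x_orb = -x_fixed + orbit_radius
y_orb = -y_fixed

print(f"Generated {len(theta)} points per scroll wall; orbit radius = {orbit_radius:.2f}")

Plotting the two point sets shows the nested crescent-shaped pockets that shrink toward the center as the orbiting scroll revolves, which is the trapping and compression action described above.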


Applications
Air conditioner compressor
Vacuum pump
Superchargers for automobile applications, e.g. Volkswagen's G-Lader

Engineering comparison to other pumps


These devices are known for operating more smoothly, quietly, and reliably than conventional compressors in some applications. Unlike pistons, the orbiting scroll's mass can be perfectly counterbalanced, with simple masses, to minimize vibration. (However, an orbiting scroll cannot be balanced if an Oldham coupling is used.) The scroll's gas processes are more continuous. Additionally, a lack of dead space gives an increased volumetric efficiency.

Rotations and pulse flow


The compression process occurs over approximately two to two and a half rotations of the crankshaft, compared to one rotation for rotary compressors, and one-half rotation for reciprocating compressors. The scroll discharge and suction processes occur for a full rotation, compared to less than a half-rotation for the reciprocating suction process, and less than a quarter-rotation for the reciprocating discharge process. However, reciprocating compressors have multiple cylinders (typically anywhere from two to six), while scroll compressors only have one compression element. The presence of multiple cylinders in reciprocating compressors reduces suction and discharge pulsations. Therefore, it is difficult to state whether scroll compressors have lower pulsation levels than reciprocating compressors, as has often been claimed by some suppliers of scroll compressors. A steadier flow yields lower gas pulsations, lower sound and lower vibration of attached piping, while having no influence on the compressor operating efficiency.

Valves
Scroll compressors never have a suction valve, but depending on the application may or may not have a discharge valve. The use of a dynamic discharge valve is more prominent in high pressure ratio applications, typical of refrigeration. Typically, an air-conditioning scroll does not have a dynamic discharge valve. The use of a dynamic discharge valve improves scroll compressor efficiency over a wide range of operating conditions, when the operating pressure ratio is well above the built-in pressure ratio of the compressor. However, if the compressor is designed to operate near a single operating point, then the scroll compressor can actually gain efficiency around this point if there is no dynamic discharge valve present, since the discharge valve introduces additional flow losses and discharge ports tend to be smaller when a discharge valve is present.

Efficiency
The isentropic efficiency of scroll compressors is slightly higher than that of a typical reciprocating compressor when the compressor is designed to operate near one selected rating point. The scroll compressors are more efficient in this case because they do not have a dynamic discharge valve that introduces additional throttling losses. However, the efficiency of a scroll compressor that does not have a discharge valve begins to decrease as compared to the reciprocating compressor at higher pressure ratio operation. This is a result of under-compression losses that occur at high pressure ratio operation of the positive displacement compressors that do not have a dynamic discharge valve.
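For reference, the isentropic benchmark against which such efficiencies are defined can be sketched as follows; the gas properties, inlet temperature, pressure ratio and efficiency value are illustrative assumptions rather than data for any particular compressor.

# Isentropic (ideal) specific compression work for an ideal gas, and the
# actual work implied by an assumed isentropic efficiency:
#   w_ideal = cp * T1 * ((P2/P1)**((gamma-1)/gamma) - 1),  w_actual = w_ideal / eta
cp = 1005.0             # J/(kg K), air (assumed working gas)
gamma = 1.4
T1 = 293.15             # K, suction temperature (assumed)
pressure_ratio = 3.0    # assumed
eta_isentropic = 0.70   # assumed overall isentropic efficiency

w_ideal = cp * T1 * (pressure_ratio ** ((gamma - 1) / gamma) - 1)
w_actual = w_ideal / eta_isentropic

print(f"Ideal (isentropic) work:   {w_ideal/1000:.1f} kJ/kg")
print(f"Actual work at eta = 0.70: {w_actual/1000:.1f} kJ/kg")

The isentropic efficiency quoted for a compressor is simply the ratio of this ideal work to the work actually absorbed per unit mass of gas.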

There is an industry trend toward developing systems operating on CO2 refrigerant. While CO2 has no ozone depletion potential, it is very difficult to achieve a reasonable cycle efficiency using CO2 as compared to other conventional refrigerants, without having substantial expenditures on enhancing the system with large heat exchangers, vapor injection options, expanders, etc. In case of CO2, the reciprocating compressor appears to offer the best option, as it is difficult to design an efficient and reliable scroll compressor for this application.

The scroll compression process is nearly one hundred percent volumetrically efficient in pumping the trapped fluid. The suction process creates its own volume, separate from the compression and discharge processes further inside. By comparison, reciprocating compressors leave a small amount of compressed gas in the cylinder, because it is not practical for the piston to touch the head or valve plate. That remnant gas from the last cycle then occupies space intended for suction gas. The reduction in capacity (i.e. volumetric efficiency) depends on the suction and discharge pressures, with greater reductions occurring at higher ratios of discharge to suction pressures.
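The capacity penalty from re-expanding clearance gas in a reciprocating compressor is commonly estimated with the standard clearance-volume relation sketched below; the clearance fraction and polytropic exponent are assumed example values.

# Standard estimate of reciprocating-compressor volumetric efficiency due to
# re-expansion of gas left in the clearance volume:
#   eta_v = 1 - c * ((P2/P1)**(1/n) - 1)
# where c is clearance volume / swept volume (assumed) and n is the
# polytropic exponent (assumed).
c = 0.05
n = 1.3

def volumetric_efficiency(pressure_ratio, c=c, n=n):
    return 1.0 - c * (pressure_ratio ** (1.0 / n) - 1.0)

for pr in (2, 4, 8):
    print(f"P2/P1 = {pr}: eta_v = {volumetric_efficiency(pr):.2f}")

The trend, with volumetric efficiency falling as the pressure ratio rises, matches the behaviour described above; a scroll machine avoids this particular loss because its suction pockets are formed independently of the compression and discharge pockets.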


Reliability
Scroll compressors have fewer moving parts than reciprocating compressors which, theoretically, should improve reliability. According to Emerson Climate Technologies, manufacturer of Copeland scroll compressors, scroll compressors have 70 percent fewer moving parts than conventional reciprocating compressors. In 2006 a major manufacturer of food service equipment, Stoelting, chose to change the design of one of their soft serve ice cream machines from reciprocating to scroll compressor. They found through testing that the scroll compressor design delivered better reliability and energy efficiency in operation. However, many refrigeration applications rely on reciprocating compressors, which appear to be more reliable in these applications than scroll compressors. These applications include supermarket refrigeration and truck trailer applications.

Vulnerabilities
Scroll compressors are more vulnerable to introduced debris, as any debris needs to pass through at least two closed compression pockets. Scrolls that operate without radial and/or axial compliance are even more prone to damage caused by foreign objects. However, scrolls do not have suction valves, which are among the parts of a reciprocating compressor most vulnerable to liquid flooding. Scroll compressors utilize different methods of protection inside the compressor to handle difficult situations. Some scroll designs utilize valves at different points in the compression process to relieve pressure inside the compression elements.

A reciprocating compressor can run in either direction and still function properly, whereas a scroll compressor must rotate in one direction only in order to function. This can be important during extremely short periods of power loss, when a scroll compressor may be forced to run backward by the pressure in the discharge line. Only single-phase scroll compressors would continue to run in reverse after the power comes back on. If this happens, the scroll compressor will stop pumping. Running a scroll compressor in reverse for several minutes would normally not damage the compressor. A three-phase compressor, as compared to single-phase compressors, would revert to operation in a forward direction at the end of a short power interruption. However, it is important to properly wire the three-phase compressor during the initial installation. If during the installation the polarity is inadvertently reversed, then the three-phase compressor would run backward, and damage to the compressor may result if this goes unnoticed for a long period of time.

Interestingly, one of the ways to mitigate flooded operation of the compressor on start-up is to actually run the compressor for several minutes in the reverse direction before turning the compressor in the forward direction. The short reverse run on start-up would expel any liquid accumulated inside the compressor pumping element back into the crankcase, as well as preheat the liquid stored in the crankcase by dissipated motor heat. Expelling the liquid from the pumping element and preheating any liquid refrigerant in the crankcase prior to initiating the normal run in the forward direction significantly alleviates problems with the flooded start.


Size
Scroll compressors tend to be very compact and smooth running and so do not require spring suspension. This allows them to have very small shell enclosures, which reduces overall cost but also results in smaller free volume. This is a weakness in terms of liquid handling. Their corresponding strength is the lack of suction valves, which moves the most probable point of failure to the drive system, which may be made somewhat stronger. Thus the scroll mechanism is itself more tolerant of liquid ingestion, but at the same time is more prone to experience it in operation. The small size and quiet operation of a scroll compressor allow the unit to be built into high power density computers, like IBM mainframes. Scroll compressors also simplify the piping design, since they require no external connection for the primary coolant.

Partial loading
Until recently, scroll compressors could only operate at full capacity when powered. Modulation of the capacity was accomplished outside the scroll set. In order to achieve part-loads, engineers would bypass refrigerant from an intermediate compression pocket back to suction, vary motor speed, or provide multiple compressors and stage them on and off in sequence. Each of these methods has drawbacks:
Bypass short-cycles the normal refrigeration cycle and allows some of the partially compressed gas to return to the compressor suction without doing any useful work. This practice reduces overall system efficiency.
A two-speed motor requires more electrical connections and switching, adding cost, and may have to stop to switch.
A variable speed motor requires an additional device to supply electrical power throughout the desired frequency range. The variable frequency drive associated with a variable speed compressor also has its own electrical losses, is a source of additional significant cost and is often an additional reliability concern.
Compressor cycling requires more compressors and can be costly. In addition, some compressors in the system may have to be very small in order to control process temperature accurately.

Recently, scroll compressors have been manufactured that provide part-load capacity within a single compressor. These compressors change capacity while running. Reciprocating compressors often have better unloading capabilities than scroll compressors. Reciprocating compressors operate efficiently in unloaded mode when flow to some of the cylinders is completely cut off by internal solenoid valves. Two-stage reciprocating compressors are also well suited for vapor injection (or what may be called economized operation), when partially expanded flow is injected between the first and second compression stages for increased capacity and improved efficiency. While scroll compressors can also rely on vapor injection to vary the capacity, their vapor injection operation is not as efficient as for reciprocating compressors. This inefficiency is caused by the continuously changing volume of the scroll compressor's compression pocket during the vapor injection process. As the volume is continuously being changed, the pressure within the compression pocket is also continuously changing, which adds inefficiency to the vapor injection process. In the case of a two-stage reciprocating compressor the vapor injection takes place between the two stages, where there is no changing volume. Both scroll and reciprocating compressors can be unloaded from mid-stage compression; however, reciprocating compressors are also more efficient for this mode of unloading than scroll compressors, because the unloading port dimensions in the scroll case are limited by the internal port size, which would not be the case for a reciprocating compressor, where unloading again occurs between the two stages.

Emerson manufactures a scroll compressor that is capable of varying the refrigerant flow as required. Instead of fixing the scrolls together permanently, the scrolls are allowed to move apart periodically. As the scrolls move apart, the motor continues to turn but the scrolls lose the ability to compress refrigerant, so motor power is reduced when the scroll compressor is not pumping. The compressor alternates between two working states, the loaded state and the unloaded state: a solenoid contracts and expands the rotating scroll and/or the fixed scroll, using axial compliance.
The controller modifies the load time and the unload time, matching the capacity of the compressor to the load requested. This type of scroll compressor, while offering variable capacity control, normally down to 20% of the full flow, can suffer from a significant loss of efficiency, especially toward the lower range of the capacity control.


Archimedes' screw
Archimedes' screw, also called the Archimedean screw or screw pump, is a machine historically used for transferring water from a low-lying body of water into irrigation ditches. The screw pump is commonly attributed to Archimedes on the occasion of his visit to Egypt, but this tradition may reflect only that the apparatus was unknown to the Greeks before Hellenistic times and was introduced in his lifetime by unknown Greek engineers.

Design
Archimedes' screw consists of a screw (a helical surface surrounding a central cylindrical shaft) inside a hollow pipe. The screw is turned usually by a windmill or by manual labour. As the shaft turns, the bottom end scoops up a volume of water. This water will slide up in the spiral tube, until it finally pours out from the top of the tube and feeds the irrigation systems. The screw was used mostly for draining water out of mines or other areas of low lying water. The contact surface between the screw and the pipe does not need to be perfectly watertight, as long as the amount of water being scooped at each turn is large compared to the amount of water leaking out of each section of the screw per turn. Water leaking from one section leaks into the next lower one, so that a sort of mechanical equilibrium is achieved in use.

Archimedes' screw was operated by hand and could raise water efficiently

An Archimedes screw in Huseby, south of Växjö, Sweden

In some designs, the screw is fixed to the casing and they rotate together instead of the screw turning within a stationary casing. A screw could be sealed with pitch resin or some other adhesive to its casing, or cast as a single piece in bronze. Some researchers have postulated this as being the device used to irrigate the Hanging Gardens of Babylon, one of the Seven Wonders of the Ancient World. Depictions of Greek and Roman water screws show them being powered by a human treading on the outer casing to turn the entire apparatus as one piece, which would require that the casing be rigidly attached to the screw.


Uses
Along with transferring water to irrigation ditches, the device was also used for draining land that was underneath the sea in the Netherlands and other places in the creation of polders. A part of the sea would be enclosed and the water would be pumped out of the enclosed area, starting the process of draining the land for use in agriculture. Depending on the length and diameter of the screws, more than one machine could be used successively to lift the same water. An Archimedes' screw was used by British soils engineer John Burland in the successful 2001 stabilization of the Leaning Tower of Pisa. Small amounts of subsoil saturated by groundwater were removed from far below the north side of the Tower, and the weight of the tower itself corrected the lean. Archimedes' screws are used in sewage treatment plants because they cope well with varying rates of flow and with suspended solids. An auger in a snow blower or grain elevator is essentially an Archimedes' screw. Many forms of axial flow pump basically contain an Archimedes' screw. The principle is also found in pescalators, which are Archimedes screws designed to lift fish safely from ponds and transport them to another location. This technology is used primarily at fish hatcheries, where it is desirable to minimize the physical handling of fish. It is also used in chocolate fountains.

History
The invention of the water screw is credited to the Greek polymath Archimedes of Syracuse in the 3rd century BC. A cuneiform inscription of the Assyrian king Sennacherib (704–681 BC) has been interpreted by Dalley to describe the casting of water screws in bronze some 350 years earlier. This is consistent with the classical author Strabo, who describes the Hanging Gardens as watered by screws. A contrary view is expressed by Oleson in an earlier review. The German engineer Konrad Kyeser, in his Bellifortis (1405), equips the Archimedes screw with a crank mechanism. This mechanism soon replaced the ancient practice of working the pipe by treading.

Variants
A screw conveyor is an Archimedes' screw contained within a tube and turned by a motor so as to deliver material from one end of the conveyor to the other. It is particularly suitable for transport of granular materials such as plastic granules used in injection molding, and cereal grains. It may also be used to transport liquids. In industrial control applications the conveyor may be used as a rotary feeder or variable rate feeder to deliver a measured rate or quantity of material into a process.

A variant of the Archimedes' screw can also be found in some injection molding machines, die casting machines and extrusion of plastics, which employ a screw of decreasing pitch to compress and melt the material. Finally, it is also used in a specific type of positive displacement air compressor: the rotary-screw air compressor. On a much larger scale, Archimedes' screws of decreasing pitch are used for the compaction of waste material.

An Archimedes screw seen on a combine harvester

Reverse action
If water is poured into the top of an Archimedes' screw, it will force the screw to rotate. The rotating shaft can then be used to drive an electric generator. Such an installation has the same benefits as using the screw for pumping: the ability to handle very dirty water and widely varying rates of flow at high efficiency. Settle Hydro and Torrs Hydro are two reverse screw micro hydro schemes operating in England. As a generator the screw is good at low heads, commonly found in English rivers, including the Thames powering Windsor Castle.
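The available shaft power of such a reverse-running screw follows the usual hydro relation P = ηρgQH. The sketch below evaluates it for assumed flow, head and efficiency figures; these are illustrative values, not data for Settle Hydro, Torrs Hydro or any other installation.

# Rough hydraulic power available to a reverse-running Archimedes screw
# (screw turbine): P = eta * rho * g * Q * H.
rho = 1000.0   # kg/m^3, water
g = 9.81       # m/s^2
Q = 1.0        # m^3/s volumetric flow (assumed)
H = 2.0        # m of head (assumed)
eta = 0.8      # overall efficiency (assumed)

power_watts = eta * rho * g * Q * H
print(f"Output power ~ {power_watts/1000:.1f} kW")

With these assumed figures the screw would yield roughly 16 kW, which illustrates why the device suits low-head micro hydro sites rather than large power stations.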


Wankel engine
The Wankel engine is a type of internal combustion engine using an eccentric rotary design to convert pressure into rotating motion. Compared with the commonly used reciprocating piston designs, the Wankel engine delivers the advantages of simplicity, smoothness, compactness, high revolutions per minute and a high power-to-weight ratio. The engine is commonly referred to as a rotary engine, though this name also applies to other, completely different designs. Its four-stroke cycle occurs in a moving combustion chamber between the inside of an oval-like epitrochoid-shaped housing and a rotor that is similar in shape to a Reuleaux triangle with sides that are somewhat flatter. The engine was invented by German engineer Felix Wankel. He received his first patent for the engine in 1929, began development in the early 1950s at NSU and completed a working prototype in 1957. NSU subsequently licensed the design to companies around the world, which have continually improved the design.

A cut-away of a Wankel engine shown at the Deutsches Museum in Munich, Germany

Thanks to its compact design and unique advantages over the reciprocating piston engines most commonly in use, Wankel rotary engines have been installed in a variety of vehicles and devices including automobiles, motorcycles, racing cars, aircraft, go-karts, jet skis, snowmobiles, chain saws, and auxiliary power units.

History
First DKM Wankel engine, DKM 54 (Drehkolbenmotor), at the Deutsches Museum in Bonn, Germany

In 1951, the German engineer Felix Wankel began development of the engine at NSU Motorenwerke AG. The KKM 57 (the Wankel rotary engine, Kreiskolbenmotor) was constructed by NSU engineer Hanns Dieter Paschke in 1957 without the knowledge of Felix Wankel, who remarked "you have turned my race horse into a plow mare". The first working prototype, DKM 54, ran on February 1, 1957 at the NSU research and development department Versuchsabteilung TX. It produced 21 horsepower; unlike modern Wankel engines, both the rotor and the housing rotated. In 1960 NSU (the firm the inventor worked for) and the US firm Curtiss-Wright signed an agreement where NSU would concentrate on the development of low and medium powered Wankel engines and Curtiss-Wright would develop high powered Wankel engines, including aircraft engines, of which Curtiss-Wright had decades of experience designing and producing. Curtiss-Wright recruited Max Bentele to head the design team. Many manufacturers licensed the design, attracted by the smoothness, quiet running and reliability resulting from its simplicity. Among the manufacturers signing licensing agreements to develop Wankel engines were Alfa Romeo, American Motors, Citroën, Ford, General Motors, Mazda, Mercedes-Benz, Nissan, Porsche, Rolls-Royce,


Suzuki, and Toyota. In the United States, in 1959 under license from NSU, Curtiss-Wright pioneered improvements in the basic engine design. In Britain, in the 1960s, the Rolls-Royce Motor Car Division pioneered a two-stage diesel version of the Wankel engine. Citroën did much research with their M35 and GS Birotor, using engines produced by Comotor. General Motors seemed to have concluded that the Wankel engine was slightly more expensive to build than an equivalent reciprocating engine; although it claimed to have solved the fuel economy issue, it failed to obtain acceptable exhaust emissions. Mercedes-Benz used the Wankel for their C111 concept car. Despite much research and development throughout the world, only Mazda has produced Wankel engines in large numbers.

First KKM Wankel Engine NSU KKM 57P (Kreiskolbenmotor), at Autovision und Forum, Germany

During research in the 1950s and 1960s problems arose. For a while, engineers were faced with what they called chatter marks and devil's scratches in the inner epitrochoid surface. They discovered that the origin was the apex seals reaching a resonating vibration, and the problem was solved by reducing the thickness and weight of the apex seals. Another early problem, the buildup of cracks in the stator surface, was eliminated by installing the spark plugs in a separate metal piece instead of screwing them directly into the block. A later alternative solution to spark plug boss cooling was provided by a variable coolant velocity scheme for water-cooled rotaries, which has had widespread use and was patented by Curtiss-Wright, with the last-listed approach intended for better air-cooled engine spark plug boss cooling. These approaches did not require a high-conductivity copper insert, but did not preclude its use.

In Britain, Norton Motorcycles developed a Wankel rotary engine for motorcycles, based on the Sachs air-cooled Wankel that powered the DKW/Hercules W-2000 motorcycle, which was included in their Commander and F1; Suzuki also made a production motorcycle with a Wankel engine, the RE-5, where they used ferrotic alloy apex seals and an NSU rotor in a successful attempt to prolong the engine's life. In 1971 and 1972 Arctic Cat produced snowmobiles powered by 303 cc Wankel rotary engines manufactured by Sachs in Germany. Deere & Company designed a version that was capable of using a variety of fuels. The design was proposed as the power source for United States Marine Corps combat vehicles and other equipment in the late 1980s.

Mazda and NSU signed a study contract to develop the Wankel engine in 1961 and competed to bring the first Wankel-powered automobile to market. Although Mazda produced an experimental Wankel that year, NSU was first with a Wankel automobile on sale, the sporty NSU Spider in 1964; Mazda countered with a display of two- and four-rotor Wankel engines at that year's Tokyo Motor Show. In 1967, NSU began production of a Wankel-engined luxury car, the Ro 80. However, NSU had not produced reliable apex seals on the rotor, unlike Mazda and Curtiss-Wright. NSU had problems with apex seal wear, poor shaft lubrication and poor fuel economy, leading to frequent engine failures, which led to large warranty costs that curtailed further NSU Wankel engine development. This premature release of the new Wankel engine gave a poor reputation to all makes, and even when these issues were solved in the last engines produced by NSU in the second half of the 1970s, sales did not recover. NSU, which had merged with Audi, built a new KKM 871 engine in 1979 with side intake ports and 750 cc per chamber, producing 170 hp at 6,500 rpm and 220 N·m at 3,500 rpm. The engine was installed in an Audi 100 body they named the Audi 200; however, the engine was not mass-produced.


Mazda, however, claimed to have solved the apex seal problem, and was able to run test engines at high speed for 300 hours without failure. After years of development, Mazda's first Wankel engine car was the 1967 Cosmo 110S. The company followed with a number of Wankel ("rotary" in the company's terminology) vehicles, including a bus and a pickup truck. Customers often cited the cars' smoothness of operation. However, Mazda chose a method to comply with hydrocarbon emission standards that, while less expensive to produce, increased fuel consumption, unfortunately immediately prior to a sharp rise in fuel prices. Curtiss-Wright produced the RC2-60 engine, which was comparable to a V8 engine in performance and fuel consumption. Unlike NSU, by 1966 Curtiss-Wright had solved the rotor sealing issue, with seals lasting 100,000 miles.

Mazda's first Wankel engine, at the Mazda Museum in Hiroshima, Japan

Mazda later abandoned the Wankel in most of their automotive designs, continuing to use the engine in their sports car range only, producing the RX-7 until August 2002. The company normally used two-rotor designs. A more advanced twin-turbo three-rotor engine was fitted in the 1991 Eunos Cosmo sports car. In 2003, Mazda introduced the Renesis engine fitted in the RX-8. The Renesis engine relocated the ports for exhaust and intake from the periphery of the rotary housing to the sides, allowing for larger overall ports, better airflow, and further power gains. Early Wankel engines also had side intake and exhaust ports, but the concept was abandoned because of carbon buildup in the ports and sides of the rotor. The Renesis engine solved the problem by using a keystone scraper side seal, and approached the thermal distortion difficulties by adding some parts made of ceramic. The Renesis is capable of delivering 238 hp (177 kW) with better fuel economy, reliability, and environmental friendliness than previous Mazda rotary engines, all from a nominal 1.3 L displacement. However, this was not enough to meet ever more stringent emissions standards. Mazda ceased production of their Wankel engine in 2012 after the engine failed to meet the improved Euro 5 emission standard.

In 1961, the Soviet research organizations NATI, NAMI and VNIImotoprom started experimental development and created experimental engines with different technologies. The Soviet automobile manufacturer AvtoVAZ also experimented in Wankel engine design without a license, introducing a limited number of engines in some cars. In 1974 the Soviets created a special engine design bureau, which in 1978 designed an engine designated as VAZ-311. In 1980, the company commenced delivery of the VAZ-411 twin-rotor Wankel engine in VAZ-2106s and Lada cars. Most of the production went to security services, and about 200 were manufactured. The next models were the VAZ-4132 and VAZ-415. Aviadvigatel, the Soviet aircraft engine design bureau, is known to have produced Wankel engines with electronic injection for aircraft and helicopters, though little specific information has surfaced.

American Motors (AMC) was so convinced "... that the rotary engine will play an important role as a powerplant for cars and trucks of the future ..." that Roy D. Chapin Jr., chairman of the smallest U.S. automaker, signed an agreement in February 1973, after a year's negotiations, to build Wankels for both passenger cars and Jeeps, as well as securing the right to sell any rotary engines it produced to other companies. AMC's president, William Luneburg, did not expect dramatic development through 1980, but Gerald C. Meyers, AMC's vice-president of the Product (Engineering) Group, suggested that AMC should buy the engines from Curtiss-Wright before developing its own Wankel engines, and predicted a total transition to rotary power by 1984. Plans called for the engine to be used in the AMC Pacer, but development was pushed back. American Motors designed the unique Pacer around the engine. By 1974, AMC had decided to purchase the General Motors Wankel instead of building an engine in-house. Both General Motors and AMC confirmed the relationship would benefit the marketing of the new engine, with AMC claiming that the General Motors Wankel achieved good fuel economy. However, General Motors' engines had not reached production when the Pacer was launched onto the market.
The 1973 oil crisis played a part in frustrating the uptake of the Wankel engine. Rising fuel prices and concerns about proposed US emission standards legislation added to the doubts. General Motors had not succeeded in producing a Wankel engine that met both the emission requirements and good fuel economy, leading the company to cancel development in 1974. As General Motors was cancelling the Wankel project, it issued the results of its most recent research, which claimed to have solved the fuel economy problem and to have built reliable engines with a lifespan above 530,000 miles. The cancellation of General Motors' Wankel project meant that the AMC Pacer had to be reconfigured to house the AMC straight-6 engine driving the rear wheels.


Design

In the Wankel engine, the four strokes of a typical Otto cycle occur in the space between a three-sided symmetric rotor and the inside of a housing. In each rotor of the Wankel engine, the oval-like epitrochoid-shaped housing surrounds a rotor which is triangular with bow-shaped flanks (often confused with a Reuleaux triangle,[8] a three-pointed curve of constant width, but with the bulge in the middle of each side a bit more flattened). The theoretical shape of the rotor between the fixed corners is the result of a minimization of the volume of the geometric combustion chamber and a maximization of the compression ratio, respectively. The symmetric curve connecting two arbitrary apexes of the rotor is maximized in the direction of the inner housing shape with the constraint that it not touch the housing at any angle of rotation (an arc is not a solution of this optimization problem).

The central drive shaft, called the eccentric shaft or E-shaft, passes through the center of the rotor and is supported by fixed bearings. The rotors ride on eccentrics (analogous to crankpins) integral to the eccentric shaft (analogous to a crankshaft). The rotors both rotate around the eccentrics and make orbital revolutions around the eccentric shaft. Seals at the corners of the rotor seal against the periphery of the housing, dividing it into three moving combustion chambers. The rotation of each rotor on its own axis is caused and controlled by a pair of synchronizing gears.

The Wankel cycle. The "A" marks one of the three apexes of the rotor. The "B" marks the eccentric shaft and the white portion is the lobe of the eccentric shaft. The shaft turns three times for each rotation of the rotor around the lobe and once for each orbital revolution around the eccentric shaft.

A fixed gear mounted on one side of the rotor housing engages a ring gear attached to the rotor and ensures the rotor moves exactly 1/3 turn for each turn of the eccentric shaft. The power output of the engine is not transmitted through the synchronizing gears. The force of gas pressure on the rotor (to a first approximation) goes directly to the center of the eccentric, part of the output shaft.

The best way to visualize the action of the engine in the accompanying diagram is to look not at the rotor itself, but at the cavity created between it and the housing. The Wankel engine is actually a variable-volume progressing-cavity system. Thus there are three cavities per housing, all repeating the same cycle. Note as well that points A and B on the rotor and e-shaft turn at different speeds: point B circles three times as often as point A does, so that one full orbit of the rotor equates to three turns of the e-shaft.

As the rotor rotates and orbitally revolves, each side of the rotor is brought closer to and then away from the wall of the housing, compressing and expanding the combustion chamber like the strokes of a piston in a reciprocating engine. The power vector of the combustion stage goes through the center of the offset lobe. While a four-stroke piston engine makes one combustion stroke per cylinder for every two rotations of the crankshaft (that is, one-half power stroke per crankshaft rotation per cylinder), each combustion chamber in the Wankel generates one combustion stroke per driveshaft rotation, i.e. one power stroke per rotor orbital revolution and three power strokes per rotor rotation.

Thus, power output of a Wankel engine is generally higher than that of a four-stroke piston engine of similar engine displacement in a similar state of tune, and higher than that of a four-stroke piston engine of similar physical dimensions and weight.

Wankel engines generally have a much higher redline than reciprocating engines of similar power output. This is due to the smoothness inherent in circular motion, and because there are no highly stressed parts such as crankshafts, camshafts or connecting rods. Eccentric shafts do not have the stress-related contours of crankshafts. The redline of a rotary engine is limited by the tooth load on the synchronizing gears. Hardened steel gears are used for extended operation above 7,000 or 8,000 rpm. Mazda Wankel engines in auto racing are operated above 10,000 rpm. In aircraft they are used conservatively, up to 6,500 or 7,500 rpm. However, as gas pressure participates in seal efficiency, racing a Wankel engine at high rpm under no-load conditions can destroy the engine.

National agencies that tax automobiles according to displacement and regulatory bodies in automobile racing variously consider the Wankel engine to be equivalent to a four-stroke engine of 1.5 to 2 times the displacement; some racing series ban it altogether.
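The firing-frequency relationship described above can be made concrete with a small comparison; the engine speed, rotor count and cylinder count below are arbitrary example values.

# Power strokes per minute: Wankel vs. four-stroke piston engine.
# A Wankel delivers one power stroke per eccentric-shaft revolution per rotor;
# a four-stroke piston engine delivers one per cylinder every two crank
# revolutions. Speeds and counts are arbitrary example values.
shaft_rpm = 6000
wankel_rotors = 2
piston_cylinders = 4

wankel_strokes = wankel_rotors * shaft_rpm          # 1 per shaft rev per rotor
piston_strokes = piston_cylinders * shaft_rpm / 2   # 1 per 2 revs per cylinder
rotor_rpm = shaft_rpm / 3                           # rotor turns at 1/3 shaft speed

print(f"Wankel ({wankel_rotors} rotors): {wankel_strokes:.0f} power strokes/min, "
      f"rotor speed {rotor_rpm:.0f} rpm")
print(f"Piston ({piston_cylinders} cylinders): {piston_strokes:.0f} power strokes/min")

With these numbers a two-rotor Wankel fires as often as a four-cylinder four-stroke at the same shaft speed, while each Wankel chamber contributes twice as many power strokes per shaft revolution as each piston cylinder.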



Toepler pump
A Toepler pump is a form of mercury piston pump, invented by August Toepler in 1850. The principle is illustrated in the diagram. When reservoir G is lowered, bulb B and tube T are filled with gas from the enclosure being evacuated (through tube A). When G is raised, mercury rises in tube F and cuts off the gas in B and T at C. This gas is then forced through the mercury in tube D into the atmosphere. The end of tube D is bent upward at E to facilitate collection of gas (or vapor). By alternately raising and lowering G, a pumping action results. Clearly tubes F and D must be long enough to support mercury columns corresponding to atmospheric pressure (76 cm at sea level). Instead of using mercury to provide a valving action at C, it is possible to use a glass float valve.

Toepler Pump diagram
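The 76 cm figure quoted above is just the barometric column height for mercury, h = P/(ρg); the short calculation below reproduces it using standard values for atmospheric pressure and mercury density.

# Mercury column height that balances one standard atmosphere: h = P / (rho * g).
P_atm = 101325.0       # Pa, standard atmosphere
rho_mercury = 13595.0  # kg/m^3, mercury near 0 degC
g = 9.80665            # m/s^2

h = P_atm / (rho_mercury * g)
print(f"Mercury column supported by 1 atm: {h*100:.1f} cm")

The result, about 76 cm, is why the tubes of a Toepler pump (and of a mercury barometer) must be at least that long.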


Lobe pump
Lobe pumps are used in a variety of industries including pulp and paper, chemical, food, beverage, pharmaceutical, and biotechnology. They are popular in these diverse industries because they offer superb sanitary qualities, high efficiency, reliability, corrosion resistance and good clean-in-place and steam-in-place (CIP/SIP) characteristics. Rotary pumps can handle solids (e.g., cherries and olives), slurries, pastes, and a variety of liquids. If wetted, they offer self-priming performance. A gentle pumping action minimizes product degradation. They also offer continuous and intermittent reversible flows and can operate dry for brief periods of time. Flow is relatively independent of changes in process pressure, too, so output is relatively constant and continuous.
Lobe pump (5 m³/min or 1,886 barrel/h) of THW

How lobe pumps work


Lobe pumps are similar to external gear pumps in operation in that fluid flows around the interior of the casing. Unlike external gear pumps, however, the lobes do not make contact. Lobe contact is prevented by external timing gears located in the gearbox. Pump shaft support bearings are located in the gearbox, and since the bearings are out of the pumped liquid, pressure is limited by bearing location and shaft deflection.
1. As the lobes come out of mesh, they create expanding volume on the inlet side of the pump. Liquid flows into the cavity and is trapped by the lobes as they rotate.
2. Liquid travels around the interior of the casing in the pockets between the lobes and the casing; it does not pass between the lobes.
3. Finally, the meshing of the lobes forces liquid through the outlet port under pressure.
lobe pump internals

Lobe pumps are frequently used in food applications because they handle solids without damaging the product. The particle size pumped can be much larger in lobe pumps than in other positive displacement types. Since the lobes do not make contact, and clearances are not as close as in other positive displacement pumps, this design handles low viscosity liquids with diminished performance. Loading characteristics are not as good as with other designs, and suction ability is low. High-viscosity liquids require reduced speeds to achieve satisfactory performance. Reductions of 25% of rated speed and lower are common with high-viscosity liquids.


Diffusion pump
Diffusion pumps use a high speed jet of vapor to direct gas molecules in the pump throat down into the bottom of the pump and out the exhaust. Invented in 1915 by Wolfgang Gaede using mercury vapor, and improved by Irving Langmuir and W. Crawford, they were the first type of high vacuum pumps operating in the regime of free molecular flow, where the movement of the gas molecules can be better understood as diffusion than by conventional fluid dynamics. Gaede used the name diffusion pump since his design was based on the finding that gas cannot diffuse against the vapor stream, but will be carried with it to the exhaust. However, the principle of operation might be more precisely described as gas-jet pump, since diffusion plays a role also in other high vacuum pumps. In modern text books, the diffusion pump is categorized as a momentum transfer pump. The diffusion pump is widely used in both industrial and research applications. Most modern diffusion pumps use silicone oil or polyphenyl ethers as the working fluid. Cecil Reginald Burch discovered the possibility of using silicone oil in 1928.

Oil diffusion pumps

Six inch oil diffusion pump.

The oil diffusion pump is operated with an oil of low vapor pressure. Its purpose is to achieve higher vacuum (lower pressure) than is possible by use of positive displacement pumps alone. Although its use has been mainly associated with the high-vacuum range (down to 10⁻⁹ mbar), diffusion pumps today can produce pressures approaching 10⁻¹⁰ mbar when properly used with modern fluids and accessories. The features that make the diffusion pump attractive for high and ultra-high vacuum use are its high pumping speed for all gases and low cost per unit pumping speed when compared with other types of pump used in the same vacuum range. Diffusion pumps cannot discharge directly into the atmosphere, so a mechanical forepump is typically used to maintain an outlet pressure around 0.1 mbar.


The high speed jet is generated by boiling the fluid and directing the vapor through a jet assembly. Note that the oil is gaseous when entering the nozzles. Within the nozzles, the flow changes from laminar to supersonic and molecular. Often several jets are used in series to enhance the pumping action. The outside of the diffusion pump is cooled using either air flow or a water line. As the vapor jet hits the outer cooled shell of the diffusion pump, the working fluid condenses and is recovered and directed back to the boiler. The pumped gases continue flowing to the base of the pump at increased pressure, flowing out through the diffusion pump outlet, where they are compressed to ambient pressure by the secondary mechanical forepump and exhausted.

Unlike turbomolecular pumps and cryopumps, diffusion pumps have no moving parts and as a result are quite durable and reliable. They can function over pressure ranges of 10⁻¹⁰ to 10⁻² mbar. They are driven only by convection and thus have a very low energy efficiency. One major disadvantage of diffusion pumps is the tendency to backstream oil into the vacuum chamber. This oil can contaminate surfaces inside the chamber, or upon contact with hot filaments or electrical discharges may result in carbonaceous or siliceous deposits. Due to backstreaming, oil diffusion pumps are not suitable for use with highly sensitive analytical equipment or other applications which require an extremely clean vacuum environment, although mercury diffusion pumps may be suitable for ultra-high-vacuum chambers used for mercury deposition. Often cold traps and baffles are used to minimize backstreaming, although this results in some loss of pumping ability. The oil of a diffusion pump cannot be exposed to the atmosphere when hot. If this occurs, the oil will burn and has to be replaced.

Diffusion pumps used on the Calutron mass spectrometers during the Manhattan Project.

Diagram of an oil diffusion pump


Steam ejectors
The steam ejector is a popular form of diffusion pump for vacuum distillation and freeze-drying. A jet of steam entrains the vapour that must be removed from the vacuum chamber. Steam ejectors can have a single or multiple stages, with and without condensers in between the stages.

Compressed-air ejectors
One class of diffusion vacuum pumps is the multistage compressed-air driven ejector. It is very popular in applications where objects are moved around using suction cups and vacuum lines.

Plot of pumping speed as a function of pressure for a diffusion pump.

Early Langmuir mercury diffusion pump (vertical column) and its backing pump (in background), about 1920. The diffusion pump was widely used in manufacturing vacuum tubes, the key technology which dominated the radio and electronics industry for 50 years.


Turbomolecular pump
A turbomolecular pump is a type of vacuum pump, superficially similar to a turbopump, used to obtain and maintain high vacuum. These pumps work on the principle that gas molecules can be given momentum in a desired direction by repeated collision with a moving solid surface. In a turbomolecular pump, a rapidly spinning turbine rotor 'hits' gas molecules from the inlet of the pump towards the exhaust in order to create or maintain a vacuum.

Operating principles
Interior view of a turbomolecular pump
Most turbomolecular pumps employ multiple stages consisting of rotor/stator pairs mounted in series. Gas captured by the upper stages is pushed into the lower stages and successively compressed to the level of the fore-vacuum (backing pump) pressure. As the gas molecules enter through the inlet, the rotor, which has a number of angled blades, hits the molecules. Thus the mechanical energy of the blades is transferred to the gas molecules. With this newly acquired momentum, the gas molecules enter into the gas transfer holes in the stator. This leads them to the next stage, where they again collide with the rotor surface, and this process is continued, finally leading them outwards through the exhaust. Because of the relative motion of rotor and stator, molecules preferentially hit the lower side of the blades. Because the blade surface looks down, most of the scattered molecules will leave it downwards. The surface is rough, so no reflection will occur. A blade needs to be thick and stable for high pressure operation and as thin as possible and slightly bent for maximum compression. For high compression ratios the throat between adjacent rotor blades (as shown in the image) points as much as possible in the forward direction. For high flow rates the blades are at 45° and reach close to the axis. Because the compression of each stage is ~10, each stage closer to the outlet is considerably smaller than the preceding inlet stages. This has two consequences. The geometric progression tells us that infinitely many stages could ideally fit into a finite axial length; the finite length in this case is the full height of the housing, as the bearings, the motor, the controller and some of the coolers can be installed inside on the axis. Radially, to capture as much of the thin gas at the entrance as possible, the inlet-side rotors would ideally have a larger radius and correspondingly higher centrifugal force; ideal blades would get exponentially thinner towards their tips, and carbon fibers should reinforce the aluminium blades. However, because the average speed of a blade affects pumping so much, this is done by increasing the root diameter rather than the tip diameter where practical.
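Since the per-stage compression ratios multiply, even a modest number of rotor/stator pairs gives a very large overall compression. A minimal sketch, using the ~10 per-stage figure quoted above together with an assumed stage count and backing pressure (both assumptions, not pump data):

    # Overall compression of a multi-stage turbomolecular pump.
    per_stage_compression = 10.0   # per-stage compression ratio (quoted above)
    num_stages = 8                 # number of rotor/stator pairs (assumed)

    total_compression = per_stage_compression ** num_stages
    print(f"overall compression ratio ~ {total_compression:.1e}")

    # With an assumed backing (foreline) pressure, the compression-limited
    # inlet partial pressure for such a gas would be roughly:
    p_backing = 1e-2               # mbar, assumed
    print(f"inlet pressure limit ~ {p_backing / total_compression:.1e} mbar")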

Schematic of a turbomolecular pump.

Turbomolecular pumps must operate at very high speeds, and the friction heat buildup imposes design limitations. Some turbomolecular pumps use magnetic bearings to reduce friction and oil contamination. Because the magnetic bearings and the temperature cycles allow for only a limited clearance between rotor and stator, the blades at the high pressure stages are somewhat degenerated into a single helical foil each. Laminar flow cannot be used for pumping, because laminar turbines stall when not used at the designed flow. The pump can be cooled down to improve the compression, but should not be so cold as to condense ice on the blades. When a turbopump is stopped, the oil from the backing vacuum may backstream through the turbopump and contaminate the chamber. One way to prevent this is to introduce a laminar flow of nitrogen through the pump. The transition from vacuum to nitrogen and from a running to a still turbopump has to be synchronized precisely to avoid mechanical stress to the pump and overpressure at the exhaust. A thin membrane and a valve at the exhaust should be added to protect the turbopump from excessive back pressure (e.g. after a power failure or leaks in the backing vacuum). The rotor is stabilized in all of its six degrees of freedom. One degree is governed by the electric motor. Minimally, this degree must be stabilized electronically (or by a diamagnetic material, which is too unstable to be used in a precision pump bearing). Another way (ignoring losses in magnetic cores at high frequencies) is to construct this bearing as an axis with a sphere at each end. These spheres are inside hollow static spheres. On the surface of each sphere is a checkerboard pattern of inwards and outwards going magnetic field lines. As the checkerboard pattern of the static spheres is rotated, the rotor rotates. In this construction no axis is made stable at the cost of making another axis unstable; all axes are neutral, so the electronic regulation is less stressed and will be more dynamically stable. Hall effect sensors can be used to sense the rotational position, and the other degrees of freedom can be measured capacitively.


Maximum pressure
A turbomolecular pump made by Edwards with attached vacuum ionization gauge for pressure measurement.
At atmospheric pressure, the mean free path of air is about 70 nm. A turbomolecular pump can work only if the molecules hit by the moving blades reach the stationary blades before colliding with other molecules on their way. To achieve that, the gap between moving blades and stationary blades must be close to or less than the mean free path. From a practical construction standpoint, a feasible gap between the blade sets is on the order of 1 mm, so a turbopump will stall (no net pumping) if exhausted directly to the atmosphere. Since the mean free path is inversely proportional to pressure, a turbopump will pump when the exhaust pressure is less than about 10 Pa (0.10 mbar), where the mean free path is about 0.7 mm. Most turbopumps have a Holweck pump (or molecular drag pump) as their last stage to increase the maximum backing pressure (exhaust pressure) to about 1-10 mbar. Theoretically, a centrifugal pump, a side channel pump, or a regenerative pump could be used to back to atmospheric pressure directly, but currently there is no commercially available turbopump that exhausts directly to atmosphere.[1] In most cases, the exhaust is connected to a mechanical backing pump (usually called a roughing pump) that produces a pressure low enough for the turbomolecular pump to work efficiently. Typically, this backing pressure is below 0.1 mbar and commonly about 0.01 mbar. The backing pressure is rarely below 10⁻³ mbar (mean free path about 70 mm), because the flow resistance of the vacuum pipe between the turbopump and the roughing pump becomes significant.
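The mean free path figures quoted above follow from kinetic theory, λ = k_B·T / (√2·π·d²·p). A short sketch, assuming an effective molecular diameter for air of about 0.37 nm, reproduces the ~70 nm (atmospheric) and ~0.7 mm (10 Pa) values:

    import math

    # Mean free path of air: lambda = k_B * T / (sqrt(2) * pi * d**2 * p).
    # d is an effective molecular diameter; the value below is an assumption for air.
    k_B = 1.380649e-23        # Boltzmann constant, J/K
    T = 293.0                 # temperature, K
    d = 3.7e-10               # effective molecular diameter of air, m (assumed)

    def mean_free_path(p_pa):
        """Mean free path in metres at pressure p_pa (pascal)."""
        return k_B * T / (math.sqrt(2) * math.pi * d**2 * p_pa)

    for p in (101325.0, 10.0, 0.1):   # atmosphere, 10 Pa, 0.1 Pa
        print(f"p = {p:>9.3g} Pa  ->  mean free path = {mean_free_path(p)*1e3:.3g} mm")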

The turbomolecular pump can be a very versatile pump. It can generate many degrees of vacuum, from intermediate vacuum (~10⁻² Pa) up to ultra-high vacuum levels (~10⁻⁸ Pa). Multiple turbomolecular pumps in a lab or manufacturing plant can be connected by tubes to a small backing pump. Automatic valves and a diffusion-pump-like injection into a large buffer tube in front of the backing pump prevent overpressure from one pump from stalling another pump.


Practical considerations
Laws of fluid dynamics do not apply in high vacuum environments. The maximum compression varies linearly with circumferential rotor speed. In order to obtain extremely low pressures down to 1 micropascal, rotation rates of 20,000 to 90,000 revolutions per minute are often necessary. Unfortunately, the compression ratio varies exponentially with the square root of the molecular weight of the gas. Thus, heavy molecules are pumped much more efficiently than light molecules. Most gases are heavy enough to be well pumped but it is difficult to pump hydrogen and helium efficiently. An additional drawback stems from the high rotor speed of this type of pump: very high grade bearings are required, which increase the cost. Because turbomolecular pumps only work in molecular flow conditions, a pure turbomolecular pump will require a very large backing pump to work effectively. Thus, many modern pumps have a molecular drag stage such as a Holweck or Gaede mechanism near the exhaust to reduce the size of backing pump required.
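A minimal sketch of the scaling described above, using an arbitrary assumed constant rather than real pump data, shows how strongly the compression ratio favours heavy gases over hydrogen and helium:

    import math

    # The text states that the compression ratio scales roughly exponentially with
    # the square root of the molecular weight. The constant c below is an arbitrary
    # assumption chosen only to illustrate relative behaviour, not a real pump spec.
    c = 4.0

    def compression_ratio(molar_mass_g_per_mol):
        return math.exp(c * math.sqrt(molar_mass_g_per_mol))

    for gas, M in (("H2", 2.0), ("He", 4.0), ("N2", 28.0)):
        print(f"{gas:>3}: K ~ {compression_ratio(M):.2e}")

With these assumed numbers nitrogen is compressed many orders of magnitude better than hydrogen, which is why hydrogen and helium set the ultimate pressure in many turbopumped systems.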

A turbopump by Pfeiffer Vacuum attached to a thin film deposition system for organic electronics research

History
The turbomolecular pump was invented in 1958 by Becker, based on the older molecular drag pumps developed by Gaede in 1913, Holweck in 1923 and Siegbahn in 1944.


Outgassing
Outgassing (sometimes called offgassing, particularly when in reference to indoor air quality) is the release of a gas that was dissolved, trapped, frozen or absorbed in some material. As an example, research has shown how the concentration of carbon dioxide in the Earth's atmosphere has sometimes been linked to ocean outgassing. Outgassing can include sublimation and evaporation which are phase transitions of a substance into a gas, as well as desorption, seepage from cracks or internal volumes and gaseous products of slow chemical reactions. Boiling is generally thought of as a separate phenomenon from outgassing because it consists of a phase transition of a liquid into a vapor made of the same substance.

Outgassing in a vacuum
Outgassing is a challenge to creating and maintaining clean high-vacuum environments. NASA and ESA maintain lists of low-outgassing materials to be used for spacecraft, as outgassing products can condense onto optical elements, thermal radiators, or solar cells and obscure them. Materials not normally considered absorbent can release enough light-weight molecules to interfere with industrial or scientific vacuum processes. Moisture, sealants, lubricants, and adhesives are the most common sources, but even metals and glasses can release gases from cracks or impurities. The rate of outgassing increases at higher temperatures because the vapour pressure and the rate of chemical reaction increase. For most solid materials, the method of manufacture and preparation can reduce the level of outgassing significantly. Cleaning surfaces or baking individual components or the entire assembly before use can drive off volatiles. NASA's Stardust space probe suffered reduced image quality due to an unknown contaminant that had condensed on the CCD sensor of the navigation camera. A similar problem affected the Cassini-Huygens space probe's Narrow Angle Camera, but was corrected by repeatedly heating the system to 4 degrees Celsius. A comprehensive characterisation of outgassing effects using mass spectrometers could be obtained for ESA's Rosetta spacecraft.
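The practical impact of outgassing on a vacuum system can be estimated by dividing the total gas load by the available pumping speed. The sketch below uses assumed values for the specific outgassing rate, chamber surface area, and pumping speed; none of these numbers come from the text:

    # Rough gas-load estimate for an outgassing-limited vacuum chamber.
    # Steady-state pressure ~ total outgassing rate / effective pumping speed.
    q = 1e-7          # specific outgassing rate, mbar*L/(s*cm^2) (assumed)
    A = 5000.0        # internal surface area of the chamber, cm^2 (assumed)
    S = 300.0         # effective pumping speed at the chamber, L/s (assumed)

    Q = q * A                     # total gas load, mbar*L/s
    p_ultimate = Q / S            # outgassing-limited pressure, mbar

    print(f"gas load Q = {Q:.1e} mbar*L/s")
    print(f"outgassing-limited pressure ~ {p_ultimate:.1e} mbar")

This kind of estimate is what motivates baking and careful material selection: reducing q by an order of magnitude lowers the attainable pressure by the same factor.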

Outgassing from rock


Outgassing is the source of many tenuous atmospheres of terrestrial planets or moons. Many materials are volatile relative to the extreme vacuum of space, such as around the Earth's Moon, and may evaporate or even boil at ambient temperature. Materials on the lunar surface have completely outgassed and been ripped away by solar winds long ago, but volatile materials may remain at depth. Once released, gases almost always are less dense than the surrounding rocks and sand and seep toward the surface. The lunar atmosphere probably originates from outgassing of warm material below the surface. At the Earth's tectonic divergent boundaries where new crust is being created, helium and carbon dioxide are some of the volatiles being outgassed from mantle magma.

Outgassing in a closed environment


Outgassing can be significant if it collects in a closed environment where air is stagnant or recirculated. This is, for example, the origin of new car smell. Even a nearly odourless material such as wood may build up a strong smell if kept in a closed box for months. There is some concern that softeners and solvents that are released from many industrial products, especially plastics, may be harmful to human health. Some types of RTV sealants outgas the poison cyanide for weeks after application. These outgassing poisons are of great concern in the design of submarines and space stations.


Coating
A coating is a covering that is applied to the surface of an object, usually referred to as the substrate. The purpose of applying the coating may be decorative, functional, or both. The coating itself may be an all-over coating, completely covering the substrate, or it may only cover parts of the substrate. An example of all of these types of coating is a product label on many drinks bottles: one side has an all-over functional coating (the adhesive) and the other side has one or more decorative coatings in an appropriate pattern (the printing) to form the words and images. Paints and lacquers are coatings that mostly have dual uses of protecting the substrate and being decorative, although some artists' paints are only for decoration, and the paint on large industrial pipes is presumably only for the function of preventing corrosion. Functional coatings may be applied to change the surface properties of the substrate, such as adhesion, wettability, corrosion resistance, or wear resistance. In other cases, e.g. semiconductor device fabrication (where the substrate is a wafer), the coating adds a completely new property such as a magnetic response or electrical conductivity and forms an essential part of the finished product. A major consideration for most coating processes is that the coating is to be applied at a controlled thickness, and a number of different processes are in use to achieve this control, ranging from a simple brush for painting a wall, to some very expensive machinery applying coatings in the electronics industry. A further consideration for 'non-all-over' coatings is that control is needed as to where the coating is to be applied. A number of these non-all-over coating processes are printing processes. Many industrial coating processes involve the application of a thin film of functional material to a substrate, such as paper, fabric, film, foil, or sheet stock. If the substrate starts and ends the process wound up in a roll, the process may be termed "roll-to-roll" or "web-based" coating. A roll of substrate, when wound through the coating machine, is typically called a web. Coatings may be applied as liquids, gases or solids.


Functions of coatings
- Adhesive – adhesive tape, pressure-sensitive labels, iron-on fabric
- Changing adhesion properties
  - Non-stick – PTFE-coated cooking pans
  - Release coatings – e.g. silicone-coated release liners for many self-adhesive products
  - Primers – encourage subsequent coatings to adhere well (also sometimes have anti-corrosive properties)
- Optical coatings
  - Reflective coatings for mirrors
  - Anti-reflective coatings – e.g. on spectacles
  - UV-absorbent coatings for protection of eyes or increasing the life of the substrate
  - Tinted – as used in some coloured lighting, tinted glazing, or sunglasses

- Catalytic – e.g. some self-cleaning glass
- Light-sensitive – as previously used to make photographic film
- Protective
  - Most paints are to some extent protecting the substrate
  - Hard anti-scratch coatings on plastics and other materials, e.g. of titanium nitride, to reduce scratching, improve wear resistance, etc.
  - Anti-corrosion – underbody sealant for cars, many plating products
  - Waterproof fabric and waterproof paper
  - Antimicrobial surface
- Magnetic properties – such as for magnetic media like cassette tapes and floppy disks
- Electrical or electronic properties
  - Conductive coatings – e.g. to manufacture some types of resistors
  - Insulating coatings – e.g. on magnet wires used in transformers
- Scent properties – such as scratch-and-sniff stickers and labels

Coating processes
Coating processes may be classified as follows:

Vapor deposition
Chemical vapor deposition
- Metalorganic vapour phase epitaxy
- Electrostatic spray assisted vapour deposition (ESAVD)
- Sherardizing
- Some forms of Epitaxy
- Molecular beam epitaxy

Physical vapor deposition
- Cathodic arc deposition
- Electron beam physical vapor deposition (EBPVD)
- Ion plating
- Ion beam assisted deposition (IBAD)
- Magnetron sputtering
- Pulsed laser deposition
- Sputter deposition
- Vacuum deposition
- Vacuum evaporation, evaporation (deposition)


Chemical and electrochemical techniques


Conversion coating
- Anodising
- Chromate conversion coating
- Plasma electrolytic oxidation
- Phosphate (coating)
Ion beam mixing
Pickled and oiled, a type of plate steel coating
Plating
- Electroless plating
- Electroplating

Spraying
- Spray painting
- High velocity oxygen fuel (HVOF)
- Plasma spraying
- Thermal spraying

- Plasma transferred wire arc thermal spraying
- The common forms of powder coating

Roll-to-roll coating processes


Common roll-to-roll coating processes include:
- Air knife coating
- Anilox coater
- Flexo coater
- Gap coating
  - Knife-over-roll coating
- Gravure coating
- Hot melt coating – when the necessary coating viscosity is achieved by temperature rather than solution of the polymers etc. This method commonly implies slot-die coating above room temperature, but it also is possible to have hot-melt roller coating, hot-melt metering-rod coating, etc.
- Immersion (dip) coating
- Kiss coating
- Metering rod (Meyer bar) coating

- Roller coating
  - Forward roller coating
  - Reverse roll coating
- Silk screen coater
  - Rotary screen
- Slot die coating
  - Extrusion coating – generally high pressure, often high temperature, and with the web travelling much faster than the speed of the extruded polymer.
  - Curtain coating – low viscosity, with the slot vertically above the web and a gap between slotdie and web.
  - Slide coating – bead coating with an angled slide between the slotdie and the bead. Very successfully used for multilayer coating in the photographic industry.
  - Slot die bead coating – typically with the web backed by a roller and a very small gap between slotdie and web.
  - Tensioned-web slotdie coating – with no backing for the web.
- Inkjet printing
- Lithography
- Flexography
- Some dip coating processes



Wafer (electronics)
In electronics, a wafer (also called a slice or substrate) is a thin slice of semiconductor material, such as a silicon crystal, used in the fabrication of integrated circuits and other microdevices. The wafer serves as the substrate for microelectronic devices built in and over the wafer and undergoes many microfabrication process steps such as doping or ion implantation, etching, deposition of various materials, and photolithographic patterning. Finally the individual microcircuits are separated (dicing) and packaged. Several types of solar cell are also made from such wafers. On a solar wafer a solar cell (usually square) is made from the entire wafer.

Polished 12" and 6" silicon wafers. The notch in the left wafer and the flat cut into the right wafer indicates its crystallographic orientation (see below)

VLSI microcircuits fabricated on a 12-inch (300mm) silicon wafer, before dicing and packaging


Formation
Wafers are formed of highly pure (99.9999999% purity), nearly defect-free single crystalline material. One process for forming crystalline wafers is known as Czochralski growth, invented by the Polish chemist Jan Czochralski. In this process, a cylindrical ingot of high purity monocrystalline semiconductor, such as silicon or germanium, is formed by pulling a seed crystal from a 'melt'. Dopant impurity atoms, such as boron or phosphorus in the case of silicon, can be added to the molten intrinsic material in precise amounts in order to dope the crystal, thus changing it into n-type or p-type extrinsic semiconductor.

The Czochralski process.

2-inch (51 mm), 4-inch (100 mm), 6-inch (150 mm), and 8-inch (200 mm) wafers
The ingot is then sliced with a wafer saw (wire saw) and polished to form wafers. The size of wafers for photovoltaics is 100–200 mm square and the thickness is 200–300 µm. In the future, 160 µm will be the standard. Electronics use wafer sizes from 100 to 450 mm diameter. (The largest wafers made have a diameter of 450 mm but are not yet in general use.)

Cleaning, texturing and etching


Wafers are cleaned with weak acids to remove unwanted particles, or repair damage caused during the sawing process. When used for solar cells, the wafers are textured to create a rough surface to increase their efficiency. The generated PSG (phosphosilicate glass) is removed from the edge of the wafer in the etching.


Wafer properties

Standard wafer sizes

Silicon wafers are available in a variety of diameters from 25.4 mm (1 inch) to 300 mm (11.8 inches). Semiconductor fabrication plants (also known as fabs) are defined by the diameter of wafers that they are tooled to produce. The diameter has gradually increased to improve throughput and reduce cost, with the current state-of-the-art fab considered to be 300 mm (12 inch) and the next standard projected to be 450 mm (18 inch). Intel, TSMC and Samsung were separately conducting research toward the advent of 450 mm "prototype" (research) fabs by 2012, though serious hurdles remain. Dean Freeman, an analyst with Gartner Inc., predicted that production fabs could emerge sometime in the 2017 to 2019 timeframe; much of that will depend on a plethora of new technological breakthroughs rather than simply extending current technology. Atul Srivastava, an analyst at MarketsandMarkets, said that GaN wafers are also going to compete with Si in several verticals across the industry due to superior characteristics such as thermal conductivity. Ammonothermal, HVPE and Na-flux LPE are among the preferred manufacturing processes for GaN wafers, with a standard size of 2 inches. Wafer sizes and typical thicknesses include:
1-inch (25 mm)
2-inch (51 mm). Thickness 275 µm.

Solar wafers on the conveyor

Completed solar wafer

3-inch (76 mm). Thickness 375 µm.
4-inch (100 mm). Thickness 525 µm.
5-inch (130 mm) or 125 mm (4.9 inch). Thickness 625 µm.
150 mm (5.9 inch, usually referred to as "6 inch"). Thickness 675 µm.
200 mm (7.9 inch, usually referred to as "8 inch"). Thickness 725 µm.
300 mm (11.8 inch, usually referred to as "12 inch"). Thickness 775 µm.
450 mm (17.7 inch, usually referred to as "18 inch"). Thickness 925 µm (expected).

Wafers grown using materials other than silicon will have different thicknesses than a silicon wafer of the same diameter. Wafer thickness is determined by the mechanical strength of the material used; the wafer must be thick enough to support its own weight without cracking during handling. A unit wafer fabrication step, such as an etch step or a lithography step, can be performed on more chips per wafer as roughly the square of the increase in wafer diameter, while the cost of the unit fabrication step goes up more slowly than the square of the wafer diameter. This is the cost basis for shifting to larger and larger wafer sizes. Conversion to 300 mm wafers from 200 mm wafers began in earnest in 2000, and reduced the price per die about 30-40%. However, this was not without significant problems for the industry. The next step to 450 mm should accomplish similar productivity gains as the previous size increase. However, machinery needed to handle and process larger wafers results in increased investment costs to build a single factory. There is considerable resistance to moving up to 450 mm despite the productivity enhancements, mainly because companies feel it would take too long to recoup their investment. The difficult and costly 300 mm process only accounted for approximately 20% of worldwide capacity on a square-inch basis by the end of 2005. The step up to 300 mm required a major change from the past, with fully automated factories using 300 mm wafers versus barely automated factories for the 200 mm wafers. These major investments were undertaken in the economic downturn following the dot-com bubble, resulting in huge resistance to upgrading to 450 mm by the original timeframe. Other initial technical problems in the ramp up to 300 mm included vibrational effects, gravitational bending (sag), and problems with flatness. Among the new problems in the ramp up to 450 mm are that the crystal ingots will be 3 times heavier (total weight a metric ton) and take 2-4 times longer to cool, and the process time will be double. All told, the development of 450 mm wafers requires significant engineering, time, and cost to overcome.

Analytical die count estimation

For any given wafer diameter [d, mm] and target IC size [S, mm2], there is an exact number of integral die pieces that can be sliced out of the wafer. The gross Die Per Wafer (DPW) can be estimated by the following first-order expression:
DPW = π d² / (4 S) − π d / √(2 S)


Note that the gross die count does not take into account die defect loss, various alignment markings, and test sites on the wafer.
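The estimate above can be evaluated directly. The sketch below implements the first-order expression; the example die size (10 mm × 10 mm) is an assumption chosen only for illustration:

    import math

    # Gross die-per-wafer estimate: DPW = pi*d^2/(4*S) - pi*d/sqrt(2*S),
    # with d the wafer diameter (mm) and S the die area (mm^2).
    # Defect losses, test sites and edge exclusion are ignored.
    def gross_die_per_wafer(d_mm, die_area_mm2):
        return math.floor(math.pi * d_mm**2 / (4 * die_area_mm2)
                          - math.pi * d_mm / math.sqrt(2 * die_area_mm2))

    # Example: a 10 mm x 10 mm die (S = 100 mm^2) on 200 mm and 300 mm wafers.
    for d in (200, 300):
        print(f"{d} mm wafer: ~{gross_die_per_wafer(d, 100.0)} gross dies")

For this assumed die size the count rises from roughly 270 to roughly 640 dies, which illustrates the better-than-doubling of chips per wafer that motivates larger diameters.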

Crystalline orientation
Wafers are grown from crystal having a regular crystal structure, with silicon having a diamond cubic structure with a lattice spacing of 5.430710 Å (0.5430710 nm). When cut into wafers, the surface is aligned in one of several relative directions known as crystal orientations. Orientation is defined by the Miller index, with (100) or (111) faces being the most common for silicon. Orientation is important since many of a single crystal's structural and electronic properties are highly anisotropic. Ion implantation depths depend on the wafer's crystal orientation, since each direction offers distinct paths for transport. Wafer cleavage typically occurs only in a few well-defined directions. Scoring the wafer along cleavage planes allows it to be easily diced into individual chips ("dies") so that the billions of individual circuit elements on an average wafer can be separated into many individual circuits.
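For a cubic crystal such as silicon, the spacing of the (hkl) planes follows the standard relation d = a/√(h² + k² + l²). A short sketch using the lattice constant quoted above, shown only to illustrate how the Miller indices of the wafer surface relate to plane spacing:

    import math

    # Interplanar spacing of a cubic lattice, d(hkl) = a / sqrt(h^2 + k^2 + l^2),
    # evaluated for the silicon lattice constant quoted above.
    a = 0.5430710   # nm

    def d_spacing(h, k, l):
        return a / math.sqrt(h*h + k*k + l*l)

    for hkl in ((1, 0, 0), (1, 1, 0), (1, 1, 1)):
        print(f"d{hkl} = {d_spacing(*hkl):.4f} nm")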

Diamond Cubic Crystal Structure, Silicon unit cell

Wafer flats and crystallographic orientation notches


Wafers under 200mm diameter have flats cut into one or more sides indicating the crystallographic planes of the wafer (usually a {110} face). In earlier-generation wafers a pair of flats at different angles additionally conveyed the doping type (see illustration for conventions). Wafers of 200mm diameter and above use a single small notch to convey wafer orientation, with no visual indication of doping type.

Impurity doping

Silicon wafers are generally not 100% pure silicon, but are instead formed with an initial impurity doping concentration between 10¹³ and 10¹⁶ atoms per cm³ of boron, phosphorus, arsenic, or antimony, which is added to the melt and defines the wafer as either bulk n-type or p-type. However, compared with single-crystal silicon's atomic density of 5×10²² atoms per cm³, this still gives a purity greater than 99.9999%. The wafers can also be initially provided with some interstitial oxygen concentration. Carbon and metallic contamination are kept to a minimum. Transition metals, in particular, must be kept below parts per billion concentrations for electronic applications.
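The purity figure quoted above follows directly from the ratio of dopant atoms to silicon atoms, as the following short sketch shows:

    # Purity implied by the doping levels quoted above: dopant concentrations
    # of 1e13 to 1e16 atoms/cm^3 against silicon's ~5e22 atoms/cm^3.
    n_si = 5e22                      # silicon atoms per cm^3
    for n_dopant in (1e13, 1e16):
        purity = 1.0 - n_dopant / n_si
        print(f"dopant {n_dopant:.0e} cm^-3 -> purity {purity*100:.7f} %")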

Flats can be used to denote doping and crystallographic orientation. Red represents material that has been removed.


Compound semiconductors
While silicon is the prevalent material for wafers used in the electronics industry, other compound III-V or II-VI materials have also been employed. Gallium arsenide (GaAs), a III-V semiconductor produced via the Czochralski process, is also a common wafer material.


Substrate (electronics)
Substrate (also called a wafer) is a solid (usually planar) substance onto which a layer of another substance is applied, and to which that second substance adheres. In solid-state electronics, this term refers to a thin slice of material such as silicon, silicon dioxide, aluminum oxide, sapphire, germanium, gallium arsenide (GaAs), an alloy of silicon and germanium, or indium phosphide (InP). These serve as the foundation upon which electronic devices such as transistors, diodes, and especially integrated circuits (ICs) are deposited. Note that a substrate in the field of electronics is either a semiconductor or an electrical insulator, depending on the fabrication process that is being used. For the cases in which an insulator such as silicon oxide or aluminum oxide is used as the substrate, the process is as follows. On top of the oxide, a thin layer of semiconducting material, usually pure silicon, is deposited. Next, using the standard photographic processes repeatedly, transistors and diodes are fabricated in the semiconductor. The advantage of this (more costly) fabrication process is that the oxide layer can provide superior insulation between adjacent transistors. This process is especially used for electronics which must withstand ionizing radiation, such as in space exploration missions through the Van Allen Radiation Belts; in military and naval systems which might have to withstand nuclear radiation; and in instrumentation for nuclear reactors. In the manufacture of ICs, the substrate material is usually formed into or cut out as thin discs called wafers, into which the individual electronic devices (transistors, etc.) are etched, deposited, or otherwise fabricated.

Chemical vapor deposition


Chemical vapor deposition (CVD) is a chemical process used to produce high-purity, high-performance solid materials. The process is often used in the semiconductor industry to produce thin films. In typical CVD, the wafer (substrate) is exposed to one or more volatile precursors, which react and/or decompose on the substrate surface to produce the desired deposit. Frequently, volatile by-products are also produced, which are removed by gas flow through the reaction chamber. Microfabrication processes widely use CVD to deposit materials in various forms, including: monocrystalline, polycrystalline, amorphous, and epitaxial. These materials include: silicon, carbon fiber, carbon nanofibers, filaments, carbon nanotubes, SiO2, silicon-germanium, tungsten, silicon carbide, silicon nitride, silicon oxynitride, titanium nitride, and various high-k dielectrics. CVD is also used to produce synthetic diamonds.

DC plasma (violet) enhances the growth of carbon nanotubes in laboratory-scale PECVD apparatus


Types
CVD is practiced in a variety of formats. These processes generally differ in the means by which chemical reactions are initiated.
Classified by operating pressure:
- Atmospheric pressure CVD (APCVD) – CVD at atmospheric pressure.
- Low-pressure CVD (LPCVD) – CVD at sub-atmospheric pressures. Reduced pressures tend to reduce unwanted gas-phase reactions and improve film uniformity across the wafer.
- Ultrahigh vacuum CVD (UHVCVD) – CVD at very low pressure, typically below 10⁻⁶ Pa (~10⁻⁸ Torr). Note that in other fields, a lower division between high and ultra-high vacuum is common, often 10⁻⁷ Pa.
Most modern CVD is either LPCVD or UHVCVD.
Classified by physical characteristics of vapor:
- Aerosol assisted CVD (AACVD) – CVD in which the precursors are transported to the substrate by means of a liquid/gas aerosol, which can be generated ultrasonically. This technique is suitable for use with non-volatile precursors.
- Direct liquid injection CVD (DLICVD) – CVD in which the precursors are in liquid form (liquid or solid dissolved in a convenient solvent). Liquid solutions are injected in a vaporization chamber towards injectors (typically car injectors). The precursor vapors are then transported to the substrate as in classical CVD. This technique is suitable for use on liquid or solid precursors. High growth rates can be reached using this technique.
- Plasma methods (see also Plasma processing):
  - Microwave plasma-assisted CVD (MPCVD)
  - Plasma-enhanced CVD (PECVD) – CVD that utilizes plasma to enhance chemical reaction rates of the precursors. PECVD processing allows deposition at lower temperatures, which is often critical in the manufacture of semiconductors. The lower temperatures also allow for the deposition of organic coatings, such as plasma polymers, that have been used for nanoparticle surface functionalization.
  - Remote plasma-enhanced CVD (RPECVD) – Similar to PECVD except that the wafer substrate is not directly in the plasma discharge region. Removing the wafer from the plasma region allows processing temperatures down to room temperature.
- Atomic-layer CVD (ALCVD) – Deposits successive layers of different substances to produce layered, crystalline films. See Atomic layer epitaxy.
- Combustion chemical vapor deposition (CCVD) – Combustion chemical vapour deposition or flame pyrolysis is an open-atmosphere, flame-based technique for depositing high-quality thin films and nanomaterials.
- Hot-wire CVD (HWCVD) – Also known as catalytic CVD (Cat-CVD) or hot filament CVD (HFCVD), this process uses a hot filament to chemically decompose the source gases.
- Hybrid physical-chemical vapor deposition (HPCVD) – This process involves both chemical decomposition of precursor gas and vaporization of a solid source.
- Metalorganic chemical vapor deposition (MOCVD) – This CVD process is based on metalorganic precursors.
- Rapid thermal CVD (RTCVD) – This CVD process uses heating lamps or other methods to rapidly heat the wafer substrate. Heating only the substrate rather than the gas or chamber walls helps reduce unwanted gas-phase reactions that can lead to particle formation.
- Vapor-phase epitaxy (VPE)
Hot-wall thermal CVD (batch operation type)

Plasma assisted CVD

- Photo-initiated CVD (PICVD) – This process uses UV light to stimulate chemical reactions. It is similar to plasma processing, given that plasmas are strong emitters of UV radiation. Under certain conditions, PICVD can be operated at or near atmospheric pressure.


Uses
CVD is commonly used to deposit conformal films. A variety of applications for such films exist. Gallium arsenide is used in some integrated circuits (ICs). Amorphous polysilicon is used in photovoltaic devices. Certain carbides and nitrides confer wear-resistance.

Commercially important materials prepared by CVD


Polysilicon
Polycrystalline silicon is deposited from trichlorosilane (SiHCl3) or silane (SiH4), using the following reactions:
SiH3Cl → Si + H2 + HCl
SiH4 → Si + 2 H2
This reaction is usually performed in LPCVD systems, with either pure silane feedstock, or a solution of silane with 70–80% nitrogen. Temperatures between 600 and 650 °C and pressures between 25 and 150 Pa yield a growth rate between 10 and 20 nm per minute. An alternative process uses a hydrogen-based solution. The hydrogen reduces the growth rate, but the temperature is raised to 850 or even 1050 °C to compensate. Polysilicon may be grown directly with doping, if gases such as phosphine, arsine or diborane are added to the CVD chamber. Diborane increases the growth rate, but arsine and phosphine decrease it.
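As a simple worked example of the growth rates quoted above, the deposition time for a given film thickness scales inversely with the rate; the 500 nm target below is an assumed value, not a figure from the text:

    # Time to deposit a polysilicon film at the LPCVD growth rates quoted above.
    target_thickness_nm = 500.0                # assumed target thickness
    for rate_nm_per_min in (10.0, 20.0):       # growth-rate range from the text
        minutes = target_thickness_nm / rate_nm_per_min
        print(f"{rate_nm_per_min:.0f} nm/min -> {minutes:.0f} min "
              f"for {target_thickness_nm:.0f} nm")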

Silicon dioxide
Silicon dioxide (usually called simply "oxide" in the semiconductor industry) may be deposited by several different processes. Common source gases include silane and oxygen, dichlorosilane (SiCl2H2) and nitrous oxide (N2O), or tetraethylorthosilicate (TEOS; Si(OC2H5)4). The reactions are as follows[citation needed]:
SiH4 + O2 → SiO2 + 2 H2
SiCl2H2 + 2 N2O → SiO2 + 2 N2 + 2 HCl
Si(OC2H5)4 → SiO2 + byproducts
The choice of source gas depends on the thermal stability of the substrate; for instance, aluminium is sensitive to high temperature. Silane deposits between 300 and 500 °C, dichlorosilane at around 900 °C, and TEOS between 650 and 750 °C, resulting in a layer of low-temperature oxide (LTO). However, silane produces a lower-quality oxide than the other methods (lower dielectric strength, for instance), and it deposits nonconformally. Any of these reactions may be used in LPCVD, but the silane reaction is also done in APCVD. CVD oxide invariably has lower quality than thermal oxide, but thermal oxidation can only be used in the earliest stages of IC manufacturing. Oxide may also be grown with impurities (alloying or "doping"). This may have two purposes. During further process steps that occur at high temperature, the impurities may diffuse from the oxide into adjacent layers (most notably silicon) and dope them. Oxides containing 5–15% impurities by mass are often used for this purpose. In addition, silicon dioxide alloyed with phosphorus pentoxide ("P-glass") can be used to smooth out uneven surfaces. P-glass softens and reflows at temperatures above 1000 °C. This process requires a phosphorus concentration of at least 6%, but concentrations above 8% can corrode aluminium. Phosphorus is deposited from phosphine gas and oxygen:
4 PH3 + 5 O2 → 2 P2O5 + 6 H2

Glasses containing both boron and phosphorus (borophosphosilicate glass, BPSG) undergo viscous flow at lower temperatures; around 850 °C is achievable with glasses containing around 5 weight percent of both constituents, but stability in air can be difficult to achieve. Phosphorus oxide in high concentrations interacts with ambient moisture to produce phosphoric acid. Crystals of BPO4 can also precipitate from the flowing glass on cooling; these crystals are not readily etched in the standard reactive plasmas used to pattern oxides, and will result in circuit defects in integrated circuit manufacturing. Besides these intentional impurities, CVD oxide may contain byproducts of the deposition. TEOS produces a relatively pure oxide, whereas silane introduces hydrogen impurities, and dichlorosilane introduces chlorine. Lower temperature deposition of silicon dioxide and doped glasses from TEOS using ozone rather than oxygen has also been explored (350 to 500 °C). Ozone glasses have excellent conformality but tend to be hygroscopic; that is, they absorb water from the air due to the incorporation of silanol (Si-OH) in the glass. Infrared spectroscopy and mechanical strain as a function of temperature are valuable diagnostic tools for diagnosing such problems.
Silicon nitride
Silicon nitride is often used as an insulator and chemical barrier in manufacturing ICs. The following two reactions deposit silicon nitride from the gas phase:
3 SiH4 + 4 NH3 → Si3N4 + 12 H2
3 SiCl2H2 + 4 NH3 → Si3N4 + 6 HCl + 6 H2
Silicon nitride deposited by LPCVD contains up to 8% hydrogen. It also experiences strong tensile stress, which may crack films thicker than 200 nm. However, it has higher resistivity and dielectric strength than most insulators commonly available in microfabrication (10¹⁶ Ω·cm and 10 MV/cm, respectively). Another two reactions may be used in plasma to deposit SiNH:
2 SiH4 + N2 → 2 SiNH + 3 H2
SiH4 + NH3 → SiNH + 3 H2
These films have much less tensile stress, but worse electrical properties (resistivity 10⁶ to 10¹⁵ Ω·cm, and dielectric strength 1 to 5 MV/cm).
Metals
CVD for tungsten is achieved from tungsten hexafluoride (WF6), which may be deposited in two ways:
WF6 → W + 3 F2
WF6 + 3 H2 → W + 6 HF
Other metals, notably aluminium and copper, can be deposited by CVD. As of 2010[3], commercially cost-effective CVD for copper did not exist, although volatile sources exist, such as Cu(hfac)2. Copper is typically deposited by electroplating. Aluminium can be deposited from triisobutylaluminium (TIBAL) and related organoaluminium compounds. CVD for molybdenum, tantalum, titanium, and nickel is widely used[citation needed]. These metals can form useful silicides when deposited onto silicon. Mo, Ta and Ti are deposited by LPCVD from their pentachlorides. Nickel, molybdenum, and tungsten can be deposited at low temperatures from their carbonyl precursors. In general, for an arbitrary metal M, the chloride deposition reaction is as follows:
2 MCl5 + 5 H2 → 2 M + 10 HCl
whereas the carbonyl decomposition reaction can happen spontaneously under thermal treatment or acoustic cavitation and is as follows:
M(CO)n → M + n CO


The decomposition of metal carbonyls is often violently precipitated by moisture or air, where oxygen reacts with the metal precursor to form metal or metal oxide along with carbon dioxide. Niobium(V) oxide layers can be produced by the thermal decomposition of niobium(V) ethoxide with the loss of diethyl ether according to the equation:
2 Nb(OC2H5)5 → Nb2O5 + 5 C2H5OC2H5


Diamond
Colorless gem cut from diamond grown by chemical vapor deposition
CVD can be used to produce a synthetic diamond by creating the circumstances necessary for carbon atoms in a gas to settle on a substrate in crystalline form. CVD production of diamonds has received a great deal of attention in the materials sciences because it allows many new applications of diamonds that had previously been considered too difficult to make economical. CVD diamond growth typically occurs under low pressure (1–27 kPa; 0.145–3.926 psi; 7.5–203 Torr) and involves feeding varying amounts of gases into a chamber, energizing them and providing conditions for diamond growth on the substrate. The gases always include a carbon source, and typically include hydrogen as well, though the amounts used vary greatly depending on the type of diamond being grown. Energy sources include hot filament, microwave power, and arc discharges, among others. The energy source is intended to generate a plasma in which the gases are broken down and more complex chemistries occur. The actual chemical process for diamond growth is still under study and is complicated by the very wide variety of diamond growth processes used. Using CVD, films of diamond can be grown over large areas of substrate with control over the properties of the diamond produced. In the past, when high pressure high temperature (HPHT) techniques were used to produce a diamond, the result was typically very small free-standing diamonds of varying sizes. With CVD, diamond growth areas of greater than fifteen centimeters (six inches) in diameter have been achieved, and much larger areas are likely to be successfully coated with diamond in the future. Improving this process is key to enabling several important applications. The growth of diamond directly on a substrate allows the addition of many of diamond's important qualities to other materials. Since diamond has the highest thermal conductivity of any bulk material, layering diamond onto high heat producing electronics (such as optics and transistors) allows the diamond to be used as a heat sink. Diamond films are being grown on valve rings, cutting tools, and other objects that benefit from diamond's hardness and exceedingly low wear rate. In each case the diamond growth must be carefully done to achieve the necessary adhesion onto the substrate. Diamond's very high scratch resistance and thermal conductivity, combined with a lower coefficient of thermal expansion than Pyrex glass, a coefficient of friction close to that of Teflon (polytetrafluoroethylene) and strong lipophilicity, would make it a nearly ideal non-stick coating for cookware if large substrate areas could be coated economically. CVD growth allows one to control the properties of the diamond produced. In the area of diamond growth, the word "diamond" is used as a description of any material primarily made up of sp3-bonded carbon, and there are many different types of diamond included in this. By regulating the processing parameters (especially the gases introduced, but also including the pressure the system is operated under, the temperature of the diamond, and the method of generating plasma), many different materials that can be considered diamond can be made. Single crystal diamond can be made containing various dopants. Polycrystalline diamond consisting of grain sizes from several nanometers to several micrometers can be grown. Some polycrystalline diamond grains are surrounded by thin, non-diamond carbon, while others are not.
These different factors affect the diamond's hardness, smoothness, conductivity, optical properties and more.


Metalorganic vapour phase epitaxy


Metalorganic vapour phase epitaxy (MOVPE), also known as organometallic vapour phase epitaxy (OMVPE) or metalorganic chemical vapour deposition (MOCVD), is a chemical vapour deposition method used to produce single or polycrystalline thin films. It is a highly complex process for growing crystalline layers to create complex semiconductor multilayer structures. In contrast to molecular beam epitaxy (MBE) the growth of crystals is by chemical reaction and not physical deposition. This takes place not in a vacuum, but from the gas phase at moderate pressures (10 to 760 Torr). As such, this technique is preferred for the formation of devices incorporating thermodynamically metastable alloys, and it has become a major process in the manufacture of optoelectronics.

illustration of the process

Basic principles of the MOCVD process


In MOCVD ultra pure gases are injected into a reactor and finely dosed to deposit a very thin layer of atoms onto a semiconductor wafer. Surface reaction of organic compounds or metalorganics and hydrides containing the required chemical elements creates conditions for crystalline growth - epitaxy of materials and compound semiconductors. Unlike traditional silicon semiconductors, these semiconductors may contain combinations of Group III and Group V, Group II and Group VI, Group IV, or Group IV, V and VI elements.

MOCVD apparatus

For example, indium phosphide could be grown in a reactor on a heated substrate by introducing trimethylindium ((CH3)3In) and phosphine (PH3) in a first step. The heated organic precursor molecules decompose in the absence of oxygen, a process called pyrolysis. Pyrolysis leaves the atoms on the substrate surface in the second step. The atoms bond to the substrate surface and a new crystalline layer is grown in the last step. Formation of this epitaxial layer occurs at the substrate surface. The required pyrolysis temperature increases with increasing chemical bond strength of the precursor. The more carbon atoms are attached to the central metal atom, the weaker the bond. The diffusion of atoms on the substrate surface is affected by atomic steps on the surface. The vapor pressure of the metalorganic source is an important consideration in MOCVD, since it determines the concentration of the source material in the reaction and the deposition rate.

MOCVD reactor components


In the metal organic chemical vapor deposition (MOCVD) technique, reactant gases are combined at elevated temperatures in the reactor to cause a chemical interaction, resulting in the deposition of materials on the substrate. A reactor is a chamber made of a material that does not react with the chemicals being used. It must also withstand high temperatures. This chamber is composed of reactor walls, a liner, a susceptor, gas injection units, and temperature control units. Usually, the reactor walls are made from stainless steel or quartz. Ceramic or special glasses, such as quartz, are often used as the liner in the reactor chamber between the reactor wall and the susceptor. To prevent overheating, cooling water must be flowing through the channels within the reactor walls. A substrate sits on a susceptor which is at a controlled temperature. The susceptor is made from a material

resistant to the metalorganic compounds used; graphite is sometimes used. For growing nitrides and related materials, a special coating on the graphite susceptor is necessary to prevent corrosion by ammonia (NH3) gas.
One type of reactor used to carry out MOCVD is a cold-wall reactor. In a cold-wall reactor, the substrate is supported by a pedestal, which also acts as a susceptor. The pedestal/susceptor is the primary origin of heat energy in the reaction chamber. Only the susceptor is heated, so gases do not react before they reach the hot wafer surface. The pedestal/susceptor is made of a radiation-absorbing material such as carbon. In contrast, the walls of the reaction chamber in a cold-wall reactor are typically made of quartz, which is largely transparent to the electromagnetic radiation. The reaction chamber walls in a cold-wall reactor may, however, be indirectly heated by heat radiating from the hot pedestal/susceptor, but will remain cooler than the pedestal/susceptor and the substrate the pedestal/susceptor supports. In hot-wall CVD, the entire chamber is heated. This may be necessary for some gases to be pre-cracked before reaching the wafer surface to allow them to stick to the wafer.
Gas inlet and switching system: gas is introduced via devices known as 'bubblers'. In a bubbler, a carrier gas (usually hydrogen in arsenide and phosphide growth, or nitrogen for nitride growth) is bubbled through the metalorganic liquid, which picks up some metalorganic vapour and transports it to the reactor. The amount of metalorganic vapour transported depends on the rate of carrier gas flow and the bubbler temperature, and is usually controlled automatically and most accurately by using an ultrasonic concentration measuring feedback gas control system. Allowance must be made for saturated vapors (a simple delivery-rate estimate follows below).
Pressure maintenance system.
Gas exhaust and cleaning system: toxic waste products must be converted to liquid or solid wastes for recycling (preferably) or disposal. Ideally, processes will be designed to minimize the production of waste products.
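The bubbler delivery rate mentioned above is often estimated by assuming the carrier gas leaves the bubbler saturated with precursor vapour, so the precursor flow is roughly carrier flow × P_vap / (P_total − P_vap). A minimal sketch with assumed vapour pressure, bubbler pressure, and carrier flow (none taken from the text):

    # Ideal-gas estimate of metalorganic delivery from a bubbler, assuming the
    # outgoing stream is saturated with precursor vapour. Illustrative values only.
    P_vap = 1.7          # precursor vapour pressure at bath temperature, Torr (assumed)
    P_bubbler = 760.0    # total pressure inside the bubbler, Torr (assumed)
    carrier_sccm = 100.0 # carrier gas flow through the bubbler, sccm (assumed)

    precursor_sccm = carrier_sccm * P_vap / (P_bubbler - P_vap)
    # Convert to micromoles per minute (1 sccm ~ 1/22414 mol/min).
    umol_per_min = precursor_sccm * 1e6 / 22414.0

    print(f"precursor flow ~ {precursor_sccm:.3f} sccm ~ {umol_per_min:.1f} umol/min")

In practice the delivered concentration is also trimmed by dilution lines and, as noted above, by ultrasonic concentration-feedback control.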


Organometallic precursors
Aluminium
- Trimethylaluminium (TMA or TMAl), Liquid
- Triethylaluminium (TEA or TEAl), Liquid
Gallium
- Trimethylgallium (TMG or TMGa), Liquid
- Triethylgallium (TEG or TEGa), Liquid
Indium
- Trimethylindium (TMI or TMIn), Solid
- Triethylindium (TEI or TEIn), Liquid
- Di-isopropylmethylindium (DIPMeIn) [5], Liquid
- Ethyldimethylindium (EDMIn) [6], Liquid
Germanium
- Isobutylgermane (IBGe) [7], Liquid
- Dimethylamino germanium trichloride (DiMAGeC), Liquid
- Tetramethylgermane (TMGe), Liquid
- Tetraethylgermanium (TEGe), Liquid
Nitrogen
- Phenyl hydrazine, Liquid
- Dimethylhydrazine (DMHy), Liquid
- Tertiarybutylamine (TBAm), Liquid
- Ammonia (NH3), Gas

Phosphorus
- Phosphine (PH3), Gas
- Tertiarybutyl phosphine (TBP), Liquid
- Bisphosphinoethane (BPE), Liquid
Arsenic
- Arsine (AsH3), Gas
- Tertiarybutyl arsine (TBAs), Liquid
- Monoethyl arsine (MEAs), Liquid
- Trimethyl arsine (TMAs), Liquid


Antimony
- Trimethyl antimony (TMSb), Liquid
- Triethyl antimony (TESb), Liquid
- Tri-isopropyl antimony (TIPSb), Liquid
- Stibine (SbH3), Gas

Cadmium
- Dimethyl cadmium (DMCd), Liquid
- Diethyl cadmium (DECd), Liquid
- Methyl allyl cadmium (MACd), Liquid
Tellurium
- Dimethyl telluride (DMTe), Liquid
- Diethyl telluride (DETe), Liquid
- Di-isopropyl telluride (DIPTe) [8], Liquid
Titanium
- Alkoxides, such as titanium isopropoxide or titanium ethoxide
Selenium
- Dimethyl selenide (DMSe), Liquid
- Diethyl selenide (DESe), Liquid
- Di-isopropyl selenide (DIPSe), Liquid
Zinc
- Dimethylzinc (DMZ), Liquid
- Diethylzinc (DEZ), Liquid

Semiconductors grown by MOCVD


III-V semiconductors
AlGaAs AlGaInP AlGaN AlGaP GaAsP GaAs GaN GaP InAlAs

InAlP InSb InGaN GaInAlAs GaInAlN GaInAsN GaInAsP GaInAs GaInP InN InP InAs


II-VI semiconductors
Zinc selenide (ZnSe) HgCdTe ZnO Zinc sulfide (ZnS)

IV Semiconductors
Si Ge Strained silicon

IV-V-VI Semiconductors
GeSbTe

Environment, Health and Safety


As MOCVD has become a well-established production technology, there are growing concerns associated with its bearing on personnel and community safety, environmental impact, and the maximum quantities of hazardous materials (such as gases and metalorganics) permissible in device fabrication operations. Safety and responsible environmental care have become factors of paramount importance in the MOCVD-based crystal growth of compound semiconductors.


Electrostatic spray-assisted vapour deposition


Electrostatic spray assisted vapour deposition (ESAVD) is a technique (developed by a company called IMPT) to deposit both thin and thick layers of a coating onto various substrates. In simple terms, chemical precursors are sprayed across an electrostatic field towards a heated substrate; the chemicals undergo a controlled chemical reaction and are deposited on the substrate as the required coating. Electrostatic spraying techniques were developed in the 1950s for the spraying of ionised particles onto charged or heated substrates. ESAVD (branded by IMPT as Layatec) is used for many applications in many markets including:
- Thermal barrier coatings for jet engine turbine blades
- Various thin layers in the manufacture of flat panel displays and photovoltaic panels
- Electronic components
- Biomedical coatings
- Glass coatings (such as self-cleaning)
- Corrosion protection coatings

The process has advantages over other techniques for layer deposition (plasma, electron-beam) in that it does not require the use of any vacuum, electron beam or plasma, so it reduces manufacturing costs. It also uses less power and fewer raw materials, making it more environmentally friendly. The use of the electrostatic field also means that the process can coat complex 3D parts easily.


Epitaxy
Epitaxy refers to the deposition of a crystalline overlayer on a crystalline substrate, where there is registry between the overlayer and the substrate. The overlayer is called an epitaxial film or epitaxial layer. The term epitaxy comes from the Greek roots epi (ἐπί), meaning above, and taxis (τάξις), meaning in ordered manner. It can be translated as "arranging upon". For most technological applications, it is desired that the deposited material form a crystalline overlayer that has one well-defined orientation with respect to the substrate crystal structure (single-domain epitaxy). Epitaxial films may be grown from gaseous or liquid precursors. Because the substrate acts as a seed crystal, the deposited film may lock into one or more crystallographic orientations with respect to the substrate crystal. If the overlayer either forms a random orientation with respect to the substrate or does not form an ordered overlayer, this is termed non-epitaxial growth. If an epitaxial film is deposited on a substrate of the same composition, the process is called homoepitaxy; otherwise it is called heteroepitaxy. Homoepitaxy is a kind of epitaxy performed with only one material. In homoepitaxy, a crystalline film is grown on a substrate or film of the same material. This technology is used to grow a film which is more pure than the substrate and to fabricate layers having different doping levels. In academic literature, homoepitaxy is often abbreviated to "homoepi". Heteroepitaxy is a kind of epitaxy performed with materials that are different from each other. In heteroepitaxy, a crystalline film grows on a crystalline substrate or film of a different material. This technology is often used to grow crystalline films of materials for which crystals cannot otherwise be obtained and to fabricate integrated crystalline layers of different materials. Examples include gallium nitride (GaN) on sapphire, aluminium gallium indium phosphide (AlGaInP) on gallium arsenide (GaAs), or diamond on iridium. Heterotopotaxy is a process similar to heteroepitaxy except for the fact that thin film growth is not limited to two-dimensional growth. Here the substrate is similar only in structure to the thin film material. Pendeo-epitaxy is a process where the heteroepitaxial film grows vertically and laterally at the same time. Epitaxy is used in silicon-based manufacturing processes for bipolar junction transistors (BJTs) and modern complementary metal-oxide-semiconductors (CMOS), but it is particularly important for compound semiconductors such as gallium arsenide. Manufacturing issues include control of the amount and uniformity of the deposition's resistivity and thickness, the cleanliness and purity of the surface and the chamber atmosphere, the prevention of the typically much more highly doped substrate wafer's diffusion of dopant to the new layers, imperfections of the growth process, and protecting the surfaces during the manufacture and handling.

Applications
Epitaxy is used in nanotechnology and in semiconductor fabrication. Indeed, epitaxy is the only affordable method of high quality crystal growth for many semiconductor materials.

Methods
Epitaxial silicon is usually grown using vapor-phase epitaxy (VPE), a modification of chemical vapor deposition. Molecular-beam and liquid-phase epitaxy (MBE and LPE) are also used, mainly for compound semiconductors. Solid-phase epitaxy is used primarily for crystal-damage healing.


Vapor-phase
Silicon is most commonly deposited from silicon tetrachloride and hydrogen at approximately 1200 °C:
SiCl4(g) + 2 H2(g) ↔ Si(s) + 4 HCl(g)
This reaction is reversible, and the growth rate depends strongly upon the proportion of the two source gases. Growth rates above 2 micrometres per minute produce polycrystalline silicon, and negative growth rates (etching) may occur if too much hydrogen chloride byproduct is present. (In fact, hydrogen chloride may be added intentionally to etch the wafer.) An additional etching reaction competes with the deposition reaction:
SiCl4(g) + Si(s) → 2 SiCl2(g)
Silicon VPE may also use silane, dichlorosilane, and trichlorosilane source gases. For instance, the silane reaction occurs at 650 °C in this way:
SiH4 → Si + 2 H2
This reaction does not inadvertently etch the wafer, and takes place at lower temperatures than deposition from silicon tetrachloride. However, it will form a polycrystalline film unless tightly controlled, and it allows oxidizing species that leak into the reactor to contaminate the epitaxial layer with unwanted compounds such as silicon dioxide. VPE is sometimes classified by the chemistry of the source gases, such as hydride VPE and metalorganic VPE.

Liquid-phase
Liquid-phase epitaxy (LPE) is a method of growing semiconductor crystal layers from the melt on solid substrates. This happens at temperatures well below the melting point of the deposited semiconductor. The semiconductor is dissolved in the melt of another material. At conditions close to the equilibrium between dissolution and deposition, the deposition of the semiconductor crystal on the substrate is relatively fast and uniform. The most commonly used substrate is indium phosphide (InP); other substrates such as glass or ceramic can be applied for special applications. To facilitate nucleation, and to avoid tension in the grown layer, the thermal expansion coefficients of the substrate and the grown layer should be similar.

Solid-phase
Solid-phase epitaxy (SPE) is a transition between the amorphous and crystalline phases of a material. It is usually done by first depositing a film of amorphous material on a crystalline substrate. The substrate is then heated to crystallize the film, with the single-crystal substrate serving as a template for crystal growth. The annealing step used to recrystallize or heal silicon layers amorphized during ion implantation is also considered a type of solid-phase epitaxy. Impurity segregation and redistribution at the growing crystal-amorphous interface during this process is used to incorporate low-solubility dopants in metals and silicon.


Molecular-beam epitaxy
In molecular beam epitaxy (MBE), a source material is heated to produce an evaporated beam of particles. These particles travel through a very high vacuum (10⁻⁸ Pa; practically free space) to the substrate, where they condense. MBE has lower throughput than other forms of epitaxy. This technique is widely used for growing III-V semiconductor crystals.

Doping
An epitaxial layer can be doped during deposition by adding impurities to the source gas, such as arsine, phosphine or diborane. The concentration of impurity in the gas phase determines its concentration in the deposited film. As in chemical vapor deposition (CVD), impurities change the deposition rate. Additionally, the high temperatures at which CVD is performed may allow dopants to diffuse into the growing layer from other layers in the wafer (out-diffusion). Also, dopants in the source gas, liberated by evaporation or wet etching of the surface, may diffuse into the epitaxial layer (autodoping). The dopant profiles of underlying layers change as well, although not as significantly.

Minerals
In mineralogy epitaxy is the overgrowth of one mineral on another in an orderly way, such that certain crystal directions of the two minerals are aligned. This occurs when some planes in the lattices of the overgrowth and the substrate have similar spacings between atoms. If the crystals of both minerals are well formed so that the directions of the crystallographic axes are clear then the epitaxic relationship can be deduced just by a visual inspection. Sometimes many separate crystals form the overgrowth on a single substrate, and then if there is epitaxy all the overgrowth crystals will have a similar orientation. The reverse, however, is not necessarily true. If the overgrowth crystals have a similar orientation there is probably an epitaxic relationship, but it is not certain. Some authors consider that overgrowths of a second generation of the same mineral species should also be considered as epitaxy, and this is common terminology for semiconductor scientists who induce epitaxic growth of a film with a different doping level on a semiconductor substrate of the same material. For naturally produced minerals, however, the International Mineralogical Association (IMA) definition requires that the two minerals be of different species.

Rutile epitaxic on hematite nearly 6 cm long. Bahia, Brazil

Another man-made application of epitaxy is the making of artificial snow using silver iodide, which is possible because hexagonal silver iodide and ice have similar cell dimensions.


Isomorphic minerals
Minerals that have the same structure (isomorphic minerals) may have epitaxic relations. An example is albite NaAlSi3O8 on microcline KAlSi3O8. Both these minerals are triclinic, with space group 1̄, and with similar unit cell parameters: a = 8.16 Å, b = 12.87 Å, c = 7.11 Å, α = 93.45°, β = 116.4°, γ = 90.28° for albite, and a = 8.5784 Å, b = 12.96 Å, c = 7.2112 Å, α = 90.3°, β = 116.05°, γ = 89° for microcline.
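A quick way to see why this pair is a plausible epitaxial couple is to compute the relative mismatch of the quoted cell edges. The short sketch below does this for the a, b and c axes; angles are ignored, so it is only a first-order comparison.

```python
# Sketch: relative lattice mismatch along each axis for the albite/microcline
# pair, using the unit-cell lengths quoted above (angles ignored, so this is
# only a first-order indication of how close the two lattices are).
albite     = {"a": 8.16,   "b": 12.87, "c": 7.11}    # lengths in angstroms
microcline = {"a": 8.5784, "b": 12.96, "c": 7.2112}

for axis in ("a", "b", "c"):
    mismatch = (microcline[axis] - albite[axis]) / albite[axis]
    print(f"{axis}-axis mismatch: {mismatch:+.2%}")
# Expected output: roughly +5.1%, +0.7%, +1.4%, small enough that oriented
# overgrowth (epitaxy) is plausible.
```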

Polymorphic minerals
Minerals that have the same composition but different structures (polymorphic minerals) may also have epitaxic relations. Examples are pyrite and marcasite, both FeS2, and sphalerite and wurtzite, both ZnS.
Hematite pseudomorph after magnetite, with terraced epitaxic faces. La Rioja, Argentina

Rutile on hematite
Some pairs of minerals that are not related structurally or compositionally may also exhibit epitaxy. A common example is rutile TiO2 on hematite Fe2O3. Rutile is tetragonal and hematite is trigonal, but there are directions of similar spacing between the atoms in the (100) plane of rutile (perpendicular to the a axis) and the (001) plane of hematite (perpendicular to the c axis). In epitaxy these directions tend to line up with each other, resulting in the a axis of the rutile overgrowth being parallel to the c axis of hematite, and the c axis of rutile being parallel to one of the a axes of hematite.

Hematite on magnetite
Another example is hematite (Fe2O3, containing Fe3+) on magnetite (Fe3O4, containing both Fe2+ and Fe3+). The magnetite structure is based on close-packed oxygen anions stacked in an ABC-ABC sequence. In this packing the close-packed layers are parallel to (111) (a plane that symmetrically "cuts off" a corner of a cube). The hematite structure is based on close-packed oxygen anions stacked in an AB-AB sequence, which results in a crystal with hexagonal symmetry. If the cations were small enough to fit into a truly close-packed structure of oxygen anions, then the spacing between nearest-neighbour oxygen sites would be the same for both species. The radius of the oxygen ion, however, is only 1.36 Å, and the Fe cations are big enough to cause some variations. The Fe radii vary from 0.49 Å to 0.92 Å,[10] depending on the charge (2+ or 3+) and the coordination number (4 or 8). Nevertheless, the O spacings are similar for the two minerals, and hence hematite can readily grow on the (111) faces of magnetite, with hematite (001) parallel to magnetite (111).


Molecular beam epitaxy


Molecular beam epitaxy (MBE) is one of several methods of depositing single crystals. It was invented in the late 1960s at Bell Telephone Laboratories by J. R. Arthur and Alfred Y. Cho. MBE is widely used in the manufacture of semiconductor devices, including transistors for cellular phones and WiFi.

Method
Molecular beam epitaxy takes place in high vacuum or ultra-high vacuum (around 10⁻⁸ Pa). The most important aspect of MBE is the low deposition rate (typically less than 3000 nm per hour), which allows the films to grow epitaxially. These deposition rates require proportionally better vacuum to achieve the same impurity levels as other deposition techniques. The absence of carrier gases, as well as the ultra-high-vacuum environment, results in the highest achievable purity of the grown films.
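To put the quoted rate ceiling in more intuitive units, the sketch below converts 3000 nm per hour into monolayers per second, assuming a GaAs(001) monolayer thickness of about 0.283 nm; that monolayer thickness is an assumption chosen purely for illustration.

```python
# Back-of-the-envelope sketch: convert the quoted MBE rate ceiling
# (< 3000 nm per hour) into monolayers per second. The monolayer thickness
# used here (GaAs(001), ~0.283 nm) is an assumed illustrative value.
rate_nm_per_hour = 3000.0
monolayer_nm = 0.283            # assumed GaAs(001) monolayer thickness

rate_nm_per_s = rate_nm_per_hour / 3600.0
rate_ml_per_s = rate_nm_per_s / monolayer_nm
print(f"{rate_nm_per_s:.2f} nm/s  ~  {rate_ml_per_s:.1f} monolayers/s")
# -> about 0.83 nm/s, i.e. roughly 3 monolayers per second at the upper end.
```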
A simple sketch showing the main components and rough layout and concept of the main chamber in a Molecular Beam Epitaxy system

In solid-source MBE, elements such as gallium and arsenic, in ultra-pure form, are heated in separate quasi-Knudsen effusion cells until they begin to slowly sublime. The gaseous elements then condense on the wafer, where they may react with each other. In the example of gallium and arsenic, single-crystal gallium arsenide is formed. The term "beam" means that evaporated atoms do not interact with each other or with vacuum-chamber gases until they reach the wafer, due to the long mean free paths of the atoms.

During operation, reflection high-energy electron diffraction (RHEED) is often used for monitoring the growth of the crystal layers. A computer controls shutters in front of each furnace, allowing precise control of the thickness of each layer, down to a single layer of atoms. Intricate structures of layers of different materials may be fabricated this way. Such control has allowed the development of structures where the electrons can be confined in space, giving quantum wells or even quantum dots. Such layers are now a critical part of many modern semiconductor devices, including semiconductor lasers and light-emitting diodes.

In systems where the substrate needs to be cooled, the ultra-high-vacuum environment within the growth chamber is maintained by a system of cryopumps and cryopanels, chilled using liquid nitrogen or cold nitrogen gas to a temperature close to 77 kelvin (−196 degrees Celsius). Cryogenic temperatures act as a sink for impurities in the vacuum, so vacuum levels need to be several orders of magnitude better to deposit films under these conditions. In other systems, the wafers on which the crystals are grown may be mounted on a rotating platter which can be heated to several hundred degrees Celsius during operation.

Molecular beam epitaxy is also used for the deposition of some types of organic semiconductors. In this case, molecules, rather than atoms, are evaporated and deposited onto the wafer. Other variations include gas-source MBE, which resembles chemical vapor deposition. More recently, molecular beam epitaxy has been used to deposit oxide materials for advanced electronic, magnetic and optical applications. For these purposes, MBE systems have to be modified to incorporate oxygen sources.

ATG instability
The ATG (Asaro-Tiller-Grinfeld) instability, also known as the Grinfeld instability, is an elastic instability often encountered during molecular beam epitaxy. If there is a mismatch between the lattice sizes of the growing film and the supporting crystal, elastic energy accumulates in the growing film. At some critical height, the free energy of the film can be lowered if the film breaks into isolated islands, where the tension can be relaxed laterally. The critical height depends on the Young's moduli, the mismatch size, and the surface tensions. Some applications of this instability have been researched, such as the self-assembly of quantum dots. This community uses the name Stranski-Krastanov growth for ATG.


Physical vapor deposition


Physical vapor deposition (PVD) describes a variety of vacuum deposition methods used to deposit thin films by the condensation of a vaporized form of the desired film material onto various workpiece surfaces (e.g., onto semiconductor wafers). The coating method involves purely physical processes such as high-temperature vacuum evaporation with subsequent condensation, or plasma sputter bombardment, rather than a chemical reaction at the surface to be coated as in chemical vapor deposition. The term physical vapor deposition originally appeared in the 1966 book Vapor Deposition by C. F. Powell, J. H. Oxley and J. M. Blocher Jr., although Michael Faraday was using PVD to deposit coatings as far back as 1838.

Physical vapor deposition coatings are currently used to enhance a number of products, including automotive parts such as wheels and pistons, surgical tools, drill bits, and guns. The current version of physical vapor deposition coating was completed in 2010 by NASA scientists at the NASA Glenn Research Center in Cleveland, Ohio. This coating is made up of thin layers of metal that are bonded together in a rig that NASA finished developing in 2010. To make the coating, developers put the essential ingredients into the rig, which drops the surrounding atmospheric pressure to one torr (1/760 of standard atmospheric pressure). The coating is then heated with a plasma torch that reaches 17,540.33 degrees Fahrenheit.

In the automotive world, PVD coating is the newest alternative to the chrome plating that has been used on trucks and cars for years, because it has been shown to increase durability and to weigh less than chrome coating, which benefits a vehicle's acceleration and fuel efficiency.
Inside the Plasma Spray-Physical Vapor Deposition, or PS-PVD, ceramic powder is introduced into the plasma flame, which vaporizes it and then condenses it on the (cooler) workpiece to form the ceramic coating.


Physical vapor deposition coating is gaining in popularity for many reasons, including that it enhances a product's durability. In fact, studies have shown that it can enhance the lifespan of an unprotected product tenfold.

Variants of PVD include, in alphabetical order:

Cathodic arc deposition: a high-power electric arc discharged at the target (source) material blasts away some of it into a highly ionized vapor to be deposited onto the workpiece.

PVD: Process flow diagram

Electron beam physical vapor deposition: the material to be deposited is heated to a high vapor pressure by electron bombardment in "high" vacuum and is transported by diffusion to be deposited by condensation on the (cooler) workpiece.
Evaporative deposition: the material to be deposited is heated to a high vapor pressure by electrically resistive heating in "low" vacuum.
Pulsed laser deposition: a high-power laser ablates material from the target into a vapor.
Sputter deposition: a glow plasma discharge (usually localized around the "target" by a magnet) bombards the material, sputtering some of it away as a vapor for subsequent deposition.

PVD is used in the manufacture of items including semiconductor devices, aluminized PET film for balloons and snack bags, and coated cutting tools for metalworking. Besides PVD tools for fabrication, special smaller tools (mainly for scientific purposes) have been developed. They mainly serve the purpose of extremely thin films, such as atomic layers, and are used mostly for small substrates. A good example is mini electron-beam evaporators, which can deposit monolayers of virtually all materials with melting points up to 3500 °C.

Common coatings applied by PVD are titanium nitride, zirconium nitride, chromium nitride and titanium aluminium nitride. The source material is unavoidably also deposited on most other surfaces interior to the vacuum chamber, including the fixturing used to hold the parts.

Some of the techniques used to measure the physical properties of PVD coatings are:
Calo tester: coating thickness test
Nanoindentation: hardness test for thin-film coatings
Pin-on-disc tester: wear and friction coefficient test
Scratch tester: coating adhesion test

Advantages:
PVD coatings are sometimes harder and more corrosion-resistant than coatings applied by the electroplating process.
Most coatings have high temperature tolerance and good impact strength, excellent abrasion resistance, and are so durable that protective topcoats are almost never necessary.

Ability to utilize virtually any type of inorganic and some organic coating materials on an equally diverse group of substrates and surfaces, using a wide variety of finishes.
More environmentally friendly than traditional coating processes such as electroplating and painting.
More than one technique can be used to deposit a given film.

Disadvantages:
Specific technologies can impose constraints; for example, line-of-sight transfer is typical of most PVD coating techniques, although there are methods that allow full coverage of complex geometries.
Some PVD technologies typically operate at very high temperatures and vacuums, requiring special attention by operating personnel.
Some require a cooling-water system to dissipate large heat loads.

Applications:
As mentioned previously, PVD coatings are generally used to improve hardness, wear resistance and oxidation resistance. Such coatings are therefore used in a wide range of applications, such as:
Aerospace
Automotive
Surgical/medical
Dies and moulds for all manner of material processing
Cutting tools
Firearms
Optics
Thin films (window tint, food packaging, etc.)



Cathodic arc deposition


Cathodic arc deposition or Arc-PVD is a physical vapor deposition technique in which an electric arc is used to vaporize material from a cathode target. The vaporized material then condenses on a substrate, forming a thin film. The technique can be used to deposit metallic, ceramic, and composite films.

History
Industrial use of modern cathodic arc deposition technology originated in the Soviet Union around 1960-1970. By the late 1970s the Soviet government released the use of this technology to the West. Among the many designs in the USSR at that time, the design by L. P. Sablev et al. was allowed to be used outside the USSR.

Process
The arc evaporation process begins with the striking of a high-current, low-voltage arc on the surface of a cathode (known as the target) that gives rise to a small (usually a few micrometres wide), highly energetic emitting area known as a cathode spot. The localised temperature at the cathode spot is extremely high (around 15,000 °C), which results in a high-velocity (10 km/s) jet of vaporised cathode material, leaving a crater behind on the cathode surface. The cathode spot is only active for a short period of time, then it self-extinguishes and re-ignites in a new area close to the previous crater. This behaviour causes the apparent motion of the arc. As the arc is basically a current-carrying conductor, it can be influenced by the application of an electromagnetic field, which in practice is used to rapidly move the arc over the entire surface of the target, so that the total surface is eroded over time.

The arc has an extremely high power density, resulting in a high level of ionization (30-100%), multiply charged ions, neutral particles, clusters and macro-particles (droplets). If a reactive gas is introduced during the evaporation process, dissociation, ionization and excitation can occur during interaction with the ion flux, and a compound film will be deposited.

One downside of the arc evaporation process is that if the cathode spot stays at an evaporative point for too long, it can eject a large number of macro-particles or droplets. These droplets are detrimental to the performance of the coating, as they are poorly adhered and can extend through the coating. Worse still, if the cathode target material has a low melting point, such as aluminium, the cathode spot can evaporate through the target, resulting in either the target backing-plate material being evaporated or cooling water entering the chamber. Therefore magnetic fields, as mentioned previously, are used to control the motion of the arc. If cylindrical cathodes are used, the cathodes can also be rotated during deposition. By not allowing the cathode spot to remain in one position for too long, aluminium targets can be used and the number of droplets is reduced. Some companies also use filtered arcs, which use magnetic fields to separate the droplets from the coating flux.


Equipment design
The Sablev-type cathodic arc source, which is the most widely used in the West, consists of a short cylindrical, electrically conductive target at the cathode with one open end. This target is surrounded by an electrically floating metal ring that works as an arc confinement ring. The anode for the system can be either the vacuum chamber wall or a discrete anode. Arc spots are generated by a mechanical trigger (or igniter) striking the open end of the target, making a temporary short circuit between the cathode and the anode. After the arc spots are generated, they can be steered by a magnetic field, or they move randomly in the absence of a magnetic field.

Sablev type Cathodic arc source with magnet to steer the movement of arc spot

The plasma beam from a cathodic arc source contains some larger clusters of atoms or molecules (so-called macro-particles), which prevent it from being useful for some applications without some kind of filtering. There are many designs for macro-particle filters; the most studied design is based on the work by I. I. Aksenov et al. in the 1970s. It consists of a quarter-torus duct bent at 90 degrees from the arc source, and the plasma is guided out of the duct by the principles of plasma optics. There are also other interesting designs, such as a design incorporating a straight duct filter built in with a truncated-cone-shaped cathode using plasma-optical principles, developed by A. I. Morozov and reported by D. A. Karpov in the 1990s. This design has remained quite popular among both thin hard-film coaters and researchers in Russia and former USSR countries. Cathodic arc sources can also be made in a long tubular shape (extended arc) or a long rectangular shape, but both designs are less popular.

Aksenov quarter-torus duct macro-particle filter


Applications
Cathodic arc deposition is actively used to synthesize extremely hard films to protect the surfaces of cutting tools and extend their lifetime significantly. A wide variety of thin hard films, superhard coatings and nanocomposite coatings can be synthesized by this technology, including TiN, TiAlN, CrN, ZrN, AlCrTiN and TiAlSiN. It is also used quite extensively for carbon-ion deposition to create diamond-like carbon films. Because the ions are blasted from the surface ballistically, it is common for not only single atoms but also larger clusters of atoms to be ejected. Thus, this kind of system requires a filter to remove atom clusters from the beam before deposition. The DLC film from a filtered arc contains an extremely high percentage of sp3-bonded carbon and is known as tetrahedral amorphous carbon, or ta-C. A filtered cathodic arc can be used as a metal ion/plasma source for ion implantation and plasma immersion ion implantation and deposition (PIII&D).
Titanium Nitride (TiN) coated punches using Cathodic arc deposition technique

Aluminium Titanium Nitride (AlTiN) coated endmills using Cathodic arc deposition technique

Aluminium Chromium Titanium Nitride (AlCrTiN) coated Hob using Cathodic arc deposition technique


Electron beam physical vapor deposition


Electron Beam Physical Vapor Deposition or EBPVD is a form of physical vapor deposition in which a target anode is bombarded with an electron beam given off by a charged tungsten filament under high vacuum. The electron beam causes atoms from the target to transform into the gaseous phase. These atoms then precipitate into solid form, coating everything in the vacuum chamber (within line of sight) with a thin layer of the anode material.

Introduction
Thin-film deposition is a process applied in the semiconductor industry to grow electronic materials, in the aerospace industry to form thermal and chemical barrier coatings protecting surfaces against corrosive environments, in optics to impart the desired reflective and transmissive properties to a substrate, and elsewhere in industry to modify surfaces to have a variety of desired properties. Deposition processes can be broadly classified into physical vapor deposition (PVD) and chemical vapor deposition (CVD). In CVD, the film growth takes place at high temperatures, leading to the formation of corrosive gaseous products, and it may leave impurities in the film. The PVD process can be carried out at lower deposition temperatures and without corrosive products, but deposition rates are typically lower. Electron beam physical vapor deposition, however, yields a high deposition rate, from 0.1 µm/min to 100 µm/min, at relatively low substrate temperatures, with very high material utilization efficiency. The schematic of an EBPVD system is shown in the figure.

Thin film deposition process


In an EBPVD system, the deposition chamber must be evacuated to a pressure of at least 7.5 × 10⁻⁵ Torr (10⁻⁴ hPa) to allow passage of electrons from the electron gun to the evaporation material, which can be in the form of an ingot or rod. Alternatively, some modern EBPVD systems utilize an arc suppression system and can be operated at vacuum levels as low as 5.0 × 10⁻³ Torr, for situations such as parallel use with magnetron sputtering. Multiple types of evaporation materials and electron guns can be used simultaneously in a single EBPVD system, each having a power from tens to hundreds of kW. Electron beams can be generated by thermionic emission, field electron emission or the anodic arc method. The generated electron beam is accelerated to a high kinetic energy and directed towards the evaporation material. Upon striking the evaporation material, the electrons lose their energy very rapidly: the kinetic energy of the electrons is converted into other forms of energy through interactions with the evaporation material. The thermal energy that is produced heats the evaporation material, causing it to melt or sublimate. Once the temperature and vacuum level are sufficiently high, vapor results from the melt or solid, and this vapor can then be used to coat surfaces. Accelerating voltages can be between 3 kV and 40 kV. When the accelerating voltage is between 20 kV and 25 kV and the beam current is a few amperes, 85% of the electrons' kinetic energy can be converted into thermal energy. Some of the incident electron energy is lost through the production of X-rays and secondary electron emission.

Electromagnetic alignment configuration: the ingot is held at a positive potential relative to the filament. To avoid chemical interactions between the filament and the ingot material, the filament is kept out of sight. A magnetic field is employed to direct the electron beam from its source to the ingot location, and an additional electric field can be used to steer the beam over the ingot surface, allowing uniform heating.

There are three main EBPVD configurations: electromagnetic alignment, electromagnetic focusing and the pendant-drop configuration. Electromagnetic alignment and electromagnetic focusing use evaporation material in the form of an ingot, while the pendant-drop configuration uses a rod. Ingots are enclosed in a copper crucible or hearth, while a rod is mounted at one end in a socket. Both the crucible and the socket must be cooled, typically by water circulation. In the case of ingots, molten liquid can form on the surface, and its level can be kept constant by vertical displacement of the ingot. The evaporation rate may be on the order of 10⁻² g/(cm²·s).
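As a rough sanity check of the beam-power figures above, the sketch below multiplies an accelerating voltage in the quoted 20-25 kV range by an assumed beam current of a few amperes and applies the quoted 85% heating efficiency; the specific current value is an illustrative assumption.

```python
# Hedged sketch: electron-beam power and the fraction converted to heat,
# using the figures quoted above (20-25 kV, a few amperes, ~85% conversion).
# The 3 A beam current is an assumed example value.
voltage_kv = 25.0
current_a = 3.0                 # assumed beam current
conversion = 0.85               # fraction of kinetic energy -> heat

beam_power_kw = voltage_kv * current_a          # P = V * I, in kW
thermal_power_kw = conversion * beam_power_kw
print(f"beam power : {beam_power_kw:.0f} kW")
print(f"heating    : {thermal_power_kw:.1f} kW "
      "(the rest is lost to X-rays and secondary electron emission)")
```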


Material evaporation methods


Refractory carbides like titanium carbide and borides like titanium boride and zirconium boride can evaporate without undergoing decomposition in the vapor phase. These compounds are deposited by direct evaporation. In this process these compounds, compacted in the form of an ingot, are evaporated in vacuum by the focused high energy electron beam and the vapors are directly condensed over the substrate. Certain refractory oxides and carbides undergo fragmentation during their evaporation by the electron beam, resulting in a stoichiometry that is different from the initial material. For example, alumina, when evaporated by electron beam, dissociates into aluminum, AlO3 and Al2O. Some refractory carbides like silicon carbide and tungsten carbide decompose upon heating and the dissociated elements have different volatilities. These compounds can be deposited on the substrate either by reactive evaporation or by co-evaporation. In the reactive evaporation process, the metal is evaporated from the ingot by the electron beam. The vapors are carried by the reactive gas, which is oxygen in case of metal oxides or acetylene in case of metal carbides. When the thermodynamic conditions are met, the vapors react with the gas in the vicinity of the substrate to form films. Metal carbide films can also be deposited by co-evaporation. In this process, two ingots are used, one for metal and the other for carbon. Each ingot is heated with a different beam energy so that their evaporation rate can be controlled. As the vapors arrive at the surface, they chemically combine under proper thermodynamic conditions to form a metal carbide film.

The substrate
The substrate on which the film deposition takes place is ultrasonically cleaned and fastened to the substrate holder. The substrate holder is attached to the manipulator shaft. The manipulator shaft moves translationally to adjust the distance between the ingot source and the substrate. The shaft also rotates the substrate at a particular speed so that the film is uniformly deposited on the substrate. A negative bias DC voltage of 200 V to 400 V can be applied to the substrate. Often, focused high-energy electrons from one of the electron guns or infrared light from heater lamps is used to preheat the substrate. Heating of the substrate allows for increased adatom-substrate and adatom-film diffusion by giving the adatoms sufficient energy to overcome kinetic barriers. If a rough film, such as metallic nanorods, is desired, substrate cooling with water or liquid nitrogen may be employed to reduce the diffusion lifetime, positively bolstering surface kinetic barriers. To further enhance film roughness, the substrate may be mounted at a steep angle with respect to the flux to achieve geometric shadowing, where incoming line-of-sight flux lands only on the higher parts of the developing film. This method is known as glancing-angle deposition (GLAD) or oblique-angle deposition (OAD).

Ion beam assisted deposition


EBPVD systems are equipped with ion sources. These ion sources are used for substrate etching and cleaning, sputtering the target, and controlling the microstructure of the substrate. The ion beams bombard the surface and alter the microstructure of the film. When the deposition reaction takes place on the hot substrate surface, the films can develop an internal tensile stress due to the mismatch in the coefficient of thermal expansion between the substrate and the film. High-energy ions can be used to bombard these ceramic thermal barrier coatings and change the tensile stress into compressive stress. Ion bombardment also increases the density of the film, changes the grain size and modifies amorphous films into polycrystalline films. Low-energy ions are used for the surfaces of semiconductor films.


Advantages of EBPVD
The deposition rate in this process can be as low as 1 nm per minute or as high as a few micrometres per minute. The material utilization efficiency is high relative to other methods, and the process offers structural and morphological control of films. Due to the very high deposition rate, this process has potential industrial application for wear-resistant and thermal barrier coatings in the aerospace industry, hard coatings for the cutting and tool industries, and electronic and optical films for the semiconductor industry and thin-film solar applications.

Disadvantages of EBPVD
EBPVD is a line-of-sight deposition process when performed at a low enough pressure (roughly below 10⁻⁴ Torr). The translational and rotational motion of the shaft helps in coating the outer surface of complex geometries, but this process cannot be used to coat the inner surfaces of complex geometries. Another potential problem is that filament degradation in the electron gun results in a non-uniform evaporation rate. However, when vapor deposition is performed at pressures of roughly 10⁻⁴ Torr (1.3 × 10⁻⁴ hPa) or higher, significant scattering of the vapor cloud takes place, such that surfaces not in sight of the source can be coated. Strictly speaking, the gradual transition from line-of-sight to scattered deposition is determined not only by pressure (or mean free path) but also by the source-to-substrate distance. Certain materials are not well suited to evaporation by EBPVD. The following reference materials suggest appropriate evaporation techniques for many materials: Vacuum Engineering & Materials Co., Inc.; Kurt J. Lesker Company.
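The line-of-sight argument can be made semi-quantitative by comparing the gas-kinetic mean free path with the source-to-substrate distance. The hedged sketch below does this for a few pressures around 10⁻⁴ Torr; the molecular diameter, gas temperature and throw distance are all assumed example values.

```python
import math

# Sketch: estimate whether EBPVD is in the line-of-sight regime by comparing
# the mean free path of vapour particles with the source-to-substrate distance.
# Gas-kinetic formula: lambda = k_B * T / (sqrt(2) * pi * d^2 * p).
# Molecular diameter, temperature and throw distance are assumptions.
K_B = 1.380649e-23          # J/K
TORR_TO_PA = 133.322

def mean_free_path(p_pa, t_k=300.0, d_m=4.0e-10):
    return K_B * t_k / (math.sqrt(2) * math.pi * d_m**2 * p_pa)

throw_m = 0.30              # assumed source-to-substrate distance
for p_torr in (1e-5, 1e-4, 1e-3):
    lam = mean_free_path(p_torr * TORR_TO_PA)
    regime = "line-of-sight" if lam > throw_m else "scattered (non-line-of-sight)"
    print(f"p = {p_torr:.0e} Torr -> lambda ~ {lam:.2f} m -> {regime}")
```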


Evaporation (deposition)
Evaporation is a common method of thin-film deposition. The source material is evaporated in a vacuum. The vacuum allows vapor particles to travel directly to the target object (substrate), where they condense back to a solid state. Evaporation is used in microfabrication, and to make macro-scale products such as metallized plastic film.

Physical principle
Evaporation machine used for metallization at the LAAS technological facility in Toulouse, France.

Evaporation involves two basic processes: a hot source material evaporates, and it condenses on the substrate. It resembles the familiar process by which liquid water appears on the lid of a boiling pot. However, the gaseous environment and heat source (see "Equipment" below) are different.

Evaporation takes place in a vacuum, i.e. vapors other than the source material are almost entirely removed before the process begins. In high vacuum (with a long mean free path), evaporated particles can travel directly to the deposition target without colliding with the background gas. (By contrast, in the boiling-pot example, the water vapor pushes the air out of the pot before it can reach the lid.) At a typical pressure of 10⁻⁴ Pa, a 0.4-nm particle has a mean free path of 60 m. Hot objects in the evaporation chamber, such as heating filaments, produce unwanted vapors that limit the quality of the vacuum.

Evaporated atoms that collide with foreign particles may react with them; for instance, if aluminium is deposited in the presence of oxygen, it will form aluminium oxide. Such collisions also reduce the amount of vapor that reaches the substrate, which makes the thickness difficult to control.

Evaporated materials deposit nonuniformly if the substrate has a rough surface (as integrated circuits often do). Because the evaporated material attacks the substrate mostly from a single direction, protruding features block the evaporated material from some areas. This phenomenon is called "shadowing" or "step coverage." When evaporation is performed in poor vacuum or close to atmospheric pressure, the resulting deposition is generally non-uniform and tends not to be a continuous or smooth film; rather, the deposit will appear fuzzy.
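The 60 m figure quoted above can be checked with the standard gas-kinetic mean-free-path formula; the short sketch below does so, with the gas temperature of 300 K taken as an assumption.

```python
import math

# Quick check of the figure quoted above: at 1e-4 Pa a particle of diameter
# ~0.4 nm should have a mean free path of roughly 60 m.
# lambda = k_B * T / (sqrt(2) * pi * d^2 * p); T = 300 K is assumed.
K_B = 1.380649e-23          # J/K
p_pa = 1e-4
d_m = 0.4e-9
t_k = 300.0

lam = K_B * t_k / (math.sqrt(2) * math.pi * d_m**2 * p_pa)
print(f"mean free path ~ {lam:.0f} m")   # ~58 m, consistent with "60 m"
```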


Equipment
Any evaporation system includes a vacuum pump. It also includes an energy source that evaporates the material to be deposited. Many different energy sources exist:

In the thermal method, metal wire is fed onto heated semimetal (ceramic) evaporators known as "boats" due to their shape. A pool of melted metal forms in the boat cavity and evaporates into a cloud above the source. Alternatively, the source material is placed in a crucible, which is radiatively heated by an electric filament, or the source material may be hung from the filament itself (filament evaporation). Molecular beam epitaxy is an advanced form of thermal evaporation.
In the electron-beam method, the source is heated by an electron beam with an energy up to 15 keV.
In flash evaporation, a fine wire of source material is fed continuously onto a hot ceramic bar and evaporates on contact.
Resistive evaporation is accomplished by passing a large current through a resistive wire or foil containing the material to be deposited. The heating element is often referred to as an "evaporation source". Wire-type evaporation sources are made from tungsten wire and can be formed into filaments, baskets, heaters or loop-shaped point sources. Boat-type evaporation sources are made from tungsten, tantalum, molybdenum or ceramic materials capable of withstanding high temperatures.

Some systems mount the substrate on an out-of-plane planetary mechanism. The mechanism rotates the substrate simultaneously around two axes, to reduce shadowing.

Optimization
Purity of the deposited film depends on the quality of the vacuum and on the purity of the source material. At a given vacuum pressure, the film purity will be higher at higher deposition rates, as this minimises the relative rate of gaseous impurity inclusion. The thickness of the film will vary due to the geometry of the evaporation chamber; collisions with residual gases aggravate this nonuniformity of thickness. Wire filaments cannot deposit thick films, because the size of the filament limits the amount of material that can be deposited; evaporation boats and crucibles offer higher volumes for thicker coatings. Thermal evaporation offers faster evaporation rates than sputtering. Flash evaporation and other methods that use crucibles can deposit thick films. In order to deposit a material, the evaporation system must be able to melt it. This makes refractory materials such as tungsten hard to deposit by methods that do not use electron-beam heating. Electron-beam evaporation allows tight control of the evaporation rate. Thus, an electron-beam system with multiple beams and multiple sources can deposit a chemical compound or composite material of known composition.
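The purity argument can be illustrated by comparing the arrival rate of residual-gas molecules (Hertz-Knudsen flux, with a sticking coefficient taken as unity) with the arrival rate of film atoms. In the hedged sketch below, the film material (aluminium), residual gas (N2), chamber pressure and deposition rates are all assumed example values.

```python
import math

# Hedged sketch of the purity argument above: compare the impingement flux of
# residual gas (Hertz-Knudsen formula, sticking coefficient assumed to be 1)
# with the flux of film atoms arriving at the substrate. Aluminium film, N2
# residual gas, 300 K, and the chosen pressure/rates are illustrative.
K_B = 1.380649e-23                      # J/K
N_A = 6.02214076e23                     # 1/mol

def gas_flux(p_pa, molar_kg=0.028, t_k=300.0):
    m = molar_kg / N_A
    return p_pa / math.sqrt(2.0 * math.pi * m * K_B * t_k)   # molecules / m^2 s

al_atom_density = 2700.0 / 0.027 * N_A  # atoms per m^3 of solid aluminium

p_pa = 1e-4                             # assumed chamber pressure
for rate_nm_s in (0.1, 1.0, 10.0):
    film_flux = rate_nm_s * 1e-9 * al_atom_density           # atoms / m^2 s
    ratio = gas_flux(p_pa) / film_flux
    print(f"{rate_nm_s:5.1f} nm/s -> impurity arrival / film arrival ~ {ratio:.1%}")
# Faster deposition at the same pressure -> proportionally fewer trapped impurities.
```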


Applications
An important example of an evaporative process is the production of aluminized PET packaging film in a roll-to-roll web system. Often, the aluminum layer in this material is not thick enough to be entirely opaque, since a thinner layer can be deposited more cheaply than a thick one. The main purpose of the aluminum is to isolate the product from the external environment by creating a barrier to the passage of light, oxygen, or water vapor. Evaporation is also commonly used in microfabrication to deposit metal films.

Comparison to other deposition methods


Alternatives to evaporation, such as sputtering and chemical vapor deposition, have better step coverage. This may be an advantage or a disadvantage, depending on the desired result. Sputtering tends to deposit material more slowly than evaporation. Sputtering uses a plasma, which produces many high-speed atoms that bombard the substrate and may damage it. Evaporated atoms have a Maxwellian energy distribution, determined by the temperature of the source, which reduces the number of high-speed atoms. However, electron beams tend to produce X-rays (bremsstrahlung) and stray electrons, each of which can also damage the substrate.

Applications include astronomical telescope mirrors, aluminized PET film, and microfabrication.


Pulsed laser deposition


Pulsed laser deposition (PLD) is a thin-film deposition technique (specifically a physical vapor deposition, PVD, technique) in which a high-power pulsed laser beam is focused inside a vacuum chamber to strike a target of the material that is to be deposited. This material is vaporized from the target (in a plasma plume) and deposited as a thin film on a substrate (such as a silicon wafer facing the target). This process can occur in ultra-high vacuum or in the presence of a background gas, such as oxygen, which is commonly used when depositing oxides to fully oxygenate the deposited films. While the basic setup is simple relative to many other deposition techniques, the physical phenomena of laser-target interaction and film growth are quite complex. When the laser pulse is absorbed by the target, energy is first converted into electronic excitation and then into thermal, chemical and mechanical energy, resulting in evaporation, ablation, plasma formation and even exfoliation. The ejected species expand into the surrounding vacuum in the form of a plume containing many energetic species, including atoms, molecules, electrons, ions, clusters, particulates and molten globules, before depositing on the typically hot substrate.
A plume ejected from a SrRuO3 target during pulsed laser deposition.

Process
One possible configuration of a PLD deposition chamber.

The detailed mechanisms of PLD are very complex, including the ablation process of the target material by the laser irradiation, the development of a plasma plume with highly energetic ions, electrons and neutrals, and the crystalline growth of the film itself on the heated substrate. The process of PLD can generally be divided into four stages:

1. Laser ablation of the target material and creation of a plasma
2. Dynamics of the plasma
3. Deposition of the ablation material on the substrate
4. Nucleation and growth of the film on the substrate surface

Each of these steps is crucial for the crystallinity, uniformity and stoichiometry of the resulting film.

Laser ablation of the target material and creation of a plasma


The ablation of the target material upon laser irradiation and the creation of plasma are very complex processes. The removal of atoms from the bulk material is done by vaporization of the bulk at the surface region in a state of non-equilibrium. In this process, the incident laser pulse penetrates into the surface of the material within the penetration depth. This dimension depends on the laser wavelength and the index of refraction of the target material at the applied laser wavelength, and is typically in the region of 10 nm for most materials. The strong electric field generated by the laser light is sufficiently strong to remove electrons from the bulk material of the penetrated volume. This process occurs within 10 ps of a ns laser pulse and is caused by non-linear processes such as multiphoton ionization, which are enhanced by microscopic cracks at the surface, voids, and nodules, which increase the electric field. The free electrons oscillate within the electromagnetic field of the laser light and can collide with the atoms of the bulk material, thus transferring some of their energy to the lattice of the target material within the surface region. The surface of the target is then heated up and the material is vaporized.


Dynamics of the plasma


In the second stage the material expands in a plasma parallel to the normal vector of the target surface towards the substrate, due to Coulomb repulsion and recoil from the target surface. The spatial distribution of the plume depends on the background pressure inside the PLD chamber. The density of the plume can be described by a cosⁿ(θ) law, with a shape similar to a Gaussian curve. The dependency of the plume shape on the pressure can be described in three stages:

1. The vacuum stage, where the plume is very narrow and forward directed; almost no scattering occurs with the background gases.
2. The intermediate region, where a splitting of the high-energy ions from the less energetic species can be observed. The time-of-flight (TOF) data can be fitted to a shock-wave model; however, other models could also be possible.
3. The high-pressure region, where a more diffusion-like expansion of the ablated material is found. Naturally this scattering is also dependent on the mass of the background gas and can influence the stoichiometry of the deposited film.

The most important consequence of increasing the background pressure is the slowing down of the high-energy species in the expanding plasma plume. It has been shown that particles with kinetic energies around 50 eV can resputter the film already deposited on the substrate. This results in a lower deposition rate and can furthermore result in a change in the stoichiometry of the film.
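The cosⁿ(θ) profile mentioned above can be illustrated with a few lines of code. The sketch below reports the angle at which the plume density falls to half of its on-axis value for a few arbitrary values of n, showing how a larger exponent (more forward-directed plume, as at lower background pressure) narrows the angular spread; the n values are example assumptions.

```python
import math

# Illustrative sketch of the cos^n(theta) plume profile: increasing n gives a
# more forward-directed (narrower) plume. The n values are arbitrary examples.
def plume_profile(theta_deg, n):
    return math.cos(math.radians(theta_deg)) ** n

for n in (1, 4, 12):
    half_angle = next(t for t in range(0, 91)
                      if plume_profile(t, n) < 0.5)  # density drops to 50% here
    print(f"n = {n:2d}: density falls to half its on-axis value near {half_angle} deg")
```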

Deposition of the ablation material on the substrate


The third stage is important to determine the quality of the deposited films. The high energetic species ablated from the target are bombarding the substrate surface and may cause damage to the surface by sputtering off atoms from the surface but also by causing defect formation in the deposited film. The sputtered species from the substrate and the particles emitted from the target form a collision region, which serves as a source for condensation of particles. When the condensation rate is high enough, a thermal equilibrium can be reached and the film grows on the substrate surface at the expense of the direct flow of ablation particles and the thermal equilibrium obtained.

Nucleation and growth of the film on the substrate surface


The nucleation process and growth kinetics of the film depend on several growth parameters, including:

Laser parameters: factors such as the laser fluence (J/cm²), laser energy, and ionization degree of the ablated material affect the film quality, the stoichiometry, and the deposition flux. Generally, the nucleation density increases when the deposition flux is increased.
Surface temperature: the surface temperature has a large effect on the nucleation density. Generally, the nucleation density decreases as the temperature is increased.
Substrate surface: the nucleation and growth can be affected by the surface preparation (such as chemical etching), the miscut of the substrate, and the roughness of the substrate.
Background pressure: in oxide deposition, an oxygen background is commonly needed to ensure stoichiometric transfer from the target to the film. If, for example, the oxygen background is too low, the film will grow off-stoichiometry, which will affect the nucleation density and film quality.

In PLD, a large supersaturation occurs on the substrate during the pulse duration. The pulse lasts around 10-40 microseconds, depending on the laser parameters. This high supersaturation causes a very large nucleation density on the surface compared to molecular beam epitaxy or sputter deposition. This nucleation density increases the smoothness of the deposited film.

In PLD, depending on the deposition parameters above, three growth modes are possible:

Step-flow growth: all substrates have a miscut associated with the crystal. These miscuts give rise to atomic steps on the surface. In step-flow growth, atoms land on the surface and diffuse to a step edge before they have a chance to nucleate a surface island. The growing surface is viewed as steps traveling across the surface. This growth mode is obtained by deposition on a substrate with a high miscut, or by depositing at elevated temperatures.
Layer-by-layer growth: in this growth mode, islands nucleate on the surface until a critical island density is reached. As more material is added, the islands continue to grow until they begin to run into each other. This is known as coalescence. Once coalescence is reached, the surface has a large density of pits. When additional material is added to the surface, the atoms diffuse into these pits to complete the layer. This process is repeated for each subsequent layer.
3D growth: this mode is similar to layer-by-layer growth, except that once an island is formed an additional island will nucleate on top of the first island. Therefore the growth does not persist in a layer-by-layer fashion, and the surface roughens each time material is added.


History
Pulsed laser deposition is only one of many thin-film deposition techniques. Other methods include molecular beam epitaxy (MBE), chemical vapor deposition (CVD), and sputter deposition (RF, magnetron, and ion beam). The history of laser-assisted film growth started soon after the technical realization of the first laser in 1960 by Maiman. Smith and Turner utilized a ruby laser to deposit the first thin films in 1965, three years after Breech and Cross studied the laser vaporization and excitation of atoms from solid surfaces. However, the deposited films were still inferior to those obtained by other techniques such as chemical vapor deposition and molecular beam epitaxy. In the early 1980s, a few research groups (mainly in the former USSR) achieved remarkable results in the manufacturing of thin-film structures utilizing laser technology. The breakthrough came in 1987, when Dijkkamp and Venkatesan were able to laser-deposit a thin film of YBa2Cu3O7, a high-temperature superconducting material, of better quality than films deposited with alternative techniques. Since then, the technique of pulsed laser deposition has been utilized to fabricate high-quality crystalline films. The deposition of ceramic oxides, nitride films, metallic multilayers and various superlattices has been demonstrated. In the 1990s the development of new laser technology, such as lasers with high repetition rates and short pulse durations, made PLD a very competitive tool for the growth of thin, well-defined films with complex stoichiometry.

Technical aspects
There are many different arrangements for building a deposition chamber for PLD. The target material which is evaporated by the laser is normally found as a rotating disc attached to a support. However, it can also be sintered into a cylindrical rod with rotational motion and a translational up-and-down movement along its axis. This special configuration allows not only the utilization of a synchronized reactive gas pulse but also of a multicomponent target rod, with which films of different multilayers can be created.

Some factors that influence the deposition rate:
Target material
Pulse energy of the laser
Distance from the target to the substrate
Type of gas and pressure in the chamber (oxygen, argon, etc.)


Sputter deposition
Sputter deposition is a physical vapor deposition (PVD) method of depositing thin films by sputtering. This involves ejecting material from a "target" that is a source onto a "substrate" such as a silicon wafer. Resputtering is re-emission of the deposited material during the deposition process by ion or atom bombardment.

Sputtered atoms ejected from the target have a wide energy distribution, typically up to tens of eV (100,000 K). The sputtered ions (typically only a small fraction, on the order of 1%, of the ejected particles are ionized) can fly ballistically from the target in straight lines and impact energetically on the substrate or vacuum chamber walls (causing resputtering). Alternatively, at higher gas pressures, the ions collide with the gas atoms, which act as a moderator, and move diffusively, reaching the substrate or vacuum chamber wall and condensing after undergoing a random walk. The entire range from high-energy ballistic impact to low-energy thermalized motion is accessible by changing the background gas pressure.

The sputtering gas is often an inert gas such as argon. For efficient momentum transfer, the atomic weight of the sputtering gas should be close to the atomic weight of the target, so for sputtering light elements neon is preferable, while for heavy elements krypton or xenon are used. Reactive gases can also be used to sputter compounds. The compound can be formed on the target surface, in flight, or on the substrate, depending on the process parameters. The availability of many parameters that control sputter deposition makes it a complex process, but it also allows experts a large degree of control over the growth and microstructure of the film.
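The parenthetical temperature above is simply the kinetic energy expressed through E = k_B·T; the short sketch below performs that conversion for a few example energies.

```python
# Quick unit check for the figure quoted above: expressing kinetic energies of
# a few eV to a few tens of eV as an equivalent temperature via E = k_B * T.
K_B_EV = 8.617333262e-5       # Boltzmann constant in eV/K

for energy_ev in (1.0, 10.0, 30.0):
    t_equiv = energy_ev / K_B_EV
    print(f"{energy_ev:5.1f} eV  ~  {t_equiv:,.0f} K")
# 10 eV corresponds to roughly 1.2e5 K, in line with the "(100,000 K)" above.
```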

Uses of sputtering
Sputtering is used extensively in the semiconductor industry to deposit thin films of various materials in integrated circuit processing. Thin antireflection coatings on glass for optical applications are also deposited by sputtering. Because of the low substrate temperatures used, sputtering is an ideal method to deposit contact metals for thin-film transistors. Perhaps the most familiar products of sputtering are low-emissivity coatings on glass, used in double-pane window assemblies. The coating is a multilayer containing silver and metal oxides such as zinc oxide, tin oxide, or titanium dioxide. A large industry has developed around tool bit coating using sputtered nitrides, such as titanium nitride, creating the familiar gold colored hard coat. Sputtering is also used as the process to deposit the metal (e.g. aluminium) layer during the fabrication of CDs and DVDs.

Hard disk surfaces use sputtered CrOx and other sputtered materials. Sputtering is one of the main processes for manufacturing optical waveguides and is another way of making efficient photovoltaic solar cells.


Comparison with other deposition methods


An important advantage of sputter deposition is that even materials with very high melting points are easily sputtered, while evaporation of these materials in a resistance evaporator or Knudsen cell is problematic or impossible. Sputter-deposited films have a composition close to that of the source material. The difference is due to different elements spreading differently because of their different masses (light elements are deflected more easily by the gas), but this difference is constant. Sputtered films typically have better adhesion to the substrate than evaporated films. A target contains a large amount of material and is maintenance-free, making the technique suited for ultra-high-vacuum applications. Sputtering sources contain no hot parts (to avoid heating they are typically water-cooled) and are compatible with reactive gases such as oxygen. Sputtering can be performed top-down, while evaporation must be performed bottom-up. Advanced processes such as epitaxial growth are possible.

A typical ring-geometry sputter target, here gold showing the cathode made of the material to be deposited, the anode counter-electrode and an outer ring meant to prevent sputtering of the hearth that holds the target.

Some disadvantages of the sputtering process are that the process is more difficult to combine with a lift-off for structuring the film. This is because the diffuse transport, characteristic of sputtering, makes a full shadow impossible. Thus, one cannot fully restrict where the atoms go, which can lead to contamination problems. Also, active control for layer-by-layer growth is difficult compared to pulsed laser deposition and inert sputtering gases are built into the growing film as impurities.

Types of sputter deposition


Sputtering sources often employ magnetrons that utilize strong electric and magnetic fields to confine charged plasma particles close to the surface of the sputter target. In a magnetic field, electrons follow helical paths around magnetic field lines, undergoing more ionizing collisions with gaseous neutrals near the target surface than would otherwise occur. (As the target material is depleted, a "racetrack" erosion profile may appear on the surface of the target.) The sputter gas is typically an inert gas such as argon. The extra argon ions created as a result of these collisions lead to a higher deposition rate. They also mean that the plasma can be sustained at a lower pressure. The sputtered atoms are neutrally charged and so are unaffected by the magnetic trap. Charge build-up on insulating targets can be avoided with the use of RF sputtering, where the sign of the anode-cathode bias is varied at a high rate (commonly 13.56 MHz). RF sputtering works well for producing highly insulating oxide films, but with the added expense of RF power supplies and impedance-matching networks. Stray magnetic fields leaking from ferromagnetic targets also disturb the sputtering process; specially designed sputter guns with unusually strong permanent magnets must often be used in compensation.
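The confinement argument can be illustrated by comparing gyroradii, r = m·v/(q·B), for an electron and an argon ion in a magnetron-like field. The field strength and particle energies in the sketch below are assumed example values; only the orders of magnitude matter, and they show why the light electrons are trapped near the target while the far heavier ions are barely deflected.

```python
import math

# Hedged sketch: gyroradius r = m*v/(q*B) for an electron and an argon ion in
# a magnetron-like field. Field strength and particle energies are assumed
# example values; only the orders of magnitude are meaningful.
Q_E = 1.602176634e-19      # C
M_E = 9.1093837e-31        # kg
M_AR = 39.948 * 1.66053907e-27

def gyroradius(mass_kg, energy_ev, b_t):
    v = math.sqrt(2.0 * energy_ev * Q_E / mass_kg)
    return mass_kg * v / (Q_E * b_t)

B = 0.03                   # assumed ~300 gauss near the target surface
print(f"electron (10 eV): r ~ {gyroradius(M_E, 10.0, B)*1e3:.2f} mm")
print(f"Ar+ ion   (1 eV): r ~ {gyroradius(M_AR, 1.0, B)*1e2:.1f} cm")
```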


Ion-beam sputtering
Ion-beam sputtering (IBS) is a method in which the target is external to the ion source. A source can work without any magnetic field like in a hot filament ionization gauge. In a Kaufman source ions are generated by collisions with electrons that are confined by a magnetic field as in a magnetron. They are then accelerated by the electric field emanating from a grid toward a target. As the ions leave the source they are neutralized by electrons from a second external filament. IBS has an advantage in that the energy and flux of ions can be controlled independently. Since the flux that strikes the target is composed of neutral atoms, either insulating or conducting targets can be sputtered. IBS has found application in the manufacture of thin-film heads for disk drives. A pressure gradient between the ion source and the sample chamber is generated by placing the gas inlet at the source and shooting through a tube into the sample chamber. This saves gas and reduces contamination in UHV applications. The principal drawback of IBS is the large amount of maintenance required to keep the ion source operating.

Reactive sputtering

A magnetron sputter gun showing the target-mounting surface, the vacuum feedthrough, the power connector and the water lines. This design uses a disc target as opposed to the ring geometry illustrated above.

In reactive sputtering, the deposited film is formed by chemical reaction between the target material and a gas which is introduced into the vacuum chamber. Oxide and nitride films are often fabricated using reactive sputtering. The composition of the film can be controlled by varying the relative pressures of the inert and reactive gases. Film stoichiometry is an important parameter for optimizing functional properties like the stress in SiNx and the index of refraction of SiOx.

Ion-assisted deposition
In ion-assisted deposition (IAD), the substrate is exposed to a secondary ion beam operating at a lower power than the sputter gun. Usually a Kaufman source, like that used in IBS, supplies the secondary beam. IAD can be used to deposit carbon in diamond-like form on a substrate. Any carbon atoms landing on the substrate which fail to bond properly in the diamond crystal lattice will be knocked off by the secondary beam. NASA used this technique to experiment with depositing diamond films on turbine blades in the 1980s. IAD is used in other important industrial applications such as creating tetrahedral amorphous carbon surface coatings on hard disk platters and hard transition metal nitride coatings on medical implants.


High-target-utilization sputtering
Sputtering may also be performed by remote generation of a high density plasma. The plasma is generated in a side chamber opening into the main process chamber, containing the target and the substrate to be coated. As the plasma is generated remotely, and not from the target itself (as in conventional magnetron sputtering), the ion current to the target is independent of the voltage applied to the target.

Comparison of target utilization via HiTUS process - 95%

High-power impulse magnetron sputtering (HIPIMS)


HIPIMS is a method for physical vapor deposition of thin films which is based on magnetron sputter deposition. HIPIMS utilizes extremely high power densities of the order of kW/cm2 in short pulses (impulses) of tens of microseconds at low duty cycle of < 10%.
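As a back-of-the-envelope illustration of these figures, the sketch below converts an assumed peak power density, pulse length and repetition rate into a time-averaged load on the target; none of the numbers are prescribed HIPIMS settings.

```python
# Minimal sketch with assumed numbers: kW/cm^2 peaks in short pulses at low duty
# cycle still give a modest time-averaged power density.
peak_power_density = 2.0e3   # assumed peak, W/cm^2
pulse_length       = 50e-6   # assumed pulse length, s
repetition_rate    = 500     # assumed pulses per second, Hz

duty_cycle = pulse_length * repetition_rate           # fraction of time the pulse is on
avg_power_density = peak_power_density * duty_cycle   # time-averaged W/cm^2

print(f"duty cycle            : {duty_cycle*100:.1f} %")
print(f"average power density : {avg_power_density:.0f} W/cm^2")
# 50 us * 500 Hz = 2.5% duty cycle, so a 2 kW/cm^2 peak averages to ~50 W/cm^2.
```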

Gas flow sputtering


Gas flow sputtering makes use of the hollow cathode effect, the same effect by which hollow cathode lamps operate. In gas flow sputtering a working gas like argon is led through an opening in a metal subjected to a negative electrical potential. Enhanced plasma densities occur in the hollow cathode if the pressure p in the chamber and a characteristic dimension L of the hollow cathode obey Paschen's law, 0.5 Pa·m < p·L < 5 Pa·m. This causes a high flux of ions on the surrounding surfaces and a large sputter effect. Hollow-cathode based gas flow sputtering may thus be associated with large deposition rates, up to values of a few µm/min.
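A small sketch of how that pressure-dimension window constrains the geometry at a given working pressure; the pressure chosen below is only an example.

```python
# Minimal sketch: rearrange the hollow-cathode condition 0.5 Pa*m < p*L < 5 Pa*m
# to give the allowed characteristic dimension L at a chosen pressure.
def hollow_cathode_L_range(p_pa, pl_min=0.5, pl_max=5.0):
    """Return (L_min, L_max) in metres for chamber pressure p_pa in pascals."""
    return pl_min / p_pa, pl_max / p_pa

p = 50.0  # assumed working pressure, Pa
L_min, L_max = hollow_cathode_L_range(p)
print(f"At {p:.0f} Pa the characteristic dimension should lie between "
      f"{L_min*1e3:.0f} mm and {L_max*1e3:.0f} mm.")
# 0.5/50 = 0.01 m and 5/50 = 0.1 m, i.e. roughly 10-100 mm at 50 Pa.
```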

Structure and morphology


In 1974 J. A. Thornton applied the structure zone model for the description of thin film morphologies to sputter deposition. In a study on metallic layers prepared by DC sputtering, he extended the structure zone concept initially introduced by Movchan and Demchishin for evaporated films. Thornton introduced a further structure zone T, which was observed at low argon pressures and is characterized by densely packed fibrous grains. The most important point of this extension was to emphasize the pressure p as a decisive process parameter. In particular, if hyperthermal techniques like sputtering are used for the sublimation of source atoms, the pressure governs, via the mean free path, the energy distribution with which they impinge on the surface of the growing film. Next to the deposition temperature Td, the chamber pressure or mean free path should thus always be specified when considering a deposition process.

Since sputter deposition belongs to the group of plasma-assisted processes, charged species (like argon ions) hit the surface of the growing film in addition to the neutral atoms, and this component may exert a large effect. Denoting the fluxes of the arriving ions and atoms by Ji and Ja, it turned out that the magnitude of the Ji/Ja ratio plays a decisive role in the microstructure and morphology obtained in the film. The effect of ion bombardment may be derived quantitatively from structural parameters like the preferred orientation of crystallites (texture) and from the state of residual stress. It has been shown recently that textures and residual stresses may arise in gas-flow sputtered Ti layers that are comparable to those obtained in macroscopic Ti work pieces subjected to severe plastic deformation by shot peening.
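To make the mean-free-path argument concrete, here is a minimal sketch using the standard kinetic-theory estimate; the argon atomic diameter is an assumed handbook value.

```python
# Minimal sketch: mean free path lambda = k_B*T / (sqrt(2)*pi*d^2*p) for argon
# at a few example sputter pressures. Values are illustrative only.
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T   = 300.0          # gas temperature, K
d   = 3.7e-10        # assumed effective diameter of an argon atom, m

def mean_free_path(p_pa):
    return k_B * T / (math.sqrt(2) * math.pi * d**2 * p_pa)

for p in (0.1, 1.0, 10.0):   # Pa
    print(f"p = {p:5.1f} Pa  ->  lambda ~ {mean_free_path(p)*1e3:.1f} mm")
# Lowering the pressure tenfold lengthens the mean free path tenfold, so the
# sputtered atoms arrive at the film with more of their original energy intact.
```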

Calo tester
Coatings with thicknesses typically between 0.1 and 50 micrometres, such as PVD coatings or CVD coatings, are used in many industries to improve the surface properties of tools and components. The Calo tester is a quick, simple and inexpensive piece of equipment used to measure the thickness of these coatings. The Calo tester, also known as a ball craterer or coating thickness tester, is also used to measure the amount of coating wear after a wear test carried out using a pin on disc tester. The Calo tester consists of a holder for the surface to be tested and a steel sphere of known diameter that is rotated against the surface by a rotating shaft connected to a motor whilst diamond paste is applied to the contact area. The sphere is rotated for a short period of time (less than 20 seconds for a 0.1 to 5 micrometre thickness) but, due to the abrasive nature of the diamond paste, this is sufficient time to wear a crater through thin coatings.

Calculating coating thickness using the Calo tester


An optical microscope is used to take two measurements across the crater after the Calo test, and the coating thickness t is then calculated from these measurements and the known diameter d of the sphere using a simple geometrical equation.
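The geometrical equation itself follows from simple spherical-cap geometry. The sketch below uses assumed symbol names (D for the outer crater diameter and d_i for the inner crater diameter where the substrate is exposed; d is the sphere diameter as above) and holds when the crater is shallow compared with the sphere:

```latex
t \;\approx\; \frac{D^{2}-d_i^{2}}{4\,d}
  \;=\; \frac{(D-d_i)(D+d_i)}{4\,d}, \qquad t \ll d .
```

For example, with an assumed 30 mm ball and measured crater diameters D = 1.0 mm and d_i = 0.8 mm, t ≈ (1.0² − 0.8²)/(4 × 30) mm ≈ 3 µm.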

Other coating testing equipment


Scratch tester - Measures adhesion Pin on disc tester - Measures friction and wear


Nanoindentation
Nanoindentation is a variety of indentation hardness tests applied to small volumes. Indentation is perhaps the most commonly applied means of testing the mechanical properties of materials. The nanoindentation technique was developed in the mid-1970s to measure the hardness of small volumes of material.

Background
In a traditional indentation test (macro or micro indentation), a hard tip whose mechanical properties are known (frequently made of a very hard material like diamond) is pressed into a sample whose properties are unknown. The load placed on the indenter tip is increased as the tip penetrates further into the specimen and soon reaches a user-defined value. At this point, the load may be held constant for a period or removed. The area of the residual indentation in the sample is measured and the hardness, H, is defined as the maximum load, Pmax, divided by the residual indentation area, Ar:

H = Pmax / Ar

For most techniques, the projected area may be measured directly using light microscopy. As can be seen from this equation, a given load will make a smaller indent in a "hard" material than a "soft" one. This technique is limited due to large and varied tip shapes, with indenter rigs which do not have very good spatial resolution (the location of the area to be indented is very hard to specify accurately). Comparison across experiments, typically done in different laboratories, is difficult and often meaningless. Nanoindentation improves on these macro and micro indentation tests by indenting on the nanoscale with a very precise tip shape, high spatial resolution to place the indents, and by providing real-time load-displacement (into the surface) data while the indentation is in progress.


Nanoindentation
In nanoindentation small loads and tip sizes are used, so the indentation area may only be a few square micrometres or even nanometres. This presents problems in determining the hardness, as the contact area is not easily found. Atomic force microscopy or scanning electron microscopy techniques may be utilized to image the indentation, but can be quite cumbersome. Instead, an indenter with a geometry known to high precision (usually a Berkovich tip, which has a three-sided pyramid geometry) is employed. During the course of the instrumented indentation process, a record of the depth of penetration is made, and then the area of the indent is determined using the known geometry of the indentation tip. While indenting, various parameters such as load and depth of penetration can be measured. A record of these values can be plotted on a graph to create a load-displacement curve (such as the schematic shown in the figure). These curves can be used to extract mechanical properties of the material.

Figure: Schematic of load-displacement curve for an instrumented nanoindentation test

Young's modulus: The slope of the curve, dP/dh, upon unloading is indicative of the stiffness S of the contact. This value generally includes a contribution from both the material being tested and the response of the test device itself. The stiffness of the contact can be used to calculate the reduced Young's modulus Er:

Er = (√π / (2β)) · S / √Ap(hc)

where Ap(hc) is the projected area of the indentation at the contact depth hc, and β is a geometrical constant on the order of unity. Ap(hc) is often approximated by a fitting polynomial, as shown below for a Berkovich tip:

Ap(hc) = C0·hc² + C1·hc + C2·hc^(1/2) + C3·hc^(1/4) + ...

where C0 for a Berkovich tip is 24.5, while for a cube corner (90°) tip it is 2.598. The reduced modulus Er is related to the Young's modulus Es of the test specimen through the following relationship from contact mechanics:

1/Er = (1 − νi²)/Ei + (1 − νs²)/Es

Here, the subscript i indicates a property of the indenter material and ν is Poisson's ratio. For a diamond indenter tip, Ei is 1140 GPa and νi is 0.07. Poisson's ratio of the specimen, νs, generally varies between 0 and 0.5 for most materials (though it can be negative) and is typically around 0.3.
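A minimal numerical sketch of these relations follows; the function names and input values are assumed for illustration, and β ≈ 1.034 for a Berkovich tip is a commonly used assumption rather than a value taken from the text.

```python
# Minimal sketch of the reduced-modulus relations above; inputs are assumed examples.
import math

def reduced_modulus(S, A_p, beta=1.034):
    """Er from contact stiffness S (N/m) and projected contact area A_p (m^2).
    beta ~ 1.034 is a value commonly assumed for a Berkovich tip."""
    return (math.sqrt(math.pi) / (2.0 * beta)) * S / math.sqrt(A_p)

def specimen_modulus(E_r, nu_s, E_i=1140e9, nu_i=0.07):
    """Invert 1/Er = (1 - nu_i^2)/Ei + (1 - nu_s^2)/Es for the specimen modulus Es."""
    return (1.0 - nu_s**2) / (1.0 / E_r - (1.0 - nu_i**2) / E_i)

h_c = 200e-9                 # assumed contact depth, m
A_p = 24.5 * h_c**2          # leading term of the Berkovich area function
E_r = reduced_modulus(S=2.0e5, A_p=A_p)   # assumed contact stiffness of 2e5 N/m
print(f"Er ~ {E_r/1e9:.0f} GPa, Es ~ {specimen_modulus(E_r, nu_s=0.3)/1e9:.0f} GPa")
```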


Hardness: There are two different types of hardness that can be obtained from a nano indenter: one is as in traditional macroindentation tests where one attains a single hardness value per experiment; the other is based on the hardness as the material is being indented resulting in hardness as a function of depth.

The hardness is given by the equation above, relating the maximum load to the indentation area. The area can be measured after the indentation by in-situ atomic force microscopy, or by 'after-the event' optical (or electron) microscopy. An example indentation image, from which the area may be determined, is shown at right.

An AFM image of an indent left by a Berkovich tip in a Zr-Cu-Al metallic glass; the plastic flow of the material around the indenter is apparent.

Some nanoindenters use an area function based on the geometry of the tip, compensating for elastic load during the test. Use of this area function provides a method of gaining real-time nanohardness values from a load-displacement graph. However, there is some controversy over the use of area functions to estimate the residual areas versus direct measurement. An area function typically describes the projected area of an indent as a 2nd-order polynomial function of the indenter depth h. Exclusive application of an area function in the absence of adequate knowledge of material response can lead to misinterpretation of resulting data. Cross-checking of areas microscopically is to be encouraged.

Strain-rate sensitivity: The strain-rate sensitivity m of the flow stress is defined as

m = ∂(ln σ) / ∂(ln ε̇)

where σ is the flow stress and ε̇ is the strain rate produced under the indenter. For nanoindentation experiments which include a holding period at constant load (i.e. the flat, top area of the load-displacement curve), m can be determined from

m = ∂(ln H) / ∂(ln ε̇p)

The subscripts indicate that these values are to be determined from the plastic components only.

Activation volume: Interpreted loosely as the volume swept out by dislocations during thermal activation, the activation volume is

V* = kB·T · ∂(ln ε̇) / ∂σ

where T is the temperature and kB is Boltzmann's constant. From the definition of m, it is easy to see that V* = kB·T / (σ·m).
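A minimal sketch of extracting m from constant-load hold data as the log-log slope of hardness versus indentation strain rate; the data points below are invented for illustration.

```python
# Minimal sketch: estimate the strain-rate sensitivity m as the least-squares
# slope of ln(H) versus ln(strain rate). Data points are assumed examples.
import math

strain_rate = [1e-3, 3e-3, 1e-2, 3e-2]            # 1/s (assumed)
hardness    = [1.00e9, 1.06e9, 1.12e9, 1.19e9]    # Pa  (assumed)

x = [math.log(s) for s in strain_rate]
y = [math.log(h) for h in hardness]
n = len(x)
xm, ym = sum(x) / n, sum(y) / n
m = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / sum((xi - xm) ** 2 for xi in x)
print(f"strain-rate sensitivity m ~ {m:.3f}")
```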


Software
Software is the usual way to analyze nanoindentation load versus displacement curves for hardness and elastic modulus calculations. The Martens hardness, HM, is simple enough for a programmer with minimal background to implement. The software starts by searching for the maximum displacement, hmax, and the maximum load, Pmax.

The maximum displacement is used to calculate the contact surface area, As, based on the indenter geometry; for a perfect Berkovich indenter the surface area is proportional to hmax², and the Martens hardness is HM = Pmax / As. The indentation hardness, HIT, is defined slightly differently: it relates the maximum load to the projected contact area Ap, HIT = Pmax / Ap.

As the indent size decreases, the error caused by tip rounding increases. Tip wear can be accounted for within the software by using a simple polynomial area function; as the indenter tip wears, the C1 value will increase. The user enters the values for C0 and C1 based on direct measurements, such as SEM or AFM images of the indenter tip, or indirectly by using a material of known elastic modulus or an AFM image of an indentation.

Calculating the elastic modulus with software involves using filtering techniques to separate the critical unloading data from the rest of the load-displacement data. The start and end points are usually found by using user-defined percentages; this user input increases the variability because of possible human error, and it would be best if the entire calculation process were done automatically for more consistent results. A good nanoindentation machine prints out the load-unload curve data with labels for each of the segments, such as loading, top hold, unload, bottom hold, and reloading; if multiple cycles are used, each one should be labeled. However, most nanoindenters only give the raw data for the load-unload curves. An automatic software technique finds the sharp change from the top hold time to the beginning of the unloading. This can be found by doing a linear fit to the top hold-time data: the unload data start when the load is 1.5 standard deviations less than the hold-time load, and the minimum data point is the end of the unloading data. The computer then calculates the elastic modulus from this data according to the Oliver-Pharr (nonlinear) method. The Doerner-Nix method is less complicated to program because it is a linear curve fit of the selected minimum-to-maximum data; however, it is limited because the calculated elastic modulus will decrease as more data points are used along the unloading curve. The Oliver-Pharr method fits the unloading curve data to the nonlinear relation P = k(h − hf)^m, where h is the depth variable, hf is the final depth, and k and m are fitting constants. The software must use a nonlinear convergence method to solve for the k, hf and m that best fit the unloading data. The slope dP/dh is then calculated by differentiating this fit at the maximum displacement.
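A minimal sketch of such an unloading fit using SciPy on synthetic data follows; the segment selection and tolerances of a real analysis are omitted, and the units are nm (depth) and mN (load).

```python
# Minimal sketch: fit the unloading segment to P = k*(h - hf)^m and evaluate
# the stiffness S = dP/dh at maximum displacement. The data are synthetic.
import numpy as np
from scipy.optimize import curve_fit

def unload(h, k, hf, m):
    return k * np.clip(h - hf, 0.0, None) ** m

rng = np.random.default_rng(0)
h = np.linspace(200.0, 130.0, 40)                                 # depth, nm
P = 2.5e-3 * (h - 120.0) ** 1.4 + rng.normal(0.0, 0.01, h.size)   # load, mN

(k, hf, m), _ = curve_fit(unload, h, P, p0=[1e-3, 100.0, 1.5], maxfev=10000)
S = k * m * (h.max() - hf) ** (m - 1)        # slope in mN/nm
print(f"hf = {hf:.1f} nm, m = {m:.2f}, S = {S*1e6:.3g} N/m")
```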

An image of the indent can also be measured using software. An atomic force microscope (AFM) scans the indent. First the lowest point of the indentation is found. Then an array of section lines is constructed radially from the indent center outward along the indent surface. Where a section line lies more than several standard deviations (>3 sigma) above the surface noise, an outline point is created. All of the outline points are then connected to build the entire indent outline. This outline will automatically include the pile-up contact area.


Devices
The construction of a depth-sensing indentation system is made possible by the inclusion of very sensitive displacement and load sensing systems. Load transducers must be capable of measuring forces in the micronewton range and displacement sensors are very frequently capable of sub-nanometer resolution. Environmental isolation is crucial to the operation of the instrument. Vibrations transmitted to the device, fluctuations in atmospheric temperature and pressure, and thermal fluctuations of the components during the course of an experiment can cause significant errors. The ability to conduct nanoindentation studies with nanometer depth, and sub-nanonewton force resolution is also possible using a standard AFM setup. The AFM allows for nanomechanical studies to be conducted alongside topographic analyses, without the use of dedicated instruments. Load-displacement curves can be collected similarly for a variety of materials, and mechanical properties can be directly calculated from these curves.

Limitations
Conventional nanoindentation methods for calculation of Modulus of elasticity (based on the unloading curve) are limited to linear, isotropic materials. Problems associated with the "pile-up" or "sink-in" of the material on the edges of the indent during the indentation process remain a problem that is still under investigation. It is possible to measure the pile-up contact area using computerized image analysis of atomic force microscope (AFM) images of the indentations. This process also depends on the linear isotropic elastic recovery for the indent reconstruction.


Tribometer
A tribometer is an instrument that measures tribological quantities, such as coefficient of friction, friction force, and wear volume, between two surfaces in contact. It was invented by the 18th century Dutch scientist Musschenbroek.

A tribotester is the general name given to a machine or device used to perform tests and simulations of wear, friction and lubrication, which are the subject of the study of tribology. Often tribotesters are extremely specific in their function and are fabricated by manufacturers who desire to test and analyze the long-term performance of their products. An example is that of orthopedic implant manufacturers, who have spent considerable sums of money to develop tribotesters that accurately reproduce the motions and forces that occur in human hip joints so that they can perform accelerated wear tests of their products.

Static Friction Tribometer

Hydrogen Tribometer

Theory
A simple tribometer is described by a hanging mass and a mass resting on a horizontal surface, connected to each other via a string and pulley. The coefficient of friction, µ, when the system is stationary, is determined by increasing the hanging mass until the moment that the resting mass begins to slide, and then using the general equation for friction force:

F = µN

where N, the normal force, is equal to the weight (mass × gravity) of the sitting mass (mT) and F, the loading force, is equal to the weight (mass × gravity) of the hanging mass (mH). To determine the kinetic coefficient of friction the hanging mass is increased or decreased until the mass system moves at a constant speed. In both cases, the coefficient of friction simplifies to the ratio of the two masses:

µ = mH / mT
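For example (illustrative numbers, not from the text), if a resting mass of 0.50 kg just begins to slide when the hanging mass reaches 0.20 kg, the static coefficient of friction is:

```latex
\mu_s = \frac{m_H}{m_T} = \frac{0.20\ \mathrm{kg}}{0.50\ \mathrm{kg}} = 0.4
```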

In most test applications using tribometers, wear is measured by comparing the mass or surfaces of test specimens before and after testing. Equipment and methods used to examine the worn surfaces include optical microscopes, scanning electron microscopes, optical interferometry and mechanical roughness testers.


Types
Tribometers are often referred to by the specific contact arrangement they simulate or by the original equipment developer. Several arrangements are:
Four ball
Pin on disc
Block on ring
Bouncing ball
Schwingungs-, Reibungs- und Verschleisstest (oscillation, friction and wear test; SRV) test machine
Twin disc

Bouncing ball
A bouncing ball tribometer consists of a ball which is impacted at an angle against a surface. During a typical test, a ball is slid on an angle along a track until it impacts a surface and then bounces off of the surface. The friction produced in the contact between the ball and the surface results in a horizontal force on the surface and a rotational force on the ball. Frictional force is determined by finding the rotational speed of the ball using high speed photography or by measuring the force on the horizontal surface. Pressure in the contact is very high due to the large instantaneous force caused by the impact with the ball. Bouncing ball tribometers have been used to determine the shear characteristics of lubricants under high pressures such as is found in ball bearings or gears.

Pin on disc
A pin on disc tribometer consists of a stationary "pin" under an applied load in contact with a rotating disc. The pin can have any shape to simulate a specific contact, but spherical tips are often used to simplify the contact geometry. Coefficient of friction is determined by the ratio of the frictional force to the loading force on the pin. The pin on disc test has proved useful in providing a simple wear and friction test for low friction coatings such as diamond-like carbon coatings on valve train components in internal combustion engines.


Ion plating
Ion plating is a physical vapor deposition (PVD) process that is sometimes called ion assisted deposition (IAD) or ion vapor deposition (IVD) and is a version of vacuum deposition. Ion plating utilizes concurrent or periodic bombardment of the substrate and depositing film by atomic-sized energetic particles. Bombardment prior to deposition is used to sputter clean the substrate surface. During deposition the bombardment is used to modify and control the properties of the depositing film. It is important that the bombardment be continuous between the cleaning and the deposition portions of the process to maintain an atomically clean interface. In ion plating the energy, flux and mass of the bombarding species along with the ratio of bombarding particles to depositing particles are important processing variables. The depositing material may be vaporized either by evaporation, sputtering (bias sputtering), arc vaporization or by decomposition of a chemical vapor precursor chemical vapor deposition (CVD). The energetic particles used for bombardment are usually ions of an inert or reactive gas, or, in some cases, ions of the condensing film material (film ions). Ion plating can be done in a plasma environment where ions for bombardment are extracted from the plasma or it may be done in a vacuum environment where ions for bombardment are formed in a separate ion gun. The latter ion plating configuration is often called Ion Beam Assisted Deposition (IBAD). By using a reactive gas or vapor in the plasma, films of compound materials can be deposited. Ion plating is used to deposit hard coatings of compound materials on tools, adherent metal coatings, optical coatings with high densities, and conformal coatings on complex surfaces. The ion plating process was first described in the technical literature by Donald M. Mattox of Sandia National Laboratories in 1964.

Ion beam-assisted deposition


Ion beam assisted deposition or IBAD or IAD (not to be confused with ion beam induced deposition, IBID) is a materials engineering technique which combines ion implantation with simultaneous sputtering or another physical vapor deposition technique. Besides providing independent control of parameters such as ion energy, temperature and arrival rate of atomic species during deposition, this technique is especially useful to create a gradual transition between the substrate material and the deposited film, and for depositing films with less built-in strain than is possible by other techniques. These two properties can result in films with a much more durable bond to the substrate. Experience has shown that some meta-stable compounds like cubic boron nitride (c-BN), can only be formed in thin films when bombarded with energetic ions during the deposition process.


Vacuum evaporation
Vacuum evaporation is the process of causing the pressure in a liquid-filled container to be reduced below the vapor pressure of the liquid, causing the liquid to evaporate at a lower temperature than normal. Although the process can be applied to any type of liquid at any vapor pressure, it is generally used to describe the boiling of water by lowering the container's internal pressure below standard atmospheric pressure and causing the water to boil at room temperature. When the process is applied to food and the water is evaporated and removed, the food can be stored for long periods of time without spoiling. It is also used when boiling a substance at normal temperatures would chemically change the consistency of the product, such as egg whites coagulating when attempting to dehydrate the albumen into a powder. This process was invented by Henri Nestlé in 1866, of Nestlé Chocolate fame, although the Shakers were already using a vacuum pan earlier than that (see condensed milk).
Vacuum Sugar Apparatus at The Great Exhibition, 1851

This process is used industrially to make food products such as evaporated milk for milk chocolate and tomato paste for ketchup. In the sugar industry vacuum evaporation is used in the crystallization of sucrose solutions; traditionally this was performed in batch mode, but nowadays continuous vacuum pans are also available.

Vacuum pans in a beet sugar factory

Vacuum evaporators are also used in a wide range of industrial sectors to treat industrial effluents and wastewater. Vacuum evaporation is a clean, safe and very versatile technology with low management costs, and in most cases it serves as a zero-discharge treatment system. The treatment process consists of reducing the interior pressure of the evaporation chamber below atmospheric pressure. This reduces the boiling point of the liquid to be evaporated, thereby reducing the heat necessary in both the boiling and condensation processes. There are other technical advantages as well, such as the ability to distill liquids with high boiling points and to avoid decomposing substances that are sensitive to temperature.

Vacuum evaporation plant

Vacuum evaporation is also a form of physical vapor deposition used in the semiconductor, microelectronics, and optical industries, where it is a process for depositing thin films of material onto surfaces. The technique consists of pumping a vacuum chamber down to very low pressure and heating a material to produce a flux of vapor in order to deposit the material onto a surface. The material to be vaporized is typically heated until its vapor pressure is high enough to produce a flux of several angstroms per second, using an electrically resistive heater or bombardment by a high-voltage beam.

Plating
Plating is a surface covering in which a metal is deposited on a conductive surface. Plating has been done for hundreds of years; it is also critical for modern technology. Plating is used to decorate objects, for corrosion inhibition, to improve solderability, to harden, to improve wearability, to reduce friction, to improve paint adhesion, to alter conductivity, for radiation shielding, and for other purposes. Jewelry typically uses plating to give a silver or gold finish. Thin-film deposition has plated objects as small as an atom; therefore, plating finds uses in nanotechnology. There are several plating methods, and many variations. In one method, a solid surface is covered with a metal sheet, and then heat and pressure are applied to fuse them (a version of this is Sheffield plate). Other plating techniques include vapor deposition under vacuum and sputter deposition. Recently, plating often refers to using liquids. Metallizing refers to coating metal on non-metallic objects.

Electroplating
In electroplating, an ionic metal is supplied with electrons to form a non-ionic coating on a substrate. A common system involves a chemical solution with the ionic form of the metal, an anode (positively charged) which may consist of the metal being plated (a soluble anode) or an insoluble anode (usually carbon, platinum, titanium, lead, or steel), and finally, a cathode (negatively charged) where electrons are supplied to produce a film of non-ionic metal.

Electroless plating
Electroless plating, also known as chemical or auto-catalytic plating, is a non-galvanic plating method that involves several simultaneous reactions in an aqueous solution, which occur without the use of external electrical power. The reaction is accomplished when hydrogen is released by a reducing agent, normally sodium hypophosphite (Note: the hydrogen leaves as a hydride ion), and oxidized, thus producing a negative charge on the surface of the part. The most common electroless plating method is electroless nickel plating, although silver, gold and copper layers can also be applied in this manner, as in the technique of Angel gilding.


Specific cases
Gold plating
Gold plating is a method of depositing a thin layer of gold on the surface of glass or metal, most often copper or silver. Gold plating is often used in electronics, to provide a corrosion-resistant electrically conductive layer on copper, typically in electrical connectors and printed circuit boards. With direct gold-on-copper plating, the copper atoms have the tendency to diffuse through the gold layer, causing tarnishing of its surface and formation of an oxide/sulfide layer. Therefore, a layer of a suitable barrier metal, usually nickel, has to be deposited on the copper substrate, forming a copper-nickel-gold sandwich. Metals and glass may also be coated with gold for ornamental purposes, using a number of different processes usually referred to as gilding.

Silver plating
For applications in electronics, silver is sometimes used for plating copper, as its electrical resistance is lower (see Resistivity of various materials); more so at higher frequencies due to the skin effect. Variable capacitors are considered of the highest quality when they have silver-plated plates. Similarly, silver-plated, or even solid silver cables, are prized in audiophile applications; however some experts consider that in practice the plating is often poorly implemented, making the result inferior to similarly priced copper cables. Care should be used for parts exposed to high humidity environments. When the silver layer is porous or contains cracks, the underlying copper undergoes rapid galvanic corrosion, flaking off the plating and exposing the copper itself; a process known as red plague. Historically, silver plate was used to provide a cheaper version of items that might otherwise be made of silver, including cutlery and candlesticks. The earliest kind was Old Sheffield Plate, but in the 19th century new methods of production (including electroplating) were introduced: see Sheffield Plate.

A silver-plated alto saxophone

Another method that can be used to apply a thin layer of silver to objects such as glass is to place Tollens' reagent in a glass, add glucose/dextrose, and shake the bottle to promote the reaction:

AgNO3 + KOH → AgOH + KNO3
AgOH + 2 NH3 → [Ag(NH3)2]+ + [OH]− (Note: see Tollens' reagent)
[Ag(NH3)2]+ + [OH]− + aldehyde (usually glucose/dextrose) → Ag + 2 NH3 + H2O


Rhodium plating
Rhodium plating is occasionally used on white gold, silver or copper and its alloys. A barrier layer of nickel is usually deposited on silver first, though in this case it is not to prevent migration of silver through rhodium, but to prevent contamination of the rhodium bath with silver and copper, which slightly dissolve in the sulfuric acid usually present in the bath composition.

Chrome plating
Chrome plating is a finishing treatment using the electrolytic deposition of chromium. The most common form of chrome plating is the thin, decorative bright chrome, which is typically a 10 µm layer over an underlying nickel plate. When plating on iron or steel, an underlying plating of copper allows the nickel to adhere. The pores (tiny holes) in the nickel and chromium layers also promote corrosion resistance. Bright chrome imparts a mirror-like finish to items such as metal furniture frames and automotive trim. Thicker deposits, up to 1000 µm, are called hard chrome and are used in industrial equipment to reduce friction and wear. The traditional solution used for industrial hard chrome plating is made up of about 250 g/L of CrO3 and about 2.5 g/L of SO4. In solution, the chrome exists as chromic acid, known as hexavalent chromium. A high current is used, in part to stabilize a thin layer of chromium(+2) at the surface of the plated work. Acid chrome has poor throwing power; fine details or holes that are further away receive less current, resulting in poor plating.

Zinc plating
Zinc coatings prevent oxidation of the protected metal by forming a barrier and by acting as a sacrificial anode if this barrier is damaged. Zinc oxide is a fine white dust that (in contrast to iron oxide) does not cause a breakdown of the substrate's surface integrity as it is formed. Indeed the zinc oxide, if undisturbed, can act as a barrier to further oxidation, in a way similar to the protection afforded to aluminum and stainless steels by their oxide layers. The majority of hardware parts are zinc plated, rather than cadmium plated.

Tin plating
The tin-plating process is used extensively to protect both ferrous and nonferrous surfaces. Tin is a useful metal for the food processing industry since it is non-toxic, ductile and corrosion resistant. The excellent ductility of tin allows a tin coated base metal sheet to be formed into a variety of shapes without damage to the surface tin layer. It provides sacrificial protection for copper, nickel and other non-ferrous metals, but not for steel. Tin is also widely used in the electronics industry because of its ability to protect the base metal from oxidation thus preserving its solderability. In electronic applications, 3% to 7% lead may be added to improve solderability and to prevent the growth of metallic "whiskers" in compression stressed deposits, which would otherwise cause electrical shorting. However, RoHS (Restriction of Hazardous Substances) regulations enacted beginning in 2006 require that no lead be added intentionally and that the maximum percentage not exceed 1%. Some exemptions have been issued to RoHS requirements in critical electronics applications due to failures which are known to have occurred as a result of tin whisker formation.


Alloy plating
In some cases, it is desirable to co-deposit two or more metals resulting in an electroplated alloy deposit. Depending on the alloy system, an electroplated alloy may be solid solution strengthened or precipitation hardened by heat treatment to improve the plating's physical and chemical properties. Nickel-Cobalt is a common electroplated alloy.

Composite plating
Metal matrix composite plating can be manufactured when a substrate is plated in a bath containing a suspension of ceramic particles. Careful selection of the size and composition of the particles can fine-tune the deposit for wear resistance, high temperature performance, or mechanical strength. Tungsten carbide, silicon carbide, chromium carbide, and aluminum oxide (alumina) are commonly used in composite electroplating.

Cadmium plating
Cadmium plating is under scrutiny because of the environmental toxicity of the cadmium metal. However, cadmium plating is still widely used in some applications, such as aerospace fasteners, and it remains in military and aviation specifications, although it is being phased out due to its toxicity. Cadmium plating (or "cad plating") offers a long list of technical advantages: excellent corrosion resistance even at relatively low thickness and in salt atmospheres, softness and malleability, freedom from sticky and/or bulky corrosion products, galvanic compatibility with aluminum, freedom from stick-slip (allowing reliable torquing of plated threads), the ability to be dyed to many colors and clear, good lubricity and solderability, and good performance either as a final finish or as a paint base. If environmental concerns matter, in most respects cadmium plating can be directly replaced with gold plating, as it shares most of the material properties, but gold is more expensive and cannot serve as a paint base.

Nickel plating
The chemical reaction for nickel plating is:[citation needed]

At the cathode: Ni2+ + 2 e− → Ni
At the anode: H2PO2− + H2O → H2PO3− + 2 H+ + 2 e−

Compared to cadmium plating, nickel plating offers a shinier and harder finish, but lower corrosion resistance, lubricity, and malleability, resulting in a tendency to crack or flake if the piece is further processed.

Electroless nickel plating


Electroless nickel plating, also known as enickel and NiP, offers many advantages: uniform layer thickness over even the most complicated surfaces, direct plating of ferrous metals (steel), and superior wear and corrosion resistance compared to electroplated nickel or chrome. Much of the chrome plating done in the aerospace industry can be replaced with electroless nickel plating; again, environmental costs, the cost of hexavalent chromium waste disposal and the notorious tendency toward uneven current distribution favor electroless nickel plating.[3]

Electroless nickel plating is a self-catalyzing process; the resultant nickel layer is a NiP compound with 7–11% phosphorus content. The hardness and wear resistance of the resultant layer are greatly altered by bath composition and deposition temperature, which should be regulated to within 1 °C, typically at 91 °C. During bath circulation, any particles in the bath will also become nickel plated; this effect is used to advantage in processes which deposit plating with particles like silicon carbide (SiC) or polytetrafluoroethylene (PTFE). While superior compared to many other plating processes, it is expensive because the process is complex. Moreover, the process is lengthy even for thin layers. When only corrosion resistance or surface treatment is of concern, very strict bath composition and temperature control is not required, and the process is used for plating many tons in one bath at once.

Electroless nickel plating layers are known to provide extreme surface adhesion when plated properly. Electroless nickel plating is non-magnetic and amorphous. Electroless nickel plating layers are not easily solderable, nor do they seize with other metals or another electroless nickel plated workpiece under pressure. This effect benefits electroless nickel plated screws made out of malleable materials like titanium. Electrical resistance is higher compared to pure metal plating.

Electroplating
Electroplating is a process that uses electrical current to reduce dissolved metal cations so that they form a coherent metal coating on an electrode. The term is also used for the electrical oxidation of anions onto a solid substrate, as in the formation of silver chloride on silver wire to make silver/silver-chloride electrodes. Electroplating is primarily used to change the surface properties of an object (e.g. abrasion and wear resistance, corrosion protection, lubricity, aesthetic qualities, etc.), but may also be used to build up thickness on undersized parts or to form objects by electroforming.

Copper electroplating machine for layering PCBs

The process used in electroplating is called electrodeposition. It is analogous to a galvanic cell acting in reverse. The part to be plated is the cathode of the circuit. In one technique, the anode is made of the metal to be plated on the part. Both components are immersed in a solution called an electrolyte containing one or more dissolved metal salts as well as other ions that permit the flow of electricity. A power supply supplies a direct current to the anode, oxidizing the metal atoms that comprise it and allowing them to dissolve in the solution. At the cathode, the dissolved metal ions in the electrolyte solution are reduced at the interface between the solution and the cathode, such that they "plate out" onto the cathode. The rate at which the anode is dissolved is equal to the rate at which the cathode is plated, as determined by the current flowing through the circuit. In this manner, the ions in the electrolyte bath are continuously replenished by the anode. Other electroplating processes may use a non-consumable anode such as lead or carbon. In these techniques, ions of the metal to be plated must be periodically replenished in the bath as they are drawn out of the solution. The most common form of electroplating is used for creating coins such as pennies, which are small zinc plates covered in a layer of copper.


Process
The anode and cathode in the electroplating cell are both connected to an external supply of direct current: a battery or, more commonly, a rectifier. The anode is connected to the positive terminal of the supply, and the cathode (the article to be plated) is connected to the negative terminal. When the external power supply is switched on, the metal at the anode is oxidized from the zero valence state to form cations with a positive charge. These cations associate with the anions in the solution. The cations are reduced at the cathode to deposit in the metallic, zero valence state. For example, in an acid solution, copper is oxidized at the anode to Cu2+ by losing two electrons. The Cu2+ associates with the anion SO42− in the solution to form copper sulfate. At the cathode, the Cu2+ is reduced to metallic copper by gaining two electrons. The result is the effective transfer of copper from the anode source to a plate covering the cathode. The plating is most commonly a single metallic element, not an alloy. However, some alloys can be electrodeposited, notably brass and solder.
Electroplating of a metal (Me) with copper in a copper sulfate bath
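As a minimal numerical sketch of this copper transfer (illustrative current and time, and an assumed 100% current efficiency; none of the numbers are from the text), Faraday's law of electrolysis gives the deposited mass:

```python
# Minimal sketch: mass of copper deposited at the cathode via Faraday's law,
# m = M*I*t/(n*F), assuming 100% current efficiency. I and t are assumed values.
M = 63.55       # molar mass of copper, g/mol
n = 2           # electrons per Cu2+ ion
F = 96485       # Faraday constant, C/mol
I = 1.5         # assumed plating current, A
t = 30 * 60     # assumed plating time, s (30 minutes)

mass_g = M * I * t / (n * F)
print(f"Copper deposited: {mass_g:.2f} g")   # about 0.89 g
```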

Many plating baths include cyanides of other metals (e.g., potassium cyanide) in addition to cyanides of the metal to be deposited. These free cyanides facilitate anode corrosion, help to maintain a constant metal ion level and contribute to conductivity. Additionally, non-metal chemicals such as carbonates and phosphates may be added to increase conductivity. When plating is not desired on certain areas of the substrate, stop-offs are applied to prevent the bath from coming in contact with the substrate. Typical stop-offs include tape, foil, lacquers, and waxes.

Strike
Initially, a special plating deposit called a "strike" or "flash" may be used to form a very thin (typically less than 0.1 micrometer thick) plating with high quality and good adherence to the substrate. This serves as a foundation for subsequent plating processes. A strike uses a high current density and a bath with a low ion concentration. The process is slow, so more efficient plating processes are used once the desired strike thickness is obtained. The striking method is also used in combination with the plating of different metals. If it is desirable to plate one type of deposit onto a metal to improve corrosion resistance but this metal has inherently poor adhesion to the substrate, a strike can be first deposited that is compatible with both. One example of this situation is the poor adhesion of electrolytic nickel on zinc alloys, in which case a copper strike is used, which has good adherence to both.

Brush electroplating
A closely related process is brush electroplating, in which localized areas or entire items are plated using a brush saturated with plating solution. The brush, typically a stainless steel body wrapped with a cloth material that both holds the plating solution and prevents direct contact with the item being plated, is connected to the positive side of a low voltage direct-current power source, and the item to be plated connected to the negative. The operator dips the brush in plating solution then applies it to the item, moving the brush continually to get an even distribution of the plating material. Brush electroplating has several advantages over tank plating, including portability, the ability to plate items that for some reason cannot be tank plated (one application was the plating of portions of very large decorative support columns in a building restoration), low or no masking requirements, and comparatively low plating solution volume requirements. Disadvantages compared to tank plating can include greater operator involvement (tank plating can frequently be done with minimal attention) and the inability to achieve as great a plate thickness.

Electroless deposition
Usually an electrolytic cell (consisting of two electrodes, electrolyte, and an external source of current) is used for electrodeposition. In contrast, an electroless deposition process uses only one electrode and no external source of electric current. However, the solution for the electroless process needs to contain a reducing agent so that the electrode reaction has the form:

Mz+ + Red (solution) → M (solid) + Ox (solution)

In principle any water-based reducer can be used although the redox potential of the reducer half-cell must be high enough to overcome the energy barriers inherent in liquid chemistry. Electroless nickel plating uses hypophosphite as the reducer while plating of other metals like silver, gold and copper typically use low molecular weight aldehydes. A major benefit of this approach over electroplating is that power sources and plating baths are not needed, reducing the cost of production. The technique can also plate diverse shapes and types of surface. The downside is that the plating process is usually slower and cannot create such thick plates of metal. As a consequence of these characteristics, electroless deposition is quite common in the decorative arts.

Cleanliness
Cleanliness is essential to successful electroplating, since molecular layers of oil can prevent adhesion of the coating. ASTM B322 is a standard guide for cleaning metals prior to electroplating. Cleaning processes include solvent cleaning, hot alkaline detergent cleaning, electro-cleaning, and acid treatment etc. The most common industrial test for cleanliness is the waterbreak test, in which the surface is thoroughly rinsed and held vertical. Hydrophobic contaminants such as oils cause the water to bead and break up, allowing the water to drain rapidly. Perfectly clean metal surfaces are hydrophilic and will retain an unbroken sheet of water that does not bead up or drain off. ASTM F22 describes a version of this test. This test does not detect hydrophilic contaminants, but the electroplating process can displace these easily since the solutions are water-based. Surfactants such as soap reduce the sensitivity of the test and must be thoroughly rinsed off.

Effects
Electroplating changes the chemical, physical, and mechanical properties of the workpiece. An example of a chemical change is when nickel plating improves corrosion resistance. An example of a physical change is a change in the outward appearance. An example of a mechanical change is a change in tensile strength or surface hardness which is a required attribute in tooling industry.


History
Although it is not confirmed, the Parthian Battery may have been the first system used for electroplating. Modern electrochemistry was invented by Italian chemist Luigi V. Brugnatelli in 1805. Brugnatelli used his colleague Alessandro Volta's invention of five years earlier, the voltaic pile, to facilitate the first electrodeposition. Brugnatelli's inventions were suppressed by the French Academy of Sciences and did not become used in general industry for the following thirty years.

Nickel plating

By 1839, scientists in Britain and Russia had independently devised metal deposition processes similar to Brugnatelli's for the copper electroplating of printing press plates. Boris Jacobi in Russia not only rediscovered galvanoplastics, but developed electrotyping and galvanoplastic sculpture. Galvanoplastics quickly came into fashion in Russia, with such people as inventor Peter Bagration, scientist Heinrich Lenz and science fiction author Vladimir Odoyevsky all contributing to further development of the technology. Among the most notorious cases of electroplating usage in mid-19th century Russia were gigantic galvanoplastic sculptures of St. Isaac's Cathedral in Saint Petersburg and gold-electroplated dome of the Cathedral of Christ the Saviour in Moscow, the tallest Orthodox church in the world.

Boris Jacobi developed electroplating, electrotyping and galvanoplastic sculpture in Russia

Soon after, John Wright of Birmingham, England discovered that potassium cyanide was a suitable electrolyte for gold and silver electroplating. Wright's associates, George Elkington and Henry Elkington, were awarded the first patents for electroplating in 1840. These two then founded the electroplating industry in Birmingham, from where it spread around the world. The Norddeutsche Affinerie in Hamburg was the first modern electroplating plant, starting its production in 1876. As the science of electrochemistry grew, its relationship to the electroplating process became understood and other types of non-decorative metal electroplating processes were developed. Commercial electroplating of nickel, brass, tin, and zinc was developed by the 1850s. Electroplating baths and equipment based on the patents of the Elkingtons were scaled up to accommodate the plating of numerous large scale objects and for specific manufacturing and engineering applications.

Galvanoplastic sculpture on St. Isaac's Cathedral in Saint Petersburg

The plating industry received a big boost with the development of electric generators in the late 19th century. With the higher currents available, metal machine components, hardware, and automotive parts requiring corrosion protection and enhanced wear properties, along with better appearance, could be processed in bulk. The two World Wars and the growing aviation industry gave impetus to further developments and refinements, including such processes as hard chromium plating, bronze alloy plating and sulfamate nickel plating, along with numerous other plating processes. Plating equipment evolved from manually operated tar-lined wooden tanks to automated equipment capable of processing thousands of kilograms per hour of parts. One of the American physicist Richard Feynman's first projects was to develop technology for electroplating metal onto plastic. Feynman developed the original idea of his friend into a successful invention, allowing his employer (and friend) to keep commercial promises he had made but could not have fulfilled otherwise.


Use
Electroplating is a useful process. It is widely used in industry for coating metal objects with a thin layer of a different metal. The layer of metal deposited has some desired property which the metal of the object lacks. For example, chromium plating is done on many objects such as car parts, bath taps, kitchen gas burners, wheel rims and many others.

Hull cell
The Hull cell is a type of test cell used to qualitatively check the condition of an electroplating bath. It allows for optimization of the current density range, optimization of additive concentration, recognition of impurity effects and indication of macro-throwing power capability. The Hull cell replicates the plating bath on a lab scale. It is filled with a sample of the plating solution and fitted with an appropriate anode, which is connected to a rectifier. The "work" is replaced with a Hull cell test panel that will be plated to show the "health" of the bath.

A zinc solution tested in a hull cell

The Hull cell is a trapezoidal container that holds 267 mL of solution. This shape allows one to place the test panel at an angle to the anode. As a result, the deposit is plated at different current densities, which can be measured with a Hull cell ruler. The solution volume allows for a quantitative optimization of additive concentration: a 1 gram addition to 267 mL is equivalent to 0.5 oz/gal in the plating tank.
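This equivalence is straightforward to check (using 1 oz ≈ 28.35 g and 1 US gal ≈ 3.785 L):

```latex
\frac{1\ \mathrm{g}}{267\ \mathrm{mL}} \approx 3.75\ \mathrm{g/L},
\qquad
0.5\ \mathrm{oz/gal} \approx \frac{0.5 \times 28.35\ \mathrm{g}}{3.785\ \mathrm{L}} \approx 3.75\ \mathrm{g/L}.
```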


Spray painting
Spray painting is a painting technique where a device sprays a coating (paint, ink, varnish, etc.) through the air onto a surface. The most common types employ compressed gas, usually air, to atomize and direct the paint particles. Spray guns evolved from airbrushes, and the two are usually distinguished by their size and the size of the spray pattern they produce. Airbrushes are hand-held and used instead of a brush for detailed work such as photo retouching, painting nails or fine art. Air gun spraying uses equipment that is generally larger. It is typically used for covering large surfaces with an even coating of liquid. Spray guns can be either automated or hand-held and have interchangeable heads to allow for different spray patterns. Single color aerosol paint cans are portable and easy to store.

LVLP system.

History
Francis Davis Millet is generally credited with the invention of spray painting. In 1892, working under extremely tight deadlines to complete construction of the World's Columbian Exposition, Daniel Burnham appointed Millet to replace the fair's official director of color, William Pretyman. Pretyman had resigned following a dispute with Burnham. After experimenting, Millet settled on a mix of oil and white lead that could be applied using a hose and special nozzle, which would take considerably less time than traditional brush painting. In 1949, Edward Seymour developed spray paint that could be delivered from an aerosol can.


Types
Air gun spraying
This process occurs when paint is applied to an object through the use of an air-pressurized spray gun. The air gun has a nozzle, paint basin, and air compressor. When the trigger is pressed the paint mixes with the compressed air stream and is released in a fine spray. Due to a wide range of nozzle shapes and sizes, the consistency of the paint can be varied. The shape of the workpiece and the desired paint consistency and pattern are important factors when choosing a nozzle. The three most common nozzles are the full cone, hollow cone, and flat stream.

Types of nozzles and sprays

There are two types of air-gun spraying processes. In a manual operation method the air-gun sprayer is held by a skilled operator, about 6 to 10 inches (15–25 cm) from the object, and moved back and forth over the surface, each stroke overlapping the previous to ensure a continuous coat. In an automatic process the gun head is attached to a mounting block and delivers the stream of paint from that position. The object being painted is usually placed on rollers or a turntable to ensure overall equal coverage of all sides.

HVLP (High Volume Low Pressure)

This is similar to a conventional spray gun using a compressor to supply the air, but the spray gun itself requires a lower pressure (LP). A higher volume (HV) of air is used to aerosolise and propel the paint at lower air pressure. The result is a higher proportion of paint reaching the target surface with reduced overspray, materials consumption, and air pollution. A regulator is often required so that the air pressure from a conventional compressor can be lowered for the HVLP spray gun. Alternatively a turbine unit (commonly containing a vacuum cleaner derived motor) can be used to propel the air without the need for an air line running to the compressor. A rule of thumb puts two thirds of the coating on the substrate and one third in the air. True HVLP guns use 8–20 cfm (13.6–34 m3/hr), and an industrial compressor with a minimum of 5 horsepower (3.7 kW) output is required. HVLP spray systems are used in the automotive, decorative, marine, architectural coating, furniture finishing, scenic painting and cosmetic industries.

LVLP (Low Volume Low Pressure)


Like HVLP, these spray guns also operate at a lower pressure (LP), but they use a low volume (LV) of air when compared to conventional and HVLP equipment. This is a further effort at increasing the transfer efficiency (amount of coating that ends up on the target surface) of spray guns, while decreasing the amount of compressed air consumption.

Electrostatic spray painting


Electrostatic painting was first patented in the U.S. by Harold Ransburg in the late 1940s. Harold Ransburg founded Ransburg Electrostatic Equipment and discovered that electrostatic spray painting was an immediate success as manufacturers quickly perceived the substantial materials savings that could be achieved. In electrostatic spray painting or powder coating, the atomized particles are made to be electrically charged, thereby repelling each other and spreading themselves evenly as they exit the spray nozzle. The object being painted is charged oppositely or grounded. The paint is then attracted to the object giving a more even coat than wet spray painting, and also greatly increasing the percentage of paint that sticks to the object. This method also means that paint covers hard to reach areas. The whole may then be baked to properly attach the paint: the powder turns into a type of plastic. Car body

Spray painting panels and bike frames are two examples where electrostatic spray painting is often used. There are three main technologies for charging the fluid (liquid or powders): Direct charging: An electrode is immersed in the paint supply reservoir or in the paint supply conduit. Tribo charging: This uses the friction of the fluid which is forced through the barrel of the paint gun. It rubs against the side of the barrel and builds up an electrostatic charge. Post-atomization charging: The atomized fluid comes into contact with an electrostatic field downstream of the outlet nozzle. The electrostatic field may be created by electrostatic induction or corona, or by one or more electrodes (electrode ring, mesh or grid). Rotational bell With this method the paint is flung into the air by a spinning metal disc ("bell"). The metal disc also imparts an electrical charge to the coating particle.


Electric fan
There are a variety of hand-held paint sprayers that either combine the paint with air, or convert the paint to tiny droplets and accelerate these out of a nozzle.

Hot spray

By heating the full-bodied paint to 60–80 °C, it is possible to apply a thicker coat. Originally the paint was recirculated, but this caused bodying up, so the system was changed to direct heating on line. Hot spraying was also used with airless and electrostatic airless equipment to decrease bounce-back. Two-pack materials usually had premix-before-tip systems using dual pumps.

Air assisted airless spray guns

These use air pressure and fluid pressure of 300 to 3,000 pounds per square inch (2,100–21,000 kPa) to achieve atomization of the coating. This equipment provides high transfer efficiency and increased application speed and is most often used with flat-line applications in factory finish shops. The fluid pressure is provided by an airless pump, which allows much heavier materials to be sprayed than is possible with an airspray gun. Compressed air is introduced into the spray from an airless tip (nozzle) to improve the fineness of atomisation. Some electric airless sprayers are fitted with a compressor to allow the use of an air assisted airless gun in situations where portability is important.

Airless spray guns


These operate connected to a high pressure pump, commonly using 300 to 7,500 pounds per square inch (2,100–52,000 kPa) of pressure to atomize the coating, using different tip sizes to achieve the desired atomization and spray pattern size. This type of system is used by contract painters to paint heavy duty industrial, chemical and marine coatings and linings. Advantages of airless spray are:
The coating penetrates better into pits and crevices.
A uniform thick coating is produced, reducing the number of coats required.
A very "wet" coating is applied, ensuring good adhesion and flow-out.
Most coatings can be sprayed with very little thinner added, thereby reducing drying time and decreasing the release of solvent into the environment.
Care must be used when operating, as airless spray guns can cause serious injury, such as injection injuries, due to the paint ejecting from the nozzle at high pressure.
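The pressure ranges quoted for airless and air-assisted airless guns are straightforward unit conversions; a minimal sketch (ranges copied from the text, conversion factor standard):

```python
# Convert the airless spray-pressure ranges quoted above from psi to kPa.
PSI_TO_KPA = 6.894757  # 1 pound per square inch = 6.894757 kPa

def psi_to_kpa(psi: float) -> float:
    return psi * PSI_TO_KPA

ranges_psi = {
    "air-assisted airless": (300, 3000),
    "airless": (300, 7500),
}

for name, (low, high) in ranges_psi.items():
    print(f"{name}: {low}-{high} psi is roughly "
          f"{psi_to_kpa(low):,.0f}-{psi_to_kpa(high):,.0f} kPa")
```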

Airless pumps can be powered by different types of motor: electric, compressed air (pneumatic) or hydraulic. Most have a paint pump (also called a lower) that is a double-acting piston, in which the piston pumps the paint on both the down- and the upstroke. Some airless pumps have a diaphragm instead of a piston, but both types have inlet and outlet valves. Most electric powered airless pumps have an electric motor connected through a gear train to the paint piston pump. Pressure is achieved by stopping and starting the motor via a pressure sensor (also called a transducer); in more advanced units, this is done by digital control in which the speed of the motor varies with the demand and the difference from the pressure set-point, resulting in very good pressure control. Some direct-drive piston pumps are driven by a gasoline engine with pressure control via an electric clutch. In electric diaphragm pumps, the motor drives a hydraulic piston pump that transmits the oil displaced by the piston to move the diaphragm. Hydraulic and air-powered airless pumps have linear motors that require a hydraulic pump or an air compressor, which can be electric or gasoline powered, although an air compressor is usually diesel powered for mobile use or electric for fixed installations. Some airless units have the hydraulic pump and its motor built onto the same chassis as the paint pump. Hydraulic or air-powered airless pumps provide a more uniform pressure control, since the paint piston moves at a constant speed except when it changes direction. In most direct-drive piston pumps, the piston is crankshaft driven, in which case the piston is constantly changing speed. The linear motors of hydraulic or compressed-air drive pumps are more efficient in converting engine power to material power than crankshaft-driven units. All types of paint can be applied using the airless method.


Automated linear spray systems


Manufacturers who mass-produce wood products use automated spray systems, allowing them to paint materials at a very high rate with a minimum of personnel. Automated spray systems usually incorporate a paint-saving system which recovers paint not applied to the products. Commonly, linear spray systems are for products which are lying flat on a conveyor belt and then fed into a linear spray system, where automated spray guns are stationed above. When the material is directly below the guns, the guns begin to paint the material. Materials consist of lineal parts usually less than 12 inches (30cm) wide, such as window frames, wood moulding, baseboard, casing, trim stock and any other material that is simple in design. These machines are commonly used to apply stain, sealer, and lacquer. They can apply water- or solvent-based coatings. In recent years ultraviolet-cured coatings have become commonplace in profile finishing, and there are machines particularly suited to this type of coating.

Automated flatline spray systems


Mass produced material is loaded on a conveyor belt where it is fed into one of these flatline machines. Flatline machines are designed to specifically paint material that is less than 4 inches (10cm) thick and complex in shape, for example a kitchen cabinet door or drawer front. Spray guns are aligned above the material and the guns are in motion in order to hit all the grooves of the material. The guns can be moved in a cycle, circle, or can be moved back and forth in order to apply paint evenly across the material. Flatline systems are typically large and can paint doors, kitchen cabinets, and other plastic or wooden products.

Spray booth
A spray booth is a pressure-controlled, closed environment used to paint vehicles in a body shop. To ensure ideal working conditions (temperature, air flow and humidity), these environments are equipped with one or more ventilation groups, each consisting of one or more motors and one or more burners to heat the blown air. To assist in the removal of oversprayed paint from the air and to provide efficient operation of the down-draft, water-washed paint spray booths use paint-detackifying chemical agents.



Other Applications
One application of spray painting is graffiti. The introduction of inexpensive and portable aerosol paint has been a boon to this art form, which has spread all over the world. Spray painting has also been used in fine art. Jules Olitski, Dan Christensen, Peter Reginato, Sir Anthony Caro, and Jean-Michel Basquiat have used airbrushes for both painting and sculpture.

Thermal spraying
Thermal spraying techniques are coating processes in which melted (or heated) materials are sprayed onto a surface. The "feedstock" (coating precursor) is heated by electrical (plasma or arc) or chemical means (combustion flame). Thermal spraying can provide thick coatings (approximate thickness range 20 micrometers to several mm, depending on the process and feedstock) over a large area at high deposition rate, as compared to other coating processes such as electroplating and physical and chemical vapor deposition. Coating materials available for thermal spraying include metals, alloys, ceramics, plastics and composites. They are fed in powder or wire form, heated to a molten or semi-molten state and accelerated towards substrates in the form of micrometer-size particles. Combustion or electrical arc discharge is usually used as the source of energy for thermal spraying. Resulting coatings are made by the accumulation of numerous sprayed particles. The surface may not heat up significantly, allowing the coating of flammable substances. Coating quality is usually assessed by measuring its porosity, oxide content, macro- and micro-hardness, bond strength and surface roughness. Generally, the coating quality increases with increasing particle velocities. Several variations of thermal spraying are distinguished:
Plasma spraying
Detonation spraying
Wire arc spraying
Flame spraying
High velocity oxy-fuel coating spraying (HVOF)
Warm spraying
Cold spraying

In classical (developed between 1910 and 1920) but still widely used processes such as flame spraying and wire arc spraying, the particle velocities are generally low (< 150 m/s), and raw materials must be molten to be deposited. Plasma spraying, developed in the 1970s, uses a high-temperature plasma jet generated by arc discharge with typical temperatures > 15,000 K, which makes it possible to spray refractory materials such as oxides, molybdenum, etc.
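Since coating quality generally improves with particle velocity, it can be instructive to compare the kinetic energy carried by a single particle at the velocities mentioned above. The sketch below assumes an illustrative 30 µm particle of steel-like density; the numbers are indicative only.

```python
import math

def particle_kinetic_energy(diameter_m: float, density_kg_m3: float,
                            velocity_m_s: float) -> float:
    """Kinetic energy (J) of a spherical particle: E = (1/2) * m * v^2."""
    volume = math.pi / 6.0 * diameter_m ** 3      # sphere volume
    mass = density_kg_m3 * volume
    return 0.5 * mass * velocity_m_s ** 2

# Illustrative 30 um metallic particle; density assumed similar to steel.
d, rho = 30e-6, 7800.0
for label, v in [("flame / wire arc class (~150 m/s)", 150.0),
                 ("high-velocity process (~800 m/s)", 800.0)]:
    print(f"{label}: about {particle_kinetic_energy(d, rho, v):.2e} J per particle")
```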


System overview
A typical thermal spray system consists of the following:
Spray torch (or spray gun) - the core device performing the melting and acceleration of the particles to be deposited
Feeder - for supplying the powder, wire or liquid to the torch
Media supply - gases or liquids for the generation of the flame or plasma jet, gases for carrying the powder, etc.
Robot - for manipulating the torch or the substrates to be coated
Power supply - often standalone for the torch
Control console(s) - either integrated or individual for all of the above

Detonation Thermal Spraying Process


The detonation gun consists essentially of a long, water-cooled barrel with inlet valves for gases and powder. Oxygen and fuel (most commonly acetylene) are fed into the barrel along with a charge of powder. A spark is used to ignite the gas mixture, and the resulting detonation heats and accelerates the powder to supersonic velocity down the barrel. A pulse of nitrogen is used to purge the barrel after each detonation. This process is repeated many times a second. The high kinetic energy of the hot powder particles on impact with the substrate results in the build-up of a very dense and strong coating.

Plasma spraying
In the plasma spraying process, the material to be deposited (feedstock), typically as a powder, sometimes as a liquid, suspension or wire, is introduced into the plasma jet emanating from a plasma torch. In the jet, where the temperature is on the order of 10,000 K, the material is melted and propelled towards a substrate. There, the molten droplets flatten, rapidly solidify and form a deposit. Commonly, the deposits remain adherent to the substrate as coatings; free-standing parts can also be produced by removing the substrate. There is a large number of technological parameters that influence the interaction of the particles with the plasma jet and the substrate and therefore the deposit properties. These parameters include feedstock type, plasma gas composition and flow rate, energy input, torch offset distance, substrate cooling, etc.

Plasma spraying setup
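The process parameters listed above are often collected into a "spray recipe". A minimal sketch of one way such a recipe might be represented in code is shown below; the field names and values are hypothetical illustrations, not parameters from any particular equipment vendor.

```python
from dataclasses import dataclass, field

@dataclass
class PlasmaSprayRecipe:
    """Hypothetical grouping of the process parameters listed above."""
    feedstock: str                        # powder, suspension or wire material
    plasma_gases_slpm: dict = field(default_factory=dict)  # gas -> flow rate
    arc_current_a: float = 0.0            # energy input via the arc current
    standoff_mm: float = 0.0              # torch offset distance to the substrate
    substrate_cooling: str = "none"       # e.g. "air jets"

# Example recipe with purely illustrative values.
recipe = PlasmaSprayRecipe(
    feedstock="yttria-stabilized zirconia powder",
    plasma_gases_slpm={"Ar": 40.0, "H2": 10.0},
    arc_current_a=600.0,
    standoff_mm=100.0,
    substrate_cooling="air jets",
)
print(recipe)
```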

Deposit properties
The deposits consist of a multitude of pancake-like lamellae called 'splats', formed by flattening of the liquid droplets. As the feedstock powders typically have sizes from micrometers to above 100 micrometers, the lamellae have thickness in the micrometer range and lateral dimension from several to hundreds of micrometers. Between these lamellae, there are small voids, such as pores, cracks and regions of incomplete bonding. As a result of this unique structure, the deposits can have properties significantly different from bulk materials. These are generally mechanical properties, such as lower strength and modulus, higher strain tolerance, and lower thermal and electrical conductivity. Also, due to the rapid solidification, metastable phases can be present in the deposits.

Wire flame spraying

Applications
This technique is mostly used to produce coatings on structural materials. Such coatings provide protection against high temperatures (for example, thermal barrier coatings for exhaust heat management), corrosion, erosion and wear; they can also change the appearance, electrical or tribological properties of the surface, replace worn material, etc. When sprayed on substrates of various shapes and removed, free-standing parts in the form of plates, tubes, shells, etc. can be produced. It can also be used for powder processing (spheroidization, homogenization, modification of chemistry, etc.). In this case, the substrate for deposition is absent and the particles solidify during flight or in a controlled environment (e.g., water). With some variation, this technique may also be used to create porous structures, suitable for bone ingrowth, as a coating for medical implants. A polymer dispersion aerosol can be injected into the plasma discharge in order to create a grafting of this polymer onto a substrate surface. This application is mainly used to modify the surface chemistry of polymers.

Variations
Plasma spraying systems can be categorized by several criteria.
Plasma jet generation:
direct current (DC plasma), where the energy is transferred to the plasma jet by a direct-current, high-power electric arc
induction plasma or RF plasma, where the energy is transferred by induction from a coil around the plasma jet, through which an alternating, radio-frequency current passes
Plasma-forming medium:
gas-stabilized plasma (GSP), where the plasma forms from a gas, typically argon, hydrogen, helium or their mixtures
water-stabilized plasma (WSP), where the plasma forms from water (through evaporation, dissociation and ionization) or another suitable liquid
hybrid plasma - with combined gas and liquid stabilization, typically argon and water
Spraying environment:
air plasma spraying (APS), performed in ambient air
controlled atmosphere plasma spraying (CAPS), usually performed in a closed chamber, either filled with inert gas or evacuated
variations of CAPS: high-pressure plasma spraying (HPPS), low-pressure plasma spraying (LPPS), the extreme case of which is vacuum plasma spraying (VPS, see below)
underwater plasma spraying

Another variation consists of having a liquid feedstock instead of a solid powder for melting; this technique is known as solution precursor plasma spray.


Vacuum plasma spraying


Vacuum plasma spraying (VPS) is a technology for etching and surface modification to create porous layers with high reproducibility and for cleaning and surface engineering of plastics, rubbers and natural fibers as well as for replacing CFCs for cleaning metal components. This surface engineering can improve properties such as frictional behavior, heat resistance, surface electrical conductivity, lubricity, cohesive strength of films, or dielectric constant, or it can make materials hydrophilic or hydrophobic.
The process typically operates at 39–120 °C to avoid thermal damage. It can induce non-thermally activated surface reactions, causing surface changes which cannot occur with molecular chemistries at atmospheric pressure. Plasma processing is done in a controlled environment inside a sealed chamber at a medium vacuum, around 13–65 Pa. The gas or mixture of gases is energized by an electrical field from DC to microwave frequencies, typically 1–500 W at 50 V. The treated components are usually electrically isolated. The volatile plasma by-products are evacuated from the chamber by the vacuum pump, and if necessary can be neutralized in an exhaust scrubber.
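At the 13–65 Pa working pressures quoted above, the gas mean free path is several orders of magnitude longer than at atmospheric pressure, which is part of why such plasmas behave so differently from atmospheric-pressure chemistry. A back-of-the-envelope kinetic-theory estimate is sketched below, assuming a nitrogen-like molecular diameter of about 0.37 nm.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mean_free_path(pressure_pa: float, temperature_k: float = 300.0,
                   molecule_diameter_m: float = 3.7e-10) -> float:
    """Kinetic-theory estimate: lambda = k_B*T / (sqrt(2)*pi*d^2*p)."""
    return K_B * temperature_k / (math.sqrt(2.0) * math.pi
                                  * molecule_diameter_m ** 2 * pressure_pa)

for p in (13.0, 65.0, 101325.0):  # chamber pressures quoted above, plus 1 atm
    print(f"p = {p:9.0f} Pa -> mean free path ~ {mean_free_path(p):.2e} m")
```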

In contrast to molecular chemistry, plasmas employ:
Molecular, atomic, metastable and free radical species for chemical effects.
Positive ions and electrons for kinetic effects.
Plasma also generates electromagnetic radiation in the form of vacuum UV photons, which penetrate bulk polymers to a depth of about 10 μm. This can cause chain scissions and cross-linking. Plasmas affect materials at an atomic level. Techniques like X-ray photoelectron spectroscopy and scanning electron microscopy are used for surface analysis to identify the processes required and to judge their effects. As a simple indication of surface energy, and hence adhesion or wettability, a water droplet contact angle test is often used. The lower the contact angle, the higher the surface energy and the more hydrophilic the material is.
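One common way to turn the contact-angle test into a number is the Young–Dupré relation for the work of adhesion of the test liquid, W = γ(1 + cos θ): the lower the contact angle, the larger W. A minimal sketch, assuming the surface tension of water at room temperature:

```python
import math

def work_of_adhesion(contact_angle_deg: float,
                     surface_tension_n_per_m: float = 0.0728) -> float:
    """Young-Dupre relation: W = gamma * (1 + cos(theta)), in J/m^2."""
    return surface_tension_n_per_m * (1.0 + math.cos(math.radians(contact_angle_deg)))

# Lower contact angle -> larger work of adhesion -> higher-energy, more wettable surface.
for angle_deg in (10, 45, 90, 120):
    w_mj = work_of_adhesion(angle_deg) * 1e3
    print(f"contact angle {angle_deg:3d} deg -> W ~ {w_mj:.1f} mJ/m^2")
```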

Changing effects with plasma


At higher energies, ionization tends to occur more than chemical dissociation. In a typical reactive gas, 1 in 100 molecules forms free radicals whereas only about 1 in 10⁶ ionizes. The predominant effect here is the forming of free radicals. Ionic effects can predominate with appropriate selection of process parameters and, if necessary, the use of noble gases.

Wire arc spray


Wire arc spray is a form of thermal spraying where two consumable metal wires are fed independently into the spray gun. These wires are then charged and an arc is generated between them. The heat from this arc melts the incoming wire, which is then entrained in an air jet from the gun. This entrained molten feedstock is then deposited onto a substrate. This process is commonly used for metallic, heavy coatings.



Plasma transferred wire arc


Plasma transferred wire arc is another form of wire arc spray which deposits a coating on the internal surface of a cylinder, or on the external surface of a part of any geometry. It is predominantly known for its use in coating the cylinder bores of an engine, enabling the use of aluminum engine blocks without the need for heavy cast iron sleeves. A single conductive wire is used as "feedstock" for the system. A supersonic plasma jet melts the wire, atomizes it and propels it onto the substrate. The plasma jet is formed by a transferred arc between a non-consumable cathode and the wire. After atomization, forced air transports the stream of molten droplets onto the bore wall. The particles flatten when they impinge on the surface of the substrate, due to their high kinetic energy. The particles rapidly solidify upon contact. The stacked particles make up a highly wear-resistant coating. The PTWA thermal spray process utilizes a single wire as the feedstock material. All conductive wires up to and including 0.0625" (1.6 mm) can be used as feedstock material, including "cored" wires. PTWA can be used to apply a coating to the wear surface of engine or transmission components to replace a bushing or bearing. For example, using PTWA to coat the bearing surface of a connecting rod offers a number of benefits including reductions in weight, cost, friction potential, and stress in the connecting rod.

High velocity oxygen fuel spraying (HVOF)


During the 1980s, a class of thermal spray processes called high velocity oxy-fuel spraying was developed. A mixture of gaseous or liquid fuel and oxygen is fed into a combustion chamber, where they are ignited and combusted continuously. The resultant hot gas at a pressure close to 1 MPa emanates through a converging–diverging nozzle and travels through a straight section. The fuels can be gases (hydrogen, methane, propane, propylene, acetylene, natural gas, etc.) or liquids (kerosene, etc.). The jet velocity at the exit of the barrel (>1,000 m/s) exceeds the speed of sound. A powder feedstock is injected into the gas stream, which accelerates the powder up to 800 m/s. The stream of hot gas and powder is directed towards the surface to be coated. The powder partially melts in the stream and deposits upon the substrate. The resulting coating has low porosity and high bond strength. HVOF coatings may be as thick as 12 mm (1/2"). It is typically used to deposit wear- and corrosion-resistant coatings on materials, such as ceramic and metallic layers. Common powders include WC-Co, chromium carbide, MCrAlY, and alumina. The process has been most successful for depositing cermet materials (WC-Co, etc.) and other corrosion-resistant alloys (stainless steels, nickel-based alloys, aluminium, hydroxyapatite for medical implants, etc.).

Cold spraying
In the 1990s, cold spraying (often called gas dynamic cold spray) was introduced. The method was originally developed in Russia after the accidental observation of the rapid formation of coatings while experimenting with the particle erosion of a target exposed to a high-velocity flow loaded with fine powder in a wind tunnel. In cold spraying, particles are accelerated to very high speeds by the carrier gas forced through a converging–diverging de Laval type nozzle. Upon impact, solid particles with sufficient kinetic energy deform plastically and bond metallurgically to the substrate to form a coating. The critical velocity needed to form bonding depends on the material's properties, powder size and temperature. Soft metals such as Cu and Al are best suited for cold spraying, but coating of other materials (W, Ta, Ti, MCrAlY, WC-Co, etc.) by cold spraying has been reported. The deposition efficiency is typically low for alloy powders, and the window of process parameters and suitable powder sizes is narrow. To accelerate powders to higher velocity, finer powders (<20 micrometers) are used. It is possible to accelerate powder particles to much higher velocity using a processing gas having a high speed of sound (helium instead of nitrogen). However, helium is costly and its flow rate, and thus consumption, is higher. To improve acceleration capability, nitrogen gas is heated up to about 900 °C. As a result, deposition efficiency and tensile strength of the deposits increase.
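The advantage of helium and of preheating the nitrogen can be illustrated with the ideal-gas speed of sound, a = sqrt(γRT/M). Real nozzle flows are more complex, so the sketch below is indicative only.

```python
import math

R = 8.314  # universal gas constant, J/(mol K)

def speed_of_sound(gamma: float, molar_mass_kg_per_mol: float, temp_k: float) -> float:
    """Ideal-gas speed of sound: a = sqrt(gamma * R * T / M)."""
    return math.sqrt(gamma * R * temp_k / molar_mass_kg_per_mol)

cases = [
    ("nitrogen at 300 K", 1.40, 0.028, 300.0),
    ("nitrogen preheated to ~900 C (~1173 K)", 1.40, 0.028, 1173.0),
    ("helium at 300 K", 1.67, 0.004, 300.0),
]
for name, gamma, molar_mass, temp in cases:
    print(f"{name}: a ~ {speed_of_sound(gamma, molar_mass, temp):.0f} m/s")
```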


Warm spraying
Warm spraying is a novel modification of high velocity oxy-fuel spraying, in which the temperature of the combustion gas is lowered by mixing nitrogen with the combustion gas, thus bringing the process closer to cold spraying. The resulting gas contains much water vapor, unreacted hydrocarbons and oxygen, and thus is dirtier than that of cold spraying. However, the coating efficiency is higher. On the other hand, the lower temperatures of warm spraying reduce melting and chemical reactions of the feed powder, as compared to HVOF. These advantages are especially important for coating materials such as Ti, plastics, and metallic glasses, which rapidly oxidize or deteriorate at high temperatures.

Applications
Crankshaft reconditioning or conditioning
Corrosion protection
Fouling protection
Altering thermal conductivity or electrical conductivity
Wear control: either hardfacing (wear-resistant) or abradable coating
Repairing damaged surfaces
Temperature/oxidation protection (thermal barrier coatings)
Medical implants
Production of functionally graded materials (for any of the above applications)

Plasma sprayed ceramic coating applied onto a part of an automotive exhaust system

Safety
Thermal spraying need not be a dangerous process if the equipment is treated with care and correct spraying practices are followed. As with any industrial process, there are a number of hazards of which the operator should be aware and against which specific precautions should be taken. Ideally, equipment should be operated automatically, in enclosures specially designed to extract fumes, reduce noise levels, and prevent direct viewing of the spraying head. Such techniques will also produce coatings that are more consistent. There are occasions when the type of components being treated, or their low production levels, requires manual equipment operation. Under these conditions, a number of hazards peculiar to thermal spraying are experienced, in addition to those commonly encountered in production or processing industries.

Noise
Metal spraying equipment uses compressed gases, which create noise. Sound levels vary with the type of spraying equipment, the material being sprayed, and the operating parameters. Typical sound pressure levels are measured at 1 meter behind the arc.

UV light
Combustion spraying equipment produces an intense flame, which may have a peak temperature of more than 3,100 °C and is very bright. Electric arc spraying produces ultraviolet light, which may damage delicate body tissues. Spray booths and enclosures should be fitted with ultraviolet-absorbent dark glass. Where this is not possible, operators and others in the vicinity should wear protective goggles containing BS grade 6 green glass. Opaque screens should be placed around spraying areas. The nozzle of an arc pistol should never be viewed directly, unless it is certain that no power is available to the equipment.



Dust and fumes


The atomization of molten materials produces a large amount of dust and fumes made up of very fine particles (about 80–95% of the particles by number are smaller than 100 nm). Proper extraction facilities are vital, not only for personal safety, but to minimize entrapment of re-frozen particles in the sprayed coatings. The use of respirators fitted with suitable filters is strongly recommended where equipment cannot be isolated. Certain materials offer specific known hazards:
1. Finely divided metal particles are potentially pyrophoric and harmful when accumulated in the body.
2. Certain materials, e.g. aluminum, zinc and other base metals, may react with water to evolve hydrogen. This is potentially explosive and special precautions are necessary in fume extraction equipment.
3. Fumes of certain materials, notably zinc and copper alloys, have a disagreeable odour and may cause a fever-type reaction in certain individuals (known as metal fume fever). This may occur some time after spraying and usually subsides rapidly. If it does not, medical advice must be sought.

Heat
Combustion spraying guns use oxygen and fuel gases. The fuel gases are potentially explosive. In particular, acetylene may only be used under approved conditions. Oxygen, while not explosive, will sustain combustion, and many materials will spontaneously ignite, if excessive oxygen levels are present. Care must be taken to avoid leakage, and to isolate oxygen and fuel gas supplies, when not in use.

Shock hazards
Electric arc guns operate at low voltages (below 45 V dc), but at relatively high currents. They may be safely hand-held. The power supply units are connected to 440 V AC sources, and must be treated with caution.



Plasma transferred wire arc thermal spraying


Plasma transferred wire arc (PTWA) thermal spraying is a thermal spraying process that deposits a coating on the internal surface of a cylindrical part, or on the external surface of a part of any geometry. It is predominantly known for its use in coating the cylinder bores of an engine, enabling the use of aluminum engine blocks without the need for heavy cast iron sleeves. A single conductive wire is used as "feedstock" for the system. A supersonic plasma jet melts the wire, atomizes it and propels it onto the substrate. The plasma jet is formed by a transferred arc between a non-consumable cathode and the wire. After atomization, forced gas transports the stream of molten droplets onto the bore wall. The particles flatten when they impinge on the surface of the substrate, due to their high kinetic energy. The particles rapidly solidify upon contact and can form both crystalline and amorphous phases. There is also the possibility of producing multi-layer coatings. The stacked particles make up a highly wear-resistant coating. All conductive wires up to and including 0.0625" (1.6 mm) can be used as feedstock material, including "cored" wires. PTWA can be used to apply a coating to the wear surface of engine or transmission components to replace a bushing or bearing. For example, using PTWA to coat the bearing surface of a connecting rod offers a number of benefits including reductions in weight, cost, friction potential, and stress in the connecting rod. The inventors of PTWA received the 2009 IPO National Inventor of the Year award. This technology was initially patented and developed by inventors at Flame-Spray Industries, Inc. The technology was subsequently improved upon by Ford and Flame-Spray Industries. PTWA is currently in use by Nissan in the Nissan GT-R; Ford is implementing it in the new Mustang GT500 Shelby, and Caterpillar and other manufacturers are using it for re-manufacturing. Other applications for this process include the spraying of internal diameters of pipes. Any conductive wire can be used as the feedstock material, including "cored" wire. Refractory metals as well as low-melt materials are easily deposited. The recent use of PTWA by Nissan and Ford has been to apply a wear-resistant coating on the internal surface of engine block cylinder bores. For hypoeutectic aluminum-silicon alloy blocks, PTWA provides a good alternative to cast iron liners, which are more costly and heavier. PTWA also delivers increased displacement in the same size engine package and a potential for better heat transfer. PTWA coatings are also applied directly to cast iron engine blocks for re-manufacturing. PTWA-coated test engines have been run for over 3 million combined miles of trouble-free on-the-road performance. The technology is currently in use at a number of major production facilities around the world. It is also being used to coat worn parts, to make them like-new in re-manufacturing facilities.



Powder coating
Powder coating is a type of coating that is applied as a free-flowing, dry powder. The main difference between a conventional liquid paint and a powder coating is that the powder coating does not require a solvent to keep the binder and filler parts in a liquid suspension form. The coating is typically applied electrostatically and is then cured under heat to allow it to flow and form a "skin". The powder may be a thermoplastic or a thermoset polymer. It is usually used to create a hard finish that is tougher than conventional paint. Powder coating is mainly used for coating of metals, such as household appliances, aluminium extrusions, drum hardware, and automobile and bicycle parts. Newer technologies allow other materials, such as MDF (medium-density fibreboard), to be powder coated using different methods.

Aluminium extrusions being powder coated

Advantages and disadvantages


Powder coatings have many advantages over other coating processes, but there are also some disadvantages to the technology. While it is relatively easy to apply thick coatings which have smooth, texture-free surfaces, it is not as easy to apply smooth thin films. As the film thickness is reduced, the film becomes more and more orange-peeled in texture due to the particle size and glass transition temperature (Tg) of the powder. On smaller jobs, the cost of powder coating will be higher than spray painting.

Powder coated bicycle frames and parts

For optimum material handling and ease of application, most powder coatings have a particle size in the range of 30 to 50 μm and a Tg around 200 °C. For such powder coatings, film build-ups of greater than 50 μm may be required to obtain an acceptably smooth film. The surface texture which is considered desirable or acceptable depends on the end product. Many manufacturers actually prefer to have a certain degree of orange peel, since it helps to hide metal defects that have occurred during manufacture, and the resulting coating is less prone to showing fingerprints. There are very specialized operations where powder coatings of less than 30 micrometres or with a Tg below 40 °C are used in order to produce smooth thin films. One variation of the dry powder coating process, the powder slurry process, combines the advantages of powder coatings and liquid coatings by dispersing very fine powders of 1–5 micrometre particle size into water, which then allows very smooth, low-film-thickness coatings to be produced. Powder coatings have a major advantage in that the overspray can be recycled. However, if multiple colors are being sprayed in a single spray booth, this may limit the ability to recycle the overspray.



Types of powder coatings


There are two main categories of powder coatings: thermosets and thermoplastics. The thermosetting variety incorporates a cross-linker into the formulation. When the powder is baked, it reacts with other chemical groups in the powder to polymerize, improving the performance properties. The thermoplastic variety does not undergo any additional reactions during the baking process, but rather only flows out into the final coating. The most common polymers used are polyester, polyurethane, polyester-epoxy (known as hybrid), straight epoxy (fusion-bonded epoxy) and acrylics. Production:
1. The polymer granules are mixed with hardener, pigments and other powder ingredients in a mixer
2. The mixture is heated in an extruder
3. The extruded mixture is rolled flat, cooled and broken into small chips
4. The chips are milled and sieved to make a fine powder

The powder coating process


The powder coating process involves three basic steps:
1. Part preparation or pre-treatment
2. The powder application
3. Curing

Part preparation processes and equipment


Removal of oil, soil, lubrication greases, metal oxides, welding scales etc. is essential prior to the powder coating process. It can be done by a variety of chemical and mechanical methods. The selection of the method depends on the size and the material of the part to be powder coated, the type of soil to be removed and the performance requirement of the finished product. Chemical pre-treatments involve the use of phosphates or chromates in submersion or spray application. These often occur in multiple stages and consist of degreasing, etching, de-smutting, various rinses and the final phosphating or chromating of the substrate. The pre-treatment process both cleans and improves bonding of the powder to the metal. Recent additional processes have been developed that avoid the use of chromates, as these can be toxic to the environment. Titanium zirconium and silanes offer similar performance against corrosion and adhesion of the powder. In many high end applications, the part is electrocoated following the pretreatment process, and subsequent to the powder coating application. This has been particularly useful in automotive and other applications requiring high end performance characteristics. Another method of preparing the surface prior to coating is known as abrasive blasting or sandblasting and shot blasting. Blast media and blasting abrasives are used to provide surface texturing and preparation, etching, finishing, and degreasing for products made of wood, plastic, or glass. The most important properties to consider are chemical composition and density; particle shape and size; and impact resistance. Silicon carbide grit blast medium is brittle, sharp, and suitable for grinding metals and low-tensile strength, non-metallic materials. Plastic media blast equipment uses plastic abrasives that are sensitive to substrates such as aluminum, but still suitable for de-coating and surface finishing. Sand blast medium uses high-purity crystals that have low-metal content. Glass bead blast medium contains glass beads of various sizes. Cast steel shot or steel grit is used to clean and prepare the surface before coating. Shot blasting recycles the media and is environmentally friendly. This method of preparation is highly efficient on steel parts such as I-beams, angles, pipes, tubes and large fabricated pieces.

Different powder coating applications can require alternative methods of preparation, such as abrasive blasting prior to coating. The online consumer market typically offers media blasting services coupled with their coating services at additional cost. Another type of gun is called a tribo gun, which charges the powder by (triboelectric) friction. In this case, the powder picks up a positive charge while rubbing along the wall of a Teflon tube inside the barrel of the gun. These charged powder particles then adhere to the grounded substrate. Using a tribo gun requires a different formulation of powder than the more common corona guns. Tribo guns are not subject to some of the problems associated with corona guns, however, such as back ionization and the Faraday cage effect. Powder can also be applied using specifically adapted electrostatic discs. Another method of applying powder coating, called the fluidized bed method, is by heating the substrate and then dipping it into an aerated, powder-filled bed. The powder sticks and melts to the hot object. Further heating is usually required to finish curing the coating. This method is generally used when the desired thickness of coating is to exceed 300 micrometres. This is how most dishwasher racks are coated.


Electrostatic fluidized bed coating


Electrostatic fluidized bed application uses the same fluidizing technique and the conventional fluidized bed dip process but with much less powder depth in the bed. An electrostatic charging medium is placed inside the bed so that the powder material becomes charged as the fluidizing air lifts it up. Charged particles of powder move upward and form a cloud of charged powder above the fluid bed. When a grounded part is passed through the charged cloud the particles will be attracted to its surface. The parts are not preheated as they are for the conventional fluidized bed dip process.

Electrostatic magnetic brush (EMB) coating


An innovative coating method for flat materials that applies powder coating with a roller technique, enabling relatively high speeds and a very accurate layer thickness between 5 and 100 micrometres. The basis for this process is conventional copier technology. It is currently in use in some high-tech coating applications and is very promising for commercial powder coating on flat substrates (steel, aluminium, MDF, paper, board) as well as in sheet-to-sheet and/or roll-to-roll processes. This process can potentially be integrated into any existing coating line.

Curing
When a thermoset powder is exposed to elevated temperature, it begins to melt, flows out, and then chemically reacts to form a higher molecular weight polymer in a network-like structure. This cure process, called crosslinking, requires a certain temperature for a certain length of time in order to reach full cure and establish the full film properties for which the material was designed. Normally the powders cure at 200 °C (390 °F) for 10 minutes. The curing schedule may vary according to the manufacturer's specifications. The application of energy to the product to be cured can be accomplished by convection cure ovens, infrared cure ovens, or by a laser curing process. The latter demonstrates a significant reduction of curing time.

Removing powder coating


Methylene chloride and acetone are generally effective at removing powder coating; however, most other organic solvents (thinners, etc.) are completely ineffective. Most recently, the suspected human carcinogen methylene chloride is being replaced by benzyl alcohol with great success. Powder coating can also be removed with abrasive blasting. Commercial-grade 98% sulfuric acid also removes powder coating film. Certain low-grade powder coats can be removed with steel wool, though this might be a more labor-intensive process than desired. Powder coating can also be removed by a burning-off process, in which parts are put into a large high-temperature oven with temperatures typically reaching an air temperature of 1,100 to 1,500 degrees and a burner temperature of 900 degrees. The process takes about four hours and requires the parts to be cleaned completely and re-powdered. Parts made with a thinner-gauge material need to be burned off at a lower temperature to prevent the material from warping.

Market
In 2010, the global demand for powder coatings amounted to approximately US$5.8 billion. Driven by the development of new materials, new formulations and the advancement of equipment and application processes, the powder coating market is expected to show rapid annual growth of around 6% from 2012 to 2018. Currently, industrial uses are the largest application market for powder coatings. The automotive industry is experiencing the most dynamic growth. Steady and strong growth is also expected in the furniture and appliance markets. Furthermore, the application of powder coatings in IT & telecommunications is also being widely explored.

Spin coating
Spin coating is a procedure used to deposit uniform thin films onto flat substrates. Usually a small amount of coating material is applied to the center of the substrate, which is either spinning at low speed or not spinning at all. The substrate is then rotated at high speed in order to spread the coating material by centrifugal force. A machine used for spin coating is called a spin coater, or simply a spinner. Rotation is continued while the fluid spins off the edges of the substrate, until the desired thickness of the film is achieved. The applied solvent is usually volatile and evaporates during spinning. Thus, the higher the angular speed of spinning, the thinner the film. The thickness of the film also depends on the viscosity and concentration of the solution and the solvent. Spin coating is widely used in microfabrication, where it can be used to create thin films with thicknesses below 10 nm. It is used intensively in photolithography to deposit layers of photoresist about 1 micrometre thick. Photoresist is typically spun at 20 to 80 revolutions per second for 30 to 60 seconds.

Laurell Technologies WS-400 spin coater used to apply photoresist to the surface of a silicon wafer.
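The qualitative statement that faster spinning gives thinner films can be made concrete with the classical Emslie–Bonner–Peck model for a Newtonian film on a spinning disc, which neglects solvent evaporation. The fluid properties and initial thickness in the sketch below are illustrative assumptions only.

```python
import math

def film_thickness(h0_m: float, omega_rad_s: float, time_s: float,
                   density_kg_m3: float, viscosity_pa_s: float) -> float:
    """Emslie-Bonner-Peck thinning of a Newtonian film (evaporation neglected):
    h(t) = h0 / sqrt(1 + 4*rho*omega^2*h0^2*t / (3*eta))."""
    k = 4.0 * density_kg_m3 * omega_rad_s ** 2 * h0_m ** 2 * time_s / (3.0 * viscosity_pa_s)
    return h0_m / math.sqrt(1.0 + k)

# Illustrative resist-like fluid: rho ~ 1000 kg/m^3, eta ~ 0.01 Pa*s,
# 10 um initial film, 30 s spin time.
rho, eta, h0, t_spin = 1000.0, 0.01, 10e-6, 30.0
for rev_per_s in (20, 40, 80):  # the 20-80 rev/s range quoted above
    omega = 2.0 * math.pi * rev_per_s
    h = film_thickness(h0, omega, t_spin, rho, eta)
    print(f"{rev_per_s} rev/s -> film thickness ~ {h * 1e6:.2f} um")
```

With these assumed values the model predicts films of a few micrometres down to about one micrometre, consistent in order of magnitude with the photoresist thickness quoted above.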



Thin film
A thin film is a layer of material ranging from fractions of a nanometer (monolayer) to several micrometers in thickness. Electronic semiconductor devices and optical coatings are the main applications benefiting from thin-film construction. A familiar application of thin films is the household mirror, which typically has a thin metal coating on the back of a sheet of glass to form a reflective interface. The process of silvering was once commonly used to produce mirrors. A very thin film coating (less than about 50 nanometers thick) is used to produce two-way mirrors. The performance of optical coatings (e.g., antireflective, or AR, coatings) is typically enhanced when the thin-film coating consists of multiple layers having varying thicknesses and refractive indices. Similarly, a periodic structure of alternating thin films of different materials may collectively form a so-called superlattice, which exploits the phenomenon of quantum confinement by restricting electronic phenomena to two dimensions. Work is being done with ferromagnetic and ferroelectric thin films for use as computer memory. Thin-film technology is also being applied to pharmaceuticals, via thin-film drug delivery. Thin films are used to produce thin-film batteries and are also used in dye-sensitized solar cells. Ceramic thin films are in wide use. The relatively high hardness and inertness of ceramic materials make this type of thin coating of interest for protection of substrate materials against corrosion, oxidation and wear. In particular, the use of such coatings on cutting tools can extend the life of these items by several orders of magnitude. Research is being done on a new class of thin-film inorganic oxide materials, called amorphous heavy-metal cation multicomponent oxides, which could be used to make transparent transistors that are inexpensive, stable, and environmentally benign.

Deposition
The act of applying a thin film to a surface is thin-film deposition: any technique for depositing a thin film of material onto a substrate or onto previously deposited layers. "Thin" is a relative term, but most deposition techniques control layer thickness within a few tens of nanometres. Molecular beam epitaxy allows a single layer of atoms to be deposited at a time. Thin-film deposition is useful in the manufacture of optics (for reflective, anti-reflective coatings or self-cleaning glass, for instance), electronics (layers of insulators, semiconductors, and conductors form integrated circuits), packaging (i.e., aluminium-coated PET film), and in contemporary art (see the work of Larry Bell). Similar processes are sometimes used where thickness is not important: for instance, the purification of copper by electroplating, and the deposition of silicon and enriched uranium by a CVD-like process after gas-phase processing. Deposition techniques fall into two broad categories, depending on whether the process is primarily chemical or physical.



Chemical deposition
Here, a fluid precursor undergoes a chemical change at a solid surface, leaving a solid layer. An everyday example is the formation of soot on a cool object when it is placed inside a flame. Since the fluid surrounds the solid object, deposition happens on every surface, with little regard to direction; thin films from chemical deposition techniques tend to be conformal, rather than directional. Chemical deposition is further categorized by the phase of the precursor:
Plating relies on liquid precursors, often a solution of water with a salt of the metal to be deposited. Some plating processes are driven entirely by reagents in the solution (usually for noble metals), but by far the most commercially important process is electroplating. It was not commonly used in semiconductor processing for many years, but has seen a resurgence with more widespread use of chemical-mechanical polishing techniques.
Chemical solution deposition (CSD) or chemical bath deposition (CBD) uses a liquid precursor, usually a solution of organometallic powders dissolved in an organic solvent. This is a relatively inexpensive, simple thin-film process that is able to produce stoichiometrically accurate crystalline phases. This technique is also known as the sol-gel method, because the 'sol' (or solution) gradually evolves towards the formation of a gel-like diphasic system.
Spin coating or spin casting uses a liquid precursor, or sol-gel precursor, deposited onto a smooth, flat substrate which is subsequently spun at a high velocity to centrifugally spread the solution over the substrate. The speed at which the solution is spun and the viscosity of the sol determine the ultimate thickness of the deposited film. Repeated depositions can be carried out to increase the thickness of films as desired. Thermal treatment is often carried out in order to crystallize the amorphous spin-coated film. Such crystalline films can exhibit certain preferred orientations after crystallization on single crystal substrates.
Chemical vapor deposition (CVD) generally uses a gas-phase precursor, often a halide or hydride of the element to be deposited. In the case of MOCVD, an organometallic gas is used. Commercial techniques often use very low pressures of precursor gas.
Plasma enhanced CVD (PECVD) uses an ionized vapor, or plasma, as a precursor. Unlike the soot example above, commercial PECVD relies on electromagnetic means (electric current, microwave excitation), rather than a chemical reaction, to produce a plasma.
Atomic layer deposition (ALD) uses gaseous precursors to deposit conformal thin films one layer at a time. The process is split up into two half reactions, run in sequence and repeated for each layer, in order to ensure total layer saturation before beginning the next layer. Therefore, one reactant is deposited first, and then the second reactant is deposited, during which a chemical reaction occurs on the substrate, forming the desired composition. As a result of this stepwise nature, the process is slower than CVD; however, it can be run at lower temperatures than CVD.
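The pulse/purge sequence that defines ALD is often easiest to see as a simple cycle loop. The sketch below is a purely hypothetical controller stub; the precursor pair (trimethylaluminium and water, a common alumina chemistry), the timings, and the assumed growth of about 0.1 nm per cycle are illustrative only, not a recipe for any particular reactor.

```python
def pulse(step: str, seconds: float) -> None:
    """Stand-in for actuating a valve on a hypothetical ALD reactor controller."""
    print(f"  {step:<28s} {seconds:>5.1f} s")

def ald_cycle(metal_precursor: str, co_reactant: str,
              pulse_s: float = 0.1, purge_s: float = 5.0) -> None:
    """One ALD cycle: two self-limiting half-reactions separated by inert purges."""
    pulse(f"pulse {metal_precursor}", pulse_s)  # half-reaction 1 saturates the surface
    pulse("purge (N2)", purge_s)                # remove excess precursor / by-products
    pulse(f"pulse {co_reactant}", pulse_s)      # half-reaction 2 completes the layer
    pulse("purge (N2)", purge_s)

# Illustrative numbers: growth per cycle (GPC) assumed ~0.1 nm/cycle.
target_nm, gpc_nm = 10.0, 0.1
n_cycles = round(target_nm / gpc_nm)
print(f"~{n_cycles} cycles for a {target_nm:.0f} nm film at an assumed GPC of {gpc_nm} nm")
for i in range(2):  # show only the first two cycles
    print(f"cycle {i + 1}:")
    ald_cycle("trimethylaluminium", "H2O")
```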

Physical deposition
Physical deposition uses mechanical, electromechanical or thermodynamic means to produce a thin film of solid. An everyday example is the formation of frost. Since most engineering materials are held together by relatively high energies, and chemical reactions are not used to store these energies, commercial physical deposition systems tend to require a low-pressure vapor environment to function properly; most can be classified as physical vapor deposition (PVD). The material to be deposited is placed in an energetic, entropic environment, so that particles of material escape its surface. Facing this source is a cooler surface which draws energy from these particles as they arrive, allowing them to form a solid layer. The whole system is kept in a vacuum deposition chamber, to allow the particles to travel as freely as possible. Since particles tend to follow a straight path, films deposited by physical means are commonly directional, rather than conformal.

Examples of physical deposition include:
A thermal evaporator uses an electric resistance heater to melt the material and raise its vapor pressure to a useful range. This is done in a high vacuum, both to allow the vapor to reach the substrate without reacting with or scattering against other gas-phase atoms in the chamber, and to reduce the incorporation of impurities from the residual gas in the vacuum chamber. Obviously, only materials with a much higher vapor pressure than the heating element can be deposited without contamination of the film. Molecular beam epitaxy is a particularly sophisticated form of thermal evaporation.
An electron beam evaporator fires a high-energy beam from an electron gun to boil a small spot of material; since the heating is not uniform, lower vapor pressure materials can be deposited. The beam is usually bent through an angle of 270° in order to ensure that the gun filament is not directly exposed to the evaporant flux. Typical deposition rates for electron beam evaporation range from 1 to 10 nanometres per second.
In molecular beam epitaxy (MBE), slow streams of an element can be directed at the substrate, so that material deposits one atomic layer at a time. Compounds such as gallium arsenide are usually deposited by repeatedly applying a layer of one element (i.e., gallium), then a layer of the other (i.e., arsenic), so that the process is chemical, as well as physical. The beam of material can be generated by either physical means (that is, by a furnace) or by a chemical reaction (chemical beam epitaxy).
Sputtering relies on a plasma (usually a noble gas, such as argon) to knock material from a "target" a few atoms at a time. The target can be kept at a relatively low temperature, since the process is not one of evaporation, making this one of the most flexible deposition techniques. It is especially useful for compounds or mixtures, where different components would otherwise tend to evaporate at different rates. Note that sputtering's step coverage is more or less conformal. It is also widely used in optical media: the manufacturing of all formats of CD, DVD, and BD is done with the help of this technique. It is a fast technique and it also provides good thickness control. Presently, nitrogen and oxygen gases are also being used in sputtering.
Pulsed laser deposition systems work by an ablation process. Pulses of focused laser light vaporize the surface of the target material and convert it to plasma; this plasma usually reverts to a gas before it reaches the substrate.
Cathodic arc deposition (arc-PVD) is a kind of ion beam deposition where an electrical arc is created that literally blasts ions from the cathode. The arc has an extremely high power density, resulting in a high level of ionization (30–100%), multiply charged ions, neutral particles, clusters and macro-particles (droplets). If a reactive gas is introduced during the evaporation process, dissociation, ionization and excitation can occur during interaction with the ion flux and a compound film will be deposited.
Electrohydrodynamic deposition (electrospray deposition) is a relatively new process of thin-film deposition. The liquid to be deposited, either in the form of a nanoparticle solution or simply a solution, is fed to a small capillary nozzle (usually metallic) which is connected to a high voltage. The substrate on which the film has to be deposited is connected to ground.
Through the influence of the electric field, the liquid coming out of the nozzle takes a conical shape (Taylor cone), and at the apex of the cone a thin jet emanates which disintegrates into very fine, small, positively charged droplets under the influence of the Rayleigh charge limit. The droplets become smaller and smaller and are ultimately deposited on the substrate as a uniform thin layer.
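The Rayleigh charge limit mentioned above sets the maximum charge q_R = 8π√(ε₀ γ a³) that a droplet of radius a and surface tension γ can carry before it breaks up. A minimal sketch evaluating it for water-like droplets (illustrative radii only):

```python
import math

EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C

def rayleigh_limit_charge(radius_m: float, surface_tension_n_per_m: float) -> float:
    """Maximum droplet charge before Coulomb fission: q_R = 8*pi*sqrt(eps0*gamma*a^3)."""
    return 8.0 * math.pi * math.sqrt(EPS0 * surface_tension_n_per_m * radius_m ** 3)

# Illustrative water-like droplets (surface tension ~0.072 N/m).
for radius in (10e-6, 1e-6, 100e-9):
    q = rayleigh_limit_charge(radius, 0.072)
    print(f"radius {radius * 1e6:6.2f} um -> q_R ~ {q:.2e} C "
          f"(~{q / E_CHARGE:.2e} elementary charges)")
```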




Thin-film photovoltaic cells


Thin-film technologies are also being developed as a means of substantially reducing the cost of photovoltaic (PV) systems. The rationale for this is that thin-film modules are cheaper to manufacture owing to their reduced material costs, energy costs, handling costs and capital costs. This is especially represented in the use of printed electronics (roll-to-roll) processes. Thin films belong to the second and third photovoltaic cell generations.

Thin-film batteries
Thin-film printing technology is being used to apply solid-state lithium polymers to a variety of substrates to create unique batteries for specialized applications. Thin-film batteries can be deposited directly onto chips or chip packages in any shape or size. Flexible batteries can be made by printing onto plastic, thin metal foil, or paper.

Sol-gel
In materials science, the sol-gel process is a method for producing solid materials from small molecules. The method is used for the fabrication of metal oxides, especially the oxides of silicon and titanium. The process involves conversion of monomers into a colloidal solution (sol) that acts as the precursor for an integrated network (or gel) of either discrete particles or network polymers. Typical precursors are metal alkoxides.

Stages in the process


In this chemical procedure, the 'sol' (or solution) gradually evolves towards the formation of a gel-like diphasic system containing both a liquid phase and a solid phase whose morphologies range from discrete particles to continuous polymer networks. In the case of the colloid, the volume fraction of particles (or particle density) may be so low that a significant amount of fluid may need to be removed initially for the gel-like properties to be recognized. This can be accomplished in any number of ways. The simplest method is to allow time for sedimentation to occur, and then pour off the remaining liquid. Centrifugation can also be used to accelerate the process of phase separation. Removal of the remaining liquid (solvent) phase requires a drying process, which is typically accompanied by a significant amount of shrinkage and densification. The rate at which the solvent can be removed is ultimately determined by the distribution of porosity in the gel. The ultimate microstructure of the final component will clearly be strongly influenced by changes imposed upon the structural template during this phase of processing. Afterwards, a thermal treatment, or firing process, is often necessary in order to favor further polycondensation and enhance mechanical properties and structural stability via final sintering, densification and grain growth. One of the distinct advantages of using this methodology as opposed to the more traditional processing techniques is that densification is often achieved at a much lower temperature. The precursor sol can be either deposited on a substrate to form a film (e.g., by dip coating or spin coating), cast into a suitable container with the desired shape (e.g., to obtain monolithic ceramics, glasses, fibers, membranes, aerogels), or used to synthesize powders (e.g., microspheres, nanospheres). The sol-gel approach is a cheap, low-temperature technique that allows for fine control of the product's chemical composition. Even small quantities of dopants, such as organic dyes and rare earth elements, can be introduced in the sol and end up uniformly dispersed in the final product. It can be used in ceramics processing and manufacturing as an investment casting material, or as a means of producing very thin films of metal oxides for various purposes. Sol-gel derived materials have diverse applications in optics, electronics, energy, space, (bio)sensors, medicine (e.g., controlled drug release), reactive material and separation (e.g., chromatography) technology. The interest in sol-gel processing can be traced back to the mid-1800s, with the observation that the hydrolysis of tetraethyl orthosilicate (TEOS) under acidic conditions led to the formation of SiO2 in the form of fibers and monoliths. Sol-gel research grew to be so important that in the 1990s more than 35,000 papers were published worldwide on the process.


Particles and polymers


The sol-gel process is a wet-chemical technique used for the fabrication of both glassy and ceramic materials. In this process, the sol (or solution) evolves gradually towards the formation of a gel-like network containing both a liquid phase and a solid phase. Typical precursors are metal alkoxides and metal chlorides, which undergo hydrolysis and polycondensation reactions to form a colloid. The basic structure or morphology of the solid phase can range anywhere from discrete colloidal particles to continuous chain-like polymer networks. The term colloid is used primarily to describe a broad range of solid-liquid (and/or liquid-liquid) mixtures, all of which contain distinct solid (and/or liquid) particles which are dispersed to various degrees in a liquid medium. The term is specific to the size of the individual particles, which are larger than atomic dimensions but small enough to exhibit Brownian motion. If the particles are large enough, then their dynamic behavior in any given period of time in suspension would be governed by forces of gravity and sedimentation. But if they are small enough to be colloids, then their irregular motion in suspension can be attributed to the collective bombardment of a myriad of thermally agitated molecules in the liquid suspending medium, as described originally by Albert Einstein in his dissertation. Einstein concluded that this erratic behavior could adequately be described using the theory of Brownian motion, with sedimentation being a possible long term result. This critical size range (or particle diameter) typically ranges from tens of angstroms (10⁻¹⁰ m) to a few micrometres (10⁻⁶ m). Under certain chemical conditions (typically in base-catalyzed sols), the particles may grow to sufficient size to become colloids, which are affected both by sedimentation and forces of gravity. Stabilized suspensions of such sub-micrometre spherical particles may eventually result in their self-assembly, yielding highly ordered microstructures reminiscent of the prototype colloidal crystal: precious opal. Under certain chemical conditions (typically in acid-catalyzed sols), the interparticle forces have sufficient strength to cause considerable aggregation and/or flocculation prior to their growth. The formation of a more open continuous network of low density polymers exhibits certain advantages with regard to physical properties in the formation of high performance glass and glass/ceramic components in 2 and 3 dimensions. In either case (discrete particles or continuous polymer network), the sol then evolves towards the formation of an inorganic network containing a liquid phase (gel). Formation of a metal oxide involves connecting the metal centers with oxo (M-O-M) or hydroxo (M-OH-M) bridges, therefore generating metal-oxo or metal-hydroxo polymers in solution. In both cases (discrete particles or continuous polymer network), the drying process serves to remove the liquid phase from the gel, yielding a micro-porous amorphous glass or micro-crystalline ceramic. Subsequent thermal treatment (firing) may be performed in order to favor further polycondensation and enhance mechanical properties. With the viscosity of a sol adjusted into a proper range, both optical quality glass fiber and refractory ceramic fiber can be drawn, which are used for fiber optic sensors and thermal insulation, respectively. In addition, uniform ceramic powders of a wide range of chemical composition can be formed by precipitation.
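Einstein's treatment of Brownian motion leads to the Stokes–Einstein relation D = k_BT/(6πηr) for the diffusion coefficient of a colloidal particle. The sketch below evaluates it across the size range quoted above, assuming a water-like viscosity at room temperature; the values are indicative only.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def stokes_einstein_diffusivity(radius_m: float, temperature_k: float = 298.0,
                                viscosity_pa_s: float = 1.0e-3) -> float:
    """Stokes-Einstein relation: D = k_B*T / (6*pi*eta*r), in m^2/s."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)

# Particle radii spanning the colloidal range discussed above (water-like viscosity assumed).
for r in (2e-9, 20e-9, 200e-9, 1e-6):
    print(f"r = {r * 1e9:7.1f} nm -> D ~ {stokes_einstein_diffusivity(r):.2e} m^2/s")
```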


Polymerization
A well-studied alkoxide is silicon tetraethoxide, or tetraethyl orthosilicate (TEOS). The chemical formula for TEOS is given by Si(OC2H5)4, or Si(OR)4, where the alkyl group R = C2H5. Alkoxides are ideal chemical precursors for sol-gel synthesis because they react readily with water. The reaction is called hydrolysis, because a hydroxyl ion becomes attached to the silicon atom as follows:
Si(OR)4 + H2O → HO-Si(OR)3 + R-OH
Depending on the amount of water and catalyst present, hydrolysis may proceed to completion, all the way to silica:
Si(OR)4 + 2 H2O → SiO2 + 4 R-OH
Simplified representation of the condensation of TEOS in the sol-gel process.

Complete hydrolysis requires a significant excess of water and/or catalysts such as acetic acid or hydrochloric acid. Intermediate species include [(OR)2Si-(OH)2] or [(OR)3Si-(OH)], resulting from partial hydrolysis reactions. Early intermediates result from two partially hydrolyzed monomers linked via a siloxane [Si-O-Si] bond:
(OR)3Si-OH + HO-Si(OR)3 → [(OR)3Si-O-Si(OR)3] + H-O-H
or
(OR)3Si-OR + HO-Si(OR)3 → [(OR)3Si-O-Si(OR)3] + R-OH
Thus, polymerization is associated with the formation of a 1-, 2-, or 3-dimensional network of siloxane [Si-O-Si] bonds accompanied by the production of H-O-H and R-O-H species. By definition, condensation liberates a small molecule, such as water or alcohol. This type of reaction can continue to build larger and larger silicon-containing molecules by the process of polymerization. Thus, a polymer is a huge molecule (or macromolecule) formed from hundreds or thousands of units called monomers. The number of bonds that a monomer can form is called its functionality. Polymerization of silicon alkoxide, for instance, can lead to complex branching of the polymer, because a fully hydrolyzed monomer Si(OH)4 is tetrafunctional (it can branch or bond in four different directions). Alternatively, under certain conditions (e.g., low water concentration), fewer than four of the OR or OH groups (ligands) will be capable of condensation, so relatively little branching will occur. The mechanisms of hydrolysis and condensation, and the factors that bias the structure toward linear or branched structures, are the most critical issues of sol-gel science and technology. The condensation reaction is favored in both basic and acidic conditions.
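As a quick worked example of the overall hydrolysis written above (Si(OR)4 + 2 H2O → SiO2 + 4 R-OH), the sketch below estimates the water consumed and ethanol released for a given mass of TEOS. The molar masses are standard values; the 2:1 water ratio is only the net stoichiometry, and in practice a significant excess of water and a catalyst are used, as noted above.

```python
# Net stoichiometry of the overall reaction given above:
#   Si(OC2H5)4 + 2 H2O -> SiO2 + 4 C2H5OH
M_TEOS = 208.33  # g/mol
M_H2O = 18.02    # g/mol
M_ETOH = 46.07   # g/mol
M_SIO2 = 60.08   # g/mol

def hydrolysis_budget(teos_mass_g):
    """Grams of water consumed, ethanol released and SiO2 formed for full hydrolysis."""
    n_teos = teos_mass_g / M_TEOS
    return 2 * n_teos * M_H2O, 4 * n_teos * M_ETOH, n_teos * M_SIO2

water_g, ethanol_g, silica_g = hydrolysis_budget(10.0)  # 10 g of TEOS (arbitrary example)
print(f"water: {water_g:.2f} g, ethanol: {ethanol_g:.2f} g, SiO2: {silica_g:.2f} g")
```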

Sono-Ormosil
Sonication is an efficient tool for the synthesis of polymers. The cavitational shear forces, which stretch out and break the chain in a non-random process, result in a lowering of the molecular weight and poly-dispersity. Furthermore, multi-phase systems are very efficient dispersed and emulsified, so that very fine mixtures are provided. This means that ultrasound increases the rate of polymerisation over conventional stirring and results in higher molecular weights with lower polydispersities. Ormosils (organically modified silicate) are obtained when silane is added to gel-derived silica during sol-gel process. The product is a molecular-scale composite with improved mechanical properties. Sono-Ormosils are characterized by a higher density than classic gels as well as an improved thermal stability. An explanation therefore might be the increased degree of polymerization.


Nanomaterials
In the processing of fine ceramics, the irregular particle sizes and shapes in a typical powder often lead to non-uniform packing morphologies that result in packing density variations in the powder compact. Uncontrolled flocculation of powders due to attractive van der Waals forces can also give rise to microstructural inhomogeneities.

Differential stresses that develop as a result of non-uniform drying shrinkage are directly related to the rate at which the solvent can be removed, and are thus highly dependent upon the distribution of porosity. Such stresses have been associated with a plastic-to-brittle transition in consolidated bodies, and can lead to crack propagation in the unfired body if not relieved.

In addition, any fluctuations in packing density in the compact as it is prepared for the kiln are often amplified during the sintering process, yielding inhomogeneous densification. Some pores and other structural defects associated with density variations have been shown to play a detrimental role in the sintering process by growing and thus limiting end-point densities. Differential stresses arising from inhomogeneous densification have also been shown to result in the propagation of internal cracks, thus becoming the strength-controlling flaws. It would therefore appear desirable to process a material in such a way that it is physically uniform with regard to the distribution of components and porosity, rather than using particle size distributions which will maximize the green density. The containment of a uniformly dispersed assembly of strongly interacting particles in suspension requires total control over particle-particle interactions. Monodisperse colloids provide this potential. Monodisperse powders of colloidal silica, for example, may therefore be stabilized sufficiently to ensure a high degree of order in the colloidal crystal or polycrystalline colloidal solid which results from aggregation. The degree of order appears to be limited by the time and space allowed for longer-range correlations to be established. Such defective polycrystalline structures would appear to be the basic elements of nanoscale materials science, and, therefore, provide the first step in developing a more rigorous understanding of the mechanisms involved in microstructural evolution in inorganic systems such as sintered ceramic nanomaterials.

Nanostructure of a resorcinol-formaldehyde gel reconstructed from small-angle X-ray scattering. This type of disordered morphology is typical of many sol-gel materials.[10]

Applications
Protective coatings
The applications for sol-gel-derived products are numerous. For example, scientists have used the process to produce the world's lightest materials and also some of its toughest ceramics. One of the largest application areas is thin films, which can be produced on a piece of substrate by spin coating or dip coating. Protective and decorative coatings, and electro-optic components, can be applied to glass, metal and other types of substrates with these methods. Cast into a mold, and with further drying and heat treatment, dense ceramic or glass articles with novel properties can be formed that cannot be created by any other method. Other coating methods include spraying, electrophoresis, inkjet printing and roll coating.


Thin films and fibers


With the viscosity of a sol adjusted into a proper range, both optical and refractory ceramic fibers can be drawn which are used for fiber optic sensors and thermal insulation, respectively. Thus, many ceramic materials, both glassy and crystalline, have found use in various forms from bulk solid-state components to high surface area forms such as thin films, coatings and fibers.

Nanoscale powders
Ultra-fine and uniform ceramic powders can be formed by precipitation. These powders of single- and multiple-component compositions can be produced with nanoscale particle sizes for dental and biomedical applications. Composite powders have been patented for use as agrochemicals and herbicides. Powder abrasives, used in a variety of finishing operations, are made using a sol-gel type process. One of the more important applications of sol-gel processing is to carry out zeolite synthesis. Other elements (metals, metal oxides) can be easily incorporated into the final product, and the silicate sol formed by this method is very stable. Another application in research is to entrap biomolecules for sensory (biosensor) or catalytic purposes, by physically or chemically preventing them from leaching out and, in the case of proteins or chemically linked small molecules, by shielding them from the external environment yet allowing small molecules to be monitored. The major disadvantages are that the change in local environment may alter the functionality of the entrapped protein or small molecule, and that the synthesis step may damage the protein. To circumvent this, various strategies have been explored, such as monomers with protein-friendly leaving groups (e.g. glycerol) and the inclusion of polymers which stabilize proteins (e.g. PEG). Other products fabricated with this process include various ceramic membranes for microfiltration, ultrafiltration, nanofiltration, pervaporation and reverse osmosis. If the liquid in a wet gel is removed under supercritical conditions, a highly porous and extremely low-density material called an aerogel is obtained. By drying the gel by means of low-temperature treatments (25-100 °C), it is possible to obtain porous solid matrices called xerogels. In addition, a sol-gel process was developed in the 1950s for the production of radioactive powders of UO2 and ThO2 for nuclear fuels, without the generation of large quantities of dust.

Opto-mechanical
Macroscopic optical elements and active optical components, as well as large-area hot mirrors, cold mirrors, lenses and beam splitters, all with optimal geometry, can be made quickly and at low cost via the sol-gel route. In the processing of high-performance ceramic nanomaterials with superior opto-mechanical properties under adverse conditions, the size of the crystalline grains is determined largely by the size of the crystalline particles present in the raw material during the synthesis or formation of the object. Thus a reduction of the original particle size well below the wavelength of visible light (~500nm) eliminates much of the light scattering, resulting in a translucent or even transparent material. Furthermore, results indicate that microscopic pores in sintered ceramic nanomaterials, mainly trapped at the junctions of microcrystalline grains, cause light to scatter and prevent true transparency. It has been observed that the total volume fraction of these nanoscale pores (both intergranular and intragranular porosity) must be less than 1% for high-quality optical transmission, i.e., the density has to be 99.99% of the theoretical crystalline density.


Mechanics of gelation
In a static sense, the fundamental difference between a liquid and a solid is that the solid has elastic resistance against a shearing stress while a liquid does not. Thus, a simple liquid will not typically support a transverse acoustic phonon, or shear wave. Gels have been described by Born as liquids in which an elastic resistance against shearing survives, yielding both viscous and elastic properties. It has been shown theoretically that in a certain low-frequency range, polymeric gels should propagate shear waves with relatively low damping. The distinction between a sol (solution) and a gel therefore appears to be understood in a manner analogous to the practical distinction between the elastic and plastic deformation ranges of a metal: the distinction lies in the ability to respond to an applied shear force via macroscopic viscous flow. In a dynamic sense, the response of a gel to an alternating force (oscillation or vibration) will depend upon the period or frequency of vibration. Even most simple liquids will exhibit some elastic response at shear rates or frequencies exceeding about 5 × 10⁶ cycles per second. Experiments on such short time scales probe the fundamental motions of the primary particles (or particle clusters) which constitute the lattice structure or aggregate. The increasing resistance of certain liquids to flow at high stirring speeds is one manifestation of this phenomenon. The ability of a condensed body to respond to a mechanical force by viscous flow is thus strongly dependent on the time scale over which the load is applied, and thus on the frequency and amplitude of the stress wave in oscillatory experiments.

Structural relaxation
The structural relaxation of a viscoelastic gel has been identified as the primary mechanism responsible for densification and the associated pore evolution in both colloidal and polymeric silica gels. Experiments on the viscoelastic properties of such skeletal networks on various time scales require a force varying with a period (or frequency) appropriate to the relaxation time of the phenomenon investigated, and inversely proportional to the distance over which such relaxation occurs. High frequencies associated with ultrasonic waves have been used extensively in the handling of polymer solutions, liquids and gels and the determination of their viscoelastic properties. Static measurements of the shear modulus have been made, as well as dynamic measurements of the speed of propagation of shear waves, a measurement which then yields the dynamic modulus of rigidity. Dynamic light scattering (DLS) techniques have been utilized in order to monitor the dynamics of density fluctuations through the behavior of the autocorrelation function near the point of gelation.

Phase transition
Tanaka et al. emphasize that the discrete and reversible volume transitions which occur in partially hydrolyzed acrylamide gels can be interpreted in terms of a phase transition of the system consisting of the charged polymer network, hydrogen (counter)ions and the liquid matrix. The phase transition is a manifestation of competition among the three forces which contribute to the osmotic pressure in the gel:
1. the positive osmotic pressure of the (+) hydrogen ions,
2. the negative pressure due to polymer-polymer affinity, and
3. the rubber-like elasticity of the polymer network.
The balance of these forces varies with changes in temperature or solvent properties, and the total osmotic pressure acting on the system is the sum of these three contributions. It is further shown that the phase transition can be induced by the application of an electric field across the gel. The volume change at the transition point is either discrete (as in a first-order Ehrenfest transition) or continuous (second-order Ehrenfest analogy), depending on the degree of ionization of the gel and on the solvent composition.


Elastic continuum
The gel is thus interpreted as an elastic continuum which deforms when subjected to externally applied shear forces, but is incompressible upon application of hydrostatic pressure. This combination of fluidity and rigidity is explained in terms of the gel structure: that of a liquid contained within a fibrous polymer network or matrix by the extremely large friction between the liquid and the fiber or polymer network. Thermal fluctuations may produce infinitesimal expansion or contraction within the network, and the evolution of such fluctuations will ultimately determine the molecular morphology and the degree of hydration of the body. Quasi-elastic light scattering offers direct experimental access to measurement of the wavelength and lifetimes of critical fluctuations, which are governed by the viscoelastic properties of the gel. It is reasonable to expect a relationship between the amplitude of such fluctuations and the elasticity of the network. Since the elasticity measures the resistance of the network to either elastic (reversible) or plastic (irreversible) deformation, the fluctuations should grow larger as the elasticity declines. The divergence of the scattered light intensity at a finite critical temperature implies that the elasticity approaches zero, or the compressibility becomes infinite, which is the typically observed behavior of a system at the point of instability. Thus, at the critical point, the polymer network offers no resistance at all to any form of deformation.

Ultimate microstructure
The rate of relaxation of density fluctuations will be rapid if the restoring force, which depends upon the network elasticity, is large, and if the friction between the network and the interstitial fluid is small. The theory suggests that the rate is directly proportional to the elasticity and inversely proportional to the frictional force. The friction in turn depends upon both the viscosity of the fluid and the average size of the pores contained within the polymer network. Thus, if the elasticity is inferred from measurements of the scattering intensity, and the viscosity is determined independently (via mechanical methods such as ultrasonic attenuation), measurement of the relaxation rate yields information on the pore size distribution contained within the polymer network; e.g., large fluctuations in polymer density near the critical point yield large density differentials with a corresponding bimodal distribution of porosity. The difference in average size between the smaller pores (in the highly dense regions) and the larger pores (in regions of lower average density) will therefore depend upon the degree of phase separation which is allowed to occur before such fluctuations become thermally arrested or "frozen in" at or near the critical point of the transition.


Plasma-enhanced chemical vapor deposition


Plasma-enhanced chemical vapor deposition (PECVD) is a process used to deposit thin films from a gas state (vapor) to a solid state on a substrate. Chemical reactions are involved in the process, which occur after creation of a plasma of the reacting gases. The plasma is generally created by RF (AC) frequency or DC discharge between two electrodes, the space between which is filled with the reacting gases.

Discharges for processing


PECVD machine at the LAAS technological facility in Toulouse, France.

A plasma is any gas in which a significant percentage of the atoms or molecules are ionized. Fractional ionization in plasmas used for deposition and related materials processing varies from about 10⁻⁴ in typical capacitive discharges to as high as 5-10% in high-density inductive plasmas. Processing plasmas are typically operated at pressures of a few millitorr to a few torr, although arc discharges and inductive plasmas can be ignited at atmospheric pressure. Plasmas with low fractional ionization are of great interest for materials processing because electrons are so light, compared to atoms and molecules, that energy exchange between the electrons and the neutral gas is very inefficient. Therefore, the electrons can be maintained at very high equivalent temperatures (tens of thousands of kelvins, equivalent to several electronvolts of average energy) while the neutral atoms remain at the ambient temperature. These energetic electrons can induce many processes that would otherwise be very improbable at low temperatures, such as dissociation of precursor molecules and the creation of large quantities of free radicals.

A second benefit of deposition within a discharge arises from the fact that electrons are more mobile than ions. As a consequence, the plasma is normally more positive than any object it is in contact with, as otherwise a large flux of electrons would flow from the plasma to the object. The voltage between the plasma and the objects in contact with it is normally dropped across a thin sheath region. Ionized atoms or molecules that diffuse to the edge of the sheath region feel an electrostatic force and are accelerated towards the neighboring surface. Thus, all surfaces exposed to the plasma receive energetic ion bombardment. The potential across the sheath surrounding an electrically isolated object (the floating potential) is typically only 10-20 V, but much higher sheath potentials are achievable by adjustments in reactor geometry and configuration. Thus, films can be exposed to energetic ion bombardment during deposition. This bombardment can lead to increases in density of the film and help remove contaminants, improving the film's electrical and mechanical properties. When a high-density plasma is used, the ion density can be high enough that significant sputtering of the deposited film occurs; this sputtering can be employed to help planarize the film and fill trenches or holes.
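The statement above that several electronvolts corresponds to tens of thousands of kelvins can be checked with the standard conversion T = E/kB; the short sketch below (electron energies chosen only for illustration) does exactly that.

```python
EV_TO_J = 1.602176634e-19  # J per eV
K_B = 1.380649e-23         # Boltzmann constant, J/K

def ev_to_kelvin(energy_ev):
    """Equivalent temperature of a characteristic energy, T = E / k_B."""
    return energy_ev * EV_TO_J / K_B

for e_ev in (1.0, 2.0, 5.0):  # typical electron energies in processing plasmas
    print(f"{e_ev:.1f} eV  ~  {ev_to_kelvin(e_ev):,.0f} K")
```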


Reactor types
A simple direct-current (DC) discharge can be readily created at a few torr between two conductive electrodes, and may be suitable for deposition of conductive materials. However, insulating films will quickly extinguish this discharge as they are deposited. It is more common to excite a capacitive discharge by applying an alternating-current (AC) or radio-frequency (RF) signal between an electrode and the conductive walls of a reactor chamber, or between two cylindrical conductive electrodes facing one another. The latter configuration is known as a parallel plate reactor. Frequencies of a few tens of Hz to a few thousand Hz will produce time-varying plasmas that are repeatedly initiated and extinguished; frequencies of tens of kilohertz to tens of megahertz result in reasonably time-independent discharges.

Excitation frequencies in the low-frequency (LF) range, usually around 100kHz, require several hundred volts to sustain the discharge. These large voltages lead to high-energy ion bombardment of surfaces. High-frequency plasmas are often excited at the standard 13.56MHz frequency widely available for industrial use; at high frequencies, the displacement current from sheath movement and scattering from the sheath assist in ionization, and thus lower voltages are sufficient to achieve higher plasma densities. Thus one can adjust the chemistry and ion bombardment in the deposition by changing the frequency of excitation, or by using a mixture of low- and high-frequency signals in a dual-frequency reactor. Excitation power of tens to hundreds of watts is typical for an electrode with a diameter of 200 to 300mm. Capacitive plasmas are usually very lightly ionized, resulting in limited dissociation of precursors and low deposition rates. Much denser plasmas can be created using inductive discharges, in which an inductive coil excited with a high-frequency signal induces an electric field within the discharge, accelerating electrons in the plasma itself rather than just at the sheath edge. Electron cyclotron resonance reactors and helicon wave antennas have also been used to create high-density discharges. Excitation powers of 10kW or more are often used in modern reactors.
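For a feel for these operating points, the sketch below converts the quoted excitation powers and electrode diameters into an areal power density; the specific power/diameter pairs are arbitrary picks from within the stated ranges.

```python
import math

def power_density_w_per_cm2(power_w, electrode_diameter_mm):
    """Areal power density over a circular electrode."""
    radius_cm = electrode_diameter_mm / 10.0 / 2.0
    return power_w / (math.pi * radius_cm ** 2)

# Tens to hundreds of watts on a 200-300 mm electrode (illustrative values).
for power_w, diameter_mm in ((50, 200), (300, 300)):
    print(f"{power_w:4d} W on a {diameter_mm} mm electrode -> "
          f"{power_density_w_per_cm2(power_w, diameter_mm):.2f} W/cm^2")
```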

This commercial system was designed for the semiconductor field and contains three 8"-diameter targets that can be run individually or simultaneously to deposit metallic or dielectric films on substrates ranging up to 24" in diameter. In use at the Argonne National Laboratory.

Film examples & Applications


Plasma deposition is often used in semiconductor manufacturing to deposit films conformally (covering sidewalls) and onto wafers containing metal layers or other temperature-sensitive structures. PECVD also yields some of the fastest deposition rates while maintaining film quality (such as roughness, defects/voids), as compared with sputter deposition and thermal/electron-beam evaporation, often at the expense of uniformity. Silicon dioxide can be deposited using a combination of silicon precursor gases, such as dichlorosilane or silane, and oxygen precursors, such as oxygen and nitrous oxide, typically at pressures from a few millitorr to a few torr. Plasma-deposited silicon nitride, formed from silane and ammonia or nitrogen, is also widely used, although it is important to note that it is not possible to deposit a pure nitride in this fashion. Plasma nitrides always contain a large amount of hydrogen, which can be bonded to silicon (Si-H) or nitrogen (Si-NH); this hydrogen has an important influence on IR and UV absorption, stability, mechanical stress, and electrical conductivity. Silicon dioxide can also be deposited from a tetraethoxysilane (TEOS) silicon precursor in an oxygen or oxygen-argon plasma. These films can be contaminated with significant carbon and hydrogen as silanol, and can be unstable in air. Pressures of a few torr and small electrode spacings, and/or dual-frequency deposition, are helpful to achieve high deposition rates with good film stability.

High-density plasma deposition of silicon dioxide from silane and oxygen/argon has been widely used to create a nearly hydrogen-free film with good conformality over complex surfaces, the latter resulting from intense ion bombardment and consequent sputtering of the deposited molecules from vertical onto horizontal surfaces.


Atomic layer deposition


Atomic layer deposition (ALD) is a thin film deposition technique that is based on the sequential use of a gas phase chemical process. The majority of ALD reactions use two chemicals, typically called precursors. These precursors react with a surface one at a time in a sequential, self-limiting, manner. By exposing the precursors to the growth surface repeatedly, a thin film is deposited.

Introduction
ALD is a self-limiting (the amount of film material deposited in each reaction cycle is constant), sequential surface chemistry that deposits conformal thin films of materials onto substrates of varying compositions. Because the reactions are self-limiting and confined to the surface, ALD film growth makes atomic-scale deposition control possible. ALD is similar in chemistry to chemical vapor deposition (CVD), except that the ALD reaction breaks the CVD reaction into two half-reactions, keeping the precursor materials separate during the reaction. By keeping the precursors separate throughout the coating process, atomic layer control of film growth can be obtained as fine as ~0.1 Å (10 pm) per cycle. Separation of the precursors is accomplished by pulsing a purge gas (typically nitrogen or argon) after each precursor pulse to remove excess precursor from the process chamber and prevent 'parasitic' CVD deposition on the substrate.

The ALD principle was first published under the name 'molecular layering' in the early 1960s by Prof. S.I. Koltsov from the Leningrad (Lensovet) Technological Institute (LTI). These experiments were conducted under the scientific supervision of Prof. V.B. Aleskovskii, corresponding member of the Russian Academy of Sciences, who had first proposed the concept of the ALD process in his Ph.D. thesis published in 1952. It was the work of Dr. Tuomo Suntola and coworkers in Finland in the mid-1970s that turned the scientific idea into a true thin-film deposition technology and brought it into industrial use and worldwide awareness. After starting with elemental precursors (hence the name 'atomic'), they were forced to move to molecular precursors as well in order to expand the materials selection. Just as importantly, Suntola and coworkers also developed reactors that enabled the implementation of the ALD technology (at that time called atomic layer epitaxy, ALE) at an industrial level in the manufacturing of thin-film electroluminescent (TFEL) flat-panel displays. These displays served as the original motivation for developing the ALD technology, as they require high-quality dielectric and luminescent films on large-area substrates, something that was not available at the time. TFEL display manufacturing started in the mid-1980s and was for a long time the only industrial application of ALD. Interest in ALD increased in steps in the mid-1990s and 2000s, with the interest focused on silicon-based microelectronics. ALD is considered one of the deposition methods with the greatest potential for producing very thin, conformal films with control of the thickness and composition of the films possible at the atomic level.

A major driving force for the recent interest is the prospect seen for ALD in scaling down microelectronic devices. In 2004, the European SEMI Award was given to Dr. Tuomo Suntola for inventing the ALD technology and introducing it worldwide. ALD can be used to deposit several types of thin films, including various oxides (e.g. Al2O3, TiO2, SnO2, ZnO, HfO2), metal nitrides (e.g. TiN, TaN, WN, NbN), metals (e.g. Ru, Ir, Pt), and metal sulfides (e.g. ZnS).


ALD process
The growth of material layers by ALD consists of repeating the following four characteristic steps:
1. Exposure of the first precursor, typically an organometallic compound.
2. Purge or evacuation of the reaction chamber to remove the non-reacted precursors and the gaseous reaction by-products.
3. Exposure of the second precursor, or another treatment (such as a plasma) to activate the surface again for the reaction of the first precursor.
4. Purge or evacuation of the reaction chamber.
Each reaction cycle adds a given amount of material to the surface, referred to as the growth per cycle. To grow a material layer, reaction cycles are repeated as many times as required for the desired film thickness. One cycle may take from 0.5 s to a few seconds and deposit between 0.1 and 3 Å of film thickness, as illustrated in the sketch below. Due to the self-terminating reactions, ALD is a surface-controlled process, where process parameters other than the precursors, substrate, and temperature have little or no influence. And, because of the surface control, ALD-grown films are extremely conformal and uniform in thickness. These thin films can also be used in conjunction with other common fabrication methods.
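Because the thickness is set purely by the cycle count, a deposition recipe reduces to simple arithmetic, as in the sketch below; the target thickness and the particular growth-per-cycle and cycle-time values are arbitrary examples taken from within the ranges quoted above.

```python
def ald_budget(target_thickness_nm, growth_per_cycle_angstrom, cycle_time_s):
    """Number of cycles and total process time for a target ALD film thickness."""
    cycles = round(target_thickness_nm * 10.0 / growth_per_cycle_angstrom)  # 1 nm = 10 Angstrom
    return cycles, cycles * cycle_time_s

# Example: 20 nm film at 1.0 Angstrom/cycle and 2 s/cycle (both within the ranges above).
cycles, total_s = ald_budget(20.0, 1.0, 2.0)
print(f"{cycles} cycles, about {total_s / 60:.0f} minutes")
```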

Advantages and limitations


Advantages
Using ALD, film thickness depends only on the number of reaction cycles, which makes the thickness control accurate and simple. Unlike CVD, there is less need for reactant flux homogeneity, which gives large-area (large batch and easy scale-up) capability, excellent conformality and reproducibility, and simplifies the use of solid precursors. Also, the growth of different multilayer structures is straightforward. These advantages make the ALD method attractive for microelectronics for the manufacturing of future-generation integrated circuits. Other advantages of ALD are the wide range of film materials available, high density and low impurity level. Also, lower deposition temperatures can be used in order not to affect sensitive substrates.

Limitations
The major limitation of ALD is its slowness; usually only a fraction of a monolayer is deposited in one cycle. Fortunately, the films needed for future-generation integrated circuits are very thin, and thus the slowness of ALD is not such an important issue. More recently, commercial ALD tools can achieve cycle times of <5 seconds, meaning a 100nm film can be deposited in under an hour. With batch processing this can equate to a high throughput of wafers per minute. New advances in roll-to-roll ALD are allowing even faster throughput. Although the selection of film materials grown by ALD is wide, many technologically important materials (Si, Ge, Si3N4, several multi-component oxides, certain metals) cannot currently be deposited by ALD in a cost-effective way. ALD is a chemical technique and thus there is always a risk of residues being left from the precursors. The impurity content of the films depends on the completeness of the reactions. In typical oxide processes where metal halides or alkyl compounds are used together with water as precursors, impurities found in the films are at the 0.1-1 atom % level.


ALD in microelectronics
In microelectronics, ALD is studied as a potential technique to deposit high-k (high permittivity) gate oxides, high-k memory capacitor dielectrics, ferroelectrics, and metals and nitrides for electrodes and interconnects. In high-k gate oxides, where the control of ultra thin films is essential, ALD is only likely to come into wider use at the 45nm technology. In metallizations, conformal films are required; currently it is expected that ALD will be used in mainstream production at the 65nm node. In dynamic random access memories (DRAMs), the conformality requirements are even higher and ALD is the only method that can be used when feature sizes become smaller than 100nm.

Gate oxides
Deposition of the high-k oxides Al2O3, ZrO2, and HfO2 has been one of the most widely examined areas of ALD. The motivation for high-k oxides comes from the problem of high tunneling current through the commonly used SiO2 gate dielectric in metal-oxide-semiconductor field-effect transistors (MOSFETs) when it is downscaled to a thickness of 1.0nm and below. With the high-k oxide, a thicker gate dielectric can be made for the required capacitance density, thus the tunneling current can be reduced through the structure. Intel Corporation has reported using ALD to deposit high-k gate dielectric for its 45 nm CMOS technology.
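The benefit of a high-k gate oxide is often expressed through the equivalent oxide thickness, EOT = t_high-k × (κ_SiO2/κ_high-k). This relation and the HfO2 permittivity used below are standard, assumed values rather than figures from the text; the sketch shows how a physically thicker high-k layer can deliver the capacitance density of an SiO2 film around 1.0nm.

```python
K_SIO2 = 3.9  # relative permittivity of SiO2

def equivalent_oxide_thickness(physical_thickness_nm, k_highk):
    """EOT: the SiO2 thickness that would give the same capacitance density."""
    return physical_thickness_nm * K_SIO2 / k_highk

K_HFO2 = 20.0  # assumed representative permittivity for an HfO2 film
for t_nm in (2.0, 4.0, 5.0):
    print(f"{t_nm:.1f} nm HfO2 -> EOT = {equivalent_oxide_thickness(t_nm, K_HFO2):.2f} nm")
```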

DRAM capacitors
The development of dynamic random access memory (DRAM) capacitor dielectrics has been similar to that of gate dielectrics: SiO2 has been widely used in the industry thus far, but it is likely to be phased out in the near future as the scale of devices is decreased. The requirements for the downscaled DRAM capacitors are good conformality and permittivity values above 200, so the candidate materials are different from those explored for MOSFET gate dielectrics (for example Al2O3, ZrO2, and HfO2). The most extensively studied candidate has been (Ba,Sr)TiO3. ALD is a very promising method, which can satisfy the high conformality requirements of DRAM applications. A permittivity of 180 was measured for SrTiO3 and 165 for BaTiO3 when films thicker than 200nm were post-deposition annealed, but when the film thickness was decreased to 50nm, the permittivity decreased to only 100.

Transition-metal nitrides
Transition-metal nitrides, such as TiN and TaN, find potential use both as metal barriers and as gate metals. Metal barriers are used in modern Cu-based chips to avoid diffusion of Cu into the surrounding materials, such as insulators and the silicon substrate, and also to prevent Cu contamination by elements diffusing from the insulators, by surrounding every Cu interconnect with a layer of metal barrier. The metal barriers have strict demands: they should be pure, dense, conductive, conformal and thin, and have good adhesion towards metals and insulators. The requirements concerning process technique can be fulfilled by ALD. The most-studied ALD nitride is TiN, which is deposited from TiCl4 and NH3.

Metal films
Motivations for an interest in metal ALD are:
1. Cu interconnects and W plugs, or at least Cu seed layers for Cu electrodeposition and W seeds for W CVD,
2. transition-metal nitrides (e.g. TiN, TaN, WN) for Cu interconnect barriers,
3. noble metals for ferroelectric random access memory (FRAM) and DRAM capacitor electrodes,
4. high- and low-work-function metals for dual-gate MOSFETs.


Sputtering
Sputtering is a process whereby atoms are ejected from a solid target material due to bombardment of the target by energetic particles. It only happens when the kinetic energy of the incoming particles is much higher than conventional thermal energies (≫ 1 eV). This process can lead, during prolonged ion or plasma bombardment of a material, to significant erosion of materials, and can thus be harmful. On the other hand, it is commonly utilized for thin-film deposition, etching and analytical techniques (see below).

Physics of sputtering
Physical sputtering is driven by momentum exchange between the ions and atoms in the materials, due to collisions.

A commercial sputtering system at Cornell NanoScale Science and Technology Facility.

The incident ions set off collision cascades in the target. When such a cascade recoils and reaches the target surface with an energy greater than the surface binding energy, an atom can be ejected; this is known as sputtering. If the target is thin on an atomic scale, the collision cascade can reach the back side of the target and atoms can escape the surface binding energy "in transmission". The average number of atoms ejected from the target per incident ion is called the sputter yield, and it depends on the ion incidence angle, the energy of the ion, the masses of the ion and target atoms, and the surface binding energy of atoms in the target. For a crystalline target the orientation of the crystal axes with respect to the target surface is relevant. The primary particles for the sputtering process can be supplied in a number of ways, for example by a plasma, an ion source, an accelerator or by a radioactive material emitting alpha particles.

A model for describing sputtering in the cascade regime for amorphous flat targets is Thompson's analytical model. An algorithm that simulates sputtering based on a quantum-mechanical treatment, including electron stripping at high energy, is implemented in the program TRIM.

Sputtering from a linear collision cascade. The thick line illustrates the position of the surface, and the thinner lines the ballistic movement paths of the atoms from beginning until they stop in the material. The purple circle is the incoming ion. Red, blue, green and yellow circles illustrate primary, secondary, tertiary and quaternary recoils, respectively. Two of the atoms happen to move out from the sample, i.e. be sputtered.

A different mechanism of physical sputtering is heat spike sputtering. This may occur when the solid is dense enough, and the incoming ion heavy enough, that the collisions occur very close to each other. In that case the binary collision approximation is no longer valid, and the collisional process should instead be understood as a many-body process. The dense collisions induce a heat spike (also called a thermal spike), which essentially melts the crystal locally. If the molten zone is close enough to a surface, large numbers of atoms may sputter due to flow of liquid to the surface and/or microexplosions. Heat spike sputtering is most important for heavy ions (say Xe or Au or cluster ions) with energies in the keV-MeV range bombarding dense but soft metals with a low melting point (Ag, Au, Pb, etc.). Heat spike sputtering often increases nonlinearly with energy, and for small cluster ions it can lead to dramatic sputtering yields per cluster of the order of 10,000.

Physical sputtering has a well-defined minimum energy threshold, equal to or larger than the ion energy at which the maximum energy transfer of the ion to a sample atom equals the binding energy of a surface atom. This threshold typically lies somewhere in the range 10-100 eV; a rough estimate is sketched below.

Preferential sputtering can occur at the start when a multicomponent solid target is bombarded and there is no solid-state diffusion. If the energy transfer is more efficient to one of the target components, and/or it is less strongly bound to the solid, it will sputter more efficiently than the other. If in an AB alloy the component A is sputtered preferentially, the surface of the solid will, during prolonged bombardment, become enriched in the B component, thereby increasing the probability that B is sputtered, such that the composition of the sputtered material will be AB.
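The threshold statement above can be turned into a rough number using the standard binary-collision result for the maximum energy transfer, E_max = 4 m1 m2/(m1 + m2)² × E_ion, so that the bound is E_th ≥ E_sb (m1 + m2)²/(4 m1 m2). This kinematic relation and the ion/target pairs and surface binding energies below are standard illustrative assumptions, not values taken from the text.

```python
def max_energy_transfer_fraction(m_ion_amu, m_target_amu):
    """Maximum fractional energy transfer in an elastic head-on binary collision."""
    return 4.0 * m_ion_amu * m_target_amu / (m_ion_amu + m_target_amu) ** 2

def threshold_lower_bound_ev(surface_binding_ev, m_ion_amu, m_target_amu):
    """Ion energy at which the maximum possible transfer equals the surface binding
    energy; the actual sputtering threshold is equal to or larger than this."""
    return surface_binding_ev / max_energy_transfer_fraction(m_ion_amu, m_target_amu)

# Illustrative ion/target pairs with assumed surface binding energies (eV).
for label, m_ion, m_target, e_sb in (("Ar+ on Si", 40.0, 28.0, 4.7),
                                     ("He+ on W ", 4.0, 184.0, 8.8)):
    print(f"{label}: threshold lower bound ~ "
          f"{threshold_lower_bound_ev(e_sb, m_ion, m_target):.0f} eV")
```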


Electronic sputtering
The term electronic sputtering can mean either sputtering induced by energetic electrons (for example in a transmission electron microscope), or sputtering due to very high-energy or highly charged heavy ions that lose energy to the solid mostly by electronic stopping power, where the electronic excitations cause sputtering. Electronic sputtering produces high sputtering yields from insulators, as the electronic excitations that cause sputtering are not immediately quenched, as they would be in a conductor. One example of this is Jupiter's ice-covered moon Europa, where a MeV sulfur ion from Jupiter's magnetosphere can eject up to 10,000 H2O molecules.

Potential sputtering
In the case of multiply charged projectile ions a particular form of electronic sputtering can take place that has been termed potential sputtering. In these cases the potential energy stored in multiply charged ions (i.e., the energy necessary to produce an ion of this charge state from its neutral atom) is liberated when the ions recombine during impact on a solid surface (formation of hollow atoms). This sputtering process is characterized by a strong dependence of the observed sputtering yields on the charge state of the impinging ion and can already take place at ion impact energies well below the physical sputtering threshold. Potential sputtering has only been observed for certain target species and requires a minimum potential energy.

A commercial sputtering system

Etching and chemical sputtering


Removing atoms by sputtering with an inert gas is called "ion milling" or "ion etching". Sputtering can also play a role in reactive ion etching (RIE), a plasma process carried out with chemically active ions and radicals, for which the sputtering yield may be enhanced significantly compared to pure physical sputtering. Reactive ions are frequently used in Secondary Ion Mass Spectrometry (SIMS) equipment to enhance the sputter rates. The mechanisms causing the sputtering enhancement are not always well understood, but for instance the case of fluorine etching of Si has been modeled well theoretically.

Sputtering observed to occur below the threshold energy of physical sputtering is also often called chemical sputtering. The mechanisms behind such sputtering are not always well understood, and may be hard to distinguish from chemical etching. At elevated temperatures, chemical sputtering of carbon can be understood to be due to the incoming ions weakening bonds in the sample, which then desorb by thermal activation. The hydrogen-induced sputtering of carbon-based materials observed at low temperatures has been explained by H ions entering between C-C bonds and thus breaking them, a mechanism dubbed swift chemical sputtering.


Applications and phenomena


Film deposition
Sputter deposition is a method of depositing thin films by sputtering, eroding material from a "target" source onto a "substrate", e.g. a silicon wafer. Resputtering, in contrast, is the re-emission of already deposited material (e.g. SiO2) by ion bombardment during the deposition. Sputtered atoms ejected into the gas phase are not in their thermodynamic equilibrium state, and tend to deposit on all surfaces in the vacuum chamber. A substrate (such as a wafer) placed in the chamber will be coated with a thin film. Sputtering usually uses an argon plasma.

Etching
In the semiconductor industry sputtering is used to etch the target. Sputter etching is chosen in cases where a high degree of etching anisotropy is needed and selectivity is not a concern. One major drawback of this technique is wafer damage.

For analysis
Another application of sputtering is to etch away the target material. One such example occurs in Secondary Ion Mass Spectrometry (SIMS), where the target sample is sputtered at a constant rate. As the target is sputtered, the concentration and identity of sputtered atoms are measured using mass spectrometry. In this way the composition of the target material can be determined and even extremely low concentrations (20 µg/kg) of impurities can be detected. Furthermore, because the sputtering continually etches deeper into the sample, concentration profiles as a function of depth can be measured.

In space
Sputtering is one of the forms of space weathering, a process that changes the physical and chemical properties of airless bodies, such as asteroids and the Moon. It is also one of the possible ways that Mars has lost most of its atmosphere and that Mercury continually replenishes its tenuous surface-bounded exosphere.


Electrohydrodynamics
Electrohydrodynamics (EHD), also known as electro-fluid-dynamics (EFD) or electrokinetics, is the study of the dynamics of electrically charged fluids. It is the study of the motions of ionised particles or molecules and their interactions with electric fields and the surrounding fluid. The term may be considered to be synonymous with the rather elaborate electrostrictive hydrodynamics. EHD covers the following types of particle and fluid transport mechanisms: Electrophoresis, electrokinesis, dielectrophoresis, electro-osmosis, and electrorotation. In general, the phenomena relate to the direct conversion of electrical energy into kinetic energy, and vice versa. In the first instance, shaped electrostatic fields create hydrostatic pressure (or motion) in dielectric media. When such media are fluids, a flow is produced. If the dielectric is a vacuum or a solid, no flow is produced. Such flow can be directed against the electrodes, generally to move the electrodes. In such case, the moving structure acts as an electric motor. Practical fields of interest of EHD are the common air ioniser, Electrohydrodynamic thrusters and EHD cooling systems. In the second instance, the converse takes place. A powered flow of medium within a shaped electrostatic field adds energy to the system which is picked up as a potential difference by electrodes. In such case, the structure acts as an electrical generator.

Electrokinesis
Electrokinesis is the particle or fluid transport produced by an electric field acting on a fluid having a net mobile charge. (See -kinesis for explanation and further uses of the kinesis suffix.) Electrokinesis was first observed by Reuss in 1809 and has been studied extensively since the 19th century. Such study is known as electrohydrodynamics or electrokinetics, and was documented by Thomas Townsend Brown in 1921. It was later refined in scientific terms during the 1930s in conjunction with Dr. Paul Alfred Biefeld. The flow rate in such a mechanism is linear in the electric field. Electrokinesis is of considerable practical importance in microfluidics, because it offers a way to manipulate and convey fluids in microsystems using only electric fields, with no moving parts. The force acting on the fluid is given by the equation
F = I d / k
where F is the resulting force, measured in newtons, I is the current, measured in amperes, d is the distance between electrodes, measured in metres, and k is the ion mobility coefficient of the dielectric fluid, measured in m²/(V·s).
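A minimal numeric sketch of the relation F = Id/k above; the current, electrode gap and mobility below are illustrative assumptions (an ion mobility of the order of 2 × 10⁻⁴ m²/(V·s) is typical of small ions in air), not values taken from the text.

```python
def ehd_force_newtons(current_a, electrode_gap_m, ion_mobility_m2_per_vs):
    """Electrokinetic force on the fluid, F = I*d/k, per the relation above."""
    return current_a * electrode_gap_m / ion_mobility_m2_per_vs

# Illustrative values: 1 mA of ion current, 10 mm electrode gap, mobility 2e-4 m^2/(V*s).
force = ehd_force_newtons(current_a=1e-3, electrode_gap_m=0.01,
                          ion_mobility_m2_per_vs=2e-4)
print(f"F = {force:.3f} N")
```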

If the electrodes are free to move within the fluid, while keeping their distance fixed from each other, then such a force will actually propel the electrodes with respect to the fluid.

Electrokinesis has also been observed in biology, where it was found to cause physical damage to neurons by inciting movement in their membranes. It is also discussed in R. J. Elul's "Fixed charge in the cell membrane" (1967).


Water electrokinetics
In October 2003, Dr. Daniel Kwok, Dr. Larry Kostiuk and two graduate students from the University of Alberta discussed a method of hydrodynamic-to-electrical energy conversion that exploits the natural electrokinetic properties of a liquid such as ordinary tap water by pumping the fluid through tiny microchannels with a pressure difference. This technology could some day provide a practical and clean energy storage device, replacing today's batteries, for devices such as mobile phones or calculators, which would be charged up by simply pumping water to high pressure. Pressure would then be released on demand, for fluid flow to take place over the microchannels. When water travels over a surface, the ions in it "rub" against the solid, leaving the surface slightly charged. The kinetic energy of the moving ions would thus be converted to electrical energy. Although the power generated from a single channel is extremely small, millions of parallel channels can be used to increase the power output. This phenomenon is called streaming potential and was discovered in 1859.

Electrokinetic instabilities
The fluid flows in microfluidic and nanofluidic devices are often stable and strongly damped by viscous forces (with Reynolds numbers of order unity or smaller). However, heterogeneous ionic conductivity fields in the presence of applied electric fields can, under certain conditions, generate an unstable flow field owing to electrokinetic instabilities (EKI). Conductivity gradients are prevalent in on-chip electrokinetic processes such as preconcentration methods (e.g. field-amplified sample stacking and isoelectric focusing), multidimensional assays, and systems with poorly specified sample chemistry. The dynamics and periodic morphology of electrokinetic instabilities are similar to those of other systems with Rayleigh-Taylor instabilities. Electrokinetic instabilities can be leveraged for rapid mixing or can cause undesirable dispersion in sample injection, separation and stacking. These instabilities are caused by a coupling of electric fields and ionic conductivity gradients that results in an electric body force in the bulk liquid, outside the electric double layer, which can generate temporal, convective, and absolute flow instabilities. Electrokinetic flows with conductivity gradients become unstable when the electroviscous stretching and folding of conductivity interfaces grows faster than the dissipative effect of molecular diffusion. Since these flows are characterized by low velocities and small length scales, the Reynolds number is below 0.01 and the flow is laminar. The onset of instability in these flows is best described by a critical electric Rayleigh number.


Chemical bath deposition


The chemical bath deposition (CBD) method is one of the cheapest methods for depositing thin films and nanomaterials, as it does not depend on expensive equipment and is a scalable technique that can be employed for large-area batch processing or continuous deposition. In 1933, Bruckman deposited a lead(II) sulfide (PbS) thin film by chemical bath deposition (CBD), also known as the solution-growth method.

Advantages and disadvantages


The major advantage of CBD is that it requires only solution containers and substrate mounting devices. The one drawback of this method is the wastage of solution after every deposition. Among various deposition techniques, chemical bath deposition yields stable, adherent, uniform and hard films with good reproducibility by a relatively simple process. The chemical bath deposition method is one of the suitable methods for preparing highly efficient thin films in a simple manner. The growth of thin films strongly depends on growth conditions, such as duration of deposition, composition and temperature of the solution, and topographical and chemical nature of the substrate.

Reaction Mechanism
The chemical bath deposition involves two steps, nucleation and particle growth, and is based on the formation of a solid phase from a solution. In the chemical bath deposition procedure, the substrate is immersed in an aqueous solution containing the precursors.


Chemical beam epitaxy


Chemical beam epitaxy (CBE) forms an important class of deposition techniques for semiconductor layer systems, especially III-V semiconductor systems. This form of epitaxial growth is performed in an ultrahigh vacuum system. The reactants are in the form of molecular beams of reactive gases, typically as the hydride or a metalorganic. The term CBE is often used interchangeably with metal-organic molecular beam epitaxy (MOMBE). The nomenclature does differentiate between the two (slightly different) processes, however. When used in the strictest sense, CBE refers to the technique in which both components are obtained from gaseous sources, while MOMBE refers to the technique in which the group III component is obtained from a gaseous source and the group V component from a solid source.

Basic principles
Chemical beam epitaxy was first demonstrated by W.T. Tsang in 1984. The technique was then described as a hybrid of metal-organic chemical vapor deposition (MOCVD) and molecular beam epitaxy (MBE) that exploited the advantages of both techniques. In this initial work, InP and GaAs were grown using gaseous group III and V alkyls. While the group III elements were derived from the pyrolysis of the alkyls on the surface, the group V elements were obtained from the decomposition of the alkyls brought into contact with heated tantalum (Ta) or molybdenum (Mo) at 950-1200 °C. Typical pressure in the gas reactor is between 10² Torr and 1 atm for MOCVD. Here, the transport of gas occurs by viscous flow and chemicals reach the surface by diffusion. In contrast, gas pressures of less than 10⁻⁴ Torr are used in CBE. The gas transport now occurs as a molecular beam due to the much longer mean free paths, and the process evolves into a chemical beam deposition. It is also worth noting here that MBE employs atomic beams (such as aluminium (Al) and gallium (Ga)) and molecular beams (such as As4 and P4) that are evaporated at high temperatures from solid elemental sources, while the sources for CBE are in the vapor phase at room temperature. A comparison of the different processes in the growth chamber for MOCVD, MBE and CBE can be seen in the figure below.
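The shift from viscous transport in MOCVD to molecular-beam transport in CBE follows from how the mean free path scales with pressure, λ = kB·T/(√2·π·d²·p). This kinetic-theory formula and the molecular diameter below are standard assumptions rather than values from the text; the pressures are the ones quoted above.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
TORR_TO_PA = 133.322

def mean_free_path_m(pressure_torr, temperature_k=300.0, molecule_diameter_m=4e-10):
    """Kinetic-theory mean free path, lambda = k_B*T / (sqrt(2)*pi*d^2*p)."""
    p_pa = pressure_torr * TORR_TO_PA
    return K_B * temperature_k / (math.sqrt(2) * math.pi * molecule_diameter_m ** 2 * p_pa)

# Pressures quoted above: ~10^2 Torr (MOCVD, viscous flow) vs <10^-4 Torr (CBE, molecular beam).
for p_torr in (1e2, 1e-4):
    print(f"p = {p_torr:.0e} Torr  ->  mean free path ~ {mean_free_path_m(p_torr):.1e} m")
```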


Figure: Basic processes inside the growth chambers of (a) MOCVD, (b) MBE, and (c) CBE.

Experimental Setup
A combination of turbomolecular and cryogenic pumps is used in standard UHV growth chambers. The chamber itself is equipped with a liquid nitrogen cryoshield and a rotatable crystal holder capable of carrying more than one wafer. The crystal holder is usually heated from the backside to temperatures of 500 to 700 °C. Most setups also have RHEED equipment for the in-situ monitoring of surface superstructures on the growing surface and for measuring growth rates, and mass spectrometers for the analysis of the molecular species in the beams and of the residual gases. The gas inlet system, which is one of the most important components of the system, controls the material beam flux. Pressure-controlled systems are used most commonly: the material flux is controlled by the input pressure of the gas injection capillary, and the pressure inside the chamber can be measured and controlled by a capacitance manometer. The molecular beams of the gaseous source materials are formed in injectors or effusion jets that ensure a homogeneous beam profile. Some starting compounds, such as the hydrides that are the group V starting materials, have to be precracked in the injector. This is usually done by thermally decomposing them with a heated metal or filament.

Growth Kinetics
In order to better understand the growth kinetics associated with CBE, it is important to look at the physical and chemical processes associated with MBE and MOCVD as well; the figure below depicts these. The growth kinetics of the three techniques differ in many ways. In conventional gas-source MBE, the growth rate is determined by the arrival rate of the group III atomic beams. The epitaxial growth takes place as the group III atoms impinge on the heated substrate surface, migrate to the appropriate lattice sites and then deposit near excess group V dimers or tetramers. It is worth noting that no chemical reaction is involved at the surface, since the atoms are generated by thermal evaporation from solid elemental sources.


Figure: Growth kinetics involved in (a) conventional MBE, (b) MOCVD, and (c) CBE.

In MOCVD, the group III alkyls are already partially dissociated in the gas stream. They diffuse through a stagnant boundary layer that exists over the heated substrate, after which they dissociate into the atomic group III elements. These atoms then migrate to the appropriate lattice sites and deposit epitaxially by associating with a group V atom derived from the thermal decomposition of the hydrides. The growth rate here is usually limited by the diffusion rate of the group III alkyls through the boundary layer. Gas-phase reactions between the reactants have also been observed in this process.

In CBE processes, the hydrides are cracked in a high-temperature injector before they reach the substrate. The temperatures are typically 100-150 °C lower than in a comparable MOCVD or MOVPE process. There is also no boundary layer (such as the one in MOCVD), and molecular collisions are minimal due to the low pressure. The group V alkyls are usually supplied in excess, and the group III alkyl molecules impinge directly onto the heated substrate as in conventional MBE. The group III alkyl molecule has two options when this happens. The first option is to dissociate its three alkyl radicals by acquiring thermal energy from the surface, leaving behind the elemental group III atoms on the surface. The second option is to re-evaporate partially or completely undissociated. Thus, the growth rate is determined by the arrival rate of the group III alkyls at higher substrate temperatures, and by the surface pyrolysis rate at lower temperatures.


Compatibility with Device Fabrication


Selective Growth at Low Temperatures
Selective growth through dielectric masking is readily achieved using CBE as compared to its parent techniques of MBE and MOCVD. Selective growth is hard to achieve using elemental source MBE because group III atoms do not desorb readily after they are adsorbed. With chemical sources, the reactions associated with the growth rate are faster on the semiconductor surface than on the dielectric layer. No group III element can, however, arrive at the dielectric surface in CBE due to the absence of any gas phase reactions. Also, it is easier for the impinging group III metalorganic molecules to desorb in the absence of the boundary layer. This makes it easier to perform selective epitaxy using CBE and at lower temperatures, compared to MOCVD or MOVPE. In recent developments patented by ABCD Technology, substrate rotation is no longer required, leading to new possibilities such as in-situ patterning with particle beams. This possibility opens very interesting perspectives to achieve patterned thin films in a single step, in particular for materials that are difficult to etch such as oxides.

p-type Doping
It was observed that using TMGa for the CBE of GaAs led to high p-type background doping (10^20 cm^-3) due to incorporated carbon. However, it was found that using TEGa instead of TMGa led to very clean GaAs with room-temperature hole concentrations between 10^14 and 10^16 cm^-3. It has been demonstrated that the hole concentration can be adjusted between 10^14 and 10^21 cm^-3 simply by adjusting the alkyl beam pressure and the TMGa/TEGa ratio, providing a means of achieving high and controllable p-type doping of GaAs. This has been exploited for fabricating high-quality heterojunction bipolar transistors.

Advantages and Disadvantages of CBE


CBE offers many other advantages over its parent techniques of MOCVD and MBE, some of which are listed below:

Advantages over MBE


1. Easier multiwafer scale-up: in MBE, substrate rotation is required for uniformity in thickness and composition since each element has an individual effusion cell, and large effusion cells with efficient heat dissipation make multiwafer scale-up more difficult.
2. Better suited to a production environment: instant flux response due to precise electronic flow control.
3. Absence of oval defects: these oval defects generally arise from micro-droplets of Ga or In spat out by high-temperature effusion cells; they vary in size and density from system to system and over time.
4. Lower drift in effusion conditions, which do not depend on how full the effusion source is.
5. In recent developments patented by ABCD Technology, substrate rotation is no longer required.


Advantages over MOCVD


1. Easy implementation of in-situ diagnostic instruments such as RHEED.
2. Compatibility with other high-vacuum thin-film processing methods, such as metal evaporation and ion implantation.

Shortcomings of CBE
1. More pumping is required compared to MOCVD.
2. Composition control can be difficult when growing GaInAs: incorporation of In from TMIn is significantly larger than that of Ga from TEGa at temperatures around 600 °C.
3. High carbon incorporation for GaAlAs.

Deposition (aerosol physics)


In aerosol physics, deposition is the process by which aerosol particles collect or deposit themselves on solid surfaces, decreasing the concentration of the particles in the air. It can be divided into two sub-processes: dry and wet deposition. The rate of deposition, or the deposition velocity, is slowest for particles of intermediate size, because the deposition mechanisms are most effective for either very small or very large particles. Very large particles settle out quickly through sedimentation (settling) or impaction, while Brownian diffusion has the greatest influence on small particles. Very small particles coagulate within a few hours until they reach a diameter of about 0.3 micrometres; at this size they no longer coagulate appreciably. This has a great influence on the amount of PM2.5 present in the air.

Deposition velocity is defined from F = vc, where F is the flux density, v the deposition velocity and c the concentration. In gravitational deposition, this velocity is the settling velocity due to gravity-induced drag.

Whether or not a given particle will impact a given obstacle is often studied, and can be predicted with the Stokes number Stk = S/d, where S is the stopping distance (which depends on particle size, velocity and drag forces) and d is the characteristic size (often the diameter of the obstacle). If Stk is less than 1, the particle will not collide with the obstacle; if Stk is greater than 1, it will.

Deposition due to Brownian motion obeys both Fick's first and second laws. The resulting deposition flux is J = n (D / (π t))^(1/2), where J is the deposition flux, n the initial number density, D the diffusion constant and t time. This can be integrated to determine the concentration at each moment of time.
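As a quick illustration of the impaction criterion and the Brownian deposition flux above, here is a small Python sketch; the stopping distance, obstacle size, number density and diffusion constant used are arbitrary example values, not data for any particular aerosol.

```python
import math

def stokes_number(stopping_distance_m, obstacle_diameter_m):
    """Stk = S / d: ratio of particle stopping distance to obstacle size."""
    return stopping_distance_m / obstacle_diameter_m

def will_impact(stk):
    """Simple criterion from the text: Stk > 1 -> the particle impacts the obstacle."""
    return stk > 1.0

def brownian_deposition_flux(number_density_m3, diffusion_constant_m2s, time_s):
    """J = n * sqrt(D / (pi * t)) for deposition driven by Brownian motion."""
    return number_density_m3 * math.sqrt(diffusion_constant_m2s / (math.pi * time_s))

# Example values (illustrative only)
stk = stokes_number(stopping_distance_m=2e-4, obstacle_diameter_m=1e-4)
print(f"Stk = {stk:.2f}, impacts obstacle: {will_impact(stk)}")

J = brownian_deposition_flux(number_density_m3=1e10, diffusion_constant_m2s=2.7e-10, time_s=60.0)
print(f"Brownian deposition flux ~ {J:.3e} particles per m^2 per s")
```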


Dry deposition
Dry deposition is caused by:
Gravitational sedimentation: the settling of particles due to gravity.
Interception: small particles follow the streamlines of the flow, but if they pass too close to an obstacle (e.g. a branch of a tree), they may collide with it.
Impaction: small particles approaching a larger obstacle are unable to follow the curved streamlines of the flow because of their inertia, so they hit or impact the obstacle. The larger the mass of a particle, the greater its displacement from the flow streamline.
Diffusion (Brownian motion): aerosol particles move randomly because of collisions with gas molecules; such collisions may lead to further collisions with obstacles or surfaces, and there is a net flux towards lower concentrations.

Figure 1 Impaction

Figure 2 Diffusion

Turbulence: turbulent eddies in the air transfer particles, which can then collide; again, there is a net flux towards lower concentrations.
Other processes, such as thermophoresis, turbophoresis, diffusiophoresis and electrophoresis.

Wet deposition
In wet deposition, atmospheric hydrometeors (rain drops, snow, etc.) scavenge aerosol particles, meaning that wet deposition is gravitational, Brownian and/or turbulent coagulation with water droplets. Different types of wet deposition include:
Below-cloud scavenging: falling rain droplets or snow particles collide with aerosol particles through Brownian diffusion, interception, impaction and turbulent diffusion.
In-cloud scavenging: aerosol particles get into cloud droplets or cloud ice crystals by acting as cloud nuclei, or by being captured through collision, and can be brought to the ground surface when rain or snow forms in clouds.
In aerosol computer models, aerosols and cloud droplets are mostly treated separately, so that nucleation represents a loss process that has to be parametrised.


Aerosol
An aerosol is a colloid of fine solid particles or liquid droplets in air or another gas. Examples of aerosols include clouds, haze, and air pollution such as smog and smoke. The liquid or solid particles mostly have diameters smaller than about 1 µm; larger particles with a significant settling speed make the mixture a suspension, but the distinction is not clear-cut. In general conversation, aerosol usually refers to an aerosol spray that delivers a consumer product from a can or similar container. Other technological applications of aerosols include dispersal of pesticides, medical treatment of respiratory illnesses, and combustion technology. Aerosol science covers the generation and removal of aerosols, technological applications of aerosols, the effects of aerosols on the environment and people, and a wide variety of other topics.

Mist and clouds are aerosols.

Definitions

Because dust particles mostly settle to the ground, this visible dust is a suspension, not an aerosol. Very fine dust, common in the Sahara Desert, however, can constitute an aerosol as it travels on the winds for weeks.

An aerosol is defined as a colloidal system of solid or liquid particles in a gas. An aerosol includes both the particles and the suspending gas, which is usually air. Frederick G. Donnan presumably first used the term aerosol during World War I to describe an aero-solution, clouds of microscopic particles in air. This term developed analogously to the term hydrosol, a colloid system with water as the dispersing medium. Primary aerosols contain particles introduced directly into the gas; secondary aerosols form through gas-to-particle conversion. There are several measures of aerosol concentration. Environmental science and health often use the mass concentration (M), defined as the mass of particulate matter per unit volume, with units such as µg/m³. Also commonly used is the number concentration (N), the number of particles per unit volume, with units such as number/m³ or number/cm³.
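To make the two concentration measures concrete, the following Python sketch converts a number concentration to a mass concentration for idealised spherical particles of a single size; the diameter, density and number concentration are illustrative values only, not data for any real aerosol.

```python
import math

def mass_concentration_ug_m3(number_per_cm3, diameter_um, density_kg_m3):
    """Convert a number concentration N to a mass concentration M for
    monodisperse spheres: M = N * (pi/6) * d^3 * rho."""
    d_m = diameter_um * 1e-6                       # diameter in metres
    particle_mass_kg = density_kg_m3 * math.pi / 6.0 * d_m**3
    number_per_m3 = number_per_cm3 * 1e6           # 1 m^3 = 1e6 cm^3
    return number_per_m3 * particle_mass_kg * 1e9  # kg/m^3 -> µg/m^3

# Example: 1000 particles/cm^3 of 0.5 µm particles with density 1500 kg/m^3
print(f"M ~ {mass_concentration_ug_m3(1000, 0.5, 1500):.2f} µg/m³")
```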
Photomicrograph made with a Scanning Electron Microscope (SEM): Fly ash particles at 2,000x magnification. Most of the particles in this aerosol are nearly spherical.

The size of particles has a major influence on their properties, and the aerosol particle radius or diameter (dp) is a key property used to characterise aerosols. Aerosols vary in their dispersity. A monodisperse aerosol, producible in the laboratory, contains particles of uniform size. Most aerosols, however, as polydisperse colloidal systems, exhibit a range of particle sizes. Liquid droplets are almost always nearly spherical, but scientists use an equivalent diameter to characterize the properties of various shapes of solid particles, some very irregular. The equivalent diameter is the diameter of a spherical particle with the same value of some physical property as the irregular particle. The equivalent volume diameter (de) is defined as the diameter of a sphere of the same volume as that of the irregular particle. Also commonly used is the aerodynamic diameter.


Size distribution
For a monodisperse aerosol, a single number, the particle diameter, suffices to describe the size of the particles. However, more complicated particle-size distributions describe the sizes of the particles in a polydisperse aerosol. This distribution defines the relative amounts of particles, sorted according to size. One approach to defining the particle size distribution uses a list of the sizes of every particle in a sample. However, this approach proves tedious to ascertain in aerosols with millions of particles and awkward to use. Another approach splits the complete size range into intervals and finds the number (or proportion) of particles in each interval. One then can visualize these data in a histogram with the area of each bar representing the proportion of particles in that size bin, usually normalised by dividing the number of particles in a bin by the width of the interval so that the area of each bar is proportionate to the number of particles in the size range that it represents. If the width of the bins tends to zero, one gets the frequency function f(dp), defined by

df = f(dp) d(dp)

where dp is the diameter of the particles, df is the fraction of particles having diameters between dp and dp + d(dp), and f(dp) is the frequency function.

The same hypothetical log-normal aerosol distribution plotted, from top to bottom, as a number vs diameter distribution, a surface area vs diameter distribution, and a volume vs diameter distribution. Typical mode names are shown at the top. Each distribution is normalised so that the total area is 1000.

Therefore, the area under the frequency curve between two sizes a and b represents the total fraction of the particles in that size range:

f_ab = ∫_a^b f(dp) d(dp)

It can also be formulated in terms of the total number density N:

N_ab = N ∫_a^b f(dp) d(dp)

Assuming spherical aerosol particles, the aerosol surface area per unit volume (S) is given by the second moment:

S = π N ∫_0^∞ dp^2 f(dp) d(dp)

And the third moment gives the total volume concentration (V) of the particles:

V = (π/6) N ∫_0^∞ dp^3 f(dp) d(dp)


One also usefully can approximate the particle size distribution using a mathematical function. The normal distribution usually does not suitably describe particle size distributions in aerosols because of the skewness associated with a long tail of larger particles. Also, for a quantity that varies over a large range, as many aerosol sizes do, the width of the distribution implies negative particle sizes, which is clearly not physically realistic. However, the normal distribution can be suitable for some aerosols, such as test aerosols, certain pollen grains and spores. A more widely chosen log-normal distribution gives the number frequency as:

where σ is the standard deviation of the size distribution and d̄p is the arithmetic mean diameter. The log-normal distribution has no negative values, can cover a wide range of values, and fits many observed size distributions reasonably well. Other distributions sometimes used to characterise particle size include: the Rosin-Rammler distribution, applied to coarsely dispersed dusts and sprays; the Nukiyama-Tanasawa distribution, for sprays of extremely broad size ranges; the power function distribution, occasionally applied to atmospheric aerosols; the exponential distribution, applied to powdered materials; and, for cloud droplets, the Khrgian-Mazin distribution.
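The following Python sketch draws particle diameters from a log-normal distribution and estimates the surface-area and volume concentrations (the second and third moments discussed above) by simple Monte Carlo averaging. The geometric mean diameter, geometric standard deviation and number concentration are arbitrary example values, and numpy is assumed to be available.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Illustrative log-normal size distribution parameters
count_median_diameter_um = 0.2   # hypothetical geometric mean (median) diameter
geometric_std_dev = 1.8          # hypothetical geometric standard deviation
number_conc_per_cm3 = 1.0e4      # hypothetical number concentration

# Sample diameters (µm); numpy's lognormal takes the mean and sigma of ln(d)
diameters_um = rng.lognormal(mean=np.log(count_median_diameter_um),
                             sigma=np.log(geometric_std_dev),
                             size=100_000)

# Per-particle surface area (pi d^2) and volume (pi d^3 / 6), averaged over the sample
mean_surface_um2 = np.mean(np.pi * diameters_um**2)
mean_volume_um3 = np.mean(np.pi * diameters_um**3 / 6.0)

print(f"Surface concentration ~ {number_conc_per_cm3 * mean_surface_um2:.3e} µm²/cm³")
print(f"Volume concentration  ~ {number_conc_per_cm3 * mean_volume_um3:.3e} µm³/cm³")
```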

Physics
Terminal velocity of a particle in a fluid
For low values of the Reynolds number (< 1), true for most aerosol motion, Stokes' law describes the force of resistance on a solid spherical particle in a fluid. However, Stokes' law is only valid when the velocity of the gas at the surface of the particle is zero. For the small particles (< 1 µm) that characterize aerosols, however, this assumption fails. To account for this failure, one can introduce the Cunningham correction factor Cc, which is always greater than 1. Including this factor, one finds the relation between the resisting force on a particle and its velocity:

F_D = 3 π η V d / Cc

where F_D is the resisting force on a spherical particle, η is the viscosity of the gas, V is the particle velocity, d is the particle diameter and Cc is the Cunningham correction factor. This allows us to calculate the terminal velocity of a particle undergoing gravitational settling in still air. Neglecting buoyancy effects, we find:

V_TS = ρ_p d^2 g Cc / (18 η)

where V_TS is the terminal settling velocity of the particle, ρ_p is the particle density and g is the acceleration due to gravity. The terminal velocity can also be derived for other kinds of forces. If Stokes' law holds, then the resistance to motion is directly proportional to speed. The constant of proportionality is the mechanical mobility (B) of a particle:

B = V / F_D = Cc / (3 π η d)
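As a check on the formulas above, here is a minimal Python sketch computing the slip-corrected terminal settling velocity of a small sphere in air; the Cunningham correction uses commonly quoted empirical (Davies-type) coefficients, and the air properties are approximate room-temperature values, all of which should be treated as illustrative assumptions.

```python
import math

# Approximate properties of air near room temperature (illustrative values)
AIR_VISCOSITY = 1.81e-5      # Pa·s
MEAN_FREE_PATH = 66e-9       # m
G = 9.81                     # m/s^2

def cunningham_correction(d_m, mfp=MEAN_FREE_PATH):
    """Slip correction Cc = 1 + Kn*(1.257 + 0.4*exp(-1.1/Kn)), with Kn = 2*lambda/d.
    The numerical coefficients are commonly used empirical values."""
    kn = 2.0 * mfp / d_m
    return 1.0 + kn * (1.257 + 0.4 * math.exp(-1.1 / kn))

def terminal_settling_velocity(d_m, particle_density=1000.0, viscosity=AIR_VISCOSITY):
    """V_TS = rho_p * d^2 * g * Cc / (18 * eta), neglecting buoyancy."""
    cc = cunningham_correction(d_m)
    return particle_density * d_m**2 * G * cc / (18.0 * viscosity)

for d_um in (0.1, 1.0, 10.0):
    v = terminal_settling_velocity(d_um * 1e-6)
    print(f"d = {d_um:5.1f} µm -> V_TS ~ {v:.3e} m/s")
```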


A particle travelling at any reasonable initial velocity approaches its terminal velocity exponentially, with an e-folding time equal to the relaxation time τ:

V(t) = V_f + (V_0 − V_f) e^(−t/τ)

where V(t) is the particle speed at time t, V_f is the final (terminal) particle speed and V_0 is the initial particle speed. To account for the effect of the shape of non-spherical particles, a correction factor known as the dynamic shape factor χ is applied to Stokes' law. It is defined as the ratio of the resistive force on the irregular particle to that on a spherical particle with the same volume and velocity:

χ = F_D / (3 π η V d_e)

where d_e is the equivalent volume diameter.
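The exponential approach to terminal velocity can be illustrated with a few lines of Python; the relaxation time and velocities below are arbitrary illustrative numbers rather than values for any specific particle.

```python
import math

def speed_at_time(t_s, v_initial, v_terminal, relaxation_time_s):
    """V(t) = V_f + (V_0 - V_f) * exp(-t / tau): exponential relaxation
    of a particle's speed towards its terminal value."""
    return v_terminal + (v_initial - v_terminal) * math.exp(-t_s / relaxation_time_s)

tau = 3.6e-6  # s, illustrative relaxation time of order that of a ~1 µm particle
for multiple in (0, 1, 3, 5):
    t = multiple * tau
    v = speed_at_time(t, v_initial=0.0, v_terminal=3.5e-5, relaxation_time_s=tau)
    print(f"t = {multiple} tau -> V ~ {v:.3e} m/s")
```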

Aerodynamic diameter
The aerodynamic diameter of an irregular particle is defined as the diameter of the spherical particle with a density of 1000 kg/m³ and the same settling velocity as the irregular particle. Neglecting the slip correction, the particle settles at a terminal velocity proportional to the square of the aerodynamic diameter, da:

V_TS = ρ_0 da^2 g / (18 η)

where ρ_0 is the standard particle density (1000 kg/m³). This equation gives the aerodynamic diameter:

da = d_e (ρ_p / (χ ρ_0))^(1/2)

where d_e is the equivalent volume diameter, ρ_p is the particle density and χ is the dynamic shape factor.

One can apply the aerodynamic diameter to particulate pollutants or to inhaled drugs to predict where in the respiratory tract such particles deposit. Pharmaceutical companies typically use aerodynamic diameter, not geometric diameter, to characterize particles in inhalable drugs.
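A short Python sketch of the conversion above, from equivalent volume diameter to aerodynamic diameter with the slip correction neglected; the particle density and shape factor are illustrative example values.

```python
import math

def aerodynamic_diameter(d_e_um, particle_density_kg_m3, shape_factor=1.0,
                         standard_density_kg_m3=1000.0):
    """d_a = d_e * sqrt(rho_p / (chi * rho_0)): diameter of the unit-density
    sphere that settles at the same terminal velocity (slip correction neglected)."""
    return d_e_um * math.sqrt(particle_density_kg_m3
                              / (shape_factor * standard_density_kg_m3))

# Example: a 2 µm equivalent-volume-diameter mineral particle (rho ~ 2600 kg/m^3)
# with a modest shape factor of 1.2 (both illustrative values)
print(f"d_a ~ {aerodynamic_diameter(2.0, 2600.0, shape_factor=1.2):.2f} µm")
```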

Dynamics
The previous discussion focussed on single aerosol particles. In contrast, aerosol dynamics explains the evolution of complete aerosol populations. The concentrations of particles will change over time as a result of many processes. External processes that move particles outside a volume of gas under study include diffusion, gravitational settling, and electric charges and other external forces that cause particle migration. A second set of processes internal to a given volume of gas includes particle formation (nucleation), evaporation, chemical reaction, and coagulation. A differential equation called the aerosol General Dynamic Equation (GDE) characterizes the evolution of the number density of particles in an aerosol due to these processes.


Schematically, the GDE states:

change in time = convective transport + Brownian diffusion + gas-particle interactions + coagulation + migration by external forces

where n_i is the number density of particles of size category i, q is the particle velocity, D_p is the particle Stokes-Einstein diffusivity and F is the particle velocity associated with an external force.

Coagulation
As particles and droplets in an aerosol collide with one another, they may undergo coalescence or aggregation. This process leads to a change in the aerosol particle-size distribution, with the mode increasing in diameter as the total number of particles decreases. On occasion, particles may shatter apart into numerous smaller particles; however, this process usually occurs primarily in particles too large for consideration as aerosols.

Dynamics regimes
The Knudsen number of the particle defines three different dynamical regimes that govern the behaviour of an aerosol:

Kn = 2λ / dp

where λ is the mean free path of the suspending gas and dp is the diameter of the particle. Particles in the free molecular regime have Kn >> 1; they are small compared to the mean free path of the suspending gas. In this regime, particles interact with the suspending gas through a series of "ballistic" collisions with gas molecules. As such, they behave similarly to gas molecules, tending to follow streamlines and diffusing rapidly through Brownian motion. The mass flux equation in the free molecular regime is written in terms of the following quantities: a is the particle radius, P∞ and PA are the pressures far from the droplet and at the surface of the droplet respectively, kB is the Boltzmann constant, T is the temperature, c̄A is the mean thermal velocity and α is the mass accommodation coefficient. The derivation of this equation assumes constant pressure and constant diffusion coefficient.

Particles are in the continuum regime when Kn << 1. In this regime, the particles are large compared to the mean free path of the suspending gas, meaning that the suspending gas acts as a continuous fluid flowing round the particle. The molecular flux in this regime is written in terms of a, the radius of particle A; MA, the molecular mass of particle A; DAB, the diffusion coefficient between particles A and B; R, the ideal gas constant; T, the temperature (in absolute units such as kelvin); and PA and PAS, the pressures at infinity and at the surface respectively.

The transition regime contains all the particles in between the free molecular and continuum regimes, i.e. Kn ≈ 1. The forces experienced by a particle are a complex combination of interactions with individual gas molecules and macroscopic interactions. The semi-empirical equation describing the mass flux in this regime is the Fuchs-Sutugin interpolation formula,

expressed in terms of Icont, the mass flux in the continuum regime. These equations do not take into account the heat release effect.

Partitioning
Aerosol partitioning theory governs condensation on and evaporation from an aerosol surface. Condensation of mass causes the mode of the particle-size distribution of the aerosol to increase; conversely, evaporation causes the mode to decrease. Nucleation is the process of forming aerosol mass from the condensation of a gaseous precursor, specifically a vapour. Net condensation of the vapour requires supersaturation, a partial pressure greater than its vapour pressure. This can happen for three reasons:
1. Lowering the temperature of the system lowers the vapour pressure.
2. Chemical reactions may increase the partial pressure of a gas or lower its vapour pressure.
3. The addition of additional vapour to the system may lower the equilibrium vapour pressure according to Raoult's law.


Condensation and evaporation

There are two types of nucleation processes. Gases preferentially condense onto the surfaces of pre-existing aerosol particles, a process known as heterogeneous nucleation. This process causes the diameter at the mode of the particle-size distribution to increase at constant number concentration. With sufficiently high supersaturation and no suitable surfaces, particles may condense in the absence of a pre-existing surface, known as homogeneous nucleation. This results in the addition of very small, rapidly growing particles to the particle-size distribution.

Activation
Water coats particles in an aerosol, making them activated, usually in the context of forming a cloud droplet. According to the Kelvin equation (based on the curvature of liquid droplets), smaller particles need a higher ambient relative humidity to maintain equilibrium than larger particles do. The following formula gives the relative humidity at equilibrium:

RH = (p_s / p_0) × 100% = S × 100%

where p_s is the saturation vapour pressure above a particle at equilibrium (around a curved liquid droplet), p_0 is the saturation vapour pressure above a flat surface of the same liquid and S is the saturation ratio. The Kelvin equation for the saturation vapour pressure above a curved surface is:

ln(p_s / p_0) = 2 σ M / (R T ρ r_p)

where r_p is the droplet radius, σ is the surface tension of the droplet, ρ is the density of the liquid, M is the molar mass, T is the temperature, and R is the molar gas constant.
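A short Python sketch of the Kelvin equation above, evaluating the equilibrium saturation ratio over small water droplets; the water properties (surface tension, density, molar mass) are approximate room-temperature values used purely for illustration.

```python
import math

R_GAS = 8.314  # J/(mol·K), molar gas constant

def kelvin_saturation_ratio(radius_m, temperature_K=293.0,
                            surface_tension=0.072,   # N/m, roughly water at 20 °C
                            molar_mass=0.018,        # kg/mol, water
                            density=1000.0):         # kg/m^3, water
    """S = p_s / p_0 = exp(2*sigma*M / (R*T*rho*r_p)) for a curved droplet surface."""
    return math.exp(2.0 * surface_tension * molar_mass
                    / (R_GAS * temperature_K * density * radius_m))

for r_nm in (10, 50, 100, 1000):
    s = kelvin_saturation_ratio(r_nm * 1e-9)
    print(f"r = {r_nm:5d} nm -> equilibrium RH ~ {100.0 * s:.2f}%")
```

The output illustrates the point in the text: the smaller the droplet, the higher the relative humidity required for it to be in equilibrium.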

Solution to the General Dynamic Equation
There are no general solutions to the general dynamic equation (GDE); common methods used to solve it include the moment method, the modal/sectional method, and the quadrature method of moments.


Generation and Applications


People generate aerosols for various purposes, including: as test aerosols for calibrating instruments, performing research, and testing sampling equipment and air filters; to deliver deodorants, paints, and other consumer products in sprays; for dispersal and agricultural application of pesticides; for medical treatment of respiratory disease; and in fuel injection systems and other combustion technology. Some devices for generating aerosols are the aerosol spray can, the atomizer nozzle or nebulizer, the electrospray, and the vibrating orifice aerosol generator (VOAG).

Detection
Aerosols can be measured either in situ or with remote sensing techniques.

In situ observations
Some available in situ measurement techniques include:
Aerosol Mass Spectrometer (AMS)
Differential Mobility Analyzer (DMA)
Aerodynamic Particle Sizer (APS)
Wide Range Particle Spectrometer (WPS)
Micro-Orifice Uniform Deposit Impactor (MOUDI)
Condensation Particle Counter (CPC)
Epiphaniometer

Remote sensing approach


Remote sensing approaches include the sun photometer and LIDAR.

Size selective sampling


Particles can deposit in the nose, mouth, pharynx and larynx (the head airways region), deeper within the respiratory tract (from the trachea to the terminal bronchioles), or in the alveolar region. The location of deposition of aerosol particles within the respiratory system strongly determines the health effects of exposure to such aerosols. This phenomenon led people to invent aerosol samplers that select a subset of the aerosol particles that reach certain

parts of the respiratory system. Examples of these subsets of the particle-size distribution of an aerosol, important in occupational health, include the inhalable, thoracic, and respirable fractions. The fraction that can enter each part of the respiratory system depends on the deposition of particles in the upper parts of the airway. The inhalable fraction of particles, defined as the proportion of particles originally in the air that can enter the nose or mouth, depends on external wind speed and direction and on the particle-size distribution by aerodynamic diameter. The thoracic fraction is the proportion of the particles in ambient aerosol that can reach the thorax or chest region. The respirable fraction is the proportion of particles in the air that can reach the alveolar region. To measure the respirable fraction of particles in air, a pre-collector is used with a sampling filter. The pre-collector excludes particles as the airways remove particles from inhaled air. The sampling filter collects the particles for measurement. It is common to use cyclonic separation for the pre-collector, but other techniques include impactors, horizontal elutriators, and large-pore membrane filters. Two alternative size-selective criteria, often used in atmospheric monitoring, are PM10 and PM2.5. PM10 is defined by ISO as particles which pass through a size-selective inlet with a 50% efficiency cut-off at 10 µm aerodynamic diameter, and corresponds to the thoracic convention as defined in ISO 7708:1995, Clause 6. PM2.5 is defined as particles which pass through a size-selective inlet with a 50% efficiency cut-off at 2.5 µm aerodynamic diameter, and corresponds to the high-risk respirable convention as defined in ISO 7708:1995, 7.1. The United States Environmental Protection Agency replaced the older standards for particulate matter based on Total Suspended Particulate with a standard based on PM10 in 1987, and then introduced standards for PM2.5 (also known as fine particulate matter) in 1997.


Atmospheric
The Earth's atmosphere contains aerosols of various types and concentrations, including quantities of:
natural inorganic materials: fine dust, sea salt, water droplets;
natural organic materials: smoke, pollen, spores, bacteria;
anthropogenic products of combustion, such as smoke, ashes or dusts.
Aerosols can be found in urban ecosystems in various forms, for example dust, cigarette smoke, mist from aerosol spray cans, and soot or fumes in car exhaust. The aerosols present in the Earth's atmosphere have many impacts, including on climate and human health.
Aerosol pollution over Northern India and Bangladesh


X-ray crystallography
X-ray crystallography is a method used for determining the atomic and molecular structure of a crystal, in which the crystalline atoms cause a beam of X-rays to diffract into many specific directions. By measuring the angles and intensities of these diffracted beams, a crystallographer can produce a three-dimensional picture of the density of electrons within the crystal. From this electron density, the mean positions of the atoms in the crystal can be determined, as well as their chemical bonds, their disorder and various other information. Since many materials can form crystals (such as salts, metals, minerals, semiconductors, as well as various inorganic, organic and biological molecules), X-ray crystallography has been fundamental in the development of many scientific fields. In its first decades of use, this method determined the size of atoms, the lengths and types of chemical bonds, and the atomic-scale differences among various materials, especially minerals and alloys. The method also revealed the structure and function of many biological molecules, including vitamins, drugs, proteins and nucleic acids such as DNA. X-ray crystallography is still the chief method for characterizing the atomic structure of new materials and in discerning materials that appear similar by other experiments. X-ray crystal structures can also account for unusual electronic or elastic properties of a material, shed light on chemical interactions and processes, or serve as the basis for designing pharmaceuticals against diseases. In an X-ray diffraction measurement, a crystal is mounted on a goniometer and gradually rotated while being bombarded with X-rays, producing a diffraction pattern of regularly spaced spots known as reflections. The two-dimensional images taken at different rotations are converted into a three-dimensional model of the density of electrons within the crystal using the mathematical method of Fourier transforms, combined with chemical data known for the sample. Poor resolution (fuzziness) or even errors may result if the crystals are too small, or not uniform enough in their internal makeup. X-ray crystallography is related to several other methods for determining atomic structures. Similar diffraction patterns can be produced by scattering electrons or neutrons, which are likewise interpreted as a Fourier transform. If single crystals of sufficient size cannot be obtained, various other X-ray methods can be applied to obtain less detailed information; such methods include fiber diffraction, powder diffraction and small-angle X-ray scattering (SAXS). If the material under investigation is only available in the form of nanocrystalline powders or suffers from poor crystallinity, the methods of electron crystallography can be applied for determining the atomic structure. For all of the above-mentioned X-ray diffraction methods, the scattering is elastic; the scattered X-rays have the same wavelength as the incoming X-rays. By contrast, inelastic X-ray scattering methods are useful in studying excitations of the sample, rather than the distribution of its atoms.

X-ray crystallography can locate every atom in a zeolite, an aluminosilicate.


History
Early scientific history of crystals and X-rays
Crystals have long been admired for their regularity and symmetry, but they were not investigated scientifically until the 17th century. Johannes Kepler hypothesized in his work Strena seu de Nive Sexangula (1611) that the hexagonal symmetry of snowflake crystals was due to a regular packing of spherical water particles.

Drawing of square (Figure A, above) and hexagonal (Figure B, below) packing from Kepler's work, Strena seu de Nive Sexangula.

As shown by X-ray crystallography, the hexagonal symmetry of snowflakes results from the tetrahedral arrangement of hydrogen bonds about each water molecule. The water molecules are arranged similarly to the silicon atoms in the tridymite polymorph of SiO2. The resulting crystal structure has hexagonal symmetry when viewed along a principal axis.

Crystal symmetry was first investigated experimentally by Danish scientist Nicolas Steno (1669), who showed that the angles between the faces are the same in every exemplar of a particular type of crystal, and by René Just Haüy (1784), who discovered that every face of a crystal can be described by simple stacking patterns of blocks of the same shape and size. Hence, William Hallowes Miller in 1839 was able to give each face a unique label of three small integers, the Miller indices which are still used today for identifying crystal faces. Haüy's study led to the correct idea that crystals are a regular three-dimensional array (a Bravais lattice) of atoms and molecules; a single unit cell is repeated indefinitely along three principal directions that are not necessarily perpendicular. In the 19th century, a complete catalog of the possible symmetries of a crystal was worked out by Johan Hessel, Auguste Bravais, Evgraf Fedorov, Arthur Schönflies and (belatedly) William Barlow. From the available data and physical reasoning, Barlow proposed several crystal structures in the 1880s that were validated later by X-ray crystallography; however, the available data were too scarce in the 1880s to accept his models as conclusive.


X-rays were discovered by Wilhelm Conrad Röntgen in 1895, just as the studies of crystal symmetry were being concluded. Physicists were initially uncertain of the nature of X-rays, although it was soon suspected (correctly) that they were waves of electromagnetic radiation, in other words, another form of light. At that time, the wave model of light, specifically the Maxwell theory of electromagnetic radiation, was well accepted among scientists, and experiments by Charles Glover Barkla showed that X-rays exhibited phenomena associated with electromagnetic waves, including transverse polarization and spectral lines akin to those observed in the visible wavelengths. Single-slit experiments in the laboratory of Arnold Sommerfeld suggested that the wavelength of X-rays was about 1 angstrom. However, X-rays are composed of photons, and thus are not only waves of electromagnetic radiation but also exhibit particle-like properties. The photon concept was introduced by Albert Einstein in 1905, but it was not broadly accepted until 1922, when Arthur Compton confirmed it by the scattering of X-rays from electrons. Therefore, these particle-like properties of X-rays, such as their ionization of gases, caused William Henry Bragg to argue in 1907 that X-rays were not electromagnetic radiation. Nevertheless, Bragg's view was not broadly accepted and the observation of X-ray diffraction in 1912 confirmed for most scientists that X-rays were a form of electromagnetic radiation.

X-ray crystallography shows the arrangement of water molecules in ice, revealing the hydrogen bonds (1) that hold the solid together. Few other methods can determine the structure of matter with such precision (resolution).

X-ray analysis of crystals


Crystals are regular arrays of atoms, and X-rays can be considered waves of electromagnetic radiation. Atoms scatter X-ray waves, primarily through the atoms' electrons. Just as an ocean wave striking a lighthouse produces secondary circular waves emanating from the lighthouse, so an X-ray striking an electron produces secondary spherical waves emanating from the electron. This phenomenon is known as elastic scattering, and the electron (or lighthouse) is known as the scatterer. A regular array of scatterers produces a regular array of spherical waves. Although these waves cancel one another out in most directions through destructive interference, they add constructively in a few specific directions, determined by Bragg's law:

2d sin θ = n λ

Here d is the spacing between diffracting planes, θ is the incident angle, n is any integer, and λ is the wavelength of the beam. These specific directions appear as spots on the diffraction pattern called reflections. Thus, X-ray diffraction results from an electromagnetic wave (the X-ray) impinging on a regular array of scatterers (the repeating arrangement of atoms within the crystal). X-rays are used to produce the diffraction pattern because their wavelength is typically the same order of magnitude (1-100 angstroms) as the spacing d between planes in the crystal. In principle, any wave impinging on a regular array of scatterers produces diffraction, as predicted first by Francesco Maria Grimaldi in 1665. To produce significant diffraction, the spacing between the scatterers and the wavelength of the impinging wave should be similar in size. For illustration, the diffraction of sunlight through a bird's feather was first reported by James Gregory in the later 17th century. The first artificial diffraction gratings for visible light were constructed by David Rittenhouse in 1787, and Joseph von Fraunhofer in 1821. However, visible light has too long a wavelength (typically, 5500 angstroms) to observe diffraction from crystals. Prior to the first X-ray diffraction experiments, the spacings between lattice planes in a crystal were not known with certainty.

The incoming beam (coming from upper left) causes each scatterer to re-radiate a small portion of its intensity as a spherical wave. If scatterers are arranged symmetrically with a separation d, these spherical waves will be in sync (add constructively) only in directions where their path-length difference 2d sin θ equals an integer multiple of the wavelength λ. In that case, part of the incoming beam is deflected by an angle 2θ, producing a reflection spot in the diffraction pattern.

The idea that crystals could be used as a diffraction grating for X-rays arose in 1912 in a conversation between Paul Peter Ewald and Max von Laue in the English Garden in Munich. Ewald had proposed a resonator model of crystals for his thesis, but this model could not be validated using visible light, since the wavelength was much larger than the spacing between the resonators. Von Laue realized that electromagnetic radiation of a shorter wavelength was needed to observe such small spacings, and suggested that X-rays might have a wavelength comparable to the unit-cell spacing in crystals. Von Laue worked with two technicians, Walter Friedrich and his assistant Paul Knipping, to shine a beam of X-rays through a copper sulfate crystal and record its diffraction on a photographic plate. After being developed, the plate showed a large number of well-defined spots arranged in a pattern of intersecting circles around the spot produced by the central beam. Von Laue developed a law that connects the scattering angles and the size and orientation of the unit-cell spacings in the crystal, for which he was awarded the Nobel Prize in Physics in 1914. As described in the mathematical derivation below, the X-ray scattering is determined by the density of electrons within the crystal. Since the energy of an X-ray is much greater than that of a valence electron, the scattering may be modeled as Thomson scattering, the interaction of an electromagnetic ray with a free electron. This model is generally adopted to describe the polarization of the scattered radiation. The intensity of Thomson scattering declines as 1/m² with the mass m of the charged particle that is scattering the radiation; hence, the atomic nuclei, which are thousands of times heavier than an electron, contribute negligibly to the scattered X-rays.
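To make Bragg's law concrete, the following Python sketch computes the scattering angles 2θ for a given plane spacing and X-ray wavelength; the copper K-alpha wavelength and the d-spacing used are typical illustrative values.

```python
import math

def bragg_angles_deg(d_spacing_angstrom, wavelength_angstrom, max_order=5):
    """Solve 2*d*sin(theta) = n*lambda for each order n that has a physical
    solution (n*lambda <= 2*d), returning the scattering angles 2*theta in degrees."""
    angles = []
    for n in range(1, max_order + 1):
        s = n * wavelength_angstrom / (2.0 * d_spacing_angstrom)
        if s > 1.0:      # no diffraction possible for this order
            break
        theta = math.degrees(math.asin(s))
        angles.append((n, 2.0 * theta))
    return angles

# Example: d = 3.14 Å planes probed with Cu K-alpha radiation (~1.54 Å)
for n, two_theta in bragg_angles_deg(3.14, 1.54):
    print(f"n = {n}: 2*theta ~ {two_theta:.2f} degrees")
```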


Development from 1912 to 1920


After Von Laue's pioneering research, the field developed rapidly, most notably by physicists William Lawrence Bragg and his father William Henry Bragg. In 1912-1913, the younger Bragg developed Bragg's law, which connects the observed scattering with reflections from evenly spaced planes within the crystal. The Braggs, father and son, shared the 1915 Nobel Prize in Physics for their work in crystallography. The earliest structures were generally simple and marked by one-dimensional symmetry. However, as computational and experimental methods improved over the next decades, it became feasible to deduce reliable atomic positions for more complicated two- and three-dimensional arrangements of atoms in the unit cell. The potential of X-ray crystallography for determining the structure of molecules and minerals (then only known vaguely from chemical and hydrodynamic experiments) was realized immediately. The earliest structures were simple inorganic crystals and minerals, but even these revealed fundamental laws of physics and chemistry. The first atomic-resolution structure to be "solved" (i.e. determined) in 1914 was that of table salt. The distribution of electrons in the table-salt structure showed that crystals are not necessarily composed of covalently bonded molecules, and proved the existence of ionic compounds. The structure of diamond was solved in the same year, proving the tetrahedral arrangement of its chemical bonds and showing that the length of the C-C single bond was 1.52 angstroms.

Although diamonds (top left) and graphite (top right) are identical in chemical composition, being both pure carbon, X-ray crystallography revealed that the arrangement of their atoms (bottom) accounts for their different properties. In diamond, the carbon atoms are arranged tetrahedrally and held together by single covalent bonds, making it strong in all directions. By contrast, graphite is composed of stacked sheets. Within the sheet, the bonding is covalent and has hexagonal symmetry, but there are no covalent bonds between the sheets, making graphite easy to cleave into flakes.

Other early structures included

copper, calcium fluoride (CaF2, also known as fluorite), calcite (CaCO3) and pyrite (FeS2) in 1914; spinel (MgAl2O4) in 1915; the rutile and anatase forms of titanium dioxide (TiO2) in 1916; and pyrochroite Mn(OH)2 and, by extension, brucite Mg(OH)2 in 1919. Also in 1919, sodium nitrate (NaNO3) and caesium dichloroiodide (CsICl2) were determined by Ralph Walter Graystone Wyckoff, and the wurtzite (hexagonal ZnS) structure became known in 1920. The structure of graphite was solved in 1916 by the related method of powder diffraction, which was developed by Peter Debye and Paul Scherrer and, independently, by Albert Hull in 1917. The structure of graphite was determined from single-crystal diffraction in 1924 by two groups independently. Hull also used the powder method to determine the structures of various metals, such as iron and magnesium.


Contributions to chemistry and material science


X-ray crystallography has led to a better understanding of chemical bonds and non-covalent interactions. The initial studies revealed the typical radii of atoms, and confirmed many theoretical models of chemical bonding, such as the tetrahedral bonding of carbon in the diamond structure, the octahedral bonding of metals observed in ammonium hexachloroplatinate (IV), and the resonance observed in the planar carbonate group and in aromatic molecules. Kathleen Lonsdale's 1928 structure of hexamethylbenzene established the hexagonal symmetry of benzene and showed a clear difference in bond length between the aliphatic C-C bonds and aromatic C-C bonds; this finding led to the idea of resonance between chemical bonds, which had profound consequences for the development of chemistry. Her conclusions were anticipated by William Henry Bragg, who published models of naphthalene and anthracene in 1921 based on other molecules, an early form of molecular replacement. Also in the 1920s, Victor Moritz Goldschmidt and later Linus Pauling developed rules for eliminating chemically unlikely structures and for determining the relative sizes of atoms. These rules led to the structure of brookite (1928) and an understanding of the relative stability of the rutile, brookite and anatase forms of titanium dioxide. The distance between two bonded atoms is a sensitive measure of the bond strength and its bond order; thus, X-ray crystallographic studies have led to the discovery of even more exotic types of bonding in inorganic chemistry, such as metal-metal double bonds, metal-metal quadruple bonds, and three-center, two-electron bonds. X-ray crystallography (or, strictly speaking, an inelastic Compton scattering experiment) has also provided evidence for the partly covalent character of hydrogen bonds. In the field of organometallic chemistry, the X-ray structure of ferrocene initiated scientific studies of sandwich compounds, while that of Zeise's salt stimulated research into "back bonding" and metal-pi complexes. Finally, X-ray crystallography had a pioneering role in the development of supramolecular chemistry, particularly in clarifying the structures of the crown ethers and the principles of host-guest chemistry. In materials science, many complicated inorganic and organometallic systems have been analyzed using single-crystal methods, such as fullerenes, metalloporphyrins, and other complicated compounds. Single-crystal diffraction is also used in the pharmaceutical industry, due to recent problems with polymorphs. The major factors affecting the quality of single-crystal structures are the crystal's size and regularity; recrystallization is a commonly used technique to improve these factors in small-molecule crystals. The Cambridge Structural Database contains over 500,000 structures; over 99% of these structures were determined by X-ray diffraction.


Mineralogy and metallurgy


Since the 1920s, X-ray diffraction has been the principal method for determining the arrangement of atoms in minerals and metals. The application of X-ray crystallography to mineralogy began with the structure of garnet, which was determined in 1924 by Menzer. A systematic X-ray crystallographic study of the silicates was undertaken in the 1920s. This study showed that, as the Si/O ratio is altered, the silicate crystals exhibit significant changes in their atomic arrangements. Machatschki extended these insights to minerals in which aluminium substitutes for the silicon atoms of the silicates. The first application of X-ray crystallography to metallurgy likewise occurred in the mid-1920s. Most notably, Linus Pauling's structure of the alloy Mg2Sn led to his theory of the stability and structure of complex ionic crystals.

On October 17, 2012, the Curiosity rover on the planet Mars at "Rocknest" performed the first X-ray diffraction analysis of Martian soil. The results from the rover's CheMin analyzer revealed the presence of several minerals, including feldspar, pyroxenes and olivine, and suggested that the Martian soil in the sample was similar to the "weathered basaltic soils" of Hawaiian volcanoes.

First X-ray diffraction view of Martian soil - CheMin analysis reveals feldspar, pyroxenes, olivine and more (Curiosity rover at "Rocknest", October 17, 2012).

Early organic and small biological molecules


The first structure of an organic compound, hexamethylenetetramine, was solved in 1923. This was followed by several studies of long-chain fatty acids, which are an important component of biological membranes. In the 1930s, the structures of much larger molecules with two-dimensional complexity began to be solved. A significant advance was the structure of phthalocyanine, a large planar molecule that is closely related to porphyrin molecules important in biology, such as heme, corrin and chlorophyll. X-ray crystallography of biological molecules took off with Dorothy Crowfoot Hodgkin, who solved the structures of cholesterol (1937), penicillin (1946) and vitamin B12 (1956), for which she was awarded the Nobel Prize in Chemistry in 1964. In 1969, she succeeded in solving the structure of insulin, on which she worked for over thirty years.

The three-dimensional structure of penicillin, for which Dorothy Crowfoot Hodgkin was awarded the Nobel Prize in Chemistry in 1964. The green, white, red, yellow and blue spheres represent atoms of carbon, hydrogen, oxygen, sulfur and nitrogen, respectively.


Biological macromolecular crystallography


Crystal structures of proteins (which are irregular and hundreds of times larger than cholesterol) began to be solved in the late 1950s, beginning with the structure of sperm whale myoglobin by Sir John Cowdery Kendrew, for which he shared the Nobel Prize in Chemistry with Max Perutz in 1962. Since that success, over 73761 X-ray crystal structures of proteins, nucleic acids and other biological molecules have been determined. For comparison, the nearest competing method in terms of structures analyzed is nuclear magnetic resonance (NMR) spectroscopy, which has resolved 9561 chemical structures. Moreover, crystallography can solve structures of arbitrarily large molecules, whereas solution-state NMR is restricted to relatively small ones (less than 70 kDa). X-ray crystallography is now used routinely by scientists to determine how a pharmaceutical drug interacts with its protein target and what changes might improve it. However, intrinsic membrane proteins remain challenging to crystallize because they require detergents or other means to solubilize them in isolation, and such detergents often interfere with crystallization. Such membrane proteins are a large component of the genome and include many proteins of great physiological importance, such as ion channels and receptors.

Ribbon diagram of the structure of myoglobin, showing colored alpha helices. Such proteins are long, linear molecules with thousands of atoms; yet the relative position of each atom has been determined with sub-atomic resolution by X-ray crystallography. Since it is difficult to visualize all the atoms at once, the ribbon shows the rough path of the protein polymer from its N-terminus (blue) to its C-terminus (red).

Relationship to other scattering techniques


Elastic vs. inelastic scattering
X-ray crystallography is a form of elastic scattering; the outgoing X-rays have the same energy, and thus the same wavelength, as the incoming X-rays, only with altered direction. By contrast, inelastic scattering occurs when energy is transferred from the incoming X-ray to the crystal, e.g., by exciting an inner-shell electron to a higher energy level. Such inelastic scattering reduces the energy (or increases the wavelength) of the outgoing beam. Inelastic scattering is useful for probing such excitations of matter, but not in determining the distribution of scatterers within the matter, which is the goal of X-ray crystallography. X-rays range in wavelength from 10 to 0.01 nanometers; a typical wavelength used for crystallography is 1 Å (0.1 nm), which is on the scale of covalent chemical bonds and the radius of a single atom. Longer-wavelength photons (such as ultraviolet radiation) would not have sufficient resolution to determine the atomic positions. At the other extreme, shorter-wavelength photons such as gamma rays are difficult to produce in large numbers, difficult to focus, and interact too strongly with matter, producing particle-antiparticle pairs. Therefore, X-rays are the "sweet spot" for wavelength when determining atomic-resolution structures from the scattering of electromagnetic radiation.


Other X-ray techniques


Other forms of elastic X-ray scattering include powder diffraction, SAXS and several types of X-ray fiber diffraction, which was used by Rosalind Franklin in determining the double-helix structure of DNA. In general, single-crystal X-ray diffraction offers more structural information than these other techniques; however, it requires a sufficiently large and regular crystal, which is not always available. These scattering methods generally use monochromatic X-rays, which are restricted to a single wavelength with minor deviations. A broad spectrum of X-rays (that is, a blend of X-rays with different wavelengths) can also be used to carry out X-ray diffraction, a technique known as the Laue method. This is the method used in the original discovery of X-ray diffraction. Laue scattering provides much structural information with only a short exposure to the X-ray beam, and is therefore used in structural studies of very rapid events (Time resolved crystallography). However, it is not as well-suited as monochromatic scattering for determining the full atomic structure of a crystal and therefore works better with crystals with relatively simple atomic arrangements. The Laue back reflection mode records X-rays scattered backwards from a broad spectrum source. This is useful if the sample is too thick for X-rays to transmit through it. The diffracting planes in the crystal are determined by knowing that the normal to the diffracting plane bisects the angle between the incident beam and the diffracted beam. A Greninger chart can be used to interpret the back reflection Laue photograph.

Electron and neutron diffraction


Other particles, such as electrons and neutrons, may be used to produce a diffraction pattern. Although electron, neutron, and X-ray scattering are based on different physical processes, the resulting diffraction patterns are analyzed using the same coherent diffraction imaging techniques. As derived below, the electron density within the crystal and the diffraction patterns are related by a simple mathematical method, the Fourier transform, which allows the density to be calculated relatively easily from the patterns. However, this works only if the scattering is weak, i.e., if the scattered beams are much less intense than the incoming beam. Weakly scattered beams pass through the remainder of the crystal without undergoing a second scattering event. Such re-scattered waves are called "secondary scattering" and hinder the analysis. Any sufficiently thick crystal will produce secondary scattering, but since X-rays interact relatively weakly with the electrons, this is generally not a significant concern. By contrast, electron beams may produce strong secondary scattering even for relatively thin crystals (>100 nm). Since this thickness corresponds to the diameter of many viruses, a promising direction is the electron diffraction of isolated macromolecular assemblies, such as viral capsids and molecular machines, which may be carried out with a cryo-electron microscope. Moreover, the strong interaction of electrons with matter (about 1000 times stronger than for X-rays) allows determination of the atomic structure of extremely small volumes. The field of applications for electron crystallography ranges from biomolecules such as membrane proteins, through organic thin films, to the complex structures of (nanocrystalline) intermetallic compounds and zeolites. Neutron diffraction is an excellent method for structure determination, although it has been difficult to obtain intense, monochromatic beams of neutrons in sufficient quantities. Traditionally, nuclear reactors have been used, although the new Spallation Neutron Source holds much promise in the near future. Being uncharged, neutrons scatter much more readily from the atomic nuclei than from the electrons. Therefore, neutron scattering is very useful for observing the positions of light atoms with few electrons, especially hydrogen, which is essentially invisible in X-ray diffraction. Neutron scattering also has the remarkable property that the solvent can be made invisible by adjusting the ratio of normal water, H2O, and heavy water, D2O.


Synchrotron Radiation
Synchrotron radiation is one of the brightest light sources on Earth and is the single most powerful tool available to X-ray crystallographers. It consists of X-ray beams generated in large machines called synchrotrons. These machines accelerate electrically charged particles, often electrons, to nearly the speed of light, then whip them around a huge, hollow metal ring. Synchrotrons were originally designed for use by high-energy physicists studying subatomic particles and cosmic phenomena. The largest component of each synchrotron is its electron storage ring. This ring is actually not a perfect circle, but a many-sided polygon. At each corner of the polygon, precisely aligned magnets bend the electron stream, forcing it to stay in the ring. Each time the electrons' path is bent, they emit bursts of energy in the form of electromagnetic radiation. Because particles in a synchrotron are hurtling at nearly the speed of light, they emit intense radiation, including large amounts of high-energy X-rays.

Methods
Overview of single-crystal X-ray diffraction
The oldest and most precise method of X-ray crystallography is single-crystal X-ray diffraction, in which a beam of X-rays strikes a single crystal, producing scattered beams. When they land on a piece of film or other detector, these beams make a diffraction pattern of spots; the strengths and angles of these beams are recorded as the crystal is gradually rotated.[5] Each spot is called a reflection, since it corresponds to the reflection of the X-rays from one set of evenly spaced planes within the crystal. For single crystals of sufficient purity and regularity, X-ray diffraction data can determine the mean chemical bond lengths and angles to within a few thousandths of an angstrom and to within a few tenths of a degree, respectively. The atoms in a crystal are not static, but oscillate about their mean positions, usually by less than a few tenths of an angstrom. X-ray crystallography allows measuring the size of these oscillations.

Procedure
The technique of single-crystal X-ray crystallography has three basic steps. The first, and often most difficult, step is to obtain an adequate crystal of the material under study. The crystal should be sufficiently large (typically larger than 0.1 mm in all dimensions), pure in composition and regular in structure, with no significant internal imperfections such as cracks or twinning. In the second step, the crystal is placed in an intense beam of X-rays, usually of a single wavelength (monochromatic X-rays), producing the regular pattern of reflections. As the crystal is gradually rotated, previous reflections disappear and new ones appear; the intensity of every spot is recorded at every orientation of the crystal. Multiple data sets may have to be collected, with each set covering slightly more than half a full rotation of the crystal and typically containing tens of thousands of reflections. In the third step, these data are combined computationally with complementary chemical information to produce and refine a model of the arrangement of atoms within the crystal. The final, refined model of the atomic arrangement, now called a crystal structure, is usually stored in a public database.

Workflow for solving the structure of a molecule by X-ray crystallography.

Limitations
As the crystal's repeating unit, its unit cell, becomes larger and more complex, the atomic-level picture provided by X-ray crystallography becomes less well-resolved (more "fuzzy") for a given number of observed reflections. Two limiting cases of X-ray crystallography, "small-molecule" and "macromolecular" crystallography, are often discerned. Small-molecule crystallography typically involves crystals with fewer than 100 atoms in their asymmetric unit; such crystal structures are usually so well resolved that the atoms can be discerned as isolated "blobs" of electron density. By contrast, macromolecular crystallography often involves tens of thousands of atoms in the unit cell. Such crystal structures are generally less well-resolved (more "smeared out"); the atoms and chemical bonds appear as tubes of electron density, rather than as isolated atoms. In general, small molecules are also easier to crystallize than macromolecules; however, X-ray crystallography has proven possible even for viruses with hundreds of thousands of atoms.


Crystallization
Although crystallography can be used to characterize the disorder in an impure or irregular crystal, crystallography generally requires a pure crystal of high regularity to solve the structure of a complicated arrangement of atoms. Pure, regular crystals can sometimes be obtained from natural or synthetic materials, such as samples of metals, minerals or other macroscopic materials. The regularity of such crystals can sometimes be improved with macromolecular crystal annealing and other methods. However, in many cases, obtaining a diffraction-quality crystal is the chief barrier to solving its atomic-resolution structure.

Small-molecule and macromolecular crystallography differ in the range of possible techniques used to produce diffraction-quality crystals. Small molecules generally have few degrees of conformational freedom, and may be crystallized by a wide range of methods, such as chemical vapor deposition and recrystallization. By contrast, macromolecules generally have many degrees of freedom and their crystallization must be carried out to maintain a stable structure. For example, proteins and larger RNA molecules cannot be crystallized if their tertiary structure has been unfolded; therefore, the range of crystallization conditions is restricted to solution conditions in which such molecules remain folded.

A protein crystal seen under a microscope. Crystals used in X-ray crystallography may be smaller than a millimeter across.


Protein crystals are almost always grown in solution. The most common approach is to lower the solubility of the component molecules very gradually; if this is done too quickly, the molecules will precipitate from solution, forming a useless dust or amorphous gel on the bottom of the container. Crystal growth in solution is characterized by two steps: nucleation of a microscopic crystallite (possibly having only 100 molecules), followed by growth of that crystallite, ideally to a diffraction-quality crystal. The solution conditions that favor the first step (nucleation) are not always the same conditions that favor the second step (subsequent growth). The crystallographer's goal is to identify solution conditions that favor the development of a single, large crystal, since larger crystals offer improved resolution of the molecule. Consequently, the solution conditions should disfavor the first step (nucleation) but favor the second (growth), so that only one large crystal forms per droplet. If nucleation is favored too much, a shower of small crystallites will form in the droplet, rather than one large crystal; if favored too little, no crystal will form whatsoever.

It is extremely difficult to predict good conditions for nucleation or growth of well-ordered crystals. In practice, favorable conditions are identified by screening; a very large batch of the molecules is prepared, and a wide variety of crystallization solutions are tested. Hundreds, even thousands, of solution conditions are generally tried before finding a successful one. The various conditions can use one or more physical mechanisms to lower the solubility of the molecule; for example, some may change the pH, some contain salts of the Hofmeister series or chemicals that lower the dielectric constant of the solution, and still others contain large polymers such as polyethylene glycol that drive the molecule out of solution by entropic effects. It is also common to try several temperatures for encouraging crystallization, or to gradually lower the temperature so that the solution becomes supersaturated. These methods require large amounts of the target molecule, as they use high concentrations of the molecule(s) to be crystallized. Due to the difficulty in obtaining such large quantities (milligrams) of crystallization-grade protein, robots have been developed that are capable of accurately dispensing crystallization trial drops on the order of 100 nanoliters in volume. This means that roughly 10-fold less protein is used per experiment when compared to crystallization trials set up by hand (on the order of 1 microliter).

Three methods of preparing crystals. A: Hanging drop. B: Sitting drop. C: Microdialysis.

Several factors are known to inhibit or mar crystallization. The growing crystals are generally held at a constant temperature and protected from shocks or vibrations that might disturb their crystallization. Impurities in the molecules or in the crystallization solutions are often inimical to crystallization. Conformational flexibility in the molecule also tends to make crystallization less likely, due to entropy. Ironically, molecules that tend to self-assemble into regular helices are often unwilling to assemble into crystals. Crystals can be marred by twinning, which can occur when a unit cell can pack equally favorably in multiple orientations, although recent advances in computational methods may allow solving the structure of some twinned crystals.
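The screening strategy described above is, in practice, a systematic search over solution conditions. The following is a minimal sketch of how such a coarse screen might be enumerated; the pH range, precipitant, and step sizes are illustrative assumptions, not a validated screen.

```python
# Hypothetical coarse crystallization screen: a grid over pH, precipitant
# concentration and salt (values are illustrative, not a validated screen).
from itertools import product

ph_values = [4.5, 5.5, 6.5, 7.5, 8.5]     # buffer pH steps (assumed)
peg_percent = [5, 10, 15, 20, 25]          # % (w/v) PEG 3350 as precipitant (assumed)
salt_molar = [0.1, 0.2]                    # NaCl concentration in mol/L (assumed)

conditions = [
    {"pH": ph, "PEG_3350_%": peg, "NaCl_M": salt}
    for ph, peg, salt in product(ph_values, peg_percent, salt_molar)
]

print(f"{len(conditions)} trial conditions")   # 5 * 5 * 2 = 50 drops
print(conditions[0])
```

In a robotic setup, each dictionary would correspond to one ~100 nL trial drop dispensed into a screening plate.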
Having failed to crystallize a target molecule, a crystallographer may try again with a slightly modified version of the molecule; even small changes in molecular properties can lead to large differences in crystallization behavior.


Data collection
Mounting the crystal

The crystal is mounted for measurements so that it may be held in the X-ray beam and rotated. There are several methods of mounting. In the past, crystals were loaded into glass capillaries with the crystallization solution (the mother liquor). Nowadays, crystals of small molecules are typically attached with oil or glue to a glass fiber or a loop, which is made of nylon or plastic and attached to a solid rod. Protein crystals are scooped up by a loop, then flash-frozen with liquid nitrogen. This freezing reduces the radiation damage of the X-rays, as well as the noise in the Bragg peaks due to thermal motion (the Debye-Waller effect). However, untreated protein crystals often crack if flash-frozen; therefore, they are generally pre-soaked in a cryoprotectant solution before freezing. Unfortunately, this pre-soak may itself cause the crystal to crack, ruining it for crystallography. Generally, successful cryo-conditions are identified by trial and error.

The capillary or loop is mounted on a goniometer, which allows it to be positioned accurately within the X-ray beam and rotated. Since both the crystal and the beam are often very small, the crystal must be centered within the beam to within ~25 micrometers accuracy, which is aided by a camera focused on the crystal. The most common type of goniometer is the "kappa goniometer", which offers three angles of rotation: the ω angle, which rotates about an axis perpendicular to the beam; the κ angle, about an axis at ~50° to the ω axis; and, finally, the φ angle about the loop/capillary axis. When the κ angle is zero, the ω and φ axes are aligned. The κ rotation allows for convenient mounting of the crystal, since the arm in which the crystal is mounted may be swung out towards the crystallographer. The oscillations carried out during data collection (mentioned below) involve the ω axis only. An older type of goniometer is the four-circle goniometer, and its relatives such as the six-circle goniometer.

X-ray sources

The mounted crystal is then irradiated with a beam of monochromatic X-rays. The brightest and most useful X-ray sources are synchrotrons; their much higher luminosity allows for better resolution. They also make it convenient to tune the wavelength of the radiation, which is useful for multi-wavelength anomalous dispersion (MAD) phasing, described below. Synchrotrons are generally national facilities, each with several dedicated beamlines where data is collected around the clock, seven days a week.

A diffractometer.

Smaller X-ray generators are often used in laboratories to check the quality of crystals before bringing them to a synchrotron and sometimes to solve a crystal structure. In such systems, electrons are boiled off of a cathode and accelerated through a strong electric potential of ~50 kV; having reached a high speed, the electrons collide with a metal plate, emitting bremsstrahlung and some strong spectral lines corresponding to the excitation of inner-shell electrons of the metal. The most common metal used is copper, which can be kept cool easily, due to its high thermal conductivity, and which produces strong Kα and Kβ lines. The Kβ line is sometimes suppressed with a thin (~10 µm) nickel foil. The simplest and cheapest variety of sealed X-ray tube has a stationary anode (the Crookes tube) and runs with ~2 kW of electron-beam power. The more expensive variety has a rotating-anode type source that runs with ~14 kW of e-beam power.
X-rays are generally filtered (by use of X-ray filters) to a single wavelength (made monochromatic) and collimated to a single direction before they are allowed to strike the crystal. The filtering not only simplifies the data analysis, but also removes radiation that degrades the crystal without contributing useful information. Collimation is done either with a collimator (basically, a long tube) or with a clever arrangement of gently curved mirrors. Mirror systems are preferred for small crystals (under 0.3 mm) or with large unit cells (over 150 Å).

Recording the reflections

When a crystal is mounted and exposed to an intense beam of X-rays, it scatters the X-rays into a pattern of spots or reflections that can be observed on a screen behind the crystal. A similar pattern may be seen by shining a laser pointer at a compact disc. The relative intensities of these spots provide the information to determine the arrangement of molecules within the crystal in atomic detail. The intensities of these reflections may be recorded with photographic film, an area detector or with a charge-coupled device (CCD) image sensor. The peaks at small angles correspond to low-resolution data, whereas those at high angles represent high-resolution data; thus, an upper limit on the eventual resolution of the structure can be determined from the first few images. Some measures of diffraction quality can be determined at this point, such as the mosaicity of the crystal and its overall disorder, as observed in the peak widths. Some pathologies of the crystal that would render it unfit for solving the structure can also be diagnosed quickly at this point.


An X-ray diffraction pattern of a crystallized enzyme. The pattern of spots (reflections) and the relative strength of each spot (intensities) can be used to determine the structure of the enzyme.

One image of spots is insufficient to reconstruct the whole crystal; it represents only a small slice of the full Fourier transform. To collect all the necessary information, the crystal must be rotated step-by-step through 180°, with an image recorded at every step; actually, slightly more than 180° is required to cover reciprocal space, due to the curvature of the Ewald sphere. However, if the crystal has a higher symmetry, a smaller angular range such as 90° or 45° may be recorded. The rotation axis should be changed at least once, to avoid developing a "blind spot" in reciprocal space close to the rotation axis. It is customary to rock the crystal slightly (by 0.5–2°) to catch a broader region of reciprocal space. Multiple data sets may be necessary for certain phasing methods. For example, MAD phasing requires that the scattering be recorded at least three (and usually four, for redundancy) wavelengths of the incoming X-ray radiation. A single crystal may degrade too much during the collection of one data set, owing to radiation damage; in such cases, data sets on multiple crystals must be taken.

Data analysis
Crystal symmetry, unit cell, and image scaling

The recorded series of two-dimensional diffraction patterns, each corresponding to a different crystal orientation, is converted into a three-dimensional model of the electron density; the conversion uses the mathematical technique of Fourier transforms, which is explained below. Each spot corresponds to a different type of variation in the electron density; the crystallographer must determine which variation corresponds to which spot (indexing), the relative strengths of the spots in different images (merging and scaling) and how the variations should be combined to yield the total electron density (phasing). Data processing begins with indexing the reflections. This means identifying the dimensions of the unit cell and which image peak corresponds to which position in reciprocal space. A byproduct of indexing is to determine the symmetry of the crystal, i.e., its space group. Some space groups can be eliminated from the beginning. For example, reflection symmetries cannot be observed in chiral molecules; thus, only 65 of the 230 possible space groups are allowed for protein molecules, which are almost always chiral. Indexing is generally accomplished using an autoindexing routine. Having assigned symmetry, the data is then integrated. This converts the hundreds of images containing the thousands of reflections into a single file, consisting of (at the very least) records of the Miller index of each reflection and an intensity for each reflection (at this stage the file often also includes error estimates and measures of partiality, i.e., what part of a given reflection was recorded on that image). A full data set may consist of hundreds of separate images taken at different orientations of the crystal. The first step is to merge and scale these various images, that is, to identify which peaks appear in two or more images (merging) and to scale the relative images so that they have a consistent intensity scale. Optimizing the intensity scale is critical because the relative intensity of the peaks is the key information from which the structure is determined. The repetitive technique of crystallographic data collection and the often high symmetry of crystalline materials cause the diffractometer to record many symmetry-equivalent reflections multiple times. This allows calculating the symmetry-related R-factor, a reliability index based upon how similar the measured intensities of symmetry-equivalent reflections are, thus assessing the quality of the data (a minimal sketch of this calculation is given after the list of phasing methods below).

Initial phasing

The data collected from a diffraction experiment is a reciprocal space representation of the crystal lattice. The position of each diffraction 'spot' is governed by the size and shape of the unit cell and the inherent symmetry within the crystal. The intensity of each diffraction 'spot' is recorded, and this intensity is proportional to the square of the structure factor amplitude. The structure factor is a complex number containing information relating to both the amplitude and phase of a wave. In order to obtain an interpretable electron density map, both amplitude and phase must be known (an electron density map allows a crystallographer to build a starting model of the molecule). The phase cannot be directly recorded during a diffraction experiment: this is known as the phase problem. Initial phase estimates can be obtained in a variety of ways:

Ab initio phasing or direct methods: This is usually the method of choice for small molecules (<1000 non-hydrogen atoms), and has been used successfully to solve the phase problems for small proteins. If the resolution of the data is better than 1.4 Å (140 pm), direct methods can be used to obtain phase information by exploiting known phase relationships between certain groups of reflections.

Molecular replacement: If a related structure is known, it can be used as a search model in molecular replacement to determine the orientation and position of the molecules within the unit cell. The phases obtained this way can be used to generate electron density maps.

Anomalous X-ray scattering (MAD or SAD phasing): The X-ray wavelength may be scanned past an absorption edge of an atom, which changes the scattering in a known way. By recording full sets of reflections at three different wavelengths (far below, far above and in the middle of the absorption edge) one can solve for the substructure of the anomalously diffracting atoms and thence the structure of the whole molecule. The most popular method of incorporating anomalous scattering atoms into proteins is to express the protein in a methionine auxotroph (a host incapable of synthesizing methionine) in a medium rich in seleno-methionine, which contains selenium atoms. A MAD experiment can then be conducted around the absorption edge, which should then yield the position of any methionine residues within the protein, providing initial phases.
Heavy atom methods (multiple isomorphous replacement): If electron-dense metal atoms can be introduced into the crystal, direct methods or Patterson-space methods can be used to determine their location and to obtain initial phases. Such heavy atoms can be introduced either by soaking the crystal in a heavy atom-containing solution, or by co-crystallization (growing the crystals in the presence of a heavy atom). As in MAD phasing, the changes in the scattering amplitudes can be interpreted to yield the phases. Although this is the original method by which protein crystal structures were solved, it has largely been superseded by MAD phasing with selenomethionine.
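As noted in the merging and scaling discussion above, redundant measurements of symmetry-equivalent reflections allow a symmetry-related R-factor (often called Rmerge) to be computed. The following is a minimal sketch, assuming an unweighted Rmerge convention and toy input data; production packages apply additional corrections and weighting.

```python
import numpy as np

# Toy input: for each unique reflection (Miller index), a list of redundant
# intensity measurements from symmetry-equivalent spots (arbitrary units).
measurements = {
    (1, 0, 0): [1050.0, 980.0, 1010.0],
    (1, 1, 0): [430.0, 455.0],
    (2, 1, 1): [88.0, 95.0, 91.0, 90.0],
}

def r_merge(data):
    """Unweighted R_merge = sum_hkl sum_i |I_i - <I>|  /  sum_hkl sum_i I_i."""
    num, den = 0.0, 0.0
    for intensities in data.values():
        ints = np.asarray(intensities, dtype=float)
        mean_i = ints.mean()
        num += np.abs(ints - mean_i).sum()
        den += ints.sum()
    return num / den

print(f"R_merge = {r_merge(measurements):.3f}")
```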


Model building and phase refinement

Having obtained initial phases, an initial model can be built. This model can be used to refine the phases, leading to an improved model, and so on. Given a model of some atomic positions, these positions and their respective Debye-Waller factors (or B-factors, accounting for the thermal motion of the atom) can be refined to fit the observed diffraction data, ideally yielding a better set of phases. A new model can then be fit to the new electron density map and a further round of refinement is carried out. This continues until the correlation between the diffraction data and the model is maximized. The agreement is measured by an R-factor defined as

$$R = \frac{\sum \big|\, |F_{obs}| - |F_{calc}| \,\big|}{\sum |F_{obs}|}$$

where F is the structure factor. A similar quality criterion is Rfree, which is calculated from a subset (~10%) of reflections that were not included in the structure refinement. Both R factors depend on the resolution of the data. As a rule of thumb, Rfree should be approximately the resolution in angstroms divided by 10; thus, a data-set with 2 Å resolution should yield a final Rfree ~ 0.2. Chemical bonding features such as stereochemistry, hydrogen bonding and distribution of bond lengths and angles are complementary measures of the model quality. Phase bias is a serious problem in such iterative model building.
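The R-factor defined above, and the cross-validation criterion Rfree, can be illustrated with a minimal numerical sketch; the amplitudes and the 10% free-set split below are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up observed amplitudes and a model that reproduces them imperfectly.
f_obs = rng.uniform(10.0, 100.0, size=1000)
f_calc = f_obs * (1.0 + 0.05 * rng.standard_normal(1000))

# Hold back ~10% of reflections as the "free" set, as described above.
free_mask = rng.random(1000) < 0.10

def r_factor(fo, fc):
    # R = sum | |F_obs| - |F_calc| | / sum |F_obs|
    return np.abs(np.abs(fo) - np.abs(fc)).sum() / np.abs(fo).sum()

print(f"R_work = {r_factor(f_obs[~free_mask], f_calc[~free_mask]):.3f}")
print(f"R_free = {r_factor(f_obs[free_mask],  f_calc[free_mask]):.3f}")
```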

A protein crystal structure at 2.7 Å resolution. The mesh encloses the region in which the electron density exceeds a given threshold. The straight segments represent chemical bonds between the non-hydrogen atoms of an arginine (upper left), a tyrosine (lower left), a disulfide bond (upper right, in yellow), and some peptide groups (running left-right in the middle). The two curved green tubes represent spline fits to the polypeptide backbone.

It may not be possible to observe every atom of the crystallized molecule; it must be remembered that the resulting electron density is an average of all the molecules within the crystal. In some cases, there is too much residual disorder in those atoms, and the resulting electron density for atoms existing in many conformations is smeared to such an extent that it is no longer detectable in the electron density map. Weakly scattering atoms such as hydrogen are routinely invisible. It is also possible for a single atom to appear multiple times in an electron density map, e.g., if a protein sidechain has multiple (<4) allowed conformations. In still other cases, the crystallographer may detect that the covalent structure deduced for the molecule was incorrect, or changed. For example, proteins may be cleaved or undergo post-translational modifications that were not detected prior to the crystallization.

Deposition of the structure


Once the model of a molecule's structure has been finalized, it is often deposited in a crystallographic database such as the Cambridge Structural Database (for small molecules), the Inorganic Crystal Structure Database (ICSD) (for inorganic compounds) or the Protein Data Bank (for protein structures). Many structures obtained in private commercial ventures to crystallize medicinally relevant proteins are not deposited in public crystallographic databases.

Diffraction theory
The main goal of X-ray crystallography is to determine the density of electrons f(r) throughout the crystal, where r represents the three-dimensional position vector within the crystal. To do this, X-ray scattering is used to collect data about its Fourier transform F(q), which is inverted mathematically to obtain the density defined in real space, using the formula

$$f(\mathbf{r}) = \frac{1}{(2\pi)^3} \int F(\mathbf{q})\, e^{\,i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{q}$$

where the integral is taken over all values of q. The three-dimensional real vector q represents a point in reciprocal space, that is, a particular oscillation in the electron density as one moves in the direction in which q points. The length of q corresponds to 2π divided by the wavelength of the oscillation. The corresponding formula for a Fourier transform will be used below

$$F(\mathbf{q}) = \int f(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{r}$$
where the integral is summed over all possible values of the position vector r within the crystal. The Fourier transform F(q) is generally a complex number, and therefore has a magnitude |F(q)| and a phase φ(q) related by the equation

$$F(\mathbf{q}) = |F(\mathbf{q})|\, e^{\,i\phi(\mathbf{q})}$$
The intensities of the reflections observed in X-ray diffraction give us the magnitudes |F(q)| but not the phases φ(q). To obtain the phases, full sets of reflections are collected with known alterations to the scattering, either by modulating the wavelength past a certain absorption edge or by adding strongly scattering (i.e., electron-dense) metal atoms such as mercury. Combining the magnitudes and phases yields the full Fourier transform F(q), which may be inverted to obtain the electron density f(r). Crystals are often idealized as being perfectly periodic. In that ideal case, the atoms are positioned on a perfect lattice, the electron density is perfectly periodic, and the Fourier transform F(q) is zero except when q belongs to the reciprocal lattice (the so-called Bragg peaks). In reality, however, crystals are not perfectly periodic; atoms vibrate about their mean position, and there may be disorder of various types, such as mosaicity, dislocations, various point defects, and heterogeneity in the conformation of crystallized molecules. Therefore, the Bragg peaks have a finite width and there may be significant diffuse scattering, a continuum of scattered X-rays that fall between the Bragg peaks.
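The relation F(q) = |F(q)| e^{iφ(q)} and its inversion to an electron density can be illustrated numerically with a discrete Fourier transform. The sketch below is a one-dimensional cartoon (the real problem is three-dimensional) using a made-up toy density; it only shows that magnitudes plus correct phases recover the density, which is the essence of the phase problem.

```python
import numpy as np

# One-dimensional toy "electron density": two Gaussian atoms in a periodic cell.
n = 256
x = np.linspace(0.0, 1.0, n, endpoint=False)
density = np.exp(-((x - 0.3) / 0.02) ** 2) + 0.5 * np.exp(-((x - 0.7) / 0.02) ** 2)

# Forward transform gives complex structure factors: magnitudes and phases.
F = np.fft.fft(density)
magnitudes, phases = np.abs(F), np.angle(F)

# A diffraction experiment measures |F| only; recombining with the correct
# phases and inverting recovers the density.
recovered = np.fft.ifft(magnitudes * np.exp(1j * phases)).real
print("max reconstruction error:", np.max(np.abs(recovered - density)))
```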

Intuitive understanding by Bragg's law


An intuitive understanding of X-ray diffraction can be obtained from the Bragg model of diffraction. In this model, a given reflection is associated with a set of evenly spaced sheets running through the crystal, usually passing through the centers of the atoms of the crystal lattice. The orientation of a particular set of sheets is identified by its three Miller indices (h, k, l), and let their spacing be denoted by d. William Lawrence Bragg proposed a model in which the incoming X-rays are scattered specularly (mirror-like) from each plane; from that assumption, X-rays scattered from adjacent planes will combine constructively (constructive interference) when the angle θ between the plane and the X-ray results in a path-length difference that is an integer multiple n of the X-ray wavelength λ.

A reflection is said to be indexed when its Miller indices (or, more correctly, its reciprocal lattice vector components) have been identified from the known wavelength λ and the scattering angle 2θ. Such indexing gives the unit-cell parameters, the lengths and angles of the unit-cell, as well as its space group. Since Bragg's law does not interpret the relative intensities of the reflections, however, it is generally inadequate to solve for the arrangement of atoms within the unit-cell; for that, a Fourier transform method must be carried out.


Scattering as a Fourier transform


The incoming X-ray beam has a polarization and should be represented as a vector wave; however, for simplicity, let it be represented here as a scalar wave. We also ignore the complication of the time dependence of the wave and just focus on the wave's spatial dependence. Plane waves can be represented by a wave vector kin, and so the strength of the incoming wave at time t = 0 is given by

$$A\, e^{\,i\mathbf{k}_{in}\cdot\mathbf{r}}$$

At position r within the sample, let there be a density of scatterers f(r); these scatterers should produce a scattered spherical wave of amplitude proportional to the local amplitude of the incoming wave times the number of scatterers in a small volume dV about r

$$\text{amplitude of scattered wave} = A\, e^{\,i\mathbf{k}_{in}\cdot\mathbf{r}}\, S\, f(\mathbf{r})\, \mathrm{d}V$$

where S is the proportionality constant. Let's consider the fraction of scattered waves that leave with an outgoing wave-vector of kout and strike the screen at rscreen. Since no energy is lost (elastic, not inelastic scattering), the wavelengths are the same as are the magnitudes of the wave-vectors |kin| = |kout|. From the time that the photon is scattered at r until it is absorbed at rscreen, the photon undergoes a change in phase

$$e^{\,i\mathbf{k}_{out}\cdot(\mathbf{r}_{screen} - \mathbf{r})}$$

The net radiation arriving at rscreen is the sum of all the scattered waves throughout the crystal

$$A S \int f(\mathbf{r})\, e^{\,i\mathbf{k}_{in}\cdot\mathbf{r}}\, e^{\,i\mathbf{k}_{out}\cdot(\mathbf{r}_{screen} - \mathbf{r})}\, \mathrm{d}\mathbf{r}$$

which may be written as a Fourier transform

$$A S\, e^{\,i\mathbf{k}_{out}\cdot\mathbf{r}_{screen}} \int f(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{r} = A S\, e^{\,i\mathbf{k}_{out}\cdot\mathbf{r}_{screen}}\, F(\mathbf{q})$$

where q = kout − kin. The measured intensity of the reflection will be the square of this amplitude

$$|A|^2\, |S|^2\, |F(\mathbf{q})|^2$$

Friedel and Bijvoet mates


For every reflection corresponding to a point q in the reciprocal space, there is another reflection of the same intensity at the opposite point −q. This opposite reflection is known as the Friedel mate of the original reflection. This symmetry results from the mathematical fact that the density of electrons f(r) at a position r is always a real number. As noted above, f(r) is the inverse transform of its Fourier transform F(q); however, such an inverse transform is a complex number in general. To ensure that f(r) is real, the Fourier transform F(q) must be such that the Friedel mates F(−q) and F(q) are complex conjugates of one another. Thus, F(−q) has the same magnitude as F(q) but they have the opposite phase, i.e., φ(−q) = −φ(q). The equality of their magnitudes ensures that the Friedel mates have the same intensity |F|². This symmetry allows one to measure the full Fourier transform from only half the reciprocal space, e.g., by rotating the crystal slightly more than 180° instead of a full 360° revolution. In crystals with significant symmetry, even more reflections may have the same intensity (Bijvoet mates); in such cases, even less of the reciprocal space may need to be measured. In favorable cases of high symmetry, sometimes only 90° or even only 45° of data are required to completely explore the reciprocal space. The Friedel-mate constraint can be derived from the definition of the inverse Fourier transform

$$f(\mathbf{r}) = \frac{1}{(2\pi)^3} \int F(\mathbf{q})\, e^{\,i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{q} = \frac{1}{(2\pi)^3} \int |F(\mathbf{q})|\, e^{\,i\phi(\mathbf{q})}\, e^{\,i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{q}$$

Since Euler's formula states that e^{ix} = cos(x) + i sin(x), the inverse Fourier transform can be separated into a sum of a purely real part and a purely imaginary part

$$f(\mathbf{r}) = I_{\cos}(\mathbf{r}) + i\, I_{\sin}(\mathbf{r})$$

where

$$I_{\cos}(\mathbf{r}) = \frac{1}{(2\pi)^3} \int |F(\mathbf{q})| \cos\!\big(\mathbf{q}\cdot\mathbf{r} + \phi(\mathbf{q})\big)\, \mathrm{d}\mathbf{q}, \qquad I_{\sin}(\mathbf{r}) = \frac{1}{(2\pi)^3} \int |F(\mathbf{q})| \sin\!\big(\mathbf{q}\cdot\mathbf{r} + \phi(\mathbf{q})\big)\, \mathrm{d}\mathbf{q}$$

The function f(r) is real if and only if the second integral Isin is zero for all values of r. In turn, this is true if and only if the above constraint is satisfied

$$|F(-\mathbf{q})| = |F(\mathbf{q})| \quad \text{and} \quad \phi(-\mathbf{q}) = -\phi(\mathbf{q})$$

since substituting q → −q in the integral then gives Isin = −Isin, which implies that Isin = 0.

Ewald's sphere
Each X-ray diffraction image represents only a slice, a spherical slice of reciprocal space, as may be seen by the Ewald sphere construction. Both kout and kin have the same length, due to the elastic scattering, since the wavelength has not changed. Therefore, they may be represented as two radial vectors in a sphere in reciprocal space, which shows the values of q that are sampled in a given diffraction image. Since there is a slight spread in the incoming wavelengths of the incoming X-ray beam, the values of |F(q)| can be measured only for q vectors located between the two spheres corresponding to those radii. Therefore, to obtain a full set of Fourier transform data, it is necessary to rotate the crystal through slightly more than 180°, or sometimes less if sufficient symmetry is present. A full 360° rotation is not needed because of a symmetry intrinsic to the Fourier transforms of real functions (such as the electron density), but "slightly more" than 180° is needed to cover all of reciprocal space within a given resolution because of the curvature of the Ewald sphere. In practice, the crystal is rocked by a small amount (0.25–1°) to incorporate reflections near the boundaries of the spherical Ewald shells.

Patterson function
A well-known result of Fourier transforms is the autocorrelation theorem, which states that the autocorrelation c(r) of a function f(r)

$$c(\mathbf{r}) = \int f(\mathbf{r}')\, f(\mathbf{r}' + \mathbf{r})\, \mathrm{d}\mathbf{r}'$$

has a Fourier transform C(q) that is the squared magnitude of F(q)

$$C(\mathbf{q}) = |F(\mathbf{q})|^2$$

Therefore, the autocorrelation function c(r) of the electron density (also known as the Patterson function) can be computed directly from the reflection intensities, without computing the phases. In principle, this could be used to determine the crystal structure directly; however, it is difficult to realize in practice. The autocorrelation function corresponds to the distribution of vectors between atoms in the crystal; thus, a crystal of N atoms in its unit cell may have N(N-1) peaks in its Patterson function. Given the inevitable errors in measuring the intensities, and the mathematical difficulties of reconstructing atomic positions from the interatomic vectors, this technique is rarely used to solve structures, except for the simplest crystals.
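The Patterson construction described above can be checked numerically: inverse-transforming the measured intensities |F(q)|², with no phase information at all, yields a map that peaks at the interatomic vectors. A minimal one-dimensional sketch with a made-up density follows; the positions and widths are arbitrary.

```python
import numpy as np

# Toy 1-D periodic density with "atoms" at fractional coordinates 0.20 and 0.35.
n = 512
x = np.linspace(0.0, 1.0, n, endpoint=False)
density = np.exp(-((x - 0.20) / 0.01) ** 2) + np.exp(-((x - 0.35) / 0.01) ** 2)

intensities = np.abs(np.fft.fft(density)) ** 2   # what the experiment measures
patterson = np.fft.ifft(intensities).real        # Patterson map: no phases needed

# Skip the origin peak (and its tail) before locating the interatomic-vector peak.
start = int(0.05 * n)
u_peak = x[start + np.argmax(patterson[start : n // 2])]
print(f"strongest non-origin Patterson peak at u = {u_peak:.3f}")  # expect ~0.15
```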

Advantages of a crystal
In principle, an atomic structure could be determined from applying X-ray scattering to non-crystalline samples, even to a single molecule. However, crystals offer a much stronger signal due to their periodicity. A crystalline sample is by definition periodic; a crystal is composed of many unit cells repeated indefinitely in three independent directions. Such periodic systems have a Fourier transform that is concentrated at periodically repeating points in reciprocal space known as Bragg peaks; the Bragg peaks correspond to the reflection spots observed in the diffraction image. Since the amplitude at these reflections grows linearly with the number N of scatterers, the

observed intensity of these spots should grow quadratically, like N². In other words, using a crystal concentrates the weak scattering of the individual unit cells into a much more powerful, coherent reflection that can be observed above the noise. This is an example of constructive interference. In a liquid, powder or amorphous sample, molecules within that sample are in random orientations. Such samples have a continuous Fourier spectrum that uniformly spreads its amplitude, thereby reducing the measured signal intensity, as is observed in SAXS. More importantly, the orientational information is lost. Although theoretically possible, it is experimentally difficult to obtain atomic-resolution structures of complicated, asymmetric molecules from such rotationally averaged data. An intermediate case is fiber diffraction, in which the subunits are arranged periodically in at least one dimension.


Powder diffraction
Powder diffraction is a scientific technique using X-ray, neutron, or electron diffraction on powder or microcrystalline samples for structural characterization of materials.

Explanation
Ideally, every possible crystalline orientation is represented equally in a powdered sample. The resulting orientational averaging causes the three-dimensional reciprocal space that is studied in single crystal diffraction to be projected onto a single dimension. The three-dimensional space can be described with (reciprocal) axes x*, y*, and z* or alternatively in spherical coordinates q, φ*, and χ*. In powder diffraction, intensity is homogeneous over φ* and χ*, and only q remains as an important measurable quantity. In practice, it is sometimes necessary to rotate the sample orientation to eliminate the effects of texturing and achieve true randomness.

Electron powder pattern (red) of an Al film, with an fcc spiral overlay (green) and a line of intersections (blue) that determines the lattice parameter.


When the scattered radiation is collected on a flat plate detector, the rotational averaging leads to smooth diffraction rings around the beam axis, rather than the discrete Laue spots observed in single crystal diffraction. The angle between the beam axis and the ring is called the scattering angle and in X-ray crystallography is always denoted 2θ (in scattering of visible light the convention is usually to call it θ). In accordance with Bragg's law, each ring corresponds to a particular reciprocal lattice vector G in the sample crystal. This leads to the definition of the scattering vector as:

$$q = \frac{4\pi}{\lambda}\sin\theta$$
Two-dimensional powder diffraction setup with flat plate detector.

Powder diffraction data are usually presented as a diffractogram in which the diffracted intensity I is shown as a function either of the scattering angle 2θ or of the scattering vector q. The latter variable has the advantage that the diffractogram no longer depends on the value of the wavelength λ. The advent of synchrotron sources has widened the choice of wavelength considerably. To facilitate comparability of data obtained with different wavelengths, the use of q is therefore recommended and gaining acceptability. An instrument dedicated to performing powder measurements is called a powder diffractometer.
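A minimal sketch of the conversion between the two abscissae just mentioned, using q = 4π sin θ / λ; the Cu Kα wavelength is an assumption for illustration.

```python
import numpy as np

wavelength_nm = 0.154  # Cu K-alpha, approximate (assumed source)

def two_theta_to_q(two_theta_deg, lam=wavelength_nm):
    """Scattering vector magnitude q = 4*pi*sin(theta)/lambda, in 1/nm."""
    theta = np.radians(np.asarray(two_theta_deg) / 2.0)
    return 4.0 * np.pi * np.sin(theta) / lam

# Re-express a few diffractogram positions in q rather than 2-theta.
print(two_theta_to_q([20.0, 40.0, 60.0]))
```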

Uses
Relative to other methods of analysis, powder diffraction allows for rapid, non-destructive analysis of multi-component mixtures without the need for extensive sample preparation. This gives laboratories around the world the ability to quickly analyze unknown materials and perform materials characterization in such fields as metallurgy, mineralogy, forensic science, archeology, condensed matter physics, and the biological and pharmaceutical sciences. Identification is performed by comparison of the diffraction pattern to a known standard or to a database such as the International Centre for Diffraction Data's Powder Diffraction File (PDF) or the Cambridge Structural Database (CSD). Advances in hardware and software, particularly improved optics and fast detectors, have dramatically improved the analytical capability of the technique, especially relative to the speed of the analysis. The fundamental physics upon which the technique is based provides high precision and accuracy in the measurement of interplanar spacings, sometimes to fractions of an Ångström, resulting in authoritative identification frequently used in patents, criminal cases and other areas of law enforcement. The ability to analyze multiphase materials also allows analysis of how materials interact in a particular matrix such as a pharmaceutical tablet, a circuit board, a mechanical weld, a geologic core sampling, cement and concrete, or a pigment found in an historic painting. The method has been historically used for the identification and classification of minerals, but it can be used for any materials, even amorphous ones, so long as a suitable reference pattern is known or can be constructed.

Phase identification
The most widespread use of powder diffraction is in the identification and characterization of crystalline solids, each of which produces a distinctive diffraction pattern. Both the positions (corresponding to lattice spacings) and the relative intensity of the lines are indicative of a particular phase and material, providing a "fingerprint" for comparison. A multi-phase mixture, e.g. a soil sample, will show more than one pattern superposed, allowing for determination of relative concentration. J.D. Hanawalt, an analytical chemist who worked for Dow Chemical in the 1930s, was the first to realize the analytical potential of creating a database. Today it is represented by the Powder Diffraction File (PDF) of the International Centre for Diffraction Data (formerly Joint Committee for Powder Diffraction Studies). This has been made searchable by computer through the work of global software developers and equipment manufacturers. There

are now over 550,000 reference materials in the 2006 Powder Diffraction File Databases, and these databases are interfaced to a wide variety of diffraction analysis software and distributed globally. The Powder Diffraction File contains many subfiles, such as minerals, metals and alloys, pharmaceuticals, forensics, excipients, superconductors, semiconductors, etc., with large collections of organic, organometallic and inorganic reference materials.


Crystallinity
In contrast to a crystalline pattern consisting of a series of sharp peaks, amorphous materials (liquids, glasses etc.) produce a broad background signal. Many polymers show semicrystalline behavior, i.e. part of the material forms an ordered crystallite by folding of the molecule. A single polymer molecule may well be folded into two different, adjacent crystallites and thus form a tie between the two. The tie part is prevented from crystallizing. The result is that the crystallinity will never reach 100%. Powder XRD can be used to determine the crystallinity by comparing the integrated intensity of the background pattern to that of the sharp peaks. Values obtained from powder XRD are typically comparable but not quite identical to those obtained from other methods such as DSC.
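A crude numerical illustration of the crystallinity estimate described above, comparing the integrated intensity of sharp peaks with that of the broad background. The pattern is synthetic and the peak/background separation is idealized; real analyses must fit or estimate the background.

```python
import numpy as np

two_theta = np.linspace(10.0, 60.0, 2000)

# Synthetic pattern: a broad amorphous hump plus two sharp crystalline peaks.
amorphous = 50.0 * np.exp(-((two_theta - 25.0) / 8.0) ** 2)
crystalline = (300.0 * np.exp(-((two_theta - 21.5) / 0.15) ** 2)
               + 200.0 * np.exp(-((two_theta - 36.0) / 0.15) ** 2))
pattern = amorphous + crystalline

# Idealized separation: here the two contributions are known exactly.
area_total = np.trapz(pattern, two_theta)
area_cryst = np.trapz(crystalline, two_theta)
print(f"crystallinity ~ {100.0 * area_cryst / area_total:.1f} %")
```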

Lattice parameters
The position of a diffraction peak is independent of the atomic positions within the cell and entirely determined by the size and shape of the unit cell of the crystalline phase. Each peak represents a certain lattice plane and can therefore be characterized by a Miller index. If the symmetry is high, e.g. cubic or hexagonal it is usually not too hard to identify the index of each peak, even for an unknown phase. This is particularly important in solid-state chemistry, where one is interested in finding and identifying new materials. Once a pattern has been indexed, this characterizes the reaction product and identifies it as a new solid phase. Indexing programs exist to deal with the harder cases, but if the unit cell is very large and the symmetry low (triclinic) success is not always guaranteed.
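For a cubic phase, the indexing just described reduces to simple arithmetic: the interplanar spacing is d = a / √(h² + k² + l²) (the standard cubic relation, restated later in this chapter) and the peak position follows from Bragg's law. A minimal sketch with an assumed lattice parameter and Cu Kα radiation:

```python
import numpy as np

a_nm = 0.4050          # assumed cubic lattice parameter (roughly that of Al)
lam_nm = 0.15406       # Cu K-alpha1 wavelength, assumed source

# A few low-index reflections allowed for a face-centred cubic lattice.
hkl_list = [(1, 1, 1), (2, 0, 0), (2, 2, 0), (3, 1, 1)]

for h, k, l in hkl_list:
    d = a_nm / np.sqrt(h**2 + k**2 + l**2)                        # cubic d-spacing
    two_theta = 2.0 * np.degrees(np.arcsin(lam_nm / (2.0 * d)))   # Bragg's law, n = 1
    print(f"({h}{k}{l}): d = {d:.4f} nm, 2theta = {two_theta:.2f} deg")
```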

Expansion tensors, bulk modulus


Cell parameters are somewhat temperature and pressure dependent. Powder diffraction can be combined with in situ temperature and pressure control. As these thermodynamic variables are changed, the observed diffraction peaks will migrate continuously to indicate higher or lower lattice spacings as the unit cell distorts. This allows for measurement of such quantities as the thermal expansion tensor and the isothermal bulk modulus, as well as determination of the full equation of state of the material.
Thermal expansion of a sulfur powder

Phase transitions
At some critical set of conditions, for example 0 °C for water at 1 atm, a new arrangement of atoms or molecules may become stable, leading to a phase transition. At this point new diffraction peaks will appear or old ones disappear according to the symmetry of the new phase. If the material melts to an isotropic liquid, all sharp lines will disappear and be replaced by a broad amorphous pattern. If the transition produces another crystalline phase, one set

of lines will suddenly be replaced by another set. In some cases however lines will split or coalesce, e.g. if the material undergoes a continuous, second order phase transition. In such cases the symmetry may change because the existing structure is distorted rather than replaced by a completely different one. For example, the diffraction peaks for the lattice planes (100) and (001) can be found at two different values of q for a tetragonal phase, but if the symmetry becomes cubic the two peaks will come to coincide.


Crystal structure refinement and determination


Crystal structure determination from powder diffraction data is extremely challenging due to the overlap of reflections in a powder experiment. A number of different methods exist for structural determination, such as simulated annealing and charge flipping. The crystal structures of known materials can be refined, i.e. as a function of temperature or pressure, using the Rietveld method. The Rietveld method is a so-called full pattern analysis technique. A crystal structure, together with instrumental and microstructural information, is used to generate a theoretical diffraction pattern that can be compared to the observed data. A least squares procedure is then used to minimize the difference between the calculated pattern and each point of the observed pattern by adjusting model parameters. Techniques to determine unknown structures from powder data do exist, but are somewhat specialized. A number of programs that can be used in structure determination are TOPAS, Fox, DASH, GSAS, EXPO2004, and a few others.

Size and strain broadening


There are many factors that determine the width B of a diffraction peak. These include:

1. instrumental factors
2. the presence of defects to the perfect lattice
3. differences in strain in different grains
4. the size of the crystallites

It is often possible to separate the effects of size and strain. Where size broadening is independent of q (K = 1/d), strain broadening increases with increasing q-values. In most cases there will be both size and strain broadening. It is possible to separate these by combining the two equations in what is known as the Hall-Williamson method:

$$B\cos\theta = \frac{k\lambda}{D} + \eta\sin\theta$$

Thus, when we plot $B\cos\theta$ vs. $\sin\theta$ we get a straight line with slope $\eta$ and intercept $k\lambda/D$.

The expression is a combination of the Scherrer equation for size broadening and the Stokes and Wilson expression for strain broadening. The value of η is the strain in the crystallites, the value of D represents the size of the crystallites. The constant k is typically close to unity and ranges from 0.8 to 1.39.
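A minimal sketch of the Hall-Williamson analysis described above: plotting B cos θ against sin θ and reading the strain from the slope and the crystallite size from the intercept. The peak positions and widths below are invented, and the Scherrer constant is assumed to be 0.9.

```python
import numpy as np

lam = 0.15406   # nm, Cu K-alpha1 (assumed)
k_const = 0.9   # Scherrer constant, assumed

# Invented peak positions (2-theta, deg) and breadths B (radians), taken to be
# already corrected for instrumental broadening.
two_theta = np.array([38.5, 44.7, 65.1, 78.2])
B = np.array([0.0042, 0.0047, 0.0060, 0.0071])

theta = np.radians(two_theta / 2.0)
y = B * np.cos(theta)              # B cos(theta)
x = np.sin(theta)                  # sin(theta)

slope, intercept = np.polyfit(x, y, 1)
strain = slope                     # eta in the Hall-Williamson equation above
size_nm = k_const * lam / intercept
print(f"strain ~ {strain:.4f}, crystallite size ~ {size_nm:.1f} nm")
```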

Comparison of X-ray and neutron scattering


X-ray photons scatter by interaction with the electron cloud of the material, while neutrons are scattered by the nuclei. This means that, in the presence of heavy atoms with many electrons, it may be difficult to detect light atoms by X-ray diffraction. In contrast, the neutron scattering lengths of most atoms are approximately equal in magnitude. Neutron diffraction techniques may therefore be used to detect light elements such as oxygen or hydrogen in combination with heavy atoms. The neutron diffraction technique therefore has obvious applications to problems such as determining oxygen displacements in materials like high temperature superconductors and ferroelectrics, or to hydrogen bonding in biological systems. A further complication in the case of neutron scattering from hydrogenous materials is the strong incoherent scattering of hydrogen (80.27(6) barn). This leads to a very high background in neutron diffraction experiments, and may make structural investigations impossible. A common solution is deuteration, i.e., replacing the 1-H atoms in

the sample with deuterium (2-H). The incoherent scattering length of deuterium is much smaller (2.05(3) barn), making structural investigations significantly easier. However, in some systems, replacing hydrogen with deuterium may alter the structural and dynamic properties of interest. As neutrons also have a magnetic moment, they are additionally scattered by any magnetic moments in a sample. In the case of long range magnetic order, this leads to the appearance of new Bragg reflections. In most simple cases, powder diffraction may be used to determine the size of the moments and their spatial orientation.


Aperiodically-arranged clusters
Predicting the scattered intensity in powder diffraction patterns from gases, liquids, and randomly distributed nano-clusters in the solid state is (to first order) done rather elegantly with the Debye scattering equation:

$$I(q) = \sum_{i=1}^{N} \sum_{j=1}^{N} f_i(q)\, f_j(q)\, \frac{\sin(q\, r_{ij})}{q\, r_{ij}}$$
where the magnitude of the scattering vector q is in reciprocal lattice distance units, N is the number of atoms, fi(q) is the atomic scattering factor for atom i and scattering vector q, while rij is the distance between atom i and atom j. One can also use this to predict the effect of nano-crystallite shape on detected diffraction peaks, even if in some directions the cluster is only one atom thick.
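A minimal sketch of the Debye scattering equation for a tiny rigid cluster of identical atoms; the constant atomic scattering factor and the cluster geometry are simplifying assumptions for illustration.

```python
import numpy as np

# Four atoms at the corners of a tetrahedron (coordinates in nm, made up).
positions = np.array([
    [0.0, 0.0, 0.0],
    [0.25, 0.25, 0.0],
    [0.25, 0.0, 0.25],
    [0.0, 0.25, 0.25],
])
f_atom = 1.0                     # constant scattering factor (simplification)

q = np.linspace(1.0, 80.0, 400)  # scattering vector magnitudes, 1/nm

# Debye equation: I(q) = sum_i sum_j f_i f_j sin(q r_ij) / (q r_ij)
rij = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
intensity = np.zeros_like(q)
for i in range(len(positions)):
    for j in range(len(positions)):
        x = q * rij[i, j]
        # np.sinc(x/pi) = sin(x)/x and handles the i == j (r_ij = 0) term correctly
        intensity += f_atom * f_atom * np.sinc(x / np.pi)

print(intensity[:5])
```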

Devices
Cameras
The simplest cameras for X-ray powder diffraction consist of a small capillary and either a flat plate detector (originally a piece of X-ray film, now more and more a flat-plate detector or a CCD-camera) or a cylindrical one (originally a piece of film in a cookie-jar, but increasingly bent position sensitive detectors are used). The two types of cameras are known as the Laue and the Debye-Scherrer camera. In order to ensure complete powder averaging, the capillary is usually spun around its axis. For neutron diffraction vanadium cylinders are used as sample holders. Vanadium has a negligible absorption and coherent scattering cross section for neutrons and is hence nearly invisible in a powder diffraction experiment. Vanadium does however have a considerable incoherent scattering cross section which may cause problems for more sensitive techniques such as neutron inelastic scattering. A later development in X-ray cameras is the Guinier camera. It is built around a focusing bent crystal monochromator. The sample is usually placed in the focusing beam, e.g. as a dusting on a piece of sticky tape. A cylindrical piece of film (or electronic multichannel detector) is put on the focusing circle, but the incident beam is prevented from reaching the detector to prevent damage from its high intensity.

Diffractometers
Diffractometers can be operated both in transmission and in reflection configurations. The reflection one is more common. The powder sample is filled in a small disc-like container and its surface carefully flattened. The disc is put on one axis of the diffractometer and tilted by an angle θ while a detector (scintillation counter) rotates around it on an arm at twice this angle. This configuration is known under the name Bragg-Brentano θ-2θ. Another configuration is the Bragg-Brentano θ-θ configuration, in which the sample is stationary while the X-ray tube and the detector are rotated around it. The angle formed between the tube and the detector is 2θ. This configuration is most convenient for loose powders. Position sensitive and area detectors, which allow collection of multiple angles at once, are becoming more popular on currently supplied instrumentation.


Neutron diffraction
Sources that produce a neutron beam of suitable intensity and speed for diffraction are only available at a small number of research reactors and spallation sources in the world. Angle dispersive (fixed wavelength) instruments typically have a battery of individual detectors arranged in a cylindrical fashion around the sample holder, and can therefore collect scattered intensity simultaneously over a large 2θ range. Time of flight instruments normally have a small range of banks at different scattering angles which collect data at varying resolutions.

X-ray tubes
Laboratory X-ray diffraction equipment relies on the use of an X-ray tube to produce the X-rays. The most commonly used laboratory X-ray tube uses a copper anode, but cobalt and molybdenum are also popular. The wavelength in nm varies for each source. The table below shows these wavelengths, determined by Bearden and quoted in the International Tables for X-ray Crystallography (all values in nm):
Element   Kα (weight average)   Kα2 (strong)   Kα1 (very strong)   Kβ (weak)
Cr        0.229100              0.229361       0.228970            0.208487
Fe        0.193736              0.193998       0.193604            0.175661
Co        0.179026              0.179285       0.178897            0.162079
Cu        0.154184              0.154439       0.154056            0.139222
Mo        0.071073              0.071359       0.070930            0.063229

According to the last re-examination of Holzer et al. (1997), these values are respectively:

Element   Kα2         Kα1         Kβ
Cr        0.2293663   0.2289760   0.2084920
Co        0.1792900   0.1789010   0.1620830
Cu        0.1544426   0.1540598   0.1392250
Mo        0.0713609   0.0709319   0.0632305

Other sources
In-house applications of X-ray diffraction have always been limited to the relatively few wavelengths shown in the table above. A choice among them is nevertheless needed, because the combination of certain wavelengths and certain elements present in a sample can lead to strong fluorescence, which increases the background in the diffraction pattern. A notorious example is the presence of iron in a sample when using copper radiation. In general, elements just below the anode element in the periodic table need to be avoided. Another limitation is that the intensity of traditional generators is relatively low, requiring lengthy exposure times and precluding any time dependent measurement. The advent of synchrotron sources has drastically changed this picture and caused powder diffraction methods to enter a whole new phase of development. Not only is there a much wider choice of wavelengths available, the high brilliance of the synchrotron radiation makes it possible to observe changes in the pattern during chemical reactions, temperature ramps, changes in pressure and the like. The tunability of the wavelength also makes it possible to observe anomalous scattering effects when the wavelength is chosen close to the absorption edge of one of the elements of the sample.

Neutron diffraction has never been an in-house technique because it requires the availability of an intense neutron beam, which is only available at a nuclear reactor or spallation source. Typically the available neutron flux, and the weak interaction between neutrons and matter, require relatively large samples.


Advantages and disadvantages


Although it is possible to solve crystal structures from powder X-ray data alone, its single-crystal analogue is a far more powerful technique for structure determination. This is directly related to the fact that much information is lost by the collapse of the 3D space onto a 1D axis. Nevertheless, powder X-ray diffraction is a powerful and useful technique in its own right. It is mostly used to characterize and identify phases rather than to solve structures. The great advantages of the technique are:

simplicity of sample preparation
rapidity of measurement
the ability to analyze mixed phases, e.g. soil samples
"in situ" structure determination

By contrast growth and mounting of large single crystals is notoriously difficult. In fact there are many materials for which despite many attempts it has not proven possible to obtain single crystals. Many materials are readily available with sufficient microcrystallinity for powder diffraction, or samples may be easily ground from larger crystals. In the field of solid-state chemistry that often aims at synthesizing new materials, single crystals thereof are typically not immediately available. Powder diffraction is therefore one of the most powerful methods to identify and characterize new materials in this field. Particularly for neutron diffraction, which requires larger samples than X-ray diffraction due to a relatively weak scattering cross section, the ability to use large samples can be critical, although new more brilliant neutron sources are being built that may change this picture. Since all possible crystal orientations are measured simultaneously, collection times can be quite short even for small and weakly scattering samples. This is not merely convenient, but can be essential for samples which are unstable either inherently or under X-ray or neutron bombardment, or for time-resolved studies. For the latter it is desirable to have a strong radiation source. The advent of synchrotron radiation and modern neutron sources has therefore done much to revitalize the powder diffraction field because it is now possible to study temperature dependent changes, reaction kinetics and so forth by means of time dependent powder diffraction.


Bragg's law
In physics, Bragg's law gives the angles for coherent and incoherent scattering from a crystal lattice. When X-rays are incident on an atom, they make the electronic cloud move as does any electromagnetic wave. The movement of these charges re-radiates waves with the same frequency (blurred slightly due to a variety of effects); this phenomenon is known as Rayleigh scattering (or elastic scattering). The scattered waves can themselves be scattered but this secondary scattering is assumed to be negligible. A similar process occurs upon scattering neutron waves from the nuclei or by a coherent spin interaction with an unpaired electron. These re-emitted wave fields interfere with each other either constructively or destructively (overlapping waves either add together to produce stronger peaks or subtract from each other to some degree), producing a diffraction pattern on a detector or film. The resulting wave interference pattern is the basis of diffraction analysis. This analysis is called Bragg diffraction. Bragg diffraction (also referred to as the Bragg formulation of X-ray diffraction) was first proposed by William Lawrence Bragg and William Henry Bragg in 1913 in response to their discovery that crystalline solids produced surprising patterns of reflected X-rays (in contrast to that of, say, a liquid). They found that these crystals, at certain specific wavelengths and incident angles, produced intense peaks of reflected radiation (known as Bragg peaks). The concept of Bragg diffraction applies equally to neutron diffraction and electron diffraction processes. Both neutron and X-ray wavelengths are comparable with inter-atomic distances (~150 pm) and thus are an excellent probe for this length scale. W. L. Bragg explained this result by modeling the crystal as a set of discrete parallel planes separated by a constant parameter d. It was proposed that the incident X-ray radiation would produce a Bragg peak if their reflections off the various planes interfered constructively. The interference is constructive when the phase shift is a multiple of 2π; this condition can be expressed by Bragg's law,

$$n\lambda = 2d\sin\theta$$

X-rays interact with the atoms in a crystal.

where n is an integer, λ is the wavelength of the incident wave, d is the spacing between the planes in the atomic lattice, and θ is the angle between the incident ray and the scattering planes. Note that moving particles, including electrons, protons and neutrons, have an associated De Broglie wavelength.
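A minimal worked example of the law just stated: given an assumed wavelength and an observed first-order Bragg angle, the interplanar spacing follows directly. The numbers are illustrative.

```python
import numpy as np

lam_nm = 0.15406            # Cu K-alpha1 wavelength, assumed source
theta_deg = 19.25           # observed Bragg angle (half the 2-theta value), illustrative
n = 1                       # first-order reflection

# Bragg's law: n * lambda = 2 * d * sin(theta)  ->  d = n * lambda / (2 * sin(theta))
d_nm = n * lam_nm / (2.0 * np.sin(np.radians(theta_deg)))
print(f"d = {d_nm:.4f} nm")   # ~0.234 nm
```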


Bragg's law was derived by physicist Sir William Lawrence Bragg in 1912 and first presented on 11 November 1912 to the Cambridge Philosophical Society. Although simple, Bragg's law confirmed the existence of real particles at the atomic scale, as well as providing a powerful new tool for studying crystals in the form of X-ray and neutron diffraction. William Lawrence Bragg and his father, Sir William Henry Bragg, were awarded the Nobel Prize in physics in 1915 for their work in determining crystal structures beginning with NaCl, ZnS, and diamond. They are the only father-son team to jointly win. W. L. Bragg was 25 years old, making him the youngest Nobel laureate.

According to the 2θ deviation, the phase shift causes constructive (left figure) or destructive (right figure) interferences.

Bragg condition
Bragg diffraction: two beams with identical wavelength and phase approach a crystalline solid and are scattered off two different atoms within it. The lower beam traverses an extra length of 2d sin θ. Constructive interference occurs when this length is equal to an integer multiple of the wavelength of the radiation.

Bragg diffraction occurs when electromagnetic radiation or subatomic particle waves with wavelength comparable to atomic spacings are incident upon a crystalline sample, are scattered in a specular fashion by the atoms in the system, and undergo constructive interference in accordance with Bragg's law. For a crystalline solid, the waves are scattered from lattice planes separated by the interplanar distance d. Where the scattered waves interfere constructively, they remain in phase, since the path length of each wave is equal to an integer multiple of the wavelength. The path difference between two waves undergoing constructive interference is given by 2d sin θ, where θ is the scattering angle. This leads to Bragg's law, which describes the condition for constructive interference from successive crystallographic planes (h, k, and l, as given in Miller notation) of the crystalline lattice:

$$2d\sin\theta = n\lambda$$

where n is an integer determined by the order given, and λ is the wavelength. A diffraction pattern is obtained by measuring the intensity of scattered waves as a function of scattering angle. Very strong intensities known as Bragg peaks are obtained in the diffraction pattern when scattered waves satisfy the Bragg condition. It should be taken into account that if only two planes of atoms were diffracting, as shown in the pictures, then the transition from constructive to destructive interference would be gradual as the angle is varied. However, since many atomic planes are interfering in real materials, very sharp peaks surrounded by mostly destructive interference result.


Reciprocal space
Contrary to a common misconception, Bragg's law does not measure atomic distances in real space directly: the quantities measured in a Bragg experiment are inversely proportional to the spacing d of the lattice planes. The term sinθ/λ effectively counts the number of wavelengths fitting between two rows of atoms, and thus measures reciprocal distances. A reciprocal lattice vector describes a set of lattice planes as a vector normal to that set, with length $2\pi/d$. Max von Laue had interpreted this correctly in vector form, through the Laue equation

$$ \mathbf{k}_f - \mathbf{k}_i = \mathbf{G}, $$

where $\mathbf{G}$ is a reciprocal lattice vector and $\mathbf{k}_f$ and $\mathbf{k}_i$ are the wave vectors of the diffracted and the incident beams, respectively. Together with the condition for elastic scattering, $|\mathbf{k}_f| = |\mathbf{k}_i|$, and the introduction of the scattering angle 2θ, this leads equivalently to Bragg's equation. This is simply explained by the conservation of momentum transfer. In this system, the scanning variable can be the length or the direction of the incident or exit wave vectors, corresponding to energy- and angle-dispersive setups. The simple relationship between the diffraction angle and Q-space is then

$$ Q = \frac{4\pi \sin\theta}{\lambda}. $$

The reciprocal lattice is the Fourier space of a crystal lattice and is necessary for a full mathematical description of wave mechanics.
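A small sketch of the relation above: at a first-order Bragg reflection the momentum transfer Q = 4π sinθ/λ should coincide with 2π/d, which the snippet below checks numerically. The Si (111) spacing and Cu Kα wavelength are illustrative literature values; the function name is arbitrary.

```python
import math

def momentum_transfer(theta_deg, wavelength):
    """Magnitude of the scattering vector Q = 4*pi*sin(theta)/lambda (inverse length units of lambda)."""
    return 4.0 * math.pi * math.sin(math.radians(theta_deg)) / wavelength

d, lam = 3.136, 1.5406                               # angstrom: Si (111) spacing, Cu K-alpha wavelength
theta = math.degrees(math.asin(lam / (2 * d)))       # first-order Bragg angle
print(momentum_transfer(theta, lam), 2 * math.pi / d)   # the two numbers agree
```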

Alternate derivation
Suppose that a single monochromatic wave (of any type) is incident on aligned planes of lattice points, with separation $d$, at angle $\theta$. Points A and C are on one plane, and B is on the plane below. Points A, B, C, C' form a quadrilateral.

There will be a path difference between the ray that gets reflected along AC' and the ray that gets transmitted and then reflected, along AB and BC respectively. This path difference is

$$ (AB + BC) - AC'. $$

The two separate waves will arrive at a point with the same phase, and hence undergo constructive interference, if and only if this path difference is equal to an integer multiple of the wavelength, i.e.

$$ (AB + BC) - AC' = n\lambda, $$

where the same definitions of $n$ and $\lambda$ apply as above.

Therefore,

$$ AB = BC = \frac{d}{\sin\theta} \quad \text{and} \quad AC = \frac{2d}{\tan\theta}, $$

from which it follows that

$$ AC' = AC\cos\theta = \frac{2d}{\tan\theta}\cos\theta = \frac{2d\cos^2\theta}{\sin\theta}. $$

Putting everything together,

$$ n\lambda = \frac{2d}{\sin\theta} - \frac{2d\cos^2\theta}{\sin\theta} = \frac{2d\left(1 - \cos^2\theta\right)}{\sin\theta}, $$

which simplifies to

$$ n\lambda = 2d\sin\theta, $$

which is Bragg's law.

Bragg scattering of visible light by colloids


A colloidal crystal is a highly ordered array of particles that can form over a long range (from a few millimeters to one centimeter), and such arrays appear analogous to their atomic or molecular counterparts. The periodic arrays of spherical particles make similar arrays of interstitial voids (the spaces between the particles), which act as a natural diffraction grating for visible light waves, especially when the interstitial spacing is of the same order of magnitude as the incident lightwave. Thus, it has been known for many years that, due to repulsive Coulombic interactions, electrically charged macromolecules in an aqueous environment can exhibit long-range crystal-like correlations, with interparticle separation distances often being considerably greater than the individual particle diameter. In all of these cases in nature, the same brilliant iridescence (or play of colours) can be attributed to the diffraction and constructive interference of visible lightwaves which satisfy Bragg's law, in a manner analogous to the scattering of X-rays in crystalline solids.

Selection rules and practical crystallography


Bragg's law, as stated above, can be used to obtain the lattice spacing of a particular cubic system through the following relation:

$$ d = \frac{a}{\sqrt{h^2 + k^2 + l^2}}, $$

where $a$ is the lattice spacing of the cubic crystal, and $h$, $k$, and $l$ are the Miller indices of the Bragg plane. Combining this relation with Bragg's law gives

$$ \left(\frac{\lambda}{2a}\right)^2 = \frac{\sin^2\theta}{h^2 + k^2 + l^2}. $$

One can derive selection rules for the Miller indices for different cubic Bravais lattices; selection rules for several common structures are given below.


Selection rules for the Miller indices


Bravais lattice | Example compounds | Allowed reflections | Forbidden reflections
Simple cubic | Po | any h, k, l | none
Body-centered cubic | Fe, W, Ta, Cr | h + k + l even | h + k + l odd
Face-centered cubic | Cu, Al, Ni, NaCl, LiH, PbS | h, k, l all odd or all even | h, k, l mixed odd and even
Diamond F.C.C. | Si, Ge | all odd, or all even with h + k + l = 4n | h, k, l mixed odd and even, or all even with h + k + l ≠ 4n
Triangular lattice | Ti, Zr, Cd, Be | l even and h + 2k ≠ 3n | h + 2k = 3n for odd l

These selection rules can be used for any crystal with the given crystal structure. KCl exhibits an fcc cubic structure. However, the K+ and the Cl− ions have the same number of electrons and are quite close in size, so that the diffraction pattern becomes essentially the same as for a simple cubic structure with half the lattice parameter. Selection rules for other structures can be referenced elsewhere, or derived.
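As an illustration of how the selection rules and the combined Bragg relation are used in practice, the sketch below enumerates the allowed reflections of a cubic crystal and lists their d-spacings and diffraction angles. It only encodes the simple cubic, bcc and fcc rules from the table above; the α-iron lattice parameter and the Cu Kα wavelength are standard values quoted for illustration, and the function names are arbitrary.

```python
import math
from itertools import product

def allowed(lattice, h, k, l):
    """Reflection conditions for some cubic Bravais lattices (see the table above)."""
    if lattice == "sc":
        return True
    if lattice == "bcc":
        return (h + k + l) % 2 == 0
    if lattice == "fcc":
        return len({h % 2, k % 2, l % 2}) == 1        # h, k, l all even or all odd
    raise ValueError(lattice)

def reflections(lattice, a, wavelength, hkl_max=3):
    """List ((h,k,l), d, 2*theta in degrees) for allowed reflections of a cubic crystal."""
    out = []
    for h, k, l in product(range(hkl_max + 1), repeat=3):
        if (h, k, l) == (0, 0, 0) or not allowed(lattice, h, k, l):
            continue
        d = a / math.sqrt(h * h + k * k + l * l)
        s = wavelength / (2 * d)
        if s <= 1:
            out.append(((h, k, l), d, 2 * math.degrees(math.asin(s))))
    return sorted(out, key=lambda r: r[2])

# alpha-iron (bcc, a ~ 2.866 angstrom) with Cu K-alpha radiation; the strongest peak is (110) near 2*theta ~ 44.7 deg
for hkl, d, two_theta in reflections("bcc", 2.866, 1.5406):
    print(hkl, round(d, 3), round(two_theta, 2))
```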


Structure factor
In condensed matter physics and crystallography, the static structure factor (or structure factor for short) is a mathematical description of how a material scatters incident radiation. The structure factor is a particularly useful tool in the interpretation of interference patterns obtained in X-ray, electron and neutron diffraction experiments. The static structure factor is measured without resolving the energy of scattered photons/electrons/neutrons. Energy-resolved measurements yield the dynamic structure factor.

Derivation
Let us consider a scalar (real) quantity $\phi(\mathbf{r})$ defined in a volume $V$; it may correspond, for instance, to a mass or charge distribution or to the refractive index of an inhomogeneous medium. If the scalar function is assumed to be integrable, we can define its Fourier transform $\psi(\mathbf{q}) = \int_V \phi(\mathbf{r})\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{r}$. Expressing the field $\phi$ in terms of the spatial frequency $\mathbf{q}$ instead of the point position $\mathbf{r}$ is very useful, for instance, when interpreting scattering experiments. Indeed, in the Born approximation (weak interaction between the field and the medium), the amplitude of the signal corresponding to the scattering vector $\mathbf{q}$ is proportional to $\psi(\mathbf{q})$. Very often, only the intensity of the scattered signal is detectable, so that $I(\mathbf{q}) \propto |\psi(\mathbf{q})|^2$.

If the system under study is composed of a number $N$ of identical constituents (atoms, molecules, colloidal particles, etc.), it is very convenient to explicitly capture the variation in $\phi$ due to the morphology of the individual particles using an auxiliary function $f(\mathbf{r})$, such that

$$ \phi(\mathbf{r}) = \sum_{k=1}^{N} f(\mathbf{r} - \mathbf{R}_k) = f(\mathbf{r}) \ast \sum_{k=1}^{N} \delta(\mathbf{r} - \mathbf{R}_k), \qquad (1) $$

with $\mathbf{R}_k$ the particle positions. In the second equality, the field is decomposed as the convolution product of the function $f$, describing the "form" of the particles, with a sum of Dirac delta functions depending only on their positions. Using the property that the Fourier transform of a convolution product is simply the product of the Fourier transforms of the two factors, we have $\psi(\mathbf{q}) = \hat{f}(\mathbf{q}) \sum_{k} e^{-i\mathbf{q}\cdot\mathbf{R}_k}$, such that

$$ I(\mathbf{q}) \propto |\hat{f}(\mathbf{q})|^2 \left|\sum_{k=1}^{N} e^{-i\mathbf{q}\cdot\mathbf{R}_k}\right|^2. \qquad (2) $$

In general, the particle positions are not fixed and the measurement takes place over a finite exposure time and with a macroscopic sample (much larger than the interparticle distance). The experimentally accessible intensity is thus an averaged one, $\langle I(\mathbf{q}) \rangle$; we need not specify whether $\langle \cdot \rangle$ denotes a time or an ensemble average. We can finally write

$$ \langle I(\mathbf{q}) \rangle \propto |\hat{f}(\mathbf{q})|^2\, N\, S(\mathbf{q}), \qquad (3) $$

thus defining the structure factor

$$ S(\mathbf{q}) = \frac{1}{N} \left\langle \sum_{j=1}^{N} \sum_{k=1}^{N} e^{-i\mathbf{q}\cdot(\mathbf{R}_j - \mathbf{R}_k)} \right\rangle. \qquad (4) $$
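Equation (4) can be evaluated directly on a computer for a given set of particle positions. The following minimal Python sketch does this for a single frozen configuration (so no averaging is performed); the simple cubic test lattice and the function name are only illustrations.

```python
import numpy as np

def structure_factor(positions, q_vectors):
    """S(q) = (1/N) * |sum_k exp(-i q . R_k)|^2 for each q: a direct evaluation of Eq. (4)
    for one configuration (no time or ensemble averaging)."""
    positions = np.asarray(positions, dtype=float)      # shape (N, 3)
    q_vectors = np.asarray(q_vectors, dtype=float)      # shape (M, 3)
    phases = np.exp(-1j * q_vectors @ positions.T)      # shape (M, N)
    amplitudes = phases.sum(axis=1)
    return (amplitudes * amplitudes.conj()).real / positions.shape[0]

# Simple cubic lattice, 5 x 5 x 5 sites with spacing a = 1:
a = 1.0
grid = np.arange(5) * a
R = np.array([[x, y, z] for x in grid for y in grid for z in grid])
q_bragg = np.array([[2 * np.pi / a, 0, 0]])     # a reciprocal-lattice vector
q_other = np.array([[1.0, 0, 0]])               # a generic q
print(structure_factor(R, q_bragg))   # ~ N = 125 (Bragg condition met)
print(structure_factor(R, q_other))   # much smaller
```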


Perfect crystals
In a crystal, the constitutive particles are arranged periodically, forming a lattice. In the following, we consider that all particles are identical (so that the above separation into form factor and structure factor, Equation (3), holds). We also assume that all atoms have an identical environment (i.e. they form a Bravais lattice). The general case of a lattice with a basis (see below) is not fundamentally different. If the lattice is infinite and completely regular, the system is a perfect crystal. In addition, we neglect all thermal motion, so that there is no need for averaging in (4). As in (2), we can write

$$ S(\mathbf{q}) = \frac{1}{N} \left| \sum_{k=1}^{N} e^{-i\mathbf{q}\cdot\mathbf{R}_k} \right|^2. $$

The structure factor is then simply the squared modulus of the Fourier transform of the lattice, and it is itself a periodic arrangement of points, known as the reciprocal lattice.

One dimension
The reciprocal lattice is easily constructed in one dimension: for particles on a line with period $a$, the atom positions are $x_k = ka$ (for simplicity, we consider that the number of particles $N$ is odd). The sum of the phase factors is a simple geometric series, and the structure factor becomes:

$$ S(q) = \frac{1}{N} \left| \sum_{k} e^{-iqka} \right|^2 = \frac{1}{N} \frac{\sin^2(Nqa/2)}{\sin^2(qa/2)}. $$

This function is shown in the Figure for different particle numbers $N$.

Based on this expression for $S(q)$, one can draw several conclusions: the reciprocal lattice has a spacing of $2\pi/a$; the intensity of the maxima increases with the number of particles $N$ (this is apparent from the Figure and can be shown by estimating the limit at the maxima, $S \to N$, using, for instance, L'Hôpital's rule); the intensity at the midpoint between maxima is $1/N$ (by direct evaluation); and the peak width also decreases like $1/N$. In the large-$N$ limit, the peaks become infinitely sharp Dirac delta functions.
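The geometric-series result above is easy to verify numerically: the sketch below compares the closed form with a direct evaluation of the sum for a finite chain. The particular values of q and N are arbitrary illustrations, as are the function names.

```python
import numpy as np

def S_direct(q, N, a=1.0):
    """Direct evaluation of S(q) for a chain of N atoms at positions x_k = k*a."""
    k = np.arange(N)
    amp = np.exp(-1j * q * k * a).sum()
    return (amp * amp.conjugate()).real / N

def S_closed(q, N, a=1.0):
    """Closed form S(q) = sin^2(N q a / 2) / (N sin^2(q a / 2)) from the geometric series."""
    return np.sin(N * q * a / 2) ** 2 / (N * np.sin(q * a / 2) ** 2)

q = 1.3                                        # any q that is not a multiple of 2*pi/a
print(S_direct(q, N=21), S_closed(q, N=21))    # the two values agree
print(S_direct(2 * np.pi, N=21))               # at a reciprocal-lattice point, S = N = 21
```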


Two dimensions
In two dimensions, there are only five Bravais lattices. The corresponding reciprocal lattices have the same symmetry as the direct lattice. The Figure shows the construction of one vector of the reciprocal lattice and its relation with a scattering experiment. A parallel beam, with wave vector $\mathbf{k}_i$, is incident on a square lattice of parameter $a$. The scattered wave is detected at a certain angle, which defines the wave vector of the outgoing beam, $\mathbf{k}_f$ (under the assumption of elastic scattering, $|\mathbf{k}_f| = |\mathbf{k}_i|$). One can equally define the scattering vector $\mathbf{q} = \mathbf{k}_f - \mathbf{k}_i$ and construct the harmonic pattern $e^{i\mathbf{q}\cdot\mathbf{r}}$. In the depicted example, the spacing of this pattern coincides with the distance between particle rows, $q = 2\pi/a$, so that contributions to the scattering from all particles are in phase (constructive interference). Thus, the total signal in direction $\mathbf{q}$ is strong, and $\mathbf{q}$ belongs to the reciprocal lattice. It is easily shown that this configuration fulfills Bragg's law.

Diagram of scattering by a square (planar) lattice: the incident and outgoing beams are shown, as well as the relation between their wave vectors $\mathbf{k}_i$, $\mathbf{k}_f$ and the scattering vector $\mathbf{q}$.

Lattice with a basis


To compute structure factors for a specific lattice, compute the sum above over the atoms in the unit cell. Since crystals are often described in terms of their Miller indices, it is useful to examine a specific structure factor in terms of these.

Body-centered cubic (BCC)

As a convention, the body-centered cubic system is described in terms of a simple cubic lattice with primitive vectors $a\hat{x}$, $a\hat{y}$, $a\hat{z}$, with a basis consisting of $\mathbf{r}_1 = (0,0,0)$ and $\mathbf{r}_2 = \frac{a}{2}(\hat{x}+\hat{y}+\hat{z})$. The corresponding reciprocal lattice is also simple cubic, with side $2\pi/a$. In a monoatomic crystal, all the form factors $f$ are the same. The intensity of a diffracted beam scattered with a vector $\mathbf{q} = \frac{2\pi}{a}(h\hat{x}+k\hat{y}+l\hat{z})$ by a crystal plane with Miller indices $(hkl)$ is then

$$ F_{hkl} = f\left[1 + e^{-i\pi(h+k+l)}\right]. $$

We then arrive at the following result for the structure factor for scattering from a plane $(hkl)$:

$$ |F_{hkl}|^2 = \begin{cases} 4f^2, & h+k+l \ \text{even} \\ 0, & h+k+l \ \text{odd.} \end{cases} $$
This result tells us that for a reflection to appear in a diffraction experiment involving a body-centered crystal, the sum of the Miller indices of the scattering plane must be even. If the sum of the Miller indices is odd, the intensity of the diffracted beam is reduced to zero due to destructive interference. This zero intensity for a group of diffracted beams is called a systematic absence. Since atomic form factors fall off with increasing diffraction angle corresponding to higher Miller indices, the most intense diffraction peak from a material with a BCC structure is typically the (110). The (110) plane is the most densely packed of BCC crystal structures and is therefore the lowest energy surface for a thin film to grow. Films of BCC materials like iron and tungsten therefore grow in a characteristic (110) orientation.

Face-centered cubic (FCC)

In the case of a monoatomic FCC crystal, the atoms in the basis are at the origin, with indices (0,0,0), and at the three face centers, with indices (1/2,1/2,0), (0,1/2,1/2), and (1/2,0,1/2). An argument similar to the one above gives the expression

$$ F_{hkl} = f\left[1 + e^{-i\pi(h+k)} + e^{-i\pi(k+l)} + e^{-i\pi(h+l)}\right], $$

with the result

$$ |F_{hkl}|^2 = \begin{cases} 16 f^2, & h, k, l \ \text{all even or all odd} \\ 0, & h, k, l \ \text{of mixed parity.} \end{cases} $$
The most intense diffraction peak from a material that crystallizes in the FCC structure is typically the (111). Films of FCC materials like gold tend to grow in a (111) orientation with a triangular surface symmetry.

Diamond crystal structure

The diamond cubic crystal structure occurs in diamond (carbon), most semiconductors, and tin. The conventional cell contains 8 atoms located at the cell positions

$$ (0,0,0),\ (\tfrac12,\tfrac12,0),\ (0,\tfrac12,\tfrac12),\ (\tfrac12,0,\tfrac12),\ (\tfrac14,\tfrac14,\tfrac14),\ (\tfrac34,\tfrac34,\tfrac14),\ (\tfrac14,\tfrac34,\tfrac34),\ (\tfrac34,\tfrac14,\tfrac34). $$

The structure factor then takes on the form

$$ F_{hkl} = f\left[1 + e^{-i\pi(h+k)} + e^{-i\pi(k+l)} + e^{-i\pi(h+l)}\right]\left[1 + e^{-i\frac{\pi}{2}(h+k+l)}\right], $$

with the result:
for mixed (odd and even) values of h, k, and l, $|F|^2 = 0$;
if h, k, l are unmixed and h + k + l is odd, then $F = 4f(1 \pm i)$ and $FF^* = 32f^2$;
if h + k + l is even and exactly divisible by 4 (h + k + l = 4n), then $F = 8f$;
if h + k + l is even but not exactly divisible by 4, then $F = 0$.
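The systematic absences listed above for the bcc, fcc and diamond structures can be reproduced by summing the phase factors over the atoms of the conventional cell, as in the following sketch. A single, q-independent form factor f = 1 is assumed for simplicity, and the dictionary of basis positions is only an illustrative way of organizing the data.

```python
import numpy as np

BASES = {
    "bcc":     [(0, 0, 0), (0.5, 0.5, 0.5)],
    "fcc":     [(0, 0, 0), (0.5, 0.5, 0), (0, 0.5, 0.5), (0.5, 0, 0.5)],
    "diamond": [(0, 0, 0), (0.5, 0.5, 0), (0, 0.5, 0.5), (0.5, 0, 0.5),
                (0.25, 0.25, 0.25), (0.75, 0.75, 0.25),
                (0.25, 0.75, 0.75), (0.75, 0.25, 0.75)],
}

def structure_factor_hkl(lattice, hkl, f=1.0):
    """F(hkl) = f * sum_j exp(-2*pi*i*(h*x_j + k*y_j + l*z_j)) over the conventional-cell basis."""
    h, k, l = hkl
    return f * sum(np.exp(-2j * np.pi * (h * x + k * y + l * z))
                   for x, y, z in BASES[lattice])

# |F|^2 for a few reflections; zeros mark the systematic absences discussed above.
for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 2, 0), (2, 2, 2)]:
    print(hkl, {name: round(abs(structure_factor_hkl(name, hkl)) ** 2, 1) for name in BASES})
```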


Imperfect crystals
Although the perfect lattice is an extremely useful model, real crystals always exhibit imperfections, which can have profound effects on the structure and properties of the material. André Guinier proposed a widely employed distinction between imperfections that preserve the long-range order of the crystal (disorder of the first kind) and those that destroy it (disorder of the second kind).

Liquids
In contrast with crystals, liquids have no long-range order (in particular, there is no regular lattice), so the structure factor does not exhibit sharp peaks. They do however show a certain degree of short-range order, depending on their density and on the strength of the interaction between particles. Liquids are isotropic, so that, after the averaging operation in Equation (4), the structure factor only depends on the absolute magnitude of the scattering vector, $q = |\mathbf{q}|$. For further evaluation, it is convenient to separate the diagonal terms $j = k$ in the double sum, whose phase is identically zero and which therefore each contribute a unit constant:

$$ S(q) = 1 + \frac{1}{N}\left\langle \sum_{j \neq k} e^{-i\mathbf{q}\cdot(\mathbf{R}_j - \mathbf{R}_k)} \right\rangle. \qquad (5) $$

One can obtain an alternative expression for $S(q)$ in terms of the radial distribution function $g(r)$:

$$ S(q) = 1 + \rho \int_V g(r)\, e^{-i\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{r}. \qquad (6) $$
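Equation (6) is often evaluated in an isotropic form written with g(r) − 1, which removes the forward-scattering contribution and makes the integral converge; the sketch below uses that commonly quoted form, S(q) = 1 + 4πρ ∫ [g(r) − 1] r² sin(qr)/(qr) dr. The step-function g(r) is only a crude illustrative input, not a realistic liquid, and the function name is arbitrary.

```python
import numpy as np

def S_from_gr(q, r, g, rho):
    """S(q) = 1 + 4*pi*rho * integral of (g(r) - 1) * r^2 * sin(q r)/(q r) dr
    (isotropic form of Eq. (6), written with g(r) - 1 so the integral converges)."""
    integrand = (g - 1.0) * r * np.sin(q * r) / q
    return 1.0 + 4.0 * np.pi * rho * np.trapz(integrand, r)

# Crude test: a dilute hard-sphere-like g(r) that is 0 inside r < sigma and 1 outside.
sigma, rho = 1.0, 0.05
r = np.linspace(1e-6, 20.0, 4000)
g = np.where(r < sigma, 0.0, 1.0)
print([round(S_from_gr(q, r, g, rho), 3) for q in (0.5, 2.0, 8.0)])
```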

Ideal gas
In the limiting case of no interaction, the system is an ideal gas and the structure factor is completely featureless, $S(q) = 1$, because there is no correlation between the positions $\mathbf{R}_j$ and $\mathbf{R}_k$ of different particles (they are independent random variables), so the off-diagonal terms in Equation (5) average to zero: $\langle e^{-i\mathbf{q}\cdot(\mathbf{R}_j - \mathbf{R}_k)} \rangle = \langle e^{-i\mathbf{q}\cdot\mathbf{R}_j} \rangle \langle e^{+i\mathbf{q}\cdot\mathbf{R}_k} \rangle = 0$.

High-q limit

Even for interacting particles, at high scattering vector the structure factor goes to 1. This result follows from Equation (6), since $S(q) - 1$ is the Fourier transform of the "regular" function $g(r)$ and thus goes to zero for high values of the argument $q$. This reasoning does not hold for a perfect crystal, where the distribution function exhibits infinitely sharp peaks.


Low-q limit

In the low-q limit, as the system is probed over large length scales, the structure factor contains thermodynamic information, being related to the isothermal compressibility $\chi_T$ of the liquid by the compressibility equation:

$$ \lim_{q \to 0} S(q) = \rho\, k_B T\, \chi_T. $$

Hard-sphere liquids
In the hard sphere model, the particles are described as impenetrable spheres with radius $R$; thus, their center-to-center distance satisfies $r \geq 2R$, and they experience no interaction beyond this distance. Their interaction potential can be written as:

$$ V(r) = \begin{cases} \infty, & r < 2R \\ 0, & r \geq 2R. \end{cases} $$

This model has an analytical solution in the Percus–Yevick approximation. Although highly simplified, it provides a good description for systems ranging from liquid metals to colloidal suspensions. As an illustration, the structure factor of a hard-sphere fluid calculated in this approximation develops increasingly pronounced short-range-order peaks as the volume fraction increases from 1% to 40%.
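A minimal sketch of this calculation, assuming the standard Wertheim–Thiele closed form for the Percus–Yevick direct correlation function of hard spheres and Fourier-transforming it numerically; S(q) = 1/(1 − ρ ĉ(q)). The parameter values and function name are illustrative.

```python
import numpy as np

def percus_yevick_S(q, eta, sigma=1.0, n_r=2000):
    """Hard-sphere structure factor in the Percus-Yevick approximation:
    S(q) = 1 / (1 - rho * c_hat(q)), with the Wertheim-Thiele direct correlation
    function c(r) for r < sigma and c(r) = 0 beyond contact."""
    rho = 6.0 * eta / (np.pi * sigma ** 3)            # number density from packing fraction eta
    lam1 = (1 + 2 * eta) ** 2 / (1 - eta) ** 4
    lam2 = -(1 + eta / 2) ** 2 / (1 - eta) ** 4
    r = np.linspace(1e-9, sigma, n_r)
    c = -(lam1 + 6 * eta * lam2 * (r / sigma) + 0.5 * eta * lam1 * (r / sigma) ** 3)
    c_hat = 4 * np.pi * np.trapz(c * r * np.sin(q * r) / q, r)   # isotropic 3D Fourier transform
    return 1.0 / (1.0 - rho * c_hat)

qs = np.linspace(0.5, 15.0, 6)
print([round(percus_yevick_S(q, eta=0.3), 3) for q in qs])   # main peak appears near q*sigma ~ 2*pi
```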

Polymers
In polymer systems, the general definition (4) holds; the elementary constituents are now the monomers making up the chains. However, the structure factor being a measure of the correlation between particle positions, one can reasonably expect that this correlation will be different for monomers belonging to the same chain or to different chains. Let us assume that the volume $V$ contains $N_c$ identical molecules, each composed of $N_p$ monomers, such that $N_c N_p = N$ ($N_p$ is also known as the degree of polymerization). We can rewrite (4) as:

$$ S(\mathbf{q}) = \frac{1}{N_c N_p} \left\langle \sum_{\alpha,\beta=1}^{N_c} \sum_{i,j=1}^{N_p} e^{-i\mathbf{q}\cdot(\mathbf{R}_{\alpha i} - \mathbf{R}_{\beta j})} \right\rangle, \qquad (7) $$

where the indices $\alpha, \beta$ label the different molecules and $i, j$ the different monomers along each molecule. On the right-hand side, the intramolecular ($\alpha = \beta$) and intermolecular ($\alpha \neq \beta$) terms can be separated. Using the equivalence of the chains, (7) can be simplified to:

$$ S(\mathbf{q}) = \frac{1}{N_p} \left\langle \sum_{i,j=1}^{N_p} e^{-i\mathbf{q}\cdot(\mathbf{R}_{1 i} - \mathbf{R}_{1 j})} \right\rangle + \frac{N_c - 1}{N_p} \left\langle \sum_{i,j=1}^{N_p} e^{-i\mathbf{q}\cdot(\mathbf{R}_{1 i} - \mathbf{R}_{2 j})} \right\rangle, \qquad (8) $$

where the first term is the single-chain structure factor $S_1(\mathbf{q})$.


Crystallography
Crystallography is the science that examines the arrangement of atoms in solids. The word "crystallography" derives from the Greek words crystallon = cold drop / frozen drop, with its meaning extending to all solids with some degree of transparency, and grapho = write. A more comprehensive definition is: "Crystallography is the science of condensed matter with emphasis on the atomic or molecular structure and its relation to physical and chemical properties. "

Before the development of X-ray diffraction crystallography (see below), the study of crystals was based on their geometry. This involves measuring the angles of crystal faces relative to theoretical reference axes (crystallographic axes), and establishing the symmetry of the crystal in question. The former is carried out using a goniometer. The position in 3D space of each crystal face is plotted on a stereographic net, e.g. a Wulff net or Lambert net. In fact, the pole to each face is plotted on the net. Each point is labelled with its Miller index. The final plot allows the symmetry of the crystal to be established.

Crystallographic methods now depend on the analysis of the diffraction patterns of a sample targeted by a beam of some type. Although X-rays are most commonly used, the beam is not always electromagnetic radiation. For some purposes electrons or neutrons are used. This is facilitated by the wave properties of the particles. Crystallographers often explicitly state the type of illumination used when referring to a method, as with the terms X-ray diffraction, neutron diffraction and electron diffraction. These three types of radiation interact with the specimen in different ways. X-rays interact with the spatial distribution of the valence electrons, while electrons are charged particles and therefore feel the total charge distribution of both the atomic nuclei and the surrounding electrons. Neutrons are scattered by the atomic nuclei through the strong nuclear forces, but in addition, the magnetic moment of neutrons is non-zero. They are therefore also scattered by magnetic fields. When neutrons are scattered from hydrogen-containing materials, they produce diffraction patterns with high noise levels. However, the material can sometimes be treated to substitute deuterium for hydrogen. Because of these different forms of interaction, the three types of radiation are suitable for different crystallographic studies.

A crystalline solid: atomic resolution image of strontium titanate. Brighter atoms are Sr and darker ones are Ti.

Theory


Generally, an image of a small object is made using a lens to focus the illuminating radiation, as is done with the rays of the visible spectrum in light microscopy. However, the wavelength of visible light (about 4000 to 7000 ångström) is three orders of magnitude longer than the length of typical atomic bonds and atoms themselves (about 1 to 2 Å). Therefore, obtaining information about the spatial arrangement of atoms requires the use of radiation with shorter wavelengths, such as X-ray or neutron beams. Employing shorter wavelengths implied abandoning microscopy and true imaging, however, because there exists no material from which a lens capable of focusing this type of radiation can be created. (That said, scientists have had some success focusing X-rays with microscopic Fresnel zone plates made from gold, and by critical-angle reflection inside long tapered capillaries.) Diffracted X-ray or neutron beams cannot be focused to produce images, so the sample structure must be reconstructed from the diffraction pattern. Sharp features in the diffraction pattern arise from periodic, repeating structure in the sample, which are often very strong due to coherent reflection of many photons from many regularly spaced instances of similar structure, while non-periodic components of the structure result in diffuse (and usually weak) diffraction features. Said more simply, areas with a higher density and repetition of atom order tend to reflect more light toward one point in space when compared to those areas with fewer atoms and less repetition. Because of their highly ordered and repetitive structure, crystals give diffraction patterns of sharp Bragg reflection spots, and are ideal for analyzing the structure of solids.

Notation
Coordinates in square brackets such as [100] denote a direction vector (in real space). Coordinates in angle brackets or chevrons such as <100> denote a family of directions which are related by symmetry operations. In the cubic crystal system for example, <100> would mean [100], [010], [001] or the negative of any of those directions. Miller indices in parentheses such as (100) denote a plane of the crystal structure, and regular repetitions of that plane with a particular spacing. In the cubic system, the normal to the (hkl) plane is the direction [hkl], but in lower-symmetry cases, the normal to (hkl) is not parallel to [hkl].

Indices in curly brackets or braces such as {100} denote a family of planes and their normals which are equivalent in cubic materials due to symmetry operations, much the way angle brackets denote a family of directions. In non-cubic materials, <hkl> is not necessarily perpendicular to {hkl}.


Technique
Some materials studied using crystallography, proteins for example, do not occur naturally as crystals. Typically, such molecules are placed in solution and allowed to crystallize over days, weeks, or months through vapor diffusion. A drop of solution containing the molecule, buffer, and precipitants is sealed in a container with a reservoir containing a hygroscopic solution. Water in the drop diffuses to the reservoir, slowly increasing the concentration and allowing a crystal to form. If the concentration were to rise more quickly, the molecule would simply precipitate out of solution, resulting in disorderly granules rather than an orderly and hence usable crystal.

Once a crystal is obtained, data can be collected using a beam of radiation. Although many universities that engage in crystallographic research have their own X-ray producing equipment, synchrotrons are often used as X-ray sources, because of the purer and more complete patterns such sources can generate. Synchrotron sources also have a much higher intensity of X-ray beams, so data collection takes a fraction of the time normally necessary at weaker sources. Complementary neutron crystallography techniques are used to refine the positions of hydrogen atoms; such techniques are available at neutron facilities.

Producing an image from a diffraction pattern requires sophisticated mathematics and often an iterative process of modelling and refinement. In this process, the mathematically predicted diffraction patterns of a hypothesized or "model" structure are compared to the actual pattern generated by the crystalline sample. Ideally, researchers make several initial guesses, which through refinement all converge on the same answer. Models are refined until their predicted patterns match to as great a degree as can be achieved without radical revision of the model. This is a painstaking process, made much easier today by computers.

The mathematical methods for the analysis of diffraction data only apply to patterns, which in turn result only when waves diffract from orderly arrays. Hence crystallography applies for the most part only to crystals, or to molecules which can be coaxed to crystallize for the sake of measurement. In spite of this, a certain amount of molecular information can be deduced from the patterns that are generated by fibers and powders, which, while not as perfect as a solid crystal, may exhibit a degree of order. This level of order can be sufficient to deduce the structure of simple molecules, or to determine the coarse features of more complicated molecules. For example, the double-helical structure of DNA was deduced from an X-ray diffraction pattern that had been generated by a fibrous sample.

Crystallography in materials engineering


Crystallography is a tool that is often employed by materials scientists. In single crystals, the effects of the crystalline arrangement of atoms are often easy to see macroscopically, because the natural shapes of crystals reflect the atomic structure. In addition, physical properties are often controlled by crystalline defects. The understanding of crystal structures is an important prerequisite for understanding crystallographic defects. Most materials do not occur in single-crystalline but in polycrystalline form, such that the powder diffraction method plays a most important role in structural determination.

A number of other physical properties are linked to crystallography. For example, the minerals in clay form small, flat, platelike structures. Clay can be easily deformed because the platelike particles can slip along each other in the plane of the plates, yet remain strongly connected in the direction perpendicular to the plates. Such mechanisms can be studied by crystallographic texture measurements. In another example, iron transforms from a body-centered cubic (bcc) structure to a face-centered cubic (fcc) structure called austenite when it is heated. The fcc structure is a close-packed structure, and the bcc structure is not, which explains why the volume of the iron decreases when this transformation occurs.

Crystallography is useful in phase identification. When performing any process on a material, it may be desired to find out what compounds and what phases are present in the material. Each phase has a characteristic arrangement of atoms. Techniques like X-ray or neutron diffraction can be used to identify which patterns are present in the material, and thus which compounds are present. Crystallography covers the enumeration of the symmetry patterns which can be formed by atoms in a crystal and for this reason has a relation to group theory and geometry. See symmetry group.


Biology
X-ray crystallography is the primary method for determining the molecular conformations of biological macromolecules, particularly proteins and nucleic acids such as DNA and RNA. In fact, the double-helical structure of DNA was deduced from crystallographic data. The first crystal structure of a macromolecule, a three-dimensional model of the myoglobin molecule obtained by X-ray analysis, was solved in 1958. The Protein Data Bank (PDB) is a freely accessible repository for the structures of proteins and other biological macromolecules. Computer programs like RasMol or PyMOL can be used to visualize biological molecular structures. Neutron crystallography is often used to help refine structures obtained by X-ray methods or to solve a specific bond; the methods are often viewed as complementary, as X-rays are sensitive to electron positions and scatter most strongly off heavy atoms, while neutrons are sensitive to nucleus positions and scatter strongly off many light isotopes, including hydrogen and deuterium. Electron crystallography has been used to determine some protein structures, most notably membrane proteins and viral capsids.

Scientists of note
William Astbury, William Barlow, John Desmond Bernal, William Henry Bragg, William Lawrence Bragg, Auguste Bravais, Martin Julian Buerger, Francis Crick, Pierre Curie, Peter Debye, Boris Delone, Gautam R. Desiraju, Jack Dunitz, Paul Peter Ewald, Evgraf Stepanovich Fedorov, Rosalind Franklin, Georges Friedel, Paul Heinrich von Groth, René Just Haüy, Carl Hermann, Johann Friedrich Christian Hessel, Dorothy Crowfoot Hodgkin, Robert Huber, Aaron Klug, Max von Laue, Kathleen Lonsdale, Ernest-François Mallard, Charles-Victor Mauguin, William Hallowes Miller, Friedrich Mohs, Paul Niggli, Arthur Lindo Patterson, Max Perutz, Hugo Rietveld, Jean-Baptiste L. Romé de l'Isle, Paul Scherrer, Arthur Moritz Schönflies, Dan Shechtman, Nicolas Steno, Tej P. Singh, Constance Tipper, Christian Samuel Weiss, Don Craig Wiley, Ralph Walter Graystone Wyckoff, Ada Yonath, George M. Sheldrick, Jerome Karle


Miller index
Miller indices form a notation system in crystallography for planes in crystal (Bravais) lattices. In particular, a family of lattice planes is determined by three integers h, k, and ℓ, the Miller indices. They are written (hkℓ), and each index denotes a plane orthogonal to a direction (h, k, ℓ) in the basis of the reciprocal lattice vectors. By convention, negative integers are written with a bar, as in $\bar{3}$ for −3. The integers are usually written in lowest terms, i.e. their greatest common divisor should be 1. Miller index 100 represents a plane orthogonal to direction h; index 010 represents a plane orthogonal to direction k; and index 001 represents a plane orthogonal to direction ℓ. There are also several related notations: the notation {hkℓ} denotes the set of all planes that are equivalent to (hkℓ) by the symmetry of the lattice. In the context of crystal directions (not planes), the corresponding notations are: [hkℓ], with square instead of round brackets, which denotes a direction in the basis of the direct lattice vectors instead of the reciprocal lattice; and similarly, the notation ⟨hkℓ⟩, which denotes the set of all directions that are equivalent to [hkℓ] by symmetry.

Planes with different Miller indices in cubic crystals

Miller indices were introduced in 1839 by the British mineralogist William Hallowes Miller. The method was also historically known as the Millerian system, and the indices as Millerian, although this is now rare.

The Miller indices are defined with respect to any choice of unit cell and not only with respect to primitive basis vectors, as is sometimes stated.

Definition
There are two equivalent ways to define the meaning of the Miller indices: via a point in the reciprocal lattice, or as the inverse intercepts along the lattice vectors. Both definitions are given below. In either case, one needs to choose the three lattice vectors a1, a2, and a3 that define the unit cell (note that the conventional unit cell may be larger than the primitive cell of the Bravais lattice, as the examples below illustrate). Given these, the three primitive reciprocal lattice vectors are also determined (denoted b1, b2, and b3).

Examples of determining indices for a plane using intercepts with axes; left (111), right (221)

Then, given the three Miller indices h, k, ℓ, (hkℓ) denotes planes orthogonal to the reciprocal lattice vector

$$ \mathbf{g}_{hk\ell} = h\mathbf{b}_1 + k\mathbf{b}_2 + \ell\mathbf{b}_3. $$

That is, (hkℓ) simply indicates a normal to the planes in the basis of the primitive reciprocal lattice vectors. Because the coordinates are integers, this normal is itself always a reciprocal lattice vector. The requirement of lowest terms means that it is the shortest reciprocal lattice vector in the given direction. Equivalently, (hkℓ) denotes a plane that intercepts the three points a1/h, a2/k, and a3/ℓ, or some multiple thereof. That is, the Miller indices are proportional to the inverses of the intercepts of the plane, in the basis of the lattice vectors. If one of the indices is zero, it means that the planes do not intersect that axis (the intercept is "at infinity"). Considering only (hkℓ) planes intersecting one or more lattice points (the lattice planes), the perpendicular distance d between adjacent lattice planes is related to the (shortest) reciprocal lattice vector orthogonal to the planes by the formula

$$ d = \frac{2\pi}{|\mathbf{g}_{hk\ell}|}. $$

The related notation [hkℓ] denotes the direction

$$ h\mathbf{a}_1 + k\mathbf{a}_2 + \ell\mathbf{a}_3. $$

That is, it uses the direct lattice basis instead of the reciprocal lattice. Note that [hkℓ] is not generally normal to the (hkℓ) planes, except in a cubic lattice as described below.
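The two formulas above translate directly into code: the sketch below builds the reciprocal basis from arbitrary lattice vectors and evaluates d = 2π/|g_hkℓ|. The cubic test case is just a sanity check, and the function names are arbitrary.

```python
import numpy as np

def reciprocal_vectors(a1, a2, a3):
    """Primitive reciprocal lattice vectors b_i such that a_i . b_j = 2*pi*delta_ij (rows of the result)."""
    A = np.array([a1, a2, a3], dtype=float)
    return 2.0 * np.pi * np.linalg.inv(A).T

def d_spacing(hkl, a1, a2, a3):
    """Interplanar distance d = 2*pi / |h*b1 + k*b2 + l*b3|."""
    b = reciprocal_vectors(a1, a2, a3)
    g = np.array(hkl, dtype=float) @ b
    return 2.0 * np.pi / np.linalg.norm(g)

# Simple cubic cell with a = 4.0: d(110) should equal a / sqrt(2) ~ 2.828
a = 4.0
print(d_spacing((1, 1, 0), [a, 0, 0], [0, a, 0], [0, 0, a]))
```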


Case of cubic structures


For the special case of simple cubic crystals, the lattice vectors are orthogonal and of equal length (usually denoted a), as are those of the reciprocal lattice. Thus, in this common case, the Miller indices (hkℓ) and [hkℓ] both simply denote normals/directions in Cartesian coordinates. For cubic crystals with lattice constant a, the spacing d between adjacent (hkℓ) lattice planes is (from above)

$$ d_{hk\ell} = \frac{a}{\sqrt{h^2 + k^2 + \ell^2}}. $$

Because of the symmetry of cubic crystals, it is possible to change the place and sign of the integers and have equivalent directions and planes: Coordinates in angle brackets such as <100> denote a family of directions which are equivalent due to symmetry operations, such as [100], [010], [001] or the negative of any of those directions. Coordinates in curly brackets or braces such as {100} denote a family of plane normals which are equivalent due to symmetry operations, much the way angle brackets denote a family of directions. For face-centered cubic and body-centered cubic lattices, the primitive lattice vectors are not orthogonal. However, in these cases the Miller indices are conventionally defined relative to the lattice vectors of the cubic supercell and hence are again simply the Cartesian directions.

Case of hexagonal and rhombohedral structures


With hexagonal and rhombohedral lattice systems, it is possible to use the Bravais–Miller index, which has 4 numbers (h k i l) with i = −(h + k). Here h, k and l are identical to the Miller indices, and i is a redundant index. This four-index scheme for labeling planes in a hexagonal lattice makes permutation symmetries apparent. For example, the similarity between (110) ≡ (11$\bar{2}$0) and ($\bar{1}$20) ≡ ($\bar{1}$2$\bar{1}$0) is more obvious when the redundant index is shown. In the figure at right, the (001) plane has a 3-fold symmetry: it remains unchanged by a rotation of 1/3 turn (2π/3 rad, 120°). The [100], [010] and the [$\bar{1}\bar{1}$0] directions are really similar. If S is the intercept of the plane with the [$\bar{1}\bar{1}$0] axis, then i = 1/S. There are also ad hoc schemes (e.g. in the transmission electron microscopy literature) for indexing hexagonal lattice vectors (rather than reciprocal lattice vectors or planes) with four indices. However, they don't operate by similarly adding a redundant index to the regular three-index set. For example, the reciprocal lattice vector (hkℓ) as suggested above can be written as h a* + k b* + ℓ c* if the reciprocal-lattice basis vectors are a*, b*, and c*. For hexagonal crystals this may be expressed in terms of the direct-lattice basis vectors a, b and c as

$$ h\mathbf{a}^* + k\mathbf{b}^* + \ell\mathbf{c}^* = \frac{2}{3a^2}(2h + k)\,\mathbf{a} + \frac{2}{3a^2}(h + 2k)\,\mathbf{b} + \frac{\ell}{c^2}\,\mathbf{c}. $$
Miller-Bravais indices

Hence zone indices of the direction perpendicular to the plane (hkℓ) are, in suitably normalized triplet form, simply [2h + k, h + 2k, ℓ(3/2)(a/c)²]. When four indices are used for the zone normal to the plane (hkℓ), however, the literature often uses [h, k, −h−k, ℓ(3/2)(a/c)²] instead. Thus, as you can see, four-index zone indices in square or angle brackets sometimes mix a single direct-lattice index on the right with reciprocal-lattice indices (normally in round or curly brackets) on the left.
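The redundant fourth index is purely a bookkeeping device, as the following small sketch of the conversion between three-index and four-index plane notation illustrates (function names are of course arbitrary).

```python
def to_bravais_miller(h, k, l):
    """(h k l) -> four-index Bravais-Miller form (h k i l) with i = -(h + k)."""
    return (h, k, -(h + k), l)

def to_miller(h, k, i, l):
    """(h k i l) -> (h k l); the third index is redundant and must satisfy i = -(h + k)."""
    if i != -(h + k):
        raise ValueError("invalid Bravais-Miller indices: i must equal -(h + k)")
    return (h, k, l)

print(to_bravais_miller(1, 1, 0))    # (1, 1, -2, 0)
print(to_miller(1, 0, -1, 0))        # (1, 0, 0)
```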


The crystallographic planes and directions


The crystallographic directions are fictitious lines linking nodes (atoms, ions or molecules) of a crystal. Similarly, the crystallographic planes are fictitious planes linking nodes. Some directions and planes have a higher density of nodes; these dense planes have an influence on the behaviour of the crystal:

Optical properties: in condensed matter, light "jumps" from one atom to the other by Rayleigh scattering; the velocity of light thus varies according to direction, depending on whether the atoms are close or far apart; this gives rise to birefringence.
Adsorption and reactivity: adsorption and chemical reactions occur on atoms or molecules; these phenomena are thus sensitive to the density of nodes.
Surface tension: the condensation of a material means that the atoms, ions or molecules are more stable if they are surrounded by other similar species; the surface tension of an interface thus varies according to the density on the surface.
Microstructural defects: pores and crystallites tend to have straight grain boundaries following dense planes.
Cleavage.
Dislocations (plastic deformation): the dislocation core tends to spread on dense planes (the elastic perturbation is "diluted"); this reduces the friction (Peierls–Nabarro force), so sliding occurs more frequently on dense planes; the perturbation carried by the dislocation (Burgers vector) is along a dense direction, since the shift of one node in a dense direction is a lesser distortion; the dislocation line tends to follow a dense direction, so the dislocation line is often a straight line and a dislocation loop is often a polygon.

For all these reasons, it is important to determine the planes and thus to have a notation system.

Integer vs. irrational Miller indices: Lattice planes and quasicrystals


Ordinarily, Miller indices are always integers by definition, and this constraint is physically significant. To understand this, suppose that we allow a plane (abc) where the Miller "indices" a, b and c (defined as above) are not necessarily integers. If a, b and c have rational ratios, then the same family of planes can be written in terms of integer indices (hkℓ) by scaling a, b and c appropriately: divide by the largest of the three numbers, and then multiply by the least common denominator. Thus, integer Miller indices implicitly include indices with all rational ratios. The reason why planes where the components (in the reciprocal-lattice basis) have rational ratios are of special interest is that these are the lattice planes: they are the only planes whose intersections with the crystal are 2d-periodic. For a plane (abc) where a, b and c have irrational ratios, on the other hand, the intersection of the plane with the crystal is not periodic. It forms an aperiodic pattern known as a quasicrystal. This construction corresponds precisely to the standard "cut-and-project" method of defining a quasicrystal, using a plane with irrational-ratio Miller indices. (Although many quasicrystals, such as the Penrose tiling, are formed by "cuts" of periodic lattices in more than three dimensions, involving the intersection of more than one such hyperplane.)

Miller index

229

Crystal structure
In mineralogy and crystallography, crystal structure is a unique arrangement of atoms or molecules in a crystalline liquid or solid. A crystal structure is composed of a pattern, a set of atoms arranged in a particular way, and a lattice exhibiting long-range order and symmetry. Patterns are located upon the points of a lattice, which is an array of points repeating periodically in three dimensions. The points can be thought of as forming identical tiny boxes, called unit cells, that fill the space of the lattice. The lengths of the edges of a unit cell and the angles between them are called the lattice parameters. The symmetry properties of the crystal are embodied in its space group. A crystal's structure and symmetry play a role in determining many of its physical properties, such as cleavage, electronic band structure, and optical transparency.

Unit cell
The crystal structure of a material (the arrangement of atoms within a given type of crystal) can be described in terms of its unit cell. The unit cell is a small box containing one or more atoms arranged in three dimensions. The unit cells stacked in three-dimensional space describe the bulk arrangement of atoms of the crystal. The unit cell is given by its lattice parameters, which are the lengths of the cell edges and the angles between them, while the positions of the atoms inside the unit cell are described by the set of atomic positions (xi, yi, zi) measured from a lattice point.

Insulin crystals


Simple cubic (P)

Body-centered cubic (I)

Face-centered cubic (F)

Within the unit cell is the asymmetric unit, the smallest unit into which the crystal can be divided using the crystallographic symmetry operations of the space group. The asymmetric unit is also what is generally solved when determining the structure of a molecule or protein by X-ray crystallography.

Miller indices
Vectors and atomic planes in a crystal lattice can be described by a three-value Miller index notation (ℓmn). The ℓ, m, and n directional indices are separated by 90°, and are thus orthogonal; in fact, the ℓ component is mutually perpendicular to the m and n indices.

By definition, (ℓmn) denotes a plane that intercepts the three points a1/ℓ, a2/m, and a3/n, or some multiple thereof. That is, the Miller indices are proportional to the inverses of the intercepts of the plane with the unit cell (in the basis of the lattice vectors). If one or more of the indices is zero, it means that the planes do not intersect that axis (i.e., the intercept is "at infinity"). A plane containing a coordinate axis is translated so that it no longer contains that axis before its Miller indices are determined. The Miller indices for a plane are integers with no common factors. Negative indices are indicated with horizontal bars, as in ($\bar{1}$23). In an orthogonal coordinate system, the Miller indices of a plane are the Cartesian components of a vector normal to the plane.

Planes with different Miller indices in cubic crystals

Considering only (ℓmn) planes intersecting one or more lattice points (the lattice planes), the perpendicular distance d between adjacent lattice planes is related to the (shortest) reciprocal lattice vector orthogonal to the planes by the formula

$$ d = \frac{2\pi}{|\mathbf{g}_{\ell m n}|}. $$

Planes and directions


The crystallographic directions are geometric lines linking nodes (atoms, ions or molecules) of a crystal. Likewise, the crystallographic planes are geometric planes linking nodes. Some directions and planes have a higher density of nodes. These high-density planes have an influence on the behavior of the crystal as follows:

Optical properties: refractive index is directly related to density (or periodic density fluctuations).
Adsorption and reactivity: physical adsorption and chemical reactions occur at or near surface atoms or molecules. These phenomena are thus sensitive to the density of nodes.
Surface tension: the condensation of a material means that the atoms, ions or molecules are more stable if they are surrounded by other similar species. The surface tension of an interface thus varies according to the density on the surface.
Microstructural defects: pores and crystallites tend to have straight grain boundaries following higher density planes.
Cleavage: this typically occurs preferentially parallel to higher density planes.
Plastic deformation: dislocation glide occurs preferentially parallel to higher density planes. The perturbation carried by the dislocation (Burgers vector) is along a dense direction. The shift of one node in a more dense direction requires a lesser distortion of the crystal lattice.

Some directions and planes are defined by the symmetry of the crystal system. In monoclinic, rhombohedral, tetragonal, and trigonal/hexagonal systems there is one unique axis (sometimes called the principal axis) which has higher rotational symmetry than the other two axes. The basal plane is the plane perpendicular to the principal axis in these crystal systems. For triclinic, orthorhombic, and cubic crystal systems the axis designation is arbitrary and there is no principal axis.

Cubic structures

For the special case of simple cubic crystals, the lattice vectors are orthogonal and of equal length (usually denoted a); similarly for the reciprocal lattice. So, in this common case, the Miller indices (ℓmn) and [ℓmn] both simply denote normals/directions in Cartesian coordinates. For cubic crystals with lattice constant a, the spacing d between adjacent (ℓmn) lattice planes is (from above)

$$ d_{\ell m n} = \frac{a}{\sqrt{\ell^2 + m^2 + n^2}}. $$

Because of the symmetry of cubic crystals, it is possible to change the place and sign of the integers and have equivalent directions and planes: Coordinates in angle brackets such as <100> denote a family of directions that are equivalent due to symmetry operations, such as [100], [010], [001] or the negative of any of those directions.

Coordinates in curly brackets or braces such as {100} denote a family of plane normals that are equivalent due to symmetry operations, much the way angle brackets denote a family of directions. For face-centered cubic (fcc) and body-centered cubic (bcc) lattices, the primitive lattice vectors are not orthogonal. However, in these cases the Miller indices are conventionally defined relative to the lattice vectors of the cubic supercell and hence are again simply the Cartesian directions.


Classification
The defining property of a crystal is its inherent symmetry, by which we mean that under certain 'operations' the crystal remains unchanged. All crystals have translational symmetry in three directions, but some have other symmetry elements as well. For example, rotating the crystal 180° about a certain axis may result in an atomic configuration that is identical to the original configuration. The crystal is then said to have a twofold rotational symmetry about this axis. In addition to rotational symmetries like this, a crystal may have symmetries in the form of mirror planes and translational symmetries, and also the so-called "compound symmetries," which are a combination of translation and rotation/mirror symmetries. A full classification of a crystal is achieved when all of these inherent symmetries of the crystal are identified.

Lattice systems
These lattice systems are a grouping of crystal structures according to the axial system used to describe their lattice. Each lattice system consists of a set of three axes in a particular geometrical arrangement. There are seven lattice systems. They are similar to but not quite the same as the seven crystal systems and the six crystal families.
The 7 lattice systems and their 14 Bravais lattices (from least to most symmetric):

1. triclinic (none): simple (primitive)
2. monoclinic (1 diad): simple, base-centered
3. orthorhombic (3 perpendicular diads): simple, base-centered, body-centered, face-centered
4. rhombohedral (1 triad): simple
5. tetragonal (1 tetrad): simple, body-centered
6. hexagonal (1 hexad): simple
7. cubic (4 triads): simple (SC), body-centered (bcc), face-centered (fcc)

The simplest and most symmetric, the cubic (or isometric) system, has the symmetry of a cube, that is, it exhibits four threefold rotational axes oriented at 109.5° (the tetrahedral angle) with respect to each other. These threefold axes lie along the body diagonals of the cube. The other six lattice systems are hexagonal, tetragonal, rhombohedral (often confused with the trigonal crystal system), orthorhombic, monoclinic and triclinic.

Atomic coordination
By considering the arrangement of atoms relative to each other, their coordination numbers (or number of nearest neighbors), interatomic distances, types of bonding, etc., it is possible to form a general view of the structures and alternative ways of visualizing them.

Close packing

The principles involved can be understood by considering the most efficient way of packing together equal-sized spheres and stacking close-packed atomic planes in three dimensions. For example, if plane A lies beneath plane B, there are two possible ways of placing an additional atom on top of layer B. If an additional layer were placed directly over plane A, this would give rise to the following series:

...ABABABAB...

This type of crystal structure is known as hexagonal close packing (hcp). If, however, all three planes are staggered relative to each other and it is not until the fourth layer is positioned directly over plane A that the sequence is repeated, then the following sequence arises:

...ABCABCABC...

HCP lattice (left) and the fcc lattice (right)

This type of crystal structure is known as cubic close packing (ccp). The unit cell of the ccp arrangement is the face-centered cubic (fcc) unit cell. This is not immediately obvious, as the closely packed layers are parallel to the {111} planes of the fcc unit cell. There are four different orientations of the close-packed layers. The packing efficiency can be worked out by calculating the total volume of the spheres and dividing it by the volume of the cell:

$$ \mathrm{APF} = \frac{4 \times \frac{4}{3}\pi r^3}{\left(2\sqrt{2}\, r\right)^3} = \frac{\pi}{3\sqrt{2}} \approx 0.7405 $$

The 74% packing efficiency is the maximum density possible in unit cells constructed of spheres of only one size. Most crystalline forms of metallic elements are hcp, fcc, or bcc (body-centered cubic). The coordination number of hcp and fcc is 12, and their atomic packing factor (APF) is the number mentioned above, 0.74. The APF of bcc is 0.68 for comparison.
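The packing factors quoted above follow from elementary geometry, as the short sketch below reproduces for the fcc and bcc cases (sphere radii are expressed as fractions of the cube edge, taken as a = 1; the function name is arbitrary).

```python
import math

def apf(atoms_per_cell, radius_over_a):
    """Atomic packing factor: (atoms per cell) * sphere volume / cell volume, for a cubic cell of edge a = 1."""
    return atoms_per_cell * (4.0 / 3.0) * math.pi * radius_over_a ** 3

# fcc: 4 atoms per cell, spheres touch along a face diagonal -> r = a*sqrt(2)/4
print(round(apf(4, math.sqrt(2) / 4), 4))   # ~0.7405
# bcc: 2 atoms per cell, spheres touch along the body diagonal -> r = a*sqrt(3)/4
print(round(apf(2, math.sqrt(3) / 4), 4))   # ~0.6802
```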

Bravais lattices
When the crystal systems are combined with the various possible lattice centerings, we arrive at the Bravais lattices. They describe the geometric arrangement of the lattice points, and thereby the translational symmetry of the crystal. In three dimensions, there are 14 unique Bravais lattices that are distinct from one another in the translational symmetry they contain. All crystalline materials recognized until now (not including quasicrystals) fit in one of these arrangements. The fourteen three-dimensional lattices, classified by crystal system, are shown above. The Bravais lattices are sometimes referred to as space lattices. The crystal structure consists of the same group of atoms, the basis, positioned around each and every lattice point. This group of atoms therefore repeats indefinitely in three dimensions according to the arrangement of one of the 14 Bravais lattices. The characteristic rotation and mirror symmetries of the group of atoms, or unit cell, is described by its crystallographic point group.

Point groups
The crystallographic point group or crystal class is the mathematical group comprising the symmetry operations that leave at least one point unmoved and that leave the appearance of the crystal structure unchanged. These symmetry operations include:

Reflection, which reflects the structure across a reflection plane.
Rotation, which rotates the structure a specified portion of a circle about a rotation axis.
Inversion, which changes the sign of the coordinate of each point with respect to a center of symmetry or inversion point.
Improper rotation, which consists of a rotation about an axis followed by an inversion.

Rotation axes (proper and improper), reflection planes, and centers of symmetry are collectively called symmetry elements. There are 32 possible crystal classes. Each one can be classified into one of the seven crystal systems.


Space groups
In addition to the operations of the point group, the space group of the crystal structure contains translational symmetry operations. These include:

Pure translations, which move a point along a vector.
Screw axes, which rotate a point around an axis while translating parallel to the axis.
Glide planes, which reflect a point through a plane while translating it parallel to the plane.

There are 230 distinct space groups.

Grain boundaries
Grain boundaries are interfaces where crystals of different orientations meet. A grain boundary is a single-phase interface, with crystals on each side of the boundary being identical except in orientation. The term "crystallite boundary" is sometimes, though rarely, used. Grain boundary areas contain those atoms that have been perturbed from their original lattice sites, dislocations, and impurities that have migrated to the lower energy grain boundary. Treating a grain boundary geometrically as an interface of a single crystal cut into two parts, one of which is rotated, we see that there are five variables required to define a grain boundary. The first two numbers come from the unit vector that specifies a rotation axis. The third number designates the angle of rotation of the grain. The final two numbers specify the plane of the grain boundary (or a unit vector that is normal to this plane). Grain boundaries disrupt the motion of dislocations through a material, so reducing crystallite size is a common way to improve strength, as described by the Hall–Petch relationship. Since grain boundaries are defects in the crystal structure they tend to decrease the electrical and thermal conductivity of the material. The high interfacial energy and relatively weak bonding in most grain boundaries often makes them preferred sites for the onset of corrosion and for the precipitation of new phases from the solid. They are also important to many of the mechanisms of creep. Grain boundaries are in general only a few nanometers wide. In common materials, crystallites are large enough that grain boundaries account for a small fraction of the material. However, very small grain sizes are achievable. In nanocrystalline solids, grain boundaries become a significant volume fraction of the material, with profound effects on such properties as diffusion and plasticity. In the limit of small crystallites, as the volume fraction of grain boundaries approaches 100%, the material ceases to have any crystalline character, and thus becomes an amorphous solid.

Defects and impurities


Real crystals feature defects or irregularities in the ideal arrangements described above, and it is these defects that critically determine many of the electrical and mechanical properties of real materials. When one atom substitutes for one of the principal atomic components within the crystal structure, alteration in the electrical and thermal properties of the material may ensue. Impurities may also manifest as spin impurities in certain materials. Research on magnetic impurities demonstrates that certain properties, such as specific heat, can be substantially altered by small concentrations of an impurity; for example, impurities in semiconducting ferromagnetic alloys may lead to markedly different properties, as first predicted in the late 1960s. Dislocations in the crystal lattice allow shear at lower stress than that needed for a perfect crystal structure.


Prediction of structure
The difficulty of predicting stable crystal structures based on the knowledge of only the chemical composition has long been a stumbling block on the way to fully computational materials design. Now, with more powerful algorithms and high-performance computing, structures of medium complexity can be predicted using such approaches as evolutionary algorithms, random sampling, or metadynamics.

The crystal structures of simple ionic solids (e.g., NaCl or table salt) have long been rationalized in terms of Pauling's rules, first set out in 1929 by Linus Pauling, referred to by many since as the "father of the chemical bond". Pauling also considered the nature of the interatomic forces in metals, and concluded that about half of the five d-orbitals in the transition metals are involved in bonding, with the remaining nonbonding d-orbitals being responsible for the magnetic properties. He was therefore able to correlate the number of d-orbitals in bond formation with the bond length, as well as with many of the physical properties of the substance. He subsequently introduced the metallic orbital, an extra orbital necessary to permit uninhibited resonance of valence bonds among various electronic structures.

In the resonating valence bond theory, the factors that determine the choice of one from among alternative crystal structures of a metal or intermetallic compound revolve around the energy of resonance of bonds among interatomic positions. It is clear that some modes of resonance would make larger contributions (be more mechanically stable than others), and that in particular a simple ratio of number of bonds to number of positions would be exceptional. The resulting principle is that a special stability is associated with the simplest ratios or "bond numbers": 1/2, 1/3, 2/3, 1/4, 3/4, etc. The choice of structure and the value of the axial ratio (which determines the relative bond lengths) are thus a result of the effort of an atom to use its valency in the formation of stable bonds with simple fractional bond numbers.

After postulating a direct correlation between electron concentration and crystal structure in beta-phase alloys, Hume-Rothery analyzed the trends in melting points, compressibilities and bond lengths as a function of group number in the periodic table in order to establish a system of valencies of the transition elements in the metallic state. This treatment thus emphasized the increasing bond strength as a function of group number. The operation of directional forces was emphasized in one article on the relation between bond hybrids and the metallic structures. The resulting correlation between electronic and crystalline structures is summarized by a single parameter, the weight of the d-electrons per hybridized metallic orbital. The "d-weight" calculates out to 0.5, 0.7 and 0.9 for the fcc, hcp and bcc structures respectively. The relationship between d-electrons and crystal structure thus becomes apparent.

Crystal structure of sodium chloride (table salt)

Polymorphism
Polymorphism refers to the ability of a solid to exist in more than one crystalline form or structure. According to Gibbs' rules of phase equilibria, these unique crystalline phases will be dependent on intensive variables such as pressure and temperature. Polymorphism can potentially be found in many crystalline materials including polymers, minerals, and metals, and is related to allotropy, which refers to elemental solids. The complete morphology of a material is described by polymorphism and other variables such as crystal habit, amorphous fraction or crystallographic defects. Polymorphs have different stabilities and may spontaneously convert from a metastable form (or thermodynamically unstable form) to the stable form at a particular temperature. They also exhibit different melting points, solubilities, and X-ray diffraction patterns.

Quartz is one of the several thermodynamically stable crystalline forms of silica, SiO2. The most important forms of silica include: α-quartz, β-quartz, tridymite, cristobalite, coesite, and stishovite.

One good example of this is the quartz form of silicon dioxide, or SiO2. In the vast majority of silicates, the Si atom shows tetrahedral coordination by 4 oxygens. All but one of the crystalline forms involve tetrahedral SiO4 units linked together by shared vertices in different arrangements. In different minerals the tetrahedra show different degrees of networking and polymerization. For example, they occur singly, joined together in pairs, in larger finite clusters including rings, in chains, double chains, sheets, and three-dimensional frameworks. The minerals are classified into groups based on these structures. In each of the 7 thermodynamically stable crystalline forms or polymorphs of crystalline quartz, only 2 out of 4 of the edges of the SiO4 tetrahedra are shared with others, yielding the net chemical formula for silica: SiO2. Another example is elemental tin (Sn), which is malleable near ambient temperatures but is brittle when cooled. This change in mechanical properties is due to the existence of its two major allotropes, α- and β-tin. The two allotropes that are encountered at normal pressure and temperature, α-tin and β-tin, are more commonly known as gray tin and white tin respectively. Two more allotropes, γ and σ, exist at temperatures above 161 °C and pressures above several GPa. White tin is metallic, and is the stable crystalline form at or above room temperature. Below 13.2 °C, tin exists in the gray form, which has a diamond cubic crystal structure, similar to diamond, silicon or germanium. Gray tin has no metallic properties at all, is a dull-gray powdery material, and has few uses, other than a few specialized semiconductor applications. Although the α-β transformation temperature of tin is nominally 13.2 °C, impurities (e.g. Al, Zn, etc.) lower the transition temperature well below 0 °C, and upon addition of Sb or Bi the transformation may not occur at all.

Physical properties
Twenty of the 32 crystal classes are piezoelectric, and crystals belonging to one of these classes (point groups) display piezoelectricity. All piezoelectric classes lack a centre of symmetry. Any material develops a dielectric polarization when an electric field is applied, but a substance that has such a natural charge separation even in the absence of a field is called a polar material. Whether or not a material is polar is determined solely by its crystal structure. Only 10 of the 32 point groups are polar. All polar crystals are pyroelectric, so the 10 polar crystal classes are sometimes referred to as the pyroelectric classes. There are a few crystal structures, notably the perovskite structure, which exhibit ferroelectric behavior. This is analogous to ferromagnetism, in that, in the absence of an electric field during production, the ferroelectric crystal does not exhibit a polarization. Upon the application of an electric field of sufficient magnitude, the crystal becomes permanently polarized. This polarization can be reversed by a sufficiently large counter-charge, in the same way that a ferromagnet can be reversed. However, although they are called ferroelectrics, the effect is due to the crystal structure (not the presence of a ferrous metal).

Kikuchi line
Kikuchi lines pair up to form bands in electron diffraction from single crystal specimens, there to serve as "roads in orientation-space" for microscopists not sure what they are looking at. In transmission electron microscopes, they are easily seen in diffraction from regions of the specimen thick enough for multiple scattering. Unlike diffraction spots, which blink on and off as one tilts the crystal, Kikuchi bands mark orientation space with well-defined intersections (called zones or poles) as well as paths connecting one intersection to the next. Experimental and theoretical maps of Kikuchi band geometry, as well as their direct-space analogs, e.g. bend contours, electron channeling patterns, and fringe visibility maps, are increasingly useful tools in electron microscopy of crystalline and nanocrystalline materials. Because each Kikuchi line is associated with Bragg diffraction from one side of a single set of lattice planes, these lines can be labeled with the same Miller or reciprocal-lattice indices that are used to identify individual diffraction spots. Kikuchi band intersections, or zones, on the other hand are indexed with direct-lattice indices, i.e. indices which represent integer multiples of the lattice basis vectors a, b and c. Kikuchi lines are formed in diffraction patterns by diffusely scattered electrons, e.g. as a result of thermal atom vibrations. The main features of their geometry can be deduced from a simple elastic mechanism proposed in 1928 by Seishi Kikuchi, although the dynamical theory of diffuse inelastic scattering is needed to understand them quantitatively. In x-ray scattering these lines are referred to as Kossel lines (named after Walther Kossel).

Map of Kikuchi line pairs down to 1/1 for 300 keV electrons in hexagonal sapphire (Al2O3), with some intersections labeled.

Recording experimental Kikuchi patterns and maps


The figure at left shows the Kikuchi lines leading to a silicon [100] zone, taken with the beam direction approximately 7.9° away from the zone along the (004) Kikuchi band. The dynamic range in the image is so large that only portions of the film are not overexposed. Kikuchi lines are much easier to follow with dark-adapted eyes on a fluorescent screen than they are to capture unmoving on paper or film, even though eyes and photographic media both have a roughly logarithmic response to illumination intensity. Fully quantitative work on such diffraction features is therefore assisted by the large linear dynamic range of CCD detectors.

This image subtends an angular range of over 10° and required use of a shorter than usual camera length L. The Kikuchi band widths themselves (roughly λL/d, where λ/d is approximately twice the Bragg angle for the corresponding plane) are well under 1°, because the wavelength λ of the electrons (about 1.97 picometres in this case) is much less than the lattice plane d-spacing itself. For comparison, the d-spacing for silicon (022) is about 192 picometres while the d-spacing for silicon (004) is about 136 picometres. The image was taken from a region of the crystal which is thicker than the inelastic mean free path (about 200 nanometres), so that diffuse scattering features (the Kikuchi lines) would be strong in comparison to coherent scattering features (diffraction spots). The fact that surviving diffraction spots appear as disks intersected by bright Kikuchi lines means that the diffraction pattern was taken with a convergent electron beam. In practice, Kikuchi lines are easily seen in thick regions of either selected area or convergent beam electron diffraction patterns, but difficult to see in diffraction from crystals much less than 100nm in size (where lattice-fringe visibility effects become important instead). This image was recorded in convergent beam, because that too reduces the range of contrasts that have to be recorded on film. Compiling Kikuchi maps which cover more than a steradian requires that one take many images at tilts changed only incrementally (e.g. by 2° in each direction). This can be tedious work, but may be useful when investigating a crystal with unknown structure, as it can clearly unveil the lattice symmetry in three dimensions.
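
As a quick check on the numbers quoted above (this short Python sketch is an addition to the text and simply repeats the arithmetic; the wavelength and d-spacings are those stated in the paragraph), the Bragg angle θ ≈ asin(λ/2d) and the approximate Kikuchi band width 2θ can be computed directly:

import math

wavelength_pm = 1.97                                   # electron wavelength at 300 keV, in picometres
d_spacings_pm = {"Si (022)": 192.0, "Si (004)": 136.0}  # d-spacings quoted above

for plane, d in d_spacings_pm.items():
    theta = math.asin(wavelength_pm / (2.0 * d))        # Bragg angle in radians
    print(f"{plane}: Bragg angle ~ {math.degrees(theta):.3f} deg, "
          f"band width ~ {math.degrees(2.0 * theta):.3f} deg")

Both band widths come out well under one degree, consistent with the statement that Kikuchi bands are narrow compared with the angular field of the pattern.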

Kikuchi lines in a convergent beam diffraction pattern of single crystal silicon taken with a 300 keV electron beam.

Kikuchi line maps and their stereographic projection


The figure at left plots Kikuchi lines for a larger section of silicon's orientation space. The angle subtended between the large [011] and [001] zones at the bottom is 45° for silicon. Note that the four-fold zone in the lower right (here labeled [001]) has the same symmetry and orientation as the zone labeled [100] in the experimental pattern above, although that experimental pattern only subtends about 10°. Note also that the figure at left is excerpted from a stereographic projection centered on that [001] zone. Such conformal projections allow one to map pieces of a spherical surface onto a plane while preserving the local angles of intersection, and hence zone symmetries. Plotting such maps requires that one be able to draw arcs of circles with a very large radius of curvature. The figure at left, for example, was drawn before the advent of computers and hence required the use of a beam compass. Finding a beam compass today might be fairly difficult, since it is much easier to draw curves having a large radius of curvature (in two or three dimensions) with help from a computer.
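
For readers who wish to reproduce such a map numerically, the following minimal sketch (an addition to the text; the zone axes chosen are merely illustrative) applies the standard stereographic projection, which maps a unit direction on the upper hemisphere onto the equatorial plane while preserving local angles:

import numpy as np

def stereographic(uvw):
    """Project a zone-axis direction [u v w] with w >= 0 onto the plane z = 0."""
    v = np.asarray(uvw, dtype=float)
    v = v / np.linalg.norm(v)                # unit vector on the sphere
    x, y, z = v
    return x / (1.0 + z), y / (1.0 + z)      # conformal (angle-preserving) map

for zone in ([0, 0, 1], [0, 1, 1], [1, 1, 1]):
    print(zone, "->", stereographic(zone))

The [001] zone projects to the origin and the [011] zone to a point at distance tan(45°/2) ≈ 0.414 from it, in agreement with the 45° angle between these zones quoted above.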

[001] zone stereographic Kikuchi map for diamond face-centered cubic crystals.

The angle-preserving effect of stereographic plots is even more obvious in the figure at right, which subtends a full 180° of the orientation space of a face-centered or cubic close packed crystal, e.g. like that of gold or aluminum. The animation follows {220} fringe-visibility bands of that face-centered cubic crystal between <111> zones, at which point rotation by 60° sets up travel to the next <111> zone via a repeat of the original sequence. Fringe-visibility bands have the same global geometry as do Kikuchi bands, but for thin specimens their width is proportional (rather than inversely proportional) to d-spacing. Although the angular field width (and tilt range) obtainable experimentally with Kikuchi bands is generally much smaller, the animation offers a wide-angle view of how Kikuchi bands help informed crystallographers find their way between landmarks in the orientation space of a single crystal specimen.

Animation of tilt traverse between four of the eight <111> zones in an fcc crystal.

Real space analogs


Kikuchi lines serve to highlight the edge-on lattice planes in diffraction images of thicker specimens. Because Bragg angles in the diffraction of high energy electrons are very small (~1/4 degree for 300 keV), Kikuchi bands are quite narrow in reciprocal space. This also means that in real space images, lattice planes edge-on are decorated not by diffuse scattering features but by contrast associated with coherent scattering. These coherent scattering features include added diffraction (responsible for bend contours in curved foils), more electron penetration (which gives rise to electron channeling patterns in scanning electron images of crystal surfaces), and lattice fringe contrast (which results in a dependence of lattice fringe intensity on beam orientation which is linked to specimen thickness). Although the contrast details differ, the lattice plane trace geometry of these features and of Kikuchi maps are the same.

A silicon [100] bend contour spider, trapped over an elliptical region that is about 500 nanometres wide.

Bend contours and rocking curves


Rocking curves (left) are plots of scattered electron intensity as a function of the angle between an incident electron beam and the normal to a set of lattice planes in the specimen. As this angle changes in either direction from edge-on (at which orientation the electron beam runs parallel to the lattice planes and perpendicular to their normal), the beam moves into Bragg diffracting condition and more electrons are diffracted outside the microscope's back focal plane aperture, giving rise to the dark-line pairs (bands) seen in the image of the bent silicon foil shown in the image at right. The [100] bend contour "spider" of this image, trapped in a region of silicon that was shaped like an oval watchglass less than a micrometre in size, was imaged with 300 keV electrons. If you tilt the crystal, the spider moves toward the edges of the oval as though it is trying to get out. For example, in this image the spider's [100] intersection has moved to the right side of the ellipse as the specimen was tilted to the left. The spider's legs, and their intersections, can be indexed as shown in precisely the same way as the Kikuchi pattern near [100] in the section on experimental Kikuchi patterns above. In principle, one could therefore use this bend contour to model the foil's vector tilt (with milliradian accuracy) at all points across the oval.

Bend contour and lattice fringe visibility as a function of specimen thickness and beam tilt.

Lattice fringe visibility maps


As you can see from the rocking curve above, as specimen thickness moves into the 10 nanometre and smaller range (e.g. for 300 keV electrons and lattice spacings near 0.23nm) the angular range of tilts that give rise to diffraction and/or lattice-fringe contrast becomes inversely proportional to specimen thickness. The geometry of lattice-fringe visibility therefore becomes useful in the electron microscope study of nanomaterials, just as bend contours and Kikuchi lines are useful in the study of single crystal specimens (e.g. metals and semiconductor specimens with thickness in the tenth-micrometre range). Applications to nanostructure for example include: (i) determining the 3D lattice parameters of individual nanoparticles from images taken at different tilts, (ii) fringe fingerprinting of randomly oriented nanoparticle collections, (iii) particle thickness maps based on fringe contrast changes under tilt, (iv) detection of icosahedral twinning from the lattice image of a randomly oriented nanoparticle, and (v) analysis of orientation relationships between nanoparticles and a cylindrical support.

Electron channeling patterns


The above techniques all involve detection of electrons which have passed through a thin specimen, usually in a transmission electron microscope. Scanning electron microscopes, on the other hand, typically look at electrons "kicked up" when one rasters a focussed electron beam across a thick specimen. Electron channeling patterns are contrast effects associated with edge-on lattice planes that show up in scanning electron microscope secondary and/or backscattered electron images. The contrast effects are to first order similar to those of bend contours, i.e. electrons which enter a crystalline surface under diffracting conditions tend to channel (penetrate deeper into the specimen without losing energy) and thus kick up fewer electrons near the entry surface for detection. Hence bands form, depending on beam/lattice orientation, with the now-familiar Kikuchi line geometry. The first scanning electron microscope (SEM) image was an image of electron channeling contrast in silicon steel. However, practical uses for the technique are limited because only a thin layer of abrasion damage or amorphous coating is generally adequate to obscure the contrast. If the specimen had to be given a conductive coating before examination to prevent charging, this too could obscure the contrast. On cleaved surfaces, and surfaces self-assembled on the atomic scale, electron channeling patterns are likely to see growing application with modern microscopes in the years ahead.

X-ray scattering techniques


X-ray scattering techniques are a family of non-destructive analytical techniques which reveal information about the crystal structure, chemical composition, and physical properties of materials and thin films. These techniques are based on observing the scattered intensity of an X-ray beam hitting a sample as a function of incident and scattered angle, polarization, and wavelength or energy. Note that X-ray scattering is different from X-ray diffraction, which is widely used for X-ray crystallography.

Scattering techniques
Elastic scattering
Materials that do not have long range order may also be studied by scattering methods that rely on elastic scattering of monochromatic X-rays.
This is an X-ray diffraction pattern formed when X-rays are focused on a crystalline material, in this case a protein. Each dot, called a reflection, forms from the coherent interference of scattered X-rays passing through the crystal.

Small-angle X-ray scattering (SAXS) probes structure in the nanometer to micrometer range by measuring scattering intensity at scattering angles 2θ close to 0°. X-ray reflectivity is an analytical technique for determining thickness, roughness, and density of single layer and multilayer thin films. Wide-angle X-ray scattering (WAXS) is a technique concentrating on scattering angles 2θ larger than 5°.

Inelastic scattering
When the energy and angle of the inelastically scattered X-rays are monitored, scattering techniques can be used to probe the electronic band structure of materials. Inelastic scattering alters the phase of the diffracted X-rays and, as a result, does not produce useful data for X-ray diffraction; rather, inelastically scattered X-rays contribute to the background noise in a diffraction pattern. Examples of inelastic scattering techniques include Compton scattering, resonant inelastic X-ray scattering (RIXS), and X-ray Raman scattering.

Small-angle X-ray scattering


Concepts common to small-angle X-ray scattering and small-angle neutron scattering are described in the overarching topic of small-angle scattering. Small-angle X-ray scattering (SAXS) is a small-angle scattering (SAS) technique in which the elastic scattering of X-rays (wavelength 0.1 ... 0.2 nm) by a sample that has inhomogeneities in the nm-range is recorded at very low angles (typically 0.1-10°). This angular range contains information about the shape and size of macromolecules, characteristic distances of partially ordered materials, pore sizes, and other data. SAXS is capable of delivering structural information on macromolecules between 5 and 25 nm, and on repeat distances in partially ordered systems of up to 150 nm. USAXS (ultra-small angle X-ray scattering) can resolve even larger dimensions. SAXS and USAXS belong to a family of X-ray scattering techniques that are used in the characterization of materials. In the case of biological macromolecules such as proteins, the advantage of SAXS over crystallography is that a crystalline sample is not needed. Nuclear magnetic resonance spectroscopy methods encounter problems with macromolecules of higher molecular mass (> 30-40 kDa). However, owing to the random orientation of dissolved or partially ordered molecules, the spatial averaging leads to a loss of information in SAXS compared to crystallography.
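
As an illustration of the angular range quoted above (a sketch added for this compilation, assuming a Cu K-alpha laboratory wavelength of 0.154 nm, which lies in the 0.1-0.2 nm range mentioned), the scattering vector q = (4π/λ) sin θ and the corresponding real-space scale d ≈ 2π/q can be computed as follows:

import math

wavelength_nm = 0.154                      # assumed Cu K-alpha source wavelength
for two_theta_deg in (0.1, 1.0, 10.0):     # the typical SAXS angular range quoted above
    theta = math.radians(two_theta_deg / 2.0)
    q = 4.0 * math.pi * math.sin(theta) / wavelength_nm     # scattering vector in nm^-1
    d = 2.0 * math.pi / q                                   # probed length scale in nm
    print(f"2theta = {two_theta_deg:5.1f} deg -> q = {q:6.3f} nm^-1, d ~ {d:6.1f} nm")

The output spans roughly 1 to 90 nm, which is why SAXS is suited to macromolecules and to repeat distances in the tens-of-nanometre range.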

Applications
SAXS is used for the determination of the microscale or nanoscale structure of particle systems in terms of such parameters as averaged particle sizes, shapes, distribution, and surface-to-volume ratio. The materials can be solid or liquid and they can contain solid, liquid or gaseous domains (so-called particles) of the same or another material in any combination. Not only particles, but also the structure of ordered systems like lamellae, and fractal-like materials can be studied. The method is accurate, non-destructive and usually requires only a minimum of sample preparation. Applications are very broad and include colloids of all types, metals, cement, oil, polymers, plastics, proteins, foods and pharmaceuticals and can be found in research as well as in quality control. The X-ray source can be a laboratory source or synchrotron light which provides a higher X-ray flux.

SAXS instruments
In an SAXS instrument a monochromatic beam of X-rays is brought to a sample from which some of the X-rays scatter, while most simply go through the sample without interacting with it. The scattered X-rays form a scattering pattern which is then detected at a detector which is typically a 2-dimensional flat X-ray detector situated behind the sample perpendicular to the direction of the primary beam that initially hit the sample. The scattering pattern contains the information on the structure of the sample. The major problem that must be overcome in SAXS

instrumentation is the separation of the weak scattered intensity from the strong main beam. The smaller the desired angle, the more difficult this becomes. The problem is comparable to one encountered when trying to observe a weakly radiant object close to the sun, like the sun's corona. Only if the moon blocks out the main light source does the corona become visible. Likewise, in SAXS the non-scattered beam that merely travels through the sample must be blocked, without blocking the closely adjacent scattered radiation. Most available X-ray sources produce divergent beams and this compounds the problem. In principle the problem could be overcome by focusing the beam, but this is not easy when dealing with X-rays and was previously not done except on synchrotrons where large bent mirrors can be used. This is why most laboratory small-angle devices rely on collimation instead. Laboratory SAXS instruments can be divided into two main groups, point-collimation and line-collimation instruments:
1. Point-collimation instruments have pinholes that shape the X-ray beam to a small circular or elliptical spot that illuminates the sample. Thus the scattering is centro-symmetrically distributed around the primary X-ray beam and the scattering pattern in the detection plane consists of circles around the primary beam. Owing to the small illuminated sample volume and the wastefulness of the collimation process (only those photons are allowed to pass that happen to fly in the right direction), the scattered intensity is small and therefore the measurement time is in the order of hours or days in the case of very weak scatterers. If focusing optics like bent mirrors or bent monochromator crystals, or collimating and monochromating optics like multilayers, are used, measurement time can be greatly reduced. Point-collimation allows the orientation of non-isotropic systems (fibres, sheared liquids) to be determined.
2. Line-collimation instruments confine the beam only in one dimension so that the beam profile is a long but narrow line. The illuminated sample volume is much larger compared to point-collimation and the scattered intensity at the same flux density is proportionally larger. Thus measuring times with line-collimation SAXS instruments are much shorter compared to point-collimation and are in the range of minutes. A disadvantage is that the recorded pattern is essentially an integrated superposition (a self-convolution) of many adjacent pinhole patterns. The resulting smearing can be easily removed using model-free algorithms or deconvolution methods based on Fourier transformation, but only if the system is isotropic. Line collimation is of great benefit for any isotropic nanostructured materials, e.g. proteins, surfactants, particle dispersions and emulsions.


X-ray reflectivity
X-ray reflectivity, sometimes known as X-ray specular reflectivity, X-ray reflectometry, or XRR, is a surface-sensitive analytical technique used in chemistry, physics, and materials science to characterize surfaces, thin films and multilayers. It is related to the complementary techniques of neutron reflectometry and ellipsometry. The basic idea behind the technique is to reflect a beam of x-rays from a flat surface and to then measure the intensity of x-rays reflected in the specular direction (reflected angle equal to incident angle). If the interface is not perfectly sharp and smooth then the reflected intensity will deviate from that predicted by the law of Fresnel reflectivity. The deviations can then be analyzed to obtain the density profile of the interface normal to the surface. The technique appears to have first been applied to x-rays by Professor Lyman G. Parratt of Cornell University in an article published in Physical Review in 1954. Parratt's initial work explored the surface of copper-coated glass, but since that time the technique has been extended to a wide range of both solid and liquid interfaces. The basic mathematical relationship which describes specular reflectivity is fairly straightforward. When an interface is not perfectly sharp, but has an average electron density profile given by ρ_e(z), then the x-ray reflectivity can be approximated by:

$$\frac{R(Q)}{R_F(Q)} = \left| \frac{1}{\rho_\infty} \int_{-\infty}^{\infty} \frac{d\rho_e}{dz}\, e^{iQz}\, dz \right|^2$$
Diagram of x-ray specular reflection

Here R(Q) is the reflectivity, R_F(Q) is the Fresnel reflectivity of an ideally sharp interface, Q = (4π/λ) sin θ is the momentum transfer normal to the surface, λ is the x-ray wavelength, ρ_∞ is the electron density deep within the material and θ is the angle of incidence. Typically one can then use this formula to compare parameterized models of the average density profile in the z-direction with the measured x-ray reflectivity and then vary the parameters until the theoretical profile matches the measurement. For films with multiple layers, X-ray reflectivity may show oscillations with incidence angle or wavelength (Kiessig fringes), analogous to the Fabry-Pérot effect. These oscillations can be used to infer layer thicknesses and other properties, for example using the Abeles matrix formalism.
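
As a simple illustration of how these oscillations are used (this example is an addition; the fringe spacing below is an invented value, and the small-angle relation Δθ ≈ λ/(2t), valid well above the critical angle, is only an approximation to a full Abeles or Parratt model fit):

import math

wavelength_nm = 0.154              # assumed Cu K-alpha laboratory source
fringe_period_deg = 0.05           # hypothetical measured Kiessig fringe spacing
delta_theta = math.radians(fringe_period_deg)
thickness_nm = wavelength_nm / (2.0 * delta_theta)   # t ~ lambda / (2 * delta_theta)
print(f"estimated film thickness ~ {thickness_nm:.0f} nm")

For the values above this gives a thickness of roughly 90 nm; a full model fit to the measured curve would refine the estimate and also yield roughness and density.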

Wide-angle X-ray scattering


Wide-angle X-ray scattering (WAXS) or wide-angle X-ray diffraction (WAXD) is an X-ray-diffraction technique that is often used to determine the crystalline structure of polymers. This technique specifically refers to the analysis of Bragg peaks scattered to wide angles, which (by Bragg's law) implies that they are caused by sub-nanometer-sized structures. Wide-angle X-ray scattering is the same technique as small-angle X-ray scattering (SAXS); only the distance from sample to detector is shorter, so that diffraction maxima at larger angles are observed. Depending on the measurement instrument used it is possible to do WAXS and SAXS in a single run (small- and wide-angle scattering, SWAXS). The technique is time-honored but somewhat out of favor for the determination of the degree of crystallinity of polymer samples. The diffraction pattern generated allows one to determine the chemical or phase composition of the film, the texture of the film (preferred alignment of crystallites), the crystallite size and the presence of film stress. In this method the sample is scanned in a wide-angle X-ray goniometer, and the scattering intensity is plotted as a function of the 2θ angle. X-ray diffraction is a non-destructive method for the characterization of solid materials. When X-rays are directed at solids they scatter in predictable patterns based upon the internal structure of the solid. A crystalline solid consists of regularly spaced atoms (electrons) that can be described by imaginary planes. The distance between these planes is called the d-spacing. The intensity of the d-space pattern is directly proportional to the number of electrons (atoms) that are found in the imaginary planes. Every crystalline solid has a unique pattern of d-spacings (known as the powder pattern), which is a fingerprint for that solid. In fact, solids with the same chemical composition but different phases can be identified by their pattern of d-spacings.
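
To make the d-spacing "fingerprint" idea concrete, the short sketch below (an addition to the text; the peak positions are invented and a Cu K-alpha wavelength is assumed) converts measured 2θ peak positions into d-spacings with Bragg's law, d = λ/(2 sin θ):

import math

wavelength_nm = 0.15406                     # assumed Cu K-alpha1 wavelength
two_theta_peaks_deg = [21.5, 23.9, 36.2]    # hypothetical WAXS peak positions
for tt in two_theta_peaks_deg:
    d = wavelength_nm / (2.0 * math.sin(math.radians(tt / 2.0)))
    print(f"2theta = {tt:5.1f} deg -> d = {d:.4f} nm")

The resulting list of d-spacings is what would be compared against reference powder patterns to identify the phase.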

Chemical structure
A chemical structure includes the molecular geometry, electronic structure and crystal structure of molecules. Molecular geometry refers to the spatial arrangement of atoms in a molecule and the chemical bonds that hold the atoms together. Molecular geometry can range from the very simple, such as diatomic oxygen or nitrogen molecules, to the very complex, such as protein or DNA molecules. Molecular geometry can be roughly represented using a structural formula. Electronic structure describes the occupation of a molecule's molecular orbitals. The theory of chemical structure was first developed by Aleksandr Butlerov, who stated that chemical compounds are not random clusters of atoms and functional groups but structures with a definite order formed according to the valency of the composing atoms. Other important contributors were Archibald Scott Couper and Friedrich August Kekulé.

Structure determination
Structural determination in chemistry is the process of determining the chemical structure of molecules. In practice, the end result of such a process is a set of coordinates for the atoms in the molecule. The methods by which one can determine the structure of a molecule include spectroscopy, such as nuclear magnetic resonance (NMR), infrared spectroscopy and Raman spectroscopy, electron microscopy, and x-ray crystallography (x-ray diffraction). The last technique can produce three-dimensional models at atomic-scale resolution, as long as crystals are available, since x-ray diffraction needs numerous copies of the molecule being studied that must also be arranged in an organised way. Common methods for determining the chemical structure (structural elucidation) include X-ray diffraction, proton NMR, carbon-13 NMR, mass spectrometry and infrared spectroscopy.

Common methods for determining the electronic structure include electron spin resonance, cyclic voltammetry, electron absorption spectroscopy and X-ray photoelectron spectroscopy.

Atomic spectroscopy
Atomic spectroscopy is the determination of elemental composition from an element's electromagnetic or mass spectrum, corresponding to optical and mass spectrometry respectively. It can be divided by atomization source or by the type of spectroscopy used. In the latter case, the main division is between optical and mass spectrometry. Mass spectrometry generally gives significantly better analytical performance, but is also significantly more complex. This complexity translates into higher purchase costs, higher operational costs, more operator training, and a greater number of components that can potentially fail. Because optical spectroscopy is often less expensive and has performance adequate for many tasks, it is far more common. Atomic absorption spectrometers are among the most commonly sold and used analytical devices.

Optical spectroscopy
Electrons exist in energy levels (i.e. atomic orbitals) within an atom. Atomic orbitals are quantized, meaning they exist as defined values instead of being continuous (see: atomic orbitals). Electrons may move between orbitals, but in doing so they must absorb or emit energy equal to the energy difference between their atom's specific quantized orbital energy levels. In optical spectroscopy, the energy absorbed to move an electron to a higher energy level (higher orbital) and/or the energy emitted as the electron moves to a lower energy level is absorbed or emitted in the form of photons (light particles). Because each element has a unique number of electrons, an atom will absorb/release energy in a pattern unique to its elemental identity (e.g. Ca, Na, etc.) and thus will absorb/emit photons in a correspondingly unique pattern. The type of atoms present in a sample, or the amount of atoms present in a sample, can be deduced from measuring these changes in light wavelength and light intensity. Optical spectroscopy is further divided into atomic absorption spectroscopy, atomic emission spectroscopy, and fluorescence spectroscopy.

In atomic absorption spectroscopy, light of a predetermined wavelength is passed through a collection of atoms. If the wavelength of the source light has energy corresponding to the energy difference between two energy levels in the atoms, a portion of the light will be absorbed. The difference between the intensity of the light emitted from the source (e.g., lamp) and the light collected by the detector yields an absorbance value. This absorbance value can then be used to determine the concentration of a given element (or atoms) within the sample. The relationship between the concentration of atoms, the distance the light travels through the collection of atoms, and the portion of the light absorbed is given by the Beer-Lambert law.

The energy stored in the atoms can be released in a variety of ways. When it is released as light, this is known as fluorescence. Atomic fluorescence spectroscopy measures this emitted light. Fluorescence is generally measured at a 90° angle from the excitation source to minimize collection of scattered light from the excitation source; often such a rotation is provided by a Pellin-Broca prism on a turntable, which also separates the light into its spectrum for closer analysis. The wavelength once again identifies the atoms. For low absorbances (and therefore low concentrations) the intensity of the fluoresced light is directly proportional to the concentration of atoms. Atomic fluorescence is generally more sensitive (i.e. it can detect lower concentrations) than atomic absorption. Strictly speaking, any measurement of the emitted light is emission spectroscopy, but atomic emission spectroscopy usually does not include fluorescence and rather refers to emission after excitation by thermal means. The intensity of the emitted light is directly proportional to the concentration of atoms.

Mass spectrometry
Atomic mass spectrometry is similar to other types of mass spectrometry in that it consists of an ion source, a mass analyzer, and a detector. Atoms' identities are determined by their mass-to-charge ratio (via the mass analyzer) and their concentrations are determined by the number of ions detected. Although considerable research has gone into customizing mass spectrometers for atomic ion sources, it is the ion source that differs most from other forms of mass spectrometry. These ion sources must also atomize samples, or an atomization step must take place before ionization. Atomic ion sources are generally modifications of atomic optical spectroscopy atom sources.

Ion and atom sources


Sources can be adapted in many ways, but the lists below give the general uses of a number of sources. Of these, flames are the most common due to their low cost and their simplicity. Although significantly less common, inductively-coupled plasmas, especially when used with mass spectrometers, are recognized for their outstanding analytical performance and their versatility. For all atomic spectroscopy, a sample must be vaporized and atomized. For atomic mass spectrometry, a sample must also be ionized. Vaporization, atomization, and ionization are often, but not always, accomplished with a single source. Alternatively, one source may be used to vaporize a sample while another is used to atomize (and possibly ionize). An example of this is laser ablation inductively-coupled plasma atomic emission spectrometry, where a laser is used to vaporize a solid sample and an inductively-coupled plasma is used to atomize the vapor. With the exception of flames and graphite furnaces, which are most commonly used for atomic absorption spectroscopy, most sources are used for atomic emission spectroscopy.

Liquid-sampling sources include flames and sparks (atom source), inductively-coupled plasma (atom and ion source), graphite furnace (atom source), microwave plasma (atom and ion source), and direct-current plasma (atom and ion source).
Solid-sampling sources include lasers (atom and vapor source), glow discharge (atom and ion source), arc (atom and ion source), spark (atom and ion source), and graphite furnace (atom and vapor source).
Gas-sampling sources include flame (atom source), inductively-coupled plasma (atom and ion source), microwave plasma (atom and ion source), direct-current plasma (atom and ion source), and glow discharge (atom and ion source).

Atomic absorption spectroscopy


Atomic absorption spectroscopy (AAS) is a spectroanalytical procedure for the quantitative determination of chemical elements employing the absorption of optical radiation (light) by free atoms in the gaseous state. In analytical chemistry the technique is used for determining the concentration of a particular element (the analyte) in a sample to be analyzed. AAS can be used to determine over 70 different elements in solution or directly in solid samples, and is employed in pharmacology, biophysics and toxicology research. Atomic absorption spectrometry was first used as an analytical technique, and the underlying principles were established, in the second half of the 19th century by Robert Wilhelm Bunsen and Gustav Robert Kirchhoff, both professors at the University of Heidelberg, Germany. The modern form of AAS was largely developed during the 1950s by a team of Australian chemists. They were led by Sir Alan Walsh at the Commonwealth Scientific and Industrial Research Organisation (CSIRO), Division of Chemical Physics, in Melbourne, Australia.

Modern atomic absorption spectrometers

Principles
The technique makes use of absorption spectrometry to assess the concentration of an analyte in a sample. It requires standards with known analyte content to establish the relation between the measured absorbance and the analyte concentration and relies therefore on the Beer-Lambert Law. In short, the electrons of the atoms in the atomizer can be promoted to higher orbitals (excited state) for a short period of time (nanoseconds) by absorbing a defined quantity of energy (radiation of a given wavelength). This amount of energy, i.e., wavelength, is specific to a particular electron transition in a particular element. In general, each wavelength corresponds to only one element, and the width of an absorption line is only of the order of a few picometers (pm), which gives the technique its elemental selectivity. The radiation flux without a sample and with a sample in the atomizer is measured using a detector, and the ratio between the two values (the absorbance) is converted to analyte concentration or mass using the Beer-Lambert Law.
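
A minimal sketch of this calibration step (added for illustration; all intensities and standard concentrations below are invented) computes the absorbance A = log10(I0/I), fits a straight line through standards of known concentration, and reads off an unknown sample:

import numpy as np

conc_std = np.array([0.0, 1.0, 2.0, 4.0])            # standard concentrations, mg/L (invented)
abs_std = np.array([0.002, 0.101, 0.198, 0.405])     # measured absorbances of the standards (invented)
slope, intercept = np.polyfit(conc_std, abs_std, 1)  # linear Beer-Lambert calibration

I0, I = 1.000, 0.612                                 # hypothetical fluxes without and with the sample
A_sample = np.log10(I0 / I)                          # absorbance of the unknown sample
c_sample = (A_sample - intercept) / slope            # concentration read from the calibration line
print(f"A = {A_sample:.3f} -> c ~ {c_sample:.2f} mg/L")

In routine work the calibration would include more standards and a check that the unknown falls within the linear range of the curve.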

Instrumentation
In order to analyze a sample for its atomic constituents, it has to be atomized. The atomizers most commonly used nowadays are flames and electrothermal (graphite tube) atomizers. The atoms should then be irradiated by optical radiation, and the radiation source could be an element-specific line radiation source or a continuum radiation source. The radiation then passes through a monochromator in order to separate the element-specific radiation from any other radiation emitted by the radiation source, and is finally measured by a detector.

Atomic absorption spectrometer block diagram

Atomizers
The atomizers most commonly used nowadays are (spectroscopic) flames and electrothermal (graphite tube) atomizers. Other atomizers, such as glow-discharge atomization, hydride atomization, or cold-vapor atomization, might be used for special purposes.

Flame atomizers

The oldest and most commonly used atomizers in AAS are flames, principally the air-acetylene flame with a temperature of about 2300 °C and the nitrous oxide (N2O)-acetylene flame with a temperature of about 2700 °C. The latter flame, in addition, offers a more reducing environment, being ideally suited for analytes with high affinity to oxygen. Liquid or dissolved samples are typically used with flame atomizers. The sample solution is aspirated by a pneumatic analytical nebulizer and transformed into an aerosol, which is introduced into a spray chamber, where it is mixed with the flame gases and conditioned in a way that only the finest aerosol droplets (< 10 μm) enter the flame. This conditioning process means that only about 5% of the aspirated sample solution reaches the flame, but it also guarantees a relatively high freedom from interference. On top of the spray chamber is a burner head that produces a flame that is laterally long (usually 5-10 cm) and only a few mm deep. The radiation beam passes through this flame at its longest axis, and the flame gas flow-rates may be adjusted to produce the highest concentration of free atoms. The burner height may also be adjusted, so that the radiation beam passes through the zone of highest atom cloud density in the flame, resulting in the highest sensitivity.

A laboratory flame photometer that uses a propane-operated flame atomizer

The processes in a flame include the following stages: desolvation (drying), in which the solvent is evaporated and the dry sample nano-particles remain; vaporization (transfer to the gaseous phase), in which the solid particles are converted into gaseous molecules; atomization, in which the molecules are dissociated into free atoms; and ionization, where, depending on the ionization potential of the analyte atoms and the energy available in a particular flame, atoms might be in part converted to gaseous ions.

Each of these stages includes the risk of interference in case the degree of phase transfer is different for the analyte in the calibration standard and in the sample. Ionization is generally undesirable, as it reduces the number of atoms that are available for measurement, i.e., the sensitivity. In flame AAS a steady-state signal is generated during the time period when the sample is aspirated. This technique is typically used for determinations in the mg L-1 range, and may be extended down to a few μg L-1 for some elements.

Electrothermal atomizers

Electrothermal AAS (ET AAS) using graphite tube atomizers was pioneered by Boris V. Lvov at the Saint Petersburg Polytechnical Institute, Russia, from the late 1950s, and further investigated by Hans Massmann at the Institute of Spectrochemistry and Applied Spectroscopy (ISAS) in Dortmund, Germany. Although a wide variety of graphite tube designs have been used over the years, the dimensions nowadays are typically 20-25 mm in length and 5-6 mm inner diameter. With this technique liquid/dissolved, solid and gaseous samples may be analyzed directly. A measured volume (typically 10-50 μL) or a weighed mass (typically around 1 mg) of a solid sample is introduced into the graphite tube and subjected to a temperature program. This typically consists of stages such as: drying, in which the solvent is evaporated; pyrolysis, in which the majority of the matrix constituents is removed; atomization, in which the analyte element is released to the gaseous phase; and cleaning, in which eventual residues in the graphite tube are removed at high temperature.

The graphite tubes are heated via their ohmic resistance using a low-voltage high-current power supply; the temperature in the individual stages can be controlled very closely, and temperature ramps between the individual stages facilitate separation of sample components. Tubes may be heated transversely or longitudinally, where the former have the advantage of a more homogeneous temperature distribution over their length. The so-called Stabilized Temperature Platform Furnace (STPF) concept, proposed by Walter Slavin and based on the research of Boris Lvov, makes ET AAS essentially free from interference. The major components of this concept are: atomization of the sample from a graphite platform inserted into the graphite tube (Lvov platform) instead of from the tube wall, in order to delay atomization until the gas phase in the atomizer has reached a stable temperature; use of a chemical modifier in order to stabilize the analyte to a pyrolysis temperature that is sufficient to remove the majority of the matrix components; and integration of the absorbance over the time of the transient absorption signal, instead of using peak height absorbance for quantification. In ET AAS a transient signal is generated, the area of which is directly proportional to the mass of analyte (not its concentration) introduced into the graphite tube. This technique has the advantage that any kind of sample, solid, liquid or gaseous, can be analyzed directly. Its sensitivity is 2-3 orders of magnitude higher than that of flame AAS, so that determinations in the low μg L-1 range (for a typical sample volume of 20 μL) and ng g-1 range (for a typical sample mass of 1 mg) can be carried out. It shows a very high degree of freedom from interferences, so that ET AAS might be considered the most robust technique available nowadays for the determination of trace elements in complex matrices.

Specialized atomization techniques

While flame and electrothermal vaporizers are the most common atomization techniques, several other atomization methods are utilized for specialized use.

Glow-discharge atomization

A glow-discharge (GD) device serves as a versatile source, as it can simultaneously introduce and atomize the sample. The glow discharge occurs in a low-pressure argon gas atmosphere between 1 and 10 torr. In this atmosphere lies a pair of electrodes applying a DC voltage of 250 to 1000 V to break down the argon gas into positively charged ions and electrons. These ions, under the influence of the electric field, are accelerated into the cathode surface containing the sample, bombarding the sample and causing neutral sample atom ejection through the process known as sputtering. The atomic vapor produced by this discharge is composed of ions, ground state atoms, and a fraction of excited atoms. When the excited atoms relax back into their ground state, a low-intensity glow is emitted, giving the technique its name.

The requirement for samples of glow-discharge atomizers is that they are electrical conductors. Consequently, these atomizers are most commonly used in the analysis of metals and other conducting samples. However, with proper modifications, the technique can be utilized to analyze liquid samples as well as nonconducting materials by mixing them with a conductor (e.g. graphite).

Hydride atomization

Hydride generation techniques are specialized in solutions of specific elements. The technique provides a means of introducing samples containing arsenic, antimony, tin, selenium, bismuth, and lead into an atomizer in the gas phase. With these elements, hydride atomization enhances detection limits by a factor of 10 to 100 compared to alternative methods. Hydride generation occurs by adding an acidified aqueous solution of the sample to a 1% aqueous solution of sodium borohydride, all of which is contained in a glass vessel. The volatile hydride generated by the reaction is swept into the atomization chamber by an inert gas, where it undergoes decomposition. This process forms an atomized form of the analyte, which can then be measured by absorption or emission spectrometry.

Cold-vapor atomization

The cold-vapor technique is an atomization method limited to the determination of mercury, which is the only metallic element with a large enough vapor pressure at ambient temperature. Because of this, it has an important use in determining organic mercury compounds in samples and their distribution in the environment. The method begins by converting mercury into Hg2+ by oxidation with nitric and sulfuric acids, followed by a reduction of Hg2+ with tin(II) chloride. The mercury is then swept into a long-pass absorption tube by bubbling a stream of inert gas through the reaction mixture. The concentration is determined by measuring the absorbance of this gas at 253.7 nm. Detection limits for this technique are in the parts-per-billion range, making it an excellent mercury detection atomization method.

Radiation sources
We have to distinguish between line source AAS (LS AAS) and continuum source AAS (CS AAS). In classical LS AAS, as proposed by Alan Walsh, the high spectral resolution required for AAS measurements is provided by the radiation source itself, which emits the spectrum of the analyte in the form of lines that are narrower than the absorption lines. Continuum sources, such as deuterium lamps, are only used for background correction purposes. The advantage of this technique is that only a medium-resolution monochromator is necessary for measuring AAS; however, it has the disadvantage that usually a separate lamp is required for each element that has to be determined. In CS AAS, in contrast, a single lamp, emitting a continuum spectrum over the entire spectral range of interest, is used for all elements. Obviously, a high-resolution monochromator is required for this technique, as will be discussed later.

Hollow cathode lamps

Hollow cathode lamps (HCL) are the most common radiation source in LS AAS. Inside the sealed lamp, filled with argon or neon gas at low pressure, is a cylindrical metal cathode containing the element of interest and an anode. A high voltage is applied across the anode and cathode, resulting in an ionization of the fill gas. The gas ions are accelerated towards the cathode and, upon impact on the cathode, sputter cathode material that is excited in the glow discharge to emit the radiation of the sputtered material, i.e., the element of interest.

Hollow cathode lamp (HCL)

Most lamps will handle a handful of elements, i.e. 5-8. A typical machine will have two lamps, one covering five elements and the other four, for a total of nine elements analyzed.

Electrodeless discharge lamps

Electrodeless discharge lamps (EDL) contain a small quantity of the analyte as a metal or a salt in a quartz bulb together with an inert gas, typically argon, at low pressure. The bulb is inserted into a coil that generates an electromagnetic radio frequency field, resulting in a low-pressure inductively coupled discharge in the lamp. The emission from an EDL is higher than that from an HCL, and the line width is generally narrower, but EDLs need a separate power supply and might need a longer time to stabilize.

Deuterium lamps

Deuterium HCL, or even hydrogen HCL, and deuterium discharge lamps are used in LS AAS for background correction purposes. The radiation intensity emitted by these lamps decreases significantly with increasing wavelength, so that they can only be used in the wavelength range between 190 and about 320 nm.

Continuum sources

When a continuum radiation source is used for AAS, it is necessary to use a high-resolution monochromator, as will be discussed later. In addition, it is necessary that the lamp emits radiation of intensity at least an order of magnitude above that of a typical HCL over the entire wavelength range from 190 nm to 900 nm. A special high-pressure xenon short arc lamp, operating in a hot-spot mode, has been developed to fulfill these requirements.

Spectrometer
As already pointed out above, there is a difference between medium-resolution spectrometers that are used for LS AAS and high-resolution spectrometers that are designed for CS AAS. The spectrometer includes the spectral sorting device (monochromator) and the detector.

Spectrometers for LS AAS

Xenon lamp as a continuous radiation source

In LS AAS the high resolution that is required for the measurement of atomic absorption is provided by the narrow line emission of the radiation source, and the monochromator simply has to resolve the analytical line from other radiation emitted by the lamp. This can usually be accomplished with a band pass between 0.2 and 2nm, i.e., a medium-resolution monochromator. Another feature to make LS AAS element-specific is modulation of the primary radiation and the use of a selective amplifier that is tuned to the same modulation frequency, as already postulated by Alan Walsh. This way any (unmodulated) radiation emitted for example by the atomizer can be excluded, which is imperative for LS AAS. Simple monochromators of the Littrow or (better) the Czerny-Turner design are typically used for LS AAS. Photomultiplier tubes are the most frequently used detectors in LS AAS, although solid state detectors might be preferred because of their better signal-to-noise ratio.
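
The modulation scheme described above can be illustrated with a toy synchronous-detection calculation (an addition to the text; all signal values are invented): the lamp is chopped at a known frequency, the unmodulated atomizer emission appears as a constant offset, and multiplying by the chopper reference before averaging recovers only the lamp contribution:

import numpy as np

f_mod, fs = 50.0, 5000.0                                    # chopper and sampling frequencies (assumed)
t = np.arange(0.0, 1.0, 1.0 / fs)
reference = np.sign(np.sin(2.0 * np.pi * f_mod * t + 0.1))  # chopper reference: +1 open, -1 closed

lamp = 0.30 * (reference > 0)                               # lamp reaches the detector only when open
atomizer_emission = 0.80                                    # unmodulated emission from the atomizer
noise = 0.02 * np.random.default_rng(0).normal(size=t.size)
detector = lamp + atomizer_emission + noise                 # what the photodetector sees

recovered = 2.0 * np.mean(detector * reference)             # selective (lock-in style) detection
print(f"recovered lamp signal ~ {recovered:.3f} (true value 0.300)")

The constant atomizer emission averages to zero after multiplication by the reference, which is why a selective amplifier tuned to the modulation frequency rejects unmodulated radiation.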

Spectrometers for CS AAS

When a continuum radiation source is used for AAS measurement it is indispensable to work with a high-resolution monochromator. The resolution has to be equal to or better than the half width of an atomic absorption line (about 2 pm) in order to avoid losses of sensitivity and linearity of the calibration graph. The research with high-resolution (HR) CS AAS was pioneered by the groups of O'Haver and Harnly in the USA, who also developed the (up until now) only simultaneous multi-element spectrometer for this technique. The breakthrough, however, came when the group of Becker-Ross in Berlin, Germany, built a spectrometer entirely designed for HR-CS AAS. The first commercial equipment for HR-CS AAS was introduced by Analytik Jena (Jena, Germany) at the beginning of the 21st century, based on the design proposed by Becker-Ross and Florek. These spectrometers use a compact double monochromator with a prism pre-monochromator and an echelle grating monochromator for high resolution. A linear charge-coupled device (CCD) array with 200 pixels is used as the detector. The second monochromator does not have an exit slit; hence the spectral environment at both sides of the analytical line becomes visible at high resolution. As typically only 3-5 pixels are used to measure the atomic absorption, the other pixels are available for correction purposes. One of these corrections is that for lamp flicker noise, which is independent of wavelength, resulting in measurements with a very low noise level; other corrections are those for background absorption, as will be discussed later.

Background absorption and background correction


The relatively small number of atomic absorption lines (compared to atomic emission lines) and their narrow width (a few pm) make spectral overlap rare; there are only a few known cases in which an absorption line from one element overlaps with that of another. Molecular absorption, in contrast, is much broader, so that it is more likely that some molecular absorption band will overlap with an atomic line. This kind of absorption might be caused by un-dissociated molecules of concomitant elements of the sample or by flame gases. We have to distinguish between the spectra of di-atomic molecules, which exhibit a pronounced fine structure, and those of larger (usually tri-atomic) molecules that don't show such fine structure. Another source of background absorption, particularly in ET AAS, is scattering of the primary radiation at particles that are generated in the atomization stage, when the matrix could not be removed sufficiently in the pyrolysis stage. All these phenomena, molecular absorption and radiation scattering, can result in artificially high absorption and an erroneously high calculation of the concentration or mass of the analyte in the sample. There are several techniques available to correct for background absorption, and they are significantly different for LS AAS and HR-CS AAS.

Background correction techniques in LS AAS


In LS AAS background absorption can only be corrected using instrumental techniques, all of which are based on two sequential measurements: firstly, total absorption (atomic plus background); secondly, background absorption only. The difference of the two measurements gives the net atomic absorption. Because of this, and because of the use of additional devices in the spectrometer, the signal-to-noise ratio of background-corrected signals is always significantly inferior compared to uncorrected signals. It should also be pointed out that in LS AAS there is no way to correct for (the rare case of) a direct overlap of two atomic lines. In essence there are three techniques used for background correction in LS AAS:

Deuterium background correction

This is the oldest and still most commonly used technique, particularly for flame AAS. In this case, a separate source (a deuterium lamp) with broad emission is used to measure the background absorption over the entire width of the exit slit of the spectrometer. The use of a separate lamp makes this technique the least accurate one, as it cannot correct for any structured background. It also cannot be used at wavelengths above about 320 nm, as the emission intensity of the deuterium lamp becomes very weak. The use of a deuterium HCL is preferable to an arc lamp because of the better fit of the image of the former lamp with that of the analyte HCL.

Smith-Hieftje background correction

This technique (named after its inventors) is based on the line-broadening and self-reversal of emission lines from an HCL when high current is applied. Total absorption is measured with normal lamp current, i.e., with a narrow emission line, and background absorption after application of a high-current pulse with the profile of the self-reversed line, which has little emission at the original wavelength but strong emission on both sides of the analytical line. The advantage of this technique is that only one radiation source is used; among the disadvantages are that the high-current pulses reduce lamp lifetime, and that the technique can only be used for relatively volatile elements, as only those exhibit sufficient self-reversal to avoid a dramatic loss of sensitivity. Another problem is that background is not measured at the same wavelength as total absorption, making the technique unsuitable for correcting structured background.

Zeeman-effect background correction

An alternating magnetic field is applied at the atomizer (graphite furnace) to split the absorption line into three components: the π component, which remains at the same position as the original absorption line, and two σ components, which are moved to higher and lower wavelengths, respectively (see Zeeman effect). Total absorption is measured without the magnetic field and background absorption with the magnetic field on. The π component has to be removed in this case, e.g. using a polarizer, and the σ components do not overlap with the emission profile of the lamp, so that only the background absorption is measured. The advantages of this technique are, firstly, that total and background absorption are measured with the same emission profile of the same lamp, so that any kind of background, including background with fine structure, can be corrected accurately, unless the molecule responsible for the background is also affected by the magnetic field, and secondly, that using a chopper as a polariser reduces the signal-to-noise ratio. The disadvantages are the increased complexity of the spectrometer and the power supply needed for running the powerful magnet required to split the absorption line.


Background correction techniques in HR-CS AAS


In HR-CS AAS background correction is carried out mathematically in the software using information from detector pixels that are not used for measuring atomic absorption; hence, in contrast to LS AAS, no additional components are required for background correction.

Background correction using correction pixels
It has already been mentioned that in HR-CS AAS lamp flicker noise is eliminated using correction pixels. In fact, any increase or decrease in radiation intensity that is observed to the same extent at all pixels chosen for correction is eliminated by the correction algorithm. This obviously also includes a reduction of the measured intensity due to radiation scattering or molecular absorption, which is corrected in the same way. As the measurement of total and background absorption, and the correction for the latter, are strictly simultaneous (in contrast to LS AAS), even the fastest changes of background absorption, as they may be observed in ET AAS, do not cause any problem. In addition, as the same algorithm is used for background correction and elimination of lamp noise, the background-corrected signals show a much better signal-to-noise ratio than the uncorrected signals, which is also in contrast to LS AAS.

Background correction using a least-squares algorithm
The above technique can obviously not correct for a background with fine structure, as in this case the absorbance will be different at each of the correction pixels. In this case HR-CS AAS offers the possibility to measure correction spectra of the molecule(s) that is (are) responsible for the background and store them in the computer. These spectra are then multiplied by a factor to match the intensity of the sample spectrum and subtracted pixel by pixel and spectrum by spectrum from the sample spectrum using a least-squares algorithm, as sketched in the example below. This might sound complex, but first of all the number of diatomic molecules that can exist at the temperatures of the atomizers used in AAS is relatively small, and second, the correction is performed by the computer within a few seconds. The same algorithm can actually also be used to correct for direct line overlap of two atomic absorption lines, making HR-CS AAS the only AAS technique that can correct for this kind of spectral interference.
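Below is a hedged sketch of the least-squares idea just described, using hypothetical numpy arrays rather than any vendor's software: stored reference spectra of the interfering molecule(s) are scaled to best match the measured spectrum at pixels away from the analytical line, then subtracted.

import numpy as np

def subtract_molecular_background(sample, references, line_pixels):
    """sample: measured absorbance per detector pixel (1-D array).
    references: list of reference molecular spectra on the same pixel grid.
    line_pixels: indices of pixels carrying atomic absorption (excluded from the fit)."""
    mask = np.ones(sample.size, dtype=bool)
    mask[line_pixels] = False                          # fit only pure-background pixels
    a = np.column_stack([r[mask] for r in references])
    coeffs, *_ = np.linalg.lstsq(a, sample[mask], rcond=None)   # scaling factors
    background = sum(c * r for c, r in zip(coeffs, references))
    return sample - background                         # background-corrected spectrum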


Atomic emission spectroscopy


Atomic emission spectroscopy (AES) is a method of chemical analysis that uses the intensity of light emitted from a flame, plasma, arc, or spark at a particular wavelength to determine the quantity of an element in a sample. The wavelength of the atomic spectral line gives the identity of the element while the intensity of the emitted light is proportional to the number of atoms of the element.
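Since the emitted intensity is proportional to the number of atoms, quantification is usually done against standards. The following is a hypothetical worked example (all values invented) of a linear calibration followed by the determination of an unknown.

import numpy as np

conc = np.array([0.0, 1.0, 2.0, 5.0, 10.0])                 # mg/L standards
intensity = np.array([2.0, 105.0, 210.0, 516.0, 1030.0])    # emission counts (made up)

slope, intercept = np.polyfit(conc, intensity, 1)           # least-squares calibration line
unknown_intensity = 640.0
unknown_conc = (unknown_intensity - intercept) / slope
print(f"estimated concentration: {unknown_conc:.2f} mg/L")  # ~6.2 mg/L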

Flame emission spectroscopy


A sample of a material (analyte) is brought into the flame as a gas or a sprayed solution, or is inserted directly into the flame on a small loop of wire, usually platinum. The heat from the flame evaporates the solvent and breaks chemical bonds to create free atoms. The thermal energy also excites the atoms into excited electronic states that subsequently emit light when they return to the ground electronic state. Each element emits light at a characteristic wavelength, which is dispersed by a grating or prism and detected in the spectrometer.

A flame during the assessment of calcium ions in a flame photometer

A frequent application of flame emission measurement is the determination of alkali metals for pharmaceutical analysis.

Inductively coupled plasma atomic emission spectroscopy


Inductively coupled plasma atomic emission spectroscopy (ICP-AES) uses an inductively coupled plasma to produce excited atoms and ions that emit electromagnetic radiation at wavelengths characteristic of a particular element. Advantages of ICP-AES are excellent limits of detection, a wide linear dynamic range, multi-element capability, low chemical interference, and a stable and reproducible signal. Disadvantages are spectral interferences (many emission lines), cost and operating expense, and the fact that samples typically must be in solution.

Spark and arc atomic emission spectroscopy


Spark or arc atomic emission spectroscopy is used for the analysis of metallic elements in solid samples. For non-conductive materials, the sample is ground with graphite powder to make it conductive. In traditional arc spectroscopy methods, a sample of the solid was commonly ground up and destroyed during analysis. An electric arc or spark is passed through the sample, heating it to a high temperature to excite the atoms within it. The excited analyte atoms emit light at characteristic wavelengths that can be dispersed with a monochromator and detected. As the spark or arc conditions are typically not well controlled, the analysis for the elements in the sample is qualitative. However, modern spark sources with controlled discharges under an argon atmosphere can be considered quantitative. Both qualitative and quantitative spark analysis are widely used for production quality control in foundries and steel mills.


Fluorescence spectroscopy
Fluorescence spectroscopy (also known as fluorometry or spectrofluorometry) is a type of electromagnetic spectroscopy which analyzes fluorescence from a sample. It involves using a beam of light, usually ultraviolet light, that excites the electrons in molecules of certain compounds and causes them to emit light; typically, but not necessarily, visible light. A complementary technique is absorption spectroscopy. Devices that measure fluorescence are called fluorometers or fluorimeters.

Theory
Molecules have various states referred to as energy levels. Fluorescence spectroscopy is primarily concerned with electronic and vibrational states. Generally, the species being examined has a ground electronic state (a low energy state) of interest, and an excited electronic state of higher energy. Within each of these electronic states are various vibrational states.

In fluorescence spectroscopy, the species is first excited, by absorbing a photon, from its ground electronic state to one of the various vibrational states in the excited electronic state. Collisions with other molecules cause the excited molecule to lose vibrational energy until it reaches the lowest vibrational state of the excited electronic state. This process is often visualized with a Jablonski diagram. The molecule then drops down to one of the various vibrational levels of the ground electronic state again, emitting a photon in the process. As molecules may drop down into any of several vibrational levels in the ground state, the emitted photons will have different energies, and thus frequencies. Therefore, by analysing the different frequencies of light emitted in fluorescence spectroscopy, along with their relative intensities, the structure of the different vibrational levels can be determined.

For atomic species, the process is similar; however, since atomic species do not have vibrational energy levels, the emitted photons are often at the same wavelength as the incident radiation. This process of re-emitting the absorbed photon is "resonance fluorescence" and, while it is characteristic of atomic fluorescence, it is seen in molecular fluorescence as well.

In a typical experiment, the different wavelengths of fluorescent light emitted by a sample are measured using a monochromator, holding the excitation light at a constant wavelength. This is called an emission spectrum. An excitation spectrum is the opposite, whereby the emission light is held at a constant wavelength, and the excitation light is scanned through many different wavelengths (via a monochromator). An emission map is measured by recording the emission spectra resulting from a range of excitation wavelengths and combining them all together. This is a three-dimensional surface data set, emission intensity as a function of excitation and emission wavelengths, and is typically depicted as a contour map.
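As a sketch of how such an emission map can be assembled in practice, the snippet below builds an excitation-emission matrix from individual emission scans; measure_emission() is a placeholder for whatever routine actually acquires a spectrum, and the wavelength grids are arbitrary.

import numpy as np

excitation_wl = np.arange(250, 401, 10)      # nm, scanned excitation wavelengths
emission_wl = np.arange(300, 601, 2)         # nm, emission axis of each scan

def measure_emission(ex_nm):
    # placeholder: return one emission spectrum for excitation at ex_nm
    return np.exp(-((emission_wl - (ex_nm + 80)) / 30.0) ** 2)

eem = np.array([measure_emission(ex) for ex in excitation_wl])
# eem[i, j] = intensity at excitation_wl[i] and emission_wl[j]; plotted as a contour map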


Instrumentation
Two general types of instruments exist: filter fluorometers, which use filters to isolate the incident light and fluorescent light, and spectrofluorometers, which use diffraction grating monochromators to isolate the incident light and fluorescent light.

Both types use the following scheme: the light from an excitation source passes through a filter or monochromator and strikes the sample. A proportion of the incident light is absorbed by the sample, and some of the molecules in the sample fluoresce. The fluorescent light is emitted in all directions. Some of this fluorescent light passes through a second filter or monochromator and reaches a detector, which is usually placed at 90° to the incident light beam to minimize the risk of transmitted or reflected incident light reaching the detector.

Various light sources may be used as excitation sources, including lasers, LEDs, and lamps; xenon arcs and mercury-vapor lamps in particular. A laser emits light of high irradiance only in a very narrow wavelength interval, typically under 0.01 nm, which makes an excitation monochromator or filter unnecessary. The disadvantage of this method is that the wavelength of a laser cannot be changed by much. A mercury-vapor lamp is a line lamp, meaning it emits light near peak wavelengths. By contrast, a xenon arc has a continuous emission spectrum with nearly constant intensity in the range from 300–800 nm and a sufficient irradiance for measurements down to just above 200 nm.

A simplistic design of the components of a fluorimeter

Filters and/or monochromators may be used in fluorimeters. A monochromator transmits light of an adjustable wavelength with an adjustable tolerance. The most common type of monochromator utilizes a diffraction grating, that is, collimated light illuminates a grating and exits with a different angle depending on the wavelength. The monochromator can then be adjusted to select which wavelengths to transmit. To allow anisotropy measurements, the addition of two polarization filters is necessary: one after the excitation monochromator or filter, and one before the emission monochromator or filter.

As mentioned before, the fluorescence is most often measured at a 90° angle relative to the excitation light. This geometry is used instead of placing the sensor in line with the excitation light at a 180° angle in order to avoid interference of the transmitted excitation light. No monochromator is perfect and it will transmit some stray light, that is, light with wavelengths other than the targeted one. An ideal monochromator would only transmit light in the specified range and have a high wavelength-independent transmission. When measuring at a 90° angle, only the light scattered by the sample causes stray light. This results in a better signal-to-noise ratio, and lowers the detection limit by approximately a factor of 10000, when compared to the 180° geometry. Furthermore, the fluorescence can also be measured from the front, which is often done for turbid or opaque samples.

The detector can either be single-channeled or multichanneled. The single-channeled detector can only detect the intensity of one wavelength at a time, while the multichanneled detector detects the intensity of all wavelengths simultaneously, making the emission monochromator or filter unnecessary. The different types of detectors have both advantages and disadvantages.

The most versatile fluorimeters with dual monochromators and a continuous excitation light source can record both an excitation spectrum and a fluorescence spectrum. When measuring fluorescence spectra, the wavelength of the excitation light is kept constant, preferably at a wavelength of high absorption, and the emission monochromator scans the spectrum. For measuring excitation spectra, the wavelength passing through the emission filter or monochromator is kept constant and the excitation monochromator is scanning. The excitation spectrum generally is identical to the absorption spectrum, as the fluorescence intensity is proportional to the absorption.


Analysis of data
At low concentrations the fluorescence intensity will generally be proportional to the concentration of the fluorophore. Unlike in UV/visible spectroscopy, standard, device independent spectra are not easily attained. Several factors influence and distort the spectra, and corrections are necessary to attain true, i.e. machine-independent, spectra. The different types of distortions will here be classified as being either instrument- or sample-related. Firstly, the distortion arising from the instrument is discussed. As a start, the light source intensity and wavelength characteristics varies over time during each experiment and between each experiment. Furthermore, no lamp has a constant intensity at all wavelengths. To correct this, a beam splitter can be applied after the excitation monochromator or filter to direct a portion of the light to a reference detector. Additionally, the transmission efficiency of monochromators and filters must be taken into account. These may also change over time. The transmission efficiency of the monochromator also varies depending on wavelength. This is the reason that an optional reference detector should be placed after the excitation monochromator or filter. The percentage of the fluorescence picked up by the detector is also dependent upon the system. Furthermore, the detector quantum efficiency, that is, the percentage of photons detected, varies between different detectors, with wavelength and with time, as the detector inevitably deteriorates. Two other topics that must be considered include the optics used to direct the radiation and the means of holding or containing the sample material (called a cuvette or cell). For most UV, visible, and NIR measurements the use of precision quartz cuvettes is necessary. In both cases, it is important to select materials that have relatively little absorption in the wavelength range of interest. Quartz is ideal because it transmits from 200nm-2500nm; higher grade quartz can even transmit up to 3500nm, whereas the absorption properties of other materials can mask the fluorescence from the sample. Correction of all these instrumental factors for getting a standard spectrum is a tedious process, which is only applied in practice when it is strictly necessary. This is the case when measuring the quantum yield or when finding the wavelength with the highest emission intensity for instance. As mentioned earlier, distortions arise from the sample as well. Therefore some aspects of the sample must be taken into account too. Firstly, photodecomposition may decrease the intensity of fluorescence over time. Scattering of light must also be taken into account. The most significant types of scattering in this context are Rayleigh and Raman scattering. Light scattered by Rayleigh scattering has the same wavelength as the incident light, whereas in Raman scattering the scattered light changes wavelength usually to longer wavelengths. Raman scattering is the result of a virtual electronic state induced by the excitation light. From this virtual state, the molecules may relax back to a vibrational level other than the vibrational ground state. In fluorescence spectra, it is always seen at a constant wavenumber difference relative to the excitation wavenumber e.g. the peak appears at a wavenumber 3600cm1 lower than the excitation light in water. Other aspects to consider are the inner filter effects. These include reabsorption. 
Reabsorption happens when another molecule or part of a macromolecule absorbs at the wavelengths at which the fluorophore emits radiation. If this is the case, some or all of the photons emitted by the fluorophore may be absorbed again. Another inner filter effect occurs because of high concentrations of absorbing molecules, including the fluorophore itself. The result is that the intensity of the excitation light is not constant throughout the solution, so that only a small percentage of the excitation light reaches the fluorophores that are visible to the detection system. The inner filter effects change the spectrum and intensity of the emitted light, and they must therefore be considered when analysing the emission spectrum of fluorescent light.
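One commonly used approximate correction for the primary and secondary inner filter effects (an assumption on our part; the text above does not give a formula) scales the observed intensity by the mean absorbance at the excitation and emission wavelengths, valid for a standard 1 cm, centre-illuminated cuvette and moderate absorbances.

def inner_filter_corrected(f_observed, a_excitation, a_emission):
    """f_observed: measured fluorescence intensity.
    a_excitation, a_emission: absorbance of the sample at the excitation
    and emission wavelengths (1 cm path)."""
    return f_observed * 10 ** ((a_excitation + a_emission) / 2.0)

print(inner_filter_corrected(1000.0, 0.10, 0.05))   # ~1189, i.e. a ~19 % correction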


Tryptophan fluorescence
The fluorescence of a folded protein is a mixture of the fluorescence from individual aromatic residues. Most of the intrinsic fluorescence emissions of a folded protein are due to excitation of tryptophan residues, with some emissions due to tyrosine and phenylalanine; but disulfide bonds also have appreciable absorption in this wavelength range. Typically, tryptophan has a wavelength of maximum absorption of 280 nm and an emission peak that is solvatochromic, ranging from ca. 300 to 350 nm depending on the polarity of the local environment. Hence, protein fluorescence may be used as a diagnostic of the conformational state of a protein. Furthermore, tryptophan fluorescence is strongly influenced by the proximity of other residues (i.e., nearby protonated groups such as Asp or Glu can cause quenching of Trp fluorescence). Also, energy transfer between tryptophan and the other fluorescent amino acids is possible, which would affect the analysis, especially in cases where the Förster acidic approach is taken. In addition, tryptophan is a relatively rare amino acid; many proteins contain only one or a few tryptophan residues. Therefore, tryptophan fluorescence can be a very sensitive measurement of the conformational state of individual tryptophan residues. The advantage compared to extrinsic probes is that the protein itself is not changed. The use of intrinsic fluorescence for the study of protein conformation is in practice limited to cases with few (or perhaps only one) tryptophan residues, since each experiences a different local environment, which gives rise to different emission spectra.

Tryptophan is an important intrinsic fluorescent probe (amino acid), which can be used to estimate the nature of the microenvironment of the tryptophan. When performing experiments with denaturants, surfactants or other amphiphilic molecules, the microenvironment of the tryptophan might change. For example, if a protein containing a single tryptophan in its 'hydrophobic' core is denatured with increasing temperature, a red-shifted emission spectrum will appear. This is due to the exposure of the tryptophan to an aqueous environment as opposed to a hydrophobic protein interior. In contrast, the addition of a surfactant to a protein which contains a tryptophan that is exposed to the aqueous solvent will cause a blue-shifted emission spectrum if the tryptophan becomes embedded in the surfactant vesicle or micelle. Proteins that lack tryptophan may be coupled to a fluorophore. With fluorescence excitation at 295 nm, the tryptophan emission spectrum is dominant over the weaker tyrosine and phenylalanine fluorescence.

Applications
Fluorescence spectroscopy is used in, among others, biochemical, medical, and chemical research fields for analyzing organic compounds. There has also been a report of its use in differentiating malignant skin tumors from benign ones. Atomic fluorescence spectroscopy (AFS) techniques are useful for other kinds of analysis or measurement of a compound present in air, water, or other media; for example, CVAFS is used for the detection of heavy metals such as mercury. Fluorescence can also be used to redirect photons; see fluorescent solar collector. Additionally, fluorescence spectroscopy can be adapted to the microscopic level using microfluorimetry. In analytical chemistry, fluorescence detectors are used with HPLC.


Conductivity (electrolytic)
The conductivity (or specific conductance) of an electrolyte solution is a measure of its ability to conduct electricity. The SI unit of conductivity is siemens per meter (S/m). Conductivity measurements are used routinely in many industrial and environmental applications as a fast, inexpensive and reliable way of measuring the ionic content in a solution. For example, the measurement of product conductivity is a typical way to monitor and continuously trend the performance of water purification systems. In many cases, conductivity is linked directly to the total dissolved solids (T.D.S.). High-quality deionized water has a conductivity of about 5.5 μS/m, typical drinking water is in the range of 5–50 mS/m, while sea water is about 5 S/m (i.e., sea water's conductivity is about one million times higher than that of deionized water). Conductivity is traditionally determined by measuring the AC resistance of the solution between two electrodes. Dilute solutions follow Kohlrausch's laws of concentration dependence and additivity of ionic contributions. Lars Onsager gave a theoretical explanation of Kohlrausch's law by extending Debye–Hückel theory.

Units
The SI unit of conductivity is S/m and, unless otherwise qualified, it refers to 25 °C (standard temperature). Often encountered in industry is the traditional unit of μS/cm; 10⁶ μS/cm = 10³ mS/cm = 1 S/cm. A value expressed in μS/cm is numerically 100 times smaller than the same conductivity expressed in μS/m (i.e., 1 μS/cm = 100 μS/m). Occasionally a unit of "EC" (electrical conductivity) is found on the scales of instruments: 1 EC = 1 mS/cm. Sometimes a so-called mho (the reciprocal of the ohm) is encountered: 1 mho/m = 1 S/m. Historically, mhos antedate the siemens by many decades; good vacuum-tube testers, for instance, gave transconductance readings in micromhos.

Resistivity of pure water (in MΩ·cm) as a function of temperature

The commonly used standard cell has a width of 1 cm; for very pure water in equilibrium with air, such a cell has a resistance of about 10⁶ ohm, known as a megohm, occasionally spelled "megaohm". Ultra-pure water can achieve 18 megohms or more. Thus in the past megohm-cm was used, sometimes abbreviated to "megohm". Sometimes a conductivity is given just in "microsiemens" (omitting the distance term in the unit). While this is an error, it can often be assumed to be equal to the traditional μS/cm. The typical conversion of conductivity to the total dissolved solids is done assuming that the solid is sodium chloride: 1 μS/cm is then equivalent to about 0.6 mg of NaCl per kg of water. Molar conductivity has the SI unit S·m²·mol⁻¹. Older publications use the unit Ω⁻¹·cm²·mol⁻¹.
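The snippet below is a small bookkeeping sketch of the unit relations and the NaCl-based TDS factor quoted above (the ~0.6 mg/kg per μS/cm factor is taken from the text; the 1413 μS/cm example value is a common KCl check-standard reading and is used here only as an illustration).

def uS_per_cm_to_S_per_m(value_uS_cm):
    return value_uS_cm * 1e-4          # 1 uS/cm = 1e-6 S over 1e-2 m = 1e-4 S/m

def tds_mg_per_kg(value_uS_cm, factor=0.6):
    return value_uS_cm * factor        # rough NaCl-equivalent TDS estimate

print(uS_per_cm_to_S_per_m(1413))      # ~0.1413 S/m
print(tds_mg_per_kg(500))              # ~300 mg/kg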


Measurement
The electrical conductivity of a solution of an electrolyte is measured by determining the resistance of the solution between two flat or cylindrical electrodes separated by a fixed distance. An alternating voltage is used in order to avoid electrolysis. The resistance is measured by a conductivity meter. Typical frequencies used are in the range 1–3 kHz. The dependence on the frequency is usually small, but may become appreciable at very high frequencies, an effect known as the Debye–Falkenhagen effect.

Principle of the measurement

A wide variety of instrumentation is commercially available. There are two types of cell, the classical type with flat or cylindrical electrodes and a second type based on induction. Many commercial systems offer automatic temperature correction. Tables of reference conductivities are available for many common solutions.

Definitions
Resistance, R, is proportional to the distance, l, between the electrodes and is inversely proportional to the cross-sectional area of the sample, A (noted S on the figure above). Writing ρ (rho) for the specific resistance (or resistivity),

R = ρ · l / A

In practice the conductivity cell is calibrated by using solutions of known specific resistance, ρ*, so the quantities l and A need not be known precisely. If the resistance of the calibration solution is R*, a cell constant, C, is derived:

C = R* / ρ*

The specific conductance, κ (kappa), is the reciprocal of the specific resistance:

κ = 1 / ρ = C / R

Conductivity is also temperature-dependent. Sometimes the ratio of l and A is itself called the cell constant, denoted G*, and the conductance is denoted G. Then the specific conductance κ (kappa) can be more conveniently written as

κ = G* · G
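A minimal sketch of the calibration procedure just described follows; the calibration values (0.01 M KCl with κ ≈ 1.413 mS/cm giving R* ≈ 707 Ω in a cell with C ≈ 1 cm⁻¹) are illustrative assumptions, not prescribed by the text.

def cell_constant(R_calibration, kappa_calibration):
    # C = kappa* x R*  (equivalently R*/rho*); units 1/cm if kappa is in S/cm
    return kappa_calibration * R_calibration

def conductivity(R_sample, C):
    return C / R_sample                # kappa = C / R

C = cell_constant(R_calibration=707.0, kappa_calibration=1.413e-3)  # S/cm calibration
print(C)                               # ~1.0 cm^-1 for a typical cell
print(conductivity(2000.0, C))         # sample conductivity in S/cm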


Theory
The specific conductance of a solution containing one electrolyte depends on the concentration of the electrolyte. Therefore it is convenient to divide the specific conductance by the concentration. This quotient is termed the molar conductivity and is denoted by Λm:

Λm = κ / c

Strong electrolytes
Strong electrolytes are believed to dissociate completely in solution. The conductivity of a solution of a strong electrolyte at low concentration follows Kohlrausch's law

Λm = Λm° − K √c

where Λm° is known as the limiting molar conductivity, K is an empirical constant and c is the electrolyte concentration. (Limiting here means "at the limit of infinite dilution".) In effect, the observed conductivity of a strong electrolyte becomes directly proportional to concentration at sufficiently low concentrations.

As the concentration is increased, however, the conductivity no longer rises in proportion. Moreover, Kohlrausch also found that the limiting conductivities of anions and cations are additive: the conductivity of a solution of a salt is equal to the sum of the conductivity contributions from the cation and the anion,

Λm° = ν+ λ+° + ν− λ−°

where ν+ and ν− are the numbers of moles of cations and anions, respectively, which are created from the dissociation of 1 mole of the dissolved electrolyte, and λ+° and λ−° are the limiting molar conductivities of the individual ions. The following table gives values of the limiting molar conductivities for selected ions.

Limiting ion conductivity in water at 298 K


Cations   | λ° / mS·m²·mol⁻¹  | Anions    | λ° / mS·m²·mol⁻¹
H+        | 34.96             | OH−       | 19.91
Li+       | 3.869             | Cl−       | 7.634
Na+       | 5.011             | Br−       | 7.84
K+        | 7.350             | I−        | 7.68
Mg2+      | 10.612            | SO42−     | 15.96
Ca2+      | 11.900            | NO3−      | 7.14
Ba2+      | 12.728            | CH3CO2−   | 4.09
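The short worked example below uses the Na+ and Cl− values from the table to form the limiting molar conductivity of NaCl by additivity, and then applies Kohlrausch's law at a finite concentration; the Kohlrausch constant K used here is an assumed, illustrative value.

import math

lambda0 = {"Na+": 5.011, "Cl-": 7.634}            # mS m^2 mol^-1, from the table
Lambda0_NaCl = lambda0["Na+"] + lambda0["Cl-"]    # additivity of ionic contributions

K = 8.0        # assumed Kohlrausch constant, mS m^2 mol^-1 per (mol/L)^0.5
c = 0.01       # mol/L
Lambda_m = Lambda0_NaCl - K * math.sqrt(c)
print(Lambda0_NaCl, Lambda_m)    # ~12.65 and ~11.85 mS m^2 mol^-1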

A theoretical interpretation of these results was provided by the Debye–Hückel–Onsager equation:

Λm = Λm° − (A + B Λm°) √c

where A and B are constants that depend only on known quantities such as temperature, the charges on the ions, and the dielectric constant and viscosity of the solvent. As the name suggests, this is an extension of the Debye–Hückel theory, due to Onsager. It is very successful for solutions at low concentration.


Weak electrolytes
A weak electrolyte is one that is never fully dissociated (i.e. there is a mixture of ions and complete molecules in equilibrium). In this case there is no limit of dilution below which the relationship between conductivity and concentration becomes linear. Instead, the solution becomes ever more fully dissociated at lower concentrations, and for low concentrations of "well behaved" weak electrolytes, the degree of dissociation of the weak electrolyte becomes proportional to the inverse square root of the concentration. Typical weak electrolytes are weak acids and weak bases. The concentration of ions in a solution of a weak electrolyte is less than the concentration of the electrolyte itself. For acids and bases the concentrations can be calculated when the value(s) of the acid dissociation constant(s) is (are) known. For a monoprotic acid, HA, obeying the inverse square root law, with a dissociation constant Ka, an explicit expression for the conductivity as a function of concentration, c, known as Ostwald's dilution law, can be obtained; a numerical sketch is given below.
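The sketch below solves Ostwald's dilution law, Ka = α²c / (1 − α), for the degree of dissociation α of a weak monoprotic acid and estimates the molar conductivity as α times the limiting value (taken from the ion table above for acetic acid: H+ plus CH3CO2−). The Ka value for acetic acid is a standard literature figure used here for illustration.

import math

def degree_of_dissociation(Ka, c):
    # positive root of  c*alpha^2 + Ka*alpha - Ka = 0
    return (-Ka + math.sqrt(Ka**2 + 4 * Ka * c)) / (2 * c)

Ka = 1.8e-5          # acetic acid, mol/L
Lambda0 = 34.96 + 4.09    # limiting molar conductivity of acetic acid, mS m^2 mol^-1
for c in (1e-4, 1e-3, 1e-2):
    alpha = degree_of_dissociation(Ka, c)
    print(c, round(alpha, 3), round(alpha * Lambda0, 2))   # molar conductivity falls as c rises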

Higher concentrations
Both Kohlrausch's law and the Debye–Hückel–Onsager equation break down as the concentration of the electrolyte increases above a certain value. The reason for this is that as concentration increases the average distance between cation and anion decreases, so that there is more inter-ionic interaction. Whether this constitutes ion association is a moot point. However, it has often been assumed that cation and anion interact to form an ion pair. Thus the electrolyte is treated as if it were like a weak acid, and a constant, K, can be derived for the equilibrium

A+ + B− ⇌ A+B−;   K = [A+][B−] / [A+B−]

Davies describes the results of such calculations in great detail, but states that K should not necessarily be thought of as a true equilibrium constant; rather, the inclusion of an "ion-association" term is useful in extending the range of good agreement between theory and experimental conductivity data. Various attempts have been made to extend Onsager's treatment to more concentrated solutions. The existence of a so-called conductance minimum has proved to be a controversial subject as regards interpretation. Fuoss and Kraus suggested that it is caused by the formation of ion triplets, and this suggestion has received some support recently.

Conductivity versus temperature


Generally the conductivity of a solution increases with temperature, as the mobility of the ions increases. For comparison purposes, reference values are reported at an agreed temperature, usually 298 K (~25 °C), although occasionally 20 °C is used. It is often necessary to take readings from a sample at some other temperature, where it would be inconvenient to wait for the sample to heat or cool. So-called 'compensated' measurements are made at a convenient temperature, but the value reported is the conductivity the solution would be expected to have if it had been measured at the reference temperature. Basic compensation is normally done by assuming a linear increase of conductivity versus temperature of typically 2% per degree. Although generally satisfactory for room temperatures and for purely comparative measurements, the further the measurement temperature is from the reference temperature, the less accurate such simply compensated measurements become. More sophisticated instruments allow programmable compensation functions, which can be specific to the ion species being measured, and which may include additional non-linear terms.


Applications
Notwithstanding the difficulty of theoretical interpretation, measured conductivity is a good indicator of the presence or absence of conductive ions in solution, and measurements are used extensively in many industries. For example, conductivity measurements are used to monitor quality in public water supplies, in hospitals, in boiler water and industries which depend on water quality such as brewing. This type of measurement is not ion-specific; it can sometimes be used to determine the amount of total dissolved solids (T.D.S.) if the composition of the solution and its conductivity behavior are known. It should be noted that conductivity measurements made to determine water purity will not respond to non conductive contaminants (many organic compounds fall into this category), therefore additional purity tests may be required depending on application. Sometimes, conductivity measurements are linked with other methods to increase the sensitivity of detection of specific types of ions. For example, in the boiler water technology, the boiler blowdown is continuously monitored for "cation conductivity", which is the conductivity of the water after it has been passed through a cation exchange resin. This is a sensitive method of monitoring anion impurities in the boiler water in the presence of excess cations (those of the alkalizing agent usually used for water treatment). The sensitivity of this method relies on the high mobility of H+ in comparison with the mobility of other cations or anions. Conductivity detectors are commonly used with ion chromatography.

Electrical conductivity meter


An electrical conductivity meter (EC meter) measures the electrical conductivity in a solution. It is commonly used in hydroponics, aquaculture and freshwater systems to monitor the amount of nutrients, salts or impurities in the water.

Principle of operation
The common laboratory conductivity meters employ a potentiometric method and four electrodes. Often, the electrodes are cylindrical and arranged concentrically. The electrodes are usually made of platinum metal. An alternating current is applied to the outer pair of electrodes and the potential between the inner pair is measured. Conductivity could in principle be determined from the distance between the electrodes and their surface area using Ohm's law but, for accuracy, a calibration is generally employed using electrolytes of well-known conductivity.

Industrial conductivity probes often employ an inductive method, which has the advantage that the fluid does not wet the electrical parts of the sensor. Here, two inductively coupled coils are used. One is the driving coil producing a magnetic field; it is supplied with an accurately known voltage. The other forms the secondary coil of a transformer. The liquid passing through a channel in the sensor forms one turn in the secondary winding of the transformer. The induced current is the output of the sensor.

Temperature dependence
The conductivity of a solution is highly temperature-dependent, therefore it is important either to use a temperature-compensated instrument or to calibrate the instrument at the same temperature as the solution being measured. Unlike metals, the conductivity of common electrolytes typically increases with increasing temperature. Over a limited temperature range, the way temperature affects the conductivity of a solution can be modeled linearly using the following formula:

σT = σTcal [1 + α (T − Tcal)]

where T is the temperature of the sample, Tcal is the calibration temperature, σT is the electrical conductivity at the temperature T, σTcal is the electrical conductivity at the calibration temperature Tcal, and α is the temperature compensation slope of the solution. The temperature compensation slope for most naturally occurring waters is about 2%/°C, although it can range between 1 and 3%/°C. The compensation slope for some common water solutions is listed in the table below.
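The following is a minimal sketch of the linear compensation just described, reporting the conductivity the sample would have at the calibration/reference temperature; the example reading is invented.

def compensate(sigma_T, T, T_cal, alpha=0.02):
    """sigma_T: conductivity measured at temperature T.
    alpha: temperature compensation slope as a fraction per deg C (~2 %/C here)."""
    return sigma_T / (1 + alpha * (T - T_cal))

print(compensate(sigma_T=1.25e-1, T=31.0, T_cal=25.0))   # ~1.12e-1 S/m referred to 25 C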

Electrical resistivity and conductivity


Electrical resistivity (also known as resistivity, specific electrical resistance, or volume resistivity) quantifies how strongly a given material opposes the flow of electric current. A low resistivity indicates a material that readily allows the movement of electric charge. Resistivity is commonly represented by the Greek letter ρ (rho). The SI unit of electrical resistivity is the ohm metre (Ω·m), although other units like the ohm centimetre (Ω·cm) are also in use. As an example, if a 1 m × 1 m × 1 m solid cube of material has sheet contacts on two opposite faces, and the resistance between these contacts is 1 Ω, then the resistivity of the material is 1 Ω·m.

Electrical conductivity or specific conductance is the reciprocal of electrical resistivity, and measures a material's ability to conduct an electric current. It is commonly represented by the Greek letter σ (sigma), but κ (kappa) (especially in electrical engineering) or γ (gamma) are also occasionally used. Its SI unit is siemens per metre (S/m) and its CGSE unit is the reciprocal second (s⁻¹).

Definition
Resistors or conductors with uniform cross-section
Many resistors and conductors have a uniform cross-section with a uniform flow of electric current and are made of one material. (See the diagram to the right.) In this case, the electrical resistivity ρ (Greek: rho) is defined as:

ρ = R · A / ℓ

where
R is the electrical resistance of a uniform specimen of the material (measured in ohms, Ω),
ℓ is the length of the piece of material (measured in metres, m),
A is the cross-sectional area of the specimen (measured in square metres, m²).
A piece of resistive material with electrical contacts on both ends.

The reason resistivity is defined this way is that it makes resistivity a material property, unlike resistance. All copper wires, irrespective of their shape and size, have approximately the same resistivity, but a long, thin copper wire has a much larger resistance than a thick, short copper wire. Every material has its own characteristic resistivity for example, rubber's resistivity is far larger than copper's. In a hydraulic analogy, passing current through a high-resistivity material is like pushing water through a pipe full of sand, while passing current through a low-resistivity material is like pushing water through an empty pipe. If the pipes are the same size and shape, the pipe full of sand has higher resistance to flow. But resistance is not solely determined by the presence or absence of sand; it also depends on the length and width of the pipe: short or wide pipes will have lower resistance than narrow or long pipes. The above equation can be transposed to get Pouillet's law:

R = ρ ℓ / A

The resistance of a given material will increase with the length, but decrease with increasing cross-sectional area. From the above equations, resistivity has SI units of ohm metre. Other units like ohm·cm or ohm·inch are also sometimes used.

The formula R = ρ ℓ / A can be used to intuitively understand the meaning of a resistivity value. For example, if A = 1 m² and ℓ = 1 m (forming a cube with perfectly conductive contacts on opposite faces), then the resistance of this element in ohms is numerically equal to the resistivity of the material it is made of in ohm metres. Likewise, a 1 Ω·cm material would have a resistance of 1 Ω if contacted on opposite faces of a 1 cm × 1 cm × 1 cm cube.

Conductivity σ (Greek: sigma) is defined as the inverse of resistivity:

σ = 1 / ρ

Conductivity has SI units of siemens per meter (S/m).
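As a small worked example of R = ρ ℓ / A, the snippet below computes the resistance of a round copper wire; the copper resistivity value is the 20 °C figure from the table later in this article, and the wire dimensions are arbitrary.

import math

rho_copper = 1.68e-8         # ohm metre at 20 C
length = 10.0                # m
diameter = 1.0e-3            # m
area = math.pi * (diameter / 2) ** 2

R = rho_copper * length / area
print(f"{R * 1000:.1f} milliohm")    # ~214 milliohm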

General definition
The above definition was specific to resistors or conductors with a uniform cross-section, where current flows uniformly through them. A more basic and general definition starts from the fact that if there is an electric field inside a material, it will cause electric current to flow. The electrical resistivity is defined as the ratio of the electric field to the density of the current it creates:

ρ = E / J

where
ρ is the resistivity of the conductor material (measured in ohm metres, Ω·m),
E is the magnitude of the electric field (in volts per metre, V·m⁻¹),
J is the magnitude of the current density (in amperes per square metre, A·m⁻²),
in which E and J are inside the conductor. Conductivity is the inverse:

σ = 1 / ρ = J / E

For example, rubber is a material with large ρ and small σ, because even a very large electric field in rubber will cause almost no current to flow through it. On the other hand, copper is a material with small ρ and large σ, because even a small electric field pulls a lot of current through it.


Causes of conductivity
Band theory simplified
Quantum mechanics states that electrons in an atom cannot take on any arbitrary energy value. Rather, there are fixed energy levels which the electrons can occupy, and values in between these levels are impossible. When a large number of such allowed energy levels are spaced close together (in energy space), i.e. have minutely differing energies, we can talk about these energy levels together as an "energy band". There can be many such energy bands in a material, depending on the atomic number (number of electrons) and their distribution (besides external factors like the environment modifying the energy bands).

Filling of the electronic band structure in various types of material at thermodynamic equilibrium: in metals and semimetals the Fermi level EF lies inside at least one band; in insulators and semiconductors the Fermi level is inside a band gap, but in semiconductors the bands are near enough to the Fermi level to be thermally populated with electrons or holes.

Two such bands important in the discussion of conductivity of materials are the valence band and the conduction band (the latter is generally above the former). Electrons in the conduction band may move freely throughout the material in the presence of an electrical field. In insulators and semiconductors, the atoms in the substance influence each other so that between the valence band and the conduction band there exists a forbidden band of energy levels, which the electrons cannot occupy. In order for a current to flow, a relatively large amount of energy must be furnished to an electron for it to leap across this forbidden gap and into the conduction band. Thus, even large voltages can yield relatively small currents.

In metals
A metal consists of a lattice of atoms, each with an outer shell of electrons which freely dissociate from their parent atoms and travel through the lattice. This is also known as a positive ionic lattice. This 'sea' of dissociable electrons allows the metal to conduct electric current. When an electrical potential difference (a voltage) is applied across the metal, the resulting electric field causes electrons to move from one end of the conductor to the other. Near room temperatures, metals have resistance. The primary cause of this resistance is the thermal motion of ions. This acts to scatter electrons (due to destructive interference of free electron waves on non-correlating potentials of ions) . Also contributing to resistance in metals with impurities are the resulting imperfections in the lattice. In pure metals this source is negligible . The larger the cross-sectional area of the conductor, the more electrons per unit length are available to carry the current. As a result, the resistance is lower in larger cross-section conductors. The number of scattering events encountered by an electron passing through a material is proportional to the length of the conductor. The longer the conductor, therefore, the higher the resistance. Different materials also affect the resistance.


In semiconductors and insulators


In metals, the Fermi level lies in the conduction band , giving rise to free conduction electrons. However, in semiconductors the position of the Fermi level is within the band gap, approximately half-way between the conduction band minimum and valence band maximum for intrinsic (undoped) semiconductors. This means that at 0 kelvin, there are no free conduction electrons and the resistance is infinite. However, the resistance will continue to decrease as the charge carrier density in the conduction band increases. In extrinsic (doped) semiconductors, dopant atoms increase the majority charge carrier concentration by donating electrons to the conduction band or accepting holes in the valence band. For both types of donor or acceptor atoms, increasing the dopant density leads to a reduction in the resistance, hence highly doped semiconductors behave metallically. At very high temperatures, the contribution of thermally generated carriers will dominate over the contribution from dopant atoms and the resistance will decrease exponentially with temperature.

In ionic liquids/electrolytes
In electrolytes, electrical conduction happens not by band electrons or holes, but by full atomic species (ions) traveling, each carrying an electrical charge. The resistivity of ionic liquids varies tremendously with concentration: while distilled water is almost an insulator, salt water is a very efficient electrical conductor. In biological membranes, currents are carried by ionic salts. Small holes in the membranes, called ion channels, are selective to specific ions and determine the membrane resistance.

Superconductivity
The electrical resistivity of a metallic conductor decreases gradually as temperature is lowered. In ordinary conductors, such as copper or silver, this decrease is limited by impurities and other defects. Even near absolute zero, a real sample of a normal conductor shows some resistance. In a superconductor, the resistance drops abruptly to zero when the material is cooled below its critical temperature. An electric current flowing in a loop of superconducting wire can persist indefinitely with no power source. In 1986, it was discovered that some cuprate-perovskite ceramic materials have a critical temperature above 90 K (−183 °C). Such a high transition temperature is theoretically impossible for a conventional superconductor, leading the materials to be termed high-temperature superconductors. Liquid nitrogen boils at 77 K, facilitating many experiments and applications that are less practical at lower temperatures. In conventional superconductors, electrons are held together in pairs by an attraction mediated by lattice phonons. The best available model of high-temperature superconductivity is still somewhat crude. There is a hypothesis that electron pairing in high-temperature superconductors is mediated by short-range spin waves known as paramagnons.

Resistivity of various materials


A conductor such as a metal has high conductivity and a low resistivity. An insulator like glass has low conductivity and a high resistivity. The conductivity of a semiconductor is generally intermediate, but varies widely under different conditions, such as exposure of the material to electric fields or specific frequencies of light, and, most important, with temperature and composition of the semiconductor material. The degree of doping in semiconductors makes a large difference in conductivity. To a point, more doping leads to higher conductivity. The conductivity of a solution of water is highly dependent on its concentration of dissolved salts, and other chemical species that ionize in the solution. Electrical conductivity of water samples is used as an indicator of how salt-free, ion-free, or impurity-free the sample is; the purer the water, the lower the conductivity (the higher the resistivity). Conductivity measurements in water are often reported as specific conductance, relative to the conductivity of pure water at 25C. An EC meter is normally used to measure conductivity in a solution. A rough summary is as follows:


Material          | Resistivity (Ω·m)
Superconductors   | 0
Metals            | 10⁻⁸
Semiconductors    | variable
Electrolytes      | variable
Insulators        | 10¹⁶

This table shows the resistivity, conductivity and temperature coefficient of various materials at 20 °C (68 °F, 293 K).

Material                        | ρ (Ω·m) at 20 °C                                        | σ (S/m) at 20 °C                                   | Temperature coefficient (K⁻¹)
Carbon (graphene)               | 1×10⁻⁸                                                  |                                                    |
Silver                          | 1.59×10⁻⁸                                               | 6.30×10⁷                                           | 0.0038
Copper                          | 1.68×10⁻⁸                                               | 5.96×10⁷                                           | 0.003862
Annealed copper                 | 1.72×10⁻⁸                                               | 5.80×10⁷                                           | 0.00393
Gold                            | 2.44×10⁻⁸                                               | 4.10×10⁷                                           | 0.0034
Aluminium                       | 2.82×10⁻⁸                                               | 3.5×10⁷                                            | 0.0039
Calcium                         | 3.36×10⁻⁸                                               | 2.98×10⁷                                           | 0.0041
Tungsten                        | 5.60×10⁻⁸                                               | 1.79×10⁷                                           | 0.0045
Zinc                            | 5.90×10⁻⁸                                               | 1.69×10⁷                                           | 0.0037
Nickel                          | 6.99×10⁻⁸                                               | 1.43×10⁷                                           | 0.006
Lithium                         | 9.28×10⁻⁸                                               | 1.08×10⁷                                           | 0.006
Iron                            | 1.0×10⁻⁷                                                | 1.00×10⁷                                           | 0.005
Platinum                        | 1.06×10⁻⁷                                               | 9.43×10⁶                                           | 0.00392
Tin                             | 1.09×10⁻⁷                                               | 9.17×10⁶                                           | 0.0045
Carbon steel (1010)             | 1.43×10⁻⁷                                               | 6.99×10⁶                                           |
Lead                            | 2.2×10⁻⁷                                                | 4.55×10⁶                                           | 0.0039
Titanium                        | 4.20×10⁻⁷                                               | 2.38×10⁶                                           | X
Grain oriented electrical steel | 4.60×10⁻⁷                                               | 2.17×10⁶                                           |
Manganin                        | 4.82×10⁻⁷                                               | 2.07×10⁶                                           | 0.000002
Constantan                      | 4.9×10⁻⁷                                                | 2.04×10⁶                                           | 0.000008
Stainless steel                 | 6.9×10⁻⁷                                                | 1.45×10⁶                                           |
Mercury                         | 9.8×10⁻⁷                                                | 1.02×10⁶                                           | 0.0009
Nichrome                        | 1.10×10⁻⁶                                               | 9.09×10⁵                                           | 0.0004
GaAs                            | 5×10⁻⁷ to 10×10⁻³                                       | 5×10⁻⁸ to 10³                                      |
Carbon (amorphous)              | 5×10⁻⁴ to 8×10⁻⁴                                        | 1.25×10³ to 2×10³                                  | −0.0005
Carbon (graphite)               | 2.5×10⁻⁶ to 5.0×10⁻⁶ ∥ basal plane; 3.0×10⁻³ ⊥ basal plane | 2×10⁵ to 3×10⁵ ∥ basal plane; 3.3×10² ⊥ basal plane |
Carbon (diamond)                | 1×10¹²                                                  | ~10⁻¹³                                             |
Germanium                       | 4.6×10⁻¹                                                | 2.17                                               | −0.048
Sea water                       | 2×10⁻¹                                                  | 4.8                                                |
Drinking water                  | 2×10¹ to 2×10³                                          | 5×10⁻⁴ to 5×10⁻²                                   |
Silicon                         | 6.40×10²                                                | 1.56×10⁻³                                          | −0.075
Wood (damp)                     | 1×10³ to 1×10⁴                                          | 10⁻⁴ to 10⁻³                                       |
Deionized water                 | 1.8×10⁵                                                 | 5.5×10⁻⁶                                           |
Glass                           | 10×10¹⁰ to 10×10¹⁴                                      | 10⁻¹¹ to 10⁻¹⁵                                     |
Hard rubber                     | 1×10¹³                                                  | 10⁻¹⁴                                              |
Wood (oven dry)                 | 1×10¹⁴ to 1×10¹⁶                                        | 10⁻¹⁶ to 10⁻¹⁴                                     |
Sulfur                          | 1×10¹⁵                                                  | 10⁻¹⁶                                              |
Air                             | 1.3×10¹⁶ to 3.3×10¹⁶                                    | 3×10⁻¹⁵ to 8×10⁻¹⁵                                 |
PEDOT:PSS                       | 1×10⁻³ to 1×10⁻¹                                        | 10¹ to 10³                                         | ?
Fused quartz                    | 7.5×10¹⁷                                                | 1.3×10⁻¹⁸                                          | ?
PET                             | 10×10²⁰                                                 | 10⁻²¹                                              | ?
Teflon                          | 10×10²² to 10×10²⁴                                      | 10⁻²⁵ to 10⁻²³                                     | ?

The effective temperature coefficient varies with the temperature and purity level of the material. The 20 °C value is only an approximation when used at other temperatures. For example, the coefficient becomes lower at higher temperatures for copper, and the value 0.00427 is commonly specified at 0 °C. The extremely low resistivity (high conductivity) of silver is characteristic of metals. George Gamow tidily summed up the nature of the metals' dealings with electrons in his science-popularizing book, One, Two, Three...Infinity (1947): "The metallic substances differ from all other materials by the fact that the outer shells of their atoms are bound rather loosely, and often let one of their electrons go free. Thus the interior of a metal is filled up with a large number of unattached electrons that travel aimlessly around like a crowd of displaced persons. When a metal wire is subjected to electric force applied on its opposite ends, these free electrons rush in the direction of the force, thus forming what we call an electric current." More technically, the free electron model gives a basic description of electron flow in metals.

Wood is widely regarded as an extremely good insulator, but its resistivity is sensitively dependent on moisture content, with damp wood being at least a factor of 10¹⁰ worse an insulator than oven-dry wood. In any case, a sufficiently high voltage, such as that in lightning strikes or some high-tension power lines, can lead to insulation breakdown and electrocution risk even with apparently dry wood.


Temperature dependence
Linear approximation
The electrical resistivity of most materials changes with temperature. If the temperature T does not vary too much, a linear approximation is typically used:

ρ(T) = ρ₀ [1 + α (T − T₀)]

where α is called the temperature coefficient of resistivity, T₀ is a fixed reference temperature (usually room temperature), and ρ₀ is the resistivity at temperature T₀. The parameter α is an empirical parameter fitted from measurement data. Because the linear approximation is only an approximation, α is different for different reference temperatures. For this reason it is usual to specify the temperature that α was measured at with a suffix, such as α₁₅, and the relationship only holds in a range of temperatures around the reference. When the temperature varies over a large temperature range, the linear approximation is inadequate and a more detailed analysis and understanding should be used.

Metals
In general, electrical resistivity of metals increases with temperature. Electron–phonon interactions can play a key role. At high temperatures, the resistance of a metal increases linearly with temperature. As the temperature of a metal is reduced, the temperature dependence of resistivity follows a power-law function of temperature. Mathematically the temperature dependence of the resistivity of a metal is given by the Bloch–Grüneisen formula:

ρ(T) = ρ(0) + A (T / Θ_R)ⁿ ∫₀^(Θ_R/T) xⁿ / [(eˣ − 1)(1 − e⁻ˣ)] dx

where ρ(0) is the residual resistivity due to defect scattering, A is a constant that depends on the velocity of electrons at the Fermi surface, the Debye radius and the number density of electrons in the metal, and Θ_R is the Debye temperature as obtained from resistivity measurements, which matches very closely with the values of the Debye temperature obtained from specific heat measurements. n is an integer that depends upon the nature of the interaction:
1. n = 5 implies that the resistance is due to scattering of electrons by phonons (as it is for simple metals)
2. n = 3 implies that the resistance is due to s-d electron scattering (as is the case for transition metals)
3. n = 2 implies that the resistance is due to electron–electron interaction.

If more than one source of scattering is simultaneously present, Matthiessen's rule (first formulated by Augustus Matthiessen in the 1860s) says that the total resistance can be approximated by adding up several different terms, each with the appropriate value of n.

As the temperature of the metal is sufficiently reduced (so as to 'freeze' all the phonons), the resistivity usually reaches a constant value, known as the residual resistivity. This value depends not only on the type of metal, but on its purity and thermal history. The value of the residual resistivity of a metal is decided by its impurity concentration. Some materials lose all electrical resistivity at sufficiently low temperatures, due to an effect known as superconductivity. An investigation of the low-temperature resistivity of metals was the motivation for Heike Kamerlingh Onnes's experiments that led in 1911 to the discovery of superconductivity. For details see History of superconductivity.


Semiconductors
In general, resistivity of intrinsic semiconductors decreases with increasing temperature. The electrons are bumped to the conduction energy band by thermal energy, where they flow freely, and in doing so leave behind holes in the valence band, which also flow freely. The electric resistance of a typical intrinsic (non-doped) semiconductor decreases exponentially with temperature, approximately as

ρ(T) ∝ exp(E_A / k_B T)

where E_A is an activation energy (roughly half the band gap for an intrinsic semiconductor) and k_B is the Boltzmann constant.

An even better approximation of the temperature dependence of the resistivity of a semiconductor is given by the Steinhart–Hart equation:

1/T = A + B ln ρ + C (ln ρ)³

where A, B and C are the so-called Steinhart–Hart coefficients. This equation is used to calibrate thermistors; a worked sketch is given below. Extrinsic (doped) semiconductors have a far more complicated temperature profile. As temperature increases starting from absolute zero, they first decrease steeply in resistance as the carriers leave the donors or acceptors. After most of the donors or acceptors have lost their carriers, the resistance starts to increase again slightly due to the reducing mobility of the carriers (much as in a metal). At higher temperatures they behave like intrinsic semiconductors, as the carriers from the donors/acceptors become insignificant compared to the thermally generated carriers. In non-crystalline semiconductors, conduction can occur by charges quantum tunnelling from one localised site to another. This is known as variable range hopping and has the characteristic form ρ ∝ exp(T^(−1/n)), where n = 2, 3, 4, depending on the dimensionality of the system.
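The sketch below fits the Steinhart–Hart coefficients from three (resistance, temperature) calibration points and then converts a resistance reading to temperature; it is written in terms of the thermistor resistance R, which is proportional to ρ for a fixed geometry, and the calibration values are illustrative.

import numpy as np

R_cal = np.array([32650.0, 10000.0, 3603.0])     # ohm, readings at the calibration points
T_cal = np.array([273.15, 298.15, 323.15])       # kelvin (0, 25, 50 C)

L = np.log(R_cal)
M = np.column_stack([np.ones(3), L, L**3])
A, B, C = np.linalg.solve(M, 1.0 / T_cal)        # solve 1/T = A + B ln R + C (ln R)^3

def temperature(R):
    x = np.log(R)
    return 1.0 / (A + B * x + C * x**3)

print(temperature(10000.0) - 273.15)             # ~25.0 C, recovering a calibration point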

Complex resistivity and conductivity


When analyzing the response of materials to alternating electric fields, in applications such as electrical impedance tomography, it is necessary to replace resistivity with a complex quantity called impeditivity (in analogy to electrical impedance). Impeditivity is the sum of a real component, the resistivity, and an imaginary component, the reactivity (in analogy to reactance). The magnitude of Impeditivity is the square root of sum of squares of magnitudes of resistivity and reactivity. Conversely, in such cases the conductivity must be expressed as a complex number (or even as a matrix of complex numbers, in the case of anisotropic materials) called the admittivity. Admittivity is the sum of a real component called the conductivity and an imaginary component called the susceptivity. An alternative description of the response to alternating currents uses a real (but frequency-dependent) conductivity, along with a real permittivity. The larger the conductivity is, the more quickly the alternating-current signal is absorbed by the material (i.e., the more opaque the material is). For details, see Mathematical descriptions of opacity.


Tensor equations for anisotropic materials


Some materials are anisotropic, meaning they have different properties in different directions. For example, a crystal of graphite consists microscopically of a stack of sheets, and current flows very easily through each sheet, but moves much less easily from one sheet to the next. For an anisotropic material, it is not generally valid to use the scalar equations

J = σ E,   E = ρ J.

For example, the current may not flow in exactly the same direction as the electric field. Instead, the equations are generalized to the 3D tensor form

J = σ E,   E = ρ J,

where the conductivity σ and the resistivity ρ are rank-2 tensors (in other words, 3×3 matrices). The equations are compactly written in component form (using index notation and the summation convention):

J_i = σ_ij E_j,   E_i = ρ_ij J_j.

The σ and ρ tensors are inverses (in the sense of a matrix inverse). The individual components are not necessarily inverses; for example, σ_xx may not be equal to 1/ρ_xx, as the numerical sketch below illustrates.
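The following short numerical illustration (with an invented anisotropic conductivity tensor) confirms that ρ is the matrix inverse of σ while the diagonal components are not simple reciprocals once off-diagonal terms are present.

import numpy as np

sigma = np.array([[5.0, 1.0, 0.0],
                  [1.0, 3.0, 0.0],
                  [0.0, 0.0, 2.0]])        # S/m, made-up anisotropic conductivity

rho = np.linalg.inv(sigma)                 # ohm metre
print(rho[0, 0], 1 / sigma[0, 0])          # ~0.214 vs 0.2 -> not equal
print(np.allclose(sigma @ rho, np.eye(3))) # True: the tensors are matrix inverses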

Resistance versus resistivity in complicated geometries


If the material's resistivity is known, calculating the resistance of something made from it may, in some cases, be much more complicated than the formula above. One example is Spreading Resistance Profiling, where the material is inhomogeneous (different resistivity in different places), and the exact paths of current flow are not obvious. In cases like this, the formulas

R = ρ ℓ / A   and   G = σ A / ℓ

need to be replaced with

E(r) = ρ(r) J(r),

where E and J are now vector fields. This equation, along with the continuity equation for J and Poisson's equation for E, forms a set of partial differential equations. In special cases, an exact or approximate solution to these equations can be worked out by hand, but for very accurate answers in complex cases, computer methods like finite element analysis may be required.

Resistivity density products


In some applications where the weight of an item is very important, the resistivity–density product matters more than absolutely low resistivity: it is often possible to make the conductor thicker to make up for a higher resistivity, and then a material with a low resistivity–density product (or equivalently a high conductance-to-density ratio) is desirable. For example, for long-distance overhead power lines, aluminium is frequently used rather than copper because it is lighter for the same conductance.


Material      Resistivity (nΩ·m)   Density (g/cm³)   Resistivity–density product (nΩ·m·g/cm³)
Sodium        47.7                 0.97              46
Lithium       92.8                 0.53              49
Calcium       33.6                 1.55              52
Potassium     72.0                 0.89              64
Beryllium     35.6                 1.85              66
Aluminium     26.50                2.70              72
Magnesium     43.90                1.74              76.3
Copper        16.78                8.96              150
Silver        15.87                10.49             166
Gold          22.14                19.30             427
Iron          96.1                 7.874             757

Silver, although it is the least resistive metal known, has a high density and does poorly by this measure. Calcium and the alkali metals have the best resistivity–density products, but are rarely used for conductors due to their high reactivity with water and oxygen. Aluminium is far more stable. Two other important attributes, price and toxicity, exclude what would otherwise be the best choice: beryllium. Thus, aluminium is usually the metal of choice when the weight or cost of the conductor is the driving consideration.
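The figures in the table can be reproduced with a few lines of Python; the values below are simply copied from the table above.

metals = {
    # name: (resistivity in nOhm*m, density in g/cm^3)
    "Sodium":    (47.7, 0.97),
    "Aluminium": (26.50, 2.70),
    "Copper":    (16.78, 8.96),
    "Silver":    (15.87, 10.49),
}

for name, (resistivity, density) in metals.items():
    product = resistivity * density      # nOhm*m*g/cm^3
    print(f"{name:10s} {product:6.1f}")
# Lower products (sodium, aluminium) mean better conductance per unit weight
# than copper or silver, as discussed above.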


LCR meter
An LCR meter is a piece of electronic test equipment used to measure the inductance (L), capacitance (C), and resistance (R) of a component. In the simpler versions of this instrument the true values of these quantities are not measured; rather the impedance is measured internally and converted for display to the corresponding capacitance or inductance value. Readings will be reasonably accurate if the capacitor or inductor device under test does not have a significant resistive component of impedance. More advanced designs measure true inductance or capacitance, and also the equivalent series resistance of capacitors and the Q factor of inductive components.
Usually the device under test (DUT) is subjected to an AC voltage source. The meter measures the voltage across and the current through the DUT. From the ratio of these the meter can determine the magnitude of the impedance. The phase angle between the voltage and current is also measured in more advanced instruments; in combination with the impedance, the equivalent capacitance or inductance, and resistance, of the DUT can be calculated and displayed. The meter must assume either a parallel or a series model for these two elements. The most useful assumption, and the one usually adopted, is that LR measurements have the elements in series (as would be encountered in an inductor coil) and that CR measurements have the elements in parallel (as would be encountered in measuring a capacitor with a leaky dielectric). An LCR meter can also be used to judge the inductance variation with respect to the rotor position in permanent magnet machines (however, care must be taken, as some LCR meters can be damaged by the EMF produced by turning the rotor of a permanent-magnet motor).

[Figure: Benchtop LCR meter with fixture.]
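The series/parallel interpretation the meter applies can be sketched in a few lines of Python. This is a simplified illustration of the idea, not the algorithm of any particular instrument.

import math

def lcr_from_impedance(z_mag, phase_deg, freq_hz):
    # Rebuild the complex impedance from magnitude and phase, then pick a model:
    # series R-L for inductive readings, parallel R-C for capacitive readings.
    omega = 2 * math.pi * freq_hz
    theta = math.radians(phase_deg)
    z = z_mag * complex(math.cos(theta), math.sin(theta))
    if z.imag >= 0:                      # inductive
        return {"model": "series", "R": z.real, "L": z.imag / omega}
    y = 1 / z                            # capacitive: work with the admittance
    return {"model": "parallel", "R": 1 / y.real, "C": y.imag / omega}

print(lcr_from_impedance(62.9, 87.0, 1000.0))   # roughly a 10 mH inductor with a small series R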

Handheld LCR meters typically have selectable test frequencies of 100 Hz, 120 Hz, 1 kHz and 10 kHz, with 100 kHz available on top-end meters. The display resolution and measurement-range capability will typically change with the test frequency. Benchtop LCR meters typically have selectable test frequencies of more than 100 kHz. They often include the possibility of superimposing a DC voltage or current on the AC measuring signal. Lower-end meters allow these DC voltages or currents to be supplied externally, while higher-end devices can supply them internally. In addition, benchtop meters allow the use of special fixtures to measure SMD components, air-core coils or transformers.


Bridge circuits
Inductance, capacitance, resistance, and dissipation factor can also be measured by various bridge circuits. They involve adjusting variable calibrated elements until the signal at a detector becomes null, rather than measuring impedance and phase angle. Early commercial LCR bridges used a variety of techniques involving the matching or "nulling" of two signals derived from a single source. The first signal was generated by applying the test signal to the unknown and the second signal was generated by utilizing a combination of known-value R and C standards. The signals were summed through a detector (normally a panel meter with or without some level of amplification). When zero current was noted by changing the value of the standards and looking for a "null" in the panel meter, it could be assumed that the current magnitude through the unknown was equal to that of the standard and that the phase was exactly the reverse (180 degrees apart). The combination of standards selected could be arranged to read out C and DF directly which was the precise value of the unknown standard. An example of this is the GenRad/IET Labs Model 1620 and 1621 Capacitance Bridges.

Device under test


Device under test (DUT), also known as unit under test (UUT), is a term commonly used to refer to a manufactured product undergoing testing.

Electronics testing
The term DUT is used within the electronics industry to refer to any electronic assembly under test. For example, cell phones coming off an assembly line may be given a final test in the same way as the individual chips were earlier tested. Each cell phone under test is, briefly, the DUT. The DUT is often connected to the test equipment using a bed-of-nails tester made of pogo pins.

Semiconductor testing
In semiconductor testing, the device under test is a die on a wafer or the resulting packaged part. A connection system is used, connecting the part to automatic or manual test equipment. The test equipment then applies power to the part, supplies stimulus signals, then measures and evaluates the resulting outputs from the device. In this way, the tester determines whether the particular device under test meets the device specifications. While packaged as a wafer, automatic test equipment (ATE) can connect to the individual units using a set of microscopic needles. Once the chips are sawn apart and packaged, test equipment can connect to the chips using ZIF sockets (sometimes called contactors).


Q factor
In physics and engineering the quality factor or Q factor is a dimensionless parameter that describes how under-damped an oscillator or resonator is, or equivalently, characterizes a resonator's bandwidth relative to its center frequency. Higher Q indicates a lower rate of energy loss relative to the stored energy of the resonator; the oscillations die out more slowly. A pendulum suspended from a high-quality bearing, oscillating in air, has a high Q, while a pendulum immersed in oil has a low one. Resonators with high quality factors have low damping so that they ring longer.

[Figure: The bandwidth Δf, or f1 to f2, of a damped oscillator shown on a graph of energy versus frequency. The Q factor of the damped oscillator, or filter, is f_r/Δf. The higher the Q, the narrower and 'sharper' the peak is.]

Explanation
Sinusoidally driven resonators having higher Q factors resonate with greater amplitudes (at the resonant frequency) but have a smaller range of frequencies around that frequency for which they resonate; the range of frequencies for which the oscillator resonates is called the bandwidth. Thus, a high-Q tuned circuit in a radio receiver would be more difficult to tune, but would have more selectivity; it would do a better job of filtering out signals from other stations that lie nearby on the spectrum. High-Q oscillators oscillate with a smaller range of frequencies and are more stable. (See oscillator phase noise.)

The quality factor of oscillators varies substantially from system to system. Systems for which damping is important (such as dampers keeping a door from slamming shut) have Q near 1/2. Clocks, lasers, and other resonating systems that need either strong resonance or high frequency stability have high quality factors. Tuning forks have quality factors around 1000. The quality factor of atomic clocks, superconducting RF cavities used in accelerators, and some high-Q lasers can reach as high as 10^11 and higher.

There are many alternative quantities used by physicists and engineers to describe how damped an oscillator is. Important examples include the damping ratio, relative bandwidth, linewidth, and bandwidth measured in octaves.

The concept of Q originated with K. S. Johnson of Western Electric Company's Engineering Department while evaluating the quality of coils (inductors). His choice of the symbol Q was only because all other letters of the alphabet were taken. The term was not intended as an abbreviation for "quality" or "quality factor", although these terms have grown to be associated with it.


Definition of the quality factor


In the context of resonators, Q is defined in terms of the ratio of the energy stored in the resonator to the energy supplied by a generator, per cycle, to keep signal amplitude constant, at a frequency f_r (the resonant frequency) where the stored energy is constant with time:

Q = 2π × (energy stored) / (energy dissipated per cycle).

The factor of 2π makes Q expressible in simpler terms, involving only the coefficients of the second-order differential equation describing most resonant systems, electrical or mechanical. In electrical systems, the stored energy is the sum of energies stored in lossless inductors and capacitors; the lost energy is the sum of the energies dissipated in resistors per cycle. In mechanical systems, the stored energy is the maximum possible stored energy, or the total energy, i.e. the sum of the potential and kinetic energies at some point in time; the lost energy is the work done by an external force, per cycle, to maintain amplitude. For high values of Q, the following definition is also mathematically accurate:

Q = f_r / Δf = ω_r / Δω,

where f_r is the resonant frequency, Δf is the half-power bandwidth (i.e. the bandwidth over which the power of vibration is greater than half the power at the resonant frequency), ω_r = 2πf_r is the angular resonant frequency, and Δω is the angular half-power bandwidth. More generally, and in the context of reactive component specification (especially inductors), the frequency-dependent definition of Q is used:

Q(ω) = ω × (maximum energy stored) / (power loss),

where ω is the angular frequency at which the stored energy and power loss are measured. This definition is consistent with its usage in describing circuits with a single reactive element (capacitor or inductor), where it can be shown to be equal to the ratio of reactive power to real power. (See Individual reactive components.)
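Both definitions are straightforward to evaluate numerically; a small sketch with illustrative numbers:

import math

def q_from_bandwidth(f_resonant, bandwidth):
    # Q = f_r / delta_f (accurate for high-Q resonators)
    return f_resonant / bandwidth

def q_from_energy(omega, energy_stored, power_loss):
    # Frequency-dependent definition: Q = omega * (stored energy) / (power loss)
    return omega * energy_stored / power_loss

print(q_from_bandwidth(1.0e6, 2.0e3))                      # 500
print(q_from_energy(2 * math.pi * 1.0e6, 1e-9, 12.6e-6))   # about 499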

Q factor and damping


The Q factor determines the qualitative behavior of simple damped oscillators. (For mathematical details about these systems and their behavior see harmonic oscillator and linear time-invariant (LTI) system.)

A system with low quality factor (Q < 1/2) is said to be overdamped. Such a system doesn't oscillate at all, but when displaced from its equilibrium steady-state output it returns to it by exponential decay, approaching the steady-state value asymptotically. It has an impulse response that is the sum of two decaying exponential functions with different rates of decay. As the quality factor decreases the slower decay mode becomes stronger relative to the faster mode and dominates the system's response, resulting in a slower system. A second-order low-pass filter with a very low quality factor has a nearly first-order step response; the system's output responds to a step input by slowly rising toward an asymptote.

A system with high quality factor (Q > 1/2) is said to be underdamped. Underdamped systems combine oscillation at a specific frequency with a decay of the amplitude of the signal. Underdamped systems with a low quality factor (a little above Q = 1/2) may oscillate only once or a few times before dying out. As the quality factor increases, the relative amount of damping decreases. A high-quality bell rings with a single pure tone for a very long time after being struck. A purely oscillatory system, such as a bell that rings forever, has an infinite quality factor. More generally, the output of a second-order low-pass filter with a very high quality factor responds to a step input by quickly rising above, oscillating around, and eventually converging to a steady-state value.

A system with an intermediate quality factor (Q = 1/2) is said to be critically damped. Like an overdamped system, the output does not oscillate, and does not overshoot its steady-state output (i.e., it approaches a steady-state asymptote). Like an underdamped response, the output of such a system responds quickly to a unit step input. Critical damping results in the fastest response (approach to the final value) possible without overshoot. Real system specifications usually allow some overshoot for a faster initial response or require a slower initial response to provide a safety margin against overshoot.

In negative feedback systems, the dominant closed-loop response is often well-modeled by a second-order system. The phase margin of the open-loop system sets the quality factor Q of the closed-loop system; as the phase margin decreases, the approximate second-order closed-loop system is made more oscillatory (i.e., has a higher quality factor).


Quality factors of common systems


A unity-gain Sallen–Key filter topology with equivalent capacitors and equivalent resistors is critically damped (Q = 1/2). A second-order Butterworth filter (i.e., a continuous-time filter with the flattest passband frequency response) has an underdamped Q = 1/√2. A Bessel filter (i.e., a continuous-time filter with the flattest group delay) has an underdamped Q = 1/√3.

Physical interpretation of Q
Physically speaking, Q is 2π times the ratio of the total energy stored to the energy lost in a single cycle, or equivalently the ratio of the stored energy to the energy dissipated over one radian of the oscillation. It is a dimensionless parameter that compares the time constant for decay of an oscillating physical system's amplitude to its oscillation period. Equivalently, it compares the frequency at which a system oscillates to the rate at which it dissipates its energy. Equivalently (for large values of Q), the Q factor is approximately the number of oscillations required for a freely oscillating system's energy to fall off to e^(−2π), or about 1/535 (0.2%), of its original energy. The width (bandwidth) of the resonance is given by Δf = f_r/Q, where f_r is the resonant frequency and Δf, the bandwidth, is the width of the range of frequencies for which the energy is at least half its peak value.

The factors Q, damping ratio ζ, natural frequency ω_0 and exponential attenuation rate α are related such that

ζ = 1/(2Q) = α/ω_0.

So the quality factor can be expressed as

Q = 1/(2ζ) = ω_0/(2α),

and the exponential attenuation rate can be expressed as

α = ζω_0 = ω_0/(2Q).

For any second-order low-pass filter, the response function of the filter is

H(s) = ω_0² / (s² + (ω_0/Q)s + ω_0²).


For this system, when Q > 1/2 (i.e., when the system is underdamped), it has two complex conjugate poles that each have a real part of −α. That is, the attenuation parameter α represents the rate of exponential decay of the oscillations (e.g., after an impulse) of the system. A higher quality factor implies a lower attenuation rate, and so high-Q systems oscillate for long times. For example, high-quality bells have an approximately pure sinusoidal tone for a long time after being struck by a hammer.
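The step-response behaviour described above can be checked numerically. The sketch below, assuming NumPy and SciPy are available, simulates the second-order low-pass prototype for an overdamped, a critically damped, and an underdamped choice of Q.

import numpy as np
from scipy import signal

def second_order_lowpass_step(q, w0=1.0, t_end=40.0):
    # H(s) = w0^2 / (s^2 + (w0/q) s + w0^2)
    system = signal.TransferFunction([w0 ** 2], [1.0, w0 / q, w0 ** 2])
    t = np.linspace(0.0, t_end, 2000)
    return signal.step(system, T=t)

for q in (0.2, 0.5, 5.0):        # overdamped, critically damped, underdamped
    t, y = second_order_lowpass_step(q)
    print(f"Q = {q}: peak output = {y.max():.3f}")   # overshoot appears only for Q > 1/2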
Filter type (2nd order)   Transfer function H(s)
Low-pass                  ω₀² / (s² + (ω₀/Q)s + ω₀²)
Band-pass                 (ω₀/Q)s / (s² + (ω₀/Q)s + ω₀²)
Notch                     (s² + ω₀²) / (s² + (ω₀/Q)s + ω₀²)
High-pass                 s² / (s² + (ω₀/Q)s + ω₀²)

Electrical systems
For an electrically resonant system, the Q factor represents the effect of electrical resistance and, for electromechanical resonators such as quartz crystals, mechanical friction.

RLC circuits
In an ideal series RLC circuit, and in a tuned radio frequency (TRF) receiver, the Q factor is

Q = (1/R)·√(L/C),

where R, L and C are the resistance, inductance and capacitance of the tuned circuit, respectively. The larger the series resistance, the lower the circuit Q.

[Figure: A graph of a filter's gain magnitude, illustrating the concept of −3 dB at a voltage gain of 0.707, or half-power bandwidth. The frequency axis of this symbolic diagram can be linear or logarithmically scaled.]

For a parallel RLC circuit, the Q factor is the inverse of the series case:

Q = R·√(C/L).

Consider a circuit where R, L and C are all in parallel. The lower the parallel resistance, the more effect it has in damping the circuit and thus the lower the Q. This is useful in filter design to determine the bandwidth.

In a parallel LC circuit where the main loss is the resistance of the inductor, R, in series with the inductance, L, Q is as in the series circuit. This is a common circumstance for resonators, where limiting the resistance of the inductor to improve Q and narrow the bandwidth is the desired result.
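A quick numerical check of the series and parallel expressions (component values are illustrative):

import math

def q_series_rlc(R, L, C):
    # Series RLC: Q = (1/R) * sqrt(L/C)
    return math.sqrt(L / C) / R

def q_parallel_rlc(R, L, C):
    # Parallel RLC: Q = R * sqrt(C/L)
    return R * math.sqrt(C / L)

# A 1 uH / 100 pF tank (resonant near 15.9 MHz):
print(q_series_rlc(2.0, 1e-6, 100e-12))       # = 50
print(q_parallel_rlc(5000.0, 1e-6, 100e-12))  # = 50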


Individual reactive components


The Q of an individual reactive component depends on the frequency at which it is evaluated, which is typically the resonant frequency of the circuit that it is used in. The Q of an inductor with a series loss resistance is the same as the Q of a resonant circuit using that inductor with a perfect capacitor:

Q_L = X_L / R_L = ω₀L / R_L,

where ω₀ is the resonance frequency in radians per second, L is the inductance, X_L = ω₀L is the inductive reactance, and R_L is the series resistance of the inductor.

The Q of a capacitor with a series loss resistance is the same as the Q of a resonant circuit using that capacitor with a perfect inductor:

Q_C = X_C / R_C = 1 / (ω₀ C R_C),

where ω₀ is the resonance frequency in radians per second, C is the capacitance, X_C = 1/(ω₀C) is the capacitive reactance, and R_C is the series resistance of the capacitor.

In general, the Q of a resonator involving a capacitor and an inductor can be determined from the Q values of the components, whether their losses come from series resistance or otherwise:

1/Q = 1/Q_L + 1/Q_C.
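The component Q values and the combined resonator Q can be evaluated directly; a short sketch with illustrative values:

import math

def q_inductor(omega, L, r_series):
    # Q_L = omega*L / R_series
    return omega * L / r_series

def q_capacitor(omega, C, r_series):
    # Q_C = 1 / (omega*C*R_series)
    return 1.0 / (omega * C * r_series)

def q_resonator(q_l, q_c):
    # Combined Q: 1/Q = 1/Q_L + 1/Q_C
    return 1.0 / (1.0 / q_l + 1.0 / q_c)

omega0 = 2 * math.pi * 15.9e6                 # resonance of 1 uH with 100 pF
ql = q_inductor(omega0, 1e-6, 2.0)            # about 50
qc = q_capacitor(omega0, 100e-12, 0.2)        # about 500
print(ql, qc, q_resonator(ql, qc))            # combined Q of about 45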

Mechanical systems
For a single damped mass–spring system, the Q factor represents the effect of simplified viscous damping or drag, where the damping force or drag force is proportional to velocity. The formula for the Q factor is

Q = √(Mk) / D,

where M is the mass, k is the spring constant, and D is the damping coefficient, defined by the equation F_damping = −Dv, where v is the velocity.


Optical systems
In optics, the Q factor of a resonant cavity is given by

Q = 2πf₀·E / P,

where f₀ is the resonant frequency, E is the stored energy in the cavity, and P is the power dissipated.

The optical Q is equal to the ratio of the resonant frequency to the bandwidth of the cavity resonance. The average lifetime of a resonant photon in the cavity is proportional to the cavity's Q. If the Q factor of a laser's cavity is abruptly changed from a low value to a high one, the laser will emit a pulse of light that is much more intense than the laser's normal continuous output. This technique is known as Q-switching.


Electrical impedance

Electrical impedance is the measure of the opposition that a circuit presents to a current when a voltage is applied. In quantitative terms, it is the complex ratio of the voltage to the current in an alternating current (AC) circuit. Impedance extends the concept of resistance to AC circuits, and possesses both magnitude and phase, unlike resistance, which has only magnitude. When a circuit is driven with direct current (DC), there is no distinction between impedance and resistance; the latter can be thought of as impedance with zero phase angle.

It is necessary to introduce the concept of impedance in AC circuits because there are two additional impeding mechanisms to be taken into account besides the normal resistance of DC circuits: the induction of voltages in conductors self-induced by the magnetic fields of currents (inductance), and the electrostatic storage of charge induced by voltages between conductors (capacitance). The impedance caused by these two effects is collectively referred to as reactance and forms the imaginary part of complex impedance, whereas resistance forms the real part.

[Figure: A graphical representation of the complex impedance plane.]

The symbol for impedance is usually Z, and it may be represented by writing its magnitude and phase in the form |Z|∠θ. However, complex number representation is often more powerful for circuit analysis purposes. The term impedance was coined by Oliver Heaviside in July 1886. Arthur Kennelly was the first to represent impedance with complex numbers in 1893.

Impedance is defined as the frequency-domain ratio of the voltage to the current. In other words, it is the voltage-current ratio for a single complex exponential at a particular frequency ω. In general, impedance will be a complex number, with the same units as resistance, for which the SI unit is the ohm (Ω). For a sinusoidal current or voltage input, the polar form of the complex impedance relates the amplitude and phase of the voltage and current. In particular, the magnitude of the complex impedance is the ratio of the voltage amplitude to the current amplitude, and the phase of the complex impedance is the phase shift by which the current lags the voltage. The reciprocal of impedance is admittance (i.e., admittance is the current-to-voltage ratio, and it conventionally carries units of siemens, formerly called mhos).


Complex impedance
Impedance is represented as a complex quantity, and the term complex impedance may be used interchangeably; the polar form conveniently captures both magnitude and phase characteristics:

Z = |Z| e^(jθ),

where the magnitude |Z| represents the ratio of the voltage difference amplitude to the current amplitude, while the argument arg(Z) (commonly given the symbol θ) gives the phase difference between voltage and current. j is the imaginary unit, and is used instead of i in this context to avoid confusion with the symbol for electric current. In Cartesian form,

Z = R + jX,

where the real part of the impedance is the resistance R and the imaginary part is the reactance X.

Where it is required to add or subtract impedances, the Cartesian form is more convenient, but when quantities are multiplied or divided the calculation becomes simpler if the polar form is used. A circuit calculation, such as finding the total impedance of two impedances in parallel, may require conversion between forms several times during the calculation. Conversion between the forms follows the normal conversion rules of complex numbers.
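Python's built-in complex type and the cmath module handle these conversions directly; a minimal sketch:

import cmath

def to_polar(z):
    # Return (|Z|, phase in degrees) for Z = R + jX
    return abs(z), cmath.phase(z) * 180.0 / cmath.pi

def from_polar(magnitude, phase_deg):
    # Build Z = |Z| * e^(j*theta) from magnitude and phase in degrees
    return cmath.rect(magnitude, phase_deg * cmath.pi / 180.0)

z = complex(30.0, 40.0)            # R = 30 ohm, X = +40 ohm (inductive)
print(to_polar(z))                 # (50.0, 53.13...)
print(from_polar(50.0, 53.13))     # approximately (30+40j)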

Ohm's law
The meaning of electrical impedance can be understood by substituting it into Ohm's law:

V = IZ = I|Z| e^(jθ).

The magnitude of the impedance |Z| acts just like resistance, giving the drop in voltage amplitude across an impedance Z for a given current I. The phase factor tells us that the current lags the voltage by a phase of θ (i.e., in the time domain, the current signal is shifted later with respect to the voltage signal).

Just as impedance extends Ohm's law to cover AC circuits, other results from DC circuit analysis such as voltage division, current division, Thévenin's theorem, and Norton's theorem can also be extended to AC circuits by replacing resistance with impedance.

[Figure: An AC supply applying a voltage V across a load Z, driving a current I.]


Complex voltage and current


In order to simplify calculations, sinusoidal voltage and current waves are commonly represented as complex-valued functions of time:

V = |V| e^(j(ωt + φ_V)),   I = |I| e^(j(ωt + φ_I)).

Impedance is defined as the ratio of these quantities:

Z = V / I.

Substituting these into Ohm's law we have

|V| e^(j(ωt + φ_V)) = |I| e^(j(ωt + φ_I)) |Z| e^(jθ).

[Figure: Generalized impedances in a circuit can be drawn with the same symbol as a resistor (US ANSI or DIN Euro) or with a labeled box.]

Noting that this must hold for all t, we may equate the magnitudes and phases to obtain

|V| = |I| |Z|   and   φ_V = φ_I + θ.

The magnitude equation is the familiar Ohm's law applied to the voltage and current amplitudes, while the second equation defines the phase relationship.

Validity of complex representation


This representation using complex exponentials may be justified by noting that (by Euler's formula):

cos(ωt + φ) = ½ [ e^(j(ωt + φ)) + e^(−j(ωt + φ)) ].

The real-valued sinusoidal function representing either voltage or current may be broken into two complex-valued functions. By the principle of superposition, we may analyse the behaviour of the sinusoid on the left-hand side by analysing the behaviour of the two complex terms on the right-hand side. Given the symmetry, we only need to perform the analysis for one right-hand term; the results will be identical for the other. At the end of any calculation, we may return to real-valued sinusoids by further noting that

cos(ωt + φ) = Re{ e^(j(ωt + φ)) }.


Phasors
A phasor is a constant complex number, usually expressed in exponential form, representing the complex amplitude (magnitude and phase) of a sinusoidal function of time. Phasors are used by electrical engineers to simplify computations involving sinusoids, where they can often reduce a differential equation problem to an algebraic one. The impedance of a circuit element can be defined as the ratio of the phasor voltage across the element to the phasor current through the element, as determined by the relative amplitudes and phases of the voltage and current. This is identical to the definition from Ohm's law given above, recognising that the factors of e^(jωt) cancel.

Device examples
The impedance of an ideal resistor is purely real and is referred to as a resistive impedance:

Z_R = R.

In this case, the voltage and current waveforms are proportional and in phase. Ideal inductors and capacitors have a purely imaginary reactive impedance: the impedance of inductors increases as frequency increases,

Z_L = jωL;

the impedance of capacitors decreases as frequency increases,

Z_C = 1/(jωC).

In both cases, for an applied sinusoidal voltage, the resulting current is also sinusoidal, but in quadrature, 90 degrees out of phase with the voltage. However, the phases have opposite signs: in an inductor, the current is lagging; in a capacitor the current is leading.

[Figure: The phase angles in the equations for the impedance of inductors and capacitors indicate that the voltage across a capacitor lags the current through it by a phase of π/2, while the voltage across an inductor leads the current through it by π/2. The identical voltage and current amplitudes indicate that the magnitude of the impedance is equal to one.]

Note the following identities for the imaginary unit and its reciprocal:

j = e^(jπ/2),   1/j = −j = e^(−jπ/2).

Thus the inductor and capacitor impedance equations can be rewritten in polar form:

Z_L = ωL e^(jπ/2),   Z_C = (1/(ωC)) e^(−jπ/2).

The magnitude gives the change in voltage amplitude for a given current amplitude through the impedance, while the exponential factors give the phase relationship.
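The frequency dependence of the three ideal elements is easy to tabulate; a sketch with illustrative component values:

import math

def z_resistor(R):
    return complex(R, 0.0)

def z_inductor(L, freq_hz):
    # Z_L = j*omega*L: magnitude grows with frequency, phase +90 degrees
    return 1j * 2 * math.pi * freq_hz * L

def z_capacitor(C, freq_hz):
    # Z_C = 1/(j*omega*C): magnitude shrinks with frequency, phase -90 degrees
    return 1.0 / (1j * 2 * math.pi * freq_hz * C)

for f in (50.0, 1e3, 100e3):
    zl = z_inductor(10e-3, f)      # 10 mH
    zc = z_capacitor(1e-6, f)      # 1 uF
    print(f"{f:>8.0f} Hz   |Z_L| = {abs(zl):10.2f} ohm   |Z_C| = {abs(zc):10.2f} ohm")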


Deriving the device-specific impedances


What follows below is a derivation of impedance for each of the three basic circuit elements: the resistor, the capacitor, and the inductor. Although the idea can be extended to define the relationship between the voltage and current of any arbitrary signal, these derivations will assume sinusoidal signals, since any arbitrary signal can be approximated as a sum of sinusoids through Fourier analysis.

Resistor

For a resistor, there is the relation

v_R(t) = i_R(t) R.

This is Ohm's law. Considering the voltage signal to be

v_R(t) = V_p sin(ωt),

it follows that

v_R(t) / i_R(t) = V_p sin(ωt) / [ (V_p/R) sin(ωt) ] = R.

This says that the ratio of AC voltage amplitude to alternating current (AC) amplitude across a resistor is R, and that the AC voltage leads the current across a resistor by 0 degrees. This result is commonly expressed as

Z_resistor = R.

Capacitor

For a capacitor, there is the relation

i_C(t) = C · dv_C(t)/dt.

Considering the voltage signal to be

v_C(t) = V_p sin(ωt),

it follows that

dv_C(t)/dt = ωV_p cos(ωt),

and thus

v_C(t) / i_C(t) = V_p sin(ωt) / [ ωV_p C cos(ωt) ] = sin(ωt) / [ ωC sin(ωt + π/2) ].

This says that the ratio of AC voltage amplitude to AC current amplitude across a capacitor is 1/(ωC), and that the AC voltage lags the AC current across a capacitor by 90 degrees (or the AC current leads the AC voltage across a capacitor by 90 degrees). This result is commonly expressed in polar form as

Z_capacitor = (1/(ωC)) e^(−jπ/2),

or, by applying Euler's formula, as

Z_capacitor = 1/(jωC) = −j/(ωC).

Inductor

For the inductor, we have the relation

v_L(t) = L · di_L(t)/dt.

This time, considering the current signal to be

i_L(t) = I_p sin(ωt),

it follows that

di_L(t)/dt = ωI_p cos(ωt),

and thus

v_L(t) / i_L(t) = ωL I_p cos(ωt) / [ I_p sin(ωt) ] = ωL sin(ωt + π/2) / sin(ωt).

This says that the ratio of AC voltage amplitude to AC current amplitude across an inductor is ωL, and that the AC voltage leads the AC current across an inductor by 90 degrees. This result is commonly expressed in polar form as

Z_inductor = ωL e^(jπ/2),

or, using Euler's formula, as

Z_inductor = jωL.

Generalised s-plane impedance


Impedance defined in terms of jω can strictly only be applied to circuits which are energised with a steady-state AC signal. The concept of impedance can be extended to a circuit energised with any arbitrary signal by using complex frequency instead of jω. Complex frequency is given the symbol s and is, in general, a complex number. Signals are expressed in terms of complex frequency by taking the Laplace transform of the time-domain expression of the signal. The impedance of the basic circuit elements in this more general notation is as follows:
Element      Impedance expression
Resistor     R
Inductor     sL
Capacitor    1/(sC)

For a DC circuit this simplifies to s = 0. For a steady-state sinusoidal AC signal, s = jω.


Resistance vs reactance
Resistance and reactance together determine the magnitude and phase of the impedance through the following relations:

|Z| = √(R² + X²),   θ = arctan(X/R).

In many applications the relative phase of the voltage and current is not critical so only the magnitude of the impedance is significant.

Resistance
Resistance is the real part of impedance; a device with a purely resistive impedance exhibits no phase shift between the voltage and current.

Reactance
Reactance is the imaginary part of the impedance; a component with a finite reactance induces a phase shift between the voltage across it and the current through it.

A purely reactive component is distinguished by the sinusoidal voltage across the component being in quadrature with the sinusoidal current through the component. This implies that the component alternately absorbs energy from the circuit and then returns energy to the circuit. A pure reactance will not dissipate any power.

Capacitive reactance

A capacitor has a purely reactive impedance which is inversely proportional to the signal frequency:

X_C = −1/(ωC) = −1/(2πfC).

A capacitor consists of two conductors separated by an insulator, also known as a dielectric.

At low frequencies a capacitor is open circuit, as no charge flows in the dielectric. A DC voltage applied across a capacitor causes charge to accumulate on one side; the electric field due to the accumulated charge is the source of the opposition to the current. When the potential associated with the charge exactly balances the applied voltage, the current goes to zero. Driven by an AC supply, a capacitor will only accumulate a limited amount of charge before the potential difference changes sign and the charge dissipates. The higher the frequency, the less charge will accumulate and the smaller the opposition to the current. Inductive reactance Inductive reactance is proportional to the signal frequency and the inductance .

An inductor consists of a coiled conductor. Faraday's law of electromagnetic induction gives the back emf (voltage opposing current) due to a rate-of-change of magnetic flux density through a current loop.

For an inductor consisting of a coil with

loops this gives.

The back-emf is the source of the opposition to current flow. A constant direct current has a zero rate-of-change and sees an inductor as a short circuit (it is typically made from a material with a low resistivity). An alternating current has a time-averaged rate-of-change that is proportional to frequency; this causes the increase in inductive reactance with frequency.

Total reactance

The total reactance is given by

X = X_L + X_C

(note that X_C is negative),


so that the total impedance is

Z = R + jX.

Combining impedances
The total impedance of many simple networks of components can be calculated using the rules for combining impedances in series and parallel. The rules are identical to those used for combining resistances, except that the numbers in general will be complex numbers. In the general case however, equivalent impedance transforms in addition to series and parallel will be required.

Series combination
For components connected in series, the current through each circuit element is the same; the total impedance is the sum of the component impedances:

Z_eq = Z_1 + Z_2 + ··· + Z_n.

Or explicitly in real and imaginary terms:

Z_eq = R + jX = (R_1 + R_2 + ··· + R_n) + j(X_1 + X_2 + ··· + X_n).

Parallel combination
For components connected in parallel, the voltage across each circuit element is the same; the ratio of currents through any two elements is the inverse ratio of their impedances.

Hence the inverse total impedance is the sum of the inverses of the component impedances:

1/Z_eq = 1/Z_1 + 1/Z_2 + ··· + 1/Z_n,


or, when n = 2:

Z_eq = Z_1 Z_2 / (Z_1 + Z_2).

The equivalent impedance Z_eq can be calculated in terms of the equivalent series resistance R_eq and reactance X_eq:

Z_eq = R_eq + jX_eq,

R_eq = [ (R_1R_2 − X_1X_2)(R_1 + R_2) + (X_1R_2 + X_2R_1)(X_1 + X_2) ] / [ (R_1 + R_2)² + (X_1 + X_2)² ],

X_eq = [ (X_1R_2 + X_2R_1)(R_1 + R_2) − (R_1R_2 − X_1X_2)(X_1 + X_2) ] / [ (R_1 + R_2)² + (X_1 + X_2)² ].
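Because Python's complex numbers obey the same arithmetic, the combination rules translate directly into code; a sketch:

import math

def series(*impedances):
    # Z_eq = Z_1 + Z_2 + ... + Z_n
    return sum(impedances)

def parallel(*impedances):
    # 1/Z_eq = 1/Z_1 + 1/Z_2 + ... + 1/Z_n
    return 1.0 / sum(1.0 / z for z in impedances)

# A 100 ohm resistor in series with a parallel LC (10 mH, 1 uF) at 1 kHz:
w = 2 * math.pi * 1e3
z_total = series(100.0, parallel(1j * w * 10e-3, 1.0 / (1j * w * 1e-6)))
print(z_total, abs(z_total))       # about (100+103.8j) ohm, |Z| about 144 ohm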

Measurement
The measurement of the impedance of devices and transmission lines is a practical problem in radio technology and other fields. Measurements of impedance may be carried out at one frequency, or the variation of device impedance over a range of frequencies may be of interest. The impedance may be measured or displayed directly in ohms, or other values related to impedance may be displayed; for example, in a radio antenna the standing wave ratio or reflection coefficient may be more useful than the impedance alone. Measurement of impedance requires measurement of the magnitude of voltage and current, and the phase difference between them.

Impedance is often measured by "bridge" methods, similar to the direct-current Wheatstone bridge; a calibrated reference impedance is adjusted to balance off the effect of the impedance of the device under test. Impedance measurement in power electronic devices may require simultaneous measurement and provision of power to the operating device.

The impedance of a device can be calculated by complex division of the voltage and current. The impedance of the device can be calculated by applying a sinusoidal voltage to the device in series with a resistor, and measuring the voltage across the resistor and across the device. Performing this measurement by sweeping the frequencies of the applied signal provides the impedance phase and magnitude. An impulse response may be used in combination with the fast Fourier transform (FFT) to rapidly measure the electrical impedance of various electrical devices.

The LCR meter (Inductance (L), Capacitance (C), and Resistance (R)) is a device commonly used to measure the inductance, resistance and capacitance of a component; from these values the impedance at any frequency can be calculated.
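The series-resistor method described above amounts to a single complex division; a sketch, with the measured phasors represented as Python complex numbers:

def dut_impedance(v_dut, v_sense, r_sense):
    # The same current flows through the sense resistor and the DUT,
    # so I = V_sense / R_sense and Z_dut = V_dut / I.
    current = v_sense / r_sense
    return v_dut / current

# Example: 1 V across a 100 ohm sense resistor, 0.5 V at +90 degrees across the DUT
print(dut_impedance(0.5j, 1.0, 100.0))    # 50j ohm, i.e. a 50 ohm inductive reactance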

Variable impedance
In general, neither impedance nor admittance can be time-varying, as they are defined for complex exponentials over −∞ < t < +∞. If the complex exponential voltage-current ratio changes over time or amplitude, the circuit element cannot be described using the frequency domain. However, many systems (e.g., varicaps that are used in radio tuners) may exhibit non-linear or time-varying voltage-current ratios that appear to be linear time-invariant (LTI) for small signals over small observation windows; hence, they can be roughly described as having a time-varying impedance. That is, this description is an approximation; over large signal swings or observation windows, the voltage-current relationship is non-LTI and cannot be described by impedance.


Hall effect

The Hall effect is the production of a voltage difference (the Hall voltage) across an electrical conductor, transverse to an electric current in the conductor and a magnetic field perpendicular to the current. It was discovered by Edwin Hall in 1879. The Hall coefficient is defined as the ratio of the induced electric field to the product of the current density and the applied magnetic field. It is a characteristic of the material from which the conductor is made, since its value depends on the type, number, and properties of the charge carriers that constitute the current.

Discovery
The Hall effect was discovered in 1879 by Edwin Herbert Hall while he was working on his doctoral degree at Johns Hopkins University in Baltimore, Maryland. His measurements of the tiny effect produced in the apparatus he used were an experimental tour de force, accomplished 18 years before the electron was discovered.

Theory
The Hall effect is due to the nature of the current in a conductor. Current consists of the movement of many small charge carriers, typically electrons, holes, ions (see Electromigration) or all three. When a magnetic field is present that is not parallel to the direction of motion of moving charges, these charges experience a force, called the Lorentz force. When such a magnetic field is absent, the charges follow approximately straight, 'line of sight' paths between

collisions with impurities, phonons, etc. However, when a magnetic field with a perpendicular component is applied, their paths between collisions are curved, so that moving charges accumulate on one face of the material. This leaves equal and opposite charges exposed on the other face, where there is a scarcity of mobile charges. The result is an asymmetric distribution of charge density across the Hall element that is perpendicular to both the 'line of sight' path and the applied magnetic field. The separation of charge establishes an electric field that opposes the migration of further charge, so a steady electrical potential is established for as long as the charge is flowing.

In the classical view, there are only electrons moving in the same average direction, both in the case of electron and of hole conductivity. This cannot explain the opposite sign of the Hall effect observed in the two cases. The difference is that electrons in the upper bound of the valence band have opposite group velocity and wave vector direction when moving, which can be effectively treated as if positively charged particles (holes) moved in the opposite direction to that of the electrons.

For a simple metal where there is only one type of charge carrier (electrons), the Hall voltage V_H is given by

V_H = −IB / (net),

where I is the current across the plate length, B is the magnetic field, t is the thickness of the plate, e is the elementary charge, and n is the charge carrier density of the carrier electrons.

[Figure: Hall effect measurement setup for electrons. Initially, the electrons follow the curved arrow, due to the magnetic force. At some distance from the current-introducing contacts, electrons pile up on the left side and deplete from the right side, which creates an electric field ξ_y. In steady state, ξ_y will be strong enough to exactly cancel out the magnetic force, so that the electrons follow the straight arrow (dashed).]

The Hall coefficient is defined as

R_H = E_y / (j_x B),

where j_x is the current density of the carrier electrons and E_y is the induced electric field. In SI units, this becomes

R_H = −1 / (ne).

(The units of R_H are usually expressed as m³/C, or Ω·cm/G, or other variants.) As a result, the Hall effect is very useful as a means to measure either the carrier density or the magnetic field.

One very important feature of the Hall effect is that it differentiates between positive charges moving in one direction and negative charges moving in the opposite. The Hall effect offered the first real proof that electric currents in metals are carried by moving electrons, not by protons. The Hall effect also showed that in some substances (especially p-type semiconductors), it is more appropriate to think of the current as positive "holes" moving rather than negative electrons. A common source of confusion with the Hall effect is that holes moving to the left are really electrons moving to the right, so one expects the same sign of the Hall coefficient for both electrons and holes. This confusion, however, can only be resolved by the modern quantum mechanical theory of transport in solids.

Sample inhomogeneity may result in a spurious sign of the Hall effect, even in an ideal van der Pauw configuration of electrodes. For example, a positive Hall effect has been observed in evidently n-type semiconductors. Another source of artifact, in uniform materials, occurs when the sample's aspect ratio is not long enough: the full Hall voltage only develops far away from the current-introducing contacts, since at the contacts the transverse voltage is shorted out to zero.
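The single-carrier relation can be used either way round, to predict a Hall voltage or to extract a carrier density from one; a sketch with illustrative numbers for a copper strip:

ELEMENTARY_CHARGE = 1.602176634e-19    # C

def hall_voltage(current_a, b_field_t, thickness_m, carrier_density_m3):
    # |V_H| = I*B / (n*e*t) for a single carrier type
    return current_a * b_field_t / (carrier_density_m3 * ELEMENTARY_CHARGE * thickness_m)

def carrier_density(current_a, b_field_t, thickness_m, v_hall_v):
    # Invert the same relation to extract n from a measured Hall voltage
    return current_a * b_field_t / (v_hall_v * ELEMENTARY_CHARGE * thickness_m)

v = hall_voltage(0.1, 1.0, 0.5e-3, 8.5e28)    # 100 mA, 1 T, 0.5 mm copper strip
print(v)                                      # about 1.5e-8 V: a very small effect
print(carrier_density(0.1, 1.0, 0.5e-3, v))   # recovers 8.5e28 m^-3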


Hall effect in semiconductors


When a current-carrying semiconductor is kept in a magnetic field, the charge carriers of the semiconductor experience a force in a direction perpendicular to both the magnetic field and the current. At equilibrium, a voltage appears at the semiconductor edges.

The simple formula for the Hall coefficient given above becomes more complex in semiconductors, where the carriers are generally both electrons and holes, which may be present in different concentrations and have different mobilities. For moderate magnetic fields the Hall coefficient is

R_H = (p μ_h² − n μ_e²) / [ e (p μ_h + n μ_e)² ],

where n is the electron concentration, p the hole concentration, μ_e the electron mobility, μ_h the hole mobility, and e the absolute value of the electronic charge.

For large applied fields the simpler expression, analogous to that for a single carrier type, holds:

R_H = 1 / [ e (p − n) ].
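The moderate-field, two-carrier expression is simple to evaluate; the sketch below uses illustrative numbers for a lightly p-doped semiconductor.

def hall_coefficient_two_carrier(n_e, p_h, mu_e, mu_h, q=1.602176634e-19):
    # R_H = (p*mu_h^2 - n*mu_e^2) / (q * (p*mu_h + n*mu_e)^2), SI units (m^3/C)
    return (p_h * mu_h ** 2 - n_e * mu_e ** 2) / (q * (p_h * mu_h + n_e * mu_e) ** 2)

# Illustrative: p = 1e21 m^-3, n = 1e16 m^-3, mu_e = 0.15, mu_h = 0.045 m^2/(V*s)
print(hall_coefficient_two_carrier(1e16, 1e21, 0.15, 0.045))   # about +6.2e-3 m^3/C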

Relationship with star formation


Although it is well known that magnetic fields play an important role in star formation, recent research shows that Hall diffusion critically influences the dynamics of the gravitational collapse that forms protostars.

Quantum Hall effect


For a two-dimensional electron system which can be produced in a MOSFET, in the presence of large magnetic field strength and low temperature, one can observe the quantum Hall effect, which is the quantization of the Hall voltage.

Spin Hall effect


The spin Hall effect consists in the spin accumulation on the lateral boundaries of a current-carrying sample. No magnetic field is needed. It was predicted by M. I. Dyakonov and V. I. Perel in 1971 and observed experimentally more than 30 years later, both in semiconductors and in metals, at cryogenic as well as at room temperatures.

Quantum spin Hall effect


For mercury telluride two dimensional quantum wells with strong spin-orbit coupling, in zero magnetic field, at low temperature, the Quantum spin Hall effect has been recently observed.

Anomalous Hall effect


In ferromagnetic materials (and paramagnetic materials in a magnetic field), the Hall resistivity includes an additional contribution, known as the anomalous Hall effect (or the extraordinary Hall effect), which depends directly on the magnetization of the material, and is often much larger than the ordinary Hall effect. (Note that this effect is not due to the contribution of the magnetization to the total magnetic field.) For example, in nickel, the anomalous Hall coefficient is about 100 times larger than the ordinary Hall coefficient near the Curie temperature, but the two are similar at very low temperatures. Although a well-recognized phenomenon, there is still debate about its origins in the various materials. The anomalous Hall effect can be either an extrinsic (disorder-related) effect due to spin-dependent scattering of the charge carriers, or an intrinsic effect which can be described in terms of the Berry

phase effect in the crystal momentum space (k-space).


Hall effect in ionized gases


(See electrochemical instability.) The Hall effect in an ionized gas (plasma) is significantly different from the Hall effect in solids (where the Hall parameter is always much less than unity). In a plasma, the Hall parameter can take any value. The Hall parameter, β, in a plasma is the ratio between the electron gyrofrequency, ω_e, and the electron-heavy-particle collision frequency, ν:

β = ω_e / ν = eB / (m_e ν),

where e is the elementary charge (approx. 1.6 × 10⁻¹⁹ C), B is the magnetic field (in teslas), and m_e is the electron mass (approx. 9.1 × 10⁻³¹ kg).

The Hall parameter value increases with the magnetic field strength. Physically, the trajectories of electrons are curved by the Lorentz force. Nevertheless, when the Hall parameter is low, their motion between two encounters with heavy particles (neutral or ion) is almost linear. But if the Hall parameter is high, the electron movements are highly curved. The current density vector, J, is no longer colinear with the electric field vector, E. The two vectors J and E make the Hall angle, θ, which also gives the Hall parameter:

β = tan(θ).
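The Hall parameter and the corresponding Hall angle follow directly from these definitions; a sketch with illustrative plasma values:

import math

ELEMENTARY_CHARGE = 1.602176634e-19    # C
ELECTRON_MASS = 9.1093837015e-31       # kg

def hall_parameter(b_field_t, collision_freq_hz):
    # beta = omega_e / nu = e*B / (m_e * nu)
    return ELEMENTARY_CHARGE * b_field_t / (ELECTRON_MASS * collision_freq_hz)

def hall_angle_deg(beta):
    # The Hall angle theta between J and E satisfies beta = tan(theta)
    return math.degrees(math.atan(beta))

beta = hall_parameter(0.1, 1e9)        # 0.1 T and 1e9 collisions per second (illustrative)
print(beta, hall_angle_deg(beta))      # beta about 17.6, theta about 86.8 degrees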

Applications
Hall probes are often used as magnetometers, i.e. to measure magnetic fields, or to inspect materials (such as tubing or pipelines) using the principles of magnetic flux leakage.

Hall effect devices produce a very low signal level and thus require amplification. While suitable for laboratory instruments, the vacuum tube amplifiers available in the first half of the 20th century were too expensive, power-consuming, and unreliable for everyday applications. It was only with the development of the low-cost integrated circuit that the Hall effect sensor became suitable for mass application. Many devices now sold as Hall effect sensors in fact contain both the sensor as described above and a high-gain integrated circuit (IC) amplifier in a single package. Recent advances have further added into one package an analog-to-digital converter and an I²C (Inter-Integrated Circuit communication protocol) interface for direct connection to a microcontroller's I/O port.


Advantages over other methods


Hall effect devices (when appropriately packaged) are immune to dust, dirt, mud, and water. These characteristics make Hall effect devices better for position sensing than alternative means such as optical and electromechanical sensing. When electrons flow through a conductor, a magnetic field is produced. Thus, it is possible to create a non-contacting current sensor. The device has three terminals. A sensor voltage is applied across two terminals and the third provides a voltage proportional to the current being sensed. This has several advantages; no additional resistance (a shunt, required for the most common current sensing method) need be inserted in the primary circuit. Also, the voltage present on the line to be sensed is not transmitted to the sensor, which enhances the safety of measuring equipment.

Disadvantages compared with other methods


Magnetic flux from the surroundings (such as other wires) may diminish or enhance the field the Hall probe intends to detect, rendering the results inaccurate. Also, as Hall voltage is often on the order of millivolts, the output from this type of sensor cannot be used to directly drive actuators but instead must be amplified by a transistor-based circuit.
[Figure: Hall effect current sensor with internal integrated circuit amplifier, 8 mm opening. Zero-current output voltage is midway between the supply voltages, which maintain a 4 to 8 volt differential. Non-zero current response is proportional to the voltage supplied and is linear to 60 amperes for this particular (25 A) device.]

Contemporary applications
Hall effect sensors are readily available from a number of different manufacturers, and may be used in various sensors such as rotating speed sensors (bicycle wheels, gear-teeth, automotive speedometers, electronic ignition systems), fluid flow sensors, current sensors, and pressure sensors. Common applications are often found where a robust and contactless switch or potentiometer is required. These include: electric airsoft guns, triggers of electropneumatic paintball guns, go-cart speed controls, smart phones, and some global positioning systems.

Ferrite toroid Hall effect current transducer

Hall sensors can detect stray magnetic fields easily, including that of Earth, so they work well as electronic compasses; but this also means that such stray fields can hinder accurate measurements of small magnetic fields. To solve this problem, Hall sensors are often integrated with magnetic shielding of some kind. For example, a Hall sensor integrated into a ferrite ring (as shown) can reduce the detection of stray fields by a factor of 100 or better (as the external magnetic fields cancel across the ring, giving no residual magnetic flux). This configuration also provides an improvement in signal-to-noise ratio and drift effects of over 20 times that of a bare Hall device.

[Figure: Diagram of a Hall effect current transducer integrated into a ferrite ring.]

The range of a given feedthrough sensor may be extended upward and downward by appropriate wiring. To extend the range to lower currents, multiple turns of the current-carrying wire may be made through the opening, each turn adding the same quantity to the sensor output; when the sensor is installed onto a printed circuit board, the turns can be carried out by a staple on the board. To extend the range to higher currents, a current divider may be used. The divider splits the current across two wires of differing widths, and the thinner wire, carrying a smaller proportion of the total current, passes through the sensor.



Split ring clamp-on sensor

A variation on the ring sensor uses a split sensor which is clamped onto the line, enabling the device to be used in temporary test equipment. If used in a permanent installation, a split sensor allows the electric current to be tested without dismantling the existing circuit.

[Figure: Multiple 'turns' and the corresponding transfer function.]

Analog multiplication

The output is proportional to both the applied magnetic field and the applied sensor voltage. If the magnetic field is applied by a solenoid, the sensor output is proportional to the product of the current through the solenoid and the sensor voltage. As most applications requiring computation are now performed by small digital computers, the remaining useful application is in power sensing, which combines current sensing with voltage sensing in a single Hall effect device.

Power measurement

By sensing the current provided to a load and using the device's applied voltage as a sensor voltage, it is possible to determine the power dissipated by a device.

Position and motion sensing

Hall effect devices used in motion sensing and motion limit switches can offer enhanced reliability in extreme environments. As there are no moving parts involved within the sensor or magnet, typical life expectancy is improved compared to traditional electromechanical switches. Additionally, the sensor and magnet may be encapsulated in an appropriate protective material. This application is used in brushless DC motors.

Automotive ignition and fuel injection

Commonly used in distributors for ignition timing (and in some types of crank and camshaft position sensors for injection pulse timing, speed sensing, etc.), the Hall effect sensor is used as a direct replacement for the mechanical breaker points used in earlier automotive applications. Its use as an ignition timing device in various distributor types is as follows. A stationary permanent magnet and a semiconductor Hall effect chip are mounted next to each other, separated by an air gap, forming the Hall effect sensor. A metal rotor consisting of windows and tabs is mounted to a shaft and arranged so that during shaft rotation, the windows and tabs pass through the air gap between the permanent magnet and the semiconductor Hall chip. This effectively shields and exposes the Hall chip to the permanent magnet's field according to whether a tab or window is passing through the Hall sensor. For ignition timing purposes, the metal rotor will have a number of equal-sized tabs and windows matching the number of engine cylinders. This produces a uniform square-wave output, since the on/off (shielding and exposure) time is equal. This signal is used by the engine computer or ECU to control ignition timing. Many automotive Hall effect sensors have a built-in internal NPN transistor with an open collector and grounded emitter, meaning that rather than a voltage being produced at the Hall sensor signal output wire, the transistor is turned on, providing a circuit to ground through the signal output wire.

Wheel rotation sensing

The sensing of wheel rotation is especially useful in anti-lock braking systems. The principles of such systems have been extended and refined to offer more than anti-skid functions, now providing extended vehicle handling enhancements.

Electric motor control

Some types of brushless DC electric motors use Hall effect sensors to detect the position of the rotor and feed that information to the motor controller. This allows for more precise motor control.

Industrial applications

Applications for Hall effect sensing have also expanded to industrial applications, which now use Hall effect joysticks to control hydraulic valves, replacing the traditional mechanical levers with contactless sensing. Such applications include mining trucks, backhoe loaders, cranes, diggers, scissor lifts, etc.

Spacecraft propulsion

A Hall effect thruster (HET) is a relatively low-power device that is used to propel some spacecraft once they get into orbit or farther out into space. In the HET, atoms are ionized and accelerated by an electric field. A radial magnetic field established by magnets on the thruster is used to trap electrons, which then orbit and create an electric field due to the Hall effect. A large potential is established between the end of the thruster where neutral propellant is fed and the part where electrons are produced, so electrons trapped in the magnetic field cannot drop to the lower potential. They are thus extremely energetic, which means that they can ionize neutral atoms. Neutral propellant is pumped into the chamber and is ionized by the trapped electrons. Positive ions and electrons are then ejected from the thruster as a quasineutral plasma, creating thrust.


The Corbino effect


The Corbino effect is a phenomenon involving the Hall effect, but a disc-shaped metal sample is used in place of a rectangular one. Because of its shape the Corbino disc allows the observation of Hall effect-based magnetoresistance without the associated Hall voltage. A radial current through a circular disc, subjected to a magnetic field perpendicular to the plane of the disc, produces a "circular" current through the disc. The absence of the free transverse boundaries renders the interpretation of the Corbino effect simpler than that of the Hall effect.

[Figure: Corbino disc; dashed curves represent logarithmic spiral paths of deflected electrons.]


Hall effect sensor


A Hall effect sensor is a transducer that varies its output voltage in response to a magnetic field. Hall effect sensors are used for proximity switching, positioning, speed detection, and current sensing applications. In its simplest form, the sensor operates as an analog transducer, directly returning a voltage. With a known magnetic field, its distance from the Hall plate can be determined. Using groups of sensors, the relative position of the magnet can be deduced. Electricity carried through a conductor will produce a magnetic field that varies with current, and a Hall sensor can be used to measure the current without interrupting the circuit. Typically, the sensor is integrated with a wound core or permanent magnet that surrounds the conductor to be measured. Frequently, a Hall sensor is combined with circuitry that allows the device to act in a digital (on/off) mode, and may be called a switch in this configuration. Commonly seen in industrial applications such as the pictured pneumatic cylinder, they are also used in consumer equipment; for example some computer printers use them to detect missing paper and open covers. When high reliability is required, they are used in keyboards.

[Figure: The magnetic piston (1) in this pneumatic cylinder will cause the Hall effect sensors (2 and 3) mounted on its outer wall to activate when it is fully retracted or extended.]

Hall sensors are commonly used to time the speed of wheels and shafts, such as for internal combustion engine ignition timing, tachometers and anti-lock braking systems. They are used in brushless DC electric motors to detect the position of the permanent magnet. In the pictured wheel with two equally spaced magnets, the voltage from the sensor will peak twice for each revolution. This arrangement is commonly used to regulate the speed of disk drives.

[Figure: Engine fan with Hall effect sensor.]


Hall probe
A Hall probe contains an indium-compound semiconductor crystal such as indium antimonide, mounted on an aluminum backing plate and encapsulated in the probe head. The plane of the crystal is perpendicular to the probe handle. Connecting leads from the crystal are brought down through the handle to the circuit box.

When the Hall probe is held so that the magnetic field lines are passing at right angles through the sensor of the probe, the meter gives a reading of the value of magnetic flux density (B). A current is passed through the crystal which, when placed in a magnetic field, has a Hall effect voltage developed across it. The Hall effect is seen when a conductor is passed through a uniform magnetic field. The natural electron drift of the charge carriers causes the magnetic field to apply a Lorentz force (the force exerted on a charged particle in an electromagnetic field) to these charge carriers. The result is what is seen as a charge separation, with a build-up of either positive or negative charges on the bottom or on the top of the plate. The crystal measures 5 mm square. The probe handle, being made of a non-ferrous material, has no disturbing effect on the field.

[Figure: Commonly used circuit symbol.]

A Hall probe can be used to measure the Earth's magnetic field. It must be held so that the Earth's field lines are passing directly through it. It is then rotated quickly so the field lines pass through the sensor in the opposite direction. The change in the flux density reading is double the Earth's magnetic flux density. A Hall probe must first be calibrated against a known value of magnetic field strength. For a solenoid, the Hall probe is placed in the center.

Hall effect sensor interface


Hall effect sensors may require analog circuitry to be interfaced to microprocessors. These interfaces may include input diagnostics, fault protection for transient conditions, and short/open circuit detection. The interface may also provide and monitor the current to the Hall effect sensor itself. Precision IC products are available to handle these features.


Scanning electron microscope


A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with atoms in the sample, producing various signals that can be detected and that contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. SEM can achieve resolution better than 1 nanometer. Specimens can be observed in high vacuum, in low vacuum, and (in environmental SEM) in wet conditions. The most common mode of detection is by secondary electrons emitted by atoms excited by the electron beam. The number of secondary electrons is a function of the angle between the surface and the beam. On a flat surface, the plume of secondary electrons is mostly contained by the sample, but on a tilted surface, the plume is partially exposed and more electrons are emitted. By scanning the sample and detecting the secondary electrons, an image displaying the tilt of the surface is created.

These pollen grains taken on an SEM show the characteristic depth of field of SEM micrographs.

History
An account of the early history of SEM has been presented by McMullan. Although Max Knoll produced a photo with a 50mm object-field-width showing channeling contrast by the use of an electron beam scanner, it was Manfred von Ardenne who in 1937 invented a true microscope with high magnification by scanning a very small raster with a demagnified and finely focused electron beam. Ardenne applied the scanning principle not only to achieve magnification but also to purposefully eliminate the chromatic aberration otherwise inherent in the electron microscope. He further discussed the various detection modes, possibilities and theory of SEM, together with the construction of the first high magnification SEM. Further work was reported by Zworykin's group, followed by the Cambridge groups in the 1950s and early 1960s headed by Charles Oatley, all of which finally led to the marketing of the first commercial instrument by Cambridge Scientific Instrument Company as the "Stereoscan" in 1965 (delivered to DuPont).

M. von Ardenne's first SEM

SEM opened sample chamber


Principles and capacities


The types of signals produced by a SEM include secondary electrons (SE), back-scattered electrons (BSE), characteristic X-rays, light (cathodoluminescence) (CL), specimen current and transmitted electrons. Secondary electron detectors are standard equipment in all SEMs, but it is rare that a single machine would have detectors for all possible signals. The signals result from interactions of the electron beam with atoms at or near the surface of the sample. In the most common or standard detection mode, secondary electron imaging or SEI, the SEM can produce very high-resolution images of a sample surface, revealing details less than 1 nm in size. Due to the very narrow electron beam, SEM micrographs have a large depth of field yielding a characteristic three-dimensional appearance useful for understanding the surface structure of a sample. This is exemplified by the micrograph of pollen shown above. A wide range of magnifications is possible, from about 10 times (about equivalent to that of a powerful hand-lens) to more than 500,000 times, about 250 times the magnification limit of the best light microscopes. Back-scattered electrons (BSE) are beam electrons that are reflected from the sample by elastic scattering. BSE are often used in analytical SEM along with the spectra made from the characteristic X-rays, because the intensity of the BSE signal is strongly related to the atomic number (Z) of the specimen. BSE images can provide information about the distribution of different elements in the sample. For the same reason, BSE imaging can image colloidal gold immuno-labels of 5 or 10nm diameter, which would otherwise be difficult or impossible to detect in secondary electron images in biological specimens. Characteristic X-rays are emitted when the electron beam removes an inner shell electron from the sample, causing a higher-energy electron to fill the shell and release energy. These characteristic X-rays are used to identify the composition and measure the abundance of elements in the sample.

Sample preparation
A spider coated in gold, having been prepared for viewing with an SEM.

All samples must also be of an appropriate size to fit in the specimen chamber and are generally mounted rigidly on a specimen holder called a specimen stub. Several models of SEM can examine any part of a 6-inch (15cm) semiconductor wafer, and some can tilt an object of that size to 45°. For conventional imaging in the SEM, specimens must be electrically conductive, at least at the surface, and electrically grounded to prevent the accumulation of electrostatic charge at the surface. Metal objects require little special preparation for SEM except for cleaning and mounting on a specimen stub. Nonconductive specimens tend to charge when scanned by the electron beam, and especially in secondary electron imaging mode, this causes scanning faults and other image artifacts. They are therefore usually coated with an ultrathin coating of electrically conducting material, deposited on the sample either by low-vacuum sputter coating or by high-vacuum evaporation. Conductive materials in current use for specimen coating include gold, gold/palladium alloy, platinum, osmium, iridium, tungsten, chromium, and graphite.


Additionally, coating may increase the signal/noise ratio for samples of low atomic number (Z). The improvement arises because secondary electron emission for high-Z materials is enhanced. An alternative to coating for some biological samples is to increase the bulk conductivity of the material by impregnation with osmium using variants of the OTO staining method (O-osmium, T-thiocarbohydrazide, O-osmium). Nonconducting specimens may be imaged uncoated using environmental SEM (ESEM) or the low-voltage mode of SEM operation. Environmental SEM instruments place the specimen in a relatively high-pressure chamber where the working distance is short and the electron optical column is differentially pumped to keep the vacuum adequately low at the electron gun. The high-pressure region around the sample in the ESEM neutralizes charge and provides an amplification of the secondary electron signal. Low-voltage SEM is typically conducted in an FEG-SEM because the field emission gun (FEG) is capable of producing high primary electron brightness and a small spot size even at low accelerating potentials. Operating conditions to prevent charging of non-conductive specimens must be adjusted such that the incoming beam current is equal to the sum of the outgoing secondary and backscattered electron currents; this usually occurs at accelerating voltages of 0.3–4 kV. Embedding in a resin with further polishing to a mirror-like finish can be used for both biological and materials specimens when imaging in backscattered electrons or when doing quantitative X-ray microanalysis. The main preparation techniques are not required in the environmental SEM outlined below, but some biological specimens can benefit from fixation.

Low-voltage micrograph (300 V) of the distribution of adhesive droplets on a Post-It note. No conductive coating was applied: such a coating would alter this fragile specimen.

Biological samples
For SEM, a specimen is normally required to be completely dry, since the specimen chamber is at high vacuum. Hard, dry materials such as wood, bone, feathers, dried insects, or shells can be examined with little further treatment, but living cells and tissues and whole, soft-bodied organisms usually require chemical fixation to preserve and stabilize their structure. Fixation is usually performed by incubation in a solution of a buffered chemical fixative, such as glutaraldehyde, sometimes in combination with formaldehyde and other fixatives, and optionally followed by postfixation with osmium tetroxide. The fixed tissue is then dehydrated. Because air-drying causes collapse and shrinkage, this is commonly achieved by replacement of water in the cells with organic solvents such as ethanol or acetone, and replacement of these solvents in turn with a transitional fluid such as liquid carbon dioxide by critical point drying. The carbon dioxide is finally removed while in a supercritical state, so that no gas-liquid interface is present within the sample during drying. The dry specimen is usually mounted on a specimen stub using an adhesive such as epoxy resin or electrically conductive double-sided adhesive tape, and sputter-coated with gold or gold/palladium alloy before examination in the microscope. If the SEM is equipped with a cold stage for cryo microscopy, cryofixation may be used and low-temperature scanning electron microscopy performed on the cryogenically fixed specimens. Cryo-fixed specimens may be cryo-fractured under vacuum in a special apparatus to reveal internal structure, sputter-coated, and transferred onto the SEM cryo-stage while still frozen. Low-temperature scanning electron microscopy is also applicable to the imaging of temperature-sensitive materials such as ice (see e.g. illustration at left) and fats. Freeze-fracturing, freeze-etch or freeze-and-break is a preparation method particularly useful for examining lipid membranes and their incorporated proteins in "face on" view. The preparation method reveals the proteins embedded in the lipid bilayer.


Materials
Back-scattered electron imaging, quantitative X-ray analysis, and X-ray mapping of specimens often require that the surfaces be ground and polished to an ultra-smooth finish. Specimens that undergo WDS or EDS analysis are often carbon coated. In general, metals are not coated prior to imaging in the SEM because they are conductive and provide their own pathway to ground. Fractography is the study of fractured surfaces and can be done on a light microscope or, more commonly, on an SEM. The fractured surface is cut to a suitable size, cleaned of any organic residues, and mounted on a specimen holder for viewing in the SEM. Integrated circuits may be cut with a focused ion beam (FIB) or other ion beam milling instrument for viewing in the SEM; in the first case the SEM may be incorporated into the FIB instrument. Metals, geological specimens, and integrated circuits may also be chemically polished for viewing in the SEM. Special high-resolution coating techniques are required for high-magnification imaging of inorganic thin films.

Scanning process and image formation


Schematic of an SEM.

In a typical SEM, an electron beam is thermionically emitted from an electron gun fitted with a tungsten filament cathode. Tungsten is normally used in thermionic electron guns because it has the highest melting point and lowest vapour pressure of all metals, thereby allowing it to be heated for electron emission, and because of its low cost. Other types of electron emitters include lanthanum hexaboride (LaB6) cathodes, which can be used in a standard tungsten filament SEM if the vacuum system is upgraded, and field emission guns (FEG), which may be of the cold-cathode type using tungsten single-crystal emitters or the thermally assisted Schottky type using emitters of zirconium oxide. The electron beam, which typically has an energy ranging from 0.2 keV to 40 keV, is focused by one or two condenser lenses to a spot about 0.4nm to 5nm in diameter. The beam passes through pairs of scanning coils or pairs of deflector plates in the electron column, typically in the final lens, which deflect the beam in the x and y axes so that it scans in a raster fashion over a rectangular area of the sample surface. When the primary electron beam interacts with the sample, the electrons lose energy by repeated random scattering and absorption within a teardrop-shaped volume of the specimen known as the interaction volume, which extends from less than 100nm to approximately 5µm into the surface. The size of the interaction volume depends on the electron's landing energy, the atomic number of the specimen and the specimen's density. The energy exchange between the electron beam and the sample results in the reflection of high-energy electrons by elastic scattering, emission of secondary electrons by inelastic scattering and the emission of electromagnetic radiation, each of which can be detected by specialized detectors. The beam current absorbed by the specimen can also be detected and used to create images of the distribution of specimen current. Electronic amplifiers of various types are used to amplify the signals, which are displayed as variations in brightness on a computer monitor (or, for vintage models, on a cathode ray tube). Each pixel of computer video memory is synchronized with the position of the beam on the specimen in the microscope, and the resulting image is therefore a distribution map of the intensity of the signal being emitted from the scanned area of the specimen. In older microscopes the image may be captured by photography from a high-resolution cathode ray tube, but in modern machines it is saved to computer data storage.

Magnification
Magnification in a SEM can be controlled over a range of up to 6 orders of magnitude from about 10 to 500,000 times. Unlike optical and transmission electron microscopes, image magnification in the SEM is not a function of the power of the objective lens. SEMs may have condenser and objective lenses, but their function is to focus the beam to a spot, and not to image the specimen. Provided the electron gun can generate a beam with sufficiently small diameter, a SEM could in principle work entirely without condenser or objective lenses, although it might not be very versatile or achieve very high resolution. In a SEM, as in scanning probe microscopy, magnification results from the ratio of the dimensions of the raster on the specimen and the raster on the display device. Assuming that the display screen has a fixed size, higher magnification results from reducing the size of the raster on the specimen, and vice versa. Magnification is therefore controlled by the current supplied to the x, y scanning coils, or the voltage supplied to the x, y deflector plates, and not by objective lens power.
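Because magnification is simply the ratio of the raster on the display to the raster scanned on the specimen, it can be illustrated with one line of arithmetic. The sketch below assumes a nominal display width; the numbers are illustrative only.

def sem_magnification(display_width_mm, scanned_width_um):
    """Magnification = size of the raster on the display divided by the
    size of the raster scanned on the specimen."""
    return (display_width_mm * 1000.0) / scanned_width_um

# A 100 mm wide display showing a 10 micrometre wide scan gives 10,000x
print(sem_magnification(100, 10))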

Color
The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as black-and-white images. However, often these images are then colorized through the use of feature-detection software, or simply by hand-editing using a graphics editor. This is usually for aesthetic effect or for clarifying structure, and generally does not add information about the specimen.
Low-temperature SEM magnification series for a snow crystal. The crystals are captured, stored, and sputter-coated with platinum at cryogenic temperatures for imaging.

In some configurations more information is gathered per pixel, often by the use of multiple detectors. The attributes of topography and material contrast can be obtained by a pair of backscattered electron detectors and such attributes can be superimposed on a single color image by assigning a different primary color to each attribute. Similarly, a combination of backscattered and secondary electron signals can be assigned to different colors and superimposed on a single color micrograph displaying simultaneously the properties of the specimen.

In a similar method, secondary electron and backscattered electron detectors are superimposed and a colour is assigned to each of the images captured by each detector, with an end result of a combined colour image where colours are related to the density of the components. This method is known as density-dependent colour SEM (DDC-SEM). Micrographs produced by DDC-SEM retain topographical information, which is better captured by the secondary electron detector, and combine it with information about density obtained by the backscattered electron detector.


Some types of detectors used in SEM have analytical capabilities, and can provide several items of data at each pixel. Examples are the Energy-dispersive X-ray spectroscopy (EDS) detectors used in elemental analysis and Cathodoluminescence microscope (CL) systems that analyse the intensity and spectrum of electron-induced luminescence in (for example) geological specimens. In SEM systems using these detectors it is common to color code the signals and superimpose them in a single color image, so that differences in the distribution of the various components of the specimen can be seen clearly and compared. Optionally, the standard secondary electron image can be merged with one or more compositional channels, so that the specimen's structure and composition can be compared. Such images can be made while maintaining the full integrity of the original signal, which is not modified in any way.

Density-dependent colour scanning electron micrograph SEM (DDC-SEM) of cardiovascular calcification, showing in orange calcium phosphate spherical particles (denser material) and, in green, the extracellular matrix (less dense material).

Detection of secondary electrons


The most common imaging mode collects low-energy (<50 eV) secondary electrons that are ejected from specimen atoms by inelastic scattering interactions with beam electrons. Due to their low energy, these electrons originate within a few nanometers of the sample surface. The electrons are detected by an Everhart-Thornley detector, which is a type of scintillator-photomultiplier system. The secondary electrons are first collected by attracting them towards an electrically biased grid at about +400 V, and then further accelerated towards a phosphor or scintillator positively biased to about +2,000 V. The accelerated secondary electrons are now sufficiently energetic to cause the scintillator to emit flashes of light (cathodoluminescence), which are conducted to a photomultiplier outside the SEM column via a light pipe and a window in the wall of the specimen chamber. The amplified electrical signal output by the photomultiplier is displayed as a two-dimensional intensity distribution that can be viewed and photographed on an analogue video display, or subjected to analog-to-digital conversion and displayed and saved as a digital image. This process relies on a raster-scanned primary beam. The brightness of the signal depends on the number of secondary electrons reaching the detector. If the beam enters the sample perpendicular to the surface, then the activated region is uniform about the axis of the beam and a certain number of electrons "escape" from within the sample. As the angle of incidence increases, the "escape" distance on one side of the beam decreases, and more secondary electrons are emitted. Thus steep surfaces and edges tend to be brighter than flat surfaces, which results in images with a well-defined, three-dimensional appearance. Using the secondary electron signal, an image resolution of better than 0.5nm is possible.
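The dependence of secondary electron emission on the angle of incidence described above is often approximated by a secant law, with the yield proportional to 1/cos(θ) where θ is measured from the surface normal. The sketch below uses this approximation to show why tilted facets and edges appear brighter; the secant law is an idealisation and real yields deviate at grazing incidence.

import math

def relative_se_yield(tilt_deg):
    """Approximate secondary electron yield relative to normal incidence,
    using the secant-law approximation: yield ~ 1 / cos(theta)."""
    return 1.0 / math.cos(math.radians(tilt_deg))

for angle in (0, 30, 60, 80):
    print(angle, round(relative_se_yield(angle), 2))
# 0 deg -> 1.0, 60 deg -> 2.0, 80 deg -> about 5.8: steep facets emit more electrons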


Detection of backscattered electrons


Backscattered electrons (BSE) consist of high-energy electrons originating in the electron beam, that are reflected or back-scattered out of the specimen interaction volume by elastic scattering interactions with specimen atoms. Since heavy elements (high atomic number) backscatter electrons more strongly than light elements (low atomic number), and thus appear brighter in the image, BSE are used to detect contrast between areas with different chemical compositions. The Everhart-Thornley detector, which is normally positioned to one side of the specimen, is inefficient for the detection of backscattered electrons because few such electrons are emitted in the solid angle subtended by the detector, and because the positively biased detection grid has little ability to attract the higher energy BSE. Dedicated backscattered electron detectors are positioned above the sample in a "doughnut" type arrangement, concentric with the electron beam, maximizing the solid angle of collection. BSE detectors are usually either of scintillator or of semiconductor types. When all parts of the detector are used to collect electrons symmetrically about the beam, atomic number contrast is produced. However, strong topographic contrast is produced by collecting back-scattered electrons from one side above the specimen using an asymmetrical, directional BSE detector; the resulting contrast appears as illumination of the topography from that side. Semiconductor detectors can be made in radial segments that can be switched in or out to control the type of contrast produced and its directionality. Backscattered electrons can also be used to form an electron backscatter diffraction (EBSD) image that can be used to determine the crystallographic structure of the specimen.

Comparison of SEM techniques. Top: backscattered electron analysis (composition). Bottom: secondary electron analysis (topography).
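The atomic-number contrast mechanism described above can be made semi-quantitative with an empirical fit for the backscatter coefficient η at normal incidence. The sketch below uses one widely quoted polynomial fit (often attributed to Reuter); treat the coefficients as approximate values taken from the literature rather than from this text.

def backscatter_coefficient(z):
    """Approximate backscatter coefficient at normal incidence as a function
    of atomic number Z, using a commonly quoted empirical polynomial fit."""
    return -0.0254 + 0.016 * z - 1.86e-4 * z**2 + 8.3e-7 * z**3

for name, z in (("carbon", 6), ("iron", 26), ("gold", 79)):
    print(name, round(backscatter_coefficient(z), 2))
# carbon ~0.06, iron ~0.28, gold ~0.49: heavier elements appear brighter in BSE images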

Beam-injection analysis of semiconductors


The nature of the SEM's probe, energetic electrons, makes it uniquely suited to examining the optical and electronic properties of semiconductor materials. The high-energy electrons from the SEM beam will inject charge carriers into the semiconductor. Thus, beam electrons lose energy by promoting electrons from the valence band into the conduction band, leaving behind holes. In a direct bandgap material, recombination of these electron-hole pairs will result in cathodoluminescence; if the sample contains an internal electric field, such as is present at a p-n junction, the SEM beam injection of carriers will cause electron beam induced current (EBIC) to flow. Cathodoluminescence and EBIC are referred to as "beam-injection" techniques, and are very powerful probes of the optoelectronic behavior of semiconductors, in particular for studying nanoscale features and defects.

Cathodoluminescence
Cathodoluminescence, the emission of light when atoms excited by high-energy electrons return to their ground state, is analogous to UV-induced fluorescence, and some materials such as zinc sulfide and some fluorescent dyes, exhibit both phenomena. Cathodoluminescence is most commonly experienced in everyday life as the light emission from the inner surface of the cathode ray tube in television sets and computer CRT monitors. In the SEM, CL detectors either collect all light emitted by the specimen or can analyse the wavelengths emitted by the specimen and display an emission spectrum or an image of the distribution of cathodoluminescence emitted by the specimen in real color.


X-ray microanalysis
X-rays, which are produced by the interaction of electrons with the sample, may also be detected in an SEM equipped for energy-dispersive X-ray spectroscopy or wavelength dispersive X-ray spectroscopy.

Resolution of the SEM


The spatial resolution of the SEM depends on the size of the electron spot, which in turn depends on both the wavelength of the electrons and the electron-optical system that produces the scanning beam. The resolution is also limited by the size of the interaction volume, or the extent to which the material interacts with the electron beam. The spot size and the interaction volume are both large compared to the distances between atoms, so the resolution of the SEM is not high enough to image individual atoms, as is possible in the shorter-wavelength (i.e. higher-energy) transmission electron microscope (TEM). The SEM has compensating advantages, though, including the ability to image a comparatively large area of the specimen; the ability to image bulk materials (not just thin films or foils); and the variety of analytical modes available for measuring the composition and properties of the specimen. Depending on the instrument, the resolution can fall somewhere between less than 1nm and 20nm. As of 2009, the world's highest SEM resolution at high beam energies (0.4nm at 30 kV) was obtained with the Hitachi SU-9000.[5]
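The dependence of resolution on electron wavelength mentioned above can be illustrated with the de Broglie relation, including its small relativistic correction. The sketch below evaluates it for a few typical SEM accelerating voltages; it is a back-of-the-envelope calculation, not a resolution predictor for any particular instrument.

import math

H = 6.626e-34       # Planck constant, J s
M_E = 9.109e-31     # electron rest mass, kg
Q_E = 1.602e-19     # elementary charge, C
C = 2.998e8         # speed of light, m/s

def electron_wavelength_pm(accel_voltage):
    """Relativistically corrected de Broglie wavelength (picometres) of an
    electron accelerated through accel_voltage volts:
    lambda = h / sqrt(2 m e V (1 + e V / (2 m c^2)))."""
    e_kin = Q_E * accel_voltage
    p = math.sqrt(2 * M_E * e_kin * (1 + e_kin / (2 * M_E * C**2)))
    return H / p * 1e12

for kv in (1, 10, 30):
    print(f"{kv} kV -> {electron_wavelength_pm(kv * 1000):.1f} pm")
# roughly 39 pm at 1 kV, 12 pm at 10 kV and 7 pm at 30 kV -- far smaller than the
# achievable probe size, so SEM resolution is limited by the optics and the
# interaction volume rather than by the wavelength itself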

Environmental SEM
Conventional SEM requires samples to be imaged under vacuum, because a gas atmosphere rapidly spreads and attenuates electron beams. As a consequence, samples that produce a significant amount of vapour, e.g. wet biological samples or oil-bearing rock, must be either dried or cryogenically frozen. Processes involving phase transitions, such as the drying of adhesives or melting of alloys, liquid transport, chemical reactions, and solid-air-gas systems, in general cannot be observed. Some observations of living insects have been possible,[6] however. The first commercial development of the ESEM in the late 1980s allowed samples to be observed in low-pressure gaseous environments (e.g. 1–50 Torr or 0.1–6.7 kPa) and high relative humidity (up to 100%). This was made possible by the development of a secondary-electron detector capable of operating in the presence of water vapour and by the use of pressure-limiting apertures with differential pumping in the path of the electron beam to separate the vacuum region (around the gun and lenses) from the sample chamber. The first commercial ESEMs were produced by the ElectroScan Corporation in the USA in 1988. ElectroScan was taken over by Philips (who later sold their electron-optics division to FEI Company) in 1996.[7] ESEM is especially useful for non-metallic and biological materials because coating with carbon or gold is unnecessary. Uncoated plastics and elastomers can be routinely examined, as can uncoated biological samples. Coating can be difficult to reverse, may conceal small features on the surface of the sample and may reduce the value of the results obtained. X-ray analysis is difficult with a coating of a heavy metal, so carbon coatings are routinely used in conventional SEMs, but ESEM makes it possible to perform X-ray microanalysis on uncoated non-conductive specimens. ESEM may be preferred for electron microscopy of unique samples from criminal or civil actions, where forensic analysis may need to be repeated by several different experts.


3D in SEM
SEMs do not naturally provide 3D images, unlike SPMs. However, 3D data can be obtained using an SEM with different methods, such as photogrammetry (two or three images of a tilted specimen); photometric stereo, also called "shape from shading" (use of four images from a BSE detector); and inverse reconstruction using electron-material interaction models. Possible applications are roughness measurement, measurement of fractal dimension, corrosion measurement and step height evaluation.

Gallery of SEM images


The following are examples of images taken using an SEM.

Arrangement of two pairs of scanning electron micrographs of natural objects of less than 1mm in size (Ostracoda) produced by tilting along the longitudinal axis. Try to see these minute objects three-dimensionally (optimized for the higher magnification) without special spectacles.

Colored SEM image of soybean cyst nematode and egg. The artificial coloring makes the image easier for non-specialists to view and understand the structures and surfaces revealed in micrographs.

SEM image of a house fly compound eye surface at 450× magnification.

Compound eye of Antarctic krill Euphausia superba. Arthropod eyes are a common subject in SEM micrographs due to the depth of focus that an SEM image can capture. Colored picture.

Ommatidia of Antarctic krill eye, a higher magnification of the krill's eye. SEMs cover a range from light microscopy up to the magnifications available with a TEM. Colored picture.


SEM image of normal circulating human blood. This is an older and noisy micrograph of a common subject for SEM micrographs: red blood cells.

SEM image of a hederelloid from the Devonian of Michigan (largest tube diameter is 0.75mm). The SEM is used extensively for capturing detailed images of micro and macro fossils.

Backscattered electron (BSE) image of an antimony-rich region in a fragment of ancient glass. Museums use SEMs for studying valuable artifacts in a nondestructive manner.

SEM image of the corrosion layer on the surface of an ancient glass fragment; note the laminar structure of the corrosion layer.

SEM image of a photoresist layer used in semiconductor manufacturing taken on a field emission SEM. These SEMs are important in the semiconductor industry for their high-resolution capabilities.

SEM image of the surface of a kidney stone showing tetragonal crystals of Weddellite (calcium oxalate dihydrate) emerging from the amorphous central part of the stone. Horizontal length of the picture represents 0.5mm of the figured original.

Two images of the same depth hoar snow crystal, viewed through a light microscope (left) and as a SEM image (right). Note how the SEM image allows for clear perception of the fine structure details which are hard to fully make out in the light microscope image.


Topography
Topography (from Greek topos, "place", and graphō, "write") is a field of planetary science comprising the study of the surface shape and features of the Earth and other observable astronomical objects, including planets, moons, and asteroids. It is also the description of such surface shapes and features (especially their depiction in maps). The topography of an area can also mean the surface shapes and features themselves. In a broader sense, topography is concerned with local detail in general, including not only relief but also natural and artificial features, and even local history and culture. This meaning is less common in America, where topographic maps with elevation contours have made "topography" synonymous with relief. The older sense of topography as the study of place still has currency in Europe.

A topographic map with contour intervals

Topography specifically involves the recording of relief or terrain, the three-dimensional quality of the surface, and the identification of specific landforms. This is also known as geomorphometry. In modern usage, this involves generation of elevation data in electronic form. It is often considered to include the graphic representation of the landform on a map by a variety of techniques, including contour lines, hypsometric tints, and relief shading.

Etymology
The term topography originated in ancient Greece and continued in ancient Rome, as the detailed description of a place. The word comes from the Greek words τόπος (topos, place) and γραφία (graphia, writing). In classical literature this refers to writing about a place or places, what is now largely called 'local history'. In Britain and in Europe in general, the word topography is still sometimes used in its original sense. Detailed military surveys in Britain (beginning in the late eighteenth century) were called Ordnance Surveys, and this term was used into the 20th century as generic for topographic surveys and maps. The earliest scientific surveys in France were called the Cassini maps after the family who produced them over four generations. The term "topographic surveys" appears to be American in origin. The earliest detailed surveys in the United States were made by the Topographical Bureau of the Army, formed during the War of 1812, which became the Corps of Topographical Engineers in 1838. After the work of national mapping was assumed by the U.S. Geological Survey in 1878, the term topographical remained as a general term for detailed surveys and mapping programs, and has been adopted by most other nations as standard. In the 20th century, the term topography started to be used to describe surface description in other fields where mapping in a broader sense is used, particularly in medical fields such as neurology.


Objectives
An objective of topography is to determine the position of any feature, or more generally any point, in terms of both a horizontal coordinate system (such as latitude and longitude) and altitude. Identifying (naming) features and recognizing typical landform patterns are also part of the field. A topographic study may be made for a variety of reasons: military planning and geological exploration have been primary motivators to start survey programs, but detailed information about terrain and surface features is essential for the planning and construction of any major civil engineering, public works, or reclamation project.

Techniques of topography
There are a variety of approaches to studying topography. Which method(s) to use depend on the scale and size of the area under study, its accessibility, and the quality of existing surveys.

Direct survey
Surveying helps determine accurately the terrestrial or three-dimensional space position of points and the distances and angles between them using leveling instruments such as theodolites, dumpy levels and clinometers. Even though remote sensing has greatly sped up the process of gathering information, and has allowed greater accuracy control over long distances, the direct survey still provides the basic control points and framework for all topographic work, whether manual or GIS-based. In areas where there has been an extensive direct survey and mapping program (most of Europe and the Continental US, for example), the compiled data forms the basis of basic digital elevation datasets such as USGS DEM data. This data must often be "cleaned" to eliminate discrepancies between surveys, but it still forms a valuable set of information for large-scale analysis.

A surveying point in Germany

The original American topographic surveys (or the British "Ordnance" surveys) involved not only recording of relief, but identification of landmark features and vegetative land cover.

Remote sensing
Remote sensing is a general term for geodata collection at a distance from the subject area.

Aerial and satellite imagery

Besides their role in photogrammetry, aerial and satellite imagery can be used to identify and delineate terrain features and more general land-cover features. They have certainly become more and more a part of geovisualization, whether in maps or GIS systems. False-color and non-visible-spectrum imaging can also help determine the lie of the land by delineating vegetation and other land-use information more clearly. Images can be taken in visible colours and in other parts of the spectrum.

Photogrammetry

Photogrammetry is a measurement technique in which the three-dimensional co-ordinates of points on an object are determined by measurements made in two or more photographic images taken from different positions, usually from different passes of an aerial photography flight. In this technique, the common points are identified on each image. A line of sight (or ray) can be constructed from the camera location to the point on the object. It is the intersection of these rays (triangulation) that determines the relative three-dimensional position of the point.

Known control points can be used to give these relative positions absolute values. More sophisticated algorithms can exploit other information about the scene that is known a priori (for example, symmetries that in certain cases allow the reconstruction of three-dimensional co-ordinates from only one camera position).

Radar and sonar

Satellite radar mapping is one of the major techniques of generating Digital Elevation Models (see below). Similar techniques are applied in bathymetric surveys using sonar to determine the terrain of the ocean floor. In recent years, LIDAR (Light Detection and Ranging), a remote sensing technique using a laser instead of radio waves, has increasingly been employed for complex mapping needs such as charting canopies and monitoring glaciers.


Forms of topographic data


Terrain is commonly modelled either using vector (triangulated irregular network or TIN) or gridded (raster image) mathematical models. In most applications in the environmental sciences, the land surface is represented and modelled using gridded models. In civil engineering and the entertainment business, most representations of the land surface employ some variant of TIN models. In geostatistics, the land surface is commonly modelled as a combination of two signals: the smooth (spatially correlated) signal and the rough (noise) signal. In practice, surveyors first sample heights in an area, then use these to produce a Digital Land Surface Model (also known as a digital elevation model). The DLSM can then be used to visualize terrain, drape remote sensing images, quantify ecological properties of a surface or extract land surface objects. Note that contour data or any other sampled elevation dataset is not a DLSM. A DLSM implies that elevation is available continuously at each location in the study area, i.e. that the map represents a complete surface. Digital Land Surface Models should not be confused with Digital Surface Models, which can be surfaces of the canopy, buildings and similar objects. For example, in the case of surface models produced using LIDAR technology, one can have several surfaces, starting from the top of the canopy down to the actual solid earth. The difference between the two surface models can then be used to derive volumetric measures (height of trees etc.).
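The difference between a Digital Surface Model and a Digital Land Surface (terrain) Model, mentioned above as a source of volumetric measures such as tree heights, amounts to a per-pixel subtraction of two co-registered elevation grids. The numpy-based sketch below assumes both grids share the same extent and pixel size; the array values are invented purely for illustration.

import numpy as np

# Invented 3x3 elevation grids in metres, assumed to be co-registered
dsm = np.array([[12.0, 15.5, 14.0],      # top of canopy / buildings
                [11.0, 18.0, 13.5],
                [10.5, 10.8, 11.0]])
dtm = np.array([[10.0, 10.2, 10.5],      # bare-earth terrain
                [10.1, 10.3, 10.6],
                [10.4, 10.6, 10.9]])

canopy_height = dsm - dtm                 # per-pixel object height above ground
print(canopy_height)
print("tallest object:", canopy_height.max(), "m")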

Raw survey data


Topographic survey information is historically based upon the notes of surveyors. They may derive naming and cultural information from other local sources (for example, boundary delineation may be derived from local cadastral mapping). While of historical interest, these field notes inherently include errors and contradictions that later stages in map production resolve.

Remote sensing data


As with field notes, remote sensing data (aerial and satellite photography, for example), is raw and uninterpreted. It may contain holes (due to cloud cover for example) or inconsistencies (due to the timing of specific image captures). Most modern topographic mapping includes a large component of remotely sensed data in its compilation process.


Topographic mapping
In its contemporary definition, topographic mapping shows relief. In the United States, USGS topographic maps show relief using contour lines. The USGS calls maps based on topographic surveys, but without contours, "planimetric maps." These maps show not only the contours, but also any significant streams or other bodies of water, forest cover, built-up areas or individual buildings (depending on scale), and other features and points of interest. While not officially "topographic" maps, the national surveys of other nations share many of the same features, and so they are often generally called "topographic maps."
A map of Europe using elevation modeling

Existing topographic survey maps, because of their comprehensive and encyclopedic coverage, form the basis for much derived topographic work. Digital Elevation Models, for example, have often been created not from new remote sensing data but from existing paper topographic maps. Many government and private publishers use the artwork (especially the contour lines) from existing topographic map sheets as the basis for their own specialized or updated topographic maps. Topographic mapping should not be confused with geologic mapping. The latter is concerned with the underlying structures and processes beneath the surface, rather than with identifiable surface features.

Digital elevation modeling


The digital elevation model (DEM) is a raster-based digital dataset of the topography (hypsometry and/or bathymetry) of all or part of the Earth (or a telluric planet). The pixels of the dataset are each assigned an elevation value, and a header portion of the dataset defines the area of coverage, the ground distance each pixel covers, and the units of elevation (and the zero-point). DEMs may be derived from existing paper maps and survey data, or they may be generated from new satellite or other remotely sensed radar or sonar data.
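Because each DEM pixel carries only an elevation value, locating that value on the ground relies on the header information just described (origin, pixel size, units). A minimal sketch of that bookkeeping is given below, assuming a simple north-up grid whose top-left corner is the stated origin; the function and values are illustrative, not a standard file format.

def pixel_to_map(row, col, x_origin, y_origin, pixel_size):
    """Convert a DEM pixel index (row, col) to the map coordinates of the pixel
    centre, assuming a north-up grid whose top-left corner is the origin."""
    x = x_origin + (col + 0.5) * pixel_size
    y = y_origin - (row + 0.5) * pixel_size
    return x, y

# Illustrative header values: origin (500000, 4200000) metres, 30 m pixels
print(pixel_to_map(row=2, col=3, x_origin=500000.0,
                   y_origin=4200000.0, pixel_size=30.0))
# -> (500105.0, 4199925.0)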

Topological modeling
A geographic information system (GIS) can recognize and analyze the spatial relationships that exist within digitally stored spatial data. These topological relationships allow complex spatial modelling and analysis to be performed. Topological relationships between geometric entities traditionally include adjacency (what adjoins what), containment (what encloses what), and proximity (how close something is to something else). Typical operations built on such data include reconstituting a synthesized view of the ground, determining a trajectory of overflight of the terrain, calculating surfaces or volumes, and tracing topographic profiles.
Relief map: Sierra Nevada Mountains, Spain

3D rendering of a DEM used for the topography of Mars


Topography in other fields


Topography has been applied to different science fields. In neuroscience, the neuroimaging discipline uses techniques such as EEG topography for brain mapping. In ophthalmology, corneal topography is used as a technique for mapping the surface curvature of the cornea. In human anatomy, topography is superficial human anatomy. In mathematics the concept of topography is used to indicate the patterns or general organization of features on a map or as a term referring to the pattern in which variables (or their values) are distributed in a space.

Topography of thoracic and abdominal viscera.

Raster scan

A raster scan, or raster scanning, is the rectangular pattern of image capture and reconstruction in television. By analogy, the term is used for raster graphics, the pattern of image storage and transmission used in most computer bitmap image systems. The word raster comes from the Latin word rastrum (a rake), which is derived from radere (to scrape); see also rastrum, an instrument for drawing musical staff lines. The pattern left by the tines of a rake, when drawn straight, resembles the parallel lines of a raster: this line-by-line scanning is what creates a raster. It is a systematic process of covering the area progressively, one line at a time. Although often a great deal faster, it is similar in the most general sense to how one's gaze travels when one reads lines of text.

Description
Scan lines
In a raster scan, an image is subdivided into a sequence of (usually horizontal) strips known as "scan lines". Each scan line can be transmitted in the form of an analog signal as it is read from the video source, as in television systems, or can be further divided into discrete pixels for processing in a computer system. This ordering of pixels by rows is known as raster order, or raster scan order. Analog television has discrete scan lines (discrete vertical resolution), but does not have discrete pixels (horizontal resolution); it instead varies the signal continuously over the scan line. Thus, while the number of scan lines (vertical resolution) is unambiguously defined, the horizontal resolution is more approximate, according to how quickly the signal can change over the course of the scan line.
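The raster (row-major) ordering of pixels described above maps a two-dimensional position to a single index and back with simple integer arithmetic, which is how most bitmap image buffers are laid out in memory. A minimal sketch, with an arbitrarily chosen image width:

def raster_index(x, y, width):
    """Index of pixel (x, y) in a buffer stored in raster (row-major) order."""
    return y * width + x

def raster_position(index, width):
    """Inverse mapping: (x, y) position of a given buffer index."""
    return index % width, index // width

# For a 640-pixel-wide image, pixel (10, 3) sits at offset 1930
assert raster_index(10, 3, 640) == 1930
assert raster_position(1930, 640) == (10, 3)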


Scanning pattern
In raster scanning, the beam sweeps horizontally left-to-right at a steady rate, then blanks and rapidly moves back to the left, where it turns back on and sweeps out the next line. During this time, the vertical position is also steadily increasing (downward), but much more slowly: there is one vertical sweep per image frame, but one horizontal sweep per line of resolution. Thus each scan line is sloped slightly "downhill" (towards the lower right), with a slope of approximately 1/(horizontal resolution), while the sweep back to the left (retrace) is significantly faster than the forward scan, and essentially horizontal. The resulting tilt in the scan lines is very small, and is dwarfed in effect by screen convexity and other modest geometrical imperfections.

The beam position (sweeps) follow roughly a sawtooth wave.

There is a misconception that once a scan line is complete, a CRT display in effect suddenly jumps internally, by analogy with a typewriter or printer's paper advance or line feed, before creating the next scan line. As discussed above, this does not exactly happen: the vertical sweep continues at a steady rate over a scan line, creating a small tilt. Steady-rate sweep is done, instead of a stairstep of advancing every row, because steps are hard to implement technically, while steady-rate is much easier. The resulting tilt is compensated in most CRTs by the tilt and parallelogram adjustments, which impose a small vertical deflection as the beam sweeps across the screen. When properly adjusted, this deflection exactly cancels the downward slope of the scanlines. The horizontal retrace, in turn, slants smoothly downward as the tilt deflection is removed; there's no jump at either end of the retrace. In detail, scanning of CRTs is done by magnetic deflection, by changing the current in the coils of the deflection yoke. Rapidly changing the deflection (a jump) requires a voltage spike to be applied to the yoke, and the deflection can only react as fast as the inductance and spike magnitude permit. Electronically, the inductance of the deflection yoke's vertical windings is relatively high, and thus the current in the yoke, and therefore the vertical part of the magnetic deflection field, can change only slowly. In fact, spikes do occur, both horizontally and vertically, and the corresponding horizontal blanking interval and vertical blanking interval give the deflection currents settle time to retrace and settle to their new value. This happens during the blanking interval. In electronics, these (usually steady-rate) movements of the beam[s] are called "sweeps", and the circuits that create the currents for the deflection yoke (or voltages for the horizontal deflection plates in an oscilloscope) are called the sweep circuits. These create a sawtooth wave: steady movement across the screen, then a typically rapid move back to the other side, and likewise for the vertical sweep. Furthermore, wide-deflection-angle CRTs need horizontal sweeps with current that changes proportionally faster toward the center, because the center of the screen is closer to the deflection yoke than the edges. A linear change in current would swing the beams at a constant rate angularly; this would cause horizontal compression toward the center.
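The sawtooth sweep described above can be sketched as a simple function of time: the beam position ramps linearly during the active part of each line and snaps back during the much shorter retrace. The fractions and line period used below are illustrative, not the values of any particular video standard, and blanking is ignored.

def horizontal_beam_position(t, line_period, retrace_fraction=0.1):
    """Normalised horizontal beam position (0..1) at time t for an idealised
    sawtooth sweep: a steady trace followed by a rapid linear retrace."""
    phase = (t % line_period) / line_period
    trace = 1.0 - retrace_fraction
    if phase < trace:
        return phase / trace                               # steady left-to-right sweep
    return 1.0 - (phase - trace) / retrace_fraction        # rapid flyback

# Sample one line of a sweep with an assumed 64 microsecond line period
for t_us in (0, 16, 32, 48, 60, 63):
    print(t_us, round(horizontal_beam_position(t_us * 1e-6, 64e-6), 2))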


Printers
Computer printers create their images basically by raster scanning. Laser printers use a spinning polygonal mirror (or an optical equivalent) to scan across the photosensitive drum, and paper movement provides the other scan axis. Considering typical printer resolution, the "downhill" effect is minuscule. Inkjet printers have multiple nozzles in their printheads, so many (dozens to hundreds) of "scan lines" are written together, and paper advance prepares for the next batch of scan lines. Transforming vector-based data into the form required by a display, or printer, requires a Raster Image Processor (RIP).

Fonts
Computer text is mostly created from font files that describe the outlines of each printable character or symbol (glyph). (A minority are "bit maps".) These outlines have to be converted into what are effectively little rasters, one per character, before being rendered (displayed or printed) as text, in effect merging their little rasters into that for the page.


Secondary electrons
Secondary electrons are electrons generated as ionization products. They are called 'secondary' because they are generated by other radiation (the primary radiation). This radiation can be in the form of ions, electrons, or photons with sufficiently high energy, i.e. exceeding the ionization potential. Photoelectrons can be considered an example of secondary electrons where the primary radiation is photons; in some discussions photoelectrons with higher energy (>50eV) are still considered "primary", while the electrons freed by the photoelectrons are "secondary".

Visualisation of a Townsend avalanche, which is sustained by the generation of secondary electrons in an electric field

Applications
Secondary electrons are the main means of viewing images in the scanning electron microscope (SEM). The range of secondary electrons depends on their energy. Plotting the inelastic mean free path as a function of energy often shows characteristics of the "universal curve" familiar to electron spectroscopists and surface analysts. This distance is on the order of a few nanometers in metals and tens of nanometers in insulators. This small distance allows such fine resolution to be achieved in the SEM.

Mean free path of low-energy electrons. Secondary electrons are generally considered to have energies below 50eV. The rate of energy loss for electron scattering is very low, so most electrons released have energies peaking below 5eV (Seiler, 1983).

For SiO2, for a primary electron energy of 100 eV, the secondary electron range is up to 20nm from the point of incidence.


Backscatter
In physics, backscatter (or backscattering) is the reflection of waves, particles, or signals back to the direction from which they came. It is a diffuse reflection due to scattering, as opposed to specular reflection like a mirror. Backscattering has important applications in astronomy, photography and medical ultrasonography.

Backscatter of waves in physical space


Backscattering occurs in quite different physical situations. The incoming waves or particles can be deflected from their original direction by quite different mechanisms: diffuse reflection from large particles and Mie scattering, causing alpenglow and gegenschein, and showing up in weather radar; inelastic collisions between electromagnetic waves and the transmitting medium (Brillouin scattering and Raman scattering), important in fiber optics, see below; elastic collisions between accelerated ions and a sample (Rutherford backscattering); Bragg diffraction from crystals, used in inelastic scattering experiments (neutron backscattering, X-ray backscattering spectroscopy); and Compton scattering, used in backscatter X-ray imaging. Sometimes the scattering is more or less isotropic, i.e. the incoming particles are scattered randomly in various directions, with no particular preference for backward scattering. In these cases, the term "backscattering" just designates the detector location, chosen for practical reasons: in X-ray imaging, backscattering means just the opposite of transmission imaging; in inelastic neutron or X-ray spectroscopy, backscattering geometry is chosen because it optimizes the energy resolution; in astronomy, backscattered light is that which is reflected with a phase angle of less than 90°. In other cases, the scattering intensity is enhanced in the backward direction. This can have different reasons: in alpenglow, red light prevails because the blue part of the spectrum is depleted by Rayleigh scattering; in gegenschein, constructive interference might play a role; and coherent backscattering is observed in random media, for visible light most typically in suspensions such as milk, where, due to weak localization, enhanced multiple scattering is observed in the backward direction. The backscattering alignment (BSA) coordinate system is often used in radar applications, while the forward scattering alignment (FSA) coordinate system is primarily used in optical applications.

Radar, especially weather radar


Backscattering is the principle behind radar systems. In weather radar, backscattering is proportional to the 6th power of the diameter of the target multiplied by its inherent reflective properties. Water is almost 4 times more reflective than ice but droplets are much smaller than snow flakes or hail stones. So the backscattering is dependent on a mix of these two factors. The strongest backscatter comes from hail and large graupel (solid ice) due to their sizes. Another strong return is from melting snow or wet sleet, as they combine size and water reflectivity. They often show up as much higher rates of precipitation than actually occurring in what is called a brightband. Rain is a moderate backscatter, being stronger with large drops (such as from a thunderstorm) and much weaker with small droplets (such as mist or drizzle). Snow has rather weak backscatter.
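The sixth-power dependence quoted above is what the radar reflectivity factor Z expresses: in the Rayleigh regime Z is the sum of D^6 over all drops in a unit volume, usually reported on the logarithmic dBZ scale. The sketch below computes Z and dBZ for an invented drop population; the numbers are illustrative only and ignore the water/ice reflectivity difference also mentioned above.

import math

def reflectivity_dbz(drop_diameters_mm, sample_volume_m3):
    """Radar reflectivity factor in dBZ for a set of drop diameters (mm)
    observed in a given sample volume (m^3), assuming Rayleigh scattering:
    Z = sum(D^6) / V, referenced to 1 mm^6 per m^3."""
    z = sum(d**6 for d in drop_diameters_mm) / sample_volume_m3
    return 10.0 * math.log10(z)

# Invented example: two hundred 1 mm drizzle drops vs. ten 4 mm rain drops per m^3
print(round(reflectivity_dbz([1.0] * 200, 1.0), 1))   # ~23 dBZ
print(round(reflectivity_dbz([4.0] * 10, 1.0), 1))    # ~46 dBZ: a few large drops dominate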


Backscatter in waveguides
The backscattering method is also employed in fiber optics applications to detect optical faults. Light propagating through a fiber optic cable gradually attenuates due to Rayleigh scattering. Faults are thus detected by monitoring the variation of part of the Rayleigh backscattered light. Since the backscattered light attenuates exponentially as it travels along the optical fiber cable, the attenuation characteristic is represented on a logarithmic scale graph. If the slope of the graph is steep, then power loss is high; if the slope is gentle, then the optical fiber has a satisfactory loss characteristic. Loss measurement by the backscattering method allows measurement of a fiber optic cable from one end without cutting the optical fiber, so it can be conveniently used for the construction and maintenance of optical fibers.
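The exponential attenuation described above is usually quoted in dB/km, so the power remaining after a length of fibre follows directly from the slope of the logarithmic backscatter trace. A minimal sketch, with an assumed attenuation coefficient that is illustrative rather than a datasheet value:

def power_after_km(input_power_mw, attenuation_db_per_km, length_km):
    """One-way optical power (mW) remaining after length_km of fibre with a
    uniform attenuation coefficient in dB/km."""
    loss_db = attenuation_db_per_km * length_km
    return input_power_mw * 10 ** (-loss_db / 10.0)

# Assumed 0.35 dB/km fibre: after 20 km, a 1 mW signal drops to about 0.2 mW
print(round(power_after_km(1.0, 0.35, 20.0), 3))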

Backscatter in photography
The term backscatter in photography refers to light from a flash or strobe reflecting back from particles in the lens's field of view, causing specks of light to appear in the photo. This gives rise to what are sometimes referred to as orb artifacts. Photographic backscatter can result from snowflakes, rain or mist, or airborne dust. Due to the size limitations of modern compact and ultra-compact cameras, especially digital cameras, the distance between the lens and the built-in flash has decreased, thereby decreasing the angle of light reflection to the lens and increasing the likelihood of light reflection off normally sub-visible particles. Hence, the orb artifact is commonplace with small digital or film camera photographs.

Energy-dispersive X-ray spectroscopy


Energy-dispersive X-ray spectroscopy (EDS, EDX, or XEDS) is an analytical technique used for the elemental analysis or chemical characterization of a sample. It relies on the investigation of an interaction between some source of X-ray excitation and a sample. Its characterization capabilities are due in large part to the fundamental principle that each element has a unique atomic structure allowing a unique set of peaks on its X-ray spectrum. To stimulate the emission of characteristic X-rays from a specimen, a high-energy beam of charged particles such as electrons or protons (see PIXE), or a beam of X-rays, is focused into the sample being studied. At rest, an atom within the sample contains ground state (or unexcited) electrons in discrete energy levels or electron shells bound to the nucleus. The incident beam may excite an electron in an inner shell, ejecting it from the shell while creating an electron hole where the electron was. An electron from an outer, higher-energy shell then fills the hole, and the difference in energy between the higher-energy shell and the lower-energy shell may be released in the form of an X-ray. The number and energy of the X-rays emitted from a specimen can be measured by an energy-dispersive spectrometer. As the energy of the X-rays is characteristic of the difference in energy between the two shells, and of the atomic structure of the element from which they were emitted, this allows the elemental composition of the specimen to be measured.

EDS spectrum of the mineral crust of the vent shrimp Rimicaris exoculata.
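The element-specific X-ray energies underlying EDS can be estimated with Moseley's law; for Kα lines a common approximation treats the transition as hydrogen-like with a screening constant of 1, giving E ≈ (3/4) × 13.6 eV × (Z − 1)². The sketch below is only an order-of-magnitude guide, not a substitute for tabulated line energies.

def approx_k_alpha_kev(z):
    """Approximate K-alpha X-ray energy (keV) from Moseley's law,
    E ~ (3/4) * 13.6 eV * (Z - 1)**2."""
    return 0.75 * 13.6e-3 * (z - 1) ** 2

for name, z in (("Al", 13), ("Fe", 26), ("Cu", 29)):
    print(name, round(approx_k_alpha_kev(z), 2), "keV")
# Al ~1.5 keV, Fe ~6.4 keV, Cu ~8.0 keV, close to the tabulated K-alpha energies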


X-ray measurement
The equipment measures the energy and number of emitted X-rays.

Equipment
Four primary components of the EDS setup are:
1. the excitation source (electron beam or X-ray beam)
2. the X-ray detector
3. the pulse processor
4. the analyzer

Electron beam excitation is used in electron microscopes, scanning electron microscopes (SEM) and scanning transmission electron microscopes (STEM). X-ray beam excitation is used in X-ray fluorescence (XRF) spectrometers. A detector is used to convert X-ray energy into voltage signals; this information is sent to a pulse processor, which measures the signals and passes them on to an analyzer for data display and analysis. The most common detector is the Si(Li) detector cooled to cryogenic temperatures with liquid nitrogen; however, newer systems are often equipped with silicon drift detectors (SDD) with Peltier cooling systems.

Technological variants
The excess energy of the electron that migrates to an inner shell to fill the newly created hole can do more than emit an X-ray. Often, instead of X-ray emission, the excess energy is transferred to a third electron from a further outer shell, prompting its ejection. This ejected species is called an Auger electron, and the method for its analysis is known as Auger electron spectroscopy (AES). X-ray photoelectron spectroscopy (XPS) is another close relative of EDS, utilizing ejected electrons in a manner similar to that of AES. Information on the quantity and kinetic energy of ejected electrons is used to determine the binding energy of these now-liberated electrons, which is element-specific and allows chemical characterization of a sample.

Principle of EDS

EDS is often contrasted with its spectroscopic counterpart, WDS (wavelength-dispersive X-ray spectroscopy). WDS differs from EDS in that it uses the diffraction of X-rays by special crystals as its raw data. WDS has a much finer spectral resolution than EDS. WDS also avoids the problems associated with artifacts in EDS (false peaks, noise from the amplifiers, and microphonics). In WDS, only one element can be analyzed at a time, whereas EDS gathers a spectrum of all elements, within limits, of a sample.

Accuracy of EDS
The accuracy of an EDS spectrum can be affected by various factors. Many elements have overlapping peaks (e.g., Ti Kβ and V Kα, Mn Kβ and Fe Kα). The accuracy of the spectrum can also be affected by the nature of the sample. X-rays can be generated by any atom in the sample that is sufficiently excited by the incoming beam. These X-rays are emitted in all directions, and so they may not all escape the sample. The likelihood of an X-ray escaping the specimen, and thus being available to detect and measure, depends on the energy of the X-ray and the amount and density of material it has to pass through. This can result in reduced accuracy in inhomogeneous and rough samples.

Emerging technology
There is a trend towards a newer EDS detector, called the silicon drift detector (SDD). The SDD consists of a high-resistivity silicon chip in which electrons are driven to a small collecting anode. The advantage lies in the extremely low capacitance of this anode, which allows shorter processing times and very high throughput. Benefits of the SDD include:
1. high count rates and processing,
2. better resolution than traditional Si(Li) detectors at high count rates,
3. lower dead time (time spent on processing an X-ray event),
4. faster analytical capabilities and more precise X-ray maps or particle data collected in seconds,
5. the ability to be stored and operated at relatively high temperatures, eliminating the need for liquid nitrogen cooling.

Because the capacitance of the SDD chip is independent of the active area of the detector, much larger SDD chips can be utilized (40 mm² or more). This allows for even higher count rate collection. Further benefits of large-area chips include:
1. minimizing the SEM beam current, allowing for optimization of imaging under analytical conditions,
2. reduced sample damage, and
3. smaller beam interaction and improved spatial resolution for high-speed maps.
Where the X-ray energies of interest are in excess of ~30 keV, traditional silicon-based technologies suffer from poor quantum efficiency due to a reduction in the detector stopping power. Detectors produced from high-density semiconductors such as cadmium telluride (CdTe) and cadmium zinc telluride (CdZnTe) have improved efficiency at higher X-ray energies and are capable of room-temperature operation. Single-element systems, and more recently pixelated imaging detectors such as the HEXITEC system, are capable of achieving energy resolutions of the order of 1% at 100 keV. In recent years, a different type of EDS detector, based upon a superconducting microcalorimeter, has also become commercially available. This new technology combines the simultaneous detection capabilities of EDS with the high spectral resolution of WDS. The EDS microcalorimeter consists of two components: an absorber, and a superconducting transition-edge sensor (TES) thermometer. The former absorbs X-rays emitted from the sample and converts this energy into heat; the latter measures the subsequent change in temperature due to the influx of heat. The EDS microcalorimeter has historically suffered from a number of drawbacks, including low count rates and small detector areas. The count rate is limited by its reliance on the time constant of the calorimeter's electrical circuit. The detector area must be small in order to keep the heat capacity small and maximize thermal sensitivity (resolution). However, the count rate and detector area have been improved by the implementation of arrays of hundreds of superconducting EDS microcalorimeters.
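The count-rate and dead-time figures quoted for such detectors are related by a standard counting-statistics correction. The sketch below shows the usual non-paralyzable dead-time formula; the 1 microsecond dead time is an illustrative assumption, not the specification of any real detector.

# Sketch: non-paralyzable dead-time correction for a counting detector.
# true_rate = measured_rate / (1 - measured_rate * dead_time)
# The 1-microsecond dead time below is an illustrative value, not a spec.

def true_count_rate(measured_cps, dead_time_s=1e-6):
    lost_fraction = measured_cps * dead_time_s
    if lost_fraction >= 1.0:
        raise ValueError("Detector saturated: measured rate inconsistent with dead time")
    return measured_cps / (1.0 - lost_fraction)

for measured in (1_000, 100_000, 500_000):           # counts per second
    print(measured, "->", round(true_count_rate(measured)))
# At 500 kcps with a 1 us dead time, half the counting time is dead,
# so the true rate is about twice the measured rate.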


Cathodoluminescence
Cathodoluminescence is an optical and electromagnetic phenomenon in which electrons impacting on a luminescent material, such as a phosphor, cause the emission of photons, which may have wavelengths in the visible spectrum. A familiar example is the generation of light by an electron beam scanning the phosphor-coated inner surface of the screen of a television that uses a cathode ray tube. Cathodoluminescence is the inverse of the photoelectric effect, in which electron emission is induced by irradiation with photons.

Cathodoluminescence occurs because the impingement of a high-energy electron beam onto a semiconductor results in the promotion of electrons from the valence band into the conduction band, leaving behind a hole. When an electron and a hole recombine, it is possible for a photon to be emitted. The energy (color) of the photon, and the probability that a photon rather than a phonon will be emitted, depend on the material, its purity, and its defect state. In this context, the "semiconductor" examined can, in fact, be almost any non-metallic material: in terms of band structure, classical semiconductors, insulators, ceramics, gemstones, minerals, and glasses can be treated the same way.

In geology, mineralogy, materials science and semiconductor engineering, a scanning electron microscope with specialized optical detectors, or an optical cathodoluminescence microscope, may be used to examine the internal structures of semiconductors, rocks, ceramics, glass, etc., in order to obtain information on the composition, growth and quality of the material. In these instruments a focused beam of electrons impinges on a sample and induces it to emit light that is collected by an optical system, such as an elliptical mirror. From there, a fiber optic transfers the light out of the microscope, where it is separated into its component wavelengths by a monochromator and then detected with a photomultiplier tube. By scanning the microscope's beam in an X-Y pattern and measuring the light emitted with the beam at each point, a map of the optical activity of the specimen can be obtained.

The primary advantages of the electron microscope based technique are the ability to resolve features down to 1 nanometer, the ability to measure an entire spectrum at each point (hyperspectral imaging) if the photomultiplier tube is replaced with a CCD camera, and the ability to perform nanosecond- to picosecond-level time-resolved measurements if the electron beam can be "chopped" into nano- or picosecond pulses. Moreover, the optical properties of an object can be correlated to structural properties observed with the electron microscope. These advanced techniques are useful for examining low-dimensional semiconductor structures, such as quantum wells or quantum dots.

Although direct bandgap semiconductors such as GaAs or GaN are most easily examined by these techniques, indirect semiconductors such as silicon also emit weak cathodoluminescence and can be examined as well. In particular, the luminescence of dislocated silicon is different from that of intrinsic silicon, and can be used to map defects in integrated circuits. Recently, cathodoluminescence performed in electron microscopes has been used to study surface plasmon resonance in metallic nanoparticles; metallic nanoparticles can absorb and emit visible light because of surface plasmons. Cathodoluminescence has also been exploited as a probe to map the local density of states of planar dielectric photonic crystals and nanostructured photonic materials.
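For near-band-edge recombination, the emitted photon energy is roughly the band gap, so the expected emission wavelength can be estimated from λ(nm) = 1239.84 / E(eV). The sketch below applies this conversion; the band-gap values used are approximate room-temperature figures, taken as assumptions.

# Sketch: converting a semiconductor band gap to the approximate wavelength of
# near-band-edge cathodoluminescence, using lambda(nm) = 1239.84 / E(eV).
# Band-gap values are approximate room-temperature figures.

HC_EV_NM = 1239.84  # h*c expressed in eV*nm

def emission_wavelength_nm(band_gap_ev):
    return HC_EV_NM / band_gap_ev

for material, eg in (("GaN", 3.4), ("GaAs", 1.42), ("Si", 1.12)):
    print(f"{material}: Eg = {eg} eV -> ~{emission_wavelength_nm(eg):.0f} nm")
# GaN ~365 nm (near-UV), GaAs ~873 nm (near-IR), Si ~1107 nm (weak, indirect gap)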
Although an electron microscope with a cathodoluminescence detector provides high magnification and resolution, it is more complicated and expensive than an easy-to-use optical cathodoluminescence microscope, which benefits from its ability to show actual visible color features directly through the eyepiece.


Depth of field
In optics, particularly as it relates to film and photography, depth of field (DOF) is the distance between the nearest and farthest objects in a scene that appear acceptably sharp in an image. Although a lens can precisely focus at only one distance at a time, the decrease in sharpness is gradual on each side of the focused distance, so that within the DOF the unsharpness is imperceptible under normal viewing conditions. In some cases, it may be desirable to have the entire image sharp, and a large DOF is appropriate. In other cases, a small DOF may be more effective, emphasizing the subject while de-emphasizing the foreground and background. In cinematography, a large DOF is often called deep focus, and a small DOF is often called shallow focus.

Digital techniques, such as ray tracing, can also render 3D models with shallow depth of field for the same effect.

Circle of confusion criterion for depth of field


Precise focus is possible at only one distance; at that distance, a point object will produce a point image. At any other distance, a point object is defocused, and will produce a blur spot shaped like the aperture, which for the purpose of analysis is usually assumed to be circular. When this circular spot is sufficiently small, it is indistinguishable from a point, and appears to be in focus; it is rendered as acceptably sharp. The diameter of the circle increases with distance from the point of focus; the largest circle that is indistinguishable from a point is known as the acceptable circle of confusion, or informally, simply as the circle of confusion. The acceptable circle of confusion is influenced by visual acuity, viewing conditions, and the amount by which the image is enlarged (Ray 2000, 52-53). The increase of the circle diameter with defocus is gradual, so the limits of depth of field are not hard boundaries between sharp and unsharp. For a 35 mm motion picture, the image area on the negative is roughly 22 mm by 16 mm (0.87 in by 0.63 in). The limit of tolerable error is usually set at 0.05 mm (0.002 in) diameter. For 16 mm film, where the image area is smaller, the tolerance is stricter, 0.025 mm (0.001 in). Standard depth-of-field tables are constructed on this basis, although generally 35 mm productions set it at 0.025 mm (0.001 in). Note that the acceptable circle of confusion values for these formats are different because of the relative amount of magnification each format will need in order to be projected on a full-sized movie screen. (A table for 35 mm still photography would be somewhat different, since more of the film is used for each image and the amount of enlargement is usually much less.)
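The format dependence of the acceptable circle of confusion can be sketched as a simple proportional scaling with the image width, since a smaller frame must be enlarged more to reach the same screen size. The 22 mm and 10.3 mm image widths below are approximate figures for 35 mm and 16 mm motion-picture frames, used here as assumptions.

# Sketch: scaling the acceptable circle of confusion with format size.
# A smaller format must be enlarged more to reach the same screen or print
# size, so its acceptable CoC shrinks in proportion. Reference numbers follow
# the 35 mm motion-picture figure quoted above (0.05 mm, ~22 mm image width).

def scaled_coc(reference_coc_mm, reference_width_mm, format_width_mm):
    return reference_coc_mm * format_width_mm / reference_width_mm

coc_35mm = 0.05          # mm, 35 mm motion picture
print(round(scaled_coc(coc_35mm, 22.0, 10.3), 3))   # 16 mm film (~10.3 mm wide) -> ~0.023 mm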

Object field methods


Traditional depth-of-field formulas and tables assume equal circles of confusion for near and far objects. Some authors, such as Merklinger (1992), have suggested that distant objects often need to be much sharper to be clearly recognizable, whereas closer objects, being larger on the film, do not need to be so sharp. The loss of detail in distant objects may be particularly noticeable with extreme enlargements. Achieving this additional sharpness in distant objects usually requires focusing beyond the hyperfocal distance, sometimes almost at infinity. For example, if photographing a cityscape with a traffic bollard in the foreground, this approach, termed the object field method by Merklinger, would recommend focusing very close to infinity, and stopping down to make the bollard sharp enough. With this approach, foreground objects cannot always be made perfectly sharp, but the loss of sharpness in near objects may be acceptable if recognizability of distant objects is paramount. Other authors (Adams 1980, 51) have taken the opposite position, maintaining that slight unsharpness in foreground objects is usually more disturbing than slight unsharpness in distant parts of a scene. Moritz von Rohr also used an object field method, but unlike Merklinger, he used the conventional criterion of a maximum circle of confusion diameter in the image plane, leading to unequal front and rear depths of field.


Factors affecting depth of field


Several other factors, such as subject matter, movement, camera-to-subject distance, lens focal length, selected lens f-number, format size, and circle of confusion criterion also influence when a given defocus becomes noticeable. The combination of focal length, subject distance, and format size defines magnification at the film / sensor plane. DOF is determined by subject magnification at the film / sensor plane and the selected lens aperture or f-number. For a given f-number, increasing the magnification, either by moving closer to the subject or using a lens of greater focal length, decreases the DOF; decreasing magnification increases DOF. For a given subject magnification, increasing the f-number (decreasing the aperture diameter) increases the DOF; decreasing f-number decreases DOF. If the original image is enlarged to make the final image, the circle of confusion in the original image must be smaller than that in the final image by the ratio of enlargement. Cropping an image and enlarging to the same size final image as an uncropped image taken under the same conditions is equivalent to using a smaller format under the same conditions, so the cropped image has less DOF.
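One convenient way to see these dependences is the common close-range approximation DOF ≈ 2Nc(1 + m)/m², a sketch valid under the assumption that the subject distance is small compared with the hyperfocal distance; the circle of confusion c = 0.03 mm below is an assumed full-frame 35 mm value.

# Sketch: the common close-range approximation DOF ~ 2*N*c*(m + 1)/m**2,
# which captures the dependence on f-number N and magnification m described
# above. c is the chosen circle of confusion (0.03 mm is a typical full-frame
# 35 mm still value, used here as an assumption).

def dof_mm(f_number, magnification, coc_mm=0.03):
    return 2.0 * f_number * coc_mm * (magnification + 1.0) / magnification ** 2

print(round(dof_mm(8, 1.0), 2))    # 1:1 macro at f/8  -> ~0.96 mm total DOF
print(round(dof_mm(8, 0.5), 2))    # 1:2 at f/8        -> ~2.88 mm
print(round(dof_mm(16, 1.0), 2))   # 1:1 at f/16       -> ~1.92 mm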

A 35 mm lens set to f/11. The depth-of-field scale (top) indicates that a subject which is anywhere between 1 and 2 meters in front of the camera will be rendered acceptably sharp. If the aperture were set to f/22 instead, everything from just over 0.7 meters almost to infinity would appear to be in focus.

When focus is set to the hyperfocal distance, the DOF extends from half the hyperfocal distance to infinity, and the DOF is the largest possible for a given f-number.

Relationship of DOF to format size


The comparative DOFs of two different format sizes depend on the conditions of the comparison. The DOF for the smaller format can be either more than or less than that for the larger format. In the discussion that follows, it is assumed that the final images from both formats are the same size, are viewed from the same distance, and are judged with the same circle of confusion criterion. (Derivations of the effects of format size are given under Derivation of the DOF formulas.)

Out-of-focus highlights have the shape of the lens aperture.

Same picture for both formats
When the same picture is taken in two different format sizes from the same distance at the same f-number with lenses that give the same angle of view, and the final images (e.g., in prints, or on a projection screen or electronic display) are the same size, DOF is, to a first approximation, inversely proportional to format size (Stroebel 1976, 139). Though commonly used when comparing formats, the approximation is valid only when the subject distance is large in comparison with the focal length of the larger format and small in comparison with the hyperfocal distance of the smaller format. Moreover, the larger the format size, the longer a lens will need to be to capture the same framing as a smaller format. In motion pictures, for example, a frame with a 12-degree horizontal field of view will require a 50 mm lens on 16 mm film, a 100 mm lens on 35 mm film, and a 250 mm lens on 65 mm film. Conversely, using the same focal length lens with each of these formats will yield a progressively wider image as the film format gets larger: a 50 mm lens has a horizontal field of view of 12 degrees on 16 mm film, 23.6 degrees on 35 mm film, and 55.6 degrees on 65 mm film. Therefore, because the larger formats require longer lenses than the smaller ones, they will accordingly have a smaller depth of field. Compensations in exposure, framing, or subject distance need to be made in order to make one format look like it was filmed in another format.

Same focal length for both formats
Many small-format digital SLR camera systems allow using many of the same lenses on both full-frame and cropped-format cameras. If, for the same focal length setting, the subject distance is adjusted to provide the same field of view at the subject, at the same f-number and final-image size, the smaller format has greater DOF, as with the same picture comparison above. If pictures are taken from the same distance using the same f-number and the same focal length, and the final images are the same size, the smaller format has less DOF. If pictures taken from the same subject distance using the same focal length are given the same enlargement, both final images will have the same DOF. The pictures from the two formats will differ because of the different angles of view. If the larger format is cropped to the captured area of the smaller format, the final images will have the same angle of view, have been given the same enlargement, and have the same DOF.

Same DOF for both formats
In many cases, the DOF is fixed by the requirements of the desired image. For a given DOF and field of view, the required f-number is proportional to the format size. For example, if a 35 mm camera required f/11, a 4x5 camera would require f/45 to give the same DOF. For the same ISO speed, the exposure time on the 4x5 would be sixteen times as long; if the 35 mm camera required 1/250 second, the 4x5 camera would require 1/15 second. The longer exposure time with the larger camera might result in motion blur, especially with windy conditions, a moving subject, or an unsteady camera. Adjusting the f-number to the camera format is equivalent to maintaining the same absolute aperture diameter; when set to the same absolute aperture diameters, both formats have the same DOF.
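The arithmetic in the paragraph above can be checked directly: the f-number scales with the linear format ratio and the exposure time with its square. Treating 4x5 inch as roughly four times the linear size of the 35 mm frame is an approximation made for this sketch.

# Sketch checking the format-equivalence arithmetic above: for the same DOF,
# the f-number scales with format size and exposure time with its square.
# Treating 4x5 inch as roughly 4 times the linear size of 35 mm is an
# approximation.

format_ratio = 4.0            # 4x5 in vs 35 mm, approximate linear ratio
f_number_35mm = 11
shutter_35mm = 1.0 / 250.0    # seconds

f_number_4x5 = f_number_35mm * format_ratio
shutter_4x5 = shutter_35mm * format_ratio ** 2

print(f"f/{f_number_4x5:.0f}, 1/{1.0 / shutter_4x5:.0f} s")   # f/44, about 1/16 s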


Camera movements and DOF


When the lens axis is perpendicular to the image plane, as is normally the case, the plane of focus (POF) is parallel to the image plane, and the DOF extends between parallel planes on either side of the POF. When the lens axis is not perpendicular to the image plane, the POF is no longer parallel to the image plane; the ability to rotate the POF is known as the Scheimpflug principle. Rotation of the POF is accomplished with camera movements (tilt, a rotation of the lens about a horizontal axis, or swing, a rotation about a vertical axis). Tilt and swing are available on most view cameras, and are also available with specific lenses on some small- and medium-format cameras. When the POF is rotated, the near and far limits of DOF are no longer parallel; the DOF becomes wedge-shaped, with the apex of the wedge nearest the camera. With tilt, the height of the DOF increases with distance from the camera; with swing, the width of the DOF increases with distance. In some cases, rotating the POF can better fit the DOF to the scene and achieve the required sharpness at a smaller f-number. Alternatively, rotating the POF, in combination with a small f-number, can minimize the part of an image that is within the DOF.


Effect of lens aperture


For a given subject framing and camera position, the DOF is controlled by the lens aperture diameter, which is usually specified as the f-number, the ratio of lens focal length to aperture diameter. Reducing the aperture diameter (increasing the f-number) increases the DOF; however, it also reduces the amount of light transmitted, and increases diffraction, placing a practical limit on the extent to which DOF can be increased by reducing the aperture diameter. Motion pictures make only limited use of this control; to produce a consistent image quality from shot to shot, cinematographers usually choose a single aperture setting for interiors and another for exteriors, and adjust exposure through the use of camera filters or light levels. Aperture settings are adjusted more frequently in still photography, where variations in depth of field are used to produce a variety of special effects.

Effect of aperture on blur and DOF. The points in focus (2) project points onto the image plane (5), but points at different distances (1 and 3) project blurred images, or circles of confusion. Decreasing the aperture size (4) reduces the size of the blur spots for points not in the focused plane, so that the blurring is imperceptible, and all points are within the DOF.

DOF with various apertures: a series of example images at f/22, f/8, f/4, and f/2.8.


Digital techniques affecting DOF


The advent of digital technology in photography has provided additional means of controlling the extent of image sharpness; some methods allow extended DOF that would be impossible with traditional techniques, and some allow the DOF to be determined after the image is made.

Focus stacking is a digital image processing technique which combines multiple images taken at different focus distances to give a resulting image with a greater depth of field than any of the individual source images. Available programs for multi-shot DOF enhancement include Adobe Photoshop, Syncroscopy AutoMontage, PhotoAcute Studio, Helicon Focus and CombineZ. Getting sufficient depth of field can be particularly challenging in macro photography, and the images to the right illustrate the extended DOF that can be achieved by combining multiple images.

Wavefront coding is a method that convolves rays in such a way that it provides an image where fields are in focus simultaneously with all planes out of focus by a constant amount.

A plenoptic camera uses a microlens array to capture 4D light field information about a scene.

Colour apodisation is a technique combining a modified lens design with image processing to achieve an increased depth of field. The lens is modified such that each colour channel has a different lens aperture. For example, the red channel may be f/2.4, green may be f/2.4, whilst the blue channel may be f/5.6. Therefore, the blue channel will have a greater depth of field than the other colours. The image processing identifies blurred regions in the red and green channels and in these regions copies the sharper edge data from the blue channel. The result is an image that combines the best features from the different f-numbers.
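A minimal sketch of the focus-stacking idea follows: for each pixel, keep the value from the source frame that is locally sharpest, using the smoothed magnitude of a Laplacian filter as the sharpness measure. Real stacking packages (Helicon Focus, CombineZ, etc.) also align frames and blend seams; this shows only the core selection step, with synthetic input data.

# Minimal focus-stacking sketch: for every pixel, take the value from the
# source frame that is locally sharpest, judged by the smoothed magnitude of
# a Laplacian filter. Real stacking software also aligns frames and blends
# seams; this is only the core selection step.
import numpy as np
from scipy import ndimage

def focus_stack(frames):
    """frames: list of 2-D grayscale arrays of identical shape."""
    stack = np.stack(frames)                                  # (n, h, w)
    sharpness = np.stack([
        ndimage.gaussian_filter(np.abs(ndimage.laplace(f.astype(float))), 3)
        for f in frames
    ])
    best = sharpness.argmax(axis=0)                           # (h, w) frame index
    return np.take_along_axis(stack, best[None, ...], axis=0)[0]

# Example with synthetic data: two frames, each sharp in a different half.
rng = np.random.default_rng(1)
base = rng.random((64, 64))
frame_a = base.copy()
frame_a[:, 32:] = ndimage.gaussian_filter(base[:, 32:], 4)    # right half blurred
frame_b = base.copy()
frame_b[:, :32] = ndimage.gaussian_filter(base[:, :32], 4)    # left half blurred
result = focus_stack([frame_a, frame_b])
print(result.shape)   # (64, 64)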

Series of images demonstrating a 6 image focus bracket of a Tachinid fly. First two images illustrate typical DOF of a single image at f/10 while the third image is the composite of 6 images.

Diffraction and DOF


If the camera position and image framing (i.e., angle of view) have been chosen, the only means of controlling DOF is the lens aperture. Most DOF formulas imply that any arbitrary DOF can be achieved by using a sufficiently large f-number. Because of diffraction, however, this isn't really true. Once a lens is stopped down to where most aberrations are well corrected, stopping down further will decrease sharpness in the plane of focus. At the DOF limits, however, further stopping down decreases the size of the defocus blur spot, and the overall sharpness may still increase. Eventually, the defocus blur spot becomes negligibly small, and further stopping down serves only to decrease sharpness even at the DOF limits. There is thus a tradeoff between sharpness in the POF and sharpness at the DOF limits. But the sharpness in the POF is always greater than that at the DOF limits; if the blur at the DOF limits is imperceptible, the blur in the POF is imperceptible as well. For general photography, diffraction at the DOF limits typically becomes significant only at fairly large f-numbers; because large f-numbers typically require long exposure times, motion blur may cause greater loss of sharpness than the loss from diffraction. The size of the diffraction blur spot depends on the effective f-number, however, so diffraction is a greater issue in close-up photography, and the tradeoff between DOF and overall sharpness can become quite noticeable.
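A rough feel for where diffraction starts to matter comes from comparing the Airy-disk diameter, approximately 2.44λN, with the chosen circle of confusion. The wavelength and CoC values in the sketch below are assumptions.

# Sketch: estimating the f-number at which the diffraction blur spot
# (Airy disk diameter ~ 2.44 * wavelength * N) grows to the size of the
# chosen circle of confusion. Wavelength and CoC values are assumptions.

WAVELENGTH_MM = 0.00055     # ~550 nm, middle of the visible spectrum

def airy_diameter_mm(f_number, wavelength_mm=WAVELENGTH_MM):
    return 2.44 * wavelength_mm * f_number

def diffraction_limited_f_number(coc_mm, wavelength_mm=WAVELENGTH_MM):
    return coc_mm / (2.44 * wavelength_mm)

print(round(airy_diameter_mm(22), 4))                 # ~0.0295 mm at f/22
print(round(diffraction_limited_f_number(0.03)))      # ~f/22 for c = 0.03 mm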


Lens DOF scales


Many lenses for small- and medium-format cameras include scales that indicate the DOF for a given focus distance and f-number; the 35mm lens in the image above is typical. That lens includes distance scales in feet and meters; when a marked distance is set opposite the large white index mark, the focus is set to that distance. The DOF scale below the distance scales includes markings on either side of the index that correspond to f-numbers. When the lens is set to a given f-number, the DOF extends between the distances that align with the f-number markings.

Detail from the lens shown above. The point half-way between the 1m and 2m marks represents approximately 1.3m.

Zone focusing
When the 35 mm lens above is set to f/11 and focused at approximately 1.3 m, the DOF (a zone of acceptable sharpness) extends from 1 m to 2 m. Conversely, the required focus and f-number can be determined from the desired DOF limits by locating the near and far DOF limits on the lens distance scale and setting focus so that the index mark is centered between the near and far distance marks. The required f-number is determined by finding the markings on the DOF scale that are closest to the near and far distance marks. For the 35 mm lens above, if it were desired for the DOF to extend from 1 m to 2 m, focus would be set so that the index mark was centered between the marks for those distances, and the aperture would be set to f/11. The focus so determined would be about 1.3 m, the approximate harmonic mean of the near and far distances. See the section Focus and f-number from DOF limits for additional discussion.
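The standard thin-lens DOF approximations reproduce the zone-focusing example above. The sketch below assumes a circle of confusion of 0.03 mm for the 35 mm format; the formulas are the usual hyperfocal and near/far-limit expressions.

# Sketch of the standard DOF approximations, reproducing the zone-focusing
# example above (35 mm lens, f/11, focused at ~1.3 m). The circle of
# confusion c = 0.03 mm is an assumed full-frame value.

def hyperfocal_mm(focal_mm, f_number, coc_mm=0.03):
    return focal_mm ** 2 / (f_number * coc_mm) + focal_mm

def dof_limits_mm(focus_mm, focal_mm, f_number, coc_mm=0.03):
    h = hyperfocal_mm(focal_mm, f_number, coc_mm)
    near = focus_mm * (h - focal_mm) / (h + focus_mm - 2 * focal_mm)
    if focus_mm >= h:
        return near, float("inf")
    far = focus_mm * (h - focal_mm) / (h - focus_mm)
    return near, far

near, far = dof_limits_mm(focus_mm=1300, focal_mm=35, f_number=11)
print(round(near / 1000, 2), "m to", round(far / 1000, 2), "m")   # roughly 1 m to 2 m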

If the marks for the near and far distances fall outside the marks for the largest f-number on the DOF scale, the desired DOF cannot be obtained; for example, with the 35mm lens above, it is not possible to have the DOF extend from 0.7m to infinity. The DOF limits can be determined visually, by focusing on the farthest object to be within the DOF and noting the distance mark on the lens distance scale, and repeating the process for the nearest object to be within the DOF. Some distance scales have markings for only a few distances; for example, the 35mm lens above shows only 3ft and 5ft on its upper scale. Using other distances for DOF limits requires visual interpolation between marked distances. Since the distance scale is nonlinear, accurate interpolation can be difficult. In most cases, English and metric distance markings are not coincident, so using both scales to note focused distances can sometimes lessen the need for interpolation. Many autofocus lenses have smaller distance and DOF scales and fewer markings than do comparable manual-focus lenses, so that determining focus and f-number from the scales on an autofocus lens may be more difficult than with a comparable manual-focus lens. In most cases, determining these settings using the lens DOF scales on an autofocus lens requires that the lens or camera body be set to manual focus. On a view camera, the focus and f-number can be obtained by measuring the focus spread and performing simple calculations. The procedure is described in more detail in the section Focus and f-number from DOF limits. Some view cameras include DOF calculators that indicate focus and f-number without the need for any calculations by the photographer.


Hyperfocal distance
The hyperfocal distance is the nearest focus distance at which the DOF extends to infinity; focusing the camera at the hyperfocal distance results in the largest possible depth of field for a given f-number. Focusing beyond the hyperfocal distance does not increase the far DOF (which already extends to infinity), but it does decrease the DOF in front of the subject, decreasing the total DOF. Some photographers consider this wasting DOF; however, see Object field methods below for a rationale for doing so. Focusing on the hyperfocal distance is a special case of zone focusing in which the far limit of DOF is at infinity. If the lens includes a DOF scale, the hyperfocal distance can be set by aligning the infinity mark on the distance scale with the mark on the DOF scale corresponding to the f-number to which the lens is set. For example, with the 35mm lens shown above set to f/11, aligning the infinity mark with the 11 to the left of the index mark on the DOF scale would set the focus to the hyperfocal distance.

Limited DOF: selective focus


Depth of field can be anywhere from a fraction of a millimeter to virtually infinite. In some cases, such as landscapes, it may be desirable to have the entire image sharp, and a large DOF is appropriate. In other cases, artistic considerations may dictate that only a part of the image be in focus, emphasizing the subject while de-emphasizing the background, perhaps giving only a suggestion of the environment. For example, a common technique in melodramas and horror films is a closeup of a person's face, with someone just behind that person visible but out of focus. A portrait or close-up still photograph might use a small DOF to isolate the subject from a distracting background. The use of limited DOF to emphasize one part of an image is known as selective focus, differential focus or shallow focus. Although a small DOF implies that other parts of the image will be unsharp, it does not, by itself, determine how unsharp those parts will be. The amount of background (or foreground) blur depends on the distance from the plane of focus, so if a background is close to the subject, it may be difficult to blur sufficiently even with a small DOF. In practice, the lens f-number is usually adjusted until the background or foreground is acceptably blurred, often without direct concern for the DOF. Sometimes, however, it is desirable to have the entire subject sharp while ensuring that the background is sufficiently unsharp. When the distance between subject and background is fixed, as is the case with many scenes, the DOF and the amount of background blur are not independent. Although it is not always possible to achieve both the desired subject sharpness and the desired background unsharpness, several techniques can be used to increase the separation of subject and background.
At f/2.8, the cat is isolated from the background.

At f/32, the background competes for the viewer's attention.

At f/5.6, the flowers are isolated from the background.

Depth of field For a given scene and subject magnification, the background blur increases with lens focal length. If it is not important that background objects be unrecognizable, background de-emphasis can be increased by using a lens of longer focal length and increasing the subject distance to maintain the same magnification. This technique requires that sufficient space in front of the subject be available; moreover, the perspective of the scene changes because of the different camera position, and this may or may not be acceptable. The situation is not as simple if it is important that a background object, such as a sign, be unrecognizable. The magnification of background objects also increases with focal length, so with the technique just described, there is little change in the recognizability of background objects. However, a lens of longer focal length may still be of some help; because of the narrower angle of view, a slight change of camera position may suffice to eliminate the distracting object from the field of view. Although tilt and swing are normally used to maximize the part of the image that is within the DOF, they also can be used, in combination with a small f-number, to give selective focus to a plane that isn't perpendicular to the lens axis. With this technique, it is possible to have objects at greatly different distances from the camera in sharp focus and yet have a very shallow DOF. The effect can be interesting because it differs from what most viewers are accustomed to seeing.



Scanning tunneling microscope


A scanning tunneling microscope (STM) is an instrument for imaging surfaces at the atomic level. Its development in 1981 earned its inventors, Gerd Binnig and Heinrich Rohrer (at IBM Zürich), the Nobel Prize in Physics in 1986. For an STM, good resolution is considered to be 0.1 nm lateral resolution and 0.01 nm depth resolution. With this resolution, individual atoms within materials are routinely imaged and manipulated. The STM can be used not only in ultra-high vacuum but also in air, water, and various other liquid or gas ambients, and at temperatures ranging from near zero kelvin to a few hundred degrees Celsius.

The STM is based on the concept of quantum tunneling. When a conducting tip is brought very near to the surface to be examined, a bias (voltage difference) applied between the two can allow electrons to tunnel through the vacuum between them. The resulting tunneling current is a function of tip position, applied voltage, and the local density of states (LDOS) of the sample. Information is acquired by monitoring the current as the tip's position scans across the surface, and is usually displayed in image form. STM can be a challenging technique, as it requires extremely clean and stable surfaces, sharp tips, excellent vibration control, and sophisticated electronics, but nonetheless many hobbyists have built their own. The patent US 4,343,993, written by Gerd Binnig and Heinrich Rohrer, is the basic patent of the STM.


Procedure
First, a voltage bias is applied and the tip is brought close to the sample by coarse sample-to-tip control, which is turned off when the tip and sample are sufficiently close. At close range, fine control of the tip in all three dimensions is typically piezoelectric, maintaining a tip-sample separation W typically in the 4-7 Å (0.4-0.7 nm) range, which is the equilibrium position between attractive (3 Å < W < 10 Å) and repulsive (W < 3 Å) interactions. In this situation, the voltage bias will cause electrons to tunnel between the tip and sample, creating a current that can be measured. Once tunneling is established, the tip's bias and position with respect to the sample can be varied (with the details of this variation depending on the experiment) and data are obtained from the resulting changes in current.

A close-up of a simple scanning tunneling microscope head using a platinum-iridium tip.

If the tip is moved across the sample in the x-y plane, the changes in surface height and density of states cause changes in current. These changes are mapped in images. This change in current with respect to position can be measured itself, or the height, z, of the tip corresponding to a constant current can be measured. These two modes are called constant height mode and constant current mode, respectively. In constant current mode, feedback electronics adjust the height by a voltage to the piezoelectric height control mechanism. This leads to a height variation and thus the image comes from the tip topography across the sample and gives a constant charge density surface; this means contrast on the image is due to variations in charge density. In constant height mode, the voltage and height are both held constant while the current changes to keep the voltage from changing; this leads to an image made of current changes over the surface, which can be related to charge density. The benefit to using a constant height mode is that it is faster, as the piezoelectric movements require more time to register the height change in constant current mode than the current change in constant height mode. All images produced by STM are grayscale, with color optionally added in post-processing in order to visually emphasize important features. In addition to scanning across the sample, information on the electronic structure at a given location in the sample can be obtained by sweeping voltage and measuring current at a specific location. This type of measurement is called scanning tunneling spectroscopy (STS) and typically results in a plot of the local density of states as a function of energy within the sample. The advantage of STM over other measurements of the density of states lies in its ability to make extremely local measurements: for example, the density of states at an impurity site can be compared to the density of states far from impurities. Framerates of at least 1Hz enable so called Video-STM (up to 50Hz is possible). This can be used to scan surface diffusion.
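The constant-current mode can be pictured as a simple feedback loop: the controller adjusts the tip height so that a current that decays exponentially with the gap stays at its setpoint while the surface height changes underneath. The toy sketch below illustrates only this idea; the gain, setpoint, decay constant and surface profile are arbitrary illustrative numbers, not a description of any real controller.

# Toy sketch of constant-current feedback: an integral controller nudges the
# tip height z so that a current that decays exponentially with gap width
# stays at the setpoint while the surface height changes underneath the tip.
# Gain, setpoint and decay constant are arbitrary illustrative numbers.
import math

KAPPA = 1.0          # 1/angstrom, assumed decay constant
SETPOINT_NA = 1.0    # nA
GAIN = 0.5           # integral gain, arbitrary

def tunnel_current_na(gap_angstrom):
    return 10.0 * math.exp(-2.0 * KAPPA * gap_angstrom)

surface = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]   # surface height profile (angstrom)
z_tip = 6.0                                  # tip height above reference (angstrom)
trace = []
for height in surface:
    for _ in range(50):                      # let the loop settle at each point
        current = tunnel_current_na(z_tip - height)
        error = math.log(current / SETPOINT_NA)   # log error linearizes the exponential
        z_tip += GAIN * error / (2.0 * KAPPA)
    trace.append(z_tip)

print([round(z, 2) for z in trace])   # the recorded heights follow the surface profile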


Instrumentation
The components of an STM include a scanning tip, a piezoelectrically controlled height and x,y scanner, coarse sample-to-tip control, a vibration isolation system, and a computer. The resolution of an image is limited by the radius of curvature of the scanning tip of the STM. Additionally, image artifacts can occur if the tip has two tips at the end rather than a single atom; this leads to double-tip imaging, a situation in which both tips contribute to the tunneling. Therefore, it has been essential to develop processes for consistently obtaining sharp, usable tips. Recently, carbon nanotubes have been used for this purpose.

Schematic view of an STM

The tip is often made of tungsten or platinum-iridium, though gold is also used. Tungsten tips are usually made by electrochemical etching, and platinum-iridium tips by mechanical shearing. Due to the extreme sensitivity of tunnel current to height, proper vibration isolation or an extremely rigid STM body is imperative for obtaining usable results. In the first STM by Binnig and Rohrer, magnetic levitation was used to keep the STM free from vibrations; now mechanical spring or gas spring systems are often used. Additionally, mechanisms for reducing eddy currents are sometimes implemented. Maintaining the tip position with respect to the sample, scanning the sample and acquiring the data is computer controlled. The computer may also be used for enhancing the image with the help of image processing as well as performing quantitative measurements.


Probe tips
STM tips are usually made from tungsten (W) metal or a Pt/Ir alloy; at the very end of the tip (called the apex), there is one atom of the material.

Other STM related studies


Many other microscopy techniques have been developed based upon STM. These include photon scanning microscopy (PSTM), which uses an optical tip to tunnel photons; scanning tunneling potentiometry (STP), which measures electric potential across a surface; spin polarized scanning tunneling microscopy (SPSTM), which uses a ferromagnetic tip to tunnel spin-polarized electrons into a magnetic sample, and atomic force microscopy (AFM), in which the force caused by interaction between the tip and sample is measured.
Other STM methods involve manipulating the tip in order to change the topography of the sample. This is attractive for several reasons. Firstly, the STM has an atomically precise positioning system which allows very accurate atomic-scale manipulation. Furthermore, after the surface is modified by the tip, it is a simple matter to then image with the same tip, without changing the instrument. IBM researchers developed a way to manipulate xenon atoms adsorbed on a nickel surface. This technique has been used to create electron "corrals" with a small number of adsorbed atoms, which allows the STM to be used to observe electron Friedel oscillations on the surface of the material. Aside from modifying the actual sample surface, one can also use the STM to tunnel electrons into a layer of electron beam photoresist on a sample, in order to do lithography. This has the advantage of offering more control of the exposure than traditional electron beam lithography. Another practical application of STM is atomic deposition of metals (Au, Ag, W, etc.) with any desired (pre-programmed) pattern, which can be used as contacts to nanodevices or as nanodevices themselves.

Recently, groups have found that they can use the STM tip to rotate individual bonds within single molecules. The electrical resistance of the molecule depends on the orientation of the bond, so the molecule effectively becomes a molecular switch.

Principle of operation
Tunneling is a concept that arises from quantum mechanics. Classically, an object hitting an impenetrable barrier will not pass through. In contrast, objects with a very small mass, such as the electron, have wavelike characteristics which permit such an event, referred to as tunneling. Electrons behave as beams of energy, and in the presence of a potential U(z), assuming the one-dimensional case, the wave functions \psi_n(z) of the electrons are given by solutions to Schrödinger's equation,

-\frac{\hbar^2}{2m}\frac{\partial^2 \psi_n(z)}{\partial z^2} + U(z)\,\psi_n(z) = E\,\psi_n(z),

where \hbar is the reduced Planck's constant, z is the position, and m is the mass of an electron. If an electron of energy E is incident upon an energy barrier of height U(z), the electron wave function is a traveling wave solution,

\psi_n(z) = \psi_n(0)\, e^{\pm i k z},

where

k = \frac{\sqrt{2m(E - U)}}{\hbar}

if E > U(z), which is true for a wave function inside the tip or inside the sample. Inside a barrier, E < U(z), so the wave functions which satisfy this are decaying waves,

\psi_n(z) = \psi_n(0)\, e^{-\kappa z},

where

\kappa = \frac{\sqrt{2m(U - E)}}{\hbar}

quantifies the decay of the wave inside the barrier, with the barrier in the +z direction.


Knowing the wave function allows one to calculate the probability density for that electron to be found at some location. In the case of tunneling, the tip and sample wave functions overlap such that, when under a bias, there is some finite probability to find the electron in the barrier region and even on the other side of the barrier. Let us assume the bias is V and the barrier width is W. The probability P that an electron at z = 0 (the left edge of the barrier) can be found at z = W (the right edge of the barrier) is proportional to the wave function squared,

P \propto |\psi_n(0)|^2 e^{-2\kappa W}.

If the bias is small, we can let U - E \approx \varphi_M in the expression for \kappa, where \varphi_M, the work function, gives the minimum energy needed to bring an electron from an occupied level, the highest of which is at the Fermi level (for metals at T = 0 kelvin), to the vacuum level. When a small bias V is applied to the system, only electronic states very near the Fermi level, within eV (a product of electron charge and voltage, not to be confused here with the electronvolt unit), are excited. These excited electrons can tunnel across the barrier. In other words, tunneling occurs mainly with electrons of energies near the Fermi level. However, tunneling does require that there is an empty level of the same energy as the electron for the electron to tunnel into on the other side of the barrier. It is because of this restriction that the tunneling current can be related to the density of available or filled states in the sample. The current due to an applied voltage V (assume tunneling occurs from sample to tip) depends on two factors: 1) the number of electrons between E_f - eV and E_f in the sample, and 2) the number among them which have corresponding free states to tunnel into on the other side of the barrier at the tip. The higher the density of available states, the greater the tunneling current. When V is positive, electrons in the tip tunnel into empty states in the sample; for a negative bias, electrons tunnel out of occupied states in the sample into the tip. Mathematically, this tunneling current is given by

I \propto \sum_{E_n = E_f - eV}^{E_f} |\psi_n(0)|^2 e^{-2\kappa W}.

One can sum the probability over energies between E_f - eV and E_f to get the number of states available in this energy range per unit volume, thereby finding the local density of states (LDOS) near the Fermi level. The LDOS near some energy E in an interval \varepsilon is given by

\rho_s(z, E) = \frac{1}{\varepsilon} \sum_{E_n = E - \varepsilon}^{E} |\psi_n(z)|^2,

and the tunnel current at a small bias V is proportional to the LDOS near the Fermi level, which gives important information about the sample. It is desirable to use the LDOS to express the current because this value does not change as the volume changes, while the probability density does. Thus the tunneling current is given by

I \propto V \rho_s(0, E_f)\, e^{-2\kappa W},

where \rho_s(0, E_f) is the LDOS near the Fermi level of the sample at the sample surface. This current can also be expressed in terms of the LDOS near the Fermi level of the sample at the tip surface,

I \propto V \rho_s(W, E_f).

The exponential term in the above equations means that small variations in W greatly influence the tunnel current. If the separation is decreased by 1 Å, the current increases by an order of magnitude, and vice versa.

This approach fails to account for the rate at which electrons can pass the barrier. This rate should affect the tunnel current, so it can be treated using Fermi's golden rule with the appropriate tunneling matrix element. John Bardeen solved this problem in his study of the metal-insulator-metal junction. He found that if he solved Schrödinger's equation for each side of the junction separately to obtain the wave functions \psi and \chi for each electrode, he could obtain the tunnel matrix, M, from the overlap of these two wave functions. This can be applied to STM by making the electrodes the tip and sample, assigning \psi and \chi as the sample and tip wave functions, respectively, and evaluating M at some surface S between the metal electrodes, where z = 0 at the sample surface and z = W at the tip surface. Now, Fermi's golden rule gives the rate for electron transfer across the barrier, and is written

w = \frac{2\pi}{\hbar} |M|^2 \, \delta(E_\psi - E_\chi),

where \delta(E_\psi - E_\chi) restricts tunneling to occur only between electron levels with the same energy. The tunnel matrix element, given by

M = \frac{\hbar^2}{2m} \int_S \left( \chi^* \frac{\partial \psi}{\partial z} - \psi \frac{\partial \chi^*}{\partial z} \right) dS,

is a description of the lower energy associated with the interaction of wave functions at the overlap, also called the resonance energy. Summing over all the states gives the tunneling current as

I = \frac{4\pi e}{\hbar} \int_{-\infty}^{+\infty} \left[ f(E_f - eV + \varepsilon) - f(E_f + \varepsilon) \right] \rho_s(E_f - eV + \varepsilon)\, \rho_T(E_f + \varepsilon)\, |M|^2 \, d\varepsilon,

where f is the Fermi function, and \rho_s and \rho_T are the density of states in the sample and tip, respectively. The Fermi distribution function describes the filling of electron levels at a given temperature T.
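The order-of-magnitude claim above can be checked numerically. With an assumed work function of about 4.5 eV (a typical metal value, used here as an assumption), κ is roughly 1.1 per angstrom, so a 1 Å change in the gap multiplies e^(-2κW) by roughly nine.

# Numerical check of the exponential sensitivity of the tunneling current:
# with an assumed work function of 4.5 eV, kappa ~ 1.1 per angstrom, so a
# 1 angstrom change in the gap changes exp(-2*kappa*W) by roughly an order
# of magnitude.
import math

HBAR = 1.054571817e-34      # J*s
M_E = 9.1093837015e-31      # kg
EV = 1.602176634e-19        # J

def kappa_per_angstrom(work_function_ev):
    kappa_per_m = math.sqrt(2.0 * M_E * work_function_ev * EV) / HBAR
    return kappa_per_m * 1e-10

kappa = kappa_per_angstrom(4.5)
ratio = math.exp(2.0 * kappa * 1.0)      # current ratio for a 1 angstrom decrease in W
print(round(kappa, 2), round(ratio, 1))  # ~1.09 per angstrom, ratio ~8.8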


Transmission electron microscopy


Transmission electron microscopy (TEM) is a microscopy technique in which a beam of electrons is transmitted through an ultra-thin specimen, interacting with the specimen as it passes through. An image is formed from the interaction of the electrons transmitted through the specimen; the image is magnified and focused onto an imaging device, such as a fluorescent screen, a layer of photographic film, or a sensor such as a CCD camera. TEMs are capable of imaging at a significantly higher resolution than light microscopes, owing to the small de Broglie wavelength of electrons. This enables the instrument's user to examine fine detail, even as small as a single column of atoms, which is thousands of times smaller than the smallest resolvable object in a light microscope. TEM forms a major analysis method in a range of scientific fields, in both the physical and biological sciences. TEMs find application in cancer research, virology, and materials science, as well as pollution, nanotechnology, and semiconductor research.

At smaller magnifications TEM image contrast is due to absorption of electrons in the material, due to the thickness and composition of the material. At higher magnifications complex wave interactions modulate the intensity of the image, requiring expert analysis of observed images. Alternate modes of use allow for the TEM to observe modulations in chemical identity, crystal orientation, electronic structure and sample induced electron phase shift as well as the regular absorption based imaging. The first TEM was built by Max Knoll and Ernst Ruska in 1931, with this group developing the first TEM with resolution greater than that of light in 1933 and the first commercial TEM in 1939.

A TEM image of the polio virus. The polio virus is 30 nm in size.


History
Initial development
Ernst Abbe originally proposed that the ability to resolve detail in an object was limited approximately by the wavelength of the light used in imaging, which limits the resolution of an optical microscope to a few hundred nanometers. Developments into ultraviolet (UV) microscopes, led by Köhler and Rohr, allowed for an increase in resolving power of about a factor of two.[1] However, this required more expensive quartz optical components, due to the absorption of UV by glass. At this point it was believed that obtaining an image with sub-micrometer information was simply impossible due to this wavelength constraint. It had earlier been recognized by Plücker in 1858 that the deflection of "cathode rays" (electrons) was possible by the use of magnetic fields. This effect had been utilized to build primitive cathode ray oscilloscopes (CROs) as early as 1897 by Ferdinand Braun, intended as a measurement device. Indeed, in 1891 it was recognized by Riecke that the cathode rays could be focused by these magnetic fields, allowing for simple lens designs. Later this theory was extended by Hans Busch in his work published in 1926, who showed that the lens maker's equation could, under appropriate assumptions, be applicable to electrons.

In 1928, at the Technological University of Berlin, Adolf Matthias, Professor of High Voltage Technology and Electrical Installations, appointed Max Knoll to lead a team of researchers to advance the CRO design. The team consisted of several PhD students, including Ernst Ruska and Bodo von Borries. This team of researchers concerned themselves with lens design and CRO column placement, with which they attempted to obtain the parameters that could be optimised to allow for the construction of better CROs, as well as the development of electron optical components which could be used to generate low-magnification (nearly 1:1) images. In 1931 the group successfully generated magnified images of mesh grids placed over the anode aperture. The device used two magnetic lenses to achieve higher magnifications, arguably creating the first electron microscope. In that same year, Reinhold Rudenberg, the scientific director of the Siemens company, patented an electrostatic lens electron microscope.

The first practical TEM, originally installed at I.G. Farben-Werke and now on display at the Deutsches Museum in Munich, Germany.

Improving resolution
Sketch of first electron microscope, originally from Ruska's notebook in 1931, capable of only 18 times magnification

At this time the wave nature of electrons, which were considered charged matter particles, had not been fully realised until the publication of the De Broglie hypothesis in 1927. The group was unaware of this publication until 1932, when it was quickly realized that the De Broglie wavelength of electrons was many orders of magnitude smaller than that of light, theoretically allowing for imaging at atomic scales. In April 1932, Ruska suggested the construction of a new electron microscope for direct imaging of specimens inserted into the microscope, rather than simple mesh grids or images of apertures. With this device successful diffraction and normal imaging of an aluminium sheet was achieved; however, exceeding the magnification achievable with light microscopy had still not been successfully demonstrated. This goal was achieved in September 1933, using images of cotton fibers, which were quickly acquired before being damaged by the electron beam.

At this time, interest in the electron microscope had increased, with other groups, such as Paul Anderson and Kenneth Fitzsimmons of Washington State University, and Albert Prebus and James Hillier at the University of Toronto, who constructed the first TEMs in North America in 1935 and 1938, respectively, continually advancing TEM design. Research continued on the electron microscope at Siemens in 1936, where the aim of the research was the development and improvement of TEM imaging properties, particularly with regard to biological specimens. At this time electron microscopes were being fabricated for specific groups, such as the "EM1" device used at the UK National Physical Laboratory. In 1939 the first commercial electron microscope, pictured, was installed in the physics department of I.G. Farben-Werke. Further work on the electron microscope was hampered by the destruction of a new laboratory constructed at Siemens by an air raid, as well as the death of two of the researchers, Heinz Müller and Friedrich Krause, during World War II.


Further research
After World War II, Ruska resumed work at Siemens, where he continued to develop the electron microscope, producing the first microscope with 100,000x magnification. The fundamental structure of this microscope design, with multi-stage beam preparation optics, is still used in modern microscopes. The worldwide electron microscopy community advanced, with electron microscopes being manufactured in Manchester (UK), the USA (RCA), Germany (Siemens) and Japan. The first international conference on electron microscopy was held in Delft in 1942, with more than one hundred attendees. Later conferences included the "first" international conference in Paris in 1950 and then in London in 1954.

With the development of TEM, the associated technique of scanning transmission electron microscopy (STEM) was re-investigated but remained undeveloped until the 1970s, when Albert Crewe at the University of Chicago developed the field emission gun and added a high-quality objective lens to create the modern STEM. Using this design, Crewe demonstrated the ability to image atoms using annular dark-field imaging. Crewe and coworkers at the University of Chicago developed the cold field electron emission source and built a STEM able to visualize single heavy atoms on thin carbon substrates. In 2008, Jannick Meyer et al. described the direct visualization of light atoms such as carbon and even hydrogen using TEM and a clean single-layer graphene substrate.

Background
Electrons
Theoretically, the maximum resolution, d, that one can obtain with a light microscope is limited by the wavelength of the photons that are being used to probe the sample, \lambda, and the numerical aperture of the system, NA:

d = \frac{\lambda}{2\,n \sin\alpha} = \frac{\lambda}{2\,\mathrm{NA}}

Early twentieth century scientists theorised ways of getting around the limitations of the relatively large wavelength of visible light (wavelengths of 400-700 nanometers) by using electrons. Like all matter, electrons have both wave and particle properties (as theorized by Louis-Victor de Broglie), and their wave-like properties mean that a beam of electrons can be made to behave like a beam of electromagnetic radiation. The wavelength of electrons is related to their kinetic energy via the de Broglie equation. An additional correction must be made to account for relativistic effects, as in a TEM an electron's velocity approaches the speed of light, c:

\lambda_e = \frac{h}{\sqrt{2 m_0 E \left(1 + \frac{E}{2 m_0 c^2}\right)}}

where h is Planck's constant, m0 is the rest mass of an electron and E is the energy of the accelerated electron. Electrons are usually generated in an electron microscope by a process known as thermionic emission from a filament, usually tungsten, in the same manner as a light bulb, or alternatively by field electron emission. The electrons are then accelerated by an electric potential (measured in volts) and focused by electrostatic and electromagnetic lenses onto the sample. The transmitted beam contains information about electron density, phase and periodicity; this beam is used to form an image.
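Evaluating the relativistic expression above for typical TEM accelerating voltages gives picometre-scale wavelengths; the sketch below uses only fundamental constants and the formula as written.

# Evaluating the relativistic electron wavelength given above for typical
# TEM accelerating voltages.
import math

H = 6.62607015e-34          # J*s
M0 = 9.1093837015e-31       # kg, electron rest mass
E_CHARGE = 1.602176634e-19  # C
C = 299792458.0             # m/s

def electron_wavelength_pm(voltage_v):
    energy_j = E_CHARGE * voltage_v
    p = math.sqrt(2.0 * M0 * energy_j * (1.0 + energy_j / (2.0 * M0 * C ** 2)))
    return H / p * 1e12

for kv in (100, 200, 300):
    print(f"{kv} kV -> {electron_wavelength_pm(kv * 1000):.2f} pm")
# 100 kV -> ~3.70 pm, 200 kV -> ~2.51 pm, 300 kV -> ~1.97 pm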

Source formation
From the top down, the TEM consists of an emission source, which may be a tungsten filament or a lanthanum hexaboride (LaB6) source. For tungsten, this will be of the form of either a hairpin-style filament or a small spike-shaped filament. LaB6 sources utilize small single crystals. By connecting this gun to a high-voltage source (typically ~100-300 kV) the gun will, given sufficient current, begin to emit electrons either by thermionic or field electron emission into the vacuum. This extraction is usually aided by the use of a Wehnelt cylinder. Once extracted, the upper lenses of the TEM allow for the formation of the electron probe to the desired size and location for later interaction with the sample.

Manipulation of the electron beam is performed using two physical effects. The interaction of electrons with a magnetic field will cause electrons to move according to the right-hand rule, thus allowing electromagnets to manipulate the electron beam. The use of magnetic fields allows for the formation of a magnetic lens of variable focusing power, the lens shape originating from the distribution of magnetic flux. Additionally, electrostatic fields can cause the electrons to be deflected through a constant angle. Coupling of two deflections in opposing directions with a small intermediate gap allows for the formation of a shift in the beam path, this being used in TEM for beam shifting; subsequently this is extremely important to STEM. From these two effects, as well as the use of an electron imaging system, sufficient control over the beam path is possible for TEM operation. The optical configuration of a TEM can be rapidly changed, unlike that of an optical microscope, as lenses in the beam path can be enabled, have their strength changed, or be disabled entirely simply via rapid electrical switching, the speed of which is limited by effects such as the magnetic hysteresis of the lenses.

Optics
The lenses of a TEM allow for beam convergence, with the angle of convergence as a variable parameter, giving the TEM the ability to change magnification simply by modifying the amount of current that flows through the coil, quadrupole or hexapole lenses. The quadrupole lens is an arrangement of electromagnetic coils at the vertices of a square, enabling the generation of a lensing magnetic field; the hexapole configuration simply enhances the lens symmetry by using six rather than four coils.

Single crystal LaB6 filament

Hairpin-style tungsten filament

Typically a TEM consists of three stages of lensing: the condenser lenses, the objective lenses, and the projector lenses. The condenser lenses are responsible for primary beam formation, whilst the objective lenses focus the beam that comes through the sample itself (in STEM scanning mode, there are also objective lenses above the sample to make the incident electron beam convergent). The projector lenses are used to expand the beam onto the phosphor screen or other imaging device, such as film. The magnification of the TEM is due to the ratio of the distances between the specimen and the objective lens's image plane. Additional quadrupole or hexapole lenses allow for the correction of asymmetrical beam distortions, known as astigmatism. Note that TEM optical configurations differ significantly with implementation, with manufacturers using custom lens configurations, such as in spherical aberration corrected instruments, or TEMs utilising energy filtering to correct electron chromatic aberration.
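Assuming each stage behaves approximately as a thin lens whose magnification is the ratio of image distance to object distance, the overall magnification is, to a good approximation, the product of the stage magnifications (notation ours, not from this text):

\[ M_{\mathrm{total}} \approx M_{\mathrm{objective}} \times M_{\mathrm{intermediate}} \times M_{\mathrm{projector}} \]

which is why changing lens currents, rather than moving any physical component, is sufficient to change the magnification of the instrument.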


Display
Imaging systems in a TEM consist of a phosphor screen, which may be made of fine (10–100 μm) particulate zinc sulphide, for direct observation by the operator, and, optionally, an image recording system such as film-based or doped-YAG-screen-coupled CCDs. Typically these devices can be removed or inserted into the beam path by the operator as required.

Components
A TEM is composed of several components, which include a vacuum system in which the electrons travel, an electron emission source for generation of the electron stream, a series of electromagnetic lenses, as well as electrostatic plates. The latter two allow the operator to guide and manipulate the beam as required. Also required is a device to allow the insertion into, motion within, and removal of specimens from the beam path. Imaging devices are subsequently used to create an image from the electrons that exit the system.

Vacuum system
To increase the mean free path of the electron–gas interaction, a standard TEM is evacuated to low pressures, typically on the order of 10⁻⁴ Pa. The need for this is twofold: first, to allow the voltage difference between the cathode and the ground without generating an arc, and second, to reduce the collision frequency of electrons with gas atoms to negligible levels; this effect is characterised by the mean free path. TEM components such as specimen holders and film cartridges must be routinely inserted or replaced, requiring a system with the ability to re-evacuate on a regular basis. As such, TEMs are equipped with multiple pumping systems and airlocks and are not permanently vacuum sealed.
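For a rough sense of scale, the kinetic-theory expression for the mean free path (a standard formula, with molecular parameters assumed here for illustration) is

\[ \lambda_{\mathrm{mfp}} = \frac{k_B T}{\sqrt{2}\,\pi d^2 p} \]

so for nitrogen-like molecules (d ≈ 0.37 nm) at room temperature and 10⁻⁴ Pa, the mean free path is on the order of tens of metres, comfortably longer than the electron flight path through the column.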

The electron source of the TEM is at the top, where the lensing system (4,7 and 8) focuses the beam on the specimen and then projects it onto the viewing screen (10). The beam control is on the right (13 and 14)

The vacuum system for evacuating a TEM to an operating pressure level consists of several stages. Initially, a low or roughing vacuum is achieved with either a rotary vane pump or diaphragm pumps, bringing the TEM to a sufficiently low pressure to allow the operation of a turbomolecular or diffusion pump, which brings the TEM to the high vacuum level necessary for operation. So that the low-vacuum pump does not require continuous operation while the turbomolecular pumps run continually, the vacuum side of a low-pressure pump may be connected to chambers which accommodate the exhaust gases from the turbomolecular pump. Sections of the TEM may be isolated by the use of pressure-limiting apertures to allow for different vacuum levels in specific areas, such as a higher vacuum of 10⁻⁴ to 10⁻⁷ Pa or better in the electron gun in high-resolution or field-emission TEMs. High-voltage TEMs require ultra-high vacuums in the range of 10⁻⁷ to 10⁻⁹ Pa to prevent the generation of an electrical arc, particularly at the TEM cathode. As such, for higher-voltage TEMs a third vacuum system may operate, with the gun isolated from the main chamber either by gate valves or by a differential pumping aperture. The differential pumping aperture is a small hole that prevents diffusion of gas molecules into the higher-vacuum gun area faster than they can be pumped out. For these very low pressures either an ion pump or a getter material is used. Poor vacuum in a TEM can cause several problems, from deposition of gas onto the specimen as it is being viewed, through a process known as electron-beam-induced deposition, to, in more severe cases, damage to the cathode from an electrical discharge. Vacuum problems due to specimen sublimation are limited by the use of a cold trap to adsorb sublimated gases in the vicinity of the specimen.


Specimen stage
TEM specimen stage designs include airlocks to allow for insertion of the specimen holder into the vacuum with minimal increase in pressure in other areas of the microscope. The specimen holders are adapted to hold a standard size of grid upon which the sample is placed, or a standard size of self-supporting specimen. Standard TEM grid sizes are a 3.05 mm diameter ring, with a thickness and mesh size ranging from a few to 100 μm. The sample is placed onto the inner meshed area, which has a diameter of approximately 2.5 mm. Usual grid materials are copper, molybdenum, gold or platinum. This grid is placed into the sample holder, which is paired with the specimen stage. A wide variety of designs of stages and holders exist, depending upon the type of experiment being performed. In addition to 3.05 mm grids, 2.3 mm grids are sometimes, if rarely, used. These grids were particularly used in the mineral sciences, where a large degree of tilt can be required and where specimen material may be extremely rare. Electron-transparent specimens have a thickness of around 100 nm, but this value depends on the accelerating voltage. Once inserted into a TEM, the sample often has to be manipulated to present the region of interest to the beam, such as in single-grain diffraction, in a specific orientation. To accommodate this, the TEM stage includes mechanisms for the translation of the sample in the XY plane, for Z height adjustment of the sample holder, and usually for at least one rotation degree of freedom for the sample. Thus a TEM stage may provide four degrees of freedom for the motion of the specimen. Most modern TEMs provide the ability for two orthogonal rotation angles of movement with specialized holder designs called double-tilt sample holders. Of note, however, is that some stage designs, such as the top-entry or vertical insertion stages once common for high-resolution TEM studies, may only have X-Y translation available. The design criteria of TEM stages are complex, owing to the simultaneous requirements of mechanical and electron-optical constraints, and have thus generated many unique implementations. A TEM stage is required to have the ability to hold a specimen and be manipulated to bring the region of interest into the path of the electron beam. As the TEM can operate over a wide range of magnifications, the stage must simultaneously be highly resistant to mechanical drift, with drift requirements as low as a few nm/minute, while being able to move several μm/minute, with repositioning accuracy on the order of nanometers. Earlier designs of TEM accomplished this with a complex set of mechanical downgearing devices, allowing the operator to finely control the motion of the stage by several rotating rods. Modern devices may use electrical stage designs, using screw gearing in concert with stepper motors, providing the operator with a computer-based stage input, such as a joystick or trackball. Two main designs for stages in a TEM exist, the side-entry and top-entry versions. Each design must accommodate the matching holder to allow for specimen insertion without either damaging delicate TEM optics or allowing gas into TEM systems under vacuum.

TEM sample support mesh "grid", with ultramicrotomy sections


The most common is the side-entry holder, where the specimen is placed near the tip of a long metal (brass or stainless steel) rod, with the specimen placed flat in a small bore. Along the rod are several polymer vacuum rings to allow for the formation of a vacuum seal of sufficient quality when inserted into the stage. The stage is thus designed to accommodate the rod, placing the sample either in between or near the objective lens, dependent upon the objective design. When inserted into the stage, the side-entry holder has its tip contained within the TEM vacuum, and the base is presented to atmosphere, the airlock formed by the vacuum rings. Insertion procedures for side-entry TEM holders typically involve the rotation of the sample to trigger microswitches that initiate evacuation of the airlock before the sample is inserted into the TEM column. The second design, the top-entry holder, consists of a cartridge that is several cm long with a bore drilled down the cartridge axis. The specimen is loaded into the bore, possibly utilising a small screw ring to hold the sample in place. This cartridge is inserted into an airlock with the bore perpendicular to the TEM optic axis. When sealed, the airlock is manipulated to push the cartridge such that the cartridge falls into place, where the bore hole becomes aligned with the beam axis, such that the beam travels down the cartridge bore and into the specimen. Such designs are typically unable to be tilted without blocking the beam path or interfering with the objective lens.

A diagram of a single-axis tilt sample holder for insertion into a TEM goniometer. Tilting of the holder is achieved by rotation of the entire goniometer.

Electron gun
The electron gun is formed from several components: the filament, a biasing circuit, a Wehnelt cap, and an extraction anode. By connecting the filament to the negative terminal of the power supply, electrons can be "pumped" from the electron gun to the anode plate and TEM column, thus completing the circuit. The gun is designed to create a beam of electrons exiting from the assembly at some given angle, known as the gun divergence semi-angle, α. By constructing the Wehnelt cylinder such that it has a higher negative charge than the filament itself, electrons that exit the filament in a diverging manner are, under proper operation, forced into a converging pattern, the minimum size of which is the gun crossover diameter.

Cross sectional diagram of an electron gun assembly, illustrating electron extraction

The thermionic emission current density, J, can be related to the work function of the emitting material and follows a Boltzmann-type dependence, given below, where A is a constant, Φ is the work function, k is Boltzmann's constant and T is the temperature of the material:

\[ J = A T^{2} e^{-\Phi/(kT)} \]

This equation shows that in order to achieve sufficient current density it is necessary to heat the emitter, taking care not to cause damage by application of excessive heat. For this reason, materials with either a high melting point, such as tungsten, or a low work function (LaB6) are required for the gun filament. Furthermore, both lanthanum hexaboride and tungsten thermionic sources must be heated in order to achieve thermionic emission; this can be achieved by the use of a small resistive strip. To prevent thermal shock, there is often a delay enforced in the application of current to the tip, so that thermal gradients do not damage the filament; the delay is usually a few seconds for LaB6, and significantly shorter for tungsten.
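As a rough, illustrative number (using commonly quoted handbook values rather than anything stated in this text): with an effective constant A ≈ 60 A cm⁻² K⁻², a work function Φ ≈ 4.5 eV and an operating temperature near 2700 K, the expression above gives a current density on the order of

\[ J = A T^{2} e^{-\Phi/(kT)} \approx 1\text{–}2\ \mathrm{A\,cm^{-2}} \]

for a tungsten filament, which illustrates why thermionic emitters must be run close to their thermal limits to deliver a useful beam current.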


Electron lens
Electron lenses are designed to act in a manner emulating that of an optical lens, by focusing parallel rays at some constant focal length. Lenses may operate electrostatically or magnetically. The majority of electron lenses for TEM utilise electromagnetic coils to generate a convex lens. For these lenses the field produced for the lens must be radially symmetric, as deviation from the radial symmetry of the magnetic lens causes aberrations such as astigmatism, and worsens spherical and chromatic aberration. Electron lenses are manufactured from iron, iron-cobalt or nickel cobalt alloys, such as permalloy. These are selected for their magnetic properties, such as magnetic saturation, hysteresis and permeability.

Diagram of a TEM split polepiece design lens

The components include the yoke, the magnetic coil, the poles, the polepiece, and the external control circuitry. The polepiece must be manufactured in a very symmetrical manner, as this provides the boundary conditions for the magnetic field that forms the lens. Imperfections in the manufacture of the polepiece can induce severe distortions in the magnetic field symmetry, which induce distortions that will ultimately limit the lenses' ability to reproduce the object plane. The exact dimensions of the gap, the polepiece internal diameter and taper, as well as the overall design of the lens, are often determined by finite element analysis of the magnetic field, whilst considering the thermal and electrical constraints of the design. The coils which produce the magnetic field are located within the lens yoke. The coils can contain a variable current, but typically utilise high voltages, and therefore require significant insulation in order to prevent short-circuiting the lens components. Thermal distributors are placed to ensure the extraction of the heat generated by the energy lost to resistance of the coil windings. The windings may be water-cooled, using a chilled water supply, in order to facilitate the removal of the high thermal duty.

Apertures
Apertures are annular metallic plates through which electrons that are further than a fixed distance from the optic axis may be excluded. These consist of a small metallic disc that is sufficiently thick to prevent electrons from passing through the disc, whilst permitting axial electrons. This permission of central electrons in a TEM causes two effects simultaneously: firstly, apertures decrease the beam intensity as electrons are filtered from the beam, which may be desired in the case of beam-sensitive samples. Secondly, this filtering removes electrons that are scattered to high angles, which may be due to unwanted processes such as spherical or chromatic aberration, or due to diffraction from interaction within the sample. Apertures are either a fixed aperture within the column, such as at the condenser lens, or a movable aperture, which can be inserted or withdrawn from the beam path, or moved in the plane perpendicular to the beam path. Aperture assemblies are mechanical devices which allow for the selection of different aperture sizes, which may be used by the operator to trade off intensity and the filtering effect of the aperture. Aperture assemblies are often equipped with micrometers to move the aperture, as required during optical calibration.


Imaging methods
Imaging methods in TEM utilize the information contained in the electron waves exiting from the sample to form an image. The projector lenses allow for the correct positioning of this electron wave distribution onto the viewing system. The observed intensity of the image, I, assuming a sufficiently high quality of imaging device, can be approximated as proportional to the time-averaged squared amplitude of the electron wavefunction, where the wave that forms the exit beam is denoted Ψ:

\[ I(x, y) \propto \left\langle \Psi(x, y)\,\Psi^{*}(x, y) \right\rangle \]

Different imaging methods therefore attempt to modify the electron waves exiting the sample in a form that is useful for obtaining information about the sample or the beam itself. From the previous equation, it can be deduced that the observed image depends not only on the amplitude of the beam, but also on the phase of the electrons, although phase effects may often be ignored at lower magnifications. Higher-resolution imaging requires thinner samples and higher energies of incident electrons. The sample can then no longer be considered to be absorbing electrons via a Beer's law effect; rather, it can be modelled as an object that does not change the amplitude of the incoming electron wavefunction but only modifies its phase. This model is known as a pure phase object; for sufficiently thin specimens, phase effects dominate the image, complicating analysis of the observed intensities. For example, to improve the contrast in the image the TEM may be operated at a slight defocus to enhance contrast, owing to convolution by the contrast transfer function of the TEM, which would normally decrease contrast if the sample were not a weak phase object.

Contrast formation
Contrast formation in the TEM depends greatly on the mode of operation. Complex imaging techniques, which utilise the unique ability to change lens strength or to deactivate a lens, allow for many operating modes. These modes may be used to discern information that is of particular interest to the investigator.

Bright field

The most common mode of operation for a TEM is the bright-field imaging mode. In this mode the contrast, when considered classically, is formed directly by occlusion and absorption of electrons in the sample. Thicker regions of the sample, or regions with a higher atomic number, will appear dark, whilst regions with no sample in the beam path will appear bright, hence the term "bright field". The image is in effect assumed to be a simple two-dimensional projection of the sample down the optic axis, and to a first approximation may be modelled via Beer's law; more complex analyses require modelling of the sample to include phase information.

Diffraction contrast

Samples can exhibit diffraction contrast, whereby the electron beam undergoes Bragg scattering, which in the case of a crystalline sample disperses electrons into discrete locations in the back focal plane. By the placement of apertures in the back focal plane, i.e. the objective aperture, the desired Bragg reflections can be selected (or excluded); thus only parts of the sample that are causing the electrons to scatter to the selected reflections will end up projected onto the imaging apparatus. If the reflections that are selected do not include the unscattered beam (which will appear at the focal point of the lens), then the image will appear dark wherever no sample scattering to the selected peak is present; as such, a region without a specimen will appear dark. This is known as a dark-field image.


Transmission electron micrograph of dislocations in steel, which are faults in the structure of the crystal lattice at the atomic scale

Modern TEMs are often equipped with specimen holders that allow the user to tilt the specimen to a range of angles in order to obtain specific diffraction conditions, and apertures placed above the specimen allow the user to select electrons that would otherwise be diffracted in a particular direction from entering the specimen. Applications for this method include the identification of lattice defects in crystals. By carefully selecting the orientation of the sample, it is possible not just to determine the position of defects but also to determine the type of defect present. If the sample is oriented so that one particular plane is only slightly tilted away from the strongest diffracting angle (known as the Bragg angle), any distortion of the crystal plane that locally tilts the plane to the Bragg angle will produce particularly strong contrast variations. However, defects that produce only displacement of atoms that do not tilt the crystal to the Bragg angle (i.e. displacements parallel to the crystal plane) will not produce strong contrast.

Electron energy loss

Utilizing the advanced technique of EELS, for TEMs appropriately equipped, electrons can be rejected based upon their voltage (which, due to the electrons' constant charge, is proportional to their energy), using magnetic-sector-based devices known as EELS spectrometers. These devices allow for the selection of particular energy values, which can be associated with the way the electron has interacted with the sample. For example, different elements in a sample result in different electron energies in the beam after the sample. This normally results in chromatic aberration; however, this effect can, for example, be used to generate an image which provides information on elemental composition, based upon the atomic transition during electron–electron interaction. EELS spectrometers can often be operated in both spectroscopic and imaging modes, allowing for isolation or rejection of elastically scattered beams. As inelastic scattering in many images includes information that may not be of interest to the investigator, thus reducing observable signals of interest, EELS imaging can be used to enhance contrast in observed images, including both bright field and diffraction, by rejecting unwanted components.

Phase contrast

Crystal structure can also be investigated by high-resolution transmission electron microscopy (HRTEM), also known as phase contrast. When utilizing a field emission source and a specimen of uniform thickness, the images are formed due to differences in phase of electron waves, which is caused by specimen interaction. Image formation is given by the complex modulus of the incoming electron beams. As such, the image is not dependent only on the number of electrons hitting the screen, making direct interpretation of phase contrast images more complex. However, this effect can be used to advantage, as it can be manipulated to provide more information about the sample, such as in complex phase retrieval techniques.


Diffraction
As previously stated, by adjusting the magnetic lenses such that the back focal plane of the lens rather than the imaging plane is placed on the imaging apparatus a diffraction pattern can be generated. For thin crystalline samples, this produces an image that consists of a pattern of dots in the case of a single crystal, or a series of rings in the case of a polycrystalline or amorphous solid material. For the single crystal case the diffraction pattern is dependent upon the orientation of the specimen and the structure of the sample illuminated by the electron beam. This image provides the investigator with information about the space group symmetries in the crystal and the crystal's orientation to the beam path. This is typically done without utilising any information but the position at which the diffraction spots appear and the observed image symmetries.
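In practice, spot positions are converted into lattice spacings through the camera-constant relation (a standard small-angle approximation, stated here for completeness rather than taken from this text):

\[ R\,d \approx \lambda L \]

where R is the distance of a diffraction spot from the central beam, d the corresponding lattice spacing, λ the electron wavelength and L the effective camera length, so measuring R on an indexed pattern yields d directly.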

Crystalline diffraction pattern from a twinned grain of FCC Austenitic steel

Diffraction patterns can have a large dynamic range, and for crystalline samples may have intensities greater than those recordable by a CCD. As such, TEMs may still be equipped with film cartridges for the purpose of obtaining these images, as the film is a single-use detector. Analysis of diffraction patterns beyond point position can be complex, as the image is sensitive to a number of factors such as specimen thickness and orientation, objective lens defocus, and spherical and chromatic aberration. Although quantitative interpretation of the contrast shown in lattice images is possible, it is inherently complicated and can require extensive computer simulation and analysis, such as electron multislice analysis. More complex behaviour in the diffraction plane is also possible, with phenomena such as Kikuchi lines arising from multiple diffraction within the crystalline lattice. In convergent beam electron diffraction (CBED), where a non-parallel, i.e. converging, electron wavefront is produced by concentrating the electron beam into a fine probe at the sample surface, the interaction of the convergent beam can provide information beyond structural data, such as sample thickness.

Convergent-beam Kikuchi lines from silicon, near the [100] zone axis


Three-dimensional imaging
A three-dimensional TEM image of a parapoxavirus

As TEM specimen holders typically allow for the rotation of a sample by a desired angle, multiple views of the same specimen can be obtained by rotating the angle of the sample along an axis perpendicular to the beam. By taking multiple images of a single TEM sample at differing angles, typically in 1° increments, a set of images known as a "tilt series" can be collected. This methodology was proposed in the 1970s by Walter Hoppe. Under purely absorption contrast conditions, this set of images can be used to construct a three-dimensional representation of the sample. The reconstruction is accomplished by a two-step process: first, images are aligned to account for errors in the positioning of a sample; such errors can occur due to vibration or mechanical drift. Alignment methods use image registration algorithms, such as autocorrelation methods, to correct these errors. Secondly, using a technique known as filtered back projection, the aligned image slices can be transformed from a set of two-dimensional images, Ij(x,y), to a single three-dimensional image, I'j(x,y,z). This three-dimensional image is of particular interest when morphological information is required; further study can be undertaken using computer algorithms, such as isosurfaces and data slicing, to analyse the data. As TEM samples cannot typically be viewed over a full 180° rotation, the observed images typically suffer from a "missing wedge" of data, which when using Fourier-based back projection methods decreases the range of resolvable frequencies in the three-dimensional reconstruction. Mechanical refinements, such as multi-axis tilting (two tilt series of the same specimen made at orthogonal directions) and conical tomography (where the specimen is first tilted to a given fixed angle and then imaged at equal angular rotational increments through one complete rotation in the plane of the specimen grid), can be used to limit the impact of the missing data on the observed specimen morphology. In addition, numerical techniques exist which can improve the collected data. All the above-mentioned methods involve recording tilt series of a given specimen field. This inevitably results in the summation of a high dose of reactive electrons through the sample and the accompanying destruction of fine detail during recording. The technique of low-dose (minimal-dose) imaging is therefore regularly applied to mitigate this effect. Low-dose imaging is performed by deflecting illumination and imaging regions simultaneously away from the optical axis to image an adjacent region to the area to be recorded (the high-dose region). This area is maintained centred during tilting and refocused before recording. During recording the deflections are removed so that the area of interest is exposed to the electron beam only for the duration required for imaging. An improvement of this technique (for objects resting on a sloping substrate film) is to have two symmetrical off-axis regions for focusing, followed by setting focus to the average of the two high-dose focus values before recording the low-dose area of interest. Non-tomographic variants on this method, referred to as single particle analysis, use images of multiple (hopefully) identical objects at different orientations to produce the image data required for three-dimensional reconstruction. If the objects do not have significant preferred orientations, this method does not suffer from the missing data wedge (or cone) which accompanies tomographic methods, nor does it incur excessive radiation dosage; however, it assumes that the different objects imaged can be treated as if the 3D data generated from them arose from a single stable object.
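The filtered back projection step described above can be written compactly in code. The following is a minimal, illustrative sketch rather than the routine used by any particular tomography package: it assumes the tilt series has already been aligned into a sinogram array (one row per tilt angle), and the function name, the Ram-Lak filter choice and the ±60° example range are our own assumptions.

import numpy as np
from scipy.ndimage import rotate

def filtered_back_projection(sinogram, angles_deg):
    """Reconstruct one 2-D slice from 1-D projections recorded at the given tilt angles."""
    n_angles, n_det = sinogram.shape

    # Ram-Lak (ramp) filter applied to every projection in Fourier space.
    ramp = np.abs(np.fft.fftfreq(n_det))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

    # Back-project: smear each filtered projection across the image plane and
    # rotate it into the orientation at which it was recorded.
    recon = np.zeros((n_det, n_det))
    for proj, angle in zip(filtered, angles_deg):
        recon += rotate(np.tile(proj, (n_det, 1)), angle, reshape=False, order=1)

    return recon * np.pi / (2.0 * n_angles)

# Example: a +/-60 degree tilt series in 1 degree steps (the "missing wedge" case above).
angles = np.arange(-60.0, 61.0, 1.0)
sinogram = np.random.rand(angles.size, 128)      # stand-in for aligned projections
slice_2d = filtered_back_projection(sinogram, angles)

The restriction of the angular range in the example is what produces the missing wedge of Fourier-space data described in the paragraph above.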


Sample preparation
Sample preparation in TEM can be a complex procedure. TEM specimens are required to be at most hundreds of nanometers thick; unlike neutron or X-ray radiation, the electron beam interacts readily with the sample, an effect that increases roughly with atomic number squared (Z²). High-quality samples will have a thickness that is comparable to the mean free path of the electrons that travel through the samples, which may be only a few tens of nanometers. Preparation of TEM specimens is specific to the material under analysis and the desired information to obtain from the specimen. As such, many generic techniques have been used for the preparation of the required thin sections.

A sample of cells (black) stained with osmium tetroxide and uranyl acetate embedded in epoxy resin (amber) ready for sectioning.

Materials that have dimensions small enough to be electron transparent, such as powders or nanotubes, can be quickly prepared by the deposition of a dilute sample containing the specimen onto support grids or films. In the biological sciences, in order to withstand the instrument vacuum and facilitate handling, specimens can be fixed using a negative staining material such as uranyl acetate or by plastic embedding. Alternatively, samples may be held at liquid nitrogen temperatures after embedding in vitreous ice. In materials science and metallurgy the specimens tend to be naturally resistant to vacuum, but still must be prepared as a thin foil, or etched so some portion of the specimen is thin enough for the beam to penetrate. Constraints on the thickness of the material may be limited by the scattering cross-section of the atoms of which the material is composed.

Tissue sectioning
By passing samples over a glass or diamond edge, small, thin sections can be readily obtained using a semi-automated method. This method is used to obtain thin, minimally deformed samples that allow for the observation of tissue samples. Additionally inorganic samples have been studied, such as aluminium, although this usage is limited owing to the heavy damage induced in the less soft samples. To prevent charge build-up at the sample surface, tissue samples need to be coated with a thin layer of conducting material, such as carbon, where the coating thickness is several nanometers. This may be achieved via an electric arc deposition process using a sputter coating device.

A diamond knife blade used for cutting ultrathin sections (typically 70 to 350 nm) for transmission electron microscopy.


Sample staining
Details in light microscope samples can be enhanced by stains that absorb light; similarly TEM samples of biological tissues can utilize high atomic number stains to enhance contrast. The stain absorbs electrons or scatters part of the electron beam which otherwise is projected onto the imaging system. Compounds of heavy metals such as osmium, lead, uranium or gold (in immunogold labelling) may be used prior to TEM observation to selectively deposit electron dense atoms in or on the sample in desired cellular or protein regions, requiring an understanding of how heavy metals bind to biological tissues.

A section of a cell of Bacillus subtilis, taken with a Tecnai T-12 TEM. The scale bar is 200nm.

Mechanical milling
Mechanical polishing may be used to prepare samples. Polishing needs to be done to a high quality, to ensure constant sample thickness across the region of interest. A diamond, or cubic boron nitride polishing compound may be used in the final stages of polishing to remove any scratches that may cause contrast fluctuations due to varying sample thickness. Even after careful mechanical milling, additional fine methods such as ion etching may be required to perform final stage thinning.

Chemical etching
Certain samples may be prepared by chemical etching, particularly metallic specimens. These samples are thinned using a chemical etchant, such as an acid, to prepare the sample for TEM observation. Devices to control the thinning process may allow the operator to control either the voltage or current passing through the specimen, and may include systems to detect when the sample has been thinned to a sufficient level of optical transparency.

Ion etching
Ion etching is a sputtering process that can remove very fine quantities of material. This is used to perform a finishing polish of specimens polished by other means. Ion etching uses an inert gas passed through an electric field to generate a plasma stream that is directed to the sample surface. Acceleration energies for gases such as argon are typically a few kilovolts. The sample may be rotated to promote even polishing of the sample surface. The sputtering rate of such methods is on the order of tens of micrometers per hour, limiting the method to only extremely fine polishing. More recently, focused ion beam (FIB) methods have been used to prepare samples. FIB is a relatively new technique to prepare thin samples for TEM examination from larger specimens. Because FIB can be used to micro-machine samples very precisely, it is possible to mill very thin membranes from a specific area of interest in a sample, such as a semiconductor or metal. Unlike inert gas ion sputtering, FIB makes use of significantly more energetic gallium ions and may alter the composition or structure of the material through gallium implantation.

The thin membrane shown here is suitable for TEM examination; however, at ~300-nm thickness, it would not be suitable for high-resolution TEM without further milling.
SEM image of a thin TEM sample milled by FIB.


Replication
Samples may also be replicated using cellulose acetate film, the film subsequently coated with a heavy metal, the original film melted away, and the replica imaged on the TEM. This technique is used for both materials and biological samples.

Modifications
The capabilities of the TEM can be further extended by additional stages and detectors, sometimes incorporated on the same microscope. An electron cryomicroscope (CryoTEM) is a TEM with a specimen holder capable of maintaining the specimen at liquid nitrogen or liquid helium temperatures. This allows imaging specimens prepared in vitreous ice, the preferred preparation technique for imaging individual molecules or macromolecular assemblies.
Staphylococcus aureus platinum replica image shot on a TEM at 50,000× magnification

A TEM can be modified into a scanning transmission electron microscope (STEM) by the addition of a system that rasters the beam across the sample to form the image, combined with suitable detectors. Scanning coils are used to deflect the beam, such as by an electrostatic shift of the beam, where the beam is then collected using a current detector such as a Faraday cup, which acts as a direct electron counter. By correlating the electron count to the position of the scanning beam (known as the "probe"), the transmitted component of the beam may be measured. The non-transmitted components may be obtained either by beam tilting or by the use of annular dark-field detectors.

In-situ experiments may also be conducted, such as in-situ reactions or material deformation testing. Modern research TEMs may include aberration correctors to reduce the amount of distortion in the image. Incident-beam monochromators may also be used, which reduce the energy spread of the incident electron beam to less than 0.15 eV. Major TEM makers include JEOL, Hitachi High-technologies, FEI Company (from merging with Philips Electron Optics), Carl Zeiss and NION.

Low-voltage electron microscope


The low-voltage electron microscope (LVEM) is a combination of SEM, TEM and STEM in one instrument, which operates at a relatively low electron accelerating voltage of 5 kV. Low voltage increases image contrast, which is especially important for biological specimens. This increase in contrast significantly reduces, or even eliminates, the need to stain. Sectioned samples generally need to be thinner than they would be for conventional TEM (20–65 nm). Resolutions of a few nm are possible in TEM, SEM and STEM modes.


Cryo-microscopy
This technique allows TEMs to be used to see the molecular structure of proteins and large molecules. Cryoelectron microscopy involves viewing unaltered macromolecular assemblies by vitrifying them, placing them on a grid and obtaining images by detecting electrons that are transmitted through the specimen.

Limitations
There are a number of drawbacks to the TEM technique. Many materials require extensive sample preparation to produce a sample thin enough to be electron transparent, which makes TEM analysis a relatively time consuming process with a low throughput of samples. The structure of the sample may also be changed during the preparation process. Also the field of view is relatively small, raising the possibility that the region analyzed may not be characteristic of the whole sample. There is potential that the sample may be damaged by the electron beam, particularly in the case of biological materials.

Resolution limits
The limit of resolution obtainable in a TEM may be described in several ways, and is typically referred to as the information limit of the microscope. One commonly used value is a cut-off value of the contrast transfer function, a function that is usually quoted in the frequency domain to define the reproduction of spatial frequencies of objects in the object plane by the microscope optics. A cut-off frequency, qmax, for the transfer function may be approximated with the following equation, where Cs is the spherical aberration coefficient and λ is the electron wavelength:

\[ q_{max} \approx \frac{1}{0.67\,(C_s \lambda^{3})^{1/4}} \]

Approximate tendency of spatial resolution achieved with TEM.

For a 200 kV microscope with partly corrected spherical aberrations ("to the third order") and a Cs value of 1 μm, a theoretical cut-off value might be 1/qmax = 42 pm. The same microscope without a corrector would have Cs = 0.5 mm and thus a 200-pm cut-off. The spherical aberrations are suppressed to the third or fifth order in the "aberration-corrected" microscopes. Their resolution is, however, limited by electron source geometry and brightness, and by chromatic aberrations in the objective lens system. The frequency-domain representation of the contrast transfer function may often have an oscillatory nature, which can be tuned by adjusting the focal value of the objective lens. This oscillatory nature implies that some spatial frequencies are faithfully imaged by the microscope, whilst others are suppressed. By combining multiple images with different spatial frequencies, techniques such as focal series reconstruction can be used to improve the resolution of the TEM in a limited manner. The contrast transfer function can, to some extent, be experimentally approximated through techniques such as Fourier transforming images of amorphous material, such as amorphous carbon. More recently, advances in aberration corrector design have been able to reduce spherical aberrations and to achieve resolution below 0.5 ångströms (50 pm) at magnifications above 50 million times. Improved resolution allows for the imaging of lighter atoms that scatter electrons less efficiently, such as lithium atoms in lithium battery materials. The ability to determine the position of atoms within materials has made the HRTEM an indispensable tool for nanotechnology research and development in many fields, including heterogeneous catalysis and the development of semiconductor devices for electronics and photonics.
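Under the same assumptions, the numbers quoted in this paragraph can be reproduced with a few lines of code: the constants are standard physical constants, while the function names and the 0.67 prefactor follow the approximation given above.

import math

H = 6.62607015e-34          # Planck constant, J s
M0 = 9.1093837015e-31       # electron rest mass, kg
Q = 1.602176634e-19         # elementary charge, C
C = 2.99792458e8            # speed of light, m/s

def electron_wavelength(voltage_volts):
    """Relativistically corrected de Broglie wavelength, in metres."""
    energy = Q * voltage_volts
    return H / math.sqrt(2.0 * M0 * energy * (1.0 + energy / (2.0 * M0 * C**2)))

def cutoff_spacing(cs_metres, voltage_volts):
    """Approximate information limit 1/q_max for a given spherical aberration Cs."""
    lam = electron_wavelength(voltage_volts)
    return 0.67 * (cs_metres * lam**3) ** 0.25

print(electron_wavelength(200e3))   # ~2.5e-12 m  (2.5 pm at 200 kV)
print(cutoff_spacing(1e-6, 200e3))  # ~4.2e-11 m  (42 pm, corrected, Cs = 1 um)
print(cutoff_spacing(5e-4, 200e3))  # ~2.0e-10 m  (200 pm, uncorrected, Cs = 0.5 mm)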


Scanning transmission electron microscopy


A scanning transmission electron microscope (STEM) is a type of transmission electron microscope (TEM). Pronunciation is [stem] or [esti:i:em]. As with any transmission illumination scheme, the electrons pass through a sufficiently thin specimen. However, STEM is distinguished from conventional transmission electron microscopes (CTEM) by focusing the electron beam into a narrow spot which is scanned over the sample in a raster. The rastering of the beam across the sample makes these microscopes suitable for analysis techniques such as mapping by energy dispersive X-ray (EDX) spectroscopy, electron energy loss spectroscopy (EELS) and annular dark-field imaging (ADF). These signals can be obtained simultaneously, allowing direct correlation of image and quantitative data. By using a STEM and a high-angle detector, it is possible to form atomic resolution images where the contrast is directly related to the atomic number (z-contrast image). The directly interpretable z-contrast image makes STEM imaging with a high-angle detector appealing. This is in contrast to the conventional high resolution electron microscopy technique, which uses phase-contrast, and therefore produces results which need interpretation by simulation. Usually a STEM is a conventional transmission electron microscope equipped with additional scanning coils, detectors and needed circuitry; however, dedicated STEMs are also manufactured.

A STEM equipped with a 3rd-order spherical aberration corrector

Inside the corrector (hexapole-hexapole type)


History
In 1925, Louis de Broglie first theorized the wave-like properties of an electron, with a wavelength substantially smaller than visible light. This would allow the use of electrons to image objects much smaller than the previous diffraction limit set by visible light. The first STEM was built in 1938 by Baron Manfred von Ardenne, working in Berlin for Siemens. However, at the time the results were inferior to those of transmission electron microscopy, and von Ardenne only spent two years working on the problem. The microscope was destroyed in an air raid in 1944, and von Ardenne did not return to his work after WWII. The technique was not developed further until the 1970s, when Albert Crewe at the University of Chicago developed the field emission gun and added a high quality objective lens to create a modern STEM. He demonstrated the ability to image atoms using an annular dark field detector.

Microscope schematic

Crewe and coworkers at the University of Chicago developed the cold field emission electron source and built a STEM able to visualize single heavy atoms on thin carbon substrates.

Aberration correction
The addition of an aberration corrector to electron microscopes enables electron probes with sub-angstrom diameters to be used. This has made it possible to identify individual atoms with unprecedented clarity.

Room environment
High resolution scanning transmission electron microscopes require exceptionally stable room environments. In order to obtain atomic resolution imaging the room must have a limited amount of room vibration, temperature fluctuations, electromagnetic waves, and acoustic waves.

Biological application
The first application of this method to the imaging of biological molecules was demonstrated in 1971. The motivation for STEM imaging of biological samples is particularly to make use of dark-field microscopy, where the STEM is more efficient than a conventional TEM, allowing high contrast imaging of biological samples without requiring staining. The method has been widely used to solve a number of structural problems in molecular biology.

Low-voltage electron microscope (LVEM)


The low-voltage electron microscope (LVEM) is a combination of SEM, TEM and STEM in one instrument, which operates at a relatively low electron accelerating voltage of 5 kV. Low voltage increases image contrast, which is especially important for biological specimens. This increase in contrast significantly reduces, or even eliminates, the need to stain. Sectioned samples generally need to be thinner than they would be for conventional STEM (20–70 nm). Resolutions of a few nm are possible in TEM, SEM and STEM modes.


Electron energy loss spectroscopy


Electron energy loss spectroscopy (EELS) as a STEM measurement technique is made possible by the addition of an electron spectrometer. The high-energy convergent electron beam in STEM provides local information about the sample, even down to atomic dimensions. With the addition of EELS, elemental identification is possible, along with additional capabilities for determining the electronic structure or chemical bonding of atomic columns. The low-angle inelastically scattered electrons used in EELS complement the high-angle scattered electrons in ADF images, allowing both signals to be acquired simultaneously. EELS is a technique popular with STEM microscopists.

Charge-coupled device
A charge-coupled device (CCD) is a device for the movement of electrical charge, usually from within the device to an area where the charge can be manipulated, for example conversion into a digital value. This is achieved by "shifting" the signals between stages within the device one at a time. CCDs move charge between capacitive bins in the device, with the shift allowing for the transfer of charge between bins. The CCD is a major piece of technology in digital imaging. In a CCD image sensor, pixels are represented by p-doped MOS capacitors. These capacitors are biased above the threshold for inversion when image acquisition begins, allowing the conversion of incoming photons into electron charges at the semiconductor–oxide interface; the CCD is then used to read out these charges. Although CCDs are not the only technology to allow for light detection, CCD image sensors are widely used in professional, medical, and scientific applications where high-quality image data is required. In applications with less exacting quality demands, such as consumer and professional digital cameras, active pixel sensors (CMOS) are generally used; the large quality advantage CCDs enjoyed early on has narrowed over time.

A specially developed CCD used for ultraviolet imaging in a wire bonded package


History
The charge-coupled device was invented in 1969 at AT&T Bell Labs by Willard Boyle and George E. Smith. The lab was working on semiconductor bubble memory when Boyle and Smith conceived of the design of what they termed, in their notebook, "Charge 'Bubble' Devices", and described how the device could be used as a shift register. The essence of the design was the ability to transfer charge along the surface of a semiconductor from one storage capacitor to the next. The concept was similar in principle to the bucket-brigade device (BBD), which was developed at Philips Research Labs during the late 1960s. The first patent (4,085,456) on the application of CCDs to imaging was assigned to Michael Tompsett. Smith and Boyle were only thinking of making a memory device and did not conceive of imaging in their notebook entry or patent, or participate in the development of CCD imagers or cameras, as incorrectly described in their Nobel citation. The initial paper describing the concept listed possible uses as a memory, a delay line, and an imaging device. The first experimental device demonstrating the principle was a row of closely spaced metal squares on an oxidized silicon surface electrically accessed by wire bonds. The first working CCD made with integrated circuit technology was a simple 8-bit shift register. This device had input and output circuits and was used to demonstrate its use as a shift register and as a crude eight-pixel linear imaging device. Development of the device progressed at a rapid rate. By 1971, Bell researchers led by Michael Tompsett were able to capture images with simple linear devices. Several companies, including Fairchild Semiconductor, RCA and Texas Instruments, picked up on the invention and began development programs. Fairchild's effort, led by ex-Bell researcher Gil Amelio, was the first with commercial devices, and by 1974 had a linear 500-element device and a 2-D 100 × 100 pixel device. Steven Sasson, an electrical engineer working for Kodak, invented the first digital still camera using a Fairchild 100 × 100 CCD in 1975. The first KH-11 KENNAN reconnaissance satellite equipped with charge-coupled device array (800 × 800 pixels) technology for imaging was launched in December 1976. Under the leadership of Kazuo Iwama, Sony also started a large development effort on CCDs involving a significant investment. Eventually, Sony managed to mass-produce CCDs for their camcorders. Before this happened, Iwama died in August 1982; subsequently, a CCD chip was placed on his tombstone to acknowledge his contribution. In January 2006, Boyle and Smith were awarded the National Academy of Engineering Charles Stark Draper Prize, and in 2009 they were awarded the Nobel Prize in Physics for their invention of the CCD concept. Michael Tompsett was awarded the 2010 National Medal of Technology and Innovation for pioneering work and electronic technologies, including the design and development of the first charge-coupled device (CCD) imagers. He was also awarded the 2012 IEEE Edison Medal "For pioneering contributions to imaging devices including CCD Imagers, cameras and thermal imagers".

George E. Smith and Willard Boyle, 2009


Basics of operation
The charge packets (electrons, blue) are collected in potential wells (yellow) created by applying positive voltage at the gate electrodes (G). Applying positive voltage to the gate electrode in the correct sequence transfers the charge packets.

In a CCD for capturing images, there is a photoactive region (an epitaxial layer of silicon), and a transmission region made out of a shift register (the CCD, properly speaking). An image is projected through a lens onto the capacitor array (the photoactive region), causing each capacitor to accumulate an electric charge proportional to the light intensity at that location. A one-dimensional array, used in line-scan cameras, captures a single slice of the image, while a two-dimensional array, used in video and still cameras, captures a two-dimensional picture corresponding to the scene projected onto the focal plane of the sensor. Once the array has been exposed to the image, a control circuit causes each capacitor to transfer its contents to its neighbor (operating as a shift register). The last capacitor in the array dumps its charge into a charge amplifier, which converts the charge into a voltage. By repeating this process, the controlling circuit converts the entire contents of the array in the semiconductor to a sequence of voltages. In a digital device, these voltages are then sampled, digitized, and usually stored in memory; in an analog device (such as an analog video camera), they are processed into a continuous analog signal (e.g. by feeding the output of the charge amplifier into a low-pass filter) which is then processed and fed out to other circuits for transmission, recording, or other processing.
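As a rough illustration of the readout sequence just described, the following toy model (the names, conversion gain and test image are illustrative assumptions, not part of any real sensor's driver) shifts rows of charge packets into a serial register and then shifts that register, one pixel at a time, into a simulated charge amplifier.

import numpy as np

def read_out_ccd(charge_wells, gain_uv_per_electron=2.0):
    """Toy model of destructive, sequential CCD readout of a 2-D array of charge packets."""
    wells = charge_wells.astype(float)
    n_rows, n_cols = wells.shape
    video_samples = []

    for _ in range(n_rows):
        serial_register = wells[-1].copy()      # bottom row transfers into the serial register
        wells = np.roll(wells, 1, axis=0)       # every remaining row shifts one pixel down
        wells[0] = 0.0                          # the top row is now empty

        for _ in range(n_cols):
            # The last element is dumped into the charge amplifier (charge -> voltage).
            video_samples.append(serial_register[-1] * gain_uv_per_electron)
            serial_register = np.roll(serial_register, 1)
            serial_register[0] = 0.0

    return np.array(video_samples)              # one voltage sample per pixel

# Example: a 4x4 "exposure" of accumulated photoelectrons.
image = np.random.poisson(1000, size=(4, 4))
signal = read_out_ccd(image)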

Detailed physics of operation


Charge generation
"One-dimensional" CCD image sensor from a fax machine

Before the MOS capacitors are exposed to light, they are biased into the depletion region; in n-channel CCDs, the silicon under the bias gate is slightly p-doped or intrinsic. The gate is then biased at a positive potential, above the threshold for strong inversion, which will eventually result in the creation of an n channel below the gate, as in a MOSFET. However, it takes time to reach this thermal equilibrium: up to hours in high-end scientific cameras cooled at low temperature. Initially after biasing, the holes are pushed far into the substrate, and no mobile electrons are at or near the surface; the CCD thus operates in a non-equilibrium state called deep depletion.[5] Then, when electron–hole pairs are generated in the depletion region, they are separated by the electric field, the electrons move toward the surface, and the holes move toward the substrate. Four pair-generation processes can be identified: photo-generation (up to 95% of quantum efficiency), generation in the depletion region, generation at the surface, and generation in the neutral bulk.

The last three processes are known as dark-current generation, and add noise to the image; they can limit the total usable integration time. The accumulation of electrons at or near the surface can proceed either until image integration is over and charge begins to be transferred, or until thermal equilibrium is reached. In this case, the well is said to be full (corresponding typically to about 10⁵ electrons per pixel).
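These noise sources are commonly combined in quadrature (a standard detector noise budget, not specific to this text). Per pixel,

\[ \sigma_{\mathrm{total}} = \sqrt{N_{\mathrm{photo}} + N_{\mathrm{dark}} + \sigma_{\mathrm{read}}^{2}} \]

where N_photo and N_dark are the numbers of collected photoelectrons and dark-current electrons (each contributing shot noise) and σ_read is the read noise of the output amplifier; cooling attacks the dark-current term, which is why scientific CCDs are routinely operated well below room temperature.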


Design and manufacturing


The photoactive region of a CCD is, generally, an epitaxial layer of silicon. It is lightly p-doped (usually with boron) and is grown upon a substrate material, often p++. In buried-channel devices, the type of design utilized in most modern CCDs, certain areas of the surface of the silicon are ion implanted with phosphorus, giving them an n-doped designation. This region defines the channel in which the photogenerated charge packets will travel. Simon Sze details the advantages of a buried-channel device: this thin layer (≈ 0.2–0.3 μm) is fully depleted and the accumulated photogenerated charge is kept away from the surface. This structure has the advantages of higher transfer efficiency and lower dark current, from reduced surface recombination. The penalty is smaller charge capacity, by a factor of 2–3, compared to the surface-channel CCD. The gate oxide, i.e. the capacitor dielectric, is grown on top of the epitaxial layer and substrate. Later in the process, polysilicon gates are deposited by chemical vapor deposition, patterned with photolithography, and etched in such a way that the separately phased gates lie perpendicular to the channels. The channels are further defined by utilization of the LOCOS process to produce the channel stop region. Channel stops are thermally grown oxides that serve to isolate the charge packets in one column from those in another. These channel stops are produced before the polysilicon gates are, as the LOCOS process utilizes a high-temperature step that would destroy the gate material. The channel stops are parallel to, and exclusive of, the channel, or "charge carrying", regions. Channel stops often have a p+ doped region underlying them, providing a further barrier to the electrons in the charge packets (this discussion of the physics of CCD devices assumes an electron transfer device, though hole transfer is possible). The clocking of the gates, alternately high and low, will forward and reverse bias the diode that is provided by the buried channel (n-doped) and the epitaxial layer (p-doped). This will cause the CCD to deplete near the p-n junction and will collect and move the charge packets beneath the gates (and within the channels) of the device. CCD manufacturing and operation can be optimized for different uses. The above process describes a frame transfer CCD. While CCDs may be manufactured on a heavily doped p++ wafer, it is also possible to manufacture a device inside p-wells that have been placed on an n-wafer. This second method, reportedly, reduces smear, dark current, and infrared and red response. This method of manufacture is used in the construction of interline-transfer devices. Another version of CCD is called a peristaltic CCD. In a peristaltic charge-coupled device, the charge-packet transfer operation is analogous to the peristaltic contraction and dilation of the digestive system. The peristaltic CCD has an additional implant that keeps the charge away from the silicon/silicon dioxide interface and generates a large lateral electric field from one gate to the next. This provides an additional driving force to aid in transfer of the charge packets.

Architecture
The CCD image sensors can be implemented in several different architectures. The most common are full-frame, frame-transfer, and interline. The distinguishing characteristic of each of these architectures is their approach to the problem of shuttering. In a full-frame device, all of the image area is active, and there is no electronic shutter. A mechanical shutter must be added to this type of sensor or the image smears as the device is clocked or read out. With a frame-transfer CCD, half of the silicon area is covered by an opaque mask (typically aluminum). The image can be quickly transferred from the image area to the opaque area or storage region with acceptable smear of a few percent. That image can then be read out slowly from the storage region while a new image is integrating or exposing in the active area. Frame-transfer devices typically do not require a mechanical shutter and were a common

The choice of architecture comes down to one of utility. If the application cannot tolerate an expensive, failure-prone, power-intensive mechanical shutter, an interline device is the right choice. Consumer snap-shot cameras have used interline devices. On the other hand, for applications that require the best possible light collection, and where issues of money, power and time are less important, the full-frame device is the right choice. Astronomers tend to prefer full-frame devices. The frame-transfer architecture falls in between and was a common choice before the fill-factor issue of interline devices was addressed. Today, frame-transfer is usually chosen when an interline architecture is not available, such as in a back-illuminated device.


CCD from a 2.1 megapixel Argus digital camera

CCDs containing grids of pixels are used in digital cameras, optical scanners, and video cameras as light-sensing devices. They commonly respond to 70 percent of the incident light (i.e. a quantum efficiency of about 70 percent), making them far more efficient than photographic film, which captures only about 2 percent of the incident light.

Most common types of CCDs are sensitive to near-infrared light, which allows infrared photography, night-vision devices, and zero-lux (or near zero-lux) video recording and photography. For normal silicon-based detectors, the sensitivity is limited to wavelengths below about 1.1 µm. One other consequence of their sensitivity to infrared is that infrared from remote controls often appears on CCD-based digital cameras or camcorders if they do not have infrared blockers.

Cooling reduces the array's dark current, improving the sensitivity of the CCD to low light intensities, even for ultraviolet and visible wavelengths. Professional observatories often cool their detectors with liquid nitrogen to reduce the dark current, and therefore the thermal noise, to negligible levels.
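As a hedged illustration of the quantum-efficiency gap quoted above (the photon count is an arbitrary illustrative number, not a measurement from the article):

photons = 10_000                          # illustrative number of incident photons
qe_ccd, qe_film = 0.70, 0.02              # efficiencies quoted in the text
print("CCD  detects about", int(photons * qe_ccd), "photo-events")
print("film detects about", int(photons * qe_film), "photo-events")
print("advantage factor  ~", round(qe_ccd / qe_film))   # roughly 35x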

CCD from a 2.1 megapixel Hewlett-Packard digital camera

Use in astronomy
Due to the high quantum efficiencies of CCDs, the linearity of their outputs (one count for one photon of light), their ease of use compared to photographic plates, and a variety of other reasons, CCDs were very rapidly adopted by astronomers for nearly all UV-to-infrared applications. Thermal noise and cosmic rays may alter the pixels in the CCD array. To counter such effects, astronomers take several exposures with the CCD shutter both closed and open. The average of the images taken with the shutter closed is needed to lower the random noise; this averaged dark frame is then subtracted from the open-shutter image to remove the dark current and other systematic defects (dead pixels, hot pixels, etc.) in the CCD.
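A minimal sketch of this dark-frame calibration step, assuming NumPy and synthetic stand-in data (the array sizes, count levels, and number of dark exposures are illustrative assumptions, not values from the article):

import numpy as np

rng = np.random.default_rng(0)

# stand-ins for real exposures: ten closed-shutter (dark) frames and one
# open-shutter science frame with the same integration time
darks = [rng.poisson(50, size=(512, 512)).astype(float) for _ in range(10)]
science = rng.poisson(50, size=(512, 512)).astype(float) + 1000.0  # + sky/object signal

master_dark = np.mean(darks, axis=0)     # averaging suppresses the random noise
calibrated = science - master_dark       # removes dark current and hot pixels

print(round(calibrated.mean()))          # ~1000: the astronomical signal remains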

The Hubble Space Telescope, in particular, has a highly developed series of steps (a "data reduction pipeline") to convert the raw CCD data into useful images.

CCD cameras used in astrophotography often require sturdy mounts to cope with vibrations from wind and other sources, along with the considerable weight of most imaging platforms. To take long exposures of galaxies and nebulae, many astronomers use a technique known as auto-guiding. Most autoguiders use a second CCD chip to monitor deviations during imaging. This chip can rapidly detect errors in tracking and command the mount motors to correct for them.

An unusual astronomical application of CCDs, called drift-scanning, uses a CCD to make a fixed telescope behave like a tracking telescope and follow the motion of the sky. The charges in the CCD are transferred and read out in a direction parallel to the motion of the sky, and at the same speed. In this way, the telescope can image a larger region of the sky than its normal field of view. The Sloan Digital Sky Survey is the most famous example of this, using the technique to produce the largest uniform survey of the sky yet accomplished. In addition to imaging, CCDs are also used in astronomical analytical instrumentation such as spectrometers.


Color cameras
Digital color cameras generally use a Bayer mask over the CCD. Each square of four pixels has one pixel filtered red, one blue, and two green (the human eye is more sensitive to green than to either red or blue). The result is that luminance information is collected at every pixel, but the color resolution is lower than the luminance resolution.

Better color separation can be achieved by three-CCD devices (3CCD) and a dichroic beam-splitter prism, which splits the image into red, green and blue components. Each of the three CCDs is arranged to respond to a particular color. Many professional video camcorders, and some semi-professional camcorders, use this technique, although developments in competing CMOS technology have made CMOS sensors, both with beam splitters and with Bayer filters, increasingly popular in high-end video and digital cinema cameras. Another advantage of 3CCD over a Bayer-mask device is higher quantum efficiency (and therefore higher light sensitivity for a given aperture size). This is because in a 3CCD device most of the light entering the aperture is captured by a sensor, while a Bayer mask absorbs a high proportion (about two thirds) of the light falling on each CCD pixel.
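A hedged sketch of what the RGGB sampling described above does to a scene (the tiny synthetic image is an illustrative stand-in; a real sensor performs this sampling optically with the filter mask):

import numpy as np

rgb = np.random.default_rng(1).random((4, 4, 3))    # toy RGB scene, 4 x 4 pixels

mosaic = np.zeros(rgb.shape[:2])
mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]   # red at even rows, even columns
mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]   # green
mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]   # green (two greens per 2x2 block)
mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]   # blue

print(mosaic.round(2))   # the single-channel raw frame; demosaicing rebuilds color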

A Bayer filter on a CCD

CCD color sensor

For still scenes, for instance in microscopy, the resolution of a Bayer-mask device can be enhanced by microscanning technology. During the process of color co-site sampling, several frames of the scene are produced. Between acquisitions, the sensor is moved in pixel dimensions, so that each point in the visual field is acquired consecutively by elements of the mask that are sensitive to the red, green and blue components of its color. Eventually every pixel in the image has been scanned at least once in each color, and the resolutions of the three channels become equivalent (the resolutions of the red and blue channels are quadrupled, while that of the green channel is doubled).
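A hedged toy illustration of co-site sampling, assuming an RGGB mask and four one-pixel sensor shifts (the shift pattern and the tiny synthetic scene are illustrative assumptions):

import numpy as np

scene = np.random.default_rng(2).random((4, 4, 3))     # the "true" RGB scene
cfa = np.empty((4, 4), dtype=int)                      # color filter array: 0=R, 1=G, 2=B
cfa[0::2, 0::2], cfa[0::2, 1::2] = 0, 1
cfa[1::2, 0::2], cfa[1::2, 1::2] = 1, 2

recovered = np.zeros_like(scene)
for dy, dx in [(0, 0), (0, 1), (1, 0), (1, 1)]:        # one-pixel sensor shifts
    shifted_cfa = np.roll(cfa, shift=(dy, dx), axis=(0, 1))
    for ch in range(3):
        mask = shifted_cfa == ch
        recovered[mask, ch] = scene[mask, ch]          # record that channel here

print(np.allclose(recovered, scene))                   # True: every pixel got R, G and B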



Sensor sizes
Sensors (CCD/CMOS) come in various sizes, or image sensor formats. These sizes are often referred to with an inch-fraction designation such as 1/1.8" or 2/3", called the optical format. This designation actually dates back to the 1950s and the era of Vidicon tubes.
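The optical format does not describe any physical dimension of the chip directly. A common rule of thumb (an assumption added here, not stated in the article) is that the usable image diagonal is roughly two thirds of the nominal tube diameter, i.e. about 16 mm per nominal inch:

for fmt_name, fmt_inches in [('1/3"', 1 / 3), ('1/1.8"', 1 / 1.8),
                             ('2/3"', 2 / 3), ('1"', 1.0)]:
    approx_diagonal_mm = fmt_inches * 16.0       # assumed ~16 mm per nominal inch
    print(f"{fmt_name:7s} format -> image diagonal ~ {approx_diagonal_mm:.1f} mm")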

x80 microscope view of an RGGB Bayer filter on a 240 line Sony CCD PAL Camcorder CCD sensor

Electron-multiplying CCD
An electron-multiplying CCD (EMCCD, also known as an L3Vision CCD, L3CCD or Impactron CCD) is a charge-coupled device in which a gain register is placed between the shift register and the output amplifier. The gain register is split up into a large number of stages. In each stage, the electrons are multiplied by impact ionization in a similar way to an avalanche diode. The gain probability at every stage of the register is small (P < 2%), but as the number of stages is large (N > 500), the overall gain can be very high (g = (1 + P)^N), with single input electrons giving many thousands of output electrons.

Reading a signal from a CCD gives a noise background, typically a few electrons. In an EMCCD, this noise is superimposed on many thousands of electrons rather than a single electron; the devices' primary advantage is thus their negligible readout noise. EMCCDs show a similar sensitivity to intensified CCDs (ICCDs). However, as with ICCDs, the gain that is applied in the gain register is stochastic, and the exact gain that has been applied to a pixel's charge is impossible to know. At high gains (> 30), this uncertainty has the same effect on the signal-to-noise ratio (SNR) as halving the quantum efficiency (QE) with respect to operation at unity gain. However, at very low light levels (where the quantum efficiency is most important), it can be assumed that a pixel either contains an electron or not. This removes the noise associated with the stochastic multiplication, at the risk of counting multiple electrons in the same pixel as a single electron. To avoid multiple counts in one pixel due to coincident photons in this mode of operation, high frame rates are essential.

The dispersion in the gain is shown in the graph on the right. For multiplication registers with many elements and large gains it is well modelled by the equation:

Electrons are transferred serially through the gain stages making up the multiplication register of an EMCCD. The high voltages used in these serial transfers induce the creation of additional charge carriers through impact ionisation.

There is a dispersion (variation) in the number of electrons output by the multiplication register for a given (fixed) number of input electrons (shown in the legend on the right). The probability distribution for the number of output electrons is plotted logarithmically on the vertical axis for a simulation of a multiplication register. Also shown are results from the empirical fit equation shown on this page.



P(n) = (n − m + 1)^(m−1) / [(m − 1)! · (g − 1 + 1/m)^m] · exp[−(n − m + 1) / (g − 1 + 1/m)],   if n ≥ m,

where P is the probability of getting n output electrons given m input electrons and a total mean multiplication register gain of g.

Because of their lower cost and better resolution, EMCCDs are capable of replacing ICCDs in many applications. ICCDs still have the advantage that they can be gated very fast and are thus useful in applications like range-gated imaging. EMCCD cameras require a cooling system, using either thermoelectric cooling or liquid nitrogen, to cool the chip down to temperatures in the range of −65 °C to −95 °C. This cooling system adds cost to the EMCCD imaging system and may cause condensation problems in the application. However, high-end EMCCD cameras are equipped with a permanent hermetic vacuum system enclosing the chip to avoid condensation issues.

The low-light capabilities of EMCCDs find use primarily in astronomy and biomedical research, among other fields. In particular, their low noise at high readout speeds makes them very useful for a variety of astronomical applications involving low light sources and transient events, such as lucky imaging of faint stars, high-speed photon-counting photometry, Fabry–Pérot spectroscopy and high-resolution spectroscopy. More recently, these types of CCDs have broken into the field of biomedical research in low-light applications including small-animal imaging, single-molecule imaging, Raman spectroscopy, super-resolution microscopy, as well as a wide variety of modern fluorescence microscopy techniques, thanks to their greater SNR in low-light conditions in comparison with traditional CCDs and ICCDs.

In terms of noise, commercial EMCCD cameras typically have clock-induced charge (CIC) and dark current (dependent on the extent of cooling) that together lead to an effective readout noise ranging from 0.01 to 1 electrons per pixel read. However, recent improvements in EMCCD technology have led to a new generation of cameras capable of producing significantly less CIC, higher charge-transfer efficiency and an EM gain 5 times higher than what was previously available. These advances in low-light detection lead to an effective total background noise of 0.001 electrons per pixel read, a noise floor unmatched by any other low-light imaging device.
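A hedged Monte Carlo sketch of the multiplication register (the stage count and per-stage probability are illustrative values consistent with the ranges quoted above, not specifications from the article):

import numpy as np

rng = np.random.default_rng(3)
P, N = 0.015, 600                        # per-stage ionization probability, stage count

# 5000 independent trials, each starting with a single input electron
electrons = np.ones(5000, dtype=np.int64)
for _ in range(N):
    electrons = electrons + rng.binomial(electrons, P)   # each stage adds secondaries

print("mean simulated gain:", electrons.mean())
print("analytic (1 + P)^N :", (1 + P) ** N)
# individual trials scatter widely: for one input electron the output is roughly
# exponentially distributed, which is the gain dispersion discussed above.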

Frame transfer CCD


The frame transfer CCD imager was the first imaging structure proposed for CCD imaging by Michael Tompsett at Bell Laboratories. A frame transfer CCD is a specialized CCD, often used in astronomy and some professional video cameras, designed for high exposure efficiency and correctness.

The normal functioning of a CCD, astronomical or otherwise, can be divided into two phases: exposure and readout. During the first phase, the CCD passively collects incoming photons, storing electrons in its cells. After the exposure time has passed, the cells are read out one line at a time. During the readout phase, cells are shifted down the entire area of the CCD. While they are shifted, they continue to collect light. Thus, if the shifting is not fast enough, errors can result from light that falls on a cell holding charge during the transfer. These errors are referred to as "vertical smear" and cause a strong light source to create a vertical line above and below its exact location. In addition, the CCD cannot be used to collect light while it is being read out. Unfortunately, faster shifting requires a faster readout, and a faster readout can introduce errors in the cell charge measurement, leading to a higher noise level.
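A hedged toy model of the vertical smear just described (the row count, source brightness and per-shift flux are illustrative assumptions): a bright source keeps illuminating its row position while the column is shifted toward the readout register, so every packet that passes beneath it picks up extra charge.

import numpy as np

rows, bright_row, flux_per_shift = 16, 8, 50.0
column = np.zeros(rows)
column[bright_row] = 1000.0                 # charge collected during the exposure

readout = []
for _ in range(rows):                       # shift one row per readout step
    column[bright_row] += flux_per_shift    # light keeps arriving during readout
    readout.append(column[0])               # bottom row enters the output register
    column = np.roll(column, -1)            # shift the whole column down one row
    column[-1] = 0.0                        # an empty row enters at the top

print(np.round(readout))    # packets read out after the source carry a ~50 e- streak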

A frame transfer CCD solves both problems: it has a shielded area, not sensitive to light, containing as many cells as the area exposed to light. Typically, this area is covered by a reflective material such as aluminium. When the exposure time is up, the cells are transferred very rapidly to the hidden area. Here, safe from any incoming light, the cells can be read out at whatever speed is necessary to correctly measure their charge. At the same time, the exposed part of the CCD is collecting light again, so no delay occurs between successive exposures.

The disadvantage of such a CCD is the higher cost: the cell area is basically doubled, and more complex control electronics are needed.


Intensified charge-coupled device


An intensified charge-coupled device (ICCD) is a CCD that is optically connected to an image intensifier mounted in front of the CCD. An image intensifier includes three functional elements: a photocathode, a micro-channel plate (MCP) and a phosphor screen. These three elements are mounted one closely behind the other in that order. Photons coming from the light source fall onto the photocathode, generating photoelectrons. The photoelectrons are accelerated towards the MCP by an electrical control voltage applied between the photocathode and the MCP. The electrons are multiplied inside the MCP and then accelerated towards the phosphor screen. The phosphor screen finally converts the multiplied electrons back into photons, which are guided to the CCD by a fiber optic or a lens.

An image intensifier inherently provides a shutter function: if the control voltage between the photocathode and the MCP is reversed, the emitted photoelectrons are not accelerated towards the MCP but return to the photocathode. Thus no electrons are multiplied or emitted by the MCP, none reach the phosphor screen, and no light is emitted from the image intensifier. In this case no light falls onto the CCD, which means that the shutter is closed. The process of reversing the control voltage at the photocathode is called gating, and ICCDs are therefore also called gateable CCD cameras.

Besides the extremely high sensitivity of ICCD cameras, which enables single-photon detection, the gateability is one of the major advantages of the ICCD over EMCCD cameras. The highest-performing ICCD cameras enable shutter times as short as 200 picoseconds. ICCD cameras are in general somewhat more expensive than EMCCD cameras because they need the costly image intensifier. On the other hand, EMCCD cameras need a cooling system to cool the EMCCD chip down to temperatures around 170 K. This cooling system adds cost to the EMCCD camera and often causes condensation problems in the application. ICCDs are used in night-vision devices and in a large variety of scientific applications.
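A hedged arithmetic sketch of the intensifier chain described above. Every number here is an assumed, illustrative placeholder (none come from the article); the point is only to show how the stage gains and efficiencies multiply through to the CCD.

photons_in       = 100      # photons striking the photocathode (illustrative)
photocathode_qe  = 0.25     # assumed photocathode quantum efficiency
mcp_gain         = 1.0e4    # assumed electron multiplication in the MCP
phosphor_yield   = 30       # assumed photons emitted per electron at the screen
coupling_eff     = 0.5      # assumed fiber-optic/lens coupling efficiency
ccd_qe           = 0.5      # assumed CCD quantum efficiency

photoelectrons_on_ccd = (photons_in * photocathode_qe * mcp_gain
                         * phosphor_yield * coupling_eff * ccd_qe)
print(f"~{photoelectrons_on_ccd:.2e} photoelectrons on the CCD "
      f"from {photons_in} input photons")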



Blooming
When a CCD exposure is long enough, the electrons that collect in the "bins" in the brightest part of the image will eventually overflow the bin, resulting in blooming. The structure of the CCD allows the electrons to flow more easily in one direction than another, resulting in vertical streaking.[8][9][10] Some anti-blooming features that can be built into a CCD reduce its sensitivity to light by using some of the pixel area for a drain structure. James M. Early developed a vertical anti-blooming drain that would not detract from the light collection area, and so did not reduce light sensitivity.
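A hedged toy model of blooming along one column (the full-well capacity, pixel values and equal-split spill rule are illustrative assumptions): once a pixel exceeds its capacity, the excess charge is repeatedly passed to its vertical neighbours, producing the streak described above.

import numpy as np

full_well = 50_000.0
column = np.array([1_000.0, 1_000.0, 400_000.0, 1_000.0, 1_000.0])  # saturated pixel

for _ in range(20):                             # iterate until the excess has spread
    excess = np.clip(column - full_well, 0.0, None)
    column -= excess                            # clamp each pixel to full well
    column[:-1] += 0.5 * excess[1:]             # spill half of the excess upward
    column[1:]  += 0.5 * excess[:-1]            # and half downward
print(column.round())                           # charge has bled along the column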
