
The Image Processing Cookbook

(4th edition)
A guide to the processing and analysis of scientific, forensic and technical images.
John C. Russ
College of Engineering North Carolina State University Raleigh, NC, USA
Copyright © 2017 John C. Russ. All Rights Reserved.

Contents
Introduction

Chapter 1 - Image Sources and Characteristics
    Color Cameras; Other Image Sources; Resolution; Noise; Compression

Chapter 2 - Performing Image Adjustments
    Color Adjustments; Distortion; Interpolation; Random Noise; Nonuniform Illumination; Contrast Expansion

Chapter 3 - Applying Image Enhancements
    Pseudocolor; Principal Components; Combining Images; Detail Sharpening; Edge Delineation; Texture

Chapter 4 - Fourier Space Processing
    Frequency Filters; Removing Periodic Noise; Cross-correlation; Deconvolution

Chapter 5 - Binary Images
    Thresholding; Contour Lines; 4- and 8-Connectedness; Boolean Combinations; Morphology; Watershed Segmentation; Skeletons

Chapter 6 - Measurements
    Manual Measurement; Global Measurements; Feature Measurements

Appendix A - 32-bit Photoshop with Plugins
Appendix B - 64-bit Photoshop with Plugins
Appendix C - Matlab Image Processing Toolbox
Appendix D - ImageJ
Appendix E - Image-Pro Plus
Appendix F - ImageMet SPIP
Appendix G - GIMP
Appendix H - Others
Introduction

This project has grown from a set of notes used for workshops on image processing and analysis, and
has also proven useful for self-paced study. It is expected that readers will follow and use the text
during a workshop, or as a hands-on guide to learning appropriate methods in their routine work,
while consulting a more complete book such as The Image Processing Handbook (7th ed.), Image Analysis
of Food Structure, Measuring Shape, or Forensic Uses of Digital Imaging (2nd ed.) as a resource
for the reasoning behind the methods, references to algorithms, additional applications and examples,
and more detailed comparison of various techniques.

When the first edition of the Cookbook was written, the most common platform for handling digital
images, especially photographs from digital cameras, was Adobe Photoshop. Fovea Pro provided a set
of plug-ins that included a complete array of tools for scientific, technical and forensic image
processing and measurement. So the specific menu selections and dialog boxes used by those
programs, practically identical on Macintosh and Windows computers, were illustrated to provide
guidance to readers.

The world has changed since then. Photoshop is now marketed as a continually evolving (and
continuously paid-for) program package that includes a few directly useful routines, but has an
unwieldy number of options and is primarily targeted at the graphic arts community. As both
Windows and Macintosh computers have grown to support 64-bit programs rather than being limited
to 32-bit, and as Photoshop has moved along with that change, many plugins including the original
Fovea Pro package will no longer operate with newer versions. The QIA64 (Quantitative Image
Analysis) plugins described in Appendix B work with 64-bit versions of Photoshop.

Photoshop still offers several advantages, including the ability to accept add-on modules for
specific purposes, and the capability to open a very wide variety of types of image files, including the
“raw” formats saved by most digital cameras. It can also be used with some video files by loading
each frame into a separate layer, and it is readily automated with actions and scripts. But it has the
disadvantage of including many functions intended for graphic artists, which are inappropriate for use
with images intended for scientific or forensic purposes. The perceived possibility that it might be
used to alter image contents and create false results is a potential cause for concern.

Other choices such as Matlab Image Processing Toolbox at the high end, Media Cybernetics Image-
Pro Plus when control of hardware such as microscope stages is desired, and ImageJ at the low end,
can fill some of the processing and analysis needs. Matlab is costly and has a steep learning curve
(even some folks at Mathworks admit that the interface “could use some help,” and other users have
more colorful descriptions). Image-Pro Plus is typically bundled with light microscope hardware,
includes a wide variety of automation options, and generally uses older and less accurate processing
algorithms. ImageJ is free, but it accepts an enormously varied range of individually written and
submitted modules, many of them duplicating specific functions appropriate for narrow biological
light microscopy applications, and sorting them out can be challenging. Even more difficult is
evaluating the quality of the programming and results, which are not always accurate or based on the
best algorithms. Finally, what support there is comes in the form of on-line discussion groups and may
not provide meaningful, timely or dependable answers. Some other programs, listed in the
Appendices, provide limited functionality, especially with regard to measurement.

All of these programs provide some degree of automation. This ranges from Photoshop’s “Actions”
which record a series of steps (including plug-in operations) that can be applied to a collection of
images, to the full-blown programming language in Matlab, which can also be used to program
additional functions as needed. Image-Pro Plus provides a macro language for automation; ImageJ
accepts add-ons written in Java.

There is not, at this moment, a good, simple answer to the question “What software should I use?”
The author has used all of these programs, and others, in typical applications and for teaching
purposes. This fourth edition of the Cookbook emphasizes the typical workflow and sequence of
operations, with descriptions and examples of the appropriate steps that are useful, and often
necessary, in the processing and analysis of images. Procedures are illustrated that have a broad
range of applicability in many different disciplines, and at scales ranging from electron microscopy to
astronomy. This is separate from a summary of the specific software routines that implement them.
The body of the text shows and compares the various operations; the software descriptions are
collected into appendices.

The appendices explain briefly how to accomplish the basic steps, using each of the programs
mentioned and others, but the listing of procedures there does not attempt to fully replace the software
documentation. No single software package is “complete” and able to perform all of the potentially
useful algorithms. If software other than the programs mentioned is used, the text provides a
basic guide to what is possible and appropriate, and the types of software routines useful for
achieving those results. Since different programs often use different names for the same procedure, or
report different measurements using the same name, it is still the user’s responsibility to test the
program and verify the results.

John Russ
Chapter 1 - Image Sources and Characteristics

The majority of digital image files loaded into computers for possible processing and analysis come
from digital cameras, so it is useful to begin by considering the characteristics of the resulting images.
Some surveillance cameras are monochrome, and these are often characterized by poor resolution,
both spatial and tonal (terms that will be explained shortly), as well as optical limitations. Most
modern digital cameras record color images.

Color Cameras

There are some camera designs that use a single array of CCD or CMOS detectors and capture
multiple images, usually three, through individual color filters, for example mounted on a rotating disk.
There are advantages to this design: the full resolution of the detector array is available and different
exposure times can be used to balance the colors. But this type of camera can only be used with static
scenes, such as a studio setting or perhaps some laboratory or microscope situations, and so this is
not a common type.

Some video cameras use three detector arrays, with a combination of prisms and filters to separate the
incoming light into red, green and blue wavelength bands that are detected separately and the resulting
signals combined. In order to limit color shading across the images due to the prisms, wide angle
lenses are not used. The designs are relatively expensive, and the need to keep the three detectors in
proper alignment makes the cameras somewhat fragile. Also, loss of light intensity in passing through
the optics is significant and these designs generally do not perform well in low light situations.

It is possible to stack three sets of detectors on top of each other in a single chip, so that the blue light,
which penetrates only a short distance in silicon, is recorded in the topmost array, the green light in
the second, and the red light, which penetrates the deepest, is recorded in the bottom array. This
design maintains the spatial resolution of the array but it is difficult to balance the detector efficiencies
to obtain good color response. A few cameras use this design.
By far the most common method used to record color images places colored filters in front of the
individual detectors in a single array. This “color filter array” (CFA) may contain various colors,
depending on the designs of the individual manufacturer, but the red-green-blue (RGB) arrangement
shown in Figure 1 is fairly representative, and dates back to the origin of the design four decades
ago.

As indicated in the figure, the CFA method reduces the spatial resolution of the recorded image because
the signals from several individual detectors must be combined to produce the RGB color
representation of the scene. This process, which is carried out inside the camera, is called
“demosaicing,” and different manufacturers have their own proprietary algorithms for accomplishing it, but
in all cases it requires interpolation, and can cause color artifacts at edges and boundaries. For the
CFA shown, only one quarter of the detectors sense red light, for example, and at other locations the
red intensity must be estimated by interpolation. The resulting spatial resolution is about half that
implied by the advertised “pixel count” for the camera.
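To make the interpolation concrete, the sketch below performs naive bilinear demosaicing of an RGGB
Bayer mosaic in Python (NumPy/SciPy). It is only an illustration of the principle; real in-camera
algorithms are proprietary and considerably more sophisticated.

```python
import numpy as np
from scipy.ndimage import convolve

def demosaic_bilinear(mosaic):
    """Naive bilinear demosaicing of an RGGB Bayer mosaic (2-D float array).

    Each site in `mosaic` holds only the one color sampled there:
    R at (even, even), G at (even, odd) and (odd, even), B at (odd, odd).
    """
    h, w = mosaic.shape
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1.0 - r_mask - b_mask

    # Kernels that average the recorded neighbors of each missing sample.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0

    out = np.empty((h, w, 3))
    for i, (mask, k) in enumerate([(r_mask, k_rb), (g_mask, k_g), (b_mask, k_rb)]):
        # Convolving the masked samples and normalizing by the convolved mask
        # interpolates every missing value from its recorded neighbors.
        out[..., i] = convolve(mosaic * mask, k) / np.maximum(convolve(mask, k), 1e-6)
    return out
```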
This illustrates one of the several different meanings of the word “pixel.” It may refer to the number
of detectors in the array, or the number of values stored in the computer, or even the number of points
in the display, each of which consists of a triad of red, green and blue values produced by a variety
of different technologies, such as phosphors in a CRT, filters in an LCD, emission from an OLED, or
even colored bulbs on a billboard. The differences are not important for the purposes of this text.
The meaning that is used throughout this book is the stored values in the computer, even if they do not
correspond to the actual spatial resolution of the image.

The choice of CCD or CMOS detectors is generally of minor importance for image quality. CMOS
(complementary metal-oxide semiconductors) have somewhat greater fixed pattern noise due to slight
variations in the individual detectors and associated electronics, and less dynamic range than CCD
(charge-coupled device) arrays, other things being equal. But they also require less power and less
external circuitry and so are preferred for many small cameras. Properly used, either type is capable
of delivering high quality images.

Other Image Sources

There are other potential sources of digital images. A desktop scanner is often useful for recording
images from relatively large, more-or-less flat objects. These generally have good resolution (1200 to
2400 pixels per inch), with consistent calibration for both dimension and color. Film scanners with
higher spatial resolution, usually 4000 pixels per inch or more, and a wide dynamic range to cover
the density of films, produce high quality results. And various scientific instruments, often
microscopes of various kinds, usually record their images in a digital format. Most of these produce
monochrome images, although the various signals that are recorded may be combined by placing them
in color channels for visual examination.

Figure 2 shows an example, images from a scanning electron microscope (SEM). This produces high
resolution images of rough surfaces with a large depth of field, and the contrast is visually interpretable
although not the same as that produced by visible light. The size of the pixel array and the tonal range
of SEM images are similar to those from digital cameras. This is also true for instruments such as the
transmission electron microscope (TEM), various light microscopes, and the atomic force
microscope (AFM).
Some instruments such as scanning profilometers and interference microscopes produce images of
surface relief with a very high depth resolution over a very wide range of dimension, and the resulting
values have a much greater dynamic range than camera images. Images from telescopes have a wide
dynamic range, from dark dust clouds and interstellar space to stars and novas of great brightness.
Medical X-rays also record a large dynamic range. Such images can be subjected to the same types of
processing and analysis as shown here, but may require storage with more than one or two bytes for
each value.

At the other extreme, some surveillance video cameras have very limited tonal resolution, which may
be reduced further by the compression and storage of the images. Figure 3 shows an example. The
image histogram inset in the figure is an important tool for judging image quality. It shows that the
number of brightness levels recorded is small, and also that the brightness range of the image
exceeded the dynamic range of the camera so that many pixels have been clipped to black or white.
This is common with surveillance cameras that include both interior and external areas (e.g.,
windows) in the scene; light fixtures in the camera’s view can make the problem worse. The example
image also shows the limited spatial resolution of surveillance video images. There is not enough
information to identify the robbers.

Resolution

Both spatial resolution (the number of pixels, and more importantly their spacing or size in
comparison to the dimension of important image details) and the tonal resolution (the number of
distinguishable levels of brightness) are important for image quality. Figure 4 shows the effect of
reducing these on the ability to recognize the image contents. For very familiar scenes or objects, such
as the face of a family member, a few visual cues may be sufficient for our memories to fill in the
unrecorded details. But for unfamiliar scenes, objects or faces, or for purposes of analysis and
measurement, having adequate spatial and tonal resolution is necessary.

Details that are just a few (or even one) pixel in size are not dependable, as they may be just the
result of noise in the image (which is discussed below and addressed in Chapter 2). Measurements
of dimensions or distances also benefit from improved resolution. Because the detectors are finite in
size, the way a particular image happens to fall on the array can change a distance of 10 pixels to 9 or
11. A precision of 10±1 is poor compared to 100±1, so with more pixels the measurements are more
dependable. Most modern still cameras (many of which can also capture video) have arrays of many
megapixels, and even with the loss of resolution due to demosaicing and interpolation can give
adequate spatial resolution when properly used.

Tonal resolution presents different challenges. Many cameras and computer programs deal with “8-
bit” images, which are capable of representing 256 brightness levels. These are conveniently stored
in computer memory, which is organized into 8-bit bytes. For an RGB color image, there are three
bytes per pixel, recording the separate red, green and blue channel values. Even if the camera uses
other color filters in the detection process, or an entirely different color representation for transmission
or compression, the image is ultimately stored as RGB for access by the programs.
While 256 values properly spaced over the brightness range of a scene are adequate for visual
display on the computer monitor (most computer displays can show only 256 values in each color
channel), or for printing (most printer types can reproduce even fewer), this may not be enough to
distinguish fine details that may be accessible with image processing. Cameras that store raw files, for
example, can typically record as much as 12 bits, or about 4000 values. This makes it possible to expand the
contrast or perform other image processing operations to reveal otherwise hidden information.

Figure 5 shows an example. A nighttime scene that includes streetlights was recorded with a high-end
digital camera that can cover a greater dynamic range while preserving small variations; this image
can be processed as described in Chapter 2 to reveal information in the shadow areas. This may be
essential for viewing or printing the image. Some cameras can combine several sequential different
exposures to form a high dynamic range result. Images with more than 8 bits of data in each channel
are usually stored with two bytes and are called “16-bit” images, although this typically exceeds the
actual tonal range of the values. As noted above, some image sources may require even more storage
for the data that are generated.

Noise

Another reason to desire a large tonal range is the presence of noise in images. The most common
type of noise is a more-or-less random variation superimposed on the value that represents the scene.
Particularly for images captured in dim light, this variation is partially a statistical effect due to the
small number of photons detected, although variations also originate in the electronics, and may have
contributions from lighting or other external effects.

The effect of noise in altering the brightness level of pixels and hiding details in the image is
illustrated in Figure 6. The images show a pollen grain in a scanning electron microscope. By
changing the scan rate, an image can be acquired in a fraction of a second or in several seconds,
similar to the effects of changing the aperture and shutter speed of a camera. A long exposure or slow
scan rate collects more signal, and the statistical noise variations become a smaller part of the overall
value, so that small differences which correspond to the details of the pollen grain can be seen.
Conversely, a fast scan rate or shorter exposure results in proportionately more noise in the image,
which obscures many of the details and compromises the ability to perform measurements. In many,
probably most, situations, collecting more signal with a longer exposure time reduces random image
noise. Of course, it is not always practical or possible. Chapter 2 deals with ways to reduce random
or “speckle” noise in acquired images.
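Where a longer exposure is not possible but several short exposures are, averaging aligned frames
accomplishes much the same thing. A minimal sketch in Python/NumPy, assuming the frames have
already been aligned:

```python
import numpy as np

def average_frames(frames):
    """Average a list of aligned gray-scale frames.

    Random noise falls roughly as 1/sqrt(N) for N frames, because the
    signal adds coherently while the noise adds in quadrature.
    """
    stack = np.stack([f.astype(np.float64) for f in frames])
    return stack.mean(axis=0)
```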

Another type of noise, which often results from electrical interference, vibration, or flickering
illumination (e.g., from fluorescent lamps) is not random, but periodic. Figure 7 shows an example.
The surveillance image has superimposed dark and light bands that occur when the poorly shielded
microwave oven in the convenience store is running, a common problem. Removing or reducing
periodic noise is covered in Chapter 4.

Compression

Noise in the most general sense is any part of the stored image that does not represent the original
scene. The presence of noise makes discerning the features and detail more difficult, and can impede
measurements, for example by obscuring edges and boundaries. The “quality” of a stored image
certainly depends on noise resulting from the statistics of the signal, plus further degradation caused
by the electronics in the camera and any interference from external factors. But by far the most
important limitation on the quality and hence usefulness of digital images results from compression.
Some cameras can store uncompressed or “RAW” images, and most other devices such as scanners
and microscopes do not apply compression. But many cameras, including those in cell phones, use
compression in order to reduce file size for storage. Compression is also commonly used for images
sent over the internet.

For analysis purposes, compression is always a problem. There are truly lossless compression
methods (LZW, for example) that do not alter image contents, but can produce only modest size
reductions by finding repetitive patterns. These occur commonly in text, but rarely in images. Lossy
compression, which can reduce files more significantly, discards or alters pixel values, especially color
information. The most widely used compression method is JPEG, and even the so-called “lossless”
mode does actually discard data and compromise the image contents.

Figure 8 shows an example of a JPEG compressed image. This is a modest amount of compression,
about 7:1; many images on the internet are compressed by a factor of 20 or more. Because of the
limitations of printing technology, the variations in color are not readily detected by visual
examination of the figure. But the alteration of the shape of the bird’s beak, the breakup of edges so that
they are displaced and not continuous, and the elimination of detail, are evident in the inset
enlargements.
To illustrate the ways that JPEG compression alters colors, Figure 9 shows enlarged examples of
simple text on a uniform gray background. Within each of the 8x8 pixel blocks that are used in the
JPEG compression algorithm, the colors are modified so that existing colors are replaced and new
colors appear. In addition, the locations of boundaries are altered.

The criterion used to justify compression is that for casual viewing the information conveyed by the
image does not confuse the viewer. Human vision needs very few cues to recognize familiar objects
and scenes, but for technical, scientific or forensic purposes the loss of information due to
compression can be critical. In general, there is no way to predict or detect exactly what has been
altered, although the fact that JPEG compression has been used can be determined. Once the data have
been lost, they cannot be recovered. Opening a JPEG compressed image and saving it in a lossless
format such as uncompressed TIFF does not restore the lost information.
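The loss is easy to demonstrate: a round trip through JPEG at an ordinary quality setting alters pixel
values even though the re-opened file may look unchanged. A minimal sketch using Pillow and NumPy
(the file names are hypothetical):

```python
import numpy as np
from PIL import Image

# Load a losslessly stored original for comparison.
original = np.asarray(Image.open("original.tif").convert("RGB"), dtype=np.int16)

# Round-trip through JPEG at a typical web quality setting.
Image.fromarray(original.astype(np.uint8)).save("roundtrip.jpg", quality=75)
recovered = np.asarray(Image.open("roundtrip.jpg"), dtype=np.int16)

diff = np.abs(original - recovered)
print("pixels altered:", 100.0 * np.mean(diff > 0), "%")
print("maximum channel error:", diff.max())
```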

Chapter 2 - Performing Image Adjustments


Color Adjustments

In many imaging situations, the exact colors of objects are not very important. A digital camera is not
a spectrophotometer and cannot measure color in the sense of a spectrum of intensity vs. wavelength,
because the broad ranges of the color filters present yield the same response from the camera’s
detectors for many different combinations of wavelengths and intensities. Often, the importance of
color is primarily the ability to distinguish one type of object or structure from the background or
from other objects.

Nevertheless, it is sometimes important to make adjustments to color images so that the visual
response to the image is the same as it would be to the original scene. This is the case, for example,
for images in catalogs – the purchaser expects the colors of clothing to match the appearance of the
images. Matching colors to other photographs of the same or similar objects is also important for
identification. To achieve this, it is necessary to adjust the image, and to do that it is helpful to have
some standard(s) in the scene. In a studio setting, where the lighting will remain constant, the usual
procedure is to record an image with color standards first, and then use that to adjust subsequent
images. In a forensic situation, the standards can be placed next to the evidence (along with a ruler)
and all recorded in the same photo.

With suitable standards and software, such as the GretagMacbeth Eye-One, it is possible to calibrate
the image so that the computer display and hardcopy printout provide visual matching to the original
scene. Secondary standards, for example made by calibrating a set of paint chips from the hardware
store against the primary standards, may also be useful. A minimal set of three chips, ones that are
approximately red, green and blue, cover color space fairly well and allow making a tristimulus
correction to the recorded RGB values.

The simplest, but often adequate approach to color adjustment depends only on finding points in the
image that should be neutral in color, that is, black, white and gray with no net color. These are points
that should have equal red, green and blue values. If they exhibit any color cast, simple adjustments
can be made to balance the intensities at those locations and thus adjust the proportions of RGB in the
rest of the image.

Figure 10 shows an example. In the original image, the selection of black, white and neutral gray
regions is performed manually (marked in Figure 10a with colored circles). Measuring the stored
RGB values in each area and adjusting the functions that relate the displayed values to the stored
values produces the result shown, after which the stored values can be modified to keep the
adjustments. The transfer functions that convert the original values to the corrected ones (Figure 10c)
are different for the red, green and blue channels, as needed to make the selected regions neutral in
color. The processed image (Figure 10b) also shows enhanced contrast.
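A minimal version of neutral-point balancing can be expressed as a per-channel gain correction; the
full procedure above uses transfer functions anchored at black, gray and white, but the single-patch
sketch below (Python/NumPy, patch coordinates hypothetical) conveys the idea:

```python
import numpy as np

def neutral_patch_balance(img, patch):
    """Scale R, G and B so that a known-neutral patch becomes gray.

    img   : float RGB array with values 0..1
    patch : pair of slices selecting the neutral region, e.g.
            (slice(100, 120), slice(200, 220))
    """
    means = img[patch].reshape(-1, 3).mean(axis=0)  # per-channel means in the patch
    target = means.mean()                           # the gray level to match
    gains = target / means                          # per-channel corrections
    return np.clip(img * gains, 0.0, 1.0)
```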
The adjustment shown is applied to the RGB values, but for many purposes of color adjustment and
measurement, and for image processing, other color representations are more useful. Human vision
does not characterize color in terms of RGB values, and it is not an efficient space for processing.
Converting images to other color space representations is routinely performed, for example as a part
of compression, or for video transmission. All such representations require three numbers, and in
most cases the calculations to transform the data from RGB and back again do not result in any loss of
precision (it is other steps in compression that cause the losses).

For image processing, and understanding what people perceive as color, two of the more useful color
space representations are Hue-Saturation-Intensity and L-a-b. Figure 11 shows graphical
representations of these spaces, for a color image. The RGB space representation is simple, with red,
green and blue orthogonal axes on which each pixel’s values can be plotted as shown. Neutral gray
values lie along the cube diagonal from black to white.

The HSI or Hue-Saturation-Intensity space is shown as a cylinder, in which the vertical axis is the
intensity extending from black at the bottom to white at the top. Gray pixels that are neutral in color
are plotted along the central axes. Saturation is the amount of color, or the radial distance from the
central axis. Fully saturated red, for example, is plotted at the outer boundary of the cylinder while
pink is plotted at an intermediate radius but in the same direction. The direction is hue, and the angles
progress from red to orange, yellow, green, cyan, magenta, and back to red. This space is often shown
as a biconical plot rather than a cylinder, because at the black and white ends, the maximum possible
saturation is reduced to zero. HSI space corresponds rather well to the way people such as artists
describe colors, but is a very inconvenient space mathematically. This is because of the changing
maximum saturation with intensity, and the fact that the hue angle changes abruptly from a value of
359 degrees for a color slightly to the magenta side of red to 1 degree for a color slightly to the
orange side of red.

L-a-b space is more convenient for many processing purposes. As for the HSI representation, gray
pixels are plotted on the central axis from the south pole of the sphere (black) to the north pole
(white). But the other two perpendicular axes are “a” (red to green) and “b” (yellow to blue). This
produces a different distribution of colors than in the HSI case, but the orthogonal axes simplify the
mathematics. Most of the processing of color images is performed in either HSI or L-a-b space, and
for many functions the intensity values (the I or L values) are processed but the color information is
not, to avoid altering the colors in the image.
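In practice these conversions are one-liners in most libraries. A sketch with scikit-image that
processes only the lightness channel, as recommended above, leaving the color information untouched:

```python
from scipy.ndimage import gaussian_filter
from skimage import color

def smooth_lightness_only(rgb):
    """Blur only the L* channel of L*a*b*, leaving a* and b* (color) unchanged."""
    lab = color.rgb2lab(rgb)                         # RGB -> L*a*b*
    lab[..., 0] = gaussian_filter(lab[..., 0], 2.0)  # process lightness only
    return color.lab2rgb(lab)                        # back to RGB for display

# HSV (a close relative of HSI) is equally direct:
#   hsv = color.rgb2hsv(rgb); ...process hsv[..., 2]...; rgb = color.hsv2rgb(hsv)
```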

Distortion

Many camera lenses, and especially the very wide angle lenses used on surveillance cameras and
smartphones, produce barrel distortion in images (Figure 12). Telephoto lenses often exhibit
pincushion distortion, and zoom lenses may vary from one to the other. Correcting these defects can
be done by processing if the lens has been calibrated (programs exist with the corrections built-in for
many lenses used with single-lens-reflex cameras) or if the images themselves contain fiducial
reference points. For example, in Figure 13 the straight edges of the doorway can be used to define
the lens distortion and perform the correction shown.

Another type of distortion arises when objects are viewed at an angle, rather than perpendicular to
their surface. This foreshortening results in trapezoidal distortion (Figure 14) that hinders comparison
of features, and complicates measurements of dimension and shape. If the viewing geometry is known,
for example based on the known position of the camera, trigonometric calculations can be applied to
correct for the distortion. Alternatively, as shown in the figure, it may be possible to use knowledge
of features within the image to perform the correction.

Notice in the example of Figure 14 that the correction to the building makes all of the windows
uniform in dimension, but that features not in the
plane of the building front are distorted. The steps appear to be at an angle
to the face rather than perpendicular to it, and the lamppost appears distorted
and displaced from its correct position in front of the center of the building and in front of the car, to a
position near the left-hand end of the building and behind the car.
In many situations, both lens distortion and foreshortening are present and require correction. The
lens distortion should be removed first, and then the foreshortening addressed, as indicated in the
diagram in Figure 15. Figure 15a also exhibits some vignetting, or darkening of the corners of the
image, which requires a background correction as discussed below. The example in Figure 16 shows
an extreme case in which the barrel distortion from a wide angle lens is first removed, and then the
vertical foreshortening due to the camera angle is adjusted based on the rectangular shape of the front
of the building. When images do not contain such internal reference points, and the exact camera angle
and position are not known, it is necessary to provide a suitable reference frame or set of points in the
scene before capturing the photograph.
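When four points that should form a rectangle can be identified (the corners of the building front, for
example), a projective transform performs the trapezoidal correction. A sketch with scikit-image; the
file name and corner coordinates are hypothetical, given as (x, y) pairs:

```python
import numpy as np
from skimage import io, transform

image = io.imread("facade.jpg")   # hypothetical input photograph

# Corners of the trapezoid as observed in the image, and the rectangle
# they should map to in the corrected result.
observed = np.array([[120, 80], [1020, 150], [1000, 860], [140, 900]], float)
desired  = np.array([[100, 100], [1000, 100], [1000, 880], [100, 880]], float)

# estimate_transform fits the homography mapping desired -> observed;
# warp() then pulls each output pixel from its computed source location.
tform = transform.estimate_transform('projective', desired, observed)
corrected = transform.warp(image, tform)
```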

Interpolation

Correcting distortions, or shifting and rotating images for alignment, or enlarging images, all require
some type of interpolation to determine the values for the new pixels. This is because the new pixel
centers do not in general lie directly on the original ones, as illustrated in Figure 17. One approach to
selecting values for each new pixel is to choose the nearest original pixel and assign the values from
that location. This is “nearest neighbor” interpolation.
Other approaches use several of the original pixel values in the region surrounding each new location
and fit various kinds of functions to estimate an interpolated value. The simplest of these is bilinear
interpolation, which uses just the four surrounding original pixels and performs a linear interpolation
between their values. Bicubic interpolation fits a polynomial to the 16 pixels in a 4x4 pixel
neighborhood. Methods using spline fits and other functions may also be used.

Generally, the larger the region used for the fitting and the more complex the mathematical function,
the smoother the resulting image appears to visual examination, but the more blurring and alteration
of the original pixel values is introduced. Figure 18 illustrates this with one-pixel-wide lines that are
rotated by small angles. With nearest neighbor rotation the lines appear broken up, but the black or
colored pixel values are not altered.
With bicubic interpolation, the lines appear smoother because of the introduction of intermediate
shades of gray and intermediate color alongside the lines. But the pixel values have been altered by
the interpolation process and can no longer be relied upon to faithfully represent the original scene, or
even have the same color values from place to place. The choice of a method depends on the intended
use of the resulting image. Some applications, such as densitometry, require preserving those values.
Others, such as visual presentation, may be better with interpolation.

Figure 19 shows an example of an image that was reduced in size by a factor of four and then
re-enlarged by the same amount. Using nearest-neighbor interpolation produces a blocky appearance
along edges and distorts dimensions. Bilinear interpolation produces a smoother appearance, and its
minimal four-pixel (2x2) neighborhood blurs edges by the least amount possible.
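The trade-off is easy to reproduce. A sketch with scikit-image comparing interpolation orders on a
reduce-and-enlarge cycle like the one in Figure 19, using a built-in test image:

```python
from skimage import data, transform

img = data.camera()                                   # built-in gray-scale test image
small = transform.resize(img, (128, 128), anti_aliasing=True)

# Re-enlarge with three interpolation orders: 0 = nearest neighbor,
# 1 = bilinear, 3 = bicubic. Higher orders look smoother to the eye
# but alter the stored pixel values more.
nearest  = transform.resize(small, img.shape, order=0, anti_aliasing=False)
bilinear = transform.resize(small, img.shape, order=1, anti_aliasing=False)
bicubic  = transform.resize(small, img.shape, order=3, anti_aliasing=False)
```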

Random Noise

The random or speckle noise shown in Figure 6 arises from many sources, some of which are
additive (i.e., the same magnitude in both bright and dark regions), some multiplicative (i.e.,
proportional to the brightness), and some with still other characteristics depending on their origin. In
general, these are called “random,” although they may not be perfectly random in a mathematical
sense, to distinguish them from periodic noise such as that shown in Figure 7, which arises from
different causes, and is described in a following section and dealt with using different procedures.

The best approach to reducing random noise is to collect more signal, for example with a longer
exposure, but this is often impractical. A frequently used, but incorrect method for attempting to
reduce the effects of random noise, is the application of a Gaussian blur (Figure 20). This function
combines the values of each pixel with those of its nearby neighbors, using weighting of the pixels
according to distance. The same array of weights, representing a Gaussian distribution with a chosen
standard deviation, is applied to each pixel in the image, combining it with its neighbors, to produce a
new image.

The processing may be applied either to just the brightness values, leaving color information
unchanged, or in some cases the procedure is applied separately to the red, green and blue channels.
The result does reduce the magnitude of the speckle variations, but it does so at the expense of
blurring details and edges so that they are more difficult to visually detect and to locate. Edges that are
curved may also be shifted in the process. If there are extreme pixel values, such as the reflective
highlights in the example image, they affect many of the nearby pixels.

Modifications to the basic Gaussian blur have been introduced to reduce the blurring of edges. The
bilateral filter (Figure 21a) adjusts the weights applied to the neighboring pixels according to both
their distance from each central pixel and according to the difference in values so that the greater the
difference, the smaller the weight (and the less the effect of that neighbor pixel).

The anisotropic diffusion filter (Figure 21b) adjusts the weights according to the distance and also
according to the direction of the local brightness gradient. It is assumed that neighbor pixels in the
direction of that gradient are more likely to be different or even to lie across a boundary, and so
should be weighted less than those at right angles to the gradient, which are expected to be more
likely to lie within the same structure. Both of these filters should be applied only to the brightness
values leaving the color information unchanged, because if they are applied to the separate RGB
channels the calculated weights would be different for each channel, resulting in different
proportional changes to the new values and the alteration of the colors.
All of these filters that use weights to arithmetically combine the original pixel value with those of
neighbors cause some blurring and shifting of edges. The advanced techniques shown in Figure 21
increase the amount of computation required. A generally superior method that does not blur or shift
edges (although it can round corners) is the median filter (Figure 22a). This ranks the pixel value with
those in a local neighborhood and creates a new image in which the pixel is given the median value in
the region.

Ideally the local neighborhood should be approximately circular, but in many cases it is square to
simplify the programming. There are advanced medians that are conditional (exclude pixels from the
neighborhood based on difference from the central pixel) or hybrid (perform ranking in several sub-
neighborhoods) to better preserve corners and fine lines. Ranking to determine a median value is
straightforward for monochrome (gray scale) images. For color images, either the brightness values
are used to select the median neighbor and the full color information is then taken from that location
for the pixel in the new image, or a vector median is used. This may be done in RGB or Lab space,
but not in HSI because of the discontinuity in hue values at 0°.

A different approach that requires significantly more computation but delivers substantially better
results in terms of noise reduction, detail preservation, and definition of lines and edges, is the non-
local means filter (Figure 22b). Rather than dealing with individual pixel values, this compares the
pattern of values around each pixel with similar patterns in a large surrounding region, and weights
the central pixel from each neighborhood according to the pattern similarity.
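Both the median filter and non-local means are available in standard libraries. A sketch comparing
them on an artificially noisy gray-scale test image, using SciPy and scikit-image:

```python
import numpy as np
from scipy.ndimage import median_filter
from skimage import data, img_as_float, restoration

img = img_as_float(data.camera())
rng = np.random.default_rng(0)
noisy = np.clip(img + rng.normal(0, 0.08, img.shape), 0, 1)

# Median filter: rank-based, does not blur or shift edges (may round corners).
med = median_filter(noisy, size=3)

# Non-local means: weights pixels by the similarity of the patch patterns
# around them; slower, but much better preservation of fine detail.
sigma = restoration.estimate_sigma(noisy)
nlm = restoration.denoise_nl_means(noisy, h=1.15 * sigma,
                                   patch_size=5, patch_distance=6)
```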

Nonuniform Illumination

Recognition and analysis of objects in images is generally based on the assumption that similar
objects will have the same appearance regardless of where in the scene they happen to lie. But in
many imaging situations, images are not uniformly illuminated. Flatbed scanners and well-aligned
microscopes meet this condition, but it can be difficult to achieve with copystands and very difficult for
routine photography such as crime scene evidence. Even if the lighting (e.g., from an off-camera flash
with a diffuser or umbrella reflector) is fairly uniform, vignetting in the optics will still produce
shading in the captured image.

The preferred method for correcting nonuniform illumination of objects on a copystand is illustrated
in Figure 23. By removing the objects of interest and capturing a second image of just the uniform
gray of the copystand base, a background image is recorded that shows the variation in lighting
intensity. That background image can then be used to remove the variation, leaving a uniform image of
the objects.

The removal process may either be based on the subtraction of the background image, pixel by pixel,
from the original, or the ratio of the values may be used. The choice depends on the way the camera
records the image. Solid-state detectors are inherently linear, producing an output that is a direct
measure of the light intensity. For these images, it is the ratio of the original image to the background
that produces a leveled result.

Film records light differently, with the film density being an approximately logarithmic response to
intensity. Many digital cameras apply nonlinear response curves to the recorded intensity to mimic the
characteristics of film. When the image represents the logarithm of the intensity, then subtraction of the
background is the proper method. In many instances if the camera response is not known, trying both
the subtraction and ratio methods will show which is the proper method to use. In all math operations
with two images, rescaling the output values to the acceptable black-to-white range is required.
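Both operations are simple pixel-by-pixel arithmetic. A sketch in Python/NumPy, assuming image and
background are float arrays of the same size:

```python
import numpy as np

def rescale(a):
    """Stretch the result back to the displayable 0..1 range."""
    return (a - a.min()) / max(a.max() - a.min(), 1e-6)

def level_by_ratio(image, background):
    """For linear (solid-state detector) intensity data: divide by the background."""
    return rescale(image / np.maximum(background, 1e-6))

def level_by_subtraction(image, background):
    """For logarithmic (film-like) data: subtract the background."""
    return rescale(image - background)
```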

Recording background is often impractical. For example, the background on which the objects rest
may not be uniform, or the objects may not be removable. Sometimes, capturing an image of the
illumination variation may simply be overlooked at the time. Finally, in many cases the illumination
cannot be controlled and the surface, structure and objects may not be flat. Figure 24 shows an
example of a situation in which this is the case. The position of the sun and the curved shape and
varying composition of the lunar surface produce shading across the moon’s face that alter the
contrast of craters, rays and maria.

The background image that represents the variation in brightness due to the sun’s position and the
surface curvature and variations in surface composition is produced by a morphological “opening.” In
this procedure, each pixel and its local neighbors are compared and a new image produced in which
the darkest pixel value in the region is saved (this is an erosion). Then the operation is repeated,
keeping the brightest pixel value in the region (a dilation). The final result eliminates the bright
surface details. The difference between the original image and the result of the morphological
processing is just those details, as shown in the figure. The elimination of shading allows expanding
the contrast of the fine details, as discussed in the following section.

If the sequence of erosion and dilation is reversed, the result is a “closing.” This is used in Figure
25 to construct a background image for the creased and folded paper. Removal of the background
results in an image with a uniform background that shows the printing and writing clearly. Of course,
the geometric distortions due to the folding are still present.
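With SciPy's grayscale morphology this background construction is a single call. A sketch, assuming
image is a gray-scale float array in 0..1; the neighborhood size is hypothetical and must be chosen
larger than the details to be removed:

```python
import numpy as np
from scipy.ndimage import grey_opening, grey_closing

# Background for bright details on a varying darker background (the moon
# example): erosion then dilation, i.e. a grayscale opening.
background = grey_opening(image, size=(15, 15))
detail = np.clip(image - background + 0.5, 0, 1)   # the difference keeps the details

# For dark details (printing on creased paper), use the reverse sequence:
#   background = grey_closing(image, size=(15, 15))
```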

Contrast Expansion

Figure 26a shows an example of an image that was recorded with lighting and exposure that resulted
in brightness values that cover only part of the full 0 to 255 range. This is not uncommon, and is a
better result than having too great a brightness range that results in clipping and data loss, as shown in
Figure 27. It is straightforward to perform a linear stretch, reassigning pixel values so that they do
extend from black to white as shown in Figure 26b. Note, however, that the resulting histogram has
just as many missing values as the original; they have just been distributed throughout the brightness
range rather than lying at the bright and dark ends. This stretching also increases the differences
between pixels that arise from random speckle noise, increasing the visibility of noise.

Reaching for the contrast expansion and adjustment tools is often one of the first things that is done, but
it is much better to wait until after all processing is finished, both the methods shown in
this chapter and the next one on image enhancement. That strategy reduces the likelihood that pixel
values will be pushed past the black or white limits by processing and clipped to 0 or 255. It is
important when performing adjustments to the contrast and brightness of a color image that the I or L
channel in an HSI or Lab representation be processed, and the color information be left unchanged.
Expanding the values in the individual red, green and blue channels results in changing their
proportions and produces color shifts and the introduction of new, false colors in the image. Figure
28 shows an example in which expanding the RGB channels introduces blues and magentas that are
not present in the original image, while expanding the L channel increases the image contrast without
altering colors.
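A sketch of a linear stretch applied only to the lightness channel, using scikit-image; stretching
between percentiles rather than the absolute extremes makes the result robust to a few outlier pixels:

```python
import numpy as np
from skimage import color

def stretch_lightness(rgb, low_pct=1, high_pct=99):
    """Linearly stretch the L* channel between two percentiles,
    leaving a* and b* (the color information) unchanged."""
    lab = color.rgb2lab(rgb)
    L = lab[..., 0]                                  # L* ranges 0..100
    lo, hi = np.percentile(L, [low_pct, high_pct])
    lab[..., 0] = np.clip((L - lo) / max(hi - lo, 1e-6), 0, 1) * 100.0
    return color.lab2rgb(lab)
```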

Stretching the range of brightness values to increase visual contrast is not restricted to linear
stretching. Figures 5 and 10, for example, show instances in which nonlinear adjustments are used. A
widely used method applies smooth curves called “gamma” adjustments, and may be controlled by
assigning the new value for the middle brightness value in the original image.

Shifting this point to darker values (Figure 29) expands the brightness range for dark pixels while
compressing that for bright ones. The result shows details better in the originally darker areas of the
image, at the expense of compressing values in the lighter regions. Shifting the midpoint to brighter
values (Figure 30) does the opposite, showing detail better in the light
regions. For color images, these adjustments are applied to the L or I values, not to the individual red,
green and blue channels.

Automatic methods for adjusting image contrast may also be used. These are particularly useful when
images are to be compared, for example ones taken over a period of time, or of similar objects with
possible variations in lighting. These algorithms derive a function relating the processed pixel
brightness values to the original ones based on the individual image contents. The most widely used
of these methods is histogram equalization. This attempts to spread the brightness values across the 0
to 255 range so that the same number of pixels (the same area of the total image) have each possible
value of brightness.
Of course, this cannot be achieved exactly since the results depend on the original distribution of
brightness values, but the nearest possible approximation is made. The histogram of the result, as
shown in the example of Figure 31b, does not look flat and uniform because of the gaps and pileups
that are present. It is easier to see the overall uniformity by examining the cumulative histogram,
shown as a red line superimposed on the histogram in each image. This rises linearly from 0 to 100%
across the brightness range.

Another histogram shaping result that reduces high overall contrast in the original, and often produces
results that are better visually, has the goal of making the histogram Gaussian in shape. In the example
of Figure 31c the histogram may not appear to have that shape because of gaps and pileups, but the
cumulative histogram has the shape of the error function, which is the integral of a Gaussian.

Even greater increase in local contrast is possible using contrast-limited adaptive histogram
equalization (CLAHE) in which the image is divided into tiles and equalization is performed in each
one, with interpolation to blend the results across tile boundaries, as shown in Figure 31d. This
method requires user input of parameters such as the tile size and the limits on the histogram, and
produces a result in which there is no uniform relationship between the original and final brightness
values. In general, using an automatic algorithm for histogram shaping is preferred to arbitrary manual
tweaking of the contrast function to produce a result, which may be influenced by expectation or
desire.
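Both global equalization and CLAHE are available in scikit-image's exposure module. A sketch, with
the CLAHE parameters shown purely as illustrative starting points:

```python
from skimage import data, exposure, img_as_float

img = img_as_float(data.moon())

# Global histogram equalization: the same number of pixels at each
# brightness level, as nearly as the original distribution allows.
eq = exposure.equalize_hist(img)

# CLAHE: equalization within tiles, blended across tile boundaries;
# kernel_size and clip_limit are the user parameters mentioned above.
clahe = exposure.equalize_adapthist(img, kernel_size=64, clip_limit=0.02)
```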

Chapter 3 - Applying Image Enhancements

The distinction between making adjustments to correct defects, as described in Chapter 2, and
enhancement of images is a fuzzy one. Adjusting contrast, for example, may be considered as an
enhancement tool. Several of the procedures in this chapter use similar methods to those shown
previously, but with the purpose of enhancing details in the image for visibility or measurement. The
corrections described in Chapter 2 for colors, geometric distortions, noise and nonuniform
illumination should be applied before these operations.

Pseudocolor

Human vision can distinguish more different colors than shades of brightness. Consequently, using a
false color or pseudocolor table of colors to substitute for the gray shades of brightness in a
monochrome image may be useful. Figure 32 illustrates this using two such tables, one called a “heat
scale” in which brightness values are represented by colors that correspond to the appearance of
objects as their temperature is raised, and the other a “rainbow” palette that consists of fully saturated
colors with varying hues.

An unlimited number of different color palettes, usually called color lookup tables or cLUTs, can be
devised. Each table consists of a list of 256 sets of red, green and blue values to be substituted for the
original 0 to 255 brightness values in the gray scale image. These tables can be created and stored,
and in some implementations are applied in the display circuitry without modifying the image, while
in others they replace the original image contents (in which case, it is wise to save a copy of the
original image first).
The advantages of a false color display include revealing small local variations in brightness and
facilitating comparison of brightness values across the scene. The disadvantages include making any
remaining noise variations more evident, and breaking up the continuity of images so that the overall
interpretation and recognition of objects and structures may become more difficult.

Of course, the use of color is also visually eye-catching in some cases. This has encouraged many
people to introduce color into images, especially gray scale SEM images, by “painting” different
structures in unique colors. This is a manual, subjective process. The argument that this helps viewers
unfamiliar with the subject to distinguish the various objects and structures present, as shown in
Figure 33, is true, but is probably not the main motivation.

Principal Components

Principal components analysis (PCA) is a standard statistical technique for dealing with multiple sets
of data to find the linear combinations of values that extract the maximum amount of discrimination. As
applied to images, PCA examines multichannel data, which may be the three color channels (RGB,
HSI, Lab or others), or may involve many more channels such as those from remote sensing or
satellite imagery, radar, or calculated channels such as the texture or edge information introduced in
following sections. The goal is to find combinations of values that reveal the greatest differences
between various objects and structures.

Figure 34 shows an example using just the three red, green and blue channels. The image shows two
similar inks, one obscuring the other. Plotting the pixel values on the original RGB axes shows that
they are highly correlated. PCA calculates new, rotated axes that best fit the data present in the image.
Assigning the position of each pixel’s values on these rotated axes to red, green and blue produces a
display that optimally discriminates the two inks. The method extends naturally to higher dimensions,
although plotting and displaying the results becomes more difficult.
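For three channels the computation reduces to an eigen-analysis of the 3x3 channel covariance
matrix. A NumPy sketch that works for any number of channels:

```python
import numpy as np

def pca_channels(img):
    """Rotate an (h, w, c) multichannel image onto its principal axes.

    The output channels are ordered by decreasing variance, so the first
    carries the greatest discrimination between structures in the scene.
    """
    h, w, c = img.shape
    data = img.reshape(-1, c).astype(np.float64)
    data -= data.mean(axis=0)                 # center each channel
    cov = np.cov(data, rowvar=False)          # c x c channel covariance
    vals, vecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]            # re-sort descending
    rotated = data @ vecs[:, order]           # project onto the new axes
    return rotated.reshape(h, w, c)
```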

Combining Images

Arithmetic functions that subtract or ratio one image with another were introduced in the background
leveling correction in Chapter 2. Other combinations such as addition, multiplication, and logical
comparisons that keep the brighter or darker value at each location are also used, and are shown in
some subsequent examples. These require the images to have the same dimensions, and are all
performed pixel by pixel without regard to the rest of the image contents. There are some other
combinations of interest, which may involve more than two images. These generally take into account
additional information about the images.

Particularly when they are used for close-up imaging, most camera optics have a very small depth of
field (the distance over which sharp focus is present). This is also true for high magnification light
microscopes. By combining a sequence of images of the same scene taken with different focus
positions, it is possible to construct an extended-focus image. The technique is sometimes called
“focus stacking.” It consists of going through the series or stack of images and keeping, at each pixel
address, the pixel value that is in the sharpest focus.

Determining the sharpest focus may be done in several different ways. The focus criteria (which are
also used in the automatic focusing algorithms in many cameras and microscopes) all require
comparing each pixel with its local neighbors. The simplest method selects the value that gives the
maximum difference between the pixel and its neighborhood. Other methods calculate the high-
frequency component of the image or the local neighborhood variance. The latter method was used in
the example in Figure 35.
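A sketch of variance-based focus stacking with NumPy/SciPy, assuming images is a list of aligned
gray-scale frames at the same magnification:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def focus_stack(images, window=9):
    """Keep, at each pixel, the value from the frame with the highest
    local variance, i.e. the sharpest local focus."""
    stack = np.stack([f.astype(np.float64) for f in images])
    mean = uniform_filter(stack, size=(1, window, window))
    sqmean = uniform_filter(stack**2, size=(1, window, window))
    variance = sqmean - mean**2                  # per-frame local variance
    best = np.argmax(variance, axis=0)           # sharpest frame per pixel
    return np.take_along_axis(stack, best[None], axis=0)[0]
```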

One requirement for this operation is having the original images aligned and at the same magnification,
which can be a problem if focusing is performed by twisting the lens barrel as that also changes the
image size. The use of alignment and interpolation as described in Chapter 2 may be required to
achieve proper image registration. The brightness and contrast of each image should also be the same,
which may require adjustment.

Another example of image combinations is shown in Figure 36. Two (or sometimes more) images
captured of the same scene but with different exposure settings each record some of the details but
either underexpose or overexpose others. These may be combined as shown in the example to show
all of the details. At each pixel address, the original pixel values are added together with weighting
based on the average brightness of the local neighborhood. Some cameras can automatically acquire
and combine a series of images with different exposures to form a single composite with extended
dynamic range.
Detail Sharpening

Fine-scale detail, lines and especially edges marked by changes in brightness or color are important
both for visual examination and recognition, and for measurements. Many different approaches have
been applied to increase the sharpness of these details. One of the oldest and simplest, called
Laplacian sharpening, compares each pixel to its local neighbors. This may be done efficiently by
multiplying the pixel and the neighbors by small integer constants as shown in Figure 37, adding
them, and forming a new image with the results. This process is called a convolution, and the array of
numbers is often called a kernel. Kernels used for this and for other purposes (such as derivatives, as
described below) may be stored on disk.
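A sketch of Laplacian sharpening as a convolution with SciPy, assuming gray is a 2-D gray-scale
array; the kernel here is a common choice and may differ in detail from the one in Figure 37:

```python
import numpy as np
from scipy.ndimage import convolve

# Classic sharpening kernel: a center weight of 5 minus the four edge
# neighbors; the weights sum to 1, so overall brightness is preserved.
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float64)

sharpened = np.clip(convolve(gray.astype(np.float64), kernel), 0, 255)
```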

The problem with this approach is that it greatly increases the visibility of any noise in an image,
since random or speckle noise is typically present as pixels that are different from their local
neighbors. A better method performs the comparison of the pixel, or sometimes the weighted average
of a small local neighborhood, to the weighted average of a larger surrounding neighborhood. This is
called an unsharp mask, and is performed by applying a Gaussian blur to a copy of the image and then
subtracting it from the original. The result may be added back to the original to retain some of the
overall contrast as shown in Figure 38.
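A sketch of the digital unsharp mask with SciPy; the radius and amount values are typical starting
points, not prescriptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(gray, radius=2.0, amount=1.0):
    """Subtract a Gaussian-blurred copy from the original and add the
    difference back, scaled by `amount`, to retain overall contrast."""
    img = gray.astype(np.float64)
    blurred = gaussian_filter(img, sigma=radius)
    return np.clip(img + amount * (img - blurred), 0, 255)
```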
The name of the procedure arose when it was performed in the photographic darkroom to deal with
high contrast images such as those from astronomy. The original negative was used to make a print, at
1:1 magnification but out of focus, onto another piece of film, which was then developed. This “unsharp
mask” was overlaid on the original and the final print made through the combination. Where one film
was dense the other was thin and vice versa, except at details such as lines and steps, which were
consequently enhanced.

Figure 39 compares an unsharp mask with Laplacian sharpening as applied to a step in brightness that
might, for example, mark the boundary of an object. The figure shows the consequences for a noise-free
image, and for one with random speckle noise. In both cases the unsharp mask produces better results,
and this is especially so for the case of noise. Both methods increase the magnitude and visibility of
the noise, again reminding that reduction of noise as described in Chapter 2 should be performed
before enhancements are attempted.

When using these sharpening techniques with a color image, it is important to apply the procedure to
the I or L channel leaving the color information unchanged. If the individual red, green and blue
channels are processed, the new values calculated by either Laplacian sharpening
or an unsharp mask are generally different in each channel. This changes the proportions of the three
primary colors, and consequently changes the color of the processed pixel. The result is the
introduction of new and incorrect colored pixels in the image, which greatly increases the visual
impact of any noise that is present. Figure 40 shows a comparison.

A different approach to detail enhancement modifies the value at each pixel location based on the
histogram of the local neighborhood. The same histogram equalization method described in Chapter
2 is applied to the local histogram, but the new assigned value is kept only for the central pixel. As
shown in Figure 41, the result makes a pixel that is brighter than its neighbors brighter still,
while one that is darker than its neighbors is made darker. Overall or long-range contrast is reduced
or eliminated.
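Implementations of this idea are widely available; for example, scikit-image's adaptive equalization (CLAHE) is a practical, contrast-limited variant of the local histogram equalization described here (the variable name and neighborhood size below are assumptions):

```python
from skimage import exposure

# kernel_size sets the local neighborhood over which each histogram
# is computed; the result is returned as floats in the range 0..1.
enhanced = exposure.equalize_adapthist(gray_image, kernel_size=31)
```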

The unsharp mask procedure can be extended to handle the presence of noise in an image by using
two copies of the original image. Instead of subtracting the blurred copy from the original, two
blurred copies are made using different radii for the blurs (usually with a ratio between 3:1 and 5:1).
The smaller radius blur reduces the random noise while the larger one removes both the noise and the
details of interest. The difference, called a difference-of-Gaussians or DoG filter, may then be added
back to the original to enhance details such as lines and steps. As shown in Figure 42, this reduces
the visibility of noise in the final result. Rather than using Gaussian blurring, which can shift edges and
which creates bright or dark haloes along steps that may hide other nearby details, a median filter may
be substituted for the Gaussian blur.
The difference-of-medians (DoM) filter, shown in Figure 42d, uses the difference between two copies
of the original image that have been median filtered with different size neighborhoods. The
improvement in noise reduction and suppression of bright halo artifacts can be seen. The DoM filter
requires significantly more computation than the DoG filter, since ranking is a relatively slow
operation while a very efficient separable Gaussian blur algorithm can be used for the DoG.
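A sketch of both filters, with the blur radii and the 4:1 ratio as adjustable assumptions; swapping the Gaussian blur for a median filter converts the DoG into the DoM:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def dog_enhance(image, sigma=1.0, ratio=4.0, amount=1.0):
    """Difference-of-Gaussians detail, added back to the original."""
    img = image.astype(float)
    detail = gaussian_filter(img, sigma) - gaussian_filter(img, sigma * ratio)
    return np.clip(img + amount * detail, 0, 255).astype(np.uint8)

def dom_enhance(image, size=3, ratio=4, amount=1.0):
    """Difference-of-medians: the same idea with ranked neighborhoods."""
    img = image.astype(float)
    detail = median_filter(img, size) - median_filter(img, int(size * ratio))
    return np.clip(img + amount * detail, 0, 255).astype(np.uint8)
```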

The sharpening methods shown above increase the change in brightness at a step. A different
approach reduces the distance over which the change in brightness takes place. The method is called
a Kuwahara or maximum-likelihood operator, which compares each pixel value to several local
neighborhoods to determine which one it most resembles, and consequently most probably belongs
to, and uses that neighborhood to determine a new value.
Figure 43 shows the basic idea. Each of the colors outlines a neighborhood that includes the central
pixel. These are typically 3 to 9 pixels in size, square or round. Whichever neighborhood including
the original pixel has the smallest variance is considered the most likely, and the mean
or median value of that neighborhood is assigned to the location of the central pixel to form the
derived image. Figure 44 shows an example of the application of the Kuwahara filter in which the
increased sharpness of the object boundaries can be seen.
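A minimal vectorized sketch, assuming four square (a+1)×(a+1) quadrants with 'a' even so the moving-average windows are odd-sized; production implementations usually offer round neighborhoods as well:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def kuwahara(image, a=2):
    """Assign each pixel the mean of its lowest-variance quadrant."""
    img = image.astype(float)
    w = a + 1           # quadrant window size; 'a' must be even here
    o = w // 2          # shift placing the central pixel at a window corner
    means, variances = [], []
    for dy in (-o, o):
        for dx in (-o, o):
            m = uniform_filter(img, w, origin=(dy, dx))
            q = uniform_filter(img * img, w, origin=(dy, dx))
            means.append(m)
            variances.append(q - m * m)
    best = np.argmin(np.stack(variances), axis=0)
    return np.take_along_axis(np.stack(means), best[None], axis=0)[0]
```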

Edge Delineation

Subtracting the values of pixels on one side of a location from those on the opposite side applies a
derivative to the image that increases the contrast and marks the location of edges or steps, but only
those oriented at large angles to the direction of the derivative. Figure 45b shows an example, in
which the derivative is oriented from the upper left. The brightness of pixels to the lower right of
each location is subtracted from those at the upper left to obtain a value that is assigned to the
location in the derived image.

As can be seen in the example, lines and steps that are oriented from the upper right to the lower left,
at right angles to the derivative direction, are very visible in the derived image. But lines and edges
oriented parallel to the derivative are hidden. Derivative operators such as this, often called
“embossing” operators because of the appearance they give to an image, are only rarely useful
because most images consist of lines and edges that may appear in arbitrary directions. An operator
that is insensitive to orientation is needed.

The most widely used method for enhancing lines and steps that is insensitive to their orientation is
the Sobel or gradient operator. This applies two kernels that calculate the differences in pixel values
in orthogonal directions. The kernels, shown in Figure 46, produce numerical values for the
differences, which are squared, added, and the square root taken. This is the same procedure used to
combine perpendicular vectors to determine the resultant, and the result is the magnitude of the
brightness gradient at each location, as shown in Figure 45c.
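A sketch of the gradient magnitude computation; SciPy's sobel function applies a derivative kernel of this kind along one axis at a time:

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_magnitude(image):
    """Combine orthogonal Sobel derivatives as perpendicular vectors."""
    img = image.astype(float)
    gx = sobel(img, axis=1)   # horizontal brightness differences
    gy = sobel(img, axis=0)   # vertical brightness differences
    return np.hypot(gx, gy)   # sqrt(gx**2 + gy**2)
```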

The Sobel operator is a very efficient procedure. The result is often inverted from the contrast shown
in Figure 45c so that the lines marking the steps are shown as dark lines on a light background. This
presentation of edges and boundaries is useful for measurement, and also because human vision finds
edges to be the important characterization of scenes, as evidenced by the effectiveness of cartoons and
sketches. Because the method is able to locate boundaries even when the overall difference in
brightness of regions in the image is small, it can often be used to delineate regions such as those
shown in Figure 47. The filter locates and draws lines along the edges of the etched regions,
regardless of the contrast or orientation. Filling the region inside the lines then produces an image of
the etched region suitable for analysis or measurement. Because steps often extend over several
pixels, the lines produced by the Sobel gradient operator are also broad. A refinement of the gradient
method thins the lines so that only the pixel with the maximum local value of the gradient is marked.
This is the Canny operator, which indicates the most probable location of the edge in the original
scene. Figure 48 compares the results of the Canny and Sobel operators applied to spherical particles
imaged in the SEM.

There are other methods for edge enhancement that do not use the local brightness gradient. For
example, the difference between the brightest and darkest pixels in the neighborhood does not take
into account the locations of those pixels. Another, sometimes better, statistical filter calculates the
variance of the pixels in the neighborhood. As shown in Figure 49, this value rises sharply for pixels
on boundaries, because it sums the squares of differences in values, and produces good edge
delineation.
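A sketch of the variance filter; the same function with a larger size serves as the texture detector described in the Texture section below:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_variance(image, size=3):
    """Neighborhood variance of each pixel: E[x^2] - (E[x])^2."""
    img = image.astype(float)
    mean = uniform_filter(img, size)
    return uniform_filter(img * img, size) - mean * mean
```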

Normally, edge delineation is performed on the brightness data (the I or L channel), ignoring the color,
but in some cases a particular color channel, or the hue channel, may provide the greatest distinction
between the objects or structures of interest and the surrounding background. In that case the same
filters and calculations can be applied to the specific channel containing the information. Figure 50
shows a different type of result. A Sobel gradient operator was applied to the red, green and blue
channels of a color image individually, producing a combined result that has many of the
characteristics of a sketch. This is rarely a technically or scientifically useful result, but it has visual
appeal.

Texture

In many instances, the distinction between objects or structures and the surrounding background is not
a brightness level or color, or an edge line, but rather a difference in texture. This is a loosely defined
word that encompasses a variety of specific possibilities. Generally, it is associated with variations in
brightness or color that take place over relatively local distances. A “smooth” region of an image may
have significant variation from one side to another, but locally the differences are small. A highly
textured region may have the same average values overall, but locally there are rapid changes. It is
this distinction that may be useful for selection.

As there are many types of texture, some random and some periodic, there are a variety of procedures
for detecting it and converting the magnitude of the texture to a gray scale value that may be useful for
selecting a region in the original image. One of the most widely useful is the magnitude of the
variance in a local neighborhood. A small neighborhood is used when variance is used to detect steps
or lines, but a larger radius neighborhood is needed to respond to texture. The size of the region used
must be large enough to encompass the bright and dark pixels.

As for edge detection, texture analysis is usually performed on the image brightness, as shown in
Figure 51b. Calculating the texture using the variance is shown in Figure 51c, in which the texture of
the yarn is clearly differentiated from the background, which has the same brightness and color as
some of the fibers but is smooth rather than textured. A more computationally demanding method that
is often useful is the calculation of a local fractal dimension. For each pixel in the image, this is
determined from the rate at which the variance increases as the radius of the neighborhood is
increased. Figure 51d illustrates the result.

Textures may also be characterized and distinguished by orientation. The calculation of a gradient
detector, for example using the directional derivative kernels shown in Figure 46, can also be used to
determine the vector orientation using the inverse tangent function. Assigning the angle to each pixel
identifies the local texture orientation as shown in Figure 52b. This result is calculated using the
brightness of the pixels in the original image (Figure 52a), and the gray scale values of the pixels
represent vector angles from 0 to 359 degrees.

Since the brightness gradient on the fibers may point to either side of the fiber, the vector directions
for pixels within each fiber may vary by 180 degrees, and consequently the pixels may vary by 128
gray scale values. The histogram of brightness values in Figure 52b illustrates this: the two peaks in
the right half of the histogram repeat those in the left half. Reassigning brightness values to cover the
range from 0 to 179 degrees as shown in Figure 52c, and applying a median filter to reduce random
noise, represents more clearly the areas with different fiber directions in the weave.
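A sketch of the orientation calculation, including the fold from 0-359 into 0-179 degrees described above; the median smoothing is a simplification that ignores wrap-around at the 0/180 boundary:

```python
import numpy as np
from scipy.ndimage import sobel, median_filter

def texture_orientation(image):
    """Gradient direction at each pixel, folded to 0-179 degrees."""
    img = image.astype(float)
    angle = np.degrees(np.arctan2(sobel(img, axis=0), sobel(img, axis=1)))
    folded = angle % 180.0               # opposite directions coincide
    return median_filter(folded, 5)      # suppress random angle noise
```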

Chapter 4 - Fourier Space Processing

The processing operations in Chapters 2 and 3 all involve the brightness or color values of pixels in
the familiar spatial domain. There are many other ways in which images can be represented, which
may be less familiar but which sometimes provide great advantages for image processing. One of the
most widely used is Fourier space.

In this space, an image is understood to be composed of a series of sets of sinusoidal lines that vary
in contrast, are oriented at all possible angles, and have different spacings or frequencies. The
amplitude and phase or position of each set of lines needed to combine them to reconstruct the image
is recorded in the Fourier space representation. Conversion back and forth between the spatial domain
and the Fourier domain is lossless and quite efficient, as fast algorithms have been devised for the
purpose.

Frequency Filters

Figure 53 shows an example. In the Fourier transform representation shown in Figure 53b, the
darkness of each point represents the amplitude (in the example, the logarithm of the amplitude) of the
corresponding set of sinusoidal lines. The low frequencies are shown near the origin at the center,
and progressively higher frequencies at greater radii. The phase values are generally difficult to
interpret and are not shown in the figure. The prominent vertical and horizontal lines arise from the
differences between the left-right and top-bottom edges of the image.

Filters in Fourier space can be described in terms of the frequencies that they preserve or “pass.” A
“high pass” filter attenuates low frequencies, which correspond to gradually varying brightness values,
while passing the higher frequencies. Figure 53c shows an example. The slider positions correspond
to the amount of each range of frequencies that is passed. This “ideal inverse” filter eliminates the
lowest frequencies while retaining the highest, and produces a result much like the unsharp mask filter
applied in the spatial domain.

Similarly, Figure 53d shows a “low pass” filter that retains the lowest frequencies while eliminating
the higher ones. The result is the same as that produced by a Gaussian blur applied to pixel values in
the spatial domain. It is also possible to select a specific range of frequencies, called a “bandpass”
filter, to keep just specific spacings. This can be useful, for example, for images of woven cloth or
fingerprints.
Some spatial domain operators, such as the median, maximum-likelihood, and variance filters, do not
have equivalents in the Fourier domain. But there are some operations that are much easier to perform
on the Fourier transformed image than on the spatial domain pixel array.
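A sketch of an “ideal” low pass filter using NumPy's FFT; the sharp cutoff is the simplest choice, though smooth roll-offs reduce ringing, and reversing the mask comparison gives the high pass version:

```python
import numpy as np

def ideal_lowpass(image, cutoff):
    """Zero all frequencies beyond 'cutoff' (a radius in the shifted FFT)."""
    f = np.fft.fftshift(np.fft.fft2(image.astype(float)))
    h, w = image.shape
    y, x = np.ogrid[:h, :w]
    radius = np.hypot(y - h / 2, x - w / 2)   # distance from the origin
    f[radius > cutoff] = 0
    return np.fft.ifft2(np.fft.ifftshift(f)).real
```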

Removing Periodic Noise

Rather than reducing or removing some sets of sinusoids based solely on their frequency, it is also
possible to take into account their orientation. This often makes it possible to remove periodic noise,
which may arise from electronic interference, flickering lights, or vibration. Offset printing of images
in magazines and newspapers also superimposes a regular grid (the halftone mask) that can be
removed in this way. For images printed in color, there are typically four different masks, for the
cyan, magenta, yellow and black inks used. Each color channel can be processed separately.

Figure 54 illustrates the process with a monochrome gray scale image. The original image from a
surveillance camera contains periodic noise that was generated by a microwave oven operating in the
store. In the Fourier transform representation (Figure 54b), two specific frequencies show up as
“spikes” at the frequency and orientation of the lines in the original image. Notice that each spike
appears twice, as the plot is symmetrical about the center origin. This is a fundamental characteristic
of the Fourier space representation, because each set of lines can be considered to have a direction in
either of two 180° complementary angles.
Reducing the amplitude of the spikes to zero as shown in Figure 54c removes the lines from the
image. This allows increasing the contrast of the remaining image as shown in Figure 54d. The spikes
may be located manually, or, since the Fourier space representation can be treated as an image, the
various image processing tools from Chapters 2 and 3 can be applied. For example, applying an
unsharp mask can increase the contrast of spikes so that they can be isolated by thresholding, as
described in the next chapter. Figure 55 shows the same procedure applied to the greater number of
spikes produced by the regularly spaced halftone dots in a printed image.
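A sketch of spike removal, assuming the spike positions have already been located (manually or by thresholding) as (row, column) coordinates in the shifted transform; each spike's symmetric twin about the center is zeroed as well:

```python
import numpy as np

def remove_spikes(image, spikes, radius=3):
    """Zero small disks at each spike and at its twin about the center."""
    f = np.fft.fftshift(np.fft.fft2(image.astype(float)))
    h, w = f.shape
    y, x = np.ogrid[:h, :w]
    for sy, sx in spikes:
        # The twin is the reflection of the spike through the center pixel.
        for cy, cx in ((sy, sx), (2 * (h // 2) - sy, 2 * (w // 2) - sx)):
            f[np.hypot(y - cy, x - cx) <= radius] = 0
    return np.fft.ifft2(np.fft.ifftshift(f)).real
```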

Cross-correlation

Fourier transforms are also useful for locating specific patterns or objects in an image. Template
matching or cross-correlation can be performed with the spatial array of pixels, but is usually much
more efficient when carried out using the Fourier transforms of the scene and of the target. The two
transforms are combined by a multiplication and the result retransformed to the spatial domain to
produce an image in which the location(s) of the target are marked.

Figure 56 illustrates this using an example of text. The target, shown enlarged in the inset, is the
lower case letter “a” in Times font. The scene, shown in Figure 56a, contains text with a large
amount of superimposed noise. The cross-correlation result, shown in Figure 56b, measures the
similarity to the target at each possible location in the original image. Selecting the darkest spots (for
example by thresholding, or with a “top hat” processing filter that calculates the difference between a
central region and the surrounding annulus) locates the positions of the target as shown in Figure 56c.
Cross-correlation sees through noise and camouflage quite well, but it is highly specific as to the
image of the target. Only locations that closely match the specific font, size and orientation will be
identified in an example such as Figure 56. For military surveillance, as an example, it is necessary to
have images of each type of vehicle with orientations in many directions, and scaled to correspond to
the altitude of the observing airplane. However, since cross-correlation is a very fast process,
performing the operation with each of a series of targets is still quite rapid.
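A sketch of FFT-based cross-correlation; the target is zero-padded to the scene size, and both are mean-subtracted so that flat bright regions do not dominate (fully normalized correlation would also divide by the local variance):

```python
import numpy as np

def cross_correlate(scene, target):
    """Correlation peaks mark the most likely target locations."""
    s = scene.astype(float) - scene.mean()
    t = np.zeros_like(s)
    t[:target.shape[0], :target.shape[1]] = target - target.mean()
    cc = np.fft.ifft2(np.fft.fft2(s) * np.conj(np.fft.fft2(t))).real
    return cc   # a peak at (r, c) places the target's corner near (r, c)
```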

Deconvolution

Recorded images may contain blurring introduced by out-of-focus optics, motion of the subject or the
camera, atmospheric effects, and so on. This blurring can be described as a convolution of the image
with a “point spread function” or PSF. This is simply the image of a single point of light that would be
recorded under the same circumstances. In many cases, the image may contain something that shows a
PSF, and in others it may be possible to determine it from other sources, analysis of the optics, or
even by iterative trial-and-error.

In any case, if the PSF can be determined, even approximately, it may be possible to remove all or
some of the blurring by deconvolution, which is most efficiently performed using Fourier transforms.
Deconvolution is implemented by dividing the Fourier transform of the blurred image by that of the
PSF. The amount of noise in the images, especially in the image of the PSF, sets a limit on the amount
of deblurring that can be achieved without unacceptably increasing the noise in the result. Generally,
that means that the images, especially that of the PSF, should have a high dynamic range and very low
noise.

Figure 57 shows an example of deconvolution using a measured PSF. A single isolated star, which
should ideally be seen as a point, is used as the PSF to deconvolve the full image. Adding a small
scalar constant to the denominator (Wiener deconvolution) limits the noise in the result, which
exhibits significant reduction of the blur. When no measured PSF is available, an estimated model,
usually based on a Gaussian blur or circular disk, or their combination, may be used. This
approximates the effect of optical blurring with a finite aperture, and can produce significant
improvements in many practical circumstances.
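A minimal Wiener deconvolution sketch; k is the small scalar constant added to the denominator, and its best value depends on the noise level:

```python
import numpy as np

def wiener_deconvolve(image, psf, k=0.01):
    """Divide the image transform by the PSF transform, noise-limited."""
    img = image.astype(float)
    p = np.zeros_like(img)
    p[:psf.shape[0], :psf.shape[1]] = psf / psf.sum()
    # Roll the PSF so its center lies at the array origin; this keeps
    # the deconvolved result from being translated.
    p = np.roll(p, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)), axis=(0, 1))
    H = np.fft.fft2(p)
    G = np.fft.fft2(img) * np.conj(H) / (np.abs(H) ** 2 + k)
    return np.fft.ifft2(G).real
```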

Motion blur corresponds to a PSF that is a line. Sometimes the orientation and length of the motion
can be determined from examination of the image itself, or based on independent knowledge (the
speed and direction of the camera or object, for example). Figure 58 shows an example, the image of
a moving car. The direction of motion is evident and the extent can be measured by the length of the
images of bright reflections. Deconvolution allows the license plate to be read well enough for
identification, when combined with the color, make and model of the car. Note that in this case the
background around the car, which was not moving and hence not blurred, is significantly degraded in
the process of deconvolving the car image.

Deconvolution, like the other adjustment processes shown in Chapter 2 for correcting defects in
acquired images, is often useful but should not be treated as a substitute for efforts to optimize the
initial acquisition process. Only after acquiring the best possible or practical image in the first place
should computer processing be applied.

Chapter 5 - Binary Images


Thresholding

For measurement purposes, especially for automatic measurements of structures and objects, it is
usually necessary to distinguish the features – the structures or objects of interest – from the
background – everything else in the scene. Sometimes boundary lines, which may be drawn manually
on the image or produced by methods such as the Canny filter shown in Chapter 3, can be used. But
in the majority of cases the method of choice is thresholding. This consists of defining a range of
brightness or color values that correspond to the features and converting the image to a binary black
and white representation. In the convention used here, the features are shown as black on a white
background.

Most programs provide some manual, interactive thresholding capability, typically consisting of a
slider that selects some value in the histogram so that all darker pixels become black and all brighter
ones white, or two sliders may bracket a range. Manual adjustment until the result “looks good” is not
a preferred technique, for many reasons. Different people, or the same person on different days, select
different thresholds, producing different results. The role of expectation or desire cannot be ruled out,
and the process cannot be automated to handle batches of images.

Automatic methods based on algorithmic procedures are preferred. The most widely used methods
for setting a simple threshold were developed originally for applications such as distinguishing print
on paper, as a part of the optical character recognition used to convert printed text to editable
computer files, or for reading devices to assist the blind.

Figure 59 shows an example with the threshold selected by the algorithm employed in many software
packages. This is based on the Student’s-t statistical test and finds the value that makes the two sets of
gray pixel brightnesses above and below the threshold most distinguishable. Unfortunately, it depends
upon the assumption that those two sets of values have Gaussian-shaped distributions, which is not
generally the case, but the procedure is used because the calculation is very fast, depends only on the
image histogram, and produces results that are at least reproducible and often useful.
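A closely related and widely implemented histogram-based criterion is Otsu's method, which chooses the value maximizing the separation (between-class variance) of the two groups; a minimal sketch for an 8-bit image (the function name and bin count are assumptions):

```python
import numpy as np

def otsu_threshold(image):
    """Return the gray level maximizing between-class variance."""
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist / hist.sum()                  # bin probabilities
    levels = np.arange(256)
    w0 = np.cumsum(p)                      # probability of the dark class
    m = np.cumsum(p * levels)              # cumulative mean
    mg = m[-1]                             # global mean
    with np.errstate(divide='ignore', invalid='ignore'):
        between = (mg * w0 - m) ** 2 / (w0 * (1.0 - w0))
    return int(np.nanargmax(between))

# binary = image <= otsu_threshold(image)   # dark features become True
```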

For color images, it is sometimes adequate to select a single channel, which may be one of the red,
green or blue channels; but in many cases the hue, saturation, intensity, L, a, b, or other channels are
better for thresholding. Principal components channels, as shown in Chapter 3, are often useful for
this purpose. Figure 60 shows an example in which the fluorescing blue stain used to mark the
presence of DNA allows automatic thresholding of the cell nuclei in a color image.

The example also shows that thresholding is not perfect. The boundaries of the features show the
inclusion or exclusion of pixels that result in irregularities, and there are pixels in the background
areas that happen by chance to have brightness values in the selected channel that fall within the
thresholded range. The application of a morphological opening (Figure 60d), as described in a
following section, cleans up the image and prepares it for measurement.

Because thresholding or segmentation is critical for subsequent measurements, and because image
types and contents vary widely, many different algorithmic approaches have been invented. One of the
most general, advanced and powerful techniques is k-means segmentation. Unfortunately, it is also
one of the most computationally intensive, as it proceeds iteratively. It can be applied to images with
any number of channels of data, which may be different colors, or derived information such as the
texture or edges shown in Chapter 3.

The example in Figure 61 shows the application of k-means to a three-channel RGB color image.
Plotting the pixel values in a three-dimensional graph (Figure 61b) shows the original distribution of
values. The iterative procedure is begun by specifying “k,” the number of types of structures or
objects (including background) that are to be distinguished. For a simple binary representation, k is 2.
For the example in Figure 61, k was set to 8. Consequently, 8 points in the color space were
initialized. The initial locations do not matter to the final result, so they are usually distributed
more-or-less uniformly in the n-dimensional space.

All of the pixel values are initially classified according to the point to which they are closest. Next,
the mean values of those pixels are calculated, and these are used to define a new set of k points. The
process is then repeated until no further changes take place. For the example figure, the locations of
the final points are shown (Figure 61c), with the resulting segmentation of the image (Figure 61d).
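A minimal k-means sketch on RGB pixel values; real implementations add smarter initialization and convergence tests, and the distance computation below trades memory for brevity:

```python
import numpy as np

def kmeans_segment(image, k=8, iterations=20, seed=0):
    """Iteratively assign pixels to k class centers in color space."""
    pixels = image.reshape(-1, 3).astype(float)
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iterations):
        # Classify every pixel by its nearest center...
        dist = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
        labels = dist.argmin(axis=1)
        # ...then move each center to the mean of its assigned pixels.
        for i in range(k):
            if np.any(labels == i):
                centers[i] = pixels[labels == i].mean(axis=0)
    return labels.reshape(image.shape[:2]), centers
```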

Contour Lines

The thresholding methods shown above are based on the pixel values within regions in the image. A
different approach is based on contour lines between values. This can be understood as the way that
contour lines are used on topographic maps to depict the elevation of surfaces, as indicated in Figure
62. Measurements of elevation at multiple locations determine where the lines are located, even if no
measured point lies exactly at one of the elevation values for the line. Iso-elevation or contour lines
are always continuous.

Similarly, for images composed of pixels, the contour lines that define the boundaries of objects may
pass between pixels. With thresholded pixels it is not possible to select a specific value that will
define continuous boundary lines around a feature; contour lines provide such boundaries, which can
be used for measurement. Figure 63 shows a set of contour lines (color coded to indicate elevation)
drawn on a surface image of a coin. The image was obtained from a scanning profilometer, so that the
brightness values represent elevation, and this is a direct example of the contour lines in Figure 62.
For images in which brightness does not measure elevation, such as Figure 64, contour lines are also
useful as delineations of feature boundaries that are continuous and consequently suitable for
measurement. As explained in the chapter on measurements, the total length of the contour lines
around the pores in this image of a slice of bread can be used to determine the total surface area of the
pores.
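Sub-pixel contour lines of this kind can be generated with a marching-squares routine, for example in scikit-image (the variable name and contour level below are assumptions):

```python
import numpy as np
from skimage.measure import find_contours

# Each contour is a continuous, sub-pixel polyline at the chosen level.
contours = find_contours(gray_image, level=128.0)

# Total boundary length: sum of the segment lengths of every polyline.
total_length = sum(np.hypot(*np.diff(c, axis=0).T).sum() for c in contours)
```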

4- and 8-Connectedness

The use of a square grid array for pixels introduces the problem of defining which neighboring pixels
are connected to form part of the same feature. The usual convention is 8-connectedness, which means
that a pixel is understood to be connected to any of its eight neighbors (4 side-sharing and 4 corner-
sharing). The line in Figure 65a is a continuous 8-connected line. The other possibility is to consider
pixels touching only if they share sides (4-connectedness). Figure 65b shows a minimal 4-connected
continuous line.

If an 8-connected interpretation is used for the foreground pixels comprising features, it cannot also be
used for the background. If the background pixels in Figure 65a are also considered 8-connected, then
the line no longer separates one side from the other. For 8-connected foreground pixels, background
pixels must be considered as 4-connected. Conversely, for 8-connected background pixels, the
foreground pixels must be 4-connected.

This creates a problem if thresholding is used to define the background rather than the features, and
the image contrast is then inverted to swap black and white. Figure 65c shows a pixel arrangement
that forms a single object with the 8-connected convention. If a 4-connected convention is selected for
the foreground instead, it would represent six separate objects. If the image is inverted as in Figure
65d, the 8-connected black pixels form one continuous array, and the white pixels represent six
separate holes within the feature.

Boolean Combinations

In addition to the arithmetic combinations described in Chapters 2 and 3 for gray scale and color
images, there are Boolean combinations that may be applied to thresholded binary images. Figure 66
shows the basic operations AND, OR and ExOR that can combine two representations of structure to
select the parts that are common to both, the regions selected by either, and the unique areas that are
present in one but not the other.

Combining these operations, along with the operator NOT that simply reverses the black and white
binary image, requires careful attention to the order in which the steps are applied. Figure 67
provides a few examples. The use of parentheses to clarify the order in which operations are
performed is strongly recommended.

Binary images can also be combined with gray scale and color images. This may be done, for
example, to isolate features while retaining the brightness and color information for measurement.
The process can be performed in several ways, including having a separate “masking” function, but
the simplest approach is generally based on the binary image having values of 255 for white
(background) and 0 for black (features). Then the comparison function that keeps the brighter value at
each pixel location replaces the black feature pixels with those from the gray scale or color image,
while clearing the background.
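With boolean arrays the pixel-based combinations are one-liners, and the brighter-value comparison for masking is a maximum; a sketch following the text's black-features-on-white convention, assuming boolean masks a and b, an 8-bit gray image gray, and its 0/255 binary mask binary:

```python
import numpy as np

# Pixel-based logic on boolean feature masks (True = feature pixel):
common  = a & b        # AND: present in both
either  = a | b        # OR: present in either
unique  = a ^ b        # ExOR: present in exactly one
inverse = ~a           # NOT

# Masking a gray scale image with a 0/255 binary (features black, 0):
# keeping the brighter value preserves feature pixels, clears background.
masked = np.maximum(gray, binary)
```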

Figure 68 illustrates this using a fluorescence image in which brightness measures the concentration
of stain. Thresholding the image leaves some objects touching, and these are separated using the
watershed segmentation operation introduced in a following section. Then the separated features are
combined with the original image to restore the brightness information.

The procedures described above all function on a pixel-by-pixel basis. The values of the pixels at
each location in the two images are combined to produce the new value, independent of any other
pixels in the images. It is also possible to extend these operations to work on features – the groups of
contiguous black pixels, in either a 4- or 8-connected sense, that are present in the image. For
example, a feature-based AND function keeps the entire feature if any of its pixels are matched by
those in the other image. This is also sometimes described as marker-based selection logic.

The feature-based ExOR works the same way, keeping any features that have no matching pixels in the
other image. There is no need for a feature-based OR function, as it would produce a result identical
to the simpler pixel-based OR. Unlike the pixel-based logic, feature-based logic is not commutative:
(A AND B) gives the same result as (B AND A), but (A featureAND B) results are not the same as (B
featureAND A). The NOT operator works as in the pixel-based case, reversing the black-white
contrast of each pixel in an image.

Figure 69 shows one way that pixel-based and feature-based AND functions can be used. The sketch
in Figure 69a shows some orange objects touching or near a blue region. These might be bacteria in a
microscope or boats along the shore of a lake, since scale does not matter. In Figure 69b the edge of
the blue region is shown as a line. Applying a pixel-based AND to this line and the orange features
selects just the pixels along the edge that are touched. Measuring the total length of these line
segments and the total length of the line yields a measurement of the fraction of the boundary that is in
contact with the objects.

In Figure 69c, the outline is used as a marker in a feature-based AND to select the objects. The result
is just those objects that touch the line regardless of the extent to which they touch, so that they can be
counted and measured separately from those that do not.
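A sketch of the feature-based AND using connected-component labeling; the 3×3 structure element gives the 8-connected interpretation:

```python
import numpy as np
from scipy.ndimage import label

def feature_and(features, marker):
    """Keep whole 8-connected features that touch any marker pixel."""
    labeled, count = label(features, structure=np.ones((3, 3)))
    touched = np.unique(labeled[marker & (labeled > 0)])
    return np.isin(labeled, touched[touched > 0])
```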
Black and white binary images can be converted to a gray scale representation in which each pixel’s
value is the distance from the nearest location of a pixel with the opposite original color. This result,
called the Euclidean distance map (EDM), has several uses that are illustrated in the following
sections. An application that involves image combinations is illustrated in Figure 70.

A line, which may be the border of an object, a road, etc., is shown, along with a superimposed image
of several objects. The Euclidean distance map assigns to each pixel in the areas around the road a
value that marks the distance from the road. This is shown in Figure 70b using a false color lookup
table. Using the binary image of the objects as a mask on the image of the distance map (i.e., keeping
whichever pixel is brighter) erases the background and assigns the distance values to each object.
This provides a quick and effective way to measure the distance of each object from the line, and to
generate the graph shown in Figure 70c in which objects are counted according to their distance.
Unlike calculations using pixel coordinates and analytical geometry, this procedure works regardless
of the irregularity of the line.
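Both the EDM and the mask-and-measure step are short operations in SciPy; a sketch assuming boolean arrays road and objects:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Distance of every pixel from the nearest road pixel...
edm = distance_transform_edt(~road)

# ...masked so that only object pixels keep their distance values.
object_distances = np.where(objects, edm, 0.0)
```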
Morphology

The morphological procedures of erosion and dilation, and their combinations in openings and
closings, are illustrated in the preceding chapters as applied to gray scale or color images. Those are
ranking procedures that find, for example, the brightest or darkest pixel value in a neighborhood, and
assign that value to the central pixel location to generate the derived image. The same functions can
be applied to black and white binary images, but somewhat greater flexibility is possible by counting
the touching pixels (usually any of the 8 side- or corner-touching neighbors) and inverting the color of
the central pixel only if the number of opposite-colored pixels exceeds a threshold.
In the example diagram shown in Figure 71, the threshold is set to 2. The process of dilation adds
pixels around the borders of the object that touch more than 2 existing black pixels, and the process of
erosion removes black pixels that touch more than 2 background neighbors. The sequence of dilation
followed by erosion, a closing, produces a different result from the opening sequence of erosion
followed by dilation, but the initial figure is ambiguous. Both procedures produce a more compact
result with a smoother border.

Usually the choice of opening or closing is evident from the nature of the application. Figure 72
shows an image of bullet holes in glass, with surrounding cracks. Erosion removes the cracks but
makes the holes smaller. Following the erosion with a dilation restores the approximate size of the
holes but the cracks have been eliminated.
Performing erosion and dilation based on the square array of pixels produces results that are
anisotropic. Figure 73a illustrates the results for a threshold of zero (classical erosion and dilation).
Other threshold coefficients produce different shapes, and change the distances that are added or
removed, but none preserve the original shape. This is because adding or removing a corner-touching
neighbor represents a 41% greater distance than removing a side-touching neighbor. Calculating the
EDM makes it possible to perform erosion or dilation by simply thresholding the resulting distance
map, which is faster than iterative pixel-by-pixel methods. It also preserves shapes, as shown in
Figure 73b, and the distances are measured directly and in all directions. Figure 74 shows an example
of an EDM-based closing used to select foreground pixels for contrast adjustment.
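A sketch of the isotropic, EDM-based erosion and dilation by simple thresholding of the distance map:

```python
from scipy.ndimage import distance_transform_edt

def edm_dilate(binary, distance):
    """Add all pixels within 'distance' of the original foreground."""
    return distance_transform_edt(~binary) <= distance

def edm_erode(binary, distance):
    """Keep foreground pixels more than 'distance' from the background."""
    return distance_transform_edt(binary) > distance
```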
Watershed Segmentation

The EDM also provides a mechanism for separating touching or slightly overlapped features. Figure
75 illustrates the method. Plotting the EDM values for the pixels in the overlapped features represents
them as two mountain peaks, with a valley between, which divides the two mountains into separate
watersheds. Rain falling on each mountain will run downhill, and the line along which the droplets
from the two peaks meet and flow together is the watershed line. Removing the pixels along that line
separates the two mountains, and hence the two features. This is called watershed segmentation. (An
alternative description represents the distances as depressions separated by dikes, which are filled.)

In a typical image, the task is not always as simple as that shown in the example. Figure 76 shows an
image of touching coins of different sizes, recorded using a flatbed scanner. The touching lines occur
in many directions, and so the segmentation produces lines of 4-connected background pixels,
necessary to separate the 8-connected pixels within each feature. Combining the binary image of the
watershed-segmented features with the original scanned image of the coins produces a result showing
all of the coins separated for counting and measurement.
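A sketch using scikit-image, with one marker per EDM peak; min_distance is an assumed tuning parameter that suppresses spurious maxima on nearly flat peaks:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def separate_touching(binary):
    """EDM-based watershed segmentation of touching convex features."""
    edm = distance_transform_edt(binary)
    peaks = peak_local_max(edm, min_distance=7, labels=binary)
    markers = np.zeros(binary.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flooding the inverted EDM from the peak markers draws the
    # watershed lines along the valleys between the features.
    return watershed(-edm, markers, mask=binary)
```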

For less ideal shapes than the circular coins, and for objects that are partially overlapping, the
watershed segmentation results are less perfect but often still useful. The technique is generally
applicable to any approximately convex shapes; the specific requirement is that each shape have only
a single peak in the Euclidean distance map. Figure 77 shows an example of squash seeds. The
segmentation lines usually result in reasonable separation of the overlapped seeds, but when the
overlap is too great or the resulting shapes too complex, lines may be either missing or misplaced.
Skeletons

The erosion described above, using the number of touching opposite-colored neighbors to decide
whether to remove a pixel, is one form of conditional erosion. Another very useful conditional
erosion technique is skeletonization. This iteratively removes pixels subject to the condition that the
neighboring pixels still touch each other: a pixel cannot be removed if it would separate the
neighbors into different features. The result is a midline of pixels that captures much of the shape
information for each object or structure. Another (better) method for defining the skeleton uses the
ridges or midlines of the Euclidean distance map.

Figure 78 shows an example of a skeleton, superimposed on the binary image of the sprocket. Most
of the pixels that comprise the skeleton have 2 neighbors (the skeleton is an 8-connected structure). A
few pixels have just a single neighbor and are thus identifiable as end points. The end points in the
skeleton of the sprocket are easily counted and correspond to the 25 teeth. Points with more than two
neighbors are node or branch points in the skeleton, and depending on the configuration in which the
branches meet may have either 3 or 4 neighbors. These can also be identified and counted, as a
description of the connectivity of a network, for example.
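Counting neighbors with a convolution makes end and node points easy to find; a sketch using scikit-image's skeletonize:

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def skeleton_points(binary):
    """Return the skeleton plus its end points and node points."""
    skel = skeletonize(binary)
    kernel = np.ones((3, 3), dtype=int)
    kernel[1, 1] = 0                      # count the 8 neighbors only
    neighbors = convolve(skel.astype(int), kernel, mode='constant')
    ends = skel & (neighbors == 1)        # single neighbor: end point
    nodes = skel & (neighbors >= 3)       # three or more: branch point
    return skel, ends, nodes
```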

The skeleton is useful for describing and measuring the shape of objects. It preserves the topological
information in a minimal form. If the values of the EDM are assigned to the skeleton, the result is the
medial axis transform (MAT), from which the shape can be exactly reconstructed. As shown in Figure
79, the skeleton responds to the presence of points and concavities in the shape in much the same way
that the veins on a leaf define its shape, although the two sets of lines are not identical. Even for an
irregular shape such as the flower shown in Figure 80, the skeleton identifies the number of petals.

Skeletons can be “taken apart” to isolate parts for measurement. For example, removing all of the
node points and keeping only branches containing one of the original end points allows measuring all
of the terminal branches of a structure, such as a plant’s roots. Keeping just the node points allows
measuring their spatial distribution for clustering or uniformity.

Pruning a skeleton iteratively removes points with a single neighbor. The backbone of the original
structure is left, with the terminal or external branches removed. Figure 81 shows an application. The
skeleton of the interior of the maze can be pruned, leaving just the single continuous line between the
end points, without any blind or false branches that terminate within the maze. In addition to solving
the maze, identifying the separate features present in the skeleton shows which parts of the maze are
separate, not connected to or accessible from the true path (Figure 82).
Chapter 6 - Measurements

Manual Measurements

Measurements of the structures or objects in images are used for a variety of purposes. In many
instances this is done by marking two points and measuring the straight-line distance between them.
This assumes, of course, that the points are representative of the object being measured, and that an
appropriate ruler or calibration is available. For flatbed scanners, or for microscopes with fixed
optical lenses, this is straightforward. When the camera position creates foreshortening, or worse yet
is not known, the problem is more difficult (Figure 83).
The preferred solution, particularly for forensic purposes, is called “reverse projection.” As shown
in Figure 84, returning to the original scene and recording another image, this time of a standard ruler,
allows superimposing the images and directly reading the person’s height. Many convenience store
and bank doorways have ruler markings on the inside that are recorded in the surveillance video for
this same purpose. There are several sources of potential error in this technique, including the
unknown contribution of shoes and hats, and the difficulty of estimating the effects if the person is not
standing upright, but it is insensitive to lens distortions, involves no complicated math, and is readily
understandable, for example by juries.

Particularly with relatively low resolution surveillance video images, locating the correct place to
make the measurement can be difficult and error-prone. Figure 85 shows an example. The top of the
hood on the person may or may not be close to the top of their head, but in any case is poorly defined.
Enlarging the image, as shown, does not help and may actually make the determination more difficult.
In this case, nearest-neighbor enlargement as described in Chapter 2 was used to show the original
pixels. Interpolation would produce a smoother-appearing enlargement and make the measurement
even more uncertain.

Global Measurements

Automatic measurements are more reliable and consistent than manual ones in most cases. The use of
appropriate image processing and automatic thresholding produces simplified binary images that
facilitate the measurement of structures and objects. Figure 86, as an example, begins with a portion
of a town map showing streets and buildings.

Applying k-means thresholding as described in Chapter 5 produces a simplified representation of
four classes of pixels, representing the water, streets, buildings and greenspace. Counting the pixels
then produces a straightforward measurement of the area fraction of each.
Many two-dimensional images are actually of sections through three-dimensional structures. This is
particularly true for microscope studies of the structure of biological tissue, metals and ceramics,
polymers, fabrics (both woven and nonwoven), and food. Measurements on these images can be used
in stereological calculations to determine the volume fraction, surface area, length, and other global
properties of the structures. For example, the total length of the boundary lines in Figure 64 can be
used to estimate the surface area of the pores. Stereological relationships between various simple
measurements on two-dimensional surfaces and the three-dimensional structures they sample can
estimate volumes, surface areas, lengths, number, and even size distributions.
The method is not restricted to microscope images. Figure 87 shows an example of a section through
beef. Processing to sharpen the contrast, followed by thresholding to distinguish the red meat from fat,
prepares the image for measurement. Pixel counting of the two areas (excluding the surrounding
background) shows that the area fraction of the surface of the steak that is meat is 63.6%. Provided
that this is a representative surface, that means the volume fraction of meat in the steak is the same.

Feature Measurements
The measurements in the two preceding examples count the total number of pixels for each
thresholded class, but do not distinguish the individual features, for instance to count the number of
buildings in Figure 86 (17), or the number of intersections (18, determined as the node points in the
skeleton of the streets after an opening to eliminate narrow paths). In addition to counting, feature
measurements can be conveniently grouped into four classes, with several different specific measures
and algorithms for each. These are measures of size, position, color (or brightness), and shape.

The number of features present can be determined by the labeling operation that checks contiguity
between pixels. For scenes in which all of the objects of interest are contained within the field of
view, this is straightforward. If the image is a sample of a larger total population, then it is necessary
to decide how to deal with features that intersect the boundary of the image. Counting those that
intersect two edges and not counting those that intersect the other two, or counting all edge-intersecting
features as 1/2, provides unbiased results. But for feature measurements, an adjustment is required to
compensate for the fact that larger features are more likely than small ones to intersect an edge, so that
their full extent cannot be measured.

The simplest measure of size is the area, determined by counting the contiguous pixels that make up
each separate feature present in the image. The example in Figure 88 shows a thresholded map of the
Great Lakes. Applying a morphological opening separates them (and also separates Georgian Bay
from Lake Huron). Counting pixels and applying the scale of the map provides the areas shown in the
figure (in thousands of square kilometers).

This technique is applicable at any scale, provided the image calibration is known. That is usually
done by measuring an image of an appropriate standard or ruler, or including one in the scene,
although knowledge of the optical and imaging geometry may also be used. The example
in Figure 89 shows gold nanoparticles imaged in an electron microscope. Automatic thresholding and
watershed segmentation produce separated features for measurement. In this case, the areas
determined by pixel counting are reported as the equivalent circular diameter calculated from the
area, and a histogram shows the size distribution of the particles. Features that cross the boundaries
of the image are not measured.
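A sketch of the pixel-counting measurement, converting each feature's area to an equivalent circular diameter; pixel_size is the calibration (length per pixel):

```python
import numpy as np
from scipy.ndimage import label

def equivalent_diameters(binary, pixel_size=1.0):
    """Label 8-connected features; report equal-area circle diameters."""
    labeled, count = label(binary, structure=np.ones((3, 3)))
    areas = np.bincount(labeled.ravel())[1:] * pixel_size ** 2
    return 2.0 * np.sqrt(areas / np.pi)   # d = 2 * sqrt(A / pi)
```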

There are several ways that the position of a feature can be specified. As indicated in Figure 90, the
centroid, the center of the minimum circumscribed circle, or the center of the largest inscribed circle
may be used as appropriate. Only the last is guaranteed to lie within the borders of the object (it is
determined as the highest peak in the Euclidean distance map).

The locations of features may be used directly, as (x, y) coordinates or latitude and longitude, and
can also be used to characterize the spatial distribution of objects. Figure 70 shows the use of the
EDM values to determine distance from an arbitrary line. Figure 91 shows an application. The aerial
view of sheep is thresholded and the location of each sheep is determined as the center of an
inscribed circle. The EDM of the distance from the road is calculated and assigned to each location
point. Finally, a distribution of the brightness values of the location points is tabulated to measure the
spatial distribution, which quantifies the fact that most of the sheep are staying away from the road.
Measuring the nearest neighbor distance between features provides a tool to characterize spatial
distributions. For a random distribution (sprinkling salt on the table), the mean nearest neighbor
distance depends only on the number per unit area. If the measured value of the mean nearest neighbor
distance is less than this, as in Figure 92, it indicates that the features, cities in Europe, tend to cluster
together. The ratio of the measured nearest neighbor distance to that for a random distribution of the
same number of city lights in the same land area is 0.71. Conversely, if the value is greater than that
for a random distribution, as in Figure 93, it indicates that the features, people arranging themselves
on a beach, are self-avoiding. The ratio of measured nearest neighbor distance to that for a random
distribution of the same number of people on the beach (half the square root of area divided by
number) is 1.47.
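A sketch of the nearest-neighbor ratio, using the expectation for a random pattern quoted above (half the square root of area per feature):

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_neighbor_ratio(points, area):
    """Ratios < 1 suggest clustering; > 1 suggest self-avoidance."""
    distances, _ = cKDTree(points).query(points, k=2)
    mean_nn = distances[:, 1].mean()      # column 0 is the point itself
    expected = 0.5 * np.sqrt(area / len(points))
    return mean_nn / expected
```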
Determination of brightness values is often used to measure density, concentration, or some other
property for which calibration is needed. The integrated intensity of the stain in each of the cells in
Figure 68 can be calibrated, for example, to determine the amount of the reagent present. Density is
often modeled using Beer’s law, an exponential relationship between the measured darkness (the
reverse of intensity) and density. The plots of density vs. position for the electrophoresis scans in
Figure 94 provide ready comparison of the positions of bands for matching of DNA markers.

As explained in Chapter 1, digital color cameras are not spectrophotometers and do not measure
color. The red, green and blue values represent averages over broad wavelength ranges and are not
easily interpreted. Comparison to determine whether colors can be matched or distinguished, such as
paints in a forensic situation, is generally easier using hue and saturation. The plot of hue as a function
of position in Figure 95 provides the sharpest transition between paint layers that is most easily
measured to determine thickness.

The measurement of shape can be quite challenging. Simple ratios of size measurements in which the
measurement units cancel out are widely used. Figure 96 shows a few of the practically unlimited
number of combinations that may be used. The names used for these measurements are not applied
consistently, and different formulations may be encountered.

Sometimes these ratios are useful for purposes such as matching, identification, or classification.
Figure 97 shows the measurements of formfactor for various tree leaves. The different species are
not well separated and many are not distinguishable. By combining several such ratios, the situation
is improved as shown in Figure 98a. Applying linear discriminant analysis to combine the three
measurements, and plotting the values of the derived linear combinations (canonical variables) of the
measured variables, does result in separation of the classes (Figure 98b) and the ability to identify
each tree species by the leaf shape.

Another difficulty with the dimensionless ratios is that there are many shapes that are visually quite
distinct but that yield the same values for the shape ratios. There are other measurements of shape that
are more specific, such that a relatively small set of numbers can be used to completely reconstruct the
shape with all of its details. The best known of these is harmonic or Fourier analysis of the object’s
boundary. Wavelet analysis of the boundary, or a complete set of moments, can also be used. These
measurements are powerful but can be difficult to understand or judge visually. They are generally
performed with specific-purpose programs, both for performing the measurements and for subsequent
advanced statistical analysis of the sets of values to extract the most significant terms.

Human visual characterization of shapes seems to depend primarily on the topological form, as
characterized by the skeleton illustrated in Figure 79, and on the irregularity of the boundary. Few
natural objects have boundaries that are Euclidean and smooth (bubbles, controlled by surface
tension, are one example). Many classes of objects have roughness that extends over a broad range of
dimensions, so that the measured perimeter depends in part on the image scale and resolution. Natural
objects often exhibit a specific relationship between scale and irregularity that can be measured by
the fractal dimension.

This value can be determined in several ways. One of the most straightforward is plotting, on log-log
axes, the cumulative histogram of the Euclidean distance map. This represents the area along the
boundary as a function of the distance from it. For a smooth boundary, such as the coast of Florida
shown in Figure 99, the area increases fairly uniformly with distance (even with the irregularity of
Biscayne Bay), giving a value for the fractal dimension of 1.105. The much more rugged coast of
Norway with its deep fjords has a value of 1.473. (The two coastlines are shown in unconventional
orientations to emphasize their comparison.)
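A sketch of this measurement, under the Minkowski-type assumption that the area within distance r of the boundary grows as r raised to the power (2 - D); the radius range is an arbitrary choice:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_fractal_dimension(binary, max_radius=32):
    """Estimate D from the log-log slope of area-within-r vs. r."""
    # Distance of every pixel (inside or outside) to the boundary.
    d = np.minimum(distance_transform_edt(binary),
                   distance_transform_edt(~binary))
    radii = np.arange(1, max_radius + 1)
    areas = [np.count_nonzero(d <= r) for r in radii]
    slope = np.polyfit(np.log(radii), np.log(areas), 1)[0]
    return 2.0 - slope
```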

The analysis of measured data for purposes of identifying and describing specific objects, classifying
types of features, or finding correlations between measurements and an object’s history or properties,
is typically carried out with a separate program such as the R statistics package or SAS JMP software
(which was used for examples such as Figure 98). This is an important but separate subject not
covered in this text.

Appendix A - 32-bit Photoshop with Plugins

This appendix summarizes a basic set of commands that implement the functions illustrated in the text,
using Adobe Photoshop CS5 (and earlier) supplemented with several plugins. Newer versions of
Photoshop (CS6 and CC) will run only in 64-bit mode, and most of these plugins are limited to 32-bit
mode and so will only work with the older versions of the program. Photoshop contains a great many
other functions, most of which are intended for graphic arts creation and other purposes that are not
appropriate for technical, scientific or forensic image processing, and so are not included here. Except
as noted, the software and its operations are identical on Macintosh and Windows computers.

The plugins included in the following outline are:

•EPaperPress PTLens, for correction of lens distortion <http://epaperpress.com/ptlens/>.
•XiMagic DeNoiser and Quantizer, for advanced processing and thresholding algorithms <http://www.ximagic.com>.
•Alex V. Chirokov’s Fast Fourier Transform plugin <http://www.3d4x.ch/?c=16,35>, which is Windows only.
•The set of free plugins on the author’s website <http://www.drjohnruss.com/download.html>.

This is not an exhaustive list of useful additions.


•FoveaPro (Reindeer Graphics <http://www.reindeergraphics.com>) is no longer marketed, but for those who have it, the suite of plugins provides an extensive set of processing and measurement tools. The second edition of the Cookbook (still available) details the use of those routines.
•Ocean Systems ClearID <http://www.oceansystems.com/forensic/forensic-Photoshop-Plugins/index.php> is Windows only and is marketed primarily for forensic applications.
•Alien Skin <http://www.alienskin.com/photobundle/> and George DeWolfe’s Perceptool <http://www.georgedewolfe.com/perceptool3/> have extensive tools for manipulating contrast and color.
•Media Chance Dynamic Photo HDR <http://www.mediachance.com/hdri/index.html> provides tools for dynamic range compression and adjustment of high dynamic range images.
•Acclaim Software Focus Magic <http://www.focusmagic.com> and Topaz Infocus <http://www.topazlabs.com/infocus> offer several deconvolution methods.
•Shape <http://cse.naro.affrc.go.jp/iwatah> (Windows only) measures shape based on Fourier coefficients [ref: H. Iwata & Y. Ukai (2002) SHAPE: A computer program package for quantitative evaluation of biological shapes based on elliptic Fourier descriptors, Journal of Heredity 93:384-385].

There are no doubt others that may offer additional ways to perform the operations described and illustrated in the text. Many of the plugins found on various “Photoshop plug-in” websites are intended for graphic arts purposes and are not appropriate for the purposes described in this text.

Basic image handling

Images of various types (gray scale, RGB, 8 or 16 bit, etc., in a variety of storage file types) are read,
saved, printed and displayed using the usual commands under the File> menu. Each image is
displayed in its own window, and the frontmost window with the highlighted name is the one
operated on by the processing and measurement routines. Edit>Undo or the History palette can be
used to step backwards through recent processing operations.

Images may be converted from one format to another by selecting Image>Mode>RGB Color, >Gray
Scale, >Lab Color, >8 bit, >16 bit, etc. Conversion of an RGB image to IHS (Intensity, Hue and
Saturation) is provided by Filter>_Process>Convert to Int-Hue-Sat. Conversion of an RGB image
to principal components is provided by Filter>_Process> Principal Color Components. However,
in both of these cases the channel labels are not changed from red, green and blue.

Image>Duplicate or Layer>Duplicate Layer are very useful to create copies on which to try
various functions. Layers are especially useful for comparing or recording the results of different
steps.

The Tools palette has tools for enlarging the image, reading point coordinates, selecting regions,
measuring point-to-point distances, etc. The Histogram palette shows the image histogram and the
Info palette displays pixel values and distance. Other palette windows are useful to show the
processing history, select individual color channels or layers, etc. These can be selected under the
Window> menu.

Some of the functions listed below operate only within a selection, if one has been defined. The
selection tools include rectangular, polygonal and free-form shapes, as well as a “wand” tool that
allows selecting a point and growing a region of pixels that have values within a user-specified
tolerance of that value. Once a selection has been made, the Select>Similar function can be used to
include other regions with similar pixel values. Select>Color Range allows selecting pixels based
on various criteria, including the bright and dark ranges, or pixels similar to a marked location, with
adjustable tolerance.
Contrast adjustment

Manual adjustment of the image contrast can be performed with Image>Adjust>Levels, moving the
limit sliders or the midpoint slider, which adjusts the gamma value. The Auto button sets the limits to
the ends of the actual data, with adjustable amounts of clipping set by clicking the Options button.
“Enhance Monochrome Contrast” should be checked to prevent alteration of colors by the automatic
stretching process. More control is afforded by the Image>Mode>Curves selection, which allows
creating an arbitrary transfer function relating the original pixel values to final adjusted values. In
both routines, clicking the OK button applies the changes to the stored pixel values. The contrast of
the image can be reversed, as in a photographic negative, with Image>Adjust>Invert.

Algorithmic, rather than subjective, adjustment of image contrast is afforded by histogram shaping. Either equalized (Image>Adjust>Equalize) or center-weighted (Filter>_Process>Center-Weighted Brightness) histogram distributions can be selected. For high-dynamic-range images, homomorphic range compression (Filter>_Process>Homomorphic Range Compression) is useful to make detail in bright and shadow areas visible while preserving local contrast. Neutral color adjustment can be performed with the Image>Adjust>Curves function by using the black, white and gray eyedroppers to click on locations in the image that should have no net color. The resulting adjustment curves for each channel are displayed, and clicking OK performs the adjustment.

Image rectifying

Lens distortions such as pincushion (common with telephoto lenses) or barrel (common with wide
angle lenses) may be corrected manually using Filter>Lens Correction, or Filter>
EPaperPress>PTLens. The latter has built-in corrections for a wide range of camera lenses. These
routines can also correct for vignetting.

Perspective correction can be accomplished by using the crop tool with Perspective checked. Move the corner points to positions that mark the area to be rectified, optionally drag the edges to enlarge the area covered, and click OK.

Neighborhood Filters

A Gaussian blur with adjustable standard deviation can be applied to an image with Filter>Blur>Gaussian Blur. A modification of the basic Gaussian, the Bilateral Filter, can be applied with Filter>_Process>Conditional Gaussian Filter, or with Filter>XiMagic>XiDeNoiser (select Bilateral Filter), which offers more control options. This plugin also offers the selection of an Anisotropic Diffusion filter, a further modification of the Gaussian, and a Non-local Means (NLM) filter.

The median filter is not a convolution but instead is based on ranking pixel values in the neighborhood, and is superior to Gaussian filters for noise reduction. Filter>Noise>Median allows specifying the dimension of the square neighborhood used. Filter>_Process>Hybrid Median Filter uses an approximately round neighborhood of adjustable radius and better preserves lines and corners.

The unsharp mask (Filter>Sharpen>Unsharp Mask) calculates the difference between the original image and a Gaussian-blurred copy (with adjustable standard deviation) and adds that difference back to the original. The halos around edges produced by the unsharp mask can be avoided by instead using the difference between the original and a median-filtered copy (Filter>_Process>Rank Sharpening Filter). Rather than increasing contrast at edges, edges may also be enhanced by reducing the distance over which contrast changes, using the Filter>_Process>Kuwahara Filter. This latter routine also reduces random noise, and in many cases improves subsequent dimensional measurements.

Convolutions that generate a derivative in any direction are produced by Filter>Stylize>Emboss. General convolutions can be applied by entering the kernel values as a 5x5 matrix of integers in Filter>Other>Custom. The scale factor is usually set to the sum of the kernel values, or, for kernels that total zero, to the largest value in the kernel. The offset is used to shift values so that negative results can be visualized. The kernels can be saved on disk and loaded as needed. The most commonly used procedure for edge delineation is the Sobel (Filter>_Process>Sobel Gradient Magnitude). Combining the magnitude with the vector angle (Filter>_Process>Sobel Gradient Orientation) can be useful to measure anisotropy. Edge delineation can also be accomplished with Filter>_Process>Local Variance by using a small neighborhood size. With a larger neighborhood, the function extracts the texture in an image.

Gradual variations in brightness across an image can often be corrected with Filter>_Process>Level Background Shading, which fits polynomial functions to the local bright or dark background points for subtraction. In some cases, morphological filtering (following paragraph) can also be used to generate a background for subtraction.

Morphological filtering of gray scale or color images (the latter based on the brightness, the average of the color values) is provided by the Filter>Other>Minimum and Filter>Other>Maximum functions. These may be used singly or in combination to eliminate features of interest and generate backgrounds for removal. The difference between these two results (using the calculations described below) can also be used to delineate edges, or, when using larger neighborhoods (which are square), to extract texture.

Enhancement of local details and suppression of large scale brightness variation is provided by
Filter>_Process>Local Equalization. Isolating small bright features is provided by
Filter>_Process>Top Hat Brightness Difference. This is also used to locate spikes in Fourier
Transform power spectra (described below).

Image combinations

Images can be combined using Image>Calculations. The images, channels for a color image, or
layers for a multilayer image, and the operation to be performed (Add, Subtract, Multiply, Lighter,
and Darker), are selected by pull downs. The images are selected from the drop-down lists but must
be identical in size.

A different approach is provided by the plugins. One image must first be selected and placed in memory with Filter>_Combine>Store Reference Image. This reference image remains in memory until replaced, even if subsequently modified or closed. Combining it with a different image, with automatic scaling and offsets, can then be performed with Filter>_Combine>Add Reference Image, >Subtract Reference Image, >Ratio to Reference Image, >Keep Lighter Pixel Values, and >Keep Darker Pixel Values.

Combinations of thresholded binary images are discussed separately below.


Fourier transforms

One use of the Fourier transform is for the removal of noise. To use the forward and inverse FFT
routine (PC only) for this purpose, the image must have dimensions that are a power of 2 (256, 512,
1024...). If necessary, the image can be resized, padded or cropped (Image>Image Size or
Image>Canvas Size). Also, the originally gray scale image must be converted to RGB
(Image>Mode>RGB Color). Then select Filter>Fourier Transform>FFT to perform the
conversion. The complex FFT is displayed with the amplitude in the Green channel and the phase in
the Red channel. Select just the Green (amplitude) channel for display, and use the pencil or brush
tool to remove the spikes corresponding to the noise frequencies. Then select all channels and apply
Filter>Fourier Transform>IFFT to convert the image back to the pixel domain.
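The same spike-removal idea can be written compactly in Matlab (conventions as in Appendix C). In this minimal sketch, A is a grayscale image, and (u1,v1) and (u2,v2) are the row and column coordinates of a noise spike and its symmetric mate, read manually from the power spectrum display; the coordinates and block size are illustrative only.
F=fftshift(fft2(double(A)));   % centered transform
F(u1-2:u1+2,v1-2:v1+2)=0;      % zero a small block around the noise spike
F(u2-2:u2+2,v2-2:v2+2)=0;      % and around its mirror-image mate
B=real(ifft2(ifftshift(F)));   % back to the pixel domain
imshow(B,[]);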

The Fourier transform is used in several other procedures, such as cross correlation. For this
function, both the search image and the target image must have power-of-two dimensions and be the
same size. Use Filter>_Combine>Store Reference Image to designate the target image, select the
search image, and apply Filter>_Combine Images>Cross Correlate Ref. Image.

(If the Fovea Pro or ClearID plugins are installed, a more complete set of Fourier transform functions
are available, including deconvolution.)
Thresholding

A grayscale image, or the intensity data from a color image, can be thresholded with a user-controlled setting of a slider on the histogram with Image>Adjust>Threshold. Automatic thresholding based on the histogram, using the widely used Otsu-Trussell algorithm to calculate the value that separates the two groups of pixel values with the highest value of the Student's-t statistic, is performed with Filter>_Binary>Apply Automatic Threshold.

The k-means iterative automatic function can segment the image into a specified number of groups by selecting Filter>XiMagic>XiQuantizer and specifying the Number of Colors (select Calculated Palette and K-Means). The various groupings of pixels in multiple copies of the image can then be selected with the wand tool, for example. Gray-scale images must first be converted to RGB.

Binary images

Binary images are shown as black features on a white background. Pixels with values less than 128 are treated as black, and those with higher values as white. Reversing the image contrast to exchange black and white is performed with Image>Adjustment>Invert. The morphological erosion and dilation operations described above for gray scale features can be applied, or the closing and opening functions (Filter>_Binary>Morphological Closing and >Morphological Opening) can be used (these use a circular neighborhood of radius 2).
Filling of interior holes in features is performed with Filter>_Binary>Fill Holes in Features, and
erasing the interiors to leave just the 8-connected outlines is performed with Filter>
_Binary>Outline Black Features.

Skeletonization by iterative erosion is selected with Filter>_Binary>Skeletonize Black Features.


Terminal branches on the resulting skeleton can be removed with Filter> _Binary>Prune Skeleton
Branches, and the end points in the skeleton can be kept (and the remainder erased) with
Filter>_Binary>Skeleton End Points.
Removal of small features is facilitated by Filter>_Binary>Shade Small Features by Area. This
assigns a gray level from 1 to 255 corresponding to the area in pixels of any feature with total area
less than 256 pixels. This allows setting a threshold (Image>Adjustment>Threshold) to remove
unwanted small features with area (and hence brightness equal to 256 minus area) less than the
threshold setting.

Touching features can be separated with Filter>_Binary>Watershed Feature Separation. This is based on the Euclidean Distance Map, and the EDM itself can be generated with Filter>_Binary>Distance Map Black Features.

Boolean operations with binary images can be performed by adding them as described above, followed by thresholding. The OR function is carried out by adding and thresholding with a value greater than 128. The AND function is performed by adding and thresholding with a value less than 128. To create the ExOR result, add the images and use the wand to select the medium gray pixels. Fill (Edit>Fill) those with black, invert the selection (Select>Inverse), and fill with white. These multiple-step operations can be saved as Actions if they are to be used frequently.
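The add-and-threshold logic is easy to verify outside Photoshop; this minimal Matlab sketch (conventions as in Appendix C) assumes two uint8 binary images A and B, with black (0) features on a white (255) background, and illustrative threshold values of 60 and 200.
S=imlincomb(0.5,A,0.5,B);             % averaged sum: ~0 where both have features, ~128 where one does, 255 where neither
ORresult=uint8(255*(S>200));          % black wherever either image had a feature
ANDresult=uint8(255*(S>60));          % black only where both images had features
XORresult=uint8(255*~(S>60 & S<200)); % black only where exactly one image had a feature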

Measurement

Counting the number of separate black features is performed with Filter>_Measure>Count Black Features. The area fraction is reported with Filter>_Measure>Total Area Fraction, and the perimeter by Filter>_Measure>Total Perimeter. Note that for skeletons, the total perimeter is twice the length of the lines. To individually number the features, select Filter>_Measure>Feature Numbers. Each feature is then shown with gray scale values from 1 to 255 (if there are more than 255 features, the numbering repeats). Converting this display to indexed color (Image>Mode>Indexed Color) and assigning an appropriate color table (Image>Mode>Color Table) can make it easier to distinguish the various features.

Filter>_Measure>Feature Size and Shape writes a text file (for PCs this is on the C: drive, for Macs on the desktop) with the area and formfactor of each feature. This can be read with a spreadsheet or statistics program.

A line grid useful for various stereological measurements is generated (typically this is done in a
new, blank image or layer) with Filter>_Measure>Generate Grid Lines. (If they are installed, the
Fovea plugins offer a broad selection of grids, including cycloids, points and random lines).

Appendix B - 64-bit Photoshop with Plugins


This appendix summarizes a basic set of commands that implement the functions illustrated in the text, using Adobe Photoshop CS5 (and later) supplemented with several plugins. Newer versions of Photoshop (CS6 and CC) run only in 64 bit mode, and require plugins that are also 64 bit. Some of the plugins described here are available in both 32 and 64 bit versions, and are mentioned in Appendix A, but many are specific to the newer 64 bit environment. Photoshop contains a great many other functions, most of which are intended for graphic arts creation and other purposes that are not appropriate for technical, scientific or forensic image processing, and so are not covered here. Except as noted, the software and operations are identical on Macintosh and Windows computers.

The plugins included in the following outline are:

• EPaperPress PTLens, for correction of lens distortion <http://epaperpress.com/ptlens/>. Values for many camera lenses are built-in.
• XiMagic DeNoiser and Quantizer, for advanced algorithms for processing and thresholding
<http://www.ximagic.com>.
• Reindeer Graphics QIA <http://www.reindeergraphics.com>, an extensive set of image processing and measurement tools. The menu selections are organized around a typical workflow, with measurements of images and of the features within them.

This is not an exhaustive list of possible useful additions.


•Alien Skin <http://www.alienskin.com/photobundle/> and George DeWolfe’s Perceptool
<http://www.georgedewolfe.com/perceptool3/> have extensive tools for manipulating contrast and
color.
•Media Chance Dynamic Photo HDR <http://www.mediachance.com/hdri/index.html> provides tools
for dynamic range compression and adjustment of high dynamic range images.
•MacPhun offers several plugins (Macintosh only) for sharpening, noise reduction, contrast manipulation, and HDR range compression <http://macphun.com/creativekit>.
•Ocean Systems ClearID <http://www.oceansystems.com/forensic/forensic-Photoshop-Plugins/index.php> is Windows only and is marketed primarily for forensic applications, including video sequences.
•Acclaim Software Focus Magic <http://www.focusmagic.com> and Topaz Infocus <http://
www.topazlabs.com/infocus> offer several deconvolution methods.
•Alex V. Chirokov’s Fast Fourier Transform plugin (Windows only), has been recompiled for 64 bit
operation <http://www.rognemedia.no/diverse/FFT.zip>.
There are no doubt others that offer additional ways to perform the operations described and illustrated in the text.

Basic image handling

Images of various types (gray scale, RGB, 8 or 16 bit, etc., in a variety of storage file types) are read and saved and displayed using the usual commands under the File> menu. Each image is displayed in its own window, and the frontmost window with the highlighted name is the one operated on by the processing and measurement routines. Edit>Undo or the History palette can be used to step backwards through recent processing operations.
Images may be converted from one format to another by selecting Image>Mode>RGB Color, >Gray Scale, >Lab Color, >8 bit, >16 bit, etc. Conversion of an RGB image to HSI (hue-saturation-intensity) is provided by Filter>QIA-I.Adjustments>Color Space Transform. Conversion of an RGB image to principal components is provided by Filter>QIA-I.Adjustments>Principal Color Components. In both of these cases the channel labels are not changed from red, green and blue to identify the derived channel values.

Image>Duplicate or Layer>Duplicate Layer are very useful to create copies on which to try
various functions. Layers are especially handy for comparing the results of different steps by changing
the layer opacity or turning it on and off. A layer may also be useful for making marks that are later
counted or measured.

The Tools palette contains tools for enlarging the image, reading point coordinates or color values (shown in the Info palette), selecting regions, measuring point-to-point distances, etc. The Histogram palette shows the image histogram. The Filter>QIA-VII.Image Measurements>Histogram Brightness plugin has an adjustable vertical scale and can save the histogram data in a disk file. Other palette windows are useful to show the processing history, select individual color channels, etc. These can be selected under the Window> menu.

The various functions listed below operate only within a selection, if one has been defined (except as noted). Most measurement functions operate only in a rectangular selection region. The selection tools include rectangular, polygonal and free-form shapes as well as a "wand" tool that allows selecting a point and growing a region of touching pixels that have values within the user-specified tolerance of that value. Once a selection has been made, the Select>Similar function can be used to include other regions with similar pixel values. Select>Color Range allows selecting pixels based on various criteria, including the bright and dark ranges or pixels similar to a marked location, with adjustable tolerance (equal ranges of RGB values).

Contrast adjustment

Manual adjustment of the image contrast can be accomplished by selecting Image>Adjust>Levels and moving the limit sliders or the midpoint slider, which adjusts the gamma value. The Auto button moves the limits to the ends of the actual data, with adjustable amounts of clipping set by clicking the Options button. "Enhance Monochrome Contrast" should be checked to prevent alteration of colors by the automatic stretching process. More control is afforded by the Image>Adjust>Curves selection, which allows creating an arbitrary transfer function relating the original pixel values to final adjusted values. In both routines, clicking the OK button applies the changes to the stored pixel values.

The contrast of the image can be reversed, as in a photographic negative, with Image>Adjust>
Invert.

Algorithmic, rather than subjective manual, adjustment of image contrast is afforded by histogram shaping. Either equalized (Image>Adjust>Equalize) or center-weighted (Filter>QIA-I.Adjustments>Histogram Shaping) histogram distributions can be selected. For high-dynamic-range images, homomorphic range compression (Filter>QIA-I.Adjustments>Homomorphic Compression) is useful to make detail in bright and shadow areas visible while preserving local contrast.

Neutral color adjustment can be performed with the Image>Adjust>Curves function by using the
black, white and gray eyedroppers to click on corresponding locations in the image that should have
no net color. The resulting adjustment curves for each channel are displayed, and clicking OK
performs the adjustment.

Image rectifying and adjustment

Lens distortions such as pincushion (common with telephoto lenses) or barrel (common with wide
angle lenses) may be corrected manually using Filter>Lens Correction, or Filter>
EPaperPress>PTLens. The latter has built-in corrections for a wide range of camera lenses. These
routines can also correct for vignetting.

Perspective correction can be accomplished by using the crop tool with Perspective checked. Move the corner points to positions that mark the area to be rectified, optionally drag the edges by their centers to proportionately enlarge the area covered, and click OK.

Gradual variations in brightness across an image can often be corrected with Filter>QIA-I.Adjustments>Autolevel, which fits polynomial functions to the local bright or dark background points for subtraction. Filter>QIA-I.Adjustments>Background Removal can be used with user-selected regions to fit a polynomial background, which is then removed from the full image. In some cases, morphological filtering (described below) can also be used to generate a background for subtraction, or a high-pass filter (described below) may be useful to eliminate gradual brightness variations with position.
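The idea of polynomial background fitting is simple to sketch in Matlab (conventions as in Appendix C). This minimal example fits a second-order polynomial surface to a grayscale image A by least squares and subtracts it; in practice the fit would be restricted to selected background points or regions, as the plugins do.
[X,Y]=meshgrid(1:size(A,2),1:size(A,1));
x=X(:); y=Y(:); z=double(A(:));
M=[ones(size(x)) x y x.^2 x.*y y.^2];   % second-order polynomial terms
coeff=M\z;                              % least-squares fit over all pixels
background=reshape(M*coeff,size(A));
B=double(A)-background;                 % leveled result
imshow(B,[]);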

Neighborhood Filters

A Gaussian blur with adjustable standard deviation can be applied to an image with Filter>Blur>Gaussian Blur. A modification of the basic Gaussian, the Bilateral Filter, can be applied with Filter>QIA-IV.Enhancement>Conditional Smoothing, or with Filter>XiMagic>XiDeNoiser (select Bilateral Filter), which offers more control options. This latter plugin also offers an Anisotropic Diffusion filter, a further modification of the Gaussian, and a Non-local Means (NLM) filter, both with extensive controls.

The median filter is not a convolution but instead is based on ranking pixel values in the neighborhood, and is generally superior to Gaussian filters for noise reduction. Filter>Noise>Median allows specifying the dimension of the square neighborhood used. Filter>QIA-IV.Enhancement>Hybrid Median uses a round neighborhood of adjustable size and better preserves lines and corners.

The unsharp mask (Filter>Sharpen>Unsharp Mask) calculates the difference between the original image and a Gaussian-blurred copy (with adjustable standard deviation) and adds that difference back to the original. The unsharp mask is a high-pass filter; a band-pass filter with improved noise rejection may be obtained by using Image>Calculations to subtract one Gaussian-blurred copy of the image (with a large radius, to blur details) from a second blurred copy (with a smaller radius, to reduce noise). Rather than increasing contrast at edges, steps in brightness may also be enhanced by reducing the distance over which contrast changes, using Filter>QIA-IV.Enhancement>Edge Refinement. This Kuwahara or maximum-likelihood filter also reduces noise, and in many cases improves subsequent dimensional measurements.
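The difference-of-Gaussians band pass is equally simple to express in Matlab (conventions as in Appendix C; the file name and sigma values are illustrative only).
A=im2double(imread('filename.tif'));
small=imgaussfilt(A,1);    % light blur suppresses pixel noise
large=imgaussfilt(A,8);    % heavy blur keeps only slow background variations
dog=small-large;           % band-pass result, centered near zero
imshow(dog,[]);            % the [] rescales the result for display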

Convolutions that generate a derivative in any direction are produced by Filter>Stylize>Emboss. General convolutions can be applied by entering the integer kernel values as a 5x5 matrix in Filter>Other>Custom. The scale factor should usually be set to the sum of the kernel values, or, for kernels that total zero, to the largest positive value in the kernel. The offset is used to shift values so that negative results can be visualized. The kernels can be saved on disk.

Several procedures for edge delineation are provided in Filter>QIA-IV.Enhancement>Edge Lines. The most common is the Sobel. The Canny thins the edge lines to mark the maximum gradient, which is usually the most likely position for boundaries. The vector angle of the local brightness gradient shown by Filter>QIA-IV.Enhancement>Edge Orientation can be useful to measure anisotropy. It is often necessary to use a binary image of the edges, obtained by thresholding the Edge Lines result, to mask the orientation image by erasing regions away from the edges, since a gradient direction exists at every pixel, many of which do not lie on boundaries or along lines. A histogram of the orientation values reveals any directional anisotropy.

Edge delineation can also be accomplished with Filter>QIA-IV.Enhancement>Variance by using a small neighborhood size. With a larger neighborhood, the function extracts the texture in an image. Filter>QIA-IV.Enhancement>Texture calculates the fractal dimension in a neighborhood around each pixel and scales the values to produce a derived brightness.

Morphological filtering of gray scale or color images (the latter based on the luminance, or weighted average of the color values) is provided by the Filter>Other>Minimum and Filter>Other>Maximum functions (the adjustable neighborhoods are square). These may be used singly or in combination to generate backgrounds for removal. The difference between these two results (using image combination methods described below) can also be used to delineate edges, or, when using larger neighborhoods, to extract texture.

Filter>QIA-IV.Enhancement>Morphology-Brightness can perform erosion, dilation or their combinations of opening or closing directly (for color images it operates on the brightness without altering colors; the neighborhood is circular). The Filter>QIA-IV.Enhancement>Morphology-Color plugin uses the user-specified foreground (pen) color set in the tools menu to select the most similar (dilation) or least similar (erosion) neighboring color for the operations. Morphological operations on thresholded binary images are discussed separately, below.

Enhancement of local detail contrast and suppression of large scale brightness variation is provided by Filter>QIA-IV.Enhancement>Local Equalization. Isolating small bright or dark features is provided by Filter>QIA-IV.Enhancement>Top Hat, which compares the extreme value in a central region to that in the surrounding annulus. This is especially useful for locating spikes in Fourier Transform power spectra (described below).
Image combinations

Images can be combined pixel-by-pixel using Image>Calculations. The images (or channels for a color image, or layers for a multilayer image) and the operation to be performed (Add, Subtract, Multiply, Lighter, and Darker) are selected from the drop-down lists; the images must be identical in size.

A different approach is provided by the plugins. One image must first be selected and placed in memory with Filter>QIA-II.Combinations>Save as 2nd. This reference image remains in memory until replaced, even if subsequently modified or closed. Combining it with a different image (with the same list of functions as shown above, and with optional automatic scaling and offsets) can then be performed with Filter>QIA-II.Combinations>Arithmetic.

Boolean combinations of thresholded binary images are discussed separately, below.


Fourier transforms

One use of the Fourier transform is for the removal of noise. For color images, only the brightness information is processed, but the color information is preserved. Filter>QIA-III.Fourier>Fourier Transform (FFT) calculates and stores the transform and displays the power spectrum (with a logarithmic brightness scale).

A filter can be constructed by manual marking on this display with the pencil tool, followed by thresholding (for example, using Filter>QIA-V.Thresholding>Isolate Pen Color), or by using processing tools such as the top hat filter described above. Then Filter>QIA-III.Fourier>Apply Filter and Inverse FFT produces the filtered result without permanently altering the stored transform. High- or low-pass filters, or band-pass filters, can be constructed using centered circular neighborhoods to create the mask (black indicates selected frequencies, white indicates frequencies rejected); it is generally best to apply a small amount of smoothing to the edges of a filter to reduce or eliminate "ringing" along steps and edges.

The Fourier transform is used in several other procedures, such as cross correlation or convolution. Use Filter>QIA-II.Combinations>Save as 2nd to designate the target image or point-spread-function (PSF), which does not need to be the same size, select the image to be operated on, and apply Filter>QIA-III.Fourier>Cross Correlation, >Convolution, or >Deconvolution.

Thresholding

A grayscale image, or the intensity data from a color image, can be thresholded by a user-defined setting of a slider on the histogram with Image>Adjust>Threshold. Automatic thresholding based on the histogram, using a variety of algorithms to calculate the value that best separates the two groups of pixel values, is performed with Filter>QIA-V.Thresholding>BiLevel Thresholding. Both bright and dark limits for a selected brightness range can be set manually or automatically using Filter>QIA-V.Thresholding>Threshold Levels. The Filter>QIA-V.Thresholding>Contours routine draws boundary lines between regions with selected brightness ranges.
For color images the preceding operations work just on the brightness values. The Filter>QIA-V.Thresholding>Threshold HSI and Filter>QIA-V.Thresholding>Color Tolerance plugins use the full color information in the image. The former allows selecting arbitrary ranges of hue, saturation and intensity, and the latter works by clicking on the image to select a target color and then adjusting the range of HSI values around that setting.

The k-means iterative automatic function can segment the image into a specified number of groups by selecting Filter>XiMagic>XiQuantizer and specifying the Number of Colors. The various groupings of pixels can then be selected with the wand tool, for example. Grayscale images must first be converted to RGB.

Binary images

Binary images are shown as black features on a white background. Pixels with values of 255 are treated as background, and all others are treated as part of features. Reversing the image contrast to exchange black and white is performed with Image>Adjustment>Invert, but caution is needed if intermediate pixel values are present. The morphological erosion and dilation operations described above for gray scale features can be applied, or the closing and opening functions (Filter>QIA-VI.Binary Processing>Morphology-EDM or >Morphology-Legacy) can be used. Both offer erosion, dilation, opening and closing. The iterative Legacy method allows specifying the number of iterations and the number of opposite-color neighbors that must be present to change each pixel, while the EDM method (which is not iterative) provides better isotropy, especially for larger radii, and accepts exact distances.
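The EDM approach is easy to illustrate in Matlab (conventions as in Appendix C; here B is a logical image with features true, and r is the erosion distance in pixels).
D=bwdist(~B);   % exact distance of each feature pixel to the nearest background pixel
E=D>r;          % isotropic erosion: keep pixels farther than r from the background
Because the distances are exact Euclidean values, the result is isotropic and r need not be an integer.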

Filling of interior holes in features is performed with Filter>QIA-VI.Binary Processing>Fill Holes, and erasing the interiors to leave just the 4-connected outlines is performed with Filter>QIA-VI.Binary Processing>Outlines.

Skeletonization is selected with Filter>QIA-VI.Binary Processing>Skeletonize. Terminal branches on the skeleton can be removed with Filter>QIA-VI.Binary Processing>Pruned Skeleton. The branches, nodes or end points on a skeleton can be selected with Filter>QIA-VI.Binary Processing>Select Skeleton Components. The 8-connected skeleton can be converted to 4-connected (which will separate the pixels on either side from touching) using Filter>QIA-VI.Binary Processing>Thicken Skeleton.

Touching features can be separated with Filter>QIA-VI.Binary Processing>Watershed Segmentation. This is based on the Euclidean Distance Map, and the EDM of the features can be generated with Filter>QIA-VI.Binary Processing>Euclidean Distance. The locations of features can be marked by the maxima in the Euclidean distance map, as shown with Filter>QIA-VI.Binary Processing>Ultimate Points, or by marking the centroids (which may not lie within the feature borders) of separate features using Filter>QIA-VI.Binary Processing>Centroids.

Boolean operations with binary images can be performed by saving one image with Filter>QIA-II.Combinations>Save as 2nd, and then applying Filter>QIA-II.Combinations>Boolean, with selection of AND, OR, Exclusive-OR and NOT options. This operation is performed pixel by pixel. Using the features in the stored second image as markers to select all features in the current image that are touched by any overlap is performed with Filter>QIA-II.Combinations>Select by 2nd.

Measurement

Removal of small features is facilitated by Filter>QIA-VIII.Feature Measurements>Reject Features. This also allows ignoring features that touch two edges, or any edge (note that unbiased counting of features should be performed by ignoring those that touch two edges).

Counting the number of separate features in the current foreground color is performed with Filter>QIA-VIII.Feature Measurements>Count. (By setting a unique foreground color and marking the image, or a new layer, with the pencil tool, counting of user-marked locations can be performed.) The total area fraction of features is reported with Filter>QIA-VII.Image Measurements>Area Fraction. Calibration of the spatial dimension (Filter>QIA-VII.Image Measurements>Calibrate Magnification) and brightness (Filter>QIA-VII.Image Measurements>Calibrate Density) can be used to establish units for global or feature measurements.

Global measurements include overall brightness (Filter>QIA-VII.Image Measurements>Brightness) and plots along vertical or horizontal profiles (Filter>QIA-VII.Image Measurements>Profiles). Measurement of clustering (Filter>QIA-VII.Image Measurements>Clustering) is also provided. For some measurements of clustering or location, the centers of inscribed circles (Filter>QIA-VI.Binary Processing>Ultimate Points) are more useful markers than the centroids.

Measurement of the individual features present includes a variety of size, shape, position and brightness-related parameters. These results can be written to a disk file that can be opened by a spreadsheet or statistical analysis program (Filter>QIA-VIII.Feature Measurements>Measure All Features). Values for a selected parameter can be written onto the image (Filter>QIA-VIII.Feature Measurements>Label Features), used to color-code the features (Filter>QIA-VIII.Feature Measurements>Color by Value), or used to select features with a range of values (Filter>QIA-VIII.Feature Measurements>Select Features). Histograms of values are generated with Filter>QIA-VIII.Feature Measurements>Plot (Distribution), and correlation plots between two selected parameters are generated with Filter>QIA-VIII.Feature Measurements>Plot (Scatter). These graphs can be copied or saved using the system screen copy functions.

Various grids useful for stereological measurements are generated (typically in a new image or layer filled with a white background) with the Filter>QIA-VII.Image Measurements>Grids> functions. These are typically combined with a binary image using a Boolean AND, and then the number of "hits" is determined using Filter>QIA-VIII.Feature Measurements>Count.

Appendix C - Matlab Image Processing Toolbox

This appendix summarizes a basic set of commands that implement the functions illustrated in the text, as implemented in Matlab's Image Processing Toolbox. The full set of commands in the language is very rich, and has many options for defining special control parameters. Either by using combinations of images in stacks with defined or calculated conditional neighborhoods, or by programming, many additional advanced algorithms can be applied, but for most routine users these are not easily accessed or commonly necessary.

The examples shown here cover many of the basic operations used in this text, and give an introduction to the way that the command-line interface is used. A more detailed guide is available online at <http://www.mathworks.com/help/images/index.html>. PDF files with a detailed user guide and a full reference for all commands can be downloaded as well. Functions illustrated in the text that are not directly provided in the Toolbox, such as the Kuwahara and Homomorphic filters, can be programmed in Matlab, typically requiring fewer than 10 lines of code, but programming is not covered in this appendix.

Basic image handling

Images of various types (gray scale, RGB, 8 or 16 bit, binary, etc., in a variety of storage file types) are read, saved and displayed using simple commands. Each open image is given a name, with very simple A, B, ... examples used here. The figure command creates a display window to hold the image display. Adding the optional ,[] to the imshow function automatically scales the display to the range of values in the image array. The various processing operations shown below all assume that imshow is used to display the result. If no new figure command is used to create a new window, the results will replace the existing window contents. The filename used to write an image must include a file type, for instance '.tif'.
A=imread('filename');
figure(1);
imshow(A,[]);
imwrite(B,'filename');
close(1);

The image dimensions can be read with [h,w]=size(A);

A region of interest (ROI) can be selected by using the roipoly function and then marking (with the mouse) a series of points on the image that form the vertices of a polygon. A right-click marks the final point.
B=roipoly(A);

The image display window has tools for enlarging and panning the image, reading point coordinates
and values, measuring point-to-point distances, etc. Images can also be converted, for instance from
RGB to gray scale, to HSV (hue, saturation, value), HSI (hue, saturation, intensity), etc. For example,
the rgb2gray function eliminates the color information, keeping just the L (luminance) channel
extracted from the RGB components.

B=rgb2gray(A); C=rgb2hsv(A); D=hsv2rgb(C); E=rgb2hsi(A);


Three gray scale images (A,B,C) can be merged to form an RGB color image D with
D=cat(3,A,B,C);

and the three individual color channels can be extracted from a color image (rgb or another color
space) with
A=D(:,:,1);
B=D(:,:,2);
C=D(:,:,3);

Arrays of data, including Fourier transforms, can be converted to gray scale images for display and
processing with
D=gscale(A);

The workspace of images and other variables can be cleared, and all open images closed.
clear;
close all;

Contrast adjustment
The histogram of an image can be calculated and displayed. If nbins is omitted, a default value based on the image type is applied (256 for a gray scale image).
figure; imhist(A,nbins);
If a polygonal ROI B has been created within image A, as shown above, a histogram of just the pixels within the ROI is generated by
figure; imhist(A(B),nbins);

Manual adjustment of the image contrast can be selected by clicking on the adjust levels button in the
image display window and manipulating the limits. Clicking the “adjust data” button will apply the
changes to the stored pixel values.

Linear stretching of contrast is performed by first calculating the limit values (the default pushes 1% of the pixels to pure white and 1% to pure black; this can be changed by adding variables in the stretchlim function), and then applying the stretch. For an RGB image, each channel is stretched independently (altering colors) unless specific limits and a gamma value are entered, which are then applied uniformly to the RGB channels.
B=imadjust(A,stretchlim(A));
-or
B=imadjust(A,[orig_low; orig_high],[result_low; result_high],gamma);

Histogram equalization produces a result with a specified number of discrete gray levels (if nbins is omitted, the default is 64).
B=histeq(A,nbins);
Contrast-limited adaptive histogram equalization (CLAHE) can be applied using
B=adapthisteq(A,'NumTiles',T,'ClipLimit',C,'NBins',N,'Distribution','rayleigh');
where the values of T (a two-element vector), C, and N must be user-supplied, and the distribution can alternatively be specified as 'uniform' or 'exponential'.
Image contrast can be reversed with
B=imcomplement(A);
Neighborhood Filters

The unsharp mask calculates the difference between the original image and a Gaussian-blurred copy (with standard deviation sigma) and adds that difference back to the original, in an amount specified by a multiplier (typically between 0 and 2; the classic amount is 1). For a color image, the data are automatically converted to Lab space, the L channel is processed, and the data converted back to RGB. The imsharpen parameters are given as name-value pairs:
B=imsharpen(A,'Radius',sigma,'Amount',amount);

Other convolutions can be applied by first creating the kernel values as a matrix. The program can generate several types of kernels including disk, Gaussian, Laplacian, LoG (Laplacian of a Gaussian), and motion (a directional blur vector, which requires specifying the length and direction). Optional additional variables in the imfilter routine can be included to specify how the edges of the image are treated, and whether convolution or correlation is performed.
M=fspecial('disk',radius);
B=imfilter(A,M);

It is also possible to enter the kernel values directly; the example shows a 3 x 3 horizontal derivative. Any resulting values that exceed the range of the image type (e.g., 0...255 for an 8 bit image) are truncated.
M=[1 0 -1; 2 0 -2; 1 0 -1];
B=imfilter(A,M);
The filter2 function may also be used, but it takes the kernel as its first argument and performs correlation rather than convolution. Filtering in Fourier space is discussed below.

Correlation is performed using the same procedure as convolution, but with the kernel of values rotated 180 degrees about its center. In most cases, the kernel for correlation will be the pattern of pixel values from a target image, which can be created using the crop tool on the display to outline the region. The imcrop function produces an image containing the contents of the region outlined with the tool. Performing the processing with Fourier transforms is generally more efficient than using the pixel-domain filter routines.
T=imcrop(B);
T=imrotate(T,180);
C=real(ifft2(fft2(A).*fft2(T,size(A,1),size(A,2))));
imshow(C,[]);

A Gaussian blur with a kernel size of w x w and standard deviation s is applied to an image with
M=fspecial('gaussian',w,s);
B=imfilter(A,M);
Or, more simply, by specifying just the standard deviation sigma:
B=imgaussfilt(A,sigma);

A median filter in a square neighborhood of dimension w x w (an odd value; if omitted, the function defaults to 3 x 3) can be applied, with the neighborhood given as a two-element vector. An optional additional entry can be used to specify how the image borders are treated.
B=medfilt2(A,[w w]);

Texture extraction can be done using the local range, standard deviation, or entropy. The default neighborhood, if the size is omitted, is 3 x 3 for rangefilt and stdfilt, which produces edge outlining. Larger neighborhoods respond to texture; the default neighborhood for entropyfilt is 9 x 9. A larger neighborhood is specified as an array of ones, for example true(7) for 7 x 7.
B=rangefilt(A,true(7));
-or
B=stdfilt(A);
-or
B=entropyfilt(A);

Edge extraction can be performed using a choice of methods (Sobel, Canny, Prewitt, zero-crossings of the Laplacian of a Gaussian, etc.). If no method is specified, the Sobel gradient is the default. The resulting image is binary, and optionally a threshold t may be applied to define the magnitude of the edge value that is detected. For the Canny filter, optional variables define both upper and lower thresholds and the standard deviation of the Gaussian that is applied.
B=edge(A);
B=edge(A,'sobel',t);
B=edge(A,'canny');

Fourier transforms

The Fourier transform is used in several other procedures, such as deconvolution (below) and general filtering, but is presented here in the context of identifying specific noise frequencies. For filtering purposes, it is necessary to pad an image that does not have power-of-two dimensions to a larger size w as shown. The forward transform is computed with
F=fft2(A);
-or
F=fft2(A,w,w);

The quadrants of the transform are swapped to produce the conventional presentation with the low frequencies near the center, and the log of the magnitude displayed (adding 1 avoids taking the log of zero), with
F2=fftshift(log(1+abs(F)));
imshow(F2,[]);

Filters (of type ideal or Gaussian, with cutoff frequency specified, or Butterworth, abbreviated BTW, with cutoff frequency and order 1, 2, or 3 specified) can be generated and applied to the Fourier transform. The complement of a filter (high-pass from low-pass, or vice versa) is obtained with H2=1-H;
H=hpfilter('ideal',w,w);
-or
cutoff=0.25*w;
H=hpfilter('BTW',w,w,cutoff,2);
The filter is then applied using
G=dftfilt(F,H);
imshow(G,[]);
Deconvolution

A blind deconvolution routine is provided for deblurring an image. It requires an initial guess at the size of the point spread function, which can be generated as a disk, and returns both the deblurred image and the iterated PSF. Additional variables can be included to control the number of iterations and the propagation of changes from one iteration to the next. Note that deconvblind expects the initial PSF as a numeric array, which fspecial provides.
InitPSF=fspecial('disk',rad);
[B,PSF]=deconvblind(A,InitPSF);
imshow(B,[]);
imshow(PSF,[]);

If an image of the PSF is known, Wiener deconvolution can be used instead. An estimate of the noise-to-signal ratio k is required (omitting k assigns a value of 0, corresponding to an ideal inverse filter).
B=deconvwnr(A,PSF,k);
imshow(B,[]);

Morphological processing

Morphology operations on gray scale images (erosion, dilation, opening, closing) require defining the neighborhood size to be used. In most cases this 'structuring element' in the functions imdilate, imerode, imclose, imopen will be a circular disk of specified radius rad. (Other shapes, including square, line, etc., can also be generated.)
SE=strel('disk',rad);
B=imclose(A,SE);
imshow(B,[]);
Note: the function imtophat subtracts a copy of the image that has been processed with an opening
from the original. This is not a true top-hat function, which would compare the values in a central
region to those in a surrounding annulus.
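A true top hat of that kind can be programmed in a few lines. In this minimal sketch the radii (2 and 5) and the final threshold are illustrative only, and A is a grayscale image; it finds bright spikes by comparing the local maximum in a central disk to that in the surrounding annulus.
innerMask=getnhood(strel('disk',2,0));            % exact 5 x 5 central disk
outerMask=getnhood(strel('disk',5,0));            % exact 11 x 11 disk
ringMask=outerMask & ~padarray(innerMask,[3 3]);  % annulus between the two radii
innerMax=imdilate(A,strel(innerMask));            % brightest value in the central disk
ringMax=imdilate(A,strel(ringMask));              % brightest value in the annulus
tophat=imsubtract(innerMax,ringMax);              % positive only where a spike rises above its surroundings
imshow(tophat>10,[]);                             % thresholding isolates the bright spikes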

Image arithmetic

Gray scale images can be combined using arithmetic operators. For example, subtracting a background image B from A is accomplished by
C=A-B;
In most cases this results in a very dark image, and the display contrast can be expanded by
D=imadjust(C);
imshow(D,[]);
Since the results of the operations may exceed the 0..255 range of an 8 bit image, and would be clipped, it is recommended that 16 bit results be generated or that care be used in applying appropriate scalar factors first. The functions provided are imadd, imsubtract, imdivide, immultiply, and imabsdiff (absolute value of the difference). The imlincomb function applies scalar multiplier constants to each image, adds the results in double precision, and then rounds the result.
C=imadd(A,B,'uint16');
-or
C=imlincomb(0.5,A,0.5,B);

Image rectifying

Simple rotation can be performed with imrotate, specifying the angle in degrees (positive values rotate counterclockwise) and the interpolation method (nearest, bilinear, bicubic).
B=imrotate(A,angle,'bilinear');

A general affine transformation requires three points, and can combine translation, scaling, rotation, and shear (but not perspective). Projective transformations, which can adjust for perspective distortion, require four points. The points are defined by their initial coordinates and final coordinates. The points can be selected with the cursor to create the list of coordinates. These coordinates are used to create the transform mapping, which is then used to perform the transformation.
movingPoints=[x1 y1; x2 y2; x3 y3];
fixedPoints=[x1 y1; x2 y2; x3 y3];
tform=fitgeotrans(movingPoints,fixedPoints,'affine');
B=imwarp(A,tform);
imshow(B,[]);

The tools imregcorr and imregister can be used to automatically align one image with another, but require defining several control parameters as described in detail in the user manual.
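For misregistration limited to translation, rotation and scale, a minimal sketch using default settings (with images named moving and fixed, both hypothetical) is:
tform=imregcorr(moving,fixed);                 % phase-correlation estimate of the alignment
R=imref2d(size(fixed));                        % coordinate frame of the fixed image
aligned=imwarp(moving,tform,'OutputView',R);   % resample onto the fixed image grid
imshowpair(fixed,aligned);                     % visual check of the registration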
Thresholding

A grayscale image, or the intensity data from a color image, can be thresholded with a user-defined setting t by
B=A<t;
or, to threshold pixels with values between two values t1 and t2,
C=A>t1 & A<t2;

Automatic thresholding is based on the histogram of brightness values, using the widely used Otsu-Trussell algorithm to calculate the value that separates the two groups of pixel values with the highest value of the Student's-t statistic. This value can then be used to convert an image to black and white.
level=graythresh(A);
B=im2bw(A,level);
imshow(B,[]);

To segment the image into more than two groups, the same approach creates an N-value vector to quantize the image into N+1 regions, which can then be shown in colors:
levels=multithresh(A,N);
B=imquantize(A,levels);
C=label2rgb(B);
imshow(C,[]);
Contour plots can be drawn with N lines uniformly spaced in brightness.
figure; imcontour(A,N);
Binary images

Binary images contain 0 and 1 values, and by default, binary images are shown with the 1 values as
white features on a black background. The display can be inverted to show the features as black on
white by using the NOT (~) character.
imshow(~A,[]);

Morphological operations on binary images include erosion and dilation, and their combinations
opening and closing, as well as outlining, and skeletonization. It is also possible to use the imdilate,
etc. functions shown above that are applied to gray scale images.
B=bwmorph(A,'close',iter); {iter specifies the number of dilations and erosions}
B=bwmorph(A,'remove'); {removes interior pixels, leaving the outline}
B=bwmorph(A,'skel',inf); {leaves the skeleton; replacing inf with N performs N iterations, while inf causes the operation to continue until there are no further changes}
Other operations using bwmorph include erode and dilate, open (erosion followed by dilation), endpoints and branchpoints (applied after skel, these leave just the ends or nodes respectively), and thicken (converts an 8-connected skeleton to 4-connected, to separate regions).

Holes within features can be filled (the 8 specifies 8-connectivity; if omitted, 4-connectivity is used).
B=imfill(A,8,'holes');

Small features such as noise can be removed based on area. Isolated features consisting of fewer than N touching pixels are removed by
C=bwareaopen(B,N);

The Euclidean distance map (EDM) of the features can be generated with
B=bwdist(~A); {distance from each feature pixel to the nearest background pixel}
The ultimate points (peaks in the EDM) are found with
B=bwulterode(A);

The EDM is also used for the watershed transformation, to separate touching convex features (where the 4 specifies 4-connectivity; if omitted, 8-connectivity is used). The result labels the separated features with values that can be displayed with colors as shown.
B=watershed(A,4);
C=label2rgb(B);
imshow(C,[]);

Boolean combinations of binary images using AND (&), OR (|) or NOT (~) can be performed.
C=B|A;
These can be combined, for example to construct the Exclusive-OR.
C=(B&~A)|(A&~B);

Measurement
The intensity values along a horizontal line in the image can be displayed by
plot(A(line,:));
Identifying the separate features (8-connected groups of pixels) is done with bwlabel, which returns a
list L and the number of objects.
[L,number]=bwlabel(A);

The individual features in a binary image can be measured using
stats=regionprops(B,'basic');

This produces a data matrix with a length determined by the number of distinct features present, with measurement values for each feature. The basic set of measurements consists of the area, centroid coordinates, and dimensions of the bounding box. Instead of this basic set, using 'all' generates a matrix with all of the size, shape and position measurements, or a list of specific measurement values can be used. The names of the various measurements and their exact meaning are explained in the Matlab documentation. The data can be accessed in several ways; for example,
F_Areas=[stats.Area];
creates a list of the areas that can be printed, searched, etc. using standard Matlab tools, such as
nbins=10;
figure; hist(F_Areas,nbins);


to plot a histogram of the area data.

Matlab includes many routines for data analysis, which go beyond the application to images. These
are covered fully in the program documentation. One example is the ability to construct a minimum
spanning tree connecting the discrete features present in an image and then perform statistical analysis
on the lengths of the branches, for instance to describe clustering. For additional statistical analysis,
there is an interface for SAS JMP to access Matlab data.
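As a minimal sketch of the minimum spanning tree idea (assuming a binary image B; pdist and squareform require the Statistics and Machine Learning Toolbox):
stats=regionprops(B,'Centroid');
C=vertcat(stats.Centroid);      % N x 2 list of feature centroids
D=squareform(pdist(C));         % pairwise centroid-to-centroid distances
G=graph(D);                     % complete weighted graph on the features
T=minspantree(G);               % minimum spanning tree
branchLengths=T.Edges.Weight;   % branch lengths for statistical analysis
figure; hist(branchLengths,10);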

Appendix D - ImageJ

This appendix summarizes a basic set of commands that implement the functions illustrated in the text, as provided by ImageJ. Since there are literally hundreds of user-contributed plug-in additions to the basic program, with varying levels of correctness and support, the description here applies to the functions in the basic program. The program, and a more complete manual that covers various manual procedures such as drawing and labeling, as well as programming and automation, can be downloaded at <http://imagej.nih.gov/ij>. Since it is written in Java, the program can be run on both Windows and Macintosh computers. These notes correspond to version 1.47v.

Basic image handling

Images of various types (8 or 16 bit, gray scale, RGB, etc., in a variety of storage file types) are read and saved with the usual commands selected under the File> menu. Images can be converted from one type to another with Image>Type>, and the header bar for each image shows its size and type. A separate window contains each image, and the frontmost window with the highlighted title bar is the active one operated on by most of the various processes (exceptions are noted below). Edit>Undo can be used to undo the results of most processing operations. File>Revert restores the original image contents.
Contrast adjustment

The Image>Adjust>Brightness/Contrast selection shows the image histogram and has tools for the
manual adjustment of the minimum and maximum limits. Clicking on the Auto button sets limits that
cause a small fraction of the pixels to be clipped to black and white. For an RGB image, the process
is applied to each channel separately, resulting in color shifts. Repeated clicking increases the
number of clipped pixels. Clicking Apply applies the changes to the stored pixels, while Reset
restores the image to its original values. For an RGB image, the Image> Adjust>Color Balance
selection provides the limit adjustment tools for each individual color channel (click Apply to keep
adjustments to one channel before selecting another).

After setting maximum and minimum brightness limits, nonlinear adjustment of the pixel brightness values can be applied using the Process>Math>Gamma function, entering a numeric value, and clicking the Preview button (many of the functions described below include a Preview button). Clicking OK modifies the stored pixel values.

Automatic setting of minimum and maximum brightness limits can also be done using the
Process>Enhance Contrast function. The fraction of pixels clipped at the ends can be entered in the
dialog. Histogram equalization can be produced by Process>Enhance Contrast with Equalize
Histogram checked. For a color image, the equalization is applied separately to the individual RGB
channels, causing alteration of the colors in the image.

Image contrast can be reversed, to exchange bright and dark values in each channel, with
Edit>Invert.
Neighborhood Filters
All of these operations are applied individually to each RGB channel in a color image, which results
in color shifts.

Process>Smooth averages each pixel with its 8 touching neighbors (a 3×3 neighborhood), producing
a blurred result. Averaging in larger neighborhoods is also possible using Process>Filters>Average.
These are inferior to Process>Filters>Gaussian Blur, which allows specifying the standard
deviation of the blur.

Process>Sharpen increases the difference between each pixel and the average of its 8 neighbors
(3×3 neighborhood), and adds the difference to the original image, making edge contrast greater. This
is inferior to Process>Filters>Unsharp Mask, which allows specifying the standard deviation of the
Gaussian blur and the amount of the difference to be added to the original.

Other convolutions can be applied with Process>Filters>Convolve by first entering the kernel values as a matrix. These can be saved as text files on disk. Directional first derivatives using 3×3 kernels in each of the 45° directions can be selected using Process>Shadows.

A median filter in a 3×3 neighborhood is applied using Process>Noise>Despeckle. A more general median using Process>Filters>Median allows specifying the neighborhood radius (using a circular neighborhood).
Texture extraction can be done using Process>Filters>Variance with a large neighborhood radius; a
small radius produces edge outlining.
Process>Find Edges uses the Sobel function (3×3 neighborhood) to outline steps and boundaries.
Fourier transforms

The Fourier transform is used in several procedures, such as deconvolution and general filtering, and is important for identifying specific noise frequencies. The forward transform is computed, and the power spectrum displayed (with logarithmic scaling), with Process>FFT>FFT, and the inverse transform with Process>FFT>Inverse FFT. Marking the power spectrum display manually with black selectively removes the corresponding frequencies and orientations before applying the inverse transform. Alternatively, marking the power spectrum display manually with white selects the corresponding frequencies to be enhanced or kept. (Both white and black markings cannot be used at the same time.) Applying image arithmetic (below) to subtract the filtered image from the original can isolate periodic information. A custom filter can be created as a gray-scale image with intermediate values from black to white; select Process>FFT>Custom Filter to apply it (select the image from the pull-down menu list).

Finding locations in one image that correspond to a second, target image, is performed with
Process>FFT>FD Math, selecting Correlation, and specifying the two images by name.
Deconvolution can also be performed using Process>FFT>FD Math, by specifying the image to be
deconvolved and the image containing the point spread function (PSF) and selecting Deconvolve.
This is “ideal” deconvolution; there is no provision for applying a Wiener constant to alleviate noise. The original images must have the same dimensions (use Image>Adjust>Canvas Size to pad a small PSF to match the blurred image).
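
The missing Wiener constant is a one-line addition in a script. A hedged Python sketch of frequency-domain deconvolution (k = 0 reproduces the “ideal” case; k > 0 limits noise amplification; the padding assumes the PSF image is smaller than the blurred image):

    import numpy as np

    def wiener_deconvolve(blurred, psf, k=0.01):
        # Pad the PSF to the image size and center it on the origin.
        padded = np.zeros_like(blurred, dtype=float)
        padded[:psf.shape[0], :psf.shape[1]] = psf
        padded = np.roll(padded, (-(psf.shape[0] // 2), -(psf.shape[1] // 2)),
                         axis=(0, 1))
        H = np.fft.fft2(padded)
        G = np.fft.fft2(blurred.astype(float))
        # Wiener filter: ideal inverse when k = 0, damped when k > 0.
        return np.fft.ifft2(G * np.conj(H) / (np.abs(H)**2 + k)).real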

Morphological processing

Basic morphology operations on gray scale images (erosion, dilation) are performed using
Process>Filters>Maximum and Process>Filters>Minimum and specifying the neighborhood radius
in pixels. Applying these processes sequentially using the same radius produces morphological
opening or closing.

A rolling ball filter can be applied to subtract a gradually varying nonuniform background with Process>Subtract Background. The radius of the rolling ball controls both the size and contrast necessary for features to be retained. Selecting Sliding Paraboloid replaces the spherical rolling ball with a fitted paraboloid and offers more control of radius and generally fewer artifacts.
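
A related (though not identical) background estimate can be scripted with a grayscale opening; a Python sketch, noting that a flat disk only approximates the rolling ball:

    import numpy as np
    from scipy.ndimage import grey_opening

    def subtract_background(image, radius=25):
        # A grayscale opening with a disk removes bright features smaller
        # than the disk, leaving an estimate of the varying background.
        y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
        disk = x**2 + y**2 <= radius**2
        background = grey_opening(image.astype(float), footprint=disk)
        return image.astype(float) - background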

Image arithmetic

Images can be combined using arithmetic operators selected in Process>Image Calculator. The
operators include addition, subtraction, multiplication, division, minimum, maximum, absolute
difference, and the Boolean operations AND, OR and XOR (which should be used with thresholded
binary images). It is helpful to create a new 32-bit floating point window to hold the result to avoid overflow and clipping. The resulting images will generally require contrast adjustments. If the source
images have different dimensions, the smallest common rectangle starting in the upper left corner is
used.
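
The overflow problem is easy to demonstrate; a short Python example with arbitrary values:

    import numpy as np

    a = np.array([200, 150], dtype=np.uint8)
    b = np.array([100, 120], dtype=np.uint8)
    wrapped = a + b                                     # uint8 wraps: [44, 14]
    safe = a.astype(np.float32) + b.astype(np.float32)  # correct: [300., 270.]
    # The float result must then be rescaled into a displayable range.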

Image rectifying

Image>Scale, Image>Transform>Rotate and Image>Transform>Translate functions are available with either nearest neighbor (listed as “none”), bilinear or bicubic interpolation. There is no perspective correction, or provisions for aligning one image to another.

Thresholding

Gray scale and color images are interactively thresholded to black and white using Image>Adjust>Threshold. This shows the histogram with sliders used to manually adjust the threshold settings, as well as a pull-down with a number of automatic methods (the Otsu algorithm, which selects the threshold that maximizes the between-class variance, is widely used to distinguish bright from dark). Click Apply to confirm the result. Process>Binary>Make Binary operates on the brightness values only, ignoring color. For color images, Image>Adjust>Color Threshold provides several choices of color space (HSB is usually the preferred choice). Selection of multiple brightness ranges, and contours, are not provided.
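
Otsu's criterion is also available in most scripting environments; a minimal Python sketch using scikit-image:

    from skimage.filters import threshold_otsu

    def otsu_binary(gray):
        # Otsu picks the threshold that maximizes the between-class
        # variance of the two resulting pixel populations.
        return gray > threshold_otsu(gray)  # boolean mask of the bright phase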

Binary images

Morphological operations on binary images include erosion and dilation, and their combinations opening and closing (Process>Binary>Erode, >Dilate, >Open, >Close). These require first using Process>Binary>Options to enter the Number of Iterations and a Count value, which is the number of adjacent pixels that must be of the opposite color to add or remove a pixel. Since the erosions and dilations are performed locally and iteratively, the results are not isotropic and distances vary with direction and count. Better results can be obtained by first generating the Euclidean distance map (below) of the features (or, for dilation, by first reversing the image contrast and generating the EDM of the background), followed by thresholding at the desired level.
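
The EDM-based approach is equally simple to script; a Python sketch of isotropic erosion and dilation by a fixed distance:

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def isotropic_erosion(binary, distance):
        # Keep only pixels farther than `distance` from the background.
        return distance_transform_edt(binary) > distance

    def isotropic_dilation(binary, distance):
        # Equivalently, threshold the EDM of the inverted image.
        return distance_transform_edt(~binary) <= distance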

Other operations are Process>Binary>Outline, which generates connected lines of edge-sharing pixels along the feature boundaries, Process>Binary>Fill Holes to fill interior gaps in features (including any outlined holes within the features), and Process>Binary>Skeleton, which iteratively erodes features to leave just (approximate) midlines that are 8-connected. Because of the iterative erosion, lines are strongly constrained to 90 and 45 degree directions.

The Euclidean distance map (EDM) can be generated with Process>Binary>Distance Map. The
ultimate points (local maxima in the EDM) are found with Process>Binary>Ultimate Points. The
EDM is also used for the watershed transformation, Process>Binary>Watershed, to separate
touching convex features.
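
The same recipe, EDM, ultimate points as seeds, then watershed, can be reproduced in Python with scipy and scikit-image (min_distance is an illustrative tuning parameter):

    import numpy as np
    from scipy import ndimage as ndi
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    def separate_touching(binary):
        edm = ndi.distance_transform_edt(binary)
        labels, _ = ndi.label(binary)
        # Ultimate points: local maxima of the distance map.
        peaks = peak_local_max(edm, labels=labels, min_distance=5)
        markers = np.zeros(binary.shape, dtype=int)
        markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
        # Flood the inverted EDM from the markers, confined to the features.
        return watershed(-edm, markers, mask=binary)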

Boolean combinations of images using AND, OR and XOR can be performed using Process> Image
Calculator and selecting the function and the images by name from the drop-down lists. The logic
treats white as “on” and black as “off.” As for all image combinations, if the source images have
different dimensions, the smallest common rectangle starting in the upper left corner is used.
Measurement

Selection or removal of the individual features in a binary image can be performed using
Analyze>Analyze Particles. This allows setting ranges for area and circularity (which is called
“formfactor” in this text) to keep or exclude, as well as the option to remove features that touch the
boundaries of the image. Note that counting features that touch all boundaries and counting ones that
touch no boundaries are both incorrect for images that represent a region within a larger population
of objects, and give biased results. Options selected under Show allow displaying outlines, labels,
etc. for each particle.
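
The equivalent feature selection and measurement can be scripted; a Python sketch using scikit-image (min_area is illustrative; clear_border simply drops edge-touching features, so the counting caveat above still applies):

    import numpy as np
    from skimage.measure import label, regionprops
    from skimage.segmentation import clear_border

    def measure_features(binary, min_area=10):
        regions = regionprops(label(clear_border(binary)))
        results = []
        for r in regions:
            if r.area >= min_area and r.perimeter > 0:
                # "formfactor" (circularity) = 4*pi*area / perimeter^2
                form = 4 * np.pi * r.area / r.perimeter**2
                results.append((r.label, r.area, form, r.centroid))
        return results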

If Display Results is checked the Analyze>Analyze Particles function will display a Results Table
containing all of the measurements for each feature that have been previously selected in the
Analyze>Set Measurements dialog box. The Analyze>Measure function performs similar
measurements on regions that have been previously defined. The results listing can be saved to a disk file (File>Save As) of type .xls for subsequent analysis (e.g., in JMP or Excel). If Results>
Summarize is selected, the mean, standard deviation, minimum and maximum for each measurement
parameter are also shown. Results>Distribution shows a histogram for a selected measurement
parameter.

Analyze>Set Scale and Analyze>Calibrate can be used to define the spatial scale and brightness scale of the image, respectively. To define the spatial scale, draw a line of known length and select the function, enter the distance and units, and subsequent measurements will be reported accordingly. For density or brightness calibration, mark selections around regions of known value, use Analyze>Measure to record the mean gray value of each, and then Analyze>Calibrate to enter the known values. Note that this is incorrect for values such as density, which have a nonlinear relationship between value and brightness; in such cases the mean gray value should not be used, and each pixel value should be converted before averaging. The calibrated values are used for particle and region measurement as well as plots of intensity along a user-defined line (Analyze>Plot Profile).
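
A short worked example shows why the order of operations matters for a nonlinear quantity such as optical density, OD = -log10(I/I0); the intensity values here are arbitrary:

    import numpy as np

    I0 = 255.0
    region = np.array([100.0, 200.0])              # two pixel intensities
    od_of_mean = -np.log10(region.mean() / I0)     # ~0.23 -- biased
    mean_of_od = (-np.log10(region / I0)).mean()   # ~0.26 -- convert first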

Plugins

There are hundreds of free plugins written for use with ImageJ. When installed, they are listed,
without organization, under the Plugins menu. Most of these are intended to perform a very specific function with very specific images, often of stained biological tissue as imaged in a transmission light
microscope. Others claim to address some of the more general tasks described in this text, but it is the
user’s responsibility to verify that the algorithms and the implementations of those algorithms are
correct. There are no guarantees, little documentation other than reading the Java program itself, and
often no way to get support from the originator. Caveat Emptor.

To learn to program plugins and additions to ImageJ in Java, books such as the three-volume set by W. Burger & M. J. Burge, “Principles of Digital Image Processing” (Springer, 2009), may be helpful.

The FIJI (“Fiji Is Just ImageJ”) download (https://fiji.sc) includes with ImageJ a set of prepackaged plugins, most of them intended for use on 3D stacks of images from a confocal microscope, and applicable to the study of biological tissue. However, some of these plugins provide specific routines not present in ImageJ that implement functions illustrated in this text. Examples include:

Anisotropic diffusion and bilateral filters
Kuwahara filter for edge sharpening
Several deconvolution methods
Additional algorithms for automatic threshold determination
EDM-based watershed segmentation
Analysis of skeletons by removal of terminal branches
Registration (alignment) of multiple images, intended for a 3D data set but also generally applicable to focus stacks

The documentation for each plugin is provided by the individual authors, usually on separate
websites or by reference to publications that show an example of use, and is highly variable and often
minimal.

Appendix E - Image-Pro Plus

This appendix summarizes a basic set of commands that implement the functions illustrated in the text,
using Media Cybernetics Image-Pro Plus <http://www.mediacy.com/index.aspx?page=IPP>. The
program is typically sold in conjunction with a light microscope, and includes interface software for
microscope control and image capture. It runs only under Windows. The notes correspond to version
7.0. In order to maintain continuity with older versions and any automatic sequences originally
written for them, there are some options and menu selections that are redundant, providing more than
one way to access the same function, and some less accurate computations using older algorithms and
implementations (for example integer-based) have been retained. In addition to the menu selections
described here, the program has scripting and automation capability, some provisions for 3D and
time-sequence image handling, and manual editing and annotation features. A full manual is available
on-line at <ftp://ftp.mediacy.com/tech/PDF/IPPReference_7.0.pdf>, which includes extensive
information on the data export, automation macro language, and hardware control portions of the
package.

Basic image handling

Images of various types (gray scale, RGB, 8, 12 or 16 bit, in a variety of storage file types) are read
and saved and displayed using the usual commands under the File> menu (also selectable from the
tool bar). Each image is displayed in its own window, and the frontmost window with the highlighted
name is the one operated on by the processing and measurement routines. Edit>Undo can be used to
revert up to the four most recent processing operations.

Images may be converted from one format to another by selecting Edit>Convert To. The result is
shown in a separate window and is treated as a new image. Many of the processing functions
described below have the option to either replace values in the current image or to create a new
image in a separate window.

The Toolbar has tools for enlarging the image on the screen, reading point coordinates and pixel
values, selecting regions, measuring point-to-point distances, etc., as well as image annotation tools.
Measure>Histogram shows the image histogram, which can be saved to a disk file (e.g., for reading
with Excel). For a color image, the color box selects the channel represented.

Most of the functions listed below operate only within a selection, called an AOI (Area of Interest), if one has been defined. There can be multiple AOIs on an image. The selection tools include rectangular, polygonal and free-form shapes, an automatic tracing that can follow a well-defined edge, as well as a “wand” tool that allows selecting a point and growing a region of pixels that have values within the user-specified tolerance of that value. Depress the Control key while clicking or drawing to
add another region to an existing one.

Contrast adjustment

Manual adjustment of the image contrast can be accomplished by selecting the Brightness-Contrast-Gamma control sliders (selected from the toolbar, or with Enhance>Contrast Enhancement). The contrast of the image can be reversed, as in a photographic negative, with the Invert button, or by selecting Enhance>Invert. The More button opens a dialog with a graph relating the stored to displayed brightness that can be adjusted by dragging points. Alternatively, the Enhance>Display Range selection shows the image histogram with sliders that define the bright and dark limits, and a Best Fit button that adjusts the limits to the ends of the data.

Algorithmic, rather than subjective manual, adjustment of image contrast is selected with Enhance>Equalize. In addition to classic (Linear) equalization, options for other distributions are selectable. Bell Curve corresponds to a center-weighted Gaussian contrast.

All of these routines modify the display by creating a lookup table, but do not change the actual image
data unless the Apply button is clicked or Enhance>Apply Contrast is selected.

Individual color channels (R, G, B, H, S or I) can be pulled from a color image and converted to
gray scale with Process>Color Channel under the Extract tab. Conversely, the Merge tab allows
combining images as color channels, or replacing an existing color channel. Alternatively, the
Process>Color Composite function combines multiple gray scale images in color channels, with the
ability to shift each for alignment and adjust the contrast for each.

Color Correction requires having a standard reference image, and marking up to 20 locations on the
two images that should have the same color values. The program then converts the values to CIELab
color space, calculates the correction matrix (which can be saved), and applies it. Manual adjustment
of colors can be performed using the Enhance>Contrast Enhancement dialog to set brightness,
contrast and gamma for each individual color channel.

Image rectifying and adjustment

The primary application for this software is light microscopes, which have a shallow depth of field and highly corrected optics, and hence do not require adjustments for lens distortions or perspective distortion (foreshortening). However, the program does provide capability for
general image alignment (scaling, translation, rotation and perspective correction) with
Process>Registration. Mark four points on the image and their corresponding locations on the
reference image, and click Register.

Process>Align Images performs scaling, translation and rotation of each image in a sequence for
optimum alignment, for example prior to merging them to form combinations described below, such
as an extended depth of field. Fourier cross-correlation is the default method used, and limits can be
placed on the extent of scaling, translation and rotation that are allowed.

Correction for nonuniform illumination can be performed using the Process>Background Correction
selection. The options are Background Correction, which calculates the ratio of one image to
another (the background), and Background Subtraction, which subtracts the values in the background
image. This selection requires that a separate background image is available. In some cases,
morphological filtering (described below) can also be used to generate a background for subtraction, or a high-pass filter (described below) may be useful to suppress gradual brightness variations with
position.
The program provides deconvolution of blur that is directed principally at 3D sets of images from
confocal microscopes, but deconvolution of a 2D image can be performed using the Sharpstack
dialog under the Advanced menu. The Inverse method uses a point spread function, while the Blind
method does not require prior knowledge of the optical parameters.

Neighborhood Filters

Process>Filters opens a dialog with several available neighborhood filters. Under the Enhancement tab, there is a choice of 3x3, 5x5, or 7x7 filter sizes. LoPass is an averaging filter that replaces each pixel with the average value of pixels in the filter area. A Gaussian blur is a better choice as it uses weighting of the central pixel(s) and causes less blurring. The Median filter is not a convolution but instead is based on ranking pixel values in the neighborhood, and is superior to Gaussian filters for noise reduction. The Strength value should be set at 10 to produce the filtered result. The Number of Passes can be made greater than 1 for multiple repetitions of the filter.

The unsharp mask (Sharpen) calculates the difference between the original image and the Gaussian blurred copy and adds that back to the original. Local Equalization replaces each pixel with a value calculated from the rank of pixel values in the histogram of the local area, and increases detail visibility. Select Linear equalization and Step=1 to produce the maximum contrast result. A Window size of 7 to 11 pixels is typically adequate.

Under the Edge tab, several options for outlining brightness steps in the image are provided. The
Sobel is a widely used gradient method that uses a 3x3 pixel neighborhood. The orientation of the
Sobel gradient vector is represented as a gray scale value with the Phase selection. Convolutions that
generate a derivative in the Horizontal or Vertical directions can also be selected (the Sculpt button under the Special tab generates a derivative from top-left to bottom-right). Edge delineation can also be accomplished by calculating the local Variance in a 3, 5 or 7 pixel wide neighborhood.

Morphological filtering options include erosion, dilation, opening and closing, and are typically applied using 3, 5, or 7 pixel wide circular neighborhoods. These may be used singly or in combination to generate backgrounds for removal. The difference between two results can also be used to delineate edges, or when using larger neighborhoods, to extract texture. The thinning (skeletonization), pruning, watershed, and distance functions are best applied after thresholding and are discussed below. The Tophat and Well selections do not use a true top hat filter as described in the text, but are simply the difference (bright or dark, respectively) between the pixel value and the local neighborhood average.
User-defined convolution kernels can be entered under the Kernels tab, and optionally saved on disk. For larger size convolutions, the Process>Large Spectral Filters selection may be used.
Image combinations

Images can be combined pixel-by-pixel using Process>Operations. The images are selected from the pull-down lists. The arithmetic operations are Add, Subtract, Multiply, Divide, Diff (absolute difference), Maximum (greater value), and Minimum (lesser value). For color images, the operations are performed individually for each channel. For thresholded binary images, the Boolean operations include AND, OR, and ExOR. These are discussed separately, below.

To compare and combine images, the Process>Image Overlay function allows placing one image on
top of another, with variable transparency. When the overlaid image has been dragged to the desired
position (but not rotated or scaled), the images can be merged. The options are to keep the lighter or
darker pixel value, or the blended (averaged) combination of the two.

Two or more images of the same field of view can be combined to keep the areas of each in sharpest
focus with Process>Extended Depth of Field. In addition to returning a composite image, this
routine can generate a topographic map showing the depth at which the sharpest focus was found at
each location.

Process>Tile Images combines images that are adjacent and slightly overlapped to form a larger
composite result. The ordering of the images in the sequence is important and must be specified as
shown in the Options dialog. Blending of the overlapped regions produces the smoothest visual result
but may hide alignment errors.

Fourier transforms

One use of the Fourier transform is for the removal of noise. Process>FFT presents a dialog in which the forward and inverse transforms can be performed, as well as some filtering options. Forward calculates and stores the transform and displays the power spectrum (usually best displayed with a logarithmic brightness scale, selected in Options). The complex FFT data can be saved in a disk file and reloaded if desired, to allow for several different filtering choices.

Filtering to remove specific frequencies, such as spikes associated with periodic noise, can be accomplished by marking small AOIs (preferably circular rather than square) around each spike and clicking Spike Cut (alternatively, select Spike Boost to enhance those frequencies). To perform low-pass or high-pass filtering, mark a circular AOI around the center of the power spectrum to define the cutoff frequency and select Low Pass or Hi Pass. Then in Options, choose the Hanning option (with 50% transition) to apply the filter and Symmetrical if spikes have been marked in just one of the two locations in the symmetrical power spectrum. Finally, Inverse will perform the inverse FFT and show the filtered image. If Preserve Data is checked in Options, the process can be repeated in order to vary the filter settings.

Thresholding

A grayscale or color image can be thresholded by a user-defined setting of sliders on the histogram with Process>Segmentation. For a color image, the RGB or HSI channel values can be selected for display, and limit settings for the histogram of a selected channel positioned manually or using the Automatic button (which applies a statistical calculation to divide the histogram into two regions).

Alternatively, the Color Cube method subdivides color space (based on either RGB or HSI axes) into a set of small cubes (the size is user specified with the sensitivity control) so that multiple limits can be combined. Selecting a color on the image using the eyedropper tool and then adjusting the limits controls the range of values. There is no automatic function.

Select Apply Mask to convert the image to the thresholded binary result, or New Mask to keep the
original image and create a new binary image with the results.
Binary images

Binary images are shown as white features on a black background. Reversing the image contrast to
exchange black and white is performed with Enhance>Invert and Apply. The morphological erosion, dilation, opening and closing operations described above for gray scale features can be applied with Process>Filters and the Morphological tab. Thinning (skeletonization), followed by
Pruning to remove terminal branches allows entering the number of iterations, which will usually be
set to a large enough value that no further changes occur. The Threshold value does not matter for
binary images.

The Distance selection approximates the Euclidean distance map but is not as isotropic as a true
EDM. The Watershed selection separates touching features. This uses iterative erosion and dilation
rather than the EDM. (Touching features can also be separated during measurement with
Measure>Count/Size by selecting watershed split.)

Boolean operations (AND, OR, ExOR, etc.) with binary images using Process>Operations are
performed pixel by pixel. Using the features in one image to select features in a second image that are
touched by any overlap is performed with Process>Restricted Dilation. The “seed” image is
iteratively dilated but only to pixels contained within a second “mask” image. The number of iterations must be large enough to completely fill the mask areas.

Measurement

Calibration of image dimensions (Measure>Calibration>Spatial) can be done either by marking points on an image of a reference scale or object, or by entering optical parameters (typically for a
light microscope). The calibrations can be saved as reference calibrations (for example, for each
objective lens in a microscope) and applied to batches or sequences of images. Selecting >Intensity
from the calibration menu allows creating a curve relating pixel brightness to some property of the specimen such as optical density, fluorescence emission, etc. This is best done using reference image(s) of known standards, such as a step wedge.

The Measure>Count/Size dialog provides for a wide range of operations. These include setting
thresholds for gray scale or color images (as described above under Thresholding), labeling the
screen display with object numbers, selecting the feature measurements to be reported (Select
Measurements), and various display options such as colored outlines. Once the options (below) have been defined, the Count or Measure buttons can be selected. The Measure>Count/Size>View button displays the measurement data and summary statistics, and can be used to Locate specific objects on the image. Selections can be made to show a Histogram of values, or a Scattergram relating two measurements. The data can be written or appended to a disk file (or to the clipboard or printer). The Data Export command transfers selected data to a file or directly to Excel.

There are several important Options that must be defined prior to performing the count or measure operation. Clean Border ignores features that touch the image boundaries and should be selected. (Note that for counting purposes, it is incorrect to either count or disregard features intersecting all four edges; an unbiased estimate of the number of features per unit area can be obtained by averaging the counts including and omitting edge-touching features. For measurement purposes, an adjustment for the incomplete and unmeasurable edge-intersecting features is required. This can be performed using an Excel calculation based on the X- and Y-Feret’s diameters of each feature and the image dimensions.)

Selecting 4- or 8-Connectivity defines which neighboring pixels are treated as touching, and forming part of a feature. Fill Holes ignores features that lie within holes in larger features, and includes the area of holes in the measurements for the encompassing feature. Prefilter allows ignoring small features based on area. Apply Ranges uses the limits set in Edit Ranges to ignore features based on other measurement criteria. Measure Objects should be checked to perform the selected feature measurements and return the data (otherwise, only counting and labeling of the display will be performed).

Options such as flattening the image background, or feature smoothing, watershed separation, etc., are usually best performed using the image processing and morphological processing operations described above, rather than as part of the measurement operation, so that the results can be verified.

Measure>Line Profile plots the intensity values along a user-defined path on the image. Manual measurements can be performed using various drawing tools to superimpose points or lines on the image and record the measurements (Measure>Measurements). Various grids useful for stereological measurements are generated (either overlaid on an existing image, or in a new image) with Process>Grid Mask. These are typically combined with a binary image using a Boolean AND and then the number of “hits” determined using Measure>Count/Size.
Appendix F - ImageMet SPIP

This appendix summarizes a basic set of functions in the SPIP program package from ImageMet. The program (Windows only) is marketed primarily for use with scanned probe microscopes and consequently for use with images depicting elevation as a function of position. However, many of the functions provided are suitable for use with general purpose (monochrome) images, both microscopic and others, and so the appropriate functions are described here.

The user interface is organized as a “ribbon” across the top of the screen, which contains buttons and
more conventional menu listings.
Basic image handling

Images in a variety of storage file types are read and saved and displayed using the usual commands under the File> menu. It is recommended that files be saved in *.tiff format rather than jpeg; the other formats offered are not ones commonly used for general purpose imaging. Because the program is specialized for surface elevation applications, images have a single value at each location and are not able to handle RGB color data. Instead, false colors (pseudocolors) can be selected to emphasize differences in values, and the color scale is normally displayed beside the image. Each image is displayed in its own pane, and the active image is the one operated on by the processing and measurement routines. Edit>Undo can be used to step backwards through up to 9 recent operations.

The Tools palette contains tools for enlarging the image, selecting regions, measuring distances, etc.
Selecting Analyze > Feature Analysis > Histogram shows the image histogram.
Contrast adjustment

Manual adjustment of the image contrast can be accomplished with the Contrast and Brightness
slider. Automatic settings to clip a speci ed percentage of dark and light pixels can be controlled in
View Settings. Equalization of the values as represented in the color scale is called Color
Equalization; it does not alter the actual pixel values.

Image adjustment
The Transformation panel in the Modify tab group allows rotating an image in 90 degree steps, or by
an arbitrary angle.

Gradual variations in value across an image can often be corrected by fitting a background function using Plane Correction. The program offers a variety of functional forms (plane, cylinder, sphere, general polynomial), and can perform the fit using all of the points in the image or just those within or outside a selected area.

In some cases, morphological filtering (described below) can also be used to generate a background for subtraction, or a high-pass filter (described below) may be useful to eliminate gradual brightness variations with position.

Neighborhood Filters
The Filter Dialog offers Median, Convolution, and Local Difference tabs.
Median filtering replaces outlier values (extremes of either small or large value, or both) with the median value in the local neighborhood. This should be set to circular to avoid directional artifacts, and the size may be adjusted as appropriate. An image showing just the replaced values is also provided.

Convolution filters are among the most widely used in image processing. They linearly combine all of the pixel values in the local neighborhood to generate a new value for each pixel, and can be grouped into categories of smoothing, sharpening, and edge-enhancement. For smoothing, a simple mean filter (which averages the pixels in the selected neighborhood), a “parametric lowpass” filter (in which the integer weights are set to proportional values), or a Gaussian filter may be used. The latter is the preferred choice as described in the text. The standard deviation entered controls the size of the neighborhood used.

Sharpening can be applied with a standard 3x3 Laplacian (called a High Pass Filter), or the Laplacian of Gaussian filter. This is a bandpass filter that reduces noise by applying a small Gaussian smooth and subtracting values from outer pixels in the neighborhood. It is practically identical to the Difference of Gaussians described in the text. The adjustment of neighborhood size is critical to include the negatively weighted neighbors.

Edge enhancement is limited to gradient (directional derivative) filters in each of the 45 degree directions. Different weighting of the immediate (3x3) neighbors can be selected, but the Sobel and Prewitt buttons do not perform the expected operation of combining two orthogonal results as the square-root of sum of squares to outline edges. It is possible to do so by generating two directional results and separately combining them as described in Image Calculations, below.

The unsharp mask is a high-pass filter: a band-pass filter that subtracts a blurred copy of the image from the original. The weight should be set to –1. There is no adjustment for the mode or neighborhood size used in the blurring, so this is essentially just a 3x3 Laplacian. Better control can be achieved by generating a Gaussian smoothed copy and subtracting it separately using Image Calculations.

Morphological filtering of gray scale images is provided by the Morphology menu. Erosion, dilation, opening and closing are provided with an adjustable circular neighborhood. These functions are also used to apply the same morphological procedures to images that have been thresholded. In some cases the difference between an eroded (or dilated) image and the original (using Image Calculations) may be useful for edge enhancement.

Enhancement of local detail contrast and suppression of large scale brightness variation is provided by the Local Difference filters. Either the difference between the pixel value and the mean of the local neighborhood, or adjustment to make the local standard deviation uniform across the image, may be selected.

Filtering images using Fourier transforms (performing high-pass, low-pass, band-pass and band-reject operations) is discussed below.

Image combinations

Images can be combined pixel-by-pixel using the Image Calculator. Enter the calculation to be
performed as an expression [e.g., A + B/2]; the calculator pad also offers a variety of functions [e.g.,
exp, log, min, etc.] so that complex expressions are possible [e.g., min(A, exp(B))]. The images
assigned to the variables are selected from drop-down lists.

Boolean combinations of thresholded binary images can be accomplished by adding or subtracting images and thresholding the results.
Fourier transforms
The Fourier module provides several buttons and menus in the ribbon control area.

One use of the Fourier transform is image processing or filtering. With the Fourier module, concentric circular filters in the frequency domain can be set for high-pass, low-pass, band-pass, or band-reject modes, with adjustable cutoff frequencies and a choice of ideal (exact) or Butterworth (gradual) cutoffs (the latter is generally preferred). Visualization of the filter as it is constructed is usually helpful.

When the FFT is calculated, the power spectrum can be displayed with various scaling (linear, log,
square root, etc.) Using the AOI masking tools, regions or spikes in the power spectrum can be
marked. The Circle Snap option is useful for aligning an AOI on a spike. Thresholding of the power
spectrum can also be used for selection or rejection. The Inverse FFT function is then used to
reconstruct the image with just the selected frequencies, or with all except those selected.
Measurements on the power spectrum can identify specific frequencies as spacings in the original image. This is particularly useful for repetitive images such as atomic lattices.

The Fourier transform is also used in correlation. Correlating an image with itself (autocorrelation) identifies repetitive or regularly spaced structures. Correlating one image with another aligns them based on any common features and identifies the amount and direction of offset between them.
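
The offset measurement can be reproduced with a few lines of Python (the sign convention of the recovered shift depends on which transform is conjugated):

    import numpy as np

    def correlation_offset(a, b):
        # Peak of the FFT-based cross-correlation gives the relative shift.
        F = np.fft.fft2(a.astype(float)) * np.conj(np.fft.fft2(b.astype(float)))
        corr = np.fft.ifft2(F).real
        peak = np.unravel_index(np.argmax(corr), corr.shape)
        # Fold shifts larger than half the image into negative offsets.
        return tuple(p if p <= s // 2 else p - s
                     for p, s in zip(peak, corr.shape))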

Thresholding

Using the Detection tab in the Particle and Pore Analysis pane, threshold settings can be applied to
an image. This may be done manually by adjusting the sliders (called ‘clip markers’) on the color bar
displayed with the image, or using the histogram. Automatic settings of thresholds for either pores
(dark regions) or particles (bright regions) can also be selected. It is also possible to specify the area
percentage of the image to be selected.

Contour lines can be drawn on an image at fixed steps in value as specified in View Settings.
Binary images

Thresholded images produced by the detection procedure can be processed in several ways. The
Post-Processing tab has check boxes to include holes within features, and to exclude features that
intersect the edges of the image. Smoothing feature outlines and suppressing noisy pixels are
generally better performed by explicit application of morphological procedures. The description above of morphological erosion, dilation, opening and closing is equally applicable to images after thresholding.

There are two “watershed” procedures for the isolation and separation of features. One, called “watershed packed features,” corresponds to the description of the method in the text. The program documentation describes this as the process of filling river valleys and isolating the final separating dikes between them, but this is equivalent to the “raining on mountains” description in the text. Under the Advanced Settings there is an option to merge features that are separated by a small amount. Merging these “shallow features” can reduce or prevent over-segmentation of complex images.

The second method, called “watershed dispersed features” is different. It applies to the isolation and
delineation of well-separated features that are present on a variable background, and uses the local
image gradient. There are several interactive adjustments for particle size and spacing, detail size,
slope noise reduction and threshold, smoothing, and so on, which the user is supposed to adjust until
the results “look right,” making this an often clumsy approach.

The circle detection method works for isolating circular features (either touching or separate) that lie within the size limits set by the user. The noise filter is used to reduce pixel noise and smooth feature boundaries for better detection. The higher the gradient threshold setting, the fewer the number of points used for the circle fit. The setting for maximum overlap is a percentage of the circle diameter.
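
SPIP's circle-detection algorithm is proprietary, but a comparable gradient-based circle finder is the Hough transform; a hedged Python sketch using scikit-image (the radii and peak count are user-set, echoing the size limits described above):

    import numpy as np
    from skimage.feature import canny
    from skimage.transform import hough_circle, hough_circle_peaks

    def find_circles(gray, radii, num_circles=10):
        edges = canny(gray, sigma=2)      # noise filtering + edge points
        accum = hough_circle(edges, radii)
        _, cx, cy, r = hough_circle_peaks(accum, radii,
                                          total_num_peaks=num_circles)
        return list(zip(cy, cx, r))       # (row, col, radius) triples

    # Example: find_circles(img, np.arange(10, 30))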

There is no explicit command to generate the skeleton of features, but depending on the measurements selected the skeletons may be computed and may be displayed superimposed on the features. Optionally, only the longest path in each skeleton (called the “fiber”) may be shown.

Boolean combinations of thresholded binary images can be accomplished by adding or subtracting images using the Image Calculator and thresholding the results.
Measurement

The Post-Processing tab in the Pore and Particle Analysis pane allows ignoring features that touch any edge (note that unbiased counting of features should be performed by ignoring those that touch two edges). Also in the post-processing section there is a selection to omit features based on the values of any measured parameter(s).

The Output tab in the Pore and Particle Analysis pane provides a lengthy chart in which specific measurements can be selected, as well as options to produce graphic displays of the data and to colorize or label the features. The measurements include a wide selection of size, shape, position, and value parameters (the values are expressed as “height” since the principal intended application field is to surface elevation maps from scanned probe microscopes).

For size measurements, most of the measurements correspond to those in the text except for some
naming differences (e.g., Heywood diameter is the equivalent circular diameter, and maximum
diameter is the diameter of the circumscribed circle). Length is the projected maximum dimension but
breadth is the projected dimension perpendicular to the length direction, and the minimum and
maximum breadth are also measured in the perpendicular direction. Skeleton length is the total length of all branches in the skeleton, while fiber length is the length of the longest connected path. Fiber width is calculated as area divided by skeleton length.

Position is calculated as the centroid of each feature, and nearest neighbor distance and direction are
based on the centroid positions.
The shape measurements correspond to those in the text, and use the same naming conventions. They include formfactor, aspect ratio, convexity, roundness, elongation, solidity, etc.
The measurement values for any of the parameters can be used to label and/or color the display of the
features, which is superimposed on the image.
The output from multiple images can be combined in the displayed charts, and the resulting “collection” can be saved as an XML file or exported as text.
Appendix G - GIMP

GIMP, or GNU Image Manipulation Program, is a free, open source program that, like Photoshop, is primarily oriented toward the creation of graphics and offers a wide range of image creation and processing tools. This chapter deals only with the image processing and (very limited) measurement operations, not the drawing and manipulation routines that may be used for artistic purposes. It runs under both Windows and Mac OS. The version of GIMP described is 2.8.20. The program can be downloaded from www.gimp.org, and a complete manual viewed at docs.gimp.org/2.8/en/index.html.

Basic image handling

A wide variety of 8-bit image file formats can be opened, saved and displayed using the usual commands under the File menu. Each image is displayed in its own window, and the frontmost window with the highlighted name is the one operated on by the processing and measurement routines. Edit>Undo can be used to revert the most recent processing, and Edit>Undo History displays a list of recent steps that can be selected. Images may be converted from color to monochrome by selecting Image>Mode. Images can be saved in several file formats, including a native format, XCF, which is not generally readable by other programs. Generally, TIF files are the best choice.

Image>Duplicate or Layer>Duplicate Layer are very useful to create copies on which to try
various functions. Layers are especially handy for comparing the results of different steps by changing
the layer opacity (in the Layers dialog) or turning it on and off. A layer may also be useful for making
marks that are later counted or measured.

The Tools palette contains tools (also selectable by name in the Tools menu) for enlarging the image,
reading point coordinates or color values, selecting regions, measuring point-to-point distances and
angles, etc. The Histogram and Pointer palettes (select with Windows>Dockable Dialogs>) show
the image histogram and the location and color information at the cursor. Other dialog windows are useful to show the processing history, select individual color channels, etc. These can also be selected under the Windows>Dockable Dialogs> menu.

Some of the various processing functions listed below operate only within a selection, if one has been defined. The selection tools include rectangular, polygonal and free-form shapes. Tools>Selection
Tools>Fuzzy Select allows clicking on a point and including connected pixels with values within the
Threshold range set in the Tool Options. Similarly, Select>By Color Select allows clicking on a
point and including all pixels with colors with RGB values within the Threshold set in the Tool
Options.
Contrast adjustment

Manual adjustment of the image contrast can be accomplished by selecting Colors>Levels and moving the limit sliders or the midpoint slider, which adjusts the gamma value. The Auto button moves the limits to the ends of the actual data in each RGB color channel, which alters colors. To stretch the brightness values without changing colors, use Colors>Auto>Stretch Contrast. More control is afforded by the Colors>Curves selection, which allows creating an arbitrary transfer function relating the original pixel values to final adjusted values, for the overall brightness or for each individual color channel. In both routines, clicking the OK button applies the changes to the stored pixel values. Algorithmic, rather than subjective manual, adjustment of image contrast can be applied by equalization (Colors>Auto>Equalize). The contrast of the image can be reversed, as in a photographic negative, with Colors>Invert. These functions apply to the entire image, and are not restricted to a selection.

Image rectifying and adjustment

The Perspective tool can sometimes be used to correct for perspective distortion by dragging the corners of a selection rectangle to distort the contents. Filters>Distorts>Lens Distortion provides sliders that can be used to correct for pincushion or barrel distortion.

Gradual variations in brightness across an image can often be corrected by morphological filtering (described below) used to generate a background for subtraction, or a high-pass filter (described below) to eliminate gradual brightness variations with position. Place the image and the background in Layers and set the Mode to Subtract or Divide.

Neighborhood Filters
A Gaussian blur with adjustable standard deviation can be applied to an image with Filters>
Blur>Gaussian Blur.

The median filter is not a convolution but instead is based on ranking pixel values in the neighborhood, and is generally superior to Gaussian filters for noise reduction. A median filter is applied with the Filters>Enhance>Despeckle selection, adjusting the radius (actually the dimension of the square neighborhood). The Adaptive check box causes the median radius to be reduced (adapted) at each point based on the local contents. The black level and white level sliders can be used to exclude some points, and should normally be set to -1 and 256, respectively.

The unsharp mask (Filters>Enhance>Unsharp Mask) calculates the difference between the original image and a Gaussian blurred copy (with adjustable standard deviation) and adds that back (the amount slider) to the original. The unsharp mask is a high-pass filter used to increase edge visibility; rather than increase contrast at edges, a band-pass filter to isolate edges with improved noise rejection may be obtained by using Filters>Edge Detect>Difference of Gaussians to subtract one Gaussian blurred copy of the image (with a large radius to blur details) from a second blurred copy (with a smaller radius, to reduce noise). Other procedures for edge delineation are provided in Filters>Edge Detect>Edge. The most common choice is the Sobel.
General convolutions can be applied by entering the integer kernel values as a 5x5 matrix in
Filters>Generic>Convolution Matrix. The scale factor should usually be set to the sum of the kernel
values, or, for kernels that total zero, to the largest positive value in the kernel. The offset is used to
shift values so that negative results can be visualized. The normalize checkbox provides automatic
divisor and offset values for the result. There is no provision for saving or loading kernels.
Morphological filtering of gray scale or color images (the latter based on the luminance, or weighted average of the color values) is provided by the Filters>Generic>Dilate and >Erode functions (based on a 3x3 neighborhood). Erosion and dilation may be used singly or in combination to generate backgrounds for removal. The difference between these two results (using layer Modes as described below) can also be used to delineate edges.

Image combinations
Images can be combined pixel-by-pixel by placing them in layers and setting the Mode in the Layers palette (Subtract, Multiply, Lighter, or Darker).
Fourier transforms
GIMP does not include a built-in Fourier transform; frequency-space filtering requires a third-party plugin.
Thresholding
A grayscale image, or the intensity data from a color image, can be thresholded by setting the upper and lower limits sliders on the histogram with Colors>Threshold.
Measurement
The measurement tool (Tools>Measure) allows clicking a point and dragging a ruler on the image,
while the distance and angle are displayed below the image.
Appendix H - Other Programs

There are quite a few other programs for Mac and/or PC that offer some image processing functions. Unfortunately, few of them provide a broad range of technically useful functions, and rarely offer any measurement capabilities. Some are primarily scrapbook programs that can organize collections of photographs, and sometimes provide very basic corrections (e.g. Photos on the Mac can adjust colors and contrast, and rotate and crop images). Some are designed as control interfaces to cameras or microscopes (e.g. Scanalytics IPLab, which can be customized for fluorescence microscopy imaging with hardware control of specific microscopes).

Three representative examples of moderate-cost or free software programs that offer a small but
sometimes useful range of the processing possibilities described in the text are discussed here.
Adobe Photoshop Elements 15

This program for both Mac and PC can be downloaded from www.Adobe.com/products/photoshop-elements with a 30 day free trial. It has extensive capabilities for organizing albums of images with searchable tags for names, places, etc. It includes a few functions that are taken directly from the full Photoshop program that are appropriate for technical image processing. The program can open 8 and 16 bit images in a variety of formats; 16 bit images must be converted to 8 bits for many processing operations and for saving.

The image brightness values are displayed in the Histogram window.
Images can be rotated (Image>Rotate Custom) and corrected for lens distortion (Filter>Correct Camera Distortion).
Color and brightness adjustments include:
Enhance>Adjust Color>Hue/Saturation
Enhance>Adjust Lighting>Levels (to set limits and gamma)
Filter>Adjustments>Equalize, Invert, Threshold (manual setting)
Processing functions are:
Filter>Blur>Gaussian Blur (adjustable radius)
Filter>Noise>Median (adjustable square neighborhood)
Filter>Stylize>Emboss (directional derivative in any direction)
Enhance>Unsharp Mask (adjustable radius and amount to add)
Filter>Other>Custom (enter, save, load processing kernel with scale and offset)
Filter>Other>Maximum, Minimum (erosion and dilation based on brightness, adjustable dimension)

Selection tools include rectangle, ellipse, freehand, and color selection based on value range from a
selected pixel. The only measurement function is showing the projected dimensions of the selection
region in the Info window.

Image Processing Lab 2.8.0

This free, public domain program for PC only is available from http://www.aforgenet.com/projects/iplab/downloads.html. There is no downloadable or online documentation, and the various options are not consistently organized in the menus, but quite a few functions are provided. The program can open 8 and 16 bit gray scale and RGB images in a variety of formats, but can only save results as png, bmp or jpg. Select Options>Remember on Change to enable undo (Image>Backup). The Image>Clone selection generates a copy of the image.

The various groups of functions that can be applied to gray scale and color images, and to binary
images after thresholding, are found under the Filters menu, and in its submenus:

Rotate allows arbitrary image rotation.
Adjustments to color and brightness include:
Levels (adjust upper and lower limits, check Sync Channels to adjust brightness)
Contrast Stretch (sets levels limits to the actual min and max)
Histogram Equalization (operates on each RGB channel for color images)
Color>Gamma Correction (works with gray scale images also)
Color>Invert (works with gray scale images also)
Convolutions include:
Gaussian Blur (adjustable standard deviation and size, which can truncate the Gaussian)
Bilateral (Other>Bilateral)
Unsharp mask (Sharpen Ex, with adjustable smoothing size)
Edge detectors (Sobel, Canny with limit adjustment for edge value)
Custom (can enter, save, load kernels of various sizes, no scale or offset values)
Median filter (3x3, under Other>)
Fourier:
Forward transform, frequency filtering, inverse transform (requires power-of-two size)
Image Combinations (Two source filters):
Merge (keep brighter), Intersect (keep darker); Boolean functions after binarization
Add, Subtract (these do not rescale the result)
Thresholding (Binarization):
Threshold (manual adjustment)
Otsu automatic threshold
Morphology:
Erosion, Dilation, Opening, Closing (all 3x3)
Skeletonization (under Other)
Fill Holes (under Blobs Processing, with size limits)
Feature-specific routines (under Blobs Processing, for white features on black background):
Filter Blobs by size
Color each separate feature (Connected Component Labeling), does not count
There are no measurement functions.

Pixelmator 3.8

This Mac-only program is available from www.pixelmator.com with a free 30 day trial. It can open
and save most common types of 8 and 16 bit RGB and grayscale images. The image processing
functions available are accessed using the drop-down menu in the Effects palette (View>Show
Effects to display this). Each menu selection presents a number of clickable selections for the
processing functions. The ones most relevant to the techniques discussed in the text are:

Color Adjustments: Brightness; Levels; Curves (can adjust overall intensity or the individual RGB
channels)
Blur: Gaussian (with adjustable radius); Median (3x3)
Sharpen: Sharpen (3x3 Laplacian in individual RGB channels, with adjustable amount of differences
to add back); Unsharp Mask (adjustable radius for blur and adjustable amount of intensity to add
back)

Selections include rectangles, ellipses, freehand shapes. The Edit>Select Color function allows
clicking on a color in the image and dragging a slider to include similar colored areas based on
uniform increments in RGB.

There are no measurement functions.
