
UNIT-1

DIGITAL IMAGE FUNDAMENTALS

CONTENTS

1.1 Introduction
1.2 Elements of Digital Image Processing Systems
1.3 Elements of Visual Perception
    1.3.1 Structure of the Human Eye
    1.3.2 Image Formation in the Eye
    1.3.3 Brightness Adaptation and Discrimination
1.4 Color Image Fundamentals
    1.4.1 Mixing of Colors
    1.4.2 Chromaticity Diagram
    1.4.3 Color Models
    1.4.4 The RGB Color Model
    1.4.5 The HSI Color Model
1.5 Image Sampling and Quantization
    1.5.1 Basic Concepts of Image Sampling and Quantization
    1.5.2 Representing Digital Images
    1.5.3 Spatial and Gray-Level Resolution
    1.5.4 Aliasing and Moiré Patterns
    1.5.5 Zooming and Shrinking Digital Images
1.6 Dither
1.7 2-D Mathematical Preliminaries
    1.7.1 Fourier Transform
    1.7.2 Z-Transform or Laurent Series
1.8 Question Bank

TECHNICAL TERMS
1. Image: An image may be defined as a two-dimensional light intensity function f(x, y), where x and y denote spatial coordinates and the amplitude of f at any point (x, y) is the intensity of the image at that point.

2. Digital Image: A digital image is a two-dimensional function that represents a measure of some characteristic, such as brightness or colour, of a viewed scene, with discrete coordinates and amplitude values.
3. Digital Image Processing: Digital image processing is the processing of a two-dimensional picture by a digital computer.
4. Gray level: Gray level refers to a scalar measure of intensity that ranges from black, through grays, to white.
5. Color model: A Color model is a specification of 3D-coordinates system and a subspace
within that system where each color is represented by a single point.
6. Pixel: A digital image is composed of a finite number of elements, each of which has a particular location and value; each such element is called a pixel.
7. Resolution: Resolution is defined as the smallest discernible detail in an image. Spatial resolution is the smallest discernible spatial detail, and gray-level resolution refers to the smallest discernible change in gray level.
8. Recognition: Recognition is a process that assigns a label to an object based on the information provided by its descriptors.
9. Interpretation: Interpretation means assigning meaning to a recognized object.
10. Blind spot: In the cross section of the eye where the optic nerve emerges, there are no receptors (rods or cones). The absence of receptors in this area produces the so-called blind spot.
11. Focal length: The distance between the center of the lens and the retina is called the focal length.
12. Illumination: Illumination is the amount of source light incident on the scene. It is represented as i(x, y).
13. Sampling and quantization: Sampling means digitizing the coordinate values (x, y); quantization means digitizing the amplitude values.

UNIT-I
DIGITAL IMAGE FUNDAMENTALS
Elements of digital image processing systems - Elements of visual perception - psycho-visual model - brightness - contrast - hue - saturation - Mach band effect - Color image fundamentals - RGB and HSI models - Image sampling - Quantization - Dither - Two-dimensional mathematical preliminaries

1.1 INTRODUCTION

An image may be defined as a two-dimensional light intensity function f(x, y), where x and y denote spatial coordinates and the amplitude (value) of f at any point (x, y) is called the intensity, gray level or brightness of the image at that point.
When x, y and the amplitude values of f are all finite, discrete quantities, the image is called a digital image.
Processing a digital image by means of a digital computer is called digital image processing; in other words, the term digital image processing refers to the manipulation of an image by means of a processor.
A digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are referred to as picture elements, image elements, pels or pixels. Pixel is the term most widely used to denote the elements of a digital image.
1.2 ELEMENTS OF DIGITAL IMAGE PROCESSING SYSTEMS

Fig.1.1: Components of a general purpose image processing system.

Image sensors
Two elements are required to acquire digital images:
1. A physical device that is sensitive to the energy radiated by the object we wish to image.
2. A digitizer, a device for converting the output of the physical sensing device into digital form.
For instance, in a digital video camera, the sensors produce an electrical output proportional to light intensity, and the digitizer converts these outputs to digital data.
Specialized image processing hardware
It usually consists of the digitizer plus hardware that performs other primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic and logical operations in parallel on entire images.
One example of how an ALU is used is in averaging images as quickly as they are digitized, for the purpose of noise reduction. This type of hardware is sometimes called a front-end subsystem, and its most distinguishing characteristic is speed.
This unit performs functions that require fast data throughputs (e.g., digitizing and averaging video images at 30 frames/s) that the typical main computer cannot handle.
Computer
It is a general-purpose computer and can range from a PC to a supercomputer.
In dedicated applications, sometimes specially designed computers are used to achieve a
required level of performance.
Image processing software
Software for image processing consists of specialized modules that perform specific tasks.
A well-designed package also includes the capability for the user to write code that, as a
minimum, utilizes the specialized modules.
Some packages allow the integration of those modules.
Mass storage:
Mass storage capability is a must in image processing applications.
An image of size 1024*1024 pixels requires one megabyte of storage space if the image is not
compressed.
When dealing with thousands, or even millions, of images, providing adequate storage in an image processing system can be a challenge.
Digital storage for image processing applications falls into three principal categories:
1. Short-term storage (use during processing)
2. On-line storage (relatively fast recall)
3. Archival storage (characterized by infrequent access)
Storage is measured in bytes (eight bits), Kbytes (one thousand bytes), Mbytes (one million
bytes), Gbytes (meaning giga, or one billion, bytes), and Tbytes (meaning tera, or one trillion,
bytes).
One method of providing short-term storage is computer memory.

Another is by specialized boards, called Frame buffers that store one or more images and can
be accessed rapidly. Frame buffers usually are housed in the specialized image processing
hardware unit.
Online storage generally takes the form of magnetic disks or optical-media storage.
Magnetic tapes and optical disks housed in Jukeboxes are the usual media for archival
applications.
Image displays:

Image displays in use today are mainly color TV monitors.


Monitors are driven by the outputs of image and graphics display cards that are an integral part
of the computer system.
Hardcopy:
Hardcopy devices for recording images include laser printers, film cameras, heat-sensitive
devices, inkjet units, and digital units, such as optical and CD-ROM disks.
Film provides the highest possible resolution, but paper is the obvious medium of choice for
written material.
Networking:
Networking is almost a default function in any computer system in use today. Because of the
large amount of data inherent in image processing applications, the key consideration in image
transmission is bandwidth.
Communications with remote sites via the Internet are not always as efficient.
This situation is improving quickly as a result of optical fiber and other broadband
technologies.

1.3 ELEMENTS OF VISUAL PERCEPTION


1.3.1 Structure of the Human Eye
The human eye is nearly a sphere, with an average diameter of approximately 20 mm.
Three membranes enclose the eye:
1. The cornea and sclera (outer cover)
2. The choroid
3. The retina.

Fig.1.2: Structure of Human eye.

Cornea: The cornea is a tough, transparent tissue that covers the anterior surface of the eye.
Sclera: The sclera is an opaque membrane, continuous with the cornea, that encloses the remainder of the optic globe.
Choroid
It lies directly below the sclera. This membrane contains a network of blood vessels that serve
as the major source of nutrition to the eye.
The choroid is divided into the ciliary body and the iris diaphragm.
The iris is a diaphragm of variable size whose function is to adjust the size of the pupil to regulate the amount of light admitted into the eye. The iris is the coloured part of the eye.
The central opening of the iris (the pupil) varies in diameter from approximately 2 to 8 mm.
The front of the iris contains the visible pigment of the eye, whereas the back contains a black
pigment.
Lens
The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach
to the ciliary body.
The lens of the eye is located just behind the iris.

It contains 60 to 70% water, about 6% fat, and more protein than any other tissue in the eye.
The lens is colored by a slightly yellow pigmentation that increases with age.
The lens absorbs approximately 8% of the visible light spectrum, with relatively higher
absorption at shorter wavelengths.
Both infrared and ultraviolet light are absorbed appreciably by proteins within the lens
structure and, in excessive amounts, can damage the eye.
Retina
The innermost membrane of the eye is the retina, which lines the inside of the wall's entire posterior portion.
When the eye is properly focused, light from an object outside the eye is imaged on the retina.
There are two classes of receptors:
Cones
Rods
Cones
The cones in each eye number between 6 and 7 million.
They are located primarily in the central portion of the retina, called the fovea.
They are highly sensitive to color. Humans can resolve fine details with these cones largely because each one is connected to its own nerve end.
Photopic or bright-light vision: Muscles controlling the eye rotate the eyeball until the image of an object of interest falls on the fovea. Cone vision is called photopic or bright-light vision.
Rods
The number of rods is much larger: 75 to 150 million are distributed over the retinal surface.
The larger area of distribution and the fact that several rods are connected to a single nerve end
reduce the amount of detail discernible by these receptors.
Rods serve to give a general, overall picture of the field of view.
They are not involved in color vision and are sensitive to low levels of illumination.
Scotopic or dim-light vision: Objects that appear brightly colored in daylight appear as colorless forms when seen by moonlight, because only the rods are stimulated. This phenomenon is known as scotopic or dim-light vision.
Distribution of Rods and Cones

Fig. 1.3: Distribution of rods and cones

Figure shows the density of rods and cones for a cross section of the right eye passing through
the region of emergence of the optic nerve from the eye.

Blind spot The absence of receptors in this area results in the so-called blind spot. Except
for this region, the distribution of receptors is radially symmetric about the fovea.

Receptor density is measured in degrees from the fovea.

Rods increase in density from the center out to approximately 20° off axis and then decrease in density out to the extreme periphery of the retina.

The fovea itself is a circular indentation in the retina of about 1.5 mm in diameter.

The density of cones in that area of the retina is approximately 150,000 elements per mm².

Based on these approximations, the number of cones in the region of highest acuity in the eye
is about 337,000 elements.
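As a quick sanity check of this figure, the count can be reproduced by treating the fovea as a 1.5 mm × 1.5 mm square sensor array; the short Python sketch below (variable names are ours, chosen only for illustration) multiplies the assumed foveal area by the cone density quoted above.

```python
# Back-of-the-envelope check of the cone count quoted above, treating the
# fovea as a 1.5 mm x 1.5 mm square sensor array (an approximation, not an
# anatomical fact).
fovea_side_mm = 1.5            # assumed side length of the "square" fovea
cone_density_per_mm2 = 150_000

cones = fovea_side_mm ** 2 * cone_density_per_mm2
print(f"Approximate cone count in the fovea: {cones:,.0f}")   # ~337,500
```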

Visual Axis
A simple definition of the "visual axis" is "a straight line that passes through both the centre of
the pupil and the centre of the fovea". However, there is also a stricter definition (in terms of
nodal points) which is important for specialists in optics and related subjects.
1.3.2 Image Formation in the Eye
The principal difference between the lens of the eye and an ordinary optical lens is that the
former is flexible.
The radius of curvature of the anterior surface of the lens is greater than the radius of its
posterior surface.

The shape of the lens is controlled by tension in the fibers of the ciliary body.

To focus on distant objects, the controlling muscles cause the lens to be relatively flattened.
Similarly, these muscles allow the lens to become thicker in order to focus on objects near the
eye.

Focal length: The distance between the center of the lens and the retina is called the focal length. It varies from approximately 17 mm to about 14 mm as the refractive power of the lens increases from its minimum to its maximum.
When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest
refractive power.
When the eye focuses on a nearby object, the lens is most strongly refractive. This information
makes it easy to calculate the size of the retinal image of any object.

Fig.1.4 Graphical representation of the eye looking at a palm tree.

For example, suppose the observer is looking at a tree 15 m high at a distance of 100 m. If h is the height in mm of that object in the retinal image, the geometry of Fig. 1.4 yields 15/100 = h/17, and therefore h = 2.55 mm.

The retinal image is reflected primarily in the area of the fovea.
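The similar-triangles calculation above generalizes directly. The following Python sketch (the function name and default value are illustrative; the 17 mm lens-to-retina distance is the relaxed-lens value quoted above) computes the retinal image height for any object height and distance.

```python
def retinal_image_height(object_height_m, object_distance_m, lens_to_retina_mm=17.0):
    """Estimate the retinal image height using the similar-triangles geometry
    of Fig. 1.4: H / D = h / f, so h = f * H / D (h in mm)."""
    return lens_to_retina_mm * object_height_m / object_distance_m

# The worked example above: a 15 m tree viewed from 100 m.
print(retinal_image_height(15, 100))   # 2.55 (mm)
```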


1.3.3. Brightness adaptation and discrimination
Digital images are displayed as a discrete set of intensities.
The range of light intensity levels to which the human eye can adapt is enormous, on the order of 10^10, from the scotopic threshold to the glare limit.
The visual system cannot operate over such a range simultaneously. It accomplishes this large variation by changing its overall sensitivity, a phenomenon known as brightness adaptation.

Subjective brightness (brightness as perceived by humans) is a logarithmic function of the light intensity incident on the eye.
For any given set of conditions, the current sensitivity level of the visual system is called the
brightness adaptation level.

Fig. 1.5 Range of subjective brightness sensations showing a particular adaptation level.

Weber ratio:
The ratio of the increment of illumination ΔIc (the increment that is just discriminable against a background) to the background illumination I is called the Weber ratio:
Weber ratio = ΔIc / I
If the ratio ΔIc/I is small, only a small percentage change in intensity is needed to be discriminable, i.e. brightness discrimination is good.
If the ratio ΔIc/I is large, a large percentage change in intensity is needed, i.e. brightness discrimination is poor.
The ability of the eye to discriminate between changes in light intensity at any specific
adaptation level is also of considerable interest.

Fig. 1.6 Basic experimental setup used to characterize brightness discrimination.

A plot of log ΔIc/I as a function of log I has the general shape shown in Fig. 1.7.
This curve shows that brightness discrimination is poor (the Weber ratio is large) at low levels of illumination, and that it improves significantly (the Weber ratio decreases) as background illumination increases.
The two branches in the curve reflect the fact that at low levels of illumination vision is carried out by the rods, whereas at high levels (showing better discrimination) vision is the function of the cones.

Fig. 1.7 Typical Weber ratio as a function of intensity.

Two phenomena clearly demonstrate that perceived brightness is not a simple function of
intensity.
Mach band effect
Simultaneous Contrast
Mach band effect
The visual system tends to undershoot or overshoot around the boundary of regions of different intensities. In other words, even though the intensity of the stripes is constant, we perceive a brightness pattern that is strongly scalloped, especially near the boundaries.

Fig: 1.8 Mach band effect

Simultaneous Contrast
A region's perceived brightness does not depend simply on its intensity, as Fig. 1.9 demonstrates. All the center squares have exactly the same intensity, yet they appear to the eye to become darker as the background gets lighter. This phenomenon is called simultaneous contrast.

Fig: 1.9 Examples of simultaneous contrast.

Optical Illusions
The eye fills in nonexisting information or wrongly perceives geometrical properties of objects.

Fig: 1.10 some well-known optical illusions.

In Fig. 1.10(a) the outline of a square is seen clearly, in spite of the fact that no lines defining such a figure are part of the image. Fig. 1.10(b) shows the same effect, this time with a circle; a few lines are sufficient to give the illusion of a complete circle. In Fig. 1.10(c) the two horizontal line segments are of the same length, but one appears shorter than the other. Finally, all lines in Fig. 1.10(d) that are oriented at 45° are equidistant and parallel.
Brightness
Brightness of an object is its perceived luminance, which depends on the surroundings. Two objects with different surroundings could have identical luminance but different brightness.
Hue
Hue is a color attribute that describes a pure color.
Saturation
Saturation gives a measure of the degree to which a pure color is diluted by white light.
1.4 COLOR IMAGE FUNDAMENTALS:
Color image processing is classified into two major areas:
Full-color processing (e.g., images acquired by a color TV camera or color scanner)
Pseudo-color processing
When a beam of sunlight passes through a glass prism, the emerging beam of light is not white but consists instead of a continuous spectrum of colors ranging from violet at one end to red at the other.

The color spectrum may be divided into six broad regions: violet, blue, green, yellow, orange and red.

A body that reflects light that is balanced in all visible wavelengths appears white to the
observer.
Green objects reflect light with wavelength primarily in the 500 to 570 nm range while
absorbing most of the energy at other wavelengths.
If the light is achromatic (void of color), its only attribute is its intensity, or amount.

Gray level
Gray level refers to a scalar measure of intensity that ranges from black, through grays, to white.
Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm.
Three basic quantities are used to describe the quality of a chromatic light source:
Radiance
Luminance
Brightness
1. Radiance
Radiance is the total amount of energy that flows from the light source and it is usually
measured in Watts (W).
2. Luminance
Luminance is a measure of the amount of energy an observer perceives from a light source. It is measured in lumens (lm).
3. Brightness

Brightness of an object is its perceived luminance, which depends on the surroundings; two objects with different surroundings could have identical luminance but different brightness.
Brightness is a subjective descriptor that is practically impossible to measure.
It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation.
Cones are the sensors in the eye responsible for color vision. The 6 to 7 million cones in the human eye can be divided into three principal sensing categories, corresponding roughly to red, green and blue.
Approximately 65% of all cones are sensitive to red light, 33% are sensitive to green light, and only about 2% are sensitive to blue light (but the blue cones are the most sensitive).
Experimental curves detailing the average absorption of light by the red, green and blue cones in the eye show that, because of these absorption characteristics, the human eye sees colors as variable combinations of the so-called primary colors.
The primary color wavelength specifications are:
Red: 700 nm
Green: 546.1 nm
Blue: 435.8 nm
They are called primary colors because suitable combinations of these fixed components can produce the other spectrum colors.
The primary colors can be added to produce the secondary colors of light.
Magenta = Red + Blue
Cyan = Green + blue
Yellow = Red + Green
1.4.1 MIXING OF COLORS
Mixing of color can take place in two ways,
1. Additive mixing
2. Subtractive mixing
1. Additive mixing
In additive mixing, the eye produces a response that depends on the algebraic sum of the red, green and blue light reaching it. Adding pairs of primaries produces the secondary colors magenta, cyan and yellow.
White is reproduced by adding red, green and blue lights together.

2. Subtractive mixing
In subtractive mixing, the reflecting properties of pigments are used: the mixture reflects only those wavelengths of light that are common to both original pigments and absorbs all other wavelengths.
A subtractive mixture of the three colors produces black.

Color television reception is an example of the additive nature of light colors. The interior of a CRT color TV screen is composed of a large array of triangular dot patterns of electron-sensitive phosphor; each dot in a triad produces light in one of the primary colors.
An electron gun inside the tube generates pulses corresponding to the red energy seen by the TV camera and modulates the red phosphor dots accordingly; the green and blue phosphor dots in each triad are modulated in the same manner.

CRT displays are being replaced by flat panel digital technologies such as LCD and Plasma
devices although they are fundamentally different from CRTs.
The characteristics generally used to distinguish one color from another are brightness hue and
saturation.
Hue and saturation taken together are called chromaticity.
The amounts of red, green and blue needed to form any particular color are called the tristimulus values and are denoted X, Y and Z. A color is then specified by its trichromatic coefficients:
x = X / (X + Y + Z)
y = Y / (X + Y + Z)
z = Z / (X + Y + Z)
so that x + y + z = 1.
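A minimal Python sketch of this normalization is given below; the tristimulus values in the example are hypothetical and chosen only for illustration.

```python
def trichromatic_coefficients(X, Y, Z):
    """Normalize tristimulus values so that x + y + z = 1."""
    total = X + Y + Z
    return X / total, Y / total, Z / total

# Hypothetical tristimulus values, used only to demonstrate the formula.
x, y, z = trichromatic_coefficients(X=30.0, Y=40.0, Z=30.0)
print(x, y, z, x + y + z)   # 0.3 0.4 0.3 1.0
```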

1.4.2 Chromaticity Diagram


The chromaticity diagram is formed by all the spectrum colors arranged along a horseshoe-shaped curve. The pure (fully saturated) colors are represented along the perimeter of the curve, with the corners corresponding to the three primary colors red, green and blue. Moving toward the interior of the diagram, the colors become desaturated, representing mixtures of colors up to white light.

Fig: 1.11 Chromaticity Diagram

Advantages:
An advantage of the chromaticity diagram is that it is possible to determine the result of additive mixing of any two or more colored lights by simple geometric construction.
The color diagram contains all colors of equal brightness.
1.4.3 COLOR MODELS
A Color model is a specification of 3D-coordinates system and a subspace within that
system where each color is represented by a single point.
The purpose of a color model is to facilitate the specification of colors in some standard way.
Most color models in use today are oriented either towards hardware (such as color monitors
and printers) or towards applications where color manipulation is a goal (such as in the creation
of color graphics for animation)
In terms of digital image processing, the hardware-oriented models most commonly used in practice are the RGB (Red, Green and Blue) model for color monitors and a broad class of color video cameras.

The CMY (Cyan, Magenta and Yellow) and CMYK (Cyan, Magenta, Yellow and Black) models are used for color printing.
The HSI (Hue, Saturation and Intensity) model corresponds closely with the way humans describe and interpret colors. An advantage of this model is that it decouples the color and gray-scale information in an image.
1.4.4 THE RGB COLOR MODEL
In the RGB model, each color appears in its primary spectral components of red, green and blue.
This model is based on a Cartesian coordinate system.
The color subspace of interest is a cube, in which the RGB primary values are at three corners.
The secondary colors cyan, magenta and yellow are at three other corners.
Black is at the origin and white is at the corner farthest from the origin.

In this model, the gray scale extends from black to white along the line joining these two
points.
The different colors in this model are points on or inside the cube, and are defined by vectors extending from the origin.

The assumption is that all color values have been normalized so that the cube is the unit cube, that is, all values of R, G and B are assumed to be in the range [0, 1].
Images represented in the RGB color model consist of three component images, one for each primary color. When fed into an RGB monitor, these three images combine on the screen to produce a composite color image.
The number of bits used to represent each pixel in RGB space is called the pixel depth.
Consider an RGB image in which each of the red, green and blue images is an 8-bit image. Under these conditions each RGB color pixel is said to have a depth of 24 bits.
The term full color image is used often to denote a 24-bit RGB color image. The total number
of colors in a 24-bit RGB image is (28)3 = 16,777,216.
A color image can be acquired by using three filters, sensitive to red, green and blue respectively. When we view a color scene through one of these filters with a monochrome camera, the result is a monochrome image whose intensity is proportional to the response of that filter. Repeating this process with each filter produces the three monochrome images that are the RGB component images of the color scene; together they form an RGB color rendition of the original scene.

Fig: 1.12 Generating the RGB image of a cross-sectional color plane: the red, green and blue component images are fed to a color monitor, which produces the composite RGB color image.
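As a concrete illustration of the component images and 24-bit pixel depth described above, the following NumPy sketch builds a small synthetic RGB image, separates it into its red, green and blue component planes, and recombines them; the array shape and pixel values are assumptions made only for this example.

```python
import numpy as np

# A small synthetic 24-bit RGB image (values 0-255), used only for illustration.
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 0] = 200            # red component image
rgb[..., 1] = 100            # green component image
rgb[..., 2] = 50             # blue component image

red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]   # three monochrome planes

# Recombining the three component images reproduces the composite color image,
# which is what an RGB monitor does with the three inputs of Fig. 1.12.
composite = np.dstack([red, green, blue])
assert np.array_equal(composite, rgb)

print("pixel depth:", rgb.dtype.itemsize * 8 * 3, "bits")   # 24
print("number of representable colors:", (2 ** 8) ** 3)     # 16,777,216
```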

A subset of colors that is likely to be reproduced faithfully, reasonably independently of viewer hardware capabilities, is called the set of safe RGB colors, or the set of all-systems-safe colors. In Internet applications they are called safe Web colors or safe browser colors.
1.4.5 THE HSI COLOR MODEL
The HSI color space is an important and attractive color model for image processing applications because it represents colors similarly to how the human eye senses them.
The HSI color model represents every color with three components: hue (H), saturation (S) and intensity (I). The figure below illustrates how the HSI color space represents colors.
The hue component describes the color itself in the form of an angle in the range [0, 360] degrees: 0° means red, 120° means green and 240° means blue; 60° is yellow and 300° is magenta.
The saturation component signals how much the color is diluted with white light. The range of the S component is [0, 1].

The Intensity range is between [0,1] and 0 means black, 1 means white.

Fig: 1.13 The HSI color model based on circular color planes.

As the above figure shows, hue is more meaningful when saturation approaches 1 and less meaningful when saturation approaches 0 or when intensity approaches 0 or 1. Intensity also limits the range of attainable saturation values.
The formulas that convert from RGB to HSI and back are more complicated than for the other color models, so the detailed derivation is not elaborated here; a sketch of the commonly used conversion is given below.
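For reference, the sketch below implements one commonly used geometric RGB-to-HSI conversion, assuming R, G and B are already normalized to [0, 1]; it is a standard formulation rather than the only possible one, and the function name is ours.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert normalized RGB values (each in [0, 1]) to (H, S, I).
    H is returned in degrees [0, 360), S and I in [0, 1]."""
    i = (r + g + b) / 3.0
    min_c = min(r, g, b)
    s = 0.0 if i == 0 else 1.0 - min_c / i          # saturation
    # Hue: angle measured on the color circle of Fig. 1.13.
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:                                    # gray: hue is undefined, use 0
        h = 0.0
    else:
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        h = theta if b <= g else 360.0 - theta
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red   -> (0.0, 1.0, 0.333...)
print(rgb_to_hsi(0.0, 1.0, 0.0))   # pure green -> (120.0, 1.0, 0.333...)
```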
1.5 Image Sampling & Quantization
Basic concepts of image sampling & quantization
Representing digital images
Spatial & gray-level resolution
Aliasing & Moiré patterns
Zooming & shrinking
1.5.1 Basic Concepts of Image Sampling & Quantization
The output of most sensors is a continuous voltage waveform whose amplitude and spatial
behavior are related to the physical phenomenon being sensed.

To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes:
Sampling
Quantization
Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called
quantization.

Fig 1.14: (a) Continuous image. (b) A scan line from A to B in the continuous image, used to illustrate the concepts of sampling and quantization. (c) Sampling and quantization. (d) Digital scan line.

Fig. (b) is a plot of amplitude (gray level) values of the continuous image along the line
segment AB.
The location of each sample is given by a vertical tick mark in the bottom part of the figure.
The samples are shown as small white squares superimposed on the function.
The set of these discrete locations gives the sampled function. The values of the samples still
span (vertically) a continuous range of gray-level values.
In order to form a digital function, the gray-level values also must be converted (quantized)
into discrete quantities.

Fig. (c) shows the gray-level scale divided into eight discrete levels, ranging from black to white. The vertical tick marks indicate the specific value assigned to each of the eight gray levels.
The continuous gray levels are quantized simply by assigning one of the eight discrete gray
levels to each sample. The assignment is made depending on the vertical proximity of a sample
to a vertical tick mark.
The digital samples resulting from both sampling and quantization are shown in Fig(d).
Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.
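The scan-line procedure of Fig. 1.14 can be mimicked in a few lines of NumPy: a continuous intensity profile (a made-up function standing in for the profile along A-B) is sampled at equally spaced coordinates and its amplitudes are quantized into eight gray levels.

```python
import numpy as np

def sample_and_quantize(profile, num_samples, num_levels=8):
    """Sample a 1-D continuous intensity profile (a callable on [0, 1]) and
    quantize the amplitudes into `num_levels` discrete gray levels."""
    x = np.linspace(0.0, 1.0, num_samples)            # sampling: digitize coordinates
    values = profile(x)                                # continuous amplitudes in [0, 1]
    levels = np.round(values * (num_levels - 1))       # quantization: digitize amplitudes
    return levels.astype(int)

# A made-up smooth intensity profile along the scan line A-B (illustration only).
def scanline(x):
    return 0.5 + 0.5 * np.sin(2 * np.pi * x)

print(sample_and_quantize(scanline, num_samples=16, num_levels=8))
```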
The method of sampling is determined by the sensor arrangement used to generate the image.
When an image is generated by a single sensing element combined with mechanical motion,
the output of the sensor is quantized in the manner described above. Sampling is accomplished
by selecting the number of individual mechanical increments at which we activate the sensor to
collect data.
Mechanical motion can be made very exact so, in principle, there is almost no limit as to how
fine we can sample an image.
When a sensing array is used for image acquisition, there is no motion and the number of
sensors in the array establishes the limits of sampling in both directions.

Fig 1.15(a) shows a continuous image projected onto the plane of an array sensor. Figure
1.15(b) shows the image after sampling and quantization.

The quality of a digital image is determined to a large degree by the number of samples and
discrete gray levels used in sampling and quantization.
1.5.2 Representing digital image
The result of sampling and quantization is a matrix of real numbers. Assume that an image f(x, y) is sampled so that the resulting digital image has M rows and N columns.
The values of the coordinates (x, y) now become discrete quantities. For notational clarity and convenience, we use integer values for these discrete coordinates.
Thus, the values of the coordinates at the origin are (x, y) = (0, 0). The next coordinate values along the first row of the image are represented as (x, y) = (0, 1).

The notation introduced in the preceding paragraph allows us to write the complete M*N digital image in the following compact matrix form:

f(x, y) = [ f(0,0)     f(0,1)     ...  f(0,N-1)
            f(1,0)     f(1,1)     ...  f(1,N-1)
            ...        ...        ...  ...
            f(M-1,0)   f(M-1,1)   ...  f(M-1,N-1) ]

The right side of this equation is by definition a digital image. Each element of this matrix array is called an image element, picture element, pixel, or pel.
In some discussions it is advantageous to use a more traditional matrix notation, A = [a(i, j)] with a(i, j) = f(x = i, y = j), to denote a digital image and its elements.
This digitization process requires decisions about values for M, N and for the number, L, of discrete gray levels allowed for each pixel.
There are no requirements on M and N, other than that they have to be positive integers.
However, due to processing, storage, and sampling hardware considerations, the number of gray levels typically is an integer power of 2:
L = 2^k
The range of values spanned by the gray scale is called the dynamic range of an image.
The number, b, of bits required to store a digitized image is
b = M * N * k
When M = N, this equation becomes
b = N^2 * k
A table of the number of bits required to store square images with various values of N and k can be generated directly from this formula (see the sketch below).
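The sketch below reproduces such a table from b = N²k; the particular values of N and k shown are representative choices, not the exact entries of the original table.

```python
def bits_to_store(N, k):
    """Bits needed to store an N x N image with 2**k gray levels (b = N*N*k)."""
    return N * N * k

# Storage table for a few representative image sizes and bit depths.
sizes = [32, 64, 128, 256, 512, 1024]
depths = [1, 2, 4, 8]

print("N\\k " + "".join(f"{k:>12}" for k in depths))
for N in sizes:
    print(f"{N:>4}" + "".join(f"{bits_to_store(N, k):>12,}" for k in depths))

# e.g. a 1024 x 1024 image with k = 8 needs 8,388,608 bits = 1,048,576 bytes (1 MB),
# matching the uncompressed-image example given earlier in this unit.
```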

1.5.3 Spatial and Gray-Level Resolution

Sampling is the principal factor determining the spatial resolution of an image. Basically, spatial resolution is defined as the smallest discernible detail in an image.
Suppose, construct a chart with vertical lines of width W, with the space between the lines also
having width W.
A line pair consists of one such line and its adjacent space. Thus, the width of a line pair is 2W,
and there are 1/2W line pairs per unit distance.

Resolution is simply the smallest number of discernible line pairs per unit distance; for
example, 100 line pairs per millimeter.
Gray-level resolution similarly refers to the smallest discernible change in gray level.
Due to hardware considerations, the number of gray levels is usually an integer power of 2.The
most common number is 8 bits, with 16 bits being used in some applications where
enhancement of specific gray-level ranges is necessary.
There are systems that can digitize the gray levels of an image with 10 or 12 bits of accuracy, but these are the exception rather than the rule.
When an actual measure of physical resolution relating pixels to the level of detail they resolve in the original scene is not necessary, it is not uncommon to refer to an L-level digital image of size M*N as having a spatial resolution of M*N pixels and a gray-level resolution of L levels.
This terminology is used from time to time in subsequent discussions, with references to actual resolvable detail made only when necessary for clarity.
The effect caused by the use of an insufficient number of gray levels in smooth areas of a digital image is called false contouring, so named because the ridges resemble topographic contours in a map. False contouring is quite visible in images displayed using 16 or fewer uniformly spaced gray levels (a small demonstration is sketched below).
An image produced without dithering has fewer apparent colors but improved spatial resolution compared to a dithered image; color reduction without dithering can therefore leave false contours in the new image.
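The demonstration mentioned above: an 8-bit synthetic ramp (standing in for a smooth image area) is requantized to 2^k uniformly spaced levels, and the number of surviving gray levels shows where banding, i.e. false contouring, appears.

```python
import numpy as np

def reduce_gray_levels(image_8bit, k):
    """Requantize an 8-bit image to 2**k uniformly spaced gray levels.
    With small k (16 levels or fewer), smooth areas show false contouring."""
    step = 256 // (2 ** k)
    return (image_8bit // step) * step      # map each pixel to its quantization bin

# A synthetic smooth ramp image, the kind of area where false contours appear.
ramp = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

for k in (8, 4, 2):
    reduced = reduce_gray_levels(ramp, k)
    print(f"k={k}: {len(np.unique(reduced))} distinct gray levels")
```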
Sets of these three types of images were generated by varying N and k, and observers were then asked to rank them according to their subjective quality. The results were summarized in the form of so-called isopreference curves in the Nk-plane.

Fig 1.15: Image with a low level of detail. (b) Image with a medium level of detail. (c) Image
with a relatively large amount of detail.

Fig.1.16 Representative Isopreference curves for the three types of images.

For images with a large amount of detail, only a few gray levels may be needed. For example, the isopreference curve corresponding to the crowd image is nearly vertical. This indicates that, for a fixed value of N, the perceived quality for this type of image is nearly independent of the number of gray levels used.
Perceived quality in the other two image categories remained the same in some intervals in
which the spatial resolution was increased, but the number of gray levels actually decreased.
The most likely reason for this result is that a decrease in k tends to increase the apparent
contrast of an image, a visual effect that humans often perceive as improved quality in an
image.
1.5.4 Aliasing and Moiré Patterns
Functions whose area under the curve is finite can be represented in terms of sines and cosines
of various frequencies. The sine/cosine component with the highest frequency determines the
highest frequency content of the function. Suppose that this highest frequency is finite and
that the function is of unlimited duration. These functions are called band-limited functions.
If the function is sampled at a rate equal to or greater than twice its highest frequency, it is
possible to recover completely the original function from its samples. If the function is under
sampled, then a phenomenon called aliasing corrupts the sampled image.
The corruption is in the form of additional frequency components being introduced into the
sampled function. These are called aliased frequencies. The sampling rate in images is the
number of samples taken (in both spatial directions) per unit distance.

The principal approach for reducing the aliasing effects on an image is to reduce its high-frequency components by blurring the image prior to sampling. However, some degree of aliasing is always present in a sampled image.
The effect of aliased frequencies can be seen under the right conditions in the form of so-called Moiré patterns; a Moiré pattern is caused by a break-up of the periodicity.
When a function is periodic, it may be sampled at a rate equal to or exceeding twice its highest
frequency, and it is possible to recover the function from its samples provided that the
sampling captures exactly an integer number of periods of the function.
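A one-dimensional illustration of this sampling condition is sketched below: a cosine of 9 cycles per unit distance, sampled at only 12 samples per unit distance (below the Nyquist rate of 18), produces exactly the same samples as a 3-cycle cosine, the aliased frequency. The specific frequencies are arbitrary choices for the example.

```python
import numpy as np

def sampled_cosine(freq, sample_rate, n=12):
    """Sample cos(2*pi*freq*t) at `sample_rate` samples per unit distance."""
    t = np.arange(n) / sample_rate
    return np.cos(2 * np.pi * freq * t)

f_signal = 9.0        # highest frequency in the "scene" (cycles per unit distance)
f_sample = 12.0       # below the Nyquist rate 2*f_signal = 18, so aliasing occurs

# Undersampled at 12 samples/unit, the 9-cycle cosine yields exactly the same
# samples as a 3-cycle cosine: 3 is the aliased frequency introduced by undersampling.
print(np.allclose(sampled_cosine(f_signal, f_sample),
                  sampled_cosine(3.0, f_sample)))     # True
```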
1.5.5. Zooming and Shrinking Digital Images
Zooming may be viewed as oversampling, while shrinking may be viewed as under sampling.
The key difference between these two operations and sampling and quantizing an original
continuous image is that zooming and shrinking are applied to a digital image.
Zooming requires two steps
the creation of new pixel locations
the assignment of gray levels to those new locations.

Interpolation is the process of using known values to estimate values at unknown locations.

Suppose we have an image of size 500*500 pixels and want to enlarge it 1.5 times, to 750*750 pixels. Conceptually, one of the easiest ways to visualize zooming is to lay an imaginary 750*750 grid over the original image. Obviously, the spacing in the grid would be less than one pixel, because we are fitting it over a smaller image. To perform the gray-level assignment for any point in the overlay, we look for the closest pixel in the original image and assign its gray level to the new pixel in the grid.

When we are done with all points in the overlay grid, we simply expand it to the originally specified size to obtain the zoomed image. This method of gray-level assignment is called nearest neighbor interpolation.

Pixel replication is a special case of nearest neighbor interpolation. It is applicable when we want to increase the size of an image an integer number of times. For instance, to double the size of an image, we duplicate each column; this doubles the image size in the horizontal direction. Then we duplicate each row of the enlarged image to double the size in the vertical direction.

The same procedure is used to enlarge the image by any integer number of times (triple,
quadruple, and so on). Duplication is just done the required number of times to achieve the
desired size.
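The following NumPy sketch implements nearest-neighbor zooming as described above; with an integer factor it reduces to pixel (row and column) replication. The tiny 2×2 test image is an arbitrary example.

```python
import numpy as np

def zoom_nearest(image, factor):
    """Zoom a 2-D gray-level image by `factor` using nearest-neighbor interpolation:
    lay a finer grid over the image and copy the gray level of the closest original pixel."""
    m, n = image.shape
    rows = np.minimum((np.arange(int(m * factor)) / factor).astype(int), m - 1)
    cols = np.minimum((np.arange(int(n * factor)) / factor).astype(int), n - 1)
    return image[np.ix_(rows, cols)]

img = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)

print(zoom_nearest(img, 2))     # integer factor: identical to row/column replication
print(zoom_nearest(img, 1.5))   # non-integer factor, e.g. 2x2 -> 3x3
```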

Although nearest neighbor interpolation is fast, it has the undesirable feature that it produces a checkerboard effect that is particularly objectionable at high factors of magnification.

A slightly more sophisticated way of accomplishing gray-level assignments is bilinear interpolation, which uses the four nearest neighbors of a point.

Let (x, y) denote the coordinates of a point in the zoomed image and let v(x, y) denote the
gray level assigned to it. For bilinear interpolation, the assigned gray level is given by
v(x, y)=ax+by+cxy+d

Where the four coefficients are determined from the four equations in four unknowns that can
be written using the four nearest neighbors of point (x, y).
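A minimal sketch of this four-neighbor fit is given below; it assumes the neighbors lie on an integer grid and that the query point is not on the last row or column, and it solves the four equations for a, b, c, d explicitly rather than using the usual closed-form weights.

```python
import numpy as np

def bilinear_value(image, x, y):
    """Gray level v(x, y) = a*x + b*y + c*x*y + d fitted to the four nearest
    neighbors of (x, y), assumed here to lie on an integer grid."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1                      # interior points assumed
    # Four equations in the four unknowns a, b, c, d.
    A = np.array([[x0, y0, x0 * y0, 1],
                  [x0, y1, x0 * y1, 1],
                  [x1, y0, x1 * y0, 1],
                  [x1, y1, x1 * y1, 1]], dtype=float)
    rhs = np.array([image[x0, y0], image[x0, y1],
                    image[x1, y0], image[x1, y1]], dtype=float)
    a, b, c, d = np.linalg.solve(A, rhs)
    return a * x + b * y + c * x * y + d

img = np.array([[10, 20],
                [30, 40]], dtype=float)
print(bilinear_value(img, 0.5, 0.5))   # 25.0, the average of the four neighbors
```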
Image shrinking is done in a similar manner as just described for zooming. The equivalent
process of pixel replication is row-column deletion.
For example, to shrink an image by one-half, we delete every other row and column. The zooming grid analogy can be used to visualize the concept of shrinking by a non-integer factor, except that we now expand the grid to fit over the original image, do gray-level nearest neighbor or bilinear interpolation, and then shrink the grid back to its original specified size.
To reduce possible aliasing effects, it is a good idea to blur an image slightly before shrinking
it. It is possible to use more neighbors for interpolation. Using more neighbors implies fitting
the points with a more complex surface, which generally gives smoother results.
1.6 Dither
Dithering is a technique for simulating the display of colors that are not in the current color palette of an image.
Full-color images are commonly represented with a reduced number of colors.
Dithering accomplishes this by arranging adjacent pixels of different colors into a pattern that simulates colors that are not available.
Example:
If we use rgb2ind to reduce the number of colors in an image, the resulting image might look inferior to the original, because some of the colors are lost.
Dithering is used to increase the apparent number of colors in the output image.
Dither changes the colors of pixels in a neighborhood so that the average color of the neighborhood approximates the original RGB color.
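As an illustration of the idea (not of what rgb2ind does internally), the sketch below applies Floyd-Steinberg error diffusion, one common dithering scheme, to reduce a grayscale ramp to pure black and white while preserving its apparent shading.

```python
import numpy as np

def floyd_steinberg_dither(gray):
    """Reduce an 8-bit grayscale image to black/white (0 or 255) with
    Floyd-Steinberg error diffusion, one common dithering scheme."""
    img = gray.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0
            img[y, x] = new
            err = old - new
            # Diffuse the quantization error to yet-unprocessed neighbors
            # with the usual 7/16, 3/16, 5/16, 1/16 weights.
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return img.astype(np.uint8)

gradient = np.tile(np.linspace(0, 255, 64, dtype=np.uint8), (16, 1))
dithered = floyd_steinberg_dither(gradient)
print(np.unique(dithered))            # only two levels remain: [0, 255]
```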
1.7 2-D Mathematical Preliminaries
Images are generally outputs of 2-D systems, so the mathematical concepts used in the study of such systems are needed.

1.7.1 Fourier Transform

Let f(x) be a continuous function of a real variable x. The Fourier transform of f(x) is given by

F(u) = ∫_{-∞}^{+∞} f(x) e^(-j2πux) dx                                ...(10)

f(x) can be obtained by using the inverse Fourier transform

f(x) = ∫_{-∞}^{+∞} F(u) e^(j2πux) du                                 ...(11)

Equations (10) and (11) are called the Fourier transform pair. The Fourier transform of a real function f(x) is generally complex, i.e.

F(u) = R(u) + jI(u)

|F(u)| = sqrt( R²(u) + I²(u) )

φ(u) = tan⁻¹( I(u) / R(u) )                                          ...(12)

P(u) = |F(u)|² = R²(u) + I²(u)                                       ...(13)

where
φ(u) is called the phase angle
P(u) is called the power spectrum

The Fourier transform of a two-dimensional function f(x, y) is given by

F(u, v) = ∫∫ f(x, y) e^(-j2π(ux+vy)) dx dy                           ...(14)

f(x, y) can be obtained by using the inverse Fourier transform

f(x, y) = ∫∫ F(u, v) e^(j2π(ux+vy)) du dv                            ...(15)
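For sampled images the integrals above are evaluated in practice with the discrete Fourier transform. The NumPy sketch below computes the 2-D DFT of a small synthetic cosine pattern and reads off the magnitude, phase and power spectrum defined in Eqs. (12) and (13); the image itself is an arbitrary test signal.

```python
import numpy as np

# A small synthetic image: a horizontal cosine pattern of 4 cycles across the width.
M = N = 64
x = np.arange(N)
f = np.tile(np.cos(2 * np.pi * 4 * x / N), (M, 1))

F = np.fft.fft2(f)                    # discrete counterpart of Eq. (14)
magnitude = np.abs(F)                 # |F(u, v)| = sqrt(R^2 + I^2)
phase = np.angle(F)                   # phase angle, Eq. (12)
power = magnitude ** 2                # power spectrum, Eq. (13)

# The energy concentrates at the horizontal frequencies v = +/-4 (u = 0).
peaks = np.argwhere(magnitude > magnitude.max() / 2)
print(peaks)                          # [[0 4] [0 60]]  (index 60 corresponds to -4)

# The inverse transform (Eq. (15)) recovers the original image.
print(np.allclose(np.fft.ifft2(F).real, f))   # True
```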

Properties
Conjugate symmetry:
If f(x, y) is real, the Fourier transform exhibits conjugate symmetry:
F(u, v) = F*(-u, -v)
Periodicity:
The discrete Fourier transform is periodic with period N:
F(u, v) = F(u+N, v) = F(u, v+N) = F(u+N, v+N)
Linearity:
If f(x, y) <-> F(u, v), then
a f1(x, y) + b f2(x, y) <-> a F1(u, v) + b F2(u, v)
Rotation:
If we introduce the polar coordinates x = r cosθ, y = r sinθ, u = ω cosφ, v = ω sinφ, then f(x, y) and F(u, v) become f(r, θ) and F(ω, φ), and
f(r, θ+θ0) <-> F(ω, φ+θ0)
i.e. rotating f(x, y) by an angle θ0 rotates F(u, v) by the same angle.
Distributivity:
F{ f1(x, y) + f2(x, y) } = F{ f1(x, y) } + F{ f2(x, y) }
i.e. the Fourier transform and its inverse are distributive over addition but not over multiplication.
Scaling:
For any two scalars a, b:
a f(x, y) <-> a F(u, v)
f(ax, by) <-> (1/|ab|) F(u/a, v/b)
Laplacian:
The Laplacian of a two-variable function f(x, y) is defined as
∇²f(x, y) = ∂²f/∂x² + ∂²f/∂y²
F{ ∇²f(x, y) } <-> -(2π)²(u² + v²) F(u, v)
Convolution:
For any two functions f(x, y) and g(x, y), the two-dimensional convolution is given by
f(x, y)*g(x, y) = ∫∫ f(α, β) g(x-α, y-β) dα dβ
f(x, y)*g(x, y) <-> F(u, v) G(u, v)
f(x, y) g(x, y) <-> F(u, v)*G(u, v)
Correlation:
The correlation of any two functions f(x, y) and g(x, y) satisfies
f(x, y) ∘ g(x, y) <-> F*(u, v) G(u, v)
f*(x, y) g(x, y) <-> F(u, v) ∘ G(u, v)
Parseval's theorem:
Σ_{x=0}^{M-1} Σ_{y=0}^{N-1} |f(x, y)|² = (1/MN) Σ_{u=0}^{M-1} Σ_{v=0}^{N-1} |F(u, v)|²

1.7.2 Z-Transform or Laurent Series

For a 2-D complex sequence x(m, n), the Z-transform is given by

X(Z1, Z2) = Σ_m Σ_n x(m, n) Z1^(-m) Z2^(-n)                          ...(16)

where Z1, Z2 are complex variables and the sums run over all m and n.

The set of values of Z1, Z2 for which this series converges uniformly is called the region of convergence.
The Z-transform of the impulse response of a linear shift-invariant discrete system is called its transfer function.
The transfer function is also defined as the ratio of the Z-transform of the output sequence to that of the input sequence.
The inverse Z-transform is given by the contour integral

x(m, n) = (1/(j2π)²) ∮∮ X(Z1, Z2) Z1^(m-1) Z2^(n-1) dZ1 dZ2          ...(17)

where the contours of integration are counterclockwise and lie in the region of convergence.

Properties
Linearity:
If x1(n) <-> X1(Z) and x2(n) <-> X2(Z), then
a x1(n) + b x2(n) <-> a X1(Z) + b X2(Z)
Time shifting:
If x(n) <-> X(Z), then
x(n-k) <-> Z^(-k) X(Z)
Convolution:
x1(n) * x2(n) <-> X1(Z) X2(Z)
Conjugation:
If x(n) <-> X(Z), then
x*(n) <-> X*(Z*)

1.8 QUESTION BANK


PART A (2marks)
1. Define Image?
2. What is meant by pixel?
3. Define Digital image?
4. Explain the categories of digital storage?
5. Differentiate photopic and scotopic vision?
6. How cones and rods are distributed in retina?
7. Define subjective brightness and brightness adaptation?
8. Define Weber ratio
9. What is meant by mach band effect?
10. What is simultaneous contrast?
11. What is meant by illumination and reflectance?
12. Define sampling and quantization
13. Find the number of bits required to store a 256 X 256 image with 32 gray levels?
14. Write short notes on neighbors of a pixel.
15. Write any four applications of DIP.

16. What is Dynamic Range?


17. Define Brightness?
18. Define Spatial intensity resolution.
19. What do you meant by Zooming of digital images?
20. Define the following terms: 1. Hue 2. Saturation 3. Intensity 4. Contrast

PART B (16 Marks)


1. Write short notes on sampling and quantization.
2. Describe the functions of elements of digital image processing system with a diagram.
3. Explain the various elements of digital image processing with a suitable diagram
4. Write short notes on digital color images.
5. Draw and explain in detail of Schematic of the RGB color model and its applications.
6. Draw and explain in detail of the Schematic of the HSI color model and also write the
drawback of RGB color model.
7. Draw and explain the Horizontal cross section of Human Eye.
8. Write short notes on Brightness discrimination.
