
PRINCIPLES OF IMAGE PROCESSING AND ANALYSIS IN ROBOTICS
By Vinay Kumar, The University of Texas at Austin

Introduction

Image processing is a vast field dedicated to the study of image representation, manipulation and reconstruction, and it is applicable to fields such as biomedical imaging, robotics, remote sensing and defense surveillance. This essay explores the important concepts underlying image processing and accompanies some of them with relevant examples; all example implementations are written in the MATLAB programming language.

The essay is organized into chapters. Chapter 1 introduces the concepts related to image formation and representation, Chapter 2 discusses the image processing techniques currently in use, and Chapter 3 discusses image analysis techniques. A summary and references are provided at the end.

Chapter 1: Image Representation

Image processing is defined as the methods and techniques employed to prepare an image for analysis at a later point in time; the techniques applied generally include noise reduction, enhancement, simplification and filtering. Image analysis is defined as the methods and techniques employed to analyze the stored and processed image in order to extract relevant information; examples of such techniques are facial recognition and pattern recognition.

What is an image? An image is a representation of a real scene, and its attributes depend on the image acquisition tool used. Any image can be broadly classified along the following lines:

1. Dimension: A 2-dimensional image lacks the depth information of a scene and is used mostly for feature extraction, navigation and similar tasks, whereas a 3-dimensional image contains depth information and is mainly used in motion detection, scene recreation and medical imaging.

2. Appearance: Color or black and white. The black and white models normally used are grayscale, where multiple gray inks are used to reproduce the image, and one-ink, where only black is used and different gray levels are reproduced by varying the size of the black dots.

The color models used are RGB, where the image is reproduced by a mixture of red, green and blue dots, and CMYK, where a combination of cyan, magenta, yellow and black is used to reproduce the image.

3. Form: Printed or digital. With the advent of computers the digital form has become increasingly important, and it is worth discussing the terms related to digital images in detail, since they form the basis for the image processing techniques covered later in this essay.

Digital Images

Every image, digital or printed, is made up of equal-sized basic elements called picture cells, or picture elements, abbreviated as pixels. To capture an image, the light intensity at each pixel is measured and recorded; to recreate the image, the intensity of light at each pixel location is varied accordingly. Pixels are indexed by their (x, y) or column-row (c, r) location from the origin of the image, and each contains a numerical value which is the basic unit of information in the image at a particular resolution and quantization level (both discussed later). An image is therefore a collection of numerical values representing the light intensities of a large number of pixels. A black and white image consists of pixels containing different intensities of gray, whereas a color image consists of pixels containing different intensities of the three colors red, green and blue.

A digital image is a digitized version of a real image and stores this information as a collection of 0s and 1s representing the intensity of light at each pixel. A digital image can therefore be read by a computer and manipulated or rewritten in different forms for further study. If a system digitizes with 8 bits, the image uses 2^8 = 256 distinct light intensities. A pictorial example of pixel details is given below.

The pictures below describe how pixels are represented by numbers corresponding to the different light intensities possible, depending on the number of bits used for digitization. A header at the beginning of every image file indicates the number of pixels in each row and column, so that a program reading the file knows how many bits correspond to how many pixels. The figures show how an image is divided into pixels and how the digitized version of an image is stored in a computer.

Figure 1: Image in its original form

Figure 2: Image in its Pixellated form

Figure 3: Image in its further pixellated form

Figure 4: Image showing the RGB pixel values for a part of the original image

Figure 5: Image further showing the RGB pixel values of a part of the original image.

Digital Image Data Types

Binary images: These images assign each pixel one value from the set {0, 1}. A black pixel corresponds to 0 (off) and a white pixel to 1 (on), so the image is a stream of 0s and 1s, normally grouped as 8-bit bytes.

Gray scale images: These images assign each pixel a numerical value corresponding to the intensity level at that point. The range of values depends on the bit resolution used in the image.

RGB images: These images assign each pixel three numerical values corresponding to its red, green and blue intensities respectively. Figures 2-5 show how pixel values are assigned in an RGB image.
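As a minimal MATLAB sketch of these three data types (assuming the Image Processing Toolbox is available and using 'peppers.png', a sample RGB image shipped with MATLAB), an image can be loaded and inspected as follows; the pixel coordinates (100, 200) are an arbitrary choice.

% Load an RGB sample image and derive grayscale and binary versions.
% 'peppers.png' ships with MATLAB; any RGB image file works here.
rgbImg  = imread('peppers.png');      % uint8 array of size rows x cols x 3
grayImg = rgb2gray(rgbImg);           % one intensity value per pixel
bwImg   = imbinarize(grayImg);        % logical 0/1 per pixel (Otsu threshold)

size(rgbImg)                          % prints the image dimensions (rows x cols x 3)
rgbImg(100, 200, :)                   % the R, G, B values of one pixel
grayImg(100, 200)                     % the gray level of the same pixel
bwImg(100, 200)                       % 0 (black) or 1 (white)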

Mathematics of Image Analysis

In this section we briefly discuss the mathematical principles used in the image processing field. Image processing analysis is normally carried out in two different domains.

Frequency domain: The frequency spectrum of the whole image is calculated and then altered or processed to analyze the image. This approach does not break the image down into its pixel constituents; it treats the image as a whole.

Spatial domain: The analysis is done on the pixels themselves, and all alterations and processing are performed at the pixel level.

Although the two methods seem different, an alteration in one domain also affects the analysis in the other. For example, if a spatial filter is applied to reduce noise, it also changes the frequency spectrum.

Fourier transformation: As discussed earlier, every digital image is represented by numerical values, so an image can be treated as a signal and analyzed with the techniques used for real signals. One technique used frequently in signals and systems is the Fourier transform, which rests on the result that any periodic signal can be decomposed into a collection of sines and cosines of different amplitudes and frequencies. Thus, mathematically, any periodic signal f(t) with fundamental frequency w0 can be written as

f(t) = a0/2 + SUM over n of [ an cos(n w0 t) + bn sin(n w0 t) ],  n = 1, 2, 3, ...

Conversely, if we add together all the sines and cosines that constitute a signal, we can reconstruct the signal. This representation of a signal in terms of sine and cosine components of different frequencies is called the Fourier series, and the collection of all frequencies present is called the frequency spectrum of the signal. The example below demonstrates the reconstruction of a triangular wave by adding sine waves of different frequencies; the frequency spectrum of each partial sum can also be seen, and the final triangular wave contains exactly the frequencies used to generate it.
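A short MATLAB sketch along the following lines can generate plots like those in Figures 6-13; the fundamental frequency of 4 Hz (as in Figure 7), the one-second window, the sampling rate and the number of terms are assumptions chosen to match the figures.

% Reconstruct a triangular wave by summing sine harmonics (Fourier series).
f0 = 4;                       % assumed fundamental frequency in Hz
N  = 10;                      % number of sine terms to add
Fs = 1000;                    % assumed sampling frequency in Hz
t  = 0:1/Fs:1-1/Fs;           % one second of samples
x  = zeros(size(t));
for k = 0:N-1
    n = 2*k + 1;              % a triangular wave contains only odd harmonics
    x = x + (8/pi^2) * ((-1)^k / n^2) * sin(2*pi*n*f0*t);
end
X = abs(fft(x)) / numel(x);   % magnitude spectrum of the partial sum
f = (0:numel(x)-1) * Fs / numel(x);
subplot(2,1,1); plot(t, x); title('Sum of ten sine waves (time domain)');
subplot(2,1,2); plot(f, X); title('Frequency spectrum'); xlim([0 100]);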

Figure 6: Time domain plot of a Sine wave

Figure 7: Frequency domain plot of a Sine wave at f = 4

Figure 8: Time domain plot of a summation of two Sine waves

Figure 9: Frequency domain plot for summation of two Sine waves

Figure 10: Time domain plot for summation of three Sine waves

Figure 11: Frequency domain plot for summation of three Sine waves

Figure 12: Time domain plot for summation of ten Sine waves

Figure 13: Frequency domain plot for summation of ten Sine waves

An important result from the above example is that the sharper the changes in a signal (as in a square wave or an impulse), the more frequencies are required to reconstruct it. This result makes the basic principle behind noise and edge detection easy to understand. Referring to Figure 3, the intensity of the pixels at an edge of the image is quite different from that of the surrounding pixels, so there is a sudden change in the signal representing that portion. By the Fourier result above, the frequency spectrum of the image will contain high frequencies for edges and for noise, since both are rapidly changing signals requiring high-frequency sines and cosines for their representation. Passing the signal through a low-pass filter therefore attenuates the high-frequency components and helps reduce the noise, but it also blurs the image by attenuating the high frequencies corresponding to the edges.

Image quality measurement metrics

The quality of an image is quantified by two important terms: resolution and quantization.

Resolution: Resolution measures the size of an image. For still images it is specified as a spatial resolution and given as columns (C) by rows (R), where C and R are the numbers of pixels used to cover the space representing the image. The resolution of an analog signal is a function of the sampling rate, and that of a digital image is a function of the number of pixels present; thus the resolution of an image decreases if it is sampled less frequently. For video, temporal resolution is defined as the number of images captured in a given period of time, usually expressed as frames per second (fps).
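A sequence of images at decreasing spatial resolution, like the one in Figure 14, can be produced with a minimal MATLAB sketch such as the one below; it assumes the Image Processing Toolbox (imresize, imshow) and uses the toolbox sample image 'cameraman.tif' as an arbitrary input.

% Display an image at successively coarser spatial resolutions.
img   = imread('cameraman.tif');         % any grayscale image works here
sizes = [128 64 32];                     % assumed target resolutions (pixels per side)
figure;
subplot(1, numel(sizes)+1, 1); imshow(img); title('Original');
for k = 1:numel(sizes)
    small = imresize(img, [sizes(k) sizes(k)], 'nearest');   % downsample
    % Upsample back to the original size so the blockiness is visible.
    big = imresize(small, size(img), 'nearest');
    subplot(1, numel(sizes)+1, k+1); imshow(big);
    title(sprintf('%d x %d', sizes(k), sizes(k)));
end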

Figure 14: Different resolutions of an image (Image courtesy: www.cmu.edu)

Original Image 512 x 512 resolution

Image with 256 x 256 resolution

Image with 128 x 128 resolution

Image with 64 x 64 resolution

Image with 32 x 32 resolution

Quantization: The concept of quantization is key to the process of digitizing images. It is defined as the mapping of the continuous signal representing a scene to a discrete number corresponding to the intensity of each pixel constituting the image.

In the process of converting the analog image from a capture device (camera) to a digital signal, the voltage read at the sensor, which corresponds to the intensity of the light reaching the sensor after reflection from the object, is divided into different levels, each level representing a specific intensity. Depending on the hardware used, the number of possible levels is 2^n, where n is the number of bits. Thus an 8-bit sensor divides the intensity into 2^8 = 256 levels, with 0 representing black and 255 representing white.
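The effect of quantization can be illustrated with a minimal MATLAB sketch such as the one below; the Image Processing Toolbox (imshow) and the sample image 'cameraman.tif' are assumptions, as is the target depth of 3 bits.

% Requantize an 8-bit grayscale image to fewer intensity levels.
img    = imread('cameraman.tif');   % uint8 image, values 0-255
nBits  = 3;                         % assumed target quantization depth
levels = 2^nBits;                   % 8 levels for 3 bits
step   = 256 / levels;
quant  = uint8(floor(double(img) / step) * step);   % map each pixel to its level
figure;
subplot(1,2,1); imshow(img);   title('Original (8 bits, 256 levels)');
subplot(1,2,2); imshow(quant); title(sprintf('%d bits, %d levels', nBits, levels));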
Quantization and resolution are independent of each other: a high resolution image may have just two intensity levels, 0 and 1, representing dark and light, or it may be quantized to 8 bits, giving a range of 0-255 intensity levels. The values of quantization and resolution required depend on the application the image is used for.

An important concept closely related to quantization is the sampling theorem, also called the Nyquist sampling theorem, which plays an important role when a digitized signal is reconstructed. It is concerned with the number of samples needed to recreate the original signal and can be restated for image processing as follows: an analog image can be reconstructed exactly from its digitized form as long as the sampling frequency is at least twice the highest frequency present in the image. A signal reconstructed from samples taken at less than the Nyquist rate (the minimum adequate sampling frequency) suffers from a phenomenon called aliasing. As mentioned earlier, the frequencies present in an image can be measured from its frequency spectrum, and by adhering to a proper sampling frequency the image can be reconstructed with adequate resolution; in practice the sampling frequency is commonly kept 4-5 times higher than the highest frequency of the signal. If an image is reconstructed with too low a sampling rate it will lack the high frequencies present in the original image, and important details are lost.

Chapter 2: Image Processing Techniques

Once the image of interest has been quantized, digitized and stored in the system, it needs to be processed before image analysis routines can be applied. The purpose of image processing techniques is to remove faults, trivial information and shortcomings introduced during image acquisition; common faults include blur and noise. Major image processing techniques include histogram analysis, thresholding, edge detection, segmentation and masking, and each is briefly discussed below.

Histogram analysis: An image histogram is a plot of the relative frequency of occurrence of each gray level in the image, plotted against the gray-level values. In other words, it gives the number of times each gray level occurs in the image.

The histogram gives a visual indication of the basic contrast present in the image, and any differences in the pixel distributions of the foreground and background can be identified from it. Histogram analysis is frequently used to improve a poorly contrasted image, and two techniques are commonly employed for correction: 1. Contrast stretching, or normalization, where a piecewise linear function is used to transform the gray levels, stretching their values so that the transformed image has better contrast than the original. 2. Histogram equalization, a procedure in which the gray levels are redistributed automatically, without any user input, so that they are spread more evenly over the available range. A MATLAB sketch of both corrections is given below.
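A minimal MATLAB sketch of both corrections, assuming the Image Processing Toolbox (imadjust, histeq, imhist) and its low-contrast sample image 'pout.tif':

% Improve the contrast of a low-contrast grayscale image.
img = imread('pout.tif');

stretched = imadjust(img);   % contrast stretching (linear remapping of gray levels)
equalized = histeq(img);     % histogram equalization (fully automatic)

figure;
subplot(2,3,1); imshow(img);       title('Original');
subplot(2,3,4); imhist(img);
subplot(2,3,2); imshow(stretched); title('Contrast stretched');
subplot(2,3,5); imhist(stretched);
subplot(2,3,3); imshow(equalized); title('Equalized');
subplot(2,3,6); imhist(equalized);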

Figure 15: A low contrast image and its corresponding Histogram

Figure 16: A well contrasted image, obtained by applying histogram stretching, and its corresponding Histogram

Thresholding: Thresholding is one of the important techniques in image processing. The image is divided into different levels, and each pixel is assigned to a level by comparing its gray value to a threshold. Thresholding can be performed at a single level, by choosing a single threshold, or at multiple levels, by choosing several thresholds. A simple example of single thresholding is to turn a grayscale image into a binary (black and white) image by choosing a gray level L and turning each pixel black or white according to whether its gray value is greater or less than L, i.e.

Convert a pixel to White if its gray level > L.
Convert a pixel to Black if its gray level < L.

A simple example of double thresholding: choose two values L1 and L2 for thresholding.
Convert a pixel to White if its gray level is between L1 and L2.
Convert a pixel to Black if its gray level is not between the two thresholds.

Thresholding is an important part of the image segmentation technique, where it is used to isolate objects from their background, and it finds wide use in robot vision applications. A MATLAB sketch of simple thresholding follows.
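A minimal MATLAB sketch of single and double thresholding, assuming the Image Processing Toolbox and its sample image 'coins.png'; the threshold values L, L1 and L2 are illustrative assumptions.

% Single and double thresholding of a grayscale image.
img = imread('coins.png');            % sample grayscale image shipped with the toolbox

L = 100;                              % assumed single threshold
bwSingle = img > L;                   % logical image: white where gray level > L

L1 = 80;  L2 = 180;                   % assumed double thresholds
bwDouble = (img > L1) & (img < L2);   % white only between the two thresholds

figure;
subplot(1,3,1); imshow(img);      title('Original');
subplot(1,3,2); imshow(bwSingle); title('Single threshold');
subplot(1,3,3); imshow(bwDouble); title('Double threshold');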

Figure 17: An example of thresholding applied to an image

Convolution mask: The convolution mask is one of the filtering techniques used in image processing; it is also sometimes called neighborhood processing. As the name suggests, a convolution is applied to the image of interest. The idea is to move a mask, a small odd-sized rectangle of coefficients, over the given image. The mask is first placed on the upper left corner of the image, and the sum of the products of each pixel value with the corresponding mask value is calculated. This sum is then divided by a constant normalizing value; if the summation works out to be zero, it is replaced by 1 or by the largest allowable number. A new copy of the image is generated, and the result of the normalization is written into the position corresponding to the center of the block covered by the mask. The whole process is repeated by moving the mask to the right and again replacing the center value with the calculated normalized value, and the operation continues until all rows of the image have been processed. Figure 18 depicts the convolution mask procedure pictorially.

Figure 18: An example of applying a convolution mask (Courtesy: S. Niku)

An important point to note is that the first and last rows and columns are never affected by this convolution procedure; they are either ignored or replaced by zeros. One application of convolution mask filtering is noise reduction in the image. The specific technique is called neighborhood averaging: a mask is used to pull the gray value of a pixel that is very different from its neighbors (and hence considered noise) toward the neighborhood average, while pixels whose gray values are equal or close to those of their neighbors are left largely unchanged. The method thus acts as a low-pass filter, attenuating sharp differences between neighboring pixels. Since this method introduces new gray levels into the image, it changes the histogram of the image and also reduces its sharpness. An alternative that overcomes this shortcoming is median filtering, which replaces the center value with the median of the neighboring pixel values rather than computing a convolution. A minimal MATLAB sketch of both filters follows.
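A minimal MATLAB sketch of neighborhood averaging and median filtering, assuming the Image Processing Toolbox (imnoise, medfilt2); the salt-and-pepper noise density of 0.05 is an illustrative assumption.

% Noise reduction with a 3 x 3 averaging mask and with a median filter.
img   = imread('coins.png');
noisy = imnoise(img, 'salt & pepper', 0.05);   % add artificial noise

mask  = ones(3, 3) / 9;                        % averaging mask, normalized by 9
avgFiltered = uint8(conv2(double(noisy), mask, 'same'));   % neighborhood averaging
medFiltered = medfilt2(noisy, [3 3]);          % 3 x 3 median filter

figure;
subplot(1,3,1); imshow(noisy);       title('Noisy image');
subplot(1,3,2); imshow(avgFiltered); title('Neighborhood averaging');
subplot(1,3,3); imshow(medFiltered); title('Median filter');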

Edge detection: One of the most important uses of filtering in image processing is to support image analysis by helping with feature detection and feature extraction, and edge detection is one of the most important such techniques. An edge can be considered a discontinuity, or gradient, in the pixel values that exceeds a given threshold; it represents an observable difference in pixel values. Considering the tables below, there is a clear difference in the gray values of the right-hand table between columns 3 and 4:

Left table (nearly uniform gray values):
45 44 42
46 45 46
47 43 44

Right table (a sharp jump between columns 3 and 4):
44 42 43 102 112
45 44 42 108 120
46 45 46 115 134

This jump indicates the presence of an edge. Edge detection techniques operate on the image and produce a line drawing of it; these lines can represent changes in values such as the cross-sections of planes, textures, or differences in light intensity between parts and their background. The principle behind these techniques is to operate on the difference between the gray levels of pixels, or groups of pixels, through the use of convolution masks (discussed in the previous section). The resulting representation takes less memory, saving computational and storage costs, and it helps with object recognition and segmentation.

If we plot the gray values as we traverse the image from left to right, we obtain a profile of the image. The edge profile can be a ramp edge, where the gray values change slowly, or a step edge, where they change suddenly. Referring to Figure 19, an ideal edge is a clean break between the pixel values, which is only possible in binary images. The real edge shown in the figure is more typical of other image types, so a simple comparison of pixel values is inadequate for edge detection. In practice, the first and second derivatives of the profile are used: the edge is detected at the peak of the first derivative and at the zero crossing of the second derivative.

Figure 19: An example of edge detection using derivatives (Courtesy: S. Niku)

Derivatives: Suppose we have a plot of the function f(x) giving the profile of an image; we can then plot its derivative f'(x). The derivative is zero over all constant sections of the profile and non-zero only where the profile changes. For a two-dimensional image, using partial derivatives, the gradient and the Laplacian are defined as

grad f = ( df/dx , df/dy )   and   lap f = d2f/dx2 + d2f/dy2

respectively. The gradient points in the direction of greatest increase of the function f(x, y); the direction of increase is given by

theta = arctan( (df/dy) / (df/dx) )

and the magnitude by

|grad f| = sqrt( (df/dx)^2 + (df/dy)^2 ).

Most edge detection methods find the magnitude of the gradient and then apply a threshold to the result.

First order derivatives: The definition of the derivative is

f'(x) = lim (h -> 0) [ f(x + h) - f(x) ] / h.

Since in an image the smallest value for h is 1, the difference between the index values of two adjacent pixels, the digital implementation of the above definition is

f'(x) = f(x + 1) - f(x).

Other definitions of the derivative are the backward difference [ f(x) - f(x - h) ] / h and the central difference [ f(x + h) - f(x - h) ] / 2h, whose digital versions come out to be

f'(x) = f(x) - f(x - 1)   and   f'(x) = f(x + 1) - f(x - 1).

Using these expressions for the derivatives and leaving out the scaling factors, horizontal and vertical difference filters can be taken as

[ -1  1 ]   and   [ -1  1 ]' (its transpose).

These filters find the vertical and horizontal edges in an image. In order to also provide a smoothing effect, a combined filter is used, given by the Prewitt operators below.

Prewitt filter for vertical edge detection:

-1  0  1
-1  0  1
-1  0  1

Prewitt filter for horizontal edge detection:

-1 -1 -1
 0  0  0
 1  1  1

Second order derivatives: Just as first order derivatives are used for edge detection, the use of second order derivatives is also common. As mentioned earlier, the sum of the second derivatives in both directions is called the Laplacian, defined by

lap f = d2f/dx2 + d2f/dy2 = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y),

and it is implemented by the filter

 0  1  0
 1 -4  1
 0  1  0

which is also called the discrete Laplacian.
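Combining the Prewitt filters and the Laplacian above, a minimal MATLAB edge-detection sketch could look as follows; the Image Processing Toolbox (imread, imshow), the sample image 'coins.png' and the gradient threshold of 40 are assumptions.

% Edge detection with Prewitt filters and the Laplacian.
img = double(imread('coins.png'));

prewittV = [-1 0 1; -1 0 1; -1 0 1];     % responds to vertical edges
prewittH = [-1 -1 -1; 0 0 0; 1 1 1];     % responds to horizontal edges
lap      = [0 1 0; 1 -4 1; 0 1 0];       % discrete Laplacian

gx = conv2(img, prewittV, 'same');       % horizontal gradient component
gy = conv2(img, prewittH, 'same');       % vertical gradient component
gradMag = sqrt(gx.^2 + gy.^2);           % gradient magnitude
edges   = gradMag > 40;                  % threshold the magnitude

lapResp = conv2(img, lap, 'same');       % second-derivative response

figure;
subplot(1,3,1); imshow(uint8(img));  title('Original');
subplot(1,3,2); imshow(edges);       title('Prewitt + threshold');
subplot(1,3,3); imshow(lapResp, []); title('Laplacian response');

In practice the toolbox function edge offers the same operators ('prewitt', 'sobel', 'log') with automatic threshold selection.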

The Laplacian has the property of rotation invariance: applying the Laplacian to an image and then rotating the result gives the same output as rotating the image first and then applying the Laplacian. The major problem with Laplacian filters is that they are sensitive to noise.

Segmentation: Segmentation is the generic name for techniques by which an image is subdivided into its constituent regions or objects. The main purpose of these techniques is to separate the information contained in the image into mutually exclusive regions that can later be used for other purposes. Segmentation plays a vital role in image processing because it is the first important step to be taken before other tasks such as feature extraction or classification can be performed. All segmentation techniques are based on two basic approaches: 1. Edge methods, which detect edges as a means of identifying the boundaries between regions by finding sharp differences in pixel intensities. 2. Region methods, which assign pixels to different regions based on a predefined selection criterion. Having already discussed edge detection, which is also part of the segmentation toolbox, we now discuss the region methods used in segmentation.

Region growing and splitting: Region growing is a segmentation approach in which pixels are grouped into larger regions based on predefined similarity criteria. The process starts by selecting a number of seed pixels (also called nuclei pixels) distributed over the image and appending neighboring pixels to the same region if they satisfy a similarity criterion of intensity, color or another property. The seed regions act as nuclei for subsequent growing and merging, and the small regions thus formed are combined into larger regions to create the final segmentation. Region splitting employs a similar philosophy but takes the reverse approach: the whole image is treated as a single region, which is then successively broken down into smaller and smaller regions until any further subdivision would make the difference between adjacent regions fall below some threshold value. The most used example of region splitting is the split-and-merge technique.

Chapter 3: Image Analysis Techniques

Image analysis is a collection of techniques used to extract information from stored images that have already been treated with the image processing techniques above. These techniques include object recognition, feature extraction and the extraction of depth information.

Object recognition: An object may be recognized by its features, which normally include gray levels and morphological features such as area, perimeter and moments.

a. Gray levels: The different parts or objects in an image can be identified by checking their average, maximum or minimum gray levels. For example, an object may have three parts, each with a different color or texture; if the average, maximum and minimum gray levels of each are found, the parts can be recognized by comparing these values.

b. Morphological features: These include perimeter, area, diameter and so on. The perimeter of an object may be found by applying an edge detection routine and then counting the number of pixels on the perimeter; the area can be calculated with the region growing techniques discussed in Chapter 2.

c. Aspect ratio: The aspect ratio is defined as the width-to-length ratio of a rectangle enclosing the object.

Summary

In this essay we have discussed the basics of the image processing and analysis methods currently in use. Different examples, implemented in MATLAB, have been used to make the theory clearer. Image processing techniques are an essential part of the robotics domain, and their successful implementation is key to building reliable robotic systems.

References

1. MathWorks, MATLAB Documentation Manual.
2. S. B. Niku, Introduction to Robotics: Analysis, Control, Applications, 2nd edition.
3. C. Solomon and T. Breckon, Fundamentals of Digital Image Processing.
4. R. C. Gonzalez, R. E. Woods and S. L. Eddins, Digital Image Processing Using MATLAB.
