GISC 9216-D2 Fundamentals of PCA 2/12/2014 Instructor Janet Finlay
OLAWALE BABALOLA
20 Hill Park Lane, St. Catharines, ON. L2N 1C6 engr_josla@yahoo.com
OLWALE BABALOLA PCA ANALYSIS 1
20 Hill Park Lane, St. Catharines, Ontario L2N 1C6
Phone: (289)990-6367, Email: engr_josla@yahoo.com

February 12, 2014

Janet Finlay
GIS-GM Professor (Coordinator)
Niagara College
135 Taylor Road
Niagara-on-the-Lake, ON L0S 1J0

Dear Janet,

RE: Submission of GISC9216 Assignment #2

Please accept this letter as my formal submission of Assignment #2 for GISC9216. The submission contains a formal written summary report, prepared as an MS Word document, with answers to Assignment #2, Fundamentals of Principal Component Analysis (PCA) using ERDAS.

This assignment is a continuation of our work in digital image processing. It has taught me another concept in digital image processing: how to carry out Principal Component Analysis (PCA). It also gave me the opportunity to use the ERDAS software effectively in carrying out this analysis, from viewing the raster image and creating a subset to examining histograms and displaying band channels for comparison. The overall findings of this analysis show that the classification of the PCA result is better and more distinctive than the original unsupervised classification.

Should you have any questions regarding my deliverable, please do not hesitate to contact me by phone at (289)990-6367 or by email at engr_josla@yahoo.com. I look forward to your favorable comments.
ABSTRACT
This workshop is based on digital image processing, narrowed down to principal component analysis (PCA). A PCA was carried out on a subset image created in the previous workshop, with the aim of reducing redundancy and compressing the data; redundancy is a major issue in classifying digital image data accurately. The PCA result was then classified using the unsupervised method of classification, and that result was compared with the unsupervised classification from the previous workshop to determine which gives the more accurate image classification. The analysis has shown that classifying the PCA result yields a more accurate and more distinctive classification than the original unsupervised classification.
CONTENTS
Abstract
Table of Figure
List of Table
1.0 Introduction
    1.1 Principal Component Analysis (PCA)
    1.2 Background
    1.3 Procedure
2.0 Answers to Questions
3.0 Comparison between the two Classification Results
4.0 Conclusion
Bibliography
Appendix
    Unsupervised Thematic Map
    PCA Thematic Map
TABLE OF FIGURE
Figure 1-Principal Component Analysis
Figure 2-Subset Band Channels
Figure 3-Subset Histogram
Figure 4-PCA Band Channels and Histogram
Figure 5-PCA and Unsupervised Image
Figure 6-PCA and Unsupervised Urban Comparison
Figure 7-PCA and Unsupervised Agriculture Comparison
LIST OF TABLE
Table 1-Eigenvalues with Total Percentage
1.0 INTRODUCTION
1.1 PRINCIPAL COMPONENT ANALYSIS (PCA)
Principal component analysis (PCA) is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences. Because patterns can be hard to find in high-dimensional data, where graphical representation is not available, PCA is a well-suited tool for analyzing such data. It is also a method for transforming a set of correlated variables into a new set of uncorrelated variables, with the aim of reducing data redundancy in the image.

In PCA, each variable is transformed into a linear combination of orthogonal common components (output raster maps) ordered by decreasing variance. This allows the number of output maps to be reduced, because the last few transformed maps have little or no variation left. The linear transformation assumes that the components together explain all of the variance in each variable. Hence each component (output raster map) carries different information that is uncorrelated with the other components. PCA can also be used as a pre-processing step prior to classification of the data and to find targets of interest; for example, the component with the lowest variance may contain some interesting information.

This analysis is aimed at carrying out PCA on a subset image to reduce redundancy and compress the data; the final result will be used in classifying the area of interest (AOI). The analysis will be carried out using ERDAS software.
1.2 BACKGROUND
A background study was carried out prior to this analysis using a subset created from a raster image, with the subset size set to 512 by 512 pixels. The subset image was then classified, with the aim of categorizing the pixels in the digital image into several land cover classes, using both the supervised and the unsupervised methods of classification.
The categorized data were then used to produce thematic maps showing the various pixel classifications. A comparison of the results from the two classification methods showed a high degree of redundancy in the multispectral image.
1.3 PROCEDURE
The subset image created in the previous workshop was loaded in the ERDAS interface. The image was then explored band by band using the feature space image (scatter plot), with the aim of identifying the bands displaying strong correlation. The principal component analysis was then carried out on the subset image with the appropriate settings in the PCA module. For this analysis the number of components desired was set to three (3), as instructed. Figure 1 illustrates the procedure. Finally, the PCA result was classified with the unsupervised method, using the same number of classes and maximum iterations as before for consistency in the analysis.
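The transformation step of the procedure can be illustrated outside ERDAS with a minimal NumPy sketch. This is not the ERDAS PCA module; it only shows the underlying idea of projecting pixels onto the eigenvectors of the band covariance matrix and keeping three components. The function name `pca_transform`, the synthetic 64-by-64 image, and the choice of six bands are assumptions made for illustration.

```python
import numpy as np

def pca_transform(image, n_components=3):
    """Project a (rows, cols, bands) image onto its leading principal components."""
    rows, cols, bands = image.shape
    # Treat every pixel as one observation with `bands` variables.
    pixels = image.reshape(-1, bands).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Band-by-band covariance matrix (bands x bands).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order, so reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    # Keep only the requested number of components.
    scores = centered @ eigvecs[:, :n_components]
    return scores.reshape(rows, cols, n_components), eigvals

# Synthetic stand-in for the subset image (hypothetical data, six bands assumed):
# every band is a scaled copy of one shared signal plus a little noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(64, 64, 1))
image = base * np.arange(1, 7) + 0.1 * rng.normal(size=(64, 64, 6))

pcs, eigvals = pca_transform(image, n_components=3)
print(pcs.shape)
print(eigvals)
```

The eigenvalues returned alongside the component image are what a table such as Table 1 would be built from.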
FIGURE 1-PRINCIPAL COMPONENT ANALYSIS
2.0 ANSWERS TO QUESTIONS
(1) It is essential to transform the original image bands into principal components because of interband correlation, a problem frequently encountered in analyzing a multispectral image. Images generated from digital data in different wavelength bands often appear the same or similar; that is, the bands are redundant and convey essentially the same information.
(2)
FIGURE 2-SUBSET BAND CHANNELS
The colors in the feature space images of the band pairs shown in Figure 2 reflect the density of points for both bands: bright tones represent a high density and dark tones a low density. Among the bands, the signatures form straight, linear trends with little or no scatter, which indicates a strong correlation. The histograms in Figure 3 also show the pixel values clustered in a narrow range on one side of the graph. This further indicates high redundancy in the data.
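The visual judgment of correlation from the feature space images can be backed by a number. The sketch below, which uses NumPy rather than ERDAS, computes the correlation coefficient for every band pair; values close to 1 correspond to the straight, linear trends described above. The helper name `band_correlation` and the three synthetic bands are assumptions for illustration only.

```python
import numpy as np

def band_correlation(image):
    """Return the bands x bands correlation matrix of a (rows, cols, bands) image."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    return np.corrcoef(pixels, rowvar=False)

# Hypothetical data: bands A and B share one signal, band C is unrelated noise.
rng = np.random.default_rng(1)
shared = rng.normal(size=(32, 32))
band_a = shared + 0.05 * rng.normal(size=(32, 32))
band_b = 2.0 * shared + 0.05 * rng.normal(size=(32, 32))
band_c = rng.normal(size=(32, 32))
image = np.stack([band_a, band_b, band_c], axis=-1)

corr = band_correlation(image)
print(np.round(corr, 3))
```

In this synthetic case the A-B entry is close to 1, the kind of redundancy PCA is meant to remove, while the entries involving band C are close to 0.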
FIGURE 3-SUBSET HISTOGRAM
(3) The variance carried by the first three band channels of the PCA result was determined from the eigenvalues. An eigenvalue is a number that indicates how much variance there is in the data along a principal component. The eigenvalues therefore show how much of the variance each principal component represents and, expressed as percentages, its share of the total variance. The first eigenvalue is the largest and represents the most variance in the data; it tells us how spread out the data are along the first component. Table 1 shows the eigenvalues and the percentage of total variance for the first three components.
TABLE 1- EIGENVALUES WITH TOTAL PERCENTAGE
Eigenvalue          Percentage of total variance
975.9553886         59.56376%
592.377442          36.15353%
55.49854865         3.387145%
9.857151912
3.171655456
1.645124367
Total (first three components)   99.10443%
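The percentages in Table 1 follow from dividing each eigenvalue by the sum of all six eigenvalues. The short sketch below reproduces that arithmetic from the values listed in the table; the variable names are illustrative.

```python
# Eigenvalues as listed in Table 1.
eigenvalues = [
    975.9553886,
    592.377442,
    55.49854865,
    9.857151912,
    3.171655456,
    1.645124367,
]

total = sum(eigenvalues)
# Share of the total variance carried by each component, in percent.
percentages = [100.0 * value / total for value in eigenvalues]
cumulative_first_three = sum(percentages[:3])

for index, (value, share) in enumerate(zip(eigenvalues, percentages), start=1):
    print(f"PC{index}: eigenvalue = {value:.4f}, share = {share:.5f}%")
print(f"First three components together: {cumulative_first_three:.5f}%")
```

The first three shares agree with the table entries, and together they account for about 99.1% of the total variance, which is why keeping three components loses very little information.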
(4) In comparison, the bands of the PCA data shown in Figure 4 are uncorrelated and independent, and they are often more interpretable than the source data. The feature space image of the PCA bands displays a weak correlation and the histograms show pixel values spread over a wide range, whereas the original data show a very strong correlation with little spread, as shown in Figures 2 and 3.
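The statement that the PCA bands are uncorrelated can be checked numerically. The following NumPy sketch, which uses hypothetical three-band data and is not ERDAS output, compares the band correlations before and after the transformation.

```python
import numpy as np

# Hypothetical pixels: three bands that all follow one shared signal plus noise.
rng = np.random.default_rng(2)
shared = rng.normal(size=(4096, 1))
pixels = shared * np.array([1.0, 0.8, 1.2]) + 0.1 * rng.normal(size=(4096, 3))

# PCA via the eigenvectors of the band covariance matrix.
centered = pixels - pixels.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
# Reverse the column order so the component with the largest variance comes first.
scores = centered @ eigvecs[:, ::-1]

corr_before = np.corrcoef(pixels, rowvar=False)
corr_after = np.corrcoef(scores, rowvar=False)

# Look only at the off-diagonal entries, i.e. correlations between different bands.
off_diagonal = ~np.eye(3, dtype=bool)
print("smallest |correlation| between original bands:",
      np.abs(corr_before[off_diagonal]).min())
print("largest |correlation| between PCA bands:",
      np.abs(corr_after[off_diagonal]).max())
```

The original bands are strongly correlated with one another, while the correlations between the PCA bands are zero up to floating-point rounding.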
FIGURE 4- PCA BAND CHANNELS AND HISTOGRAM (6)
3.0 COMPARISON BETWEEN THE TWO CLASSIFICATION RESULTS
Comparing the unsupervised classification of the PCA result with the original unsupervised classification, the PCA result showed a high degree of variation, with a more spread-out and sharper image. After a close comparison of the pixel classes in each result, the PCA classification shows a higher accuracy than the original unsupervised classification. This is because the redundant data were compacted into fewer bands and the dimensionality of the data was reduced. The bands of the PCA data are uncorrelated and independent, and they are often more interpretable than the source data, as shown in Figure 5.
FIGURE 5-PCA AND UNSUPERVISED IMAGE
Panels: PCA-Unsupervised Classification; Unsupervised Classification
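Both results in Figure 5 come from an unsupervised classifier run with the same number of classes and the same maximum number of iterations. The report does not name the clustering algorithm used in ERDAS, so the sketch below substitutes plain k-means as a stand-in to show how such a classifier groups pixels. The function names, the two synthetic clusters, and the parameter values are all assumptions for illustration.

```python
import numpy as np

def assign_labels(pixels, centers):
    """Return the index of the nearest class center for every pixel."""
    distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans_classify(pixels, n_classes, max_iter, seed=0):
    """Plain k-means, used here only as a stand-in for an unsupervised classifier."""
    rng = np.random.default_rng(seed)
    # Start from randomly chosen pixels as the initial class centers.
    centers = pixels[rng.choice(len(pixels), size=n_classes, replace=False)]
    for _ in range(max_iter):
        labels = assign_labels(pixels, centers)
        # Move each center to the mean of its pixels; keep it if the class is empty.
        new_centers = np.array([
            pixels[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_classes)
        ])
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return assign_labels(pixels, centers), centers

# Hypothetical three-component pixels forming two well-separated groups.
rng = np.random.default_rng(3)
group_a = rng.normal(loc=0.0, scale=0.5, size=(100, 3))
group_b = rng.normal(loc=10.0, scale=0.5, size=(100, 3))
pc_pixels = np.vstack([group_a, group_b])

labels, centers = kmeans_classify(pc_pixels, n_classes=2, max_iter=20)
print(np.bincount(labels))
```

Running the same function with identical `n_classes` and `max_iter` on the original bands and on the PCA bands mirrors the consistency requirement stated in the procedure.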
As mentioned earlier, the PCA result gives a clearer, sharper image. It has also greatly helped in the classification of urban areas by distinctly separating them from the other classes, and it shows a high degree of variability among the other classes. The red ring in Figure 6 marks the urban area in the original unsupervised classification, which shows strong redundancy and clustered classes, while the yellow ring marks the same area in the PCA unsupervised classification, where the urban area and road networks are distinctly separated from the other classes and the classes are more spread out.
FIGURE 6-PCA AND UNSUPERVISED URBAN COMPARISON
Panels: Unsupervised; PCA
Based on the analysis results, the PCA result has helped in clearly separating forest from agriculture. This classification further assists in differentiating the heavy vegetation from the densely populated areas. It also helped in separating the healthy agriculture fields from the unhealthy ones. In Figure 7, the red ring shows highly redundant classification of vegetation and agriculture fields in the unsupervised image, while the yellow ring shows a distinct separation between the agriculture fields and the vegetation in the PCA unsupervised image.
FIGURE 7-PCA AND UNSUPERVISED AGRICULTURE COMPARISON
Panels: Unsupervised; PCA
4.0 CONCLUSION
In conclusion, principal component analysis (PCA) proved to be of great importance in digital image processing and classification. From the individual results, it is clear that a PCA should be carried out to obtain an accurate classification: although the images may appear similar, PCA helps to reduce or eliminate redundancy in the data and gives a sharper, clearer image for a better analysis result.
BIBLIOGRAPHY
Lillesand, Kiefer, Chipman. Remote Sensing and Image Interpretation, Sixth Edition.
ERDAS Field Guide, 2010.
Janet Finlay. Image Fusion and Principal Component Analysis (1), lecture note.