
GISC9216 Deliverable 2

Principal Component Analysis

Cover Art Produced By: Lisa Atkinson

Lisa Atkinson Niagara College 2/13/2013

Janet Finlay
Program Coordinator
GIS-Geospatial Management
Niagara College
135 Taylor Road
Niagara on the Lake, ON L0S 1J0

Dear Ms. Finlay,

RE: GISC9216 Deliverable 2 Principal Component Analysis

Please accept this letter as my formal submission of Deliverable 2: Principal Component Analysis for GISC 9216 Digital Image Processing. This deliverable contains one document presenting the appropriate application of principal component analysis and its inherent benefits. A comprehensive summary of results demonstrates familiarity with ERDAS Imagine 2010 platform functions.

Should you have any questions or concerns regarding the enclosed document or supplementary materials, please contact me at your convenience by email at lisaclaire87@gmail.com or by phone at (705) 499-6768. Thank you for your time and attention. I look forward to your comments and suggestions.

Sincerely,

Lisa Atkinson

Lisa Atkinson BA (Honours) Geography, Nipissing University GIS-GM Certificate Candidate


L.A./l.a.

Enclosures:

1) Introduction to Supervised Classification

GISC9216-D2 February 13, 2013 Lisa Atkinson

Executive Summary

Principal Component Analysis (PCA) is a valuable tool in the examination of multispectral remotely sensed data. The original data are compressed and stripped of redundancy, while nearly all of the information available in the original data set is retained. Remotely sensed data, especially multispectral images, are widely used in real-world applications such as studies of deforestation and urban sprawl. A PCA can increase computational efficiency and classification accuracy. Overall, PCA preserves data integrity by eliminating redundancy while retaining the majority of the unique data values.


Table of Contents


Executive Summary ... i
1.0 Introduction ... 1
2.0 Background ... 1
3.0 Goal Statement ... 3
4.0 Methodology ... 3
5.0 Findings and Discussion ... 7
    5.1 Purpose ... 7
    5.2 Band Correlation ... 7
    5.3 PCA Results ... 10
    5.4 Comparisons ... 10
    5.5 Unsupervised Classifications ... 11
    5.6 Urban vs. Agriculture ... 14
6.0 Conclusions ... 16
7.0 Bibliography ... 17

List of Figures
Figure 1: Feature Space Images ... 3
Figure 2: Eigenvector Transformation ... 4
Figure 3: PCA Specifications ... 4
Figure 4: Dimensionality Within Data ... 7
Figure 5: PCA Channel Histograms ... 10
Figure 6: Vegetation Classification ... 14
Figure 7: Agricultural Lands ... 14
Figure 8: Urban Misclassification ... 15

List of Tables
Table 1: Original Image Investigation ... 8
Table 2: PCA Channel Variance ... 10
Table 3: PCA Variance Results ... 11

Map Layouts
Subset Image Defining Area of Interest for Land Cover Classification Techniques, in Pseudo-Colour ... 2
Principal Component Analysis Applied to Original Subset Image ... 6
Land Use Classification: Unsupervised Classification Results ... 12
Unsupervised Classification Results Applied to PCA Subset Image ... 13


1.0 Introduction

Principal Component Analysis (PCA) is a valuable tool in the examination of multispectral remotely sensed data. Remotely sensed data, especially multispectral images, are widely used in real-world applications such as studies of deforestation and urban sprawl. A principal component analysis is executed to increase the computational efficiency of the classification process (Lillesand, 2008). PCA transforms the original remotely sensed data set into a substantially smaller set of components, reducing both redundancy and dimensionality (Jensen, 2005). The ultimate goal is to create a data summary that is easily displayed and interpreted and free of redundancy, yet recovers practically all of the information in the original sensor image.

2.0 Background

This document outlines the methodology and results of a principal component analysis. The original data, provided by Niagara College, consist of eight GeoTIFF files depicting a Landsat 7 (Path 18, Row 29) image of Muskoka, Toronto, and the Kawarthas, captured September 19, 1999, with a pixel resolution of 30 meters by 30 meters. The GeoTIFF files are as follows:

018029_0100_990919_17_1_utm17
018029_0100_990919_17_2_utm17
018029_0100_990919_17_3_utm17
018029_0100_990919_17_4_utm17
018029_0100_990919_17_5_utm17
018029_0100_990919_17_6_utm17
018029_0100_990919_17_7_utm17
018029_0100_990919_17_8_utm17

These files are stacked to create a single (.img) file, excluding bands 6 and 8, so that six bands remain in the stack. A subset image is then captured. For the duration of this investigation, this subset is referred to as the original data set, upon which both PCA and unsupervised classification techniques are executed. Finally, the image is displayed using the UTM NAD 83 Zone 17N projection. The map Subset Image Defining Area of Interest for Land Cover Classification Techniques, in Pseudo-Colour, located on page 2 of this document, displays the subset image.
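The band selection described above can be sketched in a few lines of Python. This is only an illustration of which files enter the stack: the `.tif` extension, the variable names, and the assumption that layers are stacked in ascending band order are not taken from the report, and the actual layer stack was built in ERDAS Imagine.

```python
# File name pattern taken from the list above; the ".tif" extension is an assumption.
PATTERN = "018029_0100_990919_17_{band}_utm17.tif"

all_bands = range(1, 9)   # the eight files listed above
excluded = {6, 8}         # bands left out of the stack

# Bands that go into the stacked image, assumed to be in ascending order.
stack_bands = [b for b in all_bands if b not in excluded]
stack_files = [PATTERN.format(band=b) for b in stack_bands]

for layer, (band, name) in enumerate(zip(stack_bands, stack_files), start=1):
    print(f"layer {layer}: band {band} -> {name}")
```

Under these assumptions the stack holds six layers, which matches the six bands compared later in the document.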


3.0 Goal Statement

Using ERDAS Imagine 2010 as the platform for this investigation, a principal component analysis is executed upon a defined area of interest. This document explores the fundamental concepts of PCA and the resulting benefits. An unsupervised classification of the original image is compared in full with an unsupervised classification of the PCA image. The aim is to compress the original data and remove redundancy while retaining the information available in the original data set.

4.0 Methodology

A GIS best practice, when analyzing any data set or image, is to examine the data first in order to understand its characteristics and behaviour. Therefore, the first step in performing a principal component analysis is viewing the histograms and feature space images of the original image. Feature space images are graphical representations of the data set characteristics, in which the pixel values of one band are plotted against the values of another band. These images are produced for every possible band-to-band pairing. Figure 1 displays an example of a correlated pairing of bands and an uncorrelated pairing of bands. The presence of a correlation implies that the two bands share information (Jensen, 2005). More simply, two bands displaying the same information cause a high level of redundancy.

Figure 1: Feature Space Images (panels: Correlated; No Correlation)

It is unnecessary for information to be duplicated. Furthermore, duplicated information slows computation, as the platform must interpret the same information multiple times. Hence, data set redundancy should be eliminated.
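As a rough illustration of the band comparison described above, the following Python sketch builds a feature space image as a 2D histogram and measures band-to-band correlation. The three bands are synthetic values invented for the example, not the Landsat data, and the report itself relied on visual inspection in ERDAS Imagine rather than on this calculation.

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 10_000

# Synthetic brightness values standing in for real band data (assumption).
band_a = rng.normal(80, 15, n_pixels)
band_b = 0.9 * band_a + rng.normal(0, 3, n_pixels)   # nearly duplicates band_a
band_c = rng.normal(60, 20, n_pixels)                # unrelated to band_a

# A feature space image is essentially a 2D histogram of one band against another.
feature_space_ab, _, _ = np.histogram2d(band_a, band_b, bins=64)

# Pearson correlation for every pairing; values near 1 or -1 flag redundancy.
corr = np.corrcoef(np.vstack([band_a, band_b, band_c]))
print(np.round(corr, 2))

# Number of band-to-band pairings for a six-band stack.
pairings = list(combinations(range(1, 7), 2))
print(len(pairings))
```

In this sketch the first pairing is strongly correlated and the pairings with `band_c` are not, mirroring the two panels of Figure 1.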

Once the data are examined, a spectral principal component analysis is executed to remove information redundancy. This is achieved via the ERDAS Imagine 2010 platform. PCA is a mathematical technique which transforms original image data, containing correlated information, into uncorrelated channels (Landmap, 2012). PCA image data values are linear combinations of the original data values, multiplied by the appropriate transformation coefficients, statistically known as eigenvectors (Lillesand, 2008). More simply, eigenvectors are used to translate, rotate, and redistribute the data values to produce uncorrelated channels. This process is illustrated in Figure 2.

Figure 2: Eigenvector Transformation (axes: Brightness Values; overlay: Transformation Axis)

This transformation is automated. However, the user specifies its parameters. The parameters specified for the original data set include the number of PCA output components and the output of the eigenmatrix and eigenvalue reports. Figure 3 displays these specifications.

Figure 3: PCA Specifications
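The transformation can be sketched with NumPy as an eigen decomposition of the band covariance matrix. This is a minimal illustration under stated assumptions, not a description of how ERDAS Imagine computes its result: the function name `pca_transform`, the synthetic six-band pixel array, and the choice of the covariance (rather than correlation) matrix are all assumptions made for the example.

```python
import numpy as np

def pca_transform(pixels, n_components):
    """pixels has shape (n_pixels, n_bands); returns scores, eigenvalues, eigenvectors."""
    centered = pixels - pixels.mean(axis=0)            # translate to the band means
    cov = np.cov(centered, rowvar=False)               # band-to-band covariance
    eigenvalues, eigenvectors = np.linalg.eigh(cov)    # solver for symmetric matrices
    order = np.argsort(eigenvalues)[::-1]              # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = centered @ eigenvectors[:, :n_components]  # rotate onto the new axes
    return scores, eigenvalues, eigenvectors

# Synthetic six-band data: three bands share a common signal, three are independent.
rng = np.random.default_rng(1)
base = rng.normal(0, 10, (5000, 1))
correlated = np.hstack([base + rng.normal(0, 1, (5000, 1)) for _ in range(3)])
pixels = np.hstack([correlated, rng.normal(0, 5, (5000, 3))])

scores, eigenvalues, _ = pca_transform(pixels, n_components=3)

# Off-diagonal values close to zero indicate uncorrelated output channels.
print(np.round(np.corrcoef(scores, rowvar=False), 3))
```

Here `n_components=3` plays the role of the "number of PCA output components" parameter, and the returned eigenvalues and eigenvectors correspond to the eigenvalue and eigenmatrix reports.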


If executed correctly, there is no longer any correlation, or redundancy, between channels (bands). The PCA is used to extract only the information that is unique. However, it is also necessary to ensure that no significant information is lost as a result of performing the PCA. This is accomplished by examining the eigenvalue table to calculate the percentage of unique information retained within the new channels. The map Principal Component Analysis Applied to Original Subset Image, displayed on page 6 of this document, presents the PCA results.

Going through the motions of a principal component analysis is meaningless without a comprehensive understanding of its benefits and applications. Unsupervised classifications are therefore executed upon both the original image and the PCA image, to detect any display advantages of the PCA image.


5.0 Findings and Discussion

The following discussion presents the findings and comparisons necessary to gain a complete understanding of principal component analysis.

5.1 Purpose

Original image bands are transformed to principal components so that all future enhancements and processes are executed accurately and efficiently. Employing a PCA essentially compresses the data and rids the image of information redundancy (Lillesand, 2008). Automated computation is therefore more efficient, as the platform does not analyze the same information multiple times. Figure 4 illustrates, by way of histograms, the similarity of the information displayed by bands 2 and 3, while band 4 contains decidedly different information. The economic potential associated with increased efficiency and accurate results, while recovering virtually all of the information available from the original image, is an obvious advantage of PCA.

Figure 4: Dimensionality within Data (annotations: "Band 4 Histogram: Bimodal (Not Correlated to Bands 2 and 3)"; "Similar Pixel Values: Correlated Bands")

5.2 Band Correlation

In addition to histogram analysis, feature space images comparing all possible band pairings are examined in order to understand the characteristics of the original image. Table 1, commencing on page 8 of this document, summarizes the band comparisons, correlation relationships, and extent of information redundancy. This serves as the rationale for employing PCA.


Table 1: Original Image Investigation

Band Pairing   Correlation Status   Correlation Strength   Description/Rationale
1-2            Correlated           Strong                 Large amount of redundancy between bands 1 and 2
1-3            Correlated           Moderate               Significant amount of redundancy between bands 1 and 3
1-4            Not Correlated       -                      No redundancy between bands 1 and 4
1-5            Not Correlated       -                      No redundancy between bands 1 and 5
1-6            Correlated           Weak                   Small amount of redundancy between bands 1 and 6
2-3            Correlated           Strong                 Large amount of redundancy between bands 2 and 3
2-4            Not Correlated       -                      No redundancy between bands 2 and 4
2-5            Correlated           Very Weak              Minimal amount of redundancy between bands 2 and 5
2-6            Correlated           Moderate               Significant amount of redundancy between bands 2 and 6
3-4            Not Correlated       -                      No redundancy between bands 3 and 4
3-5            Correlated           Very Weak              Minimal amount of redundancy between bands 3 and 5
3-6            Correlated           Moderate               Significant amount of redundancy between bands 3 and 6
4-5            Not Correlated       -                      No redundancy between bands 4 and 5
4-6            Not Correlated       -                      No redundancy between bands 4 and 6
5-6            Correlated           Strong                 Large amount of redundancy between bands 5 and 6

(The Example Image column, showing the feature space image for each pairing, is not reproduced.)


5.3 PCA Results

Since nine of the fifteen possible band pairings prove to be correlated, it is necessary to apply a principal component analysis to the original image. The inherent danger in data compression is the loss of information. The goal of PCA is to compress the data and eliminate redundancy without sacrificing valuable information. The PCA executed upon the subset image is successful, in that approximately 99.5 percent of the original information is retained. Because the PCA image consists of three channels, Table 2 reports the percentage of information retained by the first three channel outputs.

Table 2: PCA Channel Variance

PCA Channel   Eigenvalue      Percent
1             2049.10064      75.06436
2             620.9246998     22.74623
3             45.84825932     1.679552
4             8.304207152     0.304207
5             4.035314758     0.147825
6             1.578465469     0.057824
Sum           2729.791587     100

Percent retained by channels 1 to 3: 99.49014
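The percentages in Table 2 follow directly from the eigenvalues: each channel's share is its eigenvalue divided by the sum of all eigenvalues. The short Python sketch below reproduces that arithmetic from the table's values; the variable names are illustrative only.

```python
# Eigenvalues as listed in Table 2, one per PCA channel.
eigenvalues = [2049.10064, 620.9246998, 45.84825932,
               8.304207152, 4.035314758, 1.578465469]

total = sum(eigenvalues)
percent = [100 * value / total for value in eigenvalues]

# Share of the total variance kept by the three output channels.
retained_first_three = sum(percent[:3])

for channel, (value, share) in enumerate(zip(eigenvalues, percent), start=1):
    print(f"channel {channel}: eigenvalue {value:.4f}, {share:.5f} percent")
print(f"channels 1 to 3: {retained_first_three:.5f} percent")
```

The result for the first three channels is the roughly 99.5 percent figure quoted in Section 5.3.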

5.4 Comparisons

In contrast to the histogram similarities observed for the bands of the original image, displayed in Figure 4, each channel of the PCA image contains unique information. Figure 5 illustrates each channel, and the associated pixel values, by way of histograms.

Figure 5: PCA Channel Histograms


Furthermore, the feature space images for the PCA results, displayed in Table 3, show no inter-channel correlation. These findings have several implications. First, the data have been compressed from six bands to three channels. A PCA always produces a number of components equal to or fewer than the number of bands in the original image (Lillesand, 2008). Second, the lack of correlation between the PCA channels indicates that there are no data redundancies. All information presented is unique and meaningful. This is a vast improvement over the extensive duplication in the original image, as summarized in Table 1.

Table 3: PCA Variance Results (feature space images for channel pairings 1-2, 1-3, and 2-3)

5.5 Unsupervised Classifications

Two unsupervised classifications are performed: one on the original image and one on the PCA image. The results are displayed in the map Land Use Classification: Unsupervised Classification Results, located on page 12 of this document, and the map Unsupervised Classification Results Applied to PCA Subset Image, located on page 13 of this document.
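The report does not state which clustering algorithm or settings were used for the unsupervised classifications. As a rough sketch of the general idea only, the Python example below groups pixels with plain k-means, which is a simpler stand-in and not necessarily the method applied in ERDAS Imagine. The function name `kmeans`, the number of classes, and the synthetic pixel values are all assumptions.

```python
import numpy as np

def assign(pixels, centers):
    """Label each pixel with the index of its nearest class mean."""
    distances = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return distances.argmin(axis=1)

def kmeans(pixels, n_classes, n_iterations=20, seed=0):
    rng = np.random.default_rng(seed)
    # Start from randomly chosen pixels as the initial class means.
    centers = pixels[rng.choice(len(pixels), n_classes, replace=False)]
    for _ in range(n_iterations):
        labels = assign(pixels, centers)
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members):               # keep the old mean if a class is empty
                centers[k] = members.mean(axis=0)
    return assign(pixels, centers), centers

# Synthetic three-channel pixels forming two well-separated spectral groups.
rng = np.random.default_rng(2)
group_a = rng.normal(20, 2, (200, 3))
group_b = rng.normal(80, 2, (200, 3))
pixels = np.vstack([group_a, group_b])

labels, centers = kmeans(pixels, n_classes=2)
print(np.bincount(labels))
```

The same function could be run once on the six-band pixel array and once on the three-channel PCA array, so that the two label maps can be compared as in the maps referenced above.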


5.6 Urban vs. Agriculture

The PCA unsupervised classification displays noticeable improvements in the accuracy of both urban and agricultural classification. The building and associated linear features displayed in Figure 6 suggest the presence of a small airstrip for bush planes. The formal maps, located on pages 12 and 13 of this document, show no large urban center in close proximity. Therefore, it is reasonable to assume the airstrip is a field landing strip. The original image classification denotes this area as bare earth (brown) and appears highly detailed; the PCA image classification is more accurate, as the landing strip is classified as field (gold). The PCA image is free of redundant information, so the platform analyzes each unique value once.

Figure 6: Vegetation Classification (panels: Original Image; Unsupervised Classification; PCA Classification)

The PCA classification results provide a more accurate distinction between vegetation and bare earth, which proves extremely useful in the examination of agricultural lands. Figure 7 illustrates a single field classified as urban (red), grass (bright green), and cultivated field (gold) within the unsupervised classification of the original image. The PCA classification is far closer to reality.

Figure 7: Agricultural Lands (panels: Original Image; Unsupervised Classification; PCA Classification)


As observed in Figure 6 and Figure 7, the PCA results improved the accuracy of land cover classification in terms of distinguishing agricultural land from other feature types. PCA classifications also distinguish urban features more accurately. Figure 8 illustrates the excessive amount of urban (red) misclassification within the original image classification. The only true urban element is depicted by the circled area. It is impossible to eliminate pixel misclassification completely; however, the PCA classification, as observed in Figure 8, significantly reduces the misclassification of agricultural and urban features. Thus, PCA image classifications are helpful in accurately depicting areas of agriculture, bare earth, and urban features.

Figure 8: Urban Misclassification (panels: Original Image; Unsupervised Classification; PCA Classification)

Redundant dimensionality, or duplication, of information within data may result in inaccurate image analysis and display results (Jensen, 2005). Any future enhancements applied to the PCA image will yield unique and meaningful results, as the PCA image presents quality base data. Applying enhancements to an image containing redundancy often causes misclassification, or results that are less precise than they could be. To gain quality results, quality data must be used.


6.0 Conclusions

It is imperative for the GIS professional to investigate and understand data characteristics. This leads to informed decision making and the appropriate application of enhancements and transformation techniques. Principal component analysis is one such data transformation, which eliminates duplicated information while maintaining virtually all unique data values. This investigation demonstrates the importance of performing a PCA and the associated benefits as applied to land cover classification. In this investigation, PCA image classification produced more accurate results. Thus, it is an economically viable venture.


7.0 Bibliography

Jensen, J. (2005). Introductory Digital Image Processing: A Remote Sensing Perspective. Upper Saddle River, New Jersey: Pearson Educational, Inc.

Landmap Spatial Discovery. (2012). Principal Components Analysis. Retrieved February 9, 2013 from http://landmap.mimas.ac.uk/index.php/Table/Learning-Materials/Image-Processing-forERDAS/Page-6

Lillesand, Kiefer, & Chipman. (2008). Remote Sensing and Image Interpretation: Sixth Edition. Hoboken, New Jersey: John Wiley and Sons, Inc.
