
See5 & ERDAS IMAGINE CART Module Procedural Example

Prepared March 9, 200


By Jay R. Kost
Environmental Scientist (SAIC)
EROS Data Center
Background
As a new user of the See5 decision tree classifier, I was confident that I could follow
some basic instructional documentation and file examples to obtain a thematic land
cover map based on satellite and elevation data. While the See5 documentation, both
within the module and online, is informative, I eventually found that there are many
subtle details that had to be recognized and dealt with before a See5 classifier could be
built and subsequently used to produce a spatial output.
These subtle details created, at least for me, many headaches and caused the process
of developing a decision tree classifier and classified output to take a greater amount of
time than anticipated. These are the reasons for this document.
Objectives
The objectives of this document do not include going into great detail about all of the
different parameters that can be used for sampling, validation, etc. Rather, the purpose
here is to go through the data preparation and implementation processes so the user
can have more time to experiment with these variables rather than wasting a lot of time
just trying to obtain output.
Preparing Data
Data preparation includes the development of two text files: one that represents "truth"
data used for class definition (e.g. field data) and another that lists the thematic and
continuous data layers (i.e. training data) used to develop the decision tree. The first
has the file extension .data and the second has the extension .names.
In this example, the field data were in the form of database spreadsheets with and
without coordinate information. This fact creates issues with creating a required data
layer (the dependent variable file used later), but doesn't affect the ability to create the .data
file.
There can be several steps involved in getting the dependent variable data (e.g. field data)
into proper .data file format, and there is more than one way to reach the desired
result. The best method and the steps involved depend on the input and on user
preference. Only one method will be discussed here.
The data used in this example initially came in the form of a Microsoft Excel (.xls) file. In
this case the training information (spectral, DEM values) was already included. The
.data file can be created from a database file by following these steps.
1) Delete any fields of information not needed for decision tree creation (e.g.
descriptive information like cover type).
2) Ensure that the fields remaining include those with values associated with all
data layers (satellite data, DEM data) that will be used for training. Also ensure
that the dependent variables (e.g. cover class) are identified for each record.
3) Export the table into comma delimited (.csv) format.
4) Delete the first row (headings) and save to a text file with the .data extension.
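
For readers who would rather script steps 1 through 4 than edit the table by hand, the following is a minimal Python sketch, not part of the original procedure. The column names (band1, band2, dem, cover_name, cover_class) and the sample values are hypothetical; substitute the headings from your own exported .csv file. Here the coordinate columns are dropped along with the descriptive field.

```python
import csv
import io


def table_to_data(csv_text, keep_columns):
    """Return .data text: keep only the named columns and drop the heading row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    # Positions of the columns to keep, in the order they were requested.
    positions = [header.index(name) for name in keep_columns]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:  # skip blank lines left over from the export
            continue
        writer.writerow([row[i].strip() for i in positions])
    return out.getvalue()


# Hypothetical exported table: coordinates, two bands, DEM, description, class.
sample_csv = (
    "x,y,band1,band2,dem,cover_name,cover_class\n"
    "500100,4800200,45,60,1820,Ponderosa Pine,1\n"
    "500400,4800500,52,71,1655,Aspen,31\n"
)

data_text = table_to_data(sample_csv, ["band1", "band2", "dem", "cover_class"])
print(data_text)
```

The column order passed in keep_columns becomes the column order of the .data file, so it should match the order in which the layers will later be listed in the .names file.
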
Figure 1. Partial Excel or Microsoft database file. First two fields may be coordinate
information, but this information is not required. These fields are ignored in tree
creation.
Figure 2. Partial .data file after the .csv file has been edited to delete the heading row.
The procedure above is a simplified example of creating a .data file because the spectral
and other training information were included in the original table. If training data
need to be developed then more steps are involved. The steps below involve
extracting training data from a point coverage or shapefile.
1) Add the point coverage or shapefile to a new IMAGINE viewer and open the
attribute table.
2) Select the coordinate columns, right click, and then export the data into a .dat
file. This is not the .data file.
Figure 3. Exporting the point location information for training data extraction. Make
sure the X coordinate is in the leftmost column.
3) Repeat the above steps to obtain the class labels associated with each point
(e.g. forest class 1 representing Ponderosa Pine).
4) Use the "Convert Pixel to Table" utility in IMAGINE to extract the training data.
Add all of the data layers needed for tree development. Select the coordinate .dat file
generated in step 2 and specify the output .asc file.
Figure 4. Extracting the training data. The .asc file output will be edited to create
the .data file.
5) Bring the .asc file into Excel along with the land cover class information. Cut and
paste the land cover class information as the last column in the .asc file.
6) Export the .asc file to a comma delimited (.csv) file.
7) Edit the .csv file to delete the heading information and save as a .data file.
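
Steps 5 through 7 can also be sketched in Python instead of Excel. This is only an illustration under stated assumptions: the .asc layout shown (one heading line, whitespace-separated values, one row per point) is hypothetical, so check your own file and adjust header_lines before relying on it. The class labels are assumed to be in the same point order as the rows of the .asc file.

```python
def merge_labels(asc_text, labels, header_lines=1):
    """Append one class label to each extracted row and return .data text."""
    rows = [line for line in asc_text.splitlines() if line.strip()]
    rows = rows[header_lines:]  # drop the heading information
    if len(rows) != len(labels):
        raise ValueError(
            f"{len(rows)} extracted rows but {len(labels)} class labels"
        )
    lines = []
    for row, label in zip(rows, labels):
        # Whitespace-separated values become comma-separated, label goes last.
        lines.append(",".join(row.split() + [str(label)]))
    return "\n".join(lines) + "\n"


# Hypothetical extraction output and the class labels exported in step 3.
sample_asc = (
    "X Y band1 band2 dem\n"
    "500100 4800200 45 60 1820\n"
    "500400 4800500 52 71 1655\n"
)
print(merge_labels(sample_asc, [1, 31]))
```

Raising an error when the row and label counts differ guards against pasting labels next to the wrong points, which would otherwise go unnoticed.
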
Prior to creating the .names file, an IMAGINE (.img) dataset needs to be created which
contains all of the dependent variables that are going to be mapped (e.g. forest or shrub
types). If the data used in creating the .data file have coordinate information then you
can convert the coverage or shapefile directly into a .img file. If some or all of the
training data do not have coordinate information then a dependent variable .img file
must be created.
Creating a dependent variable .img file can be accomplished simply by creating a point
coverage or shapefile that has enough points to include at least one point for
each dependent class and enough points to capture all layers and discrete classes in
the training data. For example, if you have 10 forest classes then you need at least 10
points with values equated to the dependent classes (e.g. class 31 equals aspen-birch
so one point would need a value of 31). This holds only if these 10 points also include at
least one point in each of the discrete layers (e.g. date images). If you have 3 date layers that
overlap some, but have more unique areas than can be covered with 10 points, then
more points would have to be used. Once created, the point information can be
converted to a .img file.
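
The coverage requirement described above (at least one point per dependent class and per value of every discrete layer) can be checked before the points are converted. The sketch below is illustrative only; the layer names "class" and "date" and the values are hypothetical.

```python
def missing_values(points, required):
    """Return, per layer, the required values that no point carries.

    points   -- list of dicts mapping layer name to the value at that point
    required -- dict mapping layer name to the set of values that must appear
    """
    missing = {}
    for layer, values in required.items():
        seen = {point[layer] for point in points}
        gap = set(values) - seen
        if gap:
            missing[layer] = sorted(gap)
    return missing


# Two hypothetical points; date layer 3 is not covered by either of them.
points = [
    {"class": 1, "date": 1},
    {"class": 31, "date": 2},
]
required = {"class": {1, 31}, "date": {1, 2, 3}}
print(missing_values(points, required))
```

An empty result means every class and every discrete value is represented; anything else lists where more points are needed.
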
However the dependent variable .img file is created, it is particularly important that it
have exactly the same extent, cell size, rows/columns, etc. as the training data. In fact,
now is a good time to check all the training and dependent data layers for consistent
spatial information. Figure 5 shows an example of an IMAGINE ImageInfo window with
the important characteristics indicated.
Figure 5. A continuous training data layer and the important characteristics that must be
consistent for all continuous layers. For thematic layers, such as the dependent
variable layer and others such as date layers, the "Type" needs to be Thematic. If it is
not, recalculate the statistics with a skip factor of 1 and then change the layer type to
Thematic. All other information shown here (indicated within the red) should be exactly
the same. If it is not, it can be modified using the "Change Map Model" option under the
"Edit" dropdown menu.
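
When there are many layers, comparing the ImageInfo values by eye is error prone. The following Python sketch compares values that have been copied out of ImageInfo into a dictionary per layer. It does not read .img files itself, and the key names (rows, cols, cell_size, upper_left, projection) and the sample values are hypothetical stand-ins for whatever characteristics you record.

```python
def find_mismatches(layers, keys=("rows", "cols", "cell_size", "upper_left", "projection")):
    """Compare every layer against the first one and list the differences.

    layers -- dict mapping layer name to a dict of its spatial characteristics
    Returns a list of (layer, key, reference_value, layer_value) tuples.
    """
    names = list(layers)
    reference = layers[names[0]]
    problems = []
    for name in names[1:]:
        for key in keys:
            if layers[name].get(key) != reference.get(key):
                problems.append((name, key, reference.get(key), layers[name].get(key)))
    return problems


# Hypothetical values transcribed from ImageInfo for three layers.
base = {"rows": 2000, "cols": 1500, "cell_size": 30.0,
        "upper_left": (500000.0, 4900000.0), "projection": "UTM Zone 13"}
layers = {
    "band1.img": dict(base),
    "dem.img": dict(base),
    "landcover.img": dict(base, cell_size=28.5),
}
for problem in find_mismatches(layers):
    print(problem)
```
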
Finally, it is time to create the .names file. This is accomplished by using the IMAGINE
CART module's CART Sampling Tool. Figure 6 shows an example of the module. The
important items to remember here are indicated in red.
Independent variable files need to be added to the right side in the order that they are
listed in the .data file. Make sure that 0 is ignored so that the entire extent is not
analyzed (data outside of the non-zero data area). Since validation is done within the
See5 program, Training and Validation numbers really aren't important. Make sure to
save the CART .data file with a different name; otherwise your existing copy (which you
want to use) will be overwritten.
Figure 6. The CART Sampling Tool is used to create the .names file. Pay close
attention to independent data file order, ignoring 0, and renaming the output .data file to
something else so it doesn't overwrite the existing copy.
The output from the inputs shown in Figure 6 is shown below in Figure 7. While the
.names file is close to being ready to run in the See5 program, some items do need to
be updated and verified.
Figure 7. Example .names file generated by the CART Sampling Tool. Not everything is
quite right. Since we ignored 0 during generation it is not listed as part of the
discrete layers even though it needs to be included. Simply add a 0 in front of all
independent and dependent variable lists. Make sure that continuous and thematic
layers are identified as such (either "continuous" or all possible discrete values).
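
Adding the 0 by hand is easy to get wrong when there are many discrete layers, so here is a small Python sketch of that edit. It assumes the usual See5 layout of one "name: values." entry per line, with discrete values separated by commas and a closing period; the attribute names and values in the sample are hypothetical, and the file produced by the CART Sampling Tool may be laid out differently, so inspect the result before using it.

```python
# Type keywords that mark an attribute as something other than a value list.
TYPE_KEYWORDS = ("continuous", "ignore", "label", "date", "time", "timestamp", "discrete")


def add_zero_to_discrete(names_text):
    """Prepend 0 to every explicit list of discrete values that lacks it."""
    out = []
    for line in names_text.splitlines():
        name, sep, rest = line.partition(":")
        body = rest.strip().rstrip(".").strip()
        is_value_list = (
            bool(sep)
            and bool(body)
            and not line.lstrip().startswith("|")   # comment line
            and not body.startswith("=")            # implicitly defined attribute
            and body.split()[0] not in TYPE_KEYWORDS
        )
        if is_value_list:
            values = [value.strip() for value in body.split(",")]
            if "0" not in values:
                line = f"{name}: {', '.join(['0'] + values)}."
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical .names content: two continuous layers and two discrete ones.
sample_names = (
    "landcover.\n"
    "\n"
    "band1: continuous.\n"
    "dem: continuous.\n"
    "date: 1, 2, 3.\n"
    "landcover: 1, 2, 31.\n"
)
print(add_zero_to_discrete(sample_names))
```

Running the function a second time leaves the text unchanged, because lists that already contain 0 are skipped.
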
Creating the Decision Tree
Once the .data and .names files are successfully (and correctly) created, the See5
program can be run. The GUI of See5 is straightforward and one only needs to enter
the .data file as prompted after clicking on the leftmost button. The .names file should
be located in the same directory and will be loaded automatically.
Figure 8. See5 GUI with .names and .data files loaded.
Clicking on the next active button runs the See5 program. Another window (Figure 9)
appears, which gives the user the ability to select various classifier construction options.
This same window is used to perform cross-validation. Please refer to the software
documentation for further explanation of these and other options.
Figure 9. Example of Classifier Construction Options window.
After clicking the OK button the program runs and, when finished, produces a .out file
that contains the decision tree and an associated evaluation of the training data.
A .tree file is also created, which will be used to create the spatial output. Figures 10
and 11 illustrate the two sections of the .out file, including part of the decision tree and
the training data evaluation.
Figure 10. See5 decision tree output.
Figure 11. See5 training data evaluation section of the .out file.
Creating Spatial Output
Once the decision tree has been created in See5 it is possible to create a classified
output. To do this, return to the IMAGINE CART Utilities and select CART Classifier from
the options. The dialog in Figure 12 is produced. Locate and enter the .names file and
fill in the other fields as necessary. The use of a binary mask file of the study area is
recommended to decrease processing time. Make sure that the See5 and Tree options
are selected. The .tree file will automatically be filled in for you.
Figure 12. CART Classifier example.
Once the OK button is clicked a progress window will display. If the progress quits (the
OK button is active) prior to reaching 100% then a problem exists. It is also possible
that the process will run to 100%, but the output will not be usable. If these problems
occur, go back and look at these things:
1) Are all the data the same in regard to the number of rows/columns, extent, cell
size, projection, etc.?
2) If you had to change layer information (e.g. continuous to thematic) did you exit
the viewer or otherwise ensure the changes were saved to the data layer?
3) Does the .names file contain all input layers in the order that they are
represented in the .data file?
4) Do the discrete layers (including the dependent variable layer) have a 0 included
in their list of values?
5) Did you make changes to the .names file that may have introduced illegal
characters? The .names file has to be perfect to work.
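
Part of item 3 can be checked mechanically: every row of the .data file should have as many fields as the .names file declares attributes. The Python sketch below is a rough check under stated assumptions, namely that each attribute (including the dependent variable) has its own "name: ..." line in the .names file and that the .data file is comma delimited. It does not verify layer order or detect illegal characters, and the sample contents are hypothetical.

```python
import csv
import io


def count_attributes(names_text):
    """Count explicit 'name: ...' entries, skipping comments and ':=' definitions."""
    count = 0
    for line in names_text.splitlines():
        line = line.split("|", 1)[0].strip()  # drop trailing comments
        name, sep, rest = line.partition(":")
        if sep and not rest.startswith("="):
            count += 1
    return count


def rows_with_wrong_length(data_text, n_fields):
    """Return (line_number, field_count) for every row that does not fit."""
    reader = csv.reader(io.StringIO(data_text))
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=1)
        if row and len(row) != n_fields
    ]


# Hypothetical files: four declared attributes, second data row is short.
names_text = (
    "landcover.\n"
    "band1: continuous.\n"
    "dem: continuous.\n"
    "date: 0, 1, 2, 3.\n"
    "landcover: 0, 1, 2, 31.\n"
)
data_text = "45,1820,1,1\n52,1655,2\n"

n_fields = count_attributes(names_text)
print(n_fields, rows_with_wrong_length(data_text, n_fields))
```
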
In my experience, once the process is completed successfully for the first time it is
much easier to modify input data and parameters to perform experiments and evaluate
the decision tree model. Hopefully, this document will help you get to the experiment stage
in a shorter amount of time.
