User's Guide
December 2012
TABLE OF CONTENTS
1.0 INTRODUCTION
2.0 INSTALLATION
3.0 NLCD MAPPING TOOL ICON AND MENU
4.0 PERCENT CALCULATION TOOL
5.0 NLCD SAMPLING TOOL
a) Independent Variable Files Input
b) Dependent Variables File Input
c) Ignore Values
d) Sampling Number Designation
e) Sampling Method Designation
1) Random sampling
2) Stratified random sampling
3) Systematic sampling
f) Output File Names Designation
g) Cubist and See5 Options
6.0 CUBIST AND SEE5 CLASSIFIER TOOLS
a) Input Name File
b) Rules or Tree Option and Input File
c) Input Model File
d) Use Mask File
e) Output File
f) Create Error or Confidence Layer
7.0 ACCURACY ASSESSMENT TOOL
8.0 SMART ELIMINATE TOOL
a) Input Name File
b) Minimum Mapping Unit
c) Weight File
1) No Weights
2) No Weights w/ 0
3) Old Weight File
4) New Weight File
d) Single Step Option
e) Output File
9.0 CREATING BATCHES FOR ERDAS
10.0 LOGGING AND ERROR MESSAGES
a) Logging Messages
1) cubistinput.c
2) cartclass.c
3) SmartEliminate.c
b) Error Messages
1) cubistinput.c
2) cartclass.c
3) SmartEliminate.c
c) System Error Messages
1) Side-by-side Error
2) Erdas Imagine Error
3) Command Line Too Long
1.0 INTRODUCTION
This document was created in order to detail installation instructions and use of the
National Land Cover Dataset (NLCD) Mapping Tool designed by MDA Information
Systems LLC (formerly MDA Federal), for the United States Geological Survey. All
rights to this software are held by the USGS.
The tools described in this document were initially developed for use within the ERDAS
Imagine 8.7 software environment. They have been updated to work with Imagine
versions 9.1 through 2011. The tools have been designed and tested to produce data files
compatible with Rulequest Research's Cubist versions 2.02 through 2.07 and See5/C5.0
versions 2.02 through 2.08, and to read and apply models from those versions of
Rulequest's software. The executables were compiled using the Imagine Toolkits and
Microsoft Visual C++ compilers. The tools have been tested on Windows XP and
Windows 7 operating systems.
2.0 INSTALLATION
To install the software, double click on the executable file named
NLCD_Mapping_Tools_v2.0.8.7.exe. It is recommended you close all other
applications before starting Setup. In addition, depending on the security policy on your
system, you may need to have administrator privileges to install software into the
Program Files directory.
After double clicking the installer, the following window will appear:
Click the Next button. The window changes to the following:
The installer will find which versions of Imagine are installed on the system. Select the
versions for which you would like the NLCD Sampling Tool installed. In addition, the
last checkbox is for installing the version of the Microsoft Visual C Runtime
Redistributable Library needed by the software.
3.0 NLCD MAPPING TOOL ICON AND MENU
After completing all of the above installation steps and restarting the appropriate version
of ERDAS Imagine, a new icon should be added to the main ERDAS Imagine toolbar.
This icon will launch the NLCD Mapping Tool GUI.
In the Ribbon interface a new tab called NLCD Tools will be added to the left of the
Help tab. From this GUI menu or tab the user can launch any application included as
part of the NLCD Mapping Tool. These applications include: Percent Calculation, NLCD
Sampling Tool, Cubist Classifier for Cubist v2.06, Cubist Classifier for Cubist v2.07,
See5 Classifier, Accuracy Assessment, and Smart Eliminate. Each of these applications is
described in detail below.
This tool also features a batch command function that allows the user to run multiple
percent calculations on multiple user-defined input files. These files will be run in
succession and are input at once, prior to the start of the calculation process.
[Figure: Independent File Input List]
Independent inputs can be of any data type. Dependent input files must be
unsigned 1-, 2-, 4-, 8-, or 16-bit, or signed 8- or 16-bit. Input files must be of the
same spatial resolution (pixel size) and map projection. Files may be of varying
extents, but will only be sampled from within the geographic area of intersection
for all independent input images. All independent files input by the user will be
added to the input independent file input list, located to the right of the drop down
navigation window.
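The area of intersection mentioned above can be illustrated with a short Python sketch. It is not the tool's code: the Extent container and the intersect_extents function are hypothetical names, and the sketch assumes every extent is already expressed in the same map projection and units.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    """Hypothetical container for a map extent in projected units."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersect_extents(extents: Sequence[Extent]) -> Optional[Extent]:
    """Return the area common to all extents, or None if there is none."""
    if not extents:
        return None
    # The common area starts at the largest minimum and ends at the
    # smallest maximum along each axis.
    xmin = max(e.xmin for e in extents)
    ymin = max(e.ymin for e in extents)
    xmax = min(e.xmax for e in extents)
    ymax = min(e.ymax for e in extents)
    if xmin >= xmax or ymin >= ymax:
        return None  # the images do not overlap
    return Extent(xmin, ymin, xmax, ymax)
```

If the function returns None for a set of inputs, the images share no common area, so no samples could be drawn from them.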
A second option for specifying the list of independent files is to use a *.txt file.
The txt file must have one file name per line. The txt file may be necessary when
using a large number of independent files because the GUI software
communicates with the sampling executable by passing the files on the command
line. The command line string is limited in length. The user can also create and
save a txt file after selecting the files through the interface.
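A list file of this kind can also be prepared outside the GUI. The sketch below is a minimal Python illustration of the one-file-name-per-line layout described above; the function names are hypothetical, and the UTF-8 encoding is an assumption rather than a documented requirement of the tool.

```python
from pathlib import Path
from typing import Iterable, List


def write_file_list(image_paths: Iterable[str], list_path: str) -> None:
    """Write one independent file name per line, skipping blank entries."""
    lines = [p.strip() for p in image_paths if p.strip()]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_file_list(list_path: str) -> List[str]:
    """Read the list back, ignoring empty lines."""
    text = Path(list_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Reading the file back before a run is a quick way to confirm that no blank or duplicated lines slipped in.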
c) Ignore Values
This option allows the user to specify a value or multiple values that should be
ignored within the dependent input file during the sampling process. Values can
be listed as a comma delimited list (e.g., 0,100,255) or defined as a range of values
with the use of a hyphen (e.g., 0-255). No samples will be collected from the pixels
designated by the user-defined ignore value(s). These areas will be treated as
background values and will not affect the distribution of either training or
validation samples.
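The two notations can be illustrated with a small parser. This is a hedged Python sketch, not the tool's own parsing code: the function name is hypothetical, only non-negative integers are handled, and accepting a mix of single values and ranges in one string is an assumption.

```python
from typing import Set


def parse_ignore_values(spec: str) -> Set[int]:
    """Expand a string such as "0,100,255" or "0-255" into a set of values."""
    values: Set[int] = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            # A hyphen marks an inclusive range, e.g. "0-255".
            low_text, high_text = token.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"range start exceeds range end: {token!r}")
            values.update(range(low, high + 1))
        else:
            values.add(int(token))
    return values
```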
After the user has defined whether sampling will be number or percentage based,
an appropriate number or proportion of training and validation samples must be
defined. These training and validation samples will be written to separate files for
use within the Rulequest Cubist or See5/C5.0 software, in order to build and
evaluate, respectively, the rule set(s) produced by Cubist or See5/C5.0.
Three different sampling methods are provided. Each of these methods should be
selected by the user based upon the appropriateness of the input data layers and
desired output classification. All three methods sample on a single pixel basis.
Each pixel is therefore treated as a separate possible sample, regardless of the
spatial proximity of these samples. The three methods include:
1) Random sampling
This method selects a random subset of the input pixels without regard for
the value of the dependent variable. The image is first scanned to count
the eligible pixels (i.e., those not having an excluded value), and that
count is compared with the number of sample pixels the user desires.
Dividing the number of desired pixels by the number of eligible pixels
gives the probability p1 that the first pixel is selected. The tool then
reads through the image again and selects the first pixel with
probability p1 by using a random number generator. The number of
remaining pixels is decremented by one, and the number of desired pixels
is either decremented by one or left the same, depending on whether the
first pixel was selected.
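The two-pass procedure described above resembles what is often called selection sampling. The Python sketch below illustrates it under stated assumptions: the function name and arguments are hypothetical, the image is reduced to a flat sequence of dependent values, and the same update step is assumed to repeat for every later eligible pixel.

```python
import random
from typing import Iterable, List, Optional, Set


def select_random_pixels(values: Iterable[int], n_wanted: int,
                         ignore: Set[int],
                         rng: Optional[random.Random] = None) -> List[int]:
    """Return the indices of randomly selected, non-ignored pixels."""
    pixels = list(values)
    rng = rng or random.Random()
    # Pass 1: count the eligible pixels.
    remaining = sum(1 for v in pixels if v not in ignore)
    needed = min(n_wanted, remaining)
    chosen: List[int] = []
    # Pass 2: select each eligible pixel with probability needed/remaining.
    for index, value in enumerate(pixels):
        if needed == 0:
            break
        if value in ignore:
            continue
        if rng.random() < needed / remaining:
            chosen.append(index)
            needed -= 1  # one fewer sample still required
        remaining -= 1   # one fewer eligible pixel left to visit
    return chosen
```

Because the probability rises to 1 when the number of samples still needed equals the number of eligible pixels left, this scheme returns exactly the requested number of samples whenever enough eligible pixels exist.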
3) Systematic sampling
This sampling option spreads the appropriate (user-defined) number of
training samples evenly across the extent of the image area based upon
areas of available (non-ignored) pixels. The total number of these
available pixels divided by the number of training samples required
defines the interval of sampling. The sampling start point is assigned
through random number generation. The image pixels are examined in
blocks (size depending on the *.img settings) sequentially from upper left
to lower right.
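The interval calculation can be sketched as follows. This Python illustration is an assumption-laden simplification, not the tool's code: the function name is hypothetical, the image is treated as a flat sequence read from upper left to lower right, block handling is omitted, and the interval is rounded down to a whole number of pixels.

```python
import random
from typing import Iterable, List, Optional, Set


def systematic_sample(values: Iterable[int], n_samples: int,
                      ignore: Set[int],
                      rng: Optional[random.Random] = None) -> List[int]:
    """Return indices of evenly spaced, non-ignored pixels."""
    pixels = list(values)
    rng = rng or random.Random()
    # Only available (non-ignored) pixels take part in the spacing.
    eligible = [i for i, v in enumerate(pixels) if v not in ignore]
    if n_samples <= 0 or not eligible:
        return []
    # Interval = available pixels divided by samples required.
    interval = max(1, len(eligible) // n_samples)
    # Random start point within the first interval.
    start = rng.randrange(interval)
    return eligible[start::interval][:n_samples]
```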
ii) *.data file: This file contains a list of all independent and dependent
sample values which were sampled to be used as part of the rule set
training process. These samples were collected with the user-defined
method (described above).
iii) *.test file: This file contains a list of all independent and dependent
sample values which were sampled, separately from the training samples,
to be used as part of the rule set validation process. These samples were
collected with a random method (described above).
iv) *.names.hst is also created through the NLCD Sampling Tool. This file
details the distribution of samples available within the dependent input,
and those output into the *.data and *.test files.
The user need only supply an appropriate name to the output *.names file, and
this root name will be applied to the corresponding *.data and *.test files. These
files must all have matching roots in order for the Rulequest and NLCD Classifier
software to perform correctly. Files with root names that do not match exactly
will cause errors for these two programs.
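The matching-root requirement can be checked before running Cubist, See5/C5.0, or the classifier tools. The following Python sketch is only an illustration; the function names are hypothetical and it checks nothing beyond file names and existence.

```python
from pathlib import Path
from typing import Dict, List


def companion_files(names_file: str) -> Dict[str, Path]:
    """Derive the *.data and *.test paths that share the *.names root."""
    names_path = Path(names_file)
    if names_path.suffix.lower() != ".names":
        raise ValueError("expected a file ending in .names")
    return {ext: names_path.with_suffix("." + ext)
            for ext in ("names", "data", "test")}


def missing_companions(names_file: str) -> List[str]:
    """List the expected files that do not exist on disk."""
    return [str(path) for path in companion_files(names_file).values()
            if not path.exists()]
```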
predict this value. This provides the user with an estimated range, or confidence,
in these values. The scaling factor of 10 was chosen to get more precision in the
value. The file format is unsigned 16-bit.
For See5/C5.0, this error layer represents a percent confidence associated with
each rule and output categorical, classified value. It is expressed as a percentage
of confidence. A value of zero would therefore have a low confidence (always
wrong), while a value of 100 would have a very high confidence (always right).
The file format is 8-bit unsigned.
7.0 ACCURACY ASSESSMENT TOOL
The Accuracy Assessment/Error Validation Tool is designed to aid the user in evaluating
the accuracy of the output classified image. This is done by subtracting the classified
layer from a user-defined quality control image. The result of this operation will provide
the user with an output image that identifies areas of both positive and negative
misclassification. Output values will range from -100 to 100 and will be defined within
the value column of the output attribute table.
This evaluation tool is for use with continuous data which ranges from 0 to 100 only.
Accuracy assessment of categorical classifications will provide meaningless numbers,
which denote magnitude of difference between the predicted and actual classes. Such a
magnitude of difference is not appropriate for categorical accuracy assessment.
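The subtraction described above amounts to a pixel-by-pixel difference. The Python sketch below shows the arithmetic on small nested lists; the function name is hypothetical, and real images would be read through Imagine rather than held as lists.

```python
from typing import List, Sequence


def difference_image(quality_control: Sequence[Sequence[int]],
                     classified: Sequence[Sequence[int]]) -> List[List[int]]:
    """Subtract the classified layer from the quality control image."""
    if len(quality_control) != len(classified):
        raise ValueError("images must have the same number of rows")
    result: List[List[int]] = []
    for qc_row, cl_row in zip(quality_control, classified):
        if len(qc_row) != len(cl_row):
            raise ValueError("images must have the same number of columns")
        for value in list(qc_row) + list(cl_row):
            if not 0 <= value <= 100:
                raise ValueError("inputs must be continuous values from 0 to 100")
        # With both inputs in 0..100 the difference lies in -100..100.
        result.append([qc - cl for qc, cl in zip(qc_row, cl_row)])
    return result
```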
Selecting the Multiple option allows the algorithm to use different MMUs for
different classes. In order to input this information, the user can specify an
existing file (in the right format), or edit and save one through the Interactive
MMU Input feature. The format for an MMU file is simple: the file is a text file,
and each line must list the class and the MMU, separated by a space. Any
class not listed in the MMU file will default to the value specified in the main
frame. A file can be edited in the text editing field of the Interactive MMU Input
window. Clicking on the File selector will allow the user to save the MMU file.
The extension of the file should be .txt.
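The MMU file layout and the default rule can be illustrated with a short Python sketch. The function names are hypothetical, and integer class codes and MMU values are assumed.

```python
from typing import Dict, Iterable


def parse_mmu_lines(lines: Iterable[str]) -> Dict[int, int]:
    """Parse "class MMU" pairs, one pair per line."""
    per_class: Dict[int, int] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        class_text, mmu_text = line.split()
        per_class[int(class_text)] = int(mmu_text)
    return per_class


def mmu_for_class(class_value: int, per_class: Dict[int, int],
                  default_mmu: int) -> int:
    """Classes not listed in the file fall back to the default MMU."""
    return per_class.get(class_value, default_mmu)
```

For example, a file containing the lines "11 5" and "21 12" would give class 11 an MMU of 5 and class 21 an MMU of 12, while every other class would use the value entered in the main frame.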
[Figure: Edit MMUs]
c) Weight File
In the middle panel, there are four choices for a weights file. The weights file
specifies the priority of elimination. When a clump smaller than the MMU is
found, the pixels in that clump are replaced by one of the neighboring classes
(defined by 8-way connectivity).
The algorithm for deciding which class will be the replacement class is as
follows: the class with the highest weight is chosen; if there are multiple classes
with the same weight, the one with more pixels is chosen; if there is still a tie, the
class that comes first in the weight file order is selected as the replacement.
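The tie-breaking order above can be expressed as a single sort key. This is a minimal Python sketch, not the Smart Eliminate source: the function name and the dictionaries are hypothetical, and it assumes that "more pixels" refers to the number of neighboring pixels belonging to each class and that classes without a weight default to 0.

```python
from typing import Dict, Sequence


def choose_replacement(neighbor_counts: Dict[int, int],
                       weights: Dict[int, int],
                       file_order: Sequence[int]) -> int:
    """Pick the replacement class for a clump smaller than the MMU."""
    if not neighbor_counts:
        raise ValueError("clump has no neighboring classes")
    position = {cls: i for i, cls in enumerate(file_order)}
    last = len(position)

    def priority(cls: int):
        # 1) highest weight, 2) most neighboring pixels,
        # 3) earliest position in the weight file.
        return (weights.get(cls, 0), neighbor_counts[cls],
                -position.get(cls, last))

    return max(neighbor_counts, key=priority)
```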
The four choices are No Weights, No Weights w/ 0, Old Weight File, or
New Weight File. The meaning of each of these choices is described below.
1) No Weights
With this option, all potential replacement classes have equal weight.
Because of this, the class with the most neighbors will be used as the
replacement class. If there is a tie, the class with the lower code will be
chosen.
2) No Weights w/ 0
This option is the same as option 1) No Weights except that class 0 is
also eliminated. In option 1) the 0 class is considered fill and not
eligible to be eliminated.
classes not listed (as replacement classes) are assumed to have equal and lowest priority.
The order of lines in the file determines the default priority among the lowest priority
classes. All classes in the input image must have their own line and list of replacement
classes. The interactive editor works like the MMU editor described above. The large box
on the right is a text editor. After the editing is complete, you can save the file by using
the file selector at the bottom of the window. The replacement algorithm chooses the
highest priority class among the neighboring pixels. When saving a weight file be sure to
enter the full file name, including the extension .wt or .txt. If you do not enter the
extension then the file name will not contain a period and will be unusable later. This is
necessary because Erdas Imagine does not recognize the *.wt file type.
In the case of new style weight matrix, the Smart Eliminate code performs
an on-the-fly recoding of the input data when it is read and processed, and
subsequent decoding when the output data is written. To do this, the
weight matrix is scanned once to get the classes (first element in each line)
and to assign it a code. When the matrix is scanned again, a full weight
matrix is populated. It is therefore an error to have replacement classes
that are not primary classes. Similarly, it is an error to have classes in the
image that are not listed in the weight file. However, it is acceptable to have
classes listed in the weight file that are not represented in the image.
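The two scans and the resulting error conditions can be sketched in Python. Everything about the file layout here is an assumption made for illustration: each line is taken to hold a primary class followed by its replacement classes in descending priority, and the numeric weights stored in the matrix are invented for the sketch. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_weight_matrix(lines: Iterable[str]) -> Tuple[Dict[int, int], List[List[int]]]:
    """Scan the weight lines twice: assign codes, then fill the matrix."""
    rows = [[int(tok) for tok in line.split()] for line in lines if line.strip()]
    # Scan 1: the first element of each line is a primary class; give it a code.
    codes: Dict[int, int] = {}
    for row in rows:
        codes.setdefault(row[0], len(codes))
    size = len(codes)
    matrix = [[0] * size for _ in range(size)]  # 0 = lowest priority
    # Scan 2: populate the full matrix from the replacement lists.
    for row in rows:
        primary, replacements = row[0], row[1:]
        for rank, cls in enumerate(replacements):
            if cls not in codes:
                raise ValueError(f"replacement class {cls} is not a primary class")
            matrix[codes[primary]][codes[cls]] = len(replacements) - rank
    return codes, matrix


def check_image_classes(image_classes: Iterable[int], codes: Dict[int, int]) -> None:
    """Raise if the image contains a class absent from the weight file."""
    missing = sorted(set(image_classes) - set(codes))
    if missing:
        raise ValueError(f"classes not listed in weight file: {missing}")
```

The codes dictionary plays the role of the on-the-fly recoding: image values are mapped to compact codes on input and mapped back when the output is written.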
e) Output File
The name of the output file is entered through the bottom-most file selector.
Here I copied out the command and broke it apart so that each individual
variable is on a separate line. Notice the bolded text: these are things that must remain for
the program to run correctly. The text in italics marks variables that will change.
c:/program files/imagine 8.7/bin/ntx86/nlcd_see5class_208.exe
d:/test/cart_sample/scrub_cb8.names
d:/test/cart_sample/scrub_cb8.img
-rules d:/test/cart_sample/scrub_cb8.rules
-tree d:/test/cart_sample/scrub_cb8.tree
-format Tree
-maskfile d:/test/cart_sample/scrub_bin.img
-error 0
-meter
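When many such commands are needed, the variable parts can be filled in by a small script. The Python sketch below reproduces the token order of the command above; the function name is hypothetical, and it assumes, as in the example, that the names, image, rules, and tree files all share one root path.

```python
from typing import List


def see5_command(executable: str, root: str, mask_file: str) -> List[str]:
    """Build one classifier command as a list of tokens."""
    return [
        executable,
        f"{root}.names",
        f"{root}.img",
        "-rules", f"{root}.rules",
        "-tree", f"{root}.tree",
        "-format", "Tree",
        "-maskfile", mask_file,
        "-error", "0",
        "-meter",
    ]
```

Calling the function once per root path yields one command per input set; the switches themselves are copied unchanged from the example.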
10.0 LOGGING AND ERROR MESSAGES
The executables developed as part of the NLCD Mapping Tool are written in C using
the Erdas Imagine Developers Toolkit and compiled with Microsoft Visual Studio. The
executables are all Jobs running under the Session Manager. All outputs are written to
the Session Log. As each executable runs through its processing steps, logging messages
are written to the Session Log. These messages will help the user debug any errors that
arise during processing.
The user must set the Log Message level to verbose for the messages to appear. This
is set through the Preferences Editor's User Interface & Session category. A user who
does not want the messages to appear can set the level to terse and use verbose
mode only when debugging a problem.
The Job has a "main" function as its entry point. This routine initializes the Toolkit, starts
the job, prints program information, parses the input arguments, and calls the "jobMain"
function which handles the processing. The "main" function has two logging outputs: one
that prints "Print program information." followed by the program information; and a
second one that logs the number of command line input arguments and their values.
As customary for programs written with the Erdas Imagine Toolkit, processing each input
argument and option switch calls a different "set" function that copies the input variables
to global variables in the code and does some rudimentary checking on the values.
These logging messages give the user visibility into the processing and how it is
progressing. If inputs or variables are not what the user is expecting they should be
corrected and the process rerun.
a) Logging Messages
1) cubistinput.c
For the cubistinput.c program there are 14 such functions ("SetDepFilename",
"SetDepFileType", etc.), and as each one is called, it logs the values being set. As the
independent variable files are parsed, each name is also logged to the Session Log. This
happens whether the files are passed in on the command line or through a text "list" file.
When control passes to the "CubistInput_Main" function, 13 logging messages are
printed, namely,
"Main -- 1 -- Get names."
"Main -- 2 -- Create meter."
"Main -- 3 -- Allocate arrays."
"Main -- 4 -- Open indFile: %d", once for each independent file
"Main -- 5 -- Read map info."
"Main -- 6 -- Create windows."
"Main -- 7 -- Reopen with windows."
"Main -- 8 -- Allocate counters."
"Main -- 9 -- Allocate Pixel Rects."
"Main -- 10 -- Counting sampling pixels."
"Main -- 11 -- Initializing training data."
"Main -- 12 -- Reading training data."
"Main -- 13 -- Write names file."
In addition to these messages that report progress through the steps needed for sampling,
there are feedback messages written to the Session Log with values calculated and used
in the particular run -- things such as the pixel size, output map coordinates, processing
start time, and number of samples counted.
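When debugging a failed run, it can help to find the last numbered step that was logged. The Python sketch below assumes the Session Log has been saved as plain text with one message per line; the function name is hypothetical, and the pattern covers only the "Main -- N -- message" and "Smart Eliminate -- N -- message" forms quoted in this guide.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches e.g. 'Main -- 4 -- Open indFile: 2' anywhere in a line.
STEP_RE = re.compile(r"(Main|Smart Eliminate) -- (\d+) -- (.*)")


def last_logged_step(log_lines: Iterable[str]) -> Optional[Tuple[str, int, str]]:
    """Return (source, step number, message) for the last step message found."""
    last = None
    for line in log_lines:
        match = STEP_RE.search(line)
        if match:
            message = match.group(3).strip().rstrip('"')
            last = (match.group(1), int(match.group(2)), message)
    return last
```

The step number returned indicates how far processing got before the run stopped.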
2) cartclass.c
The classification step is handled by the source code in cartclass.c. There are 7 Set
functions in that code. The logging messages for cartclass.c are
"Main -- N -- ..." (one message per processing step; the message text is not preserved in this copy)
3) SmartEliminate.c
The SmartEliminate.c code has 7 Set functions. When control passes to the "jobMain"
function, three logging messages are printed, namely,
"Main -- 1 -- Initializing mmu."
or
"Smart Eliminate -- 1 -- Create meter."
"Smart Eliminate -- 2 -- Check input/output names."
"Smart Eliminate -- 3 -- Check input file."
"Smart Eliminate -- 4 -- Get projection and color table."
"Smart Eliminate -- 5 -- Set projection and color table in output file."
"Smart Eliminate -- 6 -- Set recode arrays."
"Smart Eliminate -- 7 -- Allocate arrays."
"Smart Eliminate -- 8 -- Read and set weights."
"Smart Eliminate -- 9 -- Starting processing, mmu = %d"
"Smart Eliminate -- 10 -- Done processing. Closing layers."
Finally, if the user presses the "Cancel" button then this log message is printed to the
session log:
"Smart Eliminate -- 11 -- USER CANCELED PROCESSING"
b) Error Messages
Every call into the Developers Toolkit library is monitored for error conditions. The
critical errors will stop the program. The errors in cubistinput.c and cartclass.c are
each given a unique number so the user will be able to zero in on the condition causing
the error. SmartEliminate.c prints messages along with each error. Under normal
operating conditions none of these error messages should appear. The error numbers are:
1) cubistinput.c
1, "Could not initialize toolkit"
2, "Error allocating memory for Indlayernames"
3, "Error allocating memory for nlayers"
4, "Independent file must have more than one layer"
5, "Can not open txt file"
6, "Error allocating memory for Indlayernames"
7, "Error allocating memory for nlayers"
8, "Independent file must have more than one layer"
9, "Error in format option"
10, "Error in format option"
11, "Error reading sampling method option"
12, "Error in training samples value"
13, "Error in validation samples value"
14, "Error in number of ignore values"
15, "Error allocating memory for outfilenamehst array"
16, "Dependent file must have one layer"
17, "Error allocating memory for Indlayerstack"
18, "Error allocating memory for windowind"
19, "Error allocating memory for mapinfoind"
20, "Error allocating memory for xOffsetind"
21, "Error allocating memory for yOffsetind"
22, "Different Projections"
23, "Different Projections"
24, "Error in pixel size (all image files must be same)"
25, "Error in dependent pixel type (must be unsigned 1,2,4,8, or 16 bit)"
26, "Can't open input training data file"
27, "Error in training data file"
28, "Error allocating memory for counting array"
29, "Error allocating memory for counting array"
30, "Error allocating memory for counting array"
31, "Error allocating memory for counting array"
32, "Error allocating memory for counting array"
33, "Error from eimg_PixelRectStackCreate"
34, "Error allocating memory for indpixelblock array"
35, "Error from eimg_PixelRectStackCreate"
36, "Error allocating memory for indpixelblock array"
37 through 53: message text not preserved in this copy
2) cartclass.c
1, "Could not initialize toolkit"
2, "Error in create error layer option"
3, "No model file specified"
4, "Error in classification type (tree or rules)"
5, "Rules file not specified or found"
6, "Tree file not specified or found"
7, "Ill-defined classification type"
8, "Error opening names file"
9, "Error allocating memory for layersin"
10, "Error allocating memory for windowin"
11, "Error allocating memory for mapinfoin"
12, "Error allocating memory for xOffsetin"
13, "Error allocating memory for yOffsetin"
14, "Error opening Imagine file"
15, "Different Projections"
16, "Error in pixel size (all image files must be same)"
17, "Error reported by function eimg_LayerGetNames"
18, "Different Projections"
19, "Error in pixel size (all image files must be same)"
20, "Error allocating memory for datalayerd"
21, "Error creating output file"
22, "Error creating error output file"
23, "Error allocating memory for datalayerd"
24, "Error allocating memory for pixelblock"
25, "Error allocating memory for pixel block"
26, "Error allocating memory for pixel block"
27, "Error allocating memory for pixel block"
28, "Error reading input file"
29, "Error reading input file"
30, "Error reading names file"
31, "Error allocating memory for layernames"
32, "Error reading names file"
3) SmartEliminate.c
"Error initializing the Toolkit"
"Error connecting to the session manager"
"You did not specify a MMU!"
"Can't open specified MMU file!"
" Error meter info function"
"You didn't specify a valid input file!"
"You didn't specify a valid output file!"
"Error getting the layernames to work on!"
"Input image has invalid number of layers!"
"Error opening Input layer!"
"Image has no width!"
"Image has no height!"
" Error map info reading"
" Error reading red color table"
" Error reading green color table"
" Error reading blue color table"
" Error reading opacity color table"
" Error open column class names"
" Error class names column is not a string type"
" Error create table"
" Error column read"
" Error deleting output file"
"Error setting the output layer name!"
"Error creating the output layer!"
" Error map info writing"
" Error projection parameters writing"
" Error writing red color table"
" Error writing green color table"
" Error writing blue color table"
" Error writing opacity color table"
" Error creating output names column"
" Error writing output names column"
" Error deleting input names column data"
" Error closing input names column"
" Error closing output names column"
"Error creating pixel buffer!"
"Can't allocate for grid array"
"Can't allocate for check array"
"Can't allocate for flag array"
"Can't allocate for matrix array"
"Can't allocate for grid[i] array"
"Can't allocate for check[i] array"
"Can't allocate for flag[i] array"
"Can't allocate for matrix[i] array"
"Can't allocate for newcov array"
" Found new class in weight file"
" Found new class in weight file"
" Error with changing meter message"
" Error with LayerRead"
" Found class in input file not in weight file"
" Error with LayerRead"
" Error with LayerRead"
" Found class in input file not in weight file"
" Error with LayerRead"
c) System Error Messages
1) Side-by-side Error
This error is caused by a missing Visual C++ Runtime library on the user's machine. The
libraries are available for free from Microsoft. They can be installed by following the
instructions on this web page:
http://www.microsoft.com/download/en/details.aspx?id=26347
2) Erdas Imagine Error
This error is generated from within Erdas's Toolkit code. The exact source of the error is
uncertain but in many cases we have traced it to an excess usage of RAM causing a crash
of the virtual memory system.
To debug the error the user can monitor the RAM used by the program through the
Windows Task Manager. In the Processes pane, display the Mem Usage column and
apply a descending sort. As the program runs and reads more input files, monitor the
usage. If it approaches 2 GB then excess memory usage is the likely problem.
To decrease the memory usage the user can
1) Decrease the number of layers used. This may not be feasible if the problem
domain requires all the layers, but if some layers can be omitted they should
be.
2) Convert the input data to smaller block size. Since the programs operate
block-by-block, more memory is required for larger block sizes. We have seen
a decrease in RAM used by a factor of 3-4 when converting data from block
size 512 to size 64.
3) Set the default output block size to a smaller number. This number is found in
the Preferences Editor, Image Files (General) area.
4) Cut input images to the area that needs to be processed and align the blocks
to the same boundary.
In addition, Erdas recommends that in general the user
5) Set the temp directory to a user controlled directory, not a Windows directory.
6) Apply the most recent patches to the Imagine version being used.