Principles of Geographic Information System

Chapter II

Geographic Information and Spatial Data Types
Dr. GOVINDU VANUM
Asst. Professor
Institute of Geo-Information and Earth Observation Sciences
Mekelle University, Mekelle
Mobile: 0914876574
Mail: govindu_gis@yahoo.com
GIS Data and Databases
Chapter Content
 GIS Data Concepts
 Geographic Phenomena
 Kinds of Data Values
 Digital Representation of Data
 Tessellation (Raster) Representation
 Vector Representation
 Database Management System
 Relational DBMS
 Relations, Tuples and Attributes
 Database Queries
Objectives:
Upon completion of the chapter, you will be able to:
 Understand GIS data types,
 Explain Geographic Phenomena,
 Describe the kinds of data values,
 Discuss how to represent data in GIS,
 Define Database Management System,
 Describe Relational DBMS,
 Describe the differences among:
 Relations,
 Tuples and
 Attributes.
 Explain the Types of Database Queries.
GIS Data Concepts
The backbone of a GIS is good data that are accurate enough to
accomplish its objectives.
Geographic data are organized in a geographic database.
There are two important components of this geographic
database:
 geographic position (spatial data)-where is it?
 attributes or properties (attribute data)-what things
are?
 GIS data types are grouped into three classes:
i. Spatial data (where?): used to describe the location, shape, size
and other features of spatial entities.
ii. Non-spatial data (what, when, in what amount?):
 also called attribute or characteristic data,
 also known as descriptive data.
 There are fundamental differences between the two:
 spatial data are positional data,
 spatial data are generally multi-dimensional and
autocorrelated.
iii. Metadata: data about data. It contains information
about scale, accuracy, projection/datum, data source,
manipulations, and how to acquire the data.
Geographic Phenomenon
 Geographic phenomena exist in the real world and are the
study objects of a GIS.
 Everything you see outside is a geographic phenomenon.
 Some of the things you do not see, like temperature, are
also geographic phenomena.
 A geographic phenomenon is something of interest that:
 can be named or described (what?)
 can be geo-referenced (where?)
 can be assigned a time (interval) at which it is/was present (when?)
 A geographic phenomenon can be any man-made or natural
phenomenon that we are interested in.
 There are two groups of geographic phenomena,
fields and objects:
 A geographic field: for every point in the study area, a value
can be determined. e.g. temperature, barometric pressure,
elevation, etc. How are such data represented in a GIS as a
surface map?
 Geographic objects: populate the study area, and are usually
well distinguishable, discrete, bounded entities. e.g. buildings,
geological faults, roads, rivers, etc.

[Figure: elevation (a geographic field) and a bridge (a geographic object)]
There are two types of geographic fields, discrete fields
and continuous fields:
 Discrete fields: cut up the study space into mutually exclusive,
bounded parts, with all locations in one part having the same
field value. e.g. land classifications, geological classes, soil
types, etc.
 Continuous fields: the underlying function is assumed to be
continuous. Continuity means that all changes in field values are
gradual. e.g. elevation, temperature, rainfall, etc.

[Figure: land classification (discrete field) and elevation (continuous field)]
Kinds of Data Values
 Nominal data values: provide a name or identifier
(categorical data); they can only be compared for equality.
e.g. geological units, vegetation covers, soil types, etc.
 Ordinal data values: classes are placed into some form of
rank, so they can be ordered but not measured.
e.g. high, moderate, low.
 Interval data values: a continuous scale of measurement
with an arbitrary origin; differences are meaningful but
ratios are not.
e.g. temperature in degrees Celsius, elevation above a chosen datum, etc.
 Ratio data values: also a continuous scale, but one where the
origin of the scale is real (a true zero) and not arbitrary.
e.g. distance measured in meters, rainfall, etc.
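The difference between interval and ratio values can be checked with a little arithmetic. The sketch below is only an illustration: the temperatures and distances are made-up numbers, and the helper name celsius_to_kelvin is our own.

```python
# Interval vs ratio data values: a minimal sketch with made-up numbers.

def celsius_to_kelvin(t_c):
    """Convert degrees Celsius (arbitrary zero) to kelvin (true zero)."""
    return t_c + 273.15

# Interval scale: differences are meaningful, ratios are not.
t1_c, t2_c = 10.0, 20.0
diff_c = t2_c - t1_c      # 10 degrees warmer: a meaningful statement
ratio_c = t2_c / t1_c     # 2.0, but 20 C is NOT "twice as hot" as 10 C
ratio_k = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)  # close to 1

# Ratio scale: the zero is real, so ratios are meaningful.
d1_m, d2_m = 500.0, 1000.0
ratio_d = d2_m / d1_m     # 2.0: the second distance really is twice as far

print(diff_c, ratio_c, round(ratio_k, 3), ratio_d)
```

The ratio of the two temperatures changes when the unit changes (Celsius to kelvin), which is why ratios on an interval scale carry no meaning; the ratio of the two distances would stay the same in meters, kilometers or feet.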
Digital Representation of Data

 We need to come up with digital representations of geographic
phenomena in order to store them in a GIS.
 A digital representation is a model, it is not the real
thing.
 Our representation will never be perfect; some facts will
not be captured.
 The choice of a digital representation depends on:
 The type of phenomenon.
 Two issues that decide among the suitable digital representations:
 What original raw data is available?
 What sort of data manipulation does the application need to
perform?
 Computer representations can be divided into two
groups: raster and vector-based representations.
Raster Representation
 A tessellation or tiling is a partition of space into mutually
exclusive cells that together make up the complete study
area.
 With each cell some (thematic) value is associated to
characterize that part of space.
 The cells of a regular tessellation can be square, hexagonal or
triangular in shape.
 The square tessellation is by far the most commonly used
and is known as a raster.
 In raster representation the field attribute value assigned
to a cell is associated with the entire area occupied by the
cell.
 The size of the area that a single raster cell represents is
called the raster’s resolution.
[Figure: reality compared with its regular tessellation]
 Examples of raster data are an aerial photograph, a
satellite image and a scanned map.
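To make the idea of cells and resolution concrete, here is a minimal sketch of a raster stored as a grid of values. Everything in it is an assumption made for illustration: the origin in the upper-left corner, the 10 m cell size, the elevation values and the function names cell_index and value_at.

```python
# A raster as a grid of square cells: a minimal sketch with made-up values.
# Assumptions (not from the slides): origin at the upper-left corner,
# rows counted downwards, a 3 x 4 grid of 10 m cells.

ORIGIN_X, ORIGIN_Y = 1000.0, 2000.0   # upper-left corner of the grid
CELL_SIZE = 10.0                       # resolution: each cell is 10 m x 10 m

elevation = [
    [120, 122, 125, 129],
    [118, 121, 124, 127],
    [115, 119, 122, 126],
]

def cell_index(x, y):
    """Return (row, col) of the cell that contains the point (x, y)."""
    col = int((x - ORIGIN_X) // CELL_SIZE)
    row = int((ORIGIN_Y - y) // CELL_SIZE)
    if not (0 <= row < len(elevation) and 0 <= col < len(elevation[0])):
        raise ValueError("point lies outside the raster")
    return row, col

def value_at(x, y):
    """Return the value of the cell that contains the point (x, y)."""
    row, col = cell_index(x, y)
    return elevation[row][col]

# Two different points inside the same cell receive the same value.
print(value_at(1012.0, 1995.0), value_at(1018.5, 1991.0))
```

Both points fall in the same cell, so both get that cell's single value: the value is associated with the entire area of the cell, not with an exact location.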
 A raster stores one value per cell, so a continuous field is
represented as a series of steps. Two ways to improve on this
continuity issue:
 Make the cell size smaller.
 Assume that the cell value only represents one specific
location and provide a good interpolation function for all other
locations.
 Advantages of regular tessellation:
 We know how they partition space.
 We can make computations specific to this partitioning.
 Fast algorithms.
 Disadvantages of regular tessellation:
 Not adaptive to the spatial phenomenon we want to represent.
 No matter how many cells share the same value, that value is
stored separately for every cell.
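The second way of improving continuity (treat each cell value as valid at one location and interpolate in between) can be sketched as follows. Bilinear interpolation is used here only as one common choice of interpolation function; the slides do not prescribe a particular one. The 2 x 2 grid and the function name bilinear are hypothetical.

```python
import math

# Hypothetical 2 x 2 raster; each value is taken to apply at the cell centre.
grid = [
    [10.0, 20.0],
    [30.0, 40.0],
]

def bilinear(grid, row, col):
    """Interpolate at a fractional (row, col); whole numbers are cell centres."""
    n_rows, n_cols = len(grid), len(grid[0])
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        raise ValueError("position lies outside the cell centres")
    r0, c0 = math.floor(row), math.floor(col)
    r1, c1 = min(r0 + 1, n_rows - 1), min(c0 + 1, n_cols - 1)
    fr, fc = row - r0, col - c0          # fractional offsets in the cell square
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr

# At a cell centre the stored value is returned; in between, values blend.
print(bilinear(grid, 0, 0), bilinear(grid, 0.5, 0.5), bilinear(grid, 0, 0.25))
```

With interpolation the represented surface changes gradually between cell centres instead of jumping at cell boundaries, at the cost of extra computation for every lookup.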
Vector Representations
Vector representations are useful for representing and
storing discrete features such as buildings, pipes, or
parcel boundaries. They can be:
 Triangulated Irregular Networks (TIN)
 Point
 Line
 Area
i. TIN
 A TIN is built from a set of measurements, for example
points of known height.
 These points can be scattered unevenly over the study
area, with areas of more change having more points.
 Triangles are fitted through three points to form planes.
 A TIN is a vector representation:
• Each anchor point has a stored geo-reference and value.
• The planes do not have stored values (as raster cells do);
values on a plane are derived from its three anchor points.
[Figure: TIN construction from elevation points, comparing stretched triangles with a Delaunay triangulation. A geo-reference and a value are stored for each anchor point; no value is stored for the planes.]
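Because no value is stored for a TIN plane, the height at a location inside a triangle has to be computed from the three anchor points. The sketch below does this with barycentric weights, which is one standard way to evaluate the plane through three points. The coordinates, heights and the function name interpolate_in_triangle are made up for illustration.

```python
def interpolate_in_triangle(p, a, b, c, eps=1e-12):
    """Height at p = (x, y) on the plane through anchor points a, b, c.

    Each anchor point is (x, y, z). Returns None if p lies outside the triangle.
    """
    x, y = p
    (xa, ya, za), (xb, yb, zb), (xc, yc, zc) = a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if det == 0:
        raise ValueError("anchor points are collinear")
    # Barycentric weights: how strongly each anchor point influences p.
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    wc = 1.0 - wa - wb
    if min(wa, wb, wc) < -eps:
        return None                      # p is not inside this triangle
    return wa * za + wb * zb + wc * zc

# Hypothetical anchor points (x, y, height).
a, b, c = (0.0, 0.0, 100.0), (10.0, 0.0, 110.0), (0.0, 10.0, 120.0)
print(interpolate_in_triangle((2.0, 2.0), a, b, c))
print(interpolate_in_triangle((10.0, 10.0), a, b, c))
```

At an anchor point the function returns that point's stored height, and everywhere else inside the triangle it returns a weighted mix of the three stored heights.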
ii. Point
 Points are defined as:
• single coordinate pairs (x,y) when we work in 2D;
• coordinate triplets (x,y,z) when we work in 3D.
 Used to represent single features that have no shape or size
at the scale of the map, such as: trees, oil wells, poles, fire plugs, etc.
 Mekelle City can be represented both as a point and as a
polygon. How?
iii. Line
 Used to represent one-dimensional objects (roads, railroads,
canals, rivers, etc.)
 A line is defined by two end nodes and zero or more internal
nodes that define the shape of the line.
iv. Area (polygon)
 Used to represent two dimensional features.
 Polygonal features, such as city boundaries and river
catchments can be stored as a closed loop of
coordinates.
 Polygonal data is the most common type of data in
natural resource applications.
 Examples of polygonal data include forest stands, soil
classification areas, administrative boundaries, and
climate zones.
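A closed loop of coordinates is enough to derive geometric attributes such as area and perimeter. The following sketch uses the shoelace formula on planar coordinates in meters; the parcel coordinates and the function name ring_area_perimeter are hypothetical.

```python
import math

def ring_area_perimeter(ring):
    """Area and perimeter of a closed ring [(x, y), ...] whose first and last
    vertices are identical. Coordinates are assumed to be planar (meters)."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("ring must be closed and have at least 3 distinct vertices")
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1            # shoelace term for this edge
        perimeter += math.hypot(x2 - x1, y2 - y1)  # length of this edge
    return abs(twice_area) / 2.0, perimeter

# A made-up rectangular parcel, 100 m by 50 m.
parcel = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0), (0.0, 0.0)]
area_m2, perimeter_m = ring_area_perimeter(parcel)
print(area_m2 / 10_000, "ha;", perimeter_m, "m")
```

The polygon itself stores only coordinates; area in hectares and perimeter in meters are computed from them and can then be kept as attributes of the feature.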
Raster vs Vector
Raster Model
Advantages:
 Simple data structure
 Easy and efficient overlaying
 Compatible with RS imagery
 High spatial variability is efficiently represented
 Simple for own programming
Disadvantages:
 Need high computer storage
 Errors in perimeter and shape
 Difficult network analysis
 Inefficient in projection transformations
 Loss of information when using large cells

Vector Model
Advantages:
 Compact data structure
 Efficient for network analysis
 Efficient for projection transformation
 Accurate map output
Disadvantages:
 Complex data structure
 Difficult overlay operations
 High spatial variability is inefficiently represented
 Not compatible with RS imagery
Chapter III

DATA MANAGEMENT AND PROCESSING SYSTEMS
Database Management Systems
 A database can be defined as:
 A collection of related data/information stored in a structured
format.
 A computerized collection of structured data stored in one or
more tables, like an electronic filing cabinet.
 A collection of inter-related data stored together to serve one or
more applications.
 A combination of software and hardware that makes it possible
and convenient to perform tasks that involve handling large
amounts of data.
The data are stored together with as little redundancy as
possible to serve one or more users.
 Spatial database is a collection of spatially referenced
data that acts as a model of reality.
 To create and maintain a computer database, you need
a database program, often called:
 Database Management System (DBMS).
 DBMS is a software package that allows the user to set
up, create and maintain a database.
 A GIS can be regarded as a DBMS specifically designed for
processing spatial and related attribute data.
 In addition to these DBMS functions, a GIS has many other capabilities.
 A geographic database is a critical part of GIS.
Why Use a Database?
 Handling large amounts of data.
 Backup and recovery functions to avoid loss of data.
 Declarative query language for retrieval of data.
 Collecting all data at a single location reduces redundancy and
duplication.
 Maintenance costs decrease because of better organization and decreased
data duplication.
 Applications become data independent so that multiple applications can
use the same data.
 User knowledge can be transferred between applications more easily
because the database remains constant.
 Data sharing is facilitated and a corporate view of data can be provided
to all managers and users.
 Security and standards for data and data access can be established and
enforced.
Relational DBMS
 The relational DBMS is the most widely accepted type of DBMS
for managing the attributes of geographic data.
 The relational DBMS is attractive because of its:
 Simplicity in organization and data modeling.
 Flexibility - data can be manipulated in an ad hoc manner by
joining tables.
 Efficiency of storage - by the proper design of data tables
redundant data can be minimized; and
 The non-procedural nature - queries on a relational database do
not need to take into account the internal organization of the data.
 Disadvantages
 No explicit representation of relationships.
 Reduced performance for large, well-defined databases.
 A relational DBMS lets the user:
 Define the database structure: attributes, tuples and relations.
 Define integrity rules.
 Define queries (extract data without altering it).
 Define transactions (change the database contents).
 A relational database comprises a set of tables, each a two-dimensional
list of records containing attributes about the objects under study.
 Relational DBMSs:
 Were primarily focused on business applications such as
banking and human resource management.
 Were never designed to deal with rich data types such as
geographic objects, sound, and video.
 Perform poorly for many types of geographic query.
[Figure: map showing polygons 14, 15, 16 and 17]

Map Table
MapID   Area (ha)   Perimeter (m)   StandID
14      1.08        416             J-234
15      0.75        350             J-129
16      0.31        223             J-143
17      1.38        523             J-888

Stand Table
Stand Number   Dominant Species   Stand Age
J-127          Hemlock            25
J-316          White Pine         34
J-129          Hemlock            65
J-411          Spruce             34
Relations, Tuples and Attributes
 A relation (a table) is a collection of similarly shaped
tuples/records (all having the same named attributes).
 Tuples are records or rows with attribute values.
 An attribute (a column) is a named field of a tuple that
describes one characteristic of the relation.
 The primary key of a relation consists of one or more attributes
that uniquely identify a tuple (record).
 A foreign key is used to refer between records of different
relations.
 It need not be a key of the relation in which it appears, but it is
a primary key of another relation.
Database Queries
 A database query is a data extraction without changing the
database.
 By query we search the database for a subset of records and/or
attributes according to conditions defined by the user.
 To define queries, we need a query language, such as SQL
(Structured Query Language).
 Tuple/record selection: allows only tuples that meet the selection
condition to pass.
 Attribute selection (projection): passes through all tuples of the
input but reshapes each of them in the same way, keeping only the
chosen attributes.
 Joining: tuples of two relations are matched on a common
attribute (typically a foreign key) and glued together.
[Figures: tuple/record selection; attribute projection; join operator; selection, projection and join combined]
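The three query operators can be tried out on the Map Table and Stand Table from the relational DBMS slides. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table and column names (map_table, stand_table, area_ha and so on) are our own spelling of the slide headings, and the condition "area greater than 1 ha" is just an example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE stand_table (
    stand_number     TEXT PRIMARY KEY,
    dominant_species TEXT,
    stand_age        INTEGER
);
CREATE TABLE map_table (
    map_id      INTEGER PRIMARY KEY,
    area_ha     REAL,
    perimeter_m REAL,
    stand_id    TEXT
);
INSERT INTO stand_table VALUES
    ('J-127', 'Hemlock', 25), ('J-316', 'White Pine', 34),
    ('J-129', 'Hemlock', 65), ('J-411', 'Spruce', 34);
INSERT INTO map_table VALUES
    (14, 1.08, 416, 'J-234'), (15, 0.75, 350, 'J-129'),
    (16, 0.31, 223, 'J-143'), (17, 1.38, 523, 'J-888');
""")

# Tuple/record selection: only rows that meet the condition pass.
selected = con.execute(
    "SELECT * FROM map_table WHERE area_ha > 1.0 ORDER BY map_id"
).fetchall()

# Attribute projection: all rows pass, but only the chosen columns are kept.
projected = con.execute(
    "SELECT map_id, stand_id FROM map_table ORDER BY map_id"
).fetchall()

# Join: rows of the two tables are matched on the foreign key stand_id.
joined = con.execute("""
    SELECT m.map_id, s.dominant_species, s.stand_age
    FROM map_table AS m
    JOIN stand_table AS s ON m.stand_id = s.stand_number
""").fetchall()

print(selected)
print(projected)
print(joined)
```

With the values shown on the slides, only stand J-129 occurs in both tables, so the join returns a single row (polygon 15, Hemlock, age 65). None of the three queries changes the database contents.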
Queries in GIS
i. Spatial queries: queries that can only be answered
using the stored X and Y location.
Example: where is my field station number 10 (ST-10)? x = 516000, y =
678003.
ii. Non-spatial queries: queries that can be answered
without using the stored X and Y location of the feature.
Example: what attributes did I collect at ST-10? Lithology, land form,
structure, etc.
• The three rules of data integrity in DBMS:
 Key uniqueness: unique identifier, no duplicates within relation.
 Key integrity: primary key value is never null (unknown).
 Referential integrity: value of a foreign key refers to an existing
primary key value in another relation.
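The three integrity rules can be seen in action with a small experiment. This is a sketch using Python's built-in sqlite3 module, not a description of any particular GIS database; the table names, the sample keys and the helper violates are assumptions. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and needs an explicit NOT NULL on a text primary key.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when asked
con.executescript("""
CREATE TABLE stand_table (
    stand_number TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE map_table (
    map_id   INTEGER PRIMARY KEY,
    stand_id TEXT REFERENCES stand_table(stand_number)
);
INSERT INTO stand_table VALUES ('J-129');
INSERT INTO map_table VALUES (15, 'J-129');
""")

def violates(sql):
    """Return True if the DBMS rejects the statement on integrity grounds."""
    try:
        con.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# Key uniqueness: the key J-129 already exists in the relation.
duplicate_key = violates("INSERT INTO stand_table VALUES ('J-129')")
# Key integrity: a primary key value may never be null.
null_key = violates("INSERT INTO stand_table VALUES (NULL)")
# Referential integrity: J-143 is not a primary key value in stand_table.
dangling_reference = violates("INSERT INTO map_table VALUES (16, 'J-143')")

print(duplicate_key, null_key, dangling_reference)
```

Each of the three statements is refused by the DBMS itself, so the rules hold no matter which application tries to change the data.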
Review Questions
 List and describe the differences between GIS data types.
 Explain why data models are used in GIS.
 What do discrete fields and objects have in common?
 What do discrete fields and continuous fields have in common?
 List and describe the differences between raster and vector data
representations.
 Two types of phenomena can be modeled as raster
representations: continuous and discrete phenomena. Can a quadtree be
applied to both, and is it equally effective for both? Explain your
answers.
 For the listed types of data layers, suggest which data model, raster
or vector would be the best choice: Color IR airphoto; Roads of
Mekelle; Wetlands of a County; Streams & Lakes of a County.
 Mention the three rules applied to ensure data integrity in a DBMS
environment.