
Geographic theory and geospatial knowledge discovery

Harvey J. Miller
Department of Geography, University of Utah
harvey.miller@geog.utah.edu


IEEE International Conference on Data Mining
Pisa, Italy, 18 December 2008

GIS trends
Geospatial technologies
High-resolution monitors
Location-aware technologies
Geosensor networks
etc., etc.

A tsunami of digital geo-data


Increased volume
Giga to terabyte and beyond

Increased coverage
Seamless databases

Increased spectrum
Text, sound, imagery

Introduction
Geospatial knowledge discovery
Human-centered process of extracting novel, interesting and useful patterns from georeferenced data
A (very!) special case of KDD

Why is spatial special? - a mini course


Location is important
Observations are not independent
Errors are often spatial
Relationships are often local
Non-linearity is typical
Distributions are non-normal
Highly multivariate but often redundant
Time often interacts with space
Many data layers are categorical
Data objects often cannot be reduced to points
Spatially aggregated data are modifiable

- after Openshaw (1999)

Introduction
Geographic theory and data mining
There is a rich and underexploited body of geographic theory
This can help guide the GKD process
Techniques
Background knowledge
Pattern evaluation
etc.

Geography: it's not just trivia!

Great geographic theories


Spatial dependency
Tobler's first law
Cartographic transformations

Spatial heterogeneity
Spatial non-stationarity
Disaggregate spatial statistics

Spatial interaction
Spatial interaction theory
Time geography

Spatial organization
The concept of region
Spatial logic

What I will not talk about


The map
One of the most powerful technologies in the history of civilization
Still evolving!
Useful in GKD
Interfaces
Pattern visualization
Earliest known map of the world - sixth century B.C.E
www.gutenberg.org

Google Earth

www.antweb.org

But, why the map?



What I will not talk about


Domain theory
Theories about processes specific to particular domains
Ecosystems: biology, biogeography
Landscapes: geology, geomorphology
Cities: economics, political science, sociology, geography

These are theories with geospatial components, but they are not uniquely geographic

What I am seeking and why

A theory of geography
- a unique perspective
- a coherent way of thinking
- amenable to formal and computational representation

Framework for organizing GKD

Suggest new techniques and strategies

Spatial dependency
Tobler's First Law of Geography
"Everything is related to everything else, but near things are more related than distant things"
Everything is related to everything else
Spatial interdependency

Waldo Tobler receiving the 1999 ESRI Lifetime Achievement Award


Susanna Baumgart - UCSB

Near things are more related than distant things
Interdependency and proximity

Tobler, W. R. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46: 234-240.

esri.com

Spatial dependency
Spatial autocorrelation
Association based on geospatial proximity

Confounding
Something to be corrected e.g., econometrics

Informing
Reveals information about spatial process e.g., spatial autocorrelation statistics, spatial econometrics
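
As a rough illustration of a spatial autocorrelation statistic, the sketch below computes global Moran's I, one common choice. The binary "chain" weights and the two toy series are assumptions made up for this example; they are not taken from the Body Mass Index study shown on the slide.

```python
def morans_i(values, weights):
    """Global Moran's I for values y_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(weights[i][j] for i in range(n) for j in range(n))
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)


def chain_weights(n):
    """Hypothetical binary contiguity: locations on a line, neighbours adjacent."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


w = chain_weights(6)
i_smooth = morans_i([1, 2, 3, 4, 5, 6], w)       # neighbours similar: positive I
i_alternating = morans_i([1, 6, 1, 6, 1, 6], w)  # neighbours dissimilar: negative I
```

Positive values indicate that nearby observations are similar and negative values that they are dissimilar. Whether this is treated as confounding or informing depends on the analysis.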

Body Mass Index in Salt Lake City, USA Dr. Ikuho Yamada, University of Utah


Spatial dependency
Spatial interpolation
Estimate variables at unobserved locations using values at observed locations
Based on modeled proximity relationships
e.g., IDW, kriging
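
A minimal sketch of inverse distance weighting (IDW), assuming planar coordinates, Euclidean distance and a power of 2. The sample points and values are hypothetical.

```python
import math


def idw(samples, target, power=2.0):
    """Inverse distance weighted estimate at target.

    samples: list of ((x, y), value) pairs at observed locations.
    target:  (x, y) of the unobserved location.
    """
    num = 0.0
    den = 0.0
    for (x, y), value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with an observation
        weight = 1.0 / d ** power  # nearer observations count for more
        num += weight * value
        den += weight
    return num / den


# Hypothetical observations
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0), ((0.0, 2.0), 30.0)]
estimate = idw(samples, (1.0, 0.0))
```

The distance function is the "modeled proximity relationship" here. Swapping in travel time or another length metric changes the estimate without changing the rest of the code.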

Spatial interpolation of influenza over time in Europe 2004-2005


www.eurosurveillance.org

Spatial dependency
Geo-space
Proximity is the core of theories of geo-space
Two main components
Locations
Length metric

Miller and Wentz (2003) Annals, AAG

Formal theory
Beguin and Thisse (1979), others

Admits a wide range of length metrics


Including semi-metrics

The fundamental tenet of geography: Geo-space is explanatory

Spatial dependency
Geo-space does not have to be Euclidean
Geographic processes can follow other metrics

Cartographic transformations
Project geo-space based on:
Alternative proximity relationships
Smooth spatial heterogeneity

Swedish migration map Hagerstrand (cited by Tobler 1963)

Why?
Visualization
Improve explanation
CartoDraw: Keim, North & Panse

Spatial dependency

Cliff and Haggett (1998)

Air passenger flows in Iceland

Iceland in air passenger space

Spatial modeling of disease propagation in an alternative geo-space



Spatial dependency
Time-space maps
Map with separation measured in travel time
Why?
Exploratory visualization
Synoptic summary
Greater explanatory power

Time-space transformation of Salt Lake City, USA

Nobbir Ahmed and Miller (2007)


Based on average daily travel times (vectors represent displacements)

Spatial dependency

Time-space transformations, 1992 and 2001, for 4 periods of the day: morning, midday, afternoon and evening

Spatial dependency

Highly stressed solution: western SLC

Less stressful surface representation

Solutions are sometimes > 3D



Spatial dependency
Comparing alternative spaces
Bi-dimensional regression
Degree of fit between two planar configurations
After transformations, rotations and translations
Tobler (1994)

Fit and significance measures


Including spatial variation

Can be extended to higher dimensions
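
A minimal sketch of the Euclidean (similarity) case of bi-dimensional regression, assuming both configurations are lists of matched planar points. Writing points as complex numbers, it fits w = a + b z by least squares, where a is the translation, |b| the scale and arg(b) the rotation. The function name and the test configuration are assumptions for illustration; the affine and curvilinear variants are not covered.

```python
def bidimensional_regression(source, target):
    """Fit target ~ a + b * source over matched 2D points (complex form).

    Returns (a, b, r2): translation, scale-and-rotation, and goodness of fit.
    """
    z = [complex(x, y) for x, y in source]
    w = [complex(u, v) for u, v in target]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    zc = [p - z_mean for p in z]  # centred source points
    wc = [q - w_mean for q in w]  # centred target points
    b = (sum(p.conjugate() * q for p, q in zip(zc, wc))
         / sum(abs(p) ** 2 for p in zc))
    a = w_mean - b * z_mean
    residual = sum(abs(q - b * p) ** 2 for p, q in zip(zc, wc))
    total = sum(abs(q) ** 2 for q in wc)
    return a, b, 1.0 - residual / total


# Hypothetical example: a unit square rotated 90 degrees, doubled, and shifted
source = [(0, 0), (1, 0), (1, 1), (0, 1)]
target = [(3, 4), (3, 6), (1, 6), (1, 4)]
a, b, r2 = bidimensional_regression(source, target)
```

An r2 near 1 means the second configuration is close to a rotated, scaled and shifted copy of the first. Lower values indicate distortion that a similarity transformation cannot absorb.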


Nobbir Ahmed and Miller


Spatial heterogeneity
Geographic variation occurs naturally
Friction of distance
Relative location

Spatial processes are non-stationary


Apparent variation in process with respect to location
If it's stationary, it's not spatial!
www.geovista.psu.edu

Spatial heterogeneity
Question: What do Charles Darwin and Paul Krugman have in common?
uk.gizmodo.com

Besides beards

ericblackink.minnpost.com


Spatial heterogeneity
Both recognized the power of spatial heterogeneity
Darwin
uk.gizmodo.com

Observed geographic variation in species
Natural selection leads to differences in species diversity and composition among different geographic locations
Long distance dispersal results in geographic isolation and evolutionary divergence


Spatial heterogeneity
Both recognized the power of spatial heterogeneity
Krugman
New economic geography
Geographic variation in productive factors
Increasing returns enhance variation and lead to greater heterogeneity

ericblackink.minnpost.com


Spatial heterogeneity
Disaggregate spatial statistics
Decompose processes by location
Examples
Getis-Ord G
K-function analysis
Geographically weighted regression

Unimaginable prior to GIS!


Data intensive
Visually intensive

Local clustering of birth defects, Shanxi province, China.


Wu et al (2004) www.biomedcentral.com

Spatial heterogeneity
Geographically weighted regression
Assess spatial variation in model structure
Parameter estimates
Parameter errors
Goodness of fit
Influence
GWR with different spatial lags (Laffan, S. W., GeoComputation 99)

Determine whether variation is systematic


Validate models between data subsets
Ask questions about spatial structures in data
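
A minimal sketch of the idea behind geographically weighted regression, assuming a single predictor and a Gaussian distance-decay kernel with a fixed bandwidth. A separate weighted least squares fit is calibrated at each location, so the slope can vary over space. The synthetic data, in which the true slope is 1 in the west and 3 in the east, are hypothetical.

```python
import math


def gwr_fit_at(point, coords, x, y, bandwidth):
    """Weighted least squares fit of y = b0 + b1 * x calibrated at one location.

    Observations are weighted by a Gaussian kernel of their distance to point.
    """
    w = [math.exp(-0.5 * (math.dist(point, c) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: 20 locations on a line; the process changes at s = 10
coords = [(float(s), 0.0) for s in range(20)]
x = [float(s % 4) for s in range(20)]
y = [(1.0 if s < 10 else 3.0) * xv for s, xv in zip(range(20), x)]

_, slope_west = gwr_fit_at(coords[2], coords, x, y, bandwidth=2.0)
_, slope_east = gwr_fit_at(coords[17], coords, x, y, bandwidth=2.0)
```

Mapping the local slopes, and their standard errors, is what yields the parameter surfaces listed above. A single global regression on these data would report one intermediate slope and hide the spatial variation.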


Spatial heterogeneity

Fotheringham and Demšar (2009)

[Figure: two maps. Parameter estimates (PARM_4), five classes from -1.857220 to 3.123810. t-tests (TVAL_4), five classes from -3.008610 to 5.342070, with class breaks at -2.58, -1.96, 1.96 and 2.58.]

GWR: Spatial variation in the effect of social class on voter turnout, Dublin, Ireland

Spatial heterogeneity
GWR and visual insight
Use visual analytics to explore parameter space
Example
SOM clusters based on eight parameters
Cartographic visualization of clusters

Fotheringham and Demšar (2009)

Spatial heterogeneity
GWR and visual insight
Use visual analytics to explore parameter space
Example
Cluster selection
Parallel coordinate plot of clusters across all eight variables

Fotheringham and Demšar (2009)

Spatial interaction
Spatial interaction theory
Linkages and flows between locations
Spatial separation (-)
Complementarity (+)
Origin supply
Destination demand

Can be multidimensional
Map multiple variables into a single measure

Originally an analogy with Newton's Law of Gravitation, but...

Spatial interaction

spatial interaction has a solid theoretical base


Entropy maximization
Alan Wilson 1960s

Discrete choice theory


Stewart Fotheringham 1980s

which has resulted in a wide spectrum of models


Flow total constraints and quasi-constraints
Spatial association among origins, destinations
Behavioral processes
etc., etc.
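
As one concrete member of this family, the sketch below implements an origin-constrained spatial interaction model with a negative exponential cost function, the form usually associated with the entropy-maximizing derivation. The flow totals leaving each origin are fixed and are allocated among destinations by attractiveness and separation. The origin totals, attractiveness values, costs and beta are hypothetical.

```python
import math


def origin_constrained_flows(origins, destinations, cost, beta):
    """T_ij = O_i * D_j * exp(-beta * c_ij) / sum_k D_k * exp(-beta * c_ik).

    origins:      O_i, total flow leaving each origin (the constraint).
    destinations: D_j, attractiveness of each destination.
    cost:         c_ij, separation between origin i and destination j.
    beta:         distance-decay parameter (larger = stronger friction).
    """
    flows = []
    for i, o_i in enumerate(origins):
        pull = [d_j * math.exp(-beta * cost[i][j])
                for j, d_j in enumerate(destinations)]
        total = sum(pull)
        flows.append([o_i * p / total for p in pull])
    return flows


# Hypothetical two-origin, two-destination system
origins = [100.0, 200.0]
destinations = [50.0, 50.0]
cost = [[1.0, 3.0],
        [3.0, 1.0]]
flows = origin_constrained_flows(origins, destinations, cost, beta=0.5)
```

Each row of flows sums to the corresponding origin total. With equally attractive destinations, the nearer one receives the larger share.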

orgnet.com

Spatial interaction
Data mining of spatial interactions
Existing techniques
Connections
Flows

Visualizing social networks

Need better techniques


Attributed flows
Spatial object dyads
Origin-destination pairs
CubeView: detecting outliers in flows (Shashi Shekhar)

Spatial interaction
The death of distance?
Distance is changing
High mobility
Connectivity

Convergence: Edinburgh and London 1658-1950

Space-adjusting technologies (Ron Abler)


Change the nature of space with respect to the time, cost and effort

Space-time convergence (Don Janelle)
Shrinking of distance due to transport
Rate per unit time


Janelle 1969
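
The convergence rate can be read as the average decline in travel time between two places per unit of calendar time. A minimal sketch under that reading follows. The travel times are hypothetical and are not the Edinburgh-London figures from the chart.

```python
def convergence_rate(time_start, time_end, year_start, year_end):
    """Average reduction in travel time per year between two dates.

    Positive values mean the places are converging in time-space.
    """
    return (time_start - time_end) / (year_end - year_start)


# Hypothetical: a trip that took 600 minutes in 1900 and 300 minutes in 1950
rate = convergence_rate(600.0, 300.0, 1900, 1950)  # 6.0 minutes per year
```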


Spatial interaction
Telepresence (Don Janelle)
Participate in events without physical presence

Space-time fragmentation (Helen Couclelis)
Spatial fragmentation

Activities not tightly coupled with place

Temporal fragmentation
Activities outside standard hours
Fluid time
Short planning horizons - Flocking
Why let climbing a mountain interfere with business?
Mt. Olympus, Utah, 18 June 2006

Need to expand theories of spatial interaction



Spatial interaction
Time geography
Individual in geo-space and time
Constraints imposed by:
Activity timing
Activity locations
Mobility resources
Ability to trade time for space
Miller (2005)

Space-time path
Realized movement

Meipo Kwan

Paths in theory and practice

Spatial interaction
Time geography
Individual in geo-space and time
Constraints imposed by:
Activity timing
Activity locations
Mobility resources
Ability to trade time for space

[Figure: space-time prism drawn on space (x) and time (t) axes, labeled with x_i, x_j, t_i, t_j, t_ij, a_ij and v_ij]

Miller and Bridwell (2009)

Space-time prism
Potential movement
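
A minimal sketch of the prism as a reachability test, assuming planar space, a uniform maximum speed and two fixed anchors (a location and time where the individual must start, and a location and time where they must end). The function names, anchors and speed are hypothetical; the network-based and field-based prisms in the cited work are more general.

```python
import math


def in_prism(x, t, anchor_i, anchor_j, v_max):
    """Can the individual be at location x at time t and still meet both anchors?"""
    (xi, ti), (xj, tj) = anchor_i, anchor_j
    if not ti <= t <= tj:
        return False
    reachable_from_start = math.dist(xi, x) <= v_max * (t - ti)
    can_reach_end = math.dist(x, xj) <= v_max * (tj - t)
    return reachable_from_start and can_reach_end


def in_potential_path_area(x, anchor_i, anchor_j, v_max, activity_time=0.0):
    """Is location x reachable at all, leaving activity_time to spend there?"""
    (xi, ti), (xj, tj) = anchor_i, anchor_j
    travel_budget = tj - ti - activity_time
    return math.dist(xi, x) + math.dist(x, xj) <= v_max * travel_budget


# Hypothetical anchors: leave (0, 0) at t = 0, arrive at (4, 0) by t = 10
start = ((0.0, 0.0), 0.0)
end = ((4.0, 0.0), 10.0)
detour_ok = in_potential_path_area((2.0, 3.0), start, end, v_max=1.0)
detour_with_stop = in_potential_path_area((2.0, 3.0), start, end, v_max=1.0,
                                          activity_time=4.0)
```

The second check shows the trade of time for space: the same detour becomes infeasible once four time units must be spent at the stop.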

Prism in theory and practice

Spatial interaction
Communication modes based on spatio-temporal constraints
                Synchronous             Asynchronous
Presence        SP: Face-to-face        AP: Post-it notes
Telepresence    ST: Telephone, TV       AT: Mail, email, webpages
(rows: spatial constraint; columns: temporal constraint)

Possible time geographic expressions
Relations between paths
Temporal events

Janelle (1995)


Space-time cube: Visual analytic environment for exploratory time geography (Kraak and Huisman)

Visualizing the intersection of multiple space-time prisms
Linking space-time paths with attributes

Spatial interaction
Interactive, multiscale visualization of space-time paths
Explore paths at different levels of spatio-temporal granularity
Aggregation based on spatial similarity and attribute similarity (eventually)

Tetsuo Kobayashi and Miller (in progress)



Spatial organization
www.desertmuseum.org

The concept of region


Partitioning of geographic space based on homogeneity
Two types of region
Formal
Explicit
Land cover, terrain, settlement patterns

Functional
Implicit
Organization, interactions, linkages

Formal regions based on biogeography


Spatial organization
Regions and locational processes
Functional regions highlight the interplay between spatial process and spatial pattern

Von Thünen bid-rent theory (www.rri.wvu.edu)


teacherweb.ftl.pinecrest.edu


wolf.readinglitho.co.uk

Spatial organization
Central Place Theory
Theory of the frequency, size and spacing of cities as market centers

Nesting of market areas and cities
Distance
Transport
Administration

Wikipedia

Spatial organization
Spatial logic
A route to explanation
Spatial logic: Pattern suggests process
Process logic: Process suggests patterns

Why?
Patterns are integrated manifestations of complex processes
Continental drift inferred through spatial logic by Alfred Wegener (1912)

Why not?
Difficult to distinguish individual processes
Equifinality

Spatial organization
Geography and complexity
Can spatial interaction explain intricate geographic patterns?
Complexity theory
Simple, local interactions can generate complex global behavior
Importance of geographic context
Pattern and intensity of interactions

Spatial organization
The problem of arbitrary regions
Arbitrary regions lead to artifacts
Two types of effects
Scale
Zoning

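[Figure: a 3 x 3 grid of values and three alternative aggregations into zones]

Original grid (n = 9; mean = 8.89):
10   5   5
15  10  10
 5  15   5

Aggregation 1 (n = 3; mean = 8.34): zone means 6.67, 11.67, 6.67
Aggregation 2 (n = 3; mean = 8.47): zone means 7.5, 11.25, 6.67
Aggregation 3 (n = 3; mean = 9.33): zone means 12.5, 8, 7.5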

Solutions
Design optimal regions
No regions!
Assess effects

Modifiable Areal Unit Problem example


(after Oliver; www.geog.ubc.ca)
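
The zoning effect can be reproduced with the nine cell values from the example. In the sketch below the two zone layouts are assumptions: the figure's actual zone boundaries are not recoverable from the text, so the irregular layout was chosen only because it yields zone means of 12.5, 8 and 7.5.

```python
GRID = [[10, 5, 5],
        [15, 10, 10],
        [5, 15, 5]]


def mean(values):
    return sum(values) / len(values)


def zone_means(grid, zoning):
    """Average the cell values within each zone label of zoning."""
    groups = {}
    for row_values, row_zones in zip(grid, zoning):
        for value, zone in zip(row_values, row_zones):
            groups.setdefault(zone, []).append(value)
    return [mean(groups[zone]) for zone in sorted(groups)]


cell_mean = mean([v for row in GRID for v in row])

# Assumed layout 1: three equal-sized zones, one per row
by_rows = zone_means(GRID, [[0, 0, 0],
                            [1, 1, 1],
                            [2, 2, 2]])

# Assumed layout 2: three contiguous zones of 2, 5 and 2 cells
irregular = zone_means(GRID, [[0, 1, 1],
                              [0, 1, 2],
                              [1, 1, 2]])

mean_by_rows = mean(by_rows)
mean_irregular = mean(irregular)
```

Equal-sized zones preserve the overall mean of 8.89 but already change the spread of values. The unequal zones shift the mean itself to 9.33, even though the underlying data are identical.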

Is there a theory of geography?


Yes!
A unique perspective focusing on the role of spatio-temporal proximity

Is it formal?
Yes: geographers have been building the formal and analytical foundations of their field

Is it coherent?
Yes, but it is not unified
Still need a grand unified theory derived from first principles

Corné van Elzakker, ITC; www.itc.nl

Opportunities and challenges


Spatial patterns & relations
Potentially large!
Geo-theory can guide GKD
Background knowledge
Pattern evaluation

Background knowledge: challenges


Geographic ontology
Concepts can be abstract, vague, multi-level
Concept hierarchy for location - based on Han and Kamber (2003)

Knowledge extraction
Geo-theory: Implicit information
Equations, algorithms, etc

KD: Explicit information


Networks, hierarchies, rules

Opportunities and challenges


Spatial pattern evaluation
Reality = theory
Interesting but not novel

Reality = null
Not interesting or novel

Between theory and null


May be interesting and novel

Problems
What is a good spatial null?
Not Complete Spatial Randomness (CSR)

What is the metric?


How do we measure spatial departures from theory and null?


Opportunities and challenges


Geographic theory as a pattern filter
Spatial data mining often generates a large number of spatial and temporal patterns and relationships

Meta-mining

(Roddick 1999)

Mining the results of previous mining exercises Derive higher-level patterns and rules


Opportunities and challenges


Algorithms and infrastructure
Geographic models and techniques can be computationally complex
Often involve pairwise distances between all geo-locations

Research needs
Heuristics
High-performance computing

This is a surprisingly underresearched area!
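
One widely used heuristic for avoiding all-pairs distance calculations is a uniform grid index: when only pairs within some radius matter, each point needs to be compared only with points in its own and adjacent grid cells. The sketch below assumes planar coordinates and Euclidean distance. The function names and points are hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, cell):
    """Bucket point indices by the grid cell that contains them."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid


def pairs_within(points, radius):
    """All index pairs (i, j), i < j, whose points lie within radius.

    With cell size equal to radius, any qualifying pair falls in the same
    cell or in two adjacent cells, so only 3 x 3 neighbourhoods are scanned.
    """
    grid = build_grid(points, radius)
    found = set()
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j and math.dist(points[i], points[j]) <= radius:
                            found.add((i, j))
    return found


# Hypothetical locations
pts = [(0.0, 0.0), (0.5, 0.5), (3.0, 3.0), (3.2, 3.1),
       (10.0, 10.0), (0.9, 0.1), (-0.4, -0.3)]
close_pairs = pairs_within(pts, 1.0)
```

For points spread evenly over a study area the work grows roughly linearly with the number of points rather than quadratically. It offers no gain when the radius of interest spans the whole area.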



10 years old!

Conclusion
Geographic theory
Rich, coherent, formal
Useful, but underexploited in data mining
Waiting to be discovered
wikimedia.org

Help fill the blank spots on the map!



Bibliography
Ahmed, N. and Miller, H. J. (2007) "Time-space transformations of geographic space for exploring, analyzing and visualizing transportation systems," Journal of Transport Geography, 15, 2-17.
Beguin, H. and Thisse, J. F. (1979) "An axiomatic approach to geographical space," Geographical Analysis, 11, 325-341.
Fotheringham, A. S. (1983) "A new set of spatial-interaction models: The theory of competing destinations," Environment and Planning A, 15, 15-36.
Fotheringham, A. S. and Demšar, U. (2009) "Looking for a relationship? Try GWR," in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery, second edition, Taylor and Francis, in press.
Getis, A. and Ord, J. K. (1992) "The analysis of spatial association by use of distance statistics," Geographical Analysis, 24, 189-206.
Janelle, D. G. (1969) "Spatial organization: A model and concept," Annals of the Association of American Geographers, 59, 348-364.
Janelle, D. G. (1995) "Metropolitan expansion, telecommuting and transportation," in S. Hanson (ed.) The Geography of Urban Transportation, New York: Guilford, 407-434.
Kraak, M.-J. and Huisman, O. (2009) "Beyond exploratory visualization of space-time paths," in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery, second edition, Taylor and Francis, in press.
Miller, H. J. (2004) "Tobler's First Law and spatial analysis," Annals of the Association of American Geographers, 94, 284-289.
Miller, H. J. (2005) "A measurement theory for time geography," Geographical Analysis, 37, 17-45.
Miller, H. J. (2005) "Necessary space-time conditions for human interaction," Environment and Planning B: Planning and Design, 32, 381-401.
Miller, H. J. and Bridwell, S. A. (2009) "A field-based theory for time geography," Annals of the Association of American Geographers, 99 (in press).
Miller, H. J. and Wentz, E. A. (2003) "Representation and spatial analysis in geographic information systems," Annals of the Association of American Geographers, 93(3), 574-594.
Tobler, W. R. (1963) "Geographic area and map projections," The Geographical Review, 53, 59-78.
Tobler, W. R. (1970) "A computer movie simulating urban growth in the Detroit region," Economic Geography, 46, 234-240.
Tobler, W. R. (1994) "Bi-dimensional regression," Geographical Analysis, 26, 187-212.
