

Technology & Standards Watch (TechWatch)
www.jisc.ac.uk/techwatch




Horizon Scanning report 10_01
First published: Sept. 2010




Data mash-ups and the future of mapping

by

Centre for Advanced Spatial Analysis (CASA), University College London:
Michael Batty, Andrew Crooks, Andrew Hudson-Smith, Richard Milton

Centre for Geospatial Science (CGS), University of Nottingham:
Suchith Anand, Mike Jackson, Jeremy Morley

Reviewed by:

James Reid
Team Leader and Business Development Manager, Geoservices
EDINA

Andrew Turner
Deputy Director, Centre for Computational Geography
University of Leeds



To ensure you are reading the latest version of this report you should always
download it from the original source.

Original source: http://www.jisc.ac.uk/techwatch
Version: 1.0
This version published: Sept. 2010
Publisher: JISC: Bristol, UK
Copyright owners: Suchith Anand, Michael Batty, Andrew Crooks, Andrew Hudson-Smith,
Mike Jackson, Richard Milton, Jeremy Morley


Executive Summary

The term 'mash-up' refers to websites that weave data from different sources into new Web
services. The key to a successful Web service is to gather and use large datasets and harness
the scale of the Internet through what is known as network effects. This means that data
sources are just as important as the software that 'mashes' them, and one of the most profound
pieces of data that a user has at any one time is his or her location. In the past this was a
somewhat fuzzy concept, perhaps as vague as a verbal reference to being in a particular shop
or café or an actual street address. Recent events, however, have changed this. In the 1990s,
President Bill Clinton's policy decision to open up military GPS satellite technology for 'dual-
use' (military and civilian) resulted in a whole new generation of location-aware devices.
Around the same time, cartography and GIScience were also undergoing dramatic, Internet-
induced changes. Traditional, resource intensive processes and established organizations, in
both the public and private sectors, were being challenged by new, lightweight methods. The
upshot has been that map making, geospatial analysis and related activities are undergoing a
process of profound change. New players have entered established markets and disrupted
routes to knowledge and, as we have already seen with Web 2.0, newly empowered amateurs
are part of these processes. Volunteers are quite literally grabbing a GPS unit and hitting the
streets of their local town to help create crowdsourced datasets that are uploaded to both open
source and proprietary databases.

The upshot is an evolving landscape which Tim O'Reilly, proponent of Web 2.0 and always
ready with a handy moniker, has labelled Where 2.0. Others prefer the GeoWeb, Spatial Data
Infrastructure, Location Infrastructure, or perhaps just location-based services. Whatever one
might call it, there are a number of reasons why its development should be of interest to those
in higher and further education. Firstly, since a person's location is such a profound unit of
information and of such value to, for example, the process of targeting advertising, there has
been considerable investment in Web 2.0-style services that make use of it. Understanding
these developments may provide useful insights for how other forms of data might be used.
Secondly, education, particularly research, is beginning to realize the huge potential of the
data mash-up concept. As Government, too, begins to get involved, it is likely that education
will be expected to take advantage of, and indeed come to relish, the new opportunities for
working with data.

Since, as this report makes clear, data mash-ups that make use of geospatial data in some
form or other are by far the most common mash-ups to date, they are likely to provide
useful lessons for other forms of data. In particular, the education community needs to
understand the issues around how to open up data, how to allow data to be added to in ways
that do not compromise accuracy and quality and how to deal with issues such as privacy and
working with commercial and non-profit third parties—and the GeoWeb is a test ground for
much of this. Thirdly, new location-based systems are likely to have educational uses by, for
example, facilitating new forms of fieldwork. Understanding the technology behind such
systems and the way it is developing is likely to be of benefit to teachers and lecturers who
are thinking about new ways to engage with learners. And finally, there is a future watching
aspect. Data mash-ups in education and research are part of an emerging, richer information
environment with greater integration of mobile applications, sensor platforms, e-science,
mixed reality, and semantic, machine-computable data. This report starts to speculate on
forms that these might take, in the context of map-based data.



Table of Contents
Executive Summary
1. Introduction
1.1 Background and context
1.2 Summary and rationale
2. State of Play: Maps, mash-ups and the GeoWeb
2.1 Harnessing the power of the crowd
2.2 Individual production and user-generated content
2.3 Openness
2.4 Network effects and the architecture of participation
2.5 Data on an epic scale
3. Technologies and Standards
3.1 The role of Ajax and other advances in Web technology
3.2 Map mash-up basics
3.3 Specific technologies for map mash-ups
3.4 Standards and infrastructure
4. The future of data mash-ups and mapping
4.1 Semantic mash-ups
4.2 Mobile mash-ups
4.3 Geo-location on the social Web
4.4 Augmented Reality
4.5 Sensors
4.6 3-D and immersive worlds
4.7 HTML5
4.8 Policy, standards and the wider context
Conclusions and recommendations
About the Authors
References



1. Introduction
What they are all seeing is nothing less than the future of the World Wide Web. Suddenly
hordes of volunteer programmers are taking it upon themselves to combine and remix the
data and services of unrelated, even competing sites. The result: entirely new offerings they
call ‘mash-ups’.

Hof, 2005 (online)

Originally the term mash-up was used to describe the mixing or blending together of musical
tracks. The term now refers to websites that weave data from different sources into new Web
services (also known simply as 'services'), as first noted by Hof (2005). Although 'mash-up'
can refer to fusing disparate data on any particular topic the focus in this report is on mash-
ups with some spatial or geographic element. In fact, most map mash-ups blend software and
data, using one or more APIs1 provided by different content sites to aggregate and reuse data,
as well as adding a little personalized code or scripting to create either a new and distinct
Web service or an individualized, custom map.
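
To make the 'mashing' step concrete, the following sketch (in Python, using invented
sample records rather than any real service) joins attribute data from one source to
coordinates from another and re-expresses the result as GeoJSON-style features of the
kind a mapping API can overlay:

    # A minimal sketch of the 'mashing' step behind a map mash-up: attribute
    # records from one source are joined to coordinates from another and
    # re-expressed as GeoJSON-style features for a map overlay.
    # All sample records are invented for illustration.

    accidents = [
        {"postcode": "NG7 2RD", "injuries": 3},
        {"postcode": "NG1 5DT", "injuries": 1},
    ]

    # A second source supplying coordinates for each postcode (in practice a
    # geocoder or gazetteer service would provide these).
    locations = {
        "NG7 2RD": (52.9390, -1.1950),
        "NG1 5DT": (52.9536, -1.1500),
    }

    def mash(records, gazetteer):
        """Join attribute records to coordinates, yielding GeoJSON-style features."""
        for rec in records:
            lat, lon = gazetteer[rec["postcode"]]
            yield {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": rec,
            }

    for feature in mash(accidents, locations):
        print(feature)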

The Traffic Injury Map2 illustrates a typical example of a map mash-up. Using data from
the UK Data Archive and National Highway Traffic Safety Administration (US), injuries
resulting from traffic accidents are presented on a Google Maps cartographic base (see
Figure 1). The mash-up enables users to identify areas with frequently occurring accidents
and allows specification of categories to distinguish between accident victims such as
cyclists or children.

Figure 1: Traffic Injury Map showing incidents in the Nottingham area

Map mash-ups are the most commonly developed type of mash-up application. According to
statistics from ProgrammableWeb,3 a website detailing APIs and mash-up news, mapping
APIs (e.g. Google Maps, Microsoft Virtual Earth [now Bing Maps] and Yahoo! Maps)
constituted 52% of the most popular APIs in July 2009 (see Figure 2a) – although by October
2009 this was down to 42% (see Figure 2b), and with a different mix of APIs (Google Maps,
GeoNames and Google Maps for Flash). This shows the dynamism of this area, albeit with
Google dominating the mapping APIs. In this case, the reduction in the total percentage of
APIs that are map based is mostly due to the rapid rise in mash-ups involving the social
micro-blogging site Twitter (from 5% in July to 20% in October 2009).

1. Application Programming Interfaces (APIs) are defined here as software programs that
interact with other software to reduce the barriers to developing new applications. For map
mash-ups, applications can be created with nothing more than a simple text editor, with the
API providing customizable map 'tiles'.
2. http://www.road-injuries.info/map.html
3. http://www.programmableweb.com/

Figure 2: Most popular APIs for mash-ups as listed by ProgrammableWeb (a) in July 2009 (b) in
October 2009

1.1 Background and context


Data mash-ups utilize the ideas of Web 2.0 (e.g. data on an epic scale), but because Web 2.0
as a term is starting to fall out of favour it is important to be clear about how it is used in this
report. For some, talking simply about technology developments, 'Web 3.0' may seem to be
somehow more current, describing 'the next phase' of the Web's progression, perhaps the
semantic Web. However, it is often forgotten that O'Reilly's original description of Web 2.0
(2005) is as much about the ideas behind technology developments as it is about the
technologies themselves. These ideas are still current – they have not yet changed in response
to new technology developments – and are therefore still valid as an analytical framework for
understanding data mash-ups and speculating on how they will progress.

This section will therefore discuss some of the background and context to the emergence of
Web 2.0-influenced geospatial systems and map mash-ups. For example, by understanding
something of the way that geospatial information systems developed before the arrival of the
Internet and then the Web, and how traditional data capture and spatial analysis are undertaken,
it is possible to glean something of the impact of new ideas, particularly Web 2.0. In this section
we will review these developments briefly through the concepts of software and data as a
service. Finally, we will place these developments within the context of some of the existing
activities within JISC and the wider higher/further education (HE/FE) sector.

1.1.1 Pre-Web Geographical Information Systems



Computer cartography started almost as soon as the idea of graphics emerged in the 1950s
and '60s with the invention of pen plotters and line printers. A key development in this regard
came at Harvard, where the SYnagraphic MAPping system (SYMAP) was developed. However, it was
only when graphics tube technology was replaced by screen memories (graphics cards), as a
consequence of the PC revolution that began in the late 1970s, that graphics and,
subsequently, computer-based mapping really began to take off.

There were no mapping systems developed for the original Apple II but the early Apple Mac
and the later IBM PC had simple systems. The Domesday Survey of Britain was available for
the BBC Micro as a kind of simple but passive map. Rudimentary mapping software and then
GIS4 began to appear in the mid to late 1980s but initially they consisted of migrations down
from mainframe systems via specialist workstations. Even in the late 1980s, unless you were
working with graphics and mapping on workstations, most graphics processing, even on
minicomputers, was based on processing first and then separate display. The Digital Map of
the World ran on PCs in the early 1990s but it was not until Windows 95 that desktop GIS
really made its entry with ArcView (notwithstanding slightly earlier desktop systems such as
MapInfo). Desktop systems began to mature as workstations disappeared and graphics and
related computer memories got ever larger.

4. Geographic Information Systems (GIS) is a generic term that refers to technology that
stores, processes, analyses and displays geospatial data. This kind of software tends to be
expensive to produce, requires highly skilled professionals to operate it, and is used to
handle large, complex datasets for end-users with very precise requirements. In addition, the
geographic data that these systems make use of can also be expensive to procure and users
of the data must often adhere to licence agreements designed to prevent unlawful sharing or
copying.

In the late 1990s, two things came into this mix that built on the Web. Firstly, what might be
called Internet GIS: servers that held a GIS system and associated data and which served
maps and analysis functions via the network to client-side machines—usually specialist
workstation machines developed mainly for professional use. Xerox PARC was a pioneer in
this area, with the Xerox Map Viewer as a notable example. Secondly came online maps
displayed through the Web browser, although the map was not really very manipulable and
was used simply for basic navigation with zoom and pan facilities.

By the late 1990s, various products that enabled users to 'find their way' – gazetteers and
atlases such as MapQuest – had appeared and, in the early 2000s, value was being added to
these interfaces as they began to be customized to provide new layers of spatial data that
users could query. A good example in the UK is UpMyStreet, which was first created in 1998
and now contains a wealth of local information targeted around the search for property and
local services.

It is important to be clear that such browser-based systems offered little or no spatial analysis,
which is the heartland of professional GIS. In fact, this distinction between professional GIS
functionality (which we will call spatial analysis) and computer maps is still one that
dominates the field, and most Web-based map mash-ups today do not deal with spatial
analysis.

1.1.2 GIS data capture and management

Traditionally, geospatial work is carried out using GIS software. This is a specialized area of
work that is generally known as Geographic Information Science or GIScience (Goodchild,
1992). The specialized nature of GIS applications means their use is often restricted, with
several barriers to widespread operation. Firstly, data and software are expensive and,
secondly, trained personnel are required to operate and manage these systems successfully.

Historically, capturing geographic data has been expensive and time consuming. Primary data
are collected by surveyors who make use of advanced equipment such as total stations, Real
Time Kinematic GPS, aerial photography, terrestrial and airborne laser scanners (LIDAR),
amongst other tools, in order to capture detailed measurements and positions of objects in
three-dimensional space. The cost of acquiring the equipment and the knowledge needed to
operate and deploy it for mapping purposes means that geographic survey remains a skill
carried out by highly trained personnel. Secondary data capture involves the digitization of
existing geographic data contained in paper maps, and this may include scanning them to
create an image file or tracing the cartography to create a geographic database of the features.
The latter method is usually preferable although it is a time consuming process.

In Britain, topographic map data capture and supply is most commonly carried out by
Ordnance Survey, with companies such as NAVTEQ and Tele Atlas supplying much of the
consumer-grade data used for navigation devices and electronic maps. There are also
Government suppliers and large data vendors of remote/satellite imagery, such as NASA,
Digital Globe and GeoEye, and a variety of other infrastructure, business, and sensor
information suppliers. There is similarly both public sector and commercial acquisition and
supply of a wide range of thematic map data (geology, soils, hydrography, land use, etc.). The
importance of the Ordnance Survey, however, is that it operates as a Government trading fund
producing the high quality, base map information that is needed for many different
applications and these data have to be accessed using a licence fee price model.6 The high
cost of data capture means the prices of geographic products can therefore also be high.
Although some Ordnance Survey data have been opened up for free public access this does
not include the bulk of data from which most of their revenue comes.

To manage spatial data and undertake geographic analysis, GIS software will typically be
used. The software provides a powerful tool for manipulating spatial data and, as with the
data capture stage, can be expensive. Typical GIS software is supplied on a 'seat licence' basis,
allowing multiple users to operate the application. Systems vary considerably, from those that
contain elaborate and extensive sets of toolbox functions that let the user customize the
software and link it to other software, to more integrated packages that have many fewer
toolbox functions. Pitney Bowes' MapInfo, for example, tends to be a more integrated
package whereas ESRI's ArcGIS is now much more of a GIS toolbox. The complexity of the
systems means that either users need to be trained or specialist operators are required, thereby
adding to the overall cost to the purchaser. Moreover, providing a Web-based GIS solution
typically requires further skills, software and hardware and will probably need a separate data
licence agreement. All of these factors mean that GIS users have to solve a further set of
technical and legal issues if they wish to display their geospatial data on the Web. As a result,
the raw geospatial dataset is usually closed to the wider world, whether deliberately or not.

In summary, when making sense of the rest of this report it is important to realize that
mapping, and the reproduction of maps on a computer screen, is a much broader
and more general topic than specialized GIScience. Both, however, have been affected by
the introduction of the Web and, more recently, the ideas of Web 2.0.

1.1.3 Software and data as a service

A key element in my thinking about Web 2.0 has always been that it's about data as a service,
not just software as a service. In fact, the two ideas become inseparable.

Tim O'Reilly, in Turner and Forrest, 2008 (p.1)

It is into this landscape that Google launched its Google Maps service in February 2005,
closely followed by its API in June of that year. Gibson and Erle (2006) argue that three
things set Google apart from previous Web and Internet-based mapping systems: a clean and
responsive user interface; fast loading, 2-D map images consisting of pre-rendered tiles (the
backcloth, or cartographic base); and, most of all, a client-side API that allowed users to
customize the backcloths by overlaying additional data and then embed the finished result
into their own website. The new service integrated map, satellite and aerial imagery together
via a simple interface which included Google's trademark, easy-to-use and accurate search
facilities as well as what were then the innovative 'pan and zoom' or 'slippy' maps (activated
by clicking and dragging) that have now become an everyday part of life for many computer
users.
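
The pre-rendered tiles behind such 'slippy' maps follow a simple addressing scheme: the
world is projected to Web Mercator and cut into 2^z by 2^z square tiles at zoom level z.
The sketch below (Python; an illustration of the standard scheme, not Google's own code)
shows the conversion from latitude and longitude to tile indices:

    import math

    def latlon_to_tile(lat_deg, lon_deg, zoom):
        """Convert a WGS84 lat/lon to slippy-map tile indices at a zoom level.

        At zoom z the Web Mercator world is cut into 2**z x 2**z square tiles;
        this is the addressing scheme used by most tiled Web maps.
        """
        n = 2 ** zoom
        x = int((lon_deg + 180.0) / 360.0 * n)
        lat_rad = math.radians(lat_deg)
        y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
        return x, y

    # Tile containing Nottingham city centre at zoom 12 (coordinates approximate).
    print(latlon_to_tile(52.9536, -1.1500, 12))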

Critically, Google does not give access to the raw data that underpins its map backcloths. This
is partly because it was initially buying its data from third party suppliers and the conditions
of use meant that Google could only supply rasterized maps based on the data, and partly
because of the kind of caching needed in order to deliver at the Internet scale (see section
3.2.2). While these factors have worked in Google's favour in terms of providing a service to
millions of users who do not necessarily understand how to deal with raw datasets, it does
impose limitations on what it is possible to create with map mash-ups, particularly with
respect to accuracy.

6. The Ordnance Survey trading fund model is different from other Government trading
funds that produce spatial data (e.g. Met. Office and United Kingdom Hydrographic Office
[UKHO]) although it is likely that it will be brought into line with other data-rich, trading
fund models in the future (DCLG, 2010).

The result of this is that we now have a software ecosystem, essentially a more diverse market
for software and services, where desktop GIS exists alongside Web-based GIS (as part of
enterprise architectures within large organizations), 2-D Web map systems that provide basic
navigation (e.g. Multimap), customizable 2-D Web map systems, and 3-D Web map systems
such as Google Earth, Microsoft Bing Maps etc. In the mix are a variety of services that tend
not to be large-scale and are usually locally developed (e.g. UCL's MapTube7).

This is underpinned by a data supply ecosystem. Data providers that collect and sell
professionally-produced data (e.g. Ordnance Survey, Infoterra) and resellers that repackage it
in order to sell it on as a 'value added' product (e.g. LandMark) have been joined by data
providers that collect their own data and make it available via a service, data providers that
collect publicly available data and repackage it (keeping it free of charge, for example via the
portals that are emerging under the Making Public Data Public initiative which has launched
the data.gov.uk site) and data providers that collect crowdsourced data and make it available
for free (e.g. OpenStreetMap).

While the issue of data licensing remains a thorny one (see section 2.3), it is important to note
the fluidity of the boundaries between the different software and data models and the potential
to 'mix and match'.

1.1.4 Mash-ups in Higher Education

The use of geospatial data mash-ups in higher education (HE) forms part of the wider debate
about the implications of Web 2.0 technologies and mash-ups (Lamb, 2007). Despite their
relative newness there is growing interest in their potential, demonstrated, for example, by the
session given over to geospatial services and the benefits for educators at JISC's 2009
conference, which provided a number of examples of how Web 2.0-style data mash-ups can
be used including: geo-tagged referential archives for geography students; adding layers of
geo-tagged data to photographs; integrating the new domain of neogeography8 with social
science (the Geography Undergraduates INtegrating NEo-geographies and Social Science
[GUINESS] project).

In addition, JISC's Shared Infrastructure Services Landscape Study (Chapman and Russell,
2009) surveyed and reviewed the use of Web 2.0 tools and services in the UK HE sector. As
part of this work it reviewed the use of maps and map-based mash-ups by institutions and
noted the use of:

• General purpose or administrative uses: Google Maps for 'how to get there' and
events-based information often mashed with local data such as weather or traffic
updates. Google Maps was felt to have 'overtaken' StreetMap and Multimap due to
'ease of access and functionality, a well-documented API and reliability' (p. 13).
• More specialist use of geo-related data for specialized services where tools such as
JISC ShareGeo and GoGeo! were being used.

7. http://www.casa.ucl.ac.uk/websites/maptube.asp
8. Defined as geographical tools used for personal and community activities by a non-expert
user (see section 2.2.2.1).

For the latter, more specialist area of use, there are two broad areas of impact: research, and
teaching and learning. In terms of teaching, Liu et al. (2008) argue that mash-ups offer:
'exciting new possibilities for classroom instruction, leading to potentially innovative uses of
existing Web applications and data' (p. 245). The authors cite a number of examples, mainly
American, in which Web-based maps are used as a form of focus for groups of students to
explore educational themes. These maps, often based on Google Maps, allow video, audio,
graphic and print-based materials to be geo-tagged, added to the map as an additional layer
and used to create educational, theme-based explorations.

Closer to home, the University of Leicester is leading on a consortium contract with the
University of Nottingham and University College London to deliver Spatial Literacy in
Teaching9 (SPLINT), a HEFCE-funded Centre for Excellence in Teaching and Learning
(CETL) focusing on the pedagogy of geospatial technologies, the pedagogy of taught
postgraduates and the enhancement of spatial literacy in HE. It is worth noting in passing that
many of the activities make use of the closely related technical areas of location-based media
and the use of mobile phones and PDAs in the field. Further discussion of the teaching and
learning implications of this are provided in the JISC TechWatch report on location-based
services (Benford, 2005) and Edina's Alternative Access Mobile Scoping Study (Butchart et
al., 2010).

With regard to the research agenda, the mashing of data from disparate sources and
disciplines has the potential to open up new areas of research investigation, with Macdonald
(2008) citing a number of examples of researchers currently using geo-referenced data and
Web 2.0-style mapping facilities including:

• Bjørn Sandvik's Thematic Mapping: http://thematicmapping.org
• Johns Hopkins University's Interactive Map Tool:
http://www.cer.jhu.edu/maptool.html
• Minnesota Interactive Mapping Project: http://maps.umn.edu/
• GeoVista at Pennsylvania State University: http://www.geovista.psu.edu/main.jps
• University of Maine's Commons of Geographic Data:
http://geodatacommons.umaine.edu
• Project Saxta: http://saxta.geog.umd.edu

Medical schools and public health researchers are also making use of map mash-up
technology, a key example being Health Map10 which shows the current, global state of
infectious diseases and their effect on human and animal health (Boulos et al., 2010). This
website integrates outbreak data of varying reliability, ranging from news sources (such as
Google News) to curated personal accounts (such as ProMED) to validated official alerts
(such as the World Health Organization).

One of the key effects of geospatial data mash-up technology on research is its ability to
foster inter-disciplinary work. The WISERD geo-portal11
is an example of this, aiming to
support the interdisciplinary work of the WISERD Centre by providing the central GIS
framework to integrate, manage, analyse and disseminate quantitative and qualitative data
relevant to the programme (Berry et al., 2010).

9. http://www.le.ac.uk/geography/splint/
10. http://www.healthmap.org/en
11. http://www.wiserd.ac.uk/research/data-integration-theme/the-wiserd-geo-portal/

JISC is also active in this area, supporting a number of geospatial and mapping related
projects and services.12 The JISC Geospatial Working Group13 provides advice on collecting
and development priorities for geospatial resources to JISC Collections, through a process
that identifies and responds to user needs and supports the execution of various strategies. In
addition, the JISC Standards Catalogue includes a number of geospatial related standards.14

Other salient work includes:

EDINA
Services provided by the JISC national academic data centre (EDINA15) include:

• Go-Geo!: the UK HE sector's geoportal providing a geospatial resource discovery
tool and ancillary support services
• Digimap: a core collection of various spatial framework datasets including Ordnance
Survey, British Geological Survey, SeaZone and Landmark Information Group
• UKBORDERS: an ESRC service funded under the Census Programme that provides
a broad range of census and administrative boundary data for the UK, for example
local election wards
• agcensus: agricultural census data
• Unlock (formerly Geo Cross Walk): middleware services for georeferencing and
geoenabling including a comprehensive database of geographical features with their
name, type and location, plus a set of simple APIs (the Unlock API) and a natural
language 'geoparser' service for assisting in geoenabling extant resources.

MIMAS
MIMAS16 provides Landmap, a range of geospatial data for use in HE/FE that includes
optical, radar and elevation based data collections derived from satellite, aerial photography
and similar sources. MIMAS also provides GeoConvert, an online geography matching and
conversion service based on the National Statistics Postcode Directory and runs CASWEB,
which provides access to census data for academic use.

NCeSS
The National Centre for e-Social Science (NCeSS) has produced MapTube, a free resource
for viewing, sharing, mixing and mashing data that have a locational element. It describes
itself as a 'place to put maps' and because users of the site who put up or index the maps they
create do not in general collude or co-operate, the site acts as an archive for map mash-ups.

MapTube was first developed as part of the work undertaken by the Geographic Virtual
Urban Environments (GeoVUE) team based at University College London's Centre for
Advanced Spatial Analysis (CASA). The focus in GeoVUE was on visualization and the node
has now merged with the MoSeS node at the University of Leeds to augment the visualization
capabilities based on maps with the development of spatial and geographical models that
require such visualization. Further development has taken place under the NCeSS follow-on
project Generative e-Social Science for Socio-Spatial Simulation (Genesis) and the National
e-Infrastructure for Social Simulation (NeISS) project, which is funded by JISC as part of its
Information Environment programme.

12. See, for example, Project Erewhon (http://erewhon.oucs.ox.ac.uk/) and the winners of
the 'Edina' category at the JISC-sponsored Dev8D workshop
(http://dev8d.jiscinvolve.org/wp/2010/03/08/dev8d-challenge-ideas-and-winners/).
13. http://www.jisc-collections.ac.uk/About-JISC-Collections/Advisory-Groups/Geospatial-wg/
14. http://standards-catalogue.ukoln.ac.uk/index/JISC_Standards_Catalogue
15. http://edina.ac.uk/
16. http://mimas.ac.uk/

1.2 Summary and rationale

The release of Google Maps has established the idea that non-expert users can not only view
maps on the Web but also engage in some sort of manipulation. To date the interaction is
quite limited and really depends on the providers of the maps and related software opening
their products to user intervention through some sort of API or embedding functions directly
into the maps themselves. In this sense the current state of map mash-ups is fairly primitive in
terms of the potential for non-expert users to create their own map content and tailor their
cartography to very specific applications. The majority of the features developed so far are
simply tools for the display – visualization – of basic or simply derived geographic
information. They do not provide any of the complexity of spatial analysis per se, merely the
visualization of spatial data whose financing is underpinned by income generation through
advertising. This is symptomatic of Web 2.0 but there is rapid change in that users are
beginning to not only create more sophisticated maps, but also to define where such
information is produced, who uses it and at what time it is created and applied.

The implications of location-based systems are of interest in themselves since they are likely
to have a profound impact on society in general, and therefore education. However, over and
above that, valuable learning may be gleaned more generally about data and how it may be
used in the future. Pervasive computing, mobile devices, sensor networks and spatial search
are already shaping how the mash-up scene is developing. The importance of location with
respect to mash-ups and these other new technologies is reflected in the difficulty of
discussing them without referring to positional or spatial information. In this respect,
educational institutions need to understand the potential impact of location-based systems on
areas such as teaching and research but, as producers and keepers of data about huge numbers
of students, also need to understand the potentially far-reaching consequences of how data of
various kinds might be used.

This report will look at the evolving landscape that O'Reilly, always ready with a handy
moniker, has labelled 'Where 2.0'. Others prefer the GeoWeb, Spatial Data Infrastructure,
Location Infrastructure, or perhaps just location-based services. In particular it will look at the
development and changing nature of map-based, data mash-ups. It will explain the basic
concepts behind map mash-ups, how geospatial data gathering and analysis has changed and
how new technologies and standards are impacting on this. It will also look at the wider
context including some of the policy that is driving the development of map mash-up
technology and some of the longer-term technology developments. In the process of
explaining the changes that the ideas behind Web 2.0 (outlined in section 2) are bringing to
the world of cartography and geospatial analysis, the report will discuss pertinent issues that
relate to the way that data in general is gathered and used. One of the key outcomes of this
report, it is to be hoped, is a better understanding of some of the issues surrounding the
concept of 'data as a service'. These include data gathering, accuracy, provenance, quality,
trust, rights and what conditions are attached to its use in the future. As the report notes, map
mash-ups are currently the most popular data mash-up category. In this sense they provide a
crucible in which many of these issues will be tested.

2. State of Play: Maps, mash-ups and the creation of the GeoWeb
The geospatial industry has traditionally been based around what O'Reilly (2005) would call
the 'Web 1.0' business model: software is packaged and sold as 'desktop' applications
according to scheduled software releases; revenue is based on selling expensive licences to a
few specialized customers (doing business with the head, not the tail) etc. Google Maps, on
the other hand, leverages the ideas of Web 2.0 (see section 1.1 for the distinction between
the ideas and the technologies of Web 2.0): it has developed mapping services that take
advantage of the ongoing technical evolution of the Web and, something that Google has
made its trademark, can turn the ever-increasing scale of the Internet to its advantage
(although its mapping apps are really just data collation tools and do not have the spatial
analysis functions of GIS). Google is no longer alone, others quickly followed, most notably
Microsoft with their Virtual Earth product (now Bing Maps).

Proprietary GIS software vendors have responded with their own approach, including a
greater focus on enabling Web applications on top of their users' GIS databases, better Web
map clients, and offering virtual globes (akin to Google Earth) and mapping APIs that can
integrate data from the vendor's systems with map images hosted by the GIS vendor. An
example is ESRI's ArcGIS Web Mapping API that can integrate maps generated by ArcGIS
Server (and its associated analysis functions) with maps served by ESRI. Users can access
these APIs at no cost to build and deploy applications for internal or non-commercial use.

However, as a footnote to this discussion, it should be noted that although the power of the
mash-ups idea promises the prospect of interactive map creation by people with little or no
programming knowledge, current creators of mash-ups need to know their way around XML,
cartographic map projection and GIS functionality at a high level. In addition, it is only very
recently that data sources (as opposed to customizable 'backcloths') have started to become
freely available, along with the tools needed to enable immediate and direct visualization.

Finally, it is worth noting that map mash-ups are just the visible tip of the GeoWeb, which in
turn is rooted in pre-Web cartography and GIScience. Where the Web is a series of
interlinked HTML documents, the term GeoWeb has been coined to describe the
'interconnected, online digital network of discoverable geospatial documents, databases and
services' (Turner and Forrest, 2008, p. 2). When we look at the detail of what this means we
can see that, in fact, GeoWeb encompasses 'a range of application domains from global
weather and geological sensors to family travel blogs, public transit information, and friend
trackers' (ibid.), and this has prompted some to claim that the GeoWeb is actually more about
the overlap between geospatial computation and Web 2.0. While the distinction between
Web 1.0 and Web 2.0 may be problematic (Anderson, 2007), the concepts that have evolved
under the Web 2.0 moniker, which enable us to talk about how the Web is changing, are
important. These concepts were crystallized as 'six big ideas' by Anderson (2007) in his
TechWatch report, from an earlier statement of principles by O'Reilly (2005):

• Harnessing the power of the crowd
• Individual production and user-generated content
• Openness
• Network effects
• Architecture of participation
• Data on an epic scale

It is almost impossible to count the number of map mash-ups that have been developed since
Google released its API in June 2005. It would appear that our ability to mix and match
different data and software for different applications depends on a wide spectrum of
programming skills, with no consistency in how such mash-ups are produced. So far there
have been no attempts to classify these, and thus we use the six ideas as a way of
understanding the direction of travel for map mash-ups and speculating on where the field
will go next.

2.1 Harnessing the power of the crowd

The TechWatch report into Web 2.0 (Anderson, 2007) examined the idea of 'collective
intelligence' as described by O'Reilly in his 2005 article. Without revisiting the whole debate
here it is important to be aware that, in terms of the state of play for data mash-ups, the key
thing to note about harnessing the power of the crowd is not to make assumptions about
people's motivations for taking part. This relates particularly to the notion of crowdsourcing,
which is often assumed to be a collaborative activity. In fact, the original Wired magazine
article (Howe, 2006) that first highlighted it brought together varied examples. These
included individual action that can be aggregated to create a collective result (see below) and
competitions to find individuals who can undertake certain tasks: from amateur photographers
for cheap or free labour, to experienced research scientists rewarded with six or seven figure
'prizes' and recruited via the Web to solve specific problems for commercial R&D
organizations. The range of activity that is covered by the term crowdsourcing is therefore
wide and it is not necessarily an altruistic undertaking.

2.2 Individual production and user-generated content

There is no widely accepted, precise definition of user-generated content, but it is generally
considered to be content that: is made publicly available over the Internet; reflects a degree
of creative effort; and is created outside professional routines and practices (OECD, 2007). When
applied to mapping Goodchild (2007) calls it Volunteered Geographic Information (VGI).

With respect to map mash-ups there are two aspects that are important: crowdsourcing
location-related data to create maps, and crowdsourcing other data (in the example below,
socio-economic data) to overlay onto a map backcloth.

2.2.1 Crowdsourcing socio-economic data

One example of how user-generated content can be used within a mapping application is
provided by UCL's MapTube portal. MapTube is a free resource for viewing, sharing, mixing
and mashing data that have a locational element. It describes itself as a 'place to put maps' and
because users of the site who put up or index the maps they create do not in general collude or
co-operate, the site acts as an archive for map mash-ups. The maps are created using an
application called GMapCreator, which makes it easier to use Google Maps, and rather
than storing the whole map on the MapTube server, only a link to where the map has been
published is stored. When maps are shared in this way, information about what the map is and
what it shows is entered by the owner, along with the link to where the map is published. As
the maps comprise the pre-rendered tiled images from the GMapCreator, the raw data is never
stored on the Internet. This means it is a safe way of sharing a map without giving away the
raw data used to create it.
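
The sketch below illustrates the general idea behind tile pre-rendering (Python, with
invented data points; it is not GMapCreator's actual implementation): points are bucketed
by the tile they fall in, so each tile can be drawn once, published as an image, and served
without ever exposing the raw records.

    import math
    from collections import defaultdict

    def latlon_to_tile(lat, lon, zoom):
        """Standard Web Mercator tile addressing, as in the earlier sketch."""
        n = 2 ** zoom
        x = int((lon + 180.0) / 360.0 * n)
        y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
        return x, y

    # Invented data points, each with a value to be mapped.
    points = [(52.9536, -1.1500, 7), (52.9540, -1.1510, 2), (51.5074, -0.1278, 5)]

    # Bucket points by the tile they fall in at one zoom level; a tile renderer
    # would then draw each bucket once as an image, and only the images (plus a
    # link and some metadata) need ever be published.
    tiles = defaultdict(list)
    for lat, lon, value in points:
        tiles[latlon_to_tile(lat, lon, 12)].append(value)

    for tile_xy, values in sorted(tiles.items()):
        print(tile_xy, values)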


Figure 3: MapTube and Crowdsourcing: The Credit Crunch Mood Map. 3(a) details the Radio 4
website on the Mood Map for the Credit Crunch; 3(b) the user website questionnaire; 3(c) a
distribution based on early responses; 3(d) the final Credit Crunch Map

Radio 4, BBC South, BBC Look East and BBC North have all used MapTube to enable users
to respond to specific survey questions, through dedicated websites, where they were asked to
give their postcode so that geographic 'mood maps' could be created (Hudson-Smith et al.,
2009). The process was first used to create a mood map of the 2008 economic recession in the
UK: working with BBC Radio 4 and BBC TV Newsnight, a survey was created that asked
people to choose the one factor (of six) affecting them most during the recession. No personal
information was collected with respect to the 23,000 responses. Figure 3 demonstrates a
typical map, produced online in real time through the MapTube portal and constructed in map
mash-ups using the Google Maps API. Since then, the same team have developed
SurveyMapper (www.surveymapper.com), where users create their own online survey which
in turn generates geo-referenced responses that can be mapped in real time.
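
A sketch of the aggregation step behind such a mood map (Python, with invented
responses; the report does not publish the real survey pipeline): responses carrying a
postcode are reduced to tallies per postcode district, which a choropleth layer can then
colour.

    from collections import Counter

    # Invented survey responses: each respondent supplies a postcode and the
    # one factor (of six) affecting them most, as in the Credit Crunch survey.
    responses = [
        ("NG7 2RD", "fuel prices"),
        ("NG1 5DT", "food prices"),
        ("NG7 1AA", "fuel prices"),
        ("LS2 9JT", "job security"),
    ]

    def district(postcode):
        """Reduce a full postcode to its outward district, e.g. 'NG7 2RD' -> 'NG7'."""
        return postcode.split()[0]

    # Tally answers per district; a choropleth layer would then colour each
    # district by its most common answer.
    tallies = {}
    for postcode, factor in responses:
        tallies.setdefault(district(postcode), Counter())[factor] += 1

    for d, counts in sorted(tallies.items()):
        print(d, counts.most_common(1)[0])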

Data were created by individuals acting independently, with the technology providing the
aggregation power to visualize the collective result. In fact, with the emergence of many
social networking sites millions of users can create their own data and respond to others
through various methods of online communication, which can all be tagged with respect to
location. There are, for example, various mash-ups that build up pictures of places from
Flickr data that have been tagged using a consistent series of tags such as geocodes, often
added after the pictures are produced or when they are uploaded.

One last point is relevant to these kinds of middleware that support map mash-ups. They can
be used not only for maps but also for any data that needs to be displayed in two dimensions
and which requires the functionality that is offered by the basic map platform, of which pan
and zoom are the obvious features. Pictures and related artwork are obvious sources.

2.2.2 Crowdsourcing geospatial data

Companies are beginning to employ the techniques of crowdsourcing to both extend and
improve their datasets. In the area of consumer grade applications such as personal navigation
devices (PNDs – the dashboard-mountable devices sold by companies such as TomTom and
Garmin) there has been some experimentation in methods of data collection and
validation. Turner and Forrest (2008) divide these into active and passive, where active data
collection allows users to enter and validate information, and passive data collection captures
user actions and behaviours to infer intent and 'interest'—for example using Web analytics to
track how long people look at particular webpages and advertisements, and what links they
click. As an example of active data collection the authors cite the error correction services
provided by TomTom and Dash. The TomTom system allows a driver to 'hit one button' to
notify TomTom that the displayed route is incorrect, while the Dash system deploys an
Internet-connected PND to identify where 'multiple driver routes diverge from expected
roadways, indicating an error in the road data' (ibid, p.4). Not only can Dash update its own
data, but it can also provide updated information to other data collection companies.

The authors go on to describe how Nokia's decision to purchase data provider NAVTEQ
suggests that in emerging markets such as Asia and Africa (where there is limited, high
quality geospatial data available, and where Nokia controls a high proportion of the handset
market) the company is able to use its position in the market to leverage geospatial data
collection, through their mobile devices, to add to the NAVTEQ base. In this way, Nokia is
able to 'dramatically expand' its data holdings of an area of the world where there is currently
a lack of professionally collected geospatial data.

2.2.2.1 New geography and the rise of the amateur


In general, one of the main precursors for the rise in user-generated content was the
widespread adoption of cheap, fairly high quality digital cameras, video cameras, mobile- and
smart-phones (Anderson, 2007). This is no different for mapping technologies, where the
decreasing cost and ongoing improvement of GPS receivers has offered users the ability to
easily capture their own spatial information.

Prior to May 2000, civilian GPS receivers were subject to selective availability, a feature
imposed deliberately by the US Department of Defense to degrade positioning signals,
thereby limiting the device's accuracy. Now, however, cheap handheld receivers are
increasingly available for leisure and hobby purposes and are commonly found in other
devices such as mobile phones and cameras. Where GIS work has traditionally been a highly
specialized profession, a community of amateur map enthusiasts is increasingly using new
technologies to capture map data. They may record single locations of objects, perhaps the
place where a photograph was taken, or use the device in a logging mode to record many
points, which may be edited later on a computer to create a road or other geographic feature.
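
A sketch of what typically happens to such a point log afterwards (Python; the track is
invented): consecutive GPS fixes are chained into a line feature and its length computed
with the haversine formula.

    import math

    def haversine_m(p1, p2):
        """Great-circle distance in metres between two (lat, lon) fixes."""
        r = 6371000.0  # mean Earth radius in metres
        lat1, lon1, lat2, lon2 = map(math.radians, (*p1, *p2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * r * math.asin(math.sqrt(a))

    # An invented GPS log of a short walk, recorded in 'logging mode'.
    track = [(52.9530, -1.1500), (52.9535, -1.1493), (52.9541, -1.1488)]

    # Chain consecutive fixes into a single line feature and measure it.
    length = sum(haversine_m(a, b) for a, b in zip(track, track[1:]))
    print(f"track of {len(track)} fixes, about {length:.0f} m")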

The term neogeography has been coined in an attempt to capture the essence of this voluntary
action, although the term itself has been the subject of some debate.20

On the Platial blog, Eisnor (2006) describes neogeography as '…a diverse set of practices that
operate outside, or alongside, or in a manner of, the practices of professional geographers'.
Turner (2006) takes this further by stating that rather than making claims about scientific
standards, methodologies of neogeography tend towards intuitive, expressive, personal,
absurd, artistic, or maybe just simply idiosyncratic applications of 'real' geographic
techniques. While neither of these descriptions mentions Web 2.0 explicitly, the sense in
which non-experts create maps and manipulate map data, thereby extending the area of
practice beyond that of professional geographers, geographic information scientists and
cartographers, means that there is obvious common ground between neogeography and
crowdsourcing. This is not to say that these practices are of no use to the
cartographic/geographic sciences, but that they usually do not conform to the protocols of
professional practice (Haklay et al., 2008).

The application of neogeography is demonstrated by the OpenStreetMap project (OSM).
Launched on 9th August 2004, OSM is the brainchild of ex-UCL student Steve Coast. Behind
it is a simple concept: to create freely available geographic information without any legal or
technical restrictions. In this sense, it is a kind of Wikipedia for geographic information that
relies on crowdsourced spatial data. Contributors take handheld GPS devices (equipped with
open source software) with them on journeys or go out specifically to record GPS 'tracks'.
This may take the form of 'mapping parties', where they record street names, village names
and other features using notebooks, digital cameras, and voice-recorders to collect data (for
more on this see the OSM Wiki21). Once the event is complete, the data are added to the
central database. Additions such as street names, type of path, links between roads etc. are
added based on the notes taken en route and contributions are moderated by more experienced
users.

These data are subsequently processed to produce detailed street-level maps, which can be
published, freely printed and copied without restriction. Anyone is able to take part if they
have a GPS unit and the desire to see their work as part of the map. Since 2006, Yahoo! has
allowed OSM to use its aerial imagery and, to an extent, this has lessened the need for GPS
traces, although it still requires community effort to gather street names and provide details of
road types, road restrictions etc. One example dates from 2007 when OSM began to use
Yahoo! imagery to map the streets of Baghdad, Iraq by remote sketching combined with calls
to participants in the vicinity to help refine the road layout information. Figure 4 shows the
layout as completed by 5th May 2007, covering all roads that are visible in the sourced
imagery.

20. As a starting point for some of the debate, see:
http://highearthorbit.com/neogeography-towards-a-definition/
21. http://wiki.openstreetmap.org/wiki/Main_Page

Figure 4: The OSM crowdsourced map of Baghdad, May 2007 (from: http://wiki.openstreetmap.org/wiki/Baghdad)

The increase in OSM's activity and coverage has, in turn, fuelled its usefulness and
encouraged both corporate and community contributions. Automotive Navigation Data, a
Dutch data company, turned over its data on China and the Netherlands because it saw little
value in owning an incomplete dataset. Its hope is that by opening up its data via OSM, it
will be able to create datasets that are 100% accurate (Turner and Forrest, 2008).

2.3 Openness

The development of the Web has seen a wide range of legal, regulatory, political and cultural
developments surrounding the control, access and rights to digital content. However, the Web
has also always had a strong tradition of openness: working with open standards, using open
source software [...] making use of free data, re-using data and working in a spirit of open
innovation.

Anderson, 2007, p. 25.

As well as being an example of crowdsourcing, OSM is important for another reason—
openness. Originally set up in response to the high cost and restrictive licensing of Ordnance
Survey data, OSM is dedicated to providing a source of geospatial data that is free from
technical and legal restrictions, in particular intellectual property rights (IPR) pertaining to the
use and reuse of map-related data.

2.3.1 Intellectual Property Rights

IPR are perhaps the most significant challenge facing the widespread use of geospatial data
and the creation of data mash-ups more generally. Using a mapping API can mean that the
user is relieved of many worries relating to rights and licences to display map data on the
Web – these are essentially wrapped up in the licence to use the API – although this does not
provide freedom from IPR per se.

As it stands, mapping that has been derived using Ordnance Survey data as a base may not be
presented on a Google Map (OS, 2008). This is the case whether the derived data is passed to
Google via their API or kept completely separate from them by using a bespoke map
interface. The ongoing freeing up of Government data, discussed in section 4.8, will help to
alleviate some of these issues and ease access to some datasets. However, the problem does
not only apply to datasets produced by the proprietary data providers. All data shown on
Google Earth or Google Maps are protected by US copyright laws. This includes any
derivative products, although the licence for Google Earth and Google Maps allows for non-
commercial personal use e.g. websites and blogs. Bing Maps and Yahoo! Maps have similar
copyright restrictions and non-commercial personal use exemptions.

A key organizational challenge is therefore to educate university staff to become equally
familiar with both the potential and the limitations of mash-up technologies. The first issue is
the terms of service of any map API employed in the mash-up as these may restrict what can
be done, but even more than this, may claim rights over data submitted through the API.

2.3.2 'Open' APIs

APIs simplify things for mash-up developers. They provide a way for them to make software
that interacts easily with other software through a well-defined interface. An API that does
not require the programmer to pay licence fees or royalties is often described as 'open'. Such
APIs have helped Web 2.0 services develop rapidly and have facilitated the creation of mash-
ups of data from various sources.

However, the JISC report into Web 2.0 (Anderson, 2007) cites Brian Behlendorf's
encapsulation of one of the common misconceptions of 'open' when applied to services and
APIs: just because something is available on the Internet does not necessarily mean that it is
open. In fact, we need to distinguish further and say that just because something is available
free of charge does not necessarily mean that it is open. Ultimately, deciding whether something
is open or not depends on a variety of factors, e.g. what standards does it adhere to and
how open are those standards?

2.3.3 Open source software and open data

Open source software is starting to have an effect on IPR and how they are perceived. This is
being compounded by the crowdsourcing model, which depends on a huge number of usually
amateur 'creators' who do not rely on being paid for their content and often choose to give up
some of their copyright protections. This of course has a potential knock-on effect: data mash-
ups may be republishing material that has been produced to varying degrees of accuracy and
for which the process of assigning rights has been obscured.

2.3.3.1 Open source geospatial software


Open source geospatial software tools offer new opportunities for developers to create new
mash-up applications more quickly and at lower cost. Having a community of developers
creates an ecosystem for rapid production of software applications that are robust and may
even have greater reliability than some proprietary software solutions. However, while these
products may be suitable for Web mash-ups, the functionality is unlikely to be suitable for
spatial analysis.

Governments have realized the benefits of open source and are actively promoting this. The
UK Government Action Plan on Open Source, Open Standards and Re-Use is working in this
capacity. Tom Watson MP, former Minister for Digital Engagement, states the Government's
commitment to open source: 'Over the past five years many Government departments have
shown that Open Source can be best for the taxpayer – in our Web services, in the NHS and
in other vital public services' (Cabinet Office, 2009, p.1).

The Open Source Geospatial Foundation (OSGeo) is a not-for-profit organization whose
mission is to support and promote the collaborative development of open geospatial
technologies and data. The foundation provides financial, organizational and legal support to
the broader open source geospatial community. It also serves as an independent legal entity to
which community members can contribute code, funding and other resources, secure in the
knowledge that their contributions will be maintained for public benefit. OSGeo also serves
as an outreach and advocacy organization for the open source geospatial community, and
provides a common forum and shared infrastructure for improving cross-project
collaboration. The foundation's projects are all freely available and usable under an OSI
certified open source licence. The project development statistics for the various software
projects under the OSGeo umbrella give the bigger picture of the potential of open source
software in the geospatial domain.22

2.3.3.2 Open data


Increasingly, discussions over what constitutes openness have moved beyond the parameters
of open source software and into the meaning of openness in the context of a Web-based
service like Google (O'Reilly, 2006). Some argue that for a service it is the data rather than
the software that needs to be open and there are those who hold that to be truly open the user's
data should be capable of being moved or taken back by the user at will. On his blog, Tim
Bray, an inventor of XML, argues that a service claiming to be open must agree that: 'Any
data that you give us, we’ll let you take away again, without withholding anything, or
encoding it in a proprietary format, or claiming any intellectual-property rights whatsoever'
(Bray, 2006).

OpenStreetMap started with the aim of creating freely available geographic information
without any legal or technical restrictions. Contributors use GPS receivers, paper sketches or
even draw over aerial imagery to map anywhere in the world and data are released under a
Creative Commons Attribution-ShareAlike (CC-BY-SA) 2.0 licence. This means that the
source vector data used to make the maps are available for download. Data may be used for
free, including for commercial gain, as long as they are attributed under CC-BY-SA together
with the copyright owner; this can be as simple as adding 'CC-BY-SA 2009 OpenStreetMap'. It
is supported by tools and applications for using and repurposing the data, as well as other
services (e.g. OSM Cycle Map) and is complemented by other open data projects such as
OpenCellID and WiGLE (base station and Wi-Fi locations).

An alternative to OSM is the Google Map Maker service, which began in June 2008. It is
similar to OSM in that it crowdsources maps in countries where current mapping data are
unavailable or sketchy but in contrast to OSM, its licence terms require that all data submitted
and maps created are the intellectual property of Google (Turner and Forrest, 2008). Users are
able to trace features in a way that is similar to OSM's use of Yahoo! data: they can sketch
directly onto imagery and add roads, railways, etc., even building layouts and business
locations, and data are checked by more experienced users. Both OSM and Google Map
Maker have varying levels of accuracy and as Haklay (2010a) notes, there seems to be
friction between them as to which organization will ultimately prevail amongst Government
and NGO users.

2.4 Network effects and the architecture of participation

[Architecture of participation] is a subtle concept, expressing something more than, and
indeed building on, the ideas of collaboration and user production/generated content. The key
to understanding it is to give equal weight to both words: this is about architecture as much
as participation, and... the architecture of participation occurs when, through normal use of
an application or service, the service itself gets better. To the user, this appears to be a side
effect of using the service, but in fact, the system has been designed to take the user
interactions and utilise them to improve itself.
Anderson, 2007, p. 19.

22. http://wiki.osgeo.org/wiki/Project_Stats


The architecture of participation (AoP) utilizes the power of network effects, a general
economic term used to describe the increase in value to the existing users of a service in
which there is some form of interaction with others, as more and more people start to use it
(Klemperer, 2006; Liebowitz and Margolis, 1994). It is most commonly used when
describing the extent of the increase in usefulness of a telecoms system as more and more
users join. Anderson (2007) elaborates on this definition of network effects and describes
some of the implications for users of Web 2.0 services such as social networking sites. The
key to harnessing network effects and AoP for map mash-ups is being able to operate
successfully at the Internet scale. One of the ways Google did this was by developing caching
for map images, something which has now become an OGC standard (see section 3.2.2).

To date, the focus of the benefits of the AoP has been on the benefit to the user. However, a
system that gets better the more it is used is also of significant value to the company
providing the service. The key point for our discussion here is that there are two other aspects
to network effects: the commercial value of the data collected through mash-up APIs, which
the architecture of participation has made possible, and the value of complete datasets.

2.5 Data on an epic scale

In the world of geographic information, the value has long belonged to those companies that
control the underlying data.
Turner and Forrest, 2008, p.3

The importance of data and controlling datasets has always been at the forefront of the ideas
around Web 2.0. As Tim O'Reilly said, when speaking to the Open Business forum (2006):
'The real lesson is that the power may not actually be in the data itself but rather in the control
of access to that data.' This is no less true in the world of map mash-ups.

Traditionally, the geospatial data marketplace has been dominated by players such as
NAVTEQ and Tele Atlas and, in the UK, by the Ordnance Survey. New technology and the
ideas of Web 2.0 are shaking up this market. Google, Yahoo!, Microsoft and others have
broken into the market and data mash-ups are just one of the tools that they are using to
collect not only geospatial data but also all sorts of other data as well. However, by
crowdsourcing large amounts of geospatial data, mobile device manufacturers are also
becoming data collectors and purveyors, with Turner and Forrest (2008) stating that Dash (a
navigation device company) was set up, right from the very beginning, to handle geospatial
data. This increasing interest in data is connected to its market value, with NAVTEQ being
sold to Nokia in October 2007 for $8.1 billion (Erkheikk, 2007) and Tele Atlas being sold to
TomTom in July 2007 for €2 billion (Hoef, 2007).23

23. More recently, Google has entered the 'turn-by-turn' market. It is expected that all Android devices from Google will use the company's own geospatial dataset rather than rely on third parties. See: http://abovethecrowd.com/2009/10/29/google-redefines-disruption-the-%E2%80%9Cless-than-free%E2%80%9D-business-model/

This is about more than the simple collating of very large geospatial datasets alone. There are
also issues concerning how those datasets are integrated and made interoperable with other
products and services. To take one example, Google makes much use of its highly
sophisticated infrastructure of both server capacity and existing software services to leverage
increased value from its growing geospatial data collection (see section 3.2.3).

There are of course important questions of accuracy, error and precision for crowdsourced
data, not least the extent to which datasets that are built up from individual and thus partial
records are representative. This is a difficult question to answer in a general form. The
requirements will vary between uses of map data: backdrop mapping (a map image as a
background for one's own mapping) can usually tolerate much more error than, for example, a
car routing application. There are also a number of dimensions to spatial data accuracy and
revision:

• Positional accuracy: e.g. are the features correctly located, within the scale
constraints?
• Completeness & currency: e.g. are all the real-world features present in the data
(moderated by the scale) and out-of-date features removed?
• Attribute accuracy: e.g. are the geometric features correctly and completely described
and annotated, for example do points have associated town names?
• Logical consistency: e.g. are the topological relationships between features correct,
for example is the routing of the roads correct, do rivers and roads meet at the
appropriate features (bridge, ford, tunnel, etc.)?

To an extent these questions lie beyond the scope of this report but they are central to
evaluating how good a map mash-up actually is with respect to the purpose for which it is
generated.



3. Technologies and Standards


Turner and Forrest (2008) argue that the tools and ways of working with technology
associated with the emerging GeoWeb can be considered to provide a 'GeoStack' that allows
the creation, publication, sharing and consumption of map-based information. In this section
we will outline some of the key elements of this stack and discuss their implications.

Formally we can say that a mash-up is 'an application development approach that allows users
to aggregate multiple services, each serving its own purpose, to create a new service that
serves a new purpose' (Lorenzo et al., 2009). In the context of the GeoWeb such mash-ups
make use of geospatial related services and data feeds. The map image is the fundamental
layer of information onto which other data are superimposed. In general, data and services are
made available to others through two principal methods: by exposure via a Web API or as a
data feed through RSS or ATOM. These services and feeds are basic 'ingredients' to be mixed
and matched with others to form new applications.

3.1 The role of Ajax and other advances in Web technology

A large part of the success of mash-ups can be attributed to the ongoing increase in the
capabilities of Web browsers. Asynchronous processing means that data and page content can
be accessed and displayed without reloading the entire webpage, resulting in greatly
improved user interaction and response times. In addition, XML has become established as a
standard format for transmitting data and messages on the Web. The combination of these
technologies is usually referred to as AJAX (Asynchronous JavaScript and XML). The now
familiar ability to zoom or pan the map by clicking and dragging (a 'slippy' map), using
AJAX methods for map interactions, has changed users' expectations of mapping on the Web.

While AJAX improves the interaction with websites for the end user, APIs offer a means of
simplifying things for the mash-up developer. They provide a way to make software that
interacts easily with other software through a well defined interface. Often using Web
scripting languages, such as JavaScript, individuals can create applications with nothing more
than a simple text editor, with the API taking care of map data supply.

Another important technical development has been the ability for users to request additional
data, from a separate Web service, from within the API. A Web service, also known simply as
a service, provides a defined functionality across a local network or the Internet for other
applications to use.

3.2 Map mash-up basics

Web-based mapping solutions fall into two distinct categories: 2-D and 3-D (see section 4.6
for more on 3-D). In order to display 2-D map data on an HTML webpage, a mapping API is
used to run code on the page. APIs for these 2-D, Web-based maps fit into two groups:

• lightweight JavaScript-based APIs (e.g. Google Maps, OpenLayers)
• those based around a more complex technology such as ActiveX, Silverlight, WPF or
Flash (which are used in Bing Maps and Yahoo! Maps).

Both types of system work by serving a map that has been reduced to a set of tiled images
with a fixed number of rows and columns that can be partitioned further. The size of the tiles
is arbitrary, but a common implementation is to use a single 256x256 pixel tile, at the first



zoom level, which covers the whole world. At the next zoom level, there are four tiles, then
sixteen, then sixty-four etc. (according to the structure of the relevant quadtree24).
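
To make this arithmetic concrete, the following short sketch (ours, in Python; the function name and the example coordinates are illustrative) computes which tile in the pyramid contains a given longitude/latitude point at a given zoom level, following the widely used Spherical Mercator tile-numbering scheme:

    import math

    def lonlat_to_tile(lon_deg, lat_deg, zoom):
        """Return the (x, y) index of the tile containing a point, using the
        standard Spherical Mercator tile pyramid (2**zoom tiles per axis)."""
        n = 2 ** zoom
        lat_rad = math.radians(lat_deg)
        x = int((lon_deg + 180.0) / 360.0 * n)
        y = int((1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) / 2.0 * n)
        return x, y

    # At zoom 0 one 256x256 tile covers the world; each extra zoom level
    # quadruples the number of tiles (1, 4, 16, 64, ...).
    print(lonlat_to_tile(-0.1276, 51.5072, 12))  # a tile covering central London

The quadrupling of tile counts with every zoom level is one reason why the caching of rendered tiles, discussed in section 3.2.2, matters so much at Internet scale.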

It is important to understand that access is usually provided to the tiled images of the map and
not to the actual geo-referenced data per se.25 With proprietary maps from Google, Microsoft
and Yahoo! the tiles are already rendered (turned from raw data into images) and stored at the
portal. With OSM, users do have access to an open source version of these data, so they can
render their own tiles. While the proprietary map providers have their own bespoke tile
renderers, open data are typically rendered with open source tile rendering software. A common
combination is OSM data with the OpenLayers API for the map, using Mapnik or
OSMarender to render the data using a rules file or style that defines how the OSM data are
drawn. So, for example, OSMarender takes in an OSM data file and a rule file describing how
the map data are to be marked up, then outputs an image for display in a browser (in the form
of SVG).
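
As an illustration of this rendering step, the sketch below uses Mapnik's Python bindings to draw a single tile from a rules file; the style file name and bounding box are placeholders, and older Mapnik releases call the bounding-box class Envelope rather than Box2d:

    import mapnik  # open source tile renderer with Python bindings

    m = mapnik.Map(256, 256)                     # one 256x256 pixel tile
    mapnik.load_map(m, 'osm-style.xml')          # rules file: how OSM features are drawn
    m.zoom_to_box(mapnik.Box2d(-0.2, 51.45, 0.0, 51.55))  # area to render (lon/lat)
    mapnik.render_to_file(m, 'tile.png', 'png')  # output image for the browser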

3.2.1 Vector versus pre-rendered tiled data

As we have noted, GIS tends to make use of raw data in the form of vector files rather than
pre-rendered, tiled images containing annotations. This is because many of the more advanced
spatial analysis functions must operate directly on the data and on measures of the geometry
of the maps. If one is forced to work with tiled images, much of the functionality of spatial
analysis is either impossible or very slow.

Allowing access to the raw data has one other particular advantage: it enables the user to
change the base map projection. At present, all the major 2-D tiled map systems use the same
map projection, 'Spherical Mercator' (EPSG:3785, also known by the informal code
EPSG:900913), a projection that assumes the world to be a perfect sphere rather than an
ellipsoid.26 While this works adequately for the most populated areas of the world, any data
shown on the map above or below 85º north or south contain errors that are significant for
many applications.27 Some
users, involved for example in weather forecasting or climate science, require data to be truly
global where a polar stereographic projection would make more sense. For these applications,
a custom tile renderer with a different projection could be constructed using OSM data, or
other open sources of world outline files. To date, little use has been made of custom tile
renderers using a different projection, although this is being discussed for visualising
environmental data.

3.2.2 Sending images or data to the client

Whether pre-rendered tiled images or raw vector data are used there needs to be a process of
transfer to the Web client. Before 2005, Web-based maps utilized two OGC standards: Web
Map Service for maps sent to the client as images, or Web Feature Service for maps where
actual vector data were sent to the client. Neither solution allowed requests to be cached (a
limitation implicit in the way the standards were written), so neither scaled to large numbers
of users: each request from every user meant drawing a new area of the map, creating a
correspondingly higher load on the server.

24. A form of data structure used in computer science in which each node of a tree has exactly four children.
25. See: http://wiki.osgeo.org/wiki/Tile_Map_Service_Specification for an example of how this works.
26. The Earth is not actually a perfect sphere; assumptions are made for various navigational and historical reasons. For discussion see Introduction to Spatial Coordinate Systems: Flat Maps for a Round Planet: http://msdn.microsoft.com/en-us/library/cc749633.aspx
27. See: http://docs.openlayers.org/library/spherical_mercator.html


In contrast, Google knew early on that it required very high scalability and so used an
appropriate architecture. Google Maps allows tiled images to be cached both by the system,
for rendering and storing, and by the Web server and client browser. This ability to scale has
allowed Internet-scale adoption of Google Maps by large numbers of users. In response to
these developments, the OGC now has a standard called Web Map Tile Service, which adopts
the fixed-location, tiled-image approach implemented by Google.

3.2.3 Infrastructure

It is also important to note that while OpenLayers and the Google Maps API are intrinsically
similar, OpenLayers is simply an open source JavaScript library. The Google Maps API is
also a library, but one backed by wider Google infrastructure.

One example will illustrate the extra capabilities afforded by Google's use of its own in-house
technology. It concerns a common Web security restriction, the 'same-origin policy', which is
designed to prevent cross-site scripting attacks. In essence this means that the browser will
only load data from the same site as the page that it is displaying. When drawing a Web-based
map it is common to overlay annotations on the map (e.g. placemarks, images, polygon
symbols, textual descriptions etc.). Usually these data are stored as a KML file (see section
3.4.1.2). However, if the KML data to be overlaid come from a different Web server then the
process will fall foul of the same-origin restriction. Google gets around this by using a KML
proxy which is part of its 'free to use' Web infrastructure. This proxy technology does not
exist in the case of the OpenLayers API as there is no corresponding infrastructure.
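
To illustrate what such a proxy does, here is a minimal sketch of our own in Python (it is not Google's implementation): the page requests KML via the proxy on its own host, and the proxy fetches it from the remote server, so the browser never makes a cross-origin request. A real deployment would, of course, need to restrict which hosts may be fetched.

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.parse import urlparse, parse_qs
    from urllib.request import urlopen

    class KmlProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # The page requests /proxy?url=<remote KML>; we fetch it server-side.
            qs = parse_qs(urlparse(self.path).query)
            target = qs.get('url', [None])[0]
            if not target:
                self.send_error(400, 'missing url parameter')
                return
            body = urlopen(target).read()   # fetch the remote KML
            self.send_response(200)
            self.send_header('Content-Type', 'application/vnd.google-earth.kml+xml')
            self.end_headers()
            self.wfile.write(body)          # serve it from our own origin

    HTTPServer(('', 8000), KmlProxy).serve_forever()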

These technical details are important with respect to the nature of the mash-up that ultimately
emerges. They not only affect the speed of access and the size of tile that can be displayed;
they also determine, to an extent, what is possible in any application.

3.3 Specific technologies for map mash-ups

An API, or more specifically a JavaScript library, of particular relevance to map-based mash-
ups is the Mapstraction28 library. Each mash-up vendor provides a different API system and if
developers only learn to use one type of API this can create a form of 'lock-in'. Mapstraction
gets around this by providing a common platform where developers can switch the mapping
provider with a simple code change, without having to rework the entire application. This is a
valuable service as it allows map-based mash-ups to be shared without necessarily promoting
any one company's services. It also allows some independence from the continued availability
of any one particular mapping API.

Also of relevance to map mash-up developments are the services offered by Cloudmade,29 a
recent start-up created to leverage the OSM data resources. In particular, Cloudmade offers a
map API similar to Google's, based on an improved OSM database. This API includes
options for user-defined styling, allowing the map cartographic styles to be adjusted to suit
the application (see Figure 5).30

28. http://www.mapstraction.com/
29. http://www.cloudmade.com
30. See also Google's API version 3 which has introduced user defined styling.


Figure 5: Cloudmade map style editor tool http://maps.cloudmade.com/editor

3.4 Standards and Infrastructure

Web-based mapping and map mash-ups rely on some form of standardization, although this is
in a state of flux as might be expected in an area that is dominated by non-expert users
demanding easier functionality. For many simple map mash-ups there is a reliance on de facto
standards, implied by using a vendor's API with its particular data format requirements.

Before the Web, organizations that needed to share data used either de facto standards or
software such as the Feature Manipulation Engine (to push data from one format to another).
With the arrival of the Web came the Open Geospatial Consortium (OGC), which grew out of
the need to share data, and later services, more effectively. The OGC is an international
industry consortium of nearly 400 companies, Government agencies and universities. It
coordinates its work closely with the ISO TC211 group and facilitates a consensus process to
develop publicly available interface standards known as OpenGIS. OpenGIS standards
support interoperable solutions that 'geo-enable' the Web, wireless and location-based
services and mainstream IT with standards that cover spatial data formats, protocols and
structures for storing and accessing data, as well as various methods for querying, assembling
and aggregating data. The standards allow technology developers to make spatial information
and services accessible to other applications.

3.4.1 Key standards

3.4.1.1 ESRI
Currently, the main de facto standard with respect to spatial data is probably the shapefile, a
proprietary binary format for vector data developed by ESRI for use in the popular ArcView
and ArcGIS software packages. An attempt was made to bring the shapefile specification into
the OGC standardization process, but while the specification has been published, ESRI retains
control of its future development. The specification is nonetheless freely available for all to
use and is supported by almost all proprietary GIS software.

The structure of shapefiles is relatively simple and is based on points, lines and polygons with
a linked .dbf database for storing attribute information. Shapefiles are mainly used for spatial
data storage as they are capable of handling large amounts of geographic data in any
projection and can also be used for spatial data exchange.
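
As a brief illustration, the geometry and the linked .dbf attributes of a shapefile can be read with a few lines of code. This sketch uses the open source pyshp library (one of several readers available); 'roads.shp' is a hypothetical dataset:

    import shapefile  # the open source 'pyshp' package

    sf = shapefile.Reader('roads.shp')
    print(sf.shapeType)                 # e.g. 3 = polyline
    for rec in sf.shapeRecords()[:5]:   # first five features
        # .record holds the .dbf attributes, .shape.points the vertices
        print(rec.record, rec.shape.points[:2])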

However, limitations with shapefiles (including size limits, lack of topological information
and general inflexibility) led ESRI to introduce a replacement format—the ESRI
Geodatabase.31 It is a powerful format, allowing unlimited size files, multi-user editing and
sophisticated spatial relationship structures such as topological networks. It is however a
closed format that can only be accessed using ESRI's software development kit (SDK) and the
format has not yet 'taken off' in the manner that the shapefile format has.

3.4.1.2 Open Geospatial Consortium


The OGC does not support the shapefile format and instead recommends the Geography
Markup Language (GML32) and the Keyhole Markup Language (KML33) as basic standards
(OGC, 2008). KML provides a lightweight means of encoding spatial data in an open format,
and hence has had wide uptake. GML is an XML schema for expressing geospatial features
and is an alternative to KML and the shapefile. It offers a more complete system for data
modelling and as such is more usually used as a basis for scientific applications and for
international interoperability, e.g. as the foundation for INSPIRE (a European geo-data
harmonization initiative) and GEOSS (the Global Earth Observations System of Systems). A
lightweight form of GML, the Simple Features Profile, is also available.34

Other important OGC standards include:

The Web Feature Service interface standard35 (WFS) describes a simple HTTP interface for
requesting geographical features. While the client can specify a geographic area of interest in
the same way as for WMS, additional filters also allow fine control over the features returned.
For example, a 'QUERY' request might specify all roads within a given area. Unlike WMS,
the WFS standard returns geographic features in the form of GML or shapefiles. In addition to
'QUERY' operations, WFS also supports 'INSERT', 'UPDATE', 'DELETE', 'LOCK' and
'DISCOVERY'.
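
Since WFS is a plain HTTP interface, a request can be illustrated simply by constructing a URL; the service endpoint and feature type name below are placeholders:

    from urllib.parse import urlencode

    params = {
        'service': 'WFS', 'version': '1.0.0', 'request': 'GetFeature',
        'typeName': 'topp:roads',         # hypothetical feature type
        'bbox': '-0.2,51.45,0.0,51.55',   # area of interest
        'maxFeatures': '50',
    }
    print('http://example.org/wfs?' + urlencode(params))
    # Fetching this URL returns the matching features, typically as GML.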

The Web Map Service interface standard36 (WMS) describes a simple HTTP interface for
requesting maps using layers. The maps are drawn by the server and returned to the client as
images (e.g. .jpeg or .png). The client specifies the bounding box of the map together with the
layers required and receives the map back as a single image, unlike the WMTS service which
returns a set of images as tiles.
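
A WMS GetMap request is built in the same spirit but asks for a finished image rather than for features; again the endpoint and layer names are placeholders:

    from urllib.parse import urlencode

    params = {
        'service': 'WMS', 'version': '1.1.1', 'request': 'GetMap',
        'layers': 'roads,rivers',         # hypothetical layer names
        'styles': '',
        'srs': 'EPSG:4326',
        'bbox': '-0.2,51.45,0.0,51.55',
        'width': '512', 'height': '512',
        'format': 'image/png',
    }
    print('http://example.org/wms?' + urlencode(params))
    # The server draws the layers and returns a single 512x512 PNG.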

A Styled Layer Descriptor37 (SLD) extends WMS by providing an encoding for user-defined
symbolization and colouring of geographic feature and coverage data, allowing client software
to control how geospatial data are visualized. Tile server software like Mapnik or GeoServer
uses an SLD document to define how the map tiles are drawn from the geographic feature
data. In the case of thematic or choropleth maps, the SLD is extended by the Symbology
Encoding specification in order to render data not provided for in the base SLD specification.
The Symbology Encoding document defines how feature and coverage data are portrayed
visually on the map; it is an XML encoding using symbolizers and filters to define how
attribute data are displayed.

31. Note that the term geodatabase is typically used much more generally to include any possible spatial database format. Geodatabases are generally considered more complicated to use and this could be part of the reason for the enduring popularity of the shapefile.
32. http://www.opengeospatial.org/standards/gml
33. The Keyhole Markup Language is named after the company that originally developed it as part of its work on what became, after a take-over, Google Earth. In particular, the styling components of KML were developed with a view to what Google Earth needed to be capable of and were therefore not designed with interoperability in mind. However, Google recently turned KML over to the OGC to bring it into the standardization process.
34. http://xml.coverpages.org/ni2005-07-07-a.html
35. http://www.opengeospatial.org/standards/wfs
36. http://www.opengeospatial.org/standards/wms
37. http://www.opengeospatial.org/standards/sld

The Web Map Tile Service standard38 (WMTS) aims to improve performance and increase
the scalability of Web map services through caching. It is modelled on the large-scale tiled
map systems used by Google, Microsoft and Yahoo!, where requests are made for discrete
map tiles which can be cached both in the server and the client browser.

The Web Coverage Service interface standard39 (WCS) defines a standard interface for
access to coverage data, e.g. satellite images, aerial photos, digital elevation and terrain data,
LIDAR or any other raster-based source (as opposed to WFS, which defines a standard
interface for access to vector data in the form of points, lines and polygons). Data falling
within a bounding box can be queried and the raw coverage data returned to the client.

The Grid Coverage Service40 (GCS) refers to data that are raster in nature rather than vector.
Examples include satellite images (whether visible light or any other sensor), digital aerial
photos, LIDAR, and elevation and terrain data. The GCS document defines standards for
requesting, viewing and analysing raster data.

GeoRSS41 feeds are designed to be consumed by geographic software such as map
generators. While RSS is used to encode feeds of non-spatial Web content (such as news
articles), content consisting of geographical elements, defined, for example, by vertices with
latitude and longitude co-ordinates, is usually encoded using GeoRSS.
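
Consuming a GeoRSS-Simple feed needs little more than standard XML parsing. The sketch below (the feed URL is hypothetical) extracts the latitude/longitude pair from each item's georss:point element:

    import xml.etree.ElementTree as ET
    from urllib.request import urlopen

    GEORSS = '{http://www.georss.org/georss}'   # GeoRSS-Simple namespace
    tree = ET.parse(urlopen('http://example.org/quakes.rss'))
    for item in tree.iter('item'):
        point = item.find(GEORSS + 'point')
        if point is not None:
            lat, lon = map(float, point.text.split())  # "lat lon" pair
            print(item.findtext('title'), lat, lon)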

3.4.1.3 Semantic Data Standards


A number of organizations are currently developing Semantic Web technologies for
representing their geographic data (e.g. UK Ordnance Survey and US Census), and this
involves creating geospatial data in the Resource Description Framework (RDF) format. In
general, the power of this technique is that all the data are machine-readable. SPARQL is a
key standard in this respect: a query language for semantic data. It is in some ways analogous
to SQL querying a relational database, but it operates over RDF data on the semantic Web.
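
As a sketch of the analogy, the fragment below uses the open source rdflib library (recent versions include a SPARQL engine) to query a local RDF file for anything carrying the W3C WGS84 latitude/longitude properties; the file name is a placeholder:

    from rdflib import Graph

    g = Graph()
    g.parse('places.rdf')   # load some RDF triples from a local file
    results = g.query("""
        PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
        SELECT ?place ?lat ?long
        WHERE { ?place geo:lat ?lat ; geo:long ?long . }
    """)
    for place, lat, lon in results:
        print(place, lat, lon)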

In the UK this is backed by the Talis platform, which offers 50 million 'triples' or 10GB of
storage for open data as long as it is publicly accessible and under an open data licence. In
addition, the Ordnance Survey has developed an administrative geography ontology that
contains knowledge about areas. The LinkedGeoData project is working to extract RDF
triples from the OSM dataset and make them available to semantic Web researchers.42 There
are also plans for EDINA's Unlock service to provide a triple store version.

3.4.1.4 Database Standards


Database storage plays an important role in Web-based mapping systems and Simple Features
for SQL43 defines the storage and retrieval of geographic feature data in SQL databases. At
present, the following spatial databases support this standard: SQLite, Microsoft SQL
Server 2008, MySQL, PostGIS, Oracle Spatial, ESRI ArcSDE, Informix and IBM DB2.
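
As a sketch of what the standard provides, the query below uses Simple Features functions as implemented by PostGIS to find places within roughly a kilometre of a point; the connection details, table and column names are hypothetical:

    import psycopg2  # PostgreSQL driver; connection details are illustrative

    conn = psycopg2.connect('dbname=gis user=demo')
    cur = conn.cursor()
    cur.execute("""
        SELECT name FROM places
        WHERE ST_DWithin(
            geom,
            ST_SetSRID(ST_MakePoint(%s, %s), 4326),  -- a lon/lat point
            0.01)                                    -- ~1 km at this latitude
        """, (-0.1276, 51.5072))
    print(cur.fetchall())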

38. http://www.opengeospatial.org/standards/wmts
39. http://www.opengeospatial.org/standards/wcs
40. http://www.opengeospatial.org/standards/gc
41. http://www.georss.org/Main_Page
42. http://linkedgeodata.org/About
43. http://www.opengeospatial.org/standards/sfs


3.4.2 Technical Infrastructure

A basic requirement of most mash-ups is the availability of a Web server to host the site and
any extra data. However, this is not always necessary and various hosted services and cloud
computing solutions are emerging (Hobona et al., forthcoming). For example, Yahoo! Pipes
takes existing content from Web feeds and pages, which users can integrate using a visual
programming environment. The mash-up is saved online as part of an account, avoiding the
need for a user to manage a separate server or Web host. Similarly, Geocommons provides a
free and accessible platform for users to create maps from either their own data or a large
repository of map layers. The application provides functionality for both cartographic styling
and a variety of base maps from the main map mash-up vendors. An extension to the
application, called GeoIQ, enables analysis of datasets to be carried out. Such tools will become
more prevalent as map mash-ups continue to develop.

Cloud and grid computing architectures are used for reasons of efficiency and scalability, but
when implemented as a shared resource they will also facilitate increasingly complicated
analyses of large datasets in real time which may then be presented as a mash-up. Rather than
managing a centralized server, institutions may use a third party, cloud-based architecture that
supplies processing, data storage and applications in a secure environment. Furthermore, the
prevalence of software-as-a-service applications will reduce the need for powerful desktop or
mobile processors. Mash-up APIs using this type of system will allow users to create
applications based on complex algorithms, modelling techniques or simulations in order to
aid their understanding.

An alternative to the cloud-based solution is to develop an institutional mash-up site. An
example of this is CASA's MapTube, which curates user-contributed choropleth maps for
display on a Google Maps base. This is associated with a software application, GMapCreator,
which takes GIS data and creates the necessary map image tiles and XML configuration file
to be served through MapTube. However, MapTube still requires the user to present the data
from their own Web server (not least to free CASA from obligations relating to users
presenting unlicensed data). The user's server needs only to be configured as a standard Web
server, so the MapTube data could be served from a user account on an institutional Web
service. MapTube is limited in presenting a certain type of map on a particular underlying
map API.

However, more complicated and/or large datasets need an appropriate database and associated
server software to operate efficiently, as is the case when displaying more than a simple point
dataset on a map. This presents a potential barrier to adoption of more sophisticated mash-
ups. Associated with this are issues of security and authentication—firstly to ensure that the
mash-up's Web server is secured through its software stack (from server down to the
database), and secondly to only present data to authorized users.



4. The future of data mash-ups and mapping


In terms of the future of data mash-ups in general, mapping and geospatial data mash-ups are
likely to be just the beginning. They have popularized the idea of data mash-
ups largely because the potential for rich visual experiences is a powerful driver for the
uptake of the services that are associated with them which, in turn, encourages users to
contribute huge amounts of data through the service's API. We are now starting to see similar
sorts of hacks occurring in many other areas of social and intellectual life, where ideas in one
field are being transposed and applied to another. There will be new software developments –
temporal maps, for example – and animations to accompany these will become routine.

Browser technology is advancing rapidly with an indication that the Read/Write Web is
moving towards a Read/Write/Execute status, sometimes called Web 3.0. This will allow
software to be run directly within the browser, providing a platform for spatial analysis,
advanced map mash-ups and sophisticated data mining. Toolkits that are currently available
are likely to develop rapidly and become available beyond the current specialists and interest
groups. There will also be an increase in the development of 3-D mash-ups and spatial
visualizations allowing the wider communication of complex datasets.

In the meantime, questions need to be asked about the epic scale of data being produced, its
accuracy, and the potential problems for privacy when data are combined in ways that are not
necessarily anticipated when the datasets are developed. For HE there are also particular
questions about copyright, about research data being given up to organizations that operate
proprietary services, and about the potential for privacy and confidentiality breaches.

Over the next ten years or so there are several technologies and applications that are likely to
become increasingly important to HE:

4.1 Semantic mash-ups

Despite the emergence of Web 2.0 tools such as Yahoo! Pipes and standard geo-data formats
like KML, the task of identifying and integrating datasets of interest must still be done largely manually.
Automatic data mapping is only just beginning to be explored and requires the ideas of the
semantic Web to be incorporated into mash-up development.

'Semantic mash-up' is the idea that computers help humans discover and integrate data. It
forms part of a research area known as the semantic geospatial Web, which requires the
availability of semantically enriched, machine-readable geospatial and location data
(Egenhofer, 2002), and this is likely to be a significant research endeavour in geo-data mash-
ups in the coming decade. Lefort (2009) states that there are two main classes of service being
explored: legacy semantic mash-ups and opportunistic semantic mash-ups. The former
transform existing geospatial mash-up data in XML into RDF and so into a semantic
application. The latter scrape HTML, or in some cases RDFa44 data, from websites as
required.

An early example of the latter is provided by the DBpedia project which seeks to create a
semantic Web knowledge base by extracting marked up data from Wikipedia. Although much
of Wikipedia is free text there are various forms of structured information which are marked
up using wiki code, for example quite a lot of the data that are held in what are called
infoboxes.45 This is extracted by DBpedia and turned into a knowledge base which contains
data on, for example, people (including name, birthplace, birth date etc.) and buildings
(latitude, longitude, architect, style etc.). The extraction process makes use of Wikipedia's live
article update feed46 and therefore is updated automatically on a regular basis. Data are
organized within DBpedia using the techniques of the semantic Web in the form of millions
of RDF triples.

44. RDFa is a technique for embedding RDF data within existing webpage content. See: http://www.w3.org/TR/xhtml-rdfa-primer/

4.2 Mobile mash-ups

Simon (2007) argues for a future vision wherein mobile phones will serve as generic
hardware and software platforms for a variety of geospatial information services. The
necessary advanced navigation features are already being integrated into state-of-the-art
mobile devices and we can expect this to be even more widespread in the near future. The
more advanced mobile phones now include location technology as standard. Existing
specialist GPS device manufacturers are also starting to move into the smartphone market
(e.g. Garmin's Nuvifone). All these devices include at least GPS technology and some may
also include software for determining position based on triangulation techniques using phone
mast positioning and Wi-Fi base station location.47

Developments in this area are likely to focus on providing a sophisticated hybrid of these
different location techniques. Turner and Forrest (2008) cite, for example, the XPS chipset
developed by SiRF, a vendor of GPS chipsets, in collaboration with SkyHook. In addition,
new types of positioning sensor will be introduced into mobile phones and Simon (2007)
discusses the role for a more accurate 3-D location based on compass, tilt and altitude sensor
technology. Turner and Forrest (2008) argue that wider adoption of accurate location
detection in mobile phones will spur a major new market in location-based services and
applications such as Google's Latitude.

Typically, the user explicitly controls which services the phone makes use of (explicit
interaction) although, based on the users' context, profile and the preferences published by the
device, the environment can also trigger services automatically (implicit interaction). In all
these situations context, and particularly location, acts as an important filter for selecting the
most suitable services (Ipiña et al., 2007). By integrating multiple data sources into one
experience, new services can be created that are tailored to the user's personal needs and, by
using local sensor data on a mobile device, this experience can be adapted to the user's current
situation (Brodt et al., 2008).

However, an issue that is still being explored is that although all these applications offer
specific functions to mobile phone users, and need to be downloaded and installed on the
mobile phone 'one-by-one', many of them are actually implementing the same process of
accessing the device's location (from the GPS on the device), transferring it to a server
application, and receiving and displaying the location-based information.48 The wide range of
mobile platforms (e.g. different Symbian editions, Blackberry, iPhone, Windows Mobile,
Google Android, etc.) and software development environments, as well as the fast life-cycle
of mobile phones and their operating systems, make such a distribution approach an
expensive task. An alternative approach that is being explored is to add 'location capabilities'
to mobile Web browsers instead (Karpischek, 2009). This implements location-based services
(LBS) for a wide range of heterogeneous mobile devices while requiring significantly fewer
resources. The integration of browser and geospatial data is also being explored as part of
work to develop the forthcoming HTML5 standard.49

45. An infobox is a fixed format table designed to be added to the top right-hand corner of articles and which presents a summary of some of the key points of the article.
46. Wikipedia offers a Wikipedia OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) feed. This is a protocol used for harvesting metadata descriptions of records in an archive. It is widely used by libraries, institutional repositories and archives. Data providers provide their metadata in XML in the Dublin Core format, although other formats can be used. See: http://meta.wikimedia.org/wiki/Wikimedia_update_feed_service
47. Many people want to see GPS on their phone when actually what they really need is location information. GPS provides a simple latitude/longitude reading. This can be determined by a number of techniques other than GPS, which apart from anything else is not particularly useful indoors as it needs line of sight to a satellite.

4.3 Geo-location on the Social Web

Increasingly, location-based information is being integrated with Web 2.0 applications such
as social networking. This development is closely related to the mobile phone developments
discussed above. There are a number of existing services and these are likely to expand in
forthcoming years. Examples include Loopt, Foursquare, Gowalla, Plazes and BuddyBeacon.
FireEagle is an important development in this respect, providing a centralized brokerage
service to allow users to control how their location data are shared and has been used in the
EDINA personalization geolocator prototype.50 As data mash-up systems begin to incorporate
a user's location into their visualization capabilities such services will become increasingly
significant.

4.4 Augmented Reality

Augmented Reality (AR) applications, where information is overlaid onto our view of the
physical world, are likely to become widespread and AR applications for mobile phones are
gaining recognition as providing a special kind of location-based mash-up. One method uses
two-dimensional barcodes (QR codes) which, when viewed with the camera of an
appropriately enabled device, open or stream some form of media. The second method is to
superimpose data onto images taken on the device's camera and display this to the user in real
time. While the technology is not particularly new (iPhone, Android and other mobile
environments that incorporate a digital camera and sufficient processing power offer a
suitable platform for AR application development) the AR sector is forecast to grow
dramatically. A recent report from ABI Research concluded that revenue from AR will rise
from $6 million in 2008 to $350 million in 2014 (ABI Research, 2009).

Although these applications are currently popular, certain technology issues, such as mobile
device localization and usability, still need to improve. For example, GPS used in a mobile
device does not provide high enough accuracy, inertial sensors are subject to inaccuracy and
loss of calibration, and more user evaluation studies are required for AR to develop further
(Schmalstieg et al., 2009).

Significant developments in locational technology such as the inclusion of a built-in digital
compass, GPS and accelerometers into mobile phones have allowed not only location but also
heading and pitch to be detected and therefore incorporated into data display systems. These
built-in technologies have brought AR to the wider public and the phones themselves have
sparked a market-driven boom in fusing AR with location-based services.

48. However, this transfer of data may compromise privacy. Some users may wish to use GPS without passing on their position to a third party server, or at least to be informed that such a transaction is taking place.
49. See the W3C's Geolocation API at: http://dev.w3.org/geo/api/spec-source.html
50. https://www.wiki.ed.ac.uk/display/EdinaPersonalisation/API#API-UsingBroker


In the UK, Acrossair launched an AR application in late 2009 – Nearest Tube – that leads
users to the nearest tube (subway) stations in London and New York, illustrated in Figure 6.

Figure 6: Acrossair's nearest tube application for iPhone.51

Operating on the iPhone 3GS, Nearest Tube is typical of the current applications, which make
use of the user's location and provide a visualization of the current surroundings as a
background to the interface. Linked to Google Maps, the user can spin around, select a
restaurant that is shown on their mobile handset screen, and the location of the restaurant will
be shown on Google Maps (Figure 7).

Figure 7: Architecture of Acrossair's augmented reality application

Central to all AR applications is geospatial data. Being in the actual location supersedes the
inclusion of a map, relegating the map to merely an optional extra. As an
example of this, Layar52 displays information from a range of content layers residing on its
server. The content data are overlaid onto the display on the phone's camera to show places of
interest (e.g. restaurants) within the local vicinity. Content data are added via an API that
anyone can contribute to. Figure 8 provides an insight into the service architecture.

51. http://www.acrossair.com/acrossair_app_augmented_reality_nearesttube_london_for_iPhone_3GS.htm
52. http://www.layar.com/

Figure 8: Layar Service Architecture

Currently, applications are in their infancy and mainly focus on specific topics such as 'show
me where the closest x is'. This, however, represents the tip of the iceberg; with the addition
of a GIS into the mix there is notable potential for the industry (Sung and Hudson-Smith,
2010).

4.5 Sensors

Location is not the only information that mobile devices are being engineered to gather from
the local environment. Existing equipment within the phone is also being used in tandem with
geo-data mash-up services to provide new applications. An example of this is NoiseTube,53 a
research project that shows how participative sensing of noise pollution by the general public
can be conducted using a low-cost mobile platform (Maisonneuve et al., 2009). By
downloading an application to a GPS-enabled mobile phone, users are able to record the noise
level of their surrounding environment and submit an associated tag with the measurement.
Each measurement is then stored and collated so that users contribute to a collective noise
map (see Figure 9).

As mobile technology develops, other environmental sensing devices are likely to be
incorporated into the device. Data recorded from sensors may range, for example, from
simple physical measurements such as temperature or altitude, to object recognition in video
camera footage obtained from an unmanned aerial vehicle. The combination of the data
gathered from these sensors together with the location of the user offers a whole new
generation of what is being referred to as 'reality mining' (Eagle and Pentland, 2006). The
ability to mash data from these different sources is a major future direction for the
technology.

Passive crowdsourcing of data via mobile phones will also generate sources of data in real
time and some mobile applications currently demonstrate this potential. Waze, for example,
provides traffic information based on crowdsourced accelerometer and GPS readings, and
Citysense analyses phone data to visualize and predict popular locations. Again, issues of
harmonization between datasets and data quality are of importance here. The OGC recently
published a report outlining the technological issues that need to be addressed for fusing data
from different sensors and databases (OGC, 2010).

53. http://www.noisetube.net/

Figure 9: An example display from the NoiseTube application



A much more radical form of crowdsourcing is to take the geo-locations from real time
responses in SMS texts, or status updates from micro-blogging services such as Twitter or
identi.ca (if the user is willing to activate the GPS sensing technology in their devices or
provide details of their location in another way). CASA is currently experimenting with
monitoring such data in different places and is developing a toolkit to replicate the ability to
crowdsource data for any user in any geographic area worldwide. Data can be pulled in
directly from social network websites for specific phrases, locations or trends and mapped,
detailing the spatial relationships of these networks.

Figure 10 shows one of a series of New City Landscapes produced by CASA and based on
mined, location-based Twitter data. The contours visualize the density of tweets sent, and an
interactive version is available at:
http://londonist.com/2010/06/londons_twitter_traffic_mapped_as_c.php


Figure 10: Mining Twitter locations for New City Landscapes



4.6 3-D and immersive worlds

Due to the development of Web technology, map and geospatial data visualization are no
longer limited to static or 2-D format, but can take advantage of immersive and highly
interactive virtual environments to explore and present dynamic geospatial data (MacEachren,
2001). The rise in computing power (specifically graphic card technology), crowdsourcing
techniques and changes in data licensing models is rapidly moving map data into 3-D
environments. These were originally built using computer-aided design (CAD) systems but
have now been extended into a range of multimedia, particularly virtual worlds and gaming
environments, which are being opened up for the addition of external content.54

An important, emerging area of geospatial-related data visualization is that of immersive
worlds. These offer, through the browser window, high-resolution and street-level views of
locations based on photographs. Users can 'walk' around the location using mouse clicks or
the arrow keys. Primary examples include Google Street View, Everyscape, Bing Maps, EarthMine,
MapJack, and the open source alternative Planet Earth. Such services form part of a
continuum with virtual worlds such as Second Life (see Figure 11 below).

Figure 11: Importing mash-ups created using GMapCreator into the Second Life virtual world

There is also a developing branch of virtual reality known as 'mirror worlds' in which the
physical world is replicated in a lifelike virtual model using advanced 3-D computer graphics,
which a user explores through their browser. These mirror worlds tend to use graphics to
replicate buildings, streets etc. rather than actual photographs. In a recent report for JISC, de
Freitas (2008) noted that future development in these worlds included the integration of
geospatial data and other mash-ups, most likely through forms of service-oriented
architecture. Turner and Forrest (2008) note that there is a move to integrate these mirror
worlds into social networking technology. They cite the example of SceneCaster, which
allows virtual scenes to be embedded within Facebook. This is a large and rapidly developing
area and readers are directed to the JISC report and the EduServe-funded Virtual Worlds
Watch55 for further information. Some of the educational implications of these new
technologies were explored by the JISC-funded Open Habitat project.56

54. Google Earth has an iPhone app allowing 3-D information to be viewed and overlaid with data while on the move, and Bing Maps is being integrated with ArcGIS 9.3, allowing two- and three-dimensional data to be ported into ESRI's flagship proprietary GIS.

A major trend in this area is that, increasingly, 3-D and immersive world services are
reaching out to their own users by incorporating crowdsourced data. Google Earth is typical
of this trend: its data are currently a mix of 3-D cities created by the company itself via
automated photogrammetry techniques and crowdsourced models created by users through its
free SketchUp and Google Building Maker modelling applications.57

Google SketchUp was released in 2006 to complement the professional version of SketchUp,
a well known 3-D modelling program. Users are encouraged to use the software to model
their local neighbourhoods as part of a crowdsourcing exercise to create 3-D content where
automated processes would be cost prohibitive. The process is similar in many ways to
Google Map Maker, which operates under similar terms and conditions. Model submissions
are reviewed internally by Google as and when the user selects the option in SketchUp that a
model is 'Google Earth Ready'. The model is checked by Google employees to determine if
the building is 'real, current, and correctly-located'. If the model passes the review process, it
is added to the '3-D Warehouse Layer' making it publicly viewable in Google Earth when the
box in the sidebar that is labelled '3-D Buildings' is ticked. So far, users have been
encouraged to model sections of the earth via a series of 'model your town' competitions
where Google exhorts the user to 'Show your civic pride (and maybe win a prize) by creating
a 3-D portrait of your community and sharing it with the world. You have the power to get
your town on the map – and there's no bigger map than Google Earth' (SketchUp website,
2010). Such an approach is typical of crowdsourcing, although the Google terms and
conditions are more stringent than, say, OSM, and have a much more focused and controlled
aim in mind. Whether we can include these as map mash-ups takes us to the very edge of our
interest here but at the very least this is representative of new ways in which non-expert users
can create their own geographical content for their own use.

In fact, any user can import map data into the 3-D environment of Google Earth if they are
able to represent their data as a KML file. There are now plenty of free plug-ins to do this and
many GIS systems are able to import and export KML files. The Free Geography Tools
website58 contains a variety of such converters not only for Google Earth but also for OSM
and other mapping systems.
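
Indeed, a KML file need not be produced by a plug-in at all: a placemark is only a few lines of XML, as this sketch (with an illustrative location) shows:

    # Write a minimal KML file that Google Earth, or any KML-aware GIS, can import.
    kml = """<?xml version="1.0" encoding="UTF-8"?>
    <kml xmlns="http://www.opengis.net/kml/2.2">
      <Placemark>
        <name>University College London</name>
        <Point><coordinates>-0.1340,51.5246,0</coordinates></Point>
      </Placemark>
    </kml>"""

    with open('placemark.kml', 'w') as f:
        f.write(kml)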

CASA has produced GEarthCreator which enables users to convert files into KML and
display them in Google Earth, demonstrated in Figure 12 (a). The software can also use
products such as Google Earth directly in order to exploit the power of the 3-D software to
augment other software that does not have such display capability. An example is shown in
Figure 12 (b) for a land use transportation model of Greater London in which 2-D data are
plotted continually as the users explore the model data, outputs and predictions but also wish
to see the data in 3-D. A link to Google Earth enables the user to add additional data from
third party suppliers and compare this with the data that are exported from the user's own
analysis.


























































55
http://virtualworldwatch.net

56
http://magazine.openhabitat.org

57
Google Building Maker was introduced in late 2009 and allows the user to model directly on top of
oblique aerial imagery using a range of simple shapes. The technique is reminiscent of the CANOMA
software tool by Adobe, released in 1999, and now operating over the Web using pre-defined imagery.

58
http://freegeographytools.com/


35

JISC
TechWatch:
Data
mash‐ups…
(Sept.
2010)


Figure 12: 3-D mash-ups using Google Earth showing (a) conventional import of a KML file of GDP
(b) the exporting of 2-D thematic maps from a land use transport model into Google Earth

4.7 HTML5

In 2007, the W3C formed a working group chartered to work with the Web Hypertext
Application Technology Working Group (WHATWG) on the development of the HTML5
specification, the latest version of the core markup language of the Web.59 Of particular note
in the specification is the Canvas element60 which allows for dynamic, scriptable rendering of
bitmap images.

A canvas consists of a drawable region within which JavaScript code may be used to generate
graphics dynamically. This technique is enabling a new generation of more flexible
geospatial data services. The first noted example of geography-specific information served
via HTML5 is Cartagen,61 an open source vector mapping framework developed at the MIT
Media Lab's Design Ecology group. Introducing the system, Boulos et al. (2010) note that as
map data become richer and we strive to present multi-layered data in a variety of
projections and at many zoom levels, traditional Web mapping techniques become too
inflexible. Instead of sending pre-rendered tiled images for every zoom level, Cartagen
draws maps dynamically on the client side, using the canvas element of HTML5. Moving
rendering to the client side, known as local rendering, is a notable step forward in
vector-based mapping. However, the continued value of raster, tile-based data should not be
underestimated, both in lightening the load on the user's machine and in distributing
copyright-sensitive data.
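
As a concrete illustration of local rendering, the following TypeScript sketch projects vector coordinates with the standard Web Mercator tile formulae and draws them straight onto a canvas element on the client, rather than fetching pre-rendered tiles. The canvas element id and the sample road are assumptions made for the purpose of the sketch, not part of any particular framework.

// Convert longitude/latitude (WGS84) to Web Mercator pixel coordinates at a
// given zoom level, using the standard 256-pixel tile convention.
function project(lon: number, lat: number, zoom: number): [number, number] {
  const scale = 256 * Math.pow(2, zoom);
  const x = ((lon + 180) / 360) * scale;
  const latRad = (lat * Math.PI) / 180;
  const y =
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * scale;
  return [x, y];
}

// Draw one 'way' (an array of lon/lat vertices) as a polyline on the canvas.
function drawWay(ctx: CanvasRenderingContext2D, way: [number, number][],
                 zoom: number, originX: number, originY: number): void {
  ctx.beginPath();
  way.forEach(([lon, lat], i) => {
    const [x, y] = project(lon, lat, zoom);
    if (i === 0) ctx.moveTo(x - originX, y - originY);
    else ctx.lineTo(x - originX, y - originY);
  });
  ctx.strokeStyle = '#555';
  ctx.lineWidth = 2;
  ctx.stroke();
}

// Usage in a browser, assuming a <canvas id="map"> element exists on the page.
const canvas = document.getElementById('map') as HTMLCanvasElement;
const ctx = canvas.getContext('2d');
if (ctx) {
  const zoom = 15;
  const way: [number, number][] = [[-0.134, 51.524], [-0.127, 51.519]];
  const [ox, oy] = project(way[0][0], way[0][1], zoom);
  drawWay(ctx, way, zoom, ox - 20, oy - 20); // offset so the line is visible
}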

4.7.1 Other standards developments

Work on new standards from the OGC is ongoing, with community-specific application schemas
being introduced to extend GML for use in particular domains.

59 A key aim of HTML5 is to reduce the need for rich Internet application technologies such as Adobe Flash, Microsoft Silverlight and Oracle-Sun JavaFX (Boulos et al., 2010).

60 See: http://en.wikipedia.org/wiki/Canvas_element. Initially introduced by Apple in Safari 1.3 to power applications such as Dashboard widgets and the Safari browser within the Mac OS X WebKit component, Canvas was later adopted by the Gecko browsers and Opera and standardized by the WHATWG in its proposed specifications for next-generation Web technologies.

61 http://eco.media.mit.edu




One example is CityGML, an encoding standard for the representation, storage and exchange
of virtual 3-D city and landscape models. GML 3.0 is an XML markup for geographic data
defining points, lines, polygons and coverages; CityGML extends the schema to model 3-D
vector data along with other semantic data related to a city.
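
To make the markup concrete, the following minimal sketch shows the kind of geometry encoding GML 3.0 defines for a point, embedded in a TypeScript string for consistency with the other examples in this report; the coordinate reference system and coordinate values are illustrative choices, not drawn from any dataset discussed here.

// A GML 3.0 point: under the urn form of EPSG:4326 the axis order is
// latitude then longitude. All values here are illustrative.
const gmlPoint: string = `
<gml:Point srsName="urn:ogc:def:crs:EPSG::4326"
           xmlns:gml="http://www.opengis.net/gml">
  <gml:pos>51.524 -0.134</gml:pos>
</gml:Point>`;

console.log(gmlPoint);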

4.8 Policy, standards and the wider context

There is widespread agreement that the effective use of geospatial data (e.g. within the
context of higher education) requires the establishment of a geospatial data framework to both
catalogue the available datasets and provide access to the data (Owen et al., 2009). While
there have been attempts to do this in the past, the rapid rate of technological change, driven
primarily by commercial interests, has meant that policy and standards have inevitably been
playing catch-up with real-world developments. In this section we examine some of the key
issues that are currently shaping how mash-ups will develop in the future.

4.8.1 Geospatial data frameworks

The impact of data mash-ups and related technological developments has triggered major
national and international programmes to achieve harmonization of geospatial datasets and
interoperability of Web service components that utilize these data. Examples include the
'Joined-up Geography' initiative in UK local and central Government, EU programmes such
as INSPIRE, GMES, SEIS, and global programmes such as GEOSS. These programmes are
usually specified by Government and managed through a collaborative top-down structure
which can produce sound, consensus-based solutions in many circumstances. From the user's
perspective, changes to formats, new metadata standards, upgraded software to use the new
datasets, and training to use the new software are all issues that will need to be tackled (Owen
et al., 2009).

However, the pace of evolution of mash-ups and mapping and the innovative approaches
taken by the business community make it near impossible for conventional, committee-based
governmental approaches to maintain the necessary responsiveness. There is a danger that
long-term programmes (e.g. transport charging, INSPIRE, GMES, and approaches to national
security for the Olympics) will be undermined and made either redundant or inferior to
technology solutions arising from the ground up, driven by developments aimed at mass
consumer applications.

Recognition of this danger is driving research initiatives at a number of UK universities
and at research organizations worldwide. Current work on U-cities (settlements with
ubiquitous information technology) is looking at the use of crowdsourced data for
Government-led projects, such as planning, which are usually approached in a top-down
fashion. A governmental crowdsourcing model is suggested whereby urban residents can submit
opinions and information to support their participation in policy making (Jackson et al.,
2009). Other uses for
crowdsourced data are being explored by The Future Data Demonstrator at the University of
Nottingham, where the aim is to combine data from authoritative Ordnance Survey datasets
with feature-rich, informal OSM data. The context for this work is the progress of national
and international spatial data infrastructures such as the UK Location Programme and
INSPIRE, contrasted with crowdsourced geospatial databases. While initiatives such as
INSPIRE tend towards a top-down process of harmonized data models and services using ISO
and OGC standards, the OSM approach labels features with democratically agreed preferred
attribute tags that can change over time (with inherent issues of data quality). The basic
research question behind the demonstrator is how to capture the best of each approach
(Anand et al., 2010).



4.8.2 UK Government policy on mash-ups and open data

Government plays an important role in deciding how mash-ups and mapping will develop in
future, particularly as its stance is changing dramatically with regard to the openness of public
datasets and open source software. Sir Tim Berners-Lee and Professor Nigel Shadbolt, acting
as advisers to Government, have proposed that public data should be much more open, in
particular recognising the relevance of location data.

Partly in response to these suggestions, the UK Government published, in December 2009, a
consultation paper on policy options for geographic information from Ordnance Survey. The
purpose of the consultation was to seek views about how best to implement formally the
proposals previously made by the Prime Minister to make certain Ordnance Survey datasets
available for free, with no restrictions on re-use. On 31st March 2010, the Department for
Communities and Local Government published its response to the consultation exercise,
examining the way forward for Ordnance Survey data. At the same time the Government
confirmed that it was releasing, from 1st April, a range of Ordnance Survey data and
products, free of charge, to be known collectively as OS OpenData.62 This package includes
a 1:250 000 scale colour raster, a 1:50 000 gazetteer, OS Street View and Meridian 2. In
addition, to help enable the Semantic Web, Ordnance Survey will develop a service that
allows its TOIDs (unique 16-digit identifiers for geographical objects) to be openly
referenced and located. The details of this latter service will be announced in due course.

These developments form part of a wider agenda. The Smarter Government white paper
(HMG, 2009) describes a variety of ways in which public sector data is to be made available
and includes proposals to release Public Weather Service information, NHS Choices details
and live transport timetables (with 80% of buses using GPS sensors by 2015). The new
coalition Government has confirmed its commitment to continuing this process and
announced on 1st June 2010 the formation of a Public Sector Transparency Board to oversee
further developments towards opening Government data and increasing public transparency.

On 21st January 2010, the Government began the process of releasing this kind of public
data with the launch of www.data.gov.uk. Over 3,000 datasets are now available through this
service. Formally, data.gov.uk implementation is being led by the Transparency and Digital
Engagement team in the Cabinet Office. As well as providing an access point for the newly
opened datasets, the site brings interested developers together to discuss technical and
policy issues relating to the use of UK Government data sources and the development of
Linked Data.63 The site shows commitment to the adoption of the Semantic Web, publishing
many datasets in RDF and including information on how to manipulate it using SPARQL.
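
As a hedged illustration, the following TypeScript sketch sends a generic SPARQL query to an endpoint over HTTP and prints the results in the standard SPARQL JSON results format. The endpoint URL is a placeholder, not a documented data.gov.uk service; consult the site itself for live endpoints and dataset-specific vocabularies.

// Placeholder endpoint -- substitute a real SPARQL service.
const endpoint = 'http://example.gov.uk/sparql';

// A generic query: list ten resources and their rdfs:label values.
const query = `
  SELECT ?subject ?label
  WHERE { ?subject <http://www.w3.org/2000/01/rdf-schema#label> ?label . }
  LIMIT 10
`;

async function runQuery(): Promise<void> {
  const response = await fetch(
    `${endpoint}?query=${encodeURIComponent(query)}`,
    { headers: { Accept: 'application/sparql-results+json' } },
  );
  // Standard SPARQL JSON results: rows live under results.bindings.
  const data = await response.json();
  for (const row of data.results.bindings) {
    console.log(row.subject.value, row.label.value);
  }
}

runQuery();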

4.8.3 Standards, quality and data reliability

The trust placed in geographic data and products is an important issue. These technologies
will only prove useful if they are fit for their intended purpose, and uncertainty
regarding the quality of user-generated content is often cited as a major obstruction to
its wider use. Critics argue that amateur contributions may be highly erroneous and as such
essentially invalid for serious academic or industrial uses; Goodchild (2009) argues that a
crowdsourcing project should publish additional documentation and assessments reviewing
quality issues. This is an important issue for HE.

62 www.ordnancesurvey.co.uk/opendata

63 A process of linking data, rather than HTML documents, across the Web. It was first proposed by Tim Berners-Lee in 1998, with four basic principles derived from the original ideas of the Web. See, for example: http://www.youtube.com/watch?v=OM6XIICm_qo&feature=related and http://www.youtube.com/watch?v=qMjkI4hJej0




While lower-grade data may be suitable for some consumer applications (recent studies
indicate that the quality of OSM data is at least comparable with that of the Ordnance
Survey's Meridian 2 product [Haklay, 2010b]), this is not necessarily the case for
Government-funded research projects. It would therefore appear essential to develop
suitable metadata so that users can judge for themselves.

Standards bodies are aware of the concerns surrounding the quality of user-generated
content and, as both mash-ups and user-generated map data flourish, standards are being
drafted in an attempt to deal with issues surrounding quality and metadata. One example,
prompted by the increased pervasiveness of ubiquitous computing technologies, is the work
currently being undertaken within ISO/TC 211 as project 19154 (Standardization Requirements
for Ubiquitous Public Access, 2009). This is concerned with ubiquitous public access: an
environment, or infrastructure, for providing geographic information 'in every place, at
any time, for any device'. To manage crowdsourced data, project 19157 (ISO/TC 211: 19157,
2009 [draft]) is looking at how to evaluate quality and at setting standards for defining
how this should be done.

4.8.4 Privacy and confidentiality

While new Web and mobile-based geospatial services and applications will provide clear
benefits to users, there are still ethical and social factors that are not yet fully
understood or addressed. Issues concerning user-generated and 'open' data, and the ethics
and privacy of data handling, will have a bearing on the development of location-based
information and services. In particular, broadcasting personal location information,
especially within social networks, may have unintended consequences beyond the simple usage
and value that these applications provide.64

The other key question of concern is the extent to which combining different layers of
data, each of which is independently within the bounds of confidentiality imposed by the
data provider and by Government, leads to breaches of confidentiality. Owen et al. (2009)
describe the measures already being taken in the public sector to limit the amount of
geographical information provided in individual-level datasets and to create secure
environments for data analysis, in recognition of the perceived high risk of disclosure of
confidential data on individuals. However, companies and open data projects are not subject
to such stringent oversight and, in addition, the prevalence of crowdsourced data and
content produced by individuals who are not necessarily trained in data management or legal
issues may result in breaches of privacy or confidentiality, through ignorance as much as
intent. Unfortunately, where these mash-ups are created through a company's API, the data
will be stored and, depending on the API's terms and conditions, may be replicated and
concatenated further.


64 For example, Privacy International, one of the watchdogs working on privacy issues, identified that Google Latitude's location-sharing facility inadequately protected users from unintentionally broadcasting their position. See: http://news.bbc.co.uk/1/hi/7872026.stm




Conclusions and recommendations


Data mash-ups in education and research are part of an emerging, richer information
environment with greater integration of mobile applications, sensor platforms, e-science,
mixed reality and semantic, machine-computable data. For example, development of
augmented reality learning experiences will be of particular relevance to those enrolled on
distance learning programmes and also to disabled students, and mixed reality applications
that integrate the physical and virtual worlds will become particularly relevant to discovering
additional, related knowledge and helping to visualize data. In the longer term this should be
facilitated by benefits accrued from more open licensing arrangements with data providers.

In the meantime there are several issues that HE will need to take account of:

1. As mash-ups become more widespread, and with students more aware and inclined to carry
out this type of work, a suitable technical infrastructure will be required. Providing this will
present challenges for institutional ICT support teams. While some technicians may already
be familiar with these issues in specialist departments, staff in other areas of research will
increasingly be affected. This may indicate a need for more centralized management of these
ICT issues. In the JISC context, research such as the Wstieria [sic] project65 into the use
of, and guidelines for, Shibboleth or similar technologies to authenticate map and other
Web services will be valuable. From an institutional perspective, there may be
disadvantages in a rise in the number of users or groups managing Web servers for separate
mash-ups, along with added security risks.

2. A key organizational challenge is to educate staff and students to become equally familiar
with both the potential and the limitations of mash-up technologies so that they are aware of
the implications of combining data sources and making the results available on the Internet.
Students, in particular, need to become more aware of the legal and ethical issues involved in
using mash-up technologies, particularly for university based work. Training on the ethical
use of data should be considered.

3. In general, Web 2.0 places an emphasis on making use of the information in the vast
databases that are populated by user contributions (through the architecture of participation).
However, the terms and conditions of these databases vary considerably and may be revised
at any time. Just because a service is free at the point of use does not make it 'open'. There is
therefore a question mark over the degree of genuine openness of access to the data and there
are implications for institutions working on joint projects with commercial organizations.
Within HE, there has been a wide-ranging debate within the academic and publishing
communities over open access to scientific and humanities research and the role of journals in
this regard, and this is not unconnected to moves within the research community to expose
experimental data. The tension between the desire to undertake mash-ups and the requirement
to ensure open access to data in the future needs to be resolved. It is recommended that
JISC undertake work in this area to clarify the issues and to provide advice for
institutions.

4. New sources of data are becoming available and easy-to-use toolkits are opening up spatial
analysis beyond the traditional user, offering considerable opportunities throughout HE.
However, as data availability increases, especially in terms of crowdsourced data, students
and educators need to be aware of both the risks and benefits of such approaches.
Crowdsourcing may be an acceptable route for data collection as long as standards are put
in place to ensure sound survey principles and sampling methods. Data produced using such
methods should be clearly identified.

65 http://edina.ac.uk/projects/wstieria_summary.html




About the authors

Suchith Anand is Ordnance Survey Research Fellow at the Centre for Geospatial Science,
University of Nottingham. He is co-Chair of both the ICA working group on Open Source
Geospatial Technologies and the Open Source GIS UK conference series. His details are
available at: http://www.nottingham.ac.uk/~lgzwww/contacts/staffPages/SuchithAnand/Suchith%20Anand.htm

Michael Batty CBE FBA FRS is Bartlett Professor of Planning at University College
London where he directs the Centre for Advanced Spatial Analysis (CASA). His research
work involves the development of computer models of cities and regions. He is an expert
member of the Advisory Panel on Public Sector Information (APPSI) and Chair of the ESRC
Census Advisory Committee.

Andrew Crooks is an assistant professor in the Department of Computational Social Science
and a member of the Center for Social Complexity at George Mason University, and a Visiting
Research Fellow in CASA working on the EPSRC project on Global Dynamics and Complexity. He
was formerly GLA Economics Research Fellow in CASA at UCL. His research interests relate to
exploring, understanding and communicating urban built and socio-economic environments
using GIS, spatial analysis and agent-based modelling methodologies.

Andrew Hudson-Smith is a Senior Research Fellow at CASA, University College London. He is
Editor-in-Chief of the journal Future Internet, an elected Fellow of the Royal Society of
Arts and author of the Digital Urban blog. His research interests relate to urban mapping,
3-D visualization and the Internet of Things.

Mike Jackson was appointed to the Chair of Geospatial Science at the University of
Nottingham in April 2005 where he has established the Centre for Geospatial Science. He is a
Fellow of the Royal Institution of Chartered Surveyors, a Fellow of the Royal Geographical
Society, a non-executive director of the Open Geospatial Consortium Inc. (OGC) and
Chairman, Commission 5 (Networks) of EuroSDR.

Richard Milton is a Research Fellow in CASA where he works on the Generative eSocial
Science (Genesis) project. He is also the developer of the MapTube website and has released
the 'GMapCreator' and the 'Image Cutter' software. Previously, he worked on the Equator
project, where he used GPS-tracked sensors to make fine-scale maps of carbon monoxide
distribution.

Jeremy Morley has been Deputy Director of the Centre for Geospatial Science at the
University of Nottingham since September 2009. He is the UK representative to EuroSDR
and a member of the UK Location Programme's User Group. His interests lie in the interface
between formal, OGC- or SDI-based online GIS and informal, mashup-style online content,
and the effects of ubiquitous computing and sensing on our understanding of the world.



References
[All links last accessed 4th September 2010]

ABI RESEARCH. 2009. ABI Research Anticipates "Dramatic Growth" for Augmented Reality
via Smartphones (Press Release). ABI Research, 22nd October. Available online at:
http://www.abiresearch.com/press/1516ABI+Research+Anticipates+%93Dramatic+Growth%94+for+Augmented+Reality+via+Smartphones

ANAND, S., MORLEY, J., WENCHAO, J., DU, H., HART, G. & JACKSON, M. 2010. When
worlds collide: combining Ordnance Survey and Open Street Map data. AGI Geocommunity
conference, Stratford upon Avon, UK, 28th-30th September.

ANDERSON, P. 2007. What is Web 2.0? Ideas, Technologies and Implications for Education.
JISC, Feb 2007. Available online at:
http://www.jisc.ac.uk/whatwedo/services/techwatch/reports/horizonscanning/hs0701.aspx

BENFORD, S. 2005. Future Location-Based Experiences. JISC, Jan 2005. Available online at:
http://www.jisc.ac.uk/whatwedo/services/techwatch/reports/horizonscanning/hs0501.aspx

BERRY, R., FRY, R., HIGGS, G. & ORFORD, S. 2010. Building a geo-portal for enhancing
collaborative socio-economic research in Wales using open-source technology. Journal of Applied
Research in Higher Education, 2 (1), pp. 77–92. Available online at:
http://jarhe.research.glam.ac.uk/media/files/documents/2010-01-29/Berry_d2_web.pdf

BOULOS, M. N. K., SCOTCH, M., CHEUNG, K.-H. & BURDEN, D. 2008. Web GIS in practice VI:
a demo playlist of geo-mashups for public health neogeographers. International Journal of Health
Geographics, 7. Available online at: http://www.ij-healthgeographics.com/content/7/1/38

BOULOS, M. N. K., WARREN, J., GONG, J. & YUE, P. 2010. Web GIS in practice VIII: HTML5
and the canvas element for interactive online mapping. International Journal of Health
Geographics, 9 (14). Available online at: http://www.ij-healthgeographics.com/content/9/1/14

BRAY, T. 2006. OSCON - Open Data. ongoing (weblog), 30th July. Available online at:
http://www.tbray.org/ongoing/When/200x/2006/07/28/Open-Data

BRODT, A., NICKLAS, D., SATHISH, S. & MITSCHANG, B. 2008. Context-Aware Mashups for
Mobile Devices. Proceedings of Web Information Systems Engineering (WISE) 2008, Auckland,
New Zealand, 1st - 3rd September 2008. Springer: Berlin.

BUTCHART, B., KING, M., POPE, A., VERNON, J., CRONE, J. & FLETCHER, J. 2010.
Alternative Access Project: Mobile Scoping Study Final Report. EDINA, June 2010. Available
online at:
http://go2.wordpress.com/?id=725X1342&site=mobilegeo.wordpress.com&url=http%3A%2F%2Fmobilegeo.files.wordpress.com%2F2010%2F07%2Fdigimap-mobile-scoping-study-final-projectv1-31.doc&sref=http%3A%2F%2Fmobilegeo.wordpress.com%2F2010%2F07%2F16%2Fmobile-scoping-study-report%2F

CABINET OFFICE 2009. Open Source, Open Standards and Re–Use: Government Action Plan.
UK Government Cabinet Office, 24th February. Available online at:
http://www.cabinetoffice.gov.uk/media/318020/open_source.pdf

CHAPMAN, A. & RUSSELL, R. 2009. Shared Infrastructure Services Landscape Study. JISC,
15th December 2009. Available online at:
http://ie-repository.jisc.ac.uk/438/1/JISC-SIS-Landscape-report-v3.0.pdf



DCLG. 2010. Policy options for geographic information from Ordnance Survey (Government
Response). Department for Communities and Local Government, March 2010. Available online at:
http://www.communities.gov.uk/publications/corporate/ordnancesurveyconresponse

de FREITAS, S. 2008. Serious Virtual Worlds report. JISC, 3rd November. Available online at:
http://www.jisc.ac.uk/publications/reports/2008/seriousvirtualworldsreport.aspx

EAGLE, N. & PENTLAND, A. 2006. Reality mining: sensing complex social systems. Personal
Ubiquitous Computing, 10 (4), pp. 255-268. Available online at:
http://www.springerlink.com/content/l562745318077t54/

EGENHOFER, M. J. 2002. Toward the semantic geospatial web. Proceedings of the 10th ACM
international symposium on Advances in geographic information systems, McLean, Virginia,
USA, 8th-9th November. ACM.

EISNOR, D. 2006. What is neogeography anyway? Platial News (weblog), 27th May 2006.
Available online at: http://platial.typepad.com/news/2006/05/what_is_neogeog.html

ERKHEIKKI, J. 2007. Nokia to Buy Navteq for $8.1 Billion, Take on TomTom (Update7).
Bloomberg.com, 1st October. Available online at:
http://www.bloomberg.com/apps/news?pid=newsarchive&sid=ayyeY1gIHSSg

GIBSON, R. & ERLE, S. 2006. Google Map Hacks. O'Reilly Media Inc.: Sebastopol, CA.

GOODCHILD, M. 2009. NeoGeography and the nature of geographic expertise. Journal of
Location Based Services, 3 (2), pp. 82-96. Available online at:
http://www.informaworld.com/smpp/content~db=all~content=a911734343

GOODCHILD, M. F. 1992. Geographical Information Science. International Journal of
Geographical Information Systems, 6 (1), pp. 31-45.

GOODCHILD, M. F. 2007. Citizens as sensors: the world of volunteered geography. GeoJournal,
69 (4), pp. 211-221.

HAKLAY, M. 2010a. Haiti – how can VGI help? Comparison of OpenStreetMap and Google Map
Maker. Po Ve Sham (weblog), 18th January. Available online at:
http://povesham.wordpress.com/2010/01/18/haiti-how-can-vgi-help-comparison-of-openstreetmap-and-google-map-maker/

HAKLAY, M. 2010b. How good is volunteered geographical information? A comparative study of
OpenStreetMap and Ordnance Survey datasets. Environment and Planning B: Planning and
Design, 37 (4), pp. 682-703. Available online at: http://www.envplan.com/abstract.cgi?id=b35097

HAKLAY, M., SINGLETON, A. & PARKER, C. 2008. Web Mapping 2.0: The Neogeography of the
Geospatial Internet. Geography Compass, 2 (6), pp. 2011-2039. Available online at:
http://www3.interscience.wiley.com/journal/121528007/issue

HOBONA, G., JACKSON, M. & ANAND, S. (Forthcoming) Implementing Geospatial Web Services
for Cloud Computing. IN: Zhao, P. & Di, L. (Eds.) Geospatial Web Services. IGI Publishing.

HOEF, M. V. D. & KANNER, J. 2007. TomTom Agrees to Acquire Tele Atlas for EU2 Billion
(Update10). Bloomberg.com, 23rd July. Available online at:
http://www.bloomberg.com/apps/news?pid=newsarchive&sid=agT1Po33faG4&refer=home



HOF, R. D. 2005. Mix, Match, And Mutate. Businessweek, 25th July. Bloomberg: New York, USA.
Available online at: http://www.businessweek.com/magazine/content/05_30/b3944108_mz063.htm

HOWE, J. 2006. The Rise of Crowdsourcing. Wired, June 2006. Condé Nast Digital: New York,
USA. Available online at: http://www.wired.com/wired/archive/14.06/crowds.html

HUDSON-SMITH, A., BATTY, M., CROOKS, A. & MILTON, R. 2009. Mapping for the Masses:
Accessing Web 2.0 Through Crowdsourcing. Social Science Computer Review, 27 (4). Available
online at: http://ssc.sagepub.com/content/27/4/524.abstract

IPIÑA, D. L. D., VAZQUEZ, J. I. & ABAITUA, J. 2007. A context-aware mobile mashup platform
for ubiquitous web. Proceedings of the 3rd IET International Conference on Intelligent
Environments, University of Ulm, Germany, 24th - 27th September 2007.

JACKSON, M. J., GARDNER, Z. & WAINWRIGHT, T. 2009. The future of ubiquitous computing and
urban governance. Internal draft report, University of Nottingham.

KARPISCHEK, S., MAGAGNA, F., MICHAHELLES, F., SUTANTO, J. & FLEISCH, E. 2009.
Towards location-aware mobile web browsers. Proceedings of the 8th International Conference on
Mobile and Ubiquitous Multimedia, Cambridge, United Kingdom, 22nd-25th November. ACM.

KLEMPERER, P. 2006. Network Effects and Switching Costs: Two Short Essays for the New
Palgrave. Available online at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=907502

LAMB, B. 2007. Dr. Mashup; or, Why Educators Should Learn to Stop Worrying and Love the
Remix. EDUCAUSE Review, 42 (4), pp. 12-25. Available online at:
http://www.educause.edu/EDUCAUSE+Review/EDUCAUSEReviewMagazineVolume42/DrMashuporWhyEducatorsShouldLe/161747

LEFORT, L. 2009. Review of semantic enablement techniques used in geospatial and semantic
standards for legacy and opportunistic mashups. Proceedings of Australasian Ontology Workshop,
Melbourne, 1st December.

LIEBOWITZ, S. J. & MARGOLIS, S. 1994. Network Externality: An Uncommon Tragedy. Journal of
Economic Perspectives, 8 (2). Available online at: http://www.utdallas.edu/~liebowit/jep.html

LIU, M., HORTON, L., OLMANSON, J. & WANG, P.-Y. 2008. An Exploration of Mashups and
Their Potential Educational Uses. Computers in the Schools, 25 (3), pp. 243 - 258. Available online
at: http://www.informaworld.com/10.1080/07380560802368090

LORENZO, G. D., HACID, H., PAIK, H.-Y. & BENATALLAH, B. 2009. Data integration in
mashups. SIGMOD Record, 38 (1), pp. 59–66. ACM.

MACDONALD, S. 2008. Data Visualisation Tools: Part 2 - Spatial Data in a Web 2.0
environment. University of Edinburgh, 17th October 2008. Available online at:
http://edina.ac.uk/cgi-bin/news.cgi?filename=datasharebriefing2-20081028.txt

MACEACHREN, A. M. & KRAAK, M. 2001. Research Challenges in Geovisualization. Cartography
and Geographic Information Science, 28 (1), pp. 3-12. Available online at:
http://www.cartogis.org/publications/abstracts/cagisab0101.html

MAISONNEUVE, N., STEVENS, M., NIESSEN, M. & STEELS, L. 2009. NoiseTube: Measuring and
mapping noise pollution with mobile phones. Proceedings of the 4th International ICSC
Symposium, Thessaloniki, Greece, 28th-29th May. Springer.



O'REILLY, T. 2005. What is Web 2.0: Design Patterns and Business Models for the Next
Generation of Software. O'Reilly Media Inc., 30th September 2005. Available online at:
http://oreilly.com/web2/archive/what-is-web-20.html

O'REILLY, T. 2006. Open Source Licenses are Obsolete. O'Reilly Radar (weblog), 1st Aug.
Available online at: http://radar.oreilly.com/archives/2006/08/open_source_licenses_are_obsol.html

OECD. 2007. Participative Web: User-created content. Organisation for Economic Co-operation
and Development, 12th April 2007. Available online at:
http://www.oecd.org/dataoecd/57/14/38393115.pdf

OGC. 2008. OGC Reference Model. Open Geospatial Consortium Inc., 11th November. Available
online at: http://www.opengeospatial.org/standards/orm

OGC. 2010. Fusion Standards Study Engineering Report. Open Geospatial Consortium Inc., 21st
March. Available online at: http://www.opengeospatial.org/standards/per

OPEN BUSINESS. 2006. People Inside & Web 2.0: An Interview with Tim O'Reilly. Open Business
(weblog). Available online at:
http://www.openbusiness.cc/2006/04/25/people-inside-web-20-an-interview-with-tim-o-reilly/

OS. 2008. Use of Google Maps for display and promotion purposes. Ordnance Survey. Available
online at: http://www.freeourdata.org.uk/docs/use-of-google-maps-for-display-and-promotion.pdf

OWEN, D., GREEN, A. & ELIAS, P. 2009. Review of Geospatial Resource Needs. ESRC, December.
Available online at:
http://www.esrc.ac.uk/ESRCInfoCentre/Images/Geospatial%20report%20with%20cover%20Dec09_tcm6-35008.pdf

SCHMALSTIEG, D., LANGLOTZ, T. & BILLINGHURST, M. 2009. Augmented Reality 2.0. Proceedings
of the International Symposium on Mixed and Augmented Reality (ISMAR), Orlando, FL, USA,
19th-22nd October. IEEE.

SIMON, R. & FRÖHLICH, P. 2007. A mobile application framework for the geospatial web.
Proceedings of the 16th international conference on World Wide Web, Banff, Alberta, Canada,
8th-12th May. ACM.

SKETCHUP WEBSITE. 2010. Available online at:
http://sketchup.google.com/competitions/modelyourtown/index.html

SUNG, H. J. & HUDSON-SMITH, A. 2010. Augmented Reality in 2015. Association of Geographic
Information. Available online at:
http://www.agi.org.uk/storage/foresight/data-technology/GIS%20and%20Augmented%20Reality%20in%202015.pdf

TURNER, A. 2006. Introduction to Neogeography. O'Reilly Media Inc.: Sebastopol, CA. Available
online at: http://oreilly.com/catalog/9780596529956/

TURNER, A. & FORREST, B. 2008. Where 2.0: The State of the Geospatial Web. O'Reilly Media
Inc., September 2008. Available online at:
http://radar.oreilly.com/2008/10/radar-report-on-where-20-the-s.html
