Annals of GIS

This article was downloaded by: [1.55.71.
46]
On: 03 December 2014, At: 20:21
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Annals of GIS
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tagi20
Watermarking digital vector map using graph theoretic

approach
a
Sunil Kumar Muttoo & Vinay Kumar

a
Department of Computer Science , University of Delhi , New Delhi , India
National Informatics Centre , New Delhi , India

Published online: 03 Apr 2012.
To cite this article: Sunil Kumar Muttoo & Vinay Kumar (2012) Watermarking digital vector map using graph theoretic
approach, Annals of GIS, 18:2, 135-146, DOI: 10.1080/19475683.2011.640640
To link to this article: http://dx.doi.org/10.1080/19475683.2011.640640
PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the Content) contained
in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no
representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the
Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and
are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and
should be independently verified with primary sources of information. Taylor and Francis shall not be liable for
any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever
or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of
the Content.
This article may be used for research, teaching, and private study purposes. Any substantial or systematic
reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any
form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://
www.tandfonline.com/page/terms-and-conditions
Annals of GIS
Vol. 18, No. 2, June 2012, 135146
Watermarking digital vector map using graph theoretic approach

Sunil Kumar Muttooa and Vinay Kumarb
a
Department of Computer Science, University of Delhi, New Delhi, India; b National Informatics Centre, New Delhi, India
(Received 6 June 2011; final version received 14 September 2011)
Downloaded by [1.55.71.46] at 20:21 03 December 2014
Similar to any digital dataset, a digital map is also vulnerable to modification, deliberate alteration and copyright violation.
A map is a visual representation of a geographical area that is digitally stored in either raster or vector forms. Vector map is
preferred over raster for both space optimization and quick processing time. Digest concept used for message authentication
3
1
2
is extended to digitally watermark a digital map. The map digest triplet (Mdp
, Mdp
and Mdp
) is generated using various
features of a digital map, supplier code and customer code. The approach computes two 160-bit hash values using secured
hash algorithm (SHA1)and one 128-bit digest using message digest (MD5) algorithm. These two hash values of 160 bit and
one digest of 128 bit are then embedded into a sequence of nodes of the map using graph theoretic approach in such a way
that any alteration in the map alters the sub-graph. The sub-graph is the watermark. There are three watermarks.
Keywords: geographic information system; map digest; steganography; watermarking; secure hash algorithm; DXF; vector
map; spatial data
1. Introduction
A map is a visual representation of an area. It is a symbolic depiction highlighting relationships among different
features of an area like road, rail and soil usage. Once some
theme is depicted on a map, it becomes a thematic map.
More thematic maps may be produced from one base spatial map. The process of creating a digital map from age-old
physical sources is called digitization of map. The field
of study that deals with creation of a digital map and its
composition into thematic maps is called geographic information system (GIS). It takes time, money and effort to
create a map in digital form and it generally accounts for
nearly 80% of the total cost of a GIS project.
Once a map is digitized, it is easy to copy, alter and sell
it as an original product (Brassil et al. 1995, Date et al.
1999). Alternatively, a mischievous person may also produce this altered map as valid document in support of some
false claim. It necessitates developing and placing protection mechanism against such threats so that not only are the
commercial interests of investors protected but also, whenever required, the authenticity and integrity of the source
can be verified. For instance, use of a digital map in a crucial military operation requires integrity verification of the
map before it is put to use.
Cryptography
(Stallings
1999)
and
digital
steganography (Johnson et al. 2001) are in use to
protect and secure the privacy of a message. Watermark is
*Corresponding author. Email: vinay.kumar@nic.in

ISSN 1947-5683 print/ISSN 1947-5691 online
2012 Taylor & Francis
http://dx.doi.org/10.1080/19475683.2011.640640
http://www.tandfonline.com
used to protect a map against piracy. Digital steganography

is a science to hide one form of information (message) into
another (cover). After embedding, the cover is called stego
object. The stego object retains the vital characteristics
of the cover so that it remains statistically and visually
imperceptible. The process of embedding hidden information in a map without producing perceptible changes
is called watermarking and the hidden information is
called watermark (Memon and Wong 1998, Mauro et al.
2002). Compared to general multimedia data types,
few works have been done in the area of digital vector
map watermarking. To our knowledge, a corporation in
Uruguay named The Digital Map Ltd. has patented a
copyright-protecting scheme for vector maps utilizing a
lossless watermarking algorithm. They have also developed a commercial software called MapSN for digital
map vendors (Lopez 2002). This approach is based on
map interpolation and represents watermark bits by adding
new vertices into the original cover map. Based on a
similar approach there are some other digital vector map
watermarking schemes due to Schulz and Voigt (2004) and
Lopez (2002). Contrary to steganography, watermarking
is used to authenticate media and therefore any change in
cover media is not desirable (Karjala 1995, Gopalakrishnan
et al. 1999).
Software piracy has been controlled to some extent
through hardware keys or software licenses. Software
136
S.K. Muttoo and V. Kumar
licensing procedure has been made stricter in many cases

by forcing the customer to register online to get a software
license. However, they do not offer similar protection to the
dataset being used by their software. Reasons cited are both
technical and legal. Outside the GIS community this problem has been known for a long time. The mechanism using
frequency domain technique has been proposed to ensure
the protection of digital vector data against piracy (Xiamu
et al. 2006).
One solution could be to supply GIS dataset in
encrypted form for limited use. This approach binds the
user to use the dataset on a specialized set of software
and hardware because there exists no GIS software as of
today that uses encrypted datasets. Another approach could
be to disable the editing of certain features of the GIS
dataset (Kumar and Sharma 2006). This approach requires
a standard data format that is understood by all GIS software available to support the portability of the dataset.
Unfortunately this has not been achieved till now.
In this article, we have introduced a method to
watermark a digital vector map using graph theoretic
approach. A map digest is generated. The map digest has
three parts. The first two parts are generated using secured
hash algorithm (SHA1) and therefore there are two 160-bit
strings. The third part is generated using message digest
(MD5) algorithm and it yields one 128-bit string. These
bit strings are then embedded into the map using graph
theoretic approach. A graph theoretic approach in the context of information hiding may be interpreted as a method
that uses graph data structure to solve information-hiding
problem. The approach is used either
in finding the relationship, based on mathematical

function, between the smallest data unit of a message and a group of such smallest units of the
cover object and representing the relationship using
a graph. If required, the relationships are used to
adjust the corresponding group of elements in the
cover so that the portion of the cover represents the
corresponding message data unit, or
to use a graph as cover object. In this case, redundancy is to be identified in features like node, segment and attribution data in a graph before embedding the secret message in it (Kumar and Muttoo
2009b, 2011).
In the first case, problems related to determination and representations of relationship are addressed for embedding
information in the cover. In the second case, redundancy
in graphs is identified. The second approach is used in this
method to hide the map digest in the map. The process is
explained in Section 4.
This article is organized into nine sections. Digital
representation of vector map is covered in Section 2.
Issues in vector map watermarking are discussed in
Section 3. Approach to watermarking scheme is introduced in Section 4. In Section 5 we have outlined the
algorithm to generate a map digest and its embedding in
the map. An illustrative example showing the way a map
digest is created and then used to watermark the map is
given in Section 6. Authentication protocol is explained in
Section 7. Section 8 contains steganalysis of the embedding
algorithm. Finally this article is concluded in Section 9.
2. Digital representation of vector map
Digital vector map is composed of spatial data and attribute
data. Spatial data describes the geographical locations of an
object in the real world in the form of a sequence of coordinates in a geographical coordinate system. Attribute data
describes the properties of map objects such as their names,
categories and related information. Different themes are
represented by using labels, colours and charts. Spatial
data consists of three basic geometrical elements: node,
segment and point. Segment is a poly-line that contains a
number of intermediate points (Kumar and Muttoo 2009a).
Intermediate points are included to draw a curved segment.
Many GIS software build polygons (regions) from polylines. The information recorded by the attribution data is
important and is not generally modified (Schulz and Voigt
2004).
Maps are digitally compiled from data available
through various sources. The source may be from age-old
archived maps to snaps taken by remote-sensing satellite.
The three basic components node, segments and intermediate points of a map are stored in the following file:
(1)
(2)
(3)
(4)
node,
segment,
data and
segment.dbf
Out of the four files, segment.dbf is a database file and

the others are binary files. A segment has attributes like
segment number, number of intermediate points, coordinates of these intermediate points, a rectangular extent
within which the segment lies (maximum bounding rectangle extent) (Kumar and Muttoo 2009a), length of the
segment, the node pairs defining the segment (start and
end nodes) and name of the segment. Information related
to a segment is stored in three files. Since the number of
intermediate points varies from segment to segment, all
such intermediate points are stored in a separate file called
data. Offset in the data file, where intermediate points
corresponding to the segment are written, is stored in a
file called segment. The file node contains information
related to nodes (Kumar and Muttoo 2009a).
In order to facilitate the display of a map on a screen
independent of its resolution and size, coordinates of the
Annals of GIS
leftmost top corner and rightmost bottom corner (or rightmost top corner and leftmost bottom corner) are also stored
in a database file extent.dbf. There may be more than
one map within the same extent and all such maps are
stored with different names. Files related to a map are
kept in a directory and the directory is called the map.
Information about the number of maps and its attribute is
written in graph.dbf. Thus metadata about graphs (maps)
are written in
(1) extent.dbf,
(2) graph.dbf and
(3) <Name of graph as directory>.
With the advent of GIS tools, most cartographers use these
tools to generate new maps or to edit and recompose the
existing maps to reflect the present geographical situation. One of the vector GIS tools is GISNIC (Kumar and
Sharma 2006). To facilitate data interoperability of 2D
and 3D drawings between AutoCAD and other programs,
Autodesk developed in 1982 an open-sourced computeraided design (CAD) data file format. The file format is
known as DXF (Drawing Exchange Format). A DXF file is
composed of several sections (DXF reference 2008). Each
section has a code that is associated with its value. The code
begins with a 0 followed by a string and ends with a 0 followed by ENDSEC. A DXF file can represent almost all
CAD drawings (DXF Reference 2008).
3. Issues in vector map watermarking
A watermarked map must retain its watermark/fingerprint
to achieve the goal even after it undergoes various attacks.
The possible attack on a GIS map could be geometric
attack, vertex attack, vertex reordering and noise distortion.
A successful attack implies removal of the watermark while
retaining the fidelity of the cover data. The spatial data of
vector maps is virtually a floating point data sequence with
a certain precision. Transformations like translation, rotation and scaling are the main forms of geometrical attacks.
Since it involves coordinate transformation, no information is lost if information is not stored in coordinates.
Alternatively if GIS data is cleaned, built and projected into
the required projection system before watermarking it, this
limitation can be overcome. Therefore, geometrical attacks
are relatively easy to defend in vector map watermarking
schemes.
Vertex attack implies adding new vertices into the map
(interpolation) or removing vertices from the map (simplification or cropping). Such attacks, especially the map
simplification and cropping, are very serious to vector map
watermarking. Map simplification is a common operation
in GIS applications. As a result, the ability of surviving the map simplification is very important for a robust
137
watermarking scheme. All objects are stored in the map

file in a certain order. Either reordering the objects in the
map or reordering the vertices within an object can produce a new map file without degrading the datas precision.
To some watermarking scheme which is dependent on the
objects order, this operation will be a fatal attack.
There is a possibility of noise distortion. Two sources
could introduce noise into vector maps: file format
exchange through import/export feature and an attack.
The first is essentially required to implement portability
of GIS datasets. The second one is unwanted but cannot
be ignored. However, insertion of noise is easily detected
because it compromises the fidelity of map datasets.
While watermarking a vector map, its fidelity must be
retained (Su et al. 1999, Aspert et al 2002). The term
fidelity is often used as a measure of a maps validity in
the watermarking world. However, according to the different data types and their respective usage, the term fidelity
has different meanings. Available watermarking algorithms
for digital vector maps use coordinates of spatial data to
embed watermark/fingerprint. Every vector map has a precision tolerance which gives the maximum amplitude of
the allowed distortions for the coordinates. The distortions of coordinates definitely below the tolerance will not
degrade the maps fidelity. The following principles have
been considered while developing the methodology:
(1) We need to protect only precious data. The legal
copy of a precious data is equally valuable and
it has the same market value. If while making
a copy, its quality is compromised and its market value is diminished to the extent that copying
becomes economically infeasible, then piracy can
be controlled.
(2) We find a way to embed the watermark such that
neither any coordinate in the cover map is modified
nor any extra feature is added to the cover map, that
is, the map is treated as naturally watermarked.
4.
Outline of the approach

Map digest Md is defined as a one-way hash function that
takes an arbitrary map as its input file and produces a
string of bits of fixed length. It is built upon the most
common cryptographic hash functions MD5 and SHA-1
(Rivest 1992, Stallings 1999). A map digest Md has three
3
1
2
, Mdp
and Mdp
.
parts: Mdp
1
(1) The first part Mdp
is based on the number of
nodes, segments, intermediate points, regions and
landmark points. It is produced using SHA1. These
features are not altered by any transformation
applied on the map.
138
2
is based on the unique reg(2) The second part Mdp
istered supplier code (RSC), serial number of the
map (SNM), registered customer code (RCC) and
a secret key (Ks ). It is produced using SHA1. This
information is immune to any alteration made in
the map.
3
is obtained from the complete
(3) The third part Mdp
map after exporting it into DXF. It is produced
using MD5 algorithm. This part together with the
first and second parts of map digest is used to determine whether the map is altered and a fake copy is
made.
The map digest Md inherits the property of MD5 and

SHA1 and it is infeasible to find two maps having the same
map digest and also given a map digest it is infeasible to
reconstruct a map with the same map digest. Two maps
G1 and G2 are said to be equal if the number of nodes and
the corresponding coordinates, number of segments, corresponding end nodes, number of intermediate points in the
corresponding segments and their coordinates are all equal
in G1 and G2 , that is, G1 and G2 are exact copies of each
other. Now we prove a theorem.
Theorem 1: If map digest (Md ) of any two distinct maps
G1 and G2 are not equal, that is, if G1 = G2 then
Md (G1 ) = Md (G2 ).
3
1
2
(G1 ), Mdp
(G1 ), Mdp
(G1 )) and
Proof: Let Md (G1 ) = (Mdp
3
1
2
Md (G2 ) = (Mdp (G2 ), Mdp (G2 ), Mdp (G2 )) be map digests for
two maps G1 and G2 , respectively. Two digests are equal iff
the corresponding elements in triplets are equal:
1
1
Mdp
(G1 ) = Mdp
(G2 ) ,
2
2
Mdp
(G1 ) = Mdp
(G2 ) and
to the same suppliers for the same customer and thereby

controls unauthorized copying of a map at the supplier or
customer end.
For two different maps, the corresponding DXF files
are different and hence by the inherent property of
MD5 algorithm, their third components are also different.
Therefore,
Md (G1 ) = Md (G2 ).
While embedding the map digest, the 160-/128-bit string
is converted into a sub-graph (sequence of nodes) of the
map. This sub-graph is a watermark, naturally placed in
the map (Kumar and Muttoo 2009a, 2011). The first part
of the digest is embedded in a set of nodes forming a subgraph (H1 ) of the map G. The second part is embedded in
another sub-graph (H2 ) and the third part in a sub-graph
(H3 ) of the map G. We call this sub-graph a map digest.
This is the theme of this article. The schematic diagram of
the presented approach is given in Figure 1.
Let us now discuss the algorithm to create the
map digest and watermark and then embedding of the
watermark in the map.
5. Map digest and watermarking algorithm

Let G be the vector map to be watermarked. The map
3
1
2
(G), Mdp
(G), Mdp
(G)) is to be generated
digest = (Mdp
and embedded in the graph G. Take the map in its native
form. By native form we mean the format of data supported by the GIS software for storing the map in digital
form. Basic cleaning and building of polygons are performed on the basic spatial data before watermarking is
done. We consider a vector map in two parts:
3
3
Mdp
(G1 ) = Mdp
(G2 )
Let N 1 , S 1 , IP1 , R1 and P1 be the number of nodes, segments, intermediate points, regions and points features,
respectively, in map G1 and N 2 , S 2 , IP2 , R2 and P2 be the
number of nodes, segments, intermediate points, regions
and points features, respectively, in map G2 . The concatenated strings N 1 || S 1 || IP1 || R1 || P1 and N 2 || S 2 || IP2 ||
R2 || P2 are equal iff these features are individually equal.
Therefore if the count of any features differs in two maps,
we have
1
1
Mdp
(G1 ) = Mdp
(G2 )
1
1
In case of Mdp
(G1 ) = Mdp
(G2 ), the second part of the two
digests can never be equal because keys Ks1 and Ks2 used
for generating the second part of the digest are not equal.
This is true even for two copies of the same map sold
(a) spatial data part and

(b) complete map that includes both spatial and attribution data.
5.1 Map digest generation

We assign a unique serial number to the map G and call
it SNM. Then we find the counts of features: nodes (Gn ),
segments (GS ), intermediate points (Gip ), regions (Gr ) and
landmark points (Gp ), from the graph.dbf file. The size of
the data is 20 bytes (4 bytes for nodes, 4 for segments,
4 for intermediate points, 4 for regions and 4 for landmark
points). These 20 bytes are repeated 3 times to get 60 bytes
and then the maximum of the 5 values is concatenated
with the 60 bytes to make it 64 bytes. The 64 bytes form
the input block for SHA1 algorithm to produce a 160-bit
1
(G). The procedure is outlined in PROC 1.
Mdp
Annals of GIS
Concatenate
them
together
Count features
Vector map
139
Create 160-bit
hash using SHA1
for Mdp1
(a)
Assign a serial number to map
(SNM)
Vector map
Registered supplier code (RSC)
Export to DXF
Create 128-bit hash using MD5

algorithm (hash value) for Mdp3
Registered customer code (RCC)
(c)
SNM || RSC || RCC || Key
(b)
Use SHA1 to get 160-bit string for

Mdp2
Figure 1. Schematic diagram of watermarking approach: (a) part 1, (b) part 2 and (c) part 3.
1
PROC 1: Procedure to compute Mdp
(G)
1
PROCEDURE Mdp
(G)
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
Read Gn , Gs , Gip , Gr , Gp from graph.dbf

Concatenate the 20 bytes value together to get
str1
Make a string str2 = str1 || str1 || str1
Compute str2 = str2 || max(Gn , Gs , Gip ,
Gr , Gp )
1
(G) = SHA1(str2)
Mdp
Features Gn , Gs , Gip , Gr , Gp of a map remain intact even

after a map undergoes any transformation from one coordinate system to another or any exchange of data format
for portability. However, it is fragile to map simplification process. To overcome this limitation, the second part
2
2
(G) of the digest is used. To generate the Mdp
(G)
Mdp
the values of SNM, RSC and RCC are determined and
recorded. Each of these values is 10 bytes long. The string
is repeated to make it 60 bytes and a secret key Ks of
4 bytes is added before it is sent to SHA1 to produce
160-bit hash values. The procedure is given in PROC 2.
2
(G)
PROC 2: Procedure to compute Mdp
2
PROCEDURE Mdp
(G)
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
Step 6:
Generate SNM.
If Supplier is registered, read RSC
Else register the supplier and generate RSC
If Customer is registered, read RCC
Else register the customer and generate RCC
Concatenate the 30 bytes value together to get
str1 = SNM || RSC || RCC
Step 7:
Step 8:
Step 9:
Make a string str2 = str1 || str1

Compute str2 = str2 || K s
2
Mdp
(G) = SHA1(str2)
2
This second part of the digest Mdp
(G) is totally independent
of the map features and hence not altered by any operations performed on the map. In case of any disputes, if this
value is retrieved from the map, it not only authenticates
the ownership of the map but also proves any unauthorized
alteration that might have been carried out on the map. The
3
(G1 ) of the digest is generated for the
third and last part Mdp
entire map. It involves the following steps:
Step 1:
Step 2:
Step 3:
Take the vector map in its native data format.

Export it to Drawing Exchange Format (DXF).
Use this DXF file, which is a text file as input to
MD5 algorithm to get the 128-bit digest.
3
1
2
(G), Mdp
(G), Mdp
(G)) is to
Now the map digest triplet (Mdp
be embedded in the map to watermark the map. Once the
map digest is generated, it is required to determine the corresponding sub-graph that holds the map digest part. This
sub-graph is the watermark. Let us determine the sub-graph
and the number of nodes in that sub-graph.
5.2. Time complexity of map digest generation

Determining the time complexity of an algorithm requires
derivation of an expression that finds the number of steps
needed to complete the task as a function of the problem
size n. The time complexity H(Md ) of the algorithm to
generate the map digest is
1
2
3
H(Md ) = h1 Mdp
+ h2 Mdp
+ h3 Mdp
(1)
140
3
1
2
where h1 (Mdp
), h2 (Mdp
) and h3 (Mdp
) are functions representing the time complexity of functions that computes
3
1
2
(G), Mdp
(G) and Mdp
(G), respectively. We know that
Mdp
the time complexity of SHA1 and MD5 is of O(n) for input
data of size n bytes.
1
2
and Mdp
are of fixed size
Since the input data for Mdp
of 64 bytes and the algorithm used is SHA1,
1
= O(64)
h1 Mdp
(2)
2
h2 Mdp
= O(64)
(3)
3
, the map is converted to DXF. The
While computing Mdp
size of the DXF file depends on the number of spatial
features and attribution data. Let the size of the DXF file
corresponding to map G be n bytes. The complexity of
3
is then
MD5 algorithm used to generate the 128-bit Mdp
of O(n), that is,
3
h3 Mdp
= O (n)
(4)
Time complexity H(Md ) therefore can be written as

H(Md ) = O(64) + O(64) + O(n) = O(n)
(5)
Therefore, the complexity of the map digest generation is

of O(n) where n is the number of bytes in the DXF file
corresponding to the map G.
5.3 Number of nodes required for
1, 2, 3
Mkdp (G),
for k =
k
(G)) bits map digest.
left-over bits from the size of (Mdp
k
It is obvious that the size of (Mdp (G)) is either 160 bits or
128 bits. To represent the last node in Hk (X Zk ) ZERO
bits are padded in the left side of Zk . For example if the
number of nodes in the map is 100, then X = 8, Yk = 23 for
k = 1, 2 and Yk = 19 for k = 3. Similarly Zk = 6 for k = 1,
2 and Zk = 2 for k = 3. Therefore 2 bits, 00, are padded for
k = 1, 2 and 6 bits for k = 3.
5.4 Watermarking map

The watermark is a sub-graph of the map. Here instead
of embedding the watermark in the map by changing
any of the features of the map, we are searching for the
1
(G),
watermark in the map according to its map digest (Mdp
3
2
Mdp (G), Mdp (G)). The watermark must be present in the
map. Thus embedding is done in a sustainable way (Kumar
and Muttoo 2011) by associating the nodes of the map with
k
(G), k = 1, 2, 3. It is possible that there may not
bits of Mdp
be a node corresponding to a bit string because the values corresponding to the bit string may be greater than the
total number of nodes in the map G. In that case we treat
that node a virtual node. Each part is embedded separately
and independently. Let WT_K[Yk ] be the array that stores
the watermark. Each element of the array is of X bits. The
procedure is outlined in PROC 3.
Step 1:
k
Split Mdp
(G) into Yk bit strings each of (X 1)
bits except the last one. The last one is Zk bits
such that
k
(G) | = (X 1)(Yk 1) + Zk
|Mdp
Step 2:
Pad one 0 bit to the left side as MSB (Most

Significant Bit) of all (Yk 1) bit strings except
the last one. In the last bit string pad (X Zk ) bits
in leftmost side. In case of virtual node, pad two
zero bits and this node is logically isolated node.
Store all Yk bit strings in WT_K[Yk ].
For I = 2 to Yk do
If WT_K[I 1] and WT_K[I] are adjacent
then Make MSB (WT_K[I 1]) = 1
(6)
When the padded bit of node vi is 0, then vi is not adjacent

to vi+1 in the sequence of nodes that constitutes the subgraph for the part of the map digest. If the bit is 1 then vi
k
is adjacent to vi+1 . Let Hk be the sub-graph for Mdp
(G) for
k = 1, 2, 3. The total number of nodes Yk required to embed
k
Mdp
(G) is given by the expression

k
size of Mdp
(G)
Yk =
X 1
(8)
k
(G) in G
PROC 3: Embedding of Mdp
Let n be the
of nodes in map G which can be repre number
sented by log2 n bits. We take one extra bit padded in the
leftmost side to indicate whether the next node is adjacent
in the map. If X is the number of bits required to represent
a node then

X = log2 n + 1

k

Zk = size of Mdp
(G) (Yk 1) (X 1)
(7)
And the last node in the sub-graph Hk is represented by

only
Step 3:
Step 4:
The WT_K[Yk ] is the watermark which is nothing but a

sequence of nodes related to the map. If drawn according
to adjacency maintained in WT_K[Yk ], it is a sub-graph
Hk of map G. We get a triplet of watermark correspond3
1
2
(G), Mdp
(G), Mdp
(G)).
ing to the map digest triplet (Mdp
The robustness of the watermarking scheme is discussed in
Section 8.The records related to the watermark are maintained in Tables 13. Since the values are very large, only
Annals of GIS
Table 1.
141
1
Information related to Mdp
(G) and H1 .
Features count string

SNM
Gn
Gs
Gip
Gr
Gp
1
Mdp
(G)
H 1 (G)
Note: Gn, number of nodes in map G; Gs, number of segments; Gip, number of intermediate points; Gr,
number of regions; Gp, number of landmark points.
Table 2.
2
(G) and H2 .
SNM
Table 3.
RSC
RCC
Ks
2
Mdp
(G)
H 2 (G)
3
(G) and H3 .
SNM
3
Mdp
(G)
the template of the table is provided to maintain readability

of this article.
H 3 (G)
and
str2 = (000001F4 000003E8 00002710 00000032
5.5.
Records maintenance tables
The watermark is now created and naturally embedded in

the map. Now we see how this information is utilized to
authenticate and ascertain the ownership of a map in case
of any dispute.
6. Illustrative example
Let G be a graph such that Gn = 500, Gs = 1000,
Gip = 10,000, Gr = 50 and Gp = 100. Now the input to
1
(G) is obtained by concatenating Gn ,
SHA1 to generate Mdp
Gs , Gip , Gr and Gp in that order. Let this string be str1. Now
repeat str1 thrice before concatenating at the end max {Gn ,
Gs , Gip , Gr and Gp } to get 64 bytes. Let the string be str2.
Therefore,
str1 = Gn Gs Gip Gr Gp
= 500 1000 10000 50 100
= 000001F4 000003E8 00002710 00000032
00000064 (in hex)
00000064)(000001F4 000003E8 00002710

00000032 00000064)(000001F4 000003E8
00002710 00000032 00000064)(00002710)
Now the hash value of str2 as determined by HashCalc
2002 is as below:
1
Mdp
(G) = 64e89ca947946c60ca4abd8a8f9dc406632c7e5e
(in hex)
(9)
2
For computing Mdp
(G), suppose SNM, RSC and RCC are,
respectively, 2011090101, 0101000001 and 0101000001.
Let the key Ks be equal to A9C1. Now the input to
2
(G) is obtained by concatenating
SHA1 to generate Mdp
SNM, RSC and RCC in that order. Let this string be str1.
Now repeat str1 twice before concatenating Ks at the end
to get 64 bytes. Let the string be str2. Therefore,
142
str1 = SNM RSC RCC

= 2011090101 0101000001 0101000001
(eachsymbolisacharacterof1byte)
= 32303131303930313031 30313031303030303031
30313031303030303031 (inhex)
and
Now let us determine the sub-graph, that is, the sequence

of nodes from graph G that hides the three parts of map
digest. From Equation (6), the number of bits X required to
represent a node in G is 10. Thus, we split the 160 bits of
3
1
2
(G) and Mdp
(G) and 128 bits of Mdp
(G) in 9 (10 1)
Mdp
bit group. Each such group represents a node in the graph.
The bit sequence and the corresponding sequence of nodes
1
(G) are as below:
obtained from map G for Mdp
Bit sequence:
011001001 110100010 011100101 010010100 011110010
str2 = (323031313039303130313031303130
100011011 000110000 011001010 010010101
303030303130313031303030303031)
011110110 001010100 011111001 110111000
(3230313130393031303130313031303030
100000001 100110001 100101100 011111100
30303130313031303030303031)
001011110
(41394331)
Now the hash value of str2 as determined by HashCalc
2002 is as below:
2
Mdp
(G) = 62c5596af56a67e1e01c3b7759083f814b28a127 (inhex)
(10)
3
Now to determine Mdp
(G), the map is exported to DXF.
The map G under consideration is shown in Figure 2. The
dotted and broken segments represent continuation of the
map. Some nodes are not shown to maintain visibility of
the shown nodes and presented concept. The MD5 of the
DXF file corresponding to the map G of Figure 2 is given
in Equation (11). The digest is computed using tools:
3
Mdp
(G) = 5f95200e445beaf04511bbc587ffd696 (inhex)
(11)
Node sequence:
201, 418, 229, 148, 242, 283, 48, 202, 149, 246, 84,
249, 440, 257, 305, 300, 252, 94
Similarly, the bit sequence and the corresponding sequence
2
of nodes obtained from map G for Mdp
(G) are as below:
Bit sequence:
011000101 100010101 011001011 010101111 010101101
010011001 111110000 111100000 000111000
011101101 110111010 110010000 100000111
111100000 010100101 100101000 101000010
000100111
Node sequence:
197, 277, 203, 175, 173, 153, 496, 480, 56, 237, 442,
400, 263, 480, 165, 296, 322, 39
And the bit sequence and corresponding sequence of nodes
3
obtained from map G for Mdp
(G) are as below:
Bit sequence
010111111 001010100 100000000 011100100 010001011
011111010 101111000 001000101 000100011
011101111 000101100 001111111 111111010
110100101 000000010
Figure 2.
digest.
A map G under consideration for computing map
Annals of GIS
Node sequence:
191, 84, 256, 228, 139, 250, 376, 69, 35, 239, 44, 127,
506, 421, 2
The watermarks are WT_1[18], WT_2[18] and WT_3[15].

Let us, for illustration, now assign some values to
WT_3[15]. While maintaining the adjacency, one bit is
padded as the leftmost bit (LMB) and the bit is set to
1 if the node corresponding to WT_3[1] is adjacent to
WT_3[2] in the map G. Since node number 191 is not
adjacent to node number 84, WT_3[1] = 0010111111.
However, node number 256 is adjacent to node number
228; thus WT_3[3] = 1100000000. In this way we can
assign values to WT_1[1. . .18], WT_2[1. . .18] and the
rest of the values in WT_3[1. . .15]. One exception is a
virtual node 506. Thus WT_3[13] = 00111111010.
7. Map authentication protocol
As mentioned by Craver et al. (1998), mere existence of
a watermark is not sufficient to prove ownership. We have
considered natural embedding of map digest to watermark
the map. Let us consider the two possible cases:
(a) A is the owner and B the aspiring forger or
(b) A and B are claimants of the ownership of the map.
A map may be distributed from many locations. However,
SNM is generated in a synchronous way maintaining
absolute uniqueness of the serial number of a map even if it

is a true copy of the same map, that is, fingerprinting a map.
In case of (a) any attempt by B to remove/destroy the
watermark changes the fidelity of the map reducing its market value and scope of usability. While determining that A
is the owner of a map, in disputes one or the other of the
following may be available about the unauthorized copy of
a map with B:
(1) The map is without any alteration.
(2) The SNM is known.
(3) The map composition is altered but the basic spatial data is unaltered.
(4) The basic spatial data is also altered within the
tolerance limit to the extent that map fidelity is
retained.
1
(G) is generated from
In the case of (1) and (3) above, Mdp
the spatial data of the map in possession of B and the details
are determined from Tables 13 to verify the fact. It is
the simplest case of authentication. In case of (2) above,
3
1
2
(G), Mdp
(G) and Mdp
(G) are retrieved from the tables
Mdp
1
2
(G), Mdp
(G) and
using the SNM and compared with Mdp
3
Mdp (G), the map digest of map with B. It helps in determining the stage of pilferage and/or extent of unauthorized
modification carried out in the map. Figure 3 describes the
process.
The watermarking scheme introduced in this article has
the advantage that if any one of the three parts of the
Extract basic
feature
Create 160-bit
Hash using SHA1
for Mdp1*
Export to DXF
Create 128-bit
Hash using MD5
for Mdp3*
Map in
question
Yes
143
Retrieve SNM from

Table 1
Is
Mdp1
=
Mdp1*
Verify corresponding
entries in Table 2 and 3
Yes
No
Guess RSC, RCC

and find list of
MSN having closest
number of basic
spatial features
Verify the
corresponding
Watermarks
Is
Mdpk
=
Mdpk*
No
Figure 3. Authentication process.
Map is either
altered or there is
fake claimant.
Resolve using
Table 4.
144

Table 4. Ownership resolution criteria.
Test 1
Possible cases
Case 1
Case 2
Case 3
Case 4
Case 5
Test 2
XA
XB
XA
XB
p
a
p
p
p
a
p
p
p
p
d
d
p
a
p
d
d
a
p
p
digest is known or retrieved the other two are also retrieved.

However, it has limitations of losing the watermark if the
node is renumbered in such a way that some nodes in
sub-graph (watermark) are also affected. The renumbering is nevertheless protected by the essential requirement
of linking of attribution data. The second part of the digest
2
(G) is independent of the features of a map. In case
Mdp
of (4), the supplier/customer code of Bs copy is required
to search the SNM. Once the suspected source of leakage
is guessed, the corresponding SNM is fetched from Table 2
and the rest is verified as it is done in case (2) above.
In case of (b), that is, when both A and B are claimants
of the ownership, two tests are performed. Test 1 is
performed on the present copy and Test 2 on the original
copy with A and B. The results are tested for the presence
of watermark in two copies of the maps (Craver et al.
1998). The possible results of the test are summarized
in Table 4. In the table, XA and XB are checks for the
existence of watermark triplets in copies of A and B,
respectively. Letters a, p and d stand for absence, presence
and dont care, respectively. Cases 14 are clear; however,
case 5 shows a potential ownership deadlock. This does
not arise unless both the maps are similar and the RSC
and RCC codes are also similar, that is, both A and B are
dealing with the same registered suppliers of the map and
have sold the copy to the same customer. On the other
hand, if they have different suppliers then either A is a registered customer of the supplier with B or B is a registered
customer of the supplier with A. The physical verification
of the documents resolves the deadlock. The schematic
diagram of the authentication process is shown in Figure 3.
Let us now discuss the robustness of this watermarking
scheme, the level of distortions a map can withstand to
retain the watermark in it.
8.
Steganalysis
The watermark scheme, introduced in this article, hides

the watermark naturally in the map, introducing no distortions in spatial or attribution data. Thus undetectability is
achieved by default. The capacity of the object to be watermarked is directly proportional to the number of nodes in
the map. Let Y be the number of nodes required in a map
Derived ownership
A
B
A
B
Document verification
to embed map digest and Yk (k = 1, 2, 3) be the number

3
1
2
(G), Mdp
(G) and Mdp
(G).
of nodes used for embedding Mdp
The number of nodes required for embedding the map
digest triplet is given by the inequality
3

Yk
(12)
k=1
because some nodes may be common in the triplets. Let

n be the number of nodes required to hide map digests in
the map G. For 448 (160 + 160 +128) bits, the required
number of nodes Y is given by
Y=
448
where n = 2X
X
(13)
The minimum value of X should be such that

X .2X 448
or
X 7
(14)
Therefore the map must have at least 128 nodes to be

watermarked according to this scheme. In practice a map
that really needs protection against piracy has nodes something in tune of thousands. A common principle followed
by all watermarking schemes is that the embedding of hidden messages should not degrade the validity of the cover
data (Ohbuchi et al. 1998, Zeng and Liu 1999). The term
fidelity is often used as a measure of the datas validity
in the watermarking world. In whatever few algorithms
for watermarking vector map exists, the space for embedding the watermark is provided in the spatial data, that is,.
the coordinates of the vertices. However, in the proposed
watermarking algorithm, no coordinate is changed in the
map to hide the watermark. In fact nothing is changed in
the map.
A digital watermark is called robust if it resists a designated class of transformations whereas a fragile digital
watermark fails to be detectable after the slightest modification. A digital watermark is called semi-fragile if it
resists benign transformations, but fails detection after
malignant transformations (Cox et al. 1997).
Annals of GIS
The watermarked (stego) map withstands editing of
point features, cleaning of the map and building of polygons (Kumar and Sharma 2006), because these simplification processes alter neither the nodes (number) nor the
coordinate of any of the intermediate points. A vector map
is exported and imported from one format to another format like ASCII, DXF and E00 for using the same map in
different GIS software. The watermarking algorithm presented retains all the features of the stego after export
(and then import) or vice versa. However, as any steganographic algorithm suffers from tempering related to editing
of the stego object, this approach is also prone to drastic
editing of the graph. Since any malignant transformation
would render the vector map unusable, it will be economically infeasible to drastically edit the watermarked
map. Theoretically the presented watermarking scheme is
semi-fragile but practically it is robust. Also the algorithm
presented is statistically imperceptible.
145
attacks like geometrical transformation and data reordering

are relatively easy to defend because they virtually cause no
information loss.
The algorithm produces a unique map digest for a given
map and it is embedded in the map without introducing any
distortion. The process preserves all the basic properties
of the original map in the watermarked map. The recovered watermark can be used not only to prove ownership
in a court but to verify its authenticity as well. The invisible watermark survives any attacks attempting to remove
or damage the watermark under the constraints of retaining
fidelity of the map. Steganalysis performed on the approach
confirms the same. Further, the approach helps in retrieving the watermark to show whether the copy is legal or
otherwise. The work may be extended to other GIS data
exchange format like E00 and ARC/Info ASCII.
Acknowledgements
9.
Conclusions
Once digital, data can be easily stored, copied and transferred without losing its original characteristics in a
straightforward and inexpensive way. The data is collected
and maintained by the data provider and used by others
by paying for it. Technically the approach is practical in
the era when the bulk of the data exchange takes place
through computer communication network. Since GIS
projects have similar requirements in terms of data, the possible way to manage costs could be cost-sharing approach.
Illegal copying can be controlled if it is made computationally or economically infeasible. The map digest approach
facilitates copying and replication of digital dataset for
easy distribution by ensuring that any illegal copy can be
detected by retrieving its watermark. We have used Gtools
3
1
(G) and HashCalc 2002 to create Mdp
(G) and
to create Mdp
2
Mdp (G) while implementing the algorithm. The GISNIC
GIS software has been used to compose a map and export
it to DXF and for other GIS operations performed on the
spatial and attribution data of a map.
The map digest can recognize the copyright ownership
for the dataset. The information relating to the contract
between the data producer and client is embedded in the
dataset (Thorner 1997, Voyatzis and Pitas 1999). In case of
unauthorized use of the contents, the producer can recover
the embedded information and can produce it in a court.
The approach presented in this article helps in creating
functionally identical copies with different map digests for
different registered users. The study of the watermarking
technology for digital vector maps is carried out for providing a level of security to protect both commercial and
intellectual interests. The watermarking scheme of a digital
vector map must withstand serious attacks that may include
map simplification, cropping and additional noise. Other
Existence of a problem inspires one to find its solution. To put a

solution in place, resources are required. We are very grateful to
all those who have been constantly encouraging and providing the
required resources for such scientific research work besides the
regular work which we are doing in our respective departments.
References
Aspert, N., et al., 2002. Steganography for three-dimensional
polygonal meshes. In: SPIE 47th annual meeting, 711 July,
Seattle, WA. Seattle, WA: SPIE, 705708.
Brassil, J., et al., 1995. Electronic marking and identification techniques to discourage document copying. IEEE Journal on
Selected Areas in Communications, 13 (8), 14951504.
Cox, I.J., et al., 1997. Secure spread spectrum watermarking for
multimedia. IEEE Transactions on Image Processing, 6 (12),
16731687.
Craver, S., et al., 1998. Resolving rightful ownerships with invisible watermarking techniques: limitations, attacks and implications. IEEE Journal on Selected Areas in Communications,
16 (4), 573586.
Date, H., Kanai, S., and Kishinami, T., 1999. Digital
watermarking for 3D polygonal model based on wavelet
transform. Proceedings of DETC99 1999 ASME, 1215
September, Las Vegas, NV. New York: ASME.
DXF Reference [online], 2008. Available from: http://images.
autodesk.com/adsk/files/acad-dxfo.pdf [accessed 6 February
2010].
Gopalakrishnan, K., Memon, N., and Vora, P., 1999. Protocols
for watermark verification. In: Proceedings of the multimedia and security workshop (held as a part of the 7th annual
ACM international multimedia conference), 31 October 1999,
Orlando, FL. Darmstadt, Germany: GMD, 9194.
Johnson, N.F., Duric, Z., and Jajodia, S., 2001. Information
hiding: steganography and watermarking attacks and countermeasures. Dordrecht: Kluwer Academic.
Karjala, D., 1995. Copyright in electronic maps. Jurimetrics
Journal, 35 (4), 395415.
Kumar, V. and Muttoo, S.K., 2009a. A data structure for graph to
facilitate hiding information in a graphs segments a graph
theoretic approach to steganography. International Journal
146
of Communication Networks and Distributed Systems, 3 (3),

268282.
Kumar, V. and Muttoo, S.K., 2009b. Principle of graph theoretic approach to digital steganography. In: Proceedings of
INDIACom-2009, 2627 February, New Delhi. New Delhi:
BVICAM, 161165.
Kumar, V. and Muttoo, S.K., 2011. A graph theoretic approach
to sustainable steganography. MIS Review: An International
Journal, 17 (1), 1937.
Kumar, V. and Sharma, V., 2006. Overcoming 64kb data size
limit in handling large spatial data in GISNIC while cleaning
and building topology. International Journal of Information
Technology and Management, 5 (1), 7786.
Lopez, C., 2002. Watermarking of digital geospatial datasets: a
review of technical, legal, and copyright issues. International
Journal of Geographical Information Science, 16 (6),
589607.
Mauro, B., et al., 2002. Robust watermarking of cartographic
images. European Association for Signal Processing Journal
on Applied Signal Processing, 2, 197208.
Memon, N. and Wong, P.W., 1998. A buyer-seller watermarking
protocol. In: IEEE workshop on multimedia signal processing
(MMSP-98), 79 December, Los Angeles, CA. Piscataway,
NJ: IEEE Press, 278283.
Ohbuchi, R., Masuda, H., and Aono, M., 1998. Watermarking
three-dimensional polygonal models through geometric and
topological modifications. IEEE Journal on Selected Areas in
Communications, 16 (4), 551560.
Rivest, R., 1992. The MD5 message-digest algorithm, RFC 1321

[online]. IETF. April 1992. Available from http://www.rfceditor.org/rfc/rfc1321.txt [accessed 6 February 2010].
Schulz, G. and Voigt, M., 2004. A high capacity watermarking
system for digital maps. In: Proceedings of the
2004 workshop on multimedia and security, 2021
September, Magdeburg, Germany. New York: ACM,
180186.
Stallings, W., 1999. Cryptography & network security: principles
and practice. New York: Prentice Hall.
Su, P.C., Wang, H.J., and Kuo, C.C.J., 1999. Blind digital
watermarking for cartoon and map images. Proceedings of
SPIE, 25 January, San Jose, CA. Bellingham, WA: SPIE,
3657, 296306.
Thorner, B.B., 1997. Copyright protection for computer
databases: the threat of feist and a proposed solution. Virginia
Journal of Law and Technology, 1 (5), 15221687.
Voyatzis, G. and Pitas, I., 1999. Protecting digital-image
copyrights: a framework. IEEE Computer Graphics and
Applications, 19 (1), 1824.
Xiamu, N., Shao, C.Y., and Wang, X.T., 2006. A survey of digital vector map watermarking. International
Journal of Computing, Information and Control, 2 (6),
13011316.
Zeng, W. and Liu, B., 1999. A statistical watermark detection
technique without using original images for resolving rightful ownerships of digital images. IEEE Transactions on Image
Processing, 8 (11), 15341548.

Annals of GIS

Hochgeladen von

Dokumentinformationen

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Annals of GIS

Hochgeladen von

Copyright:

Verfügbare Formate

This article was downloaded by: [1.55.71.

Watermarking digital vector map using graph theoretic

Sunil Kumar Muttoo & Vinay Kumar

Department of Computer Science , University of Delhi , New Delhi , India

National Informatics Centre , New Delhi , India

PLEASE SCROLL DOWN FOR ARTICLE

Watermarking digital vector map using graph theoretic approach

Downloaded by [1.55.71.46] at 20:21 03 December 2014

*Corresponding author. Email: vinay.kumar@nic.in

used to protect a map against piracy. Digital steganography

Downloaded by [1.55.71.46] at 20:21 03 December 2014

S.K. Muttoo and V. Kumar

licensing procedure has been made stricter in many cases

in finding the relationship, based on mathematical

Out of the four files, segment.dbf is a database file and

Downloaded by [1.55.71.46] at 20:21 03 December 2014

watermarking scheme. All objects are stored in the map

Outline of the approach

S.K. Muttoo and V. Kumar

Downloaded by [1.55.71.46] at 20:21 03 December 2014

The map digest Md inherits the property of MD5 and

to the same suppliers for the same customer and thereby

5. Map digest and watermarking algorithm

(a) spatial data part and

5.1 Map digest generation

Registered supplier code (RSC)

Create 128-bit hash using MD5

Registered customer code (RCC)

SNM || RSC || RCC || Key

Downloaded by [1.55.71.46] at 20:21 03 December 2014

Use SHA1 to get 160-bit string for

Read Gn , Gs , Gip , Gr , Gp from graph.dbf

Features Gn , Gs , Gip , Gr , Gp of a map remain intact even

Make a string str2 = str1 || str1

Take the vector map in its native data format.

5.2. Time complexity of map digest generation

S.K. Muttoo and V. Kumar

Downloaded by [1.55.71.46] at 20:21 03 December 2014

Time complexity H(Md ) therefore can be written as

Therefore, the complexity of the map digest generation is

5.4 Watermarking map

Pad one 0 bit to the left side as MSB (Most

When the padded bit of node vi is 0, then vi is not adjacent

And the last node in the sub-graph Hk is represented by

The WT_K[Yk ] is the watermark which is nothing but a

Features count string

Downloaded by [1.55.71.46] at 20:21 03 December 2014

the template of the table is provided to maintain readability

Records maintenance tables

The watermark is now created and naturally embedded in

00000064)(000001F4 000003E8 00002710

S.K. Muttoo and V. Kumar

str1 = SNM RSC RCC

Downloaded by [1.55.71.46] at 20:21 03 December 2014

Now let us determine the sub-graph, that is, the sequence

100011011 000110000 011001010 010010101

011110110 001010100 011111001 110111000

100000001 100110001 100101100 011111100

A map G under consideration for computing map

Downloaded by [1.55.71.46] at 20:21 03 December 2014

The watermarks are WT_1[18], WT_2[18] and WT_3[15].

absolute uniqueness of the serial number of a map even if it

Retrieve SNM from