Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Annals of GIS
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tagi20
To cite this article: Sunil Kumar Muttoo & Vinay Kumar (2012) Watermarking digital vector map using graph theoretic
approach, Annals of GIS, 18:2, 135-146, DOI: 10.1080/19475683.2011.640640
To link to this article: http://dx.doi.org/10.1080/19475683.2011.640640
Annals of GIS
Vol. 18, No. 2, June 2012, 135–146
Department of Computer Science, University of Delhi, New Delhi, India; b National Informatics Centre, New Delhi, India
(Received 6 June 2011; final version received 14 September 2011)
Similar to any digital dataset, a digital map is also vulnerable to modification, deliberate alteration and copyright violation.
A map is a visual representation of a geographical area that is digitally stored in either raster or vector forms. Vector map is
preferred over raster for both space optimization and quick processing time. Digest concept used for message authentication
is extended to digitally watermark a digital map. The map digest triplet ($M_{dp}^{1}$, $M_{dp}^{2}$ and $M_{dp}^{3}$) is generated using various
features of a digital map, supplier code and customer code. The approach computes two 160-bit hash values using the secure
hash algorithm (SHA1) and one 128-bit digest using the message digest (MD5) algorithm. These two hash values of 160 bit and
one digest of 128 bit are then embedded into a sequence of nodes of the map using graph theoretic approach in such a way
that any alteration in the map alters the sub-graph. The sub-graph is the watermark. There are three watermarks.
Keywords: geographic information system; map digest; steganography; watermarking; secure hash algorithm; DXF; vector
map; spatial data
1. Introduction
A map is a visual representation of an area. It is a symbolic depiction highlighting relationships among different
features of an area like road, rail and soil usage. Once some
theme is depicted on a map, it becomes a thematic map.
More thematic maps may be produced from one base spatial map. The process of creating a digital map from age-old
physical sources is called digitization of map. The field
of study that deals with creation of a digital map and its
composition into thematic maps is called geographic information system (GIS). It takes time, money and effort to
create a map in digital form and it generally accounts for
nearly 80% of the total cost of a GIS project.
Once a map is digitized, it is easy to copy, alter and sell
it as an original product (Brassil et al. 1995, Date et al.
1999). Alternatively, a mischievous person may also produce this altered map as valid document in support of some
false claim. It necessitates developing and placing protection mechanism against such threats so that not only are the
commercial interests of investors protected but also, whenever required, the authenticity and integrity of the source
can be verified. For instance, use of a digital map in a crucial military operation requires integrity verification of the
map before it is put to use.
Cryptography (Stallings 1999) and digital steganography (Johnson et al. 2001) are in use to
protect and secure the privacy of a message. Watermark is
In the first case, problems related to determination and representations of relationship are addressed for embedding
information in the cover. In the second case, redundancy
in graphs is identified. The second approach is used in this
method to hide the map digest in the map. The process is
explained in Section 4.
This article is organized into nine sections. Digital
representation of vector map is covered in Section 2.
Issues in vector map watermarking are discussed in
Section 3. Approach to watermarking scheme is introduced in Section 4. In Section 5 we have outlined the
algorithm to generate a map digest and its embedding in
the map. An illustrative example showing the way a map
digest is created and then used to watermark the map is
given in Section 6. Authentication protocol is explained in
Section 7. Section 8 contains steganalysis of the embedding
algorithm. Finally this article is concluded in Section 9.
2. Digital representation of vector map
Digital vector map is composed of spatial data and attribute
data. Spatial data describes the geographical locations of an
object in the real world in the form of a sequence of coordinates in a geographical coordinate system. Attribute data
describes the properties of map objects such as their names,
categories and related information. Different themes are
represented by using labels, colours and charts. Spatial
data consists of three basic geometrical elements: node,
segment and point. Segment is a poly-line that contains a
number of intermediate points (Kumar and Muttoo 2009a).
Intermediate points are included to draw a curved segment.
Many GIS software build polygons (regions) from polylines. The information recorded by the attribution data is
important and is not generally modified (Schulz and Voigt
2004).
Maps are digitally compiled from data available
through various sources. The source may be from age-old
archived maps to snaps taken by remote-sensing satellite.
The three basic components of a map (nodes, segments and intermediate points) are stored in the following files:
(1) node,
(2) segment,
(3) data and
(4) segment.dbf
leftmost top corner and rightmost bottom corner (or rightmost top corner and leftmost bottom corner) are also stored
in a database file extent.dbf. There may be more than
one map within the same extent and all such maps are
stored with different names. Files related to a map are
kept in a directory and the directory is called the map.
Information about the number of maps and its attribute is
written in graph.dbf. Thus metadata about graphs (maps)
are written in
(1) extent.dbf,
(2) graph.dbf and
(3) <Name of graph as directory>.
With the advent of GIS tools, most cartographers use these
tools to generate new maps or to edit and recompose the
existing maps to reflect the present geographical situation. One of the vector GIS tools is GISNIC (Kumar and
Sharma 2006). To facilitate data interoperability of 2D
and 3D drawings between AutoCAD and other programs,
Autodesk developed in 1982 an open-sourced computer-aided design (CAD) data file format. The file format is
known as DXF (Drawing Exchange Format). A DXF file is
composed of several sections (DXF reference 2008). Each
section has a code that is associated with its value. The code
begins with a 0 followed by a string and ends with a 0 followed by ENDSEC. A DXF file can represent almost all
CAD drawings (DXF Reference 2008).
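The section layout described above can be illustrated with a small reader. This is only a sketch, assuming an ASCII DXF file in which every group code and every value sits on its own line and each section opens with the pair 0/SECTION followed by 2/<name>. The function name `dxf_sections` and the sample text are hypothetical and are not taken from GISNIC or from the article.

```python
def dxf_sections(text):
    """Return {section_name: [(group_code, value), ...]} for ASCII DXF text."""
    lines = [ln.strip() for ln in text.splitlines()]
    # ASCII DXF is a flat list of (group code, value) line pairs.
    pairs = list(zip(lines[0::2], lines[1::2]))
    sections, name, body = {}, None, []
    i = 0
    while i < len(pairs):
        code, value = pairs[i]
        if code == "0" and value == "SECTION":
            # The pair that follows 0/SECTION is 2/<section name>.
            name = pairs[i + 1][1]
            body = []
            i += 2
            continue
        if code == "0" and value == "ENDSEC":
            sections[name] = body
            name = None
        elif name is not None:
            body.append((int(code), value))
        i += 1
    return sections


# Hypothetical two-section sample: a header variable and one point entity.
sample = "\n".join([
    "0", "SECTION", "2", "HEADER", "9", "$ACADVER", "1", "AC1009", "0", "ENDSEC",
    "0", "SECTION", "2", "ENTITIES", "0", "POINT", "8", "0", "10", "1.0", "20", "2.0",
    "0", "ENDSEC", "0", "EOF",
])
sections = dxf_sections(sample)
print(list(sections))
```

Because the whole file is plain text, any edit to a coordinate or an attribute changes the byte stream, which is what makes a digest of the exported file sensitive to alteration.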
3. Issues in vector map watermarking
A watermarked map must retain its watermark/fingerprint
to achieve the goal even after it undergoes various attacks.
The possible attack on a GIS map could be geometric
attack, vertex attack, vertex reordering and noise distortion.
A successful attack implies removal of the watermark while
retaining the fidelity of the cover data. The spatial data of
vector maps is virtually a floating point data sequence with
a certain precision. Transformations like translation, rotation and scaling are the main forms of geometrical attacks.
Since it involves coordinate transformation, no information is lost if information is not stored in coordinates.
Alternatively if GIS data is cleaned, built and projected into
the required projection system before watermarking it, this
limitation can be overcome. Therefore, geometrical attacks
are relatively easy to defend in vector map watermarking
schemes.
Vertex attack implies adding new vertices into the map
(interpolation) or removing vertices from the map (simplification or cropping). Such attacks, especially the map
simplification and cropping, are very serious to vector map
watermarking. Map simplification is a common operation
in GIS applications. As a result, the ability of surviving the map simplification is very important for a robust
(2) The second part $M_{dp}^{2}$ is based on the unique registered supplier code (RSC), serial number of the
map (SNM), registered customer code (RCC) and
a secret key ($K_s$). It is produced using SHA1. This
information is immune to any alteration made in
the map.
(3) The third part $M_{dp}^{3}$ is obtained from the complete
map after exporting it into DXF. It is produced
using MD5 algorithm. This part together with the
first and second parts of map digest is used to determine whether the map is altered and a fake copy is
made.
$$M_{dp}^{3}(G_1) = M_{dp}^{3}(G_2)$$
Let $N_1$, $S_1$, $IP_1$, $R_1$ and $P_1$ be the number of nodes, segments, intermediate points, regions and points features,
respectively, in map $G_1$ and $N_2$, $S_2$, $IP_2$, $R_2$ and $P_2$ be the
number of nodes, segments, intermediate points, regions
and points features, respectively, in map $G_2$. The concatenated strings $N_1 || S_1 || IP_1 || R_1 || P_1$ and $N_2 || S_2 || IP_2 || R_2 || P_2$
are equal iff these features are individually equal.
Therefore if the count of any features differs in two maps,
we have
$$M_{dp}^{1}(G_1) \neq M_{dp}^{1}(G_2)$$
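The claim that the concatenated strings are equal iff the counts are individually equal holds only when every count occupies a fixed-width field. The sketch below illustrates this. It assumes each count is packed as a 4-byte big-endian integer, and it hashes only those 20 bytes, whereas the article states a 64-byte input whose exact layout is not recoverable here. The function name `mdp1_sketch` and the fifth count (7 or 8) are hypothetical.

```python
import hashlib
import struct


def mdp1_sketch(nodes, segments, intermediate_points, regions, points):
    """SHA1 over five feature counts packed as fixed-width fields (assumed layout)."""
    # Assumption: each count is a 4-byte big-endian unsigned integer.
    payload = struct.pack(">5I", nodes, segments, intermediate_points, regions, points)
    return hashlib.sha1(payload).hexdigest()


# Without fixed widths, different counts can concatenate to the same string.
ambiguous = ("1" + "23") == ("12" + "3")  # True: the field boundary is lost

# Two maps that differ only in the number of point features.
d1 = mdp1_sketch(500, 1000, 10000, 50, 7)
d2 = mdp1_sketch(500, 1000, 10000, 50, 8)
print(ambiguous, d1 != d2)
```

With fixed-width fields a change in any single count changes the hashed bytes, so the two digests differ unless a SHA1 collision occurs.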
In case of $M_{dp}^{1}(G_1) = M_{dp}^{1}(G_2)$, the second part of the two
digests can never be equal because keys $K_{s1}$ and $K_{s2}$ used
for generating the second part of the digest are not equal.
This is true even for two copies of the same map sold
Figure 1. Schematic diagram of watermarking approach: (a) part 1, (b) part 2 and (c) part 3.
[Recoverable box labels: vector map; count features; concatenate them together; create 160-bit hash using SHA1 for $M_{dp}^{1}$; assign a serial number to map (SNM); export to DXF.]
PROC 1: Procedure to compute $M_{dp}^{1}(G)$
PROCEDURE $M_{dp}^{1}(G)$
Step 1:
Step 2:
Step 3:
Step 4:
Step 5:
Step 1: Generate SNM.
Step 2: If Supplier is registered, read RSC
Step 3: Else register the supplier and generate RSC
Step 4: If Customer is registered, read RCC
Step 5: Else register the customer and generate RCC
Step 6: Concatenate the 30 bytes value together to get
str1 = SNM || RSC || RCC
Step 7:
Step 8:
Step 9:
This second part of the digest $M_{dp}^{2}(G)$ is totally independent
of the map features and hence not altered by any operations performed on the map. In case of any disputes, if this
value is retrieved from the map, it not only authenticates
the ownership of the map but also proves any unauthorized
alteration that might have been carried out on the map. The
third and last part $M_{dp}^{3}(G)$ of the digest is generated for the
entire map. It involves the following steps:
Step 1:
Step 2:
Step 3:
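The third part is described as an MD5 digest of the map after export to DXF. A minimal sketch of that computation is given below, assuming the exported file is hashed byte for byte. The function name `mdp3_sketch` and the chunk size are illustrative choices, not details from the article.

```python
import hashlib


def mdp3_sketch(dxf_path, chunk_size=65536):
    """Return the 128-bit MD5 digest (as hex) of an exported DXF file."""
    digest = hashlib.md5()
    with open(dxf_path, "rb") as fh:
        # Read in chunks so that a large map does not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps the cost linear in the file size, in line with the O(n) behaviour the article attributes to MD5.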
Now the map digest triplet ($M_{dp}^{1}(G)$, $M_{dp}^{2}(G)$, $M_{dp}^{3}(G)$) is to
be embedded in the map to watermark the map. Once the
map digest is generated, it is required to determine the corresponding sub-graph that holds the map digest part. This
sub-graph is the watermark. Let us determine the sub-graph
and the number of nodes in that sub-graph.
(1)
where $h_1(M_{dp}^{1})$, $h_2(M_{dp}^{2})$ and $h_3(M_{dp}^{3})$ are functions representing the time complexity of the functions that compute
$M_{dp}^{1}(G)$, $M_{dp}^{2}(G)$ and $M_{dp}^{3}(G)$, respectively. We know that
the time complexity of SHA1 and MD5 is O(n) for input
data of size n bytes.
Since the input data for $M_{dp}^{1}$ and $M_{dp}^{2}$ are of fixed size
of 64 bytes and the algorithm used is SHA1,
$$h_1(M_{dp}^{1}) = O(64) \quad (2)$$
$$h_2(M_{dp}^{2}) = O(64) \quad (3)$$
While computing $M_{dp}^{3}$, the map is converted to DXF. The
size of the DXF file depends on the number of spatial
features and attribution data. Let the size of the DXF file
corresponding to map G be n bytes. The complexity of the
MD5 algorithm used to generate the 128-bit $M_{dp}^{3}$ is then
O(n), that is,
$$h_3(M_{dp}^{3}) = O(n) \quad (4)$$
(5)
[...] $M_{dp}^{k}(G)$, for k = [...] left-over bits from the size of ($M_{dp}^{k}(G)$) bits map digest.
It is obvious that the size of ($M_{dp}^{k}(G)$) is either 160 bits or
128 bits. To represent the last node in $H_k$, $(X - Z_k)$ ZERO
bits are padded in the left side of $Z_k$. For example, if the
number of nodes in the map is 100, then X = 8, $Y_k$ = 23 for
k = 1, 2 and $Y_k$ = 19 for k = 3. Similarly $Z_k$ = 6 for k = 1,
2 and $Z_k$ = 2 for k = 3. Therefore 2 bits, 00, are padded for
k = 1, 2 and 6 bits for k = 3.
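The figures in this example can be reproduced with a few lines of integer arithmetic. The sketch assumes X is one bit more than the ceiling of log2 n, that the digest is cut into pieces of X − 1 bits, and that the last piece is left-padded with X − Z_k zero bits, as the example states. The function name `embedding_params` is hypothetical.

```python
def embedding_params(n_nodes, digest_bits):
    """Return (X, Y_k, Z_k, padding) for a map with n_nodes nodes."""
    # Ceiling of log2(n) via integer arithmetic, plus one adjacency bit.
    x = (n_nodes - 1).bit_length() + 1
    # Number of (X - 1)-bit pieces needed to hold the digest (ceiling division).
    y = -(-digest_bits // (x - 1))
    # Bits left over for the last piece.
    z = digest_bits - (y - 1) * (x - 1)
    # Zero bits padded on the left of the last piece, as in the example above.
    pad = x - z
    return x, y, z, pad


sha1_params = embedding_params(100, 160)  # k = 1, 2
md5_params = embedding_params(100, 128)   # k = 3
print(sha1_params, md5_params)  # (8, 23, 6, 2) (8, 19, 2, 6)
```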
PROC 3: Embedding of $M_{dp}^{k}(G)$ in G
Let n be the number of nodes in map G, which can be represented by $\lceil \log_2 n \rceil$ bits. We take one extra bit padded in the
leftmost side to indicate whether the next node is adjacent
in the map. If X is the number of bits required to represent
a node then
$$X = \lceil \log_2 n \rceil + 1 \quad (6)$$
Step 1: Split $M_{dp}^{k}(G)$ into $Y_k$ bit strings each of $(X - 1)$
bits except the last one. The last one is $Z_k$ bits
such that
$$|M_{dp}^{k}(G)| = (X - 1)(Y_k - 1) + Z_k$$
$$Z_k = |M_{dp}^{k}(G)| - (Y_k - 1)(X - 1) \quad (7)$$
$$Y_k = \left\lceil \frac{|M_{dp}^{k}(G)|}{X - 1} \right\rceil \quad (8)$$
Step 2:
Step 3:
Step 4:
Table 1. Information related to $M_{dp}^{1}(G)$ and $H_1$.
Gn | Gs | Gip | Gr | Gp | $M_{dp}^{1}(G)$ | $H_1(G)$
Note: Gn, number of nodes in map G; Gs, number of segments; Gip, number of intermediate points; Gr,
number of regions; Gp, number of landmark points.
Table 2. Information related to $M_{dp}^{2}(G)$ and $H_2$.
SNM | RSC | RCC | Ks | $M_{dp}^{2}(G)$ | $H_2(G)$
Table 3. Information related to $M_{dp}^{3}(G)$ and $H_3$.
SNM | $M_{dp}^{3}(G)$ | $H_3(G)$
and
str2 = (000001F4 000003E8 00002710 00000032
5.5.
(in hex)
(9)
For computing $M_{dp}^{2}(G)$, suppose SNM, RSC and RCC are,
respectively, 2011090101, 0101000001 and 0101000001.
Let the key Ks be equal to A9C1. Now the input to
SHA1 to generate $M_{dp}^{2}(G)$ is obtained by concatenating
SNM, RSC and RCC in that order. Let this string be str1.
Now repeat str1 twice before concatenating Ks at the end
to get 64 bytes. Let the string be str2. Therefore,
and
str2 = (323031313039303130313031303130303030303130313031303030303031)
(323031313039303130313031303130303030303130313031303030303031)
(41394331)
Now the hash value of str2 as determined by HashCalc
2002 is as below:
$M_{dp}^{2}(G)$ = 62c5596af56a67e1e01c3b7759083f814b28a127 (in hex)
(10)
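The construction of str2 can be sketched as follows, assuming the codes and the key are treated as ASCII text, which is what the hex dump above suggests. The function name `build_str2` is hypothetical. How HashCalc was given this input is not stated, so the digest printed here is not claimed to match Equation (10).

```python
import hashlib


def build_str2(snm, rsc, rcc, ks):
    """Build the 64-byte SHA1 input: str1 repeated twice, then the key."""
    str1 = f"{snm}{rsc}{rcc}"  # 30 characters: SNM || RSC || RCC
    return (str1 * 2 + ks).encode("ascii")


str2 = build_str2("2011090101", "0101000001", "0101000001", "A9C1")
digest = hashlib.sha1(str2).hexdigest()
print(len(str2))   # 64
print(str2.hex())  # ASCII bytes written in hex, for comparison with the listing
print(digest)      # 160-bit digest as 40 hex characters
```

Because SNM, RSC, RCC and the key are the only inputs, this part of the digest stays the same however the map geometry is edited.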
Now to determine $M_{dp}^{3}(G)$, the map is exported to DXF.
The map G under consideration is shown in Figure 2. The
dotted and broken segments represent continuation of the
map. Some nodes are not shown to maintain visibility of
the shown nodes and presented concept. The MD5 of the
DXF file corresponding to the map G of Figure 2 is given
in Equation (11). The digest is computed using tools:
$M_{dp}^{3}(G)$ = 5f95200e445beaf04511bbc587ffd696 (in hex)
(11)
Node sequence:
201, 418, 229, 148, 242, 283, 48, 202, 149, 246, 84,
249, 440, 257, 305, 300, 252, 94
Similarly, the bit sequence and the corresponding sequence
of nodes obtained from map G for $M_{dp}^{2}(G)$ are as below:
Bit sequence:
011000101 100010101 011001011 010101111 010101101
010011001 111110000 111100000 000111000
011101101 110111010 110010000 100000111
111100000 010100101 100101000 101000010
000100111
Node sequence:
197, 277, 203, 175, 173, 153, 496, 480, 56, 237, 442,
400, 263, 480, 165, 296, 322, 39
[Figure 2: map G; of the caption only the closing word "digest." survives.]
And the bit sequence and corresponding sequence of nodes
obtained from map G for $M_{dp}^{3}(G)$ are as below:
Bit sequence:
010111111 001010100 100000000 011100100 010001011
011111010 101111000 001000101 000100011
011101111 000101100 001111111 111111010
110100101 000000010
Node sequence:
191, 84, 256, 228, 139, 250, 376, 69, 35, 239, 44, 127,
506, 421, 2
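The listed bit and node sequences can be reproduced by cutting each digest into 9-bit pieces and left-padding the last piece with zeros. The sketch below does this. The piece width of 9 bits is inferred from the listings, since the node count of map G is not given here, and the function name `digest_to_nodes` is hypothetical.

```python
def digest_to_nodes(digest_hex, chunk_bits):
    """Split a hex digest into chunk_bits-wide pieces and read each as a node number."""
    # Keep leading zeros: every hex digit contributes exactly four bits.
    bits = bin(int(digest_hex, 16))[2:].zfill(4 * len(digest_hex))
    chunks = [bits[i:i + chunk_bits] for i in range(0, len(bits), chunk_bits)]
    # The last piece holds only the left-over bits, so pad it with zeros on the left.
    chunks[-1] = chunks[-1].zfill(chunk_bits)
    return chunks, [int(c, 2) for c in chunks]


bits2, nodes2 = digest_to_nodes("62c5596af56a67e1e01c3b7759083f814b28a127", 9)
bits3, nodes3 = digest_to_nodes("5f95200e445beaf04511bbc587ffd696", 9)
print(nodes2)
print(nodes3)
```

Run on the digests of Equations (10) and (11), this yields the 18-node and 15-node sequences listed above. The repeated node 480 in the second sequence shows that the same node can occur more than once in a watermark.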
[Figure: authentication flowchart. Recoverable box labels: map in question; extract basic feature; create 160-bit hash using SHA1 for Mdp1*; export to DXF; create 128-bit hash using MD5 for Mdp3*; is Mdp1 = Mdp1*? (yes/no); verify corresponding entries in Tables 2 and 3; verify the corresponding watermarks; is Mdpk = Mdpk*? (yes/no); map is either altered or there is fake claimant; resolve using Table 4.]
Table 4.
          Test 1      Test 2
          XA   XB     XA   XB     Derived ownership
Case 1    p    a      d    d      A
Case 2    a    p      d    d      B
Case 3    p    p      p    a      A
Case 4    p    p      a    p      B
Case 5    p    p      p    p      Document verification
8. Steganalysis
$$\sum_{k=1}^{3} Y_k \le n \quad (12)$$
$$\frac{448}{X} \le 2^{X}, \quad \text{where } n = 2^{X} \quad (13)$$
or
$$X \ge 7 \quad (14)$$
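The bound on X can be checked numerically. The sketch takes the condition to be 448/X ≤ 2^X, where 448 = 160 + 160 + 128 is the total number of digest bits and 2^X stands for the number of nodes. That reading of Equation (13) is an assumption, and the function name `smallest_x` is hypothetical.

```python
TOTAL_BITS = 160 + 160 + 128  # two SHA1 digests and one MD5 digest


def smallest_x(total_bits=TOTAL_BITS):
    """Return the smallest X with total_bits / X <= 2 ** X."""
    x = 1
    while total_bits / x > 2 ** x:
        x += 1
    return x


min_x = smallest_x()
print(TOTAL_BITS, min_x)  # 448 7
```

At X = 6 the left side is about 74.7 against 64 nodes, so the condition fails, while at X = 7 it is 64 against 128, which agrees with Equation (14).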
The watermarked (stego) map withstands editing of
point features, cleaning of the map and building of polygons (Kumar and Sharma 2006), because these simplification processes alter neither the nodes (number) nor the
coordinate of any of the intermediate points. A vector map
is exported and imported from one format to another format like ASCII, DXF and E00 for using the same map in
different GIS software. The watermarking algorithm presented retains all the features of the stego after export
(and then import) or vice versa. However, as any steganographic algorithm suffers from tampering related to editing
of the stego object, this approach is also prone to drastic
editing of the graph. Since any malignant transformation
would render the vector map unusable, it will be economically infeasible to drastically edit the watermarked
map. Theoretically the presented watermarking scheme is
semi-fragile but practically it is robust. Also the algorithm
presented is statistically imperceptible.
9. Conclusions
Once digital, data can be easily stored, copied and transferred without losing its original characteristics in a
straightforward and inexpensive way. The data is collected
and maintained by the data provider and used by others
by paying for it. Technically the approach is practical in
the era when the bulk of the data exchange takes place
through computer communication network. Since GIS
projects have similar requirements in terms of data, the possible way to manage costs could be cost-sharing approach.
Illegal copying can be controlled if it is made computationally or economically infeasible. The map digest approach
facilitates copying and replication of digital dataset for
easy distribution by ensuring that any illegal copy can be
detected by retrieving its watermark. We have used Gtools
to create $M_{dp}^{3}(G)$ and HashCalc 2002 to create $M_{dp}^{1}(G)$ and
$M_{dp}^{2}(G)$ while implementing the algorithm. The GISNIC
GIS software has been used to compose a map and export
it to DXF and for other GIS operations performed on the
spatial and attribution data of a map.
The map digest can recognize the copyright ownership
for the dataset. The information relating to the contract
between the data producer and client is embedded in the
dataset (Thorner 1997, Voyatzis and Pitas 1999). In case of
unauthorized use of the contents, the producer can recover
the embedded information and can produce it in a court.
The approach presented in this article helps in creating
functionally identical copies with different map digests for
different registered users. The study of the watermarking
technology for digital vector maps is carried out for providing a level of security to protect both commercial and
intellectual interests. The watermarking scheme of a digital
vector map must withstand serious attacks that may include
map simplification, cropping and additional noise. Other
References
Aspert, N., et al., 2002. Steganography for three-dimensional
polygonal meshes. In: SPIE 47th annual meeting, 7–11 July,
Seattle, WA. Seattle, WA: SPIE, 705–708.
Brassil, J., et al., 1995. Electronic marking and identification techniques to discourage document copying. IEEE Journal on
Selected Areas in Communications, 13 (8), 1495–1504.
Cox, I.J., et al., 1997. Secure spread spectrum watermarking for
multimedia. IEEE Transactions on Image Processing, 6 (12),
1673–1687.
Craver, S., et al., 1998. Resolving rightful ownerships with invisible watermarking techniques: limitations, attacks and implications. IEEE Journal on Selected Areas in Communications,
16 (4), 573–586.
Date, H., Kanai, S., and Kishinami, T., 1999. Digital
watermarking for 3D polygonal model based on wavelet
transform. Proceedings of DETC99 1999 ASME, 12–15
September, Las Vegas, NV. New York: ASME.
DXF Reference [online], 2008. Available from: http://images.autodesk.com/adsk/files/acad-dxfo.pdf [accessed 6 February
2010].
Gopalakrishnan, K., Memon, N., and Vora, P., 1999. Protocols
for watermark verification. In: Proceedings of the multimedia and security workshop (held as a part of the 7th annual
ACM international multimedia conference), 31 October 1999,
Orlando, FL. Darmstadt, Germany: GMD, 91–94.
Johnson, N.F., Duric, Z., and Jajodia, S., 2001. Information
hiding: steganography and watermarking attacks and countermeasures. Dordrecht: Kluwer Academic.
Karjala, D., 1995. Copyright in electronic maps. Jurimetrics
Journal, 35 (4), 395–415.
Kumar, V. and Muttoo, S.K., 2009a. A data structure for graph to
facilitate hiding information in a graph's segments – a graph
theoretic approach to steganography. International Journal